
robotics

Review
A Survey of Machine Learning Approaches for Mobile
Robot Control
Monika Rybczak *,† , Natalia Popowniak and Agnieszka Lazarowska *,†

Department of Ship Automation, Gdynia Maritime University, 81-87 Morska St., 81-225 Gdynia, Poland;
[email protected]
* Correspondence: [email protected] (M.R.); [email protected] (A.L.)
† These authors contributed equally to this work.

Abstract: Machine learning (ML) is a branch of artificial intelligence that has been developing at a
dynamic pace in recent years. ML is also linked with Big Data, which are huge datasets that need
special tools and approaches to process them. ML algorithms make use of data to learn how to
perform specific tasks or make appropriate decisions. This paper presents a comprehensive survey
of recent ML approaches that have been applied to the task of mobile robot control, and they are
divided into the following: supervised learning, unsupervised learning, and reinforcement learning.
The distinction of ML methods applied to wheeled mobile robots and to walking robots is also
presented in the paper. The strengths and weaknesses of the compared methods are formulated, and
future prospects are proposed. The literature review identifies the
ML methods that have been applied to different tasks, such as the following: position estimation,
environment mapping, SLAM, terrain classification, obstacle avoidance, path following, learning to
walk, and multirobot coordination. The survey allowed us to associate the most commonly used
ML algorithms with mobile robotic tasks. There still exist many open questions and challenges such
as the following: complex ML algorithms and limited computational resources on board a mobile
robot; decision making and motion control in real time; the adaptability of the algorithms to changing
environments; the acquisition of large volumes of valuable data; and the assurance of safety and
reliability of a robot’s operation. The development of ML algorithms for nature-inspired walking
robots also seems to be a challenging research issue as there exists a very limited amount of such
solutions in the recent literature.
Keywords: artificial intelligence; machine learning; mobile robots; walking robots; robot control

Citation: Rybczak, M.; Popowniak, N.; Lazarowska, A. A Survey of Machine Learning Approaches for
Mobile Robot Control. Robotics 2024, 13, 12. https://doi.org/10.3390/robotics13010012
Academic Editor: Marco Ceccarelli
Received: 7 November 2023
Revised: 27 December 2023
Accepted: 3 January 2024
Published: 9 January 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Machine learning (ML) is a rapidly growing part of the science in artificial intelligence.
ML algorithms are mainly based on huge databases, which are categorized and processed
accordingly. For example, the authors in [1–3] discussed the classification of machine
learning-based schemes in healthcare, categorizing them based on data preprocessing
and learning methods. They emphasized that ML has potential; however, there is still an
issue in the appropriate selection of the data from which the algorithms are to learn. For
example, the paper of [4] presented the idea that building databases for medicine is based
on the many hundreds of (for example) images of a single case; furthermore, the paper
noted that this will stimulate collaboration between scientists even more than it does today.
The authors mentioned that “Machine learning fuelled by the right data has the power
to transform the development of breakthrough, new medicines and optimise their use in
patient care”. Another example [5] is a review of articles related to the use of artificial
intelligence as a tool in higher education. Machine learning in automation is used especially
in robots. An interesting example of the use of machine learning and inverse kinematics
is the work on the motion simulation of an industrial robot, the results of which clearly
presented a reduction in development time and investment [6,7].

Robotics 2024, 13, 12. https://doi.org/10.3390/robotics13010012 https://www.mdpi.com/journal/robotics


This paper is focused on the applications of machine learning methods for mobile
robots. A very interesting overview of the techniques was presented in [8], where ML was
applied for the classification of defects in wheeled mobile robots. Several techniques, i.e.,
random forest, support vector machine, artificial neural network, and recurrent neural
network, were investigated in this research. The authors stated that machine learning
can be applied to improve the performance of mobile robots, thus allowing them to save
energy by waiting close to the point where work orders come in most often. An algorithm
based on reinforcement learning was proposed for mobile robots in [9]. This algorithm
discretizes obstacle information and target direction information into finite states, designs
a continuous reward function, and improves training performance. The algorithm was
tested in a simulation environment and on a real robot [10]. Another study focused on
using deep reinforcement learning to train real robot primitive skills such as go-to-ball
and shoot the goalie [11]. The study presented the state and action representation, the reward, and
the network architecture, and it showed good performance on real robots. An improved
genetic algorithm was used for path planning in mobile robots, thereby solving problems
such as path smoothness and large control angles [12]. Finally, a binary classifier and
positioning error compensation model that combined the genetic algorithm and extreme
learning machine was proposed for the indoor positioning of mobile robots.
Many authors claim that they achieve very satisfactory results applying ML methods
in mobile robot problems. For example, in the work of [13], the best results for mobile
robots in machine learning were obtained using Central Moments as a feature extractor and
Optimum Path Forest as a classifier with an accuracy of 96.61%. The authors of [9] stated
that the developed algorithm for autonomous mobile robots in industrial areas enabled
the execution of work orders with a 100% accuracy. In [14], the proposed model was based
on the Extreme Learning Machine-based Genetic Algorithm (GA-ELM), which achieved a
71.32% reduction in positioning error for mobile robots without signal interference. These
are just a few examples of the recent research applying different ML approaches in robotics.
The machine learning methods, as shown in Figure 1, are divided into the following: super-
vised learning, unsupervised learning, reinforcement learning, and semi-supervised learning.

Machine Learning
    Supervised Learning
        Regression: Linear Regression; Polynomial Regression; Regression Tree; Neural Networks
        Classification: Decision Trees; K-nearest Neighbors Method; Support Vector Machine (SVM);
            Naive Bayes Classifier; Random Forest; Logistic Regression; Neural Networks
    Unsupervised Learning
        Clustering: K-means Method; Hierarchical Cluster Analysis; Density-Based Spatial
            Clustering of Applications with Noise (DBSCAN)
        Detection of anomalies and novelties: One-Class Support Vector Machine (SVM)
        Visualization and dimensionality reduction: Principal Component Analysis (PCA);
            Kernel Principal Component Analysis
    Reinforcement Learning
    Semi-supervised Learning

Figure 1. The classification of machine learning methods.


Robotics is one of the vital branches of science and technology nowadays, experi-
encing dynamic growth. Therefore, it is very difficult to track the development of all
topics belonging to this area of knowledge. The surveys of the works associated with
the specific aspects of this field of research can be found in the recent literature. In [15],
the authors concentrated on the heuristic methods applied for the robot path planning
in the years 1965–2012. Their study included the application of neural networks (NN),
fuzzy logic (FL), and nature-inspired algorithms such as the following: genetic algorithms
(GA), particle swarm optimization (PSO), and ant colony optimization (ACO). In [16], the
authors presented the approaches utilizing artificial intelligence (AI), machine learning
(ML), and deep learning (DL) in different tasks of advanced robotics, such as the following:
autonomous navigation, object recognition and manipulation, natural language processing,
and predictive maintenance. They presented different applications of AI, ML and DL in
industrial robots, advanced transportation systems, drones, ship navigation, the aeronautical
industry, aviation management, and taxi services. An overview of ML algorithms applied
for the control of bipedal robots was presented in [17]. In the work of [18], the authors
introduced a review of Visual SLAM methods based on DL.
The classification of mobile robots can be based on the type of the motion system,
as shown in [19], where the wheeled mobile robots and the walking robots can be distin-
guished. The examples of mobile robots classified to both types of motion systems are
shown in Figures 2 and 3.

Figure 2. The examples of the wheeled robots.

Figure 3. The examples of the walking robots.

The main issues concerning mobile robots can be classified into navigation, control,
and remote sensing, as stated in [20].
One of the specific tasks that is associated with the navigation of the mobile robots is
the localization and environment mapping, which is also known as Simultaneous Localiza-
tion and Mapping (SLAM). The SLAM algorithms allow for the environment mapping with
the simultaneous tracking of the current robot’s position. The SLAM algorithms perceive
the environment using sensors such as the following: cameras, lidars, and radars. The
different subtasks of SLAM are the environment perception, the robot localization/position
estimation, and the environment mapping.
The remote sensing in mobile robots deals with the usage of different sensors in order
to perceive the robot’s surroundings. The commonly used sensors are cameras, lidars (Light
Detection and Ranging), radars (Radio Detection and Ranging), ultrasonic sensors, infrared
sensors and the GPS (Global Positioning System). The specific tasks being developed in
the mobile robotics research, which are associated with the remote sensing and the sensor
data collecting, include the data dimensionality reduction, the feature selection, the terrain
classification and the machine vision solutions.
The path planning and obstacle avoidance tasks are inherent in the navigation of
mobile robots. The path planning algorithms calculate a feasible, optimal or near-optimal
path from the current position of a robot to the defined goal position. The most commonly
applied optimization criteria are the shortest distance and the minimal energy consumption.
The path planning process has to consider the avoidance of static and dynamic obstacles.
The obstacle avoidance problem is also related to the obstacle detection and clustering.
Another issue also connected with this topic is the target attraction task, which is aimed at
leading the mobile robot toward a specific target or a goal position.
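As a minimal sketch of the path planning problem with static obstacles, the snippet below runs a breadth-first search on a small occupancy grid, which yields a shortest path when every move has the same cost. It is not one of the surveyed ML methods; the grid, the function name `shortest_path`, and the 4-connected motion model are assumptions made only for illustration.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid; 1 marks an obstacle.

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parents to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# Hypothetical map: the only passage between the rows is cell (1, 2).
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = shortest_path(grid, (0, 0), (2, 0))
```

Other criteria mentioned above, such as minimal energy consumption, would require weighted edges and a different search (for example Dijkstra's algorithm or A*).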
The mobile robotics research is also concentrated on the motion control that includes
the attitude control, the heading control, the speed control, and the steering along a path.
In this last task, a trajectory controller has to be developed. Other topics that are related
to the mobile robot motion control include learning to walk, which is associated with the
walking robots, the multirobot coordination and the autonomous navigation.
The classification of the mobile robotics tasks is divided into four main categories:
SLAM, sensor data, path planning, and motion control, which are presented in Figure 4.

Mobile robot task
    SLAM: Environment perception; Localization/Position estimation; Environment mapping;
        Localization and env. mapping
    Sensor data: Dimensionality reduction; Feature selection; Terrain classification;
        Machine vision
    Path planning/obstacle avoidance: Obstacle clustering; Collision detection;
        Collision/obstacle avoidance; Path planning; Target attraction
    Motion control/motion planning: Trajectory controller; Attitude control;
        Autonomous navigation; Learning to walk; Multirobot coordination

Figure 4. The classification of the mobile robotics tasks.


The subsequent sections present a comparison of the ML techniques proposed recently
for the different tasks of robots, such as positioning, path planning, path following, and
environment mapping.

2. Supervised Learning Approaches


This section provides an overview of the supervised machine learning methods that
were used in mobile robotics in the years 2003–2023. In the supervised learning, as the
name of the method suggests, the solutions of the considered problem are attached to the
set of the training data as labels or classes. The main application areas of the supervised
learning methods are the regression and the classification problems, as shown in Figure 1.
The aim of the regression algorithm is to predict a value of an output variable. An
example of a regression problem is the forecasting of the value of a car based on its features
such as the following: the model, the brand, the year of production, and the engine
capacity. Another similar case is the prediction of the house prices based on their various
features, such as the following: the number of bedrooms, the neighborhood’s crime rate,
and the proximity to schools. An example of the SL application in a different domain is the
estimation of a person’s age based on the facial features. The common regression algorithms
include linear regression, polynomial regression, regression tree, and neural networks.
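To make the regression setting concrete, the following minimal sketch fits a one-feature linear model by ordinary least squares. The data (engine capacity against price) and the name `fit_line` are hypothetical and only mirror the car-value example above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: engine capacity (litres) -> price (arbitrary units).
capacity = [1.0, 1.5, 2.0, 2.5, 3.0]
price = [10.0, 14.0, 18.0, 22.0, 26.0]

slope, intercept = fit_line(capacity, price)
predicted = slope * 1.8 + intercept  # prediction for a car not in the training set
```

The labels (prices) are what makes this a supervised problem: the model is fitted so that its outputs match the known answers.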
The goal of the classification algorithm is the categorization of the input data into
the predefined classes or categories. An example of a classification problem is the task
of classifying a message as “spam” or “non-spam”. Other classification tasks include
the image classification, the sentiment analysis, and the disease diagnosis. The common
classification algorithms include decision trees, the k-nearest neighbors method, support
vector machines (SVM), the Naive Bayes classifier, the random forest, logistic regression,
and neural networks.
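A classification problem can be sketched in the same spirit with the k-nearest neighbors method, which assigns a query the majority label of its closest training samples. The two features and the tiny dataset below are hypothetical and chosen only to echo the spam example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (number of links, fraction of capital letters).
messages = [
    ((0, 0.02), "non-spam"),
    ((1, 0.05), "non-spam"),
    ((0, 0.10), "non-spam"),
    ((6, 0.40), "spam"),
    ((8, 0.55), "spam"),
    ((5, 0.35), "spam"),
]

label_a = knn_predict(messages, (7, 0.50))
label_b = knn_predict(messages, (0, 0.04))
```

In practice the features would be scaled first, since the Euclidean distance is otherwise dominated by the feature with the largest range.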

2.1. Regression Methods


As mentioned above, the supervised learning methods are grouped into two main
categories: regression and classification. This subsection presents a literature review of the
regression approaches that were applied in mobile robotics. Table 1 shows a comparative
analysis of the recent SL algorithms that were proposed for solving regression
problems in mobile robots.
The SL regression methods are applied in mobile robots for tasks such as the following:
localization, obstacle detection and avoidance, path planning, and slippage prediction.

Table 1. A comparison of the SL regression algorithms used in mobile robots.

Method             Authors               Year  Object        Task                                Simulations/Real Exp.
Linear regression  Sharma et al. [21]    2016  Mobile robot  Localization                        Simulations
Linear regression  Das et al. [22]       2022  Mobile robot  Collision avoidance, path planning  Sim. and real exp.
Linear regression  Naveen et al. [23]    2020  Mobile robot  Obstacle avoidance                  Real exp.
GPR                Gonzalez et al. [24]  2018  Single wheel  Slippage prediction                 Real exp.
ANN                Crnokić et al. [25]   2023  Mobile robot  Sensor data                         Simulations
CNN                Ballesta et al. [26]  2021  Mobile robot  Localization                        Real exp.

The charts presenting the statistics of the SL regression algorithms applied in mobile
robotics are shown in Figure 5.

2.1.1. Regression Methods for Robot Localization


One of the tasks of mobile robots, according to Figure 4, is the robot localization. In [21],
the authors proposed a method for the identification of the optimal value of the wheel speed.
The approach was used for the relative localization of a differentially driven robot, moving
on the circular and straight paths. The linear regression analysis was performed in order
to find the relationship between the wheel speed and the odometry of the two-wheeled
differential drive mobile robot. The test that used the Analysis of Variance (ANOVA)
technique was carried out based on the statistical tools available in the MINITAB scientific
analysis software. The V-Rep 3.2.1 simulation software was used for the validation of the
proposed method.

[Figure 5 charts. Left: "Type of SL method applied for regression problems in mobile robots"
(categories: Linear regression, GPR, Neural network). Right: "SL approaches for regression
problems in mobile robots - evaluation method" (categories: Sim., Real exp., Sim. and real exp.;
segments labeled 17%, 33%, and 50%).]

Figure 5. The types of the SL regression algorithms applied for solving the problems of mobile robots
in the works considered in this survey (left chart); the types of the evaluation methods used in the
works regarding the SL regression algorithms for solving the problems of mobile robots in the works
considered in this survey (right chart).

In [26], the authors described the application of the convolutional neural network
(CNN) for the robot localization problem. The issue was solved using a hierarchical
approach. In the first stage, the CNN solved a classification problem and found the room
where the robot was located. In the second stage, the regression CNN was applied in
order to estimate the exact position of the robot (the X and Y coordinates). The approach
was tested with the use of the dataset containing the sensor data that were registered by a
mobile robot under real operating conditions.
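The two-stage structure described above (first classify the room, then regress the coordinates inside it) can be sketched as follows. The CNNs of [26] are replaced here by hypothetical stand-ins, a nearest-centroid room classifier and hand-written per-room regressors, so the snippet shows only the hierarchy and not the authors' models.

```python
import math

def nearest_room(features, room_centroids):
    """Stage 1 (classification): pick the room whose centroid is closest."""
    return min(room_centroids, key=lambda room: math.dist(features, room_centroids[room]))

def localize(features, room_centroids, room_regressors):
    """Stage 2 (regression): apply the regressor that belongs to the selected room."""
    room = nearest_room(features, room_centroids)
    x, y = room_regressors[room](features)
    return room, (x, y)

# Hypothetical stand-ins for trained models; names and numbers are illustrative.
room_centroids = {"kitchen": (0.2, 0.8), "corridor": (0.9, 0.1)}
room_regressors = {
    "kitchen": lambda f: (1.0 + 2.0 * f[0], 4.0 * f[1]),
    "corridor": lambda f: (10.0 * f[0], 0.5 + f[1]),
}

room, position = localize((0.25, 0.75), room_centroids, room_regressors)
```

Splitting the problem this way lets each regressor specialize in one room, at the cost of a wrong position whenever the first stage selects the wrong room.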

2.1.2. Regression Methods for Obstacle Detection


In [25], the authors proposed the application of the artificial neural networks (ANNs)
for the obstacle detection with the use of data from infrared (IR) sensors and a camera. The
ANN was developed and trained using the Matlab/Simulink software, and the simulation
tests were carried out in the RobotinoSIM virtual environment. An obstacle detection
accuracy of 85.56% was achieved. The authors stated that in order to accomplish greater
accuracy, a larger dataset should be used, but it would cause longer learning and data
processing times. The usage of hardware with higher computational capabilities would
also be needed.

2.1.3. Regression Methods for Obstacle Avoidance and Other Tasks


A linear regression approach for the collision avoidance and path planning of an
autonomous mobile robot (AMR) was proposed in [22]. The authors developed the adaptive
stochastic gradient descent linear regression (ASGDLR) algorithm for solving this task. The
velocities of the right and left wheels, and the distance from an obstacle, were measured
by two infrared sensors and one ultrasonic sensor. The stochastic gradient descent (SGD)
optimization technique was applied for the iterative updating of the ASGDLR model
weights. The difference between the actual velocity and the model output velocity was
used as an error signal. The ASGDLR algorithm was implemented on the NodeMCU ESP8266
controller. The method was verified in simulation tests with the use of the Python
IDLE platform and the Matlab software as well as in real experiments using an AMR.
The approach was also compared with four other algorithms: VFH, VFH*, FLC, and A*.
The authors stated that the advantages of the algorithm are “the effectiveness of memory
utilization and less time requirement for each command obtained as a command signal from
the NodeMCU to the DC motor module compared to other Linear Regression algorithms”.
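The core idea of updating linear regression weights with stochastic gradient descent from an error signal can be sketched as below. This is plain SGD with a fixed learning rate, not the adaptive ASGDLR algorithm of [22]; the data, the learning rate, and the function name are assumptions.

```python
import random

def sgd_linear_regression(samples, lr=0.1, epochs=500, seed=0):
    """Fit v = w . x + b by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a new random order each epoch
        for x, target in data:
            prediction = sum(wi * xi for wi, xi in zip(w, x)) + b
            error = prediction - target  # error signal: model output minus actual value
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
            b -= lr * error
    return w, b

# Hypothetical noise-free data: obstacle distance (m) -> wheel velocity (m/s).
samples = [((d,), 0.4 * d + 0.1) for d in (0.2, 0.5, 0.8, 1.1, 1.4, 1.7, 2.0)]
w, b = sgd_linear_regression(samples)
```

Because each update touches a single sample and stores only the weights, this style of training suits microcontrollers with little memory, which matches the motivation quoted above.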
In [23], the authors proposed a linear regression approach for the mobile robot obstacle
avoidance. The task was carried out by predicting the wheel velocities of the differential
drive robot. Input data were obtained with the use of the ultrasonic sensors for the distance
and the IR sensors for the wheel velocities measurements. The robot control platform used
in this research was the Atmega328 microcontroller.
In [24], a slippage prediction method was introduced using Gaussian process regres-
sion (GPR). The approach was validated with the use of the MIT single-wheel testbed
equipped with an MSL spare wheel. This solution can be useful for the off-road mobile
robots. According to the authors, the results proved an appropriate trade-off between the
accuracy and the computation time. The algorithm returned the variance associated with
every prediction, which might be useful for the route planning and the control tasks.
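The property highlighted above, that Gaussian process regression returns a variance with every prediction, can be illustrated with a small self-contained sketch. The squared-exponential kernel, its length scale, the noise level, and the slip data are all assumptions; the source does not state the settings used in [24].

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_predict(xs, ys, query, noise=1e-4):
    """Return the posterior mean and variance of a zero-mean GP at query."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, query) for a in xs]
    alpha = solve(K, ys)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    variance = rbf(query, query) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, variance

# Hypothetical data: a terrain feature -> measured slip ratio.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.05, 0.10, 0.30, 0.55]

mean_near, var_near = gp_predict(xs, ys, 1.5)  # query between training points
mean_far, var_far = gp_predict(xs, ys, 8.0)    # query far from all training points
```

The variance is small close to the training data and approaches the prior variance far away from it, so a planner could treat high-variance predictions as unreliable.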

2.2. Classification Methods


The second category in the supervised learning is the classification. This section
presents the recent approaches for the classification problems applied in mobile robotics.
Table 2 presents a comparative analysis of the recent SL classification algorithms applied in
mobile robotics.

Table 2. A comparison of the SL classification algorithms used in mobile robots.

Method               Authors                   Year  Object         Task                    Simulations/Real Exp.
Decision trees       Swere and Mulvaney [27]   2003  Mobile robots  Navigation              Sim. and real exp.
Decision trees       Swere et al. [28]         2006  Mobile robots  Navigation              Sim. and real exp.
Decision trees       Roth et al. [29]          2021  Mobile robot   Navigation              Real exp.
K-nearest neighbors  Sarah and Riadh [30]      2019  Mobile robot   Navigation              Real exp.
K-nearest neighbors  Elias et al. [31]         2021  Mobile robot   Object detection        Real exp.
SVM                  Zheng et al. [32]         2021  Mobile robot   Motion control          Real exp.
SVM                  Liu et al. [33]           2016  Biped robot    Gait control            Real exp.
Random Forest        Liao et al. [34]          2023  Mobile robot   Terrain classification  Real exp.
Random Forest        Zhang et al. [35]         2016  Mobile robot   Terrain classification  Real exp.
Logistic regression  Becker and Ebner [36]     2019  Mobile robot   Collision detection     Real exp.
ANN                  Sanusi et al. [37]        2023  Mobile robot   Terrain classification  Real exp.
CNN with LSTM        Hoshino and Yoshida [38]  2022  Mobile robot   Navigation              Real exp.
DNN                  Kozlowski & Walas [39]    2018  Mobile robot   Terrain classification  Real exp.

The charts presenting the statistics of the SL classification algorithms applied in mobile
robotics are shown in Figures 6 and 7.
[Figure 6 charts. Left: "SL methods for classification problems in mobile robots - distribution
over years 2016-2023" (bar chart, years 2016 to 2023). Right: "SL approaches for classification
problems in mobile robots - evaluation method" (categories: Real exp., Sim. and real exp.;
segments labeled 85% and 15%).]

Figure 6. The distribution over the years 2016–2023 of the SL classification methods for mobile robots
considered in this survey (left chart); the types of the evaluation methods used in the works regarding
the SL classification algorithms for solving the problems of mobile robots in the works considered in
this survey (right chart).

[Figure 7 charts. Left: "SL methods for classification problems in mobile robots - type of
solved task". Right: "Type of SL method for classification problems in mobile robots"
(categories: Decision trees, K-nearest neighbors, SVM, Random Forest, Logistic Regression,
Neural networks).]

Figure 7. The types of the solved tasks in the works regarding SL classification methods considered
in this survey (left chart); the types of the SL classification algorithms applied for solving the problems
of mobile robots in the works considered in this survey (right chart).

2.2.1. Classification Methods for Terrain Type Recognition


An important issue in mobile robotics for rescue operations and inspection tasks is
terrain classification. The appropriate recognition of the terrain type enables the mobile
robot to adapt its behavior to the environment. It also allows the robot to reach the
defined target faster and more effectively. Different supervised learning classification
algorithms have proved to be suitable for this task.
The random forest classifier was proposed in the work of [35] for the evaluation of the
traversability of the terrain.
In [34], the authors introduced the random forest classifier optimized by a genetic
algorithm for the classification of ground types. This approach allowed overcoming the
limitation of the traditional random forest algorithm, which is the lack of a formula for the
determination of the optimal combination of many initial parameters. The method allowed
the authors to achieve a recognition accuracy of 93%. This was a significantly higher
value than the results obtained with the use of the traditional random forest algorithm.
An artificial neural network (ANN) was applied for the terrain classification in the
research presented in [37]. The ANN was implemented on the Raspberry Pi 4B in order
to process the vibration data in real time. The 9-DOF inertial measurement unit (IMU),
including an accelerometer, a gyroscope, and a magnetometer, was used for the data
reception. The Arduino Mega Board was used as a control unit. The carried out experiments
allowed for the achievement of online terrain classification prediction results above 93%.
In the work of [39], a deep neural network (DNN) was applied for the terrain recogni-
tion task. The input data to the model were the vision data from an RGB-D sensor, which
contained a depth map and an infrared image, in addition to the standard RGB data.

2.2.2. Classification Methods for Mobile Robot Navigation


In [27], the authors proposed the application of a decision tree algorithm for the mobile
robot navigation. The developed learning system was aimed at performing the incremental
learning in real time. It was using the limited memory in order to be applicable in an
embedded system. The algorithm was developed using the Matlab environment and was
tested with the use of the Talrik II mobile robot, equipped with 12 infrared sensors and
sonar transmitters and receivers. The controller was implemented on the ARM evaluator
7T development board, running the eCos real-time operating system. The continuation
of this research was presented in [28]. The method proposed in this paper was based on
the incremental decision tree. In this approach, the feature vectors were kept in the tree.
The experiments with the use of a mobile robot, performing the real-time navigation task,
showed that “the calculation time is typically an order of magnitude less than that of an
incremental generation of an ITI decision tree”.
In the work of [29], the expert policy based on the deep reinforcement learning was
applied for the calculation of a collision-free trajectory for a mobile robot in a dense,
dynamic environment. In order to enhance the system’s reliability, an expert policy was
combined with the policy extraction technique. The resulting policy was converted into a
decision tree. The application of the decision trees was aimed at improving the solutions
obtained by the algorithm. The improvements included the smoothness of the trajectory,
the frequency of oscillations, the frequency of immobilization, and the obstacle avoidance.
The method was tested in simulations and with the use of the Clearpath Jackal robot, which
was navigating in the environment with moving pedestrians.
In the work of [38], the authors introduced a motion planner based on the convolu-
tional neural network (CNN) with the long short-term memory (LSTM). The imitation
learning was applied for training the policy of the motion planner. In the experiments
carried out, the robot was moving autonomously toward the destination and was also
avoiding standing and walking persons.
In the work of [32], a method based on support vector machine (SVM) was proposed
for the application to the mobile robot’s precise motion control. The control algorithm
was implemented on the ARM9 control board. The results of the experiments proved the
feasibility of the approach for the precise position control of a mobile robot. The achieved
maximum error was less than 32 cm in the linear movement on the distance of 10 m.
In [30], the authors considered the application of different classification algorithms
for the wall-following navigation task of a mobile robot. They introduced the k-nearest
neighbors approach and compared it with other methods, such as the following:
decision trees, neural networks, Naïve Bayes, JRipper and support vector machines.

2.2.3. Classification Methods for Other Tasks


In [31], the authors proposed the application of the k-nearest neighbors method for
the image classification task. The approach was used for object detection and recognition
in the machine vision-based mobile robotic system.
In [36], the authors applied logistic regression for the collision detection task. The
training data were obtained from the acceleration sensor. The data were registered during
the movement of a small mobile robot. The accelerometer data and the motor commands
were afterwards combined in the logistic regression model. The Dagu T’Rex Tank chassis
was used in the experiments. The robot was driven by two motors via the Sabertooth motor
controller. The Arduino Mega 2560 microcontroller was used as the robot’s control unit. The
trained model detected 13 out of 14 collisions with no false positive results.
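A logistic regression collision detector of the kind described above can be sketched as a binary classifier over accelerometer and motor-command features. The feature choice, the data, and the training settings below are hypothetical; only the reported counts (13 of 14 collisions detected, no false positives) are taken from the text.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, label in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for i, xi in enumerate(x):
                grad_w[i] += (p - label) * xi
            grad_b += p - label
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b

def predict(w, b, x):
    """Report a collision when the predicted probability is at least 0.5."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5

# Hypothetical features: (absolute deceleration in g, motor command in [0, 1]).
samples = [
    ((0.1, 0.8), 0), ((0.2, 0.6), 0), ((0.3, 0.9), 0), ((0.2, 0.3), 0),
    ((1.5, 0.8), 1), ((1.8, 0.5), 1), ((2.2, 0.9), 1), ((1.4, 0.4), 1),
]
w, b = train_logistic(samples)

# Counts reported in the text: 13 of 14 collisions detected, no false positives.
detected, total, false_positives = 13, 14, 0
recall = detected / total                            # about 0.929
precision = detected / (detected + false_positives)  # 1.0
```

Expressed as metrics, the reported result corresponds to a recall of roughly 92.9% at a precision of 100%.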
An essential task for the walking robots is learning to walk. In [33], the authors
presented a method for the gait control. It was based on support vector machine (SVM)
with the mixed kernel function (MKF). The ankle and the hip trajectories were applied
as the inputs. The corresponding trunk trajectories were used as the outputs. The results
of the SVM training were the dynamic kinematics relationships between the legs and the
trunk of the biped robot. The authors stated that their method achieved better performance
than the SVM with the radial basis function (RBF) kernels and the polynomial kernels.
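The text does not define the mixed kernel function. One common construction, assumed here purely for illustration and not confirmed as the method of [33], is a convex combination of an RBF kernel and a polynomial kernel, so that local and global similarity are both represented. All parameter values below are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: similarity decays with squared distance."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: similarity grows with the inner product."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def mixed_kernel(x, y, weight=0.7):
    """Assumed mixed kernel: weight * RBF + (1 - weight) * polynomial."""
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)

# Hypothetical ankle/hip trajectory samples used as kernel inputs.
u = (1.0, 0.0)
v = (0.5, 0.5)
k_uu = mixed_kernel(u, u)
k_uv = mixed_kernel(u, v)
```

A weighted sum of two valid kernels with non-negative weights is itself a valid kernel, which is why such a combination can be used directly in an SVM.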
The analysis of the recent approaches for the mobile robots based on the supervised
learning techniques shows that these methods were applied for the following:
• The path control;
• The robot navigation;
• The environment mapping;
• The robot’s position and orientation estimations;
• The collision detection;
• The clustering of the data registered with the use of the different robot’s sensors;
• The recognition of the different terrain types;
• The classification of the robot’s images;
• The exploration and the path planning in unknown or partially known environments.

3. Unsupervised Learning Approaches


This section provides an overview of the unsupervised machine learning methods
that were used in mobile robotics in the years 2009–2023.
The unsupervised learning approaches use unlabeled training data. In other words,
the raw data are fed into the algorithm. The algorithm is responsible for finding the
connections between these data. This type of machine learning is also called learning
without a teacher. Three common types of tasks that were solved with the use of the
unsupervised learning approaches include clustering, the detection of anomalies and
novelties, and visualization and dimensionality reduction. The unsupervised learning
methods that were applied to solving the clustering problems include the k-means method,
hierarchical cluster analysis, and the Density-Based Spatial Clustering of Applications
with Noise (DBSCAN). The one-class support vector machine method was applied for
anomaly and novelty detection. The visualization and dimensionality reduction problems
were solved with the use of the principal component analysis (PCA) and kernel principal
component analysis (KPCA) techniques.
Tables 3 and 4 present a comparative analysis of the recent UL algorithms proposed
for solving the problems in mobile robotics.

Table 3. A comparison of the unsupervised learning algorithms for mobile robots—part 1.

| Method | Authors | Year | Object | Task | Simulations/Real Exp. |
| Clustering sensor data | Giguere and Dudek [40] | 2009 | Wheeled robot/hexapod robot | Environment perception | Sim. and real exp. |
| DBSCAN | Wang and Sun [41] | 2023 | Mobile robot | Obstacle clustering | Simulations |
| DBSCAN | Iaboni et al. [42] | 2021 | Mobile robot | Robot detection | Real exp. |
| DBSCAN | Hernández et al. [43,44] | 2014 | Wheeled mobile robot | Vanishing point estimation | Real exp. |
| DPGMM, SOGP | Gao et al. [45] | 2016 | Mobile robot | Task recognition | Sim. and real exp. |
| ESOINN | Xu et al. [46] | 2023 | Hexapod robot | Environment perception | Real exp. |
| ESOINN | Juman et al. [47] | 2019 | Wheeled mobile robot | Trajectory controller | Sim. and real exp. |
| FuzzyART NN | Lameski, Kulakov and Davcev [48] | 2009 | Wheeled mobile robot | Position estimation | Real exp. |
| GNG network | Lameski and Kulakov [49] | 2010 | Wheeled mobile robot | Position estimation | Real exp. |
| K-means | Goodwin and Nokleby [50] | 2022 | Wheeled mobile robot | Environment mapping | Sim. and real exp. |
| K-means, K-means++ and LBG | Hernández et al. [51] | 2022 | Domestic service robot | 3D environment mapping | Sim. and real exp. |
| K-means | Ravankar et al. [52] | 2012 | Wheeled mobile robot | Environment mapping | Real exp. |
| KPCA | Errachdi and Benrejeb [53] | 2017 | Mobile robot | Dimensionality reduction | Simulations |
| KPCA | Shamsfakhr and Sadeghibigham [54] | 2017 | Mobile robot | Sensor data dimensionality reduction | Simulations |
| LatentSLAM | Çatal et al. [55] | 2021 | Wheeled mobile robot | SLAM | Sim. exp. with dataset |
| LCDA | Balaska et al. [56] | 2020 | Wheeled mobile robot | Env. mapping and localization | Sim. exp. with dataset |
| One-class SVM | Kabir et al. [57] | 2022 | Mobile robot | Unknown object detection | Real exp. |
| One-class SVM | Tsukada et al. [58] | 2011 | Mobile robot | Feature selection | Sim. and real exp. |

Table 4. A comparison of the unsupervised learning algorithms for mobile robots—part 2.

| Method | Authors | Year | Object | Task | Simulations/Real Exp. |
| PCA | Kishimoto et al. [59] | 2021 | Wheeled mobile robot | Path planning | Sim. and real exp. |
| PCA | Cui et al. [60] | 2020 | Wheeled mobile robot | Robot localization | Real exp. |
| PCA | Zhou et al. [61] | 2018 | Mobile robot | Fault detection | Sim. and real exp. |
| PCA | Qayum et al. [62] | 2017 | Mobile robot | Creative task coordination | Real exp. |
| SLINK | Erkent et al. [63] | 2017 | Mobile robot | Env. mapping and localization | Real exp. |
| SOM | Arena et al. [64] | 2022 | Quadruped robot | Attitude control | Simulations |
| SOM | Faigl J. [65] | 2016 | A group of cooperating robots | Path planning | Simulations |
| SOM | Guillaum et al. [66] | 2011 | Mobile robot | Environment mapping | Sim. exp. with dataset |
| STDP | Azimirad et al. [67] | 2017 | Mobile robot/robotic arm | Target attraction task | Simulations |
| STDP | Arena et al. [68,69] | 2009 | Hybrid mini-robot | Autonomous navigation | Real exp. |

3.1. Unsupervised Learning for SLAM


This subsection presents the recent unsupervised learning approaches that were
applied to SLAM-related tasks, namely environment perception, robot
localization/position estimation, and environment mapping.
Clustering is the problem of grouping similar data together. In [52], the authors
proposed an algorithm based on the k-means method for the wheeled mobile robot indoor
mapping with the use of a laser distance sensor. The results of the calculations for three
different cluster sizes, equal to 20, 25, and 30, were compared in the paper. The authors
stated that in order to achieve satisfactory results with the use of the proposed method, the
filtering techniques should be applied to the dataset registered with the use of the sensor.
The limitation of the method was the necessity of predicting how many clusters should be
used in order to achieve good results.
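The dependence on the chosen number of clusters can be illustrated with a small sketch of Lloyd's k-means algorithm. This is not the algorithm from [52]: the scan points are invented for the example, and the deterministic farthest-point initialisation is an assumption chosen so that the result is reproducible.

```python
import math

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))

    def assign(cs):
        return [min(range(k), key=lambda j: math.dist(p, cs[j])) for p in points]

    for _ in range(iterations):
        labels = assign(centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the seed of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids

    labels = assign(centroids)
    # Inertia: sum of squared distances of the points to their centroids.
    inertia = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return centroids, labels, inertia

# Hypothetical 2D laser-scan endpoints forming three groups (units: metres).
scan = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
        (0.0, 5.0), (0.2, 5.1), (0.1, 4.8)]

for k in (2, 3, 4):
    _, _, inertia = kmeans(scan, k)
    print(k, round(inertia, 3))
```

Comparing the inertia for several values of k, as the loop does, is one common heuristic for choosing the number of clusters when it is not known in advance.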
In [50], the authors presented k-means clustering. It was applied to the task allocation
problem in the process of the unknown environments’ exploration and mapping by a team
of the mobile robots. In this approach, the k-means clustering algorithm was responsible
for the assignment of the frontiers to the different robots. The frontiers are the boundaries
between the known and unknown space in the environment exploration process. The
introduced concept was evaluated with the use of both the simulations and the real-world
experiments on the TurtleBot3 robots. The results were also compared with two other
methods. In the first method, the map was explored by each robot separately without
any coordination between them. In the second method, the robots exploring the space
shared their information in order to create a global map of the environment. The proposed
k-means method achieved better results than both of the other methods in terms of the time
the robots needed for the space exploration and the distance they traveled.
In the work of [51], the authors developed a system for compact 3D map representation
based on point-cloud spatial clustering, called the Sparse-Map. The results
obtained with three clustering algorithms (k-means, k-means++, and LBG) were
presented in the paper. The achieved partition quality and the runtime of the clustering
algorithms were also compared. The results showed that the k-means algorithm
implemented on the GPU was the fastest and the LBG algorithm was the slowest. The
authors stated that their system produced maps of the environment that are
useful for the calculation of high-quality paths for domestic service robots.
In [49], the authors applied the growing neural gas (GNG) neural network to the position
estimation task. This approach is classified as an unsupervised incremental clustering algorithm.
The data about the environment were obtained with the use of an ultrasonic sensor mounted
on the Lego Mindstorms NXT robot. The registered data were used for the construction of
a connected graph. The graph was composed of the nodes, representing the robot’s states,
and the links between the nodes. The links included the information about the actions that
should be carried out in order to make a transition between the nodes. The GNG network
approach was compared with a different method, which was proposed by the authors
in [48]. In that paper, the authors introduced the FuzzyART (Adaptive Resonance Theory)
neural network. Both algorithms achieved similar results in terms
of the accuracy of the position estimation, with the FuzzyART neural network performing
slightly better.
In the work of [56], the UL approach based on the Louvain community detection
algorithm (LCDA) was proposed for the semantic mapping and localization tasks. The
authors compared their method with two other UL approaches: the Single-Linkage (SLINK)
agglomerative algorithm, which was presented in [63], and the self-organizing map (SOM),
which was introduced in the work of [66]. The self-organizing map is a type of artificial
neural network that does not use labeled data during the training phase. It is applied
to clustering as well as to visualization and dimensionality reduction tasks.
Machine learning methods for legged robots are much less explored. One
method recently applied to them is the enhanced self-organizing incremental neural network
(ESOINN). It was applied to environment perception in the work of [46]. The approach
was tested on a hexapod robot in both indoor and outdoor environments.
In the work of [60], the authors proposed the indoor positioning system (IPS) based
on the robust principal component analysis-extreme learning machine (RPCA-ELM). The
method was verified with the use of a mobile robotic platform. It was controlled by the
ARM microcontroller.

3.2. Unsupervised Learning for Remote Sensing


In [40], the authors proposed a UL approach for clustering sensor data. It was applied
to the environment perception task, specifically to the identification of the terrain type.
The introduced approach was the single-stage batch method, which is aimed at finding
the global description of a cluster. The approach was evaluated with the use of two types
of the mobile robots: a hexapod robot and a differential drive robot. The hexapod robot
was equipped with the following sensors: three accelerometers, three rate gyroscopes, six
leg angle encoders, and six motor current estimators. Principal component analysis (PCA)
was applied for the dimensionality reduction of the collected data. In the robot with the
differential drive, a tactile sensor was used for the terrain identification. The tactile sensor
is a metallic rod with an accelerometer that is located at its tip. The PCA was also applied
for the dimensionality reduction of the collected dataset. The comparative tests showed that
the proposed method outperformed the window-based clustering and the hidden Markov
model trained using expectation–maximization (EM-HMM).
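As an illustration of PCA-based dimensionality reduction, the sketch below finds the dominant principal axis of a small dataset by power iteration on the sample covariance matrix and projects the data onto it. It is a simplified stand-in, not the procedure from [40]: the three-channel readings are synthetic, and power iteration is only one of several ways to compute principal components.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the dominant principal axis."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, direction):
    """Reduce each row to a single coordinate along the given direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]

# Synthetic three-channel sensor readings that vary mostly along (1, 2, 0).
readings = [(float(t), 2.0 * t + 0.1 * ((i % 3) - 1), 0.05 * ((i % 2) * 2 - 1))
            for i, t in enumerate(range(-10, 11))]

mean, direction = first_principal_component(readings)
reduced = project(readings, mean, direction)
print([round(x, 3) for x in direction])
print(len(reduced), "values after reduction from 3 channels to 1")
```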
In [55], the authors introduced a UL approach for the generation of compact
latent representations of the sensor data, which was called LatentSLAM. This method
builds on RatSLAM, a SLAM system based on the navigational
processes in a rat’s hippocampus. The system was composed of the pose cells, the local
view cells, and the experience map. The authors demonstrated that their approach can be
used with data from different sensors, such as camera images, radar
range-Doppler maps and lidar scans. A dataset of over 60 GB of camera, lidar and
radar recordings was used in the verification tests.
In the work of [54], the kernel principal component analysis (KPCA) was applied for
the dimensionality reduction of the dataset obtained with the use of the laser range sensor
SICK LMS 200-30106. The sensor was used in order to register the robot’s surroundings.
In [53], the KPCA was coupled with the online radial basis function (RBF) neural network
algorithm for the identification of a mobile robot’s position. The KPCA was applied as the
preprocessing step. The aim was the feature dimensionality reduction of the dataset that
was further fed into the RBF neural network.
In [58], the authors applied the one-class SVM for the target feature points selection.
Such an approach can be applied on a vision-based mobile robot. The authors also applied
the SOM for the creation of the visual words and the histograms from the selected features.

3.3. Unsupervised Learning for Obstacle Avoidance and Path Planning


The density-based spatial clustering of applications with noise (DBSCAN) method
was applied to mobile robots in [43,44]. It was applied for vanishing point estimation
based on the data registered with the use of an omnidirectional camera installed onboard a
wheeled mobile robot. The heading angle was estimated on the basis of the determined
vanishing point. Afterwards, it was used by the robot control system utilizing the fuzzy
logic controller. In the work of [42], the DBSCAN method was used for the detection of
mobile robots. In this approach, up to four mobile robots were detected with the use of
a stationary camera. The single k-dimensional tree-based technique called IDTrack was
applied for tracking the detected robots. The performance of the developed system was
evaluated in the experiments. The following metrics were used in the evaluation: precision,
recall, mean absolute error, and multi-object tracking accuracy. The mobile robots used
in this study had two differentially driven treads and a 9-degree-of-freedom inertial
measurement unit (IMU) with an accelerometer, a gyroscope and a magnetometer. In [41],
the DBSCAN method was applied to the obstacle clustering task using a grid map. The
method was used along with a graph search path-planning algorithm and was evaluated
in simulation tests using the MATLAB environment and the C# language on the Visual
Studio Ultimate 2012 platform.
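The obstacle clustering idea can be sketched with a compact DBSCAN implementation applied to the occupied cells of a grid map. This is a generic textbook version, not the code from [41]: the cell coordinates, the neighbourhood radius `eps` and the density threshold `min_pts` are assumed values chosen for the example.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point: keep expanding
    return labels

# Hypothetical occupied cells (row, column) of a grid map.
occupied = [(1, 1), (1, 2), (2, 1), (2, 2),   # first obstacle
            (7, 7), (7, 8), (8, 7),           # second obstacle
            (4, 10)]                          # isolated cell

labels = dbscan(occupied, eps=1.5, min_pts=3)
print(labels)
```

Unlike k-means, DBSCAN does not need the number of clusters in advance, and isolated cells are reported as noise rather than forced into an obstacle.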
The one-class support vector machine (SVM) algorithm was proposed for application
in the cloud-based mobile robotic system in [57]. The presented system was composed of a
robot local station and a cloud-based station. The robot local station was implemented on
four Raspberry Pi 4B microcomputers. A computer with an Intel Core i9 CPU, 64 GB RAM,
and NVIDIA GeForce RTX 3090 GPU was used as the cloud-based station. The robot local
station was responsible for the camera image data registration and for sending these data to
the cloud-based station. The cloud-based station was responsible for the image processing
with the use of the machine-learning algorithms in order to generate the actions for the
mobile robots. The one-class SVM was applied for the purpose of the unknown objects’
detection and for the execution of the incremental learning process. Principal component
analysis (PCA) was applied for the dimensionality reduction task. It was used in order
to enhance the speed of the one-class SVM model training. Thousands of features from
the images were extracted with the use of the CNN model in this approach. Therefore, the
dimensionality reduction of these features had to be carried out.
The application of the self-organizing map (SOM) technique was also proposed for
solving the path-planning problem of a group of cooperating mobile robots in [65].
The PCA method was also applied for the localization of the radiation sources [59]. A
mobile robot equipped with an imaging gamma-ray detector was used in this approach. A
path surrounding the radiation sources was generated with the use of the PCA, utilizing
the results of the previous measurements. The approach was verified with the use of
the simulation tests and the real experiments with the mobile robot Pioneer-3DX and the
all-around view Compton camera.

3.4. Unsupervised Learning for Motion Control


A different task associated with mobile robots is the design of the trajectory controller.
It is responsible for the control of the robot’s motion during the process of the predefined
path following. For this purpose, the authors in [47] presented the UL algorithm, which
was called the enhanced self-organizing incremental neural network (ESOINN). It was
introduced by Furao et al. in 2007 [70]. The method was proposed for the control of the
movement of the four-wheel skid steer mobile robot. The robot was developed for use
on the oil palm plantations. The robot control system was based on the Arduino Mega
microcontroller. The method also applied incremental learning. This feature allowed the
robot to adapt to new environments and enabled noise elimination. The
designed controller did not need the kinematic and dynamic models of the robot, as it
developed its own robot’s model during the training phase based on the registered motion
data. During the incremental learning stage, the previously measured trajectory data
and the simulated data were added to the system in order to achieve a better accuracy
of the path-following task. The ESOINN approach was also compared with two other
UL algorithms: the self-organizing map (SOM) method [71] and the k-means clustering
method [72], and also with one SL approach, based on the feed-forward neural network
with the adaptive weight learning (adaptive NN) [73]. The authors stated that their method
achieved the best results in terms of the path-following accuracy. It also accomplished the
shortest processing time during the incremental learning phase.
In the works of [68,69], the authors proposed the application of the spike timing-
dependent plasticity (STDP) algorithm for the autonomous navigation of a hybrid mini-
robot. The spiking neural network (SNN) was applied to the implementation of the
navigation control algorithm of the robot. The SNN is a type of artificial neural network
that is inspired by the spiking activity of the neurons in the brain. The SNN models
the behavior of the neurons using the spikes. They constitute a representation of the
discrete electrical pulses that are generated by the neurons during the transmission of the
information. The hybrid mini-robot contained the two wheel-legs modules in order to
enhance its walking ability in the rough terrain and the obstacle avoidance task. It was also
equipped with a manipulator for the object grasping and the obstacle climbing tasks. Two
ATmega128 8-bit microcontrollers were applied for the robot control along with a PC that
communicated with the robot through the RF wireless XBee module.
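The STDP learning rule itself can be summarised in a few lines. The sketch below uses the common pair-based exponential form of STDP. It is not taken from [68,69]: the amplitudes, the time constants and the weight bounds are assumed values chosen for illustration.

```python
import math

def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair; dt_ms = t_post - t_pre."""
    if dt_ms > 0:
        # Presynaptic spike precedes postsynaptic spike: potentiation.
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        # Postsynaptic spike precedes presynaptic spike: depression.
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0

def apply_stdp(weight, pre_spikes, post_spikes, w_min=0.0, w_max=1.0):
    """Accumulate pair-based updates over all spike pairs and clip the weight."""
    for t_pre in pre_spikes:
        for t_post in post_spikes:
            weight += stdp_delta_w(t_post - t_pre)
    return max(w_min, min(w_max, weight))

# Hypothetical spike times in milliseconds.
strengthened = apply_stdp(0.5, pre_spikes=[0.0, 50.0], post_spikes=[5.0, 55.0])
weakened = apply_stdp(0.5, pre_spikes=[5.0, 55.0], post_spikes=[0.0, 50.0])
print(strengthened > 0.5, weakened < 0.5)
```

Because the update depends only on the relative timing of spikes, no labels or reward signals are needed, which is why STDP is grouped with the unsupervised methods.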
In the work of [67], the STDP algorithm was proposed for attracting the robot toward
the target. The approach was tested in the simulations with the use of a mobile robot and a
robotic arm.
In [64], the authors proposed the self-organizing map (SOM) method for error com-
pensation in the attitude control of the quadruped robot. The model-free, adaptive unsu-
pervised nonlinear controller, called the motor-map-based nonlinear controller (MMC),
was introduced in this work. Its aim was to act as a feed-forward error compensator. The
approach was verified by extensive simulation tests.
In the work of [61], the improved PCA was applied to the fault detection task. It was
implemented in a system for the measurement of the mobile robot’s attitude with the use
of five gyroscopes.

3.5. Unsupervised Learning for Other Tasks


In the work of [62], PCA was applied to the coordination of creative tasks performed by
mobile robotic platforms. The developed system was verified in two tasks. The first was flower
pattern detection and painting on a canvas with the use of mobile robots. The second was
the detection of a person’s identity and mood, after which the mobile robots performed
creative art in order to enhance the person’s mood.
In [45], the authors introduced the UL approach for the task recognition problem. The
unsupervised contextual task learning and recognition approach was composed of two
phases. At first, the Dirichlet process Gaussian mixture model (DPGMM) was applied for
the clustering of the human motion patterns of the task executions. In the post-clustering
phase, the sparse online Gaussian process (SOGP) was applied for the classification of the
query point with the learned motion patterns. The holonomic mobile robot was used for the
evaluation of this approach. The 2D Laser Range Finder was applied for the environment
perception task. The data for the evaluation were collected during the performance of the
following four contextual task types: doorway crossing, object inspection, wall following
and object bypass. The results showed that the proposed approach was capable of detecting
unknown motion patterns that were different from those used in the training set.
The charts presenting the statistics of the UL algorithms applied in mobile robotics are
shown in Figures 8–10.

[Bar chart: "UL methods for mobile robots - distribution over years 2009-2023".]

Figure 8. The distribution over the years 2009–2023 of the UL methods for mobile robots considered
in this survey.

[Pie charts. Left panel: "UL methods - type of a mobile robot", with the categories wheeled mobile robot, walking robot, unspecified mobile robot and other type of mobile robot. Right panel: "UL approaches for mobile robots - evaluation method", with the categories sim., real exp., sim. and real exp., and sim. with real dataset.]

Figure 9. The types of mobile robots used in the works regarding UL methods considered in this
survey (left chart); the types of evaluation methods used in the works regarding the UL algorithms
for solving the problems of mobile robots in the works considered in this survey (right chart).

[Bar charts. Left panel: "UL methods for mobile robots - type of solved task". Right panel: "Type of UL method applied for mobile robots tasks".]

Figure 10. The types of solved tasks in the works regarding the UL methods considered in this survey
(left chart); the types of UL algorithms applied for solving the problems of mobile robots in the works
considered in this survey (right chart).

An analysis of the recent approaches for mobile robots based on the UL techniques
shows that these methods were applied to the following:
• Environment mapping;
• The robot’s position and orientation estimations;
• Simultaneous Localization and Mapping (SLAM);
• Data compression, that is, the reduction of the amount of data registered by the
robot’s sensors;
• The detection of anomalies and unusual patterns in the data registered by the robot’s sensors;
• The clustering of data registered with the use of the robot’s different sensors;
• The recognition of the different terrain types;
• The feature extraction from the data registered by the robot’s sensors;
• The exploration and the path planning in unknown or partially known environments;
• Autonomous learning, that is, the continuous refinement of the applied models
and representations of the environment based on the actually acquired experience.

4. Reinforcement Learning Approaches


This section provides an overview of the reinforcement machine learning methods
that were used in mobile robotics in the years 2012–2023.
Reinforcement learning (RL) does not rely on a previously collected dataset. Instead, it uses an agent to
generate the data. The agent interacts with the environment, learning the most beneficial
interactions. Signals in the form of rewards are applied in order to shape the
agent’s action policy. The agent has to discover which actions lead to the
biggest rewards.
The RL approaches can be classified into the following groups: value-based methods,
policy-based methods, actor–critic methods, and model-based methods. In addition, the following types of RL
algorithms are distinguished: multi-agent RL, hierarchical RL, inverse RL, and state
representation learning. The value-based RL approaches include algorithms based on Q-learning,
the deep Q-network (DQN), and the double deep Q-network. The policy-based RL
algorithms include proximal policy optimization (PPO). The actor–critic approaches use
algorithms based on advantage actor–critic (A2C), asynchronous advantage actor–critic
(A3C), and deep deterministic policy gradient (DDPG). The model-based RL approaches
are based on the Dyna-Q and the Model Predictive Control (MPC).

4.1. Reinforcement Learning for Obstacle Avoidance and Path Planning


The Q-learning algorithm was applied to path planning and obstacle avoidance tasks of
the mobile robots. The Q-learning method analyzes all states of the robot. The disadvantage
of this approach is the high computational cost. Therefore, an improved Q-learning model
was proposed in the paper [74]. This method allowed reducing the computational time
by limiting the impassable areas. In the paper [75], the authors introduced an approach
considering eight optional path directions for the robot. The dynamic exploration factor
was applied in this research in order to speed up the algorithm’s convergence. In [76], the
authors also proposed the application of the Q-learning algorithm for the mobile robot’s
path planning and collision avoidance.
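A minimal sketch of tabular Q-learning for grid-based path planning with eight movement directions is given below. It is a generic illustration rather than the method of [74], [75] or [76]: the grid size, obstacle positions, reward values, learning rate, discount factor and exploration rate are all assumptions made for this example.

```python
import random

# Eight movement directions (the 8-neighbourhood of a grid cell).
ACTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def step(state, action, size, obstacles, goal):
    """Deterministic grid transition: returns (next_state, reward, done)."""
    nxt = (state[0] + action[0], state[1] + action[1])
    inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    if not inside or nxt in obstacles:
        return state, -1.0, False      # blocked: stay in place, pay the step cost
    if nxt == goal:
        return nxt, 10.0, True
    return nxt, -1.0, False

def q_update(q, state, a, reward, nxt, done, alpha, gamma):
    """One tabular Q-learning update."""
    target = reward if done else reward + gamma * max(q[nxt])
    q[state][a] += alpha * (target - q[state][a])

def train(size, obstacles, start, goal, episodes=3000, alpha=0.5, gamma=0.9,
          epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(size) for c in range(size)}
    for _ in range(episodes):
        state = start
        for _ in range(4 * size * size):          # cap on the episode length
            if rng.random() < epsilon:            # explore
                a = rng.randrange(len(ACTIONS))
            else:                                 # exploit the current estimate
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done = step(state, ACTIONS[a], size, obstacles, goal)
            q_update(q, state, a, reward, nxt, done, alpha, gamma)
            state = nxt
            if done:
                break
    return q

def greedy_path(q, size, obstacles, start, goal):
    """Follow the highest-valued action from the start for at most size*size steps."""
    path, state = [start], start
    for _ in range(size * size):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a], size, obstacles, goal)
        path.append(state)
        if done:
            break
    return path

obstacles = {(1, 1), (2, 2), (3, 1)}
q_table = train(5, obstacles, start=(0, 0), goal=(4, 4))
path = greedy_path(q_table, 5, obstacles, start=(0, 0), goal=(4, 4))
print(path)
```

The Q-table stores one value per state and action, which is why the cost grows quickly with the size of the map and why the cited works look for ways to restrict the states that must be analyzed.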
Reinforcement learning in combination with the deep Q-network (DQN) was
applied to the training of mobile robots in different environments. In [77], the authors
proposed an approach for the autonomous navigation of a mobile robot. It was applied for
the robot’s movement from its current position to a desired final position with the use of
visual observation without a pre-built map of the environment. The authors trained the
DQN using a simulation environment and afterwards applied the developed solution to a
real-world mobile robot navigation scenario.
Another example of the deep RL approach applied to the mobile robot navigation
in a storage environment was presented in [78]. The survey data were supported by the
information obtained with the use of the LIDAR sensors. The algorithm detected obstacles
and performed actions in order to reach the target area. The Deep Q-network-based path
planning for a mobile robot was presented in the work of [79], where the information about
the environment was extracted from the RGB images. In the work of [80], the double DQN
model was used for mobile robot collision avoidance.
In [81], the authors applied the asynchronous advantage actor–critic method (A3C)
for the mobile robot motion planning in a dynamically changing environment, that is, in a
crowd. The proposed method was evaluated in the simulated pedestrian crowds and was
finally demonstrated using the real pedestrian traffic.
In [82], the A2C algorithm was found to be the best choice in terms of learning efficiency and
generalization for robot navigation. The state representation
learned through A2C achieved satisfactory performance for the task of navigating a mobile
robot in the laboratory experiments.
The deep deterministic policy gradient (DDPG) is an RL method applied to
controlling the robot’s movement in different navigation scenarios. It was applied to
the task of dynamic obstacle avoidance in complex environments in [83,84].
The Dyna-Q is an RL algorithm used for a mobile robot’s path planning in unfamiliar
environments. It incorporates the heuristic search strategies and the reactive navigation
rules in order to enhance the exploration and learning performance. In research presented
in [85], a mobile robot was able to find a collision-free path, thus fulfilling the task of the
autonomous navigation in the real world.

4.2. Reinforcement Learning for Motion Control and Other Tasks


In [86], the authors presented a lightweight RL method, called the mean-asynchronous
advantage actor–critic (M-A3C) for the real-time gait planning of the bipedal robots. In
this approach, the robot interacts with the environment, evaluates its control actions and
adjusts the control strategy based on the walking state. The proposed method allowed
attaining the continuous and stable gait planning for a bipedal robot, as demonstrated in
the various experiments.
In [87], the authors proposed the application of the convolutional proximal policy
optimization network for the mobile robot navigation problem without using a map. In [88],
the PPO algorithm was applied to the acceleration of the training process convergence and
the reduction of the vibrations in the mobile robotic arm.
In [89], the authors successfully implemented the DDPG algorithm for skid-steered
wheeled robot trajectory tracking. The RL was applied to training the agent in an unsuper-
vised manner. The effectiveness of the trained policy was demonstrated in the dynamic
model simulations with ground force interactions. The trained system met the requirement
of achieving a certain distance from the reference paths. This research demonstrated the
effectiveness of the DDPG in attaining the collision-free trajectories, reducing the path
distance and improving the movement time in the various robotic applications.
The model-based RL was applied to learning the models of the world and the achieve-
ment of long-term goals. The guided Dyna-Q was developed in order to reason about
the action knowledge and to improve the exploration strategies [90]. The deep Dyna-Q
algorithm was applied to controlling the formation in a cooperative multi-agent system,
accelerating the learning process and improving the formation achievement rate. The
verification of this approach was carried out through the simulations and in the real-world
conditions [91].
Model predictive control (MPC) is a control strategy increasingly used in robotics,
especially for mobile robots. The MPC allows optimal control under constraints, such as
in the real-time collision avoidance. It can be applied to controlling the mobile robots’
movement in real time, considering their dynamics and kinematics [92]. The MPC-based
motion-planning solutions were evaluated using various models, including the holonomic
and non-holonomic kinematic and dynamic models, also in the simulations and in the
real-world experiments [93]. The MPC was also applied for dynamic object tracking,
allowing the mobile robot to track and respond to the moving objects, at the same time
ensuring the safety of the process [94]. In addition, the MPC was also integrated with the
trajectory planning and the control of the autonomous mobile robots in the workspaces
shared with humans, considering the following constraints: energy efficiency, safety and
human-aware proximity.
During the model training, the repetition of successive trials is very important.
Experience replay is a technique used in RL in order to address the problem of
sparse rewards and to improve the learning performance. In [95], the authors presented a
method combining simple reward engineering and hindsight experience replay
(HER) in order to overcome the sparse reward problem in the autonomous driving of
a mobile robot. Another example was the implementation of the HER and curriculum
learning (CL) techniques in the process of learning the optimal navigation rules in
dense crowds without the need for additional demonstration data [96]. In [97],
the authors demonstrated the application of an RL acceleration algorithm based on
experience replay. It was applied in order to improve the performance of the RL
in the mobile robot navigation task.
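The core idea of hindsight relabeling can be sketched in a few lines: a failed episode is stored a second time with the state that was actually reached treated as the goal, so that at least one transition carries a non-penalty reward. The sketch below assumes the simple "final state" relabeling strategy, a sparse reward of 0 or -1 and grid-cell states; none of these details are taken from [95] or [96].

```python
def sparse_reward(state, goal):
    # 0 when the goal is reached, -1 otherwise.
    return 0.0 if state == goal else -1.0

def her_relabel(episode, reward_fn):
    """Relabel an episode with the finally reached state as the goal.

    episode: list of (state, action, next_state, goal) tuples.
    Returns transitions of the form (state, action, reward, next_state, goal).
    """
    achieved = episode[-1][2]  # "final" strategy: the last reached state
    relabeled = []
    for state, action, next_state, _ in episode:
        reward = reward_fn(next_state, achieved)
        relabeled.append((state, action, reward, next_state, achieved))
    return relabeled

# Hypothetical episode: the robot aimed for cell (5, 5) but only reached (0, 2).
episode = [((0, 0), "east", (0, 1), (5, 5)),
           ((0, 1), "east", (0, 2), (5, 5))]

original = [(s, a, sparse_reward(s2, g), s2, g) for s, a, s2, g in episode]
hindsight = her_relabel(episode, sparse_reward)

replay_buffer = original + hindsight
print([t[2] for t in original])
print([t[2] for t in hindsight])
```

Both the original and the relabeled transitions are placed in the replay buffer, so an off-policy learner can still extract a useful signal from an episode that never reached its intended goal.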
The charts presenting the statistics of the RL algorithms applied in mobile robotics are
shown in Figures 11 and 12.

[Left panel, bar chart: "RL methods for mobile robots - distribution over years 2018-2023". Right panel, pie chart: "RL approaches for mobile robots - evaluation method", with the categories sim., real exp., and sim. and real exp.]

Figure 11. The distribution over the years 2018–2023 of the RL methods for mobile robots considered
in this survey (left chart); the types of the evaluation methods used in the works regarding the
RL algorithms for solving the problems of mobile robots in the works considered in this survey
(right chart).

[Bar charts. Left panel: "RL methods for mobile robots - type of solved task", with the categories path planning, navigation, collision avoidance, learning to walk, motion planning, trajectory planning and other task. Right panel: "Type of RL method applied for mobile robots tasks".]

Figure 12. The types of the solved tasks in the works regarding the RL methods considered in this
survey (left chart); the types of the RL algorithms applied for solving the problems of mobile robots
in the works considered in this survey (right chart).

The approaches listed in Table 5 learn the model differently from the methods discussed in
the previous sections, because learning results from the interaction between the agent, the
policy it follows to learn the environment, and the dataset. The RL architecture trains the
model in a way intended to avoid overtraining it.
The analysis of the recent approaches for mobile robots based on the RL techniques
shows that these methods were applied to the following:
• Environment mapping;
• Robot navigation;
• Path planning;
• Exploration and path planning in unknown environments.

Table 5. A comparison of the RL algorithms for the mobile robotic tasks.

| Method | Authors | Year | Object | Task | Simulations/Real Exp. |
|---|---|---|---|---|---|
| Q-learning | Prakash et al. [74] | 2022 | Wheeled mobile robot | Path planning | Simulations |
| Q-learning | Ataollahi et al. [75] | 2023 | Cooperative mobile robots | Path planning, collision avoidance | Simulations |
| Q-learning | Kim et al. [76] | 2022 | Mobile robot | Path planning | Simulations |
| DQN | Yue et al. [77] | 2019 | Mobile robot | Navigation | Real exp. |
| DQN | Balachandran et al. [78] | 2022 | Mobile robot | Navigation | Real exp. |
| DQN | Zhou et al. [79] | 2018 | Mobile robots | Path planning | Real exp. |
| Double DQN | Xue et al. [80] | 2019 | Mobile robot | Collision avoidance | Simulations |
| PPO | Kokila et al. [88] | 2023 | Mobile robotic arm | Opening-door task | Sim. and real exp. |
| PPO | Toan et al. [87] | 2021 | Mobile robot | Mapless navigation | Simulations |
| A3C | Leng et al. [86] | 2022 | Biped robot | Learning to walk | Sim. and real exp. |
| A3C | Sasaki et al. [81] | 2019 | Mobile robot | Motion planning | Sim. and real exp. |
| A2C | Chen et al. [82] | 2023 | Mobile robot | Navigation | Sim. and real exp. |
| DDPG | Gao et al. [83] | 2020 | Mobile robot | Obstacle avoidance | Sim. and real exp. |
| DDPG | Nakamura et al. [84] | 2023 | Mobile robot | Local path planning | Sim. and real exp. |
| DDPG | Sandeep et al. [89] | 2022 | Mobile robot | Slip-steered wheel robots | Simulations |
| Dyna-Q | Pei et al. [85] | 2022 | Mobile robot | Path planning | Sim. and real exp. |
| Dyna-Q | Budiyanto et al. [90] | 2023 | Mobile robot | Navigation | Sim. and real exp. |
| Guided Dyna-Q | Hayamizu et al. [91] | 2020 | Mobile robot | Exploration and navigation | Sim. and real exp. |
| MPC | Piccinelli et al. [92] | 2023 | Mobile robots | Motion planning | Sim. and real exp. |
| MPC | Hong S et al. [93] | 2023 | Mobile robot | Motion planning | Sim. and real exp. |
| MPC | Chen et al. [94] | 2023 | Mobile robot | Trajectory planning | Sim. and real exp. |
| Experience replay | Park et al. [95] | 2022 | Mobile robots | Trajectory planning | Sim. and real exp. |
| Experience replay | Li et al. [96] | 2021 | Mobile robot | Curriculum learning | Sim. and real exp. |
| Experience replay | Duan et al. [97] | 2012 | Mobile robot | Navigation | Sim. and real exp. |

5. Semi-Supervised Learning Approaches


This section provides an overview of the semi-supervised machine learning methods
that were used in mobile robotics in the years 2003–2021.
Semi-supervised learning (SSL) incorporates the assumptions of both supervised
and unsupervised learning. It works on a dataset containing both labeled and unlabeled data.
The main advantage of such an approach is that the model’s performance is improved
by using the labeled data together with the additional information extracted
from the unlabeled data. Application areas of semi-supervised learning include
natural language processing, computer vision, and speech recognition.
Semi-supervised learning algorithms utilize different techniques, for example
self-training. In this concept, the model is first trained on the labeled data.
Afterwards, it is used to predict labels for the unlabeled data, and the newly labeled
(pseudo-labeled) samples are added to the labeled dataset. Another approach is based on
co-training, which involves training multiple models using different feature sets or
subsets of the data. As in the self-training approach, each model is first trained on
the labeled data and is then used to predict the labels of the unlabeled
data. There also exist multi-view learning methods, which use multiple perspectives of
the data, such as different feature sets or data representations, in order to improve
the performance of the developed model.
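The self-training loop described above can be sketched in a few lines. This is only an illustrative toy, not a method from the surveyed works: it assumes 1-D features, a nearest-centroid classifier, at least two classes, and a hypothetical confidence threshold `min_margin`; the terrain labels are invented for the example.

```python
def fit_centroids(points, labels):
    """Return {label: mean of the 1-D points carrying that label}."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Return (label, margin); the margin is the difference between the
    distances to the second-nearest and the nearest centroid."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]  # requires at least two classes
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and refit."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(centroids, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing can be labeled with enough confidence
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = [x for x, _ in rest]
    return fit_centroids(xs, ys), pool


# Two labeled samples and five unlabeled ones (hypothetical terrain feature).
centroids, leftover = self_train(
    labeled_x=[0.0, 10.0],
    labeled_y=["grass", "gravel"],
    unlabeled_x=[1.0, 2.0, 9.0, 8.0, 5.1],
)
```

In this toy run, four of the five unlabeled samples are absorbed into the training set, while the ambiguous sample near the class boundary stays unlabeled. Leaving low-confidence samples out is a common way to limit the propagation of wrong pseudo-labels.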
Semi-supervised learning methods for mobile robots are considerably less common in
the current literature than the other ML approaches. Table 6 presents a comparative analysis of the recent semi-supervised
learning algorithms proposed for solving the different problems in mobile robotics.

Table 6. A comparison of the semi-supervised learning algorithms for the mobile robots.

| Method | Authors | Year | Object | Task | Simulations/Real Exp. |
|---|---|---|---|---|---|
| SSIL | Lee et al. [98] | 2022 | RC car | Trajectory controller | Sim. and real exp. |
| RNN | Ahmadi et al. [99] | 2021 | Quadruped robot | Terrain classification | Exp. using real-world datasets |
| VAE semi-sup. model | Qian et al. [100] | 2021 | Mobile robot | Indoor localization | Exp. using real-world datasets |
| SMMDN | Li [101] | 2020 | Mobile robot | Terrain classification | Simulations |
| LapERLS, time-series learn., LapLS | Yoo and Johansson [102] | 2017 | Mobile robot | Indoor localization | Real exp. |
| KPCA-RLS | Wu et al. [103] | 2016 | Mobile robot | Indoor localization | Real exp. |
| DTW | Großmann et al. [104] | 2003 | Wheeled mobile robot | Clustering robot experiences | Real exp. |

5.1. Semi-Supervised Learning for Robot Localization


One of the tasks where the SSL methods are applied is robot localization. In [100],
the authors proposed the application of a variational autoencoder (VAE)-based
semi-supervised learning model for indoor localization. It constitutes a deep learning-based
model. The approach was verified with the use of two real-world datasets. The obtained
results showed that the introduced VAE-based model outperformed the conventional
machine learning and deep learning methods it was compared with.
In the work of [102], the authors proposed the SSL algorithm for the indoor localization
task. In this approach, the Laplacian embedded regression least square (LapERLS) was
applied for the pseudo-labeling process. Afterwards, the time-series regularization was
applied in order to sort the dataset in chronological order. Then, the Laplacian least square
(LapLS) was applied. It combines manifold regularization and the transductive support
vector machine (TSVM) method. The approach was verified by the real experiments with
the use of a smartphone mobile robot for the estimation of unknown locations.
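For orientation, manifold regularization in its generic least-squares form can be written as follows. This is the standard Laplacian-regularized least-squares objective, given here as an assumption about the general family of methods; the exact formulation of LapERLS and LapLS in [102] may differ. The symbols $l$ (number of labeled samples), $u$ (number of unlabeled samples), $\gamma_A$, $\gamma_I$ (regularization weights) and $L$ (graph Laplacian built over all $l+u$ samples) are introduced only for this illustration.

```latex
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} \bigl( y_i - f(x_i) \bigr)^{2}
  + \gamma_A \, \lVert f \rVert_{K}^{2}
  + \frac{\gamma_I}{(l+u)^{2}} \, \mathbf{f}^{\top} L \, \mathbf{f},
\qquad
\mathbf{f} = \bigl[ f(x_1), \dots, f(x_{l+u}) \bigr]^{\top}
```

The first term fits the labeled samples, the second controls the complexity of $f$ in the kernel space, and the third penalizes predictions that vary strongly between neighboring samples, which is how the unlabeled data influence the solution.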
In [103], the authors introduced the kernel principal component analysis (KPCA)-
regularized least-square (RLS) algorithm for robot localization with uncalibrated monocular
visual information. The approach also utilized the semi-supervised learning technique.
The method was verified with the use of a wheeled mobile robot equipped with a stereo
camera and a wireless adapter. One computer was used for tracking the robot and the
registration of its positions. Another computer controlled the robot’s movement along a
defined path. The authors stated that their “online localization algorithm outperformed the
state-of-the-art appearance-based SLAM algorithms at a processing rate of 30 Hz for new
data on a standard PC with a camera”.

5.2. Semi-Supervised Learning for Terrain Classification


Another task that was solved with the use of the semi-supervised learning is terrain
classification. In [99], the authors proposed the application of semi-supervised gated
recurrent neural networks (RNN) for the terrain classification with the use of a quadruped
robot. The method was verified in experiments with the use of the real-world datasets and
was compared with the results obtained with the use of the supervised learning approach.
The results showed that the semi-supervised model outperformed the supervised model in
the situations with only small amounts of the available labeled data. In [101], the authors
proposed an approach to the terrain classification based on visual image processing and the
semi-supervised multimodal deep network (SMMDN). The presented simulation results
proved that the SMMDN contributed to the improvement of the mobile robots’ perception
and recognition abilities in the complex outdoor environments.

5.3. Semi-Supervised Learning for Motion Control and Other Tasks


In [98], a semi-supervised learning approach called semi-supervised imitation learning
(SSIL) was proposed for the development of a trajectory controller. The method was tested
with the use of an RC car. In [104], the authors presented a semi-supervised method for
the clustering of the robot’s experiences using dynamic time warping (DTW). The method
was evaluated in real-world experiments with the use of the Pioneer 2 robot. The authors
stated that they achieved 91% accuracy in the classification of the high-dimensional robot
time series.
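Dynamic time warping compares two time series that may differ in length or speed by finding the cheapest monotonic alignment between them. The sketch below is the textbook dynamic-programming formulation for 1-D sequences with an absolute-difference local cost; it is not the implementation from [104], and the function name `dtw_distance` and the sample sequences are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance
    between two 1-D sequences, using |a_i - b_j| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays (stretch b)
                cost[i][j - 1],      # b advances, a stays (stretch a)
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical sensor profiles: the same pattern recorded at two speeds,
# and a different, constant pattern.
fast = [1, 2, 3]
slow = [1, 2, 2, 3]
flat = [0, 0, 0]
same_shape = dtw_distance(fast, slow)
different_shape = dtw_distance(fast, flat)
```

Because the slower recording is only a time-stretched copy of the faster one, its DTW distance is zero, whereas the flat profile is clearly separated. A pairwise distance matrix built this way can be passed to any distance-based clustering procedure.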
The charts presenting the statistics of the SSL algorithms applied in mobile robotics
are shown in Figures 13 and 14.

[Figure 13 charts. Left panel: "SSL methods - type of a mobile robot" (categories: Wheeled mobile robot, Walking robot, Unspecified mobile robot, Other type of mobile robot). Right panel: "SSL approaches for mobile robots - evaluation method" (categories: Sim., Real exp., Sim. and real exp.).]

Figure 13. The types of mobile robots used in works regarding the SSL methods considered in this
survey (left chart); the types of evaluation methods used in the works regarding the SSL algorithms
for solving the problems of mobile robots in the works considered in this survey (right chart).

[Figure 14 chart: "SSL methods for mobile robots - type of solved task" (categories: Indoor localization, Trajectory controller, Terrain classification, Clustering robot exp.).]

Figure 14. The types of the solved tasks in the works regarding the SSL methods considered in
this survey.

The analysis of the recent approaches for mobile robots based on the semi-supervised
learning techniques allowed us to conclude that these methods were applied for the following:
• Object detection and recognition, specifically for the terrain classification;
• Simultaneous Localization and Mapping (SLAM), specifically for the indoor and
outdoor localization;
• Motion control, in the development of a trajectory controller;
• Clustering of the robot’s experiences in order to construct a model of its interaction
with the environment.

6. Open Issues and Future Prospects


The literature review of the machine learning methods applied to the different tasks
associated with the control of mobile robots allowed the authors of this survey to realize
that this topic covers a large number of issues and proposed approaches.
The main challenges and open issues related to this topic are stated below. The
relation between the application of complex machine learning algorithms and the
limited computational resources of the control system on board the mobile robot
constitutes a challenge. The application of machine learning algorithms that enable the
mobile robot to make decisions and control its motion in real time is also a challenging issue.
The adaptability of the algorithms to dynamic, changing environments also constitutes
a very demanding open issue.
Many machine learning algorithms need a large amount of data in order to achieve
satisfactory performance. Therefore, the acquisition of large volumes of valuable data for
training the model can be a difficult task. In such cases, the solution might be
to develop effective algorithms that can learn from limited or unlabeled data.
Another issue is the robustness and the generalization of the models. This is essential for
achieving reliable operation of the mobile robot.
The challenge, especially with regard to the application of the self-learning algorithms,
might also be the assurance of the safety and reliability of the robot’s operation in the vari-
ous environments and situations. This is particularly important in cases where interactions
with humans occur.
The review allowed the detection of relationships between the specific machine learn-
ing methods or groups of methods and the associated mobile robotics tasks. One of the
most easily noticeable cases is the application of reinforcement learning methods for the
mobile robot’s path planning. However, it does not mean that this is the ideal method for
solving this task and that the application of the other approaches is useless.
According to the authors, one of the significant future prospects is the development
of self-learning algorithms for mobile robots performing different tasks—especially those
replacing humans in the dangerous and/or physically exhausting activities. The important
aspect that should be considered in future research is the assurance of safe and reliable
operation. The development of the nature-inspired robots, for example the walking robots,
implementing the ML methods, also constitutes a demanding open research task.

7. Conclusions
The paper presents a comprehensive review of the different machine learning methods
applied in mobile robotics. The survey was divided into sections dedicated to the vari-
ous subgroups of the ML approaches, including supervised learning (SL), unsupervised
learning (UL), reinforcement learning (RL), and semi-supervised learning (SSL).
The reviewed literature covered the years 2003–2023 and came from the following scientific
databases: Scopus, Web of Science, IEEE Xplore, and ScienceDirect. The works were
primarily evaluated in terms of the applied method (SL, UL, RL or SSL) and the considered
task, where the main solved problems were environment mapping, robot localization,
SLAM, sensor data dimensionality reduction, fault detection, trajectory control, obstacle
clustering, and path planning.
Every approach was also evaluated with regard to the object for which the problem
was solved. In this criterion, the mobile robots were distinguished as wheeled mobile
robots or legged robots, such as quadruped robots, and hexapod robots. In some of the
works, the exact mobile platform was not specified. In such cases, the control object was
defined as a mobile robot without any specifications. The overview of the ML approaches
also included the verification method, where the possibilities included simulations, real
experiments, simulations with real-world datasets, and evaluation by both simulations and
real experiments.
The above-described analysis allowed the authors to formulate the following conclud-
ing remarks:

• Supervised machine learning is often used in mobile robotics for a variety of tasks;
based on the literature review, it can be seen that both the regression and the classifi-
cation methods were applied to navigation, terrain recognition and obstacle avoid-
ance tasks;
• According to the authors, the supervised learning algorithms are less useful than other
methods for learning to walk and path-planning tasks;
• Interesting studies were those of Zhang et al. [35] and Sanusi et al. [37], which used the random forest and
the ANN algorithms, respectively, to classify the terrain on which the mobile robot was moving;
the results of these works showed the usefulness of the classification for making certain
decisions when performing the robot’s movement;
• Reinforcement learning is represented by many examples of approaches
utilized for learning the environment;
• The mobile robotics approaches utilizing the semi-supervised learning methods are
less frequently applied than the other ML algorithms;
• Most of the works considered problems related to wheeled mobile robots, whereas
solutions for walking robots are less common;
• Many methods concentrated on the tasks such as the position estimation or the
environment mapping; a potential application area that is less extensively explored is
the development of the ML-based methods for self-learning robots in order to enhance
their autonomy;
• To sum up, ML is characterized by significant implementation potential in many
aspects of robotic tasks and problems.

Author Contributions: Conceptualization, M.R. and A.L.; methodology, M.R. and A.L.; validation,
M.R. and A.L.; formal analysis, M.R. and A.L.; investigation, M.R. and A.L.; resources, M.R., N.P.
and A.L.; data curation, M.R. and A.L.; writing—original draft preparation, M.R. and A.L.; writing—
review and editing, M.R. and A.L.; visualization, M.R. and A.L.; supervision, M.R. and A.L.; project
administration, M.R. and A.L.; funding acquisition, M.R. and A.L. All authors have read and agreed
to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: The data presented in this study are openly available in Scopus, Web of
Science, IEEE Xplore, and ScienceDirect.
Conflicts of Interest: The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript:

ACO Ant Colony Optimization
AI Artificial Intelligence
AMR Autonomous Mobile Robot
ANN Artificial Neural Network
ASGDLR Adaptive Stochastic Gradient Descent Linear Regression
A2C Advantage Actor–Critic
A3C Asynchronous Advantage Actor–Critic
DBSCAN Density-Based Spatial Clustering of Applications with Noise
CL Curriculum Learning
CNN Convolutional Neural Network
DDPG Deep Deterministic Policy Gradient
DL Deep Learning
DNN Deep Neural Network
DPGMM Dirichlet Process Gaussian Mixture Model
DQN Deep Q-Network
DTW Dynamic Time Warping
EM-HMM Hidden Markov Model Trained Using Expectation-Maximization
ESOINN Enhanced Self-Organizing Incremental Neural Network

FL Fuzzy Logic
GA Genetic Algorithm
GA-ELM Extreme Learning Machine-Based Genetic Algorithm
GPR Gaussian Process Regression
HER Hindsight Experience Replay
IMU Inertial Measurement Unit
IPS Indoor Positioning System
KNN K-Nearest Neighbors algorithm
KPCA Kernel Principal Component Analysis
LapERLS Laplacian Embedded Regression Least Square
LapLS Laplacian Least Square
LCDA Louvain Community Detection Algorithm
LSTM Long Short-Term Memory
MKF Mixed Kernel Function
ML Machine Learning
MPC Model Predictive Control
NN Neural Network
PCA Principal Component Analysis
PPO Proximal Policy Optimisation
PSO Particle Swarm Optimization
RBF Radial Basis Function neural network
RL Reinforcement Learning
RLS Regularized Least-Square Algorithm
RNN Recurrent Neural Network
RPCA-ELM Robust Principal Component Analysis-Extreme Learning Machine
SL Supervised Learning
SLAM Simultaneous Localization and Mapping
SLINK Single-Linkage Agglomerative Algorithm
SMMDN Semi-Supervised MultiModal Deep Network
SNN Spiking Neural Network
SOGP Sparse Online Gaussian Process
SOM Self-Organizing Map
SSIL Semi-Supervised Imitation Learning
SSL Semi-Supervised Learning
STDP Spike Timing Dependent Plasticity
SVM Support Vector Machines
TSVM Transductive Support Vector Machine
UL Unsupervised Learning
VAE Variational AutoEncoder

References
1. Rahmani, A.M.; Yousefpoor, E.; Yousefpoor, M.S.; Mehmood, Z.; Haider, A.; Hosseinzadeh, M.; Ali Naqvi, R. Machine Learning
(ML) in Medicine: Review, Applications, and Challenges. Mathematics 2021, 9, 2970. [CrossRef]
2. Verma, A.A.; Murray, J.; Greiner, R.; Cohen, J.P.; Shojania, K.G.; Ghassemi, M.; Straus, S.E.; Pou-Prom, C.; Mamdani, M.
Implementing machine learning in medicine. Can. Med. Assoc. J. 2021, 193, E1351–E1357. [CrossRef] [PubMed]
3. May, M. Eight ways machine learning is assisting medicine. Nat. Med. 2021, 27, 2–3. [CrossRef] [PubMed]
4. Almaazmi, A.; Karmastaji, E.; Atallah, S.; Alkhazaleh, H.A.; Manoor, W. Learning Analytics and Machine Learning. In Proceedings
of the 5th International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates, 7–8
December 2022; pp. 164–168. [CrossRef]
5. Pinto, A.S.; Abreu, A.; Costa, E.; Paiva, J. How Machine Learning (ML) is Transforming Higher Education: A Systematic Literature
Review. J. Manag. Inf. Syst. 2023, 8, 21168. [CrossRef]
6. Nutonen, K.; Kuts, V.; Otto, T. Industrial Robot Training in the Simulation Using the Machine Learning Agent. Procedia Comput.
Sci. 2023, 217, 446–455. [CrossRef]
7. Tagliani, F.L.; Pellegrini, N.; Aggogeri, F. Machine Learning Sequential Methodology for Robot Inverse Kinematic Modelling.
Appl. Sci. 2022, 12, 9417. [CrossRef]
8. Ibrahim, F.; Boussaid, B.; Abdelkrim, M.N. Fault detection in wheeled mobile robot based Machine Learning. In Proceedings of
the 19th International Multi-Conference on Systems, Signals & Devices (SSD), Sétif, Algeria, 6–10 May 2022; pp. 58–63. [CrossRef]

9. Kulaç, N.; Engin, M. Developing a Machine Learning Algorithm for Service Robots in Industrial Applications. Machines 2023,
11, 421. [CrossRef]
10. ZiXuan, L.; Qingchuan, W.; Bingsong, Y. Reinforcement Learning-Based Path Planning Algorithm for Mobile Robots. Wirel.
Commun. Mob. Comput. 2022, 2022, 1859020. [CrossRef]
11. Zhu, Y.; Schwab, D.; Veloso, M. Learning Primitive Skills for Mobile Robots. In Proceedings of the International Conference on
Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 7597–7603. [CrossRef]
12. Li, Y.; Huang, Z.; Xie, Y. Path planning of mobile robot based on improved genetic algorithm. In Proceedings of the 3rd
International Conference on Electron Device and Mechanical Engineering (ICEDME), Suzhou, China, 1–3 May 2020; pp. 691–695.
[CrossRef]
13. Rebouças Filho, P.; Silva, S.; Ohata, E.; Almeida, J.; Sousa, P.; Nascimento, N.; Silva, F. A New Strategy for Mobile Robots
Localization based on Omnidirectional Sonar Images and Machine Learning. In Proceedings of the 32nd Conference on Graphics,
Patterns and Images, Rio de Janeiro, Brazil, 28–31 October 2019; pp. 168–171. [CrossRef]
14. Ma, J.; Duan, X.; Shang, C.; Ma, M.; Zhang, D. Improved Extreme Learning Machine Based UWB Positioning for Mobile Robots
with Signal Interference. Machines 2022, 10, 218. [CrossRef]
15. Mac, T.T.; Copot, C.; Tran, D.T.; De Keyser, R. Heuristic approaches in robot path planning: A survey. Robot. Auton. Syst. 2016, 86,
13–28. [CrossRef]
16. Soori, M.; Arezoo, B.; Dastres, R. Artificial intelligence, machine learning and deep learning in advanced robotics, a review. Cogn.
Robot. 2023, 3, 54–70. [CrossRef]
17. Wang, S.; Chaovalitwongse, W.; Babuska, R. Machine Learning Algorithms in Bipedal Robot Control. IEEE Trans. Syst. Man
Cybern. Part C (Appl. Rev.) 2012, 42, 728–743. [CrossRef]
18. Zhang, Y.; Wu, Y.; Tong, K.; Chen, H.; Yuan, Y. Review of Visual Simultaneous Localization and Mapping Based on Deep Learning.
Remote Sens. 2023, 15, 2740. [CrossRef]
19. Moreno, J.; Clotet, E.; Lupiañez, R.; Tresanchez, M.; Martínez, D.; Pallejà, T.; Casanovas, J.; Palacín, J. Design, Implementation
and Validation of the Three-Wheel Holonomic Motion System of the Assistant Personal Robot (APR). Sensors 2016, 16, 1658.
[CrossRef]
20. Cook, G. Mobile Robots: Navigation, Control and Remote Sensing; Wiley-IEEE Press: Hoboken, NJ, USA, 2011. Available online:
https://ieeexplore.ieee.org/book/6047594 (accessed on 2 January 2024).
21. Sharma, A.; Patel, R.K.; Thapa, V.; Gairola, B.; Pandey, B.; Epenetus, B.A.; Choudhury, S.; Mondal, A.K. Investigation on optimized
relative localization of a mobile robot using regression analysis. In Proceedings of the 2016 International Conference on Robotics:
Current Trends and Future Challenges (RCTFC), Thanjavur, India, 19–20 December 2016; pp. 1–6. [CrossRef]
22. Das, S.; Mishra, S.K. A Machine Learning approach for collision avoidance and path planning of mobile robot under dense and
cluttered environments. Comput. Electr. Eng. 2022, 103, 108376. [CrossRef]
23. Naveen, V.; Aasish, C.; Kavya, M.; Vidhyalakshmi, M.; Sailaja, K. Autonomous obstacle avoidance robot using regression. In
Lecture Notes on Data Engineering and Communications Technologies, Proceedings of International Conference on Computational Intelligence
and Data Engineering; Springer: Singapore, 2021; Volume 56. [CrossRef]
24. Gonzalez, R.; Fiacchini, M.; Iagnemma, K. Slippage prediction for off-road mobile robots via machine learning regression and
proprioceptive sensing. Robot. Auton. Syst. 2018, 105, 85–93. [CrossRef]
25. Crnokić, B.; Peko, I.; Grubišić, M. Artificial neural networks-based simulation of obstacle detection with a mobile robot in a
virtual environment. Int. Robot. Auto J. 2023, 9, 62–67. [CrossRef]
26. Ballesta, M.; Payá, L.; Cebollada, S.; Reinoso, O.; Murcia, F. A CNN Regression Approach to Mobile Robot Localization Using
Omnidirectional Images. Appl. Sci. 2021, 11, 7521. [CrossRef]
27. Swere, E.; Mulvaney, D.J. Robot navigation using decision trees. Electron. Syst. Control Div. Res. 2003. Available online:
https://api.semanticscholar.org/CorpusID:16567021 (accessed on 6 November 2023).
28. Swere, E.; Mulvaney, D.; Sillitoe, I. A fast memory-efficient incremental decision tree algorithm in its application to mobile robot
navigation. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9–15
October 2006; pp. 645–650. [CrossRef]
29. Roth, A.M.; Liang, J.; Manocha, D. XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees. In
Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic,
27 September–1 October 2021; pp. 2053–2060. [CrossRef]
30. Sarah, M.; Riadh, B.-A. Comparison of Classification Techniques for Wall Following Robot Navigation and Improvements to the
KNN Algorithm. Comput. Sci. Inf. Technol. 2019, 9, 73–87. [CrossRef]
31. Vega, J.E.M.; Chaidez, A.G.; García, C.M.; López, M.R.; Fuentes, W.F.; Sergiyenko, O. Recognition System by Using Machine
Vision Tools and Machine Learning Techniques for Mobile Robots. In Examining Optoelectronics in Machine Vision and Applications
in Industry 4.0; IGI Global: Hershey, PA, USA, 2019. Available online: https://www.irma-international.org/viewtitle/269678/?isxn=9781799865223
(accessed on 6 November 2023).
32. Zheng, Y.; Hu, X.; Sun, H. Research on Motion Control for a Mobile Robot Using Learning Control Method. Appl. Math. Nonlinear
Sci. 2021, 6, 227–234. [CrossRef]
33. Liu, Z.; Wang, L.; Zhang, Y.; Chen, C.L.P. A SVM controller for the stable walking of biped robots based on small sample sizes.
Appl. Soft Comput. 2016, 38, 738–753. [CrossRef]

34. Liao, W. Ground classification based on optimal random forest model. In Proceedings of the 2023 IEEE International Conference
on Control, Electronics and Computer Technology (ICCECT), Jilin, China, 28–30 April 2023; pp. 709–714. [CrossRef]
35. Zhang, H.; Dai, X.; Sun, F.; Yuan, J. Terrain classification in field environment based on Random Forest for the mobile robot. In
Proceedings of the 2016 35th Chinese Control Conference (CCC), Chengdu, China, 27–29 July 2016; pp. 6074–6079. [CrossRef]
36. Becker, F.; Ebner, M. Collision Detection for a Mobile Robot using Logistic Regression. In Proceedings of the 16th International
Conference on Informatics in Control, Automation and Robotics (ICINCO 2019), Prague, Czech Republic, 29–31 July 2019;
pp. 167–173. Available online: https://www.scitepress.org/Papers/2019/77686/77686.pdf (accessed on 6 November 2023).
37. Sanusi, M.A.; Dewantara, B.S.B.; Sigit, R.S. Online Terrain Classification Using Neural Network for Disaster Robot Application.
Indones. J. Comput. Sci. 2023, 12, 48–62. [CrossRef]
38. Hoshino, S.; Yoshida, Y. Motion Planner based on CNN with LSTM through Mediated Perception. In Proceedings of the 2022
61st Annual Conference of the Society of Instrument and Control Engineers (SICE), Kumamoto, Japan, 6–9 September 2022;
pp. 622–627. [CrossRef]
39. Kozlowski, P.; Walas, K. Deep neural networks for terrain recognition task. In Proceedings of the 2018 Baltic URSI Symposium
(URSI), Poznan, Poland, 15–17 May 2018; pp. 283–286. [CrossRef]
40. Giguere, P.; Dudek, G. Clustering sensor data for autonomous terrain identification using time-dependency. Auton. Robot. 2009,
26, 171–186. [CrossRef]
41. Wang, L.; Sun, L. Path Planning Algorithm Based on Obstacle Clustering Analysis and Graph Search. Symmetry 2023, 15, 1498.
[CrossRef]
42. Iaboni, C.; Patel, H.; Lobo, D.; Choi, J.-W.; Abichandani, P. Event Camera Based Real-Time Detection and Tracking of Indoor
Ground Robots. IEEE Access 2021, 9, 166588–166602. [CrossRef]
43. Hernández, D.C.; Hoang, V.-D.; Filonenko, A.; Jo, K.-H. Vision-based heading angle estimation for an autonomous mobile robots
navigation. In Proceedings of the IEEE 23rd International Symposium on Industrial Electronics (ISIE), Istanbul, Turkey, 1–4 June
2014; pp. 1967–1972. [CrossRef]
44. Hernández, D.C.; Hoang, V.-D.; Filonenko, A.; Jo, K.-H. Fuzzy Logic Guidance Control Systems for Autonomous Navigation
Based on Omnidirectional Sensing. In Modern Advances in Applied Intelligence, IEA/AIE 2014; Lecture Notes in Computer Science;
Ali, M., Pan, J.S., Chen, S.M., Horng, M.F., Eds.; Springer: Cham, Switzerland, 2014; Volume 8481. [CrossRef]
45. Gao, M.; Kohlhaas, R.; Zöllner, J.M. Unsupervised Contextual Task Learning and Recognition for Sharing Autonomy to Assist
Mobile Robot Teleoperation. In Proceedings of the 13th International Conference on Informatics in Control, Automation and
Robotics (ICINCO 2016), Lisbon, Portugal, 29–31 July 2016; SCITEPRESS-Science and Technology Publications, Lda: Setubal,
Portugal, 2016; pp. 238–245. [CrossRef]
46. Xu, P.; Ding, L.; Li, Z.; Yang, H.; Wang, Z.; Gao, H.; Zhou, R.; Su, Y.; Deng, Z.; Huang, Y. Learning physical characteristics like
animals for legged robots. Natl. Sci. Rev. 2023, 10, nwad045. [CrossRef]
47. Juman, M.A.; Wong, Y.W.; Rajkumar, R.K.; Kow, K.W.; Yap, Z.W. An incremental unsupervised learning based trajectory controller
for a 4 wheeled skid steer mobile robot. Eng. Appl. Artif. Intell. 2019, 85, 385–392. [CrossRef]
48. Lameski, P.; Kulakov, A.; Davcev, D. Learning and position estimation of a mobile robot in an indoor environment using
FuzzyART neural network. In Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics,
Singapore, 14–17 July 2009; pp. 770–774. [CrossRef]
49. Lameski, P.; Kulakov, A. Position Estimation of Mobile Robots Using Unsupervised Learning Algorithms. In Proceedings
of the ICT Innovations Conference 2009, Ohrid, Macedonia, 28–30 September 2009; Davcev, D., Gómez, J.M., Eds.; Springer:
Berlin/Heidelberg, Germany, 2010. [CrossRef]
50. Goodwin, L.; Nokleby, S. A K-Means Clustering Approach to Segmentation of Maps for Task Allocation in Multi-robot Systems
Exploration of Unknown Environments. In Mechanisms and Machine Science, Proceedings of the 2022 USCToMM Symposium on
Mechanical Systems and Robotics, Rapid City, SD, USA, 19–21 May 2022; Larochelle, P., McCarthy, J.M., Eds.; Springer: Cham,
Switzerland, 2022; Volume 118, pp. 198–211. [CrossRef]
51. Hernández, J.; Savage, J.; Negrete, M.; Contreras, L.; Sarmiento, C.; Fuentes, O.; Okada, H. Sparse-Map: Automatic topological
map creation via unsupervised learning techniques. Adv. Robot. 2022, 36, 825–835. [CrossRef]
52. Ravankar, A.A.; Hoshino, Y.; Emaru, T.; Kobayashi, Y. Robot Mapping Using k-means Clustering of Laser Range Sensor Data.
Bull. Netw. Comput. Syst. Softw. 2016, 1, 9–12.
53. Errachdi, A.; Benrejeb, M. Online identification using radial basis function neural network coupled with KPCA. Int. J. Gen. Syst.
2017, 46, 52–65. [CrossRef]
54. Shamsfakhr, F.; Sadeghibigham, B. A Neural Network Approach to Navigation of a Mobile Robot and Obstacle Avoidance in
Dynamic and Unknown Environments. Turk. J. Electr. Eng. Comput. Sci. 2017, 25, 1629–1642. [CrossRef]
55. Çatal, O.; Jansen, W.; Verbelen, T.; Dhoedt, B.; Steckel, J. LatentSLAM: Unsupervised multi-sensor representation learning for
localization and mapping. arXiv 2021, arXiv:2105.03265. [CrossRef]
56. Balaska, V.; Bampis, L.; Boudourides, M.; Gasteratos, A. Unsupervised semantic clustering and localization for mobile robotics
tasks. Robot. Auton. Syst. 2020, 131, 103567. [CrossRef]
57. Kabir, R.; Watanobe, Y.; Islam, M.R.; Naruse, K.; Rahman, M.M. Unknown Object Detection Using a One-Class Support Vector
Machine for a Cloud–Robot System. Sensors 2022, 22, 1352. [CrossRef]
58. Madokoro, H.; Tsukada, M.; Sato, K. Unsupervised feature selection and category formation for mobile robot vision. In
Proceedings of the 2011 International Joint Conference on Neural Networks, San Jose, CA, USA, 31 July–5 August 2011;
pp. 320–327. [CrossRef]
59. Kishimoto, T.; Woo, H.; Komatsu, R.; Tamura, Y.; Tomita, H.; Shimazoe, K.; Yamashita, A.; Asama, H. Path Planning for
Localization of Radiation Sources Based on Principal Component Analysis. Appl. Sci. 2021, 11, 4707. [CrossRef]
60. Cui, W.; Liu, Q.; Zhang, L.; Wang, H.; Lu, X.; Li, J. A robust mobile robot indoor positioning system based on Wi-Fi. Int. J. Adv.
Robot. Syst. 2020, 17, 1729881419896660. [CrossRef]
61. Zhou, Z.; Zhang, Q.; Zhao, Q.; Chen, R.; Zeng, Q. An Improved Principal Component Analysis in the Fault Detection of
Multi-sensor System of Mobile Robot. Int. J. Online Biomed. Eng. 2018, 14, 82–97. [CrossRef]
62. Qayum, M.A.; Nahar, N.; Siddique, N.A.; Saifullah, Z.M. Interactive intelligent agents with creative minds: Experiments with
mobile robots in cooperating tasks by using machine learning. In Proceedings of the IEEE International Conference on Imaging,
Vision & Pattern Recognition (icIVPR), Dhaka, Bangladesh, 13–14 February 2017; pp. 1–6. [CrossRef]
63. Erkent, Ö.; Karaoguz, H.; Isıl Bozma, H. Hierarchically self-organizing visual place memory. Adv. Robot. 2017, 31, 865–879.
[CrossRef]
64. Arena, P.; Di Pietro, F.; Li Noce, A.; Patanè, L. Attitude control in the Mini Cheetah robot via MPC and reward-based feed-forward
controller. IFAC-PapersOnLine 2022, 55, 41–48. [CrossRef]
65. Faigl, J. An Application of Self-Organizing Map for Multirobot Multigoal Path Planning with Minmax Objective. Comput. Intell.
Neurosci. 2016, 2016, 2720630. [CrossRef]
66. Guillaume, H.; Dubois, M.; Frenoux, E.; Tarroux, P. Temporal Bag-of-Words—A Generative Model for Visual Place Recognition
using Temporal Integration. In Proceedings of the Sixth International Conference on Computer Vision Theory and Applications
(VISAPP 2011), Algarve, Portugal, 5–7 March 2011; pp. 286–295.
67. Azimirad, V.; Sani, M.F.; Ramezanlou, M.T. Unsupervised learning of target attraction for robots through Spike Timing Dependent
Plasticity. In Proceedings of the IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI),
Tehran, Iran, 22 December 2017; pp. 428–433. [CrossRef]
68. Arena, P.; De Fiore, S.; Patanè, L.; Pollino, M.; Ventura, C. Insect inspired unsupervised learning for tactic and phobic behavior
enhancement in a hybrid robot. In Proceedings of the 2010 International Joint Conference on Neural Networks (IJCNN), Barcelona,
Spain, 18–23 July 2010; pp. 1–8. [CrossRef]
69. Arena, P.; De Fiore, S.; Patanè, L.; Pollino, M.; Ventura, C. STDP-based behavior learning on the TriBot robot. In Proceedings of
the Society of Photo-Optical Instrumentation Engineers SPIE, Dresden, Germany, 4–6 May 2009; Bioengineered and Bioinspired
Systems IV; Volume 7365, p. 736506. [CrossRef]
70. Furao, S.; Ogura, T.; Hasegawa, O. An enhanced self-organizing incremental neural network for online unsupervised learning.
Neural Netw. 2007, 20, 893–903. [CrossRef]
71. Kohonen, T. The self-organizing map. Proc. IEEE 1990, 78, 1464–1480. [CrossRef]
72. Lloyd, S. Least squares quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137. [CrossRef]
73. Hagan, M.T.; Menhaj, M.B. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Netw. 1994, 5,
989–993. [CrossRef]
74. Prakash, N.N.V.S.; Reddy, V.S.; Chandran, V.; Amudha, J. Autonomous Driving Mobile Robot using Q-learning. In Proceedings
of the 2022 International Conference on Futuristic Technologies (INCOFT), Belgaum, India, 25–27 November 2022; pp. 1–8.
[CrossRef]
75. Ataollahi, M.; Farrokhi, M. Online path planning of cooperative mobile robots in unknown environments using improved
Q-Learning and adaptive artificial potential field. J. Eng. 2023, 2023, e12231. [CrossRef]
76. Kim, H.; Lee, W. Dynamic Obstacle Avoidance of Mobile Robots Using Real-Time Q-learning. In Proceedings of the 2022
International Conference on Electronics, Information, and Communication (ICEIC), Jeju, Republic of Korea, 6–9 February 2022;
pp. 1–2. [CrossRef]
77. Yue, P.; Xin, J.; Zhao, H.; Liu, D.; Shan, M.; Zhang, J. Experimental Research on Deep Reinforcement Learning in Autonomous
navigation of Mobile Robot. In Proceedings of the 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA),
Xi’an, China, 19–21 June 2019; pp. 1612–1616. [CrossRef]
78. Balachandran, A.; Lal, S.A.; Sreedharan, P. Autonomous Navigation of an AMR using Deep Reinforcement Learning in a
Warehouse Environment. In Proceedings of the 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon),
Mysuru, India, 16–17 October 2022; pp. 1–5. [CrossRef]
79. Zhou, S.; Liu, X.; Xu, Y.; Guo, J. A Deep Q-network (DQN) Based Path Planning Method for Mobile Robots. In Proceedings of the
2018 IEEE International Conference on Information and Automation (ICIA), Wuyishan, China, 11–13 August 2018; pp. 366–371.
[CrossRef]
80. Xue, X.; Li, Z.; Zhang, D.; Yan, Y. Deep Reinforcement Learning Method for Mobile Robot Collision Avoidance based on Double
DQN. In Proceedings of the 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), Vancouver, BC, Canada,
12–14 June 2019; pp. 2131–2136. [CrossRef]
81. Sasaki, Y.; Matsuo, S.; Kanezaki, A.; Takemura, T. A3C Based Motion Learning for an Autonomous Mobile Robot in Crowds. In
Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019;
pp. 1036–1042. [CrossRef]
82. Chen, H.; Liu, Y.; Zhou, Z.; Zhang, M. A2C: Attention-Augmented Contrastive Learning for State Representation Extraction. Appl.
Sci. 2020, 10, 5902. [CrossRef]
83. Gao, X.; Yan, L.; Li, Z.; Wang, G.; Chen, I.-M. Improved Deep Deterministic Policy Gradient for Dynamic Obstacle Avoidance of
Mobile Robot. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 3675–3682. [CrossRef]
84. Nakamura, T.; Kobayashi, M.; Motoi, N. Local Path Planning with Turnabouts for Mobile Robot by Deep Deterministic Policy
Gradient. In Proceedings of the 2023 IEEE International Conference on Mechatronics (ICM), Loughborough, UK, 15–17 March
2023; pp. 1–6. [CrossRef]
85. Pei, M.; An, H.; Liu, B.; Wang, C. An Improved Dyna-Q Algorithm for Mobile Robot Path Planning in Unknown Dynamic
Environment. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 4415–4425. [CrossRef]
86. Leng, J.; Fan, S.; Tang, J.; Mou, H.; Xue, J.; Li, Q. M-A3C: A Mean-Asynchronous Advantage Actor-Critic Reinforcement Learning
Method for Real-Time Gait Planning of Biped Robot. IEEE Access 2022, 10, 76523–76536. [CrossRef]
87. Toan, N.D.; Woo, K.G. Mapless Navigation with Deep Reinforcement Learning based on The Convolutional Proximal Policy
Optimization Network. In Proceedings of the 2021 IEEE International Conference on Big Data and Smart Computing (BigComp),
Jeju Island, Republic of Korea, 17–20 January 2021; pp. 298–301. [CrossRef]
88. Kokila, M.; Amalredge, G. Mobile Robotic Arm for Opening Doors Using Proximal Policy Optimization. Data Anal. Artif. Intell.
2023, 3, 107–112. [CrossRef]
89. Srikonda, S.; Norris, W.R.; Nottage, D.; Soylemezoglu, A. Deep Reinforcement Learning for Autonomous Dynamic Skid Steer
Vehicle Trajectory Tracking. Robotics 2022, 11, 95. [CrossRef]
90. Hayamizu, Y.; Amiri, S.; Chandan, K.; Zhang, S.; Takadama, K. Guided dyna-Q for mobile robot exploration and navigation. arXiv
2020, arXiv:2004.11456. Available online: https://api.semanticscholar.org/CorpusID:216144691 (accessed on 2 January 2024).
91. Budiyanto, A.; Matsunaga, N. Deep Dyna-Q for Rapid Learning and Improved Formation Achievement in Cooperative
Transportation. Automation 2023, 4, 210–231. [CrossRef]
92. Piccinelli, N.; Vesentini, F.; Muradore, R. MPC Based Motion Planning For Mobile Robots Using Velocity Obstacle Paradigm. In
Proceedings of the 2023 European Control Conference (ECC), Bucharest, Romania, 13–16 June 2023; pp. 1–6. [CrossRef]
93. Hong, S.; Miller, Z.; Lu, J. A Transient Response Adjustable MPC for Following A Dynamic Object. In Proceedings of the 2023
American Control Conference (ACC), San Diego, CA, USA, 31 May–2 June 2023; pp. 1434–1439. [CrossRef]
94. Chen, J.; Chen, X.; Liu, S. Trajectory Planning of Autonomous Mobile Robot using Model Predictive Control in Human-Robot
Shared Workspace. In Proceedings of the 2023 IEEE 3rd International Conference on Electronic Technology, Communication and
Information (ICETCI), Changchun, China, 26–28 May 2023; pp. 462–467. [CrossRef]
95. Park, M.; Lee, S.Y.; Hong, J.S.; Kwon, N.K. Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in
Sparse Reward Environments. Sensors 2022, 22, 9574. [CrossRef] [PubMed]
96. Li, K.; Lu, Y.; Meng, M.Q.-H. Human-aware robot navigation via reinforcement learning with hindsight experience replay and
curriculum learning. In Proceedings of the 2021 IEEE International Conference on Robotics and Biomimetics (ROBIO), Sanya,
China, 27–31 December 2021; pp. 346–351. [CrossRef]
97. Duan, Y.; Li, C.; Xie, M. One fast RL algorithm and its application in mobile robot navigation. In Proceedings of the 2012
IEEE International Conference on Computer Science and Automation Engineering (CSAE), Zhangjiajie, China, 25–27 May 2012;
Volume 3, pp. 552–555. [CrossRef]
98. Lee, G.; Oh, W.; Oh, J.; Shin, S.; Kim, D.; Jeong, J.; Choi, S.; Oh, S. Semi-Supervised Imitation Learning with Mixed Qualities of
Demonstrations for Autonomous Driving. In Proceedings of the 22nd International Conference on Control, Automation and
Systems (ICCAS), Jeju, Republic of Korea, 27–30 November 2022; pp. 20–25. [CrossRef]
99. Ahmadi, A.; Nygaard, T.; Kottege, N.; Howard, D.; Hudson, N. Semi-Supervised Gated Recurrent Neural Networks for Robotic
Terrain Classification. IEEE Robot. Autom. Lett. 2021, 6, 1848–1855. [CrossRef]
100. Qian, W.; Lauri, F.; Gechter, F. Supervised and semi-supervised deep probabilistic models for indoor positioning problems.
Neurocomputing 2021, 435, 228–238. [CrossRef]
101. Li, Y. Multimodal visual image processing of mobile robot in unstructured environment based on semi-supervised multimodal
deep network. J. Ambient. Intell. Human. Comput. 2020, 11, 6349–6359. [CrossRef]
102. Yoo, J.; Johansson, K.H. Semi-supervised learning for mobile robot localization using wireless signal strengths. In Proceedings
of the International Conference on Indoor Positioning and Indoor Navigation (IPIN), Sapporo, Japan, 18–21 September 2017;
pp. 1–8. [CrossRef]
103. Wu, H.; Wu, Y.-X.; Liu, C.-A.; Yang, G.-T.; Qin, S.-Y. Fast Robot Localization Approach Based on Manifold Regularization with
Sparse Area Features. Cogn. Comput. 2016, 8, 856–876. [CrossRef]
104. Großmann, A.; Wendt, M.; Wyatt, J. A Semi-supervised Method for Learning the Structure of Robot Environment Interactions. In
Advances in Intelligent Data Analysis V. IDA 2003; Berthold, M.R., Lenz, H.J., Bradley, E., Kruse, R., Borgelt, C., Eds.; Lecture Notes
in Computer Science; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2810. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
