Deep Learning Methods in Soft Robotics Architectures
feature extraction but also of capturing considerably more complex relations between many variables in high-dimensional systems such as soft robots.[13] In contrast to FEM-based models, trained deep learning models can potentially be evaluated faster and may be more practical for applications in different scenarios. Typically, deep learning models have a black-box form without any physical relevance to their parameters; however, their hybridization with physical models, such as residual modeling or direct implementation of physics into a training process, is possible and relevant in standard[14] and soft robotics.[15]

Although there are few reviews related to the use of machine learning methods in soft robotics,[6,10,11,13] they are more inclusive in terms of their AI/machine learning orientation compared to our review. Our objective is to specifically review the area of deep learning, which has seen extremely intense development in recent years and deserves to be treated as a standalone topic. The specific capabilities of deep learning methods that are steadily improving provide further impetus for the development of applied soft robotics, which can potentially be incorporated into real-world applications in various areas. In this review, we analyze studies on soft robotic applications in which deep learning methods have been applied in any typical learning scenario (supervised learning (SL), unsupervised learning (USL), reinforcement learning (RL), or semisupervised learning (SSL)). This learning classification (Figure 1) was chosen as a major foundation in this review to maintain the focus on the deep learning aspect, but also with a consistent classification of soft robotic applications. A comparison of the basic advantages and disadvantages of typical learning scenarios is presented in Table 1.

At the application level, we classified the studies into four groups based on the objects to which the aforementioned methods were applied: soft actuators and manipulators, soft sensors and e-skins, bioinspired soft robotic structures, and soft grippers. To keep the review sufficiently compact with regard to recent remarkable developments in the area of deep learning, only architectures present in the referenced works were considered, including feedforward neural networks (FNNs), recurrent neural networks (standard recurrent neural networks (RNNs), long short-term memory (LSTM), and gated recurrent units (GRUs)), transformer networks, generative adversarial networks (GANs), and convolutional networks, as well as their respective modifications and hybrid forms.

2. Literature Review Process

The Web of Science and Scopus databases were used to select the publications listed in the review. The following keywords were used to search for references: "deep learning AND soft robotics" OR "deep learning AND soft robots" OR "deep learning AND soft manipulators" OR "deep learning AND soft grippers" OR "deep learning AND soft sensors" OR "deep learning AND bioinspired soft robots" OR "machine learning AND soft robotics" OR "machine learning AND soft robots" OR "machine learning AND soft manipulators" OR "machine learning AND soft grippers" OR "machine learning AND soft sensors" OR "machine learning AND bioinspired soft robots." For the literature search, the architecture of the method used was verified to meet the requirements of a deep structure. A list of references was created by selecting relevant articles for review, resulting in a total of 229 publications. Figure 2 shows an overview of the number of articles mentioned in the review according to the year of publication. The growing trend in the number of articles reflects increasing interest in research in the field, especially over the last 7 years. Considering that the review was finalized in the middle of 2024, predicting the trend that would occur by the end of 2024 in terms of the number of articles the authors could access was not relevant.
Figure 3. Classification of the references based on the type of learning and application area.
The review is divided into the following chapters based on the learning scenarios applied in soft robotics that are presented in the selected articles: "Overview of Deep SL in Soft Robotics," "Overview of Deep RL in Soft Robotics," and "Overview of Deep SSL and USL in Soft Robotics." A graphical interpretation of the frequency of articles represented in individual sections (SL, RL, SSL, and USL) is shown in Figure 3 as a pie chart. Articles used in the SL section constitute the largest proportion (≈59%). The second largest section in terms of number of referenced articles is RL (24%). The smallest proportion is for SSL and USL applications (8% of the reported articles). The remaining 9% of the articles were used as references in the "Introduction" and "Challenges and Future Prospects" sections.

In each part of the learning scenario, the specific applications in the field of soft robotics were selected from the following options: soft manipulators, soft grippers, soft sensors, and bioinspired soft robots. Figure 3 contains pie graphs that interpret the observed application areas within a given deep learning paradigm and the number of articles presented for a specific application. Because each section describing SL, RL, USL, and SSL in soft robotics begins with a general overview of a specific paradigm, some references are presented in this section (and thus are not part of the graphical evaluation in the pie chart).

3. Overview of Deep SL in Soft Robotics

SL is one of the four basic machine learning models.[16] It is characterized by using labeled datasets to train algorithms that accurately classify data or predict results. SL algorithms generate a function that maps inputs and transforms them into desired outputs.[17] In general, they support the search for optimal values of relevant model parameters based on extensive datasets without overfitting the model. The creation of the basic SL architecture consists of the initial collection of datasets and the subsequent division of the data into three groups: training, testing, and validation. The process of creating an SL model comprises three phases: training, testing, and validation of the model.[18]

During model training, the training algorithm provides a systematic approach for model parameter generation and the subsequent selection of optimal parameters using a labeled dataset. In this case, the corresponding set is labeled as the training dataset. The selection of the optimal parameters involves the minimization of the error in the set of training data between the predicted and real class labels. In general, training represents the process of optimizing model parameters, that is, tuning model parameters and optimizing them using labeled datasets.[19,20] The second phase of SL model creation, referred to as testing, is the process of evaluating the performance of the model, which is trained using the training algorithm. The test algorithm confirms whether the model with the tuned parameters works optimally with another labeled dataset. The corresponding set is referred to as the test dataset. We conclude that testing confirms the optimal functioning of the trained model on another dataset.[19,20] During the validation phase, the model is tested based on a combination of training and test datasets. The validation algorithm provides training and testing under different conditions to ensure that the model is optimized and works effectively on unseen data. Therefore, model validation is performed on a dataset that is not used during model training.[19,20]
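As a minimal illustration of the three-way data division described above, the following Python sketch (our generic example; the array contents and the 70/15/15 ratio are arbitrary assumptions, not taken from any referenced work) partitions a labeled dataset:

import numpy as np

def split_dataset(inputs, labels, train=0.7, val=0.15, seed=0):
    """Shuffle a labeled dataset and divide it into training,
    validation, and test subsets (assumed 70/15/15 ratio)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(inputs))
    n_train = int(train * len(inputs))
    n_val = int(val * len(inputs))
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]
    return ((inputs[train_idx], labels[train_idx]),
            (inputs[val_idx], labels[val_idx]),
            (inputs[test_idx], labels[test_idx]))

# Example: 1000 hypothetical sensor readings with 8 channels, scalar labels
X, y = np.random.rand(1000, 8), np.random.rand(1000)
train_set, val_set, test_set = split_dataset(X, y)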
Within SL, we can distinguish between two different learning methods: classification and regression. During classification, the model is trained to predict outputs (unknown values) based on a set of inputs (known values).[18] Classification is a process in which a function is created to divide data into individual classes based on relevant parameters. The model is trained to categorize the data into individual classes.[21] Conversely, regression makes it possible to describe correlations between individual dependent and independent variables and to predict continuous values based on these variables. The difference between these approaches is that classification algorithms are applied to predict discrete values, whereas regression algorithms are applied to predict continuous values.[22,23]
Deep SL is a subfield of SL in which the model is trained on a labeled dataset consisting of input–output pairs. The algorithm learns the mapping of input data to the corresponding output labels by generalizing the relationships in the training files. Deep SL typically pertains to the application of DNNs, decision trees, k-nearest neighbors, or regression algorithms.[24] Deep SL techniques that apply relevant methods are powerful computational tools for various applications.[25–30]

DNNs are widely used in the field of AI, including machine vision, speech recognition, robotics, and navigation tasks. They are inspired by the structure and function of the human brain and enable computers to solve cognitive tasks typical of human thinking. Neural networks consist of simple computing nodes (neurons) that are connected to each other, work in parallel, and are arranged in interconnected layers.[31,32] A simple neural network is composed of three layers (input, hidden, and output). However, shallow neural networks are not sufficient to optimally predict a given task. DNNs were designed to create more predictive models that represent a sequence of layers, in which individual layers perform a linear transformation followed by elementary nonlinearity. The resulting combination of a large number of layers allows the created model to have a high predictive ability.[33] Convolutional neural networks (CNNs) and RNNs are among the most widely used DNN algorithms in deep SL.[34,35]

CNNs, also referred to as ConvNets, represent a powerful type of DNN primarily used in machine vision applications. They are composed of a series of convolutional and pooling layers that extract the data, that is, the relevant features of the inputs, using a set of filters. These are followed by one or more connected layers that use these features to make predictions. The fully connected (FC) layer at the end of the network performs classification based on the features extracted by the previous layers and their filters. It represents a traditional neural network layer, that is, all neurons are interconnected between two layers, and generates the final output of the CNNs.[36,37]
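To make the convolution, pooling, and FC stages concrete, the following PyTorch sketch shows a minimal classifier of this form (our illustration; the layer sizes, input resolution, and number of classes are arbitrary assumptions rather than an architecture from the cited works):

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Convolutional and pooling layers act as the feature extractor.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 32x32 -> 16x16
        )
        # The fully connected head performs the final classification.
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

logits = SmallCNN()(torch.randn(1, 3, 64, 64))  # one placeholder RGB image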
RNNs use sequential or time-series data. They are characterized by their memory because they use information from previous steps; that is, they use the output of a given step as the input of the current step. Thus, the most essential feature is the hidden state, which remembers information regarding the data sequences. This state is referred to as the memory state because it remembers previous network inputs. RNNs are mainly used for speech recognition applications, time-series forecasting, text processing, and translation and image recognition.[38–40] Within RNNs, new architectures such as bidirectional RNNs (BRNNs), LSTM, and GRUs have been created to solve the vanishing gradient problem.[41–43] BRNNs are a variation of RNNs that combines hidden layers in opposite directions so that the output layer, which is used for making predictions, obtains information from past (backward) and future (forward) states simultaneously.[44,45] LSTM works on the read-write-and-forget principle; that is, it stores the most important data and forgets data that are not essential for the prediction of the output. LSTM networks process and analyze sequential data such as time series, text, and speech. To predict the outputs, they use a memory cell and gates that control the flow of information and data, allowing them to selectively retain or remove data as required, thereby avoiding the vanishing gradient problem that limits traditional RNNs.[46,47]
GRUs represent a simplified version of LSTM with fewer parameters, which can facilitate the network training process and increase computational efficiency. The basis is the use of gating mechanisms for the selective updating of the hidden state of the network at each time step. Gating mechanisms are applied to control the flow of information to and from the network. GRUs contain two gating mechanisms: a reset gate, which determines how much information from the previous hidden state should be forgotten, and an update gate, which determines how much information from the new input should be used to update the hidden state. The final output of the GRU network is calculated based on the updated hidden state.[48–50]
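The gating scheme described above can be written compactly. The following NumPy sketch of a single GRU step follows the commonly used GRU update equations (our illustration; bias terms are omitted and the weight shapes are assumptions):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step: x is the input, h_prev the previous hidden state."""
    z = sigmoid(Wz @ x + Uz @ h_prev)             # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)             # reset gate
    h_cand = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate hidden state
    return (1 - z) * h_prev + z * h_cand          # updated hidden state

# Example: 4-dimensional input, 8-dimensional hidden state
rng = np.random.default_rng(0)
W = [rng.standard_normal(s) for s in [(8, 4), (8, 8)] * 3]
h = gru_step(rng.standard_normal(4), np.zeros(8), *W)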
3.1. Deep SL and Soft Manipulators

For the efficient movement and manipulation of objects, manipulators must control their shape, position, or deformation.[51] Soft robotic manipulators are composed of materials such as silicone, rubber, and other flexible polymers[52] that can undergo significant deformations, as shown in Figure 4b,d. The behavior of soft materials is highly nonlinear, which implies that their response to forces is not directly proportional. This nonlinearity complicates the prediction of movements and interactions with the environment.[53] Deep SL models (Figure 4a) can predict the results of different manipulations or control methods, allowing soft manipulators to adapt their movements in real time to successfully interact with different objects.[54–56] Deep SL methods are applied to learn the relationship between the activation inputs of a soft manipulator and its final shape, which ultimately enables precise positioning.[57–59] To train the deep algorithm, a significant amount of labeled data is needed to capture various states, deformations, and interactions of the soft manipulator. Images obtained using camera systems are widely used as inputs.[60–62] This data collection method uses high-speed cameras and reflective markers placed on the soft manipulator to track its movement and deformations in 3D space.

Yoo et al.[63] created a model for estimating and visualizing the shapes and positions of soft robots. The deep architecture of the model is based on the encoder–decoder combination, where the encoder uses a CNN, specifically the VGG module, to encode visual data regarding the position and shape of a soft robot based on the image from the proprioceptive camera. Yang et al.[64] designed a flying mobile platform with a soft continuum arm for detecting and repairing cracks in hard-to-reach locations (Figure 4c). A mask R-CNN was applied for the identification and localization of cracks in walls based on visual sensors and the navigation of the soft arm to the defect location. The required movement and position of the soft arm were ensured by applying appropriate pressure to the three tubes forming the soft arm. Lu et al.[65] worked on the estimation of the shape and position of a soft manipulator using camera images. The image is processed into a binary mask that segments the robot and background pixels. In the mask, a value of 1 represents the robot pixels and a value of 0 represents the background pixels. The semantic segmentation of the robot from the background is based on a CNN; specifically, it is trained by DeepLabV3 with 10 K synthetic image data generated by CoppeliaSim.
Figure 4. Examples of soft manipulators, soft segments, and deep SL network structure. a) Recurrent neural network-based model, where the length of
the time history horizon is determined by η and features composed of adjacent nodes pose. Reproduced under terms of the CC-BY license.[51] Copyright
2021, The Authors. Published by Frontiers. b) Soft robot platform. Reproduced under terms of the CC-BY license.[58] Copyright 2021, The Authors.
Published by Frontiers. c) Soft bionic manipulator with intrinsic actuation. Reproduced under terms of the CC-BY license.[64] Copyright 2022,
The Authors. Published by Frontiers. d) Soft robot segment with three magnetoresistive sensors integrated into a printed circuit board (PCB) at
the tip of the segment. Reproduced under terms of the CC-BY license.[71] Copyright 2023, The Authors. Published by RSC. e) Soft pneumatic actuator
with distributed flexible bending sensors. Reproduced under terms of the CC-BY license.[79] Copyright 2023, The Authors. Published by MDPI.
Almanzor et al.[66] used deep learning to control the movement of a soft arm. Using a CNN, the control system could realize the desired target shapes based on the image input. The created control system demonstrated versatility and application for controlling continuous manipulators regardless of the materials used, kinematics, or the number of degrees of freedom of movement. The positioning of a soft manipulator intended for picking berries without causing excessive damage to the bushes was predicted in refs. [67,68]. A CNN was trained based on images from an integrated camera. The VGG16 network was used to estimate the input activation values to ensure the desired position of the robot. Zhang et al.[69] used two cameras to sense the position and deformations during the movement of handheld shearing auxetic (HSA) actuators, which provided visual inputs for training a CNN. The trained CNN predicted the position of the actuator tip and the contact force acting on the actuator.

The training of deep SL methods based on sensor data, which subsequently allows the estimation of the position and deformation of the soft manipulator, is a process known as proprioception and is an essential element of precise control.[70,71] Force and tactile sensors are part of the soft manipulator or connected to it to record the forces during various manipulations and interactions. Conversely, soft sensors composed of conductive or piezoelectric materials are used to collect input data and are integrated into the body of the soft manipulator to provide data on internal deformations and external pressures. Truby et al.[72] worked with a three-segment soft manipulator containing integrated piezoresistive sensors designed to sense its position in space. Based on the position and movement data of the soft manipulator, which were the model inputs, an LSTM network was trained to predict the 3D configuration of the shape and position of the investigated device. The soft robot could perceive the configuration of its own body in 3D space. The realization of proprioception and closed-loop control for a soft manipulator was also applied in ref. [73]. Movement, extension, and bending of the soft manipulator were predicted based on an LSTM network and FNNs. The inputs were signals from sensors that mapped positions using inductive springs and inertial measurement units (IMUs). Relaño et al.[74] used approximate Gaussian process (AGP) and deep Gaussian process (DGP) methods for system identification, kinematic modeling, and soft arm control. For the design of the internal structure of soft pneumatic actuators, the CNN model, which is part of a genetic algorithm (GA), was used by Mosser et al.[75] No previous experience was used to determine the internal structure; thus, the CNN significantly reduced the computation time. The CNN inputs were the muscle volumes and the output was a vector containing the phenotype, which was the 3D displacement of the considered points on the muscle. Li et al.[76] sensed the curvature of a soft manipulator using fiber Bragg grating (FBG) sensors implemented in the manipulator. The direct implementation of sensors in manipulator structures makes it difficult to create a sensing model owing to the effect of elasticity. A data transfer model was created using information regarding the curvature of the FBG sensors, and a modified LSTM network was created for final outdoor location mapping. Tan et al.[77] proposed model-free soft n-segment arm control based on two varying-parameter RNNs (VP-RNNs). One model solves the inverse kinematics, and the other directly adapts the value of the pseudoinverse of the Jacobian matrix.
Furthermore, Zhang et al.[78] studied a system of electrically actuated robotic platforms and pneumatically actuated soft manipulators. They investigated the suitability of neural network inputs obtained based on sensors and proposed several models, including an LSTM network, for positioning soft systems. Shu et al.[79] used an LSTM network to determine the relationship between the values obtained from flexible porous piezoresistive sensors and the position of the soft actuator (Figure 4e). The inputs to the network were signals from individual sensors, composed of a thin conductive sponge material designed to sense bending and position, and the output was the absolute spatial position. Thus, the LSTM network modeled the position of a given system.
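Many of the sensor-based proprioception studies discussed above share one basic pattern: a window of sensor readings is mapped by an LSTM to a pose estimate. The following PyTorch sketch shows this generic pattern (our illustration with assumed dimensions; it is not the model of any specific reference):

import torch
import torch.nn as nn

class PoseLSTM(nn.Module):
    """Maps a window of sensor readings to a 3D tip position."""
    def __init__(self, n_sensors=6, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_sensors, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)    # regression to (x, y, z)

    def forward(self, seq):                 # seq: (batch, time, n_sensors)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])        # pose at the last time step

model = PoseLSTM()
pos = model(torch.randn(8, 50, 6))          # 8 windows of 50 readings each
loss = nn.MSELoss()(pos, torch.randn(8, 3)) # train against labeled poses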
3.2. Deep SL and Soft Grippers

In contrast to rigid structures, soft grippers are constructed from flexible materials (Figure 5a). When grasping objects of different sizes and shapes, deformations and changes in the shape of the grippers occur. Therefore, the identification of their movement and the design of control represent challenges and opportunities in robotic manipulation tasks. Deep SL enables the soft gripper to predict the probability of successfully grasping an object even before the actual grasping movement is performed. This prediction is based on several factors, such as the shape of the object, properties of the contact surfaces, contact force, and configuration of the grippers.[80] Modeling relevant operations using deep SL requires capturing interactions represented by large and diverse datasets. Deep SL includes the application of neural networks trained on selected datasets for the identification of the movement itself when grasping objects, design of controls, and enhancement of the performance and capabilities of soft grippers.[81–84]
Figure 5. Examples of soft grippers and fingers. a) Soft robotic sensor-actuator with a hydraulic actuation system, along with the Leap Motion infrared
camera and breakout board for data acquisition. Reproduced under terms of the CC-BY license.[81] Copyright 2021, The Authors. Published by MDPI.
b) Robot called LitterBot designed to collect garbage; it includes a Fin Ray-type soft gripper that integrates a camera. Reproduced under terms of the
CC-BY license.[87] Copyright 2022, The Authors. Published by Frontiers. c) Training of deep network based on deformation images of a Fin-Ray finger
captured via FEA simulation to predict contact forces and stress in real time. Reproduced under terms of the CC-BY license.[91] Copyright 2021, The
Authors. Published by Frontiers. d) Bimanual robotic system, composed of two Franka Emika Panda Robots, endowed with two Pisa/IIT soft hands as end
effectors. Reproduced under terms of the CC-BY license.[97] Copyright 2022, The Authors. Published by Wiley-VCH. e) Soft gripper with piezoresistive
tactile sensors for fruit grasping. Reproduced under terms of the CC-BY license.[101] Copyright 2023, The Authors. Published by MDPI.
In the first case, the data used to train individual network types are obtained using integrated cameras and form a group of visual inputs. Soft robotic grasping based on a machine vision system is an area of robotics that is currently receiving considerable attention.[85,86] Almanzor et al.[87] constructed a robot called LitterBot designed to collect garbage. The robot was completed using a fin-ray-type soft gripper with an integrated camera (Figure 5b). The machine vision system applies the Detectron 2 version of the Mask Region CNN (Mask R-CNN) for the classification and segmentation of objects on the ground and the determination of their angular orientation for subsequent grasping. The CNN network is trained on the Trash Annotations in Context (TACO) dataset, which consists of images of objects obtained using an integrated camera. Wang and Ling[88] created a CNN based on visual inputs from an integrated camera and deep learning to predict the graspable positions and orientation of objects. In the area of soft gripping, research also includes the application of neural networks to learn the control of the contact force during interactions with grasped objects.[89] Wan et al.[90] examined different learning strategies for estimating contact force control and arm end-effector interactions. A CNN structure was applied to estimate the force and torque based on image data of the internal deformation and visual tracking of the 6D position of a tag fixed inside a hollow, cone-shaped soft 3D material. The CNN was applied to directly predict the force along the x- and z-axes and the torque along the y-axis in a mutual interaction. The contact force prediction was the subject of ref. [91]. The CNN network was trained based on simulated data generated by finite element analysis (FEA) in the form of image inputs (Figure 5c). The obtained data represented the simulated deformations during the contact of ball and cube-shaped objects of different sizes, heights, and angles on the fin-ray soft finger. The predicted contact deformations were subsequently compared with real data from the force sensor. The created network can be incorporated into a control system to determine the contact force in real-time based on its visible deformation.
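For orientation, instance segmentation of graspable objects of the kind used in LitterBot can be prototyped with an off-the-shelf pretrained Mask R-CNN. The sketch below uses torchvision's generic COCO-pretrained model as a stand-in (an assumption for illustration; the actual system applies Detectron 2 trained on the TACO dataset, and the confidence threshold here is arbitrary):

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Generic COCO-pretrained Mask R-CNN (stand-in for the TACO-trained model)
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = torch.rand(3, 480, 640)            # placeholder camera frame in [0, 1]
with torch.no_grad():
    pred = model([image])[0]               # dict of boxes, labels, scores, masks

keep = pred["scores"] > 0.5                # assumed confidence threshold
boxes, masks = pred["boxes"][keep], pred["masks"][keep]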
Conversely, in addition to visual data from integrated cameras, input data from sensors integrated into the fingers of soft grippers have been used.[92,93] In a study,[94] the authors worked with a soft anthropomorphic finger containing integrated soft sensors. An LSTM network was used in constructing a model of the behavior of the investigated system, where the inputs were visual data and data from integrated sensors. The model identified the location, interacted with the environment, and controlled the handling of the objects. To obtain training data for an LSTM network designed to perform object recognition based on the strain and tactile sensing data, Zuo et al.[95] developed ionic hydrogel-based strain and tactile sensors integrated into the soft fingers of a designed gripper. These sensors are characterized by good conductivity, extensibility, toughness, and easy gripping on request compared to a soft finger. Rho et al.[96] created a stiffness-aware temporal-attention LSTM (SATA-LSTM) to estimate the contact force of the fingertips of a soft robotic glove based on a tendon-sheath mechanism. The network consisted of two parts: the stiffness estimation model and finger contact force estimation. For stiffness estimation, an LSTM network was trained based on the stiffness data of the experimental cylinders, obtained by multiple grasping and releasing through sensors integrated into the glove. Subsequently, the estimated stiffness was the input for creating the LSTM model to estimate the contact force of the fingertips of the soft glove. Consequently, the proposed SATA-LSTM architecture estimated the contact force based on the stiffness of the grasped object. One area of research on the application of soft gloves for individual grasping operations is the prediction of grasping failure or the slipping of the grasped object (Figure 5d).[97] Based on IMUs located on the individual fingers of the gloves, vibrations caused by the sliding of grasped objects are recorded during individual experiments. The data are subsequently used to create a GRU network that is trained to detect the occurrence of relevant conditions and predict events when object grasping fails. Thus, the network predicts whether the grasped object is moving and in what direction to regrasp and stabilize the object. The LSTM network was used by Thuruthel et al.[98] to learn force control in a closed loop with integrated soft sensors. The authors created a simple feedback force controller for a soft anthropomorphic finger. A resistive soft strain sensor called a conductive thermoplastic elastomer (CTPE) is a part of the soft finger that comes into contact with a force-sensitive resistor. Contact force values provided by the sensors of the respective systems were used for modeling the relationship between the sensor responses and the force acting on the tip of the soft finger. Based on this model, a simple feedback force regulator was created for the soft anthropomorphic finger. In Shi et al.[99] a soft robotic gripper had an integrated ultrasonic sensor, which was used for the noncontact acquisition of information regarding the position of the grasped object and setting the gripper to a suitable grasping point. After grasping the object, tactile data were obtained using integrated triboelectric bending and tactile sensors. A CNN network was created for data analysis and processing and was used for object recognition, classification, and grasping applications. In ref. [100], a 3D CNN was also trained based on 3D point clouds from a depth sensor to predict the appropriate grasps of objects. Zhou et al. designed and constructed a soft gripper using touch sensors[101] for gentle handling during fruit harvesting (Figure 5e). The soft gripper with four fingers contained 24 integrated piezoresistive touch sensors that provided datasets used for the creation of a deep-touch CNN and obtained by performing experimental gripping operations. The created network predicted and differentiated individual fruit grasping scenarios tested under laboratory conditions.

3.3. Deep SL and Soft Sensors

Soft sensors, similar to soft manipulators and grippers, are composed of flexible and deformable materials and exhibit highly nonlinear responses to stimuli.[102] This nonlinearity complicates the accurate modeling of their behaviors using traditional linear approaches.[103] Materials used for the production of soft sensors, such as silicone,[104] hydrogels,[105] and elastomers,[106] have complex mechanical properties that can change depending on external conditions. Soft sensors are designed to operate in environments with variable temperature, humidity, pressure, or other factors.[107] When creating a model, it is necessary to consider these changes under relevant conditions. However, relevant sensors are often integrated into other soft structures, such as soft manipulators, actuators, and grippers.[108,109] In this case, the models must not only predict the behavior of the soft sensor but also easily integrate with the control algorithms of the complete soft structure. Considering the above, deep SL is a suitable method for modeling the nonlinear behavior of soft sensors and provides a range of model architectures that can be adapted to individual types of sensor data and specific modeling needs.[110–113]

Zhao et al.[114] developed a soft optoelectronic sensory system for the direct detection of signals during hand gestures. The optical sensors were based on U-shaped microfibers (UMFs) encapsulated in biocompatible polymers. Four soft sensory foils were attached to the glove at the finger joints and were used to sensitively detect their movements, as shown in Figure 6a. Finger movement deformed the UMFs and caused changes in the optical transmission. The transmitted light was converted into electrical signals using photodetectors, which served as samples for the dataset with the optical microfiber dynamic response. A CNN (VGGNet) was then created to process the optical signals and classify the gestures. A robust thumb-sized soft 3D haptic vision-based sensor called Insight was investigated by Sun et al.[115] and provided a directional map of the force distribution over its entire conical sensing surface. The sensor consisted of a single layer of elastomer pressed onto a rigid hollow frame to maintain its shape and allow for high interaction forces without damage.
Figure 6. Examples of soft sensors and deep SL network structures. a) Schematic of optical U-shaped microfiber (UMF) sensors attached to the joint
region of a soft glove and network architectures of VGGNet, which consists of 13 convolution layers and five pooling layers to extract features, followed
by three full-connection layers and a softmax layer to obtain the recognition results. Reproduced under terms of the CC-BY license.[114] Copyright 2022,
The Authors. Published by Wiley-VCH. b) Overall structure of the Insight sensor (4-1 Elastomer, 4-2 Skeleton, 4-3 Collimator, 4-4 LED ring, 4-5 Supporter,
4-6 Fisheye lens, 4-7 Camera, 4-8 Image DAQ, 5 Connector) and machine-learning model, where the inputs are three images (raw, reference, and skeleton
images), and the outputs are the contact location (Px, Py, and Pz denote the coordinates of the contact in the reference frame of the sensor) and contact
force vector (Fs1, Fs2, and Fn). Reproduced under terms of the CC-BY license.[115] Copyright 2022, The Authors. Published by Nature. c) Experimental setup
of bending angle perception of a soft pneumatic fiber-reinforced bending actuator by kirigami-inspired flexible sponge sensors. Reproduced under terms
of the CC-BY license.[122] Copyright 2022, The Authors. Published by MDPI. d) Conceptual schematic of the proposed soft sensing approach without
integrated sensors and proposed deep learning architecture comprising the common feature extractor and decoder (regression section). Reproduced
under terms of the CC-BY license.[135] Copyright 2022, The Authors. Published by Nature.
The soft sensor used shading effects and structured light to monitor 3D surface deformation using an integrated monocular camera, as shown in Figure 6b. The sensor output was calculated using a data-driven machine learning approach that directly derived the distributed contact force information from raw camera data. The force was derived using a CNN (ResNet) that mapped the images to the spatial distribution of the 3D contact force (normal and shear). The model was trained on data in which each data point combined one image from the camera with the indenter's contact position and orientation, contact force vector, and diameter. A CNN network was used in Koh et al.[116] to estimate and classify the contact positions of investigated objects on a vision-based dome-type soft tactile sensor. Based on the image input, VGGNet was trained to extract images in real time, classify the object, and mark the relative position of the contact on the soft sensor. The estimation of the geometry of the tactile deformation and the contact surface during the contact of the soft sensor with the object under investigation was also the subject of research by Ambrus et al.[117] The authors designed a gripper with two ellipsoidal form-factor soft-bubble sensor fingers. The soft sensors included two inflated bubbles with a dot pattern on the inside of a flexible membrane as a visual texture, and integrated cameras. Three network architectures were trained and compared to estimate inverse depth maps from input images in contact with the respective object, namely, ResNet, BTS, and PackNet. The final output of the network was the monocular depth estimation, which involved predicting depth information from a single RGB image. A soft heterogeneous multimodal sensor based on room-temperature ionic liquid (RTIL) optoelectronics was designed by Xu et al.[118] to detect deformations during mechanical stretching and bending. Ionic liquids serve as a medium for light propagation, have the same geometric units as optoelectronics, are easily integrated into the sensor structure, and are characterized by low volatility, high conductivity, and thermal stability. The proposed soft sensor monitored and compared the difference between RTIL and optoelectronic signals under various stimuli and applied an LSTM network to detect combined tensile and bending deformations. In ref. [119], the proposed structures of soft sensors were tested to create an LSTM network for estimating the magnitude of pressure and the position of its action. Two types of soft sensors were designed for this study, which consisted of a microchannel filled with liquid metal (eutectic gallium-indium or EGaIn) and covered with a layer of silicone elastomer. Structurally, they differed in the shape of the microchannel, where one soft sensor had a straight microchannel with three different cross-sectional areas, and the other had a uniform cross-sectional area with three different serpentine patterns. Owing to the pressure, the cross section of the microchannel decreased and its electrical resistance increased, as detected by the voltage divider circuit. The training data files for creating the LSTM network represented the pressure values from the sensors and the force acting on the soft sensors from the load sensors. In ref. [120], an LSTM network was created to predict the movement state of piezoelectric soft sensors integrated into a smart glove composed of silicone rubber. In the proposed network, sequential sensor data representing the angles during finger movements were used as input data. Data from the Leap Motion Controller were used to train the LSTM network used to predict the movement of the piezoelectric soft sensors based on finger bending angles. The LSTM network transferred the movement of the glove into a virtual environment in real time and simulated the movement of the fingers. The LSTM network was also used to decode finger movements by Kim et al.[121] An ultrasensitive sensor was attached to the wrist while measuring skin deformations and muscle movement during individual finger movements. Minor deformations of the soft sensor structure were detected when the wrist topology changed, based on which the LSTM network was trained for the subsequent detection of finger movement. Soft sensors are widely used to determine the positions of soft actuators, which is an essential step in their closed-loop control. The basis is a sensor with soft and flexible characteristics that do not affect the movement of the soft actuator. Flexible sensors with kirigami-inspired structures were used by Shu et al.[122] for position sensing in soft fiber-reinforced bending actuators (Figure 6c). This study proposed a calibration LSTM network for mapping the resistance of soft sensors to the bending angle of soft actuator segments. The inputs of the network were the signals from the sensors, and the outputs were the bending angles of the corresponding segments.

A special group of soft sensors represented by flexible tactile sensors are referred to as soft skins and are designed to imitate the tactile sensing ability of human skin.[123,124] Soft skins represent flexible and stretchable substrates in which various sensing elements are integrated, such as piezoresistive, capacitive, or optical sensors,[125] which can detect pressure, shape, and other tactile information.[126,127] The second group is represented by soft skins without integrated sensors, where individual interactions are captured by cameras that monitor the movement of reference points when the soft skins come into contact with individual objects.[128] These soft sensors are typically composed of materials such as silicone, elastomers, or other flexible polymers.[129] Their contact with objects or surfaces generates data that contain information about the force, pressure distribution, temperature, or vibration.[130] Depending on the nature of the modeling process, deep SL algorithms are applied, whose inputs are data from tactile sensors, and the output corresponds to tactile information such as contact force, object shape, pressure, or texture.[131]

In soft tactile sensing research, Yoshigi et al.[132] constructed a sensing platform consisting of soft skin covering a transparent tube representing bone. Cameras were designed at both ends to monitor the movement of markers inside the soft skin. Based on the image input, a CNN was trained, and two subnetworks were created to recognize the location and contact depth. A local finding network (LFN) was used to identify the contact position, and a deformation-detecting network (DDN) was used to identify the indentation depth of the soft skin during interactions. Massari et al.[133] presented the development of a soft biomimetic skin formed by a polymer matrix in which photonic FBG transducers were integrated, which partially imitated the functionality of Ruffini mechanoreceptors with diffuse, overlapping receptive fields. The polymer substrate contained 16 FBGs that transmitted the forces acting on the optical sensors. A CNN deep learning algorithm and multinetwork neural integration process were implemented to decode the FBG sensor data to derive the contact force magnitude and contact location on the soft skin surface. Geier et al. focused on the integration of tactile sensing with feedback mechanisms[134] for creating a loop in which tactile data from a soft skin sensor were processed using a GRU autoencoder (GRU-AE). They constructed an Allegro robotic hand with three-axis soft sensors based on the Hall effect using a silicon structure. Tactile data from the soft sensor skin were processed using the GRU-AE to provide tactile stimulation feedback. By contrast, Shimadera et al.[135] proposed an optical approach for simultaneous multimodal sensing without integrating sensors, that is, without creating sensor arrays or matrices in soft skin. The sensing approach is based on the optical scattering of a laser beam in soft materials, which is highly sensitive to external stimuli and enables the encoding of various stimuli as a spatial interference pattern (Figure 6d). When in contact with a material, optical scattering creates an interference pattern, referred to as a speckle pattern, which contains data on the deformation of the soft material. A DNN model was used to estimate four quantities: the indentation depth, corresponding contact force, contact position, and temperature. The model consisted of an extractor formed by two CNN layers and one FC layer, which extracted common features from the speckle images. The second part of the model was formed using branched FC layers representing a decoder (regression model) to transform the extracted features into the investigated quantities.
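The extractor and decoder structure described for the speckle-based approach can be sketched as a shared CNN trunk with branched regression heads. The following PyTorch sketch is our schematic reading of such an architecture (layer sizes, image resolution, and a 2D contact position are assumptions):

import torch
import torch.nn as nn

class SpeckleNet(nn.Module):
    """Shared CNN feature extractor with one regression head per quantity."""
    def __init__(self):
        super().__init__()
        self.extractor = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Flatten(), nn.Linear(32 * 8 * 8, 128), nn.ReLU(),
        )
        # Branched FC decoders: depth, contact force, position, temperature
        self.heads = nn.ModuleDict({
            "depth": nn.Linear(128, 1), "force": nn.Linear(128, 1),
            "position": nn.Linear(128, 2), "temperature": nn.Linear(128, 1),
        })

    def forward(self, img):
        z = self.extractor(img)
        return {name: head(z) for name, head in self.heads.items()}

out = SpeckleNet()(torch.randn(1, 1, 128, 128))  # one placeholder speckle image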
3.4. Deep SL and Bioinspired Soft Robots

Bioinspired soft robots are based on biological organisms; therefore, they imitate the physical and functional properties of living organisms. These robots are designed to replicate the adaptability, resilience, efficient movements, and multifunctionality found in nature, allowing them to integrate into their environment in a manner that rigid structures cannot. They simulate the swimming (Figure 7a), crawling (Figure 7b), grasping, or flying of animals, such as squid,[136] worms,[137] fish,[138,139] or the limbs of animals and people.[140] They are composed of materials such as silicone, rubber, hydrogels, or polymers that provide a high degree of flexibility and deformation.[141,142] Modeling the properties of soft materials, such as elasticity, viscosity, and friction, requires sophisticated techniques to ensure that the soft robot mimics the behavior of its biological counterpart. Nonlinear and complex dynamics that include structural deformations, continuous shape changes, and interactions with external forces require the application of deep supervised algorithms as powerful tools for modeling the behavior of soft bioinspired robots.[143]
Figure 7. Examples of bioinspired soft structures. a) Soft silicone arm inspired by the tentacle of a squid with embedded bend sensors which reflect the
arm posture for each time step. Reproduced under terms of the CC-BY license.[136] Copyright 2015, The Authors. Published by Nature. b) Bioinspired soft
segment with seamlessly embedded sensors replicating the concept and sequence of earthworm movement. Reproduced under terms of the CC-BY
license.[137] Copyright 2021, The Authors. Published by Wiley-VCH. c) Block diagram of the hardware components of a soft biomimetic prosthesis for
LSTM-based gesture classification. Reproduced under terms of the CC-BY license.[150] Copyright 2023, The Authors. Published by MDPI.
Abeb et al.[144] constructed a mobile soft robot whose structure was inspired by the body of a silkworm and its movement. The robot was composed of two identical pneumatic muscles. By combining the pressures in the muscles, movement was achieved in three directions: forward, right, and left. The YOLOv3 deep learning platform was applied to the detection of moving objects and subsequent navigation of the robot based on machine vision. Visual images of a moving tennis ball, obtained using an integrated camera, were used to train the CNN. The created algorithm was used to calculate the movement parameters and send a control command to the pneumatic muscle to achieve the desired movement. The soft robot was ultimately able to navigate autonomously and avoid obstacles in its working space. As reported in ref. [145], a CNN was designed that used nonlinear Doppler shift signatures to estimate the direction of the sound source. The authors proposed a soft robotic biomimetic receiver that imitated the fast nonrigid deformation of the ears in certain bat species. The soft device used one receiver at one frequency, inspired by the biosonar sensing system of bats. An electrostatic speaker was used to emit ultrasonic pulses, which served as the input signals. These signals were received by a capacitive microphone mounted in the ear canal of the biomimetic pinna, digitized at the sampling frequency, and transformed into spectrogram time-frequency representations using short-time Fourier transforms. The spectrograms were clipped along the frequency axis to a region that contained all expected Doppler shifts. The result was a square matrix with normalized power spectral density values as a function of time and frequency, which served as input for creating a CNN.
receiver at one frequency, inspired by the biosonar sensing sys- the image captured by the integrated camera for subsequent suc-
tem of bats. An electrostatic speaker was used to emit ultrasonic cessful grasping operations. An LSTM network was created by
pulses, which served as the input signals. These signals were Li et al.[148] to predict the required angle sizes and angular
received by a capacitive microphone mounted in the ear canal velocities of the fins of a bioinspired soft robot. The authors
of the biomimetic pinna, digitized by the sampling frequency, constructed an underactuated soft batoid robot that mimicked
and transformed into spectrogram time-frequency representa- the morphology and swimming kinematics of rajiform batoids.
tions using short-time Fourier transforms. The spectrograms The six-axis force/torque sensor Nano17 was used to obtain the
were clipped along the frequency axis to a region that contained input data for the LSTM network, which represented the driving
all expected Doppler shifts. The result was a square matrix with forces and torques generated during fin flapping. The angular
normalized power spectral density values as a function of sizes and velocities generated by the LSTM network were used
time and frequency, which served as input for creating a CNN. to control the motion of the robot and verify whether the
Adv. Intell. Syst. 2024, 2400576 2400576 (11 of 30) © 2024 The Author(s). Advanced Intelligent Systems published by Wiley-VCH GmbH
Research on the behavior of a bioinspired soft robot operating under a water surface was the subject of ref. [149]. Three soft tentacles replicating the swimming of an octopus were designed, each containing an internal cavity. Individual cavities were connected to the pressure sensors using flexible silicone tubes. The deformation of the tentacles under the action of external forces caused a change in the volume of the cavity, which was subsequently used for the proprioceptive derivation of tentacle morphology. To reconstruct the movement of the tentacles, a two-way LSTM network was trained, whose inputs were sequences of pressure sensor data and images of the shape of the tentacle curvature during movement obtained using a camera located outside the water tank. Toro-Ossaba et al.[150] used a bioinspired soft prosthesis based on the skeletal architecture of a human hand operated using electromyography (EMG) signals. The myoelectric control system made it possible to move the soft prosthesis based on EMG signals captured during the movement of the human hand, which indicated electrical activity in the muscles during contraction. Based on the EMG signals acquired during human hand gestures using MyoWare sensors, an LSTM network was trained to classify individual gestures. The created EMG control system based on the LSTM network can control the movements of the soft prosthesis and replicate the investigated movements of a real human hand, as shown in Figure 7c.
An overview of the application of deep SL networks for modeling the behavior of soft robotic structures, along with the performance indicators used, is presented in Table 2, where mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), normalized root mean squared error (NRMSE), percentage of correct keypoints (PCK), and coefficient of determination (R2) represent the performance metrics used, and a dash represents references without a specified performance metric.

Table 2. Overview of the application of deep SL networks for modeling the behavior of soft robotic structures, along with the performance indicators used.

Network structure | Application of network | Device | Data collection for network | Performance indicator | Performance metric | References
CNN | Shape control and positioning | Soft manipulator | Vision based | Shape and position error | MAE | [63,69]
    |                               |                  |              | Positioning error | MSE | [64,67,68]
    |                               |                  |              | Pose estimation error | PCK | [65]
    |                               |                  |              | Shape error | RMSE | [66]
    |                               |                  | Sensor based | Shape error | RMSE/R2 | [75]
    | Object recognition and grasping | Soft gripper | Vision based | Recognition/grasping success rate error | – | [87,88]
    |                               |                  | Sensor based | Recognition accuracy error | – | [99]
    |                               |                  |              | Grasping success rate error | – | [100]
    |                               |                  |              | Recognition/grasping success rate error | – | [101]
    | Contact force control | Soft gripper | Vision based | Force prediction error | R2 | [90]
    |                               |                  |              | Force prediction error | MSE | [91]
    | Motion detection | Soft sensor | Sensor based | Recognition accuracy error | – | [114]
    | Contact force control | Soft sensor | Vision based | Force prediction error and contact location error | MSE | [115,135]
    |                               |                  |              | Contact location error | – | [132]
    |                               |                  | Sensor based | Force prediction error and contact location error | MAE | [133]
    | Positioning | Soft sensor | Vision based | Contact location error | – | [116]
    |                               |                  |              | Contact location error | RMSE | [117]
    | Motion detection | Soft bioinspired robot | Vision based | Recognition accuracy error | – | [144]
    | Sound source detection | Soft bioinspired robot | Sensor based | Direction-finding error | RMSE | [145]
    | Object recognition and grasping | Soft bioinspired robot | Vision based | Grasping success rate error | MSE | [146]
    |                               |                  |              | Positioning error | – | [147]
LSTM | Shape control and positioning | Soft manipulator | Sensor based | Positioning error | MAE/RMSE | [72,76,78]
    |                               |                  |              | Pose error | MSE | [73]
    |                               |                  |              | Positioning error | – | [79]
    | Object recognition and grasping | Soft gripper | Vision/Sensor based | Predicted states error | MSE | [94]
    |                               |                  | Sensor based | Recognition accuracy error | – | [95]
    | Contact force control | Soft gripper | Sensor based | Force prediction error | MAE/RMSE | [96,98]
    | Positioning | Soft sensor | Sensor based | Positioning error | RMSE/R2 | [122]
    | Contact force control | Soft sensor | Sensor based | Force prediction error and contact location error | RMSE | [119]
    | Motion detection | Soft sensor | Sensor based | Stretching and bending error | RMSE/R2 | [118]
    |                               |                  |              | Motion accuracy error | – | [120,121]
    | Bioinspired limb movement | Soft bioinspired robot | Sensor based | Predicted position, velocity and force error | NRMSE | [148]
    |                               |                  |              | Tip deflection prediction error | NRMSE | [149]
    |                               |                  |              | Cross-validation accuracy error | – | [150]
GRU | Tactile stimulation feedback | Soft sensor | Sensor based | 3D taxel force reconstruction error | MSE | [134]
DGP | Shape control and positioning | Soft manipulator | Sensor based | Positioning error | MAE/R2 | [74]
VP-RNN | Shape control and positioning | Soft manipulator | Sensor based | Positioning error | RMSE | [77]
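For reference, the performance metrics listed in Table 2 can be computed directly from predicted and true values. A compact NumPy sketch is given below (our illustration; NRMSE is normalized here by the range of the true values, which is one of several conventions):

import numpy as np

def metrics(y_true, y_pred):
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    nrmse = rmse / (y_true.max() - y_true.min())  # range-normalized RMSE
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                    # coefficient of determination
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "NRMSE": nrmse, "R2": r2}

print(metrics(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.2])))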
4. Overview of Deep RL in Soft Robotics

RL is another machine learning paradigm. The basic difference between RL and the previously described machine learning methods is that SL requires labeled data and simultaneously defines the value of the outputs, and USL uses unlabeled data whose output values are predicted by the model itself, whereas RL does not use defined data but learns from the environment. The task of RL is to train a model that can create a sequence of decisions and subsequently choose a solution based on its interaction with the environment in which it is located. When learning a model, it is important that it be guided in the form of a positive reward if it approaches the desired goal, or a negative reward if it moves away from the goal.[151,152] Several basic terms are distinguished within RL, for example, agent: a model trained using RL methods; environment: the environment in which the model is trained; action: all steps/actions that the agent can perform; reward: rewards help the agent choose an action that ensures movement in the right or desired direction; state: current location/state of the agent; and policy: conditions how the agent will make decisions within its current state, and all options of actions available to it.[151,153]

The essence of RL lies in decision making: an intelligent agent, which is in a dynamic environment, observes the given environment and simultaneously interacts with it. Subsequently, based on this information, it decides how to react, whereas its effort is to achieve the maximum rewards and minimum punishments for these reactions. Rewards can be either immediate or delayed. This essence is included in one of the most famous policies of RL, called the Markov decision process.[154,155] The agent can make a decision (future state) based on its interaction with the environment (current state) without the need to know the previous state (past); however, this is on the condition that the reward may not be received immediately, but only in the future. For example, a meandering agent should move from the edge to the center; on each subsequent move, it has several options and only a few steps/actions later will it learn whether its current decision will lead it to the goal (reward) or reach, for example, a dead end. This is one of the challenges faced by RL algorithms.[156] Such models can be applied widely in the real world in the context of solving challenging tasks where other algorithms and machine learning techniques are insufficient, precisely because of a dynamic, rapidly evolving, and uncertain environment. These include robotics, autonomous driving systems, gaming, and computer vision.[153,157] In recent years, however, the possibility of using RL has been strengthened by combining it with deep learning techniques, resulting in deep reinforcement learning (DRL). DRL is a rapidly developing subfield created with the aim of solving problems that cannot be solved with "shallow" machine learning techniques (for example, increasing the level of understanding of autonomous systems, the visual world, or the level of robot control directly from real-time camera footage).[156] These tasks often require a sequence of decisions within time; therefore, sequential decision making can be applied using algorithms from this subfield. At the same time, the agents of DRL must decide multiple times between improving a known strategy with which they have already obtained rewards in the past, or focusing on finding a new strategy that could potentially bring higher rewards than they had previously. However, training such agents is difficult because it is a computationally demanding process that requires many conditions, such as a large amount of input from the environment and stability. Although the training process can sometimes be lengthier, if the agent receives a high reward, it can generalize the acquired information and use it in other tasks.
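The meandering-agent example above can be made concrete with tabular Q-learning on a one-dimensional corridor in which the reward arrives only at the goal. The following minimal sketch is our illustration (corridor length, learning rate, discount factor, and exploration rate are arbitrary assumptions):

import numpy as np

n_states, actions = 6, [-1, +1]      # corridor cells; move left or right
Q = np.zeros((n_states, len(actions)))
alpha, gamma, eps = 0.1, 0.9, 0.2    # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0                             # start at the edge
    while s != n_states - 1:          # goal cell at the far end
        a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s_next == n_states - 1 else 0.0   # delayed reward
        # Q-learning update toward reward plus discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next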
using RL methods; environment: the environment in which the where the proximal policy optimization (PPO) and trust region
model is trained; action: all steps/actions that the agent can policy optimization (TRPO) algorithms are typical representa-
perform; reward: rewards help the agent choose an action that tives. The TRPO algorithm is one of the most frequently used
ensures movement in the right or desired direction; state: algorithms in the second group of methods—policy—and is
current location/state of the agent; and policy: conditions how characterized by the fact that it directly optimizes the agent’s
the agent will make decisions within its current state, and all policy; that is, it does not need to model a value function.[160,161]
options of actions available to it.[151,153] PPO implies that, like TRPO, it belongs to the group of policy
Table 2. Overview of the application of deep SL networks for modeling the behavior of soft robotic structures, along with the performance indicators used.

| Network structure | Application of network | Device | Data collection for network | Performance indicator | Performance metric | References |
|---|---|---|---|---|---|---|
| CNN | Shape control and positioning | Soft manipulator | Vision based | Shape and position error | MAE | [63,69] |
| | | | | Positioning error | MSE | [64,67,68] |
| | | | | Pose estimation error | PCK | [65] |
| | | | | Shape error | RMSE | [66] |
| | | | Sensor based | Shape error | RMSE/R² | [75] |
| | Object recognition and grasping | Soft gripper | Vision based | Recognition/grasping success rate error | – | [87,88] |
| | | | Sensor based | Recognition accuracy error | – | [99] |
| | | | | Grasping success rate error | – | [100] |
| | | | | Recognition/grasping success rate error | – | [101] |
| | Contact force control | Soft gripper | Vision based | Force prediction error | R² | [90] |
| | | | | Force prediction error | MSE | [91] |
| | Motion detection | Soft sensor | Sensor based | Recognition accuracy error | – | [114] |
| | Contact force control | Soft sensor | Vision based | Force prediction error and contact location error | MSE | [115,135] |
| | | | | Contact location error | – | [132] |
| | | | Sensor based | Force prediction error and contact location error | MAE | [133] |
| | Positioning | Soft sensor | Vision based | Contact location error | – | [116] |
| | | | | Contact location error | RMSE | [117] |
| | Motion detection | Soft bioinspired robot | Vision based | Recognition accuracy error | – | [144] |
| | Sound source detection | Soft bioinspired robot | Sensor based | Direction-finding error | RMSE | [145] |
| | Object recognition and grasping | Soft bioinspired robot | Vision based | Grasping success rate error | MSE | [146] |
| | | | | Positioning error | – | [147] |
| LSTM | Shape control and positioning | Soft manipulator | Sensor based | Positioning error | MAE/RMSE | [72,76,78] |
| | | | | Pose error | MSE | [73] |
| | | | | Positioning error | – | [79] |
| | Object recognition and grasping | Soft gripper | Vision/Sensor based | Predicted states error | MSE | [94] |
| | | | Sensor based | Recognition accuracy error | – | [95] |
| | Contact force control | Soft gripper | Sensor based | Force prediction error | MAE/RMSE | [96,98] |
| | Positioning | Soft sensor | Sensor based | Positioning error | RMSE/R² | [122] |
| | Contact force control | Soft sensor | Sensor based | Force prediction error and contact location error | RMSE | [119] |
| | Motion detection | Soft sensor | Sensor based | Stretching and bending error | RMSE/R² | [118] |
| | | | | Motion accuracy error | – | [120,121] |
| | Bioinspired limb movement | Soft bioinspired robot | Sensor based | Predicted position, velocity, and force error | NRMSE | [148] |
| | | | | Tip deflection prediction error | NRMSE | [149] |
| | | | | Cross-validation accuracy error | – | [150] |
| GRU | Tactile stimulation feedback | Soft sensor | Sensor based | 3D taxel force reconstruction error | MSE | [134] |
| DGP | Shape control and positioning | Soft manipulator | Sensor based | Positioning error | MAE/R² | [74] |
| VP-RNN | Shape control and positioning | Soft manipulator | Sensor based | Positioning error | RMSE | [77] |
4. Overview of Deep RL in Soft Robotics

RL is another machine learning paradigm. The basic difference between RL and the previously described methods is that SL requires labeled data whose output values are defined in advance, and USL uses unlabeled data whose output values are inferred by the model itself, whereas RL does not use predefined data at all but learns from the environment. The task of RL is to train a model that can create a sequence of decisions and subsequently choose a solution based on its interaction with the environment in which it is located. When learning, the model is guided by a positive reward if it approaches the desired goal and a negative reward if it moves away from the goal.[151,152] Several basic terms are distinguished within RL, for example, agent: the model trained using RL methods; environment: the setting in which the model is trained; action: any of the steps the agent can perform; reward: a feedback signal that helps the agent choose actions that move it in the desired direction; state: the current location/situation of the agent; and policy: the rule that determines how the agent makes decisions in its current state, given all the actions available to it.[151,153]
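To make these terms concrete, the following minimal sketch expresses the agent–environment loop in Python. The `env` object is a hypothetical interface (reset() returning a state, step() returning the next state, a reward, and a termination flag; action_space is a list), and the random policy is a placeholder, not a method taken from any referenced work.

```python
import random

def random_policy(state, action_space):
    # policy: determines how the agent decides in its current state;
    # here it is a placeholder that ignores the state entirely.
    return random.choice(action_space)

def run_episode(env, policy, max_steps=200):
    state = env.reset()                 # state: the agent's current situation
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state, env.action_space)  # action selected by the policy
        state, reward, done = env.step(action)    # environment returns feedback
        total_reward += reward          # positive when approaching the goal,
                                        # negative when moving away from it
        if done:
            break
    return total_reward
```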
The combination of standard RL techniques with DNNs produces deep RL (DRL), which is capable of solving problems that cannot be solved with "shallow" machine learning techniques (for example, increasing the level of understanding of autonomous systems and the visual world, or controlling robots directly from real-time camera footage).[156] These tasks often require a sequence of decisions over time; therefore, sequential decision making can be applied using algorithms from this subfield. At the same time, DRL agents must repeatedly decide between improving a known strategy with which they have already obtained rewards in the past and searching for a new strategy that could potentially bring higher rewards. However, training such agents is difficult because it is a computationally demanding process that requires many conditions to be met, such as a large amount of input from the environment and training stability. Although the training process can sometimes be lengthy, an agent that receives a high reward can generalize the acquired information and use it in other tasks.

DRL methods can be categorized based on their specific characteristics and similarities.[157] One group comprises methods based on the value function, such as the deep Q-network (DQN) algorithm, which uses a neural network to approximate the Q-function, thereby combining Q-learning and neural networks.[158] Double deep Q-learning (DDQN) also belongs to this group, as it is an extension of the standard DQN model.[159] The second group consists of policy methods, of which the proximal policy optimization (PPO) and trust region policy optimization (TRPO) algorithms are typical representatives. TRPO, one of the most frequently used policy algorithms, is characterized by directly optimizing the agent's policy; that is, it does not need to model a value function.[160,161] PPO, like TRPO, belongs to the group of policy methods and represents a simpler, faster, and more efficient way to optimize an agent's policy than TRPO. PPO uses a clipping mechanism: a "clipped" surrogate objective function prevents a large deviation between the old and the new policy.[162] The deep deterministic policy gradient (DDPG) algorithm belongs to the third group of methods, actor–critic. DDPG combines DNNs (actor–critic) with deterministic policy gradients: the task of the DNNs is to approximate the Q-functions, while the deterministic policy gradient learns the appropriate policy.[163] The twin-delayed DDPG algorithm (TD3) was derived from DDPG to increase the efficiency and stability of learning and to eliminate suboptimal solutions. While DDPG has one actor network and one critic, TD3 uses two critic networks whose task is to determine the Q-values.[164] The soft actor–critic (SAC) algorithm is the last DRL algorithm described in this section, owing to its popularity and usability. Like DDPG and TD3, this type of algorithm is suitable for dynamic environments with a high frequency of actions. SAC matches TD3 in its use of two Q-networks, target networks, and the experience replay technique; its innovations are entropy regularization and a stochastic policy.[165]
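The value-based and policy-based families differ mainly in the quantity that the network outputs and in the loss built from it. The sketch below (PyTorch; tensor shapes and hyperparameters are illustrative assumptions) shows the DQN regression target and the PPO clipped surrogate objective described above.

```python
import torch

def dqn_target(reward, next_q_values, done, gamma=0.99):
    # DQN: the network approximates Q(s, a); the bootstrapped target is
    # r + gamma * max_a' Q(s', a') for non-terminal transitions
    # (`done` is a 0/1 float tensor marking episode ends).
    return reward + gamma * next_q_values.max(dim=1).values * (1.0 - done)

def ppo_clipped_loss(log_prob_new, log_prob_old, advantage, eps=0.2):
    # PPO: the "clipped" surrogate objective keeps the probability ratio
    # between the new and old policies inside [1 - eps, 1 + eps], which
    # prevents a large deviation between the old and the new policy.
    ratio = torch.exp(log_prob_new - log_prob_old)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -torch.min(unclipped, clipped).mean()
```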
The abovementioned methods and individual DRL algorithms are among the most commonly used because of their ability to solve complex problems through the combination of standard RL techniques with DNNs. The ability to learn directly from data gives an algorithm a high degree of flexibility. The applications of DRL algorithms are wide ranging, from robotics, energy, and autonomous vehicles to financial trading, games, and various types of simulations.[166–172]
4.1. DRL and Soft Manipulators
Soft manipulators are robotic systems composed of highly compliant materials that enable flexible, adaptable, and gentle interactions with the environment. Unlike rigid structures, soft manipulators can deform, making them suitable for delicate applications in unstructured environments, such as handling soft objects or working near people. As previously mentioned, soft manipulators are composed of flexible materials and have an infinite number of degrees of freedom of movement because of their deformation, and their behavior is influenced by elasticity, hysteresis, and compliance.[173,174] These facts lead to highly complex, nonlinear behaviors, and capturing these complex material properties and their dynamics requires accurate modeling, for which traditional approaches are insufficient. Deep SL is a type of machine learning in which a model is trained on a labeled dataset consisting of input–output pairs, and the goal of the model is to minimize the difference between its predictions and the actual data. Unlike deep SL, DRL is a type of machine learning in which an agent learns to make decisions by interacting with the environment and receiving feedback in the form of rewards and penalties to maximize the cumulative reward.[175] In the context of modeling the behavior of soft manipulators, DRL algorithms are powerful tools that enable rules to be learned directly from raw sensor data, thus eliminating the need for explicit modeling of complex dynamics. DRL integrates data sensing and control into a single framework, which allows soft manipulators to adapt to changing environmental conditions and unexpected changes, thereby increasing their functionality in unstructured environments. The ability to learn from interactions allows soft manipulators to adjust their behavior in real time, optimizing performance for tasks such as self-positioning, object manipulation, and human–robot interaction.[176–178]

Marquez et al.[179] proposed a modular framework for testing, debugging, and describing the movement of soft segments using a robot operating system (ROS). An actor–critic algorithm was applied to learn effective motion control using a two-actuator PneuNet (pneumatic network) soft robot equipped with integrated resistive flex sensors. The sensors were integrated into the lower part of the actuators to monitor the state of the soft robot, and an AprilTag was placed on the upper side to determine the position and orientation of the soft robot in the examined space. The twin-delayed deep deterministic policy gradient (TD3), a model-free actor–critic algorithm, was used for motion control to maximize the reward; in this research, the distance the robot had moved forward was the metric that the reward function attempted to maximize. The actor mapped the state of the soft robot from the integrated sensors as it moved forward, provided information about interactions with the environment, and then, based on feedback from the critic, adjusted its actions to maximize the cumulative reward over time. Thus, the sensors of the soft robot provided real-time feedback to the DRL algorithm, which adjusted the control to optimize the movement of the investigated system. A DDPG-based control system for continuous task-space manipulation was created by Li et al.[180] to control the movement of a soft manipulator. The interaction of the investigated system with the environment was characterized using an MDP. To provide feedback, two electromagnetic (EM) coils representing an EM tracking system were connected to the tip of the soft manipulator. The controller was initialized using the DDPG algorithm in a simulation environment and then transferred to the real system. The framework consisted of an actor network that acted as a controller to generate the action command, and a critic that evaluated the actions proposed by the actor and estimated the Q-value, which represented the expected cumulative reward for performing the given action. The critic's feedback helps the actor improve its performance by guiding it toward actions that bring higher rewards. Consequently, the proposed DRL controller can track the movement and effectively adapt to the changing external load of the investigated soft manipulator. The actor–critic system within the DDPG was also applied by Satheeshbabu et al.[181] to control the movement and track the trajectory of the end effector of a soft manipulator. The soft continuum arm (SCA), designated the BR2 SCA, comprised parallel-connected pneumatic segments mounted on a rotating platform. Eight motion-capture cameras were used to capture the position of the effector. The actor network used the current position of the BR2 SCA as the input and generated a movement action as the output, and the critic evaluated these actions. The goal of the DRL framework is to maximize the Q-values obtained from the critic network as a function of the weights of the actor network. The control of the movement of a two-segment hydraulic soft arm was also based on the DDPG algorithm in Zhang et al.[182] During movement control, the current position and the previous activity of the agent, which produced that position, were monitored. The agent, based on the DDPG, adjusted the pressure values in the chambers to achieve the desired position of the tip of the soft manipulator. The current state of the investigated system was subsequently expressed by the error between the current and desired positions. For the soft manipulator to move quickly and stably to the target position, the Euclidean distance between the current tip position and its desired position was used as the
basis for the reward. Therefore, activities that brought the investigated system closer to the required position and kept it stable during movement and stopping were rewarded, whereas actions that moved the tip of the soft manipulator away from the desired position or caused instability were penalized.
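Such a distance-based reward can be written compactly. The following sketch is a generic illustration of this design; the function name, weighting, and velocity-based stability penalty are assumptions for illustration, not details taken from ref. [182].

```python
import numpy as np

def tip_tracking_reward(tip_pos, target_pos, tip_velocity, stability_weight=0.1):
    # Negative Euclidean distance between the current tip position and the
    # desired position: larger rewards as the manipulator approaches the
    # target. A velocity penalty discourages unstable motion near the goal.
    distance = np.linalg.norm(np.asarray(tip_pos) - np.asarray(target_pos))
    instability_penalty = stability_weight * np.linalg.norm(tip_velocity)
    return -(distance + instability_penalty)
```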
Centurelli et al.[183] reported a neural-network-based closed-loop controller trained using the TRPO algorithm. The training was conducted in a simulation environment, with an LSTM-based approximation of the investigated system used as the model. The investigated system was a manipulator called the AM-I-Support robot, composed of two identical interconnected soft modules. The positional coordinates for training were obtained using a VICON motion-sensing system with eight infrared cameras. The inputs of the approximate model were the movement parameters, from which the next position of the investigated system was predicted. The TRPO-based controller operated as a DRL agent: it received information about the randomly generated target position; transmitted the weights, current error, and current position; and predicted the next optimal activation to maximize the reward. The reward function penalized deviations from the desired trajectory and ensured that the soft manipulator followed the target path accurately. Null et al. applied a data-driven model predictive control (MPC) strategy to enhance the control performance of an underwater soft robot.[184] This control strategy utilized a system model to predict future states, adjusting control actions accordingly. The experimental system was a planar soft robot comprising two modules with parallel sets of hydraulic soft actuators (shown in Figure 8a). Different shapes and configurations of the soft robot were obtained by adjusting the pressure of the actuators. Because of the demanding and time-consuming nature of manual hyperparameter tuning, the authors employed the AutoMPC method, which automates the tuning process using optimization techniques for hyperparameter adjustment and optimal control performance. The training set for the AutoMPC controllers consisted of end-effector movement trajectories under random control within different areas of the configuration space. Feedback was provided by pressure sensors and a machine-vision system that tracked reference markers on the central axis of the soft robot. To compare the performance of AutoMPC, four separate controllers, including TRPO, were created for the same motion-control task on the system under study. As reported in ref. [185], control of the end position of a soft robot was implemented based on a data-driven model and deep Q-learning. Individual movements and positions of the soft arm were captured using an integrated camera. The authors first created a simulation MLP model based on the morphological characteristics of the steering gear and soft arm of the investigated system. Subsequently, they applied the DQN algorithm for motion control, first in a simulation environment and then through experiments in a real environment. The DQN determined the movement commands and transformed them into the control parameters of the steering gear, such that direct manipulation of the soft arm was possible (Figure 8b). Ji et al.[186] applied a multiagent deep Q-network (MADQN) for the control and precise positioning of a cable-driven continuum surgical manipulator with two degrees of freedom of movement. MADQN is an extension of the traditional DQN adapted to multiagent environments: within MADQN, individual agents contain their own DQNs, which they apply for control based on local observations and interactions with the environment.
Figure 8. Examples of soft structures and deep RL network frameworks. a) Experimental setup of a 2D underwater soft robot. Reproduced under terms of
the CC-BY license.[184] Copyright 2024, The Authors. Published by IEEE. b) Three-step control process for soft robot arm using reinforcement learning.
Reproduced under terms of the CC-BY license.[185] Copyright 2020, The Authors. Published by MDPI. c) RL framework for soft robotic gripper actuation
and task execution. Reproduced under terms of the CC-BY license.[195] Copyright 2023, The Authors. Published by IEEE.
In this study, one agent was associated with each degree of freedom of movement and was trained in a simulated environment to control the manipulator by maximizing the rewards associated with accurate and safe movements. The trained MADQN controllers were evaluated based on their ability to perform tasks with high precision while adhering to safety constraints.
4.2. DRL and Soft Grippers
The sophistication of the grasping actions performed by robots and manipulators that use grippers depends on the size, shape, and properties of the object to be grasped. Currently, grasping technologies inspired by biological systems (for example, human hands, animals, and their body parts) and soft materials are being developed. Grippers constructed from soft materials exhibit superior properties in terms of elasticity, flexibility, and safety.[187] Consequently, they adapt considerably better to the variety of objects they need to grasp. However, the gripper design alone is insufficient to ensure a precise and safe grip; grip precision also depends on gripper control. At the beginning of this section, the strengths of DRL were explained, particularly the ability of an agent to learn through interactions with its environment. Using a DRL agent to control soft grippers offers several advantages. Because the agent learns directly from the environment, it can adapt to dynamic changes based on interactions, eliminating the need for manual reprogramming for new tasks; thus, the rate of adaptation to new conditions is high. Multiple DRL algorithms are well suited to dynamic environments with continuous learning processes and unpredictable states, leading to solutions for complex problems and tasks. In the real world, such conditions are commonly found in industries that involve grasping and moving objects, such as manufacturing (for example, handling materials, semifinished products, and products of various shapes and sizes), food (for example, harvesting crops from shrubs and trees), healthcare, and pharmaceuticals.[188]
One recent publication is Newbury et al.,[189] which describes the use of deep learning with different grasping methods for robots with six degrees of freedom. This review presents four grasp-synthesis methodologies, one of which uses DRL for grasping-strategy learning. Another review describes the development of techniques suitable for robotic grasp learning.[190] The article provides an overview of the key algorithms used in intelligent grasping. Current trends and challenges in real, rapidly changing environments indicate the need for greater adaptability of robotic devices with grippers, making it necessary to explore control options that avoid the need for reprogramming. The last review worth mentioning in connection with DRL techniques and grasping is from Sekkat et al.,[191] which describes the four main trends and challenges that must be addressed in robotic grasping and the DRL algorithms that can be used to tackle these issues. This review describes three specific actor–critic algorithms (DDPG, TD3, and SAC) for robotic grasping and explains the advantages and disadvantages of using these algorithms in the current context and for future applications.
Research papers have often investigated various DRL algorithms for soft grippers in connection with the popular and widely used UR (Universal Robots) manipulators, specifically the UR5 type. For example, Zhao et al.[192] presented a soft gripper integrated with a UR5 manipulator. The soft gripper could perform two types of grasping actions when handling an object, pinching and enveloping; thus, the authors referred to hybrid grasping tasks in their study. They divided the overall algorithmic process into three parts: recognition, decision making, and execution. Recognition was provided by an RGB-D camera that captured color and depth images of the current state. The images were updated continuously, and the group of images obtained was sent to YOLOv5 for object classification. The second part was the decision-making process: the depth images captured by the camera were sent to a Q-value prediction network, which evaluated whether an envelope or pinch action should be performed. In this part, DRL was applied using the DQN algorithm, which determined the grasping strategy. The third part was execution, which involved designing the trajectory of the movement of the manipulator arm. In addition to training this system through simulations, the authors conducted testing under real conditions on various objects with different shapes. The results of these experiments were interpreted for four basic shapes (sphere, cube, cylinder, and an irregular object) for both the envelope and pinch actions in both simulation and real-world scenarios.

A similar approach was used by Liu et al.,[193] where a soft multimodal gripper was attached to the arm of a UR5 robot. Whereas the soft gripper presented in the previous article contained two fingers, this soft multimodal gripper had four fingers and was capable of performing three grasping actions: enveloping, sucking, and enveloping followed by sucking. The hybrid grasping learning model required RGB-D images of objects (created using an RGB-D camera) as input. These images were classified using an R-CNN, and the DRL algorithm DDQN was used to determine the optimal policy (grasping strategy). Similar to the previous study, training and testing were performed on a simulated set of objects, followed by testing on real objects under real conditions. The enveloping and sucking actions achieved the highest efficiency. These studies, along with many others, explored the possibilities of combining conventional (rigid) robotics with unconventional (soft) robotics to increase efficiency and, in particular, the safety of people working in the proximity of robots and manipulators. Owing to advanced DRL methods, UR manipulators (as well as other types of rigid manipulators) can operate with different types of grippers, creating new possibilities.

Dai et al.[194] described the regulation of soft finger stiffness using DRL techniques, specifically the DDPG algorithm. The object of this study was a soft robotic finger with rotating parts. Under vacuum pressure, the individual parts solidify, and if the desired condition (the angle between segments of the soft finger) is not achieved, stiffness regulation becomes necessary. The problem is defined as an MDP, and the DDPG algorithm is used to learn the stiffness-control strategy. The learning approach includes actor–critic networks, which help optimize the actions performed (vacuum pressure level) and the rewards. In simulations, this approach showed better results than traditional PID control, with lower error rates and better robustness in the required grasping actions.
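For the discrete grasp-mode decisions used in the studies above, a value-based policy reduces to ranking a handful of actions by their predicted Q-values. The following minimal sketch (PyTorch) illustrates that idea; the tiny network, mode list, and all sizes are placeholders and do not reproduce the architectures of refs. [192,193].

```python
import torch
import torch.nn as nn

GRASP_MODES = ["pinch", "envelope"]  # discrete actions in hybrid grasping

class GraspQNet(nn.Module):
    # Small CNN mapping a single-channel depth image to one Q-value per
    # grasp mode; a real controller would be deeper and trained with a
    # DQN/DDQN loss on (state, action, reward) transitions.
    def __init__(self, n_modes=len(GRASP_MODES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, n_modes)

    def forward(self, depth_image):
        return self.head(self.features(depth_image))

def select_grasp_mode(net, depth_image):
    # Greedy policy: choose the grasp mode with the highest predicted Q-value.
    with torch.no_grad():
        q_values = net(depth_image.unsqueeze(0))  # add a batch dimension
    return GRASP_MODES[int(q_values.argmax(dim=1))]
```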
Bianchi et al.[195] reported the results of an investigation of a SofToss robot designed to throw real objects to specified positions. In addition to the soft gripper, the robot had a soft arm and vision-based motion sensors (Figure 8c). In this case, DRL (specifically, the PPO algorithm) was used to predict the pattern of actions that the robot should perform to ensure that the object reaches the desired position. Furthermore, the authors used neural networks to determine the relationship between the predicted and target actions. The actual movements of the thrown objects were tracked using markers attached to the objects. The results offer the potential for developing similar techniques and their subsequent applications, especially in logistics, where the accurate movement of objects is required. The aforementioned articles present the advantages of using DRL techniques in soft robotics: improved robustness and efficiency and a reduced error rate of actions. Such articles open the door for further investigations as well as more reliable real-world applications in various areas of life and branches of industry.
4.3. DRL and Bioinspired Soft Robots
With the current development of bioinspired robots that mimic the movement, flexibility, and adaptability of biological organisms, significant progress has been made in robotics. Their construction materials provide unique properties, such as resistance to environmental influences, flexibility, the possibility of structural deformation, the ability to operate in hard-to-reach, unstructured environments, and safety when interacting with people.[196] DRL is a suitable tool here owing to its ability to model and manage complex, high-dimensional, nonlinear dynamics and behavior in real time. DRL algorithms (as shown in Figure 9a) provide procedures for learning control instructions in environments where the state space is large and continuous; ensure stable policy updates, which are crucial for delicate control; and provide computational efficiency and reliability, making them suitable for real-time applications where resources are limited. They also emphasize exploration and handle continuous action spaces (ASs) well, which is essential for adaptive control in unstructured environments.[197–199]

DRL methods have been applied in individual studies primarily for the design and optimal gait control of bioinspired soft robots. The agent is integrated with the soft robot environment and receives data from the integrated sensors or visual information regarding the configuration of the current position. During training, selected movements are performed; rewards are assigned for efficient and stable movements, and penalties are imposed for unstable and ineffective movements. The goal is to maximize the cumulative reward and thereby create an optimal strategy for controlling the movement of bioinspired soft robots.
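The reward designs reported in the studies discussed below typically combine a term for forward progress with penalties for energy use and instability. A generic shaped reward of this kind can be sketched as follows; the weights are illustrative assumptions, not values from refs. [200–204].

```python
def locomotion_reward(forward_velocity, energy_used, deviation,
                      w_vel=1.0, w_energy=0.05, w_dev=0.5):
    # Shaped gait reward: forward progress is rewarded, while energy
    # consumption and deviation from the desired trajectory (a proxy
    # for instability) are penalized.
    return w_vel * forward_velocity - w_energy * energy_used - w_dev * deviation
```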
Figure 9. Examples of bioinspired soft structures and deep RL network structures. a) DDPG process chart incorporating image-based observations of soft
robot inspired by fish. Reproduced under terms of the CC-BY license.[199] Copyright 2022, The Authors. Published by Frontiers. b) Morphological struc-
tures of a soft snake robot (left) and a rigid snake robot (right) in the simulation environment. Reproduced under terms of the CC-BY license.[200]
Copyright 2023, The Authors. Published by Frontiers. c) A quadruped robot with soft actuators driven by tendons as the four legs and representation
of the high- and low-level control components. Reproduced under terms of the CC-BY license.[203] Copyright 2022, The Authors. Published by Elsevier.
d) Image processing of the posture of the soft arm. Reproduced under terms of the CC-BY license.[205] Copyright 2022, The Authors. Published by
Frontiers.
Li et al.[200] focused on defining and optimizing energy efficiency during the undulating swimming of snake-like organisms. The authors compared the influence of soft-body dynamics on energy efficiency during underwater movement by testing soft- and rigid-segment structures connected by joints and characterized by the same weight, dimensions, and motor capacity, while maintaining the same actuation degrees of freedom, as shown in Figure 9b. The agent was designed to learn and improve swimming efficiency over time based on PPO. In the experiments, two reward terms were proposed: the speed of movement in the positive direction and the amount of energy consumed in a single time step. The faster the experimental snake moved, the higher the reward; conversely, the more energy it expended, the lower the reward. The goal was to maximize the reward, that is, the distance traveled per unit of energy, by adjusting the driving method through a trial-and-error process. Consequently, they investigated how the dynamics of a soft body could minimize the energy required for movement, thereby maximizing efficiency. The use of the PPO algorithm for motion control also features in the research by Min et al.[201] Their DRL control framework focused on the muscle excitation and movement of a soft robot inspired by the behavior of octopus tentacles. Rewards were given for achieving the desired movements, and penalties for inefficient or unstable behaviors. The optimization of swimming control with reasonable energy efficiency was also investigated by Wang et al.[202] The authors designed a soft robotic eel with two wire-driven segments and two compliant bodies composed of elastic material. A DRL framework based on the SAC algorithm was designed to optimize the swimming control of the soft robot along a straight line. In this case, the reward consisted of three parts: a reward for the speed of movement in the forward direction, a reward for the consumed energy, and a reward for the deviation from the desired trajectory of movement. The application of the SAC method described in ref. [203] consisted of optimizing the robot's gait using four soft actuators representing the robot's legs (Figure 9c). The tendon-driven soft actuators provided flexibility and adaptability during movement. A simulation model was created for the investigated system, which imitated physical interactions in the real world and included the dynamics of the actuators, their speeds, and the contact forces, enabling realistic training of the DRL algorithm. Rewards were calculated based on robot performance, such as the distance traveled, energy efficiency, and stability; the reward function was designed to promote relevant gait parameters and penalize instability and excessive energy consumption. Li et al.[204] designed a soft robot inspired by the body of a snake that could swim using dielectric elastomer actuators (DEAs). The investigated system consisted of three DEA segments that bent in response to the applied voltage. The subject of this research was the application of the SAC algorithm for efficient movement control, namely, swimming in a straight line at the highest possible speed. The reward function was designed based on various performance metrics, such as efficient forward motion, stability, and energy consumption. Moreover, the management and optimization of walking have been the subject of research.[205] The DQN algorithm was applied to control and optimize the bipedal walking capability of an underwater soft robot. The investigated system was designed with flexible and deformable limbs inspired by an octopus, as shown in Figure 9d. This structure allows for a wide range of movements and mimics the natural motions of octopuses. A DRL framework was created for motion control, in which the soft robot was trained using the DQN algorithm. The agent interacted with the environment; received visual inputs, such as the limb positions; and generated control signals to drive the limbs. During training, the robot performed various movement actions, receiving rewards for successful, stable, and efficient walking, and penalties for unstable or inefficient movements. The goal was to maximize the reward, which led to optimal gait-control strategies for the system under investigation. An overview of the application of deep RL networks for modeling the behavior of soft robotic structures, along with the performance indicators used, is presented in Table 3, where MSE, RMSE, interquartile range error (IQR), and standard error (SE) represent the performance metrics, and a dash represents references without a specified performance metric.
Table 3. Overview of the application of deep RL networks for modeling the behavior of soft robotic structures, along with the performance indicators used.

| Network structure | Application of network | Device | Data collection for network | Performance indicator | Performance metric | References |
|---|---|---|---|---|---|---|
| TD3 | Motion control | Soft manipulator | Sensor based | Distance error | – | [179] |
| DDPG | Motion control | Soft manipulator | Sensor based | Tracking accuracy error and distance error | RMSE | [180] |
| | | | | Motion error | – | [182] |
| | | | Vision based | Motion error | – | [181] |
| TRPO | Motion control | Soft manipulator | Vision based | Tracking error | IQR | [183] |
| | | | Vision/Sensor based | Position error | – | [184] |
| DQN | Positioning and motion control | Soft manipulator | Vision based | Position error | – | [185] |
| | | | Sensor based | Trajectory tracking error | RMSE | [186] |
| PPO | Control of tossing objects | Soft gripper | Vision based | Distance error | L2 norm | [195] |
| DQN | Grasping strategy | Soft gripper | Vision based | Grasping success rate error | – | [192] |
| DDQN | Grasping strategy | Soft gripper | Vision based | Grasping success rate error | SE | [193] |
| DDPG | Stiffness control | Soft gripper | Sensor based | Finger angle error from initial position | – | [194] |
| PPO | Motion control | Soft bioinspired robot | Sensor based | Reward function based on velocity | – | [200,201] |
| SAC | Motion control | Soft bioinspired robot | Vision/Sensor based | Predicted swimming distance error | – | [202] |
| | | | Sensor based | Predicted horizontal distance error | RMSE | [203] |
| | | | Vision/Sensor based | Reward based on velocity and torque, position sum difference | – | [204] |
| DQN | Motion control | Soft bioinspired robot | Vision based | Predicted pose error | MSE | [205] |
5. Overview of Deep SSL and USL in Soft Robotics
The success of deep learning in many applications is due to the availability of vast amounts of data, which, for SL tasks, must be labeled so that the desired output is known. As in other applications, a sufficient number of labeled samples is required for soft robots, which can be demanding. However, providing voluminous datasets of unlabeled samples using various types of sensors may be easier, particularly under laboratory conditions. The problem of the partial or complete unavailability of labeled data can be addressed using two main approaches: USL and SSL.[206,207] Generally, SSL aims to learn from a certain number of labeled data samples and a large amount of unlabeled data, typically with the same data distribution.[208] In contrast, USL works only with unlabeled data, with no relevant information regarding the desired outputs. In addition, another type of learning has become important in recent applications where the processing of unlabeled data is assumed: self-supervised learning. Self-supervised learning can also be considered a method of USL; however, whereas USL is typically applied to clustering or dimensionality-reduction tasks, self-supervised learning is mainly used for classification and regression.[209] The loss functions in the two scenarios differ in the inclusion or absence of a supervised objective term, with an unsupervised objective present in both cases.
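Written generically, the two objectives differ only in whether the supervised term is present. For a model f_θ trained on N_l labeled and N_u unlabeled samples, a semisupervised loss can take the form shown below; this generic form is an illustration, not a formula taken from the cited works.

```latex
\mathcal{L}_{\mathrm{SSL}}
  = \frac{1}{N_l}\sum_{i=1}^{N_l} \ell_{\mathrm{sup}}\big(f_\theta(x_i),\, y_i\big)
  + \lambda\,\frac{1}{N_u}\sum_{j=1}^{N_u} \ell_{\mathrm{unsup}}\big(f_\theta(x_j)\big)
```

Here, the unsupervised term is, for example, a reconstruction or consistency loss, λ weights its contribution, and the purely unsupervised case keeps only the second term.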
The high dimensionality and nonlinear nature of soft robot behavior resulting from the properties of soft materials offer the potential for using USL and SSL, especially for providing tractable, reduced-order models. Therefore, dimensionality reduction to capture the most relevant features is a crucial task for USL. The reasons for using SSL in soft robotics can be found in cases in which soft robot states under known conditions may provide labeled samples, whereas other possible states may form unlabeled datasets. In real-world scenarios, these conditions may be encountered during the operation of a soft robot in an unstructured environment under different loadings. In soft gripping, where objects of different shapes, sizes, and textures are manipulated, methods suitable for handling uncertain and highly variable conditions are desirable. In addition, USL and SSL have some advantages over SL methods in the case of dynamic model updates with new data, because their decreased reliance on labeled data avoids extensive retraining. Although there is a relative abundance of model families in both USL and SSL,[208] only a few have found their way into soft robotics applications. One of the most relevant architectures in USL scenarios and soft robotics applications is the deep autoencoder (AE), which is suitable for identifying reduced-order latent representations.[207]
5.1. Deep SSL and USL within Soft Manipulators
The bodies of soft robots and manipulators pose significant challenges for modeling and control owing to their deformable continuum structures. High-fidelity physics-based approaches to the modeling of such systems may result in complicated models that are too computationally heavy to be applied to practical problems. To derive tractable models of these systems, data-driven approaches can be viewed as viable methods in which data may be obtained from measurements or generated synthetically.[210] Unlike the SL scenario, with its dependence on labeled datasets, USL methods can be used to learn lower-order latent representations of soft robot behavior.[11,211] Deep AEs are suitable architectures for dimensionality reduction through a low-order latent-space representation in USL scenarios. AEs are typically trained using the backpropagation technique and a gradient-based algorithm such as stochastic gradient descent or its variants (such as Adam). Metaheuristic algorithms can also be used to train AEs for reducing the dimensionality of soft robots: in ref. [212], the authors applied a combination of EAs (in this case, a GA) and the backpropagation (BP) technique to an electrophysiological type of soft robot. However, in terms of low-order representation learning performance, basic AEs may struggle to generalize effectively under noisy and highly variable input data,[213] which may be the case in the unpredictable environments where soft robots are applied. Variational AEs (VAEs), a type of generative model in which the basic principle of AEs is extended by incorporating Bayesian inference, may be more suitable for applications with soft manipulators. Spielberg et al.[210] proposed the use of a special architecture based on convolutional variational AEs (CVAEs) as part of a "learning-in-the-loop" concept, where learning the low-order soft-robot-state representation and optimizing its control and/or material parameters take place simultaneously (Figure 10a). The observation of the full state of the robot is based on a computer-vision approach in which the input data take the form of a two-channel velocity grid. Nevertheless, the 3D implementation of the algorithm is difficult to realize, and the training times can be excessive.
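The AE principle described above fits in a few lines of PyTorch. The sketch below is a minimal, generic example of unsupervised dimensionality reduction for sensor readings; the layer sizes and the training-step helper are illustrative assumptions, not an architecture from the cited studies.

```python
import torch
import torch.nn as nn

class SensorAutoencoder(nn.Module):
    # Plain deep AE: the encoder compresses a high-dimensional sensor
    # reading into a low-order latent vector, and the decoder reconstructs
    # the input; the reconstruction error alone drives training.
    def __init__(self, n_inputs=64, n_latent=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_inputs, 32), nn.ReLU(),
            nn.Linear(32, n_latent),
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 32), nn.ReLU(),
            nn.Linear(32, n_inputs),
        )

    def forward(self, x):
        z = self.encoder(x)              # reduced-order latent state
        return self.decoder(z), z

model = SensorAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam variant of SGD
loss_fn = nn.MSELoss()

def train_step(batch):
    # One unsupervised update: no labels, only reconstruction of the input.
    reconstruction, _ = model(batch)
    loss = loss_fn(reconstruction, batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```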
The USL approach can sometimes be used in combination with SL, not as the SSL concept but as two separate methods, where USL is applied in a preceding step to process the data in a suitable manner. Zou et al.[214] exemplify this with a combination of self-organizing maps (SOMs) and an FFNN (feedforward neural network) used as a universal approximator and applied to a soft continuum arm with an intrinsic structure and three pneumatic muscle actuators. This fusion helps address the relatively low accuracy of the widespread piecewise constant curvature (PCC)-based model for continuum arm kinematics. SOMs were used to extract the shape of the manipulator centerline from its contours, which was then used to estimate the spatial shape of the entire arm. However, these results were not sufficiently accurate and were corrected using a DNN whose inputs were the cable lengths (from wire encoders) and the robot shape estimation from a camera. Probabilistic models, such as Gaussian mixture models and their extension in the form of Gaussian mixture regression, are potent tools for USL or SSL scenarios in soft robotics applications. Their application may address the generalization problem of deep learning models, for example, across different realizations of trajectories with highly variable inputs from humans. This approach was used by Hamaya et al.,[215] where a physically soft robot was used for an assembly task while learning from a demonstration setup.
Figure 10. Examples of deep SSL and USL learning structures for soft robots. a) Architecture of convolutional variational autoencoder. The autoencoder
takes two-channel pixel grid data as input, with each channel representing the x or y velocity field at that pixel. At inference time, simulation data are fed
into the encoder, ϵ, which produces a latent vector. The mean μ variables are then fed as inputs to the controller, C. Reproduced under terms of the CC-BY
license.[210] Copyright 2019, The Authors. Published by NIPS. b) Arm with soft gripper and scheme of method showing successful and unsuccessful
collecting demonstrations with a teaching device, the application of PC-GMM (Gaussian mixture model) and Gaussian mixture regressions (GMR) to the
labeled demonstrations, and use of deep model-based reinforcement learning. Reproduced with permission.[215] Copyright 2020, IEEE. c) Integrated
finger with an LED and an inner camera, and architecture of the proposed framework, which takes two resized grayscale images and a gripper configu-
ration as inputs and predicts the 6D pose and category of the object. Reproduced under terms of the CC-BY license.[218] Copyright 2023, The Authors.
Published by MDPI.
The investigated system is shown in Figure 10b. First, a physically consistent Gaussian mixture model was applied to all trials, after which Gaussian mixture regression was used to create reference trajectories from only the successful attempts. This, together with DRL, allowed for an increase in the success rate compared with a successful-attempt-only approach.
5.2. Deep SSL and USL within Soft Grippers
Similar to the bodies of soft robots and manipulators, soft grippers are characterized by continuum structures that are expected to undergo significant deformation during operation to provide mechanically adaptive contact with various objects.[216] More sophisticated manipulation tasks may require estimation of the soft gripper shape and contact forces, which is difficult to achieve using standard computational methods.[217] USL techniques can be used for specific tasks (for example, dimensionality reduction) in information processing related to the use of soft grippers, particularly at low levels.[217] One of the dominant deep learning architectures used for dimensionality reduction is the deep AE. In cases where labeled datasets are needed but their availability is limited or difficult to achieve, architectures such as deep AEs can be used for feature extraction in a self-supervised learning setup.[218] In this manner, a model can extract useful features from a sufficient amount of manually unlabeled data, after which suitable fine-tuning of the model may occur.[219] Liu et al.[218] used a camera integrated into a gripper with two polyurethane fingers to estimate an object's pose and category (Figure 10c). The feature extractor used an encoder–decoder architecture based on ResNet, which operated on resized grayscale images. The resulting latent representations at the output of the encoder, together with the calculated gripper configuration, were used as inputs to two MLPs to estimate the 6D pose and object class.

The problem of the unavailability of labeled datasets can also be solved by generating synthetic data with characteristics similar to those of the generating process or system. In the context of deep learning, this problem is often addressed using GANs. Sapai et al.[220] reported such an approach for a pneumatic soft gripper using three PneuNet fingers. A special type of GAN, called Transformer TimeGAN (TTGAN), was combined with a self-attention mechanism to enable learning of the complex dynamics of the gripper. This GAN operated on latent time-domain representations obtained from a common encoder–decoder architecture. Moreover, a conditional framework was included in the model, such that data generation for different types of behavior (free bending and tip contact) was possible. Using this approach, it was possible to train a model with only a fraction of the real samples, compared to the case with a full real dataset, and to achieve similar error rates.
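The synthetic-data idea can be illustrated with the standard GAN training step: a generator learns to produce samples that a discriminator cannot distinguish from real measurements. The sketch below is a vanilla GAN on flattened samples, offered only to illustrate the principle; it does not reproduce the TTGAN architecture, its transformer components, or its conditional framework.

```python
import torch
import torch.nn as nn

latent_dim, sample_dim = 16, 64   # illustrative sizes

generator = nn.Sequential(
    nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, sample_dim))
discriminator = nn.Sequential(
    nn.Linear(sample_dim, 64), nn.ReLU(), nn.Linear(64, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real_batch):
    n = real_batch.size(0)
    fake = generator(torch.randn(n, latent_dim))

    # Discriminator: separate real samples (label 1) from synthetic ones (label 0).
    d_loss = bce(discriminator(real_batch), torch.ones(n, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(n, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: produce samples that the discriminator classifies as real.
    g_loss = bce(discriminator(fake), torch.ones(n, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```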
5.3. Deep SSL and USL within Soft Sensors
Soft sensors play an important role in the perception of soft robots. However, in contrast to typical industrial robots, proprioception is significantly more difficult because of challenges related to the kinematic complexity of soft machines.[221] Various types of soft sensors have recently been developed;[13] however, for most types, the following challenges must be addressed: 1) variable sensor characteristics, 2) sensitivity to environmental conditions, 3) drift and degradation, and 4) nonlinear characteristics, hysteresis, and creep effects.[217] These factors make the soft sensor calibration process extremely demanding and prone to large errors. Here, deep learning models offer a way to decode useful information about the measured variables in soft robots.[13] However, to derive these models in an SL scenario, a large amount of labeled data is required, which may not be easy to obtain. An SSL approach can be used to reduce the size of the required labeled training datasets. Kim et al.[211] used two types of datasets, a calibration dataset (labeled data) and a gait motion dataset (unlabeled data), to solve the problem of soft sensor calibration. The second was used to address the difficulties encountered under the changing conditions of soft sensors (particularly their placement) by representing the hidden characteristics of the original dataset with a three-component DL model, in which each component used a different architecture and function (a deep AE, a GRU network, and an FFNN with ReLU activation functions). Using this approach, it was possible to achieve an average error of only 21.22 mm for speeds from 2 to 6 km h⁻¹. An overview of the application of deep SSL and USL networks for modeling the behavior of soft robotic structures, along with the performance indicators used, is presented in Table 4, where MAE and RMSE represent the performance metrics, and a dash represents references without a specified performance metric.
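A three-component arrangement of this kind can be sketched as a single model that chains per-frame compression, temporal modeling, and regression. The code below is an illustrative composition in that spirit; the layer sizes, dimensions, and structure are assumptions and do not reproduce the model of ref. [211].

```python
import torch
import torch.nn as nn

class CalibrationPipeline(nn.Module):
    # Chained components: an AE-style encoder compresses each raw sensor
    # frame, a GRU models the temporal dynamics of the compressed
    # sequence, and an FFNN with ReLU activations maps the final GRU
    # state to the calibrated output.
    def __init__(self, n_sensors=32, n_latent=8, n_hidden=16, n_outputs=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_sensors, n_latent), nn.ReLU())
        self.gru = nn.GRU(n_latent, n_hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_outputs),
        )

    def forward(self, sensor_sequence):          # (batch, time, n_sensors)
        latent = self.encoder(sensor_sequence)   # per-frame compression
        _, last_state = self.gru(latent)         # temporal summary
        return self.head(last_state.squeeze(0))  # calibrated estimate
```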
The preceding sections described the possibilities of applying deep learning methods in the field of soft robotics and categorized them into the basic learning scenarios (supervised, unsupervised, reinforcement, and semisupervised). Figure 1 shows the methods belonging to the respective learning scenarios that are generally applied within deep learning. Table 5 summarizes the methods that constitute a significant portion of the applied approaches for modeling the behavior of systems in the field of soft robotics. The table provides an overview of the features and application fields of the most commonly used deep learning methods, selected according to their frequency of occurrence in related articles.
Table 4. Overview of the application of deep SSL and USL methods for modeling the behavior of soft robotic structures, along with the performance indicators used.

| Network structure | Application of network | Device | Data collection for network | Performance indicator | Performance metric | References |
|---|---|---|---|---|---|---|
| CVAE | Control of dynamics | Soft robot | Vision based | Latent state reconstruction error | – | [210] |
| EAs/BP | Enhancement of control | Soft robot | Sensor based | Fitness value for DNN evolution | – | [212] |
| SOMs/FFNN | Shape control | Soft manipulator | Vision/Sensor based | 3D shape estimation error | RMSE | [214] |
| GMM | Motion control | Soft gripper | Sensor based | Positioning error | MAE | [215] |
| DA | Pose estimation | Soft gripper | Vision based | Pose estimation error | – | [218] |
| TTGAN | Motion control | Soft gripper | Vision based | Dynamics prediction error | MAE | [220] |
| AE | Motion detection | Soft sensor | Vision based | Position vector error for gait reconstruction | RMSE | [211] |

6. Challenges and Future Prospects

In general, machine learning presents a potent framework for soft robotic systems owing to its computational flexibility and data-driven characteristics, which are central to the modeling and control of the highly complex systems expected to operate in unstructured environments. Recent developments in deep learning methods have significantly extended these capabilities owing to the possibility of automatic feature extraction as well as enhanced architectural designs, which perform considerably better for the difficult-to-describe spatiotemporal characteristics found in systems with soft structures. While the concept of deep learning is more strongly coupled with the era of CNNs and recurrent neural networks, such as LSTMs, GRUs, and all newer architectures, MLPs with three or more layers can be considered deep and are sometimes useful for addressing certain types of problems in soft robotics, such as vision-based machine learning for soft robotic metamaterials[89] or magnetostriction-based soft manipulators.[57] Nevertheless, the presence of effects such as hysteresis, creep, and time-varying properties necessitates modeling and control using more sophisticated architectures, where the power of deep learning can be fully exploited. However, the application of these methods in soft robotics presents major challenges in several areas (Figure 11), which can be classified as follows: 1) datasets, 2) computational complexity, 3) generalization capabilities of models, 4) interpretability of models, and 5) real-time processing.

The first remains one of the main challenges related to most problems to which deep learning models are applied. It concerns not only the sheer number of samples available for training and testing (or validation), but also their labeling, on which the SL and RL scenarios depend. In soft robotics applications, this issue seems less pronounced than in areas such as image or natural language processing, which is attested to by the significant disproportion in the use of USL and SSL compared to SL and RL. The reasons can only be hypothesized but may be connected to the conditions under which experiments with soft robots are often performed, which can favor the availability of labeled data over the characteristics of data in real-world settings. Training deep learning models is computationally demanding, and powerful hardware setups
are typically needed to maintain training times at a reasonable level. In the case of soft robot models, computational complexity can also be crucial for using DL models in real time, as in DRL. Although simulations may prove helpful here, it is necessary to decrease the sim-to-real gap of high-fidelity models, which have significant computational power requirements.[113] Generalization of deep learning models is another important challenge that must be addressed in soft machines, particularly for their use in unstructured environments. In this regard, the use of DL models may require approaches that go beyond the standard regularization techniques that help prevent overfitting (that is, dropout, batch normalization, or weight decay) and should result in models usable under operating conditions that differ significantly from the training settings. Soft robots often need to adapt to dynamic environments or change tasks in real time; training deep learning models that can quickly and reliably adapt to new situations without extensive retraining remains challenging.
6.1. Computational Complexity and Interpretability/Trustworthiness
Computational complexity, as a measure of the time and space required for either training or inference of deep learning models, is a critical factor in robotic applications, especially when real-time operation is anticipated. Addressing the issue of computational complexity reduction involves various techniques, including pruning, quantization, knowledge distillation, multiplication reduction, and network architecture search.[222] Shimadera et al.[135] proposed a high-resolution multimodal sensing approach for a soft sensor that does not rely on integrating multiple sensors. Based on their results, they suggested using knowledge distillation, where an ensemble of models is trained on different datasets and the knowledge is distilled from the trained ensemble into a smaller model. Deep learning architectures are often overparameterized, and searching for an optimal architecture may be an effective solution for reducing the computational complexity of the resulting model. Nadizar et al.[38] employed such an approach to design neurocontrollers for voxel-based soft robots, where evolutionary strategy algorithms were applied to search for (quasioptimal) architectures of controllers based on MLPs, RNNs, and SNNs. Many of these techniques have been used in various deep learning applications, such as computer vision and natural language processing. Although general real-time applications often require steps to reduce the computational complexity of deep learning models, many recently developed techniques have not yet been applied in the context of soft robotics. A stronger promotion of their use may occur if soft robots are deployed in real-world conditions with strict computational requirements.
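The knowledge-distillation step mentioned above can be expressed as a combined loss in which a compact student matches both the ground-truth labels and the softened outputs of a larger teacher (or ensemble). The following generic sketch illustrates the principle; the temperature and weighting are illustrative, not values from ref. [135].

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    # Student mimics the teacher's softened output distribution (KL term)
    # while still fitting the true labels (cross-entropy term).
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature**2
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce
```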
Figure 11. Challenges and future prospects of deep learning in soft robotics.
The inherent black-box nature of deep learning models contrasts with the need for interpretability and explainability of the proposed solutions, which affects their trustworthiness. Several techniques have been proposed to address the explainability and/or interpretability of deep learning models. These include saliency maps, feature attribution, out-of-distribution detection, surrogate models, active learning, and shared autonomy/control.[223] The last two, which are particularly helpful in increasing the trustworthiness of deep learning models, have been used in soft robotics. In ref. [224], active learning was employed to enable informed decision making and appropriate responses to diverse materials and surfaces using a multimodal bioinspired sensor. The principle of shared control was used in ref. [140] for an upper-limb assistive exoskeleton, in which EMG-based intention detection based on deep learning models was applied.

Although these models are still far from being fully explainable or interpretable, applying these approaches represents a step toward improvement for the future use of deep learning models in soft robotics. This is particularly important for healthcare and other human-robot interaction (HRI)-based applications.

6.2. Deep Supervised Learning

In terms of the number of applications, the SL paradigm is the dominant method in soft robotics. Currently, most research on soft robotics is conducted in laboratory environments, which can support a somewhat easier collection of labeled data compared to real-world conditions. Given the availability of labeled data, SL methods offer the use of well-established architectures (for example, CNNs, RNNs, and FFNNs) that can be effectively trained and are relatively easy to deploy. However, many challenges remain associated with the utilization of complex deep learning models in specific applications, such as soft robots, grippers, and sensors. Many soft continuum manipulator setups in laboratories are tested only under ideal conditions, excluding, for example, the effects of loading or contact with objects, which severely affect their performance. Their inclusion is important and can be accounted for in a simulation environment[77] or real-world applications.[62] When contact with objects is considered, passive adaptation of the gripper fingers can be used in view of material compliance.[87] However, more robust solutions, which are more likely to be successful in highly uncertain environments, have been proposed. These are based on multimodal
sensor integration, in which different types of sensors (for example, remote ultrasonic and triboelectric sensors) are used.[99] In this case, deep learning methods (for example, RNNs in SL setups) are often selected because of their ability to handle complex relationships in temporal data. Using this approach, high-fidelity proprioception in hybrid soft actuator sensors can be achieved, even for unstructured sensor configurations.[81]
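As a minimal sketch of such an SL setup, the model below maps a window of raw soft-sensor time series to a proprioceptive target (e.g., a tip position) with an LSTM. All shapes, sizes, and the synthetic training batch are assumptions of the sketch rather than details of ref. [81].

```python
# Hedged sketch: supervised proprioception from soft-sensor time series.
import torch
import torch.nn as nn

class ProprioLSTM(nn.Module):
    def __init__(self, n_sensors=8, hidden=64, n_outputs=3):
        super().__init__()
        self.lstm = nn.LSTM(n_sensors, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, x):             # x: (batch, time, n_sensors)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # pose estimate from the last step

model = ProprioLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 50, 8)            # synthetic batch: 50-step sensor windows
y = torch.randn(32, 3)                # labeled tip positions
loss = nn.functional.mse_loss(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```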
Likewise, future developments in the area of electronic skins will require an increased number of sensor units per unit area to improve the resolution, as well as their stacking in several layers, to come closer to human skin in terms of sensitivity and the ability to respond appropriately to various types of stimuli. The processing of large amounts of data relies on the reliable extraction of important features, where CNNs excel.[217] The performance of baseline deep learning models can be further improved using more advanced architectures based on self-attention modules and transformers.[225] Hu et al.[127] presented this for a capacitive-based electronic skin, where a combination of MLPs and a transformer decoder was used for the morphological reconstruction of a soft robot using readouts from 392 sensors.
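A hedged sketch of this kind of architecture is given below: per-taxel readouts are embedded as tokens and processed by a small transformer encoder. Only the 392-sensor count is taken from the text; all layer sizes, the learned positional embedding, and the contact-location head are assumptions of the sketch, and the MLP plus transformer-decoder design of ref. [127] is not reproduced.

```python
# Illustrative self-attention model over an e-skin sensor array.
import torch
import torch.nn as nn

class SkinTransformer(nn.Module):
    def __init__(self, n_taxels=392, d_model=64, n_out=3):
        super().__init__()
        self.embed = nn.Linear(1, d_model)           # per-taxel embedding
        self.pos = nn.Parameter(torch.zeros(n_taxels, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_out)        # e.g., contact location

    def forward(self, x):                            # x: (batch, 392)
        tokens = self.embed(x.unsqueeze(-1)) + self.pos
        return self.head(self.encoder(tokens).mean(dim=1))

out = SkinTransformer()(torch.randn(4, 392))         # -> (4, 3)
```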
Based on the given application, the typical black-box nature of the derived deep learning models can be viewed as a major weakness in offering interpretable and trustworthy results. The area of explainable AI (xAI) has significant potential for future applications of soft machines in human-centered environments. For complex architectures of DL models, model interpretability can be enhanced using a predictive estimation framework,[92] which is used to express the confidence of predictions with LSTM-based models in real time for a PneuNet soft robot.
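For illustration, the sketch below expresses prediction confidence with Monte Carlo dropout, a generic stand-in for such predictive estimation; it is not the framework of ref. [92], and the model and dimensions are invented.

```python
# Illustrative confidence estimate via Monte Carlo dropout.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                      nn.Dropout(p=0.2), nn.Linear(64, 3))
model.train()                         # keep dropout active at inference time

x = torch.randn(1, 8)
samples = torch.stack([model(x) for _ in range(50)])  # 50 stochastic passes
mean, std = samples.mean(0), samples.std(0)           # prediction and spread
# A large std flags inputs on which the model should not be trusted.
```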
6.3. Deep Reinforcement Learning

Owing to the many challenges associated with controlling highly nonlinear and nonstationary systems, such as soft robots, the RL framework, which is based on the direct interaction of an agent with the environment, is a method of choice for operating soft machines in unstructured spaces. However, RL with standard model architectures may struggle with the high-dimensional and continuous state spaces associated with soft robots. In this regard, soft manipulators with continuous ASs can benefit from methods that directly handle continuous ASs, such as deep actor-critic or deep policy gradient methods. One of these approaches (DDPG) was used by Satheeshbabu et al.[181] for a BR2 pneumatic soft continuum arm, which proved effective in addressing the problem of varying loads at the arm tip. The application of DRL methods in soft robotics is often enhanced by training models in simulation environments; however, successful deployment can be adversely affected by existing sim-to-real gaps. This problem was addressed by Li et al.[180] using domain randomization in simulations for fast control policy initialization and an offline retraining strategy for controller parameters. Although DDPG appears to be a widely used method in soft manipulator control, its application is fraught with problems of low stability of learning convergence.[164] Improvements can be achieved using the TD3 algorithm, where the actor network is updated less frequently than the critic networks.[169]
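The core TD3 detail mentioned above can be sketched as follows: the critic is updated at every step, whereas the actor is updated only every few steps. Replay sampling, target networks, and target-policy smoothing are omitted, and the random tensors stand in for a replay batch; this is a schematic pattern, not the implementation of refs. [164,169].

```python
# Schematic of TD3's delayed (less frequent) actor updates.
import torch
import torch.nn as nn

actor = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 2), nn.Tanh())
critic = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=3e-4)
policy_delay = 2

for step in range(200):
    s = torch.randn(64, 6)                  # placeholder state batch
    a = torch.randn(64, 2)                  # placeholder action batch
    target_q = torch.randn(64, 1)           # stands in for r + gamma * Q'
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], -1)), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    if step % policy_delay == 0:            # delayed actor update
        actor_loss = -critic(torch.cat([s, actor(s)], -1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```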
At the application level, DRL is used predominantly for soft manipulator control, particularly in positioning tasks. Owing to the specific properties of soft machines in general, other applications that function in unstructured environments with possible structural adaptability are also important. Underwater soft robotics is a prospective subarea where bioinspired robotic designs can be beneficial for certain tasks. The DRL framework is suitable for this scenario owing to the highly complex interactions between a robot and its environment. This approach was used in ref. [199], where a soft robotic fish with artificial muscles based on supercoiled polymers was designed and controlled using a DDPG algorithm.

For more sophisticated tasks, soft robots must be equipped with additional advanced sensing techniques, including vision systems. When DRL is used to control the machine, its hybridization with CNNs for automatic feature extraction improves the overall performance. Such a combination results in a more effective coordination of the soft body of the robot and the vision provided by cameras. The studies in refs. [173,199] illustrate this for a shape memory alloy (SMA)-based continuum arm and a soft robotic fish, respectively.
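A compact sketch of this hybridization is given below: a small convolutional encoder compresses camera frames into features from which a policy head outputs bounded, continuous actuator commands. The architecture and all dimensions are purely illustrative and are not taken from refs. [173,199].

```python
# Illustrative CNN encoder feeding a continuous-action policy head.
import torch
import torch.nn as nn

class VisionPolicy(nn.Module):
    def __init__(self, n_actions=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.policy = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                                    nn.Linear(64, n_actions), nn.Tanh())

    def forward(self, frame):                 # frame: (batch, 3, H, W)
        return self.policy(self.encoder(frame))

action = VisionPolicy()(torch.randn(1, 3, 64, 64))  # bounded actuator commands
```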
6.4. Generalization of the Deep Learning Models

Several methods for addressing the issue of deep learning generalization have been proposed recently and have potential for soft robotics applications. Domain randomization is a method in which randomized modifications of domain aspects in simulated environments are used to improve the robustness of a DL model under real-world conditions. This is useful for the DRL methods that are often applied in soft robotics.[194,199,201]
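A minimal sketch of domain randomization is shown below: the physical parameters of a simulated soft arm are resampled at every training episode so that the learned policy cannot overfit one simulator configuration. The parameter names, ranges, and the SoftArmSim constructor are invented for illustration.

```python
# Hedged sketch: per-episode resampling of simulator parameters.
import random

def randomized_sim_params():
    return {
        "elastic_modulus": random.uniform(0.5e5, 2.0e5),  # Pa (illustrative)
        "damping": random.uniform(0.05, 0.4),
        "payload_mass": random.uniform(0.0, 0.2),         # kg
        "actuation_delay": random.randint(0, 3),          # control steps
    }

for episode in range(100):
    params = randomized_sim_params()
    # env = SoftArmSim(**params)   # hypothetical simulator constructor
    # ...run one DRL training episode in this randomized environment...
```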
Transfer learning is based on the idea of adapting knowledge from the original domain, typically using pretrained models, to a related domain, thus decreasing reliance on task-specific data. This approach was used for cross-domain transfer learning as part of a domain-adaptable sequential variational Bayes framework in a soft gripper with PneuNets.[226] Generalization can also be improved using techniques that rely on very few examples of new tasks (few-shot learning) or even none at all (zero-shot learning). Similar to the domain randomization approach, learning can be performed in a simulated environment and applied to real-world conditions; a sim-to-real pipeline of this kind was proposed by Yoo et al.,[63] in which a vision-based system was applied to a pneumatic soft robot for shape reconstruction. The lack of extensive datasets representative of a system's behavior, which limits performance on unseen data, can also be addressed using data augmentation techniques or the generation of synthetic data.[220] An interesting direction involves hybrid models, in which the generalization of deep learning models is improved by incorporating physical knowledge of the given system, for example, by embedding it in the loss function of the neural network (physics-informed neural networks). In this regard, physics-informed RNNs with Bayesian optimization of parameters were used by Sun et al.[15] in soft robotic applications with PAMs.
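As an illustration of the last idea, the sketch below augments a data-fitting loss with the residual of an assumed first-order actuator model, tau*dy/dt + y = k*u; this toy dynamic, the residual weight, and all shapes are assumptions of the sketch and do not reproduce the PAM model of ref. [15].

```python
# Hedged sketch of a physics-informed loss for a dynamic system model.
import torch

def physics_informed_loss(y_pred, y_true, u, dt=0.01, tau=0.1, k=1.0):
    data_term = torch.mean((y_pred - y_true) ** 2)
    dydt = (y_pred[:, 1:] - y_pred[:, :-1]) / dt           # finite difference
    residual = tau * dydt + y_pred[:, :-1] - k * u[:, :-1]  # assumed dynamics
    return data_term + 0.1 * torch.mean(residual ** 2)      # weighted sum

y_pred = torch.randn(8, 100, requires_grad=True)  # predicted output traces
y_true = torch.randn(8, 100)                      # measured traces
u = torch.randn(8, 100)                           # e.g., pressure inputs
loss = physics_informed_loss(y_pred, y_true, u)
loss.backward()
```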
6.5. Generative AI

The power of generative AI has recently been well demonstrated, especially for large language models (LLMs) in the form of generative pretrained transformer (GPT) models. Although LLMs may have some relevance to traditional robotics,[227] their capabilities have
not yet been used for soft robotics. However, generative AI, with its ability to generate novel content (data), can be viewed as a prospective direction for specific aspects of soft machines. One of these is geometry and/or material selection in the design of soft robots. This is dictated by the need to use the complex geometries of certain types of robots to achieve the desired functionality (for example, soft origami/kirigami robots) and maintain their flexibility and structural integrity. In addition, specific soft robot designs (such as fluidic designs) require the integration of multiple components, such as fluid channels, embedded sensors, and actuators. Generative methods for soft robot design have great potential for exploring novel solutions to optimize geometry, which is difficult to achieve using manual approaches.[228]
Chan et al.[229] proposed the use of deep generative diffusion models for soft robot design, which are more suitable for this purpose because of their ability to model long-range dependencies and their principled framework for uncertainty estimation. In particular, a text-to-shape model with three components was used: an encoder-decoder architecture for latent object space transformations, a denoising network for backward diffusion, and a text encoder for conditioning.
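The denoising idea at the core of such models can be sketched in a toy form: a network is trained to predict the noise added to latent design vectors. The dimensions, the crude linear noise schedule, and the unconditioned setup are assumptions of this sketch; the text-conditioned model of ref. [229] is substantially more elaborate.

```python
# Toy sketch of diffusion-style noise-prediction training.
import torch
import torch.nn as nn

denoiser = nn.Sequential(nn.Linear(33, 128), nn.ReLU(), nn.Linear(128, 32))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)
T = 100                                     # number of diffusion steps

x0 = torch.randn(64, 32)                    # latent soft-robot design vectors
t = torch.randint(0, T, (64, 1)).float() / T
alpha = 1 - t                               # crude linear noise schedule
noise = torch.randn_like(x0)
xt = alpha.sqrt() * x0 + (1 - alpha).sqrt() * noise  # noised latents

pred = denoiser(torch.cat([xt, t], dim=1))  # condition on the timestep
loss = nn.functional.mse_loss(pred, noise)  # standard noise-prediction loss
opt.zero_grad(); loss.backward(); opt.step()
```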
Other potential uses of generative AI in soft robotics include synthetic data generation, which is a method for improving the generalization properties of the resulting deep learning models. GANs are often the architectures of choice for generating novel samples of specific data types. Sapai et al.[220] enhanced the basic GAN model with a time-series transformer and self-attention to avoid lengthy data collection processes for a soft gripper. Additionally, a conditioning network was included to identify different types of behaviors.
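The base pattern behind such generators can be sketched as a minimal GAN producing synthetic sensor signals, here flattened to fixed-length vectors for brevity; the time-series transformer, self-attention, and conditioning network of ref. [220] are deliberately omitted, so this shows only the adversarial training step.

```python
# Minimal GAN sketch for synthetic soft-sensor signal vectors.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 100))
D = nn.Sequential(nn.Linear(100, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))
g_opt = torch.optim.Adam(G.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 100)                 # stand-in for recorded signals
fake = G(torch.randn(32, 16))               # generated signals from noise

# Discriminator step: separate real from generated samples.
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
d_opt.zero_grad(); d_loss.backward(); d_opt.step()

# Generator step: make generated samples look real to the discriminator.
g_loss = bce(D(fake), torch.ones(32, 1))
g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```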
Although generative AI has great potential, these methods have been applied sparingly in soft robotics to date. In contrast to image or text generation in other applications, there may be less pressure to use these methods, and the benefits may be less obvious. However, with the further development of soft robotics applications that can push soft machines closer to real-world conditions, generative AI methods are likely to become more widespread.
Acknowledgements

This research was funded by the Scientific Grant Agency of the Ministry of Education, Research, Development and Youth of the Slovak Republic and the Slovak Academy of Sciences under the project VEGA 1/0061/23; the Cultural and Educational Grant Agency of the Ministry of Education, Research, Development and Youth of the Slovak Republic under the project 022TUKE-4/2023; the Research and Development Support Agency under the project APPV-23-0591; and the EU NextGenerationEU through the Recovery and Resilience Plan for Slovakia under project no. 09I03-03-V03-00075.

Conflict of Interest

The authors declare no conflict of interest.

Keywords

deep learning, neural networks, reinforcement learning, soft robotics, supervised learning

Received: July 11, 2024
Revised: October 15, 2024
Published online:

References

[1] F. Stella, J. Hughes, Front. Robot. AI 2023, 9, https://doi.org/10.3389/frobt.2022.1059026.
[2] B. Caasenbrood, Design, Modeling, and Control Strategies for Soft Robots, Eindhoven University of Technology, Eindhoven 2024.
[3] D. Rus, M. T. Tolley, Nature 2015, 521, 467.
[4] N. El-Atab, R. B. Mishra, F. Al-Modaf, L. Joharji, A. A. Alsharif, H. Alamoudi, M. Diaz, N. Qaiser, M. M. Hussain, Adv. Intell. Syst. 2020, 2, 2000128.
[5] F. Tauber, M. Desmulliez, O. Piccin, A. A. Stokes, Bioinspir. Biomim. 2023, 18, 035001.
[6] C. Laschi, T. G. Thuruthel, F. Iida, R. Merzouki, E. Falotico, IEEE Control Syst. Mag. 2023, 43, 100.
[7] C. Armanini, F. Boyer, A. T. Mathew, C. Duriez, F. Renda 2022, https://doi.org/10.48550/arXiv.2112.03645.
[8] C. Della Santina, C. Duriez, D. Rus, IEEE Control Syst. Mag. 2023, 43, 30.
[9] M. S. Xavier, A. J. Fleming, Y. K. Yong, Adv. Intell. Syst. 2021, 3, 2000187.
[10] Z. Chen, F. Renda, A. L. Gall, L. Mocellin, M. Bernabei, T. Dangel, G. Ciuti, M. Cianchetti, C. Stefanini, IEEE Trans. Automat. Sci. Eng. 2024, 1.
[11] D. Kim, S.-H. Kim, T. Kim, B. B. Kang, M. Lee, W. Park, S. Ku, D. Kim, J. Kwon, H. Lee, J. Bae, Y.-L. Park, K.-J. Cho, S. Jo, PLoS One 2021, 16, e0246102.
[12] T. George Thuruthel, E. Falotico, M. Manti, A. Pratesi, M. Cianchetti, C. Laschi, Soft Robot. 2017, 4, 285.
[13] K. Chin, T. Hellebrekers, C. Majidi, Adv. Intell. Syst. 2020, 2, 1900171.
[14] M. Bolderman, M. Lazar, H. Butler, in 2022 IEEE 61st Conf. on Decision and Control (CDC), IEEE, Piscataway, NJ 2022, pp. 1497–1498.
[15] W. Sun, N. Akashi, Y. Kuniyoshi, K. Nakajima, IEEE Robot. Autom. Lett. 2022, 7, 6862.
[16] K. Tyagi, C. Rane, M. Manry, in Artificial Intelligence and Machine Learning for EDGE Computing (Eds: R. Pandey, S. K. Khatri, N. Kumar Singh, P. Verma), Academic Press, Cambridge 2022, pp. 3–22.
[17] V. Nasteski, Horiz. B 2017, 4, 51.
[18] S. Dridi 2021, https://doi.org/10.31219/osf.io/qtmcs.
[19] S. Suthaharan, in Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning (Ed: S. Suthaharan), Springer US, Boston, MA 2016, pp. 183–206.
[20] T. Jiang, J. L. Gradus, A. J. Rosellini, Behav. Ther. 2020, 51, 675.
[21] P. C. Sen, M. Hajra, M. Ghosh, in Emerging Technology in Modelling and Graphics (Eds: J. K. Mandal, D. Bhattacharya), Springer, Singapore 2020, pp. 99–111.
[22] F.-R. Stöter, S. Chakrabarty, B. Edler, E. A. P. Habets, in 2018 IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Piscataway, NJ 2018, pp. 436–440.
[23] A. H. de Souza, F. Corona, G. A. Barreto, Y. Miche, A. Lendasse, Neurocomputing 2015, 164, 34.
[24] I. H. Sarker, SN Comput. Sci. 2021, 2, 420.
[25] Y. Hu, S. Luo, L. Han, L. Pan, T. Zhang, Artif. Intell. Med. 2020, 102, 101764.
[26] H.-F. Yang, K. Lin, C.-S. Chen, IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 437.
[27] D. Wang, J. Chen, IEEE/ACM Trans. Audio, Speech, Lang. Process. 2018, 26, 1702.
[28] X. Yuan, Y. Gu, Y. Wang, C. Yang, W. Gui, IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 4737.
[29] H. Mostafa, V. Ramesh, G. Cauwenberghs, Front. Neurosci. 2018, 12, https://doi.org/10.3389/fnins.2018.00608.
[30] K. Makantasis, K. Karantzalos, A. Doulamis, N. Doulamis, in 2015 IEEE Int. Geoscience and Remote Sensing Symp. (IGARSS), IEEE, Piscataway, NJ 2015, pp. 4959–4962.
[31] V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, Proc. IEEE 2017, 105, 2295.
[32] R. M. Cichy, D. Kaiser, Trends Cogn. Sci. 2019, 23, 305.
[33] W. Samek, G. Montavon, S. Lapuschkin, C. J. Anders, K.-R. Müller, Proc. IEEE 2021, 109, 247.
[34] Z. Li, F. Liu, W. Yang, S. Peng, J. Zhou, IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6999.
[35] H. Salehinejad, S. Sankar, J. Barfett, E. Colak, S. Valaee 2018, https://doi.org/10.48550/arXiv.1801.01078.
[36] M. Krichen, Computers 2023, 12, 151.
[37] S. Albawi, T. A. Mohammed, S. Al-Zawi, in 2017 Int. Conf. on Engineering and Technology (ICET), Antalya, Turkey, August 2017, pp. 1–6.
[38] G. Nadizar, E. Medvet, S. Nichele, S. Pontes-Filho, Appl. Soft Comput. 2023, 145, 110610.
[39] Z. C. Lipton, J. Berkowitz, C. Elkan 2015, https://doi.org/10.48550/arXiv.1506.00019.
[40] R. M. Schmidt 2019, https://doi.org/10.48550/arXiv.1912.05911.
[41] S. Das, A. Tariq, T. Santos, S. S. Kantareddy, I. Banerjee, in Machine Learning for Brain Disorders (Ed: O. Colliot), Springer US, New York, NY 2023, pp. 117–138.
[42] S. Baloch, H. Ali, Recurrent Neural Networks: Architectures and Applications 2023, https://doi.org/10.13140/RG.2.2.33556.48002.
[43] O. M. Surakhi, M. A. Zaidan, S. Serhan, I. Salah, T. Hussein, Computers 2020, 9, 89.
[44] S. Jeong, M. Ferguson, R. Hou, J. P. Lynch, H. Sohn, K. H. Law, Adv. Eng. Inform. 2019, 42, 100991.
[45] Y. Su, C.-C. J. Kuo, Neurocomputing 2019, 356, 151.
[46] A. Sherstinsky, Phys. D 2020, 404, 132306.
[47] X. Song, Y. Liu, L. Xue, J. Wang, J. Zhang, J. Wang, L. Jiang, Z. Cheng, J. Petrol. Sci. Eng. 2020, 186, 106682.
[48] F. M. Salem, in Recurrent Neural Networks: From Simple to Gated Architectures (Ed: F. M. Salem), Springer International Publishing, Cham 2022, pp. 85–100.
[49] F. Bonassi, M. Farina, R. Scattolini, Syst. Control Lett. 2021, 157, 105049.
[50] S. Y. Jhin, N. Park, in 2023 11th Int. Conf. on Learning Representations, Kigali, Rwanda, May 2023, pp. 1–19.
[51] A. Tariverdi, V. K. Venkiteswaran, M. Richter, O. J. Elle, J. Tørresen, K. Mathiassen, S. Misra, Ø. G. Martinsen, Front. Robot. AI 2021, 8, https://doi.org/10.3389/frobt.2021.631303.
[52] H. P. Thanabalan, Int. J. Eng. Appl. Sci. Technol. 2021, 5, 17.
[53] L. Wang, J. Lam, X. Chen, J. Li, R. Zhang, Y. Su, Z. Wang, Soft Robot. 2023, 10, 825.
[54] N. Agarwal, B. Wadhwa, M. S. Reddy, A. Rastogi, Int. J. Intell. Syst. Appl. Eng. 2024, 12, 414.
[55] J. F. Lazo, C.-F. Lai, S. Moccia, B. Rosa, M. Catellani, M. de Mathelin, G. Ferrigno, P. Breedveld, J. Dankelman, E. De Momi 2022, https://doi.org/10.48550/arXiv.2207.00401.
[56] H. El-Hussieny, I. A. Hameed, A. A. Nada, Biomimetics 2023, 8, 611.
[57] P. Abdollahzadeh, S. Azizi, in 2019 7th Int. Conf. on Robotics and Mechatronics (ICRoM 2019), IEEE, New York 2019, pp. 241–247.
[58] C. C. Johnson, T. Quackenbush, T. Sorensen, D. Wingate, M. D. Killpack, Front. Robot. AI 2021, 8, https://doi.org/10.3389/frobt.2021.654398.
[59] P. Hyatt, D. Wingate, M. D. Killpack, Front. Robot. AI 2019, 6, 22.
[60] W. D. Null, J. Menezes, Y. Zhang, Singapore 2023.
[61] J. M. Bern, Y. Schnider, P. Banzet, N. Kumar, S. Coros, in 2020 3rd IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, Piscataway, NJ 2020, pp. 417–423.
[62] W. Liu, Z. Jing, X. Dun, G. D'Eleuterio, W. Chen, H. Leung, in 2021 IEEE/ASME Int. Conf. on Advanced Intelligent Mechatronics (AIM), IEEE, New York 2021, pp. 1331–1336.
[63] U. Yoo, H. Zhao, A. Altamirano, W. Yuan, C. Feng, London 2023, pp. 544–551, https://doi.org/10.48550/arXiv.2303.04307.
[64] X. Yang, M. Kahouadji, O. Lakhal, R. Merzouki, Front. Robot. AI 2022, 9, https://doi.org/10.3389/frobt.2022.980800.
[65] J. Lu, F. Liu, C. Girerd, M. C. Yip 2023, pp. 560–566, https://doi.org/10.48550/arXiv.2302.14039.
[66] E. Almanzor, F. Ye, J. Shi, T. G. Thuruthel, H. A. Wurdemann, F. Iida, IEEE Trans. Robot. 2023, 39, 2973.
[67] S. Kamtikar, S. Marri, B. T. Walt, N. K. Uppalapati, G. Krishnan, G. Chowdhary 2022, https://doi.org/10.48550/arXiv.2202.05200.
[68] S. Kamtikar, S. Marri, B. Walt, N. K. Uppalapati, G. Krishnan, G. Chowdhary, IEEE Robot. Autom. Lett. 2022, 7, 5504.
[69] A. Zhang, R. L. Truby, L. Chin, S. Li, D. Rus, IEEE Robot. Autom. Lett. 2022, 7, 11509.
[70] R. Wang, S. Wang, S. Du, E. Xiao, W. Yuan, C. Feng, IEEE Robot. Autom. Lett. 2020, 5, 3382.
[71] T. Baaij, M. Klein Holkenborg, M. Stölzle, D. van der Tuin, J. Naaktgeboren, R. Babuška, C. D. Santina, Soft Matter 2023, 19, 44.
[72] R. L. Truby, C. Della Santina, D. Rus, IEEE Robot. Autom. Lett. 2020, 5, 3299.
[73] Y. Meng, G. Fang, J. Yang, Y. Guo, C. C. L. Wang, IEEE/ASME Trans. Mechatron. 2024, 29, 832.
[74] C. Relaño, J. Muñoz, C. A. Monje, Eng. Appl. Artif. Intell. 2023, 126, 107174.
[75] L. Mosser, L. Barbe, L. Rubbert, P. Renaud, IEEE Robot. Autom. Lett. 2023, 8, 6603.
[76] W. Li, Y. He, P. Geng, Y. Yang, Electronics 2023, 12, 1476.
[77] N. Tan, P. Yu, F. Ni, Z. Sun, in 2021 IEEE Int. Conf. on Systems, Man, and Cybernetics (SMC), IEEE, Piscataway, NJ 2021, pp. 1035–1041.
[78] A. Zhang, T.-H. Wang, R. L. Truby, L. Chin, D. Rus, in 2023 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), IEEE, Detroit, MI, USA 2023, pp. 2564–2571.
[79] J. Shu, J. Wang, K. C.-C. Cheng, L.-F. Yeung, Z. Li, R. K. Tong, Sensors 2023, 23, 6189.
[80] M. Bednarek, P. Kicki, J. Bednarek, K. Walas, Electronics 2021, 10, 96.
[81] P. Preechayasomboon, E. Rombokas, Actuators 2021, 10, 30.
[82] X. Chen, Alex. Eng. J. 2023, 84, 37.
[83] C. D. Santina, V. Arapi, G. Averta, F. Damiani, G. Fiore, A. Settimi, M. G. Catalano, D. Bacciu, A. Bicchi, M. Bianchi, IEEE Robot. Autom. Lett. 2019, 4, 1533.
[84] I. Nate, Z. Wang, M. Kameoka, Y. Watanabe, S. M. N. Islam, M. Kawakami, H. Furukawa, S. Hirai, in 2022 IEEE/ASME Int. Conf. on Advanced Intelligent Mechatronics (AIM), IEEE, New York 2022, pp. 1018–1023.
[85] S. Zhang, J. Shan, B. Fang, F. Sun, Robotica 2021, 39, 378.
[86] H. Wang, H. Xu, Y. Meng, X. Ge, A. Lin, X.-Z. Gao, IEEE Robot. Autom. Lett. 2022, 7, 11070.
[87] E. Almanzor, N. R. Anvo, T. G. Thuruthel, F. Iida, Front. Robot. AI 2022, 9, https://doi.org/10.3389/frobt.2022.1064853.
[88] H.-Y. Wang, W.-K. Ling, in 2016 IEEE Int. Conf. on Consumer Electronics-China (ICCE-China), IEEE, New York 2016.
[89] X. Han, S. Liu, F. Wan, C. Song, in 2023 IEEE Int. Conf. on Development and Learning (ICDL), IEEE, Piscataway, NJ 2023, pp. 331–338.
[90] F. Wan, X. Liu, N. Guo, X. Han, F. Tian, C. Song, in 2021 Conf. on Robot Learning, London, England, November 2021, pp. 1–10.
[91] D. De Barrie, M. Pandya, H. Pandya, M. Hanheide, K. Elgeneidy, Front. Robot. AI 2021, 8, 631371.
[92] Z. Y. Ding, J. Y. Loo, V. M. Baskaran, S. G. Nurzaman, C. P. Tan, IEEE Robot. Autom. Lett. 2021, 6, 951.
[93] J. Ha, D. Kim, S. Jo, in 2018 18th Int. Conf. on Control, Automation and Systems (ICCAS), IEEE, New York 2018, pp. 570–574.
[94] T. G. Thuruthel, F. Iida, Singapore 2023, https://doi.org/10.48550/arXiv.2205.04202.
[95] R. Zuo, Z. Zhou, B. Ying, X. Liu, in 2021 IEEE Int. Conf. on Robotics and Automation (ICRA 2021), IEEE, New York 2021, pp. 12164–12169.
[96] E. Rho, D. Kim, H. Lee, S. Jo, IEEE Robot. Autom. Lett. 2021, 6, 8126.
[97] G. Averta, F. Barontini, I. Valdambrini, P. Cheli, D. Bacciu, M. Bianchi, Adv. Intell. Syst. 2022, 4, 2100146.
[98] T. George Thuruthel, P. Gardner, F. Iida, Soft Robot. 2022, 9, https://doi.org/10.1089/soro.2021.0012.
[99] Q. Shi, Z. Sun, X. Le, J. Xie, C. Lee, in 2023 IEEE 18th Int. Conf. on Nano/Micro Engineered and Molecular Systems (NEMS), IEEE, Piscataway, NJ 2023, pp. 19–22.
[100] C. Choi, W. Schwarting, J. DelPreto, D. Rus, IEEE Robot. Autom. Lett. 2018, 3, 2370.
[101] H. Zhou, H. Kang, X. Wang, W. Au, M. Y. Wang, C. Chen, Agronomy 2023, 13, 503.
[102] Y. Yan, Z. Hu, Z. Yang, W. Yuan, C. Song, J. Pan, Y. Shen, Sci. Robot. 2021, 6, eabc8801.
[103] R. R. Chandran, V. Y. Chakrapani, S. Krishnan, D. G. Dharmaraj, in 2023 3rd Int. Conf. on Advances in Computing, Communication, Embedded and Secure Systems (ACCESS), Kalady, Ernakulam, India, May 2023, pp. 130–136.
[104] D. Kim, J. Kwon, B. Jeon, Y.-L. Park, Adv. Intell. Syst. 2020, 2, 1900178.
[105] S. Wang, Z. Sun, J. Bionic Eng. 2023, 20, 845.
[106] J. Yuan, Y. Zhang, G. Li, S. Liu, R. Zhu, Adv. Funct. Mater. 2022, 32, 2204878.
[107] P. Xu, J. Zheng, J. Liu, X. Liu, X. Wang, S. Wang, T. Guan, X. Fu, M. Xu, G. Xie, Z. L. Wang, Research 2023, 6, 0062.
[108] T. G. Thuruthel, B. Shih, C. Laschi, M. T. Tolley, Sci. Robot. 2019, 4, https://doi.org/10.1126/scirobotics.aav1488.
[109] B. Ando, S. Graziani, M. G. Xibilia, IEEE Trans. Instrum. Meas. 2019, 68, 1637.
[110] W. Hong, J. Lee, W. G. Lee, Biosensors 2022, 12, 580.
[111] B. Lan, X. Xiao, A. Di Carlo, W. Deng, T. Yang, L. Jin, G. Tian, Y. Ao, W. Yang, J. Chen, Adv. Funct. Mater. 2022, 32, 2207393.
[112] J. Barreiros, I. Karakurt, P. Agarwal, T. Agcayazi, S. Reese, K. Healy, Y. Menguc, in 2020 3rd IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, New York 2020, pp. 229–236.
[113] H. Park, J. Cho, J. Park, Y. Na, J. Kim, IEEE Robot. Autom. Lett. 2020, 5, 3525.
[114] L. Zhao, B. Wu, Y. Niu, S. Zhu, Y. Chen, H. Chen, J.-H. Chen, Adv. Mater. Technol. 2022, 7, 2101698.
[115] H. Sun, K. J. Kuchenbecker, G. Martius, Nat. Mach. Intell. 2022, 4, 135.
[116] S. Koh, B. Cho, J.-K. Park, C.-H. Kim, S. Lee, in 2019 13th Int. Conf. on Sensing Technology (ICST), IEEE, New York 2019.
[117] R. Ambrus, V. Guizilini, N. Kuppuswamy, A. Beaulieu, A. Gaidon, A. Alspach, in 2021 IEEE 4th Int. Conf. on Soft Robotics (RoboSoft), IEEE, Piscataway, NJ 2021, pp. 643–649.
[118] M. Xu, J. Ma, Q. Sun, H. Liu, IEEE Sens. J. 2023, 23, 14809.
[119] S. Han, T. Kim, D. Kim, Y.-L. Park, S. Jo, IEEE Robot. Autom. Lett. 2018, 3, 873.
[120] S. H. Kim, Y. Kwon, K. Kim, Y. Cha, Appl. Sci. 2020, 10, 2194.
[121] K. K. Kim, I. Ha, M. Kim, J. Choi, P. Won, S. Jo, S. H. Ko, Nat. Commun. 2020, 11, 2149.
[122] J. Shu, J. Wang, S. C. Y. Lau, Y. Su, K. H. L. Heung, X. Shi, Z. Li, R. K.-Y. Tong, Sensors 2022, 22, 7705.
[123] J. Su, H. Zhang, H. Li, K. He, J. Tu, F. Zhang, Z. Liu, Z. Lv, Z. Cui, Y. Li, J. Li, L. Z. Tang, X. Chen, Adv. Mater. 2024, 36, 2311549.
[124] K. Tao, J. Yu, J. Zhang, A. Bao, H. Hu, T. Ye, Q. Ding, Y. Wang, H. Lin, J. Wu, H. Chang, H. Zhang, W. Yuan, ACS Nano 2023, 17, 16160.
[125] H. Yao, W. Yang, W. Cheng, Y. J. Tan, H. H. See, S. Li, H. P. A. Ali, B. Z. H. Lim, Z. Liu, B. C. K. Tee, Proc. Natl. Acad. Sci. U. S. A. 2020, 117, 25352.
[126] H. Lee, H. Park, G. Serhat, H. Sun, K. J. Kuchenbecker, in 2020 IEEE Int. Conf. on Robotics and Automation (ICRA), IEEE, New York 2020, pp. 1632–1638.
[127] D. Hu, F. Giorgio-Serchi, S. Zhang, Y. Yang, Nat. Mach. Intell. 2023, 5, 261.
[128] Q. K. Luu, D. Q. Nguyen, N. H. Nguyen, V. A. Ho, in 2023 IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, Piscataway, NJ 2023, pp. 1–6.
[129] K. Park, H. Yuk, M. Yang, J. Cho, H. Lee, J. Kim, Sci. Robot. 2022, 7, eabm7187.
[130] H. Park, H. Lee, K. Park, S. Mo, J. Kim, in 2019 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), IEEE, New York 2019, pp. 7447–7452.
[131] C. Larson, J. Spjut, R. Knepper, R. Shepherd, Soft Robot. 2019, 6, 611.
[132] S. Yoshigi, J. Wang, S. Nakayama, V. A. Ho, in 2020 3rd IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, New York 2020, pp. 132–137.
[133] L. Massari, G. Fransvea, J. D'Abbraccio, M. Filosa, G. Terruso, A. Aliperta, G. D'Alesio, M. Zaltieri, E. Schena, E. Palermo, E. Sinibaldi, C. M. Oddo, Nat. Mach. Intell. 2022, 4, 425.
[134] A. Geier, R. Tucker, S. Somlor, H. Sawada, S. Sugano, IEEE Robot. Autom. Lett. 2020, 5, 6467.
[135] S. Shimadera, K. Kitagawa, K. Sagehashi, Y. Miyajima, T. Niiyama, S. Sunada, Sci. Rep. 2022, 12, 13096.
[136] K. Nakajima, H. Hauser, T. Li, R. Pfeifer, Sci. Rep. 2015, 5, 10487.
[137] P. Karipoth, A. Christou, A. Pullanchiyodan, R. Dahiya, Adv. Intell. Syst. 2022, 4, 2100092.
[138] S. Yin, Z. Jia, X. Li, J. Zhu, Y. Xu, T. Li, Extreme Mech. Lett. 2022, 52, 101635.
[139] G. Li, T.-W. Wong, B. Shih, C. Guo, L. Wang, J. Liu, T. Wang, X. Liu, J. Yan, B. Wu, F. Yu, Y. Chen, Y. Liang, Y. Xue, C. Wang, S. He, L. Wen, M. T. Tolley, A.-M. Zhang, C. Laschi, T. Li, Nat. Commun. 2023, 14, 7097.
[140] P. Sedighi, X. Li, M. Tavakoli, IEEE Robot. Autom. Lett. 2024, 9, 41.
[141] S. Panda, S. Hajra, P. M. Rajaitha, H. J. Kim, Micro Nano Syst. Lett. 2023, 11, 2.
[142] X. Yang, L. Lan, X. Pan, Q. Di, X. Liu, L. Li, P. Naumov, H. Zhang, Nat. Commun. 2023, 14, 2287.
[143] K. Tanaka, Y. Minami, Y. Tokudome, K. Inoue, Y. Kuniyoshi, K. Nakajima, IEEE Robot. Autom. Lett. 2022, 7, 11244.
[144] A. A. Abed, A. Al-Ibadi, I. A. Abed, J. Robot. Control 2023, 4, 299.
[145] X. Yin, R. Müller, Nat. Mach. Intell. 2021, 3, 507.
[146] H. Zhang, Y. Wu, E. Demeester, K. Kellens, IEEE Robot. Autom. Lett. 2023, 8, 584.
[147] W. Liu, Z. Jing, J. Huang, X. Dun, L. Qiao, H. Leung, W. Chen, IEEE Trans. Ind. Electron. 2023, 70, 12616.
[148] G. Li, T. Stalin, V. T. Truong, P. V. Y. Alvarado, IEEE Robot. Autom. Lett. 2022, 7, 1024.
[149] A. Vicari, N. Obayashi, F. Stella, G. Raynaud, K. Mulleners, C. D. Santina, J. Hughes, in 2023 IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, Piscataway, NJ 2023, pp. 1–6.
[150] A. Toro-Ossaba, J. C. Tejada, S. Rúa, A. López-González, Biomimetics 2023, 8, 29.
[151] S. Yatawatta, Astron. Comput. 2024, 48, 100833.
[152] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, Bradford Books, London 2018.
[153] M.-A. Blais, M. A. Akhloufi, Cogn. Robot. 2023, 3, 226.
[154] J. Jia, W. Wang, in 2020 35th Youth Academic Annual Conf. of Chinese Association of Automation (YAC), IEEE, Zhanjiang, China 2020, pp. 186–191.
[155] D. Pecioski, V. Gavriloski, S. Domazetovska, A. Ignjatovska, in 2023 12th Mediterranean Conf. on Embedded Computing (MECO), IEEE, Budva, Montenegro 2023, pp. 1–4.
[156] K. Arulkumaran, M. P. Deisenroth, M. Brundage, A. A. Bharath, IEEE Signal Process. Mag. 2017, 34, 26.
[157] Y. Li 2018, https://doi.org/10.48550/arXiv.1701.07274.
[158] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, D. Hassabis, Nature 2015, 518, 529.
[159] H. Van Hasselt, A. Guez, D. Silver, AAAI 2016, 30, https://doi.org/10.1609/aaai.v30i1.10295.
[160] J. Schulman, S. Levine, P. Abbeel, M. Jordan, P. Moritz, in Proc. of the 32nd Int. Conf. on Machine Learning, PMLR, Lille, France, July 2015, pp. 1889–1897.
[161] K. Thattai, J. Ravishankar, C. Li, in 2023 IEEE Belgrade PowerTech, IEEE, Piscataway, NJ 2023, pp. 1–6.
[162] Y. Gu, Y. Cheng, C. L. P. Chen, X. Wang, IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 4600.
[163] T. Lillicrap, J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, D. Wierstra, CoRR 2015, https://doi.org/10.48550/arXiv.1509.02971.
[164] S. Fujimoto, H. Hoof, D. Meger, in Proc. of the 35th Int. Conf. on Machine Learning, PMLR, Stockholm, Sweden, July 2018, pp. 1587–1596.
[165] T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, S. Levine, ArXiv 2018, https://doi.org/10.48550/arXiv.1812.05905.
[166] M. Cai, Q. Wang, Z. Qi, D. Jin, X. Wu, T. Xu, L. Zhang, IEEE Trans. Cybern. 2023, 53, 7699.
[167] M. Mohammadi, A. Z. Kouzani, M. Bodaghi, J. Long, S. Y. Khoo, Y. Xiang, A. Zolfagharian, Robot. Comput. Integr. Manuf. 2024, 85, 102636.
[168] S. A. Moezi, R. Sedaghati, S. Rakheja, ISA Trans. 2023, https://doi.org/10.1016/j.isatra.2023.10.030.
[169] M. Oghogho, M. Sharifi, M. Vukadin, C. Chin, V. K. Mushahwar, M. Tavakoli, in 2022 Int. Symp. on Medical Robotics (ISMR), IEEE, New York 2022.
[170] J. Yao, Q. Cao, Y. Ju, Y. Sun, R. Liu, X. Han, L. Li, Adv. Intell. Syst. 2023, 5, https://doi.org/10.1002/aisy.202200339.
[171] L. Li, J. Li, L. Qin, J. Cao, M. S. Kankanhalli, J. Zhu, IEEE Robot. Autom. Lett. 2019, 4, 2094.
[172] M. Raeisinezhad, N. Pagliocca, B. Koohbor, M. Trkov, Front. Robot. AI 2021, 8, 639102.
[173] W. Liu, Z. Jing, H. Pan, L. Qiao, H. Leung, W. Chen, J. Bionic Eng. 2020, 17, 1126.
[174] N. Komeno, B. Michael, K. Küchler, E. Anarossi, T. Matsubara 2022, https://doi.org/10.48550/arXiv.2210.07563.
[175] A. Ataka, A. P. Sandiwan, in 2023 9th Int. Conf. on Control, Automation and Robotics (ICCAR), Beijing, China, April 2023, pp. 115–120.
[176] S. Satheeshbabu, N. K. Uppalapati, G. Chowdhary, G. Krishnan, in 2019 Int. Conf. on Robotics and Automation (ICRA) (Eds: A. Howard, K. Althoefer, F. Arai, F. Arrichiello, B. Caputo, J. Castellanos, K. Hauser, V. Isler, J. Kim, H. Liu, P. Oh, V. Santos, D. Scaramuzza, A. Ude, R. Voyles, K. Yamane, A. Okamura), IEEE, New York 2019, pp. 5133–5139.
[177] C. Alessi, H. Hauser, A. Lucantonio, E. Falotico, in 2023 IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, Piscataway, NJ 2023, pp. 1–7.
[178] J. Liu, Z. Song, Y. Lu, H. Yang, X. Chen, Y. Duo, B. Chen, S. Kong, Z. Shao, Z. Gong, S. Wang, X. Ding, J. Yu, L. Wen, IEEE/ASME Trans. Mechatron. 2024, 29, 1007.
[179] J. Marquez, C. Sullivan, R. M. Price, R. C. Roberts, IEEE Robot. Autom. Lett. 2023, 8, 6076.
[180] Y. Li, X. Wang, K.-W. Kwok, in 2022 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), IEEE, Kyoto, Japan 2022, pp. 7074–7081.
[181] S. Satheeshbabu, N. K. Uppalapati, T. Fu, G. Krishnan, in 2020 3rd IEEE Int. Conf. on Soft Robotics (RoboSoft), IEEE, Piscataway, NJ 2020, pp. 497–503.
[182] Y. Zhang, T. Wang, N. Tan, S. Zhu, in Intelligent Robotics and Applications, ICIRA 2021, Pt I (Eds: X. J. Liu, Z. Nie, J. Yu, F. Xie, R. Song), Springer International Publishing AG, Cham 2021, pp. 302–312.
[183] A. Centurelli, L. Arleo, A. Rizzo, S. Tolu, C. Laschi, E. Falotico, IEEE Robot. Autom. Lett. 2022, 7, 4741.
[184] W. D. Null, W. Edwards, D. Jeong, T. Tchalakov, J. Menezes, K. Hauser, Y. Zhang, IEEE Robot. Autom. Lett. 2024, 9, 571.
[185] Q. Wu, Y. Gu, Y. Li, B. Zhang, S. A. Chepinskiy, J. Wang, A. A. Zhilenkov, A. Y. Krasnov, S. Chernyi, Information 2020, 11, 310.
[186] G. Ji, J. Yan, J. Du, W. Yan, J. Chen, Y. Lu, J. Rojas, S. S. Cheng, IEEE Robot. Autom. Lett. 2021, 6, 7461.
[187] M. Zhu, J. Dai, Y. Feng, Soft Robot. 2023, 11, https://doi.org/10.1089/soro.2022.0246.
[188] K. Tanaka, R. Yonetani, M. Hamaya, R. Lee, F. von Drigalski, Y. Ijiri, in 2021 IEEE Int. Conf. on Robotics and Automation (ICRA), IEEE Press, Xi'an, China 2021, pp. 4627–4633.
[189] R. Newbury, M. Gu, L. Chumbley, A. Mousavian, C. Eppner, J. Leitner, J. Bohg, A. Morales, T. Asfour, D. Kragic, D. Fox, A. Cosgun, IEEE Trans. Robot. 2023, 39, 3994.
[190] Z. Xie, X. Liang, C. Roberto, Front. Robot. AI 2023, 10, https://doi.org/10.3389/frobt.2023.1038658.
[191] H. Sekkat, O. Moutik, L. Ourabah, B. Elkari, Y. Chaibi, T. A. Tchakoucht, Stat. Optim. Inf. Comput. 2023, 12, 571.
[192] L. Zhao, H. Liu, F. Li, X. Ding, Y. Sun, F. Sun, J. Shan, Q. Ye, L. Li, B. Fang, London 2023, pp. 5887–5893.
[193] F. Liu, F. Sun, B. Fang, X. Li, S. Sun, H. Liu, IEEE Trans. Robot. 2023, 39, 2379.
[194] J. Dai, M. Zhu, Y. Feng, in 2021 27th Int. Conf. on Mechatronics and Machine Vision in Practice (M2VIP), IEEE, New York 2021.
[195] D. Bianchi, M. G. Antonelli, C. Laschi, A. M. Sabatini, E. Falotico, IEEE Robot. Autom. Mag. 2023, 2, https://doi.org/10.1109/MRA.2023.3310865.
[196] Q. Ren, W. Zhu, J. Cao, W. Liang, IEEE Trans. Cognit. Dev. Syst. 2024, 16, 606.
[197] X. Liu, R. Gasoto, C. Onal, J. Fu 2020, https://doi.org/10.48550/arXiv.2001.04059.
[198] X. Liu, C. D. Onal, J. Fu, IEEE Trans. Robot. 2023, 39, 3382.
[199] S. K. Rajendran, F. Zhang, Front. Robot. AI 2022, 8, 809427.
[200] G. Li, J. Shintake, M. Hayashibe, Front. Robot. AI 2023, 10, https://doi.org/10.3389/frobt.2023.1102854.
[201] S. Min, J. Won, S. Lee, J. Park, J. Lee, ACM Trans. Graph. 2019, 38, 208.
[202] Q. Wang, Z. Hong, Y. Zhong, Biomimetic Intell. Robot. 2022, 2, 100066.
[203] Q. Ji, S. Fu, K. Tan, S. Thorapalli Muralidharan, K. Lagrelius, D. Danelia, G. Andrikopoulos, X. V. Wang, L. Wang, L. Feng, Robot. Comput. Integr. Manuf. 2022, 78, 102382.
[204] G. Li, J. Shintake, M. Hayashibe, in 2021 IEEE Int. Conf. on Robotics and Automation (ICRA 2021), IEEE, New York 2021, pp. 12033–12039.
[205] Q. Wu, Y. Wu, X. Yang, B. Zhang, J. Wang, S. A. Chepinskiy, A. A. Zhilenkov, Front. Robot. AI 2022, 9, 815435.
[206] L.-Z. Guo, Z.-Y. Zhang, Y. Jiang, Y.-F. Li, Z.-H. Zhou, in Proc. of the 37th Int. Conf. on Machine Learning, PMLR, Online Event, July 2020, pp. 3897–3906.
[207] L. Schmarje, M. Santarossa, S.-M. Schröder, R. Koch, IEEE Access 2021, 9, 82146.
[208] Y. Chen, M. Mancini, X. Zhu, Z. Akata 2022, https://doi.org/10.48550/arXiv.2208.11296.
[209] S.-C. Huang, A. Pareek, M. Jensen, M. P. Lungren, S. Yeung, A. S. Chaudhari, npj Digit. Med. 2023, 6, 1.
[210] A. Spielberg, A. Zhao, T. Du, Y. Hu, D. Rus, W. Matusik, in Advances in Neural Information Processing Systems 32 (NIPS 2019) (Eds: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alche-Buc, E. Fox, R. Garnett), Neural Information Processing Systems (NIPS), La Jolla 2019.
[211] D. Kim, M. Kim, J. Kwon, Y.-L. Park, S. Jo, IEEE Robot. Autom. Lett. 2019, 4, 2501.
[212] H. Pandey, D. Windridge 2018.
[213] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, The MIT Press, Cambridge, MA 2016.
[214] S. Zou, Y. Lyu, J. Qi, G. Ma, Y. Guo, Sens. Actuator, A 2022, 344, 113692.
[215] M. Hamaya, F. von Drigalski, T. Matsubara, K. Tanaka, R. Lee, C. Nakashima, Y. Shibata, Y. Ijiri, in 2020 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), IEEE, New York 2020, pp. 8309–8315.
[216] S. Zaidi, M. Maselli, C. Laschi, M. Cianchetti, Curr. Robot. Rep. 2021, 2, 355.
[217] B. Shih, D. Shah, J. Li, T. G. Thuruthel, Y.-L. Park, F. Iida, Z. Bao, R. Kramer-Bottiglio, M. T. Tolley, Sci. Robot. 2020, 5, eaaz9239.
[218] X. Liu, X. Han, N. Guo, F. Wan, C. Song, Biomimetics 2023, 8, 501.
[219] Y. Liu, H. He, T. Han, X. Zhang, M. Liu, J. Tian, Y. Zhang, J. Wang, X. Gao, T. Zhong, Y. Pan, S. Xu, Z. Wu, Z. Liu, X. Zhang, S. Zhang, X. Hu, T. Zhang, N. Qiang, T. Liu, B. Ge 2024, https://doi.org/10.48550/arXiv.2401.02038.
[220] S. Sapai, J. Y. Loo, Z. Y. Ding, C. P. Tan, V. M. Baskaran, S. G. Nurzaman, Soft Robot. 2023, 10, 1224.
[221] H. Wang, M. Totaro, L. Beccai, Adv. Sci. 2018, 5, 1800541.
[222] Md. B. Hossain, N. Gong, M. Shaban, in 2023 IEEE Int. Conf. on Artificial Intelligence, Blockchain, and Internet of Things (AIBThings), IEEE, Piscataway, NJ 2023, pp. 1–6.
[223] B. Leblanc, P. Germain 2024, https://doi.org/10.48550/arXiv.2311.11491.
[224] S. Das, V. Prado da Fonseca, A. Soares, Front. Robot. AI 2024, 11, https://doi.org/10.3389/frobt.2024.1281060.
[225] J. Schneider, M. Vlachos 2023, https://doi.org/10.48550/arXiv.2302.00722.
[226] S. Sapai, J. Y. Loo, Z. Y. Ding, C. P. Tan, R. C.-W. Phan, V. M. Baskaran, S. G. Nurzaman, London 2023, pp. 552–559, https://doi.org/10.48550/arXiv.2303.01693.
[227] J. Wang, Z. Wu, Y. Li, H. Jiang, P. Shu, E. Shi, H. Hu, C. Ma, Y. Liu, X. Wang, Y. Yao, X. Liu, H. Zhao, Z. Liu, H. Dai, L. Zhao, B. Ge, X. Li, T. Liu, S. Zhang 2024, https://doi.org/10.48550/arXiv.2401.04334.
[228] J. Pinskier, D. Howard, Adv. Intell. Syst. 2022, 4, 2100086.
[229] W. K. Chan, P. Wang, R. C.-H. Yeow 2024, https://doi.org/10.48550/arXiv.2405.01824.
Tomáš Čakurda obtained his doctoral degree in 2023 in the field of Mechanical Engineering at the Faculty of Manufacturing Technologies with a seat in Prešov of the Technical University of Košice. He is currently working at the respective faculty as an Assistant Professor. His research focuses on the identification of behavior and modeling of soft devices, such as soft continuum arms, soft actuators, soft sensors, and artificial muscles, based on machine learning methods, deep learning methods, and computational intelligence methods in the context of human-robot collaboration.
Monika Trojanová finished doctoral studies at the Faculty of Manufacturing Technologies with a seat in
Prešov, Technical University of Košice, Slovakia (2019). Following her studies, she continued her
academic career at the faculty and currently serves as an assistant professor. As part of her research
activities, she has been involved in several projects focused on the research and development of soft
actuators, soft robotic arms, and manipulators. Currently, her research and development efforts are
primarily concentrated on identifying and modeling soft actuators and applying artificial intelligence
techniques within the context of Industry 4.0.
Pavlo Pomin received his M.Sc. from the Technical University of Košice in 2023. He is currently a PhD
candidate at the Faculty of Manufacturing Technologies with a seat in Prešov, Technical University of
Košice. His research area includes deep learning methods, computational intelligence, soft continuum
arms, fluidic actuators, and system identification.
Alexander Hošovský completed his M.Sc. in avionics engineering at the Air Force Academy of M. R. Štefánik in Košice in 2004 and his Ph.D. at the Technical University of Košice. He is currently an
associate professor at the Faculty of Manufacturing Technologies with a seat in Prešov, Technical
University of Košice. His research interests include soft robotics and soft actuators, artificial muscles,
deep learning, artificial intelligence/machine learning methods in general, system identification,
bioinspired computation, and optimization.