Human-Robot Interaction Based On Motion and Force Control
Martin Karlsson
Acknowledgments
I would like to thank my supervisors Prof. Rolf Johansson and Prof. Anders
Robertsson, for invaluable advice and support throughout this work. You
have given me insights into all aspects of robot research, including theory,
implementation, and presentation of results. Thanks to your ability to give
me good advice in any situation, while also giving me the freedom to define
my work, I have looked forward to every next day of this journey.
Prof. Bo Bernhardsson, I am happy that I took your advice to work at
the department, both times. First as a teaching assistant, and then as a
PhD student. I would also like to thank you and Prof. Fredrik Tufvesson
again for the guidance as my Master’s thesis supervisors many years ago.
Dr. Fredrik Bagge Carlson, it has been great to have you as a close
colleague. It has been fun to improve robots together with you. We have a
common interest not only in our work, but also in solid-state physics and
especially defects in semiconductors, and I look forward to reviewing this
area regularly together with you in the future.
Dr. Björn Olofsson, thank you for being a great role model and for
generously sharing insights about robotics in both theory and practice. Your
morale, good mood, and unquestionable competence set a great example.
Prof. Charlotta Johnsson, thank you for bringing me into the robotic
surgery project Surgeon’s Perspective. I really enjoy working with you and
our project colleagues, and I highly appreciate your guidance on where to
direct our work. I would like to acknowledge M.D. Kiet Phan Tran for
introducing me to cardiac surgery, for sharing your knowledge extremely
generously, and for the interesting discussions we have about the
intersection of robotics and surgery.
I would like to acknowledge my colleagues and former colleagues in the
Robotics Lab, Dr. Mahdi Ghazaei Ardakani, Dr. Anders Nilsson, Anders
Blomdell, Pontus Andersson, Asst. Prof. Mathias Haage, Prof. Jacek Malec,
Assoc. Prof. Elin Anna Topp, Assoc. Prof. Klas Nilsson, Dr. Maj Stenmark,
Dr. Magnus Linderoth, Dr. Olof Sörnmo, Martin Holmstrand, Matthias
Mayr, Prof. Volker Krüger, Julian Salt, Alexander Dürr, Dr. Karl Berntorp,
and Dr. Andreas Stolt. Thank you for great cooperation.
Dr. Mårten Wadenbäck, thank you for introducing me to the exciting
field of visual odometry, and for being a great collaborator and friend. It
was worth all the difficulties with network connections, camera settings,
and even two minor electric shocks.
Throughout the years I have had the honor and pleasure of sharing an office
with many great colleagues. Thank you Olof Troeng, Dr. Björn Olofsson, Nils
Vreman, Martin Morin, Victor Millnert, Dr. Ola Johnsson, Lic. Tech. Jonas
Dürango, Irene Zorzan, Dr. Meike Rönn, Dr. Magnus Linderoth, Dr. Gabriel
Turesson, and Lic. Tech. Alina Andersson, for creating such a nice working
climate!
Prof. Karl Johan Åström, you are gratefully acknowledged for fruitful
discussions about this thesis, and about automatic control in general. You
are a great source of knowledge and inspiration at our department.
I am grateful to the administrative staff at the department, Ingrid Nilsson,
Mika Nishimura, Monika Rasmusson, Cecilia Edelborg Christensen,
and Dr. Eva Westin, for your help with various things related to work. You
are always so positive, well organized, and helpful. A special thanks to
Ingrid and Monika, for always being supportive and understanding when I
did my best with project-time reporting.
Leif Andersson, thank you for helping me use LaTeX to its full potential.
Without your aid, this thesis, as well as my other publications, would have
looked much worse. You have helped me to position figures correctly, number
equations and subequations, and to remove unnecessary LaTeX code that I
naively copied from the Internet. On a related note, I thank Dr. Björn
Olofsson, Axel Karlsson, Dr. Fredrik Bagge Carlson, Julian Salt, Matthias
Mayr, and Dr. Mahdi Ghazaei Ardakani for proofreading this thesis.
I also thank all colleagues at the department. Each one creates a positive,
inspiring, and fun working environment, and contributes to interesting
discussions about control theory and related topics.
Last but not least, Daniel, Ann-Eli, Axel, and Johan Karlsson have
always been a fantastic family. Heartfelt thanks for all the love and support!
Financial Support
Financial support is gratefully acknowledged from the VINNOVA project
Kirurgens Perspektiv (Surgeon’s Perspective in English). Financial support
is also gratefully acknowledged from the European Commission, under the
Framework Programme Horizon 2020 within grant agreement No
644938–SARAFun, and under the 7th Framework Programme within grant
agreement No 606156–FlexiFab, as well as from the Swedish Foundation
for Strategic Research through the SSF project ENGROSS. The author is a
member of the LCCC Linnaeus Center, supported by the Swedish Research
Council, and the ELLIIT Excellence Center, supported by the Swedish
Government.
Contents
1. Introduction
1.1 Background and Motivation
1.2 Robot Programming and Control
1.3 Introduction to Machine Learning
1.4 Problem Formulation
1.5 Contributions
1.6 Publications
1.7 Thesis Outline
1.8 Videos
2. Robots and Interfaces for Experimental Validation
2.1 YuMi Prototype Version
2.2 YuMi Product Version
3. Dynamical Movement Primitives—An Overview
3.1 Related Research
3.2 Parameter Choices
3.3 Comparison with Alternative Movement Representations
4. Modification of Robot Movement Using Lead-Through Programming
4.1 Introduction
4.2 Motivating Examples
4.3 Problem Formulation
4.4 Method
4.5 Experiments
4.6 Results
4.7 Discussion
4.8 Conclusion
5. Temporally Coupled Dynamical Movement Primitives in Rn
5.1 Introduction
5.2 Previous Research on Temporally Coupled DMPs for Perturbation Recovery
1
Introduction
1.2 Robot Programming and Control
1.3 Introduction to Machine Learning
scenarios this strategy works well, but most of the tasks that humans
accomplish in their everyday life are far too complex to mediate in such
fashion. Whereas distinguishing, for instance, whether a certain image
represents a car or a bicycle is typically easy for humans, hand-crafting the
classification rules from raw image data is not feasible. The introduction
in [Goodfellow et al., 2016] provides interesting examples of tasks that are
easy for humans to perform, but difficult to describe formally.
Machine learning is the most promising framework to address such problems.
This framework consists of three major parts: supervised learning,
unsupervised learning, and reinforcement learning [Bishop, 2007; Murphy,
2012; Goodfellow et al., 2016], all of which are relevant in robotics. The
main idea is to provide computers with example data, and use these to
define models, instead of specifying the models manually. In this thesis, we
mainly consider supervised learning. Whereas traditional automatic control
is useful for achieving specific robot movements, machine learning can po-
tentially allow for programming on higher levels of abstraction. For instance,
if natural language or image data could be interpreted in a meaningful way,
these could potentially be used as a decision basis by a robot. In machine
learning, such data are typically analyzed using artificial neural networks
[Kröse and Smagt, 1993; Goodfellow et al., 2016], which were inspired by
the connected neurons in biological brains.
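For illustration (a sketch, not code from the thesis), a minimal feedforward network maps an input x to class probabilities through a hidden layer of artificial neurons; the layer sizes, the random weights, and the car/bicycle interpretation below are placeholders:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)          # elementwise nonlinearity

def forward(x, W1, b1, W2, b2):
    """Two-layer feedforward network: a hidden ReLU layer, then a
    softmax output interpreted as probabilities over two categories
    (e.g., car vs. bicycle)."""
    h = relu(W1 @ x + b1)              # hidden "neurons"
    z = W2 @ h + b2                    # output scores
    e = np.exp(z - np.max(z))          # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.standard_normal(4)             # stand-in for pixel features
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
W2, b2 = rng.standard_normal((2, 8)), np.zeros(2)
p = forward(x, W1, b1, W2, b2)         # probabilities over two categories
```

In practice the weights are not random but fitted to labeled examples, as discussed in the following paragraphs.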
Iterative learning control is closely related to reinforcement learning,
and has been used for motion control [Norrlöf, 2002; Norrlöf and Gunnars-
son, 2002] and force control [Marchal et al., 2014] of industrial robots. The
main idea is to iteratively update the control signal based on previous trials,
to compensate for repetitive sources of error. System identification [Ljung,
1987; Johansson, 1993; Lindsten et al., 2013; Schön et al., 2015] is closely
related to machine learning, and specializes in finding dynamical models
from time series of input and output signals.
Supervised machine learning is used to approximate a given function,
y(x), with a parameterized function, ŷ(x; θ), which takes some input data x
and maps it to an output ŷ. Here, θ denotes the model parameters. In the
example of image recognition, x could be pixel values, y would be the true
image category, and ŷ( x) could be interpreted as the probability distribution
over the two categories, i.e., car and bicycle, given x.
In order to learn ŷ( x), the model is exposed to a large data set of
examples, called training data. In supervised learning, the training data
consist of both input data and the corresponding known outputs, usually
manually labeled. In the training phase, the elements of θ are adjusted
to fit the training data by means of optimization. A loss function, L,
which includes some measure of the error of ŷ(x) relative to y(x), is
minimized with respect to the model parameters.
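As a minimal (hypothetical) illustration of this procedure, the sketch below fits the two parameters of a linear model ŷ(x; θ) = θ₀ + θ₁x by gradient descent on a mean-squared-error loss. Real applications use far richer models, but the training loop has the same shape:

```python
import numpy as np

def fit_linear_model(x, y, steps=2000, lr=0.1):
    """Fit y_hat(x; theta) = theta[0] + theta[1]*x by minimizing the
    loss L(theta) = mean((y_hat - y)^2) with gradient descent."""
    theta = np.zeros(2)
    for _ in range(steps):
        y_hat = theta[0] + theta[1] * x           # model prediction
        grad0 = 2.0 * np.mean(y_hat - y)          # dL/dtheta0
        grad1 = 2.0 * np.mean((y_hat - y) * x)    # dL/dtheta1
        theta -= lr * np.array([grad0, grad1])    # gradient step
    return theta

# Training data with known labels, generated from y = 1 + 2x
x_train = np.linspace(0.0, 1.0, 20)
y_train = 1.0 + 2.0 * x_train
theta = fit_linear_model(x_train, y_train)        # close to [1, 2]
```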
Since the training data can only include a small subset of all possible
1.5 Contributions
The main contributions of this thesis are:
1.6 Publications
In this section, publications authored or co-authored by the thesis author are
presented. The published manuscripts have been subjected to peer review.
For each manuscript, A. Robertsson and R. Johansson have contributed with
supervision, including insights regarding previous work and our ongoing
research, as well as suggestions for improvement. First, the publications on
which this thesis is based are described.
Parts of the research presented in this PhD thesis have previously been
published in the Licentiate thesis above.
1.7 Thesis Outline
1.8 Videos
Videos that show each of the functionalities developed are available in the
entry for this thesis in the Lund University Research Portal:
http://portal.research.lu.se/portal/
2
Robots and Interfaces for Experimental Validation
Most of the experiments presented in this thesis have been done in the
Robotics Lab shared by the departments of Automatic Control and Computer
Science at Lund University, and some have been done at ABB Corporate
Research in Västerås, Sweden. Different versions of the ABB YuMi
robot have been used. YuMi is a dual-arm collaborative robot [Kock et al.,
2011; ABB Robotics, 2018]. Each arm has seven joints and is therefore
kinematically redundant. It is designed for safe human–robot interaction
[Matthias et al., 2011], with low mass, weak motors, smooth surfaces, and
speed limitations.
Each joint is controlled by the ABB IRC5 control system [ABB Robotics,
2019a], running at 2 kHz. The control system consists of a current-control
loop to achieve the desired joint torque, and outer control loops to achieve
the desired joint position and velocity; see, e.g., Fig. 2.4 in [Stolt, 2015]
for a block diagram of the IRC5 system. In this thesis, we will refer to the
IRC5 system as the internal robot controller. The algorithms developed in
this thesis were implemented on ordinary PCs, and communicated with the
internal robot controller through the research interfaces described in the
following sections.
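As a schematic illustration of such a cascade (a sketch under simplifying assumptions: a unit-inertia joint, hand-picked gains, and an inner loop that realizes the commanded torque exactly; this is not the actual IRC5 design):

```python
def simulate_joint(q_ref, kp=100.0, kv=20.0, dt=0.0005, steps=4000):
    """Single-joint cascade: outer position/velocity loops compute a torque
    reference, which the inner (current) loop is assumed to realize exactly.
    dt = 0.5 ms mimics a 2 kHz control rate; the joint is a unit inertia."""
    q, qd = 0.0, 0.0                       # joint position and velocity
    for _ in range(steps):
        tau = kp * (q_ref - q) - kv * qd   # outer loops: PD torque reference
        qdd = tau                          # unit inertia: acceleration = torque
        qd += qdd * dt                     # integrate one control period
        q += qd * dt
    return q

q_final = simulate_joint(1.0)              # step to 1 rad, simulate 2 s
```

With kv = 2*sqrt(kp), the illustrative loop is critically damped and settles at the reference without overshoot.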
YuMi is backdrivable, which means that it is possible to move it through
physical contact without being prevented by high friction, large mass, and
high gear ratio between the arm and motor sides. This is a common property
of lightweight robots, whereas larger traditional industrial robots are
typically not backdrivable. Sensorless lead-through programming is easier
to achieve for robots with such backdrivability. In [Stolt et al., 2015a], it was
implemented on the YuMi prototype described in Sec. 2.1. A version of that
implementation is included in the YuMi product (see Sec. 2.2) by default.
Figure 2.1 The prototype version of YuMi [ABB Robotics, 2018] used in
the experiments. It was located in the Robotics Lab at Lund University.
2.2 YuMi Product Version
Figure 2.2 The product version of YuMi mainly used in the experiments.
It was located in the Robotics Lab at Lund University.
Figure 2.3 The product version of YuMi located at ABB Corporate Research
in Västerås, used for some of the experiments in Chapter 7 and for some
separate demonstrations in SARAFun.
Figure 2.4 The product version of YuMi located at ABB Corporate Research
in Västerås, used as the main demonstration platform in SARAFun.
3
Dynamical Movement Primitives—An Overview
Notation         Description
n ∈ Z+           Dimension of robot configuration
y ∈ Rn           Robot configuration
ÿr ∈ Rn          Robot acceleration reference
τ ∈ R+           Time parameter
g ∈ Rn           Goal state
x ∈ R+           Phase variable
f(x) ∈ Rn        Learnable virtual forcing term
α, β, αx ∈ R+    Positive constants
Nb ∈ Z+          Number of basis functions
Ψj(x) ∈ Rn       The j:th basis function vector
wj ∈ Rn          The j:th weight vector
while retaining the notation from [Ijspeert et al., 2013] where applicable.
The system consists of a proportional-derivative (PD) controller, together
with feedforward by f ( x). The dynamics are achieved by sending ÿr as a
reference acceleration to internal motion-control loops. It is assumed that
the reference acceleration can be realized during free-space motion, so that
ÿ = ÿr . This is reasonable for an acceleration of moderate magnitude and
time derivative. Depending on the level of control, ÿr can also be seen as a
control signal. The control structure is visualized in Fig. 3.1.
Consider for now the simplified case where f(x) = 0, so that

τ²ÿr = α(β(g − y) − τẏ)

[Figure 3.1: block diagram of the control structure; the DMP provides the
reference acceleration ÿr to the robot, which returns the configuration y.]
ftarget = τ²ÿdemo − α(β(g − ydemo) − τẏdemo)        (3.8)

evaluated at each of the Ns samples of the demonstrated trajectory,
yielding ftarget^1, . . . , ftarget^Ns.
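To make the relations above concrete, the following one-dimensional sketch (not code from the thesis) inverts a demonstrated trajectory to obtain the target forcing term, and then integrates the DMP to reproduce the demonstration. It assumes the acceleration form τ²ÿ = α(β(g − y) − τẏ) + f(x) from [Ijspeert et al., 2013]; for brevity, the forcing term is stored per time sample instead of being encoded with the basis functions Ψj and weights wj:

```python
import numpy as np

alpha, beta, tau = 25.0, 25.0 / 4.0, 1.0    # illustrative DMP constants

def learn_f_target(y_demo, dt):
    """Invert the DMP dynamics over a demonstration, cf. (3.8):
    f_target = tau^2*ydd_demo - alpha*(beta*(g - y_demo) - tau*yd_demo)."""
    yd = np.gradient(y_demo, dt)
    ydd = np.gradient(yd, dt)
    g = y_demo[-1]                           # goal: last demonstrated sample
    return tau**2 * ydd - alpha * (beta * (g - y_demo) - tau * yd), g

def rollout(f_target, g, y0, dt):
    """Integrate tau^2*ydd = alpha*(beta*(g - y) - tau*yd) + f."""
    y, yd = y0, 0.0
    out = []
    for f in f_target:
        ydd = (alpha * (beta * (g - y) - tau * yd) + f) / tau**2
        yd += ydd * dt                       # semi-implicit Euler step
        y += yd * dt
        out.append(y)
    return np.array(out)

dt = 0.01
t = np.arange(0.0, 1.0, dt)
y_demo = np.sin(np.pi * t / 2) ** 2          # smooth 0 -> 1 demonstration
f_tgt, g = learn_f_target(y_demo, dt)
y_repro = rollout(f_tgt, g, y_demo[0], dt)   # closely tracks y_demo
```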
3.2 Parameter Choices
4
Modification of Robot Movement Using Lead-Through Programming
4.1 Introduction
Similar to most software products, robot programs need to be updated from
time to time. A change of product to be manufactured, a desire to shorten
cycle time, previous programming mistakes, or a changed work-cell
configuration are typical reasons for this. It might also be desirable to migrate
functionality between different robot stations, or to be able to download
preliminary programs or general templates put together elsewhere and
subsequently make final adjustments for the intended robot cell. To make
it worthwhile to reuse existing but imperfect robot programs rather than
programming from the beginning, easy means of modification are necessary.
This chapter presents a framework that allows for a robot operator to
adjust robot movements in an intuitive way. Given a faulty trajectory, the
operator can use lead-through programming [Stolt et al., 2015a] to
demonstrate a corrective one. Based on this, a modified trajectory representation
is determined automatically. DMPs, described in [Ijspeert et al., 2013] and
Chapter 3, are used to represent and generate the movement, though the
presented framework does not require that the faulty trajectory be generated
by a DMP.
In the process of program updating, it is natural that previously defined
movements need modification. In the case where a starting point has
been updated without further consideration, which in principle is supported
by the DMP framework [Ijspeert et al., 2013], a DMP-generated trajectory
would still converge to the goal configuration, given that no physical
limitations prevent this, but it would necessarily follow a different path.
The path modification would not be specified by the operator by default,
and it would be larger for a larger difference between the original and the
updated starting points. Similarly, if the complete trajectory is of interest
it is not enough to modify the goal state. One way to solve the problem
would be to record an entirely new trajectory, and then construct a
corresponding DMP. However, this would be unnecessarily time-consuming
for the operator, if only some part of the trajectory had to be modified.
In this chapter, we consider the unfavorable case where the last part of a
robot trajectory is unsatisfactory. This is a realistic scenario, because many
manipulation movements start by approaching a work object with large
position tolerances, and end with physical contact and small position
tolerances. Section 4.2 presents two such scenarios, in the context of robotic
assembly.
A standard approach has been to use demonstrated configuration
trajectories to determine preliminary DMPs. However, simply repeating these
trajectories does not guarantee success. Further refinements and ability to
adapt based on sensor data are typically necessary. Adaptation based on
force measurements to follow force profiles in robotic assembly was consid-
ered in [Abu-Dakka et al., 2015]. Further, initial demonstrations have been
followed by trajectory-based reinforcement learning for many different tasks.
In [Pastor et al., 2013], a simulated legged robot was considered. A DMP
that enabled the robot to make a small jump was first manually defined.
After applying reinforcement learning, the robot could make a longer jump
over a gap. The same principle was applied in [Kober et al., 2008], where
a simulated robot arm was taught to catch a ball in a cup. In [Kroemer
et al., 2010], reinforcement learning and imitation learning were combined
to achieve object grasping in an unstructured environment. Compared to
such refinements, the modification presented here is less time consuming
and does not require engineering work. On the other hand, the previous
work offers modulation based on sensor data and finer movement adjustment.
Therefore, the framework presented in this chapter can be used as an
intermediate step where, if necessary, a DMP can be modified to prepare
for further improvement; see Fig. 4.1. This modification can be used within
a wide range of tasks. In this chapter, we exemplify with peg-in-hole tasks.
In [Pastor et al., 2009; Ijspeert et al., 2013; Pastor et al., 2013], avoidance
of spherical obstacles was incorporated in the DMP framework by
modeling repulsive forces from the obstacles. In [Stavridis et al., 2017], this
research was extended to avoidance of obstacles contained in ellipsoids.
These approaches have been verified for several scenarios, but require an
infrastructure for obstacle detection, as well as some coupling-term
parameters to be defined. In practice they preserve convergence to the goal
configuration in uncomplicated settings, but since the path to get there is
modified by the obstacle avoidance, it is not guaranteed to follow any
specific trajectory to the goal. This is often significant, for instance in assembly
tasks. Further, for more complex environments that are realistic in robotic
manipulation, modeling repulsive forces from obstacles would imply a risk
of getting stuck, thereby jeopardizing task completion. Instead, the method
described here allows for the operator to lead the manipulator backwards,
approximately along the part of the deficient path that should be adjusted,
followed by a desired trajectory, as visualized in Fig. 4.2.

[Figure 4.1: work flow of the application. An initial demonstration is used
to create a DMP, which is then evaluated. If the evaluation is unsuccessful,
a corrective demonstration is used to modify the DMP; after a successful
evaluation, the DMP can be improved further.]

4.2 Motivating Examples
Setup 4.1 Consider the setup shown in Fig. 4.3, where a button should
be placed into a yellow case. A DMP was run for this purpose, but due to
any of the reasons described above the movement was not precise enough,
and the robot got stuck on its way to the target. Hitherto, such a severe
shortcoming would have motivated the operator to teach a completely new
Figure 4.2 Trajectories of the robot’s end-effector from one of the
experiments. The arrow indicates the motion direction. A deficient trajectory was
generated from an original DMP. After that, the operator demonstrated a
corrective trajectory. Merging of these resulted in a modified trajectory. The
projection on the horizontal plane is only to facilitate the visualization.
DMP, and erase the old one. With the method proposed in this chapter, the
operator had the opportunity to approve the first part of the trajectory, and
only modify the last part. This was done by leading the robot arm backwards,
approximately along the faulty path, until it reached the acceptable part.
Then, the operator continued to lead the arm along the desired path to the
goal. When this was done, the acceptable part of the first trajectory was
merged with the last part of the corrective trajectory. After that, a DMP
was fitted to the resulting trajectory. Compared to just updating the target
point, this approach also enabled the operator to determine the trajectory
leading there.
Setup 4.2 For the setup in Fig. 4.4, there existed a DMP for moving the
robot arm from the right, above the button that was already inserted, to a
position just above the hole in the leftmost yellow case. However, during the
evaluation the operator realized that there would have been a collision if a
button were already placed in the case in the middle. A likely reason for
this to happen would be that the DMP was created in a slightly different
scene, where the potential obstacle was not taken into account. Further, the
operator desired to extend the movement to complete the peg-in-hole task,
rather than stopping above the hole. With the method described herein, the
action of the operator would be similar to that in Setup 4.1, again saving
work compared to previous methods.
4.3 Problem Formulation
Figure 4.3 Visualization of Setup 4.1. The evaluation started in (a), and
in (b) the robot failed to place the button in the hole because of inadequate
accuracy. Between (a) and (b), the deficient trajectory was recorded. The
operator led the robot arm backwards (c), approximately along a portion of
the deficient trajectory, and subsequently led it to place the button properly,
while the corrective trajectory was recorded. The robot then made the entire
motion, starting in a configuration similar to that in (a), and ending as
displayed in (d). Results from this setup are shown in Figs. 4.7 and 4.8.
Figure 4.4 Visualization of Setup 4.2. The initial goal was to move the
button to the leftmost yellow case, above the hole, to prepare for placement.
The evaluation started in (a), and in (b) the trajectory was satisfactory as the
placed button was avoided. In (c), however, there would have been a collision
if there were a button placed in the middle case. Further, it was desired to
complete the peg-in-hole task, rather than stopping above the hole. Hence,
the evaluated trajectory was considered deficient. In (d), the operator led
the robot arm back, and then in a motion above the potential obstacle, and
into the hole, forming the corrective trajectory. Based on the modified DMP,
the robot started in a position similar to that in (a), avoided the potential
obstacle in (e) and reached the new target in (f). The trajectories from one
attempt are shown in Fig. 4.9.
Notation         Description
n ∈ Z+           Dimension of robot configuration
y ∈ Rn           Robot configuration
yd ∈ Rn          Deficient trajectory
yc ∈ Rn          Corrective trajectory
ydr ∈ Rn         Retained part of deficient trajectory
ym ∈ Rn          Modified version of ydr
yk ∈ Rn          Robot configuration at time sample k
τ ∈ R+           Time parameter
g ∈ Rn           Goal state
Nb ∈ Z+          Number of basis functions
wj ∈ Rn          The j:th weight vector
4.4 Method
In this section a method to determine which part of a deficient trajectory to
retain is presented, followed by a description of how it should be merged with
a corrective trajectory to avoid discontinuities. Finally, some implementation
aspects are addressed. Figure 4.1 displays a schematic overview of the work
flow of the application, from the user’s perspective. Table 4.1 lists some of
the notation used in this chapter.
[Figure 4.5: deficient trajectory, lead-through transportation, and corrective
trajectory in the y1–y3 plane; the distance d∗ between the corrective
trajectory and the deficient one is indicated.]
ydr^m = yd^m ,  ∀m ∈ {1, 2, . . . , M}        (4.1)

where
Figure 4.6 Same trajectories as in Fig. 4.2, but zoomed in on the corrective
trajectory. Arrows indicate directions. The parts of the trajectories between
the separation markers were not retained. The right, blue, separation point
was determined explicitly by the operator during the corrective
demonstration. The left, green, separation point was determined according to (4.2).
Further, what was left of the deficient trajectory was modified for a smooth
transition. However, the part of the corrective trajectory retained was not
modified, since it was desired to closely follow this part of the demonstration.
Note that the trajectories retained were not intended for direct play-back
execution. Instead, they were used to form a modified DMP, which in turn
generated a resulting trajectory, as shown in Figs. 4.7 to 4.9.
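A rough sketch of the separation and merging illustrated in Figs. 4.5 and 4.6 follows. Note that the nearest-point rule for the automatic separation point and the linear blend are illustrative stand-ins for the chapter's actual criterion (4.2) and optimization-based smoothing (4.3), not the method itself:

```python
import numpy as np

def merge_trajectories(y_d, y_c, blend=20):
    """Retain the deficient trajectory y_d up to the sample nearest the
    start of the corrective trajectory y_c (an assumed stand-in for the
    distance criterion d*), then append y_c. The retained tail is blended
    linearly toward y_c[0] so the merged trajectory has no discontinuity
    (a stand-in for the optimization-based smoothing)."""
    d = np.linalg.norm(y_d - y_c[0], axis=1)   # distance to start of y_c
    m_star = int(np.argmin(d))                 # automatic separation index
    y_dr = y_d[: m_star + 1].copy()            # retained part of y_d
    n = min(blend, len(y_dr))
    w = np.linspace(0.0, 1.0, n)[:, None]      # 0 -> 1 blending weights
    y_dr[-n:] = (1.0 - w) * y_dr[-n:] + w * y_c[0]
    return np.vstack([y_dr, y_c])

# Illustrative 2-D example: the deficient path overshoots along y1; the
# corrective demonstration starts near its midpoint.
t = np.linspace(0.0, 1.0, 100)[:, None]
y_d = np.hstack([t, 0.0 * t])
y_c = np.hstack([0.5 + 0.5 * t, 0.1 * t])
y_m = merge_trajectories(y_d, y_c)
```

In the chapter's framework, the merged trajectory is not played back directly; it is used to fit a modified DMP.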
2013] and Chapter 3. The next step in the work flow was to evaluate the
resulting DMP, as shown in Fig. 4.1.
Software Implementation
Most of the programming was done in C++, where DMPs were stored as
objects of a class. Among the data members of this class were the parameters
τ, g, and w1...Nb, as well as some description of the context of the DMP
and when it had been created. It contained member functions for displaying
the parameters, and for modifying g and τ. The C++ linear algebra
library Armadillo [Sanderson and Curtin, 2016] was used in a major part
of the implementation. Further, the code generator CVXGEN [Mattingley
and Boyd, 2012] was used to generate C code for solving the optimization
problem in (4.3). By default, the solver code was optimized with respect to
computation time. This resulted in a real-time application, in which the
computation times were negligible in teaching scenarios. The optimization
problem was typically solved in well below one millisecond on an ordinary PC.
The implementation developed in [Stolt et al., 2015a] was used to enable
lead-through programming. The operator could switch between different
high-level modes such as DMP execution, lead-through transportation, and
corrective movement by pressing buttons in a terminal user interface. During
lead-through programming, trajectory logging started when a velocity
significantly different from zero was achieved.
4.5 Experiments
The robot used in the experimental setup was a prototype of the dual-arm
ABB YuMi [ABB Robotics, 2018], described in Chapter 2. The computations
took place in joint space, and the robot’s forward kinematics were used for
visualization in Cartesian space in the figures presented in this chapter.
The setups in Sec. 4.2 were used to evaluate the proposed method. For each
trial, the following steps were taken:
• The DMP was used to generate a trajectory similar to the initial one.
This formed the deficient trajectory;
Figure 4.7 Trajectories from the experimental evaluation of Setup 4.1. The
deficient trajectory went past the goal in the negative y1-direction, preventing
the robot from lowering the button into the hole. After correction, the robot
was able to reach the target as the modified DMP generated the resulting
trajectory.
First, Setup 4.1 was set up for evaluation, see Fig. 4.3. The scenario
started with execution of a deficient trajectory. For each attempt, a new
deficient trajectory was created and modified. A total of 50 attempts were
made.
Similarly, Setup 4.2 (see Fig. 4.4) was set up, and again, a total of 50
attempts were made.
A video is available as a publication attachment to [Karlsson et al.,
2017b], to facilitate understanding of the experimental setup and results.
A version with higher resolution is available on [Karlsson, 2017d].
4.6 Results
For each attempt of Setup 4.1, the robot was able to place the button properly
in the yellow case after the modification. Results from two of these attempts
are shown in Figs. 4.7 and 4.8. In the first case, the deficient trajectory went
past the goal, whereas in the second case, it did not reach far enough.
Each of the attempts of Setup 4.2 was also successful. After modifica-
tion, the DMPs generated trajectories that moved the grasped stop button
Figure 4.8 Similar to Fig. 4.7, except that in this case, the deficient
trajectory did not reach far enough in the negative y1-direction.
above the height of potential obstacles, in this case other stop buttons, and
subsequently inserted it into the case. The result from one attempt is shown
in Fig. 4.9.
4.7 Discussion
Future work includes integration of the presented framework with
reinforcement learning [Kober et al., 2008; Kroemer et al., 2010; Pastor et al.,
2013], in order to optimize the motion locally with respect to criteria such as
execution time. The program should also be augmented to take the purpose
of, and relation between, different DMPs into consideration. This extension
will emphasize the necessity of keeping track of different states within the
work flow. For this purpose, a state machine implemented in, e.g., JGrafchart
[Theorin, 2014], or the framework of behavior trees, applied to an aerial
vehicle in [Ögren, 2012] and to robot control in [Marzinotto et al., 2014], would
be suitable.
Similar to movement adjustments based on sensor data, adjustments
by the operator during DMP execution could also be considered. Such a
step was taken in [Talignani Landi et al., 2018a], where [Karlsson et al.,
2017b] was extended significantly to enable adjustment by physical contact
during movement. The software in Sec. 4.4, first implemented in [Karlsson
et al., 2017b], was also used as a starting point for the implementation in
[Talignani Landi et al., 2018a]. A presentation of [Talignani Landi et al.,
2018a] in video format, including a demonstration of the functionality, is
available online.
[Figure 4.9 plot: y3 [mm] over the trajectory; curves: deficient trajectory,
lead-through transportation, corrective trajectory, resulting trajectory.]
Figure 4.9 Trajectories from experimental evaluation of Setup 4.2. For the
deficient trajectory, the end-effector was lowered too early, causing a potential
collision. After the correction, the robot was able to reach the target while
avoiding the obstacles. The movement was also extended to perform the
entire peg-in-hole task, rather than stopping above the hole.
4.8 Conclusion
In this chapter, an approach for modification of DMPs, using lead-through
programming, was presented. It allowed a robot operator to modify faulty
trajectories, instead of demonstrating new ones from the beginning. Based
on corrective demonstrations, modified DMPs were formed automatically. A
real-time application that did not require any additional engineering work
by the user was developed and verified experimentally. A video showing the
functionality is available online [Karlsson, 2017d].
5
Temporally Coupled Dynamical Movement Primitives in Rn
5.1 Introduction
In this chapter, we continue to extend the DMP framework to support move-
ment adjustment. Whereas Chapter 4 focused on corrective demonstrations
after faulty movements, this chapter aims to allow for replanning during
movement.
Consider the case where a robot has to deviate from its intended move-
ment. One could think of several motivations for such deviation, for instance
avoiding an obstacle, avoiding blocking camera vision, waiting for a work
object to be handed over from another robot, etc. In this thesis, we use
physical contact with a human to exemplify. A human grasps or pushes the
robot, thereby stopping its movement or pushing it away from its path.
Typical industrial robots would react to this in two possible ways. A large
robot could possibly continue its movement despite the physical contact,
because of its large mass and strength compared to the human. A light-
weight collaborative robot, which is mainly considered in this thesis, has
weaker motors and its movement would be significantly affected by the
human. The motors would not be counteracted by the human for long,
however, because the robot would typically stop if reference states were too
far from measured states, or if joint torques were unexpectedly high. The
robot motors would then deactivate, the brakes would activate, and the
robot would possibly provide the operator with an error message. This is
a safety feature, sometimes called motion supervision, with the purpose of
avoiding damage to the robot and its surroundings. Motion supervision runs
on larger robots as well, though these are less sensitive to external forces.
Triggering such an error in an industrial setting implies a production stop
and significant engineering work to reset and adjust hardware and software.
Hence, this strategy is not feasible in less predictable environments.
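The motion-supervision logic described above can be sketched as a simple check; the tolerances below are illustrative, not those of any particular controller:

```python
# Hedged sketch of a motion-supervision check of the kind described above:
# stop if the reference deviates too far from the measurement, or if a
# joint torque is unexpectedly high. Thresholds are illustrative.
def motion_supervision(ref, meas, torques, pos_tol=0.05, torque_max=40.0):
    deviation = max(abs(r - m) for r, m in zip(ref, meas))   # worst joint error
    overload = any(abs(t) > torque_max for t in torques)     # torque limit check
    return "STOP" if deviation > pos_tol or overload else "OK"

assert motion_supervision([0.1, 0.2], [0.1, 0.21], [5.0, 8.0]) == "OK"
assert motion_supervision([0.1, 0.2], [0.1, 0.30], [5.0, 8.0]) == "STOP"
```

A real implementation runs such a check every sample and engages the brakes on "STOP"; the point here is only the trigger condition, not the shutdown sequence.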
The original DMP formulation already provides some replanning capabilities;
see [Ijspeert et al., 2013] and Chapter 3. It is possible to update
the goal configuration during movement as utilized in [Prada et al., 2014],
and to avoid uncomplicated obstacles [Pastor et al., 2009; Ijspeert et al.,
2013; Pastor et al., 2013]. DMPs with compliant behavior were developed in
[Denisa et al., 2016; Batinica et al., 2017]. Further, movement can be gen-
erated from any configuration with guaranteed exponential convergence to
the goal [Perk and Slotine, 2006; Ijspeert et al., 2013; Wensing and Slotine,
2017]. Therefore, it was possible to physically push a robot to correct its
movement in [Talignani Landi et al., 2018a], knowing that it would reach
the goal after the perturbation.
However, in the original formulation, a DMP would continue its time
evolution despite any significant perturbations. The behavior of the robot
would then likely be undesirable and not intuitive, as discussed in [Ijspeert
et al., 2013]. A solution was also proposed in [Ijspeert et al., 2013]. Intu-
itively, the time evolution of the DMP was slowed down in case of deviation
from the intended trajectory. This is called temporal coupling, of which
phase stopping is one component. In this chapter, it is discovered that the
temporal coupling proposed in [Ijspeert et al., 2013] needs to be extended to
be realizable in practice, though the core concept in [Ijspeert et al., 2013] is
very promising and therefore forms a foundation for the method developed
in this chapter.
This chapter provides a practically realizable extension of the temporal
coupling in [Ijspeert et al., 2013], in the form of a two-degree-of-freedom
motion control system; see [Åström and Wittenmark, 2013] for a description
of the general architecture for two-degree-of-freedom control. The feedfor-
ward part of the controller promotes tracking of the DMP trajectory in the
absence of significant perturbations, thus mitigating unnecessarily slow tra-
jectory evolution due to temporal coupling acting on small tracking errors.
The feedback part suppresses significant errors. It is further shown theo-
retically that the control system is globally exponentially stable, which in
practice means that the goal configuration is guaranteed to be reached.
This chapter is organized as follows. Previous research is described
and evaluated in Sec. 5.2. The research problem addressed in this chap-
ter is specified in Sec. 5.3. The proposed control algorithm is presented
in Sec. 5.4. Thereafter, the proposed control algorithm is compared with
previous research through simulations in Sec. 5.5. These simulations also
support understanding of the key concepts of temporally coupled DMPs, and
this is why they are presented before the mathematical stability analysis
in Sec. 5.6. Further, experiments and results are presented in Secs. 5.7
and 5.8, a discussion is presented in Sec. 5.9, and concluding remarks are
finally presented in Sec. 5.10. Table 5.1 lists some of the notation used in
this chapter.

Table 5.1  Notation used in this chapter.

Notation               Description
n ∈ Z+                 Dimension of robot configuration
y ∈ Rn                 Physical robot configuration
ÿr ∈ Rn                Robot acceleration reference
τ ∈ R+                 Time parameter
g ∈ Rn                 Goal state
x ∈ R+                 Phase variable
f(x) ∈ Rn              Learnable virtual forcing term
α, β, αx ∈ R+          Positive constants
e ∈ Rn                 Low-pass filtered configuration error
yc ∈ Rn                Coupled robot configuration
τa ∈ R+                Adaptive time parameter
αe, kt, kc ∈ R+        Positive constants
z ∈ Rn                 DMP state
Kp, Kv ∈ R+            Large control gains
kp, kv ∈ R+            Moderate control gains
yu ∈ Rn                Unperturbed, uncoupled trajectory
Nb ∈ Z+                Number of basis functions
ξ ∈ R5n+1              DMP state vector
‖ξi‖ ∈ R+              2-norm of DMP state i
Successful perturbation recovery using the proposed method is visual-
ized in a video available online [Karlsson, 2016]. The video will be explained
in more detail in this chapter, but could already at this point support un-
derstanding of the functionality.
A code example is available in [Karlsson, 2017c], to allow for exploration
of the system proposed. The system was also integrated in the Julia DMP
package in [Bagge Carlson, 2016], originally based on [Ijspeert et al., 2013].
5.2 Previous Research on Temporally Coupled DMPs
cannot be realized, i.e., the robot configuration y cannot evolve according
to plan. In the original DMP formulation, the phase variable would evolve
independently of any perturbation, according to

τ ẋ = −αx x        (5.2)

This causes undesired behavior, since the actual trajectory y is then likely
to deviate significantly from the intended trajectory even after the cause
of the perturbation has vanished. This is described more thoroughly in
[Ijspeert et al., 2002; Ijspeert et al., 2013]. Consider, for instance, the
case where the robot is stopped during its movement until x has converged
to 0. This implies that f(x) = 0, and once released, the robot will move to
its goal without taking the parameters of f(x) into account. In such a case,
the value of the phase variable does not correspond to the progress of the
movement. To mitigate this problem, the solution described in the following
paragraph was suggested in [Ijspeert et al., 2013].
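The scenario above, where the phase runs down while the robot is held, can be reproduced numerically; the constants and the 5 s hold are chosen purely for illustration:

```python
# With the original phase dynamics (5.2), x evolves regardless of any
# perturbation. Holding the robot for 5 s (tau = 1, alpha_x = 1, values
# chosen for illustration) lets x decay almost to 0, so that f(x) has
# nearly vanished once the robot is released.
tau, alpha_x, dt = 1.0, 1.0, 1 / 250
x = 1.0
for _ in range(int(5.0 / dt)):
    x += dt * (-alpha_x * x / tau)   # forward Euler on (5.2)
assert x < 0.01                      # ~exp(-5): the phase ignored the hold
```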
A coupling term Ct and an adaptive time parameter τa were introduced.

ė = αe (y − yc − e)        (5.3a)
Ct = kt e                  (5.3b)
τa = 1 + kc eᵀe            (5.3c)

τa ẋ = −αx x               (5.4)

τ ż = α(β (g − y) − z) + f(x)           (5.5a)
τ ẏ = z                                 (5.5b)

τa ż = α(β (g − yc) − z) + f(x) + Ct    (5.6a)
τa ẏc = z                               (5.6b)
5.3 Problem Formulation
A PD controller given by

ÿr = Kp(yc − y) + Kv(ẏc − ẏ)        (5.7)

was used to make y track yc.
[Figure 5.1  Simulation with the controller in [Ijspeert et al., 2013] under
ideal conditions, with y stopped between 2 s and 3 s. Position (y, yc, yu)
and acceleration (ÿr, ÿc, ÿu) are plotted over time; a prohibitively large
acceleration reference ÿr was generated.]
moderate control signals must be used. The benefits of the DMP framework
described in [Ijspeert et al., 2013], i.e., scalability in time and space as
well as guaranteed convergence to the goal g, must be preserved. Further, in
the absence of significant perturbations, the behavior of y should resemble
that of the original DMP framework described in [Ijspeert et al., 2013] and
Chapter 3.
The stability of the proposed control algorithm requires special attention.
Unstable robot motion control could damage the robot and its surroundings,
such as tooling and work pieces. Further, in robotic manipulation it is crucial
that the robot reaches its goal configuration in each of its movements. If
this were not achieved, subtasks would likely be left incomplete, yielding
unforeseen hardware configurations, which in turn could result in collisions
and broken hardware. For temporally coupled DMPs to be used on a larger
scale in the future, it is therefore necessary to prove that they result in
stable behavior. Stability for the original DMP framework, i.e., without
temporal coupling, was established in [Perk and Slotine, 2006; Wensing and
Slotine, 2017].
Specifically, it will in this chapter be investigated whether the proposed
temporally coupled DMPs are globally exponentially stable. A mathematical
Figure 5.2 Similar to Fig. 5.1, except that y was moved away from the
intended path between 2 s and 3 s. Again, a prohibitively large acceleration
reference ÿr was generated.
definition of exponential stability can be found in, e.g., [Slotine and Li,
1991]. In words, a system is globally exponentially stable if the state vector
converges to the origin faster than an exponentially decaying function. In
the context of DMPs, global exponential stability implies that the robot
eventually reaches the goal configuration.
5.4 Method
The proposed method extends the approach in [Ijspeert et al., 2013] as
follows. The PD controller in (5.7) is augmented with feedforward control,
as shown in (5.8). Further, the PD controller gains are moderate, to get
a practically realizable control signal. Additionally, the time constant τ is
introduced as a factor in the expression for the adaptive time parameter τa ,
see (5.3c) and (5.9). Our method is detailed in the following.
To enforce y to follow yc , we apply the following control law.
ÿr = ÿc + kp(yc − y) + kv(ẏc − ẏ)        (5.8)

[Figure 5.3  Block diagram of the control system: the DMP generates yc and
the feedforward acceleration ÿc; the feedback term kp + kv (d/dt) acts on
yc − y, and the sum forms the reference acceleration ÿr sent to the robot.]
The closed control loop has a double pole at −5 rad/s. Since the real parts
are negative, the system (5.8) is asymptotically stable, and since the
imaginary parts are 0, it is critically damped. Such a critically damped
system is obtained for kp = kv²/4. This relationship is advisable to fulfill
in the implementation, since overshoot is then avoided, which is important
for avoiding collision. This is especially important in manipulation that
involves contact between robot and environment; see also Chapter 3, where α
and β are constrained based on the same reasoning. The delay margin is
130 ms, which is an improvement compared to 12 ms for the previous method,
described in Sec. 5.2. A schematic overview of the control system is shown
in Fig. 5.3.
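The 130 ms figure can be reproduced under the assumption that the loop transfer function implied by the control law (5.8), with the delay in the feedback path of the double-integrator loop, is L(s) = (kp + kv s)/s²:

```python
import math

# Gain crossover of L(jw): |kp + j*w*kv| / w^2 = 1, so u = w^2 solves
# u^2 - kv^2 u - kp^2 = 0 (take the positive root).
kp, kv = 25.0, 10.0                  # moderate gains; kp = kv^2 / 4
u = (kv ** 2 + math.sqrt(kv ** 4 + 4 * kp ** 2)) / 2
wc = math.sqrt(u)                    # gain-crossover frequency [rad/s]
pm = math.atan2(kv * wc, kp)         # phase margin [rad]
dm = pm / wc                         # delay margin [s]
assert 0.128 < dm < 0.131            # approx. 130 ms, as stated above
```

The loop-transfer form is an assumption made for this sketch; it is consistent with the PD-plus-feedforward law and the stated double pole at −5 rad/s, but it is not spelled out in the text.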
Further, the relation (5.3c) is modified in order to include the original
time constant τ, as follows.

τa = τ(1 + kc eᵀe)        (5.9)

ÿc = d/dt (ẏc) = d/dt (z/τa) = (τa ż − τ̇a z)/τa² = (τa ż − 2τ kc (eᵀė) z)/τa²        (5.10)
Feedforward control has been used in the DMP context previously, but then
only for low-level joint control, with motor-torque commands as control
signals; see [Park et al., 2008; Pastor et al., 2009]. This control structure
was also applied in the internal controller used in the implementation in
this chapter. This inner control design should not be confused with the
feedforward control described in this section, which operated outside the
internal robot controller and was used to determine the reference
acceleration for the robot.
5.5 Simulations
In the simulations, the robot configuration consisted of position in Cartesian
space. First, a demonstrated trajectory, ydemo , was simulated by creating a
time series of positions. Thereafter, the corresponding DMP parameters
were determined. Simulating the resulting DMP without perturbations and
without temporal coupling yielded an unperturbed, uncoupled, trajectory yu .
Position in one dimension was simulated in the following, and two different
perturbations were considered: one where y was stopped, and one where it
was moved. The perturbations took place from time 2 s to 3 s. The systems
were sampled at 250 Hz, using the forward Euler method. The same DMP,
yielding the same yu , was used in each trial. The adaptive time parameter τa
was determined according to (5.9) in all simulations, to get comparable time
scales. First, the controller detailed in [Ijspeert et al., 2013] was applied.
Except for the perturbations themselves, the conditions were assumed to
be ideal, i.e., no delay and no noise were present. The results are shown in
Figs. 5.1 and 5.2. Despite ideal conditions, prohibitively large accelerations
were generated by the controller in both cases.
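As a companion to the protocol above, the following one-dimensional sketch simulates the proposed controller of Sec. 5.4 with forward Euler at 250 Hz. All constants are illustrative, the learned forcing term f(x) is omitted for brevity, and the hold interval (0.5–1.5 s) is chosen for illustration rather than taken from the thesis:

```python
def simulate(hold=None, T=10.0, dt=1 / 250):
    # Illustrative constants, not the thesis's exact values.
    tau, alpha, alpha_x = 5.0, 25.0, 1.0
    beta = alpha / 4                   # critical damping (Chapter 3)
    alpha_e, k_t, k_c = 5.0, 10.0, 1000.0
    kv = 10.0
    kp = kv ** 2 / 4                   # double pole at -kv/2
    g = 1.0                            # goal
    y = ydot = yc = z = e = 0.0
    x = 1.0                            # phase
    max_acc = 0.0
    for k in range(int(T / dt)):
        t = k * dt
        C_t = k_t * e                                          # (5.3b)
        tau_a = tau * (1 + k_c * e * e)                        # (5.9)
        zdot = (alpha * (beta * (g - yc) - z) + C_t) / tau_a   # (5.6a), f omitted
        ycdot = z / tau_a                                      # (5.6b)
        edot = alpha_e * (y - yc - e)                          # (5.3a)
        xdot = -alpha_x * x / tau_a                            # (5.4)
        tau_a_dot = 2 * tau * k_c * e * edot                   # derivative of (5.9)
        ycddot = (tau_a * zdot - tau_a_dot * z) / tau_a ** 2   # (5.10)
        yr = ycddot + kp * (yc - y) + kv * (ycdot - ydot)      # (5.8)
        max_acc = max(max_acc, abs(yr))
        # forward Euler at 250 Hz, as in Sec. 5.5
        z += dt * zdot
        yc += dt * ycdot
        e += dt * edot
        x += dt * xdot
        if hold is not None and hold[0] <= t < hold[1]:
            ydot = 0.0                 # robot held in place by a human
        else:
            ydot += dt * yr
        y += dt * ydot
    return y, x, max_acc

y_h, x_h, a_h = simulate(hold=(0.5, 1.5))
y_u, x_u, a_u = simulate()
```

In this sketch, holding y slows the phase (x is larger at the end of the held run than in the unperturbed one), the reference acceleration stays moderate, and the goal is still reached after the perturbation.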
Figure 5.4 Simulation with the control structure in [Ijspeert et al., 2013],
except that the gains were lower ( K p = 25 and K v = 10). The reference
acceleration was of realizable magnitude. For instance, the YuMi robot has
a maximum end-effector acceleration of approximately 10 m/s2 . However, the
coupled and real systems were slowed down due to small tracking errors
combined with temporal coupling.
Figure 5.4 shows the result from a simulation where the controller
detailed in [Ijspeert et al., 2013] was used, except that the gains were
lowered to moderate values. The conditions were ideal, and no perturbation
was present. This resulted in realizable control signals. However, small
control errors in combination with the temporal coupling slowed down the
evolution of the coupled system as well as the actual movement.
Thereafter, the controller proposed in this chapter, described in Sec. 5.4,
was used. In order to verify robustness under realistic conditions, noise
and time delay were introduced. Position measurement noise, and velocity
process noise, were modeled as zero-mean Gaussian white noise, with stan-
dard deviations of 1 mm and 1 mm/s, respectively. The time delay between
the process and the controller was 12 ms. This delay was suitable to simu-
late since it corresponds both to the delay margin of the method suggested
in [Ijspeert et al., 2013], and to the actual delay in the implementation
presented in this chapter, see Sec. 5.7. (It is, however, a coincidence that
these two have the same value. Nevertheless, this shows that a 12 ms de-
lay margin is not necessarily enough.) The results are shown in Figs. 5.5
and 5.6. For comparison, the method in [Ijspeert et al., 2013], with the
large gains, was also evaluated under these conditions, although without
Figure 5.5 Similar to Fig. 5.1, but with modeled noise and delay, and using
the controller proposed in this chapter. The behavior was satisfactory both
regarding position and acceleration.
any perturbation except for the noise. Because of the time delay, this system
was unstable, as shown in Fig. 5.7.
Finally, two-dimensional movement using the proposed control system
was simulated. The purpose of these simulations was to investigate the
convergence of the proposed control system. At the start of each simulation,
yc was set to the beginning of the demonstrated trajectory ydemo , and y
was deliberately initialized with a randomly chosen error with respect to
yc . The system was simulated 100 times. The simulations gave mutually
similar results. The results from some of the simulations are visualized in
Figs. 5.8 and 5.9. For each simulation, the position converged to the goal.
Further, the DMP states in Fig. 5.9 converged to 0. This is consistent with
the theory that will be presented in Sec. 5.6.
Figure 5.6 Similar to Fig. 5.5, except that y was moved away from the
intended path between 2 s and 3 s. Again, the behavior was satisfactory both
regarding position and acceleration.
Figure 5.7 Using the control system in [Ijspeert et al., 2013], subject to
the simulated noise and time delay, resulted in unstable behavior.
[Figure 5.8  Two-dimensional simulations of the proposed control system.
Trajectories y and yc and the demonstration ydemo are shown in the y1–y2
plane; the position converged to the goal in each simulation.]
Figure 5.9  The magnitudes of DMP states ξi plotted over time, for one
of the simulations displayed in Fig. 5.8. The notation ‖ · ‖ represents the
2-norm, and the unit symbol [1] indicates a dimensionless quantity. It can be
seen that y converged to yc, and that e converged to 0. As will be shown in
Sec. 5.6, this is predicted by Theorem 5.1. Further, each state converged to
0, which is predicted by Theorem 5.2.
5.6 The Proposed Temporally Coupled DMPs are GES
Denote by ξ̄ and Ā the state vector and the system matrix in (5.15),
respectively.

Theorem 5.1
The dynamical system defined by (5.15) for DMP operation has ξ̄ = 0 as a
globally exponentially stable equilibrium. □
λ1,...,2n = −kv/2        (5.16a)
λ2n+1,...,3n = −αe       (5.16b)

These are strictly negative, since kv, αe > 0. The linear system is hence
globally asymptotically stable. For linear systems, asymptotic stability is
equivalent to exponential stability [Slotine and Li, 1991; Rugh, 1996].
The system is therefore globally exponentially stable. □
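For n = 1, the decay predicted by the eigenvalues in (5.16) can be checked by integrating the linear error dynamics numerically; the gains are the moderate values kv = 10, kp = kv²/4, and αe = 5 is an illustrative filter constant:

```python
# n = 1 sketch of the linearized error dynamics behind (5.16):
# states p = y - yc, v = dy/dt - dyc/dt, and the filtered error e
# (the phase state is omitted here).
kv = 10.0
kp = kv ** 2 / 4          # gives a double eigenvalue at -kv/2 = -5
alpha_e = 5.0             # illustrative filter constant
dt = 1 / 250
p, v, e = 0.3, -0.2, 0.1  # arbitrary initial error
for _ in range(int(3.0 / dt)):
    p, v, e = (p + dt * v,
               v + dt * (-kp * p - kv * v),
               e + dt * alpha_e * (p - e))
# after 3 s, the exp(-5 t) decay has reduced every state to near zero
assert max(abs(p), abs(v), abs(e)) < 1e-4
```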
       [ ẏ − ẏc ]   [ −kp(y − yc) − kv(ẏ − ẏc) ]
  d/dt [    e   ] = [ αe(y − yc − e)            ]        (5.17)
       [    x   ]   [ −(αx/τa) x                ]
Proposition 5.1
If ẋ1 = g1(x1) is contracting, and ẋ2 = g2(x1, x2) is contracting for each
fixed x1, then the hierarchy

d/dt [ x1 ; x2 ] = [ g1(x1) ; g2(x1, x2) ]        (5.18)

is contracting. □
We now address the stability of the entire control system in (5.14) as follows.
Theorem 5.2
The system given by (5.14) for DMP operation has ξ = 0 as a globally
exponentially stable equilibrium point. □
For the fixed point of (5.17) we have τa = τ and f(x) = f(0) = 0, and the
system (5.19) then simplifies to

       [ yc − g ]   [    0n         (1/τ) In ] [ yc − g ]
  d/dt [        ] = [                        ] [        ]        (5.20)
       [   z    ]   [ −(αβ/τ) In  −(α/τ) In ] [   z    ]
This system is linear, and with β = α/4 (see Chapter 3) the eigenvalues of
the system matrix are given by

µ1,...,2n = −α/(2τ)        (5.21)
Since the eigenvalues are strictly negative, the system (5.20) is contracting.
Further, we note that (5.14) is hierarchical from (5.17) to (5.19). Therefore,
it follows from Proposition 5.1 that (5.14) is contracting. This is true for
the whole state space. Hence, all solutions of (5.14) converge exponentially
to the same trajectory. Since one solution is given by ξ = 0, they must
all converge exponentially to this equilibrium point. Thus, (5.14) is globally
exponentially stable. □
Since y = g for ξ = 0, Theorem 5.2 implies that the system (5.14)
converges exponentially to a state where the goal configuration is reached
by the actual robot position.
5.7 Experiments
The algorithm in Sec. 5.4 was implemented on the ABB YuMi prototype
robot, see Chapter 2. Similar to the simulations, the control system ran at
250 Hz, and the delay between robot and controller was 3 sample periods,
corresponding to 12 ms. The computations took place in joint space, and the
robot’s forward kinematics were used for visualization in Cartesian space
in some of the figures presented.
Six different setups were used to evaluate the control algorithm. For
each setup, 50 trials were made, yielding a total of 300 trials. Prior to each
trial, a temporally coupled DMP had been determined from demonstration
by means of lead-through programming [Stolt et al., 2015a]. In each trial,
the temporally coupled DMP was executed while data were logged and
saved. Perturbations were introduced by physical contact with a human.
A wrist-mounted ATI Mini40 force/torque sensor was used to measure the
contact force, and a corresponding acceleration was added to ÿr as a load
disturbance. The purpose of Setups 5.1 and 5.2 was to demonstrate the
proposed functionality in general and examine the reference acceleration,
whereas the main purpose of Setups 5.3 to 5.6 was to verify the exponential
stability shown in Sec. 5.6. The six setups were as follows.
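The force-to-acceleration mapping used for the perturbations can be sketched as a simple admittance; the virtual mass and saturation values below are illustrative, not the values used in the implementation:

```python
# Hedged sketch: mapping a measured contact force into an acceleration
# disturbance added to the reference, via a virtual mass (admittance).
def contact_acceleration(force_N, m_virtual=10.0, a_max=10.0):
    a = force_N / m_virtual                  # virtual-mass admittance
    return max(-a_max, min(a_max, a))        # YuMi-scale limit, ~10 m/s^2

assert contact_acceleration(25.0) == 2.5     # moderate push
assert contact_acceleration(500.0) == 10.0   # saturated at the limit
```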
Setup 5.1 This setup is visualized in Fig. 5.10. The task of the robot was
to place a stop button in its corresponding case. The assembly parts are
shown in Fig. 5.11. The human introduced two perturbations during the
DMP execution. The first perturbation was formed by moving the end-
effector away from its path, and then releasing it. The second perturbation
consisted of a longer, unstructured, movement later along the trajectory.
Figure 5.10 Setup 5.1. In (a), the robot started to execute a DMP for placing
the stop button in the rightmost yellow case. A human perturbed the motion
twice. The first perturbation (b) was formed by moving the end-effector away
from its path, and then releasing it. The second perturbation (c) lasted for a
longer time, and consisted of unstructured movement. The robot recovered
from both perturbations, and managed to place the stop button in the case
(d). Data from one trial are shown in Fig. 5.15.
Figure 5.11 Yellow case (left), stop button (upper right) and gasket (lower
right) used in Setups 5.1, 5.2 and 5.6.
Setup 5.2 This setup is visualized in Fig. 5.12. A human co-worker realized
that the stop buttons in the current batch were missing rubber gaskets, and
acted to modify the robot trajectory, allowing for the co-worker to attach
the gasket on the stop button manually. During execution of the DMP,
the end-effector was stopped and lifted to a comfortable height by the co-
worker. Thereafter, the gasket was attached, and finally the end-effector
was released. For the sake of completeness, the modified trajectory was
used to form yet another DMP, which allowed for the co-worker to attach
the gaskets without perturbing the trajectory of the robot, for the remaining
buttons in the batch. To verify this functionality, one such modified DMP
was executed at the end of each trial.
Setup 5.3 The robot moved one of its arms without manipulating any ob-
jects. The arm was subjected to two perturbations.
Setup 5.4 The robot moved both of its arms simultaneously, again without
any manipulation. Two perturbations were introduced: first one on the left
arm, and subsequently one on the right arm.
Setup 5.5 This setup is visualized in Fig. 5.13. The robot used both arms
to pick up a ball. One perturbation was introduced on the left arm.
Setup 5.6 This setup is visualized in Fig. 5.14. The objective of the robot
was to mate a stop button, grasped with the right arm, with its correspond-
ing case, grasped with the left arm. The left arm was subjected to one
perturbation.
Videos
A video of Setups 5.1 and 5.2 is available in [Karlsson, 2016]. Further, a
video of Setups 5.3 to 5.6 is available in [Karlsson, 2017a]. In Part I of the
latter video, the experimental setups are shown, except that the temporally
coupled DMPs were executed without any perturbations present. This is to
visualize what should be achieved by the robot, in each setup. In Part II,
DMPs without temporal coupling were used, and the robot was subject to
perturbations. The purpose of this part is to demonstrate that the robot
risks failing in such a scenario, and hence to motivate why temporal coupling
is necessary. This risk of failure was also indicated in [Ijspeert et al., 2013],
using one-dimensional simulation examples. Part III of the video shows the
experiments, where temporally coupled DMPs were executed in the presence
of perturbations.
Figure 5.12 Setup 5.2. The robot started its motion toward the rightmost
yellow case in (a). The end-effector was stopped and lifted, and the gasket
was mounted in (b). The robot was then released, and continued its motion
to the case, (c) and (d). The actual trajectory was saved and used to form a
modified DMP, and the robot was reset to a configuration similar to that in
(a). When executing the modified DMP, the human co-worker could attach
the gasket without perturbing the motion of the robot (e). The robot finished
the modified DMP in (f). Data from one trial are shown in Fig. 5.16.
Figure 5.13 Setup 5.5. The robot task was to pick up the blue ball us-
ing both arms. The movement started in (a), and a human perturbed the
movement in (b). Thereafter, the robot was released and recovered from the
perturbation. The arms reached the ball simultaneously in (c). The goal state
was then reached in (d). Data from one trial are shown in Fig. 5.19.
5.8 Results
In this section, data from one trial of each of the setups in Sec. 5.7 are
displayed. For each setup, all trials gave qualitatively similar results. In
each of the 300 trials, the perturbations were first recovered from and
thereafter the goal state was reached.
Data from a trial of Setup 5.1 are displayed in Fig. 5.15. As intended,
the two disturbances were successfully recovered from. The reference accel-
eration was of realizable magnitude.
Data from a trial of Setup 5.2 are displayed in Fig. 5.16. First, the
perturbation was successfully recovered from as intended. The reference
acceleration was of realizable magnitude. When the modified DMP was exe-
cuted, it behaved like a smooth version of the perturbed original trajectory.
Data from Setups 5.3 to 5.6 are shown in Figs. 5.17 to 5.20. The figures
show the magnitudes of the states in (5.14) represented in joint space, plot-
ted over time. The perturbations are clearly visible in the data. It can be
Figure 5.14 Setup 5.6. The task was to insert the red stop button into
the corresponding hole in the yellow case. The initial configuration is shown
in (a). In (b), a human perturbed the movement, and in (c) the robot had
recovered from the perturbation. The goal configuration was reached in (d).
Data from one trial are shown in Fig. 5.20.
5.9 Discussion
Compared to the previous research in [Ijspeert et al., 2013], the method
in this chapter contains the following extensions. Feedforward control was
added to the PD controller, thus forming a two-degree-of-freedom controller.
Further, the PD controller gains were reduced to moderate magnitudes. The
expression for τa was also modified, to include the original time constant τ
as a factor. These changes resulted in the following benefits, compared to
the previous method. The feedforward part allowed the controller to act
even on insignificant position and velocity errors, thus improving the
trajectory tracking. Because of this, the large controller gains used in
previous research could be avoided.
Figure 5.15 Experimental data from a trial of Setup 5.1. The first (from
above) plot shows two position coordinates, y1 and y2 , of the end-effector in
Cartesian space. The arrow indicates the direction of the movement, which
started in the upper right and finished in the lower left of the plot. The two
perturbations are clearly visible. The second plot shows the distance between
y and yc over time. In the third plot, it can be seen that the evolution of yc
slowed down during each perturbation. Subsequently, y recovered, and when
it was close to yc , the movement continued as a delayed version of yu . The
reference acceleration was of realizable magnitude, as shown in the fourth
plot.
Figure 5.16 Experimental data from a trial of Setup 5.2. The organization
of this figure is similar to that of Fig. 5.15. The perturbation for stopping
and lifting the end-effector took place from time 10 s to 17.5 s, and is clearly
visible in each plot. This perturbation was recovered from as intended, and
the reference acceleration was of realizable magnitude. The uppermost plot
also displays the measured trajectory obtained by executing the modified
DMP, denoted ym . It behaved like a smooth version of the perturbed original
trajectory y.
Figure 5.17 Data from a trial of Setup 5.3, where the left arm was per-
turbed twice during its movement.
Figure 5.18 Data from a trial of Setup 5.4, where both arms moved simul-
taneously and were perturbed once each.
Figure 5.19 Data from a trial of Setup 5.5. The setup is shown in Fig. 5.13.
Figure 5.20 Data from a trial of Setup 5.6. The setup is shown in Fig. 5.14.
In this particular trial, the robot was grasped by the human for a significantly
longer time than in the trials shown in Figs. 5.17 to 5.19, though this was
not the case for all trials of Setup 5.6.
x1 = exp(t)/2                                                        (5.24)
which is not stable. It can be seen that x1 → ∞ as t → ∞. The contraction
guarantees exponential convergence to a single trajectory [Lohmiller and
Slotine, 1998], in this case to (5.24), but it does not guarantee exponential
convergence to a constant. Nevertheless, for autonomous systems such as
DMPs, global contraction is equivalent to global exponential stability
[Lohmiller and Slotine, 1998].
Even though Theorem 5.2 states that the state vector ξ converges to
the origin, in practice noise and model error prevent the state vector from
staying arbitrarily close to the origin. In the experiments there was an
unmodeled time delay of 12 ms between the controller in (5.8) and the
internal robot controller, which is an example of a model error. Further,
the robot end-effectors reached their targets with an accuracy of ±1 mm.
Because the origin could not be reached by ξ with exactly zero error, the
movement could for instance be considered finalized once ‖ξ‖ < ρ for some
small constant ρ, or upon force-based detection of task completion as in
Chapter 7. A related discussion about convergence toward single-point goals
can be found in [LaValle, 2006]. There, the concept of so-called asymptotic
solution plans is also described. Since exponential stability is a stronger
requirement than asymptotic stability [Slotine and Li, 1991], Theorem 5.2
implies that temporally coupled DMPs are asymptotic solution plans.
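A completion check of this kind, declaring the movement finalized once the norm of the state drops below a small threshold, can be sketched as follows. The threshold ρ and the decay rate are illustrative only, not values from the experiments:

```python
import numpy as np

def movement_finalized(xi, rho=1e-3):
    """Completion test ||xi|| < rho for a small constant rho."""
    return bool(np.linalg.norm(xi) < rho)

# An exponentially decaying state never reaches the origin exactly,
# but the threshold test fires after a finite time.
t = np.arange(0.0, 10.0, 0.01)
xi_norm = np.exp(-2.0 * t)             # ||xi(t)|| with decay rate 2
t_done = t[np.argmax(xi_norm < 1e-3)]  # first time the test fires
```

Here `t_done` is close to ln(1/ρ)/2, reflecting that an exponentially stable system satisfies any fixed tolerance in finite time.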
Stability for the original DMP framework was shown in [Perk and Slo-
tine, 2006; Wensing and Slotine, 2017] by utilizing that f ( x) converged to 0,
which followed from the fact that x decayed exponentially, regardless of any
deviation from the intended movement. However, this is true only if tempo-
ral coupling is not used, and the convergence of f ( x) is less obvious for the
DMP version studied in this chapter. Due to the adaptive time parameter
τa in (5.4), it cannot be assumed that x decays exponentially for temporally
coupled DMPs. Instead, the stability of temporally coupled DMPs has now
been established in the proof of Theorem 5.2.
In the experiments, a force sensor was mounted on each wrist of the
robot, and the measured forces were scaled and added to the reference
accelerations as disturbances, which allowed for the human to introduce
perturbations in an intuitive way. This arrangement worked well for the
sake of evaluating the stability properties of temporally coupled DMPs, and
it was straightforward to implement. However, for the purpose of interacting
physically with the robot, one could also consider passive force control.
Such an approach was presented in [Denisa et al., 2016], though temporal
coupling was not included. As suggested in [Denisa et al., 2016], it would
therefore be interesting to combine temporal coupling with passive force
control in the DMP framework in future research.
The effect of temporal coupling can be seen both in the simulated and
the experimental data. For example, consider Fig. 5.20 where the robot was
grasped by the human at t ≈ 2.5 s. During the perturbation, the evolution
of x, yc , and z was slowed down. Such behavior is desired in general, since it
helps to retain the coordination between the degrees of freedom of the robot.
It resulted from the fact that e had a significant magnitude, which in turn
caused τa to be significantly larger than τ; see (5.9). The robot was released
at t ≈ 10 s, and the controller described by (5.8) started to drive y to yc,
whereby e was driven to 0. After e had converged to insignificant magnitude,
at approximately 12 s, it can be seen that x, yc , and z evolved considerably
faster. The same phenomenon could be seen in each experiment, as well as
in the simulations.
It is necessary to saturate the control signals, to prevent excessively large
accelerations and velocities under large perturbations. Such bounds were
implemented, but never reached in the experiments in this work.
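Such a saturation amounts to an element-wise clamp on the reference signals; a minimal sketch, with a placeholder limit rather than the value used in the experiments:

```python
import numpy as np

def saturate(signal, limit):
    """Element-wise clamp of a control signal to [-limit, limit]."""
    return np.clip(signal, -limit, limit)

a_max = 5.0                              # placeholder limit [m/s^2]
y_ddot_r = np.array([2.0, -7.5, 4.0])    # raw reference acceleration
y_ddot_sat = saturate(y_ddot_r, a_max)   # second element clamped to -5.0
```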
see for instance [Ijspeert et al., 2013; Ude et al., 2014]. However, whereas the
assumption of a contractible state space in Sec. 5.6 is correct for the robot
joint space and the position in Cartesian space, some care must be taken
regarding the orientation in Cartesian space, as discussed in, e.g., [Mayhew
et al., 2011]. Restrictions on the orientation must be invoked to guarantee
that it is contained in a contractible space. Finding such restrictions, that
are not unnecessarily conservative, is the subject of Chapter 6.
5.10 Conclusion
In this research, it was shown how perturbations of DMPs could be recovered
from, while preserving the characteristics of the original DMP framework
in the absence of significant perturbations. Feedforward control was used
to track the reference trajectory generated by a DMP. Feedback control
with moderate gains was used to suppress deviations. This design is the
first, to the best of the author’s knowledge, that takes the following aspects
into account. In the absence of significant disturbances, the position error
must be small enough that the dynamical system does not slow down
unnecessarily due to the temporal coupling. Very large controller gains
would yield small errors under ideal conditions, but in practice they
produce control signals that are not realizable. On the other hand, if the
gains are moderate and only feedback control is used, the errors become too large.
Feedforward enabled the controller to act even without significant error,
which in turn allowed for moderate controller gains. The proposed control
system was verified in simulations and experimentally. Videos of the exper-
iments are available in [Karlsson, 2016; Karlsson, 2017a]. Further, it was
shown mathematically that the proposed control system is globally expo-
nentially stable, which implies exponential convergence to a steady state in
which the goal position is reached by the actual robot position.
V(0) = 0                                                            (5.26a)
V(ξ) ≥ 0,  ∀ξ ≠ 0                                                   (5.26b)
V̇(ξ) ≤ f(x)ᵀ z/τa,  ∀ξ, f(x)                                        (5.26c)
Denote by ξ̄ and Ā the state vector and the system matrix in (5.27), respectively. Because kp = kv²/4 (see Sec. 5.4), the eigenvalues are given by
λ1,...,2n = −kv/2                                                   (5.28a)
λ2n+1,...,3n = −αe                                                  (5.28b)
5.A Passivity of Temporally Coupled Dynamical Movement Primitives
the left half-plane, it follows [Glad and Ljung, 2000] that there exists a
symmetric, positive definite matrix P solving the Lyapunov equation
P Ā + Āᵀ P = −Q                                                     (5.29)

V(ξ) = ξ̄ᵀ P ξ̄ + (1/2) x² + (1/2) αβ (yc − g)ᵀ(yc − g) + (1/2) zᵀ z  (5.30)

d/dt V(ξ) = ξ̄ᵀ P ξ̄̇ + ξ̄̇ᵀ P ξ̄ + x ẋ + αβ (yc − g)ᵀ d/dt (yc − g) + żᵀ z
          = ξ̄ᵀ(P Ā + Āᵀ P)ξ̄ − (αx/τa) x² + (1/τa) αβ (yc − g)ᵀ z
            + (α/τa) (β(g − yc) − z)ᵀ z + (1/τa) f(x)ᵀ z
          = −ξ̄ᵀ Q ξ̄ − (αx/τa) x² − (α/τa) zᵀ z
            + (1/τa) f(x)ᵀ z ≤ (1/τa) f(x)ᵀ z,  ∀ξ, f(x)             (5.31)
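The cancellation that produces the −(α/τa) zᵀz term above, namely that the two cross terms αβ(yc − g)ᵀz/τa and (α/τa)(β(g − yc) − z)ᵀz sum to −(α/τa) zᵀz, can be spot-checked numerically for random vectors (a sketch, not part of the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, tau_a = 2.0, 0.5, 1.3      # arbitrary positive constants
yc, g, z = rng.standard_normal((3, 4))  # three random 4-dimensional vectors

# Sum of the two cross terms appearing in the derivative of V.
cross = (alpha * beta / tau_a) * (yc - g) @ z \
        + (alpha / tau_a) * (beta * (g - yc) - z) @ z
```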
of (5.31).
V(ξ(t1)) = V(ξ(0)) + ∫₀^t1 f(x)ᵀ z/τa dt
           − ∫₀^t1 [ ξ̄ᵀ Q ξ̄ + (αx/τa) x² + (α/τa) zᵀ z ] dt
         ≤ V(ξ(0)) + ∫₀^t1 f(x)ᵀ z/τa dt                             (5.32)
In this balance, V(ξ(t1)) is the stored energy, V(ξ(0)) is the energy at
start, the integrand f(x)ᵀ z/τa is the supplied power, and the bracketed
integrand is the dissipation power.
Here, integrating the supplied power with respect to time yields total work
done by f ( x). Hence, the increase in stored energy is not larger than the
externally added energy, which indicates passivity.
The global exponential stability established in Sec. 5.6 does not rely on
the passivity shown in this appendix. However, this appendix may contribute
with some intuition of the role of f ( x), and of why finding a Lyapunov
function of ξ is not trivial.
6
Temporally Coupled Dynamical Movement Primitives in 3D Orientation Space
6.1 Introduction
In this chapter, we extend the results in Chapter 5 to support orienta-
tion in Cartesian space. Higher levels of robot control typically operate in
Cartesian space, for instance to control the pose of a robot end-effector or
an unmanned aerial vehicle. However, control of orientation in Cartesian
space is fundamentally limited: The rotation group SO(3) is not contractible
(defined in Sec. 6.4), and only globally contractible state spaces support
continuous and globally asymptotically stable feedback control systems. In
this chapter, a control system for temporally coupled dynamical movement
primitives (DMPs) in Cartesian space is designed. This is an extension of
the control law in Chapter 5 to support 3D orientation, and equivalently an
extension of [Ude et al., 2014] to include temporal coupling. Unit quater-
nions are used to represent orientations, and it is shown that the unit
quaternion set minus one point is contractible. Finally, the efficacy of the
control system is verified experimentally on an industrial robot.
these previous DMP formulations are applicable only if the robot state space
is Euclidean. As a consequence, previous applications and extensions of
DMPs typically rely on Euclidean state spaces; see, e.g., [Prada et al., 2014;
Karlsson et al., 2017b; Talignani Landi et al., 2018a; Papageorgiou et al.,
2018; Yang et al., 2018]. In [Perk and Slotine, 2006; Wensing and Slotine,
2017], a helicopter and a legged robot were controlled using DMPs, and
global exponential stability was shown assuming Euclidean state spaces.
The assumption of a Euclidean state space is correct for position in
Cartesian space and joint space, but not for orientation in Cartesian space.
This circumstance motivated the DMP formulation proposed in [Ude et
al., 2014], which supports Cartesian space including orientations. It was
applied for robot programming by demonstration in [Nemec et al., 2018].
This chapter extends [Ude et al., 2014] by including temporal coupling
and addressing the stability properties of the proposed control algorithm.
6.4 The Unit Quaternion Set Minus One Single Point is Contractible
single point; see, e.g., [Crossley, 2006] for a definition of homotopy equiv-
alence. Contractibility is necessary for applying the contraction theory of
[Lohmiller and Slotine, 1998], as done in Chapter 5. In this chapter, unit
quaternions are used to parameterize SO(3). Similarly to SO(3), the unit
quaternion set, H, is not contractible. In this section, however, it is shown
that it is sufficient to remove one point from H to yield a contractible space.
Table 6.1 lists some of the notation used in this chapter.
Preliminary Topology
The fundamentals of mathematical topology and set theory are described
in, e.g., [Hatcher, 2002; Levy, 2002; Crossley, 2006; Schwarz, 2013]. We will
use the fact that homeomorphism [Crossley, 2006] is a stronger relation
than homotopy equivalence.
Lemma 6.1
If two spaces X and Y are homeomorphic, then they are homotopy equivalent. □
Lemma 6.2
Assume that X ≅ Y, with a homeomorphism f1 : X → Y. Then X minus a
point p ∈ X, denoted X \ p, is homeomorphic to Y \ f1(p). □
Lemma 6.3
If X ≅ Y, and X is contractible, then Y is also contractible. □
Table 6.1 Notation used in this chapter. All quaternions represent orienta-
tions and are therefore unit quaternions, i.e., quaternions of norm one. For
a unit quaternion, its inverse is equal to its conjugate.
Notation              Description
H                     Unit quaternion set
Sn ⊂ Rn+1             Unit sphere of dimension n
y ∈ R3                Actual robot position
g ∈ R3                Goal position
yc ∈ R3               Coupled robot position
qa ∈ H                Actual robot orientation
qg ∈ H                Goal orientation
qc ∈ H                Coupled robot orientation
q0 ∈ H                Initial robot orientation
ωa ∈ R3               Actual angular velocity
ωc ∈ R3               Coupled angular velocity
z, ωz ∈ R3            DMP states
h                     Orientation difference space
dc ∈ h                Difference between qc and qg
α, β, kv, kp ∈ R+     Constant control gains
τ ∈ R+                Time parameter
τa ∈ R+               Adaptive time parameter
x ∈ R+                Phase variable
αx, αe, kc ∈ R+       Positive constants
f(x) ∈ R6             Learnable virtual forcing term
fp(x), fo(x) ∈ R3     Position and orientation components
Nb ∈ Z+               Number of basis functions
m ∈ Z+                Dimension of Cartesian configuration
Ψj(x) ∈ R6            The j:th basis function vector
wj ∈ R6               The j:th weight vector
e ∈ R3 × h            Low-pass filtered pose error
ep ∈ R3               Position component of e
eo ∈ h                Orientation component of e
ÿr, ω̇r ∈ R3           Reference robot accelerations
ξ ∈ R22 × h3          DMP state vector
‖ξi‖ ∈ R+             2-norm of DMP state i
q̄ ∈ H                 Inverse of quaternion q (same as conjugate)
h ∈ R+                Sample period (4 ms in experiments)
≃                     Homotopy equivalence
≅                     Homeomorphic relation
Definition 6.1
Let n be a non-negative integer. The unit sphere with dimension n is defined as
Sn = { p ∈ Rn+1 : ‖p‖2 = 1 }                                         (6.1)
□
Theorem 6.1
Let n be a non-negative integer. The unit sphere Sn minus a point p ∈ Sn,
denoted Sn \ p, is contractible. □
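A concrete homeomorphism behind Theorem 6.1 is stereographic projection from the removed point. A sketch, assuming p is the "north pole" (0, ..., 0, 1):

```python
import numpy as np

def stereographic(q):
    """Project q in S^n minus the north pole onto R^n (a homeomorphism)."""
    q = np.asarray(q, dtype=float)
    return q[:-1] / (1.0 - q[-1])

def stereographic_inv(u):
    """Inverse map R^n -> S^n minus the north pole."""
    u = np.asarray(u, dtype=float)
    s = u @ u
    return np.append(2.0 * u, s - 1.0) / (s + 1.0)

q = np.array([0.0, 1.0, 0.0])   # a point on S^2, not the pole
roundtrip = stereographic_inv(stereographic(q))
```

Both maps are continuous on their domains, which is what makes Sⁿ \ p homotopy equivalent (indeed homeomorphic) to Rⁿ.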
[Figure, extraction residue removed: visualizations of S0 \ p, S1 \ p, and S2 \ p, homeomorphic to R0, R1, and R2, respectively.]
which motivates the search for the largest possible contractible subset of H.
One of the experiments (Setup 6.3 in Sec. 6.6) provides an example of when
both half spheres are necessary for a continuous representation of the robot
orientation.
6.5 Method
In this section, we augment the controller in Chapter 5 to incorporate
orientation in Cartesian space. The resulting algorithm can also be seen as
a temporally coupled version of the Cartesian DMPs proposed in [Ude et al.,
2014]. In this chapter, we consider the control of one end-effector to limit
the notation, but it would be straightforward to include several end-effectors
and robots in the proposed control algorithm. The pose in Cartesian space
consists of position and orientation. The position control in this chapter is
the same as described in Chapter 5, except that it is also affected by the
orientation through the shared time parameter τa in this chapter.
Similar to the approaches in [Ude, 1999; Ude et al., 2014], we define a
difference between two quaternions, q 1 and q 2 , as
d(q1 q̄2) = 2 · Im[log(q1 q̄2)] ∈ h                                    (6.2)
Then
dc = d(qc q̄g) = 2 · Im[log(qc q̄g)]                                   (6.3)
represents the difference between the coupled and goal orientations. This
mapping preserves the contractibility concluded in Sec. 6.4, as established
by the following theorem.
Theorem 6.3
The orientation difference space h is contractible. 2
d : H \ (−1, 0, 0, 0) → h (6.4)
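The difference map (6.2)–(6.4) can be written directly from the quaternion logarithm. A sketch in Python (the thesis implementation was in Julia); quaternions are arrays [w, x, y, z] in scalar-first convention, and the conjugate implements the inverse for unit quaternions:

```python
import numpy as np

def quat_mul(q1, q2):
    """Hamilton product of quaternions [w, x, y, z]."""
    w1, v1 = q1[0], np.asarray(q1[1:])
    w2, v2 = q2[0], np.asarray(q2[1:])
    return np.concatenate(([w1 * w2 - v1 @ v2],
                           w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def quat_conj(q):
    """Conjugate; equals the inverse for unit quaternions."""
    return np.concatenate(([q[0]], -np.asarray(q[1:])))

def quat_log_diff(q1, q2):
    """d(q1 conj(q2)) = 2 Im[log(q1 conj(q2))] in R^3, cf. (6.2)."""
    q = quat_mul(q1, quat_conj(q2))
    v = q[1:]
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return np.zeros(3)
    return 2.0 * np.arctan2(nv, q[0]) * v / nv

q_identity = np.array([1.0, 0.0, 0.0, 0.0])
q_quarter = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
d = quat_log_diff(q_quarter, q_identity)   # rotation vector about z
```

For a relative rotation of angle θ about axis n, the map returns θ·n, so its norm is the rotation angle between the two orientations.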
τa ż = α(β(g − yc) − z) + fp(x)                                      (6.5a)
τa ẏc = z                                                            (6.5b)
τa ω̇z = α(β(−dc) − ωz) + fo(x)                                       (6.5c)
τa ωc = ωz                                                           (6.5d)
τa ẋ = −αx x                                                         (6.6)
Here, σ and c denote the width and center of each basis function, re-
spectively. The forcing term fp ( x) is determined accordingly, see Chapter 3.
Further, the parameters of f ( x) can be determined based on a demonstrated
trajectory by means of locally weighted regression [Atkeson et al., 1997], as
described in [Ijspeert et al., 2013] and Chapter 3.
All dimensions of the robot pose are temporally coupled through the
shared adaptive time parameter τa . Denote by y the actual position of the
robot, and by q a the actual orientation. The adaptive time parameter τa
is determined based on the low-pass filtered difference between the actual
and coupled poses as follows.
ėp = αe (y − yc − ep)                                                (6.9a)
ėo = αe (dac − eo)                                                   (6.9b)
e = [epᵀ eoᵀ]ᵀ                                                       (6.9c)
τa = τ (1 + kc eᵀ e)                                                 (6.9d)
In (6.9c), the contributions from the position and orientation errors are
weighted equally to limit the notation, though this is not necessary in an
implementation. Further, we assume that the length unit for the position
This can be seen as a pose PD controller (see Sec. 6.8) together with the
feedforward terms ÿc and ω̇ c . Here, ÿr and ω̇ r denote reference accelerations
sent to the internal controller of the robot, after conversion to joint values
using the robot Jacobian [Spong et al., 2006]. We let k p = k2v /4, so that
(6.10) represents a critically damped control loop (see Chapter 5). Similarly,
β = α/4 (see Chapter 3 and [Ijspeert et al., 2013]). The control system
is schematically visualized in Fig. 6.2. We model the ’Robot’ block as a
double integrator, so that ÿ = ÿr and ω̇ a = ω̇ r , as justified in Chapter 5 for
accelerations with moderate magnitudes and time derivatives. In summary,
the proposed control system is given by
τa = τ (1 + kc eᵀ e)                                                 (6.11d)
τa ẋ = −αx x                                                         (6.11e)
τa ẏc = z                                                            (6.11f)
τa ż = α(β(g − yc) − z) + fp(x)                                      (6.11g)
τa ωc = ωz                                                           (6.11h)
τa ω̇z = α(β(−dc) − ωz) + fo(x)                                       (6.11i)
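One explicit-Euler step of (6.11d)–(6.11i) can be sketched as follows. This is a reconstruction, not code from the thesis: the goal g, the orientation difference dc (whose quaternion update is omitted here), the gains, and the forcing-term callables are placeholders, and the pose controller (6.10) is not included. The default step of 4 ms matches the sample period stated in Table 6.1.

```python
import numpy as np

def dmp_step(state, params, h=0.004):
    """One Euler step of the temporally coupled DMP states (6.11d)-(6.11i).
    Returns the updated state and the coupled angular velocity wc."""
    x, yc, z, dc, wz, e = (state[k] for k in ("x", "yc", "z", "dc", "wz", "e"))
    tau, kc, alpha, beta, alpha_x = (params[k] for k in
                                     ("tau", "kc", "alpha", "beta", "alpha_x"))
    g, fp, fo = params["g"], params["fp"], params["fo"]

    tau_a = tau * (1.0 + kc * e @ e)                                 # (6.11d)
    x_new = x + h * (-alpha_x * x) / tau_a                           # (6.11e)
    yc_new = yc + h * z / tau_a                                      # (6.11f)
    z_new = z + h * (alpha * (beta * (g - yc) - z) + fp(x)) / tau_a  # (6.11g)
    wc = wz / tau_a                                                  # (6.11h)
    wz_new = wz + h * (alpha * (beta * (-dc) - wz) + fo(x)) / tau_a  # (6.11i)
    return {**state, "x": x_new, "yc": yc_new, "z": z_new, "wz": wz_new}, wc

state = {"x": 1.0, "yc": np.zeros(3), "z": np.zeros(3),
         "dc": np.zeros(3), "wz": np.zeros(3), "e": np.zeros(6)}
params = {"tau": 1.0, "kc": 10.0, "alpha": 8.0, "beta": 2.0, "alpha_x": 1.0,
          "g": np.ones(3), "fp": lambda x: np.zeros(3),
          "fo": lambda x: np.zeros(3)}
state1, omega_c = dmp_step(state, params)
```

With zero pose error, τa = τ and the phase x simply decays; a nonzero filtered error e inflates τa and slows every state down equally, which is the temporal coupling.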
[Block diagram, extraction residue removed: the DMP block outputs the coupled pose (yc, qc) and feedforward accelerations (ÿc, ω̇c); the PD controller produces reference accelerations (ÿr, ω̇r) for the Robot block, whose measured pose (y, qa) is fed back.]
Figure 6.2 The control structure for temporally coupled Cartesian DMPs.
The block denoted ’Robot’ includes the internal controller of the robot, to-
gether with transformations between Cartesian and joint space for low-level
control. The ’DMP’ block corresponds to the computations in (6.5) to (6.9).
The PD controller and the feedforward terms are specified in (6.10). This
forms a cascade controller, with the DMP as outer controller and the PD as
the inner.
Since R22 × h3 is a product of contractible spaces (see Theorem 6.3), and such
a product is itself contractible [Kahn, 1975], the state space is contractible.
In Sec. 6.A, it is shown that an approximation of the proposed control system
in discrete time is exponentially stable.
6.6 Experiments
The control law in Sec. 6.5 was implemented in the Julia programming lan-
guage [Bezanson et al., 2014], to control an ABB YuMi [ABB Robotics, 2018]
robot. The Julia program communicated with the internal robot controller
through a research-interface version [Bagge Carlson and Haage, 2017] of
Externally Guided Motion (EGM) [ABB Robotics, 2019c] at a sampling rate
of 250 Hz.
Three different setups were used to investigate the behavior of the con-
troller. As preparation for each setup, a temporally coupled Cartesian DMP
had been determined from a demonstration by means of lead-through pro-
gramming, which was available in the YuMi product by default. In each
trial, the temporally coupled DMP was executed while the magnitudes of
the states in (6.12) were logged.
Perturbations were introduced by physical contact with a human. This
was enabled by estimating joint torques induced by the contact, and map-
ping these to Cartesian contact forces and torques using the robot Jacobian.
A corresponding acceleration was then added to the reference acceleration
as a load disturbance. However, we emphasize that this chapter is not
focused on how to generate the perturbations themselves. Instead, that
functionality was used only as an example of unforeseen deviations, and to
Figure 6.3 Photographs of a trial of Setup 6.1. The robot was initially
released from the pose in (a), with an offset to the goal pose. In (b), the goal
pose was reached.
Setup 6.1 This setup is visualized in Fig. 6.3. Prior to the experiment, a
test DMP that did not perform any particular task was executed, and the
robot then converged to the goal pose, i.e., to y = yc = g and dac = dc = 0.
Thereafter, the operator pushed the end-effector, so that the actual pose
deviated from the coupled and goal poses. The experiment was initialized
when the operator released the robot arm. The purpose of this procedure
was to examine the stability of the subsystem in (6.11a) to (6.11c). A total
of 100 perturbations were conducted.
Setup 6.2 See Fig. 6.4. The task of the robot was to reach a work object
(in this case a Gore-Tex graft used in cardiac and vascular surgery) from
its home position. A DMP defined for this purpose was executed, and the
operator introduced two perturbations during the robot movement. The
purpose of this setup was to investigate the stability of the entire control
system in (6.11). A total of 10 trials were conducted.
Setup 6.3 See Fig. 6.5. The task of the robot was to hand over the work
object from its right arm to its left. The movement was specifically designed
to require an end-effector rotation angle of more than π , thus requiring both
the upper and the lower halves of the quaternion hypersphere (see Fig. 6.9),
and not only one of the halves which is sometimes used [LaValle, 2006].
Such movements motivate the search for the largest possible contractible
subset of H in Sec. 6.4. Similar to Setup 6.2, the purpose was to investigate
the stability of (6.11), and 10 trials were conducted.
Figure 6.4 Photographs of a trial of Setup 6.2. The DMP was executed
from the home position (a), and was perturbed twice on its way toward the
goal (b). It recovered from these perturbations (c), and reached the goal at
the work object (d).
6.7 Results
Figures 6.6 to 6.9 display data from the experiments. Figure 6.6 shows the
magnitude of the states during a trial of Setup 6.1, and it can be seen
that each state converged to 0 after the robot had been released. This is
consistent with Theorem 6.4 in Sec. 6.A. Similarly, Figs. 6.7 and 6.8 show
data from Setups 6.2 and 6.3 respectively, and it can be seen that the robot
recovered from each of the perturbations. Further, each state subsequently
converged to 0. This is consistent with Theorem 6.5 in Sec. 6.A. All trials in
a given setup gave similar results. Further, these results suggest that the
control system (6.11) is exponentially stable.
Figure 6.9 shows orientation data from Setup 6.2 (left) and Setup 6.3
(right). The upper plots show quaternions for the demonstrated paths, qd,
determined using lead-through programming prior to the experimental
trials, relative to the goal quaternions qg. The middle plots show coupled
orientations qc relative to qg. It can be seen that the paths of qd and qc
were similar for each of the setups, which was expected given a sufficient
Figure 6.5 Photographs of a trial of Setup 6.3. The robot started its move-
ment from the configuration in (a). The end-effector was rotated as indicated
by the red arrows, which resulted in a rotation larger than π from start
to goal. The robot was perturbed twice by the operator (b), recovered and
continued its movement (c), and accomplished the handover (d).
number of DMP basis functions. The perturbations can be seen in the bot-
tom plots, which show q a relative to q c . Though q a q̄ c was very close to the
identity quaternion for most of the time, it deviated significantly twice per
trial as a result of the perturbations. Setup 6.3 is an example of a movement
where it would not be possible to restrict the quaternions to the upper half
sphere, without introducing discontinuities. This is shown in Figure 6.9, as
quaternions were present not only on the upper half sphere, but also on the
lower, for Setup 6.3.
6.8 Discussion
In each of the experiments, the robot recovered from the perturbations
and subsequently reached the goal pose, which was the desired behavior.
Because each state converged to 0, we conclude that the experimental results
verify Theorem 6.4 and Theorem 6.5; see Sec. 6.A. Further, the behavior
[Figure 6.6 plot, extraction residue removed: ‖ξi‖ for ‖y − yc‖ [m], ‖ẏ − ẏc‖ [m/s], ‖dac‖ [rad], ‖ωa − ωc‖ [rad/s], and ‖e‖ [1], versus time [s].]
Figure 6.6 Data from a trial of Setup 6.1. The notation ‖ · ‖ represents
the 2-norm, and the unit symbol [1] indicates a dimensionless quantity. The
experiment was initialized with some position error y − yc and orientation
error d ac . The operator released the robot at t = 0. It can be seen that each
state converged to 0.
[Figure 6.7 plots, extraction residue removed: upper panel shows ‖ξi‖ for ‖y − yc‖ [m], ‖ẏ − ẏc‖ [m/s], ‖dac‖ [rad], ‖ωa − ωc‖ [rad/s], and ‖e‖ [1]; lower panel shows x [1], ‖yc − g‖ [m], ‖z‖ [m/s], ‖dc‖ [rad], and ‖ωz‖ [rad/s]; both versus time [s].]
Figure 6.7 Data from a trial of Setup 6.2. Consider first the upper plot.
The two perturbations are clearly visible, and these were recovered from as
the states converged to 0. In the lower plot, it can be seen that the time
evolution of the states was slowed down in the presence of perturbations. It
can further be seen that each of the states converged to 0.
[Figure 6.8 plots, extraction residue removed: upper panel shows ‖ξi‖ for ‖y − yc‖ [m], ‖ẏ − ẏc‖ [m/s], ‖dac‖ [rad], ‖ωa − ωc‖ [rad/s], and ‖e‖ [1]; lower panel shows x [1], ‖yc − g‖ [m], ‖z‖ [m/s], ‖dc‖ [rad], and ‖ωz‖ [rad/s]; both versus time [s].]
Figure 6.8 Data from a trial of Setup 6.3. The organization is the same as
in Fig. 6.7, and similar conclusions can be drawn. In addition, the required
rotation angle from start to goal was larger than π in this setup, which
corresponds to ‖dc‖ being larger than π initially.
[Figure 6.9 plots, extraction residue removed: projections of the quaternions qd q̄g (top), qc q̄g (middle), and qa q̄c (bottom), with start points and the identity quaternion marked; vertical axes show Re[q].]
Figure 6.9 Orientation data from Setup 6.2 (left) and Setup 6.3 (right).
Quaternions have been projected onto S2 for the purpose of visualization.
Vertical axes represent quaternion real parts, and horizontal axes represent
the first two imaginary elements with magnitudes adjusted to yield unit
length of the resulting projection. The bottom plots show the quaternion set
seen from above, and hence their real axes are directed out from the figure.
6.9 Conclusion
In this chapter, it was first shown that the unit quaternion set minus one
point is contractible. A contractible state space allows for continuous and
exponentially stable control systems. Thereafter, a control algorithm for
DMPs with temporal coupling in Cartesian space was designed. The pro-
posed DMP functionality was verified experimentally on an industrial robot.
Further, exponential stability was proven mathematically for an approxi-
mation of the proposed control system. A video that shows the experiments
is available in [Karlsson, 2019].
τa = τ (1 + kc eᵀ e)                                                 (6.13d)
τa ẋ = −αx x                                                         (6.13e)
τa ẏc = z                                                            (6.13f)
τa ż = α(β(g − yc) − z) + fp(x)                                      (6.13g)
τa ωc = ωz                                                           (6.13h)
τa ω̇z = α(β(−dc) − ωz) + fo(x)                                       (6.13i)
6.A Stability Analysis
is contracting. □
ξ1(t + h) = Φ1 ξ1(t)                                                 (6.17)
where ξ1 is given by
ξ1 = [ (y − yc)ᵀ  (ẏ − ẏc)ᵀ  dacᵀ  (ωa − ωc)ᵀ  epᵀ  eoᵀ ]ᵀ            (6.18)
Further, denote by I the identity matrix and by 0 the zero matrix, of due
sizes. The system matrix is given by
Φ1 = [  I          hI            0          0             0            0
       −hkp I     (1 − hkv) I    0          0             0            0
        0          0             I          hI            0            0
        0          0            −hkp I     (1 − hkv) I    0            0
        hαe I      0             0          0            (1 − hαe) I   0
        0          0             hαe I      0             0           (1 − hαe) I ]
                                                                     (6.19)
Theorem 6.4
The dynamical system defined by (6.17) for Cartesian DMP operation is a
contraction and has ξ1 = 0 as fixed point. □
λ1,...,2m = 1 − hkv/2                                                (6.20a)
λ2m+1,...,3m = 1 − hαe                                               (6.20b)
Each eigenvalue is strictly within the unit circle since h, k v , α e > 0, and the
system is therefore globally exponentially stable. Because it is also a linear
time-invariant system, contraction can be concluded. □
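The eigenvalue claim (6.20) can be spot-checked numerically by assembling Φ1 from (6.19) with kp = kv²/4 (a sketch with arbitrary positive constants, not part of the thesis; each state component is taken as 3-dimensional here, so the kv-type eigenvalue appears 4·3 = 12 times and the αe-type 2·3 = 6 times, matching 2m and m for m = 6):

```python
import numpy as np

h, kv, alpha_e, dim = 0.004, 10.0, 20.0, 3   # arbitrary positive constants
kp = kv ** 2 / 4.0                           # critical damping, kp = kv^2/4
I, O = np.eye(dim), np.zeros((dim, dim))

Phi1 = np.block([
    [I,               h * I,             O, O,              O, O],
    [-h * kp * I,     (1 - h * kv) * I,  O, O,              O, O],
    [O, O, I,               h * I,                          O, O],
    [O, O, -h * kp * I,     (1 - h * kv) * I,               O, O],
    [h * alpha_e * I, O, O, O, (1 - h * alpha_e) * I,       O],
    [O, O, h * alpha_e * I, O, O, (1 - h * alpha_e) * I],
])

eig = np.linalg.eigvals(Phi1)
expected = np.sort(np.concatenate([np.full(4 * dim, 1 - h * kv / 2),
                                   np.full(2 * dim, 1 - h * alpha_e)]))
```

The same structure verifies Φ2 in (6.25) with β = α/4, whose double eigenvalue 1 − hα/(2τ) arises from the identical critically damped 2×2 block pattern.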
ξ1(t + h) = Φ1 ξ1(t)                                                 (6.21a)
x(t + h) = (1 − h αx/τa) x(t)                                        (6.21b)
is contracting and has the origin as fixed point. □
103
Chapter 6. Temporally Coupled DMPs in 3D Orientation Space
so that ξ = [ξ1ᵀ x ξ2ᵀ]ᵀ, and
Φ2 = [  I             (h/τ) I         0              0
       −(hαβ/τ) I    (1 − hα/τ) I     0              0
        0             0               I              (h/τ) I
        0             0              −(hαβ/τ) I     (1 − hα/τ) I ]
                                                                     (6.25)
The eigenvalues of Φ2 are
λ1,...,2m = 1 − hα/(2τ)                                              (6.26)
and because they are all strictly within the unit circle since h, α, τ > 0,
(6.23) is contracting. It now follows from Proposition 6.1 that the hierarchy
(6.15) is contracting. Hence, it converges globally exponentially to a single
trajectory. Since one solution to (6.15) is ξ = 0, any trajectory must converge
exponentially to this point. □
7
Detection and Control of Contact-Force Transients
7.1 Introduction
Robot programming is traditionally done using position-based control. Se-
quences of movements are specified with details such as velocities, target
positions, etc. Robot positions in free space are uncomplicated to visualize
and simulate, for instance with software tools such as rviz [ROS, 2018] and
Gazebo [Gazebo, 2018], and successful movement can be verified by com-
paring measured position with target position. Control in the contact-force
domain is also possible [Olsson et al., 2002; García et al., 2006], but this is
a more difficult control problem and forces are only indirectly visible [Jo-
hansson et al., 2015]. Nevertheless, many applications rely on force control.
In friction-stir welding, for instance, a rotating tool is pushed against work
pieces with a specified contact force to generate heat from friction [Cook
et al., 2004; De Backer, 2014; De Backer and Bolmsjö, 2014; Bagge Carl-
son et al., 2016; Karlsson et al., 2016]. In lead-through programming, motor
torques are commanded to help the programmer in overcoming joint friction
[Stolt et al., 2015a].
Whereas the previous chapters focused on algorithms for motion control,
the aim of this chapter is to use contact forces for validation of manipulation
subtasks. In robotic manipulation tasks, such as assembly tasks, it might
not be possible to validate subtasks based on robot position. One reason
is that the position tolerances are typically lower than the position uncer-
tainties. The uncertainties are introduced both by positioning of the robot
end-effector, which can be done within 1 mm on a typical industrial robot,
and of the work objects. Further, early detection is usually desirable, and in
some cases force data may indicate completion earlier than position data.
Force measurements were used to validate subtasks in robotic assembly in,
e.g., [Stolt et al., 2011; Stolt et al., 2012b; Linderoth, 2013; Stolt, 2015].
7.2 Problem Formulation
sated for, so that the presented joint torques are of the same magnitude as
on the arm side even though they have been measured on the motor side.
The research presented in this chapter also extends [Stolt et al., 2015b]
by investigating whether a given detection model could generalize to tasks
that involve new objects, from which no data have been used to deter-
mine the model. In order to investigate whether robot joint torques contain
enough information to distinguish transients, we follow a machine-learning
procedure. First, we gather data to determine a recurrent neural network
(RNN), which is an artificial neural network specialized in processing se-
quential data [Graves, 2012; Goodfellow et al., 2016], and subsequently we
evaluate the performance of the RNN on new data.
The procedure proposed does not rely on any assumption of what tasks
or work objects are considered, as long as distinguishable joint-torque tran-
sients are generated. In this chapter, we evaluate the procedure on the
manipulation scenarios shown in Figs. 7.1 and 7.2. These scenarios also
serve as examples of how the procedure could be used in practice.
Machine learning for analyzing contact forces in robotic assembly was
also applied in [Rodriguez et al., 2010], where force measurements were
used as input to an SVM, to distinguish between successful and failed
assemblies. A verification system, specialized in snap-fit assembly, was de-
veloped in [Rojas et al., 2012]. Similar to [Stolt et al., 2015b] and [Rodriguez
et al., 2010], a force/torque sensor was assumed in [Rojas et al., 2012]. Such
a requirement has been avoided in some previous research, by using in-
ternal robot sensors instead. For instance, a method to estimate contact
forces from joint torques was presented in [Linderoth et al., 2013]. Further,
force-controlled assembly without a force sensor was achieved in [Stolt et
al., 2012b], by estimating contact forces from position errors in the internal
controller of the robot.
Figure 7.1 An ABB YuMi robot [ABB Robotics, 2018] was used in the
experiments (a). The experimental setup for the switch assembly task is
shown in (b), (c), and (d). The robot gripper, box, and switch are shown
in (b). The robot grasped the switch, and attached it to the box by pushing
downwards. The downward motion began in (c), where the switch was not yet
snapped into place. It ended when the assembly was completed, in (d). These
photos were taken during the experimental evaluation (see Sec. 7.4), and the
same setup was used for gathering training and test data (see Sec. 7.3).
Table 7.1 Notation used in this chapter.

Notation              Description
T ∈ Z+                Number of samples in torque sequence
n pre ∈ Z+            Number of samples before transient peak
n post ∈ Z+           Number of samples after transient peak
n ch ∈ Z+             Number of input channels
tanh(·) : Rn → Rn     Element-wise hyperbolic tangent function
τ̄ ∈ Rnch              Robot joint torque
h ∈ Rnch              Activation of hidden RNN neurons
U, W ∈ Rnch×nch       RNN weight matrices
b ∈ Rnch              RNN bias vector
V ∈ R2×nch            RNN weight matrix
c ∈ R2                RNN bias vector
o ∈ R2                RNN output vector
e ∈ R                 Base of the natural logarithm
p̂ ∈ R2                Estimated probability distribution
p ∈ R2                Target label for given torque sequence
L ∈ R                 Loss function
7.3 Method
In this section, the proposed machine-learning procedure to determine a
force/torque transient-detection model from training and test data is de-
tailed. The assembly scenario used to acquire the data consisted of attach-
ing a switch to a box, see Fig. 7.1. The objective for the robot was to move
toward the box while holding the switch, thus pushing the switch against
the box, until it snapped into place. The robot should detect the completion
of the task automatically, stop moving toward the box, and possibly start a
new movement. Table 7.1 lists some of the notation used in this chapter.
Sequence Model
An RNN [Graves, 2012; Goodfellow et al., 2016] was used as a sequence
classifier. This choice is discussed in Sec. 7.6. It had a sequence of joint
torques as input, one single output indicating whether the sequence con-
tained a given transient or not, one hidden layer, and recurrent connec-
tions between its hidden neurons. Each input torque sequence consisted of
T = n pre + 1 + n post time samples, where n pre and n post were determined
as described below.
[Figure 7.3: Computational graph of the RNN. Each input torque sample
τ̄(1), …, τ̄(T) feeds the hidden layer through the weight matrix U, the
hidden states h(1), …, h(T) are connected recurrently through W, the final
hidden state produces the output vector o(T) through V, a softmax yields
the normalized probability p̂, and the weighted cross-entropy loss L is
computed against the target label p.]
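The forward pass of this classifier can be sketched as follows (a minimal NumPy sketch using the notation of Table 7.1; the random weights stand in for trained parameters and are illustrative only):

```python
import numpy as np

def softmax(o):
    """Numerically stable softmax over a vector."""
    e = np.exp(o - np.max(o))
    return e / e.sum()

def rnn_classify(torques, U, W, V, b, c):
    """Forward pass of the single-hidden-layer RNN of Sec. 7.3.

    torques : (T, n_ch) joint-torque sequence.
    h(t) = tanh(U @ tau(t) + W @ h(t-1) + b); only the final hidden
    state feeds the output layer.
    """
    h = np.zeros(U.shape[0])
    for tau in torques:                 # one recurrence step per sample
        h = np.tanh(U @ tau + W @ h + b)
    o = V @ h + c                       # output vector o in R^2
    return softmax(o)                   # p_hat, estimated distribution

# Toy usage: n_ch = 7 joints, T = n_pre + 1 + n_post = 9 samples
rng = np.random.default_rng(0)
n_ch, T = 7, 9
U, W = rng.normal(size=(n_ch, n_ch)), rng.normal(size=(n_ch, n_ch))
V, b, c = rng.normal(size=(2, n_ch)), np.zeros(n_ch), np.zeros(2)
p_hat = rnn_classify(rng.normal(size=(T, n_ch)), U, W, V, b, c)
```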
Model Training
Given the training set, the model parameters U, V, W, b, and c were deter-
mined by minimizing a loss function L. The training set contained many
more negative data points than positive ones. If not taken into account, this
type of class imbalance has been reported to obstruct the training procedure
of several different classifiers. The phenomenon has been described in more
detail in [Japkowicz, 2000; Japkowicz and Stephen, 2002], and should be
taken into account when designing the loss function. Consider first the fol-
lowing loss function L̃, which is the ordinary cross-entropy between training
data and model predictions, averaged over the training examples.
L̃ = −(1/D) ∑_{d=1}^{D} ∑_{a=1}^{A} p_a^d log p̂_a^d        (7.4)
Here, a and d are indices for summing over the vector elements and training
data points, respectively. This cross-entropy is commonly used as a loss
function in machine learning [Rubinstein and Kroese, 2013; Goodfellow et
al., 2016]. Because of the class imbalance in the present training set, it
would be possible to yield a relatively low loss L̃ by simply classifying all
or most of the data points as negative, regardless of the input, even though
that strategy would not be desirable.
In order to take the class imbalance into account, weighted cross-entropy
[Japkowicz and Stephen, 2002; Panchapagesan et al., 2016] was used as loss
function. Denote by r the ratio between negative and positive data points
in the training set, and introduce the weight vector wr = [r 1]T . The loss
function was defined as
L = −(1/D) ∑_{d=1}^{D} ∑_{a=1}^{A} p_a^d log p̂_a^d · (w_r^T y^d)        (7.5)
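A minimal sketch of the weighted cross-entropy (7.5), assuming the targets p are one-hot and y^d = p^d (an assumption consistent with the factor w_r^T y^d evaluating to r for positive points and 1 for negative ones):

```python
import numpy as np

def weighted_cross_entropy(p_hat, p, r):
    """Weighted cross-entropy of Eq. (7.5).

    p_hat : (D, 2) predicted distributions
    p     : (D, 2) one-hot targets, column 0 = positive class
    r     : ratio of negative to positive training points
    """
    w_r = np.array([r, 1.0])
    weights = p @ w_r                       # r for positives, 1 for negatives
    ce = -(p * np.log(p_hat)).sum(axis=1)   # per-example cross-entropy
    return np.mean(ce * weights)
```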
The values of n pre and n post were determined using both the training
set and the test set as follows. All positive data points available were used,
and r = 20 times as many negative data points. Starting with n pre =
n post = 1, the model was trained using the training set, and its performance
was measured using the test set. Subsequently, both n pre and n post were
increased by 1, and the training and evaluation procedure was repeated.
This continued until perfect classification was achieved, or until the values
of n pre and n post were large; 30 was chosen as an upper limit, though it
was never reached in the experiments presented here. Thereafter, n pre was
kept constant, and it was investigated how much n post could be lowered with
retained performance. This was done by decreasing n post one step at a time,
while repeating the training and evaluation procedure for each value. Once
the performance decreased, the value just above that was chosen for n post.
This way, the lowest possible value of n post that retained performance was
found.
Once n pre and n post were determined, new model parameters were ob-
tained by training on a larger data set, with r = 100. A lower value was
used for the other iterations because training on such a large data set took
significantly longer.
Because of the class imbalance in the test set, ordinary classification
accuracy, defined as the number of correctly classified test data points
divided by the total number of test data points, would not be a good
measure of model performance. Instead, the F-measure [Hripcsak and
Rothschild, 2005] was used, defined as

F1 = 2PR/(P + R)        (7.6)

where P denotes precision and R denotes recall.
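As a check, the counts in the first row of Table 7.2 (TP = 20, FP = 0, FN = 5) give P = 1, R = 0.80, and F1 ≈ 0.89:

```python
def f1_score(tp, fp, fn):
    """F-measure (7.6) computed from confusion-matrix counts."""
    p = tp / (tp + fp)          # precision
    r = tp / (tp + fn)          # recall
    return 2 * p * r / (p + r)

# e.g., f1_score(20, 0, 5) rounds to 0.89, matching Table 7.2
```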
Software Implementation
The RNN in Sec. 7.3 was defined as a computational graph in the Julia pro-
gramming language [Bezanson et al., 2014], using TensorFlow [Abadi et al.,
2016; TensorFlow, 2019] and the wrapper TensorFlow.jl [Malmaud, 2017].
The Adam algorithm [Kingma and Ba, 2014] was used for minimization of
the loss function L.
After training, the RNN model was saved on a server, and loaded into a
Julia program on a PC, which communicated with the internal controller of
the robot as described in Chapter 2. The sample frequency was 250 Hz, and
hence the sample period was 4 ms. The robot joint torques were logged and
saved, and for each time sample t, a joint-torque sequence was formed by the
samples in [ t − n pre − n post ; t], and sent as input to the RNN. Measurements
after time t were not available at t, which is why the input only contained
samples up until t. Each sequence was classified in real time. The compu-
tation time for one classification was short, well below the sample period.
To move the robot, desired velocity references for the gripper in Cartesian
space were first specified in the Julia program. Then, the corresponding
joint velocities were computed using the robot Jacobian, and these were
sent as references to the internal controller of the robot.
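The real-time detection loop can be sketched as follows (a sketch only; `classify` stands in for the trained RNN, and the window [t − n pre − n post, t] matches the description above):

```python
import numpy as np
from collections import deque

def detect_transients(torque_stream, classify, n_pre=5, n_post=3, db=0.5):
    """Sliding-window transient detection as in Sec. 7.3.

    At each sample t, the window [t - n_pre - n_post, t] is classified;
    detection is flagged the first time p_hat[0] exceeds the decision
    boundary db. `classify` maps a (T,) or (T, n_ch) array to p_hat in R^2.
    """
    T = n_pre + 1 + n_post
    window = deque(maxlen=T)
    for t, tau in enumerate(torque_stream):
        window.append(tau)
        if len(window) == T:
            p_hat = classify(np.array(window))
            if p_hat[0] > db:
                return t            # sample index of detection
    return None                     # no transient detected
```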
7.4 Experiments
Two ABB YuMi robots, shown in Figs. 7.1 and 7.4 and described in Chapter 2,
were used for experimental evaluation. The implementation in Sec. 7.3 was
used for robot control and transient detection. To facilitate understanding
of the experimental setups and results, a video is available on [Karlsson,
2017b]. The same RNN, obtained as described in Sec. 7.3, was used for
detection in each setup. The following three setups were considered.
Setup 7.1 The same robot as used for RNN training was used. Since the
test set had been used to determine the hyperparameters of the RNN, i.e.,
n pre and n post , it was necessary to gather new measurements to evaluate the
general performance. The experimental setup was similar to that in Sec. 7.3,
except that the measured torque sequences were not only saved, but also
classified by the RNN. The robot was
programmed to first move its gripper down, thus pushing the switch against
the box. Once a transient was recognized, it was programmed to stop its
downward motion, and instead move the box to the side. The assembly was
repeated 50 times, to evaluate the robustness of the proposed approach.
The experimental setup is visualized in Fig. 7.1.
Setup 7.2 The same robot as used for RNN training was used for manipu-
lation tasks that involved objects not used during model training, in order
to test its generalizability. Again, the robot was programmed to move its
gripper down until a transient was detected. Once a transient was detected,
it was programmed to move its gripper upwards. Three tasks were consid-
ered: snapping a battery cover into place on a pocket calculator, closing a
spectacle case, and pushing a power switch on an extension cord. The setup
for each task is shown in Fig. 7.2. For each of these tasks, 50 attempts were
made for validation.
Setup 7.3 A different robot individual, of the same model as that used for
RNN training, was used for manipulation of a power switch. The setup is
shown in Fig. 7.4. These experiments were done at ABB Corporate Research
in Västerås, Sweden. The RNN was downloaded to a local PC and used for
the detection. A total of 5 trials were done.
7.5 Results
The performance of the RNN on the test set, for different values of the
hyperparameters, is shown in Table 7.2. The abbreviations are as follows:
number of true positives (TP), true negatives (TN), false positives (FP),
and false negatives (FN). The hyperparameters were increased until n pre =
Figure 7.4 Experimental setup at ABB Corporate Research in Västerås,
Sweden (Setup 7.3). The YuMi robot was different from the one used to
acquire training and test data. The task began in (a) and finished in (b).
n post = 5, for which perfect classification was obtained. Then, n post was
decreased until the performance decreased at n post = 2. With this value of
n post , larger values of n pre were tested (see second last row in Table 7.2),
which did not yield perfect classification for any value, i.e., F1 < 1. Thus,
(npre , npost ) = (5, 3) was chosen for the final model.
After training, the RNN detected all 50 snap-fits in Setup 7.1 correctly,
without any false positives prior to the snap. The torque data and RNN
output from one of the trials are shown in Figs. 7.5 and 7.6. The other trials
gave qualitatively similar results. The RNN also detected each transient
Table 7.2 RNN performance on the test set, for different values of the
hyperparameters. The row with the lowest value of n post that yielded perfect
classification is marked in blue. With these values, the model was trained
and tested again, but now with more negative data points (see last row,
marked in red).
n pre        n post   TP   TN     FP   FN   P      R      F1
1            1        20   500    0    5    1      0.80   0.89
2            2        22   500    0    3    1      0.88   0.94
3            3        23   498    2    2    0.92   0.92   0.92
4            4        23   500    0    2    1      0.92   0.96
5            5        25   500    0    0    1      1      1
5            4        25   500    0    0    1      1      1
5            3        25   500    0    0    1      1      1
5            2        22   500    0    3    1      0.88   0.94
6, 7, …, 30  2        –    –      –    –    –      –      <1
5            3        25   2500   0    0    1      1      1
Figure 7.5 Data from Setup 7.1. The robot joint torques (upper plot) were
used as input for the RNN. The torque of joint i is represented by τ̄i . The
first element of the RNN output (p̂ 1 in the lower plot) was close to 0 before
the assembly was completed, and increased to close to 1 once the task was
finished. Completion was indicated when p̂ 1 was above the decision boundary
(DB, at 0.5) for the first time. Thus, for detection purposes, the RNN output
generated after this event was not relevant.
successfully, without any false positives, for each trial in Setup 7.2. Data
from one trial of each task are shown in Fig. 7.7, and the other trials gave
qualitatively similar results. Similarly, successful detection was achieved
for the 5 trials on the robot that had not been used for training (Setup 7.3).
Data from one of these trials are shown in Fig. 7.8.
7.6 Discussion
There are several alternatives to RNNs for classification. Using an RNN
is motivated as follows. Two less complicated models, a matched filter and
logistic regression, were tried initially without achieving satisfactory
performance on test data. A linear model, an SVM, and learned force thresholds
for transient detection have been evaluated in [Stolt et al., 2015b]. RNNs
are specialized in processing sequential data, and the torque measurements
used in this work were sequential. Two properties of the RNN contribute to
lower its complexity, as compared to a standard artificial neural network.
These consist of parameter sharing, which means that the same parameters
are used in several connections, and sparsity, which means that only some
of the neurons are connected to each other. This can be seen in Fig. 7.3.
Thanks to the lower complexity, it is possible to estimate a model with
significantly fewer training examples than would be needed otherwise.

Figure 7.6 Same data as in Fig. 7.5, but zoomed in on the time when the
task finished. The transient was detected at time t = 6.308 s. The first torque
sequence to be classified as positive was that within the vertical dashed lines
in the upper plot. Figure 7.5 might give the impression that p̂ 1 increased
from 0 to 1 instantly upon detection. In this figure, however, it can be seen
that the confidence was actually built up during a couple of sample periods,
which was also the case for the remaining trials.
Compared to models that are not specialized in sequential data, e.g., logit
models, SVMs, and ordinary neural networks, the RNN is less sensitive to
variations of the exact time step in which some information in the input
sequence appears. An RNN can also be generalized to classify data points
of sequence lengths not present in the training set, though this was not
used in this present work. General properties of RNNs are well described
in [Goodfellow et al., 2016].
The concept of RNNs is well known from previous research [Graves,
2012; Goodfellow et al., 2016], and hence it should not be seen as a con-
tribution from this chapter. Instead we have used it to evaluate whether
transients in robot motor torques could be used for validation of manipu-
lation tasks, which in turn is the main purpose of this chapter. The RNN
architecture described in Sec. 7.3 serves as an example of how joint torques
could be used for validation in practice.
Figure 7.7 Experimental data from the manipulation tasks in Setup 7.2:
snapping the battery cover into place on the pocket calculator (upper), closing
the spectacle case (lower left), and pushing the power switch on the extension
cord (lower right). The organization of each plot is the same as in Figs. 7.5
and 7.6.
Figure 7.8 Experimental data from Setup 7.3, where a power switch on
an extension cord was pushed by a YuMi robot not used for RNN training.
Detection occurred at time t ≈ 1 s.
As compared to [Stolt et al., 2015b], the approach proposed here had four
new benefits. A force sensor was no longer required, the detection delay was
reduced, the computation time for model training was shortened, and the
confidence level for detection could be set through one parameter. In [Stolt
et al., 2015b], n post > 10 (corresponding to > 40 ms) was required for perfect
classification, whereas our proposed method required n post = 3 (12 ms). The
training time of the RNN was on the order of minutes on an ordinary PC,
an improvement compared to days in [Stolt et al., 2015b]. The
decision boundary for p̂ 1 in Sec. 7.3 corresponds to a desired confidence level
for detection. It was set to 0.5 in Sec. 7.3, meaning that a task was classified
as completed when the estimated probability of completion was above 50 %.
However, the meaning of p̂ 1 is intuitive, and the decision boundary can
easily be adjusted to a desired level of confidence. For example, it might be
worse to fail to detect a force transient and risk damaging the hardware
than to finish a subtask too early and possibly detect the mistake later on.
Such a cost asymmetry would motivate a lower decision boundary. In some
scenarios, however, backup detectors may confirm completion reliably but
late, and it might then be more important to avoid false positives. This was
the case in the experimental scenario in [Stolt et al., 2015b]. The ability to
adjust the confidence level according to specific circumstances is valuable.
In some approaches, the confidence level is acquired by weighting false
positive and false negative points differently in the cost function used in the
model training, see, e.g., [Stolt et al., 2015b]. This requires a new training
and validation procedure to adjust the confidence level, which is avoided
with the method proposed here.
Further, this research extended [Stolt et al., 2015b; Karlsson, 2017e]
by verifying generalization of the classification model to manipulation of
new objects that had not been used to generate training and test data. This
generalization is expected to be limited to similar tasks. Therefore, the idea
is that the training procedure described in Sec. 7.3 should be repeated
whenever a detection model for a completely different task is required.
Given a certain contact force/torque acting on the end-effector of the
robot, the corresponding motor torques depend on the configuration, as
well as gravity and friction between arm side and motor side of each joint.
It is expected that different robot individuals of the same model would
experience slightly different motor torques for a given task, because of small
uncertainties in the robot manufacturing and different wear and tear. In
this chapter, it has been shown that generalization of the RNN to a new
robot individual is possible. This fact was also used during the final review
of the SARAFun project [SARAFun, 2019], where Setup 7.3 was used to
demonstrate the proposed detection algorithm.
However, changing the robot model or configuration would change the
joint torques significantly, and generalization cannot be expected with the
current implementation. Figure 7.9 shows the success rate for the setup in
Fig. 7.4, but where the task was carried out at different distances from the
position used during training. As expected, the success rate deteriorated
with increased distance. The failures consisted both of false detections and
undetected transients. Nevertheless, already this approach allowed for some
uncertainty in the work-space configuration, on the order of 0.1 m. Even
though generalization to the specific configurations tested in Fig. 7.9 could
be achieved by first including those configurations in the training proce-
dure, it would not be a feasible strategy for generalization to the entire
work space. The reason is that this would require more training examples
than one could provide in reasonable time.
To enable generalization also to new configurations and robot models, it
would be a good idea to first model and compensate for gravity and friction-
induced motor torques, then estimate the external contact forces/torques,
and subsequently use these as input for the detection model. Furthermore,
if external force/torque sensors are available in a given setup, it would be
straightforward to include those measurements in this approach. Gener-
alization to new robots and configurations would then easily be achieved.
Motor torques could still be relevant for validation of the external measure-
ments.
[Figure 7.9: Success rate for the setup in Fig. 7.4 as a function of the
distance (0 to 0.3 m) from the position used during training.]
In this architecture, task completion was concluded the first time p̂ 1 was
greater than the decision boundary. Therefore, the values of p̂ 1 after that
event were not relevant for detection purposes, and torque measurements
after any transient were not used as training or test data. Further, the
torques after task completion depend not only on the interaction forces, but
also on the commanded behavior of the robot.
The concept of weighted loss to compensate for class imbalance in ma-
chine learning has been evaluated in [Japkowicz and Stephen, 2002], and
successfully applied to a deep neural network in [Panchapagesan et al.,
2016]. The loss function (7.5) extends (7.4) by the factor wTr yd , which evalu-
ates to r for positive data points, and to 1 for negative ones.
In Fig. 7.5, it should be noted that p̂ 1 rose significantly above 0 at
t ≈ 4.2 s, even though the task had not yet finished at that time. Although
the values were still well below the decision boundary, this leaves room for
improvement in terms of the reliability of the detection.
In a complete setup, the validation procedure described here could
be combined with more sophisticated motion controllers, such as DMPs.
Whereas the evaluation of the model was done in real time in this work,
the RNN training was performed offline. A complete product should have a
user interface allowing an operator to gather training data and test data,
label them, and run the training procedure. This would shorten the time
required for acquisition of training and test data, which was approximately
one working day with the current arrangement. It would also be valuable if
7.7 Conclusion
In this research, we have addressed the question of whether robot joint-
torque measurements could be used for detection of force/torque transients
in robotic manipulation. We have shown that such detection is possible,
which is the main contribution of this chapter. In contrast, the concept of
RNNs is well known from previous research, and was used in this chapter
for the purpose of evaluation, as well as to exemplify how joint torques could
be used for detection in a practical setup. First, training and test data were
gathered and labeled. Then, these were used to determine the parameters of
the detection model. Finally, a real-time application for transient detection
was implemented and tested, both on the assembly task used to generate
training and test data, and on new tasks and a new robot that had not
been used to determine any model parameters. The method presented seems
promising, since the resulting model had high performance, both on the test
data and during the experiments. A video that shows the functionality is
available on [Karlsson, 2017b].
8
A Dual-Arm Haptic Interface for Task Demonstration
8.1 Introduction
Whereas purely position-based programs can be mediated to robots through
ordinary lead-through programming, tasks that involve contact forces be-
tween a robot and its surroundings are not as straightforward to demon-
strate. Ordinary lead-through programming, using only one robot arm,
yields forces between the operator and the robot which in general cannot
be distinguished from the task-specific forces. In this chapter, an immersive
haptic interface for task demonstration is presented and evaluated. The
interface allows for an operator to act and sense remotely, thereby demon-
strating desired movements and interaction forces. The interface consists of
two robot arms, one of which can be moved directly by an operator through
physical contact, and one that moves accordingly, while contact forces are
reflected to either arm as haptic feedback. A control law enforces a constant
pose offset in Cartesian space between the end-effectors of the robot arms.
We refer to the robot arm in direct contact with the operator as the master
side, and to the other robot arm as the slave side. However, the presented
algorithm is symmetric with respect to the robot arms, i.e., either arm could
be used as master or slave without any specification. The concept is visu-
alized in Fig. 8.1 and in a video available online [Ghazaei Ardakani et al.,
2019].
Haptic feedback for robotic teleoperation [Hokayem and Spong, 2006]
and for virtual reality [Constantinescu, 2008] has become an important
research topic. Four-channel haptic systems represent the most general
form [Aliaga et al., 2004], where position and force data are transferred
between the master and slave sides in both directions. In addition to a
Chapter 8. A Dual-Arm Haptic Interface for Task Demonstration
Figure 8.1 Example use case of the proposed interface. An operator demon-
strates a peg-in-hole task by moving the left robot arm through physical
contact. The right robot arm follows the demonstrated movement in real
time.
haptic system, visual feedback from the slave side to the master side is
typically necessary. In the arrangements considered in this chapter, the
slave side can be seen directly by the operator, see Fig. 8.1. If this is not the
case, cameras on the slave side are commonly used. A teleoperated interface
is transparent if movement and contact forces are reflected to either side
through the interface, so that the operator has the feeling of interacting
with the remote environment.
There are several factors that limit the transparency. In [Lawrence,
1993], it has been shown that transparency and stability are conflicting fea-
tures. Delays are detrimental both to stability and transparency [Anderson
and Spong, 1989; Lee and Spong, 2006; Hatanaka et al., 2015]. Light-weight
collaborative robots may be used as haptic devices without external sensors.
Joint friction should then be compensated for, and uncertainties in the fric-
tion model would limit the transparency. In this chapter, we describe how
measurements of external forces can be included in the control law, but
in the experiments we consider a worst-case scenario where external sen-
sors are not available. Transparency is further limited by the structures
of the master and slave devices. The common work space is limited to
the configurations reachable by both devices simultaneously. At the work-
space boundary, at least one device has reduced manipulability because of
a singular configuration or limits of reach or joint angles. It is desirable to
provide feedback regarding these internal limitations, though this contra-
dicts transparency with respect to the environment. The operator typically
expects force feedback both from limitations of the robotic system, and from
interaction with the environment. The proposed algorithm supports deploy-
8.2 Problem Formulation
task demonstration, where one robot arm can be moved directly by the
operator, and the other robot arm moves accordingly. Because interaction
with the operator and with the work objects have been separated to dif-
ferent robot arms, demonstrated forces between robot and work objects are
easily distinguished. In the context of task demonstration through a haptic
interface, transparency is an important aspect. Further, it is important to
consider stability, especially during physical contact with a stiff environ-
ment. Therefore, this chapter also aims to evaluate the movement and force
tracking between the two robot arms, as well as the stability of the proposed
system.
Table 8.1 Notation used in this chapter. The subscripts 1 and 2 are used
to distinguish each robot arm. The subscript i indicates one arbitrary arm.

Notation        Description
n i ∈ Z+        Number of joints on arm i
n ∈ Z+          n 1 + n 2
q ∈ Rn          Joint positions
p i ∈ R3        Cartesian position
R i ∈ SO(3)     Cartesian orientation
J i ∈ R6×ni     Robot Jacobian
G ∈ R6×n        [−J1 J2]
M ∈ Rn×n        Mass matrix
C ∈ Rn×n        Centripetal and Coriolis matrix
Qe ∈ Rn         External torques
Qm ∈ Rn         Motor torques scaled to arm side
Qvc ∈ Rn        Torques due to virtual constraint
λ ∈ R6          Forces due to virtual constraint
e ∈ R6          Pose deviation from offset constraint
ẏ ∈ R6          Difference between end-effector velocities
Γ ∈ R6×6        G M−1 GT
x, y, z ∈ R     Cartesian coordinates, see Fig. 8.3
8.4 Method
The control law for the haptic interface implemented in this chapter was
first described in [Ghazaei Ardakani, 2016], and is briefly described in this
section for convenience. The main idea is to virtually constrain the two
end-effectors to have a constant pose offset, as if they were physically and
rigidly attached to each other, and subsequently design a control law that
realizes this constraint. Table 8.1 lists some of the notation used in this
chapter.
Virtual Constraint
Denote by n 1 and n 2 the number of joints on the first and second arm,
respectively, and let n = n 1 + n 2 represent the total number of joints.
Throughout this chapter, we will continue to use the subscripts 1 and 2
to associate quantities with each of the two arms. Further, let q 1 ∈ Rn1
and q 2 ∈ Rn2 represent joint positions, and introduce q = [qT1 qT2 ]T. The
virtual constraint is defined by

p 2 − p 1 = ∆p        (8.1a)
RT1 R 2 = ∆R        (8.1b)
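The deviation from the constant-offset constraint (8.1) can be sketched as follows; the rotation-vector parametrization of the orientation error is an assumption made for illustration, as the exact parametrization used for e is not restated here:

```python
import numpy as np

def constraint_deviation(p1, R1, p2, R2, dp, dR):
    """Deviation e in R^6 from the virtual constraint (8.1).

    Position part: (p2 - p1) - dp.
    Orientation part: rotation vector of dR^T (R1^T R2), i.e., the
    residual rotation (small-angle log-map sketch; an assumption).
    """
    e_pos = (p2 - p1) - dp
    R_err = dR.T @ (R1.T @ R2)
    # rotation vector from the skew-symmetric part of R_err
    e_rot = 0.5 * np.array([R_err[2, 1] - R_err[1, 2],
                            R_err[0, 2] - R_err[2, 0],
                            R_err[1, 0] - R_err[0, 1]])
    return np.concatenate([e_pos, e_rot])
```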
Control Algorithm
The equations of motion for a rigid-body model of each robot arm i can be
written as [Spong et al., 2006]

Mi(q i)q̈ i + C i(q i, q̇ i)q̇ i + µi q̇ i = Qei + Qm i        (8.2)

where torques induced by Coulomb friction and gravity have been omitted
since we compensate for these separately. Here, Mi (q i ) is the mass matrix,
C i (q i , q̇ i ) represents the effect of centripetal and Coriolis forces, Qei rep-
resents external torques, Qm i represents motor torques scaled to the arm
side, and µi represents coefficients for viscous friction. Denote by Qvc motor
torques that enforce the virtual constraint. Our aim is to use these as con-
trol signals, so that Qm = Qvc . The equations of motion of the constrained
system can be derived from (8.2) and (8.3) as
where
Qvc = GT λ (8.6)
By introducing the state vector ξ = [qT q̇T ]T and replacing λ with u, (8.4)
can be written as
ξ̇ = [q̇; M−1(q)(−C(q, q̇)q̇ + Qe − µq̇ + GT(q)u)]        (8.8a)
ẏ = G(q)q̇        (8.8b)
This solves the servo problem. To also solve the regulation problem, i.e.,
to drive the physical offset to the reference in case of any deviation, we
introduce the state feedback
λ = u = u∗ − Γ−1(K d ẏ + K p e)        (8.10)

which yields the closed-loop error dynamics

ÿ + K d ẏ + K p e = 0        (8.11)
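A sketch of evaluating the feedback law (8.10), with Γ = G M⁻¹ Gᵀ as in Table 8.1; u* stands for the feedforward term from the preceding derivation and is supplied by the caller:

```python
import numpy as np

def constraint_force(u_star, G, M, ydot, e, Kd, Kp):
    """Constraint force lambda of Eq. (8.10).

    Returns (lambda, Q_vc) with Q_vc = G^T lambda as in Eq. (8.6).
    """
    Gamma = G @ np.linalg.solve(M, G.T)      # Gamma = G M^{-1} G^T
    lam = u_star - np.linalg.solve(Gamma, Kd @ ydot + Kp @ e)
    return lam, G.T @ lam
```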
Figure 8.2 Control architecture. Estimated external torques Q̂e and the
states of the robots are inputs to the controller for the calculation of Qvc ,
the torques from the virtual constraint. The manipulators are gravity and
Coulomb-friction compensated.
Qm = Qvc = GT λ (8.12)
8.5 Experiments
For the experiments, the ABB YuMi prototype was used. The control algorithm in Sec. 8.4 was implemented on a PC that communicated with the internal controller of the robot. According to Fig. 8.2, the sum of the generalized forces from the virtual constraint Q^{vc} and the gravity and friction compensation signals was sent as a torque reference to the internal robot controller. For the friction compensation at zero joint velocities, a dithering technique was used [Capurso et al., 2017]. The worst-case scenario was experimentally evaluated, where no external sensors were assumed and the estimated external forces and torques were set to zero. Each end-effector of the robot was equipped with a force/torque sensor, and these were used for evaluation only. The experimental arrangement is shown in Fig. 8.3. Additionally, a video that shows the setup and the experiments is available online [Ghazaei Ardakani et al., 2019]. Two different setups were considered.
Setup 8.1 The left robot arm was used as master. First, free-space motion
was conducted, in order to evaluate the motion tracking between the two
arms. The right arm was then brought into collision with a concrete block
in front of the robot, in order to evaluate force tracking and stability in
contact with a stiff environment. When in contact, the operator pushed the
end-effector of the master arm in the positive y-direction, defined in Fig. 8.3,
so that the slave arm was pushed against the block in order to excite the
interaction forces.
Figure 8.3 Robot configurations for experimental evaluation. Setup 8.1 is
shown in (a), where the left arm of the robot was used as master, and the
movement was such that the end-effector of the right arm established contact
with the concrete block in front of the robot. Setup 8.2 is shown in (b), where
the right robot arm was used as master, and the left arm was first brought
upwards to its reach limit.
Setup 8.2 The right robot arm was used as master. By moving the master arm, the left arm was first brought to its upper reach limit. This configuration was also a singularity; see Fig. 8.3. Thereafter, it was moved slightly to the side, thus reaching the angle limit of Joint 1, the joint closest to the body. The purpose of this experiment was to evaluate stability properties and interaction forces when the slave arm reached internal physical limitations.
8.6 Results
Data from Setup 8.1 are shown in Fig. 8.4. It can be seen that motion tracking was achieved. The bottom plot compares the forces in the y-direction, as measured by the wrist-mounted force sensors. The force measured on the left arm has been negated for easier comparison, and corresponds to the interaction force experienced by the operator. After 50 s of free-space motion, a contact was established between the right end-effector and the concrete block. It can be seen that force tracking was achieved, meaning that the operator experienced the same contact forces as the slave arm. Moreover, the contact force from the concrete block prevented both arms from moving further in the y-direction, despite the force exerted by the operator. This behavior is the desired result of the virtual constraint.
Figure 8.4 Results from Setup 8.1. For an uncluttered view, only some of the dimensions are shown. The first plot from above shows the positions of the arms, where the position of the left arm has been compensated for with the desired offset between the arms. The second plot shows velocities in the y-direction, and the third plot shows forces in the y-direction. R.p_x represents the right arm's position in the x-direction, and so on.
Data from Setup 8.2 are visualized in Fig. 8.5. The left arm was first
brought to its upper reach limit. Despite a large force exerted upwards on
the master arm by the operator, it can be seen that the master arm was
prevented from moving in the z-direction by the virtual constraint. Joint 1
of the left arm was thereafter brought to its angle limit. Then, further
motion of the arms in the limited direction was prevented despite pushing
the master arm, since the joint limit of the slave arm was being propagated
to the master arm thanks to the virtual constraint.
As expected from an impedance-type controller, stability was retained during free-space motion, during contact with a stiff environment, and when the slave arm reached its internal physical limits.
Figure 8.5 Results from Setup 8.2. The master arm was subjected to large
forces twice; first when the left arm reached a work-space limit in the z-
direction, and thereafter when Joint 1 of the left arm was in its lower angle
limit.
8.7 Discussion
The proposed control law solves the problem of capturing interaction forces from demonstrations based on lead-through programming. The dual-arm lead-through programming is more user-friendly and has fewer side effects than single-arm arrangements, thanks to the separation between contact forces from the operator and the work objects.
A possible drawback of the proposed approach is that the physical dynamical properties of the robots may be undesirable. This can be mitigated by using local feedback to adjust the apparent dynamical properties before applying the proposed controller.
In the current sensorless implementation, the force tracking was mainly limited by uncertainties in the friction model. A simple model was adopted, consisting of Coulomb and viscous friction. In reality, many aspects affect joint friction, such as load, configuration [Wahrburg et al., 2018], and temperature [Bagge Carlson et al., 2015]. Zero-velocity friction is especially difficult to estimate [Stolt et al., 2015a; Capurso et al., 2017]. In the experiments, external sensors were not used for the haptic interface, in order to investigate a worst-case scenario. The algorithm supports integration of external sensors, which would improve the overall performance significantly. The sensorless approach is limited to light-weight collaborative robots, as large industrial robots exhibit more joint friction and usually require external sensing for any lead-through programming.
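The friction model mentioned above can be sketched as follows; the coefficients are illustrative, not identified YuMi parameters. Note that sign(0) = 0, which reflects why friction at zero velocity is ill-defined in this model and motivates the dithering technique used in the experiments.

```python
import numpy as np

def friction_torque(qd, tau_c, mu):
    """Simple per-joint friction model for compensation:
    Coulomb term tau_c * sign(qd) plus viscous term mu * qd."""
    return tau_c * np.sign(qd) + mu * qd

qd = np.array([0.2, -0.1, 0.0])      # joint velocities [rad/s]
tau_c = np.array([0.5, 0.5, 0.5])    # Coulomb levels [Nm] (illustrative)
mu = np.array([0.8, 0.8, 0.8])       # viscous coefficients [Nm s/rad]
tau_f = friction_torque(qd, tau_c, mu)
```

The third joint, at rest, gets zero compensation from this model, illustrating the zero-velocity ambiguity discussed above.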
The prototype YuMi robot was used partly because it had force sensors
for validation, and partly because its research interface was more mature
than that of the product version of YuMi at the time, with predictable delays
and low jitter, which was important for the implementation.
This research has industrial implications, as the robot company Cognibotics AB, Lund, Sweden, aims to commercialize the interface. As a future step, the implementation will be migrated to the product version of YuMi, possibly using another version of the research interface. It will then be an advantage that the product version of YuMi exhibits lower and more predictable joint friction than the prototype.
Using a dual-arm robot for this interface has the benefit that two arms are then naturally available. In such an arrangement, however, manipulation tasks that require both arms leave no arm available as master side. This can be solved by using yet another robot as master. For instance, it would be straightforward to use an entire YuMi robot as master, controlling all 14 degrees of freedom, and another one as slave. Enforcing the same configuration on both robots would then be possible and perhaps more intuitive for an operator, but would not be a requirement for the interface, as it also supports independent configurations of the master and slave.
The control signal in (8.9) results in zero dynamics of the system (8.8). It was found by following the steps described in [Slotine and Li, 1991], i.e., time-differentiating the output until the control signal appears, and subsequently choosing the control signal to cancel nonlinearities and guarantee tracking. Alternatively, the control signal for the zero dynamics could be derived
8.8 Conclusion
In this chapter, an immersive haptic interface for task demonstration using two manipulators has been described. It allows an operator to demonstrate both movements and interaction forces in an intuitive way. The approach is based on introducing virtual constraints between the manipulators, ensuring that the offset between the end-effectors remains constant. Based on this, a control law coupling the motion of two nonlinear robotic arms in task space has been derived. The interface supports using dissimilar arms, mounted arbitrarily and starting in any configurations independently. Interaction forces from the environment, as well as internal limitations such as singularities, are naturally mediated to the operator as haptic feedback. A video illustrating the interface is available online [Ghazaei Ardakani et al., 2019].
8.A Proof of Kinematic Constraint

The rotational constraint (8.1b) can be rewritten as

R_1^T R_2 = ∆R  ⇒  R_2 = R_1 ∆R (8.13)

Time differentiation, using Ṙ = S(ω)R with S(·) the skew-symmetric cross-product matrix, gives

S(ω_2)R_2 = S(ω_1)R_1 ∆R = S(ω_1)R_2 (8.14)
S(ω_2 − ω_1)R_2 = 0 (8.15)
S(ω_2 − ω_1) = 0  ⇒  ω_2 − ω_1 = 0 (8.16)

Together with the time-differentiated position constraint (8.1a), this yields

ṗ_2 − ṗ_1 = 0 (8.17a)
ω_2 − ω_1 = 0 (8.17b)
or, equivalently,
Gq̇ = 0 (8.19)
This completes the proof.
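The argument above can be checked numerically. The sketch below assumes the standard relation Ṙ = S(ω)R and illustrative values for ω_1 and ∆R; recovering ω_2 from S(ω_2) = Ṙ_2 R_2^{-1} returns ω_1, as (8.16) states.

```python
import numpy as np

def S(w):
    """Skew-symmetric matrix such that S(w) @ v = w x v."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

# Illustrative values: angular velocity of arm 1 and a constant offset dR.
w1 = np.array([0.1, -0.2, 0.3])
R1 = np.eye(3)
a = 0.7
dR = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
R2 = R1 @ dR

# With dR constant, R2_dot = R1_dot @ dR = S(w1) @ R1 @ dR, cf. (8.14).
R2_dot = S(w1) @ R1 @ dR
# Recover w2 from S(w2) = R2_dot @ R2^{-1} via the inverse (vee) map.
Sw2 = R2_dot @ np.linalg.inv(R2)
w2 = np.array([Sw2[2, 1], Sw2[0, 2], Sw2[1, 0]])
```

As the proof requires, the two angular velocities coincide whenever the orientation offset is held constant.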
9
Conclusion and Future Research
Bibliography