0% found this document useful (0 votes)
45 views7 pages

Probabilistic Movement Primitive Control Via Control Barrier Functions

The document summarizes a research paper presented at the 2021 IEEE 17th International Conference on Automation Science and Engineering about using control barrier functions to control probabilistic movement primitives (ProMPs) for robots. ProMPs define a distribution of trajectories for robots based on demonstrations, but existing control methods have drawbacks like relying on linear control approaches and sensitivity to initial parameters. The presented approach uses feedback linearization, quadratic programming, and multiple control barrier functions to guide a robot along a trajectory within the ProMP distribution while guaranteeing the robot state never leaves a desired distance from the distribution mean. Simulations show the efficacy of the approach in real time.

Uploaded by

pa.ilbeig
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
45 views7 pages

Probabilistic Movement Primitive Control Via Control Barrier Functions

The document summarizes a research paper presented at the 2021 IEEE 17th International Conference on Automation Science and Engineering about using control barrier functions to control probabilistic movement primitives (ProMPs) for robots. ProMPs define a distribution of trajectories for robots based on demonstrations, but existing control methods have drawbacks like relying on linear control approaches and sensitivity to initial parameters. The presented approach uses feedback linearization, quadratic programming, and multiple control barrier functions to guide a robot along a trajectory within the ProMP distribution while guaranteeing the robot state never leaves a desired distance from the distribution mean. Simulations show the efficacy of the approach in real time.

Uploaded by

pa.ilbeig
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

2021 IEEE 17th International Conference on Automation Science and Engineering (CASE)

August 23-27, 2021. Lyon, France

Probabilistic Movement Primitive Control via Control Barrier Functions


Mohammadreza Davoodi1 , Asif Iqbal1 , Joseph M. Cloud2 , William J. Beksi2 , and Nicholas R. Gans1

Abstract— In this paper, we introduce a novel means of


2021 IEEE 17th International Conference on Automation Science and Engineering (CASE) | 978-1-6654-1873-7/21/$31.00 ©2021 IEEE | DOI: 10.1109/CASE49439.2021.9551540

control design for probabilistic movement primitives (ProMPs).


ProMPs are a powerful tool to define a distribution of trajecto-
ries for robots or other dynamic systems. However, existing
control methods to execute desired motions suffer from a
number of drawbacks such as a reliance on linear control
approaches and sensitivity to initial parameters. We propose
the use of feedback linearization, quadratic programming, and
multiple control barrier functions to guide a system along a
trajectory within the distribution defined by a ProMP, while
guaranteeing that the system state never leaves more than
a desired distance from the distribution mean. This allows
for better performance on nonlinear systems and offers firm
stability and known bounds on the system state. Furthermore,
we highlight how the proposed method may allow a designer
to emphasize certain objectives that are more important than
the others. A series of simulations demonstrate the efficacy of Fig. 1: A set of robot trajectories generated by our demonstration
our approach and show it can run in real time. and control method. The controller guarantees the system never
leaves a neighborhood defined by the training set and provides a
I. I NTRODUCTION simple way to define trajectories that enforce safety constraints.
Research in robot motion planning has provided a vari-
ety of successful frameworks to generate trajectories [1]. Probabilistic movement primitives (ProMPs) are a concept
However, these frameworks can be difficult to implement whereby a distribution of trajectories is learned from multiple
in environments without a predefined static map. Further- demonstrations. In [7], the design of a stochastic ProMP
more, motion planning typically requires the selection of feedback controller was studied by exploiting the property of
algorithms, the design of cost functions, and other tasks that the covariance derivatives, which can be explicitly computed.
are outside the expertise of nontechnical users. Learning from A model-free ProMP controller that adapts movement to
demonstration is a paradigm that has played an important force-torque input was designed in [8]. In [9], the authors
role in addressing the issue of scaling up robot learning [2], designed a model predictive control-based ProMP controller
[3]. It can avert the drawbacks of traditional robot motion for a linear discrete time system model.
planning by relying on the presence of a human teacher. Nevertheless, ProMP methods suffer from some notable
One approach to learning via demonstration is the use of deficiencies. In the aforementioned works, a linearized model
parameterized movements, known as movement primitives of the system is used for designing the controller. This makes
(MPs), to encode and generalize human demonstrations for the controller less applicable for nonlinear systems such as
training robots. MPs are modeled through a compact repre- robotics and autonomous vehicles. In addition, ProMPs and
sentation of implicitly continuous and high-dimensional tra- their resulting controllers are often difficult to implement,
jectories. For example, dynamic motion primitives (DMPs) sensitive to noise, and highly dependent upon design param-
use demonstrations to learn a model of the control effort eters and initial conditions. This reduces their usability for
necessary to produce a desired trajectory for a stabilized non-experts. Finally, since ProMPs are by definition stochas-
dynamic system [4], [5]. However, DMPs can be inflexible tic, the distribution of trajectories may have a large support
in that they can only deliver the demonstrated trajectory. and result in trajectories that deviate from the training set.
Capturing the natural variation in a human demonstration of Real-time safety in safety-critical dynamical systems is
a task can help a robot overcome uncertainty and deviation an important problem that has received attention in recent
between the training regime and actual task execution [6]. years [10]. This problem has been investigated by designing
1 The authors are with the University of Texas at Arlington Research Insti- polynomial barrier certificates/functions, using offline iter-
tute, Fort Worth, TX, USA. Emails: [email protected], [email protected], ative algorithms based on sum-of-squares optimization, to
[email protected]. verify safety for a given dynamical system [11]. The concept
2 The authors are with the Department of Computer Science and Engi-
neering, University of Texas at Arlington, Arlington, TX, USA. Emails: of barrier certificates/functions was extended for synthesizing
[email protected], [email protected]. real-time safe control laws for dynamical systems via control
This material is based upon work supported by the National Science barrier functions (CBFs) [12].
Foundation (NSF) under grant No. 1728057 and the University of Texas at
Arlington Research Institute. Joseph M. Cloud was supported by an NSF CBFs integrate seamlessly with control Lyapunov func-
Graduate Research Fellowships Program grant No. 1746052. tions (CLFs) to offer stability while respecting limits and

978-0-7381-2503-9/21/$31.00 ©2021 IEEE 697

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.
safe regions of the state space [13]–[15]. Additionally, CBF trajectories. The trajectory distribution can be defined and
and CLF controllers typically solve a constrained quadratic generated in any space that accommodates the system (e.g.,
program (QP) to find an optimal controller at runtime. This joint or task spaces). We consider joint space trajectories.
allows the system to minimize the control effort while Within a ProMP, the execution of a trajectory is mod-
ensuring stability and safety. Other tasks formulated as eled as the set of robot positions, ζi = {qi (k)}, where
cost functions or constraints can be included as well. One k = 0, . . . , K, qi (k) ∈ R is the state variable sampled at
downside to CBFs and CLFs is the complexity in defining time k, and i ∈ {1, . . . , n} is the joint index of a robot. Let
the barriers and trajectories. Recent efforts to automate the wi ∈ R1×L be a weight matrix with L terms. A linear basis
definition of CBFs and CLFs include mapping temporal function model is then given by
logic statements with respect to performance requirements
 
q (k)
[16] and fitting piecewise-linear barrier functions to trained xi (k) = i = Φ(k)wi + ξxi ,
q˙i (k)
obstacles or regions [17].  >
In this work, we propose a novel control method that uses where Φ(k) = φ(k) φ̇(k) ∈ R2×L is the time-
the distribution delivered by a ProMP to define a set of dependent basis function matrix, and L is the number of basis
CLFs and CBFs. This idea addresses each of the previously functions. Gaussian noise is described by ξxi ∼ N (0, Σxi ).
mentioned issues with ProMPs, CLFs, and CBFs. CLF and Thus, the ProMP trajectory is represented by a Gaussian
CBF controllers are fundamentally based on the nonlinear distribution over the weight vector wi and the parameter
kinematic/dynamic equations of the system and overcome the vector θi = {µwi , Σwi }, which simplifies the estimation of
inherent linearity of ProMP controllers. Our controller not the parameters.
only enjoys the stability and robustness guarantees possessed We marginalize out wi to create the trajectory distribution
Z
by CLF and CBF controllers, but it can also guarantee
p(ζi , θi ) = p(ζi | wi )p(wi ; θi )dwi . (1)
that the system never leaves a neighborhood defined by
the training set. Moreover, our approach provides a simple Here, the distribution p(ζi , θi ) defines a hierarchical
way to define trajectories for CLFs and barriers for CBFs Bayesian model over the trajectories ζi [7] and p(wi | θi ) =
using the mean, standard deviation, or other moments of the N (wi | µwi , Σwi ). In an MP representation, the parameters
distribution. We provide a stability analysis for the proposed of a single primitive must be easy to obtain from demonstra-
control laws, and we demonstrate their effectiveness and tions. The distribution of the state p(xi (k) ; θi ) is
computational efficiency through simulations of a 2-link and
p(xi (k); θi )=N (xi (k) | Φ(k)µwi , Φ(k)Σwi Φ(k)> +Σxi ). (2)
a 6-link robot. Examples of generated control trajectories for
a Universal Robots UR5 are shown in Fig. 1. The trajectory can be generated from the ProMP distribution
The remainder of this paper is structured as follows. In using wi , the basis function Φ(k), and (2). The basis function
Section II, ProMPs, CBFs, and CLFs are reviewed. The is chosen based on the type of robot movement which can
problem is described in Section III. Our approach for a be either rhythmic or stroke-based. From (2), the mean
ProMP controller based on CLFs and CBFs is provided in µ̃i (k) ∈ R2 of the ProMP trajectory at k is Φ(k)µwi and
Section IV. In Section V, simulation results are presented. the covariance Σi (k) is Φ(k)Σwi Φ(k)> + Σxi .
Finally, we conclude in Section VI with a discussion on Multiple demonstrations are needed to learn a distribu-
future work. tion over wi . We use a combination of radial basis and
polynomial basis functions for training the ProMP. From
II. BACKGROUND the demonstrations, the parameters θi can be estimated
In this section we provide essential background informa- using maximum likelihood estimation [18]. However, this
tion on ProMPs, CBFs, and CLFs. may result in unstable estimates of the ProMP parameters
Notation: Given a matrix A, let A> denote its transpose. when there are insufficient demonstrations. Our method uses
Let the identity and zero matrices, with appropriate dimen- regularization to estimate the ProMP distribution similar to
sions, be denoted by I and 0, respectively. We denote ? as the work by Gomez-Gonzalez et al. [6]. We maximize θi for
the symmetric entries of a matrix. For a vector field fi (x) the posterior distribution over the ProMP using expectation
and vector of vector fields F (x) = [f1 (x), ..., fn (x)], let Lfi maximization,
and LF denote, respectively, the Lie derivative along fi (x) p(θi | xi (k)) ∝ p(θi )p(xi (k) | θi ). (3)
and the vector of Lie derivatives in the directions fi (x) :
In addition, we use a Normal-Inverse-Wishart as a prior
LF = [Lf1 , ..., Lfn ]. Zero-mean i.i.d. Gaussian distribution
distribution p(θi ) to increase stability when training the
with mean m and (co)variance Σ is denoted N (m, Σ), and
ProMP parameters [6].
the number of joints for a robot arm is represented by n.
B. Control Barrier and Control Lyapunov Functions
A. Probabilistic Movement Primitives
Consider the affine, nonlinear system
ProMPs provide a parametric representation of trajectories
that can be executed in multiple ways through the use of a ẋ = f (x) + G̃(x)u, (4)
n m
probability distribution. Basis functions are used to reduce where x ∈ R denotes the state, u ∈ R is the control input,
model parameters and aid learning over the demonstrated G̃ = [g1 , ..., gm ], and f : Rn → Rn and gi : Rn → Rn are

698

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.
locally Lipschitz vector fields. It is assumed that the system ProMP
Demonstration
in (4) is controllable. Model
1) Control Barrier Functions: A smooth function h(x) :
Rn → R is defined to encode a constraint on the state x of
Feedback
system. The constraint is satisfied if h(x) > 0 and violated CLF/CBF
Linearization Robot
Controller
if h(x) ≤ 0. Consider the set C defined by Controller
C = {η : h(η) ≥ 0},
∂C = {η : h(η) = 0}, (5) Fig. 2: A block diagram illustrating the general structure of the
proposed system.
Int(C) = {η : h(η) > 0},
n
where h : R → R is a continuously differentiable function. u, actuate the system. Then, the system description in (8) can
Existing approaches to define CBFs include exponential be converted to an ODE of the form in (4) where x = [q, q̇]>
CBFs, zeroing CBFs, and reciprocal CBFs [12], [19]. How- and
   
ever, these methods have trade-offs in terms of ease of q̇ 0
f (x) = , g(x) = . (9)
definition, boundedness of velocities, speed of convergence, −D−1 (q)H(q, q̇) D−1 (q)E
etc. We investigate the use of a reciprocal CBF. This type
of CBF has a small value when the states are far from B. Problem Definition
the constraints and becomes unbounded when the states Our primary goal is to design a controller such that the
approach the constraint. system output tracks a trajectory within the distribution
Definition 1. Given C and h, a function B : C → R is a generated by a ProMP. To this end, we first construct a
CBF if there exists class K functions α1 , α2 , and a constant nonlinear inner-loop control law founded on the feedback
scalar γ > 0 such that linearization of (8). Then, an outer-loop controller based on
1 1 a CLF-CBF is designed using the distribution parameters,
≤ B(x) ≤ ,
α1 (h(x)) α2 (h(x)) µ̃i and Σi . This is summarized as the following problem
γ (6)
objectives.
Lf B(x) + LG̃ B(x)u − ≤ 0.
B(x) 1) Design a feedback linearization controller to obtain a
2) Control Lyapunov Functions: A CLF is a Lyapunov linear and decoupled input-output closed-loop relation-
function for a dynamical system with inputs, and it can be ship for the error signal.
used to design the control inputs to ensure objectives such 2) Use the demonstrated trajectories of a robot to train and
as stability, convergence to the origin (or other set point), estimate a ProMP distribution. The ProMP provides the
or convergence to a desired trajectory. In order to have sim- time-varying mean and variance of a trajectory.
ilar constructions as CBFs, we will consider exponentially 3) Design a CLF to stabilize the system such that
stabilizing CLFs [20]. ∀i, qi → µi , where µi is the first element of µ̃i .
Definition 2. In a domain X ⊂ Rn , a continuously 4) Design a CBF to guarantee that the error ei = qi − µi
differentiable function V : X → R is an exponentially satisfies the safety constraint ∀i, |ei | < σi , where σi is
stabilizing CLF (ES-CLF) if, ∀x ∈ X, there exists positive the (1, 1) element of Σi .
scalar constants c1 , c2 , c3 > 0 such that The general structure of our proposed system is shown in
c1 ||x||2 ≤ V (x) ≤ c2 ||x||2 , Fig. 2.
(7)
Lf V (x) + LG̃ V (x)u + c3 V (x) ≤ 0.
IV. C ONTROL C ONSTRUCTION
Having established a CBF (to accomplish safety) and CLF
(to achieve control performance objectives), they can be We define the error and trajectory vectors as e =
combined through a QP. Consequently, safe control laws [e1 , . . . , en ]> and µ = [µ1 , . . . , µn ]> , respectively. Using (4)
may be computed using the QP to solve the constrained and taking the derivative r times along f (x) and g(x), we
optimization problems at each point in time [12]. obtain
(r) (r−1)
III. P ROBLEM FORMULATION e(r) (x) = Lf e(x) + Lg Lf e(x)u − µ(r) . (10)
| {z }
A. System Modeling Γ

We consider the equations of motion for a robot given in Assume the decoupling matrix, Γ, is well-defined and has
the general form by the Euler-Lagrange equations full rank. This implies that the system in (4) is feedback
linearizable and we can prescribe the control law
D(q)q̈ + H(q, q̇) = Eu, (8)  
(r)
where q ∈ Q is a generalized coordinate position. Suppose u(x) = Γ−1 −Lf e(x) + µ(r) + v , (11)
that Q ⊂ Rn is the configuration space of the robot, D(q) is
where v is an auxiliary feedback control value. This yields
the inertia matrix, H(q, q̇) = C(q, q̇)q̇ + K(q) is a vector
the rth order linear system from input v to output e,
containing the Coriolis and gravity terms, and E is the
actuation matrix that determines the way in which the inputs, e(r) = v. (12)

699

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.
In this work, we limit ourselves to relative degree-two safety constraint has a relative degree of 2. For the relative
systems (i.e., r = 2). By defining η = [e, ė]> , (12) can degree-two constraints, the reciprocal CBF is defined as [22],
be written as a linear time invariant system 
hij (ηi )

bEij ḣij (ηi )2
    Bj (ηi ) = − ln + aEij ,
0 I 0 1 + hij (ηi ) 1 + bEij ḣij (ηi )2
η̇ = η+ v. (13)
0 0 I (20)
From there, n decoupled systems can be obtained from (13), where j ∈ {1, 2} and aEij , bEij are positive scalars. Note
that by choosing small values for aEij and bEij , the system
η̇i = F ηi + Gvi , i = 1, . . . , n, (14) will stop far from the constraint surfaces. By choosing large
parameters, the system will stop close to the constraints.
 
0 1
where ηi = [ei , ėi ]> , F = , and G = [0 1]> . In some cases, especially in the presence of uncertainties,
0 0
To accomplish problem objective 3, it is sufficient to choosing aEij and bEij to be too large may cause constraint
ensure ei → 0. This is accomplished through designing violations (i.e., no solution exists to the QP problem). As a
an appropriate CLF. To satisfy problem objective 4, it is result, based on the given application, a compromise needs
sufficient to make |ei | < σi . This objective is satisfied by be considered for choosing these parameters. The following
defining appropriate CBFs. In the ensuing subsections, the control barrier condition should be satisfied for time varying
appropriate CLFs and CBFs are defined for the system in constraints (which leads to time varying CBFs),
(14). Moreover, the ith controller for each system in (14) is ∂Bj (ηi ) γi
designed by combining the corresponding CLFs and CBFs LF Bj (ηi ) + LG Bj (ηi )vi + − ≤ 0. (21)
∂t Bj (ηi )
through a QP problem.
C. Quadratic Program (QP)
A. Control Lyapunov Function As shown in (19), two safety constraints need to be sat-
Consider the following rapidly exponentially stabilizing- isfied simultaneously for each linearized, decoupled system.
CLF (RES-CLF) [21], Due to this fact, a single controller can be obtained in such
    a way that guarantees adherence to both constraints [23]. In
1/i I 0 1/i I 0
Vi (ηi ) = ηi> P η, (15) this subsection, n QPs are proposed to unify RES-CLF and
0 I 0 I i
CBFs for each system in (14) into a single controller. The n
where i is a positive scalar and P is a symmetric positive QPs for i ∈ {1, . . . , n} are defined as
definite matrix that can be obtained by solving the continuous
min v>
i Hi vi ,
time algebraic Riccati equation vi =[vi , δi ]> ∈R2

F > P + P F − P GG> P + I = 0. (16) subject to


c3i
In order to exponentially stabilize the system, we want to LF Vi (ηi ) + LG Vi (ηi )vi + V (ηi ) ≤ δi (CLF) (22)
i i
find vi such that
∂Bj (ηi ) γi
c3i LF Bj (ηi ) + LG Bj (ηi )vi + ≤ (CBFs)
V̇i (ηi ) = LF Vi (ηi ) + LG Vi (ηi )vi ≤ − Vi (ηi ), (17) ∂t Bj (ηi )
i  
where c3i is a positive constant value. To guarantee a feasible 1 0
where Hi = and psci ∈ R+ is a variable that can
solution for the QP, the CLF constraint can be relaxed by 0 psci
be chosen based on the designer’s assessment of weighting
δi > 0 [20] resulting in
the control inputs.
c3i
LF Vi (ηi ) + LG Vi (ηi )vi + V (ηi ) ≤ δi . (18) Remark 1. We design the ProMP generation and controller
i i formulation such that the safety constraints are defined on
This relaxation parameter will be minimized in the QP the independent variance of each joint trajectory distribution.
cost function. It is worth mentioning that by providing a However, the proposed approach can be easily extended to
weighting factor on the relaxation parameter δi , the QP can the case where we want to encode the coupling between the
mediate a trade-off among performance and safety objectives joints for ProMP training. Thus, the safety constraints should
in a way that safety is always satisfied. be defined based on the covariance matrix. To guarantee
safety in the presence of covariance, we can impose the
B. Control Barrier Functions following confidence support
We propose two safety constraints for each system in (14). e> Σ−1 e ≤ kp , with probability 1 − p. (23)
More specifically, each system should satisfy −σi < ei < σi .
Consequently, we have the following two safety constraints The covariance matrix Σ is a symmetric positive definite
matrix. Consequently, it is possible to decompose it into Σ =
hi1 = ei + σi ,
(19) Ψ> ΛΨ, where Λ is the diagonal eigenvalue matrix and Ψ
hi2 = −ei + σi . is the orthogonal eigenvector matrix. Due to this fact, from
From (19), it is obvious that we have multiple time-varying (23) we get
constraints that should be satisfied simultaneously. Moreover,
X
Ψ> Λ−1 Ψ = νi> λ−1 > −1
i νi ≥ νi λi νi ∀i = 1, . . . , n,
it is easy to verify that LG ei = 0 and LF LG ei 6= 0, thus the i

700

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.
1.15

0.6

0.1
q1 (angle)

-0.4

-0.9

-1.4

-1.95
0.0 1.0 2.0 3.0 4.0 5.0
time (sec)
1.9

1.5
Fig. 4: A visualization of the 2-link ProMP in the robot workspace.
The robot link positions over time are shown in red, while the end-
1.05 effector trajectory following the mean ProMP joint trajectories is
q2 (angle)

shown in green. The four trajectories of the end effector with joint
combinations µi ±σi , i ∈ {1, 2} are illustrated in cyan, blue, yellow
0.6 and magenta. The black circles correspond to obstacles.
 
−m2 l1 l2 sin(q2 )q̇2 (2q̇1 + q̇2 )
C(q, q̇) = ,
0.2 m2 l1 l2 q̇12 sin(q2 )
 
(m1 + m2 )gl1 sin(q1 ) + m2 gl2 sin(q1 + q2 )
K(q) = ,
m2 gl2 sin(q1 + q2 )
-0.25
0.0 1.0 2.0 3.0 4.0 5.0 where m1 and m2 are the link masses, l1 and l2 are the
time (sec) lengths of the links, and g is the gravitational acceleration.
Fig. 3: Training of the ProMPs for the first joint (top) and second
For the simulations, the values of these variables are selected
joint (bottom). The 50 input trajectories are shown in red and the
ProMP mean joint trajectories (µi ) are shown in dark green. A as m1 = 1, m2 = 1, l1 = 1, l2 = 1, and g = 9.8.
light-green fill shows µi ± σi . We generated 50 trajectories that achieve a goal position
from various starting positions while avoiding three obsta-
where νi and λi denote, respectively, the eigenvectors and cles. Using this dataset, we train a ProMP with Algorithm
the eigenvalues of the Σ. This leads to the polytopic bounds 1 from [6]. We use L = 2 basis functions consisting of five
− kp λi ≤ νi> e ≤ kp λi , ∀i = 1, . . . , n.
p p
(24) radial basis parameters. The results of the ProMP training
are presented in Fig. 4 where the black circles indicate the
These conditions can be represented in the form of linear location of obstacles and the 50 input trajectories shown in
constraints for the QP. As a result, by solving the QP problem red. The ProMP mean joint trajectories, µi , are shown in
in the presence of these constraints, safety is assured with dark green, and in a light-green fill we show µi ± σi . A
probability of at least 1 − p. visualization of the ProMP in the workspace, based on [25],
is displayed in Fig. 4. The robot link positions over time are
V. S IMULATIONS
highlighted in red with different colors representing key end-
In this section we demonstrate different aspects and ca- effector trajectories from the ProMP. The green trajectory
pabilities of our methodology for the ProMP control of a is the mean of the ProMP distribution. The other four
robotic system. The system models and proposed real-time trajectories result from combinations of µi ± σi , i ∈ {1, 2}.
controller are simulated using a MATLAB 2019a environ- Three sets of simulations were conducted. In each sim-
ment. All computations were run on a Dell OptiPlex 7050 ulation, the CLF parameters are selected as i = 0.1 and
machine with an Intel Core i7-7700X CPU and 8 GB of c3i = 0.5. In the first scenario, greater priority is given to
memory. the CLF than CBF by choosing a high gain, i.e., psci = 200.
Moreover, the CBF design parameters are set to aE11 =
A. Case Study 1: Two-Link Robot
aE12 = 20.1, aE21 = aE22 = 20, bE11 = bE12 = 1,
We consider a rigid, two-link robot with the dynamic bE21 = bE22 = 0.9, γ1 = 10.1, and γ2 = 9. In the
model of (8) and the following parameters [24] second scenario, psci is chosen as psc1 = psc2 = 0.02 which
m1 l12 + m2 (l12 + l22 + 2l1 l2 cos(q2 ))
 
? implies less priority to the CLF in comparison with the CBF.
D(q) = ,
m2 (l22 + l1 l2 cos(q2 )) m2 l22 Moreover, the CBF parameters in this scenario are similar to

701

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.
of the trajectory while ensuring safety.
The main computational cost with respect to time of our
controller comes from the fact that it has to solve a set of
QPs at every time step. In the performed simulations, two
QP problems are solved in real time (one for each link).
The average required time (Tave ), maximum time (Tmax ), and
the standard deviation (std) for solving the QP problems are
Tave = [0.0015, 0.0011], Tmax = [0.1148, 0.0119], std =
[0.0053, 0.0006] where the units are seconds. From these
results, it is clear that the expected execution time of the QP
problems is very small (e.g., in the range of 1 millisecond).
The large maximum times are each a single outlier. Hence,
the controller is appropriate for a real-time implementation.
B. Case Study 2: Universal Robots UR5 6-Link Robot
The equation of motion of the UR5 robot can be written
in the form of (8) with the following parameters [26]
X 6 
D(q) = mi Jv>i Jvi + Jw>i Ri Imi Ri> Jwi , (25)
i=1
where mi ∈ R is the mass of the ith link, Jvi ∈ R3×6 and
Jwi ∈ R3×6 are the linear and angular part of the Jacobian
matrix Ji , respectively. Ri ∈ R3×3 is the rotation matrix and
Imi ∈ R3×3 is the inertia tensor. The elements of C(q, q̇)
are obtained from the inertia matrix as
6  
X 1 ∂mij ∂mik ∂mkj
cij = + − q̇k , (26)
2 ∂qk ∂qj ∂qi
k=1
where mij are the entries of the inertia matrix. Moreover,
the elements of gravity vector are obtained from
∂P
Ki (q) = , (27)
Fig. 5: The results of the CLF/CBF-based ProMP controller for
∂qi
the first joint (top) and second joint (bottom). The safe region of where P is the total potential energy of the robot. Additional
µi ± σi , i ∈ {1, 2} is shown as a filled “tube”. The control results information on these equations can be found in [27].
in different trajectories for distinct values of the weight psci , but We generate 90 joint-space trajectories with defined goals,
all trajectories remain safe. obstacles, and starting positions. The 90 UR5 trajectories
are then used to train a joint-space ProMP using the same
the first scenario. parameters as in the two-link robot case study. The following
To show the effects of changing the CBF parameters aEij , set of CLF and CBF parameters are chosen: 1 = 2 = 3 =
bEij , and γi , we consider another scenario. In the third 4 = 5 = 0.1, 6 = 0.01, and c3i = 1.1, aEij = 20.1,
scenario, aE11 = aE12 = 1.1, aE21 = aE22 = 1.1, bE11 = bEij = 1, and γi = 10.1, i ∈ {1, ..., 6}, j ∈ {1, 2}. The
bE12 = 0.4, bE21 = bE22 = 0.5, γ1 = 1.3, and γ2 = 1.51, simulation environment is depicted in Figs. 1 and 6.
with psci = 0.02 as in the second scenario. Consequently, We consider two different scenarios. In the first sce-
the effects of changing the CBF parameters can be concluded nario psci = 200, which gives higher importance to the
by comparing the second and third scenarios. CLF. In the second scenario psci = 0.001, which im-
The simulation results are exhibited in Fig. 5. In the first plies the design interest and priority is on the CBF. As
scenario, by choosing a large value for psci (more priority is clear from Fig. 6, in both scenarios the robot can ef-
to the CLF than CBF), the system output remains close to fectively track the mean of the ProMP and simultaneously
the mean trajectory. However, in the second scenario, by avoid colliding with environmental obstacles. The running
considering a small value for psci (more priority to the CBF time statistics for solving the QP problems are Tave =
than CLF), the system remains safely inside the distribution [0.0014, 0.0011, 0.0010, 0.0010, 0.0011, 0.0010], Tmax =
but does not necessarily stay close to the mean. In the third [0.1193, 0.0120, 0.0128, 0.0058, 0.0314, 0.0028], std =
scenario, it can be seen that by choosing smaller values [0.0053, 0.0005, 0.0005, 0.0003, 0.0014, 0.0002]. Again,
for aEij , bEij , and γi , the system output can have more the large maximum values are for a single outlier, thus the
deviation from the constraint surfaces, i.e., be closer to the expected operation time is well within the needs of a robotic
mean trajectory. In short, our proposed method provides a system. The cause of outlier run times is an avenue of future
valuable option to the designer that grants levels of flexibility research to offer improved performance guarantees.

702

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.
[7] A. Paraschos, C. Daniel, J. Peters, and G. Neumann, “Using proba-
bilistic movement primitives in robotics,” Autonomous Robots, vol. 42,
no. 3, pp. 529–551, 2018.
[8] A. Paraschos, E. Rueckert, J. Peters, and G. Neumann, “Probabilistic
movement primitives under unknown system dynamics,” Advanced
Robotics, vol. 32, no. 6, pp. 297–310, 2018.
[9] S. Calinon and D. Lee, “Learning control,” Humanoid Robotics: A
Reference, pp. 1261–1312, 2017.
[10] P. Wieland and F. Allgöwer, “Constructive safety using control barrier
functions,” IFAC Proceedings Volumes, vol. 40, no. 12, pp. 462–467,
2007.
[11] C. Sloth, R. Wisniewski, and G. J. Pappas, “On the existence of com-
positional barrier certificates,” in Proceedings of the IEEE Conference
on Decision and Control, 2012, pp. 4580–4585.
[12] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier
function based quadratic programs for safety critical systems,” IEEE
Transactions on Automatic Control, vol. 62, no. 8, pp. 3861–3876,
2016.
[13] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath,
Fig. 6: The results of the proposed ProMP controller for the UR5. and P. Tabuada, “Control barrier functions: Theory and applications,”
in Proceedings of the European Control Conference, 2019, pp. 3420–
3431.
VI. C ONCLUSION AND F UTURE W ORK [14] B. T. Lopez, J.-J. E. Slotine, and J. P. How, “Robust adaptive control
In this work, a ProMP robot guidance problem was solved barrier functions: An adaptive and data-driven approach to safety,”
Control Systems Letters, vol. 5, no. 3, pp. 1031–1036, 2020.
using a CLF/CBF-based controller. The proposed approach [15] W. S. Cortez, D. Oetomo, C. Manzie, and P. Choong, “Control barrier
stabilizes the robot and guarantees the system output is functions for mechanical systems: Theory and application to robotic
always inside the distribution generated by a ProMP. The grasping,” IEEE Transactions on Control Systems Technology, 2019.
[16] M. Srinivasan and S. Coogan, “Control of mobile robots using barrier
time-varying nature of the ProMP ensures the robot is guided functions under temporal logic specifications,” IEEE Transactions on
along the distribution at the desired rate. Moreover, using Robotics, 2020.
our proposed method, it is possible to provide a trade-off [17] M. Saveriano and D. Lee, “Learning barrier functions for con-
strained motion planning with dynamical systems,” arXiv preprint
between trajectory performance and safety objectives. These arXiv:2003.11500, 2020.
safety guarantees and performance tuning options are not [18] A. Lazaric and M. Ghavamzadeh, “Bayesian multi-task reinforcement
available in the native ProMP control design. Our simulation learning,” in Proceedings of the International Conference on Machine
Learning, 2010, pp. 599–606.
studies on a 2-link and 6-link robot confirm the viability of [19] Q. Nguyen and K. Sreenath, “Exponential control barrier functions for
the proposed method for designing the controller. enforcing high relative-degree safety-critical constraints,” in Proceed-
There are several topics of ongoing work. We are in- ings of the American Control Conference, 2016, pp. 322–328.
[20] A. D. Ames, J. W. Grizzle, and P. Tabuada, “Control barrier function
vestigating the trade-offs of various different CBFs (e.g., based quadratic programs with application to adaptive cruise control,”
zeroing versus reciprocal), other choices of cost functions, in Proceedings of the IEEE Conference on Decision and Control, 2014,
and constraints in the QP. Additionally, we are seeking pp. 6271–6278.
[21] H. Zhao, S. Kolathaya, and A. D. Ames, “Quadratic programming and
novel methods that automatically define additional barriers impedance control for transfemoral prosthesis,” in Proceedings of the
to ensure the safe movement of a co-robot around dynamic IEEE International Conference on Robotics and Automation, 2014,
obstacles (e.g., humans). This includes the exploration of pp. 1341–1347.
[22] S.-C. Hsu, X. Xu, and A. D. Ames, “Control barrier function based
an active learning framework for ProMPs, whereby a co- quadratic programs with application to bipedal robotic walking,” in
robot has the capability to detect when it is outside of Proceedings of the American Control Conference, 2015, pp. 4542–
its safe trajectory zone and thus requires assistance from 4548.
[23] M. Rauscher, M. Kimmel, and S. Hirche, “Constrained robot control
a human. Finally, experimental validation of the proposed using control barrier functions,” in Proceedings of the IEEE/RSJ
methodology will be carried out. International Conference on Intelligent Robots and Systems, 2016, pp.
279–285.
R EFERENCES [24] J. P. Kolhe, M. Shaheed, T. Chandar, and S. Talole, “Robust control of
robot manipulators based on uncertainty and disturbance estimation,”
[1] M. Mohanan and A. Salgoankar, “A survey of robotic motion planning International Journal of Robust and Nonlinear Control, vol. 23, no. 1,
in dynamic environments,” Robotics and Autonomous Systems, vol. pp. 104–122, 2013.
100, pp. 171–185, 2018. [25] A. Sakai, D. Ingram, J. Dinius, K. Chawla, A. Raffin, and A. Paques,
[2] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, “A survey “Pythonrobotics: a python code collection of robotics algorithms,”
of robot learning from demonstration,” Robotics and Autonomous arXiv preprint arXiv:1808.10703, 2018.
Systems, vol. 57, no. 5, pp. 469–483, 2009. [26] M. W. Spong and M. Vidyasagar, Robot Dynamics and Control. John
[3] S. Chernova and A. L. Thomaz, “Robot learning from human teach- Wiley & Sons, 2008.
ers,” Synthesis Lectures on Artificial Intelligence and Machine Learn- [27] K. Katharina, “Force estimation in robotic manipulators: Modeling,
ing, vol. 8, no. 3, pp. 1–121, 2014. simulation and experiments,” Master’s thesis, Norwegian University
[4] A. J. Ijspeert, J. Nakanishi, H. Hoffmann, P. Pastor, and S. Schaal, of Science and Technology, 2014.
“Dynamical movement primitives: learning attractor models for motor
behaviors,” Neural Computation, vol. 25, no. 2, pp. 328–373, 2013.
[5] A. Dahlin and Y. Karayiannidis, “Adaptive trajectory generation under
velocity constraints using dynamical movement primitives,” IEEE
Control Systems Letters, vol. 4, no. 2, pp. 438–443, 2019.
[6] S. Gomez-Gonzalez, G. Neumann, B. Schölkopf, and J. Peters, “Adap-
tation and robust learning of probabilistic movement primitives,” IEEE
Transactions on Robotics, vol. 36, no. 2, pp. 366–379, 2020.

703

Authorized licensed use limited to: University of Texas at Arlington. Downloaded on June 08,2022 at 19:55:36 UTC from IEEE Xplore. Restrictions apply.

You might also like