IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 6, NO. 2, APRIL 2021

Differentiable Simulation for Physical System Identification

Quentin Le Lidec, Igor Kalevatykh, Ivan Laptev, Cordelia Schmid, and Justin Carpentier

Abstract—Simulating frictional contacts remains a challenging research topic in robotics. Recently, differentiable physics emerged and has proven to be a key element in model-based Reinforcement Learning (RL) and optimal control. However, most of the current formulations rely on coarse approximations of the underlying physical principles. Indeed, the classic simulators lose precision by casting the Nonlinear Complementarity Problem (NCP) of frictional contact into a Linear Complementarity Problem (LCP) to simplify computations. Moreover, such methods use non-smooth operations and cannot be automatically differentiated. In this letter, we propose (i) an extension of the staggered projections algorithm for more accurate solutions of the problem of contacts with friction. Based on this formulation, we introduce (ii) a differentiable simulator and an efficient way to compute the analytical derivatives of the involved optimization problems. Finally, (iii) we validate the proposed framework with a set of experiments to present a possible application of our differentiable simulator. In particular, using our approach we demonstrate accurate estimation of friction coefficients and object masses both in synthetic and real experiments.

Index Terms—Calibration and identification, contact modeling, optimization and optimal control, simulation and animation.

Manuscript received October 15, 2020; accepted January 31, 2021. Date of publication February 25, 2021; date of current version March 23, 2021. This letter was recommended for publication by Associate Editor H. Liu and Editor A. Morales upon evaluation of the reviewers' comments. This work was supported in part by the HPC resources from GENCI-IDRIS under Grant AD011011342, in part by the French government under management of Agence Nationale de la Recherche as part of the "Investissements d'avenir" program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute), and in part by Louis Vuitton ENS Chair on Artificial Intelligence. (Corresponding author: Quentin Le Lidec.)
The authors are with Inria, École normale supérieure, CNRS, PSL Research University, 75005 Paris, France (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]).
This letter has supplementary downloadable material available at https://doi.org/10.1109/LRA.2021.3062323, provided by the authors.
Digital Object Identifier 10.1109/LRA.2021.3062323
2377-3766 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Siksha O Anusandhan University. Downloaded on August 04,2021 at 06:57:17 UTC from IEEE Xplore. Restrictions apply.

Fig. 1. Overview of our differentiable simulator. The differentiability of the simulator makes it possible to integrate it into a larger learning architecture and infer physical parameters, such as friction coefficients $\mu$ and mass $M$ of the objects, from real trajectories of these objects.

I. INTRODUCTION

PHYSICAL simulation, as it allows for both training and testing control policies, appears to be a key element in robotics. Rigid Body Algorithms [1] provide an efficient way to compute the forward dynamics of multi-body rigid systems when there is no frictional contact. It is also possible to differentiate the quantities simulated with these algorithms with respect to the state and the control variables of the system. Using analytical derivatives (instead of Automatic Differentiation or finite differences) allows for efficient computation [2]. Differentiable physics has proven to be very useful for gradient-based algorithms for optimal control and trajectory optimization [3]–[7]. However, the case of simulation with frictional contacts remains a challenging problem for the control community [8].

In the same vein, simulation of frictional contacts is a crucial point when training Reinforcement Learning (RL) agents to achieve complex control tasks involving contact interactions. Indeed, RL is a powerful tool to learn control policies but often requires millions of samples generated in simulation, which is the reason why the simulator has to be efficient. Moreover, the learned policies tend to exploit the artifacts of the simulators due to approximations of the underlying physics, which leads to unrealistic motions that are difficult to transfer to real systems. This mismatch between reality and simulation, known as the reality gap [9], highly limits the ability to transfer simulation-learnt policies to real robots. Hence, simulators should be both fast and accurate.

Modeling frictional contacts is one of the most challenging aspects of physical simulation given the non-linearity and non-convexity of the complementarity constraint and the maximum dissipation principle. These underlying physical laws of rigid contact dynamics are typically simplified (spring-damper [10]), approximated [11] or relaxed [12] in classic physics engines. These choices aim to increase computational efficiency but may also result in non-realistic behaviors in simulation [13].

Simulating a system requires accurate values of its physical parameters, such as masses and friction coefficients of objects. Given the difficulty of estimating these parameters, however, their values are often randomized [14]. As a result, such an approach often leads to imprecise simulation.

In this letter, we propose an approach that guarantees the differentiability of the simulator and also avoids error-prone approximations of complementarity constraints and the
maximum dissipation principle. To this end, we extend the staggered projections algorithm [15] to deal with the friction cone constraint. In addition, we use techniques from the field of sensitivity analysis to differentiate the result of the simulation with respect to the physical parameters. This allows us to design a process to infer unknown physical parameters of a system that are essential for simulations but complex to measure in practice (such as friction coefficients), directly from real data.

The core contributions of this work are as follows:
1) We extend the work in [15] by formulating the frictional contact problem as a sequence of Quadratically Constrained Quadratic Programming (QCQP) problems without approximating any of the underlying physics principles and taking elastic collisions into account.
2) We propose computation of analytical derivatives of the solution of a QCQP as well as the efficient and robust implementation of the solver and its derivatives.
3) We demonstrate applications of our differentiable simulation to system identification by inferring physical properties of objects from videos of dynamical scenes.

This letter is organized as follows. Section II proposes an overview of the work done in the area of differentiable simulation and physical system identification. In Section III-A, we introduce the mathematical framework of the problem of frictional contacts and solve it by extending the staggered projections algorithm. In Section III-B, we expose the analytical derivatives of a QCQP which allow us to derive a differentiable and accurate simulator. In Section IV, we validate our method by applying it to the task of physical system identification and discuss the issue of parameter observability. This leads us to present some future research directions in Section V.

II. RELATED WORK

Physical simulation algorithms: The problem of contacts without friction can be formulated as a Linear Complementarity Problem (LCP) [16] and can be solved for instance using the Projected Gauss-Seidel (PGS) algorithm. This formulation can be adapted to the frictional case by approximating the friction cone with a four-sided pyramid [17] as done in Bullet [11] or [8]. The algorithm of staggered projections [15] introduces a formulation of the frictional contacts problem as a fixed point of coupled projections. By also using the pyramidal approximation of Coulomb's law, this method simulates a system by solving a cascade of Quadratic Programming (QP) problems. Some other approaches [18] relax the complementarity constraint in order to transform the frictional contact problem into a single and simple optimization one [12]. However, this relaxation may lead to physically implausible behaviors such as object interactions without objects being in contact [13].

In this letter, we extend the formulation in [15] to laws of multiple elastic collisions [19]. In addition, we adapt it to account for conic constraints (conic friction constraint represented as ice-cream cones). This makes it possible to write explicitly the problem of frictional contact as a sequence of optimization problems, where the problems become QCQP problems. The use of intermediate QCQP problems enables us to exploit techniques from sensitivity analysis to differentiate the solution of the problem by back-propagating the solution over the cascade of convex problems.

Differentiable optimization and differentiable physics engine: Given that the solution of the frictional contact problem is a solution of a sequence of optimization problems, its differentiation requires derivatives of the solution of an optimization problem with respect to its parameters. In the case of an unconstrained optimization problem, a solution introduced in [20] consists in replacing the argmin operator by an approximation with an optimization procedure such as gradient descent. In this case, the number of gradient steps is fixed and each step represents an operation in the computational graph of the layer. Then, the gradient descent can be unrolled to compute the gradient with respect to the parameters of the optimization problem. However, this technique can lead to large computational graphs when the number of required gradient steps is large, increasing the computational cost of the backpropagation. Moreover, it is not possible to proceed this way when considering a constrained optimization problem because the optimization procedures often require projection steps which cannot be differentiated. Thus, implicit argmin differentiation, which relies on the differentiation of optimality conditions, appears to be a way to deal with constrained problems [21]. Although implementations of this approach can solve very general constrained optimization problems and provide the derivatives of the solution, they also lose efficiency in the process. More specialized solvers [22] use an equivalent implicit approach while taking advantage of the structure of the problem they are solving to gain efficiency. Simulators like [8] adapted this solver to be able to solve the LCP problem resulting from the approximation of the friction cone, to build a differentiable simulator. Although our work is closely related to [8], we avoid making any approximation by exploiting our extension of the formulation from [15] and the chain rule to differentiate the output of our simulator by differentiating through a sequence of optimization problems.

Generative physics model for system identification: The field of system identification [23] intends to build a mathematical model of a dynamical system from its measurements. The related work [8], [24]–[26] identifies parameters of physical systems using simulators as generative models. In each case, the identification is done by simulating the physical system and then optimizing the physical parameters so that the simulations fit the real scenes. In this work, we adopt an approach close to [8], by relying on the differentiation of our physical model to estimate its physical parameters. However, because we avoid some of the approximations made in [8], we are able to consider not only 2D but also 3D systems. This allows us to apply our approach to the concrete task of inferring physical parameters from videos [24].

III. DIFFERENTIABLE SIMULATION

In this section, we show how the staggered projections algorithm [15] can be adapted to handle both the friction cone and
elastic collisions. Then, we introduce the analytical derivatives of the QCQP problem appearing in this formulation, and propose a robust implementation that leads to a differentiable simulator.

A. Solving the Frictional Contact Problem

Simulating a physical system corresponds to computing the next system state $(q^{t+1}, v^{t+1})$ and the current contact forces $\lambda$, given the initial state $(q^t, v^t)$, where $q \in \mathbb{R}^{n_q}$ and $v \in \mathbb{R}^{n_v}$ are the vectors of generalized positions and velocities, $n_c$ being the number of contact points.¹ To compute these quantities, our method relies on three main physical laws that we are going to introduce: the Euler-Lagrange equation of motion, the complementarity constraint between contact normal accelerations and forces, and the Maximum Dissipation Principle (MDP) from Coulomb's law of friction. From the classical Lagrangian dynamics, we get the following generalized equations of motion in continuous time for an unconstrained system:

$$M a = \tau(q, v)$$

where $M \in \mathbb{R}^{n_v \times n_v}$ is the inertia matrix of the system, $a \in \mathbb{R}^{n_v}$ is the generalized acceleration and $\tau$ the vector of generalized forces which contains Coriolis and centrifugal effects, actuation, gravity and external forces. Moreover, when the dynamical system interacts with other objects, an additional term $J^T \lambda$ has to be added to represent the effect of contact interaction forces:

$$M a = \tau(q, v) + J^T \lambda \tag{1}$$

where $J \in \mathbb{R}^{3 n_c \times n_v}$ is the Jacobian of the contact points and $\lambda$ contains both the normal forces and the tangential friction forces. Following the approach proposed by [15], we consider separately the normal $\lambda_n^{t+1} \in \mathbb{R}^{n_c}$ and tangential $\lambda_t^{t+1} \in \mathbb{R}^{2 n_c}$ components of $\lambda$. We denote by $J_n \in \mathbb{R}^{n_c \times n_v}$ and by $J_t \in \mathbb{R}^{2 n_c \times n_v}$ the projections of $J$ on the normal and tangent directions of contacts respectively. Thus, after discretizing (1) with a time step $\Delta t$, we obtain:

$$M \left( v^{t+1} - v^t \right) = \Delta t \, \tau(q^t, v^t) + J_t^T \lambda_t^{t+1} + J_n^T \lambda_n^{t+1}. \tag{2}$$

In this letter, we prefer to exploit this velocity-based formulation to be able to deal with discontinuities appearing during collisions, as we will see later. This is why we will now talk about impulses instead of forces when referring to the contact interaction quantity $\lambda$.

Requiring that (i) rigid bodies cannot interpenetrate each other while (ii) contact impulses can act only to separate them when they are in contact leads to the complementarity constraint:

$$0 \le J_n v^{t+1} \perp \lambda_n^{t+1} \ge 0$$

Considering the law for multiple collision points [19] leads to the slightly modified constraint:²

$$0 \le J_n \left( v^{t+1} + \epsilon v^t \right) \perp \lambda_n^{t+1} \ge 0 \tag{3}$$

where $\epsilon$ is the coefficient of restitution quantifying the elasticity of the impact (when $\epsilon = 1$ the impact occurs with full restitution while $\epsilon = 0$ is a completely inelastic impact).

¹ Here, we also considered that the configuration vector and its related velocity vector may have different dimensions.
² In the same way, Baumgarte's stabilization can be used to avoid the point "drift" issue, with $0 \le J_n (v^{t+1} + \epsilon v^t + v^B) \perp \lambda_n^{t+1} \ge 0$ where $J_n v^B = -K e$ and $e$ the penetration error.

To model frictional contacts, we adopt Coulomb's law of friction. It imposes that the contact impulse $\lambda$ lies in a cone whose tightness is determined by the coefficient of friction $\mu$. At this stage, it is worth noting that $\mu$ takes two different values depending on whether the objects in contact are static ($\mu_{stat}$) or in relative motion ($\mu_{kin}$), with $\mu_{stat} \ge \mu_{kin}$. This constraint combined with the MDP gives:

$$\lambda_t^{t+1} = \operatorname*{argmax}_{\lambda_t \ \text{s.t.} \ \|\lambda_{t(i)}\|_2 \le \mu_i \lambda_{n(i)}^{t+1}} \; -\left( J_t^T \lambda_t \right)^T v^{t+1} \tag{4}$$

where $\lambda_{n,t(i)}$ corresponds to the contact impulses of the $i$th contact point. We note that the MDP (4) actually corresponds to the dual of the least action principle, which guarantees it to remain valid even at stiction.

Let $A$ be a convex set. We denote by:

$$P_A(x) = \operatorname*{argmin}_{z \in A} \frac{1}{2} (x - z)^T M^{-1} (x - z)$$

the operator of projection on the set $A$ under the metric induced by the inertia matrix $M$. We also denote by $C = \{ J_n^T \lambda_n, \ \lambda_n \ge 0 \}$ and $F(\lambda_n) = \{ J_t^T \lambda_t, \ \forall i \ \|\lambda_{t(i)}\|_2 \le \mu_i \lambda_{n(i)} \}$ the sets of admissible normal and tangential contact impulses, respectively. In the following, we also denote by $v^p \in \mathbb{R}^{n_v}$ the contact-free velocity which verifies $M (v^p - v^t) = \Delta t \, \tau(q^t, v^t)$.

Finally, (2), (3) and (4) correspond to the three physical principles we consider to simulate our system and compute the three unknowns $v^{t+1}$, $\lambda_t^{t+1}$ and $\lambda_n^{t+1}$. As demonstrated in [15], the $\lambda_t^{t+1}$ and $\lambda_n^{t+1}$ solving these three equations equivalently verify the following staggered projections:

$$P_C\left( -M (v^p + \epsilon v^t) - J_t^T \lambda_t^{t+1} \right) = J_n^T \lambda_n^{t+1}$$
$$P_{F(\lambda_n^{t+1})}\left( -M v^p - J_n^T \lambda_n^{t+1} \right) = J_t^T \lambda_t^{t+1}$$

and expanding the $P_C$ and $P_{F(\lambda_n^{t+1})}$ operators leads to the interdependent QP and QCQP:

$$\lambda_n^{t+1} = \operatorname*{argmin}_{\lambda \ge 0} \frac{1}{2} \lambda^T G_n \lambda + \lambda^T g_n$$
$$\lambda_t^{t+1} = \operatorname*{argmin}_{\|\lambda_{t(i)}\|_2 \le \mu_i \lambda_{n(i)}^{t+1}} \frac{1}{2} \lambda^T G_t \lambda + \lambda^T g_t \tag{5}$$

where:

$$G_n = J_n M^{-1} J_n^T, \quad g_n = J_n (v^p + \epsilon v^t) + G_{nt} \lambda_t^{t+1}$$
$$G_t = J_t M^{-1} J_t^T, \quad g_t = J_t v^p + G_{nt}^T \lambda_n^{t+1}$$

with $G_{nt} = J_n M^{-1} J_t^T$. The formulation of (5) naturally induces a fixed-point algorithm to solve for $\lambda_t^{t+1}$ and $\lambda_n^{t+1}$. Finally, by fixing the number of fixed-point iterations to $n_{step}$ (a convergence analysis similar to [15] finds $n_{step} \in [3, 10]$ to give a reasonable computation time and a precision sufficient for most applications), $\lambda_t^{t+1}$ and $\lambda_n^{t+1}$ can be computed by solving a sequence of optimization problems alternating between a QP and a QCQP (at lines 8 and 10 of Algo. 1).
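As a rough illustration of the fixed-point scheme induced by (5), the following NumPy sketch alternates between the normal QP and the tangential QCQP. It is not the authors' implementation: the function names are hypothetical, a plain projected-gradient loop stands in for the regularized ADMM solver, and each contact is assumed to own two consecutive entries of the tangential impulse vector.

```python
import numpy as np

def project_disks(lam_t, radii):
    """Project each 2-D tangential impulse onto the disk of radius radii[i]."""
    pairs = lam_t.reshape(-1, 2)
    norms = np.linalg.norm(pairs, axis=1)
    scale = np.minimum(1.0, radii / np.maximum(norms, 1e-12))
    return (pairs * scale[:, None]).ravel()

def projected_gradient(G, g, project, x0, iters=200):
    """Minimize 0.5 x^T G x + g^T x over a convex set given by its projection."""
    step = 1.0 / max(np.linalg.norm(G, 2), 1e-12)  # inverse of the spectral norm of G
    x = project(x0)
    for _ in range(iters):
        x = project(x - step * (G @ x + g))
    return x

def staggered_projections(M, Jn, Jt, v_free, v_prev, mu, restitution=0.0, n_step=10):
    """Alternate the QP (normal impulses) and QCQP (tangential impulses) of (5)."""
    Minv = np.linalg.inv(M)
    Gn, Gt, Gnt = Jn @ Minv @ Jn.T, Jt @ Minv @ Jt.T, Jn @ Minv @ Jt.T
    lam_n = np.zeros(Jn.shape[0])
    lam_t = np.zeros(Jt.shape[0])
    for _ in range(n_step):
        # Normal step: lam_n >= 0, with the tangential impulses held fixed.
        gn = Jn @ (v_free + restitution * v_prev) + Gnt @ lam_t
        lam_n = projected_gradient(Gn, gn, lambda x: np.maximum(x, 0.0), lam_n)
        # Tangential step: ||lam_t(i)|| <= mu_i * lam_n(i), normal impulses fixed.
        gt = Jt @ v_free + Gnt.T @ lam_n
        lam_t = projected_gradient(Gt, gt, lambda x: project_disks(x, mu * lam_n), lam_t)
    v_next = v_free + Minv @ (Jn.T @ lam_n + Jt.T @ lam_t)
    return v_next, lam_n, lam_t

# Toy usage: a 2 kg point mass sliding along x on flat ground (normal along z).
m, dt, mu = 2.0, 0.01, 0.2
M = m * np.eye(3)
Jn = np.array([[0.0, 0.0, 1.0]])
Jt = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
v_prev = np.array([1.0, 0.0, 0.0])
v_free = v_prev + dt * np.array([0.0, 0.0, -9.81])  # contact-free velocity v^p
v_next, lam_n, lam_t = staggered_projections(M, Jn, Jt, v_free, v_prev, np.array([mu]))
```

In this toy case the normal impulse cancels the gravity impulse and the tangential impulse saturates the cone, so the sliding velocity drops by $\mu g \Delta t$ over the step.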
To compute the argmin operation, we avoid classic Primal Dual Interior Point Method solvers and rely on a regularized ADMM algorithm. As we detail in Appendix A, this allows us to solve QCQPs with as much computation as QPs, but also to deal with over-constrained situations that make the problems (5) ill-conditioned, while requiring only tens of iterations to converge.

We can already notice that using the chain rule makes it possible to differentiate the computed velocity $v^{t+1}$ and contact impulses $\lambda_t^{t+1}$ and $\lambda_n^{t+1}$ as long as we know how to differentiate the successive argmin operators.

B. Differentiating the Solution

As explained in Section III-A, differentiating the outputs $v^{t+1}$, $\lambda_t^{t+1}$, $\lambda_n^{t+1}$ of the simulation requires computing the derivatives of the solution of the QCQP and QP problems that are involved in Algorithm 1. Thus, in the same way as it is done in [22], we implemented the function solving the particular case of QCQP (we do not detail the QP case as it is already studied in [22]) appearing during the step of projection onto $F(\lambda_n)$. This function can be written as:

$$z_{i+1} = \operatorname*{argmin}_{\|z_{t(i)}\|_2 \le \mu(z_i) \lambda_{n(i)}(z_i)} \frac{1}{2} z^T G(z_i) z + g(z_i)^T z \tag{6}$$

where $z_i$ is an input variable ($\mu$, $J$, $M$, $\lambda_n$, $\tau$, $v^t$ in our case) parameterizing the QCQP and $z_{i+1} \in \mathbb{R}^{2 n_c}$ its solution. Using the implicit differentiation approach [27], we implemented the analytical derivatives that make it possible to compute $\frac{\partial z_{i+1}}{\partial z_i}$, which is necessary when performing a backward pass. The Karush-Kuhn-Tucker optimality conditions of the QCQP can be written:

$$\mathrm{Diag}\left( \|z_{i+1,t(i)}\|_2^2 - \mu_i^2 \lambda_{n(i)}^2 \right) \gamma = 0$$
$$G z_{i+1} + g + 2 \, \mathrm{Diag}(\Gamma \gamma) \, z_{i+1} = 0$$

where $\gamma \in \mathbb{R}^{n_c}$ corresponds to the dual solution associated to the constraints and where $\Gamma \in \mathbb{R}^{2 n_c \times n_c}$:

$$\Gamma = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & \cdots & 0 & 1 \\ 0 & \cdots & \cdots & 0 & 1 \end{pmatrix}$$

Then, it is possible to differentiate these equations to get the following system where the unknowns are the variations of the primal and dual solutions $dz_{i+1}$ and $d\gamma$:

$$\Delta \begin{pmatrix} d\gamma \\ dz_{i+1} \end{pmatrix} = \delta \tag{7}$$

where:

$$\Delta = \begin{pmatrix} \Delta_{11} & \Delta_{12} \\ \Delta_{21} & \Delta_{22} \end{pmatrix}, \quad \delta = \begin{pmatrix} \delta_1 d\mu + \delta_2 d\lambda_n \\ -dG \, z_{i+1} - dg \end{pmatrix}$$

and:

$$\Delta_{11} = \mathrm{Diag}\left( \|z_{i+1,t(i)}\|_2^2 - \mu_i^2 \lambda_{n(i)}^2 \right),$$
$$\Delta_{12} = 2 \, \mathrm{Diag}(\gamma) \, \Gamma^T \mathrm{Diag}(z_{i+1}),$$
$$\Delta_{21} = 2 \, \mathrm{Diag}(z_{i+1}) \, \Gamma,$$
$$\Delta_{22} = G + 2 \, \mathrm{Diag}(\Gamma \gamma),$$
$$\delta_1 = 2 \, \mathrm{Diag}\left( \gamma_i \mu_i \lambda_{n(i)}^2 \right), \quad \delta_2 = 2 \, \mathrm{Diag}\left( \gamma_i \mu_i^2 \lambda_{n(i)} \right)$$

Solving (7) makes it possible to compute the derivatives $dz_{i+1}$ of the solution of our QCQP with respect to $G$, $g$, $\mu$ and $\lambda_n$. It is important to notice that those derivatives remain valid as long as the matrix $\Delta$ is invertible. For instance, when all constraints are inactive and $G$ has a high condition number, it is not possible to invert $\Delta$. In this case we use iterative refinement as introduced in [28] to solve the system. This makes it possible to approximately solve systems like $Ax = b$ even when $A$ is ill-conditioned. To do so, we intend to solve the problem $\min_x \frac{1}{2} \|Ax - b\|_2^2$, but the solution of this problem requires computing the pseudo-inverse of $A$ by applying a shift to the original problem to regularize it. Instead, iterative refinement uses an iterative process defined by $x_{k+1} = \operatorname{argmin}_x \frac{1}{2} \|Ax - b\|_2^2 + \frac{\rho}{2} \|x - x_k\|_2^2$, which converges to the solution of the least squares problem and only requires the computation of a regularized pseudo-inverse.

However, in many cases such as in Section IV-B, we do not want to compute the variations of the primal and dual variables $dz_{i+1}$, $d\gamma$ but rather the gradient of a loss $L$ formed with $z_{i+1}$ that we are minimizing with respect to the parameters $z_i$. As done in [22], we proceed by directly computing the product with the previous backward pass vector $\frac{\partial L}{\partial z_{i+1}}$ as follows:

$$dL = \frac{\partial L}{\partial z_{i+1}} \, dz_{i+1}$$
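This product does not require forming $dz_{i+1}$ explicitly. As a small numerical illustration (a sketch with random stand-ins for the matrix and right-hand side of the linear system (7), not the authors' solver; the function names are hypothetical), contracting the solution of a linear system with the incoming gradient gives the same scalar as one transposed solve followed by a dot product:

```python
import numpy as np

def loss_differential_direct(Delta, delta, grad_out):
    """Solve Delta @ d = delta for the stacked variations, then contract with grad_out."""
    d = np.linalg.solve(Delta, delta)
    return grad_out @ d

def backward_vector(Delta, grad_out):
    """Single transposed solve: b = Delta^{-T} grad_out."""
    return np.linalg.solve(Delta.T, grad_out)

rng = np.random.default_rng(0)
n_dual, n_primal = 2, 4                      # sizes of the dual and primal blocks (assumed)
n = n_dual + n_primal
# Random, strongly diagonally shifted matrix so that the system is well conditioned.
Delta = rng.standard_normal((n, n)) + 3 * n * np.eye(n)
# Incoming gradient: zeros on the dual block, dL/dz on the primal block.
grad_out = np.concatenate([np.zeros(n_dual), rng.standard_normal(n_primal)])

b = backward_vector(Delta, grad_out)         # computed once
deltas = [rng.standard_normal(n) for _ in range(3)]
direct = [loss_differential_direct(Delta, d, grad_out) for d in deltas]
adjoint = [b @ d for d in deltas]            # reused for every right-hand side
```

The transposed solve does not depend on the right-hand side, so a single backward vector serves every parameter perturbation.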
$$dL = \begin{pmatrix} 0 \\ \frac{\partial L}{\partial z_{i+1}}^T \end{pmatrix}^T \begin{pmatrix} d\gamma \\ dz_{i+1} \end{pmatrix} = \begin{pmatrix} b_\gamma \\ b_{z_{i+1}} \end{pmatrix}^T \delta$$

where $\begin{pmatrix} b_\gamma \\ b_{z_{i+1}} \end{pmatrix} = \Delta^{-T} \begin{pmatrix} 0 \\ \frac{\partial L}{\partial z_{i+1}}^T \end{pmatrix}$. Using the expression of $\delta$, we get:

$$\frac{\partial L}{\partial \lambda_n} = \delta_2^T b_\gamma, \quad \frac{\partial L}{\partial \mu} = \delta_1^T b_\gamma, \quad \frac{\partial L}{\partial g} = -b_{z_{i+1}}, \quad \frac{\partial L}{\partial G} = -b_{z_{i+1}} z_{i+1}^T$$

And finally, the gradient we are interested in, $\frac{\partial L}{\partial z_i}$, is obtained with the chain rule:

$$\frac{\partial L}{\partial z_i} = \frac{\partial L}{\partial \lambda_n} \frac{\partial \lambda_n}{\partial z_i} + \frac{\partial L}{\partial \mu} \frac{\partial \mu}{\partial z_i} + \frac{\partial L}{\partial g} \frac{\partial g}{\partial z_i} + \frac{\partial L}{\partial G} \frac{\partial G}{\partial z_i}.$$

Eventually, we observe on Fig. 3 that, on our particular QCQP/QP problems, our solver is efficient and robust during both the forward and backward passes. Indeed, the regularized ADMM of Appendix A and the iterative refinement respectively make it possible to solve ill-conditioned problems and compute their derivatives. This point is decisive as it makes it possible to deal with the case of $G$ only being positive semi-definite, which occurs often in robotics when $G$ is Delassus' matrix for over-constrained systems. The code of our solver is publicly available at https://github.com/quentinll/diffqcqp.

Fig. 3. Comparisons of runtime performances between [21] and ours, on randomly generated QCQPs of the form (6).

IV. EXPERIMENTS

In this section, we show through experiments how the differentiability of our simulator can be exploited to retrieve the physical parameters of a system from its trajectories. In addition, we show that it is possible only under the condition that the trajectory contains enough information to avoid any ambiguity, which leads us to some experiments on the observability issue. A video illustrating our work is available at https://youtu.be/d248IWMLW9o.

A. Experimental Setup

In our experiments, we intend to estimate the physical parameters of a cube from simulated and real dynamical scenes. The second part of the experiments involves another cube whose properties are known (Fig. 2).

For our simulator, we used the Pinocchio library [29], [30] for the implementation of the rigid body algorithms from [1] and for the collision detection algorithm, and PyTorch [31] for the implementation of backward Automatic Differentiation.

Fig. 2. Experimental setup to determine the physical properties of the cube. We considered two different scenes: in the first one, the unknown cube is sliding on the floor, starting with a given initial velocity $v_0$; in the second setup, the same cube collides with a second cube whose characteristics are known.

B. Physical Parameters Inference From Trajectories

To exhibit the new ability of our simulator, we will consider the scenario of an object interacting with the floor (a cube sliding on the floor, but it could be a more complicated scenario like a walking robot), where all parameters (the inertias $M$, external forces $\tau$) are known except for the coefficient of kinetic friction $\mu_{kin}$ of the object with the floor (we could do the same with other parameters). We will also consider that we have access to a trajectory $(x_0, x_1, \ldots, x_T)$ of this object interacting with the floor, where $x = (q, v, a)$. Here, we cover two cases: either (i) $x$ is generated in simulation so we know precisely the ground truth parameters of the system (Figs. 4(a), 4(b), 5 and 6), or (ii) the trajectory $x$ is extracted with a pose estimation algorithm from videos of real experiments (Fig. 4(c)). It is worth noting that the simulated trajectories were generated with an algorithm (PGS-NCP from [13]) and a time step different from the ones of the differentiable simulator used for the inference, and that we also added white noise (variance of $10^{-3}$ m) to make sure that results from simulations do not depend on the way trajectories are simulated.

We note the "simulator function" $g_\mu : x_t \mapsto \hat{x}_{t+1}$ whose computational graph corresponds to Algorithm 1 and whose only
unknown parameter is $\mu$. Then, we can define the MSE loss:

$$L(\mu) = \sum_{t=1}^{T} \|x_t - \hat{x}_t\|_2^2 = \sum_{t=1}^{T} \|x_t - g_\mu(x_{t-1})\|_2^2$$

which is the sum of the errors made by the simulator at each time step. Using the differentiability of the "simulator function" $g_\mu$ with respect to $\mu$, it is possible to compute $\nabla_\mu L$ by back-propagating the loss using the Automatic Differentiation tool of PyTorch [31], as illustrated on Fig. 1. Then, we minimize $L$ with respect to $\mu$ using the Adam algorithm [32].

Proceeding this way makes it possible to retrieve the coefficient of kinetic friction $\mu_{kin}$ from both simulated (Fig. 4(a)) and real (Fig. 4(c)) trajectories when all other parameters are known. The same method makes it possible to also infer the mass of the cube $M$ (Fig. 4(b)) or any other physical parameter (external forces, initial state, etc).

Fig. 4. Results of the inference process of physical parameters from simulated (Fig. 4(a), 4(b)) or real (Fig. 4(c)) trajectories. When the inference is done from simulated trajectories, our method always converges to the ground truth value. When trajectories come from real experiments (Fig. 4(c)), the ground truth value of $\mu_{kin}$ is not available but we observe that our system converges to $\mu_{kin} = 0.13$ for every initialization and this value is consistent with the tables [33] and the coefficient of static friction we measured, $\mu_{stat} = 0.18$.

Moreover, Fig. 5 demonstrates why our choice of modeling the friction as an ice-cream cone (instead of a pyramid) is decisive to ensure the success of the inference. Indeed, when such a pyramidal approximation is made, the value of $\mu_{kin}^{est}$ depends on the choice of orientation between the axes of the pyramid and the contact point velocity, as illustrated by Fig. 5(a). Thus in 3D, because pyramidal cones are not isotropic, the same friction coefficient value may lead to two different simulated trajectories when using formulations based on this kind of approximation [8], [11], [24], [26]. Similarly, in the context of friction estimation, the same observed motion may then lead to two different friction values depending on the orientation of the frictional pyramid (Fig. 5(a), 5(b)). This effect could be limited by approximating the cone by a polyhedron with more faces, which also comes at the cost of a larger computational time.

Fig. 5. Limitations of simulators approximating the friction cone with a pyramid [8], [11], [24] and [26]. On Fig. 5(a), the darker pyramid corresponds to the worst case, when the pyramid is rotated with an angle $\pi/4$ with respect to the contact velocity. As predicted, Fig. 5(b) shows that the inferred value $\mu_{kin}^{est1}$ converges towards $0.141 = \mu_{kin} / \sqrt{2}$.

C. Parameters Observability

In the context of inferring several physical parameters at the same time, we aim at minimizing $L(\mu, M, \epsilon) = \sum_{t=1}^{T} \|x_t - g_{\mu,M,\epsilon}(x_{t-1})\|_2^2$. When optimizing the model parameters, our approach would return one of the possible combinations of parameters (in the sense that several combinations of parameters may lead to the observed trajectories), but it may not be the true one. This limitation directly comes from the observability of the physical parameters. In the same way, for instance, it would be impossible to infer the friction coefficient of the cube with the floor if there was no contact between the cube and the floor in the given trajectory because, in this case, any value of $\mu_{kin}$ would be possible. Thus, the trajectory given for the inference process needs to make the desired parameters observable by excluding other possible values. This leads to other very interesting questions: How to generate a trajectory that exposes some particular properties of the system? Is it possible to infer physical characteristics of an object using information only coming from trajectories?

This last question refers to the case where the number of parameters we try to infer becomes large (which can happen when inferring shapes for instance) and leads to ambiguities that would require adding visual or material information to be solved. However, in this work, we only address the first question. We show that using more complex trajectories where the object whose characteristics are unknown is interacting with other known objects makes it possible to avoid some of the possible ambiguities.
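To make the inference procedure and the observability caveat concrete, here is a minimal pure-Python sketch on a one-dimensional sliding block. It is a toy stand-in, not the simulator of this letter: the model and function names are hypothetical, gravity is assumed to be the only external force, and plain gradient descent with a hand-derived gradient replaces PyTorch's Automatic Differentiation and Adam.

```python
G = 9.81  # gravitational acceleration (m/s^2)

def step(v, mu, mass, dt):
    """One velocity update of a block sliding on flat ground (toy model)."""
    friction_impulse = mu * mass * G * dt          # Coulomb friction with normal impulse mass * G * dt
    return max(v - friction_impulse / mass, 0.0)   # friction cannot reverse the motion

def rollout(v0, mu, mass, dt, n):
    """Generate a velocity trajectory of n steps."""
    traj = [v0]
    for _ in range(n):
        traj.append(step(traj[-1], mu, mass, dt))
    return traj

def loss(traj, mu, mass, dt):
    """Sum of squared one-step prediction errors, as in the MSE loss L(mu)."""
    return sum((traj[t + 1] - step(traj[t], mu, mass, dt)) ** 2
               for t in range(len(traj) - 1))

def fit_mu(traj, mu0, mass, dt, lr, iters):
    """Gradient descent on mu; valid while the block keeps sliding (no clamping)."""
    mu = mu0
    for _ in range(iters):
        # d step / d mu = -G * dt, hence d loss / d mu = sum 2 * residual * G * dt
        grad = sum(2.0 * (traj[t + 1] - step(traj[t], mu, mass, dt)) * G * dt
                   for t in range(len(traj) - 1))
        mu -= lr * grad
    return mu

dt = 0.01
observed = rollout(v0=2.0, mu=0.2, mass=1.0, dt=dt, n=50)   # ground truth mu = 0.2
mu_hat = fit_mu(observed, mu0=0.8, mass=1.0, dt=dt, lr=1.0, iters=50)

# The mass cancels out of the update, so the loss is flat in the mass:
loss_light = loss(observed, 0.3, 1.0, dt)
loss_heavy = loss(observed, 0.3, 7.0, dt)
```

Under these assumptions the friction coefficient is recovered, while the loss takes the same value for any mass: a sliding trajectory alone cannot distinguish a light block from a heavy one, which is the kind of ambiguity that an interaction with a known object can remove.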
LE LIDEC et al.: DIFFERENTIABLE SIMULATION FOR PHYSICAL SYSTEM IDENTIFICATION 3419

Fig. 6. The collision with an object, whose physical characteristics are known, allows to solve the ambiguity and retrieve the true parameters μ and M
simultaneously, which was not possible previously. In the same time, we are also able to find the value of the coefficient of restitution involved in the collision.

that the unknown cube collides with a known one during the experiment. We observe in Fig. 6 that this enables us to infer the mass of the cube M, together with its friction coefficient μ and the elasticity parameter ε, at the same time. Although the collision introduces a new parameter ε, it does not make the parameters harder to observe because ε can be determined independently using $\epsilon = -v_{rel}^{+}/v_{rel}^{-}$.

D. Limitations of the Approach

Even if they were not apparent during the previous experiments, we noticed two limitations of our framework that are inherent to the staggered projections algorithm [15]. First, as shown in [15], the algorithm is not monotone and therefore has no theoretical convergence guarantees. Establishing such guarantees for the staggered projections algorithm would be interesting future work. In addition, our approach requires solving a cascade of optimization problems at each step, which is why it is accurate, but it can also appear costly compared to algorithms that linearize the friction cone and solve only one LCP.

V. DISCUSSION AND FUTURE WORK

In this work, we extended the formulation of the frictional contact problem proposed by Kaufman et al., which allows the contact impulses to be written as the solution of a sequence of convex optimization problems. We then introduced the analytical derivatives of the various optimization problems involved in this formulation and proposed a simple but efficient implementation of these solvers and their analytical derivatives. We showed experimentally that our approach is able to infer physical parameters directly from videos of the evolution of interacting rigid body systems. Our experiments also demonstrated the importance of the observability issue when addressing this kind of task. More generally, we believe that the efficiency and robustness of our differentiable simulator can lead to concrete applications involving real physical scenes and large amounts of data, in particular in the context of robotic dexterous manipulation.

In future work, we intend to extend our framework to also include the inference of the position of contact points and the shape of objects directly from videos. Learning these quantities can require the introduction of many additional parameters, so we expect the observability issue to be central. In the present work, we only used videos to retrieve objects' trajectories, and we expect that additional advanced computer vision algorithms will provide valuable information to address this issue. Finally, exploiting the differentiable dynamics introduced in this letter within model-based control approaches (e.g., optimal control or model-based reinforcement learning) appears as another exciting research direction.

APPENDIX

A. Projecting on the Set of Frictional Impulses With ADMM

In order to improve the performance of our extension of the staggered projections algorithm, we implemented with PyTorch [31] a solver for the specific QCQP appearing during the projection step on $\mathcal{F}(\lambda_n)$. To solve this problem, we used the ADMM algorithm from [34]. The problem can be rewritten as:

$$\min_{x,z} \; f(x) + g(z) \quad \text{s.t.} \quad x = z$$

where $f(x) = \frac{1}{2} x^T P x + q^T x$ and $g(z) = I_C(z)$, with $I_C$ the characteristic function of $C = \{z \mid \forall i, \; \|z_t^{(i)}\|_2 \le \mu_i \lambda_n^{(i)}\}$. Thus, the ADMM algorithm can be written:

$$x^{k+1} = \operatorname{argmin}_x L_\rho(x, z^k, y^k)$$
$$z^{k+1} = \operatorname{argmin}_z L_\rho(x^{k+1}, z, y^k)$$
$$y^{k+1} = y^k + \rho\,(x^{k+1} - z^{k+1})$$

where y is the dual variable of the problem and $L_\rho(x, z, y) = f(x) + g(z) + y^T (x - z) + \frac{\rho}{2} \|x - z\|_2^2$ is the associated augmented Lagrangian. We also chose to add the term $\frac{\alpha}{2} \|x - x^k\|_2^2$ to the objective function f, which corresponds to a proximal regularization. Indeed, this term replaces P and q in the x-update with $\tilde{P} = P + \alpha\,\mathrm{Id}$ and $\tilde{q}_k = q - \alpha x^k$. This regularization of P makes it possible to handle ill-conditioned cases. In addition, it ensures that the smallest eigenvalue of $\tilde{P}$ is at least α (exactly α when P is singular), and, using the work from [35], we can automatically scale the parameter ρ of the augmented
Lagrangian, with $\rho = \sqrt{L\alpha}\,(L/\alpha)^{0.4}$, where L is the largest eigenvalue of P. Moreover, we observe that the step $z^{k+1} = \operatorname{argmin}_z L_\rho(x^{k+1}, z, y^k)$ can be seen as a projection step onto the convex set C. We adapt ρ during the optimization in the way proposed by [34]: when the ratio between the primal and dual residuals exceeds a threshold (which we fixed to 10), we adapt ρ by multiplying or dividing it by a power of the condition number, $(L/\alpha)^{0.1}$. Thanks to this automatic scaling of ρ, the ADMM algorithm solves the QCQP in an efficient and stable way for a large class of rigid body systems. Because the dual variable of the problem is computed iteratively, another advantage of the ADMM algorithm is that it induces a natural optimality criterion, namely the verification of the KKT optimality conditions on the gradient of the Lagrangian, $\|Px + q + y\|_\infty < \epsilon$, where ε here denotes a small tolerance.

REFERENCES

[1] R. Featherstone, Rigid Body Dynamics Algorithms. Springer, 2014.
[2] J. Carpentier and N. Mansard, “Analytical derivatives of rigid body dynamics algorithms,” in Robot.: Sci. Syst., 2018.
[3] J. Koenemann et al., “Whole-body model-predictive control applied to the hrp-2 humanoid,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2015, pp. 3346–3351.
[4] M. Posa, C. Cantu, and R. Tedrake, “A direct method for trajectory optimization of rigid bodies through contact,” Int. J. Robot. Res., vol. 33, no. 1, pp. 69–81, 2014.
[5] J. Carpentier and N. Mansard, “Multicontact locomotion of legged robots,” IEEE Trans. Robot., vol. 34, no. 6, pp. 1441–1460, Dec. 2018.
[6] J. Carius, R. Ranftl, V. Koltun, and M. Hutter, “Trajectory optimization with implicit hard contacts,” IEEE Robot. Automat. Lett., vol. 3, no. 4, pp. 3316–3323, Oct. 2018.
[7] C. Mastalli et al., “Crocoddyl: An efficient and versatile framework for multi-contact optimal control,” in Proc. IEEE Int. Conf. Robot. Automat., 2020, pp. 2536–2542.
[8] F. de Avila Belbute-Peres, K. A. Smith, K. R. Allen, J. B. Tenenbaum, and J. Z. Kolter, “End-to-end differentiable physics for learning and control,” Adv. Neural Inf. Process. Syst., vol. 31, pp. 7178–7189, 2018.
[9] J. Tan et al., “Sim-to-Real: Learning agile locomotion for quadruped robots,” in Proc. Robot.: Sci. Syst., Pittsburgh, Pennsylvania, Jun. 2018.
[10] A. M. Castro, A. Qu, N. Kuppuswamy, A. Alspach, and M. Sherman, “A transition-aware method for the simulation of compliant contact with regularized friction,” IEEE Robot. Automat. Lett., vol. 5, no. 2, pp. 1859–1866, Apr. 2020.
[11] E. Coumans and Y. Bai, “Pybullet, a Python Module for Physics Simulation for Games, Robotics and Machine Learning,” 2016–2019. [Online]. Available: http://pybullet.org
[12] E. Todorov, T. Erez, and Y. Tassa, “Mujoco: A physics engine for model-based control,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2012, pp. 5026–5033.
[13] P. C. Horak and J. C. Trinkle, “On the similarities and differences among contact models in robot simulation,” IEEE Robot. Automat. Lett., vol. 4, no. 2, pp. 493–499, Apr. 2019.
[14] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain randomization for transferring deep neural networks from simulation to the real world,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2017.
[15] D. M. Kaufman, S. Sueda, D. L. James, and D. K. Pai, “Staggered projections for frictional contact in multibody systems,” ACM Trans. Graph. (SIGGRAPH Asia 2008), no. 5, pp. 1–11, 2008.
[16] R. W. Cottle, J.-S. Pang, and R. E. Stone, The Linear Complementarity Problem. SIAM, 1992.
[17] M. Anitescu and F. A. Potra, “Formulating dynamic multi-rigid-body contact problems with friction as solvable linear complementarity problems,” Nonlinear Dyn., vol. 14, no. 3, pp. 231–247, 1997.
[18] M. Anitescu and A. Tasora, “An iterative approach for cone complementarity problems for nonsmooth dynamics,” Comput. Optim. Appl., vol. 47, no. 2, pp. 207–235, 2010.
[19] T. Giang, G. Bradshaw, and C. O’Sullivan, “Complementarity based multiple point collision resolution,” in Proc. Fourth Ir. Workshop Comput. Graph., 2003, pp. 1–8.
[20] J. Domke, “Generic methods for optimization-based modeling,” in Proc. 15th Int. Conf. Artif. Intell. Statist., in Proc. Mach. Learn. Res., N. D. Lawrence and M. Girolami, Eds., vol. 22. La Palma, Canary Islands: PMLR, Apr. 21–23, 2012, pp. 318–326. [Online]. Available: http://proceedings.mlr.press/v22/domke12.html
[21] A. Agrawal, S. Barratt, S. Boyd, E. Busseti, and W. Moursi, “Differentiating through a cone program,” J. Appl. Numer. Optim., vol. 1, no. 2, 2019.
[22] B. Amos and J. Z. Kolter, “Optnet: Differentiable optimization as a layer in neural networks,” in Proc. Int. Conf. Mach. Learn., PMLR, 2017, pp. 136–145.
[23] L. Ljung, System Identification: Theory for the User. Pearson Education, 1998.
[24] J. Wu, I. Yildirim, J. J. Lim, B. Freeman, and J. Tenenbaum, “Galileo: Perceiving physical object properties by integrating a physics engine with deep learning,” in Proc. Adv. Neural Inf. Process. Syst., C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, Eds., vol. 28. Curran Associates, Inc., 2015.
[25] S. Purushwalkam, A. Gupta, D. M. Kaufman, and B. Russell, “Bounce and learn: Modeling scene dynamics with real-world bounces,” 2019, arXiv:1904.06827.
[26] C. Song and A. Boularias, “Learning to slide unknown objects with differentiable physics simulations,” in Proc. Robot.: Sci. Syst., Corvallis, Oregon, USA, Jul. 2020.
[27] A. Griewank and A. Walther, Evaluating Derivatives, 2nd ed. Society for Industrial and Applied Mathematics, 2008.
[28] N. Parikh and S. Boyd, “Proximal algorithms,” Found. Trends Optim., vol. 1, no. 3, pp. 127–239, Jan. 2014.
[29] J. Carpentier et al., “Pinocchio: Fast forward and inverse dynamics for poly-articulated systems,” 2015–2021. [Online]. Available: https://stack-of-tasks.github.io/pinocchio
[30] J. Carpentier et al., “The Pinocchio C++ library: A fast and flexible implementation of rigid body dynamics algorithms and their analytical derivatives,” in Proc. IEEE Int. Symp. Syst. Integrations, 2019, pp. 614–619.
[31] A. Paszke et al., “Automatic Differentiation in Pytorch,” 2017.
[32] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2014, arXiv:1412.6980.
[33] D. Atack and D. Tabor, “The friction of wood,” Proc. Roy. Soc. London. Ser. A., Math. and Phys. Sci., vol. 246, no. 1247, pp. 539–555, 1958.
[34] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations Trends Mach. Learn., vol. 3, Jan. 2011.
[35] R. Nishihara, L. Lessard, B. Recht, A. Packard, and M. Jordan, “A general analysis of the convergence of ADMM,” in Int. Conf. Mach. Learn., PMLR, 2015, pp. 343–352.
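
As a supplement to Appendix A, the ADMM iteration described there can be illustrated on the smallest possible case: a single contact with a two-dimensional tangential impulse, so that the set C is a disk of radius μ λ_n. The following is a minimal pure-Python sketch under stated assumptions, not the authors' PyTorch implementation. The function names, the default values of `alpha`, `tol` and `max_iter`, and the clamping of ρ are all hypothetical choices made for this illustration, and the initial value of ρ follows the scaling formula as read from the appendix.

```python
import math


def largest_eig_sym2(P):
    """Largest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    a, b, d = P[0][0], P[0][1], P[1][1]
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)


def solve2(A, b):
    """Solve the 2x2 linear system A x = b with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def project_disk(v, radius):
    """Euclidean projection onto {z : ||z||_2 <= radius}."""
    norm = math.hypot(v[0], v[1])
    if norm <= radius:
        return list(v)
    return [radius * v[0] / norm, radius * v[1] / norm]


def admm_friction_projection(P, q, mu, lam_n, alpha=0.1, tol=1e-8,
                             max_iter=10000):
    """Minimize 0.5 x^T P x + q^T x subject to ||x||_2 <= mu * lam_n.

    P is assumed symmetric positive semidefinite and nonzero.
    Returns the feasible iterate z and the dual variable y.
    """
    radius = mu * lam_n
    L = largest_eig_sym2(P)
    kappa = L / alpha                       # ratio used to scale rho
    rho = math.sqrt(L * alpha) * kappa ** 0.4
    # Assumption of this sketch: keep rho within a bounded range.
    rho_min, rho_max = 1e-3 * rho, 1e3 * rho
    x, z, y = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for _ in range(max_iter):
        # x-update with proximal term: (P + alpha I + rho I) x = rhs
        A = [[P[0][0] + alpha + rho, P[0][1]],
             [P[1][0], P[1][1] + alpha + rho]]
        rhs = [-(q[i] - alpha * x[i]) - y[i] + rho * z[i] for i in range(2)]
        x = solve2(A, rhs)
        # z-update: projection of x + y / rho onto the friction disk
        z_old = z
        z = project_disk([x[i] + y[i] / rho for i in range(2)], radius)
        # dual update
        y = [y[i] + rho * (x[i] - z[i]) for i in range(2)]
        # residuals and KKT-based stopping criterion
        r = max(abs(x[i] - z[i]) for i in range(2))
        s = rho * max(abs(z[i] - z_old[i]) for i in range(2))
        kkt = max(abs(P[i][0] * x[0] + P[i][1] * x[1] + q[i] + y[i])
                  for i in range(2))
        if kkt < tol and r < tol:
            break
        # adapt rho when the residuals differ by more than a factor 10
        if r > 10.0 * s:
            rho = min(rho * kappa ** 0.1, rho_max)
        elif s > 10.0 * r:
            rho = max(rho / kappa ** 0.1, rho_min)
    return z, y


# With P = Id and q = (-3, -4), the unconstrained minimizer is (3, 4), so the
# result should approach its projection onto the unit disk, (0.6, 0.8).
z_star, y_star = admm_friction_projection(
    [[1.0, 0.0], [0.0, 1.0]], [-3.0, -4.0], mu=0.5, lam_n=2.0)
print(z_star, y_star)
```

Returning z rather than x guarantees that the reported impulse lies inside the friction disk even if the loop stops before full convergence. For several contacts, the same structure would apply with one disk projection per contact and a general linear solve in the x-update.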
