An Asynchronous Multi-Body Simulation Framework for Real-Time Dynamics, Haptics and Learning with Application to Surgical Robots
Abstract: Surgical robots for laparoscopy consist of several patient-side slave manipulators that are controlled via surgeon-operated master telemanipulators. Commercial surgical robots do not perform any sub-tasks autonomously, even those of a repetitive or non-invasive nature, nor do they provide intelligent assistance. While this is primarily due to safety and regulatory reasons, the state of such automation intelligence also lacks the reliability and robustness for use in high-risk applications. Recent developments in continuous control using Artificial Intelligence and Reinforcement Learning have prompted growing research interest in automating mundane sub-tasks. To build on this, we present an inspired Asynchronous Framework which incorporates real-time dynamic simulation (manipulable with the masters of a surgical robot and various other input devices) and interfaces with learning agents to train and potentially allow for the execution of shared sub-tasks. The scope of this framework is generic to cater to various surgical (as well as non-surgical) training and control applications. This scope is demonstrated by examples of multi-user and multi-manual applications which allow for realistic interactions by incorporating distributed control, shared task allocation and a well-defined communication pipe-line for learning agents. These examples are discussed in conjunction with the design philosophy, specifications, system-architecture and metrics of the Asynchronous Framework and the accompanying Simulator. We show the stability of the Simulator while achieving real-time dynamic simulation and interfacing with several haptic input devices and a training agent at the same time.

Adnan Munawar & Gregory S. Fischer are with the Department of Robotics Engineering, Worcester Polytechnic Institute, MA, 01609, USA [amunawar, gfischer]@wpi.edu
This work is supported by the National Science Foundation through National Robotics Initiative Grant (NRI): IIS-1637759

I. INTRODUCTION
Partial autonomy of sub-tasks has exciting prospects for research aimed at the next generation of surgical robotics. Research in this area focuses on assisting the surgeon in accomplishing sub-tasks, thereby making the automation passive in nature. Some notable research in this area includes autonomous algorithms for performing soft-tissue suturing [1], an automated approach for sinus surgery using computer navigation techniques [2], characterization and automation of soft-tissue suturing using a curved needle guide [3] and automation of cutting/creasing sub-tasks while employing learning by observation [4]. Additionally, [5] presents a holistic approach to simplifying the task of manipulator positioning prior to surgeon interaction, and [6] demonstrates a telemanipulated surgical simulation designed for heart surgery. A trainable infrastructure is presented in [7] with controllable dominance and aggression factors for automating repetitive surgical tasks. Lastly, a shared infrastructure for collecting da Vinci Research Kit (dVRK) manipulator and vision data, primarily for training learning agents by motion decomposition of sub-tasks, is developed in [8].

Recent developments in deep learning and AI have sparked the interest of researchers from a variety of fields. Until very recently, the scope of deep learning algorithms was limited to discretized problems, and thus, most real world control problems remained out of reach. However, the introduction of the Deep Deterministic Policy Gradients (DDPG) model [9], an improvement over Deterministic Policy Gradients, broke new ground while making the realization of smart agents for continuous control problems seemingly possible. These advancements led to the successful training of a simulated human rag-doll capable of running, jumping and avoiding obstacles [10]. Unsurprisingly, there has been an increase in the number of software libraries targeted for machine and reinforcement learning developed by the open source community. Many of these libraries provide Python APIs and are capable of utilizing high-bandwidth system resources for faster training on data. Zamora et al. [11] present a useful reinforcement learning toolkit catered towards mobile robots that employs such a Python interface for reinforcement learning by interconnecting the Gazebo simulator with OpenAI's GYM [12].

Fig. 1: An overview of components selected for the Asynchronous Framework for Assistive intelligence, simulation and collaborative control.

We propose the use-case of an Intelligent Agent for collaborative control of real-time tasks, specialized for robotic surgery. Given that we intend to assist the Master with collaboration rather than fully automatic control, we use the more appropriate term of Assistive Intelligence. The coordination can range between two target types: (1) multi-sensory feedback to the Master (visual, haptic and tactile) and (2) cooperative control of one or more slaves in conjunction with user-controlled manipulators. We propose a framework to achieve both forms of assistance by providing
Authorized licensed use limited to: LAHORE UNIV OF MANAGEMENT SCIENCES. Downloaded on July 19,2023 at 18:06:39 UTC from IEEE Xplore. Restrictions apply.
reactionary forces are applied to the corresponding input devices in the same time-step.

Fig. 3: A block diagram depicting the design of the Asynchronous Control Scheme. The simulated end-effectors and devices maintain independent and mutually exclusive Data Structures (DS) that are updated on successive writes and are capable of asynchronous reads.

Fig. 4: Figures (a) and (b) show the haptic update-rate of 5 devices when controlled 'sequentially' vs 'asynchronously', respectively. Figures (c) and (d) show the corresponding rates for physics update-loops for 'sequential' vs 'asynchronous' control.

Controlling each device in task-space requires a large number of matrix operations, including similarity transforms to enable hand-eye (camera) coordination and offset-transforms for clutch engaging/disengaging (presented in Section III-C and equation 4). The drivers for several commercial devices, namely the Geomagic Phantom/Touch (3D Systems Corp, Rock Hill, SC, USA) and the Falcon (Novint Technologies Inc., NY, USA), impose a delay while commanding forces to restrict the update-rate. Tracker devices such as the Razer Hydra (Razer Inc., CA, USA) operate at lower update-rates (≤ 400 Hz), and hence pose additional challenges. Therefore, the issue with the “sequential” implementation is that reading/writing multiple devices throttles the update-rate of the dynamic and haptic feedback loops. Moreover, mixing devices with different update-rates makes the dynamic simulation unusable. This can be alleviated by withholding the force commands in the main loop and executing them concurrently in a separate thread. However, while this improves the update-timing, it makes the devices and SDEs unstable due to a non-deterministic delay between the computation of control laws and the application of output forces.

To counteract these issues, an asynchronous control scheme is implemented where the loop delays are isolated from each other to prevent cyclic deterioration. A block diagram representing this control scheme is shown in Figure 3. Implementation-wise, the dynamic update-loop runs in a separate thread and each haptic update-loop runs in its own individual thread. Each input device owns a data-structure which is shared to allow for asynchronous reads and writes. This data-structure maintains the device’s states and has fields to store the commanded forces. A similar, but non-identical, data-structure is defined for each SDE. The novelty in this implementation is the application of forces/commands in dynamic and haptic threads, as the execution counters of each thread are asynchronous by design. The difference between the two control schemes is analyzed by experimenting with a multi-manual task of grasping, picking and placing objects and recording the dynamic and haptic update-rates. In one example configuration, the framework is stressed by simultaneously testing five haptic devices including two Novint Falcons, a Geomagic Touch, and two master telemanipulators (MTMs) from Intuitive Surgical Inc. (Sunnyvale, CA, USA). As shown in Figure 4(a) and (c), in the sequential implementation, the update-rate never meets the 1 kHz set-point. On the other hand, in Figure 4(b) and (d), the device update-rates stay close to 1 kHz but the dynamic update-rate can swing depending upon the collision computation for high-density meshes during contact. The states and commands are stored outside the haptic/dynamic update-loops and are then used as “set-points” in the relevant threads to prevent saturating the forces in both simulation and haptic feedback loops.

B. Design of Asynchronous Framework (AMBF) Simulator
We used a design philosophy, motivated by several different sources, which treats bodies in a dynamic simulation as independent objects with self-contained kinematic & dynamic properties, thereby mimicking real-world objects. This philosophy is to be distinguished from the practical implementation where the simulated bodies are part of an interconnected graph in a unified simulation and require sequential updates. The goal of this design philosophy is to allow for asynchronous manipulation and control of each simulated body independently. As a result, objects in the simulation are classified as either afObject or afWorld, where ‘af’ stands for ‘Asynchronous Framework’. An afObject is a kinematic or dynamic rigid body which can have any number of movable parts (including 0). At their core, both afObject and afWorld have two interfaces for communication, utilizing afState / afCommand as the state / command pair. These two interfaces implement a generic input-output design that is easy to scale and can communicate in parallel through an Inter Process Communication (IPC) medium (Figure 2).

The AMBF Simulator has a single world instance which is responsible for managing all the visual, kinematic and dynamic objects. This world instance supports features such as step-throttling, step-skipping and reporting metrics (discussed in more detail in Section III-F).
offset. These quantities are calculated from equations 3 and 5.
provides the flexibility to launch the communication instances from within the AMBF Simulator without the need for ros-launch files.

Each communication instance owns a WatchDog timer which has a primary and a secondary function for data transmission control. The primary function of the WatchDog is to reset the afCommand if the timing condition (the invocation frequency of the afObjComm/afWorldComm callback) is not met. This keeps the asynchronous control safe for physical devices connected to the AMBF Simulator. The WatchDog timer is re-initiated once a stream of new commands starts flowing in. The secondary purpose of the WatchDog timer is to limit the publishing frequency of afStates to lower values if the WatchDog timer expires, thus reducing the use of computing resources.

E. The Python Client
As discussed in Section I, many of the popular libraries for learning and training agents have Python interfaces (Keras, GYM, Tensorflow/Theano, and Keras-RL). In alignment with these preferred interfaces, we present a stand-alone Python client that complements the AMBF Simulator. This client is capable of creating callable instances of afObjects and afWorld (using ROS Communication) which are isolated from one another to reduce communication and computational overheads. We outlined various specifications that prioritize robustness in handling load. These design specifications are intended for real-time training on data as well as closed-loop control by accounting for the communication overheads and slower execution speeds of Python applications. Based on these specifications, the Python Client uses data sequencing techniques and payload time-stamps to keep track of states, actions and rewards. The consequence of the design specifications is reflected not only in the Python Client itself but also in the AMBF Simulator and the Payload Types (Section III-F). A block diagram representing the Python Client is shown in Figure 7.

Fig. 7: The Python Client communicates with the AMBF Simulator using ROS as a middle-ware. AMBF ENV retrieves the requested handles for objects from the Python Client and provides a GYM compatible interface.

For safety reasons, each callable instance of afObject and afWorld in the client inherits a WatchDog timer which enforces command resetting if the timing condition fails. The Python Client is capable of throttling the dynamic update-loop of the AMBF Simulator, in which case it provides a clock to step the dynamic update-loop. This clock is provided using the “Clock” field in the afCommand message for afWorld, and the number of jump steps can be set to > 1. All this is done automatically by the Python Client as the user/training agent sets the corresponding parameters at run-time.

Ena Throttle is used to control the flow of simulation based on the toggle of Clock. Jump Steps is the number of steps the simulation must take between each clock toggle (event). The requirement for throttling the simulation comes from the action-reward pair for the valid Markov States in RL problems. This requirement mandates the states to have associated rewards, which are meaningless if the simulation is not throttled between the update-steps of training (forward and backward pass of the Neural Network). Server Time is the time (to nanosecond precision) of the clock running in the AMBF Simulator, and Sim Time is the time of the inner-clock of the dynamic simulation. This time is incremented at each iteration of the dynamic solver such that:

$$t_{sim}^{n} = t_{sim}^{n-1} + dt_{d} \tag{7}$$

Since the dynamic update-loop (Dyn Freq) runs asynchronously without any real-time constraints, using a fixed dt causes time dilation between the wall (world) and simulation clock (shown in Figure 8a with one input device and dynamic time-step dt = 0.001). The reason for this dilation is evident. Lacking a real-time kernel and custom sleep function makes it harder to meet the desired frequency, which shifts the two clocks. Even with a real-time kernel, the start-up time for initializing haptic devices can throw off the simulation clock. Moreover, the nature of collision computation techniques in physics simulation libraries makes the computational time variable and, in effect, non-deterministic, as it depends on the varying number of bodies in contact and their geometries.
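The throttling fields (Ena Throttle, Clock, Jump Steps) and the simulation-time update of equation 7 can be illustrated with a small Python sketch. It is a toy model written under stated assumptions, not the AMBF Simulator code: the class `ThrottledWorld` and its methods `on_command` and `update` are hypothetical names, and the sketch assumes that each Clock toggle releases exactly Jump Steps solver iterations.

```python
class ThrottledWorld:
    """Toy stand-in for the dynamic update-loop (hypothetical, not AMBF code)."""

    def __init__(self, dt=0.001):
        self.dt = dt                # fixed dynamic time-step
        self.sim_time = 0.0         # Sim Time
        self.step_count = 0
        self.ena_throttle = False   # Ena Throttle
        self.jump_steps = 1         # Jump Steps
        self._last_clock = False
        self._pending = 0           # steps still allowed before the next toggle

    def on_command(self, ena_throttle, clock, jump_steps=1):
        """Handle a world command; a toggle of `clock` releases more steps."""
        self.ena_throttle = bool(ena_throttle)
        self.jump_steps = max(1, int(jump_steps))
        if clock != self._last_clock:
            self._last_clock = clock
            self._pending = self.jump_steps

    def update(self):
        """One pass of the dynamic loop. Returns True if physics advanced."""
        if self.ena_throttle:
            if self._pending == 0:
                return False        # wait for the next clock toggle
            self._pending -= 1
        # Equation 7: the new sim time is the previous sim time plus dt.
        self.sim_time += self.dt
        self.step_count += 1
        return True


if __name__ == "__main__":
    world = ThrottledWorld(dt=0.001)
    clock = False
    for _ in range(3):              # three training iterations
        clock = not clock           # the client toggles Clock once per iteration
        world.on_command(ena_throttle=True, clock=clock, jump_steps=5)
        for _ in range(50):         # the simulator loop spins much faster
            world.update()
    print(world.step_count, world.sim_time)
```

In this sketch the simulation clock advances only by Jump Steps times dt per training iteration, regardless of how often the simulator loop runs, so each state handed to the agent corresponds to a well-defined amount of simulated time.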
TABLE I: afObject and afWorld payloads for closed-loop control and training via online data

afWorld                          afObject
afState        afCommand         afState              afCommand
-              -                 Base Frame           Base Frame
Msg Num        -                 Msg Num              Msg Num
Server Time    Client Time       Server Time          Client Time
Sim Time       -                 Sim Time             -
Num Devs       Ena Throttle      Name                 Ena Pos Ctrl
Dyn Freq       Clock             Mass & Inertia       Pose
-              Jump Steps        Transform            Wrench
-              -                 Children[]           Pos Ctrlr Mask[]
-              -                 Joint Positions[]    Joint Cmds[]
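To make the payload layout of Table I concrete, the four payloads can be mirrored as plain Python data classes. This is only an illustrative sketch: the field names follow the table, but the class names, field types, default values and the helper `is_newer` are assumptions and do not reproduce the actual AMBF message definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorldState:                      # afWorld / afState
    msg_num: int = 0
    server_time: float = 0.0
    sim_time: float = 0.0
    num_devs: int = 0
    dyn_freq: float = 0.0


@dataclass
class WorldCommand:                    # afWorld / afCommand
    client_time: float = 0.0
    ena_throttle: bool = False
    clock: bool = False
    jump_steps: int = 1


@dataclass
class ObjectState:                     # afObject / afState
    base_frame: str = ""
    msg_num: int = 0
    server_time: float = 0.0
    sim_time: float = 0.0
    name: str = ""
    mass: float = 0.0
    inertia: List[float] = field(default_factory=lambda: [0.0] * 3)
    # Transform stored here as position + quaternion (an assumed encoding).
    transform: List[float] = field(
        default_factory=lambda: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
    children: List[str] = field(default_factory=list)
    joint_positions: List[float] = field(default_factory=list)


@dataclass
class ObjectCommand:                   # afObject / afCommand
    base_frame: str = ""
    msg_num: int = 0
    client_time: float = 0.0
    ena_pos_ctrl: bool = False
    pose: List[float] = field(
        default_factory=lambda: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
    wrench: List[float] = field(default_factory=lambda: [0.0] * 6)
    pos_ctrlr_mask: List[bool] = field(default_factory=list)
    joint_cmds: List[float] = field(default_factory=list)


def is_newer(state, last_seen_msg_num):
    """Hypothetical sequencing check: accept only states not yet processed."""
    return state.msg_num > last_seen_msg_num
```

A client could use a check such as `is_newer` together with the time-stamp fields to pair each action with the state that followed it, in the spirit of the data sequencing mentioned for the Python Client.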
Fig. 10: These sub-figures show the progression (left to right) of a bi-manual task using the AMBF Simulator. The two end-effectors holding the green multi-link puzzle piece are controlled by dVRK Masters (shown as picture-in-picture on the top right) and the other two end-effectors are controlled via a Razer Hydra (shown as picture-in-picture on the top left).
An essential feature of the Python Client is controlling the AMBF Simulator synchronously for satisfying Markov’s action-reward pair property under the umbrella of the Asynchronous Framework. Immediate feedback is difficult to achieve in a distributed architecture since it involves delay due to (1) round-trip communication of packets and (2) processing time in between for computing kinematics, dynamics and updating packet data. The latency caused by the round-trip of data is presented in Figure 12(a) and (b). As illustrated in Figure 12(b), the bottleneck is caused by the execution speed of Python. This jitter is expected as a result of the longer queue sizes: the latency increases as newer data has to wait in a queue while the program processes older data. For synchronous control, however, we are concerned with the latest data, and thus, the queue-size is set to 1. The limited queue-size helps to drive down the round-trip communication latency to ≤ 0.001 s for 2 kHz of communication speed (Figure 12(a)).

With regard to the evolution of the Asynchronous Framework, we intend on integrating flexible body dynamics and visualizations to create realistic surgical training applications using simulated body tissues, organs, cloths and threads. The foreseeable challenges include the complexity of implementation, stability, performance, dynamic-update to track the real-world clock and manipulation using haptic devices. Moreover, new guidelines for the communication interfaces afState-afCommand need to be developed. Lastly, the inclusion of flexible body dynamics should employ a generic framework design to allow for universal adaptation as opposed to a more targeted application.

In this manuscript, we discussed the motivation behind the design philosophy of an Asynchronous Framework for a distributed application that is intended for training learning agents via real-time input from a dynamic-haptic simulation. The entire framework is available at the public repository [24]. The challenges to such an implementation are discussed throughout the text, and we have concluded with the performance analysis of the proposed Asynchronous Framework.

REFERENCES
[1] A. Shademan, R. Decker, J. Opfermann, S. Leonard, A. Krieger, and P. C. W. Kim, “Supervised autonomous robotic soft tissue surgery,” Science Translational Medicine, vol. 8, pp. 337ra64–337ra64, 05 2016.
[2] K. Bumm, J. Wurm, J. Rachinger, T. Dannenmann, C. Bohr, R. Fahlbusch, H. Iro, and C. Nimsky, “An automated robotic approach with redundant navigation for minimal invasive extended transsphenoidal skull base surgery,” Minimally invasive neurosurgery : MIN, vol. 48, pp. 159–64, 07 2005.
[3] S. Sen, A. Garg, D. V. Gealy, S. McKinley, Y. Jen, and K. Goldberg, “Automating multi-throw multilateral surgical suturing with a mechanical needle guide and sequential convex optimization,” in Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016, pp. 4178–4185.
[4] A. Murali, S. Sen, B. Kehoe, A. Garg, S. McFarland, S. Patil, W. D. Boyd, S. Lim, P. Abbeel, and K. Goldberg, “Learning by observation for surgical subtasks: Multilateral cutting of 3d viscoelastic and 2d orthotropic tissue phantoms,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 1202–1209.
[5] A. Krupa, J. Gangloff, M. de Mathelin, C. Doignon, G. Morel, L. Soler, J. Leroy, and J. Marescaux, “Autonomous retrieval and positioning of surgical instruments in robotized laparoscopic surgery using visual servoing and laser pointers,” in Robotics and Automation, 2002. Proceedings. ICRA’02. IEEE International Conference on, vol. 4. IEEE, 2002, pp. 3769–3774.
[6] R. Bauernschmitt, E. U. Schirmbeck, A. Knoll, H. Mayer, I. Nagy, N. Wessel, S. Wildhirt, and R. Lange, “Towards robotic heart surgery: Introduction of autonomous procedures into an experimental surgical telemanipulator system,” The International Journal of Medical Robotics and Computer Assisted Surgery, vol. 1, no. 3, pp. 74–79, 2005.
[7] K. Shamaei, Y. Che, A. Murali, S. Sen, S. Patil, K. Goldberg, and A. M. Okamura, “A paced shared-control teleoperated architecture for supervised automation of multilateral surgical tasks,” in Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015, pp. 1434–1439.
[8] T. D. Nagy and T. Haidegger, “An open-source framework for surgical subtask automation,” Robotics and Automation (ICRA) Workshops, 2018 IEEE International Conference on, 2018.
[9] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” arXiv preprint arXiv:1509.02971, 2015.
[10] N. Heess, S. Sriram, J. Lemmon, J. Merel, G. Wayne, Y. Tassa, T. Erez, Z. Wang, S. Eslami, M. Riedmiller, et al., “Emergence of locomotion behaviours in rich environments,” arXiv preprint arXiv:1707.02286, 2017.
[11] I. Zamora, N. G. Lopez, V. M. Vilches, and A. H. Cordero, “Extending the openai gym for robotics: a toolkit for reinforcement learning using ros and gazebo,” arXiv preprint arXiv:1608.05742, 2016.
[12] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “Openai gym,” 2016.
[13] A. Deguet, R. Kumar, R. Taylor, and P. Kazanzides, “The cisst libraries for computer assisted intervention systems,” in MICCAI Workshop on Systems and Arch. for Computer Assisted Interventions, Midas Journal, vol. 71, 2008.
[14] P. Kazanzides, Z. Chen, A. Deguet, G. S. Fischer, R. H. Taylor, and S. P. DiMaio, “An open-source research kit for the da Vinci® surgical system,” in 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, May 2014.
[15] C. et al., “The CHAI libraries,” in Proceedings of Eurohaptics 2003, Dublin, Ireland, 2003, pp. 496–500.
[16] E. Coumans, “Bullet physics simulation,” in ACM SIGGRAPH 2015 Courses, ser. SIGGRAPH ’15. New York, NY, USA: ACM, 2015. [Online]. Available: http://doi.acm.org/10.1145/2776880.2792704
[17] C. et al., “Keras,” https://github.com/keras-team/keras, 2015.
[18] M. Plappert, “Keras-rl,” https://github.com/matthiasplappert/keras-rl, 2016.
[19] D. Shreiner and T. K. O. A. W. Group, OpenGL Programming Guide: The Official Guide to Learning OpenGL, Versions 3.0 and 3.1, 7th ed. Addison-Wesley Professional, 2009.
[20] A. Munawar and G. Fischer, “Towards a haptic feedback framework for multi-dof robotic laparoscopic surgery platforms,” in Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on. IEEE, 2016, pp. 1113–1118.
[21] A. Munawar, “Plugin based Interface for the da Vinci Research Kit (dVRK) MTMs,” https://github.com/WPI-AIM/dvrk_arm, 2016.
[22] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, “Ros: an open-source robot operating system,” in ICRA workshop on open source software, vol. 3, no. 3.2. Kobe, Japan, 2009, p. 5.
[23] M. Sagardia, T. Stouraitis, and J. L. e Silva, “A new fast and robust collision detection and force computation algorithm applied to the physics engine bullet: Method, integration, and evaluation,” in Proc. of the Conf. and Exhibition of the European Association of Virtual and Augmented Reality (EuroVR), 2014, pp. 65–76.
[24] A. Munawar, “The Asynchronous Multi-Body Framework,” https://github.com/WPI-AIM/ambf, 2019.