0% found this document useful (0 votes)
118 views44 pages

Swarm of Micro Flying Robots in The Wild

Uploaded by

1020544978
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
118 views44 pages

Swarm of Micro Flying Robots in The Wild

Uploaded by

1020544978
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 44

Supplementary Materials for

Swarm of micro flying robots in the wild

Xin Zhou et al.

Corresponding author: Fei Gao, [email protected]; Chao Xu, [email protected]

Sci. Robot. 7, eabm5954 (2022)


DOI: 10.1126/scirobotics.abm5954

The PDF file includes:

Sections S1 to S10
Figs. S1 to S17
Tables S1 to S13

Other Supplementary Material for this manuscript includes the following:

Software
Section S1. Extra Comparison of Reciprocal Collision Avoid-
ance

Table S1. Extra performance comparison. Comparison is performed under a position swap
configuration in an 8×8 m empty space containing eight agents with radius of 0.25 m. Max-
imum velocity and acceleration are set to 1.7 m/s and 6 m/s2 , respectively. Min. Clear. de-
notes the minimum
R distance from collision. Smoothness is computed from the time integral of
2
squared jerk ||j||2 dt. Units of time and distance are second and meter, respectively.

Online/ Comp. Traj. Traj. Min. Smooth-


Method Safety
Offline Time Time Length Clear. ness
SCP, h scp=0.3s 1.46 7.65 9.64 -0.37 No 7.01
SCP, h scp=0.17s 7.72 7.40 9.70 -0.26 No 17.5
Offline
RBP,no batches 0.320 15.4 11.3 0.01 Yes 0.608
RBP,batch size=4 0.413 15.4 11.5 0.03 Yes 0.604
Mader 0.0239 7.50 12.2 0.19 Yes 125.1
EGO, horizon=7.5m 0.000554 8.12 10.1 0.17 Yes 72.0
Online EGO, horizon=10m 0.000844 8.17 10.3 0.31 Yes 94.1
Proposed, κ = 5 0.000465 7.57 9.70 0.09 Yes 30.4
Proposed, κ = 8 0.000557 7.58 9.71 0.11 Yes 28.2

Except for EGO (33) and MADER (37), we compare the proposed planner against two

other centralized offline swarm planners, i.e., SCP (49) and RBP (50). Results are given in
Table S1. A bold term indicates a better value in the corresponding category (Online/Offline),
while an underlined term indicates the best value among all planners. As given in Table S1, SCP

achieves a shorter flight time at the sacrifice of trajectory safety. RBP achieves smoother flight
at the cost of a relatively long flight time. Both fail to balance various aspects of trajectory
quality. Compared with offline methods, the three online planners show more balance while

requiring less computation by several orders of magnitude. Among them, MADER is more
conservative because its safety is ensured through minimum volume enclosing simplices. All
the data are recorded from 10 random runs.
Section S2. Scalability Evaluation
Time and Space Complexity

In real-world usage, larger swarms always mean a more powerful team for broader coverage

in exploration, a grater possible load for transportation, and higher robustness against robot
failures, among other advantages. However, traditional optimization-based trajectory planning
suffers from a high computing complexity related to the number of robots, as stated by Park et

al. (50) the time complexity of which is O(n3 ). Therefore, Park et al. have to divide the entire

swarm into groups and then plan the trajectory sequentially, which only reliefs the complexity

to a limited extent.

Owing to the decentralized architecture and appropriate problem formulation, our time com-

plexity is significantly reduced to approximately linear to the drone number from statistics.

Since the baseline of computing time is short, the swarm size can increase to hundreds of robots

before breaking the real-time bottom line. Meanwhile, the space complexity is exactly linear to

the drone number, as one drone only stores the latest trajectory from every other one. Indeed,

this space usage can be ignored because one trajectory is typically below 1 kB in size, and there

are several gigabytes in the system memory.

The complexity can be further reduced (20) if each drone only considers nearby units, since
we perform continuous planning only within a limited distance. In our work, we ignore drones

more than twice the planning distance (see Table S6 ) away from each other. The resulting
complexity is below O(n), and is determined by the density of drone distribution as in Fig. S1.
Such complexity allows the swarm to be expanded to a large size before breaking the real-time

threshold.
0.6

Time (ms)
Computing Time(ms)
0.5

Computing 0.4

0.3
0.3 1m
1m
2m
5m
5m
0.2
0.2
10
10 20
20 30 40
40 50
50
Drone Number
Drone Number

Fig. S1. Scalability on computation efficiency. This figure provides the computing time for
different drone numbers under three swarm distributions shown in Fig. S15.

Wireless Communication Quality

Communication capacity can be affected by environmental signal interference, connection topol-

ogy, and drone distribution. To control signal interference, we conduct this experiment in a quiet

park with drones hovering and UWBs turned on. Then, in this test, we compare two topologies,

namely a centralized network with one access point and a peer-to-peer ad hoc network using the

hardware described in Section Palm-Sized Drone Hardware. In addition, various distributions


illustrated in Fig. S2A from tight to loose, are tested. Two widely used metrics, bandwidth, and

latency, are adopted to assess network capacity. Since we write reliable network communica-
tion software that guarantees data arrival, the packet loss rate is not evaluated (which equals
either 0% or 100%). The bandwidth Bw is quantized to the highest trajectory (175 bytes each)

broadcasting frequency by all the drones together before reaching the 250-ms latency threshold.
Latency TL is recorded at 5 Hz of the trajectory broadcasting rate.
To be collision-free, in a centralized network, each drone must maintain a direct connection

to the access point (AP), so we record the worst value among all drones. By contrast, for the
Fig. S2. Scalability on communication quality. (A) Three kinds of swarm distributions. (B)
Network quality for two kind of topologies under three swarm distributions.

ad hoc network, each drone directly connects to the others. Furthermore, when two drones get

further than twice the planning horizon from each other, the collision penalty between them is

always zero, and freedom from collision is guaranteed naturally. Thus, we only evaluate data

between each pair of drones with a relative distance below twice the planning horizon (15 m in
this experiment) for an ad hoc network.
From the results in Fig. S2B, a centralized network performs better in tight formation, while

an ad hoc network shows better communication quality when the formation is loose. From the
integral view, an ad hoc network scales better with the number of drones when in loose distribu-
tion, while a centralized network provides higher bandwidth when the drone number is small;

thus, both networks are complementary to an extent. In theory, as a peer-to-peer network, an ad


hoc network creates connections that follow factorial complexity in a dense formation, while a
centralized network only follows linear complexity in this scenario. Therefore, developers must
choose a proper network topology or adopt both types, of them according to task characteristics.

The proposed drone platform supports both networks.

Fig. S3. Localization drift corrections. (A) Thin and thick lines are trajectories before and
after corrections, respectively. The former ones show significant drifts that could have ruined
the formation if not corrected. (B) Localization drift of drone 0. More drones are shown in
Fig. S13.

Section S3. Assessments of Localization Drift Correction

The data recorded from the experiment in Section Formation Navigation in the Wild are given

in Fig. S3. The desired start and goal shapes are 2 × 5 drone rectangles with 1-m spacings. With
the corrected localization, the drone managed to maintain the shape after a 55-m flight, although
they are not precisely aligned in yaw at the start point. By contrast, Figs. S3 and S13 show the

magnitude of the estimated drift, which reaches approximately 3 m. Under such significant
drift, the formation suffers severe shape distortion and reciprocal collision if no correction is
made. However, since no global information such as GPS data is available, accumulated errors

for the entire team must exist. That is why we add some fixed anchors in Section Intensive
Reciprocal Avoidance Evaluation. In summary, the necessity and effectiveness of the proposed
correction method are validated here. The block diagram is shown in Fig. S14.

Section S4. Quantitative Analysis of the Gate-passing Test

To further analyze the planner performance, the gate-passing test in Section Benchmark Com-
parisons is conducted again with some representative data visualized in Fig. S4. Among the
four sub-figures, B and C are mutually constrained. This means, for a shorter flight time, these

drones must fly as fast as possible under the maximum speed of 2 m/s, while they also have
to alter velocity to maintain a relative distance. Then, for drones in front, velocity can be in-
creased to reduce flight time while enlarging relative distance, but the velocity is limited. For

the drones left behind, reducing the velocity as little as possible to maintain the shortest relative
distance to those in front is the best strategy. The above is exactly what is achieved in Fig. S4B
and C, where drones managed to fly across the gate at the maximum velocity and minimum

relative distance allowed. What’s more, from Fig. S4D, we can conclude that drones fly along
the shortest trajectory without detours or wasting of energy.

Section S5. Trajectory Parameterization for Aerial Robots

The pros and cons of typical trajectory parameterization for aerial robots are summarized in Ta-
ble S2. Polynomial curves are widely used by traditional approaches (4, 10, 38), which mostly

7
A
2
4
5 1
3 0

B
Relative Distance (m)

2 4 5 1 0 3
2

Safety Threshold
0
0 2 4 6 Time(s) 8 10 12 14

C 2
2 4 5 1 0 3 vx
vy
Velocity (m/s)

1 vz

-1
0 2 4 6 Time(s) 8 10 12 14
D 30
Straight-line Distance
Trajectory Length (m)

22.36 22.38 Real Trajectory Distance 22.36 22.36


20.88 20.89 20.10 20.11 20.10 20.10 20.88 20.89
20

10

0
0 1 2 Drone ID 3 4 5

Fig. S4. Performance evaluation in the gate-passing test. (A) Six snapshots taken when
each drone is passing the gate. (B) The minimum relative distances recorded by each drone
are denoted by the light blue area, the lower bound of which is emphasized by a blue curve.
Multicolored lines marked with numbers of drone ID indicate the time at which each drone flies
through the gate. Red dotted line is the 0.7-m swarm clearance in optimization. (C) Minimum-
to-maximum velocity curves. The orange dotted line at a 2-m/s velocity is the velocity limit.
(D) Comparison between real trajectory lengths and straight-line distances (the minimum length
in theory).

8
use polynomial coefficients as decision variables, along with explicit continuity constraints.
Many works (10, 38) deform a piece-wise polynomial trajectory by adding intermediate way-

points. However, the distribution of waypoints requires a careful configuration to balance fea-
sibility satisfaction and computational burden. To optimize the time profile of a polynomial,
the representative method uses gradient descent with finite differences (38). Nevertheless, this

method requires a dense matrix inversion to transform waypoints to coefficients, making the
optimization quickly intractable when the problem scales up. Bézier and B-Spline share the
convex hull nature, which indicates that the curve is entirely contained in the convex hull of its

control points (13, 23). Therefore, these curves are convenient for adding constraints, which
can be done by merely restricting the control points inside feasible convex regions. Many
works (24, 37, 50) follow this underlying property and show successful applications in aerial

swarms. However, this property indeed brings conservativeness, as analyzed by (47), prevent-
ing a trajectory from being aggressive near its physical limits. Moreover, an n order B-Spline
is naturally n − 1 order continuous (48). Thus no continuity constraint is required as long as

the B-Spline has a qualified order. The time profile of a Bézier curve can be adjusted with ac-
ceptable complexity by its basis (23), while this approach is rather complicated for a B-Spline
due to its time evaluation containing high nonlinearity. MINCO (57) is a newly developed

trajectory representation specialized for integrator chain systems, and its core feature is that it
efficiently handles a wide variety of constraints while retaining spatial and temporal optimality.
By discretizing the trajectory to add constraints, MINCO is less conservative but requires further

checks after obtaining the solution. Although MINCO is the most complicated representation
and requires a sophisticated implementation, it provides a solid foundation for spatial-temporal
drone swarm planning.

9
Table S2. Pros and cons of commonly used trajectory parameterizations.

Dynamical Temporal Implementation


Method Continuity Safety
Feasibility Optimization Difficulty
Polynomial Extra By Discretization Easy
Coupled
Bézier Curve Overhead Analytical but
B-Spline Conservative Intractable Medium
By Nature
MINCO By Discretization Decoupled

Section S6. Spatial-Temporal Trajectory Optimization for


Swarms — A More General and Detailed Problem Formula-
tion
MINCO Trajectory Class

MINCO is a polynomial trajectory class defined as


n
MINCO = p(t) : [0, tM ] 7→ Rm c = M(q, T),
o
m×M −1 M
q∈R , T ∈ R>0 ,

where p(t) is an m-dimensional M -piece polynomial trajectory with degree N, N = 2s − 1

with s the order of the relevant integrator chain, c = (cT T T


1 , . . . , cM ) ∈ R
2M s×m
the polynomial
coefficient with ci ∈ R2s×m the coefficient matrix, q = (q1 , . . . , qM −1 ) the intermediate way-
points, T = (T1 , T2 , . . . , TM )T the time allocated for all pieces, ti = ij=1 Tj , i ∈ [1, M ], and
P

t0 = 0 in the Supplementary Materials. The notion M(q, T) equals Mq,T in the main text.
Specifically, the trajectory is evaluated as

p(t) = pi (t − ti−1 ), ∀t ∈ [ti−1 , ti ]. (S1)

The i-th piece pi (t) : [0, Ti ] 7→ Rm is defined by

pi (t) = cT
i β(t), ∀t ∈ [0, Ti ], (S2)

10
𝑇𝑇1 𝑇𝑇2 𝑇𝑇𝑀𝑀−1 𝑇𝑇𝑀𝑀
𝐪𝐪1 𝐩𝐩𝟐𝟐,𝟎𝟎
𝐩𝐩𝟐𝟐,𝟏𝟏
𝑝𝑝(𝑡𝑡) 𝐜𝐜𝑀𝑀−1 𝛽𝛽(𝑡𝑡)
𝐜𝐜1 𝛽𝛽(𝑡𝑡) 𝐩𝐩𝟐𝟐,𝟐𝟐
𝐜𝐜2 𝛽𝛽(𝑡𝑡) 𝐩𝐩𝟐𝟐,𝟑𝟑
𝐩𝐩𝟐𝟐,𝟒𝟒 … 𝐪𝐪𝑀𝑀−2
𝐪𝐪𝑀𝑀−1 𝐜𝐜𝑀𝑀 𝛽𝛽(𝑡𝑡)
𝐪𝐪2

Fig. S5. Parameters of a MINCO trajectory and its constraint evaluation. The trajectory
is directly parameterized by each waypoint qi and time Ti . Red dots are constraint points p̊ at
network
which thatis shares
any constraint evaluated trajectories
and propagated toand
everyperforms
qi and Ti . on-demand time

where β(t) := [1, t, · · · , tN ]T is the natural basis. The core of MINCO is the parameter map-

ping c = M(q, T) constructed from Theorem 2 in (57). Technically, the mapping directly
constructs a minimum control trajectory for an m-dimensional s-integrator chain for any spec-
ified initial and terminal conditions. An instance in MINCO is intuitively shown in Fig. S5.

Owing to the limited paper length, we only intuitively introduce the features of MINCO. De-
tailed proofs are presented in (57).
Feature 1: Compactly represented by q and T, MINCO is a class of C s−1 polynomial

trajectories satisfying the following control effort minimization:


Z tM
min p(s) (t)T Wp(s) (t)dt, (S3a)
p(t) t0
[s−1]
s.t. p (t0 ) = p̄o , p[s−1] (tM ) = p̄f , (S3b)

p(ti ) = qi , 1 ≤ i < M, (S3c)

ti−1 < ti , 1 ≤ i ≤ M, (S3d)

where W ∈ Rm×m is a diagonal matrix with positive entries, p[s−1] = [p, p(1) , p(2) , · · · , p(s−1) ]

with , p̄o and p̄f ∈ Rm×s the specified initial and terminal conditions, respectively. The trajec-
tory is enforced to pass qi at time ti .
Feature 2: The evaluation and differentiation of the mapping c = M(q, T) benefits from

11
linear complexity. A more specific correspondence can be expressed as the following function

M(T)c = b(q), (S4)

where M(T) ∈ R2M s×2M s is a banded matrix with non-singularity for any T  0 ensured by
Theorem 2 in (57), b(q) ∈ R2M s×m . The conversion between these two trajectory represen-

tations, i.e., (c, T) and (q, T) is achieved using Banded PLU Factorization with O(M ) linear
time and space complexity. The evaluation is extremely fast, which takes approximately 1 µs
per piece for minimum-jerk (s = 3) trajectory generation on a desktop CPU.

The formulation of M and b are


 
F0 0 0 ··· 0

 E1 F1 0 ··· 0 

 0 E2 F2 ··· 0 
M= , (S5a)
 
.. .. .. ... ..

 . . . . 

 0 0 0 · · · FM −1 
0 0 0 · · · EM
Ei = (β (0) (Ti ), β (0) (Ti ), β (1) (Ti ), · · · , β (2s−2) (Ti ))T ∈ R2s×2s , i ∈ [1, M − 1] (S5b)

Fi = (02s×1 , −β (0) (0), · · · , −β (2s−2) (0))T ∈ R2s×2s , i ∈ [1, M − 1] (S5c)


T
EM = β (0) (TM ), · · · , β (s−1) (TM ) ∈ Rs×2s , (S5d)
T
F0 = β (0) (0), · · · , β (s−1) (0) ∈ Rs×2s ; (S5e)

T
b= D0 , D1 , 0m×(2s−1) , · · · , DM −1 , 0m×(2s−1) , DM , (S6a)

Di = qi ∈ Rm , i ∈ [1, M − 1], (S6b)

D0 = p̄o , DM = p̄f , D0 , DM ∈ Rm×s . (S6c)

Feature 3: Feature 2 allows any user-defined objective or constraint function F (c, T) with
available gradients applicable to MINCO parameterized in (q, T). Specifically, the correspond-

12
ing objective of MINCO is computed as

H(q, T) = F (M(q, T), T). (S7)

Then the mapping c = M(q, T) gives a linear-complexity way of computing ∂H/∂q and
∂H/∂T from the corresponding ∂F/∂c and ∂F/∂T, i.e.,

∂H ∂F ∂c ∂H ∂F ∂c ∂F
= , = + , (S8)
∂q ∂c ∂q ∂T ∂c ∂T ∂T
↑   ↑   

where derivatives marked by ↑ are the gradients finally required by a numerical optimizer, 
are the elements calculated by user code, and  are just computed using the linear-complexity

mapping c = M(q, T), i.e., Eqs. S4 –S6. Users need not consider the constraints between c
and T as if decoupled. Next, a high-level optimizer is able to optimize the objective efficiently.

Time Integral Constraints

Conventionally, requirements for dynamical feasibility and collision avoidance are formulated
as functional-type constraints G(p(t), . . . , p(s) (t))  0, ∀t ∈ [0, tM ], G ∈ Rng with ng the num-

ber of types of constraints. However, this formulation cannot be directly handled by constrained
optimization, since G contains infinitely many inequality constraints. Thus, we transform G into
a finite-dimensional one via an integral of constraint violation. Practically, the integral is trans-

formed into a weighted sum JΣ (c, T) of the sampled penalty function. In most cases where
constraints are decoupled, i.e., constraints G(p[s] (t)) with ti ≤ t < ti+1 are merely determined
by ci and Ti , p[s] = [p, p(1) , · · · , p(s) ], the penalty for an i-th trajectory piece is computed as
κi
Ti X j
Ji (ci , Ti ) = ω̄j χT max (G(ci , Ti , ), 0)3 , (S9)
κi j=0 κi
n
where κi is the sample number on the i-th piece, χ ∈ R≥0g a vector of penalty weights with
appropriately large entries (several orders of magnitude larger than the smoothness penalty),

13
and (ω̄0 , ω̄1 , . . . , ω̄κi −1 , ω̄κi ) = (1/2, 1, · · · , 1, 1/2) the quadrature coefficients following the
trapezoidal rule. We define the points determined by {ci , Ti , j/κi } as constraint points p̊i,j :=

pi ((j/κi )Ti ), with pi (t) the i-th polynomial piece. Then G(p̊i,j ) := G(ci , Ti , j/κi ). Note that
p̊i,κi = p̊i+1,0 . The cubic penalty is used here for its twice continuous differentiability. JΣ (c, T)
is then defined as the summation:
M
X
JΣ (c, T) = Ji (ci , Ti ). (S10)
i=1

Since the gradients w.r.t. (c, T) are constructed from all (ci , Ti ), we only give gradient
templates for ci and Ti using the chain rule when Eq. S9 holds,

∂JΣ ∂Ji ∂Ji ∂G


= = , (S11)
∂ci ∂ci ∂G ∂ci
∂JΣ ∂Ji Ji ∂Ji ∂G ∂t
= = + , (S12)
∂Ti ∂Ti Ti ∂G ∂t ∂Ti
κi
∂Ji Ti X
=3 ω̄j χT max (G, 0)2 , (S13)
∂G κi j=0

j ∂t j
t= Ti , = . (S14)
κi ∂Ti κi
From the above equations, Eqs. S11 –S14, once the gradients of constraints G relative to polyno-
mial coefficients ci and time t are evaluated, the gradients of JΣ (c, T) are efficiently computed,
which are then substituted into Eq. S8 to acquire final gradients respective to q and T.

Penalty Formulation

The basic requirements on a desired trajectory are smoothness, total time, dynamical feasibility,
as well as safety among obstacles and other agents. In this paper, we adopt MINCO to address
above all concerns. Since MINCO is naturally guaranteed C s−1 by Theorem 2 in (57), no

extra effort must be paid to trajectory continuity. The trajectory generation is formulated as an

14
unconstrained optimization problem:
X
min λx Jx , (S15)
q,T
x

where Jx are JΣ instances, λx is the weight for each penalty, and subscripts x = {s, t, d, o, w, u}

represent smoothness (s), total time (t), dynamical feasibility (d), obstacle avoidance (o), swarm
reciprocal avoidance (w), and uniform distribution of constraint points (u), respectively. More
task-specific penalties can also be added. The problem is solved via standard, unconstrained

nonlinear optimization.

Smoothness Js

The smoothness metric follows the definition in Eq. S3b. Without loss of generality, considering
a single dimension of the i-th piece pi (t) : [0, Ti ] 7→ R of a polynomial spline, its derivatives
with respect to ci and Ti are
Z Ti 
∂Js (s) (s) T
=2 β (t)β (t) dt ci , (S16)
∂ci 0

∂Js T
= cT (s) (s)
i β (Ti )β (Ti ) ci . (S17)
∂Ti

Total Time Jt

A shorter flight time is desirable, so we also minimize the weighted total execution time Jt =
PM
i=1 Ti . Obviously, its gradients ∂Jt /∂c = 0, ∂Jt /∂T = 1.

Dynamical Feasibility Penalty Jd

For multicopters, dynamical feasibility is always guaranteed by limiting the trajectory’s deriva-

tives. In this paper, we limit the amplitude of velocity, acceleration, and jerk. Following Eq. S9,
constraints of velocity, acceleration, and jerk are denoted by

...
Gv = ṗ(t)2 − vm
2
, Ga = p̈(t)2 − a2m , Gj = p (t)2 − jm
2
, (S18)

15
where vm , am , jm are maximum allowed velocity, acceleration and jerk, respectively. The cor-
responding gradients are

∂Gx (n) ∂Gx (n)


= 2β (n) (t)pi (t)T , = 2β (n+1) (t)T ci pi (t), (S19)
∂ci ∂t

where x = {v, a, j}, n = {1, 2, 3}, and t = jTi /κi , respectively. Substituting Eq. S18 into S9

and S19 into S11 & S12 obtains the penalty and the gradients of ci and Ti .

Obstacle Avoidance Penalty Jo

In this paper, we adopt the front-end of collision evaluation devised by (14). A brief description

is given here for better understanding. In (14), the information for an unsafe trajectory to escape
collision is extracted by comparing the unsafe initial trajectory with a collision-free guiding
path. As depicted in Fig. S6, a fixed safe point s with a fixed safe vector v is generated and

recorded by a corresponding key point pkey on the trajectory. Then, the distance of pkey to the
obstacle that the {s, v} pair generated is defined as

do (pkey , {s, v}) = (pkey − s)T v. (S20)

Using the distance information, the trajectory deforms iteratively during optimization. If pkey
discovers a new obstacle, a new {s, v} pair is stacked to pkey ’s records. Therefore each pkey may

record several {s, v} pairs. Note that each pkey generates and possesses {s, v} pairs privately,
and thus does not calculate its distance from other pkey s’ {s, v} pairs.
In this paper, constraint points p̊i,j with 1 ≤ i ≤ M, 1 ≤ j ≤ κi are selected as the

key points pkey . To enforce safety, we penalize distance to obstacles by less than a obstacle
clearance Co . Therefore, the obstacle avoidance constraint is defined as

Go = (· · · , Gok (p̊i,j ), · · · )T ∈ RNsv , (S21)

Gok (p̊i,j ) = Co − do (p̊i,j , {s, v}k ), (S22)

16
Enlarging

Unsafe Trajectory A Safe Solution

Fig. S6. Collision avoidance formulation. The red curve is the optimized trajectory and blue
curve a collision-free path that starts and ends on the red one. A fixed point s with a vector v is
generated to determine the collision penalty.
network that shares trajectories and performs on-demand time
where Nsv is the number of {s, v} pairs recorded by a single p̊i,j . For the case in which
do (p̊i,j , {s, v}k ) ≤ Co , the gradient is computed as

∂Gok ∂Gok
= −β(t)vT , = −ṗi (t)T v. (S23)
∂ci ∂t

Otherwise, ∂Gok /∂ci = 02s×m , ∂Gok /∂t = 0.

Swarm Reciprocal Avoidance Penalty Jw

In this paper, a decentralized, non-cooperative swarm framework is adopted, which means that

one agent receives other agents’ trajectories as constraints and generates trajectories only for
itself. Considering the u-th agent in a multicopter swarm containing U agents, swarm collision
avoidance is guaranteed when the trajectory u p(t) of agent u maintains a distance greater than

a safe clearance Cw to all the trajectories at the same global timestamps of other agents as
depicted in Fig. S7. However, caution must be taken because we have always been using a
relative time t = jTi /κi for pi (t) in our optimization, while the other agents’ trajectories take

an absolute timestamp τ = tof + T1 + · · · + Ti−1 + jTi /κi , with tof = tw − τw as a time


offset to align t and τ from different world time tw and τw , respectively. This indeed makes
the penalty evaluated on the i-th piece depend on all its preceding piece times. We denote by

17
Other Agents’
Trajectories

𝑡𝑚𝑖𝑛 𝑡𝑚𝑎𝑥
My Trajectory

network
Fig. S7. Reciprocal thatavoidance
collision shares trajectories and performs
formulation. on-demand
Gray dotted arrowstime
indicate states at the
same absolute time, at which a safe distance must be ensured.

Gi,j u u
w the constraint evaluated at the j-th constraint point on the i-th piece of p(t), i.e., pi (t).

Therefore the reciprocal avoidance constraint is defined as

Gwi,j (u pi (t), τ ) = (· · · , Gwk (u pi (t), τ ), · · · )T ∈ RU −1 , k 6= u, (S24)

Gwi,jk (u pi (t), τ ) = Cw2 − d2w (u pi (t), k p(τ )), (S25)

dw (u pi (t), k p(τ )) = E1/2 (u pi (t) − k p(τ )) , (S26)

where k p(τ ) is the trajectory of the k-th agent that the u-th agent must avoid, k 6= u. The matrix
E := diag(1, 1, 1/c) with c > 1 transforms Euclidean distance into ellipsoidal distance with
the minor axes at the z-axis to relieve the downwash risk from rotors. Squared distance is used

to avoid square root operations.


When d2w (u pi (t), k p(τ )) ≤ Cw2 , the gradient to ci is

∂Gwi,jk
= −2β(t)(u pi (t) − k p(τ ))T E. (S27)
∂ci

The gradient to T is more complicated, since τ is a function of not only Ti , but T1 to Ti−1 as
well when calculating the gradients for Gwi,jk . This led to the gradient-propagate template Eq. S12
not holding here for ∂JΣ /∂Ti 6= ∂Ji /∂Ti . When d2w (u pi (t), k p(τ )) ≤ Cw2 , considering both

preceding and current time profiles Tl for 1 ≤ l ≤ i, the proper gradient should be computed as

18
U U M X κi
∂Jw X ∂Jwk X X ∂Jwi,jk
= = , (S28)
∂Tl k=1,k6=u
∂T l
k=1,k6=u i=l j=1
∂T l

∂Jwi,jk Jwi,jk ∂Jwi,jk ∂Gwi,jk


= + , (S29)
∂Tl Tl ∂Gwi,jk ∂Tl
∂Gwi,jk ∂Gwi,jk ∂t ∂Gwi,jk ∂τ
= + , (S30)
∂Tl ∂t ∂Tl ∂τ ∂Tl
∂Gwi,jk T
= 2 k p(τ ) − u pi (t) Eu ṗi (t), (S31)
∂t
∂Gwi,jk T
= 2 u pi (t) − k p(τ ) Ek ṗ(τ ), (S32)
∂τ ( (
j j
∂t l = i, ∂τ l = i,
= κi = κi (S33)
∂Tl 0 l < i, ∂Tl 1 l < i.
The trajectory k p(τ ) for any other agent is already known, and thus its derivatives are all avail-
able. ∂Jwi,jk /∂Gwi,jk in Eq. S29 is computed following Eq. S13, while (∂G/∂t)(∂t/∂Ti ) in Eq. S12
is replaced by Eq. S30. Next ∂Jw /∂T = [∂Jw /∂T1 , · · · , ∂Jw /∂TM ]T .

Uniform Distribution Penalty Ju

This penalty is not mentioned in the main body of the article since this well-packed module will
not bring about sufficient insights to general readers in terms of understanding the methodology,

but only some confusion. The purpose of this penalty is to make the constraint points p̊i,j
equally spaced for all 1 ≤ i ≤ M and 0 < j ≤ κi , which is a simpler substitution to the space-
uniform variant of MINCO mentioned in (57). It is necessary since if all MINCO parameters

q and T are optimized freely, some pieces of the trajectory tend to vanish to reduce the total
cost. This phenomenon is harmful for three reasons. First, Ti = 0 for the i-th trajectory piece
is a unique singular point for MINCO, which does not consider consecutive aligned waypoints.

Second, since the spatial collision avoidance is constrained by a finite number of constraint
points, non-uniform distribution increases the possibility of skipping some tiny or thin obstacles,
as shown in Fig. S8. Third, since we compute the distance to obstacles following Eq. S20, the

19
Skipped

network that shares trajectories and performs on-demand time


Fig. S8. Non-uniform constraint points (red dots) fail to detect a thin obstacle.

obstacles are modeled as planes. The accuracy of this modeling decreases when constraint

points move along the plane and the uniform distribution reduces such movement.
To enforce the uniform distribution of constraint points, we penalize the variance of the
squared distances between each pair of adjacent constraint points. For simplicity, we denote

Nc = M
P
i=1 κi as the total number of distances. The variance is computed as

1 1
Ju = E R2 − E [R]2 = kRk22 − 2 kRk21 ,
 
(S34)
Nc Nc

where R ∈ RNc is defined by

R = kp̊1,1 − p̊1,0 k22 , · · · , kp̊M,κM − p̊M,κM −1 k22 .



(S35)

R is the squared distance vector for all Nc + 1 constraint points. We denote by Rk the k-th

entry in R, k = j + i−1
P
l=1 κl . The gradient is computed as

κi   T
∂Ju X jTi ∂Ju
= β , (S36)
∂ci j=1
κi ∂p̊ i,j
 
∂Ju 4 kRk1
= Rk−1 − (p̊i,j − p̊i,j−1 )
∂p̊i,j Nc Nc
 
4 kRk1
− Rk − (p̊i,j+1 − p̊i,j ) , (S37)
Nc Nc
κi  T  
∂Ju 1 X ∂Ju T jTi
= j ci β̇ . (S38)
∂Ti κi j=1 ∂p̊i,j κi

20
Temporal Constraint Elimination

An open-domain constraint is T  0, which is directly eliminated by a diffeomorphism map as


is done in (57):

Ti = eτi , (S39)

where τi is the unconstrained virtual time. Gradients w.r.t. Ti are propagated to τi following
∂J/∂τi = (∂J/∂Ti )eτi . More generally, any twice continuously differentiable bijection f :
R 7→ R>0 suffices for this map.

Section S7. Collision Avoidance for Heterogeneous Swarms

In the conducted experiments, desired swarm clearance Cw is fixed due to identical agent size. If
implementing the proposed planning algorithm for a heterogeneous swarm, each agent should
broadcast its desired clearance, which is typically its radius plus a tolerance distance. Then, for

the u-th agent that is going to avoid agent k, the swarm clearance between them is defined as
uk
Cw = u Cw +k Cw , where u Cw and k Cw are the desired clearances by agents u and k, respectively.

Section S8. Drone Removal on Depth Images

In the proposed system, an occupancy grid is adopted for static map representation, which
can be affected by moving agents in depth images. Therefore, pixels of other agents are set

to unknown. This procedure is performed at Line 14 in Alg. 1. The visual-inertial odometry


that we utilize for localization requires grayscale images of static environments, which indeed
contain moving agents. Nonetheless, the outlier is eliminated by RANSAC. Thus, moving

agents are unlikely to affect our localization module which uses a wide angle-of-view camera.

21
Algorithm 1: DroneDetection(u, k)
1 Notation: Detector u, Target Drone k,
2 Detector Pose Pu , Target Pose Pk ,
3 Depth Image Iu , Threshold T .
Input: Iu , Pu , Pk
Output: CorrectPk , ModifiedIu
4 begin
5 PriorPk ← P oseW orld2Cam(Pk , Pu )
6 PriorPixelk ← P ose2P ixel(PriorPk )
7 if InDepthImage(PriorPixelk , Iu ) then
8 Box ← BoundingBox(PriorPixelk , Iu , PriorPk )
9 for Pixel in Box do
10 P ← P ixel2P ose(Pixel)
11 if Distance(P, Pk ) ≤ T then
12 DronePixelsk .P ushBack(Pixel)
13 CorrectPixelk ← GetCenter(DronePixelsk )
14 CorrectPk ← P ixel2P ose(CorrectPixelk )
15 ModifiedIu ← SetU nknown(Iu , DronePixelsk )
16 return;

Drone Detection

Unlike (65), in which drone detection is done merely by neural networks, we propose a lightweight
but robust approach by pixel gathering from depth images using agent position as an a priori.

The algorithm is explained in Alg. 1. Function PoseWorld2Cam() transforms Pk into the three-
dimensional (3D) camera frame of the detector u. Function Pose2Pixel() transforms PriorPk
into the 2D camera frame represented by the conventional u − v pixel axis of the detector u.

Function InDepthImage() returns whether the pixel value of PriorPixelk is within the bound-
ary of the image Iu . Function BoundingBox() creates a bounding box centered at PriorPixelk
with size larger if the z-component of PriorPk on Iu is smaller. Function Pixel2Pose() trans-

forms Pixel with depth from Iu into the 3D world frame. Function Distance() calculates Eu-
clidean distance between P and Pk . Function PushBack() means data saving. Function Get-

22
Center() computes the average position of all the DronePixelsk . Function SetUnknown() is
an image operator that modifies the depth pixel values belonging to the detected drone to the

unknown, and then the witnessed drone is removed. The CorrectPk can be further used to
correct localization drift.

Section S9. Failure Cases and Troubleshooting

Although a great deal of engineering considerations have been taken into account, limited by
our knowledge and equipment, failures still occasionally occur. To those who are intended to

reproduce our work to carry out their own research, possible solutions to some cases of failure
are listed in Table S3.

23
Table S3. Cases of failure and troubleshooting.

Failure Cases Frequency Possible Solutions


Often if using
Change the default ekf2 to Q estimator (comple-
Arm denied by FCU. ekf2 estimator
mentary filter) in PX4 Autopilot.
in PX4.
Becomes
Initial state estimation Re-calibrate extrinsic parameters between
common
not stable. camera and IMU.
after a crash.
Often if the
RealSense D430 failed Make sure the type-c port has enough power and
power supply is
to boot. use a high-current type-c cable.
insufficient.
This happens when pointing the camera at a
RealSense D430 outputs
highlighted plane or a surface with repeated
significantly wrong Rare.
textures; therefore, try to add some features on
distance measurements.
these surfaces.
Some of the images are Caused by signal interfering or vibration, so
corrupted when in Rare. check the connection of the camera cable and
flight. place it away from the battery, ESC, and motors.
Use better network hardware and do not print
Trajectory message loss. Sometimes. all the information to the ground computer via
network.
Increase the map resolution and set appropriate
Emergency stops
inflation; otherwise, some small gaps frequently
triggered frequently Sometimes.
occur and continuously switch between free and
in dense environments.
occupied.
Caused by the poor network every time and
meanwhile, two drones did not see each other
Two drones collide. Rare.
in depth images, so fix the network first and
consider enlarging the depth FOV (field of view).
RealSense D430 is the culprit every time, so
Hitting an obstacle. Rare. check what goes wrong with this device
following the above troubleshooting regarding it.

Section S10. More Discussion on Communication

The requirements for communication are task-specific. In experiments Fly Through Dense For-

est, the goal positions have been written into configuration files before takeoff. Then, trajec-

24
tory is the only message type to be transmitted for inter-robot collision avoidance and relative
localization correction. In such cases, the communication range must be at least twice the

planning distance, since drones with spacing closer than this threshold will affect each other
when planning the trajectory. If their distance is longer than this threshold, they are sure to be
collision-free in a short period of future time without any avoidance actions. Benefiting from the

developments of wireless communication technology, such a dozens-of-meters communication


range has been achieved by most solutions, e.g., WiFi, UWB, and Zigbee.
In formation flight tasks, except for reciprocal avoidance, drones have to dynamically main-

tain a formation according to other drones’ real-time positions extracted from received trajec-
tories. This requires that each drone maintains at least one connection to others and the whole
formation is inseparable regarding to wireless connection. Then adopting some multihop and

routing protocols, the trajectory messages can reach to all other drones. In the experiment For-
mation Navigation in the Wild, the formation scale is relatively compact, therefore messages
are directly shared in the whole swarm.

In the experiment Intensive Reciprocal Avoidance Evaluation, trajectory-transmission re-


quirements are the same as those in Fly Through Dense Forest. However, the goals in this ex-
periment are given by a ground computer in real-time. Therefore, a connection from the ground

computer to each drone is necessary. In addition, the ground computer can be connected directly
as in our experiments or through wireless routing.
In the experiment Multi-Drone Tracking with Target Occlusion, each drone combines goals

from its own observations with others’ observations received from the network. Therefore, more
sharing means higher robustness to occlusion. If there is no observation from others, the drone
can still track the target, but may lose it when encountering a significant occlusion. Thus, it is

difficult to determine an exact threshold of communication range in this task. In our experiment,
we share the observation among all drones.

25
Regarding communication bandwidth, in the bamboo forest experiment of 10 drones, the
size of one trajectory is approximately 170 bytes. For the 65-m flight in the bamboo forest

there are about 100 trajectories sent and 1000 trajectories received for each drone. The data
rate (bandwidth) containing both sent and received information is approximately 2 kB/s and
reached 8 kB/s at peak. These are minor values for WiFi network, which always provides a

several megabytes-per-second (MBps) data rate. The statistics are listed in Table S4 and S5.

Table S4. Data sent in the bamboo forest flight. Trajectory is the only message type trans-
mitted via network. Traj.: Trajectory; Min.: Minimal; Max.: Maximal; Avg.: Average.

Min. Max. Peak Avg.


Total Avg. One
Drone Traj. Traj. Traj. Data Data
Size Traj. Size
ID Number Size Size Rate Rate
(bytes) (bytes)
(bytes) (bytes) (bytes/s) (bytes/s)
0 93 15811 127 175 170.01 875 171.63
1 96 16224 127 175 169.00 636 176.11
2 88 15064 127 175 171.18 525 163.52
3 98 16830 127 175 171.73 700 182.69
4 102 17418 127 175 170.76 1209 189.07
5 132 21628 127 175 163.85 2159 234.77
6 125 20755 127 175 166.04 1002 225.30
7 112 19088 127 175 170.43 2100 207.20
8 102 17514 127 175 171.71 700 190.12
9 93 15891 127 175 170.87 684 172.50
Avg. 104.10 17622.30 127.00 175.00 169.56 1059.00 191.29

26
Table S5. Data received in the bamboo forest flight. Trajectory is the only message type
transmitted via network. Traj.: Trajectory; Min.: Minimal; Max.: Maximal; Avg.: Average.

Min. Max. Peak Avg.


Total Avg. One
Drone Traj. Traj. Traj. Data Data
Size Traj. Size
ID Number Size Size Rate Rate
(bytes) (bytes)
(bytes) (bytes) (bytes/s) (bytes/s)
0 948 160412 127 175 169.21 6202 1741.29
1 945 159999 127 175 169.31 5377 1736.80
2 953 161159 127 175 169.11 5838 1749.40
3 943 159393 127 175 169.03 5377 1730.22
4 939 158805 127 175 169.12 5043 1723.84
5 909 154595 127 175 170.07 4295 1678.14
6 916 155468 127 175 169.72 5377 1687.62
7 929 157135 127 175 169.14 4502 1705.71
8 939 158709 127 175 169.02 6538 1722.80
9 948 160332 127 175 169.13 5377 1740.42
Avg. 936.90 158600.70 127.00 175.00 169.29 5392.60 1721.62

27
Figures

Fig. S9. Composite images of two formation flights. (A) Letters ZJU (ZheJiang University)
formed by 16 drones. (B) Letters FAST (FAST Lab) formed by 22 drones.

28
Fig. S10. Maps of the Bamboo Forest Built by Drones. The light-colored background denotes
the map fused by overlaying 10 position-aligned drones’ sub-maps. Different colors represent
different heights. For example, green indicates the ground while the purple indicates bamboos.
All trajectories are collision-free. Owing to the drift correction, 10 maps are all accurately
aligned here.

29
Fig. S11. Control errors during flights in Section Formation Navigation in the Wild. Green
dotted ellipses indicate emergency stops activated by the safety check introduced in Section
Hierarchical Safety Guarantee. Blue translucent areas indicate significant changes in speeds.
Here, it is emergency stop rather than speed change that increases the position error.

30
Fig. S12. A 40-drone position swap on a circle.

31
Fig. S13. Localization drift corrections for drones 1–9. Drone 0 is visualized in Fig. S3B.

32
Fig. S14. Block diagram of localization drift correction. See Section Localization and Drift
Correction for details. Sensors with corresponding fusion methods in dashed light-yellow boxes
are optional and not adopted in our experiments. Positions in our framework are shared via
trajectories.

33
Fig. S15. Formation settings for scalability tests on the computation efficiency in Section
Time and Space Complexity. (A) A loose formation. (B) A medium-intensity formation. (C)
A tight formation. Orange circles indicate the range of two times the planning distance (7.5 m
in Table S6).

34
Fig. S16. Histograms of Nsv and the number of constraint points p̊. (A) The statistic of Nsv
on each constraint point. (B) The number of constraint points on each trajectory. (C) The test
scenario.

35
Fig. S17. More photographs of the environments and closeup views. (A) An aerial view of
the bamboo forest where experiments are conducted. (B-E) Closeup views of the drone. (F) A
miniature flight controller. (G) A mechanical model. (H) A PCB layout of the flight controller.
Both (G) and (H) are released.

36
Tables
Table S6. Parameter settings used throughout this work. Capital letters A–H indicate var-
ious scenarios: A includes the 10-/40-drone swarm flight, circle position swap, circle random
goals, goal select and set, moving obstacle avoidance, scalability test on drone number, and
benchmark with various obstacle densities: B is the formation flight simulation, C is the track-
ing simulation, D is the velocity scalability benchmark, E is the real flight in the bamboo forest,
F is the real-world formation flight, G is the real-world intensive reciprocal avoidance evalua-
tion, and H is the real-world swarm tracking. Parameter λu is defined in Section Penalty For-
mulation. Formation weight λf o and Tracking weight λtr are not explicitly defined elsewhere,
but they follow the same naming conventions as other weights.

Parameters Scenarios Values


Trajectory piece number M All 5
Number of constraint points per piece κ All 5
Smoothness weight λs All 1.0
Total time weight λt All 10.0
Obstacle avoidance weight λo All 10000.0
Swarm collision avoidance weight λw All 10000.0
Dynamical feasibility weight λd All 10000.0
Uniform distribution weight λu All 10000.0
Formation weight λf o B, F 100.0
Tracking weight λtr C, H 100.0
A-D 0.3m
Obstacle clearance Co
E-H 0.2m
A-D 0.5m
Swarm clearance Cw
E-H 0.4m
A-C 1.5m/s
Maximum velocity vm E-H 1.0m/s
D 1/2/3m/s
Maximum acceleration am All 6m/s2
Maximum jerk jm All 10m/s2
Planning distance All 7.5m
Emergency stop time threshold All 1s

37
Table S7. Component weights.

Component Weight (g)


Onboard computer 69.1
RealSense D430 19.0
FCU 5.3
UWB 4.7
Battery 101.4
ESC 4.5
Four motors & propellers 42.8
WiFi adapter 3.4
Drone frame 19.5
Power management unit 6.5
Screws, cables, etc. 17.3
Total 293.5

38
Table S8. The values of sub-figure (1,1) in Fig. 7B.

Computing Time (MADER)


Obstacle Spacing 2.0 1.5 1.0
Median 0.10650 0.10860 0.14150
Lower Quartile 0.06390 0.06190 0.07790
Upper Quartile 0.15550 0.16600 0.20580
Computing Time (EGO-Swarm)
Obstacle Spacing 2.0 1.5 1.0
Median 0.00089 0.00120 0.00330
Lower Quartile 0.00050 0.00059 0.00079
Upper Quartile 0.00280 0.00370 0.01010
Computing Time (Proposed)
Obstacle Spacing 2.0 1.5 1.0
Median 0.00035 0.00041 0.00063
Lower Quartile 0.00017 0.00020 0.00034
Upper Quartile 0.00063 0.00068 0.00097

39
Table S9. The values of sub-figure (2,1) in Fig. 7B.

Computing Time (MADER)


Velocity 1.0 2.0 3.0
Median 0.10650 0.10860 0.14150
Lower Quartile 0.06390 0.06190 0.07790
Upper Quartile 0.15550 0.16600 0.20580
Computing Time (EGO-Swarm)
Velocity 1.0 2.0 3.0
Median 0.00081 0.00120 0.00210
Lower Quartile 0.00048 0.00059 0.00078
Upper Quartile 0.00240 0.00370 0.00620
Computing Time (Proposed)
Velocity 1.0 2.0 3.0
Median 0.00043 0.00041 0.00029
Lower Quartile 0.00021 0.00020 0.00017
Upper Quartile 0.00078 0.00068 0.00065

40
Table S10. The values of sub-figures (1,2) and (2,2) in Fig. 7B.

Minimum Clearance
Obstacle Spacing 2.0 1.5 1.0
MADER 0.008 0.011 -0.184
EGO-Swarm 0.045 0.067 0.064
Proposed 0.086 0.065 0.058
Minimum Clearance
Velocity 1.0 2.0 3.0
MADER 0.000 0.011 -0.086
EGO-Swarm 0.062 0.067 0.047
Proposed 0.068 0.065 0.064

41
Table S11. The values of sub-figures (1,3) and (2,3) in Fig. 7B.

Control Effort
Obstacle Spacing 2.0 1.5 1.0
Average 6608.10 7575.70 8922.20
MADER
SD 4254.50 4707.70 4958.20
Average 624.30 523.94 1133.30
EGO-Swarm
SD 346.29 362.08 811.70
Average 189.03 246.42 465.18
Proposed
SD 98.87 129.27 183.72
Control Effort
Velocity 1.0 2.0 3.0
Average 743.52 7575.70 19994.00
MADER
SD 1271.70 4707.70 7195.70
Average 13.26 523.94 4176.30
EGO-Swarm
SD 10.29 362.08 2945.20
Average 70.64 246.42 456.18
Proposed
SD 36.14 129.27 207.19

42
Table S12. The values of sub-figure (1,4) in Fig. 7B.

Flight Time (MADER)


Obstacle Spacing 2.0 1.5 1.0
Median 18.00 18.66 21.13
Lower Quartile 16.53 17.74 19.11
Upper Quartile 19.00 20.00 23.74
Flight Time (EGO-Swarm)
Obstacle Spacing 2.0 1.5 1.0
Median 23.12 23.40 24.15
Lower Quartile 21.47 22.14 22.82
Upper Quartile 24.69 24.82 26.35
Flight Time (Proposed)
Obstacle Spacing 2.0 1.5 1.0
Median 21.25 20.79 20.75
Lower Quartile 20.00 20.14 19.70
Upper Quartile 22.27 22.58 22.71

43
Table S13. The values of sub-figure (2,4) in Fig. 7B.

Flight Time (MADER)


Velocity 1.0 2.0 3.0
Median 35.00 18.66 14.65
Lower Quartile 33.03 17.73 13.41
Upper Quartile 36.09 20.00 16.15
Flight Time (EGO-Swarm)
Velocity 1.0 2.0 3.0
Median 47.96 23.40 15.58
Lower Quartile 45.51 22.14 14.76
Upper Quartile 51.48 24.82 16.60
Flight Time (Proposed)
Velocity 1.0 2.0 3.0
Median 35.37 20.79 17.72
Lower Quartile 34.19 20.14 16.66
Upper Quartile 37.45 22.58 18.93

44

You might also like