Swarm of Micro Flying Robots in The Wild
Swarm of Micro Flying Robots in The Wild
Sections S1 to S10
Figs. S1 to S17
Tables S1 to S13
Software
Section S1. Extra Comparison of Reciprocal Collision Avoid-
ance
Table S1. Extra performance comparison. Comparison is performed under a position swap
configuration in an 8×8 m empty space containing eight agents with radius of 0.25 m. Max-
imum velocity and acceleration are set to 1.7 m/s and 6 m/s2 , respectively. Min. Clear. de-
notes the minimum
R distance from collision. Smoothness is computed from the time integral of
2
squared jerk ||j||2 dt. Units of time and distance are second and meter, respectively.
Except for EGO (33) and MADER (37), we compare the proposed planner against two
other centralized offline swarm planners, i.e., SCP (49) and RBP (50). Results are given in
Table S1. A bold term indicates a better value in the corresponding category (Online/Offline),
while an underlined term indicates the best value among all planners. As given in Table S1, SCP
achieves a shorter flight time at the sacrifice of trajectory safety. RBP achieves smoother flight
at the cost of a relatively long flight time. Both fail to balance various aspects of trajectory
quality. Compared with offline methods, the three online planners show more balance while
requiring less computation by several orders of magnitude. Among them, MADER is more
conservative because its safety is ensured through minimum volume enclosing simplices. All
the data are recorded from 10 random runs.
Section S2. Scalability Evaluation
Time and Space Complexity
In real-world usage, larger swarms always mean a more powerful team for broader coverage
in exploration, a grater possible load for transportation, and higher robustness against robot
failures, among other advantages. However, traditional optimization-based trajectory planning
suffers from a high computing complexity related to the number of robots, as stated by Park et
al. (50) the time complexity of which is O(n3 ). Therefore, Park et al. have to divide the entire
swarm into groups and then plan the trajectory sequentially, which only reliefs the complexity
to a limited extent.
Owing to the decentralized architecture and appropriate problem formulation, our time com-
plexity is significantly reduced to approximately linear to the drone number from statistics.
Since the baseline of computing time is short, the swarm size can increase to hundreds of robots
before breaking the real-time bottom line. Meanwhile, the space complexity is exactly linear to
the drone number, as one drone only stores the latest trajectory from every other one. Indeed,
this space usage can be ignored because one trajectory is typically below 1 kB in size, and there
The complexity can be further reduced (20) if each drone only considers nearby units, since
we perform continuous planning only within a limited distance. In our work, we ignore drones
more than twice the planning distance (see Table S6 ) away from each other. The resulting
complexity is below O(n), and is determined by the density of drone distribution as in Fig. S1.
Such complexity allows the swarm to be expanded to a large size before breaking the real-time
threshold.
0.6
Time (ms)
Computing Time(ms)
0.5
Computing 0.4
0.3
0.3 1m
1m
2m
5m
5m
0.2
0.2
10
10 20
20 30 40
40 50
50
Drone Number
Drone Number
Fig. S1. Scalability on computation efficiency. This figure provides the computing time for
different drone numbers under three swarm distributions shown in Fig. S15.
ogy, and drone distribution. To control signal interference, we conduct this experiment in a quiet
park with drones hovering and UWBs turned on. Then, in this test, we compare two topologies,
namely a centralized network with one access point and a peer-to-peer ad hoc network using the
latency, are adopted to assess network capacity. Since we write reliable network communica-
tion software that guarantees data arrival, the packet loss rate is not evaluated (which equals
either 0% or 100%). The bandwidth Bw is quantized to the highest trajectory (175 bytes each)
broadcasting frequency by all the drones together before reaching the 250-ms latency threshold.
Latency TL is recorded at 5 Hz of the trajectory broadcasting rate.
To be collision-free, in a centralized network, each drone must maintain a direct connection
to the access point (AP), so we record the worst value among all drones. By contrast, for the
Fig. S2. Scalability on communication quality. (A) Three kinds of swarm distributions. (B)
Network quality for two kind of topologies under three swarm distributions.
ad hoc network, each drone directly connects to the others. Furthermore, when two drones get
further than twice the planning horizon from each other, the collision penalty between them is
always zero, and freedom from collision is guaranteed naturally. Thus, we only evaluate data
between each pair of drones with a relative distance below twice the planning horizon (15 m in
this experiment) for an ad hoc network.
From the results in Fig. S2B, a centralized network performs better in tight formation, while
an ad hoc network shows better communication quality when the formation is loose. From the
integral view, an ad hoc network scales better with the number of drones when in loose distribu-
tion, while a centralized network provides higher bandwidth when the drone number is small;
Fig. S3. Localization drift corrections. (A) Thin and thick lines are trajectories before and
after corrections, respectively. The former ones show significant drifts that could have ruined
the formation if not corrected. (B) Localization drift of drone 0. More drones are shown in
Fig. S13.
The data recorded from the experiment in Section Formation Navigation in the Wild are given
in Fig. S3. The desired start and goal shapes are 2 × 5 drone rectangles with 1-m spacings. With
the corrected localization, the drone managed to maintain the shape after a 55-m flight, although
they are not precisely aligned in yaw at the start point. By contrast, Figs. S3 and S13 show the
magnitude of the estimated drift, which reaches approximately 3 m. Under such significant
drift, the formation suffers severe shape distortion and reciprocal collision if no correction is
made. However, since no global information such as GPS data is available, accumulated errors
for the entire team must exist. That is why we add some fixed anchors in Section Intensive
Reciprocal Avoidance Evaluation. In summary, the necessity and effectiveness of the proposed
correction method are validated here. The block diagram is shown in Fig. S14.
To further analyze the planner performance, the gate-passing test in Section Benchmark Com-
parisons is conducted again with some representative data visualized in Fig. S4. Among the
four sub-figures, B and C are mutually constrained. This means, for a shorter flight time, these
drones must fly as fast as possible under the maximum speed of 2 m/s, while they also have
to alter velocity to maintain a relative distance. Then, for drones in front, velocity can be in-
creased to reduce flight time while enlarging relative distance, but the velocity is limited. For
the drones left behind, reducing the velocity as little as possible to maintain the shortest relative
distance to those in front is the best strategy. The above is exactly what is achieved in Fig. S4B
and C, where drones managed to fly across the gate at the maximum velocity and minimum
relative distance allowed. What’s more, from Fig. S4D, we can conclude that drones fly along
the shortest trajectory without detours or wasting of energy.
The pros and cons of typical trajectory parameterization for aerial robots are summarized in Ta-
ble S2. Polynomial curves are widely used by traditional approaches (4, 10, 38), which mostly
7
A
2
4
5 1
3 0
B
Relative Distance (m)
2 4 5 1 0 3
2
Safety Threshold
0
0 2 4 6 Time(s) 8 10 12 14
C 2
2 4 5 1 0 3 vx
vy
Velocity (m/s)
1 vz
-1
0 2 4 6 Time(s) 8 10 12 14
D 30
Straight-line Distance
Trajectory Length (m)
10
0
0 1 2 Drone ID 3 4 5
Fig. S4. Performance evaluation in the gate-passing test. (A) Six snapshots taken when
each drone is passing the gate. (B) The minimum relative distances recorded by each drone
are denoted by the light blue area, the lower bound of which is emphasized by a blue curve.
Multicolored lines marked with numbers of drone ID indicate the time at which each drone flies
through the gate. Red dotted line is the 0.7-m swarm clearance in optimization. (C) Minimum-
to-maximum velocity curves. The orange dotted line at a 2-m/s velocity is the velocity limit.
(D) Comparison between real trajectory lengths and straight-line distances (the minimum length
in theory).
8
use polynomial coefficients as decision variables, along with explicit continuity constraints.
Many works (10, 38) deform a piece-wise polynomial trajectory by adding intermediate way-
points. However, the distribution of waypoints requires a careful configuration to balance fea-
sibility satisfaction and computational burden. To optimize the time profile of a polynomial,
the representative method uses gradient descent with finite differences (38). Nevertheless, this
method requires a dense matrix inversion to transform waypoints to coefficients, making the
optimization quickly intractable when the problem scales up. Bézier and B-Spline share the
convex hull nature, which indicates that the curve is entirely contained in the convex hull of its
control points (13, 23). Therefore, these curves are convenient for adding constraints, which
can be done by merely restricting the control points inside feasible convex regions. Many
works (24, 37, 50) follow this underlying property and show successful applications in aerial
swarms. However, this property indeed brings conservativeness, as analyzed by (47), prevent-
ing a trajectory from being aggressive near its physical limits. Moreover, an n order B-Spline
is naturally n − 1 order continuous (48). Thus no continuity constraint is required as long as
the B-Spline has a qualified order. The time profile of a Bézier curve can be adjusted with ac-
ceptable complexity by its basis (23), while this approach is rather complicated for a B-Spline
due to its time evaluation containing high nonlinearity. MINCO (57) is a newly developed
trajectory representation specialized for integrator chain systems, and its core feature is that it
efficiently handles a wide variety of constraints while retaining spatial and temporal optimality.
By discretizing the trajectory to add constraints, MINCO is less conservative but requires further
checks after obtaining the solution. Although MINCO is the most complicated representation
and requires a sophisticated implementation, it provides a solid foundation for spatial-temporal
drone swarm planning.
9
Table S2. Pros and cons of commonly used trajectory parameterizations.
t0 = 0 in the Supplementary Materials. The notion M(q, T) equals Mq,T in the main text.
Specifically, the trajectory is evaluated as
pi (t) = cT
i β(t), ∀t ∈ [0, Ti ], (S2)
10
𝑇𝑇1 𝑇𝑇2 𝑇𝑇𝑀𝑀−1 𝑇𝑇𝑀𝑀
𝐪𝐪1 𝐩𝐩𝟐𝟐,𝟎𝟎
𝐩𝐩𝟐𝟐,𝟏𝟏
𝑝𝑝(𝑡𝑡) 𝐜𝐜𝑀𝑀−1 𝛽𝛽(𝑡𝑡)
𝐜𝐜1 𝛽𝛽(𝑡𝑡) 𝐩𝐩𝟐𝟐,𝟐𝟐
𝐜𝐜2 𝛽𝛽(𝑡𝑡) 𝐩𝐩𝟐𝟐,𝟑𝟑
𝐩𝐩𝟐𝟐,𝟒𝟒 … 𝐪𝐪𝑀𝑀−2
𝐪𝐪𝑀𝑀−1 𝐜𝐜𝑀𝑀 𝛽𝛽(𝑡𝑡)
𝐪𝐪2
Fig. S5. Parameters of a MINCO trajectory and its constraint evaluation. The trajectory
is directly parameterized by each waypoint qi and time Ti . Red dots are constraint points p̊ at
network
which thatis shares
any constraint evaluated trajectories
and propagated toand
everyperforms
qi and Ti . on-demand time
where β(t) := [1, t, · · · , tN ]T is the natural basis. The core of MINCO is the parameter map-
ping c = M(q, T) constructed from Theorem 2 in (57). Technically, the mapping directly
constructs a minimum control trajectory for an m-dimensional s-integrator chain for any spec-
ified initial and terminal conditions. An instance in MINCO is intuitively shown in Fig. S5.
Owing to the limited paper length, we only intuitively introduce the features of MINCO. De-
tailed proofs are presented in (57).
Feature 1: Compactly represented by q and T, MINCO is a class of C s−1 polynomial
where W ∈ Rm×m is a diagonal matrix with positive entries, p[s−1] = [p, p(1) , p(2) , · · · , p(s−1) ]
with , p̄o and p̄f ∈ Rm×s the specified initial and terminal conditions, respectively. The trajec-
tory is enforced to pass qi at time ti .
Feature 2: The evaluation and differentiation of the mapping c = M(q, T) benefits from
11
linear complexity. A more specific correspondence can be expressed as the following function
where M(T) ∈ R2M s×2M s is a banded matrix with non-singularity for any T 0 ensured by
Theorem 2 in (57), b(q) ∈ R2M s×m . The conversion between these two trajectory represen-
tations, i.e., (c, T) and (q, T) is achieved using Banded PLU Factorization with O(M ) linear
time and space complexity. The evaluation is extremely fast, which takes approximately 1 µs
per piece for minimum-jerk (s = 3) trajectory generation on a desktop CPU.
T
b= D0 , D1 , 0m×(2s−1) , · · · , DM −1 , 0m×(2s−1) , DM , (S6a)
Feature 3: Feature 2 allows any user-defined objective or constraint function F (c, T) with
available gradients applicable to MINCO parameterized in (q, T). Specifically, the correspond-
12
ing objective of MINCO is computed as
Then the mapping c = M(q, T) gives a linear-complexity way of computing ∂H/∂q and
∂H/∂T from the corresponding ∂F/∂c and ∂F/∂T, i.e.,
∂H ∂F ∂c ∂H ∂F ∂c ∂F
= , = + , (S8)
∂q ∂c ∂q ∂T ∂c ∂T ∂T
↑ ↑
where derivatives marked by ↑ are the gradients finally required by a numerical optimizer,
are the elements calculated by user code, and are just computed using the linear-complexity
mapping c = M(q, T), i.e., Eqs. S4 –S6. Users need not consider the constraints between c
and T as if decoupled. Next, a high-level optimizer is able to optimize the objective efficiently.
Conventionally, requirements for dynamical feasibility and collision avoidance are formulated
as functional-type constraints G(p(t), . . . , p(s) (t)) 0, ∀t ∈ [0, tM ], G ∈ Rng with ng the num-
ber of types of constraints. However, this formulation cannot be directly handled by constrained
optimization, since G contains infinitely many inequality constraints. Thus, we transform G into
a finite-dimensional one via an integral of constraint violation. Practically, the integral is trans-
formed into a weighted sum JΣ (c, T) of the sampled penalty function. In most cases where
constraints are decoupled, i.e., constraints G(p[s] (t)) with ti ≤ t < ti+1 are merely determined
by ci and Ti , p[s] = [p, p(1) , · · · , p(s) ], the penalty for an i-th trajectory piece is computed as
κi
Ti X j
Ji (ci , Ti ) = ω̄j χT max (G(ci , Ti , ), 0)3 , (S9)
κi j=0 κi
n
where κi is the sample number on the i-th piece, χ ∈ R≥0g a vector of penalty weights with
appropriately large entries (several orders of magnitude larger than the smoothness penalty),
13
and (ω̄0 , ω̄1 , . . . , ω̄κi −1 , ω̄κi ) = (1/2, 1, · · · , 1, 1/2) the quadrature coefficients following the
trapezoidal rule. We define the points determined by {ci , Ti , j/κi } as constraint points p̊i,j :=
pi ((j/κi )Ti ), with pi (t) the i-th polynomial piece. Then G(p̊i,j ) := G(ci , Ti , j/κi ). Note that
p̊i,κi = p̊i+1,0 . The cubic penalty is used here for its twice continuous differentiability. JΣ (c, T)
is then defined as the summation:
M
X
JΣ (c, T) = Ji (ci , Ti ). (S10)
i=1
Since the gradients w.r.t. (c, T) are constructed from all (ci , Ti ), we only give gradient
templates for ci and Ti using the chain rule when Eq. S9 holds,
j ∂t j
t= Ti , = . (S14)
κi ∂Ti κi
From the above equations, Eqs. S11 –S14, once the gradients of constraints G relative to polyno-
mial coefficients ci and time t are evaluated, the gradients of JΣ (c, T) are efficiently computed,
which are then substituted into Eq. S8 to acquire final gradients respective to q and T.
Penalty Formulation
The basic requirements on a desired trajectory are smoothness, total time, dynamical feasibility,
as well as safety among obstacles and other agents. In this paper, we adopt MINCO to address
above all concerns. Since MINCO is naturally guaranteed C s−1 by Theorem 2 in (57), no
extra effort must be paid to trajectory continuity. The trajectory generation is formulated as an
14
unconstrained optimization problem:
X
min λx Jx , (S15)
q,T
x
where Jx are JΣ instances, λx is the weight for each penalty, and subscripts x = {s, t, d, o, w, u}
represent smoothness (s), total time (t), dynamical feasibility (d), obstacle avoidance (o), swarm
reciprocal avoidance (w), and uniform distribution of constraint points (u), respectively. More
task-specific penalties can also be added. The problem is solved via standard, unconstrained
nonlinear optimization.
Smoothness Js
The smoothness metric follows the definition in Eq. S3b. Without loss of generality, considering
a single dimension of the i-th piece pi (t) : [0, Ti ] 7→ R of a polynomial spline, its derivatives
with respect to ci and Ti are
Z Ti
∂Js (s) (s) T
=2 β (t)β (t) dt ci , (S16)
∂ci 0
∂Js T
= cT (s) (s)
i β (Ti )β (Ti ) ci . (S17)
∂Ti
Total Time Jt
A shorter flight time is desirable, so we also minimize the weighted total execution time Jt =
PM
i=1 Ti . Obviously, its gradients ∂Jt /∂c = 0, ∂Jt /∂T = 1.
For multicopters, dynamical feasibility is always guaranteed by limiting the trajectory’s deriva-
tives. In this paper, we limit the amplitude of velocity, acceleration, and jerk. Following Eq. S9,
constraints of velocity, acceleration, and jerk are denoted by
...
Gv = ṗ(t)2 − vm
2
, Ga = p̈(t)2 − a2m , Gj = p (t)2 − jm
2
, (S18)
15
where vm , am , jm are maximum allowed velocity, acceleration and jerk, respectively. The cor-
responding gradients are
where x = {v, a, j}, n = {1, 2, 3}, and t = jTi /κi , respectively. Substituting Eq. S18 into S9
and S19 into S11 & S12 obtains the penalty and the gradients of ci and Ti .
In this paper, we adopt the front-end of collision evaluation devised by (14). A brief description
is given here for better understanding. In (14), the information for an unsafe trajectory to escape
collision is extracted by comparing the unsafe initial trajectory with a collision-free guiding
path. As depicted in Fig. S6, a fixed safe point s with a fixed safe vector v is generated and
recorded by a corresponding key point pkey on the trajectory. Then, the distance of pkey to the
obstacle that the {s, v} pair generated is defined as
Using the distance information, the trajectory deforms iteratively during optimization. If pkey
discovers a new obstacle, a new {s, v} pair is stacked to pkey ’s records. Therefore each pkey may
record several {s, v} pairs. Note that each pkey generates and possesses {s, v} pairs privately,
and thus does not calculate its distance from other pkey s’ {s, v} pairs.
In this paper, constraint points p̊i,j with 1 ≤ i ≤ M, 1 ≤ j ≤ κi are selected as the
key points pkey . To enforce safety, we penalize distance to obstacles by less than a obstacle
clearance Co . Therefore, the obstacle avoidance constraint is defined as
16
Enlarging
Fig. S6. Collision avoidance formulation. The red curve is the optimized trajectory and blue
curve a collision-free path that starts and ends on the red one. A fixed point s with a vector v is
generated to determine the collision penalty.
network that shares trajectories and performs on-demand time
where Nsv is the number of {s, v} pairs recorded by a single p̊i,j . For the case in which
do (p̊i,j , {s, v}k ) ≤ Co , the gradient is computed as
∂Gok ∂Gok
= −β(t)vT , = −ṗi (t)T v. (S23)
∂ci ∂t
In this paper, a decentralized, non-cooperative swarm framework is adopted, which means that
one agent receives other agents’ trajectories as constraints and generates trajectories only for
itself. Considering the u-th agent in a multicopter swarm containing U agents, swarm collision
avoidance is guaranteed when the trajectory u p(t) of agent u maintains a distance greater than
a safe clearance Cw to all the trajectories at the same global timestamps of other agents as
depicted in Fig. S7. However, caution must be taken because we have always been using a
relative time t = jTi /κi for pi (t) in our optimization, while the other agents’ trajectories take
17
Other Agents’
Trajectories
𝑡𝑚𝑖𝑛 𝑡𝑚𝑎𝑥
My Trajectory
network
Fig. S7. Reciprocal thatavoidance
collision shares trajectories and performs
formulation. on-demand
Gray dotted arrowstime
indicate states at the
same absolute time, at which a safe distance must be ensured.
Gi,j u u
w the constraint evaluated at the j-th constraint point on the i-th piece of p(t), i.e., pi (t).
where k p(τ ) is the trajectory of the k-th agent that the u-th agent must avoid, k 6= u. The matrix
E := diag(1, 1, 1/c) with c > 1 transforms Euclidean distance into ellipsoidal distance with
the minor axes at the z-axis to relieve the downwash risk from rotors. Squared distance is used
∂Gwi,jk
= −2β(t)(u pi (t) − k p(τ ))T E. (S27)
∂ci
The gradient to T is more complicated, since τ is a function of not only Ti , but T1 to Ti−1 as
well when calculating the gradients for Gwi,jk . This led to the gradient-propagate template Eq. S12
not holding here for ∂JΣ /∂Ti 6= ∂Ji /∂Ti . When d2w (u pi (t), k p(τ )) ≤ Cw2 , considering both
preceding and current time profiles Tl for 1 ≤ l ≤ i, the proper gradient should be computed as
18
U U M X κi
∂Jw X ∂Jwk X X ∂Jwi,jk
= = , (S28)
∂Tl k=1,k6=u
∂T l
k=1,k6=u i=l j=1
∂T l
This penalty is not mentioned in the main body of the article since this well-packed module will
not bring about sufficient insights to general readers in terms of understanding the methodology,
but only some confusion. The purpose of this penalty is to make the constraint points p̊i,j
equally spaced for all 1 ≤ i ≤ M and 0 < j ≤ κi , which is a simpler substitution to the space-
uniform variant of MINCO mentioned in (57). It is necessary since if all MINCO parameters
q and T are optimized freely, some pieces of the trajectory tend to vanish to reduce the total
cost. This phenomenon is harmful for three reasons. First, Ti = 0 for the i-th trajectory piece
is a unique singular point for MINCO, which does not consider consecutive aligned waypoints.
Second, since the spatial collision avoidance is constrained by a finite number of constraint
points, non-uniform distribution increases the possibility of skipping some tiny or thin obstacles,
as shown in Fig. S8. Third, since we compute the distance to obstacles following Eq. S20, the
19
Skipped
obstacles are modeled as planes. The accuracy of this modeling decreases when constraint
points move along the plane and the uniform distribution reduces such movement.
To enforce the uniform distribution of constraint points, we penalize the variance of the
squared distances between each pair of adjacent constraint points. For simplicity, we denote
Nc = M
P
i=1 κi as the total number of distances. The variance is computed as
1 1
Ju = E R2 − E [R]2 = kRk22 − 2 kRk21 ,
(S34)
Nc Nc
R is the squared distance vector for all Nc + 1 constraint points. We denote by Rk the k-th
entry in R, k = j + i−1
P
l=1 κl . The gradient is computed as
κi T
∂Ju X jTi ∂Ju
= β , (S36)
∂ci j=1
κi ∂p̊ i,j
∂Ju 4 kRk1
= Rk−1 − (p̊i,j − p̊i,j−1 )
∂p̊i,j Nc Nc
4 kRk1
− Rk − (p̊i,j+1 − p̊i,j ) , (S37)
Nc Nc
κi T
∂Ju 1 X ∂Ju T jTi
= j ci β̇ . (S38)
∂Ti κi j=1 ∂p̊i,j κi
20
Temporal Constraint Elimination
Ti = eτi , (S39)
where τi is the unconstrained virtual time. Gradients w.r.t. Ti are propagated to τi following
∂J/∂τi = (∂J/∂Ti )eτi . More generally, any twice continuously differentiable bijection f :
R 7→ R>0 suffices for this map.
In the conducted experiments, desired swarm clearance Cw is fixed due to identical agent size. If
implementing the proposed planning algorithm for a heterogeneous swarm, each agent should
broadcast its desired clearance, which is typically its radius plus a tolerance distance. Then, for
the u-th agent that is going to avoid agent k, the swarm clearance between them is defined as
uk
Cw = u Cw +k Cw , where u Cw and k Cw are the desired clearances by agents u and k, respectively.
In the proposed system, an occupancy grid is adopted for static map representation, which
can be affected by moving agents in depth images. Therefore, pixels of other agents are set
agents are unlikely to affect our localization module which uses a wide angle-of-view camera.
21
Algorithm 1: DroneDetection(u, k)
1 Notation: Detector u, Target Drone k,
2 Detector Pose Pu , Target Pose Pk ,
3 Depth Image Iu , Threshold T .
Input: Iu , Pu , Pk
Output: CorrectPk , ModifiedIu
4 begin
5 PriorPk ← P oseW orld2Cam(Pk , Pu )
6 PriorPixelk ← P ose2P ixel(PriorPk )
7 if InDepthImage(PriorPixelk , Iu ) then
8 Box ← BoundingBox(PriorPixelk , Iu , PriorPk )
9 for Pixel in Box do
10 P ← P ixel2P ose(Pixel)
11 if Distance(P, Pk ) ≤ T then
12 DronePixelsk .P ushBack(Pixel)
13 CorrectPixelk ← GetCenter(DronePixelsk )
14 CorrectPk ← P ixel2P ose(CorrectPixelk )
15 ModifiedIu ← SetU nknown(Iu , DronePixelsk )
16 return;
Drone Detection
Unlike (65), in which drone detection is done merely by neural networks, we propose a lightweight
but robust approach by pixel gathering from depth images using agent position as an a priori.
The algorithm is explained in Alg. 1. Function PoseWorld2Cam() transforms Pk into the three-
dimensional (3D) camera frame of the detector u. Function Pose2Pixel() transforms PriorPk
into the 2D camera frame represented by the conventional u − v pixel axis of the detector u.
Function InDepthImage() returns whether the pixel value of PriorPixelk is within the bound-
ary of the image Iu . Function BoundingBox() creates a bounding box centered at PriorPixelk
with size larger if the z-component of PriorPk on Iu is smaller. Function Pixel2Pose() trans-
forms Pixel with depth from Iu into the 3D world frame. Function Distance() calculates Eu-
clidean distance between P and Pk . Function PushBack() means data saving. Function Get-
22
Center() computes the average position of all the DronePixelsk . Function SetUnknown() is
an image operator that modifies the depth pixel values belonging to the detected drone to the
unknown, and then the witnessed drone is removed. The CorrectPk can be further used to
correct localization drift.
Although a great deal of engineering considerations have been taken into account, limited by
our knowledge and equipment, failures still occasionally occur. To those who are intended to
reproduce our work to carry out their own research, possible solutions to some cases of failure
are listed in Table S3.
23
Table S3. Cases of failure and troubleshooting.
The requirements for communication are task-specific. In experiments Fly Through Dense For-
est, the goal positions have been written into configuration files before takeoff. Then, trajec-
24
tory is the only message type to be transmitted for inter-robot collision avoidance and relative
localization correction. In such cases, the communication range must be at least twice the
planning distance, since drones with spacing closer than this threshold will affect each other
when planning the trajectory. If their distance is longer than this threshold, they are sure to be
collision-free in a short period of future time without any avoidance actions. Benefiting from the
tain a formation according to other drones’ real-time positions extracted from received trajec-
tories. This requires that each drone maintains at least one connection to others and the whole
formation is inseparable regarding to wireless connection. Then adopting some multihop and
routing protocols, the trajectory messages can reach to all other drones. In the experiment For-
mation Navigation in the Wild, the formation scale is relatively compact, therefore messages
are directly shared in the whole swarm.
computer to each drone is necessary. In addition, the ground computer can be connected directly
as in our experiments or through wireless routing.
In the experiment Multi-Drone Tracking with Target Occlusion, each drone combines goals
from its own observations with others’ observations received from the network. Therefore, more
sharing means higher robustness to occlusion. If there is no observation from others, the drone
can still track the target, but may lose it when encountering a significant occlusion. Thus, it is
difficult to determine an exact threshold of communication range in this task. In our experiment,
we share the observation among all drones.
25
Regarding communication bandwidth, in the bamboo forest experiment of 10 drones, the
size of one trajectory is approximately 170 bytes. For the 65-m flight in the bamboo forest
there are about 100 trajectories sent and 1000 trajectories received for each drone. The data
rate (bandwidth) containing both sent and received information is approximately 2 kB/s and
reached 8 kB/s at peak. These are minor values for WiFi network, which always provides a
several megabytes-per-second (MBps) data rate. The statistics are listed in Table S4 and S5.
Table S4. Data sent in the bamboo forest flight. Trajectory is the only message type trans-
mitted via network. Traj.: Trajectory; Min.: Minimal; Max.: Maximal; Avg.: Average.
26
Table S5. Data received in the bamboo forest flight. Trajectory is the only message type
transmitted via network. Traj.: Trajectory; Min.: Minimal; Max.: Maximal; Avg.: Average.
27
Figures
Fig. S9. Composite images of two formation flights. (A) Letters ZJU (ZheJiang University)
formed by 16 drones. (B) Letters FAST (FAST Lab) formed by 22 drones.
28
Fig. S10. Maps of the Bamboo Forest Built by Drones. The light-colored background denotes
the map fused by overlaying 10 position-aligned drones’ sub-maps. Different colors represent
different heights. For example, green indicates the ground while the purple indicates bamboos.
All trajectories are collision-free. Owing to the drift correction, 10 maps are all accurately
aligned here.
29
Fig. S11. Control errors during flights in Section Formation Navigation in the Wild. Green
dotted ellipses indicate emergency stops activated by the safety check introduced in Section
Hierarchical Safety Guarantee. Blue translucent areas indicate significant changes in speeds.
Here, it is emergency stop rather than speed change that increases the position error.
30
Fig. S12. A 40-drone position swap on a circle.
31
Fig. S13. Localization drift corrections for drones 1–9. Drone 0 is visualized in Fig. S3B.
32
Fig. S14. Block diagram of localization drift correction. See Section Localization and Drift
Correction for details. Sensors with corresponding fusion methods in dashed light-yellow boxes
are optional and not adopted in our experiments. Positions in our framework are shared via
trajectories.
33
Fig. S15. Formation settings for scalability tests on the computation efficiency in Section
Time and Space Complexity. (A) A loose formation. (B) A medium-intensity formation. (C)
A tight formation. Orange circles indicate the range of two times the planning distance (7.5 m
in Table S6).
34
Fig. S16. Histograms of Nsv and the number of constraint points p̊. (A) The statistic of Nsv
on each constraint point. (B) The number of constraint points on each trajectory. (C) The test
scenario.
35
Fig. S17. More photographs of the environments and closeup views. (A) An aerial view of
the bamboo forest where experiments are conducted. (B-E) Closeup views of the drone. (F) A
miniature flight controller. (G) A mechanical model. (H) A PCB layout of the flight controller.
Both (G) and (H) are released.
36
Tables
Table S6. Parameter settings used throughout this work. Capital letters A–H indicate var-
ious scenarios: A includes the 10-/40-drone swarm flight, circle position swap, circle random
goals, goal select and set, moving obstacle avoidance, scalability test on drone number, and
benchmark with various obstacle densities: B is the formation flight simulation, C is the track-
ing simulation, D is the velocity scalability benchmark, E is the real flight in the bamboo forest,
F is the real-world formation flight, G is the real-world intensive reciprocal avoidance evalua-
tion, and H is the real-world swarm tracking. Parameter λu is defined in Section Penalty For-
mulation. Formation weight λf o and Tracking weight λtr are not explicitly defined elsewhere,
but they follow the same naming conventions as other weights.
37
Table S7. Component weights.
38
Table S8. The values of sub-figure (1,1) in Fig. 7B.
39
Table S9. The values of sub-figure (2,1) in Fig. 7B.
40
Table S10. The values of sub-figures (1,2) and (2,2) in Fig. 7B.
Minimum Clearance
Obstacle Spacing 2.0 1.5 1.0
MADER 0.008 0.011 -0.184
EGO-Swarm 0.045 0.067 0.064
Proposed 0.086 0.065 0.058
Minimum Clearance
Velocity 1.0 2.0 3.0
MADER 0.000 0.011 -0.086
EGO-Swarm 0.062 0.067 0.047
Proposed 0.068 0.065 0.064
41
Table S11. The values of sub-figures (1,3) and (2,3) in Fig. 7B.
Control Effort
Obstacle Spacing 2.0 1.5 1.0
Average 6608.10 7575.70 8922.20
MADER
SD 4254.50 4707.70 4958.20
Average 624.30 523.94 1133.30
EGO-Swarm
SD 346.29 362.08 811.70
Average 189.03 246.42 465.18
Proposed
SD 98.87 129.27 183.72
Control Effort
Velocity 1.0 2.0 3.0
Average 743.52 7575.70 19994.00
MADER
SD 1271.70 4707.70 7195.70
Average 13.26 523.94 4176.30
EGO-Swarm
SD 10.29 362.08 2945.20
Average 70.64 246.42 456.18
Proposed
SD 36.14 129.27 207.19
42
Table S12. The values of sub-figure (1,4) in Fig. 7B.
43
Table S13. The values of sub-figure (2,4) in Fig. 7B.
44