Mean-Field Control Barrier Functions: A Framework For Real-Time Swarm Control
Abstract—Control Barrier Functions (CBFs) are an effective methodology to ensure safety and performative efficacy in real-time control applications such as power systems, resource allocation, autonomous vehicles, robotics, etc. This approach ensures safety independently of the high-level tasks that may have been pre-planned off-line. For example, CBFs can be used to guarantee that a vehicle will remain in its lane. However, when the number of agents is large, computation of CBFs can suffer from the curse of dimensionality in the multi-agent setting. In this work, we present Mean-field Control Barrier Functions (MF-CBFs), which extend the CBF framework to the mean-field (or swarm control) setting. The core idea is to model a population of agents as probability measures in the state space and build corresponding control barrier functions. Similar to traditional CBFs, we derive safety constraints on the (distributed) controls, but now relying on the differential calculus in the space of probability measures.
Index Terms—real-time control, safety, optimal control, barrier functions, mean-field, swarm control, robotics
This work was partially funded by NSF DMS award 2110745. The authors contributed equally.
I. INTRODUCTION
Control problems are ubiquitous in applications, including aerospace engineering, robotics, economics, finance, power systems management, etc. Typically, one has a system that can be manipulated by applying a control, and the goal is to drive the system to certain states while maintaining suitable constraints and acting as economically as possible.
When the constraints are known ahead of time, one can solve offline for controls that maintain these constraints. However, there are numerous situations where some constraints are unknown before deployment and must be dealt with online. Examples of such constraints include avoiding an unexpected obstacle or maintaining a certain distance from an agent with unknown dynamics, e.g., pedestrians.
For safety-critical applications, real-time computation of effective constraint-maintaining controllers is crucial. Control Barrier Functions are an effective framework for computing such controllers [1], [2]. In short, one represents the state constraints as a superlevel set of a suitable function, which then yields the set of safe (constraint-maintaining) controls via a differential inequality. Hence, one can replace the nominal control by the closest possible safe control.
The CBF methodology is appealing due to its theoretical guarantees, local nature, and computational benefits. Indeed, the set of safe controls at a given state depends only on the data at the current state. Additionally, the set of safe controls is convex for control-affine systems, so state-of-the-art convex optimization algorithms are applicable for fast computation of safe controllers. Finally, no computation is necessary when the nominal control is safe.
In this paper, we extend the CBF methodology to infinite-dimensional control problems in the space of probability measures. Such control problems are often called mean-field control problems, as one aims to control distributions of states rather than a single state [3]–[8]. Hence, we call the framework Mean-field Control Barrier Functions (MF-CBFs).
The mean-field framework is an efficient way of modeling multi-agent systems [3]–[8]. Indeed, the dynamics of a swarm in a state space are equivalent to the dynamics of the empirical distribution of the swarm in the space of probability measures. Modeling the swarm behavior via the mean-field framework has several benefits. First, the mathematical analysis is performed in the space of probability measures, which is independent of the swarm size, unlike the product space of the joint state of the swarm. Second, instead of searching for individual controls, one can search for a common (distributed) control in a feedback form, significantly reducing the problem dimension. See [9]–[14] for the challenges occurring in high-dimensional multi-agent control problems.
Our main contributions in this paper are as follows.
• Formulation of the CBF framework for mean-field control problems in the space of probability measures.
• Derivation of MF-CBFs suitable for swarm avoidance and tracking.
• Numerical experiments of swarm avoidance and tracking with up to 200 agents.
This paper is organized as follows. In Section II, we review preliminary concepts on control barrier functions and their challenges in the multi-agent setting. In Section III, we present the mean-field control barrier function framework. In Section IV, we walk through some illustrative examples of swarm tracking and avoidance with up to 200 agents.
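The safety-filter idea outlined in the introduction (keep the nominal control when it is safe, otherwise move to the closest safe control) can be sketched for a single agent as follows. This is a minimal illustration and not the authors' implementation. It assumes the standard CBF condition from [1], [2], namely ∇h(z)·(A(z) + B(z)u) + α h(z) ≥ 0 with a linear class-K function of gain `alpha`. The function name `cbf_filter`, the disk obstacle, and the single-integrator dynamics are hypothetical choices made only for this example.

```python
import numpy as np

def cbf_filter(u_nom, grad_h, h_val, A_z, B_z, alpha=1.0):
    """Return the control closest to u_nom that satisfies
    grad_h . (A_z + B_z u) + alpha * h_val >= 0  (assumed standard CBF condition)."""
    a = B_z.T @ grad_h                  # the constraint is affine in u: a . u + b >= 0
    b = grad_h @ A_z + alpha * h_val
    slack = a @ u_nom + b
    if slack >= 0.0:                    # nominal control is already safe: no computation
        return u_nom.copy()
    norm2 = a @ a
    if norm2 == 0.0:
        raise ValueError("the constraint does not depend on the control")
    # Euclidean projection of u_nom onto the half-space {u : a . u + b >= 0}
    return u_nom - slack * a / norm2

# Hypothetical example: single integrator z' = u that must stay outside a unit disk.
center, radius = np.array([0.0, 0.0]), 1.0
z = np.array([-1.5, 0.2])
h_val = (z - center) @ (z - center) - radius**2   # h >= 0 outside the disk
grad_h = 2.0 * (z - center)
u_nom = np.array([1.0, 0.0])                      # nominal control heads toward the disk
u_safe = cbf_filter(u_nom, grad_h, h_val, np.zeros(2), np.eye(2), alpha=1.0)
print("filtered control:", u_safe)
```

Because the system is control-affine, the safe set at a fixed state is a half-space in u, which is why a closed-form projection suffices here. With several constraints or control bounds, a quadratic program solver would be needed instead.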
II. BACKGROUND: CONTROL BARRIER FUNCTIONS
In this section, we provide a brief introduction to CBFs and refer to [1], [2] for a more in-depth discussion.
A. Control Problems
We consider deterministic finite time-horizon control problems, where a system obeys the dynamics
\[ \partial_s z(s) = f(s, z(s), u(s)), \quad z(t) = x, \quad t \le s \le T. \tag{1} \]
Above, $x \in \mathbb{R}^d$ is the initial state, $T$ is the time horizon, $t < T$ is an initial time, and $z : [t, T] \to \mathbb{R}^d$ and $u : [t, T] \to U \subset \mathbb{R}^m$ are, respectively, the state and the control of the system as functions of time. Furthermore, the function $f : [t, T] \times \mathbb{R}^d \times U \to \mathbb{R}^d$ models the evolution of the state $z$ in response to the control $u : [t, T] \to U$. We say that the system is control-affine if the dynamics are affine with respect to the control; that is,
\[ f(s, z, u) = A(s, z) + B(s, z)u \tag{2} \]
and
B. Control Barrier Functions
Controls in feedback form (5) are satisfactory when we have access to problem data, such as $f$, $L$, $G$ in (4) or $R$, $D$ in (3), that encode the essential features of the problem. However, $q$ is not designed to handle unforeseen circumstances such as real-time collision and danger zone avoidance, or tracking. To this end, one can enhance the nominal (pre-computed) controller $u^*$ with mission-oriented filters that use sensor data to adjust $u^*$ in real time when, e.g., an unforeseen obstacle appears.
CBFs are a successful approach to filtering $u^*$. The basic idea underlying CBFs is as follows. Consider the dynamics of agents in (1), where $u$ is some control. Furthermore, assume that $h$ encodes safety constraints or other goals so that it is desirable to have
\[ h(z(s)) \ge 0, \quad \forall s \ge t. \tag{6} \]
One way to ensure this is to impose
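Theorem 2 provides constraints on the policy function that ensure safe controls or mission-accomplishing controls for the swarm. The mean-field analog of (9) is
\[ q_{\mathrm{CBF}}(s, \cdot) \in \operatorname*{argmin}_{\substack{q \in K_{\mathrm{CBF}}(s, \rho(s, \cdot)) \\ q(x) \in U}} \|q - q^*(s, \cdot)\|^2_{L^2(\rho(s, \cdot))}, \tag{20} \]
where $q^*(s, \cdot)$, $s \ge 0$, is the nominal control of the swarm.
and
\[ \partial_s H(s, \rho) = \int_{\mathbb{R}^d} \delta_{\rho^\dagger}\left( \frac{\mathrm{MMD}(\rho, \rho^\dagger(s, \cdot))^2}{2} - \epsilon \right) \partial_s \rho^\dagger(s, x)\,dx = \int_{\mathbb{R}^{2d}} K(x, y)\,(\rho^\dagger(s, y) - \rho(y))\,\partial_s \rho^\dagger(s, x)\,dx\,dy. \tag{25} \]
1 We note that other metrics such as Wasserstein distance may be considered.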
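The second equality in (25) relies on the first variation of the squared MMD with respect to ρ†. As a short worked step, assuming the kernel K is symmetric and using the double-integral form of the MMD that appears in (29), one can perturb ρ† in a direction η (a hypothetical test perturbation introduced only for this computation):

```latex
\begin{align*}
\frac{\mathrm{MMD}(\rho,\rho^\dagger)^2}{2}
  &= \frac{1}{2}\int_{\mathbb{R}^{2d}} K(x,y)\,(\rho(x)-\rho^\dagger(x))\,(\rho(y)-\rho^\dagger(y))\,dx\,dy, \\
\frac{d}{d\tau}\bigg|_{\tau=0}\frac{\mathrm{MMD}(\rho,\rho^\dagger+\tau\eta)^2}{2}
  &= \int_{\mathbb{R}^{2d}} K(x,y)\,(\rho^\dagger(y)-\rho(y))\,\eta(x)\,dx\,dy, \\
\delta_{\rho^\dagger}\left(\frac{\mathrm{MMD}(\rho,\rho^\dagger)^2}{2}-\epsilon\right)(x)
  &= \int_{\mathbb{R}^{d}} K(x,y)\,(\rho^\dagger(y)-\rho(y))\,dy.
\end{align*}
```

Choosing η = ∂s ρ†(s, ·) in the second line recovers the right-hand side of (25). The constant ϵ does not contribute to the variation.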
Fig. 1. Swarm avoidance example (3D double integrator). Illustration of swarm avoidance with the acceleration model. A swarm of agents (black dots) avoids the red agent. The agent swarm must maintain a squared MMD value greater than $2\epsilon$ while moving the least amount possible, as described in (31).
Assuming $\rho^\dagger$ evolves according to the dynamics
\[ \partial_s \rho^\dagger(s, x) + \nabla \cdot (\rho^\dagger(s, x)\,v^\dagger(s, x)) = 0, \tag{26} \]
we obtain that
\[ \partial_s H(s, \rho) = \int_{\mathbb{R}^{2d}} \nabla_x K(x, y) \cdot v^\dagger(s, x)\,(\rho^\dagger(s, y) - \rho(y))\,\rho^\dagger(s, x)\,dx\,dy. \tag{27} \]
Combining the derivations above, we find that
\[ K_{\mathrm{CBF}}(s, \rho) = \Bigg\{ q : \int_{\mathbb{R}^{2d}} \nabla_x K(x, y) \cdot f(s, x, q(s, x))\,(\rho(y) - \rho^\dagger(s, y))\,\rho(x)\,dx\,dy + \int_{\mathbb{R}^{2d}} \nabla_x K(x, y) \cdot v^\dagger(s, x)\,(\rho^\dagger(s, y) - \rho(y))\,\rho^\dagger(s, x)\,dx\,dy + \alpha\left( \frac{\mathrm{MMD}(\rho, \rho^\dagger(s, \cdot))^2}{2} - \epsilon \right) \ge 0 \Bigg\}. \tag{28} \]
2) Swarm tracking: The swarm avoidance framework in the previous section can be easily modified to a swarm tracking one by changing the sign of $H$ in (22). Indeed, consider
\[ H(s, \rho) = \epsilon - \frac{\mathrm{MMD}(\rho, \rho^\dagger(s, \cdot))^2}{2} = \epsilon - \frac{1}{2} \int_{\mathbb{R}^{2d}} K(x, y)\,(\rho(x) - \rho^\dagger(s, x))\,(\rho(y) - \rho^\dagger(s, y))\,dx\,dy, \tag{29} \]
where $\rho^\dagger$ is now the distribution of the swarm that one wants to track. Then (14) would ensure that $\rho$ stays close to $\rho^\dagger$ at all times. Recycling the calculations for swarm avoidance yields
\[ K_{\mathrm{CBF}}(s, \rho) = \Bigg\{ q : \int_{\mathbb{R}^{2d}} \nabla_x K(x, y) \cdot f(s, x, q(s, x))\,(\rho(y) - \rho^\dagger(s, y))\,\rho(x)\,dx\,dy + \int_{\mathbb{R}^{2d}} \nabla_x K(x, y) \cdot v^\dagger(s, x)\,(\rho^\dagger(s, y) - \rho(y))\,\rho^\dagger(s, x)\,dx\,dy - \alpha\left( \epsilon - \frac{\mathrm{MMD}(\rho, \rho^\dagger(s, \cdot))^2}{2} \right) \le 0 \Bigg\}. \tag{30} \]
IV. EXPERIMENTS
We illustrate the effectiveness of the MF-CBF framework on two types of applications: swarm avoidance and swarm tracking. We note that in practice we typically have access to samples of $\rho$ in (20); consequently, we have a discrete approximation where $q$ and $q^*$ are vectors and the objective is given by the Euclidean $\ell_2$ norm instead. The dynamics used in both applications are double integrator dynamics, that is, $f(t, z, u) = Az + Bu$, where $z$ stacks the position and velocity of the agents. Our experiments are coded in Python; in particular, the quadratic programs arising from the MF-CBFs are solved with CVXPY [23], an open-source library for solving convex optimization problems. In both setups, the nominal controller is given by $q^* = 0$, i.e., we would like the swarm to have constant velocity (or remain still if the agents are already stationary).
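To make the discrete approximation concrete, the following sketch assembles the particle version of the avoidance constraint (28) and computes the minimum-norm control for $q^* = 0$. It is an illustration under stated assumptions and not the authors' code: the kernel is assumed Gaussian with bandwidth `sigma` and acts on the full state (position and velocity), $\alpha$ is assumed linear with gain `alpha`, and $\rho^\dagger$ is assumed to be a single Dirac mass at state `y` moving with constant velocity, so that $v^\dagger(y) = Ay$. All function names are hypothetical. The paper solves the resulting quadratic programs with CVXPY; since this sketch has one affine constraint and no control bounds, the minimizer is a half-space projection and is written in closed form with NumPy.

```python
import numpy as np

def double_integrator(dim=3):
    """Return (A, B) for z = (position, velocity) with z' = A z + B u."""
    I, O = np.eye(dim), np.zeros((dim, dim))
    return np.block([[O, I], [O, O]]), np.vstack([O, I])

def kernel(x, y, sigma):
    """Gaussian kernel K(x, y) (assumed choice), broadcast over leading axes."""
    d = x - y
    return np.exp(-np.sum(d * d, axis=-1) / (2.0 * sigma**2))

def grad_kernel(x, y, sigma):
    """Gradient of K(x, y) with respect to its first argument."""
    return -(x - y) / sigma**2 * kernel(x, y, sigma)[..., None]

def barrier(Z, y, sigma, eps):
    """H = MMD^2 / 2 - eps for rho = (1/N) sum_i delta_{z_i} and rho_dagger = delta_y."""
    Kzz = kernel(Z[:, None, :], Z[None, :, :], sigma)
    mmd2 = Kzz.mean() - 2.0 * kernel(Z, y, sigma).mean() + 1.0  # K(y, y) = 1
    return 0.5 * mmd2 - eps

def avoidance_constraint(Z, y, A, B, sigma, eps, alpha):
    """Coefficients (a, b) so that the particle form of (28) reads sum(a * Q) + b >= 0,
    where row i of Q is the control applied to agent i."""
    N = Z.shape[0]
    # g_i = (1/N) sum_j grad_x K(z_i, z_j) - grad_x K(z_i, y)
    G = grad_kernel(Z[:, None, :], Z[None, :, :], sigma).mean(axis=1) - grad_kernel(Z, y, sigma)
    # adversary term: grad_x K(y, y) - (1/N) sum_j grad_x K(y, z_j), with grad_x K(y, y) = 0
    g_dag = -grad_kernel(y, Z, sigma).mean(axis=0)
    a = (G @ B) / N                                   # control-dependent part
    drift = np.sum((Z @ A.T) * G) / N + (A @ y) @ g_dag
    return a, drift + alpha * barrier(Z, y, sigma, eps)

def min_norm_control(a, b):
    """Minimize ||Q||_2 subject to sum(a * Q) + b >= 0 (projection onto a half-space)."""
    if b >= 0.0 or not np.any(a):   # q = 0 is feasible, or the control cannot affect the constraint
        return np.zeros_like(a)
    return -b * a / np.sum(a * a)

A, B = double_integrator(3)
rng = np.random.default_rng(0)
Z = np.hstack([rng.normal(scale=0.3, size=(20, 3)), np.zeros((20, 3))])  # stationary swarm
y = np.array([0.5, 0.0, 0.0, -1.0, 0.0, 0.0])                            # adversary state
a, b = avoidance_constraint(Z, y, A, B, sigma=1.0, eps=0.05, alpha=1.0)
Q = min_norm_control(a, b)
print("constraint value with q = 0:", b)
print("constraint value after filtering:", np.sum(a * Q) + b)
```

The tracking constraint (30) follows from the same coefficients by flipping the sign of $H$ and of the inequality. Adding control bounds such as $q(x) \in U$ would require a quadratic program solver rather than the closed form used here.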
Fig. 2. Swarm tracking example (3D double integrator). Illustration of swarm tracking with the acceleration model. Swarm of agents (black dots) and adversary (red dot). The agent swarm must maintain a squared MMD value less than $2\epsilon$ while moving the least amount possible, as shown in (32).
Fig. 3. Panels: (a) Swarm Avoidance, plotting $H(t, \cdot)$ and the safety bound against time; (b) Swarm Tracking, plotting $H(t, \cdot)$ and the monitoring bound against time. Values of $H$ over time (blue line) and monitoring bound that guarantees $H(t, \rho) \ge 0$ (orange dashed line). In both examples, $H(t, \rho) \ge 0$, which guarantees (a) $\frac{1}{2}\mathrm{MMD}^2(\rho, \rho^\dagger) \ge \epsilon$ (safety of the swarm of agents from the $\rho^\dagger$ in red), and (b) $\frac{1}{2}\mathrm{MMD}^2(\rho, \rho^\dagger) \le \epsilon$ (proximity to $\rho^\dagger$ in red).
A. Swarm Avoidance
In these experiments, we suppose we have a swarm of agents that are stationary, and as soon as a moving object gets too close, the swarm of agents avoids the object; this is akin to a school of fish avoiding an incoming shark. The MF-CBF problem to be solved at each time step is then given by
\[ \min_q \|q\|_2 \quad \text{s.t.} \quad q \in K_{\mathrm{CBF}} \text{ in (28)}. \tag{31} \]
Figure 1 shows the effectiveness of the MF-CBF in the swarm avoidance setting. In this experiment, the red dot is a moving obstacle with constant velocity. Guaranteed safety is shown in Fig. 3, where $H(\rho(s, \cdot)) \ge 0$ for all $s$, which guarantees that $\frac{1}{2}\mathrm{MMD}^2(\rho, \rho^\dagger) \ge \epsilon$.
B. Swarm Tracking
In this experiment, we suppose we have a swarm of agents that must remain within a certain distance of the red agent. In particular, they must maintain a squared MMD value less than $2\epsilon$. The MF-CBF problem to be solved at each time step is then given by
\[ \min_q \|q\|_2 \quad \text{s.t.} \quad q \in K_{\mathrm{CBF}} \text{ in (30)}. \tag{32} \]
As before, Figure 2 shows the effectiveness of the mean-field CBF in swarm tracking. In this experiment, the red agent has a constant velocity. Feasibility of the tracking distance is shown in Fig. 3, where $H(\rho(s, \cdot)) \ge 0$ for all $s$, which guarantees that $\frac{1}{2}\mathrm{MMD}^2(\rho, \rho^\dagger) \le \epsilon$.
C. Discussion
Just like traditional CBFs, there are two major considerations when employing MF-CBFs: the choice of $\alpha(x)$ and the necessary time discretization, both of which are nuanced and context-dependent tasks [2]. In our experiments, these were hyperparameters that we tuned until we obtained the desired performance; for instance, the swarm tracking problems required a much finer time discretization since the projection onto the set of controls in (18) was activated more frequently. Finally, we remark that since we have a finite number of agents, the distributions are composed of Dirac delta functions, so that the norm $\|\cdot\|_2$ in (31) and (32) is equivalent to $\|\cdot\|_{L^2(\rho(s, \cdot))}$ in (20). Code details and accompanying videos can be found at https://github.com/mines-opt-ml/mean-field-cbf.
V. CONCLUSION
We present MF-CBF, a mean-field framework for real-time swarm control. The core idea is to extend CBFs to the space of distributions. Our numerical experiments show that MF-CBFs are effective in a swarm avoidance and a swarm tracking example. Future work involves employing these in optimal control settings where the feedback control is available [9],
[10], [24], [25] and improving their computational efficiency via kernel decoupling techniques [26]–[30].
REFERENCES
[1] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 3861–3876, 2016.
[2] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” in 2019 18th European Control Conference (ECC), pp. 3420–3431, IEEE, 2019.
[3] J.-M. Lasry and P.-L. Lions, “Jeux à champ moyen. II–Horizon fini et contrôle optimal,” Comptes Rendus. Mathématique, vol. 343, no. 10, pp. 679–684, 2006.
[4] J.-M. Lasry and P.-L. Lions, “Mean field games,” Japanese Journal of Mathematics, vol. 2, no. 1, pp. 229–260, 2007.
[5] M. Fornasier and F. Solombrino, “Mean-field optimal control,” ESAIM: Control, Optimisation and Calculus of Variations, vol. 20, no. 4, pp. 1123–1152, 2014.
[6] D. A. Gomes and J. Saúde, “Mean field games models—a brief survey,” Dynamic Games and Applications, vol. 4, pp. 110–154, 2014.
[7] L. Ruthotto, S. J. Osher, W. Li, L. Nurbekyan, and S. Wu Fung, “A machine learning framework for solving high-dimensional mean field game and mean field control problems,” Proceedings of the National Academy of Sciences, vol. 117, no. 17, pp. 9183–9193, 2020.
[8] A. T. Lin, S. Wu Fung, W. Li, L. Nurbekyan, and S. J. Osher, “Alternating the population and control neural networks to solve high-dimensional stochastic mean-field games,” Proceedings of the National Academy of Sciences, vol. 118, no. 31, p. e2024713118, 2021.
[9] D. Onken, L. Nurbekyan, X. Li, S. Wu Fung, S. Osher, and L. Ruthotto, “A neural network approach applied to multi-agent optimal control,” in European Control Conference (ECC), pp. 1036–1041, 2021.
[10] D. Onken, L. Nurbekyan, X. Li, S. Wu Fung, S. Osher, and L. Ruthotto, “A neural network approach for high-dimensional optimal control applied to multiagent path finding,” IEEE Transactions on Control Systems Technology, 2022.
[11] S. Bansal, M. Chen, K. Tanabe, and C. J. Tomlin, “Provably safe and scalable multivehicle trajectory planning,” IEEE Transactions on Control Systems Technology, vol. 29, no. 6, pp. 2473–2489, 2020.
[12] S. Bansal and C. J. Tomlin, “DeepReach: A deep learning approach to high-dimensional reachability,” in IEEE International Conference on Robotics and Automation (ICRA), pp. 1817–1824, 2021.
[13] Y. Chen, A. Singletary, and A. D. Ames, “Guaranteed obstacle avoidance for multi-robot operations with limited actuation: A control barrier function approach,” IEEE Control Systems Letters, vol. 5, no. 1, pp. 127–132, 2020.
[14] S. Zhang, O. So, K. Garg, and C. Fan, “GCBF+: A neural graph control barrier function framework for distributed safe multi-agent control,” 2024.
[15] K. Kunisch and D. Walter, “Semiglobal optimal feedback stabilization of autonomous systems via deep neural network approximation,” ESAIM: Control, Optimisation and Calculus of Variations, vol. 27, 2021.
[16] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, vol. 25 of Stochastic Modelling and Applied Probability. Springer, New York, second ed., 2006.
[17] L. R. G. Carrillo, A. E. D. López, R. Lozano, and C. Pégard, “Modeling the quad-rotor mini-rotorcraft,” in Quad Rotorcraft Control, pp. 23–34, Springer, 2013.
[18] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-Jacobi reachability: A brief overview and recent advances,” in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 2242–2253, IEEE, 2017.
[19] R. Bellman, Dynamic Programming. Princeton University Press, Princeton, N. J., 1957.
[20] D. A. Paley, F. Zhang, and N. E. Leonard, “Cooperative control for ocean sampling: The glider coordinated control system,” IEEE Transactions on Control Systems Technology, vol. 16, no. 4, pp. 735–744, 2008.
[21] K. Glock and A. Meyer, “Mission planning for emergency rapid mapping with drones,” Transportation Science, vol. 54, no. 2, pp. 534–560, 2020.
[22] L. Ambrosio, N. Gigli, and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, second ed., 2008.
[23] S. Diamond and S. Boyd, “CVXPY: A Python-embedded modeling language for convex optimization,” Journal of Machine Learning Research, vol. 17, no. 83, pp. 1–5, 2016.
[24] D. Onken, S. Wu Fung, X. Li, and L. Ruthotto, “OT-Flow: Fast and accurate continuous normalizing flows via optimal transport,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 9223–9232, 2021.
[25] A. Vidal, S. Wu Fung, L. Tenorio, S. Osher, and L. Nurbekyan, “Taming hyperparameter tuning in continuous normalizing flows using the JKO scheme,” Scientific Reports, vol. 13, no. 1, p. 4501, 2023.
[26] L. Nurbekyan et al., “Fourier approximation methods for first-order nonlocal mean-field games,” Portugaliae Mathematica, vol. 75, no. 3, pp. 367–396, 2019.
[27] Y. T. Chow, S. Wu Fung, S. Liu, L. Nurbekyan, and S. Osher, “A numerical algorithm for inverse problem from partial boundary measurement arising from mean field game problem,” Inverse Problems, vol. 39, no. 1, p. 014001, 2022.
[28] S. Agrawal, W. Lee, S. Wu Fung, and L. Nurbekyan, “Random features for high-dimensional nonlocal mean-field games,” Journal of Computational Physics, vol. 459, p. 111136, 2022.
[29] S. Liu, M. Jacobs, W. Li, L. Nurbekyan, and S. J. Osher, “Computational methods for first-order nonlocal mean field games with applications,” SIAM Journal on Numerical Analysis, vol. 59, no. 5, pp. 2639–2668, 2021.
[30] A. Vidal, S. Wu Fung, S. Osher, L. Tenorio, and L. Nurbekyan, “Kernel expansions for high-dimensional mean-field control with non-local interactions,” arXiv preprint arXiv:2405.10922, 2024.