
Mean-Field Control Barrier Functions:

A Framework for Real-Time Swarm Control


Samy Wu Fung
Department of Applied Mathematics and Statistics
Colorado School of Mines
Golden, USA
swufung@mines.edu

Levon Nurbekyan
Department of Mathematics
Emory University
Atlanta, USA
lnurbek@emory.edu

arXiv:2409.18945v1 [math.OC] 27 Sep 2024

Abstract—Control Barrier Functions (CBFs) are an effective methodology to ensure safety and performative efficacy in real-time control applications such as power systems, resource allocation, autonomous vehicles, robotics, etc. This approach ensures safety independently of the high-level tasks that may have been pre-planned off-line. For example, CBFs can be used to guarantee that a vehicle will remain in its lane. However, when the number of agents is large, computation of CBFs can suffer from the curse of dimensionality in the multi-agent setting. In this work, we present Mean-field Control Barrier Functions (MF-CBFs), which extend the CBF framework to the mean-field (or swarm control) setting. The core idea is to model a population of agents as probability measures in the state space and build corresponding control barrier functions. Similar to traditional CBFs, we derive safety constraints on the (distributed) controls, but now relying on the differential calculus in the space of probability measures.

Index Terms—real-time control, safety, optimal control, barrier functions, mean-field, swarm control, robotics

I. INTRODUCTION

Control problems are ubiquitous in applications, including aerospace engineering, robotics, economics, finance, power systems management, etc. Typically, one has a system that can be manipulated by applying a control, and the goal is to drive the system to certain states, maintaining suitable constraints and acting as economically as possible.

When the constraints are known ahead of time, one can solve for controls that maintain these constraints offline. However, there are numerous situations where some constraints are unknown before deployment and must be dealt with online. Examples of such constraints include avoiding an unexpected obstacle or maintaining a certain distance from an agent with unknown dynamics, e.g., pedestrians.

For safety-critical applications, real-time computation of effective constraint-maintaining controllers is crucial. Control Barrier Functions are an effective framework for computing such controllers [1], [2]. In short, one represents the state constraints as a sublevel set of a suitable function, which then yields the set of safe (constraint-maintaining) controls via a differential inequality. Hence, one can replace the nominal control by the closest possible safe control.

The CBF methodology is appealing due to its theoretical guarantees, local nature, and computational benefits. Indeed, the set of safe controls at a given state depends only on the data at the current state. Additionally, the set of safe controls is convex for control-affine systems, and state-of-the-art convex optimization algorithms are applicable for fast computation of safe controllers. Finally, no computation is necessary when the nominal control is safe.

In this paper, we extend the CBF methodology to infinite-dimensional control problems in the space of probability measures. Such control problems are often called mean-field control problems, as one aims to control distributions of states rather than a single state [3]–[8]. Hence, we call the framework Mean-field Control Barrier Functions (MF-CBFs).

The mean-field framework is an efficient way of modeling multi-agent systems [3]–[8]. Indeed, the dynamics of a swarm in a state space are equivalent to the dynamics of the empirical distribution of the swarm in the space of probability measures. Modeling the swarm behavior via the mean-field framework has several benefits. First, the mathematical analysis is performed in the space of probability measures, which is independent of the swarm size, unlike the product space of the joint state of the swarm. Second, instead of searching for individual controls, one can search for a common (distributed) control in feedback form, significantly reducing the problem dimension. See [9]–[14] for the challenges occurring in high-dimensional multi-agent control problems.

Our main contributions in this paper are as follows.
• Formulation of the CBF framework for mean-field control problems in the space of probability measures.
• Derivation of MF-CBFs suitable for swarm avoidance and tracking.
• Numerical experiments of swarm avoidance and tracking with up to 200 agents.

This paper is organized as follows. In Section II, we review preliminary concepts on control barrier functions and their challenges in the multi-agent setting. In Section III, we present the mean-field control barrier function framework. In Section IV, we walk through illustrative examples of swarm tracking and avoidance with up to 200 agents.

This work was partially funded by NSF DMS award 2110745. The authors contributed equally.
II. BACKGROUND: CONTROL BARRIER FUNCTIONS

In this section, we provide a brief introduction to CBFs and refer to [1], [2] for a more in-depth discussion.

A. Control Problems

We consider deterministic finite time-horizon control problems, where a system obeys the dynamics

∂s z(s) = f(s, z(s), u(s)),  z(t) = x,  t ≤ s ≤ T.    (1)

Above, x ∈ R^d is the initial state, T is the time horizon, t < T is an initial time, and z : [t, T] → R^d and u : [t, T] → U ⊂ R^m are, correspondingly, the state and the control of the system as functions of time. Furthermore, the function f : [t, T] × R^d × U → R^d models the evolution of the state z in response to the control u : [t, T] → U. We say that the system is control-affine if the dynamics are affine with respect to the control; that is,

f(s, z, u) = A(s, z) + B(s, z)u,    (2)

where A, B are possibly nonlinear maps of time and state. Control-affine systems are ubiquitous and cover a wide range of applications [2], [15]–[17].
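As a concrete instance of (2), consider the double integrator used later in Section IV. The sketch below is our illustration, not code from the paper; it assumes the state stacks position and velocity in R^6 and the control is the acceleration in R^3.

```python
# Minimal sketch (ours, not the paper's code): the 3D double integrator of
# Section IV written in the control-affine form (2), f(s, z, u) = A(s, z) + B(s, z) u,
# with state z = (position, velocity) in R^6 and acceleration control u in R^3.
import numpy as np

d, m = 6, 3  # assumed state and control dimensions

A_mat = np.block([[np.zeros((3, 3)), np.eye(3)],
                  [np.zeros((3, 3)), np.zeros((3, 3))]])  # d/ds (pos, vel) = (vel, u)
B_mat = np.vstack([np.zeros((3, 3)), np.eye(3)])

def f(s, z, u):
    """Control-affine dynamics (2): here A(s, z) = A_mat @ z and B(s, z) = B_mat."""
    return A_mat @ z + B_mat @ u

# Example: one explicit Euler step of (1) from state x with a constant control.
x = np.zeros(d)
u = np.array([0.1, 0.0, -0.2])
dt = 0.01
z_next = x + dt * f(0.0, x, u)
```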
In control problems, one searches for controls to achieve a suitable goal. For instance, one might seek controls for reaching a final destination while avoiding dangerous zones; that is,

find u(·)  s.t.  z(T) ∈ R  and  z(s) ∉ D,  ∀ t ≤ s ≤ T,    (3)

where R is the destination set, and D is the dangerous zone. See [18] for a detailed discussion on problems of type (3).

A large class of control problems seeks to control a system in an optimal manner; that is, to search for controls that minimize a cost functional

inf_{u(s) ∈ U}  ∫_t^T L(s, z(s), u(s)) ds + G(z(T))   s.t. (1) holds,    (4)

where L : [0, T] × R^d × U → R is the running cost (or the Lagrangian) and G : R^d → R is the terminal cost; the optimal value of (4), viewed as a function of (t, x), is the so-called value function or optimal cost-to-go.

Problems such as (4) are called optimal control problems. A solution u*_{t,x} of (4) is called an optimal control. Accordingly, the z*_{t,x} which corresponds to u*_{t,x} is called an optimal trajectory. See, for instance, [16] for a detailed account of optimal control problems.

Whether it is the reachability problem (3) or the optimal control problem (4), it is advantageous to find controls in feedback form; that is,

u*_{t,x}(s) = q(s, z*_{t,x}(s)),  t ≤ s ≤ T.    (5)

Here, the function q is called a policy function. Hence, instead of searching for controls separately at each initial point (t, x), one can search for a suitable policy function that yields the desired controls for all initial points at once.

B. Control Barrier Functions

Controls in feedback form (5) are satisfactory when we have access to problem data, such as f, L, G in (4) or R, D in (3), that encode the essential features of the problem. However, q is not designed to handle unforeseen circumstances such as real-time collision and danger-zone avoidance, or tracking. To this end, one can enhance the nominal (pre-computed) controller u* with mission-oriented filters that use sensor data to adjust u* in real time when, e.g., an unforeseen obstacle appears.

A successful approach to filtering u* is CBFs. The basic idea underlying CBFs is as follows. Consider the dynamics of agents in (1), where u is some control. Furthermore, assume that h encodes safety constraints or other goals so that it is desirable to have

h(z(s)) ≥ 0,  ∀ s ≥ t.    (6)

One way to ensure this is to impose

d/ds h(z(s)) ≥ −α(h(z(s))),  ∀ s ≥ t,    (7)

where α : R → R is a strictly increasing smooth function such that α(0) = 0 [1], [2]. Feeding (7) into the dynamics of z, we obtain

∇h(z(s)) · f(s, z(s), u(s)) ≥ −α(h(z(s))),  ∀ s ≥ t.    (8)

Thus, one can adjust u* in real time by solving the following quadratic program

uCBF(s) ∈ argmin_{u ∈ U} ∥u − u*(s)∥²   s.t.  ∇h(z(s)) · f(s, z(s), u) ≥ −α(h(z(s))).    (9)

The construction of h depends on the application and on the type of live sensors.
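To make (9) concrete, the following is a minimal sketch of the CBF-QP safety filter using cvxpy, the same solver library the paper uses in Section IV. The single-integrator dynamics, the circular-obstacle barrier h, the box bound on u, and the linear choice α(x) = γx are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch of the CBF-QP safety filter (9) for a control-affine system (2).
# The single-integrator dynamics, circular-obstacle barrier h, and alpha(x) = gamma * x
# are illustrative assumptions, not the paper's setup.
import numpy as np
import cvxpy as cp

gamma = 1.0                                  # linear class-K function: alpha(x) = gamma * x
obstacle, radius = np.array([1.0, 0.0]), 0.5

def h(z):                                    # safety: stay outside a disk around the obstacle
    return np.sum((z - obstacle) ** 2) - radius ** 2

def grad_h(z):
    return 2.0 * (z - obstacle)

def A(z):                                    # single integrator: f(s, z, u) = A(z) + B(z) u
    return np.zeros(2)

def B(z):
    return np.eye(2)

def cbf_filter(z, u_nom, u_max=1.0):
    """Solve (9): project the nominal control onto the set of safe controls."""
    u = cp.Variable(2)
    safety = grad_h(z) @ (A(z) + B(z) @ u) >= -gamma * h(z)
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)),
                      [safety, cp.norm(u, "inf") <= u_max])
    prob.solve()
    return u.value

# The nominal control drives straight toward the obstacle; the filter deflects it.
z0 = np.array([0.0, 0.05])
print(cbf_filter(z0, u_nom=np.array([1.0, 0.0])))
```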
1) Existing Challenges: One of the main drawbacks of CBFs is that the approach is prone to the curse of dimensionality [19] for multi-agent systems. The latter appear, for instance, in applications such as the Glider Coordinated Control System for ocean monitoring [20] and informative Unmanned Aerial Vehicle (UAV) mission planning [21]. Indeed, assume that z1, z2, · · · , zn represent the states of n control systems (agents) obeying the dynamics

∂s zi(s) = fi(zi(s), ui(s)),  1 ≤ i ≤ n,  z(t) = x,  t ≤ s ≤ T.    (10)

Furthermore, let h1, h2, · · · , hn encode the safety requirements for individual agents 1, 2, · · · , n, respectively. Finally, let hij encode a mutual safety requirement for a pair of agents 1 ≤ i ≠ j ≤ n. For example, the functions

hij(zi, zj) = ∥zi − zj∥² − ϵ²_safe

reflect the requirement that the distance between two agents should be at least ϵ_safe > 0.
A common approach to study such multi-agent systems is to concatenate all states and controls into one "large" agent. Specifically, let

z = (z1, z2, . . . , zn) ∈ R^{d×n},
u = (u1, u2, . . . , un) ∈ U^n ⊂ R^{m×n},
f(z, u) = (f1(z1, u1), f2(z2, u2), . . . , fn(zn, un)) ∈ R^{d×n},

and

hi(z) = hi(zi),  ∀ 1 ≤ i ≤ n,
hij(z) = hij(zi, zj),  ∀ 1 ≤ i ≠ j ≤ n.

Then the quadratic program for computing safe controls is

uCBF(s) ∈ argmin_{u ∈ U^n} ∥u − u*(s)∥²
   s.t.  ∇hi(z(s)) · f(z(s), u) ≥ −α(hi(z(s))),  ∀ i,    (11)
         ∇hij(z(s)) · f(z(s), u) ≥ −α(hij(z(s))),  ∀ i ≠ j,

where u* = (u*_1, u*_2, . . . , u*_n) ∈ U^n is the concatenation of the nominal controllers.

A few important remarks are in order; a small sketch illustrating the resulting problem size follows the list.
1) The dimension of the quadratic problem (11) is n × m as opposed to m in (9).
2) The number of inequality constraints (excluding the a priori constraints, u ∈ U^n, on the controls) in (11) is n² as opposed to 1 in (9). Hence, for control-affine systems, going from a single agent to n agents means going from one additional half-space constraint to n² additional half-space constraints.
3) If individual agents need more than one CBF for their safety requirements, and there are additional collective safety or goal requirements beyond pair-to-pair interactions, the total number of CBF or inequality constraints in the projection problem will be even larger.
4) The inequality constraints in (11) are coupled and in general cannot be solved at the level of individual agents. Smart decoupling techniques mitigate the coupling issue, but still lead to n quadratic programs with O(n) inequality constraints, which are challenging to solve online for n ≫ 10 [13].
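To illustrate the scaling discussed in the remarks, the sketch below assembles the concatenated QP (11) in cvxpy under assumed single-integrator dynamics and the pairwise distance barriers hij; it is our illustration, and the number of pairwise constraints grows quadratically in the number of agents n.

```python
# Minimal sketch of the concatenated multi-agent CBF-QP (11) with single-integrator
# dynamics f_i(z_i, u_i) = u_i and pairwise barriers h_ij = ||z_i - z_j||^2 - eps^2.
# The dynamics and the linear alpha(x) = gamma * x are illustrative assumptions.
import numpy as np
import cvxpy as cp

n, m = 20, 2                              # number of agents and control dimension
gamma, eps_safe = 1.0, 0.3
rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=(n, m))   # current agent positions
u_nom = np.zeros((n, m))                  # nominal controls (stay put)

u = cp.Variable((n, m))
constraints = []
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        diff = z[i] - z[j]
        h_ij = diff @ diff - eps_safe ** 2
        # d/ds h_ij = 2 (z_i - z_j) . (u_i - u_j) >= -alpha(h_ij)
        constraints.append(2 * diff @ (u[i] - u[j]) >= -gamma * h_ij)

prob = cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)), constraints)
prob.solve()
print(f"{len(constraints)} pairwise constraints for n = {n} agents")  # n(n-1) of them
```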
III. MEAN-FIELD CONTROL-BARRIER FUNCTIONS

To mitigate the challenges of computing CBFs for large multi-agent swarms, we introduce Mean-Field Control Barrier Functions (MF-CBFs). The core idea is to formulate the swarm CBF problem in (11) in the space of distributions. This allows us to, e.g., represent the inter-agent distance requirements in (11) by a lower bound on a single mean-field function.

A. Mean-Field Control

Assume that we have a population (swarm) of agents in the state space, where an individual agent follows the dynamics (1). Furthermore, assume that the distribution of the population in the state space at time s is described by the probability measure ρ(s, ·), where we often use the same notation for a measure and its density function.

In the mean-field control setting, we consider only feedback-form (distributed) controls and assume that all agents adopt the same policy function. Hence, given a policy function q adopted by the population, the density evolves according to the continuity equation

∂s ρ(s, x) + ∇ · (ρ(s, x) f(s, x, q(s, x))) = 0,  s ∈ (0, T),
ρ(0, x) = ρ0(x),    (12)

where ρ0 is the initial distribution of the population, and ∇· is the divergence operator with respect to the state variable x.

The mean-field control or swarm-control problem is then formulated as

inf_{q(s,x) ∈ U}  ∫_0^T L(s, ρ(s, ·), q(s, ·)) ds + G(ρ(T, ·))   s.t. (12) holds,    (13)

where L and G are mean-field running and terminal costs, respectively. The dependencies of L, G on ρ encode the swarm behavior that one attempts to model.
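In the sampled picture used later in the experiments, ρ(s, ·) is an empirical measure of N agents, and (12) corresponds to pushing every sample forward under the shared policy. The sketch below is our illustration of this correspondence, with an assumed feedback policy, assumed single-integrator dynamics, and explicit Euler time stepping.

```python
# Minimal sketch (ours): the swarm as an empirical measure
# rho(s, .) = (1/N) sum_i delta_{z_i(s)}, all agents sharing one policy q(s, x).
# Each particle follows (1), which is the sample-level counterpart of (12).
import numpy as np

N, d = 200, 2
rng = np.random.default_rng(0)
Z = rng.normal(size=(N, d))        # samples of rho_0

def q(s, x):
    """Assumed feedback policy: drive every agent toward the origin."""
    return -x

def f(s, x, u):
    """Assumed single-integrator dynamics, an instance of (1)."""
    return u

dt, T = 0.01, 1.0
for k in range(int(T / dt)):
    s = k * dt
    Z = Z + dt * f(s, Z, q(s, Z))  # explicit Euler step for every particle

print("mean position after control:", Z.mean(axis=0))
```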
B. Mean-Field Control-Barrier Functions

Analogous to single-agent control problems, one may have safety constraints or goals for swarm control problems. Building on the mean-field control framework, we propose mean-field control-barrier functions (MF-CBFs) for efficiently handling safety constraints and other goals.

Assume that H encodes, possibly time-dependent, safety constraints or other goals of the swarm; that is, one must have

H(s, ρ(s, ·)) ≥ 0,  ∀ s ≥ 0.    (14)

As in (6), one can ensure this inequality by imposing

d/ds H(s, ρ(s, ·)) ≥ −α(H(s, ρ(s, ·))),  ∀ s ≥ 0,    (15)

where α : R → R is again a strictly increasing smooth function such that α(0) = 0. We provide a simple proof of this statement for completeness.

Theorem 1. Let α ∈ C(R) be a strictly increasing function such that α(0) = 0, and let ρ(s, ·) and H(s, ·) be such that ω(s) = H(s, ρ(s, ·)), s ≥ 0, is a continuously differentiable function. Then ω(0) ≥ 0 and (15) guarantee that ω(s) ≥ 0 for all s ≥ 0.

Proof. Assume by contradiction that

N = {s ≥ 0 : ω(s) < 0} ≠ ∅.

Since ω is continuous, we have that N ⊂ (0, ∞) is an open set; hence, N is a union of disjoint open intervals. Let (a, b) ⊂ N be one such interval. Then we have that 0 < a < ∞, and

ω(a) = 0,  ω(s) < 0,  s ∈ (a, b).    (16)

Hence, we have that

ω′(s) ≥ −α(ω(s)) > −α(0) = 0,  s ∈ (a, b).

Thus, ω is strictly increasing in [a, b), which contradicts (16). ∎
Next, we find the constraints that (15) imposes on the feedback (distributed) controls that the swarm should adopt.

Theorem 2. Let

KCBF(s, ρ) = { q : ∫_{R^d} ∇δρ H(s, ρ) · f(s, x, q(x)) ρ(x) dx ≥ −∂s H(s, ρ) − α(H(s, ρ)) },  s ≥ 0,    (17)

and assume that the swarm adopts a policy function q(s, ·) so that ρ(s, ·) evolves according to (12). Then (15) is equivalent to

q(s, ·) ∈ KCBF(s, ρ(s, ·)),  ∀ s ≥ 0.    (18)

Remark 1. For simplicity, we assume the necessary regularity for (12) to be well-posed and smooth enough for differential calculus. Additionally, we assume that ρ(s, ·) decays fast enough at infinity, e.g., is compactly supported, which can be ensured by a fast-decaying ρ0 and a smooth, e.g., Lipschitz, q(s, ·) for s ≥ 0 [22].

Proof. We have that

d/ds H(s, ρ(s, ·)) = ∂s H(s, ρ(s, ·)) + ∫_{R^d} δρ H(s, ρ(s, ·)) ∂s ρ(s, x) dx,

where δρ H is the Fréchet derivative with respect to the ρ variable. Taking into account (12) and integrating by parts, we find that

∫_{R^d} δρ H(s, ρ(s, ·)) ∂s ρ(s, x) dx
= −∫_{R^d} δρ H(s, ρ(s, ·)) ∇ · (ρ(s, x) f(s, x, q(s, x))) dx
= ∫_{R^d} ∇δρ H(s, ρ(s, ·)) · f(s, x, q(s, x)) ρ(s, x) dx.

Hence, (15) reduces to

∂s H(s, ρ(s, ·)) + ∫_{R^d} ∇δρ H(s, ρ(s, ·)) · f(s, x, q(s, x)) ρ(s, x) dx ≥ −α(H(s, ρ(s, ·))),  ∀ s ≥ 0,    (19)

or, equivalently, (18). ∎

Theorem 2 provides constraints on the policy function that ensure safe controls or mission-accomplishing controls for the swarm. The mean-field analog of (9) is

qCBF(s, ·) ∈ argmin_{q ∈ KCBF(s, ρ(s,·)), q(x) ∈ U} ∥q − q*(s, ·)∥²_{L²(ρ(s,·))},    (20)

where q*(s, ·), s ≥ 0, is the nominal control of the swarm. For control-affine systems we have that

KCBF(s, ρ) = { q : ∫_{R^d} ∇δρ H(s, ρ) · (A(s, x) + B(s, x) q(x)) ρ(x) dx ≥ −∂s H(s, ρ) − α(H(s, ρ)) }.    (21)

Two critical remarks are in order.
1) (20) is a quadratic program when U is a convex polytope and the system is control-affine.
2) In the mean-field setting, we only have one constraint (excluding the a priori constraints q(x) ∈ U) as opposed to n² in the direct approach (11).
These two points make MF-CBFs an efficient framework for safe swarm control; a discrete sketch of (20)–(21) is given below.
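The following minimal sketch shows the sampled form of (20)–(21): with ρ represented by N particles, the single mean-field inequality becomes one linear constraint on the stacked controls (q_1, . . . , q_N). It is our illustration; the per-sample gradients ∇δρ H, the drift values A(s, x_i), the matrix B, the right-hand side, and the box bound are placeholders assumed to be computed elsewhere.

```python
# Minimal sketch of the sampled MF-CBF projection (20) for a control-affine system (21):
# one linear constraint couples all agent controls, however large the swarm is.
# grad_dH[i] ~ grad delta_rho H(s, rho)(x_i), A_drift[i] = A(s, x_i), and
# rhs = -d_s H - alpha(H) are placeholder values (assumed precomputed elsewhere).
import numpy as np
import cvxpy as cp

N, d, m = 200, 6, 3
rng = np.random.default_rng(0)

grad_dH = rng.normal(size=(N, d))                   # placeholder gradients at each sample
A_drift = rng.normal(size=(N, d))                   # placeholder drift A(s, x_i)
B_mat = np.vstack([np.zeros((3, 3)), np.eye(3)])    # placeholder B(s, x_i), same for all i
rhs = -0.1                                          # placeholder for -d_s H - alpha(H)
q_nom = np.zeros((N, m))                            # nominal policy evaluated at the samples

q = cp.Variable((N, m))
# (1/N) sum_i grad_dH_i . (A_i + B q_i) >= rhs  -- the sampled version of (21).
lhs = (np.sum(grad_dH * A_drift) + cp.sum(cp.multiply(grad_dH @ B_mat, q))) / N
prob = cp.Problem(cp.Minimize(cp.sum_squares(q - q_nom) / N),
                  [lhs >= rhs, cp.max(cp.abs(q)) <= 1.0])
prob.solve()
print("single mean-field constraint, control matrix shape:", q.value.shape)
```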
C. Examples

Here we discuss applications of the MF-CBF framework to swarm avoidance and tracking examples.

1) Swarm avoidance: Suppose we wish to avoid (and maintain a certain distance from) an incoming object, denoted by ρ†. Mathematically, this condition can be formulated as

d(ρ(s, ·), ρ†(s, ·)) ≥ ϵ,  ∀ s ≥ 0,

where d is some distance function, and ϵ > 0. Although there are many choices for d, we consider the squared maximum mean discrepancy (MMD) distance due to its analytic and computational simplicity¹. Hence, for a suitable choice of a symmetric positive-definite kernel K, we consider

H(s, ρ) = MMD(ρ, ρ†(s, ·))²/2 − ϵ
        = (1/2) ∫_{R^{2d}} K(x, y)(ρ(x) − ρ†(s, x))(ρ(y) − ρ†(s, y)) dx dy − ϵ.    (22)

Next, we have that

δρ (MMD(ρ, ρ†)²/2) = ∫_{R^d} K(x, y)(ρ(y) − ρ†(y)) dy,
δρ† (MMD(ρ, ρ†)²/2) = ∫_{R^d} K(x, y)(ρ†(y) − ρ(y)) dy.    (23)

Hence, we have that

δρ H(s, ρ) = ∫_{R^d} K(x, y)(ρ(y) − ρ†(s, y)) dy,    (24)

and

∂s H(s, ρ) = ∫_{R^d} δρ† ( MMD(ρ, ρ†(s, ·))²/2 − ϵ ) ∂s ρ†(s, x) dx
           = ∫_{R^{2d}} K(x, y)(ρ†(s, y) − ρ(y)) ∂s ρ†(s, x) dx dy.    (25)

¹We note that other metrics such as the Wasserstein distance may be considered.
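In the sampled setting, (22) and (24) can be evaluated directly from particles of ρ and ρ†. The sketch below is our illustration, with a Gaussian kernel as an assumed choice of K.

```python
# Minimal sketch (ours): sampled versions of H in (22) and delta_rho H in (24)
# with an assumed Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
import numpy as np

def gaussian_kernel(X, Y, sigma=0.5):
    sq = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def barrier_H(X, Y, eps, sigma=0.5):
    """H(s, rho) = MMD(rho, rho_dagger)^2 / 2 - eps with empirical measures."""
    Kxx = gaussian_kernel(X, X, sigma).mean()
    Kyy = gaussian_kernel(Y, Y, sigma).mean()
    Kxy = gaussian_kernel(X, Y, sigma).mean()
    return 0.5 * (Kxx - 2.0 * Kxy + Kyy) - eps

def delta_rho_H(X, Y, sigma=0.5):
    """(24) evaluated at the samples of rho: int K(x_i, y)(rho - rho_dagger)(y) dy."""
    return (gaussian_kernel(X, X, sigma).mean(axis=1)
            - gaussian_kernel(X, Y, sigma).mean(axis=1))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))              # samples of rho (the swarm)
Y = rng.normal(size=(50, 3)) + 3.0         # samples of rho_dagger (the object to avoid)
print(barrier_H(X, Y, eps=0.01), delta_rho_H(X, Y)[:3])
```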
Fig. 1. Swarm Avoidance Example: 3D Double Integrator. Illustration of swarm avoidance with the acceleration model. The swarm of agents (black dots) avoids the red agent. The agent swarm must maintain a squared MMD value greater than 2ϵ while moving the least amount possible, as described in (31).

Assuming ρ† evolves according to the dynamics

∂s ρ†(s, x) + ∇ · (ρ†(s, x) v†(s, x)) = 0,    (26)

we obtain that

∂s H(s, ρ) = ∫_{R^{2d}} ∇x K(x, y) · v†(s, x)(ρ†(s, y) − ρ(y)) ρ†(s, x) dx dy.    (27)

Combining the derivations above, we find that

KCBF(s, ρ) = { q :
   ∫_{R^{2d}} ∇x K(x, y) · f(s, x, q(s, x))(ρ(y) − ρ†(s, y)) ρ(x) dx dy
   + ∫_{R^{2d}} ∇x K(x, y) · v†(s, x)(ρ†(s, y) − ρ(y)) ρ†(s, x) dx dy
   + α( MMD(ρ, ρ†(s, ·))²/2 − ϵ ) ≥ 0 }.    (28)
2) Swarm tracking: The swarm avoidance framework in the previous section can be easily modified to a swarm tracking one by changing the sign of H in (22). Indeed, consider

H(s, ρ) = ϵ − MMD(ρ, ρ†(s, ·))²/2
        = ϵ − (1/2) ∫_{R^{2d}} K(x, y)(ρ(x) − ρ†(s, x))(ρ(y) − ρ†(s, y)) dx dy,    (29)

where ρ† is now the distribution of the swarm that one wants to track. Then (14) would ensure that ρ stays close to ρ† at all times. Recycling the calculations for swarm avoidance yields

KCBF(s, ρ) = { q :
   ∫_{R^{2d}} ∇x K(x, y) · f(s, x, q(s, x))(ρ(y) − ρ†(s, y)) ρ(x) dx dy
   + ∫_{R^{2d}} ∇x K(x, y) · v†(s, x)(ρ†(s, y) − ρ(y)) ρ†(s, x) dx dy
   − α( ϵ − MMD(ρ, ρ†(s, ·))²/2 ) ≤ 0 }.    (30)

IV. EXPERIMENTS

We illustrate the effectiveness of the MF-CBF framework on two types of applications: swarm avoidance and swarm tracking. We note that in practice we typically have access to samples of ρ in (20); consequently, we have a discrete approximation where q and q* are vectors and the objective is given by the Euclidean ℓ2 norm instead. The dynamics used in both applications are double integrator dynamics, that is, f(t, z, u) = Az + Bu, where z stacks the position and velocity of the agents. Our experiments are coded in Python; in particular, the quadratic programs arising from the MF-CBFs are solved with cvxpy [23], an open-source library for solving convex optimization problems. In both setups, the nominal controller is given by q* = 0, i.e., we would like the swarm to have constant velocity (or remain still if the agents are already stationary). A sketch of the resulting discrete quadratic program is given below.
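For concreteness, here is a minimal end-to-end sketch of one MF-CBF avoidance step in this sampled setting, in the spirit of (28) and (31). It is our illustration rather than the authors' released code: it assumes a Gaussian kernel K, a linear α(x) = γx, velocity-controlled agents f(s, x, q) = q (simpler than the double integrator used in the experiments), and an obstacle distribution moving with a constant velocity v†.

```python
# Minimal sketch of one MF-CBF avoidance step, the sampled analog of (28) and (31).
# Assumptions (ours, not the paper's released code): a Gaussian kernel K, a linear
# alpha(x) = gamma * x, velocity-controlled agents f(s, x, q) = q, and an obstacle
# distribution rho_dagger moving with constant velocity v_dag.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
N, M, sigma, eps, gamma, u_max = 100, 20, 1.0, 0.01, 5.0, 2.0

X = rng.normal(size=(N, 3))                               # agent positions, samples of rho
Y = rng.normal(size=(M, 3)) + np.array([4.0, 0.0, 0.0])   # obstacle samples of rho_dagger
v_dag = np.tile([-1.0, 0.0, 0.0], (M, 1))                 # constant obstacle velocity

def kernel(A, B):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def grad_x_kernel(A, B):
    """Gradient of K(x, y) with respect to x, shape (len(A), len(B), 3)."""
    diff = A[:, None, :] - B[None, :, :]
    return -kernel(A, B)[:, :, None] * diff / sigma ** 2

# H(s, rho) = MMD^2/2 - eps from samples, as in (22).
H_val = 0.5 * (kernel(X, X).mean() - 2 * kernel(X, Y).mean() + kernel(Y, Y).mean()) - eps

# Coefficient of q_i in the first integral of (28): int grad_x K(x_i, y)(rho - rho_dag)(y) dy.
G = grad_x_kernel(X, X).mean(axis=1) - grad_x_kernel(X, Y).mean(axis=1)   # shape (N, 3)

# Second integral of (28): motion of the obstacle, independent of q.
Gd = grad_x_kernel(Y, Y).mean(axis=1) - grad_x_kernel(Y, X).mean(axis=1)  # shape (M, 3)
obstacle_term = float(np.sum(Gd * v_dag) / M)

# Quadratic program (31): min ||q||^2 subject to the single MF-CBF constraint from (28).
q = cp.Variable((N, 3))
constraint = cp.sum(cp.multiply(G, q)) / N + obstacle_term + gamma * H_val >= 0
prob = cp.Problem(cp.Minimize(cp.sum_squares(q)),
                  [constraint, cp.max(cp.abs(q)) <= u_max])
prob.solve()
print("H =", H_val, "filtered control norm =", np.linalg.norm(q.value))
```

The tracking variant (32) follows by using H from (29) and the constraint form (30) instead, which amounts to flipping the sign of H_val and of the inequality above.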
Fig. 2. Swarm Tracking Example: 3D Double Integrator. Illustration of swarm tracking with the acceleration model. Swarm of agents (black dots) and adversary (red dot). The agent swarm must maintain a squared MMD value less than 2ϵ while moving the least amount possible, as shown in (32).

Fig. 3. Values of H over time (blue line) and the bound that guarantees H(t, ρ) ≥ 0 (orange dashed line; labeled "safety bound" in panel (a) and "monitoring bound" in panel (b)). In both examples, H(t, ρ) ≥ 0, which guarantees (a) ½MMD²(ρ, ρ†) ≥ ϵ (safety of the swarm of agents from the ρ† in red), and (b) ½MMD²(ρ, ρ†) ≤ ϵ (proximity to ρ† in red).

A. Swarm Avoidance

In these experiments, we suppose we have a swarm of agents that are stationary, and as soon as a moving object gets too close, the swarm of agents avoids the object; this is akin to a swarm of fish avoiding an incoming shark. The MF-CBF problem to be solved at each time step is then given by

min_q ∥q∥²   s.t.  q ∈ KCBF in (28).    (31)

Figure 1 shows the effectiveness of the MF-CBF in the swarm avoidance setting. In this experiment, the red dot is a moving obstacle with a constant velocity. Guaranteed safety is shown in Fig. 3, where H(ρ(s, ·)) ≥ 0 for all s, which guarantees that ½MMD²(ρ, ρ†) ≥ ϵ.

B. Swarm Tracking

In these experiments, we suppose we have a swarm of agents that must maintain a certain distance from the red agent. In particular, they must maintain a squared MMD value less than 2ϵ. The MF-CBF problem to be solved at each time step is then given by

min_q ∥q∥²   s.t.  q ∈ KCBF in (30).    (32)

As before, Figure 2 shows the effectiveness of the mean-field CBF in the swarm tracking setting. In this experiment, the red agent has a constant velocity. Feasibility of the tracking distance is shown in Fig. 3, where H(ρ(s, ·)) ≥ 0 for all s, which guarantees that ½MMD²(ρ, ρ†) ≤ ϵ.

C. Discussion

Just like traditional CBFs, there are two major considerations when employing MF-CBFs: the choice of α(x) and the necessary time discretization, both of which are nuanced tasks and context-dependent [2]. In our experiments, these were hyperparameters that we tuned until we obtained the desired performance; for instance, the swarm tracking problems required a much finer time discretization since the projection onto the set of controls in (18) was activated more frequently. Finally, we remark that since we have a finite number of agents, the distributions are comprised of Dirac delta functions, so that the norm ∥·∥² in (31) and (32) is equivalent to ∥·∥_{L²(ρ(s,·))} in (20). Code details and accompanying videos can be found at https://github.com/mines-opt-ml/mean-field-cbf.
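The interaction between α and the time discretization can already be seen in the scalar comparison dynamics behind Theorem 1. The sketch below (our illustration, with a linear α(x) = γx) shows that an explicit Euler discretization of the boundary case of (15) keeps ω nonnegative only when γΔt is small enough, which is one reason these hyperparameters are tuned jointly.

```python
# Minimal sketch (ours): interaction between alpha and the time step in discrete time.
# With alpha(x) = gamma * x, the continuous-time bound d/ds omega >= -alpha(omega)
# keeps omega = H(s, rho(s, .)) nonnegative (Theorem 1), but an explicit Euler step
# omega_{k+1} = (1 - gamma * dt) * omega_k can overshoot below zero when gamma * dt > 1.
import numpy as np

def min_omega(gamma, dt, omega0=0.5, T=2.0):
    omega, lowest = omega0, omega0
    for _ in range(int(T / dt)):
        omega += dt * (-gamma * omega)   # boundary case of (15) with alpha(x) = gamma * x
        lowest = min(lowest, omega)
    return lowest

print("gamma*dt = 0.05 -> min omega =", min_omega(gamma=5.0, dt=0.01))   # stays >= 0
print("gamma*dt = 1.50 -> min omega =", min_omega(gamma=5.0, dt=0.30))   # dips below 0
```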
V. CONCLUSION

We present MF-CBFs, a mean-field framework for real-time swarm control. The core idea is to extend CBFs to the space of distributions. Our numerical experiments show that MF-CBFs are effective in a swarm avoidance and a swarm tracking example. Future work involves employing MF-CBFs in optimal control settings where the feedback control is available [9], [10], [24], [25] and improving their computational efficiency via kernel decoupling techniques [26]–[30].
REFERENCES

[1] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, "Control barrier function based quadratic programs for safety critical systems," IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 3861–3876, 2016.
[2] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, "Control barrier functions: Theory and applications," in 2019 18th European Control Conference (ECC), pp. 3420–3431, IEEE, 2019.
[3] J.-M. Lasry and P.-L. Lions, "Jeux à champ moyen. II – Horizon fini et contrôle optimal," Comptes Rendus. Mathématique, vol. 343, no. 10, pp. 679–684, 2006.
[4] J.-M. Lasry and P.-L. Lions, "Mean field games," Japanese Journal of Mathematics, vol. 2, no. 1, pp. 229–260, 2007.
[5] M. Fornasier and F. Solombrino, "Mean-field optimal control," ESAIM: Control, Optimisation and Calculus of Variations, vol. 20, no. 4, pp. 1123–1152, 2014.
[6] D. A. Gomes and J. Saúde, "Mean field games models—a brief survey," Dynamic Games and Applications, vol. 4, pp. 110–154, 2014.
[7] L. Ruthotto, S. J. Osher, W. Li, L. Nurbekyan, and S. Wu Fung, "A machine learning framework for solving high-dimensional mean field game and mean field control problems," Proceedings of the National Academy of Sciences, vol. 117, no. 17, pp. 9183–9193, 2020.
[8] A. T. Lin, S. Wu Fung, W. Li, L. Nurbekyan, and S. J. Osher, "Alternating the population and control neural networks to solve high-dimensional stochastic mean-field games," Proceedings of the National Academy of Sciences, vol. 118, no. 31, p. e2024713118, 2021.
[9] D. Onken, L. Nurbekyan, X. Li, S. Wu Fung, S. Osher, and L. Ruthotto, "A neural network approach applied to multi-agent optimal control," in European Control Conference (ECC), pp. 1036–1041, 2021.
[10] D. Onken, L. Nurbekyan, X. Li, S. Wu Fung, S. Osher, and L. Ruthotto, "A neural network approach for high-dimensional optimal control applied to multiagent path finding," IEEE Transactions on Control Systems Technology, 2022.
[11] S. Bansal, M. Chen, K. Tanabe, and C. J. Tomlin, "Provably safe and scalable multivehicle trajectory planning," IEEE Transactions on Control Systems Technology, vol. 29, no. 6, pp. 2473–2489, 2020.
[12] S. Bansal and C. J. Tomlin, "DeepReach: A deep learning approach to high-dimensional reachability," in IEEE International Conference on Robotics and Automation (ICRA), pp. 1817–1824, 2021.
[13] Y. Chen, A. Singletary, and A. D. Ames, "Guaranteed obstacle avoidance for multi-robot operations with limited actuation: A control barrier function approach," IEEE Control Systems Letters, vol. 5, no. 1, pp. 127–132, 2020.
[14] S. Zhang, O. So, K. Garg, and C. Fan, "GCBF+: A neural graph control barrier function framework for distributed safe multi-agent control," 2024.
[15] K. Kunisch and D. Walter, "Semiglobal optimal feedback stabilization of autonomous systems via deep neural network approximation," ESAIM: Control, Optimisation and Calculus of Variations, vol. 27, 2021.
[16] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, vol. 25 of Stochastic Modelling and Applied Probability. Springer, New York, second ed., 2006.
[17] L. R. G. Carrillo, A. E. D. López, R. Lozano, and C. Pégard, "Modeling the quad-rotor mini-rotorcraft," in Quad Rotorcraft Control, pp. 23–34, Springer, 2013.
[18] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, "Hamilton-Jacobi reachability: A brief overview and recent advances," in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 2242–2253, IEEE, 2017.
[19] R. Bellman, Dynamic Programming. Princeton University Press, Princeton, N.J., 1957.
[20] D. A. Paley, F. Zhang, and N. E. Leonard, "Cooperative control for ocean sampling: The glider coordinated control system," IEEE Transactions on Control Systems Technology, vol. 16, no. 4, pp. 735–744, 2008.
[21] K. Glock and A. Meyer, "Mission planning for emergency rapid mapping with drones," Transportation Science, vol. 54, no. 2, pp. 534–560, 2020.
[22] L. Ambrosio, N. Gigli, and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, second ed., 2008.
[23] S. Diamond and S. Boyd, "CVXPY: A Python-embedded modeling language for convex optimization," Journal of Machine Learning Research, vol. 17, no. 83, pp. 1–5, 2016.
[24] D. Onken, S. Wu Fung, X. Li, and L. Ruthotto, "OT-Flow: Fast and accurate continuous normalizing flows via optimal transport," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 9223–9232, 2021.
[25] A. Vidal, S. Wu Fung, L. Tenorio, S. Osher, and L. Nurbekyan, "Taming hyperparameter tuning in continuous normalizing flows using the JKO scheme," Scientific Reports, vol. 13, no. 1, p. 4501, 2023.
[26] L. Nurbekyan et al., "Fourier approximation methods for first-order nonlocal mean-field games," Portugaliae Mathematica, vol. 75, no. 3, pp. 367–396, 2019.
[27] Y. T. Chow, S. Wu Fung, S. Liu, L. Nurbekyan, and S. Osher, "A numerical algorithm for inverse problem from partial boundary measurement arising from mean field game problem," Inverse Problems, vol. 39, no. 1, p. 014001, 2022.
[28] S. Agrawal, W. Lee, S. Wu Fung, and L. Nurbekyan, "Random features for high-dimensional nonlocal mean-field games," Journal of Computational Physics, vol. 459, p. 111136, 2022.
[29] S. Liu, M. Jacobs, W. Li, L. Nurbekyan, and S. J. Osher, "Computational methods for first-order nonlocal mean field games with applications," SIAM Journal on Numerical Analysis, vol. 59, no. 5, pp. 2639–2668, 2021.
[30] A. Vidal, S. Wu Fung, S. Osher, L. Tenorio, and L. Nurbekyan, "Kernel expansions for high-dimensional mean-field control with non-local interactions," arXiv preprint arXiv:2405.10922, 2024.
