
Combining Federated Learning and Control: A Survey*

Jakob Weber1, Markus Gurtner1, Amadeus Lobe1, Adrian Trachte2, and Andreas Kugi3

arXiv:2407.11069v2 [cs.LG] 17 Jul 2024

Abstract—This survey provides an overview of combining Federated Learning (FL) and control to enhance adaptability, scalability, generalization, and privacy in (nonlinear) control applications. Traditional control methods rely on controller design models, but real-world scenarios often require online model retuning or learning. FL offers a distributed approach to model training, enabling collaborative learning across distributed devices while preserving data privacy. By keeping data localized, FL mitigates concerns regarding privacy and security while reducing network bandwidth requirements for communication. This survey summarizes the state-of-the-art concepts and ideas of combining FL and control. The methodical benefits are further discussed, culminating in a detailed overview of expected applications, from dynamical system modeling over controller design, focusing on adaptive control, to knowledge transfer in multi-agent decision-making systems.

Index Terms—Federated Learning, Control Systems, Nonlinear Control

1 Jakob Weber, Markus Gurtner, and Amadeus Lobe are with the Center for Vision, Automation & Control, AIT Austrian Institute of Technology GmbH, Vienna, Austria [email protected]
2 Adrian Trachte is with the Robert Bosch GmbH, Renningen, Germany
3 Andreas Kugi is with the Automation and Control Institute, TU Wien, Austria and the AIT Austrian Institute of Technology GmbH, Vienna, Austria

I. INTRODUCTION

With the growing importance of data-driven models within control systems, there is an increasing emphasis on integrating learning-based models directly into the control loop. This integration enhances adaptability and allows for broader generalizability across diverse and possibly unseen operational scenarios. Nowadays, an increasing number of control system hardware options include integrated connectivity solutions, exemplified by [1] or [2], thereby creating opportunities, e.g., to integrate cloud-based solutions to enhance system performance, resilience, and adaptability. While centralized approaches leveraging connected Internet-of-Things (IoT) devices and cloud computing infrastructure present viable options, bandwidth limitations and data privacy challenges are still present when transmitting raw data; see, e.g., [3], discussing privacy in the context of machine learning and 6G communication, and [4] for an in-depth discussion of privacy and security for distributed machine-learning techniques. Furthermore, [5] offers an excellent overview of communication-efficient and distributed learning. In this context, Federated Learning (FL) offers a compelling solution.

Federated Learning enables collaborative model training across distributed devices while preserving sensitive data, thereby addressing the challenges of communication efficiency and privacy preservation. The core concept involves computing model updates locally on individual devices, securely aggregating these updates, and globally computing a combined model. This paradigm shift from traditional centralized to decentralized training addresses data privacy and security by keeping the raw data localized, thus ensuring minimal exposure risk. Beyond regulatory considerations (see [6] for a discussion of FL under the European Union Artificial Intelligence Act [7]), FL offers a technical advantage in reducing the required network bandwidth through a more efficient information transfer.

Federated Learning has already gained substantial attention, particularly in industries and domains characterized by abundant sensitive data with high data privacy and communication efficiency demands, such as healthcare [8]–[10], finance and economics [11]–[14], manufacturing [15]–[17], and IoT applications [18]–[23]. The distributed nature of FL is advantageous in scenarios featuring large datasets or devices spread across diverse spatial locations, see, e.g., [24] and [25].

This survey aims to offer a comprehensive overview of the potential of FL for control problems, focusing on enhancing adaptability, scalability, resilience, and privacy in (nonlinear) control applications. Our contribution is to provide answers to the following research questions:

Q1) What is the current state-of-the-art at the intersection of FL and control?
Q2) What are the anticipated benefits of combining FL and control?

The paper is structured as follows: Section II introduces FL, presenting its fundamental concept, prevalent algorithms, and various categorizations. Section III is dedicated to research question Q1 and offers a detailed overview of the existing literature with a focus on decentralized control and learning in control, as well as on concepts at the intersection of FL and control. In Section IV, we delve into the expected benefits of merging those fields and discuss potential applications, which addresses research question Q2. Finally, Section V gives some conclusions. A visualization of the paper's structure is given in Fig. 1.

II. FEDERATED LEARNING

This section provides an overview of the fundamental principles of Federated Learning (FL), laying the groundwork for addressing the research questions posed earlier. We first introduce the primary goal of FL and a commonly used optimization algorithm. We then categorize the various applications of FL to highlight its versatility. The section concludes with a discussion of the key challenges inherent to FL.

[Fig. 1 shows a tree diagram: Section I (Introduction); Section II (Federated Learning) with A. Basic Concepts & Algorithms, B. Categorization, and C. Challenges; Section III (SOTA at the Intersection of FL and Control) with A. Distributed Control, B. Learning and Control with Focus on FL — 1) System Identification, 2) Controller Design — and C. Existing Literature for FL and Control; Section IV (FL and Control) with A. Methodical Benefits of Combining FL and Control and B. Expected Applications of Combining FL and Control — 1) System Identification, 2) Controller Design, 3) Multi-Agent Decision-Making; Section V (Conclusion).]
Fig. 1: The structure of this paper.

A. Basic Concepts & Algorithms

Federated Learning emerged as a novel research area following the pioneering work at Google, see [26] and [27]. This work introduced a distributed model training approach particularly aimed at preserving privacy. The general idea presented is that devices - mainly mobile phones - download the current model from a central server, learn from the local, private data, and then transmit, using encrypted communication, the model updates back to the central server. The server immediately aggregates these updates to refine the shared global model. Importantly, all training data remains on the device, and individual updates are not stored in the cloud, see [28]. The present survey does not claim to give a comprehensive description and discussion of all existing FL algorithms, methods, and technologies; instead, it focuses on the connection between FL and control. For an in-depth exploration of FL in general, the readers are referred to the literature, such as [24], [29] and [30], and the IEEE standards [31], [32].

Federated Learning encompasses a collection of directives and algorithms tailored for distributed, privacy-preserving, and communication-efficient learning. While this definition is informal, it can be precisely formulated as the distributed optimization problem

θ* ← arg min_θ E_{i∼P} [ E_{x_i∼D_i} [ f_i(x_i, θ) ] ].   (1)

Here, the objective is to find the optimal value θ*, which minimizes the expected value E_{i∼P}, taken over clients i sampled from the client distribution P, of the expected value E_{x_i∼D_i} of the local client cost functions f_i(x_i, θ), parameterized by θ and conditioned on the data x_i sampled from the local data-generating distribution D_i.

Assuming standard regularity conditions, (1) is typically solved using gradient descent techniques, wherein the global gradient is approximated as the expected value of local gradients. This involves utilizing sample averages over the local data-generating distributions D_i to estimate the local gradients. Subsequently, the expected value of the global gradient is computed through some form of weighted averaging of these local gradients. Consequently, only gradient information and never any raw data x_i are exchanged, facilitating the privacy-preserving and communication-efficient optimization of the global objective function.

A widely adopted FL algorithm, known as Federated Averaging, introduced in [27], performs the stochastic approximation of the global gradient through a strategy of partial participation of the clients. This involves approximating the expected value over the client distribution P through a sample average over a number of clients N, coupled with executing local steps, encompassing multiple iterations of local gradient steps. The standard Federated Averaging can be extended to provide flexibility in choosing both the local and global optimization method, resulting in the Generalized Federated Averaging algorithm, also known as FedOpt. This extension is detailed in [33] and presented in Algorithm 1. Examining Algorithm 1, it becomes evident that three key degrees of freedom govern the optimization of the FL objective in (1), namely
• Local optimization (client-side),
• Global optimization (server-side), and
• Model aggregation.
Importantly, these components can be independently chosen, paving the way for a diverse family of algorithms capable of solving a manifold of different problems. The optimization procedure of the Generalized Federated Averaging algorithm

is depicted in detail in Fig. 2. Starting from the global model θ^(t), steps 6-9 entail local optimization, where numerous local gradient steps are performed locally for clients I and II. Subsequently, in step 10, these steps are consolidated into local model updates ∆_I^(t) and ∆_II^(t) w.r.t. the initial model for each client. The model aggregation in step 12 combines these local model updates to generate a global model adjustment ∆^(t), where p_i is introduced to weigh the impact of each client. Finally, step 13 illustrates the global optimization, wherein a single gradient step θ^(t+1) = θ^(t) + η_s ∆^(t) with global learning rate η_s is executed. Typically, steps 4-13 are reiterated until convergence of the global model is achieved. Apart from the proposed method for solving (1), FL can also be formulated as a consensus optimization problem - tackled using primal-dual methods for distributed convex optimization, see [34] - or as the problem of privacy-sensitive fusion of multiple probability density functions, see [35].

Algorithm 1: Generalized FedAvg (FedOpt) algorithm.
 1  Input: Initial model θ^(0), ClientOpt & ServerOpt with learning rates η, η_s;
 2  for t = 0, 1, . . . , T do
 3      Sample a subset S_t of clients from P;
 4      for client i ∈ S_t in parallel do
 5          Initialize local model θ_i^(t,0) ← θ^(t);
 6          for k = 0, . . . , τ_i − 1 do
 7              Compute local gradient g_i(θ_i^(t,k));
 8              Local update: θ_i^(t,k+1) = ClientOpt(θ_i^(t,k), g_i(θ_i^(t,k)), η, t);
 9          end
10          Compute local changes ∆_i^(t) = θ_i^(t,τ_i) − θ_i^(t,0);
11      end
12      Aggregate local changes ∆^(t) = Σ_{i∈S_t} p_i ∆_i^(t) / Σ_{i∈S_t} p_i;
13      Update global model θ^(t+1) = ServerOpt(θ^(t), ∆^(t), η_s, t);
14  end

B. Categorizations

The FL objective in (1) enables various applications, often categorized by client heterogeneity, see [36], and learning tasks. An alternative FL formulation, learning an implicit mixture of the global and pure local models, is presented in [37]. A comprehensive survey and classification of FL is provided in [38]. The most common distinction is between cross-device and cross-silo FL. Cross-device FL embodies the original concept of collaboratively learning a shared model across a large number of devices, as detailed in [27] and [28]. Conversely, cross-silo FL, introduced in [24], becomes particularly relevant for smaller sets of clients. Here, every client engages in each round of learning and maintains a state that describes the current model and its evolution. The primary characteristics of both approaches are summarized and compared with distributed learning in Table I, see [24]. The key characteristics are emphasized in a dark gray shade, where the main differentiating features are the orchestration and client state.

Another common distinction in FL is horizontal and vertical FL, primarily based on the data space, as introduced in [29] and visualized in Fig. 3. Adopting the notation from [29], horizontal FL, also known as sample-based FL, is employed when clients share the same feature space X and target space Y but differ in the sample space I. This can be thought of as a horizontal split through the large data matrix we would obtain in a centralized setting (clients send their data to a central server). We formalize this as

X_i = X_j, Y_i = Y_j, I_i ≠ I_j ∀ D_i, D_j, i ≠ j,   (2)

wherein D denotes the local data. In horizontal FL, the objective is to address the same learning task across all clients, involving identifying the functional relationship between data samples drawn from X_i and Y_i for each client i. A prominent instance of this approach can be found in mobile phones, for example, in next-word prediction, emoji suggestion, and out-of-vocabulary word discovery, as detailed in [28]. It is crucial to emphasize that the FL task within horizontal FL is effectively defined only when all clients share their true functional relationship realized through their data-generating processes. While minor disparities, e.g., through different noise levels, can be accommodated, more significant deviations in the functional relationship necessitate the application of advanced techniques such as clustered FL, as discussed in [39]–[41], or federated meta-learning (FedMeta), see [42].

Vertical FL, referred to as feature-based FL, is employed when clients share the sample space I but differ in the feature space X, as outlined in [29]. This scenario can be formalized as

X_i ≠ X_j, Y_i ≠ Y_j, I_i = I_j ∀ D_i, D_j, i ≠ j.   (3)

It can be conceptualized as instances where the large data matrix we would obtain in a centralized setting undergoes vertical splits across multiple clients; see Fig. 3b for a graphical depiction [43]. Vertical FL is applied in privacy-preserving regression tasks, as demonstrated in [44] or [45], as well as in healthcare [8] and finance [11]. Note that the differences in target space Y are implicit in the task definition, as only one client (in our case client 1, as marked by the same color in Fig. 3b) can provide targets to regress on. It is further important to highlight that vertical FL necessitates integration with supplementary privacy-preservation methods like differential privacy, homomorphic encryption, or secure multi-party computation, further elaborated in [46]–[49], respectively, as sharing some encrypted data is necessary. Additionally introduced in [29] is the niche concept of Transfer FL or federated transfer learning, applied in situations at the boundary between horizontal and vertical FL. In this context, transfer learning techniques, as delineated in [50], are employed in the FL task, see [51], [52]. This involves learning a shared feature representation from a small common data set, which can be applied subsequently to generate predictions for samples with features from only one client. For further details, see [29], [51], [53].
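The horizontal and vertical splits formalized in (2) and (3) can be made concrete on a toy centralized data matrix. The matrix sizes and the two-client setup below are illustrative assumptions; the point is only which axis of the centralized matrix is cut.

```python
import numpy as np

# Assumed centralized toy data: 6 samples (rows) x 4 features (columns).
X = np.arange(24).reshape(6, 4)
y = np.arange(6)

# Horizontal FL, cf. (2): shared feature space, disjoint sample sets I_i.
X_h1, X_h2 = X[:3], X[3:]        # row split between two clients
y_h1, y_h2 = y[:3], y[3:]        # each client also holds its own targets

# Vertical FL, cf. (3): shared sample set I, disjoint feature sets;
# only one client (client 1, cf. Fig. 3b) holds the targets.
X_v1, X_v2 = X[:, :2], X[:, 2:]  # column split between two clients
y_v1 = y

print(X_h1.shape, X_h2.shape)    # (3, 4) (3, 4)
print(X_v1.shape, X_v2.shape)    # (6, 2) (6, 2)
```

Stacking the horizontal shards row-wise, or the vertical shards column-wise, recovers the centralized matrix, which is exactly the split-axis distinction between (2) and (3).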

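One building block behind the secure aggregation mentioned above can be sketched with pairwise additive masking: each pair of clients agrees on a random mask that cancels in the sum, so the server only ever sees masked updates. The three-client setup and Gaussian masks below are assumptions for illustration, not a complete protocol (real schemes add key agreement and dropout handling).

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setup: 3 clients, each holding a local model update of dimension 4.
updates = rng.normal(size=(3, 4))

# Each client pair (i, j), i < j, shares a random mask r_ij;
# client i adds it, client j subtracts it, so all masks cancel in the sum.
masks = {(i, j): rng.normal(size=4)
         for i in range(3) for j in range(i + 1, 3)}

masked = updates.copy()
for (i, j), r in masks.items():
    masked[i] += r
    masked[j] -= r

# The server only sees `masked`: individual rows are obfuscated,
# yet the aggregate equals the true sum of the raw updates.
print(np.allclose(masked.sum(axis=0), updates.sum(axis=0)))  # True
```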
[Fig. 2 visualizes one round: local optimization steps 6-9 for clients I and II, the local changes ∆_I^(t) and ∆_II^(t) in step 10, their aggregation into ∆^(t) in step 12, and the global update from θ^(t) to θ^(t+1) in step 13.]
Fig. 2: Sketch of the optimization procedure of Generalized Federated Averaging for two clients.
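As a minimal illustration of Algorithm 1, the sketch below instantiates ClientOpt and ServerOpt as plain gradient steps on assumed quadratic local losses f_i(θ) = ½‖θ − c_i‖²; for a deterministic example, all clients participate in every round (cross-silo style) instead of sampling a subset S_t. The client count, learning rates, and round budget are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed local objectives f_i(theta) = 0.5 * ||theta - c_i||^2,
# so the local gradient g_i(theta) is simply (theta - c_i).
centers = rng.normal(size=(5, 3))   # per-client minimizers c_i
weights = np.ones(5) / 5            # aggregation weights p_i

def client_opt(theta, c_i, eta, tau):
    """ClientOpt: tau local gradient steps (steps 6-9 of Algorithm 1)."""
    for _ in range(tau):
        theta = theta - eta * (theta - c_i)
    return theta

def server_opt(theta, delta, eta_s):
    """ServerOpt: a single global gradient-type step (step 13)."""
    return theta + eta_s * delta

theta = np.zeros(3)                 # initial global model theta^(0)
for t in range(50):
    deltas = []
    for i in range(5):              # full participation (cross-silo style)
        theta_i = client_opt(theta, centers[i], eta=0.1, tau=5)
        deltas.append(theta_i - theta)                # local change (step 10)
    delta = np.average(deltas, axis=0, weights=weights)  # aggregation (step 12)
    theta = server_opt(theta, delta, eta_s=1.0)

# With equal weights, theta converges to the average of the local minimizers.
print(np.allclose(theta, centers.mean(axis=0)))  # True
```

Swapping `server_opt` for, e.g., a momentum or adaptive update yields other members of the FedOpt family, which is precisely the flexibility emphasized in [33].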

TABLE I: Comparison of distributed learning, cross-silo and cross-device FL, adapted from [24].

Setting / Clients
  Distributed learning: Training a model on a large dataset - clients are computing nodes in a single cluster or datacenter.
  Cross-silo FL: Training a model on siloed data - clients are different organizations or entities.
  Cross-device FL: Clients are a very large number of mobile or IoT devices.

Data distribution
  Distributed learning: Data is centrally stored and can easily be shuffled and balanced across clients - any client can read any part of the dataset.
  Cross-silo and cross-device FL: Data is generated locally and remains decentralized - each client stores its own data and cannot read the data of other clients.

Orchestration
  Distributed learning and cross-silo FL: All clients are almost always available.
  Cross-device FL: Only a fraction of clients is available at any one time.

Scale
  Distributed learning: Typically, 10^3 clients.
  Cross-silo FL: Typically, 10^0-10^2 clients.
  Cross-device FL: Massively parallel, up to 10^10 clients.

Primary bottleneck
  Distributed learning: Computation, as very fast communication is available in datacenters.
  Cross-silo FL: Computation or communication.
  Cross-device FL: Communication.

Addressability
  Distributed learning and cross-silo FL: Each client has an identity or name that allows the system to access it specifically.
  Cross-device FL: Direct indexing of clients is not possible.

Client state
  Distributed learning and cross-silo FL: Stateful - each client may participate in each round of the computation, carrying a state.
  Cross-device FL: Stateless - each client will likely only participate once or a few times.

Connection
  Distributed learning: Stable. Cross-silo FL: Relatively stable. Cross-device FL: Highly unreliable.

Data-partition axis
  Distributed learning: Arbitrary across the clients.
  Cross-silo FL: Partition is fixed - horizontal or vertical.
  Cross-device FL: Partition is fixed and horizontal.
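The decentralized data in Table I is, in practice, also statistically heterogeneous across clients. A common way to emulate such non-IID splits in experiments (a standard benchmarking device, not something prescribed by [24]) is to partition class labels with Dirichlet proportions; the class count, client count, and concentration parameter below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy dataset: 1000 samples over 4 classes, split across 5 clients.
labels = rng.integers(0, 4, size=1000)
n_clients, alpha = 5, 0.3   # smaller alpha => more heterogeneous clients

client_indices = [[] for _ in range(n_clients)]
for c in range(4):
    idx = np.flatnonzero(labels == c)
    rng.shuffle(idx)
    props = rng.dirichlet(alpha * np.ones(n_clients))   # class share per client
    cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
    for i, part in enumerate(np.split(idx, cuts)):
        client_indices[i].extend(part.tolist())

# Every sample is assigned to exactly one client.
print(sum(len(ix) for ix in client_indices))  # 1000
```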

Another significant categorization is based on the global model. In the prevalent approach of FL, also termed centralized FL, a central entity - the global server - is responsible for creating and managing the global model and steering the federation process. This approach offers benefits such as ease of implementation and increased control over client sampling. However, drawbacks include heightened vulnerabilities to system failures and adversarial attacks against a single point of failure, the global server. In contrast, decentralized FL has no central entity, as discussed in detail in [54], [55]. Instead, some form of decentralized model aggregation occurs over a peer-to-peer network, aiming to address the downsides of centralized FL mentioned above. Another categorization is introduced in [56]. Herein, additional servers (edge servers) are introduced, allowing for near real-time responses to a smaller number of clients, leading to the concept of edge-based FL. Combining the advantages of edge servers with a central server then introduces hierarchical FL, see [57], [58] for in-depth discussions of hierarchical FL.

C. Challenges

Implementing FL methods and algorithms in dedicated applications is associated with various technical challenges, discussed in detail in [59]–[62]. Despite the communication-efficient distributed learning, there is still a substantial communication overhead compared to local learning, with the constant exchange of model updates between the clients and a central server. This is particularly challenging in systems with limited bandwidth, potentially leading to delays, increased latency, and increased resource utilization, see [63], [64]. The heterogeneity of the clients' computational resources poses another hurdle, demanding adaptive strategies to manage collaborative learning across devices with diverse computational power, memory, and energy resources, see [36], [65]–[68]. The non-independent and identically distributed (non-IID) nature of data across clients introduces further complexities in model training, necessitating algorithms that account for variations in local data distributions, see [69], [70]. Finally, the challenge of model aggregation complexity arises from the intricate process of combining asynchronous updates, addressing learning rate discrepancies, and ensuring robust mechanisms for aggregating disparate local models into a coherent global model, see [71]. Addressing these challenges is pivotal to successfully integrating FL in systems with diverse clients and data distributions.

In addition to the technical challenges, FL faces critical hurdles related to evaluation criteria, digital ethics, and incentive mechanisms. Robust diagnostics capable of identifying and eliminating updates from clients with faulty sensors or incorrect data are essential for ensuring the reliability and accuracy of the aggregated FL models, see [72]. On the ethical front, privacy and security issues are paramount [6]. FL's
[Fig. 3 sketches the centralized data matrix (features and targets, one row per sample): in (a), red horizontal lines split the rows among clients 1 to N; in (b), a red vertical line splits the feature columns between clients 1 and 2, with only client 1 holding the targets.]

(a) Horizontal federated learning (HFL). (b) Vertical federated learning (VFL).

Fig. 3: Sketch of the data partition for horizontal (a) and vertical federated learning (b). Note that only one client (Client 1) provides target values for VFL. The red lines indicate the splits through the centralized data matrix.

core principle of preserving user privacy while training models without exchanging raw data is a delicate balance, requiring advanced encryption techniques to aggregate model updates without compromising sensitive information. However, inherent security issues, including data poisoning, adversarial attacks, and the potential for reconstructing private raw data, are present challenges to FL systems, see [73]. Designing effective incentive mechanisms for FL presents a formidable challenge as participants must be motivated to contribute their computational resources and data without direct access to the immediate benefits of the global model, see [74]–[76]. This balancing act requires strategies to encourage active and sustained participation in the collaborative learning process. Despite the mentioned challenges, a key objective of FL is to obtain strong, personalized models for individual clients, see [77]–[80].

III. STATE-OF-THE-ART AT THE INTERSECTION OF FL AND CONTROL

Building on the prior brief introduction to Federated Learning (FL), this section focuses on the state-of-the-art at the intersection of FL and control. First, the concept of distributed control is introduced. Following that, we investigate learning and control, concentrating on system identification and controller design. Here, we provide an overview of the literature we deem interesting for FL. Finally, we provide a concrete overview of the current literature at the intersection of FL and control.

A. Distributed Control

Distributed Control (DC) is a subfield of control that focuses on designing and implementing control systems where the control task is distributed across multiple connected controllers. This allows for a robust framework for managing complex and large-scale systems where centralized control would be less effective. Each controller in the system manages its local operations and integrates into the global control strategy through a network of communication links to other controllers, see [81]–[87]. Again, the scope of this section is not to provide a detailed introduction to DC, as there is excellent literature available, e.g., [88], [89], but to provide the main concepts, assumptions, and some examples. Similarities and synergies between DC and FL, as well as major differences, are also discussed. Earlier work on distributed multi-agent coordination can be found in, e.g., [90]–[92]. A further overview focusing on decentralized control is given by [93].

In the context of DC, several foundational assumptions are essential. DC is based on the decentralization of control, which means that each controller has enough computational power and resources to manage local control tasks autonomously. This decentralization demands robust connectivity for the efficient exchange of information, guaranteeing that controllers can coordinate and synchronize operations throughout the system. Nevertheless, the DC architecture is also designed so that the failure of individual components does not trigger a systemic failure, thereby ensuring continuous system operation despite localized disturbances. Scalability is another critical assumption underlying DC. The system design should allow for a seamless addition or removal of nodes and modifications to the network's configuration without significant disruptions to its overall functionality. Synchronization mechanisms within DC are assumed to ensure that, despite the autonomy of individual controllers, operations are harmonized across the system, particularly where processes are interconnected [89], [94]. Moreover, the effectiveness of DC depends on the availability of accurate and timely data from local sensors, which each node uses to make informed decisions. This reliance on data assumes that all nodes have access to the necessary resources - such as computational power and energy - to perform their functions reliably. Finally, given the distributed nature of the system, security measures are presumed to be robust, safeguarding data privacy against potential adversaries. These assumptions serve as the foundation for the design of DC systems.

Established application examples for DC include smart

grids, multi-agent systems, and connected autonomous vehicles (CAVs). In smart grids, various components like renewable energy sources and consumer appliances must coordinate to distribute electricity efficiently, e.g., [94]–[97]. This coordination is done using DC, where each node adjusts its operations based on local data, e.g., energy demand or supply availability, and communicates with neighboring nodes to maintain stability and efficiency. The work [98] also provides an early example of applying ideas from DC to large-scale systems, illustrated by a power system example, whereas [99] provides a summary of recent research.

In distributed control of multi-agent systems, see [100], [101], formation control involves multiple agents maintaining a predefined spatial arrangement while moving, see, e.g., [102]. An early example of formation control of unmanned aerial vehicles is provided by [103]. Each robot adjusts its position based on the positions of its neighbors rather than central instructions, see [104]. In [105], control performance and communication effort are compared in multi-agent systems.

Furthermore, the platooning of connected automated vehicles (CAVs) is projected to significantly alter road transportation by enhancing traffic efficiency and decreasing fuel consumption, see [106], [107]. In [108], a distributed model predictive control algorithm is proposed for the vehicle platooning problem. The rapid development of vehicle-to-vehicle communications, see [109] for a detailed discussion of vehicle-to-everything (V2X) communication, also encourages using distributed control techniques. The work [110] provides a detailed survey on cooperative longitudinal motion control for multiple CAVs.

Further examples are given in [111], wherein decentralized, distributed, and centralized control systems are compared based on the objective of improved system performance of a multi-zone furnace. The authors of [112] discuss the problem of achieving multi-consensus in a linear multi-agent system using distributed controllers. The distributed control of nonlinear interconnected systems is studied in [113]. State observers are also investigated in the setting of distributed control. For example, [114]–[116] discuss the important issue of distributed observer design for LTI systems, whereas [117] focuses on the design of distributed Luenberger observers. In [118], the distributed Kalman filter is discussed.

DC and FL share several foundational principles based on their decentralized nature. Both methodologies emphasize decentralization, where decision-making for DC and learning for FL are distributed across agents rather than centralized in a single entity. This structure reduces the risk associated with single points of failure, increasing the system's robustness. Both DC and FL rely heavily on local processing. Agents in these systems handle data locally, reducing the need to transport vast amounts of data across the network. This not only saves bandwidth but also improves privacy by limiting the accessibility of critical information. Scalability is another shared feature; DC and FL are designed to handle an increasing number of agents easily. This scalability means that growing the network or integrating more agents does not affect the overall system performance, allowing the system to remain manageable and operationally efficient as it grows. These shared characteristics underline the adaptability and efficiency of both DC and FL in managing complex, distributed tasks. Whether in controlling physical processes or in data processing for machine learning, both paradigms leverage decentralization, local processing, and coordinated communication to meet the respective goals effectively.

While Distributed Control (DC) and Federated Learning (FL) share some fundamental principles, their differences are significant, reflecting different aims, applications, and operating approaches.

Purpose and Application: DC is commonly used in engineering systems to control physical processes and devices, such as grids, fleet robotics, and infrastructure management. Its main goal is the direct control of physical entities to ensure operational efficiency and reliability - fulfilling the global control task. In contrast, FL is a machine-learning technique that enables the training of models across multiple decentralized devices and is particularly useful in data-driven applications where privacy and bandwidth constraints are critical.

Data Handling: Data handling also underscores a fundamental difference between DC and FL. DC is concerned with real-time control, whereas FL uses data to train machine-learning models without real-time considerations.

Algorithmic Focus: From an algorithmic perspective, DC concentrates on algorithms that ensure stability, control, and optimization of system dynamics, reflecting its direct interaction with physical systems. On the other hand, FL's algorithms are geared towards learning and inference, aiming to optimize accuracy, reduce model bias, and, importantly, enhance data privacy through its decentralized approach.

Privacy Concerns: In terms of privacy, while DC may handle sensitive information, especially in critical infrastructure settings, its primary concerns revolve around operational security and system reliability. FL explicitly addresses privacy concerns, as it is designed to minimize data exposure by ensuring that only model updates are communicated rather than raw data.

In summary, while DC and FL operate on distributed principles, they apply these principles to fundamentally different tasks - control and optimization in real systems for DC and privacy-preserving collaborative learning in FL. Their methodologies reflect their respective goals: immediate and direct control of physical environments versus incremental and privacy-preserving improvement of predictive models.

B. Learning and Control with Focus on FL

This section briefly introduces Learning and Control (LC), focusing on concepts similar to Federated Learning (FL). While LC deserves extensive coverage, we provide a concise overview of relevant topics and refer readers to the key literature to explore this exciting intersection of artificial intelligence and control theory. Within the field of LC, we see two main categories that are promising for FL, namely
• System identification and
• Controller design.
The review of data-driven control [119] offers insights into the challenges of model-based control (MBC) theory, the

6
significance of data-driven control (DDC) methods, the state-of-the-art, their classifications, and the relationship between model-based and data-driven control.

1) System Identification: Standard textbooks describe various ways to tackle the problem of system identification, such as [120]–[122]. In the following, we discuss selected works in this field that apply ideas from the learning literature to the system identification problem.

For the identification of stable LTI systems, [123] proposes a maximum likelihood routine based on the Expectation Maximization (EM) algorithm with latent states or disturbances and Lagrangian relaxation. The identification of unstable linear systems is discussed in [124], where finite-time bounds on the error of the Least Squares (LS) estimate of the dynamic matrix are derived for a large class of heavy-tailed noise distributions. The work [125] presents a novel statistical analysis of the LS estimator for LTI systems with stable, marginally stable, and explosive dynamics. Learning autonomous linear dynamical systems from a single trajectory or rollout is thematized in [126], [127], whereas [128] and [129] discuss identification based on multiple trajectories. In [130], [131], spectral filtering is introduced to learn the impulse response function for latent-state linear dynamical systems. When working with nonlinear systems of the form ẋ = f(x, u), a common practice is linearizing around a reference point {x_r, u_r}. The work [132] follows this idea and uses regularized LS to obtain the linearized dynamics.

An overview of nonlinear dynamic system identification techniques is given in [133]. A perspective based on kernel methods and Bayesian statistics, including support vector machines, Gaussian regression, and reproducing kernel Hilbert spaces (RKHS), is proposed in [121] and further emphasized in [134]. A kernel specifically tailored to Port-Hamiltonian systems is presented in [135] and preserves the passive nature within this system class. Gaussian mixture models are applied to encode a robot's motion as a first-order autonomous nonlinear ODE, see [136]. In [137], the combination of SINDy (sparse identification of nonlinear dynamics, see [138]) and MPC is proposed to enhance the control performance. A modern perspective based on deep neural networks is elaborated in [139] and [140]. Using neural networks in system identification was already proposed in the 1990s [141]–[143]. The authors of [144] formulate the system identification task based on high-dimensional uncertain measurements, e.g., videos, as a neural network-based approximation of the posterior of the control loss. Deep Learning was also applied to learn representations of the Koopman operator and its eigenfunctions from data, as seen in [145]. Another line of work focuses on neural state space models, wherein neural networks are utilized to learn state space representations. Early works trace back to [146], where recurrent neural networks are used for nonlinear system identification. Recent works utilize autoencoders [147], [148], genetic algorithms [149], or meta-learning [150], [151] for learning a nonlinear state space representation. The problem of learning long sequence dependencies is tackled via structured state space models in [152]–[154]. Learning state representations can be interpreted as a generalization of system identification, as it comprises learning forward and inverse models. An excellent review summarizing the state representation learning formalism, as well as the learning objectives and building blocks, is given in [155].

Table II summarizes recent advances in learning-based identification for linear and nonlinear systems, which we consider promising for FL applications.

2) Controller Design: When given a system description, obtained through first-principles modeling or data-driven system identification, we typically want to control the system to behave in a desired way. For this, a controller which determines the system's input is designed. The main distinction here is between controllers without feedback (feedforward) and with feedback, wherein a feedback controller uses measurements of the system's output and closes the control loop. Learning controllers using neural networks have a rich history, exemplified in works like [142], [143], [171], [172]. Moreover, data-driven control (DDC) is thematized in [173], [174], with a special focus on control design for nonlinear systems. A distinction can be made between direct DDC and indirect DDC, where system identification and control are performed sequentially. In [175], the connection between indirect and direct DDC approaches is discussed. They formulate these approaches using behavioral systems theory and parametric mathematical programs and bridge them through a multi-criteria formulation, trading off system identification and control objectives. The study reveals that direct DDC can be derived as a convex relaxation of the indirect approach, with regularization accounting for implicit identification steps. Direct and indirect predictive control is the main topic of [176]. In this comparative study based on stochastic LTI systems, two distinct non-asymptotic regimes in control performance can be distinguished for direct and indirect predictive control.

There is also much work on optimal control, especially for unknown or partially known systems. In [177], a Thompson sampling-based learning algorithm is used to learn the dynamics, which are subsequently used for Linear Quadratic Regulator (LQR) control design. They show robustness to time-varying parameters of the controlled stochastic LTI system. Controlling an LTI system with known noisy dynamics and adversarial quadratic loss is tackled using semi-definite relaxation in [178] and [179], leading to strongly stable policies. LQR control for unknown linear systems is further investigated in [180]–[183]. The problem of adversarially changing convex cost functions with known linear dynamics is tackled in [184] and [185] using a Disturbance-Action Controller (DAC) given by u_t = K x_t + Σ_{i=1}^{H} M_{i−1} w_{t−i} combined with online convex optimization. The papers [186]–[189] focus on controlling unknown linear dynamical systems subjected to non-stochastic, adversarial perturbations. The class of Gradient Perturbation Controllers (GPC) is introduced, combining a stabilizing linear controller K with a DAC parametrized by the matrices M_i, i = 1, …, H, for some horizon H and disturbances w_i. The general class of convex optimization control policies (COCPs), including standard applications like LQR, approximate dynamic programming, and model predictive control, is discussed in [190]. They propose updating the control parameters based on the projected stochastic gradients of performance metrics (cost functions) instead of the standard way of tuning by hand, or
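The least-squares estimators discussed above admit a very compact implementation. The following sketch is our own toy example (dimensions, horizon, and noise level are arbitrary choices, not taken from any of the cited works); it recovers the matrices of a discrete-time LTI system x_{t+1} = A x_t + B u_t + w_t from a single excited rollout via ordinary LS:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth discrete-time LTI system (stable); values chosen arbitrarily
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
n, m, T = 2, 1, 500

# Simulate one rollout with Gaussian excitation and small process noise
X = np.zeros((T + 1, n))
U = rng.normal(size=(T, m))
for t in range(T):
    X[t + 1] = A @ X[t] + B @ U[t] + 0.01 * rng.normal(size=n)

# Ordinary LS on stacked regressors z_t = [x_t; u_t]
Z = np.hstack([X[:-1], U])                     # shape (T, n + m)
Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_hat, B_hat = Theta[:n].T, Theta[n:].T

print(np.max(np.abs(A_hat - A)))               # small estimation error
```

A regularized variant in the spirit of [132] would instead solve (ZᵀZ + λI)Θ = ZᵀX₊ for some ridge weight λ > 0.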
TABLE II: Recent literature regarding linear and nonlinear system identification with potential for FL.

| Source | Description | Dynamics | Techniques | FL potential | Input |
|---|---|---|---|---|---|
| [156] | Introduces distributed stochastic gradient descent (SGD) with reversed experience replay for distributed online system identification of identical LTI systems | LTI | SGD | High | None |
| [157] | Clustered system identification based on a mean squared error (MSE) criterion | LTI | SGD; cluster estimation | High | Gaussian |
| [158] | System identification inspired by multi-task learning to estimate the dynamics of linear time-invariant systems jointly by leveraging structural similarities across the systems via regularized LS | LTI | Regularized LS; proximal gradient method | High | Constant |
| [159] | Control-oriented regularization for LTI system identification using control specifications as Bayesian prior | LTI | Bayesian perspective; regularized LS | Medium | Linear |
| [160] | Finite-time analysis for learning Markov parameters of LTI systems applying an ordinary LS estimator with multiple rollouts, covering both stable and unstable systems | Part. observable LTI (open-loop stable or unstable) | LS; Markov parameters | High | Gaussian |
| [161] | Prefiltered Least Squares algorithm that provably estimates the dynamics of partially observed linear systems | Part. observable LTI | Prefiltered LS | Medium | Gaussian |
| [162] | Leverages data from auxiliary (similar) systems | LTI | Weighted LS | Medium | Gaussian |
| [163] | Jointly estimates transition matrices of multiple, related systems | LTI | SGD; basis functions | High | None |
| [164] | Automated tuning method for controllers with safety constraints | Linear | Bayesian opt.; nonlinear regression | High | PI-control |
| [165] | Simultaneously estimates states and explores structural dependencies between estimated dynamics | Linear parameter-varying (LPV) | Multiple neural networks | Medium | Necessary¹ |
| [166] | Multi-robot transfer learning for SISO systems | Nonlinear | Transfer learning | Medium | Input-output linearization |
| [146], [167] | Neural state space models | Nonlinear | Neural network | High | Possible² |
| [147], [148] | Neural state space models via autoencoder | Nonlinear | Neural network; autoencoder | Medium | Necessary¹ |
| [151] | Safe learning and control based on an online uncertainty-aware meta-learned dynamics model | Nonlinear | NN; last-layer adaptation | High | Necessary¹ |
| [168] | Bayesian multi-task learning model using trigonometric basis functions to identify errors in the dynamics | Nonlinear | Basis functions; max. likelihood; Kalman filtering | Medium | MPC |
| [169] | Stochastic MPC based on a learned input-feature model combined with (online) Bayesian linear regression and online model selection to leverage multiple input-feature models | Nonlinear | Bayesian; NN | High | MPC |
| [170] | Probabilistic Deep Learning to meta-learn building models using multi-source datasets | Nonlinear | NN; meta-learning | High | None |

¹ Inputs are necessary for system identification, but not specified in the source.
² Method also applicable for autonomous systems.
by grid search. The work of [191] considers Linear Quadratic Gaussian (LQG) problems with unknown dynamics. They leverage Input-Output Parameterization (IOP), see [192], for robust controller synthesis based on a convex parameterization. Similar works [193] and [194] focus on the adaptive aspect of the problem and introduce a control algorithm combining system identification and adaptive control for LQG systems.

In adaptive control, early work uses neural-network-based adaptive controllers for trajectory tracking of robotic manipulators, see [195]–[197]. Radial basis function networks were used even earlier for adaptive control, see [198]. Recent work on model reference adaptive control (MRAC) based on Gaussian Processes is given in [199], whereas [200] studies MRAC based on deep neural networks. In [201], an adaptive nonlinear MPC is designed such that model uncertainties are learned online and immediately compensated for. Adaptive control for high-dimensional systems is always challenging. This problem is tackled in [202], wherein a non-parametric adaptive controller is proposed that scales to high-dimensional systems by learning the unknown dynamics in a reproducing kernel Hilbert space (RKHS) leveraging random Fourier features. The work [203] introduces local Gaussian process regression as a method that achieves high accuracy in online learning and real-time prediction, applied to inverse dynamics model learning for model-based control of two 7-DoF robot arms.

Table III summarizes the recent advances in learning-based controller design, which we consider promising for FL applications.

C. Existing Literature for FL and Control

Currently, existing works that combine Federated Learning (FL) methodologies with control theory can be found in four major areas:
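The random Fourier feature idea behind [202] can be illustrated in a few lines. The sketch below is our own toy example (target function, bandwidth, and problem sizes are arbitrary choices), not the controller of [202]: it fits a ridge regression in a random feature space that approximates an RBF kernel, so the prediction cost stays independent of the number of training samples:

```python
import numpy as np

rng = np.random.default_rng(1)

# Unknown scalar map to be learned (arbitrary toy choice)
f = lambda x: np.sin(3.0 * x) + 0.5 * x

# Training data with small measurement noise
X = rng.uniform(-2, 2, size=(300, 1))
y = f(X[:, 0]) + 0.01 * rng.normal(size=300)

# Random Fourier features approximating an RBF kernel with bandwidth sigma:
# phi(x) = sqrt(2/D) * cos(W x + b), W ~ N(0, 1/sigma^2), b ~ U[0, 2*pi]
D, sigma = 400, 0.5
W = rng.normal(scale=1.0 / sigma, size=(1, D))
b = rng.uniform(0, 2 * np.pi, size=D)
phi = lambda X: np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Ridge regression in the feature space
Phi = phi(X)
theta = np.linalg.solve(Phi.T @ Phi + 1e-4 * np.eye(D), Phi.T @ y)

print((phi(np.array([[0.7]])) @ theta).item(), f(0.7))  # approximately equal
```

For online adaptive control, the same feature map allows recursive updates of theta at fixed per-step cost, which is what makes the approach attractive for high-dimensional systems.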
TABLE III: Recent literature regarding controller design with potential for FL.

| Source | Description | System | Techniques | FL potential |
|---|---|---|---|---|
| [204], [205] | Controlling an unknown system with varying latent parameters, aiming to learn approximate models for both the dynamics and the prior over latent parameters from observed trajectory data | Nonlinear | NN; meta-learning | High |
| [206], [207] | Control-oriented approach to learning parametric adaptive controllers through offline meta-learning from past trajectory data | Nonlinear | NN; meta-learning | Medium |
| [208] | Meta-learning online control algorithm that learns across a sequence of similar control tasks | Linear | Projected gradient descent | Medium |
| [209] | Transferring knowledge from a source system to design a stabilizing controller for a target system | Linear | Transfer learning | High |
| [210] | Combining prior knowledge and measured data for learning-based robust controller design, leading to stability and performance guarantees for closed-loop systems | Linear | Linear fractional transformations | Medium |
| [211] | Online multi-task learning approach for adaptive control with environment-dependent dynamics | Nonlinear | NN; last-layer adaptation | High |
| [212] | Multi-task imitation learning via shared representations for linear dynamical systems | Linear | Multi-task imitation learning | High |
| [213] | Learning a time-varying, locally linear residual model of the dynamics to compensate for prediction errors of the controller's design model | Linear residual model | Ridge regression; replay buffer | Medium |
| [214] | Combining simulated source domains and adversarial training to learn robust policies | Nonlinear | Policy gradient; NN | Medium |
| [215] | A learning algorithm termed Meta Strategy Optimization that learns a latent strategy space suitable for fast adaptation in training | Nonlinear | Meta-learning; NN | Medium |
| [216] | Decomposing neural network policies into task-specific and robot-specific modules, enabling transfer learning across different robots and tasks with minimal additional training | Nonlinear | NN; multi-task & transfer learning | High |
| [217] | LQR in a multi-task setting, with the MAML-LQR approach producing stabilizing controllers close to task-specific optimal controllers | Nonlinear | Policy gradient & NN; meta-learning | High |
• system identification,
• controller design,
• federated reinforcement learning, and
• control-inspired aggregation.

Table IV summarizes the primary outcomes of these studies. This section refers to the first research question, Q1, in Section I.

In system identification, a strong focus has been put on collaboratively learning linear, time-invariant (LTI) system dynamics from diverse observations across multiple clients under privacy considerations, as exemplified in [218] and [158]. In [218], the FedSysID algorithm is introduced, addressing Least Squares (LS) system identification for multiple linear clients with similar dynamics, showing improved convergence compared to individual learning. In [158], the focus lies on leveraging similarities among multiple systems to accurately estimate LTI dynamics using LS, although privacy considerations were not specifically addressed. Another notable contribution, detailed in [219], introduces a distributed Recursive Least Squares (RLS) algorithm tailored to robust estimation in networked environments. In this scenario, each client measures samples linked by a common, yet unknown, linear regression model. This algorithm can be interpreted within the framework of FL, as knowledge is exchanged between clients by transmitting and aggregating covariance matrices of an RLS algorithm. Pioneering work in adaptive networks¹, a field not directly linked but conceptually related to FL, comprehensively examines recent strides in adaptation, learning, and optimization across networked systems, as surveyed in [86] or [234]. These surveys offer valuable insights for comparing diverse network topologies and facilitate an assessment of adaptive networks' performance in contrast to centralized implementations.

Controller design has a line of work on federated LQR concepts, e.g., [220] and [221]. In [220], the distributed LQR tracking problem is studied in a setting of clients sharing linear but unknown dynamics and tracking different targets. The proposed model-free federated zero-order policy gradient algorithm capitalizes on a shared LQR matrix across clients, demonstrating a linear speedup in the number of clients over communication-free, local alternatives, even when clients have a heterogeneous component in their objective, namely their tracking target. Simulation results demonstrate the algorithm's effectiveness in linear and nonlinear system settings, showcasing its scalability. In [221], the focus is on the model-free federated LQR problem involving multiple clients with distinct and unknown yet similar LTI dynamics collaborating to minimize an average quadratic cost function while maintaining data privacy. The proposed FL approach, named FedLQR, allows clients to periodically communicate with a central server to train policies, addressing questions related to the stability of the learned common policy, its proximity to individual clients' optimal policies, and the speed of learning with data from all clients. In [223], fleet-level learning of static feedback controllers from distributed and potentially heterogeneous robotic data is discussed. They propose the FLEET-MERGE algorithm, which efficiently merges neural network-based controllers by accounting for permutation invariance in recurrent neural networks without centralizing the data.

¹ Adaptive networks consist of a collection of nodes with learning abilities. The nodes interact with each other on a local level and diffuse information across the network to solve inference and optimization tasks in a decentralized manner.
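The exchange of sufficient statistics instead of raw samples, which underlies the FL interpretation of the distributed RLS algorithm above, can be sketched as follows (a simplified batch variant of our own making, not the algorithm of [219]):

```python
import numpy as np

rng = np.random.default_rng(2)

# Common, unknown linear regression model y = phi^T theta + noise (toy setup)
theta_true = np.array([1.0, -2.0, 0.5])
d, n_clients = 3, 5

# Each client condenses its local data into an information matrix and vector;
# only these sufficient statistics are communicated, never the raw samples.
H_global, g_global = np.zeros((d, d)), np.zeros(d)
for _ in range(n_clients):
    Phi = rng.normal(size=(50, d))                # local regressors
    y = Phi @ theta_true + 0.05 * rng.normal(size=50)
    H_global += Phi.T @ Phi                       # aggregate information matrix
    g_global += Phi.T @ y                         # aggregate information vector

# The aggregated estimate equals centralized LS on all clients' data
theta_hat = np.linalg.solve(H_global, g_global)
print(np.max(np.abs(theta_hat - theta_true)))     # small estimation error
```

A recursive variant would update the local statistics sample by sample on each client and merge them at the server periodically, which is where the RLS formulation comes in.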
TABLE IV: Summary of the literature connecting FL and Control.

| Topic | Summary and Literature References |
|---|---|
| System Identification | Recent research explores collaborative learning of linear systems in system identification, using FedSysID to balance performance and system heterogeneity, see [158], [218]. A distributed Recursive Least Squares (RLS) algorithm, discussed in [219], addresses robust estimation in networked environments, aligning with FL principles. |
| Controller Design | Practical applications of federated LQR concepts have demonstrated their scalability and efficiency, marking significant strides in both linear and nonlinear systems [220], [221]. In [222], the integration of neural networks into an adaptive PID-control system, leveraging FL to overcome limitations posed by limited local training data, showcases the potential of this approach. In [223], FL is harnessed to learn static feedback controllers for a fleet of robots, while [224] explores the use of FL in swarm control of drones. Personalized FL is applied for learning in robotic control applications [225]. |
| Reinforcement Learning | Federated Learning mitigates reinforcement learning's sample inefficiency and accommodates heterogeneous sensor data, see [226]–[229]. |
| Control-inspired Aggregation | Novel research integrates control theory principles into FL by enhancing global model aggregation with integral and derivative terms, akin to PID control frameworks, see [230]–[233]. |
An additional line of work delves into the integration of neural networks within adaptive PID-control systems for connected autonomous vehicles (CAVs), as exemplified in [75] and [222]. Control parameters are dynamically adjusted to enable navigation under varying traffic and road conditions, which is facilitated by a neural network-based auto-tuning unit learning the system behavior, as detailed in [235]. The autonomous vehicle's local training data, restricted by onboard memory, limits the controller's adaptability to specific traffic scenarios, potentially compromising safety. To overcome this constraint, the proposed approach advocates for implementing FL. In this collaborative process, CAVs collectively learn the neural network-based auto-tuning units, facilitated by a wireless base station functioning as a global server. This FL mechanism empowers CAVs to enhance the neural network auto-tuning units for local controller adjustments, effectively tackling the issue of limited local training data and broadening the controller's applicability across diverse traffic patterns. In [225], personalized FL is applied for learning trajectory forecasting models in robotic control applications, wherein personalization is achieved by adjusting learning rates based on parameter variances. An excellent example of the combination of FL and control is given in [224]. Here, the swarm control of a large number of drones is considered, combining mean-field game theory, Federated Learning, and Lyapunov stability analysis. Numerical examples show the efficacy of the proposed approach.

The intricate connections between control and reinforcement learning (RL) arise from their common pursuit of effective decision-making and optimization. Control theory contributes well-established principles for system regulation and trajectory tracking, complemented by reinforcement learning's adaptive techniques designed for acquiring optimal behaviors in dynamic and uncertain settings; see, e.g., [236]–[238] for detailed reviews of reinforcement learning. The growth in computational power and the development of deep neural networks have significantly enhanced the capabilities of reinforcement learning algorithms. This technological growth has propelled the field into new frontiers, enabling more sophisticated learning and decision-making processes. Reinforcement learning algorithms, however, are confronted with the challenge of sample inefficiency, often necessitating a substantial number of interactions with the environment to acquire effective policies. This inefficiency poses a significant drawback, particularly in real-world scenarios where data collection can be resource-intensive and time-consuming. Federated Learning techniques offer a promising avenue to mitigate these challenges by facilitating collaborative learning among multiple agents (clients) to derive optimal policies, see [226]. The authors propose a federated RL scheme based on the actor-critic Proximal Policy Optimization (PPO) algorithm to learn a classical nonlinear control problem: the upswing of a pendulum. They clearly show the effectiveness of the proposed FL solution by significantly reducing the convergence time, despite the slightly different dynamics of the individual devices, namely three Quanser QUBE™-Servo 2 systems, see [239]. In [240], a federated reinforcement learning setup shows a linear convergence speedup with respect to the number of agents. Additionally, FL's capability to accommodate heterogeneous real-life or artificially generated sensor data further broadens its applicability and effectiveness in diverse and complex environments, as exemplified in [227]–[229], [241].

The discussion so far predominantly centers around applying FL techniques within specific control areas, namely system identification, controller design, and reinforcement learning. However, the landscape looks different when examining control-inspired aggregation, which represents a shift in viewpoint. In this context, concepts derived from control are employed to enhance FL algorithms. A noteworthy research path involves augmenting the global model aggregation process by incorporating integral and derivative terms akin to conventional PID-control frameworks. This approach, exemplified in works such as [230]–[233], represents a unique integration of control theory principles to optimize the FL process. Another work, conceptually similar to FL, is given by [99]. Here, they tackle the problem of consumer scheduling under incomplete information.

IV. FL AND CONTROL

In light of the literature review conducted in the previous section, this section explores various scenarios in which Federated Learning (FL) concepts can be integrated into control theory, encompassing system identification and control, and knowledge transfer in multi-agent systems. Furthermore, FL's adaptive learning capabilities are discussed for dynamic systems, highlighting its suitability for evolving environments
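The PID-inspired aggregation schemes of [230]–[233] differ in their details; the sketch below is a simplified illustration of the shared idea only (gains and problem sizes are arbitrary choices of ours, not taken from these works). The server augments the plain averaged client update with integral and derivative terms:

```python
import numpy as np

def pid_aggregate(w_global, client_models, state, kp=1.0, ki=0.1, kd=0.1):
    """One server round: PID-style combination of the averaged client update.

    `state` carries the accumulated (integral) and previous (derivative)
    update between rounds; the gains are purely illustrative.
    """
    delta = np.mean(client_models, axis=0) - w_global  # proportional term
    state["integral"] = state["integral"] + delta      # integral term
    derivative = delta - state["prev"]                  # derivative term
    state["prev"] = delta
    return w_global + kp * delta + ki * state["integral"] + kd * derivative

# Toy run: client models scatter around a common target model
rng = np.random.default_rng(4)
target = np.array([1.0, -1.0])
clients = [target + 0.01 * rng.normal(size=2) for _ in range(4)]

w = np.zeros(2)
state = {"integral": np.zeros(2), "prev": np.zeros(2)}
for _ in range(50):
    w = pid_aggregate(w, clients, state)
print(w)  # converges towards the common target model
```

With ki = kd = 0, this reduces to plain FedAvg-style averaging; the extra terms can speed up or smooth convergence at the price of additional server-side state.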
and changing operating conditions. This section refers to the second research question, Q2, in Section I and presents FL as a promising framework for advancing various facets of control.

A. Methodical Benefits of Combining FL and Control

In control, the integration of FL principles offers a promising avenue for advancing system adaptability via communication-efficient, collaborative learning with preserved privacy as well as enhanced generalization and robustness. Federated Learning's adaptive learning capabilities find resonance in control applications dealing with dynamic systems. Clients can continuously update their knowledge using a system model or controller parameterization based on changing environmental conditions. This makes FL suitable for control scenarios where system dynamics or shared external influences evolve. The decentralized approach inherent in FL aligns seamlessly with distributed control systems. This synergy enables multiple clients engaged in control tasks to collaboratively update their control policies or global model representation based on local observations, fostering a coordinated and efficient system behavior, as demonstrated in [226] and [242]. The emphasis on communication efficiency is particularly beneficial for control systems with many participants or low bandwidths. Clients engaged in collaborative learning tasks can exchange minimal data while acquiring control policies, mitigating communication challenges, and enhancing overall system efficiency. Transfer learning proves valuable in control domains, mirroring FL's practice of training models on one task and adapting them to another, see [243]. This concept facilitates transferring and fine-tuning of learned control policies from one system to a similar yet distinct one, minimizing the necessity for extensive and time-consuming retuning and retraining efforts. Privacy-preserving control emerges as a critical application, leveraging FL's emphasis on local model training without raw data sharing. In scenarios where data privacy is paramount, such as personal data in healthcare, critical infrastructure, or strictly confidential production data, clients can collaboratively learn control policies without disclosing sensitive information, thus aligning with privacy requirements. By integrating robust control techniques into FL algorithms, further guarantees of system performance under uncertainties or adversarial conditions may be established, bolstering the reliability and generalization of learned models in real-world control applications.

In summary, including FL principles, encompassing adaptability, generalization, communication efficiency, decentralization, and privacy preservation, holds significant potential for advancing various facets of control. This integration contributes to developing more efficient, flexible, secure, and responsive control systems.

B. Expected Applications of Combining FL and Control

There are many ways to include concepts from FL in control. Table V summarizes the introduced applications.

1) System Identification: Foremost, Federated Learning presents a promising avenue for enhancing system identification (SysID) within the control framework. Typically, system identification is separated into parametric SysID and non-parametric SysID. In parametric SysID, first principles and process knowledge are used to develop parametric models, whose physics-based parameters are calibrated based on input-output data. In non-parametric SysID, black-box models, e.g., highly capable function approximation algorithms like neural networks, are employed to learn the system dynamics. Federated Learning extends this paradigm by enabling multiple clients to collaboratively learn from decentralized data sources while preserving data privacy, leading to federated SysID.

In this context, parametric federated SysID allows individual clients to leverage their input-output data to calibrate local models, which are subsequently aggregated to refine a global model representing the parametric system dynamics, subsequently termed federated dynamics. This ranges from sharing single parameter information to learning the complete dynamics of some LTI or nonlinear system. Sharing information through the global model not only copes with parameter scattering caused by minor, unmodeled system differences, e.g., component-aging effects, but also allows for some form of persistent excitation of the global model; see [244] for further information regarding persistent excitation. Furthermore, the scalability and generalizability of the obtained federated dynamics are enhanced, and concerns regarding data privacy and security are addressed. An improved convergence rate can be obtained compared to individual client learning. For instance, smart pneumatic or hydraulic valves can come equipped with series models that capture the nonlinear input-output relationships in a parametric form. During operation, these valves encounter distinct application-specific operating conditions and device-specific variations like wear and tear, often not accounted for during end-of-line calibration. The valves can adapt to such scenarios by embedding a local learning algorithm. These adaptations can also be fed back to the manufacturer in a privacy-preserving manner to enhance the overall series model. As a note on the downside, challenges inherent to FL, as discussed in Section II-C, also apply to parametric SysID when combined with FL. A high level of system knowledge is also required to formulate the parametric models. Parametric federated SysID can only be applied when the participating clients and systems follow similar dynamics. A possible remedy here is the application of clustered FL techniques, wherein the clients are clustered into similar groups sharing similar dynamics, see [39]–[41]. This cluster assignment is either based on a priori known criteria or obtained through applying clustering techniques to the client model parameters.

In non-parametric SysID, the system dynamics are typically learned from input-output data using sophisticated function approximation algorithms like neural networks or Gaussian Processes. Non-parametric federated SysID enables the learned client models to update a global model collaboratively, representing the federated dynamics. This again fosters more robust and generalizable models of dynamical systems and facilitates a more comprehensive exploration of the system dynamics. Furthermore, a faster convergence, typical of FL, is obtained if the client dynamics are similar. Additionally, the computational power of the global server can be exploited
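A minimal sketch of parametric federated SysID (our own toy illustration with an arbitrary first-order model and parameter scatter, not a published algorithm): each client identifies the parameters (a, b) of the structurally known model x_{t+1} = a x_t + b u_t from its local data, and the server averages the local estimates, FedAvg-style, into the federated dynamics:

```python
import numpy as np

rng = np.random.default_rng(3)

# Structurally known first-order model x_{t+1} = a*x_t + b*u_t + noise;
# the clients' physical parameters scatter slightly (e.g., aging effects)
a_nom, b_nom, n_clients, T = 0.85, 0.4, 10, 200

local_estimates = []
for _ in range(n_clients):
    a = a_nom + 0.02 * rng.normal()       # client-specific parameter scatter
    b = b_nom + 0.02 * rng.normal()
    x, rows, targets = 0.0, [], []
    for _ in range(T):                    # persistently exciting input
        u = rng.normal()
        x_next = a * x + b * u + 0.01 * rng.normal()
        rows.append([x, u])
        targets.append(x_next)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    local_estimates.append(theta)         # only parameters leave the client

# Server: FedAvg-style aggregation into the "federated dynamics"
theta_global = np.mean(local_estimates, axis=0)
print(theta_global)  # close to the nominal parameters (a_nom, b_nom)
```

The averaging step smooths out the client-specific parameter scatter, illustrating how the global model benefits from the combined excitation of all clients while no raw trajectories are shared.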
TABLE V: Overview of Expected Applications.

Application | Core Idea | Benefits | Potential Downsides
Parametric federated SysID | Infer parameters of the structurally known dynamics of a system based on local input-output data from multiple clients | Improved convergence rate over single-client learning; persistent excitation of the global model; federated dynamics | FL costs¹; useful only for similar systems; high level of system knowledge required
Non-parametric federated SysID | Collaboratively learn non-parametric dynamics based on local input-output data from multiple clients | Improved convergence; increased generalizability; federated dynamics | FL costs¹; useful only for similar systems
Indirect Adaptive Control | Learn a federated dynamics model and use the local or global model as a robust control design model | Improved control performance and robustness under uncertainties in dynamic environments | FL costs¹; similar control requirements necessary
Direct Adaptive Control | Learn a control law on local data and aggregate at the global server | Improved generalization; improved initial model for local adaptive control | FL costs¹; application to similar systems only; potential stability issues
Advanced server-side optimization | Utilize cloud computing power on the global server | Server-side update/training of neural-network-based control laws for low-power hardware, e.g., learning-based NMPC approximation | FL costs¹; powerful server necessary
Multi-agent decision-making | Explicit distinction between decision-oriented and support-oriented agents | Improves privacy-preserving decision-making and coordination among multiple agents | FL costs¹
Sensor fusion & shared representation learning | Combine different sensor modalities to extract a richer environment representation | Enhances situational awareness and perception capabilities, e.g., merging visual, depth, and semantic maps | FL costs¹; scalability issues
Transfer learning | Merge real-life and simulator data for knowledge transfer | Facilitates safer exploration & knowledge transfer | FL costs¹

¹ See Section II-C for a detailed discussion of FL costs.
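To make the first two rows of Table V concrete, the following sketch (a hypothetical minimal example, not taken from the surveyed works) illustrates parametric federated SysID for a scalar ARX structure shared by all clients: each client fits the parameters on its local input-output data by least squares, and the server combines the estimates with a FedAvg-style weighted average, so only parameters, never raw data, are communicated. All names and numbers below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_theta = np.array([0.8, 0.5])  # shared (a, b) of y[k] = a*y[k-1] + b*u[k-1]

def simulate_client(n):
    """Generate local input-output data from the common client dynamics."""
    u = rng.normal(size=n)                     # persistently exciting input
    y = np.zeros(n)
    for k in range(1, n):
        y[k] = (true_theta[0] * y[k - 1] + true_theta[1] * u[k - 1]
                + 0.05 * rng.normal())         # small process noise
    return u, y

def local_fit(u, y):
    """Client-side least-squares estimate; only (theta, n) leaves the device."""
    Phi = np.column_stack([y[:-1], u[:-1]])    # ARX regressor matrix
    theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
    return theta, len(y)

# Server side: FedAvg-style aggregation, weighted by local sample count.
fits = [local_fit(*simulate_client(n)) for n in (200, 400, 300)]
total = sum(n for _, n in fits)
theta_global = sum(n * theta for theta, n in fits) / total

print(theta_global)  # close to the true parameters [0.8, 0.5]
```

With similar client dynamics, the averaged estimate benefits from the combined excitation of all clients, which is the improved convergence the table refers to; with dissimilar clients, this plain average degrades, motivating the clustered FL techniques cited in the text.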

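The advanced server-side optimization row of Table V can be sketched in the same spirit. The example below is a deliberately simplified stand-in for the pipeline described in the text: scalar linear federated dynamics, an infinite-horizon LQR solved by Riccati iteration in place of an NMPC solve, and a least-squares surrogate policy in place of neural-network training. All quantities are illustrative assumptions, not the method of the cited works.

```python
import numpy as np

# Federated (server-side) model of the similar client systems: x+ = a*x + b*u.
a, b = 0.9, 0.2
q, r = 1.0, 0.1            # stage cost q*x**2 + r*u**2

# Server: solve the infinite-horizon LQR via scalar Riccati iteration.
p = q
for _ in range(500):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
k_opt = a * b * p / (r + b * b * p)   # optimal feedback u = -k_opt * x

# Server: sample the optimal action over the operating range and fit a cheap
# surrogate policy u = w * x (stand-in for training a small neural network).
xs = np.linspace(-1.0, 1.0, 101)
us = -k_opt * xs
w = np.linalg.lstsq(xs.reshape(-1, 1), us, rcond=None)[0].item()

# Client: evaluates only the distributed lightweight policy in its loop.
x = 1.0
for _ in range(50):
    x = a * x + b * (w * x)

print(abs(x) < 1e-3)  # closed loop converges: prints True
```

The expensive optimization stays on the server, and each client only evaluates the distributed low-complexity policy, mirroring the division of labor the table's "Advanced server-side optimization" row describes.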
for further model refinement during the global model update. This can enforce knowledge-based constraints similar to the approach underlying physics-informed neural networks (PINNs), which seems intractable for embedded hardware; see [245] for more information on physics-based Deep Learning or [246] for physics-informed Machine Learning. FL can also support the development of confidence models, which estimate the reliability of learned federated dynamics based on contextual features. These models provide valuable insights into the confidence level associated with predictions, enabling better decision-making in dynamic environments where uncertainty is prevalent. Integrating FL with confidence modeling enhances the trustworthiness of purely data-driven dynamical system models, contributing to more effective control and decision-making processes. On the downside are again the challenges inherent to FL; see Section II-C. Moreover, non-parametric SysID can only be successfully applied to similar systems. Again, clustered FL techniques, as discussed in [39]–[41], may be helpful here.

2) Controller Design: Furthermore, FL techniques can also be applied to adaptive control problems, where indirect and direct adaptive control methods are explored. Indirect adaptive control involves learning a system model or its parameters from distributed data and deriving a controller based on the acquired model, as exemplified by the neural-network-based PID auto-tuning [235]. Federated Learning can obtain a federated dynamics model, serving as a feedback controller design model for individual clients. The local controller can then be designed using the federated dynamics model or a locally adapted version to ensure high control performance and robustness under local uncertainties. On the downside are again the challenges inherent to FL; see Section II-C. Also, similar control requirements on the client level are necessary to successfully apply indirect adaptive control combined with FL.

On the other hand, direct adaptive control entails learning the client controller directly from its datasets. Federated Learning enables combining these local client controllers in a robust, privacy-preserving way. The resulting global controller model ensures robustness and generalizability in various dynamic settings. For instance, hydraulic excavators can use data-driven servo-compensation models to improve path or trajectory tracking for tasks like leveling or grading, see [247]. This servo-compensation may be equipped with a local learning algorithm, adapting individual machines to various environmental conditions, e.g., the temperature influencing the viscosity of the hydraulic oil. This information can be shared by applying FL to obtain a global servo-compensation model, leading to a general improvement in machine performance. Conversely, we again face challenges intrinsic to FL, as elaborated in Section II-C. Additionally, there are potential stability issues related to the global model.

Another way to use FL, termed advanced server-side optimization, is to employ the computational power of the global server, which is typically much larger than that of the client devices, for advanced control tasks such as optimal control or nonlinear model predictive control (NMPC). Especially for the latter, the hardware requirements for a real-time implementation are still a significant cost factor. In [248]–[250], this hurdle is overcome by optimizing the corresponding operating range already
in the development phase and subsequently training a neural network (NN) to reproduce the solution in a computationally efficient way. Federated Learning enables an extension of this framework towards online adaptations, as the central server can perform the computationally intensive optimization utilizing the federated dynamics and then distribute the solution to the clients, e.g., in the form of neural networks. The global server must have suitable computational hardware to perform this optimization. Also, challenges related to FL and federated SysID, as already mentioned before, must be considered.

3) Multi-Agent Decision-Making: Progressing from system identification through controller design to knowledge transfer, FL holds promise for advancing multi-agent decision-making scenarios by facilitating collaborative decision-making among agents, including both decision-oriented and support-oriented agents, see [242]. In such settings, FL enables agents to learn and share knowledge while preserving the privacy of their local data. Decision-oriented agents responsible for making critical decisions can benefit from FL by leveraging insights from decentralized data sources to improve decision-making accuracy and robustness. Support-oriented agents providing auxiliary functions such as data preprocessing or information aggregation can enhance their capabilities using FL by collectively learning from distributed datasets. Additionally, FL fosters the development of decentralized coordination mechanisms, allowing agents to coordinate their actions efficiently without centralized control. By employing FL techniques, multi-agent systems can adapt and evolve, leveraging the collective intelligence of diverse agents to achieve more effective and resilient models, exemplified in [227] and termed Lifelong Federated Reinforcement Learning. For instance, decision-oriented mobile working machines in unstructured environments may be assisted by support-oriented drones equipped with visual sensors to solve tasks like handling cargo or material transport collaboratively.

Furthermore, FL offers significant potential in multi-agent systems for learning a shared representation of the environment, see [251]. By collaboratively training models on decentralized data sources, FL enables agents to collectively understand the environment, leading to more coherent decision-making and coordination among agents. This becomes particularly relevant in settings like construction sites where numerous stakeholders interact, making privacy-preserving learning and adaptation of shared environmental representations through FL methods of utmost interest.

Moreover, FL can enhance sensor fusion capabilities within multi-agent systems by leveraging data from diverse sensor modalities distributed across agents, see [228], wherein visual data (RGB and depth maps) and semantic segmentation data are combined within an FL setting. This enables the extraction of richer and more comprehensive information about the environment, improving situational awareness and decision-making accuracy.

Additionally, FL facilitates transfer learning and multi-task learning (see [252]) in control by allowing agents to transfer knowledge learned from one environment to another. This is exemplified in [229], where sophisticated simulators are combined with real-life LIDAR data to navigate autonomous model cars in changing indoor environments with obstacles. Similarly, the work [253] introduces an FL architecture for cooperative simultaneous localization and mapping (SLAM) in cloud robotic systems. This adaptability is particularly valuable in dynamic environments where conditions may change over time. By leveraging transfer learning techniques with FL, multi-agent systems can effectively adapt to new scenarios and improve overall performance and robustness. Inherent to the presented applications regarding knowledge transfer, namely multi-agent planning, shared representation learning, sensor fusion, and transfer learning, are the already introduced challenges of FL; see Section II-C.

V. CONCLUSION

This paper presents an overview of state-of-the-art methods and ideas for combining Federated Learning (FL) and control. A detailed literature review reveals a scarcity of research at the intersection of FL and (nonlinear) control. Building on the advancements in distributed control and learning, combining FL and control can improve the system's adaptability and privacy preservation by allowing for decentralized controller updates, privacy-preserving control, and adaptive learning, thereby improving control efficiency, security, and responsiveness.

Practically, FL enriches control methodologies by enhancing system identification, improving dynamical system modeling, and advancing control techniques. Additionally, it facilitates knowledge transfer in multi-agent decision-making via environment representation learning, sensor fusion, and transfer learning, fostering adaptive and resilient control systems.

Apart from individual scientific activities in the synergetic combination of FL and control, several further developments are required. The evident mismatch in the main objectives of FL and control (FL attempts to achieve privacy-preserving global model learning, whereas control usually concentrates on localized optimal performance) may be the main reason for this open research potential; see Section III. Tables II and III, combined with the insights presented in Section IV, provide important guidance and ideas for applying FL to system identification and control. To the authors' knowledge, this work marks the first comprehensive survey of FL and control, laying the groundwork for further research to combine FL and control for collaborative control applications. In essence, FL offers a valuable additional feature set that, when integrated with control, can drive advancements across various domains. However, substantial further research is required to realize the ideas outlined in this work.

REFERENCES

[1] Robert Bosch GmbH. Bodas connect. [Online]. Available: https://www.boschrexroth.com/de/de/transforming-mobile-machines/elektronifizierung-und-iot/bodas-connect/
[2] Beckhoff Automation GmbH. Twincat. [Online]. Available: https://www.beckhoff.com/de-at/produkte/automation/twincat-3-fuer-industrie-4.0/
[3] Y. Sun, J. Liu, J. Wang, Y. Cao, and N. Kato, "When Machine Learning Meets Privacy in 6G: A Survey," IEEE Communications Surveys & Tutorials, vol. 22, no. 4, pp. 2694–2724, 2020.

[4] C. Ma, J. Li, K. Wei, B. Liu, M. Ding, L. Yuan, Z. Han, and H. Vincent Poor, "Trusted AI in Multiagent Systems: An Overview of Privacy and Security for Distributed Learning," Proceedings of the IEEE, vol. 111, no. 9, pp. 1097–1132, 2023.
[5] J. Park, S. Samarakoon, A. Elgabli, J. Kim, M. Bennis, S.-L. Kim, and M. Debbah, "Communication-Efficient and Distributed Learning Over Wireless Networks: Principles and Applications," Proceedings of the IEEE, vol. 109, no. 5, pp. 796–819, 2021.
[6] H. Woisetschläger, A. Erben, B. Marino, S. Wang, N. D. Lane, R. Mayer, and H.-A. Jacobsen, "Federated Learning Priorities Under the European Union Artificial Intelligence Act," arXiv preprint arXiv:2402.05968, 2024.
[7] European Commission. Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL LAYING DOWN HARMONISED RULES ON ARTIFICIAL INTELLIGENCE (ARTIFICIAL INTELLIGENCE ACT) AND AMENDING CERTAIN UNION LEGISLATIVE ACTS - 2021. [Online]. Available: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52021PC0206
[8] Nvidia Corporation. Federated Learning powered by NVIDIA Clara. [Online]. Available: https://developer.nvidia.com/blog/federated-learning-clara/
[9] J. Chen, C. Yi, S. D. Okegbile, J. Cai, and X. S. Shen, "Networking Architecture and Key Supporting Technologies for Human Digital Twin in Personalized Healthcare: A Comprehensive Survey," IEEE Communications Surveys & Tutorials, vol. 26, no. 1, pp. 706–746, 2023.
[10] S. K. Zhou, H. Greenspan, C. Davatzikos, J. S. Duncan, B. Van Ginneken, A. Madabhushi, J. L. Prince, D. Rueckert, and R. M. Summers, "A Review of Deep Learning in Medical Imaging: Imaging Traits, Technology Trends, Case Studies With Progress Highlights, and Future Promises," Proceedings of the IEEE, vol. 109, no. 5, pp. 820–838, 2021.
[11] Nvidia Corporation. Using Federated Learning to Bridge Data Silos in Financial Services. [Online]. Available: https://developer.nvidia.com/blog/using-federated-learning-to-bridge-data-silos-in-financial-services/
[12] G. Long, Y. Tan, J. Jiang, and C. Zhang, "Federated Learning for Open Banking," in Federated Learning: Privacy and Incentive. Cham, Switzerland: Springer Nature, 2020, pp. 240–254.
[13] Y. Wang, F. Zobiri, and G. Deconinck, "Federated Learning for Collaborative Price Prediction and Optimal Trading in the Local Flexibility Market," SSRN, 2023. [Online]. Available: https://www.ssrn.com/abstract=4534499
[14] T. Liu, Z. Wang, H. He, W. Shi, L. Lin, R. An, and C. Li, "Efficient and Secure Federated Learning for Financial Applications," Applied Sciences, vol. 13, no. 10, p. 5877, 2023.
[15] T. Deng, Y. Li, X. Liu, and L. Wang, "Federated learning-based collaborative manufacturing for complex parts," Journal of Intelligent Manufacturing, vol. 34, no. 7, pp. 3025–3038, 2023.
[16] S. Savazzi, M. Nicoli, M. Bennis, S. Kianoush, and L. Barbieri, "Opportunities of Federated Learning in Connected, Cooperative, and Automated Industrial Systems," IEEE Communications Magazine, vol. 59, no. 2, pp. 16–21, 2021.
[17] V. Hegiste, T. Legler, and M. Ruskowski, "Application of Federated Machine Learning in Manufacturing," in International Conference on Industry 4.0 Technology. Pune, Maharashtra, India: IEEE, 2022, pp. 1–8.
[18] D. C. Nguyen, M. Ding, P. N. Pathirana, A. Seneviratne, J. Li, and H. V. Poor, "Federated Learning for Internet of Things: A Comprehensive Survey," IEEE Communications Surveys & Tutorials, vol. 23, no. 3, pp. 1622–1658, 2021.
[19] R. Tallat, A. Hawbani, X. Wang, A. Al-Dubai, L. Zhao, Z. Liu, G. Min, A. Y. Zomaya, and S. H. Alsamhi, "Navigating industry 5.0: A Survey of Key Enabling Technologies, Trends, Challenges, and Opportunities," IEEE Communications Surveys & Tutorials, vol. 26, no. 2, pp. 1080–1126, 2024.
[20] S. Liu, B. Guo, C. Fang, Z. Wang, S. Luo, Z. Zhou, and Z. Yu, "Enabling Resource-Efficient AIoT System With Cross-Level Optimization: A Survey," IEEE Communications Surveys & Tutorials, vol. 26, no. 1, pp. 389–427, 2024.
[21] A. Brecko, E. Kajati, J. Koziorek, and I. Zolotova, "Federated Learning for Edge Computing: A Survey," Applied Sciences, vol. 12, no. 18, p. 9124, 2022.
[22] H. Gu, L. Zhao, Z. Han, G. Zheng, and S. Song, "AI-Enhanced Cloud-Edge-Terminal Collaborative Network: Survey, Applications, and Future Directions," IEEE Communications Surveys & Tutorials, vol. 26, no. 2, pp. 1322–1385, 2024.
[23] Y. Shi, K. Yang, T. Jiang, J. Zhang, and K. B. Letaief, "Communication-Efficient Edge AI: Algorithms and Systems," IEEE Communications Surveys & Tutorials, vol. 22, no. 4, pp. 2167–2191, 2020.
[24] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings et al., "Advances and Open Problems in Federated Learning," Foundations and Trends® in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021.
[25] M. Nokleby, H. Raja, and W. U. Bajwa, "Scaling-Up Distributed Processing of Data Streams for Machine Learning," Proceedings of the IEEE, vol. 108, no. 11, pp. 1984–2012, 2020.
[26] J. Konečnỳ, H. B. McMahan, D. Ramage, and P. Richtárik, "Federated Optimization: Distributed Machine Learning for On-Device Intelligence," arXiv preprint arXiv:1610.02527, 2016.
[27] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, ser. Proceedings of Machine Learning Research, vol. 54. PMLR, 2017, pp. 1273–1282.
[28] B. McMahan and D. Ramage. (2017) Google Research Blog. [Online]. Available: https://blog.research.google/2017/04/federated-learning-collaborative.html
[29] Q. Yang, Y. Liu, T. Chen, and Y. Tong, "Federated Machine Learning: Concept and Applications," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019.
[30] J. Wang, Z. Charles, Z. Xu, G. Joshi, H. B. McMahan, M. Al-Shedivat, G. Andrew, S. Avestimehr, K. Daly, D. Data et al., "A Field Guide to Federated Optimization," arXiv preprint arXiv:2107.06917, 2021.
[31] "IEEE Guide for Architectural Framework and Application of Federated Machine Learning," IEEE Std 3652.1-2020, pp. 1–69, 2021.
[32] Q. Yang, L. Fan, R. Tong, and A. Lv, "IEEE Federated Machine Learning," IEEE Federated Machine Learning - White Paper, pp. 1–18, 2021.
[33] S. J. Reddi, Z. Charles, M. Zaheer, Z. Garrett, K. Rush, J. Konečný, S. Kumar, and H. B. McMahan, "Adaptive Federated Optimization," in 9th International Conference on Learning Representations, Virtual Only Conference, 2021. [Online]. Available: https://openreview.net/forum?id=LkFG3lB13U5
[34] D. Jakovetić, D. Bajović, J. Xavier, and J. M. F. Moura, "Primal–Dual Methods for Large-Scale and Distributed Convex Optimization and Data Analytics," Proceedings of the IEEE, vol. 108, no. 11, pp. 1923–1938, 2020.
[35] G. Koliander, Y. El-Laham, P. M. Djurić, and F. Hlawatsch, "Fusion of Probability Density Functions," Proceedings of the IEEE, vol. 110, no. 4, pp. 404–453, 2022.
[36] A. Mitra, R. Jaafar, G. J. Pappas, and H. Hassani, "Linear Convergence in Federated Learning: Tackling Client Heterogeneity and Sparse Gradients," in Advances in Neural Information Processing Systems, vol. 34. Curran Associates, Inc., 2021, pp. 14606–14619.
[37] F. Hanzely and P. Richtárik, "Federated Learning of a Mixture of Global and Local Models," arXiv preprint arXiv:2002.05516, 2020.
[38] O. A. Wahab, A. Mourad, H. Otrok, and T. Taleb, "Federated Machine Learning: Survey, Multi-Level classification, Desirable Criteria and Future Directions in Communication and Networking Systems," IEEE Communications Surveys & Tutorials, vol. 23, no. 2, pp. 1342–1397, 2021.
[39] F. Sattler, K.-R. Müller, and W. Samek, "Clustered Federated Learning: Model-Agnostic Distributed Multitask Optimization Under Privacy Constraints," IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 8, pp. 3710–3722, 2020.
[40] C. Briggs, Z. Fan, and P. Andras, "Federated learning with hierarchical clustering of local updates to improve training on non-IID data," in International Joint Conference on Neural Networks. Virtual Only Conference: IEEE, 2020, pp. 1–9.
[41] A. Taik, Z. Mlika, and S. Cherkaoui, "Clustered Vehicular Federated Learning: Process and Optimization," IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 12, pp. 25371–25383, 2022.
[42] X. Liu, Y. Deng, A. Nallanathan, and M. Bennis, "Federated Learning and Meta Learning: Approaches, Applications, and Directions," IEEE Communications Surveys & Tutorials, vol. 26, no. 1, pp. 571–618, 2023.
[43] W. Xia, Y. Li, L. Zhang, Z. Wu, and X. Yuan, "A Vertical Federated Learning Framework for Horizontally Partitioned Labels," arXiv preprint arXiv:2106.10056, 2021.
[44] A. Gascón, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, and D. Evans, "Secure Linear Regression on Vertically Partitioned Datasets," IACR Cryptol. ePrint Arch., vol. 2016, p. 892, 2016.

[45] S. Hardy, W. Henecka, H. Ivey-Law, R. Nock, G. Patrini, G. Smith, and B. Thorne, "Private Federated Learning on Vertically Partitioned Data via Entity Resolution and Additively Homomorphic Encryption," arXiv preprint arXiv:1711.10677, 2017.
[46] C. Dwork, "Differential Privacy: A Survey of Results," in Theory and Applications of Models of Computation. Springer Berlin Heidelberg, 2008, pp. 1–19.
[47] A. Acar, H. Aksu, A. S. Uluagac, and M. Conti, "A Survey on Homomorphic Encryption Schemes: Theory and Implementation," ACM Computing Surveys, vol. 51, no. 4, pp. 1–35, 2018.
[48] D. Bogdanov, S. Laur, and J. Willemson, "Sharemind: A Framework for Fast Privacy-Preserving Computations," in Computer Security - ESORICS 2008. Springer Berlin Heidelberg, 2008, pp. 192–206.
[49] H. B. McMahan, D. Ramage, K. Talwar, and L. Zhang, "Learning Differentially Private Recurrent Language Models," in 6th International Conference on Learning Representations, Vancouver, BC, Canada, 2018. [Online]. Available: https://openreview.net/forum?id=BJ0hF1Z0b
[50] S. J. Pan and Q. Yang, "A Survey on Transfer Learning," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2010.
[51] Y. Liu, Y. Kang, C. Xing, T. Chen, and Q. Yang, "A Secure Federated Transfer Learning Framework," IEEE Intelligent Systems, vol. 35, no. 4, pp. 70–82, 2020.
[52] S. Saha and T. Ahmad, "Federated Transfer Learning: Concept and Applications," Intelligenza Artificiale, vol. 15, no. 1, pp. 35–44, 2021.
[53] R. Razavi-Far, B. Wang, M. E. Taylor, and Q. Yang, Federated and Transfer Learning. Cham, Switzerland: Springer, 2022.
[54] E. T. M. Beltrán, M. Q. Pérez, P. M. S. Sánchez, S. L. Bernal, G. Bovet, M. G. Pérez, G. M. Pérez, and A. H. Celdrán, "Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and Challenges," IEEE Communications Surveys & Tutorials, vol. 25, no. 4, pp. 2983–3013, 2023.
[55] A. Fallah, A. Mokhtari, and A. Ozdaglar, "Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach," in Advances in Neural Information Processing Systems, vol. 33. Curran Associates, Inc., 2020, pp. 3557–3568.
[56] D. Xu, T. Li, Y. Li, X. Su, S. Tarkoma, T. Jiang, J. Crowcroft, and P. Hui, "Edge Intelligence: Empowering Intelligence to the Edge of Network," Proceedings of the IEEE, vol. 109, no. 11, pp. 1778–1837, 2021.
[57] L. Liu, J. Zhang, S. Song, and K. B. Letaief, "Client-Edge-Cloud Hierarchical Federated Learning," in 2020 International Conference on Communications (ICC), 2020, pp. 1–6.
[58] J. Zhang and K. B. Letaief, "Mobile Edge Intelligence and Computing for the Internet of Vehicles," Proceedings of the IEEE, vol. 108, no. 2, pp. 246–261, 2020.
[59] L. U. Khan, W. Saad, Z. Han, E. Hossain, and C. S. Hong, "Federated Learning for Internet of Things: Recent Advances, Taxonomy, and Open Challenges," IEEE Communications Surveys & Tutorials, vol. 23, no. 3, pp. 1759–1799, 2021.
[60] V. P. Chellapandi, L. Yuan, C. G. Brinton, S. H. Żak, and Z. Wang, "Federated Learning for Connected and Automated Vehicles: A Survey of Existing Approaches and Challenges," IEEE Transactions on Intelligent Vehicles, vol. 9, no. 1, pp. 119–137, 2023.
[61] W. Y. B. Lim, N. C. Luong, D. T. Hoang, Y. Jiao, Y.-C. Liang, Q. Yang, D. Niyato, and C. Miao, "Federated Learning in Mobile Edge Networks: A Comprehensive Survey," IEEE Communications Surveys & Tutorials, vol. 22, no. 3, pp. 2031–2063, 2020.
[62] T. Li, A. K. Sahu, A. Talwalkar, and V. Smith, "Federated Learning: Challenges, Methods, and Future Directions," IEEE Signal Processing Magazine, vol. 37, no. 3, pp. 50–60, 2020.
[63] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtarik, A. T. Suresh, and D. Bacon, "Federated Learning: Strategies for Improving Communication Efficiency," in NIPS Workshop on Private Multi-Party Machine Learning, 2016. [Online]. Available: https://arxiv.org/abs/1610.05492
[64] W. Luping, W. Wei, and L. Bo, "CMFL: Mitigating Communication Overhead for Federated Learning," in 39th International Conference on Distributed Computing Systems. Dallas, Texas, USA: IEEE, 2019, pp. 954–964.
[65] J. Wang, Q. Liu, H. Liang, G. Joshi, and H. V. Poor, "Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization," in Advances in Neural Information Processing Systems, vol. 33. Curran Associates, Inc., 2020, pp. 7611–7623.
[66] A. Imteaj, K. Mamun Ahmed, U. Thakker, S. Wang, J. Li, and M. H. Amini, Federated and Transfer Learning. Cham, Switzerland: Springer International Publishing, 2023, ch. Federated Learning for Resource-Constrained IoT Devices: Panoramas and State of the Art, pp. 7–27.
[67] A. Imteaj, U. Thakker, S. Wang, J. Li, and M. H. Amini, "A Survey on Federated Learning for Resource-Constrained IoT Devices," IEEE Internet of Things Journal, vol. 9, no. 1, pp. 1–24, 2022.
[68] T. Nishio and R. Yonetani, "Client Selection for Federated Learning with Heterogeneous Resources in Mobile Edge," in 2019 International Conference on Communications (ICC). Shanghai, China: IEEE, 2019, pp. 1–7.
[69] Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra, "Federated Learning with Non-IID Data," arXiv preprint arXiv:1806.00582, 2018.
[70] F. Sattler, S. Wiedemann, K.-R. Müller, and W. Samek, "Robust and Communication-Efficient Federated Learning from Non-IID Data," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 9, pp. 3400–3413, 2019.
[71] K. Pillutla, S. M. Kakade, and Z. Harchaoui, "Robust Aggregation for Federated Learning," IEEE Transactions on Signal Processing, vol. 70, pp. 1142–1154, 2022.
[72] A. Imteaj and M. H. Amini, "FedAR: Activity and Resource-Aware Federated Learning Model for Distributed Mobile Robots," in International Conference on Machine Learning and Applications. Miami, Florida, USA: IEEE, 2020, pp. 1153–1160.
[73] R. Gosselin, L. Vieu, F. Loukil, and A. Benoit, "Privacy and Security in Federated Learning: A Survey," Applied Sciences, vol. 12, no. 19, 2022.
[74] Y. Zhan, P. Li, S. Guo, and Z. Qu, "Incentive Mechanism Design for Federated Learning: Challenges and Opportunities," IEEE Network, vol. 35, no. 4, pp. 310–317, 2021.
[75] T. Zeng, O. Semiari, M. Chen, W. Saad, and M. Bennis, "Federated Learning on the Road Autonomous Controller Design for Connected and Autonomous Vehicles," IEEE Transactions on Wireless Communications, vol. 21, no. 12, pp. 10407–10423, 2022.
[76] M. Xu, J. Peng, B. B. Gupta, J. Kang, Z. Xiong, Z. Li, and A. A. A. El-Latif, "Multiagent Federated Reinforcement Learning for Secure Incentive Mechanism in Intelligent Cyber–Physical Systems," IEEE Internet of Things Journal, vol. 9, no. 22, pp. 22095–22108, 2022.
[77] Y. Deng, M. M. Kamani, and M. Mahdavi, "Adaptive Personalized Federated Learning," 2021. [Online]. Available: https://openreview.net/forum?id=g0a-XYjpQ7r
[78] Y. Jiang, J. Konečnỳ, K. Rush, and S. Kannan, "Improving Federated Learning Personalization via Model Agnostic Meta Learning," 2020. [Online]. Available: https://openreview.net/forum?id=BkeaEyBYDB
[79] Y. Mansour, M. Mohri, J. Ro, and A. T. Suresh, "Three Approaches for Personalization with Applications to Federated Learning," arXiv preprint arXiv:2002.10619, 2020.
[80] K. Wang, R. Mathews, C. Kiddon, H. Eichner, F. Beaufays, and D. Ramage, "Federated Evaluation of On-device Personalization," arXiv preprint arXiv:1910.10252, 2019.
[81] D. D. Siljak, Decentralized Control of Complex Systems. North Chelmsford, Massachusetts, US: Courier Corporation, 2011.
[82] G. E. Dullerud and R. D'Andrea, "Distributed control of heterogeneous systems," IEEE Transactions on Automatic Control, vol. 49, no. 12, pp. 2113–2128, 2004.
[83] E. Tse, C. Chong, and S. Mori, "Distributed Control for Linear Systems," in 1983 American Control Conference. San Francisco, California, USA: IEEE, 1983, pp. 1116–1120.
[84] R. D'Andrea and G. E. Dullerud, "Distributed control design for spatially interconnected systems," IEEE Transactions on Automatic Control, vol. 48, no. 9, pp. 1478–1495, 2003.
[85] D. Trentesaux, "Distributed control of production systems," Engineering Applications of Artificial Intelligence, vol. 22, no. 7, pp. 971–978, 2009.
[86] G. Baggio, D. S. Bassett, and F. Pasqualetti, "Data-driven control of complex networks," Nature Communications, vol. 12, no. 1429, 2021.
[87] Y.-Y. Liu, J.-J. Slotine, and A.-L. Barabási, "Controllability of complex networks," Nature, vol. 473, pp. 167–173, 2011.
[88] G. Antonelli, "Interconnected dynamic systems: An overview on distributed control," IEEE Control Systems Magazine, vol. 33, no. 1, pp. 76–88, 2013.
[89] Y. Cao, W. Yu, W. Ren, and G. Chen, "An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination," IEEE Transactions on Industrial Informatics, vol. 9, no. 1, pp. 427–438, 2013.
[90] J. N. Tsitsiklis, "Problems in decentralized decision making and computation," Ph.D. dissertation, Massachusetts Institute of Technology, 1984.

[91] J. Tsitsiklis, D. Bertsekas, and M. Athans, "Distributed asynchronous deterministic and stochastic gradient optimization algorithms," IEEE Transactions on Automatic Control, vol. 31, no. 9, pp. 803–812, 1986.
[92] R. M. Murray, "Recent Research in Cooperative Control of Multivehicle Systems," Journal of Dynamic Systems, Measurement, and Control, vol. 129, no. 5, pp. 571–583, 2007.
[93] L. Bakule, "Decentralized control: An overview," Annual Reviews in Control, vol. 32, no. 1, pp. 87–98, 2008.
[94] Q. Zhou, M. Shahidehpour, A. Paaso, S. Bahramirad, A. Alabdulwahab, and A. Abusorrah, "Distributed Control and Communication Strategies in Networked Microgrids," IEEE Communications Surveys & Tutorials, vol. 22, no. 4, pp. 2586–2633, 2020.
[95] J. Zhao and F. Dörfler, "Distributed control and optimization in DC microgrids," Automatica, vol. 61, pp. 18–26, 2015.
[96] D. K. Molzahn, F. Dörfler, H. Sandberg, S. H. Low, S. Chakrabarti, R. Baldick, and J. Lavaei, "A Survey of Distributed Optimization and Control Algorithms for Electric Power Systems," IEEE Transactions on Smart Grid, vol. 8, no. 6, pp. 2941–2962, 2017.
[97] F. Guo, C. Wen, and Y.-D. Song, Distributed control and optimization technologies in smart grid systems. Boca Raton, Florida: CRC Press, 2017.
[98] H. Khalil and P. v. Kokotovic, "Control strategies for decision makers using different models of the same system," IEEE Transactions on Automatic Control, vol. 23, no. 2, pp. 289–298, 1978.
[99] D. Kalathil and R. Rajagopal, "Online learning for demand response," in 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton). Monticello, Illinois, United States: IEEE, 2015, pp. 218–222.
[100] F. L. Lewis, H. Zhang, K. Hengster-Movric, and A. Das, Cooperative Control of Multi-Agent Systems: Optimal and Adaptive Design Approaches. London, Great Britain: Springer London, 2013.
[101] Y. Tang, Z. Ren, and N. Li, "Zeroth-order feedback optimization for cooperative multi-agent systems," Automatica, vol. 148, p. 110741, 2023.
[102] G. Wang, W. Yao, X. Zhang, and Z. Li, "A Mean-Field Game Control for Large-Scale Swarm Formation Flight in Dense Environments," Sensors, vol. 22, no. 14, p. 5437, 2022.
[103] A. G. Mutambara and H. Durrant-Whyte, "Distributed decentralized robot control," in Proceedings of 1994 American Control Conference -
cation on Interconnected Cruise Control of Intelligent Vehicles," IEEE Transactions on Intelligent Vehicles, vol. 8, no. 2, pp. 1874–1888, 2023.
[114] L. Wang and A. S. Morse, "A Distributed Observer for a Time-Invariant Linear System," IEEE Transactions on Automatic Control, vol. 63, no. 7, pp. 2123–2130, 2018.
[115] A. Mitra and S. Sundaram, "Distributed Observers for LTI Systems," IEEE Transactions on Automatic Control, vol. 63, no. 11, pp. 3689–3704, 2018.
[116] S. Park and N. C. Martins, "Design of Distributed LTI Observers for State Omniscience," IEEE Transactions on Automatic Control, vol. 62, no. 2, pp. 561–576, 2017.
[117] T. Kim, H. Shim, and D. D. Cho, "Distributed Luenberger observer design," in 2016 IEEE 55th Conference on Decision and Control (CDC). Las Vegas, Nevada, USA: IEEE, 2016, pp. 6928–6933.
[118] U. A. Khan and A. Jadbabaie, "On the stability and optimality of distributed kalman filters with finite-time data fusion," in Proceedings of the 2011 American Control Conference. San Francisco, California, USA: IEEE, 2011, pp. 3405–3410.
[119] Z.-S. Hou and Z. Wang, "From model-based control to data-driven control: Survey, classification and perspective," Information Sciences, vol. 235, pp. 3–35, 2013.
[120] L. Ljung, System Identification: Theory for the User, 2nd ed. Upper Saddle River, New Jersey, USA: Prentice Hall, 1991.
[121] A. Chiuso and G. Pillonetto, "System Identification: A Machine Learning Perspective," Annual Review of Control, Robotics, and Autonomous Systems, vol. 2, pp. 281–304, 2019.
[122] O. Nelles, Nonlinear Dynamic System Identification. Heidelberg, Germany: Springer Berlin Heidelberg, 2020.
[123] J. Umenberger, J. Wågberg, I. R. Manchester, and T. B. Schön, "Maximum likelihood identification of stable linear dynamical systems," Automatica, vol. 96, pp. 280–292, 2018.
[124] M. K. S. Faradonbeh, A. Tewari, and G. Michailidis, "Finite time identification in unstable linear systems," Automatica, vol. 96, pp. 342–353, 2018.
[125] T. Sarkar and A. Rakhlin, "Near optimal finite time identification of arbitrary linear dynamical systems," in Proceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 97. PMLR, 2019, pp. 5610–5618.
[126] M. Simchowitz, H. Mania, S. Tu, M. I. Jordan, and B. Recht, "Learning
ACC ’94, vol. 2. Baltimore, MD, USA: IEEE, 1994, pp. 2266–2267. Without Mixing: Towards A Sharp Analysis of Linear System Identi-
[104] D. M. Stipanović, G. Inalhan, R. Teo, and C. J. Tomlin, “Decentralized fication,” in Proceedings of the 31st Conference On Learning Theory,
overlapping control of a formation of unmanned aerial vehicles,” ser. Proceedings of Machine Learning Research, vol. 75. PMLR, 2018,
Automatica, vol. 40, no. 8, pp. 1285–1296, 2004. pp. 439–473.
[105] O. Demir and J. Lunze, “Cooperative control of multi-agent systems [127] S. Oymak and N. Ozay, “Non-asymptotic Identification of LTI Systems
with event-based communication,” in 2012 American Control Confer- from a Single Trajectory,” in 2019 American Control Conference
ence (ACC). Montreal, Québec, Candada: IEEE, 2012, pp. 4504–4509. (ACC). Philadelphia, Pennsylvania, USA: IEEE, 2019, pp. 5655–
[106] S. E. Li, Y. Zheng, K. Li, Y. Wu, J. K. Hedrick, F. Gao, and H. Zhang, 5661.
“Dynamical Modeling and Distributed Control of Connected and [128] L. Xin, L. Ye, G. Chiu, and S. Sundaram, “Identifying the Dynamics of
Automated Vehicles: Challenges and Opportunities,” IEEE Intelligent a System by Leveraging Data from Similar Systems,” in 2022 American
Transportation Systems Magazine, vol. 9, no. 3, pp. 46–58, 2017. Control Conference (ACC). Atlanta, Georgia, USA: IEEE, 2022, pp.
[107] S. E. Li, X. Qin, Y. Zheng, J. Wang, K. Li, and H. Zhang, “Distributed 818–824.
Platoon Control Under Topologies With Complex Eigenvalues: Stabil- [129] S. Tu, R. Frostig, and M. Soltanolkotabi, “Learning from many
ity Analysis and Controller Synthesis,” IEEE Transactions on Control trajectories,” arXiv preprint arXiv:2203.17193, 2022.
Systems Technology, vol. 27, no. 1, pp. 206–220, 2019. [130] E. Hazan, K. Singh, and C. Zhang, “Learning Linear Dynamical
[108] Y. Zheng, S. E. Li, K. Li, F. Borrelli, and J. K. Hedrick, “Distributed Systems via Spectral Filtering,” in Advances in Neural Information
Model Predictive Control for Heterogeneous Vehicle Platoons Under Processing Systems, vol. 30. Curran Associates, Inc., 2017.
Unidirectional Topologies,” IEEE Transactions on Control Systems [131] E. Hazan, H. Lee, K. Singh, C. Zhang, and Y. Zhang, “Spectral
Technology, vol. 25, no. 3, pp. 899–910, 2017. filtering for general linear dynamical systems,” in Advances in Neural
[109] M. Noor-A-Rahim, Z. Liu, H. Lee, M. O. Khyam, J. He, D. Pesch, Information Processing Systems, vol. 31, 2018.
K. Moessner, W. Saad, and H. V. Poor, “6G for Vehicle-to-Everything [132] L. Xin, G. Chiu, and S. Sundaram, “Learning Linearized Models from
(V2X) Communications: Enabling Technologies, Challenges, and Op- Nonlinear Systems with Finite Data,” in 2023 62nd IEEE Conference
portunities,” Proceedings of the IEEE, vol. 110, no. 6, pp. 712–734, on Decision and Control (CDC). Marina Bay Sands, Singapore: IEEE,
2022. 2023, pp. 2477–2482.
[110] Z. Wang, Y. Bian, S. E. Shladover, G. Wu, S. E. Li, and M. J. Barth, [133] J. Schoukens and L. Ljung, “Nonlinear System Identification: A User-
“A Survey on Cooperative Longitudinal Motion Control of Multiple Oriented Road Map,” IEEE Control Systems Magazine, vol. 39, no. 6,
Connected and Automated Vehicles,” IEEE Intelligent Transportation pp. 28–99, 2019.
Systems Magazine, vol. 12, no. 1, pp. 4–24, 2020. [134] G. Pillonetto, T. Chen, A. Chiuso, G. De Nicolao, and L. Ljung,
[111] O. Demir and J. Lunze, “A decomposition approach to decentralized Regularized System Identification: Learning Dynamic Models from
and distributed control of spatially interconnected systems,” IFAC Data. Basel, Switzerland: Springer International Publishing, 2022.
Proceedings Volumes, vol. 44, no. 1, pp. 9109–9114, 2011, 18th IFAC [135] T. Beckers, J. Seidman, P. Perdikaris, and G. J. Pappas, “Gaussian
World Congress. process port-hamiltonian systems: Bayesian learning with physics
[112] L. V. Gambuzza and M. Frasca, “Distributed Control of Multicon- prior,” in 2022 IEEE 61st Conference on Decision and Control (CDC),
sensus,” IEEE Transactions on Automatic Control, vol. 66, no. 5, pp. 2022, pp. 1447–1453.
2032–2044, 2021. [136] S. M. Khansari-Zadeh and A. Billard, “Learning Stable Nonlinear Dy-
[113] H. Xu, S. Liu, B. Wang, and J. Wang, “Distributed-Observer-Based namical Systems With Gaussian Mixture Models,” IEEE Transactions
Distributed Control Law for Affine Nonlinear Systems and Its Appli- on Robotics, vol. 27, no. 5, pp. 943–957, 2011.

16
[137] E. Kaiser, J. N. Kutz, and S. L. Brunton, "Sparse identification of nonlinear dynamics for model predictive control in the low-data limit," Proceedings of the Royal Society A, vol. 474, p. 20180335, 2018.
[138] S. L. Brunton, J. L. Proctor, and J. N. Kutz, "Sparse Identification of Nonlinear Dynamics with Control (SINDYc)," IFAC-PapersOnLine, vol. 49, no. 18, pp. 710–715, 2016, 10th IFAC Symposium on Nonlinear Control Systems.
[139] G. Pillonetto, A. Aravkin, D. Gedon, L. Ljung, A. H. Ribeiro, and T. B. Schön, "Deep networks for system identification: a survey," arXiv preprint arXiv:2301.12832, 2023.
[140] C. Legaard, T. Schranz, G. Schweiger, J. Drgoňa, B. Falay, C. Gomes, A. Iosifidis, M. Abkar, and P. Larsen, "Constructing Neural Network Based Models for Simulating Dynamical Systems," ACM Computing Surveys, vol. 55, no. 11, pp. 1–34, 2023.
[141] J. Sjöberg, H. Hjalmarsson, and L. Ljung, "Neural Networks in System Identification," IFAC Proceedings Volumes, vol. 27, no. 8, pp. 359–382, 1994, IFAC Symposium on System Identification.
[142] K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Transactions on Neural Networks, vol. 1, no. 1, pp. 4–27, 1990.
[143] K. J. Hunt, D. Sbarbaro, R. Żbikowski, and P. J. Gawthrop, "Neural Networks for Control Systems — A Survey," Automatica, vol. 28, no. 6, pp. 1083–1112, 1992.
[144] A. Achille and S. Soatto, "A Separation Principle for Control in the Age of Deep Learning," Annual Review of Control, Robotics, and Autonomous Systems, vol. 1, pp. 287–307, 2018.
[145] B. Lusch, J. N. Kutz, and S. L. Brunton, "Deep learning for universal linear embeddings of nonlinear dynamics," Nature Communications, vol. 9, p. 4950, 2018.
[146] J. M. Zamarreño and P. Vega, "State space neural network. Properties and application," Neural networks, vol. 11, no. 6, pp. 1099–1112, 1998.
[147] D. Masti and A. Bemporad, "Learning nonlinear state–space models using autoencoders," Automatica, vol. 129, p. 109666, 2021.
[148] G. Beintema, R. Toth, and M. Schoukens, "Nonlinear state-space identification using deep encoder networks," in Proceedings of the 3rd Conference on Learning for Dynamics and Control, ser. Proceedings of Machine Learning Research, vol. 144. PMLR, 2021, pp. 241–250.
[149] E. Skomski, J. Drgoňa, and A. Tuor, "Automating Discovery of Physics-Informed Neural State Space Models via Learning and Evolution," in Proceedings of the 3rd Conference on Learning for Dynamics and Control, ser. Proceedings of Machine Learning Research, vol. 144. PMLR, 2021, pp. 980–991.
[150] A. Chakrabarty, "Optimizing Closed-Loop Performance with Data from Similar Systems: A Bayesian Meta-Learning Approach," in 2022 IEEE 61st Conference on Decision and Control (CDC). Cancun, Mexico: IEEE, 2022, pp. 130–136.
[151] T. Lew, A. Sharma, J. Harrison, A. Bylard, and M. Pavone, "Safe Active Dynamics Learning and Control: A Sequential Exploration–Exploitation Framework," IEEE Transactions on Robotics, vol. 38, no. 5, pp. 2888–2907, 2022.
[152] A. Gu, K. Goel, and C. Ré, "Efficiently Modeling Long Sequences with Structured State Spaces," arXiv preprint arXiv:2111.00396, 2021.
[153] ——, "Efficiently Modeling Long Sequences with Structured State Spaces," in Poster presented at 10th International Conference on Learning Representations, Virtual Only Conference, 2022.
[154] A. Gu and T. Dao, "Mamba: Linear-Time Sequence Modeling with Selective State Spaces," 2024. [Online]. Available: https://openreview.net/forum?id=AL1fq05o7H
[155] T. Lesort, N. Díaz-Rodríguez, J.-F. Goudou, and D. Filliat, "State representation learning for control: An overview," Neural Networks, vol. 108, pp. 379–392, 2018.
[156] T.-J. Chang and S. Shahrampour, "Distributed Online System Identification for LTI Systems Using Reverse Experience Replay," in 2022 IEEE 61st Conference on Decision and Control (CDC). Cancun, Mexico: IEEE, 2022, pp. 6672–6677.
[157] L. F. Toso, H. Wang, and J. Anderson, "Learning Personalized Models with Clustered System Identification," in 2023 62nd IEEE Conference on Decision and Control (CDC). Marina Bay Sands, Singapore: IEEE, 2023, pp. 7162–7169.
[158] Y. Chen, A. M. Ospina, F. Pasqualetti, and E. Dall'Anese, "Multi-Task System Identification of Similar Linear Time-Invariant Dynamical Systems," in 2023 62nd IEEE Conference on Decision and Control (CDC). Marina Bay Sands, Singapore: IEEE, 2023, pp. 7342–7349.
[159] S. Formentin and A. Chiuso, "Control-oriented regularization for linear system identification," Automatica, vol. 127, p. 109539, 2021.
[160] Y. Zheng and N. Li, "Non-Asymptotic Identification of Linear Dynamical Systems Using Multiple Trajectories," IEEE Control Systems Letters, vol. 5, no. 5, pp. 1693–1698, 2021.
[161] M. Simchowitz, R. Boczar, and B. Recht, "Learning Linear Dynamical Systems with Semi-Parametric Least Squares," in Proceedings of the Thirty-Second Conference on Learning Theory, ser. Proceedings of Machine Learning Research, vol. 99. PMLR, 2019, pp. 2714–2802.
[162] L. Xin, L. Ye, G. Chiu, and S. Sundaram, "Learning dynamical systems by leveraging data from similar systems," arXiv preprint arXiv:2302.04344, 2023, submitted to IEEE Transactions on Automatic Control.
[163] A. Modi, M. K. S. Faradonbeh, A. Tewari, and G. Michailidis, "Joint learning of linear time-invariant dynamical systems," Automatica, vol. 164, p. 111635, 2024.
[164] M. Khosravi, A. Eichler, N. Schmid, R. S. Smith, and P. Heer, "Controller Tuning by Bayesian Optimization An Application to a Heat Pump," in 2019 18th European Control Conference (ECC). Naples, Italy: IEEE, 2019, pp. 1467–1472.
[165] Y. Bao, J. M. Velni, A. Basina, and M. Shahbakhti, "Identification of State-space Linear Parameter-varying Models Using Artificial Neural Networks," IFAC-PapersOnLine, vol. 53, no. 2, pp. 5286–5291, 2020, 21st IFAC World Congress.
[166] M. K. Helwa and A. P. Schoellig, "Multi-robot transfer learning: A dynamical system perspective," in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Vancouver, Canada: IEEE, 2017, pp. 4702–4708.
[167] M. Forgione and D. Piga, "dynoNet: A neural network architecture for learning dynamical systems," International Journal of Adaptive Control and Signal Processing, vol. 35, no. 4, pp. 612–626, 2021.
[168] E. Arcari, A. Scampicchio, A. Carron, and M. N. Zeilinger, "Bayesian multi-task learning using finite-dimensional models: A comparative study," in 2021 60th IEEE Conference on Decision and Control (CDC). Austin, Texas, USA: IEEE, 2021, pp. 2218–2225.
[169] C. D. McKinnon and A. P. Schoellig, "Meta Learning With Paired Forward and Inverse Models for Efficient Receding Horizon Control," IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 3240–3247, 2021.
[170] S. Zhan, G. Wichern, C. Laughman, A. Chong, and A. Chakrabarty, "Calibrating building simulation models using multi-source datasets and meta-learned Bayesian optimization," Energy and Buildings, vol. 270, p. 112278, 2022.
[171] K. Hunt and D. Sbarbaro, "Neural networks for nonlinear internal model control," in IEE Proceedings D (Control Theory and Applications), vol. 138, no. 5. IET, 1991, pp. 431–438.
[172] E. Grant and B. Zhang, "A neural-net approach to supervised learning of pole balancing," in Proceedings. IEEE International Symposium on Intelligent Control 1989. IEEE, 1989, pp. 123–129.
[173] C. De Persis and P. Tesi, "Formulas for Data-Driven Control: Stabilization, Optimality, and Robustness," IEEE Transactions on Automatic Control, vol. 65, no. 3, pp. 909–924, 2020.
[174] ——, "Learning controllers for nonlinear systems from data," Annual Reviews in Control, p. 100915, 2023.
[175] F. Dörfler, J. Coulson, and I. Markovsky, "Bridging Direct and Indirect Data-Driven Control Formulations via Regularizations and Relaxations," IEEE Transactions on Automatic Control, vol. 68, no. 2, pp. 883–897, 2023.
[176] V. Krishnan and F. Pasqualetti, "On Direct vs Indirect Data-Driven Predictive Control," in 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021, pp. 736–741.
[177] Y. Ouyang, M. Gagrani, and R. Jain, "Control of unknown linear systems with Thompson sampling," in 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). Monticello, Illinois, United States: IEEE, 2017, pp. 1198–1205.
[178] A. Cohen, A. Hasidim, T. Koren, N. Lazic, Y. Mansour, and K. Talwar, "Online Linear Quadratic Control," in Proceedings of the 35th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 80. PMLR, 2018, pp. 1029–1038.
[179] A. Cohen, T. Koren, and Y. Mansour, "Learning Linear-Quadratic Regulators Efficiently with only √T Regret," in Proceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 97. PMLR, 2019, pp. 1300–1309.
[180] S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu, "Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator," in Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc., 2018.
[181] S. Dean, S. Tu, N. Matni, and B. Recht, "Safely Learning to Control the Constrained Linear Quadratic Regulator," in 2019 American Control Conference (ACC). Philadelphia, Pennsylvania, USA: IEEE, 2019, pp. 5582–5588.
[182] S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu, "On the Sample Complexity of the Linear Quadratic Regulator," Foundations of Computational Mathematics, vol. 20, no. 4, pp. 633–679, 2020.
[183] M. Ferizbegovic, J. Umenberger, H. Hjalmarsson, and T. B. Schön, "Learning robust LQ-controllers using application oriented exploration," IEEE Control Systems Letters, vol. 4, no. 1, pp. 19–24, 2019.
[184] N. Agarwal, E. Hazan, and K. Singh, "Logarithmic Regret for Online Control," in Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc., 2019.
[185] N. Agarwal, B. Bullins, E. Hazan, S. Kakade, and K. Singh, "Online Control with Adversarial Disturbances," in Proceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 97. PMLR, 2019, pp. 111–119.
[186] E. Hazan, S. Kakade, and K. Singh, "The Nonstochastic Control Problem," in Proceedings of the 31st International Conference on Algorithmic Learning Theory, ser. Proceedings of Machine Learning Research, vol. 117. PMLR, 2020, pp. 408–421.
[187] M. Simchowitz, "Making Non-Stochastic Control (Almost) as Easy as Stochastic," Advances in Neural Information Processing Systems, vol. 33, pp. 18 318–18 329, 2020.
[188] M. Simchowitz, K. Singh, and E. Hazan, "Improper Learning for Non-Stochastic Control," in Proceedings of Thirty Third Conference on Learning Theory, ser. Proceedings of Machine Learning Research, vol. 125. PMLR, 2020, pp. 3320–3436.
[189] X. Chen and E. Hazan, "Black-Box Control for Linear Dynamical Systems," in Proceedings of Thirty Fourth Conference on Learning Theory, ser. Proceedings of Machine Learning Research, vol. 134. PMLR, 2021, pp. 1114–1143.
[190] A. Agrawal, S. Barratt, S. Boyd, and B. Stellato, "Learning Convex Optimization Control Policies," in Proceedings of the 2nd Conference on Learning for Dynamics and Control, ser. Proceedings of Machine Learning Research, vol. 120. PMLR, 2020, pp. 361–373.
[191] Y. Zheng, L. Furieri, M. Kamgarpour, and N. Li, "Sample Complexity of Linear Quadratic Gaussian (LQG) Control for Output Feedback Systems," in Proceedings of the 3rd Conference on Learning for Dynamics and Control, ser. Proceedings of Machine Learning Research, vol. 144. PMLR, 2021, pp. 559–570.
[192] L. Furieri, Y. Zheng, A. Papachristodoulou, and M. Kamgarpour, "An Input–Output Parametrization of Stabilizing Controllers: Amidst Youla and System Level Synthesis," IEEE Control Systems Letters, vol. 3, no. 4, pp. 1014–1019, 2019.
[193] S. Lale, K. Azizzadenesheli, B. Hassibi, and A. Anandkumar, "Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting," in 2021 American Control Conference (ACC). Virtual Only Conference: IEEE, 2021, pp. 2517–2522.
[194] ——, "Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems," in Advances in Neural Information Processing Systems, vol. 33. Curran Associates, Inc., 2020, pp. 20 876–20 888.
[195] F. Sun, Z. Sun, P. Woo, and Z. Zhang, "Neural adaptive controller design of robotic manipulators with an observer," IFAC Proceedings Volumes, vol. 32, no. 2, pp. 5485–5490, 1999, 14th IFAC World Congress 1999, Beijing, China, 5-9 July.
[196] M. K. Ciliz and C. Işik, "On-line learning control of manipulators based on artificial neural network models," Robotica, vol. 15, no. 3, pp. 293–304, 1997.
[197] Y. Jin, T. Pipe, and A. Winfield, "Stable neural network control for manipulators," Intelligent Systems Engineering, vol. 2, no. 4, pp. 213–222, 1993.
[198] R. M. Sanner and J.-J. E. Slotine, "Gaussian Networks for Direct Adaptive Control," in 1991 American Control Conference. Boston, Massachusetts, USA: IEEE, 1991, pp. 2153–2159.
[199] G. Joshi and G. Chowdhary, "Adaptive Control using Gaussian-Process with Model Reference Generative Network," in 2018 IEEE Conference on Decision and Control (CDC). Miami Beach, FL, USA: IEEE, 2018, pp. 237–243.
[200] ——, "Deep Model Reference Adaptive Control," in 2019 IEEE 58th Conference on Decision and Control (CDC). Nice, France: IEEE, 2019, pp. 4601–4608.
[201] D. Hanover, P. Foehn, S. Sun, E. Kaufmann, and D. Scaramuzza, "Performance, Precision, and Payloads: Adaptive Nonlinear MPC for Quadrotors," IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 690–697, 2022.
[202] N. Boffi, S. Tu, N. Matni, J.-J. Slotine, and V. Sindhwani, "Learning Stability Certificates from Data," in Proceedings of the 2020 Conference on Robot Learning, ser. Proceedings of Machine Learning Research, vol. 155. PMLR, 2021, pp. 1341–1350.
[203] D. Nguyen-Tuong, J. Peters, and M. Seeger, "Local Gaussian Process Regression for Real Time Online Model Learning," in Advances in Neural Information Processing Systems, vol. 21. Curran Associates, Inc., 2008.
[204] J. Harrison, A. Sharma, R. Calandra, and M. Pavone, "Control adaptation via meta-learning dynamics," in 2nd Workshop on Meta-Learning at NeurIPS, 2018.
[205] J. Harrison, A. Garg, B. Ivanovic, Y. Zhu, S. Savarese, L. Fei-Fei, and M. Pavone, "AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems," in Robotics Research. Cham, Switzerland: Springer International Publishing, 2020, pp. 437–453.
[206] S. M. Richards, N. Azizan, J.-J. Slotine, and M. Pavone, "Adaptive-control-oriented meta-learning for nonlinear systems," in Robotics: Science and Systems, Virtual Only Conference, 2021. [Online]. Available: https://roboticsconference.org/2021/program/papers/056/index.html
[207] ——, "Control-oriented meta-learning," The International Journal of Robotics Research, vol. 42, no. 10, pp. 777–797, 2023.
[208] D. Muthirayan, D. Kalathil, and P. P. Khargonekar, "Meta-Learning Online Control for Linear Dynamical Systems," in 2022 IEEE 61st Conference on Decision and Control (CDC). Cancun, Mexico: IEEE, 2022, pp. 1435–1440.
[209] L. Li, C. De Persis, P. Tesi, and N. Monshizadeh, "Data-Based Transfer Stabilization in Linear Systems," IEEE Transactions on Automatic Control, vol. 69, no. 3, pp. 1866–1873, 2024.
[210] J. Berberich, C. W. Scherer, and F. Allgöwer, "Combining Prior Knowledge and Data for Robust Controller Design," IEEE Transactions on Automatic Control, vol. 68, no. 8, pp. 4618–4633, 2023.
[211] G. Shi, K. Azizzadenesheli, M. O'Connell, S.-J. Chung, and Y. Yue, "Meta-Adaptive Nonlinear Control: Theory and Algorithms," in Advances in Neural Information Processing Systems, vol. 34. Curran Associates, Inc., 2021, pp. 10 013–10 025.
[212] T. T. Zhang, K. Kang, B. D. Lee, C. Tomlin, S. Levine, S. Tu, and N. Matni, "Multi-Task Imitation Learning for Linear Dynamical Systems," in Proceedings of The 5th Annual Learning for Dynamics and Control Conference, ser. Proceedings of Machine Learning Research, vol. 211. PMLR, 2023, pp. 586–599.
[213] Y. Sun, W. L. Ubellacker, W.-L. Ma, X. Zhang, C. Wang, N. V. Csomay-Shanklin, M. Tomizuka, K. Sreenath, and A. D. Ames, "Online Learning of Unknown Dynamics for Model-Based Controllers in Legged Locomotion," IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 8442–8449, 2021.
[214] A. Rajeswaran, S. Ghotra, B. Ravindran, and S. Levine, "EPOpt: Learning Robust Neural Network Policies Using Model Ensembles," in 5th International Conference on Learning Representations. Toulon, France: OpenReview.net, 2017. [Online]. Available: https://openreview.net/forum?id=SyWvgP5el
[215] W. Yu, J. Tan, Y. Bai, E. Coumans, and S. Ha, "Learning Fast Adaptation With Meta Strategy Optimization," IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 2950–2957, 2020.
[216] C. Devin, A. Gupta, T. Darrell, P. Abbeel, and S. Levine, "Learning modular neural network policies for multi-task and multi-robot transfer," in 2017 IEEE International Conference on Robotics and Automation (ICRA). Marina Bay Sands, Singapore: IEEE, 2017, pp. 2169–2176.
[217] L. F. Toso, D. Zhan, J. Anderson, and H. Wang, "Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for the Model-free LQR," arXiv preprint arXiv:2401.14534, 2024.
[218] H. Wang, L. F. Toso, and J. Anderson, "FedSysID: A Federated Approach to Sample-Efficient System Identification," in Proceedings of The 5th Annual Learning for Dynamics and Control Conference, ser. Proceedings of Machine Learning Research, vol. 211. PMLR, 2023, pp. 1308–1320.
[219] I. A. Azzollini, M. Bin, L. Marconi, and T. Parisini, "Robust and scalable distributed recursive least squares," Automatica, vol. 158, p. 111265, 2023.
[220] Z. Ren, A. Zhong, and N. Li, "LQR with Tracking: A Zeroth-order Approach and Its Global Convergence," in 2021 American Control Conference (ACC), Virtual Only Conference, 2021, pp. 2562–2568.
[221] H. Wang, L. F. Toso, A. Mitra, and J. Anderson, "Model-free Learning with Heterogeneous Dynamical Systems: A Federated LQR Approach," arXiv preprint arXiv:2308.11743, 2023.
[222] T. Zeng, O. Semiari, M. Chen, W. Saad, and M. Bennis, "Federated Learning for Collaborative Controller Design of Connected and Autonomous Vehicles," in 2021 60th IEEE Conference on Decision and Control (CDC). Austin, Texas, USA: IEEE, 2021, pp. 5033–5038.
[223] L. Wang, K. Zhang, A. Zhou, M. Simchowitz, and R. Tedrake, "Robot Fleet Learning via Policy Merging," in 12th International Conference on Learning Representations, Vienna, Austria, 2024.
[224] H. Shiri, J. Park, and M. Bennis, "Communication-Efficient Massive UAV Online Path Control: Federated Learning Meets Mean-Field Game Theory," IEEE Transactions on Communications, vol. 68, no. 11, pp. 6840–6857, 2020.
[225] M. Nakanoya, J. Im, H. Qiu, S. Katti, M. Pavone, and S. Chinchali, "Personalized federated learning of driver prediction models for autonomous driving," arXiv preprint arXiv:2112.00956, 2021.
[226] H.-K. Lim, J.-B. Kim, J.-S. Heo, and Y.-H. Han, "Federated Reinforcement Learning for Training Control Policies on Multiple IoT Devices," Sensors, vol. 20, no. 5, p. 1359, 2020.
[227] B. Liu, L. Wang, and M. Liu, "Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems," IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 4555–4562, 2019.
[228] B. Liu, L. Wang, M. Liu, and C.-Z. Xu, "Federated Imitation Learning: A Novel Framework for Cloud Robotic Systems With Heterogeneous Sensor Data," IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 3509–3516, 2020.
[229] X. Liang, Y. Liu, T. Chen, M. Liu, and Q. Yang, "Federated transfer reinforcement learning for autonomous driving," in Federated and Transfer Learning. Cham, Switzerland: Springer, 2022, pp. 357–371.
[230] L. Mächler, I. Ezhov, F. Kofler, S. Shit, J. C. Paetzold, T. Loehr, C. Zimmer, B. Wiestler, and B. H. Menze, "FedCostWAvg: A New Averaging for Better Federated Learning," in Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Virtual Event: Springer International Publishing, 2022, pp. 383–391.
[231] A. B. Mansour, G. Carenini, A. Duplessis, and D. Naccache, "Federated Learning Aggregation: New Robust Algorithms with Guarantees," in 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA). Atlantis Hotel, Bahamas: IEEE, 2022, pp. 721–726.
[232] L. Mächler, I. Ezhov, S. Shit, and J. C. Paetzold, "FedPIDAvg: A PID Controller Inspired Aggregation Method for Federated Learning," in International MICCAI Brainlesion Workshop. Cham, Switzerland: Springer Nature Switzerland, 2023, pp. 209–217.
[233] H. Gao, Q. Wu, X. Zhao, J. Zhu, and M. Zhang, "FedADT: An Adaptive Method Based on Derivative Term for Federated Learning," Sensors, vol. 23, no. 13, p. 6034, 2023.
[234] A. H. Sayed, "Adaptive networks," Proceedings of the IEEE, vol. 102, no. 4, pp. 460–497, 2014.
[235] L. Nie, J. Guan, C. Lu, H. Zheng, and Z. Yin, "Longitudinal speed control of autonomous vehicle based on a self-adaptive PID of radial basis function neural network," IET Intelligent Transport Systems, vol. 12, no. 6, pp. 485–494, 2018.
[236] D. Bertsekas, Reinforcement learning and optimal control. Nashua, New Hampshire, USA: Athena Scientific, 2019.
[237] S. Levine, A. Kumar, G. Tucker, and J. Fu, "Offline reinforcement learning: Tutorial, review, and perspectives on open problems," arXiv preprint arXiv:2005.01643, 2020.
[238] M. Vidyasagar, "A tutorial introduction to reinforcement learning," SICE Journal of Control, Measurement, and System Integration, vol. 16, no. 1, pp. 172–191, 2023.
[239] Quanser. QUBE – Servo 2. [Online]. Available: https://www.quanser.com/products/qube-servo-2/
[240] S. Khodadadian, P. Sharma, G. Joshi, and S. T. Maguluri, "Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling," in Proceedings of the 39th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 162. PMLR, 2022, pp. 10 997–11 057.
[241] S. Kumar, P. Shah, D. Hakkani-Tur, and L. Heck, "Federated control with hierarchical multi-agent deep reinforcement learning," in Hierarchical Reinforcement Learning Workshop at the 31st Conference on Neural Information Processing Systems, Long Beach, California, USA, 2017. [Online]. Available: https://drive.google.com/file/d/1F-pZOwYigOr8IhDefgy2CyIfYx-h2ar-/view
[242] J. Qi, Q. Zhou, L. Lei, and K. Zheng, "Federated reinforcement learning: techniques, applications, and open challenges," Intelligence & Robotics, vol. 1, no. 1, 2021. [Online]. Available: https://www.oaepublish.com/articles/ir.2021.02
[243] Y. Xiang, T. Tang, T. Su, C. Brach, L. Liu, S. S. Mao, and M. Geimer, "Fast CRDNN: Towards on Site Training of Mobile Construction Machines," IEEE Access, vol. 9, pp. 124 253–124 267, 2021.
[244] K. S. Narendra and A. M. Annaswamy, "Persistent excitation in adaptive systems," International Journal of Control, vol. 45, no. 1, pp. 127–160, 1987.
[245] N. Thuerey, P. Holl, M. Mueller, P. Schnell, F. Trost, and K. Um, Physics-based Deep Learning. WWW, 2021. [Online]. Available: https://physicsbaseddeeplearning.org
[246] T. X. Nghiem, J. Drgoňa, C. Jones, Z. Nagy, R. Schwan, B. Dey, A. Chakrabarty, S. Di Cairano, J. A. Paulson, A. Carron, M. N. Zeilinger, W. Shaw Cortez, and D. L. Vrabie, "Physics-informed machine learning for modeling and control of dynamical systems," in 2023 American Control Conference (ACC), 2023, pp. 3735–3750.
[247] J. Weigand, J. Raible, N. Zantopp, O. Demir, A. Trachte, A. Wagner, and M. Ruskowski, "Hybrid Data-Driven Modelling for Inverse Control of Hydraulic Excavators," in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Prague, Czech Republic: IEEE, 2021, pp. 2127–2134.
[248] M. Hertneck, J. Köhler, S. Trimpe, and F. Allgöwer, "Learning an approximate model predictive controller with guarantees," IEEE Control Systems Letters, vol. 2, no. 3, pp. 543–548, 2018.
[249] M. Abu-Ali, F. Berkel, M. Manderla, S. Reimann, R. Kennel, and M. Abdelrahem, "Deep Learning-Based Long-Horizon MPC: Robust, High Performing, and Computationally Efficient Control for PMSM Drives," IEEE Transactions on Power Electronics, vol. 37, no. 10, pp. 12 486–12 501, 2022.
[250] B. Karg, T. Alamo, and S. Lucia, "Probabilistic performance validation of deep learning–based robust NMPC controllers," International Journal of Robust and Nonlinear Control, vol. 31, no. 18, pp. 8855–8876, 2021.
[251] P. P. Liang, T. Liu, L. Ziyin, N. B. Allen, R. P. Auerbach, D. Brent, R. Salakhutdinov, and L.-P. Morency, "Think locally, act globally: Federated learning with local and global representations," in 2nd Workshop on Federated Learning at NeurIPS for Data Privacy and Confidentiality, Vancouver, BC, Canada, 2019.
[252] V. Smith, C.-K. Chiang, M. Sanjabi, and A. S. Talwalkar, "Federated Multi-Task Learning," in Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc., 2017.
[253] Z. Li, L. Wang, L. Jiang, and C.-Z. Xu, "FC-SLAM: Federated Learning Enhanced Distributed Visual-LiDAR SLAM In Cloud Robotic System," in 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO). Dali, Yunnan, China: IEEE, 2019, pp. 1995–2000.