Federated Learning

ARCHITECTURES AND INFRASTRUCTURES FOR
ARTIFICIAL INTELLIGENCE
Master's in Artificial Intelligence

Topic 5
Cloud Computing

Teaching Team
Table of Contents
1. Conventional Model
2. Federated Learning Model
3. Types of Federated Learning
4. Advantages
5. Challenges
6. Classification
7. Aggregation Methods
8. Applications
9. Frameworks
1. Conventional Model

[Figure: the conventional client-server model. Clients send requests, data and feedback to a central server that hosts the model; the server returns predictions.]
2. Federated Learning Model
[Figure: one federated learning round. Step 1: the server initializes the global model W' and distributes it to the clients. Step 2: each client trains its local model (Model 1 ... Model n) on its own data and uploads its weights (W1 ... Wn). Step 3: the server aggregates (W1, W2, ..., Wn) into the updated global model W'.]
2. Federated Learning Model
 Elements:
 Server
 The server coordinates and manages the distributed training process, ensuring
data integrity and privacy.
 Clients
 The clients represent the user devices that participate in the local training of the
model.
 Aggregator
 The aggregator is responsible for receiving training results from clients and
averaging them to improve the overall model.
 Global Model
 The global model is the combined representation of the models trained on the
clients, which is iteratively updated as training is performed on the local devices.
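
To make the three steps concrete, here is a minimal sketch of one federated round in Python. The Client and aggregate names, the linear model, and all hyperparameters are illustrative, not part of the original material.

```python
import numpy as np

rng = np.random.default_rng(0)

class Client:
    def __init__(self, X, y):
        self.X, self.y = X, y  # private local data; never leaves the device

    def train(self, w, lr=0.1, steps=10):
        # Step 2: local training starting from the received global weights.
        w = w.copy()
        for _ in range(steps):
            grad = self.X.T @ (self.X @ w - self.y) / len(self.y)
            w -= lr * grad
        return w  # only the weights are uploaded

def aggregate(local_weights):
    # Step 3: the aggregator averages the clients' weights into W'.
    return np.mean(local_weights, axis=0)

# Step 1: the server initializes the global model W'.
w_global = np.zeros(3)
clients = [Client(rng.normal(size=(50, 3)), rng.normal(size=50))
           for _ in range(5)]

for _ in range(20):                                   # repeated federated rounds
    local = [c.train(w_global) for c in clients]      # W1, ..., Wn
    w_global = aggregate(local)                       # W' = aggregate(W1, ..., Wn)
```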

• McMahan, B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics (pp. 1273-1282). PMLR.
3. Types of Federated Learning
• Based on Architecture

[Figure: Centralized vs. Decentralized architectures]

https://link.springer.com/chapter/10.1007/978-3-031-47508-5_1

3. Types of Federated Learning
 Centralized
 A central server coordinates the learning process and maintains a global machine learning
model. Local nodes perform model training on their own data and then send local model
updates to the central server, which aggregates them to produce an improved global model.
Communication between the central server and the local nodes can be synchronous or
asynchronous. Although the central server is essential, it is also a single point of failure:
network, hardware or software problems can disrupt the collaborative learning process. Traffic
congestion can also be a problem when many nodes communicate with the same server.
 Decentralized
 The learning process is not coordinated by a central server; instead, all nodes coordinate with
each other to perform the learning process and update the global model without the need
for a dedicated server. Each node performs model training with its local data, updates its local
model, and exchanges the update with its neighbors in the network to produce the improved
global model. Although it eliminates the dependency on the central server and avoids a single
point of failure, the decentralized architecture is complex and presents a high communication
cost due to the large number of nodes involved. Sometimes, despite the decentralized
architecture, a central authority may be in charge of configuring the learning task.
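
As an illustration of the decentralized setting, here is a toy gossip-averaging sketch in Python, assuming a fixed ring of neighbors; the topology, weighting and model are invented for illustration, not prescribed by the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 4
w = [rng.normal(size=3) for _ in range(n_nodes)]  # each node's local model

for _ in range(50):  # repeated gossip rounds, no central server involved
    new_w = []
    for i in range(n_nodes):
        left, right = w[(i - 1) % n_nodes], w[(i + 1) % n_nodes]
        # Each node averages its model with its ring neighbors' models.
        new_w.append((w[i] + left + right) / 3)
    w = new_w
# All nodes converge toward the same consensus model.
```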

3. Types of Federated Learning
• Based on the Federative scope

[Figure: Cross-Device vs. Cross-Silo federation]

https://link.springer.com/chapter/10.1007/978-3-031-47508-5_1

3. Types of Federated Learning
 Cross-Device
 In federated cross-device learning, the learning process is based on local data
from a large number of mobile devices. These devices are typically computers,
smartphones and IoT devices, which act as a data source to train the model
locally. The model learning process is similar to centralized federated learning,
orchestrated by a central server. All devices perform local model training with
their local data, update the local model and send it to the central server,
which then aggregates all local updates to produce the improved global
model. This type of federated configuration requires millions of participating
devices to achieve effective training, because individual devices are limited by
factors such as lost connectivity or insufficient data.

3. Types of Federated Learning
 Cross-Silo
 In federated cross-silo learning, the learning process is based on local data
from selected organizations that form the learning network. A silo is an
isolated data storage location for an organization that contains raw data with
restricted access within that organization. These silos are used collaboratively
and act as a data source to train the model locally, allowing organizations to
achieve common goals without directly sharing their data. This approach
involves a small number of data-rich and computationally powerful nodes, but
implementing it and distributing computational resources across organizations
can be challenging under the applicable privacy constraints.

3. Types of Federated Learning
 Cross-Device vs Cross-Silo
 The easiest way to understand the difference is to associate cross-silo with
organizations and cross-device with mobile devices. In the cross-silo case, the
number of clients is usually small, but each has large computational power; in the
cross-device case, the number of clients is huge and each has little computational
power. Another aspect is trust: organizations (cross-silo) can be expected to be
available for training throughout, whereas this cannot be assured for mobile
devices (cross-device), since a poor network connection may make a device
unavailable.

3. Types of Federated Learning
• Based on Data Partitioning

[Figure: Horizontal vs. Vertical data partitioning]

https://link.springer.com/chapter/10.1007/978-3-031-47508-5_1

3. Types of Federated Learning
 Horizontal
 It uses data with the same feature space but different sample spaces across all
nodes in the learning network to collaboratively train a global model. It mainly
involves nodes that have a homogeneous set of data, meaning that they use
the same features across the entire learning network. This type of federated
learning is commonly used in multi-device environments, where different
nodes improve model performance on tasks related to the same features.
Nodes can train local models using their own dataset and model architecture.
Then, the global model can be updated by simply averaging all local model
updates. This type of training is common in cross-device settings.
3. Types of Federated Learning
 Vertical
 Vertical federated learning uses data with different feature spaces but with
the same sample space across all nodes in the learning network to
collaboratively train a global model. It mainly deals with nodes that have a
heterogeneous set of data, meaning that they use different features
throughout the network. This type of federated learning is commonly used in
multi-silo (Cross-silo) environments, where different nodes improve model
performance on tasks related to the same sample space. The nodes can train
the local models using their local data with the specific model architecture
according to the requirements. Finally, the improved global model is obtained
by combining all the local model updates.
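
A small sketch contrasting the two partitioning schemes on a toy matrix; the dataset, split sizes and node names are invented for illustration.

```python
import numpy as np

X = np.arange(24).reshape(6, 4)  # toy dataset: 6 samples, 4 features

# Horizontal FL: same feature space, different samples on each node.
node_a, node_b = X[:3, :], X[3:, :]  # each node holds all 4 features

# Vertical FL: same sample space, different features on each node.
node_c, node_d = X[:, :2], X[:, 2:]  # each node holds all 6 samples
```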

4. Advantages
 Privacy
 Data Sovereignty
 Efficiency
 Customization
 Scalability
 IoT integration
4. Advantages
 Privacy
 Data privacy is ensured by not sharing “raw” client data with the server, only the
weights of the trained local models.
 Integration in IoT systems (Internet of Things)
 Federated Learning makes it possible to train robust models on low-performance
devices.
 Distributed training also makes it possible to compensate for insufficient data on
one client through collaborative training.
 Scalability
 Federated Learning uses distributed processing resources across multiple devices
located in different geographical areas. This parallelized approach improves scalability.
4. Advantages
 Customization
 By training models on individual devices, Federated Learning enables precise
model customization, adapting to each user's unique preferences and
characteristics.
 Efficiency
 By performing local calculations on the devices, Federated Learning reduces the
need to transfer large amounts of data, making the process faster and more
efficient.
 Data Sovereignty
 In federated learning, the owner of the device that holds the data retains full and
sovereign control over that data and personal information; it is not controlled by
any other party in the learning network. The model owner may train their model
with the user's data but can neither own nor control it. The data owner can
access, update, share, hide or delete their data and personal information without
necessarily notifying the model owner. This data sovereignty is crucial in several
application areas, such as medical, financial and government organizations.
5. Challenges

Privacy

Heterogeneity

Efficient communication

5. Challenges
 Privacy
 Although this technique already offers privacy by not sharing the “raw” data with the
server, preventing anyone who infiltrates the communication from obtaining this data,
there is still a need for research in this area.
 Research is mainly needed in methods for validating the updates received from
clients, in order to prevent malicious devices from interfering with training, since
Federated Learning assumes trust between the collaborating devices.
 Heterogeneity
 Federated learning typically involves millions of heterogeneous systems/nodes with
different computational, storage and communication capabilities. This presents a
significant challenge in providing effective and unbiased model training due to variations
in node availability and reliability, as well as in the types and sizes of data at each node.
Various techniques are used to handle these heterogeneities, such as asynchronous
communication and optimization for data heterogeneity. This can increase the
complexity of the learning process and may decrease model accuracy compared
to training on a fully centralized, conventional dataset.
5. Challenges
 Efficient communication
 Federated learning generally involves millions of nodes in a learning network, where
nodes iteratively send model updates or small messages as part of the distributed
learning process rather than sending data across the network. The communication
overhead to get the models to the device must be moderately low for federated
learning to succeed, otherwise it could negatively impact the process. However,
communication can be affected by several factors such as limited bandwidth, lack of
resources or geographical location, which can be resolved by minimizing the size of
model updates transmitted in each iteration or reducing the number of iterations.
 Non-Identically and Independently Distributed Data (Non-IID Data)
 Non-identically and independently distributed (non-IID) data is data that is spread
across multiple clients or devices in a federated learning system but does not
conform to the standard assumption of identical and independent distribution (IID).
 In an IID configuration, each data point is drawn separately from the same probability
distribution, and the data points are assumed to be statistically identical. However, this
assumption does not apply in a non-IID scenario, as data from multiple clients or
devices may show changes or discrepancies in terms of statistical characteristics or
patterns.
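
One common way to simulate such non-IID label skew in experiments is a Dirichlet split of each class across clients. The sketch below assumes a toy 10-class dataset and an illustrative concentration parameter alpha; none of these values come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=10_000)  # toy 10-class dataset
n_clients, alpha = 5, 0.5                  # smaller alpha -> stronger skew

client_idx = [[] for _ in range(n_clients)]
for cls in range(10):
    idx = np.flatnonzero(labels == cls)
    rng.shuffle(idx)
    # Split this class across clients with Dirichlet-drawn proportions.
    props = rng.dirichlet(alpha * np.ones(n_clients))
    cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
    for c, part in enumerate(np.split(idx, cuts)):
        client_idx[c].extend(part)
# Each client now holds a different, skewed label distribution.
```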

5. Challenges
 Non-Identically and Independently Distributed Data (Non-IID Data)
6. Classification

 Data Division
 Communication protocols
 Selection of the specific Machine Learning model
 Techniques for managing heterogeneity
 Privacy Mechanisms

• Zhang, C., Xie, Y., Bai, H., Yu, B., Li, W., & Gao, Y. (2021). A survey on federated
learning. Knowledge-Based Systems, 216, 106775.

6. Classification

• Zhang, C., Xie, Y., Bai, H., Yu, B., Li, W., & Gao, Y. (2021). A survey on federated
learning. Knowledge-Based Systems, 216, 106775.

7. Aggregation Methods: FedSGD
 Stochastic Gradient Descent (SGD)

$w_{t+1} = w_t - \gamma \nabla L_i(w_t)$


• McMahan, B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics (pp. 1273-1282). PMLR.
7. Aggregation Methods: FedSGD
 Stochastic Gradient Descent (SGD)
 It is an optimization technique used to train neural networks in machine learning
and deep learning. It is a version of the gradient descent optimization method to
determine the ideal parameters (weights and bias) of a model that minimizes a
defined loss function as efficiently as possible.
 Although this approach is computationally efficient, it requires a large number of
training rounds to produce a good model when applied as a federated
optimization method. In addition, SGD requires centralized data to train a good
model, which poses privacy risks.
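
For concreteness, here is a one-step sketch of the update rule from the previous slide, $w_{t+1} = w_t - \gamma \nabla L_i(w_t)$, on a toy squared loss; the loss, sample and step size are all illustrative.

```python
import numpy as np

def grad_L_i(w, x_i, y_i):
    # Gradient of the per-sample squared loss L_i(w) = 0.5 * (x_i @ w - y_i)**2.
    return (x_i @ w - y_i) * x_i

w = np.zeros(3)
gamma = 0.1
x_i, y_i = np.array([1.0, 2.0, 3.0]), 4.0  # one randomly drawn sample
w = w - gamma * grad_L_i(w, x_i, y_i)      # a single SGD step
```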

7. Aggregation Methods: FedSGD
 Federated Stochastic Gradient Descent (FedSGD)
 This method is based on the principles of stochastic gradient descent and can be
naively applied to the federated learning optimization problem. In this case, the
algorithm selects a random fraction C of the clients in each round and
computes the loss gradient over all the data held by those clients. Note
that C controls the overall batch size, so C=1 corresponds to full-batch
(non-stochastic) gradient descent.
 Each client calculates its local gradient using its local data and sends these
gradients (not the model updates) to the central server. The server then
aggregates these gradients, usually by a simple averaging process or proportionally
to the number of training samples from each node.
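
A minimal sketch of such a FedSGD round, assuming a toy linear model with full-batch local gradients; the client data, C and the learning rate are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(w, X, y):
    # Full-batch gradient of a squared loss over this client's local data.
    return X.T @ (X @ w - y) / len(y)

# Four toy clients with different local dataset sizes.
clients = [(rng.normal(size=(n, 3)), rng.normal(size=n)) for n in (30, 50, 80, 40)]
w, gamma, C = np.zeros(3), 0.1, 0.5

# Sample a random fraction C of the clients for this round.
chosen = rng.choice(len(clients), size=max(1, int(C * len(clients))), replace=False)
grads = [local_gradient(w, *clients[k]) for k in chosen]
sizes = [len(clients[k][1]) for k in chosen]

# Server: average gradients proportionally to sample counts, then update.
g = np.average(grads, axis=0, weights=sizes)
w = w - gamma * g
```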

• McMahan, B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics (pp. 1273-1282). PMLR.
7. Aggregation Methods: FedSGD
 Federated Stochastic Gradient Descent (FedSGD)
 Because FedSGD is based on exchanging gradients, the communication cost between
the clients and the server is high and can be inefficient: when the gradients are
large, more bandwidth is required and latency increases, which becomes a serious
problem at scale.
 FedSGD still maintains some privacy because the “raw” data is only kept on each
client and is not transmitted to the server. However, the gradient could reveal
some information about the data, which would pose a privacy risk.
 Finally, another problem that arises when employing this approach is that FedSGD
is based on equal contribution, which means that each client contributes equally
to the global model update during gradient aggregation.

• McMahan, B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics (pp. 1273-1282). PMLR.
7. Aggregation Methods: FedAvg
 Federated Averaging (FedAvg)
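
This slide appears to have shown the FedAvg update as a figure; for reference, the weighted-average rule from McMahan et al. (2017), where client $k$ holds $n_k$ of the $n$ total samples and returns locally trained weights $w_{t+1}^{k}$, is

$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_{t+1}^{k}$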
7. Aggregation Methods: FedAvg
 Federated Averaging (FedAvg)
 FedAvg updates the global model primarily through weighted averages. The size of
each client's local dataset is used to weight the model updates. Clients with larger
data sets have a greater impact on the global model.
 Moreover, it is designed to facilitate communication. Clients compute their own
local model updates, and only these updates (rather than the “raw” data) are sent
to the central server for aggregation. This reduces the amount of data sent
between the clients and the server, making it suitable for cases where bandwidth
is limited.
 Since this approach does not share the “raw” data, privacy is better protected:
only model weight updates are uploaded, from which it is considerably harder to
extract information about the characteristics of the underlying data.
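
A sketch of the weighted-average aggregation rule described above; the local training that produces each client's weights is elided, and all names and values are illustrative.

```python
import numpy as np

def fedavg(local_weights, n_samples):
    # Weighted average: clients with more data pull the global model harder.
    n = np.asarray(n_samples, dtype=float)
    return np.average(local_weights, axis=0, weights=n / n.sum())

# Illustrative round: three clients return updated weights of a 4-parameter model.
w1, w2, w3 = np.ones(4), 2 * np.ones(4), 4 * np.ones(4)
w_global = fedavg([w1, w2, w3], n_samples=[100, 100, 200])
print(w_global)  # [2.75 2.75 2.75 2.75] -- the 200-sample client dominates
```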

7. Aggregation Methods: Other Variants
 FedProx (Non-IID data)
 Li, T., Sahu, A. K., Zaheer, M., Sanjabi, M., Talwalkar, A., & Smith, V. (2020).
Federated optimization in heterogeneous networks. Proceedings of Machine
learning and systems, 2, 429-450.
 q-Fair Federated Averaging (q-FedAvg)
 Li, T., Sanjabi, M., Beirami, A., & Smith, V. (2019). Fair resource allocation in
federated learning. arXiv preprint arXiv:1905.10497.
 Personalized Federated Learning
 Fallah, A., Mokhtari, A., & Ozdaglar, A. (2020). Personalized federated learning
with theoretical guarantees: A model-agnostic meta-learning
approach. Advances in Neural Information Processing Systems, 33, 3557-3568.
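
As a hint of how FedProx (Li et al., 2020) differs from FedAvg, here is a hedged sketch of its local objective: the ordinary local loss plus a proximal term $(\mu/2)\|w - w_{global}\|^2$ that limits local drift on non-IID data. The squared loss and function names are illustrative.

```python
import numpy as np

def fedprox_local_loss(w, w_global, X, y, mu=0.1):
    base = 0.5 * np.mean((X @ w - y) ** 2)         # ordinary local loss
    prox = 0.5 * mu * np.sum((w - w_global) ** 2)  # proximal penalty
    return base + prox

def fedprox_local_grad(w, w_global, X, y, mu=0.1):
    # Gradient of the objective above; used in place of the plain local gradient.
    return X.T @ (X @ w - y) / len(y) + mu * (w - w_global)
```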

8. Federated Learning Applications

1 Personalized Medicine

2 Autonomous Vehicles

3 Intelligent Keyboard

9. Frameworks

https://www.apheris.com/resources/blog/top-7-open-source-frameworks-for-federated-learning