
Question 1

“Experience-Driven Computational Resource Allocation of Federated Learning by Deep Reinforcement Learning”

Motivation
Deep learning techniques have emerged as the most promising approach to training models for tasks such as object detection, classification, and anomaly detection. However, a large share of big data is generated by resource-constrained user equipments (UEs) such as smartphones and IoT devices, where it is impractical to upload all of this data to a single centralized server for training. Centralized model training is a cumbersome process that faces many hindrances, including network quality restrictions, privacy and ownership concerns, and a lack of collaboration. Federated learning was therefore introduced to bring distributed learning under one umbrella, enabling collaboration among devices without exposing private data. Federated learning works in iterations: UEs train their local models and upload the model parameters, rather than the raw data, to the centralized server, where a global model is synthesized and then distributed back to the UEs. UEs, however, are heterogeneous in computation and communication capabilities due to differing underlying hardware, so there is a tradeoff between these two costs during federated learning, and there has been active research on optimizing it. There has also been a push for designing fast learning algorithms with better convergence speeds, but the central issue addressed here is energy efficiency, which is the key motivation for the work in the paper. A tradeoff between energy efficiency and learning time arises from the heterogeneous nature of UEs combined with the synchronization among training nodes after each iteration, all of which must take into account the unpredictable network quality caused by mobility or environmental factors. Instead of combining network quality prediction with optimization algorithms, the paper turns to machine learning to solve this federated learning problem.

Problem Statement
Federated learning over wireless networks poses the optimization problem of allocating computational resources on mobile devices in a way that captures the tradeoff between communication and computation costs and improves energy efficiency (by trading idle time for power savings) without slowing down training. This is a significant issue for mobile devices, which operate in heterogeneous environments of computation and communication capabilities combined with varying physical specifications and battery exhaustion constraints. A further factor that previous papers have assumed unrealistically is stable network connectivity among the connected devices.

Contributions
The paper contributes a new computational resource allocation algorithm for federated learning that considers both convergence time and the mobile devices' energy consumption. The proposed algorithm is experience-driven, i.e., it can learn the best resource allocation strategies from previous actions (using an actor-critic model), and it is evaluated both on a small-scale testbed and in large-scale simulations, where it outperforms the traditional state-of-the-art solutions by a clear margin. The DRL agent is based on an actor-critic network that forms the core of the federated learning system and predicts the most suitable CPU-cycle frequency for each mobile device at the beginning of every iteration. The DRL agent interacts with the federated learning system, which defines the rules, restrictions, and reward mechanism; the agent observes the system state and determines its action based on previous experiences, which is what makes the algorithm “experience-driven”. It learns through states, actions, and rewards to find the best policy, i.e., a mapping from states to actions that maximizes the discounted accumulated reward. Improving the energy efficiency of federated learning by carefully controlling the CPU-cycle frequency is the key contribution of the paper. Because the control problem is hard and the network quality is not known in advance, machine learning methods were applied and an experience-driven method based on DRL was devised to solve it. The DRL agent is trained on real-world network datasets, and the final trace-driven experiments further demonstrate the superiority of the DRL-based approach compared to the state-of-the-art solutions.
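As a point of reference, the objective implied by this state-action-reward framing can be written in standard reinforcement-learning notation (the notation below is generic and not quoted from the paper): the agent seeks a policy

$$\pi^{*} = \arg\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k}\, r_{k}\right],$$

where $r_k$ is the reward obtained in the $k$-th federated learning iteration (tied to the negative system cost) and $\gamma \in (0, 1)$ is the discount factor.
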
Proposed Approach
The proposed approach is broadly divided into two parts: the federated learning system and the DRL agent. Considering a practical scenario with dynamic network bandwidth, the authors define the state space in the DRL formulation from historical bandwidth information, since future network bandwidth is related to past bandwidth. The action in the m-th round is defined as the set of CPU-cycle frequencies of all connected mobile devices in that iteration. In one iteration, the mobile devices complete their local federated learning updates and upload the new parameters to the parameter server; after every mobile device has completed its upload, the DRL agent obtains the system cost of the current iteration. The PPO algorithm is chosen for policy optimization because of its ease of implementation, sample efficiency, ease of tuning, and guarantee of limited deviation from the previous policy. The DRL agent maintains an experience replay buffer, a policy network (actor), and an estimate of the value function (critic). Since the parameter server can access the mobile devices' information, the DRL agent can be trained in an offline manner.
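To make this formulation concrete, the following minimal Python/PyTorch sketch illustrates how the state (per-device bandwidth history), the action (per-device CPU-cycle frequency), and the actor and critic networks could be encoded. The network sizes, the history window length, and the device count are illustrative assumptions rather than values taken from the paper.

```python
# Minimal sketch (assumptions: network sizes, history window, device count).
# State: last HISTORY bandwidth samples per device; action: a CPU-cycle frequency per device.
import torch
import torch.nn as nn

NUM_DEVICES = 5          # illustrative number of mobile devices
HISTORY = 8              # past bandwidth samples kept per device
F_MIN, F_MAX = 1.0, 2.0  # GHz, matching the frequency range used in the simulations

class Actor(nn.Module):
    """Policy network: maps the bandwidth-history state to per-device CPU frequencies."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_DEVICES * HISTORY, 128), nn.ReLU(),
            nn.Linear(128, NUM_DEVICES), nn.Sigmoid(),  # squash outputs to [0, 1]
        )

    def forward(self, state):
        # Rescale the [0, 1] outputs into the allowed frequency range [F_MIN, F_MAX].
        return F_MIN + (F_MAX - F_MIN) * self.net(state)

class Critic(nn.Module):
    """Value network: estimates the expected discounted reward of a state."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_DEVICES * HISTORY, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, state):
        return self.net(state)

# Example: one state is the flattened bandwidth history of all devices.
state = torch.rand(1, NUM_DEVICES * HISTORY)   # dummy bandwidth history
freqs = Actor()(state)                         # CPU-cycle frequency per device (GHz)
value = Critic()(state)                        # value estimate for this state
```
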
The training procedure begins by randomly initializing the actor and critic network parameters. The real-world network dataset and the mobile devices' information are pre-loaded, constructing a simulated training environment for the federated learning system. To train the DRL agent efficiently, a separate old policy is used to sample the federated learning environment, so the DRL agent can reuse the experience sampled by the old policy multiple times. The federated learning system randomly selects a start time, and the DRL agent constructs the initial state from each mobile device's past bandwidth history. The DRL agent then starts to control the CPU-cycle frequency: at the beginning of the k-th federated learning iteration, it feeds the current state into the policy network and derives the corresponding action. After the mobile devices receive the action from the DRL agent, they train the deep learning model at the CPU-cycle frequency specified by that action. The k-th iteration ends when the parameter server has received all updates from the mobile devices. The DRL agent can then calculate the reward obtained in the k-th iteration, and the federated learning system moves to the next state. At the same time, the experience from the k-th iteration is stored in the experience replay buffer. When the replay buffer is full, the DRL agent is updated with the buffered experience, with the actor network updated by the PPO approach. After the DRL agent has learned from the experience in the buffer, the new actor network parameters are assigned to the old policy for the next round of sampling.
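The training loop described above could be sketched as follows, reusing the Actor and Critic classes from the previous sketch. The environment interface, the Gaussian treatment of the policy output, the one-step advantage estimate, and all hyperparameters are simplifying assumptions; only the overall structure (sample with the old policy, fill the replay buffer, update with the clipped PPO objective, copy the new actor parameters into the old policy) follows the procedure described in the paper.

```python
# Simplified sketch of the offline training loop (assumptions: env interface,
# reward = negative per-iteration system cost, placeholder hyperparameters).
import copy
import torch

GAMMA, CLIP_EPS, BUFFER_SIZE, EPOCHS = 0.99, 0.2, 256, 4

actor, critic = Actor(), Critic()          # networks from the previous sketch
old_actor = copy.deepcopy(actor)           # old policy used only for sampling
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)
buffer = []                                # experience replay buffer

def ppo_update():
    states, actions, rewards, next_states = map(torch.stack, zip(*buffer))
    with torch.no_grad():
        # One-step TD targets and advantages from the critic (no GAE, for brevity).
        targets = rewards.unsqueeze(1) + GAMMA * critic(next_states)
        advantages = (targets - critic(states)).squeeze(1)
        # Treat the policy output as the mean of a fixed-variance Gaussian over frequencies.
        old_logp = -((actions - old_actor(states)) ** 2).sum(dim=1)
    for _ in range(EPOCHS):                # reuse the sampled experience multiple times
        logp = -((actions - actor(states)) ** 2).sum(dim=1)
        ratio = torch.exp(logp - old_logp)
        clipped = torch.clamp(ratio, 1.0 - CLIP_EPS, 1.0 + CLIP_EPS)
        actor_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
        critic_loss = ((critic(states) - targets) ** 2).mean()
        opt.zero_grad()
        (actor_loss + critic_loss).backward()
        opt.step()
    old_actor.load_state_dict(actor.state_dict())  # new parameters become the sampling policy
    buffer.clear()

# Sketch of the sampling loop (env is assumed to replay the pre-loaded bandwidth
# traces and device profiles; reward is the negative per-iteration system cost):
#
# for each federated-learning iteration k:
#     state  = env.observe()                 # bandwidth history of all devices
#     action = old_actor(state)              # CPU-cycle frequency per device
#     reward, next_state = env.step(action)  # observed after all devices upload
#     buffer.append((state, action, reward, next_state))
#     if len(buffer) == BUFFER_SIZE:
#         ppo_update()
```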

Outcomes and Conclusion


The trace dataset contains bandwidth measurements of 4G networks collected by Huawei P8 Lite smartphones along several routes in six scenarios in and around the city of Ghent, Belgium, during the period 2015-12-16 to 2016-02-04. In the simulations, the data size on each mobile device follows a uniform distribution within 50-100 MB, the number of CPU cycles required is distributed within 10-30 cycles/bit, and the maximum CPU-cycle frequency lies within 1.0-2.0 GHz. The proposed algorithm is compared with two state-of-the-art baselines, a heuristic approach and a static approach. In offline DRL training, the training loss stabilizes after fewer than 200 episodes, which means that the DRL agent learns to adapt to the federated learning environment. In online DRL reasoning, where the DRL model is saved for inference after adequate offline training, it yields an average system cost of 7.25, compared to 9.74 and 10.5 for the heuristic and static approaches, respectively; the average system cost of the traditional approaches is about 35% higher than that of the DRL-based approach. The DRL-based approach also consumes the least computational energy of the three approaches, with the energy per iteration ranging from 1.5 to 1.6 for DRL, whereas in the heuristic approach over 80% of iterations consume more than 1.7, and in the static approach the mobile device always computes at the same CPU-cycle frequency. This confirms the suspicion that even though mobile devices may invest more computing power, doing so does not necessarily accelerate the convergence of federated learning. To evaluate the scalability of the proposed DRL-based approach, simulations were also conducted with 50 mobile devices, where the DRL approach again obtained the best performance, with a per-iteration system cost almost always less than 12, in contrast to 14 and 16 for the traditional approaches. The final trace-driven experiments further demonstrate the superiority of the DRL-based approach compared to the state-of-the-art solutions.

Limitations
The question for federated learning implementations has always been about their efficiency: federated learning requires small delays and high reliability of mobile devices. Although the paper tries to address this issue, the algorithm is tested only in simulation and not in a real-time experiment. The dataset itself is old and does not reflect modern interconnection bandwidths, traffic patterns, and other complications. The unpredictability of network bandwidth and QoS remains a limitation in this area and is still an open issue.
