Research on Joint Optimization of Task Offloading and UAV Trajectory in Mobile Edge Computing Considering Communication Cost Based on Safe Reinforcement Learning
Abstract
1. Introduction
- (1) Introducing communication mechanisms between UAVs so that each UAV makes its final decision based on its own observations, its estimations, and the messages it receives. This addresses the problem of ground user devices being left uncovered, or being covered redundantly by multiple UAVs, during training;
- (2) Integrating a safety layer into the MADDPG algorithm to constrain the UAVs' actions, so that safe reinforcement learning can be used to plan UAV trajectories and avoid UAV collisions. This addresses the high energy consumption and delay caused by UAV collisions during training;
- (3) Conducting numerical simulations which demonstrate that the proposed MADDPG-based optimization algorithm outperforms other methods. The experimental results show that UAV coverage of user devices improved significantly and that collisions among UAVs were reduced during training.
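Contribution (2) can be illustrated with a minimal sketch of a safety layer. It assumes a single safety constraint that has been linearized around the current state as `c_now + g · a <= c_max`, and it corrects the Actor's proposed action with the closed-form projection onto that half-space. The function name `safe_action` and the symbols `g`, `c_now`, and `c_max` are hypothetical and do not come from the paper, whose actual safety layer may differ.

```python
import numpy as np

def safe_action(action, g, c_now, c_max):
    """Project `action` onto the half-space {a : c_now + g @ a <= c_max}.

    Assumption (not from the paper): the constraint is linear in the action,
    with sensitivity vector `g` and current constraint value `c_now`.
    """
    action = np.asarray(action, dtype=float)
    g = np.asarray(g, dtype=float)

    # Amount by which the proposed action would exceed the limit.
    violation = c_now + g @ action - c_max
    denom = g @ g

    # Already safe, or the constraint does not depend on the action.
    if violation <= 0.0 or denom == 0.0:
        return action

    # Smallest (Euclidean) correction that restores feasibility.
    lam = violation / denom
    return action - lam * g

corrected = safe_action([1.0, 0.0], g=[1.0, 0.0], c_now=0.0, c_max=0.5)
unchanged = safe_action([0.2, 0.3], g=[1.0, 0.0], c_now=0.0, c_max=0.5)
```

An action that already satisfies the constraint is returned unchanged, so the layer only intervenes when the Actor's proposal would violate the limit.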
2. Materials and Methods
2.1. System Model
2.1.1. UAV Movement Model
2.1.2. Communication Model
- 1. User devices to UAV
- 2. UAV to EC
2.1.3. Computation Model
- 1. UAV Computation Model
- 2. EC Computation Model
2.1.4. Problem Modeling
2.2. MADDPG Solves UAV Path Planning and Task Offloading Problem
- 1. MDP
- 2. State
- 3. Action
- 4. Reward Function
2.3. Safe Reinforcement Learning Algorithm Integrating UAV Communication Mechanism
2.3.1. UAV Communication Mechanism
2.3.2. Safe Reinforcement Learning
- (1) Introduction to Safe Reinforcement Learning
- (2) MADDPG Algorithm with Integrated Safety Layer
Algorithm 1 SC-MADDPG
3. Results
3.1. Training Performance of SC-MADDPG
3.2. The Impact of Communication Mechanisms on Latency and Energy Consumption
3.3. Performance Comparison of Optimization Algorithms after Integrating Safety Layers
- (1) MADDPG algorithm (with UAV communication mechanism): executes the joint UAV action obtained directly from the Actor network, without considering safety constraints.
- (2) Hard-MADDPG algorithm (with UAV communication mechanism): considers safety constraints and requires the output safe joint action to satisfy all constraints.
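The difference between the two baselines can be sketched as a feasibility check on the joint action. The example assumes that each UAV's action is a 3-D displacement and that the only safety constraint is a minimum pairwise separation `d_min`. Both are illustrative assumptions rather than the paper's formulation, and the function names are hypothetical.

```python
import numpy as np

def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two rows (needs >= 2 points)."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(points), k=1)  # each unordered pair once
    return dist[upper].min()

def joint_action_is_safe(positions, joint_action, d_min):
    """True if every pair of UAVs stays at least `d_min` apart after moving.

    Assumption: actions are displacements added to the current positions.
    """
    next_positions = (np.asarray(positions, dtype=float)
                      + np.asarray(joint_action, dtype=float))
    return bool(min_pairwise_distance(next_positions) >= d_min)

positions = [[0.0, 0.0, 50.0], [10.0, 0.0, 50.0]]
converging = [[4.0, 0.0, 0.0], [-4.0, 0.0, 0.0]]   # UAVs end up 2 m apart
hovering = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]      # UAVs stay 10 m apart

raw_ok = joint_action_is_safe(positions, converging, d_min=5.0)
hover_ok = joint_action_is_safe(positions, hovering, d_min=5.0)
```

In this reading, the unconstrained baseline would execute `converging` as produced by the Actor, while a hard-constraint variant would accept only joint actions for which the check holds for every constraint.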
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Parameter | Value |
---|---|
Number of hidden layer units | 32 |
Number of critic network layers | 3 |
Number of neurons | The sum of the state space sizes of all UAVs, 192, and 1 |
Number of Planner network layers | 3 |
Number of neurons | State space size, 192, and m of UAV i |
Input to the Executor network | Observation of UAV i |
Output of the Executor network | UAV action a_i |
Activation function | ReLU |
Initial learning rate of the constraint network | 5 × 10⁻⁴ |
Number of neurons in the hidden layer of the actor network | 100 |
Initial learning rate of the actor network | 1 × 10⁻⁴ |
Number of neurons in the hidden layer of the critic network | 500 |
Initial learning rate of the critic network | 1 × 10⁻³ |
Episodes | 5000 |
Steps per episode | 50 |
Discount factor | 0.99 |
Batch size | 64 |
Parameter | Value |
---|---|
Number of mobile devices | 80 |
Number of UAVs | 4 |
Number of edge servers | 2 |
Task size | [1, 5] Mbits |
Number of CPU cycles required to process 1 bit of data | [100, 200] cycles/bit |
Task arrival rate | 1 task/s |
Maximum UAV flying altitude | 100 m |
Minimum UAV flying altitude | 50 m |
Maximum horizontal flight distance of a UAV | 49 m |
Maximum vertical flight distance of a UAV | 12 m |
Unit path loss | −50 dB |
Channel bandwidth from devices to UAVs | 10 MHz |
Channel bandwidth from UAVs to edge servers | 0.5 MHz |
Maximum transmission power of UAVs | 5 W |
Transmission power of mobile devices | 0.1 W |
Received power of UAVs | 0.1 W |
Computing resources of UAVs | 3 GHz |
Computing resources of edge servers | [6, 9] GHz |
Effective switched capacitance | |
Gaussian noise at UAV | −100 dBm |
Gaussian noise at the edge server | −100 dBm |
Weight | 1 |
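As a rough sanity check on these settings, the sketch below computes the on-board processing delay under the common assumption that delay equals task size times cycles per bit divided by CPU frequency. The paper's computation model is not reproduced in this excerpt, so the formula and the helper name `compute_delay_s` are assumptions, and 1 Mbit is taken as 10^6 bits.

```python
def compute_delay_s(task_bits, cycles_per_bit, cpu_hz):
    """Processing delay in seconds: total CPU cycles divided by clock rate."""
    return task_bits * cycles_per_bit / cpu_hz

MBIT = 1e6          # assumption: 1 Mbit = 10^6 bits
UAV_CPU_HZ = 3e9    # "Computing resources of UAVs" from the table

# Extremes of the task-size and cycles-per-bit ranges in the table.
lightest = compute_delay_s(1 * MBIT, 100, UAV_CPU_HZ)
heaviest = compute_delay_s(5 * MBIT, 200, UAV_CPU_HZ)

print(f"UAV processing delay: {lightest:.3f} s to {heaviest:.3f} s")
```

Under these assumptions a single task occupies a UAV for roughly 0.03 s to 0.33 s, which gives a sense of scale relative to the arrival rate of 1 task/s per device.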
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Dai, Y.; Fu, J.; Gao, Z.; Yang, L. Research on Joint Optimization of Task Offloading and UAV Trajectory in Mobile Edge Computing Considering Communication Cost Based on Safe Reinforcement Learning. Appl. Sci. 2024, 14, 2635. https://doi.org/10.3390/app14062635