
Joint service caching and computation offloading scheme based on deep reinforcement learning in vehicular edge computing systems

Zheng Xue, Chang Liu, Canliang Liao, Guojun Han, Zhengguo Sheng

Publication date: 05-01-2023
Document version: Accepted version
Published in: IEEE Transactions on Vehicular Technology
DOI: https://doi.org/10.1109/TVT.2023.3234336
Licence: Copyright not evaluated

Citation (APA 7th edition): Xue, Z., Liu, C., Liao, C., Han, G., & Sheng, Z. (2023). Joint service caching and computation offloading scheme based on deep reinforcement learning in vehicular edge computing systems (Version 1). University of Sussex. https://hdl.handle.net/10779/uos.23493914.v1


Joint Service Caching and Computation Offloading Scheme Based on Deep Reinforcement Learning in Vehicular Edge Computing Systems

Zheng Xue, Chang Liu, Canliang Liao, Guojun Han, Senior Member, IEEE, and Zhengguo Sheng, Senior Member, IEEE

Abstract—Vehicular edge computing (VEC) is a new computing paradigm that enhances vehicular performance by introducing both computation offloading and service caching to resource-constrained vehicles and ubiquitous edge servers. Recent developments in autonomous vehicles enable a variety of applications that demand high computing resources and low latency, such as automatic driving and auto navigation. However, the highly dynamic topology of vehicular networks and the limited caching space at resource-constrained edge servers call for intelligent design of caching placement and computation offloading. Meanwhile, service caching decisions are highly correlated with computation offloading decisions, which poses a great challenge to effectively designing service caching and computation offloading strategies. In this paper, we investigate a joint optimization problem by integrating service caching and computation offloading in a general VEC scenario with time-varying task requests. To minimize the average task processing delay, we formulate the problem as a long-term mixed integer non-linear program (MINLP) and propose an algorithm based on deep reinforcement learning to obtain a suboptimal solution with low computation complexity. The simulation results demonstrate that our proposed scheme exhibits an effective performance improvement in task processing delay compared with other representative benchmark methods.

Index Terms—Vehicular edge computing, service caching, computation offloading, deep reinforcement learning.

Copyright (c) 2022 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].

Manuscript received June 2, 2022; revised September 4, 2022; accepted December 9, 2022. This work was supported in part by Guangzhou Science and Technology Plan Project under Grant 202201010239; in part by Graduate Education & Innovation Project of Guangdong Province under Grant 2022XSLT022; in part by Guangdong Introducing Innovative and Entrepreneurial Teams of The Pearl River Talent Recruitment Program (2021ZT09X044); in part by Guangdong Introducing Outstanding Young Scholars of The Pearl River Talent Recruitment Program (2021QN02X546); in part by Guangdong Provincial Key Laboratory of Photonics Information Technology (No. 2020B121201011); in part by the Science and Technology Program of Guangzhou under Grant 202102020869, and the Guangdong Basic and Applied Basic Research Foundation under Grant 2022A1515110602 and Grant 2022A1515010153. (Corresponding author: Chang Liu.)

Zheng Xue, Chang Liu, Canliang Liao and Guojun Han are with the School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

Zhengguo Sheng is with the Department of Engineering and Design, University of Sussex, Brighton, BN1 9RH, U.K. (e-mail: [email protected]).

I. INTRODUCTION

THE rapid development of the Internet of Things (IoT) and artificial intelligence (AI) has recently led to the emergence of diverse computation-intensive and latency-sensitive vehicular applications, such as augmented reality (AR) navigation and autonomous driving. However, offloading all the computing tasks to the remote cloud results in a heavy backhaul load and unacceptable latency. Through vehicular edge computing (VEC) [1], vehicular users can offload their tasks via vehicle-to-infrastructure (V2I) communications to edge nodes to reduce response latency, which caters for unprecedentedly exploding data traffic and the increasingly stringent requirements of vehicular applications [2].

Task computation requires both input task data from users and program data installed on edges, which are defined as content caching and service caching, respectively. Content caching refers to caching of the input data needed and output data generated (e.g., in computational or infotainment applications) at vehicles and edge nodes [3]–[7]. Since these data dominate mobile data traffic, content caching at edge servers can effectively alleviate mobile traffic on backhaul links and reduce content delivery latency [3]. On the other hand, service caching refers to caching the specific programs for task execution [8]–[14]. As a motivating example, in object detection, the input data consists of videos and radar sensor data, and task execution requires the corresponding object detection service program to be cached in the vehicle or the edge server. The input data of an object detection service is typically unique and hardly reusable for other executions. In comparison, service program data in the cache is evidently reusable by future executions of the same type of tasks. Because edge servers have limited caching space, how to selectively cache service programs over space (e.g., at multiple edge servers) and time for optimum transmission and computing performance is crucial for efficient task computation.

The design of optimal computation offloading and service caching faces many challenges in vehicular networks. First, vehicles and edge servers can only cache a small number of service programs at a time due to limited storage space. Thus, which service programs should be cached needs to be decided judiciously. Second, the computing resources on different edge servers may be unevenly distributed. It is critical to balance the computation load by cooperative offloading. Third, the computation offloading decisions and the service caching decisions are closely correlated. Intuitively, we tend to offload a task if the required program is already cached at

the edge.¹ Besides, the network status and available resources of edge servers change dynamically during the movement of vehicles. Therefore, it becomes significant and yet very challenging to design an appropriate service caching and computation offloading strategy in VEC systems.

In this paper, we investigate a joint optimization of service caching and computation offloading for a VEC system with limited storage and computing capacities, taking account of time-varying task requests and dynamic network topology. In order to make full use of the limited caching and computing resources of each node (i.e., vehicles and edge servers) as well as the cooperative offloading between edge servers, we propose a deep reinforcement learning (DRL)-based service caching and computation offloading scheme to provide low-complexity decision making and adaptive resource management. The main contributions of this paper can be summarized as follows:

1) We design a novel edge service caching and computation offloading framework with cooperation among the cloud, edge servers and vehicles, which not only balances the computation load among edge servers, but also enables integration of caching and computing resources combined with edge intelligence.

2) We minimize the cumulative average task processing delay over a long time-slotted horizon, considering dynamic task requesting, offloading and service caching, as well as dynamic channel conditions between vehicles and edge servers at each time slot. To solve the formulated long-term mixed integer non-linear programming (MINLP) problem, we propose an edge caching and offloading scheme based on deep deterministic policy gradient (DDPG) [15] to efficiently make decisions on task offloading and service caching.

3) We carry out extensive simulations to evaluate the average task processing delay and energy consumption of the proposed scheme in VEC. Numerical results demonstrate that our proposed scheme achieves better performance compared with the other benchmark approaches.

The rest of the paper is organized as follows. We review the related work in Section II. The system model and the problem formulation are described in Section III. In Section IV, the proposed scheme is presented in detail. Section V provides numerical results, and Section VI concludes this paper.

¹In this paper, we use the terms "edge", "edge node" and "edge server" interchangeably.

II. RELATED WORK

In the past few years, computation offloading in mobile edge computing (MEC) has been intensively discussed [16]–[19]. Zhang et al. [16] minimize the average bandwidth consumption with a novel bidirectional computation task model by joint caching and computing offloading policy optimization. Tout et al. [17] find the optimal dissemination of computational tasks within MEC-based infrastructures while satisfying persona needs on a wider range of devices and assuring minimal additional fees imposed by remote execution. Multi-user multi-task computation offloading and resource allocation for mobile edge computing are proposed in [19]. Task offloading in VEC has also been studied [1], [2], [20]–[24]. Tan et al. [2] propose a joint communication, caching and computing strategy to minimize the system cost. To guarantee the reliability of communication systems, powerful error correction codes, such as low-density parity-check (LDPC) codes, can be applied to enhance the anti-noise and anti-fading capability [25]–[27]. Qiao et al. [20] minimize the content access cost for a novel cooperative edge caching framework. Ning et al. [21] design a mobility-aware edge computing and caching scheme to maximize mobile network operators' profits. However, all these works are based on the implicit assumption that all the services are available in edge servers, which is impractical due to their limited storage capacities.

Service caching, which refers to the caching of related programs for computational task execution, can significantly affect the performance of MEC systems since service caching strategies and computation offloading strategies are always coupled. There has been considerable research focusing on joint service caching, computation offloading and resource allocation for mobile users in MEC systems [8]–[12]. Yan et al. [8] propose an MEC service pricing scheme to coordinate with service caching decisions and control wireless devices' task offloading behaviors in a cellular network. Ko et al. [9] maximize a sum-utility for multi-mobile service caching enabled MEC. Bi et al. [10] formulate a mixed integer nonlinear program (MINLP) that jointly optimizes service caching placement, computation offloading decisions, and system resource allocation to minimize the computation delay and energy consumption of a mobile user. Zhang et al. [12] investigate a joint service caching, computation offloading and resource allocation problem in a general multi-user multi-task scenario and aim to minimize the weighted sum of all users' computation cost and delay cost. However, joint caching and computing strategy optimization in MEC systems cannot be directly applied to VEC systems. The high mobility of vehicles results in complicated and dynamic topology as well as frequent server switching.

There have been emerging efforts on content caching and computation offloading in VEC [20], [21], [28]–[31]. Tian et al. [30] propose a collaborative computation offloading and content caching method by leveraging DRL for a self-driving system. Wu et al. [31] propose a multi-agent reinforcement learning (RL) algorithm to make decisions on task offloading and edge caching to optimize both the service latency and the energy consumption of vehicles. The important difference between content caching and service caching is that the latter concerns not only storage capacity but also computing capacity. Thus, service caching brings more challenges to VEC system design. To the best of our knowledge, so far only a few works [13], [14] have investigated the problem of joint optimization of service caching and computation offloading in VEC systems. Tang et al. [13] apply application caching to VEC to optimize the response time for outsourced applications while satisfying a time-slot-spanned energy consumption constraint. The Lyapunov optimization technique is adopted to tackle this constraint. Finally, two greedy heuristic algorithms are incorporated into the drift-plus-penalty based algorithm to help find the approximate optimal solution. Lan et al. [14] propose a fog-
based vehicular architecture featuring computation offloading with service caching and design the offloading strategies based on a DRL algorithm to reduce the task computation delay and energy cost. However, the limitation of storage and computing resources in vehicles or edge servers is not considered in [13], [14]. Based on the existing literature, service caching and computation offloading scenarios involving vehicles and edge servers with limited resources in VEC have not been explored.

In recent years, AI-based algorithms have been successfully applied in numerous related works on emerging vehicular applications and services, which are mostly delay-sensitive. Such smart driving assistance can significantly improve driving safety, reduce energy consumption, and enhance traffic management efficiency [32], [33]. Specifically, RL and DRL have been demonstrated to significantly improve the performance of vehicular task offloading [14], [20], [21], [29]–[31], [34]–[39]. Zhu et al. [35] propose a multiagent DRL-based computation offloading scheme to minimize the total task execution delay of the considered system during a certain period. Since current vehicular services always contain multiple tasks with data dependency, a knowledge-driven service offloading framework is proposed in [36] by exploring the DRL model to find the long-term optimal service offloading policy. Since DRL agents can react to vehicular environment changes in milliseconds to achieve real-time decision making, they have superiority in complex and highly dynamic vehicular networks with fast-varying channels and computation load.

The above studies on VEC are mostly based on the assumption that service programs are completely available in vehicles or edge servers with limited resources, which is not always feasible in practice. To the best of our knowledge, Refs. [13] and [14] are the only existing literature that do not assume full availability of service programs. However, they have not taken the resource limitation into consideration. To fill this gap, we aim at joint computation offloading and service caching, taking full advantage of the limited resources of vehicles and edge nodes as well as edge cooperative offloading to minimize the average task processing time.

TABLE I
PRIMARY NOTATIONS

Notation                  | Definition
V                         | The set of vehicles V = {1, 2, ..., v, ...}
E                         | The set of edge nodes E = {1, 2, ..., e, ..., N_E}
K                         | The set of service programs K = {1, 2, ..., k, ..., N_K}
B                         | The total bandwidth of each edge server
B_{v,e}(t)                | The bandwidth of edge server e allocated to vehicle v in time slot t
d_k                       | The input data size of task k
θ_k                       | The required storage space of service program k
S^V_v (S^E_e)             | The storage at each vehicle (edge)
γ_{v,e}                   | The signal-to-noise ratio from edge e to vehicle v
g_{v,e}(t)                | The uplink gain between edge e and vehicle v in time slot t
f^V (f^E)                 | The fixed CPU frequency of each vehicle (edge)
κ                         | The computing energy efficiency parameter
λ                         | The nature of the service application
c^V_{v,k} (c^E_{e,k})     | The caching decision of vehicle v's (edge e's) service program k
o^V_{v,k} (o^E_{e,k})     | The offloading decision of vehicle v's (edge e's) task k
N^{task}_e(t)             | The total number of tasks received at edge e in time slot t
R_{edge} (R_{cloud})      | The transmission rate from edge to edge (cloud)
p_{edge} (p_{cloud})      | The transmission power from edge to edge (cloud)

III. SYSTEM MODEL AND PROBLEM FORMULATION

In this section, we first provide the system overview, the communication model, as well as the service caching and task offloading model. Then we analytically derive the computation delay and energy consumption, and provide an extreme value analysis of the single-task processing delay and edge node energy consumption. Last, we present our problem formulation, which is essential for service caching and task offloading scheme decision making. The primary notations utilized in the following are summarized in Table I.

A. System Overview

Fig. 1. System illustration.

As illustrated in Fig. 1, we consider a general vehicular network consisting of an edge pool (the set of edge nodes is denoted as E = {1, 2, ..., e, ..., N_E}), the cloud and a number of moving vehicles (the set of vehicles is denoted as V = {1, 2, ..., v, ...}). Suppose that there are N_K service programs (e.g., executable .EXE files) corresponding to service-dependent tasks in the system (i.e., running these tasks requires precaching of their corresponding service programs). If the associated service program of a requested task has been cached at the cloud or the edge pool, vehicles can offload a portion of the computing tasks via wireless communication (e.g., cellular vehicle-to-everything (C-V2X)) to the edge pool or the cloud, depending on the trajectory of the vehicle and the location of the cached service program. The edge pool consists of a set of interconnected edge servers to balance different computing and caching resources. Due to the mobility of vehicles, cooperative edge offloading between multiple edge servers can further improve the caching efficiency. However, unlike the cloud, which has abundant computing and storage resources, the limited computing and caching resources of edge nodes allow only a small set of service programs to be cached at the same time. Therefore, an AI-based management controller (i.e., agent) is typically deployed at the edge pool, which collects information from vehicles and edge servers, and makes decisions on service caching and task offloading. To this end, time is divided into a set of discrete time slots, indexed
by {1, 2, ..., t, ..., t_end}, each of which has an equal duration Δt. The service caching and task offloading strategy can be updated at each time slot.

B. Communication Model

To characterize a practical vehicle moving environment, vehicles with driving speeds ranging from V_min to V_max are considered [4], [5], and each edge node is assumed to be aware of all the vehicles belonging to its coverage area. We consider a multi-channel uplink model, where each edge has an overall bandwidth of B, and each channel has two possible states (i.e., occupied and unoccupied). Note that the allocated spectrum is the same for different edge nodes and the allocated channels are all orthogonal, so that the interference is negligible in the coverage area of the same edge node. We assume that each vehicle can only send one task request in time slot t, with N^{task}_e(t) denoting the total number of tasks received at edge node e in time slot t. In this case, the signal-to-interference-plus-noise ratio (SINR) between vehicle v and edge e in time slot t is given by

\gamma_{v,e}(t) = \frac{g_{v,e}(t)\, p_{v,e}(t)}{\sum_{e=1}^{N_E} g_{v,e}(t) \sum_{v=1}^{N^{task}_e(t)} p_{v,e}(t) + \sigma^2},   (1)

where σ² refers to the variance of the additive white Gaussian noise, g_{v,e}(t) denotes the average channel gain, and p_{v,e}(t) represents the average channel transmission power of edge node e.

If vehicle v needs to offload data to the nearest edge e, the wireless transmission rate at time slot t is calculated based on the Shannon formula:

R_{v,e}(t) = B_{v,e}(t) \log_2(1 + \gamma_{v,e}(t)),   (2)

where B_{v,e}(t) is the bandwidth allocated to vehicle v and B_{v,e}(t) = B / N^{task}_e(t).

C. Service Caching and Task Offloading Model

At the beginning of each time slot t, vehicles entering the range of edge nodes update task requests, and complete task offloading and computation in this time slot. Before the end of each time slot, service caching decisions for edge nodes are made by the management controller, while those for vehicles are made by the vehicles themselves based on their interests. Let K = {1, 2, ..., k, ..., N_K} denote the set of service programs. Each task can be represented as {d_k, θ_k}, k ∈ K, where d_k denotes the size of the input data and θ_k denotes the required storage space of service program k. Then we have the following storage capacity constraints:

\sum_{k=1}^{N_K} c^V_{v,k}(t)\, \theta_k \le S^V_v, \quad \forall v, t,   (3)

and

\sum_{k=1}^{N_K} c^E_{e,k}(t)\, \theta_k \le S^E_e, \quad \forall e, t,   (4)

where c^V_{v,k}(t) ∈ {0, 1} and c^E_{e,k}(t) ∈ {0, 1} are binary decision variables denoting whether service program k is cached (i.e., c^V_{v,k}(t) = 1, c^E_{e,k}(t) = 1) or not (i.e., c^V_{v,k}(t) = 0, c^E_{e,k}(t) = 0) on vehicle v and edge node e in time slot t. S^V_v and S^E_e are the storage capacities of each vehicle and each edge server, respectively.
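For concreteness, the following minimal Python sketch evaluates (1)–(4) for one edge node. The helper names and the sample numbers are illustrative assumptions; only the 20 MHz bandwidth, the 4–5 dB SINR range and the storage budgets are taken from the later Table III.

```python
import math

def sinr(g_ve, p_ve, interference, noise_var):
    """Eq. (1): SINR of vehicle v at edge e; `interference` aggregates the
    g * p products over edges and concurrently served vehicles."""
    return (g_ve * p_ve) / (interference + noise_var)

def rate(B, n_tasks, gamma):
    """Eq. (2): Shannon rate with equal bandwidth sharing B_{v,e} = B / N_e^task."""
    return (B / n_tasks) * math.log2(1.0 + gamma)

def caching_feasible(c, theta, S):
    """Eqs. (3)-(4): a 0/1 caching vector c is feasible if the cached
    programs fit within the storage budget S."""
    return sum(ci * th for ci, th in zip(c, theta)) <= S

gamma = 10 ** (4 / 10)                       # 4 dB SINR converted to linear
print(rate(20e6, 4, gamma) / 1e6, "Mbps")    # ~9.1 Mbps per vehicle, 4 sharing vehicles
print(caching_feasible([1, 0, 1], [50, 50, 50], 100))  # True: 100 Mb <= 100 Mb
```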
In our system model, a partial offloading strategy is used for vehicles' tasks in each time slot, as illustrated in Fig. 2. For example, when vehicle v within communication range of edge e initiates an offloading request for task k, if the corresponding service program is not locally cached, this task will be completely offloaded to the nearest edge. Then, if the edge pool does not have the needed service program for task k precached, the task will be uploaded to the cloud for execution (Case 6 in Fig. 2). If program k is cached in both the vehicle and its nearest edge node, the task can be handled at edge nodes by partial offloading (Case 2 in Fig. 2). If program k is cached in the nearest edge node and the edge pool but not in the vehicle itself, the vehicle completely offloads task k to the nearest edge node, and then task k is partially offloaded from the nearest edge node to the edge pool (Case 4 in Fig. 2).

Fig. 2. Flowchart of task offloading.

As can be seen from Fig. 2, tasks are offloaded to different computing nodes based on the corresponding service caching strategies. Therefore, caching and computing resources need to be properly allocated by the agent to achieve maximum benefits.

TABLE II
TASK OFFLOADING RATIO OF DIFFERENT CACHING CASES

Case   | Caching conditions                                                          | In vehicle           | At nearest edge                         | In edge pool                    | In cloud
Case 1 | c^V_{v,k}(t)=1, c^E_{e,k}(t)=0, \sum_{i=1,i\ne e}^{N_E} c^E_{i,k}(t) \ge 0  | 1 − o^V_{v,k}(t) = 1 | o^V_{v,k}(t) = 0                        | 0                               | 0
Case 2 | c^V_{v,k}(t)=1, c^E_{e,k}(t)=1, \sum_{i=1,i\ne e}^{N_E} c^E_{i,k}(t) \ge 0  | 1 − o^V_{v,k}(t)     | o^V_{v,k}(t)                            | 0                               | 0
Case 3 | c^V_{v,k}(t)=0, c^E_{e,k}(t)=1, \sum_{i=1,i\ne e}^{N_E} c^E_{i,k}(t) = 0    | 1 − o^V_{v,k}(t) = 0 | o^V_{v,k}(t)(1 − o^E_{e,k}(t)) = 1      | o^V_{v,k}(t) o^E_{e,k}(t) = 0   | 0
Case 4 | c^V_{v,k}(t)=0, c^E_{e,k}(t)=1, \sum_{i=1,i\ne e}^{N_E} c^E_{i,k}(t) > 0    | 1 − o^V_{v,k}(t) = 0 | o^V_{v,k}(t)(1 − o^E_{e,k}(t))          | o^V_{v,k}(t) o^E_{e,k}(t)       | 0
Case 5 | c^V_{v,k}(t)=0, c^E_{e,k}(t)=0, \sum_{i=1,i\ne e}^{N_E} c^E_{i,k}(t) > 0    | 1 − o^V_{v,k}(t) = 0 | o^V_{v,k}(t)(1 − o^E_{e,k}(t)) = 0      | o^V_{v,k}(t) o^E_{e,k}(t) = 1   | 0
Case 6 | c^V_{v,k}(t)=0, c^E_{e,k}(t)=0, \sum_{i=1,i\ne e}^{N_E} c^E_{i,k}(t) = 0    | 0                    | 0                                       | 0                               | 1

D. Computation Delay and Energy Consumption

Following [10], the time and energy consumption for the computation of task k are calculated as

D_k = \omega_k / f_k,   (5)

and

\varepsilon_k = \kappa f_k^{\alpha} D_k = \kappa \omega_k (f_k)^{\alpha-1},   (6)

respectively, where f_k denotes the CPU frequency and is constrained by a maximum frequency f_max (i.e., f_k < f_max), κ (κ > 0) denotes the computing energy efficiency parameter, and α (in this work we assume that α = 2) denotes the exponent parameter. ω_k denotes the number of cycles needed for service program k, which is expressed as the amount of computation input data d_k multiplied by a factor λ, i.e., ω_k = λ d_k. Here λ (λ > 0) is determined by the nature of the service application (e.g., its computational complexity) [40]. f^V and f^E denote the fixed CPU frequencies of the vehicle and the edge server, respectively.

When vehicle v within communication range of the nearest edge e initiates an offloading request for task k, the agent selects the device nodes that task k should be offloaded to according to the cache placement of the service program, and determines the offloading ratios for resource allocation. Let o^V_{v,k}(t) ∈ [0, 1] and o^E_{e,k}(t) ∈ [0, 1] be continuous decision variables denoting the ratio of task k offloaded to the nearest edge and to the edge pool, respectively; then (1 − o^V_{v,k}(t)) and (1 − o^E_{e,k}(t)) are the remaining shares of the task that are executed locally at vehicle v and at the nearest edge e, respectively. Table II lists the mathematical expressions of the task offloading ratio in the different caching cases (corresponding to those in Fig. 2). Based on those, the computation time and transmission time of tasks involving the different caching cases are derived as follows.

The local execution delay of task k in vehicle v at time slot t can be given as

T^{local}_{v,e,k}(t) = c^V_{v,k}(t) (1 - o^V_{v,k}(t)) \frac{\lambda d_k}{f^V}.   (7)

If task k needs to be offloaded to the nearest edge e, the time consumption for the input data offloading of task k is

T^{up}_{v,e,k}(t) = \begin{cases} \varphi(\sum_{e=1}^{N_E} c^E_{e,k}(t))\, o^V_{v,k}(t) \frac{d_k}{R_{v,e}(t)}, & R_{v,e}(t) > 0, \\ 0, & R_{v,e}(t) = 0, \end{cases}   (8)

where

\varphi(x) = \begin{cases} 1, & \text{if } x > 0, \\ 0, & \text{if } x = 0. \end{cases}   (9)
In (8), R_{v,e}(t) = 0 indicates that the caching scheme belongs to Case 1, and the task does not need to be offloaded to edge nodes. The execution time on the nearest edge is expressed as

T^{edge}_{v,e,k}(t) = c^E_{e,k}(t)\, o^V_{v,k}(t) (1 - o^E_{e,k}(t)) \frac{\lambda d_k}{f^E}.   (10)

If the nearest edge server e is unavailable and task k needs to be further offloaded to other edge servers in the edge pool, the uploading time is obtained as

T^{uppool}_{v,e,k}(t) = (1 - c^V_{v,k}(t))\, \varphi(\sum_{i=1, i \ne e}^{N_E} c^E_{i,k}(t))\, o^V_{v,k}(t)\, o^E_{e,k}(t) \times \frac{d_k}{R_{edge}},   (11)

where R_{edge} denotes the transmission rate among edge servers. Then, the delay resulting from the computation at the edge pool can be given as

T^{pool}_{v,e,k}(t) = (1 - c^V_{v,k}(t))\, \varphi(\sum_{i=1, i \ne e}^{N_E} c^E_{i,k}(t))\, o^V_{v,k}(t)\, o^E_{e,k}(t) \times \frac{\lambda d_k}{f^E}.   (12)

Since the output data is typically much smaller than the input task size, we ignore the time delay of the output return process [31]. As the computing and storage resources are abundant in the cloud, we ignore the queuing and computing latency incurred by the cloud. However, users need to consider the latency for task offloading to the cloud. Considering the transmission delay and calculation delay under different caching and offloading decisions, the total delay of processing task k in time slot t can be expressed as

T^{total}_{v,e,k}(t) = \max\{ T^{local}_{v,e,k}(t),\; T^{up}_{v,e,k}(t) + \max\{ T^{edge}_{v,e,k}(t),\; T^{uppool}_{v,e,k}(t) + T^{pool}_{v,e,k}(t) \} \} + (1 - c^V_{v,k}(t))(1 - \varphi(\sum_{e=1}^{N_E} c^E_{e,k}(t))) \times \left( \frac{d_k}{R_{v,e}(t)} + \frac{d_k}{R_{cloud}} \right),   (13)

which can represent all the cases of service caching and task offloading in Table II. R_{cloud} denotes the transmission rate from an edge server to the cloud.

Additionally, the energy consumption of the computing and offloading of task k on the edge server in time slot t is

\varepsilon^{total}_{v,e,k}(t) = \kappa (f^E)^2 \left( T^{edge}_{v,e,k}(t) + T^{pool}_{v,e,k}(t) \right) + p_{edge}\, T^{uppool}_{v,e,k}(t) + (1 - c^V_{v,k}(t))(1 - \varphi(\sum_{e=1}^{N_E} c^E_{e,k}(t))) \frac{p_{cloud}\, d_k}{R_{cloud}},   (14)

where p_{edge} denotes the transmission power from an edge server to another edge server and p_{cloud} denotes the transmission power of communication from an edge server to the cloud.

We define the average delay of task processing in time slot t as

T_d(t) = \frac{1}{N_E} \sum_{e=1}^{N_E} \frac{1}{N^{task}_e(t)} \sum_{v=1}^{N^{task}_e(t)} T^{total}_{v,e,k}(t).   (15)

Similarly, the average energy consumption of edge servers in time slot t can be calculated as

\varepsilon(t) = \frac{1}{N_E} \sum_{e=1}^{N_E} \frac{1}{N^{task}_e(t)} \sum_{v=1}^{N^{task}_e(t)} \varepsilon^{total}_{v,e,k}(t).   (16)

E. Problem Formulation

With the emergence of interactive AR/VR services, user experience is crucial for user retention, which is the key to the success of those services. As an important aspect of the user experience, the task processing delay is gradually becoming a crucial wireless network performance metric. The main goal of our work is to design a joint computation offloading and service caching scheme for the purpose of minimizing the long-term cumulative average task processing time. Specifically, the optimization problem can be described as:

\min_{\{c^E_{e,k}(t),\, o^V_{v,k}(t),\, o^E_{e,k}(t)\}} \sum_{t=1}^{t_{end}} \gamma^{t-1} \left( \frac{T_d(t)}{T_d^{max}} \right),   (17)

s.t.
c^V_{v,k}(t) \in \{0, 1\}, \forall v \in V, \forall k \in K, \forall t \in \{1, 2, ..., t_{end}\},   (17a)
c^E_{e,k}(t) \in \{0, 1\}, \forall e \in E, k, t,   (17b)
0 \le o^V_{v,k}(t) \le 1, \forall v, k, t,   (17c)
0 \le o^E_{e,k}(t) \le 1, \forall e, k, t,   (17d)
0 \le \gamma \le 1,   (17e)
\sum_{k=1}^{N_K} c^V_{v,k}(t)\, \theta_k \le S^V_v, \forall v, t,   (17f)
\sum_{k=1}^{N_K} c^E_{e,k}(t)\, \theta_k \le S^E_e, \forall e, t,   (17g)

where γ is the discount factor, and T_d^{max} is the maximum tolerable delay (a constant) with T_d^{max} = \max\{\max\{T_d(t)\}\} = \max\{\max\{T^{total}_{v,e,k}(t)\}\}. Constraints (17f) and (17g) are the caching capacity limitations of each vehicle and each edge server. Note that the problem is a long-term MINLP problem and NP-hard. In order to solve this optimization problem, the key is to make appropriate decisions on task offloading and service caching at each time period. Moreover, the constantly changing status of vehicle participation and ephemeral interactions increase the operational complexity of the edge management controller. The system state space becomes large as the numbers of vehicles and edges increase. Thus, we need to find an effective approach to address these issues.
F. Extreme Value Analysis of T^{total}_{v,e,k}(t) and ε^{total}_{v,e,k}(t)

In order to evaluate the maximum and minimum values of T^{total}_{v,e,k}(t), we need to derive the task processing delay in time slot t under the different caching cases, in which the caching decision variables c^V_{v,k}(t) and c^E_{e,k}(t) are no longer relevant.

1) Case 1:
T_d^{Case1}(t) = \frac{\lambda d_k}{f^V}.   (18)

2) Case 2:
T_d^{Case2}(t) = \max\{ (1 - o^V_{v,k}(t)) \frac{\lambda d_k}{f^V},\; o^V_{v,k}(t) ( \frac{d_k}{R_{v,e}(t)} + \frac{\lambda d_k}{f^E} ) \}.   (19)

3) Case 3:
T_d^{Case3}(t) = \frac{d_k}{R_{v,e}(t)} + \frac{\lambda d_k}{f^E}.   (20)

4) Case 4:
T_d^{Case4}(t) = \frac{d_k}{R_{v,e}(t)} + \max\{ (1 - o^E_{e,k}(t)) \frac{\lambda d_k}{f^E},\; o^E_{e,k}(t) ( \frac{d_k}{R_{edge}} + \frac{\lambda d_k}{f^E} ) \}.   (21)

5) Case 5:
T_d^{Case5}(t) = \frac{d_k}{R_{v,e}(t)} + \frac{d_k}{R_{edge}} + \frac{\lambda d_k}{f^E}.   (22)

6) Case 6:
T_d^{Case6}(t) = \frac{d_k}{R_{v,e}(t)} + \frac{d_k}{R_{cloud}}.   (23)

From the above analysis, we have

\max\{T^{total}_{v,e,k}(t)\} = \max_{i=1,...,6} \{T_d^{Case\,i}(t)\} = \max\{ T_d^{Case1}(t), T_d^{Case5}(t), T_d^{Case6}(t) \} = \max\{ \frac{\lambda d_k}{f^V},\; \frac{d_k}{R_{v,e}(t)} + \frac{d_k}{R_{edge}} + \frac{\lambda d_k}{f^E},\; \frac{d_k}{R_{v,e}(t)} + \frac{d_k}{R_{cloud}} \},   (24)

\min\{T^{total}_{v,e,k}(t)\} = \min_{i=1,...,6} \{T_d^{Case\,i}(t)\} = \min\{ T_d^{Case2}(t), T_d^{Case4}(t), T_d^{Case6}(t) \}.   (25)

In order to eliminate the offloading decision variables, it is easy to obtain the minimum delays of Case 2 and Case 4 based on their properties as piecewise linear functions:

\min\{T_d^{Case2}(t)\} = \frac{d_k (f^E + \lambda R_{v,e}(t))}{\frac{f^V f^E}{\lambda} + f^V R_{v,e}(t) + f^E R_{v,e}(t)},   (26)

where o_{v,k}(t) = 1 / ( 1 + \frac{f^V}{\lambda} ( \frac{1}{R_{v,e}(t)} + \frac{\lambda}{f^E} ) ), and

\min\{T_d^{Case4}(t)\} = \frac{d_k}{R_{v,e}(t)} + \frac{\lambda d_k (\lambda R_{edge} + f^E)}{f^E (2 \lambda R_{edge} + f^E)},   (27)

where o_{e,k}(t) = 1 / ( 2 + \frac{f^E}{\lambda R_{edge}} ).

Hence, the minimum value of T^{total}_{v,e,k}(t) can be expressed as

\min\{T^{total}_{v,e,k}(t)\} = \min\{ \min\{T_d^{Case2}(t)\}, \min\{T_d^{Case4}(t)\}, T_d^{Case6}(t) \}.   (28)

The analysis above indicates that the maximum and minimum task processing delays are related to the communication, computing and caching capabilities of each device. Meanwhile, aiming at minimizing the delay, caching decisions corresponding to the maximum tolerated delay should be avoided as much as possible.

Similarly, in order to evaluate the maximum and minimum values of ε^{total}_{v,e,k}(t), we need to derive the energy consumption of edge servers in time slot t under the different caching cases.

1) Case 1:
\varepsilon^{Case1}(t) = 0.   (29)

2) Case 2:
\varepsilon^{Case2}(t) = \kappa (f^E)^2 \left( o_{v,k}(t) \frac{\lambda d_k}{f^E} \right).   (30)

3) Case 3:
\varepsilon^{Case3}(t) = \kappa (f^E)^2 \left( \frac{\lambda d_k}{f^E} \right).   (31)

4) Case 4:
\varepsilon^{Case4}(t) = \kappa (f^E)^2 \left( (1 - o_{e,k}(t)) \frac{\lambda d_k}{f^E} + o_{e,k}(t) \frac{\lambda d_k}{f^E} \right) + o_{e,k}(t)\, p_{edge} \frac{d_k}{R_{edge}} = \kappa f^E \lambda d_k + o_{e,k}(t)\, p_{edge} \frac{d_k}{R_{edge}}.   (32)

5) Case 5:
\varepsilon^{Case5}(t) = \kappa f^E \lambda d_k + p_{edge} \frac{d_k}{R_{edge}}.   (33)

6) Case 6:
\varepsilon^{Case6}(t) = p_{cloud} \frac{d_k}{R_{cloud}}.   (34)

Based on the above analysis, we have

\max\{\varepsilon^{total}_{v,e,k}(t)\} = \max_{i=1,...,6} \{\varepsilon^{Case\,i}(t)\} = \max\{ \varepsilon^{Case5}(t), \varepsilon^{Case6}(t) \} = \max\{ \kappa f^E \lambda d_k + p_{edge} \frac{d_k}{R_{edge}},\; p_{cloud} \frac{d_k}{R_{cloud}} \}.   (35)

From (29) it can be easily observed that \min\{\varepsilon^{total}_{v,e,k}(t)\} = 0.

From the analysis of the maximum and minimum values of T^{total}_{v,e,k}(t), it can be seen that in order to reduce the task processing delay, it is necessary to make Case 2 and Case 4 caching decisions as often as possible; that is, tasks should be offloaded to the edge nodes as much as possible. On the other hand, energy consumption is also a critical indicator for edge server operators. Offloading tasks to edge nodes can reduce task processing delays, but it increases the energy consumption of edge servers.
IV. DEEP REINFORCEMENT LEARNING FOR EDGE CACHING AND OFFLOADING

Due to diverse user demands and the constrained computation and caching resources, it is complex to minimize the cumulative system average delay in (17). Most traditional optimization methods (e.g., convex optimization, game theory, etc.) assume knowledge of key factors in vehicular networks, such as channel conditions and content popularity. However, these key factors are time-varying and unavailable in reality. Such methods can merely achieve optimal or near-optimal results for one snapshot, since they ignore how the current decision exerts long-term influence on resource allocation [21]. DRL is viewed as an efficient way to solve complicated problems in a dynamic environment by optimizing the expected cumulative reward. In DRL, an agent collects the needed information regarding the diverse demands of users and the available resources in vehicular networks. Then the agent takes an action to manage offloading and caching decisions and optimizes resource allocation. We formulate the joint optimization problem as a discrete-time Markov decision process (MDP) and propose a DDPG-based edge caching and offloading scheme for joint service caching and computation offloading strategy design.

A. Problem Formulation Based on DRL

1) State space: At the initial phase of each time slot t, each edge server gathers all the environmental parameters, which contain the following parts:
• I_{v,k}(t): The request indicator for task k by vehicle v within the coverage range of edge node e at time slot t.
• c^V_{v,k}(t): The service caching indicator for vehicle v within the coverage range of edge node e at time slot t.
• B_{v,e}(t), γ_{v,e}(t): The bandwidth that edge node e allocates to vehicle v and the received SINR of edge node e at time slot t.
• c^E_{e,k}(t): The service caching indicator of edge node e at time slot t.

The state of edge node e at time slot t is denoted as

s_e(t) = \{ [I(t)]_{\rho_{max} \times N_K}, [c^V(t)]_{\rho_{max} \times N_K}, [B(t)]_{\rho_{max} \times N_K}, [\gamma(t)]_{\rho_{max} \times N_K}, [c^E(t)]_{\rho_{max} \times N_K} \},   (36)

where ρ_max denotes the maximum vehicle density within the coverage of each edge node.² The system state at time slot t consists of the states of all edge nodes, defined as

s_t = \{s_1(t), s_2(t), ..., s_{N_E}(t)\},   (37)

where the state of edge node e is denoted as s_e(t) after dimensionality reduction and normalization.

²Bold letters are used to denote matrices.

2) Action space: The agent obtains the states of service caching and the communication information between vehicles and edge nodes, and then decides the offloading ratios of all tasks and updates the service caching of all edge nodes. Here, the corresponding action contains the following parts:
• c^E_{e,k}(t): The service caching indicator of edge node e after completing all task offloading and calculation in time slot t.
• o^V_{v,k}(t): The proportion of vehicle v's task k offloaded to edge node e in time slot t.
• o^E_{e,k}(t): The proportion of task k offloaded by edge node e to the edge pool in time slot t.

The action of edge node e in time slot t is denoted as

a_e(t) = \{ [c^E(t)]_{\rho_{max} \times N_K}, [o^V(t)]_{\rho_{max} \times N_K}, [o^E(t)]_{\rho_{max} \times N_K} \}.   (38)

The action of each edge node is likewise denoted as a_e(t) after dimensionality reduction and normalization. The system action at time slot t consists of the actions of all edge nodes, defined as

a_t = \{a_1(t), a_2(t), ..., a_{N_E}(t)\}.   (39)

3) Reward: The agent's behavior is reward-based, and the reward should correspond with the objective function. Hence, we set the reward in time slot t as

r_t = r(s_t, a_t) = -\left( \frac{T_d(t)}{T_d^{max}} \right).   (40)

B. DDPG-Based Edge Caching and Offloading Scheme

Since the state space consists of a great amount of dynamic environmental information and the action space contains continuous values, we exploit the deep deterministic policy gradient (DDPG) algorithm, a model-free actor-critic algorithm, to solve the joint computation offloading and service caching problem. The framework of the DDPG-based method is illustrated in Fig. 3, consisting of primary networks, target networks and a replay buffer.

Based on the deterministic actor-critic model, we leverage deep neural networks to provide an accurate estimation of the deterministic policy function µ(s_t) and the action-value function Q(s_t, a_t), which should satisfy the following condition:

Q(s_t, \mu(s_t|\theta^{\mu})|\theta^{Q}) \approx r(s_t, a_t) + \gamma Q(s_{t+1}, \mu(s_{t+1}|\theta^{\mu'})|\theta^{Q'}).   (41)

As shown in Fig. 3, the primary networks use the actor network µ(s_t|θ^µ) and the critic network Q(s_t, a_t|θ^Q) to approximate the policy function and the Q-value function, respectively. In addition, the target networks contain an actor network µ(s_t|θ^{µ'}) and a critic network Q(s_t, a_t|θ^{Q'}) with the same structure. The target Q-value can be represented as

y_t = r(s_t, a_t) + \gamma Q^{\mu}(s_{t+1}, \mu(s_{t+1}|\theta^{\mu'})|\theta^{Q'}).   (42)

We utilize the actor network to explore the policies and the critic network to evaluate them. The actor network architecture is illustrated in Fig. 4, which takes the state s_t as input and outputs an action a_t. The action variable c^E_{e,k}(t) needs to be discretized (i.e., ⌈c^E_{e,k}(t)⌉).

At the beginning of each time slot t, the agent collects environmental information to get the current system state s_t. In order to solve the exploration problem of a deterministic policy, we construct the action by adding behavior noise n_t to obtain a_t = µ(s_t|θ^µ) + n_t.
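As a minimal sketch of how the state (36)–(37), the action (38)–(39) and the reward (40) could be assembled for the agent: the ρ_max × N_K matrix shapes follow the text, while the simple flattening and the omission of the normalization step are simplifying assumptions.

```python
import numpy as np

rho_max, N_K, N_E = 5, 5, 3   # Table III scale: vehicles per edge, services, edges

def edge_state(I, c_V, B, gamma, c_E):
    """Eq. (36): stack the rho_max x N_K matrices observed at one edge and
    flatten them (the normalization mentioned in the text is omitted here)."""
    return np.concatenate([m.reshape(-1) for m in (I, c_V, B, gamma, c_E)])

def system_state(per_edge):
    """Eq. (37): the system state concatenates all per-edge states."""
    return np.concatenate(per_edge)

def reward(T_d, T_d_max):
    """Eq. (40): negative normalized average delay, bounded in [-1, 0]."""
    return -(T_d / T_d_max)

# Random illustrative observations for each of the three edges.
rng = np.random.default_rng(0)
obs = [edge_state(*(rng.random((rho_max, N_K)) for _ in range(5)))
       for _ in range(N_E)]
s_t = system_state(obs)
print(s_t.shape, reward(0.4, 1.0))   # (375,) -0.4
```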
After vehicles and edges carry out the computation offloading and service caching scheme based on action a_t, the agent can observe the next state s_{t+1} and the immediate reward r_t. The transition (s_t, a_t, r_t, s_{t+1}) is then stored in the replay buffer. This operation avoids sample correlation during the training process. After that, the algorithm randomly selects N transitions from the replay buffer to make up a mini-batch sample and inputs it into the primary networks and target networks to update the network parameters. Then, we update the parameter θ^Q in the primary critic network by minimizing the loss function, i.e.,

J(\theta^{Q}) = \frac{1}{N} \sum_{j=1}^{N} \left( y_j - Q^{\mu}(s_j, \mu(s_j|\theta^{\mu})|\theta^{Q}) \right)^2.   (43)

The parameter θ^µ in the primary actor network is updated by the policy gradient, which can be expressed as

\nabla_{\theta^{\mu}} J = \frac{1}{N} \sum_{j=1}^{N} \left[ \nabla_a Q(s, a|\theta^{Q})|_{s=s_j, a=\mu(s_j|\theta^{\mu})} \times \nabla_{\theta^{\mu}} \mu(s|\theta^{\mu})|_{s=s_j} \right].   (44)

Finally, we utilize the soft updating method instead of copying the parameters θ^Q and θ^µ to partially update the parameters of the target networks, which can be expressed as

\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau) \theta^{Q'},
\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau) \theta^{\mu'},   (45)

where τ is the updating coefficient.

Fig. 3. The framework of the DDPG-based method for vehicular edge caching and offloading.

Fig. 4. The actor network architecture.

The whole process of the DDPG-based edge caching and offloading scheme is presented in Algorithm 1.

Algorithm 1 DDPG-Based Edge Caching and Offloading Algorithm
Input: The initial parameters: γ, θ^µ, θ^{µ'}, θ^Q, θ^{Q'}, τ, M, t_end, D, N.
Output: Primary optimal actor network parameter θ^µ.
1: Initialize the primary networks and target networks.
2: Empty the experience replay buffer D.
3: for episode = 1, 2, ..., M do
4:   Initialize the observation state s_0.
5:   Add a random Gaussian distributed behavior noise n_t for action exploration.
6:   for t = 1, 2, ..., t_end do
7:     The agent receives the normalized observation state s_t.
8:     Select action a_t = µ(s_t|θ^µ) + n_t.
9:     Perform action a_t, calculate the immediate reward r_t, and obtain the next normalized state s_{t+1}.
10:    if the replay buffer is not full then
11:      Store transition (s_t, a_t, r_t, s_{t+1}) in replay buffer D.
12:    else
13:      Randomly replace a transition in replay buffer D with (s_t, a_t, r_t, s_{t+1}).
14:      Randomly sample a mini-batch of N transitions (s_j, a_j, r_j, s_{j+1}), ∀j = 1, 2, ..., N from D.
15:      Set y_j = r(s_j, a_j) + γ Q^µ(s_{j+1}, µ(s_{j+1}|θ^{µ'})|θ^{Q'}).
16:      Update θ^Q in the critic network by minimizing the loss according to (43).
17:      Update the actor network θ^µ by the policy gradient according to (44).
18:      Update the target networks according to (45).
19:    end if
20:  end for
21: end for
V. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the proposed scheme through numerical simulations. We first provide the experimental settings in Sec. V-A and then present extensive simulation results in Sec. V-B.

TABLE III
SIMULATION PARAMETERS

Parameter                                               | Value
Total number of time slots (t_end)                      | 40
Duration of each time slot (Δt)                         | 30 s
Density of vehicles within range of an edge (ρ)         | [2, 5]
Communication range of each edge                        | 500 m
Bandwidth of each edge server (B)                       | 20 MHz
SINR                                                    | 4–5 dB
Number of edge servers (N_E)                            | 3
Number of service types (N_K)                           | 5
Data size of each task (d_k)                            | 20 Mb
Storage space of each service program (θ_k)             | 50 Mb
CPU frequency of each vehicle (f^V)                     | 5 × 10^8 cycles/s
CPU frequency of each edge server (f^E)                 | 1 × 10^9 cycles/s
Computation intensity of each task (λ)                  | 105 cycles/bit
Storage capacity of each vehicle (S^V_v)                | 50 Mb
Storage capacity of each edge server (S^E_e)            | 100 Mb
Computing energy efficiency parameter (κ)               | 1 × 10^{-26}
Transmission rate between edge servers (R_edge)         | 15 Mbps
Transmission rate from edge to the cloud (R_cloud)      | 10 Mbps
Transmission power between edge servers (p_edge)        | 1 W
Transmission power from edge to the cloud (p_cloud)     | 2 W
Discount factor (γ)                                     | 0.99
Learning rate of the actor network (lr_a)               | 0.001
Learning rate of the critic network (lr_c)              | 0.002
Soft update coefficient (τ)                             | 0.01
Size of mini-batch sample (N)                           | 128
Size of the experience replay buffer (D)                | 10000

A. Experimental Settings

The involved parameters along with their corresponding values are listed in Table III. The previous works [10], [12], [31] provide rich experience for the parameter settings, including the duration of each time slot Δt, κ, and so on. For instance, Δt should be set appropriately such that, on one hand, the agent has enough time to make caching and offloading decisions, and, on the other hand, tasks offloaded to other nodes can be completely accomplished by the end of the time slot. We use Python 3.6 to create a simulation environment for the considered vehicular edge caching and task offloading system. In the simulation, each vehicle randomly requests its task of interest at the beginning of each time slot. Furthermore, we use TensorFlow 1.14.0 to implement the DDPG-based edge caching and offloading scheme.

We consider the following benchmark methods for performance comparison:
1) Offloading without edge caching: Tasks requested by a vehicle are computed locally or offloaded to the cloud.
2) Offloading based on latency minimization: Task offloading is performed according to the ratio that achieves the minimum latency in each case, based on the DDPG learning algorithm.
3) Offloading based on energy minimization: Task offloading is performed according to the ratio that achieves the minimum energy consumption in each case, based on the DDPG learning algorithm.
4) Random edge caching and offloading: The service caching and task offloading ratios are random in each time slot.
5) Least recently used (LRU) edge caching and offloading: The services requested at an edge server in the previous time slot continue to be cached in the next time slot, the services that have not been requested are randomly replaced [41], and the offloading ratio is determined according to the offloading scheme based on latency minimization.
6) Executing all tasks in the cloud: All tasks are offloaded to the cloud for execution.

B. Simulation Results

First, we compare the total delay per episode of the different schemes based on the DDPG learning algorithm. It can be seen in Fig. 5 that all schemes approach a stable cumulative average delay as the number of episodes increases. Meanwhile, since the energy consumption is related to the task processing delay, we evaluate the total energy consumption per episode in Fig. 6. As the number of episodes increases, except for the offloading without edge caching scheme, the total delay of all the other considered schemes decreases and reaches a stable value, while the total energy consumption increases and reaches a stable value. As analyzed in Sec. III-F, to reduce latency, it is necessary to offload tasks to edge nodes as much as possible, which on the other hand increases the energy consumption of edge nodes. This is verified in Fig. 5 and Fig. 6. Another notable point is that our proposed scheme achieves the lowest task processing latency for the same energy consumption of edge nodes. The offloading without edge caching scheme keeps the maximum delay, which is approximately the same as the total delay that the offloading scheme based on energy minimization converges to. This is because, when aiming to minimize energy consumption, the offloading ratios are determined as the ones that minimize energy consumption in all the considered cases. However, the offloading without edge caching scheme consumes the maximum energy. The offloading without edge caching scheme remains flat because the agent cannot participate in decision making, and the edge servers cannot provide computing and caching resources. Our proposed scheme yields the lowest total delay compared with the other benchmark schemes, which demonstrates the efficiency of the DDPG-based edge caching and offloading scheme.

Next, we investigate the effects of vehicle density on the total delay and energy consumption for the different schemes in Fig. 7. As the vehicle density ρ increases, the number of task requests increases, while the bandwidth allocated to each vehicle and the transmission rate of tasks uploaded to edge nodes decrease, resulting in an increase in the total task processing delay. When ρ = 2, 3, the total latency of the executing all tasks in the cloud scheme is lower than that of the offloading without edge caching scheme. This is because, as each vehicle occupies more bandwidth resources, tasks can be uploaded faster to the cloud, which has powerful computing resources. As the vehicle density continues to increase, this superiority vanishes. It can be observed that our proposed scheme outperforms the other methods in terms of the total delay in the period t_end.
Fig. 5. The performance of total delay per episode.

Fig. 6. The performance of total energy per episode.

Fig. 7. The effect of the vehicle density ρ on the total task processing delay.

Fig. 8. The average number of edge caching decision cases in the period t_end with ρ = 5.

In order to explore the impacts of the different caching decision schemes on latency and energy, we count the average number of edge caching case decisions made by the different schemes in the period t_end, as shown in Fig. 8. According to the parameter settings in Table III and the analysis in Sec. III-F, it can be deduced that max{T^{total}_{v,e,k}(t)} = T_d^{Case5}(t) and max{ε^{total}_{v,e,k}(t)} = ε^{Case6}(t). That is, a greater number of caching decisions in Case 5 leads to higher task processing delay, and a greater number in Case 6 leads to higher energy consumption at edge nodes. It can be observed that the DDPG-based edge caching and offloading scheme avoids caching decisions in the cases with the largest delay as much as possible. This greatly reduces the task processing delay. It is reasonable to observe that the offloading without edge caching scheme only makes decisions in Case 1 and Case 6, and the executing all tasks in the cloud scheme only makes Case 6 decisions. For the LRU and random edge caching and offloading schemes, it can be seen that more than 85% of the tasks are executed locally or in the edge pool. This is because the cached service programs at edge nodes increase the task hit ratio, thereby incurring less energy consumption compared with uploading tasks to the cloud. Our proposed scheme can jointly optimize caching and offloading decisions, allocate caching and computing resources properly, and improve user experience within a reasonable range of energy consumption.

Then we investigate the effects of task size on the unnormalized total latency and energy consumption of the different schemes in Fig. 9 and Fig. 10, respectively. The total delay and energy consumption increase linearly with the task size, because the delay and energy functions are proportional to the task size d_k, as presented in eqs. (13) and (14). A larger task size increases the transmission delay, computation delay and total energy consumption for all the schemes. However, in order to avoid making Case 5 decisions, which have the largest task processing delay, more decisions are made in Case 6, which is the case with the largest energy consumption of edge nodes. This explains why our scheme has higher energy consumption than the LRU and the random edge caching and offloading schemes in Fig. 10.

Fig. 11 shows the impact of the edge server cycle frequency on the unnormalized total delay of the five different schemes. Certainly, a higher edge server cycle frequency reduces the total execution latency, except for the offloading without edge caching and the executing all tasks in the cloud schemes. This is because the tasks of these two schemes do not perform computations at edge nodes, so their delay is unrelated to the computing power of the edge servers. For the LRU and the random edge caching and offloading schemes, more tasks are offloaded to the edge pool for execution. As the cycle frequency increases, the total latency becomes lower than that of
12 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. XX, NO. XX, XX 2023

 V


''3*EDVHGHGJHFDFKLQJDQGRIIORDGLQJ

tend
5DQGRPHGJHFDFKLQJDQGRIIORDGLQJ

7RWDOGHOD\ XQQRUPDOL]HG LQWKHSHULRG


 ([HFXWLQJDOOWDVNVLQWKHFORXG

/58HGJHFDFKLQJDQGRIIORDGLQJ

2IIORDGLQJZLWKRXWHGJHFDFKLQJ









       

(GJHVHUYHUF\FOHIUHTXHQF\ 1e8

Fig. 9. The total delay (unnormalized) versus task size with ρ = 5. Fig. 11. The total delay (unnormalized) versus edge sever cycle frequency
with ρ = 5.
 -

scheme can effectively decrease the long-term average task


tend

''3*EDVHGHGJHFDFKLQJDQGRIIORDGLQJ

5DQGRPHGJHFDFKLQJDQGRIIORDGLQJ
processing delay by utilizing the available caching and com-
7RWDOHQHUJ\ XQQRUPDOL]HG LQWKHSHULRG

 ([HFXWLQJDOOWDVNVLQWKHFORXG

/58HGJHFDFKLQJDQGRIIORDGLQJ
puting resources and is easy to implement.
2IIORDGLQJZLWKRXWHGJHFDFKLQJ There are several interesting directions in future work. First,
 the cached service programs from different vehicles may be
shared. This can effectively reduce task uploading cost, but on
the other hand raises new technical challenges, such as privacy


issues in service data sharing. Second, in this work, each


vehicle is assumed to only generate a single task. We believe

that the edge service caching and computation offloading with
multi-tasking vehicles taken into consideration is an interesting

step forward. Last but not least, the application of multiagent
        
systems in edge service caching and computation offloading
7DVNVL]H
design is expected to become a powerful tool to solve more
Fig. 10. The total energy (unnormalized) versus task size with ρ = 5. complex problems and remains much to be explored.
To conclude, the above evaluation results show that the proposed DDPG-based edge caching and offloading scheme can significantly reduce the total delay of service caching and task offloading in dynamic vehicular environments.
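For readers interested in reproducing the learning component, the core DDPG update behind our scheme, i.e., the deterministic policy gradient with target networks and soft updates [15], follows the standard pattern sketched below in PyTorch. The state and action dimensions, network widths, learning rates, and the randomly generated batch are illustrative placeholders; in practice they must be matched to the joint service caching and computation offloading action design described earlier, with transitions drawn from a replay buffer.

```python
import torch
import torch.nn as nn

# Minimal DDPG update step (illustrative hyperparameters, not our settings).
STATE_DIM, ACTION_DIM, GAMMA, TAU = 16, 4, 0.99, 0.005

def mlp(in_dim, out_dim, out_act=None):
    layers = [nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

actor = mlp(STATE_DIM, ACTION_DIM, nn.Tanh())     # deterministic policy mu(s)
critic = mlp(STATE_DIM + ACTION_DIM, 1)           # action-value Q(s, a)
actor_tgt = mlp(STATE_DIM, ACTION_DIM, nn.Tanh())
critic_tgt = mlp(STATE_DIM + ACTION_DIM, 1)
actor_tgt.load_state_dict(actor.state_dict())     # targets start as copies
critic_tgt.load_state_dict(critic.state_dict())

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    # Critic: regress Q(s, a) toward the bootstrapped target value.
    with torch.no_grad():
        q_next = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
        target = r + GAMMA * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(
        critic(torch.cat([s, a], dim=1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: deterministic policy gradient, ascend Q(s, mu(s)).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Soft-update the target networks toward the learned networks.
    for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
        for p, p_tgt in zip(net.parameters(), tgt.parameters()):
            p_tgt.data.mul_(1.0 - TAU).add_(TAU * p.data)

# One update on a toy batch standing in for replay-buffer samples.
B = 32
ddpg_update(torch.randn(B, STATE_DIM), torch.rand(B, ACTION_DIM) * 2 - 1,
            torch.randn(B, 1), torch.randn(B, STATE_DIM), torch.zeros(B, 1))
```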
VI. CONCLUSIONS AND FUTURE WORK
This paper proposes a novel edge service caching and computation offloading framework in a general VEC system. To minimize the cumulative average task processing delay of task offloading and service caching, we formulate the optimization problem as a long-term MINLP problem, which is challenging to solve since service caching decisions and computation offloading decisions are strongly coupled. Furthermore, we deduce the boundaries of the task processing delay and energy consumption in each case in detail. Considering the highly dynamic vehicular environments, we propose a DDPG-based scheme to update offloading decisions and service caching placements. Extensive simulations show that our proposed scheme can effectively decrease the long-term average task processing delay by utilizing the available caching and computing resources, and is easy to implement.

There are several interesting directions for future work. First, the cached service programs from different vehicles may be shared. This can effectively reduce the task uploading cost, but on the other hand it raises new technical challenges, such as privacy issues in service data sharing. Second, in this work, each vehicle is assumed to generate only a single task. We believe that edge service caching and computation offloading with multi-tasking vehicles taken into consideration is an interesting step forward. Last but not least, the application of multiagent systems to edge service caching and computation offloading design is expected to become a powerful tool for solving more complex problems, and much remains to be explored.
REFERENCES

[1] X. Jiang, F. R. Yu, T. Song, and V. C. M. Leung, “Resource allocation of video streaming over vehicular networks: A survey, some research issues and challenges,” IEEE Trans. Intell. Transp. Syst., pp. 1–21, 2021, doi:10.1109/TITS.2021.3065209.
[2] L. T. Tan, R. Q. Hu, and L. Hanzo, “Twin-timescale artificial intelligence aided mobility-aware edge caching and computing in vehicular networks,” IEEE Trans. Veh. Technol., vol. 68, no. 4, pp. 3086–3099, Apr. 2019.
[3] Y. Dai, D. Xu, K. Zhang, S. Maharjan, and Y. Zhang, “Deep reinforcement learning and permissioned blockchain for content caching in vehicular edge computing and networks,” IEEE Trans. Veh. Technol., vol. 69, no. 4, pp. 4312–4324, Apr. 2020.
[4] J. Chen, H. Wu, P. Yang, F. Lyu, and X. Shen, “Cooperative edge caching with location-based and popular contents for vehicular networks,” IEEE Trans. Veh. Technol., vol. 69, no. 9, pp. 10291–10305, Sep. 2020.
[5] Z. Xue, Y. Liu, G. Han, F. Ayaz, Z. Sheng, and Y. Wang, “Two-layer distributed content caching for infotainment applications in VANETs,” IEEE Internet Things J., vol. 9, no. 3, pp. 1696–1711, Feb. 2022.
[6] D. Gupta, S. Rani, A. Singh, and J. J. P. C. Rodrigues, “ICN based efficient content caching scheme for vehicular networks,” IEEE Trans. Intell. Transp. Syst., pp. 1–9, 2022, doi:10.1109/TITS.2022.3171662.
[7] P. Yang, N. Zhang, S. Zhang, L. Yu, J. Zhang, and X. Shen, “Content popularity prediction towards location-aware mobile edge caching,” IEEE Trans. Multimedia, vol. 21, no. 4, pp. 915–929, Apr. 2019.
[8] J. Yan, S. Bi, L. Duan, and Y.-J. A. Zhang, “Pricing-driven service caching and task offloading in mobile edge computing,” IEEE Trans. Wireless Commun., vol. 20, no. 7, pp. 4495–4512, Jul. 2021.
[9] S.-W. Ko, S. J. Kim, H. Jung, and S. W. Choi, “Computation offloading and service caching for mobile edge computing under personalized service preference,” IEEE Trans. Wireless Commun., pp. 1–16, 2022, doi:10.1109/TWC.2022.3151131.
[10] S. Bi, L. Huang, and Y.-J. A. Zhang, “Joint optimization of service caching placement and computation offloading in mobile edge computing systems,” IEEE Trans. Wireless Commun., vol. 19, no. 7, pp. 4947–4963, Jul. 2020.
[11] J. Xu, L. Chen, and P. Zhou, “Joint service caching and task offloading for mobile edge computing in dense networks,” in Proc. IEEE Conf. Comput. Commun. (INFOCOM), Honolulu, HI, USA, Apr. 2018, pp. 207–215.
[12] G. Zhang, S. Zhang, W. Zhang, Z. Shen, and L. Wang, “Joint service caching, computation offloading and resource allocation in mobile edge computing systems,” IEEE Trans. Wireless Commun., vol. 20, no. 8, pp. 5288–5300, Aug. 2021.
[13] C. Tang, C. Zhu, H. Wu, Q. Li, and J. J. P. C. Rodrigues, “Toward response time minimization considering energy consumption in caching-assisted vehicular edge computing,” IEEE Internet Things J., vol. 9, no. 7, pp. 5051–5064, Apr. 2022.
[14] D. Lan, A. Taherkordi, F. Eliassen, and L. Liu, “Deep reinforcement learning for computation offloading and caching in fog-based vehicular networks,” in Proc. IEEE Int. Conf. Mob. Ad Hoc Sens. Syst. (MASS), Delhi, India, Dec. 2020, pp. 622–630.
[15] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, “Deterministic policy gradient algorithms,” in Proc. 31st Int. Conf. Mach. Learn. (ICML), Beijing, China, Jun. 2014, pp. 387–395.
[16] L. Zhang, Y. Sun, Z. Chen, and S. Roy, “Communications-caching-computing resource allocation for bidirectional data computation in mobile edge networks,” IEEE Trans. Commun., vol. 69, no. 3, pp. 1496–1509, Mar. 2021.
[17] H. Tout, A. Mourad, N. Kara, and C. Talhi, “Multi-persona mobility: Joint cost-effective and resource-aware mobile-edge computation offloading,” IEEE/ACM Trans. Networking, vol. 29, no. 3, pp. 1408–1421, Jun. 2021.
[18] T. X. Tran and D. Pompili, “Joint task offloading and resource allocation for multi-server mobile-edge computing networks,” IEEE Trans. Veh. Technol., vol. 68, no. 1, pp. 856–868, Jan. 2019.
[19] I. A. Elgendy, W.-Z. Zhang, Y. Zeng, H. He, Y.-C. Tian, and Y. Yang, “Efficient and secure multi-user multi-task computation offloading for mobile-edge computing in mobile IoT networks,” IEEE Trans. Netw. Serv. Manage., vol. 17, no. 4, pp. 2410–2422, Dec. 2020.
[20] G. Qiao, S. Leng, S. Maharjan, Y. Zhang, and N. Ansari, “Deep reinforcement learning for cooperative content caching in vehicular edge computing and networks,” IEEE Internet Things J., vol. 7, no. 1, pp. 247–257, Jan. 2020.
[21] Z. Ning, K. Zhang, X. Wang, M. S. Obaidat, L. Guo, X. Hu, B. Hu, Y. Guo, B. Sadoun, and R. Y. K. Kwok, “Joint computing and caching in 5G-envisioned internet of vehicles: A deep reinforcement learning-based traffic control system,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 8, pp. 5201–5212, Aug. 2021.
[22] J. Zhao, Q. Li, Y. Gong, and K. Zhang, “Computation offloading and resource allocation for cloud assisted mobile edge computing in vehicular networks,” IEEE Trans. Veh. Technol., vol. 68, no. 8, pp. 7944–7956, Aug. 2019.
[23] L.-H. Yen, J.-C. Hu, Y.-D. Lin, and B. Kar, “Decentralized configuration protocols for low-cost offloading from multiple edges to multiple vehicular fogs,” IEEE Trans. Veh. Technol., vol. 70, no. 1, pp. 872–885, Jan. 2021.
[24] M. S. Bute, P. Fan, L. Zhang, and F. Abbas, “An efficient distributed task offloading scheme for vehicular edge computing networks,” IEEE Trans. Veh. Technol., vol. 70, no. 12, pp. 13149–13161, Dec. 2021.
[25] L. Dai, Y. Fang, Z. Yang, P. Chen, and Y. Li, “Protograph LDPC-coded BICM-ID with irregular CSK mapping in visible light communication systems,” IEEE Trans. Veh. Technol., vol. 70, no. 10, pp. 11033–11038, Oct. 2021.
[26] Y. Fang, Y. Bu, P. Chen, F. C. M. Lau, and S. A. Otaibi, “Irregular-mapped protograph LDPC-coded modulation: A bandwidth-efficient solution for 6G-enabled mobile networks,” IEEE Trans. Intell. Transp. Syst., pp. 1–14, 2021, doi:10.1109/TITS.2021.3122994.
[27] Y. Fang, G. Han, G. Cai, F. C. M. Lau, P. Chen, and Y. L. Guan, “Design guidelines of low-density parity-check codes for magnetic recording systems,” IEEE Commun. Surv. Tutorials, vol. 20, no. 2, pp. 1574–1606, Second Quarter 2018.
[28] Z. Qin, S. Leng, J. Zhou, and S. Mao, “Collaborative edge computing and caching in vehicular networks,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Seoul, Korea, May 2020, pp. 1–6.
[29] K. Zhang, J. Cao, H. Liu, S. Maharjan, and Y. Zhang, “Deep reinforcement learning for social-aware edge computing and caching in urban informatics,” IEEE Trans. Ind. Inf., vol. 16, no. 8, pp. 5467–5477, Aug. 2020.
[30] H. Tian, X. Xu, L. Qi, X. Zhang, W. Dou, S. Yu, and Q. Ni, “CoPace: Edge computation offloading and caching for self-driving with deep reinforcement learning,” IEEE Trans. Veh. Technol., vol. 70, no. 12, pp. 13281–13293, Dec. 2021.
[31] J. Wu, J. Wang, Q. Chen, Z. Yuan, P. Zhou, X. Wang, and C. Fu, “Resource allocation for delay-sensitive vehicle-to-multi-edges (V2Es) communications in vehicular networks: A multi-agent deep reinforcement learning approach,” IEEE Trans. Network Sci. Eng., vol. 8, no. 2, pp. 1873–1886, Apr. 2021.
[32] M. Xu, D. T. Hoang, J. Kang, D. Niyato, Q. Yan, and D. I. Kim, “Secure and reliable transfer learning framework for 6G-enabled internet of vehicles,” IEEE Wireless Commun., pp. 1–8, 2022, doi:10.1109/MWC.004.2100542.
[33] J. Kang, Z. Xiong, X. Li, Y. Zhang, D. Niyato, C. Leung, and C. Miao, “Optimizing task assignment for reliable blockchain-empowered federated edge learning,” IEEE Trans. Veh. Technol., vol. 70, no. 2, pp. 1910–1923, Feb. 2021.
[34] J. Liu, M. Ahmed, M. A. Mirza, W. U. Khan, D. Xu, J. Li, A. Aziz, and Z. Han, “RL/DRL meets vehicular task offloading using edge and vehicular cloudlet: A survey,” IEEE Internet Things J., pp. 1–25, 2022, doi:10.1109/JIOT.2022.3155667.
[35] X. Zhu, Y. Luo, A. Liu, M. Z. A. Bhuiyan, and S. Zhang, “Multiagent deep reinforcement learning for vehicular computation offloading in IoT,” IEEE Internet Things J., vol. 8, no. 12, pp. 9763–9773, Jun. 2021.
[36] Q. Qi, J. Wang, Z. Ma, H. Sun, Y. Cao, L. Zhang, and J. Liao, “Knowledge-driven service offloading decision for vehicular edge computing: A deep reinforcement learning approach,” IEEE Trans. Veh. Technol., vol. 68, no. 5, pp. 4192–4203, May 2019.
[37] B. Shang, L. Liu, and Z. Tian, “Deep learning-assisted energy-efficient task offloading in vehicular edge computing systems,” IEEE Trans. Veh. Technol., vol. 70, no. 9, pp. 9619–9624, Sep. 2021.
[38] J. Shi, J. Du, J. Wang, and J. Yuan, “Deep reinforcement learning-based V2V partial computation offloading in vehicular fog computing,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Nanjing, China, Mar. 2021, pp. 1–6.
[39] S. M. A. Kazmi, S. Otoum, R. Hussain, and H. T. Mouftah, “A novel deep reinforcement learning-based approach for task-offloading in vehicular networks,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Madrid, Spain, Dec. 2021, pp. 1–6.
[40] Y. Wang, M. Sheng, X. Wang, L. Wang, and J. Li, “Mobile-edge computing: Partial computation offloading using dynamic voltage scaling,” IEEE Trans. Commun., vol. 64, no. 10, pp. 4268–4282, Oct. 2016.
[41] Z. Su, Y. Hui, Q. Xu, T. Yang, J. Liu, and Y. Jia, “An edge caching scheme to distribute content in vehicular networks,” IEEE Trans. Veh. Technol., vol. 67, no. 6, pp. 5346–5356, Jun. 2018.

Zheng Xue received the B.E. degree in automotive service engineering from Chongqing Jiaotong University, Chongqing, China, in 2018, and the M.E. degree in electronic and communication engineering from the Guangdong University of Technology, Guangzhou, China, in 2021, where he is currently working toward the Ph.D. degree with the Department of Communication Engineering. His primary research interests are vehicular networks and cooperative perception.

Chang Liu received the Ph.D. degree from Kansas State University, Manhattan, KS, USA, in 2016. She is an Associate Professor with the Guangdong University of Technology, Guangzhou, China. Her current research areas include Internet of Vehicles and Internet of Things.
Canliang Liao received the B.E. degree in communication from Wuhan Polytechnic University, Wuhan, China, in 2019. He is currently working toward the M.E. degree with the Department of Communication Engineering, Guangdong University of Technology, Guangzhou, China. His primary research interest is Internet of Vehicles.
Guojun Han received the Ph.D. degree from Sun Yat-sen University, Guangzhou, China, and the M.E. degree from the South China University of Technology, Guangzhou, China. From March 2011 to August 2013, he was a Research Fellow at the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. From October 2013 to April 2014, he was a Research Associate at the Department of Electrical and Electronic Engineering, Hong Kong University of Science and Technology. He is now a Full Professor and Executive Dean at the School of Information Engineering, Guangdong University of Technology, Guangzhou, China. He has been a Senior Member of IEEE since 2014. His research interests are in the areas of wireless communications, signal processing, coding, and information theory. He has more than 14 years of experience in the research and development of advanced channel coding and signal processing algorithms and techniques for various data storage and communication systems.
Zhengguo Sheng (Senior Member, IEEE) received the B.Sc. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2006, and the M.S. and Ph.D. degrees from Imperial College London, London, U.K., in 2007 and 2011, respectively. He is currently a Senior Lecturer with the University of Sussex, Brighton, U.K. Previously, he was with UBC, Vancouver, BC, Canada, as a Research Associate, and with Orange Labs, Santa Monica, CA, USA, as a Senior Researcher. He has more than 120 publications. His research interests cover IoT, vehicular communications, and cloud/edge computing.