Resource_Allocation_for_Edge_Computing_in_IoT_Networks_via_Reinforcement_Learning

Uploaded by

Sauban Khan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

40 views6 pages

Resource_Allocation_for_Edge_Computing_in_IoT_Networks_via_Reinforcement_Learning

Uploaded by

Sauban Khan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

Resource Allocation for Edge Computing in IoT

Networks via Reinforcement Learning

Xiaolan Liu, Zhijin Qin, Yue Gao
Queen Mary University of London, London, U.K.

Abstract—In this paper, we consider resource allocation for edge devices, i.e., a edge server, is finite. Therefore, it cannot support
computing in internet of things (IoT) networks. Specifically, each the massive computation tasks from all the end devices in
end device is considered as an agent, which makes its decisions on its coverage area. Furthermore, offloading computation tasks
whether offloading the computation tasks to the edge devices or
not. To minimize the long-term weighted sum cost which includes of those end devices requires abundant spectrum resources or
the power consumption and the task execution latency, we consider it might bring about the congestion of wireless channels [5].
the channel conditions between the end devices and the gateway, Therefore, resource allocation, such as computation capacity,
the computation task queue as well as the remaining computation power and spectrum resource allocation, is quite important
resource of the end devices as the network states. The problem for such types of resource-constrained networks. Dynamic
of making a series of decisions at the end devices is modelled
as a Markov decision process and solved by the reinforcement computation tasks offloading scheme, i.e., the task is executed
learning approach. Therefore, we propose a near optimal task at a local end device or edge server, is an effective method.
offloading algorithm based on ϵ-greedy Q-learning. Simulations It has been mostly discussed in the context of mobile edge
validate the feasibility of our proposed algorithm, which achieves computing (MEC), in which the mobile user makes a binary
a better trade-off between the power consumption and the task decision to either offload the computation tasks to the edge
execution latency compared to these of edge computing and local
computing modes. device or not [6].
Some research work has proposed optimal computation task
I. I NTRODUCTION offloading schemes by minimizing the energy consumption or
The phenomenon of the increasing number of end devices, task execution latency in the network. Most of them have
such as sensors and actuators etc., has caused an exponential adopted the conventional optimization methods to solve the
growth of requirements for data processing, storage and com- formulated optimization problem, like Lyapunov optimization
munications. A cloud platform has been proposed to connect and convex optimization techniques [7]. However, these opti-
a large number of internet of things (IoT) devices, and a mization techniques can construct an approximately optimal
massive amount of data generated by those devices can be solution only. Note that designing the computation task of-
offloaded to a cloud server for further processing [1]. The floading scheme can be modeled as a Markov decision pro-
cloud server generally has an infinite ability of computation cess (MDP). Reinforcement learning has been adopted as an
and storage, however, it is physically and/or logically far from effective method to solve this optimization problem without
its clients, implying that offloading big data to the cloud server requiring the priori knowledge of environment statistics [8].
is inefficient due to intensive bandwidth requirements. More- But the explosion of the state and action space makes the
over, it cannot satisfy the ultra-low latency requirements for conventional reinforcement learning algorithm inefficient and
time-sensitive applications and provide location-aware services. even infeasible. Deep reinforcement learning approaches, such
Edge computing has been proposed to address this problem by as deep Q-network (DQN) has been proposed to explore the
moving data processing to the edge computing devices, such optimal policy by solving the aforementioned optimization
as devices with computing capacity (e.g., desktop PCs, tablets problem [9, 10].
and smart phones), data centers (e.g., IoT gateway) and devices The increase of computation capacity at edge devices con-
with virtualization capacity, which are closer to end devices, tributes to a new research area, called edge learning, which
and then a distributed data processing network is implemented crosses and revolutionizes two disciplines: wireless commu-
[1, 2]. The edge is not located on the IoT devices but as close nication and machine learning [11, 12]. Edge learning can be
as one hop to them, or even more than one hop away from accomplished by leveraging MEC platform. Deep reinforce-
them. ment learning (DRL) is an effective method to design the
Compared with the cloud server, edge devices can support computation task offloading policy in wireless powered MEC
latency-critical services and a variety of IoT applications. The networks by considering the time-varying channel qualities,
end devices are in general resource-constrained, for instance, harvested energy units and task arrivals [9]. [10] has designed
the battery capacity and local CPU computation capacity are an offloading policy for mobile user to minimize its monetary
limited [3]. Offloading computation tasks to relatively resource- cost and energy consumption by implementing the DQN-based
rich edge devices can meet the quality of service (QoS) offloading algorithm. In [13], an In-Edge artificial intelligence
requirements of applications as well as augment the capabilities has been evaluated and could achieve near-optimal performance
of end devices for running resource-demanding applications by investigating the scenarios of edge caching and computation
[4]. However, in practice, the computation capacity of edge offloading in MEC systems. Moreover, DRL could also achieve

978-1-5386-8088-9/19/$31.00
Authorized licensed use©2019
limitedIEEE
to: REVA UNIVERSITY. Downloaded on November 28,2024 at 06:09:28 UTC from IEEE Xplore. Restrictions apply.
a good performance by developing a decentralized resource
allocation mechanism for vehicle-to-vehicle communications
[14]. Deep learning achieves excellent performance with large
amount of data generated by IoT applications [15].
In the previous works, the task execution latency and the
power consumption have rarely been considered together when
designing the optimal computation task offloading scheme. [16]
has optimized the task offloading scheduling by minimizing
the weighted sum of execution delay and end device energy Gateway with edge server
consumption with conventional optimization tools. Encouraged « «
by [16], we formulate a task offloading problem with its

Data size
objective function including not only the cost in [16], but T1 T3
TT −1 TT
also the power consumption of the edge device. Specifically, T2 T4 u3
« u1 u4 « uU
we are proposing to use reinforcement learning techniques to u2
Computation tasks
solve this problem. This approach has been put in use recently
to solve the task offloading problem while only considering : Offload to cloud
: Local computing device
either the execution delay or the energy consumption as the : Offload to edge server
negative reward function [9, 10]. Moreover, we will discuss : Edge computing device
: Computation tasks producing
the remaining computation resource of the end device since
it affects the decision making on task offloading when it is run Fig. 1. Computation tasks offloading model in IoT networks.
out. The major contribution of this paper is as follows:
1) We first consider resource allocation in IoT networks with
edge computing to design a task offloading scheme for picking from G possible values. We use a finite-state discrete
IoT devices. We formulate the weighted sum cost min- time Markov chain to model the channel gain state transition
imization problem with its objective function including over time epochs. Assuming each end device executes a lot of
the task execution latency and the power consumption of independent computation tasks, these tasks are in different sizes
both the edge device and the end device. and need to be processed with different CPU cycles. Then we
2) We solve this optimization problem with reinforcement denote the task queue at the end device as T = {T1 , ..., Tmax },
learning technique. And then we propose the near optimal where Tmax is the maximum number of tasks that can be stored
task offloading algorithm based on ϵ-greedy Q-learning. at the end device. The task arrival is assumed to be I = {0, 1},
3) Numerical results show that our proposed task offloading where I = 1 indicates there is one task generated with its task
algorithm achieves a better trade-off between the power size randomly picked from M = {m1 , ..., mM }, otherwise,
consumption and the task execution latency compared to there is no task arrived at current time epoch.
the other two baseline computing modes. From Fig. 1, computation tasks can be either executed at
the end device or offloaded to the gateway and executed at the
II. S YSTEM M ODEL edge server. In each IoT network, some end devices execute
As shown in Fig. 1, we consider an IoT network with many the computation tasks locally, while others offload their tasks
end devices (i.e., IoT devices) and a gateway (i.e., the edge to the gateway in the same time epoch. At the beginning of
device), where the gateway collects data from end devices in each time epoch k, each end device makes its own decisions on
its coverage area and processes them with its equipped edge computation task offloading O = {1}∪{0}∪{−1} and transmit
server. Each end device generates a variety of computation power level Pt = {P1k , ..., Pmax
k
} if the end device decides to
tasks continuously and has limited computation capacity and offload the computation tasks to the edge device. Note that
power, so offloading their tasks to the gateway may improve if the end device decides not to offload the computation task
the computation experience in terms of power consumption when Ok = 0, then the cost only contains the local computation
and task execution latency. We focus on a representative end power consumption and the local task execution latency, and
device making its own decisions on task offloading. We discrete the transmit power is defined as Ptk = 0 in this case. Ok = 1
the time horizon into epochs, with each epoch equalling to indicates that the end device decides to offload computation
duration η and indexed by an integer 0 < k ≤ K, K is the task to the gateway, with the transmit power Ptk ∈ Pt . In both
maximum number of time epochs in each time horizon. The end cases, the computation task is executed successfully, however, if
device operates over common license-free sub-gigahertz radio the computation task transmission suffers from outage between
frequency and the frequency bandwidth is denoted by Bw . We end device and the gateway, the computation task execution
denote end devices in the network as U = {u1 , ..., uU }. The fails and Ok = {−1}.
channel condition between the end device and the gateway is The task execution latency and power consumption are two
assumed to be time-varying. we assume the end device knows critical challenges in edge computing networks, both of them
some stochastic information about the channel condition in depend on the adopted task offloading scheme and transmit
time slot k, which is indicated by the channel gain states power allocation. In this paper, we consider them as the main
G = {g1k , ..., gG
k
} where the channel gain at each time epoch is cost of our considered IoT network. Therefore, we formulate

Authorized licensed use limited to: REVA UNIVERSITY. Downloaded on November 28,2024 at 06:09:28 UTC from IEEE Xplore. Restrictions apply.
an optimization problem to minimize the cost function, the III. P ROPOSED Q- LEARNING BASED R ESOURCE
weighted sum of the task execution latency and the power A LLOCATION FOR E DGE C OMPUTING
consumption. In this section, we formulate our task offloading problem to
A. Local Computing Mode minimize the weighted sum of power consumption and task
execution latency of both the end device and the edge device
We have Ok = 0 if the computation task is executed at local
by optimizing the task offloading decisions, the weight factor
end device. We assume the edge server allocates fixed and equal
and the transmit power of the end device. Since the formulated
CPU resource for each end device, and it is enough for the
problem is non-convex, the conventional algorithm is hard or
computation task to execute in each time frame. Considering
even impossible to solve it, we propose a near optimal task
that during any time epoch k, fd denotes the fixed CPU
offloading algorithm based on ϵ-greedy Q-learning.
frequency of any end device, which presents the number of
CPU cycles required for computing 1-bit of input data. The A. Task Offloading Problem Formulation
power consumption per CPU cycle is denoted by Pd . Then Computation tasks from the end device can be offloaded to
fd Pd indicates computing power consumption per bit at the end the gateway depending on the channel conditions, computation
device. The total power consumption of one computation task task queue and the remaining percentage of the end device’s
k
at the end device in any time epoch k, denoted by Pcd , is given CPU resource. We denote sk = (g k , T k , rdk , ) ∈ S = G × T ×
k k
by Pcd = fd Pd m . Moreover, let Dd denote the computation Rd as the network state of any end device in each time epoch k.
capacity of the end device, which is measured by the number By observing the network state sk at the beginning of each time
of CPU cycles per second. The remaining CPU resource of epoch k, the end device chooses an action ak = (Ok , Ptk ) ∈
the end device in each time epoch is denoted by the remaining A = O × Pt by following a stationary policy π. An agent, e.g.,
percentage of computation resource Rk = {rd1 , rd2 , ..., 1}. The each end device, decides whether to offload the computation
local computing latency Lkd is defined as Lkd = (fd mk )/Dd . task and chooses the transmit power level, and we define a
However, the power consumption and the task execution latency penalty function δ k as the cost when the task transmission fails.
are two contradictory challenges in the edge computing net- Therefore, the cost function is expressed as
work, we cannot reduce them simultaneously, so we are trying
to achieve a good trade-off between them. Then we define the C k = Cloc
k k
+ Cof k k k
f + δ = Pc + βL + δ
k
(4a)
cost function of the local computing mode as = k
Pcd + k
Pcs + Ptk + β(Lks + Lkt + Lkd ) k
+δ . (4b)
k k
Cloc = Pcd + βLkd , (1) In this paper, we propose to design an optimal task offloading
where β indicates the weight factor between power consump- scheme to minimize the long-term cost of the IoT network, that
tion and the task execution latency. is, both the immediate cost and the future cost are included. The
optimization problem is formulated as
B. Offloading Computing Mode
∑
K
We assume end devices adopt the time division multiple (P1) min Ck (5a)
access (TDMA) scheme to transmit their data to the gateway, β, Pt , O
k=1
that is, the interferences from other end devices are negligible s.t. C1 : 0 ≤ β ≤ 1; (5b)
when they are transmitting data over the same time epoch, k.
Let g k denote the channel gain from any end device to the C2 : 0 ≤ Pt ≤ Pmax ; k
(5c)
gateway, which is constant during the offloading time epoch. C3 : O = {0, 1, −1}.
k
(5d)
Ptk indicates the transmit power of the end device, then the
where C1 denotes the value range of weight factor β which
achievable transmission rate (bit/s) is denoted by
balances the power consumption and the task execution latency.
Ptk g k C2 is the transmit power of the end device when it decides
Rk = Bw log2 (1 + ), (2)
σ2 to offload the computation task to the gateway. C3 presents
where Bw and σ 2 indicate the bandwidth and the variance the task execution set. It is easily noticed that P1 is a mixed
of additive white Gaussian noise (AWGN), respectively. Then integer nonlinear programming (MINLP) problem as the integer
the power consumption of the end device caused by the data variable Ok , continuous variable Pt and the discrete variable
transmission is indicated as Ptk , and the transmission latency δ k need to be optimized. It is difficult or impossible to find the
is denoted by Lkt = T k /Rk . Similarly, let fs denote the optimal solution by using conventional optimization techniques.
computation frequency of the edge server, Ps denote the power The conventional algorithm has to decouple the optimization
consumption per CPU cycle at the edge server. Ds indicates the problem into many sub-optimization problems and solves them
computation capacity allocated to each end device. The compu- separately, which is inefficient and complicated, so we explore
tation power of the edge server is given by Pcs k
= fs Ps mk , and the reinforcement learning techniques to address this problem
the computation latency is calculated as Ls = (fs M k )/Ds .
k including multiple optimization variables.
Therefore, we can obtain the cost function of the offloading We consider optimizing the variables together in each time
computing mode, and it is presented as epoch, and denote the objective function as a negative reward
k k k k k
in (4). In addition, the state transition and cost are stochastic
Cof f = Pcs + Pt + β(Ls + Lt ). (3) and can be modelled as a Markov decision process, where

Authorized licensed use limited to: REVA UNIVERSITY. Downloaded on November 28,2024 at 06:09:28 UTC from IEEE Xplore. Restrictions apply.
the state transition probabilities and cost depend only on the where C k is the cost observed for the current state, α is the
environment and the obtained policy. The transition probability learning rate (0 < α ≤ 1). Q learning is an online action-
P = (sk+1 , C k |sk , ak ) is defined as the transition from state sk value function learning with an off-policy, in each time epoch,
to sk+1 with the cost C k when the action ak is taken according we calculate the Q-value in the next step with all the possible
to the policy. Therefore, the long-term expected cost is given actions that it can take, then choose the minimum Q-value and
by record the corresponding action.
∑K
Therefore, the computation task offloading optimization
V (s, π) = Eπ [ γ k C k ], (6) problem P1 is solved by using the Q-learning algorithm, and to
k=1
explore the unknown states instead of trusting the learn values
k
where s = (g , T k
γ ∈ [0, 1] is the discount factor and E
, rdk ), of Q(s, a) completely, the ϵ-greedy approach is used in the
indicates the statistical conditional expectation with transition Q-learning algorithm, where the agent picks a random action
probability P. with small probability ϵ, or with 1 − ϵ it chooses an action
that minimizes the Q(sk+1 , a) as shown in (9) in each time
B. Q-learning Approach epoch. Then a computation task offloading algorithm based on
Generally, the conventional solutions, like policy iteration ϵ-greedy Q-learning is proposed as shown in Algorithm 1.
and value iteration [17], can be used to solve the MDP
optimization problem with a known transition matrix. But it is Algorithm 1 Computation Task Offloading Algorithm based
hard for the agent to know the prior information of the transition on ϵ-greedy Q-Learning
matrix, which is determined by the environment. Therefore, Initialization
a model-free reinforcement learning approach is proposed to Initialize parameters: discount factor γ, learning rate α,
investigate this decision-making problem since the agent cannot exploration rate ϵ.
make predictions about what the next state and cost will be Initialize action-value function Q : S × A
before it takes each action. Initialize states: set g 1 randomly, set T 1 := T , rd1 := Rd .
In (P1 ), each end device is trying to design an optimal task Set k := 1,
offloading scheme according to some statistical information, Procedure
such as the possible channel conditions, the possible remaining 1: while k ≤ K and T > 0 and rd > 0 do
percentage of computation resource and the possible task queue, 2: g k is changed according to a random matrix.
observed from the environment. Particularly, we focus on 3: e ← random number from [0,1]
finding the optimal policy π ∗ that minimizes the cost V (s, π). 4: if e < ϵ then
For any given network state s, the optimal policy π ∗ can be 5: Choose action ak randomly.
obtained by 6: else
π ∗ = arg min V (s, π), ∀s ∈ S. (7) 7: Choose action ak according to arg min Q(sk , ak )
ak ∈A
π
8: end if
The computation task offloading optimization problem at 9: Set sk+1 = (Gk+1
g , T k+1 , rdk+1 ), where
each end device is a classic single-agent finite-horizon MDP T k+1
= T − a + Ik,
k k
with the discounted cost criterion. Then we adopt the clas- rd = rdk − ak (fd mk ).
k+1
sic model-free reinforcement learning approach, Q-learning 10: calculate the cost C k by (4)
algorithm, to explore the optimal task offloading policy by 11: update Q(sk , ak ) by (9)
minimizing the long-term expected accumulated discounted 12: Set k := k + 1
cost, C. We denote the Q-value, Q(s, a), as the expected 13: end while
accumulated discounted cost when taking an action ak ∈ A
following a policy π for a given state-action pair (s, a).
Thus, we define the action-value function Q(s, a) as C. Optimality and Approximation
Q(s, a) = Eπ [C k+1
+γQπ (s k+1
, a k+1 k k
)|s = s, a = a]. (8) The agent in the reinforcement learning algorithm aims to
solve sequential decision making problems by learning an
In our proposed algorithm, Q(s, a) indicates the value calcu- optimal policy. In practice, the requirement for Q-learning to
lated from cost function (4) for any given state s and action obtain the correct convergence is that all the state action pairs
a, it is stored in the Q-table which is built up to save all Q(s, a) continue to be updated. Moreover, if we explore the
the possible accumulative discounted cost. And the Q-value is policy infinitely, Q value Q(s, a) has been validated to converge
updated during the time epoch if the new Q-value is smaller with possibility 1 to Q∗ (s, a) , which is given by
than the current Q-value. The Q(s, a) is updated incrementally
based on the current cost function C k and the discounted Q- lim Pr (|Q∗ (s, a) − Q(s, a)n | ≥ ς) = 0, (10)
n→∞
value Q(sk+1 , a), ∀a ∈ A in the next time epoch.
This is achieved by the one-step Q-update equation where n is the index of the obtained sample, and Q∗ (s, a) is the
optimal Q value while Q(s, a)n is one of the obtained samples.
Q(sk , ak ) ← (1 − α) · Q(sk , ak ) + α(C k + γ · min Q(sk+1 , a)), Therefore, Q-learning can identify an optimal action selection
a
(9) policy based on infinite exploration time and a partly-random

Authorized licensed use limited to: REVA UNIVERSITY. Downloaded on November 28,2024 at 06:09:28 UTC from IEEE Xplore. Restrictions apply.
policy for a finite MDP model. In this paper, we approximate
the state and action space into finite states, and we use Monte- 0.24
Carlo simulation to explore the possible policy, so we can
obtain a near-optimal policy. 0.22

0.2

Average Cost, C ave

IV. N UMERICAL R ESULTS
In this section, the performance of our proposed computation 0.18
task offloading algorithm is verified by the simulations. The 0.16
task offloading algorithm is applied to each end device so that
they can make their own distributed decisions on how to process 0.14
the generated computation tasks, and this is achieved by using
0.12
the ϵ-greedy Q-learning algorithm. For comparison, we also
provide another two baselines: the local computing and the edge 0.1
computing mode, which are defined as follows,
0.08
1) Local computing: the computation task is locally execut-
ed at the end device in each time epoch whatever size of 0 2000 4000 6000 8000 10000
the computation task is generated. Iterations
2) Edge computing: all the computation tasks generated at
Fig. 2. Convergence performance of the proposed task offloading algorithm
the end device are offloaded to the edge device and are measured by average Cost Cave , the weight factor β = 0.5.
processed at the edge server.
3) Proposed scheme: when the task queue is not empty and
the end device has remaining computation resource, it 3
decides to execute the task locally or offload it to the Edge computing
Proposed scheme
gateway at the beginning of each time epoch to achieve
2.5 Local computing
the minimum cost, i.e., minimize the weighted sum of
Power consumption (W), P c

cost including power consumption and task execution

latency. 2

We carry out the simulations at any end device in an IoT

network. The simulations parameters are displayed in Table 1.5
I. β = 0.5 is the weight factor of the weighted sum cost,
and the time epochs in any time frame is set as K = 15. 1
Fig. 2 illustrates the convergence performance of the proposed
algorithm, where Y-axis presents the average cost, Cave , in each
0.5
time epoch. We observe that the average cost is converged to
a stable value with slight changes. Based on the convergence
of the proposed algorithm, the following numerical results are 0
analyzed. 1 3 5 7 9 11 13 15
Time epochs, k
Fig. 3 shows the comparison of power consumption among
edge computing, local computing and our proposed scheme Fig. 3. Performance comparison of cumulative power consumption Pc with
with an increasing number of the time epochs. We can observe the proposed scheme, edge computing and local computing versus the number
of time epochs k.
that the edge computing has the highest power consumption,
this is because the end device consumes transmit power to
offload the computation task to the gateway. But for the local
computing mode, it consumes the least power since all the information. The task is only offloaded when it is needed, such
power consumption only comes from the task execution. In our that it has less power consumption than the edge computing
proposed scheme, the sequential decisions on computation task mode, but has higher power consumption than the local com-
offloading are made by continuously observing environment puting mode due to the extra transmit power. Moreover, it is
noticed that the edge computing curve finishes earlier than the
other two curves. This is because the task execution is stopped
TABLE I and cannot be transferred to the local end device when the task
S IMULATION PARAMETERS
transmission fails.
γ, α, ϵ 0.5, 0.5, 0.1 Similarly, the performance of the task execution latency
Bw , σ 105 Hz, −174 + 10log10Bw among the three modes is shown in Fig. 4. Local computing
Channel gains G = (0.5, 1, 1.5) ∗ 10−5
Task queue T = (0, 1, ..., 9)
mode has the highest execution delay since the computation
Task size M = (10, 11, ..., 25)Kbits ability of the end device is much weaker than the gateway, and
fs = fd , fs = fd 500 cycles/bit, 10−8 W per CP U cycles the task size is too large to be processed efficiently with limited
Ds , Dd 4GHz, 500M Hz compute capability. Correspondingly, the computation task can

Authorized licensed use limited to: REVA UNIVERSITY. Downloaded on November 28,2024 at 06:09:28 UTC from IEEE Xplore. Restrictions apply.
two kinds of cost to meet different quality requirements of the
0.15
computation task.
V. C ONCLUSIONS
0.125
Task execution latency (s), L

In this paper, we investigate the design of a smart computa-

tion task offloading policy for the end device in the IoT net-
0.1 works by considering the statistics of environment information,
including the time-varying channel conditions, the dynamic
0.075 task queue and the remaining computation resource of the end
device. To solve the formulated task offloading problem, we
proposed a ϵ-greedy Q-learning based algorithm to minimize
0.05
the weighted sum cost of the power consumption and the task
execution latency. Our proposed task offloading algorithm has
0.025 Local computing
been validated by the numerical results, and the simulations
Proposed scheme
demonstrate that the proposed task offloading scheme achieved
Edge computing
0 a better trade-off performance between power consumption and
1 3 5 7 9 11 13 15 task execution latency compared to the other two baseline
Time epochs, k
computing modes.
Fig. 4. Performance comparison of cumulative task execution latency L with R EFERENCES
the proposed scheme, edge computing and local computing versus the number
of time epochs k. [1] A. Yousefpour, C. Fung, T. Nguyen, and etc., “All one needs to know
about fog computing and related edge computing paradigms: A complete
survey,” arXiv preprint arXiv:1808.05283, Sep. 2018.
[2] L. Bittencourt, R. Immich, R. Sakellariou, and etc., “The internet of
2.5 things, fog and cloud continuum: Integration and challenges,” Internet
=0.5 of Things, Elsevier, Sep. 2018.
=0.3 [3] C.-H. Hong and B. Varghese, “Resource management in fog/edge com-
puting: A survey,” arXiv preprint arXiv:1810.00305, Sep. 2018.
2 =0.8
[4] P. Mach and Z. Becvar, “Mobile edge computing: A survey on architecture
and computation offloading,” arXiv preprint arXiv:1702.05309, Mar.
Cumulative Cost, C

2017.
[5] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “Mobile edge
1.5 computing: Survey and research outlook,” arXiv preprint arXiv, vol. 1701,
Jan. 2017.
[6] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless
1 powered mobile-edge computing with binary computation offloading,”
IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177–4190, Apr. 2018.
[7] X. He, H. Xing, Y. Chen, and A. Nallanathan, “Energy-efficient mobile-
edge computation offloading for applications with shared data,” arXiv
0.5 preprint arXiv:1809.00966, Sep. 2018.
[8] J. Xu and S. Ren, “Online learning for offloading and autoscaling in
renewable-powered mobile edge computing,” in Global Telecom. Conf.
(GLOBECOM 2016), Washington, DC USA, Feb. 2016, pp. 1–6.
0 [9] X. Chen, H. Zhang, C. Wu, S. Mao, Y. Ji, and M. Bennis, “Performance
1 3 5 7 9 11 13 15 optimization in mobile-edge computing via deep reinforcement learning,”
Time epochs, k arXiv preprint arXiv:1804.00514, Mar. 2018.
[10] C. Zhang, Z. Liu, B. Gu, and etc., “A deep reinforcement learning based
Fig. 5. Cumulative weight sum of cost C versus different time epochs k with approach for cost-and energy-aware multi-flow mobile data offloading,”
different weight factors β. IEICE Trans. on Commun., pp. 1625–1634, Jan. 2018.
[11] G. Zhu, D. Liu, Y. Du, C. You, J. Zhang, and K. Huang, “Towards an
intelligent edge: Wireless communication meets machine learning,” arXiv
preprint arXiv:1809.00343, Sep. 2018.
be executed more quickly by offloading it to the gateway. Our [12] Y. Du and K. Huang, “Fast analog transmission for high-mobility wireless
proposed scheme achieves the neutral performance because it data acquisition in edge learning,” arXiv preprint arXiv:1807.11250, Jul.
2018.
makes decisions on the task execution based on the current [13] X. Wang, Y. Han, C. Wang, Q. Zhao, X. Chen, and M. Chen, “In-Edge
channel qualities and remaining computation resource of the AI: Intelligentizing mobile edge computing, caching and communication
end device. Specially, from Fig. 4, we notice that the local by federated learning,” arXiv preprint arXiv:1809.07857, Sep. 2018.
[14] H. Ye and G. Y. Li, “Deep reinforcement learning for resource allocation
computing curve is finished earlier than the other two curves, in V2V communications,” in IEEE Int. Conf. on Commun. (ICC 2018),
this is because the limited computation resource of the end Kansas City, MO, USA, May 2018, pp. 1–6.
device has run out before the end of the time frame. [15] H. Li, K. Ota, and M. Dong, “Learning iot in edge: deep learning for the
internet of things with edge computing,” IEEE Network, vol. 32, no. 1,
Fig. 5 illustrates the performance of the cumulative cost pp. 96–101, Jan. 2018.
with different weight factors β. It’s observed that the worst [16] Y. Mao, J. Zhang, and K. B. Letaief, “Joint task offloading scheduling and
case happens when β = 0.5 since the cumulative cost, C, is transmit power allocation for mobile-edge computing systems,” in IEEE
Wireless Commun. and Networking Conf. (WCNC 2017), San Francisco,
higher than others, which implies the task execution latency CA, Jan. 2017, pp. 1–6.
and the power consumption contribute different weights to the [17] M. L. Puterman, Markov decision processes: discrete stochastic dynamic
overall cost. It is important to make a trade-off between the programming. John Wiley & Sons, 2014.

Authorized licensed use limited to: REVA UNIVERSITY. Downloaded on November 28,2024 at 06:09:28 UTC from IEEE Xplore. Restrictions apply.

Splunk 8.1 Fundamentals Part 3
100% (4)
Splunk 8.1 Fundamentals Part 3
304 pages
Report of Student Management System
No ratings yet
Report of Student Management System
62 pages
MV12 Protection Relays Manual
100% (5)
MV12 Protection Relays Manual
25 pages
Book of Bass Scale - Ed Rogers
90% (10)
Book of Bass Scale - Ed Rogers
87 pages
Energy-Efficient Joint Task Offloading and Resource Allocation in OFDMA-Based Collaborative Edge Computing
No ratings yet
Energy-Efficient Joint Task Offloading and Resource Allocation in OFDMA-Based Collaborative Edge Computing
13 pages
Adaptive Resource Allocation in Future Wireless Networks With Blockchain and Mobile Edge Computing
No ratings yet
Adaptive Resource Allocation in Future Wireless Networks With Blockchain and Mobile Edge Computing
15 pages
FinalResearchPaperACN
No ratings yet
FinalResearchPaperACN
10 pages
Space/Aerial-Assisted Computing of Oading For IoT Applications: A Learning-Based Approach
No ratings yet
Space/Aerial-Assisted Computing of Oading For IoT Applications: A Learning-Based Approach
13 pages
An Online Learning Approach To Computation Offloading in Dynamic Fog Networks
No ratings yet
An Online Learning Approach To Computation Offloading in Dynamic Fog Networks
13 pages
Sensors 22 09546
No ratings yet
Sensors 22 09546
16 pages
Distributed Machine Learning For Multiuser Mobile Edge Computing Systems
No ratings yet
Distributed Machine Learning For Multiuser Mobile Edge Computing Systems
14 pages
Joint_Optimization_of_Data_Offloading_and_Resource_Allocation_With_Renewable_Energy_Aware_for_IoT_Devices_A_Deep_Reinforcement_Learning_Approach
No ratings yet
Joint_Optimization_of_Data_Offloading_and_Resource_Allocation_With_Renewable_Energy_Aware_for_IoT_Devices_A_Deep_Reinforcement_Learning_Approach
15 pages
Intelligent Task Processing Using Mobile Edge Computing: Processing Time Optimization
No ratings yet
Intelligent Task Processing Using Mobile Edge Computing: Processing Time Optimization
10 pages
A_Hybrid_DQN_and_Optimization_Approach_for_Strategy_and_Resource_Allocation_in_MEC_Networks
No ratings yet
A_Hybrid_DQN_and_Optimization_Approach_for_Strategy_and_Resource_Allocation_in_MEC_Networks
14 pages
s13677-025-00729-w
No ratings yet
s13677-025-00729-w
27 pages
2- Offloading Decision and Resource Allocation in Mobile Edge omputing for Cost and Latency Efficiencies in Real-Time IoT
No ratings yet
2- Offloading Decision and Resource Allocation in Mobile Edge omputing for Cost and Latency Efficiencies in Real-Time IoT
23 pages
GreenEdge - Joint Green Energy Scheduling and Dynamic Task Offloading in Multi - Tier Edge Computing Systems
No ratings yet
GreenEdge - Joint Green Energy Scheduling and Dynamic Task Offloading in Multi - Tier Edge Computing Systems
14 pages
Survey on computation offloading in UAV-Enabled mobile
No ratings yet
Survey on computation offloading in UAV-Enabled mobile
26 pages
Collaborative Edge Computing and Caching With Deep Reinforcement Learning Decision Agents
No ratings yet
Collaborative Edge Computing and Caching With Deep Reinforcement Learning Decision Agents
9 pages
Deep Reinforcement Learning Based Computation Offloading and Resource Allocation For MEC
No ratings yet
Deep Reinforcement Learning Based Computation Offloading and Resource Allocation For MEC
6 pages
Reference Paper - Page 87
No ratings yet
Reference Paper - Page 87
10 pages
5. A_Near-Optimal_Approach_for_Online_Task_Offloading_240906_224340
No ratings yet
5. A_Near-Optimal_Approach_for_Online_Task_Offloading_240906_224340
14 pages
1809.07857
No ratings yet
1809.07857
10 pages
Task Offloading Optimization Mechanism Based On Deep Neural Network in Edge-Cloud Environment
No ratings yet
Task Offloading Optimization Mechanism Based On Deep Neural Network in Edge-Cloud Environment
12 pages
Deep Reinforcement Learning For Online
No ratings yet
Deep Reinforcement Learning For Online
13 pages
Task Partitioning and Offloading in Iot Cloud-Edge Collaborative Computing Framework: A Survey
No ratings yet
Task Partitioning and Offloading in Iot Cloud-Edge Collaborative Computing Framework: A Survey
19 pages
Zhao 等 - 2022 - Multi-Agent Deep Reinforcement Learning for Task Offloading in UAV-Assisted Mobile Edge Computing
No ratings yet
Zhao 等 - 2022 - Multi-Agent Deep Reinforcement Learning for Task Offloading in UAV-Assisted Mobile Edge Computing
12 pages
Task Allocation Algorithm and Optimization Model On Edge Collaboration
No ratings yet
Task Allocation Algorithm and Optimization Model On Edge Collaboration
16 pages
A Novel Framework For Mobile-Edge Computing by Optimizing Task Offloading
No ratings yet
A Novel Framework For Mobile-Edge Computing by Optimizing Task Offloading
12 pages
1-s2.0-S0167739X23003862-main
No ratings yet
1-s2.0-S0167739X23003862-main
15 pages
Journal of Computer Networks and Communications - 2024 - Mashal - Multiobjective Offloading Optimization in Fog Computing
No ratings yet
Journal of Computer Networks and Communications - 2024 - Mashal - Multiobjective Offloading Optimization in Fog Computing
16 pages
s13677-023-00461-3
No ratings yet
s13677-023-00461-3
15 pages
saeik 2021
No ratings yet
saeik 2021
26 pages
Energy-Aware Task Offloading and Resource Allocation For Time-Sensitive Services in Mobile Edge Computing Systems
No ratings yet
Energy-Aware Task Offloading and Resource Allocation For Time-Sensitive Services in Mobile Edge Computing Systems
16 pages
1908 00080 PDF
No ratings yet
1908 00080 PDF
33 pages
1810.11199
No ratings yet
1810.11199
15 pages
Chapter 1 (1)
No ratings yet
Chapter 1 (1)
20 pages
Chapter 1+2 (1)
No ratings yet
Chapter 1+2 (1)
50 pages
1- A cooperative resource allocation model for IoT applications in mobile edge computing
No ratings yet
1- A cooperative resource allocation model for IoT applications in mobile edge computing
9 pages
Large_Language_Models_LLMs_Inference_Offloading_and_Resource_Allocation_in_Cloud-Edge_Computing_An_Active_Inference_Approach (2)
No ratings yet
Large_Language_Models_LLMs_Inference_Offloading_and_Resource_Allocation_in_Cloud-Edge_Computing_An_Active_Inference_Approach (2)
12 pages
Mid Sem BTP Report (6)
No ratings yet
Mid Sem BTP Report (6)
16 pages
Deep Reinforcement Learning MultiAgent System For Resource Allocation in Industrial Internet of ThingsSensors
No ratings yet
Deep Reinforcement Learning MultiAgent System For Resource Allocation in Industrial Internet of ThingsSensors
23 pages
Dynamical Resource Allocation in Edge For Trustable Internet-of-Things Systems: A Reinforcement Learning Method
No ratings yet
Dynamical Resource Allocation in Edge For Trustable Internet-of-Things Systems: A Reinforcement Learning Method
11 pages
Edge Computing For Internet of Everything
No ratings yet
Edge Computing For Internet of Everything
14 pages
Edge-AI IoT Request Service Provisioning in Federated Edge Computing Using Actor-Critic Reinforcement Learning
No ratings yet
Edge-AI IoT Request Service Provisioning in Federated Edge Computing Using Actor-Critic Reinforcement Learning
10 pages
1-s2.0-S0167739X18303996-main
No ratings yet
1-s2.0-S0167739X18303996-main
9 pages
1806.04589
No ratings yet
1806.04589
16 pages
Intelligent Edge: Leveraging Deep Imitation Learning For Mobile Edge Computation Offloading
No ratings yet
Intelligent Edge: Leveraging Deep Imitation Learning For Mobile Edge Computation Offloading
8 pages
Distributed Multi-Objective Dynamic Offloading Scheduling For Air-Ground Cooperative MEC
No ratings yet
Distributed Multi-Objective Dynamic Offloading Scheduling For Air-Ground Cooperative MEC
6 pages
Edge_Machine_Learning_for_AI-Enabled_IoT_Devices_A (1)
No ratings yet
Edge_Machine_Learning_for_AI-Enabled_IoT_Devices_A (1)
33 pages
Applsci 12 09124 v2
No ratings yet
Applsci 12 09124 v2
36 pages
TMC2020
No ratings yet
TMC2020
17 pages
paper_2
No ratings yet
paper_2
21 pages
sensors-24-04182
No ratings yet
sensors-24-04182
22 pages
126400V
No ratings yet
126400V
6 pages
Dynamic_Scheduling_for_Stochastic_Edge-Cloud_Computing_Environments_Using_A3C_Learning_and_Residual_Recurrent_Neural_Networks
No ratings yet
Dynamic_Scheduling_for_Stochastic_Edge-Cloud_Computing_Environments_Using_A3C_Learning_and_Residual_Recurrent_Neural_Networks
15 pages
A Multi-Model Edge Computing Offloading Framework For Deep Learning Application Based On Bayesian Optimization
No ratings yet
A Multi-Model Edge Computing Offloading Framework For Deep Learning Application Based On Bayesian Optimization
13 pages
Energy Criticality Avoidance-Based Delay Minimizat
No ratings yet
Energy Criticality Avoidance-Based Delay Minimizat
19 pages
HFEL_Joint_Edge_Association_and_Resource_Allocation_for_Cost-Efficient_Hierarchical_Federated_Edge_Learning
No ratings yet
HFEL_Joint_Edge_Association_and_Resource_Allocation_for_Cost-Efficient_Hierarchical_Federated_Edge_Learning
14 pages
A Machine Learning Approach For Task and Resource Allocation in Mobile Edge Computing Based Networks
No ratings yet
A Machine Learning Approach For Task and Resource Allocation in Mobile Edge Computing Based Networks
31 pages
2020-cs-433 (Paper Summary)
No ratings yet
2020-cs-433 (Paper Summary)
3 pages
heidarpoor
No ratings yet
heidarpoor
13 pages
Cloud vs Edge
From Everand
Cloud vs Edge
Isaac Berners-Lee
No ratings yet
Cloud Computing Made Simple: Navigating the Cloud: A Practical Guide to Cloud Computing
From Everand
Cloud Computing Made Simple: Navigating the Cloud: A Practical Guide to Cloud Computing
Poonam Devi
No ratings yet
AP-GSME-PL-CAL-004 Calculation For Cathodic Protection RA-IFR
No ratings yet
AP-GSME-PL-CAL-004 Calculation For Cathodic Protection RA-IFR
14 pages
VLSM Tutorial With Examples
0% (1)
VLSM Tutorial With Examples
7 pages
Gj 024203001
No ratings yet
Gj 024203001
11 pages
Turolla_Group_4_Gear_Pumps_L1022465_Web
No ratings yet
Turolla_Group_4_Gear_Pumps_L1022465_Web
28 pages
Chen Et Al 2012
No ratings yet
Chen Et Al 2012
7 pages
TIE32-Thermal Loads On Optical Glass
No ratings yet
TIE32-Thermal Loads On Optical Glass
6 pages
Eni Grease AC 2 PDF
No ratings yet
Eni Grease AC 2 PDF
1 page
Bearing Capacity calculation
No ratings yet
Bearing Capacity calculation
10 pages
Bryton Bogart: Objective
No ratings yet
Bryton Bogart: Objective
5 pages
Hierarchical Clustering: DSCI 5240 Data Mining and Machine Learning For Business
No ratings yet
Hierarchical Clustering: DSCI 5240 Data Mining and Machine Learning For Business
45 pages
Amazon CloudWatch
No ratings yet
Amazon CloudWatch
6 pages
University of California College of Engineering Department of Electrical Engineering and Computer Sciences
No ratings yet
University of California College of Engineering Department of Electrical Engineering and Computer Sciences
3 pages
Approaches and Methods in Computational Linguistics
No ratings yet
Approaches and Methods in Computational Linguistics
18 pages
CEMAT PresentationV71474389
0% (1)
CEMAT PresentationV71474389
112 pages
Revised DOEACC A Level SYLLABUS
No ratings yet
Revised DOEACC A Level SYLLABUS
180 pages
Prepared By: Sapan M. Raval (6 I.T.) Rishin S. Kumar (6 I.T.) C.U.Shah College of Engg. & Tech. Wadhwan City
No ratings yet
Prepared By: Sapan M. Raval (6 I.T.) Rishin S. Kumar (6 I.T.) C.U.Shah College of Engg. & Tech. Wadhwan City
31 pages
Physical Chemical Properties
No ratings yet
Physical Chemical Properties
7 pages
Documentum Content Transformation Services 16.7 Release Notes
No ratings yet
Documentum Content Transformation Services 16.7 Release Notes
22 pages
Conversion Table For Potato Specific Gravity
No ratings yet
Conversion Table For Potato Specific Gravity
7 pages
Differential Equations
No ratings yet
Differential Equations
28 pages
Comfort Indices
No ratings yet
Comfort Indices
11 pages
Vishwakarma Institute of Technology, Pune: Academic Time Table 2010-2011 First Term
No ratings yet
Vishwakarma Institute of Technology, Pune: Academic Time Table 2010-2011 First Term
2 pages
Spherical Washers: Extract
No ratings yet
Spherical Washers: Extract
1 page
Thermal Energy Notes
No ratings yet
Thermal Energy Notes
62 pages
Steady State Analysis of PMSG PDF
No ratings yet
Steady State Analysis of PMSG PDF
16 pages
Isomerism Level Wise Practice Sheet by Mr. Dhirendra Kumar For Class 11th Chemistry
No ratings yet
Isomerism Level Wise Practice Sheet by Mr. Dhirendra Kumar For Class 11th Chemistry
7 pages

Resource_Allocation_for_Edge_Computing_in_IoT_Networks_via_Reinforcement_Learning

Uploaded by

Resource_Allocation_for_Edge_Computing_in_IoT_Networks_via_Reinforcement_Learning

Uploaded by

Resource Allocation for Edge Computing in IoT

Networks via Reinforcement Learning

Average Cost, C ave

cost including power consumption and task execution

We carry out the simulations at any end device in an IoT

In this paper, we investigate the design of a smart computa-

You might also like