0% found this document useful (0 votes)

18 views17 pages

Online Optimal Service Caching For Multi Access Edge Computing - 2024 - Computer

Uploaded by

fredhkgo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

18 views17 pages

Online Optimal Service Caching For Multi Access Edge Computing - 2024 - Computer

Uploaded by

fredhkgo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 17

Computer Networks 246 (2024) 110395

Contents lists available at ScienceDirect

Computer Networks
journal homepage: www.elsevier.com/locate/comnet

Online optimal service caching for multi-access edge computing: A

constrained Multi-Armed Bandit optimization approach
Weibo Chu a ,∗, Xiaoyan Zhang a , Xinming Jia a , John C.S. Lui b , Zhiyong Wang b
a
Northwestern Polytechnical University, Xi’an, China
b
The Chinese University of Hong Kong, Hong Kong

ARTICLE INFO ABSTRACT

Keywords: In order to fully exploit the power of Multi-Access Edge Computing, services need to be cached at the network
Multi-access edge computing edge in an adaptive and responsive way to accommodate the high system dynamics and uncertainty. In this
Service selection paper, we study the online service caching problem in MEC, with the goal to minimize users’ perceived latency
Service caching
while at the same time, ensure the rate of tasks processed by the edge server is no less than a preset threshold.
Constrained multi-armed bandit
We model the problem with a Constrained stochastic Multi-Armed Bandit formulation, and propose √ a simple
Online algorithm
yet effective online caching algorithm called Constrained Confidence Bound (CCB). CCB achieves 𝑂( 𝑇 ln 𝑇 )
bounds on both regret and violation of the constraint, and is able to achieve a good balance between them.
We further consider the scenario when there is cost (i.e., delay) due to service switches, and propose two
service switch-aware caching algorithms — Explore-First (EF) and Successive Elimination-based (SE) caching,
2 1
together with a novel sampling
√ scheme. We prove that EF achieves 𝑂(𝑇 3 (ln 𝑇 ) 3 ) bound on regret and violation,
whereas SE achieves 𝑂( 𝑇 ln 𝑇 ) and converges significantly faster. Lastly, we conduct extensive simulations
to evaluate our algorithms and results demonstrate their superior performance over baselines.

1. Introduction and can only accommodate a limited number of services. For example,
network operators usually implement cloudlet based mobile computing
It has been shown repeatedly that innovative network architectures using a computing server with small resources or a cluster with medium
and key technologies enable and empower crucial applications. It is resources. This raises the service placement problem as when and where
our consensus today that the reverse is also true as applications are to host the services at the edge nodes. Apparently, the performance of
shaping/changing the network architectures and technologies. Take MEC varies significantly depending on service placement.
for example emerging applications such as autonomous driving [1],
The service placement problem (SPP), sometimes also referred to
AR/VR [2], and networked gaming. Being resource-hungry and delay-
as service caching [6], has attracted a lot of research in the past
sensitive, these applications impose a stringent requirement on both
few years, and various algorithms [7–10], i.e., exact/approximate,
computing and networking capacity, but which cannot be met solely
by the existing cloud systems due to the long propagation delay static/dynamic, centralized/decentralized, have been proposed. Yet de-
and unstable network connections. In this context, a new network signing an optimal policy for service caching remains a challenge due to
computing paradigm called multi-access edge computing (MEC [3, the high heterogeneity and dynamics of both the system and workload.
4]), has been put forward. The key feature of MEC is that services The problem becomes even more challenging when we consider it in an
are hosted at various type of edge nodes endowed with comput- online setting where the caching decisions have to be made as system
ing/storage/communication capacities, so that low-latency access to operates, but without a priori knowledge of user-generated workload
services are possible. and network condition. In fact, these critical information such as task
With MEC, users can offload their tasks to the edge nodes (a.k.a offloading delays are stochastic and unobservable unless the services
MEC servers) for high energy-efficiency, fast responses and enhanced are cached. Moreover, from a practical point of view, it is expected that
security/privacy protection [5]. However, as compared with cloud in- the online caching algorithms can provide us provable performance
frastructure (e.g., data centers) which can virtually host all the services
guarantee.
with abundant resources, MEC servers are often resource-constrained

∗ Corresponding author.
E-mail addresses: [email protected] (W. Chu), [email protected] (X. Zhang), [email protected] (X. Jia), [email protected]
(J.C.S. Lui), [email protected] (Z. Wang).

https://doi.org/10.1016/j.comnet.2024.110395
Received 19 September 2023; Received in revised form 29 February 2024; Accepted 2 April 2024
Available online 16 April 2024
1389-1286/© 2024 Elsevier B.V. All rights reserved.
W. Chu et al. Computer Networks 246 (2024) 110395

In this paper, we study the online service caching problem for a

generic multi-access edge computing system, with the goal to minimize
users’ perceived latency. We adopt a Multi-Armed Bandits (MAB) [11,
12] optimization framework, which is a simple but very powerful tool
for online learning that allows a decision maker to estimate parame-
ters and perform optimization overtime under uncertainty. Our model
differs substantially from traditional MAB in that: (1) in addition to
cumulative reward (here, cumulative latency) that measured by regret,
we also consider a QoS constraint from MEC service provider that the
Fig. 1. A Multi-Access Edge Computing system.
rate of tasks processed by the edge server is no less than a preset
threshold. This formulation captures a key feature in many real-world
optimization tasks that network operators seek some other performance
guarantees (or constraints, measured by violation) in addition to the high capacity. Resource-hungry and delay-sensitive tasks are generated
optimal solution; (2) Besides bandit feedback, our model also involves from UEs and routed to the MEC server. If the required services are
other type of feedback, i.e., full feedback, complementary feedback, hosted, the corresponding tasks are performed locally and superior
and even state-dependent feedback. This significantly extends the exist- user experience can be achieved. Otherwise, they are directed to the
ing MAB framework, while at the same time, poses challenges to design cloud for remote execution at the cost of large delay. Throughout
efficient algorithms and characterize their performance. the paper, we assume that the cloud hosts all the services; however,
More specifically, in this paper we formulate the online service due to resource (CPU, memory, etc.) constraints, the MEC server can
caching problem as a Constrained Multi-Armed Bandits (CMAB) op- only accommodate a limited number of services simultaneously, which
timization problem. To balance between minimizing the aggregate raises the service caching problem, i.e., how to properly select a subset
delay and satisfying the QoS constraint, we first propose a simple of services from a large service pool and instantiate them at the MEC
yet effective algorithm, called Constrained Confidence Bound (CCB), server so as to optimize the system performance.
which is a probabilistic algorithm that achieves sub-linear bounds on Due to its combinatorial nature, service caching problem is gen-
both regret and violation with high probability, i.e., erally hard to tackle even when all system parameters are available,
√ at time 𝑇 with
i.e., the problem is often modeled as a 0–1 programming problem. It is
a probability 1 − 𝛿, we have 𝑅𝑒𝑔(𝑇 ) = 𝑂((𝐾 − 𝐿) 𝐾𝑇 ln 2𝐾𝑇 ) and
√ 𝛿 much more challenging when we consider it in an online setting where
2𝐾𝑇
𝑉 𝑖𝑜(𝑇 ) = 𝑂(𝐾 𝐾𝑇 ln 𝛿 ), where 𝐾 is the number of services in one has to make caching decisions as system operates, but without any
system and 𝐿 is the server capacity. Our simulation results suggest that priori knowledge of the stochastic user-generated tasks and network
CCB serves as a good candidate to online caching if our goal is more condition. Simply caching the most popular services does not always
about the long-term performance. guarantee low latency when, i.e., the computation workload of the tasks
We then consider the scenario when there is service switching cost, are high. Likewise, caching the services with the best network condition
i.e., delays for fetching a new service from cloud and then instantiate is also sub-optimal when the popularity of the service is low. In fact,
it at the MEC server. We show that with service switching cost, the system parameters such as MEC-offloading delay and service switching
performance of CCB degrades due to the inaccurate estimate of pa- cost are unobservable unless the service is selected/cached.
rameters and frequent service switches. To this end, we propose two In this paper, we study this online service caching problem and our
new service switch-aware online caching algorithms — Explore-First goal is to minimize users’ perceived latency. We seek efficient online
(EF) and Successive Elimination-based (SE) caching, both of which can caching algorithms which can optimally select a subset of services
effectively deal with service switches during the caching process. A while at the same time, which can run at the MEC server without
common ingredient of the two algorithms is a novel sampling scheme any collaboration with UEs or the cloud. This makes our mechanism
that we propose, which is designed to sample services uniformly at the easy to deploy, scalable and of low cost. To achieve this, we make the
same rate with small sampling cost, while at the same time, which can following assumptions: (1) tasks for the services arrive independently;
avoid too many service switches. We show that both algorithms are able (2) a task can only be offloaded to the MEC server if the required
to identify the optimal solution efficiently, with the difference that a service is present; otherwise, it will be directed to the cloud; and (3)
large number of samples are required by EF whereas the sampling cost the network delay between UEs and the MEC server is negligible as
can be significantly decreased by SE through gradually eliminating non- compared to the delay between the MEC server and the cloud, which
optimal services with high confidence during the learning process. This implies that the MEC-offloading delay for a task equals to the time it
allows SE to converge much faster and also more computationally cost- is performed at the MEC server, and the cloud computing delay is the
effective. Furthermore, we prove that both algorithms simultaneously time for the task being processed at cloud plus the time it takes for
achieve sub-linear bounds on regret and violation. transmitting data and computation results. It follows that both of the
The remainder of this paper is organized as follows. In Section 2 we two delays can be observed at the MEC server.
describe system model and the CMAB formulation for the online service Let  = {1, 2, … , 𝐾} be the set of services in system, and assume
caching problem. Section 3 presents CCB, its performance analysis and they are of equal sizes. The MEC server can accommodate up to 𝐿 < 𝐾
simulation results. In Section 4 we extend the model to problems with services. Time is divided into successive slots with equal length  =
service switching cost, and elaborate the EF and SE algorithm and their {1, 2, … , 𝑇 }. At each time slot 𝑡, we associate with each service 𝑖 ∈ 
performance evaluation. Section 5 gives related work. We discuss some the following three quantities: (1) 𝛽𝑖𝑡 , which is the arrival rate of tasks
future research directions and conclude the paper in Section 6. for service 𝑖 at time 𝑡; (2) 𝑚𝑡𝑖 , which is the aggregate latency of tasks
for service 𝑖 at time 𝑡 if they are computed at the MEC server, i.e., the
2. System model and problem formulation MEC-offloading delay; and (3) 𝑐𝑖𝑡 , which is the aggregate latency of tasks
for service 𝑖 at time 𝑡 if they are performed at cloud, i.e., the cloud
2.1. System model computing delay. Note that depending on the caching state of service
𝑖, at an arbitrary time slot 𝑡 we can only observe 𝑚𝑡𝑖 or 𝑐𝑖𝑡 , but not both of
We consider a Multi-Access Edge Computing system as shown in them. However, we can always observe 𝛽𝑖𝑡 since all tasks from UEs are
Fig. 1, which consists of multiple user equipments (UEs), an MEC routed to the MEC server for potential offloading. Moreover, without
server and the cloud. To make our model generic, this MEC server loss of generality, we assume 𝛽𝑖𝑡 ∈ [0, 1], 𝑚𝑡𝑖 ∈ [0, 1] and 𝑐𝑖𝑡 ∈ [0, 1].
can be micro data center, an edge cloud or a computing server with Table 1 gives main notations used in this paper.

2
W. Chu et al. Computer Networks 246 (2024) 110395

Table 1 MEC server is much smaller than task processing delay; (2) To imple-
Main notations.
ment multi-access edge computing, network infrastructure providers
Symbol Definition usually deploy MEC servers at locations such as a base station in a
 Set of services in system, || = 𝐾 cellular network, or a gateway in an enterprise local area network. In
𝐿 Server capacity
such a setup, it is common that we have identical delays between UEs
ℎ A threshold denoting the rate of tasks processed by the MEC
server and the MEC server. The overall effect is that this delay can be ignored
𝛿 Failure probability when we want to optimize users’ perceived latency, as doing this will
𝛽𝑖𝑡 Arrival rate of tasks for service 𝑖 at time 𝑡 not make too much differences.
𝑚𝑡𝑖 MEC offloading delay of tasks for service 𝑖 at time 𝑡
𝑐𝑖𝑡 Cloud computing delay of tasks for service 𝑖 at time 𝑡 2.2. Problem formulation
(𝛽𝑖 , 𝑚𝑖 , 𝑐𝑖 ) Expectation of (𝛽𝑖𝑡 , 𝑚𝑡𝑖 , 𝑐𝑖𝑡 )
(𝛽̄𝑖𝑡 , 𝑚̄ 𝑡𝑖 , 𝑐̄𝑖𝑡 ) Empirical means of ({𝛽𝑖𝑡 }, {𝑚𝑡𝑖 }, {𝑐𝑖𝑡 }) up to time 𝑡
(𝛽̌𝑖𝑡 , 𝑚̌ 𝑡𝑖 , 𝑐̌𝑖𝑡 ) Lower Confidence Bound for (𝛽𝑖 , 𝑚𝑖 , 𝑐𝑖 ) at time 𝑡 To address the online caching problem (1), we model it with a Con-
(𝛽̂𝑖𝑡 , 𝑚̂ 𝑡𝑖 , 𝑐̂𝑖𝑡 ) Upper Confidence Bound for (𝛽𝑖 , 𝑚𝑖 , 𝑐𝑖 ) at time 𝑡 strained Multi-Armed Bandit (CMAB) formulation. More specifically,
𝑚𝑡𝑖,11 MEC offloading delay of tasks for service 𝑖 of type 11 at each service is regarded as an arm and caching a service is equivalent to
time 𝑡 pulling an arm. The set of arms thus can be written as  = {1, 2, … , 𝐾}.
UCB𝑡 (𝑚𝑖,11 ) Upper Confidence Bound for 𝑚𝑖,11 at time 𝑡 For each arm 𝑖 ∈ , we associate it with three feedbacks: {𝑚𝑡𝑖 }𝑇𝑡=1 ,
LCB𝑡 (𝑚𝑖,11 ) Lower Confidence Bound for 𝑚𝑖,11 at time 𝑡
{𝑐𝑖𝑡 }𝑇𝑡=1 and {𝛽𝑖𝑡 }𝑇𝑡=1 . We assume that all these sequences are made of
𝐱∗ Optimal caching policy iid. random variables.
𝐱𝐭 Caching decision at time 𝑡
Let 𝑚𝑖 = E[𝑚𝑡𝑖 ], 𝑐𝑖 = E[𝑐𝑖𝑡 ] and 𝛽𝑖 = E[𝛽𝑖𝑡 ]. Also denote 𝐦 =
𝐱𝑡 (𝜋) Caching decision made by algorithm 𝜋 at time 𝑡
𝑡 (𝜋) Set of services selected by algorithm 𝜋 at time 𝑡 (𝑚1 , 𝑚2 , … , 𝑚𝐾 )T , 𝐜 = (𝑐1 , 𝑐2 , … , 𝑐𝐾 )T and 𝜷 = (𝛽1 , 𝛽2 , … , 𝛽𝐾 )T . Let  be
the set of all valid caching decisions, i.e.,  = {𝐱 ∈ {0, 1}𝐾 |𝟏T 𝐱 ≤ 𝐿}.
When all these information are available, the best (and static) caching
policy 𝐱∗ can be put as:
Let 𝐱𝐭 = (𝑥𝑡1 , 𝑥𝑡2 , … , 𝑥𝑡𝐾 )T be the caching decision at time slot 𝑡,
where 𝑥𝑡𝑖 = 1 if service 𝑖 is cached, and 𝑥𝑡𝑖 = 0 otherwise.1 Also denote 𝐱∗ = argmin: 𝐱T 𝐦 + (𝟏 − 𝐱)T 𝐜 (2)
𝑡 )T and 𝜷 𝑡 = (𝛽 𝑡 , 𝛽 𝑡 , … , 𝛽 𝑡 )T . 𝐱∈,𝜷 T 𝐱≥ℎ
𝐦𝐭 = (𝑚𝑡1 , 𝑚𝑡2 , … , 𝑚𝑡𝐾 )T , 𝐜𝐭 = (𝑐1𝑡 , 𝑐2𝑡 , … , 𝑐𝐾 1 2 𝐾
Here, in addition to optimize users’ perceived latency, we also enforce As we have mentioned above, the expectations and distributions of
another constraint to the caching policy that the aggregate rate of tasks the three feedbacks are unknown beforehand, and therefore we have to
processed by the MEC server is no less than a preset threshold ℎ > 0. learn and make decisions based on their estimates. At each time slot 𝑡,
Note that this constraint can be regarded as the QoS requirement from an algorithm 𝜋 makes caching decision 𝐱𝑡 (𝜋) ∈ , and selects services
the service providers. Indeed, one of the most significant benefits that 𝑡 (𝜋) ⊂ . It then observes {𝑚𝑡𝑖 , 𝛽𝑖𝑡 } for each 𝑖 ∈ 𝑡 (𝜋), and {𝑐𝑗𝑡 , 𝛽𝑗𝑡 } for
a service provider can expect from multi-access edge computing is that each 𝑗 ∈  ⧵ 𝑡 (𝜋). Our objective is to design an algorithm 𝜋 to decide
the vast majority of their tasks are processed at the network edge and the caching set 𝑡 (𝜋) for 𝑡 = 1, 2, … , 𝑇 such that it achieves the minimal
the workload on cloud thus can be dramatically decreased. This at the regret, i.e., the accumulated difference between the latency under 𝜋 and
same time alleviates network congestion for the infrastructure provider. that under the optimal policy 𝐱∗ , which is defined as:
Let 𝟏 be the one column vector, i.e. 𝟏 = (1, 1, … , 1)T . With the above
∑
𝑇 ∑ ∑
notations, we can formulate the online service caching problem as the 𝑅𝑒𝑔𝜋 (𝑇 ) = ( 𝑚𝑡𝑖 + 𝑐𝑗𝑡 ) − 𝑇 (𝐱∗ T 𝐦 + (𝟏 − 𝐱∗ )T 𝐜) (3)
following optimization problem: 𝑡=1 𝑖∈𝑡 (𝜋) 𝑗∈⧵𝑡 (𝜋)

∑
𝑇 It is worthy noting that an algorithm 𝜋 which achieves a low regret
T T
Min: 𝐱𝐭 𝐦𝐭 + (𝟏 − 𝐱𝐭 ) 𝐜𝐭 (1a) may violate the QoS constraint, especially when it has little information
{𝐱𝐭 } 𝑡=1 about services/arms. We define violation of the algorithm as the gap
s.t.: 𝟏T 𝐱𝐭 ≤ 𝐿, ∀𝑡 ≤ 𝑇 , (1b) between the accumulated rate of tasks processed by the edge server
𝑡T 𝐭 and the target rate, as follows:
𝜷 𝐱 ≥ ℎ, (1c)
𝑥𝑡𝑖 ∈ {0, 1}, ∀𝑖 ∈ , 𝑡 ≤ 𝑇 , (1d) ∑
𝑇 ∑
𝑉 𝑖𝑜𝜋 (𝑇 ) = [ℎ𝑇 − 𝛽𝑖𝑡 ]+ (4)
∑𝑇 T T 𝑡=1 𝑖∈𝑡 (𝜋)
where 𝑡=1 + 𝐱𝐭 𝐦𝐭 (𝟏 − 𝐱𝐭 ) 𝐜𝐭
is the aggregate users’ perceived delay
over the whole time horizon 𝑇 . Constraint (1b) is for the resource where [𝑥]+ = max{𝑥, 0}.
limitation at the MEC server, and (1c) is for the QoS requirement. Both the regret and violation are important performance indexes
Problem (1) is an integer linear programming (ILP) problem that can and both should be taken into account when we design the algorithm.
be well solved by existing algorithms. The key obstacle here is that in A small regret means the caching policy by the algorithm is close to the
an online setting, {𝐦𝐭 , 𝐜𝐭 , 𝜷 𝑡 } are not known in advance when it comes optimal one, and a small violation implies that the QoS requirement is
to time slot 𝑡, but rather they are revealed after the caching decision is well satisfied during the service caching process. In practice, an algo-
made. rithm with sub-linear bounds on both regret and violation is considered
acceptable and applicable.
Remark. The assumption that the network delay between UEs and the
MEC server can be ignored is for the following two considerations: 3. CCB and its performance evaluation
(1) MEC servers are generally deployed at the network edge in close
proximity to end-users, whereas cloud are located much further away. In this section, we present our Constrained Confidence Bound (CCB)
This implies that the delay between UEs and the MEC server is much algorithm and its performance under simulation. The idea behind CCB
smaller than that between the MEC server and cloud. Furthermore, it is straightforward: (1) at each time slot 𝑡 we can derive a valid caching
is typically also true that the transmission delay between UEs and the policy through solving the optimization problem (1); and (2) although
the system parameters are unknown at the time of decision, we can re-
place them by estimates. This should work well as long as the algorithm
1
Unless otherwise specified, all vectors defined in this paper are column can provide us with good estimates, i.e., Lower/Upper Confidence
vectors. Bound.

3
W. Chu et al. Computer Networks 246 (2024) 110395

3.1. Algorithm Proof. See Appendix A. □

Denote by 𝑡 = {𝜏 , 𝑚𝜏𝑖 , 𝑐𝑖𝜏 , 𝛽𝑖𝜏 ∶ 𝑖 ∈ 𝜏 , 1 ≤ 𝜏 ≤ 𝑇 } be the history Remark. CCB is based on Con-UCB [14], but with the following differ-
of caching decisions and the observed feedback up to time slot 𝑡. CCB ences: (1) Con-UCB is proposed to tackle the online decision problem
maintains the following empirical means for each arm 𝑖 ∈  at each where the objective function is a function of one parameter only (can
time 𝑡: be in the form of multi-level feedback), whereas CCB can be applied
∑ 𝜏
𝜏<𝑡,𝑖∈𝜏 𝑚𝑖 when there are two or more such parameters in the objective function;
𝑚̄ 𝑡𝑖 = 𝑡
(5) (2) In Con-UCB, all feedback are bandit feedback, i.e., observations
𝑁 +1
∑ 𝑖,M 𝜏 can only be made if the arm is selected. On the other hand, CCB
𝜏<𝑡,𝑖∉𝜏 𝑐𝑖
𝑐̄𝑖𝑡 = (6) allows both bandit feedback and full feedback, i.e., {𝛽𝑖𝑡 }. Moreover, CCB
𝑁𝑡 + 1 allows feedback that are complementary, i.e., one parameter that can
∑ 𝑖,C 𝜏
𝜏<𝑡,𝑖∈𝜏 𝛽𝑖 be observed if the arm is selected, and the other one be observed if
𝛽̄𝑖𝑡 = (7) the arm is not selected, but which cannot be simultaneously observed,
𝑡
𝑡 𝑡 are the number of times arm 𝑖 is selected and not i.e., {𝑚𝑡𝑖 } and {𝑐𝑖𝑡 }. These feedback need to be properly handled in the
where 𝑁𝑖,M and 𝑁𝑖,C
𝑡 + 𝑁 𝑡 + 1 = 𝑡. performance analysis. As a result, CCB is a generalization of Con-UCB.
selected before time 𝑡, respectively. Obviously, 𝑁𝑖,M
√ 𝑖,C
𝛾𝜇 𝛾
Let 𝑅(𝜇, 𝑛) = 𝑛
+ 𝑛
as in [13] and 𝛾 is a positive constant. Define 3.2. Simulation results
the following Lower Confidence Bound for 𝑚𝑖 and 𝑐𝑖 at each time 𝑡:
We use simulations to investigate the performance of CCB, with both
𝑚̌ 𝑡𝑖 = max{0, 𝑚̄ 𝑡𝑖 − 2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1)} (8)
synthetic workload and real dataset. For synthetic workload, we assume
the popularity of services follows a Zipf distribution with skewdness
𝑐̌𝑖𝑡 = max{0, 𝑐̄𝑖𝑡 − 2𝑅(𝑐̄𝑖𝑡 , 𝑁𝑖,C
𝑡
+ 1)} (9) parameter 𝑠 ∈ {0.6, 0.8, 1.0}. Task parameters are configured as follows:
and Upper Confidence Bound for 𝛽𝑖 : we divide task sizes into a set of intervals as [0.1 MB, 0.3 MB], [0.3 MB,
0.5 MB], [0.5 MB, 0.8 MB], [0.8 MB, 1 MB], [1 MB, 3 MB], [3 MB,
𝛽̂𝑖𝑡 = min{1, 𝛽̄𝑖𝑡 + 2𝑅(𝑐̄𝑖𝑡 , 𝑡)} (10) 5 MB], [5 MB, 8 MB], and [8 MB, 10 MB] [15,16]. Note that these
𝐭 intervals are not of the same lengths. The task size of each service falls
Denote by 𝐦 ̌𝐭 = (𝑚̌ 𝑡1 , 𝑚̌ 𝑡2 , … , 𝑚̌ 𝑡𝐾 ),
𝐜̌𝐭 = and 𝜷̂ =
(𝑐̌1𝑡 , 𝑐̌2𝑡 , … , 𝑐̌𝐾
𝑡 ),
̂𝑡 ̂𝑡 ̂𝑡 in one of these intervals that are randomly picked, and once fixed, the
(𝛽1 , 𝛽2 , … , 𝛽𝐾 ). As depicted in Alg. 1, the input of CCB includes the arm
size of a task for the service is uniformly chosen from that interval. The
set , the server capacity 𝐿, the QoS requirement ℎ, the time horizon
computing intensity of tasks (in CPU cycles per bit) are drawn randomly
𝑇 , and 𝛿 ∈ (0, 1) which is a failure probability. CCB starts with 𝛾 set to
72 ln 2𝐾𝑇 . It then solves problem (1) at each time 𝑡 with the system from [100, 200, 300, 400, 500] [17], which represents certain amount
𝛿 of task heterogeneity and skewed workload distribution.
𝐭
parameters (𝐦, 𝐜, 𝜷) being replaced by (𝐦 ̌ 𝐭 , 𝐜̌𝐭 , 𝜷̂ ) to get the selected Meanwhile, we assume tasks arrive independently and the aggre-
arms 𝑡 . After that, it updates the upper/low confidence bounds for gate request rate for services is 100 req/sec. The network bandwidth
each arm. The process repeats until time 𝑇 .
between cloud and the MEC server is 5 Mbps [18], and each service is
allocated to 5.6 GHz and 2.8 GHz CPU resources from cloud and the
Procedure 1 Constrained Confidence Bound algorithm for Caching MEC server, respectively. The time horizon is set as 𝑇 = 100 000, and
Services at the MEC server. the length of each slot is 100 s. Without otherwise specified, we set the
Input: , 𝐿, ℎ, 𝑇 , 𝛿; number of services in system as 𝐾 = 100 and that can be hosted at the
Output: Selected arms/services at each time slot; MEC server as 𝐿 = 10.
2𝐾𝑇 1
1: 𝛾 = 72 ln ̌ 1 = č 1 = 𝜷̂ = 𝟎, 𝑁 1 = 𝑁 1 = 0, ∀𝑖 ∈ .
,m 𝑖,M 𝑖,C
For performance evaluation, we adopt the following two algorithms
𝛿
2: for 𝑡 = 1, 2, … , 𝑇 do as baselines: (1) Random-Caching : this is the algorithm that randomly
3: Solve the following optimization problem: picks 𝐿 services to cache at the MEC server at each time slot; and (2)
Top-Rate-Caching : this is the algorithm that always cache the top 𝐿 most
̌ 𝐭 + (𝟏 − 𝐱)T č 𝐭
𝐱𝑡 = argmin: 𝐱T m (11) popular services, assuming that the knowledge of service popularity is
̂𝐭 T
𝐱∈,𝜷 𝐱≥ℎ given a priori.
Fig. 2 shows how each algorithm performs with different parameter
4: Select arms 𝑡 according to 𝐱𝑡 , and do the following updates for
settings in simulation. From these figures, we can see that: (1) as
each arm 𝑖 ∈ :
{ expected, in all cases Random-Caching performs worst as both the
𝑡 + 1,
𝑁𝑖,M ∀𝑖 ∈ 𝑡 regret and violation grow linearly in time; (2) the Top-Rate-Caching,
𝑡+1
𝑁𝑖,M = 𝑡 , . (12) which is able to satisfy the QoS constraint consistently, also does
𝑁𝑖,M ∀𝑖 ∈  ⧵ 𝑡
not provide a satisfactory delay performance for the linearly growing
{ regret (although it grows much more slowly than Random-Caching).
𝑡 + 1,
𝑁𝑖,C ∀𝑖 ∈  ⧵ 𝑡
𝑡+1
𝑁𝑖,C = . (13) This implies that in general, the top 𝐿 most popular services does
𝑡 ,
𝑁𝑖,M ∀𝑖 ∈ 𝑡 not coincide with the set of services that provides the most caching
gain. The reason is that workload distribution can be inconsistent with
𝑡+1
5: ̌ 𝑡+1 , č 𝑡+1 , 𝜷̂
Based on the received feedback, calculate (m ) service popularity distribution, as we configured in simulation; and
accordingly. (3) CCB gives the best performance among the three algorithms, in
that both the regret and violation grow sub-linearly as time elapses.
Moreover, it can be observed that CCB behaves exactly the same way as
The following theorem holds for CCB:
Random-Caching at the very beginning (i.e., 𝑡 < 20 000), for the fact that
during this period not enough samples are collected for services and as
Theorem 3.1. By running CCB, we have a probability at least 1 − 𝛿 such
a result, CCB is not able to differentiate them but have to randomly pick
that:
√ services. After that period, CCB gradually identifies/learns the optimal
2𝐾𝑇 services and the regret grows sub-linearly then.
𝑅𝑒𝑔(𝑇 ) = 𝑂((𝐾 − 𝐿) 𝐾𝑇 ln ),
𝛿 Fig. 3 shows the performance of each algorithm in trace-driven
√ simulation. We adopt a dataset from [19], which contains packet inter-
2𝐾𝑇
𝑉 𝑖𝑜(𝑇 ) = 𝑂(𝐾 𝐾𝑇 ln ). arrival times from 5 applications generated by 36 wireless devices. The
𝛿

4
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 2. Performance of CCB and the two baseline algorithms under different system settings.

wireless traces are used to generate workload, where each packet is the MEC server, but instead they have to be directed to the cloud
regarded as a request and each <device, application> as a service. for remote execution. The cost of service switches raises two new
In this way, we get workload for 94 services in total. Meanwhile, to problems for online caching algorithms, if they were designed without
have enough data for simulation, we stretch time axis in each trace properly considering it: (1) biased/inaccurate estimate of parameters,
by 1000 (so a millisecond becomes a second). Here, since the optimal in particular, the MEC-offloading delay; and (2) system performance
set of services is not known, we give the cumulative latency instead of degradation due to frequent service switches. Take CCB for example,
regret. Again, we can see that CCB outperforms the two baselines whose Fig. 4 shows its performance when there is no switching cost VS. there
latency and violation keep growing linearly all the time, whereas CCB is cost, where in the latter scenario we set the time to load a new service
slows down the growth, i.e., when 𝑡 > 40 000. as 20 s. It is evident that in call cases, the performance of CCB decreases
All in all, we believe that CCB serves as a good solution to online due to service switches, i.e., the regret grows faster and it takes more
caching problem if our goal is more about the long-term system perfor- time to converge. These results suggest that in practice, we need online
mance. In the next section, we will present an algorithm that converges caching algorithms that can properly handle the service switching cost.
much faster, which at the same time, grows much slower in both regret
and violation than CCB. 4.1. Problem formulation

With service switching cost, the MEC-offloading delay for each

4. Service switch-aware online caching
service depends on whether the service is cached or not in the previous
time slot, i.e., the caching state. Accordingly, at each time slot 𝑡 we can
In Section 3, we formulate the online caching problem and propose
divide services into 4 categories:
CCB, without taking into account service switching cost. According
to the current technology, services are often hosted by VMs or con- • type 00 — this is the set of services not cached at time 𝑡 and will
tainers, whose image (e.g., data, code) needs to be fetched and then also be absent at 𝑡 + 1;
loaded into system before the service is available. Note that during • type 01 — this is the set of services not cached at 𝑡 but will be
this period of time, tasks for the service will not be performed at cached at 𝑡 + 1;

5
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 3. Performance of CCB and the two baseline algorithms in trace-driven simulation: 𝐿 = 10, ℎ = 0.2, 𝛿 = 0.01.

Fig. 4. Performance reductions of CCB when there is service switching cost.

• type 10 — this is the set of services cached at 𝑡 but will be absent 4.2. Explore-first algorithm
at 𝑡 + 1;
• type 11 — this is the set of services cached at 𝑡 and will also be The first algorithm we propose to deal with service switches is
present at 𝑡 + 1. Explore-First (EF). To start with, let us re-examine the online caching
problem. We make the following observations: (1) Our goal is to
Note that among the four categories, service switching cost is in- minimize the user-perceived latency, that is, to cache the 𝐿 services
curred only for services of type 01, i.e., when a new service is loaded. with the largest delay savings (under the given constraint), i.e., the gap
Likewise, for each service 𝑖, let 𝑚𝑡𝑖,11 and 𝑚𝑡𝑖,01 denote the delays of between MEC-offloading delay and cloud computing delay; and (2) The
type 11 and 01, respectively, and denote by 𝑐𝑖𝑡 the delays of type 00 optimal policy is:
and 10 (both for cloud computing). The online caching problem then
becomes: 𝐱∗ = argmin: 𝐱T 𝐦𝟏𝟏 + (𝟏 − 𝐱)T 𝐜 (16)
𝐱∈,𝜷 T 𝐱≥ℎ
∑
𝑇 ∑
Min: 𝑥𝑡−1 𝑡 𝑡 𝑡−1 𝑡 𝑡 𝑡 𝑡
(14a) Note that both do not involve the switching cost. It follows that if
𝑖 𝑥𝑖 𝑚𝑖,11 + (1 − 𝑥𝑖 )𝑥𝑖 𝑚𝑖,01 + (1 − 𝑥𝑖 )𝑐𝑖
{𝐱𝐭 } 𝑡=1 𝑖∈ we can estimate 𝐦𝟏𝟏 , 𝐜 and 𝜷 accurately based on feedback, then we
s.t.: 𝟏T 𝐱𝐭 ≤ 𝐿, ∀𝑡 ≤ 𝑇 , (14b) can always find a good solution. Meanwhile, since service switching
T
cost is incurred only when we explore new services, and multiple
𝜷 𝑡 𝐱𝐭 ≥ ℎ, (14c) services needs to be selected at each time slot, we need an efficient
𝑥𝑡𝑖 ∈ {0, 1}, ∀𝑖 ∈ , 𝑡 ≤ 𝑇 , (14d) sampling scheme with the following properties: (1) low cost (≪ C𝐿 𝐾 );
(2) arms/services are sampled uniformly, so as to simplify the algorithm
The above problem is a 0–1 quadratic programming problem that is design and its performance analysis; and (3) the frequency of service
much more complicated than problem (1), in that the caching decision switches are well controlled so that we can avoid too much switching
at each time slot not only depends on the unknown delays but also cost.
depends on the current caching state. A simple and heuristic algorithm Our Explore-First algorithm is based on a novel sampling scheme
is to select services at each time slot 𝑡 through solving the following with the above properties. We use segment as the basic unit to sample
0–1 LP problem given the current system state 𝐱𝐭−𝟏 : a given set of services, where each segment contains multiple rounds,
and each round consists of two successive time slots. The structure of a
𝐱𝐭 =∶ (15a) segment is depicted in Fig. 6. We discriminate two scenarios according
∑
argmin: 𝑥𝑡−1 𝑡 𝑡
+ (1 − 𝑥𝑡−1 𝑡 𝑡
+ (1 − 𝑥𝑡𝑖 )𝑐̌𝑖𝑡 (15b) to whether the number of services to sample 𝑆 can be divided by 𝐿:
𝑖 𝑥𝑖 𝑚
̌ 𝑖,11 𝑖 )𝑥𝑖 𝑚
̌ 𝑖,01
T
𝐱𝐭 ∈,𝜷̂𝐭 𝐱𝐭 ≥ℎ 𝑖∈
• 𝑆 mod 𝐿 = 0. In this case, whenever a new round begins, we
where 𝑚̌ 𝑡𝑖,11 and 𝑚̌ 𝑡𝑖,01 are LCBs of 𝑚𝑖,11 and 𝑚𝑖,01 , respectively. This select 𝐿 new services (at the first time slot), and keep hosting
approach, although straightforward, is greedy in nature, and far from these services at the second time slot, as shown in Fig. 6(a). To
optimal as shown in Fig. 5, where we can see that the performance of sample all the services, each segment contains 𝑆∕𝐿 rounds, and
the algorithm is identical or even worse (see Fig. 5(b)) than CCB. 2𝑆∕𝐿 time slots in total.

6
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 5. Performance comparison between CCB and the greedy algorithm when there is service switching cost.

Fig. 6. A segment used to sample 𝑆 services.

• 𝑆 mod 𝐿 ≠ 0. Let 𝛼 be the smallest positive integer such that

𝑆𝛼 mod 𝐿 = 0. In this case, each segment contains 𝑆𝛼∕𝐿 rounds, 𝑛(𝑖) = 2𝑆𝛼∕𝐿 (19)
and we select at the 𝑗th round the following services: {[(𝑗 − 1)𝐿 +
where we set 𝛼 = 1 when 𝑆 mod 𝐿 = 0. 𝑛(𝑖, M) and 𝑛(𝑖, C) denote
1] mod 𝑆, [(𝑗 − 1)𝐿 + 2] mod 𝑆, … , 𝑗𝐿 mod 𝑆}, as depicted in
the number of times service 𝑖 is hosted at the MEC server (without
Fig. 6(b). Note that it is our policy that service switch occurs
switching cost) or it is played at cloud, in each segment, respectively,
whenever a new round begins, regardless of whether the service
and 𝑛(𝑖) denotes the number of times 𝑖 is requested.
to cache has been hosted in the previous round. The idea behind Explore-First (EF) is simple: we explore services
uniformly with the above sampling scheme, and pick an empirically
It can be easily verified that with this sampling scheme, each service
best arm set for exploitation, regardless of what has been observed
is sampled at the same rate, i.e., for both two scenarios we have: previously. More specifically, we divide the time horizon 𝑇 into two
𝑛(𝑖, M) = 𝛼 (17) phases, exploration phase and exploitation phase, as shown in Fig. 7(a).
The exploration phase consists of 𝑁 successive segments, where each
segment is used to sample the ground set of services . The exploita-
𝑆𝛼
𝑛(𝑖, C) = 2( − 𝛼) (18) tion phase follows which consists of remaining time slots that always
𝐿

7
W. Chu et al. Computer Networks 246 (2024) 110395

play the empirically best arm set, derived by solving the optimization computed at the MEC server (without switching cost) and the cloud,
problem 20. respectively.
Procedure 2 Explore-First algorithm for Caching Services at the MEC Let 𝑡 be the set of arms remaining active at time 𝑡. SE deactivates
server. arms according to the following rules:
Input: , 𝐿, ℎ, 𝑇 ; Rule 1: At the end of each segment (assuming at time 𝑡), identify
Output: Selected arms/services at each time slot; the set of arms  such that arm 𝑖 ∈  if we can find some other arm
1: Exploration phase: Sample the set of arms  with 𝑁 segments. 𝑗 ∈ 𝑡 with UCB𝑡 (𝑐𝑖 ) − LCB𝑡 (𝑚𝑖,11 ) < LCB𝑡 (𝑐𝑗 ) − UCB𝑡 (𝑚𝑗,11 );
2: Select the arm set (𝐱∗ ) by solving the following optimization Rule 2: Solve the following optimization problem for 𝑡 :
problem:
𝐱𝒕 = argmin: 𝐱𝐓 UCB𝒕 (𝒎𝟏𝟏 ) + (𝟏 − 𝐱)𝐓 UCB𝒕 (𝒄) (25)
𝐱∗ = argmin: 𝐱T m
̄ 𝟏𝟏 + (𝟏 − 𝐱)T c̄ (20) 𝐱∈ ,𝜷̄𝐓 𝐱≥𝒉
T
𝐱∈,𝜷̄ 𝐱≥ℎ Denote by (𝐱𝑡 ) the set of selected arms, obtain the set  = 𝑡 ⧵ (𝐱𝑡 );
3: Exploitation phase: Play (𝐱∗ ) in all remaining time slots. Rule 3: Deactivate from 𝑡 the arms in both  and .
Note that rule 1 is used to identify arms with small contribution,
As show in Alg. 2, here 𝑁 is a parameter chosen to minimize the
i.e., these arms are likely not the optimal ones. Rule 2 is used to
regret. It is a function of the time horizon 𝑇 , the number of arms 𝐾 and
identify arms that are not optimal with high confidence, where the
𝐿. In Appendix B, we will show how to properly set it.
QoS constraint has been properly taken into account. Therefore, the
We define the following regret for performance evaluation:
intersection of the two sets gives us arms that can be deactivated with
𝑅𝑒𝑔𝜋 (𝑇 ) =∶ high confidence. One can image that during the very first few segments
∑
𝑇 ∑ ∑ (21)  = ∅ and  is a random set, since not enough samples are collected
( 𝑚𝑖,11 + 𝑐𝑗 ) − 𝑇 (𝐱∗ T 𝐦, 𝟏𝟏 + (𝟏 − 𝐱∗ )T 𝐜) and the algorithm is not able to differentiate arms. As time elapses,
𝑡=1 𝑖∈𝑡 (𝜋) 𝑗∈⧵𝑡 (𝜋)  becomes larger and  becomes more accurate. Once an arm is in
Note that this definition is different from (3) as it uses the expected ∩, we are highly confident that it does not belong to the optimal set
latencies whereas realized latencies are adopted in (3). The definition and thus can be eliminated. Moreover, the active set becomes smaller
of violation remains unchanged as (4). as time elapses since more and more arms are deactivated, which
significantly decreases the sampling cost. See Alg. 3 for more details.
Theorem 4.1. By running Explore-First, we have the following bounds on
regret and violation: Procedure 3 Successive Elimination-based Algorithm for Caching
( 4 ) Services at the MEC server.
𝐾 13 23 1
𝑅𝑒𝑔(𝑇 ) = 𝑂 ( ) 𝑇 (ln 𝑇 ) 3 , Input: , 𝐿, ℎ, 𝑇 ;
𝐿
( 1 2 2 Output: Selected arms/services at each time slot;
1)
𝑉 𝑖𝑜(𝑇 ) = 𝑂 𝐾 3 𝐿 3 𝑇 3 (ln 𝑇 ) 3 . 1: 𝑡 = 1;  = . #  is the set of active arms;
2: for 𝑡 ≤ 𝑇 do
3: Sample  with a segment.
Proof. See Appendix B. □
4: if || > 𝐿 then
5:  = ∅; #  denotes set of arms to be potentially deactivated;
4.3. Successive elimination-based algorithm
6: for 𝑖 ∈ 𝑆 do
7: if ∃𝑗 ∈  such that UCB𝑡 (𝑐𝑖 ) − LCB𝑡 (𝑚𝑖,11 ) < LCB𝑡 (𝑐𝑗 ) −
Explore-First is able to identify the optimal set of services precisely
UCB𝑡 (𝑚𝑗,11 ) then
given sufficient samples, however, the performance in the exploration
8:  =  ∪ {𝑖};
phase may be poor, especially when most of the arms have a large gap
9: Solve the following optimization problem for , and denote
compared with the optimal one. Here we present another caching algo-
the selected arms as (𝐱𝑡 ):
rithm, called Successive Elimination-based (SE) caching, that can sig-
nificantly decrease the sampling cost while at the same time, improve 𝐱𝒕 = argmin: 𝐱𝐓 UCB𝒕 (𝒎𝟏𝟏 ) + (𝟏 − 𝐱)𝐓 UCB𝒕 (𝒄) (26)
the bound on both regret and violation. 𝐱∈ ,𝜷̄𝐓 𝐱≥𝒉
The main idea behind SE is as follows: (1) we divide time horizon 10: Deactivate arms in both  and  ⧵ (𝐱𝑡 ):
𝑇 into two phases: exploration/elimination phase and exploitation phase,
as shown in Fig. 7(b). The exploration/elimination phase consists of
successive segments, where each segment is used to sample a given set  =  ⧵ ( ∩ { ⧵ (𝐱𝑡 )}) (27)
of (active) arms; (2) In the end of each segment, one or more arms
may be deactivated according to the elimination rule; and (3) The
elimination phase completes when there is no more arms to be deleted, Theorem 4.2. Let 𝑡 be the time slot that the elimination phase completes,
and the exploitation phase follows which keeps playing the remaining by running SE we have the following bounds on regret and violation:
arms. √
More specifically, SE maintains the following quantities (LCB/UCB) 𝑅𝑒𝑔(𝑇 ) ≤ 𝑂( 𝐾𝑇 𝑡 ln 𝑇 ),
for each service 𝑖 at time 𝑡: √
√ 𝑉 𝑖𝑜(𝑇 ) ≤ 𝑂(𝑡 𝐾𝑇 ln 𝑇 ).
UCB𝑡 (𝑚𝑖,11 ) = 𝑚̄ 𝑖,11 + 2 ln 𝑇 ∕𝑛𝑡 (𝑖, M) (22)

√ Proof. See Appendix C. □

LCB𝑡 (𝑚𝑖,11 ) = 𝑚̄ 𝑖,11 − 2 ln 𝑇 ∕𝑛𝑡 (𝑖, M) (23)
Theorem 4.2 gives an instance-dependent upper bound on regret by
√ SE since 𝑡 is a function of parameter distributions, i.e., {𝑚𝑖,11 , 𝑐𝑖 , 𝛽𝑖 }.
LCB𝑡 (𝑐𝑖 ) = 𝑐̄𝑖 − 2 ln 𝑇 ∕𝑛𝑡 (𝑖, C) (24)
On the other hand, it is well known that UCB-like algorithms
√ achieve
where 𝑛𝑡 (𝑖, M) denotes the number of times service 𝑖 of type 11 is an instance-independent upper bound on the order of 𝑂( 𝐾𝑇 ln 𝑇 ) for
played, and 𝑛𝑡 (𝑖, C) denotes the number of times 𝑖 is absent before 𝑡. multi-armed bandit problems. We argue that there is no conflict in that
𝑚𝑖,11 and 𝑐𝑖 denote the aggregate latency when tasks of service 𝑖 are the two bounds are of the same order, as stated in the following lemma.

8
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 7. Two service switch-aware algorithms for online caching.

Lemma 4.1. SE achieves regret: the same empirically optimal set of services in the exploitation phase.
√ Obviously, this algorithm does not adapt its exploration scheduler to
𝑅𝑒𝑔(𝑇 ) ≤ 𝑂( 𝐾𝐿𝑇 ln 𝑇 ) the history of the observed rewards. Moreover, in order to have a good
performance, usually a large number of time slots is dedicated to the
Proof. See Appendix D. □ exploration phase, which incurs a high sampling cost. These together
leads to the fact that this algorithm is particularly useful when the time
4.4. Simulation results horizon 𝑇 is large and the system is stable. On the other hand, SE-
based caching works by successively eliminating non-optimal services
4.4.1. Performance over baselines during the sampling process, and therefore it satisfies the so called
Fig. 8 gives performance of EF, SE and other algorithms when there adaptive exploration and incurs much lower sampling cost, i.e., fewer
is switching cost. Again, we set the time to load a new service as time slots are needed in the exploration/elimination phase. In other
20 s. From the figure, we can see that in all cases: (1) both EF and words, this algorithm converges significantly faster and achieves much
SE can accurately identify the optimal set of services as their regret better regret bounds. Based on the above reasoning, we conclude that
and violation keep non-increasing after convergence; on the other hand, SE-based caching is particularly suitable when the underlying system
CCB exhibits a sub-linear growth in regret; (2) among the two proposed is non-stable such as a real edge computing system, where in that case
algorithms, SE converges much faster, i.e., less than 8000 time slots both the convergence speed/rate and accuracy are important, i.e., we
are needed for it to get stable whereas it takes more than 30 000 slots can divide time horizon 𝑇 into multiple phases, and restart the caching
for EF. Moreover, we find that SE are far more computationally cost- algorithm when a new phase starts, so as to quickly adapt to the
effective than CCB and EF as it requires less than 10 min to run each dynamics of the system.
simulation on our machine (2 × 2.2 GHz CPU, 8 GB Memory), whereas
it takes approximately 30 min for EF and even 2 h for CCB (note that 4.4.2. Performance comparison with SoA
CCB requires solving the optimization problem at each time slot). It is interesting to see the performance of our proposed schemes
Fig. 9 gives performance of the corresponding algorithms in trace- against existing online algorithms. To this end, we compare CCB
driven simulation. As expected, we observe that both SE and EF out- with the recently proposed potential-based algorithm [20], which is
perform the baselines and the conclusions are consistent. Moreover, a lightweight but very efficient algorithm for online content caching.
it is interesting to find that EF and SE have approximately the same Here, we model the edge-cloud system as a cache network with two
convergence time under the real workload, which suggests that in nodes, where the edge server is considered as a cache node with limited
practice one can adopt either of them for online service caching. storage capacity, and the cloud as a server that permanently holds
Furthermore, given the performance of CCB as shown in Fig. 4, all the content in system. Moreover, each service is regarded as a
we believe that SE and EF also outperforms when there is no service content, and each request for a service as a request for a content.
switching cost. We characterize each request for a service as a tuple (𝑖, 𝑡𝑚 𝑐
𝑖 , 𝑡𝑖 ), where
𝑚 𝑐
𝑖 is the service requested, 𝑡𝑖 and 𝑡𝑖 (both are stochastic) denotes the
Remark. Here we further compare and analyze the advantages and MEC-offloading delay and cloud-computing delay, respectively. The
application scenarios of the two algorithms. As we mentioned above, EF MEC server maintains a quantity 𝑄𝑖 called potential for each service
works by first sampling services uniformly at the same rate (with our 𝑖, together with the empirical means for MEC-offloading delay 𝑇𝑖𝑚 and
novel sampling scheme), and then makes caching decisions based on cloud-computing delay 𝑇𝑖𝑐 for each service 𝑖. Note that these quantities
estimates of the relevant parameters of services. It then keeps caching are initialized zero and updated whenever a new request arrives.

9
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 8. Performance of EF, SE and baseline algorithms when there is service switching cost.

Fig. 9. Performance of SE, EF and other algorithms in trace-driven simulation with switching cost: 𝐿 = 10, ℎ = 0.2, 𝛿 = 0.01.

10
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 10. Performance of CCB and the potential-based caching algorithm in simulation with no service switch cost: 𝐾 = 100, 𝐿 = 10, ℎ = 0.2, 𝛿 = 0.02.

At the very beginning, there is no services cached at the MEC server. Existing solutions can be categorized into centralized [24,25] and
Service caching is then performed according to the following rules: distributed [26–28], based on the control plane design. A centralized
Rule 1: if a request (𝑖, 𝑡𝑚 𝑐
𝑖 , 𝑡𝑖 ) arrives at the MEC server, 𝑄𝑖 is updated algorithm assumes that the global information such as application
as follows: demands and infrastructure resources are available, and computes a
globally optimal solution. For example, Hong et al. [25] propose that a
𝑄𝑖 = 𝑄𝑖 + 𝑇𝑖𝑐 − 𝑇𝑖𝑚 (28)
coordinator makes deployment decisions for IoT services over the fog
If 𝑖 is cached, then: infrastructure. The drawback of the centralized solution is that global
information is generally hard to collect and the computational cost may
𝑇𝑖𝑚 × 𝑁𝑖𝑚 + 𝑡𝑚
𝑖
𝑇𝑖𝑚 = , 𝑁𝑖𝑚 = 𝑁𝑖𝑚 + 1 (29) be excessively high. On the other hand, a distributed approach relies on
𝑁𝑖𝑚 + 1
the local computation of each node and their collaboration to address
else: the scalability and locality awareness issues. This approach is able to
𝑇𝑖𝑐 × 𝑁𝑖𝑐 + 𝑡𝑐𝑖 provide services that fit the local context, but generally speaking, it
𝑇𝑖𝑐 = , 𝑁𝑖𝑐 = 𝑁𝑖𝑐 + 1 (30) cannot guarantee global optimality of the solution.
𝑁𝑖𝑐 + 1
The service placement problem can be addressed in an offline [29,
where 𝑁𝑖𝑚 and 𝑁𝑖𝑐 denotes the number of times service 𝑖 is requested 30] and online fashion [31,32]. The offline approach requires that
when it is cached at the MEC server and when it is absent, respectively. all the information about the system and workload are given a priori
Rule 2: if a response to request (𝑖, 𝑡𝑚 𝑐
𝑖 , 𝑡𝑖 ) from the remote cloud before the placement decision is computed. That is, the placement
arrives at the MEC server and there is no room for hosting service 𝑖 decision is made at the complie time before deployment. Examples
if it is absent, then the MEC server calculates the caching probability include [29,30] that assume full knowledge of the Edge/Fog network.
𝑦𝑖 for 𝑖 based on 𝑄𝑖 ’s: On the other hand, recently proposed approaches [33–35] are mainly
𝑄𝑖 online that the placement decisions are made during the run-time of
𝑦𝑖 = ∑ (31)
𝑄𝑖 + 𝑗∈𝑐 𝑄𝑗 the systems. To provide satisfactory performance, the online algorithms
have to take into account the dynamic behaviors of the system. The
where 𝑐 is the set of services cached at the MEC server. If the decision
advantage of this approach is that it is more adaptive and responsive
is to cache 𝑖, then the service 𝑗 with the least potential, i.e., 𝑗 =
to changes. However, it remains a challenge as how to make the best
argmin 𝑄𝑖 , is evicted.
𝑖∈𝑐 use of the system resources.
It can be seen from the above two rules that the potential-based Based on whether the dynamicity of the system is handled or not,
caching algorithm aims at minimizing the aggregate latency for access- existing placement solutions can also be classified as static and dy-
ing the services in system. The following figures show its performance namic [36,37]. The static approach usually assumes that the Edge/Fog
and our proposed mechanism CCB, when there is no service switch cost infrastructure and application characteristics remain unchanged as time
and when there is cost. From Fig. 10 we can see that the potential-based elapses, which is not realistic. In fact, both the two aspects are highly
algorithm performs exceptionally well when there is no service switch time-evolving as new nodes can join and leave the system due to
cost, i.e., cumulative regret grows very slowly (although still linear) instability of the network, the resources available can change over
and the offloading rate constraint can always be satisfied. However, time based on real-life condition, and the workload varies when users’
as depicted in Fig. 11, its performance gets poor when there is switch interest changes. The dynamic approaches [38,39] employ reactive
cost, as both the regret and violation grow linearly as time elapses. strategies to deal with the dynamic nature of the infrastructure and
After a deep investigation, we find that this phenomenon is due to the application, in a way that new services may be deployed and exist-
frequent service switches incurred during the caching process by the ing services may be replaced/released whenever significant change is
potential-based algorithm. Based on this observation, we conclude here observed.
that any caching algorithm could perform poorly when there is service Alternatively, one can also characterize existing SPP solutions based
switch cost, if this cost is not taken into account when we design the on various aspects such as: (1) whether the mobility prediction is
algorithm. exploited or not for mobility and popularity caching [40,41], (2)
user-centric cooperative edge caching [42,43] or network-centric non-
5. Related work cooperative caching, (3) intelligent handover predictions for the edge
[44] and various other recent AI-based approaches like adopted rein-
The service placement problem (SPP) in Edge/Fog computing is forcement learning [45,46], (4) price congestion schemes for caching
essentially to find the available resources (nodes, links) in the network [47,48], and (5) DDPG for orchestration from an SDN perspective [49,
so as to optimize certain objectives (delay, energy consumption, etc.) 50] and so forth.
while at the same time satisfy application requirements, resource con- Obviously, our algorithms belong to the category of dynamic and
straints, locality constraints, etc. It has been a hot topic [21–23] in the online solutions. The work that most close to ours is [51], where the
past few years, and many approaches have emerged. authors address user-managed service placement problem, while in this

11
W. Chu et al. Computer Networks 246 (2024) 110395

Fig. 11. Performance of CCB and the potential-based caching algorithm in simulation with service switch cost: 𝐾 = 100, 𝐿 = 10, ℎ = 0.2, 𝛿 = 0.02.

work we address the problem from the network-side (that is, the net- i.e., for adequate computing power or load balancing. This raises
work operators make the caching decisions instead of users). Moreover, another question as how to determine the necessary number of VMs for
they adopt a contextual MAB framework with a Thompson-sampling each service, and then design efficient online learning algorithms for
scheme for online learning, whereas we employ a Constrained-MAB service caching. One possible solution is to extend the current model,
framework with a novel sampling scheme that we propose to efficiently i.e., by regarding each service with a particular number of instances
explore system dynamics. as an arm. The key challenges here are: (1) the set of arms/actions
would be huge; and (2) instead of selecting 𝐿 services each time, the
6. Conclusion and future work server capacity is now expressed as a new constraint, which further
complicates the online service caching problem.
In this paper, we study the online service caching problem for
a multi-access edge computing system, with the goal to minimize (4) Online service caching for MEC-based networks. There is a trend
users’ perceived latency. We formulate it as a Constrained Multi-Armed that multiple edge servers work collaboratively to form a shard resource
Bandits (CMAB) optimization problem, and propose three efficient pool [52–54], so as to provide reliable and elastic edge computing
algorithms — CCB, EF and SE. We show that CCB can well balance services. This, on one hand, provides us opportunity to leverage the
the objective and QoS constraint, whereas EF and SE can effectively power of the network for better exploration and exploitation. On the
learn the optimal solution when there is service switching cost. We other hand, it also raises significant challenges to design online learning
theoretically analyze their performance by giving the bound on regret algorithms for the network, since MEC servers may be heterogeneous
and violation, and conduct extensive simulations to validate their effi- in computing power, user bases, network condition, etc. Given that
cacy. Our experimental results show that these algorithms outperform there is a flurry studies on Multi-agent MAB (MA-MAB) [55–57], to the
baselines. best of our knowledge, it is still unclear as how to formulate and solve
There are several interesting issues for exploration. Below we give the problem that we concern here, i.e., Multi-agent MAB with multiple
some possible directions that we believe are important and worthy of constraints. We believe that the problem itself is interesting in the area
further investigation. of MAB and deserves further study.

(1) State-dependent CMAB formulation. We have shown that when

CRediT authorship contribution statement
there is service switching cost, the MEC-offloading delay depends on
whether the service is cached or not, i.e., the caching state. Now if we
treat each service of a particular type as an arm, then we get a state- Weibo Chu: Conceptualization, Methodology, Writing – original
dependent CMAB problem formulation, where the action available in draft, Writing – review & editing. Xiaoyan Zhang: Investigation, Soft-
the next time slot also depends on the current state of the system. It ware, Visualization. Xinming Jia: Formal analysis, Resources, Valida-
is still an open problem as how to design efficient algorithms for this tion. John C.S. Lui: Writing – review & editing, Funding acquisition,
type of CMAB problem, especially when the state space is huge. Methodology, Supervision. Zhiyong Wang: Data curation, Software,
Visualization.
(2) Heterogeneous services. We assume that the MEC server can host
𝐿 services at most. This means all the services are of equal sizes. Given
Declaration of competing interest
the limited resources of the MEC server, if services are heterogeneous in
storage or memory, then at any time slot 𝑡 the number of services hosted
by the MEC server can be different, depending on the arms selected. The authors declare that they have no known competing finan-
This is quite different from the problem we considered in this paper. cial interests or personal relationships that could have appeared to
One way to handle heterogeneous services is to model the problem influence the work reported in this paper.
with Combinatorial Bandits with Knapsack Constraints (CBwK), which
combines Combinatorial Bandits where a subset of arms needs to be Data availability
pulled at each round, and Bandits with Knapsack Constraints where
a super-arm needs to be pulled at each round but within a budget Data will be made available on request.
constraint. However, CBwK is not readily applicable since in addition to
the knapsack constraint, we also need to ensure that the rate of tasks Acknowledgments
processed by the MEC server is no less than a preset threshold. This
novel QoS constraint, however, poses significant challenges for both
This work was supported in part by the National Natural Science
the design and performance analysis of efficient algorithms.
Foundation of China (Grant No. 62172333) and the Natural Science
(3) Multiple instances for each service. We assume in this work Basic Research Plan in Shaanxi Province of China (Grant No. 2021JM-
that exactly one instance (e.g., VM) for each service can be hosted at 073). The work of John C.S. Lui was supported in part by the GRF
the MEC server, whereas in practice there can be multiple instances, 14215722 and RGC SRFS2122-4S02.

12
W. Chu et al. Computer Networks 246 (2024) 110395

Appendix A. Proof of Theorem 3.1 Proof of Lemma A.4. Denote by 𝑄𝑡𝑖 be the event such that |𝑚̄ 𝑡𝑖 − 𝑚𝑖 | >
2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M ̄ 𝑡 be its complement. Let 𝛾 = 72 ln 2𝐾𝑇 , obviously
𝑡 + 1), and 𝑄
𝑖 𝛿
We rely on the following lemmas to prove the theorem. 𝑟 ≥ 1.
From Lemma A.3 we have:
Lemma A.1 (Azuma-Hoeffding Inequality [58]). Suppose {𝑌𝑛 ∶ 𝑛 = 𝛿
𝑃 𝑟{𝑄𝑡𝑖 } <
0, 1, 2, 3, …} is a martingale and |𝑌𝑛 − 𝑌𝑛−1 | ≤ 𝑐𝑛 almost surely, then with 𝐾𝑇
2
− ∑𝑛𝑑 taking a union bound,
2 𝑐2
probability at least 1 − 2𝑒 𝑗=1 𝑗 , we have:
∑
𝑇 ∑
𝐾

|𝑌𝑛 − 𝑌0 | ≤ 𝑑. 𝑃 𝑟{∪𝑖,𝑡 𝑄𝑡𝑖 } ≤ 𝑃 𝑟{𝑄𝑡𝑖 } < 𝛿

𝑡=1 𝑖=1

Lemma A.2 ([59,60]). Consider 𝑛 i.i.d random variables 𝑍1 , 𝑍2 , … , 𝑍𝑛 in therefore,

[0, 1] with expectation 𝑧. Let 𝑢 denote their empirical average. Then for any 𝑃 𝑟{∩𝑖,𝑡 𝑄̄ 𝑡𝑖 } = 1 − 𝑃 𝑟{∪𝑖,𝑡 𝑄𝑡𝑖 } > 1 − 𝛿
1
𝛾 > 0, with probability at least 1 − 2𝑒− 72 𝛾 we have:
The above inequality states that for all 𝑖 and 𝑡, at probability at least
|𝑢 − 𝑧| ≤ 𝑅(𝑢, 𝑛) 1 − 𝛿 we have:
√
where 𝑅(𝑢, 𝑛) = 𝛾𝑢
+ 𝛾𝑛 . |𝑚̄ 𝑡𝑖 − 𝑚𝑖 | ≤ 2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1) (40)
𝑛

The following lemma is a corollary of Lemma A.2. It follows that at probability at least 1 − 𝛿,
𝑚𝑖 > 𝑚̄ 𝑡𝑖 − 2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1) = 𝑚̌ 𝑡𝑖
Lemma A.3 ([14]). Let the empirical means 𝑚̄ 𝑡𝑖 , 𝑐̄𝑖𝑡 and 𝛽̄𝑖𝑡 be defined
1 (36) and (37) can be proved in the same way.
as (5) (6) (7). Then for all 𝑖 and 𝑡, with probability at least 1 − 2𝑒− 72 𝛾
To prove (38), define two series of random variables:
we have:
∑ ∑ ∑
𝑡
|𝑚̄ 𝑡𝑖 − 𝑚𝑖 | ≤ 2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1) (32) 𝑍𝑡 = 𝑚𝑡𝑖 − 𝑚𝑖 , 𝑌𝑡 = 𝑍𝑙
𝑖∈𝑡 𝑖∈𝑡 𝑙=1

|𝑐̄𝑖𝑡 − 𝑐𝑖 | ≤ 2𝑅(𝑐̄𝑖𝑡 , 𝑁𝑖,C

𝑡
+ 1) (33) It can be seen that {𝑌𝑛 } is a sequence of independent variables, and
E[𝑍𝑡 |𝑡−1 ] = E[𝑍𝑡 ] = 0, where 𝑡−1 = (𝑌𝑡−1 , 𝑌𝑡−2 , … , 𝑌1 ) is the historical
information up to time 𝑡 − 1. {𝑌𝑛 } is a martingale since
|𝛽̄𝑖𝑡 − 𝛽𝑖 | ≤ 2𝑅(𝛽̄𝑖𝑡 , 𝑡) (34)
E[𝑌𝑡+1 |𝑌𝑡 , 𝑌𝑡−1 , … , 𝑌1 ]
where 𝛾 ≥ 1.
=E[𝑌𝑡 + 𝑍𝑡+1 |𝑌𝑡 , 𝑌𝑡−1 , … , 𝑌1 ]
Proof of Lemma A.3. For every 𝑖 and 𝑡, applying Lemma A.2, we have =E[𝑌𝑡 |𝑌𝑡 , 𝑌𝑡−1 , … , 𝑌1 ] + E[𝑍𝑡+1 |𝑌𝑡 , 𝑌𝑡−1 , … , 𝑌1 ]
1
with probability at least 1 − 2𝑒− 72 𝛾 : =𝑌𝑡 + E[𝑍𝑡+1 ] = 𝑌𝑡 + 0 = 𝑌𝑡
𝑡
𝑁𝑖,𝑀 +1 𝑡
𝑁𝑖,𝑀 +1 √
| 𝑚̄ 𝑡𝑖 − 𝑚𝑖 | ≤ 𝑅( 𝑚̄ 𝑡𝑖 , 𝑁𝑖,𝑀
𝑡
) Moreover, |𝑌𝑡 − 𝑌𝑡−1 | = |𝑍𝑡 | ≤ 𝐿. Let 𝑑 = 𝐿 2𝑇 ln 𝛿2 , according to
𝑡
𝑁𝑖,𝑀 𝑡
𝑁𝑖,𝑀 Lemma A.1, we known that with probability at least 1 − 𝛿:
𝑡
𝑁𝑖,𝑀 𝑡
𝑁𝑖,𝑀 +1 √
𝑚𝑖 ∑𝑇 ∑ ∑ 2
|𝑚̄ 𝑡𝑖 − 𝑚𝑖 + 𝑡
|≤ 𝑡
𝑅( 𝑡
𝑚̄ 𝑡𝑖 , 𝑁𝑖,𝑀
𝑡
) | 𝑚𝑡𝑖 − 𝑚𝑖 | ≤ 𝐿 2𝑇 ln (41)
𝑁𝑖,𝑀 +1 𝑁𝑖,𝑀 + 1 𝑁𝑖,𝑀 𝛿
𝑡=1 𝑖∈ 𝑡 𝑖∈ 𝑡
This implies that On the other hand, for all 𝑖 ∈ 𝑡 and 𝑡 ≤ 𝑇 , we get
√
√
𝑡
𝑁𝑖,𝑀 √ 𝛾(𝑁𝑖,𝑀
𝑡 + 1)𝑚̄ 𝑡𝑖 𝛾 𝑚 |𝑚̌ 𝑡𝑖 − 𝑚𝑖 | =|𝑚̌ 𝑡𝑖 − 𝑚̄ 𝑡𝑖 + 𝑚̄ 𝑡𝑖 − 𝑚𝑖 |
|𝑚̄ 𝑡𝑖 − 𝑚𝑖 | ≤ 𝑡 (√ 𝑡 + 𝑡 )+ 𝑡 𝑖
𝑁𝑖,𝑀 + 1 𝑡
𝑁𝑖,𝑀 × 𝑁𝑖,𝑀 𝑁𝑖,𝑀 𝑁𝑖,𝑀 + 1 ≤|𝑚̌ 𝑡𝑖 − 𝑚̄ 𝑡𝑖 | + |𝑚̄ 𝑡𝑖 − 𝑚𝑖 |
𝑚 ≤2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1) + |𝑚̄ 𝑡𝑖 − 𝑚𝑖 |
= 𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,𝑀
𝑡
+ 1) + 𝑡 𝑖
𝑁𝑖,𝑀 + 1
≤2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1) + 2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1)
≤ 2𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,𝑀
𝑡
+ 1)
=4𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1)
The last inequality holds because 𝑚𝑖 ≤ 1 ≤ 𝛾.
where the third inequality is due to (40).
Similarly, we can prove |𝑐̄𝑖𝑡 − 𝑐𝑖 | ≤ 2𝑅(𝑐̄𝑖𝑡 , 𝑁𝑖,C
𝑡 + 1) and |𝛽̄𝑡 − 𝛽 | ≤
𝑖 𝑖 Let 𝜏(𝑖, 𝑛) be the time slot that arm 𝑖 is played for the 𝑛th time, we
2𝑅(𝛽̄𝑖𝑡 , 𝑡). □ have
∑
𝑇 ∑ ∑
𝑇 ∑
Lemma A.4. By running CCB with 𝛾 = 72 ln 2𝐾𝑇 𝛿
for 𝑇 time slots, with | (𝑚̌ 𝑡𝑖 − 𝑚𝑖 )| ≤ 4𝑅(𝑚̄ 𝑡𝑖 , 𝑁𝑖,M
𝑡
+ 1)
probability at least 1 − 𝛿 we have the following results hold simultaneously: 𝑡=1 𝑖∈𝑡 𝑡=1 𝑖∈𝑡
𝑁 𝑇 +1
𝑚𝑖 > 𝑚̌ 𝑡𝑖 , ∀𝑖 ∈ , ∀𝑡 ≤ 𝑇 (35) ∑
𝐾 ∑
𝑖,M
= 4𝑅(𝑚̄ 𝜏(𝑖,𝑛)
𝑖 , 𝑛)
𝑖=1 𝑛=1
𝑐𝑖 > 𝑐̌𝑖𝑡 , ∀𝑖 ∈ , ∀𝑡 ≤ 𝑇 (36) √
𝑁 𝑇 +1
∑
𝐾 ∑
𝑖,M
𝛾 𝑚̄ 𝜏(𝑖,𝑛)
𝑖 𝛾
= 4( + )
𝛽̂𝑖𝑡 > 𝛽𝑖 , ∀𝑖 ∈ , ∀𝑡 ≤ 𝑇 (37) 𝑖=1 𝑛=1
𝑛 𝑛

√ 𝑁 𝑇 +1 √
∑
𝑇 ∑ ∑ ∑
𝐾 ∑
𝑖,M
𝛾 𝛾
2𝐾𝑇 ≤ 4( + )
| ( (𝑚𝑡𝑖 − 𝑚̌ 𝑡𝑖 ) + (𝑐𝑖𝑡 − 𝑐̌𝑖𝑡 ))| = 𝑂((𝐾 − 𝐿) 𝐾𝑇 ln ) (38) 𝑛 𝑛
𝑡=1 𝑖∈𝑡 𝑖∉𝑡
𝛿 𝑖=1 𝑛=1
√ Note that when 𝑛 is large enough,
∑
𝑇 ∑ 2𝐾𝑇
| (𝛽̂𝑖𝑡 − 𝛽𝑖 )| = 𝑂(𝐾 𝐾𝑇 ln ) (39) 1 1 1
𝑡=1 𝑖∈𝑡
𝛿 1+ + + ⋯ + ⟶ ln 𝑛 + 𝐶
2 3 𝑛

13
W. Chu et al. Computer Networks 246 (2024) 110395

and for all 𝑛 ≥ 1 it can be proved that

√ ∑
𝑇 ∑ ∑
𝑇 ∑
1 1 1 +| (𝑐𝑖𝑡 − 𝑐𝑖 )| + | (𝑐̌𝑖𝑡 − 𝑐𝑖 )|
1+ √ + √ +⋯+ √ <2 𝑛
2 3 𝑛 𝑡=1 𝑖∉𝑡 𝑡=1 𝑖∉𝑡
√ √
Therefore, 2 2𝐾𝑇
≤𝐿 2𝑇 ln + 𝑂(𝐿 𝐾𝑇 ln )
𝛿 𝛿
√ √
𝑁 𝑇 +1 √
∑
𝑇 ∑ ∑
𝐾 ∑
𝑖,M
𝛾 𝛾 2
+ (𝐾 − 𝐿) 2𝑇 ln + 𝑂((𝐾 − 𝐿) 𝐾𝑇 ln
2𝐾𝑇
)
| (𝑚̌ 𝑡𝑖 − 𝑚𝑖 )| ≤ 4( + ) 𝛿 𝛿
𝑡=1 𝑖∈𝑡 𝑖=1 𝑛=1
𝑛 𝑛 √
2𝐾𝑇
𝐾 √
∑ =𝑂((𝐾 − 𝐿) 𝐾𝑇 ln )
𝑇 +1 𝑇 +1 𝛿
=𝑂( ( 𝛾𝑁𝑖,M + 𝛾 ln 𝑁𝑖,M ))
𝑖=1
the last equality holds since 𝐿 ≪ 𝐾 and 𝐿 < 𝐾 − 𝐿. (39) can be proved
𝐾 √ in the same way.
∑ ∑
𝐾
=𝑂( 𝑇 +1
𝛾𝑁𝑖,M + 𝑇 +1
𝛾 ln 𝑁𝑖,M ) Now we prove the theorem. Let 𝐱∗ be the optimal solution to
𝑖=1 𝑖=1 problem (1). From (39) we know that 𝐱∗ is also a feasible solution to
√
√ 𝐾 √ T
√∑ ∏
𝐾 problem 11, i.e., 𝜷̂𝐭 𝐱∗ ≥ 𝜷 T 𝐱∗ ≥ ℎ. Then for all 𝑡 ≤ 𝑇 we have:
=𝑂(√( 𝑇 +1 2
𝛾𝑁𝑖,M ) + 𝛾 ln 𝑇 +1
𝑁𝑖,M ) T T
𝑖=1 𝑖=1 ̌ 𝐭 + (𝟏 − 𝐱𝐭 ) 𝐜̌𝐭 ≤ 𝐱∗ T 𝐦
𝐱𝐭 𝐦 ̌ 𝐭 + (𝟏 − 𝐱∗ )T 𝐜̌𝐭
√ (45)
√𝐾 ≤ 𝐱∗ T 𝐦𝐭 + (𝟏 − 𝐱∗ )T 𝐜𝐭
√∑ ∑ 𝐾 ∏
𝐾
≤𝑂(√ 12
1
𝑇 +1 𝑇 +1 𝐾 𝐾
𝛾𝑁𝑖,M + 𝛾 ln[( 𝑁𝑖,M ) ] )
𝑖=1 𝑖=1 𝑖=1 Combining (38) (45), we get:
√
√𝐾 ∑
𝑇 ∑ ∑
√ √∑ ∏
𝐾
≤𝑂( 𝐾 √ 𝛾𝑁𝑖,M
1
𝑇 +1
+ 𝛾𝐾 ln[( 𝑇 +1 𝐾
𝑁𝑖,M ) ]) 𝑅𝑒𝑔(𝑇 ) =| ( 𝑚𝑡𝑖 + 𝑐𝑖𝑡 ) − 𝑇 (𝐱∗ T 𝐦𝐭 + (𝟏 − 𝐱∗ )T 𝐜𝐭 )|
𝑖=1 𝑖=1 𝑡=1 𝑖∈𝑡 𝑖∉𝑡
√ ∑𝐾
√𝐾 𝑇 ∑
∑ ∑ ∑ ∑
√ √∑ 𝑇 +1
𝑖=1 𝑁𝑖,M
≤𝑂( 𝐾 √ 𝑇 +1
𝛾𝑁𝑖,M + 𝛾𝐾 ln ) ≤| ( 𝑚𝑡𝑖 + 𝑐𝑖𝑡 − 𝑚̌ 𝑡𝑖 − 𝑐̌𝑖𝑡 )|
𝐾 𝑡=1 𝑖∈𝑡 𝑖∉𝑡 𝑖∈𝑡 𝑖∉𝑡
√
𝑖=1 √
√𝐾 2𝐾𝑇
√ √∑ =𝑂((𝐾 − 𝐿) 𝐾𝑇 ln )
=𝑂( 𝐾 √ 𝛾𝑁𝑖,M 𝑇 +1 𝐿𝑇 𝛿
+ 𝛾𝐾 ln )
𝑖=1
𝐾
√ On the other hand, since for all 𝑡, 𝐱𝐭 is a feasible solution to problem 11,
√ √𝐾 T
√ √∑ i.e., 𝜷̂𝐭 𝐱𝐭 ≥ ℎ, and with (39), we have:
≤𝐿 2𝑇 ln + 𝑂( 𝐾 √ 𝛾𝑁𝑖,M
2 𝑇 +1
+ 𝛾𝐾 ln 𝑇 )
𝛿 +
√
𝑖=1 ∑
𝑇 ∑
𝑉 𝑖𝑜(𝑇 ) =|ℎ𝑇 − 𝛽𝑖𝑡 |
2𝐾𝑇
=𝑂(𝐿 𝐾𝑇 ln ) 𝑡=1 𝑖∈𝑡
𝛿
∑
𝑇 ∑ ∑
𝑇 ∑
where the first inequality is due to Cauchy–Schwarz inequality, and the ≤| 𝛽̂𝑖𝑡 − 𝛽𝑖𝑡 |
third inequality is from inequality of arithmetic and geometric mean. 𝑡=1 𝑖∈𝑡 𝑡=1 𝑖∈𝑡
∑
The fourth equality holds since 𝐾 𝑇 +1 ∑
𝑇 ∑
𝑖=1 𝑁𝑖,M = 𝐿𝑇 .
=| (𝛽̂𝑖𝑡 − 𝛽𝑖𝑡 )|
Rewriting it in a compact way, we get 𝑡=1 𝑖∈𝑡
√ √
∑
𝑇 ∑
2𝐾𝑇 2𝐾𝑇
| (𝑚̌ 𝑡𝑖 − 𝑚𝑖 )| ≤ 𝑂(𝐿 𝐾𝑇 ln ) (42) =𝑂(𝐾 𝐾𝑇 ln )
𝛿 𝛿
𝑡=1 𝑖∈𝑡
This completes the proof. □
By the same reasoning, we can prove that with a probability 1 − 𝛿,
√ Appendix B. Proof of Theorem 4.1
∑
𝑇 ∑
2
| (𝑐𝑖𝑡 − 𝑐𝑖 )| ≤ (𝐾 − 𝐿) 2𝑇 ln (43)
𝑡=1 𝑖∉𝑡
𝛿
It suffices to consider the case when 𝐾 mod 𝐿 ≠ 0, since we can set
√
∑
𝑇 ∑
2𝐾𝑇
𝛼 = 1 when 𝐾 mod 𝐿 = 0.
| (𝑐̌𝑖𝑡 − 𝑐𝑖 )| ≤ 𝑂((𝐾 − 𝐿) 𝐾𝑇 ln ) (44) Let us focus on the time slot when the exploration phase completes,
𝑡=1 𝑖∉𝑡
𝛿
i.e., when 𝑡 = 2𝑁𝐾𝛼∕𝐿. From Hoeffding inequality (Appendix A
Based on (42) (43) (44), we have of [12]), we know that ∀𝑖,
2
∑
𝑇 ∑ ∑ 𝑃 𝑟{|𝑚̄ 𝑖,11 − 𝑚𝑖,11 | ≤ 𝑟𝑡 (𝑚𝑖,11 )} ≥ 1 − (46)
| ( (𝑚𝑡𝑖 − 𝑚̌ 𝑡𝑖 ) + (𝑐𝑖𝑡 − 𝑐̌𝑖𝑡 ))| 𝑇4
𝑡=1 𝑖∈𝑡 𝑖∉𝑡
2
∑
𝑇 ∑ ∑ ∑ ∑ 𝑃 𝑟{|𝑐̄𝑖 − 𝑐𝑖 | ≤ 𝑟𝑡 (𝑐𝑖 )} ≥ 1 − (47)
𝑇4
=| ( 𝑚𝑡𝑖 + 𝑐𝑖𝑡 − 𝑚̌ 𝑡𝑖 − 𝑐̌𝑖𝑡 )|
𝑡=1 𝑖∈𝑡 𝑖∉𝑡 𝑖∈𝑡 𝑖∉𝑡 2
𝑃 𝑟{|𝛽̄𝑖 − 𝛽𝑖 | ≤ 𝑟𝑡 (𝛽𝑖 )} ≥ 1 − (48)
∑
𝑇 ∑ ∑ ∑ ∑ 𝑇4
=| ( 𝑚𝑡𝑖 + 𝑐𝑖𝑡 + 𝑚𝑖 − 𝑚𝑖 √ √
2 ln 𝑇
𝑡=1 𝑖∈𝑡 𝑖∉𝑡 𝑖∈𝑡 𝑖∈𝑡 where 𝑟𝑡 (𝑚𝑖,11 ) = , 𝑟𝑡 (𝑐𝑖 ) = 2 ln 𝑇 ∕𝑁𝛼, and 𝑟𝑡 (𝛽𝑖 ) =
∑ ∑ ∑ ∑ √ 2𝑁𝛼(𝐾∕𝐿−1)
+ 𝑐𝑖 − 𝑐𝑖 − 𝑚̌ 𝑡𝑖 − 𝑐̌𝑖𝑡 )| 2𝐿 ln 𝑇
are the radius of confidence intervals.
𝑖∉𝑡 𝑖∉𝑡 𝑖∈𝑡 𝑖∉𝑡 2𝑁𝐾𝛼
Define the clean event to be the event that (46) (47) (48) hold for
∑
𝑇 ∑ ∑
𝑇 ∑
all arms simultaneously, and the ‘‘bad event ’’ to be its complement. To
≤| (𝑚𝑡𝑖 − 𝑚𝑖 )| + | (𝑚̌ 𝑡𝑖 − 𝑚𝑖 )|
𝑡=1 𝑖∈𝑡 𝑡=1 𝑖∈𝑡 analyze the regret and violation of the algorithm, it suffices to focus

14
W. Chu et al. Computer Networks 246 (2024) 110395

on the clean event, since the contribution of the bad event can be
neglected.2 2𝑁𝐾𝛼
Denote by 𝑡 be the arm set selected when the exploration phase ≤ 𝐿 + 2𝐿𝑇 𝑟𝑡 (𝛽𝑖 )
𝐿
√ (55)
completes (assuming at time 𝑡). If 𝑡 = ∗ , then the regret will no longer ln 𝑇
increase in the exploitation phase. On the other hand, if 𝑡 ≠ ∗ , then < 2𝑁𝐾𝛼 + 2𝐿𝑇
𝑁𝛼
we must have,
Substituting (54) into the above inequation, we get:
∑ ∑ ∑
∗ ∑
∗
( 1 2 2
𝑐̄𝑖 + 𝑚̄ 𝑖,11 < 𝑐̄𝑖 + 𝑚̄ 𝑖,11 (49) 1)
𝑉 𝑖𝑜(𝑇 ) = 𝑂 𝐾 3 𝐿 3 𝑇 3 (ln 𝑇 ) 3
𝑖∉𝑡 𝑖∈𝑡 𝑖∉ 𝑖∈

Since this is a clean event, we have: The proof completes.

∑ ∑
𝑐𝑖 − (𝐾 − 𝐿)𝑟𝑡 (𝑐𝑖 ) + 𝑚𝑖,11 − 𝐿𝑟𝑡 (𝑚𝑖,11 )
𝑖∉𝑡 𝑖∈𝑡 Appendix C. Proof of Theorem 4.2
∑ ∑ (50)
≤ 𝑐̄𝑖 + 𝑚̄ 𝑖,11
𝑖∉𝑡 𝑖∈𝑡
√ √
Let 𝑟𝑡 (𝑚𝑖,11 ) = 2 ln 𝑇 ∕𝑛𝑡 (𝑖, M), 𝑟𝑡 (𝑐𝑖 ) = 2 ln 𝑇 ∕𝑛𝑡 (𝑖, C). From Hoeffd-
and ing inequality, we know that (46) and (47) hold.
∑
∗ ∑
∗ Again, define the clean event to be the event that (46) (47) hold for
𝑐̄𝑖 + 𝑚̄ 𝑖,11 all arms simultaneously, and the ‘‘bad event ’’ to be its complement. To
𝑖∉ 𝑖∈
(51) prove the theorem, it suffices to focus on the clean event.
∑∗ ∑
∗
≤ 𝑐𝑖 + (𝐾 − 𝐿)𝑟𝑡 (𝑐𝑖 ) + 𝑚𝑖,11 + 𝐿𝑟𝑡 (𝑚𝑖,11 ) For each arm 𝑖 ∈ , define the gap as follows:
𝑖∉ 𝑖∈
𝛥(𝑖) ∶= (𝑐𝑖∗ − 𝑚𝑖∗ ,11 ) − (𝑐𝑖 − 𝑚𝑖,11 )
Combining (49)(50)(51), we get:
∑ ∑ ∑
∗ ∑
∗ where 𝑖∗ is the best arm, i.e., 𝑖∗ = argmax {𝑐𝑖 − 𝑚𝑖,11 }. Then we have,
𝑐̄𝑖 + 𝑚̄ 𝑖,11 − 𝑐̄𝑖 − 𝑚̄ 𝑖,11 𝑖∈
(52)
𝑖∉𝑡 𝑖∈𝑡 𝑖∉ 𝑖∈ ∑
𝑡 ∑
≤2(𝐾 − 𝐿)𝑟𝑡 (𝑐𝑖 ) + 2𝐿𝑟𝑡 (𝑚𝑖,11 ) 𝑅𝑒𝑔(𝑡) ≤ 𝑡𝐿(𝑐𝑖∗ − 𝑚𝑖∗ ,11 ) − (𝑐𝑖 − 𝑚𝑖,11 )
𝜏=1 𝑖∈𝑡
The above inequation states that each time slot in the exploitation
phase contributes at most 2(𝐾 − 𝐿)𝑟𝑡 (𝑐𝑖 ) + 2𝐿𝑟𝑡 (𝑚𝑖,11 ) to regret. Let 𝑅(𝑡) be the RHS of the above inequality, i.e., 𝑅(𝑡) = 𝑡𝐿(𝑐𝑖∗ −
∑ ∑
Next, we consider the regret generated in the exploration phase. 𝑚𝑖∗ ,11 ) − 𝑡𝜏=1 𝑖∈𝑡 (𝑐𝑖 − 𝑚𝑖,11 ), and 𝑡𝑗 be the time slot that arm 𝑗 is
Note that 𝑐𝑖𝑡 ∈ [0, 1], 𝑚𝑡𝑖,11 ∈ [0, 1], ∀𝑡, and there are 2𝑁𝐾𝛼∕𝐿 time slots eliminated. Fix arm 𝑗, for all 𝑡 ≤ 𝑡𝑗 , the confidence intervals of 𝑗 and 𝑖∗
within 𝑁 segments. It follows that the regret from exploration phase must overlap at time 𝑡, therefore,
can be bounded by 2𝑁𝐾 2 𝛼∕𝐿.
𝛥(𝑗) ≤2𝑟𝑡𝑗 (𝑚𝑗,11 ) + 2𝑟𝑡𝑗 (𝑚𝑖∗ ,11 ) + 2𝑟𝑡𝑗 (𝑐𝑗 ) + 2𝑟𝑡𝑗 (𝑐𝑖∗ )
Given that the regret to time slot 𝑇 is the sum of regret from the
two phases, we can bound it as follows: ≤4𝑟𝑡𝑗 (𝑚𝑗,11 ) + 4𝑟𝑡𝑗 (𝑚𝑖∗ ,11 ) (56)
( )
2𝑁𝐾𝛼 =8𝑟𝑡𝑗 (𝑚𝑗,11 )
𝑅𝑒𝑔(𝑇 ) ≤ 2(𝐾 − 𝐿)𝑟𝑡 (𝑐𝑖 ) + 2𝐿𝑟𝑡 (𝑚𝑖,11 ) (𝑇 − )
𝐿
2𝑁𝐾 𝛼 2 where the second inequality is for the fact that 𝑛𝑡𝑗 (𝑗, M) ≤ 𝑛𝑡𝑗 (𝑗, C), and
+ the last equality holds since 𝑛𝑡𝑗 (𝑗, M) = 𝑛𝑡𝑗 (𝑖∗ , M).
𝐿
( ) (53)
2𝑁𝐾 2 𝛼 Denote by 𝑅(𝑡, 𝑗) be the contribution of arm 𝑗 to regret at time slot
< 2(𝐾 − 𝐿)𝑟𝑡 (𝑐𝑖 ) + 2𝐿𝑟𝑡 (𝑚𝑖,11 ) 𝑇 +
𝐿 𝑡, we can bound it as
2𝑁𝐾 2 𝛼
< 2𝐾𝑇 𝑟𝑡 (𝑚𝑖,11 ) + 𝑅(𝑡, 𝑗) =𝑛𝑡 (𝑗, M) ⋅ 𝛥(𝑗)
𝐿 √
The last inequation holds since 𝑟𝑡 (𝑐𝑖 ) < 𝑟𝑡 (𝑚𝑖,11 ). Note that the two 2 ln 𝑇
≤8𝑛𝑡 (𝑗, M) ⋅
summands are respectively monotonically decreasing and increasing in 𝑛𝑡𝑗 (𝑗, M)
𝑁, to minimize the regret we can set it so that they are approximately √
equal. Therefore, by setting: =𝑂( ln 𝑇 ⋅ 𝑛𝑡𝑗 (𝑗, M))

2𝑁𝐾 2 𝛼 Therefore,
= 2𝐾𝑇 𝑟𝑡 (𝑚𝑖,11 )
𝐿 ∑ √ ∑√
𝑅(𝑡) = 𝑅(𝑡, 𝑗) ≤ 𝑂( ln 𝑇 ) 𝑛𝑡𝑗 (𝑗, M)
we have,
𝑗∈ 𝑗∈
2𝐿2 𝑇 2 ln 𝑇 13 √ ∑
𝑁 =( ) (54) Since 𝑓 (𝑥) = 𝑥 is a concave function and 𝑗∈ 𝑛𝑡𝑗 (𝑗, M) = 𝐿𝑡
, by
𝐾 2 𝛼3 2
Substituting it into (53), we get: Jensen’s Inequality we have,
( 4 ) √ √
𝐾 13 23 1 1 ∑√ 1 ∑ 𝐿𝑡
𝑅𝑒𝑔(𝑇 ) = 𝑂 ( ) 𝑇 (ln 𝑇 ) 3 . 𝑛𝑡𝑗 (𝑗, M) ≤ 𝑛𝑡𝑗 (𝑗, M) ≤
𝐿 𝐾 𝑗∈ 𝐾 𝑗∈ 2𝐾
Similarly, for violation we have: It follows that
∑
𝑇 ∑ √
√ 𝐿𝑡 √
𝑉 𝑖𝑜(𝑇 ) = [ℎ − 𝛽𝑖 ]+ 𝑅𝑒𝑔(𝑡) ≤ 𝑅(𝑡) ≤ 𝑂( ln 𝑇 )𝐾 = 𝑂( 𝐾𝐿𝑡 ln 𝑇 )
𝑡=1 𝑖∈𝑡 2𝐾
∑
𝑇 ∑ ∑ Next, we prove the bound for violation. Let 𝑖⋆ be the arm with the
≤ [ 𝛽𝑖 − 𝛽𝑖 ]+ largest arrival rate, i.e., 𝑖⋆ = argmax 𝛽𝑖 . Then we have,
𝑡=1 𝑖∈∗ 𝑖∈𝑡 𝑖∈

∑
𝑇 ∑
2
𝑉 𝑖𝑜(𝑇 ) ≤ (𝐿𝛽𝑖⋆ − 𝛽𝑖 )
The probability that bad event occurs is 𝑂(𝑇 −4 ). 𝜏=1 𝑖∈𝑡

15
W. Chu et al. Computer Networks 246 (2024) 110395

Denote by 𝑉 (𝑇 ) be the RHS of the above inequality, and 𝑡𝑗 be the References

time slot that arm 𝑗 is eliminated. Also denote by 𝑡1 be the time slot
[1] A.F. Florian Scheck, A. Freyberg, 5G: A Key Requirement for Autonomous
that the first segment ends. Define the gap as:
Driving—Really? https://www.kearney.com/communications-media-technology/
article/-/insights/5g-a-key-requirement-for-autonomous-driving-really-.
𝛿(𝑖) ∶= 𝛽𝑖⋆ − 𝛽𝑖 [2] Cloud AR/VR Whitepaper, https://www.gsma.com/futurenetworks/wiki/cloud-
ar-vr-whitepaper/.
Obviously, [3] Y. Mao, C. You, J. Zhang, K. Huang, K.B. Letaief, A survey on mobile edge
computing: The communication perspective, IEEE Commun. Surv. Tutor. 19 (4)
𝛿(𝑖) ≤ 2𝑟𝑡1 (𝑖) + 2𝑟𝑡1 (𝑖⋆ ) = 4𝑟𝑡1 (𝑖) (2017) 2322–2358.
[4] M. Patel, B. Naughton, C. Chan, N. Sprecher, S. Abeta, A. Neal, et al.,
On the other hand, let 𝑉 (𝑗, 𝑡𝑗 ) be the contribution of arm 𝑗 to Mobile-Edge Computing Introductory Technical White Paper, in: White Paper,
Mobile-Edge Computing (MEC) Industry Initiative, vol. 29, 2014, pp. 854–864.
violation at time 𝑡𝑗 , we have, [5] A. Alwarafy, K.A. Al-Thelaya, M. Abdallah, J. Schneider, M. Hamdi, A survey on
security and privacy issues in edge-computing-assisted internet of things, IEEE
𝑉 (𝑗, 𝑡𝑗 ) =𝑛𝑡𝑗 (𝑗, M) ⋅ 𝛿(𝑗) Internet Things J. 8 (6) (2020) 4004–4022.
√ [6] S. Zhong, S. Guo, H. Yu, Q. Wang, Cooperative service caching and computation
𝐿 ln 𝑇
≤4𝑛𝑡 (𝑗, M) ⋅ offloading in multi-access edge computing, Comput. Netw. 189 (2021) 107916.
𝑎𝐾 [7] J. Xu, L. Chen, P. Zhou, Joint service caching and task offloading for mobile
√
𝐿 ln 𝑇 edge computing in dense networks, in: IEEE INFOCOM 2018-IEEE Conference
=𝑂(𝑛𝑡 (𝑗, M) ⋅ ) on Computer Communications, IEEE, 2018, pp. 207–215.
𝑎𝐾 [8] L. Chen, C. Shen, P. Zhou, J. Xu, Collaborative service placement for edge
Therefore, computing in dense small cell networks, IEEE Trans. Mob. Comput. (2019).
√ [9] N. Yu, Q. Xie, Q. Wang, H. Du, H. Huang, X. Jia, Collaborative service placement
∑ ∑ 𝐿 ln 𝑇 for mobile edge computing applications, in: 2018 IEEE Global Communications
𝑉 (𝑇 ) = 𝑉 (𝑗, 𝑡𝑗 ) ≤ 𝑂( 𝑛𝑡 (𝑗, M) ⋅ ) Conference, GLOBECOM, IEEE, 2018, pp. 1–6.
𝑗∈ 𝑗∈
𝑎𝐾 [10] K. Poularakis, J. Llorca, A.M. Tulino, I. Taylor, L. Tassiulas, Service place-
√ ment and request routing in MEC networks with storage, computation, and
𝐿𝑡 𝐿 ln 𝑇
≤𝑂( ⋅ ) communication constraints, IEEE/ACM Trans. Netw. 28 (3) (2020) 1047–1060.
2 𝑎𝐾 [11] W. Chen, Y. Wang, Y. Yuan, Combinatorial multi-armed bandit: General frame-
√
work and applications, in: International Conference on Machine Learning, PMLR,
𝐿3 ln 𝑇
=𝑂(𝑡 ) 2013, pp. 151–159.
𝑎𝐾
√ [12] A. Slivkins, et al., Introduction to multi-armed bandits, Found. Trends Mach.
≤𝑂(𝑡 𝐾𝐿 ln 𝑇 ) Learn. 12 (1–2) (2019) 1–286.
[13] R. Kleinberg, A. Slivkins, E. Upfal, Multi-armed bandits in metric spaces, in:
where the last inequality is due to 𝐿 <= 𝐾. The proof completes. Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing,
STOC ’08, 2008, pp. 681–690.
[14] K. Chen, K. Cai, L. Huang, J.C. Lui, Beyond the click-through rate: web link
selection with multi-level feedback, in: Proceedings of the 27th International
Appendix D. Proof of Lemma 4.1
Joint Conference on Artificial Intelligence, 2018, pp. 3308–3314.
[15] W. Chu, P. Yu, Z. Yu, J.C. Lui, Y. Lin, Online optimal service selection, resource
allocation and task offloading for multi-access edge computing: A utility-based
From (56) we known that 𝛥(𝑗) ≤ 8𝑟𝑡𝑗 (𝑚𝑗,11 ). Since for each 𝑗 ≠ 𝑖∗ , approach, IEEE Trans. Mob. Comput. (2022).
we have 𝑛𝑇 (𝑗, 𝑀) = 𝑛𝑡𝑗 (𝑗, 𝑀), it follows that: [16] L. Dong, W. He, H. Yao, Task offloading and resource allocation for tasks with
varied requirements in mobile edge computing networks, Electronics 12 (2)
√ (2023) 366.
ln 𝑇 [17] X. Chen, L. Jiao, W. Li, X. Fu, Efficient multi-user computation offloading for
𝛥(𝑗) ≤ 𝑂(𝑛𝑇 (𝑗, 𝑀)) = 𝑂( )
𝑛𝑇 (𝑗, 𝑀) mobile-edge cloud computing, IEEE/ACM Trans. Netw. 24 (5) (2015) 2795–2808.
[18] J. Huang, F. Qian, A. Gerber, Z.M. Mao, S. Sen, O. Spatscheck, A close
Therefore, examination of performance and power characteristics of 4G LTE networks,
in: Proceedings of the 10th International Conference on Mobile Systems,
ln 𝑇 Applications, and Services, 2012, pp. 225–238.
𝑛𝑇 (𝑗, 𝑀) ≤ 𝑂( )
𝛥(𝑗)2 [19] A.S. Uluagac, CRAWDAD Data Set Gatech/fingerprinting (v. 2014-06-09, 2014,
https://crawdad.org/gatech/fingerprinting/20140609/realtestbed/index.html.
and [20] W. Chu, Z. Yu, J.C. Lui, Y. Lin, Jointly optimizing throughput and content
delivery cost over lossy cache networks, IEEE Trans. Commun. 69 (6) (2021)
ln 𝑇
𝑅(𝑇 , 𝑗) = 𝛥(𝑗) ⋅ 𝑛𝑇 (𝑗, 𝑀) ≤ 𝑂( ) 3846–3863.
𝛥(𝑗) [21] F.A. Salaht, F. Desprez, A. Lebre, An overview of service placement problem in
fog and edge computing, ACM Comput. Surv. 53 (3) (2020) 1–35.
Let ∗ be the optimal set of arms to the online caching problem,
[22] A. Yousefpour, C. Fung, T. Nguyen, K. Kadiyala, F. Jalali, A. Niakanlahiji, J.
then we have: Kong, J.P. Jue, All one needs to know about fog computing and related edge
computing paradigms: A complete survey, J. Syst. Archit. 98 (2019) 289–330.
∑
∗ ∑∗
1
𝑅(𝑇 ) = 𝑅(𝑇 , 𝑗) ≤ 𝑂(ln 𝑇 ) ⋅ [23] C. Li, Y. Xue, J. Wang, W. Zhang, T. Li, Edge-oriented computing paradigms: A
𝑗∉ 𝑗∉
𝛥(𝑗) survey on architecture design and system management, ACM Comput. Surv. 51
(2) (2018) 1–34.
Now let us fix some 𝜖, then the regret consists of two parts: (1) [24] O. Ascigil, T.K. Phan, A.G. Tasiopoulos, V. Sourlas, I. Psaras, G. Pavlou, On
uncoordinated service placement in edge-clouds, in: 2017 IEEE International
each arm 𝑗 with 𝛥(𝑗) ≤ 𝜖 contributes at most 𝜖 per time slot, for a Conference on Cloud Computing Technology and Science, CloudCom, IEEE, 2017,
total of 𝐿𝑇 𝜖 by 𝐿 arms; (2) each arm 𝑗 with 𝛥(𝑗) > 𝜖 contributes pp. 41–48.
ln 𝑇
𝑅(𝑇 , 𝑗) ≤ 𝑂( 𝛥(𝑗) ) ≤ 𝑂( ln𝜖𝑇 ), under the clean event. Combining these [25] H.-J. Hong, P.-H. Tsai, A.-C. Cheng, M.Y.S. Uddin, N. Venkatasubramanian, C.-
H. Hsu, Supporting internet-of-things analytics in a fog computing platform,
two parts and under the clean event, we have:
in: 2017 IEEE International Conference on Cloud Computing Technology and
ln 𝑇 Science, CloudCom, IEEE, 2017, pp. 138–145.
𝑅(𝑇 ) ≤ 𝑂(𝐿𝑇 𝜖 + ) [26] Z. Ning, P. Dong, X. Wang, S. Wang, X. Hu, S. Guo, T. Qiu, B. Hu, R.Y.
𝜖
Kwok, Distributed and dynamic service placement in pervasive edge computing
Note that the above inequality holds for any 𝜖. To minimize the networks, IEEE Trans. Parallel Distrib. Syst. 32 (6) (2020) 1277–1292.
[27] Z. Xu, L. Zhou, S.C.-K. Chau, W. Liang, Q. Xia, P. Zhou, Collaborate or separate?
RHS, we can set 𝐿𝑇 𝜖 = ln𝜖𝑇 , and this leads us to the following bound:
Distributed service caching in mobile edge clouds, in: IEEE INFOCOM 2020-IEEE
√ Conference on Computer Communications, IEEE, 2020, pp. 2066–2075.
𝑅𝑒𝑔(𝑇 ) ≤ 𝑂( 𝐾𝐿𝑇 ln 𝑇 ) [28] P. Kayal, J. Liebeherr, Distributed service placement in fog computing: An itera-
tive combinatorial auction approach, in: 2019 IEEE 39th International Conference
The proof completes. on Distributed Computing Systems, ICDCS, IEEE, 2019, pp. 2145–2156.

16
W. Chu et al. Computer Networks 246 (2024) 110395

[29] T. Nishio, R. Shinkuma, T. Takahashi, N.B. Mandayam, Service-oriented het- [56] P. Landgren, V. Srivastava, N.E. Leonard, Distributed cooperative decision
erogeneous resource sharing for optimizing service latency in mobile cloud, in: making in multi-agent multi-armed bandits, Automatica 125 (2021) 109445.
Proceedings of the First International Workshop on Mobile Cloud Computing & [57] S. Hossain, E. Micha, N. Shah, Fair algorithms for multi-agent multi-armed
Networking, 2013, pp. 19–26. bandits, Adv. Neural Inf. Process. Syst. 34 (2021) 24005–24017.
[30] V.B.C. Souza, W. Ramírez, X. Masip-Bruin, E. Marín-Tordera, G. Ren, G. [58] K. Azuma, Weighted sums of certain dependent random variables, Tohoku Math.
Tashakor, Handling service allocation in combined fog-cloud scenarios, in: 2016 J. Sec. Ser. 19 (3) (1967) 357–367, http://dx.doi.org/10.2748/tmj/1178243286.
IEEE International Conference on Communications, ICC, IEEE, 2016, pp. 1–5. [59] R. Kleinberg, A. Slivkins, E. Upfal, Multi-armed bandits in metric spaces, in:
[31] J. Zhao, X. Sun, Q. Li, X. Ma, Edge caching and computation management for Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing,
real-time internet of vehicles: an online and distributed approach, IEEE Trans. 2008, pp. 681–690.
Intell. Transp. Syst. 22 (4) (2020) 2183–2197. [60] A. Badanidiyuru, R. Kleinberg, A. Slivkins, Bandits with knapsacks, J. ACM 65
[32] S. Wang, R. Urgaonkar, T. He, K. Chan, M. Zafer, K.K. Leung, Dynamic service (3) (2018) 1–55.
placement for mobile micro-clouds with predicted future costs, IEEE Trans.
Parallel Distrib. Syst. 28 (4) (2016) 1002–1016. Weibo Chu received the B.S. degree in software engineering
[33] X. Li, X. Zhang, T. Huang, Asynchronous online service placement and task in 2005 and the Ph.D. degree in control science and engi-
offloading for mobile edge computing, in: 2021 18th Annual IEEE International neering in 2013, both from Xi’an Jiaotong University, Xi’an,
Conference on Sensing, Communication, and Networking, SECON, IEEE, 2021, China. He has participated in various research and develop-
pp. 1–9. ment projects on network testing, performance evaluation
[34] B. Gao, Z. Zhou, F. Liu, F. Xu, B. Li, An online framework for joint network and troubleshooting, and gained extensive experiences in
selection and service placement in mobile edge computing, IEEE Trans. Mob. the development of networked systems for research and
Comput. (2021). engineering purposes. From 2011–2012 he worked as a
[35] T. Liu, S. Ni, X. Li, Y. Zhu, L. Kong, Y. Yang, Deep reinforcement learning based visiting researcher at Microsoft Research Asia, Beijing. From
approach for online service placement and computation resource allocation in 2013 he was with the School of Computer Science and Tech-
edge computing, IEEE Trans. Mob. Comput. (2022). nology, Northwestern Polytechnical University. His research
[36] T. Ouyang, Z. Zhou, X. Chen, Follow me at the edge: Mobility-aware dynamic interests include internet measurement and modeling, traffic
service placement for mobile edge computing, IEEE J. Sel. Areas Commun. 36 analysis and performance evaluation.
(10) (2018) 2333–2345.
[37] Y. Zhang, L. Jiao, J. Yan, X. Lin, Dynamic service placement for virtual reality Xiaoyan Zhang received the B.E. degree in 2020 and M.S.
group gaming on mobile edge cloudlets, IEEE J. Sel. Areas Commun. 37 (8) degree in 2023, both in computer science from Northwest-
(2019) 1881–1897.
ern Polytechnical University, Xi’an, China. Her research
[38] Q. Zhang, Q. Zhu, M.F. Zhani, R. Boutaba, J.L. Hellerstein, Dynamic service interests include task scheduling and service management
placement in geographically distributed clouds, IEEE J. Sel. Areas Commun. 31
for edge/cloud computing systems.
(12) (2013) 762–772.
[39] L. Wang, L. Jiao, T. He, J. Li, H. Bal, Service placement for collaborative edge
applications, IEEE/ACM Trans. Netw. 29 (1) (2020) 34–47.
[40] P. Yang, N. Zhang, S. Zhang, L. Yu, J. Zhang, X. Shen, Content popularity
prediction towards location-aware mobile edge caching, IEEE Trans. Multimed.
21 (4) (2018) 915–929.
[41] X. Vasilakos, V.A. Siris, G.C. Polyzos, Addressing niche demand based on joint
Xinming Jia received the B.E. degree from Northwestern
mobility prediction and content popularity caching, Comput. Netw. 110 (2016)
Polytechnical University, Xi’an, China, in 2021. He is cur-
306–323.
rently working toward his M.S. degree at the School of
[42] S. Zhang, P. He, K. Suto, P. Yang, L. Zhao, X. Shen, Cooperative edge caching in
Computer Science and Technology, Northwestern Polytech-
user-centric clustered mobile networks, IEEE Trans. Mob. Comput. 17 (8) (2017)
nical University, Xi’an, China. His research interests include
1791–1805.
resource management and incentive mechanisms design for
[43] W. Han, A. Liu, V.K. Lau, Dual-mode user-centric open-loop cooperative caching
edge computing systems.
for backhaul-limited small-cell wireless networks, IEEE Trans. Wireless Commun.
18 (1) (2018) 532–545.
[44] N. Uniyal, A. Bravalheri, X. Vasilakos, R. Nejabati, D. Simeonidou, W. Feath-
erstone, S. Wu, D. Warren, Intelligent mobile handover prediction for zero
downtime edge application mobility, in: 2021 IEEE Global Communications
John C.S. Lui received the Ph.D. degree in computer
Conference, GLOBECOM, IEEE, 2021, pp. 1–6.
science from UCLA. He is currently a professor in the
[45] C. Zhong, M.C. Gursoy, S. Velipasalar, Deep reinforcement learning-based edge
Department of Computer Science and Engineering at The
caching in wireless networks, IEEE Trans. Cogn. Commun. Netw. 6 (1) (2020)
48–61. Chinese University of Hong Kong. His current research
interests include communication networks, network/system
[46] S. Chen, Z. Yao, X. Jiang, J. Yang, L. Hanzo, Multi-agent deep reinforce-
ment learning-based cooperative edge caching for ultra-dense next-generation security (e.g., cloud security, mobile security, etc.), network
networks, IEEE Trans. Commun. 69 (4) (2020) 2441–2456. economics, network sciences (e.g., online social networks,
[47] Z. Zheng, L. Song, Z. Han, G.Y. Li, H.V. Poor, A stackelberg game approach information spreading, etc.), cloud computing, large-scale
to proactive caching in large-scale mobile edge networks, IEEE Trans. Wireless distributed systems and performance evaluation theory. He
Commun. 17 (8) (2018) 5198–5211. serves in the editorial board of IEEE/ACM Transactions
[48] W. Huang, W. Chen, H.V. Poor, Request delay-based pricing for proactive on Networking, IEEE Transactions on Computers, IEEE
caching: A stackelberg game approach, IEEE Trans. Wireless Commun. 18 (6) Transactions on Parallel and Distributed Systems, Journal
(2019) 2903–2918. of Performance Evaluation and International Journal of
[49] G. Qiao, S. Leng, S. Maharjan, Y. Zhang, N. Ansari, Deep reinforcement learning Network Security. He was the chairman of the CSE Depart-
for cooperative content caching in vehicular edge computing and networks, IEEE ment from 2005 to 2011. He received various departmental
Internet Things J. 7 (1) (2019) 247–257. teaching awards and the CUHK Vice-Chancellor’s Exemplary
[50] Z. Wang, Y. Wei, F.R. Yu, Z. Han, Utility optimization for resource allocation in Teaching Award. He is also a corecipient of the IFIP WG 7.3
multi-access edge network slicing: a twin-actor deep deterministic policy gradient Performance 2005 and IEEE/IFIP NOMS 2006 Best Student
approach, IEEE Trans. Wireless Commun. 21 (8) (2022) 5842–5856. Paper Awards. He is an elected member of the IFIP WG 7.3,
[51] T. Ouyang, R. Li, X. Chen, Z. Zhou, X. Tang, Adaptive user-managed service fellow of the ACM, fellow of the IEEE, and croucher senior
placement for mobile edge computing: An online learning approach, in: IEEE research fellow.
INFOCOM 2019-IEEE Conference on Computer Communications, IEEE, 2019, pp.
Zhiyong Wang received his B.E. degree in 2021 from
1468–1476.
Huazhong University of Science and Technology, Wuhan,
[52] Open Edge Computing, http://openedgecomputing.org.
China. Since August 2021, he has pursued his Ph.D. degree
[53] Openfog, https://opcfoundation.org/markets-collaboration/openfog/.
in the Department of Computer Science & Engineering at
[54] V. Farhadi, F. Mehmeti, T. He, T.F. La Porta, H. Khamfroush, S. Wang, K.S.
The Chinese University of Hong Kong, Hong Kong. His re-
Chan, K. Poularakis, Service placement and request scheduling for data-intensive
search interests include bandits, reinforcement learning and
applications in edge clouds, IEEE/ACM Trans. Netw. 29 (2) (2021) 779–792.
their applications in computer networks and recommender
[55] D. Vial, S. Shakkottai, R. Srikant, Robust multi-agent multi-armed bandits,
systems.
in: Proceedings of the Twenty-Second International Symposium on Theory,
Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile
Computing, 2021, pp. 161–170.

Performance Task in STS
No ratings yet
Performance Task in STS
3 pages
TMC Jscra
No ratings yet
TMC Jscra
18 pages
MDPI FL Based Service Caching
No ratings yet
MDPI FL Based Service Caching
19 pages
History-Aware Online Cache Placement in Fog-Assisted IoT Systems An Integration of Learning and Control
No ratings yet
History-Aware Online Cache Placement in Fog-Assisted IoT Systems An Integration of Learning and Control
22 pages
Research Paper 1
No ratings yet
Research Paper 1
13 pages
Cluster Based Content Caching Driven by Popularity Prediction
No ratings yet
Cluster Based Content Caching Driven by Popularity Prediction
10 pages
TMC-Online Learning Aided Decentralized
No ratings yet
TMC-Online Learning Aided Decentralized
15 pages
Mathematical Modelling Approach in Web Cache Scheme: Hapiza@tmsk - Uitm.edu - My Norlaila@tmsk - Uitm.edu - My
No ratings yet
Mathematical Modelling Approach in Web Cache Scheme: Hapiza@tmsk - Uitm.edu - My Norlaila@tmsk - Uitm.edu - My
5 pages
S00607-025-01443-Wslnode Selection
No ratings yet
S00607-025-01443-Wslnode Selection
25 pages
Cross-Edge Orchestration of Serverless Functions With Probabilistic Caching
No ratings yet
Cross-Edge Orchestration of Serverless Functions With Probabilistic Caching
12 pages
Proactive Caching Strategy Based On Queueing Theory in F-RAN
No ratings yet
Proactive Caching Strategy Based On Queueing Theory in F-RAN
13 pages
Collaborative Discovery Management in Mobile Network
No ratings yet
Collaborative Discovery Management in Mobile Network
20 pages
ScienceDirect Citations 1723361568694
No ratings yet
ScienceDirect Citations 1723361568694
14 pages
Mapcaching: A Novel Mobility Aware Proactive Caching Over C-Ran
No ratings yet
Mapcaching: A Novel Mobility Aware Proactive Caching Over C-Ran
6 pages
Link 1
No ratings yet
Link 1
6 pages
User-Preference-Learning-Based Proactive Edge Caching For D2D-Assisted Wireless Networks
No ratings yet
User-Preference-Learning-Based Proactive Edge Caching For D2D-Assisted Wireless Networks
16 pages
Function Approximation Based Reinforcement Learning For Edge Caching in Massive MIMO Networks
No ratings yet
Function Approximation Based Reinforcement Learning For Edge Caching in Massive MIMO Networks
13 pages
Net 2023 109654
No ratings yet
Net 2023 109654
13 pages
Joint Multi-User Computation Offloading and Data Caching For Hybrid Mobile Cloud Edge Computing
No ratings yet
Joint Multi-User Computation Offloading and Data Caching For Hybrid Mobile Cloud Edge Computing
13 pages
Benefit-Based Data Caching in Ad Hoc Networks
No ratings yet
Benefit-Based Data Caching in Ad Hoc Networks
16 pages
Service Caching in Multi-Tier Fog Radio Access Networks-1
No ratings yet
Service Caching in Multi-Tier Fog Radio Access Networks-1
1 page
Deep Reinforcement Learning Mechanism For Deadline
No ratings yet
Deep Reinforcement Learning Mechanism For Deadline
21 pages
N7DM08
No ratings yet
N7DM08
14 pages
On Cooperative Caching in Wireless P2P Networks
No ratings yet
On Cooperative Caching in Wireless P2P Networks
9 pages
Theoretical Study of Cache Systems: Dmitry Dolgikh
No ratings yet
Theoretical Study of Cache Systems: Dmitry Dolgikh
17 pages
Joint Service Deployment and Task Offloading For Datacenters With Edge Heterogeneous Servers
No ratings yet
Joint Service Deployment and Task Offloading For Datacenters With Edge Heterogeneous Servers
15 pages
Wu 2021
No ratings yet
Wu 2021
15 pages
Offloading Decision For Mobile Multi-Access Edge Computing in A Multi-Tiered 6G Network
No ratings yet
Offloading Decision For Mobile Multi-Access Edge Computing in A Multi-Tiered 6G Network
14 pages
User Preference Learning-Based Proactive Edge Caching For D2D-Assisted Wireless Networks - 2023
No ratings yet
User Preference Learning-Based Proactive Edge Caching For D2D-Assisted Wireless Networks - 2023
16 pages
PSAC Proactive Sequence-Aware Content Caching Via Deep Learning at The Network Edge
No ratings yet
PSAC Proactive Sequence-Aware Content Caching Via Deep Learning at The Network Edge
10 pages
Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach
No ratings yet
Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach
30 pages
Cooperative Resource Allocation For NOMA-MEC Multi-Cell Network 2025
No ratings yet
Cooperative Resource Allocation For NOMA-MEC Multi-Cell Network 2025
16 pages
Data Replication and Scheduling in The Cloud With Optimization Assisted Work Flow Management
No ratings yet
Data Replication and Scheduling in The Cloud With Optimization Assisted Work Flow Management
23 pages
A Hybrid DQN and Optimization Approach For Strategy and Resource Allocation in MEC Networks
No ratings yet
A Hybrid DQN and Optimization Approach For Strategy and Resource Allocation in MEC Networks
14 pages
Consistent Hashing and Random Trees
No ratings yet
Consistent Hashing and Random Trees
10 pages
Modeling and Optimization of Latency in Erasure-Coded Storage Systems
No ratings yet
Modeling and Optimization of Latency in Erasure-Coded Storage Systems
141 pages
2021 - A Survey On Reinforcement Learning-Aided Caching in Heterogeneous Mobile Edge Networks
No ratings yet
2021 - A Survey On Reinforcement Learning-Aided Caching in Heterogeneous Mobile Edge Networks
33 pages
PTEA Algorithm For Wireless P2P Networks in The Presence of Cooperative Cache
No ratings yet
PTEA Algorithm For Wireless P2P Networks in The Presence of Cooperative Cache
3 pages
Security in Mobile Edge Caching With Reinforcement Learning
No ratings yet
Security in Mobile Edge Caching With Reinforcement Learning
7 pages
Comst 2019 2934489
No ratings yet
Comst 2019 2934489
2 pages
On Skipping Redundant Computation Via Smart Task Deployment For Faster Serverless
No ratings yet
On Skipping Redundant Computation Via Smart Task Deployment For Faster Serverless
6 pages
Algorithm Solved IEEE Projects 2012 2013 Java at Seabirdssolutions
No ratings yet
Algorithm Solved IEEE Projects 2012 2013 Java at Seabirdssolutions
111 pages
Replicate and Bundle (RNB) : Kalyani Khandezod, Nitin Raut, Abdulla Shaik
No ratings yet
Replicate and Bundle (RNB) : Kalyani Khandezod, Nitin Raut, Abdulla Shaik
6 pages
Load Balancing New
No ratings yet
Load Balancing New
12 pages
Coacs: A Cooperative and Adaptive Caching System For Manets
No ratings yet
Coacs: A Cooperative and Adaptive Caching System For Manets
17 pages
Effective Load Balancing in P2P Systems
No ratings yet
Effective Load Balancing in P2P Systems
8 pages
T-Caching: Enhancing Feasibility of In-Network Caching in ICN
No ratings yet
T-Caching: Enhancing Feasibility of In-Network Caching in ICN
13 pages
Optimal Service Pricing For A Cloud Cache
No ratings yet
Optimal Service Pricing For A Cloud Cache
6 pages
Optimal Service Pricing For A Cloud Cache
No ratings yet
Optimal Service Pricing For A Cloud Cache
13 pages
On Incentivizing Resource Allocation and Task Offloading Fo - 2024 - Computer Ne
No ratings yet
On Incentivizing Resource Allocation and Task Offloading Fo - 2024 - Computer Ne
16 pages
Energy Criticality Avoidance-Based Delay Minimizat
No ratings yet
Energy Criticality Avoidance-Based Delay Minimizat
19 pages
Jan 2024 Rambabu2
No ratings yet
Jan 2024 Rambabu2
24 pages
Cache-Enabled Multicast-Aided Traffic Offloading With Optimal Network-Wide Resource Consumption
No ratings yet
Cache-Enabled Multicast-Aided Traffic Offloading With Optimal Network-Wide Resource Consumption
12 pages
Muncc: Multi-Hop Neighborhood Collaborative Caching in Information Centric Networks
No ratings yet
Muncc: Multi-Hop Neighborhood Collaborative Caching in Information Centric Networks
9 pages
CSD Final Report
No ratings yet
CSD Final Report
8 pages
Distributed Caching Algorithms For Content Distribution Networks
No ratings yet
Distributed Caching Algorithms For Content Distribution Networks
22 pages
Cache
No ratings yet
Cache
14 pages
A Geographical-Aware State Deployment Service For Fog Computing
No ratings yet
A Geographical-Aware State Deployment Service For Fog Computing
17 pages
Optimal Transport-Based One-Shot Federated Learning For Artificial Intelligence of Things
No ratings yet
Optimal Transport-Based One-Shot Federated Learning For Artificial Intelligence of Things
15 pages
Tracking Influencers in Decaying Social Activity Streams With Theoretical Guarantees
No ratings yet
Tracking Influencers in Decaying Social Activity Streams With Theoretical Guarantees
16 pages
Eris An Online Auction For Scheduling Unbiased Distributed Learning Over Edge Networks
No ratings yet
Eris An Online Auction For Scheduling Unbiased Distributed Learning Over Edge Networks
14 pages
Aoe Video
No ratings yet
Aoe Video
33 pages
1 s2.0 S0306261924004148 Main
No ratings yet
1 s2.0 S0306261924004148 Main
20 pages
Etx 2 v6.7 - Datasheet
No ratings yet
Etx 2 v6.7 - Datasheet
13 pages
ADA Flanger Manual
No ratings yet
ADA Flanger Manual
11 pages
SOLAR POWER BANK Final Report Submission
100% (1)
SOLAR POWER BANK Final Report Submission
22 pages
Zombie
No ratings yet
Zombie
5 pages
Generating Evidence For Artificial Intelligence-Based Medical Devices
No ratings yet
Generating Evidence For Artificial Intelligence-Based Medical Devices
104 pages
Intro S4HANA Using Global Bike Exercises FI en v4.1
No ratings yet
Intro S4HANA Using Global Bike Exercises FI en v4.1
10 pages
National Cybersecurity Policy 2023 - 2028 Is Published - Carey Abogados
No ratings yet
National Cybersecurity Policy 2023 - 2028 Is Published - Carey Abogados
4 pages
Nweg5122 Poe
No ratings yet
Nweg5122 Poe
21 pages
Exam 4 Training Grile
No ratings yet
Exam 4 Training Grile
15 pages
2022 MDP APP and Budget Matrix F2F SARSARACAT ES
No ratings yet
2022 MDP APP and Budget Matrix F2F SARSARACAT ES
15 pages
Indonesia (Suite) Wiring Diagram
No ratings yet
Indonesia (Suite) Wiring Diagram
1 page
DP-200 Dump
No ratings yet
DP-200 Dump
164 pages
Fall 2023 - CS302P - 1
No ratings yet
Fall 2023 - CS302P - 1
2 pages
Corvis Prospekt 4 Seitig 0611
No ratings yet
Corvis Prospekt 4 Seitig 0611
4 pages
Application Guide For Master's Degree Courses Asia Bridge Program (ABP)
No ratings yet
Application Guide For Master's Degree Courses Asia Bridge Program (ABP)
34 pages
Colgate OpenCore ComputerVision
No ratings yet
Colgate OpenCore ComputerVision
8 pages
PL Systemair Feb 2024
No ratings yet
PL Systemair Feb 2024
133 pages
1 s2.0 S2772940024000171 Main1
No ratings yet
1 s2.0 S2772940024000171 Main1
10 pages
Microland Limited
No ratings yet
Microland Limited
3 pages
All-Electric Bus HVAC Solutions: Choose From A Range of Clean, Efficient Solutions
No ratings yet
All-Electric Bus HVAC Solutions: Choose From A Range of Clean, Efficient Solutions
4 pages
EGPCL-NPL-PEL-KEC-PPL-RPT-00007 Wall Thickness Calculation Report C01
No ratings yet
EGPCL-NPL-PEL-KEC-PPL-RPT-00007 Wall Thickness Calculation Report C01
13 pages
NIBDocument NIB16
No ratings yet
NIBDocument NIB16
92 pages
Cs403 Assignment Solution 1 Fall 2023
No ratings yet
Cs403 Assignment Solution 1 Fall 2023
7 pages
Conference
No ratings yet
Conference
3 pages
Digital Mp3 Player
No ratings yet
Digital Mp3 Player
3 pages
Embedded Systems Input and Output Optional
No ratings yet
Embedded Systems Input and Output Optional
4 pages
User Manual
No ratings yet
User Manual
128 pages
Syllabus Computer Class-3
No ratings yet
Syllabus Computer Class-3
9 pages

Online Optimal Service Caching For Multi Access Edge Computing - 2024 - Computer

Uploaded by

Online Optimal Service Caching For Multi Access Edge Computing - 2024 - Computer

Uploaded by

Computer Networks 246 (2024) 110395

Contents lists available at ScienceDirect

Online optimal service caching for multi-access edge computing: A

ARTICLE INFO ABSTRACT

In this paper, we study the online service caching problem for a

3.1. Algorithm Proof. See Appendix A. □

With service switching cost, the MEC-offloading delay for each

Fig. 4. Performance reductions of CCB when there is service switching cost.

Fig. 6. A segment used to sample 𝑆 services.

• 𝑆 mod 𝐿 ≠ 0. Let 𝛼 be the smallest positive integer such that

√ Proof. See Appendix C. □

Fig. 7. Two service switch-aware algorithms for online caching.

(1) State-dependent CMAB formulation. We have shown that when

|𝑌𝑛 − 𝑌0 | ≤ 𝑑. 𝑃 𝑟{∪𝑖,𝑡 𝑄𝑡𝑖 } ≤ 𝑃 𝑟{𝑄𝑡𝑖 } < 𝛿

Lemma A.2 ([59,60]). Consider 𝑛 i.i.d random variables 𝑍1 , 𝑍2 , … , 𝑍𝑛 in therefore,

|𝑐̄𝑖𝑡 − 𝑐𝑖 | ≤ 2𝑅(𝑐̄𝑖𝑡 , 𝑁𝑖,C

and for all 𝑛 ≥ 1 it can be proved that

Since this is a clean event, we have: The proof completes.

Denote by 𝑉 (𝑇 ) be the RHS of the above inequality, and 𝑡𝑗 be the References

You might also like