Towards Dynamic Request Updating With Elastic Scheduling For Multi-Tenant Cloud-Based Data Center Network
Authorized licensed use limited to: Sri Krishna College Of Engineering & Technology. Downloaded on May 21,2024 at 16:41:57 UTC from IEEE Xplore. Restrictions apply.
2224 IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, VOL. 11, NO. 2, MARCH/APRIL 2024
updating for those already in use. Here, we desire to find a provisioning strategy for virtual clusters that achieves quick response while also supporting maximum elasticity without resorting to reassignment. This problem is non-trivial due to the following unique challenges: (i) It is difficult to implement fast provisioning and offer a quick response to numerous customers when several new virtual clusters arrive simultaneously (# of n). (ii) The initial position is vital for the flexible growth of subsequent resources for virtual clusters. It is difficult to improve their elasticity under the high dimensions caused by the concurrent requests from an enormous number of virtual clusters and the huge volume of the cloud-based data center. (iii) Additionally, due to fluctuations in demand, tenants who have already provisioned resources in the data center may request resource scaling during operation. For instance, in Fig. 1, tenant $V_1$ initially requests 2 VMs, but as the demand varies over the operating period, the number scales to 4. It is challenging to handle such dynamic scaling of requests in an adaptive way.

In this paper, we introduce a novel dynamic updating framework with elastic resource scheduling. Instead of using heuristic algorithms or simple deep reinforcement learning as the existing work does, we deal with the multi-tenant resource provisioning problem and realize the dynamic adjustment for elastic updating requests at the same time, in two stages. In the first stage, we try to find a feasible provisioning scheme for multiple tenants with a rapid response by relaxing partial constraints. In the second stage, we attempt to update the multi-tenant virtual clusters that have already been provisioned by using deep reinforcement learning, so as to improve the elasticity. However, due to the large scale of the scenario, we need to design an efficient solution that improves the training speed of the neural network in high-dimensional space. Our contributions can be summarized as follows:
• We investigate the virtual cluster provisioning problem in multi-tenant cloud-based DCNs with the hose model, and we propose to maximize the elasticity by considering the limitation on computation and communication resources.
• We make a theoretical and experimental study of the commonly used methods that are appropriate for provisioning of virtual clusters, and we analyze the insights that produce high complexity and slow convergence.
• We introduce a novel dynamic updating framework with elastic scheduling that makes it possible for multi-tenant cloud-based DCNs to provide scalable resources in two stages. We construct a heuristic rapid provisioning scheme in the first stage to realize the real-time response to multi-tenant virtual clusters, and we prove the optimality under the single computation resource constraint.
• Based on that, we present an online dynamic updating method based on deep reinforcement learning to enhance the adaptability of virtual clusters that are running or scaled during the second stage. In order to avoid the high dimensions caused by the large scales of tenants and the DCNs, we train a fully connected neural network by creating a new feasible action set to realize the reduction, and it approximates the policy based on a proposed aggressive objective selection method to improve training speed.
• We conduct various evaluations with several state-of-the-art algorithms under different topologies on the basic setting that refers to the observations. The results are shown from different perspectives to provide conclusions.

The remainder of this paper is organized as follows. Section II surveys related works. Section III describes the model, problem formulation, and motivation. Section IV investigates the problem by proposing an efficient framework. Section V presents the evaluations. Section VI concludes the paper.

II. RELATED WORK

Recently, the main research points of virtual cluster provisioning in DCNs include reliability, energy consumption, traffic changing, and congestion control. However, with the explosion in the types and scales of user requests, the problem of elasticity in the data center has also drawn attention. The solutions are mainly divided into three categories, introduced in the following.

A. Elastic Scheduling by Extending Resources
With the explosion in the types and scales of user requests, the problem of elasticity in the DCN has drawn attention. There have been a few recent works on elastic resource provisioning by extending physical resources. Rui et al. [7] and Naskos et al. [9] showed a probability model and a cost-aware method to analyze the bottleneck in multi-layer cloud applications, and they proposed a method to meet the elastic scaling of the data center. Farokhis et al. [10] designed a two-layer traffic-aware transmission algorithm, which can effectively solve the problem of virtual machine placement and ensure the large-scale elastic scaling of potential user resources. Lin et al. [11] provided a unified framework that integrates the representation of the logic graphs to maintain regular and reliable operation of data center networks and transmit data between servers. Fan et al. [12] presented an adaptive path-finding algorithm for establishing virtual links between any two nodes in the data center network. Wang et al. [13] realized the elastic scaling of cloud-based DCN by adjusting the size of CPU and memory. Chowdhury et al. [14] analyzed the problem of elastic management based on virtual cluster service, separated resource allocation from service management, and provided the ability of elastic service to adapt to dynamic workload changes. The above works realized elastic resource provisioning by adding new instances (VMs, containers, application instance modules, etc.) or adjusting the size of the instance itself during the runtime, which can solve the problem of insufficient physical resources caused by dynamic scaling. However, it is difficult to avoid the low utilization and high cost caused by the uncertainty of the requested types and scales.

B. Elastic Scheduling by Designing Heuristic Strategies
Quite a few works have been carried out on the elastic resource provisioning problem by designing heuristic strategies. Alfonso et al. [15] proposed an open-source virtual cluster framework based on the DCN, which analyzed the dynamic changes of virtual clusters in the running process to minimize the cluster
LU et al.: TOWARDS DYNAMIC REQUEST UPDATING WITH ELASTIC SCHEDULING 2225
shown in Fig. 2(a). We can see that the elasticity of the DCN is close to convergence when the number of iterations reaches $10^4$. In addition, the range of the elasticity fluctuates greatly in the iterations from $6 \times 10^3$ to $10^4$. We enlarge the value of elasticity within 300 iterations in Fig. 2(b), which has no tendency of convergence and fluctuates sharply. Thus, we can see that convergence will be extremely slow if the deep reinforcement learning method is used to directly search for and learn the best solution of the virtual cluster provisioning problem for multiple tenants.

IV. DYNAMIC UPDATING FRAMEWORK WITH ELASTIC SCHEDULING
In this section, we show the details of our novel online dynamic updating framework with elastic scheduling (DUES), which consists of two stages to realize rapid response and high elasticity.

A. Overview
The main idea of DUES is to realize real-time response to multi-tenant requests for provisioning and upgrading while maximizing the elasticity of the cloud-based DCN. The overview of DUES, which comprises two stages, is shown in Fig. 3. In the first stage, we propose a heuristic scheme to realize fast provisioning for multiple tenants and analyze its optimality and complexity. We take the arriving requests of multi-tenant virtual clusters as the input, and the output is the initial provisioning scheme, which in turn serves as the input of the second stage. In the second stage, we propose an online dynamic updating strategy based on deep reinforcement learning to improve the combinational elasticity of the cloud-based DCN. Since the main drawback of the simple deep reinforcement learning method is that it results in huge state and action spaces when applied to the virtual cluster provisioning problem, we introduce a new definition, the feasible action set. Based on that, we train a fully connected neural network to realize the reduction, and it approximates the policy based on a proposed aggressive objective selection method to improve training speed. The detailed description is as follows.

B. Stage 1: Fast Initial Provisioning
In this subsection, we introduce a multi-tenant fast initial provisioning scheme (MFIP) for virtual clusters, which is shown in Algorithm 1. The insight of our scheme is to identify the partition for virtual clusters based on the computing resource of the DCN. The input of Algorithm 1 is the set of virtual clusters $V$, and the output is the fast provisioning scheme $X_f$. In lines 1 and 2, we first check the feasibility of virtual clusters by comparing $|V|$ with $\hat{c}$. Here, we use $\hat{c}$ to represent the total remaining resources, where $\hat{c} = \sum_{i=1}^{m} \hat{c}_i$. If the remaining computing resources can accommodate the virtual clusters of $V$, where $|V| \le \hat{c}$, Algorithm 1 continues. Otherwise, the requests of set $V$ will be rejected. In lines 3 to 5, we calculate the estimated number of accommodation based on the computing capacities of each server in $G$. Here, we introduce a new definition of the estimated divided factor.

Definition 1 (Estimated divided factor): Let $\delta_i$ denote the estimated divided factor of $C_i$, with $\delta_i = \hat{c}_i / \sum_{j=1}^{m} \hat{c}_j$, where $\hat{c}_i$ denotes the remaining available physical resources.

In line 3, we first initialize the group partition with the estimated divided factors. Then we calculate the capacity of each group, which is the maximum number of VMs provisioned on server $i$, i.e., $g_i = \lfloor \delta_i \cdot |V| \rfloor$. The value of $g_i$ is always rounded down to avoid overflow, even if the fractional part is larger than or equal to 0.5. If rounding up or rounding to the nearest integer were used to obtain $g_i$, the total estimated capacity of the groups could be higher than the total number of requests, resulting in an overflow error where the free position information exceeds the total number of requests. We highlight the potential impact with a straightforward example. Suppose that there are only 3 available servers left in the data center, with $\hat{c}_1 = 2$, $\hat{c}_2 = 10$, and $\hat{c}_3 = 8$ remaining resources, respectively, and that the total number of requested VMs of the virtual clusters at this time is $|V| = 15$. Then $\delta_1 = 0.1$, $\delta_2 = 0.5$, and $\delta_3 = 0.4$, so $\delta_i \cdot |V|$ equals 1.5, 7.5, and 6. Rounding down gives $g_1 = 1$, $g_2 = 7$, and $g_3 = 6$, whose sum 14 does not exceed $|V|$. In contrast, we obtain $g_1 = 2$, $g_2 = 8$, and $g_3 = 6$, where $\sum_{i=1}^{3} g_i = 16 > |V|$, regardless of whether rounding up or normal rounding is used, which results in overflow. In line 6, we update the number of
VMs of multi-tenant virtual clusters, $|V| := |V| - \sum_{i=1}^{m} g_i$. In lines 7 to 9, we resize some groups for the rest of the requests in $V$. If $|V| \neq 0$, which means there are remaining VMs that cannot be covered, we choose the physical machine with the maximum available resources by $\arg\max_{C_i \in C} \hat{c}_i$ and increase the estimated number of accommodations by $g_i := g_i + 1$ in lines 8 and 9. Then, we update the total number of requests in line 10. In line 11, we sort the groups in set $g$ in descending order of their sizes, where $g := \mathrm{descending}(g)$. Then we start the provisioning process, which chooses the request $V_j$ given by $\arg\max_{V_j \in V} N_j$ and matches $V_j$ into group $g_i$. If the number of VMs $|V_j|$ exceeds the estimated number of accommodation in $g_i$, we place a part of the VMs according to the size of $g_i$. Then we update $g_i := 0$ and remove $g_i$ from set $g$. Otherwise, we update set $g$ with $g_i := g_i - |V_j|$. After that, we update $g := \mathrm{descending}(g)$ in line 20. Line 21 returns the provisioning scheme $X_f$. The time complexity of Algorithm 1 is $O(m^2 \cdot |V|)$.

Algorithm 1: Multi-tenant Fast Initial Provisioning Scheme (MFIP).
Input: Set of multi-tenant requests $V$;
Output: Fast initial provisioning scheme $X_f$;
1: if $|V| > \hat{c}$ then
2:     return False;
3: for each server in $G$ do
4:     Initialize group partition with the estimated divided factor $\delta_i$;
5:     Calculate the capacity of each group $g_i$;
6: Update the virtual clusters $|V| := |V| - \sum_{i=1}^{m} g_i$;
7: while $|V| \neq 0$ do
8:     Choose the physical machine with $\arg\max_{C_i \in C} \hat{c}_i$;
9:     Update $g_i := g_i + 1$;
10:    Update $|V| := |V| - 1$;
11: Update $g := \mathrm{descending}(g)$;
12: for each group $g_i$ in $G$ do
13:    Choose request $V_j$ with $\arg\max_{V_j \in V} |V_j|$;
14:    Match $V_j$ into group $g_i$;
15:    if $|V_j| > g_i$ then
16:        $|V_j| := |V_j| - g_i$;
17:        Update $g_i := 0$ and remove $g_i$ from set $g$;
18:    else
19:        Update set $g$ with $g_i := g_i - |V_j|$;
20:    Update $g := \mathrm{descending}(g)$;
21: return Initial Provisioning $X_f$;

Theorem 1: The total communication demand of virtual clusters $V$ with MFIP is minimum in $G$ under the single constraint $\hat{c}_i \le c_i$.

Proof: There are two steps in MFIP: initializing the estimated groups of $G$ and identifying the partition for multiple virtual clusters. In the first step, the groups are estimated by treating the multiple virtual clusters as a single one. The partition of each group is found by calculating the capacities based on the physical machines in $G$, which is optimal, as proved in [28]. Thus, we only need to prove that the identified partition for $V$ obtains the minimum total communication demand. We suppose that the set of estimated groups $g = \{g_i\}$ has been sorted in descending order. If the demands of all virtual clusters in $V$, taken in the same order, are lower than the groups in $g$, the total communication demand of virtual clusters $V$ will be 0, which is minimum. If there exists $V_i \subseteq V$ larger than the size of group $g_i$, where $|V_i| > |g_i|$, $V_i$ will be divided into several parts. We discuss the case as follows. We suppose that the fast provisioning $X_f$ is $(V_i \to g_i)_{|V_i| > |g_i|}$; the total communication demand is $\min\{|V_i| - |g_i|, |g_i|\}$. Assuming that the provisioning with minimum communication demand (MCD) of $V_i$ is group $g_j$, we have $(V_i \to g_j)_{|V_i| > |g_j|}$. The total communication demand will be $\min\{|V_i| - |g_j|, |g_j|\}$. Here, we prove by contradiction, assuming that $|g_i| > |g_j|$; then we have four possible scenarios. (i) $\min\{|V_i| - |g_i|, |g_i|\} = |V_i| - |g_i|$ under MFIP and $\min\{|V_i| - |g_j|, |g_j|\} = |V_i| - |g_j|$ under MCD. Since we suppose MCD has the minimum communication demand, we have $|V_i| - |g_j| < |V_i| - |g_i|$, i.e., $|g_i| < |g_j|$, which contradicts the assumption $|g_j| < |g_i|$. (ii) $\min\{|V_i| - |g_i|, |g_i|\} = |V_i| - |g_i|$ under MFIP and $\min\{|V_i| - |g_j|, |g_j|\} = |g_j|$ under MCD. Suppose the group that provisions the $V_i \setminus g_j$ request is $g_k$, where the rest capacity satisfies $|V_i| - |g_i| < |\hat{g}_k| < |g_i|$. The total communication demand will be $2(|V_i| - |g_i|)$. Since we suppose MCD has the minimum communication demand, where $|g_j| < |V_i| - |g_i|$, we have $|V_i| - |g_j| > |g_i|$. Since $|\hat{g}_k| < |g_i|$, we have $|\hat{g}_k| < |g_i| < |V_i| - |g_j|$. The MCD will be at least $2|g_j| + 2|g_k|$; however, $|V_i| - |g_i| < |\hat{g}_k|$, so we have $2|g_j| + 2|g_k| > 2|g_j| + 2(|V_i| - |g_i|)$, which contradicts the assumption. For the remaining two cases, (iii) ($\min\{|V_i| - |g_i|, |g_i|\} = |g_i|$ under MFIP and $\min\{|V_i| - |g_j|, |g_j|\} = |g_j|$) and (iv) ($\min\{|V_i| - |g_i|, |g_i|\} = |g_i|$ under MFIP and $\min\{|V_i| - |g_j|, |g_j|\} = |V_i| - |g_j|$), the proof process is the same as in (i) and (ii). In summary, the total communication demand of virtual clusters $V$ with MFIP is minimum under the single constraint $\hat{c}_i \le c_i$.

C. Stage 2: Online Dynamic Updating Strategy
In this subsection, we propose an online dynamic updating strategy based on deep reinforcement learning. The main idea is to handle the dynamic updating requests from multiple tenants inside each time slot after dividing continuous time into equal slices. This process identifies the bottleneck of the DCN for each time slot in the context of the fast initial provisioning $X_f$ and chooses VMs to readjust based on an aggressive upgrading strategy for the objective selection. Note that when a request is produced at a certain time slot, it is necessary to verify beforehand whether the tenant is already located in the data center. If the tenant is already in the data center, the request goes directly to the second stage of dynamic online updating; if not, it must go through the first stage of fast initial provisioning. We define the bottleneck as follows.
Definition 2 (bottleneck): The bottleneck $b^*$ is a vector representing the location of the physical machine or link with the minimum elasticity of $G$.
The core component of the agent is a policy that provides the probability distribution over the action space $a$ and the state space $s$.
1) Deep Reinforcement Learning Formulation: In order to describe the environment of the DCN concisely and correctly for the agent, the state space should include the knowledge of the usage of the physical resources, the status of requests from multiple tenants, and the information of the updating virtual cluster. So the state is designed as follows.
Definition 3 (state): The state $s_t$ is a vector $s_t = [h_1, h_2, \ldots, h_m]_t$, where $h_i$ denotes the provisioning list on server $c_i$ at time slot $t$. The provisioning list $h_i$ records the number of virtual clusters placed on this server, where $h_i = [v_1, v_2, v_i, \ldots, v_k]$.
We consider realizing the dynamic updating by training the agent, which needs to choose a destination physical machine for each adjusting VM. The action $a_t$ is designed as follows.
Definition 4 (action): The action $a_t = [C_1, C_2, \ldots, C_m]_t$ is the updating action, where $C_i \in \{0, 1\}$ indicates whether the target location of the adjustment is on $C_i$ at time slot $t$.
The objective of the agent is to find a provisioning scheme for multiple tenants that maximizes the elasticity of the DCN. For each episode, we decide the provisioning of VMs for the tenants by choosing an action. After that, the agent gets a reward $r(s_t, a_t)$ at time slot $t$ with state $s_t$ after executing action $a_t$. In our problem, the value of this reward cannot determine the final elasticity until all requests of tenants are provisioned. The reason is that the VMs of a virtual cluster only communicate with VMs of their own cluster, which means that although we make the adjustment decision for VMs one by one, the final elasticity is not determined until all virtual clusters finish provisioning. Here is the specific definition.
Definition 5 (Reward): The reward $r$ is decided by the value of elasticity, which is defined in three cases:

$$
r := \begin{cases}
(E(s_{t+1}) - \bar{E})/(\bar{E} + \xi), & E(s_{t+1}) > E(s_t) \\
E(s_{t+1}) - E(s_t), & E(s_{t+1}) = E(s_t) \\
-1, & E(s_{t+1}) < E(s_t)
\end{cases}
\qquad (7)
$$

Algorithm 2: Reward Updating Mechanism.
Input: Variable $num$;
Output: Reward $r(s_t, a_t)$ and termination variable $\Gamma$;
1: if $E(s_{t+1}) \le E(s_t)$ then
2:     $num := num + 1$;
3: if $E(s_{t+1}) > E(s_t)$ then
4:     $r := (E(s_{t+1}) - \bar{E})/(\bar{E} + \xi)$;
5: else if $E(s_{t+1}) = E(s_t)$ then
6:     $r := E(s_{t+1}) - E(s_t)$;
7: else
8:     $r := -1$;
9: if $num \ge \Psi$ then
10:    $\Gamma := true$;
11: else
12:    $\Gamma := false$;
13: return Reward $r(s_t, a_t)$ and termination variable $\Gamma$;

The updating mechanism of the reward is shown in Algorithm 2. We initialize a variable $num$ to record the number of times that the elasticity decreases after adjusting the VMs in one episode. The output of this function is the reward after choosing action $a_t$ and the value of the termination variable $\Gamma$. In line 1, we first compare the value of elasticity under the states $s_t$ and $s_{t+1}$. If $E(s_{t+1}) \le E(s_t)$, which means that the elasticity does not increase after choosing action $a_t$, we record it using $num := num + 1$. After that, we define the reward updating mechanism under different cases. In lines 3 to 4, when there is an increment, $E(s_{t+1}) > E(s_t)$, the reward is defined as $r := (E(s_{t+1}) - \bar{E})/(\bar{E} + \xi)$, where $\bar{E}$ is the baseline elasticity after the fast provisioning $X_f$ and $\xi$ is a factor that keeps the denominator from being zero, where $0 < \xi \le 1$. In lines 5 to 6, if $E(s_{t+1}) = E(s_t)$, the reward is defined as $r := E(s_{t+1}) - E(s_t) = 0$. Otherwise, the reward is defined as $r := -1$ under the $E(s_{t+1}) < E(s_t)$ case in lines 7 to 8. Since the reward of one action cannot determine the final result, the reward value $r := -1$ does not mean that the whole provisioning order is bad. However, if the bad cases continue to happen, which means that the agent always chooses the action with $r := -1$, this episode will be terminated when $num \ge \Psi$. Here, we use $\Psi$ to denote a threshold that is determined by the structure of the DCN, which is less than or equal to the number of physical machines, i.e., $\Psi \le |C|$. Line 13 returns the reward $r(s_t, a_t)$ and the termination variable $\Gamma$.

2) Construction of Feasible Actions Set: In order to reduce the high dimensions caused by the large scales of tenants and the DCNs, we introduce a definition of the feasible action set.
Definition 6 (Feasible Actions Set): Let $\hat{A}$ indicate the feasible action set of $V$, which only includes the target VMs that have a positive effect on optimizing the combinational elasticity of $G$.
Based on that, we further propose a method to construct a feasible action set $\hat{A}$. The main idea is to remove the invalid optional targets which exist under the bottleneck to improve the learning speed of the agent. The detailed description is shown in Algorithm 3.

Algorithm 3: Feasible Actions Set Construction Function.
Input: Fast initial provisioning $X_f$;
Output: Feasible actions set $\hat{A}$;
1: Calculate $E$ under the fast initial provisioning $X_f$;
2: Find the location of the bottleneck with minimum elasticity;
3: if $E = E_m^{C_i}$ then
4:     Remove the bottleneck $C_i$ from set $\hat{A} = \{A \setminus C_i\}$;
5: else
6:     Remove the physical machines in set $C(L_{ij})$ under the bottleneck link $L_{ij}$ from set $\hat{A} = \{A \setminus C(L_{ij})\}$;
7: return Feasible Actions Set $\hat{A}$;

In line 1, we first calculate the elasticity $E$ under the fast initial provisioning $X_f$, and we find the location of the bottleneck. If the bottleneck is located on the physical machine, $E = E_m^{C_i}$, where $E_m^{C_i}$ is the elasticity of $C_i$, i.e., $E_m^{C_i} = 1 - \hat{c}_i / c_i$. Then, we remove the optional target $C_i$ from set $A$, where $\hat{A} = \{A \setminus C_i\}$. If the bottleneck is located on the physical link, $E = E_l^{L_{ij}}$, where $E_l^{L_{ij}}$ is the elasticity of $L_{ij}$, i.e., $E_l^{L_{ij}} = 1 - \hat{l}_{ij} / l_{ij}$. Then, we remove set $C(L_{ij})$ under the bottleneck link $L_{ij}$ from
set $\hat{A} = \{A \setminus C(L_{ij})\}$. Here, we use $C(L_{ij})$ to denote the set of physical machines under the physical link $L_{ij}$. Line 7 returns the feasible action set $\hat{A}$.

3) Aggressive Objective Selection: In a given episode, the agent chooses an action from the set $\hat{A}$ that is detailed in Algorithm 3. This action only determines the destination to which a VM can be adjusted; which VM is selected to be adjusted cannot be determined by the action $a_t$. Here, we design an aggressive adjusted objective selection algorithm to decide which VM is adjusted based on the current policy. Then the environment returns a reward $r_t$ to the agent and transits to $s_{t+1}$. In Algorithm 4, the input is the state $s$ under the fast initial provisioning $X_f$ and the output is the VM $v_k$ that is selected to be adjusted. In lines 1 to 2, we calculate the elasticity $E$ under the fast provisioning $X_f$ and find the bottleneck, the same as in Algorithm 3. If the bottleneck is located on $C_i$, where $E = E_m^{C_i}$, we select VM $v_k$ with the maximum communication demand on $C_i$ to adjust in line 3. Otherwise, if the bottleneck is located on $L_{ij}$, where $E = E_l^{L_{ij}}$, we choose the physical machine $C_w$ under the bottleneck link $L_{ij}$ with the minimum elasticity $E_m^{C_w}$ in line 5. Based on that, we select VM $v_k$ with the maximum communication demand on $C_w$ to adjust. Line 7 returns the VM $v_k$ that needs to be adjusted.

Algorithm 4: Aggressive Objective Selection Algorithm.
Input: State $s$ under the fast initial provisioning $X_f$;
Output: VM that needs to be adjusted $v_{k(h)}$;
1: Same as Algorithm 3 in lines 1 to 2;
2: if $E = E_m^{C_i}$ then
3:     Select VM $v_k$ with the maximum communication demand on $C_i$ to adjust;
4: else
5:     Choose the physical machine $C_w$ in set $C(L_{ij})$ under bottleneck link $E_l^{L_{ij}}$ with minimum elasticity $E_m^{C_w}$;
6:     Select VM $v_{k(h)}$ with the maximum communication demand on $C_w$ to adjust;
7: return VM that needs to be adjusted $v_{k(h)}$;

D. Dynamic Updating Based on Deep Reinforcement Learning (DU-DRL)
The overview of the dynamic updating strategy based on deep reinforcement learning (DU-DRL) is shown in the right part of Fig. 3. Algorithm 5 summarizes the specific steps. The main idea is to use a deep reinforcement learning agent to perform the dynamic adjustment of the VMs of virtual clusters to maximize the elasticity of the DCN in each time slot. Before we conduct dynamic updates, we collect the status of multi-tenant virtual clusters $V$ to construct the set of update requests $V'$. We prioritize processing for tenants requesting resource release, $V^-$. Then, we reinitialize the resources of the cloud-based DCN and update $V' = \{V' - V^-\}$, which serves as the input of Algorithm 5. We first initialize some preliminary parameters, which include setting the replay memory $D$ to capacity $N$ and the episode termination variable $\Gamma$ to $false$. Meanwhile, we initialize the action-value function $Q$ with random weights $\theta$ and the target action-value function $\hat{Q}$ with weights $\theta^- = \theta$. In lines 2 to 4, we start to train the agent by running $\kappa$ episodes with our environment. During each episode, we initialize sequence $s$ based on the fast initial provisioning $X_f$ and preprocess it with $\phi_1 = \phi(s_{X_f})$ in lines 3 to 4. The training process spans lines 5 to 19. The adjustment starts by choosing a physical machine from the feasible actions set $\hat{A}$ that is produced by Algorithm 3. In line 7, the agent selects a random action $a_t \in \hat{A}$ with probability $\varepsilon$; otherwise, it selects $a_t = \arg\max_a Q(\phi(s_t), a; \theta)$ with the maximum $Q$ value in line 8. Since there is a queue of VMs from different virtual clusters provisioned on the chosen physical machine, the agent needs to choose the adjusted objective based on Algorithm 4 in line 9. At each time step, only one of the VMs in this queue is adjusted. Then the agent executes action $a_t$ in the emulator and updates $r$ and $\Gamma$ based on Algorithm 2. We set $s_{t+1} = s_t, a_t, x_{t+1}$, preprocess $\phi_{t+1} = \phi(s_{t+1})$, and store the transition $(\phi_t, a_t, r_t, \phi_{t+1})$ in the replay memory $D$ in line 12. After that, we sample a random minibatch of transitions $(\phi_j, a_j, r_j, \phi_{j+1})$ from $D$. In lines 14 to 17, the agent computes the target value $y_j$, which equals $r_j$ if the episode terminates at step $j + 1$ and $r_j + \gamma \max_{a'} \hat{Q}(\phi_{j+1}, a'; \theta^-)$ otherwise. The objective of our problem is to maximize the elasticity of the DCN, which is consistent with the cumulative reward received by the agent. In lines 18 to 19, the agent performs a gradient

Algorithm 5: Dynamic Updating based on Deep Reinforcement Learning (DU-DRL).
Input: Set of updating requests $V'$;
Output: Provisioning scheme $X$;
1: Initialize $D$ to $N$, $\Gamma$ to $false$, $Q$ with random weights $\theta$, and $\hat{Q}$ with weights $\theta^- := \theta$;
2: for episode from 1 to $\kappa$ do
3:     Initialize sequence $s$ based on $X_f$ of Algorithm 1;
4:     Preprocess sequence $\phi_1 = \phi(s_{X_f})$;
5:     while $\Gamma = false$ do
6:         Build feasible actions set $\hat{A}$ based on Algorithm 3;
7:         With probability $\varepsilon$ select a random action $a_t \in \hat{A}$;
8:         Otherwise select $a_t = \arg\max_a Q(\phi(s_t), a; \theta)$;
9:         Choose the adjusted objective based on Algorithm 4;
10:        Execute action $a_t$ in emulator and update $r$ and $\Gamma$ based on Algorithm 2;
11:        Set $s_{t+1} = s_t, a_t, x_{t+1}$ and preprocess $\phi_{t+1} = \phi(s_{t+1})$;
12:        Store transition $(\phi_t, a_t, r_t, \phi_{t+1})$ in $D$;
13:        Sample random minibatch of transitions $(\phi_j, a_j, r_j, \phi_{j+1})$ from $D$;
14:        if episode terminates at step $j + 1$ then
15:            Set $y_j = r_j$;
16:        else
17:            Set $y_j = r_j + \gamma \max_{a'} \hat{Q}(\phi_{j+1}, a'; \theta^-)$;
18:        Perform a gradient descent step on $(y_j - Q(\phi_j, a_j; \theta))^2$ with respect to the parameters $\theta$;
19:        Every $C$ steps reset $\hat{Q} = Q$;
20: return Provisioning scheme $X$;
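The arithmetic pieces of Algorithms 2, 3, and 5 can be sketched in plain Python as follows. This is a minimal sketch under assumptions, not the authors' code: the function names, the list-based Q-value representation, the default values of `xi` and `gamma`, and the choice to restrict the greedy branch to the feasible set as well (Algorithm 5 states the restriction only for the random branch) are all hypothetical. No neural network is trained here; `q_values` stands in for the output of $Q(\phi(s_t), \cdot; \theta)$.

```python
import random


def reward(e_next, e_curr, e_base, xi=0.5):
    """Reward of Eq. (7), from the elasticity before and after an action.

    e_base is the baseline elasticity after fast provisioning; xi (0 < xi <= 1)
    keeps the denominator away from zero.
    """
    if e_next > e_curr:
        return (e_next - e_base) / (e_base + xi)
    if e_next == e_curr:
        return e_next - e_curr  # equals 0
    return -1.0


def update_termination(num, e_next, e_curr, psi):
    """Lines 1-2 and 9-12 of Algorithm 2: count non-improving steps."""
    if e_next <= e_curr:
        num += 1
    return num, num >= psi


def feasible_actions(all_machines, bottleneck_machines):
    """Algorithm 3 in set form: drop the machines tied to the bottleneck."""
    blocked = set(bottleneck_machines)
    return [m for m in all_machines if m not in blocked]


def select_action(q_values, feasible, epsilon, rng=random):
    """Epsilon-greedy choice (lines 7-8 of Algorithm 5) over the feasible set.

    Assumption: the greedy branch is also limited to feasible actions.
    """
    feasible = list(feasible)
    if rng.random() < epsilon:
        return rng.choice(feasible)
    return max(feasible, key=lambda a: q_values[a])


def td_target(r, next_q_values, terminal, gamma=0.9):
    """Target y_j of lines 14-17: r_j, or r_j + gamma * max_a' Q_hat."""
    return r if terminal else r + gamma * max(next_q_values)


if __name__ == "__main__":
    actions = feasible_actions([0, 1, 2, 3], bottleneck_machines=[1])
    a = select_action([0.1, 0.9, 0.5, 0.2], actions, epsilon=0.1)
    print(actions, a, td_target(reward(0.6, 0.5, 0.4), [0.2, 0.5], False))
```

Removing the bottleneck machine before the arg max is what shrinks the action space: machine 1 has the highest Q-value in the usage example but is never selected because it is not in the feasible set.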
Since the physical resources are comparatively adequate when the scale of the data center is large (k = 9), different provisioning strategies do not have a significant impact on elasticity, and the advantage of MFIP in elasticity is not particularly obvious when the request size is modest (60, 70). However, limited multi-dimensional resources (k = 6) might result in significant differences in elasticity across the various solutions as the number of user requests increases (60 to 90).
Based on the results of the first stage, we evaluate the elasticity value of the second stage as follows. We deploy the algorithms (DQN, DU-EDP, DU-GP, DQN-S1, DUES) on tree topologies with 6 to 9 layers for each group of datasets and calculate the elasticities within 300 iterations. Among them, DU-RP, DU-EDP, and DU-GP are based on the first-stage Random, EDP, and GP methods, respectively. We conducted experiments on the elasticity values of data centers with different topologies (k = 6, k = 7, k = 8, and k = 9). The experiment results are shown in Figs. 5 to 8, and we have the following observations: (i) For the same group of users, the elasticity values obtained by different algorithms are quite different. Fig. 5(a) shows the elasticities under the five algorithms for 60 tenants with the 6-layer DCN. The first three columns show the elasticities under DQN, DU-EDP, and DU-GP, which are all negative. This means that these three algorithms do not give an appropriate solution for the virtual clusters of these 60 tenants. (ii) The choice of strategy used for fast provisioning has an important influence on the final elasticity value. As shown in Fig. 5(a), DU-EDP and DU-GP are the strategies that add EDP and GP fast provisioning on top of DQN. The final elasticity values under these two strategies are lower than under plain DQN. However, the elasticity under the DUES strategy is the highest. Therefore, the choice of strategy used for fast provisioning is very important for the elasticity of the DCNs. For example, the strategy DU-RP does not perform well in the first stage when k = 6, but as k increases, the elasticity values show better results, and the trend in the second stage mirrors that of the first stage and relies on its results. In addition, the DU-GP strategy could cause some crucial resources (such as certain computing or communication resources) to become scarce in the first phase. Such resource shortages could intensify in the second phase and lead to lower elasticity values. Based on the dependence between the second stage and the first stage, we can clearly see that the performance of the second stage is closely related to the resource allocation strategy of the first stage. Efficient fast provisioning strategies for multiple tenants greatly improve the final result; conversely, inefficient ones degrade it. (iii) The algorithm DUES has the highest elasticity values among different topological configurations in the second stage. Among the comparison algorithms, DU-EDP performs poorly when the topology is k = 6 and k = 7, but its performance improves when the topology is k = 8 and k = 9, as shown in Figs. 7 and 8. This could be because the DU-EDP strategy can better balance resource allocation in larger topologies, thereby avoiding resource bottlenecks that occur in smaller topologies. The algorithm DU-GP performs consistently poorly because it prioritizes allocations based only on the elasticity values of the computational resources in each decision process, without considering bottlenecks and tradeoffs. The performance of the DU-RP algorithm varies significantly due to the randomness of the algorithm, leading to inconsistent results. DU-DRL is effective at improving the elasticity, especially when the occupancy of physical resources is not close to saturation. Comparing sub-figures (a) and (d) of Fig. 5, the difference between the last two columns is much larger for 60 tenants than for 90. The average resource utilization of the DCNs with 60 and 90 tenants is nearly 80% and 98%, respectively, under DUES. Thus, when the number of tenants is large but has not reached saturation, the impacts
of algorithms on the elasticities are higher. (iv) The trends differ across tenant groups under the same physical topology. Comparing sub-figures (a) to (d) of Figs. 5 to 8, we can see that a larger number of tenants in one group leads to lower elasticity for the same physical topology. The reason is that more virtual clusters demand more physical resources, which increases the combinational utilization of the clusters and thus lowers the elasticity of the data centers. Thus, a good provisioning scheme can support more virtual clusters of tenants in the larger data center network. In summary, compared with DQN, DU-EDP, DU-GP, and DQN-S1, DUES has better performance in elasticity across the virtual cluster provisioning in multi-tenant DCNs. The average elasticity improves 1.91 times compared with DQN-S1 over the tenant range [60, 90].
C. Experiment Results Under Different DCNs
Based on the experiment results of baseline algorithms under the 6-layer DCN, we conduct the experiments of DUES to compare the elasticities under different topologies. For each group of users, we collect five groups of results under the same settings.
1) Convergence: We investigate the convergence for the groups of tenants (60, 70, 80, and 90) under physical topologies with different layers (7-layer, 8-layer, and 9-layer). The results are shown in Figs. 9 to 11. The gray parts are the ranges of the collected results, and the bright lines with different colors are the mean values. Additionally, we have the following observations: (i) The increasing number of tenants has an influence on the convergence. As shown in Fig. 9, the elasticity of the 7-layer data center increases with the number of iterations. For each group of tenants, the elasticity begins to converge between 50 and 100 iterations and remains at a high level at around 300 iterations. The growth rate is slow when the number of users is small, i.e., sub-figures (a) and (b) of Fig. 9, while it is relatively fast when the number of users grows. The reason is that the higher number of total virtual clusters provisioned in the data center greatly decreases the remaining available physical resources, which reduces the search space. (ii) The elasticity fluctuates within a relatively fixed range. There are many different placement results in the learning process of DUES, and the elasticities generated by these results fluctuate among several relatively fixed values during convergence. As shown in sub-figures (a), (b), and (c) of Fig. 9, the fluctuation of elasticities is within the range between 0.1 and 0.2. By comparison, for the 8-layer and 9-layer topologies, the
fluctuation of elasticities is within 0.1 and 0.12, respectively. This range is correlated with the topology of the DCN and the provisioning deviation of a few individual VMs.
2) Elasticity: According to the convergence of the elasticities obtained under the topologies with different layers, we assess the average of the highest elasticities among the five groups of results with (λ = 0.5, ϕ = 0.5), which are shown in Fig. 12. Additionally, we have the following observations: (i) The final elasticities decrease with the increasing number of tenants under the same topology. As shown in sub-figure (a)
of Fig. 12, the elasticity changes at a similar rate as the number of tenants increases, which is not the case in the other three groups. For example, in the 7-layer topology group, the gap in elasticity between 70 and 80 tenants is not large; however, when the number of tenants increases to 90, the elasticity decreases sharply. Since the sizes of the requested virtual clusters range in [10, 20], the total amount of resources requested by a group with fewer tenants may be close to that of a group with more tenants, which leads to a small gap between these groups. (ii) With the scaling of the topologies of the DCNs, the impacts of algorithms on the elasticities are higher. However, when the resources are sufficient to a certain level, the effect of the updating strategy on elasticity may be small, as shown in sub-figure (d) for the 9-layer topology. Based on the experiment results of benchmarks under the 6-layer DCN, the elasticity depends on the localities of VMs requested by different virtual clusters, which means a provisioning scheme can support more virtual clusters in a larger DCN. Comparing the same column across sub-figures (a) to (d) of Fig. 12, the elasticities increase with the scaling topologies of the DCNs, which means more available resources can be supported by the providers. Additionally, we estimate the descent rates of several DCNs and compare the average elasticities; the results are shown in Fig. 13. We see that the fluctuations in elasticities decrease sharply when the scales of the topologies are 6 and 7 layers. However, the trend is flat when the scales of the topologies increase to 8 and 9 layers. Thus, we find that the fluctuation rates are inversely related to the scaling of the topologies. In summary, DUES in multi-tenant DCNs shows better performance in terms of elasticity at different topologies.
3) Coefficient: To further analyze the impact of the coefficients on the combinational elasticity, we experimented on two distinct scenarios to confirm the efficacy of introducing coefficients, and we used λ and ϕ to represent the coefficients of physical machines and links, respectively. The setting of coefficients is shown in Table IV. We first evaluate the scenario where only compute resources are increased and users in the data center request an increase of 1 virtual machine. Since the scaled virtual machines belong to different virtual clusters, the communication between them is 0. Here, we assume that the current number of tenants in the data center is 90, which can be represented as (90, 1). The elasticity clearly differs across coefficients, and the result is shown in Fig. 14(a). Among them, the elasticity value is highest under the coefficient (0.4, 0.6), which indicates that the expansion of this group of queries is focused on physical machines and the subsequent scalability advantage is in communication resources. We then evaluate the scenario where only one virtual cluster has a scaling requirement. Since only the internal resources within one cluster have grown, the communication resources account for the main part, scaling in [45, 90] Gbps based on the function f(V_k, C_i). Here, we assume that the current cluster resource growth requirement is 90, which can be represented as (1, 90). The result is shown in Fig. 14(b). Among them, the elasticity value is the highest under the coefficient (0.3, 0.7), which indicates that the expansion of this group of queries is focused on physical links and the subsequent scalability advantage is in computing resources. Based on the above results, the values of the coefficients can more accurately reflect the various requirements for different resource categories, and introducing coefficients to weigh the importance of E_m and E_l optimizes the allocation.
TABLE IV
COEFFICIENT SETTING
VI. CONCLUSION
In this paper, we address the virtual cluster provisioning problem in multi-tenant cloud data centers. We use elasticity to measure the potential growth of multiple tenants in terms of computing and communication resources. We aim to maximize the elasticity by designing a two-stage framework, DUES. In the first stage, we propose a fast initial provisioning scheme, MFIP, to realize a rapid response to multiple tenants, and we prove that MFIP is optimal under the single computation resource constraint. In the second stage, we propose a dynamic updating strategy, DU-DRL, based on deep reinforcement learning to further improve the elasticity of virtual clusters that are in use for scaling. Additionally, to avoid the high dimensions caused by the large scales of tenants and the DCN, we propose to train a fully connected neural network by designing a new feasible action set to realize the reduction, and it approximates the policy based on the proposed aggressive objective selection method in DU-DRL. Finally, we conduct extensive evaluations under various scenarios to demonstrate that our scheme outperforms existing state-of-the-art methods in terms of both elasticity and efficiency.
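The evaluation in Section V-C relies on two small computations: summarizing the five repeated runs (the mean curve, the range drawn in gray, and the average of the highest elasticities), and weighing machine and link elasticity with λ and ϕ. The following is a minimal Python sketch of both, not the authors' implementation. It assumes the combined elasticity is the linear form λ·E_m + ϕ·E_l with λ + ϕ = 1, which is an assumption made here for illustration; the function names `combined_elasticity` and `summarize_runs` and all numeric inputs are hypothetical. Only the coefficient pairs (0.5, 0.5), (0.4, 0.6), and (0.3, 0.7) are taken from the text.

```python
from statistics import mean


def combined_elasticity(e_m, e_l, lam, phi):
    """Weigh machine elasticity e_m and link elasticity e_l.

    Assumption: a linear form lam * e_m + phi * e_l with lam + phi = 1;
    the paper's exact definition may differ.
    """
    if abs(lam + phi - 1.0) > 1e-9:
        raise ValueError("this sketch assumes lam + phi == 1")
    return lam * e_m + phi * e_l


def summarize_runs(runs):
    """Summarize repeated runs of per-iteration elasticity values.

    Returns the per-iteration mean curve, the per-iteration (min, max)
    range, and the average of each run's highest elasticity.
    """
    if not runs or len({len(r) for r in runs}) != 1:
        raise ValueError("runs must be non-empty and of equal length")
    per_iteration = list(zip(*runs))
    mean_curve = [mean(values) for values in per_iteration]
    value_range = [(min(values), max(values)) for values in per_iteration]
    avg_best = mean(max(run) for run in runs)
    return mean_curve, value_range, avg_best


if __name__ == "__main__":
    # Hypothetical E_m and E_l for one fixed placement (not from the paper).
    e_m, e_l = 0.5, 0.25
    for lam, phi in [(0.5, 0.5), (0.4, 0.6), (0.3, 0.7)]:
        print((lam, phi), combined_elasticity(e_m, e_l, lam, phi))

    # Two hypothetical runs of three iterations each.
    runs = [[0.25, 0.5, 0.75], [0.5, 0.5, 0.25]]
    print(summarize_runs(runs))
```

For a fixed placement, a linear weighting changes monotonically with λ, so a peak at an intermediate coefficient pair such as (0.4, 0.6) would have to come from the placement itself changing with the coefficients, which this sketch does not model.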
Jiamei Shi received the B.Sc. degree in computer science and technology from Shanghai Maritime University, Shanghai, China. She is currently working toward the M.Sc. degree in computer science with the Faculty of Information Technology, Beijing University of Technology, Beijing, China. Her research interests include cloud computing and edge computing.
Jiayue Zhang received the Ph.D. degree in signal and information processing from the Beijing University of Posts and Telecommunications, Beijing, China, in 2018. She is currently a Lecturer with the Faculty of Information Technology, Beijing University of Technology, Beijing. She was supported by the China Scholarship Council as a Visiting Scholar supervised by Prof. Jimmy Huang with the Department of Information Technology, York University, Toronto, ON, Canada from 2012 to 2014. She is a Member of the China Computer Federation. Her research interests include machine learning and data mining.