Week-8 Lecture Notes
70 mins
Resource Allocation in Private & Public Edge-Cloud Systems
Dr. Rajiv Misra, Professor
Dept. of Computer Science & Engineering
Indian Institute of Technology Patna
[email protected]
Collaborative Edge-Cloud Computing
• Edge nodes usually do not have enough storage and computing resources to
process massive data. Collaborative edge-cloud (CEC) computing therefore
provides users with computing services by requiring the cloud and edge nodes
to cooperate with each other.
• In CEC scenarios, an edge node with limited local resources can rent additional
resources from the cloud node and pay the corresponding cost to meet users'
demands.
• According to the nature of the cloud service, cloud services can be divided into
private clouds and public clouds.
Collaborative Edge-Cloud Computing
• In the private cloud, when users' computing demands arrive at the edge node at
random, the edge node needs to decide how to reasonably allocate resources
between the cloud and edge nodes to satisfy those demands.
• Each edge node has its own VMs to process users' demands. However, because
the number of VMs requested by a user may exceed the edge node's capacity,
the edge node needs to rent VMs from the private cloud node to scale up its
capacity.
• The private cloud charges the edge node according to its physical computing
cost, so the edge node needs to dynamically allocate resources at each time slot
according to its resource allocation policy.
Collaborative Edge-Cloud Computing
• In the public cloud, service providers offer various pricing modes for cloud
service, so the edge node also needs to select the appropriate pricing mode of
the cloud service for collaborative computing according to users' demands.
Collaborative Edge-Cloud Computing
• Cloud service providers such as Amazon and Microsoft offer three different pricing modes, each of which
has a different cost structure:
On-Demand Instance: this pricing mode allows users to pay at the fixed time granularity set by the
platform, and the price is fixed over a long period of time.
Reserved Instance: this pricing mode requires the user to submit the reservation time in advance and
pay the corresponding reservation fee to have the instance within the contract time. It is applicable to
users with a large number of computing demands, and the unit price is usually approximately 50% lower
than that of on-demand instances. The Reserved Instance has an option for customization with an
additional cost associated with it, called the upfront cost.
Spot Instance: the instance in this mode is obtained by bidding, and its price changes between time
slots; it is usually used for short-term computing demands.
Collaborative Edge-Cloud Computing
• Because the computing cost of edge nodes changes dynamically according to their
workload, if CEC computing provides service without strategic resource allocation,
the computing resources of edge nodes with a relatively low computing cost
cannot be used reasonably, and the overall computing cost will increase.
• Therefore, reducing the long-term operation cost while meeting the dynamic
computing demands of users is a key problem in CEC computing.
• The goal is to achieve the lowest operation cost under different demand strengths
and computing time durations.
• In the following section, we formally define the resource allocation problem in
both the private and public cloud scenarios.
Modeling Resource Allocation Problem in CEC
• Assume that in each time slot t, the user submits a demand for resources (VMs). The user demand can be
represented as:
D_t = (d_t, l_t)
• where d_t represents the number of VMs requested and l_t represents the duration of the service request
(in time slots).
• The total computing resources (VMs) owned by the edge node are represented by E.
• As resources are allocated to users, we use e_t to represent the number of remaining VMs of the
edge node in time slot t.
• The number of VMs provided by the edge node is expressed as d_t^e.
• It should be noted that if the edge node has no available resources, it will hand over all
the arriving computing tasks to the cloud service for processing.
Modeling Resource Allocation Problem in CEC
When the resource allocation can be successfully performed on the edge node, each
demand processed by the edge node generates an allocation record:
h_t = (d_t^e, l_t)
where d_t^e is the number of VMs provided by the edge node in this allocation request and l_t is
the remaining computing time of this demand.
When a new demand arrives and resource allocation is completed, an allocation record is
generated and added to an allocation record list:
H = < h_1, h_2, ..., h_t >
The record list H is maintained to keep track of the completion of users' demands.
Modeling Resource Allocation Problem in CEC
After a user's demand is fulfilled, the edge node needs to release the corresponding VMs
according to its record and delete the allocation record from the list H.
The number of VMs that have completed their computing task and are waiting to be released at the
end of time slot t is defined as η_t.
Then, the number of remaining VMs of the edge node at time slot t + 1 is the following:
e_{t+1} = e_t + η_t
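The bookkeeping above (decrement remaining times, release finished VMs, update e_{t+1}) can be sketched as a small helper. This is an illustrative sketch; the function and variable names are this example's own, not from the notes:

```python
def end_of_slot_release(H, e_t):
    """Decrement the remaining computing time of every allocation record
    (d_e, l) in H; VMs whose demand has completed (eta_t of them) are
    released back to the edge node, giving e_{t+1} = e_t + eta_t."""
    eta_t = 0
    next_H = []
    for d_e, l in H:
        l -= 1                     # one time slot of computing has elapsed
        if l <= 0:
            eta_t += d_e           # demand fulfilled: release its VMs
        else:
            next_H.append((d_e, l))
    return next_H, e_t + eta_t

# For instance, with record list <(18, 1), (3, 1)> and 59 remaining VMs:
H, e_next = end_of_slot_release([(18, 1), (3, 1)], 59)
print(H, e_next)   # [] 80
```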
Cost of Collaborative Cloud-Edge Computing in Private Cloud
Note that in order to quickly respond to users' computing demands, even if no computing
demand arrives, the machines still incur a standby cost.
Therefore, the cost of edge nodes consists of standby cost and computing cost.
• The standby cost of one VM in the edge node is p_e.
• The computing cost of one VM in the edge node is p_f.
Therefore, the total cost of the edge node in time slot t can be computed as:
C_t^e = e_t · p_e + (E − e_t) · p_f
        (total standby cost) + (total computing cost)
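As a quick sketch, the per-slot edge cost can be computed as follows (the function name is this example's own; the constants are those of the worked example later in the notes):

```python
def edge_cost(e_t, E, p_e, p_f):
    # C_t^e = standby cost of the e_t idle VMs
    #       + computing cost of the (E - e_t) busy VMs
    return e_t * p_e + (E - e_t) * p_f

# e.g., with E = 80 VMs total, 62 idle, p_e = 0.03 and p_f = 0.20:
print(round(edge_cost(62, 80, 0.03, 0.20), 2))   # 5.46
```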
Cost of Collaborative Cloud-Edge Computing in Public Cloud
In time slot t, the cost of collaborative cloud-edge computing in the public cloud environment includes the
computing cost of the cloud nodes and the cost of the edge node, which is the following:
C_t^pub = (cost of on-demand instance) + (cost of reserved instance) + (upfront cost of reserved instance) + C_t^e
• The goal is to achieve the lowest operation cost under different demand strengths and computing
time durations.
Resource Allocation Decision
Decision or Action (a_t) denotes the ratio in which the requested VMs are provided by the edge node and
the cloud node, i.e., it splits the demand d_t into d_t^e (VMs provided by the edge node) and d_t^c (VMs
provided by the private cloud). For public cloud scenarios, the action additionally selects the pricing mode,
a_t = (k_t, x_t^k).
Example
Suppose the users' demands over three time slots are the following:

Time-Slot (t)   Demand (d_t, l_t)
1               (30, 2)
2               (10, 1)
3               (20, 2)

where d_t represents the number of VMs requested and l_t represents the duration of the service request.
Assume that time slot 1 is the starting slot, such that no VMs have been allocated a priori.
Assume that there are a total of 80 VMs present at the edge node.

Given Constants:

Constant                                                        Value
Stand-by cost of a VM at the edge node (p_e)                    0.03
Computing cost of a VM at the edge node (p_f)                   0.20
Computing cost of a private cloud VM (p_c)                      3.00
Unit price of on-demand instance in public cloud (p_od)         3.0
Unit price of reserved instance in public cloud (p_rs)          1.5
Customization (upfront) price of reserved instance (p_upfront)  800
Unit price of spot instance in public cloud (p_t)               1.0
Example
Resource allocation using private cloud:
Suppose that we have our own private cloud, and a policy has been deployed to allocate VMs as per client demands,
which outputs the following actions at each time slot:

Time-Slot (t)   Action (x_t^k)
1               0.4
2               0.7
3               0.8

The action x_t^k ∈ [0, 1] represents the ratio of VMs allocated from the private cloud to
the total VMs requested by the client at time slot t. The remaining VMs, a fraction (1 − x_t^k), are
allocated from the edge node.
[Q] Calculate the cost of collaborative cloud-edge computing (C_t^pri) in the given private
cloud setting at each of the three time slots. Also, find the number of VMs that will be
available at the edge node at the beginning of the fourth time slot.
Solution
Let e_t represent the number of VMs available at the edge node at time slot t.
Given, e_1 = E = 80

At time slot (t = 1):
Demand: D_1 = (d_1, l_1) = (30, 2)
Action: x_1^k = 0.4
No. of VMs allocated from cloud: d_1^c = x_1^k * d_1 = 0.4 * 30 = 12
No. of VMs allocated from edge node: d_1^e = d_1 − d_1^c = 30 − 12 = 18
No. of VMs remaining at the edge node: e_1 = e_1 − d_1^e = 80 − 18 = 62
Resources can be successfully allocated from the edge node; hence, an allocation record will be generated:
Allocation record: h_1 = (d_1^e, l_1) = (18, 2)
Allocation Record List H: < h_1 > : < (18, 2) >
Cost at the edge node: C_1^e = e_1 p_e + (E − e_1) p_f = 62 * 0.03 + (80 − 62) * 0.2 = 1.86 + 3.60 = 5.46
Cost of collaborative cloud-edge computing: C_1^pri = d_1^c p_c + C_1^e = 12 * 3.0 + 5.46 = 41.46
Updated Allocation Record List H: < h_1 > : < (18, 1) >
Number of VMs waiting to be released: η_1 = 0
Number of VMs available at next time slot: e_2 = e_1 + η_1 = 62 + 0 = 62

At time slot (t = 2):
Demand: D_2 = (d_2, l_2) = (10, 1)
Action: x_2^k = 0.7
No. of VMs allocated from cloud: d_2^c = x_2^k * d_2 = 0.7 * 10 = 7
No. of VMs allocated from edge node: d_2^e = d_2 − d_2^c = 10 − 7 = 3
No. of VMs remaining at the edge node: e_2 = e_2 − d_2^e = 62 − 3 = 59
Resources can be successfully allocated from the edge node; hence, an allocation record will be generated:
Allocation record: h_2 = (d_2^e, l_2) = (3, 1)
Allocation Record List H: < h_1, h_2 > : < (18, 1), (3, 1) >
Cost at the edge node: C_2^e = e_2 p_e + (E − e_2) p_f = 59 * 0.03 + (80 − 59) * 0.2 = 1.77 + 4.2 = 5.97
Cost of collaborative cloud-edge computing: C_2^pri = d_2^c p_c + C_2^e = 7 * 3.0 + 5.97 = 26.97
Updated Allocation Record List H: < h_1, h_2 > : < (18, 0), (3, 0) >
Number of VMs waiting to be released: η_2 = 18 + 3 = 21
Number of VMs available at next time slot: e_3 = e_2 + η_2 = 59 + 21 = 80
Solution
At time slot (t = 3):
Demand: D_3 = (d_3, l_3) = (20, 2)
Action: x_3^k = 0.8
No. of VMs allocated from cloud: d_3^c = x_3^k * d_3 = 0.8 * 20 = 16
No. of VMs allocated from edge node: d_3^e = d_3 − d_3^c = 20 − 16 = 4
No. of VMs remaining at the edge node: e_3 = e_3 − d_3^e = 80 − 4 = 76
Resources can be successfully allocated from the edge node; hence, an allocation record will be generated:
Allocation record: h_3 = (d_3^e, l_3) = (4, 2)
Allocation Record List H: < h_3 > : < (4, 2) >
Cost at the edge node: C_3^e = e_3 p_e + (E − e_3) p_f = 76 * 0.03 + (80 − 76) * 0.2 = 2.28 + 0.8 = 3.08
Cost of collaborative cloud-edge computing: C_3^pri = d_3^c p_c + C_3^e = 16 * 3.0 + 3.08 = 51.08
Updated Allocation Record List H: < h_3 > : < (4, 1) >
Number of VMs waiting to be released: η_3 = 0
Number of VMs available at next time slot: e_4 = e_3 + η_3 = 76 + 0 = 76
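The three-slot computation can be cross-checked with a short simulation (a sketch; the variable names are this example's own):

```python
E, p_e, p_f, p_c = 80, 0.03, 0.20, 3.00        # constants from the example
demands = [(30, 2), (10, 1), (20, 2)]           # D_t = (d_t, l_t)
actions = [0.4, 0.7, 0.8]                       # x_t^k: ratio sent to private cloud

e, H, costs = E, [], []
for (d, l), x in zip(demands, actions):
    d_c = round(x * d)                          # VMs rented from the private cloud
    d_e = d - d_c                               # VMs served by the edge node
    e -= d_e
    H.append([d_e, l])                          # allocation record h_t = (d_t^e, l_t)
    costs.append(round(d_c * p_c + e * p_e + (E - e) * p_f, 2))
    for rec in H:                               # end of slot: one time unit elapses
        rec[1] -= 1
    e += sum(n for n, rem in H if rem <= 0)     # release finished VMs (eta_t)
    H = [rec for rec in H if rec[1] > 0]

print(costs, e)   # [41.46, 26.97, 51.08] 76
```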
Example
Resource allocation using public cloud:
Assume that we have replaced the private cloud with a public cloud setting, with a new policy that outputs the
following actions at each time slot:

Time-Slot (t)   Action (k_t, x_t^k)
1               (1, 0.4)
2               (0, 0.7)
3               (2, 0.8)

Here k_t selects the pricing mode (0 = on-demand, 1 = reserved, 2 = spot) and x_t^k is, as before, the ratio
of VMs allocated from the cloud.

Solution
At time slot (t = 1):
Cost at the edge node (from the previous example):
C_1^e = e_1 p_e + (E − e_1) p_f = 5.46
Cost at the public cloud: C_1^pub = d_1^c p_rs + C_1^e = 12 * 1.5 + 5.46 = 23.46
(consider the cost of the reserved instance only, since at time slot 1 resources are allocated
from the reserved instance only)
Solution
At time slot (t = 2):
The cost calculation for the edge node remains the same (as calculated in the private cloud case).
Cost at the edge node (from the previous example):
C_2^e = e_2 p_e + (E − e_2) p_f = 5.97
Cost at the public cloud: C_2^pub = d_2^c p_od + C_2^e = 7 * 3.0 + 5.97 = 26.97
(consider the cost of the on-demand instance only)

At time slot (t = 3):
Cost at the edge node (from the previous example):
C_3^e = e_3 p_e + (E − e_3) p_f = 3.08
Cost at the public cloud: C_3^pub = d_3^c p_t + C_3^e = 16 * 1.0 + 3.08 = 19.08
(consider the cost of the spot instance only)
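The three public-cloud costs can be checked in a few lines (a sketch; the k_t → price mapping of 0 = on-demand, 1 = reserved, 2 = spot is inferred from the action table):

```python
prices = {0: 3.0, 1: 1.5, 2: 1.0}     # p_od, p_rs, p_t from the constants table
cloud_vms = [12, 7, 16]               # d_t^c, as computed in the private-cloud example
edge_costs = [5.46, 5.97, 3.08]       # C_t^e, unchanged from the private-cloud example
modes = [1, 0, 2]                     # k_t from the policy's action table

pub_costs = [round(d * prices[k] + c, 2)
             for d, c, k in zip(cloud_vms, edge_costs, modes)]
print(pub_costs)   # [23.46, 26.97, 19.08]
```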
The total cost over T time slots, in either the private or the public cloud setting, is:
C^{pri|pub} = Σ_{t=1}^{T} C_t^{pri|pub}

We shall model the resource allocation problem as a Markov Decision Process (MDP), which converts the (cost)
minimization problem into an equivalent (reward) maximization problem.
A Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework
for modeling a decision-making process where decisions are made at each discrete timestep. MDPs are useful for
studying optimization problems solved via dynamic programming (DP) and reinforcement learning (RL).
Markov Decision Process
An MDP is a 5-tuple (S, A, P, R, ρ) where,
• S represents the set of all states
• A represents the set of actions
• P : S × A → 𝒫(S) represents the transition probability function, with 𝒫(S) being the probability
distribution over (next) states. Hence, P(s' | s, a) is the probability of transitioning to state s' from s
through action a
• R : S × A × S → ℝ represents the reward function, with r_t = R(s_t, a_t, s_{t+1}) being the reward
obtained by moving from state s_t to s_{t+1} via the action a_t
• ρ represents the initial state distribution
• Additionally, S̄ represents the set of final states

The name Markov Decision Process refers to the fact that the system obeys the
Markov property, i.e., transitions depend only on the single most recent state and
action pair, and not on any prior history.
Reinforcement Learning
In Reinforcement Learning (RL), the Environment-Agent interaction is represented by a Markov Decision
Process where the agent can observe the environment (state s) and take decisions (actions a).

Types of Policies
• Deterministic Policy: a = π(s), with π : S → A
• Stochastic Policy: a ∼ π(s), with π : S → 𝒫(A)
Reinforcement Learning
Decision Process (Trajectory)
At the beginning of the decision process (t = 1), the environment assumes one of the states from the initial state
distribution, i.e., s_1 ∼ ρ. Then, at each time step t, the agent and environment:
1. observe state s_t
2. decide action a_t ∼ π(s_t)
3. transition to a new state s_{t+1} with probability P(s_{t+1} | s_t, a_t)
4. receive a reward r_t = R(s_t, a_t, s_{t+1})
5. terminate if a final state is reached, s_{t+1} ∈ S̄
The (discounted) return of the resulting trajectory τ is:
𝒢(τ) = Σ_{t=1}^{T} γ^{t−1} r_t
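The return of a trajectory can be computed directly from its reward sequence (a sketch; the names below are this example's own):

```python
def discounted_return(rewards, gamma):
    # G(tau) = sum over t of gamma^(t-1) * r_t
    return sum(gamma ** (t - 1) * r for t, r in enumerate(rewards, start=1))

print(discounted_return([1, 6, 1], gamma=1.0))   # 8.0 (undiscounted sum)
print(discounted_return([1, 6, 1], gamma=0.5))   # 1 + 3.0 + 0.25 = 4.25
```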
Reinforcement Learning
Probability of a Trajectory 𝒯(τ | π)
• is the probability that an agent using a policy π encounters a trajectory τ,
where τ = (s_1, a_1, r_1, s_2, a_2, r_2, ..., s_T, a_T, r_T, s_{T+1})
𝒯(τ | π) = ρ(s_1) × π(a_1 | s_1) × P(s_2 | s_1, a_1) × π(a_2 | s_2) × P(s_3 | s_2, a_2) × ... × π(a_T | s_T) × P(s_{T+1} | s_T, a_T)
• Expected Return
G(π) = Σ_{∀τ} 𝒯(τ | π) 𝒢(τ) = E_{τ∼π}[𝒢(τ)]
• RL Objective: find the optimal policy π* that maximizes the expected return
π* = argmax_π G(π)
Value Functions
State Value Function – expected return starting at state s:
V^π(s) = E_{τ∼π}[𝒢(τ) | s_1 = s]
State-Action Value Function – expected return starting at state s and taking action a:
Q^π(s, a) = E_{τ∼π}[𝒢(τ) | s_1 = s, a_1 = a]
Advantage Function – how much better it is to take action a in state s compared to the other actions:
A^π(s, a) = Q^π(s, a) − V^π(s)
• Optimal Policy – a policy that always selects whichever action is best, i.e., maximizes the expected
return starting at state s.
• The relation between the Optimal Policy and the Optimal State-Action Value Function:
we can obtain the optimal action a* using Q* as follows,
a*(s) = argmax_a Q*(s, a)
Q-Value Function Example
Consider a directed graph of states A–F (with F terminal), where each edge carries a reward. Q-values are
computed backward from the terminal state: go to each predecessor node and update Q-values using
Q*(s, a) = E_{s'∼P}[ r(s, a) + γ max_{a'} Q*(s', a') ]
Remaining = Pred(F) = {D, E}:
Q(D, →F) = 1 + 0 = 1
Q(E, →F) = 1 + 0 = 1
Q-Value Function Example (continued)
[Figure: the example graph with nodes A, B, C, D, E, F; edge rewards include 4, 2, 3, 6, 1, 2]
Repeating the backward update Q*(s, a) = E_{s'∼P}[ r(s, a) + γ max_{a'} Q*(s', a') ] at each predecessor:
Remaining = Pred(D), Pred(E) = {B, C}
Remaining = Pred(B), Pred(C) = {A}
[Figure: final rewards and Q-values on the example graph; in particular Q*(A, ·) = (7, 8)]
Optimal Policy: a*(s) = argmax_a Q*(s, a)
Trajectory:
• a_1* = argmax_a Q*(A, a) = argmax(7, 8) = a_C    Reward = 1, Next State: C
• a_2* = argmax_a Q*(C, a) = argmax(7, 3) = a_D    Reward = 6, Next State: D
• a_3* = argmax_a Q*(D, a) = argmax(1, 1) = a_F    Reward = 1, Next State: F
Return = 1 + 6 + 1 = 8
Q-Learning
Q-Learning Update Rule: Q(s, a) ← (1 − α) Q(s, a) + α (r + γ max_{a'} Q(s', a'))

Q-Learning Algorithm
_____________________________________________________________
Require: Behavior Policy π_e (usually random)
         Learning Rate (0 ≤ α ≤ 1)
         Discount Factor (0 ≤ γ ≤ 1)
Initialize: Q-Table (stores Q-values for each state-action pair)
For N rounds:
    Explore: observe state s and choose action a using behavior policy π_e
    Receive reward r and transition to next state s'
    Update: Q(s, a) ← (1 − α) Q(s, a) + α (r + γ max_{a'} Q(s', a'))
_____________________________________________________________
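The algorithm above can be sketched as a tabular implementation. The toy environment, its names, and the hyperparameters below are this example's assumptions, not from the notes:

```python
import random
from collections import defaultdict

def q_learning(step, start, actions, episodes=2000, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning; step(s, a) -> (s_next, r, done) is the assumed
    environment interface, and the behavior policy is epsilon-greedy."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = start, False
        while not done:
            if random.random() < eps:                       # explore
                a = random.choice(actions)
            else:                                           # exploit current Q
                a = max(actions, key=lambda act: Q[(s, act)])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[(s2, act)] for act in actions)
            Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * target   # update rule
            s = s2
    return Q

# Toy two-step chain (hypothetical): 'right' from A reaches B (r = 0),
# 'right' from B reaches the goal (r = 1); 'stay' ends the episode with r = 0.
def step(s, a):
    if a == 'stay':
        return s, 0.0, True
    return ('B', 0.0, False) if s == 'A' else ('goal', 1.0, True)

random.seed(0)
Q = q_learning(step, 'A', ['right', 'stay'])
print(max(['right', 'stay'], key=lambda a: Q[('A', a)]))   # right
```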
Exploration v/s Exploitation in Q-Learning
Epsilon-Greedy Exploration:
The behavior policy chooses a random (exploratory) action with probability ε during the learning phase,
and the greedy action otherwise. The epsilon-greedy behaviour can be implemented as below:
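A minimal sketch of epsilon-greedy selection (the slide's original code is not reproduced here; the names below are this example's own):

```python
import random

def epsilon_greedy(Q, state, actions, eps):
    """With probability eps take a random action (explore); otherwise take
    the action with the highest Q-value in this state (exploit)."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

Q = {('s0', 'a'): 0.2, ('s0', 'b'): 0.7}
print(epsilon_greedy(Q, 's0', ['a', 'b'], eps=0.0))   # b  (eps = 0 always exploits)
```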
MDP Formulation of Resource Allocation
Coming back to the resource allocation problem, for the sake of simplicity we shall assume the private cloud scenario
only. Furthermore, we shall discretize the actions to make the action set finite. The MDP can be defined as follows,
• States: S = {(e, t, d, l)}, a discrete set of 4-tuples containing T·D·L·(E + 1) elements, where
  e ∈ [0, E] represents the number of available edge VMs
  t ∈ [1, T] represents the current time slot
  d ∈ [1, D] represents the number of VMs demanded by a user
  l ∈ [1, L] represents the duration for which a user can occupy a demanded VM
• Actions: A = [0.0, 0.1, 0.2, ..., 0.9, 1.0], a discrete set containing 11 actions, each representing the ratio of
VMs allocated from the private cloud
• Reward: r_t = R(s_t, a_t, s_{t+1}) = −C_t^pri; the reward is the negative of the cost
• Initial State: the MDP starts from the state s_1 = (E, 1, d_1, l_1), where d_1 and l_1 are sampled uniformly from
within the ranges [1, D] and [1, L], respectively.
• Final State: the final state s_T is reached after exactly T timesteps
• Transition Function: the stochasticity in our MDP comes from the random user demand. We can assume the user
demands uniformly from the set of 2-tuples
  𝒟 = {(d, l) | d ∈ [1, D], l ∈ [1, L]}.
• In this case the probability of a (next) user demand has a uniform distribution over the set 𝒟
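One transition of this MDP can be sketched as follows. This is a sketch under the definitions above; the allocation record list H is carried alongside the state for simplicity, and all names are this example's own:

```python
import random

def mdp_step(state, x, H, E, D, L, T, p_e=0.03, p_f=0.20, p_c=3.00):
    """state = (e, t, d, l); x is the ratio of the demand sent to the
    private cloud. Returns (next_state, reward, done)."""
    e, t, d, l = state
    d_c = round(x * d)                 # VMs rented from the private cloud
    d_e = d - d_c                      # VMs served by the edge node
    if d_e > e:                        # edge node full: cloud takes everything
        d_c, d_e = d, 0
    e -= d_e
    if d_e:
        H.append([d_e, l])             # allocation record h_t = (d_t^e, l_t)
    cost = d_c * p_c + e * p_e + (E - e) * p_f
    for rec in H:                      # end of slot: release finished VMs
        rec[1] -= 1
    e += sum(n for n, rem in H if rem <= 0)
    H[:] = [rec for rec in H if rec[1] > 0]
    d2 = random.randint(1, D)          # next demand uniform over the demand set
    l2 = random.randint(1, L)
    return (e, t + 1, d2, l2), -cost, t + 1 > T    # reward = -C_t^pri

# First slot of the earlier worked example: e = 80, demand (30, 2), x = 0.4
H = []
s2, r, done = mdp_step((80, 1, 30, 2), 0.4, H, E=80, D=30, L=2, T=3)
print(s2[0], round(-r, 2), done)   # 62 41.46 False
```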
MDP Formulation of Resource Allocation
Example: Assuming that
• the number of VMs at the edge is E = 19
• a user can demand a maximum of D = 4 VMs for a duration of at most L = 3 time slots
• the allocation decision has to be made over T = 15 time slots
We can have a total of |S| = T·D·L·(E + 1) = 15 * 4 * 3 * (19 + 1) = 3600 possible states of the MDP.
The total number of possible user demands is D·L = 4 * 3 = 12. Hence, there are 12 possible initial states, one for
each initial user demand. The set of possible user demands 𝒟 is given by,
𝒟 = {1, 2, 3, 4} × {1, 2, 3} = {(1, 1), (1, 2), (1, 3), (2, 1), ..., (4, 3)}
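The counts above can be verified in a couple of lines:

```python
from itertools import product

E, D, L, T = 19, 4, 3, 15
n_states = T * D * L * (E + 1)              # |S| = T * D * L * (E + 1)
demand_set = list(product(range(1, D + 1), range(1, L + 1)))
print(n_states, len(demand_set))   # 3600 12
```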
Optimal Resource Allocation using Q-Learning
The learned Q-table implements the allocation policy via an argmax over Q-values: at each time slot the agent
observes the state and takes the action with the highest Q-value, producing a trajectory of allocation actions.
The minimum cost at the private cloud over T steps is then C^pri = Σ_{t=1}^{T} C_t^pri.
DDPG for Private Cloud Resource Allocation
• The Deep Deterministic Policy Gradient (DDPG) algorithm is a
policy optimization method based on the classical Actor-Critic
style of RL algorithm.
• The Critic (value network) evaluates the Actor's performance
through a value function that guides the Actor's next action, thus
improving its convergence and performance.
• DDPG deals with continuous action spaces. Exploration is
managed by adding Gaussian noise (N) to the actions
produced by the actor (policy) network.
DDPG for Private Cloud Resource Allocation
The Critic is trained with the MSE loss between the predicted Q-value and the target Q-value:
L = ( Q(s, a) − ( r + γ Q'(s', μ'(s')) ) )²
where r is the current reward, γ is the discount factor, and Q'(s', μ'(s')), computed by the target critic and
target actor networks, is the estimate of the optimal future Q-value.
DDPG for Private Cloud Resource Allocation
For the Actor network, the loss function is the negative of the Q-value given by the critic network (a policy
gradient objective):
L_actor = −Q(s, μ(s))
The target networks are updated slowly toward the online networks (Polyak averaging), θ' ← τ θ + (1 − τ) θ',
where τ is a constant.
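The critic target and the soft (Polyak) target-network update can be sketched in isolation (a sketch, not a full DDPG implementation; the networks are stand-in callables and parameter lists):

```python
def critic_target(r, s_next, done, actor_t, critic_t, gamma=0.99):
    # y = r + gamma * Q'(s', mu'(s')); target networks supplied as callables
    return r if done else r + gamma * critic_t(s_next, actor_t(s_next))

def soft_update(target_params, online_params, tau=0.005):
    # theta' <- tau * theta + (1 - tau) * theta', applied element-wise
    return [tau * w + (1 - tau) * wt for wt, w in zip(target_params, online_params)]

# Terminal transition: the target is just the reward
print(critic_target(1.0, None, True, None, None))        # 1.0
# With tau = 0.5, the target parameter moves halfway toward the online one
print(soft_update([0.0], [1.0], tau=0.5))                # [0.5]
```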
P-DQN for Public Cloud Resource Allocation
• The basic idea of P-DQN (Parameterized Deep Q-Network) is as follows: the agent selects a discrete
action (the type of public cloud instance – on-demand, reserved or spot instance) together with a
continuous parameter associated with that action (the quantity/ratio of instances).
• The discrete action is chosen by maximizing over the parameterized Q-values, which are evaluated
while the continuous parameter (produced by an actor network) is fixed. It can be written as the
following:
k* = argmax_k Q(s, k, x_k(s))
P-DQN for Public Cloud Resource Allocation
P-DQN Training Steps
The input of the algorithm contains information about the users' requested demands D_t and the unit
cost of the spot instance in the public cloud in time slot t.
At the beginning of each iteration of the algorithm, the edge node first needs to obtain the state
(s) of the collaborative cloud-edge environment, and then pass the state as the input of the
policy network to obtain the parameter values for each discrete action.
After the edge node gets the action, it selects the appropriate public cloud instance type based
on the discrete value in the action and determines the number of public cloud instances to be
used based on the parameter value.
Then, interaction with the environment occurs to get the next state, reward, and termination flag.
After storing this round of experience in the experience replay pool, the DRL agent samples from
the experience replay pool and calculates the gradients of the value network and the policy
network. Then, it updates the parameters of the corresponding networks.
After each round of iteration, to ensure the convergence of the resource allocation policy,
training is continued (repeated) up to the maximum number of training rounds set.
Google Cluster Workload Traces Dataset (2019)
The Google Cluster Workload Traces Dataset contains the workloads running on eight
Google Borg compute clusters for the month of May 2019. The trace describes every job
submission, scheduling decision, and resource usage data for the jobs that ran in those
clusters.
It has enabled a wide range of research on advancing the state of the art for cluster
schedulers and cloud computing, and has been used to generate hundreds of analyses and
studies. The new trace allows researchers to explore these changes. The new dataset
includes additional data, including:
• CPU usage information histograms for each 5-minute period, not just a point sample;
• information about alloc sets (shared resource reservations used by jobs); and
• job-parent information for master/worker relationships such as MapReduce jobs.
Just like the last trace, these new ones focus on resource requests and usage, and contain
no information about end users, their data, or access patterns to storage systems and other
services.
The trace data is made available via Google BigQuery so that sophisticated analyses
can be performed without requiring local resources. Google provides access instructions
and a detailed description of what the traces contain.
Results: Private Cloud
The DRL-based policy is compared against the following baselines:
• EF (Edge First): gives priority to the edge node to process users' requests (greedy algorithm)
• RANDOM: randomly selects the edge node or the private cloud service
• Particle Swarm Optimization (PSO): directly optimizes the resource allocation actions for each time
slot
Results: Public Cloud
The DRL-based policy is compared against the following baselines:
E + O (Edge First + On-demand): this algorithm gives priority to the edge node to process users' requests.
When the edge node has insufficient capacity to provide services, only the on-demand instance is
used to process users' requests.
E + R (Edge First + Random): this algorithm gives priority to the edge node to process user requests. When
the capacity of the edge node is insufficient to provide computing services, the pricing mode of the cloud
service is randomly selected for collaborative computing.
R + R (Random + Random): this algorithm randomly selects the pricing mode of the cloud service and
randomly determines the quantities for allocation.
Conclusion
In this lecture, we discussed:
• Modeling the resource allocation decision-making process as a Markov Decision Process and
solving it with the Parameterized DQN and Deep Deterministic Policy Gradient algorithms.
• The experiments and results on the Google Cluster Workload Traces Dataset.

Thank You!
BELOW SLIDES WERE NOT USED
Optimization convolutions in 1-D
• The convolution process is shown as follows: a two-weight kernel (w_1, w_2) slides over the input
d_1, d_2, ..., d_k with stride 1, computing z_i = d_i × w_1 + d_{i+1} × w_2 at each step and producing
the outputs z_1, z_2, ..., z_{k−1} (hidden units h_1, h_2, ..., h_{k−1}).
Edge Computing Storage
• Provides temporary data storage
• Provides caching in order to improve the performance of information or content
delivery, e.g., multimedia content caching at edge nodes to improve user QoE
• In Connected and Automated Vehicles (CAVs), the vehicles can utilize Road-side
Units (RSUs) to store and share information collected by the vehicles
continuously over time
Edge Computing Compute
• In general, edge computing providers offer IaaS/PaaS based on either virtual
machines (VMs) or container engines (CEs), which provide a platform for
clients to deploy their software in a virtual environment hosted on edge nodes.
• On the other hand, SaaS providers can offer two types of services – on-demand
data processing (ODP) and context as a service (CaaS).
• Specifically, an ODP-based service has pre-installed methods that can process
the data sent from the client in a request/response manner.
• Whereas the CaaS-based service provides a customized data provision method
in which the edge nodes can collect and process the data to generate
meaningful information for their clients.
Edge Computing Acceleration
• Edge nodes can provide acceleration in two aspects – networking acceleration
and computing acceleration.
• Networking acceleration: edge nodes can support network acceleration
mechanisms based on network virtualization technology, which enables them to
operate multiple routing tables in parallel and to realize a software-defined
network (SDN). This means that the clients of the edge nodes can configure
customized routing paths for their applications in order to achieve optimal
network transmission speed.
• Computing acceleration: edge nodes equipped with Graphics Processing Units
(GPUs) or Field Programmable Gate Arrays (FPGAs) can provide additional
speed-up in computing. FPGA units allow users to redeploy program codes on
them in order to improve or update the functions of the host devices. GPUs
allow for extremely fast inference for large ML models by speeding up complex
computations.
Edge Computing Networking
• Involves vertical and horizontal networking. Vertical networking interconnects
users/things and the cloud with IP networks; whereas horizontal networking
can be heterogeneous in network signals and protocols, depending on the
supported hardware specification of the edge nodes.
• Vertical networking: the vertical network uses IP network-based standard protocols such
as request/response-based TCP/UDP sockets, HTTP, Internet Engineering Task Force
(IETF) Constrained Application Protocol (CoAP), or publish-subscribe-based Extensible
Messaging and Presence Protocol (XMPP), OASIS Advanced Message Queuing Protocol
(AMQP; ISO/IEC 19464), Message Queue Telemetry Transport (MQTT; ISO/IEC PRF
20922), etc.
• Edge nodes can also operate as the message broker of a publish-subscribe-based protocol
that allows IoT devices to publish data streams to the edge nodes and enables the
cloud backend to subscribe to the data streams from the edge nodes.
Edge Computing Networking
• Horizontal networking: based on various optimization requirements such as
energy efficiency or network transmission efficiency, IoT systems often
use heterogeneous, cost-efficient networking approaches. In particular, smart
homes, smart factories, and connected vehicles commonly utilize
Bluetooth, ZigBee (based on IEEE 802.15.4), and Z-Wave on the IoT devices,
connecting them to an IP network gateway toward enabling the connectivity
between the devices and the backend cloud.
• In general, the IP network gateway devices are the ideal entities to host edge
servers since they have connectivity with the IoT devices over various signals.
For example, the cloud can request that an edge server hosted in a connected
car communicate with roadside IoT equipment using ZigBee in order to
collect the environmental information needed for analyzing the real-time traffic
situation.
Edge Computing Control
• Edge nodes provide four control mechanisms – deployment, actuation,
mediation, and security.
• Deployment control allows users to perform customizable software program deployment
dynamically. Furthermore, clients can configure edge nodes to control which program the
edge node should execute and when it should execute it. Edge providers can also provide
a complete edge network topology as a service that allows clients to move their program
from one edge node to another. Additionally, clients may also control multiple edge
nodes to achieve the optimal performance for their applications.
• Actuation control represents the mechanism supported by the hardware specification
and the connectivity between the edge nodes and the connected devices. Specifically,
instead of performing direct interaction between the cloud and the devices, the cloud
can delegate certain decisions to edge nodes to directly control the behavior of
controllable devices.
Edge Computing Control
• Mediation control corresponds to the capability of the edge in terms of interacting with
external entities owned by different parties. In particular, connected vehicles
supported by different service providers can communicate with one another, though they
may not have a common protocol initially. With the softwarization feature of edge nodes,
the vehicles can receive on-demand software updates toward enhancing their
interoperability.
• Security control is the basic requirement of edge nodes that allows clients to control the
authentication, authorization, identity, and protection of the virtualized runtime
environment operated on the edge nodes.
Task Offloading Objectives
● Load Balance – In an edge or a cloud, there are multiple servers (or virtual servers) for processing offloaded tasks. An imbalance of load across these servers leads to inefficient resource utilization and high response times for some of the offloaded tasks.
● Multi-Objective – Three common ways to handle multiple objectives:
(1) transforming multiple objectives into one, e.g., by weighted adding, which requires deciding the importance (weight) of each objective;
(2) selecting one (the most important) objective for optimization and treating the others as constraints with lower/upper bounds;
(3) designing a solving method that provides various Pareto-optimal solutions of these objectives, so providers can choose according to practical conditions.
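As a rough sketch of approach (1), the weighted-adding idea can be expressed in a few lines; the candidate sites, cost numbers, and weights below are illustrative assumptions, not values from the lecture:

```python
# Sketch of approach (1): weighted-adding scalarization for an
# offloading decision. Candidate names and numbers are illustrative.

def weighted_cost(latency_ms, energy_mj, w_latency=0.7, w_energy=0.3):
    """Collapse two objectives into one scalar; the weights encode
    the provider-chosen importance of each objective."""
    return w_latency * latency_ms + w_energy * energy_mj

# (latency_ms, energy_mj) estimates for: local, edge, cloud execution
candidates = {
    "local": (120.0, 40.0),
    "edge":  (35.0, 55.0),
    "cloud": (60.0, 70.0),
}

best = min(candidates, key=lambda k: weighted_cost(*candidates[k]))
print(best)  # with these weights, "edge" has the lowest combined cost
```

Changing the weights changes the winner, which is exactly why approach (1) requires deciding the importance of each objective up front.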
Factors Affecting Task Offloading
● Data Variety: Large datasets may be more efficiently processed closer to the data source to minimize bandwidth usage. Furthermore, sensitive data may be better processed locally to enhance security and privacy.
● Task Variety: Tasks with high computational requirements (compute-intensive) may benefit from offloading to more powerful cloud servers. Also, tasks that can be easily parallelized may be suitable for offloading to multiple edge devices or cloud servers.
● Application Requirements: Applications with strict real-time requirements may prefer local processing to minimize latency. Constraints such as the desired quality of service (QoS), response time, and reliability directly influence offloading decisions.
● Device Mobility: In mobile edge computing scenarios, the mobility of edge devices can affect offloading decisions as devices move between coverage units or in and out of network coverage.
Resource Scheduling
The three-tier heterogeneous architecture for resource scheduling in edge computing consists of three layers: the things layer (user layer), the edge layer, and the cloud layer. This three-tier architecture is a widely accepted paradigm.
Week 8
70 mins
Network Virtualization
Software Defined Network
Dr. Rajiv Misra, Professor
Dept. of Computer Science & Engineering
Indian Institute of Technology Patna
[email protected]
Preface
Content of this Lecture:
• In this lecture, we will discuss the architecture of software-defined networking (SDN) and its applications to networking in the cloud.
• We will also discuss network virtualization in multi-tenant data centers, with case studies of VL2 and NVP.
Need of SDN
The traditional networking problems that SDN (software-defined networking) addresses:
I) Complexity of existing networks
• Networks are complex, just like computer systems: hardware running software.
• Worse, a network is a distributed system.
• Worse still: no clear programming APIs, only "knobs and dials" to control certain functions.
II) Network equipment traditionally is proprietary
• Integrated solutions (operating systems, software, configuration, protocol implementations, hardware) from major vendors.
RESULT: It is hard and time-intensive to innovate new kinds of networks and new services, or to modify traditional networks efficiently.
Traditional Network
In a traditional network, configuring a router (for example, for BGP routing) is done in a configuration file. There may be hundreds, thousands, or tens of thousands of network devices (routers, switches, and firewalls), and configurations get pushed out to these devices across the network.
Once a configuration reaches a device, it has to be interpreted by the device software; protocol implementations then run distributed algorithms and arrive at routing solutions that produce the ultimate behavior installed in the network hardware.
So here the policy is embedded in the mechanism: whether the network routes to achieve low latency or high utilization is baked into the distributed protocols in their standardized implementations.
Software Defined Network
• Traditional software and OSs are built on layers and APIs. SDN begins close to the hardware: build a low-level interface (a data plane API) that gives direct access to what the network switching hardware is doing.
• Then add a logically centralized controller which communicates with the distributed switches and other devices in the network.
• The goal of a logically centralized controller is to express our goal in one location and keep the widely distributed switching gear as simple as possible.
• Put the intelligence in a logically centralized location. On top of that, build software abstractions that help us build different applications: a network operating system, if you want.
Key Ideas of SDN
Key ideas of the software-defined networking architecture:
Division of Policy and Mechanism –
• A low-level, programmatic interface to the data plane.
• A logically centralized controller that allows us to build software abstractions on top of it.
Example: NOX
NOX is a very early SDN controller. Suppose we want to identify a particular user's traffic: whenever that user or computer sends traffic through the network, the traffic is tagged with an identifier, a VLAN tag, that identifies that user.
To instruct the network, we match a specific set of relevant packets, looking at the location where the traffic comes in as well as the MAC address. Then we construct an action that should happen: in this case, tagging, i.e., adding the VLAN tag to the traffic.
Then we install that action for the specified set of packets in a particular switch. We are basically telling the switch: if you see this, do that.
In addition, SDN controllers commonly have some kind of topology discovery and the ability to control traffic and monitor behavior in the network.
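The "if you see this, do that" instruction to a switch can be sketched with a toy flow table. This is a plain-Python illustration of the idea, not the actual NOX API; all field names are assumptions:

```python
# Minimal sketch of the NOX-style match-action idea, using a
# plain-Python flow table rather than a real controller API.

flow_table = []  # ordered list of (match, action) rules

def install_rule(match, action):
    """Install a match->action rule on the (simulated) switch."""
    flow_table.append((match, action))

def process(packet):
    """Apply the first matching rule's action to the packet."""
    for match, action in flow_table:
        if all(packet.get(k) == v for k, v in match.items()):
            return action(packet)
    return packet  # no rule matched: leave packet unchanged

# Tag traffic arriving on port 3 from this MAC with VLAN 42.
install_rule(
    match={"in_port": 3, "src_mac": "aa:bb:cc:dd:ee:ff"},
    action=lambda pkt: {**pkt, "vlan": 42},
)

pkt = {"in_port": 3, "src_mac": "aa:bb:cc:dd:ee:ff", "payload": "..."}
print(process(pkt)["vlan"])  # -> 42
```

The controller's job is then to compute which (match, action) rules to install on which switches; the switches themselves stay simple.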
Key Ideas of SDN
• Centralized control
• Higher-level abstractions that make control easier.
Evolution of SDN: Flexible Data Planes
• The evolution of SDN is driven by making the network flexible.
• Label switching, or MPLS (1997), i.e., matching labels and executing actions based on those labels, added flexibility:
• Lay down any path we want in the network for certain classes of traffic.
• Go beyond simple shortest-path forwarding, e.g., for traffic engineering.
• Good optimization of traffic flow to get high throughput.
Evolution of SDN: Logically Centralized Control
• 4D architecture (2005)
• "A Clean Slate 4D Approach to Network Control and Management" [Greenberg et al., CCR Oct 2005]
• Logically centralized "decision plane" separated from the data plane.
• Ethane (2007) [Casado et al., SIGCOMM 2007]
• A centralized controller enforces enterprise-network Ethernet forwarding policy using existing hardware.
• OpenFlow (2008) [McKeown et al., 2008]
• Thin, standardized interface to the data plane.
• General-purpose programmability at the controller.
SDN Opportunities
• Open data plane interface
• Hardware: with a standardized API, it is easier for operators to change hardware, and for vendors to enter the market.
• Software: can more directly access device behavior.
• Centralized controller:
• Direct programmatic control of the network.
• Software abstraction at the controller.
• Solve the distributed-systems problem only once, then just write algorithms.
• Libraries/languages to help programmers write network apps.
Distributed-systems challenges are still present:
• The network is fundamentally a distributed system.
• Resilience of logically centralized controllers.
• Imperfect knowledge of network state.
• Consistency issues between controllers.
Architectural Challenges of SDN
• Devising the right control abstraction?
• Programming OpenFlow directly is far too low-level.
• But what are the right abstractions to cover the important use cases?
Multi-tenant Data Centers : The challenges
The cloud is shared among multiple parties and provides economy of scale. To share the cloud among multiple tenants, there is a bit more work to do. The key needs for building a multi-tenant cloud data center are:
(i) Agility
(ii) Location-independent addressing
(iii) Performance uniformity
(iv) Security
(v) Network semantics
(i) Agility
• Use any server for any service at any time:
• Better economy of scale through increased utilization: pack compute as best we can for high utilization. If we have placement constraints, it becomes much harder to make full use of resources.
• Improved reliability: if there is a planned or unexpected outage, move the services so they keep running uninterrupted.
• A service or tenant can mean:
• A customer renting space in a public cloud
• An application or service in a private cloud, as an internal customer
Traditional Datacenters
[Figure: traditional data center hierarchy, with routers at the top, aggregation switches below, top-of-rack switches, and racks of servers.]
Lack of Agility in Traditional DCs
• Tenants in "silos": one rack or a part of the cluster is devoted to a particular service.
• Poor utilization.
• Inability to expand.
Lack of Agility in Traditional DCs
• IP addresses are locked to topological location!
Key needs: Agility
• Agility
• Location-independent addressing: racks are generally assigned different IP subnets, because subnets are used as topological locators so that we can route. To move a service elsewhere, we would have to change its IP address, and it is hard to change the IP addresses of live running services.
• A tenant's IP address should be usable anywhere in the data center, independent of location, without notifying the tenant that the service has changed location.
• With a large oversubscription ratio (e.g., 100:1 or greater), if there is a lot of communication between the two sides, throughput can be about a hundred times lower than communicating within the rack.
Key needs: Performance Uniformity
• Performance Uniformity: wherever we place compute, it should see the same performance and latency.
• We divide our services into smaller units of compute and can put them anywhere in the data center, possibly on the same physical machine.
Key needs: Security
• Security: untrusting applications and users sit right next to each other, and there can be inbound attacks. So we need to protect tenants in the data center from each other, in both the public data center and the private cloud, without requiring them to trust each other.
• Micro-segmentation: separation of different regions of a network.
• Much finer-grained division and control of how data can flow.
• Isolate or control data flow between just the pairs of applications or tenants that should actually be allowed to communicate.
Key needs: Network semantics
• Network semantics:
• Not just Layer 3 routing services but also Layer 2 services (discovery, multicast, broadcast, etc.) have to be supported.
Network Virtualization in
Multi-tenant Data Centers
Case Study: VL2
Network Virtualization Case Study: VL2
Key Needs:
(i) Agility
(ii) Location-independent addressing
(iii) Performance uniformity
(iv) Security
(v) Network semantics
Motivating Environmental Characteristics
Increasing internal traffic is a bottleneck
• Traffic volume between servers is 4 times larger than the external traffic.
Rapidly changing traffic matrices (TMs)
• E.g., take traffic matrices in 100-second buckets, classify them into 40 categories of similar clusters of traffic matrices, and see which clusters appear in the measurements.
• The traffic matrix changes rapidly over time, with no pattern to which particular matrix appears. [Greenberg et al.]
Design result: Nonblocking fabric
• High throughput for any traffic matrix that respects server NIC rates.
• The fabric joining together all the servers should not be a bottleneck.
Motivating Environmental Characteristics
Failure characteristics:
• Analyzed 300K alarm tickets and 36 million error events from the cluster.
• 0.4% of failures were resolved in over one day.
• 0.3% of failures eliminated all redundancy in a device group (e.g., both uplinks).
• A particular kind of nonblocking topology: "scale out" instead of "scale up".
VL2 physical topology
[Figure: traditional tree topology vs. the VL2 Clos topology.]
An example Clos network between Aggregation and Intermediate switches provides a richly-connected backbone well suited for VLB. The network is built with two separate address families: topologically significant Locator Addresses (LAs) and flat Application Addresses (AAs).
Routing in VL2
Unpredictable traffic
• This makes it difficult to adapt, which leads us to a design called oblivious routing: the path along which we send a particular flow does not depend on the current traffic matrix.
Design result: "Valiant Load Balancing" (VLB)
• As in routing on hypercubes, take an arbitrary traffic matrix and make it look like a completely uniform traffic matrix.
• Take flows and spread them evenly over all the available paths; spread traffic as much as possible.
• Route traffic independent of the current traffic matrix.
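The spreading step of VLB can be sketched as per-flow hashing over the intermediate switches, the same mechanism ECMP uses to pick among equal-cost paths; the switch count and flow tuples below are illustrative assumptions:

```python
# Sketch of the VLB/ECMP idea: each flow is hashed to one of the
# intermediate switches, so an arbitrary traffic pattern spreads
# roughly evenly. Switch count and flow tuples are illustrative.

import hashlib
from collections import Counter

INTERMEDIATES = [f"int-{i}" for i in range(4)]

def pick_intermediate(flow):
    """Hash a flow 5-tuple to an intermediate switch (per-flow ECMP).
    The same flow always maps to the same switch, so packets of one
    flow stay on one path (avoiding reordering)."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return INTERMEDIATES[digest[0] % len(INTERMEDIATES)]

# 10,000 distinct flows between many source/destination pairs.
flows = [("10.0.%d.%d" % (s, d), "10.1.%d.%d" % (d, s), 6, 5000 + s, 80)
         for s in range(100) for d in range(100)]

load = Counter(pick_intermediate(f) for f in flows)
# Each of the 4 intermediates carries roughly 2,500 of 10,000 flows,
# regardless of which pairs actually talk: the traffic matrix is
# made to look uniform.
```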
Routing Implementation
VL2 spreads an arbitrary traffic pattern so that it is uniform among the top-layer switches, called intermediate switches. To do that, VL2 assigns the intermediate switches an anycast address: the same anycast address for all of the switches. A top-of-rack switch can then send to a random intermediate switch just by using that single address.
• ECMP lets us select one of the paths to the single anycast address shared by all intermediates; one path is picked for any particular flow, and the packet is sent to that intermediate switch.
• The outer anycast header wraps an inner header that carries the actual destination address, so from the intermediate switch the packet is forwarded on to the destination.
• A side benefit: most switches need only small forwarding tables, since they route on the shared anycast address rather than on per-destination entries.
Any service anywhere
App/Tenant layer: an application or tenant of the data center sees what are called Application Addresses (AAs). These are location-independent: the same address no matter where the VM goes. Tenants see the illusion of a single big Layer 2 switch connecting all of the application's VMs.
Indirection or virtualization layer: VL2 maintains a directory server that maps application-level addresses to their current locators. Agents running on the servers query the directory server to find the AA-to-LA mapping; when a server sends a packet, the agent wraps the application address in an outer locator header.
Physical network layer: a different set of IP addresses, called Locator Addresses (LAs), tied to the topology and used to route; Layer 3 routing via OSPF.
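The indirection layer can be sketched as a directory lookup plus encapsulation; the addresses and packet format here are illustrative, not VL2's actual wire format:

```python
# Sketch of VL2's AA->LA indirection: a directory maps a VM's
# location-independent Application Address (AA) to the Locator
# Address (LA) of its current location; the host agent encapsulates
# packets. Addresses and the encapsulation format are illustrative.

directory = {  # AA -> LA currently hosting the VM
    "20.0.0.5": "10.0.1.1",
    "20.0.0.9": "10.0.7.1",
}

def encapsulate(src_aa, dst_aa, payload):
    """Wrap the AA-addressed packet in an LA-addressed outer header."""
    la = directory[dst_aa]            # agent queries the directory
    inner = {"src": src_aa, "dst": dst_aa, "payload": payload}
    return {"outer_dst": la, "inner": inner}

def migrate(aa, new_la):
    """VM moved: only the directory entry changes, not the VM's AA."""
    directory[aa] = new_la

pkt = encapsulate("20.0.0.5", "20.0.0.9", "hello")
# pkt routes over the physical network toward LA "10.0.7.1"
migrate("20.0.0.9", "10.0.3.1")
pkt2 = encapsulate("20.0.0.5", "20.0.0.9", "hello again")
# after migration the same AA resolves to the new locator "10.0.3.1"
```

This is what makes the addressing location-independent: the tenant keeps using the AA, and only the directory mapping tracks where the VM actually lives.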
End-to-end Example
Did we achieve agility?
Location-independent addressing
• AAs are location-independent.
L2 network semantics
• The agent intercepts and handles L2 broadcast and multicast.
• Both of the above require a "layer 2.5" shim agent running on the host; but the concept transfers to a hypervisor-based virtual switch.
Did we achieve agility?
Performance uniformity:
• The Clos network is nonblocking (non-oversubscribed): uniform capacity everywhere.
• ECMP provides good (though not perfect) load balancing.
• But performance isolation among tenants depends on TCP backing off to the rate that the destination can receive.
• Leaves open the possibility of fast load balancing.
Security:
• The directory system can allow/deny connections by choosing whether to resolve an AA to an LA.
• But segmentation is not explicitly enforced at hosts.
Where’s the SDN?
• Centralized control of communication policy.
• Host agents: dynamic "programming" of the data path.
Network Virtualization
Case Study: NVP
NVP Approach to Virtualization
• NVP is described in "Network Virtualization in Multi-tenant Datacenters", Koponen et al., NSDI 2014.
• It comes out of a product developed by the Nicira startup, which was acquired by VMware.
Service: Arbitrary network topology
• The network hypervisor implements, on top of the physical network, the virtual network topology each tenant wants to build.
• Physical network: any standard Layer 3 network.
Virtual Network Service
• The virtual network is modeled as a sequence of datapath elements that represent switches.
• Each datapath element is an OpenFlow forwarding table: the table matches on certain signatures of packet headers and takes certain resulting actions, such as dropping the packet, modifying certain fields in the packet, or forwarding the packet on.
• The idea is that we can model switching and routing gear with the right sequences of OpenFlow tables (e.g., an access control table followed by a link table).
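A minimal sketch of such a pipeline, assuming a toy two-table design (ACL, then forwarding) rather than NVP's actual tables:

```python
# Sketch of a virtual datapath as a sequence of match-action tables.
# The table contents and field names are illustrative assumptions.

DROP = object()  # sentinel meaning "packet dropped"

def acl_table(pkt):
    # Access control stage: drop traffic to a blocked destination port.
    if pkt.get("dst_port") == 23:   # e.g., block telnet
        return DROP
    return pkt

def forwarding_table(pkt):
    # Forwarding stage: choose an output port by destination MAC.
    ports = {"aa:aa": 1, "bb:bb": 2}
    return {**pkt, "out_port": ports.get(pkt["dst_mac"], 0)}

PIPELINE = [acl_table, forwarding_table]

def run_pipeline(pkt):
    """Run a packet through the whole virtual pipeline in order."""
    for table in PIPELINE:
        pkt = table(pkt)
        if pkt is DROP:
            return None
    return pkt

out = run_pipeline({"dst_mac": "bb:bb", "dst_port": 80})
# out["out_port"] == 2; a packet with dst_port 23 would be dropped
```

The tenant's control abstraction amounts to defining the contents of each table in such a sequence.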
Virtual Network Service
• There is a packet abstraction, through which virtual machines inject traffic into the virtual network.
• And there is a control abstraction, through which the tenant defines the entire virtual network pipeline: the sequence of OpenFlow flow tables.
• That is the interface, at least the lowest-level interface, the tenant is given to program their virtual network.
Challenge: Performance
There is a large amount of state to compute:
• O(n²) tunnels for a tenant with n VMs.
• Solution 1: automated incremental state computation with the nlog declarative language.
• Solution 2: a logical controller computes a single set of universal flows for a tenant, which "physical controllers" then translate for local use.
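The O(n²) growth is easy to check; a quick sketch with illustrative VM counts:

```python
# Quick check of the O(n^2) tunnel scaling: a full mesh between the
# n VMs of one tenant needs n*(n-1) directed tunnels, or n*(n-1)/2
# if tunnels are bidirectional. VM counts are illustrative.

def mesh_tunnels(n_vms, bidirectional=True):
    """Tunnels needed to fully mesh one tenant's VMs."""
    pairs = n_vms * (n_vms - 1)
    return pairs // 2 if bidirectional else pairs

print(mesh_tunnels(10))    # 45
print(mesh_tunnels(1000))  # 499500: why incremental recomputation
                           # of state matters at scale
```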
Challenge: Performance
Pipeline processing in the virtual switch can be slow.
• Solution: send the first packet of a flow through the full pipeline; thereafter, put an exact-match entry for the flow in the kernel.
Tunneling interferes with TCP Segmentation Offload (TSO).
• The NIC cannot see the inner TCP header through the tunnel encapsulation.
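The first-packet/fast-path split can be sketched as a tiny exact-match cache in front of a stand-in pipeline; this is simplified, since real virtual switches key on full header fields and handle cache invalidation:

```python
# Sketch of the fast-path idea: the first packet of a flow runs the
# full (slow) pipeline; the result is cached under an exact-match
# key so later packets of the same flow skip the pipeline entirely.

flow_cache = {}
pipeline_runs = 0

def slow_pipeline(pkt):
    """Stand-in for the full multi-table virtual pipeline."""
    global pipeline_runs
    pipeline_runs += 1
    return {"out_port": 7}  # the computed forwarding decision

def forward(pkt):
    key = (pkt["src"], pkt["dst"], pkt["proto"],
           pkt["sport"], pkt["dport"])
    if key not in flow_cache:          # first packet of the flow
        flow_cache[key] = slow_pipeline(pkt)
    return flow_cache[key]             # cached exact-match entry

flow = {"src": "a", "dst": "b", "proto": 6, "sport": 1234, "dport": 80}
for _ in range(1000):
    forward(flow)
# pipeline_runs == 1: only the first packet paid the pipeline cost
```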
Conclusion
• In this lecture, we have discussed the need for software-defined networking, and the key ideas and challenges of SDN.
• We have also discussed the key needs of multi-tenant data centers: (i) agility, (ii) location-independent addressing, (iii) performance uniformity, (iv) security, and (v) network semantics.