Differentiated Availability in Cloud Computing SLAs
Astrid Undheim
Telenor ASA
Corporate Development
Trondheim, Norway
[email protected]
I. INTRODUCTION
Cloud computing is a new computing paradigm
that has attracted considerable attention. It enables on-demand access to a shared pool of highly scalable computing
resources that can be rapidly provisioned and released [1].
This is achieved by offering computing resources and services from large data centers, where the physical resources
(servers, network, storage) are virtualized and offered as
services over a network.
So far, a large share of cloud applications has targeted consumers with low willingness to pay and low
expectations of service QoS (dependability, performance
and security). Recently, more and more enterprises have also begun
investigating how to leverage cloud computing advantages such as the pay-per-use model and rapid elasticity.
However, major challenges must be addressed before
enterprises will trust cloud providers with their core business
applications. These challenges are mainly related to QoS, which in
our view covers dependability, performance and security,
and a comprehensive Service Level Agreement (SLA) is
needed to cover all these aspects. This is in contrast to the
insufficient SLAs offered today.
In this paper, we focus on dependability, and more specifically the availability attribute. Availability is defined in
[2] as the readiness for correct service, which can be
[Figure omitted: classification of VM fault tolerance techniques (VMware HA, VMware FT, Remus) with the labels hot/cold, updated/not updated and dedicated/shared.]
[Figures omitted: system model diagrams of the cloud provider (data centers 1 and 2, clusters, PDUs, power/cooling, servers with HW, VMM and VMs, L1/L2 switches, gateways GW1 and GW2, VI and cloud management), the Internet and customer user access (UA); and state diagrams with states Both OK, One down, Both down and Both OK, Active down, Standby down, Both down, with transition labels c and (1-c).]
A. Same Cluster
The simplest deployment places all replicas in the
same cluster. This gives low network latency when updating
replicas, but it also means that a power, management
software or network failure may make all replicas, and thus
the service, unavailable. The resulting model is
shown in Figure 11.
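As a rough illustration of why the shared components dominate in this deployment, the following sketch treats the cluster as a simple reliability block diagram: the replicas in parallel, in series with the shared power, management software and switch components. This is a simplification and not the Markov model of Figure 11; it ignores the coverage factor c and all repair dynamics. The three component availabilities are the values read from Table II for power, management software and switches, while the per-replica availability of 0.99 and the function names are assumptions made for illustration only.

```python
from math import prod


def parallel(avails):
    """Availability of redundant units: up if at least one unit is up."""
    return 1.0 - prod(1.0 - a for a in avails)


def series(avails):
    """Availability of a chain: up only if every unit is up."""
    return prod(avails)


# Shared cluster components, values as read from Table II.
A_POWER = 0.9975
A_MNGMT = 0.999
A_SWITCH = 0.97986

# Hypothetical availability of a single replica (assumption, not from the paper).
A_REPLICA = 0.99


def same_cluster(n_replicas, a_rep=A_REPLICA):
    """Service availability with n replicas behind the shared components."""
    replicas = parallel([a_rep] * n_replicas)
    return series([A_POWER, A_MNGMT, A_SWITCH, replicas])


# The shared components alone bound what any number of replicas can achieve.
shared_bound = series([A_POWER, A_MNGMT, A_SWITCH])

for n in (1, 2, 3):
    print(n, "replica(s):", round(same_cluster(n), 5))
print("bound from shared components:", round(shared_bound, 5))
```

Under these assumptions, adding replicas moves the result toward, but never above, the product of the shared component availabilities, which is the limitation of the same-cluster deployment described above.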
[Figures omitted: block diagrams of the replica deployments, showing management (Mngmt), power, server and network components for Cluster B and for data centers DC A and DC B; and a state diagram with states OK, VM down and Queue, with transition labels c, (1-c), pc, (1-c)p and (1-p).]

TABLE I (parameter symbols not recoverable)
Parameter   Value           Source
-           0.00722 hr^-1   [5]
-           2.0 hr^-1       [5]
-           60 hr^-1        [12]
-           0.95            [5]
-           0.05 hr^-1      Guessed
-           0.99            Guessed
-           6.0 hr^-1       Guessed

TABLE II
Name                  Value     Source
Power                 0.9975    [15]
PDU                   0.9992    [17]
Management software   0.999     Guessed
Switches              0.97986   [14]
Router                0.99966   [17]
Access Network        0.989     [18]
Core Network          0.999     [13]
User Access           0.99      [13]

TABLE III
VM Fault Tolerance      Cloud A    Netw A    Tot A
Updated Dedicated Hot   0.99354    0.98901   0.98262
Updated Shared Hot      0.99342    0.98901   0.98262
Shared Cold (HP)        0.98981    0.98901   0.97893
Shared Cold (LP)        0.95385    0.98901   0.94336

Updated Dedicated Hot   0.994972   0.98901   0.98403
Updated Shared Hot      0.99497    0.98901   0.98403
Shared Cold (HP)        0.99494    0.98901   0.98401
Shared Cold (LP)        0.98311    0.98901   0.97230

Updated Dedicated Hot   0.997971   0.99      0.98799
Updated Shared Hot      0.99797    0.99      0.98799
Shared Cold (HP)        0.99791    0.99      0.98799
Shared Cold (LP)        0.98683    0.99      0.97697

[Figure omitted: A_service versus A_mngt for Scenarios I, II and III.]
[Figure omitted: A_service versus A_power for Scenario III.]
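The Tot A column of Table III appears to be the product of the Cloud A and Netw A columns, i.e. a series composition of cloud and network availability. The sketch below checks this for the twelve reported rows and converts each total into expected downtime per year. The series interpretation is inferred from the numbers rather than stated in the text, and the 8760-hour year and the helper name `downtime_hours_per_year` are assumptions for illustration.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year

# (Cloud A, Netw A, Tot A) rows as reported in Table III.
ROWS = [
    (0.99354, 0.98901, 0.98262),
    (0.99342, 0.98901, 0.98262),
    (0.98981, 0.98901, 0.97893),
    (0.95385, 0.98901, 0.94336),
    (0.994972, 0.98901, 0.98403),
    (0.99497, 0.98901, 0.98403),
    (0.99494, 0.98901, 0.98401),
    (0.98311, 0.98901, 0.97230),
    (0.997971, 0.99, 0.98799),
    (0.99797, 0.99, 0.98799),
    (0.99791, 0.99, 0.98799),
    (0.98683, 0.99, 0.97697),
]


def downtime_hours_per_year(availability):
    """Expected unavailable hours per year for a steady-state availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Largest gap between the series product and the reported total.
max_dev = max(abs(cloud * netw - tot) for cloud, netw, tot in ROWS)
print(f"largest |Cloud A * Netw A - Tot A| = {max_dev:.1e}")

for cloud, netw, tot in ROWS:
    print(f"Tot A = {tot:.5f} -> {downtime_hours_per_year(tot):.1f} h/year")
```

The products agree with the reported totals to within about 2e-4 for every row, which supports reading the total as cloud availability multiplied by network availability. Expressing the totals as hours of downtime per year is one way to make the differences between the fault tolerance techniques concrete in an SLA.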
Finally, the availability of the cold standby with high and
low priority is plotted against the preemption rate in Figure
15. The preemption rate depends on the load in the
system. With a preemption rate of zero, the high and
low priority techniques perform equally, but for higher preemption
rates the high priority technique is superior. Hence, using different
priority levels and allowing preemption has a clear
differentiation effect as the load increases.
Figure 15: Availability for high and low priority cold standbys with increasing preemption rate
VII. CONCLUSIONS AND FUTURE WORK
SLAs have received a lot of attention in cloud computing,
and availability in particular is covered by public cloud SLAs.
However, some important improvements remain to be
made. First, the SLAs must become more detailed with
respect to the actual KPIs used to define availability. Next, in
order to also deploy important enterprise services in clouds,
different levels of availability should be offered, depending
on the actual user requirements. Finally, the SLAs should
be available on demand, which also means that they should
be adjustable on demand.
This paper has proposed an overall availability model
for a cloud system, including the network. We have shown
how deploying replicas in different physical locations affects
the resulting availability, and also how different applications
need different fault tolerance schemes. These are two possible dimensions for differentiating cloud applications.
Future work includes modeling more complex services,
e.g. a tiered web service. Also, the server models should
be made more detailed, taking into account the characteristics
of the different failures and repairs. We have discussed the
need for well-defined KPIs for availability; the next step
is to also include performance measures in the availability
models. Finally, network availability strongly influences
the total availability of a cloud service, and should ideally
be included in the cloud service SLA.
REFERENCES
[5] D. S. Kim, F. Machida, and K. S. Trivedi, "Availability Modeling and Analysis of a Virtualized System," in Proceedings
of the 15th IEEE Pacific Rim International Symposium on
Dependable Computing, Nov. 2009, pp. 365-371.
[15] W. P. Turner and J. Seader, "Tier classifications define site infrastructure performance," The Uptime Institute White Paper,
2006.