
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 24, NO. 6, JUNE 2013

Error-Tolerant Resource Allocation and Payment Minimization for Cloud System

Sheng Di, Member, IEEE, and Cho-Li Wang, Member, IEEE

Abstract—With virtual machine (VM) technology becoming increasingly mature, compute resources in cloud systems can be partitioned at fine granularity and allocated on demand. We make three contributions in this paper: 1) We formulate a deadline-driven resource allocation problem for cloud environments equipped with VM resource isolation technology, and propose a novel polynomial-time solution that minimizes users' payment subject to their expected deadlines. 2) By analyzing the upper bound of task execution length under possibly inaccurate workload prediction, we further propose an error-tolerant method to guarantee each task's completion within its deadline. 3) We validate the effectiveness of our solution on a real VM-facilitated cluster environment under different levels of competition. In our experiments, by tuning the algorithmic input deadline based on our derived bound, task execution length can always be kept within the deadline in the sufficient-supply situation, and the mean execution length remains about 70 percent of the user-specified deadline even under severe competition. Under the original-deadline-based solution, about 52.5 percent of tasks complete within 0.95-1.0 times their deadlines, which still conforms to the deadline-guarantee requirement. Only 20 percent of tasks violate their deadlines, and most of those (17.5 percent) still finish within 1.05 times their deadlines.

Index Terms—VM multiplexing, resource allocation, convex optimization, prediction error tolerance, payment minimization

1 INTRODUCTION

CLOUD computing [1], [2] has emerged as a compelling paradigm for the deployment of easy-to-use virtual environments on the Internet. One typical feature of clouds is their pool of easily accessible virtualized resources (such as hardware, platforms, or services) that can be dynamically reconfigured to adjust to a variable load (scale). All the resources provisioned by a cloud system are supposed to be under a payment model [2], in order to discourage users from demanding resources beyond their true needs.

Each task's workload is likely of multiple dimensions. First, the compute resources in need may be multiattribute (such as CPU, disk-reading speed, network bandwidth, etc.), resulting in multidimensional execution in nature. Second, even though a task depends on only one resource type such as CPU, it may still be split into multiple sequential execution phases, each calling for a different computing ability and a different on-demand price, again leading to a potentially high-dimensional execution scenario.

Resource allocation in cloud computing is much more complex than in other distributed systems such as Grid computing platforms. In a Grid system [3], it is improper to share the compute resources among the multiple applications simultaneously running atop it, due to the inevitable mutual performance interference among them. In contrast, cloud systems usually do not provision physical hosts directly to users, but leverage virtual resources isolated by VM technology [4], [5], [6]. Not only can such an elastic resource usage model adapt to users' specific demands, it can also maximize resource utilization at fine granularity and isolate abnormal environments for safety purposes. Successful platforms and cloud management tools leveraging VM resource isolation technology include Amazon EC2 [7] and OpenNebula [8]. On the other hand, with the fast development of scientific research, users may pose quite complicated demands. For example, users may wish to minimize their payments while guaranteeing their service level, such that their tasks finish before their deadlines. Such deadline-guaranteed resource allocation with minimized payment is rarely studied in the literature. Moreover, inevitable errors in predicting task workloads make the problem harder still.

Based on the elastic resource usage model, we aim to design a resource allocation algorithm with high tolerance of prediction errors that also minimizes users' payments subject to their expected deadlines.

. S. Di is with the MESCAL Group, INRIA, Room 213, Laboratoire LIG, ENSIMAG antenne de Montbonnot ZIRST, 51, avenue Jean Kuntzmann, 38330 Monbonnot Saint Martin, France. E-mail: [email protected].
. C.-L. Wang is with the Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong. E-mail: [email protected].
Manuscript received 29 Feb. 2012; revised 27 Aug. 2012; accepted 30 Sept. 2012; published online 19 Oct. 2012.
Recommended for acceptance by V.B. Misic, R. Buyya, D. Milojicic, and Y. Cui.
For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPDSSI-2012-02-0179.
Digital Object Identifier no. 10.1109/TPDS.2012.309.
1045-9219/13/$31.00 © 2013 IEEE. Published by the IEEE Computer Society.

Since idle physical resources can be arbitrarily partitioned and allocated to new tasks, VM-based divisible resource allocation can be very flexible. This implies the feasibility of finding the optimal solution through convex optimization strategies [9], unlike the traditional Grid model [10], which relies on indivisible resources such as the number of physical cores. However, we found that it is inviable to directly solve the necessary and sufficient condition for the optimal solution, a.k.a. the Karush-Kuhn-Tucker (KKT) conditions [9]. Our first contribution is devising a novel approach (with only O(n·R²) time complexity) to solve the problem, where R denotes the

number of execution dimensions and n is the system scale (the number of compute nodes).
In the literature, traditional optimization problems are often subject to precise prediction of a task's characteristics (or execution properties), which is nontrivial to realize in practice. Accordingly, as the state of the art, we further analyze our algorithm's optimality approximation ratio given possibly wrong predictions of tasks' execution properties. In particular, we try to answer the following question: when an application's characteristics are predicted with certain levels of error, will the application's final execution length (a.k.a. execution time) violate (or surpass) its deadline? If yes, what is the ratio of the final execution time to its deadline? These theoretical results are significantly valuable for guaranteeing users' service levels in practice. In fact, by setting a relatively stricter deadline properly based on our derived approximation ratio, each task can be guaranteed to finish within its original deadline even though task properties cannot be predicted accurately.

In addition to the above theoretical contribution, we further confirm the effectiveness of our solutions by implementing a set of advanced web services, based on complex matrix operations, over a real cluster environment with 60 virtual machines. All the theoretical conclusions are confirmed by our experiments. Specifically, in the situation with relatively sufficient resources, the worst-case tasks under the stricter-deadline-based allocation take only about 0.75 times their deadlines to complete, compared to 1.2 times the deadlines under the original user-predefined-deadline-based allocation. We also observe that in the competitive environment, the latter algorithm performs much more stably than the former, which means that the latter tolerates resource competition better. We also confirm the effectiveness of our solution via the distribution of the number of tasks with respect to execution times and user payments: in the competitive situation, the majority of tasks can be guaranteed to complete within their deadlines.

The rest of the paper is organized as follows: In Section 2, we formulate our problem based on the cloud scenario, which supports elastic divisible resource customization. In Section 3, we first discuss the complexity of the modeled problem in brief, and then formally describe a novel algorithm which can minimize a user's payment based on the task's preset execution deadline. In Section 4, we derive the lower bound and upper bound of execution time for the situation with possibly skewed predictions of tasks' properties as compared to the deadlines. We implement our algorithm and analyze experimental results in a real-cluster setting in Section 5. We discuss related works in Section 6 and conclude with future work in Section 7.

2 PROBLEM FORMULATION

In cloud systems, the cloud proxy (a.k.a. server) continually receives and responds to user requests (or tasks) with customized requirements (or virtual machines). All tasks will be handled based on their priorities (like the Google task scheduler [11]) or in terms of a First-Come-First-Serve (FCFS) policy when the tasks are of the same priority (like [12]).

Fig. 1. Resource allocation in cloud system.

Each task's execution may involve multidimensional resources, such as CPU and disk I/O. A data mining task, for example, usually needs to load a large set of data from disk before or in the middle of its computation. Eventually, such a task may store its computation results onto the local disk or a public server through the network. Fig. 1 illustrates the procedure of processing such a task (denoted t_i). Suppose the task's execution-time costs on computation and disk processing are predicted as 4 and 3 hours, respectively. Upon receiving the request, the scheduler checks the precollected availability states of all candidate nodes, and estimates the minimal payment of running the task within its deadline on each of them (i.e., Step 1 in the figure). The host (Node p3 shown in Fig. 1) that requires the lowest payment will run the task via a customized VM instance with isolated resources (Step 2 in Fig. 1). Specifically, the VM will be customized with such a CPU rate (e.g., 0.4 Gflops) and disk I/O rate (e.g., 0.3 Gbps) that the task can be finished within its deadline (D(t_i) = 1 hour in the example) while its user payment is also minimized. Finally (Step 3), its computation results (or feedback) will be returned to the user.

Suppose there are n compute nodes (denoted by p_i, where 1 ≤ i ≤ n). Since all the resources are managed centrally, the availability state of each resource within any recent or later period can be predicted in advance, for executing any given task with multiple execution dimensions. For any particular task with R execution dimensions, we use Φ to denote the whole set of dimensions and c(p_i) = (c_1(p_i), c_2(p_i), ..., c_R(p_i))^T as node p_i's capacity vector on these dimensions (in this paper, we use bold type to indicate a vector). In Fig. 1, for example, node p_1's physical capacity vector is c(p_1) = {CPU = 2.4 Gflops, disk I/O = 1 Gbps}.

Any user task is denoted as t_i, where 1 ≤ i ≤ m, and m refers to the total number of submitted tasks. Each task has a multidimensional workload vector, denoted by l(t_i) = (l_1(t_i), l_2(t_i), ..., l_R(t_i))^T, which needs to be finished before the task's deadline. We denote the resource vector allocated to t_i as r(t_i) = (r_1(t_i), r_2(t_i), ..., r_R(t_i))^T, where r_k(t_i) (k = 1, 2, ..., R) refers to the resource amount on the kth execution dimension isolated by the hypervisor/virtual machine monitor (VMM) for the task's execution. Node p_j's availability vector (denoted a(p_j)) along the multiple dimensions is calculated by c(p_j) − Σ_{t_i running on p_j} r(t_i). For example, node

p_1 in Fig. 1 is running two VMs that are allocated half of the total physical resources, so its availability vector is a(p_1) = {CPU = 1.2 Gflops, disk I/O = 0.5 Gbps}.

If no workloads are executed simultaneously for a particular task, its total execution time will be the sum of the individual processing times on the different dimensions. If the execution of the workloads overlaps, however, the task's completion time will be shorter. Accordingly, t_i's final execution time (denoted T(t_i)) is confined within the range [max_k(l_k/r_k), Σ_{k=1}^R l_k/r_k]. For simplicity, we denote task t_i's execution time as (1) (an affine transformation of Σ_{k=1}^R l_k/r_k), where θ denotes a constant coefficient. Such a definition covers a de facto broad set of applications, each with multiple execution dimensions. The typical example is a single job with multiple sequentially interdependent tasks, or a program with distinct execution phases each relying on independent compute resources (where θ = 1):

    T(t_i) = θ · Σ_{k=1}^R l_k(t_i)/r_k(t_i),  where θ ∈ [max_k(l_k/r_k) / Σ_{k=1}^R (l_k/r_k), 1].  (1)

For any cloud system, the resources provisioned are usually set with a price vector, denoted b(p_i) = (b_1(p_i), b_2(p_i), ..., b_R(p_i))^T, along the R dimensions. b_k(p_i) (1 ≤ k ≤ R) denotes the per-time-unit price that consumers need to pay for the consumption of the kth dimension on p_i. Each task t_i is set with a deadline (denoted D(t_i)) for its execution, and the payment is expected to be minimized under our algorithm.

In our cloud model, any task will be executed on one or more virtual machines with user-reserved resources, and the payment is calculated based on the customized resources (a.k.a. the pay-by-reserve policy). Adopting such a pricing policy is driven by three reasons. First, the efficiency of many applications usually relies on multiple resources, but it is nontrivial to precisely evaluate the exact amount of their consumption separately on individual resources. Second, quite a few users prefer reserving resources to tolerate usage bursts and guarantee their service levels. Lastly, the alternative pricing policy, pay-as-you-consume, is rather simple because its payment P is always fixed (= Σ_{k=1}^R b_k(p_s)·r_k(t_i)·(l_k(t_i)/r_k(t_i)) = Σ_{k=1}^R b_k(p_s)·l_k(t_i)) regardless of the resource allocation.

Based on the pay-by-reserve policy, task t_i's total payment is calculated via (2), where p_s refers to t_i's execution node. The mean price (i.e., (1/R)·b(p_s)^T·r(t_i)) is used as the pricing unit over time for computing the user's payment. Such a design is consistent with our pay-by-reserve model, and also prevents users from feeling overcharged when their applications' execution cannot overlap on different dimensions:

    P(r(t_i)) = (1/R) · b(p_s)^T · r(t_i) · T(t_i).  (2)

In this paper, we may omit the notations t_i and p_i where this causes no ambiguity. For instance, l_k(t_i), r(t_i), b_k(p_i), a(p_i), and D(t_i) may be written as l_k, r, b_k, a, and D, respectively, in the following text.

Our research can be briefly summarized in the following convex optimization format: for any task t_i with workload vector l(t_i), given a set of candidate execution nodes (p_s, s = 1, 2, ..., n), how to select p_s and split resources such that t_i's payment (i.e., (2)) is minimized, subject to:

    Min P(r(t_i))
    s.t.
        T(t_i) ≤ D(t_i)  (3)
        r(t_i) ≤ a(p_s).  (4)

3 OPTIMAL RESOURCE ALLOCATION

In this section, we first analyze the problem above, and then propose our optimal solution.

By combining (1) and (2), it is easy to verify that ∀r_k: ∂²P(r(t_i))/∂r_k² = (θ/R)·(−2·b_k·l_k/r_k² + (2·l_k/r_k³)·Σ_{i=1}^R b_i·r_i) > 0; thus, the target function P(r(t_i)) is convex, which means there must exist a minimal extreme point.

Based on convex optimization theory [9], the Lagrangian function of the problem can be formulated as (5), where λ and μ_1, μ_2, ..., μ_R are the corresponding Lagrangian multipliers. Note that θ is the constant defined in (1) and r is the abbreviation of r(t_i), as stated above:

    F_1(r) = (1/R)·(Σ_{k=1}^R b_k·r_k)·(θ·Σ_{k=1}^R l_k/r_k) + λ·(θ·Σ_{k=1}^R l_k/r_k − D) + Σ_{k=1}^R μ_k·(r_k − a_k).  (5)

Accordingly, we can obtain the Karush-Kuhn-Tucker conditions [9] (i.e., the necessary and sufficient conditions of the optimization) as below:

    λ ≥ 0;  μ_k ≥ 0,  k = 1, 2, ..., R
    θ·Σ_{i=1}^R l_i/r_i ≤ D
    λ·(θ·Σ_{i=1}^R l_i/r_i − D) = 0
    r_k ≤ a_k(p_s),  k = 1, 2, ..., R;  s = 1, 2, ..., n
    μ_k·(r_k − a_k(p_s)) = 0,  k = 1, 2, ..., R;  s = 1, 2, ..., n
    ∂F_1/∂r_k = (θ/R)·(−(l_k/r_k²)·Σ_{i=1}^R b_i·r_i + b_k·Σ_{i=1}^R l_i/r_i) − λ·θ·l_k/r_k² + μ_k = 0,  k = 1, 2, ..., R.  (6)

In other words, as long as we can find an allocation r = (r_1, r_2, ..., r_R)^T satisfying the above conditions simultaneously, we can take it as the optimal solution of the deadline-driven payment-minimization problem. However, it is nontrivial to do so, because the last condition (∂F_1/∂r_k = 0) cannot be solved directly. Instead, we devise a novel algorithm with polynomial time complexity O(n·R²) to allocate resources, which can be proved to satisfy the KKT conditions listed above.

Our algorithm is designed based on the following discovery: if we do not consider the limit of resource capacities (i.e., condition (4)), the problem can be solved directly using the Lagrangian multiplier method. In what follows, we first derive the optimal solution to the problem with unbounded capacities (i.e., without condition (4)) in Theorem 1. And
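The cost model in (1) and (2) is simple enough to sketch directly. Below is a minimal Python illustration (our own code, not the paper's; the function names are assumptions):

```python
def exec_time(l, r, theta=1.0):
    """Execution-time model (1): T = theta * sum_k l_k / r_k."""
    return theta * sum(lk / rk for lk, rk in zip(l, r))

def payment(b, l, r, theta=1.0):
    """Pay-by-reserve payment (2): P = (1/R) * (b . r) * T(r),
    i.e., the mean reserved price times the modeled execution time."""
    mean_price = sum(bk * rk for bk, rk in zip(b, r)) / len(r)
    return mean_price * exec_time(l, r, theta)

# The simple case b = l = (1, 1)^T: P is invariant to uniformly
# scaling r (the price factor grows exactly as fast as T shrinks),
# and the balanced direction r1 = r2 gives the smallest payment.
assert abs(payment([1, 1], [1, 1], [1, 1]) - payment([1, 1], [1, 1], [3, 3])) < 1e-9
assert payment([1, 1], [1, 1], [1, 1]) < payment([1, 1], [1, 1], [1, 2])
```

Because P depends only on the direction of r, the deadline constraint (3) is what pins down the allocation's scale, which is consistent with the infinitely many optimal stationary points discussed in the Remark of Section 3.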

then, we will describe our algorithm, which recursively uses Theorem 1 to search for the resource allocation case that satisfies the whole KKT condition (6), in polynomial time.

Theorem 1. For a specific task t_i, in order to minimize P(r(t_i)) subject to constraint (3), the optimal resource vector r^(*)(t_i) is given by (7), where k = 1, 2, ..., R. (Note that r^(*)(t_i) is not subject to Inequality (4), unlike the notation r*(t_i), which takes this inequality into account.)

    r_k^(*)(t_i) = (θ/D) · (Σ_{j=1}^R √(l_j·b_j)) · √(l_k/b_k).  (7)

Proof. As mentioned previously, the target function is convex; thus, there must exist a minimal extreme point. In order to simplify the target function (i.e., (2)), we fix the task's execution time to be T (≤ D), which also satisfies the problem's conditions. Then, the target function can be converted to

    P(r) = (T/R) · Σ_{k=1}^R b_k·r_k,  where T ≤ D.  (8)

The corresponding Lagrangian function is shown below:

    F_2(r) = (T/R) · Σ_{k=1}^R b_k·r_k + λ·(θ·Σ_{k=1}^R l_k/r_k − D).  (9)

Based on the Lagrangian multiplier method, ∂F_2/∂r_k = 0 (where k = 1, 2, ..., R) constructs a set of necessary conditions for the optimal solution; i.e., (10) must hold, where λ is a constant:

    λ·θ·R/T = b_k·r_k²/l_k.  (10)

According to (10), we can easily get (11), ∀j, k (1 ≤ j ≠ k ≤ R):

    r_k²·b_k/l_k = r_j²·b_j/l_j.  (11)

That is, (12) is the sufficient and necessary condition of the optimal solution, subject to a given deadline:

    r_1 : r_2 : ... : r_R = √(l_1/b_1) : √(l_2/b_2) : ... : √(l_R/b_R).  (12)

In order to save the resources utilized by the current task as much as possible, the optimal allocation should make θ·Σ_{i=1}^R l_i/r_i equal to D. In fact, for any resource allocation r^(*)(t_i) meeting (12) while θ·Σ_{i=1}^R l_i/r_i < D, there must exist another solution with lower resource allocation r^(*)(t_i)′ (i.e., r′(t_i) ≤ r(t_i)) that also satisfies (12). Hence, task t_i's optimal resource allocation should make θ·Σ_{k=1}^R l_k/r_k = D; then, by combining this equation with (12), we can calculate the optimal resource vector to be allocated as (7). □

Remark. With unbounded resource availabilities, there will be no constraint on the problem of minimizing the target function P(r). Based on the above analysis, there are an infinite number of optimal stationary points, whose sufficient and necessary condition is (12). For a vivid illustration, we show the graph of a simple case in Fig. 2, where b = (1, 1)^T and l = (1, 1)^T. From this figure, we can observe that the minimal extreme points exist, and their number is infinite, along the line {r_1 = r_2 and P(r) = 4}. This result is consistent with (12).

Fig. 2. The function graph of a simple case.

Formula (7) presents the resource share vector r^(*) gained by t_i such that its payment and its resource utilization are both minimized within its execution deadline (i.e., (3)). Considering constraint (4), r^(*) is exactly the optimal solution as long as r^(*) ≤ a(p_s). However, if r^(*) does not fully satisfy constraint (4) (i.e., ∃k: r_k^(*) > a_k(p_s)), then r^(*) is not a feasible solution. As one contribution, we propose an efficient algorithm (Algorithm 1) to determine the optimal solution subject to constraint (4), with provable time complexity O(n·R²).

Definition 1. For any task t_i, based on a subset φ (⊆ Φ), CO-STEP(φ, C) is defined as the procedure of computing the optimal solution minimizing P(r_φ(t_i)) subject to constraint (13) by using convex optimization (similar to the proof of Theorem 1), where C denotes a deadline and r_φ*(t_i) (= (r_1, r_2, ..., r_|φ|)^T) denotes the resource shares gained by t_i on the execution dimension set φ:

    θ·Σ_{i∈φ} l_i/r_i ≤ C.  (13)

We devise Algorithm 1 for minimizing P(r(t_i)) subject to constraints (3) and (4), as shown below.

Algorithm 1. OPTIMAL ALLOCATION ALGORITHM
Input: D(t_i); Output: execution node p_s, r*(t_i)
1: for (each candidate node p_s) do
2:   φ = Φ; C = D(t_i); r* = ∅ (empty set);
3:   repeat
4:     r_φ^(*)(t_i, p_s) = CO-STEP(φ, C); /* compute optimal r on φ */
5:     ψ = {d_k | d_k ∈ φ & r_k(t_i, p_s) > a_k(p_s)}; /* select elements violating constraint (4) */
6:     φ = φ \ ψ; /* φ takes away ψ */
7:     C = C − θ·Σ_{d_k∈ψ} l_k/a_k; /* update C */
8:     r*(t_i, p_s) = r*(t_i, p_s) ∪ {r_k = a_k(p_s) | d_k ∈ ψ & a_k(p_s) is d_k's upper bound};
9:   until (ψ = ∅);
10:  r*(t_i, p_s) = r*(t_i, p_s) ∪ r_φ^(*)(t_i, p_s);
11: end for
12: Select the smallest P(t_i) by traversing the candidate solution set;
13: Output the selected node p_s and resource allocation r*(t_i, p_s);
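To make the control flow of Algorithm 1 concrete, here is a small runnable Python sketch of lines 2-10 for a single node (our own illustrative code, not the paper's implementation; `co_step` applies the closed form of Theorem 1, generalized under the same derivation to a dimension subset with residual deadline C, and all names are ours):

```python
from math import sqrt

def co_step(l, b, dims, C, theta=1.0):
    """Closed form of Theorem 1 restricted to dimension set `dims`
    with deadline C: r_k = (theta/C) * sum_j sqrt(l_j*b_j) * sqrt(l_k/b_k)."""
    s = sum(sqrt(l[j] * b[j]) for j in dims)
    return {k: (theta / C) * s * sqrt(l[k] / b[k]) for k in dims}

def allocate_on_node(l, b, a, D, theta=1.0):
    """Sketch of lines 2-10 of Algorithm 1 on one node: solve on the free
    dimensions, fix any share above capacity a_k at a_k, shrink the
    residual deadline, and repeat until no share violates (4)."""
    R = len(l)
    phi = set(range(R))          # dimensions not yet fixed
    r = [0.0] * R
    C = D
    while phi:
        if C <= 0:               # infeasible within D: fall back to capacity
            for k in phi:
                r[k] = a[k]
            break
        opt = co_step(l, b, phi, C, theta)
        psi = {k for k in phi if opt[k] > a[k]}   # violators of (4)
        if not psi:
            for k in phi:
                r[k] = opt[k]
            break
        for k in psi:
            r[k] = a[k]          # clip violators to their upper bound
        C -= theta * sum(l[k] / a[k] for k in psi)  # update residual deadline
        phi -= psi
    return r
```

For example, with l = (1, 1), b = (1, 1), D = 2, and ample capacity a = (10, 10), the first CO-STEP already yields r = (1, 1) with θ·Σ l_k/r_k = D; with a = (0.8, 10), dimension 1 is clipped to 0.8 and dimension 2 is re-solved against the residual deadline 0.75, so the total modeled time still equals D.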

In this algorithm, line 4 executes CO-STEP(φ, C) in order to find the optimal r_φ^(*)(t_i, p_s), under the assumption that constraint (4) is absent. If r_φ^(*)(t_i, p_s) completely satisfies constraint (4) (i.e., ψ = ∅), then r_φ^(*)(t_i, p_s) is the local optimal resource allocation for t_i to run on p_s; otherwise, each resource share (r_k(t_i), where k = 1, 2, ..., R) that violates constraint (4) is set equal to its upper bound (i.e., a_k(p_s)), the corresponding execution dimensions (i.e., ψ) are taken away from φ, and then C = C − θ·Σ_{d_k∈ψ} l_k/a_k for the remaining dimensions. This process continues until the computed optimal resource shares on the remaining dimensions satisfy constraint (4). Since the time complexity of CO-STEP(φ, C) is O(|φ|), the number of computation steps of lines 2-10 of Algorithm 1 in the worst case is Σ_{i=0}^{R−1}(R − i); thus, the total time complexity of Algorithm 1 is O(n·R²).

Based on Algorithm 1, it is obvious that the local optimal resource allocation for t_i to be executed on a specified node p_s is the most crucial part. In fact, the final output resource allocation of the whole algorithm will be globally optimal across the whole system as long as each local process on a specified node (lines 2-10) can be proved to be an optimal resource allocation. Consequently, we focus on the local divisible-resource allocation for a particular execution node in the following text.

Theorem 2. Given a submitted task t_i with its load vector l(t_i) and a deadline D(t_i), and a particular node p_s with its resource price vector b(p_s), the output after running lines 2-10 of Algorithm 1 (i.e., r*(t_i, p_s)) is optimal for minimizing t_i's payment (i.e., P(r(t_i))), subject to constraints (3) and (4).

Main idea. We will prove that r*(t_i, p_s) satisfies the KKT conditions (i.e., (6)).

Proof. At the beginning, the algorithm executes CO-STEP(Φ, D(t_i)), and the output is denoted r_Φ^(*). Since r_Φ^(*) is derived from Definition 1 and Theorem 1, r_Φ^(*) must satisfy (12) and θ·Σ_{i=1}^R l_i/r_i = D; then, if we let μ_k = 0 for every k, there must exist an assignment such that all the conditions in (6) hold except possibly r_k ≤ a_k. Accordingly, r* = r_Φ^(*) as long as r_k^(*) ≤ a_k for all r_k^(*)'s in r_Φ^(*).

If r_Φ^(*) cannot satisfy all the R inequalities (r_k ≤ a_k, where k = 1, 2, ..., R), we need to further adjust the solution r_Φ^(*) to find the one completely satisfying condition (6). In Algorithm 1, at this moment, all the r_k^(*)'s such that r_k^(*) > a_k will be selected and set to a_k. Without loss of generality, assume there are h_1 such resource shares, denoted r_1, r_2, ..., r_{h_1}. Obviously, each selected r_k must satisfy μ_k·(r_k − a_k) = 0 because r_k = a_k. On the other hand, Algorithm 1 will continue to execute CO-STEP(φ, C) on the remaining R − h_1 dimensions, where C = D(t_i) − θ·Σ_{k=1}^{h_1} l_k/r_k. Likewise, all the R − h_1 new resource shares (each denoted r_k, k = h_1+1, ..., R) must also satisfy

    r_{h_1+1} : r_{h_1+2} : ... : r_R = √(l_{h_1+1}/b_{h_1+1}) : √(l_{h_1+2}/b_{h_1+2}) : ... : √(l_R/b_R),

and θ·Σ_{i=1}^R l_i/r_i = D; thus, if each of them meets the condition r_k ≤ a_k, the R − h_1 new resource shares and the previously selected h_1 will together compose the solution satisfying condition (6). If there are still h_2 (0 < h_2 ≤ R − h_1) new resource shares violating r_k ≤ a_k in this round, Algorithm 1 will continue the adjustment until the Hth round, such that either all of the R − Σ_{i=1}^H h_i remaining resource shares satisfy r_k ≤ a_k, or there are no remaining resource dimensions in φ. In the former case, we can easily verify that all R resource shares satisfy condition (6) simultaneously, composing an optimal solution; in the latter case, we can conclude that θ·Σ_{i=1}^R l_i/a_i ≥ D, so there does not exist a feasible resource allocation to run the task within the specified deadline. In this situation, r* = a = (a_1, a_2, ..., a_R)^T gets the execution time closest to the deadline, and it serves as the final solution. □

Although Algorithm 1 is proved optimal for minimizing the payment cost within the user-defined deadline of a task, the deadline still may not be guaranteed, due to two factors: bounded available resources, or inaccurate workload vector information about the task. We propose the following lemma, which provides a necessary and sufficient condition for guaranteeing the task's deadline given accurate prediction and relatively sufficient resources. In the next section, we discuss how to guarantee the task's deadline when performing Algorithm 1 even with an inaccurate workload vector.

Lemma 1. Given a task t_i's workload vector l(t_i) = (l_1, l_2, ..., l_R)^T, its deadline D(t_i), and a candidate execution node p_s, t_i can be executed within D(t_i) if and only if (i.e., ⇔) Inequality (14) holds:

    θ·Σ_{j=1}^R l_j(t_i)/a_j(p_s) ≤ D(t_i).  (14)

Proof. To prove ⇐: If Inequality (14) holds, there obviously exists a viable resource allocation r(t_i) (≤ a(p_s)), e.g., r(t_i) = a(p_s), such that θ·Σ_{j=1}^R l_j(t_i)/r_j(t_i) ≤ D(t_i). Hence, t_i can be executed within D(t_i).

To prove ⇒: If t_i can be executed within D(t_i), there must exist a viable resource allocation r(t_i) such that θ·Σ_{j=1}^R l_j(t_i)/r_j(t_i) ≤ D(t_i) and r(t_i) ≤ a(p_s). Assume Inequality (14) does not hold, i.e., θ·Σ_{j=1}^R l_j(t_i)/a_j(p_s) > D(t_i); then, we can derive

    θ·Σ_{j=1}^R l_j(t_i)/r_j(t_i) < θ·Σ_{j=1}^R l_j(t_i)/a_j(p_s).  (15)

Accordingly, we can derive that there must exist a dimension, say d_k, such that r_k(t_i) > a_k(p_s), which contradicts the previous assumption that r(t_i) is a viable solution (r(t_i) ≤ a(p_s)). □

4 OPTIMALITY ANALYSIS WITH INACCURATE INFORMATION

In this section, we focus on the following question: what is the final upper bound of task execution length, as compared to the predefined deadline D, when running the task using the resource vector allocated by Algorithm 1 with inaccurately predicted workload information?
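This question can be previewed numerically. The toy sketch below (our own hypothetical numbers, not the paper's experiment) allocates with the unbounded-capacity closed form (7) computed from a skewed workload vector l′ and then measures the true execution time:

```python
from math import sqrt

def allocate_unbounded(l_pred, b, D, theta=1.0):
    # Closed form (7), computed from the *predicted* workload vector.
    s = sum(sqrt(lj * bj) for lj, bj in zip(l_pred, b))
    return [(theta / D) * s * sqrt(lk / bk) for lk, bk in zip(l_pred, b)]

l_true = [4.0, 1.0]    # real workload (hypothetical toy numbers)
l_pred = [2.0, 2.0]    # skewed prediction: ratios l'/l lie in [0.5, 2]
b = [1.0, 1.0]
D = 1.0

r = allocate_unbounded(l_pred, b, D)
T = sum(lk / rk for lk, rk in zip(l_true, r))   # true time, theta = 1
alpha, beta = 0.5, 2.0
# T = 1.25: the deadline D = 1 is overshot, yet T stays within
# [D/beta, D/alpha] = [0.5, 2], the range bounded below in Section 4.
assert D / beta <= T <= D / alpha
```

Bounds of exactly this [D/β, D/α] form are what the analysis below derives, and they motivate feeding the algorithm a stricter input deadline than the user's original one.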

ðÞ
4.1 Problem Description

Although Algorithm 1's output is proved optimal, this result relies on a strong condition, i.e., an accurate task workload vector. That is, each user needs to precisely predict the execution property (i.e., the workload ratio) of his/her task before constructing the resource allocation that minimizes the payment for its execution under a user-specified deadline. In some cases, the execution property can easily be estimated accurately. For instance, we can determine the workload ratio between the data to be read/written from/to disk and the data to be downloaded/uploaded via network by comparing their data sizes. In many other cases, however, the execution property cannot be accurately estimated, such as for computation-intensive applications whose execution times highly depend on the CPU cycles to be consumed.

Definition 2. Suppose a task t_i's real workload vector is l(t_i), while the workload vector used by our algorithm is l'(t_i), subject to Inequality (16), where α and β are the lower and upper bounds of the estimation ratio, specified by the user based on experience or on particular workload prediction methods such as [13], [14], [15]:

    α ≤ l'_k(t_i) / l_k(t_i) ≤ β,   k = 1, 2, ..., R.    (16)

To illustrate the above definition, an example is given. Assume that task t_i's real workload ratios range in [0.125, 1], and that the workload vector l'(t_i) used by Algorithm 1 is set based on the task's historical execution records. Suppose each element l'_k(t_i) (k = 1, 2, ..., R) is set to 0.25 if the corresponding true workload fluctuates in [0.125, 0.5], and to 0.75 if the true workload ranges within (0.5, 1]. Then we get Inequality (17) below, where α = 0.25/0.5 = 0.5 and β = 0.25/0.125 = 2:

    0.5 ≤ l'_k(t_i) / l_k(t_i) ≤ 2,   k = 1, 2, ..., R.    (17)
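The bounds in the example above can be checked numerically. The sketch below (an illustration added for this discussion, not from the original experiments) sweeps true workload ratios over [0.125, 1], applies the two-level rounding rule, and recovers α and β:

```python
# Two-level rounding rule from the example above (illustrative only):
# predict 0.25 when the true ratio lies in [0.125, 0.5],
# and 0.75 when it lies in (0.5, 1].
def predicted(l):
    return 0.25 if l <= 0.5 else 0.75

# Sweep true workload ratios over [0.125, 1] in steps of 0.001.
ratios = [predicted(k / 1000) / (k / 1000) for k in range(125, 1001)]

alpha, beta = min(ratios), max(ratios)
print(alpha, beta)  # -> 0.5 2.0
```

The worst underestimation happens at l_k = 0.5 (predicted 0.25) and the worst overestimation at l_k = 0.125 (also predicted 0.25), matching the α and β derived above.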
Using the inaccurate prediction l'(t_i) to perform Algorithm 1, it is obvious that t_i's real execution time may surpass the expected execution deadline D(t_i). Hence, one question is how bad the worst-case performance will be when using l'(t_i) instead of l(t_i), compared to the expected deadline D(t_i).

4.2 Deadline Extension Ratio (DER) with Skewed Estimation of Execution Property

For simplicity of description, we denote r_E (= (r_E1, r_E2, ..., r_ER)^T) and T_E (= Σ_{k=1}^{R} l_k / r_Ek) as the output of Algorithm 1 with the skewed workload prediction and the corresponding execution time, respectively (E here implies "Estimation with error"). Similarly, we denote r_I (= (r_I1, r_I2, ..., r_IR)^T) and T_I (= D = Σ_{k=1}^{R} l_k / r_Ik) as the output with the real workload vector and the corresponding execution time, respectively (I here indicates "Ideal case"). Hence, our objective is to determine the upper bound of T_E / T_I, a.k.a. the deadline extension ratio.

We partition the situations that Algorithm 1 would face into two categories, where r_E^(*) refers to the optimal resource allocation under the constraint (4) (unlike the notation r_E):

- r_E^(*)(t_i) = r_E(t_i).
- r_E^(*)(t_i) ≠ r_E(t_i).

The first situation indicates that, under the skewed estimation of workload ratios, all the resource shares calculated by the initial CO-STEP in Algorithm 1 are still no greater than the corresponding capacities. That is, it is equivalent to the situation under the assumption that Inequality (18) holds:

    r_E^(*)(t_i) ≤ a(p_s).    (18)

In contrast, the second situation means that the initial CO-STEP cannot fulfill the above condition, and the optimal allocation cannot be found without a few more adjustment steps (lines 5-8 of Algorithm 1).

In the following, we first derive task t_i's execution time upper bound for the first category (Theorem 3), and then discuss the upper bound for the more generic case including the second category (Theorem 4).

Theorem 3. Given a submitted task t_i with a predefined deadline D(t_i), a candidate execution node p_s with unbounded resource capacity and a resource price vector (denoted b(p_s)), and a skewed workload vector l'(t_i) subject to Inequality (16), the execution time under the resource allocation r_E^(*) must satisfy Inequality (19):

    (1/β) D(t_i) ≤ T_E(t_i) ≤ (1/α) D(t_i).    (19)

Proof.

    T_E^(*) = Σ_{k=1}^{R} l_k / r_Ek^(*)
            = Σ_{k=1}^{R} l_k / [ (Σ_{i=1}^{R} √(l'_i b_i) / D) · √(l'_k / b_k) ]
            = D · Σ_{k=1}^{R} (l_k / l'_k) √(l'_k b_k) / Σ_{i=1}^{R} √(l'_i b_i)
            ≤ (D/α) · Σ_{k=1}^{R} √(l'_k b_k) / Σ_{i=1}^{R} √(l'_i b_i)
            = D/α.

The key of the above step is Inequality (16), which implies l_k / l'_k ≤ 1/α. Similarly, according to Inequality (16), we can also derive T_E^(*) ≥ D/β. Accordingly, Inequality (19) holds. □

It is easy to see that Inequality (19)'s bound is tight. Consider the case ∀k: l'_k(t_i) = α · l_k(t_i); then T_E^(*) is exactly D/α.
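Theorem 3's bound can be verified numerically. The sketch below (an illustrative check with made-up workloads and prices, not the authors' code) computes the closed-form CO-STEP allocation from the proof above using a skewed workload vector l', then measures the real execution time under the true vector l and confirms D/β ≤ T_E ≤ D/α:

```python
import math
import random

def co_step(l_pred, b, deadline):
    # Closed-form shares for the unbounded-capacity case, as read from the
    # proof of Theorem 3: r_k = (sum_i sqrt(l'_i b_i) / D) * sqrt(l'_k / b_k).
    s = sum(math.sqrt(lp * bk) for lp, bk in zip(l_pred, b))
    return [s / deadline * math.sqrt(lp / bk) for lp, bk in zip(l_pred, b)]

random.seed(1)
R, D = 4, 10.0
alpha, beta = 0.5, 2.0
l_true = [random.uniform(1, 5) for _ in range(R)]   # true workload vector l
b = [random.uniform(1, 3) for _ in range(R)]        # resource price vector
# Skew each dimension by a factor in [alpha, beta], so Inequality (16) holds.
l_pred = [lk * random.uniform(alpha, beta) for lk in l_true]

r = co_step(l_pred, b, D)                           # allocate using skewed l'
T_E = sum(lk / rk for lk, rk in zip(l_true, r))     # real execution time
assert D / beta <= T_E <= D / alpha                 # Inequality (19)
```

Note that when l_pred equals l_true, the computed T_E collapses to exactly D, which is the ideal case T_I = D described above.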
Theorem 4. Given a submitted task t_i with a predefined deadline D(t_i), a candidate execution node p_s with a limited available resource vector a(p_s) and price vector b(p_s), and a skewed workload vector l'(t_i) subject to Inequality (16), if Inequality (14) holds, then under the resource allocation r_E the execution time must conform to

    (1/β) D(t_i) ≤ T_E(t_i) ≤ (1/α) D(t_i).    (20)

Proof. Without loss of generality, we denote Θ as the set of resource dimensions accumulated by line 5 of Algorithm 1, and let the corresponding dimensions' indexes be 1, 2, ..., |Θ|. That is, r_1 = a_1, r_2 = a_2, ..., r_|Θ| = a_|Θ|, while r_{|Θ|+1} < a_{|Θ|+1}, ..., r_R < a_R. Hence, we can get the following equation:
    T_E = Σ_{i=1}^{|Θ|} l_i / a_i + Σ_{i=|Θ|+1}^{R} l_i / r_Ei.    (21)

We can further prove Inequality (20) as follows:

    T_E = Σ_{i=1}^{|Θ|} l_i / a_i + Σ_{k=|Θ|+1}^{R} l_k / r_Ek
        = Σ_{i=1}^{|Θ|} l_i / a_i + (D − Σ_{i=1}^{|Θ|} l'_i / a_i) · Σ_{k=|Θ|+1}^{R} l_k √(b_k / l'_k) / Σ_{i=|Θ|+1}^{R} √(l'_i b_i)
        ≤ Σ_{i=1}^{|Θ|} l_i / a_i + (1/α)(D − Σ_{i=1}^{|Θ|} l'_i / a_i) · Σ_{k=|Θ|+1}^{R} √(l'_k b_k) / Σ_{i=|Θ|+1}^{R} √(l'_i b_i)
        = Σ_{i=1}^{|Θ|} l_i / a_i + (1/α)(D − Σ_{i=1}^{|Θ|} l'_i / a_i)
        ≤ Σ_{i=1}^{|Θ|} l_i / a_i + (1/α)(D − α Σ_{i=1}^{|Θ|} l_i / a_i)
        = D/α + Σ_{i=1}^{|Θ|} l_i / a_i − Σ_{i=1}^{|Θ|} l_i / a_i
        = D/α,

where the first inequality uses l_k / l'_k ≤ 1/α and the second uses l'_i ≥ α · l_i, both from Inequality (16). Similarly, according to Inequality (16), we can also derive T_E ≥ D/β. Hence, Inequality (20) holds. □

When Θ is empty, the lower and upper bounds of Inequality (20) are reached when the upper and lower bounds of Inequality (16) are met, respectively.
Remark. Let us review Theorem 4 and discuss its significance. Inequality (20) implies that task t_i's execution time, based on the optimal resource allocation of Algorithm 1 under inaccurate workload ratios, has an upper bound that is determined only by the lower bound α of the inaccuracy ratio. In principle, by leveraging this theoretical result, we can always provide a strict guarantee for the user-preset deadline even with a wrong prediction of the task's property, as long as the resources are relatively sufficient. In fact, all we need to do is set a stricter deadline D' according to (22) and perform Algorithm 1 based on D' instead of D. Then the user task's execution time will be strictly limited under its expected deadline D even though the workload ratio information is inaccurate (s.t. Inequality (16)). On the other hand, existing workload prediction work can be used to determine the value of α. For example, the polynomial regression method [16] can bound the prediction error within 10 percent (i.e., an error ratio of 0.1):

    D' = α · D.    (22)
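The deadline-tightening trick of the remark can be illustrated with the closed-form CO-STEP allocation from the proof of Theorem 3 (again a hypothetical numeric sketch with made-up values): running the allocation with D' = α · D instead of D keeps the real execution time under the original deadline D despite the skewed workload vector.

```python
import math
import random

def co_step(l_pred, b, deadline):
    # r_k = (sum_i sqrt(l'_i b_i) / deadline) * sqrt(l'_k / b_k)
    s = sum(math.sqrt(lp * bk) for lp, bk in zip(l_pred, b))
    return [s / deadline * math.sqrt(lp / bk) for lp, bk in zip(l_pred, b)]

random.seed(7)
R, D = 5, 20.0
alpha, beta = 0.9, 1.1              # e.g., a 10% prediction-error bound
l_true = [random.uniform(1, 4) for _ in range(R)]
b = [random.uniform(1, 3) for _ in range(R)]
l_pred = [lk * random.uniform(alpha, beta) for lk in l_true]

r = co_step(l_pred, b, alpha * D)   # run with the stricter deadline D' = alpha*D
T_E = sum(lk / rk for lk, rk in zip(l_true, r))
assert T_E <= D                     # the original deadline D is still met
```

This follows directly from Theorem 3: with deadline D', the real execution time is bounded by D'/α = (α · D)/α = D.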
5 PERFORMANCE EVALUATION

5.1 Experimental Setting

We implement a web service-based prototype that can compute a set of combined matrix operations. Each matrix operation is called by some user task through a web service API, and each task is executed in a VM container. Our algorithm is evaluated on such a real cluster environment. There are 10 physical nodes in the cluster, each owning two quad-core Xeon E5540 CPUs (i.e., eight processors per node) and 16 GB of memory. There are 60 VM images (CentOS 5.2) kept on a Network File System (NFS), so 60 VM instances are created at bootstrap before our experiment. Xen 3.1 [17] serves as the hypervisor/VMM on each node and dynamically allocates various CPU speeds (or capabilities) to the VM instances at runtime using the credit scheduler.

Users can submit their computation requests by editing their mathematical formulas. In our experiment, we make use of ParallelColt [18] to perform math computations, each consisting of a partially ordered set of operations. ParallelColt is a library that can efficiently calculate complex matrix operations, such as matrix-matrix multiply, in parallel via multiple threads. Here is an example computation request, submitted as Solve((A_{m×n} × A_{n×m})^k, B_{m×m}). Such a computation task can be split into three steps (or subtasks) of different matrix operations: 1) matrix multiplication: C_{m×m} = A_{m×n} × A_{n×m}; 2) matrix power: D_{m×m} = C_{m×m}^k; 3) least-squares solution of D × X = B based on QR decomposition: Solve(D_{m×m}, B_{m×m}). In our benchmark, we simulate a large number of user requests, each composed of 3-15 subtasks. Each subtask is constructed from the three typical matrix operations (i.e., matrix multiply, matrix power, and QR matrix solving (least squares)) with various parameters assigned. That is, each request contains many subtasks randomly selected from the above three types. We evaluate our algorithm under different competition levels, with different numbers (1-40) of tasks submitted simultaneously; thus, there are 40 cases for each experiment, with 820 submitted tasks observed in total.

In our system, each matrix operation's workload is estimated based on historical tracing records. The workload prediction formula is shown in (23), where j denotes the number of processors and T(op_i, j) indicates the execution time of running the matrix operation (denoted op_i) on j cores:

    l_i = (1/8) Σ_{j=1}^{8} (j · T(op_i, j)).    (23)

Each user request (denoted as task t_i) is assigned a deadline, which is a random value in [(1/8) T_1(t_i), T_1(t_i)], where T_1(t_i) means the estimated execution time when running task t_i on a single core. Based on our experiment, the three matrix operations on one core cost from 1 second to 1,206 seconds, as shown in Table 1, which implies a quite heterogeneous nature. In Table 1, M, N, and P refer to the matrix scales in the matrix-matrix multiply and QR-decomposition solving, and m indicates the value of the exponent in the matrix-power computation. Users' prices for running the three individual matrix operations are set to 1, 2, and 3, respectively.

TABLE 1. Workload of Typical Matrix Operations (Seconds/Core)
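The estimate in (23) averages the core-seconds an operation consumes across the eight recorded parallelism degrees. A minimal sketch (the timing records below are made up for illustration, not taken from Table 1):

```python
# Historical records: T[j] = execution time (s) of a hypothetical
# matrix operation op_i on j cores. Values are illustrative only.
T = {1: 96.0, 2: 50.0, 3: 34.0, 4: 26.0, 5: 21.5, 6: 18.0, 7: 15.5, 8: 14.0}

# Eq. (23): l_i = (1/8) * sum_{j=1..8} j * T(op_i, j),
# i.e., the average core-seconds over all parallelism degrees.
l_i = sum(j * t for j, t in T.items()) / 8
print(l_i)  # -> 104.75
```

Multiplying each timing by its core count converts wall-clock time back into total work, so the average is robust to the degree of parallelism actually used at runtime.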
5.2 Experimental Results

We first present the prediction effect over the historical records of the three matrix operations (shown in Fig. 3), since the approximation ratio of our optimal algorithm depends on the inaccuracy of the predicted workload, according to the analysis in Section 4. From this figure, we can clearly observe that the prediction method we used ensures that the lower bound of the predicted workload (i.e., α's value, which is set close to 0.7, where α is defined in Definition 2 and used in Theorems 3 and 4) is always lower than the real workload calculated after execution.

Fig. 3. Workload prediction. (a) Lower bound. (b) Upper bound.

We evaluate our designed algorithm with and without the prediction-error-tolerance support. That is, the system tests Algorithm 1 with either the tuned stricter deadline (D') or the original one (D). We use the Deadline Extension Ratio (defined as the ratio of a task's final execution time to its deadline) to evaluate the statistical task execution lengths compared to their expected deadlines. We run 40 separate cases, each with a different number (1-40) of tasks, and show the lowest/average/highest level of DER for each case.

We first show the experimental result using the original deadline D (i.e., D' = D) in the algorithm. From Fig. 4a, we see that the tasks' execution times cannot always be guaranteed to stay within their deadlines in the worst case, no matter how many tasks (1-40) are submitted. Specifically, even when the system availability is relatively high (e.g., only several tasks are submitted), the average deadline extension ratio is nearly 1, and its highest value in the worst case is up to 1.2. This is mainly due to the inaccurate workload prediction, with about a 30 percent margin of error as shown in Fig. 3. In comparison, Fig. 4b shows the deadline extension ratio when the deadline D' is set to the stricter deadline (α · D). When the number (denoted by m) of tasks submitted scales up to 30, all tasks' execution times are kept to only about 0.7 times their preset deadlines (D), even in the worst situation (i.e., the highest level shown in the figure). With a further increasing number of submitted tasks, tasks' execution times cannot always be guaranteed because of the limited resource capacities (i.e., the higher level of competition for resources), but the mean level is still kept remarkably lower than 1, which means that most of the tasks can still meet the QoS requirement (i.e., the large majority can finish before their deadlines). Note that there are only 10 physical machines in our experiment, yet many more than 10 tasks can be processed with guaranteed deadlines, which indicates a remarkably high level of service consolidation. This also implies a great potential for improving resource utilization by taking advantage of the VM-multiplexing feature.

Fig. 4. Deadline extension ratio. (a) D' = D. (b) D' = α · D.

Fig. 5 presents the distribution of the deadline extension ratio in a competitive situation with 40 submitted tasks. We observe that the stricter-deadline-based algorithm can more effectively limit the majority of tasks' execution times to about 0.7 times the user-specified deadlines (i.e., the original ones), but it may suffer from a higher DER in the worst case. In comparison, the majority of tasks (about 52.5 percent) under the original-deadline-based algorithm are completed within 0.95-1.0 times their deadlines, which still conforms to the deadline-guarantee requirement; about 20 percent of tasks would violate their deadlines, most of which (17.5 percent) still finish within 1.05 times their deadlines.

Fig. 5. Distribution of DER (the number of tasks).

Finally, we evaluate the fairness of task processing in the two cases, confirming the stability. Based on Jain's work [19], the fairness index (a higher value means higher fairness) is defined as (24), whose value ranges in [0, 1], where x_i refers to the DER of task t_i:

    F(x) = (Σ_{i=1}^{n} x_i)^2 / (n Σ_{i=1}^{n} x_i^2).    (24)
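Jain's index in (24) takes only a few lines to compute. An illustrative sketch with made-up DER values (equal x_i give the maximum fairness of 1.0; a straggler pulls the index down):

```python
def jain_fairness(xs):
    # Eq. (24): F(x) = (sum x_i)^2 / (n * sum x_i^2), with F in (0, 1].
    n = len(xs)
    return sum(xs) ** 2 / (n * sum(x * x for x in xs))

print(jain_fairness([0.7, 0.7, 0.7, 0.7]))  # equal DERs -> maximum fairness
print(jain_fairness([0.7, 0.7, 0.7, 1.4]))  # one straggler -> below 1.0
```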
Fig. 6. Fairness index of DER.

We present the experimental results for the fairness index of the DER in Fig. 6. As observed, the fairness index is always kept over 0.99 for both cases under the relatively uncompetitive situations (e.g., m ≤ 30), and is still kept at about 0.95 in the cases with higher competition (i.e., when m > 30). Recall that only 10 physical machines are used for resource provisioning in our experiment, which confirms that our solution's allocation effect is quite stable for any task's execution under such dense server consolidation. In addition, the main reason for the degradation of the fairness of DER in the competitive situation is that the tasks with higher priorities, or the ones arriving earlier, are treated with a higher service level in our experiment, which inevitably impacts lower-priority tasks' execution in the short-supply situation. In fact, guaranteeing higher-priority tasks' QoS by sacrificing lower-priority tasks' benefit may also be considered a fairer treatment in many scenarios. Hence, for different applications, we can easily maximize the fairness level among all tasks by assigning adaptive values for α, which will be further studied in our future work.
6 RELATED WORK

Traditional job scheduling [20] is often formulated as a kind of combinatorial optimization problem (or a queue-based multiprocessor scheduling problem [21], [22], [12]), due to the nonguaranteed performance isolation among multiple tasks running on the same machines. That is, most of the existing deadline-driven task scheduling solutions (from single-cluster environments confined to a LAN [23], [24] to Grid computing environments suitable for a WAN [25], [26]) are also strictly subject to the queuing model, under which a single machine's multiple resources cannot be further split into smaller fractions at will. This eventually causes coarse-grained resource allocation, relatively low resource utilization, and suboptimal task execution efficiency.

With VM resource isolation technology having matured recently, it is viable to design more efficient resource allocation owing to the fledged performance isolation among VMs running on the same machine. Meng et al. [27] proposed a VM-multiplexing-based resource allocation approach, which can successfully analyze the compatibility of any two different VMs (each with an application running atop it) on the same physical machine, and reschedule the combination of VMs to improve the overall performance. However, it cannot guarantee high compatibility among more than two VMs on the same machine. Q-Clouds [28] is another well-known system that can realize high consolidation of multiple VM-hosted applications, focusing on how to prevent the inevitable performance interference among VMs from degrading users' QoS or increasing users' payments unexpectedly.

Compared to the above existing works on VM-multiplexing resource allocation, our work aims not only to confine tasks' execution within their deadlines, but also to minimize the payments of their users. This work will benefit and motivate many cloud users and service providers who wish to minimize infrastructure cost with guaranteed QoS, a goal already pursued by many researchers. Wu et al. [29], for example, proposed an SLA-based resource allocation method, which is compatible with the heterogeneity of the infrastructure and adaptable to the dynamic change of customer requests. It can maximize the profit of SaaS providers by minimizing the number of SLA violations, and the cost by reusing VMs. Chaisiri et al. [30] also aim to minimize the provisioning cost incurred by users, by taking stochastic programming, robust optimization, and sample-average approximation into account together. Mao et al. [31], [32] present a cloud autoscaling mechanism to automatically scale computing instances based on workload information and performance desires, also aiming to guarantee tasks' deadlines with less payment. In comparison, our approach can be fundamentally proved optimal via convex optimization theory, which we believe is a huge step forward, especially from the perspective of theoretical analysis.

Most of the existing theoretical research on cloud computing [33], [34], [35] has mainly focused on relatively ideal scenarios, assuming tasks' workloads can be accurately predicted and thereby simplifying the resource allocation problem. For example, Weinman [33] analyzed the penalty functions in workload aggregation and the relative statistical effects, given a set of fixed task workloads, while Petrucci et al. [34] proposed a VM-based optimization model to minimize power and management cost by assuming that an application's consumption can be predicted precisely by a monitoring system. Unlike these works, we theoretically analyze the upper bound of a task's execution time compared to its deadline, and that of the user's payment compared to the precise-prediction-based result. By taking advantage of the derived bounds and approximation ratio, we can more effectively guarantee user tasks' QoS in terms of their demands. To the best of our knowledge, this is the first attempt to study how to minimize the payment cost in a cloud system while also tolerating prediction errors of tasks' properties.

7 CONCLUSION AND FUTURE WORK

In this paper, we propose a novel resource allocation algorithm for cloud systems that support VM-multiplexing technology, aiming to minimize a user's payment for his/her task while endeavoring to guarantee its execution deadline. We prove that the output of our algorithm is optimal based on the KKT condition, which means any other solution would definitely incur a larger payment cost. In addition, we analyze the approximation ratio of the extended execution time generated by our algorithm to the user-expected deadline, under possibly inaccurate task property prediction. When the provisioned resources are relatively sufficient, we can guarantee that a task's execution time always stays within its deadline, even under wrong prediction of the task's workload characteristics. In the future, we plan to integrate our algorithms with stricter/original deadlines into management tools like OpenNebula, for maximizing system-wide performance. Queuing policies like earliest deadline first (EDF) will be studied to further reduce user payment, especially in short-supply situations. More complex scheduling constraints, like compatibility and security issues, will also be taken into account.
ACKNOWLEDGMENT

This research was supported by a Hong Kong RGC grant HKU 7179/09E and a Hong Kong UGC Special Equipment Grant (SEG HKU09).

REFERENCES

[1] M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R.H. Katz, A. Konwinski, G. Lee, D.A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "Above the Clouds: A Berkeley View of Cloud Computing," Technical Report UCB/EECS-2009-28, EECS Dept., Univ. California, Berkeley, Feb. 2009.
[2] L.M. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, "A Break in the Clouds: Towards a Cloud Definition," SIGCOMM Computer Comm. Rev., vol. 39, no. 1, pp. 50-55, 2009.
[3] I. Foster and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, Nov. 2003.
[4] J.E. Smith and R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann, 2005.
[5] D. Gupta, L. Cherkasova, R. Gardner, and A. Vahdat, "Enforcing Performance Isolation across Virtual Machines in Xen," Proc. ACM/IFIP/USENIX Int'l Conf. Middleware (Middleware '06), pp. 342-362, 2006.
[6] J.N. Matthews, W. Hu, M. Hapuarachchi, T. Deshane, D. Dimatos, G. Hamilton, M. McCabe, and J. Owens, "Quantifying the Performance Isolation Properties of Virtualization Systems," Proc. Workshop Experimental Computer Science (ExpCS '07), 2007.
[7] Amazon Elastic Compute Cloud, http://aws.amazon.com/ec2/, 2012.
[8] D. Milojicic, I.M. Llorente, and R.S. Montero, "OpenNebula: A Cloud Management Tool," IEEE Internet Computing, vol. 15, no. 2, pp. 11-14, Mar./Apr. 2011.
[9] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ. Press, 2009.
[10] E. Imamagic, B. Radic, and D. Dobrenic, "An Approach to Grid Scheduling by Using Condor-G Matchmaking Mechanism," Proc. 28th Int'l Conf. Information Technology Interfaces, pp. 625-632, 2006.
[11] B. Sharma, V. Chudnovsky, J.L. Hellerstein, R. Rifaat, and C.R. Das, "Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters," Proc. Second ACM Symp. Cloud Computing (SOCC '11), pp. 3:1-3:14, 2011.
[12] H. Khazaei, J.V. Misic, and V.B. Misic, "Modelling of Cloud Computing Centers Using M/G/m Queues," Proc. Int'l Conf. Distributed Computing Systems Workshops (ICDCS), pp. 87-92, 2011.
[13] Y. Wu, K. Hwang, Y. Yuan, and W. Zheng, "Adaptive Workload Prediction of Grid Performance in Confidence Windows," IEEE Trans. Parallel and Distributed Systems, vol. 21, no. 7, pp. 925-938, July 2010.
[14] S. Di, D. Kondo, and W. Cirne, "Characterization and Comparison of Cloud versus Grid Workloads," Proc. 14th Int'l Conf. Cluster Computing, pp. 230-238, 2012.
[15] Q. Zhang, J.L. Hellerstein, and R. Boutaba, "Characterizing Task Usage Shapes in Google's Compute Clusters," Proc. Large-Scale Distributed Systems and Middleware Workshop (LADIS '11), 2011.
[16] L. Huang, J. Jia, B. Yu, B.G. Chun, P. Maniatis, and M. Naik, "Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression," Proc. 24th Conf. Neural Information Processing Systems (NIPS '10), pp. 1-9, 2010.
[17] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the Art of Virtualization," Proc. 19th ACM Symp. Operating Systems Principles (SOSP '03), pp. 164-177, 2003.
[18] P. Wendykier and J.G. Nagy, "Parallel Colt: A High-Performance Java Library for Scientific Computing and Image Processing," ACM Trans. Math. Software, vol. 37, pp. 31:1-31:22, Sept. 2010.
[19] R.K. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation and Modelling. John Wiley & Sons, Apr. 1991.
[20] C. Jiang, C. Wang, X. Liu, and Y. Zhao, "A Survey of Job Scheduling in Grids," Proc. Joint Ninth Asia-Pacific Web and Eighth Int'l Conf. Web-Age Information Management Conf. Advances in Data and Web Management (APWeb/WAIM '07), pp. 419-427, 2007.
[21] P. Crescenzi and V. Kann, A Compendium of NP Optimization Problems, ftp://ftp.nada.kth.se/Theory/Viggo-Kann/compendium.pdf, 2012.
[22] O. Sinnen, Task Scheduling for Parallel Systems. Wiley-Interscience, May 2007.
[23] K. Ramamritham, J.A. Stankovic, and W. Zhao, "Distributed Scheduling of Tasks with Deadlines and Resource Requirements," IEEE Trans. Computers, vol. 38, no. 8, pp. 1110-1123, Aug. 1989.
[24] M.C. McElvany and P.D. Stotts, "Guaranteed Task Deadlines for Fault-Tolerant Workloads with Conditional Branches," Real-Time Systems, vol. 3, no. 3, pp. 275-305, 1991.
[25] L. Zhao, Y. Ren, and K. Sakurai, "A Resource Minimizing Scheduling Algorithm with Ensuring the Deadline and Reliability in Heterogeneous Systems," Proc. 25th IEEE Int'l Conf. Advanced Information Networking and Applications (AINA '11), pp. 275-282, 2011.
[26] W. Chen, A. Fekete, and Y.C. Lee, "Exploiting Deadline Flexibility in Grid Workflow Rescheduling," Proc. 11th IEEE/ACM Int'l Conf. Grid Computing (Grid '10), pp. 105-112, 2010.
[27] X. Meng, C. Isci, J. Kephart, L. Zhang, E. Bouillet, and D. Pendarakis, "Efficient Resource Provisioning in Compute Clouds via VM Multiplexing," Proc. Seventh Int'l Conf. Autonomic Computing (ICAC '10), pp. 11-20, 2010.
[28] R. Nathuji, A. Kansal, and A. Ghaffarkhah, "Q-Clouds: Managing Performance Interference Effects for QoS-Aware Clouds," Proc. European Conf. Computer Systems (EuroSys '10), pp. 237-250, 2010.
[29] L. Wu, S.K. Garg, and R. Buyya, "SLA-Based Resource Allocation for Software as a Service Provider (SaaS) in Cloud Computing Environments," Proc. 11th IEEE/ACM Int'l Symp. Cluster, Cloud and Grid Computing (CCGrid '11), pp. 195-204, 2011.
[30] S. Chaisiri, R. Kaewpuang, B.-S. Lee, and D. Niyato, "Cost Minimization for Provisioning Virtual Servers in Amazon Elastic Compute Cloud," Proc. 19th Ann. IEEE/ACM Int'l Symp. Modeling, Analysis and Simulation of Computer and Telecomm. Systems (MASCOTS '11), pp. 85-95, 2011.
[31] M. Mao, J. Li, and M. Humphrey, "Cloud Auto-Scaling with Deadline and Budget Constraints," Proc. 11th IEEE/ACM Int'l Conf. Grid Computing (Grid '10), pp. 41-48, 2010.
[32] M. Mao and M. Humphrey, "Auto-Scaling to Minimize Cost and Meet Application Deadlines in Cloud Workflows," Proc. Int'l Conf. High Performance Computing, Networking, Storage and Analysis (SC '11), pp. 49:1-49:12, 2011.
[33] J. Weinman, "Smooth Operator: The Value of Demand Aggregation," http://joeweinman.com/Resources/Joe_Weinman_Smooth_Operator_Demand_Aggregation.pdf, 2011.
[34] V. Petrucci, O. Loques, and D. Mossé, "A Dynamic Optimization Model for Power and Performance Management of Virtualized Clusters," Proc. First Int'l Conf. Energy-Efficient Computing and Networking (e-Energy '10), pp. 225-233, 2010.
[35] F. Chang, J. Ren, and R. Viswanathan, "Optimal Resource Allocation in Clouds," Proc. IEEE Int'l Conf. Cloud Computing, pp. 418-425, 2010.

Sheng Di received the MPhil degree from Huazhong University of Science and Technology in 2007 and the PhD degree from The University of Hong Kong in 2011. He is currently a postdoctoral researcher at INRIA. His research interests involve the optimization of distributed resource allocation, especially in P2P systems and large-scale cloud computing platforms. His background is mainly in fundamental theoretical analysis and practical system implementation. He is a member of the IEEE.

Cho-Li Wang received the PhD degree from the University of Southern California in 1995. His research interests include multicore computing, software systems for cluster and Grid computing, and virtualization techniques for cloud computing. He serves on the editorial boards of several international journals, including IEEE Transactions on Computers (2006-2010), Journal of Information Science and Engineering, and Multiagent and Grid Systems. He is the regional coordinator (Hong Kong) of the IEEE Technical Committee on Scalable Computing (TCSC). He is a member of the IEEE.
