
Hindawi Publishing Corporation

The Scientific World Journal


Volume 2014, Article ID 321231, 12 pages
http://dx.doi.org/10.1155/2014/321231

Research Article
Efficient Resources Provisioning Based on
Load Forecasting in Cloud

Rongdong Hu,1 Jingfei Jiang,1 Guangming Liu,1,2 and Lixin Wang1


1 School of Computer, National University of Defense Technology, Changsha 410073, China
2 National Supercomputer Center, Tianjin 300457, China

Correspondence should be addressed to Rongdong Hu; [email protected]

Received 4 November 2013; Accepted 19 December 2013; Published 20 February 2014

Academic Editors: J. Comellas, J.-X. Du, and S.-S. Liaw

Copyright © 2014 Rongdong Hu et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cloud providers should ensure QoS while maximizing resource utilization. One optimal strategy is to timely allocate resources in a fine-grained mode according to an application's actual resource demand. The necessary precondition of this strategy is obtaining future load information in advance. We propose a multi-step-ahead load forecasting method, KSwSVR, based on statistical learning theory, which is suitable for the complex and dynamic characteristics of the cloud computing environment. It integrates an improved support vector regression algorithm and a Kalman smoother. Public trace data taken from multiple types of resources were used to verify its prediction accuracy, stability, and adaptability, compared with AR, BPNN, and standard SVR. Subsequently, based on the predicted results, a simple and efficient strategy is proposed for resource provisioning. A CPU allocation experiment indicated that it can effectively reduce resource consumption while meeting service level agreement requirements.

1. Introduction

Cloud computing offers a near-infinite amount of resource capacity (e.g., CPU, memory, network I/O, and disk) at a competitive rate and allows customers to obtain resources on demand with a pay-as-you-go pricing model. Instead of incurring high upfront costs in purchasing Information Technology (IT) infrastructure and dealing with the maintenance and upgrades of both software and hardware, organizations can outsource their computational needs to the cloud. The proliferation of cloud computing has resulted in the establishment of large-scale data centers containing thousands of computing nodes and consuming enormous amounts of electrical energy.

According to previous studies in the past decade, the reason for this extremely high energy consumption is not just the quantity of computing resources and the power inefficiency of hardware but rather the inefficient usage of these resources. Data collected from more than 5000 production servers over a 6-month period have shown that although servers usually are not idle, the utilization rarely approaches 100%. Most of the time, servers operate at 10%∼50% of their full capacity, leading to extra expenses on overprovisioning and thus extra total cost of acquisition [1]. Another problem is the narrow dynamic power range of servers: even completely idle servers still consume about 70% of their peak power [2]. Therefore, keeping servers underutilized is highly inefficient from the energy consumption perspective.

Many techniques can improve energy efficiency, such as improvement of applications' algorithms, energy-efficient hardware, Dynamic Voltage and Frequency Scaling (DVFS), terminal servers, and thin clients. Cloud computing mainly leverages the capabilities of virtualization technology to address the energy inefficiency problem [3]. Virtualization allows cloud providers to create multiple virtual machine (VM) instances on a single physical server, thus improving the utilization of resources and increasing the return on investment. The reduction in energy consumption can be achieved by switching idle nodes to low-power modes (i.e., sleep or hibernation), thus eliminating the idle power consumption. Moreover, by using live migration [4], the VMs can be dynamically consolidated onto the minimal number of physical nodes according to their current resource requirements.

However, virtualization also creates a new problem. One essential requirement of a cloud computing environment is providing reliable QoS defined in terms of service level agreements (SLA). Modern applications often experience highly variable workloads causing dynamic resource usage patterns. The consolidation of VMs can lead to performance degradation when an application encounters an increasing demand resulting in an unexpected rise of resource usage. This may lead to SLA violations: increasing response times, timeouts, or failures. Overprovisioning may help to ensure the SLA, but it leads to inefficiency when the load decreases. The optimal strategy is to timely adjust resource provisioning according to the actual demands of the application. The precondition of this approach is to find out the future workload.

The focus of this work is on improving the efficiency of resource provisioning by forecasting the load of various resources in cloud. We propose a multi-step-ahead load prediction method, KSwSVR, mainly based on a statistical learning technology, support vector regression (SVR), which is suitable for the complex and dynamic characteristics of the cloud computing environment. To the best of our knowledge, it is the first time that SVR is used for load forecasting in cloud. KSwSVR integrates our improved SVR algorithm and Kalman smoothing technology. Experiments with public trace data have shown that, in comparison with AutoRegressive (AR), Back-Propagation Neural Network (BPNN), and standard SVR, KSwSVR always has the minimum prediction error. Furthermore, KSwSVR is very stable; that is, its prediction error increases quite slowly when the predicted steps increase. We also verified the broad adaptability of KSwSVR with real trace data of various resources related to network, CPU, memory, and storage systems. Based on the predicted results, a simple and efficient strategy is proposed for resource provisioning, considering the variations of prediction error and SLA levels. Finally, the usefulness of this method is demonstrated in a CPU allocation experiment. With the assistance of KSwSVR, the dynamic provisioning strategy can save 17.20%∼48.12% of CPU capacity under different SLA levels, compared with static provisioning.

The rest of this paper is organized as follows. Section 2 describes the background and motivation. Section 3 discusses the detailed design and implementation of the proposed approach, and Section 4 presents the experimental evaluation. Section 5 examines the related work and Section 6 concludes.

2. Background and Motivation

2.1. Definitions

2.1.1. Load. The object processed by the entity. For different entities, load refers to different objects, such as user requests of a web server, computing tasks of a CPU, read/write requests of a storage system, and I/O requests of an external device.

2.1.2. Utilization/Usage. The amount of system resources used by applications, usually expressed as the ratio of the used part to the total resource. Efficient utilization/usage means that applications use the resources allocated to them as fully as possible.

2.1.3. QoS. Quality of service refers to a certain level of performance and availability of a service. It also covers other aspects which are outside the scope of this paper, such as security and reliability.

2.1.4. SLA. Service level agreements: a series of goals obtained through negotiation between service providers and customers. Its purpose is to achieve and maintain a specific QoS. Typical parameters of a cloud computing SLA include CPU and memory capacity, resource expansion speed and permissions, resource availability and reliability, application response time, communication delay, and security and privacy. It also defines the penalties that should be imposed when someone violates the relevant terms.

2.1.5. VM Size. The quantity of each resource of a virtual machine, such as CPU, memory, storage, and bandwidth. It is usually a multidimensional variable, for example, "2 core ∗ 3 GHz CPU, 2 G memory, 30 G disk, and 10 M bandwidth".

2.2. Load in Cloud. With the proliferation of private and public cloud data centers, it is quite common today to lease virtual machines to host applications instead of physical machines. Cloud users typically pay for a statically configured VM size, irrespective of the actual resource consumption of the application (e.g., Amazon EC2). This charging mode is obviously unreasonable, especially for applications with variable load. It is usually difficult for cloud users to figure out which size of VM is suitable for their applications, as their loads are rarely constant. They certainly do not like to pay for the resources they hold but do not use when load is light. Furthermore, they have to face the risk of performance degradation when the load is heavy. In addition, cloud providers, such as Amazon EC2, provide resources on a VM basis. VMs are added, released, or migrated according to the variation of load. Each process involves significant overhead but does not bring any actual benefit. It would be highly desirable for cloud providers to offer dynamically finer-grained online scalable services that allocate resources according to the application's demand; that could encourage the customer to pay a higher price for better service compared to paying a flat fee based on the size of their VMs. Moreover, cloud providers gain the flexibility to dynamically optimize the resource allocation, improve resource utilization, and achieve maximum benefit.

Timely and dynamic fine-grained online scalability will greatly increase the pressure on the management system to rapidly detect and resolve SLA violation problems. Typically, problem detection is done by specifying threshold tests. Examples include "Do ping response times exceed 0.5 s?" and "Are HTTP operations greater than 12 per second on server?" Unfortunately, once the detection occurs, there is often little time to take corrective actions. In addition, if the load changes dramatically, there will be frequent SLA violations. It is desirable that the resources can be acquired earlier than the time when the load actually increases. We need a predictive

solution instead of a reactive strategy. This outcome would be possible only if the future load can be predicted. According to the predicted value, we can prepare for retrieving upcoming idle resources, providing them to other users or converting them to energy-saving mode in advance, or we can add resources for the upcoming peak load in advance to ensure a stable QoS.

Figure 1: Load of Twitter on Obama's inauguration day (tweets per minute, 11:00 am to 1:00 pm, today versus one week earlier).

However, load forecasting is difficult in the cloud computing environment for the following reasons. First, most modern applications have fluctuant loads which lead to complex behaviors in resource usage as their intensity and composition change over time. For example, Figure 1 depicts a real-world scenario wherein Twitter experienced dramatic load fluctuations on Obama's inauguration day [5]. Such a load is very typical in modern commercial websites, and load forecasting for such an application is not easy. Second, for security and privacy, cloud service providers are usually forbidden to access the internal details of the application. So, the cloud management system cannot take advantage of the application's internal characteristics (e.g., a loop in code indicates that resource usage will exhibit periodic similarity) to forecast load. For example, Niehorster et al. used sensors to read the behavior of the application [6]. It is infeasible in most cases. Third, unlike the traditional computing environment, in cloud, the external environment which the applications face is dynamic. Interference among applications hosted on the same physical machine leads to complex resource usage behaviors as they compete for various types of resources which are hard to strictly partition. For instance, in an exclusive nonvirtualized environment, an application with constant workload should have relatively stable resource demand. But in cloud, where cohosted applications compete for the shared last-level cache or disk I/O bandwidth, the usage of resources that can be strictly partitioned and allocated (e.g., CPU or memory) will likely fluctuate.

2.3. Support Vector Machine. Modern cloud computing datacenters are composed of heterogeneous and distributed components, making them difficult to manage piecewise, let alone as a whole. Furthermore, the scale, complexity, and growth rate of these systems render any heuristic and rule-based system management approach insufficient. It is also infeasible to forecast load by modeling the behaviors of the various applications and their relationships to each other. In response to these challenges, statistics-based techniques for building gray- or black-box models of application load can better guide resource provisioning decisions in cloud. Our study treats load forecasting as a time series prediction problem and makes use of a statistical learning method—the support vector machine (SVM).

SVM was developed by Vapnik and used for many machine learning tasks such as pattern recognition, object classification, regression analysis, and time series prediction [7]. It is based on the structural risk minimization (SRM) principle, which tries to control the model complexity as well as the upper bound of the generalization risk. The principle is based on the fact that the generalization error is bounded by the sum of the empirical error and a confidence interval term that depends on the Vapnik-Chervonenkis (VC) dimension. On the contrary, traditional regression techniques, including traditional artificial neural networks (ANN) [8], are based on the empirical risk minimization (ERM) principle, which tries to minimize the training error only. Furthermore, the learning process of an ANN is quite complex and inefficient for modeling, and the choices of model structures and parameters lack rigorous theoretical guidance. So, it may suffer from overfitting or underfitting with ill-chosen parameters. In contrast, SVM has a strict theoretical and mathematical foundation that does not suffer from local optima or the curse of dimensionality. It can achieve higher generalization performance, especially for small sample sets. It has a limited number of parameters to choose for modeling, and fast and memory-efficient algorithms exist.

SVR is the methodology by which a function is estimated using observed data, which in turn train the SVM. Its goal is to construct a hyperplane that lies close to as many of the data points as possible. This goal is achieved by arriving at the flattest function which ensures that the error does not exceed a threshold ε. Flatness is defined in terms of minimum norm, whereas the error threshold is introduced as a constraint. Slack variables are introduced to deal with the situations where the above definition, followed in the strict sense, leads to an infeasible solution.

When performing time series prediction by SVM, each input vector x_i is defined as a finite set of consecutive measurements of the series. The output vector y_i contains the x_{n+1} observation, where n determines the amount of history data. Each combination (x_i, y_i) constitutes a training point. There are N such training points used for fitting the SVR. SVM is a linear learning machine. The linear function is formulated in the high-dimensional feature space, with the form

 f(x) = wϕ(x) + b,  (1)

where x is nonlinearly mapped from the "input" space to a higher-dimension "feature" space via the mapping function ϕ; see Figure 2. To simplify the mapping, the kernel function K(x_i, x_j) = ⟨ϕ(x_i), ϕ(x_j)⟩ is used. The most widely used kernel functions are

 linear: K(x_i, x_j) = x_i^T x_j;
 polynomial: K(x_i, x_j) = (γ x_i^T x_j + r)^d, γ ≥ 0;
 radial basis function (RBF): K(x_i, x_j) = exp(−γ‖x_i − x_j‖²), γ ≥ 0;
 sigmoid: K(x_i, x_j) = tanh(γ x_i^T x_j + r).

We choose the RBF kernel as it is easier to compute and has fewer parameters to adjust.

Figure 2: Mapping data from input space to feature space.

The goal is to find "optimal" weights w and threshold b as well as to define the criteria for finding an "optimal" set of weights. First is the "flatness" of the weights, which can be measured by the Euclidean norm (i.e., minimize ‖w‖²). Second is the error R_emp generated by the estimation process, also known as the empirical risk, which is to be minimized. The overall goal is to minimize the regularized risk R_reg (using the ε-insensitive loss function):

 minimize R_reg = (1/2)‖w‖² + R_emp = (1/2)‖w‖² + C ∑_{i=1}^{n} L_ε(y_i, f(x_i)),  (2)

where

 L_ε(y_i, f(x_i)) = |y_i − f(x_i)| − ε if |y_i − f(x_i)| ≥ ε, and 0 otherwise.  (3)

R_emp in (2) is measured by the ε-insensitive loss function L_ε. C is the regularization constant determining the tradeoff between the empirical risk and the regularized risk. It should be noted that both ε and C are user-defined constants and are typically chosen empirically. Introducing the positive slack variables ζ and ζ*, which respectively denote the errors above and below ε, turns (2) into the following constrained problem:

 minimize R_reg = (1/2)‖w‖² + C ∑_{i=1}^{n} (ζ_i + ζ_i*)
 subject to y_i − wϕ(x_i) − b ≤ ε + ζ_i,  (4)
  wϕ(x_i) + b − y_i ≤ ε + ζ_i*,
  ζ_i, ζ_i* ≥ 0.

From the implementation point of view, training an SVM is equivalent to solving a linearly constrained quadratic programming (QP) problem with the number of variables equal to the number of training data points. The sequential minimal optimization (SMO) algorithm [9] is very effective in training SVMs for the regression estimation problem.

3. Approach

The main body of our multi-step-ahead load forecasting method is based on our improved SVR. Rather than giving the same consideration to all training data within a sliding window, as standard SVR does, our multi-step-ahead load forecasting strategy gives more weight to more "important" data.

In order to enhance prediction accuracy, the Kalman smoother is used for data preprocessing. We argue that the Kalman smoother is suitable for cloud application load estimation because it was originally developed to estimate time-varying states in dynamic systems. This approach essentially uses a filtering technique to eliminate the noise in the resource usage signal coming from measurement error while still discovering its real main fluctuations. We name this method KSwSVR.

3.1. Kalman Smoother (KS). The Kalman filter [10] has been widely used in the area of autonomous or assisted navigation. One of the main advantages of the filter is that it can estimate hidden parameters indirectly from measured data and can integrate data from as many measurements as are available, in an approximately optimal way. The Kalman filter estimates the state x of a discrete-time controlled process which is governed by the linear stochastic difference equation

 x_k = A x_{k−1} + B u_{k−1} + w_{k−1},  (5)

with a measurement z given by

 z_k = H x_k + v_k,  (6)

where A is the transform matrix from time step k − 1 to k, u_{k−1} represents a known control vector, B is the control matrix, and H is the matrix relating z_k to x_k. The random variables w_{k−1} ∼ N(0, Q_{k−1}) and v_k ∼ N(0, R_k) represent the process and measurement noise, respectively. They are assumed to be white and independent of each other. The Kalman filter estimates a process by using a form of feedback control: the filter estimates the process state and then obtains feedback in the form of (noisy) measurements.
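As an illustration of this predict-correct feedback loop, the filter can be sketched in scalar form. This is a hedged sketch, not the authors' code: the function and variable names are ours, and it hard-wires the simplifications adopted later in this section (A = H = 1, no control input u, Q = 0.1, R = 1).

```python
def kalman_filter(zs, Q=0.1, R=1.0, x0=0.0, P0=1.0):
    """Scalar Kalman filter with A = H = 1 and no control input (u = 0).

    For each noisy measurement z_k, run a time update (project the state
    estimate and its error covariance ahead) and then a measurement update
    (correct the projection using the Kalman gain K).
    """
    x, P = x0, P0
    filtered = []
    for z in zs:
        # Time update (predictor): with A = 1 the state projection is trivial.
        x_pred = x
        P_pred = P + Q
        # Measurement update (corrector): K weighs prediction vs. measurement.
        K = P_pred / (P_pred + R)
        x = x_pred + K * (z - x_pred)
        P = (1 - K) * P_pred
        filtered.append(x)
    return filtered
```

A fixed-interval smoother would add a backward recursion over these filtered estimates so that each state estimate also reflects later observations, which is the refinement the paper relies on.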

As such, the equations for the Kalman filter fall into two groups: the time update equations (7) (predictor) and the measurement update equations (8) (corrector). K is known as the Kalman gain. The time update projects the current state estimate ahead in time; the measurement update adjusts the projected estimate by an actual measurement at that time:

 Time update: x̂_k⁻ = A x̂_{k−1} + B u_{k−1},
  P_k⁻ = A P_{k−1} A^T + Q.  (7)

 Measurement update: K_k = P_k⁻ H^T (H P_k⁻ H^T + R)^{−1},
  x̂_k = x̂_k⁻ + K_k (z_k − H x̂_k⁻),
  P_k = (I − K_k H) P_k⁻.  (8)

The Kalman filter only considers P(x_t | y_{0:t}), as the filtered estimate of x_t takes into account only the "past" information relative to x_t. By incorporating the "future" observations relative to x_t, we can obtain a more refined state estimate. That is why we choose the Kalman smoother as our noise reduction method. The Kalman smoother, which can be calculated from the Kalman filter results by recursions, estimates P(x_t | y_{0:T}), t < T, taking into account both past and future information. It is also computationally attractive, due to its recursive computation, since producing the next estimate only requires the updated measurements and the previous estimates.

In our scenario of cloud application load, there is no control input, so u = 0. The measurement is a direct but noisy observation of the state, so H = 1. We assume the state does not change from step to step, so A = 1. Given the existence of relatively accurate measurement tools, we set Q = 0.1 and R = 1. Therefore, (5) and (6) become

 x_k = x_{k−1} + w_{k−1},
 z_k = x_k + v_k,  (9)
 w ∼ N(0, 0.1), v ∼ N(0, 1).

3.2. SVR with Weighted Training Data (wSVR). Treating all data of a time series equally is clearly unreasonable. We should take advantage of the data according to their "importance"—the usefulness of the data for improving prediction accuracy.

In the time series of a nonstationary system, the dependency between input variables and output variables gradually changes over time. Specifically, recent past data could provide more important information than distant past data. This conclusion is also true in this paper's cloud scenario. One of the most powerful arguments is locality of reference, also known as the principle of locality [11]. As one of the cornerstones of computer science, locality of reference was born from efforts to make virtual memory systems work well; it is the phenomenon of the same value or related storage locations being frequently accessed. But today, this principle has found application well beyond virtual memory and has directly influenced the design of processor caches, disk controller caches, storage hierarchies, network interfaces, database systems, graphics display systems, human-computer interfaces, individual application programs, search engines, web browsers, edge caches for web-based environments, and computer forensics. Therefore, we have good reason to believe that cloud load would follow the law as well. Hence we take the same principle and give more weight to the more important recent historical data of the load. The newer the data are, the more important they are.

Another factor that influences the importance of data is their "credibility." In our multi-step-ahead load forecasting, there are two types of data—measured data and predicted data. Measured data refer to the true historical resource usage information collected by the system monitor. Many popular monitoring tools, such as top for Linux and xentop for Xen [3], are available for obtaining system information (e.g., usage of CPU, memory, network, and block devices on a host or VM). The monitor periodically collects system information for decision making. As mentioned before, a cloud system is usually soft real time; therefore, there is generally not a severe time constraint on the monitoring period. Furthermore, too small a monitoring granularity will bring more decision-making costs and is not conducive to the improvement of system resource utilization. For example, once every 5 seconds is enough. Measured data are believed to have high credibility (more important). Predicted data, the result of the load forecasting algorithm, have lower credibility (less important) because any prediction algorithm has prediction error. A multi-step-ahead prediction can be achieved by running one-step-ahead prediction iteratively. The time series data related to an m-step-ahead prediction are {..., x_{t−2}, x_{t−1}, x̂_t, x̂_{t+1}, x̂_{t+2}, ..., x̂_{t+m−1}}. The prediction of x̂_{t+m−1} is based on the series {..., x_{t−2}, x_{t−1}, x̂_t, x̂_{t+1}, x̂_{t+2}, ..., x̂_{t+m−2}}, where only {..., x_{t−2}, x_{t−1}} are measured data and {x̂_t, x̂_{t+1}, x̂_{t+2}, ..., x̂_{t+m−2}} are predicted data. This process is based on a significant hypothesis: predicted data are assumed to be measured data when performing the next-step prediction. However, due to prediction error and the dynamic nature of cloud, this hypothesis cannot be satisfied all the time when multi-step-ahead prediction is carried out. In particular, we need to address the accumulation of prediction errors. Every one-step-ahead prediction may cause an error; therefore, using the former prediction results as the input data for the next prediction will cause accumulation of errors. Figure 3 depicts the relationship between prediction mean absolute error (MAE) and predicted steps, where the load series we used is collected from a real-world I/O trace of an online transaction processing (OLTP) application [12] and the predictor is AR(16) [13]. It indicates that as the number of predicted steps increases, the MAE increases drastically; that is, the predicted data's credibility is decreasing.

To sum up the above arguments, in multi-step-ahead load forecasting in cloud, the importance of the input data series gradually increases and then decreases. The inflection point is between the last measured data and the first predicted data.

From (2), it can be observed that the performance of SVR is sensitive to the regularization constant C. A small value for C will underfit the training data because the weight placed

1.2 Violation Application


SLA
Mean absolute error

estimating performance
1
0.8 s

0.6 ̂t xtalloc
Predictor x Cloud
xt , s, di )
f(̂
0.4 (KSwSVR) system
0.2
di
0
5 10 15 20 25 xiuse
Error
Predicted steps computing
AR(16)

Figure 3: Relationship between MAE and predicted steps. Figure 4: Decision-making process of resource provisioning.

resources directly according to forecasting results cannot


on the training data is too small thus resulting in large value
guarantee a stable QoS.
of prediction error. On the contrary, when 𝐶 is too large,
Therefore, we propose a simple resource allocation mech-
SVR will overfit the training data, leading to deterioration of
anism based on load forecasting for two considerations.
generalization performance.
First, load spike, which will lead to underprovisioning, is
By using a fixed value of 𝐶 in the regularized risk function,
difficult to predict for any prediction algorithm especially
standard SVR assigns equal weights to all the 𝜀-insensitive
in dynamic cloud. Second, in cloud computing model, the
errors between the measured and predicted data, treating
user’s requirements of QoS also can change at any time. We
two types of data equally. For illustration, the empirical risk
need a flexible mechanism to deal with different SLA levels in
function in standard SVR is expressed by
this multitenant environment. The actual allocation value is
𝑛 computed as
𝑅emp std = 𝐶∑ (𝜁𝑖 + 𝜁𝑖∗ ) . (10)
𝑖=1 1 𝑡−𝑘
𝑥𝑡alloc = 𝑠𝑥̂𝑡 + ∑ 𝑑,
𝑘 𝑖=𝑡−1 𝑖 (12)
As discussed above, this is unreasonable. Thus, it is
beneficial to place different weight on the 𝜀-insensitive 𝑑𝑖 = max {0, 𝑥𝑖use − 𝑥𝑖alloc } ,
errors according to the importance of training data. So, we
add a weight coefficient 𝑤𝑖 to the regularization constant, 𝑥𝑡use , 𝑥̂𝑡 , and 𝑥𝑡alloc separately represent real resources usage,
translating the empirical risk function to predicted value, and actual resources allocation value at time
𝑛 𝑡. With 𝑑𝑖 , we use the information of underprovisioning in the
𝑅emp 𝑤 = 𝐶∑𝑤𝑖 (𝜁𝑖 + 𝜁𝑖∗ ) , last 𝑘 periods, while ignoring overprovisioning. This allows
𝑖=1 system to quickly respond to load spike. 𝑠 is an incremental
(11) coefficient which is highly correlated with QoS of cloud. Its
𝑓 (𝑖) 𝑖 < 𝑡, value depends on the gap between the actual application
𝑤𝑖 = { measured
𝑓predicted (𝑖) 𝑖 ≥ 𝑡, performance and SLA. The greater the gap, the bigger the
𝑠. Bigger 𝑠 means allocating more resources and fewer SLA
where 𝑓measured (𝑖) is monotonically increasing function for violations. It is a proactive (KSwSVR) and QoS-driven (𝑠)
measured data, while 𝑓predicted (𝑖) is monotonically decreasing decision making process with a feedback (𝑑𝑖 ); shown in as
function for predicted data. 𝑓∗ (𝑖) may be a linear function, Figure 4. Obviously, it is also applicable to other predictors.
an exponential function, or others meeting the monotonicity
requirement. Obviously, the choice of 𝑓∗ (𝑖) will affect the 4. Evaluation
subsequent prediction accuracy. The optimal solution is
taking into account characteristics of the data. This is another In this section, the performance of KSwSVR will be evaluated
point worth studying. by using various types of real-world trace data and comparing
with other typical load forecasting technology. We prefer
3.3. Resources Provisioning Based on KSwSVR. Directly tak- using public trace data rather than historical data generated
ing the predicted value as the final resources provisioning by ourselves for the purpose of giving comparable and
value may lead to unacceptable SLA violation since any reproducible results.
prediction algorithm has an error range. Furthermore, as we
can see in the latter experiment, even the same algorithm 4.1. Prediction Algorithm. In order to highlight the prediction
will have different prediction errors as the prediction object performance of KSwSVR, two widely used prediction meth-
changes. That is, for different system resources (CPU, mem- ods are chosen for comparison. They are typical represen-
ory, or I/O) or the same resource at different time, allocating tatives of linear prediction algorithm and machine learning
The Scientific World Journal 7

technology—AR and BPNN. Meanwhile, standard SVR is also our comparison object.

4.1.1. Linear Prediction—AR. Dinda and O'Hallaron [13] studied different linear load forecasting models, including AR, moving average, autoregressive moving average, autoregressive integrated moving average, and autoregressive fractionally integrated moving average models. Their conclusion is that the simple AR model is the best model and is appropriate and sufficient for load prediction.

AR is a basic linear time series prediction algorithm in which the current value is represented by a linear combination of several previous values plus an error term ε. The general expression of the AR(p) model is given in (13), where {x_1, x_2, ..., x_t} is the time series, p is the order of the AR model, and φ = (φ_1, φ_2, ..., φ_p) denotes the coefficients of the AR model. Consider

    x_t = Σ_{i=1}^{p} φ_i x_{t−i} + ε_t.                          (13)

We adopt Dinda's recommendation that AR(16) is the best in consideration of both overhead and prediction accuracy.

4.1.2. Machine Learning—BPNN. Like SVR, ANN [8] is a typical machine learning strategy in the category of regression computation. ANN is a powerful tool for self-learning, and it can generalize the characteristics of load through proper training. It is inherently a distributed architecture with high robustness and has been used for resources state prediction in the past. Eswaradass et al. [14] indicated that ANN prediction outperforms the Network Weather Service methods [15].

The structure of a standard multilayer feedforward neural network is shown in Figure 5. It consists of an input layer with input neurons [x_{t−p}, x_{t−p+1}, ..., x_{t−1}], a hidden layer with hidden neurons [h_1, h_2, ..., h_k], and an output layer with one output neuron x̂_t. Every node in a layer is connected to every node in the neighboring layer. These connections are known as synapses, and each synapse is associated with a weight which is to be determined during training. During the training phase, the network is fed with input vectors, and random weights are assigned to the synapses. After presentation of each input vector, the network generates a predicted output x̂_t. The generated output is then compared with the actual output x_t; the difference between the two is known as the error term.

[Figure 5: Standard multilayer feedforward neural network.]

The BPNN algorithm is the most popular and the oldest supervised learning feedforward neural network algorithm, proposed by Rumelhart and McClelland [16]. BPNN learns by calculating the errors of the output layer to find the errors in the hidden layers. The algorithm is highly suitable for problems in which no relationship between the output and the inputs can be found. Due to its flexibility and learning capabilities, BPNN has been successfully used in a wide range of applications. Therefore, we chose it as one of the comparison objects and empirically configured it with six input neurons and one hidden layer with ten hidden neurons, considering both prediction overhead and accuracy.

4.1.3. SVR and KSwSVR. The parameter configuration of standard SVR and KSwSVR in this work is as follows:

    SVM-type: ε-regression
    SVM-kernel: radial basis function (RBF)
    Cost (C): 1, penalty parameter of the error term
    Gamma: 0.0625, parameter of the RBF
    Epsilon (ε): 0.1, ε-insensitive loss function
    f*(i): linear function, for KSwSVR in (11)

It is worth emphasizing that we did not tune the parameters of BPNN, SVR, and KSwSVR for specific prediction objects but set the same parameters for all trace data according to domain knowledge. Therefore, the experimental results of this paper are representative.

4.2. Experiment Setup. The implementation of KSwSVR is based on libsvm [17].

In order to highlight the adaptability of KSwSVR, we collected public trace data for various types of resources, involving the network, CPU, memory, and storage systems (see Figure 6).

(1) Gcpu/Gmem. Seven hours of CPU and memory usage data from a Google cluster (TraceVersion1) [18]. For confidentiality reasons, the consumption of CPU and memory was obscured using a linear transformation before release. We randomly selected a long-duration job with jobID 1485896354.

(2) CScpu. CPU load trace of a big-memory compute server in the CMCL (Computers, Media, and Communication Laboratory) at Carnegie Mellon University [19]. The data is the number of processes that are running or are ready to run, that is, the length of the ready queue maintained by the scheduler.

(3) OLTPio. Storage system data request rate derived from the I/O trace of an OLTP application running at a large financial institution [12].

(4) SEio. Storage system data request rate derived from the I/O trace of a search engine [20].

(5) WC98. Client request rate observed at the World Cup 98 web servers, from 1998-06-22:00.00 to 1998-06-22:23.59 [21].
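The ε-SVR configuration listed in Section 4.1.3 (RBF kernel, C = 1, gamma = 0.0625, ε = 0.1) can be exercised in a simple sliding-window forecaster, sketched below with scikit-learn's libsvm-backed SVR. The six-point window, the synthetic trace, and the use of the time index as the regression input are illustrative assumptions, not the paper's exact feature construction:

```python
import numpy as np
from sklearn.svm import SVR

def sliding_window_forecast(trace, window=6):
    """One-step-ahead forecasts: refit an epsilon-SVR on the last
    `window` points at each step, then predict the next point."""
    preds = []
    for t in range(window, len(trace)):
        # Regress the load value on the time index of the window.
        X = np.arange(t - window, t, dtype=float).reshape(-1, 1)
        y = trace[t - window:t]
        model = SVR(kernel="rbf", C=1.0, gamma=0.0625, epsilon=0.1)
        model.fit(X, y)
        preds.append(model.predict(np.array([[float(t)]]))[0])
    return np.array(preds)

# Synthetic load trace standing in for a real resource usage trace.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 6.0, 60)
trace = 0.5 + 0.2 * np.sin(t) + 0.02 * rng.standard_normal(60)

preds = sliding_window_forecast(trace)
mae = float(np.mean(np.abs(preds - trace[6:])))
```

On the paper's real traces, the same loop would run over the normalized series; KSwSVR additionally weights the window samples and smooths the inputs with a Kalman smoother before fitting.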

[Figure 6: Trace data used in this work. Six panels plot the Gcpu, Gmem, CScpu, OLTPio, SEio, and WC98 traces against time.]

First, we evaluated the forecast performance of KSwSVR. The first 2000 points of each trace were used as our experiment data and aggregated to the mean value of every five intervals, except the Google cluster data, because of its limited amount. As previously mentioned, our prediction work targets timely, dynamic, fine-grained online scalability in the cloud, so the overhead of each prediction should be as small as possible. In order to ensure forecasting speed, six training data points were used for each prediction in BPNN, standard SVR, and KSwSVR in our experiment.

Then, by simulating dynamic CPU allocation, we show the high efficiency of KSwSVR in resources provisioning.

4.3. Experimental Results

4.3.1. Prediction Accuracy Evaluation. Before the training process begins, data normalization is performed by the linear transformation

    x_i^n = (x_i − x_min) / (x_max − x_min),                      (14)

where x_i^n and x_i represent the normalized and original measured data, respectively, and x_min and x_max represent the minimum and maximum values among the original measured data, respectively.

The evaluation of prediction performance is based on the mean absolute error (MAE), a widely used error metric for evaluating time-series forecasting results, as shown in (15):

    MAE = (1/n) Σ_{i=1}^{n} |x̂_i − x_i|,                         (15)

where x̂_i and x_i represent the predicted and original data, respectively, and n is the number of predicted data points.

The detailed experimental results are shown in Figure 7. Because the total number of samples is limited, the performance trend features of Google's two traces fluctuate slightly. But we can still reach this conclusion: KSwSVR has the best prediction accuracy, followed successively by standard SVR, AR, and BPNN.

Table 1: Improvement of prediction error.

    Trace data         Gcpu    Gmem    CScpu   OLTPio   SEio    WC98
    MAE improvement    12.9%   17.9%   22.0%   28.1%    20.8%   22.5%

AR, as a typical representative of linear prediction technology, cannot adapt well to nonlinear regression problems and has a relatively high prediction error. Furthermore, when the data have high volatility (e.g., CScpu and OLTPio), the cumulative error effect of AR is quite obvious. This makes it unsuitable for multi-step-ahead long-term load forecasting in the cloud.

As an excellent machine learning technology, the neural network should have better nonlinear regression performance. However, as mentioned before, timely, dynamic, fine-grained online scalability in the cloud needs a fast and efficient load forecasting algorithm. For this reason, the training data set for each prediction should not be too large. Suffering from this restriction, the performance of BPNN deteriorates significantly, and its prediction accuracy is even lower than that of AR. Its cumulative error effect is also large. In contrast, the theoretical foundation of SVM gives it excellent performance in the face of a small training sample set. For all six traces, SVR and KSwSVR always maintained relatively high prediction accuracy and stability, and their cumulative error effect is quite slight.

We further compared KSwSVR and standard SVR over more predicted steps; see Figure 8. Benefiting from the Kalman smoother and the weighting technology, KSwSVR outperforms standard SVR all the time, except at two points of Gcpu, owing to its small sample set. Furthermore, the prediction error of KSwSVR increases much more slowly as the number of predicted steps increases, which means that its prediction accuracy is very stable. In contrast, the performance of standard SVR fluctuates with the predicted steps. The improvement in MAE achieved by KSwSVR is given in Table 1.

4.3.2. Computational Costs. As mentioned earlier, SVM is a relatively new machine learning method that optimizes the model on training data. Because dot products in feature space can be represented by kernel functions, the transformation

[Figure 7: MAE comparison. Six panels, (a)-(f), plot the MAE of AR, SVR, BPNN, and KSwSVR against predicted steps 1-5 for the Gcpu, Gmem, CScpu, OLTPio, SEio, and WC98 traces.]
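The normalization of (14) and the MAE metric of (15) used above amount to a few lines; the sample values here are illustrative:

```python
import numpy as np

def normalize(x):
    """Min-max normalization of (14): map measured data onto [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def mae(predicted, actual):
    """Mean absolute error of (15)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))

xn = normalize([2.0, 4.0, 6.0, 8.0])          # [0.0, 1/3, 2/3, 1.0]
err = mae([0.1, 0.5, 0.9], [0.0, 0.5, 1.0])   # (0.1 + 0.0 + 0.1) / 3
```

Normalizing every trace onto [0, 1] is also what makes the MAE values of different resource types comparable in the figures above.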

[Figure 8: KSwSVR versus standard SVR. Six panels, (a)-(f), plot the MAE of SVR and KSwSVR against the number of predicted steps (up to 20) for the Gcpu, Gmem, CScpu, OLTPio, SEio, and WC98 traces.]

from input space to feature space is implicit. Training an SVM is thereby converted into solving a linearly constrained QP problem, which greatly reduces the computational complexity. Moreover, the use of SMO to solve the SVM QP problem can further accelerate the training speed: SMO avoids the numerical QP optimization steps and requires no extra matrix storage at all [9].

Based on the above theory, we believe that KSwSVR, being based on SVM, should be efficient. With four traces from Figure 6, we compared the temporal costs of AR, BPNN, standard SVR, and KSwSVR on the same computing platform. The choices of experimental parameters were the same as before, except that 10,000 data points of each trace were used as test data.

The experimental results, in Table 2, show that the two SVR-based algorithms have obvious efficiency advantages. The computational cost of KSwSVR increases slightly compared with that of standard SVR. However, as analyzed above,

Table 2: Comparison of total computational costs (s).

    Trace data   AR        BPNN       SVR      KSwSVR
    CScpu        29.1467   409.5801   0.4287   0.6549
    OLTPio       29.0108   410.7172   0.4480   0.7548
    SEio         29.4803   411.9297   0.4308   0.6627
    WC98         29.3089   409.8534   0.4415   0.7111

Table 3: Comparison of total CPU consumption.

    Incremental coefficient (s)   3.0      3.8      5.6
    SLA violation rate            5%       3%       1%
    (Static-dynamic)/static       17.20%   29.81%   48.12%
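The incremental coefficients in Table 3 enter the provisioning rule of Section 3.3, which adds a QoS-driven safety margin on top of each prediction. A minimal sketch, assuming the additive form r = p + s * d, with d the mean absolute error of recent predictions (the feedback term); the exact functional form of (12) should be taken from Section 3.3:

```python
import numpy as np

def provision(predicted, recent_errors, s):
    """Allocate the predicted load plus a safety margin:
    s times the mean absolute recent prediction error."""
    d = float(np.mean(np.abs(recent_errors)))
    return predicted + s * d

recent_errors = [0.03, -0.05, 0.02, 0.04]   # illustrative feedback values
# The three SLA settings of Table 3: a stricter SLA uses a bigger s.
alloc = {s: provision(0.60, recent_errors, s) for s in (3.0, 3.8, 5.6)}
```

A bigger s buys a lower SLA violation rate at the price of more allocated capacity, which is exactly the trade-off that Table 3 quantifies.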
the introduction of the Kalman smoother significantly improves the forecasting performance, so the extra cost is worthwhile. For any trace data, the total computational cost of KSwSVR is only about 0.7 seconds, which accounts for approximately 2% of the cost of AR (about 29 s) or 0.17% of the cost of BPNN (about 410 s). The algorithmic complexity of BPNN determines its high computational cost. As a linear time series prediction algorithm, AR should have a low temporal cost; however, in order to achieve acceptable prediction accuracy, AR requires more training samples each time, which increases the computational cost. Since all data are processed as time series, the temporal costs for different types of traces are almost equal for the same prediction algorithm; that is, the computational cost of a load forecasting algorithm is independent of the type of load. Thus, we can conclude that KSwSVR has a relatively low computational cost and is very suitable for online load forecasting. It can effectively support timely and dynamic fine-grained scalability for real-time applications.

[Figure 9: CPU allocation of dynamic/static strategy. R—real usage; P—predicted value; AD/AS—actual dynamic/static allocation value; x%: SLA violation rate.]

4.3.3. Dynamic CPU Allocation Based on KSwSVR. We demonstrated the advantage of accurate load forecasting for resources provisioning with a real CPU load trace, CScpu [19]. To be clearer, we use only its first half in this experiment. Static CPU allocation with a fixed allocation value was compared with our dynamic CPU allocation based on KSwSVR.

We assume that the SLA only requires no CPU underprovisioning. Any SLA violation rate can be achieved by adjusting the incremental coefficient s in (12). The detailed experimental results are shown in Figure 9. Under the same SLA violation rate, the total CPU consumption of the dynamic strategy is always less than that of the static method. For example, when the SLA violation rate is 5%, the dynamic CPU allocation strategy based on KSwSVR (blue solid line) reduces CPU consumption by 17.20% compared with the static CPU allocation strategy (blue dashed line). Detailed data are shown in Table 3. With the decrease of the SLA violation rate, the benefit produced by KSwSVR becomes more significant. Specifically, the total allocated CPU amount is even reduced by nearly half when the SLA violation rate is 1%.

By cooperating with other technologies such as virtualization and DVFS, our method can effectively improve resources utilization and energy saving.

5. Related Work

We classify the related work into two categories: (1) forecasting technology in grid, cloud, and virtualization environments and (2) SVM/SVR work related to ours. Other forecasting technology in computer science is not considered here.

The Network Weather Service (NWS) [15] is the most famous system designed to provide dynamic resources performance forecasts and has mainly been deployed as a grid middleware service. The predictive methods it currently uses include running average, sliding window average, last measurement, adaptive window average, median filter, adaptive window median, α-trimmed mean, stochastic gradient, and AR. Xu et al. [22] use fuzzy logic to model and predict the load of virtualized web applications. The VCONF project has studied using reinforcement learning combined with ANN to automatically tune the CPU and memory configurations of a VM in order to achieve good performance for its hosted application [23]; this solution is specifically targeted at CPU resources only. Roy et al. [24] used a second-order autoregressive moving average filter for workload forecasting in the cloud. Jiang et al. [25] proposed a self-adaptive prediction solution to enable instant cloud resources provisioning. They employ a set of different prediction techniques as base predictors, including moving average, AR, ANN, SVM, and gene expression programming, and propose a prediction ensemble method to combine the power of the individual techniques. One characteristic of this method is its large amount of calculation, which makes it unsuitable for fine-grained online forecasting. In [26], PRESS first employs signal processing techniques (the Fast Fourier Transform) to identify repeating patterns called signatures that are used for its predictions. If no signature is discovered, PRESS employs a statistical state-driven approach (a discrete-time Markov chain) to capture short-term patterns of load. Experimental results show that its accuracy is similar to

AR. Based on similar characteristics of web traffic, Caron et al. [27] proposed a pattern matching algorithm to identify the patterns, in a set of past usage traces of the cloud client, that most closely resemble the present resources utilization pattern. The closest patterns are then combined by a weighted interpolation to forecast the approximate future values that will follow the present pattern. However, their approach has two problems. First, searching for similar patterns over the entire set of historical data each time is inefficient. Second, it may lead to overspecialization, thus turning out to be ineffective. Islam et al. [28] explored the Error Correction Neural Network (ECNN) and linear regression to predict future resources requirements in the cloud; the authors also mentioned that other learning methods (e.g., SVM) can be accommodated for the prediction of future resources usage. Bey et al. [29] presented a modeling approach to estimating the future value of CPU load, combining Adaptive Network-based Fuzzy Inference Systems (ANFIS) with a clustering process applied to the CPU load time series. Liang et al. [30] presented a long-term prediction model that applies the Fourier transform to exploit the periods of the CPU waves and uses tendency-based methods to predict the variation. Wu et al. [31] adopted an Adaptive Hybrid Model (AHModel) for long-term load prediction in computational grids; it is an improvement of their previous work, HModel [32], and both are based on AR and confidence interval estimations. However, the prediction range of AHModel is limited to 50 steps ahead. Moreover, AHModel cannot predict the load variation very well, especially the variation around the peak points.

SVM is widely used in financial data prediction, general business applications, environmental parameter estimation, electric utility load forecasting, machine reliability forecasting, control systems, and signal processing [33]. As a relatively new forecasting technology, its application in areas related to our work is comparatively rare at present. Prem and Raghavan [34] directly used SVM to forecast resources data derived from NWS. In [35], SVR was used to build models predicting the response time at a specified load for individual workloads cohosted on a shared storage system. SVR has also been used to model power consumption as a function of hardware and software performance counters in [36]. Kundu et al. [37] proposed to use SVR and ANN to model the relationship between the resources provisioned to a virtualized application and its performance; such a model can then be used to predict the resources an application needs to meet its performance target. However, their work was based on the assumption that the application has static workloads and stable behavior. Obviously, this assumption severely restricts the feasibility of their method in a dynamic cloud environment. Moreover, their offline performance modeling cannot respond in a timely manner to changes in environment and load. Niehorster et al. [6] used SVM to enable a SaaS (software as a service) provider to predict the resources requirements of an application. Different from our method, theirs is coarse-grained and offline. In addition, the feature extraction in their work needs to know the application's internal details, but this is generally prohibited in cloud.

6. Conclusion and Future Work

In order to achieve efficient resources provisioning in the cloud, we propose a multi-step-ahead load prediction method, KSwSVR, based on statistical learning technology, which is suited to the complex and dynamic characteristics of the cloud computing environment. It integrates an improved support vector regression algorithm with Kalman smoothing technology and does not require access to the internal details of the application. The improved SVR gives more weight to more "important" data than standard SVR, using the historical information more reasonably. The Kalman smoother is employed to eliminate the noise in resources usage data caused by measurement error. Public trace data for various resources were used to verify its excellent prediction accuracy, stability, and adaptability. In comparison with AR, BPNN, and standard SVR, KSwSVR always has the minimum prediction error for every type of resource and every number of predicted steps. This statistical learning-based approach is not designed for a specific forecast object, so we believe it will exhibit outstanding performance when faced with various subjects (job, application, VM, host, and cloud system) and resources (computing, storage, and network). Subsequently, based on the predicted results, a simple and efficient strategy was proposed for resources provisioning, considering the variations of prediction error and SLA levels. The CPU allocation experiment showed that accurate load forecasting can significantly reduce resources consumption while ensuring QoS, which is beneficial for improving resources utilization and energy saving. We plan to integrate this method into an automated cloud resources management system in future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under Grant no. 61303070. An earlier version of this paper was presented at the 10th International Conference on Services Computing, June 27-July 2, 2013, CA, USA (IEEE SCC2013).

References

[1] L. A. Barroso and U. Hölzle, "The case for energy-proportional computing," Computer, vol. 40, no. 12, pp. 33–37, 2007.
[2] X. Fan, W. Weber, and L. Barroso, "Power provisioning for a warehouse-sized computer," ACM SIGARCH Computer Architecture News, vol. 35, no. 2, pp. 13–23, 2007.
[3] P. Barham, B. Dragovic, K. Fraser et al., "Xen and the art of virtualization," in Proceedings of the 9th ACM Symposium on Operating Systems Principles, vol. 37 of ACM SIGOPS Operating Systems Review, pp. 164–177, 2003.
[4] C. Clark, K. Fraser, S. Hand et al., "Live migration of virtual machines," in Proceedings of the 2nd Conference on Symposium

on Networked Systems Design & Implementation, vol. 2, pp. 273–286, USENIX Association, 2005.
[5] "Inauguration Day on Twitter," 2013, https://blog.twitter.com/2009/inauguration-day-twitter.
[6] O. Niehorster, A. Krieger, J. Simon, and A. Brinkmann, "Autonomic resource management with support vector machines," in Proceedings of the 12th IEEE/ACM International Conference on Grid Computing (GRID '11), pp. 157–164, IEEE Computer Society, 2011.
[7] V. N. Vapnik, "An overview of statistical learning theory," IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 988–999, 1999.
[8] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, San Diego, Calif, USA, 4th edition, 2008.
[9] J. Platt, "Sequential minimal optimization: a fast algorithm for training support vector machines," Microsoft Research Technical Report MSR-TR-98-14, 1998.
[10] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[11] P. J. Denning, "The locality principle," Communications of the ACM, vol. 48, no. 7, pp. 19–24, 2005.
[12] "OLTP applications," 2013, http://skuld.cs.umass.edu/traces/storage/Financial1.spc.bz2.
[13] P. Dinda and D. O'Hallaron, "Host load prediction using linear models," Cluster Computing, vol. 3, no. 4, pp. 265–280, 2000.
[14] A. Eswaradass, X.-H. Sun, and M. Wu, "A neural network based predictive mechanism for available bandwidth," in Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS '05), p. 33a, IEEE, 2005.
[15] M. Swany and R. Wolski, "Multivariate resource performance forecasting in the network weather service," in Proceedings of the ACM/IEEE Conference on Supercomputing (Supercomputing '02), pp. 1–10, IEEE Computer Society Press, 2002.
[16] D. Rumelhart and J. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations, 1986.
[17] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, article 27, 2011.
[18] "The Google cluster trace data," 2013, http://code.google.com/p/googleclusterdata/.
[19] "CPU load trace from CMCL," 2013, http://people.cs.uchicago.edu/lyang/Load/abyss10000.dat.
[20] "Search engine," 2013, http://skuld.cs.umass.edu/traces/storage/WebSearch2.spc.bz2.
[21] "World Cup 98," 2013, http://ita.ee.lbl.gov/html/contrib/WorldCup.html.
[22] J. Xu, M. Zhao, J. Fortes, R. Carpenter, and M. Yousif, "Autonomic resource management in virtualized data centers using fuzzy logic-based approaches," Cluster Computing, vol. 11, no. 3, pp. 213–227, 2008.
[23] J. Rao, X. Bu, C.-Z. Xu, L. Wang, and G. Yin, "VCONF: a reinforcement learning approach to virtual machines auto-configuration," in Proceedings of the 6th International Conference on Autonomic Computing (ICAC '09), pp. 137–146, Association for Computing Machinery, 2009.
[24] N. Roy, A. Dubey, and A. Gokhale, "Efficient autoscaling in the cloud using predictive models for workload forecasting," in Proceedings of the IEEE International Conference on Cloud Computing (CLOUD '11), pp. 500–507, IEEE, 2011.
[25] Y. Jiang, C.-S. Perng, T. Li, and R. Chang, "ASAP: a self-adaptive prediction system for instant cloud resource demand provisioning," in Proceedings of the 11th IEEE International Conference on Data Mining (ICDM '11), pp. 1104–1109, IEEE, 2011.
[26] Z. Gong, X. Gu, and J. Wilkes, "PRESS: predictive elastic resource scaling for cloud systems," in Proceedings of the International Conference on Network and Service Management (CNSM '10), pp. 9–16, IEEE, 2010.
[27] E. Caron, F. Desprez, and A. Muresan, "Forecasting for grid and cloud computing on-demand resources based on pattern matching," in Proceedings of the IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom '10), pp. 456–463, 2010.
[28] S. Islam, J. Keung, K. Lee, and A. Liu, "Empirical prediction models for adaptive resource provisioning in the cloud," Future Generation Computer Systems, vol. 28, no. 1, pp. 155–162, 2012.
[29] K. B. Bey, F. Benhammadi, A. Mokhtari, and Z. Guessoum, "CPU load prediction model for distributed computing," in Proceedings of the 8th International Symposium on Parallel and Distributed Computing (ISPDC '09), pp. 39–45, IEEE, 2009.
[30] J. Liang, J. Cao, J. Wang, and Y. Xu, "Long-term CPU load prediction," in Proceedings of the 9th IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC '11), pp. 23–26, IEEE, 2011.
[31] Y. Wu, K. Hwang, Y. Yuan, and W. Zheng, "Adaptive workload prediction of grid performance in confidence windows," IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 7, pp. 925–938, 2010.
[32] Y. Wu, Y. Yuan, G. Yang, and W. Zheng, "Load prediction using hybrid model for computational grid," in Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (GRID '07), pp. 235–242, IEEE, 2007.
[33] N. Sapankevych and R. Sankar, "Time series prediction using support vector machines: a survey," IEEE Computational Intelligence Magazine, vol. 4, no. 2, pp. 24–38, 2009.
[34] H. Prem and N. R. S. Raghavan, "A support vector machine based approach for forecasting of network weather services," Journal of Grid Computing, vol. 4, no. 1, pp. 89–114, 2006.
[35] S. Uttamchandani, L. Yin, G. Alvarez, J. Palmer, and G. Agha, "Chameleon: a self-evolving, fully-adaptive resource arbitrator for storage systems," in Proceedings of the USENIX Annual Technical Conference, pp. 75–88, 2005.
[36] J. McCullough, Y. Agarwal, J. Chandrashekar, S. Kuppuswamy, A. Snoeren, and R. Gupta, "Evaluating the effectiveness of model-based power characterization," in Proceedings of the USENIX Annual Technical Conference, 2011.
[37] S. Kundu, R. Rangaswami, A. Gulati, M. Zhao, and K. Dutta, "Modeling virtualized applications using machine learning techniques," in Proceedings of the 8th ACM SIGPLAN/SIGOPS Conference on Virtual Execution Environments (VEE '12), ACM SIGPLAN Notices, pp. 3–14, 2012.