Analysis, Modeling and Simulation of Workload Patterns in a Large-Scale Utility Cloud
IEEE Transactions on Cloud Computing, Vol. 2, No. 2, April-June 2014
Abstract: Understanding the characteristics and patterns of workloads within a Cloud computing environment is critical in order to improve resource management and operational conditions while Quality of Service (QoS) guarantees are maintained. Simulation models based on realistic parameters are also urgently needed for investigating the impact of these workload characteristics on new system designs and operation policies. Unfortunately, there is a lack of analyses to support the development of workload models that capture the inherent diversity of users and tasks, largely due to the limited availability of Cloud tracelogs as well as the complexity of analyzing such systems. In this paper we present a comprehensive analysis of the workload characteristics derived from a production Cloud data center that features over 900 users submitting approximately 25 million tasks over a time period of a month. Our analysis focuses on exposing and quantifying the diversity of behavioral patterns for users and tasks, as well as identifying model parameters and their values for the simulation of the workload created by such components. Our derived model is implemented by extending the capabilities of the CloudSim framework and is further validated through empirical comparison and statistical hypothesis tests. We illustrate several examples of this work's practical applicability in the domain of resource management and energy-efficiency.

Index Terms: Cloud computing, workload characterization, cloud computing simulation, workload modeling
1 INTRODUCTION
sufficient to characterize the workload diversity of Cloud environments. In addition, there have been a number of approaches that analyze the diversity of workload by classifying tasks according to critical characteristics [7], [8], [9]. However, none of these provide a comprehensive study of the diversity of users and tasks, or provide a model containing sufficient details about the model parameters obtained from the analyses in order to be of practical use to researchers.

The objective of this paper is to present an in-depth empirical analysis of workload and its diversity in a large-scale production Cloud computing data center. Additionally, this work aims to provide a validated simulation model that includes parameters of tasks and users to be made available for other researchers to use. The analysis is conducted using the data from the second version of the Google Cloud tracelog [3], [10], which contains over 25 million tasks submitted by 930 users over the observational period of a month. There are three core contributions within this work:

- An in-depth statistical analysis of the characteristics of workload diversity within a large-scale production Cloud. The analysis was performed over the entire tracelog time span as well as a number of observational periods to investigate patterns of diversity for both users and tasks within the system.
- An extensive analysis of distribution parameters derived from the workload analysis that can be applied to simulation tools by other researchers.
- A comprehensive validation of the simulation model based on empirical and statistical methods. A significant contribution of the simulation model provided is that it does not just replay the data within the tracelog. Instead, it creates patterns that fluctuate randomly based on realistic parameters. This is important in order to emulate dynamic environments and to avoid statically reproducing the behavior of a specific period of time.

A secondary contribution of this paper is presenting practical applications of the obtained model to identify sources of inefficiency and to enhance resource management and energy usage in virtualized Cloud environments.

This paper applies the methodology of analysis introduced in our previous approach [9], but is substantially different in a number of ways. First, this paper focuses specifically on a substantial analysis of Cloud diversity for tasks and users. Additionally, we analyze the entire tracelog time span and three additional observational periods, instead of just two days, which limited the original approach's applicability, as it could potentially omit crucial behavior within the overall Cloud environment. Furthermore, extensive analysis and parameter details are provided for user and task distributions.

The remainder of this paper is organized as follows: Section 2 presents the background; Section 3 discusses related work; Section 4 details the methodology used. Section 5 presents the cluster and distribution analysis of task and user diversity. Section 6 presents the validation of the model simulation. Section 7 describes the improvements to the model based on the validation results. Section 8 discusses practical applications of the results obtained within this paper. Sections 9 and 10 discuss the conclusions and further research directions of this work, respectively.

2 BACKGROUND

2.1 Diversity Patterns in Cloud
According to NIST [11], the Cloud computing model has the following five essential characteristics: on-demand self-service, resource pooling, broad network access, rapid elasticity and measured service. These characteristics create highly dynamic environments where customers from different contexts co-exist, submitting workloads with diverse resource requirements at any time. Workloads by themselves have properties or attributes that describe their behavior. These attributes are normally expressed by the type and amount of resources consumed, as well as other attributes that could dictate where a specific workload can or cannot be executed: for example, security requirements, geographical location, or specific hardware constraints such as processor architecture, number of cores or Ethernet speed, among others described in [13]. As discussed in [14], as more and more customers adopt Cloud platforms to fulfill their IT requirements, Cloud providers need to be prepared to manage highly heterogeneous workloads that are served on top of shared infrastructure. Workloads can be broadly classified according to the fundamental resources that they consume into CPU-, memory- and storage-bound workloads [15]. Moreover, depending on the interaction with the end-users, they can also be classified as latency-sensitive and batch workloads [16]. Common examples of workloads running in multi-tenant Cloud data centers according to [17] include business intelligence, scientific high-performance computing, gaming and simulation.

2.2 Importance of Workload Models in Cloud
Models abstract reality to aid researchers and providers in understanding system environments in order to develop or enhance such systems. Workload models enable a way to study Cloud environments and the effect of workload variability on the performance and productivity of the overall system. Specifically, they support researchers and providers in further understanding the actual status and conditions of the Cloud system and in identifying the Key Performance Indicators (KPIs) necessary to improve operational parameters. Such models can be used in a number of research domains including resource optimization, security, dependability and energy-efficiency. In order to produce realistic models, it is critical to derive their components and parameters from real-world production tracelogs. This leads to capturing the intrinsic diversity and dynamism of all co-existing components within the system as well as their interactions. Moreover, realistic workload models enable the simulation of Cloud environments whilst being able to control selected variables to study emergent system-wide behavior, as well as support the estimation of accurate forecasting under dynamic system conditions to improve the QoS offered to users. This supports the enhancement of Cloud Management Systems (CMSs) as it allows providers to experiment with hypothetical scenarios and assess their
decisions as a result of changes within the Cloud environment (i.e., capacity planning for increased system size, alteration of the workload scheduling algorithm, performance tradeoffs, and service pricing models).

3 RELATED WORK
The analysis of workload patterns for Cloud computing environments has been addressed previously [5], [6], [7], [8], [9], [18], [19], [20], [21], [22]. In this section, the most relevant approaches are described; their limitations and gaps are also discussed.

Wang et al. [22] present an approach to characterize the workloads of Cloud computing Hadoop ecosystems, based on an analysis of the first version of the Google tracelog [2]. The main objective of this work is to obtain coarse-grain statistical data about jobs and tasks in order to classify them by duration. This characteristic limits the work's application to the study of timing problems, and makes it unsuitable for analyzing other Cloud computing issues related to resource usage patterns. Additionally, the analysis focuses on tasks and ignores the relationship with the users, a crucial component in Cloud workload as discussed previously.

Zhang et al. [5] present a study to evaluate whether the mean values for task waiting time, CPU, memory, and disk consumption are suitable to accurately represent the performance characteristics of real traces. The data used in their study is not publicly available and consists of the historical traces of six Google compute clusters spanning five days of operation. The evaluation conducted suggests that mean values of runtime task resource consumption are a promising way to describe overall task resource usage. However, it does not describe how the boundaries for task classification were made or how cluster members behave.

Mishra et al. [7] describe an approach to develop Cloud computing workload classifications based on task resource consumption patterns. The analyzed data consist of records from five Google clusters over four days. The proposed approach identifies workload characteristics, constructs the task classification, identifies the qualitative boundaries of each cluster and then reduces the number of clusters by merging adjacent clusters. This approach is useful to create the classification of tasks, but does not perform an analysis of the characteristics of the formed clusters in order to derive a detailed workload model. Finally, it is entirely focused on task modeling, neglecting user patterns.

Kavulya et al. [6] present a statistical analysis of MapReduce traces. The analysis is based on ten months of MapReduce logs from the M45 supercomputing cluster [4]. Here, the authors present a set of coarse-grain statistical characteristics of the data related to resource utilization, job patterns, and sources of failure. This work provides a detailed description of the distributions followed by the job completion times, but only provides very general information about resource consumption and user behavioral patterns. Similar to [22], this characteristic limits the proposed approach mainly to the study of timing problems.

Aggarwal et al. [8] describe an approach to characterize Hadoop jobs. The analysis is performed on a data set spanning 24 hours from one of Yahoo!'s production clusters, comprising 11,686 jobs. This data set features metrics generated by the Hadoop framework. The main objective of this work is to group jobs with similar characteristics using clustering and to analyze the resulting centroids. This work only focuses on the usage of the storage system, neglecting other critical resources such as CPU and memory.

Our previous work [9] provides an approach for characterizing Cloud workload based on user and task patterns using the second version of the Google tracelog; it presents coarse-grain statistical properties of the tracelog, and classifies tasks and users using statistical mechanisms to select the number of clusters. A concise analysis of the clusters is performed, as well as best-fit distributions for each. Finally, the derived analysis parameters are simulated and compared against the empirical data for validation. This work has a number of limitations: the analysis performed is confined to only two days as opposed to the entire tracelog time span, resulting in the potential omission of crucial system environment behavior. Also, the cluster analysis and intra-cluster analysis do not contain sufficient detail to quantify the diversity of workload, instead presenting high-level observations. Furthermore, there is insufficient detail about the parameter distributions used; more detail is necessary in order for other researchers to simulate the workload obtained. Finally, the validation of the simulated model against the empirical data is based only on a visual match of the patterns from one single execution, and does not consider more rigorous statistical techniques.

From the analysis of the related work it is clear that there are few available production tracelogs with which to analyze workload patterns in Cloud environments. Previous analyses present gaps that need to be addressed in order to achieve more realistic workload patterns. First, it is imperative to analyze large data samples, as performed by [5], [6], [9]; small operational time frames such as those used in [7], [8], [22] could lead to unrealistic models. Second, analyses need to explore more than coarse-grain statistics and cluster centroids; to capture the patterns of clustered individuals it is also necessary to conduct analysis of the parameters and study the trends of each cluster characteristic. Third, although previous approaches offer some insights about workload characteristics, they do not provide a structured model which can be used for conducting simulations. Finally, the workload is always driven by the users; therefore realistic workload models must include user behavioral patterns linked to tasks. The approaches previously described focus completely on tasks, neglecting the impact of user behavior on the overall environment workload. A summary of the main characteristics of the related work is presented in Table 1.

4 METHODOLOGY
The methodology, analysis and subsequent simulation within this paper were applied to the second version of the Google Cloud tracelog [3], [10], which covers over 12,000 servers, 25 million tasks and 930 users over the period of a month. The tracelog includes detailed data such as submission patterns, resource requests of users and resource consumption of tasks within the system.

The methodology is divided into two distinct steps. The first is defining the model that will be used for simulating the Cloud workload from the derived data set analysis. As stated
MORENO ET AL.: ANALYSIS, MODELING AND SIMULATION OF WORKLOAD PATTERNS IN A LARGE-SCALE UTILITY CLOUD 211
TABLE 1
Overview of Related Studies
previously, users are responsible for driving the volume and behavior of tasks in terms of requested resources and the volume of task submission. Therefore, three important characteristics that define this behavior within the tracelog are referred to as parameters that are fundamental to describing user behavior: the submission rate a, and the requested amounts of CPU b and memory f. The submission rate is the quotient of the number of submissions divided by the tracelog time span, and is presented as task submissions per hour. Requested CPU and memory are represented as the normalized resources requested by users, taken directly from the task events log within the tracelog.

Tasks are defined by the type and amount of work dictated by users, resulting in different execution lengths and resource utilization patterns. Consequently, the essential parameters that describe tasks are the length x and the average resource utilization for CPU g and memory p. While the length is defined as the total amount of work to be computed, the average resource utilization is the mean of all the consumption measurements recorded in the tracelog for each task.

The Cloud workload can be defined as a set of users with profiles U submitting tasks classified in profiles T, where each user profile ui is defined by the probability functions of a, b and f, and each task profile ti by those of x, g and p, determined from the tracelog analysis. The expectation E(ui) of a user profile is given by its probability P(ui), and the expectation E(ti) of a task profile is given by its probability P(ti) conditioned on the probability P(uj). The model components and their relationship are formalized in Equations (1) to (6).

U = {u1, u2, u3, ..., ui}    (1)
T = {t1, t2, t3, ..., ti}    (2)
E(ti) = ti P(ti | uj) P(uj)    (6)

The second step of the methodology is to cluster the tasks and users composed by the parameters defined above, in order to analyze and create realistic workload models derived from empirical data. k-means clustering is a popular data-clustering algorithm that divides n observations into k clusters, in which the analyzed data sets are partitioned in relation to the selected parameters and grouped around cluster centroids [23].

One critical factor in such an algorithm is determining the optimal number of clusters. For the analysis, we use the statistical method proposed by Pham et al. [24]. This method, shown in Equations (7) and (8), allows us to select the number of clusters based on quantitative metrics, avoiding qualitative techniques that introduce subjectivity. This clustering method considers the degree of variability among all the elements within the derived clusters in relation to the number of analyzed parameters. A number of clusters k is suggested when this variability, represented by f(k), is lower than or equal to 0.85, according to the observations presented by the authors. S_k is the sum of cluster distortions, N_d is the number of parameters within the population and a_k is the weight factor based on the previous set of clusters.

We run the k-means clustering algorithm for k ranging from 1 to 10. For each value of k we calculate f(k) using Equations (7) and (8). Based on the results we were able to formally determine the number of clusters for U and T (Equations (1) and (2)), respectively.

f(k) = 1                       if k = 1
f(k) = S_k / (a_k S_{k-1})     if S_{k-1} != 0, for all k > 1    (7)
f(k) = 1                       if S_{k-1} = 0, for all k > 1
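The cluster-count selection just described can be sketched as follows. The f(k) evaluation and the recursive weight factor a_k follow the published method of Pham et al.; the plain Lloyd's k-means with deterministic farthest-first seeding is a simplification for illustration, not the paper's actual clustering setup:

```python
import numpy as np

def _farthest_first(X, k):
    """Deterministic farthest-first seeding: covers well-separated groups
    without random restarts (a simplification of k-means initialization)."""
    centroids = [X[0]]
    for _ in range(k - 1):
        d = ((X[:, None, :] - np.array(centroids)[None, :, :]) ** 2).sum(axis=2)
        centroids.append(X[d.min(axis=1).argmax()])
    return np.array(centroids)

def kmeans_distortion(X, k, iters=100):
    """Lloyd's algorithm; returns the distortion S_k, i.e., the sum of squared
    distances from each observation to its nearest centroid."""
    centroids = _farthest_first(X, k)
    for _ in range(iters):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def pham_f(X, k_max=10):
    """Evaluation function f(k) of Pham et al.; a cluster count k is
    plausible when f(k) <= 0.85 (Equation (7))."""
    n_d = X.shape[1]                      # N_d: number of parameters
    S = {k: kmeans_distortion(X, k) for k in range(1, k_max + 1)}
    a = {2: 1.0 - 3.0 / (4.0 * n_d)}      # weight factors a_k (Equation (8))
    for k in range(3, k_max + 1):
        a[k] = a[k - 1] + (1.0 - a[k - 1]) / 6.0
    f = {1: 1.0}
    for k in range(2, k_max + 1):
        f[k] = 1.0 if S[k - 1] == 0 else S[k] / (a[k] * S[k - 1])
    return f
```

On data with three well-separated groups, f(k) drops sharply at k = 3, which is how the analysis suggests the number of user and task clusters.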
Fig. 1. Clusterization for users: (a) entire month, (b) entire month (omitting outliers), (c) Day 2, (d) Day 18, and (e) Day 26.
TABLE 2
Statistical Properties of User Clusters for Entire System
Fig. 2. Clusterization for tasks (a) entire month, (b) Day 2, (c) Day 18, and (d) Day 26.
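The diversity discussion that follows quantifies variability with the coefficient of variation (Cv). Because Cv is dimensionless, parameters on very different scales (tasks per hour versus normalized CPU) can be compared directly; a minimal sketch:

```python
import numpy as np

def coefficient_of_variation(values):
    """Cv = standard deviation / mean. Being a ratio, it is unitless;
    Cv > 1 is read in this analysis as highly variant behavior."""
    v = np.asarray(values, dtype=float)
    return float(v.std() / v.mean())
```

For example, a cluster whose submission rates are nearly constant yields Cv close to 0, while a cluster mixing rates that differ by two orders of magnitude yields Cv close to 1 or above.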
rate exhibits highly variant behavior across all user clusters, with an average Cv of 1.97. U2 is the only user cluster whose submission-rate Cv is less than 1, which is most likely due to its cluster population size of 3.

There are three reasons for the above observations. First, as reported in previous works [9], the Cloud data center environment is naturally heterogeneous in workload due to user behavior. Second, the resources requested by users are possibly a reflection of the application and system domain boundaries; for example, applications deployed or invoked within the Cloud environment have pre-defined resource requests to meet the demands of user QoS. Third, the submission rate is outside the boundaries of the system and is entirely driven by users. Such behavior is reflective of the definition of Cloud computing, which provides the illusion of infinite resources to users [25], allowing them to submit as many tasks as required without conscious thought about system limitations.

Figs. 2a, 2b, 2c, and 2d present the k-clusters for tasks across all observational periods, and demonstrate that it was possible to define three clusters for all observational periods where f(k) < 0.85. It is observable that the cluster shapes are visually similar across all observational periods, with cluster 3 (T3) containing the lowest values for CPU, memory and length, while T2 exhibits more variant behavior. Moreover, T2 composes less than 2 percent of the total task population and T3 contains over 70 percent of the task population across all time periods, as shown in Table 3. In addition, we observe that the proportions of tasks within the clusters stay relatively constant. In comparison to the heterogeneity of user clusters, task patterns appear to be more uniform across different observational periods.

Table 4 presents the statistical properties of the task parameters length, CPU and memory utilization for all clusters across the four observational periods. It is possible to make a more balanced comparison of task clusters over different time periods, in contrast to user clusters, due to the observed stability. Similar to the characteristic of user submission rate, we observe that task length is highly heterogeneous across all clusters and observational periods, with an average Cv of 2.36, indicating high variation between values. This is due to the same reasons as for the variability that exists in user submission rates: task length is a parameter that is outside the boundaries of the system environment and is entirely dependent on the demands of the user (i.e., users will execute tasks of different execution lengths to meet their QoS demands). CPU and memory are less variable due to application domain constraints imposed by the system environment, reflected by average Cv values of 0.93 and 0.83 for CPU and memory utilization, respectively.

These results highlight two important findings. First, when quantifying the diversity of the Cloud environment, it appears that parameters that are outside the boundaries of the system environment introduce the highest level of heterogeneity. This is demonstrated by the parameters user submission rate and task execution length exhibiting highly variant behavior in comparison to the CPU and memory requests and utilization for users and tasks, respectively. Second, the diversity of workload imposed by these two parameters introduces potential challenges to workload prediction; in this case, where the parameters are highly variable and dynamic, the expiration time of historical data seems to be considerably shorter. Therefore, there exists a need for adaptive and evolving mechanisms that allow providers to obtain more accurate predictions.

TABLE 3
Proportion of Task Cluster Populations (Percent)

5.2 Distribution Analysis
This section studies the data distributions for each cluster parameter for tasks and users. Figs. 3 and 4 present the
Fig. 3. CDF of user cluster U1 (a) CPU requested, (b) memory requested, and (c) submission rate.
Fig. 4. CDF of task cluster T1 (a) CPU, (b) memory, and (c) length.
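The best-fit selection summarized in Tables 6 and 7 can be sketched as follows. This is an illustrative version, not the paper's actual fitting procedure: it moment-matches two candidate distributions on a positive sample and keeps the one with the smaller Kolmogorov-Smirnov distance between the empirical and theoretical CDFs.

```python
import math
import numpy as np

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between a sample and a candidate CDF."""
    x = np.sort(sample)
    n = len(x)
    theo = np.array([cdf(v) for v in x])
    upper = np.max(np.arange(1, n + 1) / n - theo)  # ECDF above the model
    lower = np.max(theo - np.arange(0, n) / n)      # ECDF below the model
    return max(upper, lower)

def lognormal_cdf(mu, sigma):
    # CDF of a lognormal via the error function of the underlying normal
    return lambda v: 0.5 * (1.0 + math.erf((math.log(v) - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(rate):
    return lambda v: 1.0 - math.exp(-rate * v)

def fit_and_compare(sample):
    """Return the better-fitting candidate name and the KS scores."""
    logs = np.log(sample)
    candidates = {
        "lognormal": lognormal_cdf(logs.mean(), logs.std()),
        "exponential": exponential_cdf(1.0 / float(np.mean(sample))),
    }
    scores = {name: ks_statistic(sample, cdf) for name, cdf in candidates.items()}
    return min(scores, key=scores.get), scores
```

Extending the candidate set (e.g., gamma or Weibull) follows the same pattern: fit the parameters, compute the distance, keep the minimum.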
set of operating conditions, we implemented the task and user model parameters described previously as an extension to the CloudSim framework [26], [27], [28], [29]. CloudSim is a Java-based framework that enables the simulation of complete Cloud computing environments [27]. It provides an abstraction of all the elements within the Cloud computing model and of the interactions among them. However, as with any other simulation software, the quality and accuracy of the results entirely depend on how accurately the introduced parameters reflect the analyzed system in reality. The following subsections describe the implemented workload generator and the conducted simulation validation.

6.1 Workload and Environment Generator
The workload and environment generator is composed of the following modules: the Profile Manager, Data Center Generator, Customer Generator, Task Generator and Environment Coordinator. The user and task profiles describe, respectively, the user and task types identified during the clustering process and encapsulate the outlined behavioral patterns derived during the cluster and distribution analysis. The server profiles describe the capacities and characteristics of the data center hosts according to the data within the tracelog. These characteristics, as well as the proportion of servers of each type, are listed in Table 8.

The Profile Manager loads each element description, making them available to the generators. The User Generator creates the CloudSim user instances and connects them with a specific profile determined by their associated probabilities, as described in Equation (5). The Task Generator creates the CloudSim task instances and connects them with a specific task profile determined by the conditional probability in Equation (6). Each of the user and task characteristics defined in the model, such as submission rate, length and resource consumption, is obtained by sampling the inverse CDFs of the distributions in Equations (3) and (4). Finally, the Environment Coordinator controls the interactions between the three generators and the CloudSim framework, which executes the simulation with the created instances.

6.2 Simulation Configuration
We have executed a model simulation of a data center composed of 12,000 servers with 160 customers submitting tasks during 24 hours, for a total of five iterations. The user and task profiles are configured using the statistical parameters derived for the entire-month analysis, as described in Tables 5 and 6. The profiles of the simulated servers are outlined from the tracelog as presented in Table 8, where the values of CPU and memory are normalized. The normalization is a scaling relative to the largest capacity of the resource on any server in the trace, which is 1.0.

6.3 Simulation Validation
Model validation is defined as the "substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model" [30]. In the case of trace-driven models, where the analyst does not have access to the real system or to a different data set sampled from the same system, a common validation technique consists of using a portion of the available data to construct the model and the remaining data to determine whether the model behaves as the real system does. This is typically addressed by sampling the analyzed tracelog, where both the input and the actual system response must be collected from the same period of time [31]. According to Sargent [30], there are two basic approaches for comparing a simulation model to the behavior of the real system. The first consists of using graphs to empirically evaluate the outputs, and the second involves the application of statistical hypothesis tests to make an objective decision.

To validate our model simulation we use both techniques. The proportions of categorical data, such as task, user and server types as well as task priorities, are contrasted empirically by plotting comparative charts and evaluating the absolute error between the average output of the simulations and the data from the real system; additionally, we analyze the variability of the results and their corresponding confidence intervals (CI). Continuous data, such as the user and task resource request and consumption patterns, are compared statistically using the Wilcoxon-Mann-Whitney test

TABLE 5
Probability of 0 for Task Resource Utilization
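The generator logic of Section 6.1 (pick a profile according to its probability, then draw each characteristic by inverse-CDF sampling) can be sketched as below. The profile probabilities and lognormal parameters here are placeholders for illustration, not the fitted values reported in Tables 5 and 6:

```python
import numpy as np
from statistics import NormalDist

# Illustrative profiles: probabilities and distribution parameters are
# placeholders, NOT the values fitted from the tracelog.
USER_PROFILES = {
    "U1": {"p": 0.80, "submission_rate": ("lognormal", 1.2, 0.9)},
    "U2": {"p": 0.15, "submission_rate": ("lognormal", 2.0, 0.5)},
    "U3": {"p": 0.05, "submission_rate": ("lognormal", 3.1, 1.4)},
}

def sample_profile(profiles, rng):
    """Pick a profile name according to its probability P(ui)."""
    names = list(profiles)
    probs = [profiles[n]["p"] for n in names]
    return names[rng.choice(len(names), p=probs)]

def sample_characteristic(spec, rng):
    """Inverse-CDF sampling: draw u ~ U(0,1), evaluate the quantile function."""
    kind, mu, sigma = spec
    u = rng.uniform(1e-12, 1.0)  # open interval: inv_cdf is undefined at 0
    if kind == "lognormal":
        # lognormal quantile via the inverse CDF of the underlying normal
        return float(np.exp(mu + sigma * NormalDist().inv_cdf(u)))
    raise ValueError(f"unsupported distribution: {kind}")
```

A simulation run then repeatedly calls `sample_profile` for each created user and `sample_characteristic` for each of that profile's parameters, which is what lets the generator fluctuate realistically instead of replaying the tracelog.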
TABLE 6
Best Fit Distribution Parameters of User and Task Clusters for Entire System
(WMW) [32], [33]. WMW is one of the most powerful non-parametric tests for comparing two populations. According to Mauger [34], it is based on the test of the null hypothesis that the distributions of two populations, although unspecified, are equal, against the alternative hypothesis that the distributions have the same shape but are shifted, so that the outcomes of one population tend to be larger than those of the other. It is commonly applied instead of the two-sample t-test when the analyzed data do not follow a normal distribution, as is the case for the outlined user and task patterns. Additionally, in order to verify the consistency of the WMW test, we have applied Fisher's method [35], a meta-analysis technique used to combine p-values from different and independent tests that share the same null hypothesis. The objective is to verify whether the rejections are statistically significant given the variances reported, or are consistent with the results of the other simulations.

6.4 Validation Results
The results from our simulation experiments demonstrate the accuracy of the derived model in representing the operational characteristics of the workload within the Cloud computing data center for the analyzed scenario. Fig. 5 illustrates the proportions of components (users, tasks, task priorities and servers) created during the simulations, which are contrasted against the observations from the real system. Comparing the average simulation outputs with the real values, it is possible to observe that the simulated proportions of fundamental elements consistently match the proportions of the elements in the actual system. From the detailed results presented in Table 9, it can be observed that while the proportions of tasks do not significantly fluctuate, the proportions of users and servers across different simulation executions present a higher variability. This is mainly produced by the very small populations of specific clusters. For example, cluster U2 represents only 0.70 percent of
TABLE 7
Best Fit Distribution Comparison for Task Clusters
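Fisher's method, used above to check the consistency of the WMW results, relies on the fact that under a shared null hypothesis the statistic -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees of freedom. Since 2k is always even, the chi-square tail probability has a closed form, so a minimal sketch needs no statistics package:

```python
import math

def fisher_combined_pvalue(pvalues):
    """Fisher's method: combine k independent p-values that test the same
    null hypothesis. For even degrees of freedom 2k, the chi-square survival
    function is exp(-x/2) * sum_{i<k} (x/2)^i / i!."""
    k = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
```

Combining several small per-test p-values yields a small combined p-value (evidence against the shared null), while unremarkable p-values combine to an unremarkable result.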
TABLE 8
Server Characteristics of Tracelog

TABLE 9
Simulation Results for Proportions of Cloud Data Center Components
Fig. 5. Comparison of proportions of real and simulated data for (a) users, (b) tasks, (c) task priority, and (d) servers.
Fig. 6. CDF of user patterns between real and simulated data for U1 (a) requested CPU, (b) requested memory, and (c) submission rate.
Fig. 7. CDF of task patterns between real and simulated data for T3 (a) CPU utilization, (b) memory utilization, and (c) length.
TABLE 10
Wilcoxon-Mann-Whitney and Fisher's P-Value Tests for User Clusters

TABLE 11
Wilcoxon-Mann-Whitney and Fisher's P-Value Tests for Task Clusters
is closer to that observed in the real system. Conversely, differences in CPU utilization for T2 and T3 increase the error in execution time for these two clusters.

7 IMPROVEMENT OF CPU CONSUMPTION PATTERNS
The inaccurate CPU utilization patterns for T2 and T3 are a result of multimodal data distributions. This makes fitting such data sets with a single theoretical distribution unsuitable and creates significant gaps between the simulated and real data, as observed in Fig. 7b. To improve the accuracy of our model, we applied multi-peak histogram analysis for region splitting [38] and fitted the derived data set sub-regions to new parametric distributions. Essentially, the data is ranked and presented in a histogram, which is split at the lowest points of the valleys created by the multimodal distribution. To identify the peaks and valleys of a given multimodal data set, we smooth the histogram by applying the LOWESS (Locally-Weighted Scatterplot Smoother) technique [36] using the Minitab statistical package [37]. Then, the derived sub-regions are fitted to new parametric distributions following the same process described in Section 5.2. Consequently, the CPU utilization patterns of the affected clusters comprise a combination of different distributions, which are sampled by the model simulator based on the proportional size of the derived sub-regions. The distribution parameters and sizes of the obtained sub-regions are presented in Table 12.
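The region-splitting step can be sketched as follows. A simple moving average stands in for the LOWESS smoother used with Minitab in the paper; the sketch finds the two tallest peaks of the smoothed histogram and cuts the sample at the deepest valley between them:

```python
import numpy as np

def split_at_valley(data, bins=50, window=5):
    """Split a 1-D multimodal sample at the deepest valley between the two
    tallest peaks of its smoothed histogram. Returns the cut point and the
    two sub-regions, each of which can then be fitted separately."""
    counts, edges = np.histogram(data, bins=bins)
    # moving-average smoothing (a stand-in for LOWESS)
    smooth = np.convolve(counts, np.ones(window) / window, mode="same")
    # local maxima of the smoothed histogram (interior bins only)
    local_max = [i for i in range(1, bins - 1)
                 if smooth[i] >= smooth[i - 1] and smooth[i] > smooth[i + 1]]
    # the two tallest local maxima bracket the valley of interest
    lo, hi = sorted(sorted(local_max, key=lambda i: smooth[i])[-2:])
    valley = lo + int(smooth[lo:hi + 1].argmin())
    cut = edges[valley]
    return cut, data[data < cut], data[data >= cut]
```

Each returned sub-region is then fitted to its own parametric distribution, and the simulator samples from the sub-distributions in proportion to their sizes.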
MORENO ET AL.: ANALYSIS, MODELING AND SIMULATION OF WORKLOAD PATTERNS IN A LARGE-SCALE UTILITY CLOUD 219
Fig. 8. CDF of task patterns between real and simulated data of task execution time (seconds) for (a) T1 (s), (b) T2, and (c) T3.
The results of this process are illustrated in Fig. 9 for resources and energy-efficiency. The core idea is to co-
where it can be observed that the split distributions allocate different types of workloads based on the level of
improve the fitting between the simulated and real data- interference that they create, to reduce resultant overhead
sets. The p-values of the WMW test for both clusters are and thus improve the energy-efficiency of the data center.
sufficiently statistically strong to support the equality of By considering the resource consumption patterns of each
patterns. This reduces the error for execution time from task type we estimate the level of interference and energy-
8.07 to 0.42 percent and from 5.91 to 0.13 percent for T2 efficiency decrement when they are co-located in a physi-
and T3, respectively. cal server. We classify incoming tasks based on their
resource usage patterns, pre-select the hosting servers
8 APPLICATION OF WORK based on resources constraints, and make the final alloca-
tion decision based on the current servers performance
The workload model presented in this paper enables
interference level [40]. In both cases the proposed work-
researchers to simulate request and consumption pat-
load model and the parameters derived from the pre-
terns considering parameters and patterns statistically
sented analysis are used to emulate the user and tasks
close to those observed from a production environment.
patterns required by the energy-aware algorithms. The
This is critical in order to improve resources utilization,
model integrates the relationship between user demand
reduce energy waste and in general terms support the
and the actual resource usageessential in both scenarios
design of accurate forecast mechanisms under dynamic
where the aim is to achieve a balance between resource
conditions to improve the QoS offered to customers. Spe-
request and utilization in order to reduce resource waste.
cifically, we use the proposed model to support the
Another important benefit of our approach is that as val-
design and evaluation of two energy-aware mechanisms
ues of customer and task parameters are represented as pro-
for Cloud computing environments.
portions of resources requested or consumed, they are
The first is a resource overallocation mechanism that
agnostic of underlying hardware characteristics. Therefore,
considers customers resource request patterns and the
the proposed model can be used to evaluate the perfor-
actual resource utilization imposed by their submitted
mance of different data center configurations under the
tasks. Taking into account these parameters from the pro-
same workload.
posed model it is possible to estimate the resource overes-
Furthermore, the comprehensive analysis at cluster and
timation patterns. The main idea is to exploit the resource
intra-cluster level, the workload model that integrates user
overestimation patterns of each user type in order to
and tasks patterns, and the applicability of the model
smartly overallocate resources to the physical servers.
independently of the hardware characteristics represent
This reduces the waste produced by frequent overestima-
unique advances in comparison with the related work pre-
tions and increases data center availability. Consequently,
viously discussed in Section 3. Additionally, the proposed
it creates the opportunity to host additional Virtual
model supports the assessment of resource management
Machines in the same computing infrastructure, improv-
mechanisms such as those recently presented in [41], [42]
ing its energy-efficiency [39].
and [43] with parameters from a large-scale production
The second mechanism considers the relationship
Cloud environment.
between Virtual Machine interference due to competition
TABLE 12
Sub-Regions Distribution Fitting to Improve CPU Utilization for T2 and T3
220 IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. 2, NO. 2, APRIL-JUNE 2014
10 FUTURE WORK
Future research directions includes extending the model to
include tasks constraints based on server characteristics;
this will allows us to analyze the impact of hardware hetero-
Fig. 9. CPU utilization pattern improvement for (a) T2 and (b) T3.
geneity on workload behavior. Other extensions include
analyzing the workload from the jobs perspective specifi-
9 CONCLUSIONS cally modeling the behavior and relationship of users and
This paper presents an analysis that quantifies the diver- submitted jobs, accurately emulating and analyzing work-
sity of Cloud workloads and derives a workload model load energy consumption and reliability enabling further
from a large-scale production Cloud data center. The research into energy-efficiency, resource optimization and
presented analysis and model captures the characteristics failure-analysis in the Cloud environment. Finally, it is
and behavioral patterns of user and task variability important to enable a collaboration link with the CloudSim
across the entire system as well as different observa- group in order to integrate the proposed workload genera-
tional periods. The derived model is implemented using tor as an add-in of the current framework implementation
the CloudSim framework and extensively validated allowing it to be made publicly available.
through empirical comparison and statistical tests.
From the observations presented within this work and ACKNOWLEDGMENTS
the results obtained from the simulations, a number of The work was supported by CONACyT (No. 213247),
conclusions can be made. These are as follows: the National Basic Research Program of China (973)
(No. 2011CB302602), and the UK EPSRC WRG platform
Workload in Cloud data centers is driven not only by tasks
project (No. EP/F057644/1).
characteristics but also by user behavioral patterns.
Related approaches on workload analysis are
focused on parameters such as the duration and the REFERENCES
resources consumed by tasks. However, as observed [1] R. Buyya, R. Ranjan, and R. N. Calheiros, InterCloud: Utility-ori-
from the presented analysis, in some scenarios spe- ented federation of cloud computing environments for scaling of
application services, Proc. 10th Int. Conf. Algorithms Archit. Parallel
cific types of users impose a strong influence on the Process., 2010, pp. 1331.
overall Cloud workload. Therefore, comprehensive [2] Google. Google Cluster Data V1 (2010). [Online] Available: http://
workload models must consider both tasks and users code.google.com/p/googleclusterdata/wiki/TraceVersion1
[3] Google. Google Cluster Data V2 (2011). [Online] Available: http://
in order to reflect realistic conditions. code.google.com/p/googleclusterdata/wiki/ClusterData2011_1
User patterns tend to be significantly more diverse than [4] Yahoo. Yahoo! M45 Supercomputing Project. (2007). [Online].
task patterns across different observational periods. Available: http://research.yahoo.com/node/1884
Depending on the type of service offered, providers [5] Q. Zhang, J. Hellerstein, and R. Boutaba, Characterizing task
usage shapes in Google compute clusters, in Proc. 5th Int. Work-
can control the type of tasks and the environment in shop Large Scale Distrib. Syst. Middleware, 2011, pp. 28.
which they are running (i.e., SaaS and PaaS). This can [6] S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, An analysis of
create more stable tasks patterns over the time. On traces from a production MapReduce cluster, in Proc. IEEE/ACT
Int. Conf. Cluster, Cloud Grid Comput., 2010, pp. 94103.
the other hand, user patterns tend to change accord- [7] A. K. Mishra, J. Hellerstein, W. Cirne, and C. R. Das, Towards
ing to needs derived from their own business objec- characterizing cloud backend workloads: Insights from Google
tives which are completely out of the boundaries of compute clusters, ACM SIGMETRICS Perform. Eval. Rev., vol. 37,
Cloud providers. This creates new challenges on pp. 3441, 2010.
[8] S. Aggarwal, S. Phadke, and M. Bhandarkar, Characterization of
workload prediction mechanisms that need to evolve Hadoop jobs using unsupervised learning, in Proc. 2nd Int. Conf.
and adapt according to such dynamic characteristics. Cloud Comput. Technol. Sci., 2010, pp. 748753.
MORENO ET AL.: ANALYSIS, MODELING AND SIMULATION OF WORKLOAD PATTERNS IN A LARGE-SCALE UTILITY CLOUD 221
[9] I. Solis Moreno, P. Garraghan, P. Townend, and J. Xu, An [33] A. Gold, Understanding the Mann-Whitney Test, J. Property Tax
approach for characterizing workloads in Google cloud to derive Assessment Admin., vol. 4, pp. 5557, 2007.
realistic resource utilization models, in Proc. IEEE Int. Symp. Serv. [34] D. T. Mauger and G. L. Kauffman Jr, 82 - statistical analysis
Oriented Syst. Eng., 2013, pp. 4960. specific statistical tests: Indications for use, Surgical Research
[10] C. Reiss, J. Wilkes, and J. Hellerstein, Google Cluster-Usage W. S. Wiley and W. W. Douglas, eds., San Diego, CA, USA, Aca-
Traces: Format Schema, Google Inc., Mountain View, CA, demic, 2001, pp. 12011215.
USA, White Paper, 2011. [35] D. A. S. Fraser, A. K. M. Saleh, and K. Ji, Combining p-values: A
[11] P. Mell and T. Grance, The NIST definition of cloud computing, definitive process, J. Statist. Res., vol. 44, pp. 1529, 2010.
NIST Spec. Publication, vol. 800, p. 145, 2011. [36] D. Borcard and P. Legendre, Exploratory data analysis,in
[12] M. A. El-Refaey and M. A. Rizkaa, Virtual systems workload char- Numerical Ecology, New York, NY, USA, Springer, pp. 930, 2011.
acterization: An overview, in Proc. IEEE Int. Workshops Enabling [37] Minitab, Version: Release 16 (2010). MINITAB statistical software
Technol. Infrastructures Collaborative Enterprises, 2009, pp. 7277. [Online]. Available: http://www.minitab.com.
[13] B. Sharma, V. Chudnovsky, J. Hellerstein, R. Rifaat, and C. R. Das, [38] S. Pal and P. Bhattacharyya, Multipeak histogram analysis in
Modeling and synthesizing task placement constraints in Google region splitting: A regularization problem, in Proc. IEEE Comput.
compute clusters, in Proc. ACM Symp. Cloud Comput., 2011, pp. 114. Digit. Tech., 1991, vol. 138, pp. 285288.
[14] J. Zhan, L. Wang, W. Shi, S. Gong, and X. Zang, PhoenixCloud: [39] I. Solis Moreno and J. Xu, Neural network-based overallocation
Provisioning resources for heterogeneous workloads in cloud for improved energy-efficiency in real-time cloud environments,
computing, arXiv preprint arXiv:1006, vol. 1401, 2010. in Proc. IEEE Int. Symp. Object/Compon./Serv.-Oriented Real-Time
[15] V. Vasudevan, D. Andersen, M. Kaminsky, L. Tan, J. Franklin, and Distrib. Comput., 2012, pp. 119126.
I. Moraru, Energy-efficient cluster computing with FAWN: [40] I. Solis Moreno, R. Yang, J. Xu, and T. Wo, Improved energy-effi-
Workloads and implications, in Proc. Int. Conf. Energy-Efficient ciency in cloud datacenters with interference-aware virtual
Comput. Netw., 2010, pp. 195204. machine placement, in Proc. IEEE Int. Symp. Auton. Decentralized
[16] T. N. B. Doung, X. Li, R. S. M. Goh, X. Tang, and W. Cai, QoS- Syst., 2013, pp. 18.
aware revenue-cost optimization for latency-sensitive services in [41] X. Lu, H. Wang, J. Wang, J. Xu, and D. Li, Internet-based virtual
IaaS clouds, in Proc. IEEE/ACM Int. Symp. Distrib. Simul. Real computing environment: Beyond the data center as a computer,
Time Appl., 2012, pp. 1118. Future Generation Comput. Syst., vol. 29, pp. 309322, 2013.
[17] IBM, Get more out of cloud with a structured workload analysis, [42] M. Kesavan, I. Ahmad, O. Krieger, R. Soundararajan, A.
White Paper IAW03006-USEN-00, 2011. Gavrilovska and K. Schwan, Practical compute capacity
[18] A. Bahga and V. K. Madisetti, Synthetic workload generation for management for virtualized datacenters, IEEE Trans. Cloud
cloud computing applications, J. Softw. Eng. Appl., vol. 4, Comput., vol. 1, no. 1, pp. 88100, Jan.-Jun. 2013.
pp. 396410, 2011. [43] J. Doyle, R. Shorten, and D. OMahony, Stratus: Load balancing
[19] A. Beitch, B. Liu, T. Yung, R. Griffith, A. Fox, and D. A. Patterson, the cloud for carbon emissions control, IEEE Trans. Cloud Com-
Rain: A workload generation toolkit for cloud computing put., vol. 1, no. 1, pp. 116128, Jan.-Jun. 2013.
applications, Elect. Eng. Comput. Sci. Univ. California, Berkeley,
CA, USA, White Paper UCB/EECS-2010-14, 2010. Ismael Solis Moreno received the PhD degree
[20] Y. Chen, A. S. Ganapathi, R. Griffith, and R. H. Katz, Analysis from the University of Leeds, and the MSc
and lessons from a publicly available Google cluster trace, USA, degree from the CENIDET, Mexico. He has
EECS Dept., Univ. California, Berkeley, CA, UCB/EECS-2010-95., worked as a researcher for the Mexican Electrical
Jun. 2010. Research Institute. His current work on energy-
[21] J. W. Smith and I. Sommerville, Workload classification & soft- efficient Cloud computing is funded by the CON-
ware energy measurement for efficient scheduling on private cloud ACyT. He has received best paper awards at
platforms, presented at the ACM SOCC, Cascais, Portugal, 2011. IEEE SOSE-2013 and IEEE ISADS-2013.
[22] G. Wang, A. R. Butt, H. Monti, and K. Gupta, Towards synthesiz-
ing realistic workload traces for studying the Hadoop ecosystem,
in Proc. IEEE Int. Symp. Modeling, Anal. Simul. Comput. Telecom-
mun. Syst., 2011, pp. 400408. Peter Garraghan received the BSc degree from
[23] R. Xu and D. Wunsch, Survey of clustering algorithms, IEEE Staffordshire University, United Kingdom, and is
Trans. Neural Netw., vol. 16, pp. 645678, 2005. currently working toward the PhD degree in the
[24] D. T. Pham, S. S. Dimov, and C. D. Nguyen, Selection of K in Distributed Systems and Service Group at the
K-means clustering, Proc. Inst. Mech. Eng., Part C: J. Mech. Eng. University of Leeds. He has worked as an IT spe-
Sci., vol. 219, pp. 103119, 2005. cialist at HP, Germany. His current research on
[25] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, Cloud computing and energy-aware dependabil-
A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. ity is funded by the UK EPSRC WRG platform
Zaharia, Above the Clouds: A Berkeley view of cloud project. He has received an award for best con-
computing, Univ. California, Berkeley, CA, USA, Tech. Rep. ference paper at the IEEE SOSE-2013.
UCB/EECS-2009-28, Feb. 2009.
[26] R. Buyya, R. Ranjan, and R. N. Calheiros, Modeling and simula- Paul Townend is a research fellow in the School
tion of scalable cloud computing environments and the CloudSim of Computing, University of Leeds. He has been
toolkit: Challenges and opportunities, in Proc. Intl Conf. High Per- a lead researcher on major projects dealing with
form. Comput. Simul., 2009, pp. 111. HPC, decision support, large-scale simulations,
[27] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Cloud computing, and dependable and secure
Buyya, CloudSim: A toolkit for modeling and simulation of cloud systems. He has extensive experience in collabo-
computing environments and evaluation of resource provisioning rating with academia, local government, and
algorithms, Softw. Practice Experience, vol. 41, pp. 2350, 2010. industry, and has authored and coauthored more
[28] S. K. Garg and R. Buyya, NetworkCloudSim: Modelling parallel than 40 international publications.
applications in cloud simulations, in Proc. IEEE Intl. Conf. Utility
Cloud Comput., 2011, pp. 105113.
[29] B. Wickremasinghe, R. N. Calheiros, and R. Buyya, CloudAnalyst: Jie Xu is a chair of computing and head of the I-
A CloudSim-based visual Modeller for analysing cloud computing CSS at the University of Leeds. He is the director
environments and applications, in Proc. IEEE Intl. Conf. Adv. Inf. of the UK EPSRC WRG e-Science Centre. He is
Netw. Appl., 2010, pp. 446452. also a guest professor of Beihang University,
[30] R. G. Sargent, Verification and validation of simulation models, China. He has published more than 300 aca-
in Proc.Conf, Winter Simul., 2010, pp. 166183. demic papers in areas related to dependable dis-
[31] O. Balci and R. G. Sargent, Some examples of simulation model tributed systems and has industrial experience in
validation using hypothesis testing, Proc. Conf. Winter Simul., designing and implementing large-scale net-
vol. 2, pp. 621629, 1982. worked computer systems. He has led or coled
[32] D. Brown and P. Rothery, Models in biology: Mathematics, statis- many research projects to the value of more than
tics and computing, Proc. 14th Conf. Winter Simul, 1993. $30M. He is a member of the IEEE.