
AutoScaling (ResearchPPT)

The researchers propose a novel cloud resource auto-scaling scheme that automatically predicts web request volumes and determines the optimal number of virtual machines (VMs) needed to process requests while balancing cost and latency. They implemented the scheme on Amazon Web Services and evaluated it using three real-world web log datasets. Results showed the scheme achieved accurate request prediction and optimal resource allocation with low cost-latency tradeoffs and service level agreement violations.

Uploaded by Naveen Jaiswal

Optimal Cloud Resource Auto-Scaling for Web Applications

Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long
2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing

Presented by Nayan, Naveen, Digvijay

INTRODUCTION
Web application providers scale virtual resources up or down to achieve cost-effective outcomes.
Elasticity is the ability to access computing resources (by scaling up or down) according to dynamic resource demands.
Elasticity and cost-effectiveness are two of the key features that make cloud computing appeal to more customers.

What the Research Paper Does
They propose a novel cloud resource auto-scaling scheme at the virtual machine (VM) level for web application providers.
The scheme automatically predicts the number of web requests and discovers an optimal cloud resource demand with a cost-latency trade-off.
The scheme makes a resource-scaling decision, Up, Down, or NOP (no operation), in each time-unit of re-allocation.

How Did They Test It
They implemented the scheme on the Amazon cloud platform and evaluated it using three real-world web log datasets.
The experimental results show that the scheme achieves resource auto-scaling with low prediction errors, as well as optimal resource allocation with a good cost-latency trade-off and low SLA violations.

Research Paper Goals
Predict the resources demanded in each time-unit of re-allocation.
Adaptively adjust the resource cap based on the predicted resource demands.
Design optimization algorithms that make a trade-off decision between cost and latency, while meeting the cost constraints and the SLAs on latency metrics.

Approaches Used for Auto-Scaling Before This Paper
Reactive mechanisms:
Elasticity rules or threshold-based rules
Rule-based auto-scaling mechanisms
Explicit user intervention required
Take several minutes to instantiate a new VM
Predictive methods:
Analyze the historical data of resource usage
Use histogram techniques to predict workload

System Modeling
Web application providers can specify their budgetary constraints and an SLA on latency for their applications.
In practical applications, an unpredictable burst in the number of requests will happen.
Proposed scheme:
The scheme monitors the waiting queue of requests to be processed in real time. Once the length of the queue exceeds a threshold, the scheme dynamically appends VMs to process the excess requests.
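The queue-threshold safeguard described above can be sketched as follows. This is a minimal illustration: the threshold value and the per-VM capacity `requests_per_vm` are names we introduce for the example, not taken from the paper.

```python
# Minimal sketch of the reactive safeguard: watch the request queue and
# append VMs when it grows beyond a threshold. Parameter names are
# illustrative assumptions, not the paper's notation.

def vms_to_append(queue_length: int, threshold: int, requests_per_vm: int) -> int:
    """Return how many extra VMs are needed to absorb the queue overflow."""
    if queue_length <= threshold:
        return 0  # queue is within bounds: no scaling action (NOP)
    excess = queue_length - threshold
    # one VM per `requests_per_vm` excess requests, rounded up
    return -(-excess // requests_per_vm)

print(vms_to_append(1500, 1000, 200))  # 500 excess requests / 200 per VM -> 3
```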

Steps Followed in the Scheme
Collect request records as the history data.
Analyze the history data hourly and predict the number of requests for the next time-unit.
Discover the optimal number of VMs by using the optimization model.
Scale the resources (VMs) Up, Down, or NOP on a public cloud platform.

Prediction Model Definition
They model the resource demand based on the predicted web request distribution at the VM level.
To predict the number of web requests, consider {X(t); t ∈ T}, where T is an index of the time fragments and X(t) is a random variable.
The current and past observed values: (X(t − k), X(t − k + 1), ..., X(t − 1), X(t))
Future value: X(t + p)
where k is the length of the history data used for prediction and p is the predictive time fragment.

Prediction Model: Key Feature Identification
Considering that most online web requests have seasonal or periodic behavior to some extent (e.g., a mail server), they design a novel linear regression approach for prediction, using an auto-correlation function to identify the key features.
A linear model relates the number of requests in consecutive time frames.
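The idea above can be sketched in a few lines: use the autocorrelation function to find strongly periodic lags in the request series, then fit a linear model of X(t) on those lagged values. The synthetic daily-periodic series and the choice of three lags are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of series x at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def top_lags(x, max_lag, n=3):
    """Select the n lags with the highest autocorrelation (the 'key features')."""
    scores = {lag: autocorr(x, lag) for lag in range(1, max_lag + 1)}
    return sorted(scores, key=scores.get, reverse=True)[:n]

def fit_predict(history, lags):
    """Least-squares fit of X(t) on X(t - lag) for the chosen lags; predict the next value."""
    start = max(lags)
    rows = [[history[t - lag] for lag in lags] + [1.0] for t in range(start, len(history))]
    y = np.asarray(history[start:], dtype=float)
    coef, *_ = np.linalg.lstsq(np.asarray(rows), y, rcond=None)
    features = [history[len(history) - lag] for lag in lags] + [1.0]
    return float(np.dot(coef, features))

# two weeks of hourly request counts with a daily (24-hour) pattern
hours = np.arange(24 * 14)
demand = 1000 + 300 * np.sin(2 * np.pi * hours / 24)
lags = top_lags(demand, max_lag=48)  # lag 24 should rank among the top lags
print(lags, fit_predict(demand, lags))
```

For a series with a clean daily pattern, the 24-hour lag is picked up as a key feature and the fitted model reproduces the seasonal shape.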

Modelling the Relationship Between Cost and Latency
Cost (C) depends on the number m of VMs, e.g., C = f(m).
Latency (L) consists of execution time (Ts) and waiting time before execution (Tq).
The arrivals of requests to be processed on VMs obey a Poisson distribution with rate λ.
Request executions on VMs are also modeled as a Poisson process with rate μ.

Markov Chain
Due to the allocation of multiple servers (or VMs) in the scheme, the system is modeled as a multi-server (M/M/m) queue:
Arrivals are Poisson with rate λ = E(X(t))
The per-VM service rate is μ = 1/E(Ts)
State i is the number of requests in the system

Request Queue Size and Latency
Queue size:
From the steady-state probabilities and the number of arriving requests, the expected queue length Lq can be calculated.
Expected waiting time:
By Little's law, Lq = λ · Tq, where Tq = E(tq) is the expected length of the waiting time in queue tq.
So Tq = Lq / λ
Latency:
L(λ, μ, m) = Tq + Ts
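The latency L(λ, μ, m) = Tq + Ts can be evaluated numerically for an M/M/m queue. The slides do not show a closed form for Tq, so as a hedged sketch we use the standard Erlang C formula for the probability of waiting and then apply the waiting-time relation for M/M/m; the example rates are made up.

```python
import math

def erlang_c(m: int, lam: float, mu: float) -> float:
    """Probability that an arriving request has to wait (standard M/M/m result)."""
    a = lam / mu                      # offered load
    rho = a / m                       # utilisation, must be < 1 for stability
    assert rho < 1, "system is unstable: need more VMs"
    summation = sum(a**k / math.factorial(k) for k in range(m))
    top = a**m / (math.factorial(m) * (1 - rho))
    return top / (summation + top)

def latency(m: int, lam: float, mu: float) -> float:
    """Expected latency L = Tq + Ts for an M/M/m system."""
    pw = erlang_c(m, lam, mu)
    tq = pw / (m * mu - lam)          # expected waiting time in queue
    ts = 1.0 / mu                     # expected execution time
    return tq + ts

# Example: 100 req/s arriving, each VM serving 12 req/s
print(latency(10, 100.0, 12.0))
```

As expected, latency falls toward the bare execution time 1/μ as more VMs are added.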

Optimization Model
Cost-latency trade-off:
If the number of VMs is reduced, the cost falls but the waiting time increases.
If the number of VMs is increased, the cost rises.
To solve this problem, a cost-latency trade-off optimization objective function is defined,
where the weight α ∈ [0, 1] reflects the importance ratio of cost and latency.

Optimization Model (Cont.)
Because the number of VMs and the latency have different scales, the latency is normalized as
G = L / T
where T is the latency threshold defined in the SLAs.
The number of VMs is normalized in a similar way.
The trade-off function then combines the two normalized terms.

Algorithm for Computing VMs
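The original slide showed the algorithm as a figure. A hedged sketch of such a search: try each feasible number of VMs m, compute the M/M/m latency, and minimize a weighted sum of the normalized VM count and normalized latency subject to the SLA threshold T. The weight `alpha`, the cap `m_max`, and normalizing the VM count by `m_max` are our assumptions where the slides leave the formula implicit.

```python
import math

def mmm_latency(m, lam, mu):
    """Expected M/M/m latency Tq + Ts; infinite when the system is unstable."""
    a = lam / mu
    if a >= m:
        return math.inf
    s = sum(a**k / math.factorial(k) for k in range(m))
    top = a**m / (math.factorial(m) * (1 - a / m))
    pw = top / (s + top)              # Erlang C: probability of waiting
    return pw / (m * mu - lam) + 1.0 / mu

def optimal_vms(lam, mu, T, m_max, alpha=0.5):
    """Return the m in [1, m_max] minimizing the normalized cost-latency trade-off."""
    best_m, best_f = None, math.inf
    for m in range(1, m_max + 1):
        L = mmm_latency(m, lam, mu)
        if L > T:                     # violates the SLA on latency
            continue
        f = alpha * (m / m_max) + (1 - alpha) * (L / T)
        if f < best_f:
            best_m, best_f = m, f
    return best_m

print(optimal_vms(lam=100.0, mu=12.0, T=0.5, m_max=20))
```

Any m the search returns is guaranteed to keep the expected latency under the SLA threshold.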

Experiment
Real-world datasets:
1. AOL search log dataset
2. Sogou search log dataset
3. A real-world dataset collected by the UTS (University of Technology, Sydney) library
Time-unit of re-allocation: hourly.

Experiment Steps
1. Investigate how the seasonal characteristics affect the selection of features for prediction modeling
2. Evaluate the prediction model on the three datasets
3. Visualize the performance of the prediction model
4. Evaluate the allocation performance for a given number of requests
5. Compare the scheme with other approaches

Features Selection Evaluation
Investigate how the seasonal characteristics affect the selection of features for prediction modeling.
Represent the number of requests in each hour as a vector <v1, ..., vi, ..., v60>, where vi is the volume within one minute.
Apply the symmetrized Kullback-Leibler (SKL) divergence to measure the difference between two probability distributions.
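The comparison above can be sketched as follows: normalize each hour's length-60 vector of per-minute volumes into a probability distribution and compute SKL(P, Q) = (KL(P||Q) + KL(Q||P)) / 2. The smoothing constant `eps` is an illustrative choice to avoid log(0), and the sample hours are synthetic.

```python
import math

def skl_divergence(p_counts, q_counts, eps=1e-9):
    """Symmetrized KL divergence between two count vectors (normalized inside)."""
    p = [c + eps for c in p_counts]
    q = [c + eps for c in q_counts]
    ps, qs = sum(p), sum(q)
    p = [x / ps for x in p]
    q = [x / qs for x in q]
    kl_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    return (kl_pq + kl_qp) / 2

# two hours with similar per-minute shapes => small divergence
hour_a = [50 + (i % 10) for i in range(60)]
hour_b = [52 + (i % 10) for i in range(60)]
print(skl_divergence(hour_a, hour_b))
```

Similar hourly shapes give a divergence near zero, which is exactly the signal the paper uses to argue the traces are seasonal.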

Cont..
Conclusion:
The SKL divergence varies from 0 to 1.
All the SKL divergences in Table I are small, which demonstrates that the three datasets have highly seasonal characteristics and that the number of requests can be predicted using the history data.

Cont..
Select the key features for the linear regression model.

Experiment Steps
1. Investigate how the seasonal characteristics affect the selection of features for prediction modeling
2. Evaluate the prediction model on the three datasets
3. Visualize the performance of the prediction model
4. Evaluate the allocation performance for a given number of requests
5. Compare the scheme with other approaches

Evaluation Methods
Used 10-fold cross-validation as the evaluation method, with five error measures:
Root Mean Squared Error (RMSE)
Relative Squared Error (RSE)
Mean Absolute Error (MAE)
Relative Absolute Error (RAE)
Coefficient of determination (R²)
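The five error measures listed above, written out for a single fold. RSE and RAE are measured relative to a mean-only baseline, and R² = 1 − RSE; the sample numbers are made up.

```python
import math

def metrics(actual, predicted):
    """RMSE, RSE, MAE, RAE and R^2 for one set of predictions."""
    n = len(actual)
    mean = sum(actual) / n
    se = sum((a - p) ** 2 for a, p in zip(actual, predicted))       # squared error
    ae = sum(abs(a - p) for a, p in zip(actual, predicted))         # absolute error
    se_base = sum((a - mean) ** 2 for a in actual)                  # baseline squared error
    ae_base = sum(abs(a - mean) for a in actual)                    # baseline absolute error
    return {
        "RMSE": math.sqrt(se / n),
        "RSE": se / se_base,
        "MAE": ae / n,
        "RAE": ae / ae_base,
        "R2": 1 - se / se_base,
    }

result = metrics([100, 120, 140, 160], [105, 118, 138, 155])
print(result)  # MAE is 3.5, R2 is 0.971
```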

Experiment Steps
1. Investigate how the seasonal characteristics affect the selection of features for prediction modeling
2. Evaluate the prediction model on the three datasets
3. Visualize the performance of the prediction model
4. Evaluate the allocation performance for a given number of requests
5. Compare the scheme with other approaches

Prediction Model Evaluation
U = (1 + padding) × prediction
To select a good padding value, they measure the relationship between the padding value and the confidence interval.

Conclusion:
The scheme achieves good prediction of both the number of requests and the resource demands, and the padding value can be dynamically adjusted well in each time interval.
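One way the padding choice above could work in practice: size the padding from the spread of past relative prediction errors so that U = (1 + padding) × prediction covers most observed demand. Using the mean plus two standard deviations of the relative error is our illustrative assumption, not the paper's exact procedure.

```python
import statistics

def choose_padding(actuals, predictions, k=2.0):
    """Pick padding so the padded prediction covers ~mean + k*stdev of past relative error."""
    rel_errors = [(a - p) / p for a, p in zip(actuals, predictions)]
    return max(0.0, statistics.mean(rel_errors) + k * statistics.stdev(rel_errors))

# toy history of hourly actual vs predicted request counts (made-up numbers)
actuals = [100, 110, 95, 105, 120]
predictions = [98, 105, 100, 100, 110]
pad = choose_padding(actuals, predictions)
upper = [(1 + pad) * p for p in predictions]
print(pad, all(u >= a for u, a in zip(upper, actuals)))
```

Recomputing the padding each time interval from a sliding window of recent errors would give the dynamic adjustment the slide describes.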

Cont..

Experiment Steps
1. Investigate how the seasonal characteristics affect the selection of features for prediction modeling
2. Evaluate the prediction model on the three datasets
3. Visualize the performance of the prediction model
4. Evaluate the allocation performance for a given number of requests
5. Compare the scheme with other approaches

Allocation Evaluation

Cont..

Experiment Steps
1. Investigate how the seasonal characteristics affect the selection of features for prediction modeling
2. Evaluate the prediction model on the three datasets
3. Visualize the performance of the prediction model
4. Evaluate the allocation performance for a given number of requests
5. Compare the scheme with other approaches

Performance Evaluation
for a Web Application

CONCLUSION & FUTURE WORKS
An auto-scaling scheme via a cost-latency trade-off
Predicted optimal numbers of VMs via machine learning and time-series data
Future works:
More general queueing models
Other types of VMs (e.g., multi-tenant shared)

Problem Identified
The proposed algorithm adds virtual machines only if the SLA is not violated. But in the case of a sudden traffic burst, additional resources may be needed beyond the limit given in the SLA. So a better trade-off is needed between SLA violation and poor latency under burst traffic.

Thank You
