
2019 IEEE International Conference on Cloud Engineering (IC2E)

Predicting the End-to-End Tail Latency of Containerized Microservices in the Cloud

Joy Rahman Palden Lama


Dept. of Computer Science Dept. of Computer Science
University of Texas at San Antonio University of Texas at San Antonio
San Antonio, Texas-78249 San Antonio, Texas-78249
Email: [email protected] Email: [email protected]

Abstract—Large-scale web services are increasingly adopting cloud-native principles of application design to better utilize the advantages of cloud computing. This involves building an application using many loosely coupled service-specific components (microservices) that communicate via lightweight APIs, and utilizing containerization technologies to deploy, update, and scale these microservices quickly and independently. However, managing the end-to-end tail latency of requests flowing through the microservices is challenging in the absence of accurate performance models that can capture the complex interplay of microservice workflows with cloud-induced performance variability and inter-service performance dependencies. In this paper, we present performance characterization and modeling of containerized microservices in the cloud. Our modeling approach aims at enabling cloud platforms to combine resource usage metrics collected from multiple layers of the cloud environment, and apply machine learning techniques to predict the end-to-end tail latency of microservice workflows. We implemented and evaluated our modeling approach on NSF Cloud's Chameleon testbed using KVM for virtualization, Docker Engine for containerization and Kubernetes for container orchestration. Experimental results with an open-source microservices benchmark, Sock Shop, show that our modeling approach achieves high prediction accuracy even in the presence of multi-tenant performance interference.

Keywords-microservices; containers; cloud computing; performance modeling;

I. INTRODUCTION

Large-scale web services (e.g., Netflix, Microsoft Bing, Uber, Spotify) are increasingly adopting cloud-native principles and design patterns such as microservices and containers to better utilize the advantages of the cloud computing delivery model, which include greater agility in software deployment, automated scalability, and portability across cloud environments [24, 30]. In a microservices architecture, an application is built using a combination of loosely coupled and service-specific software containers that communicate using APIs, instead of using a single, tightly coupled monolith of code. This development methodology, combined with recent advancements in containerization technologies, makes an application easier to enhance, maintain, and scale. However, it is challenging to manage the end-to-end tail latency (e.g., 95th percentile latency) of requests flowing through the microservice architecture, which could result in poor user experiences and loss of revenue [32, 46].

Containerized microservices deployed in a public cloud are scaled automatically based on user-specified static thresholds for per-microservice resource utilization [1, 2, 6]. However, this places a significant burden on application owners who are concerned about the end-to-end tail latency (e.g., 95th percentile latency) [28]. Setting appropriate resource utilization thresholds on various microservices to meet the end-to-end tail latency in such a complex distributed system is difficult and error-prone in the absence of accurate performance models.

There are many challenges in modeling the end-to-end tail latency of containerized microservices. First, a microservice architecture is characterized by complex request execution paths spanning many microservices, forming a directed acyclic graph (DAG) with complex interactions across the service topology [28, 29, 39]. Second, the tail latency is highly sensitive to any variance in the system, which could be related to the application, OS or hardware [32]. Third, in a cloud environment where microservices run as containers hosted on a cluster of virtual machines (VMs), application performance can often degrade in unpredictable ways [18, 21, 24, 44].

Traditionally, analytical models based on queuing theory have been widely applied for performance prediction and resource provisioning of monolithic (3-tier) applications [40, 41]. However, such techniques can become intractable when dealing with the scale and complexity of microservice architecture, and the presence of cloud-induced performance variability. Furthermore, analytical modeling is a white-box approach that often requires intrusive instrumentation of application code for workload profiling and expert knowledge about the application structure and data flow between various components [25]. Such an approach can be impractical from a cloud provider's perspective, since customer applications appear with limited visibility to the cloud providers.

There are black-box modeling approaches that relate observable resource usage metrics [36, 42] or resource allocation metrics [43] with the performance of monolithic applications hosted in virtualized computing environments. More recent studies [19, 26] focused on runtime trace analysis tools and simulation based approaches to analyze the performance of microservice-based applications. However, none of these works study the impact of cloud-induced performance interference on microservice-based applications, and the resulting inaccuracies in performance modeling. In this paper, we observe that the end-to-end tail latency of microservice workflows is highly sensitive to performance interference in the cloud. Furthermore, we show that the tail latency of microservice workflows can be accurately predicted even in the presence of performance interference, with the help of machine learning and multi-layer data collected from the cloud environment.

Figure 1: Monolithic vs microservice architecture. (a) Monolith. (b) Microservices.

978-1-7281-0218-4/19/$31.00 ©2019 IEEE

DOI 10.1109/IC2E.2019.00034
In particular, we make the following contributions.

1. We quantify the impact of the resource utilization and performance interference experienced by various microservices on the end-to-end tail latency of various request workflows in a web application. Since CPU is a major bottleneck for most web applications, we use CPU utilization as a resource metric in this paper, and focus on the performance interference caused by contention in shared processor resources such as the LLC (last level cache) and memory bandwidth. However, our approach can be easily extended to include other resource metrics.

2. We propose a modeling approach that combines multi-layer data, including container-level metrics, VM-level metrics and a hardware performance counter based metric, CPI (clock cycles per instruction), to accurately predict end-to-end tail latency in the presence of performance interference in the cloud.

3. We apply several machine learning based modeling techniques, and compare their accuracy in predicting the end-to-end performance of containerized microservices.

4. We demonstrate the feasibility of utilizing the proposed performance models in making efficient resource scaling decisions. For this purpose, we formulate resource scaling of microservices as a constrained nonlinear optimization problem, and solve it to calculate appropriate resource utilization thresholds on various microservices, so that they can be scaled efficiently to meet a performance SLO (service level objective) target.

5. We implement and evaluate the proposed techniques using a representative microservices benchmark, Sock Shop [14], on the NSF Chameleon cloud [3] testbed. The Sock Shop benchmark is containerized with Docker [35] and deployed in a cluster of VMs managed by Kubernetes [8], an open-source container orchestration engine.

The rest of this paper is organized as follows. Section II provides the background on microservice architecture. Related work is discussed in Section III. Section IV describes the testbed setup and benchmarks used. Section V presents the performance characterization of containerized microservices. Section VI presents the performance modeling approach. Section VII discusses resource scaling optimization based on the proposed models. Section VIII concludes the paper.

II. BACKGROUND ON MICROSERVICE ARCHITECTURE

Microservice architecture aims to overcome various limitations of the traditional monolithic architecture for software development [10, 22]. Figure 1 illustrates the difference between a multi-tier monolithic architecture and a microservice architecture in the context of an e-commerce application that takes orders from customers, verifies the product catalogue, processes payments and ships orders. In a monolithic architecture, the web application is divided into technology-specific tiers such as a frontend web tier for serving web contents, an application tier composed of numerous tightly coupled components for implementing the entire business logic, and a shared database tier for data persistence. A monolithic application is often simple to design. However, in order to update one component, the entire application has to be redeployed. Furthermore, each component within a tier cannot be scaled independently based on its resource requirements. On the other hand, a microservice architecture splits the application into many smaller self-contained components, called microservices, that serve specific business functions and communicate with each other via lightweight language-agnostic APIs. Each microservice has its own code and database, without any shared component with other services. This facilitates flexibility in application deployment and enhanced scalability, since each component of an application can be updated and scaled independently. In essence, microservice architecture is a variant of Service-Oriented Architecture (SOA) that emphasizes fine-grained, lightweight services.

III. RELATED WORK

Performance modeling and dynamic resource provisioning of Internet applications has been an important research topic for many years [31, 36, 37, 40, 41, 43, 45]. There are traditional analytical modeling approaches based on queueing theory [40, 41], and hybrid approaches that combine queueing theory with machine learning techniques [38, 45].
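Contribution 4 above formulates threshold selection as a constrained optimization. As a toy illustration of the shape of that problem (not the paper's actual formulation — the latency predictor and pod-cost functions below are hypothetical stand-ins), one can search per-microservice utilization thresholds that satisfy a latency SLO at minimum provisioning cost:

```python
import itertools

def pick_thresholds(predict_p95, pods_needed, candidates, slo_ms):
    """Exhaustive search over per-microservice CPU-utilization thresholds.

    predict_p95: callable(dict name -> threshold) -> predicted p95 latency (ms)
    pods_needed: callable(dict name -> threshold) -> total pods to provision
    candidates:  dict name -> list of candidate thresholds (%)
    Returns the SLO-feasible threshold combination needing the fewest pods.
    """
    names = sorted(candidates)
    best, best_cost = None, float("inf")
    for combo in itertools.product(*(candidates[n] for n in names)):
        thresholds = dict(zip(names, combo))
        if predict_p95(thresholds) > slo_ms:
            continue  # violates the tail-latency SLO
        cost = pods_needed(thresholds)
        if cost < best_cost:
            best, best_cost = thresholds, cost
    return best
```

An exhaustive search is only tractable for a handful of services and candidate thresholds; the paper instead solves a constrained nonlinear optimization (Section VII).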

Urgaonkar et al. [41] designed a dynamic server provisioning technique for multi-tier server clusters. The technique decomposes the per-tier average delay targets to be certain percentages of the end-to-end delay constraint. Singh et al. [38] applied a k-means clustering algorithm and a G/G/1 queuing model to predict the server capacity for a given workload mix. Although these approaches were effective for multi-tier monolithic applications, they can become intractable when dealing with a complex microservice architecture in a cloud environment. The complexity introduced by having many moving parts with complex interactions, and the presence of cloud-induced performance variability [21, 44], pose significant challenges in modeling the system behavior, identifying critical resource bottlenecks and managing them effectively.

Blackbox modeling techniques have been widely adopted in cluster resource allocation and management [31, 36, 42, 43]. Nguyen et al. [36] applied online profiling and polynomial curve fitting to provide a black-box performance model of an application's SLO violation rate for a given resource pressure. Wajahat et al. [42] presented an application-agnostic, neural network based auto-scaler for minimizing SLA violations of diverse applications. Wang et al. [43] applied fuzzy model predictive control, and Lama et al. [31] proposed self-adaptive neural fuzzy control techniques, for dynamic resource management of monolithic cloud applications. However, these studies do not address the modeling inaccuracies caused by performance interference in the cloud, and the complexity introduced by microservice architecture.

A few studies have focused on managing the end-to-end performance objectives of large-scale web services and analyzing their complex performance behavior [27, 28, 39]. Guo et al. [27] highlighted how the complex interactions between various components of large-scale web services not only lead to sharp degradation in performance, but also trigger cascading behaviors that result in wide-spread application outages. Jalaparti et al. [28] presented Kwiken, a framework that decomposes the problem of minimizing latency over a general processing DAG in a large web service into a manageable optimization over individual stages. Suresh et al. [28] presented Wisp, a resource management framework that applies a combination of techniques, including estimating local workload models based on measurements of immediate neighborhoods, distributed rate control and metadata propagation, to achieve end-to-end throughput and latency objectives in Service-Oriented architectures. These approaches are complementary to our work, as they focus on solutions that need to be adopted at the application layer of the cloud computing stack, and require expert knowledge about the application. On the other hand, our performance modeling approach does not require intrusive instrumentation of application code for profiling or expert knowledge about the data flow between various components.

Figure 2: Workflow DAGs.

IV. PLATFORM

A. Experimental Testbed

We set up a cloud prototype testbed which closely resembles real-world cloud platforms such as Google Kubernetes Engine [6] and Amazon Elastic Container Service [2]. Our testbed consists of a physical layer of bare metal servers, a VM layer built on top of the physical layer, and a container layer built on top of the VM layer.

Physical Servers. We used four bare metal servers leased on the NSF Chameleon Cloud [3] testbed. Each server was equipped with dual-socket Intel Xeon E5-2670 v3 Haswell processors (each with 12 cores @ 2.3 GHz) and 128 GiB of RAM. Each server was connected to a Dell switch at 10 Gbps, with 40 Gbps of bandwidth to the core network from each switch.

VMs. We set up 16 VMs on top of the bare metal servers by using KVM for server virtualization. Each VM was configured with four vCPUs, 8 GB RAM and 30 GB of disk space.

Containers. We set up a 16-VM Kubernetes cluster for container orchestration and management. Docker (version 18.03.1-ce) was used as the container runtime engine on each VM. Kubernetes pod networking was set up using the Calico CNI (Container Network Interface) network plugin [11]. We use the terms pod and container interchangeably in this paper, since we use a one-container-per-pod model, which is the most common Kubernetes use case.

B. Workloads

For performance characterization, we used Sock Shop [14], an open-source microservices benchmark that is particularly tailored for container platforms. Sock Shop emulates an e-commerce website as shown in Figure 1, with the specific aim of aiding the demonstration and testing of existing microservice and cloud-native technologies. A recent study suggests that Sock Shop closely reflects how typical microservices applications are currently being developed and delivered into production, as reported by practitioners and industry experts [17]. We used the Locust tool [9] to generate user traffic for the Sock Shop benchmark. The workload traffic is composed of a number of concurrent clients that generate HTTP-based REST API calls to Sock Shop. To create a controlled

Figure 3: Impact of CPU utilization on the tail latency of various workflows. (a) CPU utilization of orders microservice. (b) CPU utilization of cart microservice. (c) CPU utilization of frontend microservice. (Each panel plots the 95th percentile latency (ms) of the orders and cart workflows against CPU utilization (%).)
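The 95th percentile latencies reported throughout the paper can be computed from raw per-request latency samples with a nearest-rank percentile. A minimal stdlib sketch (not the paper's measurement code, which relies on the load generator's reporting):

```python
import math

def tail_latency(samples, pct=95):
    """Nearest-rank percentile of a list of latency samples (e.g. in ms)."""
    if not samples:
        raise ValueError("no samples")
    ranked = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ranked)))  # 1-based rank
    return ranked[rank - 1]
```

For 100 samples, the 95th percentile under this definition is the 95th smallest value; interpolating definitions (e.g. the default in numerical libraries) can differ slightly.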

Figure 4: Parallel coordinates plot showing the impact of performance interference on the multivariate relationship between CPU utilization and end-to-end tail latency of orders workflow. (a) without interference. (b) with interference on cart. (c) with interference on frontend. (Axes: CPU utilization (%) of the cart, frontend, order and user microservices, and the 95th percentile orders workflow latency (ms).)
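Each line in a parallel coordinates plot like this corresponds to one experiment run: a vector of per-microservice CPU utilizations plus the measured end-to-end tail latency. A sketch of how such multivariate samples can be flattened into feature rows for plotting or model training (field names are illustrative, not the paper's schema):

```python
def make_sample(num_clients, pod_cpu, vm_metrics, p95_latency_ms):
    """Flatten one experiment run into a (feature_row, target) pair.

    pod_cpu:    dict mapping microservice name -> average pod CPU utilization (%)
    vm_metrics: dict mapping VM name -> CPU utilization (%) or CPI
    """
    row = {"clients": num_clients}
    # Sort keys so every run yields features in a stable column order.
    row.update({"pod_cpu:" + name: util for name, util in sorted(pod_cpu.items())})
    row.update({"vm:" + name: value for name, value in sorted(vm_metrics.items())})
    return row, p95_latency_ms
```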

interference workload for our experiments, we used the STREAM memory bandwidth benchmark [33]. STREAM is a synthetic benchmark program geared towards measuring memory bandwidth (in MB/s) corresponding to the computation rate for simple vector kernels. We run the benchmark inside a Docker container and deploy it as a batch job in Kubernetes.

V. PERFORMANCE CHARACTERIZATION

One of the challenges that complicate the performance characterization of a microservice architecture is that request execution workflows can form directed acyclic graph (DAG) structures spanning many microservices. As a result, the end-to-end latency of a workflow is impacted by the performance behavior of multiple microservices in a complex way. We use the term workflow to represent an application-specific group of requests that are associated with a particular API endpoint, which is usually in the form of an HTTP URI. For instance, in the case of the Sock Shop benchmark shown in Figure 1, the HTTP URIs for the workflows involved in processing orders are [base url: / GET / Orders] and [base url: / POST / Orders]. The exact structure of the DAG for request workflows is often unknown, since it depends on multiple factors such as the APIs invoked at each encountered microservice, the supplied arguments, the content of caches, as well as the use of load balancing along the service graph [39]. We used a visualization and monitoring tool, weavescope [16], to map the DAG structure of the orders and cart workflows, as shown in Figure 2.

A. End-to-end Tail Latency

First, we analyze the impact of the CPU utilization of individual microservices on the end-to-end tail latency of two different workflows, viz. orders and cart, in the Sock Shop benchmark. For this purpose, we run experiments with various workload intensities by varying the number of concurrent clients in the workload generator from 5 to 50, while setting the total number of generated requests to 50000. We also vary the number of pods allocated to the cart, orders and frontend microservices to include various combinations of scaling configurations. The CPU utilization of a particular microservice is measured as the average CPU utilization of all the pods allocated to that microservice. As shown in Figures 3 (a), (b) and (c), the end-to-end tail latency of the various workflows has a non-linear relationship with the CPU utilization of individual microservices. We observe that the 95th percentile latency of the two workflows

Figure 5: Parallel coordinates plot showing the impact of performance interference on the multivariate relationship between CPU utilization and end-to-end tail latency of cart workflow. (a) without interference. (b) with interference on cart. (c) with interference on frontend. (Axes: CPU utilization (%) of the cart, frontend, order and user microservices, and the 95th percentile cart workflow latency (ms).)
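The experiment sweeps in this section (5–50 concurrent clients issuing a fixed total number of requests) can be mimicked with a small stdlib load driver. This is a generic sketch, not the Locust setup the paper actually uses, and `request_fn` is a stand-in for a real HTTP call against a workflow endpoint:

```python
import concurrent.futures
import time

def run_load(request_fn, num_clients, total_requests):
    """Issue total_requests calls through num_clients concurrent workers
    and return the observed per-request latencies (in seconds)."""
    def timed_call(_):
        start = time.perf_counter()
        request_fn()  # e.g. an HTTP GET against a workflow endpoint
        return time.perf_counter() - start

    with concurrent.futures.ThreadPoolExecutor(max_workers=num_clients) as pool:
        return list(pool.map(timed_call, range(total_requests)))
```

The returned list can then be reduced to any tail-latency statistic for the run, and the sweep repeated for each client count and scaling configuration.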

increase significantly even at low CPU utilization values of the orders and cart microservices. On the other hand, only high CPU utilization values (>70%) of the frontend microservice have a significant impact on the 95th percentile latency. For example, the tail latency of the orders workflow reaches 200 ms at 49%, 57% and 106% CPU utilization of the orders, cart and frontend microservices respectively.

Figure 6: Impact of performance interference on the end-to-end tail latency of various workflows. (Distribution of the 95th percentile latency (ms) of the orders and cart workflows with interference on cart, with interference on frontend, and without interference.)

B. Impact of Performance Interference

Next, we analyze the impact of performance interference in a cloud environment on the multivariate relationship between the CPU utilization of various microservices and the end-to-end tail latency of particular request workflows. For the sake of clarity, we present our analysis using the top four microservices from the Sock Shop benchmark, ranked according to their CPU utilization values. To induce performance interference, we colocate pods running the memory-intensive STREAM [33] benchmark on the VMs that host the pods running the cart and frontend microservices respectively. The intensity of interference is fixed by running four pods for each interfering workload. The workload intensities and the scaling configurations for the orders, cart and frontend microservices are varied similar to the previous experiment.

As shown in Figures 4 (a), (b) and (c), the end-to-end tail latency of the orders workflow is influenced by the CPU utilization of multiple microservices. However, their multivariate relationship changes significantly depending on the performance interference experienced by various microservices. For example, in the case of no interference, the 95th percentile latency of the orders workflow is greater than 300 ms when the CPU utilization measured at the cart, frontend, orders and user microservices is 67%, 110%, 55% and 41% respectively. However, a similar tail latency of the orders workflow was observed at much lower CPU utilization values when one of the microservices experienced performance interference. Similar results were obtained for the cart workflow, as shown in Figures 5 (a), (b) and (c). This implies that the CPU utilization of microservices measured at the pod level is insufficient for accurately predicting the end-to-end tail latency of various workflows.

Figure 6 shows the distribution of the 95th percentile latency of various workflows under three different scenarios, i.e., with interference on cart, with interference on frontend, and without interference. The variation in the latency observed within each case is mainly due to the varying workload intensities in these experiments. On average, the performance degradation observed by the orders and cart workflows due to interference on the cart microservice is 22% and 79% respectively. On the other hand, the average performance degradation of the two workflows due to interference on the frontend microservice is 6% and 18% respectively. These results demonstrate the complex interplay between performance interference, inter-service performance dependency and the end-to-end tail latency of various workflows.

VI. PERFORMANCE MODELING WITH MACHINE LEARNING

In this section, we present our approach to address the challenges of predicting the end-to-end tail latency of complex workflows in a microservice architecture in the face of diverse performance interference patterns. Our approach

combines the resource usage metrics at the container/pod level with VM-level resource usage and hardware performance counter values to construct machine learning (ML) based performance models for individual workflows. Our modeling approach does not rely on any expert application knowledge. Hence, it can be easily extended to fit the needs of diverse applications.

A. Data Collection

In this paper, we use CPU utilization as a resource metric for the microservices, since CPU is a major resource bottleneck in most web applications. We use docker stats [4] to measure pod-level CPU utilization. To capture the impact of performance interference due to the contention of processor resources, such as the last level cache (LLC) and memory bandwidth, we utilize the CPU utilization and CPI metrics associated with the VMs that host the various microservices as pods. We use the virt-top [15] tool to measure VM-level CPU utilization. CPI is measured on a per-cgroup basis by using the perf event [23] tool, and each cgroup is mapped to a VM. For data collection, we conduct extensive experiments on our cloud prototype testbed by varying the number of concurrent clients, and the performance interference levels experienced by different microservices in the Sock Shop benchmark. We also vary the number of pods allocated to the microservices. For each experiment, we measure the end-to-end tail latency of various workflows as reported by the Locust [9] tool. The collected data is used to train our machine learning based performance models.

B. Machine Learning Models

We build performance models for predicting the end-to-end tail latency of each microservice workflow by applying various machine learning (ML) techniques including Linear Regression (LR), Support Vector Regression (SVR), Decision Tree (DT), Random Forest (RF) and a deep Neural Network (NN) based regression (more specifically, a multi-layer perceptron with multiple hidden layers). The ML models are built and trained by using scikit-learn [12], a machine learning library in Python.

Figure 7: Prediction accuracy of various ML models (LR, SVR, DT, RF, NN) for orders. (a) Mean absolute percentage error. (b) R2 score. (Bars compare the Pod_CPU, Pod_CPU+VM_CPI and Pod_CPU+VM_CPU feature sets.)

Figure 8: Prediction accuracy of various ML models for cart. (a) Mean absolute percentage error. (b) R2 score.

Feature Selection. The input features of our ML models include the number of concurrent clients, pod-level resource metrics and VM-level resource metrics. The pod-level metrics include the average CPU utilization of the load-balanced pods for each microservice. The VM-level metrics include the CPU utilization or the CPI of the VMs that host the pods. To reduce our feature space and avoid potential over-fitting issues, we apply a popular feature selection technique called stability selection [34]. In particular, we use the scikit-learn [12] library's randomized lasso technique, which works by subsampling the training data and computing a Lasso estimate where the penalty of a random subset of coefficients has been scaled. By performing this operation several times, the method assigns high scores to features that are repeatedly selected across randomizations. The features selected for the orders workflow are the number of concurrent clients, the pod-level CPU utilization of the microservices including front-end, orders, users, shipping, payment, cart, users-db, orders-db, cart-db, and the CPU utilization or CPI of the VMs that host these microservices. Similarly, the features selected for the cart workflow are the number of concurrent clients,

   

Figure 9: Cross-validated predictions of tail latency in orders workflow. (a) Linear regression with Pod CPU. (b) Linear regression with Pod CPU and VM CPI. (c) Neural network with Pod CPU. (d) Neural network with Pod CPU and VM CPI. (Each panel plots predicted tail latency (ms) against measured tail latency (ms).)
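The two accuracy metrics used in this evaluation can be stated compactly. A self-contained sketch of MAPE and the R2 score (generic definitions for illustration, not the scikit-learn routines the authors used):

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes nonzero y_true)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Lower MAPE is better; R2 closer to 1 means the predictions account for more of the variance in the measured tail latencies.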

the pod-level CPU utilization of the microservices including front-end, orders, cart, cart-db, and the CPU utilization or CPI of the VMs that host these microservices.

Hyper-parameters. The hyper-parameters of each model are set to the default values provided by scikit-learn. We observe that the prediction accuracy of the deep NN model is highly sensitive to the number of hidden layers and the size (number of neurons) of each hidden layer. Hence, we tuned these parameters through an exhaustive search over various combinations of the input feature space and the targeted workflow for the prediction of end-to-end tail latency. The optimal number of hidden layers for our NN model is three, and the optimal number of neurons in these three hidden layers is summarized in Table I.

Table I: Optimal number of neurons in the three hidden layers of the NN models for the orders and cart workflows.

    Input Feature       orders     cart
    Pod CPU             (6,3,5)    (8,5,6)
    Pod CPU+VM CPU      (4,6,3)    (3,6,8)
    Pod CPU+VM CPI      (9,6,4)    (5,7,5)

C. Prediction Accuracy

In this section, we evaluate the prediction accuracy of the various ML models (LR, SVR, DT, RF, NN) and three modeling approaches. First, the Pod CPU approach includes pod-level CPU utilization metrics in the input feature space. Second, the Pod CPU+VM CPU approach includes both pod-level and VM-level CPU utilization metrics. Third, the Pod CPU+VM CPI approach includes pod-level CPU utilization and VM-level CPI metrics in the input feature space. The models are evaluated with 10-fold cross-validation on the collected dataset. As a result, 90% of the data is used for training and 10% for testing in each of the 10 iterations of cross-validation. We utilize commonly used metrics such as the mean absolute percentage error (MAPE) and the coefficient of determination, R². MAPE is calculated as (1/n) Σⁿᵢ₌₁ |(y − ŷ)/y|, where y and ŷ are the measured and predicted values of the end-to-end tail latency respectively. R² is a statistical measure of how well the regression predictions approximate the real data points. An R² of 1 indicates that the regression predictions perfectly fit the data.

Figures 7 (a) and (b) show that, compared to the Pod CPU based modeling approach, the Pod CPU+VM CPU and Pod CPU+VM CPI approaches achieve a significant improvement in the prediction accuracy of each ML model for the orders workflow. This is because VM-level CPU utilization can capture inter-pod CPU contention within a VM. Furthermore, the VM-level CPI metric can capture the contention for shared processor resources between multiple pods within a VM as well as across VMs. Such inter-VM resource contention may arise when the concerned VMs are colocated in the same physical machine. The improvements in prediction accuracy in terms of MAPE due to the Pod CPU+VM CPU and Pod CPU+VM CPI approaches are up to 36% and 38% respectively. The largest improvement is observed in the case of the NN model. We also observe that the NN model outperforms all other models in prediction accuracy, since a neural network is a universal function approximator. On the other hand, the LR model shows the worst prediction accuracy. This is because a linear regression model cannot capture the non-linearity of tail latency. Overall, we observed similar results in the latency prediction of the cart workflow, as shown in Figure 8.

Figure 9 plots the cross-validated predictions vs. the measured values of the end-to-end tail latency of the orders workflow, in order to graphically illustrate the different R² values for the LR and NN models. Theoretically, if a model could explain 100% of the variance in the observed data, the predicted values would always equal the measured values and, therefore, all the data points would fall on the fitted regression line. The more variance that is accounted for by the regression model, the closer the data points fall to the fitted regression line. The proportion of variance accounted for by the LR model with Pod CPU is the lowest, while the LR model with Pod CPU+VM CPI, the NN model with Pod CPU, and the NN model with Pod CPU+VM CPI account for 66%, 71% and 89% of the variance respectively.

VII. OPTIMIZATION FOR RESOURCE SCALING

Although existing cloud platforms [1, 2, 5, 6] provide mechanisms for auto-scaling microservices, they expect application owners to specify thresholds for various microservice load metrics to enable auto-scaling features. For example, the auto-scaling feature [7] in Kubernetes determines the allocation of containers/pods to a microservice by using the formula:

    desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue)    (1)

If the desiredMetricValue (threshold) is specified as an average CPU utilization of 50% for a particular microservice, and the current average CPU utilization is 100%, then the number of pods allocated to that microservice will be doubled. Furthermore, any scaling is performed only if the ratio of currentMetricValue to desiredMetricValue drops below 0.9 or rises above 1.1 (a 10% tolerance by default).

It is challenging and burdensome for application owners to determine the resource utilization thresholds for various microservices in order to meet the application's end-to-end performance target. Setting inappropriate thresholds may lead to overprovisioning or underprovisioning of resources. We propose that cloud platforms should automatically determine these thresholds based on user-provided performance SLO targets. For this purpose, we study the feasibility of utilizing the proposed performance models in making efficient resource scaling decisions by formulating a constrained nonlinear optimization problem.

A) Problem Formulation. Consider that the performance SLO target, in terms of the end-to-end tail latency, is specified for a workflow. For a given workload condition, we aim to find the highest resource utilization values of the relevant microservices at which the given SLO targets will not be violated. These optimal utilization values can be calculated periodically and set as the thresholds (desiredMetricValue) for making resource scaling decisions. These thresholds help determine which microservices should be scaled, and how many pods should be allocated to each microservice based on Equation 1. This approach aims to avoid resource overprovisioning while providing a performance guarantee to the given workflow.

Table II: Notation used in the Resource Scaling Optimization Problem

    Symbol        Description
    Sj            Set of microservices relevant to workflow j
    SLOj^target   Tail latency target of workflow j
    xi            Average pod-level CPU utilization in microservice i
    x             Vector of the average pod-level CPU utilizations of the various microservices relevant to the target workflow
    rj(x)         Predicted tail latency of workflow j

We formulate the optimization problem as follows:

    max  Σ_{i∈Sj} xi                  (2)
    s.t. rj(x) ≤ SLOj^target          (3)
         x = (xi)_{i∈Sj}              (4)

where the symbol notations are described in Table II. The objective function in Equation 2 aims to maximize the pod-level resource usage, i.e., the sum of the average CPU utilizations of the set of microservices that are relevant to the target workflow. The relevance of a microservice to a workflow can be determined either by analyzing the workflow DAG, or through machine learning based feature selection as described in Section VI-B. Consider that rj(x) is the tail latency predicted by the machine learning model for workflow j. The inequality constraint in Equation 3 ensures that the SLO target of workflow j will not be violated. The optimization problem is nonlinear, since the workflow tail latency rj(x) in the constraint of Equation 3 has a nonlinear relationship with the average CPU utilizations of the various microservices.

In the formulation of the optimization problem, application-layer metrics (e.g., the number of concurrent clients), VM-level CPU utilization, and CPI metrics are not included as variables, although the tail latency prediction rj(x) depends on these metrics as well. Instead, the values of these metrics are fixed according to their observed values at the time of solving the optimization problem, and are treated as constants for that instance of optimization. As a result, the solutions to the optimization problem will only include pod-level CPU utilization values, which can be directly used as thresholds for making resource scaling decisions. This keeps the resource scaling mechanism practical and simple to implement.

B) Solution. We apply a non-linear optimization technique, the trust-region interior point method [13, 20], to solve this problem. This optimization technique provides two main benefits. First, it is efficient for large-scale problems. Second, the gradient of the constraint function, which is required for optimization, can be approximated through finite difference methods in this optimization technique [13]. This property is desirable since the machine learning models for workflow tail latency are black-box functions, whose gradient cannot be directly calculated.

C) Feasibility Study. As a case study, we apply the optimization technique to calculate the desired CPU utilizations (thresholds) for the various relevant microservices, when a workload of 30 concurrent clients is applied to the SockShop benchmark and a performance SLO target of 240 ms is specified for the 95th percentile latency of the orders workflow. For this optimization, we utilize our Neural Network model for the orders workflow, with pod-level CPU utilization, VM-level CPI metrics, and the number of concurrent clients as the input features.
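To make the formulation concrete, the sketch below solves a toy instance of Equations 2-4 by exhaustive grid search. The toy predictor stands in for the learned NN model rj(x), and the grid search stands in for the trust-region interior-point solver used in the paper; all constants and names here are illustrative assumptions, not the paper's implementation.

```python
from itertools import product

def solve_thresholds(predict_latency, slo, n_services, grid=None):
    # Maximize total pod-level CPU utilization (Eq. 2) subject to the
    # predicted tail latency staying within the SLO target (Eq. 3).
    # Exhaustive search over a coarse utilization grid stands in for
    # the trust-region interior-point solver used in the paper.
    if grid is None:
        grid = [round(0.1 * k, 1) for k in range(1, 10)]  # 10%..90%
    best, best_sum = None, -1.0
    for x in product(grid, repeat=n_services):
        if predict_latency(x) <= slo and sum(x) > best_sum:
            best, best_sum = x, sum(x)
    return best

# Hypothetical stand-in for the learned predictor r_j(x): tail latency
# grows nonlinearly with the utilization of each relevant microservice.
def toy_model(x):
    return 80.0 + 120.0 * sum(u ** 3 for u in x)

# SLO target of 240 ms, three relevant microservices.
thresholds = solve_thresholds(toy_model, slo=240.0, n_services=3)
print(thresholds)  # a feasible utilization vector with maximal total
```

The returned utilization vector plays the role of the per-microservice desiredMetricValue thresholds; a real deployment would re-solve periodically as the observed workload metrics (held constant here, as in the paper's formulation) change.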

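The Kubernetes scaling rule of Equation 1, together with the default 10% tolerance band described earlier, can be sketched as follows (a simplified model of the HPA algorithm, not actual Kubernetes source code):

```python
import math

def desired_replicas(current_replicas, current_metric, desired_metric,
                     tolerance=0.1):
    # Equation 1: desiredReplicas =
    #   ceil(currentReplicas * currentMetricValue / desiredMetricValue),
    # skipped while the metric ratio stays inside the +/-10% band.
    ratio = current_metric / desired_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # within tolerance: no scaling action
    return math.ceil(current_replicas * ratio)

# One pod at 100% average CPU against a 50% threshold -> scale to 2 pods.
print(desired_replicas(1, 100.0, 50.0))  # 2
```

With the thresholds produced by the optimization above as desiredMetricValue, this rule yields the per-microservice pod counts directly, which is how Figure 10 (b)'s suggested configuration is derived.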
[Figure 10 (a): bar chart of average CPU utilization (%) per microservice, comparing desired vs. measured values.]

(a) Current vs. desired average CPU utilization of various microservices. Here, one pod is allocated to each microservice.

[Figure 10 (b): 95th percentile latency (ms) for each resource scaling configuration (cart, orders, frontend), with the SLO target and the measured latency marked.]

(b) Tail latency of the orders workflow for various resource scaling configurations. The configuration suggested by the optimization of CPU utilization thresholds is (1,1,2), i.e., one pod for cart, one pod for orders, and two pods for frontend. All other microservices are provisioned with one pod.

Figure 10: Optimization of CPU utilization thresholds for efficient resource scaling with a workload of 30 concurrent clients and an SLO target of 240 ms for the 95th percentile latency of the orders workflow.

Figure 10 (a) compares the current (measured) CPU utilization of the microservices relevant to the orders workflow with their desired CPU utilization values, when only one pod is allocated to each microservice. Based on Equation 1, the optimal resource scaling option is to allocate an additional pod to the frontend microservice. As shown in Figure 10 (b), we validate the optimality of this resource scaling option by comparing the tail latency of the orders workflow for the various possible resource scaling configurations. We observe that the resource scaling configuration suggested by our optimization technique is able to meet the performance SLO target while allocating the minimum total number of pods.

VIII. CONCLUSIONS AND FUTURE WORK

We present the performance characterization and modeling of containerized microservices in the cloud. Our modeling approach utilizes machine learning and multi-layer data collected from the cloud environment to predict the end-to-end tail latency of microservice workflows, even in the presence of cloud-induced performance interference. We also demonstrate the feasibility of utilizing the proposed models in making efficient resource scaling decisions. We envision that our performance modeling and resource scaling optimization approach can enable cloud platforms to automatically scale microservice-based applications based on user-provided performance SLO targets. This will remove from cloud users the burden of determining resource utilization thresholds for numerous microservices, which is prevalent in existing cloud platforms. In future work, we will extend our work to include diverse microservice-based applications with different resource bottlenecks. We will also evaluate the effectiveness of the proposed resource scaling system in the face of dynamic workloads.

ACKNOWLEDGMENT

Results presented in this paper were obtained using the Chameleon testbed supported by the National Science Foundation. The research is partially supported by NSF CREST Grant HRD-1736209. We thank the anonymous reviewers for their many suggestions for improving this paper. In particular we thank our shepherd, Prof. Maarten van Steen.

REFERENCES

[1] Amazon Elastic Container Service. https://aws.amazon.com/ecs/.
[2] Amazon Elastic Container Service for Kubernetes. https://aws.amazon.com/eks/.
[3] Chameleon: A configurable experimental environment for large-scale cloud research. https://www.chameleoncloud.org.
[4] Docker stats. https://docs.docker.com/engine/reference/commandline/stats/.
[5] Google App Engine flexible environment. https://cloud.google.com/appengine/docs/flexible/.
[6] Google Kubernetes Engine. https://cloud.google.com/kubernetes-engine/.
[7] Kubernetes horizontal pod autoscaling. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details.
[8] Kubernetes: Production-grade container orchestration. https://kubernetes.io/.
[9] Locust: An open source load testing tool. https://locust.io.
[10] Microservices: an application revolution powered by the cloud. https://azure.microsoft.com/en-us/blog/microservices-an-application-revolution-powered-by-the-cloud/.
[11] Project Calico. https://www.projectcalico.org/.
[12] Scikit-learn: Machine learning in Python. http://scikit-learn.org/stable/.
[13] SciPy optimization library. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html.
[14] Sock Shop microservices demo application. https://microservices-demo.github.io.
[15] virt-top. https://linux.die.net/man/1/virt-top.

[16] Weave Scope. https://www.weave.works/docs/scope/latest/introducing/.
[17] C. M. Aderaldo, N. C. Mendonça, C. Pahl, and P. Jamshidi. Benchmark requirements for microservices architecture research. In IEEE/ACM 1st International Workshop on Establishing the Community-Wide Infrastructure for Architecture-Based Software Engineering (ECASE), 2017.
[18] A. Balalaie, A. Heydarnoori, and P. Jamshidi. Microservices architecture enables DevOps: Migration to a cloud-native architecture. IEEE Software, 33(3), 2016.
[19] S. Barakat. Monitoring and analysis of microservices performance. Journal of Computer Science and Control Systems, 10:19-22, May 2017.
[20] R. H. Byrd, M. E. Hribar, and J. Nocedal. An interior point algorithm for large-scale nonlinear programming. SIAM J. on Optimization, 9(4):877-900, Apr. 1999.
[21] X. Chen, L. Rupprecht, R. Osman, P. Pietzuch, F. Franciosi, and W. Knottenbelt. CloudScope: Diagnosing and managing performance interference in multi-tenant clouds. In 2015 IEEE 23rd International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2015.
[22] N. Dragoni, S. Giallorenzo, A. L. Lafuente, M. Mazzara, F. Montesi, R. Mustafin, and L. Safina. Microservices: yesterday, today, and tomorrow. In Present and Ulterior Software Engineering, pages 195-216. Springer, 2017.
[23] S. Eranian. perfmon2: the hardware-based performance monitoring interface for Linux. http://perfmon2.sourceforge.net/.
[24] M. Fazio, A. Celesti, R. Ranjan, C. Liu, L. Chen, and M. Villari. Open issues in scheduling microservices in the cloud. IEEE Cloud Computing, 3(5):81-88, 2016.
[25] I. Giannakopoulos, D. Tsoumakos, and N. Koziris. Towards an adaptive, fully automated performance modeling methodology for cloud applications. In IEEE International Conference on Cloud Engineering (IC2E), 2018.
[26] M. Gribaudo, M. Iacono, and D. Manini. Performance evaluation of massively distributed microservices based applications. In European Council for Modelling and Simulation (ECMS), 2017.
[27] Z. Guo, S. McDirmid, M. Yang, L. Zhuang, P. Zhang, Y. Luo, T. Bergan, M. Musuvathi, Z. Zhang, and L. Zhou. Failure recovery: When the cure is worse than the disease. In 14th Workshop on Hot Topics in Operating Systems (HotOS), Santa Ana Pueblo, NM, 2013. USENIX.
[28] V. Jalaparti, P. Bodik, S. Kandula, I. Menache, M. Rybalkin, and C. Yan. Speeding up distributed request-response workflows. In Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM, 2013.
[29] D. Jiang, G. Pierre, and C.-H. Chi. Autonomous resource provisioning for multi-service web applications. In Proceedings of the 19th ACM International Conference on World Wide Web (WWW), 2010.
[30] G. Kakivaya, L. Xun, R. Hasha, S. B. Ahsan, T. Pfleiger, R. Sinha, A. Gupta, M. Tarta, M. Fussell, V. Modi, M. Mohsin, R. Kong, A. Ahuja, O. Platon, A. Wun, M. Snider, C. Daniel, D. Mastrian, Y. Li, A. Rao, V. Kidambi, R. Wang, A. Ram, S. Shivaprakash, R. Nair, A. Warwick, B. S. Narasimman, M. Lin, J. Chen, A. B. Mhatre, P. Subbarayalu, M. Coskun, and I. Gupta. Service Fabric: A distributed platform for building microservices in the cloud. In Proceedings of the Thirteenth EuroSys Conference, 2018.
[31] P. Lama and X. Zhou. Autonomic provisioning with self-adaptive neural fuzzy control for percentile-based delay guarantee. ACM Transactions on Autonomous and Adaptive Systems, 31 pages, under 2nd reviewing after revision, 2011.
[32] J. Li, N. K. Sharma, D. R. K. Ports, and S. D. Gribble. Tales of the tail: Hardware, OS, and application-level sources of tail latency. In Proceedings of the ACM Symposium on Cloud Computing (SoCC), 2014.
[33] J. D. McCalpin. Memory bandwidth and machine balance in current high performance computers. IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, 2(19-25), 1995.
[34] N. Meinshausen and P. Bühlmann. Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4):417-473, Aug. 2010.
[35] D. Merkel. Docker: lightweight Linux containers for consistent development and deployment. Linux Journal, 2014(239):2, 2014.
[36] H. Nguyen, Z. Shen, X. Gu, S. Subbiah, and J. Wilkes. AGILE: Elastic distributed resource scaling for infrastructure-as-a-service. In Proceedings of the 10th International Conference on Autonomic Computing (ICAC), 2013.
[37] J. Rao and C.-Z. Xu. Online capacity identification of multi-tier websites using hardware performance counters. IEEE Trans. on Parallel and Distributed Systems, 2009.
[38] R. Singh, U. Sharma, E. Cecchet, and P. Shenoy. Autonomic mix-aware provisioning for non-stationary data center workloads. In Proc. IEEE Int'l Conf. on Autonomic Computing (ICAC), pages 21-30, 2010.
[39] L. Suresh, P. Bodik, I. Menache, M. Canini, and F. Ciucu. Distributed resource management across process boundaries. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17). ACM Press, 2017.
[40] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, and A. Tantawi. An analytical model for multi-tier internet services and its applications. In Proceedings

of the ACM SIGMETRICS International Conference
on Measurement and Modeling of Computer Systems,
2005.
[41] B. Urgaonkar, P. Shenoy, A. Chandra, P. Goyal, and
T. Wood. Agile dynamic provisioning of multi-tier
internet applications. ACM Trans. Auton. Adapt. Syst.,
3(1), Mar. 2008.
[42] M. Wajahat, A. Gandhi, A. Karve, and A. Kochut.
Using machine learning for black-box autoscaling.
In 2016 Seventh International Green and Sustainable
Computing Conference (IGSC), 2016.
[43] L. Wang, J. Xu, H. A. Duran-Limon, and M. Zhao.
QoS-driven cloud resource management through fuzzy
model predictive control. In IEEE International Con-
ference on Autonomic Computing (ICAC), 2015.
[44] Y. Xu, Z. Musgrave, B. Noble, and M. Bailey. Bobtail:
Avoiding long tails in the cloud. In Presented as part
of the 10th USENIX Symposium on Networked Systems
Design and Implementation (NSDI 13), 2013.
[45] Q. Zhang, L. Cherkasova, and E. Smirni. A regression-
based analytic model for dynamic resource provision-
ing of multi-tier Internet applications. In Proc. IEEE
Int’l Conference on Autonomic Computing (ICAC),
2007.
[46] Y. Zhang, D. Meisner, J. Mars, and L. Tang. Treadmill:
Attributing the source of tail latency through precise
load testing and statistical inference. In ACM/IEEE
43rd Annual International Symposium on Computer
Architecture (ISCA), 2016.
