Prebaking Functions to Warm the Serverless Cold Start

Paulo Silva
Federal University of Campina Grande
Brazil
[email protected]

Daniel Fireman
Federal Institute of Alagoas
Brazil
[email protected]

Thiago Emmanuel Pereira
Federal University of Campina Grande
Brazil
[email protected]

Abstract

Function-as-a-Service (FaaS) platforms promise a simpler programming model for cloud computing, in which developers concentrate on writing their applications. In contrast, platform providers take care of resource management and administration. As FaaS users are billed based on the execution of the functions, platform providers have a natural incentive not to keep idle resources running at the platform’s expense. However, this strategy may lead to the cold start issue, in which the execution of a function is delayed because there is no ready resource to host the execution. Cold starts can take hundreds of milliseconds to seconds and have been a prohibitive and painful disadvantage for some applications. This work describes and evaluates a technique to start functions, which restores snapshots from previously executed function processes. We developed a prototype of this technique based on the CRIU process checkpoint/restore Linux tool. We evaluate this prototype by running experiments that compare its start-up time against the standard Unix process creation/start-up procedure. We analyze the following three functions: i) a "do-nothing" function, ii) an Image Resizer function, and iii) a function that renders Markdown files. The results attained indicate that the technique can improve the start-up time of function replicas by 40% (in the worst case of a "do-nothing" function) and up to 71% for the Image Resizer one. Further analysis indicates that the runtime initialization is a key factor, and we confirmed it by performing a sensitivity analysis based on synthetically generated functions of different code sizes. These experiments demonstrate that it is critical to decide when to create a snapshot of a function. When one creates the snapshots of warm functions, the speed-up achieved by the prebaking technique is even higher: the speed-up increases from 127.45% to 403.96%, for a small, synthetic function; and for a bigger, synthetic function, this ratio increases from 121.07% to 1932.49%.

CCS Concepts: • Computer systems organization → Cloud computing; • General and reference → Performance; Experimentation.

Keywords: cloud, performance evaluation, faas, serverless

ACM Reference Format:
Paulo Silva, Daniel Fireman, and Thiago Emmanuel Pereira. 2020. Prebaking Functions to Warm the Serverless Cold Start. In 21st International Middleware Conference (Middleware ’20), December 7–11, 2020, Delft, Netherlands. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3423211.3425682

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Middleware ’20, December 7–11, 2020, Delft, Netherlands
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-8153-6/20/12...$15.00
https://doi.org/10.1145/3423211.3425682

1 Introduction

Serverless computing is a new cloud computing service offering pay-per-use and a low entry barrier to get applications running in the datacenter. In the Function-as-a-Service (FaaS) incarnation of the serverless model, the critical abstraction is the function. The user applications are composed of function units, usually written in a high-level managed language such as Go, Java, Python, or JavaScript. The FaaS platform is responsible for monitoring events, e.g., HTTP requests, which trigger a function call. To be able to invoke a function, the platform must allocate resources (VMs or containers) and start the function’s runtime environment. Furthermore, as the load imposed on the functions may vary over time, the platform is also responsible for autoscaling the resources committed to run the functions. The billing considers only the time and resources used during the execution of the function.

The bold promise of serverless is attractive because it liberates users from the mundane tasks of the "old" IaaS and PaaS era, including resource provisioning and administration, so that they can focus on writing only the application’s business logic. As an indication of popularity, the serverless model, launched by Amazon as Amazon Lambda, has already been adopted by major cloud providers, e.g., IBM Cloud Functions [5], Google Cloud Functions, Microsoft Azure Functions [27]. In addition to the public service offerings, there are several open-source options available, e.g., OpenFaaS, Kubeless, and Fission [17].

Despite its recent popularity, the serverless vision needs to overcome some challenges. One of these challenges is


related to the long and unpredictable delays observed when the platform needs to start new function replicas. The so-called cold-start happens not only when a new version of the function runs for the first time but also whenever the FaaS platform policy decides to scale the function up to address a demand growth [9]. This start-up delay includes: i) the time spent provisioning and starting the resources to run the functions (typically, VMs or containers); ii) the initialisation of the function runtime environment (e.g., the JVM, the Chrome V8 engine or the Python interpreter); and iii) the execution of application-specific bootstrap, which includes loading and compiling libraries.

Evaluating and improving the efficiency of serverless infrastructure is an active area of research, in particular regarding resource provisioning and its impact on cold-starts. A common approach is to avoid delays by being conservative when provisioning functions [14]. On the one hand, by maintaining an idle pool of function instances, the platform addresses surges in demand with no performance penalty. On the other hand, as the platform provider does not charge for idle function instances, this strategy increases the platform’s operational cost. Other approaches to the cold-start performance problem include the usage of specialized sandboxing mechanisms like unikernels [2, 8], lightweight containers [19, 25] and microVMs [1].

In this work, we focus on reducing the impact of the runtime environment setup and loading. The prebaking technique replaces the standard fork-exec procedure with a mechanism that restores snapshots of previously created function processes. To evaluate the prebaking technique, we developed a prototype using the CRIU checkpoint/restore tool¹ available for the Linux Kernel and analyzed this prototype experimentally.

¹ https://criu.org/

Our experiments evaluate the start-up delays of real and synthetic functions. We compared the cold-start when using the prebaking mechanism and the usual option based on creating new processes each time a function is started. The evaluated functions include: i) a NOOP function that does nothing, ii) an Image Resizer, and iii) a renderer of Markdown files.

The results indicate that the checkpoint/restore technique is effective: in the case with the least improvement, the prebaking technique decreases the start-up delay by 40%, for the elementary function that does nothing other than returning an ack for the request. We also observe that the speed-up achieved by the prototype increases as a function grows more complex, owing to the increased amount of code (e.g., number of loaded classes). For a function that renders a file in the Markdown format to HTML, the start-up time is reduced from 100 ms to 53 ms. For the Image Resizer, the improvement is even more significant: the start-up delay is decreased from 310 ms to 87 ms, i.e., a speed-up of 71%.

The results attained in these experiments indicate that the runtime initialization (which in Java includes lazy code compilation) is key to the cold start delay. To understand the relationship between the code compilation and the cold start delay, we extend our experiments to consider synthetically generated functions, which vary in the code size. These results helped us to discover that it is critical to decide at which point of the function execution lifetime the snapshot should be generated. The speed-ups reported earlier were achieved using checkpoints generated just after the function was ready to process requests. The results improve further when one creates the checkpoints after the functions have received at least one request, which forces the Java runtime to compile and optimize the code. In this case, the speed-ups can be even higher: the ratio between the start-up time using the standard function creation mechanism and the prebaking technique is increased from 127.45% to 403.96% for a small, synthetic function; for a bigger, synthetic function, this ratio increases from 121.07% to 1932.49%.

In summary, this paper has the following contributions:
• Dives into the function start-up to investigate causes of delay, including within the runtime environment;
• Proposes the usage of checkpoint/restore in the context of FaaS, the so-called prebaking technique;
• Creates a prototype of the prebaking technique using CRIU, a checkpoint mechanism available for the Linux kernel;
• Evaluates the prebaking prototype, comparing it with the state of the practice.

The remainder of the paper is organized as follows. In Section 2, we overview the design of FaaS platforms and its relation to the cold-start issue. In Section 3, we describe the design of our cloning technique as well as the prototype implementation using the CRIU tool. In Section 4, we describe our experimental design and the results attained in the evaluation of our technique. This evaluation compares the proposed technique against the state-of-the-practice of function start-up. In Section 5, we show the feasibility of our technique by reporting how we managed to integrate our technique with an existing serverless platform. In Section 6, we overview the literature on function-as-a-service performance improvements, in particular, related to the function start-up problem. Finally, in Section 7, we discuss the results we obtained and possible limitations and improvements.

2 Background

In this Section, we overview the typical design of FaaS platforms. In addition to providing the background to this research, this overview highlights the issue of cold-start of such platforms.

As introduced earlier, the FaaS model has two key aspects: 1) payment is based only on time and resources used during the execution of functions; and, 2) users are liberated


from operation and management of computing resources. Although all the major FaaS providers share the above basic principles, the design of their platforms is very diverse; maybe a sign of a still-nascent field. Nevertheless, there are already some common, emerging, architectural patterns, as identified by the SPEC Research Group on Cloud [7]. We based our background overview on the SPEC-RG reference architecture.

The SPEC-RG reference architecture is organized in three layers: the Resource Orchestration, the Function Management, and Workflow Management layers. We focus the remainder of this discussion on the first two layers since they are more related to the cold-start issue considered in this work.

The Resource Orchestration layer is responsible for the management of computing resources, e.g., containers and VMs, employed to support the executions of functions. In its turn, the Function Management layer, built upon the previous one, is responsible for deploying the function replicas, in addition to executing the functions and to autoscaling the function replicas, when needed.

Figure 1 illustrates the interaction between some of the components of the reference architecture to handle the execution of a function when there is no function replica already deployed (the cold-start case). The Figure shows, for the Resource Orchestration layer, the Resource Manager component. For the Function Management layer, it shows the Function Router, Function Deployer, Function Registry, and Function Replica components.

The Function Router dispatches new requests or events to the correct function replicas (or queues the requests and events while the replicas are still not available to process them). The Function Registry is a repository for the metadata and binaries of the functions available in the platform. The Function Builder transforms the function representations, kept by the Function Registry, into a deployable form (this process might include compiling, handling dependencies, and other building activities). The Function Deployer drives the actual deploy mechanisms, implemented by the Resource Orchestration layer, to deploy new function replicas into computing resources. The Function Deployer component is responsible for deciding how many function replicas should be deployed and which kind of resources should be used to deploy the functions. The Resource Manager, based on the information gathered by agents running on the infrastructure nodes, ensures that the state of the computing cluster is always in the desired state.

The function building process starts when the Function Builder component receives the function source code and transforms it into a deployable artifact. After the building phase, the deployable function is stored into the Function Registry (from where later it can be downloaded and used to create a Function Replica [7]).

The execution flow shown in Figure 1 starts when the Function Router receives a new event that would trigger the execution of a function (in this case, not available). As a result, the Function Deployer steps in to have a new Function Replica provisioned. To this end, it gathers the desired function configuration from the Function Registry. With the necessary information, the Resource Manager is commanded to deploy the Function Replica on the computing nodes. Once the Function Deployer is informed that a new function replica was created, the Function Router resumes the triggering of the original event that will lead to the execution of the function.

The cold-start delay of the above execution scenario has two components: 1) the delay to provision the execution environment (VMs and containers) for the new function replica; and 2) the delay to start up the function application.

As containerization or virtualization techniques are optimized to decrease start-up time [16, 19, 23], application start-up time will become a more evident problem. Our experiments showed that Java applications typically take more than 100 ms to start, and depending on the application initialization requirements, this time can reach 300 ms.

3 Prebaking

In this Section, we describe the design and a prototype implementation of our prebaking technique, which has the primary goal of reducing serverless cold-starts. Furthermore, the technique aims to: i) be easy to integrate with existing serverless platforms, ii) not harm the function performance after the start-up, and iii) not increase the costs of operating the serverless platform.

In the following, we detail the design of the prebaking technique (Section 3.1). Then, we describe a prototype implementation of this design using the CRIU checkpoint/restore tool (Section 3.2).

3.1 Design

The prebaking technique reduces function start-up time by restoring snapshots of previously started function runtimes. As shown in Figure 1, before a function is ready to serve requests, it executes a complex (and, in many cases, slow) series of steps. These steps include: creating a new process to host the runtime, bootstrapping the runtime (e.g., the initialization of its data structures and auxiliary services), and loading the function code. We assume that it is faster to restore a function snapshot than to re-execute all those start-up steps.

This sort of checkpoint/restart method is widely used in high-performance computing (HPC) to tolerate faults in long-running applications; when a failure happens, the application could be resumed, from periodically generated snapshots, instead of restarting from scratch. However useful, checkpoint/restart may impact application performance. On the one hand, the more frequently the snapshots are generated,


the faster it is to recover the state just before the failure (because less computation must be re-executed). On the other hand, since the snapshot generation competes for computing resources, frequent snapshots could slow the application down. In the best case, when there is no fault, the snapshot generation is pure overhead.

Figure 1. Function execution sequence when a function replica is not available. Adapted from [7]

Differently from the HPC case, the prebaking technique creates function snapshots only when the user deploys a new function version. From a typical serverless platform architecture point of view, it is more appropriate for the Function Builder to trigger the function snapshot since this component is responsible for transforming the function into deployable artifacts. After building the function based on the prebaking technique, the function building process can remain the same as explained in Section 2. This has the additional advantage of not delaying the function execution, since function building executes before the function is available to be called.

The platform would restore the snapshot whenever a new function instance is created. The same snapshot can be used to restore different Function Replicas because all of them have the same state at the beginning of the execution. More importantly, the prebaking technique allows the creation of snapshots at any point of the function setup. This characteristic opens room for optimizing the process of snapshot generation to minimize the restart delay. For example, a runtime-agnostic option is to generate the snapshots after the end of the start-up procedure. This alternative eases the snapshot generation. Instead of instrumenting runtime-specific code, it is only necessary to wait for the completion of the process creation to generate the snapshot.

Another possible optimization is to fine-tune the moment of the runtime start-up at which to snapshot the function. Choosing the best time requires in-depth knowledge about the runtime, such as when the start-up procedure reaches the right balance between progress and the amount of state generated. The rationale behind such optimization is that the larger the snapshot, the longer it takes to be restored.

Despite the potential benefits, harnessing fine-grained knowledge about the runtime could compromise our aim to be easy to integrate with existing serverless platforms, since any modification to the runtime (which is not unusual) would have to be integrated back. It would also make it hard to instrument the runtime and function codes to support the generation of the snapshots.

3.2 Prototype implementation

The first thing needed to perform the checkpoint is to read the memory and state of the target process. The most straightforward mechanism to perform this is to modify the program to perform its checkpoint. This solution is inadequate for our scenario. As we described in the beginning of this Section, we aim to be easy to integrate with existing serverless platforms and the solution would demand modifications to the code of any function submitted to the platform.

A fully transparent checkpoint, that is, one performed without the knowledge of the target program, is doable at the kernel level. At this level, the checkpoint procedure would be able to access the address space of any process. Unfortunately, however doable, the kernel-based solution never achieved mainstream adoption. Another option is to use solutions that relax transparency to stay in userland, which is the case of


libckpt [22]. That is still not ideal because it requires the recompilation of the target application to include the code that performs the checkpoint.

More recently, CRIU achieved a fully transparent checkpoint at the user level. Instead of modifying the target applications at compile time, CRIU injects the checkpoint procedure into the application code while the application is running. After injection, the checkpoint code runs in the same address space of the target process and thus can read its internal state to perform the checkpoint. Once the checkpoint is finished it removes itself from the code, and the unaware target application resumes its execution.

First, CRIU needs to freeze all the target process’s threads, so that its state does not change while generating the checkpoint dump. After stopping all the threads, CRIU needs to discover what should be checkpointed for each of these threads. For example, it reads the /proc/$pid/pagemap file to find the mapped memory areas. Afterward, CRIU injects the procedure (parasite code) responsible for performing the actual dump into the target process address space using the ptrace system call. When the parasite code starts running, it communicates with the CRIU process to know what to dump, reads the content from the process address space, and sends it through a pipe to the CRIU process. Finally, CRIU uses the ptrace system call to remove the parasite code and to detach from the target process, which resumes its execution.

The restore process is more straightforward than the dump one. During the restoration, the CRIU tool process transmutes itself into the checkpointed process. The first action is to read the dump files and restore the process’s state. Then, it recreates all namespaces and opened files. Finally, the checkpointed memory is remapped.

CRIU is able to run both the checkpoint and the restore mechanisms unprivileged. This is possible due to the recently added CAP_CHECKPOINT_RESTORE capability [11]. This capability relaxes some permissions to execute procedures such as selecting a specific pid when cloning a new process and accessing the memory mapped files.

4 Evaluation

In this Section, we evaluate the performance of using the prebaking technique, described previously, in comparison with the usual start-up procedure. We focused our evaluation on two questions: i) Can the prebaking technique improve the start-up time of a serverless function (Section 4.2)? ii) Does the novel start-up procedure lead to any penalty on the function performance after start-up (Section 4.3)?

4.1 Methodology

To answer the questions described above, we conducted a 2² factorial experiment. Our experiments measured the runtime start-up and the function response time of 200 subsequent invocations. As factors, we assessed two start-up methods, i.e., prebaking versus the usual start method, based on fork-exec system calls (henceforth, the Vanilla method). The start-up of a function involves procedures executed both by the operating system (e.g., clone, fork and exec system calls) as well as procedures executed at the user level, including the bootstrap of the runtime and the application initialization. Since the duration of the user-level procedures is affected by the characteristics of the functions, we evaluated three different functions: NOOP, Markdown Render and Image Resizer².

² https://github.com/paulofelipefeitosa/serverless-handlers

All three functions were written in Java and used an HTTP server to handle the requests, as usually employed in commercial FaaS providers, such as AWS Lambda, Google Cloud Functions, Azure Functions, and IBM OpenWhisk [13, 17].

The NOOP function is very straightforward. It does nothing and returns success to every incoming request. The function business logic neither has extra dependencies nor adds extra processing/memory overhead. On the other hand, the Markdown Render converts a Markdown document to an HTML page. We embed a Markdown document³ inside the body of each incoming request, and receive the HTML page as response.

³ https://github.com/PrincetonUniversity/openpiton

In its turn, the Image Resizer is more complex [2, 19]. On start-up, it loads a 1MB, 3440x1440 pixels image⁴, and for each incoming request the function scales it down to 10% of its original size. The Image Resizer function depends on three image processing packages, all from the Java Software Development Kit [21].

⁴ https://i.imgur.com/BhlDUOR.jpg

As we are focusing on the function start-up, the experiments were composed only of the load generator and the function runtime (i.e., JVM). That means we deliberately excluded some typical components of FaaS platforms, such as container orchestrators [17]. Without loss of generality, focusing on the runtime simplified the experimental setup and removed sources of experimental noise.

The most common concurrency model in public clouds is that each function replica handles one request at a time. If a replica is busy and a new request arrives, the platform starts another replica to do the job. On the other hand, if a replica is inactive for a certain period, the platform garbage collects the function replica to save resources [27]. So, to mimic this behavior, the load generator starts the function replica and holds the first request until the replica becomes ready. After that, the load is sent sequentially and at a constant rate.

All the experiments discussed in Sections 4.2 and 4.3 were performed on a quad-core Intel(R) Core(TM) i5-3470S 2.90GHz VM, with 8GB RAM running Ubuntu 16.04 with Linux kernel 4.15.0-45-generic-x86_64. We used the Java Oracle 1.8.0_201 runtime. Each experiment treatment was repeated 200 times. The load generator and the function runtime were restarted before each run.


Figure 2. Experiment components and communication flow.

4.2 Function start-up Time

This Section describes the set of experiments conducted to answer the following research question: can the prebaking technique improve the start-up time of a FaaS function?

Figure 3 compares 200 observations of the start-up time (in milliseconds) of the NOOP, Markdown Render and Image Resizer functions. The error bars represent the 95% confidence interval of the median, calculated using bootstrap [6]. As the confidence intervals of the two start-up techniques do not intersect for the NOOP, Markdown, or Image Resizer function, we have a visual hint that the prebaking technique improves start-up performance, and thus decreases the function latency variability.

Figure 3. Comparison of serverless instance initialization techniques using the NOOP, Markdown Render and Image Resizer functions. Each point represents the start-up time of one experiment repetition. The error bars represent the 95% confidence interval of the median, calculated using bootstrap [6].

As some samples failed the Shapiro-Wilk normality test [24], we used the non-parametric Wilcoxon-Mann-Whitney Test to check if the medians are equal and to calculate the confidence interval of the median difference. The tests confirm, with 95% of statistical confidence, that in all cases the medians are not equal. Thus, the prebaking technique leads to shorter start-up times.

For the NOOP function, the confidence interval of the median difference was [40.35, 42.29] milliseconds, which represents an improvement of 40%. As the NOOP function is very elementary, we believe it is the bottom line in terms of start-up time improvement. The experimental results for Markdown Render and Image Resizer confirmed our belief: for more complex functions, using the prebaking technique led to higher start-up speed-up. The Markdown function achieved an improvement of 47% while the Image Resizer reached a speed-up of 71%.

4.2.1 Start-up Components. Once we gained confidence in the overall strategy of the prebaking technique, we started another set of experiments aiming to better understand what happens when the function is started. To do so, we divided the function start-up into four components (or phases): i) execution of the clone system call (CLONE), ii) execution of the exec system call (EXEC), iii) the period between the end of the exec call and the start of the main() procedure (runtime start-up, RTS) and iv) from the end of the RTS phase to when the function is ready to serve the first request (application initialization, APPINIT).

To measure the duration of the first two phases, we instrumented the execution of the application using the bpftrace tool⁵. We introduced logs using the standard system call probes available in bpftrace (in which one can probe both the entry and exit points for any system call). To measure the last two phases, we added logs before the runtime starts executing the first line of code, and right before the HTTP server handles the first request. We repeated each experiment 200 times.

⁵ https://github.com/iovisor/bpftrace

Figure 4 depicts a detailed view of the start-up time of the functions. For each application and start-up technique, it presents the duration of each component as a part of the overall start-up time. For instance, Figure 4 allows us to spot that, regardless of the case, the clone and exec system calls contribute a tiny fraction of the start-up time.

Figure 4. Application start-up components duration stacked as part of overall start-up time. The CLONE and EXEC phases contribute a tiny fraction of the overall start-up time.
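The four-phase decomposition of Section 4.2.1 amounts to differences between six logged timestamps. The sketch below shows one way to compute the phase durations and their share of the start-up time. The class and function names are hypothetical, the timestamps are made-up values in microseconds rather than measurements from the paper, and the share is computed over the sum of the four phases (an assumption; any gap between the clone exit and the exec entry is left out).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StartupTimestamps:
    """Timestamps (microseconds) of the events that delimit the four phases."""
    clone_enter: int
    clone_exit: int
    exec_enter: int
    exec_exit: int
    main_start: int   # runtime begins executing main()
    ready: int        # function can serve its first request


def phase_durations(ts: StartupTimestamps) -> Dict[str, int]:
    """Split one start-up into CLONE, EXEC, RTS and APPINIT durations."""
    return {
        "CLONE": ts.clone_exit - ts.clone_enter,
        "EXEC": ts.exec_exit - ts.exec_enter,
        "RTS": ts.main_start - ts.exec_exit,
        "APPINIT": ts.ready - ts.main_start,
    }


def phase_shares(durations: Dict[str, int]) -> Dict[str, float]:
    """Express each phase as a fraction of the summed phase durations."""
    total = sum(durations.values())
    return {name: value / total for name, value in durations.items()}


if __name__ == "__main__":
    # Made-up timestamps, for illustration only (not measurements from the paper).
    ts = StartupTimestamps(clone_enter=0, clone_exit=200,
                           exec_enter=300, exec_exit=800,
                           main_start=70_800, ready=100_800)
    for name, share in phase_shares(phase_durations(ts)).items():
        print(f"{name:8s}{share:7.2%}")
```

With these illustrative numbers, CLONE and EXEC together account for well under 1% of the total, which mirrors the qualitative observation made for Figure 4.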


When using the Vanilla technique, the RTS and APPINIT phases dominate the function start-up time. In the Vanilla executions, the RTS phase is responsible for adding ≈ 70 ms to the overall application start-up; there is no statistically significant difference between the RTS phase values across the evaluated functions. On the other hand, there is a significant difference between the duration of the APPINIT phase when comparing the NOOP or Markdown results with the Image Resizer. This difference becomes visible because the Image Resizer function needs to read a 1MB file, which translates into more I/O operations.
One noticeable difference of using the prebaking technique is that it brings the RTS down to 0ms, regardless of the function. Therefore, the overall start-up time is almost totally dictated by the APPINIT phase. That was expected, considering the core of the technique is to bake a ready-to-use image of the function runtime and load it when needed. However, it is important to notice that the absolute values differ across functions, i.e., on average, the Image Resizer function spent ≈ 42% more time than the NOOP, and ≈ 34% more time than the Markdown Render. We also expected this increase, since the Image Resizer snapshot (99.2MB) is larger than the NOOP (13MB) and Markdown (14MB) snapshots, thus requiring more read operations to recover the process state.
Even though I/O operations impact the APPINIT phase of both mechanisms, the impact is different. The APPINIT duration of the Image Resizer function is ≈ 7.18 times that of the NOOP when using the vanilla start-up mechanism. However, that same ratio decreases to ≈ 1.43 when using the prebaking technique.

4.2.2 Choosing The (Pre)Baking Ingredients. The last section taught us three valuable lessons:
• The RTS and APPINIT are the phases with the most significant impact on start-up time;
• The prebaking technique brings the RTS down to 0ms;
• Regardless of the mechanism, the APPINIT phase contributions to the function start-up time are impacted by runtime-specific characteristics, for example, the JVM lazily loads and compiles the function code.
To better understand the runtime-specific impact on the function start-up time, we created a synthetic function which loads a predefined number of classes when invoked. Using the vanilla start-up mechanism, we ran an experiment measuring the start-up time of three function sizes: i) small - 374 classes (≈ 2.8MB), ii) medium - 574 classes (≈ 9.2MB) and iii) big - 1574 classes (≈ 41MB). The loaded classes have different sizes, which is why the total size does not grow linearly with the number of classes. Each treatment was repeated 200 times. We used this range of function sizes to assess how sensitive the start-up mechanisms are to the growth of functions, rather than trying to forecast typical function sizes.

Figure 5. Impact of the function size on the function start-up time. The error bars represent the interval for 95% statistical confidence.

Figure 5 presents the impact of the function size on start-up time. The error bars represent the interval for 95% statistical confidence. Those results led us to the idea that the prebaking technique could be used to eliminate runtime-specific overheads. Instead of baking a freshly started JVM, one way to eliminate the overhead caused by the code compilation is to warm the JVM before baking. The warmup procedure consisted of sending one request to the serverless function, which triggers the code compilation.
We evaluated the new prebaking mechanism (Prebaking-Warmup) and the previous one (Prebaking-NOWarmup) in comparison with the vanilla mechanism (without the checkpoint/restore acceleration). In this full-factorial experiment design, we also experimented with the function size factor: small, medium and big. Our evaluation measured the start-up time and repeated each treatment 200 times.

Figure 6. The start-up time improvement of both prebaking techniques in comparison with the vanilla technique. The Prebaking-Warmup improvement shows the impact of warming the functions before generating the snapshots.

Figure 6 presents the ratio between the start-up time of the vanilla technique and that of each version of the prebaking technique. First, our results show that warming up a Java function before baking makes the start-up time even better (403.96% versus 127.45%, for small functions). Second, the performance gain grows as the function size grows (1932.49%
versus 121.07%, for more realistic functions). This growth is because snapshot loading is less impacted by function size than the vanilla source-code loading and compilation. Table 1 helps us analyze this impact by showing the start-up time intervals for the three start-up techniques and function sizes. As we can see, the start-up time growth from small to big functions was ≈ 30ms for prebaking with warmup (i.e., PB-Warmup) and ≈ 1168ms for prebaking without warmup (i.e., PB-NOWarmup).

4.3 Service Time Overhead
In Section 4.2, we analyzed the start-up time and showed that the prebaking technique led to significant performance improvements. In the state of the practice, those improvements imply more latency predictability, as the perceived latency of the first request suffers less from the cold start. In this section, we present the evaluation of the service time, aiming to understand how FaaS functions behave after being restored. In other words, we assess whether the start-up procedure leads to any performance penalty.
Figure 7 presents the empirical cumulative distribution function (ECDF) of the service time for 200 requests applied to the NOOP, Markdown Render and Image Resizer functions after being initialized by the prebaking and vanilla techniques. Both ECDFs nearly coincide, which is a good indication that the prebaking technique does not lead to any performance penalty after the functions are restored.

5 Integration
To assess the feasibility of the prebaking technique, we integrated it with the open-source OpenFaaS platform6. In our integration scenario, we used Kubernetes as the Resource Management layer. OpenFaaS is one of the most popular open-source serverless platforms. It provides excellent documentation about the platform architecture, making it easier to understand how we could integrate our technique. In the next sections, we overview the OpenFaaS design and explain its integration with the prebaking technique.
6 https://www.openfaas.com/
The prebaking technique was designed to be easily integrated with existing serverless platforms. Such a premise does not tie our technique to a specific serverless platform, nor to a process isolation technology.

5.1 OpenFaaS
OpenFaaS is a container-based serverless platform. It means that there is a container for every function, and the container should encapsulate all dependencies (e.g., source-code and runtime). That said, Figure 8 presents an overview of the OpenFaaS architecture and how its components communicate.
As shown in Figure 8, users operate OpenFaaS through the FaaS-CLI, which defines an API for the operations with functions. As an example, we list some operations which are essential to the integration with the prebaking technique:
1. new: creates a new function project by copying a language template from the Templates Repository. Afterwards, the developer can edit the project to implement the function's business logic;
2. build: transforms the function source code into a deployable artifact, which is a Docker container image;
3. push: stores the function deployable artifacts into the Function Registry, which is a Container Image Repository;
4. deploy: deploys the function into an OpenFaaS Gateway, enabling its usage through the platform.
Every request that comes through the platform hits the Gateway API, which is the OpenFaaS platform entry point. It provides APIs to deploy, invoke, and scale functions, and to gather information and metrics about the instances of the function. Furthermore, the platform auto-scaling functionality is shared between the Gateway API and the Prometheus tool, which continuously monitors metrics and fires alerts. All alerts fired by Prometheus are processed by the Gateway API, which decides when to scale down/up the number of active function replicas.
Instead of directly executing operations, such as incrementing the number of replicas of a particular function, the Gateway API delegates them to the FaaS-Provider. This indirection abstracts away details about different container orchestration mechanisms and tools. Currently, the FaaS-Provider has implementations for Kubernetes and Docker Swarm integration.
Finally, the function Watchdog is the component responsible for managing and monitoring the function replica lifecycle. Furthermore, it is a communication interface between the platform API and the replica process.

5.2 Prebaking OpenFaaS Functions
As shown in Figure 9, OpenFaaS introduced the concept of templates. A template hides setup complexity from users that have everyday use cases. There are templates for languages like Go, Python, Java, PHP, and C#.
To spin off a prebaked function, we need to create a template that adds all CRIU dependencies and executes CRIU commands. As CRIU uses different commands to start processes in different runtimes, we created a new CRIU-version template for each language that we wanted to support7. With the prebaking template, the developers can create function projects that adopt the prebaking technique by using the new operation from FaaS-CLI.
7 https://github.com/paulofelipefeitosa/templates
Prebaking templates work differently from usual OpenFaaS templates. In the build phase, when transforming the source-code and dependencies into a deployable function, Prebaking templates start the function runtime and run an optional post-processing script (e.g., warm-up requests),

Table 1. Start-up time intervals (in milliseconds) for functions with small, medium and big code bases. Intervals were calculated to provide 95% statistical confidence.

          Vanilla              PB-NOWarmup          PB-Warmup
Small     (219.25;220.32)      (172.12;172.80)      (54.06;54.75)
Medium    (455.45;456.64)      (360.51;361.24)      (63.46;63.99)
Big       (1619.91;1622.08)    (1339.90;1340.98)    (83.62;84.35)
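
The growth and speed-up figures quoted for these techniques can be roughly cross-checked from Table 1. The sketch below uses the midpoint of each interval as a stand-in for the median, which is an assumption: the reported ratios were presumably computed from the raw samples, so the numbers obtained here only approximate them. The helper names are illustrative.

```python
# 95% confidence intervals in milliseconds, copied from Table 1.
table1 = {
    "Small": {"Vanilla": (219.25, 220.32), "PB-NOWarmup": (172.12, 172.80), "PB-Warmup": (54.06, 54.75)},
    "Medium": {"Vanilla": (455.45, 456.64), "PB-NOWarmup": (360.51, 361.24), "PB-Warmup": (63.46, 63.99)},
    "Big": {"Vanilla": (1619.91, 1622.08), "PB-NOWarmup": (1339.90, 1340.98), "PB-Warmup": (83.62, 84.35)},
}


def midpoint(interval):
    """Midpoint of an interval, used here as an approximation of the median."""
    low, high = interval
    return (low + high) / 2


def speedup_percent(size, technique):
    """Vanilla start-up time divided by the technique's start-up time, as a percentage."""
    return 100 * midpoint(table1[size]["Vanilla"]) / midpoint(table1[size][technique])


def growth_ms(technique):
    """Increase in start-up time when going from the small to the big function."""
    return midpoint(table1["Big"][technique]) - midpoint(table1["Small"][technique])


for technique in ("PB-NOWarmup", "PB-Warmup"):
    print(
        f"{technique}: growth = {growth_ms(technique):.2f} ms, "
        f"speed-up small = {speedup_percent('Small', technique):.2f}%, "
        f"speed-up big = {speedup_percent('Big', technique):.2f}%"
    )
```

With midpoints, the growth values come out close to the ≈ 30ms and ≈ 1168ms stated in Section 4.2.2, and the ratios land within a few tenths of a percent of the reported ones.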

Figure 7. Empirical cumulative distribution function (ECDF) of the service time for 200 requests applied to NOOP, Markdown Render and Image Resizer functions after being initialized by the Prebaking and Vanilla techniques.
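
For readers unfamiliar with ECDFs, the following minimal sketch shows how one is computed and how the gap between two of them can be quantified. The maximum vertical distance used here is the two-sample Kolmogorov-Smirnov statistic; the paper compares the curves visually, so this metric, the function names, and the sample values are assumptions made for illustration.

```python
import bisect


def ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def max_ecdf_distance(a, b):
    """Largest vertical gap between the ECDFs of a and b."""
    fa, fb = ecdf(a), ecdf(b)
    # Both ECDFs are step functions, so the gap only changes at sample points.
    return max(abs(fa(x) - fb(x)) for x in sorted(set(a) | set(b)))


# Hypothetical service times in milliseconds (not the paper's data).
vanilla_ms = [5.1, 5.3, 5.0, 5.4, 5.2, 5.6]
prebaked_ms = [5.2, 5.1, 5.3, 5.0, 5.5, 5.4]
print(f"max ECDF distance = {max_ecdf_distance(vanilla_ms, prebaked_ms):.3f}")
```

A distance near 0 means the two curves nearly coincide, which is the behaviour Figure 7 is meant to show.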

and checkpoint the function process into the container image. After the Docker container image is created, the OpenFaaS platform can finish the function push and deploy processes as usual. Then, when it is time to start the function replica, the container executes the CRIU command to restore the dump previously saved inside the container image.
Since the usual docker build does not allow the execution of privileged operations, it was necessary to install the Docker Buildx CLI plugin8 to allow the docker command to perform such operations.
8 https://docs.docker.com/buildx/working-with-buildx/
After creating a new function using the CRIU templates and installing Docker Buildx, developers can deploy function instances using the same commands provided by FaaS-CLI.
Finally, as mentioned before, the restore operation is privileged. The docker run command already supports this functionality by starting the container using the --privileged option. As Kubernetes already supports this behavior, we only needed to introduce it in the FaaS-Provider implementation.

6 Related Work
Long start-up time remains an open problem in serverless computing. Many efforts have been proposed to decrease start-up time, including snapshot-based solutions. For example, SEUSS harnesses unikernels to execute snapshots of serverless applications [4]. SEUSS improves start-up times by snapshotting the unikernel state at different moments of the function life-cycle and cloning the unikernel snapshot when there is no Function Replica available. SEUSS runs on a dedicated OS focused on delivering fast snapshot restore. The decision of adopting an ad-hoc kernel instead of using an out-of-box Linux kernel comes with a price: it leads to more

Figure 8. OpenFaaS main components. Function artifacts are stored in a Container Image Repository, and later downloaded and used to create new Function Replicas.

complex integrations with existing serverless platforms. In particular, when the platform is deployed on premise by small organizations, the cost of operation could be prohibitive. Furthermore, it is unclear how SEUSS performs when dealing with more complex functions, as the work describes only the evaluation of the NOOP function.
SOCK proposes to cache application loaders with pre-imported packages and clone them to avoid runtime start-up [18, 19]. However, the SOCK service implementation is language-specific, and it does not deal with other application aspects that influence the start-up time, for instance, I/O heavy initialization. Boucher et al. [3] propose the adoption of language-based isolation instead of process-based isolation. The authors implemented a multi-tenant worker process in Rust, which directly executes functions by dynamically loading the function code and running it as a thread. Even though they achieved a start-up time in the order of microseconds, the solution requires users to write functions in Rust, which brings challenges in terms of usage and code transpilation. Our work provides a language-aware solution that can be plugged in by the cloud provider and massively improve the start-up time of applications without changing the user code or imposing new user requirements.
The clone and restore approach has already been used in other contexts to decrease cloud applications' start-up time. Kukreti and Mueller [12] propose the process cloning technique so that speculative tasks avoid recomputing work already done by the original task, improving the probability of a speculative task catching up with the straggler task. Kawachiya et al. [10] propose to clone and restore the JVM internal structures, avoiding several start-up steps to reduce Java application start-up times. Oh and Moon [20] propose to decrease Web application start-up time by snapshotting JavaScript objects and restoring them when the application is loaded. We chose the process cloning technique because of its generality. As the clone operation is applied at the process level, it does not need to know runtime or application internal structures. However, the knowledge of runtime internals could be used to make the start-up even better.

7 Conclusion
Serverless platform users face a well-known problem of high response times when request handling needs to wait for the platform to scale up. This problem is commonly known as "cold start", which has as significant contributors the platform orchestration overhead and virtualized environment (container or VM) start-up [14, 16]. However, corroborating previous studies [14, 15, 19], our findings revealed that function runtime start-up also plays a major role in cold starts. Our experiment results show that JVM start-up times range from 310 to 1600 ms, depending on the code size.
We focused on decreasing the function process start-up time using a cloning technique based on Checkpoint/Restore In Userspace (CRIU). The proposed solution persists the process state of a ready-to-serve serverless instance to recover this state in a cloned process later, when the platform needs to scale up. Our results show that using process cloning to

Figure 9. Diagram of the Prebaked Functions deployment and execution flow in the OpenFaaS platform. On the build phase,
CRIU triggers the process checkpoint and stores the Function Snapshot data inside the Function Container Image. Whenever
the FaaS-Provider launches a new Function Replica, CRIU restores the snapshot.
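
To make the two phases in Figure 9 concrete, the sketch below assembles the kind of CRIU command lines involved, without executing them. It is an illustration only: the text does not list the exact commands used by the prebaking templates, so the snapshot directory, the process id, and the helper names are hypothetical. The `--tree` and `--images-dir` options are standard CRIU options, but the actual templates may pass additional ones.

```python
import shlex

SNAPSHOT_DIR = "/snapshot"  # hypothetical path inside the container image


def criu_dump_command(pid, images_dir=SNAPSHOT_DIR):
    """Build the `criu dump` argv that checkpoints the process tree rooted at `pid`."""
    return ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]


def criu_restore_command(images_dir=SNAPSHOT_DIR):
    """Build the `criu restore` argv that recreates the process from the saved images."""
    return ["criu", "restore", "--images-dir", images_dir]


def prebaking_plan(runtime_pid, warmup=True):
    """List the build-time and replica-start steps as printable shell lines."""
    build = ["# build phase: start the function runtime"]
    if warmup:
        build.append("# build phase: send warm-up request(s) to the function")
    build.append(shlex.join(criu_dump_command(runtime_pid)))
    start = [shlex.join(criu_restore_command())]
    return {"build": build, "replica_start": start}


plan = prebaking_plan(runtime_pid=1234)  # 1234 is a placeholder pid
for phase, steps in plan.items():
    print(phase)
    for step in steps:
        print("  " + step)
```

Keeping the warm-up step optional mirrors the distinction between the Prebaking-NOWarmup and Prebaking-Warmup variants evaluated in Section 4.2.2.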

start serverless functions removes the overhead of the JVM start-up, leading to a gain of 40% for a NOOP function, and 47% to 71% for more representative functions. The proposed solution also allows platform maintainers to interact with the process before persisting its state. We used this functionality to warm a Java function up before persisting it (i.e., prebaking), and our experiments show that it leads to improvements ranging from 127.45% to 1932.49%. That means the prebaking technique not only removed the JVM start-up overhead, but also effectively removed the overhead caused by loading and compiling the code (JIT). Finally, we showed that these gains are proportional to the code size of the function.
As future work, we plan to extend our evaluation to other runtime environments such as Node.js and Python, all supported by the leading public FaaS platforms. As different runtimes implement distinct start-up procedures, the potential improvements remain unknown. Also, we plan to evolve the integration of the Prebaking technique into other FaaS platforms, aiming to assess how easy it is to integrate the technique with different designs. In addition to that, we plan to evaluate checkpoint/restore as a service, including aspects such as performance with even bigger function code sizes and concurrent snapshots. Finally, we plan to adopt the recently released version of the CRIU tool, which does not require the execution of privileged operations, and to experiment with in-memory optimization on CRIU to speed up snapshot restore [26].

Acknowledgments
We would like to thank all anonymous reviewers from this 2020 Middleware edition and our shepherd Lucy Cherkasova for their guidance and precious feedback. This work was supported by CAPES, the Brazilian Federal Agency for Support and Evaluation of Graduate Education. Furthermore, it has been funded by grant #2015/24461-2, São Paulo Research Foundation (FAPESP), and by the project ATMOSPHERE (atmosphere-eubrazil.eu), by the Brazilian Ministry of Science, Technology and Innovation (Project 51119 - MCTI / RNP 4th Coordinated Call) and by the European Commission under the Cooperation Programme, Horizon 2020 grant agreement no 777154.

References
[1] Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. 2020. Firecracker: Lightweight Virtualization for Serverless Applications. In 17th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2020, Santa Clara, CA, USA, February 25-27, 2020, Ranjita Bhagwan and George Porter (Eds.). USENIX Association, 419–434. https://www.usenix.org/conference/nsdi20/presentation/agache
[2] Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Klaus Satzke, Andre Beck, Paarijaat Aditya, and Volker Hilt. 2018. SAND: Towards High-Performance Serverless Computing. In 2018 USENIX Annual Technical Conference, USENIX ATC 2018, Boston, MA, USA, July 11-13, 2018, Haryadi S. Gunawi and Benjamin Reed (Eds.). USENIX Association, 923–935. https://www.usenix.org/conference/atc18/presentation/akkus
[3] Sol Boucher, Anuj Kalia, David G. Andersen, and Michael Kaminsky. 2018. Putting the "Micro" Back in Microservice. In 2018 USENIX Annual Technical Conference, USENIX ATC 2018, Boston, MA, USA, July 11-13, 2018, Haryadi S. Gunawi and Benjamin Reed (Eds.). USENIX Association, 645–650. https://www.usenix.org/conference/atc18/presentation/boucher
[4] James Cadden, Thomas Unger, Yara Awad, Han Dong, Orran Krieger, and Jonathan Appavoo. 2020. SEUSS: skip redundant paths to make serverless fast. In EuroSys '20: Fifteenth EuroSys Conference 2020, Heraklion, Greece, April 27-30, 2020, Angelos Bilas, Kostas Magoutis, Evangelos P. Markatos, Dejan Kostic, and Margo I. Seltzer (Eds.). ACM, 32:1–32:15. https://doi.org/10.1145/3342195.3392698
[5] Serjik G. Dikaleh, Eric Charpentier, John Liu, Neil DeLima, and Vince Yuen. 2018. Build a cognitive serverless slack app with IBM cloud functions & IBM Watson API. In Proceedings of the 28th Annual International Conference on Computer Science and Software Engineering, CASCON 2018, Markham, Ontario, Canada, October 29-31, 2018. 354–355. https://dl.acm.org/citation.cfm?id=3291336
[6] Bradley Efron and Robert J. Tibshirani. 1993. An Introduction to the Bootstrap. Number 57 in Monographs on Statistics and Applied Probability. Chapman & Hall/CRC, Boca Raton, Florida, USA.
[7] Erwin Van Eyk, Alexandru Iosup, Johannes Grohmann, Simon Eismann, André Bauer, Laurens Versluis, Lucian Toader, Norbert Schmitt, Nikolas Herbst, and Cristina L. Abad. 2019. The SPEC-RG Reference Architecture for FaaS: From Microservices and Containers to Serverless Platforms. IEEE Internet Comput. 23, 6 (2019), 7–18. https://doi.org/10.1109/MIC.2019.2952061
[8] Henrique Fingler, Amogh Akshintala, and Christopher J. Rossbach. 2019. USETL: Unikernels for Serverless Extract Transform and Load Why should you settle for less?. In Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, APSys 2019, Hangzhou, China, August 19-20, 2019. ACM, 23–30. https://doi.org/10.1145/3343737.3343750
[9] Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Jayant Yadwadkar, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica, and David A. Patterson. 2019. Cloud Programming Simplified: A Berkeley View on Serverless Computing. CoRR abs/1902.03383 (2019). arXiv:1902.03383 http://arxiv.org/abs/1902.03383
[10] Kiyokuni Kawachiya, Kazunori Ogata, Daniel Silva, Tamiya Onodera, Hideaki Komatsu, and Toshio Nakatani. 2007. Cloneable JVM: a new approach to start isolated java applications faster. In Proceedings of the 3rd International Conference on Virtual Execution Environments, VEE 2007, San Diego, California, USA, June 13-15, 2007, Chandra Krintz, Steven Hand, and David Tarditi (Eds.). ACM, 1–11. https://doi.org/10.1145/1254810.1254812
[11] Linux Kernel. 2020. Linux merged patch for unprivileged checkpoint/restore. Retrieved August 31, 2020 from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=74858abbb1032222f922487fd1a24513bbed80f9
[12] Sarthak Kukreti and Frank Mueller. 2018. CloneHadoop: Process Cloning to Reduce Hadoop's Long Tail. In 5th IEEE/ACM International Conference on Big Data Computing Applications and Technologies, BDCAT 2018, Zurich, Switzerland, December 17-20, 2018. IEEE Computer Society, 11–20. https://doi.org/10.1109/BDCAT.2018.00011
[13] Hyungro Lee, Kumar Satyam, and Geoffrey C. Fox. 2018. Evaluation of Production Serverless Computing Environments. In 11th IEEE International Conference on Cloud Computing, CLOUD 2018, San Francisco, CA, USA, July 2-7, 2018. IEEE Computer Society, 442–450. https://doi.org/10.1109/CLOUD.2018.00062
[14] Ping-Min Lin and Alex Glikson. 2019. Mitigating Cold Starts in Serverless Platforms: A Pool-Based Approach. CoRR abs/1903.12221 (2019). arXiv:1903.12221 http://arxiv.org/abs/1903.12221
[15] Johannes Manner, Martin Endreß, Tobias Heckel, and Guido Wirtz. 2018. Cold Start Influencing Factors in Function as a Service. In 2018 IEEE/ACM International Conference on Utility and Cloud Computing Companion, UCC Companion 2018, Zurich, Switzerland, December 17-20, 2018, Alan Sill and Josef Spillner (Eds.). IEEE, 181–188. https://doi.org/10.1109/UCC-Companion.2018.00054
[16] Anup Mohan, Harshad Sane, Kshitij Doshi, Saikrishna Edupuganti, Naren Nayak, and Vadim Sukhomlinov. 2019. Agile Cold Starts for Scalable Serverless. In 11th USENIX Workshop on Hot Topics in Cloud Computing, HotCloud 2019, Renton, WA, USA, July 8, 2019, Christina Delimitrou and Dan R. K. Ports (Eds.). USENIX Association. https://www.usenix.org/conference/hotcloud19/presentation/mohan
[17] Sunil Kumar Mohanty, Gopika Premsankar, and Mario Di Francesco. 2018. An Evaluation of Open Source Serverless Computing Frameworks. In 2018 IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2018, Nicosia, Cyprus, December 10-13, 2018. IEEE Computer Society, 115–120. https://doi.org/10.1109/CloudCom2018.2018.00033
[18] Edward Oakes, Leon Yang, Kevin Houck, Tyler Harter, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2017. Pipsqueak: Lean Lambdas with Large Libraries. In 37th IEEE International Conference on Distributed Computing Systems Workshops, ICDCS Workshops 2017, Atlanta, GA, USA, June 5-8, 2017, Aibek Musaev, João Eduardo Ferreira, and Teruo Higashino (Eds.). IEEE Computer Society, 395–400. https://doi.org/10.1109/ICDCSW.2017.32
[19] Edward Oakes, Leon Yang, Dennis Zhou, Kevin Houck, Tyler Harter, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2018. SOCK: Rapid Task Provisioning with Serverless-Optimized Containers. In 2018 USENIX Annual Technical Conference, USENIX ATC 2018, Boston, MA, USA, July 11-13, 2018, Haryadi S. Gunawi and Benjamin Reed (Eds.). USENIX Association, 57–70. https://www.usenix.org/conference/atc18/presentation/oakes
[20] JinSeok Oh and Soo-Mook Moon. 2015. Snapshot-based loading-time acceleration for web applications. In Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015, San Francisco, CA, USA, February 07 - 11, 2015, Kunle Olukotun, Aaron Smith, Robert Hundt, and Jason Mars (Eds.). IEEE Computer Society, 179–189. https://doi.org/10.1109/CGO.2015.7054198
[21] Oracle. 2019. Overview (Java Platform SE 8). Retrieved August 6, 2019 from https://docs.oracle.com/javase/8/docs/api/
[22] James S. Plank, Micah Beck, Gerry Kingsley, and Kai Li. 1995. Libckpt: Transparent Checkpointing under UNIX. In USENIX 1995 Technical Conference on UNIX and Advanced Computing Systems, New Orleans, Louisiana, USA, January 16-20, 1995, Conference Proceedings. USENIX Association, 213–224. https://www.usenix.org/conference/usenix-1995-technical-conference/libckpt-transparent-checkpointing-under-unix
[23] Amazon Web Services. 2018. Firecracker. Retrieved May 15, 2019 from https://firecracker-microvm.github.io

[24] S. S. Shapiro and M. B. Wilk. 1965. An analysis of variance test for normality (complete samples). Biometrika 52, 3-4 (Dec. 1965), 591–611. https://doi.org/10.1093/biomet/52.3-4.591
[25] Jörg Thalheim, Pramod Bhatotia, Pedro Fonseca, and Baris Kasikci. 2018. Cntr: Lightweight OS Containers. In 2018 USENIX Annual Technical Conference, USENIX ATC 2018, Boston, MA, USA, July 11-13, 2018, Haryadi S. Gunawi and Benjamin Reed (Eds.). USENIX Association, 199–212. https://www.usenix.org/conference/atc18/presentation/thalheim
[26] Ranjan Sarpangala Venkatesh, Till Smejkal, Dejan S. Milojicic, and Ada Gavrilovska. 2019. Fast in-memory CRIU for docker containers. In Proceedings of the International Symposium on Memory Systems, MEMSYS 2019, Washington, DC, USA, September 30 - October 03, 2019. ACM, 53–65. https://doi.org/10.1145/3357526.3357542
[27] Liang Wang, Mengyuan Li, Yinqian Zhang, Thomas Ristenpart, and Michael M. Swift. 2018. Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference, USENIX ATC 2018, Boston, MA, USA, July 11-13, 2018, Haryadi S. Gunawi and Benjamin Reed (Eds.). USENIX Association, 133–146. https://www.usenix.org/conference/atc18/presentation/wang-liang
