Vaadin 14 Scalability Report - December 2019
Introduction
The following scalability assessment was done for the Bakery App Starter
Flow application. For that purpose, we set up a production-like
scalability testing environment with Amazon EC2 instances. A local
machine was used to execute Gatling tests against the environment.
The purpose of this document is to show how well Bakery Flow
scales up and to describe how the testing was done. We also provide
some recommendations for optimizing the application and the setup.
First, we will go through the user journeys, then the test setup. Next,
basic profiling is described to show possible configuration
problems and the most prominent scalability issues. After that, we
discuss the memory usage and general CPU usage of
Bakery Flow under our user journey. Finally, at the end of the
document, we give a summary, some remarks, and a short
guide on how to continue testing on your own in the future.
User journeys
It is easy to identify at least two separate user journeys: Barista and
Admin. It is expected that the Barista user journey is at least 10 times
more common than the Admin user journey. Therefore, we will do
the scalability test only for the Barista journey.
Barista
● Log in (view Storefront)
● Click New Order
● Fill in customer name and phone number
● Add an item to the order (first from the combobox)
● Add another item to the order (second from the combobox)
● Save order
● Go back to storefront
Admin
● Log in (view Storefront)
● Navigate to Dashboard
● Navigate to Users
● Navigate to Products
● Click Strawberry Bun
● Increase the price by $0.01
● Click Save.
Test setup
The application server and the database were run on an Amazon EC2 (EU
Frankfurt) cluster of two identical m5.2xlarge nodes (8 virtual CPU
cores and 32 GB RAM each). On the first node, we used the
embedded Spring Boot Apache Tomcat 8.5.15 server with the HikariCP
connection pool. The Tomcat server's maximum thread count was
reduced from the default 200 to 100 to reduce thread-switching
overhead. A good guideline for sizing the connection pool is the
formula below:
connections = ((core_count * 2) + 1)
(https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing)
The Postgres 9.6 database server was run on the second node of the
cluster. The Hikari connection pool's size was increased to 17 connections,
which is what the formula above gives for the 8-core node: (8 * 2) + 1 = 17.
The postgresql.conf file is shown in the appendix. Both the Tomcat
server and the Postgres server were set up with the help of Docker.
The Dockerfiles used are attached in the appendix.
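As a minimal sketch of how these two settings could be applied in a Spring Boot application such as Bakery (the class and bean names below are illustrative and not taken from the Bakery sources; in the Spring Boot 2.1/2.2 line the plain properties server.tomcat.max-threads and spring.datasource.hikari.maximum-pool-size achieve the same effect):

import com.zaxxer.hikari.HikariDataSource;
import org.apache.coyote.AbstractProtocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ServerTuningConfig {

    // Reduce Tomcat's worker thread pool from the default 200 to 100.
    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> tomcatThreads() {
        return factory -> factory.addConnectorCustomizers(connector -> {
            if (connector.getProtocolHandler() instanceof AbstractProtocol) {
                ((AbstractProtocol<?>) connector.getProtocolHandler()).setMaxThreads(100);
            }
        });
    }

    // Size the Hikari pool with connections = (core_count * 2) + 1 = 17 for 8 cores.
    // The JDBC URL and credentials are placeholders matching the Dockerfile ARGs.
    @Bean
    public HikariDataSource dataSource() {
        HikariDataSource ds = new HikariDataSource();
        ds.setJdbcUrl("jdbc:postgresql://dbhost:5432/flow_bakery");
        ds.setUsername("flow_bakery_user");
        ds.setPassword("flow_bakery_user_pw");
        ds.setMaximumPoolSize(17);
        return ds;
    }
}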
Gatling was run on a separate 6-core (12-thread) i7 machine located in
Turku, Finland. Our previous tests indicated that we were able to run
Gatling with at least 10,000 concurrent users without significant CPU
or network bandwidth limitations.
Profiling
The profiling was done in the local development environment using
Gatling, a local Postgres database, and the JProfiler tool. The profiling
was done with 1,000 concurrent users. Tomcat's maxThreads
parameter was set to 10, and the max heap size was 7 GB. The purpose of
profiling before the actual scalability tests is to verify the application
and the environment and to fix at least the low-hanging-fruit issues,
so that the setup is as scalable as possible.
The picture below shows basic telemetry of the test when it was run
for a couple of minutes with a constant 1,000 users. The most
interesting observations are as follows:
● GC activity is modest, since the application doesn't use
all of its 7 GB heap limit.
● The CPU load is not very high.
● Most importantly, the time threads spend on net I/O
(light blue) is very high compared to the time spent
on normal work (green). There are also a lot of idle threads
(yellow).
The view of JPA hotspots (below) reveals that over 70% of database
time is spent fetching previous orders from the database. These
orders are shown in the Grid of the Storefront view.
This clearly indicates that the biggest part of the time (~40%) is
spent waiting (I/O) for the Storefront view's database query to
complete. After that, the next most significant hotspot is the actual
CPU work done by the application. A straightforward optimization is
therefore to halve the Grid's page size from the default 50 to 25,
which cuts the size of each Storefront query in half.
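The change itself is a one-liner in the view that hosts the Grid. A minimal sketch (not the actual Bakery Storefront code; the Order type below is a placeholder for the application's order entity):

import com.vaadin.flow.component.grid.Grid;

public class StorefrontGridTuning {

    // Returns a Grid whose lazy-loading queries fetch 25 rows at a time
    // instead of the default 50, halving the size of each database query.
    public static Grid<Order> createOrdersGrid() {
        Grid<Order> grid = new Grid<>();
        grid.setPageSize(25);
        return grid;
    }

    // Placeholder for the application's order entity.
    public static class Order {
    }
}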
To verify the effect of these changes, we profiled the application
again with JProfiler. The first figure shows the same basic telemetry
of the test when it was run for a couple of minutes with a constant
1,000 users. The proportion of I/O wait time has been reduced
significantly, from ~40% to <10%. The next figure shows the CPU
hotspots after the optimizations. We can observe that the CPU time
spent on DB operations is less than one fifth of the original value.
We made these modifications and built a new version. The
following tests were run on this optimized version of the
application.
Instance type   vCPU cores   RAM (GB)   Est. max concurrent users
m5.large        2            8          800
m5.xlarge       4            16         1600
m5.2xlarge      8            32         3200
m5.4xlarge      16           64         6400
In addition to the CPU usage, we monitored the network usage of
Tomcat (blue line: Tomcat network out, orange line: Tomcat network in)
and of the database server.
The figure below shows the response time percentiles over the
duration of the tests for 500, 1,500, and 2,500 virtual users. With all
the tested user counts, the response times were low (<500 ms) for
most requests (90%). With 1,500 users, a very small percentage of
requests (~1%) showed increased response times (~700 ms). With
2,500 users, the percentage of slower requests (>700 ms) was slightly
bigger, around 5% of the total number of requests.
Summary
With a cluster of two m5.2xlarge nodes, your Bakery App Starter
Flow application should be able to serve about 2,500 concurrent
users continuously and still survive without major problems if the
concurrent user count occasionally jumps to 3,000. Since our
optimizations reduced the CPU load of the DB server, you could use
a lower-end server for it, e.g. an m5.large (2 CPU cores and 8 GB RAM).
It should even be possible to run the database on the same server
that Bakery runs on.
To translate this into hosting costs, let's assume we would host a
business system with similar resource use to Bakery for 10,000 users.
Let's further assume that a globally distributed workforce uses the
system, so that there are 1,000 - 2,500 concurrent users depending on
the time of day. Taking the presented optimizations into account and
using a single m5.2xlarge AWS reserved instance (standard 1-year term)
to host both the application and the DB, the hosting cost is about
$0.2/user/year, i.e. roughly $2,000 per year for the 10,000 users.
Some additional improvements might be possible to achieve using
optimized garbage collection and native application server libraries.
Appendix
postgresql.conf generated with https://pgtune.leopard.in.ua/
max_connections = 200
shared_buffers = 7680MB
effective_cache_size = 23040MB
maintenance_work_mem = 1920MB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 10485kB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 16
max_parallel_workers_per_gather = 4
Dockerfiles
Tomcat server:
FROM anapsix/alpine-java:8_server-jre
ADD bakery-app.war .
ARG dbip
ARG dbname=flow_bakery
ARG dbuser=flow_bakery_user
ARG dbpw=flow_bakery_user_pw
ENV dbip=$dbip
ENV dbname=$dbname
Postgres server:
FROM library/postgres:9.6.9
Gatling commands
The Gatling tests were run with the Gatling Maven plugin under the
project, using the following parameters, where the IP address (marked
as xx.xxx.xxx.xx) was the application server's IP.
500 users:
mvn -Pscalability gatling:test -Dgatling.sessionCount=500 -Dgatling.sessionStartInterval=140
-Dgatling.sessionRepeats=4 -Dgatling.baseUrl=http://xx.xxx.xxx.xx:8080
1,500 users:
mvn -Pscalability gatling:test -Dgatling.sessionCount=1500 -Dgatling.sessionStartInterval=140
-Dgatling.sessionRepeats=4 -Dgatling.baseUrl=http://xx.xxx.xxx.xx:8080
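The 2,500-user run is assumed to follow the same pattern as the runs above; the session start interval and repeat count below are assumptions, not taken from the original test logs:
mvn -Pscalability gatling:test -Dgatling.sessionCount=2500 -Dgatling.sessionStartInterval=140
-Dgatling.sessionRepeats=4 -Dgatling.baseUrl=http://xx.xxx.xxx.xx:8080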