SPS Project Analysis Report 2022


Software Performance and Scalability

Sello Stefano, 864851 - Volpe Davide, 862989


Abstract
The purpose of this project is to analyse and benchmark a self-made web application,
which provides end users with a simple interface to write and compile C++ programs. The
main topics of the analysis are the behaviour of the system under high traffic conditions
and the performance variations encountered when allowing a higher or lower number of
concurrent compilations.

Contents
1 Overview
  1.1 Application purpose and behaviour
  1.2 Application technology stack
  1.3 Server configuration
2 System components identification
3 Analysis
  3.1 Test design
  3.2 Test results
  3.3 Service demand and optimal number of customers
4 Java Modelling Tools
  4.1 Throughput
  4.2 Utilization
  4.3 Response Time
5 Future improvements
  5.1 Augment µb
  5.2 Reduce Vb


1 Overview
This first section presents the application's main intent along with the technology stack used
in the development, the benchmarking process, and the underlying infrastructure on which the
service has been deployed.

1.1 Application purpose and behaviour


The web application aims to provide a simple interface that captures user-defined C++ code,
sends it to a backend service that has it compiled by a secondary service, and returns the
compilation outcome to the end user. The flow is the following:
1. The user visits the web application through a browser.
2. The user provides some C++ code in the main text area; additionally, they can pass
custom flags for the compilation using the secondary input field. By clicking on the
compile button, a compilation request is sent to the backend service.

3. The backend service creates a file containing the user-provided code in a shared folder
located on the server; it then sends a request to the compiler service.
4. The compiler service checks whether the request coming from the backend service can be
executed: if the number of concurrent compilation jobs has reached a predefined value,
the incoming request waits until one of the running compilations ends.

5. When the compilation request has been processed, the compilation outcome is sent back
to the backend service.
6. The backend service returns the response to the frontend interface: if the compilation was
successful, a success message is displayed; otherwise, the standard error generated by the
compiler is displayed as well.

Fig. 1: The web interface of the application.


1 Overview 3

1.2 Application technology stack


The adopted technology stack consists of the following technologies:

• As frontend framework we used Vue.js, a widely used JavaScript framework.

• As backend framework we used Node.js; moreover, the code was written in TypeScript in
order to streamline development thanks to static type definitions.

• As compiler we used GCC version 8.3.0.

• As infrastructure technology we used Docker and Docker Compose, to develop and deploy
the services with ease and to ensure the portability of the software on the deployment
platform.
From a performance point of view, Node.js is based on JavaScript (which is single-threaded)
but implements a mechanism called the event loop that allows it to execute many concurrent
operations asynchronously. Thus, during development, we took care not to execute heavy
synchronous blocking operations.
Docker is also a technology that will eventually impact performance: running an application
inside a Docker container adds a layer of abstraction between the host operating system
and the container, which can lead to a small (but still measurable) performance overhead.
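The event-loop concern above can be illustrated outside Node.js as well; a minimal Python asyncio analogy, where a blocking call is offloaded so the loop stays responsive (the function names and the simulated workload are ours):

```python
import asyncio

def blocking_compile(source: str) -> str:
    # Stands in for a heavy synchronous call, e.g. invoking GCC.
    return f"compiled {len(source)} bytes"

async def handle_request(source: str) -> str:
    loop = asyncio.get_running_loop()
    # Offloading the blocking call to the default thread pool keeps the
    # event loop free to serve other requests, mirroring the Node.js rule
    # of never running heavy synchronous work on the loop itself.
    return await loop.run_in_executor(None, blocking_compile, source)

result = asyncio.run(handle_request("int main() { return 0; }"))
```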

1.3 Server configuration


The web services are hosted on an E2 Google Cloud Compute Engine virtual machine of type
micro. The server responds only to the HTTP protocol and exposes port 80 to reach the GUI
and port 8080 to reach the backend service. The compiler service does not expose any port to
the outside world, since it can only be reached by the backend service. The resources made
available by the E2 instance to these services are:

• CPU: Intel Xeon E5 v4 (Broadwell), x86_64, 2.2/2.8/3.7 GHz, 1 core

• Memory: QEMU DIMM RAM, 500 MiB

• Volume: QEMU hard disk, 20 GiB, 5400 rpm


Note that all these resources are shared across the frontend HTTP server and the two Node.js
services.

2 System components identification

Fig. 2: A graphical representation of the system components and their interactions

The system we are going to study is composed of three stations: the Thinking Room, the Web
Server and the Compiler. More in depth, we can model the three components as follows.

• The Thinking Room is an infinite-server station representing the time a user spends
inserting the input into the text field before submitting the compilation request. For
simplicity and for the purposes of this study, we assume its residence time is distributed
as an exponential random variable with a mean value of 5 seconds.

• The Web Server is responsible for handling the compilation requests: it receives the code
to be compiled within the request and creates a job for the compiler station. Since the
service is run by the Node.js engine, which is single-threaded, we assume that the
station serves requests with a FIFO policy. Moreover, we assume that its service time is
distributed as an exponential random variable.

• The Compiler is the station handling the compilation of the submitted code. It consists
of a Node.js service that implements a queue with an infinite population size and a service
room of size N. The compilation is executed by the host system through a call to GCC. The
same assumptions applied to the web server apply here too, so this station handles
requests in FIFO order and its service time is exponentially distributed.

There are some observations to make on this infrastructure model before digging into the analysis:
1. When the compilation process of a job ends, the outcome is sent to the Web Server compo-
nent, which will eventually forward it to the user's web interface. For the sake of clarity and
readability, we omitted this detail in the description of the system, letting the compiler
return its outcome directly to the user (this transition is represented by
the arrow exiting the Compiler and directed to the Thinking Room).
2. We actually implemented a fourth component: a web server whose only purpose is
serving the static page that provides the input field and the JavaScript function that
lets the end user make a compilation request. From a performance-evaluation point of
view this component is not interesting, because the requests made to it do not involve
any computation by the other components. Furthermore, the assets it provides can easily
be cached by the user's browser or by an intermediate cache. For these reasons the
performance of this service is not considered of interest for this paper's purposes.

3 Analysis
3.1 Test design
The test that we designed and developed is in closed-loop form, for the tsung load-testing
tool. With this test we are able to measure the service time of our components, in order to
analytically compute the optimal N, i.e. the maximum number of concurrent compilations
allowed. Our test consists of a single customer re-iterating a compilation request a certain
number of times, with different task weights. Since we needed to simulate different
compilation loads, the following C++ code has been used.

template <int i> class A {
    A<i-1> x;
    A<i-2> y;
};

template <> class A<0> { char a; };

template <> class A<1> { char a; };

int main(int argc, char **argv) {
    const int N = 30;  // must be const to be usable as a template argument;
                       // replaced by a random value in [1, 35] during the test
    A<N> b;
    return 0;
}
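A quick way to see why this template is expensive: A&lt;i&gt; contains both an A&lt;i-1&gt; and an A&lt;i-2&gt;, so the object sizes satisfy the Fibonacci recurrence and grow exponentially with N, forcing the compiler to build correspondingly larger structures. A sketch of the recurrence (sizes in bytes, assuming sizeof(char) == 1 and no padding):

```python
def size_of_A(n: int) -> int:
    # sizeof(A<0>) == sizeof(A<1>) == 1 (a single char);
    # sizeof(A<i>) == sizeof(A<i-1>) + sizeof(A<i-2>) otherwise.
    sizes = [1, 1]
    for i in range(2, n + 1):
        sizes.append(sizes[i - 1] + sizes[i - 2])
    return sizes[n]

light = size_of_A(10)   # a light compilation load
heavy = size_of_A(30)   # a heavy compilation load
```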

This piece of code is really useful for our purpose, since the compilation time increases
exponentially as N increases. Using the same code, we can obtain very different compilation
loads, from very low (values of N between 1 and 10) to very high (values between 30 and 35).
The session, then, has been designed as shown in the following tsung-specific configuration.

<for from="1" to="150" var="counter">
  <transaction name="think">
    <thinktime random='true' value='5'/>
  </transaction>

  <transaction name="req">
    <!-- This random value, changing at every request, replaces N in the
         C++ code, producing a different random compilation load for each request -->
    <setdynvars sourcetype="random_number" start="1" end="35">
      <var name="heavyness"></var>
    </setdynvars>
    <request subst="true">
      <http
        url='http://&ip;:8080/compile'
        version='1.1'
        contents='{&quot;code&quot;:&quot;<!-- c++ code -->&quot;,&quot;cflags&quot;:&quot;&quot;}'
        content_type='application/json'
        method='POST'>
      </http>
    </request>
  </transaction>
</for>

A single customer entering the system will then execute 150 iterations of such flow, with a
dynamic value for the variable N.

3.2 Test results


With the previously described closed-loop test, the following tables show the initial parameters
used to run the test and the final measures obtained from it.

Parameters
  Duration (min)         15
  Iterations per user    150
  Thinking time (ms)     5000
  Number of users        1

Tab. 1: Parameters used to run the tsung test on the S.U.T.

Measures
  Mean Web Server service time (ms)   2.825503356
  Mean Compiler service time (ms)     157.4765101
  Mean system time (ms)               280

Tab. 2: Measures obtained from the tsung test on the S.U.T.

In order to measure the Compiler and the Web Server service times separately, we introduced
a logging system that measures the service time of each request and writes it to a log file,
which was eventually used to retrieve the data needed.

3.3 Service demand and optimal number of customers


For the computation of the service demand we first need the relative visit ratio of each station
with respect to the Thinking Room (which is our reference station). In our case it is straight-
forward to notice that there is exactly one visit to each station of the system for each visit to
the reference station. Since the service demand is D_i = V_i / µ_i (where V_i is the relative
visit ratio of component i and 1/µ_i is its mean service time), the service demand of each
station in our system is equal to its service time.

To get the optimal number of customers, we first need to know which of the system's stations is
the bottleneck. By definition, the bottleneck of a queueing system is the station with the highest
service demand. As we can see from table 2, the station with the highest service demand (and
service time) is the Compiler. Given that, the optimal number of customers can be obtained
from the following formula, where D̄ is the total mean service demand, D̄_b is the bottleneck
service demand and Z̄ is the Thinking Room residence time.

    N_opt = (D̄ + Z̄) / D̄_b ≃ ((2.825503356 + 157.4765101) + 5000) / 157.4765101 ≃ 32.76870951

The optimal number of customers for our system is approximately 32.
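The computation above can be reproduced directly from the measures in table 2 (a small sketch; values in milliseconds as in the tables):

```python
# Service demands equal the measured mean service times (V = 1 for every station).
D_ws = 2.825503356      # Web Server demand (ms)
D_cmp = 157.4765101     # Compiler demand (ms) -- the bottleneck
Z = 5000.0              # Thinking Room mean residence time (ms)

N_opt = (D_ws + D_cmp + Z) / D_cmp
print(round(N_opt, 8))
```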

4 Java Modelling Tools


The Java Modelling Tools (JMT) suite has been used to model the S.U.T. and to predict the
capacity of our system. In particular, we used the MVA solver, configured as shown in the
images below.

(a) JMT/MVA step 1. (b) JMT/MVA step 2.

(c) JMT/MVA step 3. (d) JMT/MVA step 4.

Fig. 3: JMT/MVA solver configuration steps

After configuring the model appropriately with the parameters shown in table 1, a simulation
was run taking the Thinking Room as the reference station and considering an upper bound of
50 users. The results are presented in the following sections.
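The exact MVA recursion that the solver applies can be reproduced in a few lines (a sketch under our assumptions: one delay station with think time Z plus two single-server FIFO stations with the measured demands, in seconds; all variable names are ours):

```python
def mva(demands, Z, N_max):
    """Exact Mean Value Analysis for a closed product-form network with
    one delay station (think time Z) and single-server queueing stations."""
    Q = [0.0] * len(demands)   # mean queue lengths with 0 customers
    X = {}                     # system throughput per population size
    for n in range(1, N_max + 1):
        # Arrival theorem: an arriving customer sees the queues of the
        # same network with one customer less.
        R = [D * (1 + q) for D, q in zip(demands, Q)]
        X[n] = n / (Z + sum(R))
        Q = [X[n] * r for r in R]
    return X

D_ws, D_cmp, Z = 0.002825503356, 0.1574765101, 5.0
X = mva([D_ws, D_cmp], Z, 50)
```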

4.1 Throughput
Figure 4 shows the trend of the system throughput with respect to the growing number of users.
As expected, such simulated trend defaces around the quote of 32 users. After 35 users, it is
clear that the throughput has taken a descending direction. The maximum throughput we can
obtain is given by the following formula.

    X(N) ≤ min( N / (D̄ + Z̄), 1 / D̄_b )

From a certain N on, the upper bound is therefore 1/D̄_b, where D̄_b is the service demand of
the bottleneck. In our case this value is 1/0.1575 ≃ 6.349, which is our theoretical upper bound
for the throughput. This is however a theoretical trend: in a real-life application it is unlikely
to see the throughput remain constant as the number of customers keeps growing; it will
more likely start to drop near the upper bound, due to a congestion collapse condition.
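Both bounds, and the population at which they cross, can be checked numerically (a sketch with the demands in seconds; with the unrounded demand the saturation bound evaluates to ≈ 6.350, the 6.349 above following from rounding D̄_b to 0.1575):

```python
D_ws, D_cmp, Z = 0.002825503356, 0.1574765101, 5.0
D_total = D_ws + D_cmp

def x_upper_bound(n: float) -> float:
    # X(N) <= min(N / (D + Z), 1 / Db): light-load bound vs. bottleneck bound.
    return min(n / (D_total + Z), 1 / D_cmp)

# The two bounds intersect exactly at the optimal population N_opt.
n_cross = (D_total + Z) / D_cmp
```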

Fig. 4: Graphical representation of the throughput related to the number of customers

4.2 Utilization
Figure 5 shows the trend of the utilization of the two stations with respect to a growing number
of customers. As expected, the utilization of the Web Server remains very low, whereas the
utilization of the Compiler grows quickly until reaching its maximum around 45-50 simultaneous
users. This is an expected behaviour, since we know that the Compiler is the bottleneck of our
system and that its service time is far larger than the Web Server component's. If we take
into account the bottleneck law and consider the throughput upper bound obtained
in the previous paragraph as the throughput of the system, we can observe that the computed
utilization for the bottleneck station is very close to 1.

ρb = X × Db ≃ 6.349 × 0.1575 ≃ 1

Fig. 5: A graphical representation of the utilization related to the number of customers

4.3 Response Time

Fig. 6: Graphical representation of the system response time related to the number of customers

Figure 6 represents the trend of the system response time as the number of customers grows.
It is clear that near the optimal number of customers found previously the trend changes
from a sub-linear to a linear relation. In particular, for N → ∞ the response time approaches
an asymptote described by the following equation:

    R(N) ≈ N × D̄_b − Z̄   (for N → ∞)
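As a consistency check, this heavy-load asymptote meets the light-load response-time floor R = D̄ exactly at the optimal population computed in section 3.3 (a sketch with the measured values, in seconds):

```python
D_ws, D_cmp, Z = 0.002825503356, 0.1574765101, 5.0
D_total = D_ws + D_cmp

def r_asymptote(n: float) -> float:
    # Heavy-load asymptote of the system response time: R(N) = N * Db - Z.
    return n * D_cmp - Z

# Solving N * D_cmp - Z = D_total yields N = (D_total + Z) / D_cmp = N_opt.
n_star = (D_total + Z) / D_cmp
```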

5 Future improvements
As seen in the previous sections, our bottleneck station is the Compiler, which holds the
highest service demand. In order to improve performance we should find a way to increase the
optimal number of customers and, consequently, reduce the bottleneck's service demand.
Since the latter is expressed by D̄_b = V_b / µ_b, in order to reduce this index we could either:
1. increase the value of µ_b, which means improving the bottleneck station's service rate
(i.e. reducing its service time)
2. reduce V_b as a multiplicative factor, which means adding more compiler instances
alongside the existing one

Analysing the two options in more depth, we can make the following observations.

5.1 Augment µb
We know that the optimal number of users is given, in our specific case, by the following formula:

    N_opt = (D̄_ws + D̄_cmp + Z̄) / D̄_cmp = (1/µ_ws + 1/µ_cmp + Z̄) / (1/µ_cmp) = µ_cmp (E[s_ws] + Z̄) + 1

So we can observe that there is a linear correlation between N_opt and µ_cmp: as we increase
µ_cmp, N_opt increases proportionally. Be aware, however, that this relation holds only as long
as the compiler station remains the bottleneck, so a limit for this equation to hold is
µ_cmp < µ_ws.

5.2 Reduce Vb
Back to the service demand formula, we know that D̄_cmp = V_cmp / µ_cmp, but in our case
there is only one compiler service and every job arriving at the web server passes exactly
once through the compiler, so V_cmp = V_ws. If the system were modified to admit m identical
compilers, each with a relative visit ratio equal to 1/m and a service rate of µ_cmp, we can
rewrite the equation for the optimal number of customers in the following way:

    N_opt = (D̄_ws + Σ_{k=1..m} (1/m) D̄_cmp + Z̄) / ((1/m) D̄_cmp) = m (µ_cmp (E[s_ws] + Z̄) + 1)

So we can conclude that there is a direct proportionality between the optimal number of cus-
tomers and the number of compiler instances included in the system. However, this relation
holds only as long as the bottleneck is one of these instances, so an upper bound for m is
m < µ_ws / µ_cmp.

In conclusion, both methods are equally valid; the choice depends only on the availability of
more computational power for the single compiler service or on the possibility of adding more
compiler instances.
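The two options can be compared numerically (a sketch with the measured values, in seconds; it assumes the compiler remains the bottleneck in both cases):

```python
D_ws, D_cmp, Z = 0.002825503356, 0.1574765101, 5.0

def n_opt(speedup: float = 1.0, m: int = 1) -> float:
    """Optimal population with the compiler sped up by `speedup`
    (i.e. mu_cmp multiplied by it) and/or replicated into m instances."""
    d_cmp = D_cmp / speedup        # option 5.1: higher mu_cmp, lower demand
    per_instance = d_cmp / m       # option 5.2: visit ratio 1/m per instance
    return (D_ws + d_cmp + Z) / per_instance

base = n_opt()
# Doubling the instances exactly doubles N_opt; doubling the service rate
# gives 2 * N_opt - 1, in line with N_opt = mu_cmp (E[s_ws] + Z) + 1.
```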
