1 - Performance Modelling Introduction
Performance modeling
Prof. Danilo Ardagna
Performance
Storage
Processing
Dependability
MTTDL
Cloud Storage
Virtualization
• Computer performance:
  § the total effectiveness of a computer system, including throughput, individual response time and availability
  § it can be characterized by the amount of useful work accomplished by a computer system or computer network, compared to the time and resources used
Examples of observed performance indices
Techniques:
• Measurement-based
• Model-based
What is a model
[Diagram: an existing system or a system design is used to parametrize the model; solving the model yields the performance indices, whose quality is then evaluated]
Quality Evaluation techniques
• Measurement-based (direct measurement, benchmarking, prototype): we need to have the full implementation
• Model-based (simulation): we can work even during the design phase
• Hybrid techniques
Measurement techniques
• Direct measurement
  § the system behavior is measured under its real workload, using appropriate instrumentation and techniques
Model solution techniques
Little's Law: N = X · R
Utilization Law: U_k = X · D_k, where k is the index of the resource in our system.
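As a sketch, the two operational laws above can be applied directly. The throughput, response time and service demands below are hypothetical values chosen only for illustration.

```python
def little(X, R):
    """Little's Law: average population N = throughput X * response time R."""
    return X * R

def utilization(X, D_k):
    """Utilization Law: U_k = X * D_k for resource k with service demand D_k."""
    return X * D_k

X = 10.0         # observed system throughput, jobs/s (assumed)
R = 0.5          # average response time, s (assumed)
demands = {"cpu": 0.04, "disk": 0.08}   # service demands D_k, s/job (assumed)

N = little(X, R)                                  # average jobs in the system
U = {k: utilization(X, d) for k, d in demands.items()}
print(N)   # 5.0
print(U)   # the disk, with the largest demand, is the most utilized resource
```

Note that the resource with the largest demand D_k is the first to saturate as X grows, i.e. the bottleneck.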
Sample traces
State space and sample paths
Reference Models of this course: Queueing Networks
Queueing theory is the theory of what happens when many jobs compete for scarce resources, leading to long queues and delays.
• The service facility has one or more servers which can perform
the service required by customers
[Figure: a service center, composed of a buffer and a server; arrivals enter the queue and departures leave from the queue]
Service center
• Arrival
• Service
• Queue
• Population
Arrival of customers
Arrivals represent jobs entering the system: they specify how fast, how often, and which types of jobs the station serves.
Example: CPU
Service
The service part represents the time a job spends being serviced.
Service time distribution
Possible situations:
• a single server: the service facility only has the capability to
serve one customer at a time; waiting customers will stay in the
buffer until chosen for service; how the next customer is chosen
will depend on the service discipline
[Figure: a multi-server service center with c servers]
Service center
Queue
• If the buffer has finite capacity there are two alternatives for when the buffer becomes full:
  § either the fact that the facility is full is passed back to the arrival process, and arrivals are suspended until the facility has spare capacity (i.e. a customer leaves): the station stops receiving;
  § or arrivals continue, and arriving customers are lost (turned away) until the facility has spare capacity again: the station loses new arrivals.
When one of the jobs currently in service leaves the system, one of the jobs in the queue can enter the now free service center.
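The second alternative (lost arrivals) can be sketched with a minimal discrete-event simulation: a single exponential server with a finite buffer, where arrivals finding the buffer full are turned away. The rates, capacity and event count below are illustrative assumptions.

```python
import random

def simulate_loss_queue(lam=1.0, mu=1.2, K=5, n_events=100_000, seed=42):
    """Single server, system capacity K; returns the empirical loss probability."""
    rng = random.Random(seed)
    n, lost, arrived = 0, 0, 0
    t_arr = rng.expovariate(lam)         # time of the next arrival
    t_dep = float("inf")                 # no departure scheduled while empty
    for _ in range(n_events):
        if t_arr < t_dep:                # next event is an arrival
            arrived += 1
            if n == K:
                lost += 1                # buffer full: customer is turned away
            else:
                n += 1
                if n == 1:               # server was idle: start service now
                    t_dep = t_arr + rng.expovariate(mu)
            t_arr += rng.expovariate(lam)
        else:                            # next event is a departure
            n -= 1
            t_dep = t_dep + rng.expovariate(mu) if n > 0 else float("inf")
    return lost / arrived

print(simulate_loss_queue())
```

With these rates the empirical loss probability comes out close to the analytical value for a birth-death queue with capacity 5.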
Population
• When the customer population is not homogeneous, we divide it into classes whose members all exhibit the same behaviour
• The buffer has 13 places, including the place occupied by the packet being transmitted; packets that arrive when the buffer is full are lost
Making exponential assumptions about the arrival rate and the service rate
we would model the gateway as:
[Figure: the resulting model of the gateway — a single queue with exponential arrivals and service and a finite buffer of 13 places]
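Under the exponential assumptions above, the gateway corresponds to an M/M/1/K queue with K = 13 places (counting the packet in transmission). The closed-form probability that an arriving packet finds the buffer full follows from the birth-death steady-state distribution; the arrival and transmission rates below are illustrative.

```python
def mm1k_loss_probability(lam, mu, K):
    """P(arriving packet is lost) for an M/M/1/K queue with rho = lam/mu."""
    rho = lam / mu
    if rho == 1.0:
        return 1.0 / (K + 1)            # all K+1 states equally likely
    return (1 - rho) * rho**K / (1 - rho**(K + 1))

# Illustrative rates: 80 packets/s offered, 100 packets/s transmission capacity
print(mm1k_loss_probability(lam=80.0, mu=100.0, K=13))
```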
Queueing Networks
Single queues vs. networks of queues: expressiveness
[Diagram: a SYSTEM decomposed into SUBSYSTEMs, each of which can itself be modelled as a queue or a network of queues]
Network of queues
[Figure: a service centre (buffer plus server) with arrivals to the system and departures from the system; after service a job leaves the system with probability 1 − p, or is fed back to the queue with routing probability p]
• Arrival
• Service
• Queue
• Population
• Routing
Routing
Round robin
§ in the round robin policy, the destination chosen by the job rotates among all the possible exits
Queuing Networks - part 1, POLITECNICO DI MILANO
[Figure: round robin routing — successive jobs 1, 2, 3, 4, 5, 6, 7, … are sent in turn to each of three destination queues]
Join the shortest queue
§ in the Join the Shortest Queue (JSQ) policy, jobs can query the queue length of the possible destinations and choose to move to the one with the smallest number of jobs waiting to be served
[Figure: JSQ example — three destination queues holding N_A = 3, N_B = 6 and N_C = 4 jobs; the arriving job joins the shortest queue, A]
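The two routing policies can be sketched in a few lines. The queue lengths match the example above; the destination names are illustrative.

```python
from itertools import cycle

def make_round_robin(destinations):
    """Round robin: each new job goes to the next destination in turn."""
    rotation = cycle(destinations)
    return lambda queues: next(rotation)   # ignores queue state by design

def jsq(queues):
    """Join the Shortest Queue: pick the destination with fewest jobs waiting."""
    return min(queues, key=queues.get)

queues = {"A": 3, "B": 6, "C": 4}          # current queue lengths (as in the figure)
rr = make_round_robin(["A", "B", "C"])
print([rr(queues) for _ in range(4)])      # ['A', 'B', 'C', 'A']: rotates among exits
print(jsq(queues))                         # 'A': the shortest queue
```

Round robin is stateless with respect to the queues (it only remembers its own rotation), while JSQ must observe the queue lengths at every decision: the classic trade-off between simplicity and load awareness.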
Provide a QN model of the system and evaluate the overall throughput, considering that the network delay is negligible with respect to the other devices, in two different cases:
1) The only thing we know is that each server should be visited by the application.
2) In the second case, we know that after visiting the web server the application requires some operations at the application server; it can then go back to the web server and leave the system, or require service at the DBMS and then go back to the application server.
Open Networks
[Figure: an open network — request arrivals enter from a Source, flow through the Web Server, Application Server and DBMS Server, and completed requests leave through a Sink]
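For the open model, operational analysis gives the throughput directly: as long as no station is saturated, the system throughput equals the arrival rate, and the station with the largest service demand D_k = V_k · S_k bounds the sustainable load. The visit counts, service times and arrival rate below are hypothetical values for illustration only.

```python
# (visits V_k, service time S_k in s) per station — assumed for the example
stations = {
    "web":  (1.0, 0.010),
    "app":  (2.0, 0.015),
    "dbms": (1.0, 0.030),
}
demands = {k: v * s for k, (v, s) in stations.items()}   # D_k = V_k * S_k
X_max = 1.0 / max(demands.values())       # saturation throughput, jobs/s

lam = 20.0                                # offered arrival rate, jobs/s (assumed)
X = lam if lam < X_max else X_max         # sustainable throughput
utils = {k: X * d for k, d in demands.items()}   # Utilization Law per station
print(X_max)
print(utils)
```

With these numbers the app server and DBMS tie as bottlenecks (D = 0.03 s), so the network cannot sustain more than about 33 requests/s regardless of the arrival rate.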
Scenario 1: Tandem networks
[Figure: Local Clients connected through a Local Network to the server tiers]
A client server system, with a finite number of customers, which is architected with three
tiers: the first one includes one web server, the second tier includes one application server
and the third one includes a database server.
Provide a QN model of the system and evaluate the system throughput, considering that the network delay is negligible with respect to the other devices. Model the two different cases previously described.
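Since the population here is finite (a closed network), the standard solution technique is exact Mean Value Analysis (MVA), which applies Little's Law recursively on the number of customers. A minimal single-class sketch, with assumed service demands and think time:

```python
def mva(demands, Z, N):
    """Exact single-class MVA for a closed product-form network.
    demands: service demands D_k per station; Z: think time; N: customers.
    Returns the system throughput X(N)."""
    Q = [0.0] * len(demands)              # mean queue lengths, empty network
    X = 0.0
    for n in range(1, N + 1):
        # residence time at k: arriving job finds Q_k(n-1) jobs ahead of it
        R = [d * (1 + q) for d, q in zip(demands, Q)]
        X = n / (Z + sum(R))              # Little's Law on the whole system
        Q = [X * r for r in R]            # Little's Law on each station
    return X

# web, app, db demands in seconds (illustrative), 5 s think time, 20 clients
print(mva([0.010, 0.030, 0.030], Z=5.0, N=20))
```

At light load the throughput approaches N / (Z + ΣD_k); at heavy load it is capped by 1 / max D_k, the bottleneck bound.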
Closed Networks (first model)
[Figure: two closed-network models, a) and b), with Local Clients and the Web server; in c) the Web server is refined into CPU, RAM, Cache1/Disk1 and Cache2/Disk2]
Level of Detail
[Figure: the same web server modelled at increasing levels of detail — a), b) with Disk1, and c) with RAM, Cache1 and Disk1]
Expressiveness and Extended Queueing Networks
[Figure: (a) an open network with a fork-join service center and waiting buffer; (b) a finite cap…]
Expressiveness and Extended Queueing Networks
• Defections from the queue: routers often set a maximum limit on the time that a packet or request is allowed to stay in a queue. Packets may thus be dropped from the queue, on the assumption that they have already been retransmitted by higher protocol layers.
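Defections can be sketched with a deterministic FIFO server that, when a packet would start service, drops it if it has already waited longer than a deadline T. The arrival times, service time and deadline are illustrative assumptions.

```python
def serve_with_deadline(arrivals, service, T):
    """FIFO single server; packets waiting longer than T defect (are dropped).
    Returns (served, dropped) lists of packet arrival times."""
    served, dropped = [], []
    t = 0.0                                  # current time at the server
    for a in sorted(arrivals):
        t = max(t, a)                        # server may sit idle until arrival
        if t - a > T:
            dropped.append(a)                # waited too long: defects from queue
        else:
            served.append(a)
            t += service                     # transmit the packet
    return served, dropped

s, d = serve_with_deadline(arrivals=[0.0, 0.1, 0.2, 0.3, 5.0],
                           service=1.0, T=2.0)
print(s)   # the packet arriving at 0.3 has waited 2.7 s and is dropped
print(d)
```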