337 Simulation and Modelling
Simulation Modeling and Analysis, A.M. Law and W.D. Kelton, McGraw-Hill, 2000
Probabilistic Modelling, I. Mitrani, Cambridge University Press, 1998
A Compositional Approach to Performance Modelling (first three chapters), J. Hillston, Cambridge University Press, 1996. On-line at: http://www.doc.ic.ac.uk/~jb/teaching/336/1994hillston-thesis.ps.gz
Probability and Statistics with Reliability, Queuing and Computer Science Applications, K. Trivedi, Wiley, 2001
Course Overview
This course is about using models to understand performance aspects of real-world systems. Models have many uses, typically:

To understand the behaviour of an existing system (why does my network performance die when more than 10 people are at work?)

To predict the effect of changes or upgrades to the system (will spending 100,000 on a new switch cure the problem?)

To study new or imaginary systems (let's bin the Ethernet and design our own scalable custom routing network!)

There are many application areas, e.g. computer architecture, networks/distributed systems, software performance engineering, telecommunications, manufacturing, healthcare, transport, finance, environmental science/engineering. Here, we'll focus on the important common principles, rather than become entrenched in specific application areas.
The Focus...
We're going to focus on discrete event systems, with continuous time and discrete states. Our models will essentially be stochastic transition systems where delays are associated with each state. Different high-level formalisms can be used to define these models; we'll study some popular approaches. These models can variously be solved using: simulation, numerical algorithms, or analytical methods. Broadly, these are in increasing order of efficiency and decreasing order of generality.
This course...
Part I: Models
1. Modelling overview (ajf)
   Examples and motivation
   Performance measures and their relationships
2. Timed transition systems (ajf)
   Markov Processes (MPs)
   Generalised Semi-Markov Processes (GSMPs)
3. Solution methods (ajf)
   Numerical solution of MPs
   Analytical solution of MPs
   Simulation of MPs and GSMPs

Part II: High-level formalisms, tools and techniques
1. Discrete-event simulation (ajf)
   Libraries for discrete-event simulation (Java)
   Random number generation & distribution sampling
   Output analysis
   Applications
2. Stochastic Petri Nets (jb)
   Notation
   Generation of underlying MP
   Applications
3. Stochastic Process Algebras (jb)
   Introduction to PEPA
   Generation of underlying MP
   Applications
4. Markovian Queueing Networks (jb)
Modelling Overview
Example 1: A Simple TP server
A TP system accepts and processes a stream of transactions, mediated through a (large) buffer:

[Figure: transactions arriving at a buffer feeding the TP server]

The arrival rate is 15 tps. The mean service time per transaction is 60 ms. The current mean response time is around 350 ms.

Q: What happens to the mean response time if the arrival rate increases by 10%?

Transactions arrive randomly at some specified rate; the TP server is capable of servicing transactions at a given service rate.

Q: If both the arrival rate and service rate are doubled, what happens to the mean response time?
[Figure: customers arriving at a bank of 5 security scanners, one marked BROKEN]

[Figure: a central-server model with arrivals, a CPU, Disk 1 and Disk 2]
Around 3.8 customers pass through the terminal each second. It takes about 30 minutes on average to get through security.

Q: How long would it take on average if all 5 scanners were working?

On average, each submitted job makes 121 visits to the CPU, has 70 disk 1 accesses and 50 disk 2 accesses. The mean service times are 5 ms for the CPU, 30 ms for disk 1 and 27 ms for disk 2.

Q: How would the system response time change if we replace the CPU with one twice as fast?
There are three job types, numbered 0, 1 and 2, which occur with probability 0.3, 0.5 and 0.2 respectively. Jobs arrive at the rate of 4/hr and visit the stations in the order:

Type 0: 2, 0, 1, 4
Type 1: 3, 0, 2
Type 2: 1, 4, 0, 3, 2

[Figure: a network of queueing stations 0 to 4]

The mean time for each job type at each station, in visit order, is:

Type 0: 0.50, 0.60, 0.85, 0.50
Type 1: 1.10, 0.80, 0.75
Type 2: 1.10, 0.25, 0.70, 0.90, 1.00
Q: If you were to invest in one extra machine, where would you put it?
There are many techniques one can use to model such systems. Simple models involve nothing more than back-of-the-envelope calculations; the most complex model is a full-blown simulation. A tiny taster...
Two Simple Experiments with Random Numbers Example 1: The Birthday Bet
You meet n people at a party you've never met before. You bet one of them that at least two of the n people have the same birthday (e.g. 5th March). (Assume birthdays are uniformly distributed and ignore leap years.)

Question: How big should n be for your winning chances to be > 50%? Alternatively, what is the probability, pn say, that two or more people in a random group of n people have the same birthday?
Under the assumptions of the model, there is an analytical solution:

pn = 1 − ∏_{i=1}^{n−1} (365 − i)/365

Amazingly, p22 = 0.476 and p23 = 0.507, so the break-even point is 23. If we couldn't do the math, we could instead simulate the scenario by generating m sets of n random birthdays (random integers between 0 and 364) and counting the number of sets c that contain a duplicate. If the result of each of the m experiments is 0 or 1, then each outcome is an observation or estimate of pn, albeit a very bad one! Of course, a better estimate is c/m.
public class birthday {
  public static void main( String args[] ) {
    int n = Integer.parseInt( args[0] ) ;
    int m = Integer.parseInt( args[1] ) ;
    int b = 0, c = 0 ;
    for ( int i = 0 ; i < m ; i++ ) {
      int tot[] = new int[365] ;
      for ( int j = 0 ; j < n && tot[b] < 2 ; j++ ) {
        b = (int)( Math.random() * 365 ) ;
        tot[b]++ ;
        if ( tot[b] == 2 )
          c++ ;
      }
    }
    System.out.println( (float)c / m ) ;
  }
}
Here are the averages of 5 runs of the program for different values of n and m:

  m       n=5    n=10   n=15   n=20   n=25   n=30
  10^0    0.000  0.000  0.200  0.200  0.400  0.200
  10^1    0.000  0.080  0.300  0.460  0.520  0.840
  10^2    0.028  0.100  0.236  0.482  0.580  0.660
  10^3    0.024  0.117  0.250  0.402  0.567  0.716
  10^4    0.028  0.117  0.255  0.414  0.565  0.710
  10^5    0.027  0.117  0.254  0.412  0.569  0.707

Let's home in on n = 22, 23 and 24 and look at the 95% confidence intervals (see later) for the mean, using m = 10^5:

  n     Mean    95% Confidence Interval
  22    0.476   (0.475, 0.478)
  23    0.508   (0.506, 0.510)
  24    0.538   (0.537, 0.539)
Example 2: The Bombing Mission

A bomber drops a stick of 10 bombs at an island target; each impact point scatters about the aiming point with Normally distributed errors (standard deviations 180 m and 100 m in the x and y directions respectively).

[Figure: the island modelled as a polygon, with vertices including (170,110)]
In general, this cannot be solved analytically; we must resort to simulation. The island will be modelled as a polygon. We will simulate a number, m, of missions and estimate the average number of hits per mission. The impact point of each bomb is simulated by sampling the two Normal distributions (details later). The hit/miss status reduces to a point-in-polygon test...
public class Bombers {
  static boolean isOnTarget( double x, double y ) {
    double[] xs = {-280,10,180,170,40,-180,-140} ;
    double[] ys = {-80,-140,-60,110,200,100,-30} ;
    int ncrosses = 0, n = xs.length - 1 ;
    for ( int i = 0 ; i < xs.length ; i++ ) {
      if ( ( ( ys[i] <= y && y < ys[n] ) ||
             ( ys[i] > y && y >= ys[n] ) ) &&
           x < xs[i] + ( xs[n] - xs[i] ) * ( y - ys[i] ) / ( ys[n] - ys[i] ) )
        ncrosses++ ;
      n = i ;
    }
    return ( ncrosses % 2 == 1 ) ;
  }
  public static void main( String args[] ) {
    int m = Integer.parseInt( args[0] ) ;
    int hits = 0 ;
    for ( int i = 0 ; i < m ; i++ ) {
      for ( int j = 0 ; j < 10 ; j++ ) {
        double x = Normal.normal( 0, 180 ) ;
        double y = Normal.normal( 0, 100 ) ;
        if ( isOnTarget( x, y ) )
          hits++ ;
      }
    }
    System.out.println( (float)hits / m ) ;
  }
}
Here are the results of 5 runs of the program for different numbers of simulated missions, m = 10^1, 10^2, 10^3, 10^4 and 10^5:

[Table: average hits per mission for each of the 5 runs at each value of m]
This is what happens if the standard deviations are reduced by 30% (to 126 m and 70 m respectively), for 10^5 missions:

  m       Run 1   Run 2   Run 3   Run 4   Run 5
  10^5    7.679   7.675   7.672   7.676   7.675
By averaging the results for m = 10^5, the estimated improvement in hit rate is approximately 44%.
Fundamental Laws
Consider the following:
[Figure: a system observed over time, with Arrivals entering and Completions leaving]

Let's assume we observe the system for time T and let:

the number of arrivals be A
the number of completions be C
the total time the system is busy be B

From these we can define a number of useful performance measures and establish relationships between them:

The Arrival Rate is λ = A/T
The Throughput is X = C/T
The Utilisation is U = B/T
The average Service Demand of each job (or average service time) is S = B/C

Note: the net service rate is μ = 1/S = C/B. Now some Fundamental Laws (or Operational Laws)...
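The operational quantities above can be computed directly from observed counts. The following is a minimal sketch; the observation values (T, A, C, B) are illustrative assumptions, not from the notes:

```java
// Computes the operational quantities lambda, X, U, S and mu from an
// observation period. The numbers used in main() are illustrative only.
public class OperationalLaws {
    // Returns {lambda, X, U, S, mu} for an observation of length T with
    // A arrivals, C completions and total busy time B
    static double[] measures(double T, double A, double C, double B) {
        double lambda = A / T;   // arrival rate
        double X = C / T;        // throughput
        double U = B / T;        // utilisation
        double S = B / C;        // average service demand per job
        double mu = 1.0 / S;     // net service rate (= C/B)
        return new double[] { lambda, X, U, S, mu };
    }

    public static void main(String[] args) {
        double[] m = measures(60.0, 120.0, 118.0, 42.0);
        System.out.printf("lambda=%.3f X=%.3f U=%.3f S=%.4f mu=%.3f%n",
                          m[0], m[1], m[2], m[3], m[4]);
    }
}
```

Note that U = (B/T) = (C/T)(B/C) = X·S falls out of the definitions, which is the Utilisation Law used below.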
The Utilisation Law: since U = B/T = (C/T)(B/C), we have

U = X S

Sometimes we work with service rates rather than times, in which case we have U = λ/μ (at equilibrium X = λ). Importantly, note that U < 1, so we require μ = 1/S > λ. Q: What if λ = μ?

Little's Law

Suppose we plot (A − C), the number of jobs in the system, over the observation time T:

[Figure: a step function of (A − C) against time, jumping between 0 and 4 at event times t1, ..., t11; the total area under the curve is I]
The average number of jobs in the system is N = I/T. The average time each job spends in the system (the average waiting time) is W = I/C. Since I/T = (C/T)(I/C) we have:

N = λW

Note: this assumes (A − C) is zero at either end of the observation period, or that C is much larger than the residual populations at either end.

Little's Law can be applied to any system (and subsystem) in equilibrium, e.g. a CPU on its own, or a set of clients and subservers:

[Figure: clients connected to a CPU and a bank of subservers; Little's Law applies to the whole system and to each subsystem]
The Interactive Response Time Law: for N clients with think time Z,

W = N/X − Z
The Forced Flow Law:

Xk = Vk X

where Xk is the throughput of resource k and Vk is the number of visits a job makes to resource k.
[Figure: a timeline of events E1, E2, E3, E4, E5, ... occurring at times t1, t2, t3, t4, t5, ...; the state is piecewise constant between events]

These are called discrete-event systems. Note that the state doesn't change between events (transitions).
We need to know the distribution of the time the system spends in a given state. The state holding times might be based on measurement, or assumed to have a known mathematical distribution:

[Figure: a state s with a measured/assumed holding-time density function over t]

In order to understand the key issues, consider a very simple example: a single server with a FIFO queue, arrival rate λ and service rate μ.

The simplest way to represent the state of the system is to identify a state with each queue population (0, 1, 2, ..., N):

[Figure: a chain of states 0, 1, ..., N−1, N, with 'arrival' transitions moving one state to the right and 'completion' transitions moving one state to the left]

Transitions are triggered by events, here job arrival and job completion.
With no time delays, this is just a Labelled Transition System (LTS), cf. FSP. It's not useful for performance analysis! To add state holding times in a completely general way we associate clocks with events:

[Figure: the queue's transition system annotated with clocks Ca and Cc; each transition is labelled with the clocks it sets, e.g. 'arrival (set Ca, Cc)' out of state 0, 'arrival (set Ca)' and 'completion (set Cc)' in intermediate states, and 'completion (set Ca, Cc)' out of state N]

Events that are active in a given state compete to trigger the next transition, via expiry of their clocks. Events may schedule other events by setting their clocks during the transition.

Clocks run down (to 0) at a state-dependent rate. The event whose clock reaches 0 first in any state triggers the next transition. A clock-setting function sets the clocks for new events that are scheduled during the transition. The clocks of old events, i.e. those that are active in both the old and new states, continue to run down. Events that were active in the old state, but not the new, are cancelled.
Formally, a GSMP is defined by:

E: the set of events; S: the set of states; E(s): the set of events active in state s
p(s′; s, e): the probability that the next state is s′ when event e occurs in state s
F(x; s′, e′, s, e): the distribution function used to set the clock for new event e′ when event e triggers a transition from s to s′
r(s, e): the rate at which the clock for event e ∈ E(s) runs down in state s
S0, F0: the distribution functions used to assign the initial state and to set the clocks for all active events in that state

[Figure: Venn diagram of E(s) and E(s′), distinguishing old and new events]

The old events are those that remain active: O(s′; s, e) = E(s′) ∩ (E(s) − {e}). The new events are N(s′; s, e) = E(s′) − (E(s) − {e}).

Note: this only allows one event to trigger a transition; GSMPs in general allow multiple events to occur simultaneously.

Back to the example. We have:

E = {a, c} where a = arrival, c = completion
S = {0, 1, 2, ..., N}
E(0) = {a}, E(s) = {a, c} for 0 < s < N, E(N) = {c}
p(s + 1; s, a) = 1, p(s − 1; s, c) = 1; 0 elsewhere
F(x; s′, a, s, e) = Fa(x)
F(x; s′, c, s, e) = Fc(x)
S0(0) = 1, S0(s ≠ 0) = 0; F0(x; a, 0) = Fa(x)
For all s ∈ S, r(s, e) = 1 for all e ∈ E(s)
The state at time t defines a continuous-time process, X(t) say; formally, this is the GSMP. Note that the clocks in state s ∈ S determine how long the process remains in state s, prior to the next transition, e.g.

[Figure: a sample path of X(t), a step function of time; at the marked instant t, X(t) = 3]

A sample path of a GSMP is a particular (random) traversal of the transition system. To start it off, we use S0 to select an initial state s ∈ S; for each event e ∈ E(s) we use F0(x; s, e) to initialise Ce. A sample path is then generated by repeating the following steps:

1. Find tnext = min{ Ce / r(s, e) | e ∈ E(s) } and let e be the winning event
2. Determine the next state s′ using p(·; s, e)
3. For each o ∈ O(s′; s, e), reduce Co by tnext · r(s, o)
4. For each n ∈ N(s′; s, e), set Cn using F(x; s′, n, s, e)
5. Set s = s′
Example: the infinite-capacity single-server queue in Java (i.e. as above but with N = ∞). Assume the inter-arrival times are uniform-random between 0 and 1 (i.e. U(0,1)) and that service times are fixed at 0.25...
class ssqClocks2 {
  public static void main( String args[] ) {
    int s, sPrime ;
    double now = 0.0 ;                        // Keeps track of time
    double arrClock = 0.0 ;
    double compClock = 0.0 ;
    double stopTime = Double.parseDouble( args[0] ) ;

    s = 0 ;
    arrClock = Math.random() ;
    while ( now < stopTime ) {
      if ( s == 0 ) {                         // E(0) = {arrival}
        now += arrClock ;                     // Note: all rates are 1
        sPrime = 1 ;                          // p(1;0,arrival) = 1
                                              // O(1;0,arrival) = {}
                                              // N(1;0,arrival) = {arrival,completion}
        arrClock = Math.random() ;
        compClock = 0.25 ;
      } else if ( arrClock < compClock ) {    // Arrival event
        now += arrClock ;
        sPrime = s + 1 ;                      // p(s+1;s,arrival) = 1
        compClock -= arrClock ;               // O(s+1;s,arrival) = {completion}
        arrClock = Math.random() ;            // N(s+1;s,arrival) = {arrival}, s>0
      } else {                                // Completion event
        now += compClock ;
        sPrime = s - 1 ;                      // p(s-1;s,completion) = 1
        arrClock -= compClock ;               // O(s-1;s,completion) = {arrival}, s>0
        if ( sPrime > 0 )                     // N(s-1;s,completion) = {completion}, s>0
          compClock = 0.25 ;                  //                     = {},           s=0
      }
      s = sPrime ;
    }
  }
}
The above code is an example of a discrete-event simulation. With completely general clock-setting functions, simulation is the only way to analyse a GSMP. The code above doesn't accumulate any measures, but we can easily keep track of, e.g.: the time the system spends in each state; the area under some function of the state (see the example under Little's Law earlier); the average time between leaving one set of states and entering another; and so on. From these, useful performance measures can be computed.
Another example: the Machine Repair Problem. There are N machines in a factory that are subject to periodic failure. Machines that fail are serviced in fail-time order by a repairer. The individual times-to-breakage (B) and repair times (R) are each assumed to be independent and identically distributed. This can be modelled as a tandem queueing system:

[Figure: two queues in tandem, working machines and broken machines, with a fixed total population of N circulating between them]
Because the population (N) is fixed, the state can be described uniquely by the population of one queue (broken machines, say). We need only two events: break and repair, say. Let Fb(x) = P(B ≤ x) be the distribution function for the time to breakage of a given machine, and Fr(x) = P(R ≤ x) be that for the repair time. A GSMP for this system is as follows:

E = {break, repair}
S = {0, 1, ..., N}
E(0) = {break}, E(N) = {repair}, E(s) = {break, repair} for 0 < s < N
p(s + 1; s, break) = 1, 0 ≤ s < N
p(s − 1; s, repair) = 1, 0 < s ≤ N
p(s′; s, e) = 0 for all other s′, s, e
F(x; s′, break, s, e) = Fb(x) = P(B ≤ x)
F(x; s′, repair, s, e) = Fr(x) = P(R ≤ x)
r(s, e) = 1, 0 ≤ s ≤ N, e ∈ E(s)

Q: What are the new events, N(s′; s, e), in each state s′ when event e triggers a transition from state s? Exercise: sketch the Java code for generating sample paths from the GSMP in this case.
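One possible answer to the exercise, in the style of ssqClocks2 above. The breakage distribution U(0,2), the fixed repair time 0.5 and N = 5 machines are illustrative assumptions only; the state s is the number of broken machines:

```java
// Sample-path generator for the machine repair GSMP (a sketch).
// Assumed: N = 5, time-to-breakage ~ U(0,2), repair time fixed at 0.5.
import java.util.Random;

public class MachineRepair {
    static int simulate(double stopTime, Random rng) {
        final int N = 5;
        double now = 0.0;
        int s = 0;                                   // all machines working
        double breakClock = 2.0 * rng.nextDouble();  // sample Fb = U(0,2)
        double repairClock = 0.0;

        while (now < stopTime) {
            if (s == 0) {                            // E(0) = {break}
                now += breakClock;
                s = 1;                               // N(1;0,break) = {break,repair}
                breakClock = 2.0 * rng.nextDouble();
                repairClock = 0.5;
            } else if (s == N) {                     // E(N) = {repair}
                now += repairClock;
                s = N - 1;                           // N(N-1;N,repair) = {break,repair}
                breakClock = 2.0 * rng.nextDouble();
                repairClock = 0.5;
            } else if (breakClock < repairClock) {   // break's clock expires first
                now += breakClock;
                repairClock -= breakClock;           // old event keeps running down
                s = s + 1;
                if (s < N) breakClock = 2.0 * rng.nextDouble();
            } else {                                 // repair's clock expires first
                now += repairClock;
                breakClock -= repairClock;           // old event keeps running down
                s = s - 1;
                if (s > 0) repairClock = 0.5;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        int s = simulate(Double.parseDouble(args[0]), new Random());
        System.out.println("final state: " + s);
    }
}
```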
Distribution Sampling
Sample path generation (discrete-event simulation) depends on the ability to sample distributions, e.g. of: inter-arrival times, service/repair times, message lengths, batch sizes, etc. E.g. in computing a sample path of a GSMP we need to sample such distributions in order to set the clocks. Where possible, we sample from a mathematical distribution, but if this is not possible we must use empirical data.
Java's Math.random() uses an MCG with a 48-bit seed. Modern generators combine several MCGs or generalise the MCG approach, e.g. the Mersenne Twister (period 2^19937 − 1)!
We'll assume the ability to generate U(0,1) samples by any standard method (we'll use Java's Math.random()) and take it from there. Recall that a continuous random variable, X say, can be characterised by its density and/or distribution functions:

F(x) = ∫_{−∞}^{x} f(y) dy = P(X ≤ x)

1. The (Inverse) Transform Method

Suppose X is a continuous r.v. with cdf F(x) = P(X ≤ x) and that we are trying to sample X. A curious but useful result is that if we define R = F(X) then R ∼ U(0,1); or, given R ∼ U(0,1), sample X from F⁻¹(R):

[Figure: a cdf F(x), with a value r on the y-axis mapped back to x = F⁻¹(r)]

Algorithm: sample U(0,1) giving some value 0 ≤ R ≤ 1, then compute F⁻¹(R). Of course, this only works if F(x) has an inverse!

Example: U(a,b). If X ∼ U(a,b) then F(x) = (x − a)/(b − a), a ≤ x ≤ b. Setting R = F(X) and inverting F gives X = F⁻¹(R) = R(b − a) + a. This confirms what we (should!) already know: if R ∼ U(0,1), then R(b − a) + a ∼ U(a,b).

Example: exp(λ). If X ∼ exp(λ) then F(x) = 1 − e^{−λx}, x ≥ 0. Setting R = F(X) and inverting, we get:

R = 1 − e^{−λX}
1 − R = e^{−λX}
log_e(1 − R) = −λX
X = −log_e(1 − R)/λ

So, if R ∼ U(0,1), then −log_e(1 − R)/λ ∼ exp(λ) (this is also the case for −log_e(R)/λ, since (1 − R) ∼ U(0,1)).
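The exponential inversion above is a one-liner in Java. A minimal sketch (class and method names are for illustration):

```java
// Inverse-transform sampler for exp(lambda): X = -log(1 - R)/lambda
import java.util.Random;

public class ExpSampler {
    static double expSample(Random rng, double lambda) {
        // rng.nextDouble() returns R ~ U(0,1)
        return -Math.log(1.0 - rng.nextDouble()) / lambda;
    }

    public static void main(String[] args) {
        Random rng = new Random(12345);
        double lambda = 2.0, sum = 0.0;
        int n = 1000000;
        for (int i = 0; i < n; i++) sum += expSample(rng, lambda);
        // Sample mean should be close to 1/lambda = 0.5
        System.out.println(sum / n);
    }
}
```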
2. The Rejection Method

If F(x) cannot be inverted we can sometimes work with the density function f(x). The rejection method is a Monte Carlo algorithm (i.e. based on random sampling): we enclose f(x) in a box and then throw darts(!)

[Figure: a density f(x) enclosed in the box [a,b] × [0,m]; the dart (X1,Y1) falls below the curve and is accepted, (X2,Y2) falls above it and is rejected]

1. Find a, b and m that bound all (or most) of f(x)
2. Pick two random samples X ∼ U(a, b) and Y ∼ U(0, m)
3. If the point (X, Y) lies beneath the curve of f(x), accept X; otherwise reject X and repeat from step 2

Intuitively, the method works because the smaller f(X) is, the less likely you are to accept X. More rigorously, we need to show that the density function of the values of X that we accept is precisely f(x), i.e. we need to show that:

P(x < X ≤ x + dx | Y < f(X)) = f(x) dx

Using the formula for conditional probability, this becomes:

P(x < X ≤ x + dx, Y < f(X)) / P(Y < f(X)) = (dx/(b − a) · f(x)/m) / (1/(m(b − a))) = f(x) dx

The efficiency depends on the number of rejections R before accepting a value of X. The probability of accepting X in any one experiment, p say, is simply the ratio of the area under the curve (which is 1) to that of the box, i.e.

p = 1/(m(b − a))

Since each experiment is independent, R is geometrically distributed: P(R = r) = p(1 − p)^r. If f has very long tails, p will be small, although it is possible to divide the x-axis into regions and treat each separately.
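The three steps above can be sketched directly in Java. The density f(x) = 2x on [0,1] is an illustrative choice (not from the notes), giving a = 0, b = 1, m = 2 and acceptance probability p = 1/2:

```java
// Rejection sampling from the (assumed) density f(x) = 2x on [0,1],
// boxed by [0,1] x [0,2].
import java.util.Random;

public class RejectionSampler {
    static double f(double x) { return 2.0 * x; }

    static double sample(Random rng) {
        double a = 0.0, b = 1.0, m = 2.0;
        while (true) {
            double x = a + (b - a) * rng.nextDouble(); // X ~ U(a,b)
            double y = m * rng.nextDouble();           // Y ~ U(0,m)
            if (y < f(x)) return x;                    // accept, else retry
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        double sum = 0.0;
        int n = 200000;
        for (int i = 0; i < n; i++) sum += sample(rng);
        // E[X] for f(x) = 2x is 2/3
        System.out.println(sum / n);
    }
}
```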
3. The Convolution Method

A number of distributions can be expressed in terms of the (possibly weighted) sum of two or more random variables from other distributions. (The distribution of the sum is the convolution of the distributions of the individual r.v.s.)

Example: Erlang(k, λ). A r.v. X ∼ Erlang(k, λ) is defined as the sum of k r.v.s, each with distribution exp(kλ). We can think of X as the time taken to pass through a chain of k servers, each with an exp(kλ) service time distribution:

[Figure: a chain of k servers, each with service rate kλ]

Notice that

E[X] = 1/(kλ) + 1/(kλ) + ... + 1/(kλ) = 1/λ

We can generate Erlang(k, λ) samples using the sampler for the exponential distribution: if Xi ∼ exp(kλ) then

X = Σ_{i=1}^{k} Xi ∼ Erlang(k, λ)

If Ri ∼ U(0, 1) then Xi is sampled using −log Ri/(kλ). We can save the expensive log calculations in the summation by turning the sum into a product:

X = Σ_{i=1}^{k} −log Ri/(kλ) = −(1/(kλ)) log ∏_{i=1}^{k} Ri
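The product trick above takes one log instead of k. A minimal sketch (parameter values in main() are illustrative):

```java
// Erlang(k, lambda) sampler via the product form: X = -log(R1...Rk)/(k*lambda)
import java.util.Random;

public class ErlangSampler {
    static double erlang(Random rng, int k, double lambda) {
        double prod = 1.0;
        for (int i = 0; i < k; i++) prod *= rng.nextDouble();
        return -Math.log(prod) / (k * lambda);  // one log, not k
    }

    public static void main(String[] args) {
        Random rng = new Random(7);
        int k = 4;
        double lambda = 2.0, sum = 0.0;
        int n = 500000;
        for (int i = 0; i < n; i++) sum += erlang(rng, k, lambda);
        // E[X] = 1/lambda = 0.5
        System.out.println(sum / n);
    }
}
```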
Here, we have:

  x      0      1      2      3      4
  p(x)   0.15   0.24   0.22   0.15   0.09
  F(x)   0.15   0.39   0.61   0.81   1.00

[Figure: the cdf F(x) drawn as a staircase over x = 0, ..., 4; a U(0,1) sample on the y-axis is mapped back to an interval on the x-axis]

To generate a sample, we drive the table backwards: we simply pick a value on the y-axis (a U(0,1) sample) and then map this back to an interval on the x-axis.

Q: This involves a linear-time lookup. Can we do better...?!
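The linear-time lookup can be sketched as follows; the probabilities used are illustrative (chosen to sum to 1), not the table's:

```java
// Linear-scan inverse-transform sampler for a discrete distribution.
// The probability vector below is an illustrative assumption.
import java.util.Random;

public class DiscreteSampler {
    static final double[] p = { 0.10, 0.30, 0.40, 0.15, 0.05 };

    static int sample(Random rng) {
        double r = rng.nextDouble(), cum = 0.0;
        for (int x = 0; x < p.length; x++) {
            cum += p[x];            // cum is F(x)
            if (r < cum) return x;  // map r back to an interval on the x-axis
        }
        return p.length - 1;        // guard against rounding error
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int[] count = new int[p.length];
        int n = 100000;
        for (int i = 0; i < n; i++) count[sample(rng)]++;
        for (int x = 0; x < p.length; x++)
            System.out.printf("%d: %.3f%n", x, (double) count[x] / n);
    }
}
```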
To estimate equilibrium behaviour we must ignore the post-initialisation (warm-up) transient, as it may introduce bias. We must start any measurement (or reset ongoing measurements) at the end of the transient period. Note: this simply means that the state probabilities are approximately those at equilibrium; the state on an individual sample path at that time may assume any value. In general, we can only work out the approximate length of the warm-up transient by doing pilot runs, e.g.: plot the running mean or a moving average over one run; plot measures having discarded p% of the initial observations; plot means/distributions of the nth observation over many runs.
Confidence Intervals

Suppose we have n independent, identically-distributed (iid) observations Xi, 1 ≤ i ≤ n, each with mean μ and variance σ². We'd like to compute a point estimate for μ based on the Xi; that's easy:

X̄ = (1/n) Σ_{i=1}^{n} Xi

We'd also like a measure of confidence in this estimate. We seek two numbers, a and b say, for which we can state:

P(a ≤ μ ≤ b) = p

for some probability p, e.g. 0.9, 0.95... This is called a confidence interval and is usually symmetric about X̄, i.e. (a, b) = X̄ ± h, where h is the half-width.
The Central Limit Theorem tells us that the sample mean of the Xi approaches the N(μ, σ²/n) distribution as n → ∞. Normalising, X̄ ∼ N(μ, σ²/n) gives

(X̄ − μ)/√(σ²/n) → N(0, 1) as n → ∞

Because we know the cdf of N(0, 1) we can easily locate two points −z and z such that

P(−z ≤ (X̄ − μ)/√(σ²/n) ≤ z) = 1 − α

The cdf is messy to compute for any given z, so particular values have been tabulated (see handouts).
Rearranging, we obtain the confidence interval X̄ ± z σ/√n. This is fine provided n is sufficiently large. We can replace σ by S (the sample standard deviation), again provided n is sufficiently large.

Q: What does sufficiently large mean? A: It depends on the distribution of the Xi. However, a key fact: it can be shown that if the Xi, 1 ≤ i ≤ n, are themselves normally distributed then

(X̄ − μ)/√(S²/n)

has the Student's t distribution with parameter n − 1 (degrees of freedom). Provided the Xi are normally distributed and independent, an exact 100(1 − α)% confidence interval for the mean μ is

X̄ ± t_{n−1,1−α/2} S/√n

with S² the sample variance and t_{n−1,1−α/2} from tables.
Example: Suppose the observations 1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09 are samples from a normal distribution with unknown mean μ. Construct a 90% confidence interval for μ.

The sample mean is

X̄ = (1/10) Σ_{i=1}^{10} Xi = 1.34

and the sample variance is

S² = (1/9) Σ_{i=1}^{10} (Xi − X̄)² = 0.17, so S = 0.41

From tables of the Student's t distribution, t_{9,0.95} = 1.83, so the 90% confidence interval is

X̄ ± 1.83 × 0.41/√10 = 1.34 ± 0.24

In the context of a simulation, how do we ensure in general that our observations are (a) independent and (b) approximately normally distributed? If the observations come from independent runs of the simulation, then they will be independent. If the observations are themselves internally-computed population means, then they will be approximately normally distributed (Central Limit Theorem): problem solved! But what if they're not?!
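The worked example can be checked mechanically. A minimal sketch (the t value 1.83 is taken from tables, as above):

```java
// Computes the sample mean and CI half-width t*S/sqrt(n) for the
// worked example above.
public class ConfInterval {
    // Returns {mean, halfWidth} for observations x and tabulated t value
    static double[] meanAndHalfWidth(double[] x, double t) {
        int n = x.length;
        double sum = 0.0;
        for (double v : x) sum += v;
        double mean = sum / n;
        double ss = 0.0;
        for (double v : x) ss += (v - mean) * (v - mean);
        double s2 = ss / (n - 1);  // sample variance S^2
        return new double[] { mean, t * Math.sqrt(s2 / n) };
    }

    public static void main(String[] args) {
        double[] x = { 1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09 };
        double[] r = meanAndHalfWidth(x, 1.83);  // t_{9,0.95} = 1.83
        System.out.printf("%.2f +/- %.2f%n", r[0], r[1]);
    }
}
```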
Batched Means

A popular general approach seeks to achieve approximate normality and independence simultaneously. Idea: run the model once and divide the post-warm-up period into n batches (typically 10 ≤ n ≤ 30), where batch i computes the average of m estimates Yi,1, ..., Yi,m:

Xi = (1/m) Σ_{j=1}^{m} Yi,j,  i = 1, ..., n

Provided m is reasonably large, the Xi will (hopefully) be approximately normal and independent.

[Figure: a single run; the initial transient is discarded and measures are reset, then batch observations X1, X2, X3, ..., Xn are taken until the end of the simulation]

Independence

Independent replications guarantee independent observations, at the expense of a warm-up transient at the start of each run. The batched means method involves a single run and a single warm-up transient, but the observations might now be dependent (why?). If the observations are dependent then the formula for the sample variance S² will be wrong, because the covariances are ignored. Consequence: the confidence interval will be too narrow.
Recall the definition of covariance:

COV(X, Y) = E([X − μX][Y − μY]) = E(XY) − μX μY

Covariance may be positive (positive correlation), negative (negative correlation) or zero (uncorrelated; independent r.v.s are uncorrelated, but zero covariance does not in general imply independence). Note that COV(X, X) = E([X − μX]²) = VAR(X).

Recall also:

VAR(X + Y) = E[((X + Y) − (μX + μY))²]
           = E[((X − μX) + (Y − μY))²]
           = E[(X − μX)²] + E[(Y − μY)²] + 2 E[(X − μX)(Y − μY)]
           = σX² + σY² + 2 COV(X, Y)
To obtain our confidence interval we assumed that VAR(X̄) = σ²/n, but if the Xi, 1 ≤ i ≤ n, are dependent then

VAR(X̄) = VAR((1/n) Σ_{i=1}^{n} Xi)
        = (1/n²) VAR(Σ_{i=1}^{n} Xi)
        = (1/n²) E([(Σ_{i=1}^{n} Xi) − nμ]²)
        = (1/n²) E([Σ_{i=1}^{n} (Xi − μ)]²)
        = σ²/n + (2/n²) Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} COV(Xi, Xj)

The bottom line: covariances are typically positive, so if we ignore them altogether S²/n becomes an under-estimate of VAR(X̄) and the confidence intervals will be too narrow.
[Figure: two Markov chains of the same size. Parameterisation 1: simulation time is spent in a small number of states. Parameterisation 2: simulation time is spent covering a large number of states]

The accuracy (consistency) of a measure derived directly, or indirectly, from the (equilibrium) state probabilities is determined by the average number of times the simulation visits each state. For the same simulator execution time we would expect narrower confidence intervals from the model on the left than the one on the right.
Markov Processes

Note that the future history of a GSMP is determined entirely by the current state. However, note that the current state includes clock settings that are history dependent, e.g.:

[Figure: a state s with clock Ck, entered via events e1, e2 and e3, each of which sets Ck]

Now suppose that:

1. Each transition into s′ initialises the clocks of the new events e′ ∈ E(s′) using the same distribution F(x; e′), i.e. independently of how the state was entered
2. All clocks are set according to an exponential distribution, i.e. F(x; e′) = 1 − e^{−rx} for some parameter r that depends only on e′

The exponential distribution is memoryless, in that the future is independent of the past, i.e. if X ∼ exp(r):

P(t < X ≤ t + s | X > t) = P(X ≤ s) for all s, t ≥ 0

With the above assumptions the GSMP now assumes a very much simpler form:

1. All occurrences of event e are associated with the same distribution, hence the same distribution parameter
2. There is no need to remember the time spent in any previous state(s), e.g. through the use of clocks (memorylessness)
3. In state s we simply race the distribution samplers associated with each e ∈ E(s) and choose the event with the smallest time
4. Transitions need only be labelled with the associated (rate) parameter, e.g. for our queue:

[Figure: states 0, 1, ..., c − 1, c with arrival transitions labelled λ and completion transitions labelled μ]

This much simpler process is called a Markov Process. Markov Processes are easily analysed by discrete-event simulation (Exercise: sketch the algorithm). Crucially, however, they can also be analysed numerically and, in some special cases, analytically, provided the process is irreducible: every state can be reached from any other.
At equilibrium, the probability fluxes (average number of transitions per unit time) into, and out of, each state must be in balance, i.e. for each state s ∈ S:

ps Σ_{s′∈S, s′≠s} q_{s,s′} = Σ_{s′∈S, s′≠s} p_{s′} q_{s′,s}

These (linear) flux equations are called the global balance equations and are subject to the normalising equation

Σ_{s∈S} ps = 1
Example:

[Figure: a state s with outgoing transitions to states s′ and s″ at rates q_{s,s′} and q_{s,s″}, and incoming transitions from s′ and s″ at rates q_{s′,s} and q_{s″,s}]

The flux out of s is ps(q_{s,s′} + q_{s,s″}); the flux in is p_{s′} q_{s′,s} + p_{s″} q_{s″,s}. Thus, at equilibrium:

ps (q_{s,s′} + q_{s,s″}) = p_{s′} q_{s′,s} + p_{s″} q_{s″,s}

We can use standard linear solvers to find ps for any state s, given the transition rates q_{s,s′}, s, s′ ∈ S.
The problem is usually defined in matrix form, viz. the solution to

p Q = 0

where p = (p_{s1}, p_{s2}, ..., p_{sN}), S = {s1, s2, ..., sN} is the state space, and Q is the Generator Matrix:

      ( −q1         q_{s1,s2}   q_{s1,s3}   ...   q_{s1,sN} )
      ( q_{s2,s1}   −q2         q_{s2,s3}   ...   q_{s2,sN} )
  Q = (  .                                                  )
      (  .                                                  )
      ( q_{sN,s1}   q_{sN,s2}   q_{sN,s3}   ...   −qN       )

The diagonal terms −qi, si ∈ S, encode the net output rate from state i in the balance equations:

qi = Σ_{sj∈S, j≠i} q_{si,sj}
For example, consider p applied to the first column:

−p_{s1} q1 + p_{s2} q_{s2,s1} + ... + p_{sN} q_{sN,s1} = 0

which is a rearrangement of the balance equation for state s1. As an example, consider our single-server queue (the M/M/1/c queue), with state space {0, 1, ..., c}:

      ( −λ     λ        0        ...   0   )
      (  μ     −(λ+μ)   λ        ...   0   )
  Q = (  0     μ        −(λ+μ)   ...   0   )
      (  .                                 )
      (  0     ...      ...      μ     −μ  )

Recall: μ is the service rate; μ⁻¹ is the mean service time, S say.
Note that Q is singular, so there are infinitely many solutions to pQ = 0 as written. In order to solve for the equilibrium distribution p it is necessary to encode the normalising condition Σ_{s∈S} ps = 1. Simple: pick a column, j, set all its elements to 1, and likewise the jth element of the 0 vector; e.g. in our example we might solve pQ′ = (0, 0, ..., 1), where Q′ is Q with its last column replaced by 1s:

      ( −λ     λ        0        ...   1 )
  p   (  μ     −(λ+μ)   λ        ...   1 )  = (0, 0, ..., 1)
      (  0     μ        −(λ+μ)   ...   1 )
      (  .                             1 )
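This can be checked numerically with any linear solver. A sketch using plain Gaussian elimination; the parameters c = 4, λ = 2 and μ = 4 are illustrative assumptions:

```java
// Solves p Q' = (0,...,0,1) for the M/M/1/c queue, where Q' is the
// generator with its last column replaced by 1s (normalisation).
public class MM1cSolver {
    // Solve A x = b by Gaussian elimination with partial pivoting
    static double[] solve(double[][] A, double[] b) {
        int n = b.length;
        for (int i = 0; i < n; i++) {
            int piv = i;
            for (int r = i + 1; r < n; r++)
                if (Math.abs(A[r][i]) > Math.abs(A[piv][i])) piv = r;
            double[] tr = A[i]; A[i] = A[piv]; A[piv] = tr;
            double tb = b[i]; b[i] = b[piv]; b[piv] = tb;
            for (int r = i + 1; r < n; r++) {
                double f = A[r][i] / A[i][i];
                for (int c = i; c < n; c++) A[r][c] -= f * A[i][c];
                b[r] -= f * b[i];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = b[i];
            for (int c = i + 1; c < n; c++) s -= A[i][c] * x[c];
            x[i] = s / A[i][i];
        }
        return x;
    }

    // Equilibrium distribution of the M/M/1/c queue (states 0..c)
    static double[] equilibrium(int c, double lambda, double mu) {
        int n = c + 1;
        double[][] Q = new double[n][n];
        for (int s = 0; s < n; s++) {
            if (s < c) { Q[s][s + 1] = lambda; Q[s][s] -= lambda; }
            if (s > 0) { Q[s][s - 1] = mu;     Q[s][s] -= mu;     }
        }
        // p Q' = e_{n-1} is Q'^T p^T = e_{n-1}^T: build the transpose,
        // with the last column of Q replaced by 1s
        double[][] A = new double[n][n];
        double[] b = new double[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                A[j][i] = (j == n - 1) ? 1.0 : Q[i][j];
        b[n - 1] = 1.0;
        return solve(A, b);
    }

    public static void main(String[] args) {
        double[] p = equilibrium(4, 2.0, 4.0);
        for (int s = 0; s < p.length; s++)
            System.out.printf("p%d = %.6f%n", s, p[s]);
    }
}
```

For a birth-death chain the balance equations give p_{s+1} = (λ/μ) p_s, so with λ/μ = 1/2 the computed distribution should be geometric, normalised over the c + 1 states.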
Notice, importantly, that the sum Σ_{n} pn = 1 only converges if ρ < 1, i.e. λ < μ; hence the case λ = μ has no solution (the system is unstable). Putting this together, we obtain:

pn = (1 − ρ)ρⁿ,  n ≥ 0

where ρ = λ/μ.
Mean queue length, N

Recall the definition of the mean queue length:

N = Σ_{n=0}^{∞} n pn

Hence:

N = Σ_{n=0}^{∞} n pn = (1 − ρ) Σ_{n=0}^{∞} n ρⁿ = (1 − ρ) ρ d/dρ ( Σ_{n=0}^{∞} ρⁿ )
  = (1 − ρ) ρ d/dρ ( 1/(1 − ρ) ) = (1 − ρ) ρ/(1 − ρ)² = ρ/(1 − ρ)

Server utilisation, U

The server is idle when n = 0 and busy otherwise:

U = 1 − Pr{ the server is idle } = 1 − p0 = 1 − (1 − ρ) = ρ = λS = λ/μ
Mean Response (Waiting) Time, W

Recall: the mean response time is the mean time that a job spends in the system (queueing time + service time). The easiest way to find W is to use Little's Law, N = λW. Thus:

W = N/λ = ρ/(λ(1 − ρ)) = 1/(μ − λ)

Mean Queueing Time

The time spent waiting to go into service (WQ say) is easily computed once we know W:

WQ = W − 1/μ = ρ/(μ − λ)

Note: if a customer is regarded as removed from the queue when it goes into service, then Little's Law applied to WQ delivers the mean number of customers waiting to be served:

NQ = λ WQ = ρ²/(1 − ρ)
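The M/M/1 formulas above can be packaged together. A minimal sketch (the rates in main() are illustrative, matching the first row of the examples table below):

```java
// M/M/1 performance measures from arrival rate lambda and service rate mu.
public class MM1Measures {
    // Returns {U, N, W, WQ, NQ}; requires lambda < mu
    static double[] measures(double lambda, double mu) {
        double rho = lambda / mu;            // utilisation U
        double N   = rho / (1 - rho);        // mean population
        double W   = 1 / (mu - lambda);      // mean response time (N = lambda*W)
        double WQ  = W - 1 / mu;             // mean queueing time
        double NQ  = lambda * WQ;            // = rho^2/(1 - rho)
        return new double[] { rho, N, W, WQ, NQ };
    }

    public static void main(String[] args) {
        double[] m = measures(2.0, 4.0);
        System.out.printf("U=%.3f N=%.3f W=%.3f WQ=%.3f NQ=%.3f%n",
                          m[0], m[1], m[2], m[3], m[4]);
    }
}
```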
Some Examples
  λ   μ    X   U     N     W
  2   4    2   1/2   1     1/2
  2   5    2   2/5   2/3   1/3
  2   10   2   1/5   1/4   1/8
  3   4    3   3/4   3     1
  5   10   5   1/2   1     1/5
  8   12   8   2/3   2     1/4
Two Queues in Tandem

[Figure: two infinite-capacity queues in series; jobs arrive at queue 1 at rate λ, are served at rate μ1, then proceed to queue 2, served at rate μ2; the populations are n1 and n2]

Note: the arrival rate (throughput) λ is the same for both queues. Let's proceed as before; the state this time is the population of both queue 1 (n1 say) and queue 2 (n2 say), i.e. a pair (n1, n2). The queue-length vector random variable is (N1, N2) and we'll write p_{n1,n2} = P(N1 = n1, N2 = n2) at equilibrium.
Because the state is a pair (of queue populations) the state transition diagram is now two dimensional:

[Figure: the lattice of states (0,0), (1,0), (0,1), (2,0), (1,1), (0,2), ...; arrivals move from (n1, n2) to (n1+1, n2) at rate λ, queue-1 completions move to (n1−1, n2+1) at rate μ1, and queue-2 completions move to (n1, n2−1) at rate μ2]

Now a small problem: no closed contour gives us an immediately solvable balance equation, e.g. the dotted box:

λ p_{0,0} = μ2 p_{0,1}

or the solid box:

(λ + μ1) p_{2,0} = λ p_{1,0} + μ2 p_{2,1}

We can, however, come up with four equations which cover all states and transitions. For n1, n2 > 0:

λ p_{0,0} = μ2 p_{0,1}
(λ + μ1) p_{n1,0} = λ p_{n1−1,0} + μ2 p_{n1,1}
(λ + μ2) p_{0,n2} = μ1 p_{1,n2−1} + μ2 p_{0,n2+1}
(λ + μ1 + μ2) p_{n1,n2} = λ p_{n1−1,n2} + μ1 p_{n1+1,n2−1} + μ2 p_{n1,n2+1}
One can verify by substitution that

p_{n1,n2} = ρ1^{n1} ρ2^{n2} p_{0,0}, where ρ1 = λ/μ1 and ρ2 = λ/μ2

satisfies all four balance equations. So it is a solution, and it's unique! We find p_{0,0} using the fact that the probabilities sum to 1:

Σ_{n1=0}^{∞} Σ_{n2=0}^{∞} ρ1^{n1} ρ2^{n2} p_{0,0} = p_{0,0} ( Σ_{n1=0}^{∞} ρ1^{n1} ) ( Σ_{n2=0}^{∞} ρ2^{n2} ) = p_{0,0} · 1/(1 − ρ1) · 1/(1 − ρ2) = 1

Hence

p_{0,0} = (1 − ρ1)(1 − ρ2)

and so

p_{n1,n2} = (1 − ρ1) ρ1^{n1} (1 − ρ2) ρ2^{n2}
113
114
So, the joint probability of being in state (n1, n2) is the product of the marginals (the probability of the first queue being in state n1, and similarly the second). This is an example of the Product Form result, which applies to an arbitrary queueing network, provided the initial assumptions hold. An extraordinary consequence of this is that each queue now operates as if it were an M/M/1 queue with an independent Poisson arrival stream.

In particular, the M/M/1 queue results apply to each node in isolation; hence, for example:

The utilisations are U_1 = \rho_1 and U_2 = \rho_2
The mean population of queue 2 is \rho_2/(1-\rho_2)
The mean time spent in queue 1 is 1/(\mu_1 - \lambda)
The probability that queue 2 contains exactly 4 customers is

Pr\{N_2 = 4\} = (1-\rho_2)\rho_2^4

and so on... Q: What about arbitrary networks? More later...
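Numerically, the product form is easy to play with. Here is a small sketch (the class name is mine) that computes joint probabilities as the product of the two M/M/1 marginals, assuming \rho_1, \rho_2 < 1:

```java
// Product-form equilibrium probabilities for the two-node tandem network:
// p(n1, n2) = (1 - rho1) rho1^n1 * (1 - rho2) rho2^n2.
class Tandem {
    final double rho1, rho2;

    Tandem(double lambda, double mu1, double mu2) {
        rho1 = lambda / mu1;
        rho2 = lambda / mu2;
    }

    // Marginal M/M/1 probability that a queue with utilisation rho holds n jobs
    static double marginal(double rho, int n) {
        return (1 - rho) * Math.pow(rho, n);
    }

    // Joint probability: the product of the marginals
    double p(int n1, int n2) {
        return marginal(rho1, n1) * marginal(rho2, n2);
    }
}
```

Summing p(n1, n2) over a large grid gives 1, confirming the normalisation p_{0,0} = (1 - \rho_1)(1 - \rho_2).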
In a GSMP the current state is defined by the value(s) of program variable(s), e.g. class instance variables. The set of states is implicitly defined in terms of these state variables. New events in state s', N(s'; s, e), are created by explicitly scheduling them during the transition from s to s'. Event scheduling involves sampling the appropriate clock setting function (distribution sampling). Cancelled events in state s', K(s'; s, e) say, are removed by descheduling them during the transition from s to s'. Probabilistic choice of next state is achieved by sampling a U(0, 1), i.e. via a RNG.
now() delivers the current virtual time (double)
schedule( e ) adds event e to an event set (aka calendar queue, event diary, future event list...)
deschedule( e ) removes event e from the event set
simulate() invokes events in time order and advances time
Events are class instances, with associated code (public void invoke() {...}) for effecting a state transition and scheduling/descheduling events. All events have an invocation time, a superclass instance variable. The event code models a transition in some underlying GSMP.
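The implementation of the simulation kernel itself isn't shown in these notes. The following is a minimal sketch of what the event set and simulate() loop might look like, using a priority queue ordered by invocation time; all names and signatures here are illustrative assumptions, not the actual tools library:

```java
import java.util.PriorityQueue;

// Illustrative sketch only: a calendar of events ordered by invocation time,
// and a loop that advances virtual time.  The real course library may differ.
abstract class Event implements Comparable<Event> {
    final double time;                            // invocation time
    Event(double t) { time = t; }
    public int compareTo(Event e) { return Double.compare(time, e.time); }
    public abstract void invoke();                // effects a GSMP transition
}

abstract class Sim {
    private final PriorityQueue<Event> calendar = new PriorityQueue<>();
    private double clock = 0.0;                   // virtual time

    double now() { return clock; }
    void schedule(Event e)   { calendar.add(e); }
    void deschedule(Event e) { calendar.remove(e); }

    // Invoke events in time order, advancing virtual time, until the
    // calendar empties or the stopping condition holds
    void simulate() {
        while (!calendar.isEmpty() && !stop()) {
            Event e = calendar.poll();
            clock = e.time;
            e.invoke();
        }
    }

    abstract boolean stop();
}
```

Note the key invariant: the clock only ever jumps forward, to the time of the next scheduled event; nothing happens between events.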
One arrival schedules the next. In state 1, we set the clock for the completion event, as in the GSMP...
class Arrival extends Event {
  public Arrival( double t ) { super( t ) ; }
  public void invoke() {
    schedule( new Arrival( now() + Math.random() ) ) ;
    n++ ;
    if ( n == 1 )
      schedule( new Completion( now() + 0.25 ) ) ;
  }
}
The code for the completion event mirrors the equivalent GSMP...

class Completion extends Event {
  public Completion( double t ) { super( t ) ; }
  public void invoke() {
    n-- ;
    if ( n > 0 )
      schedule( new Completion( now() + 0.25 ) ) ;
  }
}
The initial state is 0 (w.p. 1 in the GSMP); the only active event in state 0 is Arrival...

  public SSQSim() {
    schedule( new Arrival( now() + Math.random() ) ) ;
    simulate() ;
  }
}

public class ssq1Ev {
  public static void main( String args[] ) {
    new SSQSim() ;
  }
}
[Figure: central server model. Jobs arrive at the CPU queue; on leaving the CPU a job either exits the system, or page-faults and is serviced by Disk 1 or Disk 2 before returning to the CPU queue.]
Each submitted job makes 121 visits to the CPU, has 70 page faults serviced by disk 1 and 50 by disk 2 on average
The mean service times are 0.005s for the CPU, 0.03s for disk 1 and 0.027s for disk 2. The current arrival rate of jobs is \lambda = 0.1/s. Note: the CPU mean service time is the mean time a job spends at the CPU (before leaving or page-faulting). Q: How does the system response time (W) vary as the load increases? For illustration purposes, we'll assume all time delays are exponentially distributed (this is easily changed).

The state of the underlying GSMP is the population of each of the three queues ((c, m, n) say). Here we'll use explicit Queues; each entry is a job, represented by its arrival time. Events:

  Arrival: new job arrival
  StopJob: exit CPU
  Resume( n, ... ): exit disk n

We need four distribution samplers for setting clocks: inter-arrival time, CPU time slice and two disk (page fault) service times. We'll measure response time W using a CustomerMeasure.
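The tools library's CustomerMeasure isn't shown in these notes. The following is a guess at a minimal version, assuming it simply averages the per-customer samples passed to add(), with resetMeasures() discarding warm-up data:

```java
// Assumed sketch of a customer-based measure: accumulates one observation
// per customer (here, response times) and reports their mean.
class CustomerMeasure {
    private double sum = 0.0;
    private long count = 0;

    void add(double x) { sum += x; count++; }       // record one observation

    void resetMeasures() { sum = 0.0; count = 0; }  // drop transient (warm-up) data

    double mean() { return count == 0 ? 0.0 : sum / count; }
}
```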
import tools.* ;

class CPUSim extends Sim {
  Queue cpuQ = new Queue(),
        diskQ[] = { new Queue(), new Queue() } ;
  Exp iaTime,
      sTime[] = { new Exp( 1000.0/30 ), new Exp( 1000.0/27 ) },
      xTime = new Exp( 1000.0/5 ) ;
  CustomerMeasure respTime = new CustomerMeasure() ;
  int completions = 0 ;
  int stopCount = 100000000 ;
The arrival event (Arrive) generates the next arrival, places the new job in the job queue (cpuQ) and starts the CPU if it was idle (state (0, m, n) in the GSMP). Event StopJob may exit the job or pass it to one of the disks: a transition from (c, m, n) to either (c-1, m, n) w.p. 1/121, (c-1, m+1, n) w.p. 70/121, or (c-1, m, n+1) w.p. 50/121. In states (c, 0, n) and (c, m, 0), c, m, n >= 0, we need to schedule a Resume event at the corresponding (previously idle) disk. On exit from a server the next job in the job queue is started, if there is one. We'll build in an EndOfWarmUp event to allow us to delete transient measures.
class Arrival extends Event {
  public Arrival( double t ) { super( t ) ; }
  public void invoke() {
    schedule( new Arrival( now() + iaTime.next() ) ) ;
    cpuQ.enqueue( new Double( now() ) ) ;
    if ( cpuQ.queueLength() == 1 )
      schedule( new StopJob( now() + xTime.next() ) ) ;
  }
}
class StopJob extends Event {
  public StopJob( double t ) { super( t ) ; }
  public void invoke() {
    Double t = (Double) cpuQ.dequeue() ;
    if ( Math.random() <= 1.0/121 ) {
      completions++ ;
      respTime.add( now() - t.doubleValue() ) ;
    }
    // Continued over...
    // StopJob continued...
    else {
      int i = (int) ( Math.random() + 50.0/120 ) ;
      diskQ[i].enqueue( t ) ;
      if ( diskQ[i].queueLength() == 1 )
        schedule( new Resume( i, now() + sTime[i].next() ) ) ;
    }
    if ( cpuQ.queueLength() > 0 )
      schedule( new StopJob( now() + xTime.next() ) ) ;
  }
}
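A side note on the disk-selection line: for u uniform on [0,1), (int)(u + p) truncates to 1 exactly when u >= 1 - p, which happens with probability p. Here p = 50/120, the conditional probability of choosing disk 2 given that the job did not exit. A standalone Monte Carlo check of the idiom (the class is mine, for illustration):

```java
import java.util.Random;

// Check that (int)(u + p) equals 1 with probability p when u ~ U(0,1):
// the cast truncates to 1 exactly when u >= 1 - p.
class RoutingCheck {
    static double estimate(double p, int trials, long seed) {
        Random rng = new Random(seed);
        int ones = 0;
        for (int k = 0; k < trials; k++)
            ones += (int) (rng.nextDouble() + p);   // 0 or 1
        return (double) ones / trials;              // should approach p
    }
}
```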
class Resume extends Event {
  int d ;
  public Resume( int disk, double t ) { super( t ) ; d = disk ; }
  public void invoke() {
    Double t = (Double) diskQ[d].dequeue() ;
    cpuQ.enqueue( t ) ;
    if ( cpuQ.queueLength() == 1 )
      schedule( new StopJob( now() + xTime.next() ) ) ;
    if ( diskQ[d].queueLength() > 0 )
      schedule( new Resume( d, now() + sTime[d].next() ) ) ;
  }
}
class EndOfWarmUp extends Event {
  public EndOfWarmUp( double t ) { super( t ) ; }
  public void invoke() {
    respTime.resetMeasures() ;
    completions = 0 ;
    stopCount = 5000 ;   // Whatever...!
  }
}
}
public boolean stop() { return completions == stopCount ; }

public CPUSim( double lambda ) {
  iaTime = new Exp( lambda ) ;
  schedule( new Arrival( now() + iaTime.next() ) ) ;
  schedule( new EndOfWarmUp( now() + 15000 ) ) ;   // Whatever!
  simulate() ;
  System.out.println( respTime.mean() ) ;
}
Some Results

Here are example time series plots (moving averages, three trial runs each):

[Figure: CPU model, disk 1 mean queue length against time for \lambda = 0.15 (three trial runs)]
[Figure: CPU model, disk 1 mean queue length against time for \lambda = 0.4 (three trial runs)]
As \lambda increases the system takes longer to reach equilibrium (why?). As an example, we'd expect measurements taken between 15000 and 30000 time units, say, to have wider confidence intervals for \lambda = 0.4 than for \lambda = 0.15. Let's see...
It looks as if the system is approaching equilibrium after around 15000 time units
Point Estimate   90% Confidence Interval (10 independent replications)
 4.839           (4.776, 4.902)
 5.267           (5.044, 5.490)
 6.198           (5.790, 6.607)
 6.679           (6.158, 7.199)
 8.662           (7.784, 9.541)
11.726           (10.767, 12.685)
16.295           (12.297, 20.293)
30.765           (18.621, 42.909)
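These intervals follow the standard independent-replications recipe: take the mean of each of the n = 10 runs, then form a Student-t interval around their average. A sketch (the class is mine; the 90% two-sided t quantile with 9 degrees of freedom is about 1.833, supplied by the caller):

```java
// Confidence interval from independent replication means, using the
// Student-t quantile with n-1 degrees of freedom (passed in by the caller).
class Replications {
    // Returns { lower, upper } for the given replication means
    static double[] confInterval(double[] means, double tQuantile) {
        int n = means.length;
        double sum = 0.0;
        for (double m : means) sum += m;
        double xbar = sum / n;                       // point estimate

        double ss = 0.0;
        for (double m : means) ss += (m - xbar) * (m - xbar);
        double s = Math.sqrt(ss / (n - 1));          // sample standard deviation

        double h = tQuantile * s / Math.sqrt(n);     // half-width
        return new double[] { xbar - h, xbar + h };
    }
}
```

Wider spread among the replication means (as at high \lambda, where warm-up is slower) directly widens the half-width.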
Q: What's going on? Q: Where did the exact results come from?
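One plausible source of exact results, assuming the product-form result sketched earlier extends to this network with its exponential delays: each device i then behaves as an M/M/1 queue with arrival rate \lambda V_i, where V_i is the mean number of visits, giving W = \sum_i V_i / (\mu_i - \lambda V_i). A sketch using the CPU-model parameters (this is my illustration, not the course's derivation):

```java
// Sketch: response time of the CPU model, treating each device as an M/M/1
// queue with arrival rate lambda * V[i] (a product-form assumption).
// Valid only while lambda * V[i] < mu[i] at every device.
class JacksonW {
    static final double[] V  = { 121, 70, 50 };    // mean visits: CPU, disk 1, disk 2
    static final double[] MU = { 1000.0 / 5, 1000.0 / 30, 1000.0 / 27 }; // service rates per second

    static double responseTime(double lambda) {
        double w = 0.0;
        for (int i = 0; i < V.length; i++)
            w += V[i] / (MU[i] - lambda * V[i]);   // V_i * W_i per device
        return w;
    }
}
```

Under this assumption, responseTime(0.15) is about 5.4 seconds, and the disk 1 term blows up as \lambda approaches (1000/30)/70, which is consistent with the rapidly widening confidence intervals at high load.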