Simulation Languages - Lecture 12 - Output Data Analysis I
P( X̄ − t_{n−1, 1−α/2} · S(n)/√n  <  μ  <  X̄ + t_{n−1, 1−α/2} · S(n)/√n ) = 1 − α
t_{n−1, 1−α/2} — the 1−α/2 quantile of the t-distribution with n−1 degrees of freedom
T-Distribution
Similar to the standard normal in that it is unimodal, bell-shaped and symmetric.
The tails of the distribution are thicker than the standard normal's.
The distribution is indexed by degrees of freedom (df).
T-Distribution
The degrees of freedom measure the amount of information available in the data set that can be used for estimating the population variance (df = n − 1).
The area under the curve still equals 1.
Probabilities for the t-distribution with infinite df equal those of the standard normal.
Confidence interval for mean with unknown variance
P( X̄ − t_{n−1, 1−α/2} · S(n)/√n  <  μ  <  X̄ + t_{n−1, 1−α/2} · S(n)/√n ) = 0.95
t_{n−1, 1−α/2} — a quantile of the t-distribution with n − 1 degrees of freedom (use tables or library functions)
Compare: the 0.975 quantile of the standard normal distribution is 1.96; for the t-distribution with n − 1 = 10 degrees of freedom it is about 2.23
The extra uncertainty makes the confidence interval wider
For n > 30 the difference between the standard normal and t-distributions is negligible
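As a sketch, the interval above can be computed directly in Python; the ten observations and the table value t_{9, 0.975} = 2.262 below are illustrative assumptions, not lecture data:

```python
import math
from statistics import mean, stdev

def t_confidence_interval(samples, t_quantile):
    """Two-sided CI for the mean with unknown variance:
    X_bar +/- t_{n-1, 1-alpha/2} * S / sqrt(n)."""
    n = len(samples)
    x_bar = mean(samples)            # sample mean X_bar
    s = stdev(samples)               # sample std. dev. S (n - 1 divisor)
    half_width = t_quantile * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Ten illustrative observations (hypothetical data)
obs = [0.64, 0.83, 0.70, 0.11, 0.30, 1.23, 1.27, 0.72, 1.02, 2.30]
# t_{9, 0.975} = 2.262 from t-tables (df = n - 1 = 9, 95% two-sided)
lo, hi = t_confidence_interval(obs, 2.262)
print(f"95% CI for the mean: ({lo:.3f}, {hi:.3f})")
```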
Confidence Interval for Quantile
Let X be a random variable with arbitrary distribution. The population quantile of order p is q_p with
P(X ≤ q_p) = F(q_p) = p
X_1, ..., X_n — n samples from X
X_(1) ≤ X_(2) ≤ ... ≤ X_(n) — the order statistic, i.e. the values of {X_1, ..., X_n} sorted in increasing order
Then a confidence interval at level 100α% for q_p is
[X_(j), X_(k)]
where j and k satisfy B_{n,p}(k − 1) − B_{n,p}(j − 1) = α, and B_{n,p} is the CDF of the binomial distribution.
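A minimal Python sketch of this construction; the symmetric search for j and k around n·p is one simple choice (an assumption for illustration), not the only way to pick the indices:

```python
from math import comb

def binom_cdf(x, n, p):
    """B_{n,p}(x): CDF of the Binomial(n, p) distribution."""
    if x < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(min(x, n) + 1))

def quantile_ci(samples, p, level=0.95):
    """CI [X_(j), X_(k)] for the order-p quantile: start j = k near n*p
    and widen symmetrically until B_{n,p}(k-1) - B_{n,p}(j-1) >= level."""
    xs = sorted(samples)                   # order statistic X_(1) <= ... <= X_(n)
    n = len(xs)
    j = k = min(n, max(1, round(n * p)))   # 1-based indices
    while binom_cdf(k - 1, n, p) - binom_cdf(j - 1, n, p) < level:
        if j == 1 and k == n:
            break                          # cannot widen any further
        j = max(1, j - 1)
        k = min(n, k + 1)
    return xs[j - 1], xs[k - 1]

# 95% CI for the median of 15 illustrative observations
sample = [0.64, 0.83, 0.70, 0.11, 0.30, 1.23, 1.27, 0.72,
          1.02, 2.30, 2.99, 2.72, 2.64, 3.30, 4.14]
print(quantile_ci(sample, p=0.5))
```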
Confidence Interval for Quantile: Proof
The true (unknown) quantile q_p satisfies P(X_i < q_p) = p.
Let Z_i = 1 if X_i < q_p, 0 otherwise, and N = Σ_i Z_i, i.e. N is the number of times that X_i is below q_p.
We have the event equalities
{ X_(j) < q_p } = { N ≥ j }
{ X_(k) ≥ q_p } = { N ≤ k − 1 }
Thus
P( X_(j) < q_p ≤ X_(k) ) = P( j ≤ N ≤ k − 1 ) = B_{n,p}(k − 1) − B_{n,p}(j − 1) = α
Now the Z_i are iid Bernoulli(p) random variables, thus N is Binomial(n, p).
Reducing Confidence Interval Width
The width of the confidence interval can be reduced by
increasing the number of observations n,
decreasing the value of S(n).
The reduction obtained by halving S(n) is the same as the one obtained by taking four times as many observations.
Hence, variance reduction techniques are very important.
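The 1/√n scaling behind this claim is easy to verify numerically; a small Python check (the quantile, S and n values are arbitrary illustrative choices):

```python
import math

def half_width(t_quantile, s, n):
    """CI half-width t * S / sqrt(n) for sample std. dev. S and n samples."""
    return t_quantile * s / math.sqrt(n)

base = half_width(1.96, s=2.0, n=100)
# Halving S halves the width...
print(half_width(1.96, s=1.0, n=100) / base)   # 0.5
# ...and quadrupling n achieves exactly the same reduction.
print(half_width(1.96, s=2.0, n=400) / base)   # 0.5
```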
Outline
Motivation and Terminology
Recall: sampling distribution and mean
Confidence intervals for mean and quantiles
Estimators and Correlation
Terminating simulations
Non-terminating Simulations: Steady state estimation
Transient Removal
Stopping Rules
Independent Replications, Batch means and Regeneration methods
Estimators and Autocorrelation
Classic estimators require IID observations
However, many output processes are autocorrelated and non-stationary
In the presence of correlation, the variance estimator is biased
With positive correlation, the variance is underestimated
E.g. delay times are usually positively correlated (we'll see an example later)
Waiting time distribution in a queuing system: Problem
Supermarket check-out queue: revisited
Simple-minded approach: let D_i indicate the delay in queue of customer i
Let us try to estimate the average customer waiting time distribution
Is this correct? No!
Waiting time distribution in a queuing system: Problem
Remember:
The waiting time of customer 1 is always zero
The waiting time of customer 2 depends on its arrival time and customer 1's departure time
Hence, the D_i are not identically distributed, and they are not independent either!
Using them to estimate the distribution is like comparing apples and oranges
Correlation in M/M/1 system
This table shows the time in system for an M/M/1 system with traffic intensity ρ = λ/μ = 0.5
Note that if the system is relatively empty then both job A and job A+1 are likely to have small times in system
If the system is relatively full, then job A and job A+1 are likely to have large times in system
M/M/1 Example (λ = 0.5, μ = 1)
Job  Interarrival  Arrives  Service  Service  Service  Time in
         Time        At      Start    Time      End    System
  1      0.71       0.71     0.71     0.64     1.35     0.64
  2      0.01       0.72     1.35     0.20     1.55     0.83
  3      1.12       1.84     1.84     0.70     2.54     0.70
  4      0.71       2.55     2.55     0.11     2.66     0.11
  5      2.46       5.02     5.02     0.30     5.32     0.30
  6      0.26       5.28     5.32     1.19     6.51     1.23
  7      0.59       5.87     6.51     0.63     7.14     1.27
  8      1.73       7.60     7.60     0.72     8.31     0.72
  9      0.28       7.88     8.31     0.58     8.89     1.02
 10      0.31       8.19     8.89     1.59    10.48     2.30
 11      0.06       8.25    10.48     0.75    11.24     2.99
 12      0.41       8.65    11.24     0.13    11.37     2.72
 13      0.15       8.81    11.37     0.07    11.45     2.64
 14      0.49       9.30    11.45     1.15    12.60     3.30
 15      0.41       9.71    12.60     1.25    13.85     4.14
Correlation in M/M/1 system
Measure the correlation between successive observations (i.e. t1 and t2 form a pair of observations, t2 and t3 form another pair, and so on)
Notice how strongly positively correlated the successive observations are (remember that a correlation of 1 would be perfect correlation)
M/M/1 Example
0.64 0.83
0.83 0.70
0.70 0.11
0.11 0.30
0.30 1.23
1.23 1.27
1.27 0.72
0.72 1.02
1.02 2.30
2.30 2.99
2.99 2.72
2.72 2.64
2.64 3.30
3.30 4.14
Cov: 1.122054
Correl: 0.893807
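The pairwise correlation in the table can be reproduced with a short Python sketch; `lag_correlation` is a hypothetical helper, and the exact figure depends on the normalization convention used by the original spreadsheet:

```python
from statistics import mean

def lag_correlation(xs, lag):
    """Sample correlation of the pairs (x_i, x_{i+lag}), as in the table."""
    a, b = xs[:-lag], xs[lag:]
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)
    sa = (sum((x - ma) ** 2 for x in a) / (len(a) - 1)) ** 0.5
    sb = (sum((y - mb) ** 2 for y in b) / (len(b) - 1)) ** 0.5
    return cov / (sa * sb)

# Times in system from the M/M/1 table above
t = [0.64, 0.83, 0.70, 0.11, 0.30, 1.23, 1.27, 0.72,
     1.02, 2.30, 2.99, 2.72, 2.64, 3.30, 4.14]
print(lag_correlation(t, 1))   # strongly positive for successive observations
```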
Correlation in M/M/1 system
The observations are not IID. As a result, measuring the mean time in system or the variance of time in system and developing a confidence interval for the mean is useless.
Solution:
Assume that our process is covariance-stationary (i.e. the covariance between samples doesn't change over time)
Observe that if we look not at successive observations, but rather at observations that are separated by some amount (called a lag), we find that the covariance is reduced.
Recall: Output Analysis and Variance Estimation
Simulation output analysis
Point estimator and confidence interval
Variance estimation → confidence interval
Independent and identically distributed (IID)
Suppose X_1, ..., X_m are iid
Stochastic Stationary Process
[Figure: examples of a stationary time series with positive autocorrelation, a stationary time series with negative autocorrelation, and a nonstationary time series with an upward trend]
The stochastic process X is stationary if, for all t_1, ..., t_k and τ ∈ T,
(X_{t_1}, ..., X_{t_k}) =_d (X_{t_1+τ}, ..., X_{t_k+τ})
where =_d denotes equality in distribution.
Stochastic Stationary Process (2)
The expected value of the variance estimator S²(n)/n satisfies:
E[ S²(n)/n ] < Var( X̄(n) )  when positively correlated
E[ S²(n)/n ] > Var( X̄(n) )  when negatively correlated
If the X_i are independent, then S²(n)/n is an unbiased estimator of Var( X̄(n) )
If the autocorrelation is positive, then S²(n)/n is biased low as an estimator of Var( X̄(n) )
If the autocorrelation is negative, then S²(n)/n is biased high as an estimator of Var( X̄(n) )
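This bias can be demonstrated numerically. The sketch below uses an AR(1) model with φ = 0.8 as an assumed toy process (not from the lecture) and compares the average naive estimate S²(n)/n with the variance of the sample mean observed across replications:

```python
import random
from statistics import mean

def ar1(n, phi, rng):
    """AR(1) series x_t = phi * x_{t-1} + e_t; positive phi gives
    positive autocorrelation."""
    x = rng.gauss(0, 1)
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out

rng = random.Random(42)
n, phi, reps = 50, 0.8, 2000
naive, means = [], []
for _ in range(reps):
    xs = ar1(n, phi, rng)
    m = mean(xs)
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)   # S^2(n)
    naive.append(s2 / n)                            # naive estimator S^2(n)/n
    means.append(m)

mu = mean(means)
true_var = sum((m - mu) ** 2 for m in means) / (reps - 1)  # Var(X_bar(n))
print(f"E[S^2/n] ~ {mean(naive):.4f}  vs  Var(X_bar) ~ {true_var:.4f}")
# With positive autocorrelation the naive estimator comes out far too small.
```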
Correlation in M/M/1 system: Lags
Lag of 1: (t_1, t_2), (t_2, t_3), …
M/M/1 Example
0.64 0.83
0.83 0.70
0.70 0.11
0.11 0.30
0.30 1.23
1.23 1.27
1.27 0.72
0.72 1.02
1.02 2.30
2.30 2.99
2.99 2.72
2.72 2.64
2.64 3.30
3.30 4.14
Cov: 1.122054
Correl: 0.893807
Lag of 2: (t_1, t_3), (t_2, t_4), …
M/M/1 Example
0.64 0.70
0.83 0.11
0.70 0.30
0.11 1.23
0.30 1.27
1.23 0.72
1.27 1.02
0.72 2.30
1.02 2.99
2.30 2.72
2.99 2.64
Cov: 0.424541
Correl: 0.532773
Lag of 3: (t_1, t_4), (t_2, t_5), …
M/M/1 Example
0.64 0.11
0.83 0.30
0.70 1.23
0.11 1.27
0.30 0.72
1.23 1.02
1.27 2.30
0.72 2.99
1.02 2.72
Cov: 0.114085
Correl: 0.320257
Correlation and Lag
[Figure: correlation vs. lag (0 to 10) for traffic intensities TI = 0.9 and TI = 0.5]
Observations become less correlated as lag
increases.
The busier the system is (i.e. the closer the traffic
intensity is to 1), the greater the correlation
Possible Solution: Multiple runs
Try to obtain estimates for the distributions of each D_i
However, one simulation run only gives a single sample for each D_i
Not enough to estimate a distribution from it
Solution: use multiple runs, made independently from each other
Let D_{j,i} indicate the waiting time of customer i in run j
Each run uses the same initial conditions and parameters, yet different seeds for random number generation
Independence across runs
For a fixed i, the D_{j,i} are independent and identically distributed random variables
The samples from different runs can be used to estimate the distribution of D_i (via the D_{j,i} for all j)
This property is called independence across runs
Let us talk about general rvs Y_i and Y_{j,i}, respectively
For n observations per run and R runs:
Y_{1,1} ... Y_{1,i} ... Y_{1,n}
Y_{2,1} ... Y_{2,i} ... Y_{2,n}
...     ...     ...     ...
Y_{R,1} ...     ...     Y_{R,n}
How to determine n, the number of observations per run?
Outline
Motivation and Terminology
Recall: sampling distribution and mean
Confidence intervals for mean and quantiles
Estimators and Correlation
Terminating simulations
Non-terminating Simulations: Steady state estimation
Transient Removal
Stopping Rules
Independent Replications, Batch means and Regeneration methods
Two main types of simulation
Terminating (finite horizon) simulation:
A specific end of the simulation, a terminating event
Non-terminating simulation:
No natural event to specify the length of a run
The simulation could run forever; termination depends on accuracy requirements
Example of Terminating Systems
Terminating time known in advance
Bank one-day operation simulation
Opens 8.30 am, closes 4.30 pm
Initial conditions: 0 customers, 8 out of 11 tellers working
Simulation time: 480 minutes
Terminating time not known in advance
Communication system with components A, B, C, D
Simulation stops when the system fails: {A fails, or D fails, or (B and C both fail)}
One possible performance measure: the mean time to system failure, E(T_E)
[Figure: component A in series with the parallel pair B, C, in series with D]
Output Analysis for Terminating Simulations
The simulation runs over a time interval [0, T_E]
Output observations: {Y_1, Y_2, ..., Y_n}
We want to estimate
θ = E[ (1/n) Σ_{i=1}^{n} Y_i ]
Repeat the simulation R times, with random initial conditions and independent streams from run to run
Y_{ri} — the i-th observation within replication r
Y_{ri}, Y_{rj} may be correlated, but Y_{si}, Y_{ri} are independent for all i and s ≠ r
The sample mean of replication r is
Ȳ_r = (1/n_r) Σ_{i=1}^{n_r} Y_{ri},   r = 1, ..., R
With the overall mean Ȳ = (1/R) Σ_{r=1}^{R} Ȳ_r and sample variance S² = (1/(R−1)) Σ_{r=1}^{R} (Ȳ_r − Ȳ)², the 100(1 − α)% confidence interval is
Ȳ − t_{α/2, R−1} S/√R  ≤  θ  ≤  Ȳ + t_{α/2, R−1} S/√R
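Putting the method together, here is a hedged Python sketch of independent replications for an M/M/1 queue, using Lindley's recursion for the waiting times; the seeds, R = 20, n = 500 and the table value t_{0.025,19} = 2.093 are illustrative choices, not lecture material:

```python
import math
import random
from statistics import mean, stdev

def mm1_waits(n_customers, lam, mu, rng):
    """Delays in queue for the first n customers of an M/M/1 queue,
    via Lindley's recursion D_{i+1} = max(0, D_i + S_i - A_{i+1})."""
    d, waits = 0.0, []
    for _ in range(n_customers):
        waits.append(d)
        d = max(0.0, d + rng.expovariate(mu) - rng.expovariate(lam))
    return waits

# R independent replications: identical parameters, different random streams
R, n = 20, 500
rep_means = [mean(mm1_waits(n, lam=0.5, mu=1.0, rng=random.Random(1000 + r)))
             for r in range(R)]

# Y_bar +/- t_{alpha/2, R-1} * S / sqrt(R); t_{0.025,19} = 2.093 from tables
y_bar, s = mean(rep_means), stdev(rep_means)
hw = 2.093 * s / math.sqrt(R)
print(f"95% CI for mean delay: ({y_bar - hw:.3f}, {y_bar + hw:.3f})")
```

Each replication restarts from the same empty-system initial condition with its own seed, so the replication means are IID and the t-based interval from the slide applies directly to them.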