Simulation Theory Review

Uploaded by

tuongphan063

SIMULATION THEORY

REVIEW FOR FINAL EXAM


By Phat Nguyen
Identify distributions!
[Figure: example histograms, Triangular and Uniform]
Selecting the Family of Distribution

• A family of distributions is selected based on:


– Shape of the histogram
– The context of the input variable

Reference:
● Understanding and choosing the right probability distribution
https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781119197096.app03
● Jerry Banks, chapter 5: Statistical models in simulation
● Jerry Banks, chapter 9: page 346
[Figure: example histograms, Poisson, Exponential, Normal and Exponential]
Guideline of Probability Distributions
Discrete:
● Binomial: # of successes in n trials
● Negative binomial: # of trials for r successes
● Geometric: # of trials for 1st success
● Poisson: # of independent events that occur in a fixed amount of time or space

Continuous:
● Normal: distribution of a process that is the sum of a number of component processes. E.g. assembly
● Lognormal: distribution of a process that can be thought of as the product of (meaning to multiply together) a number of component processes. E.g. ROI with compounded interest
● Exponential: time between independent events, or a process time that is memoryless
Guideline of Probability Distributions
Continuous (continued):
● Gamma: models nonnegative random variables. The gamma can be shifted away from 0 by adding a constant.
● Beta: models bounded (fixed upper and lower limits) random variables, ranging over [0,1] (unit interval). The beta can be shifted away from 0 by adding a constant and rescaled to a larger range by multiplying by a constant.
● Erlang: sum of several exponentially distributed processes
● Weibull: time to failure for components

● Discrete or continuous uniform: models complete uncertainty
● Triangular (continuous): a process for which only the minimum, most likely, and maximum values are known
● Empirical: resamples from the actual data collected
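As a rough illustration of the guidelines above, the sketch below draws one variate from several of these families with Python's standard `random` module. All parameter values are arbitrary assumptions chosen for illustration, and `sample_families` is a hypothetical helper name. The Erlang variate is built as a sum of exponentials, as described in the guideline.

```python
import random

def sample_families(rng):
    """Draw one variate per family; every parameter value here is illustrative."""
    k, rate = 3, 0.5  # hypothetical Erlang order and exponential rate
    return {
        "exponential": rng.expovariate(rate),                    # time between independent events
        "erlang": sum(rng.expovariate(rate) for _ in range(k)),  # sum of k exponentials
        "gamma": rng.gammavariate(2.0, 1.5),                     # nonnegative; (shape, scale)
        "beta": rng.betavariate(2.0, 5.0),                       # bounded on [0, 1]
        "weibull": rng.weibullvariate(10.0, 1.5),                # time to failure; (scale, shape)
        "uniform": rng.uniform(2.0, 8.0),                        # complete uncertainty on [2, 8]
        "triangular": rng.triangular(1.0, 9.0, 4.0),             # (min, max, most likely)
        "normal": rng.gauss(50.0, 5.0),                          # sum of component processes
        "lognormal": rng.lognormvariate(0.0, 0.25),              # product of component processes
    }

samples = sample_families(random.Random(42))
```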
Common random variables
● Queuing system: interarrival time, service time
→ constant, Normal (> 0), Exponential, Gamma, Weibull

● Inventory & supply chain:
- Number of demands: Geometric, Poisson, Negative binomial
- Lead time: Gamma
- Time between demands

● Reliability & maintainability:
- Time to failure: Weibull, Exponential, Normal

(Jerry Banks, chapter 5)

Practical Selection of Distribution Family

Theoretical probability distribution
- Histogram: uniform or single hump
- Mathematical formulation
- ARENA: TRIA( ), NORM( ), ...
- Bounded or unbounded? E.g. TRIA is preferred to NORM
- Discrete or continuous?
- Ease of parameter manipulation? E.g. EXPO is preferred to WEIB

Empirical probability distribution (large sample size)
- Histogram: several groupings of values (multimodal), or data points different from the main set (outliers)
- Divide data into groupings, calculate proportions, interpolate
- ARENA: CONT( ), DISC( )
- Discrete or continuous? E.g. assigning entity type
4 Steps of Input modelling

1. Data Collection
2. Identifying Distribution (Family)
3. Parameter Estimation
4. Goodness-of-Fit Tests
Parameter Estimation (signature)
• To identify a specific instance of the distribution family
• Location parameters: they shift the density function
• Shape parameters: they change the shape of the density function
• Scale parameters: they change the scale of the density function

[Figures: Normal, location & scale parameters; Weibull, shape parameter; Weibull, scale parameter]


• A parameter is an unknown constant, but an estimator is a statistic.
• If observations in a sample of size n are X1, X2, …, Xn (discrete or continuous), the sample mean and variance are:

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}$

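Example: raw data of component life (days), continuous data
If the data come from an Exponential distribution, the rate is estimated by $\hat{\lambda} = 1/\bar{X}$.

• If the data are discrete and have been grouped in a frequency distribution with k distinct values X_j and frequencies f_j:

$\bar{X} = \frac{1}{n}\sum_{j=1}^{k} f_j X_j, \qquad S^2 = \frac{\sum_{j=1}^{k} f_j X_j^2 - n\bar{X}^2}{n-1}$

Example: Number of Arrivals in a 5-Minute Period
The histogram + context → X follows a Poisson distribution
Question: why is the sample mean not equal to the sample variance?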
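The estimators on this slide can be sketched in a few lines of Python. The data values below are hypothetical and only illustrate the arithmetic. `grouped_mean_variance` is an illustrative helper name, and the exponential rate estimate uses the standard estimator, 1 divided by the sample mean.

```python
from statistics import mean

def grouped_mean_variance(values, freqs):
    """Sample mean and variance from a frequency table (value X_j observed f_j times)."""
    n = sum(freqs)
    xbar = sum(f * x for x, f in zip(values, freqs)) / n
    s2 = (sum(f * x * x for x, f in zip(values, freqs)) - n * xbar ** 2) / (n - 1)
    return xbar, s2

# Continuous raw data (hypothetical component lives, in days): exponential rate estimate
life_days = [2.0, 4.0, 6.0]
rate_hat = 1.0 / mean(life_days)

# Discrete grouped data (hypothetical counts): for a Poisson fit, the mean estimates the parameter
arrivals = [0, 1, 2]
frequency = [1, 2, 1]
poisson_mean_hat, sample_var = grouped_mean_variance(arrivals, frequency)
```

For a Poisson variable the theoretical mean and variance are equal, so comparing `poisson_mean_hat` with `sample_var` is a quick informal check on the chosen family.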
Goodness-of-fit Test

• Graphical approach:
– Q-Q plot: graphs the quantiles of the fitted distribution
vs. the sample quantiles.
– P-P plot: graphs the fitted CDF vs. the empirical CDF

• Statistical Test:
– Kolmogorov-Smirnov test
– Chi-square test

Quantile-Quantile Plot

A median is a _______% quantile

● Q-Q plot: the plot of y_j versus F^{-1}((j - 0.5)/n), where y_j is the j-th smallest observation, is
– Approximately a straight line if F is a member of an appropriate family of distributions
– The line has slope 1 if F is a member of an appropriate family of distributions with appropriate parameter values
● A Q-Q plot can also be used to check homogeneity
– Check whether a single distribution can represent both sample sets
– Plot the ordered values of the two data samples against each other
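A minimal sketch of how the Q-Q points could be computed for an exponential fit, assuming the rate has already been estimated. The function name and the data are hypothetical. The inverse CDF used is F^{-1}(p) = -ln(1 - p)/rate.

```python
import math

def qq_points_exponential(data, rate):
    """Return (theoretical quantile, ordered observation) pairs for an exponential fit."""
    ordered = sorted(data)
    n = len(ordered)
    points = []
    for j, y_j in enumerate(ordered, start=1):
        p = (j - 0.5) / n
        quantile = -math.log(1.0 - p) / rate  # F^{-1}(p) for the exponential distribution
        points.append((quantile, y_j))
    return points

# Hypothetical sample; the rate is estimated as 1 / sample mean
sample = [0.3, 2.1, 0.9, 1.4, 0.2, 3.8]
points = qq_points_exponential(sample, rate=len(sample) / sum(sample))
```

Plotting these pairs and judging how close they lie to a straight line of slope 1 is the graphical check described above.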
Exercise
[Figures: Q-Q plots testing Exponential and Uniform fits]
Test uniform distribution using Q-Q plot. Comment?

Quantile-Quantile Plot
Randomness tends to obscure things, especially with small samples.
Statistical Test
We want to test the null hypothesis:
H0: The random variable X conforms to the distributional assumption with the parameter(s) given by the estimate(s).
H1: The random variable X does not conform.

α = P(Type I error) = Pr(reject H0 | H0 is true)
β = P(Type II error) = Pr(fail to reject H0 | H0 is false)
Power = 1 − β = Pr(reject H0 | H0 is false)
p-value: the smallest significance level α that leads to rejection of H0
Chi-square Test (for both discrete and continuous data; large data sets)
• Test statistic:

$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

where O_i is the observed frequency in class i and E_i = n p_i is the expected frequency. The statistic approximately follows the chi-square distribution with k-s-1 degrees of freedom, where s = number of parameters of the hypothesized distribution estimated by the sample statistics.
- One should use E_i ≥ 5

• Reject H0 if $\chi_0^2 > \chi^2_{\alpha,\,k-s-1}$
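Chi-square Test

– Recommended number of class intervals (k):
  n = 20: do not use the chi-square test
  n = 50: k = 5 to 10
  n = 100: k = 10 to 20
  n > 100: k = √n to n/5
– Caution: different groupings of the data (i.e., different k) can affect the hypothesis testing result.
– If using equal probabilities, then p_i = 1/k
– Recommended: E_i = n p_i ≥ 5, i.e. k ≤ n/5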
Chi-square Test (for discrete case)
• Vehicle Arrival Example:
H0: the random variable is Poisson distributed.
H1: the random variable is not Poisson distributed.

– Classes with E_i < 5 are combined until the minimum E_i is met; the corresponding O_i are combined at the same time.
– The degrees of freedom are k-s-1 = 7-1-1 = 5, so the critical value is χ² = 11.1 at α = 0.05; the computed statistic exceeds it, hence the hypothesis is rejected at the 0.05 level of significance.
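The discrete-case procedure (estimate the Poisson mean, compute E_i = n p_i, combine small classes, then compute the statistic) can be sketched as below. The observed counts are illustrative values assumed for this sketch, not data taken from the slides, and all function names are hypothetical. The merging rule used here (accumulate adjacent classes from the left, fold any leftover tail into the last class) is one simple choice among several.

```python
import math

def poisson_expected(n, lam, max_value):
    """Expected counts for X = 0..max_value-1 plus one tail class X >= max_value."""
    probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(max_value)]
    probs.append(1.0 - sum(probs))
    return [n * p for p in probs]

def merge_small_cells(observed, expected, min_expected=5.0):
    """Combine adjacent classes until each has E_i >= min_expected; O_i are merged alongside."""
    merged_o, merged_e = [], []
    acc_o, acc_e = 0, 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            merged_o.append(acc_o)
            merged_e.append(acc_e)
            acc_o, acc_e = 0, 0.0
    if acc_o > 0 or acc_e > 0:  # leftover tail below the threshold
        if merged_e:
            merged_o[-1] += acc_o
            merged_e[-1] += acc_e
        else:
            merged_o.append(acc_o)
            merged_e.append(acc_e)
    return merged_o, merged_e

def chi_square_statistic(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative counts of periods with 0, 1, ..., 11 arrivals (assumed for this sketch)
observed = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]
n = sum(observed)
lam_hat = sum(x * o for x, o in enumerate(observed)) / n  # one estimated parameter, s = 1
expected = poisson_expected(n, lam_hat, len(observed) - 1)
obs_m, exp_m = merge_small_cells(observed, expected)
chi2_0 = chi_square_statistic(obs_m, exp_m)
dof = len(obs_m) - 1 - 1  # k - s - 1
```

With these counts the merged table has 7 classes and 5 degrees of freedom, and `chi2_0` is well above the 0.05 critical value of about 11.07, so a Poisson fit would be rejected.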
Chi-square Test (for continuous case)
● Component Life Example
H0: the random variable is Exponentially distributed.
H1: the random variable is not Exponentially distributed.

50 data points → number of intervals?
E_i = n p_i ≥ 5, i.e. k ≤ n/5 = 10
5 < k < 10; choose k = 8, so E_i = n p_i = 50/8 = 6.25

Degrees of freedom = k-s-1 = 8-1-1 = 6
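For the equal-probability approach with an exponential hypothesis, the class endpoints follow from inverting the CDF: a_i = -(1/λ) ln(1 - i/k). A minimal sketch, assuming a hypothetical estimated rate of 0.084 per day and an illustrative function name:

```python
import math

def exponential_equal_prob_endpoints(rate, k):
    """Endpoints a_0 < a_1 < ... < a_k with P(a_{i-1} <= X < a_i) = 1/k; a_k is infinite."""
    ends = [-math.log(1.0 - i / k) / rate for i in range(k)]
    ends.append(math.inf)
    return ends

k = 8
n = 50
endpoints = exponential_equal_prob_endpoints(rate=0.084, k=k)  # rate is an assumed value
expected_per_interval = n / k  # E_i = n * p_i with p_i = 1/k
```

Counting how many observations fall into each interval gives the O_i to compare with the constant E_i = 6.25.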


Kolmogorov-Smirnov Test (only for continuous data; small data sets)

• Test statistic:
D = max |F(x) − Sn(x)|, where F is the hypothesized CDF and Sn is the empirical CDF of the sample

• Reject H0 if D > d_{n,α}

• A more powerful test, particularly useful when:
– Sample sizes are small,
– No parameters have been estimated from the data,
– The distribution is continuous (for a discrete distribution the tabulated critical values are only approximate).
• When parameter estimates have been made:
– Critical values in Table A.8 are biased, too large.
– More conservative, i.e., smaller Type I error than specified.
Example: test whether these 5 numbers follow unif(0,1): 0.44, 0.81, 0.14, 0.05, 0.93.
R(i) = F(x) (substitute x into the CDF of the hypothesized distribution)

Step 1: Arrange R(i) from smallest to largest
Step 2: Compute D+ = max {i/N – R(i)} and D- = max {R(i) – (i-1)/N}

R(i)              0.05   0.14   0.44   0.81   0.93
i/N               0.20   0.40   0.60   0.80   1.00
i/N – R(i)        0.15   0.26   0.16   -      0.07
R(i) – (i-1)/N    0.05   -      0.04   0.21   0.13

Step 3: D = max(D+, D-) = 0.26

Step 4: For α = 0.05, Dα = 0.565 > D

Hence, H0 is not rejected.
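The four steps of this example can be checked with a short sketch. `ks_statistic` is a hypothetical helper that takes the hypothesized CDF as a function; for unif(0,1) the CDF is F(x) = x. The critical value 0.565 is the one quoted in the example.

```python
def ks_statistic(data, cdf):
    """Return (D+, D-, D) for the sample against the hypothesized CDF."""
    r = sorted(cdf(x) for x in data)  # Step 1: R(i) in increasing order
    n = len(r)
    d_plus = max(i / n - r_i for i, r_i in enumerate(r, start=1))         # Step 2
    d_minus = max(r_i - (i - 1) / n for i, r_i in enumerate(r, start=1))  # Step 2
    return d_plus, d_minus, max(d_plus, d_minus)                          # Step 3

d_plus, d_minus, d = ks_statistic([0.44, 0.81, 0.14, 0.05, 0.93], cdf=lambda x: x)
reject_h0 = d > 0.565  # Step 4: critical value for N = 5, alpha = 0.05 (from the example)
```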
Verification and validation

Book: Jerry Banks, chapter 10
Kelton, chapter 4

Dr. P.H.Tram, May 2020


Validating I-O Transformation
[Cal. & Val.]
Bank Example: Hypothesis Testing

• Compare the average delay from the model, Y2, with the actual delay, Z2 (continued):
– Null hypothesis testing: evaluate whether the simulation and the real system are the same (w.r.t. output measures):
H0: E(Y2) = μ0 versus H1: E(Y2) ≠ μ0, where μ0 is the observed system value
• If H0 is not rejected, then there is no reason to consider the model invalid
• If H0 is rejected, the current version of the model is rejected, and the modeler needs to improve the model
Validating I-O Transformation
[Cal. & Val.]
Bank Example: Hypothesis Testing

– Conduct the t test:
• Choose the level of significance (α = 0.05) and sample size (n = 6),
• Compute the sample mean and sample standard deviation over the n replications:

$\bar{Y}_2 = \frac{1}{n}\sum_{i=1}^{n} Y_{2i}, \qquad S = \sqrt{\frac{\sum_{i=1}^{n} (Y_{2i} - \bar{Y}_2)^2}{n-1}}$

• Compute the test statistic and reject H0 if $|t_0| > t_{\alpha/2,\,n-1}$:

$t_0 = \frac{\bar{Y}_2 - \mu_0}{S/\sqrt{n}}$

• Hence, reject H0. Conclude that the model is inadequate.
• Check the assumptions justifying a t test: that the observations (Y2i) are normally and independently distributed.
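A minimal sketch of this two-sided t test. The six replication averages are illustrative values assumed for the sketch, and `one_sample_t` is a hypothetical helper. The critical value t(0.025, 5) = 2.571 is passed in explicitly because the standard library has no t quantile function.

```python
import math
from statistics import mean, stdev

def one_sample_t(y, mu0, t_crit):
    """Return (t0, reject) for H0: E(Y) = mu0 against a two-sided alternative."""
    n = len(y)
    t0 = (mean(y) - mu0) / (stdev(y) / math.sqrt(n))
    return t0, abs(t0) > t_crit

# Assumed average delays (minutes) from n = 6 replications; mu0 = 4.3 as in the bank example
delays = [2.79, 1.12, 2.24, 3.45, 3.13, 2.38]
t0, reject = one_sample_t(delays, mu0=4.3, t_crit=2.571)
```

With these values t0 is about -5.3, well beyond the critical value, so H0 would be rejected.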
Validating I-O Transformation
[Cal. & Val.]
Type I and II Error

• Type I error (α):


– Error of rejecting a valid model.
– Controlled by specifying a small level of significance α.

• Type II error (β):


– Error of accepting a model as valid when it is invalid.
– Controlled by specifying the critical difference and finding the sample size n.

• For a fixed sample size n, increasing α will decrease β.

Validating I-O Transformation
[Cal. & Val.]
Type II Error

❖ For validation, the power of the test is:
▪ Probability[detecting an invalid model] = 1 – β (we want to maximize)
● β = P(Type II error) = P(failing to reject H0 | H1 is true)
▪ If failure to reject H0 is to be treated as a strong conclusion, the modeler would want β to be small.
▪ Value of β depends on:
• Sample size, n
• The true difference between E(Y) and μ, standardized as

$\delta = \frac{|E(Y) - \mu|}{\sigma}$

Note: the numerator is the critical difference ε. It is not computed; we assume it (it is given in the problem).

❖ In general, the best approach to control β error is:
▪ Specify the critical difference ε, calculate δ.
▪ Choose a sample size, n, by making use of the operating characteristics curve (OC curve).

E.g. ε = |E(Y2) − μ| = 1, s = 1.66 → δ = 1/1.66 ≈ 0.6 → β = ? (read from the OC curve)
Validating I-O Transformation
[Cal. & Val.]
Confidence Interval Testing

• Confidence interval testing: evaluate whether the simulation and the real system are close enough.
• If Y is the simulation output, and μ = E(Y), the confidence interval (C.I.) for μ is:

$\bar{Y} \pm t_{\alpha/2,\,n-1}\, S/\sqrt{n}$

Homework:
- Is a small C.I. width preferred to a large C.I. width?
- What is the relationship between the width of the C.I. and the sample size n?

Validating I-O Transformation
[Cal. & Val.]
Accept, Reject, or More Replication?

(a) C.I. does not contain μ0:
• If the best-case error is > ε, model needs to be refined.
• If the worst-case error is ≤ ε, accept the model.
• If best-case error is ≤ ε, but the worst-case is > ε, additional replications are necessary.

(b) C.I. contains μ0:
• If either the best-case or worst-case error is > ε, additional replications are necessary.
• If the worst-case error is ≤ ε, accept the model.

Note: determine the worst and best cases from the distances between μ0 and the C.I. bounds (LB, UB).
Validating I-O Transformation
[Cal. & Val.]
Confidence Interval Testing

• Bank example: μ0 = 4.3, and “close enough” is ε = 1 minute of expected customer delay.
– A 95% confidence interval, based on the 6 replications, is [1.65, 3.37]
– The “real” value falls outside the confidence interval; the best case is |3.37 – 4.3| = 0.93 < 1, but the worst case is |1.65 – 4.3| = 2.65 > 1
→ conclusion: additional replications are needed to reach a decision
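The accept / refine / replicate rule can be written as a small function. This is a sketch of the rule as stated on these slides, with a hypothetical function name; when μ0 lies inside the interval, the best-case error is taken to be 0.

```python
def validation_decision(lower, upper, mu0, eps):
    """Classify a C.I. [lower, upper] for E(Y) against the system value mu0 and tolerance eps."""
    if lower <= mu0 <= upper:
        best = 0.0  # mu0 lies inside the interval
    else:
        best = min(abs(lower - mu0), abs(upper - mu0))
    worst = max(abs(lower - mu0), abs(upper - mu0))
    if worst <= eps:
        return "accept the model"
    if best > eps:
        return "refine the model"
    return "additional replications"

# Bank example from the slide: C.I. = [1.65, 3.37], mu0 = 4.3, eps = 1
decision = validation_decision(1.65, 3.37, mu0=4.3, eps=1.0)
```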
I-O Validation: Using Historical Input Data
[Cal. & Val.]
Paired t test

• Table 10.6: Comparison of System and Model Output Measures for Identical Historical Inputs (analyzed with a paired t test)

I-O Validation: Using Historical Input Data
[Cal. & Val.]
The Candy Factory

• Table 10.7: Validation of the Candy-Factory Model (Continued)

Estimation of absolute
performance (1 system)
Book: Jerry Banks, chapter 11
Kelton, chapter 7

Dr. P.H.Tram, May 2022


• “Terminating” and “Steady state” simulation
• “Tally” and “Time persistent” Statistics
• “Within” and “across replication”
• Confidence Interval (C.I) and C.I. with specified precision
• “Warm-up” time
“Terminating” and “Steady state” simulation
(simulation length; ARENA v16)

“Terminating” or “Steady state” simulation?
o Bank opens from 8:30 to 4:30 (terminating)
  Performance: one day operation, flow of money, ATM
  Simulation length:
o Manufacturing process runs continuously from Monday mornings until Saturday mornings (terminating)
  Performance: one shift, 13 shifts
  Simulation length:
o Large web-based order-processing company runs continuously 24 h per day (steady state)
o Hospital emergency room

• What is the nature of the simulation system? → Initial condition, stopping time (terminating or nonterminating?)
• What is the objective of the simulation study? → Simulation length (outputs? short-run or long-run behavior?)
“Terminating” or “Steady state” simulation

• Terminating (transient) simulation:


- Simulation that runs for some duration of time TE
- Must define Initial condition at time 0, stopping time TE , stopping event E

• Steady state simulation:


- Run continuously, or at least over a very long period of time
- To study steady state/long run properties that are not influenced by the initial
conditions
- stopping time TE is determined not by the nature of the problem, but by the
simulation analyst

Types of performance measures and their estimation

Example: WIP, Throughput, Time in system, Cycle time, Takt time
Types of performance measures and their estimation
• Discrete-time statistic (Tally): changes with each entity (take the ordinary average)
Estimation of a performance parameter θ (ordinary mean)
Simulation output data are of the form Y1, Y2, Y3, ..., Yn
E.g.: time in system, waiting time

• Continuous-time statistic (Time persistent): changes over time (take the time-weighted average)
Estimation of a performance parameter φ (time-weighted mean)
Simulation output data are of the form Y(t), 0 < t < TE
E.g.: WIP, throughput, number in queue
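The difference between the two kinds of average can be sketched as follows. The data are hypothetical, and the time-persistent variable is assumed to be piecewise constant, stored as (time, value) change points; `time_weighted_mean` is an illustrative helper name.

```python
from statistics import mean

def time_weighted_mean(changes, t_end):
    """Time average of a piecewise-constant Y(t).

    changes: (time, value) pairs sorted by time; each value holds until the next change.
    """
    area = 0.0
    for (t, y), (t_next, _) in zip(changes, changes[1:] + [(t_end, None)]):
        area += y * (t_next - t)
    return area / (t_end - changes[0][0])

# Tally statistic: one observation per entity (hypothetical times in system)
theta_hat = mean([1.0, 2.0, 3.0])

# Time-persistent statistic: number in queue is 0 on [0,2), 1 on [2,5), 2 on [5,10)
phi_hat = time_weighted_mean([(0.0, 0), (2.0, 1), (5.0, 2)], t_end=10.0)
```

Here `phi_hat` is (0·2 + 1·3 + 2·5)/10 = 1.3, whereas an ordinary average of the three levels would give 1.0.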
Absolute measures of performance and their estimation
• Point estimation
(Discrete/Tally) $\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} Y_i$
(Continuous/Time persistent) $\hat{\phi} = \frac{1}{T_E}\int_0^{T_E} Y(t)\,dt$

• Interval estimation
CI = Avg ± HW (Half Width)
(→ with 100(1−α)% confidence, E[Y] lies within these two limits.)

● “Half Width”: half of the range between the upper limit and the lower limit; it is used as a measure of the accuracy of the estimate.
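A minimal sketch of the average and half width computed across replications, assuming the usual t-based interval. The replication values are hypothetical, the function name is illustrative, and the critical value t(0.025, 4) = 2.776 is supplied by hand because the standard library has no t quantile function.

```python
import math
from statistics import mean, stdev

def confidence_interval(replication_values, t_crit):
    """Return (lower, upper, half_width) from one summary value per replication."""
    n = len(replication_values)
    avg = mean(replication_values)
    half_width = t_crit * stdev(replication_values) / math.sqrt(n)
    return avg - half_width, avg + half_width, half_width

# Hypothetical output of 5 independent replications, 95% confidence
lower, upper, hw = confidence_interval([10.0, 12.0, 14.0, 16.0, 18.0], t_crit=2.776)
```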
Output analysis for terminating simulation
With across-replication data, we use the same equations for both tally
and time-persistent statistics
Problem type: computing the number of replications
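The slide's own formula for this problem type did not survive, so the sketch below shows two commonly used approximations rather than the course's specific method. The first scales a pilot run: n ≈ n0 (h0/h)², where h0 is the half width from n0 pilot replications and h is the target half width. The second is a normal-based first estimate: n ≥ (z_{α/2} S0/ε)². Function names and numbers are assumptions.

```python
import math
from statistics import NormalDist

def replications_from_pilot(n0, h0, h):
    """Approximate replications needed to shrink half width h0 (from n0 runs) to h."""
    return math.ceil(n0 * (h0 / h) ** 2)

def replications_normal_approx(s0, eps, alpha=0.05):
    """First estimate of replications for half width eps, using a normal quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * s0 / eps) ** 2)

n_pilot_based = replications_from_pilot(n0=10, h0=2.0, h=1.0)
n_normal_based = replications_normal_approx(s0=5.0, eps=1.0)
```

Halving the half width roughly quadruples the required number of replications, which is why `n_pilot_based` is 40 for a pilot of 10.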
Estimation of Relative
Performance (>= 2 systems)
Book: - Jerry Banks, chapter 12
- Kelton, chapters 6, 12

Dr. P.H.Tram, June 2022


Comparison of 2 system designs
● Independent sampling
- Different and independent random number streams are used to simulate the 2 systems

● Common Random Numbers (CRN) or correlated sampling
- To generate the same variates for the same purpose in runs A & B
- For each replication, the same random numbers are used to simulate both systems.
- Across different replications, independent random numbers are used

Did we actually use the same random numbers in the previous example?
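A sketch of how CRN could be arranged in code, assuming a simple single-server queue simulated with the Lindley recursion; the model, rates and function name are hypothetical. Both designs receive the same seed within a replication, and each uniform number is used for the same purpose (interarrival or service) in both runs. Different replications use different seeds.

```python
import math
import random

def average_delay(seed, arrival_rate, service_rate, customers=200):
    """Average waiting time in queue for a single-server queue (Lindley recursion)."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(customers):
        u_arrival, u_service = rng.random(), rng.random()  # synchronized use of random numbers
        interarrival = -math.log(1.0 - u_arrival) / arrival_rate
        service = -math.log(1.0 - u_service) / service_rate
        total += wait
        wait = max(0.0, wait + service - interarrival)  # delay of the next customer
    return total / customers

# Design A: service rate 1.2; design B: service rate 1.5; same seed per replication (CRN)
differences = [
    average_delay(seed, 1.0, 1.2) - average_delay(seed, 1.0, 1.5)
    for seed in range(10)
]
```

Because both designs see the same arrivals and the same underlying service draws, the per-replication differences reflect the design change rather than sampling noise.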


● Focus on correlated sampling

2 tests to compare: hypothesis test, or confidence interval test

Hypothesis testing (paired t test)
Confidence interval test (on the difference Y1 – Y2)

If the C.I. contains 0 => not significantly different
If the C.I. does not contain 0 => significantly different
• If the C.I. is entirely positive => Y1 is significantly larger than Y2
• If the C.I. is entirely negative => Y1 is significantly smaller than Y2
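The confidence interval test on paired differences can be sketched as below. The replication outputs are hypothetical, the function name is illustrative, and the t critical value (2.776 for 4 degrees of freedom at 95%) is supplied by hand.

```python
import math
from statistics import mean, stdev

def paired_difference_ci(y1, y2, t_crit):
    """C.I. for E(Y1 - Y2) from paired replications, plus the resulting verdict."""
    d = [a - b for a, b in zip(y1, y2)]
    half_width = t_crit * stdev(d) / math.sqrt(len(d))
    lower, upper = mean(d) - half_width, mean(d) + half_width
    if lower > 0:
        verdict = "Y1 is significantly larger than Y2"
    elif upper < 0:
        verdict = "Y1 is significantly smaller than Y2"
    else:
        verdict = "not significantly different"
    return lower, upper, verdict

# Hypothetical outputs of 5 paired replications for two designs
system_1 = [10.0, 12.0, 11.0, 13.0, 12.0]
system_2 = [8.0, 9.0, 9.0, 10.0, 9.0]
lower, upper, verdict = paired_difference_ci(system_1, system_2, t_crit=2.776)
```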
Problem type: computing the number of replications
