Sms Mod1 5@azdocuments - in
1. Simulation:
Simulation is the imitation of the operation of a real-world process or system over time.
Simulation models help us to study the behavior of a system as it evolves.
A model comprises a set of assumptions concerning the operation of the system.
Assumptions are expressed in terms of mathematical, logical, and symbolic relationships
between the entities, or objects of interest, of the system.
Simulation modeling can be used both as an analysis tool to predict the performance of a
new system and to predict the effect of changes to an existing system.
Simulation can be done by hand or on a computer; either way, it keeps a history of the system.
Simulation produces a set of data that is used to estimate the measures of performance of the
system.
1.3 Advantages of Simulation
1. New policies, operating procedures, decision rules, information flow, etc can be
explored without disrupting the ongoing operations of the real system.
2. New hardware designs, physical layouts, transportation systems can be tested
without committing resources for their acquisition.
3. Hypotheses about how or why certain phenomena occur can be tested for feasibility.
4. Time can be compressed or expanded allowing for a speedup or slowdown of the
phenomena under investigation.
5. Insight can be obtained about the interaction of variables.
6. Insight can be obtained about the importance of variables to the performance of the system.
7. Bottleneck analysis can be performed, indicating where work-in-process, information, materials,
and so on are being excessively delayed.
8. A simulation study can help in understanding how the system operates rather than how
individuals think the system operates.
9. “What-if” questions can be answered. This is useful in the design of new systems.
1.4 Disadvantages of Simulation
1. Model building requires special training. It is an art that is learned over time and through
experience.
2. If two models are constructed by two competent individuals, they may have similarities,
but it is highly unlikely that they will be the same.
3. Simulation results may be difficult to interpret. Since most simulation outputs are
essentially random variables (they are usually based on random inputs), it may be hard to
determine whether an observation is a result of system interrelationships or randomness.
4. Simulation modeling and analysis can be time consuming and expensive. Skimping on
resources for modeling and analysis may result in a simulation model or analysis that is not
sufficient for the task.
5. Simulation is used in some cases when an analytical solution is possible, or even preferable.
This might be particularly true in the simulation of some waiting lines where closed-form
queueing models are available.
1.5 Applications of Simulation
Manufacturing applications
Semiconductor manufacturing
Construction engineering
Military applications
Business process simulation
Human systems
1. Manufacturing Applications
Analysis of electronics assembly operations
Design and evaluation of a selective assembly station for high-precision scroll compressor
shells
Comparison of dispatching rules for semiconductor manufacturing using large-facility
models
Evaluation of cluster tool throughput for thin-film head production
Determining optimal lot size for a semiconductor back-end factory
Optimization of cycle time and utilization in semiconductor test manufacturing
Analysis of storage and retrieval strategies in a warehouse
Investigation of dynamics in a service-oriented supply chain
Model for an Army chemical munitions disposal facility
2. Semiconductor Manufacturing
Comparison of dispatching rules using large-facility models
The corrupting influence of variability
A new lot-release rule for wafer fabs
Assessment of potential gains in productivity due to proactive reticle management
Comparison of a 200-mm and 300-mm X-ray lithography cell
Capacity planning with time constraints between operations
300-mm logistic system risk reduction
3. Construction Engineering
Construction of a dam embankment
Trenchless renewal of underground urban infrastructures
Activity scheduling in a dynamic, multi project setting
Investigation of the structural steel erection process
Special-purpose template for utility tunnel construction
4. Military Applications
Modeling leadership effects and recruit type in an Army recruiting station
Design and test of an intelligent controller for autonomous underwater vehicles
Modeling military requirements for nonwarfighting operations
Using adaptive agents in U.S. Air Force pilot retention
5. Logistics, Transportation, and Distribution Applications
Evaluating the potential benefits of a rail-traffic planning algorithm
Evaluating strategies to improve railroad performance
Parametric modeling in rail-capacity planning
Analysis of passenger flows in an airport terminal
Proactive flight-schedule evaluation
Logistics issues in autonomous food production systems for extended-duration space
exploration
Sizing industrial rail-car fleets
Product distribution in the newspaper industry
Design of a toll plaza
Choosing between rental-car locations
Quick-response replenishment
6. Business Process Simulation
Impact of connection bank redesign on airport gate assignment
Product development program planning
Reconciliation of business and systems modeling
Personnel forecasting and strategic workforce planning
7. Human Systems and Healthcare
Modeling human performance in complex systems
Studying the human element in air traffic control
Modeling front office and patient care in ambulatory health care practices
Evaluating hospital operations between the emergency department and a medical unit
Estimating maximum capacity in an emergency room and reducing length of stay in that
room.
1.6 Systems and System Environment
System:
A system is defined as a group of objects that are joined together in some regular interaction or
interdependence toward the accomplishment of some purpose.
System environment:
A system is often affected by changes occurring outside the system. Such changes are said to
occur in the system environment.
Continuous system:
A continuous system is one in which the state variables change continuously over time.
An example is the head of water behind a dam: during and for some time after a
rainstorm, water flows into the lake behind the dam.
1.9 Model of a system
A model is defined as a representation of a system for the purpose of studying the system.
It is necessary to consider only those aspects of the system that affect the problem under
investigation.
These aspects are represented in a model, and by definition it is a simplification of the system.
Types of Models:
Mathematical or physical models
Static and dynamic models
Deterministic and stochastic models
Discrete and continuous models
1. Problem formulation:
Every study should begin with a statement of the problem.
If the statement is provided by the policy makers, or those that have the problem, the
analyst must ensure that the problem being described is clearly understood.
If the problem statement is being developed by the analyst, it is important that the policy
makers understand and agree with the formulation.
3. Model Conceptualization:
The construction of a model of a system is probably as much art as science.
The art of modeling is enhanced by the ability to do the following:
To abstract the essential features of a problem.
To select and modify basic assumptions that characterize the system.
To enrich and elaborate the model until a useful approximation results.
4. Data Collection:
There is a constant interplay between the construction of the model and the
collection of the needed input data.
As the complexity of the model changes, the required data elements may also
change.
Since data collection takes such a large portion of the total time required to
perform a simulation, it is necessary to begin it as early as possible.
5. Model Translation:
Since most real-world systems result in models that require a great deal of information
storage and computation, the model must be entered into a computer-recognizable format.
We use the term "program" even though it is possible to accomplish the desired result in many
instances with little or no actual coding.
6. Verification:
Verification pertains to the computer program: checking that the program performs properly.
If the input parameters and logical structure are correctly represented, verification has been
completed.
7. Validation:
validation is the determination that a model is an accurate representation of the real
system.
Validation is usually achieved through the calibration of the model: an iterative process of
comparing the model to actual system behavior and using the discrepancies between the two,
and the insights gained, to improve the model.
This process is repeated until model accuracy is judged acceptable.
8. Experimental Design:
The alternatives that are to be simulated must be determined. For each system design that is
simulated, decisions need to be made concerning the length of the initialization period, the
length of simulation runs, and the number of replications to be made of each run.
10. More runs:
Based on the analysis of runs that have been completed, the analyst determines whether
additional runs are needed and what design those additional experiments should follow.
11. Documentation and reporting:
There are two types of documentation: program documentation and process documentation.
Program documentation: can be used again by the same or different
analysts to understand how the program operates.
Process documentation: enables review of the final formulation and the
alternatives considered, the results of the experiments, and the recommended solution to the
problem. The final report provides a vehicle of certification.
12.Implementation:
Success depends on the previous steps. If the model user has been thoroughly involved and
understands the nature of the model and its outputs, likelihood of a vigorous implementation is
enhanced.
1.12 Simulation of queuing systems
A Queuing system is described by its calling population, the nature of its arrivals, the service
mechanism, the system capacity, and queuing discipline.
Simulation is often used in the analysis of queuing models. In a simple typical queuing model,
shown in
In the single-channel queue, the calling population is infinite; that is, if a unit leaves the
calling population and joins the waiting line or enters service, there is no change in the
arrival rate of other units that may need service.
Arrivals for service occur one at a time in a random fashion; once they join the waiting
line, they are eventually served.
The system capacity has no limit, meaning that any number of units can wait in line. Finally,
units are served in the order of their arrival (often called FIFO: first in, first out) by a single
server or channel.
Arrivals and services are defined by the distributions of the time between arrivals and the
distribution of service times, respectively.
The state of the system: the number of units in the system and the status of the server, busy or
idle.
An event: a set of circumstances that cause an instantaneous change in the state of the system.
In a single-channel queueing system there are only two possible events that can affect
the state of the system.
The simulation clock is used to track simulated time.
The arrival event occurs when a unit enters the system. The flow diagram for the arrival event
is shown in
The unit may find the server either idle or busy; therefore, either the unit begins service
immediately, or it enters the queue for the server. The unit follows the course of action shown
in fig 2.4.
If the server is busy, the unit enters the queue. If the server is idle and the queue is empty,
the unit begins service. It is not possible for the server to be idle and the queue to be
nonempty.
After the completion of a service, the server may become idle or remain busy with the
next unit. The relationship of these two outcomes to the status of the queue is shown in fig
2.5. If the queue is not empty, another unit will enter service and the server will be busy.
Problems:
Standard Formulas:
1. Average waiting time (i.e., customer wait) = total time customers wait in queue / total number
of customers
3. Probability of idle server (idle time of server) = total idle time of server / total run time of
simulation
6. Average waiting time of those who wait in queue = total time customers wait in queue / total
number of customers who wait
7. Average time a customer spends in the system = total time customers spend in system / total
number of customers
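The bookkeeping behind these formulas can be sketched in a few lines of Python. This is an illustrative sketch, not from the text: the function name and the example interarrival and service times are made up, and the total run time of the simulation is taken here to be the time of the last departure.

```python
def simulate_single_channel(interarrival, service):
    """Hand-simulation of a single-channel FIFO queue (sketch).

    interarrival[i] is the time between customer i-1 and customer i;
    service[i] is customer i's service time."""
    n = len(service)
    arrival = 0.0
    server_free = 0.0        # time at which the server next becomes idle
    total_wait = 0.0         # total time customers wait in queue
    total_system = 0.0       # total time customers spend in the system
    total_idle = 0.0         # total idle time of the server
    n_waited = 0             # customers whose wait in queue was > 0

    for i in range(n):
        arrival += interarrival[i]
        start = max(arrival, server_free)   # service starts when both are ready
        wait = start - arrival
        if wait > 0:
            n_waited += 1
        else:
            total_idle += arrival - server_free   # server sat idle until arrival
        server_free = start + service[i]
        total_wait += wait
        total_system += server_free - arrival

    return {
        "avg_wait": total_wait / n,
        "prob_idle": total_idle / server_free,   # run time = last departure
        "avg_wait_of_those_who_waited": total_wait / n_waited if n_waited else 0.0,
        "avg_time_in_system": total_system / n,
    }

# Example with made-up times (not from the text):
stats = simulate_single_channel([0, 2, 4, 1, 2, 6], [2, 1, 3, 2, 1, 4])
```

Each returned value corresponds to one of the standard formulas above; with the made-up times shown, for instance, two of the six customers wait, for an average wait of 2 time units among those who waited.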
General Principles
1. Discrete-event simulation
• The basic building blocks of all discrete-event simulation models: entities and attributes, activities and
events.
• A system is modeled in terms of
o Its state at each point in time
o The entities that pass through the system and the entities that represent system resources
o The activities and events that cause system state to change.
• Discrete-event models are appropriate for those systems for which changes in system state occur
only at discrete points in time.
• This chapter deals exclusively with dynamic, stochastic systems (i.e., involving time and containing
random elements) which change in a discrete manner.
1. System: A collection of entities (e.g., people and machines) that act together over time to accomplish one
or more goals.
2. Model: An abstract representation of a system, usually containing structural, logical, or mathematical
relationships which describe a system in terms of state, entities and their attributes, sets, processes,
events, activities, and delays.
3. System state: A collection of variables that contain all the information necessary to describe the system
at any time.
4. Entity: Any object or component in the system which requires explicit representation in the model (e.g., a
server, a customer, a machine).
5. Attributes: The properties of a given entity (e.g., the priority of a waiting customer, the routing of a job
through a job shop).
6. List: A collection of (permanently or temporarily) associated entities ordered in some logical fashion
(such as all customers currently in a waiting line, ordered by first come, first served, or by priority).
7. Event: An instantaneous occurrence that changes the state of a system (such as an arrival of a new customer).
8. Event notice: A record of an event to occur at the current or some future time, along with any
associated data necessary to execute the event; at a minimum, the record includes the event type and the
event time.
9. Event list: A list of event notices for future events, ordered by time of occurrence; also known as
the future event list (FEL).
10. Activity: A duration of time of specified length (e.g., a service time or arrival time), which is
known when it begins (although it may be defined in terms of a statistical distribution).
11. Delay: A duration of time of unspecified indefinite length, which is not known until it ends (e.g., a
customer's delay in a last-in, first-out waiting line which, when it begins, depends on future arrivals).
12. Clock: A variable representing simulated time.
The system snapshot at time t=0 and t=t1 (VIP VTU question)
CLOCK | System state | Future event list (FEL)
Event-scheduling/time-advance algorithm
Step 1. Remove the event notice for the imminent event
(event 3, time t1) from the FEL.
Step 2. Advance CLOCK to the imminent event time
(i.e., advance CLOCK from t to t1).
Step 3. Execute the imminent event: update system state, change entity attributes, and set membership as needed.
Step 4. Generate future events (if necessary) and place their event notices on the FEL, ranked by event time.
(Example: Event 4 to occur at time t*, where t2 < t* < t3.)
Step 5. Update cumulative statistics and counters.
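A minimal sketch of steps 1 and 2 using Python's heapq module to hold the FEL; the event times and types below are hypothetical, not from the text.

```python
import heapq

# Future event list (FEL) kept as a heap of (event_time, event_type) notices,
# so the imminent event is always popped first.
fel = []
heapq.heappush(fel, (4.0, "A"))   # arrival scheduled at time 4
heapq.heappush(fel, (2.5, "D"))   # departure scheduled at time 2.5
heapq.heappush(fel, (60.0, "E"))  # stopping event at time 60

clock = 0.0
event_time, event_type = heapq.heappop(fel)  # step 1: remove imminent event
clock = event_time                           # step 2: advance CLOCK
# Step 3 would execute the event routine for event_type here;
# step 4 would push any newly generated event notices back onto fel;
# step 5 would update cumulative statistics and counters.
```

Because the heap orders notices by event time, the pop above returns the departure at time 2.5, the earliest notice on the FEL.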
The system consists of those customers in the waiting line plus the one (if any) checking out. The model has
the following components:
System state (LQ (t), LS (t)), where LQ (t) is the number of customers in the waiting line, and LS (t) is
the number being served (0 or 1) at time t.
Entities: The server and customers are not explicitly modeled, except in terms of the state variables above.
Events
Arrival(A)
Departure(D)
Stopping event (E), scheduled to occur at time 60.
Event notices
(A, t), representing an arrival event to occur at future time t
Flow chart for the execution of arrival and departure events using the time-advance/event-scheduling
algorithm (VTU question)
Question Bank
ρ̂ → ρ as T → ∞
Ls = λ/μ, 0 ≤ Ls ≤ c
The long-run average server utilization is defined by ρ = Ls/c = λ/(cμ).
The utilization ρ can be interpreted as the proportion of time an arbitrary server is busy in
the long run.
4.4 STEADY-STATE BEHAVIOUR OF INFINITE-POPULATION MARKOVIAN MODELS
For the infinite-population models, the arrivals are assumed to follow a Poisson process
with rate λ arrivals per time unit.
The interarrival times are assumed to be exponentially distributed with mean 1/λ.
Service times may be exponentially distributed (M) or arbitrary (G).
The queue discipline will be FIFO. Because of the exponential-distribution assumptions on
the arrival process, these models are called “Markovian models”.
The steady-state parameter L, the time-average number of customers in the system, can be
computed as

L = ∑ n Pn (summing over n = 0, 1, 2, ..., ∞)

where Pn is the steady-state probability of finding n customers in the system.
Other steady-state parameters can be computed readily from Little's equation, applied to the
whole system and to the queue alone:
w = L/λ
wQ = w − 1/μ
LQ = λwQ
where λ is the arrival rate and μ is the service rate per server.
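For the M/M/1 special case, Pn = (1 − ρ)ρ^n with ρ = λ/μ, so the series L = ∑ n Pn sums to ρ/(1 − ρ), and the remaining measures follow from Little's equations above. A small sketch (the function name and the parameter values are illustrative, not from the text):

```python
def mm1_measures(lam, mu):
    """Steady-state measures of an M/M/1 queue; requires lam < mu for stability."""
    rho = lam / mu
    L = rho / (1 - rho)   # closed form of sum over n of n * (1 - rho) * rho**n
    w = L / lam           # Little's equation: w = L / lambda
    wQ = w - 1 / mu       # waiting time in queue excludes one service time
    LQ = lam * wQ         # Little's equation applied to the queue alone
    return L, w, wQ, LQ

L, w, wQ, LQ = mm1_measures(lam=1.0, mu=2.0)   # rho = 0.5
```

With λ = 1 and μ = 2 this gives L = 1, w = 1, wQ = 0.5, and LQ = 0.5, which can be checked by hand against the formulas above.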
When the calling population is small, the presence of one or more customers in
the system has a strong effect on the distribution of future arrivals, and the use of
an infinite-population model can be misleading.
Consider a finite calling population model with K customers. The time between
the end of one service visit and the next call for service for each member of the
population is assumed to be exponentially distributed with mean 1/λ time units.
Service times are also exponentially distributed, with mean 1/μ time units. There
are c parallel servers, and the system capacity is K, so that all arrivals remain for service.
Such a system is shown in the figure.
The effective arrival rate λe has several valid interpretations:
λe = long-run effective arrival rate of customers to the queue
= long-run effective arrival rate of customers entering service
= long-run rate at which customers exit from service
= long-run rate at which customers enter the calling population
=long-run rate at which customers exit from the calling population.
4.6 NETWORKS OF QUEUES
Many systems are naturally modeled as networks of single queues in which
customers departing from one queue may be routed to another.
The following results assume a stable system with infinite calling population and
no limit on system capacity.
1) Provided that no customers are created or destroyed in the queue, the
departure rate out of a queue is the same as the arrival rate into the queue over the
long run.
2) If customers arrive at queue i at rate λi, and a fraction 0 ≤ pij ≤ 1 of them are routed
to queue j upon departure, then the arrival rate from queue i to queue j is λi pij
over the long run.
3) The overall arrival rate into queue j, λj, is the sum of the arrival rates from all
sources. If customers arrive from outside the network at rate aj, then
λj = aj + ∑i λi pij.
4) If queue j has cj < ∞ parallel servers, each working at rate μj, then the long-run
utilization of each server is ρj = λj/(cj μj), and ρj < 1 is required for the queue to be stable.
5) If, for each queue j, arrivals from outside the network form a Poisson process
with rate aj, and if there are cj identical servers delivering exponentially
distributed service times with mean 1/μj, then in steady state queue j behaves like
an M/M/cj queue with arrival rate λj = aj + ∑i λi pij.
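Result 3 can be applied numerically by iterating the traffic equations λj = aj + ∑i λi pij until they stabilize. The sketch below is illustrative only: the function name, the two-queue tandem network, and all rates are hypothetical.

```python
def network_arrival_rates(a, p, iterations=100):
    """Iterate the traffic equations lambda_j = a_j + sum_i lambda_i * p_ij.

    a[j]: external arrival rate into queue j;
    p[i][j]: fraction of queue-i departures routed to queue j."""
    n = len(a)
    lam = list(a)                      # start from the external rates alone
    for _ in range(iterations):
        lam = [a[j] + sum(lam[i] * p[i][j] for i in range(n))
               for j in range(n)]
    return lam

# Hypothetical tandem: all external arrivals (rate 3) enter queue 0,
# and 80% of queue-0 departures are routed on to queue 1.
lam = network_arrival_rates(a=[3.0, 0.0], p=[[0.0, 0.8], [0.0, 0.0]])
```

Here λ0 = 3.0 and λ1 = 3.0 × 0.8 = 2.4; by result 4, a single server at queue 1 with rate μ1 = 4 would then have utilization 2.4/4 = 0.6. For networks with feedback loops the iteration converges only for stable systems; solving the linear system directly is the more general approach.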
UNIT 5: Random-Number Generation and Random-Variate Generation
5.1 Properties of Random Numbers A sequence of random numbers, R1, R2, ..., must have two
important statistical properties: uniformity and independence. Each random number Ri is an independent
sample drawn from a continuous uniform distribution between zero and 1.
That is, the pdf is given by
f(x) = 1, 0 ≤ x ≤ 1
     = 0, otherwise
Pseudo means false, so false random numbers are being generated. The goal of any generation scheme is
to produce a sequence of numbers between zero and 1 that imitates the ideal properties of
uniform distribution and independence as closely as possible. When generating pseudo-random numbers,
certain problems or errors can occur. These errors, or departures from ideal randomness, are all related to
the properties stated previously. Some examples include the following:
3) The mean of the generated numbers may be too high or too low.
c) Several numbers above the mean followed by several numbers below the mean.
Usually, random numbers are generated by a digital computer as part of the simulation. Numerous
methods can be used to generate the values. In selecting among these methods, or routines, there are a
number of important considerations.
1. The routine should be fast. The total cost can be managed by selecting a computationally efficient
method of random-number generation.
2. The routine should be portable to different computers, and ideally to different programming languages.
This is desirable so that the simulation program produces the same results wherever it is executed.
3. The routine should have a sufficiently long cycle. The cycle length, or period, represents the length of
the random-number sequence before previous numbers begin to repeat themselves in an earlier order.
Thus, if 10,000 events are to be generated, the period should be many times that long.
A special case of cycling is degeneration. A routine degenerates when the same random numbers appear
repeatedly. Such an occurrence is certainly unacceptable. This can happen rapidly with some methods.
4. The random numbers should be replicable. Given the starting point (or conditions), it should be
possible to generate the same set of random numbers, completely independent of the system that is being
simulated. This is helpful for debugging purpose and is a means of facilitating comparisons between
systems.
5. Most important, and as indicated previously, the generated random numbers should closely
approximate the ideal statistical properties of uniformity and independence.
A widely used technique, initially proposed by Lehmer [1951], produces a sequence of integers, X1,
X2, ..., between zero and m − 1 according to the following recursive relationship:
Xi+1 = (aXi + c) mod m, i = 0, 1, 2.... (7.1)
The initial value X0 is called the seed, a is called the constant multiplier, c is the increment, and m is the
modulus.
If c ≠ 0 in Equation (7.1), the form is called the mixed congruential method. When c = 0, the form is
known as the multiplicative congruential method.
The selection of the values for a, c, m and X0 drastically affects the statistical properties and the cycle
length. An example will illustrate how this technique operates.
EXAMPLE 1 Use the linear congruential method to generate a sequence of random numbers with X0 =
27, a= 17, c = 43, and m = 100.
Here, the integer values generated will all be between zero and 99 because of the value of the modulus.
These random integers should appear to be uniformly distributed on the integers zero to 99.
Random numbers between zero and 1 can be generated by
Ri =Xi/m, i= 1,2,…… (7.2)
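Example 1 can be reproduced directly from Equations (7.1) and (7.2); the function name below is arbitrary.

```python
def lcg(seed, a, c, m, n):
    """Linear congruential generator: X_{i+1} = (a*X_i + c) mod m  (Eq. 7.1)."""
    xs, x = [], seed
    for _ in range(n):
        x = (a * x + c) % m
        xs.append(x)
    return xs, [x / m for x in xs]   # random numbers R_i = X_i / m  (Eq. 7.2)

xs, rs = lcg(seed=27, a=17, c=43, m=100, n=4)
# xs = [2, 77, 52, 27]: X_4 returns to the seed, so this choice of
# parameters gives a cycle length of only 4.
```

The corresponding random numbers are R1 = 0.02, R2 = 0.77, R3 = 0.52, R4 = 0.27, which illustrates how drastically the parameter choice affects the cycle length.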
Basic Relationship:
The most natural choice for m is one that equals the capacity of a computer word: m = 2^b (binary
machine), where b is the number of bits in the computer word, or
m = 10^d (decimal machine), where d is the number of digits in the computer word.
EXAMPLE: Let m = 10^2 = 100, a = 19, c = 0, and X0 = 63, and generate a sequence of random
integers using Equation (7.1).
When m is a power of 10, say m = 10^b, the modulo operation is accomplished by saving the b rightmost
(decimal) digits.
As computing power has increased, the complexity of the systems that we are able to simulate has also
increased. One fruitful approach is to combine two or more multiplicative congruential generators in such a way
that the combined generator has good statistical properties and a longer period. The following result from
L'Ecuyer [1988] suggests how this can be done: If Wi,1, Wi,2, ..., Wi,k are any independent, discrete-valued random
variables (not necessarily identically distributed), but one of them, say Wi,1, is uniformly distributed on the integers
0 to m1 − 2, then

Wi = ( ∑ Wi,j for j = 1 to k ) mod (m1 − 1)

is uniformly distributed on the integers 0 to m1 − 2. To see how this result can be used to form combined
generators, let Xi,1, Xi,2, ..., Xi,k be the ith output from k different multiplicative congruential generators, where the
jth generator has prime modulus mj, and the multiplier aj is chosen so that the period is mj − 1. Then the jth
generator is producing integers Xi,j that are approximately uniformly distributed on 1 to mj − 1, and Wi,j = Xi,j − 1 is
approximately uniformly distributed on 0 to mj − 2. L'Ecuyer [1988] therefore suggests combined generators of the
form

Xi = ( ∑ (−1)^(j−1) Xi,j for j = 1 to k ) mod (m1 − 1)

with Ri = Xi/m1 for Xi > 0 and Ri = (m1 − 1)/m1 for Xi = 0.
1. Frequency test. Uses the Kolmogorov-Smirnov or the chi-square test to compare the distribution
of the set of numbers generated to a uniform distribution.
2. Autocorrelation test. Tests the correlation between numbers and compares the sample
correlation to the expected correlation of zero.
5.4.1 Frequency Tests
A basic test that should always be performed to validate a new generator is the test of
uniformity. Two different methods of testing are available.
1. Kolmogorov-Smirnov(KS test) and
2. Chi-square test.
• Both of these tests measure the degree of agreement between the distribution of a sample of
generated random numbers and the theoretical uniform distribution.
• Both tests are based on the null hypothesis of no significant difference between the sample distribution
and the theoretical distribution.
1. The Kolmogorov-Smirnov test. This test compares the continuous cdf, F(X), of the uniform
distribution to the empirical cdf, SN(x), of the sample of N observations. By definition,
F(x) = x, 0 ≤ x ≤ 1
If the sample from the random-number generator is R1 R2, ,..., RN, then the empirical cdf, SN(x), is
defined by
The Kolmogorov-Smirnov test is based on the largest absolute deviation between F(x) and SN(x) over the
range of the random variable. That is, it is based on the statistic D = max |F(x) − SN(x)|. For testing
against a uniform cdf, the test procedure follows these steps:
Step 1: Rank the data from smallest to largest. Let R(i) denote the ith smallest observation, so that
R(1) ≤ R(2) ≤ ... ≤ R(N).
Step 2: Compute
D+ = max over 1 ≤ i ≤ N of ( i/N − R(i) )
D− = max over 1 ≤ i ≤ N of ( R(i) − (i − 1)/N )
Step 3: Compute D = max(D+, D−).
Step 4: Determine the critical value, Dα, from Table A.8 for the specified significance level α and the
given sample size N.
Step 5: If the sample statistic D is greater than the critical value Dα, the null hypothesis that the data are a
sample from a uniform distribution is rejected. If D ≤ Dα, we conclude that no difference has been detected
between the true distribution of {R1, R2, ..., RN} and the uniform distribution.
EXAMPLE 6: Suppose that the five numbers 0.44, 0.81, 0.14, 0.05, 0.93 were generated, and it is
desired to perform a test for uniformity using the Kolmogorov-Smirnov test with a level of significance α
of 0.05.
Step 1: Rank the data from smallest to largest. 0.05, 0.14, 0.44, 0.81, 0.93
Step 4: Determine the critical value, Dα, from Table A.8 for the specified significance level α and the
given sample size N. Here α=0.05, N=5 then value of Dα = 0.565
Step 5: Since the computed value, 0.26 is less than the tabulated critical value, 0.565,
the hypothesis of no difference between the distribution of the generated numbers and the uniform
distribution is not rejected.
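The five steps can be checked against Example 6 with a short sketch; the function name is illustrative.

```python
def ks_statistic(sample):
    """Kolmogorov-Smirnov statistic D against the uniform cdf F(x) = x."""
    r = sorted(sample)                                    # step 1: rank the data
    n = len(r)
    d_plus = max((i + 1) / n - r[i] for i in range(n))    # step 2: D+
    d_minus = max(r[i] - i / n for i in range(n))         #         D-
    return max(d_plus, d_minus)                           # step 3: D

d = ks_statistic([0.44, 0.81, 0.14, 0.05, 0.93])
# d = 0.26 here (D+ = 0.26, D- = 0.21), which is less than the critical
# value D_0.05 = 0.565 for N = 5, so uniformity is not rejected.
```

Running this on the five numbers of Example 6 reproduces the computed value 0.26 quoted in Step 5.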
N – number of observations
n – number of classes
Note: the sampling distribution of the chi-square statistic, χ0² = ∑ (Oi − Ei)²/Ei, is approximately
chi-square with n − 1 degrees of freedom, where Oi is the observed and Ei the expected number of
observations in the ith class.
Example 7: Use the chi-square test with α = 0.05 to test whether the data shown below are uniformly
distributed. The test uses n = 10 intervals of equal length, namely [0, 0.1), [0.1, 0.2)... [0.9, 1.0).
(REFER TABLE A.6)
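A sketch of the chi-square computation for uniformity over n equal intervals; the function name is made up, and the sample below is hypothetical, not the table from Example 7.

```python
def chi_square_statistic(sample, n_classes=10):
    """Chi-square statistic for uniformity over [0, 1) with equal intervals."""
    N = len(sample)
    expected = N / n_classes                 # E_i is the same for every class
    observed = [0] * n_classes
    for x in sample:
        # map x to its interval; clamp guards against x == 1.0
        observed[min(int(x * n_classes), n_classes - 1)] += 1
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical sample (not the data of Example 7): one value per interval.
chi2 = chi_square_statistic([0.05, 0.15, 0.25, 0.35, 0.45,
                             0.55, 0.65, 0.75, 0.85, 0.95])
# Every class has O_i = 1 = E_i here, so chi2 = 0.0.
```

The computed statistic is compared against the chi-square critical value with n − 1 = 9 degrees of freedom from Table A.6; a perfectly even sample like the one above gives the minimum possible value of 0.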
5.4.2 Tests for Auto-correlation
The tests for auto-correlation are concerned with the dependence between numbers in a sequence. The list
of the 30 numbers appears to have the effect that every 5th number has a very large value. If this is a
regular pattern, we can't really say the sequence is random.
The test computes the autocorrelation between every m numbers (m is also known as the lag), starting
with the ith number. Thus the autocorrelation ρim between the following numbers would be of interest.
EXAMPLE: Test whether the 3rd, 8th, 13th, and so on, numbers in the sequence at the beginning of this
section are autocorrelated. (Use α = 0.05.) Here, i = 3 (beginning with the third number), m = 5 (every
five numbers), N = 30 (30 numbers in the sequence), and M = 4 (the largest integer such that
3 + (M + 1)·5 ≤ 30).
Solution:
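The statistic's formula is not reproduced above, so the sketch below assumes the commonly used estimator ρ̂im = [1/(M + 1)] ∑ R(i+km)·R(i+(k+1)m) − 0.25, with standard error √(13M + 7) / (12(M + 1)); the function name is illustrative, and the 30-number sequence itself is not reproduced here.

```python
import math

def autocorrelation_z(r, i, m, N):
    """Z statistic for the lag-m autocorrelation test starting at the
    i-th number (1-based indexing, as in the example above)."""
    M = (N - i) // m - 1       # largest M with i + (M+1)*m <= N
    s = sum(r[i - 1 + k * m] * r[i - 1 + (k + 1) * m] for k in range(M + 1))
    rho_hat = s / (M + 1) - 0.25                    # estimated autocorrelation
    sigma = math.sqrt(13 * M + 7) / (12 * (M + 1))  # its standard error
    return rho_hat / sigma     # compare against +/- z_{alpha/2}

# Sanity check with a constant sequence: rho_hat = 0.25 - 0.25 = 0.
z = autocorrelation_z([0.5] * 30, i=3, m=5, N=30)
```

For i = 3, m = 5, N = 30 the function computes M = 4, matching the example; the null hypothesis of independence is not rejected when |Z| does not exceed the normal critical value z(α/2).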
2.Random Variate Generation TECHNIQUES:
• INVERSE TRANSFORMATION TECHNIQUE
• ACCEPTANCE-REJECTION TECHNIQUE
All these techniques assume that a source of uniform (0, 1) random numbers, R1, R2, ..., is available, where
each Ri has pdf f(R) = 1 for 0 ≤ R ≤ 1 and cdf F(R) = R for 0 ≤ R ≤ 1.
Note: The random variable may be either discrete or continuous.
2.1 Inverse Transform Technique The inverse transform technique can be used to sample
from the exponential, the uniform, the Weibull, and the triangular distributions.
2.1.1 Exponential Distribution The exponential distribution has probability density function (pdf)
given by
f(x) = λe^(−λx) for x ≥ 0, and f(x) = 0 for x < 0
E(Xi) = 1/λ
and so 1/λ is the mean interarrival time. The goal here is to develop a procedure for generating values X1, X2,
X3, ..., which have an exponential distribution.
The inverse transform technique can be utilized, at least in principle, for any distribution. But it is most
useful when the cdf. F(x), is of such simple form that its inverse, F-1, can be easily computed.
Step 1: Compute the cdf of the desired random variable X. For the exponential distribution, the cdf is
F(x) = 1 − e^(−λx), x ≥ 0.
Step 2: Set F(X) = R on the range of X. For the exponential distribution, it becomes R = 1 − e^(−λX).
Since X is a random variable (with the exponential distribution in this case), so 1-e-λx is also a random
variable, here called R. As will be shown later, R has a uniform distribution over the interval (0,1).,
Step 3: Solve the equation F(X) = R for X in terms of R. For the exponential distribution, the solution
proceeds as follows:
1 − e^(−λX) = R
e^(−λX) = 1 − R
−λX = ln(1 − R)
X = −(1/λ) ln(1 − R)    ...(5.1)
Equation (5.1) is called a random-variate generator for the exponential distribution. In general, Equation
(5.1) is written as X = F⁻¹(R). Generating a sequence of values is accomplished through step 4.
Step 4: Generate (as needed) uniform random numbers R1, R2, R3,... and compute the desired random
variates by
Xi = F-1 (Ri)
so that Xi = −(1/λ) ln(1 − Ri) ...(5.2) for i = 1, 2, 3, .... One simplification that is usually employed in
Equation (5.2) is to replace 1 − Ri by Ri to yield Xi = −(1/λ) ln Ri ...(5.3), which is justified since both Ri
and 1 − Ri are uniformly distributed on (0, 1).
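Steps 1-4 for the exponential distribution amount to a one-line generator; the function name and the parameter values below are illustrative.

```python
import math
import random

def exponential_variate(lam, r=None):
    """Inverse-transform generator X = -(1/lam) * ln(1 - R), Equation (5.1)."""
    if r is None:
        r = random.random()      # step 4: draw a fresh uniform (0,1) number
    return -math.log(1 - r) / lam

x = exponential_variate(lam=2.0, r=0.5)   # = -ln(0.5)/2, about 0.3466
```

Passing an explicit r makes the mapping from uniform number to variate reproducible; by the simplification of Equation (5.3), `-math.log(r) / lam` would be an equally valid generator.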
Solution: The cdf of the uniform distribution on [a, b] is
F(x) = 0 for x < a
F(x) = (x − a)/(b − a) for a ≤ x ≤ b
F(x) = 1 for x > b
Setting F(X) = R and solving for X gives the generator X = a + (b − a)R.
where α > 0 and β > 0 are the scale and shape parameters.
The steps for the Weibull distribution are as follows: compute the cdf F(x) = 1 − e^(−(x/α)^β) for x ≥ 0,
set F(X) = R, and solve for X to obtain X = α[−ln(1 − R)]^(1/β).
Acceptance-Rejection Technique: useful particularly when the inverse cdf does not exist in closed form.
Illustration: To generate random variates X ~ U(1/4, 1)
Procedure:
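For X ~ U(1/4, 1), the acceptance-rejection procedure generates R ~ U(0, 1) and accepts R only when R ≥ 1/4, rejecting and retrying otherwise. A sketch (the function name is made up):

```python
import random

def sample_u_quarter_one():
    """Acceptance-rejection sketch for X ~ U(1/4, 1)."""
    while True:
        r = random.random()      # step 1: generate R ~ U(0, 1)
        if r >= 0.25:            # step 2: accept if the condition holds
            return r             # accepted value already lies in [1/4, 1)
        # otherwise reject r and return to step 1

xs = [sample_u_quarter_one() for _ in range(1000)]
# every accepted value lies in [1/4, 1); about 3/4 of candidates are accepted
```

The efficiency of the method is the acceptance probability, here 3/4, so on average 4/3 uniform numbers are consumed per variate produced.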
2.1.2 Poisson Distribution A Poisson random variable, N, with mean α > 0 has pmf
p(n) = P(N = n) = e^(−α) α^n / n!, n = 0, 1, 2, ...
N can be interpreted as the number of arrivals from a Poisson arrival process during one unit of time.
• The times between the arrivals in the process are then exponentially distributed with rate α.
• Thus there is a relationship between the (discrete) Poisson distribution and the (continuous)
exponential distribution, namely: N = n if and only if X1 + X2 + ··· + Xn ≤ 1 < X1 + X2 + ··· + Xn+1,
where the Xi are the exponential interarrival times.
The procedure for generating a Poisson random variate, N, is given by the following steps:
Step 1. Set n = 0, P = 1.
Step 2. Generate a random number R(n+1), and replace P by P · R(n+1).
Step 3. If P < e^(−α), then accept N = n. Otherwise, reject the current n, increase n by one, and return to step 2.
Example: Generate three Poisson variates with mean α = 0.2 from the given random numbers
0.4357, 0.4146, 0.8353, 0.9952, 0.8004.
Solution:
Step 1. Set n = 0, P = 1.
Step 2. R1 = 0.4357, so P = 1 · 0.4357 = 0.4357.
Step 3. Since P = 0.4357 < e^(−0.2) = 0.8187, accept N = 0. Repeat the above procedure for the
remaining random numbers.
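The full procedure, applied to the five random numbers above, can be sketched as follows (the function name is illustrative):

```python
import math

def poisson_variates(alpha, rands):
    """Poisson generator: multiply uniforms into P until P < e^(-alpha);
    the number of multiplications beyond the first acceptance gives N."""
    variates = []
    n, p = 0, 1.0
    for r in rands:
        p *= r                       # step 2: replace P by P * R(n+1)
        if p < math.exp(-alpha):     # step 3: accept N = n
            variates.append(n)
            n, p = 0, 1.0            # reset for the next variate
        else:
            n += 1                   # reject, increase n, return to step 2
    return variates

vs = poisson_variates(0.2, [0.4357, 0.4146, 0.8353, 0.9952, 0.8004])
# vs = [0, 0, 2]: the first two uniforms each fall below e^(-0.2) = 0.8187
# immediately, and the last three together produce N = 2.
```

This reproduces the worked example: the three variates generated from the five random numbers are 0, 0, and 2.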
• Collect data from the real system of interest. This often requires a substantial time and
resource commitment. Unfortunately, in some situations it is not possible to collect data.
• Identify a probability distribution to represent the input process. When data are
available, this step typically begins by developing a frequency distribution, or histogram,
of the data.
• Choose parameters that determine a specific instance of the distribution family.
When data are available, these parameters may be estimated from the data.
• Evaluate the chosen distribution and the associated parameters for goodness-of-fit. Goodness-of-fit may be evaluated informally via graphical methods, or formally via statistical tests. The chi-square and the Kolmogorov-Smirnov tests are standard goodness-of-fit tests. If not satisfied that the chosen distribution is a good approximation of the data, then the analyst returns to the second step, chooses a different family of distributions, and repeats the procedure. If several iterations of this procedure fail to yield a fit between an assumed distributional form and the collected data, the empirical form of the distribution may be used.
• Data collection is one of the biggest tasks in solving a real problem. It is one of the most important and difficult problems in simulation. Even when data are available, they have rarely been recorded in a form that is directly useful for simulation input modeling.
The following suggestions may enhance and facilitate data collection, although they are not all-inclusive.
1. A useful expenditure of time is in planning. This could begin with a practice or pre-observing session. Try to collect data while pre-observing.
2. Try to analyze the data as they are being collected. Determine if any data being
collected are useless to the simulation. There is no need to collect superfluous
data.
3. Try to combine homogeneous data sets. Check data for homogeneity in
successive time periods and during the same time period on successive days.
4. Be aware of the possibility of data censoring, in which a quantity of interest is
not observed in its entirety. This problem most often occurs when the analyst is
interested in the time required to complete some process (for example, produce
a part, treat a patient, or have a component fail), but the process begins prior to,
or finishes after the completion of, the observation period.
5. To determine whether there is a relationship between two variables, build a
scatter diagram.
6. Consider the possibility that a sequence of observations which appear to be
independent may possess autocorrelation. Autocorrelation may exist in
successive time periods or for successive customers.
7. Keep in mind the difference between input data and output or performance
data, and be sure to collect input data. Input data typically represent the
uncertain quantities that are largely beyond the control of the system and will
not be altered by changes made to improve the system.
• In this section we discuss methods for selecting families of input distributions when data
are available.
6.2.1 Histogram
• If the intervals are too wide, the histogram will be coarse, or blocky, and its shape and
other details will not show well. If the intervals are too narrow, the histogram will be
ragged and will not smooth the data.
• The histogram for continuous data corresponds to the probability density function of a
theoretical distribution.
Example 6.2: The number of vehicles arriving at the northwest corner of an intersection in a 5-minute period between 7:00 A.M. and 7:05 A.M. was monitored for five workdays over a 20-week period. The table shows the resulting data. The first entry in the table indicates that there were 12 five-minute periods during which zero vehicles arrived, 10 periods during which one vehicle arrived, and so on.
Fig 6.2 Histogram of number of arrivals per period.
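A histogram like Fig 6.2 can be sketched from the frequency data; only the first two frequencies below (12 periods with zero arrivals, 10 with one arrival) come from the example text, and the remaining counts are illustrative placeholders:

```python
from collections import Counter

# Per-period arrival counts: 12 periods saw 0 arrivals, 10 saw 1 (from the
# example); the higher counts are hypothetical stand-ins for the full table.
observations = [0]*12 + [1]*10 + [2]*19 + [3]*17 + [4]*10 + [5]*8 + [6]*7 + [7]*5

freq = Counter(observations)
for arrivals in sorted(freq):
    # crude text histogram: one '*' per observed 5-minute period
    print(f"{arrivals:2d} arrivals | {'*' * freq[arrivals]}")
```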
• Additionally, the shapes of these distributions were displayed. The purpose of preparing a histogram is to infer a known pdf or pmf. A family of distributions is selected on the basis of what might arise in the context being investigated, along with the shape of the histogram.
• Thus, if interarrival-time data have been collected, and the histogram has a shape similar to the pdf in Figure 5.9, the assumption of an exponential distribution would be warranted.
• Similarly, if measurements of weights of pallets of freight are being made, and the
histogram appears symmetric about the mean with a shape like that shown in Fig 5.12,
the assumption of a normal distribution would be warranted.
• The exponential, normal, and Poisson distributions are frequently encountered and are not difficult to analyze from a computational standpoint. Although more difficult to analyze, the gamma and Weibull distributions provide an array of shapes, and should not be overlooked when modeling an underlying probabilistic process. Perhaps an exponential distribution was assumed, but it was found not to fit the data. The next step would be to examine where the lack of fit occurred.
• If the lack of fit was in one of the tails of the distribution, perhaps a gamma or Weibull
distribution would more adequately fit the data.
• Literally hundreds of probability distributions have been created, many with some
specific physical process in mind. One aid to selecting distributions is to use the physical
basis of the distributions as a guide. Here are some examples:
• Further, our perception of the fit depends on the widths of the histogram intervals. But even if the intervals are well chosen, grouping of data into cells makes it difficult to compare a histogram to a continuous probability density function.
• If X is a random variable with cdf F, then the q-quantile of X is that y such that F(y) = P(X ≤ y) = q, for 0 < q < 1. When F has an inverse, we write y = F^-1(q).
• Now let {Xi, i = 1, 2, ..., n} be a sample of data from X. Order the observations from the smallest to the largest, and denote these as {yj, j = 1, 2, ..., n}, where y1 ≤ y2 ≤ ... ≤ yn. Let j denote the ranking or order number; therefore, j = 1 for the smallest and j = n for the largest. The q-q plot is based on the fact that yj is an estimate of the (j – 1/2)/n quantile of X. In other words, yj is approximately F^-1((j – 1/2)/n).
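The q-q plot computation can be sketched for a hypothesized exponential distribution; the rate parameter `lam` is illustrative, and any invertible cdf could take its place:

```python
import math

def qq_pairs_exponential(sample, lam):
    # Pair the ordered observations y_j with the model quantiles
    # F^-1((j - 1/2)/n) of the hypothesized Exp(lam) distribution.
    ys = sorted(sample)
    n = len(ys)
    pairs = []
    for j, y in enumerate(ys, start=1):
        q = (j - 0.5) / n
        model_quantile = -math.log(1.0 - q) / lam   # F^-1(q) for Exp(lam)
        pairs.append((model_quantile, y))
    return pairs  # roughly a straight line if the model fits
```

Plotting the returned pairs and checking for linearity is the informal goodness-of-fit step the text describes.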
• Now suppose that we have chosen a distribution with cdf F as a possible representation of the distribution of X. If F is a member of an appropriate family of distributions, then a plot of yj versus F^-1((j – 1/2)/n) will be approximately a straight line.
6.3 Parameter Estimation
• After a family of distributions has been selected, the next step is to estimate the
parameters of the distribution. Estimators for many useful distributions are described in
this section. In addition, many software packages—some of them integrated into
simulation languages—are now available to compute these estimates.
• In a number of instances the sample mean, or the sample mean and sample variance, are used to estimate the parameters of the hypothesized distribution.
• If the observations in a sample of size n are X1, X2, ..., Xn, the sample mean is defined by
X̄ = (1/n) Σ (i=1..n) Xi …(9.1)
and the sample variance by
S² = [Σ (i=1..n) Xi² – n X̄²] / (n – 1) …(9.2)
If the data are discrete and grouped in a frequency distribution, Equations (9.1) and (9.2) can be modified to provide for much greater computational efficiency. The sample mean can be computed by
X̄ = (1/n) Σ (j=1..k) fj Xj
and the sample variance by
S² = [Σ (j=1..k) fj Xj² – n X̄²] / (n – 1)
where k is the number of distinct values of X and fj is the observed frequency of the value Xj of X.
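The grouped-data formulas can be sketched directly (distinct values X_j paired with their observed frequencies f_j):

```python
def grouped_mean_variance(values, freqs):
    # values: the k distinct values X_j; freqs: observed frequencies f_j
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    # grouped form of S^2 = (sum f_j X_j^2 - n*Xbar^2) / (n - 1)
    var = (sum(f * x * x for x, f in zip(values, freqs)) - n * mean * mean) / (n - 1)
    return mean, var
```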
• Numerical estimates of the distribution parameters are needed to reduce the family of
distributions to a specific distribution and to test the resulting hypothesis.
• These estimators are the maximum-likelihood estimators based on the raw data. (If the
data are in class intervals, these estimators must be modified.)
• The triangular distribution is usually employed when no data are available, with the parameters obtained from educated guesses for the minimum, most likely, and maximum possible values; the uniform distribution may also be used in this way if only minimum and maximum values are available.
6.4 Goodness-of-Fit Tests
• These two tests are applied in this section to hypotheses about distributional forms of input data. Goodness-of-fit tests provide helpful guidance for evaluating the suitability of a potential input model.
• However, since there is no single correct distribution in a real application, you should not
be a slave to the verdict of such tests.
• It is especially important to understand the effect of sample size. If very little data are available, then a goodness-of-fit test is unlikely to reject any candidate distribution; but if a lot of data are available, then a goodness-of-fit test will likely reject all candidate distributions.
• One procedure for testing the hypothesis that a random sample of size n of the random variable X follows a specific distributional form is the chi-square goodness-of-fit test.
• This test formalizes the intuitive idea of comparing the histogram of the data to the shape of the candidate density or mass function. The test is valid for large sample sizes, for both discrete and continuous distributional assumptions, when parameters are estimated by maximum likelihood.
• The test statistic is
X0² = Σ (i=1..k) (Oi – Ei)² / Ei
where Oi is the observed frequency in the ith class interval and Ei is the expected frequency in that class interval. The expected frequency for each class interval is computed as Ei = n·pi, where pi is the theoretical, hypothesized probability associated with the ith class interval.
• It can be shown that X0² approximately follows the chi-square distribution with k – s – 1 degrees of freedom, where s represents the number of parameters of the hypothesized distribution estimated by sample statistics. The hypotheses are:
H0: the random variable, X, conforms to the distributional assumption with the parameter(s) given by the parameter estimate(s).
H1: the random variable, X, does not conform.
• If the distribution being tested is discrete, each value of the random variable should be a class interval, unless it is necessary to combine adjacent class intervals to meet the minimum expected cell-frequency requirement. For the discrete case, if combining adjacent cells is not required, then pi = p(xi), the probability of the ith value of the random variable.
• If the distribution being tested is continuous, the class intervals are given by [ai-1, ai), where ai-1 and ai are the endpoints of the ith class interval. For the continuous case with assumed pdf f(x), or assumed cdf F(x), pi can be computed by
pi = F(ai) – F(ai-1)
and then Ei = n·pi.
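The test statistic compares observed and expected class frequencies; a one-line sketch, leaving the choice of intervals and the chi-square critical value to the analyst:

```python
def chi_square_statistic(observed, expected):
    # X0^2 = sum over class intervals of (O_i - E_i)^2 / E_i
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

The resulting X0² is then compared against the chi-square critical value with k – s – 1 degrees of freedom.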
6.4.3 Kolmogorov - Smirnov Goodness-of-Fit Test
• The chi-square goodness-of-fit test can accommodate the estimation of parameters from the data, with a resultant decrease in the degrees of freedom (one for each parameter estimated). The chi-square test requires that the data be placed in class intervals, and in the case of a continuous distributional assumption, this grouping is arbitrary.
• Also, the distribution of the chi-square test statistic is known only approximately, and the
power of the test is sometimes rather low. As a result of these considerations, goodness-
of-fit tests, other than the chi-square, are desired.
• The Kolmogorov-Smirnov test is particularly useful when sample sizes are small and
when no parameters have been estimated from the data.
• The data were collected over the interval 0 to T = 100 min. It can be shown that if the
underlying distribution of interarrival times { T1, T2, … } is exponential, the arrival
times are uniformly distributed on the interval (0,T).
• The arrival times T1, T1+T2, T1+T2+T3,…..,T1+…..+T50 are obtained by
adding interarrival times.
• On a (0,1) interval, the points will be [T1/T, (T1+T2)/T,…..,(T1+….+T50)/T].
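Testing those scaled points against the uniform distribution on (0,1) uses the Kolmogorov-Smirnov statistic D = max(D+, D-); a sketch of its computation (the comparison of D against the tabled critical value is left to the analyst):

```python
def ks_statistic(sample):
    # K-S statistic against U(0,1), using the order statistics R_(i):
    #   D+ = max_i ( i/N - R_(i) ),  D- = max_i ( R_(i) - (i-1)/N ),  D = max(D+, D-)
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)
```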
Engineering data: Often a product or process has performance ratings provided by the manufacturer.
Expert opinion: Talk to people who are experienced with the process or similar processes. Often they can provide optimistic, pessimistic, and most-likely times.
Physical or conventional limitations: Most real processes have physical limits on performance. Because of company policies, there may be upper limits on how long a process may take. Do not ignore obvious limits or bounds that narrow the range of the input process.
The nature of the process: This can be used to justify a particular choice even when no data are available.
6.6 Multivariate and Time-Series Input Models
The random variables presented were considered to be independent of any other variables
within the context of the problem. However, variables may be related, and if the variables
appear in a simulation model as inputs, the relationship should be determined and taken into
consideration.
6.7 Time-Series Input Models
If X1, X2, ..., Xn is a sequence of identically distributed, but dependent and covariance-stationary random variables, then there are a number of time-series models that can be used to represent the process. The two models below have the characteristic that the autocorrelations take the form
ρh = ϕ^h for h = 1, 2, ..., n
that is, the lag-h autocorrelation decreases geometrically as the lag increases.
AR(1) Model:
Consider the time-series model
Xt = μ + ϕ(Xt-1 – μ) + εt
for t = 2, 3, ..., n, where ε2, ε3, ... are independent and identically distributed with mean 0 and variance σε², and -1 < ϕ < 1. If the initial value X1 is chosen appropriately, then X1, X2, ... are all normally distributed with mean μ and variance σε²/(1 – ϕ²).
EAR(1) Model:
Consider the time-series model
Xt = ϕXt-1 with probability ϕ
Xt = ϕXt-1 + εt with probability 1 – ϕ
for t = 2, 3, ..., n, where ε2, ε3, ... are independent and identically distributed exponential random variables with mean 1/λ, and 0 ≤ ϕ < 1. If the initial value X1 is chosen appropriately, then X1, X2, ... are all exponentially distributed with mean 1/λ.
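Under the stated assumptions, both models can be sketched as generators; the stationary choices for X1 (normal for AR(1), exponential for EAR(1)) follow the text, and parameter names are illustrative:

```python
import math
import random

def ar1_series(n, mu, phi, sigma_eps, rng=random):
    # AR(1): X_t = mu + phi*(X_{t-1} - mu) + eps_t, eps_t ~ N(0, sigma_eps^2)
    # X_1 drawn from the stationary distribution N(mu, sigma_eps^2 / (1 - phi^2))
    x = rng.gauss(mu, sigma_eps / math.sqrt(1.0 - phi * phi))
    out = [x]
    for _ in range(n - 1):
        x = mu + phi * (x - mu) + rng.gauss(0.0, sigma_eps)
        out.append(x)
    return out

def ear1_series(n, lam, phi, rng=random):
    # EAR(1): X_t = phi*X_{t-1} with probability phi,
    #         X_t = phi*X_{t-1} + eps_t (eps_t ~ Exp(lam)) with probability 1 - phi
    x = rng.expovariate(lam)            # X_1 from the stationary Exp(lam)
    out = [x]
    for _ in range(n - 1):
        x = phi * x + (0.0 if rng.random() < phi else rng.expovariate(lam))
        out.append(x)
    return out
```

Both generators produce series whose lag-h autocorrelation is approximately ϕ^h, the defining property described above.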
OUTPUT ANALYSIS FOR A SINGLE MODEL
Estimate system performance via simulation.
• If θ is the system performance, the precision of the estimator θ̂ can be measured by:
1. The standard error of θ̂.
2. The width of a confidence interval (CI) for θ.
• Purpose of statistical analysis:
1. To estimate the standard error or CI.
2. To figure out the number of observations required to achieve a desired error/CI.
• Potential issues to overcome:
1. Autocorrelation, e.g. inventory cost for subsequent weeks lack statistical
independence.
2. Initial conditions, e.g. inventory on hand and # of backorders at time 0 would
most likely influence the performance of week 1.
• Model output consists of one or more random variables (r.v.'s) because the model is an input-output transformation and the input variables are r.v.'s.
• Point estimation for discrete-time data [Y1, Y2, ..., Yn]: the point estimator is
θ̂ = (1/n) Σ (i=1..n) Yi
where θ̂ is a sample mean based on a sample of size n. The point estimator θ̂ is said to be unbiased for θ if its expected value is θ, that is, if E(θ̂) = θ; otherwise it is biased.
• Point estimation for continuous-time data: the point estimator is
φ̂ = (1/TE) ∫ (0 to TE) Y(t) dt
◼ An unbiased or low-bias estimator is desired.
• Usually, system performance measures can be put into the common framework of θ or φ; for example, for the proportion of days on which sales are lost through an out-of-stock situation, let Y(t) = 1 if the system is out of stock on day t and 0 otherwise.
• Quantile estimation: find θ such that 100p% of the histogram is to the left of (smaller than) θ.
Let Yi be the average cycle time for parts produced on the ith replication of the simulation (its mathematical expectation is θ).
Average cycle time will vary from day to day, but over the long run the average of the averages will be close to θ.
Sample variance across R replications:
S² = (1/(R – 1)) Σ (i=1..R) (Yi – Ȳ..)²
7.3.3 Confidence-Interval Estimation
◼ Confidence interval (CI): a measure of error.
◼ Prediction interval (PI): a measure of risk.
A good guess for the average cycle time on a particular day is our estimator Ȳ.., but it is unlikely to be exactly right. A PI is designed to be wide enough to contain the actual average cycle time on any particular day with high probability.
Normal-theory prediction interval:
Ȳ.. ± t(α/2, R-1) · S · √(1 + 1/R)
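Given the cycle-time averages Y1, ..., YR from R replications, the companion confidence interval for the long-run mean can be sketched as below; the caller supplies the t-quantile (e.g. t(0.025, 9) ≈ 2.262 for R = 10 at 95%), since this sketch avoids any table lookup:

```python
import math
import statistics

def replication_ci(ys, t_quantile):
    # Across-replication CI for the mean: Ybar +/- t * S / sqrt(R)
    r = len(ys)
    ybar = statistics.mean(ys)
    s = statistics.stdev(ys)              # sample std dev across replications
    half = t_quantile * s / math.sqrt(r)  # CI half-width
    return ybar - half, ybar + half
```

Note the contrast with the prediction interval above: the CI shrinks as 1/√R, while the PI half-width approaches t·S because the day-to-day variability never averages out.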
One of the most important and difficult tasks facing a model developer is the verification and validation of the simulation model. It is the job of the model developer to work closely with the end users throughout the period of development and validation to reduce skepticism and to increase the model's credibility.
1: To produce a model that represents true system behavior closely enough for the model to be used as a substitute for the actual system for the purpose of experimenting with the system.
2: To increase to an acceptable level the credibility of the model, so that the model will be used by managers and other decision makers.
2: Validation is concerned with building the right model. It is utilized to determine that a model is an accurate representation of the real system. It is usually achieved through the calibration of the model.
Many common-sense suggestions can be given for use in the verification process:
Have the computerized representation checked by someone other than its developer.
Make a flow diagram which includes each logically possible action a system can take when an event occurs, and follow the model logic for each action for each event type.
Closely examine the model output for reasonableness under a variety of settings of input parameters.
Have the computerized representation print the input parameters at the end of the simulation, to be sure that these parameter values have not been changed inadvertently.
Make the computerized representation as self-documenting as possible.
If the computerized representation is animated, verify that what is seen in the
animation imitates the actual system.
The interactive run controller (IRC) or debugger is an essential component of Successful
simulation model building. Even the best of simulation analysts makes mistakes or commits
logical errors when building a model.
The IRC assists in finding and correcting those errors in the following ways:
(a) The simulation can be monitored as it progresses.
(b) Attention can be focused on a particular line of logic or multiple lines of logic
that constitute a procedure or a particular entity.
(c) Values of selected model components can be observed. When the simulation has
paused, the current value or status of variables, attributes, queues, resources,
counters, etc., can be observed
(d) The simulation can be temporarily suspended, or paused, not only to view
information but also to reassign values or redirect entities.
1. Inter arrival times of customers during several 2-hour periods of peak loading
("rush-hour" traffic)
2. Inter arrival times during a slack period
3. Service times for commercial accounts
4. Service times for personal accounts
In this phase of the validation process, the model is viewed as an input-output transformation. That is, the model accepts the values of input parameters and transforms these inputs into output measures of performance. It is this correspondence that is being validated.
Instead of validating the model's input-output transformation by predicting the future, the modeler may use past historical data that have been reserved for validation purposes.
1: Minor changes of single numerical parameters such as speed of the machine, arrival
rate of the customer etc.
2: Minor changes of the form of a statistical distribution, such as the distribution of service times or the time to failure of a machine.
3: Major changes in the logical structure of a subsystem such as change in queue
discipline for waiting-line model, or a change in the scheduling rule for a job shop
model.
4: Major changes involving a different design for the new system, such as a computerized inventory-control system replacing a noncomputerized system.
If the change to the computerized representation of the system is minor, such as in items one or two, these changes can be carefully verified and output from the new model can be accepted with considerable confidence.
When using artificially generated data as input data, the modeler expects the model to produce event patterns that are compatible with, but not identical to, the event patterns that occurred in the real system during the period of data collection.
Thus, in the bank model, artificial input data {X1n, X2n, n = 1, 2, ...} are generated for the interarrival and service times.
Optimization via simulation refers to the problem of maximizing or minimizing the expected performance of a discrete-event, stochastic system that is represented by a computer simulation model.
Conventional optimization usually deals with problems under certainty, but in stochastic discrete-event simulation the result of any simulation run is a random variable.
Let x1, x2, ..., xm be the m controllable design variables and Y(x1, x2, ..., xm) be the observed simulation output performance on one run. To optimize Y(x1, x2, ..., xm) with respect to x1, x2, ..., xm is to maximize or minimize the mathematical expectation of performance, E[Y(x1, x2, ..., xm)].
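A naive sketch of this idea: estimate E[Y(x)] at each candidate design point by averaging independent replications, then keep the best. The `simulate` function is a hypothetical stand-in for one simulation run; practical optimization-via-simulation methods are considerably more sophisticated than this exhaustive search:

```python
import statistics

def optimize_via_simulation(candidates, simulate, replications=20):
    # For each design point x, estimate E[Y(x)] by the mean of `replications`
    # independent runs, and return the minimizing candidate.
    best_x, best_mean = None, float("inf")
    for x in candidates:
        mean = statistics.mean(simulate(x) for _ in range(replications))
        if mean < best_mean:              # minimizing expected performance
            best_x, best_mean = x, mean
    return best_x, best_mean
```

Because each estimate of E[Y(x)] is itself a random variable, comparisons between candidates carry sampling error, which is exactly the complication that distinguishes this problem from deterministic optimization.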
1. Initialization bias.
2. Error estimation.
3. Replication methods.
4. Sample size.
5. Batch means.