Simulation Notes
Introduction To Simulation
A simulation is the imitation of the operation of a real-world process or facility. The
facility or process of interest is usually called a system.
In order to study the system, we make a set of assumptions about it. These assumptions
constitute a model. Assumptions are expressed in mathematical or logical relationships.
A system is often affected by changes occurring outside the system. Such changes are
said to occur in the system environment. In modeling systems, it is necessary to decide
on the boundary between the system and its environment. This decision may depend on
the purpose of the study.
Components of a System
The term endogenous is used to describe activities and events occurring within a system,
and the term exogenous is used to describe activities and events in the environment that
affect the system. In the bank study, the arrival of a customer is an
exogenous event, and the completion of service of a customer is an endogenous event.
A continuous system is one in which the state variable(s) change continuously over
time.
Model of a System
Types of Models
Problem formulation: Every study should begin with a statement of the problem. If the
statement is provided by the policy makers or those that have the problem, the analyst
must ensure that the problem being described is clearly understood.
Model conceptualization
The construction of a model of a system is probably as much art as science. The art of
modeling is enhanced by an ability to abstract the essential features of a problem, to
select and modify basic assumptions that characterize the system, and then to enrich and
elaborate the model until a useful approximation results. The model complexity need not
exceed that required to accomplish the purposes for which the model is intended.
Violation of this principle will only add to model-building expenses.
It is advisable to involve the model user in model conceptualization. This will both
enhance the quality of the resulting model and increase the confidence of the model user
in the application of the model.
Data collection
There is a constant interplay between the construction of the model and the collection of
the needed input data. As the complexity of the model changes, the required data
elements may also change.
Model translation
Since most real-world systems result in models that require a great deal of information
storage and computation, the model must be entered into a computer-recognizable format.
We use the term “program”, even though it is possible to accomplish the desired result in
many instances with little or no actual coding. The modeler must decide whether to
program the model in a simulation language or to use special-purpose simulation
software.
Verification
Verification pertains to the computer program prepared for the simulation model. Is the
computer program performing properly? If the input parameters and logical structure of
the model are correctly represented in the computer, verification has been completed.
Validation
Experimental design
The alternatives that are to be simulated must be determined. Often, the decision
concerning which alternatives to simulate may be a function of runs that have been
completed and analyzed.
Production runs, and their subsequent analysis, are used to estimate measures of
performance for the system designs that are being simulated.
There are two types of documentation: program and progress. Program documentation is
necessary for numerous reasons. If the program is going to be used again by the same
or different analysts, it may be necessary to understand how the program operates.
Implementation
The success of the implementation phase depends on how well the previous eleven
steps have been performed. It is also contingent upon how thoroughly the analyst has
involved the ultimate model user during the entire simulation process.
Advantages are
1. New hardware designs, physical layouts, transportation systems, and so on, can
be tested without committing resources for their acquisition.
2. Hypotheses about how or why certain phenomena occur can be tested for
feasibility.
3. By changing simulation inputs and observing the resulting outputs, we can learn
which variables are most important.
4. Simulation models designed for training allow learning without the cost and
disruption of on-the-job training.
5. Many modern systems are so complex that simulation may be the only way to
study them.
Disadvantages are
1. Model building requires special training. It is an art that is learned over time and
through experience. Furthermore, if two models are constructed by two
competent individuals, they may have similarities, but it is highly unlikely that they
will be the same.
2. Simulation results may be difficult to interpret. Since most simulation outputs are
essentially random variables, it may be hard to determine whether an
observation is a result of system interrelationships or of randomness.
3. Simulation may be inappropriate when there is no time, or when skilled
personnel are not available, to verify and validate the model.
Graded Questions
1. What is system modeling? Give an example and explain the different types of
models. [N-04]
3. Define simulation. What are the various steps in simulation study? [M-05]
4. Define model. What are the different types of models? Give example for each.
[M-05]
2
Simulation Examples
Graded Questions
1. A small grocery store has only one checkout counter. Customers arrive at this
checkout counter at random from 1 to 8 minutes apart. Each possible value of
interarrival time has the same probability of occurrence. The service time distribution
is as follows:
Perform the simulation for 10 customers. Obtain average waiting time in the queue
and average idle time of the server.
Random numbers (start at North-West corner and proceed along the row)
93 14 72 10 21
81 87 90 38 10
29 17 11 68 99
51 40 30 52 71
3. A plant has a large number of similar machines. The machine breakdowns or failures
are random and independent.
The shift in-charge of the plant collected the data about the various
machines breakdown times and the repair time required on hourly basis, and
the record for the past 100 observations as shown below was:
For each hour that one machine is down, due to being repaired or waiting to be repaired,
the plant loses Rs. 70 by way of lost production. A repairman is paid Rs. 20 per hour.
a. Simulate this maintenance system for 15 breakdowns.
b. Obtain the total maintenance cost.
Using the Monte Carlo simulation technique, determine the average profit from the said
investment on the basis of 20 trials.
5. A company manufactures 200 motorcycles per day. Depending upon the availability
of raw materials and other conditions, the daily production has been varying from 196
motorcycles to 204 motorcycles, whose probability distribution is as given below:
Production/day: 196 197 198 199 200 201 202 203 204
Probability: 0.05 0.09 0.12 0.14 0.20 0.15 0.11 0.08 0.06
that can accommodate only 200 motorcycles. Using the following random
numbers:
6. A confectioner sells confectionary items. Past data of demand per week (in hundred
kilograms) with frequency is given below:
Demand/week: 0 5 10 15 20 25
Frequency: 2 11 8 21 5 3
Using the following sequence of random numbers, generate the demand for the next
10 weeks. Also find the average demand per week:
35,52,90,13,23,73,34,57,35,83,94,56,67,66,60
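A sketch of how these demands could be generated in code, assuming the common convention that the two-digit random numbers 00–99 are assigned to demand values in proportion to their frequencies (00–03 maps to demand 0, 04–25 to 5, and so on). Some texts use the 01–04 convention instead, which would shift a few mappings; the helper name demand_for is illustrative.

```python
# Sketch: generating weekly demand from the frequency table using two-digit
# random numbers (00-99). Digit assignment 00-03 -> 0, 04-25 -> 5, etc. is an
# assumed convention; check which convention your course uses.

demand_values = [0, 5, 10, 15, 20, 25]
frequencies   = [2, 11, 8, 21, 5, 3]          # out of 50 observed weeks

# Build cumulative upper bounds scaled to the 100 two-digit numbers.
total = sum(frequencies)
upper = []
cum = 0
for f in frequencies:
    cum += f * 100 // total                   # 4, 26, 42, 84, 94, 100
    upper.append(cum)

def demand_for(rn):
    """Map a two-digit random number (0-99) to a demand value."""
    for d, u in zip(demand_values, upper):
        if rn < u:
            return d
    return demand_values[-1]

random_numbers = [35, 52, 90, 13, 23, 73, 34, 57, 35, 83]  # first 10 of the sequence
demands = [demand_for(rn) for rn in random_numbers]
average = sum(demands) / len(demands)
```

With this convention the ten generated demands are 10, 15, 20, 5, 5, 15, 10, 15, 10, 15 (hundred kilograms), giving an average demand of 12.0 per week.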
8. Consider the following continuously operating job shop. Interarrival times of jobs are
distributed as follows:
Time between Probability
Arrivals (Hours)
0 .23
1 .37
2 .28
3 .12
Processing times for jobs are normally distributed with mean 50 minutes and
standard deviation 8 minutes. Construct a simulation table, and perform a simulation
for 10 new customers. Assume that when the simulation begins there is one job
being processed (scheduled to be completed in 25 minutes) and there is one job with
a 50-minute processing time in the queue.
(a) What was the average time in the queue for the 10 new jobs?
(b) What was the average processing time of the 10 new jobs?
(c) What was the maximum time in the system for the 10 new jobs?
Daily demand: 0 1 2 3 4
Probability: 0.18 0.39 0.29 0.09 0.05
The lead time (days) is distributed as follows:
Lead time (days): 0 1 2 3 4 5
Probability: 0.135 0.223 0.288 0.213 0.118 0.023
Value of C: 10 20 30 40
Probability: .10 .25 .50 .15
D = (A – 25B) / (2C)
11. Lead-time for a stock item is normally distributed with a mean of 7 days and standard
deviation of 2 days. Daily demand is distributed as follows:
Daily demand: 0 1 2 3 4
Probability: 0.367 0.368 0.184 0.062 0.019
Determine the lead-time demand for 20 order cycles. (Round off lead time to the
closest integer during the simulation, and if a negative value results, give it a lead
time of zero.)
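One way this exercise could be sketched in code. Here random.gauss stands in for the textbook's table of random normal numbers, so the actual draws (and the seed, 42, chosen only for reproducibility) are assumptions, not the book's values; the function names are illustrative.

```python
import random

# Sketch of the lead-time demand simulation: lead time ~ Normal(mean 7, sd 2),
# rounded to the nearest integer and floored at zero, then daily demands
# drawn from the given discrete distribution and summed over the lead time.
demand_values = [0, 1, 2, 3, 4]
demand_probs  = [0.367, 0.368, 0.184, 0.062, 0.019]

def draw_demand(rng):
    """Draw one day's demand by inverting the cumulative distribution."""
    u = rng.random()
    cum = 0.0
    for d, p in zip(demand_values, demand_probs):
        cum += p
        if u < cum:
            return d
    return demand_values[-1]

def lead_time_demand(rng):
    lead = max(0, round(rng.gauss(7, 2)))      # round off; never negative
    return sum(draw_demand(rng) for _ in range(lead))

rng = random.Random(42)                        # assumed seed, for reproducibility
cycles = [lead_time_demand(rng) for _ in range(20)]
```

Each entry of cycles is the total demand during one order cycle's lead time; averaging them estimates the mean lead-time demand.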
12. A bank has one drive-in teller and room for one additional customer to wait.
Customers arriving when the queue is full, park and go inside the bank to transact
business. The time-between-arrivals and services-time distributions are given below.
Simulate the operation of the drive-in-teller for 10 new customers. The first of the 10 new
customers arrives at a time determined at random. Start the simulation with one customer
being served, leaving at time 3, and one in the queue. How many customers went into
the bank to transact business?
(Table of random normal numbers omitted.)
A queuing system is described by its calling population, the nature of the arrivals, the
service mechanism, the system capacity, and the queuing discipline.
In the single-channel queue, the calling population is infinite; that is, if a unit leaves the
calling population and joins the waiting line or enters service, there is no change in the
arrival rate of the other units that may need service.
In the single-channel queue, the system works on a FIFO (first-in, first-out) basis, with
service provided by a single server or channel.
Arrivals and services are defined by the distributions of interarrival times and of service
times, respectively. For any simple single- or multi-channel queue, the overall effective arrival rate
must be less than the total service rate, or the waiting line will grow without bound. When
queues grow without bound, they are termed explosive or unstable.
1. The average waiting time for a customer is 2.8 minutes. This is determined in the
following manner:
   Average waiting time (minutes) = total time customers wait in queue (minutes) / total number of customers
                                  = 56 / 20 = 2.8 minutes
2. The probability that a customer has to wait in the queue is 0.65. This is
determined in the following manner:
   Probability (wait) = number of customers who wait / total number of customers
                      = 13 / 20 = 0.65
3. The fraction of idle time of the server is 0.21. This is determined in the following
manner:
   Probability of idle server = total idle time of server (minutes) / total run time of simulation (minutes)
                              = 18 / 86 = 0.21
The probability of the server being busy is the complement of 0.21, or 0.79.
4. The average service time is 3.4 minutes (total service time, 68 minutes, divided
by the total number of customers, 20). This result can be compared with the expected
service time, found by taking the mean of the service-time distribution, using the equation
   E(S) = Σ s p(s), summed over s = 0, 1, 2, …
Applying the expected-value equation to the distribution in Table 2.7 gives an expected
service time of
   E(S) = 1(0.10) + 2(0.20) + 3(0.30) + 4(0.25) + 5(0.10) + 6(0.05) = 3.2 minutes
The expected service time is slightly lower than the average service time in the simulation.
The longer the simulation, the closer the average will be to E(S).
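The expected-value computation can be checked in a few lines; the distribution is the one quoted from Table 2.7.

```python
# Recomputing the expected service time E(S) = sum of s * p(s) over the
# service-time distribution of Table 2.7.
service_times = [1, 2, 3, 4, 5, 6]
probabilities = [0.10, 0.20, 0.30, 0.25, 0.10, 0.05]

expected_service = sum(s * p for s, p in zip(service_times, probabilities))
# expected_service is 3.2 minutes (within floating-point rounding)
```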
5. The average time between arrivals is 4.3 minutes. This is determined in the
following manner:
   Average time between arrivals (minutes) = sum of all times between arrivals (minutes) / (number of arrivals − 1)
                                           = 82 / 19 = 4.3 minutes
One is subtracted from the denominator because the first arrival is assumed to occur at
time 0. This result can be compared to the expected time between arrivals by finding the
mean of the discrete uniform distribution whose endpoints are a = 1 and b = 8. The mean
is given by
   E(A) = (a + b) / 2 = (1 + 8) / 2 = 4.5 minutes
The expected time between arrivals is slightly higher than the average. However, as the
simulation becomes longer, the average value of the time between arrivals will approach
the theoretical mean, E (A).
6. The average waiting time of those who wait is 4.3 minutes. This is determined in
the following manner:
   Average waiting time of those who wait (minutes) = total time customers wait in queue (minutes) / total number of customers who wait
                                                    = 56 / 13 = 4.3 minutes
7. The average time a customer spends in the system is 6.2 minutes. This can be
determined in two ways. First, the computation can be achieved by the following
relationship:
   Average time customer spends in the system (minutes) = total time customers spend in the system (minutes) / total number of customers
                                                        = 124 / 20 = 6.2 minutes
The second way of computing this same result is to realize that the following relationship
must hold:
   Average time customer spends in the system (minutes) = average time customer spends waiting in the queue (minutes)
                                                        + average time customer spends in service (minutes)
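The two computations can be checked against each other using the totals quoted above: 20 customers, 56 minutes of total queue waiting, and 124 minutes of total time in the system, which together imply 68 minutes of total service time.

```python
# Verifying that the direct average and the wait-plus-service decomposition
# of the average time in the system agree.
n_customers     = 20
total_wait      = 56        # minutes waiting in queue
total_in_system = 124       # minutes in the system
total_service   = total_in_system - total_wait   # 68 minutes in service

avg_direct = total_in_system / n_customers                        # 124 / 20
avg_by_sum = (total_wait / n_customers) + (total_service / n_customers)
```

Both routes give 6.2 minutes, and the queue component alone reproduces the 2.8-minute average waiting time from item 1.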
Notice that in the second cycle, the amount in inventory drops below zero,
indicating a shortage. In Figure 2.7, these units are backordered; when the order arrives,
the demand for the backordered items is satisfied first. To avoid shortages, a buffer, or
safety, stock would need to be carried.
Carrying stock in inventory has an associated cost attributed to the interest paid
on the funds borrowed to buy the items (this also could be considered as the loss from
not having the funds available for other investment purposes). Other costs can be placed
in the carrying or holding cost column: renting of storage space, hiring guards, and so on.
An alternative to carrying high inventory is to make more frequent reviews, and
consequently, more frequent purchases or replenishments. This has an associated cost:
the ordering cost. Also, there is a cost in being short. Customers may get angry, with a
subsequent loss of good will. Larger inventories decrease the possibilities of shortages.
These costs must be traded off in order to minimize the total cost of an inventory system.
The total cost (or total profit) of an inventory system is the measure of
performance. This can be affected by the policy alternatives. For example, in Figure 2.7,
the decision maker can control the maximum inventory level, M, and the length of the
cycle, N. What effect does changing N have on the various costs?
In an (M, N) inventory system, the events that may occur are: the demand for
items in the inventory, the review of the inventory position, and the receipt of an order at
the end of each review period. When the lead-time is zero, as in Figure 2.7, the last two
events occur simultaneously.
In the following example for deciding how many newspapers to buy, only a single
time period of specified length is relevant and only a single procurement is made.
Inventory remaining at the end of the single time period is sold for scrap or discarded. A
wide variety of real-world problems are of this form, including the stocking of spare parts,
perishable items, style goods, and special seasonal items [Hadley and Whitin, 1963].
SUMMARY:
This chapter introduced simulation concepts via examples, in order to illustrate general
areas of application.
3
General Principles
System A collection of entities (e.g., people and machines) that interact together
over time to accomplish one or more goals.
System state A collection of variables that contain all the information necessary to
describe the system at any time.
Entity Any object or component in the system, which requires explicit representation
in the model (e.g., a server, a customer, a machine).
Attributes The properties of a given entity (e.g., the priority of a waiting customer,
the routing of a job through a job shop).
Event notice A record of an event to occur at the current or some future time, along
with any associated data necessary to execute the event; at a minimum, the record
includes the event type and the event time.
Event list A list of event notices for future events, ordered by time of occurrence;
also known as the future event list (FEL).
Delay A duration of time of unspecified indefinite length, which is not known until it
ends (e.g., a customer’s delay in a last-in, first-out waiting line which, when it begins,
depends on future arrivals).
Different simulation packages use different terminology for the same or similar
concepts. For example, lists are sometimes called sets,
queues, or chains. Sets or lists are used to hold entities as well as event notices. The
entities on a list are always ordered by some rule, such as first-in, first-out or last-in, first-
out, or ranked by some entity attribute, such as priority or due date. The future event list
is always ranked by the event time recorded in the event notice. Section 3.2 discusses a
number of methods for handling lists and introduces some of the methodologies for
efficient processing of ordered sets or lists.
2. Statistical - for example, as a random draw from among 2,5,7 with equal
probabilities;
beginning, then an event notice is created that specifies the type of event (an end of
inspection event) and the event time (100 + 5 = 105 minutes).
The mechanism for advancing simulation time and guaranteeing that all events occur in
correct chronological order is based on the future event list (FEL). This list contains all
event notices for events that have been scheduled to occur at a future time. Scheduling a
future event means that at the instant an activity begins, its duration is computed or
drawn as a sample from a statistical distribution and the end-activity event, together with
its event time, is placed on the future event list. In the real world, most future events are
not scheduled but merely happen, such as random breakdowns or random arrivals. In the
model, such random events are represented by the end of some activity, which in turn is
represented by a statistical distribution.
At any given time t, the FEL contains all previously scheduled future events and
their associated event times (called t1, t2, … in Figure 3.1). The FEL is ordered by event
time, meaning that the events are arranged chronologically; that is, the event times
satisfy t ≤ t1 ≤ t2 ≤ t3 ≤ …
Time t is the value of CLOCK, the current value of simulated time. The event
associated with time t1 is called the imminent event; that is, it is the next event that will
occur. After the system snapshot at simulation time CLOCK = t has been updated, the
CLOCK is advanced to simulation time CLOCK = t1, and the imminent event notice is
removed from the FEL and the event executed. Execution of the imminent event means
that a new system snapshot for time t1 is created based on the old snapshot at time t and
the nature of the imminent event. At time t1, new future events may or may not be
generated, but if any are, they are scheduled by creating event notices and putting them
in their proper position on the FEL. After the new system snapshot for time t1 has been
updated, the clock is advanced to the time of the new imminent event and that event is
executed. This process repeats until the simulation is over.
The length and contents of the FEL are constantly changing as the simulation
progresses, and thus its efficient management in a computerized simulation will have a
major impact on the efficiency of the computer program representing the model. The
management of a list is called list processing. The major list-processing operations
performed on the FEL are removal of the imminent event, addition of a new event to
the list, and occasionally removal of some event (called cancellation of an event). As the
imminent event is usually at the top of the list, its removal is as efficient as possible.
Addition of a new event (and cancellation of an old event) requires a search of the list. The
efficiency of this search depends on the logical organization of the list and on how the
search is conducted. In addition to the FEL, all the sets in a model are maintained in
some logical order, and the operations of addition and removal of entities from the set
also require efficient list-processing techniques. A brief introduction to list processing in
simulation is given in Section 3.2.
The removal and addition of events from the FEL are illustrated in Figure
3.2. Event 3 with event time t1 represents, say, a service-completion event at server 3.
Since it is the imminent event at time t, it is removed from the FEL in step 1 (Figure 3.2)
of the event-scheduling/time-advance algorithm. When event 4 (say, an arrival event)
with event time t* is generated at step 4, one possible way to determine its correct
position on the FEL is to conduct a top-down search:
If t* < t2 place event 4 at the top of the FEL.
If t2 < t* < t3, place event 4 second on the list.
If t3 < t* < t4, place event 4 third on the list.
…
If tn < t*, place event 4 last on the list.
(In figure 3.2, it was assumed that t* was between t2 and t3.) Another way is to conduct a
bottom-up search. The least efficient way to maintain the FEL is to leave it as an
unordered list (additions placed arbitrarily at the top or bottom), which would require at
step 1 of Figure 3.2 a complete search of the list for the imminent event before each
clock advance. (The imminent event is the event on the FEL with the lowest event time.)
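The top-down search above can be sketched as follows; the event times are illustrative. (In practice, many packages avoid the linear scan altogether by keeping the FEL in a heap or similar structure, as Section 3.2 discusses.)

```python
# Top-down search: scan an already-ordered FEL from the top and insert the
# new event time t* at the first position whose event time exceeds it.
fel = [2.0, 3.5, 6.0, 9.0]          # event times t1 < t2 < t3 < t4 (illustrative)

def insert_top_down(fel, t_star):
    """Place a new event time in its correct position via top-down search."""
    for i, t in enumerate(fel):
        if t_star < t:
            fel.insert(i, t_star)
            return
    fel.append(t_star)              # t* is later than every scheduled event

insert_top_down(fel, 4.2)           # falls between 3.5 and 6.0
```

After the insertion the list reads 2.0, 3.5, 4.2, 6.0, 9.0, so the new event sits third, exactly as the rule "if t2 < t* < t3, place it second after the imminent event" prescribes.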
The system snapshot at time 0 is defined by the initial conditions and the
generation of the so-called exogenous events. The specified initial conditions define the
system state at time 0. For example, in Figure 3.2, if t = 0, then the state (5,1,6) might
represent the initial number of customers at three different points in the system. An
exogenous event is a happening “outside the system” which impinges on the system. An
important example is an arrival to a queuing system. At time 0, the first arrival event is
generated and is scheduled on the FEL (meaning its event notice is placed on the FEL).
The interarrival time is an example of an activity. When the clock eventually is advanced
to the time of this first arrival, a second arrival event is generated. First, an interarrival
time is generated, a*; it is added to the current time, CLOCK = t; the resulting (future)
event time, t + a* = t*, is used to position the new arrival event notice on the FEL.
This method of generating an external arrival stream, called
bootstrapping, provides one example of how future events are generated in step 4 of the
event-scheduling/time-advance algorithm. Bootstrapping is illustrated in Figure 3.3. The
first three interarrival times generated are 3.7, 0.4, and 3.3 time units. The end of an
interarrival interval is an example of a primary event.
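The arithmetic of bootstrapping for the three interarrival times quoted above can be traced in a few lines, assuming the first interarrival time is generated at CLOCK = 0.

```python
# Bootstrapping an arrival stream: each arrival, when executed, generates the
# next interarrival time a* and schedules the next arrival at t* = t + a*.
interarrival_times = [3.7, 0.4, 3.3]   # the three draws from Figure 3.3

clock = 0.0
arrival_times = []
for a_star in interarrival_times:
    t_star = clock + a_star            # future event time t* = t + a*
    arrival_times.append(t_star)       # the arrival event notice goes on the FEL
    clock = t_star                     # the clock will eventually advance here
```

The resulting arrival times are 3.7, 4.1, and 7.4 time units.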
A second example of how future events are generated (step 4 of Figure 3.2) is
provided by a service-completion event in a queueing simulation. When one customer
completes service, at current time CLOCK = t, if the next customer is present, then a new
service time, s*, will be generated for the next customer. The next service-completion
event will be scheduled to occur at future time t* = t + s* by placing onto the FEL a new
event notice of type service completion with event time t*. In addition, a service-
completion event will be generated and scheduled at the time of an arrival event,
provided that, upon arrival, there is at least one idle server in the server group. A service
time is an example of an activity. Beginning service is a conditional event, because its
occurrence is triggered only on the condition that a customer is present and a server is
free. Service completion is an example of a primary event. Note that a conditional event,
such as beginning service, is triggered by a primary event occurring and certain
conditions prevailing in the system. Only primary events appear on the FEL.
Every simulation must have a stopping event, here called E, which defines how
long the simulation will run. There are generally two ways to stop a simulation:
1. At time 0, schedule a stop-simulation event at a specified future time TE. Thus,
before simulating, it is known that the simulation will run over the time interval [0,
TE]. Example: Simulate a job shop for TE = 40 hours.
2. Run length TE is determined by the simulation itself: TE is the time of
occurrence of the stopping event E. Example: Simulate until a specified number
of jobs have been completed.
In case 2, TE is not known ahead of time. Indeed, it may be one of the statistics of
primary interest to be produced by the simulation.
In the event-scheduling approach, a simulation analyst concentrates on events and
their effect on system state. This world view will be illustrated by the manual simulations
of Section 3.1.3 and the C++ simulation in Chapter 4.
When using a package that supports the process-interaction approach, a
simulation analyst thinks in terms of processes. The analyst defines the simulation
model in terms of entities or objects and their life cycle as they flow through the
system, demanding resources and queueing to wait for resources. More precisely, a
process is the life cycle of one entity. This life cycle consists of various events and
activities. Some activities may require the use of one or more resources whose
capacities are limited. These and other constraints cause processes to interact, the
simplest example being an entity forced to wait in a queue (on a list) because the
resource it needs is busy with another entity. The process-interaction approach is
popular because of its intuitive appeal, and because the simulation packages that
implement it allow an analyst to describe the process flow in terms of high-level block
or network constructs, while the interaction among processes is handled
automatically.
In more precise terms, a process is a time-sequenced list of events, activities,
and delays, including demands for resources, that define the life cycle of one entity
as it moves through a system.
of the built-in but hidden rules of operation. Schriber and Brunner [1998] provide
understanding in this area.
Both the event-scheduling and the process-interaction approaches use a variable
time advance; that is, when all events and system state changes have occurred at
one instant of simulated time, the simulation clock is advanced to the time of the next
imminent event on the FEL. The activity-scanning approach, in contrast, uses a fixed
time increment and a rule-based approach to decide whether any activities can begin
at each point in simulated time.
B activities Activities bound to occur; all primary events and unconditional activities.
C activities Activities or events that are conditional upon certain conditions being
true.
The B-type activities and events can be scheduled ahead of time, just as in the
event-scheduling approach. This allows variable time advance. The FEL contains only B-
type events. Scanning to check if any C-type activities can begin or C-type events occur
happens only at the end of each time advance, after all B-type events have completed. In
summary, with the three-phase approach, the simulation proceeds with repeated
execution of the three phases until it is completed:
Phase A Remove the imminent event from the FEL and advance the clock to its
event time. Remove any other events from the FEL that have the same event time.
Phase B Execute all B-type events that were removed from the FEL. (This may free
a number of resources or otherwise change system state.)
Phase C Scan the conditions that trigger each C-type activity and activate any whose
conditions are met. Rescan until no additional C-type activities can begin or events
occur.
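The three phases can be sketched for a single-server queue. Everything here is illustrative rather than from the text: the B events are arrivals and service completions on the FEL, the C activity is "begin service", and the arrival and service times are made up.

```python
import heapq

# Illustrative three-phase (A/B/C) loop for a single-server queue.
fel = [(1.0, "arrival"), (2.0, "arrival"), (2.5, "arrival")]  # B events
heapq.heapify(fel)
queue, server_busy, clock = [], False, 0.0
started = []                                # times at which service began

SERVICE_TIME = 2.0                          # assumed constant service time

while fel:
    # Phase A: advance the clock; remove all events due at the imminent time.
    clock, ev = heapq.heappop(fel)
    due = [ev]
    while fel and fel[0][0] == clock:
        due.append(heapq.heappop(fel)[1])
    # Phase B: execute the bound (B-type) events.
    for ev in due:
        if ev == "arrival":
            queue.append(clock)
        elif ev == "end_service":
            server_busy = False
    # Phase C: rescan until no conditional activity can begin.
    while queue and not server_busy:
        queue.pop(0)                        # a customer is present...
        server_busy = True                  # ...and the server is free
        started.append(clock)
        heapq.heappush(fel, (clock + SERVICE_TIME, "end_service"))
```

Service begins at times 1.0, 3.0, and 5.0: the second and third customers can start only when the preceding end_service event (a B event) frees the server, at which point the Phase C scan finds its condition satisfied.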
The three-phase approach improves the execution efficiency of the activity
scanning method. In addition, proponents claim that the activity-scanning and three-
phase approaches are particularly good at handling complex resource problems, in which
various combinations of resources are needed to accomplish different tasks. These
approaches guarantee that resources being freed at a given simulated time will be freed
before any available resources are reallocated to new tasks.
List processing
The purpose of this discussion of list processing is not to prepare the reader to
implement lists and their processing in a general-purpose language such as FORTRAN,
C, or C++ but rather to increase the reader’s understanding of lists and the underlying
concepts and operations.
Since lists are ranked, they have a top or head (the first item on the list); some
way to traverse the list (to find the second, third, etc, items on the list); and a bottom or
tail (the last item on the list). A head pointer is a variable that points to or indicates the
record at the top of the list. Some implementations of lists may also have a tail pointer
that points to the bottom item on the list.
For purposes of discussion, an entity along with its attributes or an event notice
will be referred to as a record. An entity identifier and its attributes are fields in the entity
record; the event type, event time, and any other event-related data are fields in the
event-notice record. Each record on a list will also have a field that holds a “next pointer”
that points to the next record on the list, providing a way to traverse the list. Some lists
may also require a “previous pointer” to allow traversing the list from bottom to top.
For either type of list, the main activities in list processing are adding a record to
a list and removing a record from a list. More specifically, the main operations on a list
are:
1. removing a record from the top of the list;
2. removing a record from any location on the list;
3. adding a record to the top or bottom of the list;
4. adding a record at an arbitrary position in the list, determined by a ranking rule.
While the first and third operations, removing or adding a record at the top or
bottom of the list, can be carried out in minimal time by adjusting two record pointers and
the head or tail pointer, the other two operations require at least a partial search through
the list. Making these two operations efficient is the goal of list-processing techniques.
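A minimal sketch of the cheap operations on a singly linked list of event-notice records, with head and tail pointers as described above. The class and field names are illustrative.

```python
# Singly linked list of event-notice records: remove from the top and add at
# the bottom each cost only a couple of pointer adjustments.

class Record:
    def __init__(self, event_type, event_time):
        self.event_type = event_type
        self.event_time = event_time
        self.next = None                  # "next pointer" to the next record

class LinkedList:
    def __init__(self):
        self.head = None                  # head pointer: record at the top
        self.tail = None                  # tail pointer: record at the bottom

    def add_bottom(self, record):
        if self.tail is None:             # empty list: record is both ends
            self.head = self.tail = record
        else:
            self.tail.next = record       # adjust the old tail's next pointer...
            self.tail = record            # ...and the tail pointer itself

    def remove_top(self):
        record = self.head
        self.head = record.next           # adjust the head pointer
        if self.head is None:
            self.tail = None              # list became empty
        return record

fel = LinkedList()
fel.add_bottom(Record("arrival", 3.7))
fel.add_bottom(Record("end_service", 5.2))
imminent = fel.remove_top()               # the record at the top of the list
```

Removing from any interior position, or inserting by a ranking rule, would instead require traversing the next pointers, which is exactly the partial search the text refers to.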
The array method of list storage is typical of FORTRAN but may be used in other
procedural languages. As most versions of FORTRAN do not have actual record-type
data structures, a record may be implemented as a row in a two-dimensional array or as
a number of parallel arrays. For convenience, we use the notation R(i) to refer to the ith
record in the array, however it may be stored in the language being used. Most modern
simulation packages do not use arrays for list storage but rather use dynamically
allocated records—that is, records that are created upon first being needed and
destroyed when they are no longer needed.
Arrays are advantageous in that any specified record, say the ith, can be
retrieved quickly, without searching, merely by referencing R(i). Arrays are disadvantageous
when items are added to the middle of a list or when the list must be rearranged. In addition,
arrays typically have a fixed size, determined at compile time or upon initial allocation
when a program first begins to execute. In simulation, the maximum number of records for
any list may be difficult or impossible to determine ahead of time, while the current
number in a list may vary widely over the course of the simulation run. Worse yet, most
simulations require more than one list, and if the lists are kept in separate arrays, each would
have to be dimensioned to the largest the list would ever be, potentially using excessive
amounts of computer memory.
In this text, we are not concerned with the details of allocating and freeing
computer memory, and we will assume that the necessary operations occur as needed.
With dynamic allocation, a record is referenced by a pointer instead of by an array index.
When a record is allocated in C or C++, the allocation routine returns a pointer to the
allocated record, which must be stored in a variable or a field of another record for later
use.
In our example, we will use a notation for records identical to that in the previous
section (3.2.2):
but we will not reference them by the array notation R(i) as before, because it would be
misleading. If for some reason we wanted the third item on the list, we would have to
traverse the list, counting items until we reached the third record. Unlike arrays, there is
no way to retrieve directly the ith record in a linked list, as the actual records may be
stored at any arbitrary location in computer memory and are not stored contiguously as
are arrays.
Summary
This chapter introduced the major concepts & building blocks of simulation, the
most important being entities & attributes, events & activities.
Exercises
1) Prepare a banking system model with the future event list for 4 customers and 2
servers, and model the clock for 2 days.
2) Prepare: 1. the system state; 2. the system entities and their attributes; 3. the sets and
the entities that may be put into the sets; 4. the events and activities, for any 5 systems
(either single server or double server).
3) Prepare a simulation table until the clock reaches time 15, using the interarrival and
service times given below in the order shown. The stopping event will be at time 30.
Interarrival times: 1 5 6 3 8
Service times: 5 3 4 1 5
4
Simulation Software
In this chapter we first discuss the history of simulation software – a history that
is just reaching middle age. We base this history on our collective experience, articles
written by Richard Nance, and panel discussions at the annual Winter Simulation
Conference.
Next, we discuss features and attributes of simulation software. If you were about
to purchase simulation software, what would concern you? Would it be the cost, the ease
of learning, the ease of use, or would it be the power to model the kinds of systems with
which you are concerned? Or would it be the animation capabilities? We then discuss
other issues and concerns related to the selection of simulation software.
Software that is used to develop simulation models can be divided into three
categories. First, there are the general-purpose programming languages, such as
FORTRAN, C, and C++. Second, there are simulation programming languages,
examples being GPSS/H™ and SIMAN V. Third, there are the simulation environments.
This category includes many products that are distinguished one way or another (by, for
example, cost, application area, or type of animation) but have common characteristics
such as a graphical user interface and an environment that supports all (or most) aspects
of a simulation study. Many simulation environments contain a simulation programming
language, while some take a graphical approach similar to process flow diagramming.
There are many features that are relevant when selecting simulation software [Banks,
1996]. Some of these features are shown, along with a brief description, in Tables 4.1 to
4.5. We offer the following advice when evaluating and selecting simulation software:
1. Do not focus on a single issue such as ease of use. Consider the accuracy and
level of detail obtainable, ease of learning, vendor support, and applicability to
your problems.
2. Beware of “checklists” with “yes” and “no” as the entries. For example, many
packages claim to have a conveyor entity. However, implementations vary
considerably in capability and level of fidelity. Implementation and capability are what
is important. As a second example, most packages offer a runtime license, but
these vary considerably in price and features.
3. Simulation users ask if the simulation model can link to and use code or routines
written in external languages such as C, C++, or FORTRAN. This is a good
feature, especially when the external routines already exist and are suitable for the
purpose at hand. However, the more important question is whether the simulation
package and language are sufficiently powerful to avoid having to write logic in
any external language.
The system, a grocery checkout counter, is modeled as a single-server queue. The
simulation will run until 1000 customers have been served. In addition, assume that the
interarrival times of customers are exponentially distributed with a mean of 4.5 minutes,
and that the service times are (approximately) normally distributed with a mean of 3.2
minutes and a standard deviation of 0.6 minute. (The approximation is that service times
are always positive.) When the cashier is busy, a queue forms, with no customers turned
away. This example was manually simulated in Examples 3.3 and 3.4 using the event-
scheduling point of view. The model contains two events, the arrival and departure
events. Figures 3.5 and 3.6 provide the event logic.
The following three sections illustrate the simulation of this single-server queue in C++,
GPSS/H, and CSIM. Although this example is much simpler than models that arise in the
study of complex systems, its simulation contains the essential components of all
discrete-event simulations.
4.4 Simulation in C++
The simulation begins by setting the simulation Clock to zero, initializing cumulative
statistics to zero, generating any initial events (there will always be at least one) and
placing them on the Future Event List, and defining the system state at time 0. The
simulation program then cycles, repeatedly passing the current least-time event to the
appropriate event subroutines until the simulation is over. At each step, after finding the
imminent event but before calling the event subroutine, the simulation Clock is advanced
to the time of the imminent event. (Recall that during the simulated time between the
occurrence of two successive events, the system state and entity attributes do not
change in value. Indeed, this is the definition of discrete-event simulation: the system
state changes only when an event occurs.) Next, the appropriate event subroutine is
called to execute the imminent event, update cumulative statistics, and generate future
events (to be placed on the Future Event List). Executing the imminent event means that
the system state, entity attributes, and set membership are changed to reflect the fact
that the event has occurred. Notice that all actions in an event subroutine take place at
one instant of simulated time. The value of the variable Clock does not change in an
event routine. If the simulation is not over, control passes again to the time-advance
subroutine, then to the appropriate event subroutine, and so on. When the simulation is
over, control passes to the report generator, which computes the desired summary
statistics from the collected cumulative statistics and prints a report.
The grocery checkout counter, defined in detail in Example 4.1, is now simulated using
C++. A version of this example was simulated manually in Examples 3.3 and 3.4, where
the system state, entities and attributes, sets, events, activities, and delays were
analyzed and defined.
Class Event represents an event. It stores a code for the event type (arrival or departure)
and the event time stamp. It has associated methods (functions) for creating an event
and accessing its data. It also has associated special functions, called operators, which
give semantic meaning to the relational operators < and == between events. These are
used by the C++ standard library queue implementations. The subroutines for this model
and the flow of control are shown in Figure 4.2, which is an adaptation of Figure 4.1.
be used to model any situation where transactions (entities, customers, units of traffic)
are flowing through a system (e.g., a network of queues, with the queues preceding
scarce resources). The block diagram is converted to block statements, control
statements are added, and the result is a GPSS model.
The first version of GPSS was released by IBM about 1961. Since it was the first process-
interaction simulation language, and owing to its popularity, it has been implemented
anew and improved by many parties since 1961, with GPSS/H being the most widely
used version today. Example 4.3 is based on GPSS/H.
Figure 4.10 exhibits the block diagram and Figure 4.11 the GPSS program for the grocery
store checkout-counter model described in Example 4.2. Note that the program (Figure
4.11) is a translation of the block diagram together with additional definition and control
statements.
In Figure 4.10, the GENERATE block represents the arrival event, with the interarrival
times specified by RVEXPO(1,&IAT). RVEXPO stands for “random variable,
exponentially distributed,” the 1 indicates the random-number stream to use, and &IAT
indicates that the mean time for the exponential distribution comes from a so-called
ampervariable &IAT. Ampervariable names begin with the “&” character; Wolverine
added ampervariables to GPSS because the original IBM implementation had limited
support for ordinary global variables, with no user freedom for naming them. (In the
discussion that follows all nonreserved words are shown in italics.)
The next block is a QUEUE with a queue named SYSTIME. It should be noted that the
QUEUE block is not needed for queues or waiting lines to form in GPSS. The true
purpose of the QUEUE block is to work in conjunction with the DEPART block to collect
data on queues or any other subsystem.
Figure 4.10. GPSS block diagram for single-server queue simulation.
In Example 4.3, we want to measure the system response time, that is, the time a
transaction spends in the system. By placing a QUEUE block at the point where
transactions enter the system, and the counterpart of the QUEUE block, the DEPART
block, at the point where the transactions complete their processing, the response times
will be automatically collected. The purpose of the DEPART block is to signal the end of
data collection for an individual transaction. The QUEUE and DEPART block combination
is not necessary for queues to be modeled, but rather is used for statistical data collection.

SIMULATE
* Define Ampervariables
INTEGER &LIMIT
REAL &IAT,&MEAN,&STDEV,&COUNT
LET &IAT = 4.5
LET &MEAN = 3.2
LET &STDEV = .6
LET &LIMIT = 1000
TER TERMINATE 1
END
The next QUEUE block (with name LINE) begins data collection for the waiting
line before the cashier. The customers may or may not have to wait for the cashier. Upon
arrival to an idle checkout counter, or after advancing to the head of the waiting line, a
customer captures the cashier, as represented by the SEIZE block with the resource
named CHECKOUT. Once the transaction representing a customer captures the cashier
represented by the resource CHECKOUT, the data collection for the waiting-line statistics
ends, as represented by the DEPART block for the queue named LINE. The transaction’s
service time at the cashier is represented by an ADVANCE block. RVNORM indicates
“random variable, normally distributed.” Again, random-number stream 1 is being used,
the mean time for the normal distribution is given by ampervariable &MEAN, and its
standard deviation is given by ampervariable &STDEV. Next, the customer gives up the
use of the facility CHECKOUT with a RELEASE block. The end of the data collection for
response times is indicated by the DEPART block for the queue SYSTIME.
Next, there is a TEST block that checks to see if the time in the system, M1, is
greater than or equal to 4 minutes. (Note that M1 is a reserved word in GPSS/H; it
automatically tracks a transaction’s total time in system.) In GPSS/H, the maxim is “if true,
pass through.” Thus, if the customer has been in the system 4 minutes or longer, the next
BLET block (for block LET) adds one to the counter &COUNT. If not true, the escape
route is to the block labeled TER. That label appears before the TERMINATE block,
whose purpose is the removal of the transaction from the system. The TERMINATE block
has a value “1”, indicating that one more transaction is added toward the limiting value, or
“transactions to go.”
The control statements in this example are all of those lines in Figure 4.11 that
precede or follow the block section. (There are eleven blocks in the model from the
GENERATE block to the TERMINATE block). The control statements that begin with an
asterisk are comments, some of which are used for spacing purposes. The control
statement SIMULATE tells GPSS/H to conduct a simulation; if omitted, GPSS/H compiles
the model and checks for errors only. The ampervariables are defined as integer or real
by the control statements INTEGER and REAL. It might seem that the ampervariable
&COUNT should be defined as an integer; however, it will later be divided by the number
of customers to obtain a proportion. If it were an integer, the result of dividing an integer
by an integer would be truncated, which is not desired in this case. The four assignment
statements (LET) provide data for the simulation. These four values could have been
placed directly in the program; however, the preferred practice is to place them in
ampervariables at the top of the program, so that changes can be made more easily, or
the model modified to read them from a data file.
To ensure that the model data is correct, and to manage the different scenarios
simulated, it is good practice to echo the input data. This is
accomplished with a PUTPIC (for “put picture”) control statement. The five lines following
PUTPIC provide formatting information, with the asterisks being markers (called picture
formatting) in which the values of the four ampervariables replace the asterisks when
PUTPIC is executed. Thus, “**.**” indicates a value that may have two digits following the
decimal point.
The contents of the custom output file OUT are shown in Figure 4.12. The
standard GPSS/H output file is displayed in Figure 4.13. Although much of the same data
shown in the file OUT can be found in the standard GPSS/H output, the custom file is
more compact and uses the language of the problem rather than GPSS jargon. There are
many other reasons that customized output files are useful. For example, if 50
replications of the model are to be made and the lowest, highest, and average value of a
response are desired, this can be accomplished using control statements with the results
in a very compact form, rather than extracting the desired values from 50 standard output
reports.
CSIM is a system for using the C or C++ language for modeling, along with a
rich library of predefined objects to support process-oriented simulation modeling. It is a
commercial product sold and maintained by Mesquite Software [Schwetman,
1996]; it has a wide user base in industry and academia, principally to model computer
and communication systems. CSIM is fast, owing to careful implementation and to being
a compiled language.
After generating the required customers, the sim process calls the wait function
associated with the CSIM event done. This causes it to suspend until done is set by the
last customer process of interest. The call to done.set() in turn releases sim to call
ReportGeneration to give its report. Our CSIM implementation of ReportGeneration can
be essentially that illustrated in Figure 4.7, with the exception of calling the CSIM-provided
facility routine util to get the server utilization, and resp to get the average response time:
CSIM bridges the gap between models developed in pure C++ and models
developed in languages specifically designed for simulation. It provides the flexibility
offered by a general programming language, with essential support for simulation.
4.7 Simulation Environments
All the packages described here take the process-interaction world view. A few
also allow event scheduling and mixed discrete-continuous models. For animation,
some emphasize scale drawings in 2-D or 3-D, while others emphasize iconic-type
animations based on schematic drawings or process flow diagrams. A few offer both
scale-drawing and schematic-type animations. Almost all offer dynamic business
graphing in the form of timelines, bar charts, and pie charts.
4.7.1 Arena
The Arena Standard Edition is designed for more detailed models of discrete and
continuous systems. First released in 1993, Arena employs an object-based design for
entirely graphical model development. Simulation models are built using graphical objects
called modules to define system logic and physical components such as machines,
operators, and clerks. Modules are represented by icons plus associated data entered in
a dialog window. These icons are connected to represent entity flow. Modules are
organized into collections called templates. The Arena template is the core collection of
modules providing general-purpose features for modeling all types of applications. In
addition to standard features, such as resources, queues, process logic, and system
data, the Arena template includes modules focused on specific aspects of manufacturing
and material handling systems. Arena SE can also be used to model combined discrete/
continuous systems, such as pharmaceutical and chemical production, through its built-in
continuous modeling capabilities.
The Arena Professional Edition enhances Arena SE with the capability to craft
custom simulation objects that mirror components of the real system, including
terminology, process logic, data, performance metrics, and animation. The Arena family
also includes products designed specifically to model call centers and high-speed
production lines, namely Arena Call Center and Arena Packaging.
At the heart of Arena is the SIMAN simulation language. For animating simulation
models, Arena’s core modeling constructs are accompanied by standard graphics for
showing queues, resource status, and entity flow. Arena’s 2-D animations are created
using Arena’s built-in drawing tools and by incorporating clipart, AutoCAD, Visio, and
other graphics.
Arena’s Input Analyzer automates the process of selecting the proper distribution
and its parameters for representing existing data, such as process and interarrival times.
The Output Analyzer and Process Analyzer (discussed in Section 4.8.2) automate
comparison of different design alternatives.
4.7.2 AutoMod
The AutoMod Product Suite is offered by AutoSimulations [Rohrer, 1999]. It
includes the AutoMod simulation package, AutoStat for experimentation and analysis,
and AutoView for making AVI movies of the built-in 3-D animation. The main focus of the
AutoMod simulation product is manufacturing and material handling systems.
AutoMod has built-in templates for most common material handling systems,
including vehicle systems, conveyors, automated storage and retrieval systems, bridge
cranes, power and free conveyors, and kinematics for robotics. With its Tanks and Pipes
module, it also supports continuous modeling of fluid and bulk material flow.
The Path Mover vehicle system can be used to model lift trucks, humans walking
or pushing carts, automated guided vehicles, and trucks and cars. All the movement
templates are based on a 3-D scale drawing (drawn or imported from CAD as 2-D or
3-D). All the components of a template are highly parameterized. For example, the conveyor
template contains conveyor sections, stations for load induction or removal, motors, and
photo-eyes. Sections are defined by length, width, speed, acceleration, and type
(accumulating or nonaccumulating), plus other specialized parameters. Photo-eyes have
blocked and cleared timeouts that facilitate modeling of detailed conveyor control logic.
In addition to the material handling templates, AutoMod contains a full simulation
programming language. Its 3-D animation can be viewed from any angle or perspective in
real time. The user can freely zoom, pan, or rotate the 3-D world.
An Auto Mod model consists of one or more systems. A system can be either a
process system, in which flow and control logic are defined, or a movement system
based on one of the material handling templates. Each model must contain one process
system and may contain any number of movement systems. Processes can contain
complex logic to control the flow of either manufacturing materials or control messages,
to contend for resources, or to wait for user – specified times. Loads can move between
processes with or without using movement systems.
In the AutoMod world view, loads (products, parts, etc.) move from process to
process and compete for resources (equipment, operators, vehicles, and queues). The
load is the active entity, executing action statements in each process. To move between
processes, loads may use a conveyor or vehicle in a movement system.
AutoStat, described in Section 4.8.2, works with AutoMod models to provide a complete
environment for the user to define scenarios, conduct experimentation, and perform
analyses. It offers optimization based on an evolutionary strategies algorithm.
4.7.3 Deneb/QUEST
QUEST models are based on 3-D CAD geometry. QUEST combines a simulation
environment with graphical user interface and material flow modules for labor, conveyors,
automated guided vehicles, kinematics, power and free conveyors, and automated
storage and retrieval systems.
For Deneb/IGRIP users, existing robot workcell models and robot or other
complex machinery cycle times can be directly imported into Deneb/QUEST. Similarly,
models of human operators in a workcell developed in Deneb/ERGO Sim can be
incorporated into a QUEST process model, both visually and numerically.
4.7.4 Extend
Extend is offered by Imagine That, Inc. [Krahl, 1999]. Extend combines a block
diagram approach to model building with an authoring environment for creating
new blocks or complete vertical-market simulators. It is process oriented, but it is also
capable of continuous and combined modeling. It provides iconic animation of the block
diagram, plus an interface to Proof Animation [Henriksen, 1999] from Wolverine Software
for 2-D scale-drawing animation.
Each block has an icon and encapsulates code, parameters, user interface, and
online help. Extend comes with a large set of elemental blocks; in addition, libraries of
blocks for general and specific application areas, such as manufacturing and BPR
(business process reengineering), are available. Third-party developers have created
Extend libraries for vertical-market applications, including supply chain analysis, chemical
processing, pulp and paper processing, and radio-frequency analysis.
End users can build models by placing and connecting blocks and filling in the
parameters on the dialog window associated with a block. Collections of blocks can be
grouped into a hierarchical block representing a submodel such as a particular machine,
a workstation, or a subassembly line. In this way, a higher-level block diagram can be
displayed to the end user, hiding many of the details. Parameters from the submodel
network of blocks that an end user desires to control can be grouped and displayed at the
level of the hierarchical block. Using this capability, a firm could build a set of customized
blocks representing various components of its manufacturing facility.
For creating new blocks, Extend comes with a compiled C-like simulation
programming language called Modl. It contains simulation support, as well as support for
custom user interfaces and message communication. Extend has an open architecture,
so that in most cases the source code for blocks is available to advanced model builders.
The statistics library of blocks is used to collect and analyze output data, such as
computing confidence intervals.
4.7.5 Micro Saint
Micro Saint is offered by Micro Analysis and Design, Inc. [Bloechle and
Laughery, 1999]. With Micro Saint, a model builder develops a model by creating a
flow-chart diagram to describe a network of tasks. The animation is iconic and is based on the
flow-chart representation. Through menus and parameterized icons, the network can have
branching logic, sorted queues, and conditional task execution. Micro Saint is used in
manufacturing, health care, retail, and military applications. It supports ergonomics and
human performance modeling. It works with OptQuest to provide optimization capability.
4.7.6 ProModel
ProModel has manufacturing-oriented modeling elements and rule-based
decision logic. Some systems can be modeled by selecting from ProModel’s set of highly
parameterized modeling elements. In addition, its simulation programming language
provides for modeling special situations not covered by the built-in choices.
ProModel’s runtime interface allows a user to define multiple scenarios for
experimentation. SimRunner (discussed in Section 4.8.2) adds the capability to perform
an optimization; it is based on an evolutionary strategy algorithm, a variant of the genetic-
algorithm approach.
4.7.7 Taylor ED
Taylor ED is based on the concept of an atom. Atoms are Taylor ED’s smart
objects and model-building resources. Model builders can access standard libraries of
atoms as well as create new atoms using Taylor ED’s Atom Editor. Atoms are used to
represent the products or entities flowing through the system as well as the resources
acting on these entities. In fact, everything in Taylor ED is an atom, whether it is a
resource, a product, a model (or submodel), a table, a report, a graph, a record, or a
library.
To build a model, a user first creates a model layout by selecting atoms from the
atom library and placing them on the screen. Next the user connects the atoms and
defines the routing of products or entities (also atoms) flowing through the model. Next,
the user assigns logic, including routing logic, to each atom by editing its parameter fields.
An atom is an object with four dimensions (x, y, z, and time). Each atom can
have a location, a speed, a rotation, and dynamic behavior over time. Atoms can inherit
their behavior from other atoms and contain other atoms; atoms can be created and
destroyed. Atoms can be viewed simultaneously in 2-D and 3-D animation.
One of the hallmark characteristics of the atom is its reusability. For example, a
workcell can be created by using the Atom Editor or by combining several existing atoms.
The workcell can be saved as an atom, included in a standard atom library, then reused
in a new model or shared with other Taylor ED modelers.
Taylor ED includes both an experiment module and the OptQuest Optimizer
(discussed in Section 4.8.2).
4.7.8 WITNESS
WITNESS is offered by The Lanner Group [Mehta and Rawles, 1999]. WITNESS
is strongly machine oriented and contains many elements for discrete part manufacturing.
It also contains elements for continuous processing such as the flow of fluids through
processors, tanks and pipes.
It offers a 2-D layout animation plus a process flow view. The 2-D graphics offer
a walk-through capability controlled by a mouse or a camera attached to an entity.
4.8 Experimentation and Statistical-Analysis Tools
Virtually all simulation packages offer various degrees of support for statistical
analysis of simulation outputs. In recent years, many packages have added optimization
as one of the analysis tools. To support analysis, most packages provide scenario
definition and run management, as well as data export to spreadsheets and other
external applications.
Optimization is used to find a “near-optimal” solution. The user must define an
objective or fitness function, usually a cost or cost-like function that incorporates the
trade-off between additional throughput and additional resources. Until recently, the methods
available for optimizing a system had difficulty coping with the random and nonlinear
nature of most simulation outputs. Advances in the field of metaheuristics have offered
new approaches to simulation optimization, based on artificial intelligence, neural
networks, genetic algorithms, evolutionary strategies, tabu search, and scatter search.
This section briefly discusses the Arena Output and Process Analyzers, AutoStat
for AutoMod, SimRunner for ProModel, and OptQuest, which is used in a number of
simulation products.
Arena comes with the Output Analyzer and Process Analyzer. In addition, Arena
uses OptQuest for optimization.
and responses. Responses can be added after runs have been completed. It will rank
scenarios by any response and provide summaries and statistical measures of the
responses. A user can view 2-D and 3-D charts of response values across either
replications or scenarios.
AutoStat
AutoStat is the run manager and statistical analysis product in the AutoMod
product family [Rohrer, 1999]. AutoStat provides a number of analyses, including warm-
up determination for steady-state analyses, absolute and comparison confidence
intervals, design of experiments, sensitivity analysis, and optimization using an
evolutionary strategy. The evolutionary strategies algorithm used by AutoStat is well
suited to finding a near-optimal solution without getting trapped at a local optimum.
With AutoStat, an end user can define any number of scenarios by defining
factors and their range of values. Factors include single parameters such as resource
capacity or vehicle speed, single cells in a data file, and complete data files. By allowing
a data file to be a factor, a user can experiment with, for example, alternate production
schedules, customer orders for different days, different labor schedules, or any other
numerical inputs typically specified in a data file. Any standard or custom output can be
designated as a response. For each defined response, AutoStat computes descriptive
statistics (average, standard deviation, minimum and maximum) and confidence
intervals. New responses can be defined after runs are made, because AutoStat archives
and compresses the standard and custom outputs from all runs. Various charts and plots
are available to provide graphical comparisons.
AutoStat also works with two other products from AutoSimulations: the AutoMod
Simulator, a spreadsheet-based job-shop simulator, and AutoSched AP, a rule-based
simulation package for finite-capacity scheduling in the semiconductor industry.
OptQuest
OptQuest was developed by Fred Glover, of the University of Colorado and co-founder
of Optimization Technologies, Inc. [Glover et al., 1999]. It is available for Arena, CSIM,
Micro Saint, QUEST, and Taylor ED.
SimRunner
SimRunner works with ProModel [Prive and Harrell, 1999]. Its optimization is
based on evolutionary strategy, a variant of the genetic-algorithm approach. A user
specifies an objective function (a response to minimize or maximize) and the input factors
to vary. In its Stage One Optimization Analysis, SimRunner performs a factorial design of
experiments to determine which input factors have a significant effect on the objective
function. Those factors that have no significant effect can be eliminated from the search
for an optimum. The Stage Two Simulation Optimization uses the evolutionary strategy
algorithm to conduct a multivariable optimization search.
you are looking at a virtual machine on your computer monitor, it looks very much like
that machine. If the machine is updated, it is only necessary to update the simulation
model. You don’t have to spend $1,000,000 for a modern centerless grinder, you just
simulate it in the computer to study its operation. Of course, if you want to make
bearings, you will need to buy the grinder.
Several simulation software firms have worked with EAI to accept SDX. This data
exchange format could emerge as a standard. Rather than redraw the conveyor system
with all of its parameters, it is just imported through SDX.
SDX is not the only data exchange format vying for attention. The World Wide Web
Consortium (W3C) is developing XML, the Extensible Markup Language. XML is a set of
rules, guidelines, or conventions for designing text formats for data such that the files are
easy to generate and read, they are unambiguous, and they are extensible. It aids in
putting information on the Web and in the retrieval of that information.
XML has a large and growing community. A project coordinated through the
Manufacturing Engineering Laboratory at the U.S. National Institute of Standards and
Technology is to “explore the use of XML as a standard for structured document
interchange on the Web, for exchanging complex data objects between tasks in a
distributed workflow application.”
So, groups are developing XML standards for all types of documentation.
Eventually, some group will want to develop XML standards for representation of
manufacturing and material handling systems.
The Internet can deliver modeling capability to the user. The interface can be at
the client (local computer), with the models run on a server. However, this server is not in
a local area network (LAN). It is at the software vendor’s or some other host’s site. The
server can always have the latest version of the model. The interface can be in Java,
which is independent of the computer and has applets for displaying results. Currently,
the speed of Java is not sufficient to serve as the simulation engine.
Another way that we can use the Internet is to have models run on many
computers. Thus, if the model is extremely large or if many replications are needed, the
computing power can be greatly extended via the Internet by distributing the
computational needs.
In the past, models were always constructed with a limited purpose in mind and
to answer specific questions. That would be called the old paradigm. But, it takes time to
build models in response to questions.
The new paradigm is to have models built at different levels of detail, but
without knowing all of the questions to be asked. This will speed up the
response time, since the model already exists. Under the new paradigm, the
models become an asset. The firm with more prebuilt models will be able to
answer questions faster – but may have more out-of-date models. Indeed,
model maintenance becomes a big issue.
Consider component libraries stored in a neutral file format. This requires that a
format can be designed and accepted by numerous software vendors, which is in itself a
challenge. System providers could be contributors of modules. For example, a robot
manufacturer could contribute a module for a specific type of robot. The result of having
module libraries is more rapid development of models. The modules could be prepared
so that all that they require is parameterization.
The technology has matured to the point where it can now be tried in other
domains. This technology is collectively called High Level Architecture (HLA). HLA has
been designed so that the minimum amount of information is exchanged between the
various simulations sufficient to effect clock synchronization and the exchange of data
objects.
4.9.8 Optimization
The main problem with optimization is that of computer time. It takes a lot of it
when running an optimization on a complex model. This is related to the number of
variables, the types of variables (discrete or continuous), the number of constraints on
the solution space, the type of constraints (simple bound, linear, or nonlinear), and the
type of objective function (linear, quadratic, or nonlinear). Assume that a simulation takes
10 minutes per replication and that 5 replications are conducted at each combination of
factor values. Further, assume that the optimization algorithm calls for 1000 scenarios
(not unusual). To obtain a near-optimal solution would take 50,000 minutes. That is
approximately one month of elapsed time. Decision makers usually do not have the luxury
of this amount of time.
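The elapsed-time arithmetic above is easy to verify; a minimal sketch (the variable names are illustrative, not from any simulation package):

```python
# Back-of-the-envelope check of the elapsed time quoted above.
minutes_per_replication = 10
replications_per_scenario = 5
scenarios = 1000

total_minutes = minutes_per_replication * replications_per_scenario * scenarios
total_days = total_minutes / (60 * 24)

print(total_minutes)          # 50000 minutes
print(round(total_days, 1))   # 34.7 days, roughly one month
```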
REFERENCES
BANKS, J., J. S. CARSON, AND J. N. SY [1995], Getting Started with GPSS/H, 2d ed.,
Wolverine Software Corporation, Annandale, VA.
BANKS, J. [1996], "Interpreting Software Checklists," OR/MS Today, June.
BANKS, J. [1998], "The Future of Simulation Software: A Panel Discussion," Proceedings
of the 1998 Winter Simulation Conference, D. J. Medeiros, E. Watson, J. Carson,
S. Manivannan, eds., Washington, DC, Dec. 13-16.
BANKS, J. [2000], "The Future of Simulation," Proceedings of the European Simulation
Multi-Conference, Ghent, Belgium, May 23-26, to appear.
Expectation:
E(X) = Σi xi p(xi) if X is discrete
E(X) = ∫ x f(x) dx if X is continuous
Variance:
V(X) = E[(X − E(X))²]
     = E(X²) − (E(X))²
Mode: In the discrete case, the mode is the value of the random variable that
occurs most frequently. In the continuous case, the mode is the value
at which the pdf is maximized.
Discrete Distributions:
1. Bernoulli distribution
2. Binomial distribution
3. Geometric distribution
4. Poisson distribution, for which the mean and variance are equal: α = E(X) = V(X).
Graded Questions
1. An industrial chemical that will retard the spread of fire in paint has been developed.
The local sales representative has determined, from past experience, that 48 % of
the sales calls will result in an order.
a) What is the probability that the first order will come on the fourth
sales call of the day ?
c) If four sales calls are made before lunch, what is the probability that
one or less results in an order ?
2. The Hawks are currently winning 0.55 of their games. There are 5
games in the next two weeks. What is the probability that they will win
more games than they lose ?
4. Arrivals at a bank teller’s cage are Poisson distributed at the rate of 1.2
per minute.
(a) What is the probability that this satellite is still “alive” after 5 years ?
(b) What is the probability that this satellite dies between 3 and 6 years
from the time it is placed in orbit ?
P (X 3) = 0.9 P (X 4)
11. The daily use of water, in thousands of liters, at the Hardscrabble Tool
and Die Works follows a gamma distribution with a shape parameter of
2 and a scale parameter of ¼. What is the probability that the demand
exceeds 4000 liters on any given day?
12. When Admiral Byrd went to the North Pole, he wore battery-powered
thermal wear. The batteries failed instantaneously rather than
gradually. The batteries had a life that was exponentially distributed
with a mean of 12 days. The trip took 30 days. If Admiral Byrd packed
three batteries, what is the probability that three batteries would be a
sufficient number to keep the Admiral warm?
13. Find the probability that 6 < X < 8 for each of the distributions:
Normal (10,4)
Triangular (4,10,16)
Uniform (4,16)
14. Professor Dipsy Doodle gives six problems on each exam. Each
problem requires an average of 30 minutes grading time for the entire
class of 15 students. The grading time for each problem is
exponentially distributed, and the problems are independent of each
other.
(a) What is the probability that the Professor will finish the grading in 2
½ hours or less?
15. Three shafts are made and assembled in a linkage. The length of
each shaft, in centimeters, is distributed as follows:
Shaft 2: N(40,0.05)
(b) What is the probability that the linkage will be longer than
150.2 centimeters?
(c) The tolerance limits for the assembly are (149.83, 150.21).
What proportion of assemblies are within the tolerance
limits? [Hint: If X₁, …, Xn are n independent normal random
variables, and if Xi has mean μi and variance σi², then the
sum
Y = X₁ + X₂ + … + Xn
is normal with mean Σ μi and variance Σ σi².]
The life of a battery follows a Weibull distribution with β = ¼ and α = ½ years.
(a) Find the fraction of batteries that are expected to fail prior to 1.5
years.
(b) What fraction of batteries are expected to last longer than the
mean life?
(c) What fraction of batteries are expected to fail between 1.5 and
2.5 years?
16. Explain in detail the three-step approach of Naylor and Finger in the
validation process.
AREA UNDER STANDARD NORMAL CURVE :
Z 0.0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
STATISTICAL MODELS IN SIMULATION
In real-world phenomena there are few situations where the actions of
the entities within the system under study can be completely predicted in
advance.
The numbers p(xi), i = 1, 2, …, must satisfy the following two conditions:
1) p(xi) ≥ 0 for all i
2) Σi p(xi) = 1
P(a ≤ X ≤ b) = ∫ from a to b of f(x) dx --------------------(1)
The function f(x) is called the probability density function (pdf) of the
random variable X. The pdf satisfies the following conditions:
(a) f(x) ≥ 0 for all x in Rx
(b) ∫ over Rx of f(x) dx = 1
(c) f(x) = 0 if x is not in Rx
For any specific value x₀,
P(X = x₀) = ∫ from x₀ to x₀ of f(x) dx = 0
[Figure: graph of a pdf f(x); the shaded area between x = a and x = b represents the probability that X lies in the interval [a, b].]
If X is discrete, then
F(x) = Σ over xi ≤ x of p(xi) --------------------(3)
If X is continuous, then
F(x) = ∫ from −∞ to x of f(t) dt --------------------(4)
Some properties of the cdf:
(i) F is nondecreasing
(ii) lim as x → ∞ of F(x) = 1
(iii) lim as x → −∞ of F(x) = 0 --------------------(6)
The mean or expected value:
E(X) = Σi xi p(xi) if X is discrete, and E(X) = ∫ x f(x) dx
if X is continuous --------------------(7)
The nth moment:
E(Xⁿ) = Σi xiⁿ p(xi) if X is discrete --------------------(8)
E(Xⁿ) = ∫ xⁿ f(x) dx if X is continuous --------------------(9)
The variance:
V(X) = E[(X − E(X))²] = E(X²) − (E(X))²
(5) The Mode: In the discrete case, the mode is the value of the random
variable that occurs most frequently. In the continuous case, the mode is
the value at which the pdf is maximized. The mode may not be unique. If
the modal value occurs at two values of the random variable, the
distribution is said to be bimodal.
DISCRETE DISTRIBUTIONS
Discrete random variables are used to describe random phenomena in
which only integer values can occur.
Therefore, the joint probability can be written as
p(x₁, x₂, …, xn) = p₁(x₁) · p₂(x₂) ⋯ pn(xn), where
pj(xj) = p,         xj = 1, j = 1, 2, …, n
       = 1 − p = q, xj = 0, j = 1, 2, …, n
       = 0,         otherwise --------------------(11)
For one trial, the distribution in equation (11) is called the Bernoulli distribution.
The mean and variance of Xj are defined as
E(Xj) = 0·q + 1·p = p
V(Xj) = [(0²·q) + (1²·p)] − p² = p(1 − p)
The number of successes in n Bernoulli trials,
X = X₁ + X₂ + … + Xn
has the binomial distribution:
p(x) = C(n, x) pˣ qⁿ⁻ˣ, x = 0, 1, …, n
     = 0, otherwise --------------------(12)
where q = 1 − p.
The event {X = x} occurs when there are x − 1 failures followed by a success.
The probability of failure is q = 1 − p and the probability of success is p, so
p(x) = P(FFF…FS) = q^(x−1) p, x = 1, 2, …
Mean = E(X) = 1/p --------------------(16)
Variance = V(X) = q/p² --------------------(17)
For the Poisson distribution with parameter α, the cdf is
F(x) = Σ from i = 0 to x of e^(−α) αⁱ / i! --------------------(19)
which is proportional to the length of the interval, for all x₁ and x₂ satisfying
a ≤ x₁ < x₂ ≤ b. The mean and variance can be defined as
E(X) = (a + b)/2 --------------------(22)
V(X) = (b − a)²/12 --------------------(23)
Thus, for the exponential distribution the mean and standard deviation are equal.
The cdf can be expressed as
F(x) = ∫ from 0 to x of λe^(−λt) dt = 1 − e^(−λx), x ≥ 0
     = 0, x < 0
A random variable X is gamma distributed with parameters β and θ if its pdf is given by
f(x) = (βθ/Γ(β)) (βθx)^(β−1) e^(−βθx), x > 0
     = 0, otherwise --------------------(30)
The parameter β is called the shape parameter and θ is called the scale
parameter. The cdf of X is
F(x) = 1 − ∫ from x to ∞ of (βθ/Γ(β)) (βθt)^(β−1) e^(−βθt) dt, x > 0
     = 0, x ≤ 0 --------------------(33)
If X is the sum of k independent, exponentially distributed random
variables, each with parameter kθ, then X has a gamma (Erlang) distribution with parameters β = k and θ:
X = X₁ + X₂ + … + Xk --------------------(34)
Each Xj has pdf
g(xj) = (kθ) e^(−kθ xj), xj > 0
      = 0, otherwise
and X has pdf
f(x) = (kθ/Γ(k)) (kθx)^(k−1) e^(−kθx), x > 0
     = 0, otherwise
The mean and variance of the gamma distribution can be used to obtain the mean and variance
of the Erlang distribution when β = k, an integer.
The expected values of the exponentially distributed Xj are each given by 1/(kθ).
Thus,
E(X) = 1/(kθ) + 1/(kθ) + … + 1/(kθ) = 1/θ
If the random variables Xj are independent, the variance of their sum is the sum of the
variances, or
V(X) = 1/(kθ)² + 1/(kθ)² + … + 1/(kθ)² = 1/(kθ²)
The cdf of the Erlang distribution is
F(x) = 1 − Σ from i = 0 to k−1 of e^(−kθx) (kθx)ⁱ / i!, x > 0 --------------------(35)
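The Erlang cdf in equation (35) is a finite sum and is straightforward to evaluate numerically. A minimal sketch (the function name is illustrative):

```python
import math

def erlang_cdf(x, k, theta):
    """Erlang cdf per equation (35):
    F(x) = 1 - sum_{i=0}^{k-1} e^{-k*theta*x} (k*theta*x)^i / i!, for x > 0."""
    if x <= 0:
        return 0.0
    s = sum(math.exp(-k * theta * x) * (k * theta * x) ** i / math.factorial(i)
            for i in range(k))
    return 1.0 - s

# With k = 1 the Erlang reduces to the exponential: F(1) = 1 - e^{-1}.
print(erlang_cdf(1.0, 1, 1.0))
```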
6
Queuing Models
Characteristics of Queuing systems :
The key elements of queuing systems are the customers and the servers. The
term customer can refer to anything that arrives at a facility and requires
service (e.g., people, machines, patients, e-mail). The term server
might refer to any resource which provides the requested service (e.g.,
receptionists, mechanics, medical personnel, the CPU in a computer).
1. The calling population :-
who finds the system full does not enter but returns immediately to the
calling population e.g. waiting room of a clinic.
Some systems may be considered as having unlimited capacity.
E.g. Ticket counters at a railway station.
The most important model for random arrivals is the Poisson arrival
process. For a Poisson arrival rate is customer per unit time. If An
represent interarrival time between customer n-1 and n, then for a Poisson
arrival process, An is exponentially distributed with mean 1/ time units.
The service times of successive arrivals are denoted S1, S2, S3,….
They may be constant or of random duration. If service times are random,
then { S1, S2,…} is usually a sequence of independent and identically
distributed random variables. The exponential. Erlang, Weibull etc. are
Queuing Notation :-
N : system capacity
Ek : Erlang distribution of order k
G : arbitrary or general
GI : general independent.
Let
Pn : Steady-state probability of having n customers in system.
Pn(t): Probability of n customers in system at time t.
λ : Arrival rate
λe : Effective arrival rate
μ : Service rate of one server
ρ : Server utilization
An : Interarrival time between customers n-1 and n
Sn : Service time of the nth arriving customer.
Wn : Total time spent in system by the nth arriving customer.
Wnq : Total time spent in the waiting line by customer n
L (t) : The number of customers in system at time t.
Lq(t) : The number of customers in queue at time t.
L : Long-run time-average number of customers in system.
Lq : Long-run time-average number of customers in queue.
(II) The term system usually refers to the waiting line plus the service
mechanism; whereas the term ‘queue’ refers to the waiting line
alone.
Let Ti denote the total time during [0, T] in which the system contained exactly i customers.
In general,
Σ from i = 0 to ∞ of Ti = T
The time-weighted-average number in the system is defined by
L̂ = (1/T) Σ from i = 0 to ∞ of i·Ti = Σ from i = 0 to ∞ of i·(Ti/T) --------------------(1)
Since Σ i·Ti = ∫ from 0 to T of L(t) dt, this can also be written as
L̂ = (1/T) ∫ from 0 to T of L(t) dt --------------------(2)
The expression in equations (1) and (2) are always equal for any queuing system,
regardless of the number of servers, the queue discipline, or any other special
circumstances. This average is also called time-integrated average.
L̂ = (1/T) ∫ from 0 to T of L(t) dt → L as T → ∞ --------------------(3)
Where Lq is the observed time-average number of customers waiting in line from time 0
to T and Lq is the long-run time-average number of customers waiting in line.
The average time spent in system per customer, observed over [0, T], is
Ŵ = (1/N) Σ from i = 1 to N of Wi --------------------(6)
where W1, W2, …, WN are the times each of the N customers spends in the system
during [0, T]. Let Wiq denote the total time customer i spends waiting in queue. Let Ŵq
be the observed average time spent in the queue (called delay) and wq
be the long-run average delay per customer. Then
Ŵq = (1/N) Σ from i = 1 to N of Wiq --------------------(7)
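Equations (1) and (7) are direct to compute from observed data. A minimal sketch, using assumed observations (the Ti values and delays below are hypothetical, chosen only for illustration):

```python
# Hypothetical observation over [0, T]:
# T_i[i] = total time the system contained exactly i customers.
T_i = {0: 3, 1: 12, 2: 4, 3: 1}
T = sum(T_i.values())  # T = 20

# Time-weighted-average number in system, equation (1):
L_hat = sum(i * t for i, t in T_i.items()) / T
print(L_hat)  # 1.15

# Average delay per customer, equation (7), for assumed queue delays:
delays = [2.0, 0.0, 1.5, 3.5, 0.5]
Wq_hat = sum(delays) / len(delays)
print(Wq_hat)  # 1.5
```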
Example :
Graded Questions
1. A two-runway (one runway for landing, one for taking off) airport is being designed for
propeller-driven aircraft. The time to land an airplane is known to be exponentially
distributed with a mean of 1½ minutes. If airplane arrivals are
assumed to occur at random, what arrival rate can be tolerated if the
average wait in the sky is not to exceed 3 minutes?
2. The port of Trop can service only one ship at a time. However, there is
mooring space for three more ships. Trop is a favorite port of call, but if
10. A small copy shop has a self-service copier. Currently there is room for
only 4 people to line up for the machine (including the person using the
machine); when there are more than 4 people, the additional
people must line up outside the shop, which the owners want to avoid as
much as possible. For that reason they are thinking about adding a second
self-service copier. Self-service customers have been observed to arrive
at a rate of 24 per hour, and they use the machine for 2 minutes, on
average. Assess the impact of adding another copier. Carefully state any
assumptions or approximations you make.
11. Customers arrive at a one man barber shop according to a Poisson process with a
mean interarrival time of 20min. Customers spend an average of 15 min in the
barber’s chair,
(a) what is the expected number of customers in the barber shop? In the
queue?
(b) What is the probability that a customer is in the barber shop? In the
queue?
(c) How much can a customer expect to spend in the barber shop?
(d) Management will put in another chair and hire another barber when a
customer's average waiting time in the shop exceeds 1.25 h. How
much must the average rate of arrivals increase to warrant a second
barber?
(f) Obtain the probability that waiting time in system is greater than 30
minutes
(g) Obtain the probability that there are more than three customers in the
system.
13. If the mean arrival rate is 24 per hour, find from the customer’s point of
view of the time spent in the system, whether 3 channels in parallel
with a mean service rate of 10 per hour is better or worse than a single
channel with mean service rate of 30 per hour.
Ans :- [single
channel is
better.]
7
Random Number Generation
E(R) = ∫ from 0 to 1 of x dx = [x²/2] evaluated from 0 to 1
     = 1/2
and the variance is given by
V(R) = ∫ from 0 to 1 of x² dx − [E(R)]²
     = [x³/3] evaluated from 0 to 1 − (1/2)²
     = 1/3 − 1/4 = 1/12
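The two results above, E(R) = 1/2 and V(R) = 1/12, can be checked empirically with any uniform generator. A small sketch using Python's standard random module (the seed is arbitrary, chosen only so the run is reproducible):

```python
import random

random.seed(12345)
N = 100_000
sample = [random.random() for _ in range(N)]

# Sample mean and (population-form) sample variance:
mean = sum(sample) / N
var = sum((x - mean) ** 2 for x in sample) / N

print(round(mean, 3))  # close to 0.5
print(round(var, 3))   # close to 1/12 ≈ 0.083
```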
Note :
1. If the interval (0,1) is divided into n classes, or subintervals of equal length, the
expected number of observations in each interval is N/n, where N is the total number of
observations.
3. The mean of the generated numbers may be too high or too low,
4. The variance of the generated nos. may be too high or too low,
(c) Several numbers above the mean followed by several nos. below
the mean.
(a) What is the probability that a subscriber will have to wait for his long
distance call during the peak hours of the day?
(b) If the subscribers will wait and are serviced in turn, what is the
expected waiting time?
Ans :- [ 0.48; 3.2
min]
(a) What is the probability that a customer can get directly into the
barber’s chair upon arrival?
(b) What is the expected number of customers waiting for a hair cut?
(c) How much time can a customer expect to spend in the barber
shop?
16. A barber shop has 2 barbers and 3 chairs for waiting customers.
Assume that customers arrive in Poisson fashion at a rate of 5 per
hour, and that each barber services customers according to an
exponential distribution with a mean of 15 min. Further, if a customer
arrives and there are no empty chairs in the shop, he will leave. Find
the steady-state probabilities. What is the probability that the shop is
empty? What is the expected number of customers in the shop?
17. Two attendants manage the tool crib while mechanics, assumed to be
from an infinite calling population, arrive for service. Assume Poisson
arrivals at a rate of 2 mechanics per minute and exponentially distributed
service times with mean 40 seconds.
(a) Obtain the number of mechanics in the system and waiting time in
the queue.
18. What are the characteristics of queueing system and how would you
determine the costs in queueing problems.
19. Name and explain some of the useful statistical models for queuing
systems
20. What are the long run measures of performance of queuing systems?
Explain briefly.
3. The routine should have a sufficiently long cycle. The cycle length or
period, represents the length of the random-number sequence before
previous numbers begin to repeat themselves in an earlier order. Thus,
if 10,000 events are to be generated, the period should be many times
that long.
A special case of cycling is degenerating. A routine degenerates when
the same random numbers appear repeatedly. Such an occurrence is
certainly unacceptable. This can happen rapidly with some methods.
4. The random numbers should be replicable. Given the starting point (or
conditions), it should be possible to generate the same set of random
numbers, completely independent of the system that is being
simulated. This is helpful for debugging purposes and is a means of
facilitating comparisons between systems. For the same reasons, it
should be possible to easily specify different starting points, widely
separated, within the sequence.
The linear congruential method produces a sequence of integers X1, X2, …
between 0 and m − 1 according to the recursive relationship
X(i+1) = (aXi + c) mod m, i = 0, 1, 2, …
The random numbers on (0, 1) are then obtained as Ri = Xi/m.
To avoid cycling (i.e. recurrence of the same sequence of generated numbers) the
generator should have the largest possible period. This can be achieved by proper
choice of a, c, m, and Xo.
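A minimal sketch of a linear congruential generator, using the parameters given in graded question 2 below (X0 = 27, a = 8, c = 47, m = 100); the function name is illustrative:

```python
def lcg(seed, a, c, m, n):
    """Generate n random numbers R_i = X_i / m using
    X_{i+1} = (a * X_i + c) mod m."""
    x = seed
    out = []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x / m)
    return out

# Parameters from graded question 2: X0 = 27, a = 8, c = 47, m = 100.
print(lcg(27, 8, 47, 100, 3))  # [0.63, 0.51, 0.55]
```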
Combined linear congruential generators combine k generators with moduli m1, m2, …, mk.
Let X(i,1), X(i,2), …, X(i,k) be the ith output from the k generators, and let
Xi = ( Σ from j = 1 to k of (−1)^(j−1) X(i,j) ) mod (m1 − 1)
Then
Ri = Xi/m1,          Xi > 0
   = (m1 − 1)/m1,    Xi = 0
The maximum possible period for such a generator is
P = (m1 − 1)(m2 − 1) … (mk − 1) / 2^(k−1)
4. Test for Random Numbers
This test compares the continuous cdf, F(x), of the uniform distribution
to the empirical cdf, SN(x), of the sample of N observations. We have
F(x) = x, 0 ≤ x ≤ 1
Let R1, R2, …, RN be the sample from the random number generator.
Then SN(x) is defined by
SN(x) = (number of R1, R2, …, RN which are ≤ x) / N
The test statistic is the largest absolute deviation between F(x) and SN(x):
D = max |F(x) − SN(x)|
Test procedure:
Step 1: Rank the data from the smallest to largest. Let R(i) denote the
ith smallest observation, so that
R(1) ≤ R(2) ≤ … ≤ R(N)
Step 2: Compute
D+ = max over 1 ≤ i ≤ N of { i/N − R(i) }
D− = max over 1 ≤ i ≤ N of { R(i) − (i − 1)/N }
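The two-step procedure above translates directly into code. A minimal sketch, applied to the sample from graded question 5 (0.54, 0.73, 0.98, 0.11, 0.68); the function name is illustrative:

```python
def ks_statistic(sample):
    """Compute D+, D-, and D = max(D+, D-) for H0: the sample is U(0,1)."""
    r = sorted(sample)                                   # Step 1: rank the data
    n = len(r)
    d_plus = max((i + 1) / n - r[i] for i in range(n))   # Step 2: D+
    d_minus = max(r[i] - i / n for i in range(n))        # Step 2: D-
    return d_plus, d_minus, max(d_plus, d_minus)

dp, dm, d = ks_statistic([0.54, 0.73, 0.98, 0.11, 0.68])
print(round(dp, 2), round(dm, 2), round(d, 2))  # 0.09 0.34 0.34
# D = 0.34 is below the critical value D_0.05 = 0.565 for N = 5,
# so the hypothesis of uniformity is not rejected.
```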
The chi-square test uses the statistic
χ0² = Σ from i = 1 to n of (Oi − Ei)² / Ei
where Oi is the observed number in the ith class and Ei is the expected number
in the ith class. For the uniform distribution,
Ei = N/n
and the degrees of freedom are n − 1.
For the runs-up-and-down test with N observations, the number of runs R is
approximately normal for large N, with
μR = E(R) = (2N − 1)/3
σR² = V(R) = (16N − 29)/90
The test statistic is Z = (R − μR)/σR ~ N(0,1).
Reject H0 if Z < −z(α/2) or Z > z(α/2), where α is the level of significance.
A '+' sign will be used to denote an observation above the mean, and a '−'
sign will denote an observation below the mean.
Let n1 = number of observations above the mean,
n2 = number of observations below the mean,
R = number of runs, and N = n1 + n2.
Then
μR = E(R) = 2n1n2/N + 1/2
σR² = V(R) = 2n1n2(2n1n2 − N) / (N²(N − 1))
and the test statistic is Z = (R − μR)/σR.
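The runs-above-and-below-the-mean test is easy to sketch in code. The data below are hypothetical, chosen only to illustrate the counting of runs and the two moments:

```python
def runs_above_below(data):
    """Runs test about the mean: count runs of '+' (above mean) and '-'
    (below mean), then standardize R with
    mu_R = 2*n1*n2/N + 1/2 and
    sigma_R^2 = 2*n1*n2*(2*n1*n2 - N) / (N^2 * (N - 1))."""
    mean = sum(data) / len(data)
    signs = ['+' if x > mean else '-' for x in data]
    n1 = signs.count('+')
    n2 = signs.count('-')
    N = n1 + n2
    # A new run starts at every sign change.
    R = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mu = 2 * n1 * n2 / N + 0.5
    var = 2 * n1 * n2 * (2 * n1 * n2 - N) / (N ** 2 * (N - 1))
    z = (R - mu) / var ** 0.5
    return R, mu, var, z

# Signs: +, +, -, +, -  ->  R = 4 runs, n1 = 3, n2 = 2.
print(runs_above_below([0.8, 0.9, 0.1, 0.7, 0.2]))
```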
Kolmogorov-Smirnov critical values for sample size N:
N      D0.10   D0.05   D0.01
1      0.950   0.975   0.995
2 0.776 0.842 0.929
3 0.642 0.708 0.828
4 0.564 0.624 0.733
5 0.51 0.565 0.669
6 0.47 0.521 0.618
7 0.438 0.486 0.577
8 0.411 0.457 0.543
9 0.388 0.432 0.514
10 0.368 0.41 0.49
11 0.352 0.391 0.468
Graded Questions
1. Use the chi-square test with α = 0.05 to test whether the data shown below are
uniformly distributed.
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.79 0.64 0.70 0.81 0.94 0.74 0.22 0.74
0.96 0.99 0.77 0.67 0.56 0.41 0.52 0.73 0.99 0.02
0.47 0.30 0.17 0.82 0.56 0.05 0.45 0.31 0.78 0.05
0.79 0.71 0.23 0.19 0.82 0.93 0.65 0.37 0.39 0.42
2 Use the linear congruential method to generate a sequence of three two-digit random
integers. Let Xo = 27, a = 8, c = 47 and m = 100.
5. The sequence of numbers 0.54, 0.73, 0.98, 0.11 and 0.68 has been generated. Use
the Kolmogorov-Smirnov test with α = 0.05 to determine if the hypothesis that the
numbers are uniformly distributed on the interval [0, 1] can be rejected.
8. Consider the first 30 two-digit values in Example 1. Test whether the
2nd, 9th, 16th, … numbers in the sequence are autocorrelated, where
α = 0.05.
15. Consider the data in Example 1. Can the hypothesis that the numbers
are independent be rejected on the basis of the length of the runs up
and down when α = 0.05?
16. State the various test for random numbers and explain briefly any two
of them.
17. Mention the properties of random numbers, and give the methods of
generating pseudo random numbers.
Summary:
This chapter described the generation of random numbers and the tests used
to check that the generated numbers are uniform and independent.
8
Random Variate Generation
Step 4 : Generate uniform random numbers R1, R2, … and compute the
desired random variates by
Xi = F-1 (Ri)
Step 3: Set F(X) = R and solve for X:
1 − e^(−λX) = R
e^(−λX) = 1 − R
X = −(1/λ) ln(1 − R)
Step 4: Generate uniform random numbers R1, R2, … and compute the
desired random variates by Xi = F⁻¹(Ri). Here,
Xi = −(1/λ) ln(1 − Ri), i = 1, 2, 3, …
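Steps 3 and 4 can be sketched in a few lines; the function name below is illustrative:

```python
import math
import random

def exp_variate(lam, r):
    """Inverse-transform step: solve F(X) = 1 - e^{-lam*X} = R for X."""
    return -math.log(1 - r) / lam

# Deterministic check: with lam = 1 and R = 1 - e^{-1}, X should be 1.
print(exp_variate(1.0, 1 - math.exp(-1)))

# Generating a stream of exponential variates from uniform random numbers:
random.seed(7)  # arbitrary seed for reproducibility
xs = [exp_variate(0.5, random.random()) for _ in range(5)]
```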
For the uniform distribution on [a, b]:
f(x) = 1/(b − a), a ≤ x ≤ b
     = 0, otherwise
F(x) = 0, x < a
     = (x − a)/(b − a), a ≤ x ≤ b
     = 1, x > b
For the Weibull distribution (with location parameter ν = 0):
f(x) = (β/α)(x/α)^(β−1) e^(−(x/α)^β), x ≥ 0
     = 0, otherwise
10. Data have been collected on service times at a drive-in-bank window at the Shady
Lane National Bank. These data are summarized into intervals as follows:
Interval Frequency
(Seconds)
15-30 10
30-45 20
45-60 25
60-90 35
90-120 30
120-180 20
180-300 10
Set up a table for generating service times by the table-look up method and generate
five values of service time using four-digit random numbers.
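The table-look-up method for this data amounts to building the cumulative probabilities and interpolating linearly within the interval that contains each random number. A minimal sketch (function name illustrative):

```python
import bisect

# Interval boundaries (seconds) and observed frequencies from the table above.
bounds = [15, 30, 45, 60, 90, 120, 180, 300]
freq = [10, 20, 25, 35, 30, 20, 10]
total = sum(freq)  # 150 observations

# Cumulative probability at the right edge of each interval.
cum = []
c = 0
for f in freq:
    c += f
    cum.append(c / total)

def service_time(r):
    """Map a uniform random number r in (0, 1] to a service time by linear
    interpolation within the interval whose cumulative range contains r."""
    i = bisect.bisect_left(cum, r)
    lo_p = cum[i - 1] if i > 0 else 0.0
    lo_x, hi_x = bounds[i], bounds[i + 1]
    return lo_x + (r - lo_p) / (cum[i] - lo_p) * (hi_x - lo_x)

print(round(service_time(0.5), 2))  # 77.14 seconds
```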
12. The weekly demand, X for a slow moving item has been found to be
well approximated by a geometric distribution on the range { 0,1,2,…}
13. Suppose that the demand has been found to have a Poisson
distribution with mean 0.5 items per week. Generate 3 values of X,
Demand per week, using random numbers from table. Use four digit
random numbers.
14. Lead times have been found to be exponentially distributed with mean
3.7 days. Generate five random lead times from this distribution.
16. The beta(4,3) distribution has pdf f(x) = 60x³(1 − x)², 0 ≤ x ≤ 1
= 0, otherwise
Summary:
The basic principles of random variate generation using the inverse
transform technique, the convolution method, and acceptance-rejection
techniques have been introduced.
9
Input Modeling
Almost all real-world systems contain one or more sources of randomness. However,
determining appropriate distributions for input data is a major task from the standpoint of
time and resource requirements. Faulty models of the inputs will lead to outputs whose
interpretation may give rise to misleading conclusions.
1. Collect data from real system of interest. This requires substantial time and
resources. In some situations it may not be possible to collect data. When data
are not available, expert opinion and knowledge of the process must be used to
make educated guesses.
2. Identify a probability distribution to represent the input process. Here, a
frequency distribution or histogram is developed. Based on the frequency
distribution and structural knowledge of the process, a family of distributions is
chosen.
3. Choose parameters that determine a specific instance of the distribution family.
When data are available, these parameters may be estimated from the data.
4. Evaluate the chosen distribution and the associated parameters for goodness-of-
fit. If the chosen distribution is not a good approximation of the data, then the
analyst chooses a different family of distributions and repeats the procedure. If
several iterations of this procedure fail to yield a fit between an assumed
distributional form and the collected data, the empirical form of the distribution
may be used.
Data Collection:
Data collection is one of the biggest tasks in solving a real problem. It is one of the most
important and difficult problems in simulation. Even when data are available, they may
not be recorded in a form that is directly useful for simulation input modeling. Even if the
model structure is valid, if the input data are inaccurately collected, inappropriately
analysed or not representative of the environment, the simulation output data will be
misleading when used for policy making.
There are several methods for selecting families of input distributions when data are
available. The specific distributions within a family are specified by estimating its
parameters.
(1) Histograms:
1. Divide the range of the data into intervals (intervals are usually of equal width;
however, unequal widths may be used if the heights of the frequencies are adjusted)
2. Label the horizontal axis to conform to the intervals selected.
3. Determine the frequency of occurrences within each interval.
4. Label the vertical axis so that the total occurrences can be plotted for each interval.
5. Plot the frequencies on the vertical axis.
The number of class intervals depends on the number of observations and the
amount of scatter or dispersion in the data. Choosing the number of class intervals
approximately equal to the square root of the sample size often works well in practice. If
the intervals are too wide, the histogram will be coarse, or blocky, and its shape and
other details will not show well. If the intervals are too narrow, the histogram will be
ragged and will not smooth the data.
The histogram for continuous data corresponds to the probability density function of a
theoretical distribution, if continuous, a line through the center point of each class interval
frequency should result in a shape like that of a pdf.
A family of distributions is selected on the basis of what might arise in the context being
investigated, along with the shape of the histogram.
Binomial: Models the number of successes in n trials, when the trials are independent
with common success probability, p.
Poisson: Models the number of independent events that occur in a fixed amount of time
or space.
Normal: Models the distribution of a process that can be thought of as the sum of a
number of component processes.
Exponential: Models the time between independent events, or a process time which is
memoryless.
Gamma: An extremely flexible distribution used to model nonnegative random variables.
Graded Questions
2. The highway between Atlanta, Georgia, and Athens, Georgia, has a high
incidence of accidents along its 100 km. Public safety officers say that the
occurrence of accidents along the highway is randomly (uniformly) distributed, but
the news media say otherwise. [N-04] The Georgia Department of Public Safety
published records for the month of September.
Number of accidents    Frequency
0 35
1 40
2 13
3 6
4 4
5 1
6 1
(a) Apply the chi-square test to these data to test the hypothesis that the underlying
distribution is Poisson. Use a level of significance of α = 0.05.
(b) Apply the chi-square test to these data to test the hypothesis that the distribution is
Poisson with mean 1.0. Again let α = 0.05.
4. The time required for 50 different employees to compute and record the number of
hours worked during the week was measured, with the following results in minutes:
1 1.88 26 0.04
2 0.54 27 1.49
3 1.90 28 0.66
4 0.15 29 2.03
5 0.02 30 1.00
6 2.81 31 0.39
7 1.50 32 0.34
8 0.53 33 0.01
9 2.62 34 0.10
10 2.67 35 1.10
11 3.53 36 0.24
12 0.53 37 0.26
13 1.80 38 0.45
14 0.79 39 0.17
15 0.21 40 4.29
16 0.80 41 0.80
17 0.26 42 5.50
18 0.63 43 4.91
19 0.36 44 0.35
20 2.03 45 0.36
21 1.42 46 0.90
22 1.28 47 1.03
23 0.82 48 1.73
24 2.16 49 0.38
25 0.05 50 0.48
Use the chi-square test to test the hypothesis that these service times are exponentially
distributed. Let the number of class intervals be k = 6. Use a level of significance of α =
0.05.
5. 200 electric light bulbs were tested, and the average lifetime of the bulbs was found to
be 25 hrs. Using the summary given below, test the hypothesis that the lifetime is
exponentially distributed.
7. The following data were available for the past 10 years on demand and lead-time.
[N-04]
Lead time: 6.5 4.3 6.9 6.9 6.9 6.9 5.8 7.3 4.5 6.3
Demand: 103 83 116 97 112 104 106 109 92 96
8. Discuss the steps involved in the development of a model of input data. [M-05]
Parameter Estimation:
If the observations in a sample of size n are X1, X2, …, Xn, the sample mean is defined
by
X̄ = ( Σ from i = 1 to n of Xi ) / n
and the sample variance S² is defined by
S² = ( Σ from i = 1 to n of Xi² − nX̄² ) / (n − 1)
If the data are discrete and grouped in a frequency distribution, then the sample mean can be
computed by
X̄ = ( Σ from j = 1 to k of fj Xj ) / n
and the sample variance by
S² = ( Σ from j = 1 to k of fj Xj² − nX̄² ) / (n − 1)
where k is the number of distinct values of X and
fj is the observed frequency of the value Xj of X.
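Both the raw and grouped forms of these estimators are direct to compute. A minimal sketch (function names illustrative):

```python
def sample_stats(data):
    """Sample mean and sample variance via the computational form
    S^2 = (sum Xi^2 - n*Xbar^2) / (n - 1)."""
    n = len(data)
    xbar = sum(data) / n
    s2 = (sum(x * x for x in data) - n * xbar * xbar) / (n - 1)
    return xbar, s2

def grouped_stats(values, freqs):
    """Same statistics when the data are grouped in a frequency distribution
    (values Xj with observed frequencies fj)."""
    n = sum(freqs)
    xbar = sum(f * x for x, f in zip(values, freqs)) / n
    s2 = (sum(f * x * x for x, f in zip(values, freqs)) - n * xbar * xbar) / (n - 1)
    return xbar, s2

print(sample_stats([1, 2, 3, 4]))           # mean 2.5, variance 5/3
print(grouped_stats([1, 2, 3], [2, 1, 1]))  # same as sample_stats([1, 1, 2, 3])
```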
[Table: suggested estimators for selected distributions. For the lognormal distribution, the estimates of μ and σ² are X̄ and S² computed after taking the logarithm of the data; for the Weibull distribution with ν = 0, the estimates are obtained numerically.]
For the goodness-of-fit tests, the Kolmogorov-Smirnov test and the chi-square test
were introduced. These two tests are applied in this section to hypotheses about
distributional forms of input data.
One procedure for testing the hypothesis that a random sample of size n of the random
variable X follows a specific distributional form is the chi-square goodness-of-fit test.
The test is valid for large sample sizes, for both discrete and continuous distributional
assumptions.
The test statistic is

    X0^2 = sum_{i=1}^{k} (Oi - Ei)^2 / Ei
where Oi is the observed frequency in the ith class interval and Ei is the expected frequency
in that interval, given by Ei = n * pi, where pi is the theoretical, hypothesized probability
associated with the ith class interval.
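A minimal sketch of computing the test statistic; the observed counts and hypothesized class probabilities below are illustrative, not data from the text:

```python
# Sketch of the chi-square goodness-of-fit statistic
#   X0^2 = sum (O_i - E_i)^2 / E_i
# Illustrative observed counts and hypothesized class probabilities.
observed = [30, 26, 22, 22]            # O_i, one per class interval
probs    = [0.25, 0.25, 0.25, 0.25]    # hypothesized p_i

n = sum(observed)
expected = [n * p for p in probs]      # E_i = n * p_i
chi0 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# chi0 is compared with the chi-square critical value having
# k - s - 1 degrees of freedom (s = number of parameters estimated
# from the data); reject the hypothesis if chi0 exceeds it.
```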
The chi-square goodness-of-fit test can accommodate the estimation of parameters
from the data, with a resulting decrease in the degrees of freedom.
The k-s test is particularly useful when sample sizes are small and when no parameters
have been estimated from the data.
The K-S test is also applicable to the exponential distribution.
The covariance and correlation are measures of the linear dependence between X1 and X2.
A model of linear dependence is

    X1 = μ1 + β(X2 - μ2) + ε

where ε is a random variable with mean 0 that is independent of X2.
The covariance between X1 and X2 is defined to be

    cov(X1, X2) = E[(X1 - μ1)(X2 - μ2)] = E(X1 X2) - μ1 μ2

The covariance can take any value between -∞ and ∞. The correlation standardizes the
covariance to be between -1 and 1:

    ρ = corr(X1, X2) = cov(X1, X2) / (σ1 σ2)
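The sample analogues of these definitions can be sketched as follows; the two data sequences are illustrative:

```python
# Sketch: sample covariance and correlation between two sequences,
# following the definitions above. Illustrative data.
x1 = [2.0, 4.0, 6.0, 8.0]
x2 = [1.0, 3.0, 2.0, 5.0]

n = len(x1)
m1 = sum(x1) / n
m2 = sum(x2) / n
# Sample estimate of cov(X1, X2) = E[(X1 - mu1)(X2 - mu2)]
cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
var1 = sum((a - m1) ** 2 for a in x1) / (n - 1)
var2 = sum((b - m2) ** 2 for b in x2) / (n - 1)
# Correlation standardizes the covariance to lie in [-1, 1]
rho = cov / (var1 ** 0.5 * var2 ** 0.5)
```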
Summary –
Input data collection and analysis require major time and resources commitments in a
discrete event simulation project.
In this chapter, four steps in the development of models of input data were discussed.
10
Verification and Validation of Simulation Model
One of the most difficult problems facing a simulation analyst is trying to determine
whether a simulation model is an accurate representation of the actual system being studied,
i.e. whether the model is valid.
The first step in model building consists of observing the real system and the interactions
among its various components and collecting data. Observation alone rarely gives
sufficient understanding of system behavior. Persons familiar with the system should be
interviewed to get information about the system. Model building is a continuous process.
The second step in model building is the construction of a conceptual model. This
includes assumptions on the components and the structure of the system and the values
of input parameters.
The third step is the translation of the operational model into a computer recognizable
form that is a computerized model.
The model builder will repeat these three steps many times, till an acceptable model is
achieved.
The purpose of model verification is to assure that the conceptual model is reflected
accurately in the computerized representation.
3. Closely examine the model output for reasonableness under a variety of settings
of the input parameters. The computerized model should print a variety of output
statistics along with the input parameters.
Validation is the overall process of comparing the model and its behavior to the real
system and its behavior. Calibration is the iterative process of comparing the model to the
real system, making adjustments or changes to the model, comparing the revised model to
reality, and so on.
The comparison of the model to reality is carried out by subjective and objective tests. A
subjective test involves talking to people, who are knowledgeable about the system,
making models and forming the judgment. Objective tests involve one or more statistical
tests to compare some model output with the assumptions in the model.
Naylor and Finger formulated a three step approach to the validation process.
Face Validity
The first goal of the simulation modeler is to construct a model that appears reasonable
on its face to model users and others who are knowledgeable about the real system being
simulated. The potential users of a model should be involved in model construction from
its conceptualization to its implementation, to ensure that a high degree of realism is built
into the model through reasonable assumptions regarding system structure and reliable
data. Potential users and knowledgeable persons can also evaluate model output for
reasonableness and can aid in identifying model deficiencies. Thus the users can be
involved in the calibration process as the model is iteratively improved, based on the
insights gained from the initial model deficiencies. Another advantage of user
involvement is the increase in the model's perceived validity, or credibility, without which
a manager would not be willing to trust simulation results as a basis for decision making.
Sensitivity analysis can also be used to check a model’s face validity. The model user is
asked if the model behaves in the expected way when one or more input variables are
changed.
Model assumptions fall into two general classes: structural assumptions and data
assumptions. Structural assumptions involve questions of how the system operates and
usually involve simplifications and abstractions of reality.
Data assumptions should be based on the collection of reliable data and correct
statistical analysis of the data.
When combining two or more data sets collected at different times, data reliability can be
further enhanced by objective statistical tests for homogeneity of data.
Additional tests may be required to test for correlation in the data. As soon as the analyst
is assured of dealing with a random sample (i.e. correlation is not present), the statistical
analysis can begin.
The procedures for analyzing input data from a random sample were discussed in detail
in chapter 9. Whether by hand, or using computer software for the purpose, the analysis
consists of three steps.
3. Validating the assumed statistical model by a goodness-of-fit test, such as the chi-
square or Kolmogorov-Smirnov test, and by graphical methods.
The use of goodness-of-fit tests is an important part of the validation of the model
assumptions.
The ultimate test of a model, and in fact the only objective test of the model as a whole,
is its ability to predict the future behavior of the real system when the model input data
match the real inputs and when a policy implemented in the model is implemented at
some point in the system. Furthermore, if the level of some input variable (e.g. the
arrival rate of customers to a service facility) were to increase or decrease, the model
should accurately predict what would happen in the real system under similar
circumstances. In other words, the structure of the model should be accurate enough for
the model to make good predictions, not just for one input data set, but for the range of
input data sets which are of interest.
An alternative to generating input data is to use the actual historical record, {An, Sn,
n = 1, 2, …}, to drive the simulation model and then to compare model output to system data.
When using this technique the modeler hopes that the simulation will duplicate as closely
as possible the important events that occurred in the real system.
To conduct a validation test using historical input data, it is important that all the input
data (An, Sn,) and all the system response data, such as average delay (Z2), be
collected during the same time period. Otherwise, the comparison of model responses to
system responses, such as the comparison of average delay in the model (Y2) to that in
the system (Z2), could be misleading. The responses (Y2 and Z2) depend on the inputs
(An and Sn) as well as on the structure of the system, or model. Implementation of this
technique could be difficult for a large system because of the need for simultaneous data
collection of all input variables and those response variables of primary interest. In some
systems, electronic counters and devices are used to ease the data-collection task by
automatically recording certain types of data.
Develop and conduct a statistical test to determine whether model output is consistent with
system behavior. Use a level of significance of α = 0.05.
2. System data for the job of exercise 1 revealed that the average time spent by a
job in the shop was approximately 4 working days. The model made the following
predictions on seven independent replications, for average time spent in the
shop.
Is model output consistent with system behavior? Conduct a statistical test using a level
of significance α = 0.01. If it is important to detect a difference of 0.5 day, what sample size
is needed?
3. For the job of exercise 1, four sets of input data were collected over four different
10 day periods, together with the average number of jobs in the shop (Zi) for
each period. The input data were used to drive the simulation model for four runs
of 10 days each and model predictions of average number of jobs in the shop
(Yi) were collected, with these results:
I 1 2 3 4
Zi 21.7 19.2 22.8 19.4
Yi 24.6 21.1 19.7 24.9
Conduct a statistical test to check the consistency of system output and model output.
Use a level of significance of α = 0.05.
4. Explain in detail the three-step approach of Naylor and Finger in the validation
process.
Summary –
Validation of simulation models is of great importance. Decisions are made on the basis of
simulation results; thus, the accuracy of these results should be subject to question and
investigation.
11
Output Analysis for a Single Model
The specification of initial conditions may influence the output data; that is, the initial
conditions may influence the values Y1, Y2, …, Yn.
For the purpose of statistical analysis, the effect of the initial conditions is that the output
observations may not be identically distributed, and the initial observations may not be
representative of the steady-state behavior of the system. An incorrect choice of initial
conditions may lead to an improper estimate of the steady-state performance of the
simulation model.
A terminating simulation is one that runs for some duration of time TE, where E is a
specified event or set of events which stops the simulation. Such a simulated system
"opens" at time zero under specified initial conditions and "closes" at the stopping time
TE. Since the initial conditions for a terminating simulation generally affect the desired
measures of performance, these conditions should be representative of those for the
actual system, e.g. simulation of a bank from 9:00 a.m. (time 0) to 3:00 p.m. (time
TE = 360 minutes).
A nonterminating simulation is one for which there is no natural event E to specify the
length of a run, e.g. continuous production systems, telephone systems, the Internet, the OPD
of a hospital, or a new system being designed.
A simulation of a nonterminating system starts at simulation time zero under initial
conditions defined by the analyst and runs for some analyst-specified period of time TE.
The analyst wants to study steady-state, or long-run, properties of the system, that is,
properties which are not influenced by the initial conditions of the model at time zero. A
steady-state simulation is a simulation whose objective is to study long-run, or steady-
state, behavior of a nonterminating system.
Consider one run of a simulation model over a period of time [0, TE]. Since some of the
model inputs are random variables, the model output variables are also random
variables. This indicates the stochastic (probabilistic) nature of the output variables.
Consider the estimation of a performance parameter θ (or φ) of a simulated system. Let
Y1, Y2, …, Yn be output data of the kind called discrete-time data; discrete-time data are
used to estimate θ. Let {Y(t), 0 ≤ t ≤ TE} be output data of the kind called continuous-time
data; continuous-time data are used to estimate φ. For example, Yi may be the total time
spent in the system by customer i, and Y(t) may be the number of customers in the system at
time t.
Point Estimation
The point estimator of θ based on Y1, Y2, …, Yn is defined by

    θ-hat = (1/n) * sum_{i=1}^{n} Yi

The point estimator θ-hat is said to be unbiased for θ if E(θ-hat) = θ; E(θ-hat) - θ is called
the bias in θ-hat. The point estimator of φ based on {Y(t), 0 ≤ t ≤ TE} is defined by

    φ-hat = (1/TE) * integral from 0 to TE of Y(t) dt
Interval estimation
Let θ-hat be a point estimator based on Y1, Y2, …, Yn. Let σ^2(θ-hat) = var(θ-hat), and let
σ-hat^2(θ-hat) be an estimator of σ^2(θ-hat). If σ-hat^2(θ-hat) is approximately unbiased,
then the statistic

    t = (θ-hat - θ) / σ-hat(θ-hat)

is approximately t-distributed with some number of degrees of freedom, say f. An approximate
100(1 - α)% confidence interval for θ is given by

    θ-hat ± t_{α/2, f} * σ-hat(θ-hat)

where t_{α/2, f} is such that P(t > t_{α/2, f}) = α/2.
The Simulation is repeated R times, each run using a different random-number stream
and independently chosen initial conditions.
The sample mean for replication r is given by

    θ-hat_r = (1/n_r) * sum_{i=1}^{n_r} Y_ri ,   r = 1, 2, …, R

The R sample means θ-hat_1, …, θ-hat_R are statistically independent and identically
distributed. The overall point estimator is given by

    θ-hat = (1/R) * sum_{r=1}^{R} θ-hat_r

and its variance is estimated by

    σ-hat^2(θ-hat) = [1 / (R(R - 1))] * sum_{r=1}^{R} (θ-hat_r - θ-hat)^2

For continuous-time data, the estimate from replication r is

    φ-hat_r = (1/TE) * integral from 0 to TE of Y_r(t) dt

and the overall point estimator and variance estimator are formed from φ-hat_1, …, φ-hat_R
in the same way.
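The method of independent replications above can be sketched numerically; the replication means and the t-table value are illustrative:

```python
# Sketch: point and interval estimation of theta from R independent
# replications, following the formulas above. Illustrative replication means.
from statistics import mean, stdev

rep_means = [4.2, 3.8, 4.5, 4.0, 4.1]   # theta-hat_r, r = 1..R
R = len(rep_means)
theta_hat = mean(rep_means)             # overall point estimator
# s.e. = sqrt( [1/(R(R-1))] * sum (theta-hat_r - theta-hat)^2 )
se = stdev(rep_means) / R ** 0.5
t_crit = 2.776                          # t_{0.025, R-1} for R = 5 (table value)
ci = (theta_hat - t_crit * se, theta_hat + t_crit * se)  # approx. 95% CI
```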
Graded Questions
2. Discuss the output analysis for terminating simulations and confidence interval
estimation for a fixed number of replications.
3. Discuss output analysis for steady state simulations.
If the output data Y1, Y2, …, Yn are autocorrelated, the variance of the sample mean

    Y-bar = (1/n) * sum_{i=1}^{n} Yi

is given by

    σ^2(Y-bar) = var(Y-bar) = (1/n^2) * sum_{i=1}^{n} sum_{j=1}^{n} cov(Yi, Yj)

where cov(Yi, Yi) = var(Yi).
With d initial observations deleted, the sample mean of replication r is

    Y-bar_r(n, d) = [1/(n - d)] * sum_{j=d+1}^{n} Y_rj

with expectation θ_{n,d} = E[Y-bar_r(n, d)], and the overall mean is

    Y-bar..(n, d) = (1/R) * sum_{r=1}^{R} Y-bar_r(n, d)
Sample variance:

    S^2 = [1/(R - 1)] * sum_{r=1}^{R} (Y-bar_r - Y-bar..)^2
        = [1/(R - 1)] * ( sum_{r=1}^{R} Y-bar_r^2 - R * Y-bar..^2 )

and the standard error of Y-bar.. is

    s.e.(Y-bar..) = S / sqrt(R)
In the batch-means method, the jth batch mean is

    Y-bar_j = (1/m) * sum_{i=(j-1)m+1}^{jm} Y_{i+d}

and the variance of the overall sample mean Y-bar is estimated by

    S^2 / K = [1 / (K(K - 1))] * sum_{j=1}^{K} (Y-bar_j - Y-bar)^2
            = ( sum_{j=1}^{K} Y-bar_j^2 - K * Y-bar^2 ) / (K(K - 1))
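A sketch of the batch-means computation, with synthetic data standing in for one long simulation run (the choices of d, K, and m are illustrative):

```python
# Sketch of the batch-means method: after deleting d warm-up observations,
# split one long run into K batches of size m and treat the batch means as
# (approximately) independent observations. Synthetic, illustrative data.
import random

random.seed(1)
d, K, m = 100, 10, 50
y = [random.gauss(5.0, 1.0) for _ in range(d + K * m)]   # one long run
kept = y[d:]                                             # discard warm-up
batch_means = [sum(kept[j * m:(j + 1) * m]) / m for j in range(K)]
grand = sum(batch_means) / K                             # overall sample mean
# Variance estimate of the grand mean: S^2 / K per the formula above
s2_over_k = sum((b - grand) ** 2 for b in batch_means) / (K * (K - 1))
```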
The R quantile estimates Q1, …, QR are independent and identically distributed. Their
average is

    Q-bar = (1/R) * sum_{i=1}^{R} Qi

Confidence interval:

    Q-bar ± t_{α/2, R-1} * S / sqrt(R)

where S^2 is the usual sample variance of Q1, …, QR.
Summary:
129
This chapter emphasized the idea that a stochastic discrete-event simulation is a
statistical experiment.
The main point is that simulation output data contain some amount of random variability,
and without some assessment of its magnitude, the point estimate cannot be used with any
degree of reliability.
12
Comparison and Evaluation of Alternative System Designs
When comparing two systems, the simulation analyst must decide on a run length TE(i)
for each model (i = 1, 2) and a number of replications Ri to be made of each model. From
replication r of system i, the simulation analyst obtains an estimate Yri of the mean
performance measure θi. Assuming that the estimators Yri are (at least approximately)
unbiased, it follows that θi = E(Yri) for r = 1, …, Ri.
The sample mean performance measure for system i over all replications is

    Y-bar.i = (1/Ri) * sum_{r=1}^{Ri} Yri
where ν is the degrees of freedom associated with the variance estimator, t_{α/2, ν} is the
100(1 - α/2) percentage point of a t-distribution with ν degrees of freedom, and s.e.(·)
represents the standard error of the specified point estimator.
If the confidence interval for θ1 - θ2 is totally to the left (right) of zero, then we may
conclude that θ1 < θ2 (θ1 > θ2).
If the confidence interval for θ1 - θ2 contains zero, there is no statistical evidence that one
system design is better than the other.
Independent sampling means that different and independent random-number streams will
be used to simulate the two systems. This implies that all the observations of simulated
system 1, namely {Yr1, r = 1, …, R1}, are statistically independent of all the observations of
simulated system 2, namely {Yr2, r = 1, …, R2}. By the independence of the replications, the
variance of the sample mean Y-bar.i is given by

    var(Y-bar.i) = var(Yri) / Ri = σi^2 / Ri ,   i = 1, 2

and, for independent sampling,

    var(Y-bar.1 - Y-bar.2) = σ1^2 / R1 + σ2^2 / R2
In some cases it is reasonable to assume that the two variances are equal (but unknown
in value), that is, σ1^2 = σ2^2.
In a steady-state simulation, the variance σi^2 decreases as the run length TE(i)
increases; therefore, it may be possible to adjust the two run lengths TE(1) and TE(2) to
achieve at least approximate equality of σ1^2 and σ2^2.
The sample variance for system i is

    Si^2 = [1/(Ri - 1)] * sum_{r=1}^{Ri} (Yri - Y-bar.i)^2
         = [1/(Ri - 1)] * ( sum_{r=1}^{Ri} Yri^2 - Ri * Y-bar.i^2 )

The pooled variance estimator is

    Sp^2 = [ (R1 - 1) * S1^2 + (R2 - 1) * S2^2 ] / (R1 + R2 - 2)

which has ν = R1 + R2 - 2 degrees of freedom, and

    s.e.(Y-bar.1 - Y-bar.2) = Sp * sqrt(1/R1 + 1/R2)
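The pooled-variance comparison can be sketched as follows; the replication data and the t-table value t_{0.025, 8} = 2.306 are illustrative:

```python
# Sketch: comparing two system designs from independent replications with a
# pooled variance estimate (assumes sigma1^2 = sigma2^2). Illustrative data.
from statistics import mean, variance

y1 = [4.1, 4.5, 3.9, 4.3, 4.2]   # replications of system 1
y2 = [5.0, 4.6, 5.2, 4.8, 4.9]   # replications of system 2
R1, R2 = len(y1), len(y2)
# Pooled variance: Sp^2 = [(R1-1)S1^2 + (R2-1)S2^2] / (R1+R2-2)
sp2 = ((R1 - 1) * variance(y1) + (R2 - 1) * variance(y2)) / (R1 + R2 - 2)
se = sp2 ** 0.5 * (1 / R1 + 1 / R2) ** 0.5   # s.e. of the difference
diff = mean(y1) - mean(y2)
t_crit = 2.306                    # t_{0.025, R1+R2-2} (table value)
ci = (diff - t_crit * se, diff + t_crit * se)
# If the interval lies entirely to the left of zero, conclude theta1 < theta2.
```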
Correlated sampling means that, for each replication, the same random numbers are
used to simulate both systems. Therefore, R1 and R2 must be equal, say R1 R2 R .
Thus for each replication r, the two estimates Yr1 and Yr2 are no longer independent but
rather are correlated.
Let Dr = Yr1 - Yr2 and

    D-bar = (1/R) * sum_{r=1}^{R} Dr

The sample variance of the differences is

    SD^2 = [1/(R - 1)] * sum_{r=1}^{R} (Dr - D-bar)^2
         = [1/(R - 1)] * ( sum_{r=1}^{R} Dr^2 - R * D-bar^2 )

and the standard error is

    s.e.(D-bar) = s.e.(Y-bar.1 - Y-bar.2) = SD / sqrt(R)
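The correlated-sampling analysis above can be sketched with the paired differences; the paired observations and the t-table value are illustrative:

```python
# Sketch of correlated sampling (common random numbers): with R paired
# replications, work with the differences D_r = Y_r1 - Y_r2. Illustrative data.
from statistics import mean, stdev

yr1 = [4.2, 3.9, 4.4, 4.0, 4.3, 4.1]   # system 1, replication r
yr2 = [4.6, 4.2, 4.9, 4.3, 4.8, 4.4]   # system 2, same random numbers
D = [a - b for a, b in zip(yr1, yr2)]
R = len(D)
d_bar = mean(D)
se_d = stdev(D) / R ** 0.5              # s.e.(D-bar) = S_D / sqrt(R)
t_crit = 2.571                          # t_{0.025, R-1} for R = 6 (table value)
ci = (d_bar - t_crit * se_d, d_bar + t_crit * se_d)
# Correlation between Yr1 and Yr2 typically shrinks S_D, giving a
# tighter interval than independent sampling would.
```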
Suppose that a simulation analyst desires to compare K alternative system designs. The
comparison will be made on the basis of some specified performance measure θi of
system i, for i = 1, 2, …, K. Many different statistical procedures have been developed which
can be used to analyze simulation data and draw statistically sound inferences
concerning the parameters θi. These procedures can be classified as fixed-
sample-size procedures or sequential-sampling (or multistage) procedures. In the first
type, a predetermined sample size is used to draw inferences via hypothesis tests or
confidence intervals.
A sequential sampling scheme is one in which more and more data are collected until an
estimator with a prespecified precision is achieved, or until one of several alternative
hypotheses is selected, with the probability of correct selection being larger than a
prespecified value. A two-stage (or multistage) procedure is one in which an initial
sample is used to estimate how many additional observations are needed to draw
conclusions with a specified precision.
Suppose that a total of C confidence intervals are computed, and that the jth interval has
confidence coefficient 1 - αj. Let Sj be the statement that the jth confidence interval
contains the parameter (or difference of two parameters) being estimated. This statement
may be true or false for a given set of data, but the procedure leading to the interval is
designed so that statement Sj will be true with probability 1 - αj. When it is desired to
make several statements true simultaneously, the Bonferroni inequality states that

    P(all statements Sj are true, j = 1, …, C) >= 1 - sum_{j=1}^{C} αj = 1 - αE

where αE = sum_{j=1}^{C} αj is called the overall error probability. Equivalently,

    P(one or more of the confidence intervals does not contain the parameter being
    estimated) <= αE
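A small numeric sketch of the Bonferroni allocation; the values of C and αE are illustrative:

```python
# Sketch of the Bonferroni inequality in use: to make C simultaneous
# confidence statements with overall error probability at most alpha_E,
# assign each individual interval an error probability alpha_E / C.
C = 5              # number of simultaneous confidence intervals
alpha_E = 0.10     # desired overall error probability
alpha_j = alpha_E / C                 # per-interval error probability
# Bonferroni: P(all C statements true) >= 1 - sum of alpha_j = 1 - alpha_E
overall_lower_bound = 1 - C * alpha_j
# Each interval is then built at confidence level 1 - alpha_j (here 98%),
# wider than a single 90% interval, to buy the simultaneous guarantee.
```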
Metamodeling
Summary:-
13
Examples.
a) Physical Layout
b) Labor
c) Equipment
d) Maintenance
e) Workers
f) Product
g) Production control
h) Storage
i) Packing & Shipping
Models of material handling systems may have to contain some of the following types of
subsystems:
i) Conveyors
ii) Transporters
An abstract of some of the papers from the 1994 Winter Simulation Conference
proceedings will provide some insight into the types of problems that can be addressed by
simulation.
Summary:
This chapter introduced some of the ideas & concepts most relevant to manufacturing &
material handling simulation.
It also discussed the advantages of trace-driven simulation with respect to some of the
inputs, and the need in some models for accurate modeling of material handling equipment
and its control system.
Case Studies:
Do the case studies of the simulation of manufacturing & material handling systems.
University Exam. Papers – November 2004
3. (a) What is system modeling? Give an example and explain the different types of
models.
4. (a) State the various test for random numbers and explain briefly any two of
them.
(b) What are the characteristics of a queuing system, and how would you determine
the costs in queuing problems?
3. (a) Name and explain some of the useful statistical models for queuing systems.
(b) The highway between Atlanta, Georgia and Athens, Georgia has a high incidence of
accidents along its 100 km. Public safety officers say that the occurrence of accidents along
the highway is randomly (uniformly) distributed, but the news media say otherwise.
The Georgia Department of Public Safety published records for the month of September.
These records indicated the points at which 30 accidents occurred, as follows:
Use the Kolmogorov-Smirnov test to determine whether the distribution is uniform,
given D_{0.25, 30} = 0.24.
(b) The following data were available for the past 10 years on demand and lead time.
Lead time: 6.5 4.3 6.9 6.9 6.9 6.9 5.8 7.3 4.5 6.3
Demand: 103 83 116 97 112 104 106 109 92 96
5. (a) Explain in detail the three-step approach of Naylor and Finger in the validation process.
(b) Mention some important points which you would consider in selecting simulation
software.
Write a program for a single-server queue using any one of the simulation languages you
know.
(b) Discuss the output analysis for terminating simulations and confidence interval
estimation for a fixed number of replications.
7. What are the objectives of simulation in a manufacturing system? Give the block
diagram and explain the sequence of operation in a manufacturing system. Suggest a
suitable simulation language for the same.