All Simulation Lectures
The outcomes of the simulations will be reliable and ‘good’ only if the original model is
an accurate representation of the actual system. “Your results will be as good as your
data”. The word ‘data’ here includes the model and the associated parameters.
DISTRIBUTION FUNCTIONS (Revision)
The distribution function [or cumulative distribution function (cdf)] of X is the function defined by
F_X(x) = P(X ≤ x), −∞ < x < ∞
Most of the information about a random experiment described by the r.v. X is determined by the
behaviour of F_X(x).
Properties of F_X(x):
1. 0 ≤ F_X(x) ≤ 1
2. F_X(x) is non-decreasing: F_X(x1) ≤ F_X(x2) if x1 ≤ x2
3. lim_{x→∞} F_X(x) = 1 and lim_{x→−∞} F_X(x) = 0
For a discrete r.v. X, the function p_X(x) = P(X = x) is called the probability mass function (pmf) of the discrete r.v. X.
CONTINUOUS RANDOM VARIABLES AND PROBABILITY DENSITY FUNCTIONS:
Let X be a r.v. with cdf F_X(x). If F_X(x) is continuous and also has a derivative dF_X(x)/dx which exists
everywhere except at possibly a finite number of points and is piecewise continuous, then X is called a
continuous random variable. Alternatively, X is a continuous r.v. only if its range contains an interval
(either finite or infinite) of real numbers. Thus, if X is a continuous r.v., its probability density
function (pdf) is
f_X(x) = dF_X(x)/dx
Properties of f_X(x):
1. f_X(x) ≥ 0
2. ∫_{−∞}^{∞} f_X(x) dx = 1
3. P(a < X ≤ b) = ∫_a^b f_X(x) dx
Statistics:
Mean:
The mean (or expected value) of a r.v. X, denoted by μ_X or E(X), is defined by
E(X) = Σ_x x p_X(x) (discrete case), or E(X) = ∫_{−∞}^{∞} x f_X(x) dx (continuous case).
An important quantity is the coefficient of variation of the positive random variable X, defined as
c_X = σ_X / E(X),
where σ_X is the standard deviation of X.
The coefficient of variation is a (dimensionless) measure of the variability of the random variable X.
Moment:
The nth moment of a r.v. X is defined by:
E(X^n) = Σ_x x^n p_X(x) (discrete case), or E(X^n) = ∫_{−∞}^{∞} x^n f_X(x) dx (continuous case).
Bernoulli distribution:
A r.v. X is called a Bernoulli r.v. with parameter p if its pmf is given by:
p_X(1) = P(X = 1) = p and p_X(0) = P(X = 0) = 1 − p,
where 0 ≤ p ≤ 1. By definition, the cdf F_X(x) of the Bernoulli r.v. X is then:
F_X(x) = 0 for x < 0; F_X(x) = 1 − p for 0 ≤ x < 1; F_X(x) = 1 for x ≥ 1.
Poisson distribution
In many practical situations we are interested in measuring how many times a certain event occurs in a
specific time interval or in a specific length or area. For instance:
1 the number of phone calls received at an exchange or call centre in an hour;
2 the number of customers arriving at a toll booth per day;
3 the number of defects on a length of cable;
4 the number of cars passing along a stretch of road during a day.
The Poisson distribution plays a key role in modelling such problems.
The Poisson distribution is a discrete probability distribution for the counts of events that occur
randomly in a given interval of time (or space).
If we let X be the number of events in a given interval of length t, and the mean number of events per
unit time is λ, then the probability of observing k events in the interval is

P(X = k) = e^{−λt} (λt)^k / k!
The corresponding cdf is:

F_X(k) = Σ_{i=0}^{k} e^{−λt} (λt)^i / i!
The average: E(X) = λt
The variance: Var(X) = λt
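As a quick check of these formulas, here is a small C helper (illustrative only, not part of the lecture code) that evaluates the Poisson pmf for a given rate λ and interval length t:

#include <math.h>

/* Poisson pmf: probability of exactly k events in an interval of length t,
   given a mean event rate lambda per unit time. */
double poisson_pmf(int k, double lambda, double t)
{
    double mu = lambda * t;   /* mean number of events in the interval */
    double p  = exp(-mu);     /* the k = 0 term */
    for (int i = 1; i <= k; i++)
        p *= mu / i;          /* builds up mu^k / k! incrementally */
    return p;
}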
Uniform Distribution:
A r.v. X is called a uniform r.v. over (a, b) if its pdf is given by
f_X(x) = 1/(b − a) for a < x < b, and 0 otherwise.
Its statistics are:
E(X) = (a + b)/2 and Var(X) = (b − a)² / 12
Negative Exponential Distribution:
A r.v. X is called an exponential r.v. with parameter λ > 0 if its pdf is given by:
f_X(x) = λ e^{−λx} for x > 0, and 0 otherwise.
The most interesting property of the exponential distribution is its "memoryless" property. By this we
mean that if the lifetime of an item is exponentially distributed, then an item which has been in use for
some hours is as good as a new item with regard to the amount of time remaining until the item fails.
The exponential distribution is the only distribution which possesses this property.
The cdf is:
F_X(x) = 1 − e^{−λx} for x ≥ 0
Also:
E(X) = 1/λ and Var(X) = 1/λ²
The normal r.v. is probably the most important type of continuous r.v. It has played a significant role in
the study of random phenomena in nature. Many naturally occurring random phenomena are
approximately normal.
Example: Now suppose we know that in hospital A births occur randomly at an average rate of 2.3
births per hour and in hospital B births occur randomly at an average rate of 3.1 births per hour. What
is the probability that we observe 7 births in total from the two hospitals in a half-hour period?
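A short worked solution, using the fact that the sum of independent Poisson processes is itself Poisson with the summed rate: the combined rate is λ = 2.3 + 3.1 = 5.4 births per hour, so over t = 0.5 hours the mean is λt = 2.7, and

P(X = 7) = e^{−2.7} (2.7)^7 / 7! ≈ 0.0139

(the same value is returned by poisson_pmf(7, 5.4, 0.5) in the sketch above).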
Erlang distribution:
A random variable X has an Erlang-k (k = 1, 2, ...) distribution with mean k/λ if X is the sum of k
independent random variables X1, ..., Xk having a common exponential distribution with mean 1/λ.
The common notation is E_k(λ) or briefly E_k. The density of an E_k(λ) distribution is given by:

f(x) = λ (λx)^{k−1} e^{−λx} / (k − 1)! for x ≥ 0
Handling Distributions
Statistical distributions: Statistical distributions or probability distributions describe the outcomes of
varying a random variable, and the probability of occurrence of those outcomes. When the random
variable takes only discrete values, the corresponding probability distributions are called discrete
probability distributions. Examples of this kind are the binomial distribution, Poisson distribution,
and hypergeometric distribution. On the other hand, when the random variable takes continuous
values, the corresponding probability distributions are called continuous probability distributions.
Examples of this kind are normal, exponential, and gamma distributions.
Random sampling In statistics, a finite subset of individuals from a population is called a sample. In
random sampling, the samples are drawn at random from the population, which implies that each
unit of population has an equal chance of being included in the sample.
Random number generator (RNG) A random number generator is a computational or physical
device designed to generate a sequence of numbers that appear to be independent draws from a
population, and that also pass a series of statistical tests. They are also called pseudo-random
number generators, since the random numbers generated through this method are not truly
random but simulated. In what follows, we will consider RNGs which generate random numbers
between 0 and 1, also called uniform RNGs.
The following steps are typically performed for the Monte Carlo simulation of a physical process.
Static Model Generation Every Monte Carlo simulation starts off with developing a deterministic
model which closely resembles the real scenario. In this deterministic model, we use the most likely
value (or the base case) of the input parameters. We apply mathematical relationships which use the
values of the input variables, and transform them into the desired output.
Input Distribution Identification When we are satisfied with the deterministic model, we add the
risk components to the model. As mentioned before, since the risks originate from the stochastic
nature of the input variables, we try to identify the underlying distributions, if any, which govern the
input variables. This step needs historical data for the input variables. There are standard statistical
procedures to identify input distributions.
Random Variable Generation After we have identified the underlying distributions for the input
variables, we generate a set of random numbers (also called random variates or random samples)
from these distributions. One set of random numbers, consisting of one value for each of the input
variables, will be used in the deterministic model, to provide one set of output values. We then
repeat this process by generating more sets of random numbers, one for each input distribution, and
collect different sets of possible output values. This part is the core of Monte Carlo simulation.
Analysis and Decision Making After we have collected a sample of output values from the
simulation, we perform statistical analysis on those values. This step provides us with statistical
confidence for the decisions which we might make after running the simulation.
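The four steps can be compressed into a very small C sketch. This is illustrative only: it assumes a toy deterministic model output = 3x + 2y, with the two stochastic inputs x and y drawn from uniform distributions for simplicity.

#include <stdio.h>
#include <stdlib.h>

/* Uniform draw in [lo, hi); rand() is crude but fine for illustration. */
static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}

/* Step 1: the deterministic model (a toy example). */
static double model(double x, double y) { return 3.0 * x + 2.0 * y; }

int main(void)
{
    const int runs = 100000;
    double sum = 0.0, sumsq = 0.0;

    for (int i = 0; i < runs; i++) {
        /* Steps 2-3: draw one random value per input variable. */
        double x = uniform(0.0, 1.0);
        double y = uniform(10.0, 20.0);
        double out = model(x, y);          /* one outcome scenario */
        sum += out;
        sumsq += out * out;
    }

    /* Step 4: statistical analysis of the collected outputs. */
    double mean = sum / runs;
    double var  = sumsq / runs - mean * mean;
    printf("mean = %f, variance = %f\n", mean, var);
    return 0;
}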
We have so far encountered a number of stochastic processes. Examples are the Arrival process in a
queuing network, the time taken to service a job in a server, the failure process of components in a
system, etc. You can list as many as you wish. In fact many physical processes do contain some
degree of randomness, and the study of random processes is of paramount importance to any
professional. You, as engineers, will encounter randomness in your professional life on a daily basis.
Power systems engineer: the actual load in a given city is a random variable. You may be able to
describe the load by some distribution function.
Telecoms engineer: the utilization of your network is a random variable, etc.
In the context of queuing networks, we have so far assumed a nice Poisson distribution both for the
arrival and for the service processes. I remind you that this distribution implies two important
properties: the process is memoryless, and the probability of an event during a small time interval dt is
given by λ·dt, where λ is the average event rate. The distribution of time between Poisson events is
given by:

f(t) = λ e^{−λt}
For the uniform distribution, we know the density function is given by:

f(t) = 1/(b − a) for a ≤ t ≤ b, and zero otherwise.
The variable t is equally likely to occur anywhere within this interval. We can generate a value for t
satisfying above distribution using the following simple algorithm:
Generate p = Random ( )
Compute t = a + p·(b − a)
Note, the function Random ( ) is intended to generate a real number in the range 0..1. Most
software tools provide a function to generate such random numbers (in the range 0..1); see Excel's
RAND(). Some tools generate a random integer in the range 0..MAX_INT. This can then be used
to obtain a number in the range 0..1 (i.e. divide the generated number by MAX_INT).
If the algorithm above is repeated a large number of times, you can show that the resulting t's will
follow a uniform distribution.
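In C, the algorithm is a one-liner (uniform01() plays the role of Random ( ) in the text):

#include <stdlib.h>

/* Random ( ) of the text: a real number in the range 0..1. */
static double uniform01(void) { return (double)rand() / RAND_MAX; }

/* Draw t uniformly from (a, b): t = a + p(b - a). */
double draw_uniform(double a, double b)
{
    return a + uniform01() * (b - a);
}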
For the negative exponential distribution: starting at any time, we know that the probability of an
event occurring after time t is given by:

p = e^{−λt}

We know that p ≤ 1. If we generate a probability p, then we can compute the variable t as

t = −(1/λ) ln(p)
The algorithm is:
Generate p = Random ( )
Compute t = −(1/λ) ln(p)
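The same inversion in C (a sketch; the loop guards against p = 0, for which ln(p) is undefined):

#include <math.h>
#include <stdlib.h>

/* Draw an exponential inter-event time with rate lambda:
   t = -(1/lambda) ln(p), with p drawn in (0, 1]. */
double draw_exponential(double lambda)
{
    double p;
    do {
        p = (double)rand() / RAND_MAX;   /* p in [0, 1] */
    } while (p == 0.0);
    return -log(p) / lambda;
}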
For the Normal distribution (here with zero mean and standard deviation σ), the pdf is:

f(x) = (1/(σ√(2π))) e^{−x²/(2σ²)}
The cumulative probability is

p = ∫_{−∞}^{x} f(u) du
To generate a random number from this Normal distribution we can do the following:
Generate p = Random ( )
Solve for x in:

p = ∫_{−∞}^{x} f(u) du
This could be solved numerically or by using the Normal distribution tables. For those of you who
consider yourselves 'lazy', you can copy the Normal distribution tables into your computer
code and use search and/or interpolation techniques to find x.
This algorithm for the Normal distribution is not very efficient, and you can use it if and when you
don't care about the time spent solving for x. Also, remember that in many uses you will need to
repeat the process many times, and the algorithm will sooner or later become unbearably
inefficient.
Hit‐Miss Algorithm:
This is a very powerful algorithm that can be used to generate random numbers from a
distribution whose density function is complicated and does not lend itself to an
analytical solution. The Normal distribution above is a typical example.
Let us state the following:
The random variable x is limited to the range x1 < x < x2.
The density function satisfies fmin ≤ f(x) ≤ fmax, where fmin and fmax are the minimum and
maximum values that can be taken by the function, respectively.
For the Normal distribution above, f(x) becomes negligibly small for values of x ≥ 4σ and
x ≤ −4σ (please convince yourself by consulting the widely available Normal distribution tables).
Again, for the Normal distribution, fmin ≈ 0 and fmax = 1/(σ√(2π)).
The algorithm proceeds as follows:
Generate x0 in the range (x1, x2): x0 = x1 + (x2 − x1)·Random()
Compute y0 as: y0 = f(x0)
Generate y1 in the range (fmin, fmax): y1 = fmin + (fmax − fmin)·Random()
If (y1 < y0) then
Hit: accept x0 as a number from the given distribution & exit.
Else
Miss: repeat the above procedure.
End.
Using the Normal distribution, this algorithm returns a value in 2‐4 iterations.
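A C sketch of the hit-miss loop for the zero-mean Normal density above (σ, the ±4σ cut-off and fmax follow the text; M_PI is provided by most <math.h> implementations):

#include <math.h>
#include <stdlib.h>

static double uniform01(void) { return (double)rand() / RAND_MAX; }

/* Zero-mean Normal density with standard deviation sigma. */
static double normal_pdf(double x, double sigma)
{
    return exp(-x * x / (2.0 * sigma * sigma)) / (sigma * sqrt(2.0 * M_PI));
}

/* Hit-miss sampling on (x1, x2) = (-4 sigma, 4 sigma), with fmin = 0. */
double draw_normal(double sigma)
{
    double x1 = -4.0 * sigma, x2 = 4.0 * sigma;
    double fmax = 1.0 / (sigma * sqrt(2.0 * M_PI));

    for (;;) {
        double x0 = x1 + (x2 - x1) * uniform01();   /* candidate */
        double y1 = fmax * uniform01();             /* uniform in (0, fmax) */
        if (y1 < normal_pdf(x0, sigma))
            return x0;                              /* hit */
        /* miss: try again */
    }
}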
Monte‐Carlo Simulations:
This technique tries to model and study the behaviour of systems by including the randomness
of the processes involved in the system. It is one of the most powerful techniques used in the
simulation and modelling of stochastic systems. Here, we do not make any assumptions about
the models; instead we describe the behaviour as completely as we can and follow the items
contributing to the process.
Monte Carlo simulation is a type of simulation that relies on repeated random sampling and
statistical analysis to compute the results. This method of simulation is very closely related to
random experiments, experiments for which the specific result is not known in advance. In this
context, Monte Carlo simulation can be considered a methodical way of doing so-called
what-if analysis. We will emphasise this view throughout, as this is one of the easiest ways to
grasp the basics of Monte Carlo simulation.
We use mathematical models in natural sciences, social sciences, and engineering disciplines to
describe the interactions in a system using mathematical expressions. These models typically
depend on a number of input parameters, which, when processed through the mathematical
formulas in the model, result in one or more outputs. A schematic diagram of the process is
shown below.
This model fits quite nicely in discrete or semi‐discrete systems where all or some of the
variables are stochastic.
The input parameters for the models depend on various external factors. Because of these
factors, realistic models are subject to risk from the systematic variation of the input
parameters. A deterministic model, which does not consider these variations, is often termed
the base case, since the values of its input parameters are their most likely values. An effective
model should take into consideration the risks associated with various input parameters. In
most circumstances, experimenters develop several versions of a model, which can include the
base case, the best possible scenario, and the worst possible scenario for the values of the input
variables.
In Monte Carlo simulation, we identify a statistical distribution which we can use as the source
for each of the input parameters. Then, we draw random samples from each distribution, which
then represent the values of the input variables. For each set of input parameters, we get a set
of output parameters. The value of each output parameter is one particular outcome scenario in
the simulation run. We collect such output values from a number of simulation runs. Finally, we
perform statistical analysis on the values of the output parameters, to make decisions about the
course of action (whatever it may be). We can use the sampling statistics of the output
parameters to characterize the output variation.
As an example, consider a time-stepped simulation of a single-server queue.
Arrival process: the time between arrivals has the distribution fa(t).
Service process: the service time has the distribution fs(t).
TimeNow = 0
QSize = 0
ServerBusy = False
Read SimulationTime
NextArrivalTime = Random_fa ()
NextDepartureTime = SimulationTime+1
/* Loop until simulation time is exhausted */
While (TimeNow < SimulationTime) do
TimeNow = TimeNow + dt /*Advance the time */
/* Is it time for a new arrival? */
If (TimeNow ≥ NextArrivalTime)
/* Push in the Q and generate a time for next arrival */
QSize = QSize+1
NextArrivalTime = TimeNow + Random_fa ()
EndIf
/* Is it time for job completion? */
If (ServerBusy == True) AND (TimeNow ≥ NextDepartureTime) then
/* Remove from the Q and release the server */
QSize = QSize ‐ 1
ServerBusy = False
Endif
/* Check whether the server is idle and there is a job to be served */
If (ServerBusy == False) AND (QSize > 0) then
/* Server is now busy. Select a time for job completion */
ServerBusy = True
NextDepartureTime = TimeNow + Random_fs ()
Endif
CollectQStatistics
Endwhile
ComputeDelaysAndStats
[Figure: a queueing network in which jobs leaving centre A are routed to centres B, C and D with
probabilities p1, p2 and p3 respectively.]
Suppose you want to simulate the above system. This will involve the processing patterns at
each centre together with the arrival and departure processes. Let us look into the departure
process at A. Leaving A, the job could be placed at B, C or D with probabilities p1, p2 & p3
respectively. To decide which centre is chosen, we generate a random number (x) between
0..1, then use the following simple algorithm:
If (0 <= x <= p1) then place job in B
If (p1 < x <= p1+p2) then place job in C
If ( (p1+p2) <x<= 1) then place job in D
Note that p1+p2+p3 = 1.
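In C, the routing decision might look like this (the Centre enum and function name are illustrative; uniform01() as before):

#include <stdlib.h>

typedef enum { CENTRE_B, CENTRE_C, CENTRE_D } Centre;

static double uniform01(void) { return (double)rand() / RAND_MAX; }

/* Route a job leaving A using probabilities p1, p2 and p3 = 1 - p1 - p2. */
Centre route_from_A(double p1, double p2)
{
    double x = uniform01();
    if (x <= p1)      return CENTRE_B;
    if (x <= p1 + p2) return CENTRE_C;
    return CENTRE_D;
}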
Above models could be extended to simulate all the dynamics of such systems in order to
predict the performance when unusual patterns are employed. The technique could also be
deployed in order to understand the dynamics of systems where the user can vary the
parameters in an effort to get the simulation results to closely match the known behaviour of
the system.
Monte Carlo techniques are extensively used in a wide range of engineering applications. In
electronics, these techniques are used to study and predict the characteristics of semiconductor
devices. These methods trace a large number of particles (electrons and holes) for a
long time and use averaging methods to compute performance measures for the devices.
Note that the results you would obtain using any simulation technique, Monte‐Carlo included,
will be as good as the model you started with. In other words, bad models will produce bad
results.
[Figure: Distribution of charges in a transistor]
The figure above shows a typical High Electron Mobility Transistor. The use of the Monte Carlo
technique in the study of such systems can be summarised with the following algorithm:
Define the structure
Generate initial particles (typically 1 million particles)
Repeat:
Calculate the resulting Distribution of Charges,
Calculate the Distribution of Potential,
Calculate the Electric fields
Move the particles
Generate random scattering events (collisions). ‐‐ MONTE CARLO
Execute the collisions (energy loss and direction change)
Until the change in voltages becomes small
In the above we observe the following: the change in the distribution of the charges changes the
voltage distribution. The change in voltages will also affect the charges. Also, the voltages
determine the electric fields which drive the motion of the particles. The scattering/collision
events are a random process which is normally dependent on the velocity of the charges.
Simulation of Manufacturing Processes
Diffusion:
This process is used in semiconductor manufacturing to introduce atoms into the
semiconducting material and so change its characteristics. The material to be introduced is heated in
gaseous form; the gas then diffuses into the semiconductor.
In the diffusion process, the dopant atoms are introduced from the gas phase by using doped
sources. The doping concentration decreases monotonically from the surface, and the in‐depth
distribution of the dopant is determined mainly by the temperature and diffusion time.
Diffusion and ion implantation (see below) complement each other. For instance, diffusion is
used to form a deep junction, such as an n‐tub in a CMOS device, while ion implantation is
utilized to form a shallow junction, like a source / drain junction of a MOSFET.
Boron (B) is the most common p-type impurity in silicon, whereas arsenic (As) and phosphorus
(P) are used extensively as n-type dopants. These three elements are highly soluble in silicon,
with solubilities exceeding 5 × 10^20 atoms/cm^3 in the diffusion temperature range (between
800 °C and 1200 °C). These dopants can be introduced via several means, including solid sources
(BN for B, As2O3 for As, and P2O5 for P), liquid sources (BBr3, AsCl3, and POCl3), and gaseous
sources (diborane: B2H6, arsine: AsH3, and phosphine: PH3). Usually, the gaseous source is
transported to the semiconductor surface by an inert gas (e.g. N2) and is then reduced at the
surface.
The Monte Carlo simulation uses the laws of diffusion: it allows the particles to move into the
material and imposes the various laws of scattering and diffusion. The diffusion process changes
the concentration, and the concentration changes the rate of diffusion. The Monte Carlo
simulations are allowed to continue until the diffusion process stops or the rates are driven to
acceptably low values.
Ion Implantation:
Ion implantation is a low‐temperature technique for the introduction of impurities (dopants)
into semiconductors and offers more flexibility than diffusion. For instance, in MOS transistors,
ion implantation can be used to accurately adjust the threshold voltage.
In ion implantation, dopant atoms are volatilized, ionized, accelerated, and directed at a target
that is typically a silicon substrate. The atoms enter the crystal lattice, collide with the host
atoms, lose energy, and finally come to rest at some depth within the solid. The average
penetration depth is determined by the dopant, substrate materials, and acceleration energy.
Ion implantation energies range from several hundred to several million electron volts, resulting
in ion distributions with average depths from less than 10 nm to 10 μm. Doses range from 10^11
atoms/cm^2 for threshold adjustment to 10^18 atoms/cm^2 for buried dielectric formation.
As each implanted ion impinges onto the target, it undergoes a series of collisions with the host
atoms until it finally stops at some depth, as depicted below. The initial ion energy is typically
several tens of keV and is much higher than lattice binding energies. The ion scattering process
is dominated by the collisions between pairs of nuclei and the collisions with the electrons in the
target.
Defect Formation
Crystal Growth
Finite State Machine
Note: sample code is given in these notes to give you a feel for how the simulations could be
implemented using software languages.
A finite‐state machine (FSM) or simply a state machine is a mathematical abstraction sometimes
used to design digital logic or computer programs. It is a behaviour model composed of a finite
number of states, transitions between those states, and actions, similar to a flow graph in which one
can inspect the way logic runs when certain conditions are met. It has a finite internal memory; an
input feature that reads symbols in a sequence, one at a time, without going backward; and an
output feature, which may be in the form of a user interface once the model is implemented. The
operation of an FSM begins in one of the states (called the start state), goes through transitions
to different states depending on the input, and can end in any of the available states; however, only
a certain set of states mark a successful flow of operation (called accept states).
In many programs, there is a need to deal with entities whose handling requires a variety of distinct
behaviours. A straightforward approach to algorithm design deals with the different kinds of
entities using case statements or extended if-then-else statements. However, if the number of types
of entities is large, this approach leads to large amounts of code, often with poor run-time
characteristics. A table-driven approach uses tables to classify the different kinds of entities, aiming
at a reduction of the number of cases requiring distinct code.
For example, an assembler needs to translate a variety of machine instructions into machine code.
These instructions often have varying numbers of operands and varying operand types. Translating
an instruction involves combining information based on the instruction mnemonic and the operands
into a binary coded instruction. For fixed‐length instruction coding, the binary code instruction is
constructed by starting with a base code for the mnemonic and adding operand data using bitwise
or operations.
One obvious use of a table in algorithm design is for retrieving the base code for an instruction.
However, this use does not involve any classification of instructions. With careful classification,
tables can often be put to more profitable uses.
Classification
The heart of table‐driven design is a classification of the kinds of entities that the software needs to
handle. For example, in some simple assembly languages, there are only a few types of instructions
when classified according to the number and types of their operands.
If the programming language supports enumerated types then an enumerated type can be defined
with a value for each type of instruction. Then a table of instruction data can be constructed that
contains operand type information in addition to the base code for each instruction. The operand
type can be used in a case or switch statement to control handling of operands.
For languages, such as C, that allow functions to be treated like other kinds of data, functions for
handling operands can be stored in the instruction table. Each instruction has the particular function
that is needed for dealing with its operands stored in its instruction data entry.
Often, there are different levels of granularity possible in the classification involved in a table‐driven
design. For example, in an assembler the machine instructions may be assigned a single classification
based on all of the operands (coarse granularity), or individual operands can be classified (finer
granularity).
Any finite‐state machine can be shown as a graph with a finite set of nodes. The nodes correspond
to the states. There is no other memory implied other than the state shown. The start state is
designated with an arrow directed into the corresponding node, but otherwise unconnected.
An unconnected in‐going arc indicates that the node is the start state
The arcs are labelled σ/δ as shown below, where σ is the input symbol and δ is the output symbol or
action to be taken. The state transition is designated by virtue of the arrow going from one node to
another.
Transition from q1 to q2, based on input σ, giving output (action) δ
A Finite State Machine models behaviour by breaking it down into a small (finite) number of
different states.
The machine remains in each state for a relatively long period of time.
State transitions are triggered by a particular series of events (inputs).
Transitions between states are instantaneous.
The machine can generate output when a transition occurs. (For the nerds, this form of Finite
State Machine is called a "Mealy" machine)
Example:
Consider a "Missile Attack" game.
The user controls an interceptor missile (sprite) which is fired to shoot down incoming warheads
(sprites).
The missile sits on the launch pad until the user clicks the mouse to set a target position.
The missile flies to the target position and explodes (possibly destroying an incoming warhead).
If the warhead is destroyed then the score is increased by 10 points.
If any warhead reaches the ground then it explodes and the game finishes.
We can model the missile sprite lifetime using this FSM.
States are updated every 1/60th of a second in Update ().
[Figure: Missile Lifetime FSM]
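A possible C sketch of this FSM (the state names and update arguments are mine; the lecture's diagram is not reproduced here):

enum MissileState { ON_PAD, FLYING, EXPLODING, DONE };

enum MissileState missile = ON_PAD;

/* Called every 1/60th of a second from Update (). */
void UpdateMissile(int mouseClicked, int reachedTarget, int explosionFinished)
{
    switch (missile) {
    case ON_PAD:
        if (mouseClicked)      missile = FLYING;     /* user set a target   */
        break;
    case FLYING:
        if (reachedTarget)     missile = EXPLODING;  /* detonate at target  */
        break;
    case EXPLODING:
        if (explosionFinished) missile = DONE;       /* could respawn ON_PAD */
        break;
    case DONE:
        break;
    }
}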
[Figure: The FSM for an incoming warhead]
Combining FSMs
We have three Finite State Machines which we need to combine into one working application.
Each FSM requires its own separate state variable.
Each FSM is updated within the Update () method every 1/60th second.
The updates to missile and warhead states only take place when the game State is
GAME_PLAYING.
The warhead FSM can trigger the game FSM to move to GAME_OVER when the game has
finished.
Uses of the FSM:
Design of logic gates
Design of software modules
Understanding the behaviour of complex RT systems
Design of communication protocols
Managing the interactions between entities of an RT system
FSMs are powerful because they are a very dense representation. A lot of logic can be represented
by a relatively small diagram. They are powerful because they follow very simple rules, and are easy
to verify. And they are powerful because they can be used to generate code.
We know that the behaviour of a system could be represented at the highest level by a set of finite
states. The dynamics of the system means that the system will be transitioning between states in
response to events, either external or internal events.
Before the system completes the transition it normally performs some action, and
ends up in the final state. An example of this sort of behaviour is:
Initial state: lift at 5th floor, Door is open
Event: You press the call button at the Ground floor
Action: Door closes, lift descends to Ground floor, Door opens
Final State: Lift at Ground floor with Door open
EXAMPLE:
Consider a subway turnstile. This simple device is governed by an equally simple FSM. Figure 1
shows part of that FSM. The round rectangles are states. The turnstile has only two states. It can be
locked, or it can be unlocked. When the turnstile is locked, a person can drop a coin into its slot. This
will cause the turnstile to change to the Unlocked state. This is shown in the diagram by the arrow
leading from the Locked state to the Unlocked state. This arrow is called a transition, because it
describes how the FSM moves from one state to another when an event occurs.
Abnormal events: a coin dropped while the turnstile is already unlocked earns a polite "thank you"; an attempt to pass through a locked turnstile raises an alarm. Both cases appear in the implementation below.
Implementation:
enum State {Locked, Unlocked};
enum Event {Pass, Coin};

/* Current state of the turnstile; assumed to start Locked. */
enum State CurrentState = Locked;

/* Actions triggered by transitions. */
void Unlock();
void Lock();
void Thankyou();
void Alarm();

void Transition(enum Event e)
{
switch(CurrentState)
{
case Locked:
switch(e)
{
case Coin:
Unlock();
CurrentState = Unlocked;
break;
case Pass:
Alarm();
break;
}
break;
case Unlocked:
switch(e)
{
case Coin:
Thankyou();
break;
case Pass:
CurrentState = Locked;
Lock();
break;
}
break;
}
}
For the full implementation see code provided with this lecture.
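A small hypothetical driver, compiled together with the Transition code above, shows the FSM in action (the printf stubs are mine, not the lecture's):

#include <stdio.h>

/* Stub actions so the FSM can be exercised standalone. */
void Unlock(void)   { printf("Unlock\n"); }
void Lock(void)     { printf("Lock\n"); }
void Thankyou(void) { printf("Thank you\n"); }
void Alarm(void)    { printf("Alarm!\n"); }

int main(void)
{
    /* A coin unlocks, a pass re-locks, a second pass raises the alarm. */
    enum Event script[] = {Coin, Pass, Pass};
    for (int i = 0; i < 3; i++)
        Transition(script[i]);
    return 0;
}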
The simulation of discrete event dynamic systems (DEDs) depends on determining the time taken to
complete given tasks and the times of occurrence of events.
The simulation includes selecting times of completion, or times of events, using the distribution of the
time to complete tasks or of the time between successive events. This has been described before.
Bound Task:
Bound tasks are tasks that can be started as soon as another event has completed; starting them does
not depend on a condition. An example of such a bound task is the arrival process: once a customer
has arrived, we can start to expect the arrival of the next customer.
The service task is not a bound task, because the service can start only IF there is a customer to be
served. See conditional tasks.
ABC Algorithm:
StartABC
Initialise Tasks and events {Generate initial events and start tasks}
Repeat
Advance Time to time of first event in List
Execute the Event
Execute Bound Events (that must be executed at this moment)
Execute Conditional Events (that may be executed as a result of the current state)
Until Simulation Time expires
Calculate performance parameters
EndABC
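A minimal C sketch of phases A and B, assuming a simple unordered event list (the names Event, schedule and next_event are mine; phase C would then scan the conditional activities after each event):

#define MAX_EVENTS 100

typedef struct { double time; int type; } Event;

static Event  list[MAX_EVENTS];
static int    n_events  = 0;
static double clock_now = 0.0;

/* Schedule a future event; the list is unordered, we search on removal. */
void schedule(double t, int type)
{
    if (n_events < MAX_EVENTS) {
        list[n_events].time = t;
        list[n_events].type = type;
        n_events++;
    }
}

/* Phase A: advance the clock to the earliest event and remove it from the
   list (assumes n_events > 0). Phase B then executes this bound event. */
Event next_event(void)
{
    int best = 0;
    for (int i = 1; i < n_events; i++)
        if (list[i].time < list[best].time)
            best = i;
    Event e = list[best];
    list[best] = list[--n_events];   /* delete by swapping with the last entry */
    clock_now = e.time;
    return e;
}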
Mathematical Models
A mathematical model is a description of a system using mathematical
concepts and language. The process of developing a mathematical model is
termed mathematical modelling. Mathematical models are used in the natural
sciences (such as physics, biology, earth science, meteorology) and engineering
disciplines (such as computer science, artificial intelligence), as well as in the
social sciences (such as economics, psychology, sociology, political science).
Physicists, engineers, statisticians, operations research analysts, and
economists use mathematical models most extensively. A model may help to
explain a system and to study the effects of different components, and to
make predictions about behaviour.
Mathematical models can take many forms, including but not limited to
dynamical systems, statistical models, differential equations, or game theoretic
models. These and other types of models can overlap, with a given model
involving a variety of abstract structures. In general, mathematical models may
include logical models. In many cases, the quality of a scientific field depends
on how well the mathematical models developed on the theoretical side agree
with results of repeatable experiments. Lack of agreement between theoretical
mathematical models and experimental measurements often leads to
important advances as better theories are developed.
Mathematical models are usually composed of relationships and variables.
Relationships can be described by operators, such as algebraic operators,
functions, differential operators, etc. Variables are abstractions of system
parameters of interest that can be quantified. Several classification criteria can
be used for mathematical models according to their structure:
Linear vs. nonlinear: If all the operators in a mathematical model exhibit
linearity, the resulting mathematical model is defined as linear. A model is
considered to be nonlinear otherwise. The definition of linearity and
nonlinearity is dependent on context, and linear models may have nonlinear
expressions in them. For example, in a statistical linear model, it is assumed
that a relationship is linear in the parameters, but it may be nonlinear in the
predictor variables. Similarly, a differential equation is said to be linear if it can
be written with linear differential operators, but it can still have nonlinear
expressions in it. In a mathematical programming model, if the objective
functions and constraints are represented entirely by linear equations, then
the model is regarded as a linear model. If one or more of the objective
functions or constraints are represented with a nonlinear equation, then the
model is known as a nonlinear model.
Nonlinearity, even in fairly simple systems, is often associated with
phenomena such as chaos and irreversibility. Although there are exceptions,
nonlinear systems and models tend to be more difficult to study than linear
ones. A common approach to nonlinear problems is linearization, but this can
be problematic if one is trying to study aspects such as irreversibility, which are
strongly tied to nonlinearity.
Static vs. dynamic: A dynamic model accounts for time‐dependent changes in
the state of the system, while a static (or steady‐state) model calculates the
system in equilibrium, and thus is time‐invariant. Dynamic models typically are
represented by differential equations.
Explicit vs. implicit: If all of the input parameters of the overall model are
known, and the output parameters can be calculated by a finite series of
computations (known as linear programming, not to be confused with linearity
as described above), the model is said to be explicit. But sometimes it is the
output parameters which are known, and the corresponding inputs must be
solved for by an iterative procedure, such as Newton's method (if the model is
linear) or Broyden's method (if non‐linear). For example, a jet engine's physical
properties such as turbine and nozzle throat areas can be explicitly calculated
given a design thermodynamic cycle (air and fuel flow rates, pressures, and
temperatures) at a specific flight condition and power setting, but the engine's
operating cycles at other flight conditions and power settings cannot be
explicitly calculated from the constant physical properties.
Discrete vs. continuous: A discrete model treats objects as discrete, such as
the particles in a molecular model or the states in a statistical model; while a
continuous model represents the objects in a continuous manner, such as the
velocity field of fluid in pipe flows, temperatures and stresses in a solid, and
electric field that applies continuously over the entire model due to a point
charge.
Deterministic vs. probabilistic (stochastic): A deterministic model is one in
which every set of variable states is uniquely determined by parameters in the
model and by sets of previous states of these variables; therefore, a
deterministic model always performs the same way for a given set of initial
conditions. Conversely, in a stochastic model—usually called a "statistical
model"—randomness is present, and variable states are not described by
unique values, but rather by probability distributions.
Deductive, inductive, or floating: A deductive model is a logical structure based
on a theory. An inductive model arises from empirical findings and
generalization from them. The floating model rests on neither theory nor
observation, but is merely the invocation of expected structure. Application of
mathematics in social sciences outside of economics has been criticized for
unfounded models.
Analytical Models:
These are mathematical models obtained by applying first principles to derive a
closed-form relationship between the input and output variables.
Empirical Models:
Empirical models are those that are based entirely on data. The important
distinction between empirical models and examples from the previous section
is that the empirical models are not derived from assumptions concerning the
relationship between variables and they are not based on physical principles.
The first step in deriving an empirical model is to get a scatterplot of the data. If the
data does not seem to be linear, try plotting one or both variables as logarithms
so that you can check whether an exponential or power model is a good fit. The idea
is to get a graph that looks reasonably linear and then to fit a linear model.
It is possible to build a mathematical model solely out of abstract concepts.
However, if the models are to be made to confront reality, it is through data
that the confrontation happens.
By data we mean measurements or observations collected in the real world.
Interaction between data and models occurs in a couple of ways:
1. Data are needed to suggest a right model. The models called empirical are
based entirely on data.
2. Data are needed to estimate the values of the parameters appearing in a
model. This is sometimes called calibrating a model.
3. Data are needed to test a model.
It happens very often that the data given at the beginning is not sufficient for
making a good model. In these cases further data collection is needed.
Considering the following questions might be useful:
‐ What is the relevant data? Exactly what kind of data is needed?
‐ How can the relevant data be obtained?
‐ In what form do you need the data?
Once the data is collected, you need to decide on the techniques you want to
use in order to find an appropriate model. There are two major groups of
techniques based on two different ideas
1. Interpolation – finding a function that contains all the data points.
2. Model fitting – finding a function that is as close as possible to containing all
the data points. Such a function is also called a regression curve.
Sometimes you would need to combine these methods since the interpolation
curve might be too complex and the best fit model might not be sufficiently
accurate.
Model Fitting. Modelling using Regressions.
Most of the technology used (e.g. Excel, graphing calculators, Matlab) can be
used to find regression curves together with a quantity measuring the validity of
the model, the coefficient of determination, usually denoted by R². This coefficient
takes values in the interval [0, 1] and indicates how close the data points are to
lying exactly on the regression curve. If R² is close to 1, the model is reliable; if
R² is close to 0, another model should be considered.
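A C sketch of a least-squares linear fit with its coefficient of determination, using the standard formula R² = 1 − SSres/SStot:

/* Fit y = a + b x by least squares and report R^2 (assumes n >= 2 and
   that the x values are not all equal). */
void fit_line(const double x[], const double y[], int n,
              double *a, double *b, double *r2)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx  += x[i];         sy  += y[i];
        sxx += x[i] * x[i];  sxy += x[i] * y[i];
    }
    *b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    *a = (sy - *b * sx) / n;

    double mean_y = sy / n, ss_res = 0, ss_tot = 0;
    for (int i = 0; i < n; i++) {
        double e = y[i] - (*a + *b * x[i]);       /* residual */
        ss_res += e * e;
        ss_tot += (y[i] - mean_y) * (y[i] - mean_y);
    }
    *r2 = 1.0 - ss_res / ss_tot;
}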
Activity Cycle Diagrams
An Activity Cycle Diagram (ACD), also known as a flow diagram or activity diagram, describes
the target world in terms of activity cycles of resources and entities, which are either in
a passive or an active state. This modelling paradigm was first introduced by K.D.
Tocher in 1960. At that time it was called a flow diagram, but later researchers
referred to it as the activity cycle diagram.
The ACD is composed of activity cycles of resources and entities. Each activity
cycle describes the active and passive states of a resource or entity in the
system. Usually, an active state of a resource or entity is represented by a
rectangle, called an "activity", and a passive state is represented by a circle,
called a "queue", as shown in Figure 1. Arcs are used to connect activities and
queues.
Figure 1. Basic Graphical Notations of the ACD
The activity represents the interaction between an entity and resource(s),
which usually takes a time delay to finish. Tokens are used to represent the
state of the queues and activities. All activity cycles are closed on themselves
(Crookes 1986).
2. Activity Cycle Diagram (ACD)
An ACD model for a single-machine system with a setup operator is shown in
Figure 2. This model consists of four activity cycles: three for the resources
"generator", "machine" and "operator", and one for the entity "jobs". A job is
generated at intervals of ta time units by the generator and stored in queue
"B", waiting for processing on a machine. A ready-to-process machine
serves a job for tp time units if queue "B" has at least one job, and then holds for
a moment until the operator is available. The operator sets up the machine for
ts time units as soon as it is available. Other resources also perform one or
more different activities in some sequence, or are idle. Here, all activity cycles are
closed.
Figure 2. An Example of the ACD: Single Machine System with a Setup Operator
3. Activity Transition Table (ATT)
The three-phase rule was also proposed by Tocher (Tocher 1963) to handle the
flow of time in discrete event simulation:
Phase A: Advance the clock to the time of the next (bound‐to‐occur) event.
Phase B: Terminate any activity bound to end at this time.
Phase C: Initiate any activity whose condition now permits.
The ACD represents the state flow of an entity or resource in a system, while
the three-phase rule is based on the event, which denotes a change in the state
of the model. In phase B, the activities bound to occur at a time are terminated
with the release of resources and entities (into output queues); this is called a
bound or bound-to-occur (BTO) event. In phase C, the conditional events,
which satisfy the beginning condition of the availability of entities and
resources, are initiated by acquiring them (Crookes 1986).
The three‐phase rule has the atomistic structure of advancing time and
executing BTO and conditional events. In the simulation execution, the BTO
event is handled by the event routine and the conditional event is executed by
the activity routine.
The activity routine first checks the at-begin condition of an activity: whether
all input queues of that activity have at least one token. If true, the
at-begin action is fulfilled, which takes one token out of each input queue.
Then it schedules a BTO event to occur after a time delay or duration. The
event routine executes the at-end action, which adds one token to each output
queue.
Phase C of the three-phase rule has the inefficiency of scanning all activities
in the ACD model, even though a BTO event has an effect only on the
succeeding activities.
The activity transition table (ATT), a model specification for the simulation
execution of ACD models, is a set of activity transitions. Each activity
transition has an at-begin condition, an at-begin action, a BTO event with its time
delay, an at-end condition, an at-end action and a list of influenced activities, as
sketched below.
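One way to picture an activity-transition record is as a C struct (the field names are mine; the lecture does not fix a concrete data structure):

/* Illustrative layout of one entry of the activity transition table. */
typedef struct {
    const char *name;               /* activity name                        */
    int  (*at_begin_cond)(void);    /* all input queues non-empty?          */
    void (*at_begin_action)(void);  /* take one token from each input queue */
    double delay;                   /* time until the BTO event             */
    int  (*at_end_cond)(void);      /* condition checked at the BTO event   */
    void (*at_end_action)(void);    /* add one token to each output queue   */
    int  influenced[4];             /* indices of influenced activities     */
} ActivityTransition;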
There are many known modelling paradigms that describe the dynamics of a
system from process-oriented, event-based and activity-based viewpoints.
Among these, activity-based modelling is a natural way
to represent the activity paradigm of discrete event simulation and our
knowledge about a system. In activity-based modelling, the dynamics of the system
are represented as an ACD (activity cycle diagram), which is a network model of
the logical and temporal relationships among the activities. An ACD is easily
implemented with the activity-scanning method of simulation execution.
Developing the activity cycle diagram
The methodology for constructing an activity cycle diagram consists of the
following five steps:
Specify the model domain.
List all entities and their key attributes
For each entity define its individual closed cycle of activity ‐ queue ‐ activity
Merge the individual activity cycles
Verify the logic of the diagram and amend as necessary.
Data Modelling
Types of Databases and Database Applications
• Traditional applications
• Multimedia databases
• Data warehouses
Data Model
A model is an abstraction process that hides superfluous details. Data modelling is used for
representing entities of interest and their relationships in the database.
Data model and different types of data model: a data model is a collection of concepts that can
be used to describe the structure of a database, and that provides the necessary means to achieve
the abstraction. By the structure of a database we mean the data types, relationships and
constraints that hold for the data. Data models fall into three categories:
1. High-level (conceptual) data model: the user-level data model. This provides concepts that
are close to the way many users perceive data.
2. Low-level (physical) data model: provides concepts that describe the details of how data is
stored in the computer. The low-level data model is intended for computer specialists, not for
end-users.
3. Representational data model: sits between the high-level and low-level data models,
providing concepts that may be understood by end-users but that are not too far removed from
the way data is organized within the computer.
1. Relational Model
The Relational Model uses a collection of tables to represent both data and the relationships
among those data. Each table has multiple columns, and each column has a unique name.
[Example: a Customer table and an Account table, in which customers Preethi and Rocky share
the same account number A-111.]
Advantages
1. The main advantage of this model is its ability to represent data in a simplified format.
2. The process of manipulating records is simplified with the use of certain key attributes used
to retrieve data.
2. Network Model
The data in the network model are represented by collections of records, and relationships
among the data are represented by links, which can be viewed as pointers.
Advantages:
3. Hierarchical Model
A hierarchical data model is a data model in which the data is organized into a tree-like
structure. The structure allows repeating information using parent/child relationships: each
parent can have many children, but each child has only one parent. All attributes of a specific
record are listed under an entity type.
Advantages:
1. The representation of records is done using an ordered tree, which is a natural method of
implementation of one-to-many relationships.
2. Proper ordering of the tree results in easier and faster retrieval of records.
3. Allows the use of virtual records. This results in a stable database, especially when
modification of the database is made.
4. Object-Oriented Models
• One set comprises models of persistent O-O programming languages such as C++.
• Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P. - used in
Open OODB).
5. Object-Relational Models
• Universal Server.
• Relational systems incorporate concepts from object databases, leading to object-relational models.
• Exemplified in the latest versions of Oracle-10i, DB2, SQL Server and other DBMSs.
The Entity-Relationship (ER) model has several advantages:
• It maps well to the relational model. The constructs used in the ER model can easily be
transformed into relational tables.
• It is simple and easy to understand with a minimum of training. Therefore, the model can be
used by the database designer to communicate the design to the end user.
• In addition, the model can be used as a design plan by the database developer to implement
a data model in specific database management software.
Entities
Entities are the principal data objects about which information is to be collected. Entities are
usually recognizable concepts, either concrete or abstract, such as persons, places, things, or
events, which have relevance to the database. Some specific examples of entities are
EMPLOYEES, PROJECTS and INVOICES. An entity is analogous to a table in the relational
model.
Entities are classified as independent or dependent (in some methodologies, the terms used
are strong and weak, respectively). An independent entity is one that does not rely on another
for identification. A dependent entity is one that relies on another for identification.
Relationships
While the ER model lists and defines the constructs required to build a data model, there is no
standard process for doing so. Some methodologies, such as IDEF1X, specify a bottom-up
development process where the model is built in stages. Typically, the entities and
relationships are modelled first, followed by key attributes, and then the model is finished by
adding non-key attributes. Other experts argue that, in practice, using a phased approach is
impractical because it requires too many meetings with the end-users.
In practice, model building is not a strict linear process. As noted above, the requirements
analysis and the draft of the initial ER diagram often occur simultaneously. Refining and
validating the diagram may uncover problems or missing information which require more
information gathering and analysis.