General Queuing Theory Notes
INTRODUCTION
Queuing theory deals with problems that involve queuing (or waiting). Queuing is an
every-day experience. Examples of queues include people waiting for service at the banks or
supermarkets; people waiting for a train or a bus, etc. Queues are formed because resources are
limited. For instance, if the rate of demand for a service is higher than the rate of providing that
service, as experienced in banking halls at certain times of the day, queues are likely to be formed.
have queues because its avoidance may be very expensive. For example, to avoid queues at bus
terminals, an obvious solution is to buy more buses. This would require more funds. Similarly,
queues can be avoided in banking halls by employing more cashiers, but that again will in turn
increase cost of providing service. Whereas the customer is at an advantage when queues are
eliminated, it is not always possible for the service provider to achieve this. Hence, it is worthwhile
to aim for a balance between service to customers (short queues) and economic considerations
A queuing system is made up of two major parts: entities (e.g. customers) in the queue and the
entities receiving service at the service points. Figure 1 illustrates a typical queuing system with a
single-server.
[Figure 1: A single-server queuing system. Customers (X) wait in the queue, receive service, and exit.]
Many queuing systems have more than one server. This type of queue is referred to as
multiple-server queuing system. In a multiple-server queuing system, customers may access more
than one server. Multiple-server queuing system may be further classified into two types according
to how service is provided. In the first type, servers provide service to customers in tandem (service
When service is in tandem, customers must go through several servers before they can
complete service. An example is a registration process of students for a new academic year at the
University of Cape Coast. The student must first of all submit his/her receipt of payment of fees
and a unique identification number is issued to the student. Secondly, the student must enter the
name and this unique identification number into a record book for registered students. The student
then proceeds to the ITC centre (third phase) for the computerized registration where a registration
form is issued to the student. The fourth process includes submitting the form to one’s department
for approval of courses and lastly, the form is submitted to the Dean of Faculty. Figure 2 illustrates
In the case of service in parallel, customers go through only one of the servers to complete
service. Each server offers the same service and customers need not go through all the service
points. An example is what is experienced at the banking hall. Customers may go to any of the
tellers to draw money or deposit money. Figure 3 illustrates a service in parallel with three servers.
[Figure 3: Service in parallel. Customers (X) wait in a queue, are served by one of Server 1, Server 2 or Server 3, and exit.]
QUEUE DISCIPLINE
The manner in which customers join queues and access service is known as queue
discipline. There are four queue disciplines. The first and most common queue discipline is
referred to as First-In-First-Out (FIFO). Here customers are served in the order of their arrivals. It
Out (LIFO). As the name implies, a customer who enters service last turns out to be the person
discipline is one where service is in random order (SIRO). Thus, the order in which customers
arrived has no effect on the order in which they are served. Customers are chosen at random from the
waiting list. The last is the PRIORITY queuing discipline, which classifies each arrival into one of
several categories, each of which is given a priority level. Within each priority level, customers
enter service on a first-come-first-served (FCFS) basis. An example is the way customers are
attended to at emergency rooms or hospital OPD. The PRIORITY discipline is an example of a
Furthermore, customers who arrive may decide not to join the queue if it is
too long. This phenomenon is called balking. Other customers renege; that is, they leave the
queue if they have waited too long for service. Still other customers switch between queues
(jockey) if they think they will get served faster by so doing. Queues can be of finite capacity or
of infinite capacity.
According to Beasley (2007), changing the queue discipline (the rule by which we select
the next customer to be served) can often reduce congestion. Often the queue discipline
"choose the customer with the lowest service time" results in the smallest value for the average
time a customer spends queuing. Note here that integral to queuing situations is the idea of
uncertainty in, for example, interarrival times and service times. This means that probability and
In terms of the analysis of queuing situations, certain factors measure system performance.
These factors might include the time it takes customers to wait in the queue before they are served
and the length of time they have to wait before the service is completed. The expected utilisation
of the server and the expected time period during which it will be fully occupied (remember servers
control/improve long queues. Some of these alternatives investigated in practice include investing
efforts in reducing the service time, adequate servers employed, priorities for certain types of
Queue Notations
Queuing systems are described using standard notations. Kendall (1951) devised the
1/2/3/4/5/6
The first and second characteristics specify the nature of the arrival and service processes
M Interarrival and service times are independent, identically distributed (iid) random variables
The third characteristic is the number of parallel servers. The fourth characteristic describes
the queue discipline. For example the First In First Out discipline (FIFO). Also, the fifth
characteristic specifies the maximum number of customers in the system. This includes customers
who are waiting and customers who are in service. The last characteristic gives the size of the
population from which customers are drawn. Unless the number of potential customers is of the
same order of magnitude as the number of servers, the population size is considered to be infinite.
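As a small illustration of the six-field notation described above, the sketch below splits a descriptor such as M/M/1/FCFS/∞/∞ into its characteristics. The names `KendallSpec` and `parse_kendall` are hypothetical helpers introduced here for illustration only; they are not part of any standard library, and the fields are kept as plain strings because entries such as S or ∞ are symbolic.

```python
from dataclasses import dataclass


@dataclass
class KendallSpec:
    """Hypothetical container for the six characteristics 1/2/3/4/5/6."""
    arrival: str      # 1: nature of the arrival process, e.g. "M"
    service: str      # 2: nature of the service process, e.g. "M"
    servers: str      # 3: number of parallel servers
    discipline: str   # 4: queue discipline, e.g. "FIFO" or "FCFS"
    capacity: str     # 5: maximum number of customers in the system
    population: str   # 6: size of the population customers are drawn from


def parse_kendall(descriptor: str) -> KendallSpec:
    """Split a six-field descriptor on '/' and label each field."""
    fields = descriptor.split("/")
    if len(fields) != 6:
        raise ValueError("expected six '/'-separated fields, got %d" % len(fields))
    return KendallSpec(*fields)


spec = parse_kendall("M/M/1/FCFS/∞/∞")
print(spec)
```

Keeping every field as a string is a deliberate simplification; a fuller version might convert the server count and capacity to numbers where they are numeric.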
An illustration of this notation is the M/M/S/G/∞/∞ system. In this notation, the first M
indicates that interarrival times are exponentially distributed; the second M indicates that service
times are also exponentially distributed; S indicates that the system has S servers; G, the queue
discipline; and the two ∞ signs represent the maximum number of customers allowed in the queuing
system and the size of the population from which customers are drawn, respectively. Thus,
M/M/8/FCFS/10/∞, for example, represents a system with exponential interarrival and service times,
with 8 servers under a
The first queuing theory problem was considered by Erlang in 1908 (Beasley, 2007). He
looked at how large a telephone exchange needed to be in order to keep to a reasonable value the
number of telephone calls not connected because the exchange was busy (lost calls). Within ten
There are various processes that add up to any queuing system’s functionality. In the next two
sections, we take a look at distributions that govern the arrival and service processes (mechanisms).
ARRIVAL PROCESS
The arrival process describes how customers arrive, for example singly or in groups (batch or bulk
arrivals). It also depicts how the arrivals are distributed in time. An example is the probability
distribution of time between successive arrivals. This is known as the interarrival time distribution.
Since the interarrival times normally follow an exponential distribution, the number of
random. In a Poisson stream, successive customers arrive after intervals which are independently
model of many real life queuing systems and is described by a single parameter - the average
arrival rate. Other important arrival processes are scheduled arrivals; batch arrivals; and time
dependent arrival rates (i.e. the arrival rate varies according to the time of day).
This research proceeds to establish the fact that the arrival process depends on the no-memory
property of the exponential distribution. Further, the relationship between the Poisson and
exponential distributions would also be established. This enhances the understanding of the steady-
state probabilities of the queuing system. The following notation would be helpful in the
$\lambda$ - The mean (or average) number of arrivals per time period, i.e. the mean arrival rate
$\mu$ - The mean (or average) number of customers served per time period, i.e. the mean service rate
$\rho$ - Traffic intensity
$\pi_j$ - Steady state probability that $j$ customers are in the system
In exploring any queuing system, it is assumed that at most one arrival can occur at a given
instant of time. To illustrate this, $t_i$ is defined to be the time at which the $i$th customer arrives.
continuous random variables described by the random variable A. Thus, T2 has no effect on the
usually a good approximation of reality. Further, the distribution of arrivals is assumed to be
independent of the time of day or the day of the week. This assumption of stationary interarrival
times is often unrealistic (consider rush hours), but we may often approximate reality by breaking
the time of day into segments, say morning, midday and afternoon rush hours, so that each segment’s
interarrival times may be treated as stationary.
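Let the density function of $A$ be $a(t)$. It can be shown that for any small $\Delta t$,
$P(t \le A \le t + \Delta t)$ is approximately $a(t)\,\Delta t$. It follows that

$$P(A \le c) = \int_0^c a(t)\,dt \qquad (1)$$

and

$$P(A > c) = \int_c^\infty a(t)\,dt \qquad (2)$$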
Note that the lower limit of the integral in equation (1) is zero because a negative interarrival
time is impossible.
Defining $1/\lambda$ to be the mean or average interarrival time and assuming time is measured in
hours, the average interarrival time will have units of hours per arrival. Now, we may compute
$1/\lambda$ from $a(t)$ by using the equation

$$\frac{1}{\lambda} = \int_0^\infty t\,a(t)\,dt = E(A) \qquad (3)$$
Now the question of how to choose A to reflect reality and still be computationally tractable
arises. The common choice of A is the exponential distribution, (Winston 1994). This distribution
Freund (1992) states that a random variable $T$ has an exponential distribution (with
parameter $\lambda$), and is referred to as an exponential random variable, if and only if its
probability density is given by

$$a(t) = \begin{cases} \lambda e^{-\lambda t}, & t \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
It can be shown that a(t ) decreases very rapidly for small values of t. This indicates that very long
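Using equation (3) and integration by parts, the average or mean interarrival time $E(A)$ is
given by

$$E(A) = \int_0^\infty t\,\lambda e^{-\lambda t}\,dt = \frac{1}{\lambda}$$

and the variance is

$$\operatorname{var}(A) = \frac{1}{\lambda^2}$$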
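The mean and variance of the exponential distribution can be checked empirically. The following is a minimal sketch that draws exponential interarrival times with Python's standard `random.expovariate` and compares the sample mean and variance with $1/\lambda$ and $1/\lambda^2$. The rate of 4 arrivals per hour, the sample size and the helper name `sample_interarrival_times` are assumptions chosen only for illustration.

```python
import random


def sample_interarrival_times(rate, n, seed=0):
    """Draw n exponential interarrival times with the given arrival rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.expovariate(rate) for _ in range(n)]


rate = 4.0  # assumed arrival rate (arrivals per hour), for illustration only
samples = sample_interarrival_times(rate, 100_000)

sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)

print("sample mean:", sample_mean, " theory 1/lambda:", 1 / rate)
print("sample var: ", sample_var, " theory 1/lambda^2:", 1 / rate ** 2)
```

With a sample this large the two pairs of numbers should agree to about two decimal places; the agreement is statistical, not exact.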
The reason the exponential distribution is often used to model interarrival times is
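If $A$ has an exponential distribution, then for all nonnegative values of $t$ and $h$ (Winston,
1994),

$$P(A > t + h \mid A \ge t) = P(A > h)$$

To see this, note first that

$$P(A > h) = \int_h^\infty \lambda e^{-\lambda t}\,dt = e^{-\lambda h}$$

Using the relation $P(E \mid F) = \dfrac{P(E \cap F)}{P(F)}$ for two events $E$ and $F$, it follows that

$$P(A > t + h \mid A \ge t) = \frac{P(A > t + h \,\cap\, A \ge t)}{P(A \ge t)}$$

Thus,

$$P(A > t + h \,\cap\, A \ge t) = P(A > t + h) = e^{-\lambda (t + h)}$$

and

$$P(A \ge t) = e^{-\lambda t}$$

Hence

$$P(A > t + h \mid A \ge t) = \frac{e^{-\lambda (t + h)}}{e^{-\lambda t}} = e^{-\lambda h} = P(A > h)$$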
The above equation implies that the probability that there will be no arrivals during the
next $h$ hours does not depend on the value of $t$, and for all values of $t$, this probability equals
$P(A > h) = e^{-\lambda h}$.
For example,

$$P(A > 9 \mid A \ge 5) = P(A > 7 \mid A \ge 3) = P(A > 6 \mid A \ge 2) = P(A > 4 \mid A \ge 0) = P(A > 4) = e^{-4\lambda}$$
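The no-memory property can also be verified numerically. The sketch below evaluates the conditional probability directly from the definition of conditional probability and compares it with $e^{-\lambda h}$ for several values of $t$. The function names `survival` and `conditional_survival` and the rate 0.5 are illustrative assumptions, not part of the original notes.

```python
import math


def survival(rate, h):
    """P(A > h) for an exponential interarrival time with the given rate."""
    return math.exp(-rate * h)


def conditional_survival(rate, t, h):
    """P(A > t + h | A >= t), computed as P(A > t + h) / P(A >= t)."""
    return survival(rate, t + h) / survival(rate, t)


rate = 0.5  # assumed arrival rate, for illustration only
h = 4.0
for t in (0.0, 2.0, 3.0, 5.0):
    # Each line should show the same value in both columns, whatever t is.
    print(t, conditional_survival(rate, t, h), survival(rate, h))
```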
Thus, to know the probability distribution of the time until the next arrival, it does
not matter how long it has been since the last arrival. This means that to predict future arrival
patterns there is no need to keep track of how long it has been since the last arrival. These
The Poisson distribution is a discrete distribution that measures the number of occurrences
over some interval of time or space. It describes, for example, the number of customers who might
arrive during some given period. The exponential distribution on the other hand, is a continuous
distribution. It measures the passage of time between those occurrences. While the Poisson
distribution describes arrival rates (of people, trucks, telephone calls, etc.) within some time
period, the exponential distribution estimates the lapse of time between those arrivals. If the
number of occurrences is Poisson distributed, the lapse of time between occurrences will be
Interarrival times are exponential with parameter $\lambda$ if and only if the number of arrivals to
occur in an interval of length $t$ follows a Poisson distribution with parameter $\lambda t$. That is,
an arrival rate of $\lambda$ for exponential interarrival times corresponds to a mean of $\lambda t$
arrivals for the Poisson distribution. A discrete random variable $N_t$ has a Poisson distribution
with parameter $\lambda t$ if

$$P(N_t = n) = \frac{e^{-\lambda t} (\lambda t)^n}{n!} \qquad (n = 0, 1, 2, \ldots)$$

in which case

$$E(N_t) = \operatorname{var}(N_t) = \lambda t$$

Hence an average of $\lambda t$ arrivals occur during a time interval of length $t$, so $\lambda$ may
be thought of as the average number of arrivals per unit time, or the arrival rate.
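The Poisson probabilities above are easy to evaluate directly. The sketch below computes $P(N_t = n)$ from the formula and checks numerically that the probabilities sum to one and that the mean and variance both equal $\lambda t$. The values $\lambda = 3$, $t = 2$, the cut-off at 60 terms and the name `poisson_pmf` are assumptions made for illustration.

```python
import math


def poisson_pmf(n, rate, t):
    """P(N_t = n): probability of n arrivals in an interval of length t."""
    mean = rate * t
    return math.exp(-mean) * mean ** n / math.factorial(n)


rate, t = 3.0, 2.0  # assumed values; lambda * t = 6 expected arrivals
# Truncate the infinite support at 60 terms; the neglected tail is negligible here.
probs = [poisson_pmf(n, rate, t) for n in range(60)]

total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))

print("sum of probabilities:", total)
print("mean:", mean, " variance:", variance, " lambda*t:", rate * t)
```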
Firstly, arrivals defined on non-overlapping time intervals are independent (for example, the
number of arrivals occurring between times 1 and 10 does not give us any information about the
number of arrivals occurring between times 30 and 50). Secondly, for small $\Delta t$ (and any value
of $t$), the probability of one arrival occurring between times $t$ and $t + \Delta t$ is
$\lambda \Delta t + o(\Delta t)$, where $o(\Delta t)$ refers to any quantity satisfying

$$\lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = 0$$

Also, the probability of no arrival during the interval between $t$ and $t + \Delta t$ is
$1 - \lambda \Delta t + o(\Delta t)$, and the probability of more than one arrival occurring between
$t$ and $t + \Delta t$ is $o(\Delta t)$. If the above assumptions hold, then $N_t$ follows a Poisson
distribution with parameter $\lambda t$, and interarrival times are exponential with parameter
$\lambda$; that is, $a(t) = \lambda e^{-\lambda t}$.
In essence, if the arrival rate is stationary, if bulk arrivals cannot occur, and if past arrivals
do not affect future arrivals, then interarrival times will follow an exponential distribution with
parameter $\lambda$. The number of arrivals in any interval of length $t$ has a Poisson distribution
with parameter $\lambda t$. In many applications, the assumption of exponential interarrival times turns out to
SERVICE PROCESS
Service process is a description of the resources needed for service to begin, how long the
service will take (the service time distribution), and the number of servers available. It also
describes whether the servers are in series (each server has a separate queue) or in parallel (one
queue for all servers) and whether pre-emption is allowed (a server can stop processing a customer
to deal with another "emergency" customer). The assumptions involved in service mechanisms are
that the service times for customers are independent and do not depend upon the arrival process.
Another common assumption about service times is that they are exponentially distributed.
In modelling the service time, it is assumed that the service times of different customers
are independent random variables and that each customer’s service time is governed by a random
variable $S$ having density function $s(t)$. Let us define $1/\mu$ to be the mean service time, with
units of hours per customer, so that

$$\frac{1}{\mu} = \int_0^\infty t\,s(t)\,dt = E(S)$$

where $\mu$ is the service rate. For example, $\mu = 5$ means that if customers were always present,
the server could serve an average of 5 customers per hour, and the average service time of each
customer would be $\frac{1}{5}$ hour. Also, note that if service time follows an exponential density
function $s(t) = \mu e^{-\mu t}$, then a customer’s mean service time is $1/\mu$. Suppose we have a
three-server system in which each customer’s service time is governed by an exponential distribution,
all servers are busy and a customer is waiting (customer 4). One of the customers already in service
will complete service and customer 4 will enter service. By the no-memory property, customer 4’s
service time has the same distribution as the remaining service times of the two customers still in
service. Thus, the three customers
Most queuing systems with exponential interarrival and service times are modelled as
birth-death processes. This greatly eases the derivation of the steady-state probabilities and for that
matter the average quantities. The study continues by explaining the birth-death process and its
application to general queuing systems (especially the single-server exponential
queuing system).
Birth-Death Process
A birth-death process is a continuous-time stochastic process for which the system’s state
at any time is a nonnegative integer (Winston, 1994). We would subsequently use the important
idea of the birth-death process to model a queuing system. We define the number of people in any
For $t = 0$, the state of the system equals the number of people initially present in the
system. Let $P_{ij}(t)$ be the probability that $j$ people will be present in the queuing system at
time $t$ given that $i$ people are present at time 0. For large $t$, $P_{ij}(t)$ will approach a limit
$\pi_j$, which is independent of the initial state $i$. Hence,

$$\lim_{t \to \infty} P_{ij}(t) = \pi_j$$
For the queuing system that will be discussed, $\pi_j$ may be thought of as the probability that
at an instant in the distant future, $j$ customers will be present. Alternatively, $\pi_j$ may be
thought of (for time in the distant future) as the fraction of the time that $j$ customers are
present. The behaviour of $P_{ij}(t)$ before the steady state is reached is called the transient
behaviour of the queuing system (the values of $P_{ij}(t)$ for small $t$ depend critically on $i$, the
number of customers initially present in the system). For all but the simplest queuing systems,
analysis of the system’s transient behaviour is extremely difficult. For this reason, when we
analyse the behaviour of a queuing system, it is assumed that a steady state has been reached
(Winston, 1994). This allows us to work with the $\pi_j$'s instead of the $P_{ij}(t)$'s so as to
easily determine the steady-state probabilities (if they exist).
If a birth-death process is in state j at time t , then the motion of the process is governed
LAW 1
With probability $\lambda_j \Delta t + o(\Delta t)$, a birth occurs between time $t$ and time
$t + \Delta t$. A birth increases the system state by 1, to $j + 1$. The variable $\lambda_j$ is
called the birth rate in state $j$. In most
LAW 2
With probability $\mu_j \Delta t + o(\Delta t)$, a death occurs between time $t$ and time
$t + \Delta t$. A death decreases the system state by 1, to $j - 1$. The variable $\mu_j$ is the
death rate in state $j$. A death is a
LAW 3
Births and deaths are independent of each other. Laws 1-3 can be used to show that the
probability that more than one event (birth or death) occurs between $t$ and $t + \Delta t$ is equal
to $o(\Delta t)$. Note that any birth-death process is completely specified by knowledge of the birth
rates $\lambda_j$ and the death rates $\mu_j$. Since a negative state cannot occur, any birth-death
process must have $\mu_0 = 0$. Here $o(\Delta t)$ denotes any quantity satisfying

$$\lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = 0$$
The relationship between the birth-death process and the exponential distribution is
DEATH PROCESS
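Consider for instance an M/M/1/FCFS/∞/∞ queuing system in which interarrival times are
exponentially distributed with parameter $\lambda$ and service times are also exponentially
distributed with parameter $\mu$. If the state (number of people present) at time $t$ is $j$, then the
no-memory property of the exponential distribution implies that the probability of a birth occurring
during $[t, t + \Delta t]$ does not depend on how long the system has been in state $j$, and equals

$$P(0 \le A \le \Delta t) = \int_0^{\Delta t} \lambda e^{-\lambda t}\,dt = \left[-e^{-\lambda t}\right]_0^{\Delta t} = 1 - e^{-\lambda \Delta t}$$

Expanding the exponential as $e^{-\lambda \Delta t} = 1 - \lambda \Delta t + o(\Delta t)$, an arrival
occurs during $[t, t + \Delta t]$ with probability

$$1 - e^{-\lambda \Delta t} = \lambda \Delta t + o(\Delta t)$$

Comparing this with the birth probability $\lambda_j \Delta t + o(\Delta t)$ of Law 1, we may conclude
that the birth rate in state $j$, $\lambda_j$, is simply the arrival rate $\lambda$.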
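Similarly, to determine the death rate at time $t$, note that if the state is zero at time $t$, then
no one is in service, so no service completion can occur and $\mu_0 = 0$. If the state at time $t$ is
$j \ge 1$, then we know (since there is only one server) that exactly one customer will be in service.
The no-memory property of the exponential distribution then implies that the probability that a
customer completes service during $[t, t + \Delta t]$ is

$$P(0 \le S \le \Delta t) = \int_0^{\Delta t} \mu e^{-\mu t}\,dt = 1 - e^{-\mu \Delta t} = \mu \Delta t + o(\Delta t)$$

Thus for $j \ge 1$, we may conclude that the death rate in state $j$, $\mu_j$, is simply the service rate $\mu$.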
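The approximation $1 - e^{-\lambda \Delta t} = \lambda \Delta t + o(\Delta t)$ used in this argument can be illustrated numerically. The sketch below computes the remainder $1 - e^{-\lambda \Delta t} - \lambda \Delta t$ divided by $\Delta t$ for shrinking $\Delta t$; by the definition of $o(\Delta t)$ this ratio should tend to zero. The rate 5 and the function names are assumptions for illustration only.

```python
import math


def event_probability(rate, dt):
    """1 - exp(-rate * dt): probability of an event within a window of length dt."""
    return -math.expm1(-rate * dt)  # expm1 keeps precision for very small dt


def remainder_ratio(rate, dt):
    """(exact probability - rate * dt) / dt, which should tend to 0 as dt -> 0."""
    return (event_probability(rate, dt) - rate * dt) / dt


rate = 5.0  # assumed rate, for illustration only
for dt in (0.1, 0.01, 0.001, 0.0001):
    print(dt, event_probability(rate, dt), rate * dt, remainder_ratio(rate, dt))
```

The last column shrinks roughly in proportion to $\Delta t$, which is consistent with the leading neglected term being of order $(\Delta t)^2$.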
In summary, if we assume that service completion and arrivals occur independently, then
an M/M/1/FCFS/∞/∞ queuing system is a birth death process. The birth death process is only
PROCESS
The steady state probability $\pi_j$ is the probability that in the long run there will be exactly
$j$ customers in the system. We now show how the $\pi_j$'s may be determined for an arbitrary
birth-death process. The key is to relate, for a small change $\Delta t$, $P_{ij}(t + \Delta t)$ to
$P_{ij}(t)$, using the analogy of Markov chains to obtain the flow balance equations, or the
conservation of flow equations, for the birth-death process (Winston 1994). It follows that for
equilibrium, the expected number of departures from state $j$ per unit time is equal to the expected
number of entrances into state $j$ per unit time:

$$\frac{\text{Expected no. of departures from state } j}{\text{Unit time}} = \frac{\text{Expected no. of entrances into state } j}{\text{Unit time}} \qquad (4)$$
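Assuming the system has settled down into the steady state, the system spends a fraction
$\pi_j$ of its time in state $j$. For $j \ge 1$, we can only leave state $j$ by going to state $j + 1$
(a birth) or $j - 1$ (a death), so for $j \ge 1$, we obtain

$$\frac{\text{Expected no. of departures from state } j}{\text{Unit time}} = \pi_j (\lambda_j + \mu_j) \qquad (5)$$

Likewise, state $j$ can only be entered from state $j - 1$ (by a birth) or from state $j + 1$ (by a
death), so

$$\frac{\text{Expected no. of entrances into state } j}{\text{Unit time}} = \pi_{j-1} \lambda_{j-1} + \pi_{j+1} \mu_{j+1} \qquad (6)$$

Substituting (5) and (6) into (4) gives

$$\pi_{j-1} \lambda_{j-1} + \pi_{j+1} \mu_{j+1} = \pi_j (\lambda_j + \mu_j) \qquad (j = 1, 2, \ldots) \qquad (7)$$

For $j = 0$, state 0 can only be left by a birth and only be entered by a death from state 1, so

$$\mu_1 \pi_1 = \pi_0 \lambda_0 \qquad (8)$$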
Equations (7) and (8) are called the flow balance equations, or the conservation of flow equations,
Note that equation (7) expresses the fact that in the steady state, the rate at which transitions
occur into any state $j$ must equal the rate at which transitions occur out of state $j$. If equation
(7) did not hold for all states, the probability would “pile up” at some state, and a steady state
would not exist. Therefore, the flow balance equations for a birth-death process are stated below
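$$\begin{aligned}
(j = 0)\quad & \pi_0 \lambda_0 = \pi_1 \mu_1 \\
(j = 1)\quad & (\lambda_1 + \mu_1)\,\pi_1 = \lambda_0 \pi_0 + \mu_2 \pi_2 \\
(j\text{th})\quad & (\lambda_j + \mu_j)\,\pi_j = \lambda_{j-1} \pi_{j-1} + \mu_{j+1} \pi_{j+1}
\end{aligned} \qquad (9)$$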
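The solution of the birth-death flow balance equations begins with the $j = 0$ equation of (9), which gives

$$\pi_1 = \frac{\pi_0 \lambda_0}{\mu_1}$$

Substituting this result into the $j = 1$ equation of (9) gives $\lambda_1 \pi_1 = \mu_2 \pi_2$, so that

$$\pi_2 = \frac{\pi_0 \lambda_0 \lambda_1}{\mu_1 \mu_2}$$

Now we use the $j = 2$ equation of (9) to solve for $\pi_3$ in terms of $\pi_0$, and so on.
Defining

$$C_j = \frac{\lambda_0 \lambda_1 \cdots \lambda_{j-1}}{\mu_1 \mu_2 \cdots \mu_j}$$

we obtain

$$\pi_j = \pi_0 C_j \qquad (10)$$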
Since at any given time we must be in some state, the steady-state probabilities ($\pi_j$'s) must sum
to 1:

$$\sum_{j=0}^{\infty} \pi_j = 1$$

Substituting equation (10) gives

$$\pi_0 \left(1 + \sum_{j=1}^{\infty} C_j\right) = 1$$

If $\sum_{j=1}^{\infty} C_j$ is finite, then we can use the above equation to solve for $\pi_0$:

$$\pi_0 = \frac{1}{1 + \sum_{j=1}^{\infty} C_j} = \frac{1}{\sum_{j=0}^{\infty} C_j}$$

where the second form uses the convention $C_0 = 1$. Equation (10) can then be used to
determine $\pi_1, \pi_2, \pi_3, \ldots$. If $\sum_{j=1}^{\infty} C_j$ is infinite, then no steady-state
distribution exists. The most common reason for a steady state failing to exist is that the
arrival rate is at least as large as the maximum rate at which customers can be served.
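The recipe above (form the $C_j$, normalise to obtain $\pi_0$, then apply equation (10)) can be sketched in a few lines of code. The sketch below assumes a birth-death process truncated to a finite number of states, so that the sums are finite; this truncation, the function name `steady_state` and the rates $\lambda = 2$, $\mu = 5$ are illustrative assumptions and not part of the original derivation.

```python
def steady_state(birth_rates, death_rates):
    """Steady-state probabilities of a finite birth-death process.

    birth_rates[j] is lambda_j     for j = 0, ..., N-1
    death_rates[j] is mu_(j+1)     for j = 0, ..., N-1
    Returns [pi_0, ..., pi_N] computed as pi_j = pi_0 * C_j.
    """
    if len(birth_rates) != len(death_rates):
        raise ValueError("need one death rate mu_(j+1) for each birth rate lambda_j")
    c = [1.0]  # C_0 = 1 by convention
    for lam, mu in zip(birth_rates, death_rates):
        # C_(j+1) = C_j * lambda_j / mu_(j+1)
        c.append(c[-1] * lam / mu)
    total = sum(c)  # 1 / pi_0
    return [cj / total for cj in c]


# Single-server example with constant rates (assumed values), truncated at
# 50 customers so that the state space is finite.
lam, mu, capacity = 2.0, 5.0, 50
pi = steady_state([lam] * capacity, [mu] * capacity)

print("pi_0:", pi[0])
print("pi_1 / pi_0:", pi[1] / pi[0], " lambda / mu:", lam / mu)
print("sum:", sum(pi))
```

Because the arrival rate here is well below the service rate, the neglected states beyond the truncation point carry negligible probability; with $\lambda \ge \mu$ and no truncation the sum of the $C_j$ would diverge, as noted above.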
We now have a basis to explore exponential models (specifically the single server
exponential model) in the next chapter based on the aforementioned derivations using the birth-