Queueing Theory
Rev. May 1, 2003
V.G. Narayanan
Waiting lines, or queues, are a universal—and universally reviled—phenomenon: we have all
experienced the frustration of waiting at banks, in grocery stores, in doctors’ offices, or on customer
service lines. Queueing theory is the mathematical study of how these lines will behave, and how the
costs of waiting (which include reduced customer satisfaction) need to be balanced against the costs
of increasing the capacity to serve customers more quickly. In this note, we will focus on techniques
for analyzing the behavior of waiting lines.
Queueing (often spelled as “queuing” in the United States) theory can be traced back to the work
of A.K. Erlang, a Danish mathematician, who studied demand for telephone lines and the resulting
service delays. It has since been extended and applied in many diverse contexts: in computer
networks, highway planning, and airport runway management, to name but a few examples. All
queues share a few basic components. First are arrivals, that is, the type of item being served, such as
customers, equipment, work-in-process inventory, or airplanes. In addition, the manner in which
arrivals occur, which is generally modeled as a random process, must also be specified. That is,
rather than assuming that exactly 100 planes will always arrive per hour at an airport, we assume the
rate at which aircraft arrive follows some sort of probability distribution, with some arrival rates
being more likely than others. In the Delays at Logan case, we have assumed arrivals follow a
Poisson distribution, in line with other Logan delay models (see footnote 1). In fact, it is precisely the uncertainty in
arrival rates that causes delay times and costs to behave in the non-linear fashion you should see in
your Excel exercise (see footnote 2).
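To illustrate what random arrivals look like in practice, the following minimal sketch draws hourly arrival counts from a Poisson process. The rate of 100 arrivals per hour echoes the airport example above; the function name, the seed, and the number of simulated hours are illustrative assumptions, not part of the case.

```python
import random

def arrivals_in_one_hour(rate_per_hour, rng):
    """Count arrivals in one hour of a Poisson process.

    A Poisson process has exponentially distributed gaps between
    arrivals, so we add up random gaps until the hour is used up.
    """
    count = 0
    elapsed = rng.expovariate(rate_per_hour)  # time of the first arrival, in hours
    while elapsed < 1.0:
        count += 1
        elapsed += rng.expovariate(rate_per_hour)
    return count

# Assumed inputs for illustration only: 100 arrivals per hour on average,
# 1,000 simulated hours, fixed seed so the run is repeatable.
rng = random.Random(2003)
hourly_counts = [arrivals_in_one_hour(100, rng) for _ in range(1000)]

print("average arrivals per hour:", sum(hourly_counts) / len(hourly_counts))
print("quietest hour:", min(hourly_counts))
print("busiest hour:", max(hourly_counts))
```

The average stays close to 100, but individual hours fall well above and below it, and it is those busy hours that build up queues.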
Next, the number and nature of servers must be specified. Examples of servers are tellers at banks,
traffic lights at intersections, or runways at air terminals. Again, the rate at which services are
rendered must be specified, and we often assume it, too, follows a random process. Finally, the form of
queue discipline must be specified. Queue discipline is the decision rule (like first-in-first-out or last-
in-first-out) that determines which customers will be serviced when. In an airport context, we would
most likely encounter a first-in-first-out queueing discipline.
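As a small illustration of queue discipline, the sketch below shows the order in which the same waiting customers would be served under first-in-first-out and last-in-first-out rules. It assumes, for simplicity, that all customers are already waiting when service begins; the function name and flight labels are hypothetical.

```python
from collections import deque

def service_order(arrivals, discipline="FIFO"):
    """Return the order in which waiting customers are served.

    `arrivals` lists customers in order of arrival; everyone is
    assumed to be waiting already when service starts.
    """
    waiting = deque(arrivals)
    order = []
    while waiting:
        if discipline == "FIFO":
            order.append(waiting.popleft())  # earliest arrival goes first
        elif discipline == "LIFO":
            order.append(waiting.pop())      # latest arrival goes first
        else:
            raise ValueError("unknown discipline: " + discipline)
    return order

planes = ["Flight A", "Flight B", "Flight C"]
print(service_order(planes, "FIFO"))
print(service_order(planes, "LIFO"))
```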
The basic structure of the waiting lines process must also be specified. Waiting line models vary
based on whether multiple servers (or channels) are handling customers and whether customers are
serviced in one or more stages (or phases). Your airplane exercise involves the use of two models: a
1 http://web.mit.edu/aeroastro/www/labs/AATT/reviews/delays.html
2 If you would like to find out more about this phenomenon, please see R. Banker, S. Datar, and S. Kekre, “Relevant costs,
congestion and stochasticity in production environments,” Journal of Accounting & Economics, July 1988.
Professor V.G. Narayanan and Doctoral Student George Batta prepared this note as the basis for class discussion.
Copyright © 2001 President and Fellows of Harvard College. To order copies or request permission to reproduce materials, call 1-800-545-7685,
write Harvard Business School Publishing, Boston, MA 02163, or go to http://www.hbsp.harvard.edu. No part of this publication may be
reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means—electronic, mechanical,
photocopying, recording, or otherwise—without the permission of Harvard Business School.
102-023 Queueing Theory
single-channel, single-phase model when there is only one runway in operation, and a multiple-channel, single-phase model when two or three runways are in operation (see Exhibit 1). However,
this is merely for simplicity and convenience, since the airport case itself is more complex than
either of these models suggests: queues may form as planes circle in holding patterns while waiting for
runway slots, as planes wait for gate slots to clear, as planes wait for refueling stations to be
readied, and so on.
We also need to make assumptions about the rate at which arrivals are serviced. We often model
service rates as also following a Poisson distribution, with the caveat that this assumption is not valid
nearly as often as the Poisson assumption for arrival rates. Note also that your expected service rate
must always exceed your expected arrival rate, since, otherwise, your queue would necessarily grow
to infinite length (guess why this is so). A typical relationship between average arrival rates, average
service rates, and expected queue length is depicted in Exhibit 3.
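The relationship between arrival rate, service rate, and queue length can be sketched for the simplest case. The example below assumes a single-channel, single-phase queue with Poisson arrivals and exponentially distributed service times (the standard M/M/1 model), for which the expected queue length is Lq = ρ²/(1 − ρ), with ρ equal to the arrival rate divided by the service rate. The function name and the rates used are illustrative assumptions.

```python
def mm1_queue_length(arrival_rate, service_rate):
    """Expected number waiting in line for a single-server (M/M/1) queue.

    Returns infinity when the server cannot keep up, because no
    steady state exists in that case.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    rho = arrival_rate / service_rate  # capacity utilization
    return rho ** 2 / (1 - rho)

# Assumed service rate of 10 per hour; arrival rates chosen to show
# how quickly the line grows as utilization approaches 100%.
for arrival_rate in (5, 8, 9, 9.5, 9.9, 10):
    lq = mm1_queue_length(arrival_rate, 10)
    print(f"arrival rate {arrival_rate:>4}: expected queue length {lq:.1f}")
```

The queue length rises slowly at low utilization and then climbs steeply as the arrival rate nears the service rate, which is the non-linear behavior described above.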
Further Reference
A more detailed discussion of queueing behavior and managing customer satisfaction with
queues can be found in “Notes on the Management of Queues,” HBS No. 680-053 (Boston: Harvard
Business School Publishing, 1979, Rev. 1995).
μ = average expected service rate
Wq = average expected waiting time in queue
s = number of servers
Another statistic that managers of queueing systems like to examine is ρ = λ/(sμ), which tells
managers the average capacity utilization of a system. Using the multiple-channel, single-phase
queueing system equations in Exhibit 4, we can derive ρ as well as our other outputs. If we assume
the following about arrival rates, service rates, and number of tellers, our queueing system outputs
will calculate as shown:
λ (average number of customers entering bank per hour) | μ (average number of customers each teller services per hour) | s (number of tellers) | ρ (system capacity utilization) | Wq (wait time, in minutes) | Lq (average number of people waiting in line)
This might be considered a typical lunchtime crush for a bank, where full service capacity is
almost reached and tellers can barely keep up with the number of customers entering the bank. A
bank manager might find a waiting time greater than 15 or even 10 minutes unacceptable,
and might want to understand how waiting times would behave if one of our three
input variables were changed. For example, say we hired another teller, bringing the total number of tellers to five.
In this case, our output variables will change as shown:
λ (average number of customers entering bank per hour) | μ (average number of customers each teller services per hour) | s (number of tellers) | ρ (system capacity utilization) | Wq (wait time, in minutes) | Lq (average number of people waiting in line)
The effect is dramatic: by decreasing capacity utilization to 75%, we have cut waiting times to a
bare minimum. Managers might also attempt to manipulate the arrival rate, perhaps by installing
more ATMs around the city or by allowing more transactions to be placed online or over the
phone. Suppose that doing so lowers the arrival rate to 43 customers per hour. Even a small
decrease like this will have a dramatic effect, though not of the same magnitude as increasing our
number of servers:
λ (average number of customers entering bank per hour) | μ (average number of customers each teller services per hour) | s (number of tellers) | ρ (server capacity utilization) | Wq (wait time, in minutes) | Lq (average number of people waiting in line)
Increasing the productivity of tellers, through training or automation, may also help decrease
queue lengths and waiting times. However, as with any other policy, managers must weigh the
benefits of enhanced customer satisfaction against the costs of extra labor or extra automation.
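The three scenarios discussed above can be explored with a short calculation. The sketch below uses the standard multiple-channel, single-phase (M/M/s) steady-state formulas. The inputs are assumptions chosen to be consistent with the text (75% utilization with five tellers, and an arrival rate that drops to 43): 45 customers per hour arriving, 12 customers per hour served by each teller, and four tellers at the start. The function name is hypothetical, and the figures it prints should be checked against your own exhibit rather than taken as the note's values.

```python
from math import factorial

def mms_metrics(arrival_rate, service_rate, servers):
    """Steady-state outputs for an M/M/s queue.

    Returns (utilization, wait in queue in minutes, number waiting in line).
    Rates are per hour; service_rate is the rate per server.
    """
    offered_load = arrival_rate / service_rate
    utilization = offered_load / servers
    if utilization >= 1:
        raise ValueError("arrival rate must be below total service capacity")
    # Probability that the system is empty.
    head = sum(offered_load ** n / factorial(n) for n in range(servers))
    tail = offered_load ** servers / (factorial(servers) * (1 - utilization))
    p_empty = 1 / (head + tail)
    # Expected number waiting in line, then expected wait in minutes.
    lq = (p_empty * offered_load ** servers * utilization
          / (factorial(servers) * (1 - utilization) ** 2))
    wq_minutes = 60 * lq / arrival_rate
    return utilization, wq_minutes, lq

# Assumed inputs (not taken from the note's tables): (arrivals/hour, served/hour/teller, tellers)
scenarios = {
    "lunchtime crush": (45, 12, 4),
    "one more teller": (45, 12, 5),
    "fewer arrivals": (43, 12, 4),
}
for label, (lam, mu, s) in scenarios.items():
    rho, wq, lq = mms_metrics(lam, mu, s)
    print(f"{label}: utilization {rho:.1%}, wait {wq:.1f} min, {lq:.1f} in line")
```

Under these assumed inputs, adding a teller shortens the wait far more than the modest drop in arrivals does, in line with the discussion above.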
[Figure: chart with an axis labeled "Queue Length"]
Exhibit 4 Queueing Formulae
s = Number of Servers
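For reference, the standard textbook formulas for a multiple-channel, single-phase queue with Poisson arrivals and exponentially distributed service times are sketched below. The notation (λ for the arrival rate, μ for the service rate per server, P_0 for the probability that the system is empty) is an assumption and may differ from the layout of the original exhibit.

```latex
\begin{align*}
\rho &= \frac{\lambda}{s\mu}, \qquad \rho < 1 \\
P_0 &= \left[\sum_{n=0}^{s-1} \frac{(\lambda/\mu)^n}{n!}
       + \frac{(\lambda/\mu)^s}{s!\,(1-\rho)}\right]^{-1} \\
L_q &= \frac{P_0\,(\lambda/\mu)^s\,\rho}{s!\,(1-\rho)^2} \\
W_q &= \frac{L_q}{\lambda}
\end{align*}
```

With s = 1 these reduce to the single-channel results, for example L_q = ρ²/(1 − ρ).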