Decision-Theoretic Group - Elevator Scheduling
http://www.merl.com
Decision-Theoretic Group
Elevator Scheduling
Daniel Nikovski
Matthew Brand
Abstract
We present an efficient algorithm for exact calculation and minimization of expected
waiting times of all passengers using a bank of elevators. The dynamics of the system
are represented by a discrete-state Markov chain embedded in the continuous phase-space
diagram of a moving elevator car. The chain is evaluated efficiently using dynamic
programming to compute measures of future system performance such as expected
waiting time, properly averaged over all possible future scenarios. An elevator
group scheduler based on this method significantly outperforms a conventional algorithm
based on minimization of proxy criteria such as the time needed for all cars to
complete their assigned deliveries. For a wide variety of buildings, ranging from 8 to
30 floors, and with 2 to 8 shafts, our algorithm reduces waiting times by up to 70% in
heavy traffic, and exhibits an average waiting-time speed-up of 20% in a test set of
20,000 building types and traffic patterns. While the algorithm has greater computational
costs than most conventional algorithms, it is linear in the size of the building and
number of shafts, and quadratic in the number of passengers, and is completely within
the computational capabilities of currently existing elevator bank control systems.
This paper has been presented at the 13th International Conference on Automated Planning and Scheduling
ICAPS’03, June 9-13, 2003, Trento, Italy.
This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in
whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such
whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research
Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions
of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment
of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.
Copyright © Mitsubishi Electric Research Laboratories, Inc., 2003
201 Broadway, Cambridge, Massachusetts 02139
Submitted June 2003.
Keywords: decision-theoretic planning and scheduling, applications of planning and
scheduling, group elevator scheduling, dynamic programming, Markov chains

Copyright © 2003, American Association for Artificial Intelligence (www.aaai.org).
All rights reserved.

Introduction

Group elevator scheduling is a hard problem that has been researched extensively due to
its high practical significance (Bao et al. 1994; Koehler & Ottiger 2002). The problem is
simply stated: New passengers arrive at a bank of elevators at random times and floors,
making hall calls to signal for rides up or down. A ride destination is unknown until the
passenger enters the car and makes a car call to request a stop. The scheduler must assign
a car to serve each hall call in a way that optimizes overall system performance. The
execution of the schedule is performed by alternating the direction of movement of each
car and servicing all hall calls assigned to it in its current direction of motion.

The usual performance criterion to be optimized when scheduling passenger pick-ups is the
average waiting time (AWT) of all passengers in the system, i.e., the time period from the
moment a passenger arrives until the moment this passenger boards some car, averaged over
many arrivals. Alternative criteria are sometimes used as well, such as the average system
time, defined as the time until a passenger arrives at the desired floor, or the average
squared waiting time, which expresses a preference for a small variance in wait times.

Minimizing any of these criteria is an extremely complicated problem, for at least three
reasons. First, the state space of the system is huge, since it is indexed by the position,
direction, and velocity of all cars, the number of passengers inside each car, and the
number waiting at each floor to board a car. A truly optimal scheduler would have to
consider all this information when deciding how to serve a newly arrived passenger. Second,
the dynamics of the system are accompanied by a large amount of uncertainty. While the
motion of a car is completely determined by its current schedule, this schedule changes
constantly, because it depends on the future arrival of passengers, which is a stochastic
process. Passenger arrival events contain three types of uncertainty: the time of arrival,
the floor of arrival, and the passenger's final destination. Third, if the scheduler is
allowed to revoke previous assignments and continuously reassign calls to cars, an
exponential number of calls-to-cars assignments have to be considered in a short time.
(This mode of operation, typical of elevator systems in western countries, is known as a
reassignment policy; the converse mode, when the scheduler never reconsiders an assignment
after it has been made, is known as an immediate policy, and is typical in Japan.)

The insurmountable combinatorial complexity and stochastic nature of the problem have led
practitioners in the field of elevator scheduling to consider alternative, more tractable
optimization criteria that are hoped to correlate well with the AWT of passengers. The
following section discusses several practical approaches used in commercial systems, as
well as several academic studies in which idealizations of the problem have led to
important insights, albeit with limited practical applicability. Our approach is rooted in
the academic work, but the result is very practical: A linear-time algorithm that directly
optimizes the AWT. Subsequent sections introduce the assumptions and general operation of
our algorithm, and describe how the general
procedure, which has exponential complexity, is reduced to an efficient algorithm by
embedding a discrete-state Markov chain in the continuous phase space of an elevator car,
and evaluating the AWT represented by the chain by means of dynamic programming.
Experimental results on a detailed commercial-grade simulator are presented as well.

Supervisory Group Elevator Scheduling

Group elevator control is a specific planning and scheduling problem characterized by a
very large state space, significant uncertainty, and numerous resource constraints such as
finite car capacities, previous car calls, etc. As a result, most of its early proposed
solutions have not been based on either classical AI planning or decision-theoretic
planning, but rather on ad hoc approximations and heuristics.

The oldest elevator schedulers used the principle of collective control (Strakosch 1998),
according to which cars always stop at the nearest call in their running direction. This
strategy is far from optimal and usually results in bunching, the phenomenon where several
cars arrive at the same floor at about the same time, with all cars but one wasting time.
Hikihara and Ueshima (Hikihara & Ueshima 1997) analyzed the jamming effect occurring in
down-peak traffic and concluded that it was due to emergent synchronization between
multiple cars. Another approach is zoning, or sectoring, where the building is divided into
zones, and each car is assigned a single zone. While this approach avoids bunching, it is
also suboptimal when many passenger arrivals occur in the same zone (Barney & dos Santos
1985; Strakosch 1998).

Otis Elevator Company uses an optimization criterion consisting of a weighted sum of
bonuses and penalties, called Relative System Response (RSR), computed for each car in turn
(Bittar 1982). This criterion is largely heuristic and its relation to actual AWT is not
clear. Otis also uses another criterion called Remaining Response Time (RRT), defined as
the time necessary for a car to reach the floor of the new hall call, given its current
commitments for loading and unloading passengers already assigned to it (Powell & Williams
1992). This criterion is in fact well correlated with the AWT of the passengers who have
signaled the current hall call, but misses completely the effect a potential assignment
would have on hall calls previously assigned to that car. Furthermore, RRT includes only
the time necessary for a car to pick up passengers assigned to it, but ignores the time
required for these passengers to get off, since it is not known whether they would
disembark before or after the new hall call is serviced.

Another group of scheduling methods uses fuzzy-logic rules which are supposed to prescribe
the correct assignment of a car to a new hall call in a small number of prototypical
situations (Ujihara & Tsuji 1988; Ujihara & Amano 1994). The ability of fuzzy inference to
generalize over similar situations is used to ensure coverage of the whole state space of
the system. The rules are either elicited from experts, or induced from a training set. It
is a matter of speculation whether the rules are correct even in the prototypical
situations, and whether the fuzzy inference mechanism generalizes those rules correctly to
novel situations.

More rigorously motivated methods use approximations of the desired performance criterion
that are computable in reasonable time. Kone Corporation employs a method called the
Enhanced Spacing Principle, which computes an approximation of AWT based on estimating the
probable number of stops and most likely reversal floor of a car servicing its current
commitments (Siikonen 1997). The method proposed by Cho, Gagov, and Kwon uses the same
idea, and extends the method to handle arbitrary probability distributions over destination
floors (Cho, Gagov, & Kwon 1999). The correct computation of the probable number of stops
and most likely car reversal floors is essential to these methods, and usually several
well-known statistical formulae are employed (Barney & dos Santos 1985). These formulae,
however, are only applicable under the very restrictive assumption that an elevator car
would reach its contract (maximal) speed within a distance no longer than half the space
between two neighboring floors, i.e., it would be able to accelerate fully and stop
completely during a trip between two consecutive floors. This assumption is grossly
violated for modern elevators: even a typical elevator with contract speed of 180 m/min
needs at least three floors to accelerate fully and stop completely, and the world's
fastest elevators, installed by Mitsubishi Electric in the Yokohama Landmark Tower, have
contract speeds of 750 m/min and require 39 floors in order to reach maximal speed and then
come back to rest (Tanahashi & Araki 1994).

The problem of group elevator scheduling has also been approached from the point of view of
classical planning, expressing the problem domain using the PDDL formalism (Koehler &
Schuster 2000; Koehler 2001). Several obstacles to this approach have been identified, such
as the lack of support for metric/resource constraints, no consideration for cost functions
during planning, and lack of optimization capabilities. To these, we will also add the
inability of classical planners to reason and plan under uncertainty; in fact, the proposed
PDDL-based solution is only applicable to elevator banks with full destination control,
i.e., all ride destinations are registered in advance on a destination panel (Koehler &
Ottiger 2002). However, decision-theoretic planning, which is an extension of classical
planning to stochastic and partially-observable domains, in conjunction with queueing
theory models, seems like a very suitable formalism for this problem (Boutilier, Dean, &
Hanks 1999).

Special traffic patterns, such as down-peak and up-peak traffic, can be handled very
efficiently by special-purpose algorithms based on queueing theory. A provably optimal
solution has been obtained for the case of pure up-peak traffic, when all passengers arrive
in the lobby at a fixed rate and no other departure floors are allowed (Pepyne & Cassandras
1997). In order to make the problem tractable, however, the service time of elevators has
been assumed to come from a fixed exponential distribution. This assumption, along with the
requirement for pure up-peak traffic, severely limits the usefulness of the algorithm in
practical schedulers.

For the case of down-peak traffic, similarly efficient algorithms are Finite Intervisit
Minimization (FIM) and Empty the System Algorithm (ESA) (Bao et al. 1994). While they
have been demonstrated to outperform simpler algorithms by a margin of 34%, FIM and ESA are
only applicable to down-peak traffic, because they assume that the destination of all
passengers is the lobby. As soon as there is uncertainty in passengers' destinations, the
assumptions of these algorithms are violated and they cannot be expected to perform well.

Nevertheless, the ESA algorithm contains a very important idea: Instead of minimizing AWT
of all known and future arrivals, it limits the optimization only to the residual waiting
time (RWT) of the passengers currently in the system. This includes all passengers
currently in elevator cars and all passengers who have signaled their presence by making
hall calls for elevators. The RWT of a single passenger is defined as the time between the
current moment and the moment this passenger is picked up by a car (Bao et al. 1994).
Minimizing the RWT of known passengers instead of the AWT of all known and future
passengers is equivalent to the assumption that the current decision (assignment of a car
to the current hall call) would not influence the waiting times of future passengers. While
this assumption is clearly not true and should lead to suboptimal policies, its
consequences can be expected to be relatively benign, since it can be expected that the
stochasticity of the arrival process would eliminate the influence of the current decision
in the long run.

In computational terms, this assumption eliminates two of the three sources of uncertainty
identified above: the exact arrival times of passengers and the exact floor of their
arrival. This is due to the fact that the scheduler need only consider those passengers who
have already arrived but have not been served yet; their exact arrival times and floors are
known. The only uncertainty that remains is that of their destination floors. In this paper
we show that one can compute exactly the expectation of the RWT of all known passengers
with respect to an arbitrary probability distribution of their destination floors.

Before continuing, we will note the existence of algorithms that also consider the
consequences of the current decision on future arrivals. Crites and Barto demonstrated an
asynchronous algorithm for stochastic optimal control which uses neural networks and
Q-learning (Crites & Barto 1996). Although their algorithm performed slightly better than
FIM and ESA for one specific down-peak scenario (by 2.65%), it took 60,000 hours of
simulated elevator operation to converge, which is not practical for real elevator systems.
While trainable algorithms seem very promising for this problem area, the issue of correct
generalization over the enormous state space and infinite horizon is very forbidding.

Dynamic Programming for Exact Computation of Expected Average Waiting Times

Initial Assumptions

Our key assumption, motivated above, is that future arrivals need not be factored into
decisions about current passengers. There are in fact several ways in which this can be
relaxed; we begin with the strong assumption for simplicity of exposition. Under this
assumption, the most informed decisions are made when the waiting times are estimated all
the way out to the horizon where all known passengers have been delivered to their
destinations. Hence this is an empty-the-system algorithm, but unlike the original ESA
algorithm, our approach accommodates all traffic patterns and uncertainty about the state
of the system. In short, we retain the ESA strategy but introduce new machinery for
inference.

We will initially describe the algorithm under the assumption that the destination floors
of passengers are equally likely, and later on explain how non-uniform destination
probabilities can be handled (at a significant computational expense). We also assume that
the full state of the system is known to the scheduler; most importantly, we assume that
the number of people standing on each floor is known. While such information cannot be
obtained only by inspecting the number of hall buttons pressed, approximations of various
quality exist and will be discussed below. A scheduler operating under this assumption is
known as an omniscient scheduler.

Another key assumption concerns the way passengers assigned to a car are being served. In
general, if $n$ passengers are assigned to a car, there are $n!$ possible orders for them
to be picked up. If all orders are allowed and will be considered by the scheduler, the
corresponding planning problem has been shown to be NP-hard even for a single car
(Seckinger & Koehler 1999). However, there exists a simple order of serving passengers
assigned to a car that also conforms well to passengers' expectations and is rarely
suboptimal: Keep moving in the current direction until all passengers who have requested
rides in this direction are picked up and delivered; after that, move to the first hall
call in the opposite direction, and repeat the same procedure for all opposite hall calls.
Our planner assumes that all assignments would be served in this manner.

The discussion in this section assumes an immediate assignment policy, as is customary in
Japan, i.e., new assignments are appended to the current schedule and previous assignments
are never changed. Using the algorithm for a reassignment policy would simply involve
employing it as a subroutine on a set of proposed assignments, generated either
exhaustively in a combinatorial manner, or after pruning the set of candidate assignments
by means of heuristics or a systematic branch-and-bound algorithm, similarly to other
scheduling algorithms (Bao et al. 1994). The last assumption we are making is that each car
has infinite capacity. While not realistic, this assumption significantly simplifies the
algorithm, and we will discuss possible ways to relax it.

Optimization Criterion

Whenever a new hall call is generated at a particular floor in a particular direction, the
algorithm minimizes the total residual waiting time of all currently waiting passengers,
including the new arrival. All such passengers except the new one have already been
assigned to a car; under the immediate assignment policy, their assignments will never be
reconsidered. If the elevator group has a total of $N_c$ cars, let $W_i^-$,
$i \in [1, N_c]$, denote the expected waiting time of all
passengers currently assigned to car $i$, excluding the newly arrived passenger(s)
signaling the current hall call, and similarly, let $W_i^+$, $i \in [1, N_c]$, denote the
expected waiting time of all passengers currently assigned to car $i$, including the newly
arrived passenger(s). We can then compute the expected waiting time $W_i$ associated with
assigning the new call to car $i$ as

$$W_i = W_i^+ + \sum_{\substack{j=1 \\ j \neq i}}^{N_c} W_j^-, \qquad i \in [1, N_c].$$

The car $c$ chosen by the scheduler for assignment is the one that minimizes the total
expected RWT: $c = \arg\min_i W_i$. Note that since the number of waiting passengers is
constant at the time of a particular decision step, such an assignment would also minimize
the average expected RWT of current passengers, which is computed as the total RWT of all
passengers divided by their number.

If $W^- \doteq \sum_{i=1}^{N_c} W_i^-$, the waiting times for each possible assignment can
be expressed as $W_i = W_i^+ - W_i^- + W^-$. Since $W^-$ is the same for each $i$, the
assignment which minimizes $\Delta W_i = W_i^+ - W_i^-$ is also the one which minimizes
$W_i$. As a result, the optimal assignment can be found by computing $W_i^+$ and $W_i^-$
for each car, and choosing the car for which their respective difference is minimal.

Computing $W_i^+$ and $W_i^-$ for a particular car $i$ is essentially the same problem. For
$W_i^-$ we compute the expected RWT given the state of the system and all currently
scheduled elevator-to-passenger assignments. For $W_i^+$ we temporarily add the new
passenger to elevator $i$'s itinerary and recompute the expected RWT.

Effect of Uncertainty in Passengers' Destination

By definition, $W$ is the expected total RWT for all passengers currently assigned to be
served by a car, subject to the constraints imposed by the car's current position,
direction, and velocity, and the currently mandated stops at requested destination floors.
The expectation of the waiting time is taken with respect to the uncertainty in the
destinations of passengers who are yet to be picked up by the car. Since only their
requested direction of travel is known, their destination can be any of the remaining
floors in that direction. This is in contrast to the original ESA algorithm (Bao et al.
1994), which only works for pure down-peak traffic in which all passengers are delivered to
the lobby. In classical ESA, the exact path traveled by the car is known, and the total
waiting time of all passengers can be found by summing up the car travel times between
floors where passengers are to be picked up, weighted by the number of passengers still
waiting.

A straightforward extension of the ESA approach to the case when the destination of
passengers is not known can be implemented by considering all possible destinations of each
passenger (respectively, all possible paths the car can take), computing the total waiting
time along each path, and weighting these times by the probability of the respective path.
This is equivalent to generating a tree containing all possible futures of the system
(disregarding future passengers), and computing a weighted sum of the waiting times over
all paths from the root to the leaves. If $N_p$ passengers are assigned to a car in a
building of $N_f$ floors, each of the passengers has $O(N_f)$ possible destinations, and
the total complexity of such an implementation would be $O(N_f^{N_p})$, which is
prohibitively high.

Estimation of Expected RWT in Linear Time by Means of Dynamic Programming

It is possible, however, to reduce the complexity of computation to $O(N_f N_p)$ by casting
the problem into a dynamic programming framework. We will call the corresponding algorithm
ESA-DP (ESA by Dynamic Programming).

Dynamic programming is commonly employed in stochastic scheduling algorithms where cost
estimates on segments of a system's path can be reused in multiple paths (Bertsekas 2000).
In order to solve a problem this way, one must typically discretize the state and identify
branch points where system paths converge and then diverge again, so that the costs on a
segment between two such points can be computed only once, and then reused for the
computation of costs along all paths which include this segment.

Trajectory Structure of an Elevator Car

Such branching points can readily be identified on the phase-space diagram of an elevator
car shown in Fig. 1. Like any moving mechanical system, a car traveling in an elevator
shaft has a phase-space diagram which describes the possible coordinates $(x, \dot{x})$ for
the car's position along the shaft $x$ and its velocity $\dot{x}$. When the car is moving
under constant acceleration without friction, its trajectory consists of segments which are
parts of parabolae, and more complicated equations of motion result in slightly different
shapes of the traversed trajectories. However, even when the equations of motion of a car
are nonlinear (e.g., include gear backlash) and/or include position derivatives higher than
acceleration (e.g., jerk with a specified magnitude and duration), the motion of the car is
very predictable and can be realized only on a small number of trajectories. Accordingly,
these trajectories branch only on a small number of points, denoted by circles in Fig. 1,
which always correspond to the last possible location at which a car should start
decelerating if it is to stop at a particular floor in its direction of motion. A
particular path of a car during its round-trip always consists of a finite number of such
segments, whose endpoints are branching or resting points. Consequently, if the waiting
time on each such segment can be computed, it can be reused for the computation along many
paths that include that segment.

Figure 1: A schematic illustration of the phase space of a single car moving upwards in a
shaft of a building with eight floors, not all of which have equal height. All branching
points are denoted by circles. (Horizontal axis: position $x$, in floors; vertical axis:
velocity $\dot{x}$.)

Reusing the costs on all individual segments can be achieved by embedding a discrete Markov
chain into the original system of elevator movement, which in itself operates in continuous
time and space. A Markov chain consists formally of a finite number of states $S_i$,
$i \in [1, N_s]$, an immediate cost $C_{ij}$ of the transition between each pair of states
$S_i$ and $S_j$, a matrix $P_{ij}$ of the probabilities of transition between states $S_i$
and $S_j$, and a distribution $\pi(S_i)$ which specifies the probability that the system
would start in state $S_i$ (Bertsekas 2000).

In order for the chain to be Markovian, it should obey the Markov property: The probability
$P_{ij}$ of transitioning to state $S_j$ should depend only on the starting state $S_i$,
and not on the trajectory of the system before it entered $S_i$. If we define the states of
the system to correspond only to the branching points in the phase-space diagram, the
resulting chain would not be Markovian, because the probability of each branch depends on
how many people are currently inside the car, and that number depends on how many of all
waiting people have already been transported to their destinations in previous stops of the
car.

Consequently, the number of people in the car must be included in the state of the Markov
chain as well. However, the state need only encode the number of currently waiting people
who will board the car after the moment of assignment decision. This number does not
include people who are already in the car at that time and have signaled their destinations
by pressing car buttons. These "in-car" passengers influence the motion of the car too, by
imposing constraints on its motion in the form of obligatory car stops, but these
constraints are deterministic and have no impact on transition probabilities. These
probabilities depend only on the uncertainty in the destinations of the passengers who are
yet to board the car (see footnote 1).

Footnote 1: Technically, the disembarkation times of in-car passengers have some impact on
the RWT of waiting in-hall passengers. Uncertainty over how many in-car passengers will
disembark at each of the (known) requested stops could be marginalized out via dynamic
programming with an expanded state descriptor. However, this represents a respectable
increase in run-time for a negligible gain in accuracy; disembarkation times are very small
relative to other time costs. We approximate the exact cost by assessing an $N/n$-second
disembarkation cost for $N$ in-car passengers and $n$ requested stops. The disembarkation
times of to-be-boarded passengers are modeled exactly.

Accordingly, a state $S_i$ of the Markov chain is described by the four-tuple
$(f, d, v, n)$, where $f$ is the floor of the car, $d$ is its current direction, $v$ is its
current velocity, and $n$ is the number of newly boarded passengers, or more precisely, the
number of waiting passengers who enter the car in the course of evaluating the Markov
chain. The variables $d$ and $n$ are discrete, and have predefined ranges: $d$ can take
only two values, "up" and "down", while $n$ ranges from 0 to the maximum number of
passengers assigned to a car, traveling in either direction. (This maximum number is
reached, for example, when all passengers intend to get off the car at the last floor in
the current direction of motion.)

The variables $f$ and $v$, however, are essentially continuous, and in order to make the
problem tractable, they have to be discretized. An inspection of the phase-space diagram
suggests a straightforward discretization scheme for the velocity $v$: it can be seen that
while accelerating from a particular floor, the car reaches branching points along its
trajectory only at a small number of velocities (four in Fig. 1, including the quiescent
state, when the velocity is zero). The reason for this is the limit on the maximum speed of
any real elevator car. Depending on the inter-floor distance, maximum speed, and
acceleration of the motors, this number of distinct velocities at branching points can be
lower (for longer inter-floor distances, lower maximum speed, and greater acceleration), or
higher (for shorter inter-floor distances, higher maximum speed, and lower acceleration).
For a particular building and the elevator bank installed in it, this number is fixed and
can be found easily, so henceforth we assume it is known and will denote it by $N_v$.
Hence, the variable $v$ would only take $N_v$ discrete values, ranging from 0 (rest) to
$N_v - 1$ (maximum speed). Note that the same value of $v$ can correspond to different
physical velocities, depending on which floor the car stopped at last. Another
interpretation of this variable is the number of branching points a car has encountered
since its last stop.

There are several ways to discretize the floor variable $f$, the obvious one being to round
the physical location of the car to the nearest floor. While such a discretization is
possible, the resulting value for the floor is not conveniently related to the particular
branching point represented by the Markov chain. A much more convenient discretization
scheme is to choose for a value of $f$ the floor at which the car will stop if it starts
decelerating at that branching point. The advantage of such a discretization scheme becomes
apparent if we organize the states of the Markov chain in a regular structure, commonly
called a trellis in dynamic programming algorithms.

Structure and Parameters of the Embedded Markov Chain

Fig. 2 shows a dynamic programming trellis for one particular Markov chain which
corresponds to the situation when a car is moving down and is about to reach the branching
point at which it will stop at floor 13, if it decelerates. It has already been scheduled
to pick up a passenger at floor 7, and the scheduler is considering whether this car should
also pick up a new hall call down, originating at floor 11. The embedded Markov chain has
84 states which are placed in a trellis matrix of 7 rows and 12 columns. States in a row
represent branching points that share the property that the car will stop at the same
floor, if it starts decelerating immediately. Note that this applies to branching points
reached when the car is moving in a particular direction; when it is moving in the opposite
direction, the branching points generally have different positions in the phase-space
diagram. The corresponding row of the trellis is labelled with the floor
at which the car can stop, as well as the direction of the movement of the car when it reaches the branching points. Since there is a separate row for each floor and direction, the trellis can have at most 2N_f rows.

Figure 2: Simplified trellis structure for the embedded Markov chain of a single descending car. Rows signify floors; columns signify the number of recently boarded passengers; column groups signify elevator speeds. The empty descending car is about to reach the branching point for a possible stop at floor 13. It has been assigned hall calls at floors 7 and 11, each of which may increase the passenger load by one.

The states in each row of the trellis are organized into N_v groups (4 in Fig. 2), corresponding to the N_v possible velocity values at branching points (ordered so that the leftmost column corresponds to zero velocity, and the rightmost column corresponds to the maximum velocity of the car). Within a group, the states correspond to the number of people who are currently in the car and who were waiting in the halls at the beginning of the trellis (ranging from 0 to 2 in Fig. 2). If the maximum number of people that can be in the car at the same time is H (2 in Fig. 2), the width of the trellis is N_v(H + 1) states. This organization of states constitutes the trellis of the dynamic programming problem. It can be seen that not all of the states in the trellis can be visited by the car, because its motion is constrained by the current hall and car calls.

If we assume that the floor-value component f of the four-tuple used to describe a branching point is that of the floor where the car will stop, if it starts decelerating at this branching point, the first row of the trellis always contains the first branching point which the car will reach. Similarly, under this convention, the last row of the trellis always corresponds to the floor where the last passenger along the round-trip of the car will be picked up. This arrangement of rows conveniently spans the horizon which the dynamic programming algorithm has to consider, because the last moment which has to be considered is always the moment the last waiting passenger is picked up; after that, the residual waiting time of passengers assigned to the current car becomes zero.

The total cost C_ij incurred on a segment, measured as the waiting time of passengers who have not been picked up yet, can be expressed simply as the product of the number of these passengers and the duration of the segment.

The last remaining components of the embedded Markov chain are the transition probabilities P_ij between each pair of states S_i and S_j. A large number of these transitions are deterministic and are always taken with probability one. Such are the transitions resulting from existing car and hall calls. For example, the initial trajectory of the car from floor 13 to floor 11 in Fig. 2 is deterministic: the empty car accelerates until it reaches the branching point for stopping at floor 11, where it stops to pick up the first hall-call passenger waiting there. After that, the car accelerates again until it reaches the branching point for stopping at floor 10, from which it can take many different paths, depending on the unknown destination of that passenger.

At the branching point of floor 10, the passenger might be getting off at one of the next 10 floors, and hence the probability that this would be exactly floor 10 is 0.1. With probability 0.9, the passenger would not get off at floor 10, and the car will continue accelerating until the branching point for floor 9, with one passenger still on board, as reflected in the diagram of the Markov chain.

In the general case, when the car has k floors to go with n passengers on board, and we assume that a passenger would get off at any of the k floors with equal probability (1/k), we can find the probability that x people would want to get off at the next floor by using the formula for the binomial probability function:

$$\Pr(x, n, k) = \frac{n!}{(n-x)!\,x!} \cdot \frac{(k-1)^{n-x}}{k^{n}} \qquad (1)$$

Therefore n − x people would remain on board the car with probability Pr(x, n, k). A similar treatment will give Pr(x, n, k) when the destination probabilities are nonuniform but independent of where each passenger gets on; we show below how to exploit a matrix of destination probabilities that are conditioned on the floor of arrival. The number of remaining people n − x specifies which state within a group the Markov chain would enter with probability Pr(x, n, k), but we still have to find which group (velocity setting) this state would be in. This velocity setting can be determined by inspecting the existing car and hall calls, as well as the number x of people getting off. If x > 0 or there is a mandatory stop at the next floor due to a car or hall call, the velocity v at the next state would be zero; only when x = 0 (nobody gets off the car at the next floor) and there are no car or hall calls for this floor would the car accelerate (or maintain maximum speed, if it has already reached it).

The initial and terminal states of the Markov chain are always unique, and can easily be found by locating the first branching point the car would enter along its current trajectory, and the floor where the last waiting passenger would be picked up, respectively. The distance between the rows of the trellis containing the initial and terminal states effectively spans the planning horizon of the dynamic programming algorithm.

Once the initial state of the Markov chain has been found, the whole chain can be built by propagating the set of states which can be visited by the car from the initial state. The chosen arrangement of the states into a dynamic programming trellis provides a convenient order for doing this. By
inspecting the order of transitions in Fig. 2, it can be seen that if a transition is between different rows, the starting state is always in a row above the successor state, and if a transition is within the same row, the starting state is always to the right of the successor state. Consequently, the propagation of states proceeds row-wise from the upper-right corner of the trellis until the lower-left corner is reached.

[Figure: plot titled “Per-trial improvement in response speed”, with a region labelled “Conventional better” and contour labels 10%, 20%, 30%.]

Each state might have multiple successor states, depending on the num-