Decision-Theoretic Group - Elevator Scheduling


MITSUBISHI ELECTRIC RESEARCH LABORATORIES

http://www.merl.com

Decision-Theoretic Group
Elevator Scheduling

Daniel Nikovski
Matthew Brand

TR2003-61 June 2003

Abstract
We present an efficient algorithm for exact calculation and minimization of expected
waiting times of all passengers using a bank of elevators. The dynamics of the system
are represented by a discrete-state Markov chain embedded in the continuous phase-
space diagram of a moving elevator car. The chain is evaluated efficiently using dy-
namic programming to compute measures of future system performance such as ex-
pected waiting time, properly averaged over all possible future scenarios. An elevator
group scheduler based on this method significantly outperforms a conventional algo-
rithm based on minimization of proxy criteria such as the time needed for all cars to
complete their assigned deliveries. For a wide variety of buildings, ranging from 8 to
30 floors, and with 2 to 8 shafts, our algorithm reduces waiting times up to 70% in
heavy traffic, and exhibits an average waiting-time speed-up of 20% in a test set of
20,000 building types and traffic patterns. While the algorithm has greater computa-
tional costs than most conventional algorithms, it is linear in the size of the building and
number of shafts, and quadratic in the number of passengers, and is completely within
the computational capabilities of currently existing elevator bank control systems.
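
As a rough illustration of the idea summarized above, the following Python sketch evaluates the expected accumulated cost of a small acyclic Markov chain in two ways: by enumerating every path, and by dynamic programming over states. The toy chain, its costs and probabilities, and the names `CHAIN`, `expected_cost_enum`, and `expected_cost_dp` are assumptions made for illustration only; they are not taken from the paper's implementation.

```python
from functools import lru_cache

# Hypothetical toy chain: state -> list of (next_state, probability, cost).
# States with no outgoing transitions are terminal. All values are illustrative.
CHAIN = {
    "start": [("a", 0.5, 2.0), ("b", 0.5, 3.0)],
    "a": [("c", 1.0, 1.0)],
    "b": [("c", 0.25, 4.0), ("end", 0.75, 1.0)],
    "c": [("end", 1.0, 2.0)],
    "end": [],
}


def expected_cost_enum(state):
    """Enumerate every path from `state`; sum (path probability * path cost)."""
    total = 0.0
    stack = [(state, 1.0, 0.0)]
    while stack:
        s, prob, cost = stack.pop()
        if not CHAIN[s]:  # terminal state: one complete path has been followed
            total += prob * cost
            continue
        for nxt, p, c in CHAIN[s]:
            stack.append((nxt, prob * p, cost + c))
    return total


@lru_cache(maxsize=None)
def expected_cost_dp(state):
    """Expected cost-to-go of `state`, computed once per state and then reused."""
    return sum(p * (c + expected_cost_dp(nxt)) for nxt, p, c in CHAIN[state])


print(expected_cost_enum("start"), expected_cost_dp("start"))
```

Both functions return the same expectation. The enumeration follows each path separately, so shared segments such as the transition out of state `c` are re-evaluated for every path through them, whereas the memoized version computes each state's cost-to-go once. This is the kind of reuse the abstract attributes to dynamic programming.
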
This paper has been presented at the 13th International Conference on Automated Planning and Scheduling
ICAPS’03, June 9-13, 2003, Trento, Italy.

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in
whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such
whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research
Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions
of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment
of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.

Copyright © Mitsubishi Electric Research Laboratories, Inc., 2003
201 Broadway, Cambridge, Massachusetts 02139
Submitted June 2003.

Keywords: decision-theoretic planning and scheduling, applications of planning and scheduling, group elevator scheduling, dynamic programming, Markov chains

Copyright © 2003, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

Group elevator scheduling is a hard problem that has been researched extensively due to its high practical significance (Bao et al. 1994; Koehler & Ottiger 2002). The problem is simply stated: New passengers arrive at a bank of elevators at random times and floors, making hall calls to signal for rides up or down. A ride destination is unknown until the passenger enters the car and makes a car call to request a stop. The scheduler must assign a car to serve each hall call in a way that optimizes overall system performance. The execution of the schedule is performed by alternating the direction of movement of each car and servicing all hall calls assigned to it in its current direction of motion.

The usual performance criterion to be optimized when scheduling passenger pick-ups is the average waiting time (AWT) of all passengers in the system, i.e., the time period from the moment a passenger arrives until the moment this passenger boards some car, averaged over many arrivals. Alternative criteria are sometimes used as well, such as the average system time, defined as the time until a passenger arrives at the desired floor, or the average squared waiting time, which expresses a preference for a small variance in wait times.

Minimizing any of these criteria is an extremely complicated problem, for at least three reasons. First, the state space of the system is huge, since it is indexed by the position, direction, and velocity of all cars, the number of passengers inside each car, and the number waiting at each floor to board a car. A truly optimal scheduler would have to consider all this information when deciding how to serve a newly arrived passenger. Second, the dynamics of the system are accompanied by a large amount of uncertainty. While the motion of a car is completely determined by its current schedule, this schedule changes constantly, because it depends on the future arrival of passengers, which is a stochastic process. Passenger arrival events contain three types of uncertainty: the time of arrival, the floor of arrival, and the passenger's final destination. Third, if the scheduler is allowed to revoke previous assignments and continuously reassign calls to cars, an exponential number of calls-to-cars assignments have to be considered in a short time. (This mode of operation, typical of elevator systems in western countries, is known as a reassignment policy; the converse mode, when the scheduler never reconsiders an assignment after it has been made, is known as an immediate policy, and is typical in Japan.)

The insurmountable combinatorial complexity and stochastic nature of the problem have led practitioners in the field of elevator scheduling to consider alternative, more tractable optimization criteria that are hoped to correlate well with the AWT of passengers. The following section discusses several practical approaches used in commercial systems, as well as several academic studies in which idealizations of the problem have led to important insights, albeit with limited practical applicability. Our approach is rooted in the academic work, but the result is very practical: A linear-time algorithm that directly optimizes the AWT. Subsequent sections introduce the assumptions and general operation of our algorithm, and describe how the general
procedure, which has exponential complexity, is reduced to an efficient algorithm by embedding a discrete-state Markov chain in the continuous phase space of an elevator car, and evaluating the AWT represented by the chain by means of dynamic programming. Experimental results on a detailed commercial-grade simulator are presented as well.

Supervisory Group Elevator Scheduling

Group elevator control is a specific planning and scheduling problem characterized by a very large state space, significant uncertainty, and numerous resource constraints such as finite car capacities, previous car calls, etc. As a result, most of its early proposed solutions have not been based on either classical AI planning or decision-theoretic planning, but rather on ad hoc approximations and heuristics.

The oldest elevator schedulers used the principle of collective control (Strakosch 1998), according to which cars always stop at the nearest call in their running direction. This strategy is far from optimal and usually results in bunching, the phenomenon where several cars arrive at the same floor at about the same time, with all cars but one wasting time. Hikihara and Ueshima (Hikihara & Ueshima 1997) analyzed the jamming effect occurring in down-peak traffic and concluded that it was due to emergent synchronization between multiple cars. Another approach is zoning, or sectoring, where the building is divided into zones, and each car is assigned a single zone. While this approach avoids bunching, it is also suboptimal when many passenger arrivals occur in the same zone (Barney & dos Santos 1985; Strakosch 1998).

Otis Elevator Company uses an optimization criterion consisting of a weighted sum of bonuses and penalties, called Relative System Response (RSR), computed for each car in turn (Bittar 1982). This criterion is largely heuristic and its relation to actual AWT is not clear. Otis also uses another criterion called Remaining Response Time (RRT), defined as the time necessary for a car to reach the floor of the new hall call, given its current commitments for loading and unloading passengers already assigned to it (Powell & Williams 1992). This criterion is in fact well correlated with the AWT of the passengers who have signaled the current hall call, but misses completely the effect a potential assignment would have on hall calls previously assigned to that car. Furthermore, RRT includes only the time necessary for a car to pick up passengers assigned to it, but ignores the time required for these passengers to get off, since it is not known whether they would disembark before or after the new hall call is serviced.

Another group of scheduling methods uses fuzzy-logic rules which are supposed to prescribe the correct assignment of a car to a new hall call in a small number of prototypical situations (Ujihara & Tsuji 1988; Ujihara & Amano 1994). The ability of fuzzy inference to generalize over similar situations is used to ensure coverage of the whole state space of the system. The rules are either elicited from experts, or induced from a training set. It is a matter of speculation whether the rules are correct even in the prototypical situations, and whether the fuzzy inference mechanism generalizes those rules correctly to novel situations.

More rigorously motivated methods use approximations of the desired performance criterion that are computable in reasonable time. Kone Corporation employs a method called the Enhanced Spacing Principle, which computes an approximation of AWT based on estimating the probable number of stops and most likely reversal floor of a car servicing its current commitments (Siikonen 1997). The method proposed by Cho, Gagov, and Kwon uses the same idea, and extends the method to handle arbitrary probability distributions over destination floors (Cho, Gagov, & Kwon 1999). The correct computation of the probable number of stops and most likely car reversal floors is essential to these methods, and usually several well-known statistical formulae are employed (Barney & dos Santos 1985). These formulae, however, are only applicable under the very restrictive assumptions that an elevator car would reach its contract (maximal) speed within a distance no longer than half the space between two neighboring floors, i.e., it would be able to accelerate fully and stop completely during a trip between two consecutive floors. This assumption is grossly violated for modern elevators: even a typical elevator with contract speed of 180 m/min needs at least three floors to accelerate fully and stop completely, and the world's fastest elevators, installed by Mitsubishi Electric in the Yokohama Landmark Tower, have contract speeds of 750 m/min and require 39 floors in order to reach maximal speed and then come back to rest (Tanahashi & Araki 1994).

The problem of group elevator scheduling has also been approached from the point of view of classical planning, expressing the problem domain using the PDDL formalism (Koehler & Schuster 2000; Koehler 2001). Several obstacles to this approach have been identified, such as the lack of support for metric/resource constraints, no consideration for cost functions during planning, and lack of optimization capabilities. To these, we will also add the inability of classical planners to reason and plan under uncertainty; in fact, the proposed PDDL-based solution is only applicable to elevator banks with full destination control, i.e., all ride destinations are registered in advance on a destination panel (Koehler & Ottiger 2002). However, decision-theoretic planning, which is an extension of classical planning to stochastic and partially-observable domains, in conjunction with queueing theory models, seems like a very suitable formalism for this problem (Boutilier, Dean, & Hanks 1999).

Special traffic patterns, such as down-peak and up-peak traffic, can be handled very efficiently by special-purpose algorithms based on queueing theory. A provably optimal solution has been obtained for the case of pure up-peak traffic, when all passengers arrive in the lobby at a fixed rate and no other departure floors are allowed (Pepyne & Cassandras 1997). In order to make the problem tractable, however, the service time of elevators has been assumed to come from a fixed exponential distribution. This assumption, along with the requirement for pure up-peak traffic, severely limits the usefulness of the algorithm in practical schedulers.

For the case of down-peak traffic, similarly efficient algorithms are Finite Intervisit Minimization (FIM) and Empty the System Algorithm (ESA) (Bao et al. 1994). While they
have been demonstrated to outperform simpler algorithms by a margin of 34%, FIM and ESA are only applicable to down-peak traffic, because they assume that the destination of all passengers is the lobby. As soon as there is uncertainty in passengers' destinations, the assumptions of these algorithms are violated and they cannot be expected to perform well.

Nevertheless, the ESA algorithm contains a very important idea: Instead of minimizing AWT of all known and future arrivals, it limits the optimization only to the residual waiting time (RWT) of the passengers currently in the system. This includes all passengers currently in elevator cars and all passengers who have signaled their presence by making hall calls for elevators. The RWT of a single passenger is defined as the time between the current moment and the moment this passenger is picked up by a car (Bao et al. 1994). Minimizing the RWT of known passengers instead of the AWT of all known and future passengers is equivalent to the assumption that the current decision (assignment of a car to the current hall call) would not influence the waiting times of future passengers. While this assumption is clearly not true and should lead to suboptimal policies, its consequences can be expected to be relatively benign, since it can be expected that the stochasticity of the arrival process would eliminate the influence of the current decision in the long run.

In computational terms, this assumption eliminates two of the three sources of uncertainty identified above: the exact arrival times of passengers and the exact floor of their arrival. This is due to the fact that the scheduler needs only consider those passengers who have already arrived but have not been served yet; their exact arrival times and floors are known. The only uncertainty that remains is that of their destination floors. In this paper we show that one can compute exactly the expectation of the RWT of all known passengers with respect to an arbitrary probability distribution of their destination floors.

Before continuing, we will note the existence of algorithms that also consider the consequences of the current decision on future arrivals. Crites and Barto demonstrated an asynchronous algorithm for stochastic optimal control which uses neural networks and Q-learning (Crites & Barto 1996). Although their algorithm performed slightly better than FIM and ESA for one specific down-peak scenario (by 2.65%), it took 60,000 hours of simulated elevator operation to converge, which is not practical for real elevator systems. While trainable algorithms seem very promising for this problem area, the issue of correct generalization over the enormous state space and infinite horizon is very forbidding.

Dynamic Programming for Exact Computation of Expected Average Waiting Times

Initial Assumptions

Our key assumption, motivated above, is that future arrivals need not be factored into decisions about current passengers. There are in fact several ways in which this can be relaxed; we begin with the strong assumption for simplicity of exposition. Under this assumption, the most informed decisions are made when the waiting times are estimated all the way out to the horizon where all known passengers have been delivered to their destinations. Hence this is an empty-the-system algorithm, but unlike the original ESA algorithm, our approach accommodates all traffic patterns and uncertainty about the state of the system. In short, we retain the ESA strategy but introduce new machinery for inference.

We will initially describe the algorithm under the assumption that the destination floors of passengers are equally likely, and later on explain how non-uniform destination probabilities can be handled (at a significant computational expense). We also assume that the full state of the system is known to the scheduler; most importantly, we assume that the number of people standing on each floor is known. While such information cannot be obtained only by inspecting the number of hall buttons pressed, approximations of various quality exist and will be discussed below. A scheduler operating under this assumption is known as an omniscient scheduler.

Another key assumption concerns the way passengers assigned to a car are being served. In general, if $n$ passengers are assigned to a car, there are $n!$ possible orders for them to be picked up. If all orders are allowed and will be considered by the scheduler, the corresponding planning problem has been shown to be NP-hard even for a single car (Seckinger & Koehler 1999). However, there exists a simple order of serving passengers assigned to a car that also conforms well to passengers' expectations and is rarely suboptimal: Keep moving in the current direction until all passengers who have requested rides in this direction are picked up and delivered; after that, move to the first hall call in the opposite direction, and repeat the same procedure for all opposite hall calls. Our planner assumes that all assignments would be served in this manner.

The discussion in this section assumes an immediate assignment policy, as is customary in Japan, i.e., new assignments are appended to the current schedule and previous assignments are never changed. Using the algorithm for a reassignment policy would simply involve employing it as a subroutine on a set of proposed assignments, generated either exhaustively in a combinatorial manner, or after pruning the set of candidate assignments by means of heuristics or a systematic branch-and-bound algorithm, similarly to other scheduling algorithms (Bao et al. 1994). The last assumption we are making is that each car has infinite capacity; while not realistic, this assumption simplifies the algorithm significantly, and we will discuss possible ways to relax it.

Optimization Criterion

Whenever a new hall call is generated at a particular floor in a particular direction, the algorithm minimizes the total residual waiting time of all currently waiting passengers, including the new arrival. All such passengers except the new one have already been assigned to a car; under the immediate assignment policy, their assignments will never be reconsidered. If the elevator group has a total of $N_c$ cars, let $W_i^-$, $i \in [1, N_c]$, denote the expected waiting time of all
passengers currently assigned to car $i$, excluding the newly arrived passenger(s) signaling the current hall call, and similarly, let $W_i^+$, $i \in [1, N_c]$, denote the expected waiting time of all passengers currently assigned to car $i$, including the newly arrived passenger(s). We can then compute the expected waiting time $W_i$ associated with assigning the new call to car $i$ as

$$W_i = W_i^+ + \sum_{j=1,\; j \neq i}^{N_c} W_j^-, \qquad i \in [1, N_c].$$

The car $c$ chosen by the scheduler for assignment is the one that minimizes the total expected RWT: $c = \arg\min_i W_i$. Note that since the number of waiting passengers is constant at the time of a particular decision step, such an assignment would also minimize the average expected RWT of current passengers, which is computed as the total RWT of all passengers divided by their number.

If $W^- \doteq \sum_{i=1}^{N_c} W_i^-$, the waiting times for each possible assignment can be expressed as $W_i = W_i^+ - W_i^- + W^-$. Since $W^-$ is the same for each $i$, the assignment which minimizes $\Delta W_i = W_i^+ - W_i^-$ is also the one which minimizes $W_i$. As a result, the optimal assignment can be found by computing $W_i^+$ and $W_i^-$ for each car, and choosing the car for which their respective difference is minimal.

Computing $W_i^+$ and $W_i^-$ for a particular car $i$ is essentially the same problem. For $W_i^-$ we compute the expected RWT given the state of the system and all currently scheduled elevator-to-passenger assignments. For $W_i^+$ we temporarily add the new passenger to elevator $i$'s itinerary and recompute the expected RWT.

Effect of Uncertainty in Passengers' Destination

By definition, $W$ is the expected total RWT for all passengers currently assigned to be served by a car, subject to the constraints imposed by the car's current position, direction, and velocity, and the currently mandated stops at requested destination floors. The expectation of the waiting time is taken with respect to the uncertainty in the destinations of passengers who are yet to be picked up by the car. Since only their requested direction of travel is known, their destination can be any of the remaining floors in that direction. This is in contrast to the original ESA algorithm (Bao et al. 1994) which only works for pure down-peak traffic in which all passengers are delivered to the lobby. In classical ESA, the exact path traveled by the car is known, and the total waiting time of all passengers can be found by summing up the car travel times between floors where passengers are to be picked up, weighted by the number of passengers still waiting.

A straightforward extension of the ESA approach to the case when the destination of passengers is not known can be implemented by considering all possible destinations of each passenger (respectively, all possible paths the car can take), computing the total waiting time along each path, and weighting these times by the probability of the respective path. This is equivalent to generating a tree containing all possible futures of the system (disregarding future passengers), and computing a weighted sum of the waiting times over all paths from the root to the leaves. If $N_p$ passengers are assigned to a car in a building of $N_f$ floors, each of the passengers has $O(N_f)$ possible destinations, and the total complexity of such an implementation would be $O(N_f^{N_p})$, which is prohibitively high.

Estimation of Expected RWT in Linear Time by Means of Dynamic Programming

It is possible, however, to reduce the complexity of computation to $O(N_f N_p)$ by casting the problem into a dynamic programming framework. We will call the corresponding algorithm ESA-DP (ESA by Dynamic Programming).

Dynamic programming is commonly employed in stochastic scheduling algorithms where cost estimates on segments of a system's path can be reused in multiple paths (Bertsekas 2000). In order to solve a problem this way, one must typically discretize the state and identify branch points where system paths converge and then diverge again, so that the costs on a segment between two such points can be computed only once, and then reused for the computation of costs along all paths which include this segment.

Trajectory Structure of an Elevator Car

Such branching points can readily be identified on the phase-space diagram of an elevator car shown in Fig. 1. Like any moving mechanical system, a car traveling in an elevator shaft has a phase-space diagram which describes the possible coordinates $(x, \dot{x})$ for the car's position along the shaft $x$ and its velocity $\dot{x}$. When the car is moving under constant acceleration without friction, its trajectory consists of segments which are parts of parabolae, and more complicated equations of motion result in slightly different shapes of the traversed trajectories. However, even when the equations of motion of a car are nonlinear (e.g. include gear backlash) and/or include position derivatives higher than acceleration (e.g. jerk with a specified magnitude and duration), the motion of the car is very predictable and can be realized only on a small number of trajectories. Accordingly, these trajectories branch only on a small number of points, denoted by circles in Fig. 1, which always correspond to the last possible location at which a car should start decelerating if it is to stop at a particular floor in its direction of motion. A particular path of a car during its round-trip always consists of a finite number of such segments, whose endpoints are branching or resting points. Consequently, if the waiting time on each such segment can be computed, it can be reused for the computation along many paths that include that segment.

[Figure 1: A schematic illustration of the phase space of a single car moving upwards in a shaft of a building with eight floors, not all of which have equal height. All branching points are denoted by circles. Axes: position $x$ [floors] and velocity $\dot{x}$.]

Reusing the costs on all individual segments can be achieved by embedding a discrete Markov chain into the original system of elevator movement, which in itself operates in continuous time and space. A Markov chain consists formally of a finite number of states $S_i$, $i \in [1, N_s]$, an immediate cost $C_{ij}$ of the transition between each pair of states $S_i$ and $S_j$, a matrix $P_{ij}$ of the probabilities of transition between states $S_i$ and $S_j$, and a distribution $\pi(S_i)$ which specifies the probability that the system would start in state $S_i$ (Bertsekas 2000).

In order for the chain to be Markovian, it should obey the Markov property: The probability $P_{ij}$ of transitioning to state $S_j$ should depend only on the starting state $S_i$, and not on the trajectory of the system before it entered $S_i$. If we define the states of the system to correspond only to the branching points in the phase-space diagram, the resulting chain would not be Markovian, because the probability of each branch depends on how many people are currently inside the car, and that number depends on how many of all waiting people have already been transported to their destinations in previous stops of the car.

Consequently, the number of people in the car must be included in the state of the Markov chain as well. However, the state needs only encode the number of currently waiting people who will board the car after the moment of assignment decision. This number does not include people who are already in the car at that time and have signaled their destinations by pressing car buttons. These "in-car" passengers influence the motion of the car too, by imposing constraints on its motion in the form of obligatory car stops, but these constraints are deterministic and have no impact on transition probabilities. These probabilities depend only on the uncertainty in the destinations of the passengers who are yet to board the car [1].

[1] Technically, the disembarkation times of in-car passengers have some impact on the RWT of waiting in-hall passengers. Uncertainty over how many in-car passengers will disembark at each of the (known) requested stops could be marginalized out via dynamic programming with an expanded state descriptor. However, this represents a respectable increase in run-time for a negligible gain in accuracy; disembarkation times are very small relative to other time costs. We approximate the exact cost by assessing an $N/n$-second disembarkation cost for $N$ in-car passengers and $n$ requested stops. The disembarkation times of to-be-boarded passengers are modeled exactly.

Accordingly, a state $S_i$ of the Markov chain is described by the four-tuple $(f, d, v, n)$, where $f$ is the floor of the car, $d$ is its current direction, $v$ is its current velocity, and $n$ is the number of newly boarded passengers, precisely, waiting passengers who enter the car in the course of evaluating the Markov chain. The variables $d$ and $n$ are discrete, and have predefined ranges: $d$ can take only two values, "up" and "down", while $n$ ranges from 0 to the maximum number of passengers assigned to a car, traveling in either direction. (This maximum number is reached, for example, when all passengers intend to get off the car at the last floor in the current direction of motion.)

The variables $f$ and $v$, however, are essentially continuous, and in order to make the problem tractable, they have to be discretized. An inspection of the phase-space diagram suggests a straightforward discretization scheme for the velocity $v$: it can be seen that while accelerating from a particular floor, the car reaches branching points along its trajectory only at a small number of velocities (four in Fig. 1, including the quiescent state, when the velocity is zero). The reason for this is the limit on the maximum speed of any real elevator car. Depending on the inter-floor distance, maximum speed, and acceleration of the motors, this number of distinct velocities at branching points can be lower (for longer inter-floor distances, lower maximum speed, and greater acceleration), or higher (for shorter inter-floor distances, higher maximum speed, and lower acceleration). For a particular building and the elevator bank installed in it, this number is fixed and can be found easily, so henceforth we assume it is known and will denote it by $N_v$. Hence, the variable $v$ would only take $N_v$ discrete values, ranging from 0 (rest) to $N_v - 1$ (maximum speed). Note that the same value of $v$ can correspond to different physical velocities, depending on which floor the car stopped at last. Another interpretation of this variable is the number of branching points a car has encountered since its last stop.

There are several ways to discretize the floor variable $f$, the obvious one being to round the physical location of the car to the nearest floor. While such a discretization is possible, the resulting value for the floor is not conveniently related to the particular branching point represented by the Markov chain. A much more convenient discretization scheme is to choose for a value of $f$ the floor at which the car will stop if it starts decelerating at that branching point. The advantage of such a discretization scheme becomes apparent if we organize the states of the Markov chain in a regular structure, commonly called a trellis in dynamic programming algorithms.

Structure and Parameters of the Embedded Markov Chain

Fig. 2 shows a dynamic programming trellis for one particular Markov chain which corresponds to the situation when a car is moving down and is about to reach the branching point at which it will stop at floor 13, if it decelerates. It has already been scheduled to pick up a passenger at floor 7, and the scheduler is considering whether this car should also pick up a new hall call down, originating at floor 11. The embedded Markov chain has 84 states which are placed in a trellis matrix of 7 rows and 12 columns. States in a row represent branching points that share the property that the car will stop at the same floor, if it starts decelerating immediately. Note that this applies to branching points reached when the car is moving in a particular direction; when it is moving in the opposite direction, the branching points generally have different positions in the phase space diagram. The corresponding row of the trellis is labelled with the floor
at which the car can stop, as well as the direction of the movement of the car when it reaches the branching points. Since there is a separate row for each floor and direction, the trellis can have at most $2N_f$ rows.

Figure 2: Simplified trellis structure for the embedded Markov chain of a single descending car. Rows signify floors; columns signify the number of recently boarded passengers; column groups signify elevator speeds. The empty descending car is about to reach the branching point for a possible stop at floor 13. It has been assigned hall calls at floors 7 and 11, each of which may increase the passenger load by one.

The states in each row of the trellis are organized into $N_v$ groups (4 in figure 2), corresponding to the $N_v$ possible velocity values at branching points (ordered so that the leftmost column corresponds to zero velocity, and the rightmost column corresponds to the maximum velocity of the car). Within a group, the states correspond to the number of people who are currently in the car and who were waiting in the halls at the beginning of the trellis (ranging from 0 to 2 in figure 2). If the maximum number of people that can be in the car at the same time is H (2 in figure 2), the width of the trellis is $N_v(H + 1)$ states. This organization of states constitutes the trellis of the dynamic programming problem. It can be seen that not all of the states in the trellis can be visited by the car, because its motion is constrained by the current hall and car calls.

If we assume that the floor-value component f of the four-tuple used to describe a branching point is that of the floor where the car will stop if it starts decelerating at this branching point, the first row of the trellis always contains the first branching point which the car will reach. Similarly, under this convention, the last row of the trellis always corresponds to the floor where the last passenger along the round-trip of the car will be picked up. This arrangement of rows conveniently spans the horizon which the dynamic programming algorithm has to consider, because the last moment which has to be considered is always the moment the last waiting passenger is picked up; after that, the residual waiting time of passengers assigned to the current car becomes zero.

The total cost $C_{ij}$ incurred on a segment, measured as the waiting time of passengers who have not been picked up yet, can be expressed simply as the product of the number of these passengers and the duration of the segment.

The last remaining components of the embedded Markov chain are the transition probabilities $P_{ij}$ of transitioning between each pair of states $S_i$ and $S_j$. A large number of these transitions are deterministic and are always taken with probability one. Such are the transitions resulting from existing car and hall calls. For example, the initial trajectory of the car from floor 13 to floor 11 in Fig. 2 is deterministic: the empty car accelerates until it reaches the branching point for stopping at floor 11, where it stops to pick up the first hall-call passenger waiting there. After that, the car accelerates again until it reaches the branching point for stopping at floor 10, from which it can take many different paths, depending on the unknown destination of that passenger.

At the branching point of floor 10, the passenger might be getting off at one of the next 10 floors, and hence the probability that this would be exactly floor 10 is 0.1. With probability 0.9, the passenger would not get off at floor 10, and the car will continue accelerating until the branching point for floor 9, with one passenger still on board, as reflected in the diagram of the Markov chain.

In the general case, when the car has k floors to go with n passengers on board, and we assume that a passenger would get off at any of the k floors with equal probability (1/k), we can find the probability that x people would want to get off at the next floor by using the formula for the binomial probability function:

$$\Pr(x, n, k) = \frac{n!}{(n-x)!\,x!} \cdot \frac{(k-1)^{n-x}}{k^{n}} \qquad (1)$$

Therefore n − x people would remain on board the car with probability Pr(x, n, k). A similar treatment will give Pr(x, n, k) when the destination probabilities are nonuniform but independent of where each passenger gets on; we show below how to exploit a matrix of destination probabilities that are conditioned on the floor of arrival. The number of remaining people n − x specifies which state within a group the Markov chain would enter with probability Pr(x, n, k), but we still have to find which group (velocity setting) this state would be in. This velocity setting can be determined by inspecting the existing car and hall calls, as well as the number x of people getting off. If x > 0 or there is a mandatory stop at the next floor due to a car or hall call, the velocity v at the next state would be zero; only when x = 0 (nobody gets off the car at the next floor) and there are no car or hall calls for this floor would the car accelerate (or maintain maximum speed, if it has already reached it).

The initial and terminal states of the Markov chain are always unique, and can easily be found by locating the first branching point the car would enter along its current trajectory, and the floor where the last waiting passenger would be picked up, respectively. The distance between the rows of the trellis containing the initial and terminal states effectively spans the planning horizon of the dynamic programming algorithm.

Once the initial state of the Markov chain has been found, the whole chain can be built by propagating the set of states which can be visited by the car from the initial state. The chosen arrangement of the states into a dynamic programming trellis provides a convenient order for doing this. By
inspecting the order of transitions in Fig. 2, it can be seen that if a transition is between different rows, the starting state is always in a row above the successor state, and if a transition is within the same row, the starting state is always to the right of the successor state. Consequently, the propagation of states proceeds row-wise from the upper-right corner of the trellis until the lower-left corner is reached. Each state might have multiple successor states, depending on the number of people in the car who might want to get off, and the probability of each of these transitions is determined by the binomial formula above.
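As a minimal sketch of how these transition probabilities could be computed, the following Python fragment evaluates the binomial formula for n newly boarded passengers with k floors to go, under the same uniform-destination assumption. The function names `disembark_prob` and `successor_distribution` are illustrative choices, not part of the original algorithm description.

```python
from math import comb


def disembark_prob(x: int, n: int, k: int) -> float:
    """Probability that exactly x of n passengers get off at the next floor,
    assuming each passenger exits at any of the k remaining floors with
    equal probability 1/k (the binomial formula, equation 1)."""
    if not 0 <= x <= n or k < 1:
        raise ValueError("need 0 <= x <= n and k >= 1")
    return comb(n, x) * (k - 1) ** (n - x) / k ** n


def successor_distribution(n: int, k: int) -> dict[int, float]:
    """Map each possible number of passengers remaining on board (n - x)
    to the probability of that outcome."""
    return {n - x: disembark_prob(x, n, k) for x in range(n + 1)}
```

For the single passenger of the Fig. 2 example with 10 floors to go, `disembark_prob(1, 1, 10)` corresponds to the 0.1 chance of exiting at floor 10, and `disembark_prob(0, 1, 10)` to the 0.9 chance of staying on board.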
It should be noted that trellis building and evaluation time can be significantly reduced by skipping highly improbable events. A simple way to do this is to omit the transition for any disembarkation event whose probability (equation 1) lies below some threshold, for example, all passengers disembarking at the same mid-building floor. However, as the number of passengers in the system increases, nearly all disembarkation events have probability values close to zero. It is important to evaluate a sample of such cases representing the majority of the probability mass in equation 1. This is done very efficiently by adding transitions for events in order of descending probability, stopping when some large fraction of the probability mass is accounted for, say 99%.

[Figure 3 plot: "Per-trial improvement in response speed". Horizontal axis: average waiting time, Conventional (0 to 900); vertical axis: average waiting time, ESA-DP (0 to 900). Regions are marked "Conventional better" and "ESA-DP better", with improvement contours labelled from 10% to 300%.]

Figure 3: Waiting times of the ESA-DP scheduler plotted against waiting times of the conventional scheduler in identical scenarios, in seconds. Each dot represents an hour-long simulation in a different building type and arrival rate. Dots below the diagonal represent cases when ESA-DP achieves lower waiting time than the conventional scheduler, and vice versa for dots above the diagonal.

Evaluation of Travel Time

Once the trellis is built, it is used to evaluate the expected travel time of the car, starting from its current state. Opposite to the procedure for building the trellis, the algorithm for evaluating it starts from the bottom row of the trellis moving up, and processes the states in each row from left to right.
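The bottom-up sweep can be sketched as follows. This is only an illustration under stated assumptions: the trellis is stored as a hypothetical dictionary `transitions` that maps each reachable state, identified by its (row, column) position with row 0 at the top, to a list of (probability, segment cost, successor) triples, and the value of a state is taken to be the probability-weighted sum of segment cost plus successor value. None of these names or data structures come from the original implementation.

```python
State = tuple[int, int]                  # (row, column); row 0 is the top row
Transition = tuple[float, float, State]  # (probability, segment cost, successor)


def expected_cost_to_go(transitions: dict[State, list[Transition]],
                        terminal: State) -> dict[State, float]:
    """Evaluate every reachable state, bottom row first and left to right
    within a row, so that all successors are valued before their predecessors."""
    cost = {terminal: 0.0}  # nothing remains to be paid in the terminal state
    order = sorted(transitions, key=lambda s: (-s[0], s[1]))
    for state in order:
        if state == terminal:
            continue
        cost[state] = sum(p * (c + cost[nxt])
                          for p, c, nxt in transitions[state])
    return cost


# Toy chain: one uncertain branch at the top, then deterministic segments.
toy = {
    (0, 1): [(0.9, 6.0, (1, 1)), (0.1, 6.0, (1, 0))],
    (1, 1): [(1.0, 3.0, (2, 0))],
    (1, 0): [(1.0, 5.0, (2, 0))],
}
values = expected_cost_to_go(toy, terminal=(2, 0))
```

Because transitions only lead to lower rows or to the left within a row, this processing order guarantees that the value of each successor is available when it is needed; the entry for the initial state, here `values[(0, 1)]`, is the overall expected cost.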
In essence, the algorithm iteratively computes the cost-to-go (expected remaining waiting time) of each state in the trellis that can be visited by the car, by means of a Bellman backup on all successor states, using the probabilities and costs for transitioning to these states. After all costs-to-go are computed, the overall expected residual waiting time from the moment the car enters the initial state is just the cost-to-go of that state.

This algorithm computes the expected remaining waiting time of a car’s passengers only from the moment the car reaches the first state of the trellis. In general, however, when a hall call occurs, the car would be somewhere between two branching points. In order to find the total expected residual waiting time of a car’s passengers from the moment a hall call occurs, the result returned by the algorithm has to be increased by the time it would take for the car to reach the first branching point, multiplied by the total number of passengers currently assigned to that car, in both directions.

Experimental Verification of the Algorithm

The algorithm was benchmarked against a conventional method for supervisory group control which minimizes the round-trip time of the car along a single path, taking into consideration only stops due to existing car calls and stops where the car would pick up new passengers due to hall calls, but ignoring stops where those new passengers would be dropped off. We have been advised by industry experts that the benchmark scheduler method is considered to be highly competitive with the state of the art in currently deployed schedulers. The algorithm was tested on various buildings with height ranging from 8 to 30 floors, served by between 2 and 8 elevator shafts, whose cars were moving at a speed of 180 m/min. Each floor in these buildings was 4 m tall, except for the lobby, which was 5 m tall.

The experiments explored the case of mixed up-peak and down-peak traffic, which are some of the scenarios that pose extraordinary demands on the performance of the system. In addition, the up-peak case is accompanied by significant uncertainty in passenger destinations. Most (80%) of the traffic originated at the lobby and was directed approximately evenly to the upper floors, while the remaining 20% of the traffic was between floors other than the lobby. For the case of down-peak traffic, most of the traffic (80%) originated at the upper floors and was directed to the lobby, 10% was directed from the upper floors, and the remaining 10% was inter-floor traffic not involving the lobby.

The performance of the two algorithms was tested under arrival rates ranging from 100 arrivals per hour up to the point where the elevator group was overwhelmed by the arrival stream and average waiting time exceeded two minutes. Such a point is reached at different rates for different
buildings and number of shafts in the elevator group. The scatter-plot in Fig. 3 plots the average waiting times of ESA-DP against those of the conventional scheduler in 20,000 hour-long trials in a detailed simulator. Each point represents the average wait time over 1 hour, with both schedulers fed identical arrival patterns. In general, the time savings resulting from the application of the ESA-DP algorithm, expressed as a percentage of the time of the conventional algorithm, also grow larger as the arrival rate increases. At the rate at which the elevator bank can be considered to be overwhelmed by the passenger flow, the savings are in the range of 30%-40%. This indicates that ESA-DP has a higher throughput and capacity, only saturating at much higher traffic rates.

The reduction in average waiting time and consequent improvement of system capacity can be quite dramatic, especially in tall buildings with relatively few shafts, where the system is heavily strained. The advantages of the ESA-DP algorithm fade away in buildings with (unrealistically) many shafts, partly because the system is never strained, and partly because of the conventional scheduler’s zoning strategy. It should be noted that ESA-DP strongly dominates the conventional scheduler at all traffic rates in mixed and down-peak traffic. Successive papers will document elaborations of the ESA-DP approach that make it strongly dominant at all arrival rates in up-peak traffic too.

Extensions of the Algorithm

Several of the underlying assumptions of the algorithm can be relaxed, which might result in better performance and easier installation on existing elevator banks. These are: completely observable system state, uniform probability distribution on passenger destinations, infinite capacity of the cars, and absence of future arrivals. Solutions for relaxing the first three assumptions are described below.

Partially-Observable System State

The ESA-DP algorithm assumes that the state of the system is completely observable, including the number of car and hall calls, and the number of people waiting per hall call. The exact number of hall and car calls is known because they are always registered by passengers by pressing car and hall buttons, but the exact number of people per hall call is not readily available.

There are two possibilities for dealing with this problem: one of them relies on technical devices, and the other one relies on statistical estimation techniques. The simplest possibility is to require each passenger to press individually a button in the desired direction, even when the button has already been pressed by a previous passenger. This would provide an accurate count of the number of passengers per hall call, but would most likely be abused by impatient passengers. The exact number of people waiting on a given floor can also be determined from sensing, e.g., a computer vision system observing the space in front of the elevator bank and counting the number of faces in the image. Such a solution is clearly within the current state of the art in computer vision (Kim & Moon 2001).

One may also estimate the expected number of arrivals at a floor since the hall button on that floor was first pressed. If the time elapsed since then is ∆t, and the times between arrivals at this floor are i.i.d. exponentially distributed random variables with arrival rate λ, then the total number of new arrivals comes from a Poisson distribution whose mean is λ · ∆t. Hence, the expected number of passengers waiting at this floor is λ · ∆t + 1. Such estimates have been widely used by supervisory control algorithms, with minimal decrease in performance (Bao et al. 1994; Crites & Barto 1996). In order to apply this statistical estimation method, however, the arrival rates at each floor must be known. These might come either from online statistical estimates of the latest arrivals, or from known traffic profiles accumulated off-line from past data (Siikonen 1997).

Non-uniform Destination Probabilities Conditioned on Floor of Arrival

The assumption of a uniform probability distribution on passenger destinations will certainly be violated for most real buildings, which usually have different numbers of occupants on each floor, so the traffic flow from the lobby to different floors cannot be assumed to be uniform. Furthermore, traffic between floors other than the lobby would usually be non-uniform as well, for example in the case when a single company is occupying adjacent floors in a building and there is a lot of traffic between these floors, but little or no traffic to and from other, unrelated floors.

As noted above, handling non-uniform destination probabilities which do not depend on the floor of arrival is straightforward and does not change the complexity of the algorithm; the only change necessary is in the probability used in the binomial formula. However, when destination probabilities are conditioned on the floor of arrival, the computational complexity changes drastically, due to the necessity to “remember” explicitly the state of each individual passenger (in or out of the car).

The favorable computational complexity of the original algorithm could be achieved because the scheduler could ignore the distinctions among the passengers within a car, and each state could be characterized simply by the number of people within the car. If the largest number of passengers that could be in the car at the same time was H, a total of H + 1 states were sufficient to encode all distinctions necessary to quantify the future trajectory of the car, i.e. to make the chain fully Markovian. This was possible because individual passengers exited the car with the same probability, and the binomial formula could be used to aggregate multiple exits.

Conversely, the binomial formula cannot be used to compute transition probabilities when the respective probabilities of exiting the car vary among different passengers, because the disembarkation events for individual car passengers are not i.i.d. variables. One straightforward extension of the algorithm is to maintain individual Boolean state variables for each passenger, where each variable designates whether the passenger is inside the car or not. Let the overall state of the passengers inside the car be $U = [u_1, u_2, \ldots, u_H]$, where each variable (fluent) $u_i = 1$ iff passenger $i$, $1 \le i \le H$, is in the car, and $u_i = 0$ otherwise. Furthermore, let $p_{ij}$ be the probability that passenger i would get off at floor j, given that he/she is still in the car, and let $q_{ij} = 1 - p_{ij}$. Then, the probability of the transition between state U and one of its successor states $U' = [u'_1, u'_2, \ldots, u'_H]$, $u'_i \le u_i$, $1 \le i \le H$, can be expressed as

$$\prod_{i=1}^{H} \left(p_{ij}^{\,u_i - u'_i}\right)\left(q_{ij}^{\,1 - u_i + u'_i}\right).$$

While this extension is quite straightforward, both its computational time and storage space complexities are exponential in the number of passengers H. This is a frequent issue in many decision-theoretic planning problems, and many approaches for complexity reduction have been tried (Boutilier, Dearden, & Goldszmidt 1995). Significant computational leverage can be achieved by exploiting structure in the stochastic properties of the problem domain, and one primary manifestation of such structure is conditional independence between state fluents.

Such conditional independence is present in the elevator scheduling problem as well. The behavior of each individual passenger is independent of that of other passengers: decisions to get off or stay in the car are not influenced by other passengers. It can be expected that this structure can be exploited in the future to reduce the exponential complexity of the extended algorithm.

Finite Car Capacity

When the number of waiting passengers assigned to a single car exceeds its physical limit, whether all of them can be transported within a single round trip of the car depends on their destinations. The basic version of the ESA-DP algorithm assumes that the car has infinite capacity, and all passengers would be transported in a single round trip; while this assumption simplifies computation, it would not correspond to reality in many cases2. Another way of handling the physical limits of the car is to divide the H + 1 states within a single-velocity-group of the trellis into two subgroups: one subgroup of m + 1 states, where m is the physical limit of the car, and another subgroup of H − m + 1 states which correspond to the possible number of people left off as a result of a car overflow. The state propagation algorithm becomes more complicated, because now two components have to be propagated: one is the number of people inside the car, and the other one is the number of people left off.

A major difference from the basic case is that when computing the first component (number of people in the car), the number of people from existing car calls can no longer be ignored, since whether the car would pick up new hall-call passengers at a particular floor (and exactly how many) depends also on whether existing car-call passengers would get off at that floor to make room for new ones. The fact that all stops due to existing car calls are known in advance simplifies the computation, but still the exact number of car-call passengers getting off at a particular floor is not completely certain when the number of such passengers exceeds the number of car buttons pressed. Even though the distribution on people getting off is completely defined, computing it is not trivial, because that distribution must also obey the constraints arising from existing car calls.

The propagation of the second component (the one encoding the possibility that between 0 and H − m passengers would be left off as a result of an overflow) can be implemented easily if one additional assumption is made: that once a passenger has been left off, he or she will remain unserved until the end of the round-trip. Under this assumption, the number of left-off passengers cannot decrease, and would either remain the same or increase as the state is propagated in time, depending on whether new overflows occur.

Conclusions

This paper details an efficient scheduling algorithm based on dynamic programming for exact estimation and minimization of the expected waiting times of all known passengers in a group elevator system. Empirical comparison with a state-of-the-art scheduler in a very detailed discrete-event elevator bank simulator demonstrated that for a wide variety of buildings, ranging from 8 to 30 floors, and with 2 to 8 shafts, ESA-DP reduces waiting times by 30%-40% under very heavy traffic, and rarely under-performs the benchmark scheduler in light traffic.

The base ESA-DP algorithm was developed under the assumptions of no future passenger arrivals, unlimited car capacity, full state information, and a known marginal distribution over destination floors; even though the simulator does not respect most of these assumptions, the algorithm performs very well.

Most of these assumptions can be relaxed. We provided complete solutions for dealing with partial state information and handling non-uniform probability distributions on destination floors. The algorithm was also extended to deal with non-uniform probability distributions on destination floors conditioned on the arrival floor, although at a much higher computational cost. We also outlined how the algorithm could be extended to consider possible future passenger overflows due to finite car capacity. This may not be necessary because the base algorithm shows its greatest performance advantage at the heaviest traffic rates, precisely where we should expect to see it most punished for neglecting overflows.

The complexity of the base algorithm is linear3 in both the number of cars and the number of existing hall calls, which allows it to be implemented on micro-controllers currently employed in existing elevators.

It is our hope that this paper will open the door to DP solutions to many other online scheduling problems. Forthcoming papers will describe extensions to ESA-DP that consider future arrivals, with significant further gains in performance.

2 We are forced to make this unrealistic assumption in the current implementation because the elevator simulator automatically assigns to a car any new arrivals at floors where the car is already scheduled to stop, regardless of whether or not the car has room to accommodate these passengers. The scheduler is not called and therefore does not get a chance to re-balance the load.

3 Run-time complexity is also linear in the total number of passengers, but in some cases the number of transitions between two slices of the trellis can be quadratic in the number of people in the car.
Acknowledgments

We would like to thank Koichi Sasakawa and Masafumi Iwata from Sentan Soken, the Advanced Technology R&D Center of Mitsubishi Electric, for providing us with the detailed elevator simulator used in the experiments reported in this paper.

References

Bao, G.; Cassandras, C. G.; Djaferis, T. E.; Gandhi, A. D.; and Looze, D. P. 1994. Elevator dispatchers for down-peak traffic. Technical report, University of Massachusetts, Department of Electrical and Computer Engineering, Amherst, Massachusetts.

Barney, G., and dos Santos, S. 1985. Elevator Traffic Analysis, Design and Control. England: IEE, Peter Peregrinus Ltd.

Bertsekas, D. P. 2000. Dynamic Programming and Optimal Control. Belmont, Massachusetts: Athena Scientific. Volumes 1 and 2.

Bittar, J. 1982. Relative system response elevator call assignments. US Patent. #4,363,381.

Boutilier, C.; Dean, T.; and Hanks, S. 1999. Decision theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research 11:1–94.

Boutilier, C.; Dearden, R.; and Goldszmidt, M. 1995. Exploiting structure in policy construction. In Extending Theories of Action: Formal Theory & Practical Applications: Papers from the 1995 AAAI Spring Symposium, 33–38. AAAI Press, Menlo Park, California.

Cho, Y.; Gagov, Z.; and Kwon, W. 1999. Elevator group control with accurate estimation of hall call waiting times. In Proceedings of the 1999 International Conference on Robotics and Automation, 447–452. Detroit, MI: IEEE.

Crites, R. H., and Barto, A. G. 1996. Improving elevator performance using reinforcement learning. In Touretzky, D. S.; Mozer, M. C.; and Hasselmo, M. E., eds., Advances in Neural Information Processing Systems, volume 8, 1017–1023. The MIT Press.

Hikihara, T., and Ueshima, S. 1997. Emergent synchronization in multi-elevator system and dispatching control. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E80-A(9):1548–1553.

Kim, J.-H., and Moon, B.-R. 2001. Adaptive elevator group control with cameras. IEEE Transactions on industrial electronics 48(2):377–382.

Koehler, J., and Ottiger, D. 2002. An AI-based approach to destination control in elevators. AI Magazine 23(3):59–79.

Koehler, J., and Schuster, K. 2000. Elevator control as a planning problem. In Chien, S.; Kambhampati, S.; and Knoblock, C. A., eds., Proceedings of the Fifth International Conference on AI Planning and Scheduling (AIPS), 1332–1339.

Koehler, J. 2001. From theory to practice: AI planning for high performance elevator control. In Baader, F.; Brewka, G.; and Eiter, T., eds., Lecture Notes in Computer Science, volume 2174. Berlin: Springer Verlag. 459–462.

Pepyne, D. L., and Cassandras, C. G. 1997. Optimal dispatching control for elevator systems during uppeak traffic. IEEE transactions on control systems technology 5(6):629–643.

Powell, B. A., and Williams, J. N. 1992. Elevator dispatching based on remaining response time. US Patent. #5,146,053.

Seckinger, B., and Koehler, J. 1999. Online-Synthese von Aufzugssteuerungen als Plannungsproblem. In 13. Workshop Planen und Konfigurieren, Interner Bericht des Instituts für Informatik der Universität Würzburg, 127–134.

Siikonen, M.-L. 1997. Elevator group control with artificial intelligence. Technical Report A67, Helsinki University of Technology, Systems Analysis Laboratory, Helsinki, Finland.

Strakosch, G. R. 1998. Vertical transportation: elevators and escalators. New York, NY: John Wiley & Sons, Inc.

Tanahashi, T., and Araki, H. 1994. Drive-control equipment for 750 m/min elevators. Mitsubishi Electric Advance 67:8–9.

Ujihara, H., and Amano, M. 1994. The latest elevator group-control system. Mitsubishi Electric Advance 67:10–12.

Ujihara, H., and Tsuji, S. 1988. The revolutionary AI-2000 elevator group-control system and the new intelligent option series. Mitsubishi Electric Advance 45:5–8.
