OPERATIONS RESEARCH
informs
Vol. 56, No. 4, July-August 2008, pp. 865-880
issn 0030-364X eissn 1526-5463 08 5604 0865
doi 10.1287/opre.1080.0550
© 2008 INFORMS
Simulation-Based Optimization of Virtual Nesting
Controls for Network Revenue Management
Garrett van Ryzin
Graduate School of Business, Columbia University, New York, New York 10027,
[email protected]
Gustavo Vulcano
Stern School of Business, New York University, New York, New York 10012,
[email protected]
Virtual nesting is a popular capacity control strategy in network revenue management. In virtual nesting, products (itinerary-fare-class combinations) are mapped (indexed) into a relatively small number of virtual classes on each resource (flight
leg) of the network. Nested protection levels are then used to control the availability of these virtual classes; specifically, a
product request is accepted if and only if its corresponding virtual class is available on each resource required. Bertsimas
and de Boer proposed an innovative simulation-based optimization method for computing protection levels in a virtual
nesting control scheme [Bertsimas, D., S. de Boer. 2005. Simulation-based booking-limits for airline revenue management.
Oper. Res. 53 90-106]. In contrast to traditional heuristic methods, this simulation approach captures the true network
revenues generated by virtual nesting controls. However, because it is based on a discrete model of capacity and demand,
the method has both computational and theoretical limitations. In particular, it uses first-difference estimates, which are
computationally complex to calculate exactly. These gradient estimates are then used in a steepest-ascent-type algorithm,
which, for discrete problems, has no guarantee of convergence.
In this paper, we analyze a continuous model of the problem that retains most of the desirable features of the Bertsimas-de Boer method, yet avoids many of its pitfalls. Because our model is continuous, we are able to compute gradients
exactly using a simple and efficient recursion. Indeed, our gradient estimates are often an order of magnitude faster to
compute than first-difference estimates, which is an important practical feature given that simulation-based optimization
is computationally intensive. In addition, because our model results in a smooth optimization problem, we are able to
prove that stochastic gradient methods are at least locally convergent. On several test problems using realistic networks, the
method is fast and produces significant performance improvements relative to the protection levels produced by heuristic
virtual nesting schemes. These results suggest it has good practical potential.
Subject classifications: stochastic algorithm; stochastic gradients; revenue management; capacity control.
Area of review: Manufacturing, Service, and Supply Chain Operations.
History: Received January 2003; revisions received September 2004, September 2005, February 2006, May 2006;
accepted September 2006.
Introduction
Optimally rationing the amount of capacity sold to various demand classes is a central problem in airline revenue
management (RM). The so-called single-resource problem involves rationing capacity on a single resource (single flight
leg or single day at a hotel property). This problem has received significant attention in the academic literature in
the past and a variety of heuristic and exact methods have been proposed for single-resource problems; see, for example,
Littlewood (1972), Belobaba (1987, 1989), Brumelle and McGill (1993), and Lee and Hersh (1993). The book
by Talluri and van Ryzin (2004) provides a detailed discussion of single-resource methods.
More recently, RM practitioners and researchers have turned their attention to network, or origin-destination (O-D),
RM. In network RM, the objective is to jointly manage the capacities of a network of related resources. For example,
network problems arise in managing the capacities of a set of flights in a hub-and-spoke airline network with
connecting and local traffic, or in managing hotel capacity on consecutive days based on guests' lengths of stay.
The dependence among the resources in these cases is created by customer demand; customers may require several
resources simultaneously (e.g., two connecting flights, or a number of consecutive days at a hotel) to satisfy
their needs. Thus, accepting demand will reduce capacity on multiple resources simultaneously. To make network-optimal
revenue decisions in such cases requires a network-level approach to capacity control; that is, optimization
methods that explicitly model network effects and control mechanisms that determine product availability based on
total network revenue contribution. (See Talluri and van Ryzin 2004 for an in-depth discussion of network RM
methods.)
van Ryzin and Vulcano: Simulation-Based Optimization of Virtual Nesting Controls for Network Revenue Management
Although network RM creates both methodological and
implementation challenges, the potential revenue benefits
are significant. Indeed, simulation studies of airline hub-and-spoke networks by various researchers have demonstrated significant revenue gains from using network
methods over single-resource methods; see, e.g., Belobaba
and Lee (2000), Belobaba (2001), and Williamson (1992).
Several major airlines are now using network methods and
most vendors now offer some form of network RM system.
Yet, exact network optimization is, practically speaking,
impossible due to the large dimensionality of the resulting
dynamic programming models. Instead, various approximate methods have been developed. Some approximations
are based on deterministic mathematical programming
models. Among these are the minimum-cost network-flow
formulations by Glover et al. (1982) and Dror et al. (1988).
Williamson (1992) proposed and investigated mathematical programming methods based on linear and nonlinear programming approximations. Curry (1990) describes
a combined mathematical programming and nested allocation formulation of the network problem, in which fare
classes within the same O-D are nested, while capacity is
partitioned among O-Ds.
Other approximations are based on decomposing the network problem into a collection of single-resource problems. One of the oldest decomposition approximations is
the so-called virtual nesting method, developed initially
at American Airlines (Smith and Penn 1988, Smith et al.
1992). In virtual nesting, products (itinerary-fare-class combinations) are clustered according to some criteria (usually
an estimate of the net network revenue benefit of selling
the product) to form a small number of virtual classes
on each resource of the network. This mapping of products
to virtual classes is called indexing. Demand and revenue
statistics from these aggregated virtual classes are then used
as inputs to a single-resource model, which provides nested
protection levels (or booking limits) for the virtual classes
on each resource. A product is then open for sale if and
only if capacity is available in its corresponding virtual
class on each of the resources it requires. In this way, the
network problem is decomposed into a collection of single-resource problems. Yet, by clustering itineraries into a relatively small number of classes based on their estimated
network revenue values, virtual nesting is able to approximate the network effects of selling a product. Moreover,
the approach preserves the nested protection level structure of traditional single-resource controls, an important
advantage in practice because often the reservation system
is designed only to implement resource-level, class-based
booking controls. For these reasons, virtual nesting methods have proved to be quite popular in practice. (Again,
see Talluri and van Ryzin 2004 for a complete discussion
of virtual nesting methods.)
Recently, an interesting variation on virtual nesting was
proposed by Bertsimas and de Boer (2005). Their idea is
to view the mapping of products to virtual classes and the
collection of nested protection levels as defining a class
of control strategies, parameterized by the set of protection levels on the network. Starting with a given initial set
of protection levels, they then use simulation-based optimization and approximate dynamic programming methods
to optimize these protection levels. The key idea is that by
using simulation, it is possible to provide a very accurate
estimate of the true network effects of changing protection
levels. This is a significant advantage relative to traditional
virtual nesting methods, which are based on single-resource
models that, at best, only heuristically approximate network
effects.
However, to perform this simulation-based optimization,
Bertsimas and de Boer (2005) use a discrete-capacity,
discrete-demand model of the network problem. They divide
the booking horizon into time periods, and when the booking process enters a new time period, the protection levels
are reoptimized. This is accomplished by using simulation-based optimization to determine the revenue generated during the current time period and by using an approximation
of the value function to estimate the revenue-to-go from the
next period onward. While running the simulation, if a protection level becomes binding, it is perturbed and the resulting revenue is estimated by spawning a new data structure
to keep track of the two sample paths (perturbed and unperturbed). This is repeated for every binding protection level
encountered.
Although this discrete model is realistic, it leads to a
difficult optimization problem in several respects. For one,
first-difference estimates can be computationally inefficient. Indeed, as mentioned, to compute all first-differences
requires generating a separate sample path for each binding protection level encountered. This can quickly become
too computationally intensive to be practical for large networks. (A detailed complexity analysis is provided in §2.4.)
This is one reason Bertsimas and de Boer (2005) break
the booking horizon into periods and use an approximation of the revenue-to-go rather than directly simulating
the complete sample path. In fairness, their revenue-to-go
approximation also has the desirable feature of reoptimizing the protection level parameters in each period, so it
further serves as a value function estimate in the spirit of
approximate dynamic programming. We do not consider
this sort of value function approximation here; rather, our
focus is on the simulation-based optimization portion of
their approach. However, it is certainly possible to combine
our more efficient simulation method with Bertsimas and
de Boer's value function approximation approach; the two
ideas are quite independent.
A second difficulty is that Bertsimas and de Boer (2005)
use their first-difference gradient estimates in a stochastic
steepest ascent algorithm, together with rounding at each
iteration. The drawback here is that ascent methods together
with rounding at each step are not guaranteed to converge
for discrete problems. Even algorithms based on finite first-differences do not necessarily converge, either locally or
globally. (See Ermoliev 1988.) In short, the method lacks
good convergence properties.
Despite these computational and theoretical limitations,
Bertsimas and de Boer (2005) report promising numerical
results. Moreover, the overall idea of their approach is a
clever and practical one.
Motivated by the potential of this approach, we propose
and analyze a variation of it that overcomes the shortcomings identified above. The key difference is that we
use a continuous capacity and demand (fluid) model of
the problem that allows for partial acceptances. This fluid-model approach is similar in spirit to that in Mahajan
and van Ryzin (2001) for a stochastic inventory problem.
Although less realistic in terms of the fine-grained details
of the simulation, the advantage of the model is that it leads
to a much easier continuous optimization problem. Indeed,
we show how to compute the entire sample path gradient
for the network revenue function exactly via a simple recursion that is essentially no more complex than simulating
a single sample path. Numerical examples show that our
gradient estimates can often be computed an order of magnitude faster than first-difference estimates. The recursion
is also quite easy to implement. The pseudocode is provided in online Appendix A4. An electronic companion to
this paper is available as part of the online version that can
be found at http://or.journal.informs.org/.
We then embed these sample path gradients in a stochastic approximation algorithm as in Bertsimas and de Boer's
(2005) method, but without rounding at each iteration.
Because our revenue function is sufficiently smooth, we
are able to prove that it is locally convergent. (As a final
heuristic step, one can always round the final parameter
values as needed to generate integer-valued protection levels.) Our numerical experiments show significant revenue
increases with respect to the starting protection levels using
our method. Moreover, it is quite fast even on problems of
realistic size.
We note that both speed and performance are important
factors in RM practice. Indeed, simulation-based optimization is notoriously computationally intensive, and in large
airline implementations of network RM, one must often
solve many instances of large network problems in a small
processing time window. For example, a major network
carrier can easily have networks consisting of thousands of
flight legs and hundreds of thousands of products, and they
may need to optimize one such network for each day going
out 300 days into the future. All this optimization is typically done in an overnight batch process along with data
extraction, forecasting, reservation system updating, and
other computationally intensive tasks. The total available
time window for optimization may be only on the order
of a few hours, corresponding to a few minutes per network in most cases. Indeed, one reason simple deterministic
models remain so popular in practice is precisely because
they are extremely fast to solve. Therefore, if simulation-based methods are to become viable alternatives in practice,
computational efficiency is essential.
The remainder of this paper is organized as follows. In §1,
we introduce the discrete model and its continuous approximation. Section 2 describes the sample path view of the
network demand. In §3, we present the way we improve
an initial set of protection levels through a gradient-based
method. Section 4 shows some numerical results, and we
present our conclusions in §5.
Notation
We begin by introducing some notational conventions. For
a vector $x$, $x_j$ denotes its $j$th component, and $x^T$ is the vector transpose. For a number $a$, we denote $a^+ = \max\{a, 0\}$.
Let $\mathcal{N}$ denote the set $\{1, \dots, n\}$. We use $I\{\cdot\}$ for the indicator function, a.s. means almost surely, c.d.f. is short for
cumulative distribution function, and w.p.1 is short for with
probability 1.
1. Model Formulation
The network has $m$ resources which can be used to provide $n$ products (e.g., in an airline network each resource
could correspond to a single-leg flight, and a product is
defined by an itinerary and price-class combination). Define
the incidence matrix $A = [a_{ij}] \in \{0, 1\}^{m \times n}$. We let $a_{ij} = 1$
if resource $i$ is used by product $j$, and $a_{ij} = 0$ otherwise.
Thus, the $j$th column of $A$, $A_j$, is the incidence vector for
product $j$; and the $i$th row of $A$, $A^i$, is the incidence vector for resource $i$. We use the notation $i \in A_j$ to indicate
that resource $i$ is used by product $j$; and $j \in A^i$ to mean
that product $j$ uses resource $i$. The state of the network is
described by a vector $x^T = (x_1, \dots, x_m)$ of resource capacities. If one unit of product $j$ is sold, the state of the network changes to $x - A_j$. To simplify the analysis, we will
ignore cancellations and no-shows. The revenue obtained
from accepting a request for one unit of product $j$ is $r_j$.
A request for product $j$ is mapped to virtual class $c_i(j)$
on each resource $i$ used by product $j$ as given by a fixed
indexing scheme, which we assume is given. A variety of
methods can be used to perform this indexing (we discuss
a standard one in online Appendix A3) and our algorithm
works with any of them. Although some indexing schemes
will no doubt yield better performance than others, the
indexing scheme per se is not the focus of our analysis;
rather, it is assumed to be an input to our model.
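As an illustration of the incidence-matrix notation, the following minimal sketch builds a hypothetical two-leg network (the network, the helper names `legs_used`, `products_using`, and `sell`, and all numbers are assumptions made for this example, not part of the paper) and applies the state update $x - A_j$ when one unit of a connecting product is sold.

```python
# Hypothetical two-leg network: leg 0 = A->B, leg 1 = B->C.
# Products: 0 = A->B local, 1 = B->C local, 2 = A->B->C connecting.
A = [
    [1, 0, 1],  # row for leg 0: used by products 0 and 2
    [0, 1, 1],  # row for leg 1: used by products 1 and 2
]


def legs_used(A, j):
    """Resources i with a_ij = 1 (the set written 'i in A_j' in the text)."""
    return [i for i, row in enumerate(A) if row[j] == 1]


def products_using(A, i):
    """Products j with a_ij = 1 (the incidence vector of resource i)."""
    return [j for j, a in enumerate(A[i]) if a == 1]


def sell(x, A, j, units=1.0):
    """State after selling `units` of product j: x - A_j * units."""
    return [x_i - row[j] * units for x_i, row in zip(x, A)]


x = [10.0, 8.0]          # remaining capacity on each leg
x_next = sell(x, A, 2)   # the connecting product consumes both legs
```

Selling the connecting product lowers capacity on both legs at once, which is the network effect that the control scheme has to account for.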
We assume that there are $\bar{c} + 1$ virtual classes on each
resource, and that virtual class 1 is the highest in the nesting order, followed by virtual class 2, etc. Let $y_{ic}$ denote
the protection level for virtual classes $c$ and higher on
resource $i$. Under a virtual nesting control, requests mapped
to virtual class $c + 1$ are accepted if and only if the remaining capacity on each leg $i$ required by the booking exceeds
$y_{ic}$ (the protection level for higher virtual classes). In other
words, class $c + 1$ requests only have access to the capacity
in excess of $y_{ic}$ on leg $i$.
Let $y = (y_{11}, \dots, y_{1\bar{c}}, \dots, y_{m1}, \dots, y_{m\bar{c}})$ denote the vector
of all $m\bar{c}$ protection levels. The protection levels are nested,
so we require
$$0 \le y_{i1} \le y_{i2} \le \cdots \le y_{i\bar{c}} \le x_i(1), \quad i = 1, \dots, m, \tag{1}$$
where $x_i(1)$ is the initial capacity of resource $i$. Without
loss of generality, we assume that $x_i(1) > 0\ \forall i$, else we
can simply eliminate leg $i$ and all products $j \in A^i$ from
the model formulation. Let $\Theta$ be the set of all $y$ satisfying
these constraints. We assume dummy protection levels $y_{i0}$
when needed: $y_{i0} = 0\ \forall i$, representing the fact that there is
no protection level for the highest virtual class on each leg.
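The nesting constraint (1) is easy to check mechanically. The sketch below is only an illustration: the function name `is_nested` and the list-of-lists layout for $y$ are assumptions of this example.

```python
def is_nested(y, x1):
    """Check constraint (1) for every resource.

    y[i]  = [y_i1, ..., y_i,cbar], protection levels of resource i
    x1[i] = initial capacity x_i(1) of resource i
    """
    for y_i, cap in zip(y, x1):
        # Prepend the dummy level y_i0 = 0 and append the capacity bound.
        levels = [0.0] + list(y_i) + [cap]
        if any(lo > hi for lo, hi in zip(levels, levels[1:])):
            return False
    return True


feasible = is_nested([[2.0, 4.0]], [8.0])      # 0 <= 2 <= 4 <= 8
not_nested = is_nested([[4.0, 2.0]], [8.0])    # levels out of order
too_large = is_nested([[2.0, 9.0]], [8.0])     # exceeds initial capacity
```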
We assume that the protection levels y remain constant
while processing the requests. This corresponds to what
is called theft nesting (de Boer 2003). Another alternative, called standard nesting, is to update protection levels
after requests are accepted. Specifically, in standard nesting, if a request for one seat is accepted for a virtual
class $c$ on a given leg, the protection levels for classes $c$
and lower (i.e., classes $c, c + 1, \dots, \bar{c}$)
are reduced by one.
The rationale for this adjustment is heuristic, but the idea is
roughly to compensate for lower expected demand-to-come
for class $c$ after we observe an arrival of a class $c$ request.
There is effectively no difference between these two versions of nesting when low-fare demand arrives before high-fare demand. However, if the order of arrivals is mixed (as
is possible in virtual nesting schemes), theft nesting can
potentially overprotect capacity for high-fare demand,
although the relative performance of the two methods can
depend on the frequency of reoptimization of the protection levels. While both nesting methods are found in airline
industry practice, we restrict our analysis here to theft nesting, although our approach can be modied to work with
standard nesting.
We take a sample path view of the demand and sales
process. Let $T$ denote the number of customers on a sample path, verifying $P(T \le \bar{T}) = 1$ for some finite constant $\bar{T}$.
The demand is characterized as a sequence of customer
requests $\omega = \{\omega_t : t = 1, \dots, T\}$. The index $t$ runs forward
in time (i.e., $t = 1$ represents the first customer, $t = 2$ the
second customer, and $t = T$ the last one). Each element $\omega_t$
in the sequence is a pair $\omega_t = (j_t, q_t)$, where $j_t$ is a realization of a random variable on the support $\mathcal{N} = \{1, \dots, n\}$ representing
the product type requested by customer $t$, and $q_t$ is a realization of a random variable with support $[0, \bar{Q}]$, denoting
the amount requested. While $T$, $j_t$, and $q_t$ are all functions
of $\omega$, we omit this dependence explicitly to prevent the
notation from becoming too cumbersome.
To ensure the expected revenue function is smooth, in
our theoretical analysis we will require that the quantities $q_t$
have a continuously differentiable c.d.f. $F_q$, although it is
clearly more realistic to view $q_t$ as discrete (e.g., the number of seats). In our computations, we tested this more
realistic discrete-demand case.
Note that this demand model is quite general. It allows
arbitrary order of arrivals among the virtual classes, arbitrary correlation and coefficients of variation between successive demands, etc. This level of generality is one of the
main advantages of simulation-based methods; essentially,
any procedure for generating sequences of demand of this
form is allowed.
Our key assumption is that capacity and demand are
continuous quantities and that requests can be partially
accepted. Specifically, a request $t$ for an amount $q_t = q$
of product $j_t = j$ is processed as follows: By letting
$x(t)$ denote the vector of remaining capacities at time $t$,
the amount of the request accepted, $u_j(x(t), y, q)$, is given by
the minimum available capacity among all the resources
required by $j$, or $q$ if there are at least $q$ units of capacity
on all these resources. Formally,
$$u_j(x(t), y, q) = \min\{q,\ (x_i(t) - y_{i, c_i(j)-1})^+ : i \in A_j\}. \tag{2}$$
Essentially, we are considering demands as fluids, which
we can partially accept if the available capacity is positive
but less than the quantity $q$ requested.
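The acceptance rule (2) can be sketched in a few lines. The function name `accept`, the dictionary `vclass` holding $c_i(j)$ for each resource the product uses, and the numbers are assumptions chosen for illustration; `y[i][0]` plays the role of the dummy level $y_{i0} = 0$.

```python
def accept(x, y, vclass, q):
    """Fluid acceptance u_j(x, y, q) in the sense of Equation (2).

    x[i]       remaining capacity on resource i
    y[i][c]    protection level y_ic, with y[i][0] = 0 as the dummy level
    vclass[i]  virtual class c_i(j) of the product on each resource i it uses
    q          quantity requested
    """
    # Capacity open to the product on resource i is (x_i - y_{i, c_i(j)-1})^+.
    available = [max(x[i] - y[i][c - 1], 0.0) for i, c in vclass.items()]
    return min([q] + available)


y = [[0.0, 2.0, 4.0],   # resource 0: y_00 (dummy), y_01, y_02
     [0.0, 1.0, 2.0]]   # resource 1
vclass = {0: 3, 1: 2}   # class 3 on resource 0, class 2 on resource 1

partial = accept([5.0, 3.0], y, vclass, 1.5)   # resource 0 leaves only 5 - 4 = 1
full = accept([5.0, 3.0], y, vclass, 0.5)      # enough room on both resources
closed = accept([3.0, 3.0], y, vclass, 1.0)    # resource 0 is below its protection level
```

In the first call the request for 1.5 units is cut back to the single unit left on resource 0, which is the partial acceptance that the fluid model allows.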
For the same sample path $\omega$, define $R_t(x(t), y, \omega)$ to be
the revenue-to-go over periods $t, t + 1, \dots, T$ starting with
a vector $x(t)$ of remaining capacities and protection levels $y$. We then have the following set of recursive forward
equations for determining the revenues:
$$R_t(x(t), y, \omega) = r_{j_t} u_{j_t}(x(t), y, q_t) + R_{t+1}(x(t + 1), y, \omega),$$
$$x(t + 1) = x(t) - A_{j_t} u_{j_t}(x(t), y, q_t),$$
for $t = 1, \dots, T$, with boundary conditions $R_{T+1}(x, y, \omega) = 0$ for all $x$, $y$, and $\omega$. The total sample path
revenue is given by
$$R(y, \omega) = R_1(x(1), y, \omega). \tag{3}$$
Our objective is to maximize the expected revenue over
the set $\Theta$ of feasible protection levels:
$$\max_{y \in \Theta} g(y), \quad \text{where} \quad g(y) = E[R(y, \omega)]. \tag{4}$$
(Here, and in what follows, expectation is taken with
respect to the random sequence of demand.)
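The forward equations and the objective can be sketched as a simulation loop plus a plain Monte Carlo average. This is a minimal illustration under assumed data structures: `products`, `draw_path`, and both function names are hypothetical, and the single-leg numbers are chosen only to make the arithmetic easy to follow.

```python
import random


def sample_path_revenue(x1, y, products, path):
    """R(y, omega): revenue collected along one sample path of (j, q) requests.

    products[j] = (r_j, {resource i: virtual class c_i(j)})
    y[i][c]     = protection level y_ic, with y[i][0] = 0 as the dummy level
    """
    x = list(x1)
    total = 0.0
    for j, q in path:
        r, vclass = products[j]
        u = min([q] + [max(x[i] - y[i][c - 1], 0.0) for i, c in vclass.items()])
        for i in vclass:          # x(t+1) = x(t) - A_j * u
            x[i] -= u
        total += r * u
    return total


def estimate_g(x1, y, products, draw_path, n_paths, seed=0):
    """Monte Carlo estimate of g(y) = E[R(y, omega)]."""
    rng = random.Random(seed)
    draws = (sample_path_revenue(x1, y, products, draw_path(rng))
             for _ in range(n_paths))
    return sum(draws) / n_paths


# Hypothetical single-leg instance: three products, one virtual class each.
products = {1: (25.0, {0: 1}), 2: (19.0, {0: 2}), 3: (10.0, {0: 3})}
y = [[0.0, 2.0, 4.0]]
path = [(3, 1.0), (3, 1.0), (3, 1.0), (3, 1.0), (2, 1.0), (1, 1.0)]
revenue = sample_path_revenue([8.0], y, products, path)
```

Any generator of request sequences can be passed as `draw_path`, which mirrors the generality of the demand model described above.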
2. Sample Path Analysis
We begin by analyzing the sample path revenue
$R_t(x(t), y, \omega)$ as a function of the protection levels $y$ and
the remaining capacity $x(t)$.
2.1. Smoothing the Revenue Function
Note that by allowing partial acceptance of requests, the
function $u_j(x(t), y, q)$ defined by (2) is continuous and
piecewise linear in $y$. Moreover, one can easily verify that
$y_{i, c_i(j)-1} = x_i(t) - q$ and $y_{i, c_i(j)-1} = x_i(t)$ are points of nondifferentiability, which makes $R(y, \omega)$ a continuous but
nonsmooth function of $y$. Indeed, we cannot even guarantee
that $R_t(x(t), y, \omega)$ is differentiable with respect to $y$ w.p.1,
because the event $y_{i, c_i(j)-1} = x_i(t)$ can occur with some
positive probability (e.g., with positive probability, we can
get a sequence of high-quantity requests such that the value
$u_j(x(t), y, q) = 0$ in (2) for a sequence of consecutive $t$'s
is determined by the fact that $y_{i, c_i(j)-1} = x_i(t)$). This fact
violates the well-known sufficient conditions for the differentiability of $g(y)$, and in particular for interchanging
differentiation and expectation. (See Glasserman 1994 for
a good reference on this topic.)
To overcome these technical difficulties, we redefine the
sample path as $\omega = \{\omega_t : t = 1, \dots, T\}$, where now $\omega_t = (j_t, q_t, \xi_t)$, and where $\xi_t^i$ is a random variable uniformly
distributed on $[0, \varepsilon]$ for some small $\varepsilon$.$^1$ Then, we consider
the following variation of the problem:
$$R_t(x(t), y, \omega) = r_{j_t} u_{j_t}(x(t) - \xi_t, y, q_t) + R_{t+1}(x(t + 1), y, \omega), \tag{5}$$
$$x(t + 1) = x(t) - \xi_t - A_{j_t} u_{j_t}(x(t) - \xi_t, y, q_t). \tag{6}$$
The idea here is to smooth the acceptance function by randomly perturbing the remaining capacity. Indeed, with this
new perturbation, following the argument in the previous
paragraph, the event $y_{i, c_i(j)-1} = x_i(t) - \xi_t^i$ occurs with probability zero. Thus, the control $u_j$ defined in (2) becomes
differentiable w.p.1. Using the composition defined by (5),
it is not hard to see that the revenue function becomes differentiable w.p.1 as well.
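We are now able to calculate the gradient of the revenue
function with respect to $y$. From (3), we have
$$\nabla_y R(y, \omega) = \nabla_y R_1(x(1), y, \omega) = \left( \frac{\partial}{\partial y_{11}} R_1(x(1), y, \omega), \dots, \frac{\partial}{\partial y_{m\bar{c}}} R_1(x(1), y, \omega) \right).$$
Analogously, the gradient with respect to $x$ is
$$\nabla_x R_1(x(1), y, \omega) = \left( \frac{\partial}{\partial x_1(1)} R_1(x(1), y, \omega), \dots, \frac{\partial}{\partial x_m(1)} R_1(x(1), y, \omega) \right).$$
Using the chain rule, we then obtain the set of backward
equations for the derivatives with respect to $y_{ic}$:
$$\frac{\partial}{\partial y_{ic}} R_t(x(t), y, \omega) = \left( r_{j_t} - \sum_{k \in A_{j_t}} \frac{\partial}{\partial x_k(t + 1)} R_{t+1}(x(t + 1), y, \omega) \right) \frac{\partial}{\partial y_{ic}} u_{j_t}(x(t) - \xi_t, y, q_t) + \frac{\partial}{\partial y_{ic}} R_{t+1}(x(t + 1), y, \omega) \quad \forall i, c, t. \tag{7}$$
A similar set of backward equations for the derivatives with
respect to $x_i$ is given by
$$\frac{\partial}{\partial x_i(t)} R_t(x(t), y, \omega) = \left( r_{j_t} - \sum_{k \in A_{j_t}} \frac{\partial}{\partial x_k(t + 1)} R_{t+1}(x(t + 1), y, \omega) \right) \frac{\partial}{\partial x_i(t)} u_{j_t}(x(t) - \xi_t, y, q_t) + \frac{\partial}{\partial x_i(t + 1)} R_{t+1}(x(t + 1), y, \omega) \quad \forall i, t, \tag{8}$$
with boundary conditions
$$\frac{\partial}{\partial y_{ic}} R_{T+1}(x(T + 1), y, \omega) = 0 \quad \forall i, c, \qquad \frac{\partial}{\partial x_i(T + 1)} R_{T+1}(x(T + 1), y, \omega) = 0 \quad \forall i.$$
The online appendix provides a detailed derivation of these
derivatives of the revenue function.
Note that the general form of the two derivatives is very
similar. The term in parentheses is simply the marginal revenue for accepting one extra unit of product $j$ minus the
marginal displacement cost over the legs used by product $j$; in other words, product $j$'s displacement-adjusted
revenue value. This quantity is multiplied by the derivatives of the acceptance function ($\frac{\partial}{\partial y_{ic}} u_j(x, y, q)$ or
$\frac{\partial}{\partial x_i} u_j(x, y, q)$) to give the marginal value in the current
period. Adding this to the marginal revenue-to-go gives the
total derivative.
2.2. Gradients of $u_j$
We next determine the gradients of the acceptance function $u_j(x, y, q)$. To ensure that partial derivatives are well
defined on the boundary of the feasible set (1), we redefine (2) as follows:
$$u_j(x, y, q) = \min\{q,\ (x_i - y_{ic})^+ : i \in A_j,\ c < c_i(j)\}. \tag{9}$$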
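From (9), one can determine for all $i$ and $c$ the following
derivative:
$$\frac{\partial}{\partial y_{ic}} u_j(x, y, q) = \begin{cases} -1 & \text{if simultaneously} \\ & \text{(i) } i \in A_j, \\ & \text{(ii) } x_i - y_{ic} < x_k - y_{kc'} \quad \forall k \in A_j,\ k \ne i,\ c' < c_k(j), \\ & \text{(iii) } 0 < x_i - y_{ic} < q, \\ & \text{(iv) } c < c_i(j); \\ \text{undefined} & \text{if (ii) or (iii) holds at equality;} \\ 0 & \text{otherwise.} \end{cases} \tag{10}$$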
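Figure 1. Stochastic gradient calculation of the acceptance function with respect to nested protection levels: The zero gradient case.
[Figure: bars for the remaining capacities $x_1$, $x_2$, $x_3$ with protection levels $y_{1, c_1(j)-1}$, $y_{2, c_2(j)-1}$, $y_{3, c_3(j)-1}$ marked; here $u_j = q$.]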
In words, the quantity of demand accepted from a request
for product $j$ in state $x$ is reduced (one-for-one) by a slight
increase in the protection level $y_{ic}$ if and only if all of the
following hold: (i) resource $i$ is used by $j$, (ii) the capacity available on resource $i$ is a binding constraint, (iii) the
amount accepted is positive but constrained by the protection level associated to product $j$ over resource $i$, and
(iv) class $c$ is higher in the nesting order than the virtual
class of product $j$ over resource $i$. If at least one equality holds for (ii) or (iii), then the derivative does not exist.
However, from (7), one can see that the probability of this
event occurring is zero due to the random perturbation of
capacity, and the continuity of demand quantity $q_t$. In all
other cases, a small change in $y_{ic}$ does not affect the amount
of $j$ we accept.
These conditions are further illustrated in Figures 1 and 2,
which show a request at time $t$ for $q$ units of product $j$
that uses resources 1, 2, and 3. The heights of the bars represent the capacity remaining at time $t$, and the quantities $y_{i, c_i(j)-1}$ represent the protection levels for product $j$
on each resource $i$. The unshaded areas therefore represent
the capacity available for product $j$ on each of the three
resources. Note in Figure 1 that there is sufficient capacity
available on all three resources to fully satisfy the request
for product $j$, so $u_j = q$. Thus, a small perturbation in
$y_{i, c_i(j)-1}$ will not affect the quantity of product $j$ accepted in
this time period, and therefore $\frac{\partial}{\partial y_{i, c_i(j)-1}} u_j(x, y, q) = 0$
for all $i$. However, in Figure 2, the requested quantity
exceeds the available capacity on resource 3 and the
request can only be partially filled because of the protection level constraint on resource 3, so $u_j = x_3 - y_{3, c_3(j)-1}$.
In this case, a small increase in $y_{3, c_3(j)-1}$ will reduce the
amount of product $j$ that we accept in this period, so
$\frac{\partial}{\partial y_{3, c_3(j)-1}} u_j(x, y, q) = -1$. An increase in any of the
other protection levels on resources 1 and 2, however, will
not affect the quantity accepted because these protection
levels are not binding. This example illustrates the conditions leading to (10).
Figure 2. Stochastic gradient calculation of the acceptance function with respect to nested protection levels: The nonzero gradient case.
[Figure: bars for the remaining capacities $x_1$, $x_2$, $x_3$ with protection levels $y_{1, c_1(j)-1}$, $y_{2, c_2(j)-1}$, $y_{3, c_3(j)-1}$ marked; here $u_j = x_3 - y_{3, c_3(j)-1} < q$.]
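A similar reasoning provides the derivatives with respect
to $x_i$:
$$\frac{\partial}{\partial x_i} u_j(x, y, q) = \begin{cases} 1 & \text{if simultaneously} \\ & \text{(i) } i \in A_j, \\ & \text{(ii) } x_i - y_{ic} < x_k - y_{kc'} \quad \forall k \in A_j,\ k \ne i,\ c' < c_k(j), \text{ for some } c < c_i(j), \\ & \text{(iii) } 0 < x_i - y_{ic} < q \text{ for some } c < c_i(j); \\ \text{undefined} & \text{if (ii) or (iii) holds at equality;} \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$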
In words, the quantity of demand accepted from a request
for product $j$ in state $x$ is decreased (one-for-one) by a
slight decrease in the capacities $x_i$ if and only if all of
the following hold: (i) resource $i$ is used by $j$, (ii) the
capacity available on resource $i$ is a binding constraint, and
(iii) there is some positive capacity available on resource $i$,
but constrained by the protection level associated to product $j$ over resource $i$ (or by one from a higher virtual
class). Again, the derivative does not exist if (ii) or (iii)
holds with equality, but as before, because of the perturbation of capacity and the continuity of the distribution of $q$,
this event occurs with probability zero. In all other cases,
a small change in $x_i$ does not affect the amount of $j$ we
accept.
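The case analysis behind the two derivatives of the acceptance function can be sketched as a single routine that returns the accepted amount together with its nonzero partial derivatives. This is an illustrative sketch, not the pseudocode of the online appendix: the function name `accept_with_grad`, the sparse dictionaries used for the derivatives, and the numbers are assumptions. It relies on the nesting constraint (1), under which the tightest level on each resource is $y_{i, c_i(j)-1}$, and it ignores ties, which occur with probability zero in the perturbed model.

```python
def accept_with_grad(x, y, vclass, q):
    """Return (u, du_dx, du_dy) for one request.

    x[i]       remaining (already perturbed) capacity on resource i
    y[i][c]    nested protection levels, with y[i][0] = 0 as the dummy level
    vclass[i]  virtual class c_i(j) on each resource i used by the product
    du_dx      {i: derivative of u w.r.t. x_i}, nonzero entries only
    du_dy      {(i, c): derivative of u w.r.t. y_ic}, nonzero entries only
    """
    # Resource with the smallest capacity open to this product.
    i_min = min(vclass, key=lambda i: x[i] - y[i][vclass[i] - 1])
    c_min = vclass[i_min] - 1
    slack = x[i_min] - y[i_min][c_min]
    if slack >= q:        # request fully accepted: u = q, all derivatives zero
        return q, {}, {}
    if slack <= 0.0:      # product closed on some resource: u = 0
        return 0.0, {}, {}
    # Partial acceptance: u = x_i - y_ic on the binding resource.
    du_dy = {(i_min, c_min): -1.0} if c_min >= 1 else {}  # no derivative for the dummy level
    return slack, {i_min: 1.0}, du_dy


# Three resources, loosely in the spirit of Figure 2: resource index 2 is binding.
y = [[0.0, 2.0], [0.0, 1.0], [0.0, 3.5]]
vclass = {0: 2, 1: 2, 2: 2}
binding = accept_with_grad([6.0, 5.0, 4.0], y, vclass, 1.0)
slack_case = accept_with_grad([6.0, 5.0, 4.0], y, vclass, 0.25)
```

In the first call only 0.5 units are open on the third resource, so the request is partially filled and the derivatives are $-1$ with respect to that resource's protection level and $+1$ with respect to its capacity; in the second call the request is fully accepted and every derivative is zero.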
2.3. Example of Revenue Gradient
The following example illustrates the mechanics of the
gradient calculation: Consider a single-leg problem, with
three products, three virtual classes (one per product), and
revenues $r = (25, 19, 10)$. Suppose that the initial capacity is $x(1) = 8$, and that protection levels are $y = (2, 4)$.
Assume for simplicity that all requests are for a single seat,
and suppose that the sequence of requests for products is
$\{3, 3, 3, 3, 2, 1\}$, e.g., corresponding to a sample path
$$\omega = \{(3, 1, 0.01), (3, 1, 0.006), (3, 1, 0.003), (3, 1, 0.001), (2, 1, 0.03), (1, 1, 0.007)\}.$$
In this case, the fourth request for product 3 meets a binding constraint (e.g., after the realization of the random
variable $\xi_4 = 0.001$, $x(4) - \xi_4 = 4.98$) because four seats
are reserved for classes 2 and 1. Then, when marginally
increasing $y_2$ (or decreasing $x$), we will be rejecting a
marginal unit of product 3, decreasing the revenue by 10.
The gradients here are
$$\nabla_y R_1(x(1), y, \omega) = (0, -10) \quad \text{and} \quad \frac{\partial}{\partial x_1(1)} R_1(x(1), y, \omega) = 10.$$
Now, take the same initial situation, and suppose instead
that the sequence of product requests is $\{3, 3, 3, 3, 2, 2, 2, 1, 1, 1\}$.
Here, when processing the second request for product 2, according to Equation (6), the remaining capacity
will be a value close to but less than three, say $x(6) - \xi_6 = 2.993$.
In other words, the second request for product 2
meets a binding constraint and is not fully accepted because
there are two seats reserved for class 1. The second seat
for class 1 is not fully accepted either because there is no
more space available at this point. The gradients here are
then
$$\nabla_y R_1(x(1), y, \omega) = (6, 9) \quad \text{and} \quad \frac{\partial}{\partial x_1(1)} R_1(x(1), y, \omega) = 10.$$
In words, by incrementing $y_2$, we will be able to accept an
additional increment of product 2 at the expense of reducing
a marginal unit of product 3. So, $\frac{\partial}{\partial y_2} R_1(x(1), y, \omega) = 19 - 10$.
Similarly, by incrementing $y_1$, we will accept an
additional increment of product 1, but lose an increment of
product 2: $\frac{\partial}{\partial y_1} R_1(x(1), y, \omega) = 25 - 19$. Regarding the
sensitivity with respect to $x(1)$, when fixing $y$ and decreasing the number of seats available, we reduce the amount of
product 3 accepted.
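Both cases of this example can be reproduced with a short forward and backward pass specialized to a single leg. The sketch below is an illustration written for this example, not the pseudocode of online Appendix A4: the function name and data layout are assumptions, and the perturbations of 0.001 used for the ten-request path are hypothetical values, since only the request sequence is specified for that case.

```python
def single_leg_gradient(x1, y, rev, vclass, path):
    """Sample path revenue and gradients for a single resource.

    y      = [0, y_1, ..., y_cbar], with y[0] the dummy level
    rev[j] = revenue r_j; vclass[j] = virtual class of product j
    path   = [(j, q, xi), ...] requests with capacity perturbation xi
    Returns (R, [dR/dy_1, ..., dR/dy_cbar], dR/dx(1)).
    """
    # Forward pass: simulate acceptances and record which requests are binding.
    x, total, trace = x1, 0.0, []
    for j, q, xi in path:
        c = vclass[j] - 1                 # index of the level restricting product j
        slack = (x - xi) - y[c]
        u = min(q, max(slack, 0.0))
        trace.append((j, c, 0.0 < slack < q))
        x = (x - xi) - u                  # Equation (6)
        total += rev[j] * u
    # Backward pass: Equations (7) and (8) with du/dy = -1 and du/dx = +1
    # at binding requests and zero elsewhere.
    dR_dx, dR_dy = 0.0, [0.0] * len(y)
    for j, c, binding in reversed(trace):
        if binding:
            margin = rev[j] - dR_dx       # displacement-adjusted revenue
            dR_dy[c] -= margin
            dR_dx = margin + dR_dx
    return total, dR_dy[1:], dR_dx        # drop the entry of the dummy level


rev = {1: 25.0, 2: 19.0, 3: 10.0}
vclass = {1: 1, 2: 2, 3: 3}
y = [0.0, 2.0, 4.0]

path_a = [(3, 1.0, 0.01), (3, 1.0, 0.006), (3, 1.0, 0.003),
          (3, 1.0, 0.001), (2, 1.0, 0.03), (1, 1.0, 0.007)]
_, grad_y_a, grad_x_a = single_leg_gradient(8.0, y, rev, vclass, path_a)

# Hypothetical perturbations (0.001 each) for the second request sequence.
path_b = [(j, 1.0, 0.001) for j in (3, 3, 3, 3, 2, 2, 2, 1, 1, 1)]
_, grad_y_b, grad_x_b = single_leg_gradient(8.0, y, rev, vclass, path_b)
```

Under these assumptions the first path yields the $y$-gradient $(0, -10)$ and the second yields $(6, 9)$, with a capacity derivative of 10 in both cases, in line with the values discussed above.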
2.4. Complexity Analysis
The procedure for calculating a sample-based gradient is
described in detail in online Appendix A4. It basically consists of two passes: In the forward pass, we simulate the
acceptance decisions and keep track of the state (available
capacity) of the network, $x(t)$, observed by each arriving
customer $t$. The backward pass then rebuilds the capacity
seen by each customer, identifies all the binding protection levels, and calculates the gradient using (7) and (8).
The overall complexity of this routine is determined by the
backward pass.
Let $K$ denote the total number of binding protection
levels encountered on the sample path, and recall that $T$
denotes the total number of customers. Then, the computation of the sample path gradient takes $O(K + T)$ time.$^2$ The
finite-difference method of Bertsimas and de Boer (2005),
in contrast, takes $O(KT)$ for the same sample because
an efficient implementation of their algorithm consists of
two forward passes: In the first pass, we identify all the
binding protection levels (again, there are $K$). In the second pass, for each binding protection level, we increase the
protection level by one and keep track of the acceptance
decisions and revenues from that request onward under the
increased protection level. This requires tracking another
$O(T)$ booking requests and simulating the acceptance decisions under the new protection level. This results in the
complexity of $O(KT)$ cited above. For realistic size networks under high load factors, one can expect large values of $K$, and hence the complexity difference can become
quite significant.
The above complexity is based on worst-case analysis. To give a sense of the real-world difference in speed between these two gradient estimates, we performed a series of numerical tests on two sample networks. Table 1 reports the CPU time taken by both the first-difference and our continuous gradient estimate over the same 100 sample paths for the two different networks.³ As suggested by the theoretical complexity analysis, the difference in computation time becomes more significant when the networks are congested (and hence, more binding protection levels are met along the sample path).⁴
Note that the running time difference between the methods approaches an order of magnitude in the larger network
case. But even this network is relatively modest in size
compared to large commercial airline networks. Indeed, it
would not be unreasonable to expect differences of two
orders of magnitude or more in very large airline networks.
Such differences in speed can well mean the difference
between solving and not solving a network in a practical
amount of time.
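The "magnitude of time improvement" reported in Table 1 can be recomputed from the two CPU-time columns. The short sketch below assumes that this column is simply the finite-difference time divided by the stochastic gradient time, rounded to one decimal; the dictionary keys and the function name are illustrative only.

```python
# (stochastic gradient CPU seconds, finite-difference CPU seconds) as reported in Table 1
TABLE_1 = {
    "8 legs, 80 products": [
        (0.657, 2.827), (0.653, 2.372), (0.641, 1.803), (0.625, 1.531),
    ],
    "62 legs, 1,844 products": [
        (48.70, 494.11), (48.26, 397.83), (47.96, 355.63), (48.97, 323.21), (49.21, 294.23),
    ],
}


def speedups(rows):
    """Ratio of finite-difference time to stochastic gradient time, one decimal."""
    return [round(fd / sg, 1) for sg, fd in rows]


for network, rows in TABLE_1.items():
    print(network, speedups(rows))
```

The ratios grow with the demand factor in both networks, which is the pattern the complexity argument predicts: more congestion means more binding protection levels $K$, and the finite-difference cost scales with $K$.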
Table 1. Comparison of CPU times between the stochastic gradient and finite-difference methods over 100 sample paths.

| Network m | Network n | Demand factor | Average sample size | Stochastic gradient CPU seconds | Finite-difference CPU seconds | Magnitude of time improvement |
|---|---|---|---|---|---|---|
| 8 | 80 | 1.25 | 1,030 | 0.657 | 2.827 | 4.3 |
| 8 | 80 | 1.11 | 1,030 | 0.653 | 2.372 | 3.6 |
| 8 | 80 | 1.00 | 1,030 | 0.641 | 1.803 | 2.8 |
| 8 | 80 | 0.91 | 1,030 | 0.625 | 1.531 | 2.4 |
| 62 | 1,844 | 1.28 | 7,696 | 48.70 | 494.11 | 10.1 |
| 62 | 1,844 | 1.10 | 7,696 | 48.26 | 397.83 | 8.2 |
| 62 | 1,844 | 0.96 | 7,696 | 47.96 | 355.63 | 7.4 |
| 62 | 1,844 | 0.86 | 7,696 | 48.97 | 323.21 | 6.6 |
| 62 | 1,844 | 0.77 | 7,696 | 49.21 | 294.23 | 6.0 |

2.5. Properties of the Revenue Function
We next look at several important theoretical properties of the revenue function.
2.5.1. Quasiconcavity. The important result here is a negative one: namely, we show that, in general, the sample path revenue function is not quasiconcave in the protection levels vector. The significance of this result is that without quasiconcavity, we cannot preclude the possibility that there may be local optima in the expected revenue of the continuous problem (4). While this is not entirely surprising, this property is worth verifying formally.

Table 2. Sample path for the quasiconcavity counterexample.

| Customer $t$ | Product $j_t$ | $y = (3, 5)$: Decision | $y = (3, 5)$: $x_{t+1}$ | $z = (1, 7)$: Decision | $z = (1, 7)$: $x_{t+1}$ | $0.5y + 0.5z$: Decision | $0.5y + 0.5z$: $x_{t+1}$ |
|---|---|---|---|---|---|---|---|
| 1 | 2 | Accepted | 7 | Accepted | 7 | Accepted | 7 |
| 2 | 3 | Accepted | 6 | Rejected | 7 | Accepted | 6 |
| 3 | 3 | Accepted | 5 | Rejected | 7 | Rejected | 6 |
| 4 | 2 | Accepted | 4 | Accepted | 6 | Accepted | 5 |
| 5 | 2 | Accepted | 3 | Accepted | 5 | Accepted | 4 |
| 6 | 2 | Rejected | 3 | Accepted | 4 | Accepted | 3 |
| 7 | 2 | Rejected | 3 | Accepted | 3 | Accepted | 2 |
| 8 | 2 | Rejected | 3 | Accepted | 2 | Rejected | 2 |
| 9 | 1 | Accepted | 2 | Accepted | 1 | Accepted | 1 |
| 10 | 1 | Accepted | 1 | Accepted | 0 | Accepted | 0 |
| 11 | 1 | Accepted | 0 | Rejected | 0 | Rejected | 0 |
| Total revenue | | 63 | | 62 | | 61 | |
Theorem 1. There exist sample paths $\omega$ on which the sample path revenue function $R(y, \omega)$ for the continuous problem is not quasiconcave.
Proof. We exhibit a sample path in Table 2, on which the revenue function is not quasiconcave.⁵ There is a single leg ($m = 1$), and three products (virtual classes) with revenues $r = (10, 7, 6)$. The capacity is set at $x = 8$, and 11 customers arrive, with requests for just one unit each. There are two sets of protection levels given by $y = (3, 5)$, i.e., the seller reserves three seats for the highest class and five for classes 1 and 2; and $z = (1, 7)$. Their convex combination given by $\alpha y + (1 - \alpha) z$ at $\alpha = 0.5$ is the set of protection levels $(2, 6)$. Columns labeled $x_{t+1}$ represent the remaining capacity for the next customer $t + 1$. Given the integrality of protection levels and the unit requests, each customer is fully accepted or rejected.
Table 2 shows that the revenue from protection levels $y$ is $3 \times 10 + 3 \times 7 + 2 \times 6 = 63$ (i.e., the seller has accepted three units of product 1, three of product 2, and two of product 3). From protection levels $z$, revenue is $2 \times 10 + 6 \times 7 = 62$, and from the average protection levels, revenue is $2 \times 10 + 5 \times 7 + 1 \times 6 = 61$. Hence, quasiconcavity fails; that is, this sample path yields strictly less revenue using a convex combination of two sets of protection levels, in this case $(2, 6)$, than with either $y = (3, 5)$ or $z = (1, 7)$.
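The counterexample is small enough to replay directly. The sketch below is an illustrative simulation of a discrete single-leg nested control with unit requests (class $c$ may book only while the remaining capacity exceeds the protection level $y_{c-1}$, with class 1 unrestricted); the function and variable names are not from the paper.

```python
def simulate_revenue(protection, capacity, requests, revenue):
    """Total revenue of a single-leg nested control with unit requests."""
    protected = [0] + list(protection)   # protected[c - 1] = seats withheld from class c
    remaining = capacity
    total = 0
    for product in requests:
        if remaining - protected[product - 1] >= 1:   # a full seat is available to this class
            remaining -= 1
            total += revenue[product - 1]
    return total


REQUESTS = [2, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1]   # products requested by customers 1..11
REVENUE = [10, 7, 6]
CAPACITY = 8
y, z = (3, 5), (1, 7)
midpoint = tuple(0.5 * a + 0.5 * b for a, b in zip(y, z))   # the convex combination (2, 6)

for levels in (y, z, midpoint):
    print(levels, simulate_revenue(levels, CAPACITY, REQUESTS, REVENUE))
```

The three runs give revenues of 63, 62, and 61, so the midpoint does strictly worse than both endpoints, which is exactly what quasiconcavity would rule out.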
2.5.2. Continuity and Differentiability. We will use the following observations in the sequel (see online Appendix A2). For any $t = 1, \ldots, T$, there exist constants $k_y$ and $k_x$ such that
$$\left| \frac{\partial}{\partial y_{ic}} R_t(x_t, y) \right| \le k_y \tag{12}$$
and
$$\left| \frac{\partial}{\partial x_k} R_t(x_t, y) \right| \le k_x. \tag{13}$$
The next lemma records a direct consequence of (12):
The revenue function is Lipschitz continuous in the protection levels.
Lemma 1. Let $y$ and $z$ be two feasible sets of protection levels. Then, there exists a constant $K_R$, independent of $x$ and $\omega$, such that
$$|R_t(x_t, z) - R_t(x_t, y)| \le K_R \|z - y\|$$
for all $t = 1, \ldots, T$ for the continuous, a.s. differentiable model (5)–(6).
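The next lemma justifies the interchange of the expectation and differentiation operators on a sample path $\omega$:
Lemma 2. For the randomly perturbed model (5)–(6), the gradient $\nabla_y E[R(y, \omega)]$ exists for all $y \in \Theta$, and
$$\nabla_y E[R(y, \omega)] = E[\nabla_y R(y, \omega)].$$
Proof. By definition of partial derivative,
$$\frac{\partial}{\partial y_{ic}} R(y, \omega) = \lim_{h \to 0} \frac{R(y + h e_{ic}, \omega) - R(y, \omega)}{h},$$
where $e_{ic}$ denotes the vector in $\mathbb{R}^{mc}$ that has a one in component $(i, c)$, and zero otherwise. Note that the numerator satisfies $|R(y + h e_{ic}, \omega) - R(y, \omega)| \le k_y |h|$, and hence for all sequences $\{h_k\}$ such that $h_k \to 0$, the random variable $R^k_{ic}(y, \omega) \equiv (R(y + h_k e_{ic}, \omega) - R(y, \omega))/h_k$ verifies $|R^k_{ic}(y, \omega)| \le k_y$ for all $k$ and $\omega$, and hence it is uniformly bounded. Now, applying the bounded convergence theorem (e.g., see Billingsley 1995, Theorem 5.4), we get for all sequences $h_k \to 0$,
$$\lim_{h_k \to 0} E[R^k_{ic}(y, \omega)] = E\left[\frac{\partial}{\partial y_{ic}} R(y, \omega)\right].$$
The left-hand side is in fact
$$\frac{\partial}{\partial y_{ic}} E[R(y, \omega)],$$
and the statement of the lemma follows.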
The following continuity result is critical for solving (4).
Theorem 2. The objective function $g(y)$ is continuously differentiable on $\Theta$.
Proof. From Lemma 2, $\nabla g(y) = \nabla_y E[R(y, \omega)] = E[\nabla_y R(y, \omega)]$, where $\nabla_y R(y, \omega)$ is defined by the (perturbed) recursion (7). As discussed in §2.1, for any given $y \in \Theta$, this recursion is differentiable at $y$ w.p.1, and moreover, by the same argument, the gradient is also continuous at $y$ w.p.1. That is, for any sequence of vectors $y^k$ tending to $y$,
$$\lim_{y^k \to y} \nabla_y R(y^k, \omega) = \nabla_y R(y, \omega) \quad \text{w.p.1.}$$
But from (12), the derivatives with respect to $y$ are uniformly bounded by $k_y$. Hence, by the bounded convergence theorem and Lemma 2, we have that
$$\lim_{y^k \to y} \nabla_y g(y^k) = \lim_{y^k \to y} E[\nabla_y R(y^k, \omega)] = E[\nabla_y R(y, \omega)] = \nabla_y g(y).$$
Hence, $\nabla g(y)$ is continuous at $y$ for all $y \in \Theta$.
2.5.3. Bounded Variance. The following result shows
that the variance of the stochastic partial derivative of
the revenue function is bounded. This result is important
for proving the local convergence of stochastic gradient
methods.
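Lemma 3. There exists a finite constant $C$ such that $\mathrm{Var}\left(\frac{\partial}{\partial y_{ic}} R(y, \omega)\right) \le C$.
Proof. Applying (12),
$$\mathrm{Var}\left(\frac{\partial}{\partial y_{ic}} R(y, \omega)\right) \le E\left[\left(\frac{\partial}{\partial y_{ic}} R(y, \omega)\right)^2\right] \le k_y^2.$$
So, it is enough to take $C = k_y^2$.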
3. Stochastic Gradient Algorithm
We next look at using the sample path gradients in a simulation-based optimization algorithm. There are two main approaches for doing this: sample average approximation methods (SAA) and stochastic approximation methods (SA).
SAA methods generate a fixed set of sample paths and then solve a deterministic problem on this fixed sample using traditional optimization techniques. The theory and application of SAA are discussed in Plambeck et al. (1996), Kleywegt and Shapiro (2001), and Shapiro (2000). In a nonsmooth framework such as our original formulation in §1, bundle methods like the ones discussed in Lemaréchal (1989, §6) are appropriate.
In contrast, SA methods generate a single sample path per iteration, and then a gradient from this sample is used as a steepest ascent direction. A step is taken in this direction and the procedure is repeated. The stochastic approximation method originated in Robbins and Monro (1951). Kushner and Clark (1978), Benveniste et al. (1990), and the recent book by Kushner and Yin (2003) contain expositions of its theory.
Some applications in the OR/OM field are the queueing papers by L'Ecuyer and Glynn (1994) and L'Ecuyer et al. (1994). Mahajan and van Ryzin (2001) use an SA method to solve an inventory problem where customers substitute among product variants within a retail assortment when inventory is depleted. In the RM arena, Karaesmen and van Ryzin (2004) apply these techniques to an overbooking problem with substitutable inventory classes.
A sketch of the stochastic gradient method is as follows: For a random function $F(x) = E[f(x, \omega)]$, the method attempts to maximize $F(x)$ on a constraint set $X$ through the iterations
$$x^{k+1} = \Pi_X(x^k + \rho^k s^k), \quad x^0 \in X, \quad k = 0, 1, \ldots,$$
where $\Pi_X$ is a projection operator on the set $X$, $\rho^k$ is a step length, and $s^k$ is an estimate of the gradient of $F(x^k)$.
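To make the iteration concrete, here is a minimal sketch on a toy problem that is unrelated to the revenue model: maximize $F(x) = E[-(x - W)^2/2]$ with $W$ uniform on $[0, 1]$ over $X = [0, 1]$, whose maximizer is $x^* = 0.5$. The noisy gradient $W - x$, the step size $a/k$ with $a = 1$, and all function names are choices made for this illustration.

```python
import random


def project_interval(x, lo, hi):
    """Projection onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def stochastic_gradient_ascent(sample_gradient, x0, lo, hi, a, iterations, rng):
    """Projected stochastic gradient ascent with step size a / k."""
    x = x0
    for k in range(1, iterations + 1):
        step = a / k
        x = project_interval(x + step * sample_gradient(x, rng), lo, hi)
    return x


def toy_gradient(x, rng):
    """Unbiased gradient sample of F(x) = E[-(x - W)^2 / 2], W ~ Uniform(0, 1)."""
    return rng.random() - x


rng = random.Random(0)
x_final = stochastic_gradient_ascent(toy_gradient, x0=0.0, lo=0.0, hi=1.0,
                                     a=1.0, iterations=20000, rng=rng)
print(x_final)   # close to the maximizer 0.5
```

With $a = 1$ the iterate is just the running average of the samples of $W$, which shows why step sizes of order $1/k$ average out the gradient noise while still accumulating enough total movement to reach the optimum.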
Based on preliminary tests using both the SAA and the SA methods, we settled on using SA. In our experience, naive versions of the SAA were less efficient than SA because the cost of each gradient calculation was quite high (e.g., SAA required computing gradients for, say, 1,000 sample paths for each iteration versus one in the case of SA). Thus, despite the fact that SA required more iterations in total, it was faster overall. However, this is not an entirely fair comparison because we did not try SAA with more advanced bundle methods, which would likely improve the efficiency of the approach significantly. In general, more sophisticated algorithms applied to SAA would be worth investigating but are somewhat beyond the scope of this paper. Our aim, instead, was simply to implement a straightforward version of the algorithm to get a first-cut sense of its performance. In addition, the SA algorithm is provably convergent in our case (as we show below) and thus it has some theoretical value as well.
3.1. Algorithm and Implementation Choices
To maximize $g(y) = E[R(y, \omega)]$ over the convex compact set $\Theta$ defined by constraints (1), we require an initial feasible point $y^0 \in \Theta$, and a sequence of step sizes $\{\rho^k\}$ satisfying
$$\rho^k > 0, \quad \lim_{k \to \infty} \rho^k = 0, \quad \sum_{k=1}^{\infty} \rho^k = +\infty, \quad \text{and} \quad \sum_{k=1}^{\infty} (\rho^k)^2 < +\infty. \tag{14}$$
In particular, we have chosen a step size $\rho^k = a/k$ for some constant $a > 0$.
For simulated demand streams $\omega^1, \ldots, \omega^N$, our implementation of the stochastic gradient method proceeds as follows.
Stochastic Gradient Algorithm
Step 1. Compute an initial feasible set of protection levels $y^0$.
Step 2. For $k = 1$ to $N$ do:
(a) Calculate the sample path gradient over demand stream $\omega^k$: $\nabla_y R(y^{k-1}, \omega^k)$.
(b) Set new step size $\rho^k = a/k$.
(c) Update the protection levels for the next iteration, using the equation
$$y^k = \Pi_\Theta\left(y^{k-1} + \rho^k \nabla_y R(y^{k-1}, \omega^k)\right),$$
where $\Pi_\Theta$ is the orthogonal projection into the feasible set $\Theta$.
Step 3. Return $y^N$. Stop.
Some comments about the implementation of this algorithm are in order. We ran a fixed number, $N$, of iterations to improve an initial feasible set of protection levels obtained in Step 1 (typically, $N$ was on the order of thousands). In practice, various stopping criteria can be employed to terminate the algorithm, although one weakness of stochastic gradient methods is that they lack good stopping criteria (Shapiro 2000).
The step size chosen in Step 2(b) is a popular choice. Alternative step-size rules for more general stochastic quasigradient methods can be found in Pflug (1988).
The projection in Step 2(c) is of the form
$$y^* = \Pi_\Theta(z): \quad y^* = \arg\min_{y \in \Theta} \|y - z\|.$$
For each resource $i$, this projection is given by a quadratic program with linear constraints, and can be solved efficiently using standard methods like a modification of the simplex method (see Wolfe 1959) or barrier-type algorithms (see Bertsekas 1999, Chapter 4). Note that this projection typically involves a small number of variables and a small number of constraints.⁶
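For intuition, the projection can be written in a few lines when the feasible set of one resource has the nested form $0 \le y_1 \le \cdots \le y_c \le x$. That form is an assumption here, since constraints (1) are not restated in this section, and the routine below is a swapped-in technique (pool-adjacent-violators followed by clipping), not the quadratic programming methods cited above. For a common lower and upper bound, clipping the isotonic fit is known to give the Euclidean projection onto the bounded monotone set.

```python
def project_nested(z, capacity):
    """Euclidean projection of z onto {y : 0 <= y[0] <= ... <= y[-1] <= capacity}."""
    # Pool-adjacent-violators: merge neighbouring blocks whose means are out of order.
    blocks = []                       # each block is [sum of values, number of values]
    for value in z:
        blocks.append([value, 1])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    monotone = []
    for total, count in blocks:
        monotone.extend([total / count] * count)
    # Clip the monotone fit to the box [0, capacity].
    return [min(max(v, 0.0), capacity) for v in monotone]


print(project_nested([5, 3, 12], 10))    # out-of-order and too-large entries are repaired
print(project_nested([30, 80], 150))     # a feasible point is left unchanged
```

A gradient step that pushes $y_1$ above $y_2$ is pulled back to their average, and any level outside $[0, x]$ is moved to the nearest bound, so each call costs time linear in the number of protection levels on the leg.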
We finally note that in a commercial implementation, the simulations could be run on parallel processors, with each CPU generating its own sequence of demand and calculating the resulting sample path gradient. In this sense, the algorithm is highly parallelizable.
3.2. Convergence
Observe that in view of Theorem 1, our algorithm is unlikely to be globally convergent. However, it has at least robust local convergence properties. Recall that the gradient $\nabla_y R(y^{k-1}, \omega^k)$ is a noisy representation of the gradient of $g(y^{k-1})$. Let the noise (error) in the gradient at iteration $k$ be the vector $\zeta^k \equiv \nabla_y R(y^{k-1}, \omega^k) - \nabla g(y^{k-1})$. From Lemma 2, we have that $E[\zeta^k \mid y^0, \ldots, y^{k-1}] = 0$.
Let the cumulative step sizes be defined as $s_k = \sum_{i=1}^{k-1} \rho^i$, and define a function $m(s)$ such that $m(s) = \max\{k : s_k \le s\}$ for $s \ge 0$, and $m(s) = 0$ otherwise. Note that from the choice of $\rho^k$ in (14), $m(s) < \infty$. Consider the following conditions:
Assumption A1. $\{\rho^k\}$ is a sequence of positive real numbers such that $\lim_{k \to \infty} \rho^k = 0$, $\sum_{k=1}^{\infty} \rho^k = +\infty$.
Assumption A2. Let the constraint set for the problem be defined by $\Theta = \{y : \varphi_j(y) \le 0,\ j = 1, \ldots, s\}$. The set $\Theta$ is closed and bounded. The $\varphi_j(\cdot)$, $j = 1, \ldots, s$, are continuously differentiable. At each $y$ that is on the boundary of $\Theta$, the gradients of the active constraints are linearly independent.
Assumption A3. $\lim_{k \to \infty} P\left(\sup_{k \le l \le m(s_k + s)} \left\|\sum_{i=k}^{l} \rho^i \zeta^i\right\| > \varepsilon\right) = 0$ for each $\varepsilon > 0$ and $s > 0$.
Assumption A4. $g(\cdot)$ is a continuously differentiable real-valued function.
Assumption A5. $\rho^k E[\|\zeta^k\|^2] \to 0$ as $k \to \infty$.
Theorem 3. Let $KT$ be the set of Kuhn-Tucker points for the continuous problem (5)–(6). Then, the stochastic gradient algorithm described above verifies Assumptions A1–A5. Moreover, if $KT$ is a connected set, the sequence of points $\{y^k\} \to KT$ in probability as $k \to \infty$.
Proof. We argue here that Assumptions A1–A5 hold for our algorithm.
A1 is satisfied by our choice of the step sizes $\rho^k$ in (14).
A2 is satisfied by our constraint set (1) and our assumption that the initial capacity satisfies $x_i(1) > 0$ for all $i$.
For A3, note that for a fixed $k$, the sequence $\left\{\sum_{i=k}^{l} \rho^i \zeta^i,\ l \ge k\right\}$ is a martingale. Because the Euclidean norm function is convex, we have that $\left\{\left\|\sum_{i=k}^{l} \rho^i \zeta^i\right\|,\ l \ge k\right\}$ is a submartingale. Therefore,
$$P\left(\sup_{k \le l \le m(s_k + s)} \left\|\sum_{i=k}^{l} \rho^i \zeta^i\right\| > \varepsilon\right) = P\left(\sup_{k \le l \le m(s_k + s)} \left\|\sum_{i=k}^{l} \rho^i \zeta^i\right\|^2 > \varepsilon^2\right) \le \frac{E\left[\left\|\sum_{i=k}^{m(s_k + s)} \rho^i \zeta^i\right\|^2\right]}{\varepsilon^2} = \frac{\sum_{i=k}^{m(s_k + s)} (\rho^i)^2 E[\|\zeta^i\|^2]}{\varepsilon^2} \le \frac{\sum_{i=k}^{\infty} (\rho^i)^2 E[\|\zeta^i\|^2]}{\varepsilon^2},$$
where the first inequality holds from Kolmogorov's inequality for submartingales (e.g., see Ross 1996, Theorem 6.4.4), and the next equality holds because the noise terms are uncorrelated. Because of our choice of step sizes in (14) and bounded variance of the noise term (from Lemma 3), the last upper bound goes to zero as $k$ goes to infinity.
A4 holds by Theorem 2.
A5 holds by the choice of $\rho^k$ and Lemma 3.
The convergence result follows from Theorem 6.3.1 in Kushner and Clark (1978).
A weaker convergence result holds when $KT$ is not connected. To state it, we first define an interpolation for the sequence $\{y^k\}$. We define the continuous function $y(s)$ by
$$y(s) = \begin{cases} y^k & \text{if } s = s_k, \\ \dfrac{s_{k+1} - s}{\rho^k}\, y^k + \dfrac{s - s_k}{\rho^k}\, y^{k+1} & \text{if } s \in (s_k, s_{k+1}). \end{cases}$$
Observe that $y(s)$ is just a linear interpolation of the values $y^k$ as a function of the cumulative step sizes $s_k$. Let $N_\varepsilon(KT)$ denote the $\varepsilon$-neighborhood of the set $KT$. Kushner and Clark (1978, Theorem 6.3.1) show that under A1–A5, if $KT$ is not connected, then for each $\delta > 0$ and $\varepsilon > 0$, there exists a finite constant $s_0$ such that $s > s_0$ implies
$$\lim_{k \to \infty} \frac{1}{2s} \int_{-s}^{s} I\{y(s_k + v) \in \mathbb{R}^{mc} \setminus N_\varepsilon(KT)\}\, dv \le \delta.$$
Roughly, this result says that the average amount of time the iterates $y^k$ lie more than $\varepsilon$ away from a point in $KT$ (averaging over a sufficiently large but finite interval) becomes arbitrarily small as $k$ increases. It is basically a convergence in probability of a moving average of $y^k$ rather than a convergence of $y^k$ itself.
Summarizing, the algorithm we proposed satisfies the conditions for the convergence to a Kuhn-Tucker point. When $KT$ is connected, we have convergence in probability. Even if $KT$ is not connected, we still have a guarantee of convergence of the average of iterates to a point arbitrarily close to $KT$. We emphasize, though, that all these are only local convergence guarantees.
4. Numerical Experiments
In this section, we describe the results of some numerical
tests of our algorithm. We rst describe the implementation
and experimental design details of our tests and then look
at the results for three example networks.
4.1. Algorithm Implementation Details
For the virtual nesting scheme, we used a standard version of displacement adjusted virtual nesting (DAVN) as proposed in Williamson (1992). The method is based on solving a deterministic linear program based on the mean demands. The dual variables from this linear program are used both to compute displacement adjusted revenues and to cluster products into virtual classes. The resulting single-leg models for each leg are solved using the EMSR-b (expected marginal seat revenue) heuristic of Belobaba (1992). Requests are processed through theft nesting. Our particular implementation of this method is described in detail in online Appendix A3. This overall scheme is commonly used in practice and therefore serves as a good reference point for testing our algorithm.
As is common in practice, we also reoptimized the DAVN control periodically during the booking process. Specifically, we split the booking horizon into three periods and reoptimized the protection levels at the start of each period. We kept the original displacement adjusted revenues and mapping to virtual classes and only updated the inputs to the single-leg problems (i.e., the current remaining capacity and demand-to-come statistics).
As for the gradient estimate, while we perturbed the remaining capacity in our original problem by a random noise term to ensure a.s. differentiability, our computations are mainly performed without introducing this perturbation. We did also test the perturbed version of the algorithm.⁷ However, note that the perturbation could be arbitrarily small, eventually smaller than the tolerance of any computer. In this case, one can avoid explicitly simulating this perturbation by noticing the following: If we focus on the a.s. differentiable control $u_j(x, \xi, y, q)$ in (5)–(6), and consider its rate of change with respect to both $y_{ic}$ and $x_i$ when $\varepsilon \to 0$, it corresponds to $\partial^+/\partial y_{ic}\, u_j(x, y, q)$ and $\partial^-/\partial x_i\, u_j(x, y, q)$ (i.e., the right partial derivative with respect to $y_{ic}$ and the left partial derivative with respect to $x_i$, respectively, because a negative perturbation in the remaining capacity is equivalent to a marginal increase in the protection levels and a marginal decrease in the available capacity). Observe that these directional partial derivatives always exist because the revenue function is piecewise linear. Moreover, they are defined by (10) and (11), albeit now with the upper limits described in conditions (ii) and (iii) replaced by nonstrict inequalities.
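Given these observations, when $\varepsilon$ is arbitrarily small, we can simply use the approximations
$$\frac{\partial}{\partial y_{ic}} R_t(x_t, y) \approx \left(r_{j_t} - \sum_{k \in A_{j_t}} \frac{\partial}{\partial x_k} R_{t+1}(x_{t+1}, y)\right) \frac{\partial^+}{\partial y_{ic}} u_{j_t}(x_t, y, q_t) + \frac{\partial}{\partial y_{ic}} R_{t+1}(x_{t+1}, y) \quad \forall\, i, c, t,$$
and
$$\frac{\partial}{\partial x_i} R_t(x_t, y) \approx \left(r_{j_t} - \sum_{k \in A_{j_t}} \frac{\partial}{\partial x_k} R_{t+1}(x_{t+1}, y)\right) \frac{\partial^-}{\partial x_i} u_{j_t}(x_t, y, q_t) + \frac{\partial}{\partial x_i} R_{t+1}(x_{t+1}, y) \quad \forall\, i, t.$$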
This nonperturbed version produced exactly the same results as the perturbed version when applied over several examples when we set $\varepsilon = 10^{-10}$; they showed negligibly different results when we set $\varepsilon = 10^{-5}$. However, in terms of memory requirements and computational time, the nonperturbed version is clearly more efficient.⁸ Because the nonperturbed version requires a simpler data structure, is computationally more efficient, and performed equally well in our numerical tests, this is the version that would likely be implemented in actual applications. Hence, we concentrated our tests on this version of the algorithm.
We have also tried computing the gradient based on a
batch of sample paths instead of based on a single one, and
then taking the average of them. The purpose is smoothing estimates as in the stochastic quasigradient method
of Ermoliev (1988). Given a pool of simulated demand
instances, our experiments suggest that it is more convenient to run a large number of iterations computing the gradient based on a single sample path, than to get an accurate
estimate at each iteration through averaging the gradient
computed over multiple sample paths, and running fewer
iterations.
In terms of the computing platform, we implemented the stochastic gradient algorithm in C++ on a Pentium IV workstation (2.00 GHz CPU and 512 MB of RAM) under Windows 2000.⁹
4.2. Examples
Example 1. We start with a very small example to illustrate how the stochastic gradient algorithm evolves from an initial solution. Here, we have a single leg with capacity $x = 150$ and three different products (virtual classes). The revenues are $r = (200, 160, 100)$. Demands are assumed normal, with means $\mu = (40, 65, 70)$, standard deviations $\sigma_j = \sqrt{\mu_j}$, and distributions truncated between zero and $2\mu_j$. From these parameters, we calculate discrete distributions (a probability mass function is calculated from the normal c.d.f.). Customer arrival order is from low- to high-fare classes.
The same set of 200 sample paths is applied to two different initial sets of protection levels, and the evolution of the successive protection levels visited by the algorithm is shown in Figure 3. We chose a step length $\rho^k = 0.1/k$.

Figure 3. Evolution of the protection levels visited by the stochastic algorithm in Example 1, when applying it over the same 200 sample paths. (Horizontal axis: $y_1$; vertical axis: $y_2$; one trajectory starts from (30, 80) and the other from (35, 103).)

The first set of initial protection levels was $y^0 = (30, 80)$. The output of the algorithm before rounding was $y^N = (32.11, 103.42)$. The sample average revenue increased from 21,139 at $y^0$ to 22,209 at $\mathrm{Round}(y^N) = (32, 103)$, corresponding to an expected revenue gap of 5.06% with a 95% confidence interval of (-1.88%, 12.00%).
The second initial set was provided by EMSR-b: $y^0 = (35, 103)$. The output of the algorithm before rounding was $y^N = (35.91, 105.94)$. The sample average revenue increased from 22,236 at $y^0$ to 22,276 at $\mathrm{Round}(y^N) = (36, 106)$, corresponding to an expected revenue gap of 0.18% with a 95% confidence interval of (-2.08%, 2.44%).
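The EMSR-b starting point can be approximated with a few lines of code. The sketch below uses the textbook EMSR-b rule (aggregate the higher classes, then protect $\mu + \sigma\,\Phi^{-1}(1 - r_{c+1}/\bar r)$ seats) with untruncated normal demand, which is a simplification of the discretized, truncated distributions used in the example; the function name is illustrative and this is not the implementation of online Appendix A3.

```python
from math import sqrt
from statistics import NormalDist


def emsr_b_protection_levels(revenues, means, std_devs):
    """Textbook EMSR-b protection levels for classes ordered from highest to lowest fare."""
    levels = []
    for c in range(1, len(revenues)):
        agg_mean = sum(means[:c])                                  # demand of classes 1..c
        agg_std = sqrt(sum(s * s for s in std_devs[:c]))           # independent demands assumed
        avg_revenue = sum(r * m for r, m in zip(revenues[:c], means[:c])) / agg_mean
        ratio = revenues[c] / avg_revenue                          # fare of class c + 1 vs. aggregate
        levels.append(agg_mean + agg_std * NormalDist().inv_cdf(1.0 - ratio))
    return levels


revenues = [200, 160, 100]
means = [40, 65, 70]
std_devs = [sqrt(m) for m in means]          # sigma_j = sqrt(mu_j), as in Example 1

levels = emsr_b_protection_levels(revenues, means, std_devs)
print([round(v, 2) for v in levels], [round(v) for v in levels])
```

Under these simplifying assumptions the rounded levels coincide with the EMSR-b starting point $(35, 103)$ quoted above, which suggests the truncation and discretization have little effect at these demand levels.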
Figure 3 illustrates the behavior of the stochastic algorithm from these two starting points. The pattern of progress observed here is typical of the algorithm behavior with decreasing jump sizes and oscillating behavior. (See Gaivoronski 2005 for other numerical examples exhibiting similar convergence patterns for stochastic quasigradient methods.) Despite the small number of instances generated, note that in both cases, the algorithm brought the initial protection levels close to the near-optimal ones given by EMSR-b (widely used in airline practice), yet still showing some improvement over them.
However, one must be careful not to interpret Figure 3 as representing convergence to a (locally optimal) solution per se. First, the decreasing step sizes mean that the algorithm's progress naturally stalls after a large number of iterations simply because the step sizes become very small. Second, we terminated the algorithm after a fixed number of iterations. At best, therefore, one can claim the algorithm moves the protection levels to an improved set of values. Convergence to optimal values would likely take many more iterations. Still, as a practical matter, achieving an improvement in the protection levels is the main aim of the algorithm and by this measure it appears to do well.
Example 2. This example is based on the five-airport network shown in Figure 4 and data from Williamson (1992). There are 10 round-trip itineraries. Each itinerary is segmented into four different classes: Y, M, B, and Q (see Table 3). Each product is a combination of one-way itinerary and class (e.g., product 1 is the one-way itinerary ATL-BOS for class Y, which has revenue $r_1 = 310$ and mean demand $\mu_1 = 12$; product 2 corresponds to ATL-BOS, class M, with revenue $r_2 = 290$ and mean demand $\mu_2 = 9$; product 5 is BOS-ATL, class Y, with associated $r_5 = 310$ and $\mu_5 = 12$, etc.). This gives eight legs in the network and 80 products.

Figure 4. Five-airport network for Example 2.

Table 3. Description of the network in Figure 4.

| O-D market | Revenue Y | Revenue M | Revenue B | Revenue Q | Mean demand Y | Mean demand M | Mean demand B | Mean demand Q |
|---|---|---|---|---|---|---|---|---|
| ATL-BOS/BOS-ATL | 310 | 290 | 95 | 69 | 12 | 9 | 11 | 17 |
| ATL-LAX/LAX-ATL | 455 | 391 | 142 | 122 | 8 | 4 | 11 | 27 |
| ATL-MIA/MIA-ATL | 280 | 209 | 94 | 59 | 20 | 9 | 7 | 14 |
| ATL-SAV/SAV-ATL | 159 | 140 | 64 | 49 | 25 | 7 | 5 | 13 |
| BOS-LAX/LAX-BOS | 575 | 380 | 159 | 139 | 9 | 7 | 11 | 24 |
| BOS-MIA/MIA-BOS | 403 | 314 | 124 | 89 | 11 | 5 | 14 | 20 |
| BOS-SAV/SAV-BOS | 319 | 250 | 109 | 69 | 5 | 7 | 11 | 27 |
| LAX-MIA/MIA-LAX | 477 | 239 | 139 | 119 | 17 | 11 | 7 | 14 |
| LAX-SAV/SAV-LAX | 502 | 450 | 154 | 134 | 5 | 7 | 7 | 32 |
| MIA-SAV/SAV-MIA | 226 | 168 | 84 | 59 | 13 | 4 | 7 | 25 |
We assume that demand arrives from lower to higher in revenue order: First, requests for products $j = 4, 8, 12, \ldots, 80$ arrive (class Q); followed by requests for products $j = 3, 7, 11, \ldots, 79$ (class B); then by products $2, 6, 10, \ldots, 78$ (class M), and finally by products $1, 5, 9, \ldots, 77$ (class Y).
Demand for product $j$ was assumed to be normally distributed with mean $E[D_j] = \mu_j$ (provided in Table 3, as explained above) and standard deviation $\sigma_j = \sqrt{\mu_j}$. This distribution was then truncated between zero and $2\mu_j$. From the resulting continuous c.d.f., a probability mass function was then calculated. That is, demand was treated as a discrete random variable such that for a given integer value $a$, the probability that demand was equal to $a$ was taken as $\int_{a - 1/2}^{a + 1/2} f(x)\, dx$, where $f(x)$ is the continuous density function. We assumed that each customer requests exactly one seat ($q_t = 1$ for all $t$). The maximum number of protection levels per leg was set at $c = 9$ (i.e., there are 10 virtual classes).
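The discretization just described can be sketched as follows. This is an illustrative reading of the text, assuming that the integration interval $[a - 1/2, a + 1/2]$ is intersected with the truncation range $[0, 2\mu_j]$ so that the probabilities sum to one; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def discretized_truncated_normal_pmf(mu, sigma=None):
    """P(D = a) for a normal demand truncated to [0, 2*mu] and rounded to integers."""
    if sigma is None:
        sigma = math.sqrt(mu)                 # sigma_j = sqrt(mu_j), as in the examples
    base = NormalDist(mu, sigma)
    lo, hi = 0.0, 2.0 * mu
    mass = base.cdf(hi) - base.cdf(lo)        # normalizing constant of the truncated density
    pmf = {}
    for a in range(0, math.floor(hi + 0.5) + 1):
        left = min(max(a - 0.5, lo), hi)      # clip the integration interval to [0, 2*mu]
        right = min(max(a + 0.5, lo), hi)
        if right > left:
            pmf[a] = (base.cdf(right) - base.cdf(left)) / mass
    return pmf


pmf = discretized_truncated_normal_pmf(12)    # e.g., product 1 of Example 2
print(round(sum(pmf.values()), 12), round(sum(a * p for a, p in pmf.items()), 6))
```

Because the truncation range is symmetric around the mean, the discretized demand keeps mean $\mu_j$ when $\mu_j$ is an integer, and a demand stream can then be sampled product by product from the resulting probabilities.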
For the stochastic algorithm, we used a step size of $\rho^k = 0.1/k$. Kleywegt and Shapiro (2001) point out that this sort of method is very sensitive to the choice of step sizes: Small step sizes result in very slow progress toward the optimum, while large step sizes make the iterations zigzag. We experimented with $\rho^k = a/k$, with $a = 0.01$, 0.05, 0.1, 0.5, and 1. The best results were obtained with $a = 0.1$.
DAVN was used to define the indexing and find the initial set of protection levels as described in online Appendix A3. When running the stochastic gradient method, we generated 5,000 instances (sample paths) taking as a starting point the original outcome of DAVN.¹⁰ Each sample path consisted of 1,030 requests on average. As mentioned, we then split the booking horizon evenly into three periods and reoptimized the protection levels for both methods at the start of each period.
Two measures of congestion were computed for reference: the demand factor is a measure of the level of demand relative to capacity. It is computed as the average of the leg demand factors, where
$$\text{demand factor for leg } i = \frac{\sum_{j=1}^{n} E[D_j]\, I\{i \in A_j\}}{x_i}. \tag{15}$$
Demand factors close to or above one indicate scarcity of resources and the need for some capacity control to optimize revenues. The second congestion measure is the load factor, an indicator of capacity utilization. It is also computed as the average of the leg load factors:
$$\text{load factor for leg } i = \frac{\text{average number of seats sold on leg } i}{x_i}. \tag{16}$$
Note that the load factor depends on the capacity control policy, while the demand factor is independent of the capacity control policy. We varied the demand factors and load factors by scaling the capacity on each leg of the network, leaving demand unchanged.
Table 4 shows average revenues and average load factors for different capacity levels. We also list the 95% confidence intervals for the revenue gaps. It took approximately 35 seconds to compute the new set of protection levels starting from the original output of DAVN. These new protection levels produced significant increases in revenues (the expected gaps are between 1.74% and 5.15%) and also slightly higher load factors, which suggests that not only were more requests accepted but also the mix of accepted demand improved.

Table 4. Revenues for different capacity levels for Example 2.

| Capacity per leg | Demand factor | Reopt. DAVN revenue | Reopt. DAVN load | SA revenue | SA load | E[% Gap] | % Gap (95% CI) | CPU sec. |
|---|---|---|---|---|---|---|---|---|
| 160 | 1.25 | 157,640 | 0.88 | 165,758 | 0.91 | 5.15 | (3.09, 7.21) | 32.86 |
| 180 | 1.11 | 165,270 | 0.89 | 173,050 | 0.90 | 4.71 | (2.67, 6.74) | 36.72 |
| 200 | 1.00 | 174,707 | 0.90 | 181,424 | 0.91 | 3.84 | (2.03, 5.65) | 36.84 |
| 220 | 0.91 | 183,668 | 0.88 | 186,865 | 0.89 | 1.74 | (-0.20, 3.68) | 34.92 |

To illustrate the changes in protection levels, Tables 5 and 6 show the protection levels before and after execution of the SA algorithm, computed at the start of the first booking period for the case $x_i = 180$ for all legs $i$. For this example, there were at most eight virtual classes per leg. It is interesting to observe the structure of the protection levels produced by the stochastic gradient algorithm. Some levels are increased, while others are decreased. Indeed, some virtual classes are in fact collapsed (e.g., virtual classes 4 and 5 are merged on leg 2 because $y_{23} = y_{24}$; virtual classes 3 and 4 are merged on legs 7 and 8 because $y_{72} = y_{73}$ and $y_{82} = y_{83}$). Virtual class 1 disappeared from the first four legs, meaning that there is no specific capacity reserved for it.

Table 5. Initial protection levels $y^0$ for Example 2, when $x_i = 180$, for the first period in the booking horizon.

| Resource $i$ | $y_{i1}$ | $y_{i2}$ | $y_{i3}$ | $y_{i4}$ | $y_{i5}$ | $y_{i6}$ | $y_{i7}$ |
|---|---|---|---|---|---|---|---|
| 1 | 6 | 34 | 42 | 49 | 61 | 75 | 88 |
| 2 | 6 | 34 | 42 | 49 | 61 | 75 | 88 |
| 3 | 7 | 18 | 30 | 59 | 73 | 103 | |
| 4 | 7 | 18 | 30 | 59 | 73 | 103 | |
| 5 | 3 | 11 | 18 | 38 | 63 | 81 | 96 |
| 6 | 3 | 11 | 18 | 38 | 63 | 81 | 96 |
| 7 | 25 | 52 | 62 | 79 | 98 | 107 | 141 |
| 8 | 25 | 52 | 62 | 79 | 98 | 107 | 141 |
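As a cross-check on the demand factors reported in Table 4, Equation (15) can be evaluated from the mean demands in Table 3. The sketch below assumes a hub-and-spoke routing through ATL (four spokes flown in both directions, which is consistent with the eight legs mentioned in the text, although Figure 4 is not reproduced here) and the same capacity on every leg; the routing helper and names are hypothetical.

```python
MEAN_DEMAND = {   # O-D market: mean demand for classes (Y, M, B, Q), from Table 3
    ("ATL", "BOS"): (12, 9, 11, 17),
    ("ATL", "LAX"): (8, 4, 11, 27),
    ("ATL", "MIA"): (20, 9, 7, 14),
    ("ATL", "SAV"): (25, 7, 5, 13),
    ("BOS", "LAX"): (9, 7, 11, 24),
    ("BOS", "MIA"): (11, 5, 14, 20),
    ("BOS", "SAV"): (5, 7, 11, 27),
    ("LAX", "MIA"): (17, 11, 7, 14),
    ("LAX", "SAV"): (5, 7, 7, 32),
    ("MIA", "SAV"): (13, 4, 7, 25),
}
HUB = "ATL"       # assumption: every connecting itinerary is routed through ATL


def legs_used(origin, destination):
    """Legs of a one-way itinerary: nonstop to/from the hub, otherwise one connection."""
    if HUB in (origin, destination):
        return [(origin, destination)]
    return [(origin, HUB), (HUB, destination)]


def average_demand_factor(capacity):
    """Average over legs of (expected demand routed over the leg) / capacity, as in (15)."""
    expected_load = {}
    for (a, b), classes in MEAN_DEMAND.items():
        for origin, destination in ((a, b), (b, a)):     # both directions share the same means
            for leg in legs_used(origin, destination):
                expected_load[leg] = expected_load.get(leg, 0) + sum(classes)
    return sum(load / capacity for load in expected_load.values()) / len(expected_load)


for capacity in (160, 180, 200, 220):
    print(capacity, round(average_demand_factor(capacity), 2))
```

Under the assumed routing, each leg carries an expected demand of roughly 200 passengers, and the computed averages round to the demand factors 1.25, 1.11, 1.00, and 0.91 listed in Table 4.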
Example 3. This example is a real-world data set based
on a subnetwork from a major U.S. commercial airline.
The network has 1,844 products (origin-destination-fare classes) and 62 legs. Each stream of demand consists of
on the order of 7,700 bookings. Again, we assumed that each
customer requests a single seat ($q_t = 1$ for all $t$). Demand
for each product was assumed to be a truncated normal random variable, or a truncated Poisson random variable when
the mean was less than five. The mean demand per product
varied from less than one to 86. We allowed a maximum
of 15 virtual classes per leg, and computed the indexing
and initial protection levels using DAVN as described in
online Appendix A3. We again split the horizon into three
equal-sized periods and reoptimized the protection levels at
the start of each period. A thousand streams of demand were
generated as input to our SA algorithm. We used a step
size of 0.1/k at iteration k, as in Example 2.
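A minimal sketch of how one demand stream of this kind might be generated is given below. The standard deviation of the normal distribution, the truncation points, and the rounding to integer bookings are not specified in the text, so the coefficient of variation `cv`, the bound `upper`, the product means, and the function names are all assumptions made purely for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_demand(mean, rng, cv=0.5, upper=None):
    """Sample total demand for one product.

    Means below five use a Poisson draw, larger means a rounded normal draw.
    Truncation to [0, upper] is done by rejection; cv and upper are assumed.
    """
    while True:
        if mean < 5:
            draw = sample_poisson(mean, rng)
        else:
            draw = round(rng.gauss(mean, cv * mean))
        if draw >= 0 and (upper is None or draw <= upper):
            return draw


rng = random.Random(2008)
means = [0.8, 3.0, 12.0, 86.0]  # hypothetical product means
stream = [sample_demand(m, rng, upper=4 * m + 10) for m in means]
```

Rejection sampling keeps the sketch short; for tight truncation bounds an inverse-transform sampler would waste fewer draws.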
Table 6. Improved protection levels $y^N$ for Example 2, when $x_i = 180$, for the first period in the booking horizon.

            Protection levels $y_{ic}$
Resource    c=1   c=2   c=3   c=4   c=5   c=6   c=7
1             0    34    41    42    45    75    81
2             0    34    41    41    44    74    81
3             0     0     9    47    71   105     -
4             0     3     6    45    73   104     -
5             3     9    10    10    59    81    87
6             3    11    13    13    61    81    90
7            25    42    42    79    98   107   165
8            21    39    39    77    98   107   159

Table 7 shows average revenues, load factors, and revenue gaps for different demand factors. The revenue gains
are less dramatic than in Example 2 but are still significant,
ranging from 0.56% to 1.32%.
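The gap statistics reported in the tables can be sketched as follows. This is a hedged illustration: it assumes that E[% Gap] is the mean of per-stream percentage gaps between the two policies evaluated on common demand streams, and that the interval is a normal-approximation 95% confidence interval. The function name `gap_summary` and the four-stream data are hypothetical.

```python
import math
import statistics


def gap_summary(base_revenues, new_revenues, z=1.96):
    """Mean percentage gap and a normal-approximation confidence interval.

    Each pair is the revenue of the two policies on the same demand stream.
    """
    gaps = [100.0 * (new - base) / base
            for base, new in zip(base_revenues, new_revenues)]
    mean_gap = statistics.fmean(gaps)
    half_width = z * statistics.stdev(gaps) / math.sqrt(len(gaps))
    return mean_gap, (mean_gap - half_width, mean_gap + half_width)


# Hypothetical paired revenues on four common demand streams.
davn = [100.0, 100.0, 100.0, 100.0]
sa = [101.0, 103.0, 102.0, 102.0]
mean_gap, interval = gap_summary(davn, sa)

# Gap between the average revenues reported in the first row of Table 7.
table7_gap = 100.0 * (2331538 - 2318515) / 2318515
```

The last line gives roughly 0.56%, in line with the smallest gain quoted for Example 3; the ratio of averages and the average of ratios need not coincide in general.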
The stochastic gradient algorithm computes the first set
of protection levels in around one minute, which is a
remarkable time given the size of the problem. (The time
for computing the two reoptimized sets of protection levels is even shorter because they are based on the residual
demand-to-come, which results in smaller sample paths.)
Overall, the example suggests that the approach has good
practical potential.
5. Conclusions
In this paper, we have proposed a model and algorithm that
improve on the simulation-based approach developed by
Bertsimas and de Boer (2005). The method is simple to implement, computationally efficient, and provably convergent.
We believe the model and solution approach are appealing for several reasons. First, the method improves protection levels based on an accurate estimate of true network
effects, without relying on approximations or decompositions. Second, because it is simulation based, the demand
processes can be completely general. For example, they could
include correlations among itineraries or correlations over
time. Third, it is easy to implement, requiring a recursive iteration that is not much more complex than simulating the acceptance decisions themselves. Fourth, due to its
continuous nature, our model overcomes the computational
and theoretical difficulties of working with finite-difference
estimates and a discrete optimization problem as in the
work of Bertsimas and de Boer (2005). Finally, it shows
promising results in our computational tests and seems reasonably efficient even for some relatively large, real-world
networks.
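To make the structure of one iteration concrete, the following Python sketch shows a projected stochastic gradient step with a step size of the form a/k. It is a sketch under stated assumptions, not the implementation used in the experiments: the sample-path gradient is taken as an input, the feasible set is assumed to be $0 \le y_1 \le \cdots \le y_n \le x$ on a single leg, and the projection uses the pool-adjacent-violators scheme followed by clipping instead of the quadratic programming solver mentioned in the endnotes. The names `project_nested` and `ascent_step` and the default a = 0.1 are illustrative.

```python
def project_nested(y, capacity):
    """Euclidean projection onto {0 <= y_1 <= ... <= y_n <= capacity}.

    Pool-adjacent-violators enforces monotonicity; clipping the monotone
    fit to [0, capacity] then enforces the bounds.
    """
    blocks = []  # each block is [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge neighbouring blocks while their means are decreasing.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [min(max(v, 0.0), float(capacity)) for v in fitted]


def ascent_step(y, gradient, k, capacity, a=0.1):
    """Move along the sample-path gradient with step a / k, then project."""
    step = a / k
    moved = [yi + step * gi for yi, gi in zip(y, gradient)]
    return project_nested(moved, capacity)


projected = project_nested([5, 3, 10], capacity=8)
updated = ascent_step([10.0, 20.0], [5.0, -5.0], k=1, capacity=180)
```

Step sizes of the form a/k satisfy the usual stochastic approximation conditions (their sum diverges while the sum of their squares converges), which is why they are a common default for methods of this type.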
As for future research, variations of the basic computational method would be worth investigating. Ours is
a straightforward implementation of a stochastic gradient method, and as such primarily serves to illustrate the
value of gradient estimators and the potential performance
improvements they yield. However, it may be desirable to
develop a sample average approximation version of the
algorithm that exploits more advanced methods in nonsmooth optimization (e.g., bundle methods). Variance reduction methods would be worth exploring as well.
Also, it would be worthwhile to apply a similar approach to bid-price controls. The overall approach
can also be applied to cases where customers exhibit more
complex behavior, for example, where each customer has
an ordered preference for products to buy and, if the first
choice is not available, buys the second one if it is
available, and so on. This allows modeling of path preference and buy-up and buy-down behavior, which is not taken
into account by traditional methods (DAVN, EMSR-b, bid
prices, etc.). This extension is the topic of a recent paper
(van Ryzin and Vulcano 2008).
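The ordered-preference behavior described above can be illustrated with a small simulation of unit requests. The sketch assumes a simple acceptance rule that is not spelled out in this section: a product mapped to virtual class c on a leg is open while remaining capacity exceeds the protection level $y_{c-1}$ of the classes above it (with $y_0 = 0$). All names and data are hypothetical, and the code is not the model of the cited follow-up paper.

```python
def is_open(product, remaining, protection, legs, vclass):
    """True if one unit of `product` can be sold on every leg it uses.

    Assumed rule: virtual class c on a leg is open while remaining capacity
    exceeds y_{c-1}, the protection level of the classes above it (y_0 = 0).
    """
    for leg in legs[product]:
        c = vclass[(product, leg)]
        protected = protection[leg][c - 2] if c >= 2 else 0
        if remaining[leg] - protected < 1:
            return False
    return True


def choose(preferences, remaining, protection, legs, vclass):
    """Return the first open product in the customer's ranked list, or None."""
    for product in preferences:
        if is_open(product, remaining, protection, legs, vclass):
            return product
    return None


def sell(product, remaining, legs):
    """Consume one seat on each leg used by the purchased product."""
    for leg in legs[product]:
        remaining[leg] -= 1


# One leg "A" with two products: "full" in class 1, "discount" in class 3.
legs = {"full": ["A"], "discount": ["A"]}
vclass = {("full", "A"): 1, ("discount", "A"): 3}
protection = {"A": [2, 5]}  # y_1 = 2, y_2 = 5 (hypothetical values)
remaining = {"A": 6}

purchases = []
for _ in range(2):  # two customers who both prefer the discount fare
    pick = choose(["discount", "full"], remaining, protection, legs, vclass)
    purchases.append(pick)
    if pick is not None:
        sell(pick, remaining, legs)
```

With these numbers the first customer obtains the discount product, which closes that class, and the second customer buys up to the full fare.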
Table 7. Revenues for different demand factors for Example 3.

Demand    Perf. for reopt. DAVN   Performance for reoptimized SA
factor    Revenue    Load         Revenue    Load    E[% Gap]   % Gap (95% CI)   CPU sec.
1.28      2318515    0.88         2331538    0.92    0.56       (-0.41, 1.54)    68.80
1.10      2496684    0.86         2525456    0.89    0.65       (0.46, 1.84)     60.53
0.96      2648337    0.83         2683401    0.85    1.32       (0.86, 1.79)     63.51
0.86      2781869    0.81         2816914    0.83    1.26       (0.92, 1.60)     57.09
0.77      2906919    0.78         2928350    0.79    0.73       (0.36, 1.11)     55.20
6. Electronic Companion
An electronic companion to this paper is available as part
of the online version that can be found at http://or.journal.informs.org/.
Acknowledgments
The authors thank Sanne de Boer, Paul Glasserman, and
Anton Kleywegt for helpful discussions on earlier drafts
of this work. They also thank Vladimir Norkin for comments on an earlier version of this paper. Andy Boyd of
PROS Revenue Management and Shankar Sivaramakrishnan provided some of the data sets used in the numerical
experiments. Finally, the authors thank the Deming Center at the Columbia Business School for having partially
supported this research.
Endnotes
1. Again, while !t is a function of , we do not explicitly
express this dependence to simplify the notation.
2. This complexity relies on the reasonable assumption that
there is a small constant upper bound for the number of
resources used by each product. For example, in the airline
case we could reasonably take a value of four, which corresponds to having itineraries with at most three stopovers.
Then, when a request hits a protection level, the number of
components to update in the gradient vector is also small
(e.g., at most 4c).
3. We implemented both the stochastic gradient and finite-difference algorithms in Microsoft Visual C++ 6.0 on a
Pentium IV Workstation (CPU of 2.00 GHz and RAM of
512 MB) under Windows 2000.
4. A proxy for measuring the network congestion is the
demand factor. Its formal definition and further description
of the network examples are provided in §4, Example 2
(for the first four cases in Table 1) and Example 3 (for the
last five cases).
5. For simplicity, we do not perturb capacity in this sample
path example; moreover, because these perturbations can
be taken arbitrarily small, including them does not change
the example.
6. Gu (2002) reported that for small models, variations of
both primal and dual simplex for quadratic programming
outperform the barrier-type algorithms in terms of computational time.
7. In the perturbed version, we have also simulated continuous quantities for the different requests, assuming that
they were Unif($-\varepsilon + q$, $q + \varepsilon$) for a discrete quantity $q$ in
the sample path.
8. Regarding the memory requirements of the perturbed
algorithm, one has to keep track of all the perturbations for
every request t in the sample path. As for computational
time, in Example 2, it took roughly 35 seconds to solve for
the nonperturbed version (see Table 4) and 47 seconds for
the perturbed version.
9. We have worked with Microsoft Visual C++ 6.0, building a Win32 console application. We have linked our code
with a LINDO application programming interface (Lindo
Systems, Inc.) to make the projection in Step 2(d). This
routine uses a barrier algorithm to solve the quadratic program.
10. Both the implementation of DAVN and the sample
path simulation were computed offline with MATLAB from
MathWorks Inc.
References
Belobaba, P. 1987. Air travel demand and airline seat inventory management. Ph.D. thesis, Flight Transportation Laboratory, Massachusetts Institute of Technology, Cambridge, MA.
Belobaba, P. 1989. Application of a probabilistic decision model to airline seat inventory control. Oper. Res. 37 183–197.
Belobaba, P. 1992. Optimal vs. heuristic methods for nested seat allocation. Proc. AGIFORS Reservations and Yield Management Study Group, Brussels, Belgium, 28–53.
Belobaba, P. 2001. Revenue and competitive impact of O-D control: Summary of PODS results. First Annual INFORMS Revenue Management Section Meeting. Columbia University, New York.
Belobaba, P., S. Lee. 2000. PODS update: Large network O-D control results. AGIFORS Revenue Management Study Group Meeting, New York.
Benveniste, A., M. Métivier, P. Priouret. 1990. Adaptive Algorithms and Stochastic Approximations. Springer-Verlag, Berlin.
Bertsekas, D. 1999. Nonlinear Programming, 2nd ed. Athena Scientific, Nashua, NH.
Bertsimas, D., S. de Boer. 2005. Simulation-based booking-limits for airline revenue management. Oper. Res. 53 90–106.
Billingsley, P. 1995. Probability and Measure. John Wiley & Sons, New York.
Brumelle, S., J. McGill. 1993. Airline seat allocation with multiple nested fare classes. Oper. Res. 41 127–137.
Curry, R. 1990. Optimal airline seat allocation with fare classes nested by origins and destinations. Transportation Sci. 24 193–204.
de Boer, S. 2003. Personal communication.
Dror, M., P. Trudeau, S. Ladany. 1988. Network models for seat allocation on flights. Transportation Res. 22B 239–250.
Ermoliev, Y. 1988. Stochastic quasigradient methods. Y. Ermoliev, R. J.-B. Wets, eds. Numerical Techniques for Stochastic Optimization, Chapter 6. Springer-Verlag, Berlin, 143–185.
Gaivoronski, A. 2005. SQG: Software for solving stochastic programming problems with stochastic quasigradient methods. S. Wallace, W. Ziemba, eds. Applications of Stochastic Programming. MPS/SIAM Series in Optimization, SIAM, Philadelphia, 37–60.
Glasserman, P. 1994. Perturbation analysis of production networks. D. D. Yao, ed. Stochastic Modeling and Analysis of Manufacturing Systems, Chapter 6. Springer, New York, 233–280.
Glover, F., R. Glover, J. Lorenzo, C. McMillan. 1982. The passenger mix problem in the scheduled airlines. Interfaces 12 72–79.
Gu, Z. 2002. Quadratic programming and mixed integer quadratic programming: Algorithms and implementation. Presented at Second Columbia Optimization Day. Columbia University, New York.
Karaesmen, I., G. J. van Ryzin. 2004. Overbooking with substitutable inventory classes. Oper. Res. 52 83–104.
Kleywegt, A. J., A. Shapiro. 2001. Stochastic optimization. G. Salvendy, ed. The Handbook of Industrial Engineering, 3rd ed. John Wiley & Sons, New York, 2625–2650.
Kushner, H. J., D. S. Clark. 1978. Stochastic Approximation Methods for Constrained and Unconstrained Systems. Springer-Verlag, Berlin.
Kushner, H. J., G. Yin. 2003. Stochastic Approximation and Recursive Algorithms and Applications. Springer-Verlag, New York.
L'Ecuyer, P., P. Glynn. 1994. Stochastic optimization by simulation: Convergence proofs for the GI/G/1 queue in steady state. Management Sci. 40 1562–1578.
L'Ecuyer, P., N. Giroux, P. Glynn. 1994. Stochastic optimization by simulation: Numerical experiments with the M/M/1 queue in steady state. Management Sci. 40 1245–1261.
Lee, T., M. Hersh. 1993. A model for dynamic airline seat inventory control with multiple seat bookings. Transportation Sci. 27 252–265.
Lemaréchal, C. 1989. Nondifferentiable optimization. G. Nemhauser, A. Rinnooy Kan, M. Todd, eds. Handbooks in OR & MS, Vol. 1, Chapter VII. Elsevier Science Publishers, 529–572.
Littlewood, K. 1972. Forecasting and control of passenger bookings. 12th AGIFORS Sympos. Proc., Nathanya, Israel, 95–128.
Mahajan, S., G. van Ryzin. 2001. Stocking retail assortments under dynamic consumer substitution. Oper. Res. 49 334–351.
Pflug, G. 1988. Step size rules, stopping times and their implementation in stochastic quasigradient methods. Y. Ermoliev, R. J.-B. Wets, eds. Numerical Techniques for Stochastic Optimization, Chapter 17. Springer-Verlag, Berlin, 353–372.
Plambeck, E. L., B. Fu, S. M. Robinson, R. Suri. 1996. Sample-path optimization of convex stochastic performance functions. Math. Programming 75 137–176.
Robbins, H., S. Monro. 1951. On a stochastic approximation method. Ann. Math. Statist. 22 400–407.
Ross, S. M. 1996. Stochastic Processes, 2nd ed. John Wiley & Sons, New York.
Shapiro, A. 2000. Stochastic programming by Monte Carlo simulation methods. Stochastic Programming E-Prints Series, article 20002003.
Smith, B., C. Penn. 1988. Analysis of alternative origin-destination control strategies. AGIFORS Sympos. Proc., Vol. 28. New Seabury, MA, 123–144.
Smith, B., J. Leimkuhler, R. Darrow. 1992. Yield management at American Airlines. Interfaces 22 8–31.
Talluri, K., G. J. van Ryzin. 2004. The Theory and Practice of Revenue Management. Kluwer Academic Publishers, New York.
van Ryzin, G., G. Vulcano. 2008. Computing virtual nesting controls for network revenue management under customer choice behavior. Manufacturing Service Oper. Management 10(3) 448–467.
Williamson, E. 1992. Airline network seat inventory control: Methodologies and revenue impacts. Ph.D. thesis, Flight Transportation Laboratory, Massachusetts Institute of Technology, Cambridge, MA.
Wolfe, P. 1959. The simplex method for quadratic programming. Econometrica 27 382–398.