BDC6111: Introduction to Optimization Fall 2018, NUS

Lecture 7: Large Scale Optimization


Lecturer: Zhenyu Hu

7.1 Revised Simplex Method and Delayed Column Generation

Consider the standard form problem

\[
\begin{aligned}
\min\;\; & c'x \\
\text{s.t.}\;\; & Ax = b, \\
& x \ge 0.
\end{aligned}
\]

In the full tableau implementation, in each iteration we update the following tableau:
\[
\begin{array}{c|c}
-c_B' B^{-1} b & c' - c_B' B^{-1} A \\ \hline
B^{-1} b & B^{-1} A
\end{array}
\]
This requires us to compute the reduced cost $\bar{c}_j = c_j - c_B' B^{-1} A_j$ and the search direction $B^{-1} A_j$ for all nonbasic variables $x_j$.
However, in some large-scale optimization problems we have a huge number of decision variables, i.e., $n$ is very large, and accessing every column of the matrix $A$ in each iteration can be time-consuming. One can overcome this difficulty if the following two steps can be achieved:

• One can efficiently solve
\[
\min_{j=1,\dots,n} \bar{c}_j
\]
without computing every $\bar{c}_j$;


• One can perform a simplex iteration without accessing each column $A_i$ in each iteration.

The second step is the key idea behind the revised simplex method, whose typical iteration is summarized
below.
An iteration of the revised simplex method

1. In each iteration, we start with the basic columns $A_{B(1)}, \dots, A_{B(m)}$, the associated BFS $x$, and $B^{-1}$.
2. Compute the reduced costs $\bar{c}_j = c_j - c_B' B^{-1} A_j$ sequentially. If one encounters $\bar{c}_j < 0$ for some $j$ for the first time, then stop and return the index $j$. If all reduced costs are nonnegative, the current basic feasible solution is optimal, and the algorithm terminates.
3. For the returned nonbasic index $j$, compute $u = B^{-1} A_j$. If $u \le 0$, then the optimal cost is $-\infty$ and the algorithm terminates; else let
\[
\theta^* = \min_{\{i \,\mid\, u_i > 0\}} \frac{x_{B(i)}}{u_i},
\]
and let $l$ be an index such that $\theta^* = x_{B(l)}/u_l$. Form a new basis by replacing $A_{B(l)}$ with $A_j$ and compute the new BFS $y$ via $y_j = \theta^*$ and $y_{B(i)} = x_{B(i)} - \theta^* u_i$ for $i \ne l$.


4. Finally, we update $B^{-1}$ to $\bar{B}^{-1}$ by performing row operations that make
\[
\bigl[\, B^{-1} \;\mid\; u \,\bigr] \;\to\; \bigl[\, \bar{B}^{-1} \;\mid\; e_l \,\bigr].
\]

Note that steps 3 and 4 are essentially the same as solving
\[
\begin{aligned}
\min\;\; & \sum_{i=1}^{m} c_{B(i)} x_{B(i)} + c_j x_j \\
\text{s.t.}\;\; & B x_B + A_j x_j = b, \\
& x_B \ge 0,\; x_j \ge 0.
\end{aligned}
\]
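For concreteness, the following is a minimal Python sketch of one such iteration (assuming NumPy is available; the column-access callback `get_column` and the other names are illustrative, not part of the lecture):

```python
import numpy as np

def revised_simplex_iteration(c, b, basis, B_inv, get_column, n):
    """One iteration of the revised simplex method (sketch).

    basis      : list of the m basic indices B(1), ..., B(m)
    B_inv      : current basis inverse, shape (m, m)
    get_column : callable j -> A_j, so columns are generated only when needed
    Returns ("optimal" | "unbounded" | "continue", updated state).
    """
    m = len(basis)
    x_B = B_inv @ b                          # values of the basic variables
    p = c[basis] @ B_inv                     # dual vector p' = c_B' B^{-1}

    # Step 2: scan reduced costs, stop at the first negative one.
    enter = None
    for j in range(n):
        if j in basis:
            continue
        if c[j] - p @ get_column(j) < -1e-9:
            enter = j
            break
    if enter is None:
        return "optimal", (basis, B_inv, x_B)

    # Step 3: ratio test along u = B^{-1} A_j.
    u = B_inv @ get_column(enter)
    if np.all(u <= 1e-9):
        return "unbounded", None
    ratios = np.where(u > 1e-9, x_B / np.where(u > 1e-9, u, 1.0), np.inf)
    l = int(np.argmin(ratios))               # leaving position, theta* = ratios[l]

    # Step 4: the row operations that turn u into e_l also update B^{-1}.
    E = np.eye(m)
    E[:, l] = -u / u[l]
    E[l, l] = 1.0 / u[l]
    B_inv = E @ B_inv
    basis = list(basis)
    basis[l] = enter
    return "continue", (basis, B_inv, B_inv @ b)
```

The callback `get_column` is the point of the method: nonbasic columns are generated only when the scan in step 2 actually reaches them, which is what the column generation idea below pushes further.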
In some delayed column generation methods, instead of keeping just the basic columns and throwing away the exiting column in each iteration, one may keep some of the columns $\{A_i \mid i \in I\}$ with $I \subseteq \{1, \dots, n\}$ and solve the following smaller problem (without explicitly going through the simplex iteration as in steps 3 and 4):
\[
\begin{aligned}
\min\;\; & \sum_{i=1}^{m} c_{B(i)} x_{B(i)} + c_j x_j + \sum_{i \in I} c_i x_i \\
\text{s.t.}\;\; & B x_B + A_j x_j + \sum_{i \in I} A_i x_i = b, \\
& x_B \ge 0,\; x_i \ge 0,\; x_j \ge 0.
\end{aligned}
\]

In the revised simplex method, once a column $A_j$ with negative reduced cost is found, the rest of the nonbasic columns are not accessed when performing steps 3 and 4. However, in step 2, in the worst case one still needs to generate every column $A_j$; the generation of the columns is not delayed.
We demonstrate with the example below that, when the problem has certain special structure, $\min_j \bar{c}_j$ can be computed without accessing every column $A_j$.

Example 7.1 (Cutting Stock Problem) Consider a paper company that has a supply of large rolls of
paper of width W , which is assumed to be a positive integer. There are demands for bi rolls of paper with
width wi , where wi ≤ W for i = 1, ..., m. A large roll can be sliced in a certain pattern to obtain smaller
rolls. Let ai be the number of rolls of width wi to be produced from a single large roll. A feasible pattern
(a1 , ..., am ) then must satisfy
\[
\sum_{i=1}^{m} a_i w_i \le W.
\]
If there are in total $n$ feasible patterns, we then collect all feasible patterns in a matrix $A$ of dimension $m \times n$. For instance, when $W = 7$, $w_1 = 2$, $w_2 = 4$, the following matrix summarizes all feasible patterns:
\[
A = \begin{pmatrix} 0 & 0 & 1 & 1 & 2 & 3 \\ 0 & 1 & 0 & 1 & 0 & 0 \end{pmatrix},
\]
with column $A_j$ corresponding to pattern $j$.
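As a quick sanity check, these six columns can be enumerated directly; a small Python sketch is:

```python
from itertools import product

W = 7
w = [2, 4]  # widths w_1 = 2 and w_2 = 4

# Enumerate all integer vectors (a_1, a_2) with a_1*w_1 + a_2*w_2 <= W.
bounds = [W // wi for wi in w]
patterns = [a for a in product(*(range(k + 1) for k in bounds))
            if sum(ai * wi for ai, wi in zip(a, w)) <= W]
print(patterns)
# [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (3, 0)]  -- the columns of A above
```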
Let xj be the number of large rolls cut according to pattern j. The company seeks to minimize the number
of large rolls used while satisfying customer demand:
\[
\begin{aligned}
\min\;\; & \sum_{j=1}^{n} x_j \\
\text{s.t.}\;\; & \sum_{j=1}^{n} a_{ij} x_j = b_i, \quad i = 1, \dots, m, \\
& x_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
\]

Each xj should be an integer, but we consider its linear relaxation here.


Even if $m$ is small, $n$ can be huge, and even writing down all the columns of $A$ can be difficult. Instead, we can use the revised simplex method and start with the identity basis $B = I$: for each $j = 1, \dots, m$, the pattern that produces only one roll of width $w_j$ from a large roll is always feasible (although it may not be economical), so the corresponding unit columns are valid patterns.
At each iteration, given a basis $B$, we can compute the dual basic solution $p' = c_B' B^{-1}$. Instead of computing $\bar{c}_j = c_j - p'A_j = 1 - p'A_j$ for every $j = 1, \dots, n$, we seek to solve $\min_j \bar{c}_j$, or equivalently $\max_j p'A_j$. By the definition of $A_j$, $j = 1, \dots, n$, this is equivalent to
\[
\begin{aligned}
\max_{a_i,\, i=1,\dots,m}\;\; & \sum_{i=1}^{m} p_i a_i \\
\text{s.t.}\;\; & \sum_{i=1}^{m} w_i a_i \le W, \\
& a_i \ge 0 \text{ and integer}, \quad i = 1, \dots, m.
\end{aligned}
\]
The above problem is called the knapsack problem. Although the knapsack problem is NP-hard in general, it can still be solved efficiently when $m$ is small (for example, by dynamic programming over the integer capacity $W$).
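A minimal sketch of such a dynamic program in Python (the function name is illustrative):

```python
def pricing_knapsack(p, w, W):
    """Pricing subproblem for the cutting stock example:
        max  sum_i p_i * a_i
        s.t. sum_i w_i * a_i <= W,  a_i >= 0 integer,
    solved by dynamic programming over the integer capacity W
    (unbounded knapsack, O(m * W) time)."""
    m = len(p)
    best = [0.0] * (W + 1)   # best[c] = optimal value with capacity at most c
    take = [-1] * (W + 1)    # take[c] = item added at capacity c (-1: none)
    for cap in range(1, W + 1):
        best[cap] = best[cap - 1]        # option: leave one unit of width unused
        for i in range(m):
            if w[i] <= cap and best[cap - w[i]] + p[i] > best[cap]:
                best[cap] = best[cap - w[i]] + p[i]
                take[cap] = i
    # Recover an optimal pattern a = (a_1, ..., a_m).
    a, cap = [0] * m, W
    while cap > 0:
        if take[cap] == -1:
            cap -= 1
        else:
            a[take[cap]] += 1
            cap -= w[take[cap]]
    return best[W], a
```

For instance, with the identity basis in the example above we have $p = (1, 1)$, and `pricing_knapsack([1, 1], [2, 4], 7)` returns the value 3 with pattern $(3, 0)$, so that pattern has reduced cost $1 - 3 = -2$ and enters the basis.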
Upon solving the knapsack problem:

• If the knapsack problem returns an optimal value less than or equal to one, then $\bar{c}_j = 1 - p'A_j \ge 0$ for every $j$, i.e., $\min_j \bar{c}_j \ge 0$. Hence, the current basis $B$ is optimal.
• If the knapsack problem returns an optimal value greater than one with optimal solution $(a_1^*, \dots, a_m^*)$, then we have identified a column $A_j = (a_1^*, \dots, a_m^*)'$ with negative reduced cost, which enters the basis (or, in the delayed column generation variant, is added to the restricted problem), as in the loop sketched below.
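Putting the pieces together, the following is a minimal sketch of the resulting column generation loop for the cutting stock LP (assuming SciPy's `linprog` is available and reusing the `pricing_knapsack` routine sketched above; here the dual prices are obtained by solving the dual of the restricted master directly, which is just one convenient way to get them):

```python
import numpy as np
from scipy.optimize import linprog

def cutting_stock_colgen(w, b, W, max_iters=100, tol=1e-9):
    """Delayed column generation for the LP relaxation of the cutting stock
    problem: min 1'x  s.t.  A_I x = b, x >= 0, with pattern columns A_I."""
    m = len(w)
    # Start with the identity patterns: one roll of width w_i per large roll.
    patterns = [np.eye(m)[:, i] for i in range(m)]
    for _ in range(max_iters):
        A_I = np.column_stack(patterns)
        # Dual of the restricted master:  max p'b  s.t.  p'A_i <= 1, i in I,
        # written as  min -b'p  with p free (linprog minimizes by default).
        dual = linprog(-np.asarray(b, dtype=float),
                       A_ub=A_I.T, b_ub=np.ones(A_I.shape[1]),
                       bounds=[(None, None)] * m, method="highs")
        p = dual.x
        # Pricing: the most negative reduced cost is 1 - max_j p'A_j.
        value, a = pricing_knapsack(list(p), list(w), W)
        if value <= 1 + tol:
            break                            # all reduced costs are nonnegative
        patterns.append(np.asarray(a, dtype=float))
    # Solve the final restricted master for the primal solution x.
    A_I = np.column_stack(patterns)
    res = linprog(np.ones(A_I.shape[1]), A_eq=A_I, b_eq=b, method="highs")
    return patterns, res.x
```

Note that the full matrix $A$ is never formed: only the columns in the current restricted set and the single new column returned by the pricing problem are ever touched.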

7.2 Delayed Constraint Generation

Consider the dual of the standard form problem:
\[
\begin{aligned}
\max\;\; & p'b \\
\text{s.t.}\;\; & p'A_i \le c_i, \quad i = 1, \dots, n.
\end{aligned}
\]
When $A$ has a large number of columns, i.e., $n$ is large, the number of constraints in the above dual problem is large. As with delayed column generation, we can consider a subset $I \subseteq \{1, \dots, n\}$ of the constraints and form the relaxed dual problem
\[
\begin{aligned}
\max\;\; & p'b \\
\text{s.t.}\;\; & p'A_i \le c_i, \quad i \in I.
\end{aligned}
\]
Let p∗ be the optimal basic feasible solution to the relaxed dual problem.

• If $p^*$ satisfies all the constraints $p'A_i \le c_i$, $i = 1, \dots, n$, then $p^*$ must also be optimal for the original dual problem, and the algorithm terminates.
• If $p^*$ violates constraint $i$ for some $i \notin I$, then we add $i$ to $I$ and re-solve the relaxed problem.

The step of checking feasibility is the same as checking the nonnegativity of the reduced costs in the delayed column generation method, and we need an efficient method for identifying a violated constraint. Usually, this is achieved by finding an efficient way of solving
\[
\min_{i=1,\dots,n} \; c_i - (p^*)'A_i.
\]

Solving the above problem without going through every term $c_i - (p^*)'A_i$ is possible when the problem has certain special structure, which we demonstrate next.
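Before turning to that structure, here is a minimal sketch of the generic delayed constraint generation loop (assuming SciPy's `linprog` is available; `find_violated` stands for a problem-specific separation oracle and, like the other names, is purely illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def delayed_constraint_generation(cols, rhs, b, find_violated, max_iters=100):
    """Solve  max p'b  s.t.  p'A_i <= c_i, i = 1, ..., n, without listing all
    of the constraints up front.

    cols, rhs     : lists holding A_i and c_i for an initial subset I
    find_violated : callable p -> (A_i, c_i) of a violated constraint, or None;
                    for structured problems it avoids scanning all n constraints
    """
    cols, rhs = list(cols), list(rhs)
    p_star = None
    for _ in range(max_iters):
        # Relaxed dual:  max p'b  <=>  min -b'p,  s.t.  A_i'p <= c_i for i in I.
        # (Assumes the relaxed problem stays bounded; otherwise seed more constraints.)
        res = linprog(-np.asarray(b, dtype=float),
                      A_ub=np.column_stack(cols).T,
                      b_ub=np.asarray(rhs, dtype=float),
                      bounds=[(None, None)] * len(b), method="highs")
        p_star = res.x
        violated = find_violated(p_star)
        if violated is None:          # p* satisfies all n constraints: optimal
            return p_star
        A_i, c_i = violated           # add the violated constraint and re-solve
        cols.append(np.asarray(A_i, dtype=float))
        rhs.append(c_i)
    return p_star
```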

7.3 Stochastic Programming and Benders Decomposition

Let (Ω, F, P) be a probability space. Consider a decision maker who acts in two consecutive stages with
some random information being revealed in the second stage. In the first stage, the decision maker needs to
choose a vector x that satisfies the constraints
\[
Ax = b, \qquad x \ge 0.
\]

The decision $x$ generates an immediate cost $c'x$.


In the second stage, some random variables B(ω), d(ω) are revealed, where ω denotes a particular scenario
(or sample) from the sample space Ω. Given a particular scenario ω and the first stage decision x, the
decision maker needs to choose another vector y(ω) that satisfies the constraints

\[
B(\omega)x + D y(\omega) = d(\omega), \qquad y(\omega) \ge 0.
\]

The decision $y(\omega)$ generates a second stage cost $f'y(\omega)$. Let $z(x, \omega)$ be the minimum second stage cost given a scenario $\omega$ and first stage decision $x$. It follows that
\[
\begin{aligned}
z(x, \omega) = \min_{y(\omega)}\;\; & f'y(\omega) \\
\text{s.t.}\;\; & B(\omega)x + D y(\omega) = d(\omega), \\
& y(\omega) \ge 0.
\end{aligned}
\tag{7.1}
\]

Now, the optimization problem in the first stage can be written as

\[
\begin{aligned}
\min_{x}\;\; & c'x + \mathbb{E}_{\mathbb{P}}[z(x, \omega)] \\
\text{s.t.}\;\; & Ax = b, \\
& x \ge 0.
\end{aligned}
\tag{7.2}
\]

While $\mathbb{E}_{\mathbb{P}}[z(x, \omega)]$ is in general a nonlinear function of $x$, the above problem can nevertheless be formulated as an LP when $\Omega$ consists of finitely many samples, say $\omega_1, \dots, \omega_K$. Let $\alpha_i$ be the probability of scenario $\omega_i$. The above problem is then equivalent to
\[
\begin{aligned}
\min_{x,\, y_i,\, i=1,\dots,K}\;\; & c'x + \sum_{i=1}^{K} \alpha_i f'y_i \\
\text{s.t.}\;\; & Ax = b, \\
& B_i x + D y_i = d_i, \quad i = 1, \dots, K, \\
& x,\, y_1, \dots, y_K \ge 0,
\end{aligned}
\tag{7.3}
\]
where $B_i = B(\omega_i)$ and $d_i = d(\omega_i)$.

Example 7.2 (Joint Inventory and Transportation Problem) Suppose a retailer manages the inven-
tory at n warehouses, which are used to satisfy random demands at m locations. In the first stage, the retailer
needs to decide $x \in \mathbb{R}^n$, with $x_i$ being the inventory placed at warehouse $i$ for $i = 1, \dots, n$. The unit procurement cost at warehouse $i$ is $c_i$, so the total procurement cost generated in the first stage is $c'x$.

In the second stage, the demand d(ω) at m locations is realized with dj (ω) being the demand at location j
in scenario ω. Given the inventory level x and the demand realization d(ω), the retailer needs to decide
yij (ω), the amount of inventory transported from warehouse i to satisfy demand at location j. The unit
transportation cost from i to j is tij and the unit revenue for satisfying demand at location j is rj . The
second stage problem is then
\[
\begin{aligned}
z(x, \omega) = \min_{y(\omega)}\;\; & \sum_{i=1}^{n} \sum_{j=1}^{m} (t_{ij} - r_j)\, y_{ij}(\omega) \\
\text{s.t.}\;\; & \sum_{j=1}^{m} y_{ij}(\omega) \le x_i, \quad i = 1, \dots, n, \\
& \sum_{i=1}^{n} y_{ij}(\omega) \le d_j(\omega), \quad j = 1, \dots, m, \\
& y(\omega) \ge 0.
\end{aligned}
\]
The first stage problem is simply
\[
\begin{aligned}
\min_{x}\;\; & c'x + \mathbb{E}_{\mathbb{P}}[z(x, \omega)] \\
\text{s.t.}\;\; & x \ge 0.
\end{aligned}
\]
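For a given inventory vector $x$ and demand realization $d(\omega)$, the second stage problem above is an ordinary transportation-style LP. A minimal sketch of evaluating $z(x, \omega)$ in Python (assuming NumPy and SciPy are available; names are illustrative) is:

```python
import numpy as np
from scipy.optimize import linprog

def second_stage_value(x, d, t, r):
    """Evaluate z(x, omega) for one realized demand vector d = d(omega).

    x : inventory at the n warehouses        (length n)
    d : realized demand at the m locations   (length m)
    t : unit transportation costs, shape (n, m)
    r : unit revenues at the m locations     (length m)
    """
    n, m = t.shape
    cost = (t - r[None, :]).reshape(-1)          # objective on y_ij, row-major

    # sum_j y_ij <= x_i  (supply)  and  sum_i y_ij <= d_j  (demand).
    supply = np.kron(np.eye(n), np.ones(m))
    demand = np.tile(np.eye(m), (1, n))
    A_ub = np.vstack([supply, demand])
    b_ub = np.concatenate([x, d])

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, method="highs")  # y >= 0 by default
    return res.fun, res.x.reshape(n, m)
```

With finitely many scenarios $\omega_1, \dots, \omega_K$, the expectation $\mathbb{E}_{\mathbb{P}}[z(x, \omega)]$ is then just the $\alpha_i$-weighted average of `second_stage_value(x, d_i, t, r)` over the sampled demand vectors $d_i$.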

When $K$ is large and $y(\omega)$ and $d(\omega)$ have dimensions $m$ and $t$, respectively, the formulation in (7.3) is an LP with $O(mK)$ decision variables and $O(tK)$ equality constraints, and can be computationally demanding to solve (keep in mind that computing the inverse of an $m \times m$ matrix $B$, or solving a linear system $Bx = b$, takes $O(m^3)$ operations).
Observe that, given a fixed $x$, the problems of finding $y(\omega)$ are all decoupled, and we can solve $K$ much smaller LPs, i.e., (7.1), each with $m$ decision variables and $t$ equality constraints. The difficulty lies in the fact that finding $x$ is coupled with finding $y(\omega)$, $\omega \in \Omega$. The idea behind Benders decomposition is to decouple the two tasks.
In the following we assume that (7.1) is feasible and has a finite optimal value for any $\omega \in \Omega$. The dual of (7.1) is
\[
\begin{aligned}
z(x, \omega) = \max_{p(\omega)}\;\; & p'(\omega)\,(d(\omega) - B(\omega)x) \\
\text{s.t.}\;\; & p'(\omega) D \le f'.
\end{aligned}
\tag{7.4}
\]
Let $p^i$, $i = 1, \dots, I$, be the extreme points of $\{p \mid p'D \le f'\}$. By our assumption on (7.1), problem (7.4) also has a finite optimal value, and we must have
\[
z(x, \omega) = \max_{i=1,\dots,I} (p^i)'(d(\omega) - B(\omega)x),
\]
which is equivalent to
\[
\begin{aligned}
z(x, \omega) = \min_{z(\omega)}\;\; & z(\omega) \\
\text{s.t.}\;\; & (p^i)'(d(\omega) - B(\omega)x) \le z(\omega), \quad i = 1, \dots, I.
\end{aligned}
\]


We can then reformulate (7.3) as
\[
\begin{aligned}
\min_{x,\, z(\omega)}\;\; & c'x + \sum_{\omega \in \Omega} \alpha_\omega z(\omega) \\
\text{s.t.}\;\; & Ax = b, \\
& (p^i)'(d(\omega) - B(\omega)x) \le z(\omega), \quad i = 1, \dots, I,\; \omega \in \Omega, \\
& x \ge 0,
\end{aligned}
\tag{7.5}
\]
where $\alpha_\omega$ denotes the probability of scenario $\omega$.

We call formulation (7.5) the master problem; it has only $O(K)$ decision variables (as opposed to $O(mK)$ in (7.3)). But (7.5) has an extremely large number of inequality constraints, namely $O(IK)$. We can overcome this via delayed constraint generation.
We start with a version of (7.5) that involves only a subset of its inequality constraints. Suppose the resulting optimal solution to this relaxed master problem is $x^*$ and $z^* = (z_1^*, \dots, z_K^*)$. We then need to check the feasibility of $(x^*, z^*)$ with respect to the rest of the constraints in (7.5). The key idea here is to solve some auxiliary subproblems instead of checking the constraints $(p^i)'(d(\omega) - B(\omega)x^*) \le z^*(\omega)$ one by one. In particular, for each $\omega \in \Omega$, we solve

\[
\begin{aligned}
\min_{y(\omega)}\;\; & f'y(\omega) \\
\text{s.t.}\;\; & D y(\omega) = d(\omega) - B(\omega)x^*, \\
& y(\omega) \ge 0.
\end{aligned}
\]

From solving the above problem, we obtain an optimal dual BFS $p^{i(\omega)}$ for every $\omega \in \Omega$.

• If $(p^{i(\omega)})'(d(\omega) - B(\omega)x^*) \le z^*(\omega)$ for every $\omega \in \Omega$, then by the optimality of $p^{i(\omega)}$,
\[
(p^i)'(d(\omega) - B(\omega)x^*) \le z^*(\omega)
\]
for all $i = 1, \dots, I$ and $\omega \in \Omega$. As a result, $(x^*, z^*)$ is feasible for (7.5) and hence optimal.
• If $(p^{i(\bar\omega)})'(d(\bar\omega) - B(\bar\omega)x^*) > z^*(\bar\omega)$ for some $\bar\omega \in \Omega$, then we have identified a violated constraint,
\[
(p^{i(\bar\omega)})'(d(\bar\omega) - B(\bar\omega)x) \le z(\bar\omega),
\]
which is added to the relaxed master problem, and the relaxed master problem is then solved again. The resulting loop is sketched below.
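A minimal Python sketch of this Benders loop (assuming SciPy's `linprog` is available, finitely many scenarios as in (7.3), and, as above, that (7.1) is feasible with finite value for every scenario; all names are illustrative, and the relaxed master is assumed to remain bounded once the first round of cuts has been added):

```python
import numpy as np
from scipy.optimize import linprog

def benders(c, A, b, f, D, B_list, d_list, alpha, max_iters=50, tol=1e-7):
    """Sketch of Benders decomposition for problem (7.3)/(7.5).

    c, A, b        : first-stage data (min c'x, Ax = b, x >= 0)
    f, D           : second-stage cost vector and matrix D
    B_list, d_list : per-scenario B(omega_k) and d(omega_k), k = 1, ..., K
    alpha          : scenario probabilities alpha_1, ..., alpha_K
    """
    n_x, K = len(c), len(alpha)
    cuts = []                                      # list of (k, p) pairs

    def solve_master():
        # Variables (x, z): min c'x + sum_k alpha_k z_k, Ax = b, cuts, x >= 0, z free.
        obj = np.concatenate([c, alpha])
        A_eq = np.hstack([A, np.zeros((A.shape[0], K))])
        A_ub, b_ub = [], []
        for k, p in cuts:                          # cut: p'(d_k - B_k x) <= z_k
            row = np.zeros(n_x + K)
            row[:n_x] = -p @ B_list[k]
            row[n_x + k] = -1.0
            A_ub.append(row)
            b_ub.append(-p @ d_list[k])
        bounds = [(0, None)] * n_x + [(None, None)] * K
        res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      A_eq=A_eq, b_eq=b, bounds=bounds, method="highs")
        return res.x[:n_x], res.x[n_x:]

    # Start from the first-stage-only solution; the first pass adds one cut per scenario.
    x = linprog(c, A_eq=A, b_eq=b, method="highs").x
    z = np.full(K, -np.inf)
    for _ in range(max_iters):
        new_cut = False
        for k in range(K):
            # Subproblem dual (7.4): max p'(d_k - B_k x)  s.t.  p'D <= f'.
            rhs = d_list[k] - B_list[k] @ x
            dual = linprog(-rhs, A_ub=D.T, b_ub=f,
                           bounds=[(None, None)] * D.shape[0], method="highs")
            p = dual.x
            if p @ rhs > z[k] + tol:               # violated constraint of (7.5)
                cuts.append((k, p))
                new_cut = True
        if not new_cut:
            return x, z                            # (x, z) feasible for (7.5): optimal
        x, z = solve_master()
    return x, z
```

Each pass solves the K small subproblem duals independently (they could even be solved in parallel), and only the master couples the scenarios through x, which is exactly the decoupling described above.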
