Colgen
by
Soumitra Pal
Roll No: 05305015
Prof. A. G. Ranade
Computer Science and Engineering
IIT-Bombay
Department of Computer Science and Engineering
Indian Institute of Technology, Bombay
Mumbai
Acknowledgments
I would like to thank my guide, Prof. A. G. Ranade, for his consistent motivation and
directions throughout this work.
Soumitra Pal
MTech-I
CSE, IIT Bombay
Abstract
Contents

1 Introduction
  1.1 Cutting stock problem
  1.2 Kantorovich model
  1.3 Gilmore-Gomory model
  1.4 Column generation approach
4 Branch-and-price algorithms
  4.1 Generic branch-and-price algorithm
  4.2 Branch-and-price algorithm for the flow model
    4.2.1 Column generation subproblem
    4.2.2 Branch-and-bound
5 Conclusions
Chapter 1
Introduction
The successful solution of large-scale mixed integer programming (MIP) problems requires
solving linear programming (LP) relaxations that have a huge number of variables. One
of several reasons for considering LPs with a huge number of variables is that a compact
formulation of a MIP may have a weak LP relaxation. Frequently the relaxation can be
tightened by using a huge number of variables.
In the pricing step of the revised simplex algorithm, if the optimal solution is not already
reached, the algorithm figures out the non-basic variable that should replace one of the
basic variables to improve the objective function in the next iteration. This explicit
search over variables may become intractable for problems with a huge number of
variables (or columns). Gilmore and Gomory [6], [7] proposed an ingenious way of
solving this problem. The algorithm starts with a few columns and generates new columns
as and when required by solving a subproblem, or oracle.
Gilmore and Gomory applied this column generation technique to solve large-scale cutting
stock problems. Since then several researchers have applied the column generation
technique to many real-life applications.
We introduce the column generation technique using the example of the cutting stock
problem. In section 1.1 we describe the problem, gradually building the context for
the column generation technique. In section 1.2 we show a simple model for solving the
problem; the linear programming solution of this model does not give a proper answer and
sometimes fails. We give a better model in section 1.3. We discuss the different issues in
solving that model, and how the column generation technique can address them, in section 1.4.
The cutting stock problem turns out to be an optimization problem, or more specifically, an
integer linear programming problem.
Example An instance of the problem. The raw rolls all have a width of W = 10 inches.
There are orders for m = 4 different finals of widths as follows:
subject to

    Σ_{k=1}^{K} x_{ik} ≥ b_i,        i = 1..m    (1.2)

    Σ_{i=1}^{m} w_i x_{ik} ≤ W y_k,  k = 1..K    (1.3)

    y_k = 0 or 1,                    k = 1..K    (1.4)
This formulation was introduced by Kantorovich. For solving the given example, we set
K = 9 + 79 + 90 + 27 = 215, assuming at most one final is cut from one raw. We can set
the value of K even greater than 215; in that case all the extra variables y_k become 0 in
the optimal solution.
Let us now think about the solution of the above integer program. One of the most obvious
ways is to relax the integrality constraints on the variables x_{ik} and y_k. In that case, the
problem becomes a linear program which can be solved using standard algorithms such
as the simplex algorithm.
For the given example, the solution to the LP relaxation is 120.5, which is significantly less
than the integer solution 157. In other examples having more finals, the gap increases. Hence
just rounding the values of the variables in the optimum solution of the LP relaxation
will be far from the integral solution. Also, corresponding to the optimum LP solution,
the value 120.5 is assigned to one of the y_k's; the rest are all assigned zero. If we add
the constraint

    y_k ≤ 1,  k = 1..K    (1.6)

the value 120.5 gets distributed: 120 of the y_k's take value 1, one more y_k takes
value 0.5, and the rest are all 0. This is definitely not the solution we want.
subject to

    Σ_{j∈J} a_{ij} x_j ≥ b_i,  i = 1, 2, · · · , m    (1.8)
    Pattern j    Feasibility (sum of final widths ≤ W)
    1            10 ≥ 9
    2            10 ≥ 6
    3            10 ≥ 6 + 3
    4            10 ≥ 5
    5            10 ≥ 5 + 5
    6            10 ≥ 5 + 3
    7            10 ≥ 3
    8            10 ≥ 3 + 3
    9            10 ≥ 3 + 3 + 3
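The nine patterns listed above can be enumerated mechanically. The following sketch (not part of the original report, just an illustration) generates every non-empty multiset of final widths that fits in a raw roll:

```python
def enumerate_patterns(widths, W):
    """Enumerate all non-empty multisets of final widths whose total does not
    exceed the raw width W; widths are consumed in a fixed order so that each
    multiset is generated exactly once."""
    patterns = []

    def extend(start, current, remaining):
        for i in range(start, len(widths)):
            if widths[i] <= remaining:
                current.append(widths[i])
                patterns.append(tuple(current))            # record this pattern
                extend(i, current, remaining - widths[i])  # try adding more finals
                current.pop()

    extend(0, [], W)
    return patterns

pats = enumerate_patterns([9, 6, 5, 3], 10)
print(len(pats))  # 9, matching patterns 1-9 in the table above
```

For larger instances the same recursion explodes combinatorially, which is exactly the difficulty that motivates generating patterns lazily instead of up front.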
Similar to the Kantorovich formulation, this can also be solved using the LP relaxation.
However, the values x∗_j of the x_j's may not all be integer in the optimal solution of the
LP relaxation. In that case, we can round each x∗_j down to the nearest integer and obtain
a solution close to the actual integer solution. The residual demands of finals which are not
met due to rounding down can then be satisfied by brute force.
For the given example, the optimal LP solution z = 156.7 corresponds to

    x∗_1 = 27,  x∗_3 = 90,  x∗_5 = 39.5

Rounding each x∗_j down to the nearest integer and satisfying the residual demand gives a
value of z = 157, which fortunately matches the optimal integer solution.
• Problems commonly encountered in the paper industry may involve a huge number of
variables. For example, if the raw rolls are 200 in. wide and the finals are ordered
in 40 different widths ranging from 20 in. to 80 in., then the number of different
patterns may easily exceed 10 or even 100 million. In that case, the solution may
not be tractable.
• Passing from an optimal fractional-valued solution to an optimal integer-valued
solution is not easy. Rounding the fractional values down and satisfying the residual
demands, as we mentioned in the example, may not yield the optimal working plan.
If the finals are ordered in small enough quantities, then the patterns used in the
optimal integer-valued solution may be quite different from those used in
the optimal fractional-valued solution.
An ingenious way of getting around the first difficulty was suggested by P. C. Gilmore and
R. E. Gomory. The trick is to work with only a few patterns at a time and to generate
new patterns only when they are really needed. This technique is called delayed column
generation. We discuss this in detail in chapter 2.
No efficient way of handling the second difficulty is known. However, there have been
instances of tackling this difficulty by using a method commonly known as branch-and-price.
Here the trick is to combine delayed column generation with the standard
branch-and-bound algorithm for solving integer programs. Delayed column generation is
applied at each node of the branch-and-bound tree to solve the LP subproblem. This is
discussed in detail in chapter 4.
By now, the reader should have an idea of the column generation technique for solving
linear programs, particularly those arising from huge integer programming problems. The
rest of this report is organized as follows.
In chapter 2, we provide the mathematical foundation of column generation. We elaborate
the rationale of column generation by providing an algebraic interpretation of the pricing
step of the simplex algorithm.
In the previous sections, we have mentioned two formulations of the cutting stock
problem: the Kantorovich formulation, which is straightforward but has a weak LP
relaxation, and the Gilmore-Gomory formulation, which is trickier but gives a better LP
relaxation and is amenable to column generation.
In chapter 3, we present some concepts from Dantzig-Wolfe decomposition, and indicate
how the Gilmore-Gomory model can be obtained by applying a Dantzig-Wolfe decomposition
to the Kantorovich model. We also argue why the second model gives better LP
relaxation bounds, and provide a flow-based formulation of the problem.
In chapter 4, we present the outline of the generic branch-and-price algorithm. We also
discuss an exact solution of the flow-based model using an implementation of the
branch-and-price algorithm.
We conclude the report in chapter 5.
Chapter 2
In this chapter, we touch upon the mathematical theory required to understand the basics
of column generation. We provide details up to the level required to understand
the column generation technique.

We start in section 2.1 with the algebraic interpretation of the simplex algorithm,
particularly the step in which a non-basic variable replaces a basic variable to improve the
cost. We show how this leads to the process of delayed column generation in section 2.2.
such that

    Ax = b,  x ∈ R^n_+.    (2.2)

Its dual is

    max ub    (2.3)

such that

    uA ≥ c,  u ∈ R^m.    (2.4)
We suppose that rank(A) = m ≤ n, so that all the redundant equations have been removed
from the LP.
For solving the LP, in each iteration of the simplex method we look for a non-basic variable
to price out and enter the basis. Column generation plays a key role in this pricing step.
We explain the algebra of the pricing step below.
Let A = (a_1, a_2, · · · , a_n) where a_j is the jth column of A. Since rank(A) = m, there
exists an m × m nonsingular submatrix A_B = (a_{B_1}, a_{B_2}, · · · , a_{B_m}). Let J = {1, 2, · · · , n}
and B = {B_1, B_2, · · · , B_m}, and let N = J \ B. Now permute the columns of A so that
A = (A_B, A_N). We can write Ax = b as A_B x_B + A_N x_N = b, where x = (x_B, x_N). Then a
solution to Ax = b is given by x_B = A_B^{-1} b and x_N = 0:

    Ax = (A_B  A_N) (x_B; 0) = b,  or  x_B = A_B^{-1} b.    (2.5)
The question is where to go next, after leaving this corner. The decision is made easy
by elimination on A, which reduces the square part A_B to the identity matrix. In matrix
notation, this multiplies Ax = b by A_B^{-1}:

    (I  A_B^{-1} A_N) (x_B; 0) = A_B^{-1} b.    (2.7)
Now suppose the non-basic (zero) components of x are increased to some non-zero value;
say x is changed to x′ = (x′_B, x′_N). To maintain equality 2.7,

    (I  A_B^{-1} A_N) (x′_B; x′_N) = A_B^{-1} b,    (2.8)

or

    x′_B + A_B^{-1} A_N x′_N = A_B^{-1} b,    (2.9)

or

    x′_B = A_B^{-1} b − A_B^{-1} A_N x′_N.    (2.10)

The new cost is

    cx′ = c_B x′_B + c_N x′_N = c_B (A_B^{-1} b − A_B^{-1} A_N x′_N) + c_N x′_N,    (2.11)

or

    cx′ = c_B A_B^{-1} b + (c_N − c_B A_B^{-1} A_N) x′_N.    (2.12)

The change in cost cx′ − cx is

    c_B A_B^{-1} b + (c_N − c_B A_B^{-1} A_N) x′_N − c_B A_B^{-1} b = (c_N − c_B A_B^{-1} A_N) x′_N.    (2.13)
We can see that as x′_N increases, the cost function goes up or down depending on the sign
of the vector c̄ in parentheses,

    c̄ = c_N − c_B A_B^{-1} A_N.    (2.14)

This vector contains the reduced costs. If c̄ ≥ 0 then the current corner is optimal. The
product c̄ x′_N in equation 2.13 cannot be negative since x′_N ≥ 0, so the best decision is
to keep x_N = 0 and stop. On the other hand, suppose a component is negative. Then
the cost is reduced if the corresponding non-basic variable enters the basis. The
simplex method chooses one entering variable, generally the one with the most negative
component of c̄.
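As a concrete illustration of equation 2.14, consider the following toy LP (invented here for illustration, not from the report). With the basis chosen so that A_B is the identity, the reduced costs can be computed directly:

```python
def reduced_costs(c_B, c_N, AB_inv, A_N):
    """Compute c-bar = c_N - c_B * A_B^{-1} * A_N for small dense data."""
    m = len(c_B)
    # simplex multipliers u = c_B * A_B^{-1} (the dual values)
    u = [sum(c_B[i] * AB_inv[i][j] for i in range(m)) for j in range(m)]
    # reduced cost of each non-basic column a_j: c_j - u . a_j
    return [c_N[j] - sum(u[i] * A_N[i][j] for i in range(m))
            for j in range(len(c_N))]

# min x1 + x2 + x3  s.t.  x1 + x3 = 2,  x2 + x3 = 2,  x >= 0,
# with basis B = {1, 2}: A_B = I, and the only non-basic column is a3 = (1, 1)
AB_inv = [[1, 0], [0, 1]]
A_N = [[1], [1]]
print(reduced_costs([1, 1], [1], AB_inv, A_N))  # [-1]: x3 should enter the basis
```

Here x3 covers both constraints at the cost of one unit, so its reduced cost 1 − (1 + 1) = −1 is negative and entering it reduces the objective, exactly as the algebra above predicts.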
column represents a cutting pattern, which must satisfy the knapsack constraint: the sum of
the widths of the finals in a particular cutting pattern must not exceed the width of the raw rolls.
Thus, in practice, one works with a reasonably small subset J′ ⊂ J of columns, with a
restricted master problem (RMP). Assuming that we have a feasible solution, let λ̄ and ū
be primal and dual optimal solutions of the RMP, respectively. When the columns a_j, j ∈ J,
are given as elements of a set A and the cost coefficient c_j can be computed from a_j,
c_j = c(a_j), then the subproblem or oracle is

    max Σ_{i=1}^{m} ū_i a_i    (2.19)

    such that Σ_{i=1}^{m} w_i a_i ≤ W,    (2.20)

    a_i ≥ 0.    (2.21)
This way of starting with a basic set of columns and generating more columns as and when
necessary is known as Delayed Column Generation, or simply Column Generation.
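For the cutting stock problem, the oracle 2.19-2.21 with the a_i restricted to integers (they count finals in a pattern) is an unbounded knapsack problem, solvable by dynamic programming. A minimal sketch, using the widths of the earlier example and made-up dual values ū:

```python
def price_pattern(u, w, W):
    """Solve max sum(u_i * a_i) s.t. sum(w_i * a_i) <= W, a_i >= 0 integer
    (an unbounded knapsack) by DP over partial roll widths 0..W."""
    best = [0.0] * (W + 1)   # best[c]: max dual value packable in width c
    take = [None] * (W + 1)  # item placed last to achieve best[c]
    for c in range(1, W + 1):
        best[c], take[c] = best[c - 1], None     # leaving width unused is allowed
        for i, wi in enumerate(w):
            if wi <= c and best[c - wi] + u[i] > best[c]:
                best[c], take[c] = best[c - wi] + u[i], i
    a, c = [0] * len(w), W   # recover the pattern a = (a_1, ..., a_m)
    while c > 0:
        if take[c] is None:
            c -= 1
        else:
            a[take[c]] += 1
            c -= w[take[c]]
    return a, best[W]

# widths 9, 6, 5, 3 and W = 10 as in the example; the duals are hypothetical
a, val = price_pattern([0.9, 0.6, 0.5, 0.3], [9, 6, 5, 3], 10)
print(a, val)  # [0, 0, 2, 0] 1.0: the pattern 5+5 with dual value 1.0
```

Since the master objective coefficient of every pattern is 1, a generated pattern is an improving column exactly when its dual value exceeds 1; here the best value is 1.0, so no improving column exists for these duals.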
With the usual precautions against cycling of the simplex method, column generation is finite
and exact. In addition, it is possible to utilize knowledge of intermediate solution
quality during the process. Let z̄ denote the optimal objective function value of the
RMP. Note that by duality we have z̄ = ūb. Interestingly, when an upper bound κ ≥ Σ_{j∈J} λ_j
holds for an optimal solution of the master problem, we establish not only an upper bound
on z∗ in each iteration, but also a lower bound. We cannot reduce z̄ by more than κ
times the smallest reduced cost c̄∗, hence

    z̄ + c̄∗ κ ≤ z∗ ≤ z̄.    (2.22)
In the optimum solution of the MP, c̄∗ = 0 for the basic variables and the bounds close.
When the objective is already a sum of the variables, that is c = 1, we may use z∗ instead of κ
and obtain the improved lower bound z̄/(1 − c̄∗) ≤ z∗. For c ≥ 0, Farley [5] proposes a
more general lower bound at the expense of a slightly increased computational effort. Let
j′ ∈ argmin_{j∈J} {c_j/ū a_j : ū a_j > 0}. Then

    z̄ · c_{j′}/(ū a_{j′}) ≤ z∗ ≤ z̄.    (2.23)
This knowledge about the running bound helps in deciding whether to continue generating
columns at a particular node in the branch-and-bound tree. We discuss branch-and-bound
with column generation in chapter 4.
Chapter 3
subject to

    Σ_{k=1}^{K} x_{ik} ≥ b_i,        i = 1..m    (3.2)

    Σ_{i=1}^{m} w_i x_{ik} ≤ W y_k,  k = 1..K    (3.3)

    y_k = 0 or 1,                    k = 1..K    (3.4)

    x_{ik} ≥ 0 and integer,          i = 1..m, k = 1..K    (3.5)
    min cx    (3.6)

    subject to Ax = b    (3.7)

    x ∈ X    (3.8)

where the constraint set given by equation 3.8 has nice structure.
The LP relaxation of this model that results from dropping the integrality constraints on
the variable x can be very weak. A stronger model can be obtained by restricting the set
of points from set X that are considered in the LP relaxation of the reformulated model.
According to Minkowski's theorem, any point of a non-empty polyhedron X can be
expressed as a convex combination of the extreme points of X, denoted as x^1, x^2, · · · ,
x^{|P|}, plus a non-negative linear combination of the extreme rays of X, denoted as r^1, r^2,
· · · , r^{|R|}:

    X = {x ∈ R^n_+ : x = Σ_{p∈P} λ_p x^p + Σ_{r∈R} µ_r r^r,  Σ_{p∈P} λ_p = 1,  λ_p ≥ 0 ∀p ∈ P,  µ_r ≥ 0 ∀r ∈ R}    (3.10)

where P = {x^p} is the set of extreme points of X and R = {r^r} is the set of extreme rays of X.
Substituting this expression for x in the original model, we obtain the following model:

    min c(Σ_{p∈P} λ_p x^p + Σ_{r∈R} µ_r r^r)    (3.11)

    subject to A(Σ_{p∈P} λ_p x^p + Σ_{r∈R} µ_r r^r) = b    (3.12)

    Σ_{p∈P} λ_p = 1    (3.13)

    λ_p ≥ 0, ∀p ∈ P    (3.14)

    µ_r ≥ 0, ∀r ∈ R    (3.15)

Rearranging the terms, this becomes

    min Σ_{p∈P} (c x^p) λ_p + Σ_{r∈R} (c r^r) µ_r    (3.16)

    subject to Σ_{p∈P} (A x^p) λ_p + Σ_{r∈R} (A r^r) µ_r = b    (3.17)

    Σ_{p∈P} λ_p = 1,  λ_p ≥ 0 ∀p ∈ P,  µ_r ≥ 0 ∀r ∈ R    (3.18)
When the subproblem has the integrality property, the original and the reformulated model
have the same LP bounds. However, when the subproblem does not have the integrality
property, the relaxation of the reformulated model provides a better LP bound.
This happens in the cutting stock problem, as we will see in the next section.
    min Σ_{k=1}^{K} Σ_{p∈P} λ_{pk}    (3.19)

    subject to Σ_{k=1}^{K} Σ_{p∈P} a^p_{ik} λ_{pk} ≥ b_i,  ∀i    (3.20)

    Σ_{p∈P} λ_{pk} ≤ 1,  k = 1, 2, · · · , K    (3.21)

    Σ_{p∈P} a^p_{ik} λ_{pk} ≥ 0 and integer,  ∀i, k = 1, 2, · · · , K    (3.22)

    λ_{pk} ≥ 0,  ∀p ∈ P, k = 1, 2, · · · , K    (3.23)
It can be seen that the LP relaxation of this model has only those feasible solutions that are
non-negative linear combinations of the integer solutions to the knapsack problem.
Non-negative linear combinations of the fractional extreme points of the knapsack polytope,
which are feasible in the linear relaxation of the Kantorovich model, are eliminated.
Due to this fact, the reformulated model gives a better LP relaxation bound.
The subproblem decomposes into K subproblems, which are knapsack problems. When
all the rolls have the same width, all the subproblems are identical, and the reformulated
model is equivalent to the Gilmore-Gomory model. If the K rolls used are grouped
according to the cutting patterns they are cut to, the model takes the form of the
Gilmore-Gomory model:
    min Σ_{j∈J} x_j    (3.24)

    subject to Σ_{j∈J} a_{ij} x_j ≥ b_i,  ∀i    (3.25)

    x_j ≥ 0,  ∀j ∈ J    (3.26)

where a_{ij} represents the number of finals of width w_i obtained in cutting pattern j, and
x_j is the decision variable denoting the number of rolls to be cut according to pattern j.
It has been found in real-life cutting stock problems that the following conjecture holds.

Proposition 3.3.1 (Conjecture) The gap between the optimal value of the cutting stock
problem in the form of the Gilmore-Gomory model and its LP relaxation is less than 2.
Given bins of integer capacity W and a set of different item sizes w_1, w_2, · · · , w_m, the
problem of determining a valid solution for a single bin can be modeled as the problem of
finding a path in an acyclic directed graph with W + 1 vertices.

Consider a graph G = (V, A) with V = {0, 1, · · · , W} and A = {(i, j) : 0 ≤ i < j ≤ W
and j − i = w_d for some d ≤ m}, meaning that there exists a directed arc between two
vertices if there is an item of the corresponding size. The number of variables is O(mW).
Consider additional arcs (k, k + 1), k = 0, 1, · · · , W − 1, corresponding to the unoccupied
portion of the bin. These arcs are known as loss arcs.

There is a packing in a single bin iff there is a path between the vertices 0 and W. The
lengths of the arcs that constitute the path define the item sizes to be packed.
Figure 3.1 shows the graph associated with an instance with bin capacity W = 5 and items
of sizes 2 and 3. A path is also shown in the same figure; it corresponds to 1 item
of size 2 and 1 item of size 3.
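The arc set of this graph is simple to construct. A small sketch for the Figure 3.1 instance (the arc counts follow from the definitions above, not from the report):

```python
def build_arcs(widths, W):
    """Arc set A of the flow graph: an item arc (i, i + w) for every item
    width w and feasible start node i, plus unit loss arcs (k, k + 1)."""
    item = {(i, i + w) for w in widths for i in range(W - w + 1)}
    loss = {(k, k + 1) for k in range(W)}
    return item | loss

arcs = build_arcs([2, 3], 5)  # the Figure 3.1 instance: W = 5, sizes 2 and 3
print(len(arcs))              # 12 arcs: 4 of size 2, 3 of size 3, 5 loss arcs
print((0, 3) in arcs and (3, 5) in arcs)  # True: path 0 -> 3 -> 5 packs sizes 3 and 2
```

The path 0 → 3 → 5 checked above is one way to realise the packing shown in the figure, with arc lengths 3 and 2 giving the two item sizes.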
If a solution to a single bin corresponds to a flow of one unit between vertices 0 and
W, a path carrying a larger flow will correspond to using the same packing solution in
multiple bins. This leads us to the formulation of the problem.
The problem is formulated as that of determining the minimum flow between
vertices 0 and W, with additional constraints enforcing that the sum of the flows on the
arcs of each order must be greater than or equal to the number of items of the given size.
Consider decision variables x_{ij} associated with the arcs defined above, which correspond
to the number of items of size j − i placed in any bin at a distance of i units from the
beginning of the bin. The variable z can be seen as the total flow from vertex 0 to W.
The model is as follows:

    Minimize z    (3.27)

subject to

    Σ_{(i,j)∈A} x_{ij} − Σ_{(j,k)∈A} x_{jk} = { −z, if j = 0;  0, if j = 1, 2, · · · , W − 1;  z, if j = W }    (3.28)

    Σ_{(k,k+w_d)∈A} x_{k,k+w_d} ≥ b_d,  d = 1, 2, · · · , m    (3.29)
Proposition 3.4.1 The bounds of the linear programming relaxations of the flow model
and Gilmore & Gomory model are equal.
Using the variables presented thus far, there are many alternative solutions with exactly
the same items in each bin. These alternative solutions, in other words the symmetry of the
solution space, cause the undesirable effect of the same column getting generated repeatedly in
the column generation process. Both the symmetry and the size of the model can be reduced
by selecting only a subset of arcs from A using the following criteria.

The arcs are specified such that the corresponding items are ordered in decreasing values
of width.
Criterion 1 An arc of size w_e, designated by x_{k,k+w_e}, can only have its tail at a node k
that is the head of another arc of size w_d, x_{k−w_d,k}, for w_d ≥ w_e, or else at node 0, i.e.,
the left border of the bin.
In particular, if a bin has any loss, it will appear last in the bin; a bin can never start
with a loss. Formally this can be stated as follows.

Criterion 2 All the loss arcs x_{k,k+1} with k < w_m can be set to zero.
In a bin, the number of consecutive arcs corresponding to a single item size must be less
than or equal to the number of items of that size. Therefore, using criterion 1:

Criterion 3 Given any node k that is the head of an arc of size w_d (w_d > w_e), or k =
0, the only valid arcs for size w_e are those that start at nodes k + s·w_e, s = 0, 1, 2, · · · , b_e − 1,
with k + s·w_e ≤ W − w_e, where b_e is the demand for items of size w_e.
Figure 3.2 shows the graph associated with an instance with bin capacity W = 5 and items
of sizes 2 and 3, reduced as per criteria 1 & 2.

Even after applying these reductions, the model can still have symmetry. In instances
with a small average number of items per bin, which happen to be uncommon, some symmetry
may remain even after the criteria are applied. However, the symmetry is low and its
undesirable effects are not so harmful.
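Criteria 1 and 2 can be applied constructively. The following sketch (an illustration, not code from the report; criterion 3 is omitted since it needs the demands b_e) builds the reduced arc set for the Figure 3.2 instance:

```python
def reduced_arcs(widths, W):
    """Item arcs obeying criterion 1 (an arc of size w_e starts at 0 or at the
    head of an arc of size w_d >= w_e), built width by width in decreasing
    order, plus loss arcs obeying criterion 2 (no loss before node w_m)."""
    widths = sorted(widths, reverse=True)
    heads = {0}             # valid tail nodes accumulated so far
    arcs = set()
    for w in widths:
        frontier = list(heads)
        while frontier:     # arcs of the same size may chain: head feeds tail
            t = frontier.pop()
            if t + w <= W and (t, t + w) not in arcs:
                arcs.add((t, t + w))
                frontier.append(t + w)
        heads |= {head for (_, head) in arcs}
    arcs |= {(k, k + 1) for k in range(widths[-1], W)}  # criterion 2 loss arcs
    return arcs

print(sorted(reduced_arcs([3, 2], 5)))
# [(0, 2), (0, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)]
```

Compared with the 12 arcs of the unreduced graph for the same instance, only 7 arcs remain, and arcs such as (1, 3) or the loss arcs before node 2 are gone.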
Figure 3.2: Reduced flow graph
Chapter 4
Branch-and-price algorithms
Even with the stronger LP relaxations, the optimal solution to the LP relaxation may not give
the optimal solution to the original problem. Among the standard approaches to solving
integer programs using LP relaxations, the one we are interested in uses column generation
and branch-and-bound. Various names have been coined for the synthesis of column
generation and branch-and-bound, such as branch-and-price (Barnhart et al. [1]) and IP
column generation [8]. Beyond the decomposition principles, it essentially all boils down
to branch-and-bound.
In section 4.1 we outline the branch-and-price algorithm and the difficulties associated
with it. In section 4.2 we discuss an implementation of this algorithm using the flow-based
formulation [3].
2. Solve the RMP using column generation. This may take multiple iterations before
no improving column can be generated. The columns are generated by solving a
subproblem.
It may seem that branch-and-price involves nothing more than combining well-known
ideas for solving linear programs by column generation with branch-and-bound. However,
there are fundamental difficulties in applying column generation techniques for linear
programming in IP solution methods. These include:

• Solving these LPs and the subproblems to optimality may not be efficient, in which
case different rules will apply for managing the branch-and-price tree.
Barnhart et al. [1] describe the different techniques for handling the above issues.
In the next section we see an implementation of the generic algorithm that gives an exact
solution to the bin-packing problem.
subject to

    Σ_{(i,j)∈A_LP} x_{ij} − Σ_{(j,k)∈A_LP} x_{jk} = { −1, if j = 0;  0, if j = 1, 2, · · · , W − 1;  1, if j = W }    (4.2)
The arcs that correspond to the variables already in the restricted master problem
are also considered in the subproblem. There is an attractive path if the optimum is
strictly less than 0; otherwise, the solution to the restricted master problem is optimal.
4.2.2 Branch-and-bound
The procedure used here is oriented to a problem where the gaps are almost always strictly
smaller than one and can be reduced to zero by introducing the following valid inequality:

    Σ_{(k,k+1)∈A_LP} x_{k,k+1} ≥ L_min,  where the minimum loss L_min is given by ⌈z_LP⌉·W − Σ_{d=1}^{m} w_d b_d.
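As a worked illustration of the L_min formula with hypothetical data (W = 5, items of sizes 3 and 2 with demands 2 and 3, not an instance from the report):

```python
import math

def min_loss(z_lp, W, w, b):
    """L_min = ceil(z_LP) * W - sum(w_d * b_d): the total material any
    integer solution using ceil(z_LP) bins must waste."""
    return math.ceil(z_lp) * W - sum(wd * bd for wd, bd in zip(w, b))

w, b, W = [3, 2], [2, 3], 5                       # hypothetical demands
z_lp = sum(wd * bd for wd, bd in zip(w, b)) / W   # material bound = 12/5 = 2.4
print(min_loss(z_lp, W, w, b))  # 3: ceil(2.4) = 3 bins hold 15 units, items fill 12
```

Here z_lp stands in for the LP relaxation value at the node; for the flow model this material bound coincides with the LP bound on such small instances, but in general z_LP should be taken from the node's LP solution.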
• the solution is integer, with value equal to z_LB, which means that it is optimal;

• the solution is fractional, with a value equal to z_LB, making it necessary to introduce
new branching constraints;

• the solution has a value that is strictly greater than z_LB. The column generation
procedure is called, trying to reach a solution with value z_LB, leading either to case
1 or 2. If this is not possible, the node is fathomed.
The number of decision problems to be solved is finite, as there is an absolute performance
ratio for the bin-packing problem. Also, the set of constraints that can be imposed in each
decision problem is finite, which guarantees the finiteness of the entire solution process.
Figure 4.1 shows a flowchart for the column generation/branch-and-bound procedure.
Figure 4.1: Flow chart for the algorithm
Chapter 5
Conclusions
Column generation is a success story in solving large-scale integer programs. The
advantage of keeping only a few columns at any time makes many real-life integer
optimization problems tractable. When used with an extensive formulation having a
stronger linear programming relaxation, the method gives results more quickly.
Application-specific heuristics applied at each node of the branch-and-bound tree also
improve the efficiency of the procedure.
References
[2] V. Chvátal. Linear Programming, chapter 13. W. H. Freeman and Company, New
York, 1983.
[4] J. M. Valério de Carvalho. LP models for bin packing and cutting stock problems.
European Journal of Operational Research, 141(2):253–273, 2002.
[8] F. Vanderbeck and L. A. Wolsey. An exact algorithm for IP column generation.
Operations Research Letters, 19:151–159, 1996.
27