Combinatorial Optimization
Lecture 3: Geometric Aspects of Linear Programming and an Introduction to the Simplex Algorithm
Notes taken by Adam Tenenbaum
January 31, 2005
Lecture notes for a course given by Avner Magen, Dept. of Computer Science, University of Toronto.
1 Review
In our previous class, we took some important steps toward defining an efficient algorithm for solving linear programs (LPs).
Candidate solutions, the basic feasible solutions (BFSs), arise when we pick a set of linearly independent columns of $A$ for which some nonnegative combination of those columns equals $b$.
An easy corollary is that if there is a solution (i.e., the LP is feasible), then there is a solution which is a BFS.
We also saw that every BFS can be exposed; that is, it is possible to pick an objective function that makes the BFS the unique optimal solution.
Theorem 1.1 (Carathéodory). Let $v_1, \dots, v_n \in \mathbb{R}^d$, and suppose $x = \sum_{i=1}^{n} \lambda_i v_i$ with $\lambda_i \ge 0$ for all $i$ and $\sum_{i=1}^{n} \lambda_i = 1$. Then $x$ can be written as $x = \sum_{i=1}^{n} \mu_i v_i$ with $\mu_i \ge 0$, $\sum_i \mu_i = 1$, and at most $d+1$ of the $\mu_i$ nonzero. In other words, if $x$ is a convex combination of the $v_i$'s, then it can be represented by a convex combination of only $d+1$ of the $v_i$'s.
Proof. To prove this theorem, we write finding the scalars as a linear program: find $\lambda \in \mathbb{R}^n$ such that
$$\begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix} \lambda = x, \qquad \lambda \ge 0.$$
We then include the constraint $\sum_{i=1}^{n} \lambda_i = 1$ by inserting a row of ones:
$$\begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \\ 1 & 1 & \cdots & 1 \end{pmatrix} \lambda = \begin{pmatrix} x \\ 1 \end{pmatrix}, \qquad \lambda \ge 0.$$
A BFS for the system must exist since the system is feasible (the given $\lambda$ solves it); as such, there is a solution with at most $d+1$ non-zero coefficients, because the constraint matrix has $d+1$ rows and hence at most $d+1$ linearly independent columns. We take the positive components of that BFS to be the $\mu_i$'s.
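To make the proof concrete, here is a minimal computational sketch of the argument, assuming scipy is available: we append the row of ones to the matrix of points and ask an LP solver for a basic feasible solution (the HiGHS dual-simplex method returns a vertex of the feasible set). The function name caratheodory and the tolerance are our own.

    import numpy as np
    from scipy.optimize import linprog

    def caratheodory(V, x):
        """Given points V (d x n) and a point x in their convex hull,
        return a convex combination of x that uses at most d+1 points."""
        d, n = V.shape
        A_eq = np.vstack([V, np.ones(n)])       # the inserted row of ones
        b_eq = np.append(x, 1.0)
        # Any objective will do; we only need a basic feasible solution.
        res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * n, method="highs-ds")
        assert res.success
        support = np.flatnonzero(res.x > 1e-9)  # the positive components
        return res.x, support                   # |support| <= d + 1

    # Four points in the plane; their centre needs only 3 of them.
    V = np.array([[0.0, 1.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0]])
    lam, support = caratheodory(V, np.array([0.5, 0.5]))
    print(lam, support)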
2 Geometry of Linear Programming

Example 2.1. Consider the system
$$x_1 + x_2 + x_3 = 1, \qquad x_1, x_2, x_3 \ge 0. \qquad (1)$$

Figure 2: the feasible solutions to Equation (1), a shaded triangle in $\mathbb{R}^3$ with vertices $(1,0,0)$, $(0,1,0)$, and $(0,0,1)$.
The set of feasible solutions to Equation (1) is represented by the shaded triangle shown in Figure 2. We can also consider reducing the dimension of the problem by removing the slack variable $x_3$:
$$x_1 + x_2 \le 1, \qquad x_1, x_2 \ge 0. \qquad (2)$$
We get (Figure 3) a somewhat similar triangle that is presented in two (rather than
three) dimensions.
Figure 3: the feasible solutions to Equation (2), the same triangle drawn in $\mathbb{R}^2$.
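To spell out the reduction: since $x_3$ appears in the equality of (1) with coefficient 1, it can be eliminated,
$$x_3 = 1 - x_1 - x_2 \ge 0 \iff x_1 + x_2 \le 1,$$
so the feasible set of (2) is exactly the projection of the triangle of (1) onto the $(x_1, x_2)$-plane.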
Definition 2.2. A hyperplane is the set $H = \{x \in \mathbb{R}^n : a^T x = b\}$, where $a \in \mathbb{R}^n$ is nonzero and $b \in \mathbb{R}$.

Definition 2.3. Similarly, a halfspace is the set $\{x \in \mathbb{R}^n : a^T x \ge b\}$ (or $\{x : a^T x \le b\}$), where $a \in \mathbb{R}^n$ is nonzero and $b \in \mathbb{R}$; its boundary is the hyperplane $\{x : a^T x = b\}$.
Observation 2.4. If we take an LP in standard form, then our set of feasible solutions is an intersection of halfspaces: each equality constraint $a_i^T x = b_i$ is the intersection of the halfspaces $a_i^T x \le b_i$ and $a_i^T x \ge b_i$, and each constraint $x_j \ge 0$ is itself a halfspace.
Since halfspaces are convex, the intersection of halfspaces is convex. Conversely, every closed convex set can be represented as an intersection of halfspaces (but not necessarily a finite intersection).
Definition 2.5. A polyhedron is a finite intersection of halfspaces. Furthermore, a polyhedron which is bounded (contained in a big enough cube) is a polytope.
Since the set of feasible solutions of an LP is always a finite intersection of halfspaces, it always forms a polyhedron.
Definition 2.6. A hyperplane $H$ is a supporting hyperplane for a polyhedron $P$ if $H \cap P \ne \emptyset$ ($H$ is not disjoint from $P$) and $P \subseteq H^+$ or $P \subseteq H^-$ ($P$ is contained in either the upper or the lower halfspace determined by $H$).

Definition 2.7. The intersection $H \cap P$ of a polyhedron $P$ with a supporting hyperplane $H$ is called a face.
Definition 2.8. The dimension of a set is the dimension of the smallest affine space
containing it.
For example, the feasible sets in Figures 2 and 3 are of dimension 2.
A face of dimension 0, i.e., a face consisting of a single point, is called a vertex.

Lemma 2.9. A point $x$ is a vertex of $P$ iff $x$ is a BFS. Equivalently, $x$ is a vertex iff it is an extreme point.
Before we begin the proof, we note that since vertices are equivalent to BFSs and extreme points, it is enough to look at vertices when looking for the optimal solution.
Proof. Here, we prove that a BFS is a vertex. Recall that if $x$ is a BFS, then there exists a basis $B$ (a set of indices of linearly independent columns of $A$) such that $x_j = 0$ for $j \notin B$. We define the hyperplane $H$ as
$$H = \{y \in \mathbb{R}^n : c^T y = 0\}, \qquad \text{where } c_j = \begin{cases} 0 & j \in B, \\ 1 & j \notin B. \end{cases}$$
Since $x_j = 0$ for $j \notin B$, it follows that $c^T x = 0$, hence $x \in H$. Furthermore, since any feasible $y$ satisfies $y \ge 0$, we get $c^T y \ge 0$; hence the feasible set $P$ is contained in the upper halfspace of $H$. Therefore, $H$ is a supporting hyperplane for $P$, and its intersection with $P$ is the single point $x$: any $y \in H \cap P$ has $y_j = 0$ for all $j \notin B$, and since the columns in $B$ are linearly independent there is a unique such solution to $Ay = b$, namely $x$. So $\{x\}$ is a face of dimension 0, i.e., a vertex.
Note that we have previously shown the equivalence between BFSs and extreme points;
the proof of equivalence between vertices and extreme points will complete the proof
of the lemma and is left as an exercise.
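As a concrete instance of this construction (our own small example, built on Equation (1)): take $A = (1\ 1\ 1)$, $b = 1$, and the BFS $x = (1, 0, 0)$ with basis $B = \{1\}$. Then
$$c = (0, 1, 1)^T, \qquad H = \{\, y \in \mathbb{R}^3 : y_2 + y_3 = 0 \,\}.$$
Every feasible $y$ satisfies $y \ge 0$, so $c^T y \ge 0$ and the triangle lies in the upper halfspace of $H$; moreover, $c^T y = 0$ forces $y_2 = y_3 = 0$, which together with $y_1 + y_2 + y_3 = 1$ gives $y = (1, 0, 0)$. So $H \cap P = \{x\}$, exhibiting $x$ as a vertex.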
3 The Simplex Algorithm

We now turn to the simplex algorithm for solving an LP in standard form: $\min c^T x$ subject to $Ax = b$, $x \ge 0$, with $A$ an $m \times n$ matrix. The algorithm moves from one BFS to an adjacent one, so we first need an initial BFS to start from. Assume, without loss of generality, that all entries in $b$ are non-negative, since we can multiply a row in $A$ and the corresponding entry of $b$ by $-1$ without changing the solution space.
We consider a different problem: that of solving
$$\min \sum_{i=1}^{m} y_i \qquad \text{s.t.} \qquad Ax + Iy = b, \quad x, y \ge 0.$$
We can then use the simplex algorithm to solve this problem; notice that in the new system, the starting phase is easy: simply take $(x, y) = (0, b)$ as the BFS. If we can solve it with $\sum_i y_i = 0$, then we have found a solution to $Ax = b$, $x \ge 0$, and thus a BFS from which we can begin our original problem. Note that we do not have any difficulty finding a BFS from which to start the subproblem, as $(0, b)$ is a BFS for the subproblem with $y_1, \dots, y_m$ as the basic variables (corresponding to the linearly independent columns in the appended identity matrix).
More precisely, at the end of this starting phase, the simplex algorithm will give us an optimal solution $(x^*, y^*)$ to the auxiliary linear program. We will then have one of three possible cases:

1. $\sum_i y_i^* > 0$. If this is the case, there are no BFSs for the original program; in other words, the original LP is infeasible.

2. $\sum_i y_i^* = 0$, and the set $B$ of basic indices that the algorithm ends with consists only of columns of $A$ and is of size $m$. When this is the case, $x^*$ is a feasible solution to the original LP and $B$ is a basis of linearly independent columns. Thus, $x^*$ is a BFS corresponding to $B$.

3. $\sum_i y_i^* = 0$, but fewer than $m$ of the final basic indices correspond to columns of $A$. In this case, we can find additional columns in $A$ to get $m$ linearly independent columns. This is achieved by checking, for each $j$, whether column $A_j$ is linearly independent of all columns chosen so far; if it is, $A_j$ is added to the set. A sketch of this starting phase appears below.
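The following sketch, assuming scipy's linprog, carries out this starting phase; the helper name initial_bfs is ours, and the column-completion step of case 3 is omitted for brevity.

    import numpy as np
    from scipy.optimize import linprog

    def initial_bfs(A, b):
        """Phase I: find a BFS of {Ax = b, x >= 0}, or report infeasibility,
        by solving  min sum(y)  s.t.  Ax + Iy = b,  x, y >= 0."""
        m, n = A.shape
        A, b = A.copy(), b.copy()
        flip = b < 0
        A[flip] *= -1                     # normalize rows so that b >= 0
        b[flip] *= -1
        A_aux = np.hstack([A, np.eye(m)])
        c_aux = np.concatenate([np.zeros(n), np.ones(m)])
        res = linprog(c_aux, A_eq=A_aux, b_eq=b,
                      bounds=[(0, None)] * (n + m), method="highs-ds")
        assert res.success
        if res.fun > 1e-9:                # case 1: sum(y) > 0, LP infeasible
            return None
        return res.x[:n]                  # cases 2/3: feasible, <= m nonzeros

    A = np.array([[1.0, 1.0, 1.0]])
    b = np.array([1.0])
    print(initial_bfs(A, b))              # a BFS of the triangle of Eq. (1)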
"!
! 0
0 2 &&&
7
as the set of basic variables of our starting BFS, where 0 2 is the th index in the basis.
2
0 7 &&&
We know that for the BFS
4 ,
(3)
Furthermore, each column $A_j$ of $A$ which is not in the basis can be written as a linear combination of the columns which are in the basis:
$$A_j = \sum_{i=1}^{m} u_i A_{B(i)}.$$
Here, the $u_i$ are entries in a matrix (one column per non-basic index $j$; equivalently, $u = A_B^{-1} A_j$) which are used to form linear combinations of the basis columns.
A geometrical viewpoint of adding column $A_j$ to the basis is that the basis increases from $m$ to $m+1$ components. The previous basis (which belonged to a BFS) corresponded to a vertex; linear combinations of the previous basis with a scaled version of the new column $A_j$ (with scaling $\theta$, where $0 \le \theta \le \theta_0$) correspond to moving along an edge. The upper bound $\theta_0$ of $\theta$ will be explained shortly. We can write the scaled column as $\theta A_j = \theta \sum_{i=1}^{m} u_i A_{B(i)}$, so that, by (3),
$$b = \sum_{i=1}^{m} \left( x_{B(i)} - \theta u_i \right) A_{B(i)} + \theta A_j.$$
Hence, as long as $x_{B(i)} - \theta u_i \ge 0$ for all $i$, we still have a feasible solution (i.e., all $x_k$ for $k$ not in $B \cup \{j\}$ are still 0). If, further, there exists an $i$ for which $x_{B(i)} - \theta u_i = 0$, then we reach a solution with at most $m$ nonzero coordinates, which we will see is a new BFS, with $j$ in the new basis.
Figure 5 illustrates the operation of moving from one BFS to another. Here, the move is made from $x$ to the new vertex $x' = x - \theta_0 d$, where $d$ is the vector with $d_{B(i)} = u_i$, $d_j = -1$, and zeros elsewhere.
Example 3.3 (Three examples of $x_B$ and $u$). Here, we consider three examples of finding $\theta_0$, the maximum $\theta$ for which a scaled version of a column can be subtracted.
Figure 5: moving from the BFS $x$ along a vector in direction $d$ to a new vertex.

Example A. Here, all basic coordinates of $x$ are strictly positive; subtracting a scalar multiple of the column is possible up to some finite $\theta_0 > 0$, and any greater $\theta$ would make some coordinate negative.

Example B. Here, $\theta_0 = 0$: no move can be made, since any positive $\theta$ would cause the first element to be negative. This is the case whenever there is a zero entry in $x_B$ and a positive corresponding entry in $u$. A BFS with fewer than $m$ nonzero elements, such as the one shown here, is called degenerate.

Example C. Here, no entry of $u$ is positive, so the solution stays feasible for every $\theta \ge 0$.
We formally define $\theta_0$ as:
$$\theta_0 = \min_{i \,:\, u_i > 0} \frac{x_{B(i)}}{u_i}.$$
The minimization is only performed over positions $i$ where $u_i > 0$, since we have seen that these are the ones that can cause problems. The three examples shown above are typical of the three possible solutions for $\theta_0$:
A: $0 < \theta_0 < \infty$.
B: $\theta_0 = 0$ (a degenerate BFS).
C: $\theta_0 = \infty$. This occurs when there are no issues with nonnegativity: all $u_i \le 0$, so $\theta$ can grow forever. A small numerical illustration of the three cases follows.
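A direct transcription of this definition, with illustrative numbers of our own (the lecture's original vectors for Examples A-C did not survive):

    import numpy as np

    def theta0(x_B, u):
        """Ratio test: theta_0 = min over {i : u_i > 0} of x_B[i] / u[i].
        Returns np.inf when no u_i is positive (case C)."""
        pos = u > 0
        if not pos.any():
            return np.inf
        return float(np.min(x_B[pos] / u[pos]))

    print(theta0(np.array([2.0, 1.0]), np.array([1.0, 2.0])))   # A: 0.5
    print(theta0(np.array([0.0, 1.0]), np.array([1.0, 1.0])))   # B: 0 (degenerate)
    print(theta0(np.array([2.0, 1.0]), np.array([-1.0, 0.0])))  # C: inf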
Claim 3.4. Suppose $\theta_0 < \infty$, and let $i^*$ be an index achieving the minimum above. Define $x'$ as follows: $x'_{B(i)} = x_{B(i)} - \theta_0 u_i$ for $i = 1, \dots, m$, $x'_j = \theta_0$, and $x'_k = 0$ for every other $k$. Then $x'$ is a new BFS with $B' = (B \setminus \{B(i^*)\}) \cup \{j\}$ as the new basis.
Proof. We start our proof by showing that $x'$ is feasible; this is true since both $x'_{B(i)} = x_{B(i)} - \theta_0 u_i$ (non-negative by the choice of $\theta_0$) and $x'_j = \theta_0$ are non-negative.
Next we show that $B'$ is still a basis. In order to prove this, we will show that the columns $\{A_k : k \in B'\}$ are linearly independent. Let $A_j$ be the column entering the basis. Then $A_j$ can be written as a linear combination of the columns which are in the original basis $B$: $A_j = \sum_{i=1}^{m} u_i A_{B(i)}$, where $u_{i^*} > 0$. So we can get the following equality:
$$A_{B(i^*)} = \frac{1}{u_{i^*}} \Big( A_j - \sum_{i \ne i^*} u_i A_{B(i)} \Big).$$
Thus the span of $\{A_k : k \in B'\}$ contains $A_{B(i^*)}$, and therefore contains the span of the original basis, which has dimension $m$. Since $|B'| = m$, the columns $\{A_k : k \in B'\}$ must be linearly independent. So $B'$ is still a basis.
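Putting Claim 3.4 and its proof together, one pivot step can be sketched as follows (a minimal version with our own naming, assuming an invertible basis; real implementations maintain factorizations instead of re-solving):

    import numpy as np

    def pivot(A, b, basis, j):
        """One simplex pivot: bring column j into `basis` (a list of m
        column indices of A); return the new BFS and basis, or None when
        theta_0 = infinity (case C)."""
        m, n = A.shape
        A_B = A[:, basis]
        x_B = np.linalg.solve(A_B, b)        # current basic values, Eq. (3)
        u = np.linalg.solve(A_B, A[:, j])    # A_j = sum_i u_i A_{B(i)}
        pos = np.flatnonzero(u > 1e-12)
        if pos.size == 0:
            return None                      # can move forever
        ratios = x_B[pos] / u[pos]
        i_star = pos[np.argmin(ratios)]      # leaving position, achieves theta_0
        theta0 = ratios.min()
        x = np.zeros(n)
        x[basis] = x_B - theta0 * u
        x[j] = theta0
        new_basis = list(basis)
        new_basis[i_star] = j                # B' = (B \ {B(i*)}) U {j}
        return x, new_basis

    # The triangle of Eq. (1): move from BFS (1,0,0) by bringing in column 1.
    A = np.array([[1.0, 1.0, 1.0]]); b = np.array([1.0])
    print(pivot(A, b, [0], 1))               # -> x = (0,1,0), basis [1]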
As an extended example, consider the following LP, whose feasible region is the polytope shown in Figure 6:
$$\max\; c^T x \quad \text{s.t.} \quad x_1 + x_4 = 2, \quad x_2 + x_5 = 2, \quad x_3 + x_6 = 3, \quad x_1 + x_2 + x_3 + x_7 = 4, \quad x_i \ge 0 \text{ for all } i.$$
Our first goal will be to transform this LP from standard form to canonical form.
Note that this is a problem with $n = 7$ variables and $m = 4$ equality constraints. This set of equality constraints can be represented in matrix form as $Ax = b$. Also note that in this particular problem, the matrix $A$ is in the form $A = [\,C \mid I\,]$; that is, the $4 \times 4$ matrix at the right of $A$ is the identity matrix. If one is faced with a problem in which this is not the case, it is possible to transform the last $m$ columns of $A$ into the identity matrix through elementary row operations.
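If the identity block is not already present, the row operations can be carried out in one shot by multiplying with the inverse of the rightmost $m \times m$ block. A small numpy sketch (our own helper, assuming that block is invertible):

    import numpy as np

    def normalize(A, b):
        """Row-reduce [A | b] so that the last m columns of A become the
        identity (assumes those columns are linearly independent)."""
        m = A.shape[0]
        M = np.hstack([A, b.reshape(-1, 1)]).astype(float)
        T = np.linalg.inv(M[:, -m - 1:-1])  # inverse of the last m columns of A
        M = T @ M                           # a sequence of elementary row ops
        return M[:, :-1], M[:, -1]

    # The example's matrix already has the identity on the right, so
    # normalize() leaves it unchanged.
    A = np.array([[1., 0, 0, 1, 0, 0, 0],
                  [0, 1, 0, 0, 1, 0, 0],
                  [0, 0, 1, 0, 0, 1, 0],
                  [1, 1, 1, 0, 0, 0, 1]])
    b = np.array([2., 2., 3., 4.])
    A2, b2 = normalize(A, b)
    print(np.allclose(A2, A), np.allclose(b2, b))   # True True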
By noting that, for the variables corresponding to the identity part of $A$, $x_{3+i} = b_i - \sum_{k=1}^{3} C_{ik} x_k$, where $C_{ik}$ is the $(i,k)$ entry in the matrix $C$, and also noting that $x_{3+i} \ge 0$, these equalities can be written as inequalities in $x_1, x_2, x_3$: $\sum_{k=1}^{3} C_{ik} x_k \le b_i$. The remaining variables $x_1, x_2, x_3$ satisfy the inequalities $x_1, x_2, x_3 \ge 0$. Thus, we have expressed the problem in canonical form:
$$\max\; \tilde{c}^T x \quad \text{s.t.} \quad x_1 \le 2, \quad x_2 \le 2, \quad x_3 \le 3, \quad x_1 + x_2 + x_3 \le 4, \quad x_1, x_2, x_3 \ge 0,$$
where $\tilde{c}$ is the objective obtained by substituting the slack variables into $c^T x$.
Note that the problem in canonical form consists of seven inequalities, which we label $A_1, \dots, A_7$ (as marked in Figure 6). We also note that we have reduced the domain of the problem from $\mathbb{R}^7$ to $\mathbb{R}^3$ by removing the slack variables. From this form, it is much easier to view the problem geometrically.
Now, consider starting from the BFS $x = (0, 0, 3, 2, 2, 0, 1)$ with basis $\{3, 4, 5, 7\}$. This corresponds to the point $(0, 0, 3)$ in $\mathbb{R}^3$. As we have seen in class, moving from one BFS to another is accomplished by relaxing one of the equalities to an inequality. Here, we consider removing the equality $x_1 = 0$. Then, we will add a scaled version of the variable $x_1$, namely $\theta x_1$. Geometrically, this action corresponds to moving along a line towards the point $(1, 0, 3)$. This occurs since we have removed the equality $x_1 = 0$, and the intersection of the hyperplanes corresponding to the remaining equalities $x_2 = 0$ and $x_3 = 3$ is the line connecting the two points $(0, 0, 3)$ and $(1, 0, 3)$. As we have seen in our description of the simplex algorithm, we will continue moving along this line until one of the inequalities reaches an equality. In this case, this will occur at the point $(1, 0, 3)$, where $x_1 + x_2 + x_3 = 4$. At that point, the intersection of the hyperplanes corresponding to the equalities $x_2 = 0$, $x_3 = 3$, and $x_1 + x_2 + x_3 = 4$ forms the vertex $(1, 0, 3)$, which is a BFS.
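We can sanity-check this geometric picture by enumerating the vertices of the canonical-form polytope: each vertex is a feasible intersection of three of the seven bounding hyperplanes $A_1, \dots, A_7$. A brute-force sketch, using the constraint data from the canonical form above (the enumeration may also find vertices whose labels are hidden in Figure 6):

    import numpy as np
    from itertools import combinations

    # x1 <= 2, x2 <= 2, x3 <= 3, x1+x2+x3 <= 4, x1, x2, x3 >= 0, as G x <= h.
    G = np.array([[ 1, 0, 0], [ 0, 1, 0], [ 0, 0, 1], [ 1, 1, 1],
                  [-1, 0, 0], [ 0,-1, 0], [ 0, 0,-1]], dtype=float)
    h = np.array([2, 2, 3, 4, 0, 0, 0], dtype=float)

    vertices = set()
    for rows in combinations(range(7), 3):   # each vertex: 3 tight inequalities
        M = G[list(rows)]
        if abs(np.linalg.det(M)) < 1e-9:
            continue
        v = np.linalg.solve(M, h[list(rows)])
        if np.all(G @ v <= h + 1e-9):        # keep only feasible intersections
            vertices.add(tuple(np.round(v, 6)))
    print(sorted(vertices))                  # includes the points labelled in Fig. 6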
We also noted that with degenerate BFSs, we must be careful not to get trapped
in a loop. If we only consider the slope of the edge (i.e. its gradient) connecting
degenerate vertices, we may think that moving will improve the objective function. However, since the two points are separated by distance zero, clearly no
improvement can be made.
Finally, we used our geometrical interpretation to reassert that local optima are
global optima. This holds since the polytope is convex.
Figure 6: Polytope corresponding to the example LP, drawn in $(x_1, x_2, x_3)$-space. Its facets are labelled $A_1, \dots, A_7$, and its visible vertices are $(0,0,0)$, $(2,0,0)$, $(0,2,0)$, $(2,2,0)$, $(0,0,3)$, $(1,0,3)$, $(0,1,3)$, and $(2,0,2)$.