
Dinh The Luc

Multiobjective Linear Programming
An Introduction
Dinh The Luc
Avignon University
Avignon
France

ISBN 978-3-319-21090-2 ISBN 978-3-319-21091-9 (eBook)


DOI 10.1007/978-3-319-21091-9

Library of Congress Control Number: 2015943841

Springer Cham Heidelberg New York Dordrecht London


© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media


(www.springer.com)
To Dieu Huyen,
Liuli and The Duc
Preface

Multiobjective optimization problems arise in decision-making processes in many
areas of human activity, including economics, engineering, transportation, water
resources, and the social sciences. Although most real-life problems involve nonlinear
objective functions and constraints, solution methods are principally straightforward
in problems with a linear structure. Apart from Zeleny’s classic 1974 work entitled
“Linear Multiobjective Programming” and Steuer’s 1986 book “Multiple Criteria
Optimization: Theory, Computation and Application,” nearly all textbooks and
monographs on multiobjective optimization are devoted to nonconvex problems in a
general setting, sometimes with set-valued data, which are not always accessible to
practitioners. The main purpose of this book is to introduce readers to the field of
multiobjective optimization using problems with fairly simple structures, namely those
in which the objective and constraint functions are linear. By working with linear
problems, readers will easily come to grasp the fundamental concepts of vector
problems, recognize parallels between more complicated problems and scalar linear
programming, analyze the difficulties related to multi-dimensionality in the outcome
space, and develop effective methods for treating multiobjective problems.
Because of the introductory nature of the book, we have sought to present the
material in as elementary a fashion as possible, so as to require only a minimum of
mathematical background knowledge. The first part of the book consists of two
chapters providing the necessary concepts and results on convex polyhedral sets
and linear programming to prepare readers for the new area of optimization with
several objective functions. The second part of the book begins with an examination
of the concept of Pareto optimality, distinguishing it from the classical concept of
optimality used in traditional optimization. Two of the most interesting topics in
this part of the book involve duality and stability in multiple objective linear
programming, both of which are discussed in detail. The third part of the book is
devoted to numerical algorithms for solving multiple objective linear programs.
This includes the well-known multiple objective simplex method, the outcome
space method, and a recent method using normal cone directions.


Although some new research results are incorporated into the book, it is well
suited for use in the first part of a course on multiobjective optimization for
undergraduates or first-year graduate students in applied mathematics, engineering,
computer science, operations research, and economics. Neither integer problems
nor fuzzy linear problems are addressed. Further, applications to other domains are
not tackled, though students will certainly have no real difficulty in studying them
once the basic results of this book have been assimilated.
During the preparation of this manuscript I have benefited from the assistance of
many people. I am grateful to my Post-Ph.D. and Ph.D. students Anulekha Dhara,
Truong Thi Thanh Phuong, Tran Ngoc Thang, and Moslem Zamani for their careful
reading of the manuscript. I would also like to thank Moslem Zamani for the
illustrative figures he made for this book. I want to take this opportunity to give
special thanks to Juan-Enrique Martinez-Legaz (Autonomous University of
Barcelona), Boris Mordukhovich (Wayne State University), Nguyen Thi Bach Kim
(Hanoi Polytechnical University), Panos Pardalos (University of Florida), Michel
Thera (University of Limoges), Majid Soleimani-Damaneh (University of Tehran),
Ralph E. Steuer (University of Georgia), Michel Volle (University of Avignon), and
Mohammad Yaghoobi (University of Kerman) for their valued support in this
endeavor.

Avignon, December 2014
Dinh The Luc
Contents

1 Introduction  1

Part I Background

2 Convex Polyhedra  7
2.1 The Space Rn  7
2.2 System of Linear Inequalities  14
2.3 Convex Polyhedra  19
2.4 Basis and Vertices  41

3 Linear Programming  49
3.1 Optimal Solutions  49
3.2 Dual Problems  59
3.3 The Simplex Method  67

Part II Theory

4 Pareto Optimality  85
4.1 Pareto Maximal Points  85
4.2 Multiobjective Linear Problems  102
4.3 Scalarization  107
4.4 Exercises  112

5 Duality  119
5.1 Dual Sets and Dual Problems  119
5.2 Ideal Dual Problem  127
5.3 Strong Dual Problem  131
5.4 Weak Dual Problem  139
5.5 Lagrangian Duality  144
5.6 Parametric Duality  167
5.7 Exercises  176

6 Sensitivity and Stability  183
6.1 Parametric Convex Polyhedra  183
6.2 Sensitivity  195
6.3 Error Bounds and Stability  200
6.4 Post-optimal Analysis  215
6.5 Exercises  234

Part III Methods

7 Multiobjective Simplex Method  241
7.1 Description of the Method  241
7.2 The Multiobjective Simplex Tableau  247
7.3 Exercises  258

8 Normal Cone Method  261
8.1 Normal Index Sets  261
8.2 Positive Index Sets  266
8.3 The Normal Cone Method  272
8.4 Exercises  284

9 Outcome Space Method  289
9.1 Analysis of the Efficient Set in the Outcome Space  289
9.2 Free Disposal Hull  293
9.3 Outer Approximation  297
9.4 The Outcome Space Algorithm  299
9.5 Exercises  305

Bibliographical Notes  309

References  313

Index  323
Notations

N  Natural numbers
R  Real numbers
Rn  Euclidean n-dimensional space
L(Rn, Rm)  Space of m × n matrices
B_n  Closed unit ball in Rn
S_n  Unit sphere in Rn
B_{m×n}  Closed unit ball in L(Rn, Rm)
e  Vector of ones
e^i  i-th coordinate unit vector
Δ  Standard simplex
‖x‖  Euclidean norm
‖x‖_∞  Max-norm
⟨x, y⟩  Canonical scalar product
≦  Less than or equal to
≤  Less than but not equal to
<  Strictly less than
aff(A)  Affine hull
cl(A), Ā  Closure
int(A)  Interior
ri(A)  Relative interior
co(A)  Convex hull
\overline{co}(A)  Closed convex hull
cone(A)  Conic hull
pos(A)  Positive hull
Max(A)  Set of maximal elements
WMax(A)  Set of weakly maximal elements
Min(A)  Set of minimal elements
WMin(A)  Set of weakly minimal elements
S(MOLP)  Efficient solution set
WS(MOLP)  Weakly efficient solution set
sup(A)  Supremum
I(x)  Active index set at x
A^⊥  Orthogonal complement
A°  Negative polar cone
A_∞  Recession/asymptotic cone
N_A(x)  Normal cone
d(x, C)  Distance function
h(A, B)  Hausdorff distance
gr(G)  Graph
supp(x)  Support
Chapter 1
Introduction

Mathematical optimization studies the problem of finding the best element from a set
of feasible alternatives with regard to a criterion or objective function. It is written
in the form

optimize f (x)
subject to x ∈ X,

where X is a nonempty set, called a feasible set or a set of feasible alternatives, and
f is a real function on X , called a criterion or objective function. Here “optimize”
stands for either “minimize” or “maximize”, which amounts to finding x̄ ∈ X such
that either f(x̄) ≤ f(x) for all x ∈ X, or f(x̄) ≥ f(x) for all x ∈ X.
This model offers a general framework for studying a variety of real-world and
theoretical problems in the sciences and human activities. However, in many practical
situations, we tend to encounter problems that involve not just one criterion, but a
number of criteria, which are often in conflict with each other. It then becomes
impossible to model such problems in the above-mentioned optimization framework.
Here are some instances of such situations.
Automotive design The objective of automotive design is to determine the technical
parameters of a vehicle to minimize (1) production costs, (2) fuel consumption, and
(3) emissions, while maximizing (4) performance and (5) crash safety. These criteria
are not always compatible; for instance a high-performance engine often involves
very high production costs, which means that no design can optimally fulfill all
criteria.
House purchase Buying property is one of life’s weightiest decisions and often
requires the help of real estate agencies. An agency suggests a number of houses
or apartments which roughly meet the potential buyer’s budget and requirements. In
order to make a decision, the buyer assesses the available offers on the basis of his
or her criteria. The final choice should satisfy the following: minimal cost, minimal
maintenance charges, maximal quality and comfort, best environment etc. It is quite


natural that the higher the quality of the house, the more expensive it is; as such, it
is impossible to make the best choice without compromising.
Distributing electrical power In a system of thermal generators the chief problem
concerns allocating the output of each generator in the system. The aim is not only
to satisfy the demand for electricity, but also to fulfill two main criteria: minimizing
the costs of power generation and minimizing emissions. Since the costs and the
emissions are measured in different units, we cannot combine the two criteria into one.
Queen Dido’s city Queen Dido’s famous problem consists of finding a territory
bounded by a line which has the maximum area for a given perimeter. According to
elementary calculus, the solution is known to be a circle. However, as it is
inconceivable to have a city touching the sea without a seashore, Queen Dido set another
objective, namely for her territory to have as large a seashore as possible. As a result,
a semicircle partly satisfies her two objectives, but fails to maximize either aspect.
As we have seen, even in the simplest situations described above there can be
no alternative found that simultaneously satisfies all criteria, which means that the
known concepts of optimization do not apply and there is a real need to develop new
notions of optimality for problems involving multiple objective functions. Such a
concept was introduced by Pareto (1848–1923), an Italian economist who explained
the Pareto optimum as follows: “The optimum allocation of the resources of a society
is not attained so long as it is possible to make at least one individual better off in his
own estimation while keeping others as well off as before in their own estimation.”
Prior to Pareto, the Irish economist Edgeworth (1845–1926) had defined an optimum
for the multiutility problem of two consumers P and Q as “a point (x, y) such that in
whatever direction we take an infinitely small step, P and Q do not increase together
but that, while one increases, the other decreases.” According to the definition put
forward by Pareto, among the feasible alternatives, those that can simultaneously be
improved with respect to all criteria cannot be optimal. And an alternative is optimal
if any alternative better than it with respect to a certain criterion is worse with respect
to some other criterion, that is, if a tradeoff takes place when trying to find a better
alternative. From the mathematical point of view, if one defines a domination order
in the set of feasible alternatives by a set of criteria—an alternative a dominates
an alternative b if the value of every criterion function at a is bigger than that at
b—then an alternative is optimal in the Pareto sense if it is dominated by no other
alternatives. In other words, an alternative is optimal if it is maximal with respect
to the above order. This explains the mathematical origin of the theory of multiple
objective optimization, which stems from the theory of ordered spaces developed by
Cantor (1845–1918) and Hausdorff (1868–1942).
A typical example of ordered spaces, frequently encountered in practice, is the
finite dimensional Euclidean space Rn with n ≥ 2, in which two vectors a and b
are comparable, let’s say a is bigger than or equal to b if all coordinates of a are
bigger than or equal to the corresponding coordinates of b. A multiple objective
optimization problem is then written as

Maximize F(x) := (f_1(x), · · · , f_k(x))
subject to x ∈ X,

where f_1, · · · , f_k are real objective functions on X and “Maximize” signifies finding
an element x̄ ∈ X such that no value F(x), x ∈ X, is bigger than the value F(x̄). It
is essential to note that the solution x̄ is not worse than any other solution, but it is
in no way the best one; that is, the value F(x̄) cannot, in general, be bigger than or
equal to all values F(x), x ∈ X. A direct consequence of this observation is the fact
that the set of “optimal values” is not a singleton, which forces practitioners to find
a number of “optimal solutions” before making a final decision. Therefore, solving
a multiple objective optimization problem is commonly understood as finding the
entire set of “optimal solutions” or “optimal values”, or at least a representative
portion of them. Indeed, this is the point that makes multiple objective optimization
a challenging and fascinating field of theoretical research and application.
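
To make the domination order concrete, here is a minimal numerical sketch (in Python with numpy, our choice of illustration, not the book’s) that filters a finite set of criterion vectors down to its Pareto maximal elements; the helper name pareto_maximal is hypothetical.

import numpy as np

def pareto_maximal(points):
    # keep the rows that no other row dominates, where y dominates x
    # if y >= x componentwise and y != x (the order described above)
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, x in enumerate(pts):
        dominated = any(np.all(y >= x) and np.any(y > x)
                        for j, y in enumerate(pts) if j != i)
        if not dominated:
            keep.append(i)
    return pts[keep]

# three alternatives scored on two criteria: (1, 3) and (2, 2) are
# incomparable and both maximal, while (0, 1) is dominated by both
print(pareto_maximal([[1, 3], [2, 2], [0, 1]]))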
Part I
Background
Chapter 2
Convex Polyhedra

We begin the chapter by introducing basic concepts of convex sets and linear
functions in a Euclidean space. We review some of the fundamental facts about convex
polyhedral sets determined by systems of linear equations and inequalities, including
Farkas’ theorem of the alternative, which is considered a keystone of the theory of
mathematical programming.

2.1 The Space Rn

Throughout this book, Rn denotes the n-dimensional Euclidean space of real column
n-vectors. The norm of a vector x with components x_1, · · · , x_n is given by

‖x‖ = ( ∑_{i=1}^{n} (x_i)^2 )^{1/2}.

The inner product of two vectors x and y in Rn is expressed as

⟨x, y⟩ = ∑_{i=1}^{n} x_i y_i.

The closed unit ball, the open unit ball and the unit sphere of Rn are respectively
defined by

B_n := {x ∈ Rn : ‖x‖ ≤ 1},
int(B_n) := {x ∈ Rn : ‖x‖ < 1},
S_n := {x ∈ Rn : ‖x‖ = 1}.


Given a nonempty set Q ⊆ Rn, we denote the closure of Q by cl(Q) and its interior
by int(Q). The conic hull, the positive hull and the affine hull of Q are respectively
given by

cone(Q) := { ta : a ∈ Q, t ∈ R, t ≥ 0 },

pos(Q) := { ∑_{i=1}^{k} t_i a^i : a^i ∈ Q, t_i ∈ R, t_i ≥ 0, i = 1, · · · , k, with k ∈ N },

aff(Q) := { ∑_{i=1}^{k} t_i a^i : a^i ∈ Q, t_i ∈ R, i = 1, · · · , k and ∑_{i=1}^{k} t_i = 1, with k ∈ N },

where N denotes the set of natural numbers (Figs. 2.1, 2.2 and 2.3).

Fig. 2.1 Conic hull (with Q = Q1 ∪ Q2)
Fig. 2.2 Positive hull (with Q = Q1 ∪ Q2)
Fig. 2.3 Affine hull (with Q = Q1 ∪ Q2)
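
Membership in the positive hull reduces to linear feasibility: x ∈ pos{a^1, · · · , a^k} exactly when the system At = x, t ≧ 0 has a solution, where the a^i are the columns of A. A small Python sketch using scipy.optimize.linprog (our illustration; the data below are made up):

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0, 1.0],    # columns are the generators a^1, a^2, a^3
              [0.0, 1.0, 1.0]])
x = np.array([2.0, 3.0])
# feasibility of A t = x, t >= 0, tested with a zero objective
r = linprog(np.zeros(A.shape[1]), A_eq=A, b_eq=x,
            bounds=[(0, None)] * A.shape[1])
print(r.status == 0)   # True here: for instance x = 2 a^1 + 3 a^2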

Among the sets described above, cone(Q) and pos(Q) are cones, that is, they are
invariant under multiplication by positive numbers; pos(Q) is also invariant under
addition of its elements; and aff(Q) is an affine subspace of Rn. For two vectors x
and y of Rn, the inequalities x > y and x ≧ y mean respectively x_i > y_i and x_i ≧ y_i
for all i = 1, · · · , n. When x ≧ y and x ≠ y, we write x ≥ y. So a vector x is
positive, that is x ≧ 0, if its components are non-negative; and it is strictly positive
if its components are all strictly positive. The set of all positive vectors of Rn is the
positive orthant Rn+. Sometimes row vectors are also considered. They are transposes
of column vectors. Operations on row vectors are performed in the same manner as
on column vectors. Thus, for two row n-vectors c and d, their inner product is
expressed by

⟨c, d⟩ = ⟨c^T, d^T⟩ = ∑_{i=1}^{n} c_i d_i,

where the upper index T denotes the transpose. On the other hand, if c is a row vector
and x is a column vector, then the product cx is understood as a matrix product, which
is equal to the inner product ⟨c^T, x⟩.

Convex sets

We call a subset Q of Rn convex if the segment joining any two points of Q lies entirely
in Q, which means that for every x, y ∈ Q and every real number λ ∈ [0, 1] one
has λx + (1 − λ)y ∈ Q (Figs. 2.4 and 2.5). It follows directly from the definition that the
intersection of convex sets, the Cartesian product of convex sets, the image and inverse
image of a convex set under a linear transformation, and the interior and the closure of a
convex set are convex. In particular, the sum Q1 + Q2 := {x + y : x ∈ Q1, y ∈ Q2}
of two convex sets Q1 and Q2 is convex; the conic hull of a convex set is convex.
The positive hull and the affine hull of any set are convex.

Fig. 2.4 Convex set
Fig. 2.5 Nonconvex set

The convex hull of Q, denoted co(Q) (Fig. 2.6), consists of all convex combinations
of elements of Q, that is,

co(Q) := { ∑_{i=1}^{k} λ_i x^i : x^i ∈ Q, λ_i ≥ 0, i = 1, · · · , k and ∑_{i=1}^{k} λ_i = 1, with k ∈ N }.

It is the intersection of all convex sets containing Q. The closure of the convex hull
of Q will be denoted by \overline{co}(Q), which is exactly the intersection of all closed convex
sets containing Q. The positive hull of a set is the conic hull of its convex hull. A
convex combination ∑_{i=1}^{k} λ_i x^i is strict if all coefficients λ_i are strictly positive.
Given a nonempty convex subset Q of Rn , the relative interior of Q, denoted
ri(Q), is its interior relative to its affine hull, that is,
 
ri(Q) := { x ∈ Q : (x + εB_n) ∩ aff(Q) ⊆ Q for some ε > 0 }.

Equivalently, a point x in Q is a relative interior point if and only if for any point y in
Q there is a positive number δ such that the segment joining the points x − δ(x − y)
and x + δ(x − y) entirely lies in Q. As a consequence, any strict convex combination
of a finite collection {x 1 , · · · , x k } belongs to the relative interior of its convex hull
(see also Lemma 6.4.8). It is important to note also that every nonempty convex set
in Rn has a nonempty relative interior. Moreover, if two convex sets Q 1 and Q 2 have
at least one relative interior point in common, then ri(Q 1 ∩ Q 2 ) = ri(Q 1 ) ∩ ri(Q 2 ).

Fig. 2.6 Convex hull of Q
Fig. 2.7 The standard simplex in R3

Example 2.1.1 (Standard simplex) Let e^i be the i-th coordinate unit vector of Rn,
that is, its components are all zero except for the i-th component, which is equal to one.
Let Δ denote the convex hull of e^1, · · · , e^n. Then a vector x with components x_1, · · · , x_n is
an element of Δ if and only if x_i ≥ 0, i = 1, · · · , n, and ∑_{i=1}^{n} x_i = 1. This set has no
interior point. However, its relative interior consists of those x with x_i > 0, i = 1, · · · , n,
and ∑_{i=1}^{n} x_i = 1. The set Δ is called the standard simplex of Rn (Fig. 2.7).

Caratheodory’s theorem

It turns out that the convex hull of a set Q in the space Rn can be obtained by convex
combinations of at most n + 1 elements of Q. First we establish the analogous statement
for positive hulls.

Theorem 2.1.2 Let {a^1, · · · , a^k} be a collection of vectors in Rn. Then for every
nonzero vector x from the positive hull pos{a^1, · · · , a^k} there exists an index set
I ⊆ {1, · · · , k} such that
(i) the vectors a^i, i ∈ I, are linearly independent;
(ii) x belongs to the positive hull pos{a^i : i ∈ I}.

Proof Since the collection {a^1, · · · , a^k} is finite, we may choose an index set I of
minimum cardinality such that x ∈ pos{a^i : i ∈ I}. It is evident that there are strictly
positive numbers t_i, i ∈ I, such that x = ∑_{i∈I} t_i a^i. We prove that (i) holds for this
I. Indeed, if not, one can find an index j ∈ I and real numbers s_i such that

a^j − ∑_{i∈I\{j}} s_i a^i = 0.

Set

ε = min { t_j , −t_i/s_i : i ∈ I with s_i < 0 }

and express

x = ∑_{i∈I} t_i a^i − ε ( a^j − ∑_{i∈I\{j}} s_i a^i )
  = (t_j − ε) a^j + ∑_{i∈I\{j}} (t_i + ε s_i) a^i.

It is clear that in the latter sum the coefficients corresponding to the indices that
realize the minimum in the definition of ε are equal to zero. By this, x lies in the
positive hull of fewer than |I| vectors of the collection. This contradiction completes
the proof. □
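
The proof of Theorem 2.1.2 is constructive, and the reduction step can be carried out numerically. The following Python sketch (numpy assumed; the function name and the tolerance are our own) repeats the reduction, exactly as in the proof, until the support of the weight vector indexes linearly independent columns:

import numpy as np

def reduce_positive_combination(A, t, tol=1e-10):
    # given x = A @ t with t >= 0, return weights with the same value A @ t
    # whose support indexes linearly independent columns of A
    t = np.asarray(t, dtype=float).copy()
    while True:
        I = np.flatnonzero(t > tol)
        B = A[:, I]
        if len(I) == 0 or np.linalg.matrix_rank(B) == len(I):
            t[t <= tol] = 0.0
            return t
        s = np.linalg.svd(B)[2][-1]        # null-space direction: B @ s ~ 0
        if s.max() <= tol:                 # make sure some entry is positive
            s = -s
        pos = s > tol
        eps = np.min(t[I][pos] / s[pos])   # largest step keeping weights >= 0
        t[I] = t[I] - eps * s              # at least one weight drops to zero

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])            # third column = sum of the first two
t = np.array([1.0, 1.0, 1.0])
t2 = reduce_positive_combination(A, t)
print(t2, np.allclose(A @ t, A @ t2))      # reduced support, same point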

A collection of vectors {a^1, · · · , a^k} in Rn is said to be affinely independent if
the dimension of the subspace aff{a^1, · · · , a^k} is equal to k − 1. By convention, a set
consisting of a solitary vector is affinely independent. The next result is a version of
Caratheodory’s theorem, well known in convex analysis.

Corollary 2.1.3 Let {a^1, · · · , a^k} be a collection of vectors in Rn. Then for every
x ∈ co{a^1, · · · , a^k} there exists an index set I ⊆ {1, · · · , k} such that
(i) the vectors a^i, i ∈ I, are affinely independent;
(ii) x belongs to the convex hull of the a^i, i ∈ I.

Proof We consider the collection of vectors v^i = (a^i, 1), i = 1, · · · , k, in the space
Rn × R. It is easy to verify that x belongs to the convex hull co{a^1, · · · , a^k} if
and only if the vector (x, 1) belongs to the positive hull pos{v^1, · · · , v^k}. Applying
Theorem 2.1.2 to the latter positive hull we deduce the existence of an index set
I ⊆ {1, · · · , k} such that the vector (x, 1) belongs to the positive hull pos{v^i : i ∈ I}
and the collection {v^i : i ∈ I} is linearly independent. Then x belongs to the convex
hull co{a^i : i ∈ I} and the collection {a^i : i ∈ I} is affinely independent. □

Linear operators and matrices

A mapping φ : Rn → Rk is called a linear operator between Rn and Rk if
(i) φ(x + y) = φ(x) + φ(y),
(ii) φ(tx) = tφ(x)
for every x, y ∈ Rn and t ∈ R. The kernel and the image of φ are the sets

Ker φ = {x ∈ Rn : φ(x) = 0},
Im φ = {y ∈ Rk : y = φ(x) for some x ∈ Rn}.

These sets are linear subspaces of Rn and Rk respectively.

We denote by C the k × n-matrix whose columns are c^1, · · · , c^n, where c^i is the
image under φ of the i-th coordinate unit vector e^i. Then for every vector x of Rn
one has

φ(x) = Cx.

The mapping x ↦ Cx is clearly a linear operator from Rn to Rk. This explains
why one can identify a linear operator with a matrix. The space of k × n matrices
is denoted by L(Rn, Rk). The transpose of a matrix C is denoted by C^T. The norm
and the inner product in the space of matrices are given by

‖C‖ = ( ∑_{i,j} |c_{ij}|^2 )^{1/2},
⟨C, B⟩ = ∑_{i,j} c_{ij} b_{ij}.

The norm ‖C‖ is also called the Frobenius norm. The inner product ⟨C, B⟩ is
nothing but the trace of the matrix CB^T. Sometimes the space L(Rn, Rk) is identified
with the (n × k)-dimensional Euclidean space R^{n×k}.
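
The Frobenius norm and the trace identity above are easy to verify numerically. A quick Python check (numpy; our own snippet, not from the book):

import numpy as np

C = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])                 # a 2 x 3 matrix
B = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])

frob = np.sqrt((C ** 2).sum())                  # (sum of |c_ij|^2)^(1/2)
inner = (C * B).sum()                           # sum of c_ij * b_ij
print(np.isclose(frob, np.linalg.norm(C, 'fro')))   # Frobenius norm agrees
print(np.isclose(inner, np.trace(C @ B.T)))         # <C, B> = trace(C B^T)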

Linear functionals

A particular case of linear operators is when the value space is one-dimensional. This
is the space of linear functionals on Rn, which is often identified with the space Rn itself.
Thus, each linear functional φ is given by a vector d_φ via the formula

φ(x) = ⟨d_φ, x⟩.

When d_φ ≠ 0, the kernel of φ is called a hyperplane; the vector d_φ is a normal vector
to this hyperplane. Geometrically, d_φ is orthogonal to the hyperplane Ker φ. The sets

{x ∈ Rn : ⟨d_φ, x⟩ ≥ 0},
{x ∈ Rn : ⟨d_φ, x⟩ ≤ 0}

are closed halfspaces, and the sets

{x ∈ Rn : ⟨d_φ, x⟩ > 0},
{x ∈ Rn : ⟨d_φ, x⟩ < 0}

are open halfspaces bounded by the hyperplane Ker φ. Given a real number α and a
nonzero vector d of Rn, one also considers hyperplanes of the type

H(d, α) = {x ∈ Rn : ⟨d, x⟩ = α}.

The sets

H_+(d, α) = {x ∈ Rn : ⟨d, x⟩ ≥ α},
H_-(d, α) = {x ∈ Rn : ⟨d, x⟩ ≤ α}

are the positive and negative halfspaces, and the sets

int H_+(d, α) = {x ∈ Rn : ⟨d, x⟩ > α},
int H_-(d, α) = {x ∈ Rn : ⟨d, x⟩ < α}

are the positive and negative open halfspaces.


Theorem 2.1.4 Let Q be a nonempty convex set in Rn and let ⟨d, ·⟩ be a positive
functional on Q, that is, ⟨d, x⟩ ≥ 0 for every x ∈ Q. If ⟨d, x⟩ = 0 for some relative
interior point x of Q, then ⟨d, ·⟩ is zero on Q.

Proof Let y be any point in Q. Since x is a relative interior point, there exists a
positive number δ such that x + t(y − x) ∈ Q for |t| ≤ δ. Applying ⟨d, ·⟩ to this
point we obtain

⟨d, x + t(y − x)⟩ = t⟨d, y⟩ ≥ 0

for all t ∈ [−δ, δ]. This implies that ⟨d, y⟩ = 0, as requested. □

2.2 System of Linear Inequalities

We shall mainly deal with two kinds of systems of linear equations and inequalities.
The first system consists of k inequalities

⟨a^i, x⟩ ≤ b_i, i = 1, · · · , k,   (2.1)

where a^1, · · · , a^k are n-dimensional column vectors and b_1, · · · , b_k are real numbers;
the second system consists of k equations and involves positive vectors only:

⟨a^i, x⟩ = b_i, i = 1, · · · , k,   (2.2)
x ≧ 0.

Denoting by A the k × n-matrix whose rows are the transposes of a^1, · · · , a^k and
by b the column k-vector with components b_1, · · · , b_k, we can write the systems (2.1)
and (2.2) in the matrix forms

Ax ≦ b   (2.3)

and

Ax = b,   (2.4)
x ≧ 0.

Notice that any system of linear equations and inequalities can be converted to the
two matrix forms described above. To this end it suffices to perform three operations:

(a) Express each variable x_i as the difference of two non-negative variables,
x_i = x_i^+ − x_i^-, where

x_i^+ = max{x_i, 0},
x_i^- = max{−x_i, 0}.

(b) Introduce a non-negative slack variable y_i in order to obtain equivalence between
the inequality ⟨a^i, x⟩ ≤ b_i and the equality ⟨a^i, x⟩ + y_i = b_i. Similarly, with a
non-negative surplus variable z_i one may express the inequality ⟨a^i, x⟩ ≥ b_i as the
equality ⟨a^i, x⟩ − z_i = b_i.

(c) Express the equality ⟨a^i, x⟩ = b_i by the two inequalities ⟨a^i, x⟩ ≤ b_i and ⟨a^i, x⟩ ≥ b_i.

Example 2.2.1 Consider the following system:

x_1 + 2x_2 = 1,
−x_1 − x_2 ≥ 0.

It is written in form (2.3) as

\begin{pmatrix} 1 & 2 \\ -1 & -2 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} ≦
\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}

and in form (2.4) with a surplus variable y as

\begin{pmatrix} 1 & -1 & 2 & -2 & 0 \\ -1 & 1 & -1 & 1 & -1 \end{pmatrix}
(x_1^+, x_1^-, x_2^+, x_2^-, y)^T = \begin{pmatrix} 1 \\ 0 \end{pmatrix},

(x_1^+, x_1^-, x_2^+, x_2^-, y)^T ≧ 0.
Redundant equation

Given the system (2.4), we say it is redundant if at least one of the equations (called a
redundant equation) can be expressed as a linear combination of the others. In other
words, it is redundant if there is a nonzero k-dimensional vector λ such that

A^T λ = 0,
⟨b, λ⟩ = 0.

Moreover, redundant equations can be dropped from the system without changing
its solution set. Similarly, an inequality of (2.1) is called redundant if its removal
from the system does not change the solution set.

Proposition 2.2.2 Assume that k ≤ n and that the system (2.4) is consistent. Then
it is not redundant if and only if the matrix A has full rank.

Proof If one of the equations, say ⟨a^1, x⟩ = b_1, is redundant, then a^1 is a linear
combination of a^2, · · · , a^k. Hence the rank of A is not maximal; it is less than k. Conversely,
when the rank of A is maximal (equal to k), no row of A is a linear combination of
the others. Hence no equation of the system can be expressed as a linear combination
of the others. □

Farkas’ theorem
One of the theorems of the alternative that are pillars of the theory of linear and
nonlinear programming is Farkas’ theorem or Farkas’ lemma. There are a variety of
ways to prove it, the one we present here is elementary.
Theorem 2.2.3 (Farkas’ theorem) Exactly one of the following systems has a
solution:
(i) Ax = b and x ≧ 0;
(ii) A^T y ≧ 0 and ⟨b, y⟩ < 0.

Proof If the first system has a solution x, then for every y with A^T y ≧ 0 one has

⟨b, y⟩ = ⟨Ax, y⟩ = ⟨x, A^T y⟩ ≥ 0,

which shows that the second system has no solution.


Now suppose the first system has no solution. Then either the system

Ax = b

has no solution, or it does have a solution, but every solution of it is not positive. In the
first case, choose m linearly independent columns of A, say a_1, · · · , a_m, where m is
the rank of A. Then the vectors a_1, · · · , a_m, b are linearly independent too (because
b does not lie in the space spanned by a_1, · · · , a_m). Consequently, the system

⟨a_i, y⟩ = 0, i = 1, · · · , m,
⟨b, y⟩ = −1

admits a solution. This implies that the system (ii) has solutions too. It remains to
prove the solvability of (ii) when Ax = b has solutions and they are all non-positive.
We do it by induction on the dimension of x. Assume n = 1. If the system a_{i1}x_1 =
b_i, i = 1, · · · , k, has a negative solution x_1, then y = −(b_1, · · · , b_k)^T is a solution
of (ii) because A^T y = −(a_{11}^2 + · · · + a_{k1}^2)x_1 > 0 and ⟨b, y⟩ = −(b_1^2 + · · · + b_k^2) < 0.
Now assume n > 1 and that the result is true in dimension n − 1. Given an
n-vector x, denote by x̄ the (n − 1)-vector consisting of the first n − 1 components
of x. Let Ā be the matrix composed of the first n − 1 columns of A. It is clear that
the system

Āx̄ = b and x̄ ≧ 0

has no solution. By induction there is some y such that

Ā^T y ≧ 0,
⟨b, y⟩ < 0.

If ⟨a_n, y⟩ ≥ 0, we are done. If ⟨a_n, y⟩ < 0, define new vectors

â_i = ⟨a_i, y⟩a_n − ⟨a_n, y⟩a_i, i = 1, · · · , n − 1,
b̂ = ⟨b, y⟩a_n − ⟨a_n, y⟩b

and consider a new system

â_1 ξ_1 + · · · + â_{n−1} ξ_{n−1} = b̂.   (2.5)

We claim that this system of k equations has no positive solution. Indeed, if not, say
ξ_1, · · · , ξ_{n−1} were non-negative solutions, then the vector x with

x_i = ξ_i, i = 1, · · · , n − 1,
x_n = −(1/⟨a_n, y⟩) ( ⟨a_1 ξ_1 + · · · + a_{n−1} ξ_{n−1}, y⟩ − ⟨b, y⟩ )

should be a positive solution of (i), because −⟨b, y⟩ > 0 and ⟨Āξ, y⟩ = ⟨ξ, Ā^T y⟩ ≥ 0
for ξ = (ξ_1, · · · , ξ_{n−1})^T ≧ 0, implying x_n ≥ 0. Applying the induction hypothesis
to (2.5) we deduce the existence of a k-vector ŷ with

⟨â_i, ŷ⟩ ≥ 0, i = 1, · · · , n − 1,
⟨b̂, ŷ⟩ < 0.

Then the vector ⟨a_n, ŷ⟩y − ⟨a_n, y⟩ŷ satisfies the system (ii). The proof is
complete. □
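
Farkas’ theorem can also be explored numerically: a linear program either finds a point of system (i) or, when (i) is infeasible, another one finds a certificate for system (ii). A Python sketch with scipy.optimize.linprog (the helper name and the box bounds on y, which merely normalize the certificate, are our own choices):

import numpy as np
from scipy.optimize import linprog

def farkas_alternative(A, b):
    m, n = A.shape
    # system (i): Ax = b, x >= 0, tested as a feasibility problem
    p = linprog(np.zeros(n), A_eq=A, b_eq=b, bounds=[(0, None)] * n)
    if p.status == 0:
        return '(i)', p.x
    # system (ii): minimize <b, y> subject to A^T y >= 0 and |y_i| <= 1
    d = linprog(b, A_ub=-A.T, b_ub=np.zeros(n), bounds=[(-1, 1)] * m)
    return '(ii)', d.x          # by the theorem, <b, d.x> < 0 here

A = np.array([[1.0, 2.0], [-1.0, 1.0]])
print(farkas_alternative(A, np.array([1.0, 0.0])))    # (i): x = (1/3, 1/3)
print(farkas_alternative(A, np.array([-1.0, 0.0])))   # (ii): a certificate y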

A number of consequences can be derived from Farkas’ theorem which are useful
in the study of linear systems and linear programming problems.

Corollary 2.2.4 Exactly one of the following systems has a solution:
(i) Ax = 0, ⟨c, x⟩ = 1 and x ≧ 0;
(ii) A^T y ≧ c.

Proof If (ii) has a solution y, then for a positive vector x with Ax = 0 one has

0 = ⟨y, Ax⟩ = ⟨A^T y, x⟩ ≥ ⟨c, x⟩.

So (i) is not solvable. Conversely, if (i) has no solution, then applying Farkas’ theorem
to the inconsistent system

\begin{pmatrix} A \\ c^T \end{pmatrix} x = \begin{pmatrix} 0 \\ 1 \end{pmatrix} and x ≧ 0

yields the existence of a vector y and of a real number t such that

\begin{pmatrix} A^T & c \end{pmatrix} \begin{pmatrix} y \\ t \end{pmatrix} ≧ 0 and
⟨ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} y \\ t \end{pmatrix} ⟩ < 0.

Hence t < 0 and −y/t is a solution of (ii). □

Corollary 2.2.5 Exactly one of the following systems has a solution:
(i) Ax ≥ 0 and x ≧ 0;
(ii) A^T y ≦ 0 and y > 0.

Proof By introducing a surplus variable z ∈ Rk we convert (i) into the equivalent
system

Ax − Iz = 0,
\begin{pmatrix} x \\ z \end{pmatrix} ≧ 0,
⟨ c, \begin{pmatrix} x \\ z \end{pmatrix} ⟩ = 1,

where c is an (n + k)-vector whose first n components are all zero and whose remaining
components are one. According to Corollary 2.2.4 it has no solution if and only if
the following system has a solution:

\begin{pmatrix} A^T \\ -I \end{pmatrix} y ≧ c.

It is clear that the latter system is equivalent to (ii). □

The next corollary is known as Motzkin’s theorem of the alternative.


Corollary 2.2.6 (Motzkin’s theorem) Let A and B be two matrices having the same
number of columns. Exactly one of the following systems has a solution:
(i) Ax > 0 and Bx ≧ 0;
(ii) A^T y + B^T z = 0, y ≥ 0 and z ≧ 0.

Proof The system (ii) is evidently equivalent to the following one:

\begin{pmatrix} A^T & B^T \\ e^T & 0 \end{pmatrix}
\begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\begin{pmatrix} y \\ z \end{pmatrix} ≧ 0.

By Farkas’ theorem it is compatible (has a solution) if and only if the following
system is incompatible:

\begin{pmatrix} A & e \\ B & 0 \end{pmatrix}
\begin{pmatrix} x \\ t \end{pmatrix} ≧ 0,
⟨ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} x \\ t \end{pmatrix} ⟩ < 0.

The latter system is evidently equivalent to the system (i). □

Some classical theorems of the alternative are immediate from Corollary 2.2.6.

• Gordan’s theorem (B is the zero matrix): exactly one of the following systems has a solution:
(1) Ax > 0;
(2) A^T y = 0 and y ≥ 0.

• Ville’s theorem (B is the identity matrix): exactly one of the following systems has a solution:
(3) Ax > 0 and x ≧ 0;
(4) A^T y ≦ 0 and y ≥ 0.

• Stiemke’s theorem (A is the identity matrix and B is replaced by \begin{pmatrix} B \\ -B \end{pmatrix}):
exactly one of the following systems has a solution:
(5) Bx = 0 and x > 0;
(6) B^T y ≥ 0.

2.3 Convex Polyhedra

A set that can be expressed as the intersection of a finite number of closed half-spaces
is called a convex polyhedron. A bounded convex polyhedron is called a polytope.
According to the definition of closed half-spaces, a convex polyhedron is the solution
set of a finite system of inequalities

⟨a^i, x⟩ ≤ b_i, i = 1, · · · , k,   (2.6)

where a^1, · · · , a^k are n-dimensional column vectors and b_1, · · · , b_k are real numbers.
When b_i = 0, i = 1, · · · , k, the solution set of (2.6) is a cone, called a
convex polyhedral cone. We assume throughout this section that the system is not
redundant and solvable.

Supporting hyperplanes and faces


Let P be a convex polyhedron and let

H = {x ∈ Rn : ⟨v, x⟩ = α}

be a hyperplane with v nonzero. We say that H is a supporting hyperplane of P at a point
x ∈ P if the intersection of H with P contains x and P is contained in one of the
closed half-spaces bounded by H (Fig. 2.8). In this case, the nonempty set H ∩ P is
called a face of P. Thus, a nonempty subset F of P is a face if there is a nonzero
vector v ∈ Rn such that

⟨v, y⟩ ≤ ⟨v, x⟩ for all x ∈ F, y ∈ P.

When a face is zero-dimensional, it is called a vertex. A nonempty polyhedron may


have no vertex. By convention P is a face of itself; other faces are called proper
faces. One-dimensional faces are called edges. Two vertices are said to be adjacent
if they are end-points of an edge.

Example 2.3.1 Consider a system of three inequalities in R2:

x_1 + x_2 ≤ 1,   (2.7)
−x_1 − x_2 ≤ 0,   (2.8)
−x_1 ≤ 0.   (2.9)

Fig. 2.8 Supporting hyperplane

The polyhedron defined by (2.7) and (2.8) has no vertex. It has two one-dimensional
faces, determined respectively by x_1 + x_2 = 1 and x_1 + x_2 = 0, and one two-dimensional
face, the polyhedron itself. The polyhedron defined by (2.7)–(2.9) has
two vertices (zero-dimensional faces), determined respectively by

{x_1 = 0, x_2 = 0} and {x_1 = 0, x_2 = 1},

three one-dimensional faces, given by

{x_1 + x_2 ≤ 1, −x_1 − x_2 ≤ 0, x_1 = 0}, {x_1 + x_2 = 1, −x_1 ≤ 0} and {−x_1 − x_2 = 0, −x_1 ≤ 0},

and one two-dimensional face, the polyhedron itself.

Proposition 2.3.2 Let P be a convex polyhedron. The following properties hold.


(i) The intersection of any two faces is a face if it is nonempty.
(ii) Two different faces have no relative interior point in common.

Proof We prove (i) first. Assume F_1 and F_2 are two faces with nonempty intersection.
If they coincide, there is nothing to prove. If not, let H_1 and H_2 be two supporting
hyperplanes that generate these faces, say

H_1 = {x ∈ Rn : ⟨v^1, x⟩ = α_1},
H_2 = {x ∈ Rn : ⟨v^2, x⟩ = α_2}.

Since these hyperplanes contain the intersection of the distinct faces F_1 and F_2, the
vector v = v^1 + v^2 is not zero. Consider the hyperplane

H = {x ∈ Rn : ⟨v, x⟩ = α_1 + α_2}.

It is a supporting hyperplane of P because it evidently contains the intersection of
the faces F_1 and F_2, and for every point x in P one has

⟨v, x⟩ = ⟨v^1, x⟩ + ⟨v^2, x⟩ ≤ α_1 + α_2.   (2.10)

It remains to show that the intersection of H and P coincides with the intersection
F_1 ∩ F_2. The inclusion

F_1 ∩ F_2 ⊆ H ∩ P

being clear, we show the converse. Let x be in H ∩ P. Then (2.10) becomes an equality
for this x. But ⟨v^1, x⟩ ≤ α_1 and ⟨v^2, x⟩ ≤ α_2, so that equality in (2.10) is possible
only when the two latter inequalities are equalities. This proves that x belongs to
both F_1 and F_2.
For the second assertion, notice that if F_1 and F_2 have a relative interior point in
common, then in view of Theorem 2.1.4 the functional ⟨v^1, ·⟩ is constant on F_2. It
follows that F_2 ⊆ H_1 ∩ P ⊆ F_1. Similarly, one has F_1 ⊆ F_2, and hence equality
holds. □

Let x be a solution of the system (2.6). Define the active index set at x to be the
set

I(x) = {i ∈ {1, · · · , k} : ⟨a^i, x⟩ = b_i}.

The remaining indices are called inactive indices.
The remaining indices are called inactive indices.

Theorem 2.3.3 Assume that P is a convex polyhedron given by (2.6). A nonempty
proper convex subset F of P is a face if and only if there is a nonempty maximal
index set I ⊆ {1, · · · , k} such that F is the solution set of the system

⟨a^i, x⟩ = b_i, i ∈ I,   (2.11)
⟨a^j, x⟩ ≤ b_j, j ∈ {1, · · · , k}\I,   (2.12)

in which case the dimension of F is equal to n − rank{a^i : i ∈ I}.

Proof Denote the solution set of the system (2.11), (2.12) by F, which we suppose
nonempty. To prove that it is a face, we set

v = ∑_{i∈I} a^i and α = ∑_{i∈I} b_i.

Notice that v is nonzero because F is not empty and the system (2.6) is not redundant.
It is clear that the negative half-space H_-(v, α) contains P. Moreover, if x is a
solution of the system, then of course x belongs to P and to H at the same time,
which implies F ⊆ H ∩ P. Conversely, any point x of the latter intersection satisfies

⟨a^i, x⟩ ≤ b_i, i = 1, · · · , k,
∑_{i∈I} ⟨a^i, x⟩ = ∑_{i∈I} b_i.

The latter equality is possible only when those inequalities with indices from I are
equalities. In other words, x belongs to F.

Now, let F be a proper face of P. Pick a relative interior point x̄ of F and consider
the system (2.11), (2.12) with I = I(x̄), the active index set at x̄. Being a proper face of
P, F has no interior point, and so the set I is nonempty. Denote by F′ the solution
set of that system; by the first part, F′ is a face. We wish to show that it coincides with
F. For this, in view of Proposition 2.3.2 it suffices to show that x̄ is also a relative
interior point of F′. Let x be another point in F′. We have to prove that there is a
positive number δ such that the segment [x̄, x̄ + δ(x̄ − x)] lies in F′. Indeed, note
that for indices j outside the set I the inequalities ⟨a^j, x̄⟩ ≤ b_j are strict. Therefore,
there is δ > 0 such that

⟨a^j, x̄⟩ + δ⟨a^j, x̄ − x⟩ ≤ b_j

for all j ∉ I. Moreover, being a linear combination of x̄ and x, the endpoint x̄ +
δ(x̄ − x) satisfies the equalities (2.11) too. Consequently, this point belongs to F′,
and hence so does the whole segment. Since F and F′ are two faces with a relative
interior point in common, they must be the same. □

In general, for a given face F of P, there may exist several index sets I for which
F is the solution set of the system (2.11), (2.12). When saying that the system (2.11),
(2.12) determines the face F we shall, however, understand that no inequality can be
turned into an equality without changing the solution set. So, if two inequalities
combined yield an equality, their indices will be counted in I.
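
Theorem 2.3.3 also suggests a brute-force way to list the vertices of a small polyhedron: a vertex is a feasible point at which some n active constraints have rank n. The Python sketch below (numpy; exponential in the number of constraints, so for illustration only) recovers the two vertices found in Example 2.3.1:

import numpy as np
from itertools import combinations

def vertices(A, b, tol=1e-9):
    # try every choice of n constraints; if they are linearly independent,
    # their unique common solution is a vertex whenever it is feasible
    m, n = A.shape
    out = []
    for I in combinations(range(m), n):
        AI, bI = A[list(I)], b[list(I)]
        if np.linalg.matrix_rank(AI) < n:
            continue
        x = np.linalg.solve(AI, bI)
        if np.all(A @ x <= b + tol) and not any(np.allclose(x, v) for v in out):
            out.append(x)
    return out

# x1 + x2 <= 1, -x1 - x2 <= 0, -x1 <= 0: inequalities (2.7)-(2.9)
A = np.array([[1.0, 1.0], [-1.0, -1.0], [-1.0, 0.0]])
b = np.array([1.0, 0.0, 0.0])
print(vertices(A, b))   # the vertices (0, 0) and (0, 1)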
Corollary 2.3.4 If an m-dimensional convex polyhedron has a vertex, then it has
faces of any dimension less than m.
Proof The corollary is evident for a zero-dimensional polyhedron. Suppose P is a
polyhedron of dimension m > 0. By Theorem 2.3.3, without loss of generality we
may assume that P is given by the system (2.11), (2.12) with |I| = n − m and that
the family {a^i : i ∈ I} is linearly independent. Since P has a vertex, there is some
i_0 ∈ {1, · · · , k}\I such that the vectors a^i, i ∈ I ∪ {i_0}, are linearly independent.
Then the system

⟨a^i, x⟩ = b_i, i ∈ I ∪ {i_0},
⟨a^j, x⟩ ≤ b_j, j ∈ {1, · · · , k}\(I ∪ {i_0})

generates an (m − 1)-dimensional face of P. Notice that this system has a solution
because P is generated by the non-redundant system (2.11), (2.12). Continuing the
above process we are able to construct a face of any dimension less than m. □
Corollary 2.3.5 Let F be a face of the polyhedron P determined by the system (2.11,
2.12). Then for every x ∈ F one has

I (x) ⊇ I.

Equality holds if and only if x is a relative interior point of F.


Proof The inclusion I ⊆ I(x) is evident because x ∈ F. For the second part, we
first assume I(x) = I, that is,

⟨a^i, x⟩ = b_i, i ∈ I,
⟨a^j, x⟩ < b_j, j ∈ {1, · · · , k}\I.

It is clear that if y ∈ aff(F), then ⟨a^i, y⟩ = b_i, i ∈ I, and if y ∈ x + εB_n with
ε > 0 sufficiently small, then ⟨a^j, y⟩ < b_j, j ∈ {1, · · · , k}\I. We deduce that
aff(F) ∩ (x + εB_n) ⊆ F, which shows that x is a relative interior point of F.
Conversely, let x be a relative interior point of F. Using the argument in the proof
of Theorem 2.3.3 we know that F is also the solution set of the system

⟨a^i, y⟩ = b_i, i ∈ I(x),
⟨a^j, y⟩ ≤ b_j, j ∈ {1, · · · , k}\I(x).

Since the system (2.11), (2.12) determines F, we have I(x) ⊆ I, and hence equality
follows. □

Corollary 2.3.6 Let F be a face of the polyhedron P determined by the system (2.11,
2.12). Then a point v ∈ F is a vertex of F if and only if it is a vertex of P.

Proof It is clear that every vertex of P is a vertex of F if it belongs to F. To prove
the converse, let us deduce a system of inequalities from (2.11), (2.12) by expressing
each equality ⟨a^i, x⟩ = b_i as two inequalities ⟨a^i, x⟩ ≤ b_i and ⟨−a^i, x⟩ ≤ −b_i. If v is
a vertex of F, then the active constraints at v consist of the vectors a^i, −a^i, i ∈ I,
and some a^j, j ∈ J ⊆ {1, · · · , k}\I, so that the rank of the family {a^i, −a^i, a^j :
i ∈ I, j ∈ J} is equal to n. It follows that the family {a^i, a^j : i ∈ I, j ∈ J} has rank
equal to n too. In view of Theorem 2.3.3 the point v is a vertex of P. □

Given a face F of a polyhedron, according to the preceding corollary the active
index set I(x) is the same for every relative interior point x of F. Therefore, we call
it the active index set of F and denote it by I_F.
A collection of subsets of a polyhedron is said to be a partition of it if the elements
of the collection are disjoint and their union contains the entire polyhedron.
Corollary 2.3.7 The collection of all relative interiors of faces of a polyhedron forms
a partition of the polyhedron.

Proof It is clear from Proposition 2.3.2(ii) that relative interiors of different faces
are disjoint. Moreover, given a point x in P, consider the active index set I (x). If
it is empty, then the point belongs to the interior of P and we are done. If it is not
empty, by Corollary 2.3.5, the face determined by system (2.11, 2.12) with I = I (x)
contains x in its relative interior. The proof is complete. 

We now deduce a first result on representation of elements of polyhedra by ver-


tices.

Corollary 2.3.8 A convex polytope is the convex hull of its vertices.

Proof The corollary is evident when the dimension of a polytope is less than or equal
to one, in which case it is a point or a segment with two end-points. We make the
induction hypothesis that the corollary is true when a polytope has dimension less
than m, with 1 < m < n, and prove it for the case when P is a polytope determined by
the system (2.6) and has dimension equal to m. Since P is a convex set, the convex
hull of its vertices is included in P itself. Conversely, let x be a point of P. In view of
Corollary 2.3.7 it is a relative interior point of some face F of P. If F is a proper face
of P, its dimension is less than m, and so we are done. It remains to treat the case
where x is a relative interior point of P. Pick any point y ≠ x in P and consider the
line passing through x and y. Since P is bounded, the intersection of this line with P
is a segment, say with end-points c and d. Let F_c and F_d be faces of P that contain
c and d in their relative interiors. As c and d are not relative interior points of P, the
faces F_c and F_d are proper faces of P, and hence they have dimension strictly less
than m. By induction, c and d belong to the convex hulls of the vertices of F_c and F_d
respectively. By Corollary 2.3.6 they belong to the convex hull of the vertices of P,
and hence so does x, because x belongs to the convex hull of c and d. □

A similar result is true for polyhedral cones. It explains why one-dimensional


faces of a polyhedral cone are called extreme rays.

Corollary 2.3.9 A nontrivial polyhedral cone with vertex is the convex hull of its
one-dimensional faces.

Proof By definition a polyhedral cone P is defined by a homogeneous system

⟨a^i, x⟩ ≤ 0, i = 1, · · · , k.   (2.13)

Choose any nonzero point y in P and consider the hyperplane H given by

⟨a^1 + · · · + a^k, x − y⟩ = 0.   (2.14)

We claim that the vector a^1 + · · · + a^k is nonzero. Indeed, if not, the inequalities
(2.13) would become equalities for all x ∈ P, and P would be either a trivial cone or
a cone without vertex. Moreover, P ∩ H is a bounded polyhedron, because otherwise
one could find a nonzero vector u satisfying ⟨a^i, u⟩ = 0, i = 1, · · · , k, and P could
not have vertices. In view of Corollary 2.3.8, P ∩ H is the convex hull of its vertices.
To complete the proof it remains to show that a vertex v of P ∩ H is the intersection of
a one-dimensional face of P with H. Indeed, the polytope P ∩ H being determined
by the system (2.13) and (2.14), there is a set J ⊂ {1, · · · , k} with |J| = n − 1 such
that the vectors a^j, j ∈ J, and a^1 + · · · + a^k are linearly independent and v is given
by the system

⟨a^j, x⟩ = 0, j ∈ J,   (2.15)
⟨a^1 + · · · + a^k, x⟩ = ⟨a^1 + · · · + a^k, y⟩,
⟨a^i, x⟩ ≤ 0, i ∈ {1, · · · , k}\J.   (2.16)

It is clear that (2.15) and (2.16) determine a one-dimensional face of P whose
intersection with H is v. □

Separation of convex polyhedra


Given two convex polyhedra P and Q in Rn, we say that a nonzero vector v separates
them if

⟨v, x⟩ ≥ ⟨v, y⟩ for all vectors x ∈ P, y ∈ Q,

and strict inequality is true for some of them (Fig. 2.9). The following result can be
considered as a version of Farkas’ theorem or Gordan’s theorem.

Fig. 2.9 Separation

Theorem 2.3.10 If P and Q are convex polyhedra without relative interior points
in common, then there is a nonzero vector separating them.

Proof We provide a proof for the case where both P and Q have interior points only.
Without loss of generality we may assume that P is determined by the system (2.6)
and Q is determined by the system

⟨d^j, x⟩ ≤ c_j, j = 1, · · · , m.

Thus, the following system

⟨a^i, x⟩ < b_i, i = 1, · · · , k,
⟨d^j, x⟩ < c_j, j = 1, · · · , m

has no solution, because the first k inequalities determine the interior of P and the last
m inequalities determine the interior of Q. This system is equivalent to the following
one:

\begin{pmatrix} -A & b \\ -D & c \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ t \end{pmatrix} > \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},

where A is the k × n-matrix whose rows are the transposes of a^1, · · · , a^k, D is the
m × n-matrix whose rows are the transposes of d^1, · · · , d^m, b is the k-vector with
the components b_1, · · · , b_k and c is the m-vector with the components c_1, · · · , c_m.
According to Gordan’s theorem, there exist positive vectors λ ∈ Rk and μ ∈ Rm and
a real number s ≥ 0, not all zero, such that

A^T λ + D^T μ = 0,
⟨b, λ⟩ + ⟨c, μ⟩ + s = 0.

It follows from the latter equality that (λ, μ) is nonzero. We may assume without
loss of generality that λ ≠ 0. We claim that A^T λ ≠ 0. Indeed, if not, choose an
interior point x of P and an interior point y of Q. Then D^T μ = 0 and hence

⟨b, λ⟩ > ⟨Ax, λ⟩ = ⟨x, A^T λ⟩ = 0

and

⟨c, μ⟩ ≥ ⟨Dy, μ⟩ = ⟨y, D^T μ⟩ = 0,

which is in contradiction with the aforesaid equality. Defining v to be the nonzero
vector −A^T λ, we deduce for every x ∈ P and y ∈ Q that

⟨v, x⟩ = ⟨−A^T λ, x⟩ = ⟨λ, −Ax⟩ ≥ ⟨λ, −b⟩ ≥ ⟨μ, c⟩ ≥ ⟨μ, Dy⟩ = ⟨v, y⟩.

Of course the inequality is strict when x and y are interior points. By this, v separates P
and Q as requested. □

Asymptotic cones
Given a nonempty convex and closed subset C of Rn, we say that a vector v is an
asymptotic or a recession direction of C if

x + tv ∈ C for all x ∈ C, t ≥ 0.

The set of all asymptotic directions of C is denoted by C∞ (Fig. 2.10). It is a convex
cone. It can be seen that a closed convex set is bounded if and only if its asymptotic
cone is trivial. The set C∞ ∩ (−C∞) is a linear subspace, called the lineality
space of C.

An equivalent definition of asymptotic directions is given next.

Theorem 2.3.11 A vector v is an asymptotic direction of a convex and closed set C
if and only if there exist a sequence of elements x^s ∈ C and a sequence of positive
numbers t_s converging to zero such that v = lim_{s→∞} t_s x^s.

Proof If v ∈ C∞ and x ∈ C, then x^s = x + sv ∈ C for all s ∈ N\{0}. Setting
t_s = 1/s we obtain v = lim_{s→∞} t_s x^s with lim_{s→∞} t_s = 0. Conversely, assume that
Fig. 2.10 Asymptotic cone

v = lim_{s→∞} t_s x^s for x^s ∈ C and t_s > 0 converging to zero as s tends to ∞. Let
x ∈ C and t > 0 be given. Then t t_s converges to zero as s → ∞, and 0 ≤ t t_s ≤ 1
for s sufficiently large. Hence,

x + tv = lim_{s→∞} (x + t t_s x^s)
       = lim_{s→∞} [ (1 − t t_s)x + t t_s x^s + t t_s x ]
       = lim_{s→∞} [ (1 − t t_s)x + t t_s x^s ].

The set C being closed and convex, the points under the latter limit belong to the set
C, and therefore their limit x + tv belongs to C too. Since x and t > 0 were chosen
arbitrarily, we conclude that v ∈ C∞. □

Below is a formula to compute the asymptotic cone of a polyhedron.


Theorem 2.3.12 The asymptotic cone of the polyhedron P determined by the system
(2.6) is the solution set of the system

⟨a^i, v⟩ ≤ 0, i = 1, · · · , k.   (2.17)

Proof Let v be an asymptotic direction of P. Then for every positive number t one
has

⟨a^i, x + tv⟩ ≤ b_i, i = 1, · · · , k,

where x is any point in P. By dividing both sides of the above inequalities by t > 0
and letting this t tend to ∞ we derive (2.17). For the converse, if v is a solution of
(2.17), then for every point x in P one has

⟨a^i, x + tv⟩ = ⟨a^i, x⟩ + t⟨a^i, v⟩ ≤ b_i, i = 1, · · · , k,

for all t ≥ 0. Thus, the points x + tv with t ≥ 0 belong to P and v is an asymptotic
direction. □

Example 2.3.13 Consider a (nonempty) polyhedron in R3 defined by the system:

−x_1 − x_2 − x_3 ≤ −1,
x_3 ≤ 1,
x_1, x_2, x_3 ≥ 0.

The asymptotic cone is given by the system

−x_1 − x_2 − x_3 ≤ 0,
x_3 ≤ 0,
x_1, x_2, x_3 ≥ 0,

in which the first inequality is redundant; hence the cone is simply given by x_1 ≥ 0,
x_2 ≥ 0 and x_3 = 0.
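
Combined with the fact that a closed convex set is bounded exactly when its asymptotic cone is trivial, Theorem 2.3.12 yields a computational boundedness test: search for a nonzero solution of (2.17) by linear programming. A Python sketch with scipy.optimize.linprog (the function name and the box bounds, which only normalize a ray, are our own choices):

import numpy as np
from scipy.optimize import linprog

def has_nonzero_recession_direction(A, tol=1e-9):
    # the cone {v : A v <= 0} is nontrivial iff some coordinate of some
    # member can be made nonzero; box bounds just fix the scale of a ray
    m, n = A.shape
    for i in range(n):
        for sign in (1.0, -1.0):
            c = np.zeros(n)
            c[i] = -sign        # maximize sign * v_i over the cone
            r = linprog(c, A_ub=A, b_ub=np.zeros(m), bounds=[(-1, 1)] * n)
            if r.status == 0 and -r.fun > tol:
                return True
    return False

# the system of Example 2.3.13 written as A x <= b; its asymptotic cone
# contains, e.g., (1, 0, 0), so the polyhedron is unbounded
A = np.array([[-1., -1., -1.], [0., 0., 1.],
              [-1., 0., 0.], [0., -1., 0.], [0., 0., -1.]])
print(has_nonzero_recession_direction(A))   # True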

Using asymptotic directions we are also able to tell whether a convex polyhedron
has a vertex or not. A cone is called pointed if it contains no straight line. When a
cone C is not pointed, it contains a nontrivial linear subspace C ∩ (−C), also called
the lineality space of C.

Corollary 2.3.14 A convex polyhedron has vertices if and only if its asymptotic cone
is pointed. Consequently, if a convex polyhedron has a vertex, then so does any of its
faces.

Proof It is easy to see that when a polyhedron has a vertex, it contains no straight
line, and hence its asymptotic cone is pointed. We prove the converse by induction
on the dimension of the polyhedron. The case where the polyhedron is of dimension
less than or equal to one is evident, because a polyhedron with a pointed asymptotic cone
is then either a point, a segment or a ray, and hence has a vertex. Assume the induction
hypothesis that the conclusion is true for all polyhedra of dimension less than m, with
1 < m < n. Let P be m-dimensional with a pointed asymptotic cone. If P had no
proper face, then the inequalities (2.6) would be strict, which would imply that P is closed
and open at the same time. This is possible only when P coincides with the space Rn,
which contradicts the hypothesis that P∞ is pointed. Now, let F be a proper face of
P. Its asymptotic cone, being a subset of the asymptotic cone of P, is pointed too.
By induction, it has a vertex, which in view of Corollary 2.3.6 is also a vertex of P.
To prove the second part of the corollary it suffices to notice that if a face of P
had no vertex, then by the first part of the corollary its asymptotic cone would contain
a straight line, and hence so would the set P itself. □

A second representation result for elements of a convex polyhedron is now formulated in a more general situation.

Corollary 2.3.15 A convex polyhedron that has a vertex is the convex hull of its vertices and extreme directions.

Proof We conduct the proof by induction on the dimension of the polyhedron. The corollary is evident when a polyhedron is zero or one-dimensional. We assume that it is true for all convex polyhedra of dimension less than m with 1 < m < n and prove it for an m-dimensional polyhedron P determined by the system

⟨a^i, x⟩ = b_i, i ∈ I,
⟨a^j, x⟩ ≤ b_j, j ∈ {1, · · · , k}\I,

in which |I| = n − m and the vectors a^i, i ∈ I, are linearly independent. Let y be an arbitrary element of P. If it belongs to a proper face of P, then by induction we can express it as a convex combination of vertices and extreme directions. If it is a relative interior point of P, then we set a = a^1 + · · · + a^k, which is nonzero because the asymptotic cone of P is pointed, P itself having a vertex. Considering the intersection of P with the hyperplane H determined by the equality ⟨a, x⟩ = ⟨a, y⟩, we obtain a bounded polyhedron P ∩ H. By Corollary 2.3.8, the point y belongs to the convex hull of the vertices of P ∩ H. The vertices of P ∩ H belong to proper faces of P; by induction, they also belong to the convex hull of vertices and extreme directions of P, hence so does y. □

When a polyhedron has a non-pointed asymptotic cone, it has no vertex. It is nevertheless possible to express it as the sum of a bounded polyhedron and its asymptotic cone.

Corollary 2.3.16 Every convex polyhedron is the sum of a bounded polyhedron and
its asymptotic cone.

Proof Denote the lineality space of the asymptotic cone of a polyhedron P by M. If M is trivial, we are done in view of Corollary 2.3.15 (the convex hull of all vertices of P serves as a bounded polyhedron). If M is not trivial, we decompose the space R^n into the direct sum of M and its orthogonal complement M⊥. Denote by P⊥ the projection of P on M⊥. Then P = P⊥ + M. Indeed, let x be an element of P and let x = x^1 + x^2 with x^1 ∈ M and x^2 ∈ M⊥. Then x^2 ∈ P⊥ and one has x ∈ P⊥ + M. Conversely, let x^1 ∈ M and x^2 ∈ P⊥. By definition there is some y ∈ P, say y = y^1 + y^2 with y^1 ∈ M and y^2 ∈ M⊥, such that y^2 = x^2. Since M is a part of the asymptotic cone of P, one deduces that

x = x^1 + x^2 = y^1 + (x^1 − y^1) + y^2 = y + (x^1 − y^1) ∈ y + M ⊆ P,

showing that x belongs to P. Further, we claim that the asymptotic cone of P⊥ is pointed. In fact, if not, it contains a straight line d. Then the convexity of P implies that the space M + d belongs to the asymptotic cone of P. This is a contradiction because d lies in M⊥ and M is already the biggest linear subspace contained in P∞. Let Q denote the convex hull of the set of all vertices of P⊥, which

is nonempty by Corollary 2.3.14. It follows from Corollary 2.3.18 below that the asymptotic cone of P is the sum of the asymptotic cone of P⊥ and M. We deduce P = Q + (P⊥)∞ + M = Q + P∞ as requested. □

The following calculus rule for asymptotic directions under linear transformations
is useful.

Corollary 2.3.17 Let P be the polyhedron determined by the system (2.6) and let L be a linear operator from R^n to R^m. Then

L(P∞) = [L(P)]∞.

Proof The inclusion L(P∞) ⊆ [L(P)]∞ is true for any closed convex set. Indeed, if u is an asymptotic direction of P, then for every x in P and for every positive number t one has x + tu ∈ P. Consequently, L(x) + tL(u) belongs to L(P) for all t ≥ 0. This means that L(u) is an asymptotic direction of L(P). For the converse inclusion, let v be a nonzero asymptotic direction of L(P). By definition, for a fixed x of P, the vectors L(x) + tv belong to L(P) for any t ≥ 0. Thus, there are x^1, x^2, · · · in P such that L(x^ν) = L(x) + νv, or equivalently v = L((x^ν − x)/ν) for all ν = 1, 2, · · · Without loss of generality we may assume that the vectors (x^ν − x)/ν converge to some nonzero vector u as ν tends to ∞. Then

⟨a^i, u⟩ = lim_{ν→∞} ( ⟨a^i, x^ν/ν⟩ − ⟨a^i, x/ν⟩ ) ≤ 0

for all i = 1, · · · , k. In view of Theorem 2.3.12 the vector u is an asymptotic direction of P and v = L(u) ∈ L(P∞) as requested. □

Corollary 2.3.18 Let P, P1 and P2 be polyhedra in R^n with P ⊆ P1. Then

P∞ ⊆ (P1)∞,
(P1 × P2)∞ = (P1)∞ × (P2)∞,
(P1 + P2)∞ = (P1)∞ + (P2)∞.

Proof The first two expressions follow directly from the definition of asymptotic directions. For the third expression, consider the linear transformation L from R^n × R^n to R^n defined by L(x, y) = x + y, and apply Corollary 2.3.17 together with the second expression to conclude. □

Polar cones
Given a cone C in R^n, the (negative) polar cone of C (Fig. 2.11) is the set

C° := {v ∈ R^n : ⟨v, x⟩ ≤ 0 for all x ∈ C}.



The polar cone of C° is called the bipolar cone of C. Here is a formula to compute the polar cone of a polyhedral cone.

Theorem 2.3.19 The polar cone of the polyhedral cone determined by the system

⟨a^i, x⟩ ≤ 0, i = 1, · · · , k,

is the positive hull of the vectors a^1, · · · , a^k.

Proof It is clear that any positive combination of the vectors a^1, · · · , a^k belongs to the polar cone of the polyhedral cone. Let v be a nonzero vector in the polar cone. Then the following system has no solution:

⟨a^i, x⟩ ≤ 0, i = 1, · · · , k,
⟨v, x⟩ > 0.

According to Farkas' theorem the system

y_1 a^1 + · · · + y_k a^k = v,
y_1, · · · , y_k ≥ 0

has a solution, which completes the proof. □


Example 2.3.20 Let C be a polyhedral cone in R^3 defined by the system:

x1 − x2 ≤ 0,
x3 = 0.

By expressing the latter equality as two inequalities x3 ≤ 0 and −x3 ≤ 0, we deduce that the polar cone of C is the positive hull of the three vectors (1, −1, 0)^T, (0, 0, −1)^T and (0, 0, 1)^T. In other words, the polar cone C° consists of the vectors (t, −t, s)^T with t ∈ R₊ and s ∈ R.
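The proof above is constructive enough to test numerically. The following sketch (an added illustration assuming Python with SciPy; the helper name in_polar_cone is our own) decides membership in C° for Example 2.3.20 by attempting the non-negative combination that Farkas' theorem guarantees.

```python
import numpy as np
from scipy.optimize import nnls

# Rows a^i of the system <a^i, x> <= 0 defining C in Example 2.3.20
# (the equality x3 = 0 is split into x3 <= 0 and -x3 <= 0).
A = np.array([[1., -1.,  0.],
              [0.,  0.,  1.],
              [0.,  0., -1.]])

def in_polar_cone(v, tol=1e-10):
    # Theorem 2.3.19: v lies in C° iff v is a positive combination of the a^i,
    # i.e. iff the non-negative least squares residual of A^T y = v is zero.
    y, residual = nnls(A.T, v)
    return residual <= tol

print(in_polar_cone(np.array([2., -2., 5.])))   # True: of the form (t, -t, s)
print(in_polar_cone(np.array([1.,  1., 0.])))   # False
```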
Fig. 2.11 Polar cone C°

Corollary 2.3.21 Let C1 and C2 be polyhedral cones in R^n. Then the following calculus rules hold:

(C1 + C2)° = C1° ∩ C2°,
(C1 ∩ C2)° = C1° + C2°.

Proof Let v ∈ (C1 + C2)°. We have

⟨v, x + y⟩ ≤ 0 for all x ∈ C1, y ∈ C2.

By setting y = 0 in this inequality we deduce v ∈ C1°. Similarly, by setting x = 0 we obtain v ∈ C2°, and hence v ∈ C1° ∩ C2°. Conversely, if v belongs to both C1° and C2°, then ⟨v, ·⟩ is negative on C1 and on C2. Consequently, it is negative on the sum C1 + C2 by linearity, which shows that v ∈ (C1 + C2)°.
For the second equality we observe that the inclusion C1° + C2° ⊆ (C1 ∩ C2)° follows from the definition. To prove the opposite inclusion we assume that C1 is determined by the system described in Theorem 2.3.19 with i = 1, · · · , k1, and C2 is determined by that system with i = k1 + 1, · · · , k1 + k2. Then the polyhedral cone C1 ∩ C2 is determined by that system with i = 1, · · · , k1 + k2. In view of Theorem 2.3.19, the polar cone of C1 ∩ C2 is the positive hull of the vectors a^1, · · · , a^{k1+k2}, which is evidently the sum of the positive hulls pos{a^1, · · · , a^{k1}} and pos{a^{k1+1}, · · · , a^{k1+k2}}, that is, the sum of the polar cones C1° and C2°. □

Corollary 2.3.22 The bipolar cone of a polyhedral cone C coincides with the cone C itself.

Proof According to Theorem 2.3.19 a vector v belongs to the bipolar cone C°° if and only if

⟨v, Σ_{i=1}^k λ_i a^i⟩ ≤ 0 for all λ_i ≥ 0, i = 1, · · · , k.

The latter system is equivalent to

⟨a^i, v⟩ ≤ 0, i = 1, · · · , k,

which is exactly the system determining the cone C. □

Corollary 2.3.23 A vector v belongs to the polar cone of the asymptotic cone of a convex polyhedron if and only if the linear functional ⟨v, ·⟩ attains its maximum on the polyhedron.

Proof It suffices to consider the case where v is nonzero. Assume v belongs to the polar cone of the asymptotic cone P∞. By virtue of Theorems 2.3.12 and 2.3.19, it is a positive combination of the vectors a^1, · · · , a^k. Then the linear functional ⟨v, ·⟩ is majorized on P by the same combination of the real numbers b_1, · · · , b_k. Let α be its supremum on P. Our aim is to show that this value is attained, or equivalently, that the system

⟨a^i, x⟩ ≤ b_i, i = 1, · · · , k,
⟨v, x⟩ ≥ α

is solvable. Suppose to the contrary that the system has no solution. In view of Corollary 2.2.4, there are a positive vector y and a real number t ≥ 0 such that

tv = A^T y,
tα = ⟨b, y⟩ + 1.

We claim that t is strictly positive. Indeed, if t = 0, then A^T y = 0 and ⟨b, y⟩ = −1, and for a vector x in P we would deduce

0 = ⟨A^T y, x⟩ = ⟨y, Ax⟩ ≤ ⟨y, b⟩ = −1,

a contradiction. We obtain the following expressions for v and α:

v = (1/t) A^T y and α = (1/t)(⟨b, y⟩ + 1).

Let {x^r}_{r≥1} be a maximizing sequence of the functional ⟨v, ·⟩ on P, which means that lim_{r→∞} ⟨v, x^r⟩ = α. Then, for every r one has

⟨v, x^r⟩ = (1/t)⟨A^T y, x^r⟩ = (1/t)⟨y, Ax^r⟩ ≤ (1/t)⟨y, b⟩ = α − 1/t,

which is a contradiction when r is sufficiently large.
For the converse part, let x̄ be a point in P at which the functional ⟨v, ·⟩ achieves its maximum. Then

⟨v, x − x̄⟩ ≤ 0 for all x ∈ P.

In particular,

⟨v, u⟩ ≤ 0 for all u ∈ P∞,

and hence v belongs to the polar cone of P∞. □

Normal cones
Given a convex polyhedron P determined by the system (2.6) and a point x in P, we say that a vector v is a normal vector to P at x if

⟨v, y − x⟩ ≤ 0 for all y ∈ P.


The set of all normal vectors to P at x forms a convex cone called the normal cone to P at x and denoted N_P(x) (Fig. 2.12). When x is an interior point of P, the normal cone at that point is zero. When x is a boundary point, the normal cone is computed by the next result.

Theorem 2.3.24 The normal cone to the polyhedron P at a boundary point x̄ of P is the positive hull of the vectors a^i with i being active indices at the point x̄.

Proof Let x̄ be a boundary point of P. Then the active index set I(x̄) is nonempty. Let v be an element of the positive hull of the vectors a^i, i ∈ I(x̄), say

v = Σ_{i∈I(x̄)} λ_i a^i with λ_i ≥ 0, i ∈ I(x̄).

Then for every point x in P and every active index i ∈ I(x̄), one has

⟨a^i, x − x̄⟩ = ⟨a^i, x⟩ − b_i ≤ 0,

which yields

⟨v, x − x̄⟩ = Σ_{i∈I(x̄)} λ_i ⟨a^i, x − x̄⟩ ≤ 0.

Hence v is normal to P at x̄. For the converse, assume that v is a nonzero vector satisfying

⟨v, x − x̄⟩ ≤ 0 for all x ∈ P. (2.18)

We wish to establish that v is a normal vector at 0 to the polyhedron, denoted Q, that is determined by the system

⟨a^i, y⟩ ≤ 0, i ∈ I(x̄).

This will certainly complete the proof because the normal cone to that polyhedron at zero is exactly its polar cone, the formula of which was already given in Theorem 2.3.19. Observe that the normality condition (2.18) can be written as

⟨v, y⟩ ≤ 0 for all y ∈ cone(P − x̄).

Therefore, v will be a normal vector to Q at zero if Q coincides with cone(P − x̄). Indeed, let y be a vector of cone(P − x̄), say y = t(x − x̄) for some x in P and some positive number t. Then

⟨a^i, y⟩ = t⟨a^i, x − x̄⟩ ≤ 0, i ∈ I(x̄),

which yields y ∈ Q. Thus, cone(P − x̄) is a subset of Q. For the reverse inclusion we notice that the inequalities with inactive indices are strict at x̄. Therefore, given a vector y in Q, one can find a small positive number t such that

Fig. 2.12 Normal cone

⟨a^j, x̄⟩ + t⟨a^j, y⟩ ≤ b_j

for all j inactive. Of course, when i is active, it is true that

⟨a^i, x̄ + ty⟩ = ⟨a^i, x̄⟩ + t⟨a^i, y⟩ ≤ b_i.

Hence, x̄ + ty belongs to P, or equivalently y belongs to cone(P − x̄). This achieves the proof. □

Example 2.3.25 Consider the polyhedron in R^3 defined by the system:

x1 + x2 + x3 ≤ 1,
−2x1 − 3x2 ≤ −1,
x1, x2, x3 ≥ 0.

This is a convex polytope with six vertices

v1 = (1, 0, 0)^T, v2 = (0, 1, 0)^T, v3 = (0, 1/3, 0)^T,
v4 = (1/2, 0, 0)^T, v5 = (1/2, 0, 1/2)^T, v6 = (0, 1/3, 2/3)^T,

and five two-dimensional faces

co{v1, v2, v5, v6}, co{v1, v2, v3, v4}, co{v1, v4, v5},
co{v3, v4, v5, v6}, co{v2, v3, v6}.

At the vertex v1 there are three active constraints:

x1 + x2 + x3 = 1,
x2 = 0,
x3 = 0,

and two non-active constraints:

−2x1 − 3x2 < −1,
−x1 < 0.

Hence the normal cone at the vertex v1 is the positive hull of the vectors u1 = (1, 1, 1)^T, u2 = (0, −1, 0)^T and u3 = (0, 0, −1)^T. Notice that u1 generates the normal cone at the point (1/3, 1/3, 1/3)^T on the two-dimensional face F1 = co{v1, v2, v6, v5}, u2 generates the normal cone at the point (2/3, 0, 1/4)^T on the two-dimensional face F2 = co{v1, v4, v5}, and the positive hull of u1 and u2 is the normal cone at the point (3/4, 0, 1/4)^T on the one-dimensional face [v1, v5] that is the intersection of the two-dimensional faces F1 and F2.
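Theorem 2.3.24 reduces the computation of a normal cone to finding the active indices. The sketch below (added here for illustration, assuming Python with NumPy; normal_cone_generators is our own helper) recovers the generators u1, u2, u3 at the vertex v1 of Example 2.3.25.

```python
import numpy as np

# Example 2.3.25 in the form G x <= h; rows 3..5 encode -x_i <= 0.
G = np.array([[ 1.,  1.,  1.],
              [-2., -3.,  0.],
              [-1.,  0.,  0.],
              [ 0., -1.,  0.],
              [ 0.,  0., -1.]])
h = np.array([1., -1., 0., 0., 0.])

def normal_cone_generators(x, tol=1e-9):
    # Theorem 2.3.24: N_P(x) is the positive hull of the rows G[i]
    # whose constraints are active (tight) at x.
    active = np.abs(G @ x - h) <= tol
    return G[active]

v1 = np.array([1., 0., 0.])
print(normal_cone_generators(v1))
# rows (1,1,1), (0,-1,0), (0,0,-1): the generators u1, u2, u3 of the text
```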

As a direct consequence of Theorem 2.3.24, we observe that the normal cone is the same at every relative interior point of a face. We refer to this cone as the normal cone to a face. In view of Corollary 2.3.7 we obtain a collection of normal cones of faces, whose union is called the normal cone of P and denoted by N_P. Thus, if F := {F1, · · · , Fq} is the collection of all faces of P, then

N_P = ∪_{i=1}^q N(F_i).

It is worth pointing out the distinction between this cone and the cone N(P), the normal cone to P when P is considered as a face of itself. We shall now see that the collection N of all normal cones N(F_i), i = 1, · · · , q, is a nice dual object of the collection F.

Theorem 2.3.26 Assume that P is a convex polyhedron given by the system

⟨a^i, x⟩ ≤ b_i, i = 1, · · · , k.

Then the following assertions hold.

(i) The normal cone of P is composed of all normal cones to P at its points, that is,

N_P = ∪_{x∈P} N_P(x),

and coincides with the polar cone of the asymptotic cone of P. In particular, it is a polyhedral cone, and it is the whole space if and only if P is a polytope (bounded polyhedron).

(ii) In the collection F, if F_i is a face of F_j, then N(F_j) is a face of N(F_i). Moreover, if i ≠ j, then the normal cones N(F_i) and N(F_j) have no relative interior point in common.
(iii) In the collection N, if N is a face of N(F_i), then there is a face F containing the face F_i such that N = N(F).

Proof For the first property it is evident that N_P is contained in the union on the right hand side. Let x ∈ P. There exists an index i ∈ {1, · · · , q} such that x ∈ ri(F_i). Then N_P(x) = N(F_i) and the equality of (i) is satisfied. To prove that N_P coincides with (P∞)°, let v be a vector of the normal cone N(F_i) for some i. Choose a relative interior point x_0 of the face F_i. Then, by definition,

⟨v, x − x_0⟩ ≤ 0 for all x ∈ P.

By Corollary 2.3.23 the vector v belongs to the polar cone of the cone P∞. Conversely, let v be in (P∞)°. In view of the same corollary, the linear functional ⟨v, ·⟩ attains its maximum on P at some point x, which means that

⟨v, x′ − x⟩ ≤ 0 for all x′ ∈ P.

By definition, v is a normal vector to P at x.
For (ii), assume that F_i is a face of F_j with i ≠ j, which implies that the active index set I_{F_i} of F_i contains the active index set I_{F_j} of F_j. Let x_j be a relative interior point of F_j. Then one has

N(F_j) = N_P(x_j) ⊂ N(F_i).

Suppose that N(F_j) is not a face of N(F_i). Then there exists a face

N_0 = pos{a^ℓ : ℓ ∈ I_0} ⊆ N(F_i)

for some I_0 ⊆ I_{F_i}, which contains N(F_j) as a proper subset and whose relative interior meets N(F_j) at some point, say v_0. Let F_0 be the solution set to the system

⟨a^ℓ, x⟩ = b_ℓ, ℓ ∈ I_0,
⟨a^ℓ, x⟩ ≤ b_ℓ, ℓ ∈ {1, · · · , k}\I_0.

We see that I_{F_j} ⊆ I_0 ⊆ I_{F_i}, hence F_i ⊆ F_0 ⊆ F_j. In particular F_0 ≠ ∅, and hence it is a face of P. Let x_0 be a relative interior point of F_0. We claim that

⟨v, x_j − x_0⟩ = 0 for all v ∈ N_0.

Indeed, consider the linear functional v ↦ ⟨v, x_j − x_0⟩ on N_0. On the one hand, ⟨v, x_j − x_0⟩ ≤ 0 for all v ∈ N_0, because x_0 ∈ ri(F_0) implies N_0 = N_P(x_0) while x_j ∈ P. On the other hand, for the point v_0 ∈ ri(N_0) ∩ N(F_j) above, one has ⟨v_0, x_0 − x_j⟩ ≤ 0, hence ⟨v_0, x_j − x_0⟩ = 0.

Consequently, since this functional is nonpositive on N_0 and vanishes at a relative interior point of N_0, we have ⟨v, x_j − x_0⟩ = 0 on N_0. Using this fact we derive, for every v ∈ N_0 and every x ∈ P,

⟨v, x − x_j⟩ = ⟨v, x − x_0⟩ + ⟨v, x_0 − x_j⟩ ≤ 0,

which implies v ∈ N(F_j), and we arrive at the contradiction N(F_j) = N_0.
To prove the second part of assertion (ii), suppose to the contrary that the normal cones N(F_i) and N(F_j) have a relative interior point v in common. Then for each x ∈ F_i and y ∈ F_j one has

⟨v, x − y⟩ = 0.

Since ⟨u, y − x⟩ ≤ 0 for all u ∈ N(F_i) and v is a relative interior point of N(F_i), one deduces

⟨u, x − y⟩ = 0 for all u ∈ N(F_i).

Consequently, for u ∈ N(F_i) it is true that

⟨u, z − y⟩ = ⟨u, z − x⟩ + ⟨u, x − y⟩ ≤ 0 for all z ∈ P,

which shows u ∈ N(F_j). In other words, N(F_i) ⊆ N(F_j). The same argument with the roles of i and j interchanged leads to the equality N(F_i) = N(F_j). In view of the first part we arrive at the contradiction F_i = F_j.
We proceed to (iii). Let N be a face of N(F_i) for some i, 1 ≤ i ≤ q. The case N = N(F_i) being trivial, we may assume N ≠ N(F_i). Let I ⊆ I_{F_i} be a subset of indices such that

N = pos{a^ℓ : ℓ ∈ I} ⊆ N(F_i) = pos{a^ℓ : ℓ ∈ I_{F_i}}.

Let F be the solution set to the system

⟨a^ℓ, x⟩ = b_ℓ, ℓ ∈ I,
⟨a^ℓ, x⟩ ≤ b_ℓ, ℓ ∈ {1, · · · , k}\I.

Since I ⊆ I_{F_i}, we have F_i ⊆ F. In particular F ≠ ∅ and F is a face of P. Now we show that N(F) = N and F_i is a proper face of F. Indeed, as N is a proper face of N(F_i), I is a proper subset of I_{F_i} and there is a nonzero vector u ∈ R^n such that

⟨a^ℓ, u⟩ = 0 for ℓ ∈ I,
⟨a^ℓ, u⟩ < 0 for ℓ ∈ I_{F_i}\I.

Take x ∈ ri(F_i) and consider the point x + tu with t > 0. One obtains

⟨a^ℓ, x + tu⟩ = b_ℓ, ℓ ∈ I,
⟨a^ℓ, x + tu⟩ = ⟨a^ℓ, x⟩ + t⟨a^ℓ, u⟩ < b_ℓ, ℓ ∈ I_{F_i}\I.



Moreover, since ⟨a^ℓ, x⟩ < b_ℓ for ℓ ∈ {1, · · · , k}\I_{F_i}, when t is sufficiently small one also has

⟨a^ℓ, x + tu⟩ < b_ℓ, ℓ ∈ {1, · · · , k}\I_{F_i}.

Consequently,

N(F) = N_P(x + tu) = pos{a^ℓ : ℓ ∈ I} = N

when t is sufficiently small. It is evident that F ≠ F_i. The proof is complete. □

Example 2.3.27 Consider the polyhedron P in R^2 defined by the system:

−x1 − x2 ≤ −1,
−x1 + x2 ≤ 1,
−x2 ≤ 0.

It has two vertices F1 and F2 determined respectively by

{−x1 − x2 = −1, −x1 + x2 ≤ 1, −x2 = 0} and {−x1 − x2 = −1, −x1 + x2 = 1, −x2 ≤ 0},

three one-dimensional faces F3, F4 and F5 determined respectively by

{−x1 − x2 ≤ −1, −x1 + x2 ≤ 1, −x2 = 0},
{−x1 − x2 = −1, −x1 + x2 ≤ 1, −x2 ≤ 0},
{−x1 − x2 ≤ −1, −x1 + x2 = 1, −x2 ≤ 0},

and P itself is the unique two-dimensional face. Denote v1 = (−1, −1)^T, v2 = (−1, 1)^T and v3 = (0, −1)^T. Then the normal cones of the faces F1, · · · , F5 are respectively the positive hulls of the families {v1, v3}, {v1, v2}, {v3}, {v1} and {v2}. The normal cone of P is zero. Moreover, the union N_P of these normal cones is the positive hull of the vectors v2 and v3. It is the polar cone of the asymptotic cone of P, which is defined by the system

−x1 − x2 ≤ 0,
−x1 + x2 ≤ 0,
−x2 ≤ 0,

in which the first inequality is redundant, so that the cone reduces to x1 ≥ x2 ≥ 0.

Next we prove that the normal cone of a face is obtained from the normal cones
of its vertices.

Corollary 2.3.28 Assume that a face F of P is the convex hull of its vertices v^1, · · · , v^q. Then

N(F) = ∩_{i=1}^q N_P(v^i).

Proof The inclusion N(F) ⊆ ∩_{i=1}^q N_P(v^i) is clear from (ii) of Theorem 2.3.26. We prove the converse inclusion. Let u be a nonzero vector of the intersection ∩_{i=1}^q N_P(v^i). Let x̄ be a relative interior point of F. Then x̄ is a convex combination of the vertices v^1, · · · , v^q:

x̄ = Σ_{i=1}^q λ_i v^i

with λ_i ≥ 0, i = 1, · · · , q, and λ_1 + · · · + λ_q = 1. We have then

⟨u, x − v^i⟩ ≤ 0 for all x ∈ P, i = 1, · · · , q.

This implies

⟨u, x − x̄⟩ = ⟨u, Σ_{i=1}^q λ_i x − Σ_{i=1}^q λ_i v^i⟩ = Σ_{i=1}^q λ_i ⟨u, x − v^i⟩ ≤ 0.

By this, u is a normal vector to P at x̄, and u ∈ N(F). □

Combining this corollary with Corollary 2.3.8 we conclude that the normal cone
of a bounded face is the intersection of the normal cones of all proper faces of that
bounded face. This is not true for unbounded faces, for instance when a face has no
proper face.

2.4 Basis and Vertices

In this section we consider a polyhedron P given by the system

Ax = b (2.19)
x ≥ 0.

We assume throughout that the matrix A has n columns, denoted a_1, · · · , a_n, and k linearly independent rows, which are the transposes of a^1, · · · , a^k, and that the components b_1, · · · , b_k of the vector b are non-negative numbers. A point x in P is said to be an extreme point of P if it cannot be expressed as a convex combination x = ta + (1 − t)a′ for some 0 < t < 1 and a, a′ ∈ P with a ≠ a′. It can be seen that extreme points correspond to the vertices defined in the previous section. Certain results obtained for polyhedra given in the general form (by inequalities) will be recaptured here, but our emphasis will be on computational issues, which are much simplified under the equality form (2.19).
A k × k-submatrix B composed of columns of A is said to be a basis if it is invertible.
Let B be a basis. By using a permutation one may assume that B is composed of the first k columns of A, and the remaining columns form a k × (n − k)-submatrix N, called a non-basic part of A. Let x be a vector with components x_B and x_N, where x_B is a k-dimensional vector and x_N is an (n − k)-dimensional vector satisfying

Bx_B = b,
x_N = 0.

If x_B is a positive vector, then x is a solution to (2.19), called a feasible basic solution (associated with the basis B). If in addition x_B has no zero component, it is called non-degenerate; otherwise it is degenerate.

Example 2.4.1 Consider the polyhedron in R^3 defined by the system:

x1 + x2 + x3 = 1,
3x1 + 2x2 = 1,
x1, x2, x3 ≥ 0.

The vectors a^1 = (1, 1, 1)^T and a^2 = (3, 2, 0)^T are linearly independent. There are three bases (rows separated by semicolons)

B1 = [1 1; 3 2], B2 = [1 1; 3 0] and B3 = [1 1; 2 0].

The basic solutions corresponding to B1, B2 and B3 are respectively (−1, 2, 0)^T, (1/3, 0, 2/3)^T and (0, 1/2, 1/2)^T. The first solution is infeasible, while the last two are feasible and non-degenerate.
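The computations of Example 2.4.1 can be automated. The following sketch (an illustration added here, assuming Python with NumPy) enumerates all k × k column submatrices, keeps the invertible ones, and classifies the associated basic solutions.

```python
import itertools
import numpy as np

# Example 2.4.1: A x = b, x >= 0 with
A = np.array([[1., 1., 1.],
              [3., 2., 0.]])
b = np.array([1., 1.])
k, n = A.shape

for cols in itertools.combinations(range(n), k):
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-12:
        continue                                # not a basis
    x = np.zeros(n)
    x[list(cols)] = np.linalg.solve(B, b)       # basic part x_B = B^{-1} b
    feasible = np.all(x >= -1e-9)
    degenerate = feasible and np.any(np.abs(x[list(cols)]) <= 1e-9)
    print(cols, x, "feasible" if feasible else "infeasible",
          "(degenerate)" if degenerate else "")
```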

Given a vector x ∈ R^n, its support, denoted supp(x), consists of the indices i for which the component x_i is nonzero. The support of a nonzero vector is always nonempty.

Theorem 2.4.2 A vector x is a vertex of the polyhedron P if and only if it is a feasible


basic solution of the system (2.19).

Proof Let x be a feasible basic solution. Assume that it is a convex combination of two solutions y and z of the system (2.19), say x = ty + (1 − t)z with t ∈ (0, 1). Then for any nonbasic index j, the component x_j is zero, so that ty_j + (1 − t)z_j = 0. Remembering that y and z are positive vectors, we derive y_j = z_j = 0. Moreover, the basic components of solutions to (2.19) satisfy the equation

Bx_B = b

with B nonsingular. Therefore, they are unique, that is, x_B = y_B = z_B. Consequently, the three solutions x, y and z are the same.
Conversely, let x be an extreme point of the polyhedron. Our aim is to show that the columns a_i, i ∈ supp(x), are linearly independent. It is then easy to find a basis B such that x is the basic solution associated with that basis. To this end, we prove first that supp(x) is minimal by inclusion among solutions of the system (2.19). In fact, if not, one can find another solution, say y, with minimal support such that supp(y) is a proper subset of supp(x). Choose an index j from the support of y such that

x_j / y_j = min{ x_i / y_i : i ∈ supp(y) }.

Let t > 0 be that quotient. Then

A(x − ty) = (1 − t)b and x − ty ≥ 0.

If t ≥ 1, then by setting z = x − y we can express

x = (1/2)(y + (2/3)z) + (1/2)(y + (4/3)z),

a convex combination of two distinct solutions of (2.19), which is a contradiction. If t < 1, then take

z = (1/(1 − t))(x − ty).

We see that z is a solution to (2.19) and different from x because its support is strictly contained in the support of x. It is also different from y because the component y_j is not zero while the component z_j is zero. We derive from the definition of z that x is a strict convex combination of y and z, which is again a contradiction.
Now we prove that the columns a_i, i ∈ supp(x), are linearly independent. Suppose the contrary: there is a vector y different from x (if not, take 2y instead) with

Ay = 0 and supp(y) ⊆ supp(x).

By setting

t = −min{ x_i / y_i : i ∈ supp(y) }  if y ≥ 0,
t = min{ −x_i / y_i : i ∈ supp(y), y_i < 0 }  otherwise,

we obtain that z = x + ty is a solution to (2.19) whose support is strictly contained in the support of x, and we arrive at a contradiction with the minimality of the support of x. It remains to complete the vectors a_i, i ∈ supp(x), to a basis to see that x is indeed a basic solution. □
Corollary 2.4.3 The number of vertices of the polyhedron P does not exceed the binomial coefficient \binom{n}{k}.

Proof This follows from Theorem 2.4.2 and the fact that the number of bases of the matrix A is at most \binom{n}{k}. Notice that not every basic solution has positive components. □
We deduce again Corollary 2.3.8 about the description of polytopes in terms of extreme points (vertices), but this time for a polytope determined by the system (2.19).

Corollary 2.4.4 If P is a polytope, then any point in it can be expressed as a convex combination of vertices.

Proof Let x be any solution of (2.19). If the support of x is minimal, then in view of Theorem 2.4.2 that point is a vertex. If not, then there is a solution y^1 different from x, with minimal support and supp(y^1) ⊂ supp(x). Set

t_1 = min{ x_j / y_j^1 : j ∈ supp(y^1) }.

This number is positive and strictly smaller than one, because otherwise the nonzero vector x − y^1 would be an asymptotic direction of the polyhedron and P would be unbounded. Consider the vector

z^1 = (1/(1 − t_1))(x − t_1 y^1).

It is clear that this vector is a solution to (2.19) and its support is strictly smaller than the support of x. If the support of z^1 is minimal, then z^1 is a vertex and we obtain a convex combination

x = t_1 y^1 + (1 − t_1)z^1,

in which y^1 and z^1 are vertices. If not, we continue the process to find a vertex y^2 whose support is strictly contained in the support of z^1, and so on. In view of Corollary 2.4.3, after a finite number of steps one finds vertices y^1, · · · , y^p such that x is a convex combination of them. □

Extreme rays
An extreme direction of a convex polyhedron P in R^n is defined to be a direction that cannot be expressed as a strictly positive combination of two linearly independent asymptotic vectors of P. As in the case when a polyhedron is given by a system of linear inequalities (Corollary 2.3.15), we shall see that a polyhedron determined by (2.19) is completely determined by its vertices and extreme directions.

Theorem 2.4.5 Assume that the convex polyhedron P is given by the system (2.19). Then
(i) A nonzero vector v is an asymptotic direction of P if and only if it is a solution to the associated homogeneous system

Ax = 0,
x ≥ 0.

(ii) A nonzero vector v is an extreme asymptotic direction of P if and only if it is a positive multiple of a vertex of the polyhedron determined by the system

Ay = 0, (2.20)
y_1 + · · · + y_n = 1,
y ≥ 0.

Consequently P∞ consists of all positive combinations of the vertices of this latter polyhedron.

Proof The first assertion is proven as in Theorem 2.3.12. For the second assertion, let v be a nonzero extreme direction. Then Av = 0 by (i), and t := v_1 + · · · + v_n > 0. The vector v/t lies in the polyhedron of (ii), denoted Q. Since each point of that polyhedron is an asymptotic direction of P, if v/t were a convex combination of two distinct points y^1 and y^2 in Q, then v would be a convex combination of two linearly independent asymptotic directions ty^1 and ty^2 of P, which is a contradiction. Conversely, let v be a vertex of Q. It is clear that v is nonzero. If v = tx + (1 − t)y for some nonzero asymptotic directions x and y of P and some t ∈ (0, 1), then with

t′ = t Σ_{i=1}^n x_i / ( t Σ_{i=1}^n x_i + (1 − t) Σ_{i=1}^n y_i ) = t Σ_{i=1}^n x_i,
x′ = (1/Σ_{i=1}^n x_i) x,
y′ = (1/Σ_{i=1}^n y_i) y,

we express v as a convex combination t′x′ + (1 − t′)y′ of two points of Q. Note that t′ > 0. Because v is a vertex, x′ = y′, which means that x and y are linearly dependent. The proof is complete. □

Corollary 2.4.6 A nonzero vector is an extreme asymptotic direction of P if and only if it is a positive multiple of a basic feasible solution of the system (2.20). Consequently, the number of extreme asymptotic directions of P does not exceed the binomial coefficient \binom{n}{k+1}.

Proof This is obtained from Theorems 2.4.2 and 2.4.5. 

Example 2.4.7 Consider the polyhedron in R^3 defined by the system:

x1 − x2 = 1,
x1, x2, x3 ≥ 0.

The asymptotic cone of this polyhedron is the solution set to the system

x1 − x2 = 0,
x1, x2, x3 ≥ 0.

Any vector (t, t, s)^T with t ≥ 0 and s ≥ 0 is an asymptotic direction. To obtain the extreme asymptotic directions we solve the system

y1 − y2 = 0,
y1 + y2 + y3 = 1,
y1, y2, y3 ≥ 0.

There are three bases corresponding to the basic variables {y1, y2}, {y1, y3} and {y2, y3}:

B1 = [1 −1; 1 1], B2 = [1 0; 1 1] and B3 = [−1 0; 1 1].

The basic solution y = (1/2, 1/2, 0)^T is associated with B1, and the basic solution y = (0, 0, 1)^T is associated with B2 and with B3. Both of them are feasible, and hence they are extreme asymptotic directions.

In the following we describe a practical way to compute extreme rays of the polyhedron P.

Corollary 2.4.8 Assume that B is a basis of the matrix A and a_s is a non-basic column of A such that the system

By = −a_s

has a positive solution y ≥ 0. Then the vector x whose basic components are equal to y, whose sth component is equal to 1, and whose other non-basic components are all zero, is an extreme ray of the polyhedron P.

Proof It is easy to check that the submatrix corresponding to the variables of y and the variable y_s is a feasible basis of the system (2.20). It remains to apply Corollary 2.4.6 to conclude. □

In Example 2.4.7 we have A = (1, −1, 0). For the basis B = (1) corresponding to the basic variable x1 and the second (non-basic) column, the system By = −a_s takes the form y = 1 and has the positive solution y = 1. In view of Corollary 2.4.8 the vector (1, 1, 0)^T is an extreme asymptotic direction. Note that using the same basis B and the non-basic column a_3 = (0) we obtain the system y = 0, which has a positive (null) solution. Hence the vector (0, 0, 1)^T is also an extreme asymptotic direction.
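The recipe of Corollary 2.4.8 is easy to carry out numerically. Below is a sketch (assuming Python with NumPy; added here for illustration) that reproduces the two extreme rays found above for Example 2.4.7.

```python
import numpy as np

# Example 2.4.7: A = (1, -1, 0), one constraint x1 - x2 = 1, x >= 0.
A = np.array([[1., -1., 0.]])
k, n = A.shape

# Corollary 2.4.8 with basis B = (a_1) = (1): for each non-basic column a_s,
# solve B y = -a_s; a solution y >= 0 yields an extreme ray.
B = A[:, [0]]
for s in [1, 2]:
    y = np.linalg.solve(B, -A[:, s])
    if np.all(y >= 0):
        ray = np.zeros(n)
        ray[0] = y[0]        # basic component
        ray[s] = 1.0         # the s-th component is set to 1
        print(f"extreme ray from column {s}:", ray)
# prints (1, 1, 0) for s = 1 and (0, 0, 1) for s = 2, as in the text
```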

Representation of Elements of a Polyhedron

A finitely generated convex set is defined to be a set which is the convex hull of a finite set of points and directions; that is, each of its elements is the sum of a convex combination of a finite set of points and a positive combination of a finite set of directions. The next theorem states that convex polyhedra are finitely generated, which is Corollary 2.3.15 for a polyhedron determined by the system (2.19).

Theorem 2.4.9 Every point of a convex polyhedron given by the system (2.19) can be expressed as a convex combination of its vertices, possibly added to a positive combination of the extreme asymptotic directions.

Proof Let x be any point in P. If its support is minimal, then, according to the proof of Theorem 2.4.2, that point is a vertex. If not, there is a vertex v^1 whose support is minimal and strictly contained in the support of x. Set

t = min{ x_j / v_j^1 : j ∈ supp(v^1) }

and consider the vector x − tv^1. If t ≥ 1, then the vector z = x − v^1 is an asymptotic direction of the polyhedron, and then x is the sum of the vertex v^1 and an asymptotic direction. The direction z, in its turn, can be expressed as a positive combination of extreme asymptotic directions, so the conclusion follows. If t < 1, the technique of the proof of Theorem 2.4.2 can be applied. Explicitly, setting z = (x − tv^1)/(1 − t) we deduce that z ≥ 0 and

Az = (1/(1 − t))b − (t/(1 − t))b = b.

Moreover, the support of z is a proper subset of the support of x because the components z_j with j realizing the value of t = x_j/v_j^1 are zero. Then x = tv^1 + (1 − t)z with the strict inclusion supp(z) ⊂ supp(x). Continuing this process we arrive at finding

a finite number of vertices v^1, · · · , v^p and an asymptotic direction z such that x is the sum of a convex combination of v^1, · · · , v^p and z. Then expressing z as a positive combination of extreme asymptotic directions we obtain the conclusion. □

In view of Corollaries 2.4.3 and 2.4.6 the numbers of vertices and of extreme asymptotic directions of a polyhedron P are finite. Denote them respectively by v^1, · · · , v^p and z^1, · · · , z^q. Then each element x of P can be expressed as

x = Σ_{i=1}^p λ_i v^i + Σ_{j=1}^q μ_j z^j

with

Σ_{i=1}^p λ_i = 1, λ_i ≥ 0, i = 1, · · · , p, and μ_j ≥ 0, j = 1, · · · , q.

Notice that the above representation is not unique; that is, an element x of P can be written as several combinations of v^i, i = 1, · · · , p, and z^j, j = 1, · · · , q, with different coefficients λ_i and μ_j. An easy example is the center x of the square with vertices

v^1 = (0, 0)^T, v^2 = (1, 0)^T, v^3 = (0, 1)^T and v^4 = (1, 1)^T.

It is clear that x can be seen as the middle point of v^1 and v^4, and as the middle point of v^2 and v^3 too.
Another point that should be made clear is that the results of this section concern polyhedra given by the system (2.19) and may fail under systems of a different type. For instance, in view of Theorem 2.4.9, a polyhedron determined by (2.19) has at least one vertex. This is no longer true if a polyhedron is given by another system. Take the hyperplane determined by the equation ⟨d, x⟩ = 0 for some nonzero vector d ∈ R^2. It is a polyhedron without vertices. An equivalent system is given in the form of (2.19) as follows:

⟨d, x⁺⟩ − ⟨d, x⁻⟩ = 0,
x⁺, x⁻ ≥ 0.

The latter system generates a polyhedron in R^4 that does have vertices. However, a vertex (x⁺, x⁻)^T of this polyhedron gives an element x = x⁺ − x⁻ of the former polyhedron, but not a vertex of it.
Chapter 3
Linear Programming

A linear mathematical programming problem is a problem of finding a maximum or


minimum of a linear functional over a convex polyhedron. The functional to optimize
is called an objective or cost function, and the linear equalities and linear inequalities
that define the polyhedron are called constraints.

3.1 Optimal Solutions

We consider the following linear programming problem, denoted (LP):

maximize ⟨c, x⟩
subject to Ax = b (3.1)
x ≥ 0, (3.2)

where c is an n-vector, A is an m × n-matrix and b is an m-vector. Under these constraints we say that (LP) is given in standard form. It is given in canonical form when the constraints (3.1) and (3.2) are replaced by the inequalities Ax ≤ b. As we have already discussed in Sect. 2.2, linear equalities can be converted to linear inequalities and vice versa, so any linear programming problem may be put in the form of (LP) above. We denote the feasible set of the problem (LP) by X; that is, X is the solution set to the system (3.1)–(3.2). A feasible solution x̄ ∈ X is optimal if ⟨c, x̄⟩ ≥ ⟨c, x⟩ for all x ∈ X. The linear function x ↦ ⟨c, x⟩ is called the cost function of the problem. A fundamental theorem of linear programming is given next.

Theorem 3.1.1 Assume that X is nonempty. Then the four conditions below are
equivalent.
(i) (LP) admits an optimal solution.
(ii) (LP) admits an optimal vertex solution.


(iii) The cost function is non-positive on every asymptotic direction of X.
(iv) The cost function is bounded above on X.

Proof The scheme of our proof is as follows: (i) ⇒ (iv) ⇒ (iii) ⇒ (ii) ⇒ (i). The first and the last implications are immediate. We proceed to the second implication. Let α be an upper bound of the cost function on X and let u be a nonzero asymptotic direction of X, if one exists. Pick any point x in X, which is nonempty by hypothesis. Then for every positive number t, the point x + tu belongs to X. Hence

⟨c, x + tu⟩ = ⟨c, x⟩ + t⟨c, u⟩ ≤ α.

This inequality being true for all positive t, we must have ⟨c, u⟩ ≤ 0.
To establish the third implication let {v^1, · · · , v^p} be the collection of all vertices and let {u^1, · · · , u^q} be the collection of all extreme rays of the polyhedron X. The collection of extreme rays may be empty. Choose a vertex v^{i_0} such that

⟨c, v^{i_0}⟩ = max{⟨c, v^1⟩, · · · , ⟨c, v^p⟩}.

Let x be any point in X. In view of Theorem 2.4.9, there are non-negative numbers t_i and s_j with Σ_{i=1}^p t_i = 1 such that

x = Σ_{i=1}^p t_i v^i + Σ_{j=1}^q s_j u^j.

We deduce

⟨c, x⟩ = Σ_{i=1}^p t_i ⟨c, v^i⟩ + Σ_{j=1}^q s_j ⟨c, u^j⟩ ≤ ⟨c, v^{i_0}⟩,

because by (iii) each term ⟨c, u^j⟩ is non-positive. This shows that the vertex v^{i_0} is an optimal solution. The proof is complete. □

Existence of optimal solutions is always guaranteed when the feasible set is bounded, as the next corollary shows.

Corollary 3.1.2 If the problem (LP) has a bounded feasible set, then it has optimal solutions.

Proof When the set X is bounded, it has no nonzero asymptotic direction. Hence condition (iii) of the previous theorem is fulfilled and the problem (LP) has optimal solutions. □

The result below expresses a necessary and sufficient condition for optimal solu-
tions in terms of normal directions.

Theorem 3.1.3 Assume that X is nonempty. Then the following statements are equivalent.
(i) x̄ is an optimal solution of (LP).
(ii) The vector c belongs to the normal cone to the set X at x̄.
(iii) The whole face of X which contains x̄ as a relative interior point is an optimal solution face.
Consequently, if (LP) has an optimal solution, then the optimal solution set is a face of the feasible polyhedron.

Proof The implication (iii) ⇒ (i) is evident, so we have to show the implications (i) ⇒ (ii) and (ii) ⇒ (iii). For the first implication we observe that if x̄ is an optimal solution, then

⟨c, x − x̄⟩ ≤ 0 for all x ∈ X.

By definition, c is a normal vector to X at x̄, which yields (ii). Now, assume (ii) and let x be any point in the face that has x̄ as a relative interior point. There is a positive number δ such that the points x̄ + δ(x − x̄) and x̄ − δ(x − x̄) belong to X. We have then

⟨c, x̄ + δ(x − x̄) − x̄⟩ ≤ 0,
⟨c, x̄ − δ(x − x̄) − x̄⟩ ≤ 0.

This produces

⟨c, x − x̄⟩ ≤ 0,
⟨c, −(x − x̄)⟩ ≤ 0.

Consequently, ⟨c, x⟩ = ⟨c, x̄⟩, which together with the normality of c at x̄ shows that x is an optimal solution too.
For the second part of the theorem, set α = ⟨c, x̄⟩, where x̄ is an optimal solution of (LP). Then the intersection of the hyperplane

H = {x ∈ R^n : ⟨c, x⟩ = α}

with the feasible set X is a face of X and contains all optimal solutions of the problem. □
Given a feasible basis B, we call it an optimal basis if the associated basic solution is an optimal solution of (LP). We decompose the cost vector c into the basic component vector c_B and the non-basic component vector c_N. The vector

c̄_N = c_N − (B^{−1}N)^T c_B

is called the reduced cost vector.



Theorem 3.1.4 Let B be a feasible basis and x̄ the feasible basic solution associated with B. The following statements hold.
(i) If the reduced cost vector c̄_N is negative, then B is optimal.
(ii) When B is non-degenerate, it is optimal if and only if the reduced cost vector c̄_N is negative.

Proof Up to a suitable permutation we may assume that the matrix A is decomposed as (B N), the basic index set is {1, · · · , m} and the non-basic index set is {m + 1, · · · , n}. To prove (i), let x be any feasible solution of the problem. Since x is a solution to the system Ax = b, the basic part x_B of x corresponding to the basic columns of B is expressed via its non-basic components by

x_B = B^{−1}b − B^{−1}N x_N = x̄_B − B^{−1}N x_N. (3.3)

The cost function at x is then given by

⟨c, x⟩ = ⟨c_B, x_B⟩ + ⟨c_N, x_N⟩
       = ⟨c_B, B^{−1}b − B^{−1}N x_N⟩ + ⟨c_N, x_N⟩
       = ⟨c_B, x̄_B⟩ + ⟨c̄_N, x_N⟩
       = ⟨c, x̄⟩ + ⟨c̄_N, x_N⟩.

Since x_N is positive and by hypothesis c̄_N is negative, we deduce

⟨c, x⟩ ≤ ⟨c, x̄⟩.

As x was an arbitrary feasible solution of the problem, we deduce that x̄ is an optimal solution and B is an optimal basis.
For (ii) we need to prove the "only if" part. Suppose to the contrary that the reduced cost vector is not negative, that is, c̄_j > 0 for some non-basic index j. Our aim is to find a new feasible solution x̂ with

⟨c, x̄⟩ < ⟨c, x̂⟩, (3.4)

which yields a contradiction. We look for a solution x̂ of the special form

x̂ = (x̂_B, x̂_N) with x̂_N = x̄_N + te_j = te_j,

where e_j is the non-basic part of the jth coordinate unit vector in R^n and t is a positive number to be chosen such that x̂ is feasible. Since x̂_N is positive, in view of (3.3) the feasibility of x̂ means that

x̂_B = x̄_B − tB^{−1}Ne_j ≥ 0.

The basis B being non-degenerate, the vector x̄_B is strictly positive, hence x̂_B is positive whenever t > 0 is sufficiently small. We fix such a value of t and calculate the cost function at that point by using (3.3):

⟨c, x̂⟩ = ⟨c_B, x̂_B⟩ + ⟨c_N, x̂_N⟩
       = ⟨c_B, x̄_B⟩ − t⟨c_B, B^{−1}Ne_j⟩ + t⟨c_N, e_j⟩
       = ⟨c_B, x̄_B⟩ + tc̄_j
       > ⟨c_B, x̄_B⟩,

which contradicts the optimality of x̄. □

We note that degeneracy is caused by redundancy of the equality constraints that define the vertex under consideration. When some of the equality constraints are redundant, that vertex is the solution of at least two different sets of equality constraints and may be associated with several bases. When a feasible basis B is degenerate, three things may happen. First, a strictly positive value t determining a new basic feasible solution as in the proof of the preceding theorem does not necessarily exist. In such a situation one must look for another basis that defines the same feasible solution as B does and search for t with this new basis. Second, even if a new feasible solution can be found, the value of the objective function does not necessarily increase when moving to the new solution. In principle, when a feasible vertex is not optimal, there always exists a basis associated with it which allows one to find a new feasible solution where the value of the objective function is strictly bigger than its value at the current vertex. Third, the current vertex may be optimal even if the reduced cost vector is not negative. In other words, the second conclusion of Theorem 3.1.4 is not true if the basis under consideration is degenerate.

Example 3.1.5 Consider the following linear programming problem:

maximize x1 − 3x2
subject to −x1 + x2 + x3 = 0
x1 − 2x2 + x4 = 0
x1, x2, x3, x4 ≥ 0.

A feasible basis corresponding to the basic variables x3 and x4 is given by

B_{3,4} = [1 0; 0 1].

Its associated solution is the vector x̄ = (0, 0, 0, 0)^T, which is feasible and degenerate. The reduced cost vector at this basis is given by

c̄_N = (1, −3)^T − ([1 0; 0 1][−1 1; 1 −2])^T (0, 0)^T = (1, −3)^T.

It has a strictly positive component. However, the basic solution x̄ obtained above is optimal. Indeed, let us examine another basis with which the solution x̄ is associated, namely the basis corresponding to the basic variables x1 and x3:

B_{1,3} = [−1 1; 1 0].

Of course, like the preceding one, this basis is degenerate. Its reduced cost vector is computed by

c̄_N = (−3, 0)^T − ([0 1; 1 1][1 0; −2 1])^T (1, 0)^T = (−1, −1)^T.

It is a negative vector. In view of Theorem 3.1.4 (i), the solution x̄ is optimal.
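The reduced cost computations of this example are mechanical and can be checked in a few lines. The sketch below (assuming Python with NumPy; reduced_costs is our own helper, not the book's notation) reproduces both computations: the first printed vector has a strictly positive entry, while the second is negative, confirming optimality.

```python
import numpy as np

# Example 3.1.5: maximize x1 - 3x2 subject to the two equality constraints.
A = np.array([[-1.,  1., 1., 0.],
              [ 1., -2., 0., 1.]])
c = np.array([1., -3., 0., 0.])

def reduced_costs(basic):
    # Reduced cost vector c̄_N = c_N - (B^{-1} N)^T c_B for the given basis.
    nonbasic = [j for j in range(A.shape[1]) if j not in basic]
    B, N = A[:, basic], A[:, nonbasic]
    return c[nonbasic] - np.linalg.solve(B, N).T @ c[basic]

print(reduced_costs([2, 3]))   # basis B_{3,4}: (1, -3), one positive entry
print(reduced_costs([0, 2]))   # basis B_{1,3}: (-1, -1), negative: optimal
```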

Example 3.1.6 Consider the following linear programming problem:

maximize x1 + x2 + x3
subject to x1 + x2 + x4 = 8
−x2 + x3 + x5 = 0
x1, · · · , x5 ≥ 0.

A feasible basis corresponding to the basic variables x4 and x5 is given by

B_{4,5} = [1 0; 0 1].

Its associated solution is the vector x̄ = (0, 0, 0, 8, 0)^T, which is feasible and degenerate. The reduced cost vector is given by

c̄_N = (1, 1, 1)^T − ([1 0; 0 1][1 1 0; 0 −1 1])^T (0, 0)^T = (1, 1, 1)^T.

At this stage Theorem 3.1.4 is not applicable, as the basic solution is degenerate. Let us try to find a new feasible solution with a bigger cost value. To this end, we observe that the reduced cost vector has three strictly positive components, and so any of the coordinate unit vectors e_N^j, j = 1, 2, 3, is a suitable choice to determine a new feasible solution x̂ as described in the proof of Theorem 3.1.4. We set for instance
x̂_N = t(1, 0, 0)^T,
x̂_B = (8, 0)^T − t [1 1 0; 0 −1 1] (1, 0, 0)^T = (8 − t, 0)^T.

Since x̂ must be positive, the largest t for which x̂_N and x̂_B are positive is the value t = 8. We obtain then a new feasible solution x̂ = (8, 0, 0, 0, 0)^T. This solution is degenerate. A basis associated to it is B_{1,5}, which is identical with B_{4,5}. The reduced cost vector at x̂ using the basis B_{1,5} is given by

c̄_N = c_N − ((B_{1,5})^{−1}N)^T c_B
    = (1, 1, 0)^T − ([1 0 1; −1 1 0])^T (1, 0)^T
    = (0, 1, −1)^T.

As before, the solution x̂ being degenerate, Theorem 3.1.4 is not applicable. We try again to find a better feasible solution. As the second component of the reduced cost vector is strictly positive, we choose a solution y with the help of the vector e_N^2 = (0, 1, 0)^T (the basic components being y1 and y5):

y_N = t(0, 1, 0)^T,
y_B = (8, 0)^T − t [1 0 1; −1 1 0] (0, 1, 0)^T = (8, −t)^T.

The feasibility of y requires y_N ≥ 0 and y_B ≥ 0, which enforces t = 0. Thus, with the basis B_{1,5} we move nowhere and remain at the same solution x̂. One notes nevertheless two more bases associated with the solution x̂. They are given below:

B_{1,2} = [1 1; 0 −1] and B_{1,3} = [1 0; 0 1],

which correspond to the pairs of basic variables x1, x2 and x1, x3 respectively. The reduced cost vectors at x̂ related to these bases are then
ĉ_{3,4,5} = (1, −1, 0)^T and ĉ_{2,4,5} = (1, −1, −1)^T.

Both of these reduced cost vectors suggest finding a new feasible solution z that may increase the cost value, with the help of the coordinate vector e_N^1 = (1, 0, 0)^T. Thus, by picking for instance the basis B_{1,3} we obtain a system that determines z as follows:

z_N = t(1, 0, 0)^T,
z_B = (8, 0)^T − t [1 1 0; −1 0 1] (1, 0, 0)^T = (8 − t, t)^T.

The biggest t that makes z feasible is the value t = 8. The new feasible solution is then z = (0, 8, 8, 0, 0)^T. Its associated basis is

B_{2,3} = [1 0; −1 1].

It is feasible and non-degenerate. A direct calculation gives the negative reduced cost vector ĉ_N = (−1, −2, −1)^T. By Theorem 3.1.4 the basis B_{2,3} is optimal.

The following corollary is useful in computing optimal solutions.

Corollary 3.1.7 Let B be a feasible non-degenerate basis and let x̄ be the associated basic feasible solution. If there is a non-basic variable x_s for which the sth component c̄_s of the reduced cost vector is strictly positive, then
(i) either the variable x_s can take any value bigger than x̄_s without leaving the feasible region X, in which case the optimal value of (LP) is unbounded;
(ii) or another feasible basis B̂ can be obtained whose associated feasible basic solution x̂ satisfies

⟨c, x̂⟩ > ⟨c, x̄⟩.

Proof Under the hypothesis of the corollary the basic solution x̄ is not optimal by the preceding theorem. Our aim is to find another feasible solution that produces a bigger value of the objective function. To this purpose choose x̂ with

x̂_N = te_s,
x̂_B = b̄ − tā^s,

where b̄ denotes the vector B^{−1}b, which is the same as x̄_B, and ā^s = B^{−1}Ne_s. If the vector ā^s is negative, then x̂_B is positive, and hence x̂ is feasible for every positive value of t. Moreover,

⟨c, x̂⟩ = ⟨c_B, x̄_B⟩ + tc̄_s,

which diverges to ∞ as t tends to ∞. The optimal value of (LP) is then unbounded.
If the vector ā^s is not negative, say ā_{is} > 0 for some index i, then x̂ cannot be feasible (positive) when t is large. A necessary and sufficient condition for that vector to be feasible is that t be no bigger than the value

t̂ := min{ b̄_i / ā_{is} : i ∈ {1, · · · , m}, ā_{is} > 0 }.

Let r be a basic index for which the above minimum is attained. Then the solution

x̂ = (b̄ − t̂ā^s, t̂e_s)

is feasible. The rth component x̄_r > 0 becomes x̂_r = 0, while the sth component x̄_s = 0 becomes x̂_s > 0. Denote by B̂ the matrix deduced from B by using the column a_s instead of a_r, so that the two matrices differ from each other by one column only. We show that this new matrix is a basis. Remember that a_s = Bā^s, which means that a_s is a vector in the column space of the matrix B. The coefficient corresponding to the vector a_r in that linear combination is ā_{rs} > 0. Consequently, the vector a_r can be expressed as a linear combination of the remaining column vectors of B and the vector a_s. Since the columns of B are linearly independent, we deduce the same property for the columns of the matrix B̂. Thus B̂ is a basis. It is clear that x̂ is the basic feasible solution associated with this basis. Moreover, the value of the objective function at this solution is given by ⟨c, x̂⟩ = ⟨c, x̄⟩ + t̂c̄_s, which is strictly bigger than the value ⟨c, x̄⟩. The proof is complete. □

Example 3.1.8 Consider the following linear programming problem:

maximize x1 − x2 + x3
subject to x1 + x2 + x3 = 1
3x1 + 2x2 = 1
x1, x2, x3 ≥ 0.

There are three bases

B_{1,2} = [1 1; 3 2], B_{1,3} = [1 1; 3 0] and B_{2,3} = [1 1; 2 0].

The first basis is not feasible because the associated solution (−1, 2, 0)^T has a negative component. The second and the third bases are feasible and non-degenerate. Their associated basic solutions are respectively u = (1/3, 0, 2/3)^T and v = (0, 1/2, 1/2)^T. For the basis B_{2,3}, the non-basic variable is x1, the basic component cost vector is c_{2,3} = (−1, 1)^T and the non-basic part of the constraint matrix is N_1 = (1, 3)^T. By definition, the reduced cost (one-dimensional) vector is computed by

c̄_1 = 1 − (−1, 1) [0 1/2; 1 −1/2] (1, 3)^T = 3.

In view of Theorem 3.1.4 the vertex v is not optimal. Let us follow the method of Corollary 3.1.7 to get a better solution. Remembering that x1 is the non-basic variable with the corresponding reduced cost c̄_1 = 3 > 0, we compute b̄ and ā^1 by

b̄ = (B_{2,3})^{−1} b = [0 1/2; 1 −1/2] (1, 1)^T = (1/2, 1/2)^T

and

ā^1 = (B_{2,3})^{−1} a_1 = [0 1/2; 1 −1/2] (1, 3)^T = (3/2, −1/2)^T.

The positive number t̂ in the proof of Corollary 3.1.7, expressing the length of the move from v to a new vertex, is t̂ = b̄_1/ā_{11} = (1/2)/(3/2) = 1/3. Hence the new feasible solution x̂ is given by

x̂ = (t̂, b̄ − t̂ā^1) = (1/3, 0, 2/3)^T.

This is exactly the feasible basic solution u. For this solution the non-basic variable is x2 and the corresponding reduced cost c̄_2 is given by

c̄_2 = −1 − (1, 1) [0 1/3; 1 −1/3] (1, 2)^T = −2 < 0.

By Theorem 3.1.4 the solution u is optimal and the optimal value of the problem is equal to ⟨c, u⟩ = 1.
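The procedure of Corollary 3.1.7 can be wrapped into a small loop. The following sketch (an added illustration in Python with NumPy; pivot_once is our own helper, and anti-cycling safeguards for degenerate problems are deliberately omitted) replays Example 3.1.8 starting from the vertex v.

```python
import numpy as np

# Example 3.1.8: maximize <c, x> with A x = b, x >= 0.
A = np.array([[1., 1., 1.],
              [3., 2., 0.]])
b = np.array([1., 1.])
c = np.array([1., -1., 1.])

def pivot_once(basic):
    # One step of the method of Corollary 3.1.7: find a non-basic index s
    # with positive reduced cost, do the ratio test, and swap columns.
    nonbasic = [j for j in range(A.shape[1]) if j not in basic]
    B = A[:, basic]
    b_bar = np.linalg.solve(B, b)
    y = np.linalg.solve(B.T, c[basic])
    cbar = c[nonbasic] - A[:, nonbasic].T @ y        # reduced costs
    if np.all(cbar <= 1e-9):
        return basic, True                            # basis is optimal
    s = nonbasic[int(np.argmax(cbar))]                # entering column
    a_bar = np.linalg.solve(B, A[:, s])
    ratios = [b_bar[i] / a_bar[i] if a_bar[i] > 1e-9 else np.inf
              for i in range(len(basic))]
    r = int(np.argmin(ratios))                        # leaving row (gives t̂)
    return sorted(basic[:r] + [s] + basic[r + 1:]), False

basic, done = [1, 2], False          # start from B_{2,3}, i.e. the vertex v
while not done:
    basic, done = pivot_once(basic)
print("optimal basis columns:", basic)   # [0, 2], the basis B_{1,3}
```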

3.2 Dual Problems

Associated with the linear problem (LP) we define a new linear problem, denoted (LD) and called the dual of (LP). We display both (LP) and (LD) below:

(LP) maximize ⟨c, x⟩          (LD) minimize ⟨b, y⟩
     subject to Ax = b             subject to A^T y ≥ c.
     x ≥ 0,

In this dual formulation the problem (LP) is called the primal problem. Using the method of converting linear inequalities to linear equalities, one may obtain from the pair (LP) and (LD) above the dual of a linear problem given in canonical form. In fact, suppose we are given the problem

maximize ⟨c, x⟩
subject to Ax ≤ b.
It is equivalent to the following problem


⎛ ⎞
x+
maximize (c, −c, 0) ⎝ x − ⎠
z
⎛ +⎞
x
subject to (A, −A, I ) ⎝ x − ⎠ = b
z
x + , x − , z  0,

in which I is the identity m × m matrix and z is an m-vector variable. It follows from


the scheme (LP)-(LD) that the dual of the latter problem is given by

minimize b, y
⎛ ⎞ ⎛ ⎞
AT cT
subject to ⎝ −A T ⎠ y  ⎝ −c T ⎠ ,
I 0

which is equivalent to

minimize b, y
subject to A T y = c
y  0.

On the other hand, by putting the minimization of the function ⟨b, y⟩ as minus the maximization of the function −⟨b, y⟩ and applying the primal-dual scheme above, we come to the conclusion that the dual of (LD) is exactly (LP). In other words, the dual of the dual is the primal. In this sense the pair (LP) and (LD) is known as a symmetric form of duality. This symmetry can also be seen when we write down a primal and dual pair in a general mixed form, in which both equality and inequality constraints as well as positive, negative and free (unrestricted) variables are present:

maximize Σ_{i=1}^3 ⟨c^i, x^i⟩          minimize Σ_{i=1}^3 ⟨b^i, y^i⟩
subject to A_1 x ≤ b^1                 subject to y^1 ≥ 0
           A_2 x ≥ b^2                            y^2 ≤ 0
           A_3 x = b^3                            y^3 free
           x^1 ≥ 0                                A_1^T y ≥ c^1
           x^2 ≤ 0                                A_2^T y ≤ c^2
           x^3 free                               A_3^T y = c^3

in which the variables x and y are decomposed into three parts: a positive part, a negative part and an unrestricted part, and the dimensions of the vectors c^i, b^i and the matrices A_i, i = 1, 2, 3, are in concordance. In the dual constraints, the matrices A_1, A_2 and A_3 are composed of the columns of the stacked constraint matrix (A_1; A_2; A_3) corresponding to the variables x^1, x^2 and x^3 respectively. The aforementioned scheme of duality is clearly seen in the following example.
example.

Example 3.2.1 Consider the following problem of three variables:

maximize c1x1 + c2x2 + c3x3
subject to a11x1 + a12x2 + a13x3 ≤ b1
a21x1 + a22x2 + a23x3 ≥ b2
a31x1 + a32x2 + a33x3 = b3
x1 ≥ 0
x2 ≤ 0
x3 free.

By using the duality scheme we obtain the dual problem:

minimize b1y1 + b2y2 + b3y3
subject to y1 ≥ 0
y2 ≤ 0
y3 free
a11y1 + a21y2 + a31y3 ≥ c1
a12y1 + a22y2 + a32y3 ≤ c2
a13y1 + a23y2 + a33y3 = c3.
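A solver makes the primal–dual scheme tangible. The sketch below (an added illustration assuming Python with SciPy's linprog, which minimizes by convention; the data reuse Example 3.1.8 and are our own choice) solves a standard-form primal and its dual and compares the optimal values.

```python
import numpy as np
from scipy.optimize import linprog

# Primal (LP): maximize <c, x> s.t. A x = b, x >= 0 -- data of Example 3.1.8;
# linprog minimizes, so we pass -c and negate the optimal value.
A = np.array([[1., 1., 1.], [3., 2., 0.]])
b = np.array([1., 1.])
c = np.array([1., -1., 1.])

primal = linprog(-c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")

# Dual (LD): minimize <b, y> s.t. A^T y >= c, y free.  With linprog's
# "<=" convention the constraint reads -A^T y <= -c.
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=(None, None), method="highs")

print("primal value:", -primal.fun)   # 1.0
print("dual value:  ", dual.fun)      # 1.0, equal as strong duality predicts
```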

Lagrangian functions
There are several ways to obtain the dual problem (LD) from the primal problem. Here is a method based on Lagrangian functions. Let us define a function of two variables x and y on the product space R^n × R^m with values in the extended real line R ∪ {±∞}:

L(x, y) = ⟨c, x⟩ + ⟨y, b − Ax⟩ if x ≥ 0, and L(x, y) = −∞ otherwise.

The inner product ⟨y, b − Ax⟩ can be interpreted as a measure of violation of the equality constraint Ax = b, which is added to the objective function as a penalty. The function L(x, y) is called the Lagrangian function of (LP). We see now that both problems (LP) and (LD) are obtained from this function.

Proposition 3.2.2 For every fixed vector x in R^n and every fixed vector y in R^m one has

inf_{y′∈R^m} L(x, y′) = ⟨c, x⟩ if Ax = b and x ≥ 0, and −∞ otherwise;
sup_{x′∈R^n} L(x′, y) = ⟨b, y⟩ if A^T y ≥ c, and +∞ otherwise.

Proof To compute the first formula of the proposition we distinguish three possible cases of x: (a) x ̸≥ 0; (b) x ≥ 0 with Ax ≠ b; and (c) x ≥ 0 and Ax = b. In case (a), the function L(x, y′) takes the value −∞ for every y′. In case (b) we put y_t = t(Ax − b) for t > 0. Then

lim_{t→∞} L(x, y_t) = lim_{t→∞} ( ⟨c, x⟩ − t‖Ax − b‖² ) = −∞.

In the last case, it is obvious that L(x, y′) = ⟨c, x⟩ for every y′. By this the first formula is proven.
As to the second formula, we have

sup_{x′∈R^n} L(x′, y) = sup_{x′∈R^n₊} L(x′, y)
                     = sup_{x′∈R^n₊} ( ⟨c, x′⟩ + ⟨y, b − Ax′⟩ )
                     = sup_{x′∈R^n₊} ⟨c − A^T y, x′⟩ + ⟨b, y⟩.

Given y in R^m, if c − A^T y ≤ 0, then the supremum on the right hand side is attained at x′ = 0 and equals ⟨b, y⟩. If c − A^T y ≰ 0, there is a strictly positive component, say

c_i − (A^T y)_i > 0. Then by setting x_t = te_i for t > 0, where e_i is the ith coordinate unit vector of R^n, we obtain

⟨c − A^T y, x_t⟩ + ⟨b, y⟩ = t(c_i − (A^T y)_i) + ⟨b, y⟩,

which tends to ∞ as t tends to ∞. This completes the proof. □

As a consequence of Proposition 3.2.2, the primal problem can be written in the form

maximize inf_{y∈R^m} L(x, y)
subject to x ∈ R^n,

and the dual problem in the form

minimize sup_{x∈R^n} L(x, y)
subject to y ∈ R^m.

The utility of dual problems will become clear in the sequel.

Duality relations
Intimate and mutual ties between the primal and dual problems enrich our understanding of the existence and sensitivity of optimal solutions, as well as solution methods for a linear problem and the economic interpretation of the models from which it arose. The theorem below describes a complete relationship between (LP) and (LD), their values and their variables.

Theorem 3.2.3 For the couple of primal and dual problems (LP) and (LD) the following statements hold.
(i) (Weak duality) For each feasible solution x of (LP) and each feasible solution y of (LD),

⟨c, x⟩ ≤ ⟨b, y⟩.

In particular, if this inequality becomes equality, then x and y are optimal solutions. Moreover, two feasible solutions x and y are optimal if and only if the complementary slackness condition holds:

⟨A^T y − c, x⟩ = 0.

(ii) (Strong duality) If the primal and the dual problems have feasible solutions, then they have optimal solutions and the two optimal values are equal.
(iii) If either problem has unbounded optimal value, then the other problem has no feasible solution.

Proof For the weak duality relation we have b = Ax and A T y  c. As x is positive,


we deduce

b, y = Ax, y = x, A T y  c, x.

Assume now equality holds for some feasible solutions x 0 and y 0 of (LP) and (LD).
Then for every feasible solutions x and y the weak duality yields respectively

c, x 0  = b, y 0   c, x,


b, y 0  = c, x 0   b, y,

which prove that x 0 and y 0 are optimal.


Furthermore, if the complementary slackness condition holds for feasible solutions x and y, then the weak duality relation becomes an equality, and hence they are optimal. Conversely, let x and y be primal and dual optimal solutions. We wish to establish equality in the weak duality relation. Suppose to the contrary that for these optimal solutions the weak duality relation is strict, which means that the system

A^Ty′ ≥ c
⟨b, y′⟩ ≤ ⟨c, x⟩

is not solvable. In view of Corollary 2.2.4 there exist a positive vector u and a real number t ≥ 0 such that

$$\begin{pmatrix} A & -b\end{pmatrix}\begin{pmatrix} u\\ t\end{pmatrix} = 0 \quad\text{and}\quad \left\langle \begin{pmatrix} c\\ -\langle c, x\rangle\end{pmatrix}, \begin{pmatrix} u\\ t\end{pmatrix}\right\rangle = 1.$$

If t = 0, then Au = 0 and ⟨c, u⟩ = 1. This means that the objective function is strictly positive on an asymptotic direction of X and shows that the problem has unbounded optimal value. Thus, t is strictly positive and the vector (1/t)u is a feasible solution at which the value of the objective function is

⟨c, (1/t)u⟩ = ⟨c, x⟩ + 1/t > ⟨c, x⟩.

This is a contradiction, and hence the two optimal values are equal.
We proceed to (ii). Assume that (LP) and (LD) have feasible solutions. By weak duality the cost function ⟨c, ·⟩ of (LP) is bounded above on the feasible set. In view of Theorem 3.1.1, (LP) has an optimal solution. The same argument shows that the dual (LD) possesses an optimal solution too. Let x be an optimal solution of (LP) and y an optimal solution of (LD). By the first part, they satisfy the complementary slackness condition. Taking into account the fact that Ax = b, we deduce

⟨c, x⟩ = ⟨A^Ty, x⟩ = ⟨Ax, y⟩ = ⟨b, y⟩,

and so the two optimal values are equal.


The last statement is an immediate consequence of the weak duality relation: if the primal has a feasible solution x, then the dual objective is bounded below by ⟨c, x⟩; likewise, if the dual has a feasible solution, then the primal objective is bounded above. □
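The duality relations above are easy to check numerically. The following sketch is an illustration, not part of the book's text; it solves a small standard-form pair with scipy (the same data reappear in Example 3.2.6 below) and confirms that the two optimal values coincide.

```python
# Illustration (not from the book): strong duality on a small (LP)/(LD) pair.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([1.0, 2.0])
c = np.array([1.0, 1.0, 0.0, 0.0])

# (LP): maximize <c,x> s.t. Ax = b, x >= 0 (linprog minimizes, so negate c)
primal = linprog(-c, A_eq=A, b_eq=b, bounds=[(0, None)] * 4)

# (LD): minimize <b,y> s.t. A^T y >= c, y free
# (rewrite A^T y >= c as -A^T y <= -c for linprog)
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(None, None)] * 2)

print(-primal.fun, dual.fun)   # both print 1.0, as Theorem 3.2.3(ii) predicts
```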
The preceding theorem describes almost all possible situations for the primal-dual pair; it remains to note the last one, in which both problems are infeasible. Here is an example of such a situation.
Example 3.2.4 Consider the problem

maximize x1 + x2
subject to x1 − x2 ≤ 1
−x1 + x2 ≤ −2
x1, x2 ≥ 0,

which is infeasible. Its dual takes the form

minimize y1 − 2y2
subject to y1 − y2 ≥ 1
−y1 + y2 ≥ 1
y1, y2 ≥ 0.

It is evident that the dual problem has no feasible solution.


The next corollary shows how to obtain an optimal dual vertex from an optimal primal basis.

Corollary 3.2.5 If x is an optimal basic solution of (LP) corresponding to a basis B and the reduced cost vector c̄_N is negative, then the vector y = (B^{-1})^Tc_B is an optimal basic solution of (LD).
Proof We show that y = (B^{-1})^Tc_B is feasible. In fact,

$$A^Ty = A^T(B^{-1})^Tc_B = \begin{pmatrix} B^T\\ N^T\end{pmatrix}(B^{-1})^Tc_B = \begin{pmatrix} c_B\\ (B^{-1}N)^Tc_B\end{pmatrix}.$$

By hypothesis $x = \begin{pmatrix} x_B\\ 0\end{pmatrix}$ and the vector (B^{-1}N)^Tc_B − c_N (the opposite of the reduced cost vector) is positive. Hence A^Ty ≥ c and y is feasible. To prove that y is optimal, we calculate

⟨A^Ty, x⟩ = ⟨c_B, x_B⟩ + ⟨(B^{-1}N)^Tc_B, x_N⟩ = ⟨c_B, x_B⟩ = ⟨c, x⟩.

By Theorem 3.2.3, y is optimal. □

The hypothesis of the preceding corollary is satisfied if the basis is optimal and
non-degenerate. When it is degenerate, the reduced cost vector is not necessarily
negative, and so there is no guarantee that the vector y defined by the formula of the
corollary is dual feasible.

Example 3.2.6 Consider the problem

maximize x1 + x2
subject to x1 + x2 + x3 = 1
x1 + 2x2 + x4 = 2
x1, x2, x3, x4 ≥ 0.

The dual is written as follows:

minimize y1 + 2y2
subject to y1 + y2 ≥ 1
y1 + 2y2 ≥ 1
y1, y2 ≥ 0.

We choose the basis $B_{2,3} = \begin{pmatrix} 1 & 1\\ 2 & 0\end{pmatrix}$ that corresponds to the basic variables x2 and x3. The associated basic solution x = (0, 1, 0, 0)^T is degenerate. The reduced cost vector at this basis is

$$\bar c_N = c_N - \big(B_{2,3}^{-1}N\big)^Tc_B = \begin{pmatrix} 1\\ 0\end{pmatrix} - \left(\begin{pmatrix} 0 & 1/2\\ 1 & -1/2\end{pmatrix}\begin{pmatrix} 1 & 0\\ 1 & 1\end{pmatrix}\right)^T\begin{pmatrix} 1\\ 0\end{pmatrix} = \begin{pmatrix} 1/2\\ -1/2\end{pmatrix}.$$

This vector is not negative, and the dual vector

$$y = \big(B_{2,3}^{-1}\big)^Tc_B = \begin{pmatrix} 0 & 1/2\\ 1 & -1/2\end{pmatrix}^T\begin{pmatrix} 1\\ 0\end{pmatrix} = \begin{pmatrix} 0\\ 1/2\end{pmatrix}$$

is not feasible for the dual problem.


On the other hand, if we choose the basis $B_{2,4} = \begin{pmatrix} 1 & 0\\ 2 & 1\end{pmatrix}$ that corresponds to the basic variables x2 and x4, then its associated basic solution is exactly the same as that of the basis B_{2,3}, that is, x = (0, 1, 0, 0)^T. The reduced cost vector at this basis is different, however, and given below:

$$\bar c_N = c_N - \big(B_{2,4}^{-1}N\big)^Tc_B = \begin{pmatrix} 1\\ 0\end{pmatrix} - \left(\begin{pmatrix} 1 & 0\\ -2 & 1\end{pmatrix}\begin{pmatrix} 1 & 1\\ 1 & 0\end{pmatrix}\right)^T\begin{pmatrix} 1\\ 0\end{pmatrix} = \begin{pmatrix} 0\\ -1\end{pmatrix}.$$

This time, the reduced cost vector is negative. The dual vector defined by the formula of the corollary,

$$y = \big(B_{2,4}^{-1}\big)^Tc_B = \begin{pmatrix} 1\\ 0\end{pmatrix},$$

is feasible for the dual problem.
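The computations of this example are mechanical enough to script. Below is a minimal sketch (not from the book) that reproduces the reduced cost vectors and dual candidates for both bases with numpy.

```python
# Sketch (not from the book): the computations of Example 3.2.6.
# For a basis B with nonbasic part N, the reduced cost vector is
# c_N - (B^{-1} N)^T c_B and the dual candidate is y = (B^{-1})^T c_B.
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
c = np.array([1.0, 1.0, 0.0, 0.0])

def reduced_cost_and_dual(basic, nonbasic):
    B, N = A[:, basic], A[:, nonbasic]
    Binv = np.linalg.inv(B)
    return c[nonbasic] - (Binv @ N).T @ c[basic], Binv.T @ c[basic]

print(reduced_cost_and_dual([1, 2], [0, 3]))  # B_{2,3}: [1/2, -1/2], y = (0, 1/2)
print(reduced_cost_and_dual([1, 3], [0, 2]))  # B_{2,4}: [0, -1],     y = (1, 0)
```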

Sensitivity
The duality result of Corollary 3.2.5 yields a nice estimate for the change of the
optimal value when the constraint vector b undergoes small perturbations.
 
Corollary 3.2.7 Assume that $x = \begin{pmatrix} x_B\\ 0\end{pmatrix}$ is an optimal non-degenerate basic solution of (LP) and B is the corresponding basis. Then for a small perturbation b + Δb of the vector b, the vector

$$\hat x = \begin{pmatrix} x_B + B^{-1}\Delta b\\ 0\end{pmatrix}$$

is an optimal basic solution of the perturbed problem (LP_{Δb}):

maximize ⟨c, x⟩
subject to Ax = b + Δb
x ≥ 0,

and the change in the optimal value is given by

⟨c, x̂⟩ − ⟨c, x⟩ = (c_B)^TB^{-1}Δb.

In particular, as a function of b, the optimal value of (LP) is differentiable at b and its derivative is the optimal dual solution y = (B^{-1})^Tc_B.

Proof Let Δb be a small increment of b. Consider the perturbed problem (LP_{Δb}) described in the corollary. We show that x̂ is feasible. Indeed,

$$A\hat x = \begin{pmatrix} B & N\end{pmatrix}\begin{pmatrix} x_B + B^{-1}\Delta b\\ 0\end{pmatrix} = Bx_B + \Delta b = b + \Delta b.$$

Since x_B is strictly positive, when Δb is sufficiently small the vector x_B + B^{-1}Δb is positive. Hence x̂ is feasible. We deduce also that B remains a feasible basis of the perturbed problem. According to Corollary 3.2.5 the vector y = (B^{-1})^Tc_B is an optimal solution of the dual problem of (LP). It is also a feasible solution of the dual of the perturbed problem, as they share the same feasible set. Moreover,

⟨b + Δb, y⟩ = ⟨b + Δb, (B^{-1})^Tc_B⟩ = ⟨B^{-1}b + B^{-1}Δb, c_B⟩ = ⟨x̂_B, c_B⟩ = ⟨c, x̂⟩.

By duality, x̂ is an optimal solution of the perturbed problem and y is an optimal solution of its dual. We deduce also the change of the optimal value:

⟨c, x̂⟩ − ⟨c, x⟩ = ⟨c_B, x̂_B⟩ − ⟨c_B, x_B⟩ = ⟨b + Δb, y⟩ − ⟨b, y⟩ = ⟨Δb, (B^{-1})^Tc_B⟩ = (c_B)^TB^{-1}Δb,

which yields the requested formula. The last assertion is immediate from the formula of the change of the optimal value. □
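The sensitivity formula can be observed numerically. The following sketch is an illustration, not part of the book; it compares the actual change of the optimal value with the predicted change ⟨Δb, y⟩ on the data of Example 3.2.6, where y = (1, 0)^T is the dual solution produced by Corollary 3.2.5 for an optimal non-degenerate basis (the one with basic variables x1 and x4).

```python
# Sketch (not from the book): checking Corollary 3.2.7 numerically.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([1.0, 2.0])
c = np.array([1.0, 1.0, 0.0, 0.0])
db = np.array([0.01, -0.02])          # a small perturbation of b

def value(rhs):
    res = linprog(-c, A_eq=A, b_eq=rhs, bounds=[(0, None)] * 4)
    return -res.fun

y = np.array([1.0, 0.0])              # optimal dual solution of this pair
print(value(b + db) - value(b))       # observed change: 0.01
print(db @ y)                         # predicted change <db, y>: 0.01
```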

3.3 The Simplex Method

The simplex method is aimed at solving the linear problem (LP). Its strategy is to
start with a feasible vertex and search an adjacent vertex that increases the value of
the objective function until either a ray on which the objective function is unbounded
is identified or an optimal vertex is found.

Description of the method

Let us assume for a moment that we have a feasible basis B_0 at our disposal. Here is the algorithm.

Step 1. Compute the associated feasible vertex x⁰ whose components are x_{B_0} = (B_0)^{-1}b and x_{N_0} = 0. Set the iteration counter k = 0.
Step 2. Set k := k + 1. Let B_k be the current feasible basis with associated basic vertex x^k having the two components x_{B_k} and x_{N_k}. Compute

b̄ = B_k^{-1}b,
c̄_N = c_N − [B_k^{-1}N]^Tc_B.

Step 3. If c̄_N ≤ 0, then stop: the current vertex x^k is optimal. Otherwise go to the next step.
Step 4. Let s be an index for which c̄_s > 0. Pick the column a_s of the matrix A and compute

ā_s = B_k^{-1}a_s.

If this vector is negative, then stop: the problem is unbounded. Otherwise find an index ℓ such that

x̂_s := b̄_ℓ/ā_{ℓs} = min{ b̄_i/ā_{is} : ā_{is} > 0 }.

Step 5. Form a new feasible basis B_{k+1} from B_k by deleting the column a_ℓ and entering the column a_s instead. The new associated vertex x^{k+1} is obtained from x^k by setting the variable x_s = x̂_s > 0 and the variable x_ℓ = 0.
Step 6. Compute the inverse matrix B_{k+1}^{-1} of the new basis B_{k+1} and return to Step 2.

The element ā_{ℓs} obtained above from the matrix A is called a pivot; the column ā_s = (ā_{1s}, · · · , ā_{ms})^T and the row ā_ℓ = (ā_{ℓ1}, · · · , ā_{ℓn}) are called the pivotal column and the pivotal row of the algorithm.

Theorem 3.3.1 If all feasible bases of the matrix A are non-degenerate, then the simplex algorithm terminates after a finite number of iterations.

Proof Note that the number of vertices of the polyhedron X is finite, say equal to p. Moreover, in the simplex method the objective function strictly increases each time the algorithm passes from one vertex to another. Thus, after a finite number of iterations (at most p), one obtains a vertex which is either optimal, or from which a ray starts along which the objective function increases unboundedly. □
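For readers who want to experiment, here is a direct, unoptimized transcription of Steps 1-6 into Python. This is a sketch, not the book's code: it recomputes B^{-1} at every iteration instead of using the product form described later, and it assumes non-degeneracy as in Theorem 3.3.1.

```python
# Sketch (not from the book): the simplex algorithm of Steps 1-6 for
# maximize <c,x> s.t. Ax = b, x >= 0, started from a known feasible basis
# (a list of column indices). A, b, c are numpy arrays.
import numpy as np

def simplex(A, b, c, basis):
    m, n = A.shape
    basis = list(basis)
    while True:
        nonbasic = [j for j in range(n) if j not in basis]
        Binv = np.linalg.inv(A[:, basis])
        b_bar = Binv @ b                                    # current x_B
        c_bar = c[nonbasic] - (Binv @ A[:, nonbasic]).T @ c[basis]
        if np.all(c_bar <= 1e-12):                          # Step 3: optimal
            x = np.zeros(n)
            x[basis] = b_bar
            return x, float(c @ x)
        s = nonbasic[int(np.argmax(c_bar))]                 # entering column
        a_bar = Binv @ A[:, s]                              # pivotal column
        if np.all(a_bar <= 1e-12):                          # Step 4: unbounded
            raise ValueError("problem is unbounded")
        ratios = [b_bar[i] / a_bar[i] if a_bar[i] > 1e-12 else np.inf
                  for i in range(m)]
        basis[int(np.argmin(ratios))] = s                   # Steps 5-6

# On the data of Example 3.3.4 below, simplex(A, b, c, basis=[2, 3, 4])
# returns x = (2, 3, 2, 0, 0) with value 8.
```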

Finding a feasible basis


We are considering the constraints of the problem (LP)

Ax = b and x ≥ 0.

By multiplying both sides of an equality by (−1) one may assume that the right-hand side vector b has non-negative components only. Moreover, as we saw in the first chapter, when the constraints are consistent one may remove some of them so that no redundant equation remains and the solution set is unchanged. From now on we impose two conditions on the constraints: (1) the vector b is positive, and (2) no equality is redundant.
Since the choice of a feasible basis for starting the simplex algorithm is not evident, one introduces a vector of artificial variables y = (y1, · · · , ym)^T and considers the linear problem

minimize y1 + · · · + ym (3.5)
subject to Ax + y = b and x ≥ 0, y ≥ 0.

Proposition 3.3.2 The problem (LP) has a feasible solution if and only if the problem
(3.5) has a minimum value equal to zero with y = 0.

Proof Assume that x is a feasible solution of (LP). Then (x, y) with y = 0 is a feasible solution of (3.5), and the minimum value is zero. Conversely, if the optimal value of (3.5) is zero, then at an optimal solution (x, y) one has y = 0, which implies that x is a feasible solution of (LP). □

Notice that the artificial problem (3.5) has optimal solutions. This is because the feasible set is nonempty (it contains, for instance, the solution with x = 0 and y = b, which is a basic feasible solution associated with the feasible basis B = I, the identity matrix) and the objective function is bounded from below. Suppose that an optimal vertex (x, 0) is found for (3.5) in which no y_i is a basic variable. Then the corresponding columns of the matrix A are linearly independent and form a feasible basis of (LP). If some y_i is a basic variable, say, for the sake of simple writing, y_1 is the unique basic artificial variable together with the m − 1 basic variables x_1, · · · , x_{m−1}, then by applying the simplex method one may arrive at the point where either the basic variable y_1 is replaced by a new basic variable x_j with m ≤ j ≤ n, or it is impossible to substitute y_1 by any of those x_j. The latter case can happen only when the entries of the row of B^{-1}N associated with y_1 in the columns of the variables x_j, m ≤ j ≤ n, are all zero. This shows that the rank of the matrix A is m − 1, and so the constraints of (LP) are redundant, a contradiction.
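As a quick illustration (not from the book), the auxiliary problem (3.5) can be assembled and solved with any LP solver; the sketch below uses scipy and returns a feasible point of (LP) whenever the optimal value of (3.5) is zero.

```python
# Sketch (not from the book): Phase I via the auxiliary problem (3.5).
# The artificial variables y start as the basic variables (basis = identity);
# (LP) is feasible iff the optimal value is zero (Proposition 3.3.2).
import numpy as np
from scipy.optimize import linprog

def phase_one(A, b):
    m, n = A.shape
    A, b = A.copy(), b.copy()
    neg = b < 0
    A[neg], b[neg] = -A[neg], -b[neg]          # make b >= 0 first
    # minimize y_1 + ... + y_m  s.t.  Ax + y = b,  x >= 0, y >= 0
    cost = np.concatenate([np.zeros(n), np.ones(m)])
    res = linprog(cost, A_eq=np.hstack([A, np.eye(m)]), b_eq=b,
                  bounds=[(0, None)] * (n + m))
    feasible = res.fun < 1e-9
    return feasible, res.x[:n]                  # a feasible point of (LP), if any
```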

The product form of the inverse

By looking at the basic matrices B_k and B_{k+1} we notice that they differ from each other only in one column, that is, they are adjacent. This enables us to compute the inverse of B_{k+1} from the inverse of B_k. In fact, denote by D the elementary m × m matrix, called the matrix for change of basis, which is the identity matrix except for the ℓth column, equal to the vector

$$\left(-\frac{\bar a_{1s}}{\bar a_{\ell s}},\ \cdots,\ \frac{1}{\bar a_{\ell s}},\ \cdots,\ -\frac{\bar a_{ms}}{\bar a_{\ell s}}\right)^T.$$

Namely,

$$D = \begin{pmatrix} 1 & \cdots & -\bar a_{1s}/\bar a_{\ell s} & \cdots & 0\\ \vdots & & \vdots & & \vdots\\ 0 & \cdots & 1/\bar a_{\ell s} & \cdots & 0\\ \vdots & & \vdots & & \vdots\\ 0 & \cdots & -\bar a_{ms}/\bar a_{\ell s} & \cdots & 1\end{pmatrix}.$$

Proposition 3.3.3 With the matrix D above, one has

B_{k+1}^{-1} = DB_k^{-1}.

In particular, if the first basis is the identity matrix, then

B_k^{-1} = D_k · · · D_1,

where the D_i are change matrices.

Proof Let β_1, · · · , β_m be the columns of the matrix B_k. Then the columns of B_{k+1} are the same, with the ℓth column substituted by the column a_s of the matrix A. By multiplying B_{k+1} by D we obtain a matrix whose columns are exactly those of B_k except for the ℓth one, given by

$$\frac{1}{\bar a_{\ell s}}\big(B_k(-\bar a_s) + a_s\big) + \beta_\ell.$$

Since ā_s = B_k^{-1}a_s, we deduce that B_k(−ā_s) + a_s = 0, and the ℓth column of the product B_{k+1}D is equal to β_ℓ. Consequently, B_{k+1}D = B_k and the requested formula follows. □

It should be noticed that each elementary matrix D is uniquely determined by the number ℓ and its ℓth column. Thus, it suffices to store the m + 1 numbers (ℓ, −ā_{1s}/ā_{ℓs}, · · · , 1/ā_{ℓs}, · · · , −ā_{ms}/ā_{ℓs}) in order to fully reconstitute D.
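A small sketch (not from the book) of this bookkeeping: build D from the pivotal column and verify the identity of Proposition 3.3.3 on random data.

```python
# Sketch (not from the book): product form of the inverse. Given the pivotal
# column a_bar = B_k^{-1} a_s and the pivotal row l, the elementary matrix D
# satisfies B_{k+1}^{-1} = D B_k^{-1}.
import numpy as np

def change_matrix(a_bar, l):
    m = len(a_bar)
    D = np.eye(m)
    D[:, l] = -a_bar / a_bar[l]      # the l-th column of D
    D[l, l] = 1.0 / a_bar[l]
    return D

# quick check on random data: B_next is B with column l replaced by a_s
rng = np.random.default_rng(0)
B = np.eye(3) + rng.random((3, 3))
a_s = rng.random(3)
l = 1
a_bar = np.linalg.solve(B, a_s)
B_next = B.copy(); B_next[:, l] = a_s
D = change_matrix(a_bar, l)
print(np.allclose(D @ np.linalg.inv(B), np.linalg.inv(B_next)))  # True
```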

The simplex tableau

In order to solve the problem (LP):

maximize ⟨c, x⟩ subject to Ax = b and x ≥ 0,

we assume that b is a positive vector and the matrix A is written in the form (B N), where B is a feasible basis. To simplify the writing, the cost vector c is set in row form. The simplex tableau, denoted T, is of the form

$$T = \begin{array}{|c|c|}\hline c^T = (c_B^T\ \ c_N^T) & 0\\ \hline A = (B\ \ N) & b\\ \hline\end{array}$$

By pre-multiplying the tableau T by the extended inverse of B,

$$\begin{pmatrix} 1 & -c_B^TB^{-1}\\ 0 & B^{-1}\end{pmatrix},$$

we obtain the tableau T* as follows:

$$T^* = \begin{array}{|cc|c|}\hline 0 & \bar c_N^T = c_N^T - c_B^TB^{-1}N & -c_B^TB^{-1}b\\ \hline I & \bar N = B^{-1}N & B^{-1}b\\ \hline\end{array}$$

The tableau T* contains all information necessary for the simplex algorithm.
 
• The associated basic solution is found in the right bottom box: $x = \begin{pmatrix} x_B\\ x_N\end{pmatrix}$ with x_B = B^{-1}b and x_N = 0.
• The value of the objective function at this basic solution is equal to ⟨c, x⟩ = (c_B)^TB^{-1}b, the opposite of the value given in the upper right corner.
• The reduced cost vector c̄_N is given in the upper middle box. If all components of this vector are negative, then the current basic vertex is optimal.
• If some of the components of the reduced cost vector are positive, choose an index, say s, with c̄_s largest. The variable x_s will enter the basis. A variable x_ℓ with the index ℓ satisfying

b̄_ℓ/ā_{ℓs} = min{ b̄_i/ā_{is} : ā_{is} > 0 }

will leave the basis.


The simplex tableau of the next iteration is obtained from T* by pre-multiplying it by the matrix

$$S = \begin{pmatrix} 1 & 0 & \cdots & -\bar c_s/\bar a_{\ell s} & \cdots & 0\\ 0 & 1 & \cdots & -\bar a_{1s}/\bar a_{\ell s} & \cdots & 0\\ \vdots & \vdots & & 1/\bar a_{\ell s} & & \vdots\\ 0 & 0 & \cdots & -\bar a_{ms}/\bar a_{\ell s} & \cdots & 1\end{pmatrix}.$$

We find again the matrix of change D in the lower right box. The pivot of the simplex tableau is the element ā_{ℓs}.

Note that after identifying the pivot ā_{ℓs}, the simplex tableau of the next iteration is obtained from the current one by multiplying the pivot row by the inverse of the pivot and by adding suitable multiples of the pivot row to the other rows, so that the pivot column becomes a vector whose ℓth component is equal to one and whose other components are zero. This is exactly what the pre-multiplication of T* by S does.
Example 3.3.4 We consider the problem

maximize x1 + 2x2
subject to −3x1 + 2x2 ≤ 2
−x1 + 2x2 ≤ 4
x1 + x2 ≤ 5
x1, x2 ≥ 0.

It is equivalent to the following problem in standard form:

maximize x1 + 2x2
subject to −3x1 + 2x2 + x3 = 2
−x1 + 2x2 + x4 = 4
x1 + x2 + x5 = 5
x1, · · · , x5 ≥ 0.

The initial simplex tableau is given as

1 2 0 0 0 | 0
−3 2 1 0 0 | 2
−1 2 0 1 0 | 4
1 1 0 0 1 | 5

Choose the evident basis B = I, the identity matrix, corresponding to the basic variables x3, x4 and x5. Since the basic part of the cost vector c_B is null, the reduced cost vector of this basis is

$$\bar c_N = c_N - [B^{-1}N]^Tc_B = c_N = \begin{pmatrix} 1\\ 2\end{pmatrix}.$$

In view of Theorem 3.1.4 this basis is not optimal. To move to a better solution we make a change of basis by introducing the non-basic variable x2, which corresponds to the biggest reduced cost c̄_2 = 2, into the basic variables. We have

ā_2 = B^{-1}a_2 = a_2 = (2, 2, 1)^T,
t̂ = min{ b̄_i/ā_{i2} : ā_{i2} > 0 } = min{2/2, 4/2, 5/1} = 1.

We see that t̂ is reached at i = 1, which corresponds to the basic variable x3. Thus, x3 leaves the basis and x2 enters it. The pivot is the element ā_{12} = 2, with the pivotal row ℓ = 1 and the pivotal column s = 2. The matrices of change D and S are given by

$$D_1 = \begin{pmatrix} 1/2 & 0 & 0\\ -2/2 & 1 & 0\\ -1/2 & 0 & 1\end{pmatrix} \quad\text{and}\quad S_1 = \begin{pmatrix} 1 & -1 & 0 & 0\\ 0 & 1/2 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & -1/2 & 0 & 1\end{pmatrix}.$$

The new tableau T2 is obtained by the product S1T1 and is displayed below

4 0 −1 0 0 | −2
−3/2 1 1/2 0 0 | 1
2 0 −1 1 0 | 2
5/2 0 −1/2 0 1 | 4

The current basic variables are x2, x4 and x5. We read from the simplex tableau that the reduced cost vector is $\bar c_N = \begin{pmatrix} 4\\ -1\end{pmatrix}$, and by Theorem 3.1.4 the current basis is not optimal. The unique non-basic variable with a positive reduced cost is x1 (the reduced cost c̄_1 = 4). We have

ā_1 = B^{-1}a_1 = (−3/2, 2, 5/2)^T,
t̂ = min{ b̄_i/ā_{i1} : ā_{i1} > 0 } = min{2/2, 4/(5/2)} = 1.

The value t̂ is reached at i = 2, which corresponds to the basic variable x4. Thus, x4 leaves the basis and x1 enters it. The pivot is the element ā_{21} = 2, with the pivotal row ℓ = 2 and the pivotal column s = 1. The matrix S is given by

$$S_2 = \begin{pmatrix} 1 & 0 & -4/2 & 0\\ 0 & 1 & 3/4 & 0\\ 0 & 0 & 1/2 & 0\\ 0 & 0 & -5/4 & 1\end{pmatrix}.$$

The new tableau T3 is obtained by the product S2T2 and is displayed below

0 0 1 −2 0 | −6
0 1 −1/4 3/4 0 | 5/2
1 0 −1/2 1/2 0 | 1
0 0 3/4 −5/4 1 | 3/2

The current basic variables are x1, x2 and x5. The reduced cost vector is $\bar c_N = \begin{pmatrix} 1\\ -2\end{pmatrix}$, and again by Theorem 3.1.4 the current basis is not optimal. The unique non-basic variable with a positive reduced cost is x3 (the reduced cost c̄_3 = 1). We have

ā_3 = B^{-1}a_3 = (−1/4, −1/2, 3/4)^T,
t̂ = min{ b̄_i/ā_{i3} : ā_{i3} > 0 } = (3/2)/(3/4) = 2.

The value t̂ is reached at i = 3, which corresponds to the basic variable x5. Thus, x5 leaves the basis and x3 enters it. The pivot is the element ā_{33} = 3/4, with the pivotal row ℓ = 3 and the pivotal column s = 3. The matrix S is given by

$$S_3 = \begin{pmatrix} 1 & 0 & 0 & -4/3\\ 0 & 1 & 0 & 1/3\\ 0 & 0 & 1 & 2/3\\ 0 & 0 & 0 & 4/3\end{pmatrix}.$$

The new tableau T4 is obtained by the product S3T3 and is displayed below

0 0 0 −1/3 −4/3 | −8
0 1 0 1/3 1/3 | 3
1 0 0 −1/3 2/3 | 2
0 0 1 −5/3 4/3 | 2

The current basic variables are x1, x2 and x3. The reduced cost vector $\bar c_N = \begin{pmatrix} -1/3\\ -4/3\end{pmatrix}$ is negative, hence the current basis is optimal. We obtain immediately x1 = 2, x2 = 3 and x3 = 2. The optimal value is 8 (the opposite of the number at the upper right corner of the tableau). At this iteration the algorithm terminates.
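As a sanity check (not part of the book), the original problem of this example can be handed to an off-the-shelf solver, which confirms the optimal vertex (2, 3) and the value 8.

```python
# Quick numerical check of Example 3.3.4 (sketch, not from the book):
from scipy.optimize import linprog

res = linprog([-1, -2],                        # maximize x1 + 2x2
              A_ub=[[-3, 2], [-1, 2], [1, 1]],
              b_ub=[2, 4, 5],
              bounds=[(0, None)] * 2)
print(res.x, -res.fun)                         # (2, 3) and optimal value 8
```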

The Two-phase method


When a feasible basis for starting the simplex algorithm is not apparent, the auxiliary problem (3.5) with artificial variables constitutes Phase I of the simplex method. Once a basic feasible solution is found in Phase I, one applies Phase II to find an optimal solution. In Phase II the artificial variables and the auxiliary objective function play no role, and so they are omitted.

Example 3.3.5 We consider the problem

maximize x1 + x2 + x3
subject to 2x1 + x2 + 2x3 = 4
3x1 + 3x2 + x3 = 3
x1, x2, x3 ≥ 0.

Since a basic feasible solution is not evident, we introduce two artificial variables x4 and x5. Phase I of the simplex algorithm consists of solving the problem

maximize −x4 − x5
subject to 2x1 + x2 + 2x3 + x4 = 4
3x1 + 3x2 + x3 + x5 = 3
x1, · · · , x5 ≥ 0.

The initial simplex tableau in Phase I is given as

0 0 0 −1 −1 | 0
2 1 2 1 0 | 4
3 3 1 0 1 | 3

Choose the evident basis B = I, the identity matrix, corresponding to the basic variables x4 and x5. Since the non-basic component of the cost vector c is null, the reduced cost vector of this basis is

c̄_N = −[B^{-1}N]^Tc_B = (5, 4, 3)^T.

In view of Theorem 3.1.4 this basis is not optimal. Since B is the identity matrix, the new tableau differs from the former one only by the first row, equal to (5, 4, 3, 0, 0, 7), and is displayed below

5 4 3 0 0 | 7
2 1 2 1 0 | 4
3 3 1 0 1 | 3

In fact the new tableau is obtained from the initial one by updating the first row so that the components corresponding to the basic variables are zero. The first component of this row has the biggest positive value, equal to 5; we choose the pivot ā_{21} = 3, with the pivotal row ℓ = 2 and pivotal column s = 1, and obtain the next tableau

0 −1 4/3 0 −5/3 | 2
0 −1 4/3 1 −2/3 | 2
1 1 1/3 0 1/3 | 1

It is clear that a suitable pivot is ā_{13} = 4/3, with pivot row ℓ = 1 and pivot column s = 3. The next tableau is given as

0 0 0 −1 −1 | 0
0 −3/4 1 3/4 −1/2 | 3/2
1 5/4 0 −1/4 1/2 | 1/2

Phase I of the algorithm has terminated; we obtain a basic feasible solution with x1 = 1/2 and x3 = 3/2. Now we proceed to Phase II, starting from the feasible basis $B = \begin{pmatrix} 2 & 2\\ 3 & 1\end{pmatrix}$ corresponding to the basic variables x1 and x3. The initial tableau of this phase is given below

1 1 1 | 0
0 −3/4 1 | 3/2
1 5/4 0 | 1/2

By subtracting the sum of the second and the third rows from the first row, to make zero all components of the first row which correspond to the basic variables, we obtain a new tableau

0 1/2 0 | −2
0 −3/4 1 | 3/2
1 5/4 0 | 1/2

It is clear that this basic feasible solution is not optimal. The pivot element is ā_{22} = 5/4. Making this column equal to (0, 0, 1)^T by row operations, we deduce the next tableau

−2/5 0 0 | −11/5
3/5 0 1 | 9/5
4/5 1 0 | 2/5

At this iteration the reduced cost vector is negative, and so the algorithm terminates.
The solution x1 = 0, x2 = 2/5 and x3 = 9/5 is optimal.

Degeneracy

As we experienced in Example 3.1.6, when a basis is degenerate there is a possibility that a new basis is also degenerate, that the value of the objective function does not change, and that even if the basis changes, the associated solution remains unchanged. A sequence of such degenerate feasible solutions may produce no increase in the value of the objective function and cause cycling; the simplex algorithm then never terminates. To avoid cycling, there are techniques to modify the algorithm. We cite below two that are frequently used. The first one is Bland's rule, which tightens the pivot choice. It consists of selecting the pivotal column a_s with the smallest subscript among those with strictly positive reduced cost, and selecting the pivotal row corresponding to the basic variable with the lowest subscript among the rows i having the same minimum ratio b̄_i/ā_{is} equal to x̂_s. With this rule the simplex method cannot cycle and hence is finite (a code sketch follows below).
The second technique consists of perturbing the constraints, replacing Ax = b by Ax = b + A(ε, ε², · · · , εⁿ)^T with a small ε > 0. For a certain range of ε, a degenerate basis B will be non-degenerate for the perturbed system and leads to a new basic feasible solution. This way the simplex method avoids cycling and terminates in a finite number of iterations too.
It is worthwhile noticing that rigorous techniques to overcome cycling require additional operations at each iteration and become costly when solving very large problems. On the other hand, experience indicates that cycling in the simplex algorithm is very rare. Therefore most commercial codes apply the method without paying any attention to it.
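For completeness, here is a sketch (not from the book) of the pivot selection under Bland's rule, written so that it could replace the entering/leaving choice in a simplex loop such as the one given after Theorem 3.3.1; the tolerance 1e-12 is an implementation choice.

```python
# Sketch (not from the book): Bland's anti-cycling pivot rule.
# nonbasic: subscripts of the nonbasic variables; c_bar: their reduced costs
# (same order); A_bar = B^{-1}N with columns in the same order; b_bar = B^{-1}b;
# basis: subscripts of the basic variables, one per row.
def bland_pivot(nonbasic, c_bar, A_bar, b_bar, basis):
    # entering variable: smallest subscript with strictly positive reduced cost
    enter_pos = min((p for p in range(len(nonbasic)) if c_bar[p] > 1e-12),
                    key=lambda p: nonbasic[p])
    a_s = A_bar[:, enter_pos]
    # minimum ratio test
    rows = [i for i in range(len(b_bar)) if a_s[i] > 1e-12]
    t = min(b_bar[i] / a_s[i] for i in rows)
    # leaving variable: among the tied rows, the smallest basic subscript
    leave_row = min((i for i in rows if b_bar[i] / a_s[i] <= t + 1e-12),
                    key=lambda i: basis[i])
    return nonbasic[enter_pos], leave_row
```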

The primal-dual method


The primal-dual method is based on the duality relations between the primal and the dual problems. Namely, if a feasible solution y of the dual problem is known, and if we succeed in finding a feasible solution x of the primal problem such that the complementary slackness condition ⟨A^Ty − c, x⟩ = 0 holds, then x is an optimal solution of (LP) and y is an optimal solution of (LD). In order to find a feasible solution of (LP) that satisfies the complementary slackness condition, we invoke the auxiliary problem (3.5) with the additional constraint x_i = 0 for all i outside of the active index set I(y).
Thus, given a feasible solution y of (LD), we consider the restricted problem, denoted (P_y),

maximize $\left\langle d, \begin{pmatrix} x\\ y\end{pmatrix}\right\rangle$
subject to Ax + y = b
x ≥ 0, y ≥ 0
x_i = 0, i ∉ I(y),

where d = (0, · · · , 0, −1, · · · , −1)^T, and its dual, denoted (D_y),

minimize ⟨b, z⟩
subject to ⟨a^i, z⟩ ≥ 0, i ∈ I(y)
z ≥ (−1, ..., −1)^T.

Here is a relationship between solutions of the primal and dual problems and their
restricted problems.

Theorem 3.3.6 Let (x⁰, y⁰) be an optimal solution of the restricted problem associated with a feasible solution y of the dual problem (LD), and let B be a non-degenerate basis associated with this optimal solution. The following statements hold.
(i) If y⁰ = 0, then x⁰ is an optimal solution of (LP) and y is an optimal solution of (LD).
(ii) If y⁰ ≠ 0 and if the feasible solution z⁰ = (B^{-1})^Td_B of the dual problem (D_y) satisfies

⟨a^i, z⁰⟩ ≥ 0 for all i ∉ I(y), (3.6)

then the primal problem (LP) is infeasible.
(iii) If y⁰ ≠ 0 and (3.6) does not hold, then the vector ŷ = y + tz⁰ with

t = min{ (c_i − ⟨a^i, y⟩)/⟨a^i, z⁰⟩ : ⟨a^i, z⁰⟩ < 0 }

is a feasible solution of (LD) satisfying ⟨b, ŷ⟩ < ⟨b, y⟩.

Proof For the first statement we observe that x⁰ is a feasible solution of (LP) because y⁰ = 0. Moreover, for active indices i ∈ I(y) we have

⟨a^i, y⟩ − c_i = 0,

while for inactive indices j ∉ I(y) the components x⁰_j are zero. Consequently,

$$\langle A^Ty - c, x^0\rangle = \sum_{i=1}^n\big(\langle a^i, y\rangle - c_i\big)x_i^0 = 0.$$

By Theorem 3.2.3, x⁰ is an optimal solution of (LP) and y is an optimal solution of (LD).
For (ii) we set

y_t = y + tz⁰ for t ≥ 0.

We make two observations. First, for every positive number t, the vector y_t is a feasible solution of (LD). In fact, by hypothesis A^Tz⁰ ≥ 0, and hence

A^Ty_t = A^Ty + tA^Tz⁰ ≥ A^Ty ≥ c.

Second, the optimal values of the primal restricted problem (P_y) and its dual (D_y) being equal, we deduce from Theorem 3.1.4 and Corollary 3.2.5 that

⟨b, z⁰⟩ = −y⁰_1 − · · · − y⁰_m < 0.

It follows that

lim_{t→∞} ⟨b, y_t⟩ = −∞.

Thus, the dual problem (LD) is unbounded below. In view of Theorem 3.2.3 the primal problem (LP) is infeasible.
To prove (iii), we find conditions on t such that y_t is a feasible solution of (LD). For those indices i with ⟨a^i, z⁰⟩ ≥ 0, the ith constraints are evidently satisfied when t is positive, because

⟨a^i, y_t⟩ = ⟨a^i, y⟩ + t⟨a^i, z⁰⟩ ≥ ⟨a^i, y⟩ ≥ c_i.

For those i with ⟨a^i, z⁰⟩ < 0, the ith constraints are satisfied if

t⟨a^i, z⁰⟩ ≥ c_i − ⟨a^i, y⟩.

It follows readily that the value t given in (iii) is the biggest one that makes ŷ feasible for the dual problem (LD). Finally, we have

⟨b, ŷ⟩ = ⟨b, y⟩ + t⟨b, z⁰⟩ < ⟨b, y⟩,

because t is strictly positive and ⟨b, z⁰⟩ is strictly negative. □


We shall now describe the primal-dual algorithm, assuming that a feasible solution y of the dual problem (LD) is at our disposal.
Step 1. Solve the restricted problem (P_y) associated with y. If the optimal value is zero, then stop: the vector x⁰, where $\begin{pmatrix} x^0\\ y^0\end{pmatrix}$ with y⁰ = 0 is an optimal solution of (P_y), is optimal for (LP), and y is optimal for (LD). Otherwise, go to the next step.
Step 2. Compute z⁰ = (B^{-1})^Td_B, where B is an optimal non-degenerate basis associated with the optimal solution $\begin{pmatrix} x^0\\ y^0\end{pmatrix}$ with y⁰ ≠ 0, and d_B is the basic component of the vector d determining the objective function of the restricted problem. If ⟨a^i, z⁰⟩ ≥ 0 for all i ∉ I(y), then stop: the primal problem (LP) is infeasible. Otherwise, go to the next step.
Step 3. Compute

t = min{ (c_i − ⟨a^i, y⟩)/⟨a^i, z⁰⟩ : ⟨a^i, z⁰⟩ < 0 }

and return to Step 1, replacing y by y + tz⁰.


A few useful comments on the implementation of the algorithm are in order. First, to solve the restricted primal problem one can start with the evident basis corresponding to the basic variables y_1, · · · , y_m. Remember that b ≥ 0 is assumed.
Second, if an index k realizes the minimum in the formula of t in Step 3, then it is also an active index at the new dual feasible solution y + tz⁰, because

$$\langle a^k, y + tz^0\rangle = \langle a^k, y\rangle + \frac{c_k - \langle a^k, y\rangle}{\langle a^k, z^0\rangle}\langle a^k, z^0\rangle = c_k.$$

Moreover, if a component x⁰_j of the optimal solution $\begin{pmatrix} x^0\\ y^0\end{pmatrix}$ obtained in Step 1 is strictly positive, then the complementary slackness ⟨a^j, z⁰⟩ = 0 implies

⟨a^j, y + tz⁰⟩ = ⟨a^j, y⟩ = c_j.

Hence j is an active index too.
Third, it follows from the preceding comment that when an index i is not active at the new feasible solution y + tz⁰, that is, ⟨a^i, y + tz⁰⟩ > c_i, then the corresponding ith component of x⁰ is zero. By this, the optimal solution $\begin{pmatrix} x^0\\ y^0\end{pmatrix}$ is feasible for the new restricted problem (P_{y+tz⁰}).
Finally, we may claim that under non-degeneracy the algorithm terminates after a finite number of iterations. Indeed, at each iteration we arrive either at the infeasibility of (LP), or at a new basic feasible solution of the dual problem (LD) whose value strictly decreases. Since the number of bases of the dual problem is finite, the algorithm is finite too.

Example 3.3.7 We consider the problem

maximize −x1 − x2 − x3
subject to x1 + 2x2 + x3 = 2
2x1 + 3x2 + x3 = 3
x1, x2, x3 ≥ 0,

and its dual problem

minimize 2y1 + 3y2
subject to y1 + 2y2 ≥ −1
2y1 + 3y2 ≥ −1
y1 + y2 ≥ −1.

An evident feasible solution of the dual problem is y = (1, 0)^T, whose active index set I(y) is empty. The restricted primal problem associated with this solution is the following:

maximize −z1 − z2
subject to x1 + 2x2 + x3 + z1 = 2
2x1 + 3x2 + x3 + z2 = 3
x1 = x2 = x3 = 0
z1, z2 ≥ 0.

Since the latter problem has only one feasible solution, its optimal value is strictly negative. The solution x⁰ = (0, 0, 0)^T is not feasible for (LP). The optimal basis corresponding to the basic variables z1 and z2 is the identity matrix $B = \begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}$, and the basic component of the objective vector is d_B = (−1, −1)^T. By Corollary 3.2.5, the vector u⁰ = (B^{-1})^Td_B = d_B is a feasible solution of the dual problem of the restricted primal problem. We compute the value of t given by the formula in Step 3 of the primal-dual algorithm, which is exactly the maximal t such that

$$\begin{pmatrix} 1 & 2\\ 2 & 3\\ 1 & 1\end{pmatrix}\left(\begin{pmatrix} 1\\ 0\end{pmatrix} + t\begin{pmatrix} -1\\ -1\end{pmatrix}\right) \ge \begin{pmatrix} -1\\ -1\\ -1\end{pmatrix},$$

which yields t = 3/5. The new feasible solution of the dual problem (LD) for the next iteration is $y + tu^0 = \begin{pmatrix} 2/5\\ -3/5\end{pmatrix}$. Its active index set is I(y + tu⁰) = {2}. The restricted primal problem associated with it is now

maximize −z1 − z2
subject to x1 + 2x2 + x3 + z1 = 2
2x1 + 3x2 + x3 + z2 = 3
x1 = x3 = 0
x2, z1, z2 ≥ 0.

Solving this problem by the simplex method, we have the initial tableau without the variables x1 and x3, which are zero:

0 −1 −1 | 0
2 1 0 | 2
3 0 1 | 3

Using the basis corresponding to the basic variables z1 and z2, the tableau becomes

5 0 0 | 5
2 1 0 | 2
3 0 1 | 3

The pivot ā_{21} = 3 leads to the final tableau

0 0 −5/3 | 0
0 1 −2/3 | 0
1 0 1/3 | 1

which indicates that the solution x1 = 0, x2 = 1, x3 = 0, z1 = 0, z2 = 0 is optimal with the optimal value equal to zero. According to the algorithm the solution x1 = 0, x2 = 1, x3 = 0 is feasible for (LP), and hence it is an optimal solution of it. The dual solution y1 = 2/5, y2 = −3/5 is an optimal solution of the dual problem (LD).
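A numerical check of this example (an illustration, not part of the book): the primal and dual values coincide and complementary slackness holds.

```python
# Sketch (not from the book): verifying the optimal pair of Example 3.3.7.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0, 1.0], [2.0, 3.0, 1.0]])
b = np.array([2.0, 3.0])
c = np.array([-1.0, -1.0, -1.0])

res = linprog(-c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)
x = res.x                                    # (0, 1, 0)
y = np.array([2.0 / 5.0, -3.0 / 5.0])        # dual solution found above
print(x, c @ x, b @ y)                       # equal optimal values: -1 and -1
print((A.T @ y - c) @ x)                     # complementary slackness: 0
```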
Part II
Theory
Chapter 4
Pareto Optimality

In a multi-dimensional Euclidean space there are several ways to classify elements


of a given set of vectors. The componentwise order relation introduced in the very
beginning of the second chapter seems to be the most appropriate for this classifica-
tion purpose and leads to the concept of Pareto optimality or efficiency, a cornerstone
of multiobjective optimization that we are going to study in the present chapter.

4.1 Pareto Maximal Points

In the space R^k with k > 1 the componentwise order x ≧ y signifies that each component of x is greater than or equal to the corresponding component of y. Equivalently, x ≧ y if and only if the difference vector x − y has non-negative components only. This order is not complete in the sense that not every couple of vectors is comparable, and hence the usual notion of maximum or minimum does not apply. We recall also that x > y means that all components of the vector x − y are strictly positive, and x ≥ y signifies x ≧ y and x ≠ y. The following definition lays the basis for our study of multiobjective optimization problems.

Definition 4.1.1 Let Q be a nonempty set in R^k. A point y ∈ Q is said to be a (Pareto) maximal point of the set Q if there is no point y′ ∈ Q such that y′ ≧ y and y′ ≠ y. It is said to be a (Pareto) weakly maximal point if there is no y′ ∈ Q such that y′ > y.

The sets of maximal points and weakly maximal points of Q are denoted respectively Max(Q) and WMax(Q) (Figs. 4.1 and 4.2). They are traditionally called the efficient and weakly efficient sets, or the non-dominated and weakly non-dominated sets, of Q. The sets of minimal points Min(Q) and weakly minimal points WMin(Q) are defined in a similar manner. When no confusion between maximal and minimal elements is likely to occur, the sets Min(Q) and WMin(Q) are also called the efficient and weakly efficient sets of Q. The terminology of efficiency is advantageous in certain circumstances


Fig. 4.1 Max and Min
Fig. 4.2 WMax and WMin

in which we deal simultaneously with maximal points of a set as introduced above


and maximal elements of a family of subsets which are defined to be maximal with
respect to inclusion. Thus, given a convex polyhedron, a face of it is efficient if it
consists of maximal points only. When we refer to a maximal efficient face, it is
understood that that face is efficient and maximal by inclusion which means that no
efficient face of the polyhedron contains it as a proper subset. In some situations one
is interested in an ideal maximal point (called also a utopia point), which is defined
to be a point y ∈ Q that satisfies

y  y  for all y  ∈ Q.

Such a point is generally unattainable, and if it exists it is unique and denoted by IMax(Q) (Fig. 4.3).

Fig. 4.3 IMax

Geometrically, a point y of Q is an efficient (maximal) point if the intersection of the set Q with the positive orthant shifted at y consists of y only, that is,

Q ∩ (y + R^k_+) = {y},

and it is weakly maximal if the intersection of Q with the interior of the positive orthant shifted at y is empty, that is,

Q ∩ (y + int(R^k_+)) = ∅.

Of course, maximal points are weakly maximal; the converse is not true in general. Here are some examples in R².

Example 4.1.2 Let Q be the triangle of vertices $a = \begin{pmatrix} 0\\ 0\end{pmatrix}$, $b = \begin{pmatrix} 1\\ 0\end{pmatrix}$ and $c = \begin{pmatrix} 0\\ 1\end{pmatrix}$ in R². Then Max(Q) = WMax(Q) = [b, c], Min(Q) = {a} and WMin(Q) = [a, b] ∪ [a, c].
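For a finite set of points, Definition 4.1.1 can be tested by direct enumeration. The sketch below (not from the book) marks the maximal points of a list of vectors; on the vertices of Example 4.1.2 it keeps b and c and discards a.

```python
# Sketch (not from the book): Pareto maximal points of a finite set.
# A point is maximal if no other point dominates it componentwise with at
# least one strict inequality (Definition 4.1.1).
import numpy as np

def pareto_maximal(points):
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, y in enumerate(pts):
        dominated = any(np.all(z >= y) and np.any(z > y)
                        for j, z in enumerate(pts) if j != i)
        if not dominated:
            keep.append(i)
    return keep

# vertices of the triangle of Example 4.1.2: b and c are maximal, a is not
print(pareto_maximal([[0, 0], [1, 0], [0, 1]]))   # -> [1, 2]
```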

Example 4.1.3 Let Q be the polyhedron in the space R³ determined by the two inequalities

y2 + y3 ≧ 0
y3 ≧ 0.

Then Max(Q) = WMax(Q) = ∅, Min(Q) = ∅ and WMin(Q) = Q \ int(Q).

Existence of Pareto maximal points

As we have already seen in Example 4.1.3, a polyhedron may have no weakly maximal points. This happens when some components of elements of the set are unbounded above. Positive functionals provide an easy test for such situations.

Theorem 4.1.4 Let Q be a nonempty set and let λ be a nonzero vector in R^k. Assume that y ∈ Q is a maximizer of the functional ⟨λ, ·⟩ on Q. Then
(i) y is a weakly maximal point of Q if λ is a positive vector;
(ii) y is a maximal point of Q if either λ is a strictly positive vector, or λ is a positive vector and y is the unique maximizer.
In particular, if Q is a nonempty compact set, then it has a maximal point.
Proof Assume λ is a nonzero positive vector. If y were not weakly maximal, then there would exist another vector y′ in Q such that the vector y′ − y is strictly positive. This would yield ⟨λ, y′⟩ > ⟨λ, y⟩, a contradiction.
Now, if λ is strictly positive, then for any y′ ≧ y with y′ ≠ y one has ⟨λ, y′⟩ > ⟨λ, y⟩ as well. Hence y is a Pareto maximal point of Q.
When λ is positive (not necessarily strictly positive) and not zero, the above inequality is not strict. Actually, we have equality because y is a maximizer. But in that case y′ is also a maximizer of the functional ⟨λ, ·⟩ on Q, which contradicts the hypothesis.
When Q is compact, any strictly positive vector λ produces a maximizer on Q, hence a Pareto maximal point too. □
Maximizers of the functional ⟨λ, ·⟩ with λ positive, but not strictly positive, may produce no maximal points, as seen in the following example.

Example 4.1.5 Consider the set Q in R³ consisting of the vectors x = (x1, x2, x3)^T with x3 ≤ 0. Choose λ = (0, 0, 1)^T. Then every element x of Q with x3 = 0 is a maximizer of the functional ⟨λ, ·⟩ on Q; hence it is weakly maximal, but not maximal, for the set Q has no maximal element.
Given a reference point a in the space, the set of all elements of a set Q that are
bigger than the point a forms a dominant subset, called a section of Q at a. The
lemma below shows that maximal elements of a section are also maximal elements
of the given set.
Lemma 4.1.6 Let Q be a nonempty set in R^k. Then for every point a in R^k one has

Max(Q ∩ (a + R^k_+)) ⊆ Max(Q),
WMax(Q ∩ (a + R^k_+)) ⊆ WMax(Q).

Proof Let y be a Pareto maximal point of the section Q ∩ (a + R^k_+). If y were not maximal, then one would find some y′ in Q such that y′ ≧ y and y′ ≠ y. It would follow that y′ belongs to the section Q ∩ (a + R^k_+), yielding a contradiction. The second inclusion is proven by the same argument. □
For convex polyhedra, the existence of maximal points is characterized by the position of the asymptotic directions with respect to the positive orthant of the space.

Theorem 4.1.7 Let Q be a convex polyhedron in R^k. The following assertions hold.
(i) Q has maximal points if and only if

Q∞ ∩ R^k_+ = {0}.

(ii) Q has weakly maximal points if and only if

Q∞ ∩ int(R^k_+) = ∅.

In particular, every polytope has a maximal vertex.

Proof Let y be a maximal point of Q and let v be any nonzero asymptotic direction of Q. Since y + v belongs to Q and Q ∩ (y + R^k_+) = {y}, we deduce that v does not belong to R^k_+. Conversely, assume Q has no nonzero positive asymptotic direction. Then for a fixed vector y in Q the section Q ∩ (y + R^k_+) is bounded; otherwise any nonzero asymptotic direction of that closed convex intersection, which exists due to Corollary 2.3.16, would be a positive asymptotic vector of Q. In view of Theorem 4.1.4 the compact section Q ∩ (y + R^k_+) possesses a maximal point, hence, in view of Lemma 4.1.6, so does Q.
For the second assertion, the same argument as above shows that when Q has a weakly maximal point, no asymptotic direction of it is strictly positive. For the converse part, by the hypothesis we know that Q∞ and R^k_+ are two convex polyhedra without relative interior points in common. Hence, in view of Theorem 2.3.10 there is a nonzero vector λ ∈ R^k separating them, that is,

⟨λ, v⟩ ≤ ⟨λ, d⟩ for all v ∈ Q∞ and d ∈ R^k_+.

In particular, taking v = 0 and letting d run over the coordinate unit vectors, we deduce from the above relation that λ is positive. Moreover, the linear function ⟨λ, ·⟩ is then non-positive on every asymptotic direction of Q. We apply Theorem 3.1.1 to obtain a maximum of ⟨λ, ·⟩ on Q. In view of Theorem 4.1.4 that maximum is a weakly maximal point of Q.
Finally, if Q is a polytope, then its asymptotic cone is trivial. Hence, by the first assertion, it has maximal points. To prove that it has a maximal vertex, choose any strictly positive vector λ ∈ R^k and consider the linear problem of maximizing ⟨λ, ·⟩ over Q. In view of Theorem 3.1.3 the optimal solution set contains a vertex, which, by Theorem 4.1.4, is also a maximal vertex of Q. □

In Example 4.1.5 a positive functional ⟨λ, ·⟩ was given on a polyhedron, none of whose maximizers is a maximal point. This, however, is impossible when the polyhedron has maximal elements.

Corollary 4.1.8 Assume that Q is a convex polyhedron and λ is a nonzero positive vector in R^k. If Q has a maximal point and the linear functional ⟨λ, ·⟩ has maximizers on Q, then among its maximizers there is a maximal point of Q.

Proof Let us denote by Q₀ the nonempty intersection of Q with the hyperplane {y ∈ R^k : ⟨λ, y⟩ = d}, where d is the maximum of ⟨λ, ·⟩ on Q. It is a convex polyhedron. Since Q has maximal elements, in view of Theorem 4.1.7 one has Q∞ ∩ R^k_+ = {0}, which implies that (Q₀)∞ ∩ R^k_+ = {0} too. By the same theorem, Q₀ has a maximal element, say y₀. We show that this y₀ is also a maximal element of Q. Indeed, if not, one could find some y ∈ Q such that y ≧ y₀ and y ≠ y₀. Since λ is positive, we deduce that ⟨λ, y⟩ ≥ ⟨λ, y₀⟩ = d. Moreover, y cannot belong to Q₀, because y₀ is maximal in Q₀; hence this inequality must be strict, which contradicts the definition of d as the maximum. □
We say a set Q in the space R^k has the domination property if its elements are dominated by maximal elements, that is, for every y ∈ Q there is some maximal element a of Q such that a ≧ y. The weak domination property refers to domination by weakly maximal elements.

Corollary 4.1.9 A convex polyhedron has the domination property (respectively, the weak domination property) if and only if it has maximal elements (respectively, weakly maximal elements).

Proof The "only if" part is clear. Assume a convex polyhedron Q has maximal elements. In view of Theorem 4.1.7, the asymptotic cone of Q has no nonzero vector in common with the positive orthant R^k_+. Hence neither does the asymptotic cone of the section of Q at any given point a ∈ Q. Again by Theorem 4.1.7 that section has maximal points; these dominate a and, by Lemma 4.1.6, they are maximal points of Q. Hence Q has the domination property. The weak domination property is proven by the same argument. □
We learned in Sect. 2.3 how to compute the normal cone at a given point of a
polyhedron. It turns out that by looking at the normal directions it is possible to say
whether a given point is maximal or not.
Theorem 4.1.10 Let Q be a convex polyhedron in R^k. The following assertions hold.
(i) y ∈ Q is a maximal point if and only if the normal cone N_Q(y) to Q at y contains a strictly positive vector.
(ii) y ∈ Q is a weakly maximal point if and only if the normal cone N_Q(y) to Q at y contains a nonzero positive vector.

Proof Let y be a point in Q. If the normal cone to Q at y contains a strictly positive vector, say λ, then by the definition of normal vectors the functional ⟨λ, ·⟩ attains its maximum on Q at y. In view of Theorem 4.1.4, y is a maximal point of Q. The proof of the "only if" part of (i) is based on Farkas' theorem. We assume that y is a maximal point of Q and suppose to the contrary that the normal cone to Q at that point has no vector in common with the interior of the positive orthant R^k_+. We may assume that Q is given by a system of inequalities

⟨a^i, z⟩ ≤ b_i, i = 1, · · · , m. (4.1)

The active index set at y is denoted I(y). By Theorem 2.3.24, the normal cone to Q at y is the positive hull of the vectors a^i, i ∈ I(y). Its empty intersection with int(R^k_+) means that the following system has no solution:

A_{I(y)}λ ≧ e,
λ ≧ 0,

where A_{I(y)} denotes the matrix whose columns are a^i, i ∈ I(y), and e is the vector whose components are all equal to one. By introducing artificial variables z, the above system is equivalent to the system

$$\begin{pmatrix} A_{I(y)} & -I\end{pmatrix}\begin{pmatrix} \lambda\\ z\end{pmatrix} = e,\quad \lambda \ge 0,\quad z \ge 0.$$

Apply Farkas' theorem (Theorem 2.2.3) to obtain a nonzero positive vector v such that

⟨a^i, v⟩ ≤ 0 for all i ∈ I(y).

The inequalities (4.1) corresponding to the inactive indices at y being strict, we may find a strictly positive number t such that

⟨a^i, y + tv⟩ ≤ b_i for all i = 1, · · · , m.

In other words, the point y + tv belongs to Q. Moreover, y + tv ≧ y and y + tv ≠ y, which contradicts the hypothesis. This proves (i).
As to the second assertion, the "if" part is clear; again, Theorem 4.1.4 is in use. For the converse part, we proceed the same way as in (i). The fact that the intersection of N_Q(y) with the positive orthant R^k_+ consists of the zero vector only means that the system

A_{I(y)}λ ≧ 0,
λ ≧ 0

has no nonzero solution. Applying Corollary 2.2.5 we deduce the existence of a strictly positive vector v such that

⟨a^i, v⟩ ≤ 0 for all i ∈ I(y).

Then, as before, the vector y + tv with t > 0 sufficiently small belongs to Q and y + tv > y, which is a contradiction. □

Example 4.1.11 Consider a convex polyhedron Q in R³ determined by the system

$$\begin{pmatrix} 1 & 1 & 1\\ 0 & 1 & 1\\ 1 & 0 & 1\\ 0 & 0 & -1\\ 0 & 0 & 1\end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3\end{pmatrix} \le \begin{pmatrix} 1\\ 1\\ 1\\ 0\\ 1\end{pmatrix}.$$

We analyze the point y = (1/3, 1/3, 1/3)^T ∈ Q. Its active index set is I(y) = {1}. By Theorem 2.3.24 the normal cone to Q at that point is generated by the vector (1, 1, 1)^T. According to Theorem 4.1.10 the point y is a maximal point of Q. Now take another point of Q, say z = (−1, 0, 1)^T. Its active index set consists of the two indices 2 and 5. The normal cone to Q at z is generated by the two directions (0, 1, 1)^T and (0, 0, 1)^T. It is clear that this normal cone contains no strictly positive vector, hence the point z is not a maximal point of Q; indeed z ≤ (0, 0, 1)^T. It is a weakly maximal point, however, because the normal directions at z are positive. Finally, we choose the point w = (0, 0, 0)^T in Q. Its active index set is I(w) = {4}. The normal cone to Q at w is the cone generated by the direction (0, 0, −1)^T. This cone contains no positive vector, hence the point w is not weakly maximal. This can also be seen from the fact that w is strictly dominated by y.

Scalarizing vectors

In the remainder of this section we shall use the terminology of efficient points instead of (Pareto) maximal points in order to avoid possible confusion with the concept of a maximal element of a family of sets by inclusion. Given a family {A_i : i ∈ I} of sets, we say that A_{i₀} is maximal (respectively, minimal) if there is no element A_i of the family such that A_i ≠ A_{i₀} and A_{i₀} ⊂ A_i (respectively, A_{i₀} ⊃ A_i). Another formulation of Theorem 4.1.10 is obtained by maximizing linear functionals on the set Q.
Corollary 4.1.12 Let Q be a convex polyhedron in R^k. Then the following statements hold.
(i) y ∈ Q is an efficient point if and only if there is a strictly positive vector λ ∈ R^k such that y maximizes the functional ⟨λ, ·⟩ on Q.
(ii) y ∈ Q is a weakly efficient point if and only if there is a nonzero positive vector λ ∈ R^k such that y maximizes the functional ⟨λ, ·⟩ on Q.

Proof This is immediate from the definition of normal cones and from Theorem 4.1.10. □

The vector λ mentioned in this corollary is called a scalarizing vector (or a weakly scalarizing vector in (ii)) of the set Q. We remark that not every strictly positive vector is a scalarizing vector of Q, just as not every strictly positive functional attains its maximum on Q. Moreover, an efficient point of Q may be a maximizer for several linearly independent scalarizing vectors, and vice versa, a scalarizing vector may determine several maximizers on Q. For a given polyhedron Q that has efficient elements, the question of how to choose a vector λ so that the functional associated with it furnishes a maximizer is not evident. An analytical choice of positive directions such as the one discussed in Example 4.1.11 is conceivable and will be given in detail later. Random generating methods or uniform divisions of the standard simplex do not work in many instances. In fact, look at the simple problem of finding efficient points of the convex polyhedral set given by the inequality

x1 + √2 x2 ≤ 1

in the two-dimensional space R². Except for one direction (the one proportional to (1, √2)^T), every positive vector λ leads to a linear problem of maximizing ⟨λ, x⟩ over that polyhedron with unbounded objective. Hence, using positive vectors λ_i of a uniform partition

λ_i = (i/p)(1, 0)^T + ((p − i)/p)(0, 1)^T

of the simplex [(1, 0)^T, (0, 1)^T] of the space R², for whatever positive integer p, will never generate efficient points of the set.

Any nonzero positive vector of the space R^k is a positive multiple of a vector from the standard simplex Δ. This, combined with Corollary 4.1.12, yields the following equalities:

$$\mathrm{Max}(Q) = \bigcup_{\lambda\in \mathrm{ri}\,\Delta} \mathrm{argmax}_Q\langle\lambda, \cdot\rangle,$$
$$\mathrm{WMax}(Q) = \bigcup_{\lambda\in\Delta} \mathrm{argmax}_Q\langle\lambda, \cdot\rangle,$$

where argmax_Q⟨λ, ·⟩ is the set of all maximizers of the functional ⟨λ, ·⟩ on Q. Given a point y ∈ Q, denote

Δ_y = {λ ∈ Δ : y ∈ argmax_Q⟨λ, ·⟩},
Δ_Q = ∪_{y∈Q} Δ_y.

The set Δ_Q is called the weakly scalarizing set of Q and Δ_y is the weakly scalarizing set of Q at y (Fig. 4.4). By Corollary 4.1.12 the set Δ_y is nonempty if and only if the point y is a weakly efficient element of Q. Hence when Q has weakly efficient points, the set Δ_Q can be expressed as

Δ_Q = ∪_{y∈WMax(Q)} Δ_y, (4.2)

in which every set Δ_y is nonempty. By definition a vector λ ∈ Δ belongs to Δ_y if and only if

⟨λ, y′ − y⟩ ≤ 0 for all y′ ∈ Q.

The latter inequality signifies that λ is a normal vector to Q at y, and so (4.2) becomes

Δ_Q = ∪_{y∈WMax(Q)} (N_Q(y) ∩ Δ).

Fig. 4.4 Scalarizing set at y

Let F = {F_1, · · · , F_q} be the collection of all faces of Q and let N(F_i) be the normal cone to F_i, which, by definition, is the normal cone to Q at a relative interior point of F_i. Since each element of Q is a relative interior point of some face, the decomposition (4.2) produces the following decomposition of Δ_Q:

Δ_Q = ∪_{i∈I} Δ_i, (4.3)

where Δ_i = N(F_i) ∩ Δ and I is the set of those indices i from {1, · · · , q} such that the faces F_i are weakly efficient. We note that when a face is not weakly efficient, the normal cone to it does not meet the simplex Δ. Remember that a face of Q is weakly efficient if all elements of it are weakly efficient elements of Q, or equivalently if a relative interior point of it is a weakly efficient element. A face that is not weakly efficient may still contain weakly efficient elements on its proper faces.
We say a face of Q is a maximal weakly efficient face if it is weakly efficient and no weakly efficient face of Q contains it as a proper subset. It is clear that when a convex polyhedron has weakly efficient elements, it does have maximal weakly efficient faces. Below we present some properties of the decompositions (4.2) and (4.3) of the weakly scalarizing set.

Lemma 4.1.13 If P and Q are convex polyhedra with P ∩ Q ≠ ∅, then there are faces P′ ⊆ P and Q′ ⊆ Q such that P ∩ Q = P′ ∩ Q′ and ri(P′) ∩ ri(Q′) ≠ ∅. Moreover, if the interior of Q is nonempty and contains some elements of P, then ri(P) ∩ int(Q) ≠ ∅ and ri(P ∩ Q) = ri(P) ∩ int(Q).

Proof Let x be a relative interior point of the intersection P ∩ Q. Let P′ ⊆ P and Q′ ⊆ Q be faces that contain x in their relative interiors. These faces meet the requirements of the lemma. Indeed, it suffices to show that every point y from P ∩ Q belongs to P′ ∩ Q′. Since x is a relative interior point of P ∩ Q, the segment [x − ε(x − y), x + ε(x − y)] belongs to that intersection when ε > 0 is sufficiently small. Moreover, as P′ is a face, this segment must lie in P′, which implies that y lies in P′. The same argument shows that y lies in Q′, proving the first part of the lemma.
For the second part it suffices to observe that P is the closure of its relative interior. Hence it has relative interior points inside the interior of Q. The last equality of the conclusion is then immediate. □

Theorem 4.1.14 The weakly scalarizing set Δ_Q is a polytope. Moreover, if Δ_Q is nonempty, the elements of the decompositions (4.2) and (4.3) are polytopes and satisfy the following conditions:
(i) If Δ_y = Δ_z for some weakly efficient elements y and z, then there is i ∈ I such that y, z ∈ F_i and Δ_y = Δ_z = Δ_i.
(ii) If F_i is a maximal weakly efficient face of Q, then Δ_i is a minimal element of the decomposition (4.3). Conversely, if the polytope Δ_i is minimal among the polytopes of the decomposition (4.3), then there is a maximal weakly efficient face F_j such that Δ_j = Δ_i.
(iii) For all i, j ∈ I with i ≠ j, one has either Δ_i = Δ_j or ri(Δ_i) ∩ ri(Δ_j) = ∅.
(iv) Let F_i and F_j be two weakly efficient adjacent vertices (zero-dimensional faces) of Q. Then the edge joining them is weakly efficient if and only if Δ_i ∩ Δ_j ≠ ∅.

Proof Since Δ_y is empty when y is not a weakly efficient point of Q, we may express Δ_Q as

Δ_Q = ∪_{y∈Q} Δ_y = ∪_{y∈Q} (N_Q(y) ∩ Δ) = N_Q ∩ Δ,

which proves that Δ_Q is a bounded polyhedron, because the normal cone N_Q is a polyhedral cone. Likewise, the sets Δ_y = N_Q(y) ∩ Δ and Δ_i = N(F_i) ∩ Δ are convex polytopes.
To establish (i) we apply Lemma 4.1.13 to the intersections Δ_y = N_Q(y) ∩ Δ and Δ_z = N_Q(z) ∩ Δ. There exist faces N ⊆ N_Q(y), M ⊆ N_Q(z) and Δ′_y, Δ′_z ⊆ Δ such that

N_Q(y) ∩ Δ = N ∩ Δ′_y, ri(N) ∩ ri(Δ′_y) ≠ ∅,
N_Q(z) ∩ Δ = M ∩ Δ′_z, ri(M) ∩ ri(Δ′_z) ≠ ∅.

Choose any vector ξ from the relative interior of Δ_y. Then it is also a relative interior vector of the faces N, M, Δ′_y and Δ′_z. This implies that N = M and Δ′_y = Δ′_z. Using Theorem 2.3.26 we find a face F_i of Q such that N(F_i) = N. Then F_i contains y and z and satisfies

Δ_i = N(F_i) ∩ Δ = N ∩ Δ = Δ_y = Δ_z.

For (ii) assume F_i is a maximal weakly efficient face and that Δ_j is a subset of Δ_i for some j ∈ I. We choose any vector ξ from Δ_j and consider the face F′ consisting of all maximizers of ⟨ξ, ·⟩ on Q. Then F′ is a weakly efficient face and contains F_j and F_i. As F_i is maximal, we must have F′ = F_i. Thus F_j ⊆ F_i and

Δ_i = N(F_i) ∩ Δ ⊆ N(F_j) ∩ Δ = Δ_j.

Conversely, let Δ_i be a minimal element among the polytopes Δ_j, j ∈ I. If F_i is a maximal weakly efficient face, we are done. If it is not, we find a maximal weakly efficient face F_j containing F_i. Then Δ_j = N(F_j) ∩ Δ ⊆ N(F_i) ∩ Δ = Δ_i, and Δ_j = Δ_i by hypothesis.
We proceed to (iii). Assume that the relative interior of Δ_i and the relative interior of Δ_j have a vector ξ in common. In view of Lemma 4.1.13 one can find four faces, N^i of N(F_i), N^j of N(F_j), and Δ′_i, Δ′_j of Δ, such that

N(F_i) ∩ Δ = N^i ∩ Δ′_i, ri(N^i) ∩ ri(Δ′_i) ≠ ∅,
N(F_j) ∩ Δ = N^j ∩ Δ′_j, ri(N^j) ∩ ri(Δ′_j) ≠ ∅.

According to Theorem 2.3.26 there are faces F_ℓ and F_m of Q which respectively contain F_i and F_j with N(F_ℓ) = N^i and N(F_m) = N^j. Then ξ is a relative interior vector of the faces N(F_ℓ), N(F_m), Δ′_i and Δ′_j. We deduce that the face Δ′_i coincides with Δ′_j, and F_ℓ coincides with F_m. Consequently, Δ_i = Δ_j.
To prove the last property we assume F_i and F_j are adjacent vertices (zero-dimensional faces) of Q. Let the one-dimensional face F_l be the edge joining them. According to Corollary 2.3.28 we have N(F_l) = N(F_i) ∩ N(F_j). Then Δ_l = Δ_i ∩ Δ_j, which shows that F_l is weakly efficient if and only if the latter intersection is nonempty. □

Note that two different faces of Q may have the same weakly scalarizing set.
For instance the singleton {(0, 0, 1)T } is the weakly scalarizing set for all weakly
efficient faces of the polyhedron R2+ × {0} in R3 .
In order to treat efficient elements of Q we need to work with the relative interior of Δ. The corresponding notations are set as follows:

Δ^r_Q = Δ_Q ∩ ri(Δ),
Δ^r_y = Δ_y ∩ ri(Δ),
Δ^r_i = Δ_i ∩ ri(Δ).

The set Δ^r_Q is called the scalarizing set of Q. It is clear that y ∈ Q is efficient if and only if Δ^r_y is nonempty, and it is weakly efficient but not efficient if and only if Δ_y is nonempty and lies on the border of Δ. The decompositions of the weakly scalarizing set induce the following decompositions of the scalarizing set:

Δ^r_Q = ∪_{y∈Max(Q)} Δ^r_y (4.4)

and

Δ^r_Q = ∪_{i∈I₀} Δ^r_i, (4.5)

where I₀ consists of those indices i from {1, · · · , q} for which the faces F_i are efficient.

Theorem 4.1.15 Assume that the scalarizing set Δ^r_Q is nonempty. Then

Δ_Q = cl(Δ^r_Q).

Moreover, the decompositions (4.4) and (4.5) of Δ^r_Q satisfy the following properties:
(i) If Δ^r_y = Δ^r_z for some efficient elements y and z, then there is s ∈ I₀ such that y, z ∈ ri(F_s) and Δ^r_y = Δ^r_z = Δ^r_s.
(ii) For i ∈ I₀ the face F_i is a maximal efficient face if and only if Δ^r_i is a minimal element of the decomposition (4.5).
(iii) For all i, j ∈ I₀ with i ≠ j, one has ri(Δ^r_i) ∩ ri(Δ^r_j) = ∅.
(iv) Let F_i and F_j be two efficient adjacent vertices (zero-dimensional efficient faces) of Q. Then the edge joining them is efficient if and only if Δ^r_i ∩ Δ^r_j ≠ ∅.

Proof Since the set Δ^r_Q is nonempty, the set Δ_Q does not lie on the border of Δ. Being a closed convex set, Δ_Q is the closure of its relative interior. Hence the relative interior of Δ_Q and the relative interior of Δ have at least one point in common, and we deduce

Δ_Q = Δ_Q ∩ Δ = cl(ri(Δ_Q ∩ Δ)) = cl(ri(Δ_Q) ∩ ri(Δ)) ⊆ cl(Δ_Q ∩ ri(Δ)) ⊆ cl(Δ^r_Q).

The converse inclusion being evident, we obtain the equality Δ_Q = cl(Δ^r_Q).


To prove (i) we apply the second part of Lemma 4.1.13 to obtain

ri(cone(Δ^r_y)) = ri(N_Q(y) ∩ R^k_+) = ri(N_Q(y)) ∩ int(R^k_+),
ri(cone(Δ^r_z)) = ri(N_Q(z) ∩ R^k_+) = ri(N_Q(z)) ∩ int(R^k_+).

If y and z were relative interior points of two different faces, in view of Theorem 2.3.26 we would have ri(N_Q(y)) ∩ ri(N_Q(z)) = ∅, which contradicts the hypothesis. Hence they are relative interior points of the same face, say F_s. By definition N(F_s) = N_Q(y), and we deduce Δ^r_s = Δ^r_y.
For (ii) assume F_i is a maximal efficient face. If for some j ∈ I₀ one has Δ^r_j ⊆ Δ^r_i, then by Lemma 4.1.13 there is some strictly positive vector that lies in the relative interior of the normal cone N(F_j) and in the normal cone N(F_i). We deduce that either F_i = F_j, or F_i is a proper face of F_j. The last case is impossible because F_j is also an efficient face and F_i is maximal. Conversely, if F_i is not maximal, then there is a face F_j that is efficient and contains F_i as a proper face. We have Δ_j ⊆ Δ_i. This inclusion is strict because the relative interiors of N(F_i) and N(F_j) do not meet each other. Thus, Δ_i is not minimal.
We proceed to (iii). If ri(Δ^r_i) ∩ ri(Δ^r_j) ≠ ∅, in view of Theorem 4.1.14 one has Δ_i = Δ_j, and hence Δ^r_i = Δ^r_j. By (i), there is some face that contains relative interior points of F_i and F_j in its relative interior. This implies F_i = F_j, a contradiction.
For the last property we know that the normal cone to the edge joining the vertices F_i and F_j satisfies N([F_i, F_j]) = N(F_i) ∩ N(F_j). Hence the edge [F_i, F_j] is efficient if and only if the normal cone to it meets the set ri(Δ), or equivalently, Δ^r_i ∩ Δ^r_j is nonempty. □

A practical way to compute the weakly scalarizing set, when the polyhedron Q is
given by a system of linear inequalities, is to solve a system of linear equalities and
inequalities.

Corollary 4.1.16 Assume the polyhedron Q in Rk is determined by the system

⟨a i , y⟩ ≤ bi , i = 1, · · · , m.

Then for every y ∈ Q, the set Δ y consists of all solutions z to the following system

z 1 + · · · + z k = 1
Σ_{i ∈ I (y)} αi a i = z
z i ≥ 0, i = 1, · · · , k,  αi ≥ 0, i ∈ I (y).

In particular the weakly scalarizing set Δ Q is the solution set to the above system
with I = {1, · · · , m}.

Proof According to Theorem 2.3.24 the normal cone to Q at y is the positive hull
of the vectors a i , i ∈ I (y). Hence the set Δ y is the intersection of the positive
hull of these vectors and the simplex Δ, which is exactly the solution set to the
system described in the corollary. For the second part of the corollary it suffices to
observe that the normal cone of Q is the polar cone of the asymptotic cone of Q
(Theorem 2.3.26) which, in view of Theorem 2.3.19, is the positive hull of the vectors
a i , i = 1, · · · , m. 

Example 4.1.17 Consider the polyhedron defined by

y1 − y2 − y3 ≤ 1
2y1 + y3 ≤ 0.

By Corollary 4.1.16 the weakly scalarizing set Δ Q is the solution set to the system

z 1 + z 2 + z 3 = 1
α1 + 2α2 = z 1
−α1 = z 2
−α1 + α2 = z 3
z 1 , z 2 , z 3 , α1 , α2 ≥ 0.

This produces a unique solution z with z 1 = 2/3, z 2 = 0 and z 3 = 1/3. Then Δ Q
consists of this solution only. The scalarizing set ΔrQ is empty, which shows that Q
has no efficient point. Its weakly efficient set is determined by the problem

maximize (2/3) y1 + (1/3) y3
subject to y ∈ Q.

It follows from the second inequality determining Q that the maximum value of the
objective function is zero and attained on the face given by 2y1 + y3 = 0 and
y1 − y2 − y3 ≤ 1.

In the next example we show how to compute the scalarizing set when the poly-
hedron is given by a system of equalities (see also Exercise 4.4.13 at the end of this
chapter).

Example 4.1.18 Let Q be a polyhedron in R3 determined by the system

y1 + y2 + y3 = 1
y1 − y2 = 0
y3 ≥ 0.

We consider the solution y = (1/2, 1/2, 0)T and want to compute the scalarizing set
at this solution if it exists. As the proof of Theorem 4.1.14 indicates, a vector λ ∈ Δ
in R3 is a weakly scalarizing vector of Q at y if and only if it is normal to Q at that
point. Since the last component of y is zero, a vector λ is normal to Q at y if and
only if there are real numbers α, β and a positive number γ such that

λ = α (1, 1, 1)T + β (1, −1, 0)T − γ (0, 0, 1)T .

We deduce λ ∈ Δ y if and only if

α + β ≥ 0
α − β ≥ 0
α − γ ≥ 0
(α + β) + (α − β) + (α − γ) = 1

and hence Δ y consists of vectors λ whose components satisfy 0 ≤ λ3 ≤ 1/3,
λ1 + λ2 = 1 − λ3 and λ1 , λ2 ≥ 0.
To obtain the scalarizing vectors, it suffices to choose λ as above with an additional
requirement that λi > 0 for i = 1, 2, 3.
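
The same computation can be confirmed numerically. The sketch below, under the same
scipy assumption as before, maximizes and minimizes λ3 over the set Δ y just
described; the variables are (α, β, γ), of which only γ is sign-restricted.

import numpy as np
from scipy.optimize import linprog

# lambda = alpha (1,1,1) + beta (1,-1,0) - gamma (0,0,1), gamma >= 0.
# Rows of L express lambda1, lambda2, lambda3 in terms of (alpha, beta, gamma).
L = np.array([[1, 1, 0], [1, -1, 0], [1, 0, -1]])
A_ub, b_ub = -L, np.zeros(3)                  # lambda >= 0
A_eq, b_eq = L.sum(axis=0)[None, :], [1.0]    # components of lambda sum to 1
bounds = [(None, None), (None, None), (0, None)]

lo = linprog(L[2], A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
hi = linprog(-L[2], A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(lo.fun, -hi.fun)   # 0.0 and 1/3, matching the bounds found by hand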

Structure of the set of efficient points


Given a convex polyhedron Q in the space Rk , the set of its efficient elements is not
simple. For instance, it is generally not convex, and an edge of it need not be
efficient even if its two end-points are efficient vertices. Despite this, the set
enjoys a number of nice properties, which we examine now.

Corollary 4.1.19 Let Q be a convex polyhedron in Rk . The following statements


hold.
(i) If a relative interior point of a face of Q is efficient or weakly efficient, then so
is every point of that face.
(ii) If Q has vertices, it has an efficient vertex (respectively a weakly efficient vertex)
provided that it has efficient (respectively weakly efficient) elements.

Proof Since the normal cone to Q at every point of a face contains the normal cone
at a relative interior point, the first statement follows directly from Theorem 4.1.10.
For the second statement let y be an efficient point of the polyhedron Q. By
Theorem 4.1.10 one can find a strictly positive vector λ such that y is a maximizer
of the linear functional ⟨λ, ·⟩ on Q. The face which contains y in its relative interior
maximizes the above functional. According to Corollary 2.3.14 there is a vertex of
Q inside that face and in view of Theorem 4.1.10 this vertex is an efficient point of
Q. The case of weakly efficient points is proven by the same argument. 

A subset P of Rk is called arcwise connected if for any pair of points y and z in
P, there are a finite number of points y 0 , · · · , y ℓ in P such that y 0 = y, y ℓ = z and
the segments [y i , y i+1 ], i = 0, · · · , ℓ − 1, all lie in P.

Theorem 4.1.20 The sets of all efficient points and weakly efficient points of a convex
polyhedron consist of faces of the polyhedron and are closed and arcwise connected.

Proof By analogy, it suffices to prove the theorem for the efficient set. According to
Corollary 4.1.19, if a point ȳ in Q is efficient, then the whole face containing ȳ in
its relative interior is a face of efficient elements. Hence, Max(Q) consists of faces
of Q if it is nonempty. Moreover, as faces are closed, their union is a closed set.
Now we prove the connectedness of this set by assuming that Q has efficient
elements. Let y and z be any pair of efficient points of Q. We may assume without
loss of generality that y is a relative interior point of a face Q y and z is a relative
interior point of a face Q z . Consider the decomposition (4.5) of the scalarizing set

ΔrQ . For a face F of Q, the scalarizing set N (F) ∩ ri(Δ) is denoted by Δri(F) . Let λ y
be a relative interior point of the set Δri(Q y ) and λz a relative interior point of Δri(Q z ) .
Then the segment joining λ y and λz lies in ΔrQ . The decomposition of the latter set
induces a decomposition of the segment [λ y , λz ] by [λi , λi+1 ], i = 0, · · · , ℓ − 1,
where λ0 = λ y , λℓ = λz . Let Q 1 , · · · , Q ℓ be faces of Q such that

[λ j , λ j+1 ] ⊆ Δri(Q j+1 ) , j = 0, · · · , ℓ − 1.

For every j, we choose a relative interior point y j of the face Q j . Then λ j belongs
to the normal cones to Q at y j and y j+1 . Consequently, the points y j and y j+1 lie
in the face argmax Q ⟨λ j , ·⟩ and so does the segment joining them. As λ j ∈ ΔrQ , by
Theorem 4.1.10 the segment [y j , y j+1 ] consists of efficient points of Q. Moreover,
as the vector λ0 belongs to the normal cones to Q at y and at y 1 , we conclude that
the segment [y, y 1 ] is composed of efficient points of Q. Similarly we have that
[y ℓ−1 , z] lies in the set Max(Q). Thus, the union [y, y 1 ] ∪ [y 1 , y 2 ] ∪ · · · ∪ [y ℓ−1 , z]
forms a path of efficient elements joining y and z. This completes the proof. 

We know that every efficient point of a convex polyhedron is contained in a
maximal efficient face. Hence the set of efficient points is the union of maximal
efficient faces. The dimension of a maximal efficient face may vary from zero to k − 1.

Corollary 4.1.21 Let Q be a convex polyhedron in Rk . The following statements


hold.
(i) Q has a zero-dimensional maximal efficient face if and only if its efficient set is
a singleton.
(ii) Every (k − 1)-dimensional efficient face of Q, if any exists, is maximal. In
particular in the two dimensional space R2 every efficient edge of Q is maximal
if the efficient set of Q consists of more than two elements.
(iii) An efficient face F of Q is maximal if and only if the restriction of the decom-
position of ΔrQ on Δ F consists of one element only.

Proof The first statement follows from the arcwise connectedness of the efficient
set of Q. In Rk a proper face of Q is of dimension at most k − 1. Moreover, a k-
dimensional polyhedron cannot be efficient, for its interior points are not maximal.
Hence, if the dimension of an efficient face is equal to k − 1, it is maximal.
The last statement follows immediately from Theorem 4.1.15. 

Example 4.1.22 Let Q be a polyhedron in R3 defined by the system

x1 + x3 ≤ 1
x2 + x3 ≤ 1
x1 , x2 , x3 ≥ 0.

Since Q is bounded, it is evident that the weakly scalarizing set Δ Q is the whole
standard simplex Δ and the scalarizing set is the relative interior of Δ. Denote

q 1 = (1, 0, 0)T , q 2 = (0, 1, 0)T , q 3 = (0, 0, 1)T , q 4 = (1/2, 0, 1/2)T , q 5 = (0, 1/2, 1/2)T .

Applying Corollary 4.1.16 we obtain the following decomposition of ΔrQ :


(i) ri[q 4 , q 5 ] is the scalarizing set of the face determined by the equalities x1 +x3 =
1 and x2 + x3 = 1;
(ii) ri(co([q 3 , q 4 , q 5 ])) is the scalarizing set of the face determined by the equali-
ties x1 + x3 = 1, x2 + x3 = 1 and x1 = x2 = 0;
(iii) ri(co([q 1 , q 2 , q 4 , q 5 ])) is the scalarizing set of the face determined by the equal-
ities x1 + x3 = 1, x2 + x3 = 1 and x3 = 0.
In view of Corollary 4.1.21 the one dimensional face (edge) determined by x1 + x3 =
1 and x2 + x3 = 1 is a maximal efficient face.
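
A quick cross-check of item (i), assuming scipy: the midpoint of [q 4 , q 5 ] should
make the whole edge determined by x1 + x3 = 1 and x2 + x3 = 1 optimal, in particular
both of its end points.

import numpy as np
from scipy.optimize import linprog

A_ub = np.array([[1, 0, 1], [0, 1, 1]])
b_ub = np.array([1, 1])
lam = np.array([0.25, 0.25, 0.5])        # the midpoint of [q4, q5]

res = linprog(c=-lam, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
print(-res.fun)                          # optimal value 1/2
for v in ([1, 1, 0], [0, 0, 1]):         # end points of the edge
    print(lam @ np.array(v))             # both attain 1/2, hence optimal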

4.2 Multiobjective Linear Problems

The central multiobjective linear programming problem which we propose to study
throughout is denoted (MOLP) and written in the form:

Maximize C x
subject to x ∈ X,

where X is a nonempty convex polyhedron in Rn and C is a real k × n-matrix. This


problem means finding a Pareto efficient (Pareto maximal) solution x̄ ∈ X such that
C x̄ ∈ Max(C(X )). In other words, a feasible solution x̄ solves (MOLP) if there is
no feasible solution x ∈ X such that

C x̄ ≤ C x and C x̄ ≠ C x.

The efficient solution set of (MOLP) is denoted S(MOLP). When x is an efficient
solution, the vector C x is called an efficient or maximal value of the problem. In
a similar manner one defines the set of weakly efficient solutions WS(MOLP) to be
the set of all feasible solutions whose image by C belongs to the weakly efficient
set WMax(C(X )). It is clear that an efficient solution is a weakly efficient solution,
but not vice versa as we have already discussed in the preceding section. When the
feasible set X is given by the system

Ax = b
x ≥ 0,

where A is a real m × n-matrix and b is a real m-vector, we say that (MOLP) is given
in standard form, and it is given in canonical form if X is determined by the system

Ax ≤ b.

The matrix C is also considered as a linear operator from Rn to Rk , and so its kernel
consists of vectors x with C x = 0.

Theorem 4.2.1 Assume that the problem (MOLP) has feasible solutions. Then the
following assertions hold.
(i) (MOLP) admits efficient solutions if and only if

C(X ∞ ) ∩ Rk+ = {0}.

(ii) (MOLP) admits weakly efficient solutions if and only if

C(X ∞ ) ∩ int(Rk+ ) = ∅.

In particular, if all asymptotic rays of X belong to the kernel of C, then (MOLP) has
an efficient solution.

Proof By definition, (MOLP) has an efficient solution if and only if the set C(X )
has an efficient point, which, in virtue of Theorem 4.1.7, is equivalent with

[C(X )]∞ ∩ Rk+ = {0}.

Now the first assertion is deduced from this equivalence and from the fact that the
asymptotic cone of C(X ) coincides with the cone C(X ∞ ) (Corollary 2.3.17). The
second assertion is proven by a similar argument. 

Example 4.2.2 Assume that the feasible set X of the problem (MOLP) is given by
the system
x1 + x2 − x3 = 5
x1 − x2 = 4
x1 , x2 , x3 ≥ 0.

It is nonempty and parametrically presented as

X = { (t + 4, t, 2t − 1)T : t ≥ 1/2 }.

Its asymptotic cone is given by

X ∞ = { (t, t, 2t)T : t ≥ 0 }.

Consider an objective function C with values in R2 given by the matrix

C = ( 1 0 1 ; −2 −4 0 ).

Then the image of X ∞ under C is the set

C(X ∞ ) = { (3t, −6t)T : t ≥ 0 },

that has only the zero vector in common with the positive orthant. In view of Theorem
4.2.1 the problem has maximal solutions.
Now we choose another objective function C ′ given by

C ′ = ( −1 1 0 ; 0 0 1 ).

Then the image of X ∞ under C ′ is the set

C ′ (X ∞ ) = { (0, 2t)T : t ≥ 0 },

that has no common point with the interior of the positive orthant. Hence the problem
admits weakly efficient solutions. It has no efficient solution because the intersection
of C ′ (X ∞ ) with the positive orthant contains nonzero vectors.
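
The criterion of Theorem 4.2.1 can be tested by a single linear program. The sketch
below, assuming scipy, checks whether some v ∈ X ∞ satisfies Cv ≥ 0 with Cv ≠ 0;
the normalization sum(v) ≤ 1 keeps the program bounded.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1, 1, -1], [1, -1, 0]])   # X_inf = {v : Av = 0, v >= 0}

def objective_is_bounded(C):
    # Maximize sum(Cv) over v in X_inf with Cv >= 0 and sum(v) <= 1.
    # The optimal value is 0 exactly when no v gives Cv >= 0, Cv != 0.
    k, n = C.shape
    res = linprog(c=-C.sum(axis=0),
                  A_ub=np.vstack([-C, np.ones((1, n))]),
                  b_ub=np.concatenate([np.zeros(k), [1.0]]),
                  A_eq=A, b_eq=np.zeros(2), bounds=(0, None))
    return abs(res.fun) < 1e-9

print(objective_is_bounded(np.array([[1, 0, 1], [-2, -4, 0]])))   # True
print(objective_is_bounded(np.array([[-1, 1, 0], [0, 0, 1]])))    # False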
Definition 4.2.3 The objective function of the problem (MOLP) is said to be
bounded (respectively weakly bounded) from above if there is no vector v ∈ X ∞
such that
Cv ≥ 0 (respectively Cv > 0).

We shall simply say that (MOLP) is bounded if its objective function is bounded
from above. Of course, a bounded problem is weakly bounded and not every weakly
bounded problem is bounded. A sufficient condition for a problem to be bounded is
given by the inequality
C x ≤ a for every x ∈ X,

where a is some vector from Rk . This condition is also necessary when k = 1, but
not so when k > 1.
4.2 Multiobjective Linear Problems 105

Example 4.2.4 Consider the bi-objective problem

Maximize ( −3 1 1 ; 0 1 0 ) (x1 , x2 , x3 )T
subject to ( 1 −1 0 ; 0 0 1 ) (x1 , x2 , x3 )T = (0, 1)T
x1 , x2 , x3 ≥ 0.

The feasible set and its asymptotic cone are given respectively by

X = { (t, t, 1)T ∈ R3 : t ≥ 0 }

and

X ∞ = { (t, t, 0)T ∈ R3 : t ≥ 0 }.

Then for every asymptotic direction v = (t, t, 0)T ∈ X ∞ one has

Cv = (−2t, t)T ≱ 0.

By definition the objective function is bounded. Nevertheless the value set of the
problem consists of vectors

C(X ) = { (−2t + 1, t)T : t ≥ 0 }

for which no vector a ∈ R2 satisfies C x ≤ a for all x ∈ X .
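
A few sample values make the phenomenon visible; the snippet below (plain numpy)
evaluates C x along the feasible ray, where the first objective decreases without
bound while the second increases.

import numpy as np

C = np.array([[-3, 1, 1], [0, 1, 0]])
for t in (0, 1, 10, 100):
    x = np.array([t, t, 1])              # the feasible ray of X
    print(t, C @ x)                      # values (-2t + 1, t)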

Corollary 4.2.5 The problem (MOLP) has efficient solutions (respectively weakly
efficient solutions) if and only if its objective function is bounded (respectively weakly
bounded).

Proof This is immediate from Theorem 4.2.1. 

The following theorem provides a criterion for efficiency in terms of normal


directions.

Theorem 4.2.6 Let x̄ be a feasible solution of (MOLP). Then


(i) x̄ is an efficient solution if and only if the normal cone N X (x̄) to X at x̄ contains
some vector C T λ with λ a strictly positive vector of Rk ;
(ii) x̄ is a weakly efficient point if and only if the normal cone N X (x̄) to X at x̄
contains some vector C T λ with λ a nonzero positive vector of Rk .

Proof If the vector C T λ with λ strictly positive, is normal to X at x̄, then

⟨C T λ, x − x̄⟩ ≤ 0 for every x ∈ X,

which means that

⟨λ, C x⟩ ≤ ⟨λ, C x̄⟩ for every x ∈ X.

By Theorem 4.1.4 the vector C x̄ is an efficient point of the set C(X ). By definition,
x̄ is an efficient solution of (MOLP).
Conversely, if C x̄ is an efficient point of C(X ), then by Theorem 4.1.10, the
normal cone to C(X ) at C x̄ contains a strictly positive vector, denoted by λ. We
deduce that

⟨C T λ, x − x̄⟩ ≤ 0 for all x ∈ X.

This shows that the vector C T λ is normal to X at x̄. The second assertion is proven
similarly. 

Example 4.2.7 We reconsider the bi-objective problem given in Example 4.2.2

Maximize ( 1 0 1 ; −2 −4 0 ) (x1 , x2 , x3 )T
subject to x1 + x2 − x3 = 5
x1 − x2 = 4
x1 , x2 , x3 ≥ 0.

Choose a feasible solution x = (9/2, 1/2, 0)T corresponding to t = 1/2. The
normal cone to the feasible set at x is the positive hull of the plane spanned by
{(1, 1, −1)T , (1, −1, 0)T } (the row vectors of the constraint matrix) and of the
vector (0, 0, −1)T (the constraint x3 ≥ 0 is active at this point). In other words,
this normal cone is the half-space determined by the inequality

x1 + x2 + 2x3 ≤ 0.                     (4.6)

The image of the positive orthant of the value space R2 under C T is the positive hull
of the vectors

v 1 = C T (1, 0)T = (1, 0, 1)T   and   v 2 = C T (0, 1)T = (−2, −4, 0)T .

Using inequality (4.6) we deduce that the vector v2 lies in the interior of the normal
cone to the feasible set at x. Hence that normal cone does contain a vector C T λ with
some strictly positive vector λ. By Theorem 4.2.6 the solution x is efficient. It is
routine to check that the solution x is a vertex of the feasible set.
If we pick another feasible solution, say z = (5, 1, 1)T , then the normal cone to
the feasible set at z is the hyperplane determined by the equation

x1 + x2 + 2x3 = 0.

Direct calculation shows that the vectors v 1 and v 2 lie on different sides of this
hyperplane. Hence there does exist a strictly positive vector λ in R2 such that C T λ is
contained in that cone. Consequently, the solution z is efficient too.
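
The computations of this example reduce to a few inner products, reproduced below
with numpy (an assumption; any numerical library would do).

import numpy as np

C = np.array([[1, 0, 1], [-2, -4, 0]])
n = np.array([1, 1, 2])          # inequality (4.6): <n, u> <= 0 on the cone

v1, v2 = C[0], C[1]              # the columns of C^T
print(n @ v1, n @ v2)            # 3 and -6: v2 is interior to the cone

lam = np.array([1.0, 1.0])       # any strictly positive weights with
print(n @ (C.T @ lam))           # 3*lam1 - 6*lam2 <= 0 certify efficiency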

4.3 Scalarization

We associate with a nonzero k-vector λ a scalar linear problem, denoted (LPλ ),

maximize ⟨λ, C x⟩
subject to x ∈ X.

This problem is referred to as a scalarized problem of (MOLP) and λ is called a


scalarizing vector. Now we shall see how useful scalarized problems are in solving
multiobjective problems.

Theorem 4.3.1 The following statements hold.


(i) A feasible solution x̄ of (MOLP) is efficient if and only if there is a strictly positive
k-vector λ such that x̄ is an optimal solution of the scalarized problem (LPλ ).
(ii) A feasible solution x̄ of (MOLP) is weakly efficient if and only if there is a nonzero
positive k-vector λ such that x̄ is an optimal solution of the scalarized problem
(LPλ ).

Proof If x̄ is an efficient solution of (MOLP), then, in view of Theorem 4.2.6, there
is a strictly positive vector λ such that C T λ is a normal vector to X at x̄. This implies
that x̄ maximizes the linear functional ⟨λ, C(·)⟩ on X , that is, x̄ is an optimal solution
of (LPλ ).
Conversely, if x̄ solves the problem (LPλ ) with λ strictly positive, then C T λ is a
normal vector to X at x̄. Again, in view of Theorem 4.2.6, the point x̄ is an efficient
solution of (MOLP). The proof of the second statement follows the same line. 
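
In computational terms, Theorem 4.3.1 says that one efficient solution is obtained by
solving a single LP with any strictly positive weight vector. A minimal sketch on the
data of Example 4.2.2, assuming scipy:

import numpy as np
from scipy.optimize import linprog

C = np.array([[1, 0, 1], [-2, -4, 0]])
A_eq = np.array([[1, 1, -1], [1, -1, 0]])
b_eq = np.array([5, 4])
lam = np.array([1.0, 1.0])                # strictly positive weights

# linprog minimizes, so we negate the weighted objective of (LP_lambda).
res = linprog(c=-(C.T @ lam), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.x)     # (4.5, 0.5, 0.0), the efficient solution of Example 4.2.7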

We notice that Theorem 4.3.1 remains valid if the scalarizing vector λ is taken
from the standard simplex, that is λ1 + · · · + λk = 1. Then another formulation of
the theorem is given by equalities

S(MOLP) = ⋃_{λ ∈ ri(Δ)} S(LPλ )               (4.7)

WS(MOLP) = ⋃_{λ ∈ Δ} S(LPλ )                  (4.8)

where S(LPλ ) denotes the optimal solution set of (LPλ ). It was already mentioned
that a weakly efficient solution is not necessarily an efficient solution. Consequently
a positive, but not strictly positive vector λ may produce weakly efficient solutions
which are not efficient. Here is an exception.

Corollary 4.3.2 Assume that for a positive vector λ the set of values C x, where x
runs over the optimal solutions of (LPλ ), is a singleton; this is in particular the case
when (LPλ ) has a unique optimal solution. Then every optimal solution of (LPλ ) is
an efficient solution of (MOLP).

Proof Let x be an optimal solution of (LPλ ) and let y be a feasible solution of
(MOLP) such that C y ≧ C x. Since λ is positive, one has

⟨λ, C x⟩ ≤ ⟨λ, C y⟩.

Actually we have equality because x solves (LPλ ). Hence y solves (LPλ ) too. By
hypothesis C x = C y which shows that x is an efficient solution of (MOLP). 

Equalities (4.7) and (4.8) show that efficient and weakly efficient solutions of
(MOLP) can be generated by solving a family of scalar problems. It turns out that
a finite number of such problems are sufficient to generate the whole efficient and
weakly efficient solution sets of (MOLP).

Corollary 4.3.3 There exists a finite number of strictly positive vectors (respectively
positive vectors) λi , i = 1, · · · , p, such that

S(MOLP) = ⋃_{i=1,··· ,p} S(LPλi )

(respectively WS(MOLP) = ⋃_{i=1,··· ,p} S(LPλi )).

Proof It follows from Theorem 3.1.3 that if an efficient solution is a relative interior
point of a face of the feasible polyhedron and an optimal solution of (LPλ ) for some
strictly positive vector λ, then the whole face is optimal for (LPλ ). Since the number
of faces is finite, a finite number of such vectors λ is sufficient to generate all efficient

solutions of (MOLP). The case of weakly efficient solutions is treated in the same
way. 

Corollary 4.3.4 Assume that (MOLP) has an efficient solution and (LPλ ), where
λ is a nonzero positive vector, has an optimal solution. Then there is an efficient
solution of (MOLP) among the optimal solutions of (LPλ ).

Proof Apply Theorem 4.3.1 and the method of Corollary 4.1.8. 

Corollary 4.3.5 Assume that the scalarized problems

maximize ⟨ci , x⟩
subject to x ∈ X

where ci , i = 1, · · · , k are the columns of the matrix C T , are solvable. Then (MOLP)
has an efficient solution.

Proof The linear problems mentioned in the corollary correspond to the scalarized
problems (LPλ ) with λ = (0, · · · , 1, · · · , 0)T where the one is on the ith place,
i = 1, · · · , k. These problems provide weakly efficient solutions of (MOLP). The
linear problem whose objective is the sum ⟨c1 , x⟩ + · · · + ⟨ck , x⟩ is solvable too. It is
the scalarized problem with λ = (1, · · · , 1)T , and hence by Theorem 4.3.1, (MOLP)
has efficient solutions. 

Decomposition of the scalarizing set


Given a feasible solution x of (MOLP) we denote by Λ(x) the set of all vectors λ ∈ Δ
such that x solves (LPλ ), and by Λ(X ) the union of all these Λ(x) over x ∈ X . We
denote also

Λr (x) = Λ(x) ∩ ri(Δ)
Λr (X ) = Λ(X ) ∩ ri(Δ).

The sets Λr (X ) and Λ(X ) are respectively called the scalarizing and weakly scalar-
izing sets of (MOLP). The decomposition results for efficient elements (Theorems
4.1.14 and 4.1.15) are easily adapted to decompose the weakly scalarizing and scalar-
izing sets of the problem (MOLP). We deduce a useful corollary below for computing
purposes.

Corollary 4.3.6 The following assertions hold for (MOLP).

(i) A feasible solution x ∈ X is efficient (respectively weakly efficient) if and only
if Λr (x) (respectively Λ(x)) is nonempty.
(ii) If X has vertices, then the set Λr (X ) (respectively Λ(X )) is the union of the
sets Λr (x i ) (respectively Λ(x i )) where x i runs over the set of all efficient
(respectively weakly efficient) vertices of (MOLP).

(iii) If X is given by the system

⟨a i , x⟩ ≤ bi , i = 1, · · · , m

and x is a feasible solution, then the set Λ(x) consists of all solutions λ to the
following system

λ1 + · · · + λk = 1
Σ_{i ∈ I (x)} αi a i = λ1 c1 + · · · + λk ck
λi ≥ 0, i = 1, · · · , k,  αi ≥ 0, i ∈ I (x).

In particular the weakly scalarizing set Λ(X ) is the solution set to the above
system with I = {1, · · · , m}.
Proof The first assertion is clear from Theorem 4.3.1. For the second assertion we
observe that when X has vertices, every face of X has vertices too (Corollary 2.3.6).
Hence the normal cone of X is the union of the normal cones to X at its vertices.
Moreover, by writing the objective function ⟨λ, C(·)⟩ of (LPλ ) in the form ⟨C T λ, ·⟩,
we deduce that

Λ(x) = {λ ∈ Rk : C T λ ∈ N X (x) ∩ C T (Δ)}.    (4.9)

Consequently,

Λ(X ) = ⋃_{x ∈ X} Λ(x)
      = { λ : C T λ ∈ N X (x) ∩ C T (Δ), x ∈ X }
      = { λ : C T λ ∈ N X (x) ∩ C T (Δ), x is a vertex of X }
      = ⋃ { Λ(x) : x is a weakly efficient vertex of X }.

The proof for efficient solutions is similar. The last assertion is derived from (4.9)
and Corollary 4.1.16. 
Example 4.3.7 We reconsider the bi-objective problem

Maximize ( 1 1 ; 2 −1 ) (x1 , x2 )T
subject to x1 + x2 ≤ 1
3x1 + 2x2 ≤ 2.

We wish to find the weakly scalarizing set of this problem. According to the preceding
corollary, it consists of positive vectors λ from the standard simplex of R2 , solutions
to the following system:

λ1 + λ2 = 1
α1 (1, 1)T + α2 (3, 2)T = λ1 (1, 1)T + λ2 (2, −1)T
α1 , α2 ≥ 0, λ1 , λ2 ≥ 0.

Solving this system we obtain λ = (t, 1 − t)T with 7/8 ≤ t ≤ 1. For t = 1, the
scalarized problem associated with λ is of the form

maximize x1 + x2
subject to x1 + x2 ≤ 1
3x1 + 2x2 ≤ 2.

It can be seen that x solves this problem if and only if x1 + x2 = 1 and x1 ≤ 0. These
solutions form the set of weakly efficient solutions of the multiobjective problem.
For t = 7/8, the scalarized problem associated with λ = (7/8, 1/8)T is of the
form

maximize (9/8) x1 + (3/4) x2
subject to x1 + x2 ≤ 1
3x1 + 2x2 ≤ 2.

Its optimal solutions are given by 3x1 + 2x2 = 2 and x1 ≥ 0. Since λ is strictly
positive, these solutions are efficient solutions of the multiobjective problem. If we
choose λ = (1/2, 1/2)T outside of the scalarizing set, then the associated scalarized
problem has no optimal solution.
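
These three cases are easy to replay with an LP solver. The sketch below, assuming
scipy, reports the solver status for t = 1, 7/8 and 1/2; note that the variables
are sign-free here, so the default bounds of linprog must be lifted.

import numpy as np
from scipy.optimize import linprog

C = np.array([[1, 1], [2, -1]])
A_ub = np.array([[1, 1], [3, 2]])
b_ub = np.array([1, 2])

for t in (1.0, 7/8, 1/2):
    lam = np.array([t, 1 - t])
    res = linprog(c=-(C.T @ lam), A_ub=A_ub, b_ub=b_ub,
                  bounds=(None, None))    # the variables are sign-free
    print(t, res.status)
# t = 1 and t = 7/8 give status 0 (optimal); t = 1/2 gives status 3
# (unbounded), so that scalarized problem has no optimal solution.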

Structure of the efficient solution set


We saw in Chap. 3 that the optimal solution set of a scalar linear problem is a
face of the feasible set. This property, unfortunately, is no longer true when the
problem is multiobjective. However, a few interesting properties of the efficient set
of a polyhedron established in the first section remain valid for the efficient
solution set; they are presented in the next theorem.

Theorem 4.3.8 The efficient solutions of the problem (MOLP) have the following
properties.
(i) If a relative interior point of a face of X is an efficient or weakly efficient solution,
then so is every point of that face.
(ii) If X has a vertex and (MOLP) has an efficient (weakly efficient) solution, then
it has an efficient (weakly efficient) vertex solution.
(iii) The efficient and weakly efficient solution sets of (MOLP) consist of faces of the
feasible polyhedron and are closed and arcwise connected.

Proof Since the normal cone to X at every point of a face contains the normal cone
at a relative interior point, the first property follows directly from Theorem 4.2.6.
Further, under the hypothesis of (ii) there is a strictly positive vector λ ∈ Rk
such that the scalarized problem (LPλ ) is solvable. The argument in proving (ii) of
Corollary 4.1.19 is applicable to obtain an optimal vertex of (LPλ ) which is also an
efficient vertex solution of (MOLP).
The proof of the last property is much similar to the one of Theorem 4.1.15.
We first notice that in view of (i) the efficient and weakly efficient solution sets are
composed of faces of the feasible set X , and as the number of faces of X is finite, they
are closed. We now prove the arcwise connectedness of the weakly efficient solution
set, the argument going through for efficient solutions too. Let x and y be two weakly
efficient solutions, relative interior points of efficient faces X x and X y of X . Let λx
and λ y be relative interior vectors of the weakly scalarizing sets Λ(X x ) and Λ(X y ).
The decomposition of the weakly scalarizing set Λ(X ) induces a decomposition of
the segment joining λx and λ y by

[λx , λ y ] = [λ1 , λ2 ] ∪ [λ2 , λ3 ] ∪ · · · ∪ [λℓ−1 , λℓ ]

with λ1 = λx , λℓ = λ y and [λi , λi+1 ] ⊆ Λ(X i ) for some face X i of X , i = 1, ..., ℓ −
1. Since λi belongs simultaneously to Λ(X i ) and Λ(X i+1 ), there is some common
point x i ∈ X i ∩ X i+1 , i = 1, ..., ℓ − 1. It is clear that [x, x 1 ] ∪ [x 1 , x 2 ] ∪ · · · ∪ [x ℓ−1 , y]
is an arcwise path joining x and y, and each member segment [x i , x i+1 ] is efficient
because it lies in the face X i , i = 1, ..., ℓ, with x ℓ = y.

4.4 Exercises

4.4.1 Find maximal elements of the sets determined by the following systems

(a) 2x + y ≤ 15          (b) x + 4y ≤ 12
    x + 3y ≤ 20              −2x + y ≤ 0
    x, y ≥ 0.                x, y ≥ 0.

(c) x + 2y ≤ 20          (d) x + 2y + 3z ≤ 70
    7x + z ≤ 6               x + y + z ≤ 50
    3y + 4z ≤ 30             −y + z ≤ 0
    x, y, z ≥ 0.             x, y, z ≥ 0.

4.4.2 Find maximal and weakly maximal elements of the following sets

Q 1 = { (x1 , x2 , x3 )T ∈ R3 : x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x2² + x3² ≤ 1 }

Q 2 = co(A, B) with A = { (1, 0, s)T ∈ R3 : 0 ≤ s ≤ 1 }
and B = { (0, x2 , x3 )T ∈ R3 : x2 ≥ 0, x3 ≥ 0, x2² + x3² ≤ 1 }.

4.4.3 We say a real function g on Rk is increasing if x, y ∈ Rk and x ≥ y imply


g(x) > g(y), and it is weakly increasing if x > y implies g(x) > g(y). Prove that g
is increasing (respectively weakly increasing) if and only if for every nonempty subset
Q of Rk , every maximizer of g on Q is an efficient (respectively weakly maximal)
element of Q.

4.4.4 Let Q be a closed set in Rk . Prove the following statements.


(i) The set WMax(Q) is closed.
(ii) The set Max(Q) is closed provided that k = 2 and Q − R2+ is convex.
(iii) Max(−Q) = − Min(Q) and Max(αQ) = α Max(Q) for every α > 0.

4.4.5 Let P and Q be two convex polyhedra in Rk .


(i) Prove that Max(P + Q) ⊆ Max(P)+ Max(Q).
(ii) Find conditions under which equality holds in (i).

4.4.6 Prove that the set of maximal elements of a convex polytope is included in the
convex hull of the maximal vertices. Is the converse true?

4.4.7 An element x of a set P in Rk is said to be dominated if there is some x ′ ∈ P
such that x ′ ≥ x. Prove that the set of dominated elements of a convex polyhedral
set is convex and if a face contains a dominated element, its relative interior points
are dominated too.

4.4.8 A diet problem. A multiobjective version of the diet problem in hospital


consists of finding a combination of foods for a patient to minimize simultaneously
the cost of the menu and the number of calories under certain nutritional requirements
prescribed by a treating physician. Assume a menu is composed of three main types of
foods: meat with potatoes, fish with rice and vegetables. The nutrition facts, calories
in foods and price per servings are given below

                 Fats  Carbohydrates  Vitamin  Calories  Prices/serving
Meat + potatoes  0.2   0.2            0.06     400       1.5
Fish + rice      0.1   0.2            0.08     300       1.5
Vegetables       0     0.05           0.8      50        0.8

Using three variables: x= number of servings of meat, y= number of servings of fish


and z= number of servings of vegetables, formulate a bi-objective linear problem
whose objective functions are the cost and the number of calories of the menu while
maintaining the physician’s prescription of at least one unit and at most one and half
unit for each nutritional substance. Discuss the menus that minimize the cost and the
number of calories separately.

4.4.9 An investment problem. An investor disposes a budget of 20,000 USD and


wishes to invest into three product projects with amounts x, y and z respectively. The
total profit is given by

P(x, y, z) = 20x + 10y + 100z

and the total sale is given by

S(x, y, z) = 10x + 2y + z.

Find x, y and z to maximize the total profit and total sale.

4.4.10 Bilevel linear programming problem. A typical bilevel programming prob-


lem consists of two problems: the upper level problem of the form

maximize ⟨c, x⟩ + ⟨d, y⟩
subject to A1 x ≤ b1
x ≥ 0

and the lower level problem for which y is an optimal solution:

maximize ⟨p, z⟩
subject to A2 x + A3 z ≤ b2
z ≥ 0.

Here c, p, d, b1 and b2 are vectors of dimension n 1 , n 2 , n 2 , m 1 and m 2 respectively;


A1 , A2 and A3 are matrices of dimension m 1 × n 1 , m 2 × n 1 and m 2 × n 2 corre-
spondingly.
Consider the following multiobjective problem

Maximize ( x, −⟨e, x⟩, ⟨p, y⟩ )T
subject to A1 x ≤ b1
A2 x + A3 y ≤ b2
x ≥ 0, y ≥ 0

where e is the vector whose components are all equal to one. Prove that (x, y) is an
efficient solution of this latter problem if and only if it is a feasible solution of the
upper level problem described above.

4.4.11 Apply Theorem 4.1.15 to find a decomposition of the scalarizing set for the
polyhedron defined by the system

2x1 + x2 + 2x3 ≤ 5
x1 + 2x2 + 2x3 ≤ 5
x1 , x2 , x3 ≥ 0.

4.4.12 Find the weakly scalarizing set of a polyhedron in Rk determined by the
system

⟨a i , y⟩ = bi , i = 1, · · · , m
y ≥ 0,

and apply it to find the weakly scalarizing set of a multiobjective problem given in
standard form.

4.4.13 Scalarizing set at a vertex solution. Consider the problem (MOLP) in stan-
dard form

Maximize C x
subject to Ax = b
x ≥ 0,

where C is a k × n-matrix, A is an m × n-matrix and b is an m-vector. Assume


x is a feasible solution associated with a non-degenerate basis B. The non-basic
part of A is denoted N , the basic and non-basic parts of C are denoted C B and C N
respectively. Prove the following statements.
(a) A vector λ belongs to Λ(x) if and only if it belongs to Δ and solves the following
system

[C NT − (B −1 N )T C BT ] λ ≤ 0.

(b) If the vector on the left hand side of the system in (a) is strictly negative for some
λ, then x is a unique solution of the scalarized problem

maximize ⟨λ, C x⟩
subject to Ax = b
x ≥ 0.

In particular, if in addition λ is positive, then x is an efficient solution of (MOLP).
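
As an illustration of statement (a), the reduced cost matrix can be assembled in a
few lines; the sketch below uses the data of Example 4.2.2 under the assumption that
x = (9/2, 1/2, 0)T is basic with basis {1, 2}.

import numpy as np

C = np.array([[1, 0, 1], [-2, -4, 0]])
A = np.array([[1, 1, -1], [1, -1, 0]])
basic, nonbasic = [0, 1], [2]              # basis of x = (9/2, 1/2, 0)^T

B, N = A[:, basic], A[:, nonbasic]
CB, CN = C[:, basic], C[:, nonbasic]
R = CN.T - np.linalg.solve(B, N).T @ CB.T  # the matrix of statement (a)
print(R)   # [[1.5, -3.]]: lambda is in Lambda(x) iff 1.5 l1 - 3 l2 <= 0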



4.4.14 Pascoletti-Serafini’s method. Let λ ∈ Rk be a strictly positive vector and


C x  0 for every feasible solution x ∈ X of (MOLP). Show that if (x, α) is an
optimal solution of the problem

maximize α
subject to c , x  α, j = 1, ..., k
j

x ∈ X,

then x is a weakly efficient solution of (MOLP).

4.4.15 Weighted constraint method. Prove that a feasible solution x ∈ X is a
weakly efficient solution of (MOLP) if and only if there is some strictly positive
vector λ ∈ Rk such that x solves

maximize ⟨λℓ cℓ , x⟩
subject to ⟨λ j c j , x⟩ ≤ ⟨λℓ cℓ , x⟩, j = 1, · · · , k, j ≠ ℓ
x ∈ X

for ℓ = 1, · · · , k.

4.4.16 Constraint method. Choose ℓ ∈ {1, · · · , k}, L j ∈ R, j = 1, · · · , k, j ≠ ℓ,
and solve the scalar problem (Pℓ ):

maximize ⟨cℓ , x⟩
subject to ⟨c j , x⟩ ≥ L j , j = 1, · · · , k, j ≠ ℓ
Ax = b, x ≥ 0.

Note that if the L j are big, then (Pℓ ) may have no feasible solution. A constraint
⟨c j , x⟩ ≥ L j is called binding if equality ⟨c j , x⟩ = L j is satisfied at every optimal
solution of (Pℓ ). Prove that
(a) every optimal solution of (Pℓ ) is a weakly efficient solution of (MOLP);
(b) if an optimal solution of (Pℓ ) is unique or all constraints of (Pℓ ) are binding,
then it is an efficient solution of (MOLP);
(c) a feasible solution x 0 of (MOLP) is efficient if and only if it is optimal for all
(Pℓ ), ℓ = 1, ..., k and

L ℓ = (⟨c1 , x 0 ⟩, · · · , ⟨cℓ−1 , x 0 ⟩, ⟨cℓ+1 , x 0 ⟩, · · · , ⟨ck , x 0 ⟩).

4.4.17 Let d be a k-vector such that C x ≥ d for some feasible solution x of (MOLP).
Consider the problem (P)

maximize ⟨e, y⟩
subject to C x = d + y
Ax = b, x ≥ 0, y ≥ 0,

where e is the vector of ones in Rk . Show that


(a) a feasible solution x 0 of (MOLP) is efficient if and only if the optimal value of
(P) with d = C x 0 is equal to zero;
(b) (MOLP) has efficient solutions if and only if the optimal value of (P) is finite.
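
For readers who want to experiment, here is a sketch of the test in statement (a) on
the data of Example 4.2.2, assuming scipy; x 0 = (9/2, 1/2, 0)T was shown to be
efficient in Example 4.2.7, so the optimal value of (P) must be zero.

import numpy as np
from scipy.optimize import linprog

C = np.array([[1, 0, 1], [-2, -4, 0]])
A = np.array([[1, 1, -1], [1, -1, 0]])
b = np.array([5, 4])
x0 = np.array([4.5, 0.5, 0.0])
d = C @ x0

k, n = C.shape
# Variables (x, y); constraints C x - y = d and A x = b, with x, y >= 0.
A_eq = np.block([[C, -np.eye(k)], [A, np.zeros((A.shape[0], k))]])
b_eq = np.concatenate([d, b])
cost = np.concatenate([np.zeros(n), -np.ones(k)])   # maximize <e, y>

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(-res.fun)    # 0.0: by (a), x0 is an efficient solution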

4.4.18 Let x̄ be a feasible solution of the problem

Maximize C x
subject to Ax ≤ b.

Show that the following statements are equivalent.


(i) x̄ is a weak Pareto maximal solution.
(ii) The system
Ax ≤ b
C x > C x̄

is inconsistent.
(iii) For every t > 0, the system

Ax ≤ b − A x̄
C x ≥ te

is inconsistent, where e is the vector of ones.


(iv) For every t > 0, the system

C T λ − A T μ = 0
⟨A x̄ − b, μ⟩ + t ⟨e, λ⟩ = 1
λ, μ ≥ 0

is consistent.

4.4.19 Consider the multiobjective problem described in the preceding exercise.


Assume that the cone pos{c1 , · · · , ck } contains the origin in its relative interior.
Prove that if the interior of the feasible set is nonempty, then every feasible solution
of (MOLP) is an efficient solution.

4.4.20 Let X denote the feasible set of the problem (MOLP) given in Exercise 4.4.18.
Consider the following function

h(x) = max_{x ′ ∈ X} min_{λ ∈ Δ} ⟨λ, C x ′ − C x⟩.
118 4 Pareto Optimality

Prove that x is a weakly maximal solution of (MOLP) if and only if h(x) = 0.

4.4.21 Geoffrion’s proper efficient solutions. Let X be a nonempty set in Rn and let
f be a vector-valued function from Rn to Rk . Consider the following multiobjective
problem

Maximize f (x)
subject to x ∈ X.

A feasible solution x̄ of this problem is said to be a proper efficient solution if there
exists a constant α > 0 such that for every i ∈ {1, · · · , k} and x ∈ X satisfying
f i (x) > f i (x̄) there exists some j ∈ {1, · · · , k} for which f j (x) < f j (x̄) and

( f i (x) − f i (x̄)) / ( f j (x̄) − f j (x)) ≤ α.

(i) Justify that every proper efficient solution is efficient. Give an example of efficient
solutions that are not proper.
(ii) Prove that when f is linear and X is a polyhedral set, every efficient solution is
proper.

4.4.22 Maximality with respect to a convex cone. Let C be a convex cone in Rk
with C ∩ (−C) = {0} (one says C is pointed). For y, z ∈ Rk define y ≥C z by
y − z ∈ C. A point z of a set A is called C-maximal if there is no y ∈ A such that
y ≥C z and y ≠ z. Prove the following properties:
(i) A point z ∈ A is C-maximal if and only if (A − z) ∩ C = {0};
(ii) If Rk+ ⊆ C, then every C-maximal point is Pareto maximal, and if Rk+ ⊇ C,
then every Pareto maximal point is C-maximal;
(iii) If A is a polyhedral set, then there is a polyhedral cone C satisfying Rk+ ⊆
int(C) ∪ {0} such that a point of A is C-maximal if and only if it is Pareto
maximal. Find such a cone for the sets in Exercise 4.4.1 (a) and (b).

4.4.23 Lexicographical order. The lexicographical order ≥lex in Rk is defined as:
y ≥lex z for y, z ∈ Rk if and only if either y = z or there is some j ∈ {1, · · · , k}
such that yi = z i for i < j and y j > z j . A point z of a nonempty set A in Rk is
called lex-maximal if there is no y ∈ A such that y ≥lex z and y ≠ z.
(i) Show that the lexicographical order is total in the sense that for every y, z ∈ Rk
one has either y ≥lex z or z ≥lex y.
(ii) Find a convex cone C such that y ≥lex z if and only if y − z ∈ C.
(iii) Prove that every lex-maximal element of a set is Pareto maximal.
Do the same for the colexicographical order: y ≥colex z if and only if either y = z
or there is some j ∈ {1, · · · , k} such that yi = z i for i > j and y j > z j .
