
C.O.R.E.

Summer School
on
Modern Convex Optimization
August 26-30, 2002

FIVE LECTURES
ON
MODERN CONVEX OPTIMIZATION

Arkadi Nemirovski
[email protected],
http://iew3.technion.ac.il/Home/Users/Nemirovski.phtml
Faculty of Industrial Engineering and Management
and Minerva Optimization Center
Technion – Israel Institute of Technology
Technion City, Haifa 32000, Israel

Preface

Mathematical Programming deals with optimization programs of the form


minimize  f(x)
subject to  g_i(x) ≤ 0, i = 1, ..., m        (P)
[x ∈ R^n]
and includes the following general areas:
1. Modelling: methodologies for posing various applied problems as optimization programs;
2. Optimization Theory, focusing on existence, uniqueness and on characterization of optimal
solutions to optimization programs;
3. Optimization Methods: development and analysis of computational algorithms for various
classes of optimization programs;
4. Implementation, testing and application of modelling methodologies and computational
algorithms.
Essentially, Mathematical Programming was born in 1948, when George Dantzig invented
Linear Programming – the class of optimization programs (P) with linear objective f(·) and
constraints g_i(·). This breakthrough discovery included
• the methodological idea that a natural desire of a human being to look for the best possible
decisions can be posed in the form of an optimization program (P) and thus subject to
mathematical and computational treatment;
• the theory of LP programs, primarily the LP duality (this is in part due to the great
mathematician John von Neumann);
• the first computational method for LP – the Simplex method, which over the years turned
out to be an extremely powerful computational tool.
As often happens with first-rate discoveries (and to some extent is characteristic of such
discoveries), today the above ideas and constructions look quite traditional and simple. Well,
the same is true of the wheel.
In the 50-plus years since its birth, Mathematical Programming has progressed rapidly along
all the outlined avenues, "in width" as well as "in depth". I have no intention (and no time) to trace
the history of the subject decade by decade; instead, let me outline the major achievements in
Optimization during the last 20 years or so, those which, I believe, allow us to speak about modern
optimization as opposed to the "classical" one as it existed circa 1980. The reader should be
aware that the summary to follow is highly subjective and reflects the personal attitudes and
preferences of the author. Thus, in my opinion, the major achievements in Mathematical Programming
during the last 15-20 years can be outlined as follows:
♠ Realizing which generic optimization programs one can solve well ("efficiently solv-
able" programs) and when such a possibility is, mildly speaking, problematic ("computationally
intractable" programs). At this point, I do not intend to explain what it means exactly
that "a generic optimization program is efficiently solvable"; we will arrive at this issue later
in the course. However, I intend to answer the question (right now, not well posed!) "what are the
generic optimization programs we can solve well":

(!) As far as numerical processing of programs (P) is concerned, there exists a
"solvable case" – the one of convex optimization programs, where the objective f
and the constraints g_i are convex functions.
Under minimal additional "computability assumptions" (which are satisfied in basi-
cally all applications), a convex optimization program is "computationally tractable"
– the computational effort required to solve the problem to a given accuracy "grows
moderately" with the dimensions of the problem and the required number of accuracy
digits.
In contrast to this, general-type non-convex problems are too difficult for numerical
solution – the computational effort required to solve such a problem by the best
numerical methods known so far grows prohibitively fast with the dimensions of
the problem and the number of accuracy digits, and there are serious theoretical
reasons to guess that this is an intrinsic feature of non-convex problems rather than
a drawback of the existing optimization techniques.
Just to give an example, consider a pair of optimization problems. The first is

minimize  −Σ_{i=1}^n x_i
subject to
  x_i^2 − x_i = 0, i = 1, ..., n;        (A)
  x_i x_j = 0  ∀(i, j) ∈ Γ,

Γ being a given set of pairs (i, j) of indices i, j. This is a fundamental combinatorial problem of computing the
stability number of a graph; the corresponding "covering story" is as follows:

Assume that we are given n letters which can be sent through a telecommunication channel, say,
n = 256 usual bytes. When passing through the channel, an input letter can be corrupted by errors;
as a result, two distinct input letters can produce the same output and thus cannot necessarily be
distinguished at the receiving end. Let Γ be the set of "dangerous pairs of letters" – pairs (i, j) of
distinct letters i, j which can be converted by the channel into the same output. If we are interested
in error-free transmission, we should restrict the set S of letters we actually use to be independent
– such that no pair (i, j) with i, j ∈ S belongs to Γ. And in order to utilize the capacity of the
channel as much as possible, we are interested in using a maximal independent sub-alphabet – one
with the maximum possible number of letters. It turns out that minus the optimal value in (A) is exactly the cardinality
of such a maximal independent sub-alphabet.
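To make the combinatorics behind (A) tangible, here is a tiny brute-force sketch (an illustration added for this text, not part of the original lectures; the 5-letter alphabet and the set Γ are invented) that enumerates all sub-alphabets and reports the stability number, i.e., minus the optimal value of (A):

from itertools import combinations

n = 5                                              # a 5-letter "alphabet"
Gamma = {(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)}   # dangerous pairs: a 5-cycle

def independent(S):
    # no dangerous pair may lie inside the sub-alphabet S
    return all((i, j) not in Gamma and (j, i) not in Gamma
               for i, j in combinations(sorted(S), 2))

stability_number = max(len(S)
                       for k in range(n + 1)
                       for S in combinations(range(1, n + 1), k)
                       if independent(S))
print(stability_number)                            # 2 for the 5-cycle

The enumeration visits all 2^n sub-alphabets, which is exactly the kind of exponential effort discussed below; for n = 256 it is hopeless.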

Our second problem is

minimize  −2 Σ_{i=1}^k Σ_{j=1}^m c_{ij} x_{ij} + x_{00}
subject to

         ⎡ x_1                                 Σ_{j=1}^m b_{pj} x_{1j} ⎤
 λ_min   ⎢         ⋱                                    ⋮              ⎥ ≥ 0,  p = 1, ..., N,        (B)
         ⎢                 x_k                 Σ_{j=1}^m b_{pj} x_{kj} ⎥
         ⎣ Σ_{j=1}^m b_{pj} x_{1j}  ⋯  Σ_{j=1}^m b_{pj} x_{kj}  x_{00} ⎦

 Σ_{i=1}^k x_i = 1,

where λ_min(A) denotes the minimum eigenvalue of a symmetric matrix A. This problem is responsible for the
design of a truss (a mechanical construction comprised of thin elastic bars linked to each other, like an electric
mast, a bridge or the Eiffel Tower) capable of withstanding, as well as possible, k given loads.
When looking at the analytical forms of (A) and (B), it seems that the first problem is easier than the second:
the constraints in (A) are simple explicit quadratic equations, while the constraints in (B) involve much more
complicated functions of the design variables – the eigenvalues of certain matrices depending on the design vector.
The truth, however, is that the first problem is, in a sense, "as difficult as an optimization problem can be", and
the worst-case computational effort to solve this problem within absolute inaccuracy 0.5 by all known optimization
methods is about 2^n operations; for n = 256 (just 256 design variables corresponding to the "alphabet of bytes"),
the quantity 2^n ≈ 10^77, for all practical purposes, is the same as +∞. In contrast to this, the second problem is
quite "computationally tractable". E.g., for k = 6 (6 loads of interest) and m = 100 (100 degrees of freedom of
the construction) the problem has about 600 variables (twice that of the "byte" version of (A)); however, it
can be reliably solved within 6 accuracy digits in a couple of minutes. The dramatic difference in the computational
effort required to solve (A) and (B) ultimately comes from the fact that (A) is a non-convex optimization problem,
while (B) is convex.
Note that realizing what is easy and what is difficult in Optimization is, aside from its theoretical
importance, extremely important methodologically. Indeed, mathematical models of real world
situations are in any case incomplete and therefore flexible to some extent. When you know in
advance what you can process efficiently, you perhaps can use this flexibility to build a tractable
(in our context – a convex) model. The "traditional" Optimization did not pay much attention
to complexity and focused on easy-to-analyze, purely asymptotical "rate of convergence" results.
From this viewpoint, the most desirable property of f and g_i is smoothness (plus, perhaps,
certain "nondegeneracy" at the optimal solution), and not their convexity; choosing between
the above problems (A) and (B), a "traditional" optimizer would, perhaps, prefer the first of
them. I suspect that a non-negligible part of the "applied failures" of Mathematical Programming
came from the traditional (I would say, heavily misleading) "order of preferences" in model-building.
Surprisingly, some advanced users (primarily in Control) realized the crucial
role of convexity much earlier than some members of the Optimization community. Here is a
real story. About 7 years ago, we were working on a certain Convex Optimization method, and
I sent an e-mail to the people maintaining CUTE (a benchmark of test problems for constrained
continuous optimization) requesting the list of convex programs in their collection. The
answer was: "We do not care which of our problems are convex, and this be a lesson for those
developing Convex Optimization techniques." In their opinion, I am stupid; in my opinion, they
are obsolete. Who is right, this I do not know...
♠ Discovery of interior-point polynomial time methods for "well-structured" generic convex
programs and thorough investigation of these programs.
By itself, the "efficient solvability" of generic convex programs is a theoretical rather than
a practical phenomenon. Indeed, assume that all we know about (P) is that the program is
convex, its objective is called f, the constraints are called g_i and that we can compute f and g_i,
along with their derivatives, at any given point at the cost of M arithmetic operations. In this
case the computational effort for finding an ε-solution turns out to be at least O(1)·n·M·ln(1/ε).
Note that this is a lower complexity bound, and the best upper bound known so far is much
worse: O(1)·n·(n^3 + M)·ln(1/ε). Although the bounds grow "moderately" – polynomially – with
the design dimension n of the program and the required number ln(1/ε) of accuracy digits, from
the practical viewpoint the upper bound becomes prohibitively large already for n around 1000.
This is in striking contrast with Linear Programming, where one can routinely solve problems
with tens and hundreds of thousands of variables and constraints. The reasons for this huge
difference come from the fact that

When solving an LP program, our a priori knowledge is far beyond the fact that the
objective is called f, the constraints are called g_i, that they are convex and we can
compute their values and derivatives at any given point. In LP, we know in advance
what the analytical structure of f and g_i is, and we heavily exploit this knowledge
when processing the problem. In fact, all successful LP methods never compute
the values and the derivatives of f and g_i – they do something completely different.

One of the most important recent developments in Optimization is the realization of the simple fact
that a jump from linear f and g_i's to "completely structureless" convex f and g_i's is too long: in-
between these two extremes, there are many interesting and important generic convex programs.
These "in-between" programs, although non-linear, still possess nice analytical structure, and
one can use this structure to develop dedicated optimization methods, methods which turn
out to be incomparably more efficient than those exploiting solely the convexity of the program.
The aforementioned "dedicated methods" are Interior Point polynomial time algorithms,
and the most important "well-structured" generic convex optimization programs are those of
Linear, Conic Quadratic and Semidefinite Programming; the last two simply did not
exist as established research subjects just 15 years ago. In my opinion, the discovery of Interior
Point methods and of non-linear "well-structured" generic convex programs, along with the
subsequent progress in these novel research areas, is one of the most impressive achievements in
Mathematical Programming. It is my pleasure to add that one of the key roles in these break-
through developments, and definitely the key role as far as nonlinear programs are concerned,
was and is played by Professor Yuri Nesterov from CORE.
♠ I have outlined the most revolutionary, in my appreciation, changes in the theoretical core
of Mathematical Programming in the last 15-20 years. During this period, we have witnessed
perhaps less dramatic, but still quite important, progress in the methodological and application-
related areas as well. The major novelty here is a certain shift from the applications traditional
for Operations Research in Industrial Engineering (production planning, etc.) to applications in
"genuine" Engineering. I believe it is completely fair to say that the theory and methods
of Convex Optimization, especially those of Semidefinite Programming, have become a kind
of new paradigm in Control and are becoming more and more frequently used in Mechanical
Engineering, Design of Structures, Medical Imaging, etc.

The aim of the course is to outline some of the novel research areas which have arisen in
Optimization during the past decade or so. I intend to focus solely on Convex Programming,
specifically, on

• Conic Programming, with emphasis on the most important particular cases – those of
Linear, Conic Quadratic and Semidefinite Programming (LP, CQP and SDP, respectively).
Here the focus will be on

– basic Duality Theory for conic programs;


– investigation of “expressive abilities” of CQP and SDP;
– overview of the theory of Interior Point polynomial time methods for LP, CQP and
SDP.

• “Efficient (polynomial time) solvability” of generic convex programs.

• “Low cost” optimization methods for extremely large-scale optimization programs.

Acknowledgements. The first four lectures of the five comprising the course are based upon
the recent book
Ben-Tal, A., Nemirovski, A., Lectures on Modern Convex Optimization: Analysis, Algo-
rithms, Engineering Applications, MPS-SIAM Series on Optimization, SIAM, Philadelphia,
2001.

I am greatly indebted to my colleagues, primarily to Yuri Nesterov, Aharon Ben-Tal, Stephen
Boyd, Claude Lemarechal and Kees Roos, who over the years have significantly influenced my
understanding of our subject as expressed in this course. Needless to say, I am the only person
responsible for the drawbacks in what follows.

Arkadi Nemirovski,
Haifa, Israel, May 2002.
Contents

1 From Linear to Conic Programming 9


1.1 Linear programming: basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 Duality in linear programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.1 Certificates for solvability and insolvability . . . . . . . . . . . . . . . . . 10
1.2.2 Dual to an LP program: the origin . . . . . . . . . . . . . . . . . . . . . . 14
1.2.3 The LP Duality Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3 From Linear to Conic Programming . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4 Orderings of Rm and convex cones . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.5 “Conic programming” – what is it? . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.6 Conic Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.6.1 Geometry of the primal and the dual problems . . . . . . . . . . . . . . . 25
1.7 The Conic Duality Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.7.1 Is something wrong with conic duality? . . . . . . . . . . . . . . . . . . . 31
1.7.2 Consequences of the Conic Duality Theorem . . . . . . . . . . . . . . . . 32

2 Conic Quadratic Programming 39


2.1 Conic Quadratic problems: preliminaries . . . . . . . . . . . . . . . . . . . . . . . 39
2.2 Examples of conic quadratic problems . . . . . . . . . . . . . . . . . . . . . . . . 41
2.2.1 Contact problems with static friction [10] . . . . . . . . . . . . . . . . . . 41
2.3 What can be expressed via conic quadratic constraints? . . . . . . . . . . . . . . 43
2.3.1 More examples of CQ-representable functions/sets . . . . . . . . . . . . . 58
2.4 More applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.4.1 Robust Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . . 62

3 Semidefinite Programming 77
3.1 Semidefinite cone and Semidefinite programs . . . . . . . . . . . . . . . . . . . . 77
3.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.2 What can be expressed via LMI’s? . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3 Applications of Semidefinite Programming in Engineering . . . . . . . . . . . . . 95
3.3.1 Dynamic Stability in Mechanics . . . . . . . . . . . . . . . . . . . . . . . . 96
3.3.2 Design of chips and Boyd’s time constant . . . . . . . . . . . . . . . . . . 98
3.3.3 Lyapunov stability analysis/synthesis . . . . . . . . . . . . . . . . . . . . 100
3.4 Semidefinite relaxations of intractable problems . . . . . . . . . . . . . . . . . . . 108
3.4.1 Semidefinite relaxations of combinatorial problems . . . . . . . . . . . . . 108
3.4.2 Matrix Cube Theorem and interval stability analysis/synthesis . . . . . . 121
3.4.3 Robust Quadratic Programming . . . . . . . . . . . . . . . . . . . . . . . 128
3.5 Appendix: S-Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132


4 Polynomial Time Interior Point algorithms for LP, CQP and SDP 137
4.1 Complexity of Convex Programming . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.1.1 Combinatorial Complexity Theory . . . . . . . . . . . . . . . . . . . . . . 137
4.1.2 Complexity in Continuous Optimization . . . . . . . . . . . . . . . . . . . 140
4.1.3 Difficult continuous optimization problems . . . . . . . . . . . . . . . . . 144
4.2 Interior Point Polynomial Time Methods for LP, CQP and SDP . . . . . . . . . . 145
4.2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
4.2.2 Interior Point methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.2.3 But... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
4.3 Interior point methods for LP, CQP, and SDP: building blocks . . . . . . . . . . 151
4.3.1 Canonical cones and canonical barriers . . . . . . . . . . . . . . . . . . . . 151
4.3.2 Elementary properties of canonical barriers . . . . . . . . . . . . . . . . . 153
4.4 Primal-dual pair of problems and primal-dual central path . . . . . . . . . . . . . 155
4.4.1 The problem(s) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
4.4.2 The central path(s) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
4.5 Tracing the central path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
4.5.1 The path-following scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 162
4.5.2 Speed of path-tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
4.5.3 The primal and the dual path-following methods . . . . . . . . . . . . . . 165
4.5.4 The SDP case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
4.6 Complexity bounds for LP, CQP, SDP . . . . . . . . . . . . . . . . . . . . . . . . 181
4.6.1 Complexity of LP b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
4.6.2 Complexity of CQP b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
4.6.3 Complexity of SDP b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
4.7 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

5 Simple methods for extremely large-scale problems 187


5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
5.2 Information-based complexity of Convex Programming . . . . . . . . . . . . . . . 189
5.3 The Bundle-Mirror scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
5.4 Implementation issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.5 Illustration: PET Image Reconstruction problem . . . . . . . . . . . . . . . . . . 204
5.6 Appendix: strong convexity of ω(·) for standard setups . . . . . . . . . . . . . . . 211
Lecture 1

From Linear to Conic Programming

1.1 Linear programming: basic notions


A Linear Programming (LP) program is an optimization program of the form

min_x { c^T x : Ax ≥ b },        (LP)
where
• x ∈ R^n is the design vector;
• c ∈ R^n is a given vector of coefficients of the objective function c^T x;
• A is a given m × n constraint matrix, and b ∈ R^m is a given right hand side of the
constraints.
(LP) is called
– feasible, if its feasible set
F = {x | Ax − b ≥ 0}
is nonempty; a point x ∈ F is called a feasible solution to (LP);
– bounded below, if it is either infeasible, or its objective c^T x is bounded below on F.
For a feasible bounded-below problem (LP), the quantity

c* ≡ inf_{x: Ax−b≥0} c^T x

is called the optimal value of the problem. For an infeasible problem, we set c* = +∞,
while for a feasible problem unbounded below we set c* = −∞.
(LP) is called solvable, if it is feasible, bounded below and the optimal value is attained, i.e.,
there exists x ∈ F with c^T x = c*. An x of this type is called an optimal solution to (LP).
A priori it is unclear whether a feasible and bounded-below LP program is solvable: why should
the infimum be achieved? It turns out, however, that a feasible and bounded-below program
(LP) always is solvable. This nice fact (we shall establish it later) is specific for LP. Indeed, a
very simple nonlinear optimization program

min { 1/x : x ≥ 1 }

is feasible and bounded below, but it is not solvable.


1.2 Duality in linear programming


The most important and interesting feature of linear programming as a mathematical entity
(i.e., aside from computations and applications) is the wonderful LP duality theory we are about
to consider. We motivate this topic by first addressing the following question:
Given an LP program

c* = min_x { c^T x : Ax − b ≥ 0 },        (LP)

how can we find a systematic way to bound from below its optimal value c*?

Why this is an important question, and how the answer helps one deal with LP, will be seen
in the sequel. For the time being, let us just believe that the question is worthy of the effort.
A trivial answer to the posed question is: solve (LP) and look at the optimal value.
There is, however, a smarter and much more instructive way to answer our question. Just to
get an idea of this way, let us look at the following example:

min { x_1 + x_2 + ... + x_2002 :
      x_1 + 2x_2 + ... + 2001x_2001 + 2002x_2002 − 1 ≥ 0,
      2002x_1 + 2001x_2 + ... + 2x_2001 + x_2002 − 100 ≥ 0,
      ..... }

We claim that the optimal value in the problem is ≥ 101/2003. How could one certify this bound?
This is immediate: add the first two constraints to get the inequality

2003(x_1 + x_2 + ... + x_2001 + x_2002) − 101 ≥ 0,

and divide the resulting inequality by 2003. LP duality is nothing but a straightforward gener-
alization of this simple trick.
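As a quick sanity check of this trick (a small illustration added here, not part of the original text): the coefficient of x_i is i in the first constraint and 2003 − i in the second, so the sum of the two constraints has every coefficient equal to 2003.

coeff1 = [i for i in range(1, 2003)]             # x_1 + 2x_2 + ... + 2002x_2002 >= 1
coeff2 = [2003 - i for i in range(1, 2003)]      # 2002x_1 + ... + x_2002 >= 100
summed = [a + b for a, b in zip(coeff1, coeff2)]
assert all(c == 2003 for c in summed)
print("certified lower bound:", (1 + 100) / 2003)   # 101/2003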

1.2.1 Certificates for solvability and insolvability


Consider a (finite) system of scalar inequalities with n unknowns. To be as general as possible,
we do not assume for the time being the inequalities to be linear, and we allow for both non-
strict and strict inequalities in the system, as well as for equalities. Since an equality can be
represented by a pair of non-strict inequalities, our system can always be written as

f_i(x) Ω_i 0, i = 1, ..., m,        (S)

where every Ω_i is either the relation " > " or the relation " ≥ ".


The basic question about (S) is
(?) Whether (S) has a solution or not.
Knowing how to answer the question (?), we are able to answer many other questions. E.g., to
verify whether a given real a is a lower bound on the optimal value c* of (LP) is the same as to
verify whether the system

−c^T x + a > 0,  Ax − b ≥ 0

has no solutions.
The general question above is too difficult, and it makes sense to pass from it to a seemingly
simpler one:

(??) How to certify that (S) has, or does not have, a solution.

Imagine that you are very smart and know the correct answer to (?); how could you convince
somebody that your answer is correct? What could be an “evident for everybody” certificate of
the validity of your answer?
If your claim is that (S) is solvable, a certificate could be just to point out a solution x∗ to
(S). Given this certificate, one can substitute x∗ into the system and check whether x∗ indeed
is a solution.
Assume now that your claim is that (S) has no solutions. What could be a "simple certificate"
of this claim? How could one certify a negative statement? This is a highly nontrivial problem
not just for mathematics; consider, for example, criminal law: how should someone accused of a murder
prove his innocence? The "real life" answer to the question "how to certify a negative statement"
is discouraging: such a statement normally cannot be certified (this is where the rule "a person
is presumed innocent until proven guilty" comes from). In mathematics, however, the situation
is different: in some cases there exist “simple certificates” of negative statements. E.g., in order
to certify that (S) has no solutions, it suffices to demonstrate that a consequence of (S) is a
contradictory inequality such as
−1 ≥ 0.
For example, assume that λ_i, i = 1, ..., m, are nonnegative weights. Combining inequalities from
(S) with these weights, we come to the inequality

Σ_{i=1}^m λ_i f_i(x) Ω 0        (Cons(λ))

where Ω is either " > " (this is the case when the weight of at least one strict inequality from
(S) is positive), or " ≥ " (otherwise). Since the resulting inequality, due to its origin, is a
consequence of the system (S), i.e., it is satisfied by every solution to (S), it follows that if
(Cons(λ)) has no solutions at all, we can be sure that (S) has no solution. Whenever this is the
case, we may treat the corresponding vector λ as a "simple certificate" of the fact that (S) is
infeasible.
Let us look at what the outlined approach means when (S) is comprised of linear inequalities:

(S):  { a_i^T x Ω_i b_i, i = 1, ..., m }    [Ω_i is " > " or " ≥ "]

Here the "combined inequality" is linear as well:

(Cons(λ)):  ( Σ_{i=1}^m λ_i a_i )^T x  Ω  Σ_{i=1}^m λ_i b_i

(Ω is " > " whenever λ_i > 0 for at least one i with Ω_i = " > ", and Ω is " ≥ " otherwise). Now,
when can a linear inequality

d^T x Ω e

be contradictory? Of course, it can happen only when d = 0. Whether in this case the inequality
is contradictory depends on the relation Ω: if Ω = " > ", then the inequality is
contradictory if and only if e ≥ 0, and if Ω = " ≥ ", it is contradictory if and only if e > 0. We
have established the following simple result:

Proposition 1.2.1 Consider a system of linear inequalities

(S):  a_i^T x > b_i, i = 1, ..., m_s;
      a_i^T x ≥ b_i, i = m_s + 1, ..., m,

with an n-dimensional vector of unknowns x. Let us associate with (S) two systems of linear
inequalities and equations with an m-dimensional vector of unknowns λ:

T_I:   (a)   λ ≥ 0;
       (b)   Σ_{i=1}^m λ_i a_i = 0;
       (c_I) Σ_{i=1}^m λ_i b_i ≥ 0;
       (d_I) Σ_{i=1}^{m_s} λ_i > 0.

T_II:  (a)    λ ≥ 0;
       (b)    Σ_{i=1}^m λ_i a_i = 0;
       (c_II) Σ_{i=1}^m λ_i b_i > 0.

Assume that at least one of the systems T_I, T_II is solvable. Then the system (S) is infeasible.

Proposition 1.2.1 says that in some cases it is easy to certify infeasibility of a linear system of
inequalities: a "simple certificate" is a solution to another system of linear inequalities. Note,
however, that the existence of a certificate of this latter type is, at this point, only a sufficient,
but not a necessary, condition for the infeasibility of (S). A fundamental result in the theory of
linear inequalities is that the sufficient condition in question is in fact also necessary:

Theorem 1.2.1 [General Theorem on Alternative] In the notation from Proposition 1.2.1, sys-
tem (S) has no solutions if and only if either T_I, or T_II, or both these systems, are solvable.
There are numerous proofs of the Theorem on Alternative; to my taste, the most instructive one is to
reduce the Theorem to its particular case – the Homogeneous Farkas Lemma:
[Homogeneous Farkas Lemma] A homogeneous nonstrict linear inequality

a^T x ≤ 0

is a consequence of a system of homogeneous nonstrict linear inequalities

a_i^T x ≤ 0, i = 1, ..., m

if and only if it can be obtained from the system by taking a weighted sum with nonnegative
weights:

(a) a_i^T x ≤ 0, i = 1, ..., m  ⇒  a^T x ≤ 0,
                 ⇕                                (1.2.1)
(b) ∃ λ_i ≥ 0 :  a = Σ_i λ_i a_i.

The reduction of the Theorem on Alternative to the HFL is easy. As for the HFL itself, there are, essentially, two ways to prove the
statement:
• The "quick and dirty" one, based on separation arguments, which is as follows:

1. First, we demonstrate that if A is a nonempty closed convex set in R^n and a is a point from
   R^n \ A, then a can be strongly separated from A by a linear form: there exists x ∈ R^n such
   that

        x^T a < inf_{b∈A} x^T b.        (1.2.2)

   To this end, it suffices to verify that
   (a) In A, there exists a point closest to a w.r.t. the standard Euclidean norm ‖b‖_2 = √(b^T b),
       i.e., that the optimization program

            min_{b∈A} ‖a − b‖_2

       has a solution b*;
   (b) Setting x = b* − a, one ensures (1.2.2).
   Both (a) and (b) are immediate.
2. Second, we demonstrate that the set

        A = {b : ∃λ ≥ 0 : b = Σ_{i=1}^m λ_i a_i}

   – the cone spanned by the vectors a_1, ..., a_m – is convex (which is immediate) and closed (the
   proof of this crucial fact also is not difficult).
3. Combining the above facts, we immediately see that
   – either a ∈ A, i.e., (1.2.1.b) holds,
   – or there exists x such that x^T a < inf_{λ≥0} x^T Σ_i λ_i a_i.
   The latter inf is finite if and only if x^T a_i ≥ 0 for all i, and in this case the inf is 0, so that
   the "or" statement says exactly that there exists x with a_i^T x ≥ 0 for all i and a^T x < 0, or, which is the
   same, that (1.2.1.a) does not hold.
   Thus, among the statements (1.2.1.a) and the negation of (1.2.1.b) at least one (and, as is
   immediately seen, at most one as well) always is valid, which is exactly the equivalence
   (1.2.1).
• "Advanced" proofs based purely on Linear Algebra facts. The advantage of these purely Linear
Algebra proofs is that they, in contrast to the outlined separation-based proof, do not use the
completeness of R^n as a metric space and thus work when we pass from systems with real coefficients
and unknowns to systems with rational (or algebraic) coefficients. As a result, an advanced proof
allows one to establish the Theorem on Alternative for the case when the coefficients and unknowns in
(S), T_I, T_II are restricted to belong to a given "real field" (e.g., are rational).
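The Theorem on Alternative also suggests a computational recipe: to certify infeasibility of a nonstrict system a_i^T x ≥ b_i, one may search for a solution of T_II itself, which is a linear feasibility problem. A sketch (an illustration under the assumption that numpy and scipy are available; the two-inequality system is invented): since a certificate can be rescaled, we normalize Σ_i λ_i b_i = 1 and hand the search to an LP solver.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0], [-1.0]])   # rows a_i^T: the system  x >= 1,  -x >= 0
b = np.array([1.0, 0.0])        # infeasible: it requires x >= 1 and x <= 0

m, n = A.shape
res = linprog(c=np.zeros(m),                             # pure feasibility problem
              A_eq=np.vstack([A.T, b[None, :]]),         # sum_i lambda_i a_i = 0
              b_eq=np.concatenate([np.zeros(n), [1.0]]), # sum_i lambda_i b_i = 1
              bounds=[(0, None)] * m)                    # lambda >= 0
print(res.success, res.x)       # True, lambda = (1, 1): an infeasibility certificate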
We formulate here explicitly two very useful principles following from the Theorem on Al-
ternative:

A. A system of linear inequalities

a_i^T x Ω_i b_i, i = 1, ..., m

has no solutions if and only if one can combine the inequalities of the system in
a linear fashion (i.e., multiplying the inequalities by nonnegative weights, adding
the results and passing, if necessary, from an inequality a^T x > b to the inequality
a^T x ≥ b) to get a contradictory inequality, namely, either the inequality 0^T x ≥ 1, or
the inequality 0^T x > 0.
B. A linear inequality

a_0^T x Ω_0 b_0

is a consequence of a solvable system of linear inequalities

a_i^T x Ω_i b_i, i = 1, ..., m

if and only if it can be obtained by combining, in a linear fashion, the inequalities of
the system and the trivial inequality 0 > −1.
It should be stressed that the above principles are highly nontrivial and very deep. Consider,
e.g., the following system of 4 linear inequalities with two variables u, v:

−1 ≤ u ≤ 1,
−1 ≤ v ≤ 1.

From these inequalities it follows that

u² + v² ≤ 2,        (!)

which in turn implies, by the Cauchy inequality, the linear inequality u + v ≤ 2:

u + v = 1 × u + 1 × v ≤ √(1² + 1²) · √(u² + v²) ≤ (√2)² = 2.        (!!)

The concluding inequality is linear and is a consequence of the original system, but in the
demonstration of this fact both steps (!) and (!!) are "highly nonlinear". It is absolutely
unclear a priori why the same consequence can, as stated by Principle A, be derived
from the system in a linear manner as well [of course it can – it suffices just to add the two
inequalities u ≤ 1 and v ≤ 1].
Note that the Theorem on Alternative and its corollaries A and B heavily exploit the fact
that we are speaking about linear inequalities. E.g., consider the following 2 quadratic and
2 linear inequalities with two variables:

(a) u² ≥ 1;
(b) v² ≥ 1;
(c) u ≥ 0;
(d) v ≥ 0;

along with the quadratic inequality

(e) uv ≥ 1.

The inequality (e) is clearly a consequence of (a) – (d). However, if we extend the system of
inequalities (a) – (d) by all "trivial" (i.e., identically true) linear and quadratic inequalities
with 2 variables, like 0 > −1, u² + v² ≥ 0, u² + 2uv + v² ≥ 0, u² − uv + v² ≥ 0, etc.,
and ask whether (e) can be derived in a linear fashion from the inequalities of the extended
system, the answer will be negative. Thus, Principle A fails to be true already for quadratic
inequalities (which is a great sorrow – otherwise there would be no difficult problems at all!)

We are about to use the Theorem on Alternative to obtain the basic results of the LP duality
theory.

1.2.2 Dual to an LP program: the origin


As already mentioned, the motivation for constructing the problem dual to an LP program

c* = min_x { c^T x : Ax − b ≥ 0 },   where A = [a_1^T; a_2^T; ...; a_m^T] ∈ R^{m×n},        (LP)

is the desire to generate, in a systematic way, lower bounds on the optimal value c* of (LP).
An evident way to bound from below a given function f(x) in the domain given by a system of
inequalities

g_i(x) ≥ b_i, i = 1, ..., m,        (1.2.3)

is offered by what is called Lagrange duality, and is as follows:

Lagrange Duality:
• Let us look at all inequalities which can be obtained from (1.2.3) by linear aggre-
gation, i.e., at the inequalities of the form

Σ_i y_i g_i(x) ≥ Σ_i y_i b_i        (1.2.4)

with the "aggregation weights" y_i ≥ 0. Note that the inequality (1.2.4), due to its
origin, is valid on the entire set X of solutions of (1.2.3).
• Depending on the choice of aggregation weights, it may happen that the left hand
side in (1.2.4) is ≤ f(x) for all x ∈ R^n. Whenever this is the case, the right hand side
Σ_i y_i b_i of (1.2.4) is a lower bound on f in X.

Indeed, on X the quantity Σ_i y_i b_i is a lower bound on Σ_i y_i g_i(x), and for the y in
question the latter function of x is everywhere ≤ f(x).

It follows that
• The optimal value in the problem

max_y { Σ_i y_i b_i :  y ≥ 0 (a);  Σ_i y_i g_i(x) ≤ f(x) ∀x ∈ R^n (b) }        (1.2.5)

is a lower bound on the values of f on the set of solutions to the system (1.2.3).
Let us look at what happens with the Lagrange duality when f and g_i are homogeneous linear
functions: f = c^T x, g_i(x) = a_i^T x. In this case, the requirement (1.2.5.b) merely says that
c = Σ_i y_i a_i (or, which is the same, A^T y = c due to the origin of A). Thus, problem (1.2.5)
becomes the Linear Programming problem

max_y { b^T y : A^T y = c, y ≥ 0 },        (LP*)

which is nothing but the LP dual of (LP).


By the construction of the dual problem,
[Weak Duality] The optimal value in (LP∗ ) is less than or equal to the optimal value
in (LP).
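The construction is easy to try numerically. The following sketch (illustrative data; numpy and scipy assumed) solves a small primal (LP) and its dual (LP*) and compares the optimal values; here the weak-duality bound is attained, in line with the discussion that follows.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
c = np.array([1.0, 2.0])

# Primal: linprog minimizes c^T x subject to A_ub x <= b_ub, so the constraint
# Ax >= b is rewritten as -Ax <= -b; the variables are left unbounded.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2)

# Dual: max b^T y = -min (-b)^T y subject to A^T y = c, y >= 0.
dual = linprog(-b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 3)

print(primal.fun, -dual.fun)    # both 1.0: no duality gap on this instance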
In fact, the "less than or equal to" in the latter statement is "equal", provided that the optimal
value c* in (LP) is a number (i.e., (LP) is feasible and bounded below). To see that this indeed
is the case, note that a real a is a lower bound on c* if and only if c^T x ≥ a whenever Ax ≥ b,
or, which is the same, if and only if the system of linear inequalities

(S_a):  −c^T x + a > 0,  Ax ≥ b


has no solution. We know by the Theorem on Alternative that the latter fact means that some
other system of linear inequalities (more exactly, at least one of a certain pair of systems) does
have a solution. More precisely,
(*) (S_a) has no solutions if and only if at least one of the following two systems with
m + 1 unknowns:

T_I:   (a)   λ = (λ_0, λ_1, ..., λ_m) ≥ 0;
       (b)   −λ_0 c + Σ_{i=1}^m λ_i a_i = 0;
       (c_I) −λ_0 a + Σ_{i=1}^m λ_i b_i ≥ 0;
       (d_I) λ_0 > 0,

or

T_II:  (a)    λ = (λ_0, λ_1, ..., λ_m) ≥ 0;
       (b)    −λ_0 c + Σ_{i=1}^m λ_i a_i = 0;
       (c_II) −λ_0 a + Σ_{i=1}^m λ_i b_i > 0

– has a solution.
Now assume that (LP) is feasible. We claim that under this assumption (S_a) has no solutions
if and only if T_I has a solution.

The implication "T_I has a solution ⇒ (S_a) has no solution" is readily given by the above
remarks. To verify the inverse implication, assume that (S_a) has no solutions and the system
Ax ≥ b has a solution, and let us prove that then T_I has a solution. If T_I has no solution, then
by (*) T_II has a solution and, moreover, λ_0 = 0 for (every) solution to T_II (since a solution
to the latter system with λ_0 > 0 solves T_I as well). But the fact that T_II has a solution λ
with λ_0 = 0 is independent of the values of a and c; if this were the case, it would
mean, by the same Theorem on Alternative, that, e.g., the following instance of (S_a):

0^T x > −1,  Ax ≥ b

has no solutions. The latter means that the system Ax ≥ b has no solutions – a contradiction
with the assumption that (LP) is feasible.

Now, if T_I has a solution, then it has a solution with λ_0 = 1 as well (to see this, pass from
a solution λ to λ/λ_0; this construction is well-defined, since λ_0 > 0 for every solution
to T_I). Further, an (m + 1)-dimensional vector λ = (1, y) is a solution to T_I if and only if the
m-dimensional vector y solves the system of linear inequalities and equations

y ≥ 0;
A^T y ≡ Σ_{i=1}^m y_i a_i = c;        (D)
b^T y ≥ a.

Summarizing our observations, we come to the following result.
Proposition 1.2.2 Assume that system (D) associated with the LP program (LP) has a solution
(y, a). Then a is a lower bound on the optimal value in (LP). Vice versa, if (LP) is feasible and
a is a lower bound on the optimal value of (LP), then a can be extended by a properly chosen
m-dimensional vector y to a solution to (D).

We see that the entity responsible for lower bounds on the optimal value of (LP) is the system
(D): every solution to the latter system induces a bound of this type, and in the case when
(LP) is feasible, all lower bounds can be obtained from solutions to (D). Now note that if
(y, a) is a solution to (D), then the pair (y, b^T y) also is a solution to the same system, and the
lower bound b^T y on c* is not worse than the lower bound a. Thus, as far as lower bounds on
c* are concerned, we lose nothing by restricting ourselves to the solutions (y, a) of (D) with
a = b^T y; the best lower bound on c* given by (D) is therefore the optimal value of the problem

max_y { b^T y : A^T y = c, y ≥ 0 },

which is nothing but the problem (LP*) dual to (LP). Note that (LP*) is also a Linear
Programming program.
All we know about the dual problem at the moment is the following:
Proposition 1.2.3 Whenever y is a feasible solution to (LP*), the corresponding value of the
dual objective b^T y is a lower bound on the optimal value c* of (LP). If (LP) is feasible, then for
every a ≤ c* there exists a feasible solution y of (LP*) with b^T y ≥ a.

1.2.3 The LP Duality Theorem


Proposition 1.2.3 is in fact equivalent to the following

Theorem 1.2.2 [Duality Theorem in Linear Programming] Consider a linear programming
program

min_x { c^T x : Ax ≥ b }        (LP)

along with its dual

max_y { b^T y : A^T y = c, y ≥ 0 }        (LP*)
Then
1) The duality is symmetric: the problem dual to dual is equivalent to the primal;
2) The value of the dual objective at every dual feasible solution is ≤ the value of the primal
objective at every primal feasible solution
3) The following 5 properties are equivalent to each other:

(i) The primal is feasible and bounded below.


(ii) The dual is feasible and bounded above.
(iii) The primal is solvable.
(iv) The dual is solvable.
(v) Both primal and dual are feasible.

Whenever (i) ≡ (ii) ≡ (iii) ≡ (iv) ≡ (v) is the case, the optimal values of the primal and the dual
problems are equal to each other.

Proof. 1) is quite straightforward: writing the dual problem (LP*) in our standard form, we
get

min_y { −b^T y :  [I_m; A^T; −A^T] y − [0; c; −c] ≥ 0 },

where I_m is the m-dimensional unit matrix (the three blocks of constraints say y ≥ 0,
A^T y ≥ c and A^T y ≤ c, i.e., A^T y = c). Applying the duality transformation to the latter
problem, we come to the problem

max_{ξ,η,ζ} { 0^T ξ + c^T η + (−c)^T ζ :  ξ ≥ 0, η ≥ 0, ζ ≥ 0, ξ + Aη − Aζ = −b },

which is clearly equivalent to (LP) (set x = ζ − η).


2) is readily given by Proposition 1.2.3.
3):

(i)⇒(iv): If the primal is feasible and bounded below, its optimal value c* (which
of course is a lower bound on itself) can, by Proposition 1.2.3, be (non-strictly)
majorized by a quantity b^T y*, where y* is a feasible solution to (LP*). In the
situation in question, of course, b^T y* = c* (by the already proved item 2)); on the other
hand, in view of the same Proposition 1.2.3, the optimal value in the dual is ≤ c*. We
conclude that the optimal value in the dual is attained and is equal to the optimal
value in the primal.
(iv)⇒(ii): evident;
(ii)⇒(iii): This implication, in view of the primal-dual symmetry, follows from the
implication (i)⇒(iv).
(iii)⇒(i): evident.
We have seen that (i)≡(ii)≡(iii)≡(iv) and that the first (and consequently each) of
these 4 equivalent properties implies that the optimal value in the primal problem
is equal to the optimal value in the dual one. All that remains is to prove the
equivalence between (i)–(iv), on one hand, and (v), on the other hand. This is
immediate: (i)–(iv), of course, imply (v); vice versa, in the case of (v) the primal is
not only feasible, but also bounded below (this is an immediate consequence of the
feasibility of the dual problem, see 2)), and (i) follows.

An immediate corollary of the LP Duality Theorem is the following necessary and sufficient
optimality condition in LP:

Theorem 1.2.3 [Necessary and sufficient optimality conditions in linear programming] Con-
sider an LP program (LP) along with its dual (LP*). A pair (x, y) of primal and dual feasible
solutions is comprised of optimal solutions to the respective problems if and only if

y_i [Ax − b]_i = 0, i = 1, ..., m,        [complementary slackness]

and likewise if and only if

c^T x − b^T y = 0.        [zero duality gap]

Indeed, the "zero duality gap" optimality condition is an immediate consequence of the fact
that the value of the primal objective at every primal feasible solution is ≥ the value of the
dual objective at every dual feasible solution, while the optimal values in the primal and the
dual are equal to each other, see Theorem 1.2.2. The equivalence between the "zero duality
gap" and the "complementary slackness" optimality conditions is given by the following
computation: whenever x is primal feasible and y is dual feasible, the products y_i [Ax − b]_i,
i = 1, ..., m, are nonnegative, while the sum of these products is precisely the duality gap:

y^T [Ax − b] = (A^T y)^T x − b^T y = c^T x − b^T y.

Thus, the duality gap can vanish at a primal-dual feasible pair (x, y) if and only if all products
y_i [Ax − b]_i for this pair are zeros.
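This computation can be checked mechanically. In the sketch below (an illustration with a hand-picked optimal primal-dual pair for a tiny LP; numpy assumed), the printed quantities reproduce the complementary slackness products, the duality gap, and the identity connecting them.

import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
c = np.array([1.0, 2.0])
x = np.array([1.0, 0.0])        # primal optimal for  min c^T x, Ax >= b
y = np.array([0.0, 1.0, 1.0])   # dual optimal for    max b^T y, A^T y = c, y >= 0

slack = A @ x - b
print(y * slack)                              # all zeros: complementary slackness
print(c @ x - b @ y)                          # 0.0: zero duality gap
print(np.isclose(y @ slack, c @ x - b @ y))   # the identity y^T(Ax - b) = c^T x - b^T y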

1.3 From Linear to Conic Programming


Linear Programming models cover numerous applications. Whenever applicable, LP allows one to
obtain useful quantitative and qualitative information on the problem at hand. The specific
analytic structure of LP programs gives rise to a number of general results (e.g., those of the LP
Duality Theory) which provide us in many cases with valuable insight and understanding. At
the same time, this analytic structure underlies some specific computational techniques for LP;
these techniques, which by now are perfectly well developed, allow one to routinely solve quite large
(tens/hundreds of thousands of variables and constraints) LP programs. Nevertheless, there
are situations in reality which cannot be covered by LP models. To handle these "essentially
nonlinear" cases, one needs to extend the basic theoretical results and computational techniques
known for LP beyond the bounds of Linear Programming.
For the time being, the widest class of optimization problems to which the basic results of
LP were extended, is the class of convex optimization programs. There are several equivalent
ways to define a general convex optimization problem; the one we are about to use is not the
traditional one, but it is well suited to encompass the range of applications we intend to cover
in our course.
When passing from a generic LP problem

min_x { c^T x : Ax ≥ b }    [A : m × n]        (LP)

to its nonlinear extensions, we should expect to encounter some nonlinear components in the
problem. The traditional way here is to say: "Well, in (LP) there are a linear objective function
f(x) = c^T x and inequality constraints f_i(x) ≥ b_i with linear functions f_i(x) = a_i^T x, i = 1, ..., m.
Let us allow some/all of these functions f, f_1, ..., f_m to be nonlinear." In contrast to this tra-
ditional way, we intend to keep the objective and the constraints linear, but introduce "nonlin-
earity" in the inequality sign ≥.

1.4 Orderings of R^m and convex cones


The constraint inequality Ax ≥ b in (LP) is an inequality between vectors; as such, it requires a
definition, and the definition is well-known: given two vectors a, b ∈ R^m, we write a ≥ b if the
coordinates of a majorize the corresponding coordinates of b:

a ≥ b ⇔ {a_i ≥ b_i, i = 1, ..., m}.        (" ≥ ")

In the latter relation, we again meet with the inequality sign ≥, but now it stands for the
"arithmetic ≥" – a well-known relation between real numbers. The above "coordinate-wise"
partial ordering of vectors in R^m satisfies a number of basic properties of the standard ordering
of reals; namely, for all vectors a, b, c, d, ... ∈ R^m one has

1. Reflexivity: a ≥ a;

2. Anti-symmetry: if both a ≥ b and b ≥ a, then a = b;

3. Transitivity: if both a ≥ b and b ≥ c, then a ≥ c;

4. Compatibility with linear operations:

(a) Homogeneity: if a ≥ b and λ is a nonnegative real, then λa ≥ λb


(”One can multiply both sides of an inequality by a nonnegative real”)
(b) Additivity: if both a ≥ b and c ≥ d, then a + c ≥ b + d
(”One can add two inequalities of the same sign”).

It turns out that

• A significant part of the nice features of LP programs comes from the fact that the vector
inequality ≥ in the constraint of (LP) satisfies the properties 1. – 4.;

• The standard inequality ” ≥ ” is neither the only possible, nor the only interesting way to
define the notion of a vector inequality fitting the axioms 1. – 4.

As a result,

A generic optimization problem which looks exactly the same as (LP), up to the
fact that the inequality ≥ in (LP) is now replaced with an ordering which differs
from the component-wise one, inherits a significant part of the properties of LP
problems. Specifying properly the ordering of vectors, one can obtain from (LP)
generic optimization problems covering many important applications which cannot
be treated by the standard LP.

So far, what was said is just a declaration. Let us look at how this declaration comes to
life.
We start with clarifying the "geometry" of a "vector inequality" satisfying the axioms 1. –
4. Thus, we consider vectors from a finite-dimensional Euclidean space E with an inner product
⟨·, ·⟩ and assume that E is equipped with a partial ordering, denoted by ≽: in other
words, we say which pairs of vectors a, b from E are linked by the inequality a ≽ b. We call
the ordering "good" if it obeys the axioms 1. – 4., and we are interested to understand what
these good orderings are.
Our first observation is:

A. A good inequality ≽ is completely identified by the set K of ≽-nonnegative vectors:

K = {a ∈ E | a ≽ 0}.

Namely,

a ≽ b ⇔ a − b ≽ 0 [⇔ a − b ∈ K].

Indeed, let a ≽ b. By 1. we have −b ≽ −b, and by 4.(b) we may add the latter
inequality to the former one to get a − b ≽ 0. Vice versa, if a − b ≽ 0, then, adding
to this inequality the one b ≽ b, we get a ≽ b.

The set K in Observation A cannot be arbitrary. It is easy to verify that it must be a pointed
convex cone, i.e., it must satisfy the following conditions:

1. K is nonempty and closed under addition:

a, a′ ∈ K ⇒ a + a′ ∈ K;

2. K is a conic set:

a ∈ K, λ ≥ 0 ⇒ λa ∈ K;

3. K is pointed:

a ∈ K and −a ∈ K ⇒ a = 0.

Geometrically: K does not contain straight lines passing through the origin.

Thus, every nonempty pointed convex cone K in E induces a partial ordering on E which
satisfies the axioms 1. – 4. We denote this ordering by ≥_K:

a ≥_K b ⇔ a − b ≥_K 0 ⇔ a − b ∈ K.
What is the cone responsible for the standard coordinate-wise ordering ≥ on E = R^m we have
started with? The answer is clear: this is the cone comprised of vectors with nonnegative entries
– the nonnegative orthant

R^m_+ = {x = (x_1, ..., x_m)^T ∈ R^m : x_i ≥ 0, i = 1, ..., m}.

(Thus, in order to express the fact that a vector a is greater than or equal to, in the component-
wise sense, a vector b, we were supposed to write a ≥_{R^m_+} b. However, we are not going to be
that formal and shall use the standard shorthand notation a ≥ b.)
The nonnegative orthant R^m_+ is not just a pointed convex cone; it possesses two useful
additional properties:

I. The cone is closed: if a sequence of vectors a_i from the cone has a limit, the latter also
belongs to the cone.

II. The cone possesses a nonempty interior: there exists a vector such that a ball of positive
radius centered at the vector is contained in the cone.

These additional properties are very important. For example, I is responsible for the possi-
bility to pass to the term-wise limit in an inequality:

a_i ≥ b_i ∀i, a_i → a, b_i → b as i → ∞ ⇒ a ≥ b.

It makes sense to restrict ourselves to good partial orderings coming from cones K sharing
the properties I, II. Thus,

From now on, speaking about good partial orderings ≥_K, we always assume that the
underlying set K is a pointed and closed convex cone with a nonempty interior.

Note that the closedness of K makes it possible to pass to limits in ≥_K-inequalities:

a_i ≥_K b_i, a_i → a, b_i → b as i → ∞ ⇒ a ≥_K b.
The nonemptiness of the interior of K allows us to define, along with the "non-strict" inequality
a ≥_K b, also the strict inequality according to the rule

a >_K b ⇔ a − b ∈ int K,

where int K is the interior of the cone K. E.g., the strict coordinate-wise inequality a >_{R^m_+} b
(shorthand: a > b) simply says that the coordinates of a are strictly greater, in the usual
arithmetic sense, than the corresponding coordinates of b.

Examples. The partial orderings we are especially interested in are given by the following
cones:

• The nonnegative orthant R^m_+ in R^m;

• The Lorentz (or the second-order, or, less scientifically, the ice-cream) cone

L^m = { x = (x_1, ..., x_{m−1}, x_m)^T ∈ R^m : x_m ≥ √( Σ_{i=1}^{m−1} x_i^2 ) };

• The positive semidefinite cone S^m_+. This cone "lives" in the space E = S^m of m × m
symmetric matrices (equipped with the Frobenius inner product ⟨A, B⟩ = Tr(AB) = Σ_{i,j} A_{ij} B_{ij})
and consists of all m × m matrices A which are positive semidefinite, i.e.,

A = A^T;  x^T A x ≥ 0 ∀x ∈ R^m.
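For a quick numerical feel of these three cones, here is a small membership-test sketch (an illustration added here, assuming numpy; the tolerances are arbitrary): the orthant test is coordinate-wise, the Lorentz test compares the last coordinate with the Euclidean norm of the rest, and the semidefinite test inspects eigenvalues.

import numpy as np

def in_orthant(x, tol=1e-9):
    return bool(np.all(x >= -tol))

def in_lorentz(x, tol=1e-9):
    # x_m >= sqrt(x_1^2 + ... + x_{m-1}^2)
    return x[-1] >= np.linalg.norm(x[:-1]) - tol

def in_psd(X, tol=1e-9):
    # symmetric with nonnegative eigenvalues
    return np.allclose(X, X.T) and bool(np.all(np.linalg.eigvalsh(X) >= -tol))

print(in_orthant(np.array([1.0, 0.0, 2.0])))         # True
print(in_lorentz(np.array([-1.0, -1.0, 2.0])))       # True: 2 >= sqrt(2)
print(in_psd(np.array([[2.0, -1.0], [-1.0, 2.0]])))  # True: eigenvalues 1 and 3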

1.5 “Conic programming” – what is it?


Let K be a cone in E (convex, pointed, closed and with a nonempty interior). Given an
objective c ∈ R^n, a linear mapping x ↦ Ax : R^n → E and a right hand side b ∈ E, consider the
optimization problem

min_x { c^T x : Ax ≥_K b }        (CP)

We shall refer to (CP) as a conic problem associated with the cone K. Note that the only
difference between this program and an LP problem is that the latter deals with the particular
choice E = R^m, K = R^m_+. With the formulation (CP), we get the possibility to cover a much
wider spectrum of applications which cannot be captured by LP; we shall look at numerous
examples in the sequel.
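To see what a (CP) instance looks like in practice, below is a minimal sketch with the 3-dimensional ice-cream cone K = L^3, written with the cvxpy modelling package (the choice of cvxpy, the data A, b, c and the default conic solver are all assumptions of this illustration, not part of the lectures).

import cvxpy as cp
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
b = np.array([0.0, 0.0, -2.0])
c = np.array([1.0, 1.0])

x = cp.Variable(2)
y = A @ x - b                    # the constraint of (CP):  y = Ax - b in L^3,
prob = cp.Problem(cp.Minimize(c @ x),
                  [cp.SOC(y[2], y[:2])])   # i.e.  ||(y_1, y_2)|| <= y_3
prob.solve()
print(prob.value, x.value)       # about -2.828 at x = (-sqrt(2), -sqrt(2))

Here y = (x_1, x_2, 2), so the feasible set is the disk x_1^2 + x_2^2 ≤ 4, and minimizing x_1 + x_2 over it gives −2√2.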

1.6 Conic Duality


Aside from algorithmic issues, the most important theoretical result in Linear Programming is the
LP Duality Theorem; can this theorem be extended to conic problems? What is the extension?
The source of the LP Duality Theorem was the desire to get in a systematic way a lower
bound on the optimal value c* in an LP program

c* = min_x { c^T x : Ax ≥ b }.        (LP)

The bound was obtained by looking at the inequalities of the type

⟨λ, Ax⟩ ≡ λ^T Ax ≥ λ^T b        (Cons(λ))

with weight vectors λ ≥ 0. By its origin, an inequality of this type is a consequence of the system
of constraints Ax ≥ b of (LP), i.e., it is satisfied at every solution to the system. Consequently,
whenever we are lucky enough to get, as the left hand side of (Cons(λ)), the expression c^T x, i.e.,
whenever a nonnegative weight vector λ satisfies the relation

A^T λ = c,

the inequality (Cons(λ)) yields a lower bound b^T λ on the optimal value in (LP). And the dual
problem

max { b^T λ : λ ≥ 0, A^T λ = c }

was nothing but the problem of finding the best lower bound one can get in this fashion.
The same scheme can be used to develop the dual to a conic problem

min { c^T x : Ax ≥_K b },  K ⊂ E.        (CP)
Here the only step which needs clarification is the following one:

(?) What are the “admissible” weight vectors λ, i.e., the vectors such that the scalar
inequality
λ, Ax ≥ λ, b
is a consequence of the vector inequality Ax ≥K b?

In the particular case of coordinate-wise partial ordering, i.e., in the case of E = Rm , K = Rm


+,
the admissible vectors were those with nonnegative coordinates. These vectors, however, not
necessarily are admissible for an ordering ≥K when K is different from the nonnegative orthant:

Example 1.6.1 Consider the ordering ≥_{L^3} on E = R^3 given by the 3-dimensional ice-cream
cone:

(a_1, a_2, a_3)^T ≥_{L^3} (0, 0, 0)^T ⇔ a_3 ≥ √(a_1^2 + a_2^2).

The inequality

(−1, −1, 2)^T ≥_{L^3} (0, 0, 0)^T

is valid; however, aggregating this inequality with the aid of the positive weight vector
λ = (1, 1, 0.1)^T, we get the false inequality

−1.8 ≥ 0.

Thus, not every nonnegative weight vector is admissible for the partial ordering ≥_{L^3}.
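A short numeric companion to Example 1.6.1 (an illustration, assuming numpy): it verifies that the vector inequality is valid and that the aggregated scalar inequality nevertheless fails.

import numpy as np

def in_L3(a, tol=1e-9):
    return a[2] >= np.hypot(a[0], a[1]) - tol

a = np.array([-1.0, -1.0, 2.0])
lam = np.array([1.0, 1.0, 0.1])

print(in_L3(a))    # True: the vector inequality a >=_{L^3} 0 is valid
print(lam @ a)     # -1.8 < 0: the aggregated scalar inequality is false,
                   # so lam is not an admissible weight vector for >=_{L^3}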

To answer the question (?) is the same as to say what the weight vectors λ are such that

∀a ≥_K 0 : ⟨λ, a⟩ ≥ 0.        (1.6.1)

Whenever λ possesses the property (1.6.1), the scalar inequality

⟨λ, a⟩ ≥ ⟨λ, b⟩

is a consequence of the vector inequality a ≥_K b:

a ≥_K b
⇔ a − b ≥_K 0        [additivity of ≥_K]
⇒ ⟨λ, a − b⟩ ≥ 0     [by (1.6.1)]
⇔ ⟨λ, a⟩ ≥ ⟨λ, b⟩.

Vice versa, if λ is an admissible weight vector for the partial ordering ≥_K:

∀(a, b : a ≥_K b) : ⟨λ, a⟩ ≥ ⟨λ, b⟩,

then, of course, λ satisfies (1.6.1).

Thus the weight vectors λ which are admissible for a partial ordering ≥_K are exactly the
vectors satisfying (1.6.1), or, which is the same, the vectors from the set

K_* = {λ ∈ E : ⟨λ, a⟩ ≥ 0 ∀a ∈ K}.

The set K_* is comprised of vectors whose inner products with all vectors from K are nonnegative.
K_* is called the cone dual to K. The name is legitimate due to the following fact:

Theorem 1.6.1 [Properties of the dual cone] Let E be a finite-dimensional Euclidean space
with inner product ⟨·, ·⟩ and let K ⊂ E be a nonempty set. Then
(i) The set

K_* = {λ ∈ E : ⟨λ, a⟩ ≥ 0 ∀a ∈ K}

is a closed convex cone.
(ii) If int K ≠ ∅, then K_* is pointed.
(iii) If K is a closed convex pointed cone, then int K_* ≠ ∅.
(iv) If K is a closed convex cone, then so is K_*, and the cone dual to K_* is K itself:

(K_*)_* = K.

An immediate corollary of the Theorem is as follows:

Corollary 1.6.1 A set K ⊂ E is a closed convex pointed cone with a nonempty interior if and
only if the set K_* is so.

From the dual cone to the problem dual to (CP). Now we are ready to derive the dual
problem of a conic problem (CP). As in the case of Linear Programming, we start with the
observation that whenever x is a feasible solution to (CP) and λ is an admissible weight vector,
i.e., λ ∈ K_*, then x satisfies the scalar inequality

(A*λ)^T x ≡ ⟨λ, Ax⟩ ≥ ⟨λ, b⟩ 1)

– this observation is an immediate consequence of the definition of K_*. It follows that whenever
λ is an admissible weight vector satisfying the relation

A*λ = c,

one has

c^T x = (A*λ)^T x = ⟨λ, Ax⟩ ≥ ⟨b, λ⟩

1) For a linear operator x ↦ Ax : R^n → E, A* is the conjugate operator given by the identity

⟨y, Ax⟩ = x^T A*y   ∀(y ∈ E, x ∈ R^n).

When representing the operators by their matrices in orthogonal bases in the argument and the range spaces,
the matrix representing the conjugate operator is exactly the transpose of the matrix representing the operator
itself.
1.6. CONIC DUALITY 25

for all x feasible for (CP), so that the quantity ⟨b, λ⟩ is a lower bound on the optimal value of
(CP). The best bound one can get in this fashion is the optimal value in the problem

max {⟨b, λ⟩ | A∗ λ = c, λ ≥K∗ 0} (D)

and this program is called the program dual to (CP).


So far, what we know about the duality we have just introduced is the following
Proposition 1.6.1 [Weak Duality Theorem] The optimal value of (D) is a lower bound on the
optimal value of (CP).
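For K = R^m_+ (so that K∗ = R^m_+ as well) the construction behind Weak Duality is easy to replay numerically; in the following Python/numpy sketch a primal-dual feasible pair is fabricated by construction (all data are made up for illustration):

    import numpy as np
    rng = np.random.default_rng(1)

    A   = rng.normal(size=(4, 2))
    x0  = rng.normal(size=2)
    b   = A @ x0 - rng.uniform(0.1, 1.0, size=4)  # then A x0 - b > 0: x0 is strictly feasible
    lam = rng.uniform(size=4)                     # lam >= 0, i.e. lam lies in K* = R^4_+
    c   = A.T @ lam                               # enforce the dual constraint A* lam = c

    gap = c @ x0 - b @ lam                        # duality gap at the pair (x0, lam)
    print(gap, (A @ x0 - b) @ lam)                # the two numbers coincide and are >= 0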

1.6.1 Geometry of the primal and the dual problems


The structure of problem (D) looks quite different from the one of (CP). However, a more
careful analysis demonstrates that the difference in structures comes just from how we represent
the data: geometrically, the problems are completely similar. Indeed, in (D) we are asked to
maximize a linear objective ⟨b, λ⟩ over the intersection of an affine plane L∗ = {λ | A∗ λ = c}
with the cone K∗ . And what about (CP)? Let us pass in this problem from the “true design
variables” x to their images y = Ax − b ∈ E. When x runs through Rn , y runs through the affine
plane L = {y = Ax − b | x ∈ Rn }; x ∈ Rn is feasible for (CP) if and only if the corresponding
y = Ax − b belongs to the cone K. Thus, in (CP) we also deal with the intersection of an affine
plane, namely, L, and a cone, namely, K. Now assume that our objective cT x can be expressed
in terms of y = Ax − b:
cT x = ⟨d, Ax − b⟩ + const.
This assumption is clearly equivalent to the inclusion

c ∈ ImA∗ . (1.6.2)

Indeed, in the latter case we have c = A∗ d for some d, whence

cT x = (A∗ d)T x = ⟨d, Ax⟩ = ⟨d, Ax − b⟩ + ⟨d, b⟩ ∀x. (1.6.3)

In the case of (1.6.2) the primal problem (CP) can be posed equivalently as the following problem:

min_y {⟨d, y⟩ | y ∈ L, y ≥K 0} ,

where L = ImA − b and d is (any) vector satisfying the relation A∗ d = c. Thus,


In the case of (1.6.2) the primal problem, geometrically, is the problem of minimizing
a linear form over the intersection of the affine plane L with the cone K, and the
dual problem, similarly, is to maximize another linear form over the intersection of
the affine plane L∗ with the dual cone K∗ .
Now, what happens if the condition (1.6.2) is not satisfied? The answer is very simple: in this
case (CP) makes no sense – it is either unbounded below, or infeasible.
Indeed, assume that (1.6.2) is not satisfied. Then, by Linear Algebra, the vector c is not
orthogonal to the null space of A, so that there exists e such that Ae = 0 and cT e > 0. Now
let x be a feasible solution of (CP); note that all points x − µe, µ ≥ 0, are feasible, and
cT (x − µe) → −∞ as µ → ∞. Thus, when (1.6.2) is not satisfied, problem (CP), whenever
feasible, is unbounded below.

From the above observation we see that if (1.6.2) is not satisfied, then we may reject (CP) from
the very beginning. Thus, from now on we assume that (1.6.2) is satisfied. In fact in what
follows we make a bit stronger assumption:

A. The mapping A is of full column rank, i.e., it has trivial null space.
Assuming that the mapping x ↦ Ax has the trivial null space (“we have eliminated
from the very beginning the redundant degrees of freedom – those not affecting the
value of Ax”), the equation
A∗ d = q
is solvable for every right hand side vector q.

In view of A, problem (CP) can be reformulated as a problem (P) of minimizing a linear objective
⟨d, y⟩ over the intersection of an affine plane L and a cone K. Conversely, a problem (P) of this
latter type can be posed in the form of (CP) – to this end it suffices to represent the plane L as
the image of an affine mapping x ↦ Ax − b (i.e., to parameterize somehow the feasible plane)
and to “translate” the objective ⟨d, y⟩ to the space of x-variables – to set c = A∗ d, which yields

y = Ax − b ⇒ ⟨d, y⟩ = cT x + const.

Thus, when dealing with a conic problem, we may pass from its “analytic form” (CP) to the
“geometric form” (P) and vice versa.
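Numerically, the passage between the two forms is nothing but the identity (1.6.3); a quick Python/numpy check on random data (our own illustration):

    import numpy as np
    rng = np.random.default_rng(2)

    A = rng.normal(size=(5, 3))   # full column rank with probability 1, as assumption A requires
    b = rng.normal(size=5)
    d = rng.normal(size=5)
    c = A.T @ d                   # c = A* d, so c lies in Im A*, as (1.6.2) requires

    x = rng.normal(size=3)
    y = A @ x - b                 # the "geometric" variable corresponding to x
    print(c @ x, d @ y + d @ b)   # equal: c^T x = <d, y> + <d, b>, cf. (1.6.3)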
What are the relations between the “geometric data” of the primal and the dual problems?
We already know that the cone K∗ associated with the dual problem is dual of the cone K
associated with the primal one. What about the feasible planes L and L∗ ? The answer is
simple: they are orthogonal to each other! More exactly, the affine plane L is the translation,
by vector −b, of the linear subspace

L = ImA ≡ {y = Ax | x ∈ Rn }.

And L∗ is the translation, by any solution λ0 of the system A∗ λ = c, e.g., by the solution d to
the system, of the linear subspace

L∗ = Null(A∗ ) ≡ {λ | A∗ λ = 0}.

A well-known fact of Linear Algebra is that the linear subspaces L and L∗ are orthogonal
complements of each other:

L = {y | ⟨y, λ⟩ = 0 ∀λ ∈ L∗ }; L∗ = {λ | ⟨y, λ⟩ = 0 ∀y ∈ L}.

Thus, we come to a nice geometrical conclusion:

A conic problem2) (CP) is the problem

min_y {⟨d, y⟩ | y ∈ L − b, y ≥K 0} (P)

of minimizing a linear objective ⟨d, y⟩ over the intersection of a cone K with an affine
plane L = L − b given as a translation, by vector −b, of a linear subspace L.

2) recall that we have restricted ourselves to the problems satisfying the assumption A

The dual problem is the problem

max_λ {⟨b, λ⟩ | λ ∈ L⊥ + d, λ ≥K∗ 0} (D)

of maximizing the linear objective ⟨b, λ⟩ over the intersection of the dual cone K∗
with an affine plane L∗ = L⊥ + d given as a translation, by the vector d, of the
orthogonal complement L⊥ of L.
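The orthogonality of L = Im A and L⊥ = Null(A∗ ) underlying this description is easy to exhibit numerically (a Python/numpy sketch of ours; in the matrix case A∗ is just AT):

    import numpy as np
    rng = np.random.default_rng(3)

    A = rng.normal(size=(5, 3))        # generic 5x3 matrix: rank 3 with probability 1
    U, s, Vt = np.linalg.svd(A)        # full SVD of A
    L_basis     = U[:, :3]             # columns span L = Im A
    Lperp_basis = U[:, 3:]             # columns span Null(A^T) = (Im A)^perp

    print(np.abs(L_basis.T @ Lperp_basis).max())  # ~ 0: the two subspaces are orthogonal
    print(np.abs(A.T @ Lperp_basis).max())        # ~ 0: A* indeed vanishes on L^perp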

What we get is an extremely transparent geometric description of the primal-dual pair of conic
problems (P), (D). Note that the duality is completely symmetric: the problem dual to (D) is
(P)! Indeed, we know from Theorem 1.6.1 that (K∗ )∗ = K, and of course (L⊥ )⊥ = L. Switch
from maximization to minimization corresponds to the fact that the “shifting vector” in (P) is
(−b), while the “shifting vector” in (D) is d. The geometry of the primal-dual pair (P), (D) is
illustrated on the below picture:
b
L*

K* L

Figure 1.1. Primal-dual pair of conic problems


[bold: primal (vertical segment) and dual (horizontal ray) feasible sets]

Finally, note that in the case when (CP) is an LP program (i.e., in the case when K is the
nonnegative orthant), the “conic dual” problem (D) is exactly the usual LP dual; this fact
immediately follows from the observation that the cone dual to R^m_+ is R^m_+ itself.
We have explored the geometry of a primal-dual pair of conic problems: the “geometric
data” of such a pair are given by a pair of dual to each other cones K, K∗ in E and a pair of
affine planes L = L − b, L∗ = L⊥ + d, where L is a linear subspace in E and L⊥ is its orthogonal
complement. The first problem from the pair – let it be called (P) – is to minimize ⟨d, y⟩ over
y ∈ K ∩ L, and the second (D) is to maximize ⟨b, λ⟩ over λ ∈ K∗ ∩ L∗ . Note that the “geometric
data” (K, K∗ , L, L∗ ) of the pair do not specify completely the problems of the pair: given L, L∗ ,
we can uniquely define L, but not the shift vectors (−b) and d: b is known up to shift by a
vector from L, and d is known up to shift by a vector from L⊥ . However, this non-uniqueness
is of absolutely no importance: replacing a chosen vector d ∈ L∗ by another vector d′ ∈ L∗ , we
pass from (P) to a new problem (P′ ) which is completely equivalent to (P): indeed, both (P)
and (P′ ) have the same feasible set, and on the (common) feasible plane L of the problems their
objectives ⟨d, y⟩ and ⟨d′ , y⟩ differ from each other by a constant:

y ∈ L = L − b, d′ − d ∈ L⊥ ⇒ ⟨d′ − d, y + b⟩ = 0 ⇒ ⟨d′ − d, y⟩ = −⟨d′ − d, b⟩ ∀y ∈ L.

Similarly, shifting b along L, we do modify the objective in (D), but in a trivial way – on the
feasible plane L∗ of the problem the new objective differs from the old one by a constant.

1.7 The Conic Duality Theorem


The Weak Duality (Proposition 1.6.1) we have established so far for conic problems is much
weaker than the Linear Programming Duality Theorem. Is it possible to get results similar to
those of the LP Duality Theorem in the general conic case as well? The answer is affirmative,
provided that the primal problem (CP) is strictly feasible, i.e., that there exists x such that
Ax − b >K 0, or, geometrically, L ∩ int K ≠ ∅.
The advantage of the geometrical definition of strict feasibility is that it is independent of
the particular way in which the feasible plane is defined; hence, with this definition it is clear
what it means that the dual problem (D) is strictly feasible.
Our main result is the following

Theorem 1.7.1 [Conic Duality Theorem] Consider a conic problem

c∗ = min_x {cT x | Ax ≥K b} (CP)

along with its conic dual

b∗ = max {⟨b, λ⟩ | A∗ λ = c, λ ≥K∗ 0} . (D)

1) The duality is symmetric: the dual problem is conic, and the problem dual to dual is the
primal.
2) The value of the dual objective at every dual feasible solution λ is ≤ the value of the primal
objective at every primal feasible solution x, so that the duality gap

cT x − ⟨b, λ⟩

is nonnegative at every “primal-dual feasible pair” (x, λ).

3.a) If the primal (CP) is bounded below and strictly feasible (i.e. Ax >K b for some x), then
the dual (D) is solvable and the optimal values in the problems are equal to each other: c∗ = b∗ .
3.b) If the dual (D) is bounded above and strictly feasible (i.e., there exists λ >K∗ 0 such that
A∗ λ = c), then the primal (CP) is solvable and c∗ = b∗ .

4) Assume that at least one of the problems (CP), (D) is bounded and strictly feasible. Then
a primal-dual feasible pair (x, λ) is a pair of optimal solutions to the respective problems
4.a) if and only if
⟨b, λ⟩ = cT x [zero duality gap]

and
4.b) if and only if
⟨λ, Ax − b⟩ = 0 [complementary slackness]

Proof. 1): The result was already obtained when discussing the geometry of the primal and
the dual problems.
2): This is the Weak Duality Theorem.
3): Assume that (CP) is strictly feasible and bounded below, and let c∗ be the optimal value
of the problem. We should prove that the dual is solvable with the same optimal value. Since
we already know that the optimal value of the dual is ≤ c∗ (see 2)), all we need is to point out
a dual feasible solution λ∗ with ⟨b, λ∗ ⟩ ≥ c∗ .
Consider the convex set
M = {y = Ax − b | x ∈ Rn , cT x ≤ c∗ }.
Let us start with the case of c ≠ 0. We claim that in this case
(i) The set M is nonempty;
(ii) the set M does not intersect the interior int K of the cone K: M ∩ int K = ∅.
(i) is evident (why?). To verify (ii), assume, on the contrary, that there exists a point x̄, cT x̄ ≤ c∗ ,
such that ȳ ≡ Ax̄ − b >K 0. Then, of course, Ax − b >K 0 for all x close enough to x̄, i.e., all
points x in a small enough neighbourhood of x̄ are also feasible for (CP). Since c ≠ 0, there are
points x in this neighbourhood with cT x < cT x̄ ≤ c∗ , which is impossible, since c∗ is the optimal
value of (CP).
Now let us make use of the following basic fact:
Theorem 1.7.2 [Separation Theorem for Convex Sets] Let S, T be nonempty non-
intersecting convex subsets of a finite-dimensional Euclidean space E with inner prod-
uct ⟨·, ·⟩. Then S and T can be separated by a linear functional: there exists a nonzero
vector λ ∈ E such that

sup_{u∈S} ⟨λ, u⟩ ≤ inf_{u∈T} ⟨λ, u⟩ .

Applying the Separation Theorem to S = M and T = int K, we conclude that there exists λ ∈ E
such that

sup_{y∈M} ⟨λ, y⟩ ≤ inf_{y∈int K} ⟨λ, y⟩ . (1.7.1)

From the inequality it follows that the linear form ⟨λ, y⟩ of y is bounded below on int K.
Since this interior is a conic set:
y ∈ int K, µ > 0 ⇒ µy ∈ int K
(why?), this boundedness implies that ⟨λ, y⟩ ≥ 0 for all y ∈ int K. Consequently, ⟨λ, y⟩ ≥ 0 for all
y from the closure of int K, i.e., for all y ∈ K. We conclude that λ ≥K∗ 0, so that the inf in (1.7.1)
is nonnegative. On the other hand, the infimum of a linear form over a conic set clearly cannot
be positive; we conclude that the inf in (1.7.1) is 0, so that the inequality reads

sup_{u∈M} ⟨λ, u⟩ ≤ 0.

Recalling the definition of M , we get

[A∗ λ]T x ≤ ⟨λ, b⟩ (1.7.2)

for all x from the half-space cT x ≤ c∗ . But the linear form [A∗ λ]T x can be bounded above on
the half-space if and only if the vector A∗ λ is proportional, with a nonnegative coefficient, to
the vector c:
A∗ λ = µc

for some µ ≥ 0. We claim that µ > 0. Indeed, assuming µ = 0, we get A∗ λ = 0, whence ⟨λ, b⟩ ≥ 0
in view of (1.7.2). It is time now to recall that (CP) is strictly feasible, i.e., Ax̄ − b >K 0 for
some x̄. Since λ ≥K∗ 0 and λ ≠ 0, the product ⟨λ, Ax̄ − b⟩ should be strictly positive (why?),
while in fact we know that the product is −⟨λ, b⟩ ≤ 0 (since A∗ λ = 0 and, as we have seen,
⟨λ, b⟩ ≥ 0).
Thus, µ > 0. Setting λ∗ = µ−1 λ, we get

λ∗ ≥K∗ 0 [since λ ≥K∗ 0 and µ > 0]
A∗ λ∗ = c [since A∗ λ = µc]
cT x ≤ ⟨λ∗ , b⟩ ∀x : cT x ≤ c∗ [see (1.7.2)]

We see that λ∗ is feasible for (D), the value of the dual objective at λ∗ being at least c∗ , as
required.
It remains to consider the case c = 0. Here, of course, c∗ = 0, and the existence of a dual
feasible solution with the value of the objective ≥ c∗ = 0 is evident: the required solution is
λ = 0. 3.a) is proved.
3.b): the result follows from 3.a) in view of the primal-dual symmetry.
4): Let x be primal feasible, and λ be dual feasible. Then

cT x − ⟨b, λ⟩ = (A∗ λ)T x − ⟨b, λ⟩ = ⟨Ax − b, λ⟩ .

We get a useful identity as follows:

(!) For every primal-dual feasible pair (x, λ) the duality gap cT x − ⟨b, λ⟩ is equal to
the inner product of the primal slack vector y = Ax − b and the dual vector λ.

Note that (!) in fact does not require “full” primal-dual feasibility: x may be ar-
bitrary (i.e., y should belong to the primal feasible plane ImA − b), and λ should
belong to the dual feasible plane A∗ λ = c, but y and λ need not belong to the
respective cones.
In view of (!) the complementary slackness holds if and only if the duality gap is zero; thus, all
we need is to prove 4.a).
The “primal residual” cT x − c∗ and the “dual residual” b∗ − ⟨b, λ⟩ are nonnegative, provided
that x is primal feasible, and λ is dual feasible. It follows that the duality gap

cT x − ⟨b, λ⟩ = [cT x − c∗ ] + [b∗ − ⟨b, λ⟩] + [c∗ − b∗ ]

is nonnegative (recall that c∗ ≥ b∗ by 2)), and it is zero if and only if c∗ = b∗ and both primal
and dual residuals are zero (i.e., x is primal optimal, and λ is dual optimal). All these arguments
hold without any assumptions of strict feasibility. We see that the condition “the duality gap
at a primal-dual feasible pair is zero” is always sufficient for primal-dual optimality of the pair;
and if c∗ = b∗ , this sufficient condition is also necessary. Since in the case of 4) we indeed have
c∗ = b∗ (this is stated by 3)), 4.a) follows.
A useful consequence of the Conic Duality Theorem is the following
Corollary 1.7.1 Assume that both (CP) and (D) are strictly feasible. Then both problems are
solvable, the optimal values are equal to each other, and each one of the conditions 4.a), 4.b) is
necessary and sufficient for optimality of a primal-dual feasible pair.

Indeed, by the Weak Duality Theorem, if one of the problems is feasible, the other is bounded,
and it remains to use the items 3) and 4) of the Conic Duality Theorem.
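To see the theorem at work on a well-behaved instance, consider the toy problem min{x3 | (x1 − 1, x2 − 1, x3 )T ≥L3 0}, which is strictly feasible together with its dual. Below is a hedged sketch assuming the cvxpy package is available (that package is not part of these notes, and the hand-built dual solution is ours):

    import numpy as np
    import cvxpy as cp

    x = cp.Variable(3)
    b = np.array([1.0, 1.0, 0.0])
    # (x - b) in L3, encoded as the second-order cone constraint ||(x1 - 1, x2 - 1)|| <= x3
    cone = cp.SOC(x[2] - b[2], x[:2] - b[:2])
    prob = cp.Problem(cp.Minimize(x[2]), [cone])
    prob.solve()

    lam = np.array([0.0, 0.0, 1.0])  # dual feasible by hand: A* lam = c, lam interior to (L3)* = L3
    print(prob.value, b @ lam)       # both ~ 0: zero duality gap, cf. items 3.a-b)
    print(lam @ (x.value - b))       # ~ 0: complementary slackness, cf. item 4.b)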

1.7.1 Is something wrong with conic duality?


The statement of the Conic Duality Theorem is weaker than that of the LP Duality Theorem:
in the LP case, feasibility (even non-strict) and boundedness of either the primal or the dual problem
implies solvability of both the primal and the dual and equality between their optimal values.
In the general conic case something “nontrivial” is stated only in the case of strict feasibility
(and boundedness) of one of the problems. It can be demonstrated by examples that this
phenomenon reflects the nature of things and is not an artifact of our analysis. The case
of a non-polyhedral cone K is truly more complicated than that of the nonnegative orthant;
as a result, a “word-by-word” extension of the LP Duality Theorem to the conic case is false.
Example 1.7.1 Consider the following conic problem with 2 variables x = (x1 , x2 )T and the
3-dimensional ice-cream cone K:

min { x1 | Ax − b ≡ (x1 − x2 , 1, x1 + x2 )T ≥L3 0 } .

Recalling the definition of L3 , we can write the problem equivalently as

min { x1 | √((x1 − x2 )^2 + 1) ≤ x1 + x2 } ,

i.e., as the problem

min { x1 | 4x1 x2 ≥ 1, x1 + x2 > 0 } .

Geometrically the problem is to minimize x1 over the intersection of the 3D ice-cream cone with
a 2D plane; the inverse image of this intersection in the “design plane” of variables x1 , x2 is the
part of the 2D nonnegative orthant above the hyperbola x1 x2 = 1/4. The problem is clearly
strictly feasible (a strictly feasible solution is, e.g., x = (1, 1)T ) and bounded below, with the
optimal value 0. This optimal value, however, is not achieved – the problem is unsolvable!
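A one-line numeric illustration of the unattained infimum (a Python sketch of ours): along the feasible branch of the hyperbola the objective tends to 0 but stays positive.

    # feasible points x = (1/(4t), t) satisfy 4 x1 x2 = 1 and x1 + x2 > 0 for every t > 0
    for t in [1.0, 1e2, 1e4, 1e6]:
        x1 = 1.0 / (4.0 * t)
        print(x1)   # 0.25, 0.0025, 2.5e-05, 2.5e-07: the objective tends to 0, yet x1 > 0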

Example 1.7.2 Consider the following conic problem with two variables x = (x1 , x2 )T and the
3-dimensional ice-cream cone K:

min { x2 | Ax − b = (x1 , x2 , x1 )T ≥L3 0 } .

The problem is equivalent to the problem

min { x2 | √(x1^2 + x2^2 ) ≤ x1 } ,

i.e., to the problem

min { x2 | x2 = 0, x1 ≥ 0 } .

The problem is clearly solvable, and its optimal set is the ray {x1 ≥ 0, x2 = 0}.
Now let us build the conic dual to our (solvable!) primal. It is immediately seen that the
cone dual to an ice-cream cone is this ice-cream cone itself. Thus, the dual problem is

max_λ { 0 | (λ1 + λ3 , λ2 )T = (0, 1)T , λ ≥L3 0 } .

In spite of the fact that the primal is solvable, the dual is infeasible: indeed, assuming that λ is dual
feasible, we have λ ≥L3 0, which means that λ3 ≥ √(λ1^2 + λ2^2 ); since also λ1 + λ3 = 0, we come to
λ2 = 0, which contradicts the equality λ2 = 1.

We see that the weakness of the Conic Duality Theorem as compared to the LP Duality one
reflects pathologies which indeed may happen in the general conic case.

1.7.2 Consequences of the Conic Duality Theorem


Sufficient condition for infeasibility. Recall that a necessary and sufficient condition for
infeasibility of a (finite) system of scalar linear inequalities (i.e., for a vector inequality with
respect to the partial ordering ≥) is the possibility to combine these inequalities in a linear
fashion in such a way that the resulting scalar linear inequality is contradictory. In the case of
cone-generated vector inequalities a slightly weaker result can be obtained:
Proposition 1.7.1 Consider a linear vector inequality

Ax − b ≥K 0. (I)

(i) If there exists λ satisfying

λ ≥K∗ 0, A∗ λ = 0, ⟨λ, b⟩ > 0, (II)

then (I) has no solutions.
(ii) If (II) has no solutions, then (I) is “almost solvable” – for every positive ε there exists b′
such that ∥b′ − b∥2 < ε and the perturbed system

Ax − b′ ≥K 0

is solvable.
Moreover,
(iii) (II) is solvable if and only if (I) is not “almost solvable”.
Note the difference between the simple case when ≥K is the usual partial ordering ≥ and the
general case. In the former case, one can replace “almost solvable” in (ii) by “solvable”; however,
in the general conic case “almost” is unavoidable.
Example 1.7.3 Let system (I) be given by

Ax − b ≡ (x + 1, x − 1, √2 x)T ≥L3 0.

Recalling the definition of the ice-cream cone L3 , we can write the inequality equivalently as

√2 x ≥ √((x + 1)^2 + (x − 1)^2 ) ≡ √(2x^2 + 2), (i)

which of course is unsolvable. The corresponding system (II) is


" & '
λ3 ≥ λ21 + λ22 ⇔ λ ≥L3∗ 0
√ & '
λ1 + λ2 + 2λ3 = 0 ⇔ AT λ = 0 (ii)
& '
λ2 − λ 1 > 0 ⇔ bT λ > 0
1.7. THE CONIC DUALITY THEOREM 33

From the second of these relations, λ3 = −(λ1 + λ2 )/√2, so that from the first inequality we get
0 ≥ (λ1 − λ2 )^2 , whence λ1 = λ2 . But then the third inequality in (ii) is impossible! We see that
here both (i) and (ii) have no solutions.
The geometry of the example is as follows. (i) asks to find a point in the intersection of
the 3D ice-cream cone and a line. This line is an asymptote of the cone (it belongs to a 2D
plane which crosses the cone in such way that the boundary of the cross-section is a branch of
a hyperbola, and the line is one of two asymptotes of the hyperbola). Although the intersection
is empty ((i) is unsolvable), small shifts of the line make the intersection nonempty (i.e., (i) is
unsolvable and “almost solvable” at the same time). And it turns out that one cannot certify
the fact that (i) itself is unsolvable by providing a solution to (ii).
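The “almost solvable” claim can also be confirmed numerically: relaxing the third coordinate of b by any δ > 0 turns (i) into √2 x + δ ≥ √(2x^2 + 2), which holds for all x ≥ (2 − δ^2 )/(2√2 δ). A Python/numpy sketch (our own illustration):

    import numpy as np

    for delta in [1.0, 1e-2, 1e-4]:
        x  = (2 - delta**2) / (2 * np.sqrt(2) * delta) + 1.0  # a point past the threshold
        ok = np.sqrt(2) * x + delta >= np.sqrt(2 * x**2 + 2)  # the perturbed system at x
        print(delta, ok)                                      # True for every delta > 0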

Proof of the Proposition. (i) is evident (why?).

Let us prove (ii). To this end it suffices to verify that if (I) is not “almost solvable”, then (II) is
solvable. Let us fix a vector σ >K 0 and look at the conic problem

min_{x,t} { t | Ax + tσ − b ≥K 0 } (CP)

in variables (x, t). Clearly, the problem is strictly feasible (why?). Now, if (I) is not almost solvable, then,
first, the matrix [A, σ] of the problem satisfies the full column rank condition A (otherwise the image of
the mapping (x, t) ↦ Ax + tσ − b would coincide with the image of the mapping x ↦ Ax − b, which is
not the case – the first of these images does intersect K, while the second does not). Second, the optimal
value in (CP) is strictly positive (otherwise the problem would admit feasible solutions with t close to 0,
and this would mean that (I) is almost solvable). From the Conic Duality Theorem it follows that the
dual problem of (CP)

max_λ { ⟨b, λ⟩ | A∗ λ = 0, ⟨σ, λ⟩ = 1, λ ≥K∗ 0 }

has a feasible solution with positive ⟨b, λ⟩, i.e., (II) is solvable.


It remains to prove (iii). Assume first that (I) is not almost solvable; then (II) must be solvable by
(ii). Vice versa, assume that (II) is solvable, and let λ be a solution to (II). Then λ also solves all systems
of the type (II) associated with all vectors b′ close enough to b; by (i), this implies that all inequalities
obtained from (I) by small enough perturbations of b are unsolvable, i.e., (I) is not almost solvable.

When is a scalar linear inequality a consequence of a given linear vector inequality?


The question we are interested in is as follows: given a linear vector inequality

Ax ≥K b (V)

and a scalar inequality


cT x ≥ d (S)

we want to check whether (S) is a consequence of (V). If K is the nonnegative orthant, the
answer is given by the Inhomogeneous Farkas Lemma:

Inequality (S) is a consequence of a feasible system of linear inequalities Ax ≥ b if


and only if (S) can be obtained from (V) and the trivial inequality 1 ≥ 0 in a linear
fashion (by taking weighted sum with nonnegative weights).

In the general conic case we can get a slightly weaker result:



Proposition 1.7.2 (i) If (S) can be obtained from (V) and from the trivial inequality 1 ≥ 0 by
admissible aggregation, i.e., there exists a weight vector λ ≥K∗ 0 such that

A∗ λ = c, ⟨λ, b⟩ ≥ d,

then (S) is a consequence of (V).
(ii) If (S) is a consequence of a strictly feasible linear vector inequality (V), then (S) can be
obtained from (V) by an admissible aggregation.

The difference between the case of the partial ordering ≥ and a general partial ordering ≥K is
in the word “strictly” in (ii).
Proof of the proposition. (i) is evident (why?). To prove (ii), assume that (V) is strictly feasible and
(S) is a consequence of (V), and consider the conic problem

min_{x,t} { t | Ā(x, t) − b̄ ≡ (Ax − b, d − cT x + t)T ≥K̄ 0 } ,
K̄ = {(y, t) | y ∈ K, t ≥ 0}.

The problem is clearly strictly feasible (choose x to be a strictly feasible solution to (V) and then choose
t to be large enough). The fact that (S) is a consequence of (V) says exactly that the optimal value in
the problem is nonnegative. By the Conic Duality Theorem, the dual problem

max_{λ,µ} { ⟨b, λ⟩ − dµ | A∗ λ − µc = 0, µ = 1, (λ, µ) ≥K̄∗ 0 }

has a feasible solution with the value of the objective ≥ 0. Since, as is easily seen, K̄∗ = {(λ, µ) | λ ∈
K∗ , µ ≥ 0}, the indicated solution satisfies the requirements

λ ≥K∗ 0, A∗ λ = c, ⟨b, λ⟩ ≥ d,

i.e., (S) can be obtained from (V) by an admissible aggregation.
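In the LP case K = R^m_+ , part (i) of the proposition can be replayed mechanically: any λ ≥ 0 with AT λ = c and ⟨λ, b⟩ ≥ d certifies that cT x ≥ d on the feasible set. A Python/numpy sketch with fabricated data (ours):

    import numpy as np
    rng = np.random.default_rng(4)

    A   = rng.normal(size=(4, 2))
    x0  = rng.normal(size=2)
    b   = A @ x0 - rng.uniform(0.1, 1.0, size=4)  # x0 is feasible: A x0 - b >= 0
    lam = rng.uniform(size=4)                     # admissible weights for K = R^4_+
    c   = A.T @ lam
    d   = lam @ b                                 # aggregation gives c^T x >= <lam, b> = d

    print(c @ x0 - d >= 0)                        # True at the feasible point x0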

“Robust solvability status”. Examples 1.7.2 – 1.7.3 make it clear that in the general conic
case we may meet “pathologies” which do not occur in LP. E.g., a feasible and bounded problem
may be unsolvable, the dual to a solvable conic problem may be infeasible, etc. Where do the
pathologies come from? Looking at our “pathological examples”, we arrive at the following
guess: the source of the pathologies is that in these examples, the “solvability status” of the
primal problem is non-robust – it can be changed by small perturbations of the data. This issue
of robustness is very important in modelling, and it deserves a careful investigation.

Data of a conic problem. When asked “What are the data of an LP program min{cT x |
Ax − b ≥ 0}”, everybody will give the same answer: “the objective c, the constraint matrix A
and the right hand side vector b”. Similarly, for a conic problem
 
min cT x | Ax − b ≥K 0 , (CP)

its data, by definition, is the triple (c, A, b), while the sizes of the problem – the dimension n
of x and the dimension m of K – as well as the underlying cone K itself, are considered as the
structure of (CP).

Robustness. A question of primary importance is whether the properties of the program (CP)
(feasibility, solvability, etc.) are stable with respect to perturbations of the data. The reasons
which make this question important are as follows:

• In actual applications, especially those arising in Engineering, the data are normally inex-
act: their true values, even when they “exist in the nature”, are not known exactly when
the problem is processed. Consequently, the results of the processing say something defi-
nite about the “true” problem only if these results are robust with respect to small data
perturbations, i.e., if the properties of (CP) we have discovered are shared not only by the
particular (“nominal”) problem we were processing, but also by all problems with nearby
data.

• Even when the exact data are available, we should take into account that in processing them
computationally we unavoidably add “noise” like rounding errors (you simply cannot load
something like 1/7 into the standard computer). As a result, a real-life computational routine
can recognize only those properties of the input problem which are stable with respect to
small perturbations of the data.

Due to the above reasons, we should study not only whether a given problem (CP) is feasi-
ble/bounded/solvable, etc., but also whether these properties are robust – remain unchanged
under small data perturbations. As it turns out, the Conic Duality Theorem allows us to
recognize “robust feasibility/boundedness/solvability...”.
Let us start with introducing the relevant concepts. We say that (CP) is

• robust feasible, if all “sufficiently close” problems (i.e., those of the same structure
(n, m, K) and with data close enough to those of (CP)) are feasible;

• robust infeasible, if all sufficiently close problems are infeasible;

• robust bounded below, if all sufficiently close problems are bounded below (i.e., their
objectives are bounded below on their feasible sets);

• robust unbounded, if all sufficiently close problems are not bounded;

• robust solvable, if all sufficiently close problems are solvable.
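In simple situations these definitions can at least be probed numerically; below is a crude Python/numpy sketch (entirely our own illustration device: a random-search feasibility probe, not a reliable algorithm), applied to the robust feasible LP system −1 ≤ x ≤ 1:

    import numpy as np
    rng = np.random.default_rng(5)

    def probably_feasible(A, b, trials=5000):
        # crude probe of {x : Ax - b >= 0}; False may only mean "no point was found"
        X = rng.normal(scale=5.0, size=(trials, A.shape[1]))
        return ((X @ A.T - b) >= 0).all(axis=1).any()

    A = np.array([[1.0], [-1.0]]); b = np.array([-1.0, -1.0])   # encodes -1 <= x <= 1
    for _ in range(3):
        dA = 1e-3 * rng.normal(size=A.shape)
        db = 1e-3 * rng.normal(size=b.shape)
        print(probably_feasible(A + dA, b + db))                # True under small perturbations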

Note that a problem which is not robust feasible is not necessarily robust infeasible, since among
close problems there may be both feasible and infeasible ones (look at Example 1.7.2 – slightly
shifting and rotating the plane Im A − b, we may get whatever we want – a feasible bounded problem,
a feasible unbounded problem, an infeasible problem...). This is why we need two kinds of
definitions: one of “robust presence of a property” and one of “robust absence of the same
property”.
Now let us look at necessary and sufficient conditions for the most important robust
forms of the “solvability status”.

Proposition 1.7.3 [Robust feasibility] (CP) is robust feasible if and only if it is strictly feasible,
in which case the dual problem (D) is robust bounded above.

Proof. The statement is nearly tautological. Let us fix δ >K 0. If (CP) is robust feasible, then for small
enough t > 0 the perturbed problem min{cT x | Ax − b − tδ ≥K 0} should be feasible; a feasible solution
to the perturbed problem clearly is a strictly feasible solution to (CP). The converse implication is evident
(a strictly feasible solution to (CP) remains feasible for all problems with close enough data). It remains
to note that if all problems sufficiently close to (CP) are feasible, then their duals, by the Weak Duality
Theorem, are bounded above, so that (D) is robust bounded above.

Proposition 1.7.4 [Robust infeasibility] (CP) is robust infeasible if and only if the system

⟨b, λ⟩ = 1, A∗ λ = 0, λ ≥K∗ 0

is robust feasible, or, which is the same (by Proposition 1.7.3), if and only if the system

⟨b, λ⟩ = 1, A∗ λ = 0, λ >K∗ 0 (1.7.3)

has a solution.
Proof. First assume that (1.7.3) is solvable, and let us prove that all problems sufficiently close to (CP)
are infeasible. Let us fix a solution λ̄ to (1.7.3). Since A is of full column rank, simple Linear Algebra
says that the systems [A′ ]∗ λ = 0 are solvable for all matrices A′ from a small enough neighbourhood U
of A; moreover, the corresponding solution λ(A′ ) can be chosen to satisfy λ(A) = λ̄ and to be continuous
in A′ ∈ U . Since λ(A′ ) is continuous and λ(A) >K∗ 0, we have λ(A′ ) >K∗ 0 for A′ in a neighbourhood of A;
shrinking U appropriately, we may assume that λ(A′ ) >K∗ 0 for all A′ ∈ U . Now, ⟨b, λ̄⟩ = 1; by continuity
reasons, there exist a neighbourhood V of b and a neighbourhood U ′ ⊂ U of A such that for all b′ ∈ V and
all A′ ∈ U ′ one has ⟨b′ , λ(A′ )⟩ > 0.
Thus, we have seen that there exist a neighbourhood U ′ of A and a neighbourhood V of b, along with
a function λ(A′ ), A′ ∈ U ′ , such that

⟨b′ , λ(A′ )⟩ > 0, [A′ ]∗ λ(A′ ) = 0, λ(A′ ) ≥K∗ 0

for all b′ ∈ V and A′ ∈ U ′ . By Proposition 1.7.1.(i) it means that all the problems

min { [c′ ]T x | A′ x − b′ ≥K 0 }

with b′ ∈ V and A′ ∈ U ′ are infeasible, so that (CP) is robust infeasible.


Now let us assume that (CP) is robust infeasible, and let us prove that then (1.7.3) is solvable. Indeed,
by the definition of robust infeasibility, there exist neighbourhoods U of A and V of b such that all vector
inequalities
A′ x − b′ ≥K 0
with A′ ∈ U and b′ ∈ V are unsolvable. It follows that whenever A′ ∈ U and b′ ∈ V , the vector inequality

A′ x − b′ ≥K 0

is not almost solvable (see Proposition 1.7.1). We conclude from Proposition 1.7.1.(ii) that for every
A′ ∈ U and b′ ∈ V there exists λ = λ(A′ , b′ ) such that

⟨b′ , λ(A′ , b′ )⟩ > 0, [A′ ]∗ λ(A′ , b′ ) = 0, λ(A′ , b′ ) ≥K∗ 0.

Now let us choose λ0 >K∗ 0. For all small enough positive ε we have Aε = A + εb[A∗ λ0 ]T ∈ U . Let us
choose an ε with the latter property so small that ε⟨b, λ0 ⟩ > −1 and set A′ = Aε , b′ = b. According
to the previous observation, there exists λ = λ(A′ , b) such that

⟨b, λ⟩ > 0, [A′ ]∗ λ ≡ A∗ [λ + ε⟨b, λ⟩λ0 ] = 0, λ ≥K∗ 0.

Setting λ̄ = λ + ε⟨b, λ⟩λ0 , we get λ̄ >K∗ 0 (since λ ≥K∗ 0, λ0 >K∗ 0 and ε⟨b, λ⟩ > 0), while A∗ λ̄ = 0 and
⟨b, λ̄⟩ = ⟨b, λ⟩(1 + ε⟨b, λ0 ⟩) > 0. Multiplying λ̄ by an appropriate positive factor, we get a solution to (1.7.3).

Now we are able to formulate our main result on “robust solvability”.



Proposition 1.7.5 For a conic problem (CP) the following conditions are equivalent to each
other
(i) (CP) is robust feasible and robust bounded (below);
(ii) (CP) is robust solvable;
(iii) (D) is robust solvable;
(iv) (D) is robust feasible and robust bounded (above);
(v) Both (CP) and (D) are strictly feasible.
In particular, under every one of these equivalent assumptions, both (CP) and (D) are solv-
able with equal optimal values.
Proof. (i) ⇒ (v): If (CP) is robust feasible, it also is strictly feasible (Proposition 1.7.3). If, in addition,
(CP) is robust bounded below, then (D) is robust solvable (by the Conic Duality Theorem); in particular,
(D) is robust feasible and therefore strictly feasible (again Proposition 1.7.3).
(v) ⇒ (ii): The implication is given by the Conic Duality Theorem.
(ii) ⇒ (i): trivial.
We have proved that (i)≡(ii)≡(v). Due to the primal-dual symmetry, we also have proved that
(iii)≡(iv)≡(v).
