
TMA947 / MMG621 — Nonlinear optimization

Lecture 8 — Lagrangian duality


Emil Gustavsson, Zuzana Nedělková, Axel Ringh

August 14, 2024

We will consider one of the most classical versions of a general method for optimization, namely
relaxation/duality-based methods. The basic premise is to take a "difficult" optimization problem,
and replace it with something simpler. The "simpler" here will be what we call the Lagrangian
relaxation. We are now working with the problem

f∗ = inf  f(x),                                    (1)
     subject to x ∈ S.

We define a relaxation of the problem above to be a problem of the form

fR∗ = inf  fR(x),                                  (2)
      subject to x ∈ SR,

where the function fR satisfies fR(x) ≤ f(x) for all x ∈ S, and where S ⊆ SR. That is, we have replaced
the feasible set with a larger one, and the objective with something smaller. The following basic
result can be stated.
Theorem 1 (The relaxation theorem).
a) fR∗ ≤ f∗.
b) If the relaxed problem (2) is infeasible, then so is (1).
c) If the relaxed problem (2) has an optimal solution x∗R for which it holds that x∗R ∈ S and fR(x∗R) =
f(x∗R), then x∗R is an optimal solution to (1) as well.

Proof. See Theorem 6.1.
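
As a small numerical illustration of part a) (a sketch of our own, not from the course material), consider minimizing f(x) = x² over S = [1, 2] and relaxing the feasible set to SR = [0, 2] while keeping fR = f:

    import numpy as np

    # Illustrative instance: f(x) = x^2, S = [1, 2], relaxed set S_R = [0, 2], f_R = f.
    f = lambda x: x**2

    xs_S = np.linspace(1.0, 2.0, 1001)   # grid over the original feasible set S
    xs_SR = np.linspace(0.0, 2.0, 2001)  # grid over the larger relaxed set S_R

    f_star = f(xs_S).min()    # approximately 1.0
    fR_star = f(xs_SR).min()  # approximately 0.0
    print(fR_star <= f_star)  # True, in line with part a) of the relaxation theorem

Here fR∗ = 0 ≤ 1 = f∗, and since the relaxed minimizer x = 0 is not in S, part c) does not apply.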

1 Lagrangian relaxation

Now we consider a problem of the form

f∗ = inf  f(x),                                    (3a)
     subject to x ∈ X,                             (3b)
                gi(x) ≤ 0,   i = 1, . . . , m,     (3c)


where f and gi are some given functions¹, and X ⊆ Rn is some set. The basic idea of Lagrangian
relaxation is to replace constraints with a price in the objective function for their violation. That is,
for any µ ∈ Rm we define the Lagrangian relaxation of (the constraints (3c) of) the problem (3) as
the problem

q(µ) = inf  f(x) + ∑_{i=1}^m µi gi(x),             (4a)
       subject to x ∈ X.                           (4b)

Whenever µi ≥ 0 for i = 1, . . . , m, the above is a relaxation of (3). The objective function above is
called the Lagrange function of (3), and is denoted L(x, µ) := f(x) + ∑_{i=1}^m µi gi(x).

Now we have defined a family of relaxations of (3), parametrized by the vector µ. We immediately
have the following, very important, result.
Theorem 2 (Weak duality). For any µ ≥ 0^m and any x feasible in (3), we have

q(µ) ≤ f (x). (5)

Proof. This is just a rephrasing of the statement that the Lagrangian relaxation is a relaxation.

The reason why this result is so important is that it allows us to obtain lower bounds on f∗ in (3).

Example 1. Consider the problem

f∗ = min  x²,
     subject to x ≥ 1.

Relax the constraint x ≥ 1 to get a Lagrangian dual function (note the rewriting of x ≥ 1 as 1 − x ≤ 0)

q(µ) = min_x [ x² + µ(1 − x) ] = min_x [ (x − µ/2)² − µ²/4 + µ ].
For each fixed µ ≥ 0, the above is an unconstrained minimization problem of a convex function of x, so we
can actually compute

q(µ) = µ − µ²/4.
Evaluating at, say, µ = 0, we get q(0) = 0, and we can conclude that the optimal value f∗ must satisfy
f ∗ ≥ 0. If we instead evaluate at µ = 1, we would be able to conclude f ∗ ≥ q(1) = 3/4.
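
A minimal numerical sketch of this example (using SciPy, which is our own choice here and not part of the lecture): for a few values of µ ≥ 0 we minimize the Lagrange function x² + µ(1 − x) over x and compare with the closed-form expression q(µ) = µ − µ²/4; by weak duality each value is a lower bound on f∗ = 1.

    from scipy.optimize import minimize_scalar

    f_star = 1.0  # optimal value of min x^2 subject to x >= 1

    for mu in [0.0, 1.0, 2.0, 3.0]:
        # Lagrangian relaxation: minimize L(x, mu) = x^2 + mu*(1 - x) over all x in R
        res = minimize_scalar(lambda x: x**2 + mu * (1.0 - x))
        q_mu = res.fun
        print(mu, q_mu, mu - mu**2 / 4.0)  # numerical q(mu) agrees with the closed form
        assert q_mu <= f_star + 1e-9       # weak duality: q(mu) <= f*
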
¹ Note that we do not say anything about smoothness!


Given the weak duality theorem, we can try to find the best lower bound on f∗, which motivates
the following definition.
Definition 1 (The Lagrangian dual problem). The Lagrangian dual problem to (3) (with respect to the
relaxation of (3c)) is the problem

q∗ = sup  q(µ),                                    (6a)
     subject to µ ≥ 0^m.                           (6b)

In other words, the Lagrangian dual problem is the problem of defining as tight a relaxation as
possible. An immediate consequence of the weak duality theorem is: for any pair of primal/dual
problems, we have q ∗ ≤ f ∗ .

A note on terminology: from now on we will refer to (6) as the dual problem, and to (3) as the primal
problem. So, for example, the phrase "x∗ is primally optimal" means that x∗ is optimal in (3).
Example 2. Consider again the problem from the previous example. The dual problem is

q∗ = sup_{µ≥0}  µ − µ²/4,

and one can easily verify that the maximum is attained at µ = 2, q∗ = q(2) = 1, which coincides
with the primal optimal value: f∗ = q∗ = 1.
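
A small sketch (again using SciPy, our own choice) that solves this dual problem numerically, by maximizing q(µ) = µ − µ²/4 over µ ≥ 0 as the minimization of −q on a bounded interval:

    from scipy.optimize import minimize_scalar

    q = lambda mu: mu - mu**2 / 4.0

    # Maximize q over mu >= 0 by minimizing -q on a (generously chosen) bounded interval.
    res = minimize_scalar(lambda mu: -q(mu), bounds=(0.0, 10.0), method='bounded')
    mu_star, q_star = res.x, -res.fun
    print(mu_star, q_star)  # approximately 2.0 and 1.0, so q* = f* = 1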

Note that in the above example we have f ∗ = q ∗ . If this holds, we say the pair of primal and dual
problems has no duality gap. In general, we define the duality gap to be the difference f ∗ − q ∗ . We
also define
Definition 2. We call µ∗ a Lagrange multiplier vector if

f∗ = inf_{x∈X} L(x, µ∗).                           (7)

Note that we do not have a Lagrange multiplier vector unless f∗ = q∗. Also note the conflict of
terminology: the term Lagrange multiplier is also used for the vector µ appearing in the Fritz John
and KKT conditions.

1.1 The dual problem

All we have done so far is to take an optimization problem (the primal) and replace it with
another problem (the dual). This only makes sense if the dual problem is in some sense "easier"
than the primal.
Theorem 3. The dual function q(µ) is concave, and its effective domain Dq = {µ | q(µ) > −∞} is
convex.


Proof. See Theorem 6.4.

The above theorem tells us that the dual problem is a maximization of a concave function over a
convex set! In other words, the dual problem is always a convex problem!
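
To illustrate Theorem 3 numerically, the following small sketch (our own construction) evaluates the dual function q(µ) = µ − µ²/4 from Examples 1 and 2 on a grid and checks the midpoint-concavity inequality q((µ1 + µ2)/2) ≥ (q(µ1) + q(µ2))/2:

    import numpy as np

    q = lambda mu: mu - mu**2 / 4.0  # dual function from Examples 1 and 2

    mus = np.linspace(0.0, 5.0, 51)
    for m1 in mus:
        for m2 in mus:
            # midpoint concavity: q((m1 + m2)/2) >= (q(m1) + q(m2))/2
            assert q((m1 + m2) / 2.0) >= (q(m1) + q(m2)) / 2.0 - 1e-12
    print("q passes the midpoint-concavity check on the grid")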

1.2 Global optimality conditions

If f∗ = q∗, it turns out that we can actually use the Lagrangian relaxation to get a sufficient
condition for optimality.
Theorem 4. Consider the primal/dual pair of vectors (x∗ , µ∗ ). Then x∗ is optimal and µ∗ is a Lagrange
multiplier vector if and only if

x∗ ∈ arg min_{x∈X} L(x, µ∗),                       (8a)
µ∗ ≥ 0^m,                                          (8b)
x∗ ∈ X,  gi(x∗) ≤ 0,   i = 1, . . . , m,           (8c)
µ∗i gi(x∗) = 0,   i = 1, . . . , m.                (8d)

Proof. See Theorem 6.8.

Remark: The conditions are, in order, often called Lagrangian optimality, dual feasibility, primal
feasibility and complementary slackness. Note also the similarity to the KKT conditions; the only
difference is that we have minimization of the Lagrangian instead of stationarity. However, this
makes a very important difference: the KKT conditions are necessary optimality conditions, while
the global optimality conditions are necessary and sufficient.
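
A hedged sketch of how one might check the conditions in (8) numerically for a given pair (x∗, µ∗) when X = R^n; the helper check_global_optimality and the multi-start minimization of the Lagrangian are our own illustration, not part of the lecture:

    import numpy as np
    from scipy.optimize import minimize

    def check_global_optimality(f, gs, x_star, mu_star, n, tol=1e-6, n_starts=20):
        """Numerically check conditions (8a)-(8d) for X = R^n (illustrative only)."""
        L = lambda x, mu: f(x) + sum(m * g(x) for m, g in zip(mu, gs))

        # (8a) Lagrangian optimality: L(x*, mu*) should match the minimum of L(., mu*),
        # approximated here by multi-start local minimization.
        rng = np.random.default_rng(0)
        best = min(minimize(L, rng.normal(size=n), args=(mu_star,)).fun
                   for _ in range(n_starts))
        lagrangian_opt = L(x_star, mu_star) <= best + tol

        dual_feas = all(m >= -tol for m in mu_star)                       # (8b)
        primal_feas = all(g(x_star) <= tol for g in gs)                   # (8c)
        compl_slack = all(abs(m * g(x_star)) <= tol
                          for m, g in zip(mu_star, gs))                   # (8d)
        return lagrangian_opt, dual_feas, primal_feas, compl_slack
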
Example 3. Consider
f∗ = min  x1² + ½x2²,
     subject to −x1² − x2² + 1 ≤ 0.

The feasible region is all points outside of the open unit disc, and hence the problem is not convex. Relaxing
the constraint we get the Lagrangian
L(x, µ) = x1² + ½x2² + µ(−x1² − x2² + 1) = (1 − µ)x1² + (½ − µ)x2² + µ,
and hence the dual function
q(µ) = inf_{x∈R²} [ (1 − µ)x1² + (½ − µ)x2² + µ ] = µ if µ ≤ ½, and −∞ if µ > ½.

The solution to the dual problem is thus µ∗ = ½ and q∗ = ½. It can now be verified that this µ∗ together
with x∗ = [0, 1]ᵀ fulfills the global optimality conditions (8), and hence x∗ is an optimal solution (in fact,
globally optimal; why?). Moreover, it can also be verified that this means that x∗ is a KKT point.


Nevertheless, it can also be verified that x̄ = [1, 0]ᵀ is a KKT point (with µ̄ = 1). But L(x, 1) = −½x2² + 1,
and hence arg min_{x∈R²} L(x, 1) = ∅, since L(x, 1) is not bounded from below as x2 → ∞ (see the dual
function above). So while x̄ is a KKT point, it does not fulfill the global optimality conditions (8).
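
A short numerical sketch of this example (our own, not from the lecture) that makes both observations concrete: (x∗, µ∗) = ([0, 1]ᵀ, ½) satisfies the conditions in (8), while for µ̄ = 1 the Lagrangian is unbounded below in x2:

    import numpy as np

    f = lambda x: x[0]**2 + 0.5 * x[1]**2
    g = lambda x: -x[0]**2 - x[1]**2 + 1.0
    L = lambda x, mu: f(x) + mu * g(x)

    x_star, mu_star = np.array([0.0, 1.0]), 0.5
    # (8a): L(x, 1/2) = x1^2/2 + 1/2 is minimized by any x with x1 = 0, with value 1/2
    print(L(x_star, mu_star))          # 0.5 = q(1/2) = f(x_star)
    # (8b)-(8d): mu* >= 0, g(x*) = 0 (primal feasibility), and mu* * g(x*) = 0
    print(mu_star >= 0.0, g(x_star) <= 1e-12, mu_star * g(x_star))

    # The KKT point x_bar = [1, 0] with mu_bar = 1: L(x, 1) = -x2^2/2 + 1 is
    # unbounded below, so condition (8a) cannot hold even though KKT does.
    print(L(np.array([0.0, 100.0]), 1.0))  # very negative (-4999.0)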

We can in fact formulate the conditions in (8) in a more compact way, as what is called saddle-point
optimality conditions, meaning that the pair (x∗, µ∗) simultaneously maximizes L over µ
and minimizes L over x.
Theorem 5. x∗ is primally optimal and µ∗ is a Lagrange multiplier vector if and only if x∗ ∈ X, µ∗ ≥ 0^m, and

L(x∗, µ) ≤ L(x∗, µ∗) ≤ L(x, µ∗),   for all (x, µ) ∈ X × R^m_+.          (9)

Proof. See Theorem 6.9.
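
For Example 3 one can also check the saddle-point inequalities in (9) numerically; the following sketch (our own) samples µ ≥ 0 and x ∈ R² for the pair (x∗, µ∗) = ([0, 1]ᵀ, ½):

    import numpy as np

    f = lambda x: x[0]**2 + 0.5 * x[1]**2
    g = lambda x: -x[0]**2 - x[1]**2 + 1.0
    L = lambda x, mu: f(x) + mu * g(x)

    x_star, mu_star = np.array([0.0, 1.0]), 0.5
    rng = np.random.default_rng(0)

    # L(x*, mu) <= L(x*, mu*) for all mu >= 0 (here g(x*) = 0, so L(x*, .) is constant)
    assert all(L(x_star, mu) <= L(x_star, mu_star) + 1e-12
               for mu in np.linspace(0.0, 10.0, 101))
    # L(x*, mu*) <= L(x, mu*) for sampled x in R^2
    assert all(L(x_star, mu_star) <= L(x, mu_star) + 1e-12
               for x in rng.normal(size=(1000, 2)))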

1.3 Strong Lagrangian Duality

A natural question to ask is under what conditions one can guarantee that q ∗ = f ∗ , i.e., that the
primal and dual optimal values coincide. It turns out that we need to require convexity and some
regularity, which will here be a variant of the Slater CQ. That is, we now require that X is a convex
set, that f and gi, i = 1, . . . , m, are convex, and that there is some point x ∈ X such that gi(x) < 0, i = 1, . . . , m.
Theorem 6. Assume that the problem (3) satisfies the Slater CQ, and that f∗ > −∞. Then strong
Lagrangian duality holds, and there exists at least one Lagrange multiplier vector.

Proof. See Theorem 6.10.

Assume now that f and gi are also C¹, that X = Rn, and that the problem (3) satisfies the Slater CQ
and has some optimal solution x∗. The above theorem then gives a Lagrange multiplier vector µ∗, and
the pair (x∗, µ∗) is a pair of a primally optimal solution and a Lagrange multiplier vector, so it
satisfies the system (8). Since the Lagrange function L(x, µ) is convex in x for any µ ≥ 0^m, the condition
that x∗ ∈ arg min_{x∈X} L(x, µ∗) can be replaced by the necessary and sufficient condition that
∇x L(x∗, µ∗) = 0. But ∇x L(x∗, µ∗) = ∇f(x∗) + ∑_{i=1}^m µ∗i ∇gi(x∗). Thus in this case the global
optimality conditions reduce to the Karush-Kuhn-Tucker conditions!
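
As a concrete check (our own), take Example 1 again: it is convex with X = R and satisfies the Slater CQ (e.g. x = 2 gives 1 − x < 0), the primal optimum is x∗ = 1 with Lagrange multiplier µ∗ = 2, and ∇x L(x∗, µ∗) = 0 is exactly the KKT stationarity condition:

    # Example 1: f(x) = x^2, g(x) = 1 - x, with x* = 1 and mu* = 2.
    # grad_x L(x, mu) = f'(x) + mu * g'(x) = 2*x - mu.
    x_star, mu_star = 1.0, 2.0
    print(2.0 * x_star - mu_star)      # 0.0: stationarity of L(., mu*) at x*
    print(mu_star * (1.0 - x_star))    # 0.0: complementary slackness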
