Solutions Chapter 3
SECTION 3.2
3.2.6 www
For the reverse assertion, assume that x∗ and λ∗ satisfy the second order sufficiency conditions of Prop. 3.2.1. Let ȳ ∈ ℝⁿ and z̄ ∈ ℝᵐ be vectors such that
J (ȳ, z̄) = 0.
Consequently
∇²xx L(x∗, λ∗)ȳ + ∇h(x∗)z̄ = 0, (1)
∇h(x∗)′ȳ = 0. (2)
Premultiplying Eq. (1) by ȳ′ and using Eq. (2), we obtain ȳ′∇²xx L(x∗, λ∗)ȳ = 0. In view of Eq. (2), it follows that ȳ = 0, for otherwise the second order sufficiency condition would be violated. Then Eq. (1) yields ∇h(x∗)z̄ = 0. Since x∗ is a regular point, we must have z̄ = 0. Hence, J is invertible.
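As a numerical illustration of this argument (not part of the original solution; the data Q and a below are made up), the following Python sketch builds the matrix J for a problem whose Lagrangian Hessian is indefinite but positive definite on the constraint nullspace, and checks that J is invertible:

# Numerical illustration (hypothetical data): the KKT matrix
#     J = [ Q   a ]
#         [ a'  0 ]
# is invertible when Q = the Hessian of the Lagrangian is positive
# definite on the nullspace of a' = ∇h(x*)', even though Q is indefinite.
import numpy as np

Q = np.diag([1.0, -1.0])            # indefinite Hessian of the Lagrangian
a = np.array([[0.0], [1.0]])        # ∇h(x*); nullspace of a' is {(d, 0)}
J = np.block([[Q, a], [a.T, np.zeros((1, 1))]])

d = np.array([1.0, 0.0])            # spans the nullspace of a'
print("d'Qd =", d @ Q @ d)          # > 0: second order sufficiency holds
print("det J =", np.linalg.det(J))  # nonzero: J is invertible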
3.2.7 www
We have
∇²p(u) = −∇λ(u).
The first order condition holds at x(u):
∇f(x(u)) + ∇h(x(u))λ(u) = 0.
Differentiating this relation with respect to u, we have
∇x(u)∇²xx L(x(u), λ(u)) + ∇λ(u)∇h(x(u))′ = 0.
We also have ∇x(u)∇h(x(u)) = I, from which we obtain for all c ∈ ℝ
c∇x(u)∇h(x(u))∇h(x(u))′ = c∇h(x(u))′.
By adding the last two relations,
∇x(u)(∇²xx L(x(u), λ(u)) + c∇h(x(u))∇h(x(u))′) + (∇λ(u) − cI)∇h(x(u))′ = 0.
From this, we obtain, for every c for which the inverse below exists,
∇x(u) + (∇λ(u) − cI)∇h(x(u))′(∇²xx L(x(u), λ(u)) + c∇h(x(u))∇h(x(u))′)⁻¹ = 0.
Multiplying with ∇h(x(u)) and using the equations ∇x(u)∇h(x(u)) = I and ∇²p(u) = −∇λ(u), we see that
∇²p(u) = (∇h(x(u))′(∇²xx L(x(u), λ(u)) + c∇h(x(u))∇h(x(u))′)⁻¹∇h(x(u)))⁻¹ − cI.
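The formula can be sanity-checked numerically. The sketch below (with made-up data Q, A) applies it to the equality-constrained quadratic program minimize (1/2)x′Qx subject to Ax = u, for which ∇h(x) = A′, ∇²xx L = Q, and ∇²p(u) = (AQ⁻¹A′)⁻¹ independently of u; the right-hand side of the formula should therefore come out the same for every admissible c:

# Check ∇²p(u) = (∇h'(∇²L + c∇h∇h')⁻¹∇h)⁻¹ − cI on a toy quadratic program
# (all data made up): minimize (1/2)x'Qx subject to Ax = u.
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)         # positive definite ∇²xx L
A = rng.standard_normal((m, n))     # full row rank (with probability 1)

exact = np.linalg.inv(A @ np.linalg.inv(Q) @ A.T)   # known Hessian of p
for c in [0.0, 1.0, 10.0]:
    H = Q + c * A.T @ A             # ∇²xx L + c ∇h ∇h'
    formula = np.linalg.inv(A @ np.linalg.inv(H) @ A.T) - c * np.eye(m)
    print(c, np.allclose(formula, exact))           # True for every such c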
SECTION 3.3
3.3.5 www
(a) Let d ∈ F̄(x∗) be arbitrary. Then there exists a sequence {dk} ⊆ F(x∗) such that dk → d. For each dk, we have
∇f(x∗)′dk = lim_{α→0} (f(x∗ + αdk) − f(x∗))/α.
Since x∗ is a constrained local minimum, we have (f(x∗ + αdk) − f(x∗))/α ≥ 0 for all sufficiently small α (for which x∗ + αdk is feasible), and thus ∇f(x∗)′dk ≥ 0. Hence
∇f(x∗)′d = lim_{k→∞} ∇f(x∗)′dk ≥ 0,
as desired.
(b) According to Farkas' lemma, this is true if and only if there exists µ∗ such that
−∇f(x∗) = Σ_{j∈A(x∗)} µ∗j ∇gj(x∗), µ∗j ≥ 0.
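As a hedged numerical aside, such a µ∗ can be computed by nonnegative least squares, with a zero residual certifying that −∇f(x∗) lies in the cone generated by the active gradients. The gradients below are made up for illustration:

# Sketch (made-up data): recover µ* ≥ 0 with -∇f(x*) = Σ µ*_j ∇g_j(x*)
# via nonnegative least squares; zero residual is a Farkas certificate.
import numpy as np
from scipy.optimize import nnls

G = np.array([[1.0, 0.0],           # columns are the gradients ∇g_j(x*)
              [0.0, 1.0]])
grad_f = np.array([-2.0, -3.0])     # ∇f(x*), chosen so multipliers exist

mu, residual = nnls(G, -grad_f)     # least squares with mu >= 0
print("mu =", mu, "residual =", residual)   # mu = [2. 3.], residual = 0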
(c) We want to show that F̄(x∗) = V(x∗), where V(x∗) is the cone of first order feasible variations, given by
V(x∗) = {d | ∇gj(x∗)′d ≤ 0, ∀ j ∈ A(x∗)}.
First, let's show that under any of the conditions (1)–(4), we have F̄(x∗) ⊆ V(x∗). By the Mean Value Theorem, for each j ∈ A(x∗) and for any d ∈ F(x∗) there is some θ ∈ [0, 1] such that
gj(x∗ + αd) = gj(x∗) + α∇gj(x∗ + θαd)′d.
Because gj(x∗ + αd) ≤ 0 for all α ∈ [0, ᾱ] and gj(x∗) = 0 for all j ∈ A(x∗), we obtain for all j ∈ A(x∗)
∇gj(x∗)′d = lim_{α→0} ∇gj(x∗ + θαd)′d ≤ 0,
so that d ∈ V(x∗). Therefore F(x∗) ⊆ V(x∗) and F̄(x∗) ⊆ V(x∗) [because V(x∗) is closed].
Now we need to show that V(x∗) ⊆ F̄(x∗) for each of the parts (1) through (4).
(1) Let gj(x) = bj′x + cj for all j, where the bj are vectors and the cj are scalars. Let d ∈ V(x∗). We have
gj(x∗ + αd) = bj′(x∗ + αd) + cj = gj(x∗) + αbj′d.
If j ∈ A(x∗), then by the definition of V(x∗) we have bj′d = ∇gj(x∗)′d ≤ 0, so that gj(x∗ + αd) ≤ gj(x∗) = 0 for all α > 0. If j ∉ A(x∗) and bj′d ≤ 0, then gj(x∗ + αd) ≤ gj(x∗) < 0 for any α > 0 [because this constraint is not tight at x∗]. If j ∉ A(x∗) and bj′d > 0, then gj(x∗ + αd) ≤ 0 for all α ≤ ᾱj, where ᾱj = −gj(x∗)/(bj′d) [here we use gj(x∗) < 0]. Therefore we have gj(x∗ + αd) ≤ 0 for all j and all α ≤ ᾱ, where
ᾱ = min{ᾱj | j ∉ A(x∗), bj′d > 0}.
Hence d ∈ F(x∗), and it follows that V(x∗) ⊆ F(x∗) ⊆ F̄(x∗).
(2) Let d̄ be a vector such that ∇gj(x∗)′d̄ < 0 for all j ∈ A(x∗), and let d ∈ V(x∗). For γ ∈ (0, 1], define dγ = γd̄ + (1 − γ)d. By using the Mean Value Theorem, for each j there is some θ ∈ [0, 1] such that
gj(x∗ + αdγ) = gj(x∗) + α∇gj(x∗ + θαdγ)′dγ.
Let γ be fixed. If j ∉ A(x∗), then by using the fact gj(x∗) < 0 it can be seen that for all sufficiently small α we have
gj(x∗ + αdγ) ≤ 0.
If j ∈ A(x∗), then ∇gj(x∗)′d̄ < 0; this combined with the fact d ∈ V(x∗) gives ∇gj(x∗)′dγ ≤ γ∇gj(x∗)′d̄ < 0, and hence implies that for all sufficiently small α
gj(x∗ + αdγ) = α∇gj(x∗ + θαdγ)′dγ ≤ 0.
Therefore, for a fixed γ, there exists a sufficiently small ᾱ such that gj(x∗ + αdγ) ≤ 0 for all j and α ∈ (0, ᾱ]. Thus dγ ∈ F(x∗) for all γ ∈ (0, 1], and
lim_{γ→0} dγ = d ∈ F̄(x∗).
(3) Let x be a vector such that gj(x) < 0 for all j ∈ A(x∗). By the convexity of gj, we have
gj(x∗) + ∇gj(x∗)′(x − x∗) ≤ gj(x) < 0.
By defining d̄ = x − x∗ and by using gj(x∗) = 0 for all j ∈ A(x∗), from the preceding relation we obtain
∇gj(x∗)′d̄ < 0, ∀ j ∈ A(x∗),
so that condition (2) is satisfied and the conclusion follows from part (2).
(4) Let B be the matrix with rows ∇gj(x∗)′, j ∈ A(x∗). Since these gradients are linearly independent, B has full row rank, so that the square matrix BB′ is invertible and the matrix Br = B′(BB′)⁻¹ is well-defined. Let
d = Br(−1, …, −1)′.
Then Bd = BB′(BB′)⁻¹(−1, …, −1)′ = (−1, …, −1)′, which is equivalent to
∇gj(x∗)′d = −1, ∀ j ∈ A(x∗).
Thus condition (2) is satisfied with d̄ = d, and the conclusion follows from part (2).
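The construction in part (4) is easy to check numerically; in the sketch below the matrix B is random, standing in for the matrix of active gradients:

# Sketch (made-up data): d = B'(BB')⁻¹(-1,...,-1)' satisfies
# ∇g_j(x*)'d = -1 for every active constraint j.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 5))                 # 3 active constraints in R^5
d = B.T @ np.linalg.solve(B @ B.T, -np.ones(3))
print(B @ d)                                    # [-1. -1. -1.]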
(d) For this problem we can easily see that the point x∗ = (0, 0) is a constrained local minimum.
We have
∇g1(0, 0) = (0, 1)′ and ∇g2(0, 0) = (0, −1)′.
Note that both constraints are active at x∗ = (0, 0), i.e., A(x∗ ) = {1, 2}. Evidently g1 and g2 are
not linear, so the condition (c1) does not hold. Furthermore, there is no vector d = (d1, d2)′ such that
∇g1(0, 0)′d = d2 < 0 and ∇g2(0, 0)′d = −d2 < 0.
Hence, the condition (c2) is violated. If the condition (c3) held, then, as seen in the proof of part (c), the condition (c2) would also hold, which is a contradiction. Therefore, at x∗ = (0, 0) the
condition (c3) does not hold. The vectors ∇g1 (0, 0) and ∇g2 (0, 0) are linearly dependent since
∇g1 (0, 0) = −∇g2 (0, 0), so the condition (c4) is also violated.
or equivalently
(µ0, µ0)′ + (0, µ1)′ + (0, −µ2)′ = (0, 0)′.
It follows that µ0 = 0, i.e., there is no Lagrange multiplier.
(e) Note that {x | h(x) = 0} = {x | ||h(x)||² ≤ 0}, so that x∗ is also a local minimum for the modified problem. The modified problem has a single constraint g1(x) = ||h(x)||², which is active at x∗. Since g1 is not linear, the condition (c1) does not hold. Because ∇g1(x∗) = 2∇h(x∗)h(x∗) = 0, the conditions (c2) and (c4) are violated at x∗. If g1 is convex and the condition (c3) holds, then as seen in the proof of (c3), the condition (c2) also holds, which is a contradiction. Hence, at x∗ each of the conditions (1)–(4) of part (c) is violated. From the Fritz John condition
µ∗0∇f(x∗) + µ∗1∇g1(x∗) = 0
and ∇g1(x∗) = 0, it follows that µ∗0∇f(x∗) = 0, and since ∇f(x∗) ≠ 0, we must have µ∗0 = 0, i.e., there is no Lagrange multiplier.
3.3.6 www
Assume that there exist x ∈ ℝⁿ and µ ∈ ℝᵐ such that conditions (i) and (ii) hold, i.e.,
ai′x < 0, i = 1, …, m, (1)
and
Σ_{i=1}^m µi ai = 0, µ ≠ 0, µ ≥ 0, (2)
where the ai are the row vectors of the matrix A. Without loss of generality, we may assume that µ1 > 0. By pre-multiplying Eq. (1) with µi ≥ 0 and summing the obtained inequalities over i, we have
Σ_{i=1}^m µi ai′x ≤ µ1 a1′x < 0.
On the other hand, Eq. (2) yields
Σ_{i=1}^m µi ai′x = 0,
which is a contradiction. Hence, conditions (i) and (ii) cannot hold simultaneously.
The proof will be complete if we can show that conditions (i) and (ii) cannot fail to hold simultaneously. Indeed, if condition (i) fails to hold, the minimax problem
minimize max_{1≤i≤m} ai′x
subject to x ∈ ℝⁿ
has x = 0 as a solution. Hence by Prop. 3.3.10, there exists a µ ≥ 0 with Σ_{i=1}^m µi = 1 such that Σ_{i=1}^m µi ai = 0, or A′µ = 0. Thus condition (ii) holds, and it follows that conditions (i) and (ii) cannot fail to hold simultaneously.
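The two alternatives can also be tested by linear programming. The following sketch (made-up matrices) checks (i) via the feasibility of Ax ≤ −e, which by scaling is equivalent to solvability of Ax < 0, and (ii) via the feasibility of A′µ = 0, Σµi = 1, µ ≥ 0:

# Sketch (made-up matrices): for any A, exactly one of the following holds:
# (i) some x with Ax < 0, or (ii) some µ ≥ 0, µ ≠ 0, with A'µ = 0.
import numpy as np
from scipy.optimize import linprog

def gordan(A):
    m, n = A.shape
    # (i): Ax ≤ -1 is feasible iff Ax < 0 has a solution (by scaling).
    x_lp = linprog(c=np.zeros(n), A_ub=A, b_ub=-np.ones(m),
                   bounds=[(None, None)] * n)
    # (ii): µ ≥ 0 with A'µ = 0 and Σµ_i = 1 (normalization rules out µ = 0).
    mu_lp = linprog(c=np.zeros(m),
                    A_eq=np.vstack([A.T, np.ones((1, m))]),
                    b_eq=np.append(np.zeros(n), 1.0),
                    bounds=[(0, None)] * m)
    return x_lp.success, mu_lp.success

print(gordan(np.array([[1.0, 0.0], [0.0, 1.0]])))    # (True, False)
print(gordan(np.array([[1.0, 0.0], [-1.0, 0.0]])))   # (False, True)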
3.3.7 www
Assume, to obtain a contradiction, that the conclusion does not hold, so that there is a sequence {xk} such that xk → x∗, and for all k, xk ≠ x∗, h(xk) = 0, and f(xk) < f(x∗) + (1/k)||xk − x∗||².
Let us write xk = x∗ + δk yk, where
δk = ||xk − x∗||, yk = (xk − x∗)/||xk − x∗||.
The sequence {yk} is bounded and lies on the surface of the unit sphere, so it must have a subsequence converging to some y with ||y|| = 1. Without loss of generality, we assume that the whole sequence {yk} converges to y.
We have
∇f(x∗) + Σ_{i=1}^m λ∗i ∇hi(x∗) + Σ_{j∈A+(x∗)} µ∗j ∇gj(x∗) = 0,
where
A+(x∗) = {j | µ∗j > 0}.
By taking the inner product of this relation with y and by using the equation ∇hi(x∗)′y = 0, we obtain
∇f(x∗)′y + Σ_{j∈A+(x∗)} µ∗j ∇gj(x∗)′y = 0.
Since all the terms in the above equation have been shown to be nonpositive, they must all be
equal to 0, showing that Eq. (1) holds.
We will now show that y′∇²xx L(x∗, λ∗)y ≤ 0, thus coming to a contradiction [cf. Eq. (2)]. Since xk = x∗ + δk yk, by the mean value theorem [Prop. A.23(b) in Appendix A], we have
(1/k)||xk − x∗||² > f(xk) − f(x∗) = δk ∇f(x∗)′yk + ((δk)²/2) yk′∇²f(ξ̃k)yk, (3)
0 = hi(xk) − hi(x∗) = δk ∇hi(x∗)′yk + ((δk)²/2) yk′∇²hi(ξ̄ik)yk, i = 1, …, m, (4)
0 ≥ gj(xk) − gj(x∗) = δk ∇gj(x∗)′yk + ((δk)²/2) yk′∇²gj(ξ̂jk)yk, j ∈ A(x∗), (5)
where all the vectors ξ̃k, ξ̄ik, and ξ̂jk lie on the line segment joining x∗ and xk. Multiplying Eqs. (4) and (5) by λ∗i and µ∗j, respectively, adding them and adding Eq. (3) to them, we obtain
(1/k)||xk − x∗||² > δk (∇f(x∗) + Σ_{i=1}^m λ∗i ∇hi(x∗) + Σ_{j∈A(x∗)} µ∗j ∇gj(x∗))′ yk
+ ((δk)²/2) yk′ (∇²f(ξ̃k) + Σ_{i=1}^m λ∗i ∇²hi(ξ̄ik) + Σ_{j∈A(x∗)} µ∗j ∇²gj(ξ̂jk)) yk.
Since δk = ||xk − x∗|| and ∇f(x∗) + Σ_{i=1}^m λ∗i ∇hi(x∗) + Σ_{j∈A(x∗)} µ∗j ∇gj(x∗) = 0, we obtain
2/k > yk′ (∇²f(ξ̃k) + Σ_{i=1}^m λ∗i ∇²hi(ξ̄ik) + Σ_{j∈A(x∗)} µ∗j ∇²gj(ξ̂jk)) yk.
3.3.10 www
(a) Consider a problem where there are two identical equality constraints [h1 (x) = h2 (x) for all
x], and assume that x∗ is a local minimum such that ∇h1 (x∗ ) = 0. Then, ∇f (x∗ )+λ∇h1 (x∗ ) = 0
for some λ. Take a scalar γ > 0 such that λ + γ > 0 and let λ∗1 = λ + γ and λ∗2 = −γ. Then we
have
∇f (x∗ ) + λ∗1 ∇h1 (x∗ ) + λ∗2 ∇h2 (x∗ ) = 0,
but since λ∗1 and λ∗2 have different signs, there is no x such that simultaneously we have λ∗1 h1 (x) >
0 and λ∗2 h2 (x) > 0. Thus λ∗1 and λ∗2 violate the last Fritz John condition. As an alternative
example, consider the following inequality constrained problem
minimize x1 + x2
Then x∗ = (0, 0) is a local minimum with A(x∗) = {1, 2}, and µ∗0 = µ∗1 = µ∗2 = 1 satisfy the Karush-Kuhn-Tucker conditions.
However, there is no point (x1 , x2 ) such that g1 (x1 , x2 ) > 0 and g2 (x1 , x2 ) > 0, i.e., the Fritz
John condition (iv) does not hold.
(b) For simplicity, assume that all the constraints are inequalities (equality constraints can be handled by conversion to two inequalities). If ∇f(x∗) = 0, we can take µj = 0 for all j, and we are done. Assume that ∇f(x∗) ≠ 0 and consider the index subsets J ⊂ A(x∗) such that −∇f(x∗) is a positive combination of the gradients ∇gj(x∗), j ∈ J, and among all such subsets, let J∗ have a minimal number of elements. Without loss of generality, let J∗ = {1, …, s}, so we have
∇f(x∗) + µ1∇g1(x∗) + ··· + µs∇gs(x∗) = 0,
with µj > 0 for j = 1, …, s. We claim that ∇g1(x∗), …, ∇gs(x∗) are linearly independent. Indeed, if this were not so, we would have for some α1, …, αs, not all zero,
α1∇g1(x∗) + ··· + αs∇gs(x∗) = 0,
so that
∇f(x∗) + (µ1 + γα1)∇g1(x∗) + ··· + (µs + γαs)∇gs(x∗) = 0,
for all scalars γ. Thus, we can find γ such that µj + γαj ≥ 0 for all j and µj + γαj = 0 for at least one index j ∈ {1, …, s}. This contradicts the hypothesis that the index set J∗ has a minimal number of elements.
Thus ∇g1(x∗), …, ∇gs(x∗) are linearly independent, so that we can find a vector h such that
∇g1(x∗)′h = ··· = ∇gs(x∗)′h = 1.
Consider vectors of the form x∗ + γh, where γ is a positive scalar. By Taylor's theorem, for sufficiently small γ, we have gj(x∗ + γh) > 0 and hence also µj gj(x∗ + γh) > 0 for all j = 1, …, s. Thus, the scalars µj, j = 1, …, s, together with µj = 0 for j = s + 1, …, r, satisfy all the Fritz John conditions with µ0 = 1.
3.3.11 www
We have
Σ_{j∈A(x∗)} µ∗j ∇gj(x∗) = 0, (1)
where µ∗1, …, µ∗r are Lagrange multipliers satisfying the Fritz John conditions. Since the functions gj are convex over ℝⁿ, for any j ∈ A(x∗) and any feasible vector x we have
gj(x) ≥ gj(x∗) + ∇gj(x∗)′(x − x∗).
Therefore
µ∗j gj(x) ≥ µ∗j (gj(x∗) + ∇gj(x∗)′(x − x∗)), ∀ j ∈ A(x∗).
By summing over j ∈ A(x∗) and by using Eq. (1) together with gj(x∗) = 0 for all j ∈ A(x∗), we obtain
0 ≥ Σ_{j∈A(x∗), µ∗j>0} µ∗j gj(x) = Σ_{j∈A(x∗)} µ∗j gj(x) ≥ 0
for all feasible x [the left inequality holds because gj(x) ≤ 0 and µ∗j ≥ 0]. This is possible only if gj(x) = 0 for all feasible x and j ∈ A(x∗) with µ∗j > 0. Since not all µ∗j are equal to zero, there is at least one index j with µ∗j > 0.
3.3.12 www
It is straightforward that the given condition is implied by the condition (iv) of Prop. 3.3.5. To show the reverse, we replace each equality constraint hi(x) = 0 with the two constraints hi(x) ≤ 0 and −hi(x) ≤ 0, and we apply the version of the Fritz John conditions given in the exercise. Let λ+i and λ−i be the multipliers corresponding to the constraints hi(x) ≤ 0 and −hi(x) ≤ 0, respectively. Thus in any neighborhood N of x∗ there is a vector x such that λ+i hi(x) > 0 for all i with λ+i > 0, −λ−i hi(x) > 0 for all i with λ−i > 0, and µ∗j gj(x) > 0 for all j with µ∗j > 0. Since λ∗i = λ+i − λ−i, if λ∗i ≠ 0 then either λ+i > λ−i = 0 (corresponding to λ∗i > 0) or λ−i > λ+i = 0 (corresponding to λ∗i < 0) [λ+i and λ−i cannot both be positive, since hi(x) > 0 and −hi(x) > 0 cannot hold simultaneously]. In either case we have λ∗i hi(x) > 0 for all i with λ∗i ≠ 0. Hence the Fritz John condition (iv), as given in Prop. 3.3.5, holds.
3.3.13 www
First, let us point out some important properties of a convex function that will be used in the proof. Convexity of f over ℝⁿ implies that f is continuous over ℝⁿ and that the set ∂f(x) of subgradients of f at x is nonempty for all x ∈ ℝⁿ (see Prop. B.24 of Appendix B).
(a) Let x∗ be a local minimum of f and S = {x | ||x − x∗|| ≤ ε}, where ε > 0 is such that f(x) ≥ f(x∗) for all feasible x with x ∈ S. As in the proof of Prop. 3.1.1 (Sec. 3.1.1), for each k ≥ 1 we consider the penalized problem
minimize Fk(x) = f(x) + (k/2) Σ_{i=1}^m (hi(x))² + (k/2) Σ_{j=1}^r (gj+(x))² + (1/2)||x − x∗||²
subject to x ∈ S.
Similar to Sec. 3.1.1, we conclude that the solution xk of the above problem exists and (using the continuity of f, hi, gj+) that xk → x∗ as k → ∞. Therefore, there is an index k̄ such that xk is an interior point of S for all k ≥ k̄. For such k, we have 0 ∈ ∂Fk(xk), or equivalently
sk + Σ_{i=1}^m ξik ∇hi(xk) + Σ_{j=1}^r ζjk ∇gj(xk) + (xk − x∗) = 0,
for some sk ∈ ∂f(xk), where ξik = k hi(xk) and ζjk = k gj+(xk). Define
µ0k = 1/δk, λik = ξik/δk, i = 1, …, m, µjk = ζjk/δk, j = 1, …, r,
and
δk = (1 + Σ_{i=1}^m (ξik)² + Σ_{j=1}^r (ζjk)²)^{1/2}.
Since xk → x∗ with sk ∈ ∂f(xk) for all k, from Prop. B.24 and the boundedness of the sequence {(µ0k, λ1k, …, λmk, µ1k, …, µrk)} we see that there are a vector s∗ ∈ ∂f(x∗) and a limit point (µ∗0, λ∗1, …, λ∗m, µ∗1, …, µ∗r) such that
µ∗0 s∗ + Σ_{i=1}^m λ∗i ∇hi(x∗) + Σ_{j=1}^r µ∗j ∇gj(x∗) = 0. (1)
It may happen that µ∗0 is equal to zero. Otherwise, we can set µ∗0 = 1 in Eq. (1), which shows that the vector −Σ_{i=1}^m λ∗i ∇hi(x∗) − Σ_{j=1}^r µ∗j ∇gj(x∗) is a subgradient of f at x∗. Thus, condition (i) of the exercise is satisfied. The rest of the proof is the same as that of Prop. 3.3.5.
(b) Assume that the ∇hi(x∗) are linearly independent, and that there is a vector d such that
∇hi(x∗)′d = 0, ∀ i = 1, …, m, ∇gj(x∗)′d < 0, ∀ j ∈ A(x∗).
If µ∗0 = 0 in Eq. (1), then using the same argument as in the proof of Prop. 3.3.8 we arrive at a contradiction. Under the Slater condition, the proof that µ∗0 ≠ 0 is the same as in Prop. 3.3.9.
3.3.14 www
The problem is equivalent to
minimize r²
subject to ||yj − x||² ≤ r², j = 1, …, p, (x, r) ∈ ℝⁿ⁺¹.
The optimality conditions for this problem are
(i) Σ_{j=1}^p µ∗j (x∗ − yj) = 0, (ii) Σ_{j=1}^p µ∗j = 1,
together with µ∗j ≥ 0 and µ∗j = 0 for the inactive constraints, where x∗ is the optimal solution for the minimax problem and µ∗ is the corresponding Lagrange multiplier. Note that the cost function is continuous and coercive, so that an optimal solution always exists. Furthermore, the cost function is convex and the given conditions are also sufficient for optimality. By combining (i) and (ii) we have
x∗ = Σ_{j=1}^p µ∗j yj, Σ_{j=1}^p µ∗j = 1, µ∗j ≥ 0, ∀ j.
For the case of three points, there are two possibilities:
(1) All constraints are active, so x∗ is at equal distance from all three points. Then x∗ is the
center of the circle circumscribed around the triangle of the three points. In this case x∗ must lie
within the triangle and is a positive combination of the yj , the coefficients being the multipliers.
This corresponds to the case when the triangle is not obtuse.
(2) Only two of the constraints are active, in which case x∗ lies on the line segment connecting the corresponding two points. This occurs when the triangle formed by the given points is obtuse, and then x∗ is the midpoint of the longest side of the triangle. If yj is not an endpoint of the longest side, then µ∗j = 0, while the other two Lagrange multipliers are both positive.
Now consider the degenerate case when the three points lie on the same line. We can assume
that y3 lies between y1 and y2 . Then the optimal point x∗ is the midpoint of the segment joining
y1 and y2 . The Lagrange multipliers µ∗1 and µ∗2 are positive, while µ∗3 = 0.
3.3.15 www
(a) Let {yk} ⊆ T(x) be a sequence with yk → y, y ≠ 0, and for each k let {xk,i}i ⊆ X \ {x} be a sequence associated with yk by the definition of T(x). A standard diagonalization argument yields indices ik such that
lim_{k→∞} || (xk,ik − x)/||xk,ik − x|| − y/||y|| || = 0,
which by the definition of T(x) means that y ∈ T(x). Thus, T(x) is closed.
(b) Let F(x) and F̄(x) denote, respectively, the set of feasible directions at x and its closure. First, we will prove that F̄(x) ⊆ T(x) holds, regardless of whether X is convex. Let d ∈ F(x). Then there is an ᾱ > 0 such that x + αd ∈ X for all α ∈ [0, ᾱ]. Choose any sequence {αk} ⊆ (0, ᾱ] with αk → 0 as k → ∞. Define xk = x + αk d. Evidently xk ∈ X \ {x}, and
(xk − x)/||xk − x|| = d/||d||
converges to d/||d||. Hence d ∈ T(x). It follows that F(x) ⊆ T(x), and since T(x) is closed, we have F̄(x) ⊆ T(x).
Next, we prove that T(x) ⊆ F̄(x). Let y ∈ T(x) and {xk} ⊆ X \ {x} be such that
(xk − x)/||xk − x|| = y/||y|| + ξk,
where ξk → 0 as k → ∞. Since X is a convex set, the direction xk − x is feasible at x for all k. Therefore, the direction
dk = ||y|| (xk − x)/||xk − x|| = y + ξk ||y||
is feasible at x for all k, i.e., {dk} ⊆ F(x). Since
lim_{k→∞} dk = lim_{k→∞} (y + ξk ||y||) = y,
it follows that y ∈ F̄(x). Hence T(x) ⊆ F̄(x), and we conclude that T(x) = F̄(x).
3.3.16 www
Let x be any vector in X. We will show that T (x) = V (x). We have, in general T (x) ⊂ V (x)
(see e.g., the proof of Prop. 3.3.17), so we focus on showing that V (x) ⊂ T (x). Let y ∈ V (x), so
that we have
∇gj (x) y ≤ 0, ∀ j ∈ A(x).
xk = x + αk y.
For all j ∈ A(x) we have gj (x) = 0, and using the concavity of gj , we obtain
3.3.17 www
Let y be a vector such that ∇gj(x∗)′y < 0 for all j ∈ A(x∗). By the continuity of ∇gj(x) (as a function of x and j), there exist a neighborhood N of x∗ and a neighborhood A of A(x∗) (relative to J) such that
∇gj(x)′y < 0, ∀ x ∈ N, ∀ j ∈ A. (1)
By the Mean Value Theorem, for every j ∈ A and α > 0,
gj(x∗ + αy) = gj(x∗) + α∇gj(x∗ + θαy)′y ≤ α∇gj(x∗ + θαy)′y, (3)
for some θ ∈ (0, 1). Since x∗ + θαy ∈ N and j ∈ A, from Eqs. (1) and (3) we obtain
gj(x∗ + αy) < 0, ∀ j ∈ A,
whenever x∗ + αy ∈ N. For any α with 0 < α ≤ ᾱ the point x∗ + αy belongs to N, which together with Eq. (2) implies
gj(x∗ + αy) ≤ 0, ∀ j ∉ A.
The last two inequalities show that y is a feasible direction of X at x∗. In the solution to part (b) of Exercise 3.3.15, it is shown that the set of feasible directions at x∗ is a subset of the tangent cone at x∗, regardless of the structure of the set X.
3.3.18 www
Assume that we have shown the validity of the Mangasarian-Fromovitz constraint qualification for the problem without equality constraints, i.e., for a local minimum x∗, there exist Lagrange multipliers under the condition that there is a vector d such that
∇gj(x∗)′d < 0, ∀ j ∈ A(x∗).
Now, consider the problem with equality and inequality constraints. Assume that there is a vector d such that
∇hi(x∗)′d = 0, ∀ i = 1, …, m,
∇gj(x∗)′d < 0, ∀ j ∈ A(x∗). (2)
Since the vectors ∇h1 (x∗ ), . . . , ∇hm (x∗ ) are linearly independent, by reordering the coordinates
of x if necessary, we can partition the vector x as x = (xB , xR ) such that the submatrix ∇B h(x∗ )
(the gradient matrix of h with respect to xB ) is invertible. The equation
h(xB , xR ) = 0
has the solution (x∗B, x∗R), and the implicit function theorem (Prop. A.25 of Appendix A) can be used to express xB in terms of xR via a unique continuously differentiable function φ : S → ℝᵐ defined over a sphere S centered at x∗R. In particular, we have x∗B = φ(x∗R), h(φ(xR), xR) = 0 for all xR ∈ S, and
∇φ(xR) = −∇R h(φ(xR), xR) (∇B h(φ(xR), xR))⁻¹, ∀ xR ∈ S, (3)
where ∇R h is the gradient matrix of h with respect to xR . Observe that x∗R is a local minimum
of the problem
minimize F(xR)
subject to Gj(xR) ≤ 0, j = 1, …, r, (4)
where F (xR ) = f (φ(xR ), xR ), Gj (xR ) = gj (φ(xR ), xR ). Note that this problem has no equality
constraints. From (2) we have
∇B h(x∗)′dB + ∇R h(x∗)′dR = 0,
and
∇gj(x∗)′d = ∇B gj(x∗)′dB + ∇R gj(x∗)′dR < 0, (5)
for all j ∈ A(x∗). Since ∇B h(x∗) is invertible, from the first relation above we obtain
dB = −(∇B h(φ(x∗R), x∗R)′)⁻¹ ∇R h(φ(x∗R), x∗R)′ dR,
or equivalently [cf. Eq. (3)]
dB = ∇φ(x∗R)′dR.
Substituting this into Eq. (5), and using the chain rule relation ∇Gj(xR) = ∇φ(xR)∇B gj(φ(xR), xR) + ∇R gj(φ(xR), xR), we see that Eq. (5) is equivalent to
∇Gj(x∗R)′dR < 0, ∀ j ∈ A(x∗).
This means that the Mangasarian-Fromovitz constraint qualification is satisfied for problem (4), so there are Lagrange multipliers µ∗1, …, µ∗r such that
0 = ∇F(x∗R) + Σ_{j=1}^r µ∗j ∇Gj(x∗R)
= ∇φ(x∗R)∇B f(x∗) + ∇R f(x∗) + Σ_{j=1}^r µ∗j (∇φ(x∗R)∇B gj(x∗) + ∇R gj(x∗))
= ∇φ(x∗R)(∇B f(x∗) + Σ_{j=1}^r µ∗j ∇B gj(x∗)) + ∇R f(x∗) + Σ_{j=1}^r µ∗j ∇R gj(x∗). (6)
Define
B = ∇B h(φ(x∗R), x∗R), R = ∇R h(φ(x∗R), x∗R),
and
λ∗ = −B⁻¹ (∇B f(x∗) + Σ_{j=1}^r µ∗j ∇B gj(x∗)).
Then from Eq. (3) we see that ∇φ(x∗R) = −R B⁻¹, which combined with Eq. (6) implies
∇R f(x∗) + R λ∗ + Σ_{j=1}^r µ∗j ∇R gj(x∗) = 0.
By the definition of λ∗, we also have ∇B f(x∗) + B λ∗ + Σ_{j=1}^r µ∗j ∇B gj(x∗) = 0, so that λ∗ and µ∗1, …, µ∗r are Lagrange multipliers for the original problem.
The proof of the existence of the Lagrange multipliers under the Slater constraint qualifica-
tion is straightforward from the preceding analysis by noting that the vector d = x − x∗ satisfies
the Mangasarian-Fromovitz constraint qualification.
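Eq. (3) is easy to verify numerically on a small example. In the sketch below (a made-up scalar constraint h(xB, xR) = xB³ + xB + xR² − 1 = 0), φ is evaluated by Newton's method and its finite-difference derivative is compared with the implicit-function formula:

# Finite-difference check of Eq. (3) on a made-up scalar example:
# h(xB, xR) = xB^3 + xB + xR^2 - 1 = 0 defines xB = φ(xR), and
# φ'(xR) = -(∂h/∂xR) / (∂h/∂xB).
import numpy as np

def h(xB, xR):
    return xB**3 + xB + xR**2 - 1.0

def phi(xR, xB0=0.5, iters=50):
    xB = xB0
    for _ in range(iters):                      # Newton's method in xB
        xB -= h(xB, xR) / (3 * xB**2 + 1)
    return xB

xR, eps = 0.3, 1e-6
xB = phi(xR)
fd = (phi(xR + eps) - phi(xR - eps)) / (2 * eps)
formula = -(2 * xR) / (3 * xB**2 + 1)
print(fd, formula)                              # agree to high accuracy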
3.3.19 www
For simplicity we assume that there are no equality constraints; the subsequent proof is easily extended to the case where equality constraints are present. To show that the Mangasarian-Fromovitz constraint qualification implies boundedness of the set of Lagrange multipliers, follow the given hint.
Conversely, if the set of Lagrange multipliers is bounded, there cannot exist a µ ≠ 0 with µ ≥ 0 and Σ_{j∈A(x∗)} µj ∇gj(x∗) = 0, since adding γµ, for any γ > 0, to a Lagrange multiplier gives another Lagrange multiplier. Hence by the theorem of the alternative of Exercise 3.3.6, there must exist a d such that ∇gj(x∗)′d < 0 for all j ∈ A(x∗).
3.3.20 www
We have
∇h1(x) = (0, 1)′,
and
∇h2(x) = (4x1³ sin(1/x1) − x1² cos(1/x1), −1)′ if x1 ≠ 0, ∇h2(x) = (0, −1)′ if x1 = 0,
and it can be seen that ∇h1 and ∇h2 are everywhere continuous. Thus, for λ1 = 1, λ2 = 1, we have
λ1∇h1(0) + λ2∇h2(0) = 0.
On the other hand, it can be seen that arbitrarily close to x∗ = (0, 0) there exists an x such that h1(x) > 0 and h2(x) > 0. Thus x∗ is not quasinormal, although it is seen (most easily, by a graphical argument) that x∗ is quasiregular.
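For a concrete check, assume h1(x) = x2 and h2(x) = x1⁴ sin(1/x1) − x2 (with h2(0, x2) = −x2); these choices are consistent with the gradients displayed above, though the exercise statement itself is not reproduced here. The sketch exhibits points arbitrarily close to (0, 0) with h1 > 0 and h2 > 0:

# Sketch assuming h1(x) = x2 and h2(x) = x1^4 sin(1/x1) - x2 (and
# h2(0, x2) = -x2), consistent with the gradients computed above:
# arbitrarily close to (0,0) there are points with h1 > 0 and h2 > 0.
import numpy as np

def h1(x1, x2):
    return x2

def h2(x1, x2):
    return (x1**4 * np.sin(1.0 / x1) if x1 != 0.0 else 0.0) - x2

for k in [1, 10, 100, 1000]:
    x1 = 1.0 / (2 * np.pi * k + np.pi / 2)      # so that sin(1/x1) = 1
    x2 = 0.5 * x1**4                            # 0 < x2 < x1^4 sin(1/x1)
    print(x1, h1(x1, x2) > 0, h2(x1, x2) > 0)   # both True, x -> (0,0)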
3.3.21 www
(a) Without loss of generality, we assume that there are no equality constraints and that all inequality constraints are active at x∗. Based on the definition of quasinormality, it is easy to verify that x∗ is a quasinormal vector of X if it is a quasinormal vector of X̄. Conversely, suppose that x∗ is a quasinormal vector of X̄, but not a quasinormal vector of X. Then there exist Lagrange multipliers µ1, …, µr that satisfy the Fritz John conditions with µ0 = 0 and µj > 0 for some j ∈ J (for otherwise, x∗ would not be a quasinormal vector of X̄). From the
definition of the set J it follows that there is a vector y ∈ V(x∗) such that ∇gj̄(x∗)′y < 0 for some index j̄ ∈ J with µj̄ > 0. By multiplying the relation
Σ_{j=1}^r µj ∇gj(x∗) = 0
with y, we obtain
0 = Σ_{j=1}^r µj ∇gj(x∗)′y ≤ µj̄ ∇gj̄(x∗)′y < 0
[the inequality holds because y ∈ V(x∗) and all constraints are active, so every term in the sum is nonpositive], which is a contradiction.
(b) Let y ∈ Ṽ(x∗), y ≠ 0, and consider a sequence {xk} of the form xk = x∗ + αk y, where αk ↓ 0, so that
xk → x∗, (xk − x∗)/||xk − x∗|| → y/||y||,
and ∇gj(x∗)′y < 0 for all j. This implies gj(xk) < 0 for all j ∈ J and all sufficiently large k. Therefore xk ∈ X for all k sufficiently large, and consequently y is in the tangent cone of X at x∗. Hence Ṽ(x∗) ⊂ T(x∗), which is equivalent to quasiregularity of x∗ with respect to the set X.
(c) The given statement follows from parts (a) and (b).
3.3.22 www
Without loss of generality, we can assume that there are no equality constraints (every equality constraint hi(x) = 0 can be replaced by the two inequalities hi(x) ≤ 0 and −hi(x) ≤ 0, with hi(x) and −hi(x) being linear, and therefore concave). Since x∗ is a local minimum, there exist a scalar µ0 and Lagrange multipliers λ1, …, λm, µ1, …, µr satisfying the Fritz John conditions. Assume that µ0 = 0. Then
Σ_{j=1}^r µj ∇gj(x∗) = Σ_{j∈A(x∗)} µj ∇gj(x∗) = 0. (1)
Let d be a vector such that ∇gj(x∗)′d ≤ 0 for all j ∈ A(x∗), with strict inequality for j ∈ A(x∗) \ J. By multiplying Eq. (1) with d, we obtain
Σ_{j∈A(x∗)} µj ∇gj(x∗)′d = 0. (2)
If µj0 > 0 for some j0 ∈ A(x∗) \ J, then, since every term in the sum is nonpositive,
Σ_{j∈A(x∗)} µj ∇gj(x∗)′d ≤ µj0 ∇gj0(x∗)′d < 0,
which is a contradiction to Eq. (2). Therefore for all j0 ∈ A(x∗) \ J we must have µj0 = 0. Then from Eq. (1) we have
Σ_{j∈J} µj ∇gj(x∗) = 0. (3)
Now we use the same line of argument as in the proof of Prop. 3.3.6 in order to arrive at a contradiction. In particular, since gj is concave for every j ∈ J, we have for every x
gj(x) ≤ gj(x∗) + ∇gj(x∗)′(x − x∗),
and hence
Σ_{j∈J} µj gj(x) ≤ Σ_{j∈J} µj gj(x∗) + Σ_{j∈J} µj ∇gj(x∗)′(x − x∗) = 0, (4)
where the last equality follows from Eq. (3) and the fact that µj gj(x∗) = 0 for all j [by the Fritz John condition (iv)]. On the other hand, we know that there is some j ∈ J for which µj > 0 and an x satisfying gj(x) > 0 for all j with µj > 0. For this x, we have Σ_{j∈J} µj gj(x) > 0, which contradicts Eq. (4). Thus, we can take µ0 = 1, so that x∗ satisfies the necessary conditions of Prop. 3.3.7.
SECTION 3.4
3.4.3 www
Consider the primal problem (P) min_{A′x ≥ b} c′x. The dual function is
q(µ) = inf_x {c′x + µ′(b − A′x)} = b′µ + inf_x (c − Aµ)′x.
If cj − Σ_{i=1}^m µi aij ≠ 0 for some j, then q(µ) = −∞. Thus the dual problem is
maximize Σ_{i=1}^m µi bi
subject to Σ_{i=1}^m µi aij = cj, j = 1, …, n, µ ≥ 0,
which can be written as
min_{Aµ = c, µ ≥ 0} −b′µ.
To compute the dual of this problem, assign a multiplier vector x to the constraint Aµ = c; if ai′x − bi < 0 for any i, then p(x) = −∞. Thus the dual of (D) is
maximize −c′x
subject to A′x ≥ b,
which is equivalent to the primal problem (P). In particular, an optimal primal solution x∗ satisfies
A′x∗ − b ≥ 0.
Next, consider
(P) min_{A′x ≥ b, x ≥ 0} c′x ⟺ max_{Aµ ≤ c, µ ≥ 0} b′µ. (D)
Writing (D) as
min_{Aµ ≤ c, µ ≥ 0} −b′µ
and assigning a multiplier vector x ≥ 0 to the constraint Aµ ≤ c, we have: if ai′x − bi < 0 for any i, then p(x) = −∞. Thus the dual of (D) is
maximize −c′x
subject to A′x ≥ b, x ≥ 0,
which is equivalent to (P). Again, an optimal primal solution x∗ satisfies
A′x∗ − b ≥ 0.
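The primal-dual pairings above can be confirmed with an LP solver. The sketch below (random but feasible made-up data; G plays the role of A′) solves (P) min c′x subject to Gx ≥ b and (D) max b′µ subject to G′µ = c, µ ≥ 0, and compares the optimal values:

# Sketch (random data, constructed to be feasible and bounded): solve
#   (P) min c'x  s.t. Gx ≥ b     and    (D) max b'µ  s.t. G'µ = c, µ ≥ 0,
# and check that the optimal values agree.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 6, 3
G = rng.standard_normal((m, n))
mu0 = rng.random(m)                 # a dual-feasible point ...
c = G.T @ mu0                       # ... guarantees (D) is feasible
b = G @ rng.standard_normal(n) - rng.random(m)   # (P) feasible by design

P = linprog(c, A_ub=-G, b_ub=-b, bounds=[(None, None)] * n)
D = linprog(-b, A_eq=G.T, b_eq=c, bounds=[(0, None)] * m)
print(P.fun, -D.fun)                # equal: strong duality for LPs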
3.4.4 www
(a) Let λj be a Lagrange multiplier associated with the constraint Σ_{i=1}^m xij = βj, and let νi be a Lagrange multiplier associated with the constraint Σ_{j=1}^n xij = αi. Define
X = {x | xij ≥ 0, ∀ i, j}.
The dual function is
q(ν, λ) = inf_{x∈X} { Σ_{i,j} aij xij + Σ_{i=1}^m νi (αi − Σ_{j=1}^n xij) + Σ_{j=1}^n λj (βj − Σ_{i=1}^m xij) }
= inf_{x∈X} Σ_{i,j} (aij − νi − λj) xij + Σ_{i=1}^m νi αi + Σ_{j=1}^n λj βj.
The infimum is −∞ unless aij − νi − λj ≥ 0 for all i, j, in which case it equals Σ_{i=1}^m νi αi + Σ_{j=1}^n λj βj. Maximizing over νi, for fixed λ, gives νi = min_{1≤j≤n} (aij − λj), so the dual problem takes the form
maximize q(λ) = Σ_{j=1}^n λj βj + Σ_{i=1}^m min_{1≤j≤n} (aij − λj) αi
subject to λ ∈ ℝⁿ.
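The reduced dual is itself an LP once the inner minimum is modeled with epigraph variables ti ≤ aij − λj. The sketch below (made-up costs, supplies, and demands) checks that its optimal value matches the primal transportation optimum:

# Sketch (made-up data): the transportation primal versus the reduced dual
#   maximize Σ_j λ_j β_j + Σ_i α_i min_j (a_ij - λ_j),
# solved as an LP with epigraph variables t_i ≤ a_ij - λ_j.
import numpy as np
from scipy.optimize import linprog

a = np.array([[4.0, 1.0, 3.0],
              [2.0, 5.0, 2.0]])                 # costs a_ij
alpha = np.array([10.0, 15.0])                  # supplies
beta = np.array([5.0, 12.0, 8.0])               # demands (Σα = Σβ)
m, n = a.shape

# primal: min Σ a_ij x_ij s.t. row sums = α, column sums = β, x ≥ 0
A_eq = np.zeros((m + n, m * n))
for i in range(m):
    A_eq[i, i * n:(i + 1) * n] = 1.0            # Σ_j x_ij = α_i
for j in range(n):
    A_eq[m + j, j::n] = 1.0                     # Σ_i x_ij = β_j
primal = linprog(a.ravel(), A_eq=A_eq, b_eq=np.append(alpha, beta),
                 bounds=[(0, None)] * (m * n))

# dual: variables (λ_1..λ_n, t_1..t_m); maximize β'λ + α't
# subject to λ_j + t_i ≤ a_ij (i.e. t_i ≤ a_ij - λ_j for all j).
A_ub = np.zeros((m * n, n + m))
for i in range(m):
    for j in range(n):
        A_ub[i * n + j, j] = 1.0                # coefficient of λ_j
        A_ub[i * n + j, n + i] = 1.0            # coefficient of t_i
dual = linprog(-np.concatenate([beta, alpha]), A_ub=A_ub, b_ub=a.ravel(),
               bounds=[(None, None)] * (n + m))
print(primal.fun, -dual.fun)                    # equal optimal values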
(b) & (c) The Lagrange multiplier λj can be interpreted as the price pj. So if the transportation problem has an optimal solution x∗, then its dual also has an optimal solution, say p∗, and
q(p∗) = Σ_{i,j} aij x∗ij,
i.e.,
Σ_{j=1}^n p∗j βj + Σ_{i=1}^m min_{1≤j≤n} (aij − p∗j) αi = Σ_{i,j} aij x∗ij. (1)
By the feasibility of x∗, we have Σ_{i=1}^m x∗ij = βj for all j, so that
Σ_{j=1}^n p∗j βj = Σ_{j=1}^n Σ_{i=1}^m p∗j x∗ij,
and Eq. (1) yields
Σ_{i=1}^m min_{1≤j≤n} {aij − p∗j} αi = Σ_{i,j} (aij − p∗j) x∗ij. (2)
By the feasibility of x∗, we also have Σ_{j=1}^n x∗ij = αi for all i, and from Eq. (2) it follows that
Σ_{i,j} (aij − p∗j − min_{1≤l≤n} {ail − p∗l}) x∗ij = 0.
Since all the terms in the summation above are nonnegative, we must have
(aij − p∗j − min_{1≤l≤n} {ail − p∗l}) x∗ij = 0, ∀ i, j.
Since p∗ is arbitrary, this property holds for every dual optimal solution p∗ .
whose optimal value is equal to min_{x∈X} max_{z∈Z} x′Az. Introduce dual variables z ∈ ℝᵐ and ξ ∈ ℝ, corresponding to the constraints A′x − ζe ≤ 0 and Σ_{i=1}^n xi = 1, respectively. The dual function is
q(z, ξ) = inf_{ζ∈ℝ, xi≥0, i=1,…,n} { ζ + z′(A′x − ζe) + ξ(1 − Σ_{i=1}^n xi) }
= inf_{ζ∈ℝ, xi≥0, i=1,…,n} { ζ(1 − Σ_{j=1}^m zj) + x′(Az − ξe) + ξ }
= ξ if Σ_{j=1}^m zj = 1 and ξe − Az ≤ 0, and −∞ otherwise.
Thus the dual problem, which is to maximize q(z, ξ) subject to z ≥ 0 and ξ ∈ ℝ, is equivalent to the linear program
max_{ξe ≤ Az, z ∈ Z} ξ.
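Both linear programs can be solved directly to confirm that they attain the same value (the value of the matrix game). The payoff matrix below is made up for illustration:

# Sketch (made-up payoff matrix): both players' LPs give the same value,
# illustrating min_{x∈X} max_{z∈Z} x'Az = max_{z∈Z} min_{x∈X} x'Az.
import numpy as np
from scipy.optimize import linprog

A = np.array([[3.0, -1.0],
              [0.0, 2.0]])                       # payoff matrix
n, m = A.shape

# minimizer: min ζ s.t. A'x ≤ ζe, Σx_i = 1, x ≥ 0; variables (x, ζ)
res_x = linprog(np.append(np.zeros(n), 1.0),
                A_ub=np.hstack([A.T, -np.ones((m, 1))]),
                b_ub=np.zeros(m),
                A_eq=np.append(np.ones(n), 0.0).reshape(1, -1),
                b_eq=[1.0],
                bounds=[(0, None)] * n + [(None, None)])

# maximizer: max ξ s.t. ξe ≤ Az, Σz_j = 1, z ≥ 0; variables (z, ξ)
res_z = linprog(np.append(np.zeros(m), -1.0),
                A_ub=np.hstack([-A, np.ones((n, 1))]),
                b_ub=np.zeros(n),
                A_eq=np.append(np.ones(m), 0.0).reshape(1, -1),
                b_eq=[1.0],
                bounds=[(0, None)] * m + [(None, None)])
print(res_x.fun, -res_z.fun)                     # equal: the game value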