Optimization Methods (MFE) : Elena Perazzi
Lecture 03
Elena Perazzi
EPFL
Fall 2019
Kuhn-Tucker theory.
[Figure: penalty term cP(x) for the constraints g1(x) = x − b ≤ 0, g2(x) = a − x ≤ 0 (i.e. a ≤ x ≤ b), plotted for c = 1, 10, 100; the penalty steepens outside [a, b] as c grows.]
We build the penalized objective θ(~x, c) = f(~x) + cP(~x), where the penalty term P(~x) is zero on the feasible region and positive outside it.
Algorithm:
- Start with a relatively small value of the penalty parameter c, at an infeasible point not too close to the constraint boundary. This ensures that no steep gradients are present in the initial optimization of θ(~x, c). The min of θ(~x, c) will probably not be a feasible point for the original problem.
- Gradually increase c (e.g. c_{k+1} = η c_k with η > 1), each time starting the optimization from the solution of the problem with the previous value of c. If c increases gradually, the solution of the new problem will never be far from the solution of the previous one, which makes it easier to find the min of θ(~x, c) from one iteration to the next.
- Stop when you find a solution that is in, or close enough to, the feasible region (see the sketch below).
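A minimal sketch of this loop, assuming for concreteness the quadratic penalty P(~x) = Σ_j max(0, g_j(~x))² and scipy's unconstrained minimizer as the inner solver; the problem instance mirrors the figure above, but all names and parameter values here are illustrative choices, not part of the slides:

```python
import numpy as np
from scipy.optimize import minimize

def penalty_method(f, gs, x0, c0=1.0, eta=10.0, tol=1e-6, max_outer=20):
    """Sequential penalty method: minimize theta(x, c) = f(x) + c * P(x),
    with P(x) = sum_j max(0, g_j(x))^2, for geometrically increasing c."""
    x, c = np.atleast_1d(x0).astype(float), c0
    for _ in range(max_outer):
        def theta(x):
            return f(x) + c * sum(max(0.0, g(x)) ** 2 for g in gs)
        # warm start: begin the inner unconstrained solve at the previous solution
        x = minimize(theta, x).x
        if all(g(x) <= tol for g in gs):   # stop once (numerically) feasible
            return x
        c *= eta                           # c_{k+1} = eta * c_k
    return x

# Illustrative instance: min (x-3)^2 s.t. 1 <= x <= 2, written as
# g1(x) = x - 2 <= 0, g2(x) = 1 - x <= 0. Start at the infeasible point x = 3.
f = lambda x: (x[0] - 3.0) ** 2
gs = [lambda x: x[0] - 2.0, lambda x: 1.0 - x[0]]
print(penalty_method(f, gs, x0=[3.0]))     # -> approximately [2.0]
```

The iterates approach x = 2 from outside the feasible interval, exactly as described: each increase of c pulls the unconstrained minimizer of θ(~x, c) closer to the boundary.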
Barrier methods
We now minimize θ(~x, r) = f(~x) + rB(~x), where B(~x) is a barrier term (e.g. B(~x) = −Σ_j 1/g_j(~x)). In the feasible region, the extra term on the RHS is positive, and becomes infinite at the borders of the region, where at least one constraint is satisfied with equality.
Algorithm:
- Start with a relatively high value of the barrier parameter r, at a point inside the feasible region, not too close to the constraint boundary. The solution of this problem will stay inside the feasible region, and will not approach the region where the barrier term is high.
- Gradually decrease r, each time starting the optimization from the solution of the problem with the previous value of r.
- Stop when x_{k+1} is close enough to x_k.
The solution will converge to a (local) constrained min of the original problem from the inside of the feasible region, just as the solution of the penalty-function problem converged from the outside.
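A matching sketch of the barrier loop, again on the illustrative interval problem, assuming the inverse barrier B(~x) = −Σ_j 1/g_j(~x) (the log barrier would work equally well); the function and parameter names are my own, not from the slides:

```python
import numpy as np
from scipy.optimize import minimize

def barrier_method(f, gs, x0, r0=10.0, shrink=0.1, tol=1e-8, max_outer=30):
    """Sequential barrier method: minimize theta(x, r) = f(x) + r * B(x),
    with B(x) = -sum_j 1/g_j(x), for geometrically decreasing r.
    x0 must be strictly feasible (g_j(x0) < 0 for all j)."""
    x, r = np.atleast_1d(x0).astype(float), r0
    for _ in range(max_outer):
        def theta(x):
            vals = np.array([g(x) for g in gs])
            if np.any(vals >= 0):        # infinite barrier outside the region
                return np.inf
            return f(x) - r * np.sum(1.0 / vals)
        # Nelder-Mead tolerates the +inf values at the boundary
        x_new = minimize(theta, x, method="Nelder-Mead").x
        if np.linalg.norm(x_new - x) < tol:   # stop if x_{k+1} close to x_k
            return x_new
        x, r = x_new, r * shrink
    return x

# Same illustrative instance, started strictly inside [1, 2]:
f = lambda x: (x[0] - 3.0) ** 2
gs = [lambda x: x[0] - 2.0, lambda x: 1.0 - x[0]]
print(barrier_method(f, gs, x0=[1.5]))   # -> approximately [2.0], from inside
```

Here every iterate is strictly feasible, and the minimizer of θ(~x, r) creeps toward the boundary x = 2 as r shrinks.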
[Figure: three concave parabolas on [0, 2], illustrating maxima under the constraint x ≥ 0: f(x) = 5 − (x − 1)² has an interior max at x∗ = 1 with f′(x∗) = 0; f(x) = 5 − x² has its max at the boundary point x∗ = 0 with f′(x∗) = 0; f(x) = 5 − (x + 1)² has its max at the boundary point x∗ = 0 with f′(x∗) < 0.]
∇L(x∗) = ∇f(x∗) + λ∇h(x∗) = 0    (7)
[Figure: feasible region in the (x1, x2) plane defined by x2 ≤ 1 − x1, x2 ≥ x1, x2 ≥ 0.5.]
Kuhn-Tucker theory – tools
∇_x L(x∗, λ∗, µ∗) = ∇f(x∗) + Σ_{i=1}^k λ_i∗ ∇h_i(x∗) + Σ_{j=1}^m µ_j∗ ∇g_j(x∗) = 0
∂L(x∗, λ∗, µ∗)/∂λ_i = h_i(x∗) = 0
∂L(x∗, λ∗, µ∗)/∂µ_j = g_j(x∗) ≤ 0
µ_j∗ ≥ 0
Σ_{j=1}^m µ_j∗ ∂L(x∗, λ∗, µ∗)/∂µ_j = Σ_{j=1}^m µ_j∗ g_j(x∗) = 0
These are the necessary conditions for a min, provided the constraint qualification holds.
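The conditions are straightforward to verify numerically at a candidate point. A small sketch (my own illustration, on a hypothetical problem not taken from the slides) that checks stationarity, primal and dual feasibility, and complementary slackness:

```python
import numpy as np

def check_kkt(grad_f, hs, grad_hs, gs, grad_gs, x, lams, mus, tol=1e-8):
    """Check the Kuhn-Tucker necessary conditions at a candidate (x, lambda, mu)."""
    # stationarity: grad f + sum_i lam_i grad h_i + sum_j mu_j grad g_j = 0
    grad_L = grad_f(x) \
        + sum(l * gh(x) for l, gh in zip(lams, grad_hs)) \
        + sum(m * gg(x) for m, gg in zip(mus, grad_gs))
    ok = np.linalg.norm(grad_L) < tol
    ok &= all(abs(h(x)) < tol for h in hs)      # primal feasibility: h_i(x) = 0
    ok &= all(g(x) <= tol for g in gs)          # primal feasibility: g_j(x) <= 0
    ok &= all(m >= -tol for m in mus)           # dual feasibility: mu_j >= 0
    ok &= all(abs(m * g(x)) < tol               # complementary slackness
              for m, g in zip(mus, gs))
    return bool(ok)

# Illustrative problem: min x^2 + y^2  s.t.  g(x, y) = 1 - x - y <= 0.
# Candidate: x* = (1/2, 1/2) with mu* = 1.
grad_f = lambda x: 2 * x
g      = lambda x: 1 - x[0] - x[1]
grad_g = lambda x: np.array([-1.0, -1.0])
print(check_kkt(grad_f, [], [], [g], [grad_g],
                np.array([0.5, 0.5]), [], [1.0]))   # -> True
```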
Kuhn-Tucker theorem
If x∗ is a local maximum conditional on g_j(x) ≤ 0, j = 1, ..., m and h_i(x) = 0, i = 1, ..., k, and J(x∗) has maximal rank (i.e. rank = number of active constraints), then there exists a vector of Lagrange multipliers (λ_1∗, λ_2∗, ..., λ_k∗, µ_1∗, ..., µ_m∗) such that
∇_x L(x∗, λ∗, µ∗) = ∇f(x∗) + Σ_{i=1}^k λ_i∗ ∇h_i(x∗) − Σ_{j=1}^m µ_j∗ ∇g_j(x∗) = 0
∂L(x∗, λ∗, µ∗)/∂λ_i = h_i(x∗) = 0
∂L(x∗, λ∗, µ∗)/∂µ_j = g_j(x∗) ≤ 0
µ_j∗ ≥ 0
Σ_{j=1}^m µ_j∗ ∂L(x∗, λ∗, µ∗)/∂µ_j = Σ_{j=1}^m µ_j∗ g_j(x∗) = 0
The equations
g_j(x∗) ≤ 0,  µ_j∗ ≥ 0,  Σ_{j=1}^m µ_j∗ g_j(x∗) = 0
are the complementary slackness conditions: each term µ_j∗ g_j(x∗) is nonpositive, so the sum vanishes only if, for every j, either the constraint is active (g_j(x∗) = 0) or its multiplier is zero (µ_j∗ = 0).
Example: min 3x + 4y  s.t.  x² + y² ≥ 4,  x ≥ 1.
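For this example the KKT system can be solved by hand or symbolically. Rewriting the constraints in the g ≤ 0 form gives g1 = 4 − x² − y² ≤ 0 and g2 = 1 − x ≤ 0. A sketch with sympy (my own illustration, assuming the case where both constraints are active and then filtering by dual feasibility):

```python
import sympy as sp

x, y, mu1, mu2 = sp.symbols("x y mu1 mu2", real=True)
f  = 3 * x + 4 * y
g1 = 4 - x**2 - y**2          # x^2 + y^2 >= 4  rewritten as  g1 <= 0
g2 = 1 - x                    # x >= 1          rewritten as  g2 <= 0
L  = f + mu1 * g1 + mu2 * g2  # Lagrangian, min convention

# KKT system in the case where both constraints are active:
eqs = [sp.diff(L, x), sp.diff(L, y), g1, g2]
for sol in sp.solve(eqs, [x, y, mu1, mu2], dict=True):
    if sol[mu1] >= 0 and sol[mu2] >= 0:   # keep only dual-feasible candidates
        print(sol)
# -> x = 1, y = sqrt(3), mu1 = 2*sqrt(3)/3, mu2 = 3 - 4*sqrt(3)/3
```

Both multipliers come out nonnegative and both constraints hold with equality at (1, √3), so the candidate satisfies all of the Kuhn-Tucker conditions above.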