
Karush-Kuhn-Tucker conditions

MFDS Team
Introduction

▶ We will look at constrained optimization and Lagrange
multipliers.
▶ We will look at primal and dual problems and how their
solutions are related.
▶ We will set up the Karush-Kuhn-Tucker conditions.
Constrained optimization and Lagrange multipliers

▶ Consider the following problem: minx f (x), f : Rd → R,
subject to additional constraints - so we are looking at a
minimization problem except that the set of all x over which
minimization is performed is not all of Rd.
▶ The constrained problem becomes minx f (x) subject to
gi (x) ≤ 0 ∀i ∈ {1, 2, . . . , m}.
▶ Since we have a method of finding a solution to the
unconstrained optimization problem, one way to proceed now
is to convert the given constrained optimization problem into
an unconstrained one.
▶ We construct J(x) = f (x) + Σ_{i=1}^{m} 1(gi (x)), where 1(z) = 0
for z ≤ 0 and 1(z) = ∞ for z > 0.
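To make the construction concrete, here is a minimal Python sketch of J(x) for a toy one-dimensional problem (the names J, f, and gs are illustrative, not from the lecture):

import numpy as np

def J(x, f, gs):
    # J(x) = f(x) + sum_i 1(g_i(x)), with 1(z) = 0 for z <= 0 and inf for z > 0
    penalty = sum(0.0 if g(x) <= 0 else np.inf for g in gs)
    return f(x) + penalty

# Toy problem: minimize x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0.
f = lambda x: x ** 2
gs = [lambda x: 1 - x]
print(J(0.5, f, gs))  # inf (constraint violated)
print(J(2.0, f, gs))  # 4.0 (feasible, so J equals f)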
Constrained optimization and Lagrange multipliers

▶ The formulation of J(x) in the previous slide ensures that its
value is infinity if any one of the constraints gi (x) ≤ 0 is not
satisfied. This ensures that the optimal solution of the
unconstrained problem is the same as that of the constrained problem.
▶ However, the step function 1(z) is difficult to optimize, and our
solution is to replace it by a linear function using Lagrange
multipliers.
▶ We create the Lagrangian of the given constrained
optimization problem as follows:
L(x, λ) = f (x) + Σ_{i=1}^{m} λi gi (x) = f (x) + λT g (x), where
λi ≥ 0 for all i.
Primal and dual problems

▶ The primal problem is min f (x ) subject to


gi (x ) ≤ 0, 1 ≤ i ≤ m. Optimization is performed over the
primal variables x .
▶ The associated Lagrangian dual problem is maxλ∈Rm D(λ)
subject to λ ≥ 0 where λ are dual variables.
▶ D(λ) = minx ∈Rd L(x , λ).
▶ The following minimax inequality holds for any function ϕ of two
arguments x, y: maxy minx ϕ(x, y) ≤ minx maxy ϕ(x, y).
Minimax inequality

▶ Why is this inequality true?


▶ Assume that maxy minx ϕ(x, y) = ϕ(xA , yA ) and
minx maxy ϕ(x, y) = ϕ(xB , yB ).
▶ Fixing y at yA , the inner operation on the left-hand side of the
minimax inequality is a minimization over x, attained at xA .
Thus ϕ(xA , yA ) = minx ϕ(x, yA ) ≤ ϕ(xB , yA ).
▶ Fixing x at xB , the inner operation on the right-hand side of the
minimax inequality is a maximization over y, attained at yB .
Thus ϕ(xB , yB ) = maxy ϕ(xB , y) ≥ ϕ(xB , yA ).
▶ Chaining the two, ϕ(xA , yA ) ≤ ϕ(xB , yA ) ≤ ϕ(xB , yB ), which is
the claimed inequality.
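The inequality is easy to verify numerically. The Python sketch below evaluates a function ϕ on a grid and checks both sides; the particular ϕ is an arbitrary illustrative choice:

import numpy as np

xs = np.linspace(-2, 2, 101)
ys = np.linspace(-2, 2, 101)
X, Y = np.meshgrid(xs, ys, indexing="ij")   # phi[i, j] = phi(x_i, y_j)
phi = np.sin(3 * X * Y) + X**2 - Y**2

lhs = np.max(np.min(phi, axis=0))  # max over y of (min over x)
rhs = np.min(np.max(phi, axis=1))  # min over x of (max over y)
print(lhs, rhs, lhs <= rhs)        # lhs <= rhs holds for any phi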
Minimax inequality

▶ The difference between J(x ) and the Lagrangian L(x , λ) is


that the indicator function is relaxed to a linear function.
▶ When λ ≥ 0, the Lagrangian L(x , λ) is a lower bound on
J(x ).
▶ The maximum of L(x, λ) over λ ≥ 0 is J(x): if the
point x satisfies all the constraints gi (x) ≤ 0, then the
maximum of the Lagrangian is attained at λ = 0 and equals
f (x) = J(x). If one or more constraints is violated, so that
gi (x) > 0, the associated multiplier λi can be taken
arbitrarily large, making the supremum ∞, which again equals J(x).
Minimax inequality

▶ From the previous slide, we have J(x ) = maxλ≥0 L(x , λ).


▶ Our original constrained optimization problem boils down to
minimizing J(x); in other words, we are looking at
minx ∈Rd maxλ≥0 L(x, λ).
▶ Using the minimax inequality we see that
minx ∈Rd maxλ≥0 L(x , λ) ≥ maxλ≥0 minx ∈Rd L(x , λ).
▶ This is known as weak duality. The inner part of the right
hand side of the inequality is D(λ), and the inequality above
is the reason for setting up the associated Lagrangian dual
problem for the original constrained optimization problem.
Lagrangian formulation

▶ In contrast to the original formulation,
D(λ) = minx ∈Rd L(x, λ) is an unconstrained optimization
problem for a given value of λ.
▶ We observe that D(λ) is a point-wise minimum of functions
affine in λ, and hence D(λ) is concave even
though f () and g () may be nonconvex (a numeric check follows this list).
▶ We have obtained a Lagrangian formulation for a constrained
optimization problem where the constraints are inequalities.
What happens when some constraints are equalities?
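Here is the promised numeric check of the concavity of D(λ), a sketch that uses an illustrative nonconvex f and a grid minimum over x in place of the exact inner minimization:

import numpy as np

f = lambda x: x**4 - 3 * x**2   # nonconvex objective
g = lambda x: x - 1             # single constraint g(x) <= 0
xs = np.linspace(-3, 3, 2001)

def D(lam):
    # pointwise minimum over x of f(x) + lam * g(x): affine in lam
    return np.min(f(xs) + lam * g(xs))

lams = np.linspace(0, 5, 50)
for l1 in lams:
    for l2 in lams:
        # midpoint concavity: D((l1+l2)/2) >= (D(l1) + D(l2)) / 2
        assert D((l1 + l2) / 2) >= (D(l1) + D(l2)) / 2 - 1e-9
print("midpoint concavity of D verified on the grid")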
Modeling equality constraints

▶ Suppose the problem is minx f (x ) subject to gi (x ) ≤ 0 for all


1 ≤ i ≤ m and hj (x ) = 0 for 1 ≤ j ≤ n.
▶ We model the equality constraint hj (x ) = 0 with two
inequality constraints hj (x ) ≥ 0 and hj (x ) ≤ 0.
▶ The resulting pair of non-negative Lagrange multipliers combines
into a single unconstrained multiplier, as the derivation below shows.
▶ Thus the Lagrange multipliers for the original inequality constraints
are non-negative, while those corresponding to the equality
constraints are unconstrained.
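The following short derivation (a standard argument, not spelled out in the slides) shows why the two non-negative multipliers attached to the split constraints collapse into a single unconstrained multiplier νj:

\begin{align*}
L(x, \lambda, \lambda^{+}, \lambda^{-})
  &= f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)
   + \sum_{j=1}^{n} \lambda_j^{+} h_j(x)
   + \sum_{j=1}^{n} \lambda_j^{-} \bigl(-h_j(x)\bigr) \\
  &= f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)
   + \sum_{j=1}^{n} \nu_j h_j(x),
\qquad \nu_j = \lambda_j^{+} - \lambda_j^{-},
\end{align*}

and since λj+ , λj− ≥ 0 can each be arbitrarily large, their difference νj can take any real value.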
Convex optimization

▶ We are interested in a class of optimization problems where


we can guarantee global optimality.
▶ When f (), the objective function, and the constraint functions g ()
are convex, and the constraint functions h() are affine, we have a
convex optimization problem.
▶ In this setting, under a mild condition introduced later (Slater's
condition), we have strong duality - the optimal value of
the primal problem equals the optimal value of the dual problem.
▶ What is a convex function?
Convex function

▶ First we need to know what a convex set is. A set C is a
convex set if for any x, y ∈ C , θx + (1 − θ)y ∈ C where
0 ≤ θ ≤ 1.
▶ For any two points lying in the convex set, the line segment joining
them lies entirely in the convex set.
▶ Let f : Rd → R be a function whose domain is a
convex set C .
▶ The function is a convex function if for any x, y ∈ C and 0 ≤ θ ≤ 1,
f (θx + (1 − θ)y ) ≤ θf (x) + (1 − θ)f (y ).
▶ Another way of characterizing a differentiable convex function uses
the gradient: for any two points x and y , we have
f (y ) ≥ f (x) + ∇x f (x)T (y − x).
Example

▶ The negative entropy, a useful function in Machine Learning,
is convex: f (x) = x log2 x for x > 0.
▶ First let us check if f (θx + (1 − θ)y ) ≤ θf (x) + (1 − θ)f (y ).
Take x = 2, y = 4, and θ = 0.5 to get
f (0.5 · 2 + 0.5 · 4) = f (3) = 3 log2 3 ≈ 4.75. Then
θf (2) + (1 − θ)f (4) = 0.5 · 2 log2 2 + 0.5 · 4 log2 4 = 1 + 4 = 5.
Therefore the convexity criterion is satisfied for these two
points.
▶ Let us now use the gradient criterion. We have
∇x f (x) = log2 x + 1/ ln 2 = log2 x + log2 e. Calculating
f (2) + ∇x f (2) · (4 − 2) gives 2 log2 2 + (log2 2 + log2 e) · 2 ≈ 6.9.
We see that f (4) = 4 log2 4 = 8 ≥ 6.9, which shows that the
gradient criterion is also satisfied for this pair of points.
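The same arithmetic can be reproduced in a few lines of Python (a sketch; the numbers match the slide up to rounding):

import numpy as np

f = lambda x: x * np.log2(x)
grad = lambda x: np.log2(x) + 1 / np.log(2)   # d/dx [x log2 x] = log2 x + log2 e

x, y, theta = 2.0, 4.0, 0.5
print(f(theta * x + (1 - theta) * y))     # f(3) ~ 4.755
print(theta * f(x) + (1 - theta) * f(y))  # 5.0: chord lies above the function

print(f(x) + grad(x) * (y - x))           # ~ 6.885: tangent-line lower bound
print(f(y))                               # 8.0: function lies above the tangent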
Linear programming

▶ Let us look at a convex optimization problem where the


objective function and constraints are all linear.
▶ Such a convex optimization problem is called a linear
programming problem.
▶ We can express a linear programming problem as minx c T x
subject to Ax ≤ b where A ∈ Rm×d and b ∈ Rm×1 .
▶ The Lagrangian L(x, λ) is given by
L(x, λ) = c T x + λT (Ax − b), where λ ∈ Rm is the vector of
non-negative Lagrange multipliers.
▶ We can rewrite the Lagrangian as
L(x, λ) = (c + AT λ)T x − λT b.
Linear programming

▶ The Lagrangian is affine in x, so minx L(x, λ) is −∞ unless the
coefficient of x vanishes; taking the derivative with respect to x
and setting it to zero we get c + AT λ = 0.
▶ Since D(λ) = minx ∈Rd L(x, λ), plugging in the above
equation gives D(λ) = −λT b.
▶ We would like to maximize D(λ), subject to the constraint
λ ≥ 0.
▶ Thus we end up with the following problem:

maxλ∈Rm − λT b
subject to c + AT λ = 0
λ≥0
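As a sanity check, one can solve a small instance of both problems with scipy.optimize.linprog and observe that the optimal values coincide; the specific LP below is an illustrative choice:

import numpy as np
from scipy.optimize import linprog

# Primal: min c^T x subject to Ax <= b (encodes x1, x2 >= 0 and x1 + x2 >= 1).
c = np.array([1.0, 1.0])
A = np.array([[-1.0, 0.0], [0.0, -1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, -1.0])
# x is free in our formulation, so override linprog's default x >= 0 bounds.
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)

# Dual: max -b^T lam subject to A^T lam = -c, lam >= 0.
# linprog minimizes, so minimize b^T lam and negate the optimal value.
dual = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0, None)] * 3)

print(primal.fun, -dual.fun)  # both 1.0: the two optimal values agree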
Linear programming

▶ We can solve the original primal linear program or the dual


one - the optimum in each case is the same.
▶ The primal linear program is in d variables but the dual is in
m variables, where m is the number of constraints in the
original primal program.
▶ We choose to solve the primal or dual based on which of m or
d is smaller.
Quadratic programming

▶ We now consider the case of a quadratic objective function


subject to affine constraints:

minx ∈Rd (1/2) x T Qx + c T x subject to Ax ≤ b

▶ Here A ∈ Rm×d , b ∈ Rm , c ∈ Rd , and Q ∈ Rd×d .
Quadratic programming

▶ The Lagrangian L(x, λ) is given by
L(x, λ) = (1/2) x T Qx + c T x + λT (Ax − b).
▶ Rearranging the above we have
L(x, λ) = (1/2) x T Qx + (c + AT λ)T x − λT b.
▶ Taking the derivative of L(x, λ) with respect to x and setting it
equal to zero gives Qx + (c + AT λ) = 0.
▶ If we take Q to be invertible, we have x = −Q −1 (c + AT λ).
▶ Plugging this value of x into L(x, λ) gives us
D(λ) = −(1/2)(c + AT λ)T Q −1 (c + AT λ) − λT b.
▶ This gives us the dual optimization problem:
maxλ∈Rm −(1/2)(c + AT λ)T Q −1 (c + AT λ) − λT b subject to
λ ≥ 0.
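A minimal numeric sketch of this derivation, on an illustrative QP with Q = I and a single constraint x1 + x2 ≥ 1; the dual is maximized with a generic bound-constrained solver rather than a dedicated QP method:

import numpy as np
from scipy.optimize import minimize

Q = np.eye(2)
c = np.zeros(2)
A = np.array([[-1.0, -1.0]])   # encodes x1 + x2 >= 1 as Ax <= b
b = np.array([-1.0])
Qinv = np.linalg.inv(Q)

def neg_D(lam):
    # negative of D(lam) = -1/2 (c + A^T lam)^T Q^{-1} (c + A^T lam) - lam^T b
    w = c + A.T @ lam
    return 0.5 * w @ Qinv @ w + lam @ b

res = minimize(neg_D, x0=np.zeros(1), bounds=[(0, None)])
lam_star = res.x
x_star = -Qinv @ (c + A.T @ lam_star)   # stationarity: x = -Q^{-1}(c + A^T lam)
print(lam_star, x_star, -res.fun)        # ~[0.5], ~[0.5, 0.5], dual optimum 0.25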
Strong duality

▶ The minimax inequality establishes weak duality, which states
that the optimal value of the primal problem is greater than
or equal to that of the dual problem.
▶ When equality holds, this becomes strong duality.
▶ Strong duality is useful in that one can solve the dual problem
to get the same solution as solving the primal problem.
▶ Solving the dual problem may be easier.
▶ When does strong duality hold?
Slater’s condition

We shall work with the following optimization problem:

minimize f (x ) subject to
gi (x ) ≤ 0 ∀i ∈ [m]
hj (x ) = 0 ∀j ∈ [p]

The Lagrangian associated with this optimization problem is

L(x, λ, ν) = f (x) + Σ_{i=1}^{m} λi gi (x) + Σ_{j=1}^{p} νj hj (x)

The λi 's and νj 's are called Lagrange multipliers.


Slater’s condition

Given a Lagrangian L(x , λ, ν) over some optimization domain D,


the Lagrangian dual is the function F (λ, ν) = inf x ∈D L(x , λ, ν).
The dual optimization problem is

maxλ,ν F (λ, ν)
subject to λ ≥ 0
Slater’s condition

▶ For a primal optimization problem, we say that it obeys Slater's
condition if the objective function f is convex, the constraints
gi are all convex, the constraint functions hj are all affine,
and there exists a strictly feasible point x̄, i.e.
gi (x̄) < 0 for all i ∈ [m] and hj (x̄) = 0 for all j ∈ [p].
▶ Theorem: Suppose Slater's condition holds. Then we have
strong duality.
Proof of Slater’s condition

Let us define two sets:

A = {(u, v , t) | ∃x ∈ D such that gi (x) ≤ ui for i = 1, . . . , m,
hj (x) = vj for j = 1, . . . , p, and f (x) ≤ t}
B = {(0, 0, s) ∈ Rm × Rp × R | s < p ∗ },

where p ∗ is the optimal value of the primal problem.
We can show that the sets A and B are convex and
disjoint. By the separating hyperplane theorem, there therefore
exists a hyperplane separating the two sets.
Proof of Slater’s condition

We can define the separating hyperplane as follows:

(u, v , t) ∈ A =⇒ λ̃T u + ν̃ T v + µt ≥ α
(u, v , t) ∈ B =⇒ λ̃T u + ν̃ T v + µt ≤ α

We can see from the above that λ̃ ≥ 0 and µ ≥ 0: points of A remain
in A when any coordinate of u or t is increased, so a negative
component of λ̃ or a negative µ would make the left-hand side of the
first inequality arbitrarily small, and it could not be lower-bounded by α.
Since points of B have u = 0 and v = 0, the second condition means
that µt ≤ α for all t < p ∗ , which gives µp ∗ ≤ α.
Proof of Slater’s condition

For any x ∈ D,

Σ_{i=1}^{m} λ̃i gi (x) + ν̃ T (Ax − b) + µf (x) ≥ α ≥ µp ∗

There are now two cases: µ > 0 and µ = 0. When µ > 0, we can
divide both sides by µ to get

L(x, λ̃/µ, ν̃/µ) ≥ p ∗

for all x ∈ D. Defining λ = λ̃/µ and ν = ν̃/µ, we can set
F (λ, ν) = inf x ∈D L(x, λ, ν). We can see that F (λ, ν) ≥ p ∗ .
Proof of Slater’s condition

▶ By weak duality we know that p ∗ ≥ F (λ, ν). Combined with the
previous slide, this gives F (λ, ν) = p ∗ , i.e. strong duality.
▶ Let us now consider the case µ = 0.
▶ Then, for all x ∈ D, we have Σ_{i=1}^{m} λ̃i gi (x) + ν̃ T (Ax − b) ≥ 0.
▶ For the point x̃ that satisfies Slater's condition (which gives
gi (x̃) < 0 for all i and Ax̃ = b), we have Σ_{i=1}^{m} λ̃i gi (x̃) ≥ 0.
Proof of Slater’s condition

▶ From gi (x̃) < 0 and λ̃i ≥ 0, we conclude that λ̃i = 0 for all i.
▶ From (λ̃, ν̃, µ) ̸= 0 and (λ̃, µ) = 0, we conclude that ν̃ ̸= 0.
▶ Then from Σ_{i=1}^{m} λ̃i gi (x) + ν̃ T (Ax − b) ≥ 0, we have
ν̃ T (Ax − b) ≥ 0 for all x ∈ D.
▶ We already know that x̃ satisfies Ax̃ − b = 0. Since x̃ lies in the
interior of D, we can choose a small perturbation ϵ with
x̃ + ϵ ∈ D such that ν̃ T (A(x̃ + ϵ) − b) < 0, unless ν̃ T A = 0.
▶ But if there exists a non-zero ν̃ such that ν̃ T A = 0, then A does
not have rank p, which is a contradiction. Thus ν̃ = 0, but this
contradicts (λ̃, ν̃, µ) ̸= 0. Therefore µ cannot be zero.
Duality gap

▶ In some cases computing the optimal solution of the dual
problem is easier than computing the optimal solution of the
primal problem.
▶ Let α∗ denote the optimal value of the primal problem and
β ∗ denote the optimal value of the dual problem. From
weak duality we know that α∗ ≥ β ∗ .
▶ Any feasible solution to the dual problem gives a lower bound on
the optimal value of the primal problem.
▶ For a primal feasible x and dual feasible (λ, ν), we have
f (x) − α∗ ≤ f (x) − F (λ, ν). If we know that
f (x) − F (λ, ν) < ϵ, then we know that f (x) is at most ϵ away
from the true optimal value. The quantity f (x) − F (λ, ν) is called
the duality gap, and it can serve as a stopping criterion, as the
sketch below illustrates.
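A tiny illustration of the gap as a suboptimality certificate, on the one-dimensional problem min x^2 subject to x ≥ 1, whose dual function is worked out by hand:

# L(x, lam) = x^2 + lam * (1 - x); minimizing over x gives x = lam / 2,
# so F(lam) = lam - lam^2 / 4 in closed form.
f = lambda x: x ** 2
F = lambda lam: lam - lam ** 2 / 4

x_feas, lam_feas = 1.1, 1.8      # any primal feasible / dual feasible pair
gap = f(x_feas) - F(lam_feas)
print(gap)                        # 0.22: certifies f(x_feas) - f(x*) <= 0.22
# The true optimum is x* = 1 with f(x*) = 1, and indeed 1.21 - 1 = 0.21 <= 0.22.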
Complementary slackness

We make the following claim. Claim 1: Let x ∗ ∈ Rd be primal
optimal and (λ∗ , ν ∗ ) ∈ Rm × Rp be dual optimal, and suppose
strong duality holds. Then
▶ x ∗ ∈ argminx L(x, λ∗ , ν ∗ )
▶ λ∗i gi (x ∗ ) = 0 ∀i ∈ [m]
Proof of complementary slackness

We have f (x ∗ ) = F (λ∗ , ν ∗ ) because of strong duality. Then we
can write

f (x ∗ ) = F (λ∗ , ν ∗ )
        = inf x ( f (x) + Σ_{i∈[m]} λ∗i gi (x) + Σ_{i∈[p]} νi∗ hi (x) )
        ≤ f (x ∗ ) + Σ_{i∈[m]} λ∗i gi (x ∗ ) + Σ_{i∈[p]} νi∗ hi (x ∗ )
        ≤ f (x ∗ )
Proof of complementary slackness

▶ The first line of the preceding chain is due to strong
duality.
▶ The second line is the definition of the Lagrangian dual function.
▶ The third line holds because the infimum over all x is at most
the value at the particular point x ∗ .
▶ The fourth and final line comes about because primal
feasibility of x ∗ gives gi (x ∗ ) ≤ 0 and hi (x ∗ ) = 0, and
dual feasibility of (λ∗ , ν ∗ ) gives λ∗i ≥ 0.
Proof of complementary slackness

▶ Our chain of inequalities started with f (x ∗ ) and ended with
f (x ∗ ). Thus the inequalities are actually equalities. In
particular, there is an equality between the third and fourth
lines, which means Σ_{i∈[m]} λ∗i gi (x ∗ ) = 0 (the hi terms vanish
since hi (x ∗ ) = 0). Each term in this summation is non-positive,
since λ∗i ≥ 0 and gi (x ∗ ) ≤ 0, so the sum can be zero only when
each term is zero. Thus we have λ∗i gi (x ∗ ) = 0 ∀i ∈ [m]. This is
known as complementary slackness.
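Complementary slackness is easy to observe on a small example (illustrative numbers; the multipliers come from the stationarity condition 2x − λ1 + λ2 = 0 worked out by hand):

# min x^2 subject to 1 - x <= 0 (active) and x - 3 <= 0 (inactive).
# Optimum: x* = 1 with multipliers lam* = (2, 0).
x_star = 1.0
lam_star = [2.0, 0.0]
g = [lambda x: 1 - x, lambda x: x - 3]

for lam_i, g_i in zip(lam_star, g):
    # active constraint has g_i = 0; inactive constraint has lam_i = 0
    print(lam_i * g_i(x_star))   # 0.0 for both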
KKT conditions for strong duality

Given a primal optimization problem, we say that x ∗ and
(λ∗ , ν ∗ ) ∈ Rm × Rp respect the Karush-Kuhn-Tucker conditions if:
▶ gi (x ∗ ) ≤ 0 ∀i ∈ [m].
▶ hi (x ∗ ) = 0 ∀i ∈ [p].
▶ λ∗i ≥ 0 ∀i ∈ [m].
▶ λ∗i gi (x ∗ ) = 0 ∀i ∈ [m].
▶ ∇f (x ∗ ) + Σ_{i=1}^{m} λ∗i ∇gi (x ∗ ) + Σ_{i=1}^{p} νi∗ ∇hi (x ∗ ) = 0.
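These five conditions translate directly into a numeric checker. The sketch below assumes the functions and gradients are supplied as Python callables (all names are illustrative):

import numpy as np

def check_kkt(x, lam, nu, grad_f, gs, grad_gs, hs, grad_hs, tol=1e-6):
    # Check the five KKT conditions at (x, lam, nu) up to tolerance tol.
    ok = all(g(x) <= tol for g in gs)                 # g_i(x) <= 0
    ok &= all(abs(h(x)) <= tol for h in hs)           # h_i(x) = 0
    ok &= all(l >= -tol for l in lam)                 # lam_i >= 0
    ok &= all(abs(l * g(x)) <= tol
              for l, g in zip(lam, gs))               # lam_i g_i(x) = 0
    station = grad_f(x) \
        + sum(l * dg(x) for l, dg in zip(lam, grad_gs)) \
        + sum(n * dh(x) for n, dh in zip(nu, grad_hs))
    ok &= np.linalg.norm(station) <= tol              # stationarity
    return bool(ok)

# The QP solved earlier: min (1/2)||x||^2 s.t. x1 + x2 >= 1; x* = (0.5, 0.5), lam* = 0.5.
x_s = np.array([0.5, 0.5])
print(check_kkt(x_s, [0.5], [], lambda x: x,
                [lambda x: 1 - x[0] - x[1]],
                [lambda x: np.array([-1.0, -1.0])], [], []))  # True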
KKT conditions for strong duality

Theorem: For any optimization problem, if strong duality holds
then any primal optimal solution x ∗ and dual optimal solution
(λ∗ , ν ∗ ) ∈ Rm × Rp respect the KKT conditions. Conversely, if f
and gi are convex for all i ∈ [m] and hi are affine for all i ∈ [p],
then the KKT conditions are sufficient: any pair satisfying them is
primal and dual optimal with strong duality. Therefore, for convex
problems, the KKT conditions are both necessary and sufficient for
strong duality. We will show the proof in the next slides.
KKT conditions for strong duality

Assume that strong duality holds and x ∗ and (λ∗ , ν ∗ ) ∈ Rm × Rp
are primal and dual optimal solutions. Since x ∗ is feasible, we see
that the first two KKT conditions are true: gi (x ∗ ) ≤ 0 ∀i ∈ [m]
and hi (x ∗ ) = 0 ∀i ∈ [p]. Since (λ∗ , ν ∗ ) is dual feasible, we see that
the third KKT condition is true: λ∗i ≥ 0.

The previous claim we proved establishes that for the primal and
dual optimal solutions, the fourth KKT condition must hold, i.e.
λ∗i gi (x ∗ ) = 0 ∀i ∈ [m]. The previous claim also establishes that
x ∗ ∈ argminx L(x, λ∗ , ν ∗ ), which means that the gradient of L must
vanish at x ∗ . Thus the last KKT condition must hold true.
Sufficiency - KKT conditions for strong duality

Now we will show that if we assume the KKT conditions and the
problem is convex, we have strong duality. The first two conditions,
gi (x ∗ ) ≤ 0 ∀i ∈ [m] and hi (x ∗ ) = 0 ∀i ∈ [p], establish that x ∗ is
primal feasible.
The condition λ∗i ≥ 0 ∀i ∈ [m], together with the information that
f and the constraints gi are convex and the constraints hi are
affine, enables us to establish that

L(x, λ∗ , ν ∗ ) = f (x) + Σ_{i=1}^{m} λ∗i gi (x) + Σ_{i=1}^{p} νi∗ hi (x)

is a convex function of x.
By the last condition we see that the gradient of this convex
function vanishes at x ∗ , which means x ∗ is a local and hence global
minimum.
Sufficiency - KKT conditions for strong duality

Thus we have

F (λ∗ , ν ∗ ) = L(x ∗ , λ∗ , ν ∗ )
             = f (x ∗ ) + Σ_{i=1}^{m} λ∗i gi (x ∗ ) + Σ_{i=1}^{p} νi∗ hi (x ∗ )
             = f (x ∗ )

where the last equality uses complementary slackness, λ∗i gi (x ∗ ) = 0,
and primal feasibility, hi (x ∗ ) = 0. The dual value equals the primal
value, so strong duality holds.
