6FMAI19 Nonlinear Optimization Spring, 2022

Lecture #12 — 13/4, 2022

Lecturer: Yura Malitsky Scribe: Aban Husain

1 L-smooth functions and strong convexity

Unless otherwise specified, X is a finite-dimensional real vector space equipped with a p-norm ∥ · ∥.

Definition 1. The dual space X∗ of X is the space of linear forms on X with norm ∥ · ∥∗ defined by
\[
\|f\|_* = \max_{\|x\| = 1} f(x).
\]

As X is assumed to be finite dimensional, there is a natural equivalence between X and X∗, i.e. X∗ is an R-vector space of the same dimension as X.

Remark 2. For X with norm ∥ · ∥p, the dual norm is ∥ · ∥q, where 1/p + 1/q = 1 for p > 1, and q = ∞ for p = 1. In particular, if X has the Euclidean norm, then so does X∗.
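The duality between ∥ · ∥p and ∥ · ∥q in Remark 2 can be checked numerically. The following is a minimal sketch, assuming p = 3 and a randomly drawn linear form f on R⁴ (both chosen here only for illustration); the helper names are hypothetical. It builds the unit vector that attains the maximum in Definition 1 and compares the attained value with ∥f∥q.

```python
import random

def p_norm(v, p):
    """p-norm of a vector for 1 <= p < infinity."""
    return sum(abs(vi) ** p for vi in v) ** (1.0 / p)

def dual_exponent(p):
    """Conjugate exponent q with 1/p + 1/q = 1, valid for p > 1."""
    return p / (p - 1.0)

def dual_norm_maximiser(f, p):
    """Vector x with ||x||_p = 1 that attains <f, x> = ||f||_q."""
    q = dual_exponent(p)
    scale = p_norm(f, q) ** (q - 1.0)
    return [(1.0 if fi >= 0 else -1.0) * abs(fi) ** (q - 1.0) / scale for fi in f]

rng = random.Random(0)
p = 3.0                      # illustrative choice, not from the notes
q = dual_exponent(p)         # q = 1.5
f = [rng.uniform(-1.0, 1.0) for _ in range(4)]

x_star = dual_norm_maximiser(f, p)
attained = sum(fi * xi for fi, xi in zip(f, x_star))

# Random vectors on the unit sphere of the p-norm should never beat ||f||_q.
best_random = 0.0
for _ in range(2000):
    v = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    nv = p_norm(v, p)
    x = [vi / nv for vi in v]
    best_random = max(best_random, sum(fi * xi for fi, xi in zip(f, x)))
```

Here `attained` should agree with `p_norm(f, q)` up to rounding, while `best_random` stays below it.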

Recall that, for an arbitrary function f : X → R, the Legendre-Fenchel transform (or convex conjugate) f∗ : X∗ → R can be constructed as:
\[
f^*(y) = \sup_{x \in X} \{ \langle y, x \rangle - f(x) \}.
\]

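Consider the negative entropy function on the n-simplex
\[
h(x) =
\begin{cases}
\sum_{i=1}^n x_i \log x_i & \text{if } x \in \Delta_n = \{ x \in \mathbb{R}^n_{\ge 0} \mid \sum_{i=1}^n x_i = 1 \}, \\
\infty & \text{else,}
\end{cases}
\]
with the convention 0 log 0 = 0.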

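Then by straightforward calculation
\[
\begin{aligned}
h^*(y) &= \sup_{x \in X} \{ \langle y, x \rangle - h(x) \} \\
&= \sup_{x \in \Delta_n} \Big\{ \langle y, x \rangle - \sum_{i=1}^n x_i \log x_i \Big\} \\
&= \log\Big( \sum_{i=1}^n \exp y_i \Big).
\end{aligned}
\]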
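The closed form for h∗ can be sanity-checked numerically. The sketch below assumes a randomly drawn y ∈ R⁵ and uses the standard fact that the supremum is attained at x_i = exp(y_i) / ∑_k exp(y_k) (the softmax of y); the function names are illustrative only. It evaluates ⟨y, x⟩ − h(x) at that point and at many random points of the simplex.

```python
import math
import random

def neg_entropy(x):
    """h(x) = sum_i x_i log x_i on the simplex, with 0 log 0 = 0."""
    return sum(xi * math.log(xi) for xi in x if xi > 0.0)

def log_sum_exp(y):
    """Numerically stable log(sum_i exp(y_i))."""
    m = max(y)
    return m + math.log(sum(math.exp(yi - m) for yi in y))

def softmax(y):
    """Candidate maximiser x_i = exp(y_i) / sum_k exp(y_k)."""
    m = max(y)
    w = [math.exp(yi - m) for yi in y]
    s = sum(w)
    return [wi / s for wi in w]

def conjugate_objective(y, x):
    """<y, x> - h(x), the quantity maximised in the definition of h*(y)."""
    return sum(yi * xi for yi, xi in zip(y, x)) - neg_entropy(x)

def random_simplex_point(n, rng):
    """Random point in the interior of the simplex (normalised positive weights)."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

rng = random.Random(0)
y = [rng.uniform(-3.0, 3.0) for _ in range(5)]

value_at_softmax = conjugate_objective(y, softmax(y))
best_random = max(
    conjugate_objective(y, random_simplex_point(5, rng)) for _ in range(2000)
)
gap = abs(value_at_softmax - log_sum_exp(y))
```

The value at the softmax point should match `log_sum_exp(y)` up to rounding, and no random simplex point should exceed it.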

1.1 L-smooth functions

The definition of L-smooth functions can be generalized to X with an unspecified p-norm.

Definition 3. A differentiable function f : X → R is L-smooth with respect to a norm ∥ · ∥ if

∥∇f (y) − ∇f (x)∥∗ ≤ L∥y − x∥, ∀x, y ∈ X.

Theorem 4. Let f : X → R be convex and differentiable, and let L > 0. Then the following are equivalent, where each inequality is required for all x, y ∈ X and λ ∈ [0, 1]:

1. f is L-smooth with respect to ∥ · ∥;

2. f(y) ≤ f(x) + ⟨∇f(x), y − x⟩ + (L/2)∥y − x∥²;

3. f(y) − f(x) − ⟨∇f(x), y − x⟩ ≥ (1/(2L))∥∇f(x) − ∇f(y)∥∗²;

4. ⟨∇f(x) − ∇f(y), x − y⟩ ≥ (1/L)∥∇f(x) − ∇f(y)∥∗²;

5. f(λx + (1 − λ)y) ≥ λf(x) + (1 − λ)f(y) − (L/2)λ(1 − λ)∥x − y∥².
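To see the five conditions side by side, they can be evaluated on a concrete function. The sketch below assumes the diagonal quadratic f(x) = ½ ∑ a_i x_i² with hypothetical curvatures a = (0.5, 1, 3) and the Euclidean norm (which is its own dual), so that L = max a_i; this example is not taken from the lecture.

```python
import random

A = [0.5, 1.0, 3.0]   # hypothetical curvatures; the Hessian is diag(A)
L = max(A)            # smoothness constant w.r.t. the Euclidean norm

def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))

def grad(x):
    return [a * xi for a, xi in zip(A, x)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def check_theorem4(x, y, lam, tol=1e-9):
    """Evaluate conditions (2)-(5) of Theorem 4 at one triple (x, y, lam)."""
    gx, gy = grad(x), grad(y)
    d = sub(y, x)
    gdiff = sub(gx, gy)
    gap = f(y) - f(x) - dot(gx, d)
    mix = [lam * xi + (1.0 - lam) * yi for xi, yi in zip(x, y)]
    return {
        2: gap <= 0.5 * L * dot(d, d) + tol,
        3: gap >= dot(gdiff, gdiff) / (2.0 * L) - tol,
        4: dot(gdiff, sub(x, y)) >= dot(gdiff, gdiff) / L - tol,
        5: f(mix) >= lam * f(x) + (1.0 - lam) * f(y)
           - 0.5 * L * lam * (1.0 - lam) * dot(d, d) - tol,
    }

rng = random.Random(0)
all_hold = True
for _ in range(1000):
    x = [rng.uniform(-2.0, 2.0) for _ in A]
    y = [rng.uniform(-2.0, 2.0) for _ in A]
    lam = rng.random()
    all_hold = all_hold and all(check_theorem4(x, y, lam).values())
```

A random search like this cannot prove the equivalence, but a violated inequality would indicate a mistake in the constant L or in the transcription of a condition.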

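Proof. (1) ⇒ (2) Let xλ = x + λ(y − x) for λ ∈ [0, 1]. Using the fundamental theorem of calculus and Hölder’s inequality:
\[
\begin{aligned}
f(y) - f(x) - \langle \nabla f(x), y - x \rangle
&= \int_0^1 \langle \nabla f(x_\lambda) - \nabla f(x), y - x \rangle \, d\lambda \\
&\le \int_0^1 \|\nabla f(x_\lambda) - \nabla f(x)\|_* \, \|y - x\| \, d\lambda \\
&\le \int_0^1 L \lambda \|y - x\|^2 \, d\lambda \\
&= \frac{L}{2} \|y - x\|^2 .
\end{aligned}
\]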

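(2) ⇒ (3) For fixed x ∈ X let

φ(y) = f(y) − f(x) − ⟨∇f(x), y − x⟩.

By definition ∇φ(y) = ∇f(y) − ∇f(x), and by convexity φ(x) = 0 is the minimum value of φ. Since φ differs from f by an affine function, φ also satisfies (2). For y ∈ X, set z = y − (∥∇φ(y)∥∗ / L) v, where v is chosen so that ⟨∇φ(y), v⟩ = ∥∇φ(y)∥∗ and ∥v∥ = 1. Then
\[
\begin{aligned}
0 \le \varphi(z)
&\le \varphi(y) - \frac{\|\nabla \varphi(y)\|_*}{L} \langle \nabla \varphi(y), v \rangle + \frac{L}{2} \Big\| \frac{\|\nabla \varphi(y)\|_*}{L} v \Big\|^2 \\
&= f(y) - f(x) - \langle \nabla f(x), y - x \rangle - \frac{1}{2L} \|\nabla f(y) - \nabla f(x)\|_*^2 .
\end{aligned}
\]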
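(3) ⇒ (4) For each x, y ∈ X,
\[
\begin{aligned}
f(y) - f(x) - \langle \nabla f(x), y - x \rangle &\ge \frac{1}{2L} \|\nabla f(x) - \nabla f(y)\|_*^2 , \\
f(x) - f(y) - \langle \nabla f(y), x - y \rangle &\ge \frac{1}{2L} \|\nabla f(y) - \nabla f(x)\|_*^2 .
\end{aligned}
\]
Summation yields (4).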
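(4) ⇒ (1) Using Hölder’s inequality,
\[
\frac{1}{L} \|\nabla f(x) - \nabla f(y)\|_*^2 \le \langle \nabla f(x) - \nabla f(y), x - y \rangle \le \|\nabla f(x) - \nabla f(y)\|_* \, \|x - y\| .
\]
Dividing by ∥∇f(x) − ∇f(y)∥∗ (when it is nonzero) gives (1).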
(2) ⇒ (5) This follows from the definition of convexity and the inequality in (2).
(5) ⇒ (2) Rewrite (5) as
\[
f(y) \le f(x) + \frac{f(x + \lambda(y - x)) - f(x)}{\lambda} + \frac{L(1 - \lambda)}{2} \|y - x\|^2 .
\]
The limit as λ → 0 results in (2).

Claim 5. The function f(x) = log(∑_{i=1}^n exp x_i) is 1-smooth with respect to ∥ · ∥2 and ∥ · ∥∞.

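The first and second order partial derivatives of f are
\[
\frac{\partial f}{\partial x_i}(x) = \frac{e^{x_i}}{\sum_{k=1}^n e^{x_k}},
\qquad
\frac{\partial^2 f}{\partial x_i \partial x_j}(x) =
\begin{cases}
-\dfrac{e^{x_i} e^{x_j}}{\big(\sum_{k=1}^n e^{x_k}\big)^2} & \text{if } i \ne j, \\[2ex]
-\dfrac{e^{x_i} e^{x_i}}{\big(\sum_{k=1}^n e^{x_k}\big)^2} + \dfrac{e^{x_i}}{\sum_{k=1}^n e^{x_k}} & \text{if } i = j.
\end{cases}
\]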

Fix the notation σ = ∇f(x), so that ∇²f(x) = diag(σ) − σσᵀ.

1. In the case of the Euclidean norm, L is bounded by the largest eigenvalue of the Hessian. By Weyl’s inequality ∇²f(x) ≼ diag(σ) ≼ I, so f is 1-smooth with respect to ∥ · ∥2.

2. Given ∥ · ∥∞, for any d ∈ Rⁿ the inequality ⟨∇²f(x)d, d⟩ ≤ ⟨diag(σ)d, d⟩ ≤ ∥d∥∞² holds, because the entries of σ are nonnegative and sum to 1. Since f is twice continuously differentiable, for x, y ∈ Rⁿ there exists some z ∈ [x, y] such that
\[
\begin{aligned}
f(y) &= f(x) + \langle \nabla f(x), y - x \rangle + \frac{1}{2} \langle \nabla^2 f(z)(y - x), y - x \rangle \\
&\le f(x) + \langle \nabla f(x), y - x \rangle + \frac{1}{2} \|x - y\|_\infty^2 .
\end{aligned}
\]
By Theorem 4, (2) ⇒ (1), f is 1-smooth with respect to ∥ · ∥∞.
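Claim 5 can also be probed directly through Definition 3. The sketch below, under the assumption of randomly sampled pairs in R⁴, measures the ratio ∥∇f(y) − ∇f(x)∥∗ / ∥y − x∥ for both norms: the dual of ∥ · ∥∞ is ∥ · ∥1, and ∥ · ∥2 is its own dual. The helper names are illustrative.

```python
import math
import random

def softmax(x):
    """Gradient of f(x) = log(sum_i exp(x_i))."""
    m = max(x)
    w = [math.exp(xi - m) for xi in x]
    s = sum(w)
    return [wi / s for wi in w]

def norm1(v):
    return sum(abs(vi) for vi in v)

def norm2(v):
    return math.sqrt(sum(vi * vi for vi in v))

def norm_inf(v):
    return max(abs(vi) for vi in v)

rng = random.Random(1)
worst_ratio_inf = 0.0   # sup of ||grad diff||_1 / ||y - x||_inf over samples
worst_ratio_2 = 0.0     # sup of ||grad diff||_2 / ||y - x||_2 over samples
for _ in range(5000):
    x = [rng.uniform(-5.0, 5.0) for _ in range(4)]
    y = [rng.uniform(-5.0, 5.0) for _ in range(4)]
    diff = [yi - xi for xi, yi in zip(x, y)]
    gdiff = [a - b for a, b in zip(softmax(y), softmax(x))]
    worst_ratio_inf = max(worst_ratio_inf, norm1(gdiff) / norm_inf(diff))
    worst_ratio_2 = max(worst_ratio_2, norm2(gdiff) / norm2(diff))
```

Both observed ratios should stay at or below 1, consistent with 1-smoothness; sampling only gives a lower estimate of the true Lipschitz constant.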

1.2 µ-strongly convex functions

The definition of strongly convex functions can also be generalized.

Definition 6. A function f : X → R is µ-strongly convex with respect to ∥ · ∥ if for all x, y ∈ X and λ ∈ [0, 1]:
\[
\lambda f(x) + (1 - \lambda) f(y) \ge f(\lambda x + (1 - \lambda) y) + \frac{\mu}{2} \lambda (1 - \lambda) \|y - x\|^2 .
\]
It is important to note that the equivalence
\[
f \text{ is } \mu\text{-strongly convex} \iff f(x) - \frac{\mu}{2} \|x\|^2 \text{ is convex}
\]
holds only in the Euclidean case.
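The reason the Euclidean case is special can be made explicit. As a short worked derivation (not part of the original notes), expanding the squared Euclidean norm through its inner product gives an exact identity:
\[
\begin{aligned}
\lambda \|x\|_2^2 + (1 - \lambda) \|y\|_2^2 - \|\lambda x + (1 - \lambda) y\|_2^2
&= \lambda (1 - \lambda) \big( \|x\|_2^2 - 2 \langle x, y \rangle + \|y\|_2^2 \big) \\
&= \lambda (1 - \lambda) \|x - y\|_2^2 .
\end{aligned}
\]
Writing g(x) = f(x) − (µ/2)∥x∥2², the convexity inequality λg(x) + (1 − λ)g(y) ≥ g(λx + (1 − λ)y) is therefore the same statement as the inequality in Definition 6. For a norm that does not come from an inner product, this identity is not available, so the two notions need not coincide.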

Theorem 7. Let f : X → R ∪ {∞}. The following are equivalent for all x, y ∈ X:

1. f is µ-strongly convex with respect to ∥ · ∥;

2. f(y) ≥ f(x) + ⟨gx, y − x⟩ + (µ/2)∥y − x∥², ∀gx ∈ ∂f(x);

3. ⟨gx − gy, x − y⟩ ≥ µ∥x − y∥², ∀gx ∈ ∂f(x), ∀gy ∈ ∂f(y).

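Proof. (1) ⇒ (2) Let xλ = x + λ(y − x), λ ∈ (0, 1]. The definition of µ-strong convexity can be rewritten as
\[
f(y) \ge f(x) + \frac{\mu}{2} (1 - \lambda) \|y - x\|^2 + \frac{f(x_\lambda) - f(x)}{\lambda} .
\]
Allowing λ → 0, the difference quotient converges to the directional derivative f′(x; y − x) of f at x in the direction y − x, so
\[
\begin{aligned}
f(y) &\ge f(x) + \frac{\mu}{2} \|y - x\|^2 + f'(x; y - x) \\
&\ge f(x) + \frac{\mu}{2} \|y - x\|^2 + \langle g_x, y - x \rangle \quad \forall g_x \in \partial f(x).
\end{aligned}
\]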

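(2) ⇒ (1) For x, y ∈ X, xλ = x + λ(y − x) with λ ∈ [0, 1], and any gxλ ∈ ∂f(xλ):
\[
\begin{aligned}
\lambda f(y) &\ge \lambda \Big( f(x_\lambda) + \langle g_{x_\lambda}, y - x_\lambda \rangle + \frac{\mu}{2} \|y - x_\lambda\|^2 \Big), \\
(1 - \lambda) f(x) &\ge (1 - \lambda) \Big( f(x_\lambda) + \langle g_{x_\lambda}, x - x_\lambda \rangle + \frac{\mu}{2} \|x - x_\lambda\|^2 \Big).
\end{aligned}
\]
Summation yields (1), since λ(y − xλ) + (1 − λ)(x − xλ) = 0, ∥y − xλ∥ = (1 − λ)∥y − x∥ and ∥x − xλ∥ = λ∥y − x∥.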
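(2) ⇒ (3) Monotonicity follows immediately by adding (2) to the same inequality with x and y exchanged.
(3) ⇒ (2) For λ ∈ [0, 1], let xλ = x + λ(y − x). Given that f is convex, for gxλ ∈ ∂f(xλ),
\[
f(y) - f(x) = \int_0^1 \langle g_{x_\lambda}, y - x \rangle \, d\lambda .
\]
Since ⟨gxλ, y − x⟩ ≥ ⟨gx, y − x⟩ + µλ∥x − y∥² by (3) applied to xλ and x, integrating over λ gives (2).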
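Theorem 7 is stated with subgradients, so it also covers non-differentiable functions. As an illustration (an example chosen here, not from the lecture), the sketch below takes f(x) = |x| + (µ/2)x² on R with a hypothetical µ = 0.5, whose subdifferential at 0 is the interval [−1, 1], and checks conditions (2) and (3) on a small grid.

```python
MU = 0.5  # hypothetical strong-convexity parameter

def f(x):
    return abs(x) + 0.5 * MU * x * x

def subgradients(x):
    """Representative elements of the subdifferential of f at x."""
    if x > 0:
        return [1.0 + MU * x]
    if x < 0:
        return [-1.0 + MU * x]
    # At the kink the subdifferential is [-1, 1]; sample a few of its points.
    return [-1.0, -0.5, 0.0, 0.5, 1.0]

points = [-2.0, -1.0, -0.25, 0.0, 0.25, 1.0, 2.0]
tol = 1e-12

# Condition (2): quadratic lower bound for every subgradient at x.
lower_bound_holds = all(
    f(y) >= f(x) + g * (y - x) + 0.5 * MU * (y - x) ** 2 - tol
    for x in points for y in points for g in subgradients(x)
)

# Condition (3): strong monotonicity of the subdifferential.
monotone = all(
    (gx - gy) * (x - y) >= MU * (x - y) ** 2 - tol
    for x in points for y in points
    for gx in subgradients(x) for gy in subgradients(y)
)
```

Both flags should evaluate to `True`; in this example the bounds are tight whenever x and y lie on the same side of the kink.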

2 Fenchel duality of L-smooth and strongly convex functions

In the last lecture, the following relations between subgradients of a function and its convex
conjugate were established.

Lemma 8 (Fenchel-Young equality). For a proper, lower semicontinuous convex function
f : X → R, the following conditions are equivalent:

1. f (x) + f ∗ (y) = ⟨y, x⟩;

2. x ∈ ∂f ∗ (y);

3. y ∈ ∂f (x).

Theorem 9. Let f : X → R. The following statements hold:

1. If f is closed and µ-strongly convex with respect to ∥ · ∥, then f∗ is (1/µ)-smooth with respect to ∥ · ∥∗;

2. If f is convex and L-smooth with respect to ∥ · ∥, then f∗ is (1/L)-strongly convex with respect to ∥ · ∥∗.

Proof. Both statements are direct consequences of the Fenchel-Young equality (Lemma 8), Theorem 4 and Theorem 7.
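One way to fill in the details of statement 1 is sketched below; it assumes that f∗ is differentiable, which is the case here because strong convexity makes the maximiser in the definition of f∗ unique. Take y₁, y₂ ∈ X∗ and set xᵢ = ∇f∗(yᵢ). By Lemma 8, yᵢ ∈ ∂f(xᵢ), so condition (3) of Theorem 7 and the definition of the dual norm give
\[
\mu \|x_1 - x_2\|^2 \le \langle y_1 - y_2, x_1 - x_2 \rangle \le \|y_1 - y_2\|_* \, \|x_1 - x_2\| ,
\]
and therefore
\[
\|\nabla f^*(y_1) - \nabla f^*(y_2)\| \le \frac{1}{\mu} \|y_1 - y_2\|_* .
\]
The gradient of f∗ lives in X, whose norm ∥ · ∥ is the dual of ∥ · ∥∗, so this is exactly (1/µ)-smoothness of f∗ with respect to ∥ · ∥∗ in the sense of Definition 3.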

Claim 10. The negative entropy function h(x) on the n-simplex is 1-strongly convex with respect to both ∥ · ∥1 and ∥ · ∥2.

Since the convex conjugate h∗ of h is 1-smooth with respect to ∥ · ∥2 and ∥ · ∥∞ (Claim 5), and h = (h∗)∗ because h is closed and convex, Theorem 9 ensures 1-strong convexity of h with respect to the dual norms ∥ · ∥2 and ∥ · ∥1.
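Claim 10 can be checked numerically through condition (2) of Theorem 7. For points in the interior of the simplex, h(y) − h(x) − ⟨∇h(x), y − x⟩ equals the Kullback-Leibler divergence ∑ yᵢ log(yᵢ/xᵢ), so 1-strong convexity with respect to ∥ · ∥1 amounts to the bound KL ≥ ½∥y − x∥1² (known as Pinsker's inequality). The sketch below assumes random interior points of the 5-simplex; the helper names are illustrative.

```python
import math
import random

def random_simplex_point(n, rng):
    """Random point in the interior of the simplex (normalised positive weights)."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

def neg_entropy(p):
    return sum(pi * math.log(pi) for pi in p)

def bregman_gap(x, y):
    """h(y) - h(x) - <grad h(x), y - x> for the negative entropy h."""
    grad_x = [math.log(xi) + 1.0 for xi in x]
    linear = sum(g * (yi - xi) for g, xi, yi in zip(grad_x, x, y))
    return neg_entropy(y) - neg_entropy(x) - linear

rng = random.Random(2)
holds_norm1 = True
holds_norm2 = True
for _ in range(5000):
    x = random_simplex_point(5, rng)
    y = random_simplex_point(5, rng)
    gap = bregman_gap(x, y)
    sq_norm1 = sum(abs(yi - xi) for xi, yi in zip(x, y)) ** 2
    sq_norm2 = sum((yi - xi) ** 2 for xi, yi in zip(x, y))
    holds_norm1 = holds_norm1 and gap >= 0.5 * sq_norm1 - 1e-12
    holds_norm2 = holds_norm2 and gap >= 0.5 * sq_norm2 - 1e-12
```

The ∥ · ∥2 bound is implied by the ∥ · ∥1 bound because ∥v∥2 ≤ ∥v∥1, which matches the order of the two statements in the claim.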
