Some special classes of functions in optimization: convex, Lipschitz, strongly convex

Andersen Ang

Mathématique et recherche opérationnelle

UMONS, Belgium
Homepage: angms.science

First draft: June 6, 2017

Last update: November 4, 2020
Overview

1 Convex function

2 α-strongly convex function

3 Lipschitz continuity, Lipschitz gradient and Lipschitz Hessian

4 Summary

Convex function
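A function $f : \operatorname{dom} f \to \mathbb{R}$ is convex if:
- $\operatorname{dom} f$ is a convex set, and
- for all $x, y \in \operatorname{dom} f$, $f$ satisfies one of the following:
  - Jensen's inequality: for all $\lambda \in [0, 1]$,
    $$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y).$$
  - The gradient of $f$ is monotone (for differentiable $f$):
    $$\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge 0.$$
  - The 1st-order Taylor approximation at the point $x$ is a global under-estimator (for differentiable $f$):
    $$f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle.$$
- Equivalently, the epigraph of $f$ is a convex set.
- $f$ is strictly convex if $\le, \ge$ become $<, >$ (i.e. strict inequality) for $x \ne y$ and $\lambda \in (0, 1)$.
- The 4 definitions are equivalent: you can move from one definition to another as "if and only if". See optimization books for the proof of equivalence between these 4 definitions.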
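The three inequalities above can be spot-checked numerically. The following is a minimal sketch, assuming log-sum-exp as the test function (a standard convex function chosen for illustration, not taken from the slides); the helper names `f`, `grad_f`, `dot` and `check_convexity` are hypothetical. Passing random checks is evidence of convexity, not a proof.

```python
import math
import random

def f(x):
    """Log-sum-exp, a standard convex function on R^n (illustrative choice)."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def grad_f(x):
    """Gradient of log-sum-exp, i.e. the softmax vector."""
    m = max(x)
    w = [math.exp(xi - m) for xi in x]
    s = sum(w)
    return [wi / s for wi in w]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def check_convexity(n_trials=1000, dim=3, tol=1e-9, seed=0):
    """Return True if Jensen, monotone gradient and the first-order
    lower bound all hold on every random sample."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        lam = rng.random()
        z = [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]
        gx, gy = grad_f(x), grad_f(y)
        x_minus_y = [xi - yi for xi, yi in zip(x, y)]
        y_minus_x = [-d for d in x_minus_y]
        jensen = f(z) <= lam * f(x) + (1 - lam) * f(y) + tol
        monotone = dot(x_minus_y, [a - b for a, b in zip(gx, gy)]) >= -tol
        first_order = f(y) >= f(x) + dot(gx, y_minus_x) - tol
        if not (jensen and monotone and first_order):
            return False
    return True

print(check_convexity())
```

The small tolerance `tol` only absorbs floating-point rounding; it is not part of the definitions.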
Convexity: the geometry of Jensen’s inequality
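$f : \operatorname{dom} f \to \mathbb{R}$ is convex if:
(1) $\operatorname{dom} f$ is a convex set, and
(2) for all $x, y \in \operatorname{dom} f$ and $\lambda \in [0, 1]$: $f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y)$.

[Figure: plot of $f$ against $x$, marking the point $\lambda x + (1-\lambda) y$, the function value $f(\lambda x + (1-\lambda) y)$ and the chord value $\lambda f(x) + (1-\lambda) f(y)$.]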
Convexity: the geometry of 1st-order Taylor approximation
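$f : \operatorname{dom} f \to \mathbb{R}$ is convex if:
(1) $\operatorname{dom} f$ is a convex set, and
(2) for all $x, y \in \operatorname{dom} f$: $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle$.

[Figure: plot of $f$ and of its tangent $f(-1) + \nabla f(-1)(y - (-1))$ against $y$.]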
α-strongly convex function
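A function $f : \operatorname{dom} f \to \mathbb{R}$ is $\alpha$-strongly convex if:
- $\operatorname{dom} f$ is a convex set.
- For all $x, y \in \operatorname{dom} f$, $f$ satisfies one of the following:
  - Jensen's inequality with an additional quadratic term with $\alpha > 0$: for all $\lambda \in [0, 1]$,
    $$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) - \frac{\alpha}{2} \lambda (1-\lambda) \|x - y\|_2^2.$$
  - The gradient of $f$ is monotone with an additional quadratic term with $\alpha > 0$:
    $$\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge \alpha \|x - y\|_2^2 \ge 0.$$
  - The 1st-order Taylor approximation at the point $x$ is a global under-estimator with an additional quadratic term with $\alpha > 0$:
    $$f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\alpha}{2} \|x - y\|_2^2,$$
    or we say $f$ is lower bounded by a quadratic function.
- With $\alpha > 0$, the function $f(x) - \frac{\alpha}{2} \|x\|_2^2$ is convex.
- If $f$ is twice differentiable, it is $\alpha$-strongly convex iff $\nabla^2 f(x) \succeq \alpha I$.
- These definitions are equivalent.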
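A minimal sketch of the quadratic lower bound, assuming the quadratic $f(x) = \frac{1}{2} x^\top A x$ with a hand-picked symmetric positive definite matrix $A$ (an illustrative choice, not from the slides). For this $f$ the Hessian is $A$, so the largest valid $\alpha$ is the smallest eigenvalue of $A$. All function names here are hypothetical.

```python
import math
import random

A = [[3.0, 1.0], [1.0, 2.0]]  # symmetric positive definite (illustrative choice)

def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x):
    return 0.5 * dot(x, matvec(A, x))

def grad_f(x):
    return matvec(A, x)  # valid because A is symmetric

def min_eigenvalue_2x2(M):
    """Smallest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b ** 2)

def satisfies_lower_bound(alpha, n_trials=2000, tol=1e-9, seed=0):
    """Check f(y) >= f(x) + <grad f(x), y - x> + (alpha/2) ||y - x||^2
    on random pairs (x, y)."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x = [rng.uniform(-5, 5) for _ in range(2)]
        y = [rng.uniform(-5, 5) for _ in range(2)]
        d = [yi - xi for xi, yi in zip(x, y)]
        bound = f(x) + dot(grad_f(x), d) + 0.5 * alpha * dot(d, d)
        if f(y) < bound - tol:
            return False
    return True

alpha = min_eigenvalue_2x2(A)
print(satisfies_lower_bound(alpha))        # bound with alpha = lambda_min(A)
print(satisfies_lower_bound(alpha + 0.5))  # a larger alpha is too optimistic
```

The second call is there to show that the bound is sensitive to $\alpha$: any value above the smallest eigenvalue is violated along the corresponding eigenvector direction.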
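Equivalence between definitions of strong convexity
We show $\nabla^2 f(x) \succeq \alpha I \implies \langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge \alpha \|x - y\|_2^2$, $\alpha > 0$.

First recall from calculus $G(b) - G(a) = \int_a^b g(\theta)\, d\theta$. Next, a smart step: let $\theta = y + \tau (x - y)$, then $d\theta = (x - y)\, d\tau$. Consider the integral range from 0 to 1 for $\tau$, and let $G$ be $\nabla f$ and $g$ be $\nabla^2 f$. This gives
$$\nabla f(x) - \nabla f(y) = \int_0^1 \nabla^2 f\big(y + \tau (x - y)\big) (x - y)\, d\tau.$$
(The left-hand side is a vector; the right-hand side is a matrix-vector product, also a vector.)

Take the dot product with $x - y$ on both sides of the equation:
$$\langle x - y, \nabla f(x) - \nabla f(y) \rangle = \Big\langle x - y, \int_0^1 \nabla^2 f\big(y + \tau (x - y)\big) (x - y)\, d\tau \Big\rangle.$$

By $\nabla^2 f(x) \succeq \alpha I$ for all $x$, we have $\nabla^2 f\big(y + \tau (x - y)\big) \succeq \alpha I$ and
$$\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge \Big\langle x - y, \int_0^1 \alpha (x - y)\, d\tau \Big\rangle = \alpha \|x - y\|_2^2.$$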
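As a worked instance of the argument above, assume the quadratic $f(x) = \frac{1}{2} x^\top A x$ with $A$ symmetric (an illustrative choice, not from the slides). The Hessian is then constant, so the integral is trivial:

```latex
\begin{aligned}
f(x) &= \tfrac{1}{2} x^\top A x, \qquad \nabla f(x) = A x, \qquad \nabla^2 f(x) = A, \\
\nabla f(x) - \nabla f(y) &= \int_0^1 A (x - y)\, d\tau = A (x - y), \\
\langle x - y, \nabla f(x) - \nabla f(y) \rangle &= (x - y)^\top A (x - y) \;\ge\; \lambda_{\min}(A)\, \|x - y\|_2^2 .
\end{aligned}
```

So this $f$ is $\alpha$-strongly convex with $\alpha = \lambda_{\min}(A)$ whenever $\lambda_{\min}(A) > 0$.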
α-strongly convex: the geometry of the lower bound
$f : \operatorname{dom} f \to \mathbb{R}$ is $\alpha$-strongly convex if
(1) $\operatorname{dom} f$ is a convex set, and
(2) for all $x, y \in \operatorname{dom} f$: $f(y) \ge f(x) + \nabla f(x)^\top (y - x) + \frac{\alpha}{2} \|x - y\|_2^2$.

[Figure: plot against $y$ of $f$, the quadratic lower bound $f(-1) + \nabla f(-1)(y - (-1)) + \frac{\alpha}{2} \|y - (-1)\|_2^2$ and the tangent $f(-1) + \nabla f(-1)(y - (-1))$.]

Interpretation: $f$ is lower bounded by a quadratic curve with some curvature, which is itself lower bounded by the 1st-order Taylor approximation (zero curvature) $\implies$ $f$ is not "too flat" (at least not "as flat as" the lower bound). In other words: $f$ is at least $\alpha$-amount of "bumpy".
Lipschitz continuity
A function $f : \operatorname{dom} f \to \mathbb{R}$ is Lipschitz if there exists a constant $L \ge 0$ (the Lipschitz constant) such that for any two points $x, y \in \operatorname{dom} f$,
$$|f(x) - f(y)| \le L \|x - y\|.$$
- Re-arranging gives, for $x \ne y$,
  $$\frac{|f(x) - f(y)|}{\|x - y\|} \le L,$$
  which is approximately the magnitude of the gradient when $x, y$ are close $\implies$ $f$ being Lipschitz means the "slope" (rate of change) of $f$ is bounded above by a global constant $L$.
- Removing the absolute value sign:
  $$f(x) \le f(y) + L \|x - y\|,$$
  $$f(x) \ge f(y) - L \|x - y\|,$$
  meaning that $f(x)$, for all $x$, is bounded above and below by functions that are linear in the distance $\|x - y\|$.
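The "bounded slope" reading can be sketched numerically: sample pairs of points and record the largest observed ratio $|f(x) - f(y)| / |x - y|$. This is a minimal sketch under stated assumptions: the test functions ($\sin$ and $x^2$), the intervals and the helper name `empirical_lipschitz` are illustrative choices, and sampling only gives a lower estimate of the true Lipschitz constant.

```python
import math
import random

def empirical_lipschitz(f, lo, hi, n_pairs=5000, seed=0):
    """Largest observed |f(x) - f(y)| / |x - y| over random pairs in [lo, hi].

    This is only a lower estimate of the Lipschitz constant on [lo, hi].
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_pairs):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        if x == y:
            continue  # the ratio is undefined for identical points
        best = max(best, abs(f(x) - f(y)) / abs(x - y))
    return best

# sin is 1-Lipschitz on R because |cos| <= 1.
est_sin = empirical_lipschitz(math.sin, -4.0, 4.0)

# x**2 is not globally Lipschitz: the estimate grows with the interval.
est_sq_small = empirical_lipschitz(lambda x: x * x, -1.0, 1.0)
est_sq_large = empirical_lipschitz(lambda x: x * x, -10.0, 10.0)

print(est_sin, est_sq_small, est_sq_large)
```

For $x^2$ the ratio equals $|x + y|$, so on $[-R, R]$ it can approach $2R$: no single constant $L$ works on all of $\mathbb{R}$.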
The geometry of Lipschitz continuity
A Lipschitz function has no arbitrarily sharp changes anywhere: for every $x$, the graph of $f$ lies entirely outside a cone with apex at $(x, f(x))$, whose boundary is given by the linear bounds on the previous page.

[Figure: plot of $f(x)$ against $x$.]

Important note: this property is global; such a cone exists at every point on $f$, i.e. the cone can "slide" along the curve and the argument still holds.
Lipschitz continuous gradient
A function $f : \operatorname{dom} f \to \mathbb{R}$ is $L$-smooth if there exists a constant $L \ge 0$ such that for any two points $x, y \in \operatorname{dom} f$,
$$\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|.$$
- This assumes $f$ is differentiable.
- An $L$-smooth $f$ is also said to have an $L$-Lipschitz gradient.
- $f$ being $L$-smooth is equivalent to
  $$\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \big| \le \frac{L}{2} \|y - x\|_2^2.$$
  Removing the absolute value sign:
  $$f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2} \|y - x\|_2^2,$$
  $$f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle - \frac{L}{2} \|y - x\|_2^2,$$
  meaning that $f$ is bounded above and below by quadratic functions.
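A minimal sketch of both characterizations, assuming the softplus function $f(x) = \log(1 + e^x)$ as the example (an illustrative choice, not from the slides). Its second derivative is $s(x)(1 - s(x)) \le 1/4$, where $s$ is the logistic sigmoid, so $L = 1/4$. The function names are hypothetical.

```python
import math
import random

L = 0.25  # sup of f''(x) = s(x) * (1 - s(x)) for the sigmoid s

def f(x):
    """Softplus log(1 + exp(x)), written to avoid overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def df(x):
    """Derivative of softplus: the logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def check_smoothness(n_trials=2000, tol=1e-9, seed=0):
    """Check the Lipschitz-gradient inequality and the two-sided
    quadratic bound on random pairs (x, y)."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x, y = rng.uniform(-6, 6), rng.uniform(-6, 6)
        grad_ok = abs(df(x) - df(y)) <= L * abs(x - y) + tol
        gap = f(y) - f(x) - df(x) * (y - x)
        quad_ok = abs(gap) <= 0.5 * L * (y - x) ** 2 + tol
        if not (grad_ok and quad_ok):
            return False
    return True

print(check_smoothness())
```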
Equivalent definitions of L-smooth function
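A function $f$ is $L$-smooth if
- The gradient of $f$ is Lipschitz with Lipschitz constant $L \ge 0$, i.e. for all $x, y \in \operatorname{dom} f$ we have
  $$\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|.$$
- $f$ is bounded by a quadratic function with $L > 0$:
  $$\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \big| \le \frac{L}{2} \|y - x\|_2^2.$$
- The gradient of $f$ is monotone with an additional term with $L > 0$:
  $$\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge \frac{1}{L} \|\nabla f(x) - \nabla f(y)\|_2^2.$$
- The norm of the slope of $\nabla f$ (which is $\nabla^2 f$) is bounded above.
- If $f$ is twice differentiable, $\nabla^2 f(x) \preceq L I$, i.e. all the eigenvalues of $\nabla^2 f(x)$ are upper bounded by $L$.
These definitions are equivalent when $f$ is convex. Without convexity, the 3rd condition cannot hold (its right-hand side is non-negative, which forces the gradient to be monotone), and the Hessian condition must be strengthened to $-L I \preceq \nabla^2 f(x) \preceq L I$. E.g.: applying the Cauchy-Schwarz inequality to the left-hand side of the 3rd condition gives $\frac{1}{L} \|\nabla f(x) - \nabla f(y)\|^2 \le \|x - y\| \, \|\nabla f(x) - \nabla f(y)\|$, which is the 1st condition.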
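The 3rd condition has a non-negative right-hand side, so it can only hold when the gradient is monotone, i.e. when $f$ is convex. The following minimal sketch illustrates this in one dimension with two assumed example functions (illustrative choices, not from the slides): softplus, which is convex with $L = 1/4$, and $f(x) = -x^2/2$, whose gradient is 1-Lipschitz but which is concave. The helper name `condition3_holds` is hypothetical.

```python
import math
import random

def condition3_holds(df, L, n_trials=2000, tol=1e-9, seed=0):
    """Check (x - y) * (df(x) - df(y)) >= (1/L) * (df(x) - df(y))**2
    on random pairs (x, y) in [-5, 5]."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        g = df(x) - df(y)
        if (x - y) * g < g * g / L - tol:
            return False
    return True

# Convex case: softplus, whose derivative is the sigmoid; L = 1/4.
convex_case = condition3_holds(lambda x: 1.0 / (1.0 + math.exp(-x)), L=0.25)

# Concave case: f(x) = -x**2 / 2, f'(x) = -x, gradient is 1-Lipschitz.
concave_case = condition3_holds(lambda x: -x, L=1.0)

print(convex_case, concave_case)
```

In the concave case the left-hand side is $-(x - y)^2$, which is negative for every $x \ne y$, so the inequality fails even though the gradient is Lipschitz.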
Proof of equivalence
We show that, for $L > 0$, $\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|$ implies
$$\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \big| \le \frac{L}{2} \|y - x\|_2^2.$$

Recall from calculus $G(b) - G(a) = \int_a^b g(\theta)\, d\theta$. Next, a smart step: take $g(\tau) = \langle \nabla f(x + \tau (y - x)), y - x \rangle$ as a function of $\tau$, which is the derivative of $G(\tau) = f(x + \tau (y - x))$. Consider the definite integral of $g(\tau)$ from 0 to 1, with $G(1) = f(y)$ and $G(0) = f(x)$, hence
$$f(y) - f(x) = \int_0^1 \langle \nabla f(x + \tau (y - x)), y - x \rangle\, d\tau = \int_0^1 \langle \nabla f(x + \tau (y - x)) - \nabla f(x) + \nabla f(x), y - x \rangle\, d\tau.$$

As $\nabla f(x)$ is independent of $\tau$, it can be taken out of the integral:
$$f(y) - f(x) = \langle \nabla f(x), y - x \rangle + \int_0^1 \langle \nabla f(x + \tau (y - x)) - \nabla f(x), y - x \rangle\, d\tau.$$

The idea is to create the term $\langle \nabla f(x), y - x \rangle$ so that we can move it to the left and get $f(y) - f(x) - \langle \nabla f(x), y - x \rangle$.
Proof of equivalence (continued)
$$\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \big| = \Big| \int_0^1 \langle \nabla f(x + \tau (y - x)) - \nabla f(x), y - x \rangle\, d\tau \Big|$$
$$\le \int_0^1 \big| \langle \nabla f(x + \tau (y - x)) - \nabla f(x), y - x \rangle \big|\, d\tau$$
$$\overset{\text{c.s.}}{\le} \int_0^1 \|\nabla f(x + \tau (y - x)) - \nabla f(x)\| \cdot \|y - x\|\, d\tau.$$
Here c.s. means the Cauchy-Schwarz inequality.

Now look at $\|\nabla f(x + \tau (y - x)) - \nabla f(x)\|$: this is exactly where we can apply the Lipschitz gradient inequality
$$\|\nabla f(x + \tau (y - x)) - \nabla f(x)\| \le L \|\tau (y - x)\| = L |\tau| \|y - x\| = L \tau \|y - x\|,$$
where $\|\tau (y - x)\| = |\tau| \|y - x\|$ by homogeneity of the norm. Note that the integral range is from 0 to 1, so the absolute value sign on $\tau$ can be removed. Lastly,
$$\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \big| \le \int_0^1 L \tau\, d\tau \cdot \|y - x\|^2 = \frac{L}{2} \|y - x\|^2.$$
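The integral identity that starts the proof, $f(y) - f(x) = \int_0^1 \langle \nabla f(x + \tau (y - x)), y - x \rangle\, d\tau$, can be checked with numerical quadrature. This is a minimal sketch, assuming the smooth test function $f(x) = \sin(x_1) + x_1 x_2^2$ and a plain midpoint rule; both are illustrative choices, and the names `f`, `grad_f` and `line_integral` are hypothetical.

```python
import math

def f(x):
    return math.sin(x[0]) + x[0] * x[1] ** 2

def grad_f(x):
    return [math.cos(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]]

def line_integral(x, y, n=10000):
    """Midpoint-rule approximation of the integral over tau in [0, 1]
    of <grad_f(x + tau * (y - x)), y - x>."""
    d = [yi - xi for xi, yi in zip(x, y)]
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) / n
        z = [xi + tau * di for xi, di in zip(x, d)]
        g = grad_f(z)
        total += sum(gi * di for gi, di in zip(g, d))
    return total / n

x, y = [0.3, -1.2], [1.5, 0.7]
lhs = f(y) - f(x)
rhs = line_integral(x, y)
print(lhs, rhs)
```

The two printed values agree up to the quadrature error of the midpoint rule.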
L-smoothness: the geometry of the upper bound
An $L$-smooth function $f$ satisfies, for any two points $x, y \in \operatorname{dom} f$,
$$f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2} \|y - x\|_2^2.$$

[Figure: plot against $y$ of $f$ and the quadratic upper bound $f(-1) + \nabla f(-1)(y - (-1)) + \frac{L}{2} \|y - (-1)\|_2^2$.]

Interpretation: $f$ is globally bounded above by a quadratic function, i.e. $f$ cannot be "too sharp" ($f$ is flatter than the upper bound), or $f$ cannot grow "too fast".
Lipschitz continuous Hessian
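A function $f : \operatorname{dom} f \to \mathbb{R}$ has an $L$-Lipschitz Hessian if there exists a constant $L \ge 0$ (the Lipschitz constant) such that for any two points $x, y \in \operatorname{dom} f$,
$$\|\nabla^2 f(x) - \nabla^2 f(y)\| \le L \|x - y\|.$$
- This assumes $f$ is twice differentiable.
- This means the norm of $\nabla^3 f(x)$ is bounded above by $L$ (when $f$ is three times differentiable).
- $f$ having an $L$-Lipschitz Hessian is equivalent to
  $$\Big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle - \frac{1}{2} \langle \nabla^2 f(x)(y - x), y - x \rangle \Big| \le \frac{L}{6} \|y - x\|_2^3.$$
  The original slides link to a proof.
  Removing the absolute value sign, and making $f(y)$ the subject:
  $$f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{1}{2} \langle \nabla^2 f(x)(y - x), y - x \rangle - \frac{L}{6} \|y - x\|_2^3,$$
  $$f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{1}{2} \langle \nabla^2 f(x)(y - x), y - x \rangle + \frac{L}{6} \|y - x\|_2^3,$$
  which means $f(y)$ is bounded above and below by two cubic functions parameterized at the point $x$, for all $y$.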
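A minimal one-dimensional sketch of the cubic bound, stated here in its standard form with the factor $\frac{1}{2}$ on the Hessian term: $|f(y) - f(x) - f'(x)(y - x) - \frac{1}{2} f''(x)(y - x)^2| \le \frac{L}{6} |y - x|^3$. The example $f = \sin$ is an assumption made for illustration; since $|f'''(x)| = |\cos x| \le 1$, its second derivative is 1-Lipschitz and $L = 1$. Function names are hypothetical.

```python
import math
import random

L = 1.0  # |f'''(x)| = |cos(x)| <= 1 for f = sin

def taylor2_residual(x, y):
    """f(y) minus the 2nd-order Taylor expansion of f = sin around x."""
    d = y - x
    expansion = math.sin(x) + math.cos(x) * d - 0.5 * math.sin(x) * d * d
    return math.sin(y) - expansion

def check_cubic_bound(n_trials=2000, tol=1e-9, seed=0):
    """Check |residual| <= (L/6) |y - x|^3 on random pairs (x, y)."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        if abs(taylor2_residual(x, y)) > L / 6.0 * abs(y - x) ** 3 + tol:
            return False
    return True

print(check_cubic_bound())
```

Near $x = 0$ the bound is almost attained, because $\sin(y) - y \approx -y^3/6$ for small $y$.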
Last page - summary
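$f$ is convex if $\operatorname{dom} f$ is convex and
1. $f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y)$
2. $\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge 0$
3. $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle$

$f$ is $\alpha$-strongly convex if $\operatorname{dom} f$ is convex and
1. $f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) - \frac{\alpha}{2} \lambda (1-\lambda) \|x - y\|_2^2$
2. $\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge \alpha \|x - y\|_2^2$
3. $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\alpha}{2} \|x - y\|_2^2$
4. $f(x) - \frac{\alpha}{2} \|x\|_2^2$ is convex
5. $\nabla^2 f(x) \succeq \alpha I$, if $f$ is twice differentiable

$f$ is $L$-Lipschitz gradient ($L$-smooth) if $f$ is differentiable and
1. $\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|$
2. $\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \big| \le \frac{L}{2} \|y - x\|_2^2$
3. $\langle x - y, \nabla f(x) - \nabla f(y) \rangle \ge \frac{1}{L} \|\nabla f(x) - \nabla f(y)\|_2^2$ (for convex $f$)
4. $\nabla^2 f(x) \preceq L I$, if $f$ is twice differentiable (for convex $f$)

$f$ is $L$-Lipschitz Hessian if $f$ is twice differentiable and
1. $\|\nabla^2 f(x) - \nabla^2 f(y)\| \le L \|x - y\|$
2. $\big| f(y) - f(x) - \langle \nabla f(x), y - x \rangle - \frac{1}{2} \langle \nabla^2 f(x)(y - x), y - x \rangle \big| \le \frac{L}{6} \|y - x\|_2^3$

End of document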
