Chapter - 2 - Convex Function

The document summarizes definitions and methods for checking convexity of functions. It defines convex, proper, and epigraph functions. It describes how to check convexity by examining convexity along lines, through first-order and second-order conditions, and by ensuring sublevel sets are convex. Examples of convex functions include norms, logarithms, and indicator functions of convex sets. Convexity can be checked by ensuring univariate functions formed from slices are convex or that Hessian matrices are positive semidefinite.

Uploaded by

Hong Kimmeng

Chapter 2: Optimization for Data Science

Convex Functions

TANN Chantara

Department of Applied Mathematics and Statistics


Institute of Technology of Cambodia

October 9, 2022

TANN Chantara (ITC) Convex Functions October 9, 2022 1 / 31


Table of Contents

1 Definitions

2 Checking Convexity

3 Convexity Preserving Transformations

4 Schur Lemma

5 Generalized Inequalities

6 Summary



Definitions

Epigraph and Domain

Definition: The epigraph of f : Rn → (−∞, ∞] is the set

epi(f) = {(x, α) ∈ Rn+1 : f(x) ≤ α}

Definition: The domain of f : Rn → (−∞, ∞] is the set

dom(f) = {x ∈ Rn : f(x) < ∞}

Definition: A function f : Rn → (−∞, ∞] is called proper if dom(f) ≠ ∅


Convex Functions
Definition: A function f : Rn → (−∞, ∞] is called convex if its epigraph
is a convex set.

Proposition: f is convex if and only if its domain is a convex set and

f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y) (1)

for all x, y ∈ dom(f) and θ ∈ [0, 1].

• f is called strictly convex if the inequality in (1) is strict for all x ≠ y and θ ∈ (0, 1).


• f is called concave if −f is convex.
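As a rough numerical illustration of inequality (1), the sketch below samples random points and reports whether a violation was found. The helper names `violates_convexity` and `looks_convex`, the sampling intervals and the tolerance are assumptions made for this example and are not part of the lecture. A test of this kind can only refute convexity; it cannot prove it.

```python
import math
import random

def violates_convexity(f, x, y, theta, tol=1e-9):
    """Return True if f breaks inequality (1) at the given x, y and theta."""
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    return lhs > rhs + tol

def looks_convex(f, lo, hi, trials=2000, seed=0):
    """Randomised check of inequality (1) on the interval [lo, hi]."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        theta = rng.random()
        if violates_convexity(f, x, y, theta):
            return False
    return True

print(looks_convex(math.exp, -3.0, 3.0))  # exp is convex, so no violation should be found
print(looks_convex(math.sin, 0.0, 3.0))   # sin is concave on [0, 3], so violations are expected
```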

Sublevel Sets
Definition: The α-sublevel set of a function f : Rn → (−∞, ∞] is defined
as Cα = {x : f(x) ≤ α}

Proposition: If f is convex, then all of its sublevel sets are convex.

• The reverse implication is not true.


• Exercise: Find a non-convex function whose sublevel sets are all convex.


Examples of Convex Functions

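Univariate functions (with domain):

• Exponential functions: $f(x) = e^{ax}$, domain $\mathbb{R}$
• Powers: $f(x) = x^{\alpha}$ ($\alpha \ge 1$ or $\alpha \le 0$), domain $\mathbb{R}_{++}$
• Negative logarithm: $f(x) = -\log x$, domain $\mathbb{R}_{++}$
• Negative entropy: $f(x) = x \log x$, domain $\mathbb{R}_{++}$

Multivariate functions (with domain):

• Affine functions: $f(x) = a^\top x + b$, domain $\mathbb{R}^n$
• p-Norms ($p \ge 1$): $f(x) = \|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$, domain $\mathbb{R}^n$
• ∞-Norm: $f(x) = \|x\|_\infty = \max_i |x_i|$, domain $\mathbb{R}^n$
• Indicator function of a convex set $C$: $f(x) = 0$ if $x \in C$ and $f(x) = \infty$ otherwise, domain $C$

Convention: $\mathbb{R}_{++} = \{x \in \mathbb{R} : x > 0\}$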


Example of Convex Functions (cont’d)

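Matrix functions (with domain):

• Trace functions (linear functions), domain $\mathbb{R}^{m \times n}$:

$$f(X) = \operatorname{tr}(A^\top X) = \sum_{i=1}^m \sum_{j=1}^n A_{ij} X_{ij}, \quad A \in \mathbb{R}^{m \times n}$$

• Maximum eigenvalue: $f(X) = \lambda_{\max}(X)$, domain $\mathbb{S}^n$

• Spectral norm: $f(X) = \|X\|_2 = \sup_{v \neq 0} \|Xv\|_2 / \|v\|_2$, domain $\mathbb{R}^{m \times n}$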



Checking Convexity

Checking Convexity Along Line

Proposition: A function f : Rn → (−∞, ∞] is convex if and only if each univariate function g : R → (−∞, ∞] of the form

g(t) = f(x + ty), for x, y ∈ Rn,

is convex in t.

Proof:
⇒: For any x, y ∈ Rn and θ ∈ (0, 1), consider t = θa + (1 − θ)b for
arbitrary a, b ∈ R
g(t) = g(θa + (1 − θ)b) = f(x + (θa + (1 − θ)b)y)
= f(θ(x + ay) + (1 − θ)(x + by))
≤ θf(x + ay) + (1 − θ)f(x + by) = θg(a) + (1 − θ)g(b)

Proof: Cont’d
⇐: For any x, y ∈ Rn , θ ∈ (0, 1) and t1 , t2 ∈ R
f(x + (θt1 + (1 − θ)t2 )y) = g(θt1 + (1 − θ)t2 )
≤ θg(t1 ) + (1 − θ)g(t2 ) = θf(x + t1 y) + (1 − θ)f(x + t2 y)
For t1 = 0, t2 = 1, x = x′ , y = y′ − x′ we get
f(θx′ + (1 − θ)y′ ) ≤ θf(x′ ) + (1 − θ)f(y′ )
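The line criterion can be tried out numerically. The following sketch restricts a multivariate function to a line and tests midpoint convexity of the resulting univariate function on a grid. The names `restrict_to_line` and `midpoint_convex_on_grid`, the two test functions and the grid are assumptions chosen for illustration; a grid test is evidence only, not a proof.

```python
import math

def restrict_to_line(f, x, y):
    """Build g(t) = f(x + t*y) for a base point x and a direction y (lists of floats)."""
    return lambda t: f([xi + t * yi for xi, yi in zip(x, y)])

def midpoint_convex_on_grid(g, ts, tol=1e-9):
    """Check g((a + b)/2) <= (g(a) + g(b))/2 for all pairs of grid points."""
    return all(g(0.5 * (a + b)) <= 0.5 * (g(a) + g(b)) + tol for a in ts for b in ts)

def euclidean_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def saddle(x):
    # Not convex: along the direction (1, -1) through the origin it equals -t**2.
    return x[0] * x[1]

ts = [k / 4 for k in range(-8, 9)]
g_norm = restrict_to_line(euclidean_norm, [1.0, -2.0], [0.5, 3.0])
g_saddle = restrict_to_line(saddle, [0.0, 0.0], [1.0, -1.0])

print(midpoint_convex_on_grid(g_norm, ts))
print(midpoint_convex_on_grid(g_saddle, ts))
```

A single line along which g fails to be convex, as for the saddle function, is enough to conclude that f is not convex.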


1st Order Conditions


Definition: A function f : Rn → (−∞, ∞] is differentiable if its gradient
∇f = (∂f/∂x1 , ..., ∂f/∂xn ) exists at each point in dom(f) and if dom(f) is
open

Proposition: A differentiable function f : Rn → (−∞, ∞] is convex if and only if dom(f) is convex and

f(y) ≥ f(x) + ∇f(x)⊤ (y − x) ∀x, y ∈ dom(f)

⇒ The 1st-order Taylor approximation underestimates f globally.
⇒ From local information about a convex function we can obtain global information.
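To see the first-order condition at work, the sketch below measures the gap between a function and its tangent lines on a grid. The helper `tangent_gap`, the grid and the two test functions (exp and the cube) are assumptions picked for this illustration.

```python
import math

def tangent_gap(f, df, x, y):
    """f(y) minus the first-order Taylor approximation of f around x, evaluated at y."""
    return f(y) - (f(x) + df(x) * (y - x))

grid = [k / 10 for k in range(-30, 31)]

# exp is convex: every tangent line stays below the graph, so the gap is never negative.
min_gap_exp = min(tangent_gap(math.exp, math.exp, x, y) for x in grid for y in grid)

# u**3 is not convex on R: some tangent lines cross the graph, so the gap becomes negative.
min_gap_cube = min(
    tangent_gap(lambda u: u**3, lambda u: 3 * u**2, x, y) for x in grid for y in grid
)

print(min_gap_exp >= -1e-12, min_gap_cube < 0)
```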


Univariate Functions

Proposition: A differentiable function f : R → R is convex if and only if

f(y) ≥ f(x) + f′(x)(y − x) ∀x, y ∈ R

Proof:
⇒: If x, y ∈ R, 0 < t ≤ 1, then
f(x + t(y − x)) ≤ (1 − t)f(x) + tf(y) (convexity)
f(y) − f(x) ≥ [f(x + t(y − x)) − f(x)]/t (subtract f(x), divide by t)
f(y) − f(x) ≥ f′(x)(y − x) (limit t → 0)
⇐: For any x, y ∈ R, 0 ≤ t ≤ 1, let z = tx + (1 − t)y
t(f(x) − f(z)) ≥ tf′(z)(x − z) (by assumption)
(1 − t)(f(y) − f(z)) ≥ (1 − t)f′(z)(y − z) (by assumption)
tf(x) + (1 − t)f(y) − f(z) ≥ f′(z)(tx + (1 − t)y − z) = 0 (sum of above)
Hence tf(x) + (1 − t)f(y) ≥ f(z).


1st Order Conditions∗

Proposition: A differentiable function f : Rn → R is convex if and only if

f(y) ≥ f(x) + ∇f(x)⊤ (y − x) ∀x, y ∈ Rn

Proof:
⇒: g(t) = f(tx + (1 − t)y) is convex in t for any x, y ∈ Rn
g′(t) = ∇f(tx + (1 − t)y)⊤ (x − y) (definition of g)
g(1) ≥ g(0) + g′(0) (convexity of g)
f(x) ≥ f(y) + ∇f(y)⊤ (x − y) (substitution)
⇐: For x, y ∈ Rn and t, t̃ ∈ R, let z = ty + (1 − t)x, z̃ = t̃y + (1 − t̃)x and g(t) = f(ty + (1 − t)x)
f(z) ≥ f(z̃) + ∇f(z̃)⊤ (z − z̃) (by assumption)
g(t) ≥ g(t̃) + g′(t̃)(t − t̃) (definition of g, z, z̃)

By the 1st-order condition for univariate functions, g is convex. As x and y are arbitrary, f is convex along every line and thus f is also convex.


2nd Order Conditions


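Definition: A function f : Rn → (−∞, ∞] is twice differentiable if its Hessian

$$\nabla^2 f(x) = \begin{pmatrix} \partial^2 f/\partial x_1 \partial x_1 & \cdots & \partial^2 f/\partial x_1 \partial x_n \\ \vdots & \ddots & \vdots \\ \partial^2 f/\partial x_n \partial x_1 & \cdots & \partial^2 f/\partial x_n \partial x_n \end{pmatrix}$$

exists at each point in dom(f), and dom(f) is open.

Proposition: A twice differentiable function f : Rn → (−∞, ∞] is convex if and only if dom(f) is convex and

$$\nabla^2 f(x) \succeq 0 \quad \forall x \in \operatorname{dom}(f)$$

The condition $\nabla^2 f(x) \succeq 0$ can be interpreted geometrically as the requirement that f has non-negative (upward) curvature at x.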

Univariate Functions∗

Proposition: A twice differentiable function f : R → R is convex if and only if f′′(x) ≥ 0 ∀x ∈ R
Proof:
⇒: If x, y ∈ R, y > x, then
f(y) ≥ f(x) + f′(x)(y − x) (1st-order condition)
f(x) ≥ f(y) + f′(y)(x − y) (1st-order condition)
0 ≥ (f′(x) − f′(y))/(y − x) (sum of above × (y − x)^{−2})
0 ≥ −f′′(x) (limit y → x)
⇐: For x, y ∈ R, we have

$$f(y) = f(x) + \int_x^y f'(u)\,du = f(x) + \int_x^y \left( f'(x) + \int_x^u f''(v)\,dv \right) du \ge f(x) + \int_x^y f'(x)\,du = f(x) + f'(x)(y - x)$$

Thus, f is convex as it satisfies the 1st-order condition.


2nd-Order Conditions∗

Proposition: A twice differentiable function f : Rn → R is convex if and only if ∇2 f(x) ⪰ 0 ∀x ∈ Rn
Proof:
⇒: g(t) = f(x + ty) is convex in t for any x, y ∈ Rn
g′′(t) = y⊤ ∇2 f(x + ty)y (definition of g)
g′′(t) ≥ 0 (univariate case)
∇2 f(x) ⪰ 0 (set t = 0; y is arbitrary)
⇐: Define g as above. For any t we have
∇2 f(x + ty) ⪰ 0 (by assumption)
g′′(t) ≥ 0 (definition of g)

By the 2nd-order condition for univariate functions, g is convex. Thus, f is also convex.


Examples

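• Quadratic functions $f(x) = x^\top P x + q^\top x + r$ with $P \in \mathbb{S}^n$ are convex if and only if $\nabla^2 f(x) = 2P \succeq 0$

• The least-squares objective $f(x) = \|Ax - b\|_2^2$ is convex because $\nabla^2 f(x) = 2A^\top A \succeq 0$ for all $A \in \mathbb{R}^{m \times n}$

• Quadratic-over-linear functions of the type $f(x, y) = x^2/y$ are convex as long as $y > 0$ because

$$\nabla^2 f(x, y) = \frac{2}{y^3} \begin{pmatrix} y \\ -x \end{pmatrix} \begin{pmatrix} y & -x \end{pmatrix} \succeq 0 \quad \forall y > 0$$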
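The second-order condition for the quadratic-over-linear example can be spot-checked with a few lines of code. The sketch below evaluates the Hessian entries of f(x, y) = x²/y in closed form and applies the elementary test that a symmetric 2×2 matrix is positive semidefinite exactly when both diagonal entries and the determinant are non-negative. The function names, the sample grid and the tolerance are assumptions made for this example.

```python
def hessian_quad_over_lin(x, y):
    """Return (f_xx, f_xy, f_yy) for f(x, y) = x**2 / y with y > 0."""
    return 2.0 / y, -2.0 * x / y**2, 2.0 * x**2 / y**3

def is_psd_2x2(a, b, d, tol=1e-9):
    """[[a, b], [b, d]] is PSD iff both diagonal entries and the determinant are non-negative."""
    det = a * d - b * b
    # The determinant is exactly zero in theory here, so allow for rounding error.
    return a >= 0.0 and d >= 0.0 and det >= -tol * max(1.0, abs(a * d))

points = [(x / 2, y / 2) for x in range(-6, 7) for y in range(1, 9)]
all_psd = all(is_psd_2x2(*hessian_quad_over_lin(x, y)) for x, y in points)
print(all_psd)
```

The determinant vanishes at every point because the Hessian has rank one, which matches the outer-product form of the Hessian.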


Negative Log-Determinant
Proposition: The negative log-determinant function f(X) = − log det(X) is convex
on the set of positive definite matrices Sn++ .
Proof: Homework.

Convexity Preserving Transformations

Sometimes one can establish convexity of f by showing that f is obtained from simple convex functions via transformations that preserve convexity:
• non-negative weighted sum
• composition with affine function
• pointwise maximum and supremum
• composition
• minimization
• perspective


Affine Transformations

• Affine transformation of input: If f is convex, then g(x) = f(Ax + b) is also convex.
• Non-negative affine transformation of output: If f1 , ..., fK are convex functions and ρ1 , ..., ρK are non-negative numbers, then the conic combination g(x) = ρ1 f1 (x) + · · · + ρK fK (x) is convex.
• Generalization to integrals: If f(x, y) is convex in x for each fixed y ∈ Y and ρ(y) is a non-negative function of y, then

$$g(x) = \int_Y \rho(y) f(x, y)\,dy$$

is convex in x (provided that the integral exists).


Pointwise maximum and supremum


Maximum of convex functions: If f1 , ..., fK are convex, then the pointwise maximum g(x) = max{f1 (x), ..., fK (x)} is also convex.

Recall: Intersections of convex sets are convex, and epi(g) is the intersection of epi(f1 ), ..., epi(fK ).

Supremum of convex functions: If f(x, y) is convex in x for every fixed y ∈ Y, then the pointwise supremum

$$g(x) = \sup_{y \in Y} f(x, y)$$

is also convex.

Examples

1 Piecewise linear functions $f(x) = \max_{i=1,\dots,K} \{a_i^\top x + b_i\}$ are convex.
2 The sum of the r largest components of $x \in \mathbb{R}^n$ is convex as it can be written as a maximum of linear functions:

$$f(x) = \max\{x_{i_1} + x_{i_2} + \cdots + x_{i_r} : 1 \le i_1 < i_2 < \cdots < i_r \le n\}$$

3 The support function $\sigma_C(x) = \sup_{y \in C} y^\top x$ of a (possibly nonconvex) set C is convex.
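The representation of the sum of the r largest components as a maximum of linear functions can be checked directly. The sketch below computes the value once by sorting and once as a maximum over all index sets, and then tests the convexity inequality at one midpoint. The function names and the sample vectors are assumptions for illustration only; enumerating all index sets is exponential in n and is meant purely as a demonstration.

```python
from itertools import combinations

def sum_r_largest_sorted(x, r):
    """Sum of the r largest entries of x, computed by sorting."""
    return sum(sorted(x, reverse=True)[:r])

def sum_r_largest_as_max(x, r):
    """Maximum over all index sets i1 < ... < ir of the linear function x[i1] + ... + x[ir]."""
    return max(sum(x[i] for i in idx) for idx in combinations(range(len(x)), r))

x = [3.0, -1.0, 4.0, 1.5]
y = [0.0, 2.0, -5.0, 9.0]
mid = [0.5 * (xi + yi) for xi, yi in zip(x, y)]

same_value = sum_r_largest_sorted(x, 2) == sum_r_largest_as_max(x, 2)
midpoint_ok = sum_r_largest_sorted(mid, 2) <= 0.5 * (
    sum_r_largest_sorted(x, 2) + sum_r_largest_sorted(y, 2)
)
print(same_value, midpoint_ok)
```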


Examples Cont’d
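4 Maximum eigenvalue $f(X) = \lambda_{\max}(X)$ for $X \in \mathbb{S}^n$:
Write $X = R D R^\top$, with R orthogonal and $D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$. Then

$$f(X) = \sup_{\|v\|_2 = 1} \left( v_1^2 \lambda_1 + \cdots + v_n^2 \lambda_n \right) = \sup_{\|v\|_2 = 1} v^\top D v = \sup_{\|v\|_2 = 1} v^\top X v$$

5 Spectral norm $f(X) = \|X\|_2 = \sup_{v \neq 0} \|Xv\|_2 / \|v\|_2$ for $X \in \mathbb{R}^{m \times n}$:

$$f(X) = \sup_{\|v\|_2 = 1} \|Xv\|_2 = \sup_{\|v\|_2 = 1} \sup_{\|u\|_2 = 1} u^\top X v$$

Recall: $u^\top X v \le \|u\|_2 \|Xv\|_2 = \|Xv\|_2$ for $\|u\|_2 = 1$.

In both cases f(X) is the supremum of linear functions in X and thus convex.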


Composition

Proposition: If g : Rn → R is convex and h : R → R is convex and non-decreasing, then f : Rn → R defined as f(x) = h(g(x)) is convex.

Proof: For any x, y ∈ Rn and θ ∈ [0, 1]

f(θx + (1 − θ)y) = h(g(θx + (1 − θ)y)) (definition of f)
≤ h(θg(x) + (1 − θ)g(y)) (convexity of g, monotonicity of h)
≤ θh(g(x)) + (1 − θ)h(g(y)) (convexity of h)
= θf(x) + (1 − θ)f(y) (definition of f)

Thus f is convex.
Example: f(x) = exp(g(x)) is convex if g is convex.
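The role of the monotonicity assumption on h can be illustrated numerically. In the sketch below, the inner function g and the sample points are assumptions chosen for this example: composing g with exp (convex and non-decreasing) passes a midpoint check, while composing a convex function with u ↦ exp(−u) (convex but non-increasing) produces the Gaussian bump exp(−x²), which fails it.

```python
import math

def g(x):
    """Convex inner function g(x) = |x1| + x2**2 (an assumption for this example)."""
    return abs(x[0]) + x[1] ** 2

def f(x):
    """Composition h(g(x)) with h = exp, which is convex and non-decreasing."""
    return math.exp(g(x))

def bump(t):
    """exp(-t**2): outer function u -> exp(-u) is convex but non-increasing, inner t**2 is convex."""
    return math.exp(-(t ** 2))

x, y = [-1.0, 2.0], [3.0, 0.0]
mid = [0.5 * (xi + yi) for xi, yi in zip(x, y)]

composition_ok = f(mid) <= 0.5 * (f(x) + f(y))
bump_violates = bump(0.0) > 0.5 * (bump(-1.0) + bump(1.0))
print(composition_ok, bump_violates)
```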


Generalizations

Definition: A function f : Rn → [−∞, ∞) is concave if −f is convex.

Proposition: If g : Rn → R is concave and h : R → R is convex and non-increasing, then f : Rn → R defined as f(x) = h(g(x)) is convex.

Proposition: If g : Rn → Rk is convex in each component, while h : Rk → R is non-decreasing in each argument and convex, then f : Rn → R defined via f(x) = h(g(x)) is convex.

Proposition: If g : Rn → Rk is concave in each component, while h : Rk → R is non-increasing in each argument and convex, then f : Rn → R defined via f(x) = h(g(x)) is convex.


Minimization

Proposition: If f(x, y) and g(x, y) are convex in (x, y) and C is a convex set, then the optimal value function

$$h(x) = \inf_{y \in C} \{ f(x, y) : g(x, y) \le 0 \}$$

is convex.
Proof: Assume that the inner problem is solvable for every x ∈ dom(h). Choose x1 , x2 ∈ dom(h) and let y1 , y2 ∈ C be the corresponding minimizers, i.e., h(xi ) = f(xi , yi ) and g(xi , yi ) ≤ 0 for i = 1, 2. Then, for any θ ∈ [0, 1]:


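$$\begin{aligned}
h(\theta x_1 + (1 - \theta) x_2) &= \inf_{y \in C} \{ f(\theta x_1 + (1 - \theta) x_2, y) : g(\theta x_1 + (1 - \theta) x_2, y) \le 0 \} \\
&\le f(\theta x_1 + (1 - \theta) x_2, \theta y_1 + (1 - \theta) y_2) \\
&\le \theta f(x_1, y_1) + (1 - \theta) f(x_2, y_2) \\
&= \theta h(x_1) + (1 - \theta) h(x_2)
\end{aligned}$$

The first inequality holds because θy1 + (1 − θ)y2 is feasible: it lies in C as C is convex, and g(θx1 + (1 − θ)x2 , θy1 + (1 − θ)y2 ) ≤ θg(x1 , y1 ) + (1 − θ)g(x2 , y2 ) ≤ 0 as g is convex. The second inequality follows from the convexity of f.

Thus, h is convex. If the problem is not solvable, one can use a similar argument using ε-optimal solutions for ε → 0.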

Schur Lemma

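Lemma (Schur): Consider $X \in \mathbb{S}^n$ partitioned as

$$X = \begin{pmatrix} A & B \\ B^\top & C \end{pmatrix},$$

where $C \succ 0$. Then

$$X \succeq 0 \iff A - B C^{-1} B^\top \succeq 0$$

The matrix $A - B C^{-1} B^\top$ is called the Schur complement of C.

Proof: Consider the functions $f(x, y) = x^\top A x + 2 x^\top B y + y^\top C y$, which is the quadratic form of X at (x, y), and $h(x) = \inf_y f(x, y) = x^\top (A - B C^{-1} B^\top) x$.
⇒: $X \succeq 0 \implies$ f convex in (x, y) $\implies$ h convex in x $\implies A - B C^{-1} B^\top \succeq 0$
⇐: We have $A - B C^{-1} B^\top \succeq 0$. Assume $X \not\succeq 0$
$\implies \exists (x_0, y_0) \neq 0$ with $f(x_0, y_0) < 0$
$\implies h(x_0) = x_0^\top (A - B C^{-1} B^\top) x_0 \le f(x_0, y_0) < 0$, which contradicts the positive semidefiniteness of the Schur complement. Hence, $X \succeq 0$.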
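For scalar blocks the lemma and the partial minimization used in its proof can be verified by hand or in code. The sketch below takes X = [[a, b], [b, c]] with c > 0, minimizes the quadratic form over y in closed form and compares the result with the Schur complement a − b²/c. The function names and the sample numbers are assumptions for this illustration.

```python
def quad_form(x, y, a, b, c):
    """f(x, y) = a*x^2 + 2*b*x*y + c*y^2, the quadratic form of [[a, b], [b, c]]."""
    return a * x * x + 2 * b * x * y + c * y * y

def partial_min(x, a, b, c):
    """Minimise quad_form over y for c > 0; the minimiser is y* = -b*x/c."""
    y_star = -b * x / c
    return quad_form(x, y_star, a, b, c)

def schur_complement(a, b, c):
    return a - b * b / c

# Positive semidefinite case: a = 2, b = 1, c = 1 gives Schur complement 1 >= 0.
print(schur_complement(2.0, 1.0, 1.0), partial_min(3.0, 2.0, 1.0, 1.0))

# Indefinite case: a = 1, b = 2, c = 1 gives a negative Schur complement,
# and the quadratic form is negative at (x, y) = (1, -2).
print(schur_complement(1.0, 2.0, 1.0), quad_form(1.0, -2.0, 1.0, 2.0, 1.0))
```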

Distance function

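The distance of x to a fixed convex set C,

$$f(x) = \operatorname{dist}(x, C) = \inf_{y \in C} \|x - y\|_2,$$

is convex in x, because $\|x - y\|_2$ is convex in (x, y) and minimization over the convex set C preserves convexity.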
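As a concrete instance, the sketch below takes C to be a box, for which the nearest point is obtained by clipping each coordinate, and checks the convexity inequality at one midpoint. The choice of the box, the function name `dist_to_box` and the sample points are assumptions for illustration.

```python
import math

def dist_to_box(x, lo, hi):
    """Euclidean distance from x to the box {y : lo <= y_i <= hi for all i}.

    The nearest point of the box is found by clipping each coordinate to [lo, hi].
    """
    return math.sqrt(sum((xi - min(max(xi, lo), hi)) ** 2 for xi in x))

x1, x2 = [4.0, 5.0], [-4.0, 0.0]
mid = [0.5 * (a + b) for a, b in zip(x1, x2)]

d1, d2, d_mid = dist_to_box(x1, -1.0, 1.0), dist_to_box(x2, -1.0, 1.0), dist_to_box(mid, -1.0, 1.0)
print(d1, d2, d_mid, d_mid <= 0.5 * (d1 + d2))
```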


Perspective function

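Proposition: If f(x) is convex, then the perspective of f, defined as

$$g(x, t) = t f(x/t), \quad \operatorname{dom}(g) = \{(x, t) : x/t \in \operatorname{dom}(f),\ t > 0\},$$

is convex in (x, t).
Proof: Choose $(x_1, t_1), (x_2, t_2) \in \operatorname{dom}(g)$ and $\theta \in [0, 1]$, then

$$\begin{aligned}
g(\theta (x_1, t_1) + (1 - \theta)(x_2, t_2)) &= (\theta t_1 + (1 - \theta) t_2)\, f\!\left( \frac{\theta x_1 + (1 - \theta) x_2}{\theta t_1 + (1 - \theta) t_2} \right) \\
&= (\theta t_1 + (1 - \theta) t_2)\, f\!\left( \frac{\theta t_1 (x_1/t_1) + (1 - \theta) t_2 (x_2/t_2)}{\theta t_1 + (1 - \theta) t_2} \right) \\
&\le \theta t_1 f(x_1/t_1) + (1 - \theta) t_2 f(x_2/t_2) \\
&= \theta g(x_1, t_1) + (1 - \theta) g(x_2, t_2)
\end{aligned}$$

The inequality uses the convexity of f, since the argument of f is a convex combination of $x_1/t_1$ and $x_2/t_2$ with weights $\theta t_1/(\theta t_1 + (1 - \theta) t_2)$ and $(1 - \theta) t_2/(\theta t_1 + (1 - \theta) t_2)$.

Thus g is convex in (x, t).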


Relative Entropy

Proposition: The relative entropy of two vectors $p, q \in \mathbb{R}^n_{++}$, defined as

$$f(p, q) = \sum_{i=1}^n p_i \log(p_i/q_i),$$

is convex.
Proof: The negative logarithm $x \mapsto -\log(x)$ is convex on $\mathbb{R}_{++}$. We therefore conclude that its perspective function

$$g(x, t) = -t \log(x/t) = t \log(t/x)$$

is convex on $\mathbb{R}^2_{++}$. The relative entropy $f(p, q) = \sum_{i=1}^n g(q_i, p_i)$ can now be seen as a sum of n convex functions and as such is convex.
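The decomposition used in this proof can be reproduced numerically. The sketch below evaluates the relative entropy both directly and as a sum of perspective terms of the negative logarithm, and then checks joint convexity in (p, q) at one midpoint. The function names and the sample vectors are assumptions made for this example.

```python
import math

def perspective_neg_log(x, t):
    """Perspective of -log: g(x, t) = t * (-log(x / t)) = t * log(t / x), for x, t > 0."""
    return t * (-math.log(x / t))

def relative_entropy(p, q):
    """Sum of p_i * log(p_i / q_i) over all components."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p1, q1 = [0.5, 0.5], [0.9, 0.1]
p2, q2 = [0.2, 0.8], [0.3, 0.7]
p_mid = [0.5 * (a + b) for a, b in zip(p1, p2)]
q_mid = [0.5 * (a + b) for a, b in zip(q1, q2)]

# Each term p_i * log(p_i / q_i) equals the perspective evaluated at (x, t) = (q_i, p_i).
via_perspective = sum(perspective_neg_log(qi, pi) for pi, qi in zip(p1, q1))

midpoint_ok = relative_entropy(p_mid, q_mid) <= 0.5 * (
    relative_entropy(p1, q1) + relative_entropy(p2, q2)
)
print(abs(via_perspective - relative_entropy(p1, q1)) < 1e-12, midpoint_ok)
```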



Generalized Inequalities

Convexity w.r.t generalized inequalities

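Definition: Let $K \subset \mathbb{R}^m$ be a proper convex cone. The function $f : \mathbb{R}^n \to \mathbb{R}^m$ is called K-convex if

$$f(\theta x + (1 - \theta) y) \preceq_K \theta f(x) + (1 - \theta) f(y) \quad \forall x, y \in \mathbb{R}^n,\ \theta \in [0, 1]$$

Proposition: If K is a proper convex cone and f is a K-convex function, then the set $C = \{x : f(x) \preceq_K 0\}$ is convex.
Proof: Consider $x, y \in C$ and $\theta \in [0, 1]$. Then

$$\begin{aligned}
f(\theta x + (1 - \theta) y) &\preceq_K \theta f(x) + (1 - \theta) f(y) && \text{($f$ is $K$-convex)} \\
&\preceq_K 0 && \text{($x, y \in C$, $K$ is a convex cone)}
\end{aligned}$$

Thus, $\theta x + (1 - \theta) y \in C$, which implies that C is convex.

Example: $f : \mathbb{S}^n \to \mathbb{S}^n$, $f(X) = X^2$, is $\mathbb{S}^n_+$-convex.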

Summary

• Definitions: epigraph, domain and sublevel sets; proper, convex and concave functions.
• Checking convexity: using the basic definition; checking convexity along lines; checking the 1st- or 2nd-order conditions (only for differentiable functions).
• Convexity-preserving transformations: non-negative weighted sum and integral; composition with affine function; parametric maximum; composition; parametric minimum (check convexity condition!); perspective.
• Schur’s lemma: a block matrix with a positive definite diagonal block is psd if and only if this block’s Schur complement is psd.
• Generalized inequalities: constructing convex sets using K-convex constraint functions and conic inequalities.
