Func 20160919


2.2 Convex Function

Yipeng Liu

School of Electronic Engineering/Center for Robotics/Center for Information in Medicine


University of Electronic Science and Technology of China (UESTC)

[email protected]

October 10, 2016

Overview

1. definition

2. basic properties

3. epigraph and sublevel set

4. Jensen’s inequality

5. operations that preserve convexity

6. conjugate function

7. log-concave and log-convex functions

8. convexity with respect to generalized inequalities

Definition

f : R^N → R is convex if dom f is a convex set and

f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)

for all x, y ∈ dom f, 0 ≤ θ ≤ 1

• f is concave if −f is convex

• f is strictly convex if dom f is convex and

f(θx + (1 − θ)y) < θf(x) + (1 − θ)f(y)

for all x, y ∈ dom f, x ≠ y, 0 < θ < 1

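A minimal numerical sketch of the definition, assuming a scalar function sampled on an interval. The helper name `violates_convexity` and the test functions are illustrative choices, not part of the slides. Sampling can only disprove convexity, never prove it.

```python
import random

def violates_convexity(f, lo, hi, trials=10_000, tol=1e-9, seed=0):
    """Return True if some sampled (x, y, theta) breaks
    f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        theta = rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + tol:
            return True
    return False

# x^2 is convex on R; x^3 is not convex on an interval containing negative numbers.
square_violated = violates_convexity(lambda x: x * x, -5.0, 5.0)
cube_violated = violates_convexity(lambda x: x ** 3, -5.0, 5.0)
print(square_violated, cube_violated)
```
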
Examples on R

convex:

• affine: ax + b on R, for any a, b ∈ R

• exponential: e^{ax}, for any a ∈ R

• powers: x^α on R_++, for α ≥ 1 or α ≤ 0

• powers of absolute value: |x|^p on R, for p ≥ 1

• negative entropy: x log x on R_++

concave:

• affine: ax + b on R, for any a, b ∈ R

• powers: x^α on R_++, for 0 ≤ α ≤ 1

• logarithm: log x on R_++

Examples on R^N and R^{M×N}

affine functions are convex and concave; all norms are convex

examples on R^N:

• affine function: f(x) = a^T x + b

• norms: $\|x\|_p = \left(\sum_{n=1}^{N} |x_n|^p\right)^{1/p}$ for p ≥ 1; $\|x\|_\infty = \max_n |x_n|$

examples on R^{M×N}:

• affine function

$$ f(X) = \mathrm{tr}(A^T X) + b = \sum_{m=1}^{M} \sum_{n=1}^{N} A_{mn} X_{mn} + b $$

• spectral (maximum singular value) norm

$$ f(X) = \|X\|_2 = \sigma_{\max}(X) = \left(\lambda_{\max}(X^T X)\right)^{1/2} $$

Restriction of a convex function to a line

f : R^N → R is convex if and only if the function g : R → R,

g(t) = f(x + tv), dom g = {t | x + tv ∈ dom f}

is convex (in t) for any x ∈ dom f, v ∈ R^N

check convexity of f by checking convexity of functions of only one variable

example: f : S^N → R with f(X) = log det X, dom f = S^N_++

$$ g(t) = \log\det(X + tV) = \log\det X + \log\det\left(I + t X^{-1/2} V X^{-1/2}\right) = \log\det X + \sum_{n=1}^{N} \log(1 + t\lambda_n) $$

where λ_n are the eigenvalues of X^{-1/2} V X^{-1/2}

g is concave in t (for any choice of X ≻ 0, V); hence f is concave

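The restriction-to-a-line argument can be sketched numerically for 2×2 matrices. The matrices `X` and `V`, the grid, and the helper names below are arbitrary illustrative assumptions. The check uses second differences of g(t) = log det(X + tV), which should be nonpositive for a concave g.

```python
import math

def logdet2(m):
    """log det of a 2x2 matrix ((a, b), (c, d)); requires det > 0."""
    (a, b), (c, d) = m
    return math.log(a * d - b * c)

def g(t, X, V):
    """g(t) = log det(X + tV) for 2x2 matrices stored as nested tuples."""
    M = tuple(tuple(X[i][j] + t * V[i][j] for j in range(2)) for i in range(2))
    return logdet2(M)

X = ((2.0, 0.5), (0.5, 1.0))     # positive definite
V = ((1.0, -1.0), (-1.0, 0.5))   # arbitrary symmetric direction

h = 0.05
ts = [-0.25 + h * k for k in range(11)]   # X + tV stays positive definite for |t| <= 0.3
second_diffs = [g(t - h, X, V) - 2 * g(t, X, V) + g(t + h, X, V) for t in ts]
is_concave_on_grid = all(d <= 1e-12 for d in second_diffs)
print(is_concave_on_grid)
```
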
Extended-value extension

extended-value extension f̃ of f is

f̃(x) = f(x) for x ∈ dom f, f̃(x) = ∞ for x ∉ dom f

often simplifies notation; for example, the condition

0 ≤ θ ≤ 1 ⇒ f̃(θx + (1 − θ)y) ≤ θf̃(x) + (1 − θ)f̃(y)

(as an inequality in R ∪ {∞}) means the same as the two conditions

• dom f is convex

• for x, y ∈ dom f,

0 ≤ θ ≤ 1 ⇒ f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)

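As a small sketch of the extended-value idea, the code below takes f(x) = −log x with dom f = (0, ∞) (an illustrative choice) and represents ∞ by `math.inf`. The names `f_ext` and `jensen_gap` are hypothetical. Only interior values of θ are used, because floating-point `0 * inf` is `nan` rather than the convention 0 · ∞ = 0.

```python
import math

def f_ext(x):
    """Extended-value extension of f(x) = -log x, with dom f = (0, inf)."""
    return -math.log(x) if x > 0 else math.inf

def jensen_gap(x, y, theta):
    """theta*f~(x) + (1-theta)*f~(y) - f~(theta*x + (1-theta)*y).
    Nonnegative (possibly +inf) when f~ satisfies the convexity inequality."""
    mid = theta * x + (1 - theta) * y
    return theta * f_ext(x) + (1 - theta) * f_ext(y) - f_ext(mid)

gap_inside = jensen_gap(1.0, 4.0, 0.5)     # both points in dom f: finite, nonnegative gap
gap_outside = jensen_gap(-1.0, 4.0, 0.5)   # x outside dom f: right-hand side is +inf
print(gap_inside, gap_outside)
```
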
First-order condition
f is differentiable if dom f is open and the gradient

$$ \nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \cdots, \frac{\partial f(x)}{\partial x_N} \right) $$

exists at each x ∈ dom f

1st-order condition: differentiable f with convex domain is convex iff

f(y) ≥ f(x) + ∇f(x)^T (y − x), for all x, y ∈ dom f

first-order approximation of f is global underestimator

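The global-underestimator property can be spot-checked on a simple convex function. The choice f(x) = x_1² + exp(x_2) and the helper names are assumptions made for this sketch only.

```python
import math
import random

def f(x):
    """Illustrative convex function on R^2: x1^2 + exp(x2)."""
    return x[0] ** 2 + math.exp(x[1])

def grad_f(x):
    return (2 * x[0], math.exp(x[1]))

def underestimator_gap(x, y):
    """f(y) - [f(x) + grad f(x)^T (y - x)]; nonnegative when f is convex."""
    g = grad_f(x)
    linear = f(x) + sum(gi * (yi - xi) for gi, yi, xi in zip(g, y, x))
    return f(y) - linear

rng = random.Random(1)
gaps = [
    underestimator_gap(
        (rng.uniform(-3, 3), rng.uniform(-3, 3)),
        (rng.uniform(-3, 3), rng.uniform(-3, 3)),
    )
    for _ in range(1000)
]
min_gap = min(gaps)
print(min_gap >= -1e-9)
```
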
Second-order conditions

f is twice differentiable if dom f is open and the Hessian ∇²f(x) ∈ S^N,

$$ \nabla^2 f(x)_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}, \quad i, j = 1, 2, \cdots, N $$

exists at each x ∈ dom f

2nd-order conditions: for twice differentiable f with convex domain

• f is convex if and only if

∇²f(x) ⪰ 0, for all x ∈ dom f

• if ∇²f(x) ≻ 0, for all x ∈ dom f, then f is strictly convex

Examples
quadratic function: f(x) = (1/2) x^T P x + q^T x + r (with P ∈ S^N)

∇f(x) = Px + q, ∇²f(x) = P

convex if P ⪰ 0

least-squares objective: f(x) = ‖Ax − b‖_2^2

∇f(x) = 2A^T (Ax − b), ∇²f(x) = 2A^T A

convex (for any A)

quadratic-over-linear: f(x, y) = x²/y

$$ \nabla^2 f(x, y) = \frac{2}{y^3} \begin{bmatrix} y \\ -x \end{bmatrix} \begin{bmatrix} y \\ -x \end{bmatrix}^T \succeq 0 $$

convex for y > 0

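A quick way to sanity-check a gradient formula such as the least-squares gradient 2A^T(Ax − b) is a central finite difference. The data `A`, `b`, `x` and the helper names below are made up for illustration.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def f(A, b, x):
    """Least-squares objective ||Ax - b||_2^2."""
    return sum((r - bi) ** 2 for r, bi in zip(matvec(A, x), b))

def grad(A, b, x):
    """Closed-form gradient 2 A^T (Ax - b)."""
    resid = [r - bi for r, bi in zip(matvec(A, x), b)]
    return [2 * sum(A[m][n] * resid[m] for m in range(len(A))) for n in range(len(x))]

A = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
b = [1.0, -2.0, 0.5]
x = [0.3, -0.7]

h = 1e-6
numeric = []
for n in range(len(x)):
    xp = list(x); xp[n] += h
    xm = list(x); xm[n] -= h
    numeric.append((f(A, b, xp) - f(A, b, xm)) / (2 * h))  # central difference

max_err = max(abs(a - c) for a, c in zip(grad(A, b, x), numeric))
print(max_err < 1e-6)
```
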
Examples

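log-sum-exp (soft max): $f(x) = \log \sum_{n=1}^{N} \exp x_n$ is convex

$$ \nabla^2 f(x) = \frac{1}{\mathbf{1}^T z} \mathrm{diag}(z) - \frac{1}{(\mathbf{1}^T z)^2} z z^T, \quad (z_n = \exp x_n) $$

to show ∇²f(x) ⪰ 0, we must verify that v^T ∇²f(x) v ≥ 0 for all v:

$$ v^T \nabla^2 f(x) v = \frac{\left(\sum_n z_n v_n^2\right)\left(\sum_n z_n\right) - \left(\sum_n v_n z_n\right)^2}{\left(\sum_n z_n\right)^2} \ge 0 $$

since $\left(\sum_n v_n z_n\right)^2 \le \left(\sum_n z_n v_n^2\right)\left(\sum_n z_n\right)$ (Cauchy-Schwarz inequality)

geometric mean: $f(x) = \left(\prod_{n=1}^{N} x_n\right)^{1/N}$ on R^N_++ is concave

(similar proof as for log-sum-exp)
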
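The quadratic form v^T ∇²f(x) v for log-sum-exp can be evaluated directly from the Hessian formula. The sketch below samples random x and v (ranges and names are illustrative assumptions) and records the smallest value, which should be nonnegative up to rounding.

```python
import math
import random

def lse_quadratic_form(x, v):
    """v^T H v for the log-sum-exp Hessian H at x, with z_n = exp(x_n):
    (sum z_n v_n^2) / (1^T z) - (sum z_n v_n)^2 / (1^T z)^2."""
    z = [math.exp(xn) for xn in x]
    s = sum(z)
    term_diag = sum(zn * vn * vn for zn, vn in zip(z, v)) / s
    term_rank1 = (sum(zn * vn for zn, vn in zip(z, v)) / s) ** 2
    return term_diag - term_rank1

rng = random.Random(2)
values = [
    lse_quadratic_form(
        [rng.uniform(-2, 2) for _ in range(4)],
        [rng.uniform(-1, 1) for _ in range(4)],
    )
    for _ in range(1000)
]
min_value = min(values)
print(min_value >= -1e-12)
```

The expression is the variance of v under the probabilities z_n / (1^T z), which is another way to see why it cannot be negative.
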
Epigraph and sublevel set

α-sublevel set of f : R^N → R

C_α = {x ∈ dom f | f(x) ≤ α}

sublevel sets of convex functions are convex (converse is false)

epigraph of f : R^N → R

epi f = {(x, t) ∈ R^{N+1} | x ∈ dom f, f(x) ≤ t}

a function f (in black) is convex if and only if the region above its graph (in green, epi f) is a convex set

two kinds of relations between convex set and convex function

Some convex functions constructed from convex sets

Let K ⊆ R^N be a convex set.

1. The characteristic function (equivalent to indicator) of K is:

$$ \chi_K(x) = \begin{cases} 0, & \text{if } x \in K \\ +\infty, & \text{otherwise} \end{cases} $$

2. Suppose that 0 ∈ K. The Minkowski function of K is:

μ_K(x) = inf {t > 0 : x ∈ tK}

note: epi μ_K is the light cone of K, which is a convex cone.

for t > 0, x, y ∈ R^N
• μ_K is positively homogeneous: μ_K(tx) = t μ_K(x)
• μ_K is subadditive: μ_K(x + y) ≤ μ_K(x) + μ_K(y)

μ_K(x) < 1 for x ∈ int K

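A minimal sketch of the Minkowski function, assuming K is given by a membership oracle and contains 0. The box K = [−2, 2] × [−1, 1], the bisection bound `t_hi`, and the helper names are illustrative assumptions. Bisection on t is valid because x ∈ tK implies x ∈ sK for every s ≥ t when K is convex and contains 0.

```python
def in_box(x):
    """Membership oracle for the convex set K = [-2, 2] x [-1, 1], which contains 0."""
    return abs(x[0]) <= 2.0 and abs(x[1]) <= 1.0

def minkowski(x, in_set, t_hi=1e6, iters=200):
    """Approximate inf{t > 0 : x in tK} by bisection, using x in tK <=> x/t in K.
    Assumes x / t_hi already lies in K."""
    t_lo = 0.0
    for _ in range(iters):
        t = 0.5 * (t_lo + t_hi)
        if in_set([xi / t for xi in x]):
            t_hi = t
        else:
            t_lo = t
    return t_hi

x = [1.0, 0.5]
y = [-3.0, 0.2]
mu_x = minkowski(x, in_box)
mu_y = minkowski(y, in_box)
mu_sum = minkowski([a + b for a, b in zip(x, y)], in_box)
mu_3x = minkowski([3 * a for a in x], in_box)
print(mu_x, mu_y, mu_sum, mu_3x)
```

For this box the function reduces to max(|x_1|/2, |x_2|), so the two listed properties can be compared against that closed form.
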
Jensen’s inequality

basic inequality: if f is convex, then for 0 ≤ θ ≤ 1

f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)

extension: if f is convex, then

f(E z) ≤ E f(z)

for any random variable z.

basic inequality is special case with discrete distribution

prob(z = x) = θ, prob(z = y) = 1 − θ

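For a discrete random variable both sides of Jensen's inequality are finite sums, so the inequality can be evaluated directly. The distribution below and the choice f = exp are illustrative assumptions.

```python
import math

# Discrete random variable z taking values[i] with probability probs[i].
values = [-1.0, 0.0, 2.0]
probs = [0.2, 0.5, 0.3]

mean_z = sum(p * z for p, z in zip(probs, values))
f_of_mean = math.exp(mean_z)                                      # f(E z)
mean_of_f = sum(p * math.exp(z) for p, z in zip(probs, values))   # E f(z)
print(f_of_mean <= mean_of_f)
```
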
Operations that preserve convexity

practical methods for establishing convexity of a function

1. verify definition (often simplified by restricting to a line)


2. for twice differentiable functions, show ∇²f(x) ⪰ 0
3. show that f is obtained from simple convex functions by operations that
preserve convexity
• nonnegative weighted sum
• composition with affine function
• pointwise maximum and supremum
• composition
• minimization
• perspective

Positive weighted sum & composition with affine function

nonnegative multiple: αf is convex if f is convex, α ≥ 0

sum: f_1 + f_2 convex if f_1, f_2 convex (extends to infinite sums, integrals).

composition with affine function: f(Ax + b) is convex if f is convex

examples

• log barrier for linear inequalities

$$ f(x) = -\sum_{m=1}^{M} \log\left(b_m - a_m^T x\right), \quad \mathrm{dom}\, f = \{x \mid a_m^T x < b_m,\ m = 1, \cdots, M\} $$

• (any) norm of affine function: f(x) = ‖Ax + b‖

Pointwise maximum

if f_1, · · · , f_M are convex, then f(x) = max{f_1(x), · · · , f_M(x)} is convex

proof: a function is convex iff its epigraph is convex, and the epigraph of a pointwise maximum is the intersection of the epigraphs; an intersection of convex sets is convex ⇒ the pointwise maximum of convex functions is convex

examples

• piecewise-linear function f(x) = max_{m=1,··· ,M} (a_m^T x + b_m) is convex

• sum of K largest components of x ∈ R^N:

f(x) = x_[1] + x_[2] + · · · + x_[K]

is convex (x_[k] is the kth largest component of x)

f(x) = max {x_{n_1} + x_{n_2} + · · · + x_{n_K} | 1 ≤ n_1 < n_2 < · · · < n_K ≤ N}

i.e., the max of all the functions which select K entries from x and sum them.

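The two descriptions of the sum of the K largest components (sort and add, versus the maximum over all K-element selections) can be compared on a small vector. The function names and data are illustrative; the brute-force maximum is only practical for small N.

```python
from itertools import combinations

def sum_k_largest(x, k):
    """Sort in decreasing order and add the first k entries."""
    return sum(sorted(x, reverse=True)[:k])

def max_over_subsets(x, k):
    """Pointwise maximum of the linear functions that pick k entries and sum them."""
    return max(sum(c) for c in combinations(x, k))

x = [3.0, -1.0, 4.0, 1.5, 0.0]
direct = sum_k_largest(x, 2)
as_max = max_over_subsets(x, 2)
print(direct, as_max)
```
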
Pointwise supremum
if f(x, y) is convex in x for each y ∈ C, then

g(x) = sup_{y∈C} f(x, y)

is convex

note that f(x, y) does not need to be jointly convex in (x, y)

examples

• support function of a set C:

S_C(x) = sup_{y∈C} x^T y

• distance to farthest point in a set C:

f(x) = sup_{y∈C} ‖x − y‖

• maximum eigenvalue of symmetric matrix: for X ∈ S^N

λ_max(X) = sup_{‖y‖_2 = 1} y^T X y

Composition with scalar functions

composition of g : R^N → R and h : R → R:

f(x) = h(g(x))

f is convex if
• g convex, h convex, h̃ nondecreasing, or
• g concave, h convex, h̃ nonincreasing

• proof (for N = 1, differentiable g, h)

f''(x) = h''(g(x)) g'(x)² + h'(g(x)) g''(x)

• note: monotonicity must hold for extended-value extension h̃

examples

• exp g(x) is convex if g is convex

• 1/g(x) is convex if g is concave and positive

Vector composition
composition of g : R^N → R^K and h : R^K → R:

f(x) = h(g(x)) = h(g_1(x), g_2(x), · · · , g_K(x))

f is convex if
• g_k convex, h convex, h̃ nondecreasing in each argument, or
• g_k concave, h convex, h̃ nonincreasing in each argument

proof (for N = 1, differentiable g, h)

f''(x) = g'(x)^T ∇²h(g(x)) g'(x) + ∇h(g(x))^T g''(x)

examples
• $\sum_{m=1}^{M} \log g_m(x)$ is concave if g_m are concave and positive
• $\log \sum_{m=1}^{M} \exp g_m(x)$ is convex if g_m are convex

Infimum

if f(x, y) is convex in (x, y) and C is a convex set, then

g(x) = inf_{y∈C} f(x, y)

is convex

example

• f(x, y) = x^T A x + 2 x^T B y + y^T C y with

$$ \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \succeq 0, \quad C \succ 0 $$

minimizing over y gives g(x) = inf_y f(x, y) = x^T (A − B C^{-1} B^T) x

g is convex, hence Schur complement A − B C^{-1} B^T ⪰ 0
• distance to a set: dist(x, S) = inf_{y∈S} ‖x − y‖ is convex if S is convex

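In the scalar case the partial minimization behind the Schur complement can be done by hand: with f(x, y) = a x² + 2b x y + c y² and c > 0, the minimizer over y is y = −b x / c. The coefficient values and names below are illustrative assumptions for this sketch.

```python
a, b, c = 2.0, 1.0, 1.0   # [[a, b], [b, c]] is PSD since a*c - b*b >= 0, and c > 0

def f(x, y):
    """Scalar instance of x^T A x + 2 x^T B y + y^T C y."""
    return a * x * x + 2 * b * x * y + c * y * y

def g(x):
    """Closed form of inf_y f(x, y): the scalar Schur complement (a - b^2/c) times x^2."""
    return (a - b * b / c) * x * x

x = 1.5
y_star = -b * x / c                  # stationary point of f(x, .) in y
value_at_minimizer = f(x, y_star)
nearby = [f(x, y_star + d) for d in (-0.1, 0.1)]
print(value_at_minimizer, g(x), nearby)
```
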
Perspective
the perspective of a function f : R^N → R is the function g : R^N × R → R,

g(x, t) = t f(x/t), dom g = {(x, t) | x/t ∈ dom f, t > 0}

g is convex if f is convex

examples

• f(x) = x^T x is convex; hence g(x, t) = x^T x / t is convex for t > 0

• negative logarithm f(x) = −log x is convex; hence relative entropy g(x, t) = t log t − t log x is convex on R^2_++.

• if f is convex, then

$$ g(x) = (c^T x + d)\, f\!\left(\frac{Ax + b}{c^T x + d}\right) $$

is convex on

$$ \left\{ x \;\middle|\; c^T x + d > 0,\ \frac{Ax + b}{c^T x + d} \in \mathrm{dom}\, f \right\} $$

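The perspective operation is easy to express as a higher-order function. The sketch below builds g(x, t) = t f(x/t) for the scalar f(x) = x² (so g(x, t) = x²/t) and samples the joint convexity inequality in (x, t). The helper name `perspective` and the sampling ranges are assumptions.

```python
import random

def perspective(f):
    """Return g(x, t) = t * f(x / t), defined for t > 0."""
    def g(x, t):
        if t <= 0:
            raise ValueError("perspective requires t > 0")
        return t * f(x / t)
    return g

g = perspective(lambda x: x * x)   # g(x, t) = x^2 / t

rng = random.Random(3)
worst = 0.0   # largest observed violation of joint convexity in (x, t)
for _ in range(1000):
    x1, t1 = rng.uniform(-3, 3), rng.uniform(0.1, 3)
    x2, t2 = rng.uniform(-3, 3), rng.uniform(0.1, 3)
    th = rng.random()
    lhs = g(th * x1 + (1 - th) * x2, th * t1 + (1 - th) * t2)
    rhs = th * g(x1, t1) + (1 - th) * g(x2, t2)
    worst = max(worst, lhs - rhs)
print(worst <= 1e-9)
```
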
Conjugate function
the conjugate of a function f is

f*(y) = sup_{x ∈ dom f} (y^T x − f(x))

for fixed y, the map x ↦ y^T x is a line through the origin with slope y

• f*(y) is the maximum gap between the linear function y^T x and f(x)
• f* is convex (even if f is not), since it is the pointwise supremum of
convex (affine) functions of y
• for differentiable f, conjugation is called the Legendre transform

Conjugate function
examples

• negative logarithm f(x) = −log x

$$ f^*(y) = \sup_{x > 0} (yx + \log x) = \begin{cases} -1 - \log(-y), & y < 0 \\ \infty, & \text{otherwise} \end{cases} $$

• strictly convex quadratic f(x) = (1/2) x^T Q x with Q ∈ S^N_++

$$ f^*(y) = \sup_x \left( y^T x - \tfrac{1}{2} x^T Q x \right) = \tfrac{1}{2} y^T Q^{-1} y $$

• indicator function f(x) = 1_C(x)

f*(y) = 1*_C(y) = sup_{x∈C} y^T x

called the support function of C

• norm f(x) = ‖x‖

f*(y) = 1_{{y : ‖y‖_* ≤ 1}}(y), the indicator of the dual-norm unit ball

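A conjugate can be approximated by replacing the supremum with a maximum over a grid. The sketch below does this for f(x) = −log x and compares with the closed form −1 − log(−y). The grid, the test value of y, and the function names are illustrative assumptions, and a grid maximum is only a lower bound on the true supremum.

```python
import math

def conj_neg_log_numeric(y, grid):
    """Grid approximation of sup_{x > 0} (y*x + log x)."""
    return max(y * x + math.log(x) for x in grid)

def conj_neg_log_closed(y):
    """Closed form of the conjugate of -log x."""
    return -1.0 - math.log(-y) if y < 0 else math.inf

grid = [k / 1000 for k in range(1, 20001)]   # x in (0, 20]
y = -0.5                                     # the maximizer x = -1/y = 2 lies on the grid
numeric = conj_neg_log_numeric(y, grid)
closed = conj_neg_log_closed(y)
print(numeric, closed)
```
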
Conjugate function

Properties

• Fenchel’s inequality: for any x, y,

f(x) + f*(y) ≥ x^T y

• Hence conjugate of conjugate f** satisfies f** ≤ f

• If f is closed and convex, then f** = f

• If f is closed and convex, then for any x, y,

x ∈ ∂f*(y) ⇔ y ∈ ∂f(x) ⇔ f(x) + f*(y) = x^T y

• If f(u, v) = f_1(u) + f_2(v) (here u ∈ R^N, v ∈ R^M), then

f*(w, z) = f_1*(w) + f_2*(z)

Quasiconvex functions
Definition 1: f : R^N → R is quasiconvex if dom f is convex and the sublevel sets

S_α = {x ∈ dom f | f(x) ≤ α}

are convex for all α

• f is quasiconcave if −f is quasiconvex

• f is quasilinear if it is quasiconvex and quasiconcave

examples

• √|x| is quasiconvex on R

• ceil(x) = inf {z ∈ Z | z ≥ x} is quasilinear

• log x is quasilinear on R_++

• f(x_1, x_2) = x_1 x_2 is quasiconcave on R^2_++

• linear-fractional function

$$ f(x) = \frac{a^T x + b}{c^T x + d}, \quad \mathrm{dom}\, f = \{x \mid c^T x + d > 0\} $$

is quasilinear

• distance ratio

$$ f(x) = \frac{\|x - a\|_2}{\|x - b\|_2}, \quad \mathrm{dom}\, f = \{x \mid \|x - a\|_2 \le \|x - b\|_2\} $$

is quasiconvex

Quasiconvex functions

internal rate of return

• cash flow x = [x_0, · · · , x_N]^T; x_n is payment in period n (to us if x_n > 0)

• we assume x_0 < 0 (investment) and x_0 + x_1 + · · · + x_N > 0

• present value of cash flow x, for interest rate r:

$$ \mathrm{PV}(x, r) = \sum_{n=0}^{N} (1 + r)^{-n} x_n $$

• internal rate of return is the smallest interest rate for which PV(x, r) = 0:

IRR(x) = inf {r ≥ 0 | PV(x, r) = 0}

IRR is quasiconcave: superlevel set is intersection of halfspaces

$$ \mathrm{IRR}(x) \ge R \iff \sum_{n=0}^{N} (1 + r)^{-n} x_n > 0 \ \text{ for } 0 \le r < R $$

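A hedged sketch of computing the internal rate of return by bisection. It assumes PV(x, 0) > 0, PV(x, r_hi) < 0, and a single sign change in between, which holds for the illustrative cash flows below (one initial investment followed by nonnegative payments). The bracket `r_hi` and all names are assumptions.

```python
def present_value(x, r):
    """PV(x, r) = sum_n (1 + r)^(-n) x_n."""
    return sum(xn / (1 + r) ** n for n, xn in enumerate(x))

def irr(x, r_hi=10.0, iters=200):
    """Bisection for the root of PV(x, .) on [0, r_hi].
    Assumes PV > 0 at r = 0, PV < 0 at r_hi, and exactly one sign change."""
    r_lo = 0.0
    for _ in range(iters):
        r = 0.5 * (r_lo + r_hi)
        if present_value(x, r) > 0:
            r_lo = r
        else:
            r_hi = r
    return 0.5 * (r_lo + r_hi)

x = [-100.0, 110.0]          # PV = 0 at r = 0.1
y = [-100.0, 0.0, 144.0]     # PV = 0 at r = 0.2
z = [0.5 * (a + b) for a, b in zip(x + [0.0], y)]   # midpoint cash flow

irr_x, irr_y, irr_z = irr(x), irr(y), irr(z)
print(irr_x, irr_y, irr_z)
```

Quasiconcavity predicts that the IRR of the midpoint cash flow is at least the smaller of the two individual IRRs, which can be compared on the printed values.
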
Properties of quasiconvex functions

modified Jensen inequality (Definition 2): for quasiconvex f

0 ≤ θ ≤ 1 ⇒ f(θx + (1 − θ)y) ≤ max{f(x), f(y)}

first-order condition: differentiable f with convex domain is quasiconvex iff

f(y) ≤ f(x) ⇒ ∇f(x)^T (y − x) ≤ 0

sums of quasiconvex functions are not necessarily quasiconvex

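The modified Jensen inequality can be sampled for f(x) = √|x|, which is quasiconvex on R but not convex. The sampling range, seed, and tolerance are illustrative assumptions. One explicit point pair shows the ordinary convexity inequality failing.

```python
import math
import random

def f(x):
    """Quasiconvex but not convex on R."""
    return math.sqrt(abs(x))

rng = random.Random(4)
quasi_ok = True
for _ in range(10_000):
    x, y, theta = rng.uniform(-5, 5), rng.uniform(-5, 5), rng.random()
    # Modified Jensen: f(theta*x + (1-theta)*y) <= max{f(x), f(y)}
    if f(theta * x + (1 - theta) * y) > max(f(x), f(y)) + 1e-12:
        quasi_ok = False

# Ordinary convexity fails at x = 0, y = 1, theta = 0.5: the gap below is positive.
convex_gap = f(0.5) - (0.5 * f(0.0) + 0.5 * f(1.0))
print(quasi_ok, convex_gap)
```
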
Strictly locally quasiconvex function

let x, z ∈ R^N and κ, ε > 0; f : R^N → R is (ε, κ, z)-strictly locally quasiconvex (SLQC) in x if at least one of the following applies:

• f(x) − f(z) ≤ ε

• ‖∇f(x)‖_2 > 0, and for every y ∈ B(z, ε/κ) it holds that ⟨∇f(x), y − x⟩ ≤ 0

L-Lipschitz + strictly quasiconvex ⇒ (ε, L, z)-SLQC

note: Lipschitz continuity: |f(x) − f(y)| ≤ L ‖x − y‖, ∀ x, y ∈ C

normalized gradient descent methods can solve SLQC optimization problems

Log-concave and log-convex functions

a positive function f is log-concave if log f is concave:

f(θx + (1 − θ)y) ≥ f(x)^θ f(y)^{1−θ}, for 0 ≤ θ ≤ 1

f is log-convex if log f is convex

• powers: x^a on R_++ is log-convex for a ≤ 0, log-concave for a ≥ 0 (since log x^a = a log x)

• many common probability densities are log-concave, e.g., normal:

$$ f(x) = \frac{1}{\sqrt{(2\pi)^N \det\Sigma}} \exp\left(-\frac{1}{2}(x - \bar{x})^T \Sigma^{-1} (x - \bar{x})\right) $$

• cumulative Gaussian distribution function φ is log-concave

$$ \varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{u^2}{2}\right) du $$

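Log-concavity of the cumulative Gaussian distribution function can be spot-checked with second differences of its logarithm. The sketch writes the CDF via `math.erf`; the grid on [−3, 3], the step size, and the names are illustrative assumptions.

```python
import math

def gaussian_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

h = 0.1
grid = [-3.0 + h * k for k in range(61)]               # points in [-3, 3]
log_cdf = [math.log(gaussian_cdf(x)) for x in grid]

# For a concave function, centered second differences are nonpositive.
second_diffs = [log_cdf[k - 1] - 2 * log_cdf[k] + log_cdf[k + 1] for k in range(1, 60)]
max_second_diff = max(second_diffs)
print(max_second_diff < 0)
```
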
Properties of log-concave functions

• twice differentiable f with convex domain is log-concave if and only if

f(x) ∇²f(x) ⪯ ∇f(x) ∇f(x)^T

for all x ∈ dom f

• product of log-concave functions is log-concave

• sum of log-concave functions is not always log-concave

• integration: if f : R^N × R^M → R is log-concave, then

$$ g(x) = \int f(x, y)\, dy $$

is log-concave (not easy to show)

Properties of log-concave functions

consequences of integration property

• convolution f ∗ g of log-concave functions f, g is log-concave

$$ (f * g)(x) = \int f(x - y)\, g(y)\, dy $$

• if C is convex and y is a random variable with log-concave pdf, then

f(x) = prob(x + y ∈ C)

is log-concave

proof: write f(x) as integral of product of log-concave functions

$$ f(x) = \int g(x + y)\, p(y)\, dy, \qquad g(u) = \begin{cases} 1, & u \in C \\ 0, & u \notin C \end{cases} $$

p is pdf of y

Properties of log-concave functions

example: yield function

h(x) = prob(x + w ∈ S)

• x ∈ R^N: nominal/target parameter values for product

• w ∈ R^N: random variations of parameters in manufactured product

• S: set of acceptable values

if S is convex and w has a log-concave pdf, then

• h is log-concave

• yield regions {x | h(x) ≥ α} are convex

Convexity with respect to generalized inequalities

f : R^N → R^M is K-convex if dom f is convex and

f(θx + (1 − θ)y) ⪯_K θf(x) + (1 − θ)f(y)

for x, y ∈ dom f, 0 ≤ θ ≤ 1

example: f : S^M → S^M, f(X) = X² is S^M_+-convex

proof: for fixed z ∈ R^M, z^T X² z = ‖Xz‖_2^2 is convex in X, i.e.,

z^T (θX + (1 − θ)Y)² z ≤ θ z^T X² z + (1 − θ) z^T Y² z

for X, Y ∈ S^M, 0 ≤ θ ≤ 1

therefore, (θX + (1 − θ)Y)² ⪯ θX² + (1 − θ)Y²