
Lecture 2
Convex functions

ELL 822–Selected Topics in Communications
Ref: [Boyd] Chapter 3
Jun B. Seo

Convex functions I

Let f : S → R, where S is a nonempty convex set in R^n. The function f is convex on S if

    f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2)

for each x1, x2 ∈ S and for each λ ∈ (0, 1).

f is strictly convex on S if the above inequality holds strictly for all x1 ≠ x2.
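To make the definition concrete, here is a minimal numpy sketch that samples random pairs and checks the defining inequality for a simple convex quadratic; the particular function and sample sizes are illustrative choices, not part of the lecture.

```python
import numpy as np

def f(x):
    # A simple convex quadratic on R^2 (illustrative choice; Hessian [[2,1],[1,4]] is PSD).
    return x[0]**2 + 2.0 * x[1]**2 + x[0] * x[1]

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    lhs = f(lam * x1 + (1 - lam) * x2)          # f(λx1 + (1-λ)x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)       # λf(x1) + (1-λ)f(x2)
    assert lhs <= rhs + 1e-9, "convexity inequality violated"
print("definition holds on all sampled pairs")
```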

Level sets

Level set of a function f with level α is defined as

    S = {x | f(x) = α}

– Example: the Styblinski-Tang function f(x) = 0.5 Σ_{i=1}^n (x_i^4 − 16x_i^2 + 5x_i)

[Figure: surface and contour plots of the Styblinski-Tang function over x1, x2 ∈ [−4, 4]]

Sublevel sets

• Sublevel set associated with f is defined as (for α ∈ R)

    Sα = {x ∈ dom f | f(x) ≤ α}

• Let S be a nonempty convex set in R^n and let f : S → R be a convex function. Then, the sublevel set Sα is convex:

Proof Suppose x1, x2 ∈ Sα. Thus, we have x1, x2 ∈ S and f(x1) ≤ α and f(x2) ≤ α.
Consider x = λx1 + (1 − λ)x2 for λ ∈ (0, 1). By convexity of S, we have x ∈ S.
Since f is convex, we write

    f(x) ≤ λf(x1) + (1 − λ)f(x2) ≤ λα + (1 − λ)α = α,

so x ∈ Sα.
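As a contrast to the theorem, the non-convex Styblinski-Tang function from the figure has non-convex sublevel sets. The sketch below evaluates it at two points near different local minima and at their midpoint; the specific points and the level α = −60 are illustrative choices, not from the lecture.

```python
import numpy as np

def styblinski_tang(x):
    x = np.asarray(x, dtype=float)
    return 0.5 * np.sum(x**4 - 16 * x**2 + 5 * x)

alpha = -60.0                      # illustrative level
xa = np.array([-2.9, -2.9])        # near one local minimum, f(xa) ~ -78.3
xb = np.array([-2.9,  2.75])       # near another local minimum, f(xb) ~ -64.2
xm = 0.5 * (xa + xb)               # their midpoint, f(xm) ~ -39.4

print(styblinski_tang(xa) <= alpha)   # True
print(styblinski_tang(xb) <= alpha)   # True
print(styblinski_tang(xm) <= alpha)   # False: S_alpha is not convex here
```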
Epigraph I

• Let S be a nonempty set in R^n and let f : S → R.
• The graph of f is described by the set {(x, f(x)) | x ∈ S} ⊂ R^{n+1}
• The epigraph of f, denoted by epi f, is a subset of R^{n+1}, i.e.,

    epi f = {(x, y) | x ∈ S, y ∈ R, y ≥ f(x)}

• Let S be a nonempty convex set. Then, f is convex if and only if epi f is a convex set:

Epigraph II

Proof
Suppose that f is convex and let (x1, y1), (x2, y2) ∈ epi f
– It means that x1, x2 ∈ S and y1 ≥ f(x1) and y2 ≥ f(x2).
– Convexity of f enables us to write

    f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2) ≤ λy1 + (1 − λ)y2

– Since λx1 + (1 − λ)x2 ∈ S, we have

    (λx1 + (1 − λ)x2, λy1 + (1 − λ)y2) ∈ epi f,

  so epi f is convex.

Epigraph III

Proof continued
Conversely, assume that epi f is convex and let x1, x2 ∈ S.
Then, since (x1, f(x1)) ∈ epi f and (x2, f(x2)) ∈ epi f, convexity of epi f gives, for λ ∈ (0, 1),

    (λx1 + (1 − λ)x2, λf(x1) + (1 − λ)f(x2)) ∈ epi f.

It implies

    f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2),

so f is convex.

Epigraph IV

Constrained optimization problem in standard form

    minimize_x  f0(x)
    subject to  fi(x) ≤ 0, for i = 1, . . . , m
                hj(x) = 0, for j = 1, . . . , p

We can rewrite this in epigraph form as

    minimize_{x,t}  t
    subject to      f0(x) − t ≤ 0
                    fi(x) ≤ 0, for i = 1, . . . , m
                    hj(x) = 0, for j = 1, . . . , p

– Every convex optimization problem can be transformed into a problem with a linear objective function
Epigraph V

The epigraph form is an optimization problem in the (epi)graph space (x, t):

    minimize_x  f0(x) = |x|        ⇒    minimize_{x,t}  t
    subject to  −x + 1 ≤ 0               subject to      |x| − t ≤ 0
                                                         −x + 1 ≤ 0

[Figure: the feasible region {(x, t) | |x| ≤ t, x ≥ 1} in the (x, t) plane]

First-order condition of convex functions I

Let S be a nonempty open convex set in R^n and let f : S → R be differentiable on S.
Then, f is convex if and only if for any x ∈ S, we have

    f(y) ≥ f(x) + ∇f(x)^T (y − x)   for each y ∈ S,

where ∇f(x) = [∂f(x)/∂x1, . . . , ∂f(x)/∂xn]^T is the gradient of f.

– The first-order approximation of f at x is a global lower bound
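A minimal sketch of this epigraph reformulation, assuming the CVXPY package is available (the package choice and variable names are mine, not part of the lecture):

```python
import cvxpy as cp

x = cp.Variable()
t = cp.Variable()

# Epigraph form of: minimize |x| subject to -x + 1 <= 0
prob = cp.Problem(cp.Minimize(t),
                  [cp.abs(x) - t <= 0,   # f0(x) - t <= 0
                   -x + 1 <= 0])         # original constraint
prob.solve()
print(x.value, t.value)                  # both close to 1.0
```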

First-order condition of convex functions II

Proof
By convexity of f, for α ∈ (0, 1),

    f(αy + (1 − α)x) ≤ αf(y) + (1 − α)f(x) = α(f(y) − f(x)) + f(x)

We rewrite this as

    f(x + α(y − x)) − f(x) ≤ α(f(y) − f(x)).

Dividing by α gives

    [f(x + α(y − x)) − f(x)] / α ≤ f(y) − f(x).

Letting α → 0, the left-hand side tends to the directional derivative of f at x along y − x, so

    f(y) ≥ f(x) + ∇f(x)^T (y − x)

First-order condition of convex functions III

Proof continued
To show the converse, consider a point t = αx + (1 − α)y for α ∈ (0, 1).
We need to show that f is convex if the following holds:

    f(x) ≥ f(t) + ∇f(t)^T (x − t)
    f(y) ≥ f(t) + ∇f(t)^T (y − t)

Multiplying these with α and 1 − α, respectively, we have

    αf(x) ≥ αf(t) + α∇f(t)^T (x − t)
    (1 − α)f(y) ≥ (1 − α)f(t) + (1 − α)∇f(t)^T (y − t)

Adding them yields

    αf(x) + (1 − α)f(y) ≥ f(t) + ∇f(t)^T (αx + (1 − α)y − t) = f(αx + (1 − α)y),

since t = αx + (1 − α)y, so the gradient term vanishes and f is convex.
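A quick numerical check of the first-order lower bound, using log-sum-exp as an illustrative convex function with a closed-form gradient; the function choice and sample sizes are mine, not the lecture's.

```python
import numpy as np

def f(x):
    return np.log(np.sum(np.exp(x)))         # log-sum-exp, convex

def grad_f(x):
    e = np.exp(x)
    return e / e.sum()                        # softmax gradient of log-sum-exp

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    lower_bound = f(x) + grad_f(x) @ (y - x)  # first-order approximation at x
    assert f(y) >= lower_bound - 1e-9
print("f(y) >= f(x) + grad f(x)^T (y - x) on all samples")
```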
First-order condition of convex functions IV

• If f is convex and x, y ∈ dom f, we have

    t ≥ f(y) ≥ f(x) + ∇f(x)^T (y − x)   for (y, t) ∈ epi f

• Hence epi f has a supporting hyperplane with normal [∇f(x), −1] at (x, f(x)):

    (y, t) ∈ epi f  ⇒  [∇f(x)^T  −1] ([y; t] − [x; f(x)]) ≤ 0

• The global minimum of a convex function f is attained at x if and only if ∇f(x) = 0

[Figure: non-vertical supporting hyperplanes to epi f]

Convex functions II

Let S be a nonempty convex set in R^n and let f : S → R be differentiable on S.
Then, f is convex if and only if for each x1, x2 ∈ S, we have

    [∇f(x2) − ∇f(x1)]^T (x2 − x1) ≥ 0    (monotone gradient)

Proof If f is convex, for two distinct x1 and x2 we have

    f(x1) ≥ f(x2) + ∇f(x2)^T (x1 − x2)
    f(x2) ≥ f(x1) + ∇f(x1)^T (x2 − x1)

Adding the two inequalities side-by-side, we have

    ∇f(x2)^T (x1 − x2) + ∇f(x1)^T (x2 − x1) ≤ 0,

which is the stated monotonicity condition.

Convex functions III

Proof continued
To prove the converse, by assumption the following holds for x = λx1 + (1 − λ)x2 with λ ∈ (0, 1):

    [∇f(x) − ∇f(x1)]^T (x − x1) ≥ 0.

Since x − x1 = (1 − λ)(x2 − x1), we have (1 − λ)[∇f(x) − ∇f(x1)]^T (x2 − x1) ≥ 0, i.e.,

    ∇f(x)^T (x2 − x1) ≥ ∇f(x1)^T (x2 − x1)

Using the mean value theorem, i.e.,

    f(x2) − f(x1) = ∇f(x)^T (x2 − x1)

for some x = λx1 + (1 − λ)x2 with λ ∈ (0, 1), we obtain f(x2) − f(x1) ≥ ∇f(x1)^T (x2 − x1), which is the first-order condition, so f is convex.

Second-order condition of convex functions I

• Let S^n denote the set of symmetric n × n matrices, i.e.,

    S^n = {X ∈ R^{n×n} | X = X^T}

– S^n_{++} (resp. S^n_+) denotes the set of symmetric positive (semi)definite matrices H, i.e., z^T H z > 0 for all nonzero z ∈ R^n (resp. z^T H z ≥ 0 for all z ∈ R^n)
• Let S be a nonempty open convex set in R^n and let f : S → R be twice differentiable on S.
• f is convex if and only if its Hessian matrix

    H(x) = [hij(x)]   with   hij(x) = ∂²f(x)/∂xi∂xj

  satisfies H(x) ∈ S^n_+ over S; if H(x) ∈ S^n_{++} over S, then f is strictly convex (the converse of the strict statement does not hold in general, e.g., f(x) = x^4)
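A small numpy check of the second-order condition on two quadratics (the same pair used later in the restriction-to-a-line example); the functions are fixed here only for illustration.

```python
import numpy as np

# f1(x1, x2) = x1^2 + x2^2  -> Hessian diag(2, 2)
# f2(x1, x2) = x1^2 - x2^2  -> Hessian diag(2, -2)
H1 = np.array([[2.0, 0.0], [0.0, 2.0]])
H2 = np.array([[2.0, 0.0], [0.0, -2.0]])

for name, H in [("x1^2 + x2^2", H1), ("x1^2 - x2^2", H2)]:
    eigs = np.linalg.eigvalsh(H)          # eigenvalues of the symmetric Hessian
    verdict = "convex" if np.all(eigs >= 0) else "not convex"
    print(name, verdict, eigs)
```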
Second-order condition of convex functions II

Proof
Using convexity of f, f(y) ≥ f(x̄) + ∇f(x̄)^T (y − x̄). For y = x̄ + λx ∈ S with an arbitrary direction x and small λ > 0, this gives

    f(x̄ + λx) ≥ f(x̄) + λ∇f(x̄)^T x

Using the Taylor expansion of f, we also have

    f(x̄ + λx) = f(x̄) + λ∇f(x̄)^T x + (1/2) λ² x^T H(x̄) x + λ² ‖x‖² O(x̄; λx),

where O(x̄; λx) → 0 as λ → 0.
Plugging this in and dividing by λ² yields

    (1/2) x^T H(x̄) x + ‖x‖² O(x̄; λx) ≥ 0,

and letting λ → 0 gives x^T H(x̄) x ≥ 0.

Second-order condition of convex functions III

Proof continued
To show the converse, use the 'mean value theorem' extended to second order:
Let f : R^n → R be twice continuously differentiable over an open set S, and x̄ ∈ S.
For all y such that x̄ + y ∈ S there exists an α ∈ [0, 1] such that

    f(x̄ + y) = f(x̄) + y^T ∇f(x̄) + (1/2) y^T ∇²f(x̄ + αy) y

In using the mean value theorem, let x = x̄ + y ∈ S. Then,

    f(x) = f(x̄) + y^T ∇f(x̄) + (1/2) y^T ∇²f(x̄ + αy) y

Second-order condition of convex functions IV

Proof continued
The point x̄ + αy in ∇²f(x̄ + αy) can be expressed as

    x̄ + αy = x̄ + α(x − x̄) = αx + (1 − α)x̄ =: x̂

The theorem gives

    f(x) = f(x̄) + y^T ∇f(x̄) + (1/2) y^T ∇²f(x̂) y

If (1/2) y^T ∇²f(x̂) y ≥ 0, then

    f(x) ≥ f(x̄) + ∇f(x̄)^T (x − x̄),

which is the first-order condition and completes the proof.

Restriction of a convex function to a line I

• f : R^n → R is convex if and only if g : R → R,

    g(t) = f(x + tv)   for dom g = {t | x + tv ∈ dom f},

  is convex (in t) for any x ∈ dom f, v ∈ R^n
: used to check convexity of f by checking convexity of functions of one variable

    f(x1, x2) = x1² + x2²  (convex)        f(x1, x2) = x1² − x2²  (not convex)

[Figure: surface plots of the two functions over x1, x2]
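A sketch of this line-restriction test: evaluate g(t) = f(x + tv) on a grid for a random base point and check the discrete second difference. The grid, tolerance, and the two test functions are illustrative choices.

```python
import numpy as np

def restricted_to_line_convex(f, x, v, ts):
    g = np.array([f(x + t * v) for t in ts])
    second_diff = g[2:] - 2 * g[1:-1] + g[:-2]   # discrete curvature of g
    return np.all(second_diff >= -1e-9)

f_convex = lambda z: z[0]**2 + z[1]**2           # convex example from the slide
f_saddle = lambda z: z[0]**2 - z[1]**2           # non-convex example from the slide

rng = np.random.default_rng(2)
ts = np.linspace(-3, 3, 61)
x, v = rng.normal(size=2), np.array([0.0, 1.0])  # restrict along the x2 direction

print(restricted_to_line_convex(f_convex, x, v, ts))  # True
print(restricted_to_line_convex(f_saddle, x, v, ts))  # False: g(t) is concave along x2
```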
Restriction of a convex function to a line II

f : S^n → R and f(X) = log det X for dom f = S^n_{++}. Show whether f is convex or not.
Proof
Restrict f to the line X + tV with X ∈ S^n_{++} and V ∈ S^n:

    g(t) = log det(X + tV) = log det(X^{1/2} (I + t X^{−1/2} V X^{−1/2}) X^{1/2})
         = log det(X (I + t X^{−1/2} V X^{−1/2}))
         = log det X + log det(I + t X^{−1/2} V X^{−1/2})
         = log det X + log det(I + t Q Λ Q^T)
         = log det X + log det(Q (I + tΛ) Q^T),

where the real symmetric matrix X^{−1/2} V X^{−1/2} = Q Λ Q^T with Q Q^T = Q^T Q = I, and Λ is a diagonal matrix of the eigenvalues λ1, . . . , λn of X^{−1/2} V X^{−1/2}

Restriction of a convex function to a line III

Proof continued

    g(t) = log det X + log det(Q (I + tΛ) Q^T)
         = log det X + log det((I + tΛ) Q^T Q)
         = log det X + log det(I + tΛ) = log det X + log Π_{i=1}^n (1 + tλi)
         = log det X + Σ_{i=1}^n log(1 + tλi)

By examining g″(t), i.e.,

    g″(t) = − Σ_{i=1}^n λi² / (1 + tλi)² ≤ 0,

we see that g is concave for every X and V, hence f is concave.
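A numerical sanity check of this calculation using only numpy: compare the closed-form g″(t) with a finite-difference estimate for a random X ∈ S^n_{++} and V ∈ S^n. The random seed, dimension, and step size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.normal(size=(n, n))
X = A @ A.T + n * np.eye(n)          # random positive definite X
B = rng.normal(size=(n, n))
V = 0.5 * (B + B.T)                  # random symmetric V

# Eigenvalues of X^{-1/2} V X^{-1/2}
w, U = np.linalg.eigh(X)
Xm12 = U @ np.diag(w**-0.5) @ U.T
lam = np.linalg.eigvalsh(Xm12 @ V @ Xm12)

def g(t):
    return np.linalg.slogdet(X + t * V)[1]   # log det(X + tV)

t, h = 0.1, 1e-4
g2_numeric = (g(t + h) - 2 * g(t) + g(t - h)) / h**2
g2_formula = -np.sum(lam**2 / (1 + t * lam)**2)
print(g2_numeric, g2_formula)        # both negative and approximately equal
```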

Operations that preserve convexity I

• Every norm on R^n is convex
• Let f1, f2, . . . , fk : R^n → R be convex functions.
– Nonnegative weighted sum:

    f(x) = Σ_{i=1}^k αi fi(x)

  is convex for αi ≥ 0
– Pointwise maximum or supremum:

    f(x) = max{f1(x), . . . , fk(x)}

  is convex
– Composition with an affine mapping: suppose f : R^n → R and h(x) = f(Ax + b). If f is convex, so is h

Operations that preserve convexity II

Scalar composition f = h(g(x)), where h : R → R and g : R^n → R
• f is convex if h is convex and nondecreasing, and g is convex
• f is convex if h is convex and nonincreasing, and g is concave
• f is concave if h is concave and nondecreasing, and g is concave
• f is concave if h is concave and nonincreasing, and g is convex
Operations that preserve convexity III

• Extended-value extension f̃ of a convex function f defined on the convex set S:

    f̃(x) = f(x) if x ∈ S,   f̃(x) = ∞ if x ∉ S

– f̃ is defined on R^n, and takes values in R ∪ {∞}.
– If f is convex on the convex set S, f̃ satisfies, for θ ∈ [0, 1],

    f̃(θx1 + (1 − θ)x2) ≤ θf̃(x1) + (1 − θ)f̃(x2)

• The following statements also hold (with h̃ the extended-value extension of h):
– f is convex if h is convex, h̃ is nondecreasing, and g is convex
– f is convex if h is convex, h̃ is nonincreasing, and g is concave

Operations that preserve convexity IV

Vector composition f = h(g(x)) = h(g1(x), . . . , gk(x)), where h : R^k → R and gi : R^n → R.
When n = 1 (the general case follows by restricting f to a line), differentiating twice gives

    f″(x) = g′(x)^T ∇²h(g(x)) g′(x) + ∇h(g(x))^T g″(x)

• f is convex if h is convex and nondecreasing in each argument, and the gi are convex
• f is convex if h is convex and nonincreasing in each argument, and the gi are concave
• f is concave if h is concave and nondecreasing in each argument, and the gi are concave
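A small check of the n = 1 composition formula with h(u) = e^u (convex, nondecreasing) and g(x) = x² (convex), so f(x) = e^{x²} should be convex; the function choices, test point, and finite-difference step are illustrative.

```python
import numpy as np

h   = np.exp                 # h(u) = e^u, convex and nondecreasing (h' = h'' = h)
g   = lambda x: x**2         # convex
gp  = lambda x: 2 * x        # g'
gpp = lambda x: 2.0          # g''

def f(x):
    return h(g(x))

x, eps = 0.7, 1e-4
f2_formula = gp(x)**2 * h(g(x)) + h(g(x)) * gpp(x)        # g'^2 h''(g) + h'(g) g''
f2_numeric = (f(x + eps) - 2 * f(x) + f(x - eps)) / eps**2
print(f2_formula, f2_numeric)    # both positive and approximately equal -> f is convex at x
```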

Subgradient of convex functions I

A subgradient of a convex function f : S → R at x̄ ∈ S is any g ∈ R^n such that

    f(x) ≥ f(x̄) + g^T (x − x̄)   for all x ∈ S

• always exists for convex f (at interior points of S; see the existence proof below)
• If f is differentiable at x̄, then the subgradient is unique: g = ∇f(x̄)
• A subgradient is also a global underestimator of f at x̄

Subgradient of convex functions II

Let f(x) = min{f1(x), f2(x)}, where f1 and f2 are defined as

    f1(x) = 4 − |x|   and   f2(x) = 4 − (x − 2)²   for x ∈ R

[Figure: plot of f1, f2, and f = min{f1, f2}, with kinks at x = 1 and x = 4]

Subgradient of f at x = 1: λ∇f1(1) + (1 − λ)∇f2(1) for λ ∈ [0, 1]
Subgradient of f at x = 4: λ∇f1(4) + (1 − λ)∇f2(4) for λ ∈ [0, 1]
Subgradient of convex functions III

The subdifferential of f at x̄ is the set of all subgradients at x̄:

    ∂f(x̄) = {g | f(x) ≥ f(x̄) + g^T (x − x̄), ∀x ∈ dom f}

• ∂f(x̄) is always a closed convex set (possibly empty), since it is the intersection of an infinite family of halfspaces

Subgradient of convex functions IV

• If g is a subgradient of f at x, then from f(y) ≥ f(x) + g^T (y − x),

    f(y) ≤ f(x)  ⇒  g^T (y − x) ≤ 0

• The nonzero subgradients at x define supporting hyperplanes to the sublevel set

    {y | f(y) ≤ f(x)}

Subgradient of convex functions V

Let S be a convex set in R^n and f : S → R be a convex function. For x̄ ∈ int S, ∂f(x̄) is nonempty.
Proof
Since (x̄, f(x̄)) is a boundary point of the convex set epi f, there is a supporting hyperplane at it: a normal vector [a, b] with a ∈ R^n, b ∈ R (not both zero) such that for all (x, z) ∈ epi f

    [a^T  b] ([x; z] − [x̄; f(x̄)]) = a^T (x − x̄) + b(z − f(x̄)) ≤ 0

Subgradient of convex functions VI

Proof continued

    a^T (x − x̄) + b(z − f(x̄)) ≤ 0

• If b > 0, letting z → ∞ violates the inequality, so b ≤ 0
• If b = 0 (this means a vertical hyperplane), we have

    a^T (x − x̄) ≤ 0   for all x ∈ S,

  which is impossible when x̄ ∈ int S: choose ε > 0 with x̄ + εa ∈ int S; then

    a^T (x − x̄) = ε a^T a ≤ 0   for x = x̄ + εa

  implies a = 0, contradicting that [a, b] is nonzero. Hence b < 0.
Subgradient of convex functions VII

Proof continued

    a^T (x − x̄) + b(z − f(x̄)) ≤ 0

• Since b < 0, let ã = a/|b| and divide both sides by |b|:

    ã^T (x − x̄) − (z − f(x̄)) ≤ 0,   i.e.,   ã^T x − z ≤ ã^T x̄ − f(x̄)

  for all (x, z) ∈ epi f. Taking z = f(x) gives f(x) ≥ f(x̄) + ã^T (x − x̄), so ã ∈ ∂f(x̄).
• Letting z = f(x), we also obtain the supporting hyperplane H as

    H = {x̂ | â^T (x̂ − x̂0) = 0},

  where

    â = [ã; −1],   x̂ = [x; f(x)],   and   x̂0 = [x̄; f(x̄)]

Subgradient of convex functions VIII

A subgradient of a convex function f : S → R at x̄ ∈ S is any g ∈ R^n such that

    f(x) ≥ f(x̄) + g^T (x − x̄)   for all x ∈ S

This is rewritten as

    f(x) − g^T x ≥ f(x̄) − g^T x̄   ⇒   [g^T  −1] ([x; f(x)] − [x̄; f(x̄)]) ≤ 0

At the point (x̄, f(x̄)), there exists a supporting hyperplane to epi f with normal [g, −1]

Subgradient of convex functions IX

The following functions are not subdifferentiable at x = 0:
• f : R → R with dom f = R+,

    f(x) = 1 if x = 0,   f(x) = 0 if x > 0

• f : R → R with dom f = R+,

    f(x) = −√x

The only supporting hyperplane to epi f at (0, f(0)) is vertical

Properties of subdifferential I

• Scaling: for λ > 0, the function λf satisfies

    ∂(λf)(x) = λ ∂f(x)

• Sum: the function f1 + f2 is convex, and

    ∂(f1 + f2)(x) = ∂f1(x) + ∂f2(x)

• Composition with affine mapping: let φ(x) = f(Ax + b). Then,

    ∂φ(x) = A^T ∂f(Ax + b)

• Finite pointwise maximum: if f(x) = max_{i=1,...,n} fi(x), then

    ∂f(x) = conv(∪_{i: fi(x)=f(x)} ∂fi(x)),

  the convex hull of the union of the subdifferentials of all 'active' functions at x
Properties of subdifferential II

• Consider a piecewise-linear function

    f(x) = max_{i=1,...,m} (ai^T x + bi)

• The subdifferential at x is a polyhedron

    ∂f(x) = conv{ai | i ∈ I(x)}   with   I(x) = {i | ai^T x + bi = f(x)}

Quasi-convex

Let f : S → R, where S is a nonempty convex set in R^n.
The function f is called quasi-convex (or unimodal)
• if and only if for x1, x2 ∈ S,

    f(λx1 + (1 − λ)x2) ≤ max{f(x1), f(x2)}   for each λ ∈ (0, 1)

• equivalently, if and only if all its sublevel sets Sα = {x ∈ dom f | f(x) ≤ α} for α ∈ R are convex
• the function f is called quasi-concave if −f is quasi-convex
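A sketch of this active-set formula for a piecewise-linear f, assuming numpy; the particular ai, bi, and test points are made up for illustration.

```python
import numpy as np

A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])          # rows ai
b = np.array([0.0, 0.0, 1.0])         # offsets bi

def f(x):
    return np.max(A @ x + b)

def active_gradients(x, tol=1e-9):
    vals = A @ x + b
    I = np.flatnonzero(vals >= vals.max() - tol)   # active index set I(x)
    return A[I]                                    # ∂f(x) = conv of these rows

x = np.array([0.5, 0.5])
G = active_gradients(x)
print(G)                                # active ai's whose convex hull is ∂f(x)

# Any convex combination of active gradients is a subgradient:
g = G.mean(axis=0)
y = np.array([-1.0, 2.0])
print(f(y) >= f(x) + g @ (y - x))       # True
```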

First-order condition of Quasi-convex I

Let S be a nonempty open convex set in R^n, and let f : S → R be differentiable on S.
Then, f is quasi-convex if and only if

    for x1, x2 ∈ S with f(x2) ≤ f(x1):   ∇f(x1)^T (x2 − x1) ≤ 0

Proof If f(x1) ≥ f(x2), by definition of quasi-convexity,

    f(λx2 + (1 − λ)x1) = f(x1 + λ(x2 − x1)) ≤ f(x1)

We can write, for 0 < λ ≤ 1,

    [f(x1 + λ(x2 − x1)) − f(x1)] / λ ≤ 0

As λ → 0, the left-hand side tends to the directional derivative ∇f(x1)^T (x2 − x1), so ∇f(x1)^T (x2 − x1) ≤ 0

First-order condition of Quasi-convex II

Proof continued (converse, shown for f on R)
Suppose f(x2) ≤ f(x1). We assume that x2 > x1 and show that f(z) ≤ f(x1) for all z ∈ [x1, x2], which is quasi-convexity.
Suppose this is not true, i.e., there is a z ∈ [x1, x2] with f(z) > f(x1) ≥ f(x2).
Since f must come back down to f(x2) while its value is still above f(x1), the mean value theorem gives a point, still denoted z, with f(z) > f(x1) and f′(z) < 0.
By the assumed condition, since f(x1) ≤ f(z), we must have

    f(x1) ≤ f(z)  ⇒  f′(z)(x1 − z) ≤ 0

However, this is a contradiction, since f′(z) < 0 and x1 − z < 0 imply f′(z)(x1 − z) > 0
Pseudo-convex

• For a quasi-convex function, ∇f(x) = 0 does not guarantee a global minimizer
  (e.g., f(x) = x³ is quasi-convex on R with f′(0) = 0, but 0 is not a minimizer)
• Let S be a nonempty open set in R^n and let f : S → R be differentiable on S
  The function f is called pseudoconvex:
  if for each x1, x2 ∈ S with ∇f(x1)^T (x2 − x1) ≥ 0, we have f(x2) ≥ f(x1)
• This shows that if ∇f(x̄) = 0 at some point x̄, we have

    f(x) ≥ f(x̄)   for all x ∈ S,

  which implies that x̄ is a global minimum of f
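A tiny numerical illustration of the x³ example above (the sample grid is arbitrary): the gradient vanishes at 0, yet smaller function values exist, so stationarity is not sufficient for a quasi-convex function; pseudoconvexity rules this behavior out.

```python
import numpy as np

f  = lambda x: x**3        # quasi-convex on R (monotone => all sublevel sets are intervals)
fp = lambda x: 3 * x**2    # derivative

xs = np.linspace(-2, 2, 401)
print(fp(0.0))                         # 0.0: x = 0 is a stationary point
print(np.any(f(xs) < f(0.0)))          # True: x = 0 is not a global minimizer

# Each sublevel set {x | f(x) <= alpha} is the interval (-inf, alpha**(1/3)],
# which is exactly the sublevel-set characterization of quasi-convexity.
```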
