0% found this document useful (0 votes)
52 views4 pages

Lec3 Convex Function Exercise

The document discusses optimization methods for convex functions. It defines what it means for a function f(x) to be convex and proves several properties of convex functions, including: - A function is convex if and only if its restriction to any line is also convex. - A differentiable function is convex if and only if its gradient satisfies a certain inequality. - Norm functions are convex due to properties like subadditivity and absolute homogeneity. - Several examples of convex and non-convex functions are provided.

Uploaded by

zxm1485
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
52 views4 pages

Lec3 Convex Function Exercise

The document discusses optimization methods for convex functions. It defines what it means for a function f(x) to be convex and proves several properties of convex functions, including: - A function is convex if and only if its restriction to any line is also convex. - A differentiable function is convex if and only if its gradient satisfies a certain inequality. - Norm functions are convex due to properties like subadditivity and absolute homogeneity. - Several examples of convex and non-convex functions are provided.

Uploaded by

zxm1485
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

Optimization Methods - Convex Functions

Bolei Zhang
December 7, 2022

1 Problems
Problem 1: A function f (·) is convex ⇐⇒ ∀x, v ∈ domf, g(t) = f (x + tv) is convex.

Solution: “ =⇒ ”: For any t1 , t2 ∈ domg, x, v ∈ domf , there is

g(θt1 + (1 − θ)t2 ) = f (x + θt1 v + (1 − θ)t2 v)


= f (θ(x + t1 v) + (1 − θ)(x + t2 v))
(1)
≤ θf (x + t1 v) + (1 − θ)f (x + t2 v))
= θg(t1 ) + (1 − θ)g(t2 )
“⇐=”: For any x1 , x2 ∈ domf , there is x1 + tx2 ∈ domg. We have

f (θx1 + (1 − θ)x2 ) = f (x1 + (1 − θ)x2 − (1 − θ)x1 )


= f (x1 + (1 − θ)(x2 − x1 ))
= g(0 + (1 − θ) · 1) (2)
≤ θg(0) + (1 − θ)g(1)
= θf (x1 ) + (1 − θ)f (x2 )

Theorem 1: First order condition: For a differentiable function f , if dom f is a convex set, then f is
a convex function ⇐⇒ f (y) ≥ f (x) + ∇f (x)T (y − x), ∀x, y ∈ domf .

Proof: First consider the case n = 1: We show that a differentiable function f : R → R is convex if
and only if
f (y) ≥ f (x) + f ′ (x)(y − x) (3)
for all x and y in domf .
Assume first that f is convex and x, y ∈ domf . Since domf is convex, we conclude that for all
0 < t ≤ 1, x + t(y − x) ∈ domf , and by convexity of f ,

f (x + t(y − x)) ≤ (1 − t)f (x) + tf (y). (4)

If we divide both sides by t, we obtain

f (x + t(y − x)) − f (x)


f (y) ≥ f (x) + , (5)
t
and taking the limit as t → 0 yields (3).
To show sufficiency, assume the function satisfies (3) for all x and y in domf (which is an interval).
Choose any x ̸== y, and 0 ≤ θ ≤ 1, and let z = θx + (1 − θ)y. Applying (3) twice yields

f (x) ≥ f (z) + f ′ (z)(x − z), f (y) ≥ f (z) + f ′ (z)(y − z). (6)

Multiplying the first inequality by θ, the second by 1 − θ, and adding them yields

θf (x) + (1 − θ)f (y) ≥ f (z), (7)

which proves that f is convex.

1
Now we can prove the general case, with f : Rn → R. Let x, y ∈ Rn and consider f restricted
to the line passing through them, i.e., the function defined by g(t) = f (ty + (1 − t)x), so g ′ (t) =
∇f (ty + (1 − t)x)T (y − x).
First assume f is convex, which implies g is convex, so by the argument above we have g(1) ≥
g(0) + g ′ (0), which means
f (y) ≥ f (x) + ∇f (x)T (y − x). (8)
Now assume that this inequality holds for any x and y, so if ty +(1−t)x ∈ domf , and t̃y +(1− t̃)x ∈
domf , we have

f (ty + (1 − t)x) ≥ f (t̃y + (1 − t̃)x) + ∇f (t̃y + (1 − t̃)x)T (y − x)(t − t̃), (9)

i.e., g(t) ≥ g(t) + g ′ (t̃)(t − t̃). We have seen that this implies that g is convex.

Definition 1: Norm functions Given a vector space X, a norm on X is a real-valued function


p : X → R with the following properties, where |s| denotes the usual absolute value of a scalar s:

• Subadditivity/Triangle inequality: p(x + y) ≤ p(x) + p(y) for all x, y ∈ X;


• Absolute homogeneity: p(sx) = |s| p(x) for all x ∈ X and all scalars s.
• Positive definiteness/Point-separating: for all x ∈ X, if p(x) = 0, then x = 0.

Theorem 2: Norm functions are convex.

Proof: If f : Rn → R is an norm, and 0 ≤ θ ≤ 1,then f (θx + (1 − θ)y) ≤ f (θx) + f ((1 − θ)y) =


θf (x) + (1 − θ)f (y). The inequality follows from the triangle inequality, and the equality follows from
homogeneity of a norm.

Problem 2: The maximum function f (x) = max{x1 , x2 , ..., xn }, x ∈ Rm is convex.

Solution: The function f (x) = maxi xi satisfies, for 0 ≤ θ ≤ 1,

f (θx + (1 − θ)y) = max(θxi + (1 − θ)yi )


i
≤ θ max xi + (1 − θ) max yi (10)
i i
= θf (x) + (1 − θ)f (y)

Problem 3: Determine whether f (x) = x−2 , x ̸= 0 is convex function.

Solution: The domain of this function is not a convex set.

Problem 4: L0 norm: ||x||0 is the number of non-zero elements in x. Explain that whether L0 norm
is a norm function, whether it is a convex function.

Solution: Let x = (0, 1). There is ||x||0 = 1 and ||2x||0 = 1. However, 2||x||0 = 2, which violates the
homogeneity. Therefore, L0 norm is not a norm function.
Let x = (0, 1), y = (1, 0). Let θ = 0.5,

f (θx + (1 − θ)y) = ||(0.5, 0.5)||0 = 2, (11)

θf (x) + (1 − θ)f (y) = 0.5 + 0.5 = 1, (12)


Therefore, L0 norm is not convex.
x2
Problem 5: f (x, y) = y , y > 0 is a convex function.

Solution: The Hessian matrix of f (x, y) is (for y > 0)


" 2 #
− 2x
  T
2 y y2 2 y y
∇ f= 2 = ⪰ 0. (13)
− 2x
y2
2x
y3 y 3 −x −x

2
Problem 6: Compute the conjugate of the functions:
(1) f (x) = ax + b, domf = R.
(2) f (x) = − log x, domf = R++

Solution: (1). As a function of x, yx − ax − b is bounded if and only if y = a, in which case it is


constant. Therefore the domain of the conjugate function f ∗ is the singleton {a}, and f ∗ (a) = −b.
(2) The function xy + log x is unbounded above if y ≥ 0 and reaches its maximum at x = −1/y
otherwise. Therefore, domf ∗ = {y|y < 0} = −R++ and f ∗ (y) = − log(−y) − 1 for y < 0.

Problem 7: Determine whether the following functions are convex


• f (x) = ex − 1, x ∈ R
• f (x1 , x2 ) = x1 x2 , x1 , x2 ∈ R
Pk
• h(z) = log( i=1 exi ), xi ∈ R

Solution: (1) The function is convex as ∇2 f = ex >0.


0 1
(2) The function is not convex as ∇2 f = is not positive semi-definite.
1 0
(3) The Hessian of the log-sum-exp function is
1
∇2 f (x) = T
((1T z)diag(z) − zz T ),
(1 z)2
where z = (ex1 , ..., exn ). To verify that ∇2 f (x) ⪰ 0 we must show that for all v, v T ∇2 f (x)v ≥ 0, i.e.,
n n n
1 X X X
v T ∇2 f (x)v = (( zi )( v 2
i zi ) − ( vi zi )2 ) ≥ 0
(1T z)2 i=1 i=1 i=1

. But this follows from the Cauchy-Schwarz inequality (aT a)(bT b) ≥ (aT b)2 applied to the vectors
√ √
with components ai = vi zi , bi = zi .
Pn
Problem 8: The KL-divergence is convex DKL (u, v) = i=1 (ui log uvii − ui + vi )
2
Solution: First, we show that g(x, t) = −t log(x/t) is convex on R++ : Let domf = C. The domain
2
domg = R++ is a convex set.
The function g(·) is convex because the function is defined as a scaled and shifted version of convex
function f (x).
n
Therefore, the relative entropy of two vectors u, v ∈ R++ , defined as
n
X
ui log(ui /vi )
i=1

is convex in (u, v), since it is a sum of relative entropies of ui , vi .


The KL-divergence is convex as it is the relative entropy plus a linear function of (u, v).

Problem 9: Convex-concave functions and saddle-points. We say the function f : Rn × Rm → R is


convex-concave if f (x, z) is a concave function of z, for each fixed x, and a convex function of x, for
each fixed z. We also require its domain to have the product form domf = A × B, where A ⊆ Rn and
B ⊆ Rm are convex.
(1) Give a second-order condition for a twice differentiable function f : Rn × Rm → R to be
convex-concave, in terms of its Hessian ∇2 f (x, z).
(2) Suppose that f : Rn × Rm → R is convex-concave and differentiable, with ∇f (x̃, z̃) = 0. Show
that the saddle-point property holds: for all x, z, we have

f (x̃, z) ≤ f (x̃, z̃) ≤ f (x, z̃)

. Show that this implies that f satisfies the strong max-min property:

sup inf f (x, z) = inf sup f (x, z)


z x x z

3
(and their common value is f (x̃, z̃).
(3) Now suppose that f : Rn × Rm → R is differentiable, but not necessarily convex-concave, and
the saddle-point property holds at x̃, z̃:

f (x̃, z) ≤ f (x̃, z̃) ≤ f (x, z̃)

for all x, z. Show that ∇f (x̃, z̃) = 0

Solution: (1) The Hessian matrix of f is


 
2 A11 A12
∇ f=
A21 A22

where A11 ∈ Rn×n , A12 ∈ Rn×m , A21 ∈ Rm×n , A22 ∈ Rm×m . And there is ∇2 f = A11 , ∇2z f = A22 .
As f is convex when z is fixed, then A11 is positive semi-definite; As f is concave when x is fixed, then
A22 is negative semi-definite.
(2) For the first inequality, as f (x, z) is convex of x, there is

f (x, z̃) ≥ f (x̃, z̃) + ∇x f (x̃, z̃)(x − x̃) = f (x̃, z̃)


The same is true for the left part of the inequality.
Next we prove supz inf x f (x, z) = f (x̃, z̃). First

inf f (x, z) ≤ f (x̃, z) ≤ f (x̃, z̃),


x

As
sup inf f (x, z) ≤ f (x̃, z̃),
z x

and
sup inf f (x, z) ≥ inf f (x, z̃) ≥ f (x̃, z̃),
z x x

Therefore, there is supz inf x f (x, z) = f (x̃, z̃). The same is true for inf x supz f (x, z) = f (x̃, z̃), which
proves the equality.
(3) We need to prove ∇x f (x̃, z̃) = 0 and ∇z f (x̃, z̃) = 0. We prove by contradiction. Assume
∇x f (x̃, z̃) ̸= 0, let v = (∇x f (x̃, z̃)T , 0)T . Get the first order of f (x̃ + tv, z̃) at (x̃, z̃) (t ̸= 0),

f (x̃ + tv, z̃) = f (x̃, z̃) + t||∇x f (x̃, z̃)||2 + O(t2 ).


Take t < 0 with small enough absolute value, there is f (x̃ + tv, z̃) < f (x̃, z̃), which contradicts with
the assumption. Therefore, there is ∇x f (x̃, z̃) = 0 and ∇z f (x̃, z̃) = 0.

Problem 10: Show that f (Ax + b) is convex if f (x) is a convex function.

Solution: The domain of f (Ax + b) is the same with f (x), which is a convex set. For any two points
x, y ∈ domf , 0 ≤ θ ≤ 1, there is,

f (A(θx + (1 − θ)y) + b) = f (θ(Ax + b) + (1 − θ)(Ay + b)) ≤ θf (Ax + b) + (1 − θ)f (Ay + b). (14)

Therefore, f (Ax + b) is convex.

You might also like