Optimization (2MMD10/2DME20), lecture week 4

Cor Hurkens
Technische Universiteit Eindhoven

Fall 2017, Q1

CAJ Hurkens Optimization (2MMD10/2DME20), lecture week 4 1/40


Program for this week

Convexity, convexity, convexity, . . .


Positive semi-definite matrices and functions
Convex sets
Convex functions
Applications of convexity
Useful inequalities

Positive semi-definite matrices (1)

Definition
A real matrix $A$ is symmetric if $A^T = A$.
The set of symmetric $n \times n$ matrices is denoted by $S^n$.
Recall:
Theorem
For any matrix $A \in S^n$,
there exists an $n \times n$ matrix $F$ and a diagonal matrix $\Lambda$
so that $F^T F = I$ and $F^T A F = \Lambda$.
Let $\lambda_1, \ldots, \lambda_n$ be the diagonal entries of $\Lambda$, and let $f_1, \ldots, f_n$ be the
columns of $F$. Then
$f_1, \ldots, f_n$ is an orthonormal basis of $\mathbb{R}^n$.
$A f_i = \lambda_i f_i$ for all $i$.
$A = \lambda_1 f_1 f_1^T + \cdots + \lambda_n f_n f_n^T$.

Positive semi-definite matrices (2)

A function $f : \mathbb{R}^n \to \mathbb{R}$ is positive semi-definite
if $f(x) \ge 0$ for all $x \in \mathbb{R}^n$.
If $A \in S^n$, then the function $f(x) = x^T A x = \sum_i \sum_j A_{ij} x_i x_j$
is a homogeneous quadratic function ($f : \mathbb{R}^n \to \mathbb{R}$).

Definition
Let $A \in S^n$. Then $A$ is positive semi-definite (PSD) if

$x^T A x \ge 0$ for all $x \in \mathbb{R}^n$.

The set of positive semi-definite matrices is denoted by $S^n_+$.

An $A \in S^n$ is positive definite (PD) if $A$ is PSD and non-singular.
The set of positive definite matrices is denoted by $S^n_{++}$.
We write $A \succeq 0$ to denote that $A$ is PSD, and $A \succ 0$ if $A$ is PD.

Positive semi-definite matrices (3)

Most results on PSD matrices in this course are derived from this one:
Theorem
Let $A \in S^n$. The following three statements are equivalent:
1. $A$ is positive semi-definite.
2. each eigenvalue of $A$ is $\ge 0$.
3. there is some real matrix $Z$ such that $A = Z^T Z$.

In particular,
$A$ is PSD $\Rightarrow$ $\det(A) \ge 0$
$A$ is PSD $\Rightarrow$ the diagonal entries of $A$ are $\ge 0$
if $A$ is diagonal, then: $A$ is PSD $\Leftrightarrow$ the diagonal entries of $A$ are $\ge 0$
if $A = \begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix}$, then: $A$ is PSD $\Leftrightarrow$ both $B$ and $C$ are PSD.
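
Two of the implications in the theorem have one-line arguments. The following is a sketch of the standard reasoning, not taken from the slides: statement 3 implies statement 1 because $x^T A x$ becomes a squared norm, and statement 1 implies statement 2 by testing with an eigenvector $v \ne 0$ for the eigenvalue $\lambda$.

```latex
% (3) => (1): if A = Z^T Z, then for all x in R^n
x^T A x = x^T Z^T Z x = (Zx)^T (Zx) = \|Zx\|^2 \ge 0 .

% (1) => (2): if A v = \lambda v with v \ne 0, then
0 \le v^T A v = \lambda \, v^T v = \lambda \|v\|^2
\quad\Longrightarrow\quad \lambda \ge 0 .
```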

Positive semi-definite matrices (4)
Matrices $A, B \in S^n$ are congruent,
if $B = U^T A U$ for some non-singular $U$.
Lemma
Let $A, B \in S^n$ be congruent. Then $A \succeq 0 \Leftrightarrow B \succeq 0$.

Applying one or more of the following symmetric matrix operations to $A$
yields a congruent matrix:
scaling the $i$-th row and the $i$-th column by a scalar $\lambda \ne 0$
interchanging the $i$-th row with the $j$-th row and the $i$-th column
with the $j$-th column
adding the $i$-th row to the $j$-th row and adding the $i$-th
column to the $j$-th column
By these operations, a matrix $A$ may be transformed to a congruent
diagonal matrix $D$. Then, $A \succeq 0 \Leftrightarrow D \succeq 0 \Leftrightarrow D \ge 0$.

Example: is $\begin{pmatrix} 1&0&1&1\\ 0&2&0&4\\ 1&0&3&0\\ 1&4&0&4 \end{pmatrix}$ PSD?
Positive semi-definite matrices (5)

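$$
\begin{pmatrix} 1&0&1&1\\ 0&2&0&4\\ 1&0&3&0\\ 1&4&0&4 \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&4\\ 1&0&2&-1\\ 1&4&-1&3 \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&4\\ 0&0&2&-1\\ 0&4&-1&3 \end{pmatrix}
$$
$$
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&-1\\ 0&4&-1&-5 \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&-1\\ 0&0&-1&-5 \end{pmatrix}
$$
$$
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&0\\ 0&0&-1&-5\tfrac{1}{2} \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&0\\ 0&0&0&-5\tfrac{1}{2} \end{pmatrix}
\qquad \text{NOT PSD!!}
$$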
Positive semi-definite matrices (6)

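$$
\begin{pmatrix} 1&0&1&1\\ 0&2&0&2\\ 1&0&3&0\\ 1&2&0&4 \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&2\\ 1&0&2&-1\\ 1&2&-1&3 \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&2\\ 0&0&2&-1\\ 0&2&-1&3 \end{pmatrix}
$$
$$
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&-1\\ 0&2&-1&1 \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&-1\\ 0&0&-1&1 \end{pmatrix}
$$
$$
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&0\\ 0&0&-1&\tfrac{1}{2} \end{pmatrix}
\to
\begin{pmatrix} 1&0&0&0\\ 0&2&0&0\\ 0&0&2&0\\ 0&0&0&\tfrac{1}{2} \end{pmatrix}
\qquad \text{YES, PSD!!}
$$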
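
The two worked examples can be reproduced mechanically. The following is a minimal sketch, assuming exact rational arithmetic and a simple pivoting rule; the function names `congruent_diagonal` and `is_psd` are chosen here for illustration and do not come from the lecture. Each step pairs a row operation with the matching column operation, so every intermediate matrix is congruent to the input.

```python
from fractions import Fraction


def congruent_diagonal(matrix):
    """Return the diagonal of a diagonal matrix congruent to a symmetric input."""
    n = len(matrix)
    m = [[Fraction(v) for v in row] for row in matrix]

    def add_multiple(src, dst, factor):
        # Row operation, then the same operation on columns (keeps symmetry).
        for c in range(n):
            m[dst][c] += factor * m[src][c]
        for r in range(n):
            m[r][dst] += factor * m[r][src]

    def swap(i, j):
        # Interchange rows i, j and columns i, j.
        m[i], m[j] = m[j], m[i]
        for row in m:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if m[k][k] == 0:
            j = next((j for j in range(k + 1, n) if m[j][j] != 0), None)
            if j is not None:
                swap(k, j)
            else:
                j = next((j for j in range(k + 1, n) if m[j][k] != 0), None)
                if j is None:
                    continue  # row and column k are already cleared
                add_multiple(j, k, 1)  # pivot becomes 2 * m[j][k], nonzero
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add_multiple(k, i, -m[i][k] / m[k][k])
    return [m[i][i] for i in range(n)]


def is_psd(matrix):
    """A symmetric matrix is PSD iff a congruent diagonal matrix is >= 0."""
    return all(d >= 0 for d in congruent_diagonal(matrix))


A_NOT_PSD = [[1, 0, 1, 1], [0, 2, 0, 4], [1, 0, 3, 0], [1, 4, 0, 4]]
A_PSD = [[1, 0, 1, 1], [0, 2, 0, 2], [1, 0, 3, 0], [1, 2, 0, 4]]

print(congruent_diagonal(A_NOT_PSD), is_psd(A_NOT_PSD))
print(congruent_diagonal(A_PSD), is_psd(A_PSD))
```

The diagonal found this way depends on the order of operations, but by the lemma on congruent matrices the PSD verdict does not.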
Linear, affine, conic, convex combinations

Definition
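Let $x, y \in \mathbb{R}^n$, $\lambda, \mu \in \mathbb{R}$.
Then $z := \lambda x + \mu y$ is a linear combination of $x$ and $y$.

$z$ lies on the plane through $0, x, y$

if $\lambda + \mu = 1$, then $z$ lies on the line through $x, y$ (affine)
if $\lambda, \mu \ge 0$, then $z$ lies between the directions $\overrightarrow{0x}$ and $\overrightarrow{0y}$ (conic)
if both, then $z$ lies between $x$ and $y$ (convex)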
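
The four cases depend only on the two coefficients. As a small illustration, the sketch below labels a coefficient pair; the function name `classify` and the parameter names `lam` and `mu` are hypothetical choices made here, and the tolerance is only there to absorb floating-point rounding.

```python
def classify(lam, mu, tol=1e-12):
    """List which kinds of combination lam*x + mu*y is."""
    kinds = ["linear"]  # every real coefficient pair gives a linear combination
    affine = abs(lam + mu - 1) <= tol  # coefficients sum to one
    conic = lam >= 0 and mu >= 0       # both coefficients non-negative
    if affine:
        kinds.append("affine")
    if conic:
        kinds.append("conic")
    if affine and conic:
        kinds.append("convex")
    return kinds


print(classify(0.25, 0.75))
print(classify(2, -1))
print(classify(2, 3))
```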

Linear, affine, conic, convex combinations (2)

[Figure: four panels labelled Linear, Affine, Conic, and Convex, illustrating the corresponding combinations of two points.]

Linear, affine, convex sets

Definition
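A set $L \subseteq \mathbb{R}^n$ is
linear, if $\lambda x + \mu y \in L$ for all $x, y \in L$ and all $\lambda, \mu \in \mathbb{R}$
affine, if $\lambda x + \mu y \in L$ for all $x, y \in L$ and $\lambda, \mu \in \mathbb{R}$ with $\lambda + \mu = 1$

Theorem
Let $L \subseteq \mathbb{R}^n$. The following are equivalent:
$L$ is affine
$L = \{x \mid Ax = b\}$ for some $A, b$
$L = \{Cx + d \mid x\}$ for some $C, d$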

Linear, affine, convex sets (2)

Definition
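A set $C \subseteq \mathbb{R}^n$ is
convex, if $\lambda x + \mu y \in C$ for all $x, y \in C$ and $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$

Example
affine sets are convex.
a hyperplane $H_{a,b} := \{x \in \mathbb{R}^n \mid a^T x = b\}$ is convex
a halfspace $H^{\le}_{a,b} := \{x \in \mathbb{R}^n \mid a^T x \le b\}$ is convex
the unit ball $B^n := \{x \in \mathbb{R}^n \mid \|x\| \le 1\}$ is convex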

Linear, affine, convex sets (3)

Definition
A set $C \subseteq \mathbb{R}^n$ is a cone, if $\lambda x + \mu y \in C$ for all $x, y \in C$ and all $\lambda, \mu \ge 0$.
Note: cones are convex sets.

Example
linear sets are cones
the Lorentz cone $L^{n+1} := \{(x, t) \mid x \in \mathbb{R}^n, t \in \mathbb{R}, \|x\| \le t\}$ is a cone
the positive semi-definite (PSD) matrices

$S^n_+ := \{A \in S^n \mid A \succeq 0\}$

form a cone

Linear, affine, convex sets (4)

Definition
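A function $f : \mathbb{R}^n \to \mathbb{R}$ is a norm if
$f(x) \ge 0$ for all $x \in \mathbb{R}^n$
$f(x) = 0 \Leftrightarrow x = 0$
$f(\lambda x) = \lambda f(x)$ for all $\lambda \in \mathbb{R}_+$, $x \in \mathbb{R}^n$
$f(x + y) \le f(x) + f(y)$ for all $x, y \in \mathbb{R}^n$

Definition
If $f$ is a norm,
then the norm ball is $\{x \in \mathbb{R}^n \mid f(x) \le 1\}$
and the norm cone is $\{(x, t) \in \mathbb{R}^{n+1} \mid f(x) \le t\}$.

For any norm, the norm ball is a convex set and the norm cone is a cone.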
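
The convexity of the norm ball follows directly from the last two norm axioms. A sketch of the argument, with $\lambda, \mu$ used here as names for the convex-combination coefficients: take $x, y$ with $f(x) \le 1$ and $f(y) \le 1$, and $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$.

```latex
f(\lambda x + \mu y)
  \le f(\lambda x) + f(\mu y)      % triangle inequality
  =   \lambda f(x) + \mu f(y)      % homogeneity for \lambda, \mu \ge 0
  \le \lambda + \mu = 1 .
```

The same two steps with $f(x) \le s$ and $f(y) \le t$ give $f(\lambda x + \mu y) \le \lambda s + \mu t$ for all $\lambda, \mu \ge 0$, which is the cone property of the norm cone.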

Making convex sets (1)

Intersection of convex sets:


Lemma
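Let $C_\alpha \subseteq \mathbb{R}^n$ be convex for all $\alpha \in A$.
Then $\bigcap_{\alpha \in A} C_\alpha$ is convex.

Example
The set of copositive polynomials of degree $n$:

$P^n_+ := \{(p_0, \ldots, p_n) \mid 0 \le p_0 + p_1 x + \cdots + p_n x^n \text{ for all } x \in [0, \infty)\}$

can be written as $P^n_+ = \bigcap_{x \in [0, \infty)} P^n_x$, where

$P^n_x := \{(p_0, \ldots, p_n) \mid 0 \le p_0 + p_1 x + \cdots + p_n x^n\}$.

Each $P^n_x$ is a halfspace in $\mathbb{R}^{n+1}$, hence $P^n_+$ is convex.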

Making convex sets (2)

Polyhedra:

Definition
A polyhedron is
a set $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$ for some linear inequalities $Ax \le b$.

Example: the $n$-simplex $\{x \in \mathbb{R}^n \mid x \ge 0, \sum_i x_i = 1\}$
Polyhedra are convex sets

Making convex sets (3a)

Balls and Ellipsoids:

Example
The unit ball $B^n = \{x \in \mathbb{R}^n \mid \|x\| \le 1\}$ is convex.

Definition
Let $Z$ be a non-singular $n \times n$ matrix; let $c \in \mathbb{R}^n$.
Then $E(Z, c) := \{c + Zx \mid \|x\| \le 1\}$ is an ellipsoid.

So ellipsoids are scaled, rotated and shifted balls.

Lemma
A set $E \subseteq \mathbb{R}^n$ is an ellipsoid if and only if

$E = \{y \in \mathbb{R}^n \mid (y - c)^T A^{-1} (y - c) \le 1\}$

for some $c \in \mathbb{R}^n$ and some positive definite $A$.
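
One direction of the lemma can be sketched by substitution, assuming the Euclidean norm. Given $E(Z, c)$, write $y = c + Zx$, so $x = Z^{-1}(y - c)$ because $Z$ is non-singular.

```latex
\|x\|^2 = (y - c)^T Z^{-T} Z^{-1} (y - c) = (y - c)^T (Z Z^T)^{-1} (y - c),
\qquad\text{so}\qquad
E(Z, c) = \{\, y \mid (y - c)^T A^{-1} (y - c) \le 1 \,\}
\quad\text{with } A := Z Z^T .
```

Here $A = Z Z^T$ is PSD by the characterisation $A = W^T W$ (take $W = Z^T$) and non-singular because $Z$ is, hence positive definite.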

Making convex sets (3b)

Balls and Ellipsoids:


Lemma
Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be affine (i.e. $f : x \mapsto Ax + b$); let $C \subseteq \mathbb{R}^n$ be convex.
Then $f[C] := \{f(x) \mid x \in C\}$ is convex.

Example
Consider an ellipsoid $E(Z, c) = \{Zx + c \mid \|x\| \le 1\}$.
For $f : x \mapsto Zx + c$, we have $E(Z, c) = f[B^n]$.
Hence ellipsoids are convex sets.

Making convex sets (4)

Convex hulls:
Definition
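Let $a_1, \ldots, a_m \in \mathbb{R}^n$. Let $\lambda_1, \ldots, \lambda_m \ge 0$ and $\sum_i \lambda_i = 1$.
Then $\lambda_1 a_1 + \cdots + \lambda_m a_m$ is a convex combination of $a_1, \ldots, a_m$.

Definition
The convex hull of $a_1, \ldots, a_m$ is
$\operatorname{conv}\{a_1, \ldots, a_m\} := \{\sum_i \lambda_i a_i \mid \sum_i \lambda_i = 1, \lambda_i \ge 0\}$.
For the affine function $f : \lambda \mapsto \sum_i \lambda_i a_i$ and
for the convex set $C := \{\lambda \mid \sum_i \lambda_i = 1, \lambda_i \ge 0\}$,
we have $\operatorname{conv}\{a_1, \ldots, a_m\} = f[C]$.
Hence $\operatorname{conv}\{a_1, \ldots, a_m\}$ is convex.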

Making convex sets (5)

Inverse image of an affine function:


Lemma
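Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be affine (i.e. $f : x \mapsto Ax + b$); let $C \subseteq \mathbb{R}^m$ be convex.
Then $f^{-1}[C] := \{x \in \mathbb{R}^n \mid f(x) \in C\}$ is convex.

Example
Let $A_0, \ldots, A_m \in S^n$. Then the set

$X := \{x \in \mathbb{R}^m \mid A_0 + x_1 A_1 + \cdots + x_m A_m \succeq 0\}$

is convex, as $X = f^{-1}[S^n_+]$, where $f : \mathbb{R}^m \to S^n$ is the affine map

$f : x \mapsto A_0 + x_1 A_1 + \cdots + x_m A_m$.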

More about convex sets (1)

Definition
Let $C \subseteq \mathbb{R}^n$ be convex and non-empty.
A point $x \in C$ is an extreme point of $C$,
if $x = \lambda x_1 + (1 - \lambda) x_2$ with $x_1, x_2 \in C$ and $0 < \lambda < 1$
implies $x = x_1 = x_2$.

Example
What are the extreme points of
(a) a closed disk in R2 (b) a convex polygon in R2 ?

Lemma
Let $P \subseteq \mathbb{R}^n$ be a polyhedron. Then $P$ has finitely many extreme points.

Krein-Milman theorem
A compact convex set C is the convex hull of its extreme points.

More about convex sets (2)

Theorem
Let $C \subseteq \mathbb{R}^n$ be a closed, convex set. Let $x_0 \notin C$.
Then there exist a nonzero $y \in \mathbb{R}^n$ and a $z \in \mathbb{R}$ such that
$y^T x > z$ for all $x \in C$ and $y^T x_0 < z$.

Theorem
Let $C, D \subseteq \mathbb{R}^n$ be convex sets with $C \cap D = \emptyset$.
Then there exist a nonzero vector $y \in \mathbb{R}^n$ and a $z \in \mathbb{R}$ such that
$y^T x \ge z$ for all $x \in C$ and $y^T x \le z$ for all $x \in D$.
The hyperplane $H = \{x \mid y^T x = z\}$ is said to separate $C$ from $D$.

Theorem
Let $C \subseteq \mathbb{R}^n$ be a convex set, and let $x_0$ lie on the boundary of $C$.
Then there exist a nonzero vector $y \in \mathbb{R}^n$ and a $z \in \mathbb{R}$ such that
$y^T x \ge z$ for all $x \in C$ and $y^T x_0 = z$.
The hyperplane $H = \{x \mid y^T x = z\}$ is said to support $C$ at $x_0$.
Convex functions (1)

Definition
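A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is convex if

$f(\lambda x + \mu y) \le \lambda f(x) + \mu f(y)$

for all $x, y \in \mathbb{R}^n$ and $\lambda, \mu \ge 0$ with $\lambda + \mu = 1$.

Function $f$ strictly convex: strict inequality for $\lambda, \mu > 0$ and $x \ne y$
Function $f$ concave: if $-f$ is convex.
Example
Norm functions are convex
If $C \subseteq \mathbb{R}^n$ is a convex set, then the function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ defined by

$f(x) = \begin{cases} 0 & \text{if } x \in C \\ \infty & \text{otherwise} \end{cases}$

is convex.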

Convex functions (2)

Definition
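Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ be a function. Then the epigraph of $f$ is

$\operatorname{Epi}(f) := \{(x, t) \mid x \in \mathbb{R}^n, t \in \mathbb{R}, f(x) \le t\}$.

Theorem
$f$ is a convex function $\Leftrightarrow$ $\operatorname{Epi}(f)$ is a convex set.

Lemma
Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ be convex and let $\alpha \in \mathbb{R}$.
Then the sublevel set $\{x \in \mathbb{R}^n \mid f(x) \le \alpha\}$ is a convex set.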

First-order condition (1)

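The gradient of a function $f : \mathbb{R}^n \to \mathbb{R}$ is the vector

$\nabla f = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)^T$

Theorem
Let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable function.
Then $f$ is convex if and only if

$f(y) \ge f(x) + \nabla f(x)^T (y - x)$

for all $x, y \in \mathbb{R}^n$.

For convex $f$,
$f(x^*) = \min\{f(y) \mid y \in \mathbb{R}^n\} \Leftrightarrow \nabla f(x^*) = 0$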

First-order condition (2)

Example
For a matrix $A \in S^n$, a vector $b \in \mathbb{R}^n$, and $c \in \mathbb{R}$,
the quadratic function $f(x) = x^T A x + b^T x + c$ is convex
if and only if $A$ is positive semi-definite.

Proof:
First-order condition $f(y) \ge f(x) + \nabla f(x)^T (y - x)$
$\nabla f(x) = 2Ax + b$, so $\nabla f(x)^T = 2x^T A + b^T$
$y^T A y + b^T y + c \ge x^T A x + b^T x + c + (2x^T A + b^T)(y - x)$
is equivalent to $(y - x)^T A (y - x) \ge 0$
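
The equivalence in the last proof step can be checked by expanding the gap between the two sides. This is a sketch of the algebra, writing the linear term as $b^T x$ and using the symmetry of $A$ (so that $x^T A y = y^T A x$):

```latex
\begin{aligned}
f(y) - f(x) - \nabla f(x)^T (y - x)
  &= y^T A y - x^T A x + b^T (y - x) - (2Ax + b)^T (y - x) \\
  &= y^T A y - x^T A x - 2 x^T A y + 2 x^T A x \\
  &= y^T A y - 2 x^T A y + x^T A x \\
  &= (y - x)^T A (y - x) .
\end{aligned}
```

Every vector $d \in \mathbb{R}^n$ can be written as $d = y - x$, so the first-order condition holds for all $x, y$ exactly when $d^T A d \ge 0$ for all $d$, that is, when $A$ is PSD.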

Well-known special case


For $a, b, c \in \mathbb{R}$, the univariate quadratic function $f(x) = ax^2 + bx + c$ is
convex if and only if $a \ge 0$.

Second-order condition (1)

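The Hessian of a twice differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is a symmetric
matrix $\nabla^2 f \in S^n$ such that
$(\nabla^2 f)_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$

Example
Let $Q \in S^n$ and let $c$ be a vector.
Then the Hessian of $f(x) = \frac{1}{2} x^T Q x + c^T x$ is $\nabla^2 f(x) = Q$.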

Second-order condition (2)

Recall: a function is analytic, if it has a Taylor series for each point $x$ in
its domain that converges to the function in an open neighborhood of $x$.
Univariate case
For univariate analytic functions $f : \mathbb{R} \to \mathbb{R}$ we have:
$f(x + h) = f(x) + f'(x) h + \frac{1}{2} f''(x + \theta h) h^2$
for some $\theta \in [0, 1]$.

Multivariate case
For multivariate analytic functions $f : \mathbb{R}^n \to \mathbb{R}$ we have:
$f(x + h) = f(x) + \nabla f(x)^T h + \frac{1}{2} h^T \nabla^2 f(x + \theta h) h$
for some $\theta \in [0, 1]$.

Second-order condition (3)

Theorem
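Let $f : \mathbb{R}^n \to \mathbb{R}$ be a twice differentiable function.
Then $f$ is convex if and only if $\nabla^2 f(x) \succeq 0$ for all $x \in \mathbb{R}^n$.

The proof uses:

Lemma
A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex,
if and only if $g(\lambda) := f(x + \lambda d)$ is convex for all $x, d \in \mathbb{R}^n$.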

Examples

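$x \mapsto \exp(ax)$ is convex on $\mathbb{R}$, for all $a \in \mathbb{R}$
$x \mapsto x^a$ is convex on $\mathbb{R}_+$, for all $a \le 0$ and all $a \ge 1$ (concave otherwise)
$x \mapsto \log(x)$ is concave on $\mathbb{R}_+$
$x \mapsto x \log(x)$ is convex on $\mathbb{R}_+$
$(x, y) \mapsto x^2 / y$ is convex on $\{(x, y) \mid y > 0\}$
$x \mapsto \log(\sum_i \exp(x_i))$ is convex on $\mathbb{R}^n$
$x \mapsto (\prod_i x_i)^{1/n}$ is concave on $\mathbb{R}^n_+$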
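
Claims like these can be sanity-checked numerically by sampling the defining inequality. The sketch below is only a randomized test, not a proof: it can find a counterexample to convexity but cannot certify convexity. The helper names (`violates_convexity`, `sample_r3`, `sample_positive`), the sampling ranges, and the tolerance are all assumptions made for this illustration.

```python
import math
import random


def log_sum_exp(x):
    """log(sum_i exp(x_i)), shifted by the maximum for numerical stability."""
    m = max(x)
    return m + math.log(sum(math.exp(v - m) for v in x))


def violates_convexity(f, sample, trials=1000, tol=1e-9, seed=0):
    """Search for x, y, lam with f(lam*x + (1-lam)*y) > lam*f(x) + (1-lam)*f(y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = sample(rng), sample(rng)
        lam = rng.random()
        z = [lam * a + (1 - lam) * b for a, b in zip(x, y)]
        if f(z) > lam * f(x) + (1 - lam) * f(y) + tol:
            return True  # found a violation: f is not convex
    return False  # no violation found (evidence, not proof)


def sample_r3(rng):
    return [rng.uniform(-5, 5) for _ in range(3)]


def sample_positive(rng):
    return [rng.uniform(0.1, 10)]


# log-sum-exp is convex, so no violation should turn up.
print(violates_convexity(log_sum_exp, sample_r3))
# log is concave (not convex), so a violation is found quickly.
print(violates_convexity(lambda v: math.log(v[0]), sample_positive))
```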

Operations that preserve convexity

Lemma
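If $f, g : \mathbb{R}^n \to \mathbb{R}$ are convex functions, then so is

$\alpha f + \beta g$

for all $\alpha, \beta \ge 0$

Lemma
If $f, g : \mathbb{R}^n \to \mathbb{R}$ are convex, then so is $\max\{f, g\}$.

Lemma
If $f : \mathbb{R}^n \to \mathbb{R}$ is convex and $g : \mathbb{R}^m \to \mathbb{R}^n$ is affine, then $f \circ g$ is convex.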

Example

Three lines in space have a unique waist


Given: three fixed lines in space that are made of iron wire;
these lines are pairwise disjoint and pairwise non-parallel.
We stretch an elastic band around the lines, which then by elasticity
will slip to a position where its total circumference is minimal.
Show: final position of band does not depend on initial position.

Mathematical formulation
Let $\ell_1, \ell_2, \ell_3$ be three lines in $\mathbb{R}^3$.
Find $\min f(p_1, p_2, p_3) = \|p_1 - p_2\| + \|p_2 - p_3\| + \|p_3 - p_1\|$
such that $p_i \in \ell_i$ for $i = 1, 2, 3$
$f(p_1, p_2, p_3)$ is strictly convex

Inequalities from convexity

Almost all elementary (and many other) inequalities follow from convexity.

A well-known inequality
$e^z \ge 1 + z$ for all real numbers $z$.

Proof:
First-order condition $f(y) \ge f(x) + \nabla f(x)^T (y - x)$
Choose $f(t) = e^t$, $x = 0$ and $y = z$
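
Spelled out, the substitution reads as follows. The exponential is convex because its second derivative $e^t$ is positive, and its derivative at $0$ equals $1$:

```latex
f(t) = e^t, \quad f'(t) = e^t, \quad f'(0) = 1
\qquad\Longrightarrow\qquad
e^z = f(z) \ge f(0) + f'(0)\,(z - 0) = 1 + z .
```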

Inequalities: Jensen (1)

Johan Jensen (1859-1925): Danish mathematician and engineer

Theorem (Jensen, 1906)
For a convex function $f : \mathbb{R} \to \mathbb{R}$ and real numbers $x_1, \ldots, x_n$, we have
$\frac{1}{n} \sum_{i=1}^n f(x_i) \ge f\left( \frac{1}{n} \sum_{i=1}^n x_i \right)$.

Question: When does equality hold for strictly convex $f$?

If $f$ is concave: then the inequality holds with $\le$ instead of $\ge$

Theorem
For a convex function $f : \mathbb{R} \to \mathbb{R}$ and real numbers $x_1, \ldots, x_n$, and
positive real numbers $a_1, \ldots, a_n$, we have
$\frac{\sum_{i=1}^n a_i f(x_i)}{\sum_{i=1}^n a_i} \ge f\left( \frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i} \right)$.

Inequalities: Jensen (2)

Theorem
For positive real numbers $a_1, \ldots, a_n$, we have
$\frac{a_1 + a_2 + \cdots + a_n}{n} \ge (a_1 a_2 \cdots a_n)^{1/n}$
Proof: let $x_i = \ln a_i$, and use Jensen with $f(x) = e^x$
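
Written out, the proof is a single application of Jensen's inequality to the convex function $f(x) = e^x$ at the points $x_i = \ln a_i$:

```latex
\frac{a_1 + \cdots + a_n}{n}
  = \frac{1}{n} \sum_{i=1}^n e^{x_i}
  \ge \exp\!\left( \frac{1}{n} \sum_{i=1}^n x_i \right)
  = \exp\!\left( \frac{1}{n} \ln (a_1 a_2 \cdots a_n) \right)
  = (a_1 a_2 \cdots a_n)^{1/n} .
```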

Inequalities: Jensen (3)

Information theory considers the information content of a system that
produces messages $m_k$ with probability $p_k$,
where $p_1, p_2, \ldots, p_n \ge 0$ and $\sum_{k=1}^n p_k = 1$.

The entropy of a probability distribution is defined as

$H(p) = -\sum_{k=1}^n p_k \log p_k$.

The entropy satisfies the bound

$H(p) \le \log n$.
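
The bound can be derived from the weighted form of Jensen's inequality for the concave function $\log$. This is a sketch, taking the entropy as $H(p) = -\sum_k p_k \log p_k$ with the usual convention $0 \log 0 = 0$, so that only indices with $p_k > 0$ contribute; the weights are $a_k = p_k$ and the points are $x_k = 1/p_k$.

```latex
H(p) = \sum_{k:\, p_k > 0} p_k \log \frac{1}{p_k}
  \le \log\!\left( \sum_{k:\, p_k > 0} p_k \cdot \frac{1}{p_k} \right)
  = \log \bigl|\{\, k : p_k > 0 \,\}\bigr|
  \le \log n .
```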

Inequalities: Master (1)

Master inequality
Let $g : \mathbb{R} \to \mathbb{R}$ be a strictly concave function, and let $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be
the function that is defined by
$f(x, y) = y \cdot g\left( \frac{x}{y} \right)$.

Then all real numbers $x_1, \ldots, x_n$ and all positive real numbers $y_1, \ldots, y_n$
satisfy the inequality
$\sum_{i=1}^n f(x_i, y_i) \le f\left( \sum_{i=1}^n x_i, \sum_{i=1}^n y_i \right)$

Equality holds if and only if the two sequences $x_i$ and $y_i$ are proportional
(that is, if there exists a real number $t$ such that $x_i / y_i = t$ for all $i$).

Inequalities: Master (2)

How to prove the master inequality


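by induction on $n$
case $n = 1$ holds with equality
case $n = 2$ holds with $\lambda = y_1 / (y_1 + y_2)$ and $\mu = y_2 / (y_1 + y_2)$:
$$
\begin{aligned}
f(x_1, y_1) + f(x_2, y_2)
  &= y_1 \, g\!\left( \frac{x_1}{y_1} \right) + y_2 \, g\!\left( \frac{x_2}{y_2} \right) \\
  &= (y_1 + y_2) \left[ \frac{y_1}{y_1 + y_2} \, g\!\left( \frac{x_1}{y_1} \right) + \frac{y_2}{y_1 + y_2} \, g\!\left( \frac{x_2}{y_2} \right) \right] \\
  &\le (y_1 + y_2) \, g\!\left( \frac{x_1 + x_2}{y_1 + y_2} \right) = f(x_1 + x_2, y_1 + y_2) .
\end{aligned}
$$
As $g$ is strictly concave, equality holds if and only if $x_1 / y_1 = x_2 / y_2$.
The inductive step for $n \ge 3$ also follows from the inequality.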

Inequalities: Master (3)

Cauchy inequality
For real numbers $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$, we have

$(a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2) \ge (a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2$.

Proof:
use the strictly concave function $g(x) = \sqrt{x}$
set $x_i = a_i^2$ and $y_i = b_i^2$
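
Filling in the steps of this proof, as a sketch under two assumptions made here: $g$ is only used on $x \ge 0$, and all $b_i \ne 0$ so that $y_i > 0$ (indices with $b_i = 0$ add nothing to the right-hand side and can be dropped). With $g(x) = \sqrt{x}$ the function from the master inequality is $f(x, y) = y \sqrt{x / y} = \sqrt{xy}$, hence $f(a_i^2, b_i^2) = |a_i b_i|$.

```latex
\sum_{i=1}^n |a_i b_i|
  = \sum_{i=1}^n f(a_i^2, b_i^2)
  \le f\!\left( \sum_{i=1}^n a_i^2, \; \sum_{i=1}^n b_i^2 \right)
  = \sqrt{ \Bigl( \sum_{i=1}^n a_i^2 \Bigr) \Bigl( \sum_{i=1}^n b_i^2 \Bigr) } ,
\qquad
\Bigl( \sum_{i=1}^n a_i b_i \Bigr)^2 \le \Bigl( \sum_{i=1}^n |a_i b_i| \Bigr)^2 .
```

Squaring the first inequality and combining it with the second gives the Cauchy inequality.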

Homework week 4

Read chapters 2 and 3 in the book of Boyd & Vandenberghe


Recommended exercises:
25, 28, 30, 34, 36, 37, 39, 42

Collection of exercises can be downloaded from:


CANVAS:exercises_1-5
