Additional Exercises for Convex Optimization

Stephen Boyd Lieven Vandenberghe

March 18, 2016

This is a collection of additional exercises, meant to supplement those found in the book Convex
Optimization, by Stephen Boyd and Lieven Vandenberghe. These exercises were used in several
courses on convex optimization, EE364a (Stanford), EE236b (UCLA), or 6.975 (MIT), usually for
homework, but sometimes as exam questions. Some of the exercises were originally written for the
book, but were removed at some point. Many of them include a computational component using
CVX, a Matlab package for convex optimization; files required for these exercises can be found at
the book web site www.stanford.edu/~boyd/cvxbook/. We are in the process of adapting many
of these problems to be compatible with two other packages for convex optimization: CVXPY
(Python) and Convex.jl (Julia). Some of the exercises require a knowledge of elementary analysis.
You are free to use these exercises any way you like (for example in a course you teach), provided
you acknowledge the source. In turn, we gratefully acknowledge the teaching assistants (and in
some cases, students) who have helped us develop and debug these exercises. Pablo Parrilo helped
develop some of the exercises that were originally used in 6.975.
Course instructors can obtain solutions by emailing us. Please specify the course you are teaching
and give its URL.
We'll update this document as new exercises become available, so the exercise numbers and
sections will occasionally change. We have categorized the exercises into sections that follow the
book chapters, as well as various additional application areas. Some exercises fit into more than
one section, or don't fit well into any section, so we have just arbitrarily assigned these.

Stephen Boyd and Lieven Vandenberghe

Contents
1 Convex sets
2 Convex functions
3 Convex optimization problems
4 Duality
5 Approximation and fitting
6 Statistical estimation
7 Geometry
8 Unconstrained and equality constrained minimization
9 Interior point methods
10 Mathematical background
11 Circuit design
12 Signal processing and communications
13 Finance
14 Mechanical and aerospace engineering
15 Graphs and networks
16 Energy and power
17 Miscellaneous applications

1 Convex sets
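1.1 Is the set $\{a \in \mathbf{R}^k \mid p(0) = 1,\ |p(t)| \le 1 \text{ for } \alpha \le t \le \beta\}$, where

$$p(t) = a_1 + a_2 t + \cdots + a_k t^{k-1},$$

convex?
Solution. Yes, it is convex; it is the intersection of an infinite number of slabs,

$$\{a \mid -1 \le a_1 + a_2 t + \cdots + a_k t^{k-1} \le 1\},$$

parametrized by $t \in [\alpha, \beta]$, and the hyperplane

$$\{a \mid a_1 = 1\}$$

(since $p(0) = a_1$).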

1.2 Set distributive characterization of convexity. [?, p21], [?, Theorem 3.2] Show that $C \subseteq \mathbf{R}^n$ is
convex if and only if $(\alpha + \beta)C = \alpha C + \beta C$ for all nonnegative $\alpha$, $\beta$.
Solution. The equality is trivially true for $\alpha = \beta = 0$, so we will assume that $\alpha + \beta \neq 0$.
Dividing by $\alpha + \beta$, we can rephrase the theorem as follows: $C$ is convex if and only if

$$C = \theta C + (1-\theta)C$$

for $0 \le \theta \le 1$.
$C \subseteq \theta C + (1-\theta)C$ for all $\theta \in [0,1]$ is true for any set $C$, because $x = \theta x + (1-\theta)x$.
$C \supseteq \theta C + (1-\theta)C$ for all $\theta \in [0,1]$ is just a rephrasing of Jensen's inequality.

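1.3 Composition of linear-fractional functions. Suppose $\phi : \mathbf{R}^n \to \mathbf{R}^m$ and $\psi : \mathbf{R}^m \to \mathbf{R}^p$ are the
linear-fractional functions

$$\phi(x) = \frac{Ax + b}{c^T x + d}, \qquad \psi(y) = \frac{Ey + f}{g^T y + h},$$

with domains $\operatorname{dom}\phi = \{x \mid c^T x + d > 0\}$, $\operatorname{dom}\psi = \{y \mid g^T y + h > 0\}$. We associate with $\phi$ and
$\psi$ the matrices

$$\begin{bmatrix} A & b \\ c^T & d \end{bmatrix}, \qquad \begin{bmatrix} E & f \\ g^T & h \end{bmatrix},$$

respectively.
Now consider the composition $\Gamma$ of $\psi$ and $\phi$, i.e., $\Gamma(x) = \psi(\phi(x))$, with domain

$$\operatorname{dom}\Gamma = \{x \in \operatorname{dom}\phi \mid \phi(x) \in \operatorname{dom}\psi\}.$$

Show that $\Gamma$ is linear-fractional, and that the matrix associated with it is the product

$$\begin{bmatrix} E & f \\ g^T & h \end{bmatrix} \begin{bmatrix} A & b \\ c^T & d \end{bmatrix}.$$

Solution. We have, for $x \in \operatorname{dom}\Gamma$,

$$\Gamma(x) = \frac{E\big((Ax + b)/(c^T x + d)\big) + f}{g^T (Ax + b)/(c^T x + d) + h}.$$

Multiplying numerator and denominator by $c^T x + d$ yields

$$\Gamma(x) = \frac{EAx + Eb + f c^T x + f d}{g^T A x + g^T b + h c^T x + h d}
= \frac{(EA + f c^T)x + (Eb + f d)}{(g^T A + h c^T)x + (g^T b + h d)},$$

which is the linear-fractional function associated with the product matrix.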
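As a numerical sanity check of the product rule, the sketch below composes two randomly generated linear-fractional maps directly and via the product of their block matrices, and reports the largest discrepancy. The helper names (`matmul`, `apply_lf`, `random_lf_matrix`, `max_composition_error`) and the choice of a large corner entry to keep all denominators positive are illustrative assumptions, not part of the exercise.

```python
import random

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def apply_lf(M, x):
    """Apply the linear-fractional map with block matrix M = [[A, b], [c^T, d]].

    Returns None if x is outside the domain (denominator not positive).
    """
    z = [sum(row[j] * x[j] for j in range(len(x))) + row[-1] for row in M]
    if z[-1] <= 0:
        return None
    return [zi / z[-1] for zi in z[:-1]]

def random_lf_matrix(rng, rows, cols, offset=5.0):
    """Random block matrix; the large corner entry keeps denominators positive
    for the moderate inputs sampled below (an illustrative choice)."""
    M = [[rng.uniform(-1, 1) for _ in range(cols + 1)] for _ in range(rows + 1)]
    M[-1][-1] = offset
    return M

def max_composition_error(n=3, m=2, p=2, trials=200, seed=0):
    """Largest gap between psi(phi(x)) and the map built from the product matrix."""
    rng = random.Random(seed)
    M_phi = random_lf_matrix(rng, m, n)   # [[A, b], [c^T, d]]
    M_psi = random_lf_matrix(rng, p, m)   # [[E, f], [g^T, h]]
    M_prod = matmul(M_psi, M_phi)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        direct = apply_lf(M_psi, apply_lf(M_phi, x))
        via_product = apply_lf(M_prod, x)
        worst = max(worst, max(abs(s - t) for s, t in zip(direct, via_product)))
    return worst
```

Calling `max_composition_error()` should return a value at the level of floating-point rounding, consistent with the two maps being identical on the sampled points.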

1.4 Dual of exponential cone. The exponential cone $K_{\exp} \subseteq \mathbf{R}^3$ is defined as

$$K_{\exp} = \{(x, y, z) \mid y > 0,\ y e^{x/y} \le z\}.$$

Find the dual cone $K_{\exp}^*$.
We are not worried here about the fine details of what happens on the boundaries of these cones,
so you really needn't worry about it. But we make some comments here for those who do care
about such things.
The cone $K_{\exp}$ as defined above is not closed. To obtain its closure, we need to add the points

$$\{(x, y, z) \mid x \le 0,\ y = 0,\ z \ge 0\}.$$

(This makes no difference, since the dual of a cone is equal to the dual of its closure.)
Solution. The dual cone can be expressed as

$$K_{\exp}^* = \mathbf{cl}\{(u, v, w) \mid u < 0,\ -u e^{v/u} \le e w\}
= \{(u, v, w) \mid u < 0,\ -u e^{v/u} \le e w\} \cup \{(0, v, w) \mid v \ge 0,\ w \ge 0\},$$

where $\mathbf{cl}$ means closure. We didn't expect people to add the points on the boundary, i.e., to work
out the second set in the second line.
The dual cone can be expressed several other ways as well. The conditions $u < 0$, $-u e^{v/u} \le e w$
can be expressed as

$$(u/w)\log(-u/w) + v/w - u/w \ge 0,$$

or

$$u \log(-u/w) + v - u \ge 0,$$

or

$$\log(-u/w) \le 1 - (v/u).$$

Now let's derive the result. For $(u, v, w)$ to be in $K_{\exp}^*$, we need to have $ux + vy + wz \ge 0$ whenever
$y > 0$ and $y e^{x/y} \le z$. Thus, we must have $w \ge 0$; otherwise we can make $ux + vy + wz$ negative
by choosing a large enough $z$. Now let's see what happens when $w = 0$. In this case we need
$ux + vy \ge 0$ for all $x$ and all $y > 0$. This happens only if $u = 0$ and $v \ge 0$. So points with

$$u = 0, \quad v \ge 0, \quad w = 0$$

are in $K_{\exp}^*$.
Let's now consider the case $w > 0$. We'll minimize $ux + vy + wz$ over all $x$, $y$ and $z$ that satisfy
$y > 0$ and $y e^{x/y} \le z$. Since $w > 0$, we minimize over $z$ by taking $z = y e^{x/y}$, which yields

$$ux + vy + w y e^{x/y}.$$

Now we will minimize over $x$. If $u > 0$, this is unbounded below as $x \to -\infty$. If $u = 0$, the minimum
is not achieved, but occurs as $x \to -\infty$; the minimum value is then $vy$. This has a nonnegative
minimum value over all $y > 0$ only when $v \ge 0$. Thus we find that points satisfying

$$u = 0, \quad v \ge 0, \quad w > 0$$

are in $K_{\exp}^*$.
If $u < 0$, the minimum of $ux + vy + w y e^{x/y}$ occurs when its derivative with respect to $x$ vanishes.
This leads to $u + w e^{x/y} = 0$, i.e., $x = y \log(-u/w)$. For this value of $x$ the expression above becomes

$$y\big(u \log(-u/w) + v - u\big).$$

Now we minimize over $y > 0$. We get $-\infty$ if $u \log(-u/w) + v - u < 0$; we get $0$ if $u \log(-u/w) + v - u \ge 0$.
So finally, we have our condition:

$$u \log(-u/w) + v - u \ge 0, \quad u < 0, \quad w > 0.$$

Dividing by $u$ and taking the exponential, we can write this as

$$-u e^{v/u} \le e w, \quad u < 0.$$

(The condition $w > 0$ is implied by these two conditions.)
Finally, then, we have

$$K_{\exp}^* = \{(u, v, w) \mid u < 0,\ -u e^{v/u} \le e w\} \cup \{(0, v, w) \mid v \ge 0,\ w \ge 0\}.$$
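The characterization above can be spot-checked numerically. The sketch below samples points satisfying the primal condition $y > 0$, $y e^{x/y} \le z$ and points satisfying the dual condition $u < 0$, $-u e^{v/u} \le e w$, and returns the smallest inner product found, which should be nonnegative up to rounding. The sampling ranges and function names are illustrative assumptions; a random test of this kind can support the formula but does not prove it.

```python
import math
import random

def inner(dual_pt, primal_pt):
    """Inner product u*x + v*y + w*z."""
    return sum(a * b for a, b in zip(dual_pt, primal_pt))

def sample_primal(rng):
    """Point with y > 0 and y*exp(x/y) <= z."""
    y = rng.uniform(0.1, 2.0)
    x = rng.uniform(-3.0, 3.0)
    z = y * math.exp(x / y) + rng.uniform(0.0, 1.0)
    return (x, y, z)

def sample_dual(rng):
    """Point with u < 0 and -u*exp(v/u) <= e*w."""
    u = -rng.uniform(0.1, 2.0)
    v = rng.uniform(-3.0, 3.0)
    w = -u * math.exp(v / u - 1.0) + rng.uniform(0.0, 1.0)
    return (u, v, w)

def min_inner_product(trials=2000, seed=0):
    """Smallest sampled inner product between dual and primal points."""
    rng = random.Random(seed)
    return min(inner(sample_dual(rng), sample_primal(rng)) for _ in range(trials))
```

For contrast, the point $(u, v, w) = (-1, 0, 0.1)$ violates $-u e^{v/u} \le e w$, and it has a negative inner product with the primal point $(x, y, z) = (2.302585, 1, 10.5)$, so it is not in the dual cone.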

1.5 Dual of intersection of cones. Let $C$ and $D$ be closed convex cones in $\mathbf{R}^n$. In this problem we will
show that

$$(C \cap D)^* = C^* + D^*.$$

Here, $+$ denotes set addition: $C^* + D^*$ is the set $\{u + v \mid u \in C^*,\ v \in D^*\}$. In other words, the
dual of the intersection of two closed convex cones is the sum of the dual cones.

(a) Show that $C \cap D$ and $C^* + D^*$ are convex cones. (In fact, $C \cap D$ and $C^* + D^*$ are closed, but
we won't ask you to show this.)
(b) Show that $(C \cap D)^* \supseteq C^* + D^*$.
(c) Now let's show $(C \cap D)^* \subseteq C^* + D^*$. You can do this by first showing

$$(C \cap D)^* \subseteq C^* + D^* \iff C \cap D \supseteq (C^* + D^*)^*.$$

You can use the following result:

If $K$ is a closed convex cone, then $K^{**} = K$.

Next, show that $C \cap D \supseteq (C^* + D^*)^*$ and conclude $(C \cap D)^* = C^* + D^*$.
(d) Show that the dual of the polyhedral cone $V = \{x \mid Ax \succeq 0\}$ can be expressed as

$$V^* = \{A^T v \mid v \succeq 0\}.$$

Solution.

(a) Suppose $x \in C \cap D$. This implies that $x \in C$ and $x \in D$, which implies $\theta x \in C$ and $\theta x \in D$
for any $\theta \ge 0$. Thus, $\theta x \in C \cap D$ for any $\theta \ge 0$, which shows $C \cap D$ is a cone. We know $C \cap D$
is convex since the intersection of convex sets is convex.
To show $C^* + D^*$ is a convex cone, note that both $C^*$ and $D^*$ are convex cones, thus
$C^* + D^*$ is the conic hull of $C^* \cup D^*$, which is a convex cone.
(b) Suppose $x \in C^* + D^*$. We can write $x$ as $x = u + v$, where $u \in C^*$ and $v \in D^*$. We know
$u^T y \ge 0$ for all $y \in C$ and $v^T y \ge 0$ for all $y \in D$, which implies that $x^T y = u^T y + v^T y \ge 0$ for
all $y \in C \cap D$. This shows $x \in (C \cap D)^*$, and so $(C \cap D)^* \supseteq C^* + D^*$.
(c) By part (a), together with the closedness noted there, $C \cap D$ and $C^* + D^*$ are closed convex cones. This implies
$(C \cap D)^{**} = C \cap D$ and $(C^* + D^*)^{**} = C^* + D^*$, and so

$$(C \cap D)^* \subseteq C^* + D^* \iff C \cap D \supseteq (C^* + D^*)^*.$$

Suppose $x \in (C^* + D^*)^*$. Then $x^T y \ge 0$ for all $y = u + v$, where $u \in C^*$, $v \in D^*$. This can be
written as $x^T u + x^T v \ge 0$ for all $u \in C^*$ and $v \in D^*$. Since $0 \in C^*$ and $0 \in D^*$, taking $v = 0$
we get $x^T u \ge 0$ for all $u \in C^*$, and taking $u = 0$ we get $x^T v \ge 0$ for all $v \in D^*$. This implies
$x \in C^{**} = C$ and $x \in D^{**} = D$, i.e., $x \in C \cap D$.
So we have shown both $(C \cap D)^* \supseteq C^* + D^*$ and $(C \cap D)^* \subseteq C^* + D^*$, which implies
$(C \cap D)^* = C^* + D^*$.
(d) Using the result we just proved, we can write

$$V^* = \{x \mid a_1^T x \ge 0\}^* + \cdots + \{x \mid a_m^T x \ge 0\}^*,$$

where $a_1^T, \ldots, a_m^T$ are the rows of $A$. The dual of $\{x \mid a_i^T x \ge 0\}$ is the set $\{\theta a_i \mid \theta \ge 0\}$, so we get

$$V^* = \{\theta a_1 \mid \theta \ge 0\} + \cdots + \{\theta a_m \mid \theta \ge 0\}
= \{\theta_1 a_1 + \cdots + \theta_m a_m \mid \theta_i \ge 0,\ i = 1, \ldots, m\}.$$

This can be more compactly written as

$$V^* = \{A^T v \mid v \succeq 0\}.$$

1.6 Polar of a set. The polar of $C \subseteq \mathbf{R}^n$ is defined as the set

$$C^\circ = \{y \in \mathbf{R}^n \mid y^T x \le 1 \text{ for all } x \in C\}.$$

(a) Show that $C^\circ$ is convex (even if $C$ is not).
(b) What is the polar of a cone?
(c) What is the polar of the unit ball for a norm $\|\cdot\|$?
(d) Show that if $C$ is closed and convex, with $0 \in \operatorname{int} C$, then $(C^\circ)^\circ = C$.

Solution.

(a) The polar is the intersection of halfspaces $\{y \mid y^T x \le 1\}$, parametrized by $x \in C$, so it is
convex.
(b) If $C$ is a cone, then we have $y^T x \le 1$ for all $x \in C$ if and only if $y^T x \le 0$ for all $x \in C$. (To see
this, suppose that $y^T x > 0$ for some $x \in C$. Then $\alpha x \in C$ for all $\alpha > 0$, so $\alpha y^T x$ can be made
arbitrarily large, and in particular, exceeds one for $\alpha$ large enough.) Therefore the polar of a
cone $K$ is $-K^*$, the negative of the dual cone.
(c) The polar of the unit ball in the norm $\|\cdot\|$ is the unit ball for the dual norm $\|\cdot\|_*$.
(d) Suppose that $x \in C$ and $y \in C^\circ$. Then we have $y^T x \le 1$. Since this is true for any $y \in C^\circ$,
we have $x^T y \le 1$ for all $y \in C^\circ$. Thus we have $x \in (C^\circ)^\circ$. So $C \subseteq (C^\circ)^\circ$.
Now suppose that $x \in (C^\circ)^\circ \setminus C$. Then we can find a separating hyperplane for $\{x\}$ and $C$,
i.e., $a \neq 0$ and $b$ with $a^T x > b$, and $a^T z \le b$ for $z \in C$. Since $z = 0 \in \operatorname{int} C$, we have $b > 0$.
By scaling $a$ and $b$, we can assume that $a^T x > 1$ and $a^T z \le 1$ for all $z \in C$. Thus, $a \in C^\circ$.
Our assumption $x \in (C^\circ)^\circ$ then tells us that $a^T x \le 1$, a contradiction.

1.7 Dual cones in $\mathbf{R}^2$. Describe the dual cone for each of the following cones.

(a) $K = \{0\}$.
(b) $K = \mathbf{R}^2$.
(c) $K = \{(x_1, x_2) \mid |x_1| \le x_2\}$.
(d) $K = \{(x_1, x_2) \mid x_1 + x_2 = 0\}$.

Solution.

(a) $K^* = \mathbf{R}^2$. To see this:

$$K^* = \{y \mid y^T x \ge 0 \text{ for all } x \in K\} = \{y \mid y^T 0 \ge 0\} = \mathbf{R}^2.$$

(b) $K^* = \{0\}$. To see this, we need to identify the values of $y \in \mathbf{R}^2$ for which $y^T x \ge 0$ for all
$x \in \mathbf{R}^2$. But given any $y \neq 0$, consider the choice $x = -y$, for which we have $y^T x = -\|y\|_2^2 < 0$.
So the only possible choice is $y = 0$ (which indeed satisfies $y^T x \ge 0$ for all $x \in \mathbf{R}^2$).
(c) $K^* = K$. (This cone is self-dual.)
(d) $K^* = \{(x_1, x_2) \mid x_1 - x_2 = 0\}$. Here $K$ is a line, and $K^*$ is the line orthogonal to it.

2 Convex functions
2.1 Maximum of a convex function over a polyhedron. Show that the maximum of a convex function $f$
over the polyhedron $P = \operatorname{conv}\{v_1, \ldots, v_k\}$ is achieved at one of its vertices, i.e.,

$$\sup_{x \in P} f(x) = \max_{i=1,\ldots,k} f(v_i).$$

(A stronger statement is: the maximum of a convex function over a closed bounded convex set is
achieved at an extreme point, i.e., a point in the set that is not a convex combination of any other
points in the set.) Hint. Assume the statement is false, and use Jensen's inequality.
Solution. Let's assume the statement is false: We have $z \in P$ with $f(z) > f(v_i)$ for $i = 1, \ldots, k$.
We can represent $z$ as

$$z = \lambda_1 v_1 + \cdots + \lambda_k v_k,$$

where $\lambda \succeq 0$, $\mathbf{1}^T \lambda = 1$. Jensen's inequality tells us that

$$f(z) = f(\lambda_1 v_1 + \cdots + \lambda_k v_k) \le \lambda_1 f(v_1) + \cdots + \lambda_k f(v_k) < f(z),$$

so we have a contradiction.
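A quick numerical illustration of the statement: the sketch below samples random convex combinations of a few vertices in $\mathbf{R}^2$ and confirms that a convex function never exceeds its largest vertex value on the samples. The test function `f`, the vertex list `VERTICES`, and the helper names are arbitrary illustrative choices.

```python
import random

def f(x):
    """A convex test function (arbitrary choice): squared distance to a point plus |x1|."""
    return (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2 + abs(x[0])

def random_simplex_weights(rng, k):
    """Nonnegative weights summing to one."""
    w = [rng.random() for _ in range(k)]
    s = sum(w)
    return [wi / s for wi in w]

def max_excess_over_vertices(vertices, trials=5000, seed=0):
    """Largest sampled value of f(z) - max_i f(v_i), with z in conv{v_1, ..., v_k}."""
    rng = random.Random(seed)
    best_vertex = max(f(v) for v in vertices)
    worst = float("-inf")
    for _ in range(trials):
        lam = random_simplex_weights(rng, len(vertices))
        z = [sum(l * v[j] for l, v in zip(lam, vertices)) for j in range(2)]
        worst = max(worst, f(z) - best_vertex)
    return worst

VERTICES = [(1.0, 0.0), (0.0, 2.0), (-1.5, 0.5), (-0.5, -1.0)]
```

`max_excess_over_vertices(VERTICES)` should be nonpositive (up to rounding), in line with the Jensen argument above.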
2.2 A general vector composition rule. Suppose

$$f(x) = h(g_1(x), g_2(x), \ldots, g_k(x))$$

where $h : \mathbf{R}^k \to \mathbf{R}$ is convex, and $g_i : \mathbf{R}^n \to \mathbf{R}$. Suppose that for each $i$, one of the following holds:
• $h$ is nondecreasing in the $i$th argument, and $g_i$ is convex
• $h$ is nonincreasing in the $i$th argument, and $g_i$ is concave
• $g_i$ is affine.
Show that $f$ is convex. (This composition rule subsumes all the ones given in the book, and is
the one used in software systems such as CVX.) You can assume that $\operatorname{dom} h = \mathbf{R}^k$; the result
also holds in the general case when the monotonicity conditions listed above are imposed on $\tilde h$, the
extended-valued extension of $h$.
Solution. Fix $x$, $y$, and $\theta \in [0,1]$, and let $z = \theta x + (1-\theta)y$. Let's re-arrange the indexes so that
$g_i$ is affine for $i = 1, \ldots, p$, $g_i$ is convex for $i = p+1, \ldots, q$, and $g_i$ is concave for $i = q+1, \ldots, k$.
Therefore we have

$$g_i(z) = \theta g_i(x) + (1-\theta) g_i(y), \quad i = 1, \ldots, p,$$
$$g_i(z) \le \theta g_i(x) + (1-\theta) g_i(y), \quad i = p+1, \ldots, q,$$
$$g_i(z) \ge \theta g_i(x) + (1-\theta) g_i(y), \quad i = q+1, \ldots, k.$$

We then have

$$\begin{aligned}
f(z) &= h(g_1(z), g_2(z), \ldots, g_k(z)) \\
&\le h(\theta g_1(x) + (1-\theta) g_1(y), \ldots, \theta g_k(x) + (1-\theta) g_k(y)) \\
&\le \theta h(g_1(x), \ldots, g_k(x)) + (1-\theta) h(g_1(y), \ldots, g_k(y)) \\
&= \theta f(x) + (1-\theta) f(y).
\end{aligned}$$

The second line holds since, for $i = p+1, \ldots, q$, we have increased the $i$th argument of $h$, which
is (by assumption) nondecreasing in the $i$th argument, and for $i = q+1, \ldots, k$, we have decreased
the $i$th argument, and $h$ is nonincreasing in these arguments. The third line follows from convexity
of $h$.

2.3 Logarithmic barrier for the second-order cone. The function $f(x,t) = -\log(t^2 - x^T x)$, with $\operatorname{dom} f =
\{(x,t) \in \mathbf{R}^n \times \mathbf{R} \mid t > \|x\|_2\}$ (i.e., the second-order cone), is convex. (The function $f$ is called the
logarithmic barrier function for the second-order cone.) This can be shown many ways, for example
by evaluating the Hessian and demonstrating that it is positive semidefinite. In this exercise you
establish convexity of $f$ using a relatively painless method, leveraging some composition rules and
known convexity of a few other functions.

(a) Explain why $t - (1/t)x^T x$ is a concave function on $\operatorname{dom} f$. Hint. Use convexity of the quadratic
over linear function.
(b) From this, show that $-\log(t - (1/t)x^T x)$ is a convex function on $\operatorname{dom} f$.
(c) From this, show that $f$ is convex.

Solution.

(a) $(1/t)x^T x$ is the quadratic over linear function, which is convex on $\operatorname{dom} f$. So $t - (1/t)x^T x$ is
concave, since it is a linear function minus a convex function.
(b) The function $g(u) = -\log u$ is convex and decreasing, so if $u$ is a concave (positive) function,
the composition rules tell us that $g \circ u$ is convex. Here this means $-\log(t - (1/t)x^T x)$ is a
convex function on $\operatorname{dom} f$.
(c) We write $f(x,t) = -\log(t - (1/t)x^T x) - \log t$, which shows that $f$ is a sum of two convex
functions, hence convex.

2.4 A quadratic-over-linear composition theorem. Suppose that $f : \mathbf{R}^n \to \mathbf{R}$ is nonnegative and convex,
and $g : \mathbf{R}^n \to \mathbf{R}$ is positive and concave. Show that the function $f^2/g$, with domain $\operatorname{dom} f \cap \operatorname{dom} g$,
is convex.
Solution. Without loss of generality we can assume that $n = 1$. (The general case follows by
restricting to an arbitrary line.) Let $x$ and $y$ be in the domains of $f$ and $g$, and let $\theta \in [0,1]$, and
define $z = \theta x + (1-\theta)y$. By convexity of $f$ we have

$$f(z) \le \theta f(x) + (1-\theta) f(y).$$

Since $f(z)$ and $\theta f(x) + (1-\theta) f(y)$ are nonnegative, we have

$$f(z)^2 \le (\theta f(x) + (1-\theta) f(y))^2.$$

(The square function is monotonic on $\mathbf{R}_+$.) By concavity of $g$, we have

$$g(z) \ge \theta g(x) + (1-\theta) g(y).$$

Putting the last two together, we have

$$\frac{f(z)^2}{g(z)} \le \frac{(\theta f(x) + (1-\theta) f(y))^2}{\theta g(x) + (1-\theta) g(y)}.$$

Now we use convexity of the function $u^2/v$, for $v > 0$, to conclude

$$\frac{(\theta f(x) + (1-\theta) f(y))^2}{\theta g(x) + (1-\theta) g(y)} \le \theta \frac{f(x)^2}{g(x)} + (1-\theta) \frac{f(y)^2}{g(y)}.$$

This, together with the inequality above, finishes the proof.

2.5 A perspective composition rule. [?] Let $f : \mathbf{R}^n \to \mathbf{R}$ be a convex function with $f(0) \le 0$.

(a) Show that the perspective $t f(x/t)$, with domain $\{(x,t) \mid t > 0,\ x/t \in \operatorname{dom} f\}$, is nonincreasing
as a function of $t$.
(b) Let $g$ be concave and positive on its domain. Show that the function

$$h(x) = g(x) f(x/g(x)), \qquad \operatorname{dom} h = \{x \in \operatorname{dom} g \mid x/g(x) \in \operatorname{dom} f\}$$

is convex.
(c) As an example, show that

$$h(x) = \frac{x^T x}{\left(\prod_{k=1}^n x_k\right)^{1/n}}, \qquad \operatorname{dom} h = \mathbf{R}^n_{++}$$

is convex.

Solution.

(a) Suppose $t_2 > t_1 > 0$ and $x/t_1 \in \operatorname{dom} f$, $x/t_2 \in \operatorname{dom} f$. We have $x/t_2 = \theta (x/t_1) + (1-\theta) 0$
where $\theta = t_1/t_2$. Hence, by Jensen's inequality,

$$f(x/t_2) \le \frac{t_1}{t_2} f(x/t_1) + \left(1 - \frac{t_1}{t_2}\right) f(0) \le \frac{t_1}{t_2} f(x/t_1)$$

if $f(0) \le 0$. Therefore

$$t_2 f(x/t_2) \le t_1 f(x/t_1).$$

If we assume that $f$ is differentiable, we can also verify that the derivative of the perspective
with respect to $t$ is less than or equal to zero. We have

$$\frac{\partial}{\partial t}\, t f(x/t) = f(x/t) - \nabla f(x/t)^T (x/t).$$

This is less than or equal to zero, because, from convexity of $f$,

$$0 \ge f(0) \ge f(x/t) + \nabla f(x/t)^T (0 - x/t).$$

(b) This follows from a composition theorem: $t f(x/t)$ is convex in $(x,t)$ and nonincreasing in $t$;
therefore if $g(x)$ is concave then $g(x) f(x/g(x))$ is convex. This composition rule is actually
not mentioned in the lectures or the textbook, but it is easily derived as follows. Let us denote
the perspective function $t f(x/t)$ by $F(x,t)$. Consider a convex combination $x = \theta u + (1-\theta) v$
of two points in $\operatorname{dom} h$. Then

$$h(\theta u + (1-\theta) v) = F(\theta u + (1-\theta) v,\ g(\theta u + (1-\theta) v))
\le F(\theta u + (1-\theta) v,\ \theta g(u) + (1-\theta) g(v)),$$

because $g$ is concave, and $F(x,t)$ is nonincreasing with respect to $t$. Convexity of $F$ then gives

$$h(\theta u + (1-\theta) v) \le \theta F(u, g(u)) + (1-\theta) F(v, g(v)) = \theta h(u) + (1-\theta) h(v).$$

As an alternative proof, we can establish convexity of $h$ directly from the definition. Consider
two points $u, v \in \operatorname{dom} g$, with $u/g(u) \in \operatorname{dom} f$ and $v/g(v) \in \operatorname{dom} f$. We take a convex
combination $x = \theta u + (1-\theta) v$. This point lies in $\operatorname{dom} g$ because $\operatorname{dom} g$ is convex. Also,

$$\frac{x}{g(x)} = \frac{\theta g(u)}{g(x)} \frac{u}{g(u)} + \frac{(1-\theta) g(v)}{g(x)} \frac{v}{g(v)}
= \mu_1 \frac{u}{g(u)} + \mu_2 \frac{v}{g(v)} + \mu_3 \cdot 0$$

where

$$\mu_1 = \frac{\theta g(u)}{g(x)}, \qquad \mu_2 = \frac{(1-\theta) g(v)}{g(x)}, \qquad \mu_3 = 1 - \mu_1 - \mu_2.$$

These coefficients are nonnegative and add up to one because $g(x) > 0$ on its domain, and
$\theta g(u) + (1-\theta) g(v) \le g(x)$ by concavity of $g$. Therefore $x/g(x)$ is a convex combination of
three points in $\operatorname{dom} f$, and therefore in $\operatorname{dom} f$ itself. This shows that $\operatorname{dom} h$ is convex.
Next we verify Jensen's inequality:

$$f(x/g(x)) \le \mu_1 f(u/g(u)) + \mu_2 f(v/g(v)) + \mu_3 f(0) \le \mu_1 f(u/g(u)) + \mu_2 f(v/g(v))$$

because $f(0) \le 0$. Substituting the expressions for $\mu_1$ and $\mu_2$ we get

$$g(x) f(x/g(x)) \le \theta g(u) f(u/g(u)) + (1-\theta) g(v) f(v/g(v)),$$

i.e., Jensen's inequality $h(x) \le \theta h(u) + (1-\theta) h(v)$.
(c) Apply part (b) to $f(x) = x^T x$ and $g(x) = \left(\prod_k x_k\right)^{1/n}$.

2.6 Perspective of log determinant. Show that $f(X,t) = nt \log t - t \log\det X$, with $\operatorname{dom} f = \mathbf{S}^n_{++} \times \mathbf{R}_{++}$,
is convex in $(X,t)$. Use this to show that

$$g(X) = n(\operatorname{tr} X)\log(\operatorname{tr} X) - (\operatorname{tr} X)(\log\det X)
= \left(\sum_{i=1}^n \lambda_i\right)\left(n \log \sum_{i=1}^n \lambda_i - \sum_{i=1}^n \log \lambda_i\right),$$

where $\lambda_i$ are the eigenvalues of $X$, is convex on $\mathbf{S}^n_{++}$.
Solution. This is the perspective function of $-\log\det X$:

$$f(X,t) = -t \log\det(X/t) = nt \log t - t \log\det X.$$

Convexity of $g$ follows from $g(X) = f(X, \operatorname{tr} X)$ and the fact that $\operatorname{tr} X$ is a linear function of $X$
(and positive).

2.7 Pre-composition with a linear fractional mapping. Suppose $f : \mathbf{R}^m \to \mathbf{R}$ is convex, and $A \in \mathbf{R}^{m \times n}$,
$b \in \mathbf{R}^m$, $c \in \mathbf{R}^n$, and $d \in \mathbf{R}$. Show that $g : \mathbf{R}^n \to \mathbf{R}$, defined by

$$g(x) = (c^T x + d) f((Ax + b)/(c^T x + d)), \qquad \operatorname{dom} g = \{x \mid c^T x + d > 0\},$$

is convex.
Solution. $g$ is just the composition of the perspective of $f$ (which is convex) with the affine map
that takes $x$ to $(Ax + b,\ c^T x + d)$, and so is convex.

2.8 Scalar valued linear fractional functions. A function $f : \mathbf{R}^n \to \mathbf{R}$ is called linear fractional if it has
the form $f(x) = (a^T x + b)/(c^T x + d)$, with $\operatorname{dom} f = \{x \mid c^T x + d > 0\}$. When is a linear fractional
function convex? When is a linear fractional function quasiconvex?
Solution. Linear fractional functions are always quasiconvex, since the sublevel sets are convex,
defined by one strict and one nonstrict linear inequality:

$$f(x) \le t \iff c^T x + d > 0, \quad a^T x + b - t(c^T x + d) \le 0.$$

To analyze convexity, we form the Hessian:

$$\nabla^2 f(x) = -(c^T x + d)^{-2}(a c^T + c a^T) + 2(a^T x + b)(c^T x + d)^{-3} c c^T.$$

First assume that $a$ and $c$ are not colinear. In this case, we can find $x$ with $c^T x + d = 1$ (so,
$x \in \operatorname{dom} f$) with $a^T x + b$ taking any desired value. By taking it large and negative, we see
that the Hessian is not positive semidefinite, so $f$ is not convex.
So for $f$ to be convex, we must have $a$ and $c$ colinear. If $c$ is zero, then $f$ is affine (hence convex).
Assume now that $c$ is nonzero, and that $a = \alpha c$ for some $\alpha \in \mathbf{R}$. In this case, $f$ reduces to

$$f(x) = \frac{\alpha c^T x + b}{c^T x + d} = \alpha + \frac{b - \alpha d}{c^T x + d},$$

which is convex if and only if $b \ge \alpha d$.
So a linear fractional function is convex only in some very special cases: it is affine, or a constant
plus a nonnegative constant times the inverse of $c^T x + d$.
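The condition $b \ge \alpha d$ can be illustrated in the scalar case. The sketch below takes $n = 1$, $c = 1$, $a = \alpha$, so that $f(x) = (\alpha x + b)/(x + d)$ on $x > -d$, and tests the midpoint convexity inequality on a grid of sample points. The helper names and the specific sample values are illustrative assumptions; passing a midpoint test on finitely many points is only evidence of convexity, while a single failure does disprove it.

```python
def linear_fractional(alpha, b, d):
    """Scalar case with c = 1 and a = alpha*c: f(x) = (alpha*x + b) / (x + d), x > -d."""
    return lambda x: (alpha * x + b) / (x + d)

def is_midpoint_convex(f, points, tol=1e-12):
    """Check f((x+y)/2) <= (f(x)+f(y))/2 on all pairs of sample points."""
    return all(f(0.5 * (x + y)) <= 0.5 * (f(x) + f(y)) + tol
               for x in points for y in points)

# Sample points inside the domain x > -d for d = 1.
POINTS = [-0.5 + 0.25 * i for i in range(20)]
```

With $\alpha = 2$, $d = 1$: the choice $b = 5$ satisfies $b \ge \alpha d$ and passes the check, while $b = 1$ violates it and fails, matching the formula $f(x) = \alpha + (b - \alpha d)/(x + d)$.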

2.9 Show that the function

$$f(x) = \frac{\|Ax - b\|_2^2}{1 - x^T x}$$

is convex on $\{x \mid \|x\|_2 < 1\}$.
Solution. The epigraph is convex because

$$\frac{\|Ax - b\|_2^2}{t} \le 1 - x^T x$$

defines a convex set.
Here's another solution: The function $\|Ax - b\|_2^2/u$ is convex in $(x,u)$, since it is the quadratic over
linear function, pre-composed with an affine mapping. This function is decreasing in its second
argument, so by a composition rule, we can replace the second argument with any concave function,
and the result is convex. But $u = 1 - x^T x$ is concave, so we're done.

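2.10 Weighted geometric mean. The geometric mean $f(x) = \left(\prod_k x_k\right)^{1/n}$ with $\operatorname{dom} f = \mathbf{R}^n_{++}$ is concave,
as shown on page 74. Extend the proof to show that

$$f(x) = \prod_{k=1}^n x_k^{\alpha_k}, \qquad \operatorname{dom} f = \mathbf{R}^n_{++}$$

is concave, where $\alpha_k$ are nonnegative numbers with $\sum_k \alpha_k = 1$.
Solution. The Hessian of $f$ is

$$\nabla^2 f(x) = f(x)\left(q q^T - \operatorname{diag}(\alpha)^{-1} \operatorname{diag}(q)^2\right)$$

where $q$ is the vector $(\alpha_1/x_1, \ldots, \alpha_n/x_n)$. We have

$$y^T \nabla^2 f(x) y = f(x)\left(\Big(\sum_{k=1}^n \alpha_k y_k/x_k\Big)^2 - \sum_{k=1}^n \alpha_k y_k^2/x_k^2\right) \le 0$$

by applying the Cauchy-Schwarz inequality $(u^T v)^2 \le (\|u\|_2 \|v\|_2)^2$ to the vectors

$$u = (\sqrt{\alpha_1}\, y_1/x_1, \ldots, \sqrt{\alpha_n}\, y_n/x_n), \qquad v = (\sqrt{\alpha_1}, \ldots, \sqrt{\alpha_n}).$$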
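The closed-form quadratic form can be compared against a finite-difference estimate of the second derivative of $f$ along a direction $y$. The sketch below does this at one point; the point `X`, weights `ALPHA`, direction `Y`, step size, and function names are illustrative assumptions.

```python
def weighted_geo_mean(x, alpha):
    """f(x) = prod_k x_k ** alpha_k for x > 0."""
    out = 1.0
    for xk, ak in zip(x, alpha):
        out *= xk ** ak
    return out

def hessian_quadratic_form(x, alpha, y):
    """y^T Hess f(x) y, using the closed-form expression from the solution."""
    s1 = sum(ak * yk / xk for xk, ak, yk in zip(x, alpha, y))
    s2 = sum(ak * yk ** 2 / xk ** 2 for xk, ak, yk in zip(x, alpha, y))
    return weighted_geo_mean(x, alpha) * (s1 ** 2 - s2)

def second_difference(x, alpha, y, h=1e-3):
    """Central finite-difference estimate of the second derivative of f along y."""
    xp = [xk + h * yk for xk, yk in zip(x, y)]
    xm = [xk - h * yk for xk, yk in zip(x, y)]
    f0 = weighted_geo_mean(x, alpha)
    return (weighted_geo_mean(xp, alpha) - 2 * f0 + weighted_geo_mean(xm, alpha)) / h ** 2

X = [1.0, 2.0, 3.0]
ALPHA = [0.2, 0.3, 0.5]   # nonnegative weights summing to one
Y = [1.0, -1.0, 0.5]
```

At this point the two values should agree closely and be nonpositive, consistent with the Cauchy-Schwarz argument.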

2.11 Suppose that $f : \mathbf{R}^n \to \mathbf{R}$ is convex, and define

$$g(x,t) = f(x/t), \qquad \operatorname{dom} g = \{(x,t) \mid x/t \in \operatorname{dom} f,\ t > 0\}.$$

Show that $g$ is quasiconvex.
Solution. The $\alpha$-sublevel set of $g$ is defined by

$$f(x/t) \le \alpha, \quad t > 0, \quad x/t \in \operatorname{dom} f.$$

This is equivalent to

$$t f(x/t) \le t\alpha, \quad t > 0, \quad x/t \in \operatorname{dom} f.$$

The function $t f(x/t)$, with domain $\{(x,t) \mid x/t \in \operatorname{dom} f,\ t > 0\}$, is the perspective function of $f$,
and is convex. The first inequality above is convex, since the righthand side is an affine function of
$t$ (for fixed $\alpha$). So the $\alpha$-sublevel set of $g$ is convex, and we're done.

2.12 Continued fraction function. Show that the function

$$f(x) = \cfrac{1}{x_1 - \cfrac{1}{x_2 - \cfrac{1}{x_3 - \cfrac{1}{x_4}}}}$$

defined where every denominator is positive, is convex and decreasing. (There is nothing special
about $n = 4$ here; the same holds for any number of variables.)
Solution. We will use the composition rules and recursion. $g_4(x) = 1/x_4$ is clearly convex and
decreasing in $x_4$. The function $\frac{1}{x_3 - z}$ is convex in $x_3$ and $z$ (over $\operatorname{dom} f$), and is decreasing in $x_3$ and
increasing in $z$; it follows by the composition rules that $g_3(x) = \frac{1}{x_3 - g_4(x)}$ is convex and decreasing
in all its variables. Repeating this argument for $g_k(x) = \frac{1}{x_k - g_{k+1}(x)}$ shows that $f$ is convex and
decreasing.
Here is another way: $g_1(x) = x_3 - 1/x_4$ is clearly concave and increasing in $x_3$ and $x_4$. The
function $x_2 - 1/z$ is concave and increasing in $x_2$ and $z$; it follows by the composition rules that
$g_2(x) = x_2 - 1/g_1(x)$ is concave and increasing. Repeating this argument shows that $1/f$ is concave,
increasing, and positive, so $f$ is convex and decreasing.
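The recursion $g_k = 1/(x_k - g_{k+1})$ used in the first argument translates directly into code. The sketch below evaluates the continued fraction from the innermost term outward and spot-checks midpoint convexity and monotonicity on random points; the sampling box $[2,3]^n$ (which keeps every denominator above 1) and the function names are illustrative assumptions.

```python
import random

def continued_fraction(x):
    """f(x) = 1/(x1 - 1/(x2 - ... - 1/xn)), evaluated from the innermost term out."""
    g = 1.0 / x[-1]
    for xk in reversed(x[:-1]):
        denom = xk - g
        if denom <= 0:
            raise ValueError("outside the domain: a denominator is not positive")
        g = 1.0 / denom
    return g

def check_convex_and_decreasing(n=4, trials=2000, seed=0, tol=1e-12):
    """Random spot check of midpoint convexity and of monotone decrease."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(2.0, 3.0) for _ in range(n)]
        y = [rng.uniform(2.0, 3.0) for _ in range(n)]
        mid = [0.5 * (a + b) for a, b in zip(x, y)]
        avg = 0.5 * (continued_fraction(x) + continued_fraction(y))
        if continued_fraction(mid) > avg + tol:
            return False
        bigger = [a + 0.1 for a in x]   # increase every coordinate
        if continued_fraction(bigger) > continued_fraction(x) + tol:
            return False
    return True
```

For example, `continued_fraction([2.0, 2.0, 2.0, 2.0])` works out to $1/(2 - 1/(2 - 1/(2 - 1/2))) = 0.8$.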

2.13 Circularly symmetric Huber function. The scalar Huber function is defined as

$$f_{\mathrm{hub}}(x) = \begin{cases} (1/2)x^2 & |x| \le 1 \\ |x| - 1/2 & |x| > 1. \end{cases}$$

This convex function comes up in several applications, including robust estimation. This problem
concerns generalizations of the Huber function to $\mathbf{R}^n$. One generalization to $\mathbf{R}^n$ is given
by $f_{\mathrm{hub}}(x_1) + \cdots + f_{\mathrm{hub}}(x_n)$, but this function is not circularly symmetric, i.e., invariant under
transformation of $x$ by an orthogonal matrix. A generalization to $\mathbf{R}^n$ that is circularly symmetric
is

$$f_{\mathrm{cshub}}(x) = f_{\mathrm{hub}}(\|x\|_2) = \begin{cases} (1/2)\|x\|_2^2 & \|x\|_2 \le 1 \\ \|x\|_2 - 1/2 & \|x\|_2 > 1. \end{cases}$$

(The subscript stands for circularly symmetric Huber function.) Show that $f_{\mathrm{cshub}}$ is convex. Find
the conjugate function $f_{\mathrm{cshub}}^*$.
Solution. We can't directly use the composition form given above, since $f_{\mathrm{hub}}$ is not nondecreasing.
But we can write $f_{\mathrm{cshub}} = h \circ g$, where $h : \mathbf{R} \to \mathbf{R}$ and $g : \mathbf{R}^n \to \mathbf{R}$ are defined as

$$h(x) = \begin{cases} 0 & x \le 0 \\ x^2/2 & 0 \le x \le 1 \\ x - 1/2 & x > 1, \end{cases}$$

and $g(x) = \|x\|_2$. We can think of $h$ as a version of the scalar Huber function, modified to be
zero when its argument is negative. Clearly, $g$ is convex and $h$ is convex and increasing. Thus,
from the composition rules we conclude that $f_{\mathrm{cshub}}$ is convex.
Now we will show that

$$f_{\mathrm{cshub}}^*(y) = \begin{cases} (1/2)\|y\|_2^2 & \|y\|_2 \le 1 \\ \infty & \text{otherwise.} \end{cases}$$

Suppose $\|y\|_2 > 1$. Taking $x = t y/\|y\|_2$, we see that for $t \ge 1$ we have

$$y^T x - f_{\mathrm{cshub}}(x) = t\|y\|_2 - t + 1/2 = t(\|y\|_2 - 1) + 1/2.$$

Letting $t \to \infty$, we see that for any $y$ with $\|y\|_2 > 1$, $\sup_x (y^T x - f_{\mathrm{cshub}}(x)) = \infty$. Thus, $f_{\mathrm{cshub}}^*(y) = \infty$
for $\|y\|_2 > 1$.
Now suppose $\|y\|_2 \le 1$. We can write $f_{\mathrm{cshub}}^*(y)$ as

$$f_{\mathrm{cshub}}^*(y) = \max\left\{ \sup_{\|x\|_2 \le 1} \big(y^T x - (1/2)\|x\|_2^2\big),\ \sup_{\|x\|_2 \ge 1} \big(y^T x - \|x\|_2 + 1/2\big) \right\}.$$

It is easy to show that $y^T x - (1/2)\|x\|_2^2$ is maximized over $\{x \mid \|x\|_2 \le 1\}$ when $x = y$ (set the
gradient of $y^T x - (1/2)\|x\|_2^2$ equal to zero). This gives

$$\sup_{\|x\|_2 \le 1} \big(y^T x - (1/2)\|x\|_2^2\big) = (1/2)\|y\|_2^2.$$

To find $\sup_{\|x\|_2 \ge 1} (y^T x - \|x\|_2 + 1/2)$, notice that for $\|x\|_2 \ge 1$

$$y^T x - \|x\|_2 + 1/2 \le \|y\|_2 \|x\|_2 - \|x\|_2 + 1/2 = \|x\|_2(\|y\|_2 - 1) + 1/2 \le \|y\|_2 - 1/2.$$

Here, the first inequality follows from Cauchy-Schwarz, and the second inequality follows from
$\|y\|_2 \le 1$ and $\|x\|_2 \ge 1$. Furthermore, if we choose $x = y/\|y\|_2$, then

$$y^T x - \|x\|_2 + 1/2 = \|y\|_2 - 1/2,$$

thus,

$$\sup_{\|x\|_2 \ge 1} \big(y^T x - \|x\|_2 + 1/2\big) = \|y\|_2 - 1/2.$$

For $\|y\|_2 \le 1$

$$\sup_{\|x\|_2 \ge 1} \big(y^T x - \|x\|_2 + 1/2\big) = \|y\|_2 - 1/2 \le (1/2)\|y\|_2^2 = \sup_{\|x\|_2 \le 1} \big(y^T x - (1/2)\|x\|_2^2\big),$$

so we conclude that for $\|y\|_2 \le 1$, $f_{\mathrm{cshub}}^*(y) = (1/2)\|y\|_2^2$.
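The conjugate formula can be checked by brute force in $\mathbf{R}^2$. The sketch below approximates $\sup_x (y^T x - f_{\mathrm{cshub}}(x))$ over a finite grid; for $\|y\|_2 \le 1$ the result should be close to $(1/2)\|y\|_2^2$, and for $\|y\|_2 > 1$ it should keep growing as the grid is enlarged. The grid radius, step, and function names are illustrative assumptions.

```python
import math

def f_cshub(x):
    """Circularly symmetric Huber function on R^2."""
    r = math.hypot(x[0], x[1])
    return 0.5 * r * r if r <= 1.0 else r - 0.5

def conjugate_on_grid(y, radius=3.0, step=0.1):
    """Approximate sup_x (y^T x - f_cshub(x)) over a grid on [-radius, radius]^2."""
    n = int(round(2 * radius / step))
    best = float("-inf")
    for i in range(n + 1):
        x1 = -radius + i * step
        for j in range(n + 1):
            x2 = -radius + j * step
            best = max(best, y[0] * x1 + y[1] * x2 - f_cshub((x1, x2)))
    return best
```

For instance, with $y = (0.3, -0.4)$ we have $\|y\|_2 = 0.5$, so the grid value should be near $0.125$; with $y = (1.5, 0)$ the value along the ray $x = (t, 0)$ is $t(\|y\|_2 - 1) + 1/2$, which increases without bound in $t$.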

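2.14 Reverse Jensen inequality. Suppose $f$ is convex, $\lambda_1 > 0$, $\lambda_i \le 0$, $i = 2, \ldots, n$, and $\lambda_1 + \cdots + \lambda_n = 1$,
and let $x_1, \ldots, x_n \in \operatorname{dom} f$. Show that the inequality

$$f(\lambda_1 x_1 + \cdots + \lambda_n x_n) \ge \lambda_1 f(x_1) + \cdots + \lambda_n f(x_n)$$

always holds. Hints. Draw a picture for the $n = 2$ case first. For the general case, express $x_1$ as a
convex combination of $\lambda_1 x_1 + \cdots + \lambda_n x_n$ and $x_2, \ldots, x_n$, and use Jensen's inequality.
Solution. Let $z = \lambda_1 x_1 + \cdots + \lambda_n x_n$, with $\lambda_1 > 0$, $\lambda_i \le 0$, $i = 2, \ldots, n$, and $\lambda_1 + \cdots + \lambda_n = 1$.
Then we have

$$x_1 = \theta_0 z + \theta_2 x_2 + \cdots + \theta_n x_n,$$

where

$$\theta_i = -\lambda_i/\lambda_1, \quad i = 2, \ldots, n,$$

and $\theta_0 = 1/\lambda_1$. Since $\lambda_1 > 0$, we see that $\theta_0 > 0$; from $\lambda_i \le 0$ we get $\theta_i \ge 0$ for $i = 2, \ldots, n$. Simple
algebra shows that $\theta_0 + \theta_2 + \cdots + \theta_n = 1$. From Jensen's inequality we have

$$f(x_1) \le \theta_0 f(z) + \theta_2 f(x_2) + \cdots + \theta_n f(x_n),$$

so

$$f(z) \ge \frac{1}{\theta_0} f(x_1) - \frac{\theta_2}{\theta_0} f(x_2) - \cdots - \frac{\theta_n}{\theta_0} f(x_n).$$

Substituting for $\theta_i$ this becomes the inequality we want,

$$f(z) \ge \lambda_1 f(x_1) + \cdots + \lambda_n f(x_n).$$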
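A small numerical experiment makes the reversed inequality concrete. The sketch below evaluates the gap $f(\sum_i \lambda_i x_i) - \sum_i \lambda_i f(x_i)$ on random scalar points for a few convex functions defined on all of $\mathbf{R}$; the weights `LAM`, the list `TEST_FUNCTIONS`, and the helper names are illustrative assumptions.

```python
import math
import random

def reverse_jensen_gap(f, lam, xs):
    """f(sum_i lam_i x_i) - sum_i lam_i f(x_i); nonnegative when the inequality holds."""
    z = sum(l * x for l, x in zip(lam, xs))
    return f(z) - sum(l * f(x) for l, x in zip(lam, xs))

def min_gap(f, lam, trials=2000, seed=0):
    """Smallest gap over random points x_i drawn from [-2, 2]."""
    rng = random.Random(seed)
    return min(reverse_jensen_gap(f, lam, [rng.uniform(-2, 2) for _ in lam])
               for _ in range(trials))

LAM = [1.5, -0.3, -0.2]   # lam_1 > 0, the others <= 0, sum equal to 1
TEST_FUNCTIONS = [math.exp, abs, lambda x: x * x]   # convex on all of R
```

For each function in `TEST_FUNCTIONS`, `min_gap(f, LAM)` should be nonnegative up to rounding.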

2.15 Monotone extension of a convex function. Suppose $f : \mathbf{R}^n \to \mathbf{R}$ is convex. Recall that a function
$h : \mathbf{R}^n \to \mathbf{R}$ is monotone nondecreasing if $h(x) \ge h(y)$ whenever $x \succeq y$. The monotone extension
of $f$ is defined as

$$g(x) = \inf_{z \succeq 0} f(x + z).$$

(We will assume that $g(x) > -\infty$.) Show that $g$ is convex and monotone nondecreasing, and
satisfies $g(x) \le f(x)$ for all $x$. Show that if $h$ is any other convex function that satisfies these
properties, then $h(x) \le g(x)$ for all $x$. Thus, $g$ is the maximum convex monotone underestimator
of $f$.
Remark. For simple functions (say, on $\mathbf{R}$) it is easy to work out what $g$ is, given $f$. On $\mathbf{R}^n$, it
can be very difficult to work out an explicit expression for $g$. However, systems such as CVX can
immediately handle functions such as $g$, defined by partial minimization.
Solution. The function $f(x + z)$ is jointly convex in $x$ and $z$; partial minimization over $z$ in the
nonnegative orthant yields $g$, which therefore is convex.
To show that $g$ is monotone, suppose that $x \succeq y$. Then we have $x = y + \hat z$, where $\hat z = x - y \succeq 0$,
and so

$$g(y) = \inf_{z \succeq 0} f(y + z) \le f(y + \hat z) = f(x).$$

The same argument applies with $x$ replaced by $x + z$ for any $z \succeq 0$, since $x + z = y + (\hat z + z)$ with
$\hat z + z \succeq 0$; this gives $g(y) \le f(x + z)$, and taking the infimum over $z \succeq 0$ yields $g(y) \le g(x)$.
To show that $g(x) \le f(x)$, we use

$$g(x) = \inf_{z \succeq 0} f(x + z) \le f(x).$$

Now suppose that $h$ is monotone, convex, and satisfies $h(x) \le f(x)$ for all $x$. Then we have, for all
$x$ and $z$, $h(x + z) \le f(x + z)$. Taking the infimum over $z \succeq 0$, we obtain

$$\inf_{z \succeq 0} h(x + z) \le \inf_{z \succeq 0} f(x + z).$$

Since $h$ is monotone nondecreasing, the lefthand side is $h(x)$; the righthand side is $g(x)$.
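As the remark notes, $g$ is easy to work out for simple functions on $\mathbf{R}$. The sketch below takes the scalar example $f(x) = (x-1)^2$, for which $g(x) = 0$ when $x \le 1$ and $g(x) = (x-1)^2$ when $x > 1$, and compares this closed form with a grid approximation of the infimum. The example function, the grid parameters `z_max` and `steps`, and the function names are illustrative assumptions; this is plain Python, not the CVX formulation mentioned in the remark.

```python
def monotone_extension(f, x, z_max=10.0, steps=10000):
    """Grid approximation of g(x) = inf_{z >= 0} f(x + z) for scalar x."""
    return min(f(x + z_max * k / steps) for k in range(steps + 1))

def f(x):
    """Convex test function (arbitrary choice) with minimizer at x = 1."""
    return (x - 1.0) ** 2

def g_closed_form(x):
    """For this f: g(x) = 0 when x <= 1, and (x - 1)^2 when x > 1."""
    return 0.0 if x <= 1.0 else (x - 1.0) ** 2
```

Because the grid includes $z = 0$, the approximation never exceeds $f(x)$, mirroring the inequality $g(x) \le f(x)$.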

2.16 Circularly symmetric convex functions. Suppose $f : \mathbf{R}^n \to \mathbf{R}$ is convex and symmetric with respect
to rotations, i.e., $f(x)$ depends only on $\|x\|_2$. Show that $f$ must have the form $f(x) = \phi(\|x\|_2)$,
where $\phi : \mathbf{R} \to \mathbf{R}$ is nondecreasing and convex, with $\operatorname{dom} \phi = \mathbf{R}$. (Conversely, any function of this
form is symmetric and convex, so this form characterizes such functions.)
Solution. Define $\psi(a) = f(a e_1)$, where $e_1$ is the first standard unit vector, and $a \in \mathbf{R}$. $\psi$ is a
convex function, and it is symmetric: $\psi(-a) = f(-a e_1) = f(a e_1) = \psi(a)$, since $\|{-a e_1}\|_2 = \|a e_1\|_2$.
A symmetric convex function on $\mathbf{R}$ must have its minimum at $0$; for suppose that $\psi(a) < \psi(0)$.
By Jensen's inequality, $\psi(0) \le (1/2)\psi(-a) + (1/2)\psi(a) = \psi(a)$, a contradiction. Therefore $\psi$ is
nondecreasing for $a \ge 0$. Now we define $\phi(a) = \psi(a)$ for $a \ge 0$ and $\phi(a) = \psi(0)$ for $a < 0$. $\phi$ is
convex and nondecreasing. Evidently we have $f(x) = \phi(\|x\|_2)$, so we're done.

2.17 Infimal convolution. Let $f_1, \ldots, f_m$ be convex functions on $\mathbf{R}^n$. Their infimal convolution, denoted
$g = f_1 \diamond \cdots \diamond f_m$ (several other notations are also used), is defined as

$$g(x) = \inf\{f_1(x_1) + \cdots + f_m(x_m) \mid x_1 + \cdots + x_m = x\},$$

with the natural domain (i.e., defined by $g(x) < \infty$). In one simple interpretation, $f_i(x_i)$ is the cost
for the $i$th firm to produce a mix of products given by $x_i$; $g(x)$ is then the optimal cost obtained
if the firms can freely exchange products to produce, all together, the mix given by $x$. (The name
"convolution" presumably comes from the observation that if we replace the sum above with the
product, and the infimum above with integration, then we obtain the normal convolution.)

(a) Show that $g$ is convex.
(b) Show that $g^* = f_1^* + \cdots + f_m^*$. In other words, the conjugate of the infimal convolution is the
sum of the conjugates.

Solution.

(a) We can express $g$ as

$$g(x) = \inf_{x_1, \ldots, x_m} \big(f_1(x_1) + \cdots + f_m(x_m) + \phi(x_1, \ldots, x_m, x)\big),$$

where $\phi(x_1, \ldots, x_m, x)$ is $0$ when $x_1 + \cdots + x_m = x$, and $\infty$ otherwise. The function on the
righthand side above is convex in $x_1, \ldots, x_m, x$, so by the partial minimization rule, so is $g$.
(b) We have

$$\begin{aligned}
g^*(y) &= \sup_x \big(y^T x - g(x)\big) \\
&= \sup_x \Big(y^T x - \inf_{x_1 + \cdots + x_m = x} \big(f_1(x_1) + \cdots + f_m(x_m)\big)\Big) \\
&= \sup_{x = x_1 + \cdots + x_m} \big(y^T x_1 - f_1(x_1) + \cdots + y^T x_m - f_m(x_m)\big),
\end{aligned}$$

where we use the fact that $-\inf S$ is the same as $\sup(-S)$. The last line means we are to
take the supremum over all $x$ and all $x_1, \ldots, x_m$ that sum to $x$. But this is the same as just
taking the supremum over all $x_1, \ldots, x_m$, so we get

$$\begin{aligned}
g^*(y) &= \sup_{x_1, \ldots, x_m} \big(y^T x_1 - f_1(x_1) + \cdots + y^T x_m - f_m(x_m)\big) \\
&= \sup_{x_1} \big(y^T x_1 - f_1(x_1)\big) + \cdots + \sup_{x_m} \big(y^T x_m - f_m(x_m)\big) \\
&= f_1^*(y) + \cdots + f_m^*(y).
\end{aligned}$$
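The identity in part (b) can be illustrated with two scalar quadratics. For $f_1(x) = (a/2)x^2$ and $f_2(x) = (b/2)x^2$ with $a, b > 0$, the conjugates are $y^2/(2a)$ and $y^2/(2b)$. The sketch below approximates the infimal convolution and its conjugate on a grid and compares the result with the sum of the two closed-form conjugates. The curvatures `A`, `B`, the grid `PTS`, and the function names are illustrative assumptions.

```python
def grid(lo, hi, n):
    """n + 1 equally spaced points on [lo, hi]."""
    return [lo + (hi - lo) * k / n for k in range(n + 1)]

def infimal_convolution(f1, f2, x, pts):
    """Grid approximation of the infimum over x1 of f1(x1) + f2(x - x1)."""
    return min(f1(x1) + f2(x - x1) for x1 in pts)

def conjugate(fn, y, pts):
    """Grid approximation of the supremum over x of y*x - fn(x)."""
    return max(y * x - fn(x) for x in pts)

A, B = 1.0, 3.0                       # illustrative curvatures
f1 = lambda x: 0.5 * A * x * x
f2 = lambda x: 0.5 * B * x * x
PTS = grid(-4.0, 4.0, 400)

def conjugate_of_infconv(y):
    """Conjugate of the (grid-approximated) infimal convolution of f1 and f2."""
    g = lambda x: infimal_convolution(f1, f2, x, PTS)
    return conjugate(g, y, PTS)

def sum_of_conjugates(y):
    """Closed form: the conjugate of (c/2) x^2 is y^2 / (2c)."""
    return y * y / (2 * A) + y * y / (2 * B)
```

At $y = 0.9$ the closed form gives $0.405 + 0.135 = 0.54$, and the grid computation should agree to within the grid resolution.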

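2.18 Conjugate of composition of convex and linear function. Suppose $A \in \mathbf{R}^{m \times n}$ with $\operatorname{rank} A = m$,
and $g$ is defined as $g(x) = f(Ax)$, where $f : \mathbf{R}^m \to \mathbf{R}$ is convex. Show that

$$g^*(y) = f^*((A^\dagger)^T y), \qquad \operatorname{dom}(g^*) = A^T \operatorname{dom}(f^*),$$

where $A^\dagger = A^T (AA^T)^{-1}$ is the pseudo-inverse of $A$, so that $(A^\dagger)^T = (AA^T)^{-1} A$. (This generalizes the formula given on page 95
for the case when $A$ is square and invertible.)
Solution. Suppose $y$ is in the range of $A^T$. (Otherwise $y^T x$ is unbounded above on the nullspace of $A$, where $f(Ax)$ is constant, so $g^*(y) = \infty$.)
Let $z = (A^\dagger)^T y$, so $y = A^T z$. Then we have

$$\begin{aligned}
g^*(y) &= \sup_x \big(y^T x - f(Ax)\big) \\
&= \sup_x \big(z^T (Ax) - f(Ax)\big) \\
&= \sup_w \big(z^T w - f(w)\big) \\
&= f^*(z) \\
&= f^*((A^\dagger)^T y).
\end{aligned}$$

(The third line uses the fact that $Ax$ ranges over all of $\mathbf{R}^m$, since $\operatorname{rank} A = m$.)
Now $y \in \operatorname{dom}(g^*)$ if and only if $(A^\dagger)^T y = z \in \operatorname{dom}(f^*)$. But since $y = A^T z$, we see that this is
equivalent to $y \in A^T \operatorname{dom}(f^*)$.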

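2.19 [?, p104] Suppose $\lambda_1, \ldots, \lambda_n$ are positive. Show that the function $f : \mathbf{R}^n \to \mathbf{R}$, given by

$$f(x) = \prod_{i=1}^n (1 - e^{-x_i})^{\lambda_i},$$

is concave on

$$\operatorname{dom} f = \left\{ x \in \mathbf{R}^n_{++} \ \middle|\ \sum_{i=1}^n \lambda_i e^{-x_i} \le 1 \right\}.$$

Hint. The Hessian is given by

$$\nabla^2 f(x) = f(x)\big(y y^T - \operatorname{diag}(z)\big)$$

where $y_i = \lambda_i e^{-x_i}/(1 - e^{-x_i})$ and $z_i = y_i/(1 - e^{-x_i})$.
Solution. We'll use Cauchy-Schwarz to show that the Hessian is negative semidefinite. The
Hessian is given by

$$\nabla^2 f(x) = f(x)\big(y y^T - \operatorname{diag}(z)\big)$$

where $y_i = \lambda_i e^{-x_i}/(1 - e^{-x_i})$ and $z_i = y_i^2/(\lambda_i e^{-x_i})$.
For any $v \in \mathbf{R}^n$, we can show

$$v^T \nabla^2 f(x) v = f(x)\left( \Big(\sum_{i=1}^n v_i y_i\Big)^2 - \sum_{i=1}^n \frac{v_i^2 y_i^2}{\lambda_i e^{-x_i}} \right) \le 0,$$

by applying the Cauchy-Schwarz inequality, $(a^T b)^2 \le (a^T a)(b^T b)$, to the vectors with components
$a_i = v_i y_i/\sqrt{\lambda_i e^{-x_i}}$ and $b_i = \sqrt{\lambda_i e^{-x_i}}$. The result follows because $b^T b \le 1$ on $\operatorname{dom} f$. Thus the
Hessian is negative semidefinite.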

2.20 Show that the following functions $f : \mathbf{R}^n \to \mathbf{R}$ are convex.

(a) The difference between the maximum and minimum value of a polynomial on a given interval,
as a function of its coefficients:

\[ f(x) = \sup_{t \in [a,b]} p(t) - \inf_{t \in [a,b]} p(t) \quad \text{where} \quad p(t) = x_1 + x_2 t + x_3 t^2 + \cdots + x_n t^{n-1}. \]

$a$, $b$ are real constants with $a < b$.

(b) The exponential barrier of a set of inequalities:

\[ f(x) = \sum_{i=1}^m e^{-1/f_i(x)}, \qquad \operatorname{dom} f = \{x \mid f_i(x) < 0, \; i = 1, \ldots, m\}. \]

The functions $f_i$ are convex.

(c) The function
\[ f(x) = \inf_{\alpha > 0} \frac{g(y + \alpha x) - g(y)}{\alpha} \]
if $g$ is convex and $y \in \operatorname{dom} g$. (It can be shown that this is the directional derivative of $g$ at
$y$ in the direction $x$.)
Solution.
(a) $f$ is the difference of a convex and a concave function. The first term is convex, because it is
the supremum of a family of linear functions of $x$. The second term is concave because it is
the infimum of a family of linear functions.
(b) $h(u) = \exp(1/u)$ is convex and decreasing on $\mathbf{R}_{++}$:
\[ h'(u) = -\frac{1}{u^2} e^{1/u}, \qquad h''(u) = \frac{2}{u^3} e^{1/u} + \frac{1}{u^4} e^{1/u}. \]
Therefore the composition $h(-f_i(x)) = \exp(-1/f_i(x))$ is convex if $f_i$ is convex.
(c) Can be written as
\[ f(x) = \inf_{t > 0} \, t \left( g\Big(y + \frac{1}{t} x\Big) - g(y) \right), \]
the infimum over $t$ of the perspective of the convex function $g(y + x) - g(y)$.
2.21 Symmetric convex functions of eigenvalues. A function $f : \mathbf{R}^n \to \mathbf{R}$ is said to be symmetric if it is
invariant with respect to a permutation of its arguments, i.e., $f(x) = f(Px)$ for any permutation
matrix $P$. An example of a symmetric function is $f(x) = \log(\sum_{k=1}^n \exp x_k)$.

In this problem we show that if $f : \mathbf{R}^n \to \mathbf{R}$ is convex and symmetric, then the function $g : \mathbf{S}^n \to \mathbf{R}$
defined as $g(X) = f(\lambda(X))$ is convex, where $\lambda(X) = (\lambda_1(X), \lambda_2(X), \ldots, \lambda_n(X))$ is the vector of
eigenvalues of $X$. This implies, for example, that the function
\[ g(X) = \log \operatorname{tr} e^X = \log \sum_{k=1}^n e^{\lambda_k(X)} \]
is convex on $\mathbf{S}^n$.

(a) A square matrix $S$ is doubly stochastic if its elements are nonnegative and all row sums and
column sums are equal to one. It can be shown that every doubly stochastic matrix is a convex
combination of permutation matrices.
Show that if $f$ is convex and symmetric and $S$ is doubly stochastic, then
\[ f(Sx) \leq f(x). \]

(b) Let $Y = Q \operatorname{diag}(\lambda) Q^T$ be an eigenvalue decomposition of $Y \in \mathbf{S}^n$ with $Q$ orthogonal. Show
that the $n \times n$ matrix $S$ with elements $S_{ij} = Q_{ij}^2$ is doubly stochastic and that $\operatorname{diag}(Y) = S\lambda$.
(c) Use the results in parts (a) and (b) to show that if $f$ is convex and symmetric and $X \in \mathbf{S}^n$,
then
\[ f(\lambda(X)) = \sup_{V \in \mathcal{V}} f(\operatorname{diag}(V^T X V)) \]
where $\mathcal{V}$ is the set of $n \times n$ orthogonal matrices. Show that this implies that $f(\lambda(X))$ is convex
in $X$.

Solution.
(a) Suppose $S$ is expressed as a convex combination of permutation matrices:
\[ S = \sum_k \theta_k P_k \]
with $\theta_k \geq 0$, $\sum_k \theta_k = 1$, and $P_k$ permutation matrices. From convexity and symmetry of $f$,
\[ f(Sx) = f\Big(\sum_k \theta_k P_k x\Big) \leq \sum_k \theta_k f(P_k x) = \sum_k \theta_k f(x) = f(x). \]

(b) From $Y = Q \operatorname{diag}(\lambda) Q^T$,
\[ Y_{ii} = \sum_{j=1}^n Q_{ij}^2 \lambda_j. \]
From $QQ^T = I$, we have $\sum_j Q_{ij}^2 = 1$. From $Q^T Q = I$, we have $\sum_i Q_{ij}^2 = 1$.

(c) Combining the results in parts (a) and (b), we conclude that for any symmetric $X$, the
inequality
\[ f(\operatorname{diag}(X)) \leq f(\lambda(X)) \]
holds. Moreover, if $V$ is orthogonal, then $\lambda(X) = \lambda(V^T X V)$. Therefore also
\[ f(\operatorname{diag}(V^T X V)) \leq f(\lambda(X)) \]
for all orthogonal $V$, with equality if $V = Q$. In other words
\[ f(\lambda(X)) = \sup_{V \in \mathcal{V}} f(\operatorname{diag}(V^T X V)). \]

This shows that $f(\lambda(X))$ is convex because it is the supremum of a family of convex functions
of $X$.
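The inequality in part (a) can be checked numerically for the log-sum-exp example from the problem statement. The sketch below builds $Sx$ directly from a random convex combination of permutation matrices; the dimension, sampling ranges, and function names are assumptions made for illustration.

```python
import itertools
import math
import random

def log_sum_exp(x):
    """f(x) = log(sum_k exp(x_k)), a convex and symmetric function."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def apply_random_doubly_stochastic(x, rng):
    """Return S x, where S is a random convex combination of permutation matrices."""
    perms = list(itertools.permutations(range(len(x))))
    weights = [rng.random() for _ in perms]
    total = sum(weights)
    sx = [0.0] * len(x)
    for w, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            sx[i] += (w / total) * x[j]   # (P x)_i = x_{perm[i]}
    return sx

rng = random.Random(1)
largest_gap = -math.inf
for _ in range(200):
    x = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    sx = apply_random_doubly_stochastic(x, rng)
    largest_gap = max(largest_gap, log_sum_exp(sx) - log_sum_exp(x))

# f(Sx) - f(x) should never be positive (up to rounding).
print(largest_gap <= 1e-9)
```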
2.22 Convexity of nonsymmetric matrix fractional function. Consider the function $f : \mathbf{R}^{n \times n} \times \mathbf{R}^n \to \mathbf{R}$,
defined by
\[ f(X, y) = y^T X^{-1} y, \qquad \operatorname{dom} f = \{(X, y) \mid X + X^T \succ 0\}. \]
When this function is restricted to $X \in \mathbf{S}^n$, it is convex.
Is $f$ convex? If so, prove it. If not, give a (simple) counterexample.
Solution. The function is not convex. Restrict the function $f$ to $g(s) = f(X, y)$, with
\[ X = \begin{bmatrix} 1 & s \\ -s & 1 \end{bmatrix}, \qquad y = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \]
and $s \in \mathbf{R}$. (The domain of $g$ is $\mathbf{R}$.) If $f$ is convex then so is $g$. But we have
\[ g(s) = \frac{2}{1 + s^2}, \]
which is certainly not convex.
For a very specific counterexample, take (say) $s = +1$ and $s = -1$. Then we have $g(-1) = 1$,
$g(+1) = 1$ and
\[ g((-1 + 1)/2) = g(0) = 2 \not\leq (g(-1) + g(+1))/2 = 1. \]
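The counterexample is easy to confirm numerically. The following sketch evaluates $y^T X^{-1} y$ for the $2 \times 2$ matrix $X = [1, s; -s, 1]$ (the sign placement of $s$ is the one assumed here; the transposed choice gives the same values) and checks that the midpoint value exceeds the average of the endpoint values.

```python
def matrix_fractional(s):
    """f(X, y) = y^T X^{-1} y for X = [[1, s], [-s, 1]] and y = (1, 1)."""
    det = 1.0 + s * s                      # det X, always positive
    # Closed-form inverse of the 2x2 matrix: X^{-1} = (1/det) [[1, -s], [s, 1]]
    inv = [[1.0 / det, -s / det], [s / det, 1.0 / det]]
    y = [1.0, 1.0]
    return sum(y[i] * inv[i][j] * y[j] for i in range(2) for j in range(2))

g_minus = matrix_fractional(-1.0)
g_zero = matrix_fractional(0.0)
g_plus = matrix_fractional(1.0)

# Convexity would require g(0) <= (g(-1) + g(1)) / 2; here it fails.
print(g_minus, g_zero, g_plus, g_zero <= 0.5 * (g_minus + g_plus))
```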

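2.23 Show that the following functions $f : \mathbf{R}^n \to \mathbf{R}$ are convex.

(a) $f(x) = -\exp(-g(x))$ where $g : \mathbf{R}^n \to \mathbf{R}$ has a convex domain and satisfies
\[ \begin{bmatrix} \nabla^2 g(x) & \nabla g(x) \\ \nabla g(x)^T & 1 \end{bmatrix} \succeq 0 \]
for $x \in \operatorname{dom} g$.
(b) The function
\[ f(x) = \max \{ \|APx - b\| \mid P \text{ is a permutation matrix} \} \]
with $A \in \mathbf{R}^{m \times n}$, $b \in \mathbf{R}^m$.

Solution.

(a) The gradient and Hessian of $f$ are
\[
\begin{aligned}
\nabla f(x) &= e^{-g(x)} \nabla g(x) \\
\nabla^2 f(x) &= e^{-g(x)} \nabla^2 g(x) - e^{-g(x)} \nabla g(x) \nabla g(x)^T \\
&= e^{-g(x)} \left( \nabla^2 g(x) - \nabla g(x) \nabla g(x)^T \right) \\
&\succeq 0.
\end{aligned}
\]
The last line follows from the assumption on $g$, since $\nabla^2 g(x) - \nabla g(x) \nabla g(x)^T$ is the Schur complement of the 2,2 entry (equal to 1) in the block matrix.

(b) $f$ is the maximum of convex functions $\|APx - b\|$, parameterized by $P$.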

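2.24 Convex hull of functions. Suppose $g$ and $h$ are convex functions, bounded below, with $\operatorname{dom} g =
\operatorname{dom} h = \mathbf{R}^n$. The convex hull function of $g$ and $h$ is defined as

\[ f(x) = \inf \{ \theta g(y) + (1 - \theta) h(z) \mid \theta y + (1 - \theta) z = x, \; 0 \leq \theta \leq 1 \}, \]

where the infimum is over $\theta$, $y$, $z$. Show that the convex hull of $h$ and $g$ is convex. Describe $\operatorname{epi} f$
in terms of $\operatorname{epi} g$ and $\operatorname{epi} h$.
Solution. We note that $f(x) \leq t$ if and only if there exist $\theta \in [0, 1]$, $y$, $z$, $t_1$, $t_2$ such that

\[ g(y) \leq t_1, \qquad h(z) \leq t_2, \qquad \theta y + (1 - \theta) z = x, \qquad \theta t_1 + (1 - \theta) t_2 = t. \]

Thus
\[ \operatorname{epi} f = \operatorname{conv}(\operatorname{epi} g \cup \operatorname{epi} h), \]
i.e., $\operatorname{epi} f$ is the convex hull of the union of the epigraphs of $g$ and $h$. This shows that $f$ is convex.
As an alternative proof, we can make a change of variable $\tilde y = \theta y$, $\tilde z = (1 - \theta) z$ in the minimization
problem in the definition, and note that $f(x)$ is the optimal value of

\[
\begin{array}{ll}
\text{minimize} & \theta g(\tilde y/\theta) + (1 - \theta) h(\tilde z/(1 - \theta)) \\
\text{subject to} & \tilde y + \tilde z = x \\
& 0 \leq \theta \leq 1,
\end{array}
\]

with variables $\theta$, $\tilde y$, $\tilde z$. This is a convex problem, and therefore the optimal value is a convex function
of the righthand side $x$.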
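2.25 Show that a function $f : \mathbf{R} \to \mathbf{R}$ is convex if and only if $\operatorname{dom} f$ is convex and

\[ \det \begin{bmatrix} 1 & 1 & 1 \\ x & y & z \\ f(x) & f(y) & f(z) \end{bmatrix} \geq 0 \]

for all $x, y, z \in \operatorname{dom} f$ with $x < y < z$.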


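Solution.

\[
\begin{aligned}
\det \begin{bmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \\ f(t_1) & f(t_2) & f(t_3) \end{bmatrix}
&= \det \left( \begin{bmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \\ f(t_1) & f(t_2) & f(t_3) \end{bmatrix}
\begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} \right) \\
&= \det \begin{bmatrix} 1 & 0 & 0 \\ t_1 & t_2 - t_1 & t_3 - t_2 \\ f(t_1) & f(t_2) - f(t_1) & f(t_3) - f(t_2) \end{bmatrix} \\
&= (t_2 - t_1)(f(t_3) - f(t_2)) - (t_3 - t_2)(f(t_2) - f(t_1)).
\end{aligned}
\]

This is nonnegative if and only if

\[ \frac{t_3 - t_1}{(t_2 - t_1)(t_3 - t_2)} f(t_2) \leq \frac{1}{t_2 - t_1} f(t_1) + \frac{1}{t_3 - t_2} f(t_3). \]

This is Jensen's inequality

\[ f(\theta t_1 + (1 - \theta) t_3) \leq \theta f(t_1) + (1 - \theta) f(t_3) \]

with
\[ \theta = \frac{t_3 - t_2}{t_3 - t_1}, \qquad 1 - \theta = \frac{t_2 - t_1}{t_3 - t_1}, \]
so that $\theta t_1 + (1 - \theta) t_3 = t_2$.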
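To make the determinant condition concrete, here is a small sketch that evaluates the $3 \times 3$ determinant by cofactor expansion along the first row and applies it to one convex and one concave function. The test functions and sample points are arbitrary choices for illustration.

```python
def convexity_determinant(f, x, y, z):
    """det [[1, 1, 1], [x, y, z], [f(x), f(y), f(z)]] via expansion along row 1."""
    fx, fy, fz = f(x), f(y), f(z)
    return (y * fz - z * fy) - (x * fz - z * fx) + (x * fy - y * fx)

def square(t):
    return t * t          # convex on R

def neg_square(t):
    return -t * t         # concave on R

# For x < y < z the determinant is nonnegative for the convex function
# and negative for the (strictly) concave one.
d_convex = convexity_determinant(square, 0.0, 1.0, 2.0)
d_concave = convexity_determinant(neg_square, 0.0, 1.0, 2.0)
print(d_convex, d_concave)
```

For $f(t) = t^2$ the matrix is a Vandermonde matrix, so the determinant equals $(y - x)(z - x)(z - y)$, which gives 2 at the points used above.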

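2.26 Generalization of the convexity of $\log \det X^{-1}$. Let $P \in \mathbf{R}^{n \times m}$ have rank $m$. In this problem we
show that the function $f : \mathbf{S}^n \to \mathbf{R}$, with $\operatorname{dom} f = \mathbf{S}^n_{++}$, and

\[ f(X) = \log \det(P^T X^{-1} P) \]

is convex. To prove this, we assume (without loss of generality) that $P$ has the form
\[ P = \begin{bmatrix} I \\ 0 \end{bmatrix}, \]
where $I$ is the $m \times m$ identity matrix. The matrix $P^T X^{-1} P$ is then the leading $m \times m$ principal submatrix of $X^{-1}$.

(a) Let $Y$ and $Z$ be symmetric matrices with $0 \prec Y \preceq Z$. Show that $\det Y \leq \det Z$.
(b) Let $X \in \mathbf{S}^n_{++}$, partitioned as
\[ X = \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^T & X_{22} \end{bmatrix}, \]
with $X_{11} \in \mathbf{S}^m$. Show that the optimization problem

\[
\begin{array}{ll}
\text{minimize} & \log \det Y^{-1} \\
\text{subject to} & \begin{bmatrix} Y & 0 \\ 0 & 0 \end{bmatrix} \preceq \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^T & X_{22} \end{bmatrix},
\end{array}
\]

with variable $Y \in \mathbf{S}^m$, has the solution

\[ Y = X_{11} - X_{12} X_{22}^{-1} X_{12}^T. \]

(As usual, we take $\mathbf{S}^m_{++}$ as the domain of $\log \det Y^{-1}$.)
Hint. Use the Schur complement characterization of positive definite block matrices (page 651
of the book): if $C \succ 0$ then
\[ \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \succeq 0 \]
if and only if $A - BC^{-1}B^T \succeq 0$.
(c) Combine the result in part (b) and the minimization property (page 3-19, lecture notes) to
show that the function
\[ f(X) = \log \det (X_{11} - X_{12} X_{22}^{-1} X_{12}^T)^{-1}, \]
with $\operatorname{dom} f = \mathbf{S}^n_{++}$, is convex.
(d) Show that $(X_{11} - X_{12} X_{22}^{-1} X_{12}^T)^{-1}$ is the leading $m \times m$ principal submatrix of $X^{-1}$, i.e.,
\[ (X_{11} - X_{12} X_{22}^{-1} X_{12}^T)^{-1} = P^T X^{-1} P. \]

Hence, the convex function $f$ defined in part (c) can also be expressed as $f(X) = \log \det(P^T X^{-1} P)$.
Hint. Use the formula for the inverse of a symmetric block matrix:
\[
\begin{bmatrix} A & B \\ B^T & C \end{bmatrix}^{-1}
= \begin{bmatrix} 0 & 0 \\ 0 & C^{-1} \end{bmatrix}
+ \begin{bmatrix} -I \\ C^{-1} B^T \end{bmatrix} (A - BC^{-1}B^T)^{-1} \begin{bmatrix} -I \\ C^{-1} B^T \end{bmatrix}^T
\]
if $C$ and $A - BC^{-1}B^T$ are invertible.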

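Solution.

(a) $Y \preceq Z$ if and only if $Y^{-1/2} Z Y^{-1/2} \succeq I$, which implies $\det(Y^{-1/2} Z Y^{-1/2}) = \det Y^{-1} \det Z =
\det Z/\det Y \geq 1$.
(b) The optimal $Y$ maximizes $\det Y$ subject to the constraint
\[
\begin{bmatrix} X_{11} & X_{12} \\ X_{12}^T & X_{22} \end{bmatrix} - \begin{bmatrix} Y & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} X_{11} - Y & X_{12} \\ X_{12}^T & X_{22} \end{bmatrix} \succeq 0.
\]

By the Schur complement property in the hint this inequality holds if and only if
\[ Y \preceq X_{11} - X_{12} X_{22}^{-1} X_{12}^T \]
and, by part (a), this implies $\det Y \leq \det(X_{11} - X_{12} X_{22}^{-1} X_{12}^T)$. Therefore $Y = X_{11} - X_{12} X_{22}^{-1} X_{12}^T$ is optimal.

(c) Define a function $F : \mathbf{S}^n \times \mathbf{S}^m \to \mathbf{R}$ with $F(X, Y) = \log \det Y^{-1}$ on the domain
\[
\operatorname{dom} F = \left\{ (X, Y) \;\middle|\; Y \succ 0, \; \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^T & X_{22} \end{bmatrix} \succeq \begin{bmatrix} Y & 0 \\ 0 & 0 \end{bmatrix} \right\}.
\]

This function is convex because its domain is convex and $\log \det Y^{-1}$ is convex on the set of
positive definite $Y$. In part (b) we proved that

\[ f(X) = \inf_Y F(X, Y) \]

and convexity follows from the minimization property.

(d) The formula for the inverse shows that $(A - BC^{-1}B^T)^{-1}$ is the 1,1 block of the inverse of the
block matrix.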
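Part (d) can be illustrated in the smallest case $n = 2$, $m = 1$, where all blocks are scalars. The sketch below compares the 1,1 entry of $X^{-1}$ with the reciprocal of the Schur complement $X_{11} - X_{12} X_{22}^{-1} X_{12}^T$; the numerical entries of $X$ are an arbitrary positive definite example.

```python
def inverse_2x2(x11, x12, x22):
    """Inverse of the symmetric matrix [[x11, x12], [x12, x22]]."""
    det = x11 * x22 - x12 * x12
    return [[x22 / det, -x12 / det], [-x12 / det, x11 / det]]

def schur_complement(x11, x12, x22):
    """X11 - X12 X22^{-1} X12^T for scalar blocks."""
    return x11 - x12 * x12 / x22

x11, x12, x22 = 4.0, 1.0, 2.0            # positive definite: 4 > 0 and det = 7 > 0
leading_entry = inverse_2x2(x11, x12, x22)[0][0]
schur_inverse = 1.0 / schur_complement(x11, x12, x22)

# Both quantities equal 2/7 for this example.
print(leading_entry, schur_inverse)
```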

2.27 Functions of a random variable with log-concave density. Suppose the random variable $X$ on $\mathbf{R}^n$
has log-concave density, and let $Y = g(X)$, where $g : \mathbf{R}^n \to \mathbf{R}$. For each of the following statements,
either give a counterexample, or show that the statement is true.

(a) If $g$ is affine and not constant, then $Y$ has log-concave density.
(b) If $g$ is convex, then $\operatorname{prob}(Y \leq a)$ is a log-concave function of $a$.
(c) If $g$ is concave, then $\mathbf{E}((Y - a)_+)$ is a convex and log-concave function of $a$. (This quantity is
called the tail expectation of $Y$; you can assume it exists. We define $(s)_+$ as $(s)_+ = \max\{s, 0\}$.)

Solution.

(a) This one is true. Let $p$ be the density of $X$, and let $g(x) = c^T x + d$, with $c \neq 0$ (otherwise $g$
would be constant). Since $g$ is not constant, we conclude that $Y$ has a density $p_Y$.
With $\delta a > 0$, define the function
\[ h(x, a) = \begin{cases} 1 & a \leq g(x) \leq a + \delta a \\ 0 & \text{otherwise,} \end{cases} \]
which is the 0-1 indicator function of the convex set $\{(x, a) \mid a \leq g(x) \leq a + \delta a\}$. The 0-1
indicator function of a convex set is log-concave, so by the integration rule it follows that
\[ \int p(x) h(x, a) \, dx = \mathbf{E}\, h(X, a) = \operatorname{prob}(a \leq Y \leq a + \delta a) \]
is log-concave in $a$. It follows that
\[ \frac{\operatorname{prob}(a \leq Y \leq a + \delta a)}{\delta a} \]
is log-concave (since $\delta a > 0$). Taking $\delta a \to 0$, this converges to $p_Y(a)$, which we conclude is
log-concave.
(b) This one is true. Here we define the function
\[ h(x, a) = \begin{cases} 1 & g(x) \leq a \\ 0 & \text{otherwise,} \end{cases} \]
which is the 0-1 indicator function of the convex set $\operatorname{epi} g = \{(x, a) \mid g(x) \leq a\}$, and therefore
log-concave. By the integration rule we get that
\[ \int p(x) h(x, a) \, dx = \mathbf{E}\, h(X, a) = \operatorname{prob}(Y \leq a) \]
is log-concave in $a$.
If we assume that $g$ is concave, and we switch the inequality, we conclude that $\operatorname{prob}(Y \geq a)$
is log-concave in $a$. (We'll use this below.)
(c) This one is true. Convexity of the tail expectation holds for any random variable, so it
has nothing to do with $g$, and it has nothing to do with log-concavity of the density of $X$. For
any random variable $Y$ on $\mathbf{R}$, we have
\[ \frac{d}{da} \mathbf{E}(Y - a)_+ = -\operatorname{prob}(Y \geq a). \]
The righthand side is nondecreasing in $a$, so the tail expectation has nondecreasing derivative,
which means it is a convex function.
Now let's show that the tail expectation is log-concave. One simple method is to use the
formula above to note that
\[ \mathbf{E}(Y - a)_+ = \int_a^\infty \operatorname{prob}(Y \geq b) \, db. \]
Since $\operatorname{prob}(Y \geq b)$ is log-concave in $b$ (as noted at the end of part (b)), the integration rule for log-concave functions tells us that this is log-concave.
We can also give a direct proof following the style of the ones given above. We define $h$ as
$h(x, a) = (g(x) - a)_+$. This function is log-concave. First, its domain is $\{(x, a) \mid g(x) > a\}$,
which is convex. Concavity of $\log h(x, a) = \log(g(x) - a)$ follows from the composition rule:
$\log$ is concave and increasing, and $g(x) - a$ is concave in $(x, a)$. So by the integration rule we
get that
\[ \int p(x) h(x, a) \, dx = \mathbf{E}(g(X) - a)_+ \]
is log-concave in $a$.

2.28 Majorization. Define $C$ as the set of all permutations of a given $n$-vector $a$, i.e., the set of vectors
$(a_{\pi_1}, a_{\pi_2}, \ldots, a_{\pi_n})$ where $(\pi_1, \pi_2, \ldots, \pi_n)$ is one of the $n!$ permutations of $(1, 2, \ldots, n)$.

(a) The support function of $C$ is defined as $S_C(y) = \max_{x \in C} y^T x$. Show that

\[ S_C(y) = a_{[1]} y_{[1]} + a_{[2]} y_{[2]} + \cdots + a_{[n]} y_{[n]}. \]

($u_{[1]}, u_{[2]}, \ldots, u_{[n]}$ denote the components of an $n$-vector $u$ in nonincreasing order.)
Hint. To find the maximum of $y^T x$ over $x \in C$, write the inner product as

\[
\begin{aligned}
y^T x = {} & (y_1 - y_2) x_1 + (y_2 - y_3)(x_1 + x_2) + (y_3 - y_4)(x_1 + x_2 + x_3) + \cdots \\
& + (y_{n-1} - y_n)(x_1 + x_2 + \cdots + x_{n-1}) + y_n (x_1 + x_2 + \cdots + x_n)
\end{aligned}
\]

and assume that the components of $y$ are sorted in nonincreasing order.

(b) Show that $x$ satisfies $x^T y \leq S_C(y)$ for all $y$ if and only if

\[ s_k(x) \leq s_k(a), \quad k = 1, \ldots, n - 1, \qquad s_n(x) = s_n(a), \]

where $s_k$ denotes the function $s_k(x) = x_{[1]} + x_{[2]} + \cdots + x_{[k]}$. When these inequalities hold, we
say the vector $a$ majorizes the vector $x$.
(c) Conclude from this that the conjugate of $S_C$ is given by
\[ S_C^*(x) = \begin{cases} 0 & \text{if } x \text{ is majorized by } a \\ +\infty & \text{otherwise.} \end{cases} \]

Since $S_C^*$ is the indicator function of the convex hull of $C$, this establishes the following result:
$x$ is a convex combination of the permutations of $a$ if and only if $a$ majorizes $x$.

Solution.

(a) Suppose $y$ is sorted. From the expression of the inner product it is clear that the permutation
of $a$ that maximizes the inner product with $y$ is $x_k = a_{[k]}$, $k = 1, \ldots, n$.
(b) We first show that if $a$ majorizes $x$, then $x^T y \leq S_C(y)$ for all $y$. Note that if $x$ is majorized
by $a$, then all permutations of $x$ are majorized by $a$, so we can assume that the components
of $y$ are sorted in nonincreasing order. Using the results from part (a),

\[
\begin{aligned}
S_C(y) - x^T y
= {} & (y_1 - y_2)(s_1(a) - x_1) + (y_2 - y_3)(s_2(a) - x_1 - x_2) + \cdots \\
& + (y_{n-1} - y_n)(s_{n-1}(a) - x_1 - \cdots - x_{n-1}) + y_n (s_n(a) - x_1 - \cdots - x_n) \\
\geq {} & (y_1 - y_2)(s_1(a) - s_1(x)) + (y_2 - y_3)(s_2(a) - s_2(x)) + \cdots \\
& + (y_{n-1} - y_n)(s_{n-1}(a) - s_{n-1}(x)) + y_n (s_n(a) - s_n(x)) \\
\geq {} & 0.
\end{aligned}
\]

Next, we show that the conditions are necessary. We distinguish two cases.
Suppose $s_k(x) > s_k(a)$ for some $k < n$. Assume the components of $x$ are sorted in
nonincreasing order and choose

\[ y_1 = \cdots = y_k = 1, \qquad y_{k+1} = \cdots = y_n = 0. \]

Then $S_C(y) - x^T y = s_k(a) - s_k(x) < 0$.
Suppose $s_n(x) \neq s_n(a)$. Choose $y = \mathbf{1}$ if $s_n(x) > s_n(a)$ and $y = -\mathbf{1}$ if $s_n(x) < s_n(a)$. We
have $S_C(y) - x^T y = y_1 (s_n(a) - s_n(x)) < 0$.
(c) The expression for the conjugate follows from the fact that if $x^T y - S_C(y)$ is positive for some
$y$ then it is unbounded above (since $S_C$ is positively homogeneous), and if $x$ is majorized by $a$ then $y = 0$ is the optimum.
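The formula in part (a) and the majorization test in part (b) are straightforward to check by brute force for small $n$. The sketch below enumerates all permutations of $a$; the vectors used and the function names are illustrative assumptions.

```python
import itertools

def support_brute_force(a, y):
    """S_C(y): maximum of y^T x over all permutations x of a."""
    return max(sum(xi * yi for xi, yi in zip(perm, y))
               for perm in itertools.permutations(a))

def support_sorted(a, y):
    """Closed form a_[1] y_[1] + ... + a_[n] y_[n]."""
    return sum(ai * yi for ai, yi in
               zip(sorted(a, reverse=True), sorted(y, reverse=True)))

def is_majorized_by(x, a, tol=1e-9):
    """True if s_k(x) <= s_k(a) for k < n and s_n(x) = s_n(a)."""
    xs, a_sorted = sorted(x, reverse=True), sorted(a, reverse=True)
    sx = sa = 0.0
    for k in range(len(x) - 1):
        sx += xs[k]
        sa += a_sorted[k]
        if sx > sa + tol:
            return False
    return abs(sum(x) - sum(a)) <= tol

a = [3.0, 1.0, 0.0, -2.0]
y = [0.5, -1.0, 2.0, 0.0]
print(support_brute_force(a, y), support_sorted(a, y))

# The average of a and its reversal is a convex combination of permutations of a.
x_avg = [0.5 * (u + v) for u, v in zip(a, reversed(a))]
print(is_majorized_by(x_avg, a), is_majorized_by([4.0, -2.0, 0.0, 0.0], a))
```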

2.29 Convexity of products of powers. This problem concerns the product of powers function $f : \mathbf{R}^n_{++} \to
\mathbf{R}$ given by $f(x) = x_1^{\theta_1} \cdots x_n^{\theta_n}$, where $\theta \in \mathbf{R}^n$ is a vector of powers. We are interested in finding
values of $\theta$ for which $f$ is convex or concave. You already know a few, for example when $n = 2$ and
$\theta = (2, -1)$, $f$ is convex (the quadratic-over-linear function), and when $\theta = (1/n)\mathbf{1}$, $f$ is concave
(geometric mean). Of course, if $n = 1$, $f$ is convex when $\theta \geq 1$ or $\theta \leq 0$, and concave when
$0 \leq \theta \leq 1$.
Show each of the statements below. We will not read long or complicated proofs, or ones that
involve Hessians. We are looking for short, snappy ones, that (where possible) use composition
rules, perspective, partial minimization, or other operations, together with known convex or concave
functions, such as the ones listed in the previous paragraph. Feel free to use the results of earlier
statements in later ones.

(a) When $n = 2$, $\theta \succeq 0$, and $\mathbf{1}^T \theta = 1$, $f$ is concave.
(b) When $\theta \succeq 0$ and $\mathbf{1}^T \theta = 1$, $f$ is concave. (This is the same as part (a), but here it is for general
$n$.)
(c) When $\theta \succeq 0$ and $\mathbf{1}^T \theta \leq 1$, $f$ is concave.
(d) When $\theta \preceq 0$, $f$ is convex.
(e) When $\mathbf{1}^T \theta = 1$ and exactly one of the elements of $\theta$ is positive, $f$ is convex.
(f) When $\mathbf{1}^T \theta \geq 1$ and exactly one of the elements of $\theta$ is positive, $f$ is convex.

Remark. Parts (c), (d), and (f) exactly characterize the cases when $f$ is either convex or concave.
That is, if none of these conditions on $\theta$ hold, $f$ is neither convex nor concave. Your teaching staff
has, however, kindly refrained from asking you to show this.
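Solution. To shorten our proofs, when both $x$ and $\theta$ are vectors, we overload notation so that

\[ f(x) = x_1^{\theta_1} \cdots x_n^{\theta_n} = x^\theta. \]

(a) Since $x_1^{\theta_1}$ is concave for $0 \leq \theta_1 \leq 1$, applying the perspective transformation gives that

\[ x_2 (x_1/x_2)^{\theta_1} = x_1^{\theta_1} x_2^{1 - \theta_1} \]

is concave, which is what we wanted.

(b) The proof is by induction on $n$. We know the base case with $n = 1$ holds. For the induction
step, if $\theta \in \mathbf{R}^{n+1}_+$, $\tilde\theta = (\theta_1, \ldots, \theta_n)$, $\tilde x = (x_1, \ldots, x_n)$, and $\mathbf{1}^T \theta = 1$, then $\tilde x^{\tilde\theta / \mathbf{1}^T \tilde\theta}$ is concave by
the induction assumption. The function $y^{\mathbf{1}^T \tilde\theta} z^{1 - \mathbf{1}^T \tilde\theta}$ is concave by (a) and nondecreasing. The
composition rules give that
\[ (\tilde x^{\tilde\theta / \mathbf{1}^T \tilde\theta})^{\mathbf{1}^T \tilde\theta} \, x_{n+1}^{1 - \mathbf{1}^T \tilde\theta} = \tilde x^{\tilde\theta} x_{n+1}^{\theta_{n+1}} = x^\theta \]
is concave.

(c) If $\mathbf{1}^T \theta \leq 1$, then $x^{\theta / \mathbf{1}^T \theta}$ is concave by (b). The function $y^{\mathbf{1}^T \theta}$ is concave and nondecreasing.
Composition gives that
\[ (x^{\theta / \mathbf{1}^T \theta})^{\mathbf{1}^T \theta} = x^\theta \]
is concave.

(d) If $\theta \preceq 0$, then $x^{\theta / \mathbf{1}^T \theta}$ is concave by part (b). (We can assume $\mathbf{1}^T \theta \neq 0$.) The function $y^{\mathbf{1}^T \theta}$ is
convex and nonincreasing, since $\mathbf{1}^T \theta < 0$. Composition gives that
\[ (x^{\theta / \mathbf{1}^T \theta})^{\mathbf{1}^T \theta} = x^\theta \]
is convex.
Here's another proof, that several people used, and which is arguably simpler than the one
above. Since $\theta_i \leq 0$, $\theta_i \log x_i$ is a convex function of $x_i$, and therefore the sum $\sum_i \theta_i \log x_i$ is
convex in $x$. By the composition rules, the exponential of a convex function is convex, so
\[ \exp\Big(\sum_i \theta_i \log x_i\Big) = x^\theta \]
is convex.

(e) If $\theta \in \mathbf{R}^{n+1}$ and $\mathbf{1}^T \theta = 1$, we can assume that the single positive element is $\theta_{n+1} > 0$, so
that $\tilde\theta = (\theta_1, \ldots, \theta_n) \preceq 0$. If $\tilde x = (x_1, \ldots, x_n)$, then $\tilde x^{\tilde\theta}$ is convex by part (d). Applying the
perspective transformation gives that
\[ x_{n+1} (\tilde x / x_{n+1})^{\tilde\theta} = \tilde x^{\tilde\theta} x_{n+1}^{1 - \mathbf{1}^T \tilde\theta} = \tilde x^{\tilde\theta} x_{n+1}^{\theta_{n+1}} = x^\theta \]
is convex.

(f) If $\mathbf{1}^T \theta \geq 1$ and exactly one element of $\theta$ is positive, then $x^{\theta / \mathbf{1}^T \theta}$ is convex by part (e). The
function $y^{\mathbf{1}^T \theta}$ is convex and nondecreasing. Composition gives us that
\[ (x^{\theta / \mathbf{1}^T \theta})^{\mathbf{1}^T \theta} = x^\theta \]
is convex.

Remark. The proofs for (c), (d), and (f) are syntactically identical.
Remark. We can also prove (c) with the following self-contained argument. A syntactically identical
self-contained argument also works for (f) by substituting convex for concave.
The proof is by induction on $n$. We know the base case: $x_1^{\theta_1}$ is concave for $0 \leq \theta_1 \leq 1$. For the
inductive step, if $\theta \in \mathbf{R}^{n+1}_+$ and $\mathbf{1}^T \theta \leq 1$, let $\tilde\theta = (\theta_1, \ldots, \theta_n)$ and $\tilde x = (x_1, \ldots, x_n)$. Note that
\[ \tilde x^{\tilde\theta / \mathbf{1}^T \theta} \]
is concave by the induction assumption. Applying the perspective transformation gives that
\[ x_{n+1} (\tilde x / x_{n+1})^{\tilde\theta / \mathbf{1}^T \theta} = \tilde x^{\tilde\theta / \mathbf{1}^T \theta} \, x_{n+1}^{1 - \mathbf{1}^T \tilde\theta / \mathbf{1}^T \theta} \]
is concave. The function $y^{\mathbf{1}^T \theta}$ is concave and nondecreasing, and composing it with the previous
function shows that
\[ \big( \tilde x^{\tilde\theta / \mathbf{1}^T \theta} \, x_{n+1}^{1 - \mathbf{1}^T \tilde\theta / \mathbf{1}^T \theta} \big)^{\mathbf{1}^T \theta} = \tilde x^{\tilde\theta} x_{n+1}^{\mathbf{1}^T \theta - \mathbf{1}^T \tilde\theta} = \tilde x^{\tilde\theta} x_{n+1}^{\theta_{n+1}} = x^\theta \]
is concave, completing the proof.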
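The statements can be spot-checked numerically with a midpoint test, which is a necessary condition for convexity or concavity but not a proof. The sketch below tries one exponent vector covered by case (c) and one covered by cases (e) and (f); the exponents, sampling box, and names are assumptions chosen for illustration.

```python
import math
import random

def power_product(x, theta):
    """f(x) = x_1^theta_1 * ... * x_n^theta_n for x > 0."""
    return math.prod(xi ** ti for xi, ti in zip(x, theta))

def midpoint_gap(x, y, theta):
    """f((x+y)/2) - (f(x)+f(y))/2: >= 0 if f is concave, <= 0 if f is convex."""
    mid = [(xi + yi) / 2.0 for xi, yi in zip(x, y)]
    return power_product(mid, theta) - 0.5 * (
        power_product(x, theta) + power_product(y, theta))

rng = random.Random(2)
pairs = [([rng.uniform(0.1, 5.0) for _ in range(2)],
          [rng.uniform(0.1, 5.0) for _ in range(2)]) for _ in range(1000)]

# theta = (0.3, 0.5): nonnegative with sum <= 1, so concave by (c).
concave_ok = all(midpoint_gap(x, y, (0.3, 0.5)) >= -1e-9 for x, y in pairs)
# theta = (2, -1): sum equal to 1 with exactly one positive entry, so convex by (e).
convex_ok = all(midpoint_gap(x, y, (2.0, -1.0)) <= 1e-9 for x, y in pairs)
print(concave_ok, convex_ok)
```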

3 Convex optimization problems
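3.1 Minimizing a function over the probability simplex. Find simple necessary and sufficient conditions
for $x \in \mathbf{R}^n$ to minimize a differentiable convex function $f$ over the probability simplex, $\{x \mid \mathbf{1}^T x =
1, \; x \succeq 0\}$.
Solution. The simple basic optimality condition is that $x$ is feasible, i.e., $x \succeq 0$, $\mathbf{1}^T x = 1$, and
that $\nabla f(x)^T (y - x) \geq 0$ for all feasible $y$. We'll first show this is equivalent to

\[ \min_{i=1,\ldots,n} \nabla f(x)_i \geq \nabla f(x)^T x. \]

To see this, suppose that $\nabla f(x)^T (y - x) \geq 0$ for all feasible $y$. Then in particular, for $y = e_i$,
we have $\nabla f(x)_i \geq \nabla f(x)^T x$, which is what we have above. To show the other way, suppose that
$\nabla f(x)_i \geq \nabla f(x)^T x$ holds, for $i = 1, \ldots, n$. Let $y$ be feasible, i.e., $y \succeq 0$, $\mathbf{1}^T y = 1$. Then multiplying
$\nabla f(x)_i \geq \nabla f(x)^T x$ by $y_i$ and summing, we get
\[ \sum_{i=1}^n y_i \nabla f(x)_i \geq \Big( \sum_{i=1}^n y_i \Big) \nabla f(x)^T x = \nabla f(x)^T x. \]

The lefthand side is $y^T \nabla f(x)$, so we have $\nabla f(x)^T (y - x) \geq 0$.

Now we can simplify even further. The condition above can be written as
\[ \min_{i=1,\ldots,n} \frac{\partial f}{\partial x_i} \geq \sum_{i=1}^n x_i \frac{\partial f}{\partial x_i}. \]

But since $\mathbf{1}^T x = 1$, $x \succeq 0$, we have
\[ \min_{i=1,\ldots,n} \frac{\partial f}{\partial x_i} \leq \sum_{i=1}^n x_i \frac{\partial f}{\partial x_i}, \]

and it follows that
\[ \min_{i=1,\ldots,n} \frac{\partial f}{\partial x_i} = \sum_{i=1}^n x_i \frac{\partial f}{\partial x_i}. \]

The right hand side is a mixture of $\partial f/\partial x_i$ terms and equals the minimum of all of the terms. This
is possible only if $x_k = 0$ whenever $\partial f/\partial x_k > \min_i \partial f/\partial x_i$.
Thus we can write the (necessary and sufficient) optimality condition as $\mathbf{1}^T x = 1$, $x \succeq 0$, and, for
each $k$,
\[ x_k > 0 \;\Longrightarrow\; \frac{\partial f}{\partial x_k} = \min_{i=1,\ldots,n} \frac{\partial f}{\partial x_i}. \]

In particular, for $k$'s with $x_k > 0$, $\partial f/\partial x_k$ are all equal.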
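The final condition is simple to test in code. As a sketch, the snippet below checks it for the illustrative objective $f(x) = \tfrac12 \|x - c\|_2^2$ (an assumption, not part of the exercise), whose minimizer over the simplex is the Euclidean projection of $c$; for $c = (0.9, 0.4, -0.3)$ the projection is $(0.75, 0.25, 0)$.

```python
def satisfies_simplex_optimality(x, grad, tol=1e-9):
    """Check 1^T x = 1, x >= 0, and x_k > 0  =>  grad_k = min_i grad_i."""
    feasible = all(xi >= -tol for xi in x) and abs(sum(x) - 1.0) <= tol
    gmin = min(grad)
    complementary = all(gk - gmin <= tol for xk, gk in zip(x, grad) if xk > tol)
    return feasible and complementary

c = [0.9, 0.4, -0.3]                        # hypothetical data for f(x) = 0.5*||x - c||^2

def gradient(x):
    return [xi - ci for xi, ci in zip(x, c)]

x_opt = [0.75, 0.25, 0.0]                   # projection of c onto the simplex
x_bad = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]   # feasible but not optimal

print(satisfies_simplex_optimality(x_opt, gradient(x_opt)))
print(satisfies_simplex_optimality(x_bad, gradient(x_bad)))
```

At `x_opt` the gradient is approximately $(-0.15, -0.15, 0.3)$: the two active components share the minimum value, and the component with the larger partial derivative is zero.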

3.2 Hello World in CVX*. Use CVX, CVXPY or Convex.jl to verify the optimal values you obtained
(analytically) for exercise 4.1 in Convex Optimization.
Solution.

(a) $p^\star = 0.6$
(b) $p^\star = -\infty$
(c) $p^\star = 0$
(d) $p^\star = 1/3$
(e) $p^\star = 1/2$

%exercise 4.1 using CVX

%set up a vector to store optimal values of problems


optimal_values=zeros(5,1);

%part a
cvx_begin
variable x(2)
minimize(x(1)+x(2))
2*x(1)+x(2) >= 1;
x(1)+3*x(2) >= 1;
x >= 0;
cvx_end

optimal_values(1)=cvx_optval;

%part b
cvx_begin
variable x(2)
minimize(-sum(x))
2*x(1)+x(2) >= 1;
x(1)+3*x(2) >= 1;
x >= 0;
cvx_end
optimal_values(2)=cvx_optval;

%part c
cvx_begin
variable x(2)
minimize(x(1))
2*x(1)+x(2) >= 1;
x(1)+3*x(2) >= 1;
x >= 0;
cvx_end
optimal_values(3)=cvx_optval;

%part d
cvx_begin
variable x(2)
minimize(max(x))
2*x(1)+x(2) >= 1;
x(1)+3*x(2) >= 1;
x >= 0;
cvx_end
optimal_values(4)=cvx_optval;

%part e
cvx_begin
variable x(2,1)
minimize( square(x(1))+ 9*square(x(2)) )
2*x(1)+x(2) >= 1;
x(1)+3*x(2) >= 1;
x >= 0;
cvx_end
optimal_values(5)=cvx_optval;

# exercise 4.1 using CVXPY (written against the CVXPY 1.x API)
import cvxpy as cvx

x1 = cvx.Variable()
x2 = cvx.Variable()
constraints = [2*x1 + x2 >= 1, x1 + 3*x2 >= 1, x1 >= 0, x2 >= 0]

# (a)
prob = cvx.Problem(cvx.Minimize(x1 + x2), constraints)
prob.solve()
print("With obj. x1+x2, p* is %.2f, optimal x1 is %.2f, optimal x2 is %.2f"
      % (prob.value, x1.value, x2.value))

# (b) the problem is unbounded below, so only the status is reported
prob = cvx.Problem(cvx.Minimize(-x1 - x2), constraints)
prob.solve()
print("With obj. -x1-x2, status is " + prob.status)

# (c)
prob = cvx.Problem(cvx.Minimize(x1), constraints)
prob.solve()
print("With obj. x1, p* is %.2f, x1* is %.2f, x2* is %.2f"
      % (prob.value, x1.value, x2.value))

# (d)
prob = cvx.Problem(cvx.Minimize(cvx.maximum(x1, x2)), constraints)
prob.solve()
print("With obj. max(x1,x2), p* is %.2f, x1* is %.2f, x2* is %.2f"
      % (prob.value, x1.value, x2.value))

# (e)
prob = cvx.Problem(cvx.Minimize(cvx.square(x1) + 9*cvx.square(x2)), constraints)
prob.solve()
print("With obj. x1^2+9x2^2, p* is %.2f, x1* is %.2f, x2* is %.2f"
      % (prob.value, x1.value, x2.value))

# exercise 4.1 using Convex.jl


using Convex

# set up a vector to store optimal values of problems


optimal_values=zeros(5);

# part a
x = Variable(2)
obj = x[1] + x[2]
constr = [2*x[1]+x[2] >= 1, x[1]+3*x[2] >= 1, x >= 0]
p = minimize(obj, constr)
solve!(p)
optimal_values[1]=p.optval;

# part b
x = Variable(2)
obj = -sum(x)
constr = [2*x[1]+x[2] >= 1, x[1]+3*x[2] >= 1, x >= 0]
p = minimize(obj, constr)
solve!(p)
optimal_values[2]=p.optval;

# part c
x = Variable(2)
obj = x[1]
constr = [2*x[1]+x[2] >= 1, x[1]+3*x[2] >= 1, x >= 0]
p = minimize(obj, constr)
solve!(p)
optimal_values[3]=p.optval;

# part d
x = Variable(2)
obj = maximum(x)
constr = [2*x[1]+x[2] >= 1, x[1]+3*x[2] >= 1, x >= 0]
p = minimize(obj, constr)
solve!(p)
optimal_values[4]=p.optval;

# part e
x = Variable(2)
obj = square(x[1]) + 9*square(x[2])
constr = [2*x[1]+x[2] >= 1, x[1]+3*x[2] >= 1, x >= 0]
p = minimize(obj, constr)
solve!(p)
optimal_values[5]=p.optval;

3.3 Reformulating constraints in CVX*. Each of the following CVX code fragments describes a convex
constraint on the scalar variables x, y, and z, but violates the CVX rule set, and so is invalid.
Briefly explain why each fragment is invalid. Then, rewrite each one in an equivalent form that
conforms to the CVX rule set. In your reformulations, you can use linear equality and inequality
constraints, and inequalities constructed using CVX functions. You can also introduce additional
variables, or use LMIs. Be sure to explain (briefly) why your reformulation is equivalent to the
original constraint, if it is not obvious.
Check your reformulations by creating a small problem that includes these constraints, and solving
it using CVX. Your test problem doesn't have to be feasible; it's enough to verify that CVX
processes your constraints without error.
Remark. This looks like a problem about how to use CVX software, or tricks for using CVX.
But it really checks whether you understand the various composition rules, convex analysis, and
constraint reformulation rules.

(a) norm([x + 2*y, x - y]) == 0


(b) square(square(x + y)) <= x - y
(c) 1/x + 1/y <= 1; x >= 0; y >= 0
(d) norm([max(x,1), max(y,2)]) <= 3*x + y
(e) x*y >= 1; x >= 0; y >= 0
(f) (x + y)^2/sqrt(y) <= x - y + 5
(g) x^3 + y^3 <= 1; x >= 0; y >= 0
(h) x + z <= 1 + sqrt(x*y - z^2); x >= 0; y >= 0

Solution.

(a) The lefthand side is correctly identified as convex, but equality constraints are only valid with
affine left and right hand sides. Since the norm of a vector is zero if and only if the vector is zero,
we can express the constraint as x + 2*y == 0; x - y == 0, or simply x == 0; y == 0.
(b) The problem is that square() can only accept affine arguments, because it is convex, but not
increasing. To correct this use square_pos() instead:
square_pos(square(x + y)) <= x - y
We can also reformulate this constraint by introducing an additional variable.
variable t
square(x + y) <= t
square(t) <= x - y
Note that, in general, decomposing the objective by introducing new variables need not
work. It works in this case because the outer square function is convex and monotonic over
$\mathbf{R}_+$.
Alternatively, we can rewrite the constraint as
(x + y)^4 <= x - y
(c) 1/x isn't convex, unless you restrict the domain to $\mathbf{R}_{++}$. We can write this one as inv_pos(x) + inv_pos(y) <= 1.
The inv_pos function has domain $\mathbf{R}_{++}$ so the constraints x > 0, y > 0 are (implicitly) included.
(d) The problem is that norm() can only accept an affine argument since it is convex but not increasing.
One way to correct this is to introduce new variables u and v:
norm([u, v]) <= 3*x + y
max(x,1) <= u
max(y,2) <= v
Decomposing the objective by introducing new variables works here because norm is convex
and monotonic over $\mathbf{R}^2_+$, and in particular over $[1, \infty) \times [2, \infty)$.
(e) xy isn't concave, so this isn't going to work as stated. But we can express the constraint as
x >= inv_pos(y). (You can switch around x and y here.) Another solution is to write the
constraint as geo_mean([x, y]) >= 1. We can also give an LMI representation:
[x 1; 1 y] == semidefinite(2)
(f) This fails when we attempt to divide a convex function by a concave one. We can write this
as
quad_over_lin(x + y, sqrt(y)) <= x - y + 5
This works because quad_over_lin is monotone decreasing in the second argument, so it can
accept a concave function here, and sqrt is concave.
(g) The function $x^3 + y^3$ is convex for $x \geq 0$, $y \geq 0$. But $x^3$ isn't convex for $x < 0$, so CVX is
going to reject this statement. One way to rewrite this constraint is
quad_pos_over_lin(square(x),x) + quad_pos_over_lin(square(y),y) <= 1
This works because quad_pos_over_lin is convex and increasing in its first argument, hence
accepts a convex function in its first argument. (The function quad_over_lin, however, is
not increasing in its first argument, and so won't work.)
Alternatively, and more simply, we can rewrite the constraint as
pow_pos(x,3) + pow_pos(y,3) <= 1
(h) The problem here is that xy isn't concave, which causes CVX to reject the statement. To
correct this, notice that
\[ \sqrt{xy - z^2} = \sqrt{y (x - z^2/y)}, \]
so we can reformulate the constraint as
x + z <= 1 + geo_mean([x - quad_over_lin(z,y), y])
This works, since geo_mean is concave and nondecreasing in each argument. It therefore
accepts a concave function in its first argument.

We can check our reformulations by writing the following feasibility problem in CVX (which is
obviously infeasible)

cvx_begin
variables x y u v z
x == 0;
y == 0;
(x + y)^4 <= x - y;
inv_pos(x) + inv_pos(y) <= 1;
norm([u; v]) <= 3*x + y;
max(x,1) <= u;
max(y,2) <= v;
x >= inv_pos(y);
x >= 0;
y >= 0;
quad_over_lin(x + y, sqrt(y)) <= x - y + 5;
pow_pos(x,3) + pow_pos(y,3) <= 1;
x+z <= 1+geo_mean([x-quad_over_lin(z,y), y])
cvx_end

3.4 Optimal activity levels. Solve the optimal activity level problem described in exercise 4.17 in Convex
Optimization, for the instance with problem data

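\[
A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 3 & 1 \\ 0 & 3 & 1 & 1 \\ 2 & 1 & 2 & 5 \\ 1 & 0 & 3 & 2 \end{bmatrix}, \quad
c^{\max} = \begin{bmatrix} 100 \\ 100 \\ 100 \\ 100 \\ 100 \end{bmatrix}, \quad
p = \begin{bmatrix} 3 \\ 2 \\ 7 \\ 6 \end{bmatrix}, \quad
p^{\mathrm{disc}} = \begin{bmatrix} 2 \\ 1 \\ 4 \\ 2 \end{bmatrix}, \quad
q = \begin{bmatrix} 4 \\ 10 \\ 5 \\ 10 \end{bmatrix}.
\]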

You can do this by forming the LP you found in your solution of exercise 4.17, or more directly,
using CVX*. Give the optimal activity levels, the revenue generated by each one, and the total
revenue generated by the optimal solution. Also, give the average price per unit for each activity
level, i.e., the ratio of the revenue associated with an activity, to the activity level. (These numbers
should be between the basic and discounted prices for each activity.) Give a very brief story
explaining, or at least commenting on, the solution you find.
Solution. For this part, we write the problem in a form close to its original statement, and let
CVX* do the work of reformulating it as an LP.
The following CVX code implements this:

A=[ 1 2 0 1;
0 0 3 1;
0 3 1 1;
2 1 2 5;
1 0 3 2];

cmax=[100;100;100;100;100];
p=[3;2;7;6];
pdisc=[2;1;4;2];
q=[4; 10 ;5; 10];

cvx_begin
variable x(4)
maximize( sum(min(p.*x,p.*q+pdisc.*(x-q))) )
subject to
x >= 0;
A*x <= cmax
cvx_end

x
r=min(p.*x,p.*q+pdisc.*(x-q))
totr=sum(r)
avgPrice=r./x

# exercise 4.17 instance using CVXPY (written against the CVXPY 1.x API)
import cvxpy as cvx
import numpy as np

A = np.array([[1, 2, 0, 1],
              [0, 0, 3, 1],
              [0, 3, 1, 1],
              [2, 1, 2, 5],
              [1, 0, 3, 2]])

cmax = np.array([100, 100, 100, 100, 100])
p = np.array([3, 2, 7, 6])
pdisc = np.array([2, 1, 4, 2])
q = np.array([4, 10, 5, 10])

x = cvx.Variable(4)

# revenue per activity: basic price up to q, discounted price beyond it
t1 = cvx.multiply(p, x)
t2 = cvx.multiply(p, q) + cvx.multiply(pdisc, x - q)
obj = cvx.Maximize(cvx.sum(cvx.minimum(t1, t2)))

cons = [x >= 0, A @ x <= cmax]

prob = cvx.Problem(obj, cons)
prob.solve()

r = cvx.minimum(t1, t2).value
totr = r.sum()
avgPrice = r / x.value

print(x.value)
print(r)
print(totr)
print(avgPrice)

using Convex

A = [ 1 2 0 1;
0 0 3 1;
0 3 1 1;
2 1 2 5;
1 0 3 2];

cmax = [100;100;100;100;100];
p = [3;2;7;6];
pdisc = [2;1;4;2];
q = [4; 10 ;5; 10];

x = Variable(4);
r = min(p.*x, p.*q + pdisc.*(x - q));
totr = sum(r);
constraints = [x >= 0, A*x <= cmax];
p = maximize(totr, constraints);
solve!(p);

avgPrice = evaluate(r)./evaluate(x);
println(x.value)
println(evaluate(r))
println(evaluate(totr))
println(avgPrice)

The result of the code is

x =

4.0000
22.5000
31.0000
1.5000

r =

12.0000
32.5000
139.0000
9.0000

totr =

192.5000

avgPrice =

3.0000
1.4444
4.4839
6.0000

We notice that the 3rd activity level is the highest and is also the one with the highest basic price.
Since it also has a high discounted price, its activity level is higher than the discount quantity level
and it produces the highest contribution to the total revenue. The 4th activity has a discounted
price which is substantially lower than the basic price, and its activity is therefore lower than the
discount quantity level. Moreover it requires the use of a lot of resources and therefore its activity
level is low.
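The reported numbers can be re-derived without a solver. The sketch below takes the activity levels printed above as given (it does not re-solve the problem), recomputes the revenue of each activity from the two-piece price schedule, and evaluates the resource usage $Ax$.

```python
A = [[1, 2, 0, 1],
     [0, 0, 3, 1],
     [0, 3, 1, 1],
     [2, 1, 2, 5],
     [1, 0, 3, 2]]
cmax = [100, 100, 100, 100, 100]
p = [3, 2, 7, 6]
pdisc = [2, 1, 4, 2]
q = [4, 10, 5, 10]

x = [4.0, 22.5, 31.0, 1.5]     # activity levels reported in the solver output above

def revenue(xj, pj, pdj, qj):
    """Basic price pj up to quantity qj, discounted price pdj beyond it."""
    return min(pj * xj, pj * qj + pdj * (xj - qj))

r = [revenue(xj, pj, pdj, qj) for xj, pj, pdj, qj in zip(x, p, pdisc, q)]
totr = sum(r)
avg_price = [rj / xj for rj, xj in zip(r, x)]
usage = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

print(r, totr)
print([round(v, 4) for v in avg_price])
print(usage)                   # compare with cmax to see which resources are exhausted
```

The usage vector shows that the last three resource constraints are tight at this solution, consistent with the remark that resource consumption limits the 4th activity.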

3.5 Minimizing the ratio of convex and concave piecewise-linear functions. We consider the problem

\[
\begin{array}{ll}
\text{minimize} & \dfrac{\max_{i=1,\ldots,m} (a_i^T x + b_i)}{\min_{i=1,\ldots,p} (c_i^T x + d_i)} \\[2ex]
\text{subject to} & Fx \preceq g,
\end{array}
\]

with variable $x \in \mathbf{R}^n$. We assume that $c_i^T x + d_i > 0$ and $\max_{i=1,\ldots,m} (a_i^T x + b_i) \geq 0$ for all $x$ satisfying
$Fx \preceq g$, and that the feasible set is nonempty and bounded. This problem is quasiconvex, and can
be solved using bisection, with each iteration involving a feasibility LP. Show how the problem can
be solved by solving one LP, using a trick similar to one described in §4.3.2.
Solution. We will show that the problem is equivalent to the optimization problem

\[
\begin{array}{ll}
\text{minimize} & \max_{i=1,\ldots,m} (a_i^T y + b_i t) \\
\text{subject to} & \min_{i=1,\ldots,p} (c_i^T y + d_i t) \geq 1 \\
& Fy \preceq gt \\
& t \geq 0
\end{array}
\tag{1}
\]

with variables $y$, $t$. This can be further expressed as an LP using the standard tricks, by introducing
an additional variable $u$:
\[
\begin{array}{ll}
\text{minimize} & u \\
\text{subject to} & a_i^T y + b_i t \leq u, \quad i = 1, \ldots, m \\
& c_i^T y + d_i t \geq 1, \quad i = 1, \ldots, p \\
& Fy \preceq gt \\
& t \geq 0.
\end{array}
\]

To show that (1) is equivalent to the problem in the assignment, we first note that $t > 0$ for all
feasible $(y, t)$. Indeed, the first constraint implies that $(y, t) \neq 0$. We must have $t > 0$ because
otherwise $Fy \preceq 0$ and $y \neq 0$, which means that $y$ defines an unbounded direction in the polyhedron
$\{x \mid Fx \preceq g\}$, contradicting the assumption that this polyhedron is bounded. If $t > 0$ for all feasible
$y$, $t$, we can rewrite problem (1) as

\[
\begin{array}{ll}
\text{minimize} & t \max_{i=1,\ldots,m} (a_i^T (y/t) + b_i) \\
\text{subject to} & \min_{i=1,\ldots,p} (c_i^T (y/t) + d_i) \geq 1/t \\
& F(y/t) \preceq g \\
& t \geq 0.
\end{array}
\tag{2}
\]

Next we argue that the first constraint necessarily holds with equality at the optimum, i.e., the
optimal solution of (2) is also the solution of

\[
\begin{array}{ll}
\text{minimize} & t \max_{i=1,\ldots,m} (a_i^T (y/t) + b_i) \\
\text{subject to} & \min_{i=1,\ldots,p} (c_i^T (y/t) + d_i) = 1/t \\
& F(y/t) \preceq g \\
& t \geq 0.
\end{array}
\tag{3}
\]

To see this, suppose we fix $y/t$ in (2) and optimize only over $t$. Since $\max_i (a_i^T (y/t) + b_i) \geq 0$ if
$F(y/t) \preceq g$, we minimize the cost function by making $t$ as small as possible, i.e., choosing $t$ such
that
\[ \min_{i=1,\ldots,p} (c_i^T (y/t) + d_i) = 1/t. \]

The final step is to substitute this expression for the optimal $t$ in the cost function of (3) to get

\[
\begin{array}{ll}
\text{minimize} & \dfrac{\max_{i=1,\ldots,m} (a_i^T (y/t) + b_i)}{\min_{i=1,\ldots,p} (c_i^T (y/t) + d_i)} \\[2ex]
\text{subject to} & F(y/t) \preceq g \\
& t \geq 0.
\end{array}
\]

This is the problem of the assignment with $x = y/t$.
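To illustrate the construction, the sketch below assembles the LP in the variables $(y, t, u)$ in the inequality form "minimize $\text{cost}^T v$ subject to $Gv \leq h$", and maps a feasible $x$ of the original problem to a feasible LP point with the same objective value. The function names, the stacking order of the variables, and the one-dimensional test instance are assumptions made for illustration; no LP solver is called.

```python
def build_lp(a, b, c, d, F, g):
    """Return (cost, G, h) for: minimize cost^T v s.t. G v <= h, with v = (y, t, u)."""
    n = len(a[0])
    cost = [0.0] * n + [0.0, 1.0]                      # objective is u
    G, h = [], []
    for ai, bi in zip(a, b):                           # a_i^T y + b_i t - u <= 0
        G.append(list(ai) + [bi, -1.0])
        h.append(0.0)
    for ci, di in zip(c, d):                           # -(c_i^T y + d_i t) <= -1
        G.append([-cij for cij in ci] + [-di, 0.0])
        h.append(-1.0)
    for Fk, gk in zip(F, g):                           # F y - g t <= 0
        G.append(list(Fk) + [-gk, 0.0])
        h.append(0.0)
    G.append([0.0] * n + [-1.0, 0.0])                  # -t <= 0
    h.append(0.0)
    return cost, G, h

def lift(x, a, b, c, d):
    """Map a feasible x to (y, t, u) with x = y / t and t = 1 / min_i (c_i^T x + d_i)."""
    dot = lambda r, s: sum(ri * si for ri, si in zip(r, s))
    t = 1.0 / min(dot(ci, x) + di for ci, di in zip(c, d))
    y = [t * xi for xi in x]
    u = max(dot(ai, y) + bi * t for ai, bi in zip(a, b))
    return y + [t, u]

# Hypothetical instance with n = 1: minimize max(x, 1 - x) / (x + 2) over -1 <= x <= 1.
a, b = [[1.0], [-1.0]], [0.0, 1.0]
c, d = [[1.0]], [2.0]
F, g = [[1.0], [-1.0]], [1.0, 1.0]

cost, G, h = build_lp(a, b, c, d, F, g)
v = lift([0.5], a, b, c, d)
lp_feasible = all(sum(gi * vi for gi, vi in zip(row, v)) <= hi + 1e-12
                  for row, hi in zip(G, h))
lp_objective = sum(ci * vi for ci, vi in zip(cost, v))
print(lp_feasible, lp_objective)   # the objective equals the ratio at x = 0.5
```

The triple `(cost, G, h)` is in the form accepted by most LP solvers, so it could be passed to one to obtain the optimal $(y, t, u)$ and recover $x = y/t$.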

3.6 Two problems involving two norms. We consider the problem

\[ \text{minimize} \quad \frac{\|Ax - b\|_1}{1 - \|x\|_\infty}, \tag{4} \]

and the very closely related problem
\[ \text{minimize} \quad \frac{\|Ax - b\|_1^2}{1 - \|x\|_\infty}. \tag{5} \]
In both problems, the variable is $x \in \mathbf{R}^n$, and the data are $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$. Note that
the only difference between problem (4) and (5) is the square in the numerator. In both problems,
the constraint $\|x\|_\infty < 1$ is implicit. You can assume that $b \notin \mathcal{R}(A)$, in which case the constraint
$\|x\|_\infty < 1$ can be replaced with $\|x\|_\infty \leq 1$.
Answer the following two questions, for each of the two problems. (So you will answer four questions
all together.)

(a) Is the problem, exactly as stated (and for all problem data), convex? If not, is it quasiconvex?
Justify your answer.
(b) Explain how to solve the problem. Your method can involve an SDP solver, an SOCP solver,
an LP solver, or any combination. You can include a one-parameter bisection, if necessary.
(For example, you can solve the problem by bisection on a parameter, where each iteration
consists of solving an SOCP feasibility problem.)
Give the best method you can. In judging "best", we use the following rules:
• Bisection methods are worse than "one-shot" methods. Any method that solves the problem
  above by solving one LP, SOCP, or SDP problem is better than any method that uses a
  one-parameter bisection. In other words, use a bisection method only if you cannot find
  a "one-shot" method.
• Use the simplest solver needed to solve the problem. We consider an LP solver to be
  simpler than an SOCP solver, which is considered simpler than an SDP solver. Thus, a
  method that uses an LP solver is better than a method that uses an SOCP solver, which
  in turn is better than a method that uses an SDP solver.

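Solution. First we discuss the problem

\[
\mbox{minimize} \quad \frac{\|Ax - b\|_1}{1 - \|x\|_\infty}.
\]

(a) This problem is in general not convex. As an example, take $n = 1$,
\[
A = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 0.01 \end{bmatrix}.
\]

The objective is
\[
f(x) = \frac{|2x - 1| + 0.01}{1 - |x|},
\]
which is not convex (as can be easily seen by plotting the function). However, the problem is
quasiconvex: for $\gamma \geq 0$, the sublevel sets

\[
S_\gamma = \{x \mid \|Ax - b\|_1 / (1 - \|x\|_\infty) \leq \gamma\}
         = \{x \mid \|Ax - b\|_1 + \gamma \|x\|_\infty \leq \gamma\}
\]

are convex. For $\gamma < 0$ the $\gamma$-sublevel set is empty, hence convex.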
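The nonconvexity claimed for this example can also be checked numerically instead of by plotting. The following Python sketch evaluates the objective at two points and at their midpoint; the function name `f` and the test points 0 and 0.5 are choices made here for illustration only.

```python
def f(x):
    """Objective (|2x - 1| + 0.01) / (1 - |x|) of the n = 1 example (domain |x| < 1)."""
    assert abs(x) < 1
    return (abs(2 * x - 1) + 0.01) / (1 - abs(x))

x1, x2 = 0.0, 0.5
mid = 0.5 * (x1 + x2)

# A convex function can never exceed the chord value at the midpoint.
chord = 0.5 * (f(x1) + f(x2))
violates_convexity = f(mid) > chord

# A quasiconvex function stays below the larger endpoint value.
consistent_with_quasiconvexity = f(mid) <= max(f(x1), f(x2))

print(f(x1), f(x2), f(mid), chord)
print(violates_convexity, consistent_with_quasiconvexity)
```

The midpoint value lies above the chord, so f is not convex, while the quasiconvexity inequality still holds at these points.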

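(b) The problem can be solved using a one-parameter bisection and an LP solver. At each iteration
we solve the feasibility problem

\[
\begin{array}{ll}
\mbox{find} & x \\
\mbox{subject to} & \|Ax - b\|_1 / (1 - \|x\|_\infty) \leq \gamma
\end{array}
\]

(with domain $\{x \mid \|x\|_\infty < 1\}$) for fixed $\gamma$. The assumption $b \notin \mathcal{R}(A)$ implies that the optimal
value is positive, so we only have to consider positive values of $\gamma$. The feasibility problem can
be written in convex form
\[
\begin{array}{ll}
\mbox{find} & x \\
\mbox{subject to} & \|Ax - b\|_1 + \gamma \|x\|_\infty \leq \gamma.
\end{array}
\]

(Note that if $\gamma > 0$ the constraint implies the implicit constraint $\|x\|_\infty < 1$.) It can be further
rewritten as an LP feasibility problem

\[
\begin{array}{ll}
\mbox{find} & x \\
\mbox{subject to} & -y \mathbf{1} \preceq x \preceq y \mathbf{1} \\
& -z \preceq Ax - b \preceq z \\
& \mathbf{1}^T z + \gamma y \leq \gamma
\end{array}
\]

with variables $x \in \mathbf{R}^n$, $y \in \mathbf{R}$, $z \in \mathbf{R}^m$.
In fact, we can do better and pose the problem as a single LP. We first note that by introducing
an auxiliary scalar variable $t$ we can formulate the problem as

\[
\begin{array}{ll}
\mbox{minimize} & \|Ax - b\|_1 / t \\
\mbox{subject to} & t + \|x\|_\infty \leq 1
\end{array}
\]

with an implicit constraint $t > 0$. A change of variables $y = x/t$, $z = 1/t$ gives a convex
problem
\[
\begin{array}{ll}
\mbox{minimize} & \|Ay - bz\|_1 \\
\mbox{subject to} & 1 + \|y\|_\infty \leq z.
\end{array}
\]
(Note that the constraint implies $z > 0$.) This problem now reduces to an LP

\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{1}^T u \\
\mbox{subject to} & -u \preceq Ay - bz \preceq u \\
& -v \mathbf{1} \preceq y \preceq v \mathbf{1} \\
& 1 + v \leq z
\end{array}
\]

with variables $u \in \mathbf{R}^m$, $y \in \mathbf{R}^n$, $z \in \mathbf{R}$, $v \in \mathbf{R}$.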
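The change of variables y = x/t, z = 1/t used above can be sanity-checked on random data. The sketch below uses only the Python standard library; the helper names (`norm1`, `norm_inf`, `matvec`) and the random instance are assumptions made for illustration. It confirms that, for t = 1 - ||x||_inf, the transformed objective equals the original fractional objective and the transformed constraint is tight.

```python
import random

def norm1(v):
    return sum(abs(a) for a in v)

def norm_inf(v):
    return max(abs(a) for a in v)

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

random.seed(0)
m, n = 4, 3
A = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
b = [random.gauss(0, 1) for _ in range(m)]
x = [random.uniform(-0.9, 0.9) for _ in range(n)]   # any point with ||x||_inf < 1

# Original fractional objective ||Ax - b||_1 / (1 - ||x||_inf).
frac = norm1([p - q for p, q in zip(matvec(A, x), b)]) / (1 - norm_inf(x))

# Transformed point: t = 1 - ||x||_inf is the largest t allowed by t + ||x||_inf <= 1.
t = 1 - norm_inf(x)
y = [xi / t for xi in x]
z = 1 / t
lp_obj = norm1([p - q * z for p, q in zip(matvec(A, y), b)])   # ||Ay - bz||_1

print(frac, lp_obj, 1 + norm_inf(y), z)
```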

Next we consider the problem

\[
\mbox{minimize} \quad \frac{\|Ax - b\|_1^2}{1 - \|x\|_\infty}.
\]

(a) This problem is convex. The objective is the composition of the function
\[
\phi(s, t) = \left\{ \begin{array}{ll} s^2/(1 - t) & s \geq 0 \\ 0 & s < 0, \end{array} \right.
\]
with domain $\mathbf{R} \times [0, 1)$, with the functions $g_1(x) = \|Ax - b\|_1$ and $g_2(x) = \|x\|_\infty$. Since $\phi$
is convex, and nondecreasing in each argument, and $g_1$ and $g_2$ are convex, the composition
$\phi(g_1(x), g_2(x))$ is a convex function. (We had to extend the function $\phi$ as zero for $s < 0$ in
order to claim that $\phi$ is nondecreasing in each argument.)
As another proof, one can note that the epigraph of the function is a convex set, since
\[
\frac{\|Ax - b\|_1^2}{1 - \|x\|_\infty} \leq t, \quad \|x\|_\infty < 1
\qquad \Longleftrightarrow \qquad
\frac{\|Ax - b\|_1^2}{t} + \|x\|_\infty \leq 1, \quad t > 0
\]
and the left-hand side of the second inequality is a convex function, jointly in $x$ and $t$. (The
function $\|y\|_1^2/t$ is convex because it is the perspective of the convex function $\|z\|_1^2$.)
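(b) The problem can be written as an SOCP or SDP. With the assumption that $b \notin \mathcal{R}(A)$, it is
equivalent to the problem:
\[
\begin{array}{ll}
\mbox{minimize} & y \\
\mbox{subject to} & s^2/(1 - t) \leq y \\
& \|Ax - b\|_1 \leq s \\
& \|x\|_\infty \leq t \\
& t \leq 1.
\end{array}
\]
The second and third constraints can be reformulated as linear inequalities. The first constraint
is equivalent to the linear matrix inequality
\[
\begin{bmatrix} 1 - t & s \\ s & y \end{bmatrix} \succeq 0
\]
and also (using the observation in problem T4.26) to a second-order cone constraint
\[
\left\| \begin{bmatrix} 2s \\ 1 - t - y \end{bmatrix} \right\|_2 \leq 1 - t + y, \qquad t \leq 1, \qquad y \geq 0.
\]

If we use the SOCP formulation we obtain

\[
\begin{array}{ll}
\mbox{minimize} & y \\
\mbox{subject to} & -z \preceq Ax - b \preceq z, \\
& \mathbf{1}^T z \leq s, \\
& -t \mathbf{1} \preceq x \preceq t \mathbf{1}, \\
& \left\| \begin{bmatrix} 2s \\ 1 - t - y \end{bmatrix} \right\|_2 \leq 1 - t + y \\
& t \leq 1 \\
& y \geq 0,
\end{array}
\]
which is an SOCP in the variables $s$, $t$, $x$, $y$ and $z$.
The SDP formulation is
\[
\begin{array}{ll}
\mbox{minimize} & y \\
\mbox{subject to} & -z \preceq Ax - b \preceq z, \\
& \mathbf{1}^T z \leq s, \\
& -t \mathbf{1} \preceq x \preceq t \mathbf{1}, \\
& \begin{bmatrix} 1 - t & s \\ s & y \end{bmatrix} \succeq 0.
\end{array}
\]

There are several variations on this; for example we can just substitute $\mathbf{1}^T z$ for $s$ in the linear
matrix inequality.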
3.7 The illumination problem. In lecture 1 we encountered the function

\[
f(p) = \max_{i=1,\ldots,n} | \log a_i^T p - \log I_{\rm des} |
\]

where $a_i \in \mathbf{R}^m$, and $I_{\rm des} > 0$ are given, and $p \in \mathbf{R}^m_+$.

(a) Show that $\exp f$ is convex on $\{p \mid a_i^T p > 0, \; i = 1, \ldots, n\}$.
(b) Show that the constraint "no more than half of the total power is in any 10 lamps" is convex
(i.e., the set of vectors $p$ that satisfy the constraint is convex).
(c) Show that the constraint "no more than half of the lamps are on" is (in general) not convex.

Solution. To simplify the notation, we assume that $I_{\rm des} = 1$ (if not, we can simply redefine $a_{ij}$
as $a_{ij}/I_{\rm des}$). We write $I_i = a_i^T p$ (in the notation of page 1-6 of the lecture notes), so the objective
function can be written as
\[
f(p) = \max_i | \log a_i^T p |.
\]
The domain of $f$ is
\[
\{p \mid a_i^T p > 0, \; i = 1, \ldots, n\}.
\]

(a) First note that

\[
| \log(a_i^T p) | = \max\{\log a_i^T p, \; \log(1/a_i^T p)\} = \log \max\{a_i^T p, \; 1/a_i^T p\},
\]

so we can write the objective function as

\[
f(p) = \log \max_i \max\{a_i^T p, \; 1/a_i^T p\}.
\]

Both $a_i^T p$ and $1/a_i^T p$ are convex on $\mathbf{dom}\, f$, and therefore $\max_i \max\{a_i^T p, 1/a_i^T p\}$ is a convex
function. In other words $\exp f$ is convex.
(b) This constraint can be expressed as
\[
\sum_{i=1}^{l} p_{[i]} - 0.5 \sum_{i=1}^{m} p_i \leq 0
\]
where $p_{[i]}$ is the $i$th largest component of $p$ (see page 4-11 of the lecture notes), and $l = 10$ for
the constraint as stated. The first term on the left side is the power in the $l$ lamps with the highest
power, the second term is one half of the total power. The function on the left-hand side is convex,
since it is the sum of $\sum_{i=1}^{l} p_{[i]}$, which is convex, and a linear function.
(c) Consider two solutions $p_1$ and $p_2$ that satisfy the constraint. In the first solution, the first
$m/2$ lamps are on and the rest are off (i.e., the first $m/2$ components of $p_1$ are positive and the
rest are zero); in the second solution the first $m/2$ lamps are off and the rest are on (i.e., the first
$m/2$ components of $p_2$ are zero, and the rest are positive). The number of nonzero components
in a strict convex combination of $p_1$ and $p_2$ will be $m$, i.e., the convex combination does not satisfy
the constraint.
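The counterexample in part (c) is easy to reproduce numerically. The following Python sketch uses m = 6 lamps and unit powers; these values and the helper name `lamps_on` are chosen here purely for illustration.

```python
def lamps_on(p):
    """Number of lamps with positive power."""
    return sum(1 for pi in p if pi > 0)

m = 6
p1 = [1.0] * (m // 2) + [0.0] * (m // 2)   # first half of the lamps on
p2 = [0.0] * (m // 2) + [1.0] * (m // 2)   # second half of the lamps on

theta = 0.5
pc = [theta * a + (1 - theta) * b for a, b in zip(p1, p2)]   # convex combination

# p1 and p2 each have at most m/2 lamps on; their average has all m lamps on.
print(lamps_on(p1), lamps_on(p2), lamps_on(pc))
```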

3.8 Schur complements and LMI representation. Recognizing Schur complements (see §A.5.5) often
helps to represent nonlinear convex constraints as linear matrix inequalities (LMIs). Consider the
function
\[
f(x) = (Ax + b)^T (P_0 + x_1 P_1 + \cdots + x_n P_n)^{-1} (Ax + b)
\]
where $A \in \mathbf{R}^{m \times n}$, $b \in \mathbf{R}^m$, and $P_i = P_i^T \in \mathbf{R}^{m \times m}$, with domain

\[
\mathbf{dom}\, f = \{x \in \mathbf{R}^n \mid P_0 + x_1 P_1 + \cdots + x_n P_n \succ 0\}.
\]

This is the composition of the matrix fractional function and an affine mapping, and so is convex.
Give an LMI representation of $\mathbf{epi}\, f$. That is, find a symmetric matrix $F(x, t)$, affine in $(x, t)$, for
which
\[
x \in \mathbf{dom}\, f, \quad f(x) \leq t \qquad \Longleftrightarrow \qquad F(x, t) \succeq 0.
\]
Remark. LMI representations, such as the one you found in this exercise, can be directly used in
software systems such as CVX.
Solution. The epigraph of $f$ is the set of points $(x, t)$ that satisfy $P_0 + x_1 P_1 + \cdots + x_n P_n \succ 0$ and

\[
(Ax + b)^T (P_0 + x_1 P_1 + \cdots + x_n P_n)^{-1} (Ax + b) \leq t.
\]

Using Schur complements, we can write the second inequality as

\[
\begin{bmatrix} t & (Ax + b)^T \\ (Ax + b) & P_0 + x_1 P_1 + \cdots + x_n P_n \end{bmatrix} \succeq 0.
\]

This is a linear matrix inequality in the variables $x$, $t$, i.e., a convex constraint.


3.9 Complex least-norm problem. We consider the complex least $\ell_p$-norm problem

\[
\begin{array}{ll}
\mbox{minimize} & \|x\|_p \\
\mbox{subject to} & Ax = b,
\end{array}
\]

where $A \in \mathbf{C}^{m \times n}$, $b \in \mathbf{C}^m$, and the variable is $x \in \mathbf{C}^n$. Here $\|\cdot\|_p$ denotes the $\ell_p$-norm on $\mathbf{C}^n$,
defined as
\[
\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}
\]
for $p \geq 1$, and $\|x\|_\infty = \max_{i=1,\ldots,n} |x_i|$. We assume $A$ is full rank, and $m < n$.

(a) Formulate the complex least $\ell_2$-norm problem as a least $\ell_2$-norm problem with real problem
data and variable. Hint. Use $z = (\Re x, \Im x) \in \mathbf{R}^{2n}$ as the variable.
(b) Formulate the complex least $\ell_\infty$-norm problem as an SOCP.
(c) Solve a random instance of both problems with m = 30 and n = 100. To generate the
matrix A, you can use the Matlab command A = randn(m,n) + i*randn(m,n). Similarly,
use b = randn(m,1) + i*randn(m,1) to generate the vector b. Use the Matlab command
scatter to plot the optimal solutions of the two problems on the complex plane, and comment
(briefly) on what you observe. You can solve the problems using the CVX functions norm(x,2)
and norm(x,inf), which are overloaded to handle complex arguments. To utilize this feature,
you will need to declare variables to be complex in the variable statement. (In particular,
you do not have to manually form or solve the SOCP from part (b).)

Solution.

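(a) Define $z = (\Re x, \Im x) \in \mathbf{R}^{2n}$, so $\|x\|_2^2 = \|z\|_2^2$. The complex linear equations $Ax = b$ are the same
as $\Re(Ax) = \Re b$, $\Im(Ax) = \Im b$, which in turn can be expressed as the set of linear equations
\[
\begin{bmatrix} \Re A & -\Im A \\ \Im A & \Re A \end{bmatrix} z = \begin{bmatrix} \Re b \\ \Im b \end{bmatrix}.
\]

Thus, the complex least $\ell_2$-norm problem can be expressed as

\[
\begin{array}{ll}
\mbox{minimize} & \|z\|_2 \\
\mbox{subject to} & \begin{bmatrix} \Re A & -\Im A \\ \Im A & \Re A \end{bmatrix} z = \begin{bmatrix} \Re b \\ \Im b \end{bmatrix}.
\end{array}
\]

(This is readily solved analytically).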
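The real reformulation in part (a) can be verified on a small instance. The sketch below, in plain Python, builds the real block matrix from a complex A and checks that it maps z = (Re x, Im x) to (Re b, Im b); the 2 x 3 matrix and the vector x are arbitrary values chosen here for illustration.

```python
# Small complex instance (values chosen arbitrarily for illustration).
A = [[1 + 2j, 0 - 1j, 3 + 0j],
     [2 - 1j, 1 + 1j, 0 + 4j]]
x = [1 - 1j, 2 + 0.5j, -3 + 2j]
b = [sum(a * xi for a, xi in zip(row, x)) for row in A]   # b = A x

# Real block matrix [Re A, -Im A; Im A, Re A] and the stacked real vectors.
A_real = ([[a.real for a in row] + [-a.imag for a in row] for row in A] +
          [[a.imag for a in row] + [a.real for a in row] for row in A])
z = [xi.real for xi in x] + [xi.imag for xi in x]
b_real = [bi.real for bi in b] + [bi.imag for bi in b]

# The real system reproduces the complex equations, and the 2-norms agree.
lhs = [sum(a * zi for a, zi in zip(row, z)) for row in A_real]
max_err = max(abs(l - r) for l, r in zip(lhs, b_real))
norm_x_sq = sum(abs(xi) ** 2 for xi in x)
norm_z_sq = sum(zi ** 2 for zi in z)
print(max_err, norm_x_sq, norm_z_sq)
```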


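(b) Using epigraph formulation, with new variable $t$, we write the problem as

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \left\| \begin{bmatrix} z_i \\ z_{n+i} \end{bmatrix} \right\|_2 \leq t, \quad i = 1, \ldots, n \\
& \begin{bmatrix} \Re A & -\Im A \\ \Im A & \Re A \end{bmatrix} z = \begin{bmatrix} \Re b \\ \Im b \end{bmatrix}.
\end{array}
\]

This is an SOCP with $n$ second-order cone constraints (in $\mathbf{R}^3$).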


(c)
% complex minimum norm problem
%
randn('state',0);
m = 30; n = 100;
% generate matrix A and vector b
Are = randn(m,n); Aim = randn(m,n);
bre = randn(m,1); bim = randn(m,1);

A = Are + i*Aim;
b = bre + i*bim;

% 2-norm problem (analytical solution)
Atot = [Are -Aim; Aim Are];
btot = [bre; bim];
z_2 = Atot'*inv(Atot*Atot')*btot;
x_2 = z_2(1:n) + i*z_2(n+1:2*n);

% 2-norm problem solution with cvx
cvx_begin
    variable x(n) complex
    minimize( norm(x) )
    subject to
        A*x == b;
cvx_end

% inf-norm problem solution with cvx
cvx_begin
    variable xinf(n) complex
    minimize( norm(xinf,Inf) )
    subject to
        A*xinf == b;
cvx_end

% scatter plot
figure(1)
scatter(real(x),imag(x)), hold on,
scatter(real(xinf),imag(xinf),[],'filled'), hold off,
axis([-0.2 0.2 -0.2 0.2]), axis square,
xlabel('Re x'); ylabel('Im x');
The plot of the components of the optimal $p = 2$ (empty circles) and $p = \infty$ (filled circles) solutions
is presented below. The optimal $p = \infty$ solution minimizes the objective $\max_{i=1,\ldots,n} |x_i|$ subject
to $Ax = b$, and the scatter plot of $x_i$ shows that almost all of them are concentrated around a
circle in the complex plane. This should be expected since we are minimizing the maximum
magnitude of $x_i$, and thus almost all of the $x_i$'s should have about an equal magnitude $|x_i|$.

[Figure: scatter plot of the two optimal solutions in the complex plane; horizontal axis Re x, vertical axis Im x, each from -0.2 to 0.2.]

3.10 Linear programming with random cost vector. We consider the linear program

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b.
\end{array}
\]

Here, however, the cost vector $c$ is random, normally distributed with mean $\mathbf{E}\, c = c_0$ and covariance
$\mathbf{E}(c - c_0)(c - c_0)^T = \Sigma$. ($A$, $b$, and $x$ are deterministic.) Thus, for a given $x \in \mathbf{R}^n$, the cost $c^T x$ is
a (scalar) Gaussian variable.
We can attach several different meanings to the goal "minimize $c^T x$"; we explore some of these
below.
(a) How would you minimize the expected cost $\mathbf{E}\, c^T x$ subject to $Ax \preceq b$?
(b) In general there is a tradeoff between small expected cost and small cost variance. One way
to take variance into account is to minimize a linear combination
\[
\mathbf{E}\, c^T x + \gamma\, \mathbf{var}(c^T x) \tag{6}
\]
of the expected value $\mathbf{E}\, c^T x$ and the variance $\mathbf{var}(c^T x) = \mathbf{E}(c^T x)^2 - (\mathbf{E}\, c^T x)^2$. This is called
the "risk-sensitive cost", and the parameter $\gamma \geq 0$ is called the risk-aversion parameter, since
it sets the relative values of cost variance and expected value. (For $\gamma > 0$, we are willing to
trade off an increase in expected cost for a decrease in cost variance.) How would you minimize
the risk-sensitive cost? Is this problem a convex optimization problem? Be as specific as you
can.
(c) We can also minimize the risk-sensitive cost, but with $\gamma < 0$. This is called "risk-seeking". Is
this problem a convex optimization problem?
(d) Another way to deal with the randomness in the cost $c^T x$ is to formulate the problem as
\[
\begin{array}{ll}
\mbox{minimize} & \beta \\
\mbox{subject to} & \mathbf{prob}(c^T x \geq \beta) \leq \alpha \\
& Ax \preceq b.
\end{array}
\]
Here, $\alpha$ is a fixed parameter, which corresponds roughly to the reliability we require, and
might typically have a value of 0.01. Is this problem a convex optimization problem? Be as
specific as you can. Can you obtain risk-seeking by choice of $\alpha$? Explain.
Solution.
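(a) Since $\mathbf{E}\, c^T x = c_0^T x$, the problem is an LP
\[
\begin{array}{ll}
\mbox{minimize} & c_0^T x \\
\mbox{subject to} & Ax \preceq b.
\end{array}
\]

(b) We have
\[
\begin{array}{rcl}
\mathbf{var}(c^T x) & = & \mathbf{E}(c^T x - \mathbf{E}\, c^T x)^2 = \mathbf{E}((c - c_0)^T x)^2 \\
& = & \mathbf{E}\, x^T (c - c_0)(c - c_0)^T x \\
& = & x^T (\mathbf{E}(c - c_0)(c - c_0)^T) x \\
& = & x^T \Sigma x.
\end{array}
\]
The risk-averse cost can therefore be minimized by solving
\[
\begin{array}{ll}
\mbox{minimize} & c_0^T x + \gamma x^T \Sigma x \\
\mbox{subject to} & Ax \preceq b,
\end{array}
\tag{7}
\]
which is a (convex) quadratic program in $x$ (since $\Sigma \succeq 0$ and $\gamma \geq 0$).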

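(c) Problem (7) is not convex if $\gamma < 0$ (the objective function is concave).
(d) The question is whether
\[
\begin{array}{ll}
\mbox{minimize} & \beta \\
\mbox{subject to} & \mathbf{prob}(c^T x \geq \beta) \leq \alpha \\
& Ax \preceq b
\end{array}
\tag{8}
\]
is a convex problem in $x$. For fixed $x$, $c^T x$ is a random variable, normally distributed with
mean $c_0^T x$ and variance $x^T \Sigma x$. Therefore
\[
\mathbf{prob}(c^T x \geq \beta) = \Phi\left( \frac{\beta - c_0^T x}{\|\Sigma^{1/2} x\|_2} \right),
\]
where $\Phi(t) = \frac{1}{\sqrt{2\pi}} \int_t^\infty e^{-u^2/2} \, du$. The function $\Phi$ is monotonically decreasing, and therefore we
can write

\[
\begin{array}{rcl}
\mathbf{prob}(c^T x \geq \beta) \leq \alpha & \Longleftrightarrow & (\beta - c_0^T x)/\|\Sigma^{1/2} x\|_2 \geq \Phi^{-1}(\alpha) \\
& \Longleftrightarrow & \Phi^{-1}(\alpha) \|\Sigma^{1/2} x\|_2 + c_0^T x \leq \beta.
\end{array}
\]

If $\alpha \leq 0.5$, we have $\Phi^{-1}(\alpha) \geq 0$, so this is a convex constraint in $x$.

In summary, we can write (8) as

\[
\begin{array}{ll}
\mbox{minimize} & \beta \\
\mbox{subject to} & \Phi^{-1}(\alpha) \|\Sigma^{1/2} x\|_2 + c_0^T x \leq \beta \\
& Ax \preceq b,
\end{array}
\]

which is an SOCP in $x$ and $\beta$ (a linear objective, one second-order cone constraint, and a set
of linear inequality constraints).
We obtain risk-seeking by choosing $\alpha > 0.5$. If $\alpha > 0.5$, we must have $\beta < \mathbf{E}\, c^T x$. If you plot
the pdf of $c^T x$, it will be clear that there are two ways to decrease $\mathbf{prob}(c^T x \geq \beta)$ if $\beta < \mathbf{E}\, c^T x$:
we can decrease the expected value (this shifts the pdf to the left), or we can increase the
variance, which is a risk-seeking choice.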
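The sign of $\Phi^{-1}(\alpha)$, which decides whether the chance constraint is convex, can be explored with the standard-library class `statistics.NormalDist`. In the sketch below, `mean` and `std` are stand-ins for $c_0^T x$ and $\|\Sigma^{1/2} x\|_2$ at some fixed x; their values and the function name `smallest_beta` are assumptions made for illustration. Because $\Phi$ here is the upper-tail probability, $\Phi^{-1}(\alpha)$ is the $(1 - \alpha)$ quantile of the standard normal distribution.

```python
from statistics import NormalDist

def smallest_beta(mean, std, alpha):
    """Smallest beta with prob(c^T x >= beta) <= alpha when c^T x ~ N(mean, std^2)."""
    # Phi is the upper-tail probability, so Phi^{-1}(alpha) is the (1 - alpha) quantile.
    phi_inv = NormalDist().inv_cdf(1 - alpha)
    return mean + phi_inv * std

mean, std = 2.0, 0.5   # stand-ins for c0^T x and ||Sigma^{1/2} x||_2
results = {}
for alpha in (0.01, 0.5, 0.9):
    beta = smallest_beta(mean, std, alpha)
    tail = 1 - NormalDist(mean, std).cdf(beta)   # prob(c^T x >= beta)
    results[alpha] = (beta, tail)
    print(alpha, beta, tail)
```

For $\alpha < 0.5$ the resulting $\beta$ lies above the mean (the coefficient of the norm term is positive), for $\alpha = 0.5$ it equals the mean, and for $\alpha > 0.5$ it lies below the mean, which is the risk-seeking regime discussed above.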

3.11 Formulate the following optimization problems as semidefinite programs. The variable is $x \in \mathbf{R}^n$;
$F(x)$ is defined as
\[
F(x) = F_0 + x_1 F_1 + x_2 F_2 + \cdots + x_n F_n
\]
with $F_i \in \mathbf{S}^m$. The domain of $f$ in each subproblem is $\mathbf{dom}\, f = \{x \in \mathbf{R}^n \mid F(x) \succ 0\}$.

(a) Minimize $f(x) = c^T F(x)^{-1} c$ where $c \in \mathbf{R}^m$.
(b) Minimize $f(x) = \max_{i=1,\ldots,K} c_i^T F(x)^{-1} c_i$ where $c_i \in \mathbf{R}^m$, $i = 1, \ldots, K$.
(c) Minimize $f(x) = \sup_{\|c\|_2 \leq 1} c^T F(x)^{-1} c$.
(d) Minimize $f(x) = \mathbf{E}(c^T F(x)^{-1} c)$ where $c$ is a random vector with mean $\mathbf{E}\, c = \bar c$ and covariance
$\mathbf{E}(c - \bar c)(c - \bar c)^T = S$.

Solution.

(a) Using the Schur complement theorem we can write the problem as an SDP

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \begin{bmatrix} F(x) & c \\ c^T & t \end{bmatrix} \succeq 0
\end{array}
\]

with variables $x$, $t$. Note that the two problems are not quite equivalent at the boundary of
the domain, i.e., for points $x$ with $F(x)$ positive semidefinite but not positive definite. The
linear matrix inequality in the SDP given above is equivalent to

\[
F(x) \succeq 0, \qquad c \in \mathcal{R}(F(x)), \qquad c^T F(x)^\dagger c \leq t
\]

where $F(x)^\dagger$ is the pseudo-inverse (see page 651 of the textbook). The SDP is therefore
equivalent to
\[
\begin{array}{ll}
\mbox{minimize} & c^T F(x)^\dagger c \\
\mbox{subject to} & F(x) \succeq 0 \\
& c \in \mathcal{R}(F(x)).
\end{array}
\]
If $F(x)$ is positive semidefinite but singular, and $c \in \mathcal{R}(F(x))$, the objective function $c^T F(x)^\dagger c$
is finite, whereas it is $+\infty$ in the original problem. However this does not change the optimal
value of the problem (unless the set $\{x \mid F(x) \succ 0\}$ is empty).
As an example, consider
\[
c = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad F(x) = \begin{bmatrix} x & 0 \\ 0 & 1 - x \end{bmatrix}.
\]

Then the problem in the assignment is to minimize $1/x$, with domain $\{x \mid 0 < x < 1\}$. The
optimal value is 1 and is not attained. The SDP reformulation is equivalent to minimizing
$1/x$ subject to $0 < x \leq 1$. The optimal value is 1 and attained at $x = 1$.
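(b)
\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \begin{bmatrix} F(x) & c_i \\ c_i^T & t \end{bmatrix} \succeq 0, \quad i = 1, \ldots, K.
\end{array}
\]

(c) The cost function can be expressed as

\[
f(x) = \lambda_{\max}(F(x)^{-1}),
\]

so $f(x) \leq t$ if and only if $F(x)^{-1} \preceq tI$. Using a Schur complement we get

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \begin{bmatrix} F(x) & I \\ I & tI \end{bmatrix} \succeq 0.
\end{array}
\]

(d) The cost function can be expressed as

\[
f(x) = \bar c^T F(x)^{-1} \bar c + \mathbf{tr}(F(x)^{-1} S).
\]

If we factor $S$ as $S = \sum_{k=1}^{m} c_k c_k^T$ the problem is equivalent to
\[
\mbox{minimize} \quad \bar c^T F(x)^{-1} \bar c + \sum_{k=1}^{m} c_k^T F(x)^{-1} c_k,
\]

which we can write as an SDP

\[
\begin{array}{ll}
\mbox{minimize} & t_0 + \sum_k t_k \\
\mbox{subject to} & \begin{bmatrix} F(x) & \bar c \\ \bar c^T & t_0 \end{bmatrix} \succeq 0 \\
& \begin{bmatrix} F(x) & c_k \\ c_k^T & t_k \end{bmatrix} \succeq 0, \quad k = 1, \ldots, m.
\end{array}
\]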
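The identity used in part (d), $\mathbf{E}(c^T M c) = \bar c^T M \bar c + \mathbf{tr}(M S)$ for a fixed symmetric matrix $M$ (here playing the role of $F(x)^{-1}$), can be checked exactly on a small discrete distribution. In the Python sketch below, the matrix `M` and the three equally likely values of c are arbitrary choices made for illustration.

```python
def quad(M, u, v):
    """Bilinear form u^T M v for small dense lists."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

M = [[2.0, 0.5], [0.5, 1.0]]                       # stands in for F(x)^{-1}
samples = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]    # equally likely values of c
N = len(samples)

# Mean and covariance of the discrete distribution.
cbar = [sum(c[i] for c in samples) / N for i in range(2)]
S = [[sum((c[i] - cbar[i]) * (c[j] - cbar[j]) for c in samples) / N
      for j in range(2)] for i in range(2)]

lhs = sum(quad(M, c, c) for c in samples) / N       # E c^T M c
tr_MS = sum(M[i][j] * S[j][i] for i in range(2) for j in range(2))
rhs = quad(M, cbar, cbar) + tr_MS                   # cbar^T M cbar + tr(M S)
print(lhs, rhs)
```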

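3.12 A matrix fractional function.[?] Show that $X = B^T A^{-1} B$ solves the SDP

\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{tr}\, X \\
\mbox{subject to} & \begin{bmatrix} A & B \\ B^T & X \end{bmatrix} \succeq 0,
\end{array}
\]

with variable $X \in \mathbf{S}^n$, where $A \in \mathbf{S}^m_{++}$ and $B \in \mathbf{R}^{m \times n}$ are given.
Conclude that $\mathbf{tr}(B^T A^{-1} B)$ is a convex function of $(A, B)$, for $A$ positive definite.
Solution. The constraint is equivalent to $X \succeq B^T A^{-1} B$. Therefore $\mathbf{tr}\, X \geq \mathbf{tr}(B^T A^{-1} B)$ for all
feasible $X$, with equality if $X = B^T A^{-1} B$. This shows that $X = B^T A^{-1} B$ is optimal.
The optimal value can be expressed as $\mathbf{tr}(B^T A^{-1} B) = \inf_X F(X, A, B)$ where $F$ is defined as

\[
F(X, A, B) = \mathbf{tr}\, X,
\]

with domain
\[
\mathbf{dom}\, F = \left\{ (X, A, B) \in \mathbf{S}^n \times \mathbf{S}^m \times \mathbf{R}^{m \times n} \;\middle|\; A \succ 0, \; \begin{bmatrix} A & B \\ B^T & X \end{bmatrix} \succeq 0 \right\}.
\]

The function $F$ is convex, jointly in $A$, $B$, $X$. Therefore its infimum over $X$, which is $\mathbf{tr}(B^T A^{-1} B)$,
is convex in $A$, $B$.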

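3.13 Trace of harmonic mean of matrices. [?] The matrix $H(A, B) = 2(A^{-1} + B^{-1})^{-1}$ is known as the
harmonic mean of positive definite matrices $A$ and $B$. Show that $X = (1/2)H(A, B)$ solves the
SDP
\[
\begin{array}{ll}
\mbox{maximize} & \mathbf{tr}\, X \\
\mbox{subject to} & \begin{bmatrix} X & X \\ X & X \end{bmatrix} \preceq \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix},
\end{array}
\]
with variable $X \in \mathbf{S}^n$. The matrices $A \in \mathbf{S}^n_{++}$ and $B \in \mathbf{S}^n_{++}$ are given. Conclude that the function
$\mathbf{tr}\left((A^{-1} + B^{-1})^{-1}\right)$, with domain $\mathbf{S}^n_{++} \times \mathbf{S}^n_{++}$, is concave.

Hint. Verify that the matrix
\[
R = \begin{bmatrix} A^{-1} & I \\ B^{-1} & -I \end{bmatrix}
\]
is nonsingular. Then apply the congruence transformation defined by $R$ to the two sides of matrix
inequality in the SDP, to obtain an equivalent inequality
\[
R^T \begin{bmatrix} X & X \\ X & X \end{bmatrix} R \preceq R^T \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} R.
\]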

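Solution. We first show that the matrix $R$ is nonsingular. Assume

\[
\begin{bmatrix} A^{-1} & I \\ B^{-1} & -I \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]

From the first equation, $y = -A^{-1} x$. Substituting this in the second equation gives $B^{-1} x + A^{-1} x = 0$,
and therefore $x^T (B^{-1} + A^{-1}) x = 0$. Since $A$ and $B$ are positive definite this implies $x = 0$. If
$x = 0$, then also $y = -A^{-1} x = 0$. This shows that the matrix has a zero nullspace, or, equivalently,
its columns are linearly independent.
Following the hint, we write the constraint as
\[
\begin{bmatrix} A^{-1} & B^{-1} \\ I & -I \end{bmatrix}
\begin{bmatrix} X & X \\ X & X \end{bmatrix}
\begin{bmatrix} A^{-1} & I \\ B^{-1} & -I \end{bmatrix}
\preceq
\begin{bmatrix} A^{-1} & B^{-1} \\ I & -I \end{bmatrix}
\begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}
\begin{bmatrix} A^{-1} & I \\ B^{-1} & -I \end{bmatrix}.
\]

After working out the products we get

\[
\begin{bmatrix} (A^{-1} + B^{-1}) X (A^{-1} + B^{-1}) & 0 \\ 0 & 0 \end{bmatrix}
\preceq
\begin{bmatrix} A^{-1} + B^{-1} & 0 \\ 0 & A + B \end{bmatrix}.
\]

This shows that the SDP is equivalent to

\[
\begin{array}{ll}
\mbox{maximize} & \mathbf{tr}\, X \\
\mbox{subject to} & X \preceq (A^{-1} + B^{-1})^{-1}.
\end{array}
\]

We have $\mathbf{tr}\, X \leq \mathbf{tr}((A^{-1} + B^{-1})^{-1})$ for all feasible $X$, with equality if $X = (A^{-1} + B^{-1})^{-1}$.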
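In the scalar case n = 1 the result can be checked by hand or with a few lines of Python. The constraint becomes positive semidefiniteness of the 2 x 2 matrix [[a - x, -x], [-x, b - x]], which holds exactly when both diagonal entries and the determinant are nonnegative. The values a = 2, b = 3 and the helper name `feasible` below are assumptions made for illustration.

```python
def feasible(x, a, b, tol=1e-12):
    """PSD test for [[a - x, -x], [-x, b - x]], the n = 1 case of the LMI."""
    det = (a - x) * (b - x) - x * x        # equals a*b - (a + b)*x
    return a - x >= -tol and b - x >= -tol and det >= -tol

a, b = 2.0, 3.0
x_star = 1.0 / (1.0 / a + 1.0 / b)          # (1/2) H(a, b) = a*b / (a + b)

# x_star is feasible, and any slightly larger x is not.
print(x_star, feasible(x_star, a, b), feasible(x_star + 1e-6, a, b))
```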

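3.14 Trace of geometric mean of matrices. [?] The matrix

\[
G(A, B) = A^{1/2} \left( A^{-1/2} B A^{-1/2} \right)^{1/2} A^{1/2}
\]

is known as the geometric mean of positive definite matrices $A$ and $B$. Show that $X = G(A, B)$
solves the SDP
\[
\begin{array}{ll}
\mbox{maximize} & \mathbf{tr}\, X \\
\mbox{subject to} & \begin{bmatrix} A & X \\ X & B \end{bmatrix} \succeq 0.
\end{array}
\]
The variable is $X \in \mathbf{S}^n$. The matrices $A \in \mathbf{S}^n_{++}$ and $B \in \mathbf{S}^n_{++}$ are given.
Conclude that the function $\mathbf{tr}\, G(A, B)$ is concave, for $A$, $B$ positive definite.
Hint. The symmetric matrix square root is monotone: if $U$ and $V$ are positive semidefinite with
$U \preceq V$ then $U^{1/2} \preceq V^{1/2}$.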

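Solution. Using Schur complements, we can write the constraint as

\[
X A^{-1} X \preceq B,
\]

and
\[
(A^{-1/2} X A^{-1/2})^2 = A^{-1/2} X A^{-1} X A^{-1/2} \preceq A^{-1/2} B A^{-1/2}.
\]
Using the hint, this implies that

\[
A^{-1/2} X A^{-1/2} \preceq (A^{-1/2} B A^{-1/2})^{1/2}, \qquad
X \preceq A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}.
\]

We conclude that every feasible $X$ satisfies $X \preceq G(A, B)$, and hence $\mathbf{tr}\, X \leq \mathbf{tr}\, G(A, B)$.
Moreover, $X = G(A, B)$ is feasible because

\[
\begin{array}{rcl}
X A^{-1} X & = & A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} A^{-1} A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} \\
& = & A^{1/2} (A^{-1/2} B A^{-1/2}) A^{1/2} \\
& = & B.
\end{array}
\]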

3.15 Transforming a standard form convex problem to conic form. In this problem we show that any
convex problem can be cast in conic form, provided some technical conditions hold. We start with
a standard form convex problem with linear objective (without loss of generality):

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & f_i(x) \leq 0, \quad i = 1, \ldots, m, \\
& Ax = b,
\end{array}
\]

where $f_i : \mathbf{R}^n \to \mathbf{R}$ are convex, and $x \in \mathbf{R}^n$ is the variable. For simplicity, we will assume that
$\mathbf{dom}\, f_i = \mathbf{R}^n$ for each $i$.
Now introduce a new scalar variable $t \in \mathbf{R}$ and form the convex problem

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & t f_i(x/t) \leq 0, \quad i = 1, \ldots, m, \\
& Ax = b, \quad t = 1.
\end{array}
\]

Define
\[
K = \mathbf{cl}\{(x, t) \in \mathbf{R}^{n+1} \mid t f_i(x/t) \leq 0, \; i = 1, \ldots, m, \; t > 0\}.
\]
Then our original problem can be expressed as

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & (x, t) \in K, \\
& Ax = b, \quad t = 1.
\end{array}
\]

This is a conic problem when $K$ is proper.


You will relate some properties of the original problem to K.

(a) Show that $K$ is a convex cone. (It is closed by definition, since we take the closure.)
(b) Suppose the original problem is strictly feasible, i.e., there exists a point $\bar x$ with $f_i(\bar x) < 0$,
$i = 1, \ldots, m$. (This is called Slater's condition.) Show that $K$ has nonempty interior.
(c) Suppose that the inequalities define a bounded set, i.e., $\{x \mid f_i(x) \leq 0, \; i = 1, \ldots, m\}$ is
bounded. Show that $K$ is pointed.

Solution.

(a) The functions $t f_i(x/t)$ are convex, so the intersection of their 0-sublevel sets is convex. We
see that it is a cone since if $(x, t) \in K$ and $\alpha > 0$ we have $\alpha(x, t) \in K$.
(b) We have $(\bar x, 1) \in \mathbf{int}\, K$, so $\mathbf{int}\, K \neq \emptyset$.
(c) To show $K$ is pointed, assume that $(x, t) \in K$ and $-(x, t) \in K$. Since the second component is
always nonnegative in $K$, we see that $t = 0$. So $(x, 0) \in K$, $-(x, 0) \in K$. Assume that $x \neq 0$.
$(x, 0) \in K$ means that there are sequences $x_k \to x$, $t_k \to 0$, $t_k > 0$, with $t_k f_i(x_k/t_k) \leq 0$,
$i = 1, \ldots, m$. This is the same as $f_i(x_k/t_k) \leq 0$. But $x_k \to x \neq 0$, so $x_k/t_k$ is an unbounded
sequence, and therefore cannot be in $\{x \mid f_i(x) \leq 0, \; i = 1, \ldots, m\}$. So we have a contradiction.
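A one-dimensional instance makes the construction concrete. Take the single constraint function f(x) = x^2 - 1, chosen here only for illustration, whose 0-sublevel set is the bounded interval [-1, 1]. Then t f(x/t) = x^2/t - t, and for t > 0 the condition t f(x/t) <= 0 is the same as |x| <= t, so K is the cone {(x, t) : |x| <= t}. The Python sketch below checks this description and the scaling property on a few points.

```python
def f(x):
    return x * x - 1.0          # sublevel set {x | f(x) <= 0} is [-1, 1]

def in_cone(x, t):
    """Membership in {(x, t) | t f(x/t) <= 0, t > 0} (before taking the closure)."""
    return t > 0 and t * f(x / t) <= 1e-12

points = [(0.3, 1.0), (-2.0, 2.5), (1.5, 1.0), (0.0, 0.1)]

# Membership agrees with the explicit description |x| <= t.
matches_abs_description = all(in_cone(x, t) == (abs(x) <= t) for x, t in points)

# Positive scaling does not change membership, as expected for a cone.
scale_invariant = all(in_cone(a * x, a * t) == in_cone(x, t)
                      for x, t in points for a in (0.5, 3.0))
print(matches_abs_description, scale_invariant)
```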

3.16 Exploring nearly optimal points. An optimization algorithm will find an optimal point for a problem,
provided the problem is feasible. It is often useful to explore the set of nearly optimal points. When
a problem has a strong minimum, the set of nearly optimal points is small; all such points are close
to the original optimal point found. At the other extreme, a problem can have a soft minimum,
which means that there are many points, some quite far from the original optimal point found, that
are feasible and have nearly optimal objective value. In this problem you will use a typical method
to explore the set of nearly optimal points.
We start by finding the optimal value $p^\star$ of the given problem

\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \leq 0, \quad i = 1, \ldots, m \\
& h_i(x) = 0, \quad i = 1, \ldots, p,
\end{array}
\]

as well as an optimal point $x^\star \in \mathbf{R}^n$. We then pick a small positive number $\epsilon$, and a vector $c \in \mathbf{R}^n$,
and solve the problem

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & f_i(x) \leq 0, \quad i = 1, \ldots, m \\
& h_i(x) = 0, \quad i = 1, \ldots, p \\
& f_0(x) \leq p^\star + \epsilon.
\end{array}
\]

Note that any feasible point for this problem is $\epsilon$-suboptimal for the original problem. Solving this
problem multiple times, with different $c$'s, will generate (perhaps different) $\epsilon$-suboptimal points. If
the problem has a strong minimum, these points will all be close to each other; if the problem has
a weak minimum, they can be quite different.
There are different strategies for choosing $c$ in these experiments. The simplest is to choose the
$c$'s randomly; another method is to choose $c$ to have the form $\pm e_i$, for $i = 1, \ldots, n$. (This method
gives the range of each component of $x$, over the $\epsilon$-suboptimal set.)

You will carry out this method for the following problem, to determine whether it has a strong
minimum or a weak minimum. You can generate the vectors $c$ randomly, with enough samples for
you to come to your conclusion. You can pick $\epsilon = 0.01 p^\star$, which means that we are considering the
set of 1% suboptimal points.
The problem is a minimum fuel optimal control problem for a vehicle moving in $\mathbf{R}^2$. The position
at time $kh$ is given by $p(k) \in \mathbf{R}^2$, and the velocity by $v(k) \in \mathbf{R}^2$, for $k = 1, \ldots, K$. Here $h > 0$ is
the sampling period. These are related by the equations

\[
p(k + 1) = p(k) + h v(k), \qquad v(k + 1) = (1 - \alpha) v(k) + (h/m) f(k), \qquad k = 1, \ldots, K - 1,
\]

where $f(k) \in \mathbf{R}^2$ is the force applied to the vehicle at time $kh$, $m > 0$ is the vehicle mass, and
$\alpha \in (0, 1)$ models drag on the vehicle; in the absence of any other force, the vehicle velocity decreases
by the factor $1 - \alpha$ in each discretized time interval. (These formulas are approximations of more
accurate formulas that involve matrix exponentials.)
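The force comes from two thrusters, and from gravity:
\[
f(k) = \begin{bmatrix} \cos \theta_1 \\ \sin \theta_1 \end{bmatrix} u_1(k)
     + \begin{bmatrix} \cos \theta_2 \\ \sin \theta_2 \end{bmatrix} u_2(k)
     + \begin{bmatrix} 0 \\ -mg \end{bmatrix}, \qquad k = 1, \ldots, K - 1.
\]

Here $u_1(k) \in \mathbf{R}$ and $u_2(k) \in \mathbf{R}$ are the (nonnegative) thruster force magnitudes, $\theta_1$ and $\theta_2$ are the
directions of the thrust forces, and $g = 10$ is the constant acceleration due to gravity.
The total fuel use is
\[
F = \sum_{k=1}^{K-1} \left( u_1(k) + u_2(k) \right).
\]

(Recall that $u_1(k) \geq 0$, $u_2(k) \geq 0$.)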
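To make the recursion concrete, here is a minimal Python sketch that propagates the position and velocity for given thrust sequences and returns the fuel use. The function name `simulate` and all numerical values (thruster angles, mass, step size, drag) are illustrative assumptions; they are not the data in thrusters_data.*.

```python
import math

def simulate(u1, u2, theta1, theta2, h, m, alpha, g=10.0):
    """Propagate p(k), v(k) from p(1) = v(1) = 0 for given thrust magnitudes."""
    p = [(0.0, 0.0)]
    v = [(0.0, 0.0)]
    for k in range(len(u1)):
        # Net force: two thrusters plus gravity.
        fx = math.cos(theta1) * u1[k] + math.cos(theta2) * u2[k]
        fy = math.sin(theta1) * u1[k] + math.sin(theta2) * u2[k] - m * g
        p.append((p[k][0] + h * v[k][0], p[k][1] + h * v[k][1]))
        v.append(((1 - alpha) * v[k][0] + (h / m) * fx,
                  (1 - alpha) * v[k][1] + (h / m) * fy))
    fuel = sum(u1) + sum(u2)
    return p, v, fuel

# Hypothetical data: symmetric thrusters that exactly cancel gravity (hovering).
theta1, theta2 = math.pi / 3, 2 * math.pi / 3
h, m, alpha, steps = 0.1, 1.0, 0.1, 5
u_hover = m * 10.0 / (2 * math.sin(theta1))
p_hover, v_hover, fuel_hover = simulate([u_hover] * steps, [u_hover] * steps,
                                        theta1, theta2, h, m, alpha)

# With no thrust the vehicle simply falls under gravity.
p_fall, v_fall, fuel_fall = simulate([0.0] * steps, [0.0] * steps,
                                     theta1, theta2, h, m, alpha)
print(fuel_hover, p_hover[-1], v_fall[1])
```

In the hovering case the vehicle stays at the origin but still consumes fuel at every step, which is why fuel use depends on the whole trajectory and not only on the way-points.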


The problem is to minimize fuel use subject to the initial condition $p(1) = 0$, $v(1) = 0$, and the
way-point constraints
\[
p(k_i) = w_i, \qquad i = 1, \ldots, M.
\]
(These state that at the time $h k_i$, the vehicle must pass through the location $w_i \in \mathbf{R}^2$.) In addition,
we require that the vehicle should remain in a square operating region,

\[
\|p(k)\|_\infty \leq P^{\max}, \qquad k = 1, \ldots, K.
\]

Both parts of this problem concern the specific problem instance with data given in thrusters_data.*.

(a) Find an optimal trajectory, and the associated minimum fuel use $p^\star$. Plot the trajectory $p(k)$
in $\mathbf{R}^2$ (i.e., in the $p_1$, $p_2$ plane). Verify that it passes through the way-points.
(b) Generate several 1% suboptimal trajectories using the general method described above, and
plot the associated trajectories in $\mathbf{R}^2$. Would you say this problem has a strong minimum, or
a weak minimum?

Solution.

(a) The following Matlab script finds the optimal solution.

cvx_quiet(true);
thrusters_data;
F = [ cos(theta1) cos(theta2); ...
      sin(theta1) sin(theta2) ];

% finding optimal solution
cvx_begin
    variables u(2,K-1) p(2,K) v(2,K)
    minimize ( sum(sum(u)) )
    p(:,1) == 0; % initial position
    v(:,1) == 0; % initial velocity
    % way-point constraints
    p(:,k1) == w1;
    p(:,k2) == w2;
    p(:,k3) == w3;
    p(:,k4) == w4;
    for i=1:K-1
        p(:,i+1) == p(:,i) + h*v(:,i);
        v(:,i+1) == (1-alpha)*v(:,i) + h*F*u(:,i)/m + [0; -g*h];
    end
    u >= 0;
    % constraints on positions (x,y)
    p <= pmax;
    p >= -pmax;
cvx_end

display('The optimal fuel use is: ');
optval = cvx_optval
plot(p(1,:),p(2,:));
hold on
ps = [zeros(2,1) w1 w2 w3 w4];
plot(ps(1,:),ps(2,:),'*');
xlabel('x'); ylabel('y'); title('optimal');
axis([-6 6 -6 6]);
This Matlab script generates the following optimal trajectory.

[Figure: optimal trajectory in the (x, y) plane, titled "optimal"; both axes range from -6 to 6.]

The optimal fuel use is found to be 1055.3.


(b) The following script finds 1% suboptimal solutions.

% finding nearly optimal solutions
cvx_begin
    variables u(2,K-1) p(2,K) v(2,K)
    % random linear objective over all variables
    minimize ( sum ( sum ( randn(2,K-1).*u ) ) + ...
               sum ( sum ( randn(2,K).*p ) ) + ...
               sum ( sum ( randn(2,K).*v ) ) )
    p(:,1) == 0; % initial position
    v(:,1) == 0; % initial velocity
    % way-point constraints
    p(:,k1) == w1;
    p(:,k2) == w2;
    p(:,k3) == w3;
    p(:,k4) == w4;
    for i=1:K-1
        p(:,i+1) == p(:,i) + h*v(:,i);
        % same dynamics as in part (a), including the factor h/m
        v(:,i+1) == (1-alpha)*v(:,i) + h*F*u(:,i)/m + [0; -g*h];
    end
    u >= 0;
    sum(sum(u)) <= 1.01*optval;
    % constraints on positions (x,y)
    p <= pmax;
    p >= -pmax;
cvx_end

figure;
plot(p(1,:),p(2,:));
hold on
ps = [zeros(2,1) w1 w2 w3 w4];
plot(ps(1,:),ps(2,:),'*');
xlabel('x'); ylabel('y'); title('suboptimal');
axis([-6 6 -6 6]);
This script returns 4 randomly-generated nearly optimal trajectories.

[Figures: four trajectory plots in the (x, y) plane, each titled "suboptimal"; both axes range from -6 to 6.]

We see that these nearly optimal trajectories are very, very different. So in this problem there
is a weak minimum, i.e., a very large 1%-suboptimal set.

3.17 Minimum fuel optimal control. Solve the minimum fuel optimal control problem described in
exercise 4.16 of Convex Optimization, for the instance with problem data

\[
A = \begin{bmatrix} -1 & 0.4 & 0.8 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad
b = \begin{bmatrix} 1 \\ 0 \\ 0.3 \end{bmatrix}, \qquad
x_{\rm des} = \begin{bmatrix} 7 \\ 2 \\ -6 \end{bmatrix}, \qquad
N = 30.
\]

You can do this by forming the LP you found in your solution of exercise 4.16, or more directly
using CVX. Plot the actuator signal $u(t)$ as a function of time $t$.
Solution. The following Matlab code finds the solution.

close all
clear all

n=3;  % state dimension
N=30; % time horizon

A=[ -1 0.4 0.8; 1 0 0 ; 0 1 0];
b=[ 1 0 0.3]';
x0 = zeros(n,1);
xdes = [ 7 2 -6]';

cvx_begin
    variable X(n,N+1);
    variable u(1,N);
    minimize (sum(max(abs(u),2*abs(u)-1)))
    subject to
        X(:,2:N+1) == A*X(:,1:N)+b*u; % dynamics
        X(:,1) == x0;
        X(:,N+1) == xdes;
cvx_end

stairs(0:N-1,u,'linewidth',2)
axis tight
xlabel('t')
ylabel('u')

[Figure: stairs plot of the actuator signal u versus time t.]

Figure 1: Minimum fuel actuator signal.

The optimal actuator signal is shown in figure 1.
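The CVX objective above relies on the fact that the fuel-use function of exercise 4.16, f(a) = |a| for |a| <= 1 and f(a) = 2|a| - 1 for |a| > 1, equals the pointwise maximum max(|a|, 2|a| - 1). A short Python check of this equivalence on a grid of values follows; the function names are chosen here for illustration.

```python
def fuel(a):
    """Fuel use f(a) = |a| for |a| <= 1 and 2|a| - 1 for |a| > 1."""
    return abs(a) if abs(a) <= 1 else 2 * abs(a) - 1

def fuel_max_form(a):
    """The same function written as a maximum of two convex functions."""
    return max(abs(a), 2 * abs(a) - 1)

grid = [k / 10 for k in range(-30, 31)]          # points in [-3, 3]
max_gap = max(abs(fuel(a) - fuel_max_form(a)) for a in grid)
print(max_gap)
```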


3.18 Heuristic suboptimal solution for Boolean LP. This exercise builds on exercises 4.15 and 5.13 in
Convex Optimization, which involve the Boolean LP
\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b \\
& x_i \in \{0, 1\}, \quad i = 1, \ldots, n,
\end{array}
\]

with optimal value $p^\star$. Let $x^{\rm rlx}$ be a solution of the LP relaxation

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b \\
& 0 \preceq x \preceq \mathbf{1},
\end{array}
\]

so $L = c^T x^{\rm rlx}$ is a lower bound on $p^\star$. The relaxed solution $x^{\rm rlx}$ can also be used to guess a Boolean
point $\hat x$, by rounding its entries, based on a threshold $t \in [0, 1]$:
\[
\hat x_i = \left\{ \begin{array}{ll} 1 & x_i^{\rm rlx} \geq t \\ 0 & \mbox{otherwise,} \end{array} \right.
\]
for $i = 1, \ldots, n$. Evidently $\hat x$ is Boolean (i.e., has entries in $\{0, 1\}$). If it is feasible for the Boolean
LP, i.e., if $A \hat x \preceq b$, then it can be considered a guess at a good, if not optimal, point for the Boolean
LP. Its objective value, $U = c^T \hat x$, is an upper bound on $p^\star$. If $U$ and $L$ are close, then $\hat x$ is nearly
optimal; specifically, $\hat x$ cannot be more than $(U - L)$-suboptimal for the Boolean LP.
This rounding need not work; indeed, it can happen that for all threshold values, $\hat x$ is infeasible.
But for some problem instances, it can work well.
Of course, there are many variations on this simple scheme for (possibly) constructing a feasible,
good point from $x^{\rm rlx}$.
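The threshold rounding scheme can be sketched in a few lines of plain Python, independent of any solver. The toy data below (`A`, `b`, `c`) and the vector `x_rlx`, which simply stands in for a relaxed solution, are hypothetical values chosen for illustration; they are not the instance generated in this exercise.

```python
def round_threshold(x_rlx, t):
    """Boolean point obtained by thresholding the relaxed solution at t."""
    return [1 if xi >= t else 0 for xi in x_rlx]

def max_violation(A, b, x):
    """max_i (A x - b)_i; the point is feasible when this is <= 0."""
    return max(sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b))

# Hypothetical toy data, not the instance from the exercise.
A = [[1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
b = [1.5, 2.0]
c = [-3.0, -2.0, -1.0]
x_rlx = [0.9, 0.5, 0.1]      # stand-in for the output of an LP solver

best = None                  # (objective, threshold, rounded point)
for k in range(101):
    t = k / 100
    xhat = round_threshold(x_rlx, t)
    if max_violation(A, b, xhat) <= 0:
        obj = sum(ci * xi for ci, xi in zip(c, xhat))
        if best is None or obj < best[0]:
            best = (obj, t, xhat)
print(best)
```

Small thresholds select too many jobs and violate the resource limits; the best feasible rounding is found at the smallest threshold for which the rounded point becomes feasible.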
Finally, we get to the problem. Generate problem data using one of the following.
Matlab code:

rand('state',0);
n=100;
m=300;
A=rand(m,n);
b=A*ones(n,1)/2;
c=-rand(n,1);

Python code:

import numpy as np
np.random.seed(0)
(m, n) = (300, 100)
A = np.random.rand(m, n); A = np.asmatrix(A)
b = A.dot(np.ones((n, 1)))/2; b = np.asmatrix(b)
c = -np.random.rand(n, 1); c = np.asmatrix(c)

Julia code:

srand(0);
n=100;
m=300;
A=rand(m,n);
b=A*ones(n,1)/2;
c=-rand(n,1);

You can think of $x_i$ as a job we either accept or decline, and $-c_i$ as the (positive) revenue we
generate if we accept job $i$. We can think of $Ax \preceq b$ as a set of limits on $m$ resources. $A_{ij}$, which
is positive, is the amount of resource $i$ consumed if we accept job $j$; $b_i$, which is positive, is the
amount of resource $i$ available.
Find a solution of the relaxed LP and examine its entries. Note the associated lower bound $L$.
Carry out threshold rounding for (say) 100 values of $t$, uniformly spaced over $[0, 1]$. For each value
of $t$, note the objective value $c^T \hat x$ and the maximum constraint violation $\max_i (A \hat x - b)_i$. Plot the
objective value and the maximum violation versus $t$. Be sure to indicate on the plot the values of
$t$ for which $\hat x$ is feasible, and those for which it is not.
Find a value of $t$ for which $\hat x$ is feasible, and gives minimum objective value, and note the associated
upper bound $U$. Give the gap $U - L$ between the upper bound on $p^\star$ and the lower bound on $p^\star$.
In Matlab, if you define vectors obj and maxviol, you can find the upper bound as U=min(obj(find(maxviol<=0))).
Matlab Solution
The following Matlab code finds the solution

% generate data for boolean LP relaxation & heuristic
rand('state',0);
n=100;
m=300;
A=rand(m,n);
b=A*ones(n,1)/2;
c=-rand(n,1);

% solve LP relaxation
cvx_begin
    variable x(n)
    minimize (c'*x)
    subject to
        A*x <= b
        x>=0
        x<=1
cvx_end
xrlx = x;
L=cvx_optval;

% sweep over threshold & round
thres=0:0.01:1;
maxviol = zeros(length(thres),1);
obj = zeros(length(thres),1);
for i=1:length(thres)
    xhat = (xrlx>=thres(i));
    maxviol(i) = max(A*xhat-b);
    obj(i) = c'*xhat;
end

% find least upper bound and associated threshold
i_feas=find(maxviol<=0);
U=min(obj(i_feas))
%U=min(obj(find(maxviol <=0)))
t=min(i_feas);
min_thresh=thres(t)

% plot objective and max violation versus threshold
subplot(2,1,1)
plot(thres(1:t-1),maxviol(1:t-1),'r',thres(t:end),maxviol(t:end),'b','linewidth',2);
xlabel('threshold');
ylabel('max violation');
subplot(2,1,2)
hold on; plot(thres,L*ones(size(thres)),'k','linewidth',2);
plot(thres(1:t-1),obj(1:t-1),'r',thres(t:end),obj(t:end),'b','linewidth',2);
xlabel('threshold');
ylabel('objective');

The lower bound found from the relaxed LP is $L = -33.1672$. We find that the threshold value
$t = 0.6006$ gives the best (smallest) objective value for feasible $\hat x$: $U = -32.4450$. The difference
is 0.7222. So $\hat x$, with $t = 0.6006$, can be no more than 0.7222 suboptimal, i.e., around 2.2%
suboptimal.
In figure 2, the red lines indicate thresholding values which give infeasible $\hat x$, and the blue
lines correspond to feasible $\hat x$. We see that the maximum violation decreases as the threshold is
increased. This occurs because the constraint matrix $A$ only has nonnegative entries. At a threshold
of 0, all jobs are selected, which is an infeasible solution. As we increase the threshold, jobs are
removed in sequence (without adding new jobs), which monotonically decreases the maximum
violation. For a general boolean LP, the corresponding plots need not exhibit monotonic behavior.
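The monotonicity argument above can be checked on a small synthetic instance. The following is a minimal sketch in Python with NumPy, not part of the original solution: the problem sizes are hypothetical, and `x_rlx` is a random stand-in for a relaxed LP solution rather than the output of a solver.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 10                      # hypothetical sizes, smaller than in the exercise
A = rng.random((m, n))             # nonnegative entries, as in the exercise data
b = A @ np.ones(n) / 2
x_rlx = rng.random(n)              # stand-in for a relaxed solution with entries in [0, 1)

thresholds = np.linspace(0, 1, 101)
# maximum constraint violation of the rounded point for each threshold
maxviol = np.array([np.max(A @ (x_rlx >= t).astype(float) - b) for t in thresholds])

# raising the threshold only removes jobs, so the violation cannot increase
is_monotone = bool(np.all(np.diff(maxviol) <= 1e-12))
```

With a matrix `A` that has entries of both signs, the same sweep would in general not be monotone.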
Figure 2: Plots of violation and objective vs threshold rule, for the Matlab code. (Top panel: max
violation versus threshold; bottom panel: objective versus threshold.)

Python Solution
The following Python code finds the solution

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
(m, n) = (300, 100)
A = np.random.rand(m, n); A = np.asmatrix(A)
b = A.dot(np.ones((n, 1)))/2; b = np.asmatrix(b)
c = -np.random.rand(n, 1); c = np.asmatrix(c)

#Solve relaxed LP
x = cvx.Variable(n)
objective = cvx.Minimize(c.T*x)
constraints = [0<=x, x<=1, A*x<=b]
cvx.Problem(objective, constraints).solve()
L = objective.value
x_rlx = x.value

#Rounding parameters
N = 100
t = np.linspace(0, 1, num=N).reshape(N, 1)
maxviol = np.zeros((N, 1))
obj = np.zeros((N, 1))
U = float('inf')
t_best = float('nan')

#Round
for i in range(N):
    x = np.matrix(x_rlx >= t[i], dtype = float)
    obj[i] = c.T*x
    maxviol[i] = max(A*x-b)
    if maxviol[i]<=0 and obj[i]<U:
        U = float(obj[i])
        x_best = x
        t_best = t[i]

#Plot
plt.figure(1)
plt.subplot(211)
plt.plot(t[maxviol<=0], maxviol[maxviol<=0], 'b')
plt.plot(t[maxviol>0], maxviol[maxviol>0], 'r')
plt.ylabel('max violation')
plt.xlabel('threshold')

plt.subplot(212)
plt.plot(t[maxviol<=0], obj[maxviol<=0], 'b')
plt.plot(t[maxviol>0], obj[maxviol>0], 'r')
plt.plot(t,objective.value*np.ones((N,1)), 'g')
plt.ylabel('objective')
plt.xlabel('threshold')
plt.savefig('figures/boolean_lp_heur_py.eps')
plt.show()

The lower bound found from the relaxed LP is $L = -34.42$. We find that the threshold value
$t = 0.56$ gives the best (smallest) objective value for feasible $\hat x$: $U = -33.58$. The difference is 0.84.
So $\hat x$, with $t = 0.56$, can be no more than 0.84 suboptimal, i.e., around 2.5% suboptimal.
In figure 3, the red lines indicate thresholding values which give infeasible $\hat x$, and the blue
lines correspond to feasible $\hat x$. We see that the maximum violation decreases as the threshold is
increased. This occurs because the constraint matrix $A$ only has nonnegative entries. At a threshold
of 0, all jobs are selected, which is an infeasible solution. As we increase the threshold, jobs are
removed in sequence (without adding new jobs), which monotonically decreases the maximum
violation. For a general boolean LP, the corresponding plots need not exhibit monotonic behavior.

Figure 3: Plots of violation and objective vs threshold rule, for the Python code. (Top panel: max
violation versus threshold; bottom panel: objective versus threshold.)

Julia Solution
The following Julia code finds the solution

using Convex, Gadfly

# generate data for boolean LP relaxation & heuristic


srand(0);
n=100;
m=300;

A=rand(m,n);
b=A*ones(n,1)/2;
c=-rand(n,1);

# solve LP relaxation
x = Variable(n);
p = minimize(c'*x, A*x <= b, x >= 0, x<= 1);
solve!(p);
xrlx = x.value;
L = p.optval;

# sweep over threshold & round


thres=0:0.01:1;
maxviol = zeros(length(thres),1);
obj = zeros(length(thres),1);
for i = 1:length(thres)
    xhat = xrlx .>= thres[i];
    maxviol[i] = maximum(A*xhat-b);
    obj[i] = (c'*xhat)[1];
end

# find least upper bound and associated threshold


i_feas = find(maxviol.<=0);
U = minimum(obj[i_feas])
t = minimum(i_feas);
min_thresh = thres[t];

# plot objective and max violation versus threshold


p1 = plot(
layer(
x = thres[1:t-1],
y = maxviol[1:t-1],
Geom.line,
Theme(default_color = color("red"))
),
layer(
x = thres[t:end],
y = maxviol[t:end],
Geom.line,
Theme(default_color = color("blue"))
),
Guide.xlabel("threshold"),
Guide.ylabel("max violation")
);
draw(PS("boolean_lp_heur_jl_1.eps", 6inch, 2inch), p1)

p2 = plot(
layer(
x = thres[1:t-1],
y = obj[1:t-1],
Geom.line,
Theme(default_color = color("red"))
),
layer(
x = thres[t:end],
y = obj[t:end],
Geom.line,
Theme(default_color = color("blue"))
),
layer(
x = thres,
y = L*ones(length(thres)),
Geom.line,
Theme(default_color = color("black"))
),
Guide.xlabel("threshold"),
Guide.ylabel("objective")
);
draw(PS("boolean_lp_heur_jl_2.eps", 6inch, 2inch), p2)

The lower bound found from the relaxed LP is $L = -33.828$. We find that the threshold value
$t = 0.54$ gives the best (smallest) objective value for feasible $\hat x$: $U = -32.601$. The difference is
1.23. So $\hat x$, with $t = 0.54$, can be no more than 1.23 suboptimal, i.e., around 3.6% suboptimal.
In figure 4, the red lines indicate thresholding values which give infeasible $\hat x$, and the blue
lines correspond to feasible $\hat x$. We see that the maximum violation decreases as the threshold is
increased. This occurs because the constraint matrix $A$ only has nonnegative entries. At a threshold
of 0, all jobs are selected, which is an infeasible solution. As we increase the threshold, jobs are
removed in sequence (without adding new jobs), which monotonically decreases the maximum
violation. For a general boolean LP, the corresponding plots need not exhibit monotonic behavior.

Figure 4: Plots of violation and objective vs threshold rule, for the Julia code.

3.19 Optimal operation of a hybrid vehicle. Solve the instance of the hybrid vehicle operation problem
described in exercise 4.65 in Convex Optimization, with problem data given in the file hybrid_veh_data.*,
and fuel use function $F(p) = p + \gamma p^2$ (for $p \geq 0$).
Hint. You will actually formulate and solve a relaxation of the original problem. You may find that
some of the equality constraints you relaxed to inequality constraints do not hold for the solution
found. This is not an error: it just means that there is no incentive (in terms of the objective) for
the inequality to be tight. You can fix this in (at least) two ways. One is to go back and adjust
certain variables, without affecting the objective and maintaining feasibility, so that the relaxed
constraints hold with equality. Another simple method is to add to the objective a term of the
form
$$\epsilon \sum_{t=1}^{T} \max\{0, -P_{\mathrm{mg}}(t)\},$$
where $\epsilon$ is small and positive. This makes it more attractive to use the brakes to extract power
from the wheels, even when the battery is (or will be) full (which removes any fuel incentive).
Find the optimal fuel consumption, and compare to the fuel consumption with a non-hybrid version
of the same vehicle (i.e., one without a battery). Plot the braking power, engine power,
motor/generator power, and battery energy versus time.
How would you use optimal dual variables for this problem to find $\partial F_{\mathrm{total}}/\partial E_{\mathrm{batt}}^{\max}$, i.e., the partial
derivative of optimal fuel consumption with respect to battery capacity? (You can just assume
that this partial derivative exists.) You do not have to give a long derivation or proof; you can
just state how you would find this derivative from optimal dual variables for the problem. Verify
your method numerically, by changing the battery capacity a small amount and re-running the
optimization, and comparing this to the prediction made using dual variables.
Solution.
The code for solving this instance is given below. The optimal fuel consumption of the vehicle
with the battery is Ftotal = 5077.53. Without the battery the fuel consumption goes up to Ftotal =
5896.81. The battery can shift energy in time and recover energy that would have been lost in
friction braking, so we expect the fuel use to be smaller than without a battery.
Closely examining the optimal variables reveals that there is a short period when the relaxed
battery energy conservation constraint does not hold: we actually elect to throw battery energy
away. This is a brief period before $t \approx 200$, when the battery is already full, so no further charging
will make any difference at all. This is not a problem: the optimal fuel consumption is correct. The
constraint can be made tight by following the method in the hint, without affecting fuel optimality.
To find $\partial F_{\mathrm{total}}/\partial E_{\mathrm{batt}}^{\max}$, we let $\lambda^\star(t)$ be an optimal dual variable associated with the constraint
$E(t) \leq E_{\mathrm{batt}}^{\max}$. We interpret $-\lambda^\star(t)$ as the partial derivative of optimal fuel use with respect to
$E_{\mathrm{batt}}^{\max}$ at time $t$. It tells us how the fuel usage would improve if, at time $t$ only, we were able to
store more energy in the battery. To get the full partial derivative $\partial F_{\mathrm{total}}/\partial E_{\mathrm{batt}}^{\max}$, we sum:
$$\frac{\partial F_{\mathrm{total}}}{\partial E_{\mathrm{batt}}^{\max}} = -\sum_{t=1}^{T} \lambda^\star(t).$$
This formula gives us $\partial F_{\mathrm{total}}/\partial E_{\mathrm{batt}}^{\max} = -4.9680$. We can verify this numerically by perturbing
the constraint by a small amount and running the optimization again. Setting $E(t) \leq E_{\mathrm{batt}}^{\max} + \delta E$,
$\delta E = 0.1$, we obtain $\partial F_{\mathrm{total}}/\partial E_{\mathrm{batt}}^{\max} \approx \delta F/\delta E = -4.9658$, where $\delta F$ is the difference in fuel
consumption between the perturbed and unperturbed cases. The numerical answer is indeed very
close to the answer obtained from the dual variables.
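The same check (dual prediction versus finite difference) can be illustrated on a toy problem that has a closed-form solution. This is only a sketch in Python: the problem, minimize $(x-2)^2$ subject to $x \leq u$, and the helper `solve_toy` are hypothetical and unrelated to the vehicle data.

```python
def solve_toy(u):
    """Minimize (x - 2)^2 subject to x <= u, solved in closed form.

    Returns the optimal value and the optimal dual variable of x <= u.
    """
    x_opt = min(2.0, u)
    lam = max(0.0, 2.0 * (2.0 - u))   # from stationarity: 2(x - 2) + lam = 0
    return (x_opt - 2.0) ** 2, lam

u0, delta = 1.0, 1e-4
p0, lam0 = solve_toy(u0)
p1, _ = solve_toy(u0 + delta)

dual_prediction = -lam0                  # sensitivity of the optimal value to u
finite_difference = (p1 - p0) / delta    # numerical estimate of the same derivative
```

As in the solution above, the derivative of the optimal value with respect to the right-hand side of an inequality is the negative of its optimal dual variable, and the finite difference agrees with it up to a term of order `delta`.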

[Figure: four stacked plots versus time (0 to 400): brake power (Pbr), engine power (Peng),
motor/generator power (Pmg), and battery stored energy (Ebatt).]

% instance of hybrid vehicle optimization problem,


% exercise 4.65 in Boyd & Vandenberghe, Convex Optimization

hybrid_veh_data

% solution via CVX

epsilon=1e-5;

cvx_begin
cvx_quiet(true)
variables Peng(T) Pbr(T) Pmg(T) E(T+1)
dual variable lambda
% minimize (sum(Peng+gamma*square(Peng))) % total fuel use
minimize (sum(Peng+gamma*square(Peng))+epsilon*sum(pos(-Pmg)))
Preq == Peng + Pmg - Pbr; % power balance
E(1) == E(T+1); % starting and ending battery energy match
Pmg_min <= Pmg; % maximum generator power
Pmg <= Pmg_max; % maximum motor power
0 <= E;
% assign the dual variable to this set of constraints
% for use in sensitivity analysis
lambda: E <= Ebatt_max; % battery capacity
0 <= Peng;
Peng <= Peng_max; % max engine power
Pbr >= 0;
for t=1:T
E(t+1)<=E(t)-Pmg(t)-eta*abs(Pmg(t));
% this is a relaxation; true formula is below
% E(t+1)==E(t)-Pmg(t)-eta*abs(Pmg(t));
% extra term above is used to guarantee tightness at optimal
end
cvx_end
Ftot = cvx_optval

% verify that the constraint holds with equality


DE=E(2:T+1)-E(1:T)+Pmg(1:T)+eta*abs(Pmg(1:T));
if max(abs(DE))<1e-3
    fprintf('battery energy constraint holds with equality \n')
end

% solve the problem for a vehicle without a battery


% can do this analytically, but no harm in using CVX
cvx_begin
cvx_quiet(true)
variables Peng_nobatt(T) Pbr_nobatt(T)
minimize (sum(Peng_nobatt+gamma*square(Peng_nobatt))) % total fuel use
Preq == Peng_nobatt - Pbr_nobatt;
0 <= Peng_nobatt;
Peng_nobatt <= Peng_max;
Pbr_nobatt >= 0;
cvx_end
Ftot_no_batt = cvx_optval

% generate the plots
scrsz = get(0,'ScreenSize');
figure('Position',[1 1 scrsz(3) scrsz(4)])
subplot(4,1,1);
plot(Pbr); hold on; plot(Pbr_nobatt,'r')
title('Brake power'); xlabel('time'); ylabel('Pbr');
%legend('With battery','Without battery')
subplot(4,1,2);
plot(Peng); hold on; plot(Peng_nobatt, 'r')
title('Engine power'); xlabel('time'); ylabel('Peng');
%legend('With battery','Without battery')
subplot(4,1,3);
plot(Pmg)
title('Motor/generator power'); xlabel('time'); ylabel('Pmg');
subplot(4,1,4);
plot(E)
title('Battery stored energy'); xlabel('time'); ylabel('Ebatt');

% numerical verification of sensitivity analysis


% perturb Ebatt_max by deltaE and examine the change in F
deltaE=0.1;
cvx_begin
cvx_quiet(true)
variables Peng(T) Pbr(T) Pmg(T) E(T+1)
% minimize (sum(Peng+gamma*square(Peng))) % total fuel use
minimize (sum(Peng+gamma*square(Peng))+epsilon*sum(pos(-Pmg)))
Preq == Peng + Pmg - Pbr;
E(1) == E(T+1);
Pmg_min <= Pmg;
Pmg <= Pmg_max;
0 <= E;
E <= Ebatt_max+deltaE;
0 <= Peng;
Peng <= Peng_max;
Pbr >= 0;
for t=1:T
E(t+1)<=E(t)-Pmg(t)-eta*abs(Pmg(t));
end
cvx_end
Ftot_deltaE=cvx_optval;
% calculate sensitivity numerically
dFdE_num=(Ftot_deltaE-Ftot)/deltaE
% take the negative of the sum of the dual variables
% to obtain dFtot/dEbatt_max
dFdE_tot=-sum(lambda)

3.20 Optimal vehicle speed scheduling. A vehicle (say, an airplane) travels along a fixed path of $n$
segments, between $n + 1$ waypoints labeled $0, \ldots, n$. Segment $i$ starts at waypoint $i - 1$ and
terminates at waypoint $i$. The vehicle starts at time $t = 0$ at waypoint 0. It travels over each
segment at a constant (nonnegative) speed; $s_i$ is the speed on segment $i$. We have lower and upper
limits on the speeds: $s^{\min} \preceq s \preceq s^{\max}$. The vehicle does not stop at the waypoints; it simply
proceeds to the next segment. The travel distance of segment $i$ is $d_i$ (which is positive), so the
travel time over segment $i$ is $d_i/s_i$. We let $\tau_i$, $i = 1, \ldots, n$, denote the time at which the vehicle
arrives at waypoint $i$. The vehicle is required to arrive at waypoint $i$, for $i = 1, \ldots, n$, between
times $\tau_i^{\min}$ and $\tau_i^{\max}$, which are given. The vehicle consumes fuel over segment $i$ at a rate that
depends on its speed, $\Phi(s_i)$, where $\Phi$ is positive, increasing, and convex, and has units of kg/s.
You are given the data $d$ (segment travel distances), $s^{\min}$ and $s^{\max}$ (speed bounds), $\tau^{\min}$ and $\tau^{\max}$
(waypoint arrival time bounds), and the fuel use function $\Phi : \mathbf{R} \to \mathbf{R}$. You are to choose the speeds
$s_1, \ldots, s_n$ so as to minimize the total fuel consumed in kg.

(a) Show how to pose this as a convex optimization problem. If you introduce new variables, or
change variables, you must explain how to recover the optimal speeds from the solution of
your problem. If convexity of the objective or any constraint function in your formulation is
not obvious, explain why it is convex.
(b) Carry out the method of part (a) on the problem instance with data in
veh_speed_sched_data.m. Use the fuel use function $\Phi(s_i) = a s_i^2 + b s_i + c$ (the parameters
$a$, $b$, and $c$ are defined in the data file). What is the optimal fuel consumption? Plot the
optimal speed versus segment, using the Matlab command stairs, or the function step from
matplotlib (in Python and Julia), to better show constant speed over the segments.

Solution.

(a) The fuel consumed over the $i$th segment is $(d_i/s_i)\Phi(s_i)$, so the total fuel used is
$\sum_{i=1}^n (d_i/s_i)\Phi(s_i)$. The vehicle arrives at waypoint $i$ at time $\tau_i = \sum_{j=1}^i (d_j/s_j)$.
Thus our problem is
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^n (d_i/s_i)\Phi(s_i) \\
\mbox{subject to} & s_i^{\min} \leq s_i \leq s_i^{\max}, \quad i = 1, \ldots, n \\
& \tau_i^{\min} \leq \sum_{j=1}^i (d_j/s_j) \leq \tau_i^{\max}, \quad i = 1, \ldots, n,
\end{array}
\]
with variables $s_1, \ldots, s_n$.
In this form, this is not a convex problem: the objective function need not be convex in $s_i$,
and the inequalities $\tau_i^{\min} \leq \sum_{j=1}^i (d_j/s_j)$ are not convex.
However, we can formulate this as a convex problem by making a change of variables. We
formulate the problem using the transit times of the segments, $t_i$, as the optimization variable,
where $t_i = d_i/s_i$. (We then have $s_i = d_i/t_i$.) Our problem can be written as
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^n t_i \Phi(d_i/t_i) \\
\mbox{subject to} & d_i/s_i^{\max} \leq t_i \leq d_i/s_i^{\min}, \quad i = 1, \ldots, n \\
& \tau_i^{\min} \leq \sum_{j=1}^i t_j \leq \tau_i^{\max}, \quad i = 1, \ldots, n,
\end{array}
\]
with variables $t_1, \ldots, t_n$.
This is a convex problem. The function $t_i \Phi(d_i/t_i)$, the perspective of $\Phi$, is convex jointly in
$d_i$ and $t_i$; in particular, it is convex in $t_i$. Therefore the objective function is convex, since it
is a sum of convex functions. The constraints are all linear in $t$. Once we
have solved the problem for $t_i^\star$ we recover the optimal speeds using $s_i^\star = d_i/t_i^\star$.
(b) The optimal fuel consumption is 2617.83 kg. The code below solves the problem in Matlab:
% solution to vehicle speed scheduling problem
veh_speed_sched_data

cvx_begin
variable t(n)
minimize(sum(a*d.^2.*inv_pos(t)+b*d+c*t))
t<=d./smin;
t>=d./smax;
tau_min<=cumsum(t);
tau_max>=cumsum(t);
cvx_end
s=d./t;

stairs(s)
title('speed over segment');xlabel('i');ylabel('s_i')
%print('-depsc','veh_speed.eps')
Here is a solution in Python.
# solution to vehicle speed scheduling problem

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
from veh_speed_sched_data import *

t = cvx.Variable(n)
obj = cvx.sum_entries(a*cvx.mul_elemwise(cvx.square(d), cvx.inv_pos(t)) + b*d + c*t)
cons = [t <= d / smin, t >= d / smax]
cons += [tau_min[i] <= cvx.sum_entries(t[0:i+1]) for i in range(n)]
cons += [tau_max[i] >= cvx.sum_entries(t[0:i+1]) for i in range(n)]

cvx.Problem(cvx.Minimize(obj), cons).solve()

s = d / t.value

plt.step(np.arange(n), s)
plt.title('speed over segment')
plt.xlabel('$i$')
plt.ylabel('$s_i$')
plt.savefig("veh_speed_sched.eps")
plt.show()
Finally, here is a Julia solution.

# solution to vehicle speed scheduling problem
include("veh_speed_sched_data.jl")

cumsum_mat = zeros(n, n);


for i = 1:n
for j = 1:i
cumsum_mat[i,j] = 1
end
end

using Convex, SCS


t = Variable(n);
constraints = [
t <= d./smin;
t >= d./smax;
tau_min <= cumsum_mat*t;
tau_max >= cumsum_mat*t;
];
p = minimize(sum(a*(d.^2).*inv_pos(t) + b*d + c*t), constraints);
solve!(p, SCSSolver(max_iters=15000));
s = vec(d./t.value);

using PyPlot
step(1:length(s), s);
xlabel("i");
ylabel("s_i");
title("speed over segment");

[Figure: optimal speed $s_i$ versus segment index $i$, titled "speed over segment".]
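The change of variables in part (a) of exercise 3.20 turns the per-segment fuel into $t\,\Phi(d/t)$, which for the quadratic fuel function of part (b) expands to $a d^2/t + b d + c t$, the expression used in the listings above. The sketch below, in Python with NumPy, checks this identity and midpoint convexity in $t$ on a grid. The coefficient values and the segment length are hypothetical and are not taken from the data file.

```python
import numpy as np

a, b, c = 1.0, 6.0, 10.0       # hypothetical fuel coefficients (not from the data file)
d = 2.0                        # hypothetical segment length

def phi(s):
    # fuel use rate as a function of speed
    return a * s**2 + b * s + c

def fuel(t):
    # fuel used on one segment, written in terms of the transit time t
    return t * phi(d / t)

t = np.linspace(0.5, 5.0, 50)
expanded = a * d**2 / t + b * d + c * t
matches = bool(np.allclose(fuel(t), expanded))

# midpoint convexity on every pair of grid points
T1, T2 = np.meshgrid(t, t)
midpoint_ok = bool(np.all(fuel((T1 + T2) / 2) <= (fuel(T1) + fuel(T2)) / 2 + 1e-12))
```

A grid check like this is only evidence, not a proof; the convexity itself follows from the perspective argument in part (a).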

3.21 Norm approximation via SOCP, for $\ell_p$-norms with rational $p$.

(a) Use the observation at the beginning of exercise 4.26 in Convex Optimization to express the
constraint
$$y \leq \sqrt{z_1 z_2}, \quad y, z_1, z_2 \geq 0,$$
with variables $y$, $z_1$, $z_2$, as a second-order cone constraint. Then extend your result to the
constraint
$$y \leq (z_1 z_2 \cdots z_n)^{1/n}, \quad y \geq 0, \quad z \succeq 0,$$
where $n$ is a positive integer, and the variables are $y \in \mathbf{R}$ and $z \in \mathbf{R}^n$. First assume that $n$ is
a power of two, and then generalize your formulation to arbitrary positive integers.
(b) Express the constraint
$$f(x) \leq t$$
as a second-order cone constraint, for the following two convex functions $f$:
$$f(x) = \begin{cases} x^\alpha & x \geq 0 \\ 0 & x < 0, \end{cases}$$
where $\alpha$ is rational and greater than or equal to one, and
$$f(x) = x^\alpha, \quad \mathbf{dom}\, f = \mathbf{R}_{++},$$
where $\alpha$ is rational and negative.
(c) Formulate the norm approximation problem
$$\mbox{minimize} \quad \|Ax - b\|_p$$
as a second-order cone program, where $p$ is a rational number greater than or equal to one.
The variable in the optimization problem is $x \in \mathbf{R}^n$. The matrix $A \in \mathbf{R}^{m \times n}$ and the vector
$b \in \mathbf{R}^m$ are given. For an $m$-vector $y$, the norm $\|y\|_p$ is defined as
$$\|y\|_p = \left( \sum_{k=1}^m |y_k|^p \right)^{1/p}$$
when $p \geq 1$.
Solution.
(a) The constraints are equivalent to
$$\left\| \left[ \begin{array}{c} 2t \\ z_1 - z_2 \end{array} \right] \right\|_2 \leq z_1 + z_2, \quad t \geq 0.$$
Expanding the norm inequality gives $t^2 \leq z_1 z_2$ and $z_1 + z_2 \geq 0$. It is therefore equivalent to
$|t| \leq \sqrt{z_1 z_2}$ and $z_1, z_2 \geq 0$. If we also constrain $t$ to be nonnegative, we obtain the constraint
in the problem statement (with $t$ in the role of $y$).
For the second constraint, first suppose $n = 2^l$. Introduce variables $y_{ij}$ for $i = 1, \ldots, l - 1$,
and $j = 1, \ldots, 2^i$, and write the constraint
$$y \leq (z_1 z_2 \cdots z_n)^{1/n}, \quad y \geq 0, \quad z \succeq 0,$$
as the following set of inequalities:
\[
\begin{array}{l}
z \succeq 0, \\
y_{l-1,j} \leq (z_{2j-1} z_{2j})^{1/2}, \quad y_{l-1,j} \geq 0, \quad j = 1, \ldots, 2^{l-1} \\
y_{ij} \leq (y_{i+1,2j-1}\, y_{i+1,2j})^{1/2}, \quad y_{ij} \geq 0, \quad i = 1, \ldots, l - 2, \quad j = 1, \ldots, 2^i \\
y \leq (y_{11} y_{12})^{1/2}, \quad y \geq 0.
\end{array}
\]
Then apply the second-order cone formulation as in part (a).
For general $n$ we write the constraint as
$$y \leq (y^{m-n} z_1 \cdots z_n)^{1/m}, \quad y \geq 0, \quad z \succeq 0,$$
where $m$ is the smallest power of two that is greater than or equal to $n$, and then use the
previous formulation.
(b) For the first function, write the constraint as
$$y \leq (t^s)^{1/r}, \quad y \geq x, \quad y \geq 0, \quad t \geq 0,$$
where $\alpha = r/s$ and $r$ and $s$ are positive integers, and then apply part (a) with $z_k = t$ for $k = 1, \ldots, s$
and $z_k = 1$ for $k = s + 1, \ldots, r$.
For the second function, we express the constraint as
$$1 \leq (x^r t^s)^{1/(r+s)}, \quad x \geq 0, \quad t \geq 0,$$
where $\alpha = -r/s$ and $r$ and $s$ are positive integers, and then apply the formulation of part (a).

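(c) First express the problem as
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^m t_k \\
\mbox{subject to} & -y \preceq Ax - b \preceq y \\
& y_k^p \leq t_k, \quad k = 1, \ldots, m
\end{array}
\]
(which minimizes $\|Ax - b\|_p^p$ and so has the same minimizers), and use part (b).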
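The two building blocks of part (a) can be checked numerically. The sketch below, in Python with NumPy, first compares the second-order cone form with the product form on random points, and then evaluates the binary tree of pairwise geometric means, which gives the largest value the top-level variable $y$ can take. The function names are hypothetical, and the code only evaluates the constraints; it does not solve an SOCP.

```python
import numpy as np

def soc_form(t, z1, z2):
    # || (2t, z1 - z2) ||_2 <= z1 + z2
    return bool(np.hypot(2 * t, z1 - z2) <= z1 + z2)

def product_form(t, z1, z2):
    # t^2 <= z1 * z2 with z1, z2 >= 0
    return bool(t * t <= z1 * z2 and z1 >= 0 and z2 >= 0)

rng = np.random.default_rng(0)
samples = rng.uniform(-2, 2, size=(1000, 3))
agree = all(soc_form(t, z1, z2) == product_form(t, z1, z2) for t, z1, z2 in samples)

def geo_mean_tree(z):
    # z has length 2**l; each level replaces adjacent pairs by their geometric
    # mean, i.e. the largest value allowed for the auxiliary variable y_ij
    level = np.asarray(z, dtype=float)
    while level.size > 1:
        level = np.sqrt(level[0::2] * level[1::2])
    return float(level[0])

z = np.array([1.0, 4.0, 2.0, 8.0])
tree_value = geo_mean_tree(z)
direct_value = float(np.prod(z) ** (1 / z.size))
```

Each level of the tree corresponds to one family of constraints $y_{ij} \leq (y_{i+1,2j-1}\, y_{i+1,2j})^{1/2}$, and each of those is a second-order cone constraint of the first kind.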


3.22 Linear optimization over the complement of a convex set. Suppose $C \subseteq \mathbf{R}^n_+$ is a closed bounded
convex set with $0 \in C$, and $c \in \mathbf{R}^n_+$. We define
$$\tilde C = \mathbf{cl}(\mathbf{R}^n_+ \setminus C) = \mathbf{cl}\{x \in \mathbf{R}^n_+ \mid x \notin C\},$$
which is the closure of the complement of $C$ in $\mathbf{R}^n_+$.
Show that $c^T x$ has a minimizer over $\tilde C$ of the form $\alpha e_k$, where $\alpha \geq 0$ and $e_k$ is the $k$th standard
unit vector. (If you have not had a course on analysis, you can give an intuitive argument.)
It follows that we can minimize $c^T x$ over $\tilde C$ by solving $n$ one-dimensional optimization problems
(which, indeed, can each be solved by bisection, provided we can check whether a point is in $C$ or
not).
Solution. Suppose the claim is false, and let $x^\star$ be an optimal point, so $p^\star = c^T x^\star$ (we know the
optimum is attained, since $\tilde C$ is closed and $c \succeq 0$). If $c_i = 0$ for some $i$, we can find an $\alpha > 0$ for
which $\alpha e_i \in \tilde C$ (since $C$ is bounded) and $c^T (\alpha e_i) = 0$; so $\alpha e_i$ must be optimal. Thus, we can assume
without loss of generality that $c \succ 0$. We can also assume $p^\star > 0$, otherwise we must have $x^\star = 0$.
For $i = 1, \ldots, n$, let $\alpha_i = p^\star/c_i$, so $c^T (\alpha_i e_i) = p^\star$. If $\alpha_i e_i \in \tilde C$ for some $i$, then $\alpha_i e_i$ is also an optimal
point, so we can assume $\alpha_i e_i \notin \tilde C$, for all $i = 1, \ldots, n$. This implies $\alpha_i e_i \in \mathbf{int}\, C$ (the interior of $C$ in
$\mathbf{R}^n_+$). We can write $x^\star$ as a convex combination of the points $\alpha_i e_i$,
$$x^\star = \sum_{i=1}^n \theta_i (\alpha_i e_i),$$
where $\theta_i = x^\star_i c_i/p^\star$. (It is easy to check that $\theta \succeq 0$ and $\mathbf{1}^T \theta = 1$.) Since $\alpha_i e_i$ are all interior
points, their convex combination must also be in the interior of $C$, but this implies $x^\star \notin \tilde C$, which
contradicts the assumption that it is optimal.
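The bisection procedure mentioned in the exercise can be sketched as follows, in Python with NumPy. This is an illustration under assumptions that are not in the original: the set $C$ is given only through a membership oracle `in_C`, the example set is the unit Euclidean ball intersected with the nonnegative orthant, and along each axis the points of the closed complement are taken to be exactly those at or beyond the boundary point found by bisection.

```python
import numpy as np

def boundary_along_axis(in_C, k, n, tol=1e-8):
    """Approximate the largest alpha with alpha * e_k in C, by bisection."""
    e = np.zeros(n)
    e[k] = 1.0
    hi = 1.0
    while in_C(hi * e):          # C is bounded, so the doubling stops
        hi *= 2.0
    lo = 0.0                     # 0 is in C by assumption
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_C(mid * e):
            lo = mid
        else:
            hi = mid
    return hi

def minimize_over_complement(c, in_C):
    # one one-dimensional problem per coordinate axis, then take the best
    n = len(c)
    alphas = np.array([boundary_along_axis(in_C, k, n) for k in range(n)])
    k_best = int(np.argmin(c * alphas))
    return k_best, float(alphas[k_best]), float(c[k_best] * alphas[k_best])

def in_unit_ball(x):
    # hypothetical example set C (restricted to the nonnegative orthant)
    return bool(np.linalg.norm(x) <= 1.0)

c = np.array([3.0, 1.0, 2.0])
k_best, alpha_best, value = minimize_over_complement(c, in_unit_ball)
```

For this example every axis meets the boundary of $C$ at distance one, so the cheapest coordinate (index 1, with $c_k = 1$) is selected.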
3.23 Jensen's inequality for posynomials. Suppose $f : \mathbf{R}^n \to \mathbf{R}$ is a posynomial function, $x, y \in \mathbf{R}^n_{++}$,
and $\theta \in [0, 1]$. Define $z \in \mathbf{R}^n_{++}$ by $z_i = x_i^\theta y_i^{1-\theta}$, $i = 1, \ldots, n$. Show that $f(z) \leq f(x)^\theta f(y)^{1-\theta}$.
Interpretation. We can think of $z$ as a $\theta$-weighted geometric mean between $x$ and $y$. So the
statement above is that a posynomial, evaluated at a weighted geometric mean of two points, is no
more than the weighted geometric mean of the posynomial evaluated at the two points.
Solution.
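Let
$$f(x) = \sum_{k=1}^K c_k x_1^{\alpha_{1k}} x_2^{\alpha_{2k}} \cdots x_n^{\alpha_{nk}}.$$
We apply the standard transformation
$$\tilde x_i = \log x_i, \quad \tilde y_i = \log y_i, \quad \tilde z_i = \log z_i, \quad i = 1, \ldots, n,$$
and
$$\tilde c_k = \log c_k.$$
The inequality we want to prove is equivalent to
$$\log f(z) \leq \theta \log f(x) + (1 - \theta) \log f(y).$$
Since
$$\tilde z_i = \theta \tilde x_i + (1 - \theta) \tilde y_i, \quad i = 1, \ldots, n,$$
the inequality is
$$\log \sum_{k=1}^K e^{\theta \alpha_k^T \tilde x + (1-\theta) \alpha_k^T \tilde y + \tilde c_k} \leq \theta \log \sum_{k=1}^K e^{\alpha_k^T \tilde x + \tilde c_k} + (1 - \theta) \log \sum_{k=1}^K e^{\alpha_k^T \tilde y + \tilde c_k},$$
where $\alpha_k = (\alpha_{1k}, \ldots, \alpha_{nk})$. Let
$$\bar x_k = \alpha_k^T \tilde x + \tilde c_k, \quad \bar y_k = \alpha_k^T \tilde y + \tilde c_k,$$
so we finally have
$$\log \sum_{k=1}^K e^{\theta \bar x_k + (1-\theta) \bar y_k} \leq \theta \log \sum_{k=1}^K e^{\bar x_k} + (1 - \theta) \log \sum_{k=1}^K e^{\bar y_k}.$$
This holds because the function log-sum-exp is convex (Chap. 3, page 72).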
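As a sanity check of the inequality $f(z) \leq f(x)^\theta f(y)^{1-\theta}$, the sketch below (Python with NumPy) evaluates both sides for a randomly generated posynomial at random positive points. The coefficients, exponents, and sizes are hypothetical test data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 3, 4
coef = rng.random(K) + 0.1                 # positive coefficients c_k
expo = rng.uniform(-2, 2, size=(K, n))     # real exponents, one row per monomial

def posy(x):
    # f(x) = sum_k c_k * prod_i x_i ** alpha_ik
    return float(np.sum(coef * np.prod(x ** expo, axis=1)))

worst_ratio = 0.0
for _ in range(200):
    x = rng.random(n) + 0.1
    y = rng.random(n) + 0.1
    theta = rng.random()
    z = x**theta * y**(1 - theta)          # weighted geometric mean, elementwise
    ratio = posy(z) / (posy(x)**theta * posy(y)**(1 - theta))
    worst_ratio = max(worst_ratio, ratio)
```

The largest observed ratio should not exceed one (up to rounding), which is what the log-sum-exp convexity argument guarantees.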

3.24 CVX implementation of a concave function. Consider the concave function $f : \mathbf{R} \to \mathbf{R}$ defined by
$$f(x) = \begin{cases} (x + 1)/2 & x > 1 \\ \sqrt{x} & 0 \leq x \leq 1, \end{cases}$$
with $\mathbf{dom}\, f = \mathbf{R}_+$. Give a CVX implementation of $f$, via a partially specified optimization problem.
Check your implementation by maximizing $f(x) + f(a - x)$ for several interesting values of $a$ (say,
$a = -1$, $a = 1$, and $a = 3$).
Solution. You can't construct $f$ using simple disciplined convex programming (DCP) rules, so
you'll need to represent it as the optimal value of a convex problem. This can be done several ways,
using various built-in CVX functions (which are themselves written as the optimal value of a cone
problem!). Here is one:
\[
\begin{array}{ll}
\mbox{maximize} & \sqrt{u} + v/2 \\
\mbox{subject to} & x = u + v, \quad u \leq 1, \quad v \geq 0,
\end{array}
\]
with variables $u$, $v$. (Note that the square root function implies that $u \geq 0$.) This implementation
relies on the built-in function sqrt.
Of course, you have to argue that the optimal value of this problem is indeed $f(x)$. First of all, the
domain is correct: if $x < 0$, the problem above is infeasible, so $f(x) = -\infty$. Now suppose $x \in [0, 1]$.
We have $u = x - v$, so the derivative of the objective with respect to $v$ is $-(1/2)(x - v)^{-1/2} + 1/2$.
This is nonpositive (since $x - v \leq 1$), so the optimal $v$ is $v = 0$, and in this range of $x$, the optimal value
of the problem is $\sqrt{x}$, as required. Now consider the case $x > 1$. Setting the derivative above to zero yields
$v = x - 1$, which implies $u = 1$. Plugging this into the objective, we find that the optimal value of
the problem is $1 + (1/2)(x - 1) = (x + 1)/2$ as required.
There are many other possible choices, of course.
Here is the implementation:

function result = ccv_f(x)


cvx_begin
variables u v;
maximize(sqrt(u)+v/2);
subject to
x == u+v
v >= 0
u <= 1
cvx_end
result = cvx_optval;

And here is the test script:

for a=[-1,1,3]
cvx_begin
variable x
maximize(ccv_f(x) + ccv_f(a-x));
cvx_end
a
x
cvx_optval
end

When it is run, it does everything the right way. For $a = -1$, the problem is infeasible; when $a = 1$,
the optimal $x$ is $1/2$, and optimal value is $\sqrt{2}$; when $a = 3$, $x = 3/2$ is optimal, and the optimal
value is $5/2$.
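The claim that the small maximization problem reproduces $f$ can also be checked without CVX. The following is a minimal sketch in Python with NumPy that eliminates $v = x - u$ and maximizes over a grid of $u$ values; the grid search is a stand-in for a solver, and the function names are hypothetical.

```python
import numpy as np

def f_closed_form(x):
    # the piecewise definition of f on its domain x >= 0
    return float(np.sqrt(x)) if x <= 1 else (x + 1) / 2

def f_via_problem(x, grid=20001):
    # maximize sqrt(u) + v/2 subject to u + v = x, 0 <= u <= 1, v >= 0,
    # by eliminating v = x - u and searching over a grid of u
    u = np.linspace(0.0, min(1.0, x), grid)
    return float(np.max(np.sqrt(u) + (x - u) / 2))

xs = [0.0, 0.25, 1.0, 1.5, 3.0]
errors = [abs(f_via_problem(x) - f_closed_form(x)) for x in xs]

# value of f(x) + f(a - x) at x = 1/2 for a = 1
value_a1 = 2 * f_closed_form(0.5)
```

For $a = 1$ the value at $x = 1/2$ is $2\sqrt{1/2} = \sqrt{2}$, consistent with the result reported above.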

3.25 The following optimization problem arises in portfolio optimization:
\[
\begin{array}{ll}
\mbox{maximize} & \displaystyle \frac{r^T x + d}{\|Rx + q\|_2} \\
\mbox{subject to} & \displaystyle \sum_{i=1}^n f_i(x_i) \leq b \\
& x \succeq c.
\end{array}
\]
The variable is $x \in \mathbf{R}^n$. The functions $f_i$ are defined as
$$f_i(x_i) = \alpha_i x_i + \beta_i |x_i| + \gamma_i |x_i|^{3/2},$$
with $\beta_i > |\alpha_i|$, $\gamma_i > 0$. We assume there exists a feasible $x$ with $r^T x + d > 0$.
Show that this problem can be solved by solving an SOCP (if possible) or a sequence of SOCP
feasibility problems (otherwise).

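Solution. Since there is a feasible $x$ with $r^T x + d > 0$, we can equivalently solve
\[
\begin{array}{ll}
\mbox{minimize} & \displaystyle \frac{\|Rx + q\|_2}{r^T x + d} \\
\mbox{subject to} & \displaystyle \sum_{i=1}^n f_i(x_i) \leq b \\
& x \succeq c
\end{array}
\]
and take as the domain of the cost function $\{x \mid r^T x + d > 0\}$. The objective function is quasiconvex
and can be minimized via bisection. An alternative solution is to make a change of variables
$$y = \frac{1}{r^T x + d}\, x, \quad t = \frac{1}{r^T x + d}$$
by replacing $x$ with $y/t$ and adding the constraint $r^T y + dt = 1$, $t \geq 0$. This gives the equivalent
problem
\[
\begin{array}{ll}
\mbox{minimize} & \|Ry + qt\|_2 \\
\mbox{subject to} & \displaystyle \sum_{i=1}^n \left( \alpha_i y_i + \beta_i |y_i| + \gamma_i t (|y_i|/t)^{3/2} \right) \leq tb \\
& y \succeq tc \\
& r^T y + dt = 1 \\
& t \geq 0.
\end{array}
\]
(Note that the first constraint implies that $y = 0$ if $t = 0$, but this is impossible because of the
third constraint. Therefore $t > 0$ at the optimum.) The problem is further equivalent to
\[
\begin{array}{ll}
\mbox{minimize} & w \\
\mbox{subject to} & \|Ry + qt\|_2 \leq w \\
& \displaystyle \sum_{i=1}^n (\alpha_i y_i + \beta_i u_i + \gamma_i z_i) \leq tb \\
& -u \preceq y \preceq u \\
& u_i^{3/2} \leq z_i t^{1/2}, \quad i = 1, \ldots, n \\
& y \succeq tc \\
& r^T y + dt = 1 \\
& t \geq 0.
\end{array}
\]
It remains to express the constraints $u_i^{3/2} \leq t^{1/2} z_i$ as second-order cone constraints. These
constraints are equivalent to
$$u_i^2 \leq z_i v_i, \quad v_i^2 \leq u_i t, \quad z_i, v_i, u_i, t \geq 0,$$
which in turn are equivalent to second-order cone constraints
$$\left\| \left[ \begin{array}{c} 2u_i \\ z_i - v_i \end{array} \right] \right\|_2 \leq z_i + v_i, \quad \left\| \left[ \begin{array}{c} 2v_i \\ u_i - t \end{array} \right] \right\|_2 \leq u_i + t, \quad z_i, v_i, u_i, t \geq 0.$$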
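The last step of this solution introduces an auxiliary variable $v_i$ to split the constraint $u_i^{3/2} \leq t^{1/2} z_i$ into two product constraints. A quick numerical check, sketched in Python with NumPy on random data, shows that $v = \sqrt{ut}$ is a valid choice whenever the original constraint holds; the sample points are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1000
u = rng.random(N) + 0.01
t = rng.random(N) + 0.01
# points that satisfy u^{3/2} <= t^{1/2} z by construction
z = u**1.5 / np.sqrt(t) * (1 + rng.random(N))

v = np.sqrt(u * t)                                   # candidate auxiliary variable
first_ok = bool(np.all(u**2 <= z * v * (1 + 1e-12)))   # u^2 <= z v
second_ok = bool(np.all(v**2 <= u * t * (1 + 1e-12)))  # v^2 <= u t
```

Conversely, multiplying the two product constraints gives $u^4 \leq z^2 v^2 \leq z^2 u t$, which recovers $u^{3/2} \leq t^{1/2} z$ for $u > 0$.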

3.26 Positive nonconvex QCQP. We consider a (possibly nonconvex) QCQP, with nonnegative variable
$x \in \mathbf{R}^n$,
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \leq 0, \quad i = 1, \ldots, m \\
& x \succeq 0,
\end{array}
\]
where $f_i(x) = (1/2)x^T P_i x + q_i^T x + r_i$, with $P_i \in \mathbf{S}^n$, $q_i \in \mathbf{R}^n$, and $r_i \in \mathbf{R}$, for $i = 0, \ldots, m$. We do
not assume that $P_i \succeq 0$, so this need not be a convex problem.
Suppose that $q_i \preceq 0$, and $P_i$ have nonpositive off-diagonal entries, i.e., they satisfy
$$(P_i)_{jk} \leq 0, \quad j \neq k, \quad j, k = 1, \ldots, n,$$
for $i = 0, \ldots, m$. (A matrix with nonpositive off-diagonal entries is called a Z-matrix.) Explain
how to reformulate this problem as a convex problem.
Hint. Change variables using $y_j = \phi(x_j)$, for some suitable function $\phi$.
Solution. Let $y_j = x_j^2$. Because $x_j \geq 0$, we can recover $x_j$ as $x_j = y_j^{1/2}$. The quadratic functions
$f_i$ can be written in terms of $y$ as
$$f_i(x) = \frac{1}{2} \sum_{j=1}^n (P_i)_{jj} y_j + \frac{1}{2} \sum_{j \neq k} (P_i)_{jk} (y_j y_k)^{1/2} + \sum_{j=1}^n (q_i)_j y_j^{1/2} + r_i.$$
Since $y_j^{1/2}$ and $(y_j y_k)^{1/2}$ (the geometric mean of $y_j$ and $y_k$) are concave and $(P_i)_{jk} \leq 0$, $q_i \preceq 0$, this
is convex in $y$. Thus the QCQP becomes a convex problem in $y$.
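The convexity claim can be probed numerically. The sketch below (Python with NumPy) builds a random symmetric matrix with nonpositive off-diagonal entries and a nonpositive vector $q$, then tests midpoint convexity of the function in the variable $y$ on random pairs of points. All data are hypothetical, and a sampling test like this is evidence rather than a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.normal(size=(n, n))
P = (M + M.T) / 2                  # symmetric, possibly indefinite
off = ~np.eye(n, dtype=bool)
P[off] = -np.abs(P[off])           # force nonpositive off-diagonal entries
q = -rng.random(n)                 # q <= 0
r = 0.5

def g(y):
    # the quadratic function evaluated at x = sqrt(y), i.e. in the new variable y
    x = np.sqrt(y)
    return 0.5 * x @ P @ x + q @ x + r

violations = 0
for _ in range(500):
    y1 = rng.random(n) * 5
    y2 = rng.random(n) * 5
    if g((y1 + y2) / 2) > (g(y1) + g(y2)) / 2 + 1e-9:
        violations += 1
```

The diagonal entries of `P` are left with arbitrary sign, since the corresponding terms are linear in $y$.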

3.27 Affine policy. We consider a family of LPs, parametrized by the random variable $u$, which is
uniformly distributed on $\mathcal{U} = [-1, 1]^p$,
\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b(u),
\end{array}
\]
where $x \in \mathbf{R}^n$, $A \in \mathbf{R}^{m \times n}$, and $b(u) = b_0 + Bu \in \mathbf{R}^m$ is an affine function of $u$. You can think of
$u_i$ as representing a deviation of the $i$th parameter from its nominal value. The parameters might
represent (deviations in) levels of resources available, or other varying limits.
The problem is to be solved many times; in each time, the value of $u$ (i.e., a sample) is given, and
then the decision variable $x$ is chosen. The mapping from $u$ into the decision variable $x(u)$ is called
the policy, since it gives the decision variable value for each value of $u$. When enough time and
computing hardware is available, we can simply solve the LP for each new value of $u$; this is an
optimal policy, which we denote $x^\star(u)$.
In some applications, however, the decision $x(u)$ must be made very quickly, so solving the LP is
not an option. Instead we seek a suboptimal policy, which is affine: $x^{\mathrm{aff}}(u) = x_0 + Ku$, where $x_0$ is
called the nominal decision and $K \in \mathbf{R}^{n \times p}$ is called the feedback gain matrix. (Roughly speaking,
$x_0$ is our guess of $x$ before the value of $u$ has been revealed; $Ku$ is our modification of this guess,
once we know $u$.) We determine the policy (i.e., suitable values for $x_0$ and $K$) ahead of time; we
can then evaluate the policy (that is, find $x^{\mathrm{aff}}(u)$ given $u$) very quickly, by matrix multiplication
and addition.
We will choose $x_0$ and $K$ in order to minimize the expected value of the objective, while insisting
that for any value of $u$, feasibility is maintained:
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{E}\, c^T x^{\mathrm{aff}}(u) \\
\mbox{subject to} & A x^{\mathrm{aff}}(u) \preceq b(u) \quad \forall u \in \mathcal{U}.
\end{array}
\]
The variables here are $x_0$ and $K$. The expectation in the objective is over $u$, and the constraint
requires that $A x^{\mathrm{aff}}(u) \preceq b(u)$ hold almost surely.

(a) Explain how to find optimal values of $x_0$ and $K$ by solving a standard explicit convex
optimization problem (i.e., one that does not involve an expectation or an infinite number of
constraints, as the one above does). The numbers of variables or constraints in your
formulation should not grow exponentially with the problem dimensions $n$, $p$, or $m$.
(b) Carry out your method on the data given in affine_pol_data.m. To evaluate your affine
policy, generate 100 independent samples of $u$, and for each value, compute the objective
value of the affine policy, $c^T x^{\mathrm{aff}}(u)$, and of the optimal policy, $c^T x^\star(u)$. Scatter plot the
objective value of the affine policy (y-axis) versus the objective value of the optimal policy
(x-axis), and include the line $y = x$ on the plot. Report the average values of $c^T x^{\mathrm{aff}}(u)$ and
$c^T x^\star(u)$ over your samples. (These are estimates of $\mathbf{E}\, c^T x^{\mathrm{aff}}(u)$ and $\mathbf{E}\, c^T x^\star(u)$. The first
number, by the way, can be found exactly.)

Solution. Let's start with the objective. We compute the expected value of the affine policy as
$$\mathbf{E}\, c^T (x_0 + Ku) = c^T x_0 + (K^T c)^T\, \mathbf{E}\, u = c^T x_0,$$
since $\mathbf{E}\, u = 0$. Now let's look at the constraints, which we write out in terms of their entries:
$$(Ax_0)_i + \sup_{u \in \mathcal{U}} ((AK - B)u)_i \leq (b_0)_i, \quad i = 1, \ldots, m.$$
This turns into the explicit constraint
$$(Ax_0)_i + \|(AK - B)_i\|_1 \leq (b_0)_i, \quad i = 1, \ldots, m.$$
Here $(AK - B)_i$ is the $i$th row of the matrix $AK - B$.
So we can find an optimal affine policy by solving the problem (which can be transformed to an
LP)
\[
\begin{array}{ll}
\mbox{minimize} & c^T x_0 \\
\mbox{subject to} & (Ax_0)_i + \|(AK - B)_i\|_1 \leq (b_0)_i, \quad i = 1, \ldots, m,
\end{array}
\]
with variables $x_0$ and $K$.
The code below implements this.

% Affine policy.
affine_pol_data;

% compute affine policy


cvx_begin
    variables x0(n) K(n,p)
    minimize (c'*x0)
    subject to
        A*x0+norms(A*K-B,1,2) <= b0
cvx_end

% compare 100 samples


aff_obj = []; opt_obj = [];
for i=1:100
    u = 2*rand(p,1)-1;

    cvx_begin quiet
        variable x_opt(n)
        minimize (c'*x_opt)
        subject to
            A*x_opt <= b0+B*u
    cvx_end

    aff_obj(i) = c'*(x0+K*u);
    opt_obj(i) = cvx_optval;
end

figure;
plot(opt_obj,aff_obj,'x'); hold on;
axis equal
plot(xlim, xlim);
xlabel('Optimal policy objective')
ylabel('Affine policy objective')

print -depsc affine_pol.eps

[Figure: scatter plot of the affine policy objective (vertical axis) versus the optimal policy
objective (horizontal axis).]

The (exact) expected cost of the affine policy is $-3.219$, the mean objective of the affine policy is
$-3.223$ for the sample, and the mean objective of the optimal policy is $-3.301$ for the sample. The
plot shows that the affine policy produces points that are suboptimal, but not by much.

3.28 Probability bounds. Consider random variables X1 , X2 , X3 , X4 that take values in {0, 1}. We are
given the following marginal and conditional probabilities:

prob(X1 = 1) = 0.9,
prob(X2 = 1) = 0.9,
prob(X3 = 1) = 0.1,
prob(X1 = 1, X4 = 0 | X3 = 1) = 0.7,
prob(X4 = 1 | X2 = 1, X3 = 0) = 0.6.

Explain how to find the minimum and maximum possible values of prob(X4 = 1), over all (joint)
probability distributions consistent with the given data. Find these values and report them.
Hints. (You should feel free to ignore these hints.)

Matlab:
CVX supports multidimensional arrays; for example, variable p(2,2,2,2) declares a
4-dimensional array of variables, with each of the four indices taking the values 1 or 2.
The function sum(p,i) sums a multidimensional array p along the ith index.
The expression sum(a(:)) gives the sum of all entries of a multidimensional array a. You
might want to use the function definition sum_all = @(A) sum( A(:));, so sum_all(a)
gives the sum of all entries in the multidimensional array a.

Python:
Create a 1-d Variable and manually index the entries. You should come up with a
reasonable scheme to avoid confusion.
Julia:
You can create a multidimensional array of variables in Convex.jl. For example, the
following creates a 4-dimensional array of variables, with each of the four indices taking
the values 1 or 2.
p = [Variable() for i in 1:16];
p = reshape(p, 2, 2, 2, 2)
You can use the function sum to sum over various indices in the multidimensional array.
sum(p[:,:,:,:]) # sum all entries
sum(p[1,:,2,:]) # fix first and third indices
To create constraints with the variables in the array, you need to access each variable
independently. Something like p >= 0 will not work.

Solution. The outcome space for the random variables has $2^4 = 16$ elements, corresponding to all
possible 0/1 assignments to the variables. The variable is a probability distribution on this set,
which we can represent as a vector $p \in \mathbf{R}^{16}$. There are several reasonable choices for how we index
the vector. One is to index the vector $p$ by $i = 1, \ldots, 16$, with some encoding of the index, such as

\[
i = 1 + X_1 + 2X_2 + 4X_3 + 8X_4 .
\]

Another is to encode the index as a 4-tuple, $p_{ijkl}$, where $i, j, k, l \in \{1, 2\}$. (These are for Matlab
encodings; in a real language, we'd start the indices at 0, or represent the probability distribution
as a dictionary, i.e., a set of key-value pairs.)
Now for any event $A \subseteq \{0, 1\}^4$, $\mathbf{prob}(A)$ is a linear function of $p$: it is the sum of the entries of $p$
whose index is in $A$, so the coefficient of each index is 1 if the index is in $A$ and 0 otherwise.
It follows that the first three constraints, on marginal probabilities, are linear equalities on $p$: we
simply sum the entries of $p$ corresponding to the event (such as $X_1 = 1$) and constrain the sum to
equal the given righthand side value. Conditional probabilities are ratios of probabilities of events,
and so are linear-fractional functions of $p$. We can write the constraint

\[
\mathbf{prob}(A \mid B) = \frac{\mathbf{prob}(A \cap B)}{\mathbf{prob}(B)} = \alpha
\]
as
\[
\mathbf{prob}(A \cap B) = \alpha\, \mathbf{prob}(B),
\]
which is a linear equality constraint on $p$. Since $\mathbf{prob}(X_4 = 1)$ is a linear function of $p$, maximizing
and minimizing it subject to the constraints is just an LP.
We find that the range of possible values of $\mathbf{prob}(X_4 = 1)$ is between 0.48 and 0.61.
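To make the index bookkeeping concrete, the following sketch builds the index set of an event under the zero-based encoding $i = X_1 + 2X_2 + 4X_3 + 8X_4$. The helper names `index_of`, `event_indices` and `prob` are illustrative assumptions, not part of any package; the resulting index lists are exactly the ones that would appear in the linear constraints of the LP.

```python
from itertools import product


def index_of(x1, x2, x3, x4):
    """Zero-based index of the outcome (x1, x2, x3, x4) in the vector p."""
    return x1 + 2 * x2 + 4 * x3 + 8 * x4


def event_indices(predicate):
    """Sorted indices of all outcomes for which predicate(x1, x2, x3, x4) holds."""
    return sorted(index_of(*x) for x in product((0, 1), repeat=4) if predicate(*x))


def prob(p, indices):
    """prob(A) is linear in p: the sum of the entries indexed by the event."""
    return sum(p[i] for i in indices)


x1_is_1 = event_indices(lambda x1, x2, x3, x4: x1 == 1)
# Numerator and denominator events of prob(X4 = 1 | X2 = 1, X3 = 0).
numer = event_indices(lambda x1, x2, x3, x4: x2 == 1 and x3 == 0 and x4 == 1)
denom = event_indices(lambda x1, x2, x3, x4: x2 == 1 and x3 == 0)

uniform = [1.0 / 16] * 16
print(x1_is_1, numer, denom, prob(uniform, x1_is_1))
```

The conditional constraint then reads `prob(p, numer) == 0.6 * prob(p, denom)`, which is linear in `p`.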
The following code solves the problem in Matlab, using the two different encodings described above.

sum_all = @(A) sum( A(:));

for direction=[-1, 1]
% 1 in the ith entry of p corresponds to X_i being 0.
% 2 corresponds to X_i being 1.
cvx_begin
variable p(2,2,2,2);
maximize( direction*sum_all(p(:,:,:,2)))

sum_all(p)==1;
p >= 0;

sum_all( p(2,:,:,:)) == .9;


sum_all( p(:,2,:,:)) == .9;
sum_all( p(:,:,2,:)) == .1;

sum_all( p(2,:,2,1)) == .7* sum_all(p(:,:,2,:));


sum_all( p(:,2,1,2)) == .6*sum_all(p(:,2,1,:));
cvx_end
sum_all(p(:,:,:,2))
end

for direction=[-1, 1]
% p(k) corresponds to the arrangement x_1 x_2 x_3 x_4
% where k = x_1 + 2 x_2 + 4 x_3 + 8 x_4 + 1
cvx_begin
variable p(16);
maximize( direction*sum(p([9,10,11,12,13,14,15,16])))

sum(p)==1;
p >= 0;

sum( p([1,3,5,7,9,11,13,15]+1)) == .9;


sum( p([2,3,6,7,10,11,14,15]+1) ) == .9;
sum( p([4,5,6,7,12,13,14,15]+1)) == .1;

sum( p([5,7]+1)) == .7* sum(p([4,5,6,7,12,13,14,15]+1));


sum( p([10,11]+1)) == .6*sum(p([2,3,10,11]+1));
cvx_end
sum(p([9,10,11,12,13,14,15,16]))
end

The following code solves the problem in Python.

import cvxpy as cvx

for direction in [-1, 1]:
    # 0 in the ith entry of p corresponds to X_i being 0.
    # 1 corresponds to X_i being 1.
    p = cvx.Variable(2, 2, 2, 2)
    obj = direction*cvx.sum_entries(p[:, :, :, 1])
    cons = [cvx.sum_entries(p) == 1, p >= 0]
    cons += [cvx.sum_entries(p[1, :, :, :]) == .9]
    cons += [cvx.sum_entries(p[:, 1, :, :]) == .9]
    cons += [cvx.sum_entries(p[:, :, 1, :]) == .1]
    cons += [cvx.sum_entries(p[1, :, 1, 0]) == .7 * cvx.sum_entries(p[:, :, 1, :])]
    cons += [cvx.sum_entries(p[:, 1, 0, 1]) == .6 * cvx.sum_entries(p[:, 1, 0, :])]

    cvx.Problem(cvx.Maximize(obj), cons).solve()
    print(cvx.sum_entries(p[:, :, :, 1]).value)

for direction in [-1, 1]:
    # p(k) corresponds to the arrangement x_1 x_2 x_3 x_4
    # where k = x_1 + 2 x_2 + 4 x_3 + 8 x_4
    p = cvx.Variable(16)
    obj = direction*cvx.sum_entries(p[8:])
    cons = [cvx.sum_entries(p) == 1, p >= 0]
    cons += [cvx.sum_entries(p[1::2]) == .9]
    cons += [cvx.sum_entries(p[2::4]) + cvx.sum_entries(p[3::4]) == .9]
    cons += [cvx.sum_entries(p[4:8]) + cvx.sum_entries(p[12:16]) == .1]
    cons += [p[5]+p[7] == .7 * (cvx.sum_entries(p[4:8]) + cvx.sum_entries(p[12:16]))]
    cons += [p[10]+p[11] == .6 * (p[2]+p[3]+p[10]+p[11])]

    cvx.Problem(cvx.Maximize(obj), cons).solve()
    print(cvx.sum_entries(p[8:]).value)

A solution in Julia follows.

using Convex

p = [Variable() for i in 1:16];


p = reshape(p, 2, 2, 2, 2);

constraints = [];
for i in 1:16
constraints += p[i] >= 0;
end
constraints += sum(p) == 1;

constraints += sum(p[2,:,:,:]) == 0.9;
constraints += sum(p[:,2,:,:]) == 0.9;
constraints += sum(p[:,:,2,:]) == 0.1;
constraints += sum(p[2,:,2,1]) == 0.7 * sum(p[:,:,2,:]);
constraints += sum(p[:,2,1,2]) == 0.6 * sum(p[:,2,1,:]);

objective = sum(p[:,:,:,2]);

maxp = maximize(objective, constraints);


solve!(maxp);
println(maxp.optval);
minp = minimize(objective, constraints);
solve!(minp);
println(minp.optval);

3.29 Robust quadratic programming. In this problem, we consider a robust variation of the (convex)
quadratic program
\[
\begin{array}{ll}
\mbox{minimize} & (1/2)x^T P x + q^T x + r \\
\mbox{subject to} & Ax \preceq b.
\end{array}
\]
For simplicity we assume that only the matrix $P$ is subject to errors, and the other parameters ($q$,
$r$, $A$, $b$) are exactly known. The robust quadratic program is defined as
\[
\begin{array}{ll}
\mbox{minimize} & \sup_{P \in \mathcal{E}} \left( (1/2)x^T P x + q^T x + r \right) \\
\mbox{subject to} & Ax \preceq b
\end{array}
\]
where $\mathcal{E}$ is the set of possible matrices $P$.

For each of the following sets $\mathcal{E}$, express the robust QP as a tractable convex problem. Be as specific
as you can. (Here, tractable means that the problem can be reduced to an LP, QP, QCQP, SOCP,
or SDP. But you do not have to work out the reduction, if it is complicated; it is enough to argue
that it can be reduced to one of these.)

(a) A finite set of matrices: $\mathcal{E} = \{P_1, \ldots, P_K\}$, where $P_i \in \mathbf{S}^n_+$, $i = 1, \ldots, K$.

(b) A set specified by a nominal value $P_0 \in \mathbf{S}^n_+$ plus a bound on the eigenvalues of the deviation
$P - P_0$:
\[
\mathcal{E} = \{ P \in \mathbf{S}^n \mid -\gamma I \preceq P - P_0 \preceq \gamma I \}
\]
where $\gamma \in \mathbf{R}$ and $P_0 \in \mathbf{S}^n_+$.
(c) An ellipsoid of matrices:
\[
\mathcal{E} = \left\{ P_0 + \sum_{i=1}^K P_i u_i \;\middle|\; \|u\|_2 \leq 1 \right\}.
\]
You can assume $P_i \in \mathbf{S}^n_+$, $i = 0, \ldots, K$.

Solution.

(a) The objective function is a (finite) maximum of convex functions, hence convex. The formulation
\[
\begin{array}{ll}
\mbox{minimize} & \max_{i=1,\ldots,K} \left( (1/2)x^T P_i x + q^T x + r \right) \\
\mbox{subject to} & Ax \preceq b,
\end{array}
\]
with variable $x$, is tractable.
It can be expressed as the convex QCQP
\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & (1/2)x^T P_i x + q^T x + r \leq t, \quad i = 1, \ldots, K \\
& Ax \preceq b,
\end{array}
\]
with variables $x$ and $t$.

(b) For given $x$, the supremum of $x^T \Delta P\, x$ over $-\gamma I \preceq \Delta P \preceq \gamma I$ is given by
\[
\sup_{-\gamma I \preceq \Delta P \preceq \gamma I} x^T \Delta P\, x = \gamma x^T x.
\]
Therefore we can express the robust QP as
\[
\begin{array}{ll}
\mbox{minimize} & (1/2)x^T (P_0 + \gamma I)x + q^T x + r \\
\mbox{subject to} & Ax \preceq b
\end{array}
\]
which is a QP.
(c) For given $x$, the quadratic objective function is
\[
(1/2)\left( x^T P_0 x + \sup_{\|u\|_2 \leq 1} \sum_{i=1}^K u_i (x^T P_i x) \right) + q^T x + r
= (1/2)x^T P_0 x + (1/2)\left( \sum_{i=1}^K (x^T P_i x)^2 \right)^{1/2} + q^T x + r.
\]
This is a convex function of $x$. To see this, observe that each of the functions $x^T P_i x$ is convex
since $P_i \succeq 0$. The second term is a composition $h(g_1(x), \ldots, g_K(x))$ of $h(y) = \|y\|_2$ with
$g_i(x) = x^T P_i x$. The functions $g_i$ are convex and nonnegative. The function $h$ is convex and,
for $y \in \mathbf{R}^K_+$, nondecreasing in each of its arguments. Therefore the composition is convex.
The resulting problem can be expressed as
\[
\begin{array}{ll}
\mbox{minimize} & (1/2)x^T P_0 x + \|y\|_2 + q^T x + r \\
\mbox{subject to} & (1/2)x^T P_i x \leq y_i, \quad i = 1, \ldots, K \\
& Ax \preceq b,
\end{array}
\]
with variables $x$ and $y$.
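The supremum over the unit ball used in part (c) is the Cauchy-Schwarz bound: for a fixed vector $g$ (here $g_i = x^T P_i x$), $\sup_{\|u\|_2 \leq 1} u^T g = \|g\|_2$, attained at $u = g/\|g\|_2$. The following is a small numerical sanity check of that step only, with a made-up vector `g`; it does not solve the robust QP.

```python
import math
import random


def sup_linear_over_unit_ball(g):
    """Return (||g||_2, maximizing u) for sup of u^T g over ||u||_2 <= 1."""
    norm = math.sqrt(sum(v * v for v in g))
    u_star = [v / norm for v in g] if norm > 0 else [0.0] * len(g)
    return norm, u_star


g = [3.0, 4.0]  # hypothetical values of x^T P_i x for K = 2
value, u_star = sup_linear_over_unit_ball(g)
attained = sum(u * v for u, v in zip(u_star, g))

# Random unit vectors never exceed the closed-form supremum.
random.seed(0)
best_sampled = -math.inf
for _ in range(1000):
    u = [random.gauss(0.0, 1.0) for _ in g]
    length = math.sqrt(sum(v * v for v in u))
    u = [v / length for v in u]
    best_sampled = max(best_sampled, sum(a * b for a, b in zip(u, g)))

print(value, attained, best_sampled)
```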

3.30 Smallest confidence ellipsoid. Suppose the random variable $X$ on $\mathbf{R}^n$ has log-concave density $p$.
Formulate the following problem as a convex optimization problem: Find an ellipsoid $\mathcal{E}$ that satisfies
$\mathbf{prob}(X \in \mathcal{E}) \geq 0.95$ and is smallest, in the sense of minimizing the sum of the squares of its
semi-axis lengths. You do not need to worry about how to solve the resulting convex optimization
problem; it is enough to formulate the smallest confidence ellipsoid problem as the problem of
minimizing a convex function over a convex set involving the parameters that define $\mathcal{E}$.

Solution. We parametrize the ellipsoid as $\mathcal{E}(c, P) = \{x \mid (x - c)^T P^{-1} (x - c) \leq 1\}$, where $P \succ 0$.
The semi-axis lengths are $\lambda_i^{1/2}$, where $\lambda_i$ are the eigenvalues of $P$. The sum of the squares of the
semi-axis lengths is $\sum_i \lambda_i = \mathbf{tr}\, P$. So our job is to minimize $\mathbf{tr}\, P$ subject to $\mathbf{prob}(X \in \mathcal{E}) \geq 0.95$.

Define $g(x, c, P)$ as the 0-1 indicator function of $\mathcal{E}$, i.e.,
\[
g(x, c, P) = \left\{ \begin{array}{ll}
1 & (x - c)^T P^{-1} (x - c) \leq 1 \\
0 & \mbox{otherwise.}
\end{array} \right.
\]
Since $(x - c)^T P^{-1} (x - c)$ is a convex function of $(x, c, P)$ (for $P \succ 0$), $\log g$ is concave in $(x, c, P)$,
so $g$ is log-concave in $(x, c, P)$, and so is the product $p(x) g(x, c, P)$. By the integration rule for
log-concave functions,
\[
\int p(x) g(x, c, P)\, dx = \mathbf{prob}(X \in \mathcal{E}(c, P))
\]
is log-concave in $(c, P)$. So for any $\eta$, $\mathbf{prob}(X \in \mathcal{E}(c, P)) \geq \eta$ is a convex constraint in $(c, P)$. In
particular,
\[
\{(c, P) \mid \mathbf{prob}(X \in \mathcal{E}(c, P)) \geq 0.95\}
\]
is a convex set. We simply minimize (the linear function) $\mathbf{tr}\, P$ over this set to get the smallest
confidence ellipsoid.

3.31 Stochastic optimization via Monte Carlo sampling. In (convex) stochastic optimization, the goal
is to minimize a cost function of the form $F(x) = \mathbf{E}\, f(x, \omega)$, where $\omega$ is a random variable on $\Omega$,
and $f : \mathbf{R}^n \times \Omega \to \mathbf{R}$ is convex in its first argument for each $\omega \in \Omega$. (For simplicity we consider
the unconstrained problem; it is not hard to include constraints.) Evidently $F$ is convex. Let $p^\star$
denote the optimal value, i.e., $p^\star = \inf_x F(x)$ (which we assume is finite).
In a few very simple cases we can work out what $F$ is analytically, but in general this is not possible.
Moreover in many applications, we do not know the distribution of $\omega$; we only have access to an
oracle that can generate independent samples from the distribution.
A standard method for approximately solving the stochastic optimization problem is based on
Monte Carlo sampling. We first generate $N$ independent samples, $\omega_1, \ldots, \omega_N$, and form the
empirical expectation
\[
\hat F(x) = \frac{1}{N} \sum_{i=1}^N f(x, \omega_i).
\]
This is a random function, since it depends on the particular samples drawn. For each $x$, we
have $\mathbf{E}\, \hat F(x) = F(x)$, and also $\mathbf{E} (\hat F(x) - F(x))^2 \propto 1/N$. Roughly speaking, for $N$ large enough,
$\hat F(x) \approx F(x)$.
To (approximately) minimize $F$, we instead minimize $\hat F(x)$. The minimizer, $\hat x^\star$, and the optimal
value $\hat p^\star = \hat F(\hat x^\star)$, are also random variables. The hope is that for $N$ large enough, we have $\hat p^\star \approx p^\star$.
(In practice, stochastic optimization via Monte Carlo sampling works very well, even when $N$ is
not that big.)
One way to check the result of Monte Carlo sampling is to carry it out multiple times. We repeatedly
generate different batches of samples, and for each batch, we find $\hat x^\star$ and $\hat p^\star$. If the values of $\hat p^\star$ are
near each other, it's reasonable to believe that we have (approximately) minimized $F$. If they are
not, it means our value of $N$ is too small.

Show that $\mathbf{E}\, \hat p^\star \leq p^\star$.
This inequality implies that if we repeatedly use Monte Carlo sampling and the values of $\hat p^\star$ that
we get are all very close, then they are (likely) close to $p^\star$.
Hint. Show that for any function $G : \mathbf{R}^n \times \Omega \to \mathbf{R}$ (convex or not in its first argument), and any
random variable $\omega$ on $\Omega$, we have
\[
\inf_x \mathbf{E}\, G(x, \omega) \geq \mathbf{E} \inf_x G(x, \omega).
\]

Solution. Let's show the hint. For each $\omega$ and any $z \in \mathbf{R}^n$, we have $G(z, \omega) \geq \inf_x G(x, \omega)$. Since
expectation is monotone, we can take expectation and get
\[
\mathbf{E}\, G(z, \omega) \geq \mathbf{E} \inf_x G(x, \omega).
\]
This holds for all $z \in \mathbf{R}^n$, so we conclude that
\[
\inf_z \mathbf{E}\, G(z, \omega) \geq \mathbf{E} \inf_x G(x, \omega),
\]
which is what we wanted to show.

Then we have
\[
\begin{array}{rcl}
p^\star &=& \inf_x F(x) \\
&=& \inf_x \mathbf{E}\, \hat F(x) \\
&=& \inf_x \mathbf{E}\, \frac{1}{N} \sum_{i=1}^N f(x, \omega_i) \\
&\geq& \mathbf{E} \inf_x \frac{1}{N} \sum_{i=1}^N f(x, \omega_i) \\
&=& \mathbf{E}\, \hat p^\star .
\end{array}
\]
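A toy instance makes the downward bias visible. The sketch below is an illustration under assumptions that are not part of the exercise: it takes $f(x, \omega) = (x - \omega)^2$ with $\omega$ standard normal, so $F(x) = x^2 + 1$ and $p^\star = 1$, while the empirical problem is minimized by the sample mean and $\hat p^\star$ is the (biased) sample variance, with $\mathbf{E}\, \hat p^\star = (N-1)/N$.

```python
import random


def empirical_optimal_value(samples):
    """min over x of (1/N) sum_i (x - w_i)^2, attained at the sample mean."""
    x_hat = sum(samples) / len(samples)
    return sum((x_hat - w) ** 2 for w in samples) / len(samples)


random.seed(0)
N = 5            # samples per batch (deliberately small)
batches = 2000   # number of independent Monte Carlo runs
p_star = 1.0     # inf_x E (x - w)^2 for w ~ N(0, 1)

p_hat_values = [
    empirical_optimal_value([random.gauss(0.0, 1.0) for _ in range(N)])
    for _ in range(batches)
]
mean_p_hat = sum(p_hat_values) / batches
print(mean_p_hat, p_star)
```

With $N = 5$ the average of $\hat p^\star$ over the batches should land near $0.8$, below $p^\star = 1$, consistent with the inequality; the individual values of $\hat p^\star$ are also widely spread, which is the signal that $N$ is too small.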

3.32 Satisfying a minimum number of constraints. Consider the problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \leq 0 \mbox{ holds for at least $k$ values of $i$,}
\end{array}
\]
with variable $x \in \mathbf{R}^n$, where the objective $f_0$ and the constraint functions $f_i$, $i = 1, \ldots, m$ (with
$m \geq k$), are convex. Here we require that only $k$ of the constraints hold, instead of all $m$ of them.
In general this is a hard combinatorial problem; the brute force solution is to solve all $\binom{m}{k}$ convex
problems obtained by choosing subsets of $k$ constraints to impose, and selecting one with smallest
objective value.
In this problem we explore a convex restriction that can be an effective heuristic for the problem.

(a) Suppose $\lambda > 0$. Show that the constraint
\[
\sum_{i=1}^m (1 + \lambda f_i(x))_+ \leq m - k
\]
guarantees that $f_i(x) \leq 0$ holds for at least $k$ values of $i$. ($(u)_+$ means $\max\{u, 0\}$.)
Hint. For each $u \in \mathbf{R}$, $(1 + \lambda u)_+ \geq \mathbf{1}(u > 0)$, where $\mathbf{1}(u > 0) = 1$ for $u > 0$, and $\mathbf{1}(u > 0) = 0$
for $u \leq 0$.

(b) Consider the problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & \sum_{i=1}^m (1 + \lambda f_i(x))_+ \leq m - k \\
& \lambda > 0,
\end{array}
\]
with variables $x$ and $\lambda$. This is a restriction of the original problem: If $(x, \lambda)$ are feasible for
it, then $x$ is feasible for the original problem. Show how to solve this problem using convex
optimization. (This may involve a change of variables.)
(c) Apply the method of part (b) to the problem instance
\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & a_i^T x \leq b_i \mbox{ holds for at least $k$ values of $i$,}
\end{array}
\]
with $m = 70$, $k = 58$, and $n = 12$. The vectors $b$, $c$ and the matrix $A$ with rows $a_i^T$ are given
in the file satisfy_some_constraints_data.*.
Report the optimal value of $\lambda$, the objective value, and the actual number of constraints that
are satisfied (which should be larger than or equal to $k$). To determine if a constraint is
satisfied, you can use the tolerance $a_i^T x - b_i \leq \epsilon^{\mathrm{feas}}$, with $\epsilon^{\mathrm{feas}} = 10^{-5}$.
A standard trick is to take this tentative solution, choose the $k$ constraints with the smallest
values of $f_i(x)$, and then minimize $f_0(x)$ subject to these $k$ constraints (i.e., ignoring the other
$m - k$ constraints). This improves the objective value over the one found using the restriction.
Carry this out for the problem instance, and report the objective value obtained.

Solution.

(a) We first prove the hint. If $u > 0$ then $\mathbf{1}(u > 0) = 1$ and $1 + \lambda u > 1$, so $(1 + \lambda u)_+ > 1$. If $u \leq 0$
then $\mathbf{1}(u > 0) = 0$ and $(1 + \lambda u)_+ \geq 0$. Hence $(1 + \lambda u)_+ \geq \mathbf{1}(u > 0)$ for all $u \in \mathbf{R}$.
Applying this to $u = f_i(x)$, we have that $(1 + \lambda f_i(x))_+ \geq \mathbf{1}(f_i(x) > 0)$ for all $i$. Hence
\[
\sum_{i=1}^m \mathbf{1}(f_i(x) > 0) \leq \sum_{i=1}^m (1 + \lambda f_i(x))_+ .
\]
The constraint $\sum_{i=1}^m (1 + \lambda f_i(x))_+ \leq m - k$ guarantees that $\sum_{i=1}^m \mathbf{1}(f_i(x) > 0) \leq m - k$, so
$f_i(x) > 0$ holds for at most $m - k$ values of $i$. In other words, $f_i(x) \leq 0$ holds for at least $k$
values of $i$.
(b) If $\lambda > 0$ then $(\lambda u)_+ = \lambda (u)_+$ for all $u \in \mathbf{R}$. Hence $(1 + \lambda f_i(x))_+ = \lambda (1/\lambda + f_i(x))_+$. The
constraint $\sum_{i=1}^m (1 + \lambda f_i(x))_+ \leq m - k$ can then be written as
\[
\lambda \sum_{i=1}^m \left( \frac{1}{\lambda} + f_i(x) \right)_+ \leq m - k,
\]
or equivalently,
\[
\sum_{i=1}^m \left( \frac{1}{\lambda} + f_i(x) \right)_+ \leq (m - k) \frac{1}{\lambda}.
\]
Letting $\mu = 1/\lambda$, the restricted problem can be expressed as
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & \sum_{i=1}^m (\mu + f_i(x))_+ \leq (m - k)\mu \\
& \mu > 0,
\end{array}
\]
with variables $x$ and $\mu$. Since the function $(\cdot)_+$ is convex and nondecreasing, and $\mu + f_i(x)$ is
convex in both $\mu$ and $x$, $(\mu + f_i(x))_+$ is convex, so this is a convex optimization problem.
After solving this problem and obtaining an optimal value $\mu^\star$ for $\mu$, the optimal value of $\lambda$
is $1/\mu^\star$. Note that we can replace the constraint $\mu > 0$ by $\mu \geq 0$, since if $\mu = 0$ then the
constraint $\sum_{i=1}^m (\mu + f_i(x))_+ \leq (m - k)\mu$ is the same as requiring that all the constraints $f_i(x) \leq 0$ hold.
Certainly in this case at least $k$ of them hold. Alternatively we can introduce a new variable
$t$ and replace the constraint $\mu > 0$ with $\mu \geq e^t$.
(c) The optimal value of $\lambda$ is 282.98 and the objective value is $-8.45$. The number of constraints
satisfied is 66, which exceeds our required minimum, $k = 58$.
When we take this tentative solution, choose the $k$ constraints with the smallest values of $f_i(x)$,
and then minimize $f_0(x)$ subject to these $k$ constraints, we get an objective value between
$-8.75$ and $-8.86$ (depending on the solver; these numbers should be the same . . . ). In any
case, it gives a modest improvement in objective compared to the restriction.
The actual optimal value (which we obtained using branch and bound, a global optimization
method) is $-9.57$. To compute this takes far more effort than to solve the restriction; for
larger problem sizes solving the global problem is prohibitively slow.
The following Matlab code solves the problem:
satisfy_some_constraints_data;

cvx_begin quiet
    variables x(n) mu_var
    minimize(c' * x)
    subject to
        sum(pos(mu_var + A * x - b)) <= (m - k) * mu_var
        mu_var >= 0
cvx_end
fprintf('Optimal value of lambda: %f\n', 1 / mu_var)
fprintf('Objective value: %f\n', cvx_optval)
fprintf('Number of constraints satisfied: %d\n', nnz(A * x - b <= 1e-5))

% Choose k least violated inequalities as constraints
[~, idx] = sort(A * x - b);
least_violated = idx(1:k);
cvx_begin quiet
    variables x(n)
    minimize(c' * x)
    subject to
        A(least_violated, :) * x <= b(least_violated)
cvx_end
fprintf('Objective after minimizing wrt k constraints: %f\n', cvx_optval)
The following Python code solves the problem:
import cvxpy as cvx
import numpy as np

from satisfy_some_constraints_data import *

x = cvx.Variable(n)
mu = cvx.Variable()
constraints = [cvx.sum_entries(cvx.pos(mu + A * x - b)) <= (m - k) * mu,
               mu >= 0]
problem = cvx.Problem(cvx.Minimize(c.T * x), constraints)
problem.solve()
print('Optimal value of lambda: {}'.format(1 / mu.value))
print('Objective value: {}'.format(problem.value))
print('Number of constraints satisfied: {}'
      .format(np.count_nonzero(A.dot(x.value.A1) - b <= 1e-5)))

# Choose k least violated inequalities as constraints
least_violated = np.argsort(A.dot(x.value).A1 - b)[:k]
constraints = [A[least_violated] * x <= b[least_violated]]
problem = cvx.Problem(cvx.Minimize(c.T * x), constraints)
problem.solve()
print('Objective after minimizing wrt k constraints: {}'
      .format(problem.value))
The following Julia code solves the problem:
using Convex, SCS
set_default_solver(SCSSolver(verbose=false))

include("satisfy_some_constraints_data.jl")

x = Variable(n)
mu = Variable()
constraints = [sum(pos(mu + A * x - b)) <= (m - k) * mu,
mu >= 0]
problem = minimize(c' * x, constraints)
solve!(problem)
println("Optimal value of lambda: $(1 / mu.value)")
println("Objective value: $(problem.optval)")

num_constraints_satisfied = countnz(A * x.value - b .<= 1e-5)


println("Number of constraints satisfied: $num_constraints_satisfied")

# Choose k least violated inequalities as constraints
least_violated = sortperm((A * x.value - b)[:])[1:k]
constraints = [A[least_violated, :] * x <= b[least_violated]]
problem = minimize(c' * x, constraints)
solve!(problem)
println("Objective after minimizing wrt k constraints: $(problem.optval)")
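The bound from part (a) is easy to check on a handful of numbers. The sketch below uses made-up constraint values `f_values` (standing in for $f_i(x)$ at some fixed $x$) and hypothetical helper names; it only illustrates that the hinge sum upper-bounds the number of violated constraints, so the restriction is sufficient but can be conservative.

```python
def hinge_sum(f_values, lam):
    """sum_i (1 + lam * f_i)_+ for fixed constraint values f_i = f_i(x)."""
    return sum(max(1.0 + lam * f, 0.0) for f in f_values)


def num_violated(f_values):
    """Number of constraints with f_i > 0."""
    return sum(1 for f in f_values if f > 0)


def restriction_holds(f_values, lam, k):
    """True if the convex restriction certifies at least k satisfied constraints."""
    return hinge_sum(f_values, lam) <= len(f_values) - k


f_values = [-2.0, -0.5, -0.1, -1.0, 1.2]   # four satisfied, one violated
for lam in (0.1, 1.0, 10.0):
    # The hint: the hinge sum always dominates the violation count.
    assert num_violated(f_values) <= hinge_sum(f_values, lam)

print(hinge_sum(f_values, 1.0))
print(restriction_holds(f_values, 1.0, 1), restriction_holds(f_values, 1.0, 4))
```

Here four constraints actually hold, but with $\lambda = 1$ the restriction only certifies $k = 1$, which is why the follow-up step of re-solving with the $k$ least violated constraints can improve the objective.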

4 Duality
4.1 Numerical perturbation analysis example. Consider the quadratic program
\[
\begin{array}{ll}
\mbox{minimize} & x_1^2 + 2x_2^2 - x_1 x_2 - x_1 \\
\mbox{subject to} & x_1 + 2x_2 \leq u_1 \\
& x_1 - 4x_2 \leq u_2, \\
& 5x_1 + 76x_2 \leq 1,
\end{array}
\]
with variables $x_1$, $x_2$, and parameters $u_1$, $u_2$.

(a) Solve this QP, for parameter values $u_1 = -2$, $u_2 = -3$, to find optimal primal variable values
$x_1^\star$ and $x_2^\star$, and optimal dual variable values $\lambda_1^\star$, $\lambda_2^\star$ and $\lambda_3^\star$. Let $p^\star$ denote the optimal objective
value. Verify that the KKT conditions hold for the optimal primal and dual variables you
found (within reasonable numerical accuracy).
Matlab hint: See §3.7 of the CVX users' guide to find out how to retrieve optimal dual
variables. To specify the quadratic objective, use quad_form().
(b) We will now solve some perturbed versions of the QP, with
\[
u_1 = -2 + \delta_1, \qquad u_2 = -3 + \delta_2,
\]
where $\delta_1$ and $\delta_2$ each take values from $\{-0.1, 0, 0.1\}$. (There are a total of nine such
combinations, including the original problem with $\delta_1 = \delta_2 = 0$.) For each combination of $\delta_1$ and $\delta_2$,
make a prediction $p^\star_{\mathrm{pred}}$ of the optimal value of the perturbed QP, and compare it to $p^\star_{\mathrm{exact}}$,
the exact optimal value of the perturbed QP (obtained by solving the perturbed QP). Put
your results in the two righthand columns in a table with the form shown below. Check that
the inequality $p^\star_{\mathrm{pred}} \leq p^\star_{\mathrm{exact}}$ holds.

$\delta_1$    $\delta_2$    $p^\star_{\mathrm{pred}}$    $p^\star_{\mathrm{exact}}$
 0       0
 0      -0.1
 0       0.1
-0.1     0
-0.1    -0.1
-0.1     0.1
 0.1     0
 0.1    -0.1
 0.1     0.1

Matlab solution
The following Matlab code sets up the simple QP and solves it using CVX.
Part (a):

Q = [1 -1/2; -1/2 2];
f = [-1 0];
A = [1 2; 1 -4; 5 76];
b = [-2 -3 1]';

cvx_begin
variable x(2)
dual variable lambda
minimize(quad_form(x,Q)+f*x)
subject to
lambda: A*x <= b
cvx_end
p_star = cvx_optval

Part (b):

arr_i = [0 -1 1];
delta = 0.1;
pa_table = [];
for i = arr_i
for j = arr_i
p_pred = p_star - [lambda(1) lambda(2)]*[i; j]*delta;
cvx_begin
variable x(2)
minimize(quad_form(x,Q)+f*x)
subject to
A*x <= b+[i;j;0]*delta
cvx_end
p_exact = cvx_optval;

pa_table = [pa_table; i*delta j*delta p_pred p_exact]


end
end

When we run this, we find the optimal objective value is $p^\star = 8.22$ and the optimal point is
$x_1^\star = -2.33$, $x_2^\star = 0.17$. (This optimal point is unique since the objective is strictly convex.) A set
of optimal dual variables is $\lambda_1^\star = 2.13$, $\lambda_2^\star = 3.31$ and $\lambda_3^\star = 0.08$.
Python solution
The following Python code sets up the simple QP and solves it using CVXPY.
Part (a) and (b):

import cvxpy as cvx
import numpy as np

# part (a)
Q = np.matrix('1 -0.5; -0.5 2')
f = np.matrix('-1; 0')
A = np.matrix('1 2; 1 -4; 5 76')
b = np.matrix('-2; -3; 1')

x = cvx.Variable(2)
obj = cvx.quad_form(x, Q) + x.T*f
cons = [A*x <= b]

p_star = cvx.Problem(cvx.Minimize(obj), cons).solve()
lambdas = cons[0].dual_value

print(p_star)
print(x.value)
print(lambdas)
# primal residual and gradient of the Lagrangian (both should be near zero)
print(A*x.value - b)
print(2*Q*x.value + f + A.transpose()*cons[0].dual_value)

# part (b)
arr_i = np.array([0, -1, 1])
delta = 0.1
pa_table = np.zeros((9, 4))
count = 0
for i in arr_i:
    for j in arr_i:
        p_pred = p_star - (lambdas[0]*i + lambdas[1]*j)*delta
        cons = [A*x <= b+np.matrix(np.array([i,j,0])).T*delta]
        p_exact = cvx.Problem(cvx.Minimize(obj), cons).solve()
        pa_table[count,:] = np.array([i*delta, j*delta, p_pred, p_exact])
        count += 1

print(pa_table)

When we run this, we find the optimal objective value is $p^\star = 8.22$ and the optimal point is
$x_1^\star = -2.33$, $x_2^\star = 0.17$. (This optimal point is unique since the objective is strictly convex.) A set
of optimal dual variables is $\lambda_1^\star = 1.60$, $\lambda_2^\star = 3.67$ and $\lambda_3^\star = 0.11$.
Julia solution
The following Julia code sets up the simple QP and solves it using Convex.jl.
Part (a) and (b):

using Convex

Q = [1 -1/2; -1/2 2];
f = [-1 0];
A = [1 2; 1 -4; 5 76];
b = [-2 -3 1]';

# part (a)
x = Variable(2);
constr = A*x <= b;
p = minimize(quad_form(x,Q)+f*x, constr);
solve!(p);
lambda = constr.dual;
p_star = p.optval;

println(lambda);
println(p_star);
println(x.value);
println(A*x.value - b);
println(2*Q*x.value + f' + A'*lambda);

# part (b)
arr_i = [0 -1 1];
delta = 0.1;
pa_table = zeros(9, 4);
count = 1;
for i in arr_i
    for j in arr_i
        p_pred = p_star - dot(lambda[1:2], [i;j])*delta;
        x = Variable(2);
        # perturb the first two righthand sides by (i, j)*delta
        p = minimize(quad_form(x,Q)+f*x, A*x <= b+[i;j;0]*delta);
        solve!(p);
        p_exact = p.optval;
        pa_table[count, :] = [i*delta j*delta p_pred p_exact];
        count += 1;
    end
end

When we run this, we find the optimal objective value is $p^\star = 8.22$ and the optimal point is
$x_1^\star = -2.33$, $x_2^\star = 0.17$. (This optimal point is unique since the objective is strictly convex.) A set
of optimal dual variables is $\lambda_1^\star = 2.78$, $\lambda_2^\star = 2.86$ and $\lambda_3^\star = 0.04$.
All languages
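The KKT conditions are
\[
\begin{array}{c}
x_1^\star + 2x_2^\star \leq u_1, \qquad x_1^\star - 4x_2^\star \leq u_2, \qquad 5x_1^\star + 76x_2^\star \leq 1 \\
\lambda_1^\star \geq 0, \qquad \lambda_2^\star \geq 0, \qquad \lambda_3^\star \geq 0 \\
\lambda_1^\star (x_1^\star + 2x_2^\star - u_1) = 0, \qquad \lambda_2^\star (x_1^\star - 4x_2^\star - u_2) = 0, \qquad \lambda_3^\star (5x_1^\star + 76x_2^\star - 1) = 0, \\
2x_1^\star - x_2^\star - 1 + \lambda_1^\star + \lambda_2^\star + 5\lambda_3^\star = 0, \\
4x_2^\star - x_1^\star + 2\lambda_1^\star - 4\lambda_2^\star + 76\lambda_3^\star = 0.
\end{array}
\]

We check these numerically. The dual variables $\lambda_1^\star$, $\lambda_2^\star$ and $\lambda_3^\star$ are all greater than zero and the
quantities (in Matlab notation)

A*x-b
2*Q*x+f'+A'*lambda

are found to be very small. Thus the KKT conditions are verified.
The predicted optimal value is given by

\[
p^\star_{\mathrm{pred}} = p^\star - \lambda_1^\star \delta_1 - \lambda_2^\star \delta_2 .
\]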

The values obtained are

$\delta_1$    $\delta_2$    $p^\star_{\mathrm{pred}}$    $p^\star_{\mathrm{exact}}$
 0       0      8.22    8.22
 0      -0.1    8.55    8.70
 0       0.1    7.89    7.98
-0.1     0      8.44    8.57
-0.1    -0.1    8.77    8.82
-0.1     0.1    8.10    8.32
 0.1     0      8.01    8.22
 0.1    -0.1    8.34    8.71
 0.1     0.1    7.68    7.75

The inequality $p^\star_{\mathrm{pred}} \leq p^\star_{\mathrm{exact}}$ is verified to be true in all cases.

[Convex.jl with the SCS solver is less accurate; $p^\star_{\mathrm{pred}}$ and $p^\star_{\mathrm{exact}}$ agree only to within a few percent.]
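The predicted column can be reproduced by hand from the rounded values reported for the Matlab run ($p^\star = 8.22$, $\lambda_1^\star = 2.13$, $\lambda_2^\star = 3.31$). The sketch below is only that arithmetic, with the perturbations taken in the same order as the loops in the code above; because it starts from rounded inputs, it agrees with the table to about two decimal places rather than exactly.

```python
p_star = 8.22            # reported optimal value of the unperturbed QP
lam1, lam2 = 2.13, 3.31  # reported optimal dual variables (Matlab run)


def predict(delta1, delta2):
    """First-order lower bound p* - lam1*delta1 - lam2*delta2 on the perturbed optimum."""
    return p_star - lam1 * delta1 - lam2 * delta2


deltas = [0.0, -0.1, 0.1]
table = [(d1, d2, predict(d1, d2)) for d1 in deltas for d2 in deltas]
for d1, d2, p_pred in table:
    print("{:5.1f} {:5.1f} {:6.2f}".format(d1, d2, p_pred))
```

The bound follows from the global sensitivity inequality for convex problems, which is why it holds for all nine perturbations and not only for small ones.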

4.2 A determinant maximization problem. We consider the problem
\[
\begin{array}{ll}
\mbox{minimize} & \log\det X^{-1} \\
\mbox{subject to} & A_i^T X A_i \preceq B_i, \quad i = 1, \ldots, m,
\end{array}
\]
with variable $X \in \mathbf{S}^n$, and problem data $A_i \in \mathbf{R}^{n \times k_i}$, $B_i \in \mathbf{S}^{k_i}_{++}$, $i = 1, \ldots, m$. The constraint
$X \succ 0$ is implicit.
We can give several interpretations of this problem. Here is one, from statistics. Let $z$ be a random
variable in $\mathbf{R}^n$, with covariance matrix $X$, which is unknown. However, we do have (matrix) upper
bounds on the covariance of the random variables $y_i = A_i^T z \in \mathbf{R}^{k_i}$, which is $A_i^T X A_i$. The problem
is to find the covariance matrix for $z$ that is consistent with the known upper bounds on the
covariance of $y_i$ and has the largest volume confidence ellipsoid.
Derive the Lagrange dual of this problem. Be sure to state what the dual variables are (e.g.,
vectors, scalars, matrices), any constraints they must satisfy, and what the dual function is. If
the dual function has any implicit equality constraints, make them explicit. You can assume that
$\sum_{i=1}^m A_i A_i^T \succ 0$, which implies the feasible set of the original problem is bounded.
What can you say about the optimal duality gap for this problem?
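Solution. We introduce a dual variable $Z_i \in \mathbf{S}^{k_i}$ for each of the matrix inequalities in the original
problem. These must satisfy $Z_i \succeq 0$. The Lagrangian is
\[
L(X, Z_1, \ldots, Z_m) = -\log\det X + \sum_{i=1}^m \mathbf{tr}\, Z_i (A_i^T X A_i - B_i).
\]
As a function of $X$, this is strictly convex, and therefore $X$ minimizes the Lagrangian if and only
if the gradient of the Lagrangian with respect to $X$ is zero, i.e.,
\[
-X^{-1} + \sum_{i=1}^m A_i Z_i A_i^T = 0.
\]
This equation has a solution if and only if
\[
\sum_{i=1}^m A_i Z_i A_i^T \succ 0.
\]
In fact, this is exactly the domain of the dual function; that is, if this condition doesn't hold, the
Lagrangian is unbounded below. If the condition doesn't hold, then there is a nonzero vector $v$ with
$v^T \left( \sum_i A_i Z_i A_i^T \right) v = 0$, i.e., $Z_i^{1/2} A_i^T v = 0$ for all $i$. Take $X = I + tvv^T$, where $t > 0$. Since
$\mathbf{tr}\, Z_i A_i^T vv^T A_i = 0$, we have
\[
L(X, Z_1, \ldots, Z_m) = -\log(1 + t\|v\|_2^2) + \sum_i \mathbf{tr}\, Z_i (A_i^T A_i - B_i),
\]
which tends to $-\infty$ as $t \to \infty$. So we can now move forward with the assumption
\[
\sum_{i=1}^m A_i Z_i A_i^T \succ 0.
\]
Thus we find that the $X$ that minimizes the Lagrangian is
\[
X^\star = \left( \sum_{i=1}^m A_i Z_i A_i^T \right)^{-1}.
\]
Therefore, the dual function is
\[
\begin{array}{rcl}
g(Z_1, \ldots, Z_m) &=& L(X^\star, Z_1, \ldots, Z_m) \\
&=& -\log\det \left( \sum_{i=1}^m A_i Z_i A_i^T \right)^{-1} + \sum_{i=1}^m \mathbf{tr}\, Z_i \left( A_i^T \left( \sum_{j=1}^m A_j Z_j A_j^T \right)^{-1} A_i - B_i \right) \\
&=& \log\det \left( \sum_{i=1}^m A_i Z_i A_i^T \right) - \sum_{i=1}^m \mathbf{tr}\, Z_i B_i + \mathbf{tr} \left( \left( \sum_{i=1}^m A_i Z_i A_i^T \right) \left( \sum_{i=1}^m A_i Z_i A_i^T \right)^{-1} \right) \\
&=& \log\det \left( \sum_{i=1}^m A_i Z_i A_i^T \right) - \sum_{i=1}^m \mathbf{tr}\, Z_i B_i + n,
\end{array}
\]
with domain
\[
\mathop{\bf dom} g = \left\{ (Z_1, \ldots, Z_m) \;\middle|\; \sum_{i=1}^m A_i Z_i A_i^T \succ 0 \right\}.
\]
(We didn't expect you to be so careful in identifying the domain of $g$.)

The dual problem is therefore
\[
\begin{array}{ll}
\mbox{maximize} & \log\det \left( \sum_{i=1}^m A_i Z_i A_i^T \right) - \sum_{i=1}^m \mathbf{tr}\, Z_i B_i + n \\
\mbox{subject to} & Z_i \succeq 0, \quad i = 1, \ldots, m.
\end{array}
\]
Note that for $\epsilon > 0$ and small enough, $X = \epsilon I$ is strictly feasible for the primal problem. Specifically,
choose $\epsilon < \min_{i=1,\ldots,m} \lambda_{\min}(B_i)/\sigma_{\max}(A_i)^2$ (since $B_i \succ 0$, such an $\epsilon > 0$ exists). Therefore by Slater's
condition, strong duality always holds.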

4.3 The relative entropy between two vectors $x, y \in \mathbf{R}^n_{++}$ is defined as
\[
\sum_{k=1}^n x_k \log(x_k/y_k).
\]
This is a convex function, jointly in $x$ and $y$. In the following problem we calculate the vector $x$
that minimizes the relative entropy with a given vector $y$, subject to equality constraints on $x$:
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^n x_k \log(x_k/y_k) \\
\mbox{subject to} & Ax = b \\
& \mathbf{1}^T x = 1.
\end{array}
\]
The optimization variable is $x \in \mathbf{R}^n$. The domain of the objective function is $\mathbf{R}^n_{++}$. The parameters
$y \in \mathbf{R}^n_{++}$, $A \in \mathbf{R}^{m \times n}$, and $b \in \mathbf{R}^m$ are given.
Derive the Lagrange dual of this problem and simplify it to get
\[
\mbox{maximize} \quad b^T z - \log \sum_{k=1}^n y_k e^{a_k^T z}
\]
($a_k$ is the $k$th column of $A$).

Solution. The Lagrangian is
\[
L(x, z, \mu) = \sum_k x_k \log(x_k/y_k) + b^T z - z^T Ax + \mu - \mu \mathbf{1}^T x.
\]
Minimizing over $x_k$ gives the conditions
\[
1 + \log(x_k/y_k) - a_k^T z - \mu = 0, \quad k = 1, \ldots, n,
\]
with solution
\[
x_k = y_k e^{a_k^T z + \mu - 1}.
\]
Plugging this into $L$ gives the Lagrange dual function
\[
g(z, \mu) = b^T z + \mu - \sum_{k=1}^n y_k e^{a_k^T z + \mu - 1}
\]
and the dual problem
\[
\mbox{maximize} \quad b^T z + \mu - \sum_{k=1}^n y_k e^{a_k^T z + \mu - 1}.
\]
This can be simplified a bit if we optimize over $\mu$ by setting the derivative equal to zero:
\[
\mu = 1 - \log \sum_{k=1}^n y_k e^{a_k^T z}.
\]
With this value of $\mu$ the sum in the dual objective equals one, so the objective becomes $b^T z + \mu - 1$, and
the dual problem reduces to the problem in the assignment.
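The simplified dual can be sanity-checked numerically. With the optimal multiplier for the constraint $\mathbf{1}^T x = 1$ substituted, the Lagrangian minimizer is $x_k = y_k e^{a_k^T z} / \sum_j y_j e^{a_j^T z}$, and at that point $\sum_k x_k \log(x_k/y_k) + z^T(b - Ax)$ should equal $b^T z - \log \sum_k y_k e^{a_k^T z}$. The data below are made up and the function names are illustrative assumptions.

```python
import math


def column_products(A, z):
    """[a_k^T z for each column a_k of A], with A given as a list of rows."""
    return [sum(A[i][k] * z[i] for i in range(len(z))) for k in range(len(A[0]))]


def dual_objective(A, b, y, z):
    """b^T z - log sum_k y_k exp(a_k^T z)."""
    s = column_products(A, z)
    total = sum(yk * math.exp(sk) for yk, sk in zip(y, s))
    return sum(bi * zi for bi, zi in zip(b, z)) - math.log(total)


def lagrangian_minimizer(A, y, z):
    """x_k proportional to y_k exp(a_k^T z), normalized to sum to one."""
    weights = [yk * math.exp(sk) for yk, sk in zip(y, column_products(A, z))]
    total = sum(weights)
    return [w / total for w in weights]


def partial_lagrangian(A, b, y, z, x):
    """sum_k x_k log(x_k/y_k) + z^T (b - A x), for x with 1^T x = 1."""
    entropy = sum(xk * math.log(xk / yk) for xk, yk in zip(x, y))
    Ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]
    return entropy + sum(zi * (bi - axi) for zi, bi, axi in zip(z, b, Ax))


A = [[1.0, 2.0, 0.5], [0.0, -1.0, 1.5]]   # hypothetical 2 x 3 data
b = [0.7, 0.2]
y = [0.2, 0.5, 0.3]
z = [0.3, -0.4]

x = lagrangian_minimizer(A, y, z)
gap = partial_lagrangian(A, b, y, z, x) - dual_objective(A, b, y, z)
uniform_value = partial_lagrangian(A, b, y, z, [1.0 / 3] * 3)
print(sum(x), gap, uniform_value - dual_objective(A, b, y, z))
```

Any other point of the simplex, such as the uniform vector, gives a value at least as large, as expected for the minimizer of the Lagrangian.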

4.4 Source localization from range measurements. [?] A signal emitted by a source at an unknown
position $x \in \mathbf{R}^n$ ($n = 2$ or $n = 3$) is received by $m$ sensors at known positions $y_1, \ldots, y_m \in \mathbf{R}^n$.
From the strength of the received signals, we can obtain noisy estimates $d_k$ of the distances $\|x - y_k\|_2$.
We are interested in estimating the source position $x$ based on the measured distances $d_k$.
In the following problem the error between the squares of the actual and observed distances is
minimized:
\[
\mbox{minimize} \quad f_0(x) = \sum_{k=1}^m \left( \|x - y_k\|_2^2 - d_k^2 \right)^2 .
\]
Introducing a new variable $t = x^T x$, we can express this as
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^m \left( t - 2y_k^T x + \|y_k\|_2^2 - d_k^2 \right)^2 \\
\mbox{subject to} & x^T x - t = 0.
\end{array}
\qquad (9)
\]
The variables are $x \in \mathbf{R}^n$, $t \in \mathbf{R}$. Although this problem is not convex, it can be shown that
strong duality holds. (It is a variation on the problem discussed on page 229 and in exercise 5.29
of Convex Optimization.)
Solve (9) for an example with $m = 5$,
\[
y_1 = \begin{bmatrix} 1.8 \\ 2.5 \end{bmatrix}, \quad
y_2 = \begin{bmatrix} 2.0 \\ 1.7 \end{bmatrix}, \quad
y_3 = \begin{bmatrix} 1.5 \\ 1.5 \end{bmatrix}, \quad
y_4 = \begin{bmatrix} 1.5 \\ 2.0 \end{bmatrix}, \quad
y_5 = \begin{bmatrix} 2.5 \\ 1.5 \end{bmatrix},
\]
and
\[
d = (2.00, 1.24, 0.59, 1.31, 1.44).
\]
The figure shows some contour lines of the cost function $f_0$, with the positions $y_k$ indicated by
circles.

[Figure: contour lines of the cost function $f_0$ in the plane, with the sensor positions $y_k$ marked by circles.]

To solve the problem, you can note that $x^\star$ is easily obtained from the KKT conditions for (9) if
the optimal multiplier $\nu^\star$ for the equality constraint is known. You can use one of the following
two methods to find $\nu^\star$.
• Derive the dual problem, express it as an SDP, and solve it using CVX.
• Reduce the KKT conditions to a nonlinear equation in $\nu$, and pick the correct solution
(similarly as in exercise 5.29 of Convex Optimization).
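Solution. Define
\[
A = \begin{bmatrix} -2y_1^T & 1 \\ -2y_2^T & 1 \\ \vdots & \vdots \\ -2y_5^T & 1 \end{bmatrix}, \quad
b = \begin{bmatrix} d_1^2 - \|y_1\|_2^2 \\ d_2^2 - \|y_2\|_2^2 \\ \vdots \\ d_5^2 - \|y_5\|_2^2 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
f = \begin{bmatrix} 0 \\ 0 \\ -1/2 \end{bmatrix}
\]
and $z = (x_1, x_2, t)$. With this notation, the problem is
\[
\begin{array}{ll}
\mbox{minimize} & \|Az - b\|_2^2 \\
\mbox{subject to} & z^T Cz + 2f^T z = 0.
\end{array}
\]
The Lagrangian is
\[
L(z, \nu) = z^T (A^T A + \nu C)z - 2(A^T b - \nu f)^T z + \|b\|_2^2,
\]
which is bounded below as a function of $z$ only if
\[
A^T A + \nu C \succeq 0, \qquad A^T b - \nu f \in \mathcal{R}(A^T A + \nu C).
\]
The KKT conditions are therefore as follows.
• Primal feasibility.
\[
z^T Cz + 2f^T z = 0.
\]
• Dual feasibility.
\[
A^T A + \nu C \succeq 0, \qquad A^T b - \nu f \in \mathcal{R}(A^T A + \nu C).
\]
• Gradient of Lagrangian is zero.
\[
(A^T A + \nu C)z = A^T b - \nu f.
\]
(Note that this implies the range condition for dual feasibility.)

Method 1. We derive the dual problem. If $\nu$ is feasible, then
\[
g(\nu) = -(A^T b - \nu f)^T (A^T A + \nu C)^\dagger (A^T b - \nu f) + \|b\|_2^2,
\]
so the dual problem can be expressed as an SDP
\[
\begin{array}{ll}
\mbox{maximize} & -t + \|b\|_2^2 \\
\mbox{subject to} & \begin{bmatrix} A^T A + \nu C & A^T b - \nu f \\ (A^T b - \nu f)^T & t \end{bmatrix} \succeq 0.
\end{array}
\]
Solving this in CVX gives $\nu^\star = 0.5896$. From $\nu^\star$, we get
\[
z^\star = (A^T A + \nu^\star C)^{-1} (A^T b - \nu^\star f) = (1.33, 0.64, 2.18).
\]
Hence $x^\star = (1.33, 0.64)$.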
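Given the reported multiplier $\nu^\star = 0.5896$, the last step is just a $3 \times 3$ linear solve, which can be reproduced without any optimization package. The sketch below uses a small hand-written Gaussian elimination (the helper `solve_linear_system` is an assumption of this example, not part of the original solution) and then checks primal feasibility $x^T x = t$.

```python
def solve_linear_system(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


ys = [(1.8, 2.5), (2.0, 1.7), (1.5, 1.5), (1.5, 2.0), (2.5, 1.5)]
d = [2.00, 1.24, 0.59, 1.31, 1.44]
nu = 0.5896  # optimal multiplier reported in the solution

A = [[-2 * y1, -2 * y2, 1.0] for y1, y2 in ys]
b = [dk ** 2 - (y1 ** 2 + y2 ** 2) for (y1, y2), dk in zip(ys, d)]
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
f = [0.0, 0.0, -0.5]

AtA = [[sum(row[i] * row[j] for row in A) for j in range(3)] for i in range(3)]
Atb = [sum(row[i] * bk for row, bk in zip(A, b)) for i in range(3)]
M = [[AtA[i][j] + nu * C[i][j] for j in range(3)] for i in range(3)]
rhs = [Atb[i] - nu * f[i] for i in range(3)]

x1, x2, t = solve_linear_system(M, rhs)
print(x1, x2, t, x1 ** 2 + x2 ** 2 - t)
```

The residual $x^T x - t$ should be close to zero, which is the primal feasibility condition and confirms that this $\nu$ is the right root.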

103
Method 2. Alternatively, we can solve the KKT equations directly. To simplify the equations,
we make a change of variables
\[ w = Q^T L^T z \]
where $L$ is the Cholesky factor in the factorization $A^T A = L L^T$, and $Q$ is the matrix of eigenvectors
of $L^{-1} C L^{-T} = Q \Lambda Q^T$. This transforms the KKT equations to
\[ w^T \Lambda w + 2 g^T w = 0, \qquad I + \nu \Lambda \succeq 0, \qquad (I + \nu \Lambda) w = h - \nu g \]
where
\[ g = Q^T L^{-1} f, \qquad h = Q^T L^{-1} A^T b. \]
We can eliminate $w$ from the last equation in the KKT conditions to obtain an equation in $\nu$:
\[
r(\nu) = \sum_{k=1}^{n+1} \left( \frac{\lambda_k (h_k - \nu g_k)^2}{(1 + \nu \lambda_k)^2} + \frac{2 g_k (h_k - \nu g_k)}{1 + \nu \lambda_k} \right) = 0.
\]
In our example, the eigenvalues are
\[ \lambda_1 = 0.5104, \qquad \lambda_2 = 0.2735, \qquad \lambda_3 = 0. \]

The figure shows the function r on two different scales.

[Figure: two plots of $r(\nu)$ against $\nu$, one over $-50 \le \nu \le 50$ and one zoomed in on $-6 \le \nu \le 2$; the vertical axis runs from $-10$ to $100$ in both.]

The correct solution of $r(\nu) = 0$ is the one that satisfies $1 + \nu \lambda_k \ge 0$ for $k = 1, 2, 3$, i.e., the solution
to the right of the two singularities. This solution can be determined using Newton's method by
repeating the iteration
\[ \nu := \nu - \frac{r(\nu)}{r'(\nu)} \]
a few times, starting at a value close to the solution. This gives $\nu^\star = 0.5896$. From $\nu^\star$, we determine
$x^\star$ as in the first method.
The last figure shows the contour lines and the optimal $x^\star$.

[Figure: contour lines of $f_0$ in the $(x, y)$ plane, with the optimal point $x^\star$ marked.]

4.5 Projection on the $\ell_1$ ball. Consider the problem of projecting a point $a \in \mathbf{R}^n$ on the unit ball in
$\ell_1$-norm:
\[
\begin{array}{ll}
\mbox{minimize} & (1/2)\|x - a\|_2^2 \\
\mbox{subject to} & \|x\|_1 \le 1.
\end{array}
\]
Derive the dual problem and describe an efficient method for solving it. Explain how you can
obtain the optimal $x$ from the solution of the dual problem.
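Solution. The Lagrangian is
\[
L(x, \lambda) = \frac{1}{2}\|x - a\|_2^2 + \lambda \|x\|_1 - \lambda
= \sum_{k=1}^n \left( \frac{(x_k - a_k)^2}{2} + \lambda |x_k| \right) - \lambda.
\]
This is easy to minimize over $x$ because it is separable:
\[
g_k(\lambda) = \inf_{x_k} \left( \frac{(x_k - a_k)^2}{2} + \lambda |x_k| \right)
= \left\{ \begin{array}{ll}
-\lambda^2/2 + \lambda |a_k| & \lambda \le |a_k| \\
a_k^2/2 & \lambda > |a_k|.
\end{array} \right.
\]
This is a differentiable concave function with derivative
\[ g_k'(\lambda) = \max\{|a_k| - \lambda, 0\}. \]
The dual problem is
\[
\begin{array}{ll}
\mbox{maximize} & g(\lambda) = \sum_k g_k(\lambda) - \lambda \\
\mbox{subject to} & \lambda \ge 0.
\end{array}
\]
$g$ is differentiable and concave, with derivative
\[ g'(\lambda) = \sum_{k=1}^n \max\{|a_k| - \lambda, 0\} - 1. \]
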
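The derivative decreases from $\|a\|_1 - 1$ at the origin to $-1$ for $\lambda \ge \max_k |a_k|$. If $\|a\|_1 \le 1$, then $g$ is
nonincreasing on $\mathbf{R}_+$, the optimal $\lambda$ is zero, and the optimal $x$ is $x = a$.
If $\|a\|_1 > 1$, we can find the optimal $\lambda$ by solving the piecewise-linear equation
\[ \sum_{k=1}^n \max\{|a_k| - \lambda, 0\} = 1. \]
From the optimal $\lambda$, we obtain the optimal $x$ as follows:
\[
x_k = \left\{ \begin{array}{ll}
0 & \lambda \ge |a_k| \\
a_k - \lambda & \lambda < |a_k|, \; a_k > 0 \\
a_k + \lambda & \lambda < |a_k|, \; a_k < 0.
\end{array} \right.
\]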
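
The method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `project_l1_ball` is hypothetical, and the piecewise-linear equation is solved by sorting the values $|a_k|$ (one of several possible ways; bisection on $\lambda$ would also work).

```python
import math

def project_l1_ball(a):
    """Project a onto the unit l1 ball; returns (x, lam).

    Hypothetical helper: solves sum_k max(|a_k| - lam, 0) = 1 by sorting.
    """
    if sum(abs(v) for v in a) <= 1.0:
        return list(a), 0.0          # a is already in the ball, lam = 0
    u = sorted((abs(v) for v in a), reverse=True)
    cumsum, lam = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        cand = (cumsum - 1.0) / j    # root if exactly the j largest terms are active
        if uj > cand:                # consistent: the j-th term is indeed active
            lam = cand
    # soft-threshold each entry by the optimal multiplier
    x = [math.copysign(max(abs(v) - lam, 0.0), v) for v in a]
    return x, lam
```

For example, for $a = (0.8, -0.6, 0.1)$ the two largest entries are active, $\lambda = (0.8 + 0.6 - 1)/2 = 0.2$, and the projection is $(0.6, -0.4, 0)$.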

4.6 A nonconvex problem with strong duality. On page 229 of Convex Optimization, we consider the
problem
\[
\begin{array}{ll}
\mbox{minimize} & f(x) = x^T A x + 2 b^T x \\
\mbox{subject to} & x^T x \le 1
\end{array}
\tag{10}
\]
with variable $x \in \mathbf{R}^n$, and data $A \in \mathbf{S}^n$, $b \in \mathbf{R}^n$. We do not assume that $A$ is positive semidefinite,
and therefore the problem is not necessarily convex. In this exercise we show that $x$ is (globally)
optimal if and only if there exists a $\lambda$ such that
\[
\|x\|_2 \le 1, \quad \lambda \ge 0, \quad A + \lambda I \succeq 0, \quad (A + \lambda I) x = -b, \quad \lambda (1 - \|x\|_2^2) = 0. \tag{11}
\]
From this we will develop an efficient method for finding the global solution. The conditions (11)
are the KKT conditions for (10) with the inequality $A + \lambda I \succeq 0$ added.

(a) Show that if $x$ and $\lambda$ satisfy (11), then $f(x) = \inf_{\tilde x} L(\tilde x, \lambda) = g(\lambda)$, where $L$ is the Lagrangian
    of the problem and $g$ is the dual function. Therefore strong duality holds, and $x$ is globally
    optimal.
(b) Next we show that the conditions (11) are also necessary. Assume that $x$ is globally optimal
    for (10). We distinguish two cases.
    (i) $\|x\|_2 < 1$. Show that (11) holds with $\lambda = 0$.
    (ii) $\|x\|_2 = 1$. First prove that $(A + \lambda I) x = -b$ for some $\lambda \ge 0$. (In other words, the negative
         gradient $-2(Ax + b)$ of the objective function is normal to the unit sphere at $x$, and points
         away from the origin.) You can show this by contradiction: if the condition does not
         hold, then there exists a direction $v$ with $v^T x < 0$ and $v^T (Ax + b) < 0$. Show that
         $f(x + tv) < f(x)$ for small positive $t$.
         It remains to show that $A + \lambda I \succeq 0$. If not, there exists a $w$ with $w^T (A + \lambda I) w < 0$, and
         without loss of generality we can assume that $w^T x \ne 0$. Show that the point $y = x + tw$
         with $t = -2 w^T x / w^T w$ satisfies $\|y\|_2 = 1$ and $f(y) < f(x)$.
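(c) The optimality conditions (11) can be used to derive a simple algorithm for (10). Using the
    eigenvalue decomposition $A = \sum_{i=1}^n \alpha_i q_i q_i^T$ of $A$, we make a change of variables $y_i = q_i^T x$,
    and write (10) as
    \[
    \begin{array}{ll}
    \mbox{minimize} & \sum_{i=1}^n \alpha_i y_i^2 + 2 \sum_{i=1}^n \beta_i y_i \\
    \mbox{subject to} & y^T y \le 1
    \end{array}
    \]
    where $\beta_i = q_i^T b$. The transformed optimality conditions (11) are
    \[
    \|y\|_2 \le 1, \quad \lambda \ge \max\{0, -\alpha_n\}, \quad (\alpha_i + \lambda) y_i = -\beta_i, \; i = 1, \ldots, n, \quad \lambda (1 - \|y\|_2) = 0,
    \]
    if we assume that $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$. Give an algorithm for computing the solution $y$ and $\lambda$.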

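Solution.
(a) Suppose $x$ and $\lambda$ satisfy the conditions. Then
    \[
    \begin{aligned}
    f(x) &= f(x) + \lambda (\|x\|_2^2 - 1) \\
         &= x^T (A + \lambda I) x + 2 b^T x - \lambda \\
         &= \inf_{\tilde x} \left( \tilde x^T (A + \lambda I) \tilde x + 2 b^T \tilde x \right) - \lambda \\
         &= g(\lambda).
    \end{aligned}
    \]
    The first line follows from complementary slackness ($\lambda (1 - \|x\|_2^2) = 0$). Line 3 follows from
    the fact that the Lagrangian $L(\tilde x, \lambda)$ is convex in $\tilde x$ if $A + \lambda I \succeq 0$, and therefore $x$ minimizes
    the Lagrangian if $\nabla_{\tilde x} L(x, \lambda) = 2((A + \lambda I) x + b) = 0$.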
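(b) (i) If $\|x\|_2 < 1$ and $x$ is globally optimal, then $x$ is the unconstrained minimum of $x^T A x + 2 b^T x$.
        Therefore $f$ must be convex ($A \succeq 0$) and the gradient $2(Ax + b)$ must be zero.
    (ii) Suppose there exists a $v$ with $(Ax + b)^T v < 0$ and $v^T x < 0$. For small positive $t$ we have
         $\|x + tv\|_2 < 1$ and
         \[
         \begin{aligned}
         f(x + tv) &= x^T A x + 2 b^T x + t^2 v^T A v + 2 t v^T (Ax + b) \\
                   &< x^T A x + 2 b^T x.
         \end{aligned}
         \]
         Therefore $x$ is not even locally optimal.
         For the second part, we first note that
         \[
         y^T y = \left(x - 2 \frac{w^T x}{w^T w} w\right)^T \left(x - 2 \frac{w^T x}{w^T w} w\right) = x^T x = 1.
         \]
         Also, using $b = -(A + \lambda I) x$ and $y^T y = 1$,
         \[
         \begin{aligned}
         y^T A y + 2 b^T y &= y^T (A + \lambda I) y - 2 x^T (A + \lambda I) y - \lambda \\
                           &= (y - x)^T (A + \lambda I)(y - x) - x^T (A + \lambda I) x - \lambda \\
                           &< -x^T (A + \lambda I) x - \lambda \\
                           &= x^T (A + \lambda I) x + 2 b^T x - \lambda \\
                           &= x^T A x + 2 b^T x.
         \end{aligned}
         \]
         The inequality on line 3 follows from $w^T (A + \lambda I) w < 0$, since $y - x = tw$ with $t \ne 0$.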
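(c) We assume that the eigenvalues are sorted in nonincreasing order with
    \[ \alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_{k-1} > \alpha_k = \alpha_{k+1} = \cdots = \alpha_n. \]
    In other words the smallest eigenvalue has multiplicity $n - k + 1$ where $k \ge 1$. For $\lambda > -\alpha_n$,
    define
    \[ \phi(\lambda) = \sum_{i=1}^n \frac{\beta_i^2}{(\alpha_i + \lambda)^2}. \]
    We extend the function to $\lambda = -\alpha_n$ using the interpretation $a/0 = 0$ if $a = 0$ and $a/0 = \infty$ if
    $a > 0$. In other words, at $\lambda = -\alpha_n$ we define
    \[ \phi(\lambda) = \sum_{i=1}^{k-1} \frac{\beta_i^2}{(\alpha_i + \lambda)^2} \]
    if $\beta_k = \cdots = \beta_n = 0$ and as $\phi(\lambda) = +\infty$ otherwise. We also define $\bar\lambda = \max\{0, -\alpha_n\}$. Now
    distinguish three cases.
    • $\phi(\bar\lambda) \ge 1$. (In other words $\phi(\bar\lambda) = +\infty$ or $\phi(\bar\lambda) \in [1, \infty)$.) Since $\phi(\lambda)$ decreases
      monotonically to zero as $\lambda \to \infty$, there is a unique $\lambda \ge \bar\lambda$ that satisfies the nonlinear equation
      $\phi(\lambda) = 1$. This equation is sometimes called the secular equation and is easily solved for
      $\lambda$ (using Newton's method or the bisection method). From the solution $\lambda$, determine $y$ as
      \[ y_i = -\frac{\beta_i}{\alpha_i + \lambda}, \quad i = 1, \ldots, n \]
      (with the interpretation $0/0 = 0$ if $\lambda$ happens to be equal to $-\alpha_n$). It can be verified that
      $y$ and $\lambda$ satisfy the optimality conditions and $\|y\|_2 = 1$.
    • $\phi(\bar\lambda) < 1$ and $\alpha_n \le 0$. This implies $\beta_k = \cdots = \beta_n = 0$ and $\bar\lambda = -\alpha_n$. Choose $\lambda = -\alpha_n = \bar\lambda$
      and
      \[
      y_i = \left\{ \begin{array}{ll}
      -\beta_i / (\alpha_i + \lambda) & i = 1, \ldots, k-1 \\
      (1 - \phi(\lambda))^{1/2} & i = k \\
      0 & i > k.
      \end{array} \right.
      \]
      With this choice the optimality conditions are satisfied with $\|y\|_2 = 1$.
    • $\phi(\bar\lambda) < 1$ and $\alpha_n > 0$. This implies $\bar\lambda = 0$. We take $\lambda = 0$ and
      \[ y_i = -\frac{\beta_i}{\alpha_i}, \quad i = 1, \ldots, n. \]
      With this choice the optimality conditions are satisfied with $\|y\|_2 < 1$.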
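
The three cases can be sketched as a short routine operating directly on the eigenvalues and the transformed linear term. This is a sketch under stated assumptions, not the authors' code: the name `solve_diagonal_problem` is hypothetical, the inputs `alpha` (sorted in nonincreasing order) and `beta` are plain Python lists, exact floating-point zeros are used to detect the degenerate denominators, and bisection is used for the secular equation.

```python
import math

def solve_diagonal_problem(alpha, beta, iters=200):
    """Return (y, lam) for: minimize sum(alpha_i y_i^2) + 2 sum(beta_i y_i), y'y <= 1.

    Assumes alpha is sorted in nonincreasing order (hypothetical helper).
    """
    a_min = alpha[-1]
    lam_bar = max(0.0, -a_min)

    def phi(lam):
        # sum of beta_i^2 / (alpha_i + lam)^2, with a/0 = 0 if a == 0, else +inf
        total = 0.0
        for a, b in zip(alpha, beta):
            d = a + lam
            if d == 0.0:
                if b != 0.0:
                    return math.inf
            else:
                total += (b / d) ** 2
        return total

    if phi(lam_bar) >= 1.0:
        # Case 1: solve the secular equation phi(lam) = 1 by bisection.
        lo, hi = lam_bar, lam_bar + 1.0
        while phi(hi) > 1.0:
            hi = lam_bar + 2.0 * (hi - lam_bar)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if phi(mid) > 1.0:
                lo = mid
            else:
                hi = mid
        lam = hi
        y = [-b / (a + lam) if a + lam != 0.0 else 0.0 for a, b in zip(alpha, beta)]
        return y, lam

    if a_min <= 0.0:
        # Case 2: lam = -alpha_n; put the remaining norm on the first zero direction.
        lam = lam_bar
        slack, placed, y = math.sqrt(1.0 - phi(lam)), False, []
        for a, b in zip(alpha, beta):
            if a + lam != 0.0:
                y.append(-b / (a + lam))
            elif not placed:
                y.append(slack)
                placed = True
            else:
                y.append(0.0)
        return y, lam

    # Case 3: the unconstrained minimizer is strictly inside the ball.
    return [-b / a for a, b in zip(alpha, beta)], 0.0
```

For instance, with `alpha = [1, -1]` and `beta = [1, 0]` the second case applies: $\lambda = 1$ and $y = (-1/2, \sqrt{3}/2)$, which has unit norm.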

4.7 Connection between perturbed optimal cost and Lagrange dual functions. In this exercise we explore
the connection between the optimal cost, as a function of perturbations to the righthand sides of
the constraints,
\[ p^\star(u) = \inf\{ f_0(x) \mid \exists x \in \mathcal{D}, \; f_i(x) \le u_i, \; i = 1, \ldots, m \}, \]
(as in §5.6), and the Lagrange dual function
\[ g(\lambda) = \inf_x \left( f_0(x) + \lambda_1 f_1(x) + \cdots + \lambda_m f_m(x) \right), \]
with domain restricted to $\lambda \succeq 0$. We assume the problem is convex. We consider a problem with
inequality constraints only, for simplicity.
We have seen several connections between $p^\star$ and $g$:

• Slater's condition and strong duality. Slater's condition is: there exists $u \prec 0$ for which
  $p^\star(u) < \infty$. Strong duality (which follows) is: $p^\star(0) = \sup_\lambda g(\lambda)$. (Note that we include the
  condition $\lambda \succeq 0$ in the domain of $g$.)
• A global inequality. We have $p^\star(u) \ge p^\star(0) - \lambda^{\star T} u$, for any $u$, where $\lambda^\star$ maximizes $g$.
• Local sensitivity analysis. If $p^\star$ is differentiable at $0$, then we have $\nabla p^\star(0) = -\lambda^\star$, where $\lambda^\star$
  maximizes $g$.

In fact the two functions are closely related by conjugation. Show that
\[ p^\star(u) = (-g)^*(-u). \]
Here $(-g)^*$ is the conjugate of the function $-g$. You can show this for $u \in \mathop{\bf int} \mathop{\bf dom} p^\star$.
Hint. Consider the problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & \tilde f_i(x) = f_i(x) - u_i \le 0, \quad i = 1, \ldots, m.
\end{array}
\]
Verify that Slater's condition holds for this problem, for $u \in \mathop{\bf int} \mathop{\bf dom} p^\star$.
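Solution. Suppose $u \in \mathop{\bf int} \mathop{\bf dom} p^\star$. This means that there exists $\tilde u \prec u$ with $\tilde u \in \mathop{\bf dom} p^\star$, i.e.,
for which $p^\star(\tilde u) < \infty$. This in turn means that there exists $\tilde x \in \mathop{\bf dom} f_0$ with $f_i(\tilde x) \le \tilde u_i$.
Following the hint, consider the problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & \tilde f_i(x) = f_i(x) - u_i \le 0, \quad i = 1, \ldots, m.
\end{array}
\]
Its optimal value is $p^\star(u)$, and its dual function is
\[
\tilde g(\lambda) = \inf_x \left( f_0(x) + \lambda_1 (f_1(x) - u_1) + \cdots + \lambda_m (f_m(x) - u_m) \right) = g(\lambda) - \lambda^T u.
\]
Since $\tilde x$ satisfies $f_i(\tilde x) \le \tilde u_i < u_i$, it is strictly feasible for the problem above, so Slater's condition
holds. Therefore we have
\[ p^\star(u) = \sup_\lambda \left( g(\lambda) - \lambda^T u \right). \]
We rewrite this as
\[ p^\star(u) = \sup_\lambda \left( \lambda^T (-u) - (-g(\lambda)) \right) = (-g)^*(-u). \]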

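4.8 Exact penalty method for SDP. Consider the pair of primal and dual SDPs
\[
\mbox{(P)} \quad
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & F(x) \preceq 0
\end{array}
\qquad
\mbox{(D)} \quad
\begin{array}{ll}
\mbox{maximize} & \mathop{\bf tr}(F_0 Z) \\
\mbox{subject to} & \mathop{\bf tr}(F_i Z) + c_i = 0, \quad i = 1, \ldots, n \\
& Z \succeq 0,
\end{array}
\]
where $F(x) = F_0 + x_1 F_1 + \cdots + x_n F_n$ and $F_i \in \mathbf{S}^p$ for $i = 0, \ldots, n$. Let $Z^\star$ be a solution of (D).
Show that every solution $x^\star$ of the unconstrained problem
\[ \mbox{minimize} \quad c^T x + M \max\{0, \lambda_{\max}(F(x))\}, \]
where $M > \mathop{\bf tr} Z^\star$, is a solution of (P).
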
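Solution. The unconstrained problem can be written as an SDP
\[
\begin{array}{ll}
\mbox{minimize} & c^T x + M t \\
\mbox{subject to} & F(x) \preceq t I \\
& t \ge 0.
\end{array}
\]
The dual of this problem is
\[
\begin{array}{ll}
\mbox{maximize} & \mathop{\bf tr}(F_0 Z) \\
\mbox{subject to} & \mathop{\bf tr}(F_i Z) + c_i = 0, \quad i = 1, \ldots, n \\
& \mathop{\bf tr} Z + s = M \\
& Z \succeq 0, \quad s \ge 0.
\end{array}
\]
Every $Z$ that is feasible for this problem is also feasible for (D), so its optimal value is at most
that of (D). We see that $Z^\star$ is feasible in this problem, with $s = M - \mathop{\bf tr} Z^\star > 0$, and is therefore
optimal. By complementary slackness ($st = 0$) this means that $t = 0$ at the optimum of the primal
problem, so $x^\star$ satisfies $F(x^\star) \preceq 0$ and is optimal for (P).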

4.9 Quadratic penalty. Consider the problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m,
\end{array}
\tag{12}
\]
where the functions $f_i : \mathbf{R}^n \to \mathbf{R}$ are differentiable and convex.
Show that
\[ \phi(x) = f_0(x) + \alpha \sum_{i=1}^m \max\{0, f_i(x)\}^2, \]
where $\alpha > 0$, is convex. Suppose $\tilde x$ minimizes $\phi$. Show how to find from $\tilde x$ a feasible point for the
dual of (12). Find the corresponding lower bound on the optimal value of (12).
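Solution.

(a) The function $\max\{0, f_i(x)\}^2$ is convex because it is the composition of a convex nondecreasing
    function $h(t) = \max\{0, t\}^2$ with a convex function $f_i$.
(b) The function $h(t) = \max\{0, t\}^2$ is differentiable with $h'(t) = 2 \max\{0, t\}$. Therefore $\tilde x$
    satisfies
    \[ \nabla \phi(\tilde x) = \nabla f_0(\tilde x) + \sum_{i=1}^m \tilde\lambda_i \nabla f_i(\tilde x) = 0 \]
    where
    \[
    \tilde\lambda_i = 2 \alpha \max\{0, f_i(\tilde x)\} = \left\{ \begin{array}{ll}
    0 & f_i(\tilde x) \le 0 \\
    2 \alpha f_i(\tilde x) & f_i(\tilde x) > 0.
    \end{array} \right.
    \]
    Since $\tilde\lambda \succeq 0$ and $\tilde x$ minimizes the (convex) Lagrangian $L(x, \tilde\lambda)$, $\tilde\lambda$ is dual feasible.
(c) Let $g(\lambda) = \inf_x (f_0(x) + \sum_{i=1}^m \lambda_i f_i(x))$ be the dual function of (12). The lower bound
    corresponding to $\tilde\lambda$ is
    \[
    \begin{aligned}
    g(\tilde\lambda) &= \inf_x \Big( f_0(x) + \sum_{i=1}^m \tilde\lambda_i f_i(x) \Big) \\
    &= f_0(\tilde x) + \sum_{i=1}^m \tilde\lambda_i f_i(\tilde x) \\
    &= f_0(\tilde x) + 2 \alpha \sum_{i=1}^m \max\{0, f_i(\tilde x)\} f_i(\tilde x) \\
    &= f_0(\tilde x) + 2 \alpha \sum_{i=1}^m \max\{0, f_i(\tilde x)\}^2.
    \end{aligned}
    \]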
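
As a small sanity check of the bound, consider the scalar instance (an assumption chosen here for illustration, not taken from the exercise) of minimizing $x^2$ subject to $1 - x \le 0$, for which $p^\star = 1$ and $g(\lambda) = \lambda - \lambda^2/4$. The penalty minimizer is available in closed form, $\tilde x = \alpha/(1+\alpha)$; the function name below is hypothetical.

```python
def quadratic_penalty_bound(alpha):
    """Toy instance: minimize x^2 subject to 1 - x <= 0 (optimal value 1).

    Returns (x_tilde, lam_tilde, bound) for penalty weight alpha > 0.
    """
    # minimizer of x^2 + alpha*max(0, 1 - x)^2: solve 2x - 2*alpha*(1 - x) = 0
    x_tilde = alpha / (1.0 + alpha)
    f1 = 1.0 - x_tilde                      # constraint value (positive: infeasible)
    lam_tilde = 2.0 * alpha * max(0.0, f1)  # dual feasible point from part (b)
    bound = x_tilde ** 2 + lam_tilde * f1   # g(lam_tilde), as in part (c)
    return x_tilde, lam_tilde, bound
```

With `alpha = 1` this gives $\tilde x = 0.5$, $\tilde\lambda = 1$ and the lower bound $0.75 \le p^\star = 1$; the bound tightens as `alpha` grows.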

4.10 Binary least-squares. We consider the non-convex least-squares approximation problem with binary
constraints
\[
\begin{array}{ll}
\mbox{minimize} & \|Ax - b\|_2^2 \\
\mbox{subject to} & x_k^2 = 1, \quad k = 1, \ldots, n,
\end{array}
\tag{13}
\]
where $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$. We assume that $\mathop{\bf rank}(A) = n$, i.e., $A^T A$ is nonsingular.
One possible application of this problem is as follows. A signal $\hat x \in \{-1, 1\}^n$ is sent over a noisy
channel, and received as $b = A \hat x + v$ where $v \sim \mathcal{N}(0, \sigma^2 I)$ is Gaussian noise. The solution of (13)
is the maximum likelihood estimate of the input signal $\hat x$, based on the received signal $b$.

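(a) Derive the Lagrange dual of (13) and express it as an SDP.
(b) Derive the dual of the SDP in part (a) and show that it is equivalent to
    \[
    \begin{array}{ll}
    \mbox{minimize} & \mathop{\bf tr}(A^T A Z) - 2 b^T A z + b^T b \\
    \mbox{subject to} & \mathop{\bf diag}(Z) = \mathbf{1} \\
    & \begin{bmatrix} Z & z \\ z^T & 1 \end{bmatrix} \succeq 0.
    \end{array}
    \tag{14}
    \]
    Interpret this problem as a relaxation of (13). Show that if
    \[ \mathop{\bf rank} \begin{bmatrix} Z & z \\ z^T & 1 \end{bmatrix} = 1 \tag{15} \]
    at the optimum of (14), then the relaxation is exact, i.e., the optimal values of problems (13)
    and (14) are equal, and the optimal solution $z$ of (14) is optimal for (13). This suggests a
    heuristic for rounding the solution of the SDP (14) to a feasible solution of (13), if (15) does
    not hold. We compute the eigenvalue decomposition
    \[
    \begin{bmatrix} Z & z \\ z^T & 1 \end{bmatrix}
    = \sum_{i=1}^{n+1} \lambda_i \begin{bmatrix} v_i \\ t_i \end{bmatrix} \begin{bmatrix} v_i \\ t_i \end{bmatrix}^T,
    \]
    where $v_i \in \mathbf{R}^n$ and $t_i \in \mathbf{R}$, and approximate the matrix by a rank-one matrix
    \[
    \begin{bmatrix} Z & z \\ z^T & 1 \end{bmatrix}
    \approx \lambda_1 \begin{bmatrix} v_1 \\ t_1 \end{bmatrix} \begin{bmatrix} v_1 \\ t_1 \end{bmatrix}^T.
    \]
    (Here we assume the eigenvalues are sorted in decreasing order.) Then we take $x = \mathop{\bf sign}(v_1)$
    as our guess of a good solution of (13).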
(c) We can also give a probabilistic interpretation of the relaxation (14). Suppose we interpret $z$
    and $Z$ as the first and second moments of a random vector $v \in \mathbf{R}^n$ (i.e., $z = \mathop{\bf E} v$, $Z = \mathop{\bf E} v v^T$).
    Show that (14) is equivalent to the problem
    \[
    \begin{array}{ll}
    \mbox{minimize} & \mathop{\bf E} \|Av - b\|_2^2 \\
    \mbox{subject to} & \mathop{\bf E} v_k^2 = 1, \quad k = 1, \ldots, n,
    \end{array}
    \]
    where we minimize over all possible probability distributions of $v$.
    This interpretation suggests another heuristic method for computing suboptimal solutions
    of (13) based on the result of (14). We choose a distribution with first and second moments
    $\mathop{\bf E} v = z$, $\mathop{\bf E} v v^T = Z$ (for example, the Gaussian distribution $\mathcal{N}(z, Z - z z^T)$). We generate a
    number of samples $\tilde v$ from the distribution and round them to feasible solutions $x = \mathop{\bf sign}(\tilde v)$.
    We keep the solution with the lowest objective value as our guess of the optimal solution
    of (13).
(d) Solve the dual problem (14) using CVX. Generate problem instances using the Matlab code
        randn('state',0)
        m = 50;
        n = 40;
        A = randn(m,n);
        xhat = sign(randn(n,1));
        b = A*xhat + s*randn(m,1);
    for four values of the noise level s: s = 0.5, s = 1, s = 2, s = 3. For each problem instance,
    compute suboptimal feasible solutions $x$ using the following heuristics and compare the
    results.
    (i) $x^{(a)} = \mathop{\bf sign}(x_{\rm ls})$ where $x_{\rm ls}$ is the solution of the least-squares problem
        \[ \mbox{minimize} \quad \|Ax - b\|_2^2. \]
    (ii) $x^{(b)} = \mathop{\bf sign}(z)$ where $z$ is the optimal value of the variable $z$ in the SDP (14).
    (iii) $x^{(c)}$ is computed from a rank-one approximation of the optimal solution of (14), as
          explained in part (b) above.
    (iv) $x^{(d)}$ is computed by rounding 100 samples of $\mathcal{N}(z, Z - z z^T)$, as explained in part (c) above.

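Solution.

(a) The Lagrangian is
    \[ L(x, \nu) = x^T (A^T A + \mathop{\bf diag}(\nu)) x - 2 b^T A x + b^T b - \mathbf{1}^T \nu. \]
    The dual function is
    \[
    g(\nu) = \left\{ \begin{array}{ll}
    -\mathbf{1}^T \nu - b^T A (A^T A + \mathop{\bf diag}(\nu))^\dagger A^T b + b^T b & A^T A + \mathop{\bf diag}(\nu) \succeq 0, \;\; A^T b \in \mathcal{R}(A^T A + \mathop{\bf diag}(\nu)) \\
    -\infty & \mbox{otherwise.}
    \end{array} \right.
    \]
    Using Schur complements we can express the dual as an SDP
    \[
    \begin{array}{ll}
    \mbox{maximize} & -\mathbf{1}^T \nu - t + b^T b \\
    \mbox{subject to} & \begin{bmatrix} A^T A + \mathop{\bf diag}(\nu) & A^T b \\ b^T A & t \end{bmatrix} \succeq 0.
    \end{array}
    \]
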
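(b) We first write the problem as a minimization problem
    \[
    \begin{array}{ll}
    \mbox{minimize} & \mathbf{1}^T \nu + t - b^T b \\
    \mbox{subject to} & \begin{bmatrix} A^T A + \mathop{\bf diag}(\nu) & A^T b \\ b^T A & t \end{bmatrix} \succeq 0.
    \end{array}
    \]
    We introduce a Lagrange multiplier
    \[ \begin{bmatrix} Z & -z \\ -z^T & \lambda \end{bmatrix} \succeq 0 \]
    for the constraint and form the Lagrangian
    \[
    \begin{aligned}
    L(\nu, t, Z, z, \lambda) &= \mathbf{1}^T \nu + t - b^T b - \mathop{\bf tr}(Z (A^T A + \mathop{\bf diag}(\nu))) + 2 z^T A^T b - t \lambda \\
    &= (\mathbf{1} - \mathop{\bf diag}(Z))^T \nu + t (1 - \lambda) - b^T b - \mathop{\bf tr}(Z A^T A) + 2 z^T A^T b.
    \end{aligned}
    \]
    This is unbounded below unless $\mathop{\bf diag}(Z) = \mathbf{1}$ and $\lambda = 1$. The multiplier is positive semidefinite
    if and only if the same matrix with $z$ in place of $-z$ is. The dual problem of the SDP in
    part (a) is therefore
    \[
    \begin{array}{ll}
    \mbox{minimize} & \mathop{\bf tr}(A^T A Z) - 2 b^T A z + b^T b \\
    \mbox{subject to} & \mathop{\bf diag}(Z) = \mathbf{1} \\
    & \begin{bmatrix} Z & z \\ z^T & 1 \end{bmatrix} \succeq 0.
    \end{array}
    \]
    To see that this is a relaxation of the original problem we note that the binary LS problem is
    equivalent to
    \[
    \begin{array}{ll}
    \mbox{minimize} & \mathop{\bf tr}(A^T A Z) - 2 b^T A z + b^T b \\
    \mbox{subject to} & \mathop{\bf diag}(Z) = \mathbf{1} \\
    & Z = z z^T.
    \end{array}
    \]
    In the relaxation we replace $Z = z z^T$ by the weaker constraint $Z \succeq z z^T$, which is equivalent
    to
    \[ \begin{bmatrix} Z & z \\ z^T & 1 \end{bmatrix} \succeq 0. \]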
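(c) We have
    \[
    \begin{aligned}
    \mathop{\bf E} \|Av - b\|_2^2 &= \mathop{\bf E} (v^T A^T A v - 2 b^T A v + b^T b) \\
    &= \mathop{\bf tr}(\mathop{\bf E}(v v^T) A^T A) - 2 b^T A \mathop{\bf E} v + b^T b \\
    &= \mathop{\bf tr}(Z A^T A) - 2 b^T A z + b^T b, \\
    \mathop{\bf E} v_k^2 &= Z_{kk}.
    \end{aligned}
    \]
(d) The Matlab code for solving the SDP is as follows.

        cvx_begin sdp
            variable z(n);
            variable Z(n,n) symmetric;
            minimize( trace(A'*A*Z) - 2*b'*A*z + b'*b )
            subject to
                [Z, z; z' 1] >= 0
                diag(Z) == 1;
        cvx_end
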
    The table lists, for each s, the values of
    \[ f(x) = \|Ax - b\|_2 \]
    for $x = \hat x$ and the four approximations, and also the lower bound on the optimal value of
    \[
    \begin{array}{ll}
    \mbox{minimize} & \|Ax - b\|_2 \\
    \mbox{subject to} & x_k^2 = 1, \quad k = 1, \ldots, n,
    \end{array}
    \]
    obtained by the SDP relaxation.

        s     f(xhat)   f(x(a))   f(x(b))   f(x(c))   f(x(d))   lower bound
        0.5    4.1623    4.1623    4.1623    4.1623    4.1623    4.0524
        1.0    8.3245   12.7299    8.3245    8.3245    8.3245    7.8678
        2.0   16.6490   30.1419   16.6490   16.6490   16.6490   15.1804
        3.0   24.9735   33.9339   25.9555   25.9555   24.9735   22.1139

    • For s = 0.5, all heuristics return $\hat x$. This is likely to be the global optimum, but that is
      not necessarily true. However, from the lower bound we know that the global optimum
      is in the interval [4.0524, 4.1623], so even if 4.1623 is not the global optimum, it is quite
      close.
    • For higher values of s, the result from the first heuristic ($x^{(a)}$) is substantially worse than
      the SDP heuristics.
    • All three SDP heuristics return $\hat x$ for s = 1 and s = 2.
    • For s = 3, the randomized rounding method returns $\hat x$. The other SDP heuristics give
      slightly higher values.
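
The randomized rounding heuristic of part (c) can be sketched without any solver, given a pair $(z, Z)$ that is assumed to come from the relaxation (14). This is an illustration only: the function names are hypothetical, plain Python lists stand in for matrices, and a small jitter is added in the Cholesky factorization because $Z - zz^T$ may be singular.

```python
import math
import random

def cholesky(S, jitter=1e-9):
    """Lower-triangular factor of a PSD matrix (list of lists), with a small jitter."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(max(s, 0.0) + jitter)
            else:
                L[i][j] = s / L[j][j]
    return L

def objective(A, b, x):
    """Squared residual norm ||Ax - b||_2^2."""
    return sum((sum(Ai[j] * x[j] for j in range(len(x))) - bi) ** 2
               for Ai, bi in zip(A, b))

def randomized_rounding(A, b, z, Z, num_samples=100, seed=0):
    """Sample v ~ N(z, Z - z z^T), round to sign(v), keep the best candidate."""
    rng = random.Random(seed)
    n = len(z)
    cov = [[Z[i][j] - z[i] * z[j] for j in range(n)] for i in range(n)]
    L = cholesky(cov)
    best_x, best_val = None, math.inf
    for _ in range(num_samples):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        v = [z[i] + sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        x = [1.0 if vi >= 0 else -1.0 for vi in v]
        val = objective(A, b, x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

Every candidate is feasible for (13) by construction, so the returned objective value is an upper bound on its optimal value, to be compared with the lower bound from the SDP.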

4.11 Monotone transformation of the objective. Consider the optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m,
\end{array}
\tag{16}
\]
where $f_i : \mathbf{R}^n \to \mathbf{R}$ for $i = 0, 1, \ldots, m$ are convex. Suppose $\phi : \mathbf{R} \to \mathbf{R}$ is increasing and convex.
Then the problem
\[
\begin{array}{ll}
\mbox{minimize} & \tilde f_0(x) = \phi(f_0(x)) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m
\end{array}
\tag{17}
\]
is convex and equivalent to it; in fact, it has the same optimal set as (16).
In this problem we explore the connections between the duals of the two problems (16) and (17).
We assume $f_i$ are differentiable, and to make things specific, we take $\phi(a) = \exp a$.

(a) Suppose $\lambda$ is feasible for the dual of (16), and $\bar x$ minimizes
    \[ f_0(x) + \sum_{i=1}^m \lambda_i f_i(x). \]
    Show that $\bar x$ also minimizes
    \[ \exp f_0(x) + \sum_{i=1}^m \tilde\lambda_i f_i(x) \]
    for appropriate choice of $\tilde\lambda$. Thus, $\tilde\lambda$ is dual feasible for (17).
(b) Let $p^\star$ denote the optimal value of (16) (so the optimal value of (17) is $\exp p^\star$). From $\lambda$ we
    obtain the bound
    \[ p^\star \ge g(\lambda), \]
    where $g$ is the dual function for (16). From $\tilde\lambda$ we obtain the bound $\exp p^\star \ge \tilde g(\tilde\lambda)$,
    where $\tilde g$ is the dual function for (17). This can be expressed as
    \[ p^\star \ge \log \tilde g(\tilde\lambda). \]
    How do these bounds compare? Are they the same, or is one better than the other?

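Solution.

(a) Since $\bar x$ minimizes $f_0(x) + \sum_{i=1}^m \lambda_i f_i(x)$ we have
    \[ \nabla f_0(\bar x) + \sum_{i=1}^m \lambda_i \nabla f_i(\bar x) = 0. \tag{18} \]
    But
    \[
    \begin{aligned}
    \nabla_x \Big[ \exp f_0(x) + \sum_{i=1}^m \tilde\lambda_i f_i(x) \Big]_{x = \bar x}
    &= \exp f_0(\bar x) \nabla f_0(\bar x) + \sum_{i=1}^m \tilde\lambda_i \nabla f_i(\bar x) \\
    &= \exp f_0(\bar x) \Big[ \nabla f_0(\bar x) + \sum_{i=1}^m \tilde\lambda_i e^{-f_0(\bar x)} \nabla f_i(\bar x) \Big].
    \end{aligned}
    \]
    Clearly, by comparing this equation to (18), if we take $\tilde\lambda_i = \exp f_0(\bar x) \, \lambda_i \ge 0$, then
    \[ \nabla_x \Big[ \exp f_0(x) + \sum_{i=1}^m \tilde\lambda_i f_i(x) \Big]_{x = \bar x} = 0, \]
    or $\tilde\lambda_i = \exp f_0(\bar x) \, \lambda_i$ is dual feasible for the modified problem.
(b) We have $\tilde g(\tilde\lambda) = e^{f_0(\bar x)} + \sum_{i=1}^m e^{f_0(\bar x)} \lambda_i f_i(\bar x)$ and therefore
    \[
    \begin{aligned}
    \log \tilde g(\tilde\lambda) &= \log \Big( e^{f_0(\bar x)} + \sum_{i=1}^m e^{f_0(\bar x)} \lambda_i f_i(\bar x) \Big) \\
    &= \log \Big[ e^{f_0(\bar x)} \Big( 1 + \sum_{i=1}^m \lambda_i f_i(\bar x) \Big) \Big] \\
    &= f_0(\bar x) + \log \Big( 1 + \sum_{i=1}^m \lambda_i f_i(\bar x) \Big).
    \end{aligned}
    \]
    Now we show that the bound we get from the modified problem is always worse, i.e., $\log \tilde g(\tilde\lambda) \le g(\lambda)$.
    This follows immediately from the inequality $\log(1 + y) - y \le 0$, since $g(\lambda) = f_0(\bar x) + \sum_{i=1}^m \lambda_i f_i(\bar x)$
    and therefore
    \[
    \log \tilde g(\tilde\lambda) - g(\lambda) = \log \Big( 1 + \sum_{i=1}^m \lambda_i f_i(\bar x) \Big) - \sum_{i=1}^m \lambda_i f_i(\bar x) \le 0,
    \]
    which is true by taking $y = \sum_{i=1}^m \lambda_i f_i(\bar x)$.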
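
The comparison of the two bounds can be checked numerically on a scalar instance. The instance (minimize $x^2$ subject to $1 - x \le 0$, with $p^\star = 1$) and the function name are assumptions made for this illustration; for it, $\bar x = \lambda/2$ minimizes the Lagrangian of (16), and the sketch requires $1 + \lambda(1 - \bar x) > 0$ so that the logarithm is defined.

```python
import math

def compare_bounds(lam):
    """Toy instance: f0(x) = x^2, f1(x) = 1 - x, so p* = 1.

    Returns (g(lam), log g_tilde(lam_tilde)) for a dual feasible lam >= 0.
    """
    x_bar = lam / 2.0                                  # minimizes x^2 + lam*(1 - x)
    g = x_bar ** 2 + lam * (1.0 - x_bar)               # bound from problem (16)
    lam_tilde = math.exp(x_bar ** 2) * lam             # multiplier from part (a)
    g_tilde = math.exp(x_bar ** 2) + lam_tilde * (1.0 - x_bar)
    return g, math.log(g_tilde)                        # second value bounds p* too
```

For `lam = 1` this returns $g(\lambda) = 0.75$ and $\log \tilde g(\tilde\lambda) = 0.25 + \log 1.5 \approx 0.655$, so the bound from the transformed problem is indeed the weaker one.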

4.12 Variable bounds and dual feasibility. In many problems the constraints include variable bounds, as
in
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m \\
& l_i \le x_i \le u_i, \quad i = 1, \ldots, n.
\end{array}
\tag{19}
\]
Let $\mu \in \mathbf{R}^n_+$ be the Lagrange multipliers associated with the constraints $x_i \le u_i$, and let $\nu \in \mathbf{R}^n_+$
be the Lagrange multipliers associated with the constraints $l_i \le x_i$. Thus the Lagrangian is
\[ L(x, \lambda, \mu, \nu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \mu^T (x - u) + \nu^T (l - x). \]

(a) Show that for any $x \in \mathbf{R}^n$ and any $\lambda$, we can choose $\mu \succeq 0$ and $\nu \succeq 0$ so that $x$ minimizes
    $L(x, \lambda, \mu, \nu)$. In particular, it is very easy to find dual feasible points.
(b) Construct a dual feasible point $(\lambda, \mu, \nu)$ by applying the method you found in part (a) with
    $x = (l + u)/2$ and $\lambda = 0$. From this dual feasible point you get a lower bound on $f^\star$. Show
    that this lower bound can be expressed as
    \[ f^\star \ge f_0((l + u)/2) - ((u - l)/2)^T |\nabla f_0((l + u)/2)| \]
    where $|\cdot|$ means componentwise. Can you prove this bound directly?

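Solution.

(a) Suppose $\lambda \in \mathbf{R}^m$ ($\lambda \succeq 0$) is given. We show that we can always find $\mu \succeq 0$ and $\nu \succeq 0$ such
    that $(\lambda, \mu, \nu)$ is dual feasible. As a matter of fact, given any $x \in \mathbf{R}^n$ and $\lambda \in \mathbf{R}^m$ we can find
    $\mu \succeq 0$ and $\nu \succeq 0$ so that $x$ minimizes $L$. We have
    \[ \nabla_x L(x, \lambda, \mu, \nu) = \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) + (\mu - \nu). \]
    If $x$ minimizes $L$, we have $\nabla_x L = 0$ and therefore
    \[ \nu - \mu = \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x). \tag{20} \]
    If we can choose $\mu \succeq 0$ and $\nu \succeq 0$ such that (20) holds for any $x$ and $\lambda$ we are done. But
    this is easy, since any vector can be written as the difference of two componentwise nonnegative
    vectors. Simply take $\nu$ and $\mu$ as the positive and negative parts of $\nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x)$
    respectively, i.e.,
    \[
    \nu = \Big[ \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) \Big]^+
    = \frac{1}{2} \Big( \Big| \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) \Big| + \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) \Big)
    \]
    and
    \[
    \mu = \Big[ \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) \Big]^-
    = \frac{1}{2} \Big( \Big| \nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) \Big| - \nabla f_0(x) - \sum_{i=1}^m \lambda_i \nabla f_i(x) \Big)
    \]
    where $|\cdot|$ is componentwise. Thus, $x$ minimizes $L$ for any given $\lambda \succeq 0$ and some $\mu \succeq 0$, $\nu \succeq 0$.
    Therefore, if $\lambda \succeq 0$ then $(\lambda, \mu, \nu)$ is dual feasible.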
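(b) In this case, writing $\bar x = (l + u)/2$,
    \[
    \mu = -\frac{1}{2} \left( \nabla f_0(\bar x) - |\nabla f_0(\bar x)| \right), \qquad
    \nu = \frac{1}{2} \left( \nabla f_0(\bar x) + |\nabla f_0(\bar x)| \right)
    \]
    and therefore the lower bound becomes
    \[
    \begin{aligned}
    L(\bar x, 0, \mu, \nu) &= f_0(\bar x) - \frac{1}{2} \left( \nabla f_0(\bar x) - |\nabla f_0(\bar x)| \right)^T (\bar x - u)
    - \frac{1}{2} \left( \nabla f_0(\bar x) + |\nabla f_0(\bar x)| \right)^T (\bar x - l) \\
    &= f_0(\bar x) - \Big( \frac{u - l}{2} \Big)^T |\nabla f_0(\bar x)|.
    \end{aligned}
    \]
    This bound can also be derived directly. Since $f_0$ is convex,
    \[
    \begin{aligned}
    f^\star &\ge f_0(\bar x) + \nabla f_0(\bar x)^T (x^\star - \bar x) \\
    &\ge f_0(\bar x) + \inf_{l \preceq x \preceq u} \nabla f_0(\bar x)^T (x - \bar x).
    \end{aligned}
    \]
    But $\inf_{l \preceq x \preceq u} \nabla f_0(\bar x)^T (x - \bar x)$ can be simply found by setting $x_i - (l_i + u_i)/2$
    to its maximum value (i.e., $(u_i - l_i)/2$) if $\nabla f_0(\bar x)_i \le 0$, and by setting $x_i - (l_i + u_i)/2$
    to its minimum value (i.e., $(l_i - u_i)/2$) if $\nabla f_0(\bar x)_i > 0$. Therefore,
    $\inf_{l \preceq x \preceq u} \nabla f_0(\bar x)^T (x - \bar x) = -|\nabla f_0(\bar x)|^T (u - l)/2$ and we get
    \[ f^\star \ge f_0((u + l)/2) - |\nabla f_0((u + l)/2)|^T (u - l)/2. \]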
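
The bound is cheap to evaluate. The sketch below does so for a user-supplied objective and gradient; the function name and the quadratic test instance $f_0(x) = \|x - c\|_2^2$ are assumptions made for illustration, and the other constraints $f_i$ are ignored (which corresponds to $\lambda = 0$).

```python
def box_lower_bound(f0, grad_f0, l, u):
    """Lower bound f0(mid) - ((u - l)/2)^T |grad f0(mid)| with mid = (l + u)/2."""
    mid = [(li + ui) / 2.0 for li, ui in zip(l, u)]
    g = grad_f0(mid)
    return f0(mid) - sum((ui - li) / 2.0 * abs(gi) for li, ui, gi in zip(l, u, g))

# Illustrative instance: f0(x) = ||x - c||^2 over the box 0 <= x <= 1.
c = [2.0, -1.0]
f0 = lambda x: sum((xi - ci) ** 2 for xi, ci in zip(x, c))
grad_f0 = lambda x: [2.0 * (xi - ci) for xi, ci in zip(x, c)]
bound = box_lower_bound(f0, grad_f0, [0.0, 0.0], [1.0, 1.0])
```

Here the midpoint is $(0.5, 0.5)$ with $f_0 = 4.5$ and gradient $(-3, 3)$, so the bound is $4.5 - 1.5 - 1.5 = 1.5$, while the true optimal value (attained at $x = (1, 0)$) is $2$.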

4.13 Deducing costs from samples of optimal decision. A system (such as a firm or an organism) chooses
a vector of values $x$ as a solution of the LP
\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \succeq b,
\end{array}
\]
with variable $x \in \mathbf{R}^n$. You can think of $x \in \mathbf{R}^n$ as a vector of activity levels, $b \in \mathbf{R}^m$ as a
vector of requirements, and $c \in \mathbf{R}^n$ as a vector of costs or prices for the activities. With this
interpretation, the LP above finds the cheapest set of activity levels that meet all requirements.
(This interpretation is not needed to solve the problem.)
We suppose that $A$ is known, along with a set of data
\[ (b^{(1)}, x^{(1)}), \; \ldots, \; (b^{(r)}, x^{(r)}), \]
where $x^{(j)}$ is an optimal point for the LP, with $b = b^{(j)}$. (The solution of an LP need not be unique;
all we say here is that $x^{(j)}$ is an optimal solution.) Roughly speaking, we have samples of optimal
decisions, for different values of requirements.
You do not know the cost vector $c$. Your job is to compute the tightest possible bounds on the
costs $c_i$ from the given data. More specifically, you are to find $c_i^{\max}$ and $c_i^{\min}$, the maximum and
minimum possible values for $c_i$, consistent with the given data.
Note that if $x$ is optimal for the LP for a given $c$, then it is also optimal if $c$ is scaled by any positive
factor. To normalize $c$, then, we will assume that $c_1 = 1$. Thus, we can interpret $c_i$ as the relative
cost of activity $i$, compared to activity 1.

(a) Explain how to find $c_i^{\max}$ and $c_i^{\min}$. Your method can involve the solution of a reasonable
    number (not exponential in $n$, $m$ or $r$) of convex or quasiconvex optimization problems.
(b) Carry out your method using the data found in deducing_costs_data.m. You may need
    to determine whether individual inequality constraints are tight; to do so, use a tolerance
    threshold of $\epsilon = 10^{-3}$. (In other words: if $a_k^T x - b_k \le 10^{-3}$, you can consider this inequality
    as tight.)
    Give the values of $c_i^{\max}$ and $c_i^{\min}$, and make a very brief comment on the results.

Solution. A feasible point $x$ is optimal for the LP if and only if there exists a $\lambda \succeq 0$ with $c = A^T \lambda$
and $\lambda_k (a_k^T x - b_k) = 0$, $k = 1, \ldots, m$, where $a_k^T$ is the $k$th row of $A$. So all we know about $x^{(j)}$ is
that there exists a $\lambda^{(j)} \succeq 0$ with $c = A^T \lambda^{(j)}$ and $\lambda_k^{(j)} (a_k^T x^{(j)} - b_k^{(j)}) = 0$, $k = 1, \ldots, m$. But we don't
know the vectors $\lambda^{(j)}$.
We can express this as follows: $c$ is a possible cost vector if and only if there exist $\lambda^{(j)}$, $j = 1, \ldots, r$,
such that
\[
\begin{array}{l}
\lambda^{(j)} \succeq 0, \quad j = 1, \ldots, r \\
c = A^T \lambda^{(j)}, \quad j = 1, \ldots, r \\
\lambda_k^{(j)} (a_k^T x^{(j)} - b_k^{(j)}) = 0, \quad j = 1, \ldots, r, \quad k = 1, \ldots, m.
\end{array}
\]
Here we know $a_k$, $x^{(j)}$, $b^{(j)}$; we don't know $c$ or $\lambda^{(j)}$. But careful examination of these equations and
inequalities shows that they are in fact a set of linear equalities and inequalities on the unknowns,
$c$ and $\lambda^{(j)}$. So we can minimize (or maximize) over $c_i$ by solving the optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & c_i \\
\mbox{subject to} & \lambda^{(j)} \succeq 0, \quad j = 1, \ldots, r \\
& c = A^T \lambda^{(j)}, \quad j = 1, \ldots, r \\
& \lambda_k^{(j)} (a_k^T x^{(j)} - b_k^{(j)}) = 0, \quad j = 1, \ldots, r, \quad k = 1, \ldots, m \\
& c_1 = 1,
\end{array}
\]
with variables $c$ and $\lambda^{(1)}, \ldots, \lambda^{(r)}$. The solution gives us $c_i^{\min}$. If you look at this carefully, you'll
see that it is an LP: the condition $\lambda_k^{(j)} (a_k^T x^{(j)} - b_k^{(j)}) = 0$ reduces to $\lambda_k^{(j)} \ge 0$ when $a_k^T x^{(j)} = b_k^{(j)}$, and to
$\lambda_k^{(j)} = 0$ when $a_k^T x^{(j)} > b_k^{(j)}$. If we maximize instead of minimize, we get $c_i^{\max}$. Therefore, we solve a set of
$2(n - 1)$ LPs: one to maximize each $c_i$ and one to minimize each $c_i$, for $i = 2, \ldots, n$.

In the problem instance, $m = 10$, $n = 5$, and $r = 10$. Our method requires the solution of 8 LPs.
The following Matlab code solves the problem.

    % This script solves the deducing costs problem.
    deducing_costs_data;

    cmin = ones(n,1);
    cmax = ones(n,1);
    for i = 2:n
        % calculate lower bound cmin(i)
        cvx_begin
            variables c(n) lambdas(m,r)
            minimize(c(i))
            c(1) == 1;
            lambdas >= 0;
            % multipliers of constraints that are not tight must be zero
            lambdas(abs(A*X-B)>=1e-3) == 0;
            c*ones(1,r) == A'*lambdas;
        cvx_end
        cmin(i) = cvx_optval;

        % calculate upper bound cmax(i)
        cvx_begin
            variables c(n) lambdas(m,r)
            maximize(c(i))
            c(1) == 1;
            lambdas >= 0;
            lambdas(abs(A*X-B)>=1e-3) == 0;
            c*ones(1,r) == A'*lambdas;
        cvx_end
        cmax(i) = cvx_optval;
    end

    disp('cmin c_true cmax:')
    [cmin c_true cmax]

The bounds, displayed as [cmin c_true cmax], are

    ans =

        1.0000    1.0000    1.0000
        0.6477    1.3183    1.7335
        0.0609    0.3077    0.4044
        0.0161    1.0190    1.5127
        0.7378    0.8985    1.0723

The tightest bounds are on $c_3$ and $c_5$ and the weakest bound is on $c_4$.

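4.14 Kantorovich inequality.

(a) Suppose $a \in \mathbf{R}^n$ with $a_1 \ge a_2 \ge \cdots \ge a_n > 0$, and $b \in \mathbf{R}^n$ with $b_k = 1/a_k$.
    Derive the KKT conditions for the convex optimization problem
    \[
    \begin{array}{ll}
    \mbox{minimize} & -\log(a^T x) - \log(b^T x) \\
    \mbox{subject to} & x \succeq 0, \quad \mathbf{1}^T x = 1.
    \end{array}
    \]
    Show that $x = (1/2, 0, \ldots, 0, 1/2)$ is optimal.
(b) Suppose $A \in \mathbf{S}^n_{++}$ with eigenvalues $\lambda_k$ sorted in decreasing order. Apply the result of part (a),
    with $a_k = \lambda_k$, to prove the Kantorovich inequality:
    \[
    2 \left( u^T A u \right)^{1/2} \left( u^T A^{-1} u \right)^{1/2} \le \sqrt{\frac{\lambda_1}{\lambda_n}} + \sqrt{\frac{\lambda_n}{\lambda_1}}
    \]
    for all $u$ with $\|u\|_2 = 1$.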

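Solution.

(a) The KKT conditions are
    \[
    \frac{1}{a^T x} a + \frac{1}{b^T x} b \preceq \nu \mathbf{1}, \qquad x \succeq 0, \qquad \mathbf{1}^T x = 1,
    \]
    plus the complementary slackness conditions
    \[
    x_k \left( \nu - \frac{1}{a^T x} a_k - \frac{1}{b^T x} b_k \right) = 0, \quad k = 1, \ldots, n.
    \]
    We show that $x = (1/2, 0, \ldots, 0, 1/2)$, $\nu = 2$ solve these equations, and hence are primal and
    dual optimal.
    The feasibility conditions $x \succeq 0$, $\mathbf{1}^T x = 1$ obviously hold, and the complementary slackness
    conditions are trivially satisfied for $k = 2, \ldots, n - 1$. It remains to verify the inequality
    \[ \frac{a_k}{a^T x} + \frac{b_k}{b^T x} \le \nu, \quad k = 1, \ldots, n, \tag{21} \]
    and the complementary slackness condition
    \[ x_k \left( \nu - \frac{1}{a^T x} a_k - \frac{1}{b^T x} b_k \right) = 0, \quad k = 1, n. \tag{22} \]
    For $x = (1/2, 0, \ldots, 0, 1/2)$, $\nu = 2$ the inequality (21) holds with equality for $k = 1$ and $k = n$,
    since
    \[
    \frac{a_1}{a^T x} + \frac{b_1}{b^T x} = \frac{2 a_1}{a_1 + a_n} + \frac{2/a_1}{1/a_1 + 1/a_n} = 2,
    \]
    and
    \[
    \frac{a_n}{a^T x} + \frac{b_n}{b^T x} = \frac{2 a_n}{a_1 + a_n} + \frac{2/a_n}{1/a_1 + 1/a_n} = 2.
    \]
    Therefore also (22) is satisfied. The remaining inequalities in (21) reduce to
    \[
    \frac{a_k}{a^T x} + \frac{b_k}{b^T x} = 2 \, \frac{a_k + a_1 a_n / a_k}{a_1 + a_n} \le 2, \quad k = 2, \ldots, n - 1.
    \]
    This is valid, since it holds with equality for $a_k = a_1$ and $a_k = a_n$, and the function $t + a_1 a_n / t$ is
    convex in $t$, so
    \[ \frac{t + a_1 a_n / t}{a_1 + a_n} \le 1 \]
    for all $t \in [a_n, a_1]$.
(b) Diagonalize $A$ using its eigenvalue decomposition $A = Q \Lambda Q^T$, and define $a_k = \lambda_k$, $b_k = 1/\lambda_k$,
    $x_k = (Q^T u)_k^2$. Then $x \succeq 0$, $\mathbf{1}^T x = 1$, $u^T A u = a^T x$ and $u^T A^{-1} u = b^T x$. From part (a), the product
    $(a^T x)(b^T x)$ is maximized by $x = (1/2, 0, \ldots, 0, 1/2)$, i.e., $Q^T u = (1/\sqrt{2}, 0, \ldots, 0, 1/\sqrt{2})$. Therefore,
    \[
    \begin{aligned}
    (u^T A u)(u^T A^{-1} u) &\le \frac{1}{4} (\lambda_1 + \lambda_n)(\lambda_1^{-1} + \lambda_n^{-1}) \\
    &= \frac{1}{4} \left( \sqrt{\frac{\lambda_1}{\lambda_n}} + \sqrt{\frac{\lambda_n}{\lambda_1}} \right)^2.
    \end{aligned}
    \]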
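
A quick numerical check of the inequality, as a sketch: it assumes (without loss of generality) that $A$ is already diagonal, so that `eigs` holds its eigenvalues and `u` holds the coordinates of the vector in the eigenbasis. The function name is hypothetical.

```python
import math

def kantorovich_sides(eigs, u):
    """Return (lhs, rhs) of the Kantorovich inequality for A = diag(eigs), eigs > 0."""
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]                      # normalize to ||u||_2 = 1
    uAu = sum(lam * ui * ui for lam, ui in zip(eigs, u))
    uAinvu = sum(ui * ui / lam for lam, ui in zip(eigs, u))
    lhs = 2.0 * math.sqrt(uAu * uAinvu)
    lam_max, lam_min = max(eigs), min(eigs)
    rhs = math.sqrt(lam_max / lam_min) + math.sqrt(lam_min / lam_max)
    return lhs, rhs
```

With eigenvalues $(4, 2, 1)$, the vector $u \propto (1, 1, 1)$ gives a left-hand side of about $2.333$ against a right-hand side of $2.5$, and $u \propto (1, 0, 1)$ attains equality, in agreement with part (a).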

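4.15 State and solve the optimality conditions for the problem
\[
\begin{array}{ll}
\mbox{minimize} & \log \det \begin{bmatrix} X_1 & X_2 \\ X_2^T & X_3 \end{bmatrix}^{-1} \\
\mbox{subject to} & \mathop{\bf tr} X_1 = \alpha \\
& \mathop{\bf tr} X_2 = \beta \\
& \mathop{\bf tr} X_3 = \gamma.
\end{array}
\]
The optimization variable is
\[ X = \begin{bmatrix} X_1 & X_2 \\ X_2^T & X_3 \end{bmatrix}, \]
with $X_1 \in \mathbf{S}^n$, $X_2 \in \mathbf{R}^{n \times n}$, $X_3 \in \mathbf{S}^n$. The domain of the objective function is $\mathbf{S}^{2n}_{++}$. We assume
$\alpha > 0$, and $\alpha \gamma > \beta^2$.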
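Solution. This is a convex problem with three equality constraints
\[
\begin{array}{ll}
\mbox{minimize} & f_0(X) \\
\mbox{subject to} & h_1(X) = \alpha \\
& h_2(X) = \beta \\
& h_3(X) = \gamma,
\end{array}
\]
where $f_0(X) = -\log \det X$ and
\[
h_1(X) = \mathop{\bf tr} \left( \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} X \right), \quad
h_2(X) = (1/2) \mathop{\bf tr} \left( \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix} X \right), \quad
h_3(X) = \mathop{\bf tr} \left( \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} X \right).
\]
The general optimality condition for an equality constrained problem,
\[ \nabla f_0(X) + \sum_{i=1}^3 \nu_i \nabla h_i(X) = 0, \]
reduces to
\[
-X^{-1} + \nu_1 \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} + (\nu_2/2) \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix} + \nu_3 \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} = 0,
\]
along with the feasibility conditions $\mathop{\bf tr} X_1 = \alpha$, $\mathop{\bf tr} X_2 = \beta$, $\mathop{\bf tr} X_3 = \gamma$.
From the first condition
\[
X = \begin{bmatrix} \nu_1 I & (\nu_2/2) I \\ (\nu_2/2) I & \nu_3 I \end{bmatrix}^{-1}
= \begin{bmatrix} \lambda_1 I & \lambda_2 I \\ \lambda_2 I & \lambda_3 I \end{bmatrix}
\]
where
\[
\begin{bmatrix} \lambda_1 & \lambda_2 \\ \lambda_2 & \lambda_3 \end{bmatrix}
= \begin{bmatrix} \nu_1 & \nu_2/2 \\ \nu_2/2 & \nu_3 \end{bmatrix}^{-1}.
\]
From the feasibility conditions we see that we have to choose $\nu_i$ (and hence $\lambda_i$), such that
\[ \lambda_1 = \alpha/n, \qquad \lambda_2 = \beta/n, \qquad \lambda_3 = \gamma/n. \]
We conclude that the optimal solution is
\[ X = (1/n) \begin{bmatrix} \alpha I & \beta I \\ \beta I & \gamma I \end{bmatrix}. \]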

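4.16 Consider the optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & -\log \det X + \mathop{\bf tr}(SX) \\
\mbox{subject to} & X \mbox{ is tridiagonal}
\end{array}
\]
with domain $\mathbf{S}^n_{++}$ and variable $X \in \mathbf{S}^n$. The matrix $S \in \mathbf{S}^n$ is given. Show that the optimal $X_{\rm opt}$
satisfies
\[ (X_{\rm opt}^{-1})_{ij} = S_{ij}, \quad |i - j| \le 1. \]

Solution. We write the tridiagonal constraint as a set of linear equality constraints:
\[
\begin{array}{ll}
\mbox{minimize} & -\log \det X + \mathop{\bf tr}(SX) \\
\mbox{subject to} & \mathop{\bf tr}((e_i e_j^T + e_j e_i^T) X) = 0, \quad i - j > 1.
\end{array}
\]
The optimality conditions are
\[
X \succ 0, \qquad X \mbox{ tridiagonal}, \qquad X^{-1} = S + \sum_{i - j > 1} \nu_{ij} (e_i e_j^T + e_j e_i^T).
\]
From the last condition we see that $(X^{-1})_{ij} = S_{ij}$ if $|i - j| \le 1$.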

4.17 We denote by $f(A)$ the sum of the largest $r$ eigenvalues of a symmetric matrix $A \in \mathbf{S}^n$ (with
$1 \le r \le n$), i.e.,
\[ f(A) = \sum_{k=1}^r \lambda_k(A), \]
where $\lambda_1(A), \ldots, \lambda_n(A)$ are the eigenvalues of $A$ sorted in decreasing order.

(a) Show that the optimal value of the SDP
    \[
    \begin{array}{ll}
    \mbox{maximize} & \mathop{\bf tr}(AX) \\
    \mbox{subject to} & \mathop{\bf tr} X = r \\
    & 0 \preceq X \preceq I,
    \end{array}
    \]
    with variable $X \in \mathbf{S}^n$, is equal to $f(A)$.
(b) Show that $f$ is a convex function.
(c) Assume $A(x) = A_0 + x_1 A_1 + \cdots + x_m A_m$, with $A_k \in \mathbf{S}^n$. Use the observation in part (a) to
    formulate the optimization problem
    \[ \mbox{minimize} \quad f(A(x)), \]
    with variable $x \in \mathbf{R}^m$, as an SDP.

Solution.

(a) Let $A = V \Lambda V^T = \sum_{k=1}^n \lambda_k v_k v_k^T$ be the eigenvalue decomposition of $A$. If we make a change
    of variables $Y = V^T X V$ the problem becomes
    \[
    \begin{array}{ll}
    \mbox{maximize} & \mathop{\bf tr}(\Lambda Y) \\
    \mbox{subject to} & \mathop{\bf tr} Y = r \\
    & 0 \preceq Y \preceq I.
    \end{array}
    \]
    We can assume $Y$ is diagonal at the optimum. To see this, note that the objective function and
    the first constraint only involve the diagonal elements of $Y$. Moreover, if $Y$ satisfies $0 \preceq Y \preceq I$,
    then its diagonal elements satisfy $0 \le Y_{ii} \le 1$. Therefore if a non-diagonal $Y$ is optimal, then
    setting its off-diagonal elements to zero yields another feasible matrix with the same objective
    value.
    To find the optimal value we can therefore solve
    \[
    \begin{array}{ll}
    \mbox{maximize} & \sum_{i=1}^n \lambda_i Y_{ii} \\
    \mbox{subject to} & \sum_{i=1}^n Y_{ii} = r \\
    & 0 \le Y_{ii} \le 1, \quad i = 1, \ldots, n.
    \end{array}
    \]
    Since the eigenvalues $\lambda_i$ are sorted in nonincreasing order, an optimal solution is $Y_{11} = \cdots =
    Y_{rr} = 1$, $Y_{r+1,r+1} = \cdots = Y_{nn} = 0$. Converting back to the original variables gives
    \[ X = \sum_{k=1}^r v_k v_k^T. \]
(b) Any function of the form $\sup_{X \in C} \mathop{\bf tr}(AX)$ is convex in $A$.


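(c) To derive the dual of the problem in part (a), we first write it as a minimization
    \[
    \begin{array}{ll}
    \mbox{minimize} & -\mathop{\bf tr}(AX) \\
    \mbox{subject to} & \mathop{\bf tr} X = r \\
    & 0 \preceq X \preceq I.
    \end{array}
    \]
    The Lagrangian is
    \[
    \begin{aligned}
    L(X, \nu, U, V) &= -\mathop{\bf tr}(AX) + \nu (\mathop{\bf tr} X - r) - \mathop{\bf tr}(UX) + \mathop{\bf tr}(V(X - I)) \\
    &= \mathop{\bf tr}((-A + \nu I - U + V) X) - r \nu - \mathop{\bf tr} V.
    \end{aligned}
    \]
    By minimizing over $X$ we obtain the dual function
    \[
    g(\nu, U, V) = \left\{ \begin{array}{ll}
    -r \nu - \mathop{\bf tr} V & -A + \nu I - U + V = 0 \\
    -\infty & \mbox{otherwise.}
    \end{array} \right.
    \]
    The dual problem is
    \[
    \begin{array}{ll}
    \mbox{maximize} & -r \nu - \mathop{\bf tr} V \\
    \mbox{subject to} & A - \nu I = V - U \\
    & U \succeq 0, \quad V \succeq 0.
    \end{array}
    \]
    If we change the dual problem to a minimization and eliminate the variable $U$, we obtain a
    dual problem for the SDP in part (a) of the assignment:
    \[
    \begin{array}{ll}
    \mbox{minimize} & r \nu + \mathop{\bf tr} V \\
    \mbox{subject to} & A - \nu I \preceq V \\
    & V \succeq 0.
    \end{array}
    \]
    By strong duality, the optimal value of this problem is equal to $f(A)$. We can therefore
    minimize $f(A(x))$ over $x$ by solving the problem
    \[
    \begin{array}{ll}
    \mbox{minimize} & r \nu + \mathop{\bf tr} V \\
    \mbox{subject to} & A(x) - \nu I \preceq V \\
    & V \succeq 0,
    \end{array}
    \]
    which is an SDP in the variables $\nu \in \mathbf{R}$, $V \in \mathbf{S}^n$, $x \in \mathbf{R}^m$.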
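
The duality between the two problems can be checked by hand when $A$ is diagonal, since then only the eigenvalues matter. In the sketch below (function names are hypothetical), for a fixed $\nu$ the smallest feasible $\mathop{\bf tr} V$ is taken to be $\sum_i \max\{\lambda_i - \nu, 0\}$, which is an assumption that holds for diagonal $A$ with $V$ chosen as the positive part of $A - \nu I$.

```python
def sum_largest(eigs, r):
    """Primal optimal value: sum of the r largest eigenvalues."""
    return sum(sorted(eigs, reverse=True)[:r])

def dual_objective(eigs, r, nu):
    """Dual objective r*nu + tr V with V = (A - nu I)_+ for A = diag(eigs)."""
    return r * nu + sum(max(lam - nu, 0.0) for lam in eigs)

eigs, r = [5.0, 3.0, 2.0, -1.0], 2
primal = sum_largest(eigs, r)                        # 5 + 3
dual_at_lambda_r = dual_objective(eigs, r, 3.0)      # nu set to the r-th largest eigenvalue
```

Choosing $\nu$ equal to the $r$th largest eigenvalue gives a dual value equal to the primal value (here both are $8$), while any other $\nu$ gives a dual value at least as large.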

4.18 An exact penalty function. Suppose we are given a convex problem
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m
\end{array}
\tag{23}
\]
with dual
\[
\begin{array}{ll}
\mbox{maximize} & g(\lambda) \\
\mbox{subject to} & \lambda \succeq 0.
\end{array}
\tag{24}
\]
We assume that Slater's condition holds, so we have strong duality and the dual optimum is
attained. For simplicity we will assume that there is a unique dual optimal solution $\lambda^\star$.
For fixed $t > 0$, consider the unconstrained minimization problem
\[ \mbox{minimize} \quad f_0(x) + t \max_{i=1,\ldots,m} f_i(x)^+, \tag{25} \]
where $f_i(x)^+ = \max\{f_i(x), 0\}$.

(a) Show that the objective function in (25) is convex.
(b) We can express (25) as
    \[
    \begin{array}{ll}
    \mbox{minimize} & f_0(x) + t y \\
    \mbox{subject to} & f_i(x) \le y, \quad i = 1, \ldots, m \\
    & 0 \le y
    \end{array}
    \tag{26}
    \]
    where the variables are $x$ and $y \in \mathbf{R}$.
    Find the Lagrange dual problem of (26) and express it in terms of the Lagrange dual function
    $g$ for problem (23).
(c) Use the result in (b) to prove the following property. If $t > \mathbf{1}^T \lambda^\star$, then any minimizer of (25)
    is also an optimal solution of (23).

(The second term in (25) is called a penalty function for the constraints in (23). It is zero if $x$ is
feasible, and adds a penalty to the cost function when $x$ is infeasible. The penalty function is called
exact because for $t$ large enough, the solution of the unconstrained problem (25) is also a solution
of (23).)
Solution.

(a) The first term is convex. The second term is convex since it can be expressed as

\[
\max_{i=1,\ldots,m} f_i(x)^+ = \max\{f_1(x), \ldots, f_m(x), 0\},
\]

i.e., the pointwise maximum of a number of convex functions.


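(b) The Lagrangian is

\[
L(x, y, \lambda, \mu) = f_0(x) + ty + \sum_{i=1}^m \lambda_i (f_i(x) - y) - \mu y.
\]

The dual function is

\[
\inf_{x,y} L(x, y, \lambda, \mu)
= \inf_{x,y} \left( f_0(x) + ty + \sum_{i=1}^m \lambda_i (f_i(x) - y) - \mu y \right)
= \left\{ \begin{array}{ll}
g(\lambda) & \mathbf{1}^T \lambda + \mu = t \\
-\infty & \mbox{otherwise.}
\end{array} \right.
\]

Therefore the dual of problem (26) is

\[
\begin{array}{ll}
\mbox{maximize} & g(\lambda) \\
\mbox{subject to} & \mathbf{1}^T \lambda + \mu = t \\
& \lambda \succeq 0, \quad \mu \geq 0,
\end{array}
\]

or, equivalently,

\[
\begin{array}{ll}
\mbox{maximize} & g(\lambda) \\
\mbox{subject to} & \mathbf{1}^T \lambda \leq t \\
& \lambda \succeq 0.
\end{array}
\]

(c) If $\mathbf{1}^T \lambda^\star < t$, then $\lambda^\star$ is also optimal for the dual problem derived in part (b), and the
multiplier $\mu = t - \mathbf{1}^T \lambda^\star$ associated with the constraint $y \geq 0$ is positive. By complementary
slackness $y = 0$ in the optimal solution of the primal problem (26), and the optimal $x$ satisfies
$f_i(x) \leq 0$, $i = 1, \ldots, m$, i.e., it is feasible in the original problem (23), and therefore also
optimal.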
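The threshold $t > \mathbf{1}^T \lambda^\star$ can be seen on a one-dimensional instance. The sketch below is only an illustration under assumptions that are not part of the exercise: it takes $f_0(x) = x^2$ and a single constraint $f_1(x) = 1 - x \leq 0$, for which the optimal point is $x^\star = 1$ and the KKT conditions give $\lambda^\star = 2$. The helper `ternary_min` is a hypothetical name for a plain ternary search, which is valid here because the penalized objective is convex.

```python
def ternary_min(f, lo, hi, iters=200):
    """Approximate a minimizer of a convex function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2   # a minimizer lies in [lo, m2]
        else:
            lo = m1   # a minimizer lies in [m1, hi]
    return 0.5 * (lo + hi)


def penalized(t):
    """Objective (25) for the assumed data f0(x) = x^2, f1(x) = 1 - x."""
    return lambda x: x * x + t * max(1.0 - x, 0.0)


# t = 3 exceeds lambda_star = 2: the penalized minimizer is the constrained optimum.
x_large_t = ternary_min(penalized(3.0), -5.0, 5.0)
# t = 1 is below lambda_star: the penalized minimizer is infeasible.
x_small_t = ternary_min(penalized(1.0), -5.0, 5.0)
print(x_large_t, x_small_t)
```

With $t = 3$ the search returns a point close to $x^\star = 1$, while with $t = 1$ it returns a point close to $t/2 = 0.5$, which violates the constraint. This is consistent with the penalty being exact only for sufficiently large $t$.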
4.19 Infimal convolution. Let $f_1, \ldots, f_m$ be convex functions on $\mathbf{R}^n$. Their infimal convolution, denoted
$g = f_1 \diamond \cdots \diamond f_m$ (several other notations are also used), is defined as
\[
g(x) = \inf\{ f_1(x_1) + \cdots + f_m(x_m) \mid x_1 + \cdots + x_m = x \},
\]
with the natural domain (i.e., defined by $g(x) < \infty$). In one simple interpretation, $f_i(x_i)$ is the cost
for the $i$th firm to produce a mix of products given by $x_i$; $g(x)$ is then the optimal cost obtained
if the firms can freely exchange products to produce, all together, the mix given by $x$. (The name
"convolution" presumably comes from the observation that if we replace the sum above with the
product, and the infimum above with integration, then we obtain the normal convolution.)
(a) Show that $g$ is convex.
(b) Show that $g^* = f_1^* + \cdots + f_m^*$. In other words, the conjugate of the infimal convolution is the
sum of the conjugates.
(c) Verify the identity in part (b) for the specific case of two strictly convex quadratic functions,
$f_i(x) = (1/2) x^T P_i x$, with $P_i \in \mathbf{S}^n_{++}$, $i = 1, 2$.
Hint: Depending on how you work out the conjugates, you might find the matrix identity
$(X + Y)^{-1} Y = X^{-1} (X^{-1} + Y^{-1})^{-1}$ useful.
Solution.
(a) We can express $g$ as
\[
g(x) = \inf_{x_1, \ldots, x_m} \left( f_1(x_1) + \cdots + f_m(x_m) + \phi(x_1, \ldots, x_m, x) \right),
\]
where $\phi(x_1, \ldots, x_m, x)$ is $0$ when $x_1 + \cdots + x_m = x$, and $\infty$ otherwise. The function on the
righthand side above is convex in $x_1, \ldots, x_m, x$, so by the partial minimization rule, so is $g$.
(b) We have
\[
\begin{aligned}
g^*(y) &= \sup_x \left( y^T x - g(x) \right) \\
&= \sup_x \left( y^T x - \inf_{x_1 + \cdots + x_m = x} \left( f_1(x_1) + \cdots + f_m(x_m) \right) \right) \\
&= \sup_{x = x_1 + \cdots + x_m} \left( y^T x_1 - f_1(x_1) + \cdots + y^T x_m - f_m(x_m) \right),
\end{aligned}
\]
where we use the fact that $(-\inf S)$ is the same as $(\sup -S)$. The last line means we are to
take the supremum over all $x$ and all $x_1, \ldots, x_m$ that sum to $x$. But this is the same as just
taking the supremum over all $x_1, \ldots, x_m$, so we get
\[
\begin{aligned}
g^*(y) &= \sup_{x_1, \ldots, x_m} \left( y^T x_1 - f_1(x_1) + \cdots + y^T x_m - f_m(x_m) \right) \\
&= \sup_{x_1} \left( y^T x_1 - f_1(x_1) \right) + \cdots + \sup_{x_m} \left( y^T x_m - f_m(x_m) \right) \\
&= f_1^*(y) + \cdots + f_m^*(y).
\end{aligned}
\]
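(c) For $f_i(x) = (1/2) x^T P_i x$, we have $f_i^*(y) = (1/2) y^T P_i^{-1} y$ (see Example 3.22 in the textbook).
Therefore, we have $f_1^*(y) + f_2^*(y) = (1/2) y^T (P_1^{-1} + P_2^{-1}) y$.
On the other hand, to compute $g = f_1 \diamond f_2$, we need to evaluate
\[
g(x) = \inf\{ (1/2) x_1^T P_1 x_1 + (1/2) x_2^T P_2 x_2 \mid x_1 + x_2 = x \}.
\]
This requires solving a linearly constrained convex quadratic problem. With $\nu$ denoting the
multiplier for the constraint $x_1 + x_2 = x$, the optimality condition is
\[
\left[ \begin{array}{ccc} P_1 & 0 & I \\ 0 & P_2 & I \\ I & I & 0 \end{array} \right]
\left[ \begin{array}{c} x_1 \\ x_2 \\ \nu \end{array} \right]
=
\left[ \begin{array}{c} 0 \\ 0 \\ x \end{array} \right].
\]
Applying block elimination, we obtain the equation $-(P_1^{-1} + P_2^{-1}) \nu = x$, and thus we have
\[
x_1^\star = P_1^{-1} (P_1^{-1} + P_2^{-1})^{-1} x, \qquad x_2^\star = P_2^{-1} (P_1^{-1} + P_2^{-1})^{-1} x.
\]
Plugging these values into the expression for $g(x)$ yields
\[
g(x) = (1/2) x^T (P_1^{-1} + P_2^{-1})^{-1} x,
\]
and thus (since it is a quadratic function), we have indeed
\[
g^*(y) = (1/2) y^T (P_1^{-1} + P_2^{-1}) y = f_1^*(y) + f_2^*(y).
\]
As an aside, notice that the quadratic form $g(x)$ corresponds to the (matrix) harmonic mean
of the matrices $P_i$.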
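The closed form for $g$ can be checked numerically in the scalar case $n = 1$, where $P_1 = p_1 > 0$ and $P_2 = p_2 > 0$ and the formula reduces to $g(x) = (1/2) x^2 / (1/p_1 + 1/p_2)$. The following is a minimal sketch, not part of the original solution; the function names and the test values $p_1 = 2$, $p_2 = 3$, $x = 1.5$ are assumptions chosen for illustration, and the infimum is computed by a simple ternary search.

```python
def infimal_convolution(p1, p2, x, iters=200):
    """Numerically evaluate inf over x1 of (1/2) p1 x1^2 + (1/2) p2 (x - x1)^2."""
    def objective(x1):
        return 0.5 * p1 * x1 * x1 + 0.5 * p2 * (x - x1) ** 2

    # The minimizer p2 * x / (p1 + p2) lies between 0 and x, so this bracket is safe.
    lo, hi = -abs(x) - 1.0, abs(x) + 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


def closed_form(p1, p2, x):
    """Scalar version of g(x) = (1/2) x^T (P1^{-1} + P2^{-1})^{-1} x."""
    return 0.5 * x * x / (1.0 / p1 + 1.0 / p2)


g_numeric = infimal_convolution(2.0, 3.0, 1.5)
g_closed = closed_form(2.0, 3.0, 1.5)
print(g_numeric, g_closed)
```

The two printed values should agree to within the accuracy of the search, which illustrates the harmonic-mean structure in the simplest possible setting.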
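4.20 Derive the Lagrange dual of the optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^n \phi(x_i) \\
\mbox{subject to} & Ax = b
\end{array}
\]
with variable $x \in \mathbf{R}^n$, where
\[
\phi(u) = \frac{|u|}{c - |u|} = -1 + \frac{c}{c - |u|}, \qquad \mathbf{dom}\, \phi = (-c, c).
\]
$c$ is a positive parameter. The figure shows $\phi$ for $c = 1$.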
[Figure: plot of $\phi$ for $c = 1$; the horizontal axis runs from $-1$ to $1$ and the vertical axis from $0$ to $5$.]
Solution. The Lagrangian is
\[
L(x, z) = \sum_{i=1}^n \left( \phi(x_i) - x_i (a_i^T z) \right) + b^T z
\]
where $a_i$ is the $i$th column of $A$. The dual function is
\[
\begin{aligned}
g(z) &= b^T z + \sum_{i=1}^n \inf_{x_i} \left( \phi(x_i) - x_i (a_i^T z) \right) \\
&= b^T z + \sum_i h(a_i^T z)
\end{aligned}
\]
where $h(y) = \inf_u (\phi(u) - yu)$. If $|y| \leq 1/c$, the minimizer in the definition of $h$ is $u = 0$ and
$h(y) = 0$. Otherwise, we find the minimum by setting the derivative equal to zero. If $y > 1/c$, we
solve
\[
\phi'(u) = \frac{c}{(c - u)^2} = y.
\]
The solution is $u = c - (c/y)^{1/2}$ and $h(y) = -(1 - \sqrt{cy})^2$. If $y < -1/c$, we solve
\[
\phi'(u) = -\frac{c}{(c + u)^2} = y.
\]
The solution is $u = -c + (-c/y)^{1/2}$ and $h(y) = -(1 - \sqrt{-cy})^2$. Combining the different cases, we
can write
\[
h(y) = \left\{ \begin{array}{ll}
-\left( 1 - \sqrt{c|y|} \right)^2 & |y| > 1/c \\
0 & \mbox{otherwise.}
\end{array} \right.
\]
The dual problem is
\[
\mbox{maximize} \quad b^T z + \sum_i h(a_i^T z).
\]
The figure shows the function $h$ for $c = 1$.

[Figure: plot of $h$ for $c = 1$; the horizontal axis runs from $-3$ to $3$ and the vertical axis from $-0.5$ to $0.1$.]
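The closed-form expression for $h$ can be compared against a direct numerical minimization of $\phi(u) - yu$ over $(-c, c)$. The sketch below is an illustrative check only, written with hypothetical helper names and with the assumed value $c = 1$; it uses a ternary search, which applies because $\phi(u) - yu$ is convex in $u$.

```python
import math


def phi(u, c):
    """phi(u) = |u| / (c - |u|) on the open interval (-c, c)."""
    return abs(u) / (c - abs(u))


def h_closed(y, c):
    """Closed-form h(y) = inf_u (phi(u) - y u) derived in the solution."""
    if abs(y) <= 1.0 / c:
        return 0.0
    return -(1.0 - math.sqrt(c * abs(y))) ** 2


def h_numeric(y, c, iters=200):
    """Numerical value of inf_u (phi(u) - y u) by ternary search."""
    def objective(u):
        return phi(u, c) - y * u

    # Stay strictly inside the domain so that phi is finite.
    lo, hi = -c * (1.0 - 1e-12), c * (1.0 - 1e-12)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


for y in (-3.0, -0.5, 0.5, 2.0):
    print(y, h_closed(y, 1.0), h_numeric(y, 1.0))
```

For $|y| \leq 1$ both values are zero (the minimizer sits at the kink $u = 0$), and for $|y| > 1$ both are negative, matching the shape of the plotted function.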
4.21 Robust LP with polyhedral cost uncertainty. We consider a robust linear programming problem,
with polyhedral uncertainty in the cost:

\[
\begin{array}{ll}
\mbox{minimize} & \sup_{c \in \mathcal{C}} c^T x \\
\mbox{subject to} & Ax \succeq b,
\end{array}
\]

with variable $x \in \mathbf{R}^n$, where $\mathcal{C} = \{c \mid Fc \preceq g\}$. You can think of $x$ as the quantities of $n$ products
to buy (or sell, when $x_i < 0$), $Ax \succeq b$ as constraints, requirements, or limits on the available
quantities, and $\mathcal{C}$ as giving our knowledge or assumptions about the product prices at the time
we place the order. The objective is then the worst possible (i.e., largest) cost, given the
quantities $x$, consistent with our knowledge of the prices.
In this exercise, you will work out a tractable method for solving this problem. You can assume
that $\mathcal{C} \neq \emptyset$, and the inequalities $Ax \succeq b$ are feasible.

(a) Let $f(x) = \sup_{c \in \mathcal{C}} c^T x$ be the objective in the problem above. Explain why $f$ is convex.
(b) Find the dual of the problem
\[
\begin{array}{ll}
\mbox{maximize} & c^T x \\
\mbox{subject to} & Fc \preceq g,
\end{array}
\]
with variable $c$. (The problem data are $x$, $F$, and $g$.) Explain why the optimal value of the
dual is $f(x)$.
(c) Use the expression for $f(x)$ found in part (b) in the original problem, to obtain a single LP
equivalent to the original robust LP.
(d) Carry out the method found in part (c) to solve a robust LP with the data below. In
MATLAB:
rand('seed',0);
A = rand(30,10);
b = rand(30,1);
c_nom = 1+rand(10,1); % nominal c values
In Python:
import numpy as np
np.random.seed(10)
(m, n) = (30, 10)
A = np.random.rand(m, n); A = np.asmatrix(A)
b = np.random.rand(m, 1); b = np.asmatrix(b)
c_nom = np.ones((n, 1)) + np.random.rand(n, 1); c_nom = np.asmatrix(c_nom)
In Julia:
srand(10);
n = 10;
m = 30;
A = rand(m, n);
b = rand(m, 1);
c_nom = 1 + rand(n, 1);

Then, use $\mathcal{C}$ described as follows. Each $c_i$ deviates no more than 25% from its nominal value,
i.e., $0.75 c_{\rm nom} \preceq c \preceq 1.25 c_{\rm nom}$, and the average of $c$ does not deviate more than 10% from the
average of the nominal values, i.e., $0.9 (\mathbf{1}^T c_{\rm nom})/n \leq \mathbf{1}^T c / n \leq 1.1 (\mathbf{1}^T c_{\rm nom})/n$.
Compare the worst-case cost $f(x)$ and the nominal cost $c_{\rm nom}^T x$ for $x$ optimal for the robust
problem, and for $x$ optimal for the nominal problem (i.e., the case where $\mathcal{C} = \{c_{\rm nom}\}$). Compare
the values and make a brief comment.

Solution.

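(a) For each $c \in \mathcal{C}$, $c^T x$ is linear in $x$, therefore convex. $f(x)$ is the supremum of convex functions, and
therefore convex.
(b) The dual is
\[
\begin{array}{ll}
\mbox{minimize} & \lambda^T g \\
\mbox{subject to} & F^T \lambda = x, \quad \lambda \succeq 0,
\end{array}
\]
with variable $\lambda$. Since the primal problem is feasible (we know this since we assume $\mathcal{C} \neq \emptyset$),
we are guaranteed there is zero duality gap, so $f(x)$, the optimal value of the primal problem,
is also the optimal value of the dual above.
(c) Substituting our expression for $f(x)$ into the original problem, we arrive at

\[
\begin{array}{ll}
\mbox{minimize} & \inf\{ \lambda^T g \mid F^T \lambda = x, \; \lambda \succeq 0 \} \\
\mbox{subject to} & Ax \succeq b.
\end{array}
\]

We can just as well minimize over $x$ and $\lambda$ at the same time, which gives the problem

\[
\begin{array}{ll}
\mbox{minimize} & \lambda^T g \\
\mbox{subject to} & Ax \succeq b, \quad F^T \lambda = x, \quad \lambda \succeq 0,
\end{array}
\]

which is an LP in the variables $x$ and $\lambda$. Solving this single LP gives us the optimal $x$ for the
original robust LP.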
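The duality argument in part (b) can be checked by hand when $\mathcal{C}$ is a simple box, $\{c \mid l \preceq c \preceq u\}$, which corresponds to $F = [I; -I]$ and $g = (u, -l)$. The sketch below makes that simplifying assumption (the box, the vectors `lower` and `upper`, and the point `x` are illustrative and are not the data of part (d)). It computes the worst-case cost by enumerating the vertices of the box, and compares it with the dual objective at the feasible point $\lambda = (\max\{x, 0\}, \max\{-x, 0\})$, which satisfies $F^T \lambda = x$ and $\lambda \succeq 0$.

```python
from itertools import product


def worst_case_cost_primal(x, lower, upper):
    """sup of c^T x over the box lower <= c <= upper, by vertex enumeration."""
    best = float("-inf")
    for vertex in product(*zip(lower, upper)):
        best = max(best, sum(ci * xi for ci, xi in zip(vertex, x)))
    return best


def worst_case_cost_dual(x, lower, upper):
    """Dual objective lambda^T g at lambda = (max(x, 0), max(-x, 0))."""
    lam_plus = [max(xi, 0.0) for xi in x]     # multipliers for c <= upper
    lam_minus = [max(-xi, 0.0) for xi in x]   # multipliers for -c <= -lower
    # F^T lambda = lam_plus - lam_minus = x, and g = (upper, -lower).
    return (sum(u * lp for u, lp in zip(upper, lam_plus))
            - sum(l * lm for l, lm in zip(lower, lam_minus)))


x = [2.0, -1.0, 0.5]
lower = [0.75, 1.5, 0.9]
upper = [1.25, 2.5, 1.1]
primal_value = worst_case_cost_primal(x, lower, upper)
dual_value = worst_case_cost_dual(x, lower, upper)
print(primal_value, dual_value)
```

Because the dual value at a feasible $\lambda$ equals the primal optimal value, weak duality shows that this $\lambda$ is dual optimal, which is the zero duality gap used in the solution. For the general polyhedral set in part (d), the dual LP has to be solved numerically.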
(d) We define $x_{\rm nom}$ and $x_{\rm rob}$ to be the nominal and robust solutions respectively. The numerical
results are given in the table below.

Version   Cost                 x_nom   x_rob
MATLAB    $c_{\rm nom}^T x$    1.50    1.94
MATLAB    $f(x)$               4.07    2.31
Python    $c_{\rm nom}^T x$    2.11    2.52
Python    $f(x)$               7.22    3.17
Julia     $c_{\rm nom}^T x$    1.53    2.39
Julia     $f(x)$               4.32    2.99

The MATLAB code below formulates and solves the robust diet problem.

rand('seed',0);
n=10;m=30;
A = rand(m,n);
b = rand(m,1);
c_nom = 1+1*rand(n,1); % nominal c values

F = [eye(n);-eye(n);ones(1,n)/n;-ones(1,n)/n];
g = [1.25*c_nom;-0.75*c_nom;1.1*sum(c_nom)/n;-0.9*sum(c_nom)/n];
k = length(g);

% robust LP
cvx_begin
    variables x_rob(n) lambda(k)
    minimize(lambda'*g)
    A*x_rob>=b
    lambda>=0
    F'*lambda==x_rob
cvx_end

% nominal cost of x_rob
c_nom'*x_rob

% nominal LP
cvx_begin
    variables x_nom(n)
    minimize(c_nom'*x_nom)
    A*x_nom>=b
cvx_end

% worst case cost of x_nom
cvx_begin
    variables c_wc(n)
    maximize(c_wc'*x_nom)
    F*c_wc<=g
cvx_end
The Python code below formulates and solves the robust diet problem.
import cvxpy as cvx
import numpy as np

np.random.seed(10)
(m, n) = (30, 10)
A = np.random.rand(m, n); A = np.asmatrix(A)
b = np.random.rand(m, 1); b = np.asmatrix(b)
c_nom = np.ones((n, 1)) + np.random.rand(n, 1); c_nom = np.asmatrix(c_nom)

#Solution begins
#------------------------------------------------------------
F = np.r_[np.eye(n), -np.eye(n), np.ones((1, n))/n, -np.ones((1, n))/n]
g = np.r_[1.25*c_nom, -0.75*c_nom, 1.1*sum(c_nom)/n, -0.9*sum(c_nom)/n]

lam = cvx.Variable(g.size)
x = cvx.Variable(n)

# robust LP
const = [A*x>=b, F.T*lam==x, lam>=0]
cvx.Problem(cvx.Minimize(lam.T*g), const).solve()
x_rob = x.value

# nominal LP
const = [A*x>=b]
cvx.Problem(cvx.Minimize(c_nom.T*x), const).solve()
x_nom = x.value

print("For x_nom and x_rob")
print("nominal costs are %.2f, and %.2f" % (c_nom.T*x_nom, c_nom.T*x_rob))

# worst-case costs
c = cvx.Variable(n)
f_x_nom = cvx.Problem(cvx.Maximize(c.T*x_nom), [F*c<=g]).solve()
f_x_rob = cvx.Problem(cvx.Maximize(c.T*x_rob), [F*c<=g]).solve()

print("worst-case costs are %.2f, and %.2f" % (f_x_nom, f_x_rob))

The Julia code below formulates and solves the robust diet problem.
srand(10);
n = 10;
m = 30;
A = rand(m, n);
b = rand(m, 1);
c_nom = 1 + rand(n, 1);

F = [eye(n); -eye(n); ones(1, n)/n; -ones(1, n)/n];
g = [1.25*c_nom; -0.75*c_nom; 1.1*sum(c_nom)/n; -0.9*sum(c_nom)/n];
k = length(g);

using Convex, SCS

#Robust LP
x = Variable(n);
lambda = Variable(k);
problem1 = minimize(lambda'*g, A*x >= b, lambda >= 0, F'*lambda - x == 0)
solve!(problem1, SCSSolver(verbose=0));
x_rob = x.value;
println("Nominal cost of x_rob")
println(c_nom'*x_rob)

#Nominal LP
x = Variable(n);
problem2 = minimize(c_nom'*x, A*x >= b);
solve!(problem2, SCSSolver(verbose=0));
x_nom = x.value;
println("Nominal cost of x_nom")
println(c_nom'*x_nom)

#Worst-case cost for x_rob
c = Variable(n);
constraint = F*c <= g;
problem3 = maximize(x_rob'*c, F*c <= g);
solve!(problem3, SCSSolver(verbose=0));
f_c_rob = problem3.optval;
println("Worst-case cost of x_rob:")
println(f_c_rob)

#Worst-case cost for x_nom
problem4 = maximize(x_nom'*c, F*c <= g);
solve!(problem4, SCSSolver(verbose=0));
f_c_nom = problem4.optval;
println("Worst-case cost of x_nom:")
println(f_c_nom)

4.22 Diagonal scaling with prescribed column and row sums. Let $A$ be an $n \times n$ matrix with positive
entries, and let $c$ and $d$ be positive $n$-vectors that satisfy $\mathbf{1}^T c = \mathbf{1}^T d = 1$. Consider the geometric
program
\[
\begin{array}{ll}
\mbox{minimize} & x^T A y \\
\mbox{subject to} & \prod_{i=1}^n x_i^{c_i} = 1 \\
& \prod_{j=1}^n y_j^{d_j} = 1,
\end{array}
\]
with variables $x, y \in \mathbf{R}^n$ (and implicit constraints $x \succ 0$, $y \succ 0$). Write this geometric program
in convex form and derive the optimality conditions. Show that if $x$ and $y$ are optimal, then the
matrix
\[
B = \frac{1}{x^T A y} \mathbf{diag}(x) A\, \mathbf{diag}(y)
\]
satisfies $B \mathbf{1} = c$ and $B^T \mathbf{1} = d$.
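Solution. We make a change of variables $u_i = \log x_i$, $v_j = \log y_j$ and define $\alpha_{ij} = \log A_{ij}$. The
GP in convex form is
\[
\begin{array}{ll}
\mbox{minimize} & \log \left( \sum_{i=1}^n \sum_{j=1}^n e^{\alpha_{ij} + u_i + v_j} \right) \\
\mbox{subject to} & c^T u = 0 \\
& d^T v = 0.
\end{array}
\]
The optimality conditions are $c^T u = d^T v = 0$ and, for some multipliers $\nu$ and $\gamma$ associated with the two equality constraints,
\[
\frac{\sum_{j=1}^n e^{u_i} e^{\alpha_{ij}} e^{v_j}}{\sum_{k=1}^n \sum_{l=1}^n e^{\alpha_{kl} + u_k + v_l}} = \nu c_i, \quad i = 1, \ldots, n,
\qquad
\frac{\sum_{i=1}^n e^{v_j} e^{\alpha_{ij}} e^{u_i}}{\sum_{k=1}^n \sum_{l=1}^n e^{\alpha_{kl} + u_k + v_l}} = \gamma d_j, \quad j = 1, \ldots, n.
\]
In the original variables $x_i = e^{u_i}$, $y_i = e^{v_i}$, this is
\[
\frac{1}{x^T A y} \mathbf{diag}(x) A y = \nu c, \qquad \frac{1}{x^T A y} \mathbf{diag}(y) A^T x = \gamma d.
\]
Taking the inner product with $\mathbf{1}$ shows $\nu = \gamma = 1$.
This result can be found in the following paper: A. W. Marshall and I. Olkin, Scaling of matrices
to achieve specified row and column sums. Numerische Mathematik 12, 83-90 (1968).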
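The conditions $B\mathbf{1} = c$ and $B^T \mathbf{1} = d$ can be observed numerically on a small instance. The sketch below does not follow the solution's derivation; it uses an alternating minimization scheme chosen here purely for illustration (minimize over $x$ with $y$ fixed, then over $y$ with $x$ fixed, rescaling each time so that the product constraints hold). The function name, the $3 \times 3$ matrix, and the vectors `c` and `d` are all assumptions. Each half-step sets $x_i \propto c_i / (Ay)_i$ or $y_j \propto d_j / (A^T x)_j$, which is the optimality condition above restricted to one block of variables.

```python
import math


def alternating_scaling(A, c, d, iters=500):
    """Alternately minimize x^T A y over x and over y under the product constraints."""
    n = len(A)
    x = [1.0] * n
    y = [1.0] * n
    for _ in range(iters):
        # x-step: x_i proportional to c_i / (A y)_i, rescaled so prod x_i^{c_i} = 1.
        Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]
        x = [c[i] / Ay[i] for i in range(n)]
        gx = math.exp(sum(c[i] * math.log(x[i]) for i in range(n)))
        x = [xi / gx for xi in x]
        # y-step: y_j proportional to d_j / (A^T x)_j, rescaled so prod y_j^{d_j} = 1.
        ATx = [sum(A[i][j] * x[i] for i in range(n)) for j in range(n)]
        y = [d[j] / ATx[j] for j in range(n)]
        gy = math.exp(sum(d[j] * math.log(y[j]) for j in range(n)))
        y = [yj / gy for yj in y]
    total = sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))
    B = [[x[i] * A[i][j] * y[j] / total for j in range(n)] for i in range(n)]
    return x, y, B


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
c = [0.2, 0.3, 0.5]
d = [0.5, 0.25, 0.25]
x, y, B = alternating_scaling(A, c, d)
row_sums = [sum(row) for row in B]
col_sums = [sum(B[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

At a fixed point of the iteration the row sums of $B$ equal $c$ and the column sums equal $d$, and the returned $x$ and $y$ satisfy the two product constraints, so they satisfy the optimality conditions derived above.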

4.23 A theorem due to Schoenberg. Suppose $m$ balls in $\mathbf{R}^n$, with centers $a_i$ and radii $r_i$, have a nonempty
intersection. We define $y$ to be a point in the intersection, so

\[
\|y - a_i\|_2 \leq r_i, \quad i = 1, \ldots, m. \qquad (27)
\]

Suppose we move the centers to new positions $b_i$ in such a way that the distances between the
centers do not increase:
\[
\|b_i - b_j\|_2 \leq \|a_i - a_j\|_2, \quad i, j = 1, \ldots, m. \qquad (28)
\]
We will prove that the intersection of the translated balls is nonempty, i.e., there exists a point $x$
with $\|x - b_i\|_2 \leq r_i$, $i = 1, \ldots, m$. To show this we prove that the optimal value of

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \|x - b_i\|_2^2 \leq r_i^2 + t, \quad i = 1, \ldots, m,
\end{array}
\qquad (29)
\]

with variables $x \in \mathbf{R}^n$ and $t \in \mathbf{R}$, is less than or equal to zero.

(a) Show that (28) implies that

\[
t - (x - b_i)^T (x - b_j) \leq -(y - a_i)^T (y - a_j) \quad \mbox{for } i, j \in I,
\]

if $(x, t)$ is feasible in (29), and $I \subseteq \{1, \ldots, m\}$ is the set of active constraints at $x$, $t$.

(b) Suppose $x$, $t$ are optimal in (29) and that $\lambda_1, \ldots, \lambda_m$ are optimal dual variables. Use the
optimality conditions for (29) and the inequality in part (a) to show that
\[
t = t - \left\| \sum_{i=1}^m \lambda_i (x - b_i) \right\|_2^2 \leq - \left\| \sum_{i=1}^m \lambda_i (y - a_i) \right\|_2^2.
\]

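Solution.

(a) Follows from

\[
\begin{aligned}
2t + r_i^2 + r_j^2 - 2(x - b_i)^T (x - b_j) &= \|x - b_i\|_2^2 + \|x - b_j\|_2^2 - 2(x - b_i)^T (x - b_j) \\
&\leq \|y - a_i\|_2^2 + \|y - a_j\|_2^2 - 2(y - a_i)^T (y - a_j) \\
&\leq r_i^2 + r_j^2 - 2(y - a_i)^T (y - a_j).
\end{aligned}
\]

The equality follows from the fact that constraints $i$ and $j$ are active. The first inequality
is (28), since its two sides are equal to $\|b_i - b_j\|_2^2$ and $\|a_i - a_j\|_2^2$, and the second inequality is (27).
(b) Let $I$ be the set of active constraints at the optimal $x$, $t$. The optimality conditions are: primal
feasibility and
\[
\sum_{i=1}^m \lambda_i = 1, \qquad x = \sum_{i=1}^m \lambda_i b_i, \qquad \lambda_i \geq 0, \qquad \lambda_i = 0 \mbox{ for } i \notin I.
\]
Therefore
\[
\begin{aligned}
t &= t - \left\| \sum_{i=1}^m \lambda_i (x - b_i) \right\|_2^2 \\
&= \sum_{i=1}^m \sum_{j=1}^m \lambda_i \lambda_j \left( t - (x - b_i)^T (x - b_j) \right) \\
&= \sum_{i \in I} \sum_{j \in I} \lambda_i \lambda_j \left( t - (x - b_i)^T (x - b_j) \right) \\
&\leq - \sum_{i \in I} \sum_{j \in I} \lambda_i \lambda_j (y - a_i)^T (y - a_j) \\
&= - \sum_{i=1}^m \sum_{j=1}^m \lambda_i \lambda_j (y - a_i)^T (y - a_j) \\
&= - \left\| \sum_{i=1}^m \lambda_i (y - a_i) \right\|_2^2.
\end{aligned}
\]
Line 1 follows from $x = \sum_i \lambda_i b_i$ (together with $\sum_i \lambda_i = 1$). Line 2 follows from $\sum_i \lambda_i = 1$. Lines 3 and 5 follow from
$\lambda_i = 0$ for $i \notin I$. Line 4 follows from part (a) and $\lambda_i \geq 0$.

The result appears in the following paper: I.J. Schoenberg, On a theorem of Kirzbraun and Valentine,
The American Mathematical Monthly 60, 620-622 (1953).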
5 Approximation and fitting
5.1 Three measures of the spread of a group of numbers. For $x \in \mathbf{R}^n$, we define three functions that
measure the spread or width of the set of its elements (or coefficients). The first function is the
spread, defined as
\[
\phi_{\rm sprd}(x) = \max_{i=1,\ldots,n} x_i - \min_{i=1,\ldots,n} x_i.
\]
This is the width of the smallest interval that contains all the elements of $x$.
The second function is the standard deviation, defined as
\[
\phi_{\rm stdev}(x) = \left( \frac{1}{n} \sum_{i=1}^n x_i^2 - \left( \frac{1}{n} \sum_{i=1}^n x_i \right)^2 \right)^{1/2}.
\]
This is the statistical standard deviation of a random variable that takes the values $x_1, \ldots, x_n$, each
with probability $1/n$.
The third function is the average absolute deviation from the median of the values:
\[
\phi_{\rm aamd}(x) = (1/n) \sum_{i=1}^n |x_i - \mathrm{med}(x)|,
\]
where $\mathrm{med}(x)$ denotes the median of the components of $x$, defined as follows. If $n = 2k - 1$ is
odd, then the median is defined as the value of the middle entry when the components are sorted, i.e.,
$\mathrm{med}(x) = x_{[k]}$, the $k$th largest element among the values $x_1, \ldots, x_n$. If $n = 2k$ is even, we define
the median as the average of the two middle values, i.e., $\mathrm{med}(x) = (x_{[k]} + x_{[k+1]})/2$.
Each of these functions measures the spread of the values of the entries of $x$; for example, each
function is zero if and only if all components of $x$ are equal, and each function is unaffected if a
constant is added to each component of $x$.
Which of these three functions is convex? For each one, either show that it is convex, or give a
counterexample showing it is not convex. By a counterexample, we mean a specific $x$ and $y$ such
that Jensen's inequality fails, i.e., $\phi((x + y)/2) > (\phi(x) + \phi(y))/2$.
Solution. The first one is straightforward. The maximum of $x_i$ is convex, and the minimum is
concave. The difference, which is $\phi_{\rm sprd}(x)$, is therefore convex.
The second one is also convex. The standard deviation is the Euclidean norm of a linear function
of $x$,
\[
\phi_{\rm stdev}(x) = (1/\sqrt{n}) \| (I - (1/n) \mathbf{1}\mathbf{1}^T) x \|_2,
\]
and so is convex.
The third function is also convex. The snappiest proof we know goes like this. Consider the function
$\psi$ defined as
\[
\psi(x) = \inf_t \|x - t\mathbf{1}\|_1.
\]
Since $\|x - t\mathbf{1}\|_1$ is convex in $x$, $t$, it follows that $\psi$ (which is obtained by minimizing over $t$) is convex
in $x$. The median $t = \mathrm{med}(x)$ minimizes $\|x - t\mathbf{1}\|_1$. (When $n$ is even, any number between the two
middle ones is also a minimizer.) Using this value of $t$, we see that
\[
\phi_{\rm aamd}(x) = (1/n) \psi(x),
\]
and so is convex.
By the way, the same proof works for the other two functions, with the $\ell_1$-norm in the definition of $\psi$ replaced by the $\ell_p$-norm. For $p = \infty$, the $t$ that minimizes
$\|x - t\mathbf{1}\|_\infty$ is $t = (\max_i x_i + \min_i x_i)/2$, and we have
\[
\phi_{\rm sprd}(x) = 2 \psi(x).
\]
For $p = 2$, the $t$ that minimizes $\|x - t\mathbf{1}\|_2$ is the mean, $t = (1/n) \sum_i x_i$. Then we have
\[
\phi_{\rm stdev}(x) = (1/\sqrt{n}) \psi(x).
\]
We mention another nice proof of convexity of $\phi_{\rm aamd}(x)$. Suppose that $n = 2k - 1$ is odd. We use
$x_{[1]}$ to denote the largest element of $x$, $x_{[2]}$ the second largest, and so on. The median of $x$ is given
by $x_{[k]}$. Our function can be expressed as
\[
n\, \phi_{\rm aamd}(x) = \sum_{i=1}^{k-1} (x_{[i]} - x_{[k]}) + \sum_{i=k+1}^{n} (x_{[k]} - x_{[i]}) = \sum_{i=1}^{k-1} x_{[i]} - \sum_{i=k+1}^{n} x_{[i]}.
\]
But we know that for any $r$, the function
\[
\sum_{i=1}^{r} x_{[i]},
\]
the sum of the $r$ largest entries of $x$, is convex. The sum of the $r$ smallest elements can be expressed
as
\[
\sum_{i=n-r+1}^{n} x_{[i]} = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n-r} x_{[i]},
\]
and so is concave. It follows that $\phi_{\rm aamd}(x)$ is the difference of a convex and a concave function, and
so is convex. The same type of argument works when $n$ is even.
And, here is yet another proof of convexity of $\phi_{\rm aamd}$. Let's take $n$ even for simplicity. Consider
each vector $c$ with all entries in $c$ either $1$ or $-1$, with an equal number of $1$s and $-1$s. (There are
$\binom{n}{n/2}$ such vectors.) The linear function $c^T x$ gives the difference between the sum of some half
of the entries of $x$ and the sum of the other half. Up to the factor $1/n$, our function $\phi_{\rm aamd}(x)$ is the maximum over all
such linear functions, and therefore is convex.
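The three definitions are easy to evaluate directly, and the convexity claims can be spot-checked by testing Jensen's inequality at midpoints of random pairs. The sketch below is only a numerical sanity check, not a proof; the function names, the dimension $n = 6$, the random seed, and the number of trials are arbitrary choices made for this illustration.

```python
import random
import statistics


def sprd(x):
    """Spread: largest entry minus smallest entry."""
    return max(x) - min(x)


def stdev(x):
    """Standard deviation with weights 1/n (population standard deviation)."""
    return statistics.pstdev(x)


def aamd(x):
    """Average absolute deviation from the median."""
    med = statistics.median(x)
    return sum(abs(xi - med) for xi in x) / len(x)


random.seed(0)
n = 6
max_violation = 0.0
for _ in range(2000):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    y = [random.gauss(0.0, 1.0) for _ in range(n)]
    mid = [(xi + yi) / 2.0 for xi, yi in zip(x, y)]
    for phi in (sprd, stdev, aamd):
        # Jensen's inequality at the midpoint: phi(mid) <= (phi(x) + phi(y)) / 2.
        violation = phi(mid) - 0.5 * (phi(x) + phi(y))
        max_violation = max(max_violation, violation)
print(max_violation)
```

No violation beyond rounding error should appear, which is consistent with all three functions being convex.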

5.2 Minimax rational fit to the exponential. (See exercise 6.9 of Convex Optimization.) We consider
the specific problem instance with data

\[
t_i = -3 + 6(i - 1)/(k - 1), \qquad y_i = e^{t_i}, \qquad i = 1, \ldots, k,
\]

where $k = 201$. (In other words, the data are obtained by uniformly sampling the exponential
function over the interval $[-3, 3]$.) Find a function of the form

\[
f(t) = \frac{a_0 + a_1 t + a_2 t^2}{1 + b_1 t + b_2 t^2}
\]

that minimizes $\max_{i=1,\ldots,k} |f(t_i) - y_i|$. (We require that $1 + b_1 t_i + b_2 t_i^2 > 0$ for $i = 1, \ldots, k$.)
Find optimal values of $a_0$, $a_1$, $a_2$, $b_1$, $b_2$, and give the optimal objective value, computed to an
accuracy of 0.001. Plot the data and the optimal rational function fit on the same plot. On a
different plot, give the fitting error, i.e., $f(t_i) - y_i$.
Hint. To check if a feasibility problem is feasible, in Matlab, you can use strcmp(cvx_status,'Solved')
after cvx_end. In Python, use problem.status == 'optimal'. In Julia, use problem.status == :Optimal.
In Julia, make sure to use the ECOS solver.
Solution. The objective function (and therefore also the problem) is not convex, but it is quasiconvex.
We have $\max_{i=1,\ldots,k} |f(t_i) - y_i| \leq \gamma$ if and only if

\[
\left| \frac{a_0 + a_1 t_i + a_2 t_i^2}{1 + b_1 t_i + b_2 t_i^2} - y_i \right| \leq \gamma, \quad i = 1, \ldots, k.
\]

This is equivalent to (since the denominator is positive)

\[
|a_0 + a_1 t_i + a_2 t_i^2 - y_i (1 + b_1 t_i + b_2 t_i^2)| \leq \gamma (1 + b_1 t_i + b_2 t_i^2), \quad i = 1, \ldots, k,
\]

which is a set of $2k$ linear inequalities in the variables $a$ and $b$ (for fixed $\gamma$). In particular, this
shows the objective is quasiconvex. (In fact, it is a generalized linear fractional function.)
To solve the problem we can use a bisection method, solving an LP feasibility problem at each step.
At each step we select some value of $\gamma$ and solve the feasibility problem

\[
\begin{array}{ll}
\mbox{find} & a, b \\
\mbox{subject to} & |a_0 + a_1 t_i + a_2 t_i^2 - y_i (1 + b_1 t_i + b_2 t_i^2)| \leq \gamma (1 + b_1 t_i + b_2 t_i^2), \quad i = 1, \ldots, k,
\end{array}
\]

with variables $a$ and $b$. (Note that as long as $\gamma > 0$, the condition that the denominator is positive
is enforced automatically.) This can be turned into the LP feasibility problem

\[
\begin{array}{ll}
\mbox{find} & a, b \\
\mbox{subject to} & a_0 + a_1 t_i + a_2 t_i^2 - y_i (1 + b_1 t_i + b_2 t_i^2) \leq \gamma (1 + b_1 t_i + b_2 t_i^2), \quad i = 1, \ldots, k \\
& a_0 + a_1 t_i + a_2 t_i^2 - y_i (1 + b_1 t_i + b_2 t_i^2) \geq -\gamma (1 + b_1 t_i + b_2 t_i^2), \quad i = 1, \ldots, k.
\end{array}
\]
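The bisection loop itself does not depend on how the feasibility problem is solved. The sketch below shows the loop on a deliberately degenerate special case, chosen so that no LP solver is needed: the fit is restricted to a constant, $f(t) = a_0$ (that is, $a_1 = a_2 = b_1 = b_2 = 0$, an assumption made only for this illustration). For a constant fit, the feasibility problem at level $\gamma$ asks whether the intervals $[y_i - \gamma, y_i + \gamma]$ have a common point, which can be decided in closed form. The names `bisect_min` and `constant_fit_feasible` are hypothetical.

```python
import math


def bisect_min(feasible, lo, hi, tol=1e-3):
    """Bisection on gamma: feasible(gamma) must be monotone (False, then True)."""
    while hi - lo >= tol:
        gamma = 0.5 * (lo + hi)
        if feasible(gamma):
            hi = gamma   # gamma is achievable, try smaller values
        else:
            lo = gamma   # gamma is not achievable, try larger values
    return hi


k = 201
t = [-3.0 + 6.0 * i / (k - 1) for i in range(k)]
y = [math.exp(ti) for ti in t]


def constant_fit_feasible(gamma):
    """Is there a constant a0 with |a0 - y_i| <= gamma for all i?"""
    # The intervals [y_i - gamma, y_i + gamma] intersect iff
    # the largest lower end does not exceed the smallest upper end.
    return max(yi - gamma for yi in y) <= min(yi + gamma for yi in y)


gamma_opt = bisect_min(constant_fit_feasible, 0.0, math.exp(3.0))
print(gamma_opt, (max(y) - min(y)) / 2.0)
```

For the constant fit, the optimal level is half the range of the data, and the bisection result should agree with it to within the tolerance. In the full rational-fit problem, the closed-form test is replaced by the LP feasibility problem above.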

The following Matlab code solves the problem for the particular problem instance.

k=201;
t=(-3:6/(k-1):3)';
y=exp(t);

Tpowers=[ones(k,1) t t.^2];

u=exp(3); l=0; % initial upper and lower bounds
bisection_tol=1e-3; % bisection tolerance

while u-l>= bisection_tol
    gamma=(l+u)/2;
    cvx_begin % solve the feasibility problem
        cvx_quiet(true);
        variable a(3);
        variable b(2);
        subject to
            abs(Tpowers*a-y.*(Tpowers*[1;b])) <= gamma*Tpowers*[1;b];
    cvx_end

    if strcmp(cvx_status,'Solved')
        u=gamma;
        a_opt=a;
        b_opt=b;
        objval_opt=gamma;
    else
        l=gamma;
    end
end

y_fit=Tpowers*a_opt./(Tpowers*[1;b_opt]);

figure(1);
plot(t,y,'b', t,y_fit,'r+');
xlabel('t');
ylabel('y');

figure(2);
plot(t, y_fit-y);
xlabel('t');
ylabel('err');

The following Python code solves the problem for the particular problem instance.

import numpy as np
import matplotlib.pyplot as plt
import cvxpy as cvx

k = 201
t = -3 + 6 * np.arange(k) / (k - 1)
y = np.exp(t)

Tpowers = np.vstack((np.ones(k), t, t**2)).T

a = cvx.Variable(3)
b = cvx.Variable(2)
gamma = cvx.Parameter(sign='positive')
lhs = cvx.abs(Tpowers * a - (y[:, np.newaxis] * Tpowers) * cvx.vstack(1, b))
rhs = gamma * Tpowers * cvx.vstack(1, b)
problem = cvx.Problem(cvx.Minimize(0), [lhs <= rhs])

l, u = 0, np.exp(3)  # initial upper and lower bounds
bisection_tol = 1e-3  # bisection tolerance
while u - l >= bisection_tol:
    gamma.value = (l + u) / 2
    # solve the feasibility problem
    problem.solve()
    if problem.status == 'optimal':
        u = gamma.value
        a_opt = a.value
        b_opt = b.value
        objval_opt = gamma.value
    else:
        l = gamma.value

y_fit = (Tpowers * a_opt / (Tpowers * np.vstack((1, b_opt)))).A1

plt.figure()
plt.plot(t, y, 'b', t, y_fit, 'r+')
plt.xlabel('t')
plt.ylabel('y')
plt.show()

plt.figure()
plt.plot(t, y_fit - y)
plt.xlabel('t')
plt.ylabel('err')
plt.show()

The following Julia code solves the problem for the particular problem instance.

using Convex, ECOS, PyPlot

set_default_solver(ECOSSolver(verbose=false))

k = 201
t = -3:6/(k-1):3
y = exp(t)

Tpowers = [ones(k) t t.^2];

a = Variable(3)
b = Variable(2)

l, u = 0, exp(3) # initial upper and lower bounds


bisection_tol = 1e-3 # bisection tolerance

a_opt, b_opt, objval_opt = zeros(3), zeros(2), 0
while u - l >= bisection_tol
    gamma = (l + u) / 2
    lhs = abs(Tpowers * a - (y .* Tpowers) * [1; b])
    rhs = gamma * Tpowers * [1; b]
    problem = minimize(0, [lhs <= rhs])
    # solve the feasibility problem
    solve!(problem)
    if problem.status == :Optimal
        u = gamma
        a_opt = evaluate(a)
        b_opt = evaluate(b)
        objval_opt = gamma
    else
        l = gamma
    end
end

y_fit = (Tpowers * a_opt ./ (Tpowers * [1; b_opt]))

figure()
plot(t, y, "b", t, y_fit, "r+")
xlabel("t")
ylabel("y")
show()

figure()
plot(t, y_fit - y)
xlabel("t")
ylabel("err")
show()

The optimal values are

\[
a_0 = 1.0099, \quad a_1 = 0.6117, \quad a_2 = 0.1134, \quad b_1 = -0.4147, \quad b_2 = 0.0485,
\]

and the optimal objective value is 0.0233. We plot the fit and the error in Figures 5 and 6.
[Figure 5: Chebyshev fit with rational function. The line represents the data and the crosses the
fitted points. Horizontal axis from $-3$ to $3$; vertical axis $y$ from $0$ to $25$.]

[Figure 6: Fitting error for Chebyshev fit of exponential with rational function. Horizontal axis from $-3$ to $3$; vertical axis err from $-0.025$ to $0.025$.]

5.3 Approximation with trigonometric polynomials. Suppose $y : \mathbf{R} \rightarrow \mathbf{R}$ is a $2\pi$-periodic function. We
will approximate $y$ with the trigonometric polynomial
\[
f(t) = \sum_{k=0}^{K} a_k \cos(kt) + \sum_{k=1}^{K} b_k \sin(kt).
\]
We consider two approximations: one that minimizes the $L_2$-norm of the error, defined as
\[
\|f - y\|_2 = \left( \int_{-\pi}^{\pi} (f(t) - y(t))^2 \, dt \right)^{1/2},
\]
and one that minimizes the $L_1$-norm of the error, defined as
\[
\|f - y\|_1 = \int_{-\pi}^{\pi} |f(t) - y(t)| \, dt.
\]
The $L_2$ approximation is of course given by the (truncated) Fourier expansion of $y$.
To find an $L_1$ approximation, we discretize $t$ at $2N$ points,
\[
t_i = -\pi + i\pi/N, \quad i = 1, \ldots, 2N,
\]
and approximate the $L_1$ norm as
\[
\|f - y\|_1 \approx (\pi/N) \sum_{i=1}^{2N} |f(t_i) - y(t_i)|.
\]
(A standard rule of thumb is to take $N$ at least 10 times larger than $K$.) The $L_1$ approximation (or
really, an approximation of the $L_1$ approximation) can now be found using linear programming.
We consider a specific case, where $y$ is a $2\pi$-periodic square-wave, defined for $-\pi \leq t \leq \pi$ as
\[
y(t) = \left\{ \begin{array}{ll}
1 & |t| \leq \pi/2 \\
0 & \mbox{otherwise.}
\end{array} \right.
\]
(The graph of $y$ over a few cycles explains the name "square-wave".)
Find the optimal $L_2$ approximation and (discretized) $L_1$ optimal approximation for $K = 10$. You
can find the $L_2$ optimal approximation analytically, or by solving a least-squares problem associated
with the discretized version of the problem. Since $y$ is even, you can take the sine coefficients in
your approximations to be zero. Show $y$ and the two approximations on a single plot.
In addition, plot a histogram of the residuals (i.e., the numbers $f(t_i) - y(t_i)$) for the two approximations.
Use the same horizontal axis range, so the two residual distributions can easily be compared.
(Matlab command hist might be helpful here.) Make some brief comments about what you see.
Solution.
The optimal approximations are shown below. The L2 optimal approximation has the familiar
Gibbs ringing near the discontinuities in y, but the L1 optimal approximation has much less
pronounced oscillation near the discontinuity in y. The L1 optimal approximation has very small
error, except near the discontinuity.

[Figure: the square wave and the two optimal approximations, plotted against $t$ from $-\pi$ to $\pi$. Legend: square, L2norm, L1norm.]
The residual distributions of the two approximations are shown below. As expected, the $L_2$ approximation
has more small residuals, and fewer large residuals, compared to the $L_1$ approximation.
The $L_1$ approximation has a large number of zero and very small errors, and a few large ones.

[Figure: histogram of the residuals, in two panels labeled l2norm and l1norm. Horizontal axis from $-0.5$ to $0.5$; vertical axis from $0$ to $150$.]

The Matlab code is as follows.

% square wave approximations
close all; clear all;
K = 10;
N = 10*K;
t = (-pi+pi/N:pi/N:pi)';   % column vector of 2N sample points
y = (abs(t) <= pi/2);

% A = [f0 f1 f2 f3 ... f9 f10];
A = ones(2*N,1);
for k = 1:K
    A = [A cos(k*t)];
end

% L2 approximation
c = A\y
y_2 = A*c;

% L1 approximation
cvx_begin
    variable x(K+1)
    minimize(norm(A*x-y,1))
cvx_end
y_1 = A*x;

% plot of the square-wave and optimal fits
figure;
plot(t,y,t,y_2,t,y_1,'--');
title('approximations')
legend('square','l2norm','l1norm');
axis([-pi pi -0.2 1.2])

% distribution of residual magnitudes
figure;
subplot(2,1,1), hist(y - y_2, 25), axis([-.6 .6 0 150]);
title('l2 and l1 residual distributions'), ylabel('l2-norm')
subplot(2,1,2), hist(y - y_1, 25), axis([-.6 .6 0 150]);
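The problem statement notes that the $L_2$ optimal approximation can also be found analytically. For the square wave, the truncated Fourier expansion has $a_0 = 1/2$, $a_k = 2 \sin(k\pi/2)/(k\pi)$ for $k \geq 1$, and $b_k = 0$; these follow from $a_k = (1/\pi) \int_{-\pi/2}^{\pi/2} \cos(kt) \, dt$. The sketch below evaluates this analytic approximation in pure Python. It is a complement to the listing above, not a translation of it, and the names `coeff` and `f_l2` are hypothetical.

```python
import math

K = 10


def coeff(k):
    """Fourier cosine coefficient a_k of the square wave (analytic L2 solution)."""
    if k == 0:
        return 0.5
    return 2.0 * math.sin(k * math.pi / 2.0) / (k * math.pi)


a = [coeff(k) for k in range(K + 1)]


def f_l2(t):
    """Truncated Fourier expansion with K cosine terms."""
    return sum(a[k] * math.cos(k * t) for k in range(K + 1))


# Even-indexed coefficients (k >= 2) vanish; odd ones alternate in sign.
print(a[:4])
# Values at the centre of the pulse and at the edge of the period.
print(f_l2(0.0), f_l2(math.pi))
```

The value at $t = 0$ is slightly above 1 and the value at $t = \pi$ slightly below 0, which is the ripple of the truncated expansion that appears as Gibbs ringing near the discontinuities.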

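5.4 Penalty function approximation. We consider the approximation problem

\[
\mbox{minimize} \quad \phi(Ax - b)
\]
where $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$, the variable is $x \in \mathbf{R}^n$, and $\phi : \mathbf{R}^m \rightarrow \mathbf{R}$ is a convex penalty
function that measures the quality of the approximation $Ax \approx b$. We will consider the following
choices of penalty function:
(a) Euclidean norm.
\[
\phi(y) = \|y\|_2 = \left( \sum_{k=1}^m y_k^2 \right)^{1/2}.
\]
(b) $\ell_1$-norm.
\[
\phi(y) = \|y\|_1 = \sum_{k=1}^m |y_k|.
\]
(c) Sum of the largest $m/2$ absolute values.
\[
\phi(y) = \sum_{k=1}^{\lfloor m/2 \rfloor} |y|_{[k]}
\]
where $|y|_{[1]}, |y|_{[2]}, |y|_{[3]}, \ldots$, denote the absolute values of the components of $y$ sorted in
decreasing order.
(d) A piecewise-linear penalty.
\[
\phi(y) = \sum_{k=1}^m h(y_k), \qquad
h(u) = \left\{ \begin{array}{ll}
0 & |u| \leq 0.2 \\
|u| - 0.2 & 0.2 \leq |u| \leq 0.3 \\
2|u| - 0.5 & |u| \geq 0.3.
\end{array} \right.
\]
(e) Huber penalty.
\[
\phi(y) = \sum_{k=1}^m h(y_k), \qquad
h(u) = \left\{ \begin{array}{ll}
u^2 & |u| \leq M \\
M(2|u| - M) & |u| \geq M
\end{array} \right.
\]
with $M = 0.2$.
(f) Log-barrier penalty.
\[
\phi(y) = \sum_{k=1}^m h(y_k), \qquad h(u) = -\log(1 - u^2), \qquad \mathbf{dom}\, h = \{u \mid |u| < 1\}.
\]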

Here is the problem. Generate data A and b as follows:

m = 200;
n = 100;
A = randn(m,n);
b = randn(m,1);
b = b/(1.01*max(abs(b)));

(The normalization of $b$ ensures that the domain of $\phi(Ax - b)$ is nonempty if we use the log-barrier
penalty.) To compare the results, plot a histogram of the vector of residuals $y = Ax - b$, for each
of the solutions $x$, using the Matlab command

hist(A*x-b,m/2);

Some additional hints and remarks for the individual problems:

(a) This problem can be solved using least-squares (x=A\b).


(b) Use the CVX function norm(y,1).
(c) Use the CVX function norm_largest().
(d) Use CVX, with the overloaded max(), abs(), and sum() functions.
(e) Use the CVX function huber().
(f) The current version of CVX handles the logarithm using an iterative procedure, which is slow
and not entirely reliable. However, you can reformulate this problem as

\[
\mbox{maximize} \quad \left( \prod_{k=1}^m \left( (1 - (Ax - b)_k)(1 + (Ax - b)_k) \right) \right)^{1/2m},
\]

and use the CVX function geo_mean().

Solution.
We show that each of the functions $\phi$ is convex and, hence, $\phi(Ax - b)$ is convex.

Parts 1, 2. Any norm $\|\cdot\|$ is a convex function.
Part 3. See exercise 3.19 (b).
Parts 4, 5, 6. Making a plot of each of these functions $h$ clearly shows that they are convex.
For the piecewise-linear penalty, we can also note that

\[
h(u) = \max\{0, |u| - 0.2, 2|u| - 0.5\},
\]

so it is the pointwise maximum of convex functions. In part 6, $h$ is the Euclidean norm of the
vector $(\sqrt{\epsilon}, u)$.
Part 7. We can express $h$ as the sum of two convex functions:

\[
h(u) = -\log(1 + u) - \log(1 - u).
\]
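The scalar penalties from the problem statement are simple enough to write down directly, which gives a quick way to confirm the max-representation of the piecewise-linear penalty and to spot-check convexity. The sketch below is a numerical illustration only; the function names are hypothetical, and the midpoint test over a grid is a sanity check rather than a proof.

```python
import math


def h_pwl(u):
    """Piecewise-linear penalty, written case by case."""
    a = abs(u)
    if a <= 0.2:
        return 0.0
    if a <= 0.3:
        return a - 0.2
    return 2.0 * a - 0.5


def h_pwl_max(u):
    """Same penalty, written as a pointwise maximum of convex functions."""
    a = abs(u)
    return max(0.0, a - 0.2, 2.0 * a - 0.5)


def h_huber(u, M=0.2):
    """Huber penalty with threshold M."""
    a = abs(u)
    return u * u if a <= M else M * (2.0 * a - M)


def h_logbar(u):
    """Log-barrier penalty, defined for |u| < 1."""
    return -math.log(1.0 - u * u)


grid = [i / 50.0 for i in range(-49, 50)]   # points in (-1, 1)
max_mismatch = max(abs(h_pwl(u) - h_pwl_max(u)) for u in grid)
max_gap = 0.0
for h in (h_pwl, h_huber, h_logbar):
    for u in grid:
        for v in grid:
            # Midpoint convexity: h((u + v) / 2) <= (h(u) + h(v)) / 2.
            max_gap = max(max_gap, h(0.5 * (u + v)) - 0.5 * (h(u) + h(v)))
print(max_mismatch, max_gap)
```

Both printed quantities should be zero up to rounding error: the two forms of the piecewise-linear penalty coincide, and none of the three penalties violates midpoint convexity on the grid.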

The Matlab code is as follows.

m = 200; n = 100
A = randn(m,n);
b = randn(m,1);
b = b/(1.01*max(abs(b)));

% Part 1. L2
x1 = A\b;

% Part 2. L1
cvx_begin
variable x2(n)
minimize(norm(A*x2-b,1))
cvx_end

% Part 3. Sum of largest abs. values


cvx_begin
variable x3(n)
minimize(norm_largest(A*x3-b,floor(m/2)))
cvx_end

% Part 4. PWL
cvx_begin
variable x4(n)
minimize(sum(max([zeros(m,1), abs(A*x4-b)-0.2, 2*abs(A*x4-b)-0.5])))
cvx_end

% Part 5. huber
disp('huber')
cvx_begin
    variable x5(n)
    minimize sum(huber(A*x5 - b, .2))
cvx_end

% Part 6. Smoothed l1
disp('sqrt')
cvx_begin
    variable x6(n)
    minimize(sl1(A*x6-b, 1e-8))
cvx_end

% Part 7. entropy
cvx_begin
    variable x7(n)
    maximize(geo_mean([1-(A*x7-b); 1+(A*x7-b)]))
cvx_end

The residual distributions of an example problem are shown in the figure.

[Figure: histograms of the residuals, in seven panels labeled a through g.]

5.5 $\ell_{1.5}$ optimization. Optimization and approximation methods that use both an $\ell_2$-norm (or its
square) and an $\ell_1$-norm are currently very popular in statistics, machine learning, and signal and
image processing. Examples include Huber estimation, LASSO, basis pursuit, SVM, various
$\ell_1$-regularized classification methods, total variation de-noising, etc. Very roughly, an $\ell_2$-norm
corresponds to Euclidean distance (squared), or the negative log-likelihood function for a Gaussian;
in contrast the $\ell_1$-norm gives robust approximation, i.e., reduced sensitivity to outliers, and also
tends to yield sparse solutions (of whatever the argument of the norm is). (All of this is just
background; you don't need to know any of this to solve the problem.)
In this problem we study a natural method for blending the two norms, by using the $\ell_{1.5}$-norm,
defined as
\[
\|z\|_{1.5} = \left( \sum_{i=1}^k |z_i|^{3/2} \right)^{2/3}
\]
for $z \in \mathbf{R}^k$. We will consider the simplest approximation or regression problem:

\[
\mbox{minimize} \quad \|Ax - b\|_{1.5},
\]

with variable $x \in \mathbf{R}^n$, and problem data $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$. We will assume that $m > n$ and
that $A$ is full rank (i.e., rank $n$). The hope is that this $\ell_{1.5}$-optimal approximation problem should
share some of the good features of $\ell_2$ and $\ell_1$ approximation.

(a) Give optimality conditions for this problem. Try to make these as simple as possible.
(b) Explain how to formulate the $\ell_{1.5}$-norm approximation problem as an SDP. (Your SDP can
include linear equality and inequality constraints.)
(c) Solve the specific numerical instance generated by the following code:
randn('state',0);
A=randn(100,30);
b=randn(100,1);
Numerically verify the optimality conditions. Give a histogram of the residuals, and repeat
for the $\ell_2$-norm and $\ell_1$-norm approximations. You can use any method you like to solve the
problem (but of course you must explain how you did it); in particular, you do not need to
use the SDP formulation found in part (b).

Solution.

(a) We can just as well minimize the objective to the 3/2 power, i.e., solve the problem
\[
\mbox{minimize} \quad f(x) = \sum_{i=1}^m |a_i^T x - b_i|^{3/2}.
\]
This objective is differentiable, in fact, so the optimality condition is simply that the gradient
should vanish. (But it is not twice differentiable.) The gradient is
\[
\nabla f(x) = \sum_{i=1}^m (3/2)\, \mathrm{sign}(a_i^T x - b_i)\, |a_i^T x - b_i|^{1/2} a_i,
\]
so the optimality condition is just
\[
\sum_{i=1}^m (3/2)\, \mathrm{sign}(r_i)\, |r_i|^{1/2} a_i = 0,
\]
where $r_i = a_i^T x - b_i$ is the $i$th residual. We can, of course, drop the factor 3/2.
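The gradient formula above can be sanity-checked numerically. The following is a minimal sketch, not part of the original solution: it assumes NumPy, uses random data of the same size as the instance in part (c) (but not the same random numbers), and compares the closed-form gradient with central finite differences. The names `f`, `grad_f`, and `max_err` are illustrative.

```python
import numpy as np

def f(x, A, b):
    """Objective: sum of |a_i^T x - b_i|^(3/2)."""
    r = A @ x - b
    return np.sum(np.abs(r) ** 1.5)

def grad_f(x, A, b):
    """Closed-form gradient: (3/2) * A^T (sign(r) .* sqrt(|r|))."""
    r = A @ x - b
    return 1.5 * A.T @ (np.sign(r) * np.sqrt(np.abs(r)))

rng = np.random.default_rng(0)          # assumed seed, for reproducibility only
A = rng.standard_normal((100, 30))
b = rng.standard_normal(100)
x0 = rng.standard_normal(30)

# Central finite differences along each coordinate direction.
h = 1e-6
g_fd = np.array([(f(x0 + h * e, A, b) - f(x0 - h * e, A, b)) / (2 * h)
                 for e in np.eye(30)])
max_err = np.max(np.abs(g_fd - grad_f(x0, A, b)))
print("max gradient discrepancy:", max_err)
```

At an optimal point the same `grad_f` should return a vector that is zero up to solver tolerance, which is the check carried out in part (c).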
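(b) We can write an equivalent problem
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{1}^T t \\
\mbox{subject to} & s_i^{3/2} \leq t_i, \quad i = 1, \ldots, m, \\
& -s_i \leq a_i^T x - b_i \leq s_i, \quad i = 1, \ldots, m,
\end{array}
\]
with new variables $t, s \in \mathbf{R}^m$.

We need a way to express $s_i^{3/2} \leq t_i$ using LMIs. We first write it as $s_i^2 \leq t_i \sqrt{s_i}$. We're going
to express this using some LMIs. Recall that the general $2 \times 2$ LMI
\[
\begin{bmatrix} u & v \\ v & w \end{bmatrix} \succeq 0
\]
is equivalent to $u \geq 0$, $w \geq 0$, $uw \geq v^2$. So we can write $s_i^2 \leq t_i \sqrt{s_i}$ as
\[
\begin{bmatrix} \sqrt{s_i} & s_i \\ s_i & t_i \end{bmatrix} \succeq 0.
\]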
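Now this is not yet an LMI, because the 1,1 entry is not affine in the variables. To deal with
this, we introduce a new variable $y \in \mathbf{R}^m$, which satisfies $0 \preceq y \preceq \sqrt{s}$ (elementwise):
\[
\begin{bmatrix} y_i & s_i \\ s_i & t_i \end{bmatrix} \succeq 0, \qquad
\begin{bmatrix} s_i & y_i \\ y_i & 1 \end{bmatrix} \succeq 0.
\]
These are LMIs in the variables. The first LMI is equivalent to $y_i \geq 0$, $t_i \geq 0$, $y_i t_i \geq s_i^2$. The second
LMI is equivalent to $s_i \geq y_i^2$, i.e., $\sqrt{s_i} \geq |y_i|$. These two together are equivalent to $s_i^2 \leq t_i \sqrt{s_i}$.
(Here we use the fact that if we increase the 1,1 entry of a matrix, it gets more positive
semidefinite. That's informal, but you know what we mean.)
Now we can assemble an SDP to solve our $\ell_{1.5}$-norm approximation problem:
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{1}^T t \\
\mbox{subject to} & -s_i \leq a_i^T x - b_i \leq s_i, \quad i = 1, \ldots, m, \\
& \begin{bmatrix} y_i & s_i \\ s_i & t_i \end{bmatrix} \succeq 0, \quad
\begin{bmatrix} s_i & y_i \\ y_i & 1 \end{bmatrix} \succeq 0, \quad i = 1, \ldots, m,
\end{array}
\]
with variables $x \in \mathbf{R}^n$, $t \in \mathbf{R}^m$, $y \in \mathbf{R}^m$, $s \in \mathbf{R}^m$.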
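Here is another solution several of you used, which we like. The final SDP is
\[
\begin{array}{ll}
\mbox{minimize} & z \\
\mbox{subject to} & -s_i \leq a_i^T x - b_i \leq s_i, \quad i = 1, \ldots, m, \\
& \begin{bmatrix} s_i & y_i \\ y_i & 1 \end{bmatrix} \succeq 0, \quad i = 1, \ldots, m, \\
& \begin{bmatrix} z & s^T \\ s & \mathbf{diag}(y) \end{bmatrix} \succeq 0,
\end{array}
\]
with variables $x \in \mathbf{R}^n$, $y \in \mathbf{R}^m$, $s \in \mathbf{R}^m$, and $z \in \mathbf{R}$.

The first set of inequalities is equivalent to $|a_i^T x - b_i| \leq s_i$; the set of $2 \times 2$ LMIs is equivalent
to $s_i \geq y_i^2$, and the last $(m + 1) \times (m + 1)$ LMI is equivalent to
\[
z \geq s^T \mathbf{diag}(y)^{-1} s = \sum_{i=1}^m s_i^2 / y_i.
\]
Evidently we minimize $z$, and therefore the righthand side above. For $s_i$ fixed, the choice
$y_i = \sqrt{s_i}$ minimizes the objective, so we are minimizing
\[
\sum_{i=1}^m s_i^2 / y_i = \sum_{i=1}^m s_i^{3/2}.
\]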

(c) We're going to use CVX to solve the problem. The function norm(r,1.5) isn't implemented
yet, so we'll have to do it ourselves. One simple way is to note that $|r|^{3/2} = r^2/\sqrt{|r|}$, which is the
composition of the quadratic over linear function $x_1^2/x_2$ with $x_1 = r$, $x_2 = \sqrt{|r|}$. Fortunately,
the result is convex, since the quadratic over linear function is convex and decreasing in its
second argument, so it can accept a concave positive function there. In other words, CVX will
accept quad_over_lin(s,sqrt(s)), and recognize it as convex. So we have a snappy, short
way to express $s^{3/2}$ for $s > 0$. Now we have to form the composition of this with the convex
function $r_i = a_i^T x - b_i$. Here is one way to do this.

cvx_begin
variables x(n) s(m);
s >= abs(A*x-b);
minimize(sum(quad_over_lin(s,sqrt(s),0)));
cvx_end
The following code solves the problem for the different norms and plots histograms of the
residuals.
n=30;
m=100;
randn('state',0);
A=randn(m,n);
b=randn(m,1);

%l1.5 solution
cvx_begin
variables x(n) s(m);
s >= abs(A*x-b);
minimize(sum(quad_over_lin(s,sqrt(s),0)));
cvx_end

%l2 solution
xl2=A\b;

%l1 solution
cvx_begin
variables xl1(n);
minimize(norm(A*xl1-b,1));
cvx_end

r=A*x-b; %residuals
rl2=A*xl2-b;
rl1=A*xl1-b;

%check optimality condition (the result should be close to zero)
A'*(3/2*sign(r).*sqrt(abs(r)))

subplot(3,1,1)
hist(r)
axis([-2.5 2.5 0 50])
xlabel('r')
subplot(3,1,2)
hist(rl2)
axis([-2.5 2.5 0 50])
xlabel('r2')


Figure 7: Histogram of the residuals for $\ell_{1.5}$-norm, $\ell_2$-norm, and $\ell_1$-norm

subplot(3,1,3)
hist(rl1)
axis([-2.5 2.5 0 50])
xlabel('r1')

%solution using SDP


cvx_begin sdp
variables xdf(n) r(m) y(m) t(m);
A*xdf-b<=r;
-r<=A*xdf-b;
minimize(sum(t));
for i=1:m
[y(i), r(i); r(i), t(i)]>=0;
[r(i), y(i); y(i), 1]>=0;
end
cvx_end
Figure 7 shows the histograms of the residuals for the three norms.

5.6 Total variation image interpolation. A grayscale image is represented as an $m \times n$ matrix of
intensities $U^{\mathrm{orig}}$. You are given the values $U^{\mathrm{orig}}_{ij}$, for $(i, j) \in \mathcal{K}$, where $\mathcal{K} \subset \{1, \ldots, m\} \times \{1, \ldots, n\}$.
Your job is to interpolate the image, by guessing the missing values. The reconstructed image
will be represented by $U \in \mathbf{R}^{m \times n}$, where $U$ satisfies the interpolation conditions $U_{ij} = U^{\mathrm{orig}}_{ij}$ for
$(i, j) \in \mathcal{K}$.
The reconstruction is found by minimizing a roughness measure subject to the interpolation
conditions. One common roughness measure is the $\ell_2$ variation (squared),
\[
\sum_{i=2}^m \sum_{j=1}^n (U_{ij} - U_{i-1,j})^2 + \sum_{i=1}^m \sum_{j=2}^n (U_{ij} - U_{i,j-1})^2.
\]
Another method minimizes instead the total variation,
\[
\sum_{i=2}^m \sum_{j=1}^n |U_{ij} - U_{i-1,j}| + \sum_{i=1}^m \sum_{j=2}^n |U_{ij} - U_{i,j-1}|.
\]
Evidently both methods lead to convex optimization problems.
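To make the two roughness measures concrete, here is a minimal NumPy sketch that evaluates both on two tiny test images. The function names and the $3 \times 3$ images are illustrative assumptions, not data from the exercise. Both images rise by 2 from left to right, one as a sharp step and one as a ramp.

```python
import numpy as np

def l2_variation(U):
    """Sum of squared vertical and horizontal neighbor differences."""
    dv = U[1:, :] - U[:-1, :]     # U[i,j] - U[i-1,j]
    dh = U[:, 1:] - U[:, :-1]     # U[i,j] - U[i,j-1]
    return np.sum(dv ** 2) + np.sum(dh ** 2)

def total_variation(U):
    """Sum of absolute vertical and horizontal neighbor differences."""
    dv = U[1:, :] - U[:-1, :]
    dh = U[:, 1:] - U[:, :-1]
    return np.sum(np.abs(dv)) + np.sum(np.abs(dh))

step = np.array([[0.0, 0.0, 2.0]] * 3)   # sharp edge
ramp = np.array([[0.0, 1.0, 2.0]] * 3)   # gradual transition

print(l2_variation(step), l2_variation(ramp))        # 12.0 6.0
print(total_variation(step), total_variation(ramp))  # 6.0 6.0
```

The $\ell_2$ variation charges the sharp edge twice as much as the ramp, while the total variation charges them equally, which is one way to see why total variation interpolation tends to preserve edges.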


Carry out $\ell_2$ and total variation interpolation on the problem instance with data given in tv_img_interp.m.
This will define m, n, and matrices Uorig and Known. The matrix Known is $m \times n$, with $(i, j)$ entry
one if $(i, j) \in \mathcal{K}$, and zero otherwise. The mfile also has skeleton plotting code. (We give you
the entire original image so you can compare your reconstruction to the original; obviously your
solution cannot access $U^{\mathrm{orig}}_{ij}$ for $(i, j) \notin \mathcal{K}$.)
Solution. The code for the interpolation is very simple. For $\ell_2$ interpolation, the code is the
following.

cvx_begin
variable Ul2(m, n);
Ul2(Known) == Uorig(Known); % Fix known pixel values.
Ux = Ul2(1:end,2:end) - Ul2(1:end,1:end-1); % x (horiz) differences
Uy = Ul2(2:end,1:end) - Ul2(1:end-1,1:end); % y (vert) differences
minimize(norm([Ux(:); Uy(:)], 2)); % l2 roughness measure
cvx_end

For total variation interpolation, we use the following code.

cvx_begin
variable Utv(m, n);
Utv(Known) == Uorig(Known); % Fix known pixel values.
Ux = Utv(1:end,2:end) - Utv(1:end,1:end-1); % x (horiz) differences
Uy = Utv(2:end,1:end) - Utv(1:end-1,1:end); % y (vert) differences
minimize(norm([Ux(:); Uy(:)], 1)); % tv roughness measure
cvx_end

We get the following images.

[Figure: four image panels, titled orig, obscure, l2, and tv.]

5.7 Piecewise-linear fitting. In many applications some function in the model is not given by a formula,
but instead as tabulated data. The tabulated data could come from empirical measurements,
historical data, numerically evaluating some complex expression or solving some problem, for a set
of values of the argument. For use in a convex optimization model, we then have to fit these data
with a convex function that is compatible with the solver or other system that we use. In this
problem we explore a very simple problem of this general type.
Suppose we are given the data $(x_i, y_i)$, $i = 1, \ldots, m$, with $x_i, y_i \in \mathbf{R}$. We will assume that $x_i$ are
sorted, i.e., $x_1 < x_2 < \cdots < x_m$. Let $a_0 < a_1 < a_2 < \cdots < a_K$ be a set of fixed knot points, with
$a_0 \leq x_1$ and $a_K \geq x_m$. Explain how to find the convex piecewise linear function $f$, defined over
$[a_0, a_K]$, with knot points $a_i$, that minimizes the least-squares fitting criterion
\[
\sum_{i=1}^m (f(x_i) - y_i)^2.
\]
You must explain what the variables are and how they parametrize $f$, and how you ensure convexity
of $f$.
Hints. One method to solve this problem is based on the Lagrange basis, $f_0, \ldots, f_K$, which are the
piecewise linear functions that satisfy
\[
f_j(a_i) = \delta_{ij}, \quad i, j = 0, \ldots, K.
\]
Another method is based on defining $f(x) = \alpha_i x + \beta_i$, for $x \in (a_{i-1}, a_i]$. You then have to add
conditions on the parameters $\alpha_i$ and $\beta_i$ to ensure that $f$ is continuous and convex.
Apply your method to the data in the file pwl_fit_data.m, which contains data with $x_j \in [0, 1]$.
Find the best affine fit (which corresponds to $a = (0, 1)$), and the best piecewise-linear convex
function fit for 1, 2, and 3 internal knot points, evenly spaced in $[0, 1]$. (For example, for 3 internal
knot points we have $a_0 = 0$, $a_1 = 0.25$, $a_2 = 0.50$, $a_3 = 0.75$, $a_4 = 1$.) Give the least-squares
fitting cost for each one. Plot the data and the piecewise-linear fits found. Express each function
in the form
\[
f(x) = \max_{i=1,\ldots,K} (\alpha_i x + \beta_i).
\]
(In this form the function is easily incorporated into an optimization problem.)
Solution. Following the hint, we will use the Lagrange basis functions $f_0, \ldots, f_K$. These can be
expressed as
\[
f_0(x) = \left( \frac{a_1 - x}{a_1 - a_0} \right)_+,
\]
\[
f_i(x) = \left( \min\left\{ \frac{x - a_{i-1}}{a_i - a_{i-1}}, \; \frac{a_{i+1} - x}{a_{i+1} - a_i} \right\} \right)_+, \quad i = 1, \ldots, K - 1,
\]
and
\[
f_K(x) = \left( \frac{x - a_{K-1}}{a_K - a_{K-1}} \right)_+.
\]
The function $f$ can be parametrized as
\[
f(x) = \sum_{i=0}^K z_i f_i(x),
\]
where $z_i = f(a_i)$, $i = 0, \ldots, K$. We will use $z = (z_0, \ldots, z_K)$ to parametrize $f$. The least-squares
fitting criterion is then
\[
J = \sum_{i=1}^m (f(x_i) - y_i)^2 = \|F z - y\|_2^2,
\]
where $F \in \mathbf{R}^{m \times (K+1)}$ is the matrix
\[
F_{ij} = f_j(x_i), \quad i = 1, \ldots, m, \quad j = 0, \ldots, K.
\]
(We index the columns of $F$ from 0 to $K$ here.)

We must add the constraint that $f$ is convex. This is the same as the condition that the slopes of
the segments are nondecreasing, i.e.,
\[
\frac{z_{i+1} - z_i}{a_{i+1} - a_i} \geq \frac{z_i - z_{i-1}}{a_i - a_{i-1}}, \quad i = 1, \ldots, K - 1.
\]
This is a set of linear inequalities in $z$. Thus, the best PWL convex fit can be found by solving the
QP
\[
\begin{array}{ll}
\mbox{minimize} & \|F z - y\|_2^2 \\
\mbox{subject to} & \dfrac{z_{i+1} - z_i}{a_{i+1} - a_i} \geq \dfrac{z_i - z_{i-1}}{a_i - a_{i-1}}, \quad i = 1, \ldots, K - 1.
\end{array}
\]
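The construction of the basis matrix $F$ can be sketched in a few lines of NumPy. This is an illustrative sketch only: `hat_basis`, the knot vector, and the sample points are assumptions made here, and no data from pwl_fit_data.m is used. It checks the defining property $f_j(a_i) = \delta_{ij}$ and that the basis functions sum to one on $[a_0, a_K]$.

```python
import numpy as np

def hat_basis(x, a):
    """Matrix F with F[i, j] = f_j(x_i) for the Lagrange (hat) basis on knots a."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    K = len(a) - 1
    F = np.zeros((len(x), K + 1))
    F[:, 0] = np.maximum((a[1] - x) / (a[1] - a[0]), 0.0)
    for i in range(1, K):
        rising = (x - a[i - 1]) / (a[i] - a[i - 1])
        falling = (a[i + 1] - x) / (a[i + 1] - a[i])
        F[:, i] = np.maximum(np.minimum(rising, falling), 0.0)
    F[:, K] = np.maximum((x - a[K - 1]) / (a[K] - a[K - 1]), 0.0)
    return F

a = np.array([0.0, 0.25, 0.5, 0.75, 1.0])   # 3 internal knots, as in the example
x = np.linspace(0.0, 1.0, 11)               # illustrative sample points
F = hat_basis(x, a)

z = a ** 2                                  # knot values of a convex function
slopes = np.diff(z) / np.diff(a)            # segment slopes; nondecreasing means convex
mid_value = hat_basis([0.125], a) @ z       # f evaluated halfway between a_0 and a_1
print(slopes, mid_value)
```

With `F` in hand, the unconstrained fit is an ordinary least-squares problem in `z`; the convexity constraint is the statement that `slopes` is nondecreasing, which is linear in `z`.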

The following code solves this problem for the data in pwl_fit_data.

figure
plot(x,y,'k:','linewidth',2)
hold on

% Single line
p = [x ones(100,1)]\y;
alpha = p(1)
beta = p(2)
plot(x,alpha*x+beta,'b','linewidth',2)
mse = norm(alpha*x+beta-y)^2

for K = 2:4
% Generate Lagrange basis
a = (0:(1/K):1)'; % knot points as a column vector, to match z
F = max((a(2)-x)/(a(2)-a(1)),0);
for k = 2:K
a_1 = a(k-1);
a_2 = a(k);
a_3 = a(k+1);
f = max(0,min((x-a_1)/(a_2-a_1),(a_3-x)/(a_3-a_2)));
F = [F f];
end
f = max(0,(x-a(K))/(a(K+1)-a(K)));
F = [F f];

% Solve problem
cvx_begin
variable z(K+1)
minimize(norm(F*z-y))
subject to
(z(3:end)-z(2:end-1))./(a(3:end)-a(2:end-1)) >=...
(z(2:end-1)-z(1:end-2))./(a(2:end-1)-a(1:end-2))
cvx_end

% Calculate alpha and beta


alpha = (z(2:end)-z(1:end-1))./(a(2:end)-a(1:end-1))
beta = z(2:end)-alpha(1:end).*a(2:end)

% Plot solution
y2 = F*z;
mse = norm(y2-y)^2
if K==2
plot(x,y2,'r','linewidth',2)
elseif K==3
plot(x,y2,'g','linewidth',2)

Figure 8: Piecewise-linear approximations for K = 1, 2, 3, 4

else
plot(x,y2,'m','linewidth',2)
end

end
xlabel('x')
ylabel('y')

This generates figure 8. We can see that the approximation improves as $K$ increases. The following
table shows the result of this approximation.

K   $\alpha_1, \ldots, \alpha_K$      $\beta_1, \ldots, \beta_K$          $J$
1   1.91                       -0.87                        12.73
2   -0.27, 4.09                -0.33, -2.51                 2.62
3   -1.80, 2.67, 4.25          -0.10, -1.59, -2.65          0.60
4   -3.15, 2.11, 2.68, 4.90    0.03, -1.29, -1.57, -3.23    0.22
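Slopes and intercepts such as those tabulated above follow directly from the knot values. The sketch below, assuming NumPy, converts knot values $z$ into the pairs $(\alpha_i, \beta_i)$ and checks that the max-affine form agrees with linear interpolation when $z$ is convex. The knot values used are made up for illustration and are not the fitted values from the exercise.

```python
import numpy as np

def max_affine_from_knots(a, z):
    """Slopes alpha_i and intercepts beta_i of the segments through (a_i, z_i)."""
    a = np.asarray(a, dtype=float)
    z = np.asarray(z, dtype=float)
    alpha = np.diff(z) / np.diff(a)      # slope on (a_{i-1}, a_i]
    beta = z[1:] - alpha * a[1:]         # intercept so the segment passes through (a_i, z_i)
    return alpha, beta

def eval_max_affine(x, alpha, beta):
    """Evaluate max_i (alpha_i * x + beta_i) at each point of x."""
    x = np.asarray(x, dtype=float)
    return np.max(np.outer(x, alpha) + beta, axis=1)

a = np.array([0.0, 0.5, 1.0])            # hypothetical knots (K = 2)
z = np.array([0.2, -0.1, 0.9])           # hypothetical convex knot values
alpha, beta = max_affine_from_knots(a, z)

xs = np.linspace(0.0, 1.0, 21)
f_max = eval_max_affine(xs, alpha, beta)
f_interp = np.interp(xs, a, z)           # piecewise-linear interpolation of the knots
print(alpha, beta)
```

The agreement between `f_max` and `f_interp` relies on convexity: if the slopes were not nondecreasing, the pointwise maximum of the segments would no longer reproduce the interpolant.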

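There is another way to solve this problem. We are looking for a piecewise linear function. If we
have at least one internal knot ($K \geq 2$), the function should satisfy the two following constraints:

convexity: $\alpha_1 \leq \alpha_2 \leq \cdots \leq \alpha_K$
continuity: $\alpha_i a_i + \beta_i = \alpha_{i+1} a_i + \beta_{i+1}$, $i = 1, \ldots, K - 1$.

Therefore, the optimization problem is
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^m (f(x_i) - y_i)^2 \\
\mbox{subject to} & \alpha_i \leq \alpha_{i+1}, \quad i = 1, \ldots, K - 1 \\
& \alpha_i a_i + \beta_i = \alpha_{i+1} a_i + \beta_{i+1}, \quad i = 1, \ldots, K - 1.
\end{array}
\]
Reformulating the problem by representing $f(x_i)$ in matrix form, we get the equivalent problem
\[
\begin{array}{ll}
\mbox{minimize} & \|\mathbf{diag}(x) F \alpha + F \beta - y\|_2 \\
\mbox{subject to} & \alpha_i \leq \alpha_{i+1}, \quad i = 1, \ldots, K - 1 \\
& \alpha_i a_i + \beta_i = \alpha_{i+1} a_i + \beta_{i+1}, \quad i = 1, \ldots, K - 1,
\end{array}
\]
where the variables are $\alpha \in \mathbf{R}^K$ and $\beta \in \mathbf{R}^K$, and problem data are $x \in \mathbf{R}^m$, $y \in \mathbf{R}^m$ and
\[
F_{ij} = \left\{ \begin{array}{ll}
1 & \mbox{if } a_{j-1} = x_i, \; j = 1 \\
1 & \mbox{if } a_{j-1} < x_i \leq a_j \\
0 & \mbox{otherwise}.
\end{array} \right.
\]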

% another approach for PWL fitting problem


clear all;
pwl_fit_data;
m = length(x);
xp = 0:0.001:1; % for fine-grained pwl function plot
mp = length(xp);
yp = [];

for K = 1:4 % K segments, i.e., 0,1,2,3 internal knots

a = [0:1/K:1]'; % a_0,...,a_K (column vector, to match alpha and beta)
% matrix for sum f(x_i)
F = sparse(1:m,max(1,ceil(x*K)),1,m,K);

% solve problem
cvx_begin
variables alpha(K) beta(K)
minimize( norm(diag(x)*F*alpha+F*beta-y) )
subject to
if (K>=2)
alpha(1:K-1).*a(2:K)+beta(1:K-1) == alpha(2:K).*a(2:K)+beta(2:K)
alpha(1:K-1) <= alpha(2:K) % convexity: slopes are nondecreasing
end
cvx_end

fp = sparse(1:mp,max(1,ceil(xp*K)),1,mp,K);
yp = [yp diag(xp)*fp*alpha+fp*beta];
end
plot(x,y,'b.',xp,yp);

5.8 Least-squares fitting with convex splines. A cubic spline (or fourth-order spline) with breakpoints
$\alpha_0, \alpha_1, \ldots, \alpha_M$ (that satisfy $\alpha_0 < \alpha_1 < \cdots < \alpha_M$) is a piecewise-polynomial function with the
following properties:

the function is a cubic polynomial on each interval $[\alpha_i, \alpha_{i+1}]$

the function values, and the first and second derivatives are continuous on the interval $(\alpha_0, \alpha_M)$.

The figure shows an example of a cubic spline $f(t)$ with $M = 10$ segments and breakpoints $\alpha_0 = 0$,
$\alpha_1 = 1, \ldots, \alpha_{10} = 10$.

[Figure: example cubic spline on the interval from 0 to 10.]

In approximation problems with splines it is convenient to parametrize a spline as a linear
combination of basis functions, called B-splines. The precise definition of B-splines is not important for
our purposes; it is sufficient to know that every cubic spline can be written as a linear combination
of $M + 3$ cubic B-splines $g_k(t)$, i.e., in the form
\[
f(t) = x_1 g_1(t) + \cdots + x_{M+3} g_{M+3}(t) = x^T g(t),
\]
and that there exist efficient algorithms for computing $g(t) = (g_1(t), \ldots, g_{M+3}(t))$. The next figure
shows the 13 B-splines for the breakpoints $0, 1, \ldots, 10$.

[Figure: the 13 cubic B-splines for the breakpoints 0, 1, ..., 10.]

In this exercise we study the problem of fitting a cubic spline to a set of data points, subject to the
constraint that the spline is a convex function. Specifically, the breakpoints $\alpha_0, \ldots, \alpha_M$ are fixed,
and we are given $N$ data points $(t_k, y_k)$ with $t_k \in [\alpha_0, \alpha_M]$. We are asked to find the convex cubic
spline $f(t)$ that minimizes the least-squares criterion
\[
\sum_{k=1}^N (f(t_k) - y_k)^2.
\]
We will use B-splines to parametrize $f$, so the variables in the problem are the coefficients $x$ in
$f(t) = x^T g(t)$. The problem can then be written as
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^N \left( x^T g(t_k) - y_k \right)^2 \\
\mbox{subject to} & x^T g(t) \mbox{ is convex in } t \mbox{ on } [\alpha_0, \alpha_M].
\end{array}
\qquad (30)
\]
(a) Express problem (30) as a convex optimization problem of the form
\[
\begin{array}{ll}
\mbox{minimize} & \|Ax - b\|_2^2 \\
\mbox{subject to} & Gx \preceq h.
\end{array}
\]
(b) Use CVX to solve a specific instance of the optimization problem in part (a). As in the figures
above, we take $M = 10$ and $\alpha_0 = 0$, $\alpha_1 = 1, \ldots, \alpha_{10} = 10$.
Download the Matlab files spline_data.m and bsplines.m. The first m-file is used to generate
the problem data. The command [t, y] = spline_data will generate two vectors t, y of
length $N = 51$, with the data points $t_k$, $y_k$.
The second function can be used to compute the B-splines, and their first and second
derivatives, at any given point $u \in [0, 10]$. The command [g, gp, gpp] = bsplines(u) returns
three vectors of length 13 with elements $g_k(u)$, $g_k'(u)$, and $g_k''(u)$. (The right derivatives are
returned for $u = 0$, and the left derivatives for $u = 10$.)
Solve the convex spline fitting problem (30) for this example, and plot the optimal spline.

Solution.

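(a) The objective function is
\[
\sum_{k=1}^N (x^T g(t_k) - y_k)^2 = \|Ax - b\|_2^2
\]
with
\[
A = \begin{bmatrix} g(t_1)^T \\ g(t_2)^T \\ \vdots \\ g(t_N)^T \end{bmatrix}, \qquad
b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}.
\]
To handle the convexity constraint we note that $f''$ is piecewise linear in $t$. Therefore $f''(t) \geq 0$
for all $t \in (\alpha_0, \alpha_M)$ if and only if $f''(\alpha_k) = x^T g''(\alpha_k) \geq 0$ for $k = 0, \ldots, M$. This gives a set
of linear inequalities $Gx \preceq 0$ with
\[
G = - \begin{bmatrix} g''(\alpha_0)^T \\ g''(\alpha_1)^T \\ \vdots \\ g''(\alpha_M)^T \end{bmatrix}.
\]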

(b) [u, y] = spline_data;


N = length(u);

A = zeros(N, 13);
b = y;
for k = 1:N
[g, gp, gpp] = bsplines(u(k));
A(k,:) = g;
end;

% Solution without convexity constraint


xls = A\b;

% Solution with convexity constraint


G = zeros(11, 13);
for k = 1:11
[g, gp, gpp] = bsplines(k-1);
G(k,:)= gpp;
end;
cvx_begin
variable x(13);
minimize( norm(A*x - b) );
subject to
G*x >= 0;
cvx_end

% plot solution without convexity constraints
figure(1)
npts = 1000;
t = linspace(0, 10, npts);
fls = zeros(1, npts);
for k = 1:npts
[g, gp, gpp] = bsplines(t(k));
fls(k) = xls' * g;
end;
plot(u, y, 'o', t, fls, '-');

% plot solution with convexity constraints


figure(2)
f = zeros(1, npts);
for k = 1:npts
[g, gp, gpp] = bsplines(t(k));
f(k) = x' * g;
end;
plot(u, y, 'o', t, f, '-');

The function on the right is the optimal convex spline. The function on the left is the
optimal spline without convexity constraint.

[Figure: two panels over the interval from 0 to 10; left, the least-squares spline without the convexity constraint; right, the optimal convex spline.]

5.9 Robust least-squares with interval coefficient matrix. An interval matrix in $\mathbf{R}^{m \times n}$ is a matrix whose
entries are intervals:
\[
\mathcal{A} = \{A \in \mathbf{R}^{m \times n} \mid |A_{ij} - \bar{A}_{ij}| \leq R_{ij}, \; i = 1, \ldots, m, \; j = 1, \ldots, n\}.
\]
The matrix $\bar{A} \in \mathbf{R}^{m \times n}$ is called the nominal value or center value, and $R \in \mathbf{R}^{m \times n}$, which is
elementwise nonnegative, is called the radius.
The robust least-squares problem, with interval matrix, is
\[
\mbox{minimize} \quad \sup_{A \in \mathcal{A}} \|Ax - b\|_2,
\]
with optimization variable $x \in \mathbf{R}^n$. The problem data are $\mathcal{A}$ (i.e., $\bar{A}$ and $R$) and $b \in \mathbf{R}^m$. The
objective, as a function of $x$, is called the worst-case residual norm. The robust least-squares
problem is evidently a convex optimization problem.

(a) Formulate the interval matrix robust least-squares problem as a standard optimization
problem, e.g., a QP, SOCP, or SDP. You can introduce new variables if needed. Your reformulation
should have a number of variables and constraints that grows linearly with $m$ and $n$, and not
exponentially.
(b) Consider the specific problem instance with $m = 4$, $n = 3$,
\[
\mathcal{A} = \begin{bmatrix}
60 \pm 0.05 & 45 \pm 0.05 & -8 \pm 0.05 \\
90 \pm 0.05 & 30 \pm 0.05 & -30 \pm 0.05 \\
0 \pm 0.05 & -8 \pm 0.05 & -4 \pm 0.05 \\
30 \pm 0.05 & 10 \pm 0.05 & -10 \pm 0.05
\end{bmatrix}, \qquad
b = \begin{bmatrix} -6 \\ -3 \\ 18 \\ -9 \end{bmatrix}.
\]
(The first part of each entry in $\mathcal{A}$ gives $\bar{A}_{ij}$; the second gives $R_{ij}$, which are all 0.05 here.) Find
the solution $x_{\mathrm{ls}}$ of the nominal problem (i.e., minimize $\|\bar{A}x - b\|_2$), and robust least-squares
solution $x_{\mathrm{rls}}$. For each of these, find the nominal residual norm, and also the worst-case residual
norm. Make sure the results make sense.

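Solution.
(a) The problem is equivalent to
\[
\mbox{minimize} \quad \sup_{A \in \mathcal{A}} \|Ax - b\|_2^2,
\]
which can be reformulated as
\[
\begin{array}{ll}
\mbox{minimize} & y^T y \\
\mbox{subject to} & -y \preceq Ax - b \preceq y, \quad \mbox{for all } A \in \mathcal{A}.
\end{array}
\]
We have
\[
\sup_{A \in \mathcal{A}} (Ax - b)_i = \sum_{j=1}^n (\bar{A}_{ij} x_j + R_{ij} |x_j|) - b_i, \qquad
\inf_{A \in \mathcal{A}} (Ax - b)_i = \sum_{j=1}^n (\bar{A}_{ij} x_j - R_{ij} |x_j|) - b_i.
\]
We can therefore write the problem as
\[
\begin{array}{ll}
\mbox{minimize} & y^T y \\
\mbox{subject to} & \bar{A}x + R|x| - b \preceq y \\
& \bar{A}x - R|x| - b \succeq -y,
\end{array}
\]
where $|x| \in \mathbf{R}^n$ is the vector with elements $|x|_i = |x_i|$, or equivalently as the QP
\[
\begin{array}{ll}
\mbox{minimize} & y^T y \\
\mbox{subject to} & \bar{A}x + Rz - b \preceq y \\
& \bar{A}x - Rz - b \succeq -y \\
& -z \preceq x \preceq z.
\end{array}
\]
The variables are $x \in \mathbf{R}^n$, $y \in \mathbf{R}^m$, $z \in \mathbf{R}^n$.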
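The problem also has an alternative formulation: the trick is to find an alternative expression
for the worst-case residual norm as a function of $x$:
\[
f(x) = \sup_{A \in \mathcal{A}} \|Ax - b\|_2^2
= \sum_{i=1}^m \; \sup_{|V_{ij}| \leq R_{ij}, \; j = 1, \ldots, n} \left( \sum_{j=1}^n (\bar{A}_{ij} + V_{ij}) x_j - b_i \right)^2.
\]
Assume $x$ is fixed and define $r = \bar{A}x - b$. We work out the worst-case value of the uncertain
parameters $V$, i.e., the values that maximize
\[
\left( \sum_{j=1}^n (\bar{A}_{ij} + V_{ij}) x_j - b_i \right)^2 = \left( r_i + \sum_{j=1}^n V_{ij} x_j \right)^2
\]
for $i = 1, \ldots, m$. If $r_i \geq 0$, the worst-case values for $V_{ij}$ are the values that make $\sum_j V_{ij} x_j$
positive and as large as possible, i.e., $V_{ij} = R_{ij}$ if $x_j \geq 0$ and $V_{ij} = -R_{ij}$ otherwise. If
$r_i < 0$, the worst-case values for $V_{ij}$ are the values that make $\sum_j V_{ij} x_j$ negative and as large
as possible in magnitude, i.e., $V_{ij} = -R_{ij}$ if $x_j \geq 0$ and $V_{ij} = R_{ij}$ otherwise. Therefore
\[
f(x) = \sum_{i=1}^m \left( |r_i| + \sum_{j=1}^n R_{ij} |x_j| \right)^2
= \left\| \, |\bar{A}x - b| + R|x| \, \right\|_2^2
\]
if we interpret the absolute values of the vectors component-wise.

We can therefore write the problem as
\[
\begin{array}{ll}
\mbox{minimize} & y^T y \\
\mbox{subject to} & |\bar{A}x - b| + R|x| \preceq y.
\end{array}
\]
This can be further written as a QP
\[
\begin{array}{ll}
\mbox{minimize} & y^T y \\
\mbox{subject to} & \bar{A}x + Rz - b \preceq y \\
& \bar{A}x - Rz - b \succeq -y \\
& -z \preceq x \preceq z.
\end{array}
\]
The variables are $x \in \mathbf{R}^n$, $y \in \mathbf{R}^m$, $z \in \mathbf{R}^n$.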
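The closed-form worst-case residual norm derived in part (a) can be checked against brute-force enumeration. The sketch below assumes NumPy and uses the nominal matrix, radius, and right-hand side as they appear in the MATLAB script of part (b); the function names are illustrative. Because the squared residual of each row is convex in that row's perturbation, its maximum over the interval box is attained at a vertex, and the rows decouple, so enumerating $2^n$ sign patterns per row is exact.

```python
import itertools
import numpy as np

A_bar = np.array([[60.0, 45.0, -8.0],
                  [90.0, 30.0, -30.0],
                  [0.0, -8.0, -4.0],
                  [30.0, 10.0, -10.0]])
R = 0.05 * np.ones((4, 3))
b = np.array([-6.0, -3.0, 18.0, -9.0])

def worst_case_residual(x):
    """Closed form: || |A_bar x - b| + R |x| ||_2."""
    return np.linalg.norm(np.abs(A_bar @ x - b) + R @ np.abs(x))

def worst_case_by_enumeration(x):
    """Maximize each row's squared residual over the vertices of its interval box."""
    m, n = A_bar.shape
    total = 0.0
    for i in range(m):
        best = 0.0
        for signs in itertools.product([-1.0, 1.0], repeat=n):
            row = A_bar[i] + R[i] * np.array(signs)
            best = max(best, (row @ x - b[i]) ** 2)
        total += best
    return np.sqrt(total)

x_ls = np.linalg.lstsq(A_bar, b, rcond=None)[0]   # nominal least-squares solution
wc_formula = worst_case_residual(x_ls)
wc_enum = worst_case_by_enumeration(x_ls)
nominal = np.linalg.norm(A_bar @ x_ls - b)
print(nominal, wc_formula, wc_enum)
```

The enumeration is only practical for small $n$; the point of the formula is that the worst case can be evaluated, and minimized, without it.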


(b) The results are given by the following:
Residual norms for the nominal problem when using LS solution:
7.5895

Residual norms for the nominal problem when using robust solution:
17.7106

Residual norms for the worst-case problem when using LS solution:
26.7012

Residual norms for the worst-case problem when using robust solution:
17.7940
The MATLAB script below computes the least-squares and the robust solutions and also
computes, for each one, the nominal and the worst-case residual norms.
% input data
A_bar = [ 60 45 -8; ...
90 30 -30; ...
0 -8 -4; ...
30 10 -10];
d = .05;
R = d*ones(4,3);
b = [ -6; -3; 18; -9];

% least-squares solution
x_ls = A_bar\b;

% robust least-squares solution


cvx_begin
variables x(3) y(4) z(3)
minimize ( norm( y ) )
A_bar*x + R*z - b <= y
A_bar*x - R*z - b >= -y
x <= z
x + z >= 0
cvx_end

% computing nominal residual norms


nom_res_ls = norm(A_bar*x_ls - b);
nom_res_rls = norm(A_bar*x - b);

% computing worst-case nominal norms


r = A_bar*x_ls - b;
Delta = zeros(4,3);
for i=1:length(r)
if r(i) < 0
Delta(i,:) = -d*sign(x_ls);
else
Delta(i,:) = d*sign(x_ls);
end
end
wc_res_ls = norm(r + Delta*x_ls);

wc_res_rls = cvx_optval;

% display
disp('Residual norms for the nominal problem when using LS solution: ');
disp(nom_res_ls);
disp('Residual norms for the nominal problem when using robust solution: ');
disp(nom_res_rls);
disp('Residual norms for the worst-case problem when using LS solution: ');
disp(wc_res_ls);
disp('Residual norms for the worst-case problem when using robust solution: ');
disp(wc_res_rls);

The robust least-squares solution can also be found using the following script:
cvx_begin
variables x(3) t(4)
minimize ( norm ( t ) )
abs(A_bar*x - b) + R*abs(x) <= t
cvx_end
We also generated, for fun, the following histograms showing the distribution of the residual
norms for the case where $x = x_{\mathrm{ls}}$ and $x = x_{\mathrm{rls}}$. Those were obtained by creating 1000 instances
of $A$ by sampling $A_{ij}$ uniformly between $\bar{A}_{ij} - R_{ij}$ and $\bar{A}_{ij} + R_{ij}$, and then evaluating the
residual norm for each $A$ and each of the 2 solutions.

[Figure: two histograms of the sampled residual norms, titled title1 and title2; the horizontal axis of the first runs from 0 to 20, that of the second from 17.64 to 17.8.]

The following MATLAB script generates these histograms:


% Monte-Carlo simulation
N = 1000;
res_ls = zeros(N,1);
res_rls = zeros(N,1);
for k=1:N
Delta = d*(2*rand(4,3)-1);
A = A_bar + Delta;
res_ls(k) = norm(A*x_ls - b);
res_rls(k) = norm(A*x - b);
end
figure;
hist(res_ls,50);
figure;
hist(res_rls,50);
In Python, the least-squares and the robust solutions can be found using the following code.

# generating matrices
import numpy as np
import cvxpy as cvx

A_bar = np.array([[60., 45., -8.],
                  [90., 30., -30.],
                  [0., -8., -4.],
                  [30., 10., -10.]])
d = .05
R = d*np.ones((4,3))
b = np.array([-6., -3., 18., -9.])

# least-squares solution
x_ls = np.linalg.lstsq(A_bar, b, rcond=None)[0]

# robust least-squares solution
x = cvx.Variable(3)
y = cvx.Variable(4)
z = cvx.Variable(3)
objective = cvx.Minimize(cvx.norm(y))
constraints = [A_bar @ x + R @ z - b <= y,
               A_bar @ x - R @ z - b >= -y,
               x <= z,
               x + z >= 0]
prob = cvx.Problem(objective, constraints)
result = prob.solve()

x_rls = x.value
# computing nominal residual norms
nom_res_ls = np.linalg.norm(A_bar @ x_ls - b)
nom_res_rls = np.linalg.norm(A_bar @ x_rls - b)
# computing worst-case residual norms
r = A_bar @ x_ls - b
Delta = np.zeros((4,3))
for i in range(r.shape[0]):
    if r[i] < 0:
        Delta[i, :] = -d*np.sign(x_ls)
    else:
        Delta[i, :] = d*np.sign(x_ls)

wc_res_ls = np.linalg.norm(r + Delta @ x_ls)
wc_res_rls = result
# display
print("Residual norms for the nominal problem when using LS solution: ")
print(nom_res_ls)
print("Residual norms for the nominal problem when using robust solution: ")
print(nom_res_rls)
print("Residual norms for the worst-case problem when using LS solution: ")
print(wc_res_ls)
print("Residual norms for the worst-case problem when using robust solution: ")
print(wc_res_rls)

In Julia, the least-squares and the robust solutions can be found using the following code.

# generating matrices
close
A_bar = [ 60 45 -8;
90 30 -30;
0 -8 -4;
30 10 -10];
d = .05;
R = d*ones(4,3);
b = [ -6; -3; 18; -9];

using Convex, SCS

# least-squares solution
x_ls = \(A_bar, b);

# robust least-squares solution

x = Variable(3);
y = Variable(4);
z = Variable(3);
constraint = [A_bar*x + R*z - b <= y
A_bar*x - R*z - b >= -y
x <= z
x + z >= 0];
problem = minimize(norm(y), constraint);
solve!(problem)

# computing nominal residual norms


x_rls = x.value;
nom_res_ls = norm(A_bar*x_ls - b);
nom_res_rls = norm(A_bar*x_rls - b);
# computing worst-case nominal norms
r = A_bar*x_ls - b;
Delta = zeros(4,3);
for i=1:length(r)
if r[i] < 0
Delta[i,:] = -d*sign(x_ls);
else
Delta[i,:] = d*sign(x_ls);

end
end
wc_res_ls = norm(r + Delta*x_ls);
wc_res_rls = problem.optval;

# display
@printf("Nominal problem, using LS solution: %e\n", nom_res_ls);
@printf("Nominal problem, using robust solution: %e\n", nom_res_rls);
@printf("Worst-case problem, using LS solution: %e\n", wc_res_ls);
@printf("Worst-case problem, using robust solution: %e\n", wc_res_rls);

5.10 Identifying a sparse linear dynamical system. A linear dynamical system has the form
\[
x(t + 1) = Ax(t) + Bu(t) + w(t), \quad t = 1, \ldots, T - 1,
\]
where $x(t) \in \mathbf{R}^n$ is the state, $u(t) \in \mathbf{R}^m$ is the input signal, and $w(t) \in \mathbf{R}^n$ is the process noise,
at time $t$. We assume the process noises are IID $\mathcal{N}(0, W)$, where $W \succ 0$ is the covariance matrix.
The matrix $A \in \mathbf{R}^{n \times n}$ is called the dynamics matrix or the state transition matrix, and the matrix
$B \in \mathbf{R}^{n \times m}$ is called the input matrix.
You are given accurate measurements of the state and input signal, i.e., $x(1), \ldots, x(T)$, $u(1), \ldots, u(T - 1)$,
and $W$ is known. Your job is to find a state transition matrix $\hat{A}$ and input matrix $\hat{B}$ from these
data, that are plausible, and in addition are sparse, i.e., have many zero entries. (The sparser the
better.)
By doing this, you are effectively estimating the structure of the dynamical system, i.e., you are
determining which components of $x(t)$ and $u(t)$ affect which components of $x(t + 1)$. In some
applications, this structure might be more interesting than the actual values of the (nonzero)
coefficients in $\hat{A}$ and $\hat{B}$.

By plausible, we mean that
\[
\sum_{t=1}^{T-1} \left\| W^{-1/2} \left( x(t + 1) - \hat{A}x(t) - \hat{B}u(t) \right) \right\|_2^2 \leq n(T - 1) + 2\sqrt{2n(T - 1)}.
\]
(You can just take this as our definition of plausible. But to explain this choice, we note that when
$\hat{A} = A$ and $\hat{B} = B$, the left-hand side is $\chi^2$, with $n(T - 1)$ degrees of freedom, and so has mean
$n(T - 1)$ and standard deviation $\sqrt{2n(T - 1)}$. Thus, the constraint above states that the LHS does
not exceed the mean by more than 2 standard deviations.)
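The reasoning behind the plausibility threshold can be illustrated with a short simulation. This is a sketch assuming NumPy; the dimensions $n = 4$ and $T = 26$ (so that $n(T-1) = 100$) and the seed are hypothetical choices, not the values in sparse_lds_data.m. It draws $\chi^2$ samples as sums of squared standard normals and compares their mean and standard deviation with $n(T-1)$ and $\sqrt{2n(T-1)}$.

```python
import numpy as np

def plausibility_bound(n, T):
    """Mean plus two standard deviations of a chi-square with n(T-1) degrees of freedom."""
    dof = n * (T - 1)
    return dof + 2.0 * np.sqrt(2.0 * dof)

n, T = 4, 26                      # hypothetical sizes, giving 100 degrees of freedom
dof = n * (T - 1)
bound = plausibility_bound(n, T)

rng = np.random.default_rng(0)    # assumed seed
samples = np.sum(rng.standard_normal((20000, dof)) ** 2, axis=1)

sample_mean = samples.mean()
sample_std = samples.std()
frac_exceeding = np.mean(samples > bound)
print(bound, sample_mean, sample_std, frac_exceeding)
```

With the true $A$ and $B$, only a small fraction of noise realizations would violate the bound, which is why it serves as a reasonable definition of plausible.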

(a) Describe a method for finding $\hat{A}$ and $\hat{B}$, based on convex optimization.
We are looking for a very simple method, that involves solving one convex optimization
problem. (There are many extensions of this basic method, that would improve the simple
method, i.e., yield sparser $\hat{A}$ and $\hat{B}$ that are still plausible. We're not asking you to describe
or implement any of these.)
(b) Carry out your method on the data found in sparse_lds_data.m. Give the values of $\hat{A}$ and
$\hat{B}$ that you find, and verify that they are plausible.
In the data file, we give you the true values of $A$ and $B$, so you can evaluate the performance
of your method. (Needless to say, you are not allowed to use these values when forming $\hat{A}$ and
$\hat{B}$.) Using these true values, give the number of false positives and false negatives in both $\hat{A}$
and $\hat{B}$. A false positive in $\hat{A}$, for example, is an entry that is nonzero, while the corresponding
entry in $A$ is zero. A false negative is an entry of $\hat{A}$ that is zero, while the corresponding
entry of $A$ is nonzero. To judge whether an entry of $\hat{A}$ (or $\hat{B}$) is nonzero, you can use the test
$|\hat{A}_{ij}| \geq 0.01$ (or $|\hat{B}_{ij}| \geq 0.01$).
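The false positive and false negative counts defined above are straightforward to compute. Here is a minimal NumPy sketch; the function name `sparsity_errors` and the small example matrices are hypothetical and unrelated to the exercise data.

```python
import numpy as np

def sparsity_errors(M_hat, M_true, tol=0.01):
    """Return (false positives, false negatives) of the sparsity pattern of M_hat.

    An entry of M_hat counts as nonzero when its magnitude is at least tol.
    """
    est_nonzero = np.abs(M_hat) >= tol
    true_nonzero = M_true != 0
    false_pos = int(np.sum(est_nonzero & ~true_nonzero))
    false_neg = int(np.sum(~est_nonzero & true_nonzero))
    return false_pos, false_neg

# Hypothetical 2 x 2 example.
M_true = np.array([[1.0, 0.0],
                   [0.0, 0.5]])
M_hat = np.array([[0.9, 0.02],     # 0.02 >= tol where the truth is zero: false positive
                  [0.005, 0.0]])   # 0.005 < tol counts as zero; the 0.5 is missed: false negative
fp, fn = sparsity_errors(M_hat, M_true)
print(fp, fn)   # 1 1
```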

Solution. The problem can be expressed as
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{card}(\hat{A}) + \mathbf{card}(\hat{B}) \\
\mbox{subject to} & n(T - 1) - 2\sqrt{2n(T - 1)} \leq \sum_{t=1}^{T-1} \left\| W^{-1/2} \left( x(t + 1) - \hat{A}x(t) - \hat{B}u(t) \right) \right\|_2^2 \leq n(T - 1) + 2\sqrt{2n(T - 1)},
\end{array}
\]
where $\mathbf{card}(X)$ is the cardinality (the number of nonzero entries) of matrix $X$.

However, there are two problems with this: the objective is non-convex, and the lower bound
\[
\sum_{t=1}^{T-1} \left\| W^{-1/2} \left( x(t + 1) - \hat{A}x(t) - \hat{B}u(t) \right) \right\|_2^2 \geq n(T - 1) - 2\sqrt{2n(T - 1)}
\]
is not a convex constraint.
The second problem is dealt with by noting that we can always increase the magnitudes of the
implied errors by (for example) multiplying candidate $\hat{A}$ and $\hat{B}$ by a constant. Hence the lower
bound can be neglected, without loss of generality.
Unfortunately the card function is less comprehensively dealt with. However, we can use the
(standard) heuristic of instead minimizing the $\ell_1$ norm of the entries of $\hat{A}$ and $\hat{B}$. This gives the
convex optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & \|\mathbf{vec}(\hat{A})\|_1 + \|\mathbf{vec}(\hat{B})\|_1 \\
\mbox{subject to} & \sum_{t=1}^{T-1} \left\| W^{-1/2} \left( x(t + 1) - \hat{A}x(t) - \hat{B}u(t) \right) \right\|_2^2 \leq n(T - 1) + 2\sqrt{2n(T - 1)},
\end{array}
\]
where $\mathbf{vec}(X)$ represents the columns of $X$ concatenated to make a single column vector. Roughly
speaking, note that the constraint will always be tight: relaxing the requirement on the implied
errors allows more freedom to reduce the $\ell_1$ norm of $\hat{A}$ and $\hat{B}$. Thus, in reality, we never need to
worry about the lower bound from above as it will be always satisfied.
This problem is easily solved in CVX using the following code

% Load problem data.


sparse_lds_data;

fit_tol = sqrt(n*(T-1) + 2*sqrt(2*n*(T-1))); % fit tolerance.


cvx_begin
variables Ahat(n,n) Bhat(n,m);
minimize(sum(norms(Ahat, 1)) + sum(norms(Bhat, 1)))
norm(inv(Whalf)*(xs(:,2:T) - Ahat*xs(:,1:T-1) ...
- Bhat*us), 'fro') <= fit_tol;
cvx_end
disp(cvx_status)

% Check lower bound.


fit_tol_m = sqrt(n*(T-1) - 2*sqrt(2*n*(T-1)));
disp(norm(inv(Whalf)*(xs(:,2:T) ...
- Ahat*xs(:,1:T-1) - Bhat*us), 'fro') >= fit_tol_m);

% Round near-zero elements to zero.


Ahat = Ahat .* (abs(Ahat) >= 0.01)
Bhat = Bhat .* (abs(Bhat) >= 0.01)

% Display results.
disp(['false positives, Ahat: ' num2str(nnz((Ahat ~= 0) & (A == 0)))])
disp(['false negatives, Ahat: ' num2str(nnz((Ahat == 0) & (A ~= 0)))])
disp(['false positives, Bhat: ' num2str(nnz((Bhat ~= 0) & (B == 0)))])
disp(['false negatives, Bhat: ' num2str(nnz((Bhat == 0) & (B ~= 0)))])

With the given problem data, we get 1 false positive and 2 false negatives for $\hat{A}$, and no false
positives and 1 false negative for $\hat{B}$. The matrix estimates are
\[
\hat{A} = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1.1483 & 0.0899 & 0.1375 & 0.0108 & 0 & 0 \\
0 & 0 & 0 & 0.9329 & 0 & 0 & 0.2868 & 0 \\
0 & 0.2055 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.0190 & 0.9461 & 0 & 0.8697 \\
0 & 0 & 0 & 0 & 0.2066 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\]
and
\[
\hat{B} = \begin{bmatrix}
1.4717 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.2832 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
1.6363 & 0.0456 & 0 & 0 \\
0 & 1.4117 & 0 & 0 \\
0.0936 & 0 & 0 & 0.7755 \\
0 & 0.5705 & 0 & 0
\end{bmatrix}.
\]
Finally, there are lots of methods that will do better than this, usually by taking this as a starting
point and polishing the result after that. Several of these have been shown to give fairly reliable,
if modest, improvements. You were not required to implement any of these methods.

5.11 Measurement with bounded errors. A series of $K$ measurements $y_1, \ldots, y_K \in \mathbf{R}^p$, are taken in order
to estimate an unknown vector $x \in \mathbf{R}^q$. The measurements are related to the unknown vector $x$ by
$y_i = Ax + v_i$, where $v_i$ is a measurement noise that satisfies $\|v_i\|_\infty \leq \alpha$ but is otherwise unknown.
(In other words, the entries of $v_1, \ldots, v_K$ are no larger than $\alpha$.) The matrix $A$ and the measurement
noise norm bound $\alpha$ are known. Let $X$ denote the set of vectors $x$ that are consistent with the
observations $y_1, \ldots, y_K$, i.e., the set of $x$ that could have resulted in the measurements made. Is $X$
convex?
Now we will examine what happens when the measurements are occasionally in error, i.e., for a few
$i$ we have no relation between $x$ and $y_i$. More precisely suppose that $I_{\mathrm{fault}}$ is a subset of $\{1, \ldots, K\}$,
and that $y_i = Ax + v_i$ with $\|v_i\|_\infty \leq \alpha$ (as above) for $i \notin I_{\mathrm{fault}}$, but for $i \in I_{\mathrm{fault}}$, there is no relation
between $x$ and $y_i$. The set $I_{\mathrm{fault}}$ is the set of times of the faulty measurements.
Suppose you know that $I_{\mathrm{fault}}$ has at most $J$ elements, i.e., out of $K$ measurements, at most $J$ are
faulty. You do not know $I_{\mathrm{fault}}$; you know only a bound on its cardinality (size). For what values of
$J$ is $X$, the set of $x$ consistent with the measurements, convex?
Solution. The set X of vectors x consistent with the observations $y_1, \ldots, y_K$ is given by

\[ X = \{x \mid Ax \preceq y_i + \alpha e, \; Ax \succeq y_i - \alpha e, \; i = 1, \ldots, K\}, \]

where e is the p-dimensional vector of ones. This is a polyhedron and thus a convex set.
Now assume that we know an upper bound, J, on the cardinality of $I_{\rm fault}$, the set of faulty
measurements. If J > 0, for each possible set $I_{\rm fault}$, the set of vectors x consistent with the valid
measurements ($y_i$, $i \in \{1, \ldots, K\} \setminus I_{\rm fault}$) is again a polyhedron. The set X will be the union of
these polyhedra, which is in general not convex.
For example, consider the case where p = q = 1 and $A = \alpha = 1$. Assume that we have three
measurements where $y_1 = 2$, $y_2 = -0.5$, $y_3 = 0.5$. In the case that J = 1 we have four different
alternatives, for the cases where measurement 1, 2, 3, or none of them is faulty. The possible x
for the four cases are $[-0.5, 0.5]$, $[1, 1.5]$, $\emptyset$, $\emptyset$, respectively. Clearly all of these sets are convex,
whereas their union, which is equal to $[-0.5, 0.5] \cup [1, 1.5]$, is nonconvex.
Clearly, if J = 0 we are back to the case with no faulty measurements analyzed above. Thus, the
set X is convex for J = 0.
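
The scalar example can be checked numerically. The following is a minimal sketch in Python, assuming the example values y = (2, -0.5, 0.5) and a noise bound of 1; the helper names `consistent_interval` and `in_X` are illustrative and do not come from the original solution.

```python
def consistent_interval(measurements, alpha):
    """Intersect the intervals [y - alpha, y + alpha]; return None if the intersection is empty."""
    lo = max(y - alpha for y in measurements)
    hi = min(y + alpha for y in measurements)
    return (lo, hi) if lo <= hi else None

y = [2.0, -0.5, 0.5]   # assumed measurement values (scalar case, A = 1)
alpha = 1.0            # assumed noise bound

# J = 1: either no measurement is faulty, or exactly one of the three is.
fault_sets = [set(), {0}, {1}, {2}]
pieces = []
for faulty in fault_sets:
    valid = [yi for i, yi in enumerate(y) if i not in faulty]
    pieces.append(consistent_interval(valid, alpha))

# X is the union of the nonempty pieces.
feasible = [piece for piece in pieces if piece is not None]

def in_X(x):
    """True if x is consistent with the measurements for some admissible fault set."""
    return any(lo <= x <= hi for lo, hi in feasible)

print(feasible)
print(in_X(0.5), in_X(1.0), in_X(0.75))
```

The points 0.5 and 1 both belong to X, but their midpoint 0.75 does not, which is exactly the failure of convexity described above.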

5.12 Least-squares with some permuted measurements. We want to estimate a vector $x \in \mathbf{R}^n$, given
some linear measurements of x corrupted with Gaussian noise. Here's the catch: some of the
measurements have been permuted.
More precisely, our measurement vector $y \in \mathbf{R}^m$ has the form

\[ y = P(Ax + v), \]

where $v_i$ are IID $\mathcal{N}(0, 1)$ measurement noises, $x \in \mathbf{R}^n$ is the vector of parameters we wish to
estimate, and $P \in \mathbf{R}^{m \times m}$ is a permutation matrix. (This means that each row and column of P
has exactly one entry equal to one, and the remaining $m - 1$ entries zero.) We assume that m > n
and that at most k of the measurements are permuted; i.e., $Pe_i \neq e_i$ for no more than k indices i.
We are interested in the case when k < m (e.g., k = 0.4m); that is, only some of the measurements
have been permuted. We want to estimate x and P.
Once we make a guess $\hat P$ for P, we can get the maximum likelihood estimate of x by minimizing
$\|Ax - \hat P^T y\|_2$. The residual $A\hat x - \hat P^T y$ is then our guess of what v is, and should be consistent with
being a sample of a $\mathcal{N}(0, I)$ vector.
In principle, we can find the maximum likelihood estimate of x and P by solving a set of
$\binom{m}{k}(k! - 1)$ least-squares problems, and choosing one that has minimum residual. But this is not
practical unless m and k are both very small.
Describe a heuristic method for approximately solving this problem, using convex optimization.
(There are many different approaches which work quite well.)

You might find the following fact useful. The solution to

\[ \mbox{minimize} \quad \|Ax - P^T y\| \]

over $P \in \mathbf{R}^{m \times m}$ a permutation matrix, is the permutation that matches the smallest entry in y
with the smallest entry in Ax, does the same for the second smallest entries, and so forth.
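
This fact can be sanity-checked on a small instance by comparing the sorting rule against brute-force enumeration of all permutations. The sketch below is illustrative only: the vector `pred` (standing in for Ax) and the vector `y` are made-up values, and the helper names are hypothetical.

```python
from itertools import permutations

def sorted_matching(pred, y):
    """Reorder y so that its j-th smallest entry sits where pred has its j-th smallest entry."""
    pred_order = sorted(range(len(pred)), key=lambda i: pred[i])
    y_order = sorted(range(len(y)), key=lambda i: y[i])
    matched = [0.0] * len(y)
    for i_pred, i_y in zip(pred_order, y_order):
        matched[i_pred] = y[i_y]
    return matched

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

pred = [3.0, -1.0, 0.5, 2.0]   # made-up stand-in for Ax
y = [0.4, 2.9, 2.2, -1.3]      # made-up permuted, noisy copy of pred

matched = sorted_matching(pred, y)
best = min(sq_dist(pred, list(perm)) for perm in permutations(y))

print(matched)
print(sq_dist(pred, matched), best)
```

On this instance the sorted matching attains the same squared distance as the best of the 24 permutations, without enumerating them.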
Carry out your method on the data in ls_perm_meas_data.*. Give your estimate of the permuted
indices. The data file includes the true permutation matrix and value of x (which of course you
cannot use in forming your estimate). Compare the estimate of x you get after your guessed
permutation with the estimate obtained assuming P = I.
Remark. This problem comes up in several applications. In target tracking, we get multiple noisy
measurements of a set of targets, and then guess which targets are the same in the different sets of
measurements. If some of our guesses are wrong (i.e., our target association is wrong) we have the
present problem. In vision systems the problem arises when we have multiple camera views of a
scene, which give us noisy measurements of a set of features. A feature correspondence algorithm
guesses which features in one view correspond to features in other views. If we make some feature
correspondence errors, we have the present problem.
Note. If you are using Julia, you might have to set the solver to run more than the default number
of iterations, using solve!(problem, SCSSolver(max_iters=10000)).
Solution.
The basic idea is to treat the permuted measurements as measurements with high noise, i.e., as
outliers. So we first use some robust estimator, like $\ell_1$ or Huber, to estimate x, with no
permutations.
We then look for measurements with big residuals; these are (probably) the ones that were
permuted. If a small enough number of candidates come up, we can try all permutations of these
measurements, to see which has the smallest residuals.
Otherwise we can remove the k largest outliers, and solve the least-squares problem without those
measurements. We can then use this estimate $\hat x$ to permute the suspected measurements, using the
method described in the problem statement.
Finally, the method can be applied recursively. That is, once we guess $\hat P$, we permute the
measurements to get $\tilde y = \hat P^T y$, and apply the method again. If with this new data we guess that $\hat P = I$,
we're done; if not, we permute again. (Our final estimate of the permutation is then the product
of the estimated permutations from each step.)
The figures show the residuals after solving the robust estimation problem. There are some clear
outliers, the k largest of which we take to be our candidate indices. We use one iteration of the
method outlined above to get $x_{\rm final}$. Simply using P = I gives the naive estimator, $x_{\rm naive}$. The
errors were
\[ \|x_{\rm true} - x_{\rm final}\| = 0.061, \qquad \|x_{\rm true} - x_{\rm naive}\| = 3.4363 \]
in Matlab,
\[ \|x_{\rm true} - x_{\rm final}\| = 0.084, \qquad \|x_{\rm true} - x_{\rm naive}\| = 2.2684 \]
in Python, and
\[ \|x_{\rm true} - x_{\rm final}\| = 0.068, \qquad \|x_{\rm true} - x_{\rm naive}\| = 2.6768 \]
in Julia. We identified all but one of the (37) permuted indices for the Matlab case, all but four
of the (40) permuted indices for the Python case, and all but two of the (39) permuted indices for
the Julia case.

[Figure 9: Robust estimator residuals (residual versus idx)]

[Figure 10: Robust estimator residuals (Python) (residual versus idx)]

[Figure 11: Robust estimator residuals (Julia) (residual versus idx)]

The following Matlab code solves the problem.

ls_perm_meas_data

% naive estimator (P = I)
x_naive = A\y;

% robust estimator
cvx_begin
variable x_hub(n)
minimize(sum(huber(A*x_hub-y,1)))
cvx_end

plot(abs(A*x_hub-y),'.','MarkerSize', 15);
ylabel('residual');xlabel('idx');

% remove k largest residuals
[vals,cand_idxs]=sort(abs(A*x_hub-y),'descend');
cand_idxs=sort(cand_idxs(1:k));
A_hat=A;y_hat=y;
A_hat(cand_idxs,:)=[];y_hat(cand_idxs)=[];
% ls estimate with candidate idxs removed
x_ls=A_hat\y_hat;

% match predicted outputs with measurements


[a,b]=sort(A(cand_idxs,:)*x_ls);
[a,c]=sort(y(cand_idxs));

% reorder A matrix
cand_perms=cand_idxs;
cand_perms(b,:)=cand_perms(c,:);
A(cand_perms,:)=A(cand_idxs,:);
x_final=A\y;

% final estimate of permuted indices


cand_idxs(find(cand_perms==cand_idxs))=[];

% residuals
norm(x_naive-x_true)
norm(x_final-x_true)

The following Python code solves the problem.

import cvxpy as cvx


import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
m=100
k=40 # max # permuted measurements
n=20
A=10*np.random.randn(m,n)
x_true=np.random.randn(n,1) # true x value
y_true = A.dot(x_true) + np.random.randn(m,1)
# build permuted indices
perm_idxs=np.random.permutation(m)
perm_idxs=np.sort(perm_idxs[:k])
temp_perm=np.random.permutation(k)
new_pos=np.zeros(k)
for i in range(k):
    new_pos[i] = perm_idxs[temp_perm[i]]
new_pos = new_pos.astype(int)
# true permutation matrix
P=np.identity(m)
P[perm_idxs]=P[new_pos,:]
true_perm=[]
for i in range(k):
    if perm_idxs[i] != new_pos[i]:
        true_perm = np.append(true_perm, perm_idxs[i])
y=P.dot(y_true)
new_pos = None

# naive estimator (P=I)


x_naive = np.linalg.lstsq(A,y)[0]

# robust estimator
x_hub = cvx.Variable(n)
obj = cvx.sum_entries(cvx.huber(A*x_hub-y))
cvx.Problem(cvx.Minimize(obj)).solve()

plt.figure(1)
plt.plot(np.arange(m), np.abs(A.dot(x_hub.value)-y), '.')
plt.ylabel('residual')
plt.xlabel('idx')

# remove k largest residuals


cand_idxs = np.zeros(m);
cand_idxs[:] = np.fliplr(np.argsort(np.abs(A.dot(x_hub.value)-y).T))
cand_idxs = np.sort(cand_idxs[:k])
cand_idxs = cand_idxs.astype(int)
keep_idxs = np.zeros(m);
keep_idxs[:] = np.argsort(np.abs(A.dot(x_hub.value)-y).T)
keep_idxs = np.sort(keep_idxs[:(m-k)])
keep_idxs = keep_idxs.astype(int)
A_hat = A[keep_idxs,:]
y_hat = y[keep_idxs,:]
# ls estimate with candidate idxs removed
x_ls = np.linalg.lstsq(A_hat,y_hat)[0]

# match predicted outputs with measurements


b = np.zeros(k)
c = np.zeros(k)
b[:] = np.argsort(A[cand_idxs,:].dot(x_ls).T)
b = b.astype(int)
c[:] = np.argsort(y[cand_idxs].T)
c = c.astype(int)

# reorder A matrix
cand_perms = np.zeros(len(cand_idxs));

cand_perms[:]=cand_idxs[:]
cand_perms[b]=cand_perms[c]
cand_perms = cand_perms.astype(int)
A[cand_perms,:]=A[cand_idxs,:]
x_final = np.linalg.lstsq(A,y)[0]

# final estimate of permuted indices


perm_estimate = []
for i in range(k):
    if cand_perms[i] != cand_idxs[i]:
        perm_estimate = np.append(perm_estimate, cand_idxs[i])

naive_error = np.linalg.norm(x_naive-x_true)
final_error = np.linalg.norm(x_final-x_true)

The following Julia code solves the problem.

include("ls_perm_meas_data.jl")

using Convex, SCS, Gadfly

# naive estimator (P = I)
x_naive = A\y;

# robust estimator
x_hub = Variable(n);
p = minimize(sum(huber(A*x_hub-y, 1)));
solve!(p, SCSSolver(max_iters=10000));
residuals = abs(vec(evaluate(A*x_hub - y)));

pl = plot(
x=1:length(residuals),
y=residuals,
Geom.point,
Guide.xlabel("idx"),
Guide.ylabel("residual")
);
display(pl);
draw(PS("ls_perm_meas_jl.eps", 6inch, 5inch), pl);

# remove k largest residuals


cand_idxs = sortperm(residuals, rev=true);
not_cand_idxs = sort(cand_idxs[k+1:end]);
cand_idxs = sort(cand_idxs[1:k]);
A_hat = A[not_cand_idxs, :];
y_hat = y[not_cand_idxs];

# ls estimate with candidate idxs removed
x_ls = A_hat\y_hat;

# match predicted outputs with measurements


b = sortperm(A[cand_idxs, :]*x_ls);
c = sortperm(y[cand_idxs]);

# reorder A matrix
cand_perms = copy(cand_idxs);
cand_perms[b] = cand_perms[c];
A[cand_perms,:] = A[cand_idxs, :];
x_final = A\y;

# final estimate of permuted indices


cand_idxs = cand_idxs[cand_perms.!=cand_idxs];

# residuals
println(norm(x_naive-x_true));
println(norm(x_final-x_true));

5.13 Fitting with censored data. In some experiments there are two kinds of measurements or data
available: The usual ones, in which you get a number (say), and censored data, in which you don't
get the specific number, but are told something about it, such as a lower bound. A classic example
is a study of lifetimes of a set of subjects (say, laboratory mice). For those who have died by the end
of data collection, we get the lifetime. For those who have not died by the end of data collection,
we do not have the lifetime, but we do have a lower bound, i.e., the length of the study. These are
the censored data values.
We wish to fit a set of data points,

\[ (x^{(1)}, y^{(1)}), \ldots, (x^{(K)}, y^{(K)}), \]

with $x^{(k)} \in \mathbf{R}^n$ and $y^{(k)} \in \mathbf{R}$, with a linear model of the form $y \approx c^T x$. The vector $c \in \mathbf{R}^n$ is the
model parameter, which we want to choose. We will use a least-squares criterion, i.e., choose c to
minimize
\[ J = \sum_{k=1}^K \left( y^{(k)} - c^T x^{(k)} \right)^2. \]

Here is the tricky part: some of the values of $y^{(k)}$ are censored; for these entries, we have only a
(given) lower bound. We will re-order the data so that $y^{(1)}, \ldots, y^{(M)}$ are given (i.e., uncensored),
while $y^{(M+1)}, \ldots, y^{(K)}$ are all censored, i.e., unknown, but larger than D, a given number. All the
values of $x^{(k)}$ are known.

(a) Explain how to find c (the model parameter) and $y^{(M+1)}, \ldots, y^{(K)}$ (the censored data values)
that minimize J.
(b) Carry out the method of part (a) on the data values in cens_fit_data.*. Report $\hat c$, the value
of c found using this method.
Also find $\hat c_{\rm ls}$, the least-squares estimate of c obtained by simply ignoring the censored data
samples, i.e., the least-squares estimate based on the data

\[ (x^{(1)}, y^{(1)}), \ldots, (x^{(M)}, y^{(M)}). \]

The data file contains $c_{\rm true}$, the true value of c, in the vector c_true. Use this to give the two
relative errors
\[ \frac{\|c_{\rm true} - \hat c\|_2}{\|c_{\rm true}\|_2}, \qquad \frac{\|c_{\rm true} - \hat c_{\rm ls}\|_2}{\|c_{\rm true}\|_2}. \]

Solution.

(a) The trick is to introduce dummy variables to serve as placeholders for the measurements
which are censored, i.e., $y^{(k)}$, $k = M + 1, \ldots, K$. By introducing the dummy variables
$(z^{(1)}, \ldots, z^{(K-M)})$, we get the QP
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^{M} \left( y^{(k)} - c^T x^{(k)} \right)^2 + \sum_{k=M+1}^{K} \left( z^{(k-M)} - c^T x^{(k)} \right)^2 \\
\mbox{subject to} & z^{(k)} \geq D, \quad k = 1, \ldots, K - M,
\end{array}
\]
where the variables are c and $z^{(k)}$, $k = 1, \ldots, K - M$.
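
A side remark that may help interpret this QP (it is not part of the original solution): for fixed c the dummy variables decouple, and each one is chosen as close to the model prediction as the lower bound D allows. Eliminating them gives an equivalent problem in c alone:

```latex
\[
z^{(k-M)\star} = \max\{D, \; c^T x^{(k)}\}, \qquad k = M+1, \ldots, K,
\]
\[
\mbox{minimize} \quad
\sum_{k=1}^{M} \left( y^{(k)} - c^T x^{(k)} \right)^2
+ \sum_{k=M+1}^{K} \left( D - c^T x^{(k)} \right)_+^2,
\qquad (a)_+ = \max\{a, 0\}.
\]
```

In words, an uncensored point is penalized in the usual least-squares way, while a censored point is penalized only when the model predicts a value below D. The second sum is convex in c, since the squared positive part is convex and is composed with an affine function of c.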


(b) The following MATLAB code solves the problem.
cens_fit_data;

% Using censored data method


cvx_begin
variables c(n) z(K-M)
minimize(sum_square(y-X(:,1:M)'*c)+sum_square(z-X(:,M+1:K)'*c))
subject to
z >= D
cvx_end
c_cens = c;

% Comparison to least squares method, ignoring all censored data


cvx_begin
variable c(n)
minimize(sum_square(y-X(:,1:M)'*c))
cvx_end
c_ls = c;

[c_true c_cens c_ls]


cens_relerr = norm(c_cens-c_true)/norm(c_true)
ls_relerr = norm(c_ls-c_true)/norm(c_true)
We get the following estimates of our parameter vector:
[c_true c_cens c_ls] =
-0.4326 -0.2946 -0.3476

-1.6656 -1.7541 -1.7955
0.1253 0.2589 0.2000
0.2877 0.2241 0.1672
-1.1465 -0.9917 -0.8357
1.1909 1.3018 1.3005
1.1892 1.4262 1.8276
-0.0376 -0.1554 -0.5612
0.3273 0.3785 0.3686
0.1746 0.2261 -0.0454
-0.1867 -0.0826 -0.1096
0.7258 1.0427 1.5265
-0.5883 -0.4648 -0.4980
2.1832 2.1942 2.4164
-0.1364 -0.3586 -0.5563
0.1139 -0.1973 -0.3701
1.0668 1.0194 0.9900
0.0593 -0.1186 -0.2539
-0.0956 -0.1211 -0.1762
-0.8323 -0.7523 -0.4349
This gives a relative error of 0.1784 for $\hat c$, and a relative error of 0.3907 for $\hat c_{\rm ls}$.
The following Python code solves the problem.
import numpy as np
import cvxpy as cvx
# data for censored fitting problem.
np.random.seed(15)

n = 20; # dimension of xs
M = 25; # number of non-censored data points
K = 100; # total number of points
c_true = np.random.randn(n,1)
X = np.random.randn(n,K)
y = np.dot(np.transpose(X),c_true) + 0.1*(np.sqrt(n))*np.random.randn(K,1)

# Reorder measurements, then censor


sort_ind = np.argsort(y.T)
y = np.sort(y.T)
y = y.T
X = X[:, sort_ind.T]
D = (y[M-1]+y[M])/2.0
y = y[range(M)]

# Using censored data method


z = cvx.Variable(K-M)
c = cvx.Variable(n)
constraints = [D <= z]
objective = cvx.Minimize(cvx.sum_squares(y-X[:,range(M)].T*c) + cvx.sum_squares(z-X[:,M:].T*c))
prob = cvx.Problem(objective, constraints)
prob.solve()
c_cens = c.value

# Comparison to least squares method, ignoring all censored data


objective = cvx.Minimize(cvx.sum_squares(y-X[:,range(M)].T*c))
prob = cvx.Problem(objective, [])
prob.solve()
c_ls = c.value;
cens_relerr = np.linalg.norm(c_cens-c_true)/np.linalg.norm(c_true)
ls_relerr = np.linalg.norm(c_ls-c_true)/np.linalg.norm(c_true)

print("c_true is:")
print(c_true.T)
print("c_cens is:")
print(np.squeeze(np.asarray(c_cens.T)))
print("c_ls is:")
print(np.squeeze(np.asarray(c_ls.T)))
print("The relative error when we use the censored data is:")
print(cens_relerr)
print("The relative error when we do not use the censored data is:")
print(ls_relerr)
We get the following estimates of our parameter vector:
[c_true c_cens c_ls] =
-0.3123 -0.4063 -0.8695
0.3393 0.4083 0.3878
-0.1559 -0.3226 -0.0792
-0.5018 -0.6489 -0.5269
0.2356 0.3401 0.4485
-1.7636 -1.8654 -2.1460
-1.0959 -0.9181 -0.7917
-1.0878 -1.1706 -0.8663
-0.3052 -0.3446 -0.1832
-0.4737 -0.4661 -0.2506
-0.2006 -0.2165 -0.1583
0.3552 0.2531 0.5553
0.6895 0.5241 0.4282
0.4106 0.3151 0.0690
-0.5650 -0.5048 -0.4336
0.5994 0.5844 0.3568
-0.1629 -0.1862 -0.2024
1.6002 1.6390 2.0071
0.6816 0.6823 0.8107
0.0149 0.1073 0.0936

This gives a relative error of 0.1306 for $\hat c$, and a relative error of 0.3332 for $\hat c_{\rm ls}$.
The following Julia code solves the problem.
include("cens_fit_data.jl");

using Convex, SCS

z = Variable(K-M);
c = Variable(n);
constraints = [D <= z];
prob = minimize(sumsquares(y - X[:,1:M]'*c) + sumsquares(z - X[:, M+1:end]'*c), constraints);
solve!(prob, SCSSolver(verbose=0));
c_cens = c.value;

# Comparison to least squares method, ignoring all censored data


prob = minimize(sumsquares(y - X[:,1:M]'*c), constraints);
solve!(prob, SCSSolver(verbose=0));
c_ls = c.value;

cens_relerr = norm(c_cens - c_true)/norm(c_true);


ls_relerr = norm(c_ls - c_true)/norm(c_true);

println("c_true, c_cens, c_ls: ")


println([c_true c_cens c_ls])

@printf("Relative error when we use the censored data is: %.4f\n", cens_relerr)
@printf("Relative error when we do not use the censored data is: %.4f\n", ls_relerr)

We get the following estimates of our parameter vector:


[c_true c_cens c_ls] =
-0.1555 0.0593 -0.0546
0.7980 1.0252 0.7304
-0.6821 -0.7113 -0.9468
0.1050 0.2874 0.5334
0.1488 0.1385 0.5203
0.2274 0.2207 0.1647
-0.3334 -0.2384 -0.1678
1.9171 1.5830 1.7391
-0.6174 -0.4677 -0.2668
-0.3075 -0.6946 -0.8639
1.2054 1.2582 1.3485
-0.3394 -0.1406 0.1117
-1.9851 -1.9871 -2.1897
0.8647 0.8050 0.9064

1.1234 1.2350 1.4486
-1.0329 -1.0031 -1.2435
0.1215 0.1263 0.2917
0.5975 0.5997 0.4654
-0.5449 -0.6555 -0.8844
-1.3857 -1.6498 -1.8234
This gives a relative error of 0.1844 for $\hat c$, and a relative error of 0.3170 for $\hat c_{\rm ls}$.

5.14 Spectrum analysis with quantized measurements. A sample is made up of n compounds, in quantities
$q_i \geq 0$, for $i = 1, \ldots, n$. Each compound has a (nonnegative) spectrum, which we represent as a
vector $s^{(i)} \in \mathbf{R}^m_+$, for $i = 1, \ldots, n$. (Precisely what $s^{(i)}$ means won't matter to us.) The spectrum
of the sample is given by $s = \sum_{i=1}^n q_i s^{(i)}$. We can write this more compactly as $s = Sq$, where
$S \in \mathbf{R}^{m \times n}$ is a matrix whose columns are $s^{(1)}, \ldots, s^{(n)}$.
Measurement of the spectrum of the sample gives us an interval for each spectrum value, i.e.,
$l, u \in \mathbf{R}^m_+$ for which
\[ l_i \leq s_i \leq u_i, \quad i = 1, \ldots, m. \]
(We don't directly get s.) This occurs, for example, if our measurements are quantized.
Given l and u (and S), we cannot in general deduce q exactly. Instead, we ask you to do the
following. For each compound i, find the range of possible values for $q_i$ consistent with the spectrum
measurements. We will denote these ranges as $q_i \in [q_i^{\min}, q_i^{\max}]$. Your job is to find $q_i^{\min}$ and $q_i^{\max}$.
Note that if $q_i^{\min}$ is large, we can confidently conclude that there is a significant amount of compound
i in the sample. If $q_i^{\max}$ is small, we can confidently conclude that there is not much of compound
i in the sample.

(a) Explain how to find $q_i^{\min}$ and $q_i^{\max}$, given S, l, and u.
(b) Carry out the method of part (a) for the problem instance given in spectrum_data.m.
(Executing this file defines the problem data, and plots the compound spectra and measurement
bounds.) Plot the minimum and maximum values versus i, using the commented out code in
the data file. Report your values for $q_4^{\min}$ and $q_4^{\max}$.

Solution. To find $q_i^{\min}$, we solve the convex optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & q_i \\
\mbox{subject to} & l \preceq Sq \preceq u, \quad q \succeq 0,
\end{array}
\]
with variable $q \in \mathbf{R}^n$. Then we set $q_i^{\min} = q_i^\star$. Similarly, to find $q_i^{\max}$, we solve the convex
optimization problem
\[
\begin{array}{ll}
\mbox{maximize} & q_i \\
\mbox{subject to} & l \preceq Sq \preceq u, \quad q \succeq 0,
\end{array}
\]
and we set $q_i^{\max} = q_i^\star$. Here is a plot of the spectra of the compounds $s^{(1)}, \ldots, s^{(n)}$, alongside the
lower and upper bounds l and u.

[Figure: spectra of the compounds and the measurement bounds l and u]

Here is a plot of $q_i^{\min}$ and $q_i^{\max}$, for $i = 1, \ldots, n$.

[Figure: the ranges $[q_i^{\min}, q_i^{\max}]$ versus i]

For this particular instance, $q_4^{\min} = 0.121$, $q_4^{\max} = 0.205$.


The matlab code to solve this problem is given below.

spectrum_data;

qmin = zeros(n,1); qmax = zeros(n,1);


for i = 1:n
cvx_begin
variable q(n)
l <= S*q; u >= S*q;
q >= 0;
minimize(q(i))

cvx_end
qmin(i) = q(i);
cvx_begin
variable q(n)
l <= S*q; u >= S*q;
q >= 0;
maximize(q(i))
cvx_end
qmax(i) = q(i);
end

% print quantity bounds


[qmin qmax]

figure; hold on;


for i = 1:n
    plot([i,i],[qmin(i),qmax(i)],'o-','MarkerFaceColor','b');
end
axis([0,11,0,1]);
%print('-depsc','spect_minmax.eps');

5.15 Learning a quadratic pseudo-metric from distance measurements. We are given a set of N pairs of
points in $\mathbf{R}^n$, $x_1, \ldots, x_N$, and $y_1, \ldots, y_N$, together with a set of distances $d_1, \ldots, d_N > 0$.
The goal is to find (or estimate or learn) a quadratic pseudo-metric d,
\[ d(x, y) = \left( (x - y)^T P (x - y) \right)^{1/2}, \]
with $P \in \mathbf{S}^n_+$, which approximates the given distances, i.e., $d(x_i, y_i) \approx d_i$. (The pseudo-metric d is
a metric only when $P \succ 0$; when $P \succeq 0$ is singular, it is a pseudo-metric.)
To do this, we will choose $P \in \mathbf{S}^n_+$ that minimizes the mean squared error objective
\[ \frac{1}{N} \sum_{i=1}^N \left( d_i - d(x_i, y_i) \right)^2. \]

(a) Explain how to find P using convex or quasiconvex optimization. If you cannot find an exact
formulation (i.e., one that is guaranteed to minimize the total squared error objective), give
a formulation that approximately minimizes the given objective, subject to the constraints.
(b) Carry out the method of part (a) with the data given in quad_metric_data.m. The columns
of the matrices X and Y are the points $x_i$ and $y_i$; the row vector d gives the distances $d_i$. Give
the optimal mean squared distance error.
We also provide a test set, with data X_test, Y_test, and d_test. Report the mean squared
distance error on the test set (using the metric found using the data set above).
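
The pseudo-metric and the mean squared error objective defined above can be evaluated directly. The following is a small sketch in plain Python; the matrix P, the points, and the distances are made-up values chosen so that P is singular, and the function names are hypothetical.

```python
import math

def pseudo_metric(P, x, y):
    """d(x, y) = sqrt((x - y)^T P (x - y)) for a symmetric PSD matrix P given as a list of rows."""
    z = [a - b for a, b in zip(x, y)]
    Pz = [sum(P[i][j] * z[j] for j in range(len(z))) for i in range(len(z))]
    return math.sqrt(sum(z[i] * Pz[i] for i in range(len(z))))

def mean_squared_error(P, X, Y, d):
    """(1/N) * sum_i (d_i - d(x_i, y_i))^2 over the given pairs of points."""
    total = sum((di - pseudo_metric(P, x, y)) ** 2 for x, y, di in zip(X, Y, d))
    return total / len(d)

# Made-up data: P is singular (it ignores the second coordinate).
P = [[1.0, 0.0], [0.0, 0.0]]
X = [[3.0, 0.0], [0.0, 5.0]]
Y = [[0.0, 4.0], [0.0, 1.0]]
d = [3.0, 1.0]

print(pseudo_metric(P, X[0], Y[0]))   # only the first coordinate contributes
print(pseudo_metric(P, X[1], Y[1]))   # distinct points, yet distance zero
print(mean_squared_error(P, X, Y, d))
```

The second pair shows why d is only a pseudo-metric when P is singular: two distinct points can be at distance zero.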

Solution.

(a) The problem is
\[ \mbox{minimize} \quad \frac{1}{N} \sum_{i=1}^N \left( d_i - d(x_i, y_i) \right)^2 \]
with variable $P \in \mathbf{S}^n_+$. This problem can be rewritten as
\[ \mbox{minimize} \quad \frac{1}{N} \sum_{i=1}^N \left( d_i^2 - 2 d_i d(x_i, y_i) + d(x_i, y_i)^2 \right), \]
with variable P (which enters through $d(x_i, y_i)$). The objective is convex because each term
of the objective can be written as (ignoring the 1/N factor)
\[ d_i^2 - 2 d_i \left( (x_i - y_i)^T P (x_i - y_i) \right)^{1/2} + (x_i - y_i)^T P (x_i - y_i), \]
which is convex in P. To see this, note that the first term is constant and the third term
is linear in P. The middle term is convex because it is the negation of the composition of a
concave function (square root) with a linear function of P, scaled by the positive constant $2d_i$.
(b) The following code solves the problem for the given instance. We find that the optimal mean
squared error on the training set is 0.887; on the test set, it is 0.827. This tells us that we
probably haven't overfit. In fact, the optimal P is singular; it has one zero eigenvalue. This
is correct; the positive semidefinite constraint is active.
Here is the solution in Matlab.
%% learning a quadratic metric

quad_metric_data;

Z = X-Y;
cvx_begin
variable P(n,n) symmetric
% objective
f = 0;
for i = 1:N
    f = f + d(i)^2 -2*d(i)*sqrt(Z(:,i)'*P*Z(:,i)) + Z(:,i)'*P*Z(:,i);
end
minimize (f/N)
subject to
P == semidefinite(n);
cvx_end

Z_test = X_test-Y_test;
d_hat = norms(sqrtm(P)*Z_test);
obj_test = sum_square(d_test - d_hat)/N_test
Here it is in Python.
from quad_metric_data import *
import numpy as np
from scipy import linalg as la
import cvxpy as cvx

Z = X - Y
P = cvx.Variable(n,n)
f = 0
for i in range(N):
    f += d[i]**2
    f += -2*d[i]*cvx.sqrt(cvx.quad_form(Z[:,i],P))
    f += cvx.quad_form(Z[:,i],P)

prob = cvx.Problem(cvx.Minimize(f/N),[P == cvx.Semidef(n)])


train_error = prob.solve()
print(train_error)

Z_test = X_test-Y_test
d_hat = np.linalg.norm(la.sqrtm(P.value).dot(Z_test),axis=0)
obj_test = (np.linalg.norm(d_test - d_hat)**2)/N_test
print(obj_test)
And finally, Julia.
# learning a quadratic metric
include("quad_metric_data.jl");
using Convex, SCS

Z = X-Y;
P = Semidefinite(n, n);

f = 0;
for i = 1:N
    f += d[i]^2 - 2*d[i]*sqrt(Z[:,i]'*P*Z[:,i]) + Z[:,i]'*P*Z[:,i];
end
p = minimize(f/N);
solve!(p, SCSSolver(max_iters=100000));

Z_test = X_test-Y_test;
d_hat = sqrt(sum((real(sqrtm(P.value))*Z_test).^2, 1));
obj_test = vecnorm(d_test - d_hat)^2/N_test;
println("train error: ", p.optval)
println("test error: ", obj_test)
eig(P.value)

5.16 Polynomial approximation of inverse using eigenvalue information. We seek a polynomial of degree
k, $p(a) = c_0 + c_1 a + c_2 a^2 + \cdots + c_k a^k$, for which

\[ p(A) = c_0 I + c_1 A + c_2 A^2 + \cdots + c_k A^k \]

is an approximate inverse of the nonsingular matrix A, for all $A \in \mathcal{A} \subset \mathbf{R}^{n \times n}$. When $x = p(A)b$
is used as an approximate solution of the linear equation $Ax = b$, the associated residual norm is
$\|A(p(A)b) - b\|_2$. We will judge our polynomial (i.e., the coefficients $c_0, \ldots, c_k$) by the worst case
residual over $A \in \mathcal{A}$ and b in the unit ball:
\[ R^{\rm wc} = \sup_{A \in \mathcal{A}, \; \|b\|_2 \leq 1} \|A(p(A)b) - b\|_2. \]

The set of matrices we take is $\mathcal{A} = \{A \in \mathbf{S}^n \mid \sigma(A) \subseteq \Omega\}$, where $\sigma(A)$ is the set of eigenvalues of A
(i.e., its spectrum), and $\Omega \subset \mathbf{R}$ is a union of a set of intervals (that do not contain 0).

(a) Explain how to find coefficients $c_0^\star, \ldots, c_k^\star$ that minimize $R^{\rm wc}$. Your solution can involve
expressions that involve the supremum of a polynomial (with scalar argument) over an interval.
(b) Carry out your method for k = 4 and $\Omega = [-0.6, -0.3] \cup [0.7, 1.8]$. You can replace the
supremum of a polynomial over $\Omega$ by a maximum over uniformly spaced (within each interval)
points in $\Omega$, with spacing 0.01. Give the optimal value $R^{{\rm wc}\star}$ and the optimal coefficients
$c^\star = (c_0^\star, \ldots, c_k^\star)$.

Remarks. (Not needed to solve the problem.)

• The approximate inverse p(A)b would be computed recursively, requiring the multiplication
of A with a vector k times.
• This approximate inverse could be used as a preconditioner for an iterative method.
• The Cayley-Hamilton theorem tells us that the inverse of any (invertible) matrix is a polynomial
of degree $n - 1$ of the matrix. Our hope here, however, is to get a single polynomial, of
relatively low degree, that serves as an approximate inverse for many different matrices.
Solution.
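(a) We can rewrite
\[ R^{\rm wc} = \sup_{A \in \mathcal{A}} \; \sup_{\|b\|_2 \leq 1} \|A(p(A)b) - b\|_2, \]
and recognize the inner supremum as the definition of the spectral norm of $Ap(A) - I$. If A
is symmetric, then $Ap(A) - I$ is also symmetric, and its spectral norm is the largest absolute
value of its eigenvalues.
Let $QDQ^T$ be an eigenvalue decomposition of A, with Q orthogonal and D diagonal. Then
\begin{align*}
Ap(A) - I &= c_0 A + c_1 A^2 + \cdots + c_k A^{k+1} - I \\
&= c_0 QDQ^T + c_1 QD^2Q^T + \cdots + c_k QD^{k+1}Q^T - I \\
&= Q(c_0 D + c_1 D^2 + \cdots + c_k D^{k+1} - I)Q^T \\
&= Q(Dp(D) - I)Q^T,
\end{align*}
which shows that $\lambda p(\lambda) - 1 \in \sigma(Ap(A) - I)$ if $\lambda \in \sigma(A)$.
We can then rewrite
\begin{align*}
R^{\rm wc} &= \sup_{A \in \mathcal{A}} \|Ap(A) - I\|_2 \\
&= \sup_{A \in \mathcal{A}} \; \sup_{\lambda \in \sigma(A)} |\lambda p(\lambda) - 1| \\
&= \sup_{\lambda \in \Omega} |\lambda p(\lambda) - 1| \\
&= \sup_{\lambda \in \Omega} |c_0 \lambda + c_1 \lambda^2 + \cdots + c_k \lambda^{k+1} - 1|,
\end{align*}
and note that $R^{\rm wc}$ is a convex function of c since, for any $\lambda$, $c_0 \lambda + c_1 \lambda^2 + \cdots + c_k \lambda^{k+1} - 1$ is
an affine function of c. We can write the optimization problem as

\[ \mbox{minimize} \quad \sup_{\lambda \in \Omega} |c_0 \lambda + c_1 \lambda^2 + \cdots + c_k \lambda^{k+1} - 1|. \]

This problem can be converted exactly into an SDP (since the supremum of a polynomial has
an LMI representation), but for any practical purpose simple sampling of $\Omega$ is fine.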
(b) Let $\lambda_1, \ldots, \lambda_N$ be uniformly spaced (within each interval contained in $\Omega$) sample points, with
spacing 0.01. We approximate the objective by

\[ \max_{i=1,\ldots,N} |c_0 \lambda_i + c_1 \lambda_i^2 + \cdots + c_k \lambda_i^{k+1} - 1|. \]

Define $\Lambda \in \mathbf{R}^{N \times (k+1)}$ as $\Lambda_{ij} = \lambda_i^j$. This allows us to write the (approximate) optimization
problem as
\[ \mbox{minimize} \quad \|\Lambda c - \mathbf{1}\|_\infty. \]
We find that $R^{{\rm wc}\star} = 0.2441$, with coefficients

\[ c^\star = (-1.54, \; 4.43, \; 1.98, \; -5.61, \; 1.96). \]

That's impressive: A single polynomial of degree 4 serves as a crude inverse for a whole family
of matrices! We plot $\lambda p(\lambda) - 1$ with a blue line over the interval $[-0.6, 1.8]$, and indicate
$\Omega$ and the bounds $\pm R^{{\rm wc}\star}$ with a red dashed line.
We didn't ask you to do this, but we also compute $\|Ap(A)b - b\|_2$ for 1000 random instances
of $A \in \mathcal{A}$ with n = 10, and b with $\|b\|_2 = 1$. We generate A by sampling eigenvalues
uniformly from $\Omega$, placing them in a diagonal matrix D, and forming $A = QDQ^T$, for a
random orthogonal matrix Q (which we generate by taking the QR decomposition of a matrix
with Gaussian entries). We generate b from a Gaussian distribution and normalize (which
gives a uniform distribution on the unit sphere). Of course, the distributions don't matter. A
histogram and the code are given below. Note that the worst case error across the samples is
a bit under 0.24, quite consistent with our value of $R^{{\rm wc}\star}$.
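
The sampled objective is easy to evaluate for any candidate coefficient vector. The sketch below, in plain Python, does this on the grid with spacing 0.01. It assumes the intervals [-0.6, -0.3] and [0.7, 1.8] (as in the Matlab listing) and uses the reported coefficients with signs (-1.54, 4.43, 1.98, -5.61, 1.96), which is an assumption here; the function names are illustrative.

```python
def grid(lo, hi, step=0.01):
    """Uniformly spaced points lo, lo + step, ..., hi (both endpoints included)."""
    count = int(round((hi - lo) / step)) + 1
    return [lo + step * j for j in range(count)]

def residual(c, lam):
    """lam * p(lam) - 1, where p(lam) = c[0] + c[1]*lam + ... + c[k]*lam**k."""
    return sum(cj * lam ** (j + 1) for j, cj in enumerate(c)) - 1.0

def worst_case(c, intervals):
    """Largest |lam * p(lam) - 1| over the sample points of all intervals."""
    return max(abs(residual(c, lam)) for lo, hi in intervals for lam in grid(lo, hi))

omega = [(-0.6, -0.3), (0.7, 1.8)]             # assumed intervals
c_reported = [-1.54, 4.43, 1.98, -5.61, 1.96]  # rounded values, assumed signs

print(worst_case(c_reported, omega))
print(worst_case([0.0] * 5, omega))  # the zero polynomial leaves the full residual of 1
```

Because the coefficients are rounded to two decimals, the value obtained this way should be close to, but not exactly equal to, the reported optimal value; the high powers of the largest sample points amplify the rounding.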

[Figure: $\lambda p(\lambda) - 1$ versus $\lambda$ over $[-0.6, 1.8]$, with the bounds $\pm R^{{\rm wc}\star}$ shown on $\Omega$]

[Figure: histogram of $\|Ap(A)b - b\|_2$ (occurrences) over the random instances]

% Polynomial approximation of inverse using eigenvalue information.


k = 4; % k degree polynomials
l1 = -.6;
u1 = -.3;
l2 = .7;
u2 = 1.8;
eta = 0.01;

lambdas = [l1:eta:u1 l2:eta:u2];

Lambda = lambdas';
for i = 1:k
    Lambda(:,end+1) = Lambda(:,end).*lambdas';
end

cvx_begin
variable c(k+1)
minimize( norm(Lambda*c - 1,inf) )
cvx_end
R_wc = cvx_optval;

lambdas_full = [l1:eta:u2];
Lambda_full = lambdas_full';
for i = 1:k
    Lambda_full(:,end+1) = Lambda_full(:,end).*lambdas_full';
end

y2 = Lambda_full*c - 1;
figure

hold on
plot(lambdas_full,y2)
plot(l1:eta:u1,0*(l1:eta:u1) + R_wc,'--r',...
     l1:eta:u1,0*(l1:eta:u1) - R_wc,'--r',...
     l2:eta:u2,0*(l2:eta:u2) + R_wc,'--r',...
     l2:eta:u2,0*(l2:eta:u2) - R_wc,'--r')

ylabel('\lambda p(\lambda) - 1')
xlabel('\lambda')
axis([l1 u2 -1.2 .4])
print -depsc poly_approx_inv.eps

%testing code
randn('state',0)
rand('state',0)
n = 10;
runs = 1000;
r = (u1-l1)/(u2-l2 + u1-l1);

errors = zeros(runs,1);
for i = 1:runs
%create random eigenvalues in [l1,u1] \cup [l2,u2]
a = rand(n,1);
a = l1 + (l2-u1)*(sign(a-r)+1)/2 + (u2-l2 + u1-l1)*a;
[Q R] = qr(rand(n,n));
    A = Q*diag(a)*Q';

b = randn(n,1);
b = b/norm(b);

q = -b;
for j = 1:k+1
q = q + c(j)*A^j*b;
end
errors(i) = norm(q);
end

figure
hist(errors,25)
xlabel('||Ap(A)b - b||_2')
ylabel('occurrences')
print -depsc poly_approx_inv_hist.eps

5.17 Fitting a generalized additive regression model. A generalized additive model has the form
\[ f(x) = \alpha + \sum_{j=1}^n f_j(x_j), \]

for $x \in \mathbf{R}^n$, where $\alpha \in \mathbf{R}$ is the offset, and $f_j : \mathbf{R} \to \mathbf{R}$, with $f_j(0) = 0$. The functions $f_j$ are called
the regressor functions. When each $f_j$ is linear, i.e., has the form $w_j x_j$, the generalized additive
model is the same as the standard (linear) regression model. Roughly speaking, a generalized
additive model takes into account nonlinearities in each regressor $x_j$, but not nonlinear interactions
among the regressors. To visualize a generalized additive model, it is common to plot each regressor
function (when n is not too large).
We will restrict the functions $f_j$ to be piecewise-affine, with given knot points $p_1 < \cdots < p_K$. This
means that $f_j$ is affine on the intervals $(-\infty, p_1], [p_1, p_2], \ldots, [p_{K-1}, p_K], [p_K, \infty)$, and continuous at
$p_1, \ldots, p_K$. Let C denote the total (absolute value of) change in slope across all regressor functions
and all knot points. The value C is a measure of nonlinearity of the regressor functions; when
C = 0, the generalized additive model reduces to a linear regression model.
Now suppose we observe samples or data $(x^{(1)}, y^{(1)}), \ldots, (x^{(N)}, y^{(N)}) \in \mathbf{R}^n \times \mathbf{R}$, and wish to fit
a generalized additive model to the data. We choose the offset and the regressor functions to
minimize
\[ \frac{1}{N} \sum_{i=1}^N \left( y^{(i)} - f(x^{(i)}) \right)^2 + \lambda C, \]
where $\lambda > 0$ is a regularization parameter. (The first term is the mean-square error.)

(a) Explain how to solve this problem using convex optimization.
(b) Carry out the method of part (a) using the data in the file gen_add_reg_data.m. This file
contains the data, given as an $N \times n$ matrix X (whose rows are $(x^{(i)})^T$), a column vector y
(which gives $y^{(i)}$), a vector p that gives the knot points, and the scalar lambda.
Give the mean-square error achieved by your generalized additive regression model. Compare
the estimated and true regressor functions in a $3 \times 3$ array of plots (using the plotting code in
the data file as a template), over the range $-10 \leq x_i \leq 10$. The true regressor functions (to
be used only for plotting, of course) are given in the cell array f.

Hints.

• You can represent each regressor function $f_j$ as a linear combination of the basis functions
$b_0(u) = u$ and $b_k(u) = (u - p_k)_+ - (-p_k)_+$ for $k = 1, 2, \ldots, K$, where $(a)_+ = \max\{a, 0\}$.
• You might find the matrix $XX = [b_0(X) \; b_1(X) \; \cdots \; b_K(X)]$ useful.
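
The basis functions in the hint can be sketched in a few lines of plain Python. Each $b_k$ vanishes at 0, so any linear combination satisfies $f_j(0) = 0$ automatically, and the coefficient of $b_k$ for $k \geq 1$ is the change in slope of $f_j$ at the knot $p_k$. The knot points and coefficients below are made-up values, and the function names are hypothetical.

```python
def pos(a):
    """Positive part (a)_+ = max{a, 0}."""
    return max(a, 0.0)

def basis(u, knots):
    """[b_0(u), b_1(u), ..., b_K(u)] with b_0(u) = u and b_k(u) = (u - p_k)_+ - (-p_k)_+."""
    return [u] + [pos(u - p) - pos(-p) for p in knots]

def regressor(u, coeffs, knots):
    """f_j(u) = sum_k coeffs[k] * b_k(u); coeffs[0] is the slope to the left of all knots."""
    return sum(c * b for c, b in zip(coeffs, basis(u, knots)))

def slope_change_total(coeffs):
    """Total absolute change in slope over the knots: the l1 norm of coeffs[1:]."""
    return sum(abs(c) for c in coeffs[1:])

knots = [-1.0, 2.0]         # made-up knot points
coeffs = [1.0, -3.0, 0.5]   # made-up coefficients: slopes are 1, then -2, then -1.5

for u in [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]:
    print(u, regressor(u, coeffs, knots))
print(slope_change_total(coeffs))
```

With this parametrization the contribution of one regressor function to C is the $\ell_1$ norm of its knot coefficients, so the fitting problem is a least-squares objective plus an $\ell_1$ penalty, which is convex.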

Solution. There is not much more to say beyond showing the code and the plot.

% Fitting a generalized additive regression model.


gen_add_reg_data;

%build an augmented data matrix XX


XX=X;

for ii=1:K
XX=[XX,max(0,X-p(ii))+min(p(ii),0)];
end

%Perform regression
cvx_begin
variables alpha c(9*(K+1))
minimize(1/N*sum_square(y-alpha-XX*c)+lambda*norm(c,1))
cvx_end

%Plot functions.
xx=linspace(-10,10,1024);
yy=zeros(9,1024);
figure
for jj=1:9
yy(jj,:)=c(jj)*xx;
for ii=1:K
yy(jj,:)=yy(jj,:)+c(ii*9+jj)*(pos(xx-p(ii))-pos(-p(ii)));
end
subplot(3,3,jj);
plot(xx,yy(jj,:));
hold on;
plot(xx,f{jj}(xx),'r')
end

print -depsc gen_add_reg.eps

[Figure: 3 x 3 array of plots, one per regressor function, over $-10 \leq x_i \leq 10$.]

The blue and red lines correspond to the estimated and true regressors, respectively.

5.18 Multi-label support vector machine. The basic SVM described in the book is used for classification
of data with two labels. In this problem we explore an extension of SVM that can be used to carry
out classification of data with more than two labels. Our data consists of pairs $(x_i, y_i) \in \mathbf{R}^n \times
\{1, \ldots, K\}$, $i = 1, \ldots, m$, where $x_i$ is the feature vector and $y_i$ is the label of the $i$th data point. (So
the labels can take the values $1, \ldots, K$.) Our classifier will use $K$ affine functions, $f_k(x) = a_k^T x + b_k$,
$k = 1, \ldots, K$, which we also collect into an affine function from $\mathbf{R}^n$ into $\mathbf{R}^K$ as $f(x) = Ax + b$. (The
rows of $A$ are $a_k^T$.) Given feature vector $x$, we guess the label $\hat y = \mathop{\rm argmax}_k f_k(x)$. We assume that
exact ties never occur, or if they do, an arbitrary choice can be made. Note that if a multiple of $\mathbf{1}$
is added to $b$, the classifier does not change. Thus, without loss of generality, we can assume that
$\mathbf{1}^T b = 0$.
To correctly classify the data examples, we need $f_{y_i}(x_i) > \max_{k \neq y_i} f_k(x_i)$ for all $i$. This is a set of
homogeneous strict inequalities in $a_k$ and $b_k$, which are feasible if and only if the set of nonstrict
inequalities $f_{y_i}(x_i) \geq 1 + \max_{k \neq y_i} f_k(x_i)$ are feasible. This motivates the loss function
\[
L(A, b) = \sum_{i=1}^m \left( 1 + \max_{k \neq y_i} f_k(x_i) - f_{y_i}(x_i) \right)_+,
\]
where $(u)_+ = \max\{u, 0\}$. The multi-label SVM chooses $A$ and $b$ to minimize
\[
L(A, b) + \mu \|A\|_F^2,
\]
subject to $\mathbf{1}^T b = 0$, where $\mu > 0$ is a regularization parameter. (Several variations on this are
possible, such as regularizing $b$ as well, or replacing the Frobenius norm squared with the sum of
norms of the columns of $A$.)
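To make the loss function concrete, here is a minimal plain-Python sketch that evaluates $L(A, b)$ and the classifier rule for given parameters. The function names `multiclass_hinge_loss` and `predict` are illustrative choices, not part of the exercise data files, and labels are taken to be 1-based as in the text.

```python
def multiclass_hinge_loss(A, b, X, y):
    """L(A, b) = sum_i (1 + max_{k != y_i} f_k(x_i) - f_{y_i}(x_i))_+ .

    A: K rows of length n; b: length K; X: m feature vectors;
    y: m labels in 1..K (1-based, as in the text).
    """
    total = 0.0
    for x, label in zip(X, y):
        f = [sum(a_j * x_j for a_j, x_j in zip(a, x)) + b_k for a, b_k in zip(A, b)]
        own = f[label - 1]
        best_other = max(v for k, v in enumerate(f) if k != label - 1)
        total += max(1.0 + best_other - own, 0.0)
    return total


def predict(A, b, x):
    """Guess the label argmax_k f_k(x), returned 1-based."""
    f = [sum(a_j * x_j for a_j, x_j in zip(a, x)) + b_k for a, b_k in zip(A, b)]
    return max(range(len(f)), key=lambda k: f[k]) + 1
```

Adding the same constant to every entry of `b` shifts all $f_k$ equally, so both the loss and the predicted label are unchanged; this is the invariance that justifies the normalization $\mathbf{1}^T b = 0$.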

(a) Show how to find A and b using convex optimization. Be sure to justify any changes of
variables or reformulation (if needed), and convexity of the objective and constraints in your
formulation.
(b) Carry out multi-label SVM on the data given in multi_label_svm_data.m. Use the data
given in X and y to fit the SVM model, for a range of values of $\mu$. This data set includes an
additional set of data, Xtest and ytest, that you can use to test the SVM models. Plot the
test set classification error rate (i.e., the fraction of data examples in the test set for which
$\hat y \neq y$) versus $\mu$.
You don't need to try more than 10 or 20 values of $\mu$, and we suggest choosing them uniformly
on a log scale, from (say) $10^{-2}$ to $10^{2}$.

Solution.

(a) The multi-label SVM problem is convex as stated. The variables are $A$ and $b$. The only
constraint, $\mathbf{1}^T b = 0$, is linear. The regularization term in the objective, $\mu \|A\|_F^2$, is convex
quadratic. Let us justify that the loss function term $L(A, b)$ is convex, which we can do by
showing each term,
\[
\left( 1 + \max_{k \neq y_i} f_k(x_i) - f_{y_i}(x_i) \right)_+,
\]
is convex. Since $f_k(x_i)$ is linear in the variables, $\max_{k \neq y_i} f_k(x_i)$ is convex. The argument of
$(\cdot)_+$ is a sum of a constant, a convex, and a linear function, and so is convex. The function
$(\cdot)_+$ is convex and nondecreasing, so by the composition rule, the term above is convex.
(b) The plot below shows the training and test error obtained for different values of $\mu$. A reasonable
value to choose would be around $\mu = 1$, although smaller values seem to work well too.

[Figure: training error and test error versus $\mu$; horizontal axis on a log scale from $10^{-2}$ to $10^{2}$, vertical axis labeled error.]

The following code solves the problem.

% multi-label support vector machine
multi_label_svm_data;

mus = 10.^(-2:0.25:2);
errorTrain = [];
errorTest = [];

for mu = mus
cvx_begin quiet
variables A(K, n) b(K)
expressions L f(K, mTrain)
% f(k, i) stores f_k(x_i)
f = A*x + b*ones([1 mTrain]);
L = 0;
for k = 1:K
% process all examples with y_i = k simultaneously
ind = [1:k-1 k+1:K];
L = L + sum(pos(1 + max(f(ind, y==k), [], 1) ...
- f(k, y==k)));
end
minimize (L + mu*sum(sum(A.^2)))
subject to
sum(b) == 0;
cvx_end

[~, indTrain] = max(A*x + b*ones([1 mTrain]), [], 1);


errorTrain(end+1) = sum(indTrain~=y)/mTrain;
[~, indTest] = max(A*xtest + b*ones([1 mTest]), [], 1);
errorTest(end+1) = sum(indTest~=ytest)/mTest;
end

% plot
clf;
semilogx(mus, errorTrain); hold on;
semilogx(mus, errorTest, 'r'); hold off;
xlabel('\mu'); ylabel('error');
legend('Training Error', 'Test Error');
print -depsc multi_label_svm.eps;

5.19 Colorization with total variation regularization. An $m \times n$ color image is represented as three matrices
of intensities $R, G, B \in \mathbf{R}^{m \times n}$, with entries in $[0, 1]$, representing the red, green, and blue pixel
intensities, respectively. A color image is converted to a monochrome image, represented as one
matrix $M \in \mathbf{R}^{m \times n}$, using
\[
M = 0.299R + 0.587G + 0.114B.
\]
(These weights come from different perceived brightness of the three primary colors.)

In colorization, we are given $M$, the monochrome version of an image, and the color values of some
of the pixels; we are to guess its color version, i.e., the matrices $R, G, B$. Of course that's a very
underdetermined problem. A very simple technique is to minimize the total variation of $(R, G, B)$,
defined as
\[
\mathbf{tv}(R, G, B) = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1}
\left\| \begin{bmatrix}
R_{ij} - R_{i,j+1} \\
G_{ij} - G_{i,j+1} \\
B_{ij} - B_{i,j+1} \\
R_{ij} - R_{i+1,j} \\
G_{ij} - G_{i+1,j} \\
B_{ij} - B_{i+1,j}
\end{bmatrix} \right\|_2,
\]
subject to consistency with the given monochrome image, the known ranges of the entries of
$(R, G, B)$ (i.e., in $[0, 1]$), and the given color entries. Note that the sum above is of the norm
of 6-vectors, and not the norm-squared. (The 6-vector is an approximation of the spatial gradient
of $(R, G, B)$.)
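As a reading aid for the definition above, the following plain-Python sketch evaluates the total variation of three numeric matrices given as nested lists. It is an illustrative evaluator only, not the provided tv.m or tv.jl and not a CVXPY expression; the name `tv` is reused here purely for readability.

```python
import math


def tv(R, G, B):
    """Total variation of (R, G, B): sum over i < m, j < n of the 2-norm
    of the 6-vector of horizontal and vertical differences."""
    m, n = len(R), len(R[0])
    total = 0.0
    for i in range(m - 1):
        for j in range(n - 1):
            diffs = []
            for C in (R, G, B):                      # differences along the row
                diffs.append(C[i][j] - C[i][j + 1])
            for C in (R, G, B):                      # differences along the column
                diffs.append(C[i][j] - C[i + 1][j])
            total += math.sqrt(sum(d * d for d in diffs))
    return total
```

A constant image has zero total variation, and because the norm is not squared, a single sharp edge is penalized no more than a gradual ramp of the same height spread over several pixels.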
Carry out this method on the data given in image_colorization_data.*. The file loads flower.png
and provides the monochrome version of the image, M, along with vectors of known color intensities,
R_known, G_known, and B_known, and known_ind, the indices of the pixels with known values. If R
denotes the red channel of an image, then R(known_ind) returns the known red color intensities in
Matlab, and R[known_ind] returns the same in Python and Julia. The file also creates an image,
flower_given.png, that is monochrome, with the known pixels colored.
The tv function, invoked as tv(R,G,B), gives the total variation. CVXPY has the tv function
built-in, but CVX and Convex.jl do not, so we have provided the files tv.m and tv.jl which contain
implementations for you to use.
In Python and Julia we have also provided the function save_img(filename,R,G,B) which writes
the image defined by the matrices R, G, B, to the file filename. To view an image in Matlab use
the imshow function.
The problem instance is a small image, $75 \times 75$, so the solve time is reasonable, say, under ten
seconds or so in CVX or CVXPY, and around 60 seconds in Julia.
Report your optimal objective value and, if you have access to a color printer, attach your
reconstructed image. If you don't have access to a color printer, it's OK to just give the optimal objective
value.
Solution.
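Let $R, G, B$ denote the matrices for the image, $R^{\rm known}, G^{\rm known}, B^{\rm known}$ the matrices of known color
values (only defined for some entries), and $\mathcal{K}$ denote the indices of color values. Image colorization
is then the following optimization problem:
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{tv}(R, G, B) \\
\mbox{subject to} & 0.299R + 0.587G + 0.114B = M \\
& R_{ij} = R^{\rm known}_{ij}, \quad (i, j) \in \mathcal{K} \\
& G_{ij} = G^{\rm known}_{ij}, \quad (i, j) \in \mathcal{K} \\
& B_{ij} = B^{\rm known}_{ij}, \quad (i, j) \in \mathcal{K} \\
& 0 \leq R_{ij}, G_{ij}, B_{ij} \leq 1.
\end{array}
\]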

The following Matlab code solves this problem.

image_colorization_data;
cvx_begin quiet
variables R(m,n) G(m,n) B(m,n)
minimize tv(R,G,B)
subject to
% grayscale reconstruction matches given grayscale
0.299*R +0.587*G + 0.114*B == M
% colors match given colors
R(known_ind) == R_known
G(known_ind) == G_known
B(known_ind) == B_known
% colors in range
R >= 0; G >= 0; B >= 0
R <= 1; G <= 1; B <= 1
cvx_end
outim = cat(3,R,G,B);
imshow(outim);
imwrite(outim,'flower_reconstructed.png');

Here is the solution in Python.

import cvxpy as cvx


from image_colorization_data import *

# CVXPY 1.0 and later take the shape as a single tuple argument.
R = cvx.Variable((m, n))
G = cvx.Variable((m, n))
B = cvx.Variable((m, n))
constraints = [
0.299*R + 0.587*G + 0.114*B == M,
R[known_ind] == R_known,
G[known_ind] == G_known,
B[known_ind] == B_known,
0 <= R, 0 <= G, 0 <= B,
1 >= R, 1 >= G, 1 >= B,
]
optval = cvx.Problem(cvx.Minimize(cvx.tv(R,G,B)), constraints).solve()
print(optval)
save_img('flower_reconstructed.png', R.value, G.value, B.value)

Here is the solution in Julia.

using Convex
include("image_colorization_data.jl");
include("tv.jl");

R = Variable(m,n);
G = Variable(m,n);
B = Variable(m,n);

constraints = [
0.299*R + 0.587*G + 0.114*B == M;
R[known_ind] == R_known;
G[known_ind] == G_known;
B[known_ind] == B_known;
0 <= R; 0 <= G; 0 <= B;
1 >= R; 1 >= G; 1 >= B;
];
problem = minimize(tv(R,G,B), constraints)
solve!(problem)
save_img("flower_reconstructed.png", R.value, G.value, B.value)

The results are shown below.

Original image on the left, image in monochrome with colored pixels shown in the middle, and
reconstructed image on the right.
In Matlab our optimum objective value was 609.03. In Python it was 620.78. In Julia it was 615.00.
(These values should be the same, of course.)

6 Statistical estimation
6.1 Maximum likelihood estimation of x and noise mean and covariance. Consider the maximum
likelihood estimation problem with the linear measurement model
\[
y_i = a_i^T x + v_i, \quad i = 1, \ldots, m.
\]
The vector $x \in \mathbf{R}^n$ is a vector of unknown parameters, $y_i$ are the measurement values, and $v_i$ are
independent and identically distributed measurement errors.
In this problem we make the assumption that the normalized probability density function of the
errors is given (normalized to have zero mean and unit variance), but not their mean and variance.
In other words, the density of the measurement errors $v_i$ is
\[
p(z) = \frac{1}{\sigma} f\left( \frac{z - \mu}{\sigma} \right),
\]
where $f$ is a given, normalized density. The parameters $\mu$ and $\sigma$ are the mean and standard
deviation of the distribution $p$, and are not known.
The maximum likelihood estimates of $x$, $\mu$, $\sigma$ are the maximizers of the log-likelihood function
\[
\sum_{i=1}^m \log p(y_i - a_i^T x) = -m \log \sigma + \sum_{i=1}^m \log f\left( \frac{y_i - a_i^T x - \mu}{\sigma} \right),
\]
where $y$ is the observed value. Show that if $f$ is log-concave, then the maximum likelihood estimates
of $x$, $\mu$, $\sigma$ can be determined by solving a convex optimization problem.
Solution. With a change of variables
\[
z = (1/\sigma) x, \qquad u = \mu/\sigma, \qquad t = 1/\sigma,
\]
the problem reduces to
\[
\mbox{maximize} \quad m \log t + \sum_{i=1}^m \log f(y_i t - a_i^T z - u),
\]
which is a convex optimization problem since the objective is a concave function of $(z, u, t)$. We
recover optimal values of $x$, $\mu$, and $\sigma$ using
\[
\sigma = 1/t, \qquad \mu = u/t, \qquad x = (1/t) z.
\]
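The change of variables can be checked numerically. The sketch below assumes, purely for illustration, that $f$ is the standard normal density (one log-concave choice); the function names are hypothetical. It evaluates the log-likelihood in the original parameters $(x, \mu, \sigma)$ and in the transformed parameters $(z, u, t)$.

```python
import math


def log_f(w):
    # Assumed normalized density: standard normal (zero mean, unit variance).
    return -0.5 * w * w - 0.5 * math.log(2.0 * math.pi)


def loglik_original(x, mu, sigma, a_rows, y):
    """-m log(sigma) + sum_i log f((y_i - a_i^T x - mu) / sigma)."""
    total = -len(y) * math.log(sigma)
    for a, yi in zip(a_rows, y):
        r = yi - sum(aj * xj for aj, xj in zip(a, x))
        total += log_f((r - mu) / sigma)
    return total


def loglik_transformed(z, u, t, a_rows, y):
    """m log(t) + sum_i log f(y_i t - a_i^T z - u), with z = x/sigma, u = mu/sigma, t = 1/sigma."""
    total = len(y) * math.log(t)
    for a, yi in zip(a_rows, y):
        total += log_f(yi * t - sum(aj * zj for aj, zj in zip(a, z)) - u)
    return total
```

The two functions agree whenever the arguments are related by the change of variables, and the second is jointly concave in $(z, u, t)$ because each term is a concave function of an affine expression.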

6.2 Mean and covariance estimation with conditional independence constraints. Let $X \in \mathbf{R}^n$ be a
Gaussian random variable with density
\[
p(x) = \frac{1}{(2\pi)^{n/2} (\det S)^{1/2}} \exp(-(x - a)^T S^{-1} (x - a)/2).
\]
The conditional density of a subvector $(X_i, X_j) \in \mathbf{R}^2$ of $X$, given the remaining variables, is also
Gaussian, and its covariance matrix $R_{ij}$ is equal to the Schur complement of the $2 \times 2$ submatrix
\[
\begin{bmatrix} S_{ii} & S_{ij} \\ S_{ij} & S_{jj} \end{bmatrix}
\]
in the covariance matrix $S$. The variables $X_i$, $X_j$ are called conditionally independent if the
covariance matrix $R_{ij}$ of their conditional distribution is diagonal.
Formulate the following problem as a convex optimization problem. We are given $N$ independent
samples $y_1, \ldots, y_N \in \mathbf{R}^n$ of $X$. We are also given a list $\mathcal{N} \subseteq \{1, \ldots, n\} \times \{1, \ldots, n\}$ of pairs of
conditionally independent variables: $(i, j) \in \mathcal{N}$ means $X_i$ and $X_j$ are conditionally independent.
The problem is to compute the maximum likelihood estimate of the mean $a$ and the covariance
matrix $S$, subject to the constraint that $X_i$ and $X_j$ are conditionally independent for $(i, j) \in \mathcal{N}$.

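Solution. The log-likelihood function is
\[
\begin{array}{rcl}
l(S, a) &=& \displaystyle -(Nn/2) \log(2\pi) - (N/2) \log\det S - (1/2) \sum_{k=1}^N (y_k - a)^T S^{-1} (y_k - a) \\
&=& \displaystyle \frac{N}{2} \left( -n \log(2\pi) - \log\det S - \mathbf{tr}(S^{-1} Y) - (a - \mu)^T S^{-1} (a - \mu) \right)
\end{array}
\]
where $\mu$ and $Y$ are the sample mean and covariance:
\[
\mu = \frac{1}{N} \sum_{k=1}^N y_k, \qquad Y = \frac{1}{N} \sum_{k=1}^N (y_k - \mu)(y_k - \mu)^T.
\]
Note that the inverse $R_{ij}^{-1}$ of the conditional covariance matrix $R_{ij}$ is the $2 \times 2$ submatrix of $S^{-1}$ formed
by rows and columns $i$ and $j$. The variables $i$ and $j$ are conditionally independent if $(S^{-1})_{ij} = 0$.
The ML problem is therefore
\[
\begin{array}{ll}
\mbox{maximize} & l(S, a) \\
\mbox{subject to} & (S^{-1})_{ij} = 0, \quad (i, j) \in \mathcal{N}.
\end{array}
\]
The optimal $a$ is clearly $a = \mu$, so it remains to solve
\[
\begin{array}{ll}
\mbox{maximize} & \log\det S^{-1} - \mathbf{tr}(S^{-1} Y) \\
\mbox{subject to} & (S^{-1})_{ij} = 0, \quad (i, j) \in \mathcal{N}
\end{array}
\]
which is convex in $S^{-1}$.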
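The key fact used above, that a zero entry of $S^{-1}$ corresponds to a diagonal conditional covariance, can be illustrated on a small example. The sketch below uses a hand-picked $3 \times 3$ positive definite precision matrix (illustrative values only) and hypothetical helper names; it inverts the matrix with the adjugate formula and forms the Schur complement for $(X_1, X_2)$ given $X_3$.

```python
def inv3(M):
    """Inverse of a 3x3 matrix via the adjugate (adequate for a small illustration)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[v / det for v in row] for row in adj]


def conditional_cov_12_given_3(S):
    """Schur complement: covariance of (X1, X2) given X3."""
    s33 = S[2][2]
    return [[S[i][j] - S[i][2] * S[2][j] / s33 for j in range(2)] for i in range(2)]


# Precision matrix S^{-1} with a zero in position (1, 2); values chosen for illustration.
P = [[2.0, 0.0, 0.5],
     [0.0, 1.0, 0.3],
     [0.5, 0.3, 1.5]]
S = inv3(P)                          # covariance matrix
R12 = conditional_cov_12_given_3(S)  # conditional covariance of (X1, X2)
```

Here `S[0][1]` is nonzero, so $X_1$ and $X_2$ are correlated marginally, while `R12` comes out diagonal: the constraint is linear in $S^{-1}$ even though it is not linear in $S$.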

6.3 Maximum likelihood estimation for exponential family. A probability distribution or density on a
set $D$, parametrized by $\theta \in \mathbf{R}^n$, is called an exponential family if it has the form
\[
p_\theta(x) = a(\theta) \exp(\theta^T c(x)),
\]
for $x \in D$, where $c : D \to \mathbf{R}^n$, and $a(\theta)$ is a normalizing function. Here we interpret $p_\theta(x)$ as a
density function when $D$ is a continuous set, and a probability distribution when $D$ is discrete.
Thus we have
\[
a(\theta) = \left( \int_D \exp(\theta^T c(x)) \, dx \right)^{-1}
\]
when $p_\theta$ is a density, and
\[
a(\theta) = \left( \sum_{x \in D} \exp(\theta^T c(x)) \right)^{-1}
\]
when $p_\theta$ represents a distribution. We consider only values of $\theta$ for which the integral or sum above
is finite. Many families of distributions have this form, for appropriate choice of the parameter $\theta$
and function $c$.

(a) When $c(x) = x$ and $D = \mathbf{R}^n_+$, what is the associated family of densities? What is the set of
valid values of $\theta$?
(b) Consider the case with $D = \{0, 1\}$, with $c(0) = 0$, $c(1) = 1$. What is the associated exponential
family of distributions? What are the valid values of the parameter $\theta \in \mathbf{R}$?
(c) Explain how to represent the normal family $\mathcal{N}(\mu, \Sigma)$ as an exponential family. Hint. Use
parameter $(z, Y) = (\Sigma^{-1} \mu, \Sigma^{-1})$. With this parameter, $\theta^T c(x)$ has the form $z^T c_1(x) + \mathbf{tr}\, Y C_2(x)$,
where $C_2(x) \in \mathbf{S}^n$.
(d) Log-likelihood function. Show that for any $x \in D$, the log-likelihood function $\log p_\theta(x)$ is
concave in $\theta$. This means that maximum-likelihood estimation for an exponential family leads
to a convex optimization problem. You don't have to give a formal proof of concavity of
$\log p_\theta(x)$ in the general case: You can just consider the case when $D$ is finite, and state that
the other cases (discrete but infinite $D$, continuous $D$) can be handled by taking limits of finite
sums.
(e) Optimality condition for ML estimation. Let $\ell_\theta(x_1, \ldots, x_K)$ be the log-likelihood function for
$K$ IID samples, $x_1, \ldots, x_K$, from the distribution or density $p_\theta$. Assuming $\log p_\theta$ is differentiable
in $\theta$, show that
\[
(1/K) \nabla_\theta \ell_\theta(x_1, \ldots, x_K) = \frac{1}{K} \sum_{i=1}^K c(x_i) - \mathbf{E}_\theta\, c(x).
\]
(The subscript under $\mathbf{E}$ means the expectation under the distribution or density $p_\theta$.)
Interpretation. The ML estimate of $\theta$ is characterized by the empirical mean of $c(x)$ being
equal to the expected value of $c(x)$, under the density or distribution $p_\theta$. (We assume here
that the maximizer of $\ell_\theta$ is characterized by the gradient vanishing.)

Solution.

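(a) We need $\theta \prec 0$ for the integral to converge, in which case we have
\[
\int_{\mathbf{R}^n_+} \exp(\theta^T x) \, dx = \frac{1}{\prod_{i=1}^n (-\theta_i)},
\]
so $a(\theta) = \prod_{i=1}^n (-\theta_i)$. In this case the distribution is that of $n$ independent exponentially
distributed variables, with means $-1/\theta_i$.
(b) We have
\[
p_\theta(0) = \frac{1}{\exp\theta + 1}, \qquad p_\theta(1) = \frac{\exp\theta}{\exp\theta + 1},
\]
which we recognize as the Bernoulli distribution. Any value of $\theta \in \mathbf{R}$ is valid.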
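(c) We write the $\mathcal{N}(\mu, \Sigma)$ density as
\[
p(x) = a \exp\left( -(x - \mu)^T \Sigma^{-1} (x - \mu)/2 \right),
\]
where $a$ is the normalizing constant,
\[
a = (2\pi)^{-n/2} (\det \Sigma)^{-1/2}.
\]
We write this in the form
\[
p(x) = \tilde a \exp\left( z^T c_1(x) + \mathbf{tr}\, Y C_2(x) \right),
\]
where $Y = \Sigma^{-1}$ and $z = \Sigma^{-1} \mu$,
\[
c_1(x) = x, \qquad C_2(x) = -(1/2) x x^T,
\]
and $\tilde a$ is the normalizing constant
\[
\tilde a = a \exp(-\mu^T \Sigma^{-1} \mu/2) = (2\pi)^{-n/2} (\det Y)^{1/2} \exp(-z^T Y^{-1} z/2).
\]
The set of valid values of $z$ and $Y$ is simply $\mathbf{R}^n \times \mathbf{S}^n_{++}$. (Note that the mapping between $(\mu, \Sigma)$
and $(z, Y)$ is a bijection.)
We didn't ask you to find it, but the log-likelihood function is
\[
\log p_\theta(x) = -(n/2) \log(2\pi) + (1/2) \log\det Y - z^T Y^{-1} z/2 + z^T x - x^T Y x/2,
\]
which is indeed concave (as part (d) tells us it must be): $\log\det Y$ is concave, $-z^T Y^{-1} z$ is
the negative of a matrix fractional function, and the last two terms are linear in $(z, Y)$.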


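(d) We'll consider the case when $D$ is finite. From the definition of $p_\theta(x)$ we have
\[
\begin{array}{rcl}
\log p_\theta(x) &=& \log a(\theta) + \theta^T c(x) \\
&=& \displaystyle -\log \left( \sum_{x \in D} \exp(\theta^T c(x)) \right) + \theta^T c(x).
\end{array}
\]
The second term is affine in $\theta$; the first is concave, since it is the negative log-sum-exp function
composed with the functions $\theta^T c(x)$, $x \in D$, which are linear in $\theta$.
We can show that $f(\theta) = \log p_\theta(x)$ is concave, in the more general case of a density, by
computing the Hessian. Defining $e(\theta, x) = \exp(\theta^T c(x))$, the second partial derivatives are
\[
\frac{\partial^2 f}{\partial\theta_i \partial\theta_j} =
\frac{\left( \int_D c_i(x) e(\theta, x) \, dx \right) \left( \int_D c_j(x) e(\theta, x) \, dx \right)
- \left( \int_D c_i(x) c_j(x) e(\theta, x) \, dx \right) \left( \int_D e(\theta, x) \, dx \right)}
{\left( \int_D e(\theta, x) \, dx \right)^2},
\]
where $c(x) = (c_1(x), \ldots, c_n(x))$. For any vector $v \in \mathbf{R}^n$, this gives
\[
v^T (\nabla^2 f(\theta)) v =
\frac{\left( \int_D (\sum_i v_i c_i(x)) e(\theta, x) \, dx \right)^2
- \left( \int_D (\sum_i v_i c_i(x))^2 e(\theta, x) \, dx \right) \left( \int_D e(\theta, x) \, dx \right)}
{\left( \int_D e(\theta, x) \, dx \right)^2}.
\]
From the Cauchy-Schwarz inequality for functions,
\[
\left( \int g(x) h(x) \, dx \right)^2 \leq \left( \int (g(x))^2 \, dx \right) \left( \int (h(x))^2 \, dx \right),
\]
with $g(x) = (\sum_i v_i c_i(x)) \sqrt{e(\theta, x)}$ and $h(x) = \sqrt{e(\theta, x)}$, we get $v^T (\nabla^2 f(\theta)) v \leq 0$ for all $v$. The
Hessian is negative semidefinite, so $\log p_\theta(x)$ is concave.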

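(e) It's a simple calculation to get the gradient of $\log p_\theta$ (especially when you ignore issues such
as whether or not it exists, which we plan to do). The log-likelihood function is
\[
\ell_\theta(x_1, \ldots, x_K) = \sum_{i=1}^K \log p_\theta(x_i),
\]
so we have
\[
\begin{array}{rcl}
\nabla_\theta \ell_\theta(x_1, \ldots, x_K)
&=& \displaystyle \nabla_\theta \sum_{i=1}^K \left( \theta^T c(x_i) - \log \left( \sum_{x \in D} \exp \theta^T c(x) \right) \right) \\
&=& \displaystyle \sum_{i=1}^K c(x_i) - K \frac{\sum_{x \in D} (\exp \theta^T c(x)) c(x)}{\sum_{x \in D} \exp \theta^T c(x)} \\
&=& \displaystyle \sum_{i=1}^K c(x_i) - K\, \mathbf{E}_\theta\, c(x).
\end{array}
\]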
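The gradient identity of part (e) is easy to check on the Bernoulli family of part (b), where $c(x) = x$ and $\mathbf{E}_\theta\, c(x) = \exp\theta / (1 + \exp\theta)$. The sketch below is a sanity check on made-up samples; the function names and the data are assumptions for illustration.

```python
import math


def log_p(theta, x):
    """log p_theta(x) for D = {0, 1}, c(x) = x: theta*x - log(1 + exp(theta))."""
    return theta * x - math.log(1.0 + math.exp(theta))


def avg_loglik(theta, samples):
    """(1/K) times the log-likelihood of the samples."""
    return sum(log_p(theta, x) for x in samples) / len(samples)


def gradient_formula(theta, samples):
    """Empirical mean of c(x) minus E_theta c(x), as in part (e)."""
    expected = math.exp(theta) / (1.0 + math.exp(theta))
    return sum(samples) / len(samples) - expected


samples = [1, 0, 1, 1, 0, 1, 1, 0]   # illustrative data: 5 ones out of 8
theta_ml = math.log(5.0 / 3.0)       # logit of the empirical mean 5/8
```

At `theta_ml` the formula returns zero, which is the moment-matching condition: the model's probability of a one equals the observed fraction 5/8.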

6.4 Maximum likelihood prediction of team ability. A set of $n$ teams compete in a tournament. We
model each team's ability by a number $a_j \in [0, 1]$, $j = 1, \ldots, n$. When teams $j$ and $k$ play each
other, the probability that team $j$ wins is equal to $\mathbf{prob}(a_j - a_k + v > 0)$, where $v \sim \mathcal{N}(0, \sigma^2)$.
You are given the outcome of $m$ past games. These are organized as
\[
(j^{(i)}, k^{(i)}, y^{(i)}), \quad i = 1, \ldots, m,
\]
meaning that game $i$ was played between teams $j^{(i)}$ and $k^{(i)}$; $y^{(i)} = 1$ means that team $j^{(i)}$ won,
while $y^{(i)} = -1$ means that team $k^{(i)}$ won. (We assume there are no ties.)

(a) Formulate the problem of finding the maximum likelihood estimate of team abilities, $\hat a \in \mathbf{R}^n$,
given the outcomes, as a convex optimization problem. You will find the game incidence
matrix $A \in \mathbf{R}^{m \times n}$, defined as
\[
A_{il} = \left\{ \begin{array}{ll}
y^{(i)} & l = j^{(i)} \\
-y^{(i)} & l = k^{(i)} \\
0 & \mbox{otherwise,}
\end{array} \right.
\]
useful.
The prior constraints $\hat a_i \in [0, 1]$ should be included in the problem formulation. Also, we
note that if a constant is added to all team abilities, there is no change in the probabilities of
game outcomes. This means that $\hat a$ is determined only up to a constant, like a potential. But
this doesn't affect the ML estimation problem, or any subsequent predictions made using the
estimated parameters.
(b) Find $\hat a$ for the team data given in team_data.m, in the matrix train. (This matrix gives the
outcomes for a tournament in which each team plays each other team once.) You may find
the CVX function log_normcdf helpful for this problem.
You can form A using the commands
A = sparse(1:m,train(:,1),train(:,3),m,n) + ...
sparse(1:m,train(:,2),-train(:,3),m,n);

(c) Use the maximum likelihood estimate $\hat a$ found in part (b) to predict the outcomes of next
year's tournament games, given in the matrix test, using $\hat y^{(i)} = \mathbf{sign}(\hat a_{j^{(i)}} - \hat a_{k^{(i)}})$. Compare
these predictions with the actual outcomes, given in the third column of test. Give the
fraction of correctly predicted outcomes.
The games played in train and test are the same, so another, simpler method for predicting
the outcomes in test is to just assume the team that won last year's match will also win this
year's match. Give the percentage of correctly predicted outcomes using this simple method.

Solution.

(a) The likelihood of the outcomes $y$ given $a$ is
\[
p(y|a) = \prod_{i=1,\ldots,m} \Phi\left( \frac{1}{\sigma} y^{(i)} (a_{j^{(i)}} - a_{k^{(i)}}) \right),
\]
where $\Phi$ is the cumulative distribution of the standard normal. The log-likelihood function is
therefore
\[
l(a) = \log p(y|a) = \sum_i \log \Phi\left( (1/\sigma)(Aa)_i \right).
\]
This is a concave function.
The maximum likelihood estimate $\hat a$ is any solution of
\[
\begin{array}{ll}
\mbox{maximize} & l(a) \\
\mbox{subject to} & 0 \preceq a \preceq \mathbf{1}.
\end{array}
\]
This is a convex optimization problem since the objective, which is maximized, is concave,
and the constraints are $2n$ linear inequalities.
(b) The following code solves the problem
% Form adjacency matrix
team_data
A1 = sparse(1:m,train(:,1),train(:,3),m,n);
A2 = sparse(1:m,train(:,2),-train(:,3),m,n);
A = A1+A2;

% Estimate abilities
cvx_begin
variable a_hat(n)
minimize(-sum(log_normcdf(A*a_hat/sigma)))
subject to
a_hat >= 0
a_hat <= 1
cvx_end
Using this code we get that $\hat a = (1.0, 0.0, 0.68, 0.37, 0.79, 0.58, 0.38, 0.09, 0.67, 0.58)$.
(c) The following code is used to predict the outcomes in the test set

% Estimate errors in test set
A1 = sparse(1:m_test,test(:,1),1,m_test,n);
A2 = sparse(1:m_test,test(:,2),-1,m_test,n);
A_test = A1+A2;
res = sign(A_test*a_hat);
Pml = 1-length(find(res-test(:,3)))/m_test
Ply = 1-length(find(train(:,3)-test(:,3)))/m_test
The maximum likelihood estimate gives a correct prediction of 86.7% of the games in test.
On the other hand, 75.6% of the games in test have the same outcome as the games in train.
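For readers without CVX, the log-likelihood of part (a) and the prediction rule of part (c) can be evaluated directly. The sketch below is plain Python with hypothetical function names and a made-up three-game data set; it computes $\log \Phi$ through the complementary error function, which is adequate for moderate arguments but not a substitute for a numerically careful log_normcdf.

```python
import math


def log_normcdf(t):
    """log of the standard normal CDF, using Phi(t) = 0.5 * erfc(-t / sqrt(2))."""
    return math.log(0.5 * math.erfc(-t / math.sqrt(2.0)))


def log_likelihood(a, games, sigma):
    """games: list of (j, k, y) with 1-based team indices and y in {+1, -1}."""
    return sum(log_normcdf(y * (a[j - 1] - a[k - 1]) / sigma) for j, k, y in games)


def predict(a, j, k):
    """Predicted outcome sign(a_j - a_k): +1 if team j is rated higher, else -1."""
    return 1 if a[j - 1] - a[k - 1] > 0 else -1
```

Only differences of abilities enter the likelihood, which is the shift invariance noted in the problem statement; the box constraints $0 \preceq a \preceq \mathbf{1}$ are what make the estimate bounded.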

6.5 Estimating a vector with unknown measurement nonlinearity. (A specific instance of exercise 7.9
in Convex Optimization.) We want to estimate a vector $x \in \mathbf{R}^n$, given some measurements
\[
y_i = \phi(a_i^T x + v_i), \quad i = 1, \ldots, m.
\]
Here $a_i \in \mathbf{R}^n$ are known, $v_i$ are IID $\mathcal{N}(0, \sigma^2)$ random noises, and $\phi : \mathbf{R} \to \mathbf{R}$ is an unknown
monotonic increasing function, known to satisfy
\[
\alpha \leq \phi'(u) \leq \beta,
\]
for all $u$. (Here $\alpha$ and $\beta$ are known positive constants, with $\alpha < \beta$.) We want to find a maximum
likelihood estimate of $x$ and $\phi$, given $y_i$. (We also know $a_i$, $\sigma$, $\alpha$, and $\beta$.)
This sounds like an infinite-dimensional problem, since one of the parameters we are estimating is a
function. In fact, we only need to know the $m$ numbers $z_i = \phi^{-1}(y_i)$, $i = 1, \ldots, m$. So by estimating
$\phi$ we really mean estimating the $m$ numbers $z_1, \ldots, z_m$. (These numbers are not arbitrary; they
must be consistent with the prior information $\alpha \leq \phi'(u) \leq \beta$ for all $u$.)

(a) Explain how to find a maximum likelihood estimate of $x$ and $\phi$ (i.e., $z_1, \ldots, z_m$) using convex
optimization.
(b) Carry out your method on the data given in nonlin_meas_data.m, which includes a matrix
$A \in \mathbf{R}^{m \times n}$, with rows $a_1^T, \ldots, a_m^T$. Give $\hat x_{\rm ml}$, the maximum likelihood estimate of $x$. Plot your
estimated function $\hat\phi_{\rm ml}$. (You can do this by plotting $(\hat z_{\rm ml})_i$ versus $y_i$, with $y_i$ on the vertical
axis and $(\hat z_{\rm ml})_i$ on the horizontal axis.)

Hint. You can assume the measurements are numbered so that $y_i$ are sorted in nondecreasing order,
i.e., $y_1 \leq y_2 \leq \cdots \leq y_m$. (The data given in the problem instance for part (b) is given in this
order.)
Solution.

(a) We can write the measurement model as
\[
\phi^{-1}(y_i) = a_i^T x + v_i, \quad i = 1, \ldots, m.
\]
The function $\phi^{-1}$ is unknown, but it has derivatives between $1/\beta$ and $1/\alpha$. Therefore $z_i =
\phi^{-1}(y_i)$ and $y_i$ must satisfy the inequalities
\[
\frac{y_{i+1} - y_i}{\beta} \leq z_{i+1} - z_i \leq \frac{y_{i+1} - y_i}{\alpha}, \quad i = 1, \ldots, m - 1,
\]
if we assume that the points are sorted with $y_i$ in increasing order. Conversely, if $z$ and $y$ satisfy
these inequalities, then there exists a nonlinear function $\phi$ with $y_i = \phi(z_i)$, $i = 1, \ldots, m$, and
with derivatives between $\alpha$ and $\beta$ (for example, a piecewise-linear function that interpolates
the points). Therefore, as suggested in the problem statement, we can use $z_1, \ldots, z_m$ as
parameters instead of $\phi$.
The log-likelihood function is
\[
l(z, x) = -\frac{1}{2\sigma^2} \sum_{i=1}^m (z_i - a_i^T x)^2 - m \log(\sigma \sqrt{2\pi}).
\]
Thus to find a maximum likelihood estimate of $x$ and $z$ one solves the problem
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^m (z_i - a_i^T x)^2 \\
\mbox{subject to} & (y_{i+1} - y_i)/\beta \leq z_{i+1} - z_i \leq (y_{i+1} - y_i)/\alpha, \quad i = 1, \ldots, m - 1.
\end{array}
\]
This is a quadratic program with variables $z \in \mathbf{R}^m$ and $x \in \mathbf{R}^n$.


(b) The following Matlab code solves the given problem
nonlin_meas_data

row=zeros(1,m);
row(1)=-1;
row(2)=1;
col=zeros(1,m-1);
col(1)=-1;
B=toeplitz(col,row);

cvx_begin
variable x(n);
variable z(m);
minimize(norm(z-A*x));
subject to
1/beta*B*y<=B*z;
B*z<=1/alpha*B*y;
cvx_end

disp('estimated x:'); disp(x);

plot(z,y)
ylabel('y')
xlabel('z')
title('ML estimate of \phi')
The estimated x is x = (0.4819, 0.4657, 0.9364, 0.9297).
The first figure shows the estimated function $\hat\phi_{\rm ml}$. The second figure shows $\hat\phi_{\rm ml}$ and the data points
$(a_i^T \hat x_{\rm ml}, y_i)$.

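[Figure: estimated function, with y on the vertical axis and z on the horizontal axis.]

[Figure: estimated function together with the data points, with y on the vertical axis and x on the horizontal axis.]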
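The slope constraints from part (a) of exercise 6.5, and the piecewise-linear interpolant mentioned there, can be written out as a short sketch. The helper names below are hypothetical, and the interpolant assumes the $z_i$ are strictly increasing (which the constraints imply when the $y_i$ are strictly increasing).

```python
def slopes_consistent(z, y, alpha, beta, tol=1e-12):
    """Check (y[i+1]-y[i])/beta <= z[i+1]-z[i] <= (y[i+1]-y[i])/alpha for all i."""
    for i in range(len(y) - 1):
        dy, dz = y[i + 1] - y[i], z[i + 1] - z[i]
        if dz < dy / beta - tol or dz > dy / alpha + tol:
            return False
    return True


def phi_interp(u, z, y):
    """Piecewise-linear interpolant through (z[i], y[i]), for z[0] <= u <= z[-1]."""
    for i in range(len(z) - 1):
        if z[i] <= u <= z[i + 1]:
            w = (u - z[i]) / (z[i + 1] - z[i])
            return (1 - w) * y[i] + w * y[i + 1]
    raise ValueError("u outside [z[0], z[-1]]")
```

Each segment of the interpolant has slope $(y_{i+1} - y_i)/(z_{i+1} - z_i)$, so the constraints are exactly the statement that every segment slope lies in $[\alpha, \beta]$.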

6.6 Maximum likelihood estimation of an increasing nonnegative signal. We wish to estimate a scalar
signal $x(t)$, for $t = 1, 2, \ldots, N$, which is known to be nonnegative and monotonically nondecreasing:
\[
0 \leq x(1) \leq x(2) \leq \cdots \leq x(N).
\]
This occurs in many practical problems. For example, $x(t)$ might be a measure of wear or
deterioration, that can only get worse, or stay the same, as time $t$ increases. We are also given that
$x(t) = 0$ for $t \leq 0$.
We are given a noise-corrupted moving average of $x$, given by
\[
y(t) = \sum_{\tau=1}^k h(\tau) x(t - \tau) + v(t), \quad t = 2, \ldots, N + 1,
\]
where $v(t)$ are independent $\mathcal{N}(0, 1)$ random variables.

(a) Show how to formulate the problem of finding the maximum likelihood estimate of $x$, given
$y$, taking into account the prior assumption that $x$ is nonnegative and monotonically
nondecreasing, as a convex optimization problem. Be sure to indicate what the problem variables
are, and what the problem data are.
(b) We now consider a specific instance of the problem, with problem data (i.e., $N$, $k$, $h$, and $y$)
given in the file ml_estim_incr_signal_data.*. (This file contains the true signal xtrue,
which of course you cannot use in creating your estimate.) Find the maximum likelihood
estimate $\hat x_{\rm ml}$, and plot it, along with the true signal. Also find and plot the maximum likelihood
estimate $\hat x_{\rm ml,free}$ not taking into account the signal nonnegativity and monotonicity.
Hints.
Matlab: The function conv (convolution) is overloaded to work with CVX.
Python: Numpy has a function convolve which performs convolution. CVXPY has conv
which does the same thing for variables.
Julia: The function conv is overloaded to work with Convex.jl.

Solution.

(a) To simplify our notation, we let the signal $y_x$ be the noiseless moving average of the signal $x$.
That is,
\[
y_x(t) = \sum_{\tau=1}^k h(\tau) x(t - \tau), \quad t = 2, \ldots, N + 1.
\]
Note that $y_x$ is a linear function of $x$.
The nonnegativity and monotonicity constraint on $x$ can be expressed as a set of linear
inequalities,
\[
x(1) \geq 0, \quad x(1) \leq x(2), \quad \ldots, \quad x(N - 1) \leq x(N).
\]
Now we turn to the maximum likelihood problem. The likelihood function is
\[
\prod_{t=2}^{N+1} p(y(t) - y_x(t)),
\]
where $p$ is the density function of a $\mathcal{N}(0, 1)$ random variable. The negative log-likelihood
function has the form
\[
\alpha + \beta \|y_x - y\|_2^2,
\]
where $\alpha$ is a constant, and $\beta$ is a positive constant. Thus, the ML estimate is found by
minimizing the quadratic objective $\|y_x - y\|_2^2$. The ML estimate when we do not take into account
signal nonnegativity and monotonicity can be found by solving a least-squares problem,
\[
\hat x_{\rm ml,free} = \mathop{\rm argmin}_x \|y_x - y\|_2^2.
\]
Since convolution is an invertible linear operation, to get the ML estimate without the
monotonicity constraint, we simply apply deconvolution. Since the number of measurements and
variables to estimate are the same ($N$), there is no smoothing effect to reduce noise, and we
can expect the deconvolved estimate to be poor.
With the prior assumption that the signal $x$ is nonnegative and monotonically nondecreasing
we can find the ML estimate $\hat x_{\rm ml}$ by solving the QP
\[
\begin{array}{ll}
\mbox{minimize} & \|y_x - y\|_2^2 \\
\mbox{subject to} & x(1) \geq 0 \\
& x(t) \leq x(t + 1), \quad t = 1, \ldots, N - 1.
\end{array}
\]
(b) The ML estimate for the given problem instance, with and without the assumption of
nonnegativity and monotonicity, is plotted below. We've also plotted the true signal $x$. We observe
that the ML estimate $\hat x_{\rm ml}$, which takes into consideration nonnegativity and monotonicity, is
a much better estimate of signal $x$ than the simple unconstrained solution $\hat x_{\rm ml,free}$ obtained
via deconvolution.

[Figure: true signal (xt), constrained ML estimate (xmono), and unconstrained estimate (xls) versus t.]

Plots for the other languages are shown below.


The following Matlab code computes the ML solutions for the constrained and unconstrained
estimation problems.

% ML estimation of increasing nonnegative signal


% problem data
ml_estim_incr_signal_data;

% maximum likelihood estimation with no monotonicity taken in to account


% can be solved analytically
cvx_begin
variable xls(N)
yhat = conv(h,xls); % estimated output
% yhat is truncated to match problem description
minimize (sum_square(yhat(1:end-3) - y))
cvx_end

% monotonic and non-negative signal estimation


cvx_begin

variable xmono(N)
yhat = conv(h,xmono); % estimated output
minimize (sum_square(yhat(1:end-3) - y))
subject to
xmono(1) >= 0;
xmono(1:N-1) <= xmono(2:N);
cvx_end

t = 1:N;
figure; set(gca, 'FontSize',12);
plot(t,xtrue,'--',t,xmono,'r',t,xls,'k:');
xlabel('t'); legend('xt','xmono','xls','Location','SouthEast');
%print -depsc ml_estim_incr_signal_plot

The following code implements the same solution in Python


import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

# Load data file, gives us N and Y


from ml_estim_incr_signal_data import *

# Estimate signal without constraints


xls = cvx.Variable(N)
yhat = cvx.conv(h,xls)
error = cvx.sum_squares(yhat[0:-3] - y)
prob = cvx.Problem(cvx.Minimize(error))
prob.solve()

xmono = cvx.Variable(N)
yhat = cvx.conv(h,xmono)
error = cvx.sum_squares(yhat[0:-3] - y)
constraints = [xmono[0] >= 0]
constraints += [xmono[0:-1] <= xmono[1:]]
prob2 = cvx.Problem(cvx.Minimize(error),constraints)
prob2.solve()

fig = plt.figure()
t = np.arange(N)
plt.plot(t,xtrue,'b--',t,xls.value,'k--',t,xmono.value,'r')
# save before show(): with some backends the figure is gone once the window closes
plt.savefig('ml_estim_incr_signal_plot.eps',format='eps')
plt.show()
The plot below shows the results from the Python implementation.

[Figure: results from the Python implementation; true signal and the two estimates versus t.]

In Julia, the code is written as follows.


# ML estimation of increasing nonnegative signal
# problem data
include("ml_estim_incr_signal_data.jl");

using Convex

# maximum likelihood estimation with no monotonicity taken in to account


# can be solved analytically
xls = Variable(N);
yhat = conv(h, xls); # estimated output
# yhat is truncated to match problem description
p = minimize(sumsquares(yhat[1:end-3] - y));
solve!(p);

# monotonic and non-negative signal estimation


xmono = Variable(N);
yhat = conv(h, xmono); # estimated output
constraints = [xmono[1] >= 0, xmono[1:N-1] <= xmono[2:N]]
p = minimize(sumsquares(yhat[1:end-3] - y), constraints);
solve!(p);

using PyPlot
t = 1:N;
figure();
hold(true);
plot(t, xtrue, "b--", label="xtrue");
plot(t, xmono.value, "r", label="xmono");
plot(t, xls.value, "k:", label="xls");
xlabel("t");
legend(loc=4);
savefig("ml_estim_incr_signal.eps");

[Figure: results from the Julia implementation; xtrue, xmono, and xls versus t.]
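The noiseless moving average $y_x$ from the solution of exercise 6.6 can be written out explicitly, which also shows why the listings truncate the output of conv. The sketch below is plain Python with a hypothetical function name; it follows the indexing of the problem statement, with $x(t) = 0$ for $t \leq 0$.

```python
def moving_average(h, x):
    """y_x(t) = sum_{tau=1}^{k} h(tau) x(t - tau) for t = 2, ..., N+1.

    h is [h(1), ..., h(k)] and x is [x(1), ..., x(N)]; x(t) = 0 for t <= 0.
    Returns [y_x(2), ..., y_x(N+1)].
    """
    N, k = len(x), len(h)
    out = []
    for t in range(2, N + 2):
        s = 0.0
        for tau in range(1, k + 1):
            if t - tau >= 1:                 # x(t - tau) = 0 for t - tau <= 0
                s += h[tau - 1] * x[t - tau - 1]
        out.append(s)
    return out
```

These $N$ values are the first $N$ entries of the full convolution of $h$ and $x$, which has $N + k - 1$ entries; dropping the last $k - 1$ entries is what the truncation `yhat(1:end-3)` in the listings does, on the assumption that $k = 4$ in the data file.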

6.7 Relaxed and discrete A-optimal experiment design. This problem concerns the A-optimal experi-
ment design problem, described on page 387, with data generated as follows.

n = 5; % dimension of parameters to be estimated


p = 20; % number of available types of measurements
m = 30; % total number of measurements to be carried out
randn('state', 0);
V=randn(n,p); % columns are vi, the possible measurement vectors

Solve the relaxed A-optimal experiment design problem,
\[
\begin{array}{ll}
\mbox{minimize} & (1/m) \, \mathbf{tr} \left( \sum_{i=1}^p \lambda_i v_i v_i^T \right)^{-1} \\
\mbox{subject to} & \mathbf{1}^T \lambda = 1, \quad \lambda \succeq 0,
\end{array}
\]
with variable $\lambda \in \mathbf{R}^p$. Find the optimal point $\lambda^\star$ and the associated optimal value of the relaxed
problem. This optimal value is a lower bound on the optimal value of the discrete A-optimal
experiment design problem,
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{tr} \left( \sum_{i=1}^p m_i v_i v_i^T \right)^{-1} \\
\mbox{subject to} & m_1 + \cdots + m_p = m, \quad m_i \in \{0, \ldots, m\}, \quad i = 1, \ldots, p,
\end{array}
\]
with variables $m_1, \ldots, m_p$. To get a suboptimal point for this discrete problem, round the entries
in $m \lambda^\star$ to obtain integers $\hat m_i$. If needed, adjust these by hand or some other method to ensure that
they sum to $m$, and compute the objective value obtained. This is, of course, an upper bound on
the optimal value of the discrete problem. Give the gap between this upper bound and the lower
bound obtained from the relaxed problem. Note that the two objective values can be interpreted
as mean-square estimation error $\mathbf{E} \|\hat x - x\|_2^2$.
Solution. The objective of the relaxed problem is convex, so it is a convex problem. Expressing
it in CVX requires a little work. We'd like to write the objective as

minimize ((1/m)*trace(inv(V*diag(lambda)*V')))

but this won't work, because CVX doesn't know about matrix convex functions. Instead, we can
express the objective as a sum of matrix fractional functions,
\[
\begin{array}{ll}
\mbox{minimize} & (1/m) \sum_{k=1}^n e_k^T \left( \sum_{i=1}^p \lambda_i v_i v_i^T \right)^{-1} e_k \\
\mbox{subject to} & \mathbf{1}^T \lambda = 1, \quad \lambda \succeq 0,
\end{array}
\]
where $e_k \in \mathbf{R}^n$ is the $k$th unit vector. (Note that $e$ is defined in exercise 6.9 as the estimation error
vector, so $e_k$ could also mean the $k$th entry of the error vector. But here, clearly, $e_k$ is the $k$th unit
vector.)
We can express this in CVX using the function matrix_frac. The following code solves the problem.

n = 5; % dimension
p = 20; % number of available types of measurements
m = 30; % total number of measurements to be carried out
randn('state', 0);
V = randn(n,p); % columns are vi, the possible measurement vectors

cvx_begin
    variable lambda(p)
    obj = 0;
    for k=1:n
        ek = zeros(n,1);
        ek(k) = 1;
        obj = obj + (1/m)*matrix_frac(ek, V*diag(lambda)*V');
    end
    minimize( obj )
    subject to
        sum(lambda) == 1
        lambda >= 0
cvx_end

lower_bound = cvx_optval

t = -0.00; % small offset chosen by hand to make rounding work out;
           % for this problem data, none is needed!
m_rnd = pos(round(m*lambda+t));
sum(m_rnd) % should be == m

% now find objective value of rounded experiment design
upper_bound = trace(inv(V*diag(m_rnd/m)*V'))/m
gap = upper_bound-lower_bound
rel_gap = gap/lower_bound

For this problem instance, simple rounding yielded $\hat m_i$ that summed to $m = 30$, so no adjustment
of the rounded values is needed. The lower bound is 0.2481; the upper bound is 0.2483. The gap
is 0.00023, which is around 0.1%.
What this means is this: We have found a choice of 30 measurements, each one from the set of 20
possible measurements, that yields a mean-square estimation error $\mathbf{E} \|\hat x - x\|_2^2 = 0.2483$. We do
not know whether this is the optimal choice of 30 measurements. But we do know that this choice
is no more than 0.1% suboptimal; the optimal choice can achieve a mean-square error that is no
smaller than 0.2481. Our experiment design is, if not optimal, very nearly optimal. (In fact, it is
very likely to be optimal.)
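
The rounding-and-evaluation step can also be sketched outside CVX. The following Python snippet is only an illustration under stated assumptions: the helper names a_opt_value and round_to_total are hypothetical, the data are regenerated with NumPy (so they differ from the Matlab data above), and a uniform lambda stands in for the solver output. It shows how the rounded counts are repaired to sum to m, and that evaluating the relaxed objective at m_i/m gives the discrete objective.

```python
import numpy as np

def a_opt_value(V, weights):
    """Return tr((sum_i weights[i] * v_i v_i^T)^{-1}) for the columns v_i of V."""
    M = (V * weights) @ V.T          # same as V @ diag(weights) @ V.T
    return np.trace(np.linalg.inv(M))

def round_to_total(lam, m):
    """Round m*lam to nonnegative integers, then repair the sum so it equals m."""
    target = m * lam
    counts = np.round(target).astype(int)
    while counts.sum() != m:
        step = 1 if counts.sum() < m else -1
        # adjust the entry whose rounding error is largest in the needed direction
        err = step * (target - counts)
        if step < 0:
            err[counts == 0] = -np.inf   # never push a count below zero
        counts[np.argmax(err)] += step
    return counts

rng = np.random.default_rng(0)
n, p, m = 5, 20, 30
V = rng.standard_normal((n, p))      # stand-in data, not the Matlab instance
lam = np.full(p, 1.0 / p)            # placeholder for the relaxed solution
counts = round_to_total(lam, m)

relaxed_value = a_opt_value(V, lam) / m      # relaxed objective at lam
discrete_value = a_opt_value(V, counts)      # discrete objective at the rounded design
print(counts.sum(), relaxed_value, discrete_value)
```

With the true relaxed solution in place of the uniform lam, relaxed_value would be the lower bound and discrete_value the upper bound discussed above.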

6.8 Optimal detector design. We adopt here the notation of §7.3 of the book. Explain how to design a
(possibly randomized) detector that minimizes the worst-case probability of our estimate being off
by more than one,

\[
P_{\rm wc} = \max_\theta \operatorname{prob}(|\hat\theta - \theta| \geq 2).
\]

(The probability above is under the distribution associated with $\theta$.)
Carry out your method for the problem instance with data in off_by_one_det_data.m. Give the
optimal detection probability matrix $D$. Compare the optimal worst-case probability $P_{\rm wc}^\star$ with the
worst-case probability $P_{\rm wc}^{\rm ml}$ obtained using a maximum-likelihood detector.
Solution. For $\theta = j$ we have

\[
\operatorname{prob}(|\hat\theta - \theta| \geq 2) = \sum_{|i-j| \geq 2} D_{ij} = 1 - D_{j+1,j} - D_{jj} - D_{j-1,j},
\]

where we interpret $D_{ij}$ as zero for $i = 0$ and $i = m + 1$. Therefore we find our detector by solving
the problem

\[
\begin{array}{ll}
\mbox{minimize} & \max_j \sum_{|i-j| \geq 2} D_{ij} \\
\mbox{subject to} & D = [t_1 \; \cdots \; t_n] P \\
& t_k \succeq 0, \quad \mathbf{1}^T t_k = 1, \quad k = 1, \ldots, n,
\end{array}
\]

with variables $t_1, \ldots, t_n$. This problem is evidently convex, since the constraints are linear equalities
and inequalities, and the objective is a piecewise-linear convex function.
The optimal detection probability matrix for the given problem instance is

\[
D = \left[ \begin{array}{ccccc}
0.2466 & 0.2816 & 0.1616 & 0.0911 & 0.1102 \\
0.4816 & 0.3855 & 0.4761 & 0.1807 & 0.1601 \\
0.0046 & 0.0611 & 0.1691 & 0.0662 & 0.0014 \\
0.1477 & 0.1687 & 0.1218 & 0.3664 & 0.3940 \\
0.1195 & 0.1031 & 0.0715 & 0.2956 & 0.3343
\end{array} \right].
\]

The worst-case probability that we are off by more than one is $P_{\rm wc}^\star = 0.27$. The worst-case
probability of being off by more than one using a maximum-likelihood detector is $P_{\rm wc}^{\rm ml} = 0.74$,
nearly a factor of three worse than the optimal detector.
The following Matlab code finds the optimal detection probability matrix D.

off_by_one_det_data;

% off by one randomized detector
cvx_begin
    variables D(m,m) T(m,n)
    % d(i) is prob we're off by zero or one, with hypothesis i
    d = cvx(zeros(m,1));
    d(1) = D(1,1)+D(2,1);
    for i = 2:m-1
        d(i) = D(i-1,i)+D(i,i)+D(i+1,i);
    end
    d(m) = D(m-1,m)+D(m,m);
    T >= 0; sum(T) == 1;
    D == T*P; % detection probability matrix
    minimize (max(1-d))
cvx_end
Popt = cvx_optval; % max prob we are off by more than one

% maximum likelihood (ML) detector
% first we form the detector matrix, T_ml
Tml = zeros(m,n);
for i=1:n
    j = find(P(i,:)==max(P(i,:)));
    Tml(j,i) = 1;
end
% and now the ML detection probability matrix, D_ml
Dml = Tml*P;

% worst case probability using maximum-likelihood detector
% dml(i) is prob we're off by zero or one, with hypothesis i
dml(1) = Dml(1,1)+Dml(2,1);
for i=2:m-1
    dml(i) = Dml(i-1,i)+Dml(i,i)+Dml(i+1,i);
end
dml(m) = Dml(m-1,m)+Dml(m,m);
Pml = max(1-dml); % ML max prob we are off by more than one

disp('the optimal detection prob matrix');
disp(D)
disp('max prob using optimal detector');
disp(Popt)
disp('max prob using ML detector');
disp(Pml)
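
As a cross-check of the reported worst-case probability, the off-by-more-than-one probability can be recomputed directly from the matrix D printed above. This Python sketch is illustrative only; the function name worst_case_off_by_more_than_one is hypothetical, and the entries of D are copied from the four-digit values in the text, so the result is accurate only to rounding.

```python
import numpy as np

def worst_case_off_by_more_than_one(D):
    """Return max_j sum_{|i-j|>=2} D[i, j] for a detection probability matrix D."""
    m = D.shape[0]
    i, j = np.indices((m, m))
    far = np.abs(i - j) >= 2                  # entries where the estimate is off by 2 or more
    return (D * far).sum(axis=0).max()        # column j corresponds to hypothesis j

D = np.array([
    [0.2466, 0.2816, 0.1616, 0.0911, 0.1102],
    [0.4816, 0.3855, 0.4761, 0.1807, 0.1601],
    [0.0046, 0.0611, 0.1691, 0.0662, 0.0014],
    [0.1477, 0.1687, 0.1218, 0.3664, 0.3940],
    [0.1195, 0.1031, 0.0715, 0.2956, 0.3343],
])
p_wc = worst_case_off_by_more_than_one(D)
print(round(p_wc, 2))
```

Four of the five columns attain (nearly) the same value, which is typical of a minimax solution.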

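6.9 Experiment design with condition number objective. Explain how to solve the experiment design
problem (7.5) with the condition number cond(E) of E (the error covariance matrix) as the
objective to be minimized.
Solution. The problem is a quasiconvex minimization problem. We can write it as

\[
\begin{array}{ll}
\mbox{minimize} & t_1/t_2 \\
\mbox{subject to} & t_2 I \preceq \sum_{i=1}^m \lambda_i v_i v_i^T \preceq t_1 I \\
& \lambda \succeq 0, \quad \mathbf{1}^T \lambda = 1
\end{array}
\]

(where the domain of $t_1/t_2$ is $\mathbf{R} \times \mathbf{R}_{++}$), with variables $\lambda$, $t_1$, $t_2$. We can solve this using bisection,
solving an SDP feasibility problem in each step.
We can also solve the problem with one SDP, as follows. We divide the LMI by $t_2$, and define
$s = t_1/t_2$, $\mu = \lambda/t_2$ to get

\[
\begin{array}{ll}
\mbox{minimize} & s \\
\mbox{subject to} & I \preceq \sum_{i=1}^m \mu_i v_i v_i^T \preceq sI \\
& \mu \succeq 0, \quad \mathbf{1}^T \mu = 1/t_2,
\end{array}
\]

where the variable $t_2$ must be positive. The last constraint is the same as $\mathbf{1}^T \mu > 0$, which is
redundant (the left-hand LMI already forces $\mu \neq 0$). So we end up with the single SDP

\[
\begin{array}{ll}
\mbox{minimize} & s \\
\mbox{subject to} & I \preceq \sum_{i=1}^m \mu_i v_i v_i^T \preceq sI \\
& \mu \succeq 0.
\end{array}
\]

We reconstruct the optimal $\lambda$ as $\lambda^\star = \mu^\star / \mathbf{1}^T \mu^\star$.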
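
The change of variables works because the condition number is unchanged when the weights are scaled by a positive constant. A small numerical illustration in Python follows; the random V and the unnormalized weights mu are stand-ins chosen for this sketch (they are not the output of an SDP solver), and info_matrix is a hypothetical helper name.

```python
import numpy as np

def info_matrix(V, w):
    """Return sum_i w[i] * v_i v_i^T, where v_i are the columns of V."""
    return (V * w) @ V.T

rng = np.random.default_rng(1)
n, p = 4, 12
V = rng.standard_normal((n, p))
mu = rng.uniform(0.1, 2.0, size=p)    # stand-in for the (unnormalized) SDP solution
lam = mu / mu.sum()                   # recover weights that sum to one

cond_mu = np.linalg.cond(info_matrix(V, mu))
cond_lam = np.linalg.cond(info_matrix(V, lam))
# the error covariance E is the inverse, which has the same condition number
cond_E = np.linalg.cond(np.linalg.inv(info_matrix(V, lam)))
print(cond_mu, cond_lam, cond_E)
```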

6.10 Worst-case probability of loss. Two investments are made, with random returns $R_1$ and $R_2$. The
total return for the two investments is $R_1 + R_2$, and the probability of a loss (including breaking
even, i.e., $R_1 + R_2 = 0$) is $p^{\rm loss} = \operatorname{prob}(R_1 + R_2 \leq 0)$. The goal is to find the worst-case (i.e.,
maximum possible) value of $p^{\rm loss}$, consistent with the following information. Both $R_1$ and $R_2$ have
Gaussian marginal distributions, with known means $\mu_1$ and $\mu_2$ and known standard deviations $\sigma_1$
and $\sigma_2$. In addition, it is known that $R_1$ and $R_2$ are correlated with correlation coefficient $\rho$, i.e.,

\[
\mathbf{E} (R_1 - \mu_1)(R_2 - \mu_2) = \rho \sigma_1 \sigma_2.
\]

Your job is to find the worst-case $p^{\rm loss}$ over any joint distribution of $R_1$ and $R_2$ consistent with the
given marginals and correlation coefficient.
We will consider the specific case with data

\[
\mu_1 = 8, \quad \mu_2 = 20, \quad \sigma_1 = 6, \quad \sigma_2 = 17.5, \quad \rho = -0.25.
\]

We can compare the results to the case when $R_1$ and $R_2$ are jointly Gaussian. In this case we have

\[
R_1 + R_2 \sim \mathcal{N}(\mu_1 + \mu_2, \; \sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2),
\]

which for the data given above gives $p^{\rm loss} = 0.050$. Your job is to see how much larger $p^{\rm loss}$ can
possibly be.
This is an infinite-dimensional optimization problem, since you must maximize $p^{\rm loss}$ over an
infinite-dimensional set of joint distributions. To (approximately) solve it, we discretize the values that $R_1$
and $R_2$ can take on, to $n = 100$ values $r_1, \ldots, r_n$, uniformly spaced from $r_1 = -30$ to $r_n = +70$.
We use the discretized marginals $p^{(1)}$ and $p^{(2)}$ for $R_1$ and $R_2$, given by

\[
p^{(k)}_i = \operatorname{prob}(R_k = r_i) =
\frac{\exp\left( -(r_i - \mu_k)^2 / (2\sigma_k^2) \right)}{\sum_{j=1}^n \exp\left( -(r_j - \mu_k)^2 / (2\sigma_k^2) \right)},
\]

for $k = 1, 2$, $i = 1, \ldots, n$.
Formulate the (discretized) problem as a convex optimization problem, and solve it. Report the
maximum value of $p^{\rm loss}$ you find. Plot the joint distribution that yields the maximum value of $p^{\rm loss}$
using the Matlab commands mesh and contour.
Remark. You might be surprised at both the maximum value of $p^{\rm loss}$, and the joint distribution
that achieves it.
Solution. Let $P \in \mathbf{R}^{n \times n}_+$ be the matrix of joint probabilities, with $P_{ij} = \operatorname{prob}(R_1 = r_i, R_2 = r_j)$.
The condition that the marginals are the given ones is

\[
P \mathbf{1} = p^{(1)}, \qquad P^T \mathbf{1} = p^{(2)}.
\]

The correlation constraint can be expressed as

\[
(r - \mu_1 \mathbf{1})^T P (r - \mu_2 \mathbf{1}) = \rho \sigma_1 \sigma_2.
\]

The probability of a loss $R_1 + R_2 \leq 0$ is given by

\[
\operatorname{prob}(R_1 + R_2 \leq 0) = \sum_{r_i + r_j \leq 0} P_{ij}.
\]

So the problem is an LP,

\[
\begin{array}{ll}
\mbox{maximize} & \sum_{r_i + r_j \leq 0} P_{ij} \\
\mbox{subject to} & P_{ij} \geq 0, \quad i, j = 1, \ldots, n \\
& P \mathbf{1} = p^{(1)} \\
& P^T \mathbf{1} = p^{(2)} \\
& (r - \mu_1 \mathbf{1})^T P (r - \mu_2 \mathbf{1}) = \rho \sigma_1 \sigma_2,
\end{array}
\]

with variable $P \in \mathbf{R}^{n \times n}$.
The code to find the worst-case joint distribution is below.

% loss bounds solution
clear all; close all;

mu1 = 8; mu2 = 20;
sigma1 = 6; sigma2 = 17.5;
rho = -0.25;

% assuming jointly gaussian distribution
mu = mu1 + mu2;
sigma = sqrt(sigma1^2 + sigma2^2 + 2*rho*sigma1*sigma2);
ploss = normcdf(0, mu, sigma); % gaussian probability of loss

n = 100;
rmin = -30; rmax = 70;
% discretize outcomes of R1 and R2 (column vector)
r = linspace(rmin,rmax,n)';

% marginal distributions
p1 = exp(-(r-mu1).^2/(2*sigma1^2)); p1 = p1/sum(p1);
p2 = exp(-(r-mu2).^2/(2*sigma2^2)); p2 = p2/sum(p2);

% form mask of region where R1 + R2 <= 0
r1p = r*ones(1,n); r2p = ones(n,1)*r';
loss_mask = (r1p + r2p <= 0);

cvx_begin
    variable P(n,n)
    maximize (sum(sum(P(loss_mask))))
    subject to
        P >= 0;
        sum(P,2) == p1;   % marginal of R1 (row sums)
        sum(P',2) == p2;  % marginal of R2 (column sums)
        (r - mu1)'*P*(r - mu2) == rho*sigma1*sigma2;
cvx_end

pmax = cvx_optval; % worst case probability of loss
pmax
ploss

P = full(P);
figure(1); mesh(r1p, r2p, P);
xlabel('R1'); ylabel('R2'); zlabel('density');
xlim([rmin rmax]); ylim([rmin rmax]);
print -depsc loss_bounds_mesh.eps

figure(2); contour(r1p, r2p, P);
xlabel('R1'); ylabel('R2'); grid on;
xlim([rmin rmax]); ylim([rmin rmax]);
print -depsc loss_bounds_cont.eps

This yields a probability of loss of approximately 0.192, almost four times larger than the loss
probability when $R_1$ and $R_2$ are jointly Gaussian! The resulting (worst-case) distribution is plotted
below. It has mass on the line where $R_1 + R_2 = 0$ (i.e., break even, which counts as a loss for us).
Then it has extra mass on another region in the plane, which is needed to make the marginals the
given Gaussians, as well as to meet the constraint on the correlation coefficient.
By the way, we were not picky about the numerics when grading, and gave generous partial credit
so long as you grasped the main idea.
Remark. We didn't ask you to show this, but here's the analysis when $R_1$ and $R_2$ are jointly
Gaussian. Since we are given the means, marginal variances, and correlation, we have

\[
(R_1, R_2) \sim \mathcal{N}((\mu_1, \mu_2), \Sigma), \qquad
\Sigma = \left[ \begin{array}{cc} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{array} \right].
\]

The total return $R_1 + R_2$ is also Gaussian, with mean $\mathbf{E}(R_1 + R_2) = \mu_1 + \mu_2$ and variance

\[
\operatorname{var}(R_1 + R_2) = \mathbf{1}^T \Sigma \mathbf{1} = \sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2.
\]

The probability of loss for the jointly Gaussian case is therefore

\[
p^{\rm loss} = \Phi\left( \frac{-(\mu_1 + \mu_2)}{\sqrt{\sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2}} \right),
\]

where $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2} \, dt$ is the cumulative distribution function of a standard Gaussian.
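
The jointly Gaussian value quoted in the problem statement can be reproduced with a few lines of Python. This is just a numerical check of the formula above, using the problem data; std_normal_cdf is a helper name introduced here, built on the standard library error function.

```python
from math import erf, sqrt

def std_normal_cdf(x):
    """Cumulative distribution function of a standard Gaussian."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

mu1, mu2 = 8.0, 20.0
sigma1, sigma2 = 6.0, 17.5
rho = -0.25

mean_total = mu1 + mu2
var_total = sigma1**2 + sigma2**2 + 2 * rho * sigma1 * sigma2
p_loss_gauss = std_normal_cdf(-mean_total / sqrt(var_total))
print(var_total, round(p_loss_gauss, 3))
```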

[Figures: mesh plot of the worst-case joint distribution (axes R1, R2, density), and contour plot of the same distribution (axes R1, R2).]

6.11 Minimax linear fitting. Consider a linear measurement model $y = Ax + v$, where $x \in \mathbf{R}^n$ is a
vector of parameters to be estimated, $y \in \mathbf{R}^m$ is a vector of measurements, $v \in \mathbf{R}^m$ is a set of
measurement errors, and $A \in \mathbf{R}^{m \times n}$ with rank $n$, with $m \geq n$. We know $y$ and $A$, but we don't
know $v$; our goal is to estimate $x$. We make only one assumption about the measurement error $v$:
$\|v\|_\infty \leq \epsilon$.
We will estimate $x$ using a linear estimator $\hat x = By$; we must choose the estimation matrix
$B \in \mathbf{R}^{n \times m}$. The estimation error is $e = \hat x - x$. We will choose $B$ to minimize the maximum possible
value of $\|e\|_\infty$, where the maximum is over all values of $x$ and all values of $v$ satisfying $\|v\|_\infty \leq \epsilon$.

(a) Show how to find $B$ via convex optimization.
(b) Numerical example. Solve the problem instance given in minimax_fit_data.m. Display the
$\hat x$ you obtain and report $\|\hat x - x_{\rm true}\|_\infty$. Here $x_{\rm true}$ is the value of $x$ used to generate the
measurement $y$; it is given in the data file.

Solution.

(a) The problem is to minimize the objective $f(B)$, where

\[
f(B) = \max_{x, \; \|v\|_\infty \leq \epsilon} \|\hat x - x\|_\infty
     = \max_{x, \; \|v\|_\infty \leq \epsilon} \|(BA - I)x + Bv\|_\infty.
\]

The maximum over $x$ is $+\infty$, unless we have $BA = I$, so this will be a constraint for us.
(By the way, $BA = I$ means that $B$ is a left inverse of $A$. The associated estimator is called
an unbiased linear estimator, since without noise there is no estimation error.) Assuming
$BA = I$, we have

\[
f(B) = \max_{\|v\|_\infty \leq \epsilon} \|Bv\|_\infty
     = \max_{\|v\|_\infty \leq \epsilon} \; \max_{i=1,\ldots,n} |b_i^T v|
     = \epsilon \max_{i=1,\ldots,n} \|b_i\|_1,
\]

where $b_i^T$ are the rows of $B$. (In fact, $f(B)/\epsilon$ is a norm of $B$ called, for obvious reasons, the
max-row-sum norm, and denoted $\|B\|_\infty$.) The problem is thus

\[
\begin{array}{ll}
\mbox{minimize} & \max_{i=1,\ldots,n} \|b_i\|_1 \\
\mbox{subject to} & BA = I.
\end{array}
\]

This is evidently a convex problem, with variable $B$. In fact, it is separable in the rows; we
can solve for each row separately, by solving

\[
\begin{array}{ll}
\mbox{minimize} & \|b_i\|_1 \\
\mbox{subject to} & b_i^T A = e_i^T,
\end{array}
\]

for $i = 1, \ldots, n$.
(b) The following code solves the problem.

%% minimax linear fitting

minimax_fit_data;

cvx_begin
    variable B(n,m)
    minimize (norm(B,inf))
    subject to
        B*A == eye(n);
cvx_end

x = B*y;
fprintf(1, 'estimation error = %f\n', norm(x - x_true, inf));
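
The identity used in part (a), that the worst-case error of an unbiased linear estimator equals epsilon times the max-row-sum norm of B, can be checked numerically. The Python sketch below uses assumed random data and takes B to be the pseudo-inverse of A, which is one left inverse but in general not the minimax-optimal one; all names are chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, eps = 8, 3, 0.1
A = rng.standard_normal((m, n))       # full column rank with probability one
B = np.linalg.pinv(A)                 # a left inverse of A (B @ A = I)

row_norms = np.abs(B).sum(axis=1)     # l1 norms of the rows b_i of B
bound = eps * row_norms.max()         # eps * ||B||_inf

# the worst-case noise aligns with the signs of the row with largest l1 norm
i_worst = int(np.argmax(row_norms))
v_worst = eps * np.sign(B[i_worst])
achieved = np.abs(B @ v_worst).max()

# randomly sampled admissible noise never exceeds the bound
samples = rng.uniform(-eps, eps, size=(1000, m))
sampled_max = np.abs(samples @ B.T).max()
print(bound, achieved, sampled_max)
```

The minimax-optimal B from the CVX code above can only lower the value of bound relative to the pseudo-inverse.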

6.12 Cox proportional hazards model. Let $T$ be a continuous random variable taking on values in $\mathbf{R}_+$.
We can think of $T$ as modeling an event that takes place at some unknown future time, such as
the death of a living person or a machine failure.
The survival function is $S(t) = \operatorname{prob}(T \geq t)$, which satisfies $S(0) = 1$, $S'(t) \leq 0$, and $\lim_{t \to \infty} S(t) = 0$.
The hazard rate is given by $\lambda(t) = -S'(t)/S(t) \in \mathbf{R}_+$, and has the following interpretation: For
small $\delta > 0$, $\lambda(t)\delta$ is approximately the probability of the event occurring in $[t, t + \delta]$, given that it
has not occurred up to time $t$. The survival function can be expressed in terms of the hazard rate:

\[
S(t) = \exp\left( -\int_0^t \lambda(\tau) \, d\tau \right).
\]

(The hazard rate must have infinite integral over $[0, \infty)$.)
The Cox proportional hazards model gives the hazard rate as a function of some features or
explanatory variables (assumed constant in time) $x \in \mathbf{R}^n$. In particular, $\lambda$ is given by

\[
\lambda(t) = \lambda_0(t) \exp(w^T x),
\]

where $\lambda_0$ (which is nonnegative, with infinite integral) is called the baseline hazard rate, and $w \in \mathbf{R}^n$
is a vector of model parameters. (The name derives from the fact that $\lambda(t)$ is proportional to
$\exp(w_i x_i)$, for each $i$.)
Now suppose that we have observed a set of independent samples, with event times $t^j$ and feature
values $x^j$, for $j = 1, \ldots, N$. In other words, we observe that the event with features $x^j$ occurred at
time $t^j$. You can assume that the baseline hazard rate $\lambda_0$ is known. Show that maximum likelihood
estimation of the parameter $w$ is a convex optimization problem.
Remarks. Regularization is typically included in Cox proportional hazards fitting; for example,
adding $\ell_1$ regularization yields a sparse model, which selects the features to be used. The basic
Cox proportional hazards model described here is readily extended to include discrete times of the
event, censored measurements (which means that we only observe $T$ to be in an interval), and the
effects of features that can vary with time.
Solution. Since the cumulative distribution function of $T$ is $1 - S(t)$, the density is given by

\[
-S'(t) = \lambda(t) S(t) = \lambda_0(t) \exp(w^T x) \exp\left( -\int_0^t \lambda_0(\tau) \exp(w^T x) \, d\tau \right),
\]

where we use the definition $\lambda(t) = -S'(t)/S(t)$ and substitute the model form for $\lambda(t)$. Thus the
log density is

\[
w^T x - S_0(t) \exp(w^T x)
\]

plus a constant (i.e., a term that does not depend on $w$), where

\[
S_0(t) = \int_0^t \lambda_0(\tau) \, d\tau
\]

is the cumulative baseline hazard (the negative logarithm of the baseline survival function). Since $\lambda_0$ is
nonnegative, we have that $S_0(t) \geq 0$, so the log density is a concave function of $w$: the first term is
linear, and the second is a nonnegative multiple of the exponential of a linear function, with a minus sign.
The log-likelihood function for the observed data is then

\[
\ell(w) = \sum_{j=1}^N \left( w^T x^j - S_0(t^j) \exp(w^T x^j) \right),
\]

which is a concave function of $w$. Maximizing $\ell$ over $w$ is therefore a convex optimization problem.
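
Concavity of the log-likelihood can be seen numerically from its Hessian, which is a negatively weighted sum of outer products of the feature vectors. The Python sketch below is an illustration under assumptions made here: a constant baseline hazard lam0 (so that S0(t) = lam0 * t), synthetic features and event times, and hypothetical function names cox_loglik and cox_hessian.

```python
import numpy as np

def cox_loglik(w, X, t, lam0=1.0):
    """l(w) = sum_j (w^T x_j - S0(t_j) exp(w^T x_j)), with S0(t) = lam0 * t.
    Terms that do not depend on w are dropped."""
    z = X @ w
    return np.sum(z - lam0 * t * np.exp(z))

def cox_hessian(w, X, t, lam0=1.0):
    """Hessian of l: -sum_j S0(t_j) exp(w^T x_j) x_j x_j^T."""
    weights = lam0 * t * np.exp(X @ w)
    return -(X.T * weights) @ X

rng = np.random.default_rng(0)
N, n = 50, 3
X = rng.standard_normal((N, n))       # rows are the feature vectors x_j
t = rng.exponential(size=N)           # synthetic event times
w1 = rng.standard_normal(n)
w2 = rng.standard_normal(n)

H = cox_hessian(w1, X, t)
largest_eig = np.linalg.eigvalsh(H).max()
midpoint_gap = cox_loglik(0.5 * (w1 + w2), X, t) - 0.5 * (cox_loglik(w1, X, t) + cox_loglik(w2, X, t))
print(largest_eig, midpoint_gap)
```

A nonpositive largest eigenvalue and a nonnegative midpoint gap are both consistent with concavity.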

6.13 Maximum likelihood estimation for an affinely transformed distribution. Let $z$ be a random variable
on $\mathbf{R}^n$ with density $p_z(u) = \exp(-\phi(\|u\|_2))$, where $\phi : \mathbf{R} \to \mathbf{R}$ is convex and increasing. Examples of
such distributions include the standard normal $\mathcal{N}(0, \sigma^2 I)$, with $\phi(u) = (u)_+^2 + \alpha$, and the multivariable
Laplacian distribution, with $\phi(u) = (u)_+ + \beta$, where $\alpha$ and $\beta$ are normalizing constants, and
$(a)_+ = \max\{a, 0\}$. Now let $x$ be the random variable $x = Az + b$, where $A \in \mathbf{R}^{n \times n}$ is nonsingular.
The distribution of $x$ is parametrized by $A$ and $b$.
Suppose $x_1, \ldots, x_N$ are independent samples from the distribution of $x$. Explain how to find a
maximum likelihood estimate of $A$ and $b$ using convex optimization. If you make any further
assumptions about $A$ and $b$ (beyond invertibility of $A$), you must justify it.
Hint. The density of $x = Az + b$ is given by

\[
p_x(v) = \frac{1}{|\det A|} p_z(A^{-1}(v - b)).
\]

Solution. The density of $x = Az + b$ is given by

\[
p_x(v) = \frac{1}{|\det A|} \exp\left( -\phi(\|A^{-1}(v - b)\|_2) \right).
\]

We first observe that the density with parameters $(A, b)$ is the same as the density with parameters
$(AQ, b)$, for any orthogonal matrix $Q$, since

\[
|\det(AQ)| = |\det A| \, |\det Q| = |\det A|,
\]

and

\[
\|(AQ)^{-1}(v - b)\|_2 = \|Q^T A^{-1}(v - b)\|_2 = \|A^{-1}(v - b)\|_2.
\]

Let $A$ have SVD $A = U \Sigma V^T$. Choosing $Q = V U^T$, we see that $AQ = U \Sigma U^T \in \mathbf{S}^n_{++}$. So we can
always assume that $A \in \mathbf{S}^n_{++}$.
The log-likelihood function for a single sample $x$ is

\[
\ell(A, b) = -\log\det A - \phi(\|A^{-1}(x - b)\|_2),
\]

so for $N$ independent samples, we have log-likelihood function

\[
\ell(A, b) = -N \log\det A - \sum_{i=1}^N \phi(\|A^{-1}(x_i - b)\|_2).
\]

We must maximize $\ell$ over $A \in \mathbf{S}^n_{++}$ and $b \in \mathbf{R}^n$. It's not concave in these parameters, but we will
use instead the parameters

\[
B = A^{-1} \in \mathbf{S}^n_{++}, \qquad c = A^{-1} b \in \mathbf{R}^n.
\]

From $B$ and $c$ we can recover $A$ and $b$ as

\[
A = B^{-1}, \qquad b = B^{-1} c.
\]

In terms of $B$ and $c$, the log-likelihood function is

\[
\tilde\ell(B, c) = N \log\det B - \sum_{i=1}^N \phi(\|Bx_i - c\|_2),
\]

which is concave. To see this, we note that the first term is concave in $B$, and the second term is
concave since $\|Bx_i - c\|_2$ is convex in $(B, c)$, and by the composition rule ($\phi$ is convex and
increasing), $\phi(\|Bx_i - c\|_2)$ is convex.
So we just maximize $\tilde\ell(B, c)$ over $B \in \mathbf{S}^n_{++}$, $c \in \mathbf{R}^n$, and then get $A$ and $b$ as described above.
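
The symmetrization argument can be illustrated numerically: starting from an arbitrary nonsingular A, the matrix AQ with Q = V U^T is symmetric positive definite and leaves both |det A| and the norm appearing in the density unchanged. The Python sketch below uses random data chosen only for this check.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))       # nonsingular with probability one
b = rng.standard_normal(n)
v = rng.standard_normal(n)

U, s, Vt = np.linalg.svd(A)           # A = U @ diag(s) @ Vt
Q = Vt.T @ U.T                        # Q = V U^T is orthogonal
A_sym = A @ Q                         # equals U @ diag(s) @ U.T

norm_A = np.linalg.norm(np.linalg.solve(A, v - b))
norm_AQ = np.linalg.norm(np.linalg.solve(A_sym, v - b))
print(np.linalg.eigvalsh(A_sym).min(), norm_A, norm_AQ)
```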

6.14 A simple MAP problem. We seek to estimate a point $x \in \mathbf{R}^2_+$, with exponential prior density
$p(x) = \exp(-(x_1 + x_2))$, based on the measurements

\[
y_1 = x_1 + v_1, \qquad y_2 = x_2 + v_2, \qquad y_3 = x_1 - x_2 + v_3,
\]

where $v_1, v_2, v_3$ are IID $\mathcal{N}(0, 1)$ random variables (also independent of $x$). A naive estimate of $x$ is
given by $\hat x_{\rm naive} = (y_1, y_2)$.
(a) Explain how to find the MAP estimate of $x$, given the observations $y_1, y_2, y_3$.
(b) Generate 100 random instances of $x$ and $y$, from the given distributions. For each instance,
find the MAP estimate $\hat x_{\rm map}$ and the naive estimate $\hat x_{\rm naive}$. Give a scatter plot of the MAP
estimation error, i.e., $\hat x_{\rm map} - x$, and another scatter plot of the naive estimation error, $\hat x_{\rm naive} - x$.
XXX missing (March 2016) XXX
6.15 Minimum possible maximum correlation. Let $Z$ be a random variable taking values in $\mathbf{R}^n$, and let
$\Sigma \in \mathbf{S}^n_{++}$ be its covariance matrix. We do not know $\Sigma$, but we do know the variance of $m$ linear
functions of $Z$. Specifically, we are given nonzero vectors $a_1, \ldots, a_m \in \mathbf{R}^n$ and $\sigma_1, \ldots, \sigma_m > 0$ for
which

\[
\operatorname{var}(a_i^T Z) = \sigma_i^2, \quad i = 1, \ldots, m.
\]

For $i \neq j$ the correlation of $Z_i$ and $Z_j$ is defined to be

\[
\rho_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii} \Sigma_{jj}}}.
\]

Let $\rho^{\max} = \max_{i \neq j} |\rho_{ij}|$ be the maximum (absolute value) of the correlation among entries of $Z$. If
$\rho^{\max}$ is large, then at least two components of $Z$ are highly correlated (or anticorrelated).

(a) Explain how to find the smallest value of $\rho^{\max}$ that is consistent with the given information,
using convex or quasiconvex optimization. If your formulation involves a change of variables
or other transformation, justify it.
(b) The file correlation_bounds_data.* contains $\sigma_1, \ldots, \sigma_m$ and the matrix $A$ with columns
$a_1, \ldots, a_m$. Find the minimum value of $\rho^{\max}$ that is consistent with this data. Report your
minimum value of $\rho^{\max}$, and give a corresponding covariance matrix $\Sigma$ that achieves this value.
You can report the minimum value of $\rho^{\max}$ to an accuracy of 0.01.

Solution.

(a) Using the formula for the variance of a linear function of a random variable, we have that

\[
\operatorname{var}(a_i^T Z) = a_i^T \operatorname{var}(Z) a_i = a_i^T \Sigma a_i,
\]

which is a linear function of $\Sigma$. We can find the minimum value of the maximum correlation
among components of $Z$ that is consistent with the data by solving the following optimization
problem.

\[
\begin{array}{ll}
\mbox{minimize} & \max_{i \neq j} |\rho_{ij}| \\
\mbox{subject to} & a_i^T \Sigma a_i = \sigma_i^2, \quad i = 1, \ldots, m \\
& \Sigma \succeq 0.
\end{array}
\]

Observe that

\[
|\rho_{ij}| = \frac{|\Sigma_{ij}|}{\sqrt{\Sigma_{ii} \Sigma_{jj}}}
\]

is a quasiconvex function of $\Sigma$: the numerator is a nonnegative convex function of $\Sigma$, and the
denominator is a positive concave function of $\Sigma$ (it is the composition of the geometric mean
and a linear function of $\Sigma$). Since the maximum of quasiconvex functions is quasiconvex, the
objective in the optimization problem above is quasiconvex. Thus, we can find the smallest
value of $\rho^{\max}$ by solving a quasiconvex optimization problem. In particular, we have that
$\rho^{\max} \leq t$ is consistent with the data if and only if the following convex feasibility problem is
feasible:

\[
\begin{array}{l}
|\Sigma_{ij}| \leq t \sqrt{\Sigma_{ii} \Sigma_{jj}}, \quad i \neq j \\
a_i^T \Sigma a_i = \sigma_i^2, \quad i = 1, \ldots, m \\
\Sigma \succeq 0.
\end{array}
\]

We can find the minimum value of $t$ for which this problem is feasible using bisection search,
starting with the lower bound $t = 0$ and the upper bound $t = 1$.
(b) The following Matlab code solves the problem.
clear all; close all; clc
correlation_bounds_data;

% find the minimum possible maximum correlation
lb = 0;
ub = 1;
Sigma_opt = nan(n,n);
while ub-lb > 1e-3
    t = (lb+ub)/2;
    cvx_begin sdp quiet
        variable Sigma(n,n) symmetric
        for i = 1:(n-1)
            for j = (i+1):n
                abs(Sigma(i,j)) <= t * geo_mean([Sigma(i,i), Sigma(j,j)])
            end
        end
        for i = 1:m
            A(:,i)' * Sigma * A(:,i) == sigma(i)^2
        end
        Sigma >= 0
    cvx_end
    if strcmp(cvx_status, 'Solved')
        ub = t;
        Sigma_opt = Sigma;
    else
        lb = t;
    end
end

% print the results and check the correlation matrix
t
Sigma = Sigma_opt
C = diag(1./sqrt(diag(Sigma)));
R = C * Sigma * C;
rho_max = max(max(abs(R - diag(diag(R)))))

The following Python code solves the problem.

import cvxpy as cvx
import numpy as np
from math import sqrt

from correlation_bounds_data import *

# written for the CVXPY 0.x API (cvx.Semidef, Parameter(sign=...))
Sigma = cvx.Semidef(n)
t = cvx.Parameter(sign='positive')
rho_cons = []
for i in range(n - 1):
    for j in range(i + 1, n):
        Sij = cvx.vstack(Sigma[i, i], Sigma[j, j])
        rho_cons += [cvx.abs(Sigma[i, j]) <= t * cvx.geo_mean(Sij)]
var_cons = [A[:, i].T * Sigma * A[:, i] == sigma[i]**2
            for i in range(m)]
problem = cvx.Problem(cvx.Minimize(0), rho_cons + var_cons)

lb, ub = 0.0, 1.0
Sigma_opt = None
while ub - lb > 1e-3:
    t.value = (lb + ub) / 2
    problem.solve()
    if problem.status == cvx.OPTIMAL:
        ub = t.value
        Sigma_opt = Sigma.value
    else:
        lb = t.value

print('rho_max =', t.value)
print('Sigma =', Sigma_opt)

# compute the correlation matrix
C = np.diag([1 / sqrt(Sigma_opt[i, i]) for i in range(n)])
R = C * Sigma_opt * C
print('R =', R)
The following Julia code solves the problem.
using Convex, SCS
set_default_solver(SCSSolver(verbose=false))

include("correlation_bounds_data.jl")

lb = 0; ub = 1
t = (lb+ub)/2
Sigma_opt = []
while ub-lb > 1e-3
    t = (lb+ub)/2
    Sigma = Semidefinite(n)
    problem = satisfy()
    for i = 1:(n-1)
        for j = (i+1):n
            problem.constraints +=
                (abs(Sigma[i,j]) <= t*geomean(Sigma[i,i],Sigma[j,j]))
        end
    end
    for i = 1:m
        problem.constraints += (A[:,i]' * Sigma * A[:,i] == sigma[i]^2)
    end

    solve!(problem)
    if problem.status == :Optimal
        ub = t
        Sigma_opt = Sigma.value
    else
        lb = t
    end
end
println("t = $(round(t,4))")
println("Sigma = $(round(Sigma_opt,4))")

# compute the correlation matrix
C = diagm(Float64[1/sqrt(Sigma_opt[i,i]) for i in 1:n])
R = C * Sigma_opt * C
println("R = $(round(R,4))")

We find that the minimum value of the maximum correlation that is consistent with the data
is $\rho^{\max} = 0.62$. A corresponding covariance matrix is

\[
\Sigma = \left[ \begin{array}{ccccc}
3.78 & 0.30 & 1.46 & 1.47 & 0.24 \\
0.30 & 2.91 & 1.28 & 1.29 & 0.86 \\
1.46 & 1.28 & 1.46 & 0.09 & 0.27 \\
1.47 & 1.29 & 0.09 & 1.48 & 0.49 \\
0.24 & 0.09 & 0.27 & 0.49 & 0.42
\end{array} \right].
\]

Although we did not ask you to do so, it is a good idea to compute the correlation matrix corresponding to
this value of $\Sigma$, and check that the maximum correlation is equal to $\rho^{\max}$:

\[
R = \left[ \begin{array}{ccccc}
1.00 & 0.09 & 0.62 & 0.62 & 0.19 \\
0.09 & 1.00 & 0.62 & 0.62 & 0.08 \\
0.62 & 0.62 & 1.00 & 0.06 & 0.34 \\
0.62 & 0.62 & 0.06 & 1.00 & 0.62 \\
0.19 & 0.08 & 0.34 & 0.62 & 1.00
\end{array} \right].
\]

7 Geometry
7.1 Efficiency of maximum volume inscribed ellipsoid. In this problem we prove the following
geometrical result. Suppose $C$ is a polyhedron in $\mathbf{R}^n$, symmetric about the origin, and described
as

\[
C = \{x \mid -1 \leq a_i^T x \leq 1, \; i = 1, \ldots, p\}.
\]

Let

\[
\mathcal{E} = \{x \mid x^T Q^{-1} x \leq 1\},
\]

with $Q \in \mathbf{S}^n_{++}$, be the maximum volume ellipsoid with center at the origin, inscribed in $C$. Then
the ellipsoid

\[
\sqrt{n} \, \mathcal{E} = \{x \mid x^T Q^{-1} x \leq n\}
\]

(i.e., the ellipsoid $\mathcal{E}$, scaled by a factor $\sqrt{n}$ about the origin) contains $C$.

(a) Show that the condition $\mathcal{E} \subseteq C$ is equivalent to $a_i^T Q a_i \leq 1$ for $i = 1, \ldots, p$.
(b) The volume of $\mathcal{E}$ is proportional to $(\det Q)^{1/2}$, so we can find the maximum volume ellipsoid
$\mathcal{E}$ inside $C$ by solving the convex problem

\[
\begin{array}{ll}
\mbox{minimize} & \log\det Q^{-1} \\
\mbox{subject to} & a_i^T Q a_i \leq 1, \quad i = 1, \ldots, p.
\end{array}
\qquad (31)
\]

The variable is the matrix $Q \in \mathbf{S}^n$ and the domain of the objective function is $\mathbf{S}^n_{++}$.
Derive the Lagrange dual of problem (31).
(c) Note that Slater's condition for (31) holds ($a_i^T Q a_i < 1$ for $Q = \epsilon I$ and $\epsilon > 0$ small enough),
so we have strong duality, and the KKT conditions are necessary and sufficient for optimality.
What are the KKT conditions for (31)?
Suppose $Q$ is optimal. Use the KKT conditions to show that

\[
x \in C \; \Longrightarrow \; x^T Q^{-1} x \leq n.
\]

In other words $C \subseteq \sqrt{n} \, \mathcal{E}$, which is the desired result.

Solution.

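(a) The ellipsoid $\mathcal{E} = \{Q^{1/2} y \mid \|y\|_2 \leq 1\}$ is contained in $C$ if and only if

\[
\|Q^{1/2} a_i\|_2 = \sup_{\|y\|_2 \leq 1} |a_i^T Q^{1/2} y| \leq 1, \quad i = 1, \ldots, p.
\]

(b) The dual function is

\[
\begin{array}{rcl}
g(\lambda) & = & \inf_{Q \succ 0} L(Q, \lambda) \\
& = & \inf_{Q \succ 0} \left( \log\det Q^{-1} + \sum_{i=1}^p \lambda_i (a_i^T Q a_i - 1) \right) \\
& = & \inf_{Q \succ 0} \left( \log\det Q^{-1} + \operatorname{tr}\left( \left( \sum_{i=1}^p \lambda_i a_i a_i^T \right) Q \right) - \sum_{i=1}^p \lambda_i \right).
\end{array}
\]

We now use the following fact:

\[
\inf_{X \succ 0} \left( \log\det X^{-1} + \operatorname{tr}(XY) \right) =
\left\{ \begin{array}{ll}
\log\det Y + n & Y \succ 0 \\
-\infty & \mbox{otherwise.}
\end{array} \right.
\]

The value for $Y \succ 0$ follows by setting the gradient of $\log\det X^{-1} + \operatorname{tr}(XY)$ to zero. This
gives $-X^{-1} + Y = 0$, so the minimizer is $X = Y^{-1}$ if $Y \succ 0$. If $Y \not\succ 0$, there exists a nonzero
$a$ with $a^T Y a \leq 0$. Choosing $X = I + t a a^T$ gives $\det X = 1 + t \|a\|_2^2$ and

\[
\log\det X^{-1} + \operatorname{tr}(XY) = -\log(1 + t a^T a) + \operatorname{tr} Y + t a^T Y a.
\]

If $a^T Y a \leq 0$ this goes to $-\infty$ as $t \to \infty$.
We conclude that the dual function is

\[
g(\lambda) =
\left\{ \begin{array}{ll}
\log\det \left( \sum_{i=1}^p \lambda_i a_i a_i^T \right) - \sum_{i=1}^p \lambda_i + n & \mbox{if } \sum_{i=1}^p \lambda_i a_i a_i^T \succ 0 \\
-\infty & \mbox{otherwise.}
\end{array} \right.
\]

The resulting dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & \log\det \left( \sum_{i=1}^p \lambda_i a_i a_i^T \right) - \sum_{i=1}^p \lambda_i + n \\
\mbox{subject to} & \lambda \succeq 0.
\end{array}
\]

(c) The KKT conditions are:

Primal feasibility: $Q \succ 0$ and $a_i^T Q a_i \leq 1$ for $i = 1, \ldots, p$.
Nonnegativity of dual multipliers: $\lambda \succeq 0$.
Complementary slackness: $\lambda_i (1 - a_i^T Q a_i) = 0$ for $i = 1, \ldots, p$.
Gradient of Lagrangian is zero:

\[
Q^{-1} = \sum_{i=1}^p \lambda_i a_i a_i^T. \qquad (32)
\]

The complementary slackness condition implies that $a_i^T Q a_i = 1$ if $\lambda_i > 0$.
Now suppose $Q$ and $\lambda$ are primal and dual optimal. If we take the inner product of the two
sides of the equation (32) with $Q$, we get

\[
n = \sum_{i=1}^p \lambda_i \operatorname{tr}(Q a_i a_i^T) = \sum_{i=1}^p \lambda_i a_i^T Q a_i = \sum_{i=1}^p \lambda_i.
\]

The last step follows from the complementary slackness conditions. Finally, we note, again
using (32), that

\[
x^T Q^{-1} x = \sum_{i=1}^p \lambda_i (a_i^T x)^2 \leq \sum_{i=1}^p \lambda_i = n
\]

if $x \in C$, i.e., if $|a_i^T x| \leq 1$ for $i = 1, \ldots, p$.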
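
A simple instance where everything can be checked by hand is the unit box in R^n, with a_i equal to the unit vectors. The Python sketch below is an illustration chosen here, not part of the original solution: it verifies that Q = I with all multipliers equal to one satisfies the KKT conditions listed above, and that the vertices of the box lie exactly on the boundary of the scaled ellipsoid, so the factor sqrt(n) cannot be improved for this example.

```python
import itertools
import numpy as np

n = 3
A = np.eye(n)                 # rows are a_i^T, so C = {x : |x_i| <= 1} is the unit box
Q = np.eye(n)                 # candidate maximum volume inscribed ellipsoid (unit ball)
lam = np.ones(n)              # candidate dual multipliers

# primal feasibility and complementary slackness: a_i^T Q a_i = 1 for all i
quad_a = np.einsum('ij,jk,ik->i', A, Q, A)
# stationarity (32): Q^{-1} = sum_i lam_i a_i a_i^T
stationarity = np.linalg.inv(Q) - (A.T * lam) @ A

# evaluate x^T Q^{-1} x at the 2^n vertices of the box
vertices = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
quad_x = np.einsum('ij,jk,ik->i', vertices, np.linalg.inv(Q), vertices)
print(quad_a, lam.sum(), quad_x.max())
```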

7.2 Euclidean distance matrices. A matrix $X \in \mathbf{S}^n$ is a Euclidean distance matrix if its elements $x_{ij}$
can be expressed as

\[
x_{ij} = \|p_i - p_j\|_2^2, \quad i, j = 1, \ldots, n,
\]

for some vectors $p_1, \ldots, p_n$ (of arbitrary dimension). In this exercise we prove several classical
characterizations of Euclidean distance matrices, derived by I. Schoenberg in the 1930s.

(a) Show that $X$ is a Euclidean distance matrix if and only if

\[
X = \operatorname{diag}(Y) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y)^T - 2Y \qquad (33)
\]

for some matrix $Y \in \mathbf{S}^n_+$ (the symmetric positive semidefinite matrices of order $n$). Here,
$\operatorname{diag}(Y)$ is the $n$-vector formed from the diagonal elements of $Y$, and $\mathbf{1}$ is the $n$-vector with
all its elements equal to one. The equality (33) is therefore equivalent to

\[
x_{ij} = y_{ii} + y_{jj} - 2y_{ij}, \quad i, j = 1, \ldots, n.
\]

Hint. $Y$ is the Gram matrix associated with the vectors $p_1, \ldots, p_n$, i.e., the matrix with
elements $y_{ij} = p_i^T p_j$.
(b) Show that the set of Euclidean distance matrices is a convex cone.
(c) Show that $X$ is a Euclidean distance matrix if and only if

\[
\operatorname{diag}(X) = 0, \qquad X_{22} - X_{21} \mathbf{1}^T - \mathbf{1} X_{21}^T \preceq 0. \qquad (34)
\]

The subscripts refer to the partitioning

\[
X = \left[ \begin{array}{cc} x_{11} & X_{21}^T \\ X_{21} & X_{22} \end{array} \right]
\]

with $X_{21} \in \mathbf{R}^{n-1}$, and $X_{22} \in \mathbf{S}^{n-1}$.
Hint. The definition of Euclidean distance matrix involves only the distances $\|p_i - p_j\|_2$, so
the origin can be chosen arbitrarily. For example, it can be assumed without loss of generality
that $p_1 = 0$. With this assumption there is a unique Gram matrix $Y$ for a given Euclidean
distance matrix $X$. Find $Y$ from (33), and relate it to the lefthand side of the inequality (34).
(d) Show that $X$ is a Euclidean distance matrix if and only if

\[
\operatorname{diag}(X) = 0, \qquad (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) X (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) \preceq 0. \qquad (35)
\]

Hint. Use the same argument as in part (c), but take the mean of the vectors $p_k$ at the origin,
i.e., impose the condition that $p_1 + p_2 + \cdots + p_n = 0$.
(e) Suppose $X$ is a Euclidean distance matrix. Show that the matrix $W \in \mathbf{S}^n$ with elements

\[
w_{ij} = e^{-x_{ij}}, \quad i, j = 1, \ldots, n,
\]

is positive semidefinite.
Hint. Use the following identity from probability theory. Define $z \sim \mathcal{N}(0, I)$. Then

\[
\mathbf{E} \, e^{i z^T x} = e^{-\frac{1}{2} \|x\|_2^2}
\]

for all $x$, where $i = \sqrt{-1}$ and $\mathbf{E}$ denotes expectation with respect to $z$. (This is the
characteristic function of a multivariate normal distribution.)

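Solution.

(a) Suppose $X$ is a Euclidean distance matrix with $x_{ij} = \|p_i - p_j\|_2^2$, and let $Y$ be the corresponding
Gram matrix. Then

\[
\begin{array}{rcl}
x_{ij} & = & \|p_i - p_j\|_2^2 \\
& = & \|p_i\|_2^2 + \|p_j\|_2^2 - 2 p_i^T p_j \\
& = & y_{ii} + y_{jj} - 2 y_{ij}.
\end{array}
\]

Moreover $Y \succeq 0$, because all Gram matrices are positive semidefinite. (To see this, write $Y$
as $Y = P^T P$, where $P$ is the matrix with the vectors $p_i$ as its columns. Then for all $u$,

\[
u^T Y u = u^T P^T P u = \|Pu\|_2^2 \geq 0.
\]

Hence, $Y \succeq 0$.)
Conversely, suppose $x_{ij} = y_{ii} + y_{jj} - 2y_{ij}$ for some $Y \succeq 0$. By factoring $Y$ as $Y = P^T P$ (for
example, using the eigenvalue decomposition), we obtain a set of points $p_i$ (the columns of $P$)
that satisfy $x_{ij} = \|p_i - p_j\|_2^2$. This proves that $X$ is a Euclidean distance matrix.
(b) The result in part (a) characterizes the set of Euclidean distance matrices as the image of the
positive semidefinite cone under a linear mapping

\[
Y \mapsto \operatorname{diag}(Y) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y)^T - 2Y.
\]

Since the set of positive semidefinite matrices is a convex cone, its image under a linear
transformation is also a convex cone.
(c) Suppose $X$ is a Euclidean distance matrix. Then clearly, $\operatorname{diag}(X) = 0$. To prove that

\[
X_{22} - X_{21} \mathbf{1}^T - \mathbf{1} X_{21}^T \preceq 0
\]

we follow the hint and derive the Gram matrix $Y$ corresponding to a configuration $p_1, \ldots, p_n$
with $p_1 = 0$. Since $p_1 = 0$, the first column and the first row of $Y$ are zero, i.e.,

\[
Y = \left[ \begin{array}{cc} 0 & 0 \\ 0 & Y_{22} \end{array} \right],
\]

if we use the same partitioning as for $X$. ($Y_{22}$ is a matrix of order $n - 1$.) The equation

\[
X = \operatorname{diag}(Y) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y)^T - 2Y
\]

then gives the conditions

\[
X_{21} = \operatorname{diag}(Y_{22}), \qquad X_{22} = \operatorname{diag}(Y_{22}) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y_{22})^T - 2Y_{22}.
\]

Solving for $Y_{22}$ gives

\[
Y_{22} = \frac{1}{2} \left( X_{21} \mathbf{1}^T + \mathbf{1} X_{21}^T - X_{22} \right),
\]

and this matrix is positive semidefinite because the Gram matrix $Y$ is positive semidefinite.
This proves that $X_{22} - X_{21} \mathbf{1}^T - \mathbf{1} X_{21}^T \preceq 0$ if $X$ is a Euclidean distance matrix.
Conversely, suppose $X$ satisfies

\[
\operatorname{diag}(X) = 0, \qquad X_{22} - X_{21} \mathbf{1}^T - \mathbf{1} X_{21}^T \preceq 0.
\]

Define

\[
Y = \left[ \begin{array}{cc} 0 & 0 \\ 0 & (1/2)(X_{21} \mathbf{1}^T + \mathbf{1} X_{21}^T - X_{22}) \end{array} \right].
\]

Then $Y \succeq 0$ and

\[
\begin{array}{rcl}
\operatorname{diag}(Y) & = & \left[ \begin{array}{c} 0 \\ X_{21} \end{array} \right] \\[1ex]
\operatorname{diag}(Y) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y)^T - 2Y
& = & \left[ \begin{array}{cc} 0 & X_{21}^T \\ X_{21} & X_{21} \mathbf{1}^T + \mathbf{1} X_{21}^T - 2Y_{22} \end{array} \right] \\[1ex]
& = & \left[ \begin{array}{cc} 0 & X_{21}^T \\ X_{21} & X_{22} \end{array} \right] \\[1ex]
& = & X.
\end{array}
\]

From the result in part (a) this implies that $X$ is a Euclidean distance matrix.
(d) Suppose $X$ is a Euclidean distance matrix. Setting $\sum_k p_k = 0$ is equivalent to imposing
$Y \mathbf{1} = 0$. Under this condition we can solve

\[
X = \operatorname{diag}(Y) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y)^T - 2Y
\]

for $Y$. Define $y = \operatorname{diag}(Y)$. Then

\[
X \mathbf{1} = \left( y \mathbf{1}^T + \mathbf{1} y^T - 2Y \right) \mathbf{1} = ny + (\mathbf{1}^T y) \mathbf{1}
\]

and $\mathbf{1}^T X \mathbf{1} = 2n \mathbf{1}^T y$. Solving for $\mathbf{1}^T y$ and $y$ gives $\mathbf{1}^T y = (\mathbf{1}^T X \mathbf{1})/(2n)$ and

\[
y = \frac{1}{n} \left( X \mathbf{1} - (\mathbf{1}^T y) \mathbf{1} \right) = \frac{1}{n} X \mathbf{1} - \frac{1}{2n^2} (\mathbf{1}^T X \mathbf{1}) \mathbf{1}.
\]

This gives

\[
\begin{array}{rcl}
Y & = & -\frac{1}{2} \left( X - y \mathbf{1}^T - \mathbf{1} y^T \right) \\[1ex]
& = & -\frac{1}{2} \left( X - \frac{1}{n} X \mathbf{1}\mathbf{1}^T - \frac{1}{n} \mathbf{1}\mathbf{1}^T X + \frac{1}{n^2} (\mathbf{1}^T X \mathbf{1}) \mathbf{1}\mathbf{1}^T \right) \\[1ex]
& = & -\frac{1}{2} (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) X (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T).
\end{array}
\]

The result now follows from $Y \succeq 0$.
Conversely, suppose $X$ satisfies

\[
\operatorname{diag}(X) = 0, \qquad (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) X (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) \preceq 0.
\]

Define

\[
\begin{array}{rcl}
Y & = & -\frac{1}{2} (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) X (I - \frac{1}{n} \mathbf{1}\mathbf{1}^T) \\[1ex]
& = & -\frac{1}{2} \left( X - \frac{1}{n} X \mathbf{1}\mathbf{1}^T - \frac{1}{n} \mathbf{1}\mathbf{1}^T X + \frac{1}{n^2} (\mathbf{1}^T X \mathbf{1}) \mathbf{1}\mathbf{1}^T \right).
\end{array}
\]

Then $Y \succeq 0$, and

\[
\begin{array}{rcl}
\operatorname{diag}(Y) & = & \frac{1}{n} \left( X \mathbf{1} - \frac{1}{2n} (\mathbf{1}^T X \mathbf{1}) \mathbf{1} \right) \\[1ex]
\operatorname{diag}(Y) \mathbf{1}^T + \mathbf{1} \operatorname{diag}(Y)^T - 2Y
& = & \frac{1}{n} (X \mathbf{1}) \mathbf{1}^T + \frac{1}{n} \mathbf{1} (X \mathbf{1})^T - \frac{1}{n^2} (\mathbf{1}^T X \mathbf{1}) \mathbf{1}\mathbf{1}^T \\[1ex]
& & {} + X - \frac{1}{n} (X \mathbf{1}) \mathbf{1}^T - \frac{1}{n} \mathbf{1} (X \mathbf{1})^T + \frac{1}{n^2} (\mathbf{1}^T X \mathbf{1}) \mathbf{1}\mathbf{1}^T \\[1ex]
& = & X.
\end{array}
\]

Therefore $X$ is a Euclidean distance matrix.
(e) To simplify the notation, we show that the matrix $W$ with elements $w_{ik} = \exp(-x_{ik}/2)$ is
positive semidefinite. Since the Euclidean distance matrices form a cone, this is equivalent to
the result in the problem statement.
Following the hint we can write (with $j = \sqrt{-1}$)

\[
w_{ik} = e^{-\frac{1}{2} \|p_i - p_k\|_2^2} = \mathbf{E} \, e^{j z^T (p_i - p_k)},
\]

or, in matrix notation,

\[
W = \mathbf{E} \left[ \begin{array}{c} e^{j z^T p_1} \\ e^{j z^T p_2} \\ \vdots \\ e^{j z^T p_n} \end{array} \right]
\left[ \begin{array}{c} e^{j z^T p_1} \\ e^{j z^T p_2} \\ \vdots \\ e^{j z^T p_n} \end{array} \right]^H,
\]

where the superscript $H$ denotes complex conjugate transpose. This shows that the matrix $W$
is positive semidefinite, since for all $u$,

\[
\begin{array}{rcl}
u^T W u & = & \mathbf{E} \left( \left[ \begin{array}{c} u_1 \\ u_2 \\ \vdots \\ u_n \end{array} \right]^T
\left[ \begin{array}{c} e^{j z^T p_1} \\ e^{j z^T p_2} \\ \vdots \\ e^{j z^T p_n} \end{array} \right]
\left[ \begin{array}{c} e^{j z^T p_1} \\ e^{j z^T p_2} \\ \vdots \\ e^{j z^T p_n} \end{array} \right]^H
\left[ \begin{array}{c} u_1 \\ u_2 \\ \vdots \\ u_n \end{array} \right] \right) \\[1ex]
& = & \mathbf{E} \left| \sum_k u_k e^{j z^T p_k} \right|^2 \\[1ex]
& \geq & 0.
\end{array}
\]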
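
The characterizations in parts (a), (d), and (e) are easy to test numerically on a random point configuration. The Python sketch below is only an illustration with randomly generated points; the variable names are chosen here. It builds X from the Gram matrix via (33), compares it with directly computed squared distances, and checks the semidefiniteness claims of (35) and part (e).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 3
P = rng.standard_normal((d, n))               # columns are the points p_1, ..., p_n

Y = P.T @ P                                   # Gram matrix
y = np.diag(Y)
X = y[:, None] + y[None, :] - 2 * Y           # equation (33)

# squared distances computed directly from the points
direct = ((P[:, :, None] - P[:, None, :]) ** 2).sum(axis=0)

J = np.eye(n) - np.ones((n, n)) / n           # centering projector I - (1/n) 1 1^T
G = -0.5 * J @ X @ J                          # Gram matrix of the centered points, as in (d)
P_centered = P - P.mean(axis=1, keepdims=True)

W = np.exp(-X)                                # matrix from part (e)
print(np.linalg.eigvalsh(G).min(), np.linalg.eigvalsh(W).min())
```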

7.3 Minimum total covering ball volume. We consider a collection of $n$ points with locations $x_1, \ldots, x_n \in \mathbf{R}^k$.
We are also given a set of $m$ groups or subsets of these points, $G_1, \ldots, G_m \subseteq \{1, \ldots, n\}$. For
each group, let $V_i$ be the volume of the smallest Euclidean ball that contains the points in group
$G_i$. (The volume of a Euclidean ball of radius $r$ in $\mathbf{R}^k$ is $a_k r^k$, where $a_k$ is a known constant that
is positive but otherwise irrelevant here.) We let $V = V_1 + \cdots + V_m$ be the total volume of these
minimal covering balls.
The points $x_{k+1}, \ldots, x_n$ are fixed (i.e., they are problem data). The variables to be chosen are
$x_1, \ldots, x_k$. Formulate the problem of choosing $x_1, \ldots, x_k$, in order to minimize the total minimal
covering ball volume $V$, as a convex optimization problem. Be sure to explain any new variables
you introduce, and to justify the convexity of your objective and inequality constraint functions.
Solution. We introduce two new variables for each group, $z_i \in \mathbf{R}^k$ (the center of the covering
ball), and $r_i \in \mathbf{R}$ (the radius). The Euclidean ball with center $z_i$ and radius $r_i$ covers the group
$G_i$ of points if and only if
\[
\|z_i - x_j\|_2 \leq r_i \quad \mbox{for all } j \in G_i.
\]
Therefore our problem can be stated as
\[
\begin{array}{ll}
\mbox{minimize} & a_k \sum_{i=1}^m r_i^k \\
\mbox{subject to} & \|z_i - x_j\|_2 \leq r_i \quad \mbox{for } j \in G_i, \quad i = 1, \ldots, m.
\end{array}
\]
The variables here are $z_1, \ldots, z_m$, $r_1, \ldots, r_m$, and $x_1, \ldots, x_k$. The problem data are $x_{k+1}, \ldots, x_n$,
and the groups $G_1, \ldots, G_m$. The constraints are clearly convex, since $\|z_i - x_j\|_2$ is convex in the
variables, and the righthand side, $r_i$, is linear. The objective is convex, since the constraints imply $r_i \geq 0$. (For $k$ odd,
the objective is not convex over all of $\mathbf{R}^m$, but it is convex on $\mathbf{R}^m_+$.)

7.4 Maximum-margin multiclass classification. In an $m$-category pattern classification problem, we are
given $m$ sets $C_i \subseteq \mathbf{R}^n$. Set $C_i$ contains $N_i$ examples of feature vectors in class $i$. The learning
problem is to find a decision function $f : \mathbf{R}^n \to \{1, 2, \ldots, m\}$ that maps each training example to
its class, and also generalizes reliably to feature vectors that are not included in the training sets
$C_i$.

(a) A common type of decision function for two-way classification is
\[
f(x) = \left\{ \begin{array}{ll}
1 & \mbox{if } a^T x + b > 0 \\
2 & \mbox{if } a^T x + b < 0.
\end{array} \right.
\]
In the simplest form, finding $f$ is equivalent to solving a feasibility problem: find $a$ and $b$ such
that
\[
\begin{array}{ll}
a^T x + b > 0 & \mbox{if } x \in C_1 \\
a^T x + b < 0 & \mbox{if } x \in C_2.
\end{array}
\]
Since these strict inequalities are homogeneous in $a$ and $b$, they are feasible if and only if the
nonstrict inequalities
\[
\begin{array}{ll}
a^T x + b \geq 1 & \mbox{if } x \in C_1 \\
a^T x + b \leq -1 & \mbox{if } x \in C_2
\end{array}
\]
are feasible. This is a feasibility problem with $N_1 + N_2$ linear inequalities in $n + 1$ variables $a$,
$b$.
As an extension that improves the robustness (i.e., generalization capability) of the classifier,
we can impose the condition that the decision function $f$ classifies all points in a neighborhood
of $C_1$ and $C_2$ correctly, and we can maximize the size of the neighborhood. This problem can
be expressed as
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & a^T x + b > 0 \quad \mbox{if } \mathop{\bf dist}(x, C_1) \leq t, \\
 & a^T x + b < 0 \quad \mbox{if } \mathop{\bf dist}(x, C_2) \leq t,
\end{array}
\]
where $\mathop{\bf dist}(x, C) = \min_{y \in C} \|x - y\|_2$.
This is illustrated in the figure. The centers of the shaded disks form the set $C_1$. The centers
of the other disks form the set $C_2$. The set of points at a distance less than $t$ from $C_i$ is the
union of disks with radius $t$ and center in $C_i$. The hyperplane in the figure separates the two
expanded sets. We are interested in expanding the circles as much as possible, until the two
expanded sets are no longer separable by a hyperplane.

[Figure: disks centered at the points of $C_1$ (shaded) and $C_2$, with a separating hyperplane.]

Since the constraints are homogeneous in $a$, $b$, we can again replace them with nonstrict
inequalities
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & a^T x + b \geq 1 \quad \mbox{if } \mathop{\bf dist}(x, C_1) \leq t, \\
 & a^T x + b \leq -1 \quad \mbox{if } \mathop{\bf dist}(x, C_2) \leq t.
\end{array}
\tag{36}
\]
The variables are $a$, $b$, and $t$.
(b) Next we consider an extension to more than two classes. If $m > 2$ we can use a decision
function
\[
f(x) = \mathop{\rm argmax}_{i=1,\ldots,m} \, (a_i^T x + b_i),
\]
parameterized by $m$ vectors $a_i \in \mathbf{R}^n$ and $m$ scalars $b_i$. To find $f$, we can solve a feasibility
problem: find $a_i$, $b_i$, such that
\[
a_i^T x + b_i > \max_{j \neq i} \, (a_j^T x + b_j) \quad \mbox{if } x \in C_i, \quad i = 1, \ldots, m,
\]
or, equivalently,
\[
a_i^T x + b_i \geq 1 + \max_{j \neq i} \, (a_j^T x + b_j) \quad \mbox{if } x \in C_i, \quad i = 1, \ldots, m.
\]
Similarly as in part (a), we consider a robust version of this problem:
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & a_i^T x + b_i \geq 1 + \max_{j \neq i} (a_j^T x + b_j) \quad \mbox{if } \mathop{\bf dist}(x, C_i) \leq t, \\
 & i = 1, \ldots, m.
\end{array}
\tag{37}
\]
The variables in the problem are $a_i \in \mathbf{R}^n$, $b_i \in \mathbf{R}$, $i = 1, \ldots, m$, and $t$.

Formulate the optimization problems (36) and (37) as SOCPs (if possible), or as quasiconvex
optimization problems involving SOCP feasibility problems (otherwise).
Solution.

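(a) The constraint
\[
a^T x + b \geq 1 \quad \mbox{if } \mathop{\bf dist}(x, C_1) \leq t
\]
can be written as
\[
\inf_{\|u\|_2 \leq t} a^T (x + u) + b \geq 1 \quad \mbox{if } x \in C_1.
\]
By the Cauchy-Schwarz inequality the infimum is attained at $u = -t a/\|a\|_2$, so this is equivalent to
\[
a^T x - t\|a\|_2 + b \geq 1 \quad \mbox{if } x \in C_1.
\]
Similarly, the constraint
\[
a^T x + b \leq -1 \quad \mbox{if } \mathop{\bf dist}(x, C_2) \leq t
\]
is equivalent to
\[
a^T x + t\|a\|_2 + b \leq -1 \quad \mbox{if } x \in C_2.
\]
The problem is therefore equivalent to
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & a^T x + b \geq 1 + t\|a\|_2 \quad \mbox{if } x \in C_1 \\
 & a^T x + b \leq -(1 + t\|a\|_2) \quad \mbox{if } x \in C_2.
\end{array}
\]
This is a quasiconvex optimization problem. It can be solved by bisection on $t$, via a sequence
of SOCP feasibility problems.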
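A better method is to make the problem convex by a change of variables
\[
\tilde a = \frac{1}{1 + t\|a\|_2}\, a, \qquad \tilde b = \frac{1}{1 + t\|a\|_2}\, b.
\]
We have to add the constraint $\|\tilde a\|_2 \leq 1/t$ to ensure that the change of variables is invertible.
This gives
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & \tilde a^T x + \tilde b \geq 1, \quad x \in C_1, \\
 & \tilde a^T x + \tilde b \leq -1, \quad x \in C_2, \\
 & \|\tilde a\|_2 \leq 1/t,
\end{array}
\]
which is still not convex. However it is equivalent to
\[
\begin{array}{ll}
\mbox{minimize} & \|\tilde a\|_2 \\
\mbox{subject to} & \tilde a^T x + \tilde b \geq 1, \quad x \in C_1, \\
 & \tilde a^T x + \tilde b \leq -1, \quad x \in C_2.
\end{array}
\]
The optimal $t$ is $1/\|\tilde a\|_2$. If we square the objective of this problem it is a QP.
This last problem can be interpreted as follows. By minimizing $\|\tilde a\|_2$ we maximize the distance
between the hyperplanes $\tilde a^T z + \tilde b = 1$ and $\tilde a^T z + \tilde b = -1$. (Recall from homework 1 that the
distance is $2/\|\tilde a\|_2$.)
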
(b) We first write the problem as
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & a_i^T x + b_i \geq 1 + a_j^T x + b_j \quad \mbox{if } \mathop{\bf dist}(x, C_i) \leq t, \quad i, j = 1, \ldots, m, \quad j \neq i.
\end{array}
\]
Using similar arguments as in part (a), we can write this as a quasiconvex optimization problem
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & (a_i - a_j)^T x + b_i - b_j \geq 1 + t\|a_i - a_j\|_2 \\
 & \mbox{if } x \in C_i, \quad i, j = 1, \ldots, m, \quad j \neq i.
\end{array}
\]
This problem can be solved by bisection in $t$. Each bisection step reduces to an SOCP
feasibility problem with variables $a_i$, $b_i$, $i = 1, \ldots, m$.
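
The bisection loop itself is independent of how feasibility is tested. The following Python sketch illustrates it; the function `bisect_max` and the one-dimensional toy oracle are hypothetical stand-ins for a call to an SOCP solver, and the sketch assumes that feasibility is monotone in $t$ (if $t$ is feasible, so is every smaller $t$).

```python
def bisect_max(feasible, lo, hi, tol=1e-6):
    """Approximate the largest t in [lo, hi] for which feasible(t) is True.

    Assumes feasible(lo) is True, feasible(hi) is False, and that
    feasible(t) implies feasible(s) for all s <= t.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid      # mid is achievable, search above it
        else:
            hi = mid      # mid is not achievable, search below it
    return lo

# Toy oracle in one dimension (illustration only): the sets C1 and C2,
# each expanded by t, can be separated by a point iff
# min(C1) - t >= max(C2) + t.
C1 = [1.0, 2.5]
C2 = [-2.0, -0.5]
t_star = bisect_max(lambda t: min(C1) - t >= max(C2) + t, 0.0, 10.0)
```

For this toy data the largest feasible margin is $(1.0 - (-0.5))/2 = 0.75$, and `t_star` agrees with it to within the tolerance. In the actual problem, the lambda would be replaced by a routine that solves the SOCP feasibility problem for the given $t$.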

7.5 Three-way linear classification. We are given data
\[
x^{(1)}, \ldots, x^{(N)}, \quad y^{(1)}, \ldots, y^{(M)}, \quad z^{(1)}, \ldots, z^{(P)},
\]
three nonempty sets of vectors in $\mathbf{R}^n$. We wish to find three affine functions on $\mathbf{R}^n$,
\[
f_i(z) = a_i^T z - b_i, \quad i = 1, 2, 3,
\]
that satisfy the following properties:
\[
\begin{array}{ll}
f_1(x^{(j)}) > \max\{f_2(x^{(j)}), f_3(x^{(j)})\}, & j = 1, \ldots, N, \\
f_2(y^{(j)}) > \max\{f_1(y^{(j)}), f_3(y^{(j)})\}, & j = 1, \ldots, M, \\
f_3(z^{(j)}) > \max\{f_1(z^{(j)}), f_2(z^{(j)})\}, & j = 1, \ldots, P.
\end{array}
\]
In words: f1 is the largest of the three functions on the x data points, f2 is the largest of the three
functions on the y data points, f3 is the largest of the three functions on the z data points. We
can give a simple geometric interpretation: The functions f1 , f2 , and f3 partition Rn into three
regions,

\[
\begin{array}{l}
R_1 = \{z \mid f_1(z) > \max\{f_2(z), f_3(z)\}\}, \\
R_2 = \{z \mid f_2(z) > \max\{f_1(z), f_3(z)\}\}, \\
R_3 = \{z \mid f_3(z) > \max\{f_1(z), f_2(z)\}\},
\end{array}
\]
defined by where each function is the largest of the three. Our goal is to find functions with
$x^{(j)} \in R_1$, $y^{(j)} \in R_2$, and $z^{(j)} \in R_3$.
Pose this as a convex optimization problem. You may not use strict inequalities in your formulation.
Solve the specific instance of the 3-way separation problem given in sep3way_data.m, with the
columns of the matrices X, Y and Z giving the $x^{(j)}$, $j = 1, \ldots, N$, $y^{(j)}$, $j = 1, \ldots, M$, and $z^{(j)}$,
$j = 1, \ldots, P$. To save you the trouble of plotting data points and separation boundaries, we have
included the plotting code in sep3way_data.m. (Note that a1, a2, a3, b1 and b2 contain arbitrary
numbers; you should compute the correct values using CVX.)
Solution. The inequalities
\[
\begin{array}{ll}
f_1(x^{(j)}) > \max\{f_2(x^{(j)}), f_3(x^{(j)})\}, & j = 1, \ldots, N, \\
f_2(y^{(j)}) > \max\{f_1(y^{(j)}), f_3(y^{(j)})\}, & j = 1, \ldots, M, \\
f_3(z^{(j)}) > \max\{f_1(z^{(j)}), f_2(z^{(j)})\}, & j = 1, \ldots, P
\end{array}
\]
are homogeneous in $a_i$ and $b_i$ so we can express them as
\[
\begin{array}{ll}
f_1(x^{(j)}) \geq \max\{f_2(x^{(j)}), f_3(x^{(j)})\} + 1, & j = 1, \ldots, N, \\
f_2(y^{(j)}) \geq \max\{f_1(y^{(j)}), f_3(y^{(j)})\} + 1, & j = 1, \ldots, M, \\
f_3(z^{(j)}) \geq \max\{f_1(z^{(j)}), f_2(z^{(j)})\} + 1, & j = 1, \ldots, P.
\end{array}
\]
Note that we can add any vector to each of the $a_i$ without affecting these inequalities (which
only refer to differences between the $a_i$), and we can add any number to each of the $b_i$ for the same
reason. We can use this observation to normalize or simplify the $a_i$ and $b_i$. For example, we can
assume without loss of generality that $a_1 + a_2 + a_3 = 0$ and $b_1 + b_2 + b_3 = 0$.
The following script implements this method for 3-way classification and tests it on a small separable
data set:

clear all; close all;


% data for problem instance
M = 20;
N = 20;
P = 20;

X = [
3.5674 4.1253 2.8535 5.1892 4.3273 3.8133 3.4117 ...
3.8636 5.0668 3.9044 4.2944 4.7143 3.3082 5.2540 ...
2.5590 3.6001 4.8156 5.2902 5.1908 3.9802 ;...
-2.9981 0.5178 2.1436 -0.0677 0.3144 1.3064 3.9297 ...
0.2051 0.1067 -1.4982 -2.4051 2.9224 1.5444 -2.8687 ...
1.0281 1.2420 1.2814 1.2035 -2.1644 -0.2821];

Y = [
-4.5665 -3.6904 -3.2881 -1.6491 -5.4731 -3.6170 -1.1876 ...
-1.0539 -1.3915 -2.0312 -1.9999 -0.2480 -1.3149 -0.8305 ...
-1.9355 -1.0898 -2.6040 -4.3602 -1.8105 0.3096; ...
2.4117 4.2642 2.8460 0.5250 1.9053 2.9831 4.7079 ...
0.9702 0.3854 1.9228 1.4914 -0.9984 3.4330 2.9246 ...
3.0833 1.5910 1.5266 1.6256 2.5037 1.4384];

Z = [
1.7451 2.6345 0.5937 -2.8217 3.0304 1.0917 -1.7793 ...
1.2422 2.1873 -2.3008 -3.3258 2.7617 0.9166 0.0601 ...
-2.6520 -3.3205 4.1229 -3.4085 -3.1594 -0.7311; ...
-3.2010 -4.9921 -3.7621 -4.7420 -4.1315 -3.9120 -4.5596 ...
-4.9499 -3.4310 -4.2656 -6.2023 -4.5186 -3.7659 -5.0039 ...
-4.3744 -5.0559 -3.9443 -4.0412 -5.3493 -3.0465];

cvx_begin
    variables a1(2) a2(2) a3(2) b1 b2 b3
    a1'*X-b1 >= max(a2'*X-b2,a3'*X-b3)+1;
    a2'*Y-b2 >= max(a1'*Y-b1,a3'*Y-b3)+1;
    a3'*Z-b3 >= max(a1'*Z-b1,a2'*Z-b2)+1;
    a1 + a2 + a3 == 0;
    b1 + b2 + b3 == 0;
cvx_end

% now let's plot the three-way separation induced by
% a1,a2,a3,b1,b2,b3
% find maximally confusing point
p = [(a1-a2)';(a1-a3)']\[(b1-b2);(b1-b3)];

% plot
t = [-7:0.01:7];
u1 = a1-a2; u2 = a2-a3; u3 = a3-a1;
v1 = b1-b2; v2 = b2-b3; v3 = b3-b1;
line1 = (-t*u1(1)+v1)/u1(2); idx1 = find(u2'*[t;line1]-v2>0);
line2 = (-t*u2(1)+v2)/u2(2); idx2 = find(u3'*[t;line2]-v3>0);
line3 = (-t*u3(1)+v3)/u3(2); idx3 = find(u1'*[t;line3]-v1>0);
plot(X(1,:),X(2,:),'*',Y(1,:),Y(2,:),'ro',Z(1,:),Z(2,:),'g+',...
     t(idx1),line1(idx1),'k',t(idx2),line2(idx2),'k',t(idx3),line3(idx3),'k');
axis([-7 7 -7 7]);

The following figure is generated.

[Figure: the three classes of data points and the three-way separation boundaries.]

7.6 Feature selection and sparse linear separation. Suppose $x^{(1)}, \ldots, x^{(N)}$ and $y^{(1)}, \ldots, y^{(M)}$ are two
given nonempty collections or classes of vectors in $\mathbf{R}^n$ that can be (strictly) separated by a hyperplane,
i.e., there exists $a \in \mathbf{R}^n$ and $b \in \mathbf{R}$ such that
\[
a^T x^{(i)} - b \geq 1, \quad i = 1, \ldots, N, \qquad a^T y^{(i)} - b \leq -1, \quad i = 1, \ldots, M.
\]
This means the two classes are (weakly) separated by the slab
\[
S = \{z \mid |a^T z - b| \leq 1\},
\]
which has thickness $2/\|a\|_2$. You can think of the components of $x^{(i)}$ and $y^{(i)}$ as features; $a$ and $b$
define an affine function that combines the features and allows us to distinguish the two classes.
To find the thickest slab that separates the two classes, we can solve the QP
\[
\begin{array}{ll}
\mbox{minimize} & \|a\|_2 \\
\mbox{subject to} & a^T x^{(i)} - b \geq 1, \quad i = 1, \ldots, N \\
 & a^T y^{(i)} - b \leq -1, \quad i = 1, \ldots, M,
\end{array}
\]
with variables $a \in \mathbf{R}^n$ and $b \in \mathbf{R}$. (This is equivalent to the problem given in (8.23), p424, §8.6.1;
see also exercise 8.23.)
In this problem we seek $(a, b)$ that separate the two classes with a thick slab, and also have $a$ sparse,
i.e., there are many $j$ with $a_j = 0$. Note that if $a_j = 0$, the affine function $a^T z - b$ does not depend
on $z_j$, i.e., the $j$th feature is not used to carry out classification. So a sparse $a$ corresponds to a
classification function that is parsimonious; it depends on just a few features. So our goal is to find
an affine classification function that gives a thick separating slab, and also uses as few features as
possible to carry out the classification.
This is in general a hard combinatorial (bi-criterion) optimization problem, so we use the standard
heuristic of solving
\[
\begin{array}{ll}
\mbox{minimize} & \|a\|_2 + \lambda \|a\|_1 \\
\mbox{subject to} & a^T x^{(i)} - b \geq 1, \quad i = 1, \ldots, N \\
 & a^T y^{(i)} - b \leq -1, \quad i = 1, \ldots, M,
\end{array}
\]
where $\lambda \geq 0$ is a weight that controls the trade-off between separating slab thickness and
(indirectly, through the $\ell_1$ norm) sparsity of $a$.
Get the data in sp_ln_sp_data.m, which gives $x^{(i)}$ and $y^{(i)}$ as the columns of matrices X and Y,
respectively. Find the thickness of the maximum thickness separating slab. Solve the problem above
for 100 or so values of $\lambda$ over an appropriate range (we recommend log spacing). For each value,
record the separation slab thickness $2/\|a\|_2$ and $\mathop{\bf card}(a)$, the cardinality of $a$ (i.e., the number
of nonzero entries). In computing the cardinality, you can count an entry $a_j$ of $a$ as zero if it
satisfies $|a_j| \leq 10^{-4}$. Plot these data with slab thickness on the vertical axis and cardinality on the
horizontal axis.
Use this data to choose a set of 10 features out of the 50 in the data. Give the indices of the features
you choose. You may have several choices of sets of features here; you can just choose one. Then
find the maximum thickness separating slab that uses only the chosen features. (This is standard
practice: once you've chosen the features you're going to use, you optimize again, using only those
features, and without the $\ell_1$ regularization.)
Solution. The script used to solve this problem is

cvx_quiet(true);
sp_ln_sp_data;

% thickest slab
cvx_begin
    variables a(n) b
    minimize ( norm(a) )
    a'*X - b >= 1
    a'*Y - b <= -1
cvx_end
w_thickest = 2./norm(a);
disp('The thickness of the maximum thickness separating slab is: ');
disp(w_thickest);

% generating the trade-off curve
lambdas = logspace(-2,5);
A = zeros(n,length(lambdas));

for i=1:length(lambdas)
    cvx_begin
        variables a(n) b
        minimize ( norm(a) + lambdas(i)*norm(a,1) )
        a'*X - b >= 1
        a'*Y - b <= -1
    cvx_end
    A(:,i) = a;
end
w = 2./norms(A); % width of the slab
card = sum((abs(A) > 1e-4));
plot(card,w)
hold on;
plot(card,w,'*')
xlabel('card(a)');
ylabel('w');
title('width of the slab versus cardinality of a');

% feature selection (fixing card(a) to 10)
indices = find(card == 10);
idx = indices(end);
w_before = w(idx);
a_selected = A(:,idx);
features = find(abs(a_selected) > 1e-4);
num_feat = length(features);
X_sub = X(features,:);
Y_sub = Y(features,:);
cvx_begin
    variables a(num_feat) b
    minimize ( norm(a) )
    a'*X_sub - b >= 1
    a'*Y_sub - b <= -1
cvx_end
w_after = 2/norm(a);
disp('Using only the following 10 features');
disp(features);
disp('the width of the thickest slab returned by the regularized');
disp('optimization problem was: ');
disp(w_before);
disp('after reoptimizing, the width of the thickest slab is: ');
disp(w_after)

The thickness of the maximum thickness separating slab is found to be 116.4244. The script also
generates the following trade-off curve

[Figure: trade-off curve of slab width versus cardinality of a.]

We find that, using only the features

1, 7, 8, 18, 19, 21, 23, 26, 27, 46,

the width of the thickest slab found from the regularized optimization problem is 77.0246. After
re-optimizing over this subset of variables, we find that the width of the thickest slab increases to
78.4697.

7.7 Thickest slab separating two sets. We are given two sets in $\mathbf{R}^n$: a polyhedron
\[
C_1 = \{x \mid Cx \preceq d\},
\]
defined by a matrix $C \in \mathbf{R}^{m \times n}$ and a vector $d \in \mathbf{R}^m$, and an ellipsoid
\[
C_2 = \{Pu + q \mid \|u\|_2 \leq 1\},
\]
defined by a matrix $P \in \mathbf{R}^{n \times n}$ and a vector $q \in \mathbf{R}^n$. We assume that the sets are nonempty and
that they do not intersect. We are interested in the optimization problem
\[
\begin{array}{ll}
\mbox{maximize} & \inf_{x \in C_1} a^T x - \sup_{x \in C_2} a^T x \\
\mbox{subject to} & \|a\|_2 = 1,
\end{array}
\]
with variable $a \in \mathbf{R}^n$.
Explain how you would solve this problem. You can answer the question by reducing the problem
to a standard problem class (LP, QP, SOCP, SDP, . . . ), or by describing an algorithm to solve it.
Remark. The geometrical interpretation is as follows. If we choose
\[
b = \frac{1}{2}\left(\inf_{x \in C_1} a^T x + \sup_{x \in C_2} a^T x\right),
\]
then the hyperplane $H = \{x \mid a^T x = b\}$ is the maximum margin hyperplane separating
$C_1$ and $C_2$. Alternatively, $a$ gives us the thickest slab that separates the two sets.
Solution.
\[
\begin{array}{ll}
\mbox{maximize} & -d^T z - \|P^T a\|_2 - q^T a \\
\mbox{subject to} & \|a\|_2 \leq 1 \\
 & C^T z + a = 0 \\
 & z \succeq 0.
\end{array}
\]
An SOCP.
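
The solution above is stated without derivation. One way to arrive at it (a sketch of a standard argument, not necessarily the authors' own) is to evaluate the two terms of the objective separately: the infimum over the polyhedron by LP duality, and the supremum over the ellipsoid by the Cauchy-Schwarz inequality.

```latex
\begin{align*}
\inf_{Cx \preceq d} a^T x
  &= \sup \{ -d^T z \mid C^T z + a = 0, \ z \succeq 0 \}
  && \text{(LP duality, using that $C_1$ is nonempty)}, \\
\sup_{\|u\|_2 \leq 1} a^T (Pu + q)
  &= \|P^T a\|_2 + q^T a
  && \text{(Cauchy--Schwarz)}.
\end{align*}
```

Substituting these expressions gives the objective $-d^T z - \|P^T a\|_2 - q^T a$ with the constraints $C^T z + a = 0$, $z \succeq 0$. The constraint $\|a\|_2 = 1$ can be relaxed to $\|a\|_2 \leq 1$ because the objective is positively homogeneous in $(a, z)$ and, under the additional assumption that the two disjoint sets are strictly separable, the optimal value is positive, so any optimal $a$ has unit norm.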
7.8 Bounding object position from multiple camera views. A small object is located at unknown position
$x \in \mathbf{R}^3$, and viewed by a set of $m$ cameras. Our goal is to find a box in $\mathbf{R}^3$,
\[
B = \{z \in \mathbf{R}^3 \mid l \preceq z \preceq u\},
\]
for which we can guarantee $x \in B$. We want the smallest possible such bounding box. (Although
it doesn't matter, we can use volume to judge 'smallest' among boxes.)
Now we describe the cameras. The object at location $x \in \mathbf{R}^3$ creates an image on the image plane
of camera $i$ at location
\[
v_i = \frac{1}{c_i^T x + d_i}(A_i x + b_i) \in \mathbf{R}^2.
\]
The matrices $A_i \in \mathbf{R}^{2 \times 3}$, vectors $b_i \in \mathbf{R}^2$ and $c_i \in \mathbf{R}^3$, and real numbers $d_i \in \mathbf{R}$ are known, and
depend on the camera positions and orientations. We assume that $c_i^T x + d_i > 0$. The $3 \times 4$ matrix
\[
P_i = \begin{bmatrix} A_i & b_i \\ c_i^T & d_i \end{bmatrix}
\]
is called the camera matrix (for camera $i$). It is often (but not always) the case that the first 3
columns of $P_i$ (i.e., $A_i$ stacked above $c_i^T$) form an orthogonal matrix, in which case the camera is
called orthographic.
We do not have direct access to the image point $v_i$; we only know the (square) pixel that it lies in.
In other words, the camera gives us a measurement $\hat v_i$ (the center of the pixel that the image point
lies in); we are guaranteed that
\[
\|v_i - \hat v_i\|_\infty \leq \rho_i/2,
\]
where $\rho_i$ is the pixel width (and height) of camera $i$. (We know nothing else about $v_i$; it could be
any point in this pixel.)
Given the data $A_i$, $b_i$, $c_i$, $d_i$, $\hat v_i$, $\rho_i$, we are to find the smallest box $B$ (i.e., find the vectors $l$ and
$u$) that is guaranteed to contain $x$. In other words, find the smallest box in $\mathbf{R}^3$ that contains all
points consistent with the observations from the camera.
(a) Explain how to solve this using convex or quasiconvex optimization. You must explain any
transformations you use, any new variables you introduce, etc. If the convexity or quasiconvexity
of any function in your formulation isn't obvious, be sure to justify it.

(b) Solve the specific problem instance given in the file camera_data.m. Be sure that your final
numerical answer (i.e., l and u) stands out.

Solution.

(a) We get a subset $\mathcal{P} \subseteq \mathbf{R}^3$ (which we'll soon see is a polyhedron) of locations $x$ that are consistent
with the camera measurements. To find the smallest box that covers any subset in $\mathbf{R}^3$, all we
need to do is maximize and minimize the (linear) functions $x_1$, $x_2$, and $x_3$ to get $l$ and $u$. Here
$\mathcal{P}$ is a polyhedron, so we'll end up solving 6 LPs, one to get each of $l_1$, $l_2$, $l_3$, $u_1$, $u_2$, and $u_3$.
Now let's look more closely at $\mathcal{P}$. Our measurements tell us that
\[
\hat v_i - (\rho_i/2)\mathbf{1} \preceq \frac{1}{c_i^T x + d_i}(A_i x + b_i) \preceq \hat v_i + (\rho_i/2)\mathbf{1}, \quad i = 1, \ldots, m.
\]
We multiply through by $c_i^T x + d_i$, which is positive, to get
\[
(\hat v_i - (\rho_i/2)\mathbf{1})(c_i^T x + d_i) \preceq A_i x + b_i \preceq (\hat v_i + (\rho_i/2)\mathbf{1})(c_i^T x + d_i), \quad i = 1, \ldots, m,
\]
a set of $2m$ linear vector inequalities on $x$. In particular, it defines a polyhedron.
To get $l_k$ we solve the LP
\[
\begin{array}{ll}
\mbox{minimize} & x_k \\
\mbox{subject to} & (\hat v_i - (\rho_i/2)\mathbf{1})(c_i^T x + d_i) \preceq A_i x + b_i, \quad i = 1, \ldots, m, \\
 & A_i x + b_i \preceq (\hat v_i + (\rho_i/2)\mathbf{1})(c_i^T x + d_i), \quad i = 1, \ldots, m,
\end{array}
\]
for $k = 1, 2, 3$; to get $u_k$ we maximize the same objective.
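
To make the linearity explicit, the pixel constraints of one camera can be rewritten as four scalar inequalities of the form $g^T x \leq h$. The Python sketch below does this rearrangement; the function name and the camera data are hypothetical and serve only as an illustration (they are not taken from camera_data.m).

```python
def pixel_inequalities(A, b, c, d, vhat, rho):
    """Return rows (g, h), each meaning g^T x <= h, for one camera.

    Encodes (vhat - rho/2)(c^T x + d) <= A x + b <= (vhat + rho/2)(c^T x + d),
    which is valid wherever c^T x + d > 0.
    """
    rows = []
    for j in range(2):
        lo, hi = vhat[j] - rho / 2, vhat[j] + rho / 2
        # lower bound: (lo*c - A_j)^T x <= b_j - lo*d
        rows.append(([lo * c[k] - A[j][k] for k in range(3)], b[j] - lo * d))
        # upper bound: (A_j - hi*c)^T x <= hi*d - b_j
        rows.append(([A[j][k] - hi * c[k] for k in range(3)], hi * d - b[j]))
    return rows

def satisfies(rows, x, eps=1e-12):
    """True if x satisfies every inequality g^T x <= h (up to eps)."""
    return all(sum(gk * xk for gk, xk in zip(g, x)) <= h + eps for g, h in rows)

# Hypothetical camera looking along the third coordinate axis.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [0.0, 0.0]
c = [0.0, 0.0, 1.0]
d = 0.0
rows = pixel_inequalities(A, b, c, d, vhat=[0.26, -0.11], rho=0.04)

inside = satisfies(rows, [0.5, -0.2, 2.0])    # image point (0.25, -0.10)
outside = satisfies(rows, [1.0, -0.2, 2.0])   # image point (0.50, -0.10)
```

The first point projects to within half a pixel of the measurement, so `inside` is true; the second does not, so `outside` is false. Stacking such rows over all cameras gives the constraint set of each of the six LPs.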


(b) Here is a script that solves the given instance:
% load the data
camera_data;
A1 = P1(1:2,1:3); b1 = P1(1:2,4); c1 = P1(3,1:3); d1 = P1(3,4);
A2 = P2(1:2,1:3); b2 = P2(1:2,4); c2 = P2(3,1:3); d2 = P2(3,4);
A3 = P3(1:2,1:3); b3 = P3(1:2,4); c3 = P3(3,1:3); d3 = P3(3,4);
A4 = P4(1:2,1:3); b4 = P4(1:2,4); c4 = P4(3,1:3); d4 = P4(3,4);

cvx_quiet(true);
for bounds = 1:6
cvx_begin
variable x(3)
switch bounds
case 1
minimize x(1)
case 2
maximize x(1)
case 3
minimize x(2)
case 4
maximize x(2)

case 5
minimize x(3)
case 6
maximize x(3)
end
% constraints for 1st camera
(vhat(:,1)-rho(1)/2)*(c1*x + d1) <= A1*x + b1;
A1*x + b1 <= (vhat(:,1)+rho(1)/2)*(c1*x + d1);
% constraints for 2nd camera
(vhat(:,2)-rho(2)/2)*(c2*x + d2) <= A2*x + b2;
A2*x + b2 <= (vhat(:,2)+rho(2)/2)*(c2*x + d2);
% constraints for 3rd camera
(vhat(:,3)-rho(3)/2)*(c3*x + d3) <= A3*x + b3;
A3*x + b3 <= (vhat(:,3)+rho(3)/2)*(c3*x + d3);
% constraints for 4th camera
(vhat(:,4)-rho(4)/2)*(c4*x + d4) <= A4*x + b4;
A4*x + b4 <= (vhat(:,4)+rho(4)/2)*(c4*x + d4);
cvx_end
val(bounds) = cvx_optval;
end
disp(['l1 = ' num2str(val(1))]);
disp(['l2 = ' num2str(val(3))]);
disp(['l3 = ' num2str(val(5))]);
disp(['u1 = ' num2str(val(2))]);
disp(['u2 = ' num2str(val(4))]);
disp(['u3 = ' num2str(val(6))]);

The script returns the following results:


l1 = -0.99561
l2 = 0.27531
l3 = -0.67899
u1 = -0.8245
u2 = 0.37837
u3 = -0.57352

7.9 Triangulation from multiple camera views. A projective camera can be described by a linear-fractional
function $f : \mathbf{R}^3 \to \mathbf{R}^2$,
\[
f(x) = \frac{1}{c^T x + d}(Ax + b), \qquad \mathop{\bf dom} f = \{x \mid c^T x + d > 0\},
\]
with
\[
\mathop{\bf rank}\left(\begin{bmatrix} A \\ c^T \end{bmatrix}\right) = 3.
\]
The domain of $f$ consists of the points in front of the camera.

Before stating the problem, we give some background and interpretation, most of which will not
be needed for the actual problem.

[Figure: geometry of the projective camera.]

The $3 \times 4$ matrix
\[
P = \begin{bmatrix} A & b \\ c^T & d \end{bmatrix}
\]
is called the camera matrix and has rank 3. Since $f$ is invariant with respect to a scaling of $P$, we
can normalize the parameters and assume, for example, that $\|c\|_2 = 1$. The denominator $c^T x + d$ is
then the distance of $x$ to the plane $\{z \mid c^T z + d = 0\}$. This plane is called the principal plane. The
point
\[
x_c = -\begin{bmatrix} A \\ c^T \end{bmatrix}^{-1} \begin{bmatrix} b \\ d \end{bmatrix}
\]
lies in the principal plane and is called the camera center. The ray $\{x_c + \theta c \mid \theta \geq 0\}$, which is
perpendicular to the principal plane, is the principal axis. We will define the image plane as the
plane parallel to the principal plane, at a unit distance from it along the principal axis.
The point $x'$ in the figure is the intersection of the image plane and the line through the camera
center and $x$, and is given by
\[
x' = x_c + \frac{1}{c^T (x - x_c)}(x - x_c).
\]
Using the definition of $x_c$ we can write $f(x)$ as
\[
f(x) = \frac{1}{c^T (x - x_c)} A(x - x_c) = A(x' - x_c) = Ax' + b.
\]
This shows that the mapping $f(x)$ can be interpreted as a projection of $x$ on the image plane to
get $x'$, followed by an affine transformation of $x'$. We can interpret $f(x)$ as the point $x'$ expressed
in some two-dimensional coordinate system attached to the image plane.

In this exercise we consider the problem of determining the position of a point $x \in \mathbf{R}^3$ from its
image in $N$ cameras. Each of the cameras is characterized by a known linear-fractional mapping
$f_k$ and camera matrix $P_k$:
\[
f_k(x) = \frac{1}{c_k^T x + d_k}(A_k x + b_k), \qquad
P_k = \begin{bmatrix} A_k & b_k \\ c_k^T & d_k \end{bmatrix}, \qquad k = 1, \ldots, N.
\]
The image of the point $x$ in camera $k$ is denoted $y^{(k)} \in \mathbf{R}^2$. Due to camera imperfections and
calibration errors, we do not expect the equations $f_k(x) = y^{(k)}$, $k = 1, \ldots, N$, to be exactly
solvable. To estimate the point $x$ we therefore minimize the maximum error in the $N$ equations by
solving
\[
\mbox{minimize} \quad g(x) = \max_{k=1,\ldots,N} \|f_k(x) - y^{(k)}\|_2. \tag{38}
\]
(a) Show that (38) is a quasiconvex optimization problem. The variable in the problem is $x \in \mathbf{R}^3$.
The functions $f_k$ (i.e., the parameters $A_k$, $b_k$, $c_k$, $d_k$) and the vectors $y^{(k)}$ are given.
(b) Solve the following instance of (38) using CVX (and bisection): $N = 4$,
\[
P_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad
P_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 10 \end{bmatrix},
\]
\[
P_3 = \begin{bmatrix} 1 & 1 & 1 & -10 \\ -1 & 1 & 1 & 0 \\ -1 & -1 & 1 & 10 \end{bmatrix}, \qquad
P_4 = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 0 & 10 \end{bmatrix},
\]
\[
y^{(1)} = \begin{bmatrix} 0.98 \\ 0.93 \end{bmatrix}, \quad
y^{(2)} = \begin{bmatrix} 1.01 \\ 1.01 \end{bmatrix}, \quad
y^{(3)} = \begin{bmatrix} 0.95 \\ 1.05 \end{bmatrix}, \quad
y^{(4)} = \begin{bmatrix} 2.04 \\ 0.00 \end{bmatrix}.
\]
You can terminate the bisection when a point is found with accuracy $g(x) - p^\star \leq 10^{-4}$, where
$p^\star$ is the optimal value of (38).

Solution.

(a) The constraint $g(x) \leq \alpha$ is equivalent to
\[
\|A_k x + b_k - (c_k^T x + d_k) y^{(k)}\|_2 \leq \alpha (c_k^T x + d_k), \quad k = 1, \ldots, N.
\]
This is a set of $N$ convex second-order cone constraints in $x$.


(b) The CVX code printed below returns x = (4.9, 5.0, 5.2) and t = 0.0495.
P1 = [1, 0, 0, 0; 0, 1, 0, 0; 0, 0, 1, 0];
P2 = [1, 0, 0, 0; 0, 0, 1, 0; 0, -1, 0, 10];
P3 = [1, 1, 1, -10; -1, 1, 1, 0; -1, -1, 1, 10];
P4 = [0, 1, 1, 0; 0, -1, 1, 0; -1, 0, 0, 10];

u1 = [0.98; 0.93];
u2 = [1.01; 1.01];

u3 = [0.95; 1.05];
u4 = [2.04; 0.00];

cvx_quiet(true);
l = 0; u = 1;
tol = 1e-5;
while u-l > tol
t = (l+u)/2;
cvx_begin
variable x(3);
y1 = P1*[x;1];
norm( y1(1:2) - y1(3)*u1 ) <= t * y1(3);

y2 = P2*[x;1];
norm( y2(1:2) - y2(3)*u2 ) <= t * y2(3);

y3 = P3*[x;1];
norm( y3(1:2) - y3(3)*u3 ) <= t * y3(3);

y4 = P4*[x;1];
norm( y4(1:2) - y4(3)*u4 ) <= t * y4(3);
cvx_end
disp(cvx_status)
if cvx_optval == Inf,
l = t;
else
lastx = x;
u = t;
end;
end;
lastx

7.10 Projection onto the probability simplex. In this problem you will work out a simple method for
finding the Euclidean projection $y$ of $x \in \mathbf{R}^n$ onto the probability simplex $\mathcal{P} = \{z \mid z \succeq 0, \ \mathbf{1}^T z = 1\}$.
Hints. Consider the problem of minimizing $(1/2)\|y - x\|_2^2$ subject to $y \succeq 0$, $\mathbf{1}^T y = 1$. Form the
partial Lagrangian
\[
L(y, \nu) = (1/2)\|y - x\|_2^2 + \nu(\mathbf{1}^T y - 1),
\]
leaving the constraint $y \succeq 0$ implicit. Show that $y = (x - \nu\mathbf{1})_+$ minimizes $L(y, \nu)$ over $y \succeq 0$.
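Solution. To find the Euclidean projection of $x$ onto $\mathcal{P}$, we solve the problem
\[
\begin{array}{ll}
\mbox{minimize} & (1/2)\|y - x\|_2^2 \\
\mbox{subject to} & y \succeq 0, \quad \mathbf{1}^T y = 1,
\end{array}
\]
with variable $y$. The partial Lagrangian formed by dualizing the constraint $\mathbf{1}^T y = 1$ is
\[
L(y, \nu) = (1/2)\|y - x\|_2^2 + \nu(\mathbf{1}^T y - 1),
\]
with the implicit constraint $y \succeq 0$. Completing the square, this can also be written as
\[
L(y, \nu) = (1/2)\|y - (x - \nu\mathbf{1})\|_2^2 + \nu(\mathbf{1}^T x - 1) - (n/2)\nu^2.
\]
To minimize $L(y, \nu)$ over $y \succeq 0$ we solve the problem
\[
\begin{array}{ll}
\mbox{minimize} & (1/2)\|y - (x - \nu\mathbf{1})\|_2^2 \\
\mbox{subject to} & y \succeq 0,
\end{array}
\]
with variable $y$. This is simply the Euclidean projection of $x - \nu\mathbf{1}$ onto $\mathbf{R}^n_+$, with the solution
$y = (x - \nu\mathbf{1})_+$. Substituting $y$ into $L(y, \nu)$, we obtain the dual function
\begin{align*}
g(\nu) &= (1/2)\|(x - \nu\mathbf{1})_+ - (x - \nu\mathbf{1})\|_2^2 + \nu(\mathbf{1}^T x - 1) - (n/2)\nu^2 \\
       &= (1/2)\|(x - \nu\mathbf{1})_-\|_2^2 + \nu(\mathbf{1}^T x - 1) - (n/2)\nu^2.
\end{align*}
Since $\nu$ is a scalar, we can find $\nu^\star$ by using a bisection method to maximize $g(\nu)$. The projection
of $x$ onto $\mathcal{P}$ is given by $y^\star = (x - \nu^\star\mathbf{1})_+$.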
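
A minimal Python sketch of this method follows. Instead of maximizing $g$ directly, it bisects on the optimality condition $g'(\nu) = \mathbf{1}^T (x - \nu\mathbf{1})_+ - 1 = 0$, whose left-hand side is nonincreasing in $\nu$; the function name, bracket, and tolerance are illustrative choices and not part of the original solution.

```python
def project_simplex(x, tol=1e-12):
    """Euclidean projection of x onto {y | y >= 0, sum(y) = 1}.

    Bisects on nu until sum(max(x_i - nu, 0)) = 1, which is the
    condition g'(nu) = 0 for the dual function g.
    """
    # At nu = min(x) - 1 the sum is at least n >= 1; at nu = max(x) it is 0.
    lo, hi = min(x) - 1.0, max(x)
    while hi - lo > tol:
        nu = 0.5 * (lo + hi)
        if sum(max(xi - nu, 0.0) for xi in x) > 1.0:
            lo = nu   # sum too large: increase nu
        else:
            hi = nu   # sum too small (or exact): decrease nu
    nu = 0.5 * (lo + hi)
    return [max(xi - nu, 0.0) for xi in x]

y1 = project_simplex([0.2, -0.3, 1.5])   # nu* = 0.5, projection is a vertex
y2 = project_simplex([1.0, 1.0, 1.0])    # nu* = 2/3, projection is the center
```

For the first input the optimal multiplier is $\nu^\star = 0.5$ and the projection is $(0, 0, 1)$; for the second it is $\nu^\star = 2/3$ and the projection is $(1/3, 1/3, 1/3)$.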

7.11 Conformal mapping via convex optimization. Suppose that $\Omega$ is a closed bounded region in $\mathbf{C}$ with
no holes (i.e., it is simply connected). The Riemann mapping theorem states that there exists a
conformal mapping $\varphi$ from $\Omega$ onto $D = \{z \in \mathbf{C} \mid |z| \leq 1\}$, the unit disk in the complex plane.
(This means that $\varphi$ is an analytic function, and maps $\Omega$ one-to-one onto $D$.)
One proof of the Riemann mapping theorem is based on an infinite dimensional optimization
problem. We choose a point $a \in \mathop{\bf int} \Omega$ (the interior of $\Omega$). Among all analytic functions that map
$\partial\Omega$ (the boundary of $\Omega$) into $D$, we choose one that maximizes the magnitude of the derivative at
$a$. Amazingly, it can be shown that this function is a conformal mapping of $\Omega$ onto $D$.
We can use this theorem to construct an approximate conformal mapping, by sampling the boundary
of $\Omega$, and by restricting the optimization to a finite-dimensional subspace of analytic functions.
Let $b_1, \ldots, b_N$ be a set of points in $\partial\Omega$ (meant to be a sampling of the boundary). We will search
only over polynomials of degree up to $n$,
\[
\hat\varphi(z) = \alpha_1 z^n + \alpha_2 z^{n-1} + \cdots + \alpha_n z + \alpha_{n+1},
\]
where $\alpha_1, \ldots, \alpha_{n+1} \in \mathbf{C}$. With these approximations, we obtain the problem
\[
\begin{array}{ll}
\mbox{maximize} & |\hat\varphi'(a)| \\
\mbox{subject to} & |\hat\varphi(b_i)| \leq 1, \quad i = 1, \ldots, N,
\end{array}
\]
with variables $\alpha_1, \ldots, \alpha_{n+1} \in \mathbf{C}$. The problem data are $b_1, \ldots, b_N \in \partial\Omega$ and $a \in \mathop{\bf int} \Omega$.

(a) Explain how to solve the problem above via convex or quasiconvex optimization.
(b) Carry out your method on the problem instance given in conf_map_data.m. This file defines
the boundary points $b_i$ and plots them. It also contains code that will plot $\hat\varphi(b_i)$, the boundary
of the mapped region, once you provide the values of $\alpha_j$; these points should be very close to
the boundary of the unit disk. (Please turn in this plot, and give us the values of $\alpha_j$ that you
find.) The function polyval may be helpful.

Remarks.

• We've been a little informal in our mathematics here, but it won't matter.
• You do not need to know any complex analysis to solve this problem; we've told you everything
you need to know.
• A basic result from complex analysis tells us that $\hat\varphi$ is one-to-one if and only if the image of
the boundary does not 'loop over' itself. (We mention this just for fun; we're not asking you
to verify that the $\hat\varphi$ you find is one-to-one.)

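Solution. The constraint functions can be written as
\[
|\hat\varphi(b_i)| = \left| \alpha_1 b_i^n + \alpha_2 b_i^{n-1} + \cdots + \alpha_n b_i + \alpha_{n+1} \right|,
\]
which are evidently convex in $\alpha$, since $\hat\varphi(b_i)$ is a (complex) affine function of $\alpha$. The objective as
stated is
\[
|\hat\varphi'(a)| = \left| \alpha_1 n a^{n-1} + \alpha_2 (n-1) a^{n-2} + \cdots + \alpha_n \right|,
\]
which is not concave in $\alpha$. But we observe that if $\alpha$ satisfies the constraints, then so does $\gamma\alpha$, where
$|\gamma| = 1$. Therefore we can just as well maximize $\Re \hat\varphi'(a)$, or even insist that $\hat\varphi'(a)$ be real (either of
these works). This yields
\[
\begin{array}{ll}
\mbox{maximize} & \Re\left( \alpha_1 n a^{n-1} + \alpha_2 (n-1) a^{n-2} + \cdots + \alpha_n \right) \\
\mbox{subject to} & \left| \alpha_1 b_i^n + \alpha_2 b_i^{n-1} + \cdots + \alpha_n b_i + \alpha_{n+1} \right| \leq 1, \quad i = 1, \ldots, N,
\end{array}
\]
which is convex (in fact, an SOCP).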
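
The rotation argument can be illustrated numerically. The Python sketch below uses hypothetical coefficients and a hypothetical interior point (not the data in conf_map_data.m); the helper `polyval` is a local function modeled on the MATLAB function of the same name. It shows that multiplying the coefficients by a unit-modulus constant makes the derivative at $a$ real and nonnegative without changing any of the magnitudes that appear in the constraints.

```python
import cmath

def polyval(coeffs, z):
    """Evaluate a polynomial with coefficients in descending powers (Horner)."""
    result = 0j
    for coef in coeffs:
        result = result * z + coef
    return result

def derivative_coeffs(alpha):
    """Coefficients of the derivative, again in descending powers."""
    n = len(alpha) - 1
    return [alpha[i] * (n - i) for i in range(n)]

# Hypothetical coefficients, interior point, and boundary sample.
alpha = [0.1 - 0.2j, 0.3j, 0.9 + 0.1j, 0.05]
a = 0.2 + 0.1j
b_sample = 0.6 - 0.8j

d = polyval(derivative_coeffs(alpha), a)
gamma = cmath.exp(-1j * cmath.phase(d))          # |gamma| = 1
rotated = [gamma * coef for coef in alpha]
d_rot = polyval(derivative_coeffs(rotated), a)   # equals |d|, up to rounding
```

After the rotation, `d_rot` has (numerically) zero imaginary part and real part `abs(d)`, while `abs(polyval(rotated, b_sample))` equals `abs(polyval(alpha, b_sample))`.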


The code which solves this problem is given below:

% approximate conformal mapping via convex optimization
conf_map_data;

cvx_begin
    variable alpha(n+1) complex;           % polynomial coefficients
    alpha_prime = alpha(1:n).*(n:-1:1)';   % coefficients of derivative
    maximize (real(polyval(alpha_prime,a)));
    norm(polyval(alpha,b),Inf) <= 1;       % must map boundary points into unit disk
cvx_end
w = polyval(alpha,b); % (boundary of) mapped region

subplot(1,2,1)
plot(real(b),imag(b));
title('Boundary of region');
axis equal;
axis([-1.5 1.5 -1.5 1.5]);

subplot(1,2,2)
plot(real(w),imag(w),'b',cos(theta),sin(theta),'g');
title('Boundary of mapped region');
axis equal;
axis([-1.5 1.5 -1.5 1.5]);

%print('-depsc','conf_map.eps');

The coefficients of the conformal mapping polynomial are

alpha =
-0.0015 + 0.0051i
-0.0074 - 0.0167i
0.0107 + 0.0091i
0.0206 - 0.0281i
0.0017 + 0.0740i
0.0031 - 0.0241i
-0.1263 + 0.0543i
0.0434 - 0.1671i
-0.1122 - 0.0056i
1.0430 - 0.0000i
0.0016 - 0.0026i

This gives us the following conformal mapping:

[Figure: two panels, "Boundary of region" (left) and "Boundary of mapped region" (right).]

7.12 Fitting a vector field to given directions. This problem concerns a vector field on $\mathbf{R}^n$, i.e., a function
$F : \mathbf{R}^n \to \mathbf{R}^n$. We are given the direction of the vector field at points $x^{(1)}, \ldots, x^{(N)} \in \mathbf{R}^n$,
\[
q^{(i)} = \frac{1}{\|F(x^{(i)})\|_2} F(x^{(i)}), \quad i = 1, \ldots, N.
\]
(These directions might be obtained, for example, from samples of trajectories of the differential
equation $\dot z = F(z)$.) The goal is to fit these samples with a vector field of the form
\[
\hat F = \alpha_1 F_1 + \cdots + \alpha_m F_m,
\]
where $F_1, \ldots, F_m : \mathbf{R}^n \to \mathbf{R}^n$ are given (basis) functions, and $\alpha \in \mathbf{R}^m$ is a set of coefficients that
we will choose.
We will measure the fit using the maximum angle error,
\[
J = \max_{i=1,\ldots,N} \left| \angle(q^{(i)}, \hat F(x^{(i)})) \right|,
\]
where $\angle(z, w) = \cos^{-1}((z^T w)/\|z\|_2 \|w\|_2)$ denotes the angle between nonzero vectors $z$ and $w$. We
are only interested in the case when $J$ is smaller than $\pi/2$.

(a) Explain how to choose $\alpha$ so as to minimize $J$ using convex optimization. Your method can
involve solving multiple convex problems. Be sure to explain how you handle the constraints
$\hat F(x^{(i)}) \neq 0$.
(b) Use your method to solve the problem instance with data given in vfield_fit_data.m, with
an affine vector field fit, i.e., $\hat F(z) = Az + b$. (The matrix $A$ and vector $b$ are the parameters
$\alpha$ above.) Give your answer to the nearest degree, as in '$20^\circ < J^\star \leq 21^\circ$'.
This file also contains code that plots the vector field directions, and also (but commented
out) the directions of the vector field fit, $\hat F(x^{(i)})/\|\hat F(x^{(i)})\|_2$. Create this plot, with your fitted
vector field.

Solution. Let us work out the condition under which $J^\star < \gamma$, where $\gamma \in (0, \pi/2)$. (And yes, we
do mean to use strict inequality here.) Since $\cos^{-1}$ is monotone decreasing on this region, we have
$J^\star < \gamma$ if and only if there exists $\alpha$ with
\[
\frac{q^{(i)T} \hat F(x^{(i)})}{\|\hat F(x^{(i)})\|_2} > \cos(\gamma), \quad i = 1, \ldots, N.
\]
We write this as
\[
q^{(i)T} \hat F(x^{(i)}) > \cos(\gamma) \|\hat F(x^{(i)})\|_2, \quad i = 1, \ldots, N.
\]
This is a set of strict SOC inequalities. Now we observe that the left and righthand sides are homogeneous
in $\alpha$. Therefore the strict inequalities above hold if and only if the nonstrict inequalities
\[
q^{(i)T} \hat F(x^{(i)}) \geq 1 + \cos(\gamma) \|\hat F(x^{(i)})\|_2, \quad i = 1, \ldots, N,
\]
hold. (We can achieve this by scaling $\alpha$; this is the standard trick for handling strict inequalities
when the problem is homogeneous.) Now we use bisection on $\gamma$ to approximate $J^\star$ with any
desired degree of accuracy. Note that any feasible $\alpha$ cannot be zero, since the righthand side of
the inequality is positive. Our code (shown below) just checks whether a given number of degrees
error is attainable; we find that $15^\circ < J^\star \leq 16^\circ$. The code produces the plot shown below.

% solution of vector field direction fitting problem
vfield_fit_data;

% check manually for feasibility;
% we find 16 degrees works; 15 does not
deg = 16; % degrees of angle fit
gamma = deg*(pi/180);

cvx_begin
    variables A(2,2) b(2);
    Fhat = A*X + b*ones(1,N);
    sum(Q.*Fhat) >= 1 + cos(gamma)*norms(Fhat);
cvx_end

% Normalize F_hat
Fhatn = Fhat*diag(1./norms(Fhat));
% Let's plot the directions Q
figure
quiver(x1,x2,Q(1,:),Q(2,:))
hold on
quiver(x1,x2,Fhatn(1,:),Fhatn(2,:),'r')
xlabel('x1')
ylabel('x2')
print -depsc vfield_fit

[Figure: the given directions and the directions of the fitted vector field (in red), plotted against x1 and x2.]
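The bisection on $\gamma$ described in the solution can be sketched as follows. This is a minimal Python sketch, not the authors' code: `is_feasible` is a hypothetical oracle (an assumption made here) that should report whether the nonstrict SOC inequalities are feasible for a given angle, for example by solving the CVX feasibility problem above. The function name and default arguments are also illustrative.

```python
def bisect_angle(is_feasible, lo_deg=0.0, hi_deg=90.0, tol_deg=1.0):
    """Bracket the optimal angle error J* (in degrees) by bisection.

    is_feasible(gamma_deg) is a hypothetical oracle: True when the
    nonstrict SOC inequalities are feasible for that gamma, else False.
    Feasibility is monotone in gamma, which is what makes bisection valid.
    """
    if not is_feasible(hi_deg):
        raise ValueError("upper bound is not attainable")
    while hi_deg - lo_deg > tol_deg:
        mid = 0.5 * (lo_deg + hi_deg)
        if is_feasible(mid):
            hi_deg = mid      # mid is attainable: shrink from above
        else:
            lo_deg = mid      # mid is not attainable: shrink from below
    return lo_deg, hi_deg     # lo is not attainable, hi is attainable
```

With an exact oracle, the returned pair brackets $J^\star$ to within `tol_deg`; each oracle call corresponds to one convex feasibility problem.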

7.13 Robust minimum volume covering ellipsoid. Suppose $z$ is a point in $\mathbf{R}^n$ and $\mathcal{E}$ is an ellipsoid in $\mathbf{R}^n$
with center $c$. The Mahalanobis distance of the point to the ellipsoid center is defined as
\[ M(z, \mathcal{E}) = \inf\{t \geq 0 \mid z \in c + t(\mathcal{E} - c)\}, \]
which is the factor by which we need to scale the ellipsoid about its center so that $z$ is on its
boundary. We have $z \in \mathcal{E}$ if and only if $M(z, \mathcal{E}) \leq 1$. We can use $(M(z, \mathcal{E}) - 1)_+$ as a measure of
the Mahalanobis distance of the point $z$ to the ellipsoid $\mathcal{E}$.

Now we can describe the problem. We are given $m$ points $x_1, \ldots, x_m \in \mathbf{R}^n$. The goal is to find the
optimal trade-off between the volume of the ellipsoid $\mathcal{E}$ and the total Mahalanobis distance of the
points to the ellipsoid, i.e.,
\[ \sum_{i=1}^m (M(x_i, \mathcal{E}) - 1)_+ . \]
Note that this can be considered a robust version of finding the smallest volume ellipsoid that
covers a set of points, since here we allow one or more points to be outside the ellipsoid.

(a) Explain how to solve this problem. You must say clearly what your variables are, what problem
you solve, and why the problem is convex.
(b) Carry out your method on the data given in rob_min_vol_ellips_data.m. Plot the optimal
trade-off curve of ellipsoid volume versus total Mahalanobis distance. For some selected points
on the trade-off curve, plot the ellipsoid and the points (which are in $\mathbf{R}^2$). We are only
interested in the region of the curve where the ellipsoid volume is within a factor of ten (say)
of the minimum volume ellipsoid that covers all the points.
Important. Depending on how you formulate the problem, you might encounter problems that
are unbounded below, or where CVX encounters numerical difficulty. Just avoid these by
appropriate choice of parameter.
Very important. If you use Matlab version 7.0 (which is filled with bugs) you might find that
functions involving determinants don't work in CVX. If you use this version of Matlab, then
you must download the file blkdiag.m on the course website and put it in your Matlab path
before the default version (which has a bug).

Solution. We parametrize the ellipsoid $\mathcal{E}$ as

\[ \mathcal{E} = \{x \mid \|Ax - b\|_2 \leq 1\}, \]
where $b \in \mathbf{R}^n$ and (without loss of generality) $A \in \mathbf{S}^n_{++}$. The volume of $\mathcal{E}$ is proportional to
$\det A^{-1}$. The Mahalanobis distance to the ellipsoid center is then $M(z, \mathcal{E}) = \|Az - b\|_2$, which is a
convex function of $A$ and $b$. It follows by the composition rules that
\[ \sum_{i=1}^m (\|Ax_i - b\|_2 - 1)_+ \]
is a convex function of $A$ and $b$.
To find the optimal trade-off we simply minimize
\[ \log\det A^{-1} + \lambda \sum_{i=1}^m (\|Ax_i - b\|_2 - 1)_+ \]

over $A \in \mathbf{S}^n_{++}$ and $b \in \mathbf{R}^n$. Here $\lambda$ is a positive weight that we use to trace out the optimal trade-off
curve.
We can replace $\log\det A^{-1}$ with any other convex function that increases monotonically with
$\det A^{-1}$. For example, we could minimize the convex function
\[ -(\det A)^{1/n} + \lambda \sum_{i=1}^m (\|Ax_i - b\|_2 - 1)_+ . \]

We get the same optimal trade-off curve (but with a different parametrization).

% robust min volume ellipsoid covering problem
rob_min_vol_ellips_data;
K = 10; % number of points on trade-off curve
MM = zeros(K,1);
vol = zeros(K,1);
lambda = logspace(-2,-1,K);
clf;

plot(x(1,:),x(2,:),'.r') % plot the data points
hold on
t = linspace(0,2*pi,100);
u = [cos(t);sin(t)]; % unit circle

for i=1:K
    cvx_begin
        variable A(n,n) symmetric
        variable b(n)
        M = sum(pos(norms(A*x-b*ones(1,m))-1));
        minimize (-det_rootn(A) + lambda(i)*M)
        %minimize (-log_det(A) + lambda(i)*M)
    cvx_end
    vol(i) = 1/det(A);
    MM(i) = M;
    E = inv(A)*(u+b*ones(1,length(t))); % boundary points x with A*x - b = u
    plot(E(1,:),E(2,:),'-') % covering ellipse
    hold on
end

print -depsc rob_min_vol_ellips_pts

figure
plot(MM,vol,'o-')
xlabel('total Mahalanobis distance')
ylabel('ellipsoid volume')

print -depsc rob_min_vol_ellips_tradeoff

[Figure: optimal trade-off curve of ellipsoid volume versus total Mahalanobis distance.]

[Figure: the data points and the ellipsoids for the selected points on the trade-off curve.]
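To make the trade-off objective from this solution concrete, here is a minimal Python sketch that evaluates $\log\det A^{-1} + \lambda \sum_i (\|Ax_i - b\|_2 - 1)_+$ for a given $A$, $b$ in the two-dimensional case. It only evaluates the objective; it does not minimize it (that is what the CVX script does). The function names are illustrative assumptions, and the 2-by-2 determinant is hard-coded.

```python
import math

def total_excess_distance(A, b, points):
    """Sum of (||A x - b||_2 - 1)_+ over 2-D points (plain lists)."""
    total = 0.0
    for x in points:
        r0 = A[0][0] * x[0] + A[0][1] * x[1] - b[0]
        r1 = A[1][0] * x[0] + A[1][1] * x[1] - b[1]
        total += max(math.hypot(r0, r1) - 1.0, 0.0)
    return total

def log_det_inv(A):
    """log det A^{-1} for a 2x2 matrix with positive determinant."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return -math.log(det)

def tradeoff_objective(A, b, points, lam):
    """Volume term plus lam times the total excess Mahalanobis distance."""
    return log_det_inv(A) + lam * total_excess_distance(A, b, points)
```

For the unit disk ($A = I$, $b = 0$), a point at distance 2 from the origin contributes 1 to the sum and a point inside the disk contributes 0.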

7.14 Isoperimetric problem. We consider the problem of choosing a curve in a two-dimensional plane
that encloses as much area as possible between itself and the x-axis, subject to constraints. For
simplicity we will consider only curves of the form

\[ \mathcal{C} = \{(x, y) \mid y = f(x)\}, \]

where $f : [0, a] \to \mathbf{R}$. This assumes that for each x-value, there can only be a single y-value, which
need not be the case for general curves. We require that at the end points (which are given), the
curve returns to the x-axis, so $f(0) = 0$, and $f(a) = 0$. In addition, the length of the curve cannot
exceed a budget $L$, so we must have
\[ \int_0^a \sqrt{1 + f'(x)^2} \; dx \leq L. \]

The objective is the area enclosed, which is given by
\[ \int_0^a f(x) \; dx. \]

To pose this as a finite dimensional optimization problem, we discretize over the x-values. Specifically,
we take $x_i = h(i - 1)$, $i = 1, \ldots, N + 1$, where $h = a/N$ is the discretization step size, and
we let $y_i = f(x_i)$. Thus our objective becomes
\[ h \sum_{i=1}^N y_i, \]

and our constraints can be written as
\[ h \sum_{i=1}^N \sqrt{1 + ((y_{i+1} - y_i)/h)^2} \leq L, \qquad y_1 = 0, \qquad y_{N+1} = 0. \]

In addition to these constraints, we will also require that our curve passes through a set of
pre-specified points. Let $\mathcal{F} \subseteq \{1, \ldots, N + 1\}$ be an index set. For $j \in \mathcal{F}$, we require $y_j = y_j^{\rm fixed}$, where
$y^{\rm fixed} \in \mathbf{R}^{N+1}$ (the entries of $y^{\rm fixed}$ whose indices are not in $\mathcal{F}$ can be ignored). Finally, we add a
constraint on maximum curvature,
\[ -C \leq (y_{i+2} - 2y_{i+1} + y_i)/h^2 \leq C, \quad i = 1, \ldots, N - 1. \]
Explain how to find the curve, i.e., $y_1, \ldots, y_{N+1}$, that maximizes the area enclosed subject to these
constraints, using convex optimization. Carry out your method on the problem instance with data
given in iso_perim_data.m. Report the optimal area enclosed, and use the commented out code
in the data file to plot your curve.
Remark (for your amusement only). The isoperimetric problem is an ancient problem in
mathematics with a history dating all the way back to the tragedy of Queen Dido and the founding of
Carthage. The story (which is mainly the account of the poet Virgil in his epic volume Aeneid),
goes that Dido was a princess forced to flee her home after her brother murdered her husband. She
travels across the Mediterranean and arrives on the shores of what is today modern Tunisia. The
natives weren't very happy about the newcomers, but Dido was able to negotiate with the local
King: in return for her fortune, the King promised to cede her as much land as she could mark out
with the skin of a bull.
The king thought he was getting a good deal, but Dido outmatched him in mathematical skill. She
broke down the skin into thin pieces of leather and sewed them into a long piece of string. Then,
taking the seashore as an edge, they laid the string in a semicircle, carving out a piece of land
larger than anyone imagined; and on this land, the ancient city of Carthage was born. When the
king saw what she had done, he was so impressed by Dido's talent that he asked her to marry him.
Dido refused, so the king built a university in the hope that he could find another woman with
similar talent.
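Solution. Putting everything together, our problem is
\[
\begin{array}{ll}
\mbox{maximize} & h \sum_{i=1}^N y_i \\
\mbox{subject to} & h \sum_{i=1}^N \sqrt{1 + ((y_{i+1} - y_i)/h)^2} \leq L \\
& -C \leq (y_{i+2} - 2y_{i+1} + y_i)/h^2 \leq C, \quad i = 1, \ldots, N - 1 \\
& y_1 = 0, \quad y_{N+1} = 0, \quad y_j = y_j^{\rm fixed}, \quad j \in \mathcal{F},
\end{array}
\]
with variables $y_1, \ldots, y_{N+1}$. This is a convex problem since the objective (to be maximized) is linear, the
curvature and equality constraints are linear, and we can write
\[ \sqrt{1 + ((y_{i+1} - y_i)/h)^2} = \|(1, (y_{i+1} - y_i)/h)\|_2, \]
which is clearly a convex function of $y_1, \ldots, y_{N+1}$.
The following Matlab code solves this problem.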

iso_perim_data;

cvx_begin
    variables y(N+1)
    for i = 1:N-1
        abs((y(i+2)-2*y(i+1)+y(i))/h^2) <= C;
    end
    len = 0;
    for i = 1:N
        len = len+h*norm([1,(y(i+1)-y(i))/h]);
    end
    len <= L;
    y(1) == 0; y(N+1) == 0;
    y(F) == yfixed(F);
    maximize(h*sum(y))
cvx_end
max_area = cvx_optval;

% plot the curve
x = [0:h:a];
figure; hold on;
plot(0,0,'bo','markerfacecolor','b','markersize',7);
plot(a,0,'bo','markerfacecolor','b','markersize',7);
for i = 1:length(F);
    plot(x(F(i)),yfixed(F(i)),'bo','markerfacecolor','b','markersize',7);
end
axis([0,1,0,0.4]);
plot(x,y);
print('-depsc','iso_perim_curve.eps');

Running the code in matlab gives an optimal area of 0.206. The optimal curve is shown in the
following figure (the blue dots show the fixed points).

[Figure: the optimal curve, with the fixed points shown as blue dots.]
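The discretized quantities in this problem are easy to evaluate for any candidate curve, which is a useful sanity check on a reported solution. The following is a minimal Python sketch under the discretization described above; the function names are illustrative, and `y` is assumed to be a plain list holding $y_1, \ldots, y_{N+1}$.

```python
import math

def area(y, h):
    """Discretized area h * sum_{i=1}^{N} y_i (the last sample is excluded)."""
    return h * sum(y[:-1])

def length(y, h):
    """Discretized length h * sum_i sqrt(1 + ((y_{i+1} - y_i)/h)^2)."""
    return sum(h * math.hypot(1.0, (y[i + 1] - y[i]) / h)
               for i in range(len(y) - 1))

def max_curvature(y, h):
    """Largest |y_{i+2} - 2 y_{i+1} + y_i| / h^2 over the interior samples."""
    return max(abs(y[i + 2] - 2.0 * y[i + 1] + y[i]) / h ** 2
               for i in range(len(y) - 2))
```

A candidate `y` is feasible for the length and curvature constraints when `length(y, h) <= L` and `max_curvature(y, h) <= C`, up to solver tolerance.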

7.15 Dual of maximum volume ellipsoid problem. Consider the problem of computing the maximum
volume ellipsoid inscribed in a nonempty bounded polyhedron
\[ C = \{x \mid a_i^T x \leq b_i, \; i = 1, \ldots, m\}. \]
Parametrizing the ellipsoid as $\mathcal{E} = \{Bu + d \mid \|u\|_2 \leq 1\}$, with $B \in \mathbf{S}^n_{++}$ and $d \in \mathbf{R}^n$, the optimal
ellipsoid can be found by solving the convex optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & -\log\det B \\
\mbox{subject to} & \|Ba_i\|_2 + a_i^T d \leq b_i, \quad i = 1, \ldots, m
\end{array}
\]
with variables $B \in \mathbf{S}^n$, $d \in \mathbf{R}^n$. Derive the Lagrange dual of the equivalent problem
\[
\begin{array}{ll}
\mbox{minimize} & -\log\det B \\
\mbox{subject to} & \|y_i\|_2 + a_i^T d \leq b_i, \quad i = 1, \ldots, m \\
& Ba_i = y_i, \quad i = 1, \ldots, m
\end{array}
\]
with variables $B \in \mathbf{S}^n$, $d \in \mathbf{R}^n$, $y_i \in \mathbf{R}^n$, $i = 1, \ldots, m$.
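Solution. The Lagrangian is
\[
\begin{aligned}
L(B, d, Y, \lambda, Z)
&= -\log\det B + \sum_{i=1}^m \lambda_i (\|y_i\|_2 + a_i^T d - b_i) + \sum_{i=1}^m (z_i^T B a_i - z_i^T y_i) \\
&= -\log\det B + \sum_{i=1}^m z_i^T B a_i + \sum_{i=1}^m (\lambda_i \|y_i\|_2 - z_i^T y_i) + \sum_{i=1}^m \lambda_i a_i^T d - b^T \lambda.
\end{aligned}
\]
Setting the gradient of the terms in $B$ to zero gives $B^{-1} = (1/2) \sum_{i=1}^m (a_i z_i^T + z_i a_i^T)$, and
\[ \inf_B \left( -\log\det B + \sum_{i=1}^m z_i^T B a_i \right) = \log\det \left( \sum_{i=1}^m (a_i z_i^T + z_i a_i^T) \right) + n - n \log 2. \]
(This holds when the matrix $\sum_{i=1}^m (a_i z_i^T + z_i a_i^T)$ is positive definite; otherwise the infimum is $-\infty$.)
The infimum of the terms in $L$ that involve $y_i$ is zero if $\|z_i\|_2 \leq \lambda_i$ and $-\infty$ otherwise. The infimum
of the term that involves $d$ is zero if $\sum_i \lambda_i a_i = 0$ and $-\infty$ otherwise. Combining everything gives
the dual problem
\[
\begin{array}{ll}
\mbox{maximize} & \log\det \left( \sum_{i=1}^m (a_i z_i^T + z_i a_i^T) \right) - b^T \lambda + n - n \log 2 \\
\mbox{subject to} & \|z_i\|_2 \leq \lambda_i, \quad i = 1, \ldots, m \\
& \sum_{i=1}^m \lambda_i a_i = 0 \\
& \lambda \succeq 0.
\end{array}
\]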

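7.16 Fitting a sphere to data. Consider the problem of fitting a sphere $\{x \in \mathbf{R}^n \mid \|x - x_c\|_2 = r\}$ to $m$
points $u_1, \ldots, u_m \in \mathbf{R}^n$, by minimizing the error function
\[ \sum_{i=1}^m \left( \|u_i - x_c\|_2^2 - r^2 \right)^2 \]

over the variables $x_c \in \mathbf{R}^n$, $r \in \mathbf{R}$.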

(a) Explain how to solve this problem using convex or quasiconvex optimization. The simpler your
formulation, the better. (For example: a convex formulation is simpler than a quasiconvex
formulation; an LP is simpler than an SOCP, which is simpler than an SDP.) Be sure to explain
what your variables are, and how your formulation minimizes the error function above.
(b) Use your method to solve the problem instance with data given in the file sphere_fit_data.m,
with n = 2. Plot the fitted circle and the data points.

Solution.

(a) The problem can be formulated as a simple least-squares problem, the simplest nontrivial
convex optimization problem!
We will formulate the problem as

\[ \mbox{minimize} \quad \|Ax - b\|_2^2 . \]

Choose as variables $x = (x_c, t)$ with $t$ defined as $t = r^2 - \|x_c\|_2^2$. Use the optimality conditions
$A^T(Ax - b) = 0$ of the least-squares problem to show that $t + \|x_c\|_2^2 \geq 0$ at the optimum. This
ensures that $r$ can be computed from the optimal $x_c$, $t$ using the formula $r = (t + \|x_c\|_2^2)^{1/2}$.
Take
\[
A = \begin{bmatrix} 2u_1^T & 1 \\ 2u_2^T & 1 \\ \vdots & \vdots \\ 2u_m^T & 1 \end{bmatrix}, \qquad
x = \begin{bmatrix} x_c \\ t \end{bmatrix}, \qquad
b = \begin{bmatrix} \|u_1\|_2^2 \\ \|u_2\|_2^2 \\ \vdots \\ \|u_m\|_2^2 \end{bmatrix}.
\]
The last equation in $A^T(Ax - b) = 0$ gives
\[ \sum_{i=1}^m \left( 2u_i^T x_c + t - \|u_i\|_2^2 \right) = 0, \]

from which we obtain
\[ t + \|x_c\|_2^2 = \frac{1}{m} \sum_{i=1}^m \|u_i - x_c\|_2^2 . \]

(b) xc = (2.5869, 6.4883), R = 1.3052.

[Figure: the fitted circle and the data points.]
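The least-squares formulation in part (a) can be sketched in a few lines. This is a minimal Python illustration using only the standard library, not the code used to produce the reported numbers. It solves the normal equations $A^TAx = A^Tb$ by Gaussian elimination, which is adequate for small well-conditioned instances (a QR-based solver would be the more robust choice). The helper names `solve_linear` and `fit_sphere` are assumptions made for this sketch.

```python
import math

def solve_linear(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def fit_sphere(points):
    """Least-squares sphere fit; returns (center, radius)."""
    n = len(points[0])
    rows = [[2.0 * c for c in u] + [1.0] for u in points]   # rows [2 u_i^T, 1] of A
    b = [sum(c * c for c in u) for u in points]             # entries ||u_i||_2^2 of b
    # Normal equations A^T A x = A^T b, with x = (x_c, t).
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n + 1)]
           for i in range(n + 1)]
    Atb = [sum(r[i] * bi for r, bi in zip(rows, b)) for i in range(n + 1)]
    sol = solve_linear(AtA, Atb)
    xc, t = sol[:n], sol[n]
    r = math.sqrt(t + sum(c * c for c in xc))               # r = (t + ||x_c||^2)^{1/2}
    return xc, r
```

For points that lie exactly on a circle, the fit recovers the center and radius up to rounding error.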

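7.17 The polar of a set $C \subseteq \mathbf{R}^n$ is defined as

\[ C^\circ = \{x \mid u^T x \leq 1 \;\; \forall u \in C\}. \]

(a) Show that $C^\circ$ is convex, regardless of the properties of $C$.
(b) Let $C_1$ and $C_2$ be two nonempty polyhedra defined by sets of linear inequalities:

\[ C_1 = \{u \in \mathbf{R}^n \mid A_1 u \preceq b_1\}, \qquad C_2 = \{v \in \mathbf{R}^n \mid A_2 v \preceq b_2\} \]

with $A_1 \in \mathbf{R}^{m_1 \times n}$, $A_2 \in \mathbf{R}^{m_2 \times n}$, $b_1 \in \mathbf{R}^{m_1}$, $b_2 \in \mathbf{R}^{m_2}$. Formulate the problem of finding the
Euclidean distance between $C_1^\circ$ and $C_2^\circ$,
\[
\begin{array}{ll}
\mbox{minimize} & \|x_1 - x_2\|_2^2 \\
\mbox{subject to} & x_1 \in C_1^\circ \\
& x_2 \in C_2^\circ,
\end{array}
\]
as a QP. Your formulation should be efficient, i.e., the dimensions of the QP (number of
variables and constraints) should be linear in $m_1$, $m_2$, $n$. (In particular, formulations that
require enumerating the extreme points of $C_1$ and $C_2$ are to be avoided.)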

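Solution.

(a) $C^\circ$ is the intersection of halfspaces $\{x \mid u^T x \leq 1\}$, one for each $u \in C$.
(b) $x_i \in C_i^\circ$ if and only if the optimal value of the LP
\[
\begin{array}{ll}
\mbox{maximize} & x_i^T u_i \\
\mbox{subject to} & A_i u_i \preceq b_i
\end{array}
\]
(with variable $u_i$) is less than or equal to one. The dual of this problem is
\[
\begin{array}{ll}
\mbox{minimize} & b_i^T z_i \\
\mbox{subject to} & A_i^T z_i = x_i \\
& z_i \succeq 0
\end{array}
\]
with variable $z_i$. Since $C_i$ is nonempty, strong duality holds, so the optimal value is less than or equal to one if and only if there exists a
feasible $z_i$ with $b_i^T z_i \leq 1$. The problem in the statement is therefore equivalent to
\[
\begin{array}{ll}
\mbox{minimize} & \|x_1 - x_2\|_2^2 \\
\mbox{subject to} & b_1^T z_1 \leq 1, \quad b_2^T z_2 \leq 1 \\
& A_1^T z_1 = x_1, \quad A_2^T z_2 = x_2 \\
& z_1 \succeq 0, \quad z_2 \succeq 0.
\end{array}
\]
The variables are $x_1, x_2 \in \mathbf{R}^n$, $z_1 \in \mathbf{R}^{m_1}$, $z_2 \in \mathbf{R}^{m_2}$.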
7.18 Polyhedral cone questions. You are given matrices $A \in \mathbf{R}^{n \times k}$ and $B \in \mathbf{R}^{n \times p}$.
Explain how to solve the following two problems using convex optimization. Your solution can
involve solving multiple convex problems, as long as the number of such problems is no more than
linear in the dimensions $n$, $k$, $p$.

(a) How would you determine whether $A\mathbf{R}^k_+ \subseteq B\mathbf{R}^p_+$? This means that every nonnegative linear
combination of the columns of $A$ can be expressed as a nonnegative linear combination of the
columns of $B$.
(b) How would you determine whether $A\mathbf{R}^k_+ = \mathbf{R}^n$? This means that every vector in $\mathbf{R}^n$ can be
expressed as a nonnegative linear combination of the columns of $A$.

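Solution.
(a) If $A\mathbf{R}^k_+ \subseteq B\mathbf{R}^p_+$ holds, then we must have $a_i \in B\mathbf{R}^p_+$, for $i = 1, \ldots, k$, where $a_i$ is the $i$th
column of $A$. Conversely, if $a_i \in B\mathbf{R}^p_+$ for $i = 1, \ldots, k$, then $A\mathbf{R}^k_+ \subseteq B\mathbf{R}^p_+$. To see this,
suppose that $a_i = By_i$, with $y_i \succeq 0$, $i = 1, \ldots, k$. For any $\lambda \succeq 0$, we have
\[ A\lambda = \sum_{i=1}^k \lambda_i a_i = \sum_{i=1}^k \lambda_i B y_i = B(\lambda_1 y_1 + \cdots + \lambda_k y_k) \in B\mathbf{R}^p_+, \]

since $\lambda_1 y_1 + \cdots + \lambda_k y_k \succeq 0$.
We can determine whether $a_i \in B\mathbf{R}^p_+$ by solving a feasibility LP: Find $y_i \succeq 0$ with $By_i = a_i$.
So we can determine whether $A\mathbf{R}^k_+ \subseteq B\mathbf{R}^p_+$ by solving $k$ LPs. If any of these is infeasible, we
know that $A\mathbf{R}^k_+ \not\subseteq B\mathbf{R}^p_+$ (indeed, we have found a point $a_i$ in $A\mathbf{R}^k_+$ that is not in $B\mathbf{R}^p_+$); if
they are all feasible, then we know $A\mathbf{R}^k_+ \subseteq B\mathbf{R}^p_+$.
We can lump these $k$ LPs into one LP: Find $Y \in \mathbf{R}^{p \times k}$ with $Y_{ij} \geq 0$ and $A = BY$. (But this
LP separates into the LPs given above.)
(b) We can use a very similar argument here. We check whether or not there are $y_i \succeq 0$ and
$z_i \succeq 0$ for which $Ay_i = e_i$ and $Az_i = -e_i$, $i = 1, \ldots, n$, where $e_i$ is the $i$th standard unit vector. This requires solving $2n$ feasibility LPs. If any of them is infeasible, then
$A\mathbf{R}^k_+ \neq \mathbf{R}^n$. But if we find such vectors, then with $Y = [y_1 \; \cdots \; y_n]$ and $Z = [z_1 \; \cdots \; z_n]$ we have, for any $x \in \mathbf{R}^n$,
\[ x = A \left( Y (x)_+ + Z (x)_- \right), \]
where $(x)_+$ is the nonnegative part of $x$ and $(x)_-$ is the negative part, so that $x = (x)_+ - (x)_-$. The vector $Y (x)_+ +
Z (x)_-$ on the right is nonnegative.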
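The representation used in part (b) can be checked numerically on a small example. The following Python sketch assumes the certificates $Y$ and $Z$ have already been found (for example by an LP solver, which is not shown); the matrices `A`, `Y`, `Z` below are an illustrative instance chosen for this sketch, with $A = [I \;\; {-I}]$ so that the certificates can be written down by hand.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def nonneg_representation(Y, Z, x):
    """Return w = Y (x)_+ + Z (x)_-, a nonnegative vector with A w = x."""
    x_pos = [max(v, 0.0) for v in x]     # nonnegative part of x
    x_neg = [max(-v, 0.0) for v in x]    # negative part of x
    return [p + q for p, q in zip(matvec(Y, x_pos), matvec(Z, x_neg))]

# Illustrative instance (an assumption for this sketch): A = [I, -I] in R^{2x4}.
A = [[1, 0, -1, 0],
     [0, 1, 0, -1]]
Y = [[1, 0], [0, 1], [0, 0], [0, 0]]     # columns y_i >= 0 with A y_i = e_i
Z = [[0, 0], [0, 0], [1, 0], [0, 1]]     # columns z_i >= 0 with A z_i = -e_i
```

For any `x`, the vector `w = nonneg_representation(Y, Z, x)` has nonnegative entries and `matvec(A, w)` reproduces `x`.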
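7.19 Projection on convex hull of union of ellipsoids. Let $\mathcal{E}_1, \ldots, \mathcal{E}_m$ be $m$ ellipsoids in $\mathbf{R}^n$ defined as
\[ \mathcal{E}_i = \{A_i u + b_i \mid \|u\|_2 \leq 1\}, \quad i = 1, \ldots, m, \]
with $A_i \in \mathbf{R}^{n \times n}$ and $b_i \in \mathbf{R}^n$. Consider the problem of projecting a point $a \in \mathbf{R}^n$ on the convex
hull of the union of the ellipsoids:
\[
\begin{array}{ll}
\mbox{minimize} & \|x - a\|_2 \\
\mbox{subject to} & x \in \mathbf{conv}(\mathcal{E}_1 \cup \cdots \cup \mathcal{E}_m).
\end{array}
\]
Formulate this as a second order cone program.
Solution. First express the problem as
\[
\begin{array}{ll}
\mbox{minimize} & \|x - a\|_2 \\
\mbox{subject to} & x = \sum_{i=1}^m \theta_i (A_i u_i + b_i) \\
& \theta_i \geq 0, \quad i = 1, \ldots, m \\
& \sum_{i=1}^m \theta_i = 1 \\
& \|u_i\|_2 \leq 1, \quad i = 1, \ldots, m.
\end{array}
\]
The variables are $x$, $\theta_1, \ldots, \theta_m$, $u_1, \ldots, u_m$. This problem is not convex because of the products
$\theta_i u_i$. It becomes convex if we make a change of variables $\theta_i u_i = y_i$:
\[
\begin{array}{ll}
\mbox{minimize} & \|x - a\|_2 \\
\mbox{subject to} & x = \sum_{i=1}^m (A_i y_i + \theta_i b_i) \\
& \theta_i \geq 0, \quad i = 1, \ldots, m \\
& \sum_{i=1}^m \theta_i = 1 \\
& \|y_i\|_2 \leq \theta_i, \quad i = 1, \ldots, m.
\end{array}
\]
To cast in the standard SOCP form, we also need to make the objective linear by introducing a
variable $t$:
\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \|x - a\|_2 \leq t \\
& x = \sum_{i=1}^m (A_i y_i + \theta_i b_i) \\
& \theta_i \geq 0, \quad i = 1, \ldots, m \\
& \sum_{i=1}^m \theta_i = 1 \\
& \|y_i\|_2 \leq \theta_i, \quad i = 1, \ldots, m.
\end{array}
\]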
7.20 Bregman divergences. Let $f : \mathbf{R}^n \to \mathbf{R}$ be strictly convex and differentiable. Then the Bregman
divergence associated with $f$ is the function $D_f : \mathbf{R}^n \times \mathbf{R}^n \to \mathbf{R}$ given by
\[ D_f(x, y) = f(x) - f(y) - \nabla f(y)^T (x - y). \]

(a) Show that $D_f(x, y) \geq 0$ for all $x, y \in \mathbf{dom}\, f$.
(b) Show that if $f = \|\cdot\|_2^2$, then $D_f(x, y) = \|x - y\|_2^2$.
(c) Show that if $f(x) = \sum_{i=1}^n x_i \log x_i$ (negative entropy), with $\mathbf{dom}\, f = \mathbf{R}^n_+$ (with $0 \log 0$ taken
to be 0), then
\[ D_f(x, y) = \sum_{i=1}^n \left( x_i \log(x_i/y_i) - x_i + y_i \right), \]
the Kullback-Leibler divergence between $x$ and $y$.
(d) Bregman projection. The previous parts suggest that Bregman divergences can be viewed as
generalized distances, i.e., functions that measure how similar two vectors are. This suggests
solving geometric problems that measure distance between vectors using a Bregman divergence
rather than Euclidean distance.
Explain whether
\[
\begin{array}{ll}
\mbox{minimize} & D_f(x, y) \\
\mbox{subject to} & x \in C,
\end{array}
\]
with variable $x \in \mathbf{R}^n$, is a convex optimization problem (assuming $C$ is convex).
(e) Duality. Show that $D_g(y^*, x^*) = D_f(x, y)$, where $g = f^*$ and $z^* = \nabla f(z)$. You can assume
that $\nabla f^* = (\nabla f)^{-1}$ and that $f$ is closed.

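Solution.
(a) Since $f$ is convex and differentiable,
\[ f(x) \geq f(y) + \nabla f(y)^T (x - y) \]
for all $x, y \in \mathbf{dom}\, f$. Rearranging,
\[ f(x) - f(y) - \nabla f(y)^T (x - y) \geq 0. \]
The lefthand side is $D_f(x, y)$.
(b) If $f = \|\cdot\|_2^2$, then
\[
\begin{aligned}
D_f(x, y) &= \|x\|_2^2 - \|y\|_2^2 - 2y^T (x - y) \\
&= \|x\|_2^2 - 2y^T x + \|y\|_2^2 \\
&= \|x - y\|_2^2 .
\end{aligned}
\]
(c) If $f(x) = \sum_{i=1}^n x_i \log x_i$, then
\[
\begin{aligned}
D_f(x, y) &= \sum_{i=1}^n x_i \log x_i - \sum_{i=1}^n y_i \log y_i + \sum_{i=1}^n (x_i - y_i)(-\log y_i - 1) \\
&= \sum_{i=1}^n x_i \log x_i - \sum_{i=1}^n x_i \log y_i - \sum_{i=1}^n y_i \log y_i + \sum_{i=1}^n y_i \log y_i - \mathbf{1}^T x + \mathbf{1}^T y \\
&= \sum_{i=1}^n x_i \log(x_i/y_i) - \mathbf{1}^T x + \mathbf{1}^T y.
\end{aligned}
\]
(d) For fixed $y$, $D_f(x, y)$ is the sum of the convex function $f(x)$ and a function that is affine in $x$, so it is convex in $x$. Bregman projection (onto a convex set) is therefore a convex
optimization problem.
(e) Recall (from page 95) that
\[ f^*(\nabla f(z)) = z^T \nabla f(z) - f(z). \]
It follows that
\[
\begin{aligned}
D_f(x, y) &= f(x) - f(y) - \nabla f(y)^T (x - y) \\
&= f(x) + y^T \nabla f(y) - f(y) - x^T \nabla f(y) \\
&= f(x) + f^*(y^*) - x^T \nabla f(y).
\end{aligned}
\]

Applying this equality to $D_g$, we obtain

\[ D_g(y^*, x^*) = f^*(y^*) + f^{**}(x^{**}) - y^{*T} x^{**}, \]

where $x^{**} = \nabla f^*(x^*)$. Since $\nabla f^* = (\nabla f)^{-1}$, it follows that $x^{**} = (\nabla f)^{-1}(x^*) = x$. Similarly,
because $f$ is closed and convex, we have that $f^{**} = f$. Substituting above, we get

\[ D_g(y^*, x^*) = f^*(y^*) + f(x) - x^T y^*, \]

which is precisely $D_f(x, y)$.

Alternatively, we could start from the definition of $D_g$:

\[ D_g(y^*, x^*) = f^*(y^*) - f^*(x^*) - \nabla f^*(x^*)^T (y^* - x^*). \]

The result follows by plugging in $\nabla f(y)$ and $\nabla f(x)$ for $y^*$ and $x^*$, observing that $\nabla f^*(x^*) = x$,
using the expression above for $f^*(\nabla f(z))$, and rearranging.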
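Parts (b) and (c) can be checked numerically from the definition of $D_f$. The Python sketch below is an illustration, not part of the original solution; the function names are assumptions, vectors are plain lists, and the entropy functions assume strictly positive entries.

```python
import math

def bregman(f, grad_f, x, y):
    """Bregman divergence D_f(x, y) = f(x) - f(y) - grad f(y)^T (x - y)."""
    g = grad_f(y)
    return f(x) - f(y) - sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))

def sq_norm(x):
    return sum(v * v for v in x)

def sq_norm_grad(x):
    return [2.0 * v for v in x]

def neg_entropy(x):
    # Assumes strictly positive entries.
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]

def kl_divergence(x, y):
    """Closed form from part (c): sum_i (x_i log(x_i/y_i) - x_i + y_i)."""
    return sum(xi * math.log(xi / yi) - xi + yi for xi, yi in zip(x, y))
```

With $f = \|\cdot\|_2^2$ the generic formula reproduces $\|x - y\|_2^2$, and with negative entropy it agrees with `kl_divergence` up to rounding.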

7.21 Ellipsoidal peeling. In this problem, you will implement an outlier identification technique using
Löwner-John ellipsoids. Given a set of points $\mathcal{D} = \{x_1, \ldots, x_N\}$ in $\mathbf{R}^n$, the goal is to identify a
set $\mathcal{O} \subseteq \mathcal{D}$ that are anomalous in some sense. Roughly speaking, we think of an outlier as a point
that is far away from most of the points, so we would like the points in $\mathcal{D} \setminus \mathcal{O}$ to be relatively close
together, and to be relatively far apart from the points in $\mathcal{O}$.
We describe a heuristic technique for identifying $\mathcal{O}$. We start with $\mathcal{O} = \emptyset$ and find the minimum
volume (Löwner-John) ellipsoid $\mathcal{E}$ containing all $x_i \notin \mathcal{O}$ (which is all $x_i$ in the first step). Each
iteration, we flag (i.e., add to $\mathcal{O}$) the point that corresponds to the largest dual variable for the
constraint $x_i \in \mathcal{E}$; this point will be one of the points on the boundary of $\mathcal{E}$, and intuitively, it will
be the one for which the constraint is 'most' binding. We then plot $\mathbf{vol}\, \mathcal{E}$ (on a log scale) versus
$\mathbf{card}\, \mathcal{O}$ and hope that we see a sharp drop in the curve. We use the value of $\mathcal{O}$ after the drop.
The hope is that after removing a relatively small number of points, the volume of the minimum
volume ellipsoid containing the remaining points will be much smaller than the minimum volume
ellipsoid for $\mathcal{D}$, which means the removed points are far away from the others.
For example, suppose we have 100 points that lie in the unit ball and 3 points with (Euclidean)
norm 1000. Intuitively, it is clear that it is reasonable to consider the three large points outliers.
The minimum volume ellipsoid of all 103 points will have very large volume. The three points will
be the first ones removed, and as soon as they are, the volume of the ellipsoid will drop
dramatically and be on the order of the volume of the unit ball.

Run 6 iterations of the algorithm on the data given in ellip_anomaly_data.m. Plot $\mathbf{vol}\, \mathcal{E}$ (on a
log scale) versus $\mathbf{card}\, \mathcal{O}$. In addition, on a single plot, plot all the ellipses found with the function
ellipse_draw(A,b) along with the outliers (in red) and the remaining points (in blue).
Of course, we have chosen an example in $\mathbf{R}^2$ so the ellipses can be plotted, but one can detect
outliers in $\mathbf{R}^2$ simply by inspection. In dimension much higher than 3, however, detecting outliers
by plotting will become substantially more difficult, while the same algorithm can be used.
Note. In CVX, you should use det_rootn (which is SDP-representable and handled exactly) instead
of log_det (which is handled using an inefficient iterative procedure).

Solution. The code for removing the outliers and generating the plots is given below. We remark
that the algorithm described here is typically called 'ellipsoidal peeling' in the literature.

[Figure: the ellipses found in each iteration, with the removed points (outliers) in red and the remaining points in blue.]

[Figure: ellipsoid volume (log scale) versus number of removed points.]
%% ellipsoidal peeling

ellip_anomaly_data;

volumes = [];
removed = [];

figure(1); hold all;

for i = 1:6
    % fit ellipsoid
    cvx_begin
        variable A(2,2) symmetric
        variable b(2)
        dual variable v
        maximize (det_rootn(A))
        subject to
            v : norms(A*X + b*ones(1,size(X,2))) <= 1
    cvx_end

    ellipse_draw(A,b);

    % detect outliers
    [vm idx] = max(v);
    removed = [ removed X(:,idx) ];
    X(:,idx) = [];
    volumes = [ volumes 1/det(A) ];
end

plot(X(1,:), X(2,:), 'bx'); % normal points
plot(removed(1,:), removed(2,:), 'ro'); % outliers
print('-depsc','../figures/ellipsoids.eps');

figure(2);
semilogy(volumes)
xlabel('number of removed points');
ylabel('ellipsoid volume');
set(gca,'XTick', 1:length(volumes));
print('-depsc','../figures/volume_iters.eps');

8 Unconstrained and equality constrained minimization
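8.1 Gradient descent and nondifferentiable functions.
(a) Let $\gamma > 1$. Show that the function
\[
f(x_1, x_2) = \left\{ \begin{array}{ll}
\sqrt{x_1^2 + \gamma x_2^2} & |x_2| \leq x_1 \\[1ex]
\displaystyle \frac{x_1 + \gamma |x_2|}{\sqrt{1 + \gamma}} & \mbox{otherwise}
\end{array} \right.
\]
is convex. You can do this, for example, by verifying that
\[ f(x_1, x_2) = \sup \left\{ x_1 y_1 + \sqrt{\gamma}\, x_2 y_2 \;\middle|\; y_1^2 + y_2^2 \leq 1, \;\; y_1 \geq 1/\sqrt{1 + \gamma} \right\}. \]
Note that $f$ is unbounded below. (Take $x_2 = 0$ and let $x_1$ go to $-\infty$.)
(b) Consider the gradient descent algorithm applied to $f$, with starting point $x^{(0)} = (\gamma, 1)$ and an
exact line search. Show that the iterates are
\[ x_1^{(k)} = \gamma \left( \frac{\gamma - 1}{\gamma + 1} \right)^k, \qquad x_2^{(k)} = \left( -\frac{\gamma - 1}{\gamma + 1} \right)^k. \]
Therefore $x^{(k)}$ converges to $(0, 0)$. However, this is not the optimum, since $f$ is unbounded
below.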
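Solution.
(a) We follow the hint and examine the optimization problem
\[
\begin{array}{ll}
\mbox{maximize} & x_1 y_1 + \sqrt{\gamma}\, x_2 y_2 \\
\mbox{subject to} & y_1^2 + y_2^2 \leq 1 \\
& y_1 \geq 1/\sqrt{1 + \gamma}
\end{array}
\]
with variables $y_1$, $y_2$. The feasible set is the part of the unit disk to the right of the vertical
line through $y_1 = (1 + \gamma)^{-1/2}$.

We maximize the inner product of $y$ with the coefficient vector $(x_1, \sqrt{\gamma}\, x_2)$. There are three
cases to distinguish, depending on the orientation of the coefficient vector.

• If $x_1 > 0$ and $|x_2| \leq x_1$, the coefficient vector lies in the cone between the vectors $(1, \sqrt{\gamma})$
and $(1, -\sqrt{\gamma})$, and the optimum is
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \frac{1}{\sqrt{x_1^2 + \gamma x_2^2}} \begin{bmatrix} x_1 \\ \sqrt{\gamma}\, x_2 \end{bmatrix}. \]
The optimal value is $(x_1^2 + \gamma x_2^2)^{1/2}$.

• If $x_2 \geq 0$, and $x_1 < x_2$, then the point
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \frac{1}{\sqrt{1 + \gamma}} \begin{bmatrix} 1 \\ \sqrt{\gamma} \end{bmatrix} \]
is optimal, and the optimal value is $(x_1 + \gamma x_2)/(1 + \gamma)^{1/2}$.

• If $x_2 \leq 0$, and $x_1 < -x_2$, then the point
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \frac{1}{\sqrt{1 + \gamma}} \begin{bmatrix} 1 \\ -\sqrt{\gamma} \end{bmatrix} \]
is optimal, and the optimal value is $(x_1 - \gamma x_2)/(1 + \gamma)^{1/2}$.

(b) We first note that the iterates given in the problem satisfy $|x_2^{(k)}| < x_1^{(k)}$, so they are in
the interior of the region where $f(x_1, x_2) = (x_1^2 + \gamma x_2^2)^{1/2}$. In this region the function is
differentiable with gradient
\[ \nabla f(x) = \frac{1}{\sqrt{x_1^2 + \gamma x_2^2}} \begin{bmatrix} x_1 \\ \gamma x_2 \end{bmatrix}. \]
Since we use an exact line search, only the direction of $\nabla f(x)$ matters.
We now verify the expressions
\[ x_1^{(k)} = \gamma \left( \frac{\gamma - 1}{\gamma + 1} \right)^k, \qquad x_2^{(k)} = \left( -\frac{\gamma - 1}{\gamma + 1} \right)^k \]
in the assignment. For $k = 0$, we get the starting point $x^{(0)} = (\gamma, 1)$. The gradient at $x^{(k)}$ is
proportional to $(x_1^{(k)}, \gamma x_2^{(k)})$, and therefore the exact line search minimizes $f$ along the line
\[
\begin{bmatrix} (1 - t) x_1^{(k)} \\ (1 - \gamma t) x_2^{(k)} \end{bmatrix}
= \left( \frac{\gamma - 1}{\gamma + 1} \right)^k \begin{bmatrix} (1 - t)\gamma \\ (1 - \gamma t)(-1)^k \end{bmatrix}.
\]
Along this line $f$ is given by
\[ f\left( (1 - t) x_1^{(k)}, (1 - \gamma t) x_2^{(k)} \right) = \left( \gamma^2 (1 - t)^2 + \gamma (1 - \gamma t)^2 \right)^{1/2} \left( \frac{\gamma - 1}{\gamma + 1} \right)^k . \]
This is minimized by $t = 2/(1 + \gamma)$, so we have
\[
x^{(k+1)} = \left( \frac{\gamma - 1}{\gamma + 1} \right)^k \begin{bmatrix} (1 - t)\gamma \\ (1 - \gamma t)(-1)^k \end{bmatrix}
= \left( \frac{\gamma - 1}{\gamma + 1} \right)^{k+1} \begin{bmatrix} \gamma \\ (-1)^{k+1} \end{bmatrix}.
\]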
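The recursion derived in part (b) can be checked numerically. The Python sketch below applies the step length $t = 2/(1+\gamma)$ from the derivation along the direction $(x_1, \gamma x_2)$, which is proportional to the gradient in the region $|x_2| < x_1$, and compares the result with the closed-form iterates. It does not perform a numerical line search; the function names are illustrative.

```python
def gradient_step(x, gamma):
    """One step x - t * (x1, gamma * x2) with the exact-line-search t = 2/(1+gamma)."""
    t = 2.0 / (1.0 + gamma)
    return [(1.0 - t) * x[0], (1.0 - gamma * t) * x[1]]

def closed_form(k, gamma):
    """Iterates claimed in the exercise statement."""
    r = (gamma - 1.0) / (gamma + 1.0)
    return [gamma * r ** k, (-r) ** k]

def run(gamma, iters):
    """Return the list of iterates x^(0), ..., x^(iters) starting from (gamma, 1)."""
    x = [gamma, 1.0]
    history = [x]
    for _ in range(iters):
        x = gradient_step(x, gamma)
        history.append(x)
    return history
```

The iterates shrink geometrically toward $(0, 0)$ while alternating the sign of $x_2$, which is the zigzag behavior the exercise is designed to exhibit.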

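8.2 A characterization of the Newton decrement. Let $f : \mathbf{R}^n \to \mathbf{R}$ be convex and twice differentiable,
and let $A$ be a $p \times n$-matrix with rank $p$. Suppose $\hat x$ is feasible for the equality constrained problem
\[
\begin{array}{ll}
\mbox{minimize} & f(x) \\
\mbox{subject to} & Ax = b.
\end{array}
\]
Recall that the Newton step $\Delta x$ at $\hat x$ can be computed from the linear equations
\[
\begin{bmatrix} \nabla^2 f(\hat x) & A^T \\ A & 0 \end{bmatrix}
\begin{bmatrix} \Delta x \\ u \end{bmatrix}
= \begin{bmatrix} -\nabla f(\hat x) \\ 0 \end{bmatrix},
\]
and that the Newton decrement $\lambda(\hat x)$ is defined as
\[ \lambda(\hat x) = (-\nabla f(\hat x)^T \Delta x)^{1/2} = (\Delta x^T \nabla^2 f(\hat x) \Delta x)^{1/2}. \]

Assume the coefficient matrix in the linear equations above is nonsingular and that $\lambda(\hat x)$ is positive.
Express the solution $y$ of the optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & \nabla f(\hat x)^T y \\
\mbox{subject to} & Ay = 0 \\
& y^T \nabla^2 f(\hat x) y \leq 1
\end{array}
\]
in terms of the Newton step $\Delta x$ and the Newton decrement $\lambda(\hat x)$.
Solution. Strong duality holds ($y = 0$ is strictly feasible). The optimality conditions for $y$ are
\[
\begin{aligned}
& Ay = 0 \mbox{ and } y^T \nabla^2 f(\hat x) y \leq 1, \\
& \mu \geq 0, \\
& \mu (1 - y^T \nabla^2 f(\hat x) y) = 0, \\
& \nabla f(\hat x) + A^T \nu + \mu \nabla^2 f(\hat x) y = 0,
\end{aligned}
\]
where $\mu/2$ is the multiplier for the inequality $y^T \nabla^2 f(\hat x) y \leq 1$ (the factor $1/2$ simplifies the last condition) and $\nu$ is the multiplier for $Ay = 0$.
From the last condition and $Ay = 0$, we see that $(\mu y, \nu)$ satisfies the Newton equations, which have a unique solution, so $\mu y = \Delta x$. Since $\lambda(\hat x) > 0$ we have $\Delta x \neq 0$, hence $\mu > 0$ and $y^T \nabla^2 f(\hat x) y = 1$. The value of the multiplier $\mu$ follows
from
\[ \mu^2 = \mu^2 (y^T \nabla^2 f(\hat x) y) = \Delta x^T \nabla^2 f(\hat x) \Delta x = \lambda(\hat x)^2 . \]
Hence $\mu = \lambda(\hat x)$ and
\[ y = \frac{1}{\lambda(\hat x)} \Delta x . \]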

8.3 Suggestions for exercise 9.30 in Convex Optimization. We recommend the following to generate
a problem instance:

n = 100;
m = 200;
randn('state',1);
A=randn(m,n);

Of course, you should try out your code with different dimensions, and different data as well.
In all cases, be sure that your line search first finds a step length for which the tentative point is
in $\mathbf{dom}\, f$; if you attempt to evaluate $f$ outside its domain, you'll get complex numbers, and you'll
never recover.
To find expressions for $\nabla f(x)$ and $\nabla^2 f(x)$, use the chain rule (see Appendix A.4); if you attempt
to compute $\partial^2 f(x)/\partial x_i \partial x_j$, you will be sorry.
To compute the Newton step, you can use vnt=-H\g.

8.4 Suggestions for exercise 9.31 in Convex Optimization. For 9.31a, you should try out N = 1,
N = 15, and N = 30. You might as well compute and store the Cholesky factorization of the
Hessian, and then back solve to get the search directions, even though you won't really see any
speedup in Matlab for such a small problem. After you evaluate the Hessian, you can find the
Cholesky factorization as L=chol(H,'lower'). You can then compute a search step as -L'\(L\g),
where g is the gradient at the current point. Matlab will do the right thing, i.e., it will first solve
L\g using forward substitution, and then it will solve -L'\(L\g) using backward substitution. Each
substitution is order $n^2$.
To fairly compare the convergence of the three methods (i.e., N = 1, N = 15, N = 30), the
horizontal axis should show the approximate total number of flops required, and not the number
of iterations. You can compute the approximate number of flops using $n^3/3$ for each factorization,
and $2n^2$ for each solve (where each solve involves a forward substitution step and a backward
substitution step).
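The factor-once, solve-twice pattern described above can be written out explicitly. The following Python sketch (standard library only, dense lists of rows) mirrors the Matlab expression -L'\(L\g): a Cholesky factorization $H = LL^T$, a forward substitution with $L$, and a backward substitution with $L^T$. It is an illustration for small matrices, not a replacement for Matlab's built-in routines, and the function names are assumptions.

```python
import math

def cholesky_lower(H):
    """Return lower-triangular L with H = L L^T (H symmetric positive definite)."""
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = H[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (H[i][j] - s) / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L y = b by forward substitution (order n^2)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def backward_sub_transpose(L, y):
    """Solve L^T x = y by backward substitution (order n^2)."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def search_step(L, g):
    """Search step -H^{-1} g computed from a stored Cholesky factor L."""
    return [-v for v in backward_sub_transpose(L, forward_sub(L, g))]
```

When the same Hessian is reused for several iterations, `cholesky_lower` is called once and only `search_step` is repeated, which is where the flop savings counted above come from.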

8.5 Efficient numerical method for a regularized least-squares problem. We consider a regularized least
squares problem with smoothing,
\[ \mbox{minimize} \quad \sum_{i=1}^k (a_i^T x - b_i)^2 + \delta \sum_{i=1}^{n-1} (x_i - x_{i+1})^2 + \eta \sum_{i=1}^n x_i^2, \]

where $x \in \mathbf{R}^n$ is the variable, and $\delta, \eta > 0$ are parameters.

(a) Express the optimality conditions for this problem as a set of linear equations involving $x$.
(These are called the normal equations.)
(b) Now assume that $k \ll n$. Describe an efficient method to solve the normal equations found in
part (a). Give an approximate flop count for a general method that does not exploit structure,
and also for your efficient method.
(c) A numerical instance. In this part you will try out your efficient method. We'll choose $k = 100$
and $n = 4000$, and $\delta = \eta = 1$. First, randomly generate $A$ and $b$ with these dimensions. Form
the normal equations as in part (a), and solve them using a generic method. Next, write
(short) code implementing your efficient method, and run it on your problem instance. Verify
that the solutions found by the two methods are nearly the same, and also that your efficient
method is much faster than the generic one.

Note: You'll need to know some things about Matlab to be sure you get the speedup from the
efficient method. Your method should involve solving linear equations with tridiagonal coefficient
matrix. In this case, both the factorization and the back substitution can be carried out very
efficiently. The Matlab documentation says that banded matrices are recognized and exploited,
when solving equations, but we found this wasn't always the case. To be sure Matlab knows your
matrix is tridiagonal, you can declare the matrix as sparse, using spdiags, which can be used to
create a tridiagonal matrix. You could also create the tridiagonal matrix conventionally, and then
convert the resulting matrix to a sparse one using sparse.
One other thing you need to know. Suppose you need to solve a group of linear equations with the
same coefficient matrix, i.e., you need to compute $F^{-1} a_1, \ldots, F^{-1} a_m$, where $F$ is invertible and $a_i$
are column vectors. By concatenating columns, this can be expressed as a single matrix
\[ \left[ F^{-1} a_1 \; \cdots \; F^{-1} a_m \right] = F^{-1} \left[ a_1 \; \cdots \; a_m \right]. \]

To compute this matrix using Matlab, you should collect the righthand sides into one matrix (as
above) and use Matlab's backslash operator: F\A. This will do the right thing: factor the matrix
F once, and carry out multiple back substitutions for the righthand sides.
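Solution.

(a) The objective function is

\[ x^T (A^T A + \delta \Delta + \eta I) x - 2 b^T A x + b^T b, \]

where $A \in \mathbf{R}^{k \times n}$ is the matrix with rows $a_i^T$, and $\Delta \in \mathbf{R}^{n \times n}$ is the tridiagonal matrix
\[
\Delta = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 & 0 \\
0 & -1 & 2 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 2 & -1 & 0 \\
0 & 0 & 0 & \cdots & -1 & 2 & -1 \\
0 & 0 & 0 & \cdots & 0 & -1 & 1
\end{bmatrix}.
\]
Since the problem is unconstrained, the optimality conditions are

\[ (A^T A + \delta \Delta + \eta I) x^\star = A^T b. \qquad (39) \]

(b) If no structure is exploited, then solving (39) costs approximately $(1/3)n^3$ flops. If $k \ll n$, we
need to solve a system $F x = g$, with $g = A^T b$, where $F$ is the sum of a tridiagonal and a (relatively) low-rank
matrix. We can use the Sherman-Morrison-Woodbury formula

\[ x^\star = (\delta \Delta + \eta I)^{-1} g - (\delta \Delta + \eta I)^{-1} A^T (I + A (\delta \Delta + \eta I)^{-1} A^T)^{-1} A (\delta \Delta + \eta I)^{-1} g \]

to efficiently solve (39) as follows:

(i) Solve $(\delta \Delta + \eta I) z_1 = g$ and $(\delta \Delta + \eta I) Z_2 = A^T$ for $z_1$ and $Z_2$. Since $\delta \Delta + \eta I$ is tridiagonal,
the total cost for this is approximately $6nk + 10n$ flops ($4n$ for factorization and $6n(k + 1)$
for the solves).
(ii) Form $A z_1$ and $A Z_2$ ($2nk + 2nk^2$ flops).
(iii) Solve $(I + A Z_2) z_3 = A z_1$ for $z_3$ ($(1/3)k^3$ flops).
(iv) Form $x^\star = z_1 - Z_2 z_3$ ($2nk$ flops).
The total flop count, keeping only leading terms, is $2nk^2$ flops, which is much smaller than
$(1/3)n^3$ when $k \ll n$.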
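Steps (i) to (iv) can be sketched without any sparse-matrix library. The Python code below is a small illustration (standard library only, intended for toy sizes), not the Matlab solution: the tridiagonal systems are solved with the Thomas algorithm, which needs no pivoting here because $\delta\Delta + \eta I$ is symmetric positive definite, and the small $k \times k$ system is solved by Gaussian elimination. The function names are assumptions made for this sketch, and $n \geq 2$ is assumed.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm, O(n) flops. lower[i] multiplies x[i-1] in row i
    (lower[0] unused); upper[i] multiplies x[i+1] (upper[n-1] unused)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting for the small k x k system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for cc in range(col, n + 1):
                aug[r][cc] -= factor * aug[col][cc]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def smoothed_ls(A, b, delta, eta):
    """Solve (A^T A + delta*Delta + eta*I) x = A^T b via steps (i)-(iv)."""
    k, n = len(A), len(A[0])
    diag = [2.0 * delta + eta] * n          # diagonal of delta*Delta + eta*I
    diag[0] = diag[-1] = delta + eta
    off = [-delta] * n                      # sub- and superdiagonal entries
    solve_P = lambda r: solve_tridiagonal(off, diag, off, r)
    g = [sum(A[i][j] * b[i] for i in range(k)) for j in range(n)]   # g = A^T b
    z1 = solve_P(g)                                                 # step (i)
    Z2 = [solve_P(A[i]) for i in range(k)]   # Z2[i] is column i of P^{-1} A^T
    M = [[(1.0 if i == j else 0.0) + sum(A[i][l] * Z2[j][l] for l in range(n))
          for j in range(k)] for i in range(k)]                     # I + A Z2, step (ii)
    Az1 = [sum(A[i][l] * z1[l] for l in range(n)) for i in range(k)]
    z3 = solve_dense(M, Az1)                                        # step (iii)
    return [z1[l] - sum(Z2[j][l] * z3[j] for j in range(k)) for l in range(n)]  # step (iv)
```

On a small instance the result can be checked by forming $F = A^TA + \delta\Delta + \eta I$ densely and verifying that the residual $Fx - A^Tb$ is near zero.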
(c) Here's the Matlab code:
clear all; close all;

n = 4000;
k = 100;
delta = 1;
eta = 1;

A = rand(k,n);
b = rand(k,1);

e = ones(n,1);
D = spdiags([-e 2*e -e],[-1 0 1], n,n);
D(1,1) = 1; D(n,n) = 1;
I = sparse(1:n,1:n,1);

F = A'*A + eta*I + delta*D;
P = eta*I + delta*D; % P is cheap to invert since it's tridiagonal
g = A'*b;

% Directly computing optimal solution
fprintf('\nComputing solution directly\n');
s1 = cputime;
x_gen = F\g;
s2 = cputime;
fprintf('Done (in %g sec)\n',s2-s1);

fprintf('\nComputing solution using efficient method\n');
% x_eff = P^{-1}g - P^{-1}A'(I + AP^{-1}A')^{-1}AP^{-1}g.
t1 = cputime;
Z_0 = P\[g A'];
z_1 = Z_0(:,1);
% z_2 = A*z_1;
Z_2 = Z_0(:,2:k+1);
z_3 = (sparse(1:k,1:k,1) + A*Z_2)\(A*z_1);
x_eff = z_1 - Z_2*z_3;
t2 = cputime;
fprintf('Done (in %g sec)\n',t2-t1);

fprintf('\nrelative error = %e\n',norm(x_eff-x_gen)/norm(x_gen));

8.6 Newton method for approximate total variation de-noising. Total variation de-noising is based on
the bi-criterion problem with the two objectives

\[ \|x - x^{\rm cor}\|_2, \qquad \phi_{\rm tv}(x) = \sum_{i=1}^{n-1} |x_{i+1} - x_i|. \]

Here $x^{\rm cor} \in \mathbf{R}^n$ is the (given) corrupted signal, $x \in \mathbf{R}^n$ is the de-noised signal to be
computed, and $\phi_{\rm tv}$ is the total variation function. This bi-criterion problem can be formulated as an
SOCP, or, by squaring the first objective, as a QP. In this problem we consider a method used to
approximately formulate the total variation de-noising problem as an unconstrained problem with twice
differentiable objective, for which Newton's method can be used.
We first observe that the Pareto optimal points for the bi-criterion total variation de-noising problem
can be found as the minimizers of the function

\[ \|x - x^{\rm cor}\|_2^2 + \mu \phi_{\rm tv}(x), \]

where $\mu \geq 0$ is a parameter. (Note that the Euclidean norm term has been squared here, and so is
twice differentiable.) In approximate total variation de-noising, we substitute a twice differentiable
approximation of the total variation function,

\[ \phi_{\rm atv}(x) = \sum_{i=1}^{n-1} \left( \sqrt{\epsilon^2 + (x_{i+1} - x_i)^2} - \epsilon \right), \]

for the total variation function $\phi_{\rm tv}$. Here $\epsilon > 0$ is a parameter that controls the level of
approximation. In approximate total variation de-noising, we use Newton's method to minimize

\[ \psi(x) = \|x - x^{\rm cor}\|_2^2 + \mu \phi_{\rm atv}(x). \]

(The parameters $\mu > 0$ and $\epsilon > 0$ are given.)

(a) Find expressions for the gradient and Hessian of $\psi$.
(b) Explain how you would exploit the structure of the Hessian to compute the Newton direction
for $\psi$ efficiently. (Your explanation can be brief.) Compare the approximate flop count for
your method with the flop count for a generic method that does not exploit any structure in
the Hessian of $\psi$.
(c) Implement Newton's method for approximate total variation de-noising. Get the corrupted
signal $x^{\rm cor}$ from the file approx_tv_denoising_data.m, and compute the de-noised signal $x^\star$,
using parameters $\epsilon = 0.001$, $\mu = 50$ (which are also in the file). Use line search parameters
$\alpha = 0.01$, $\beta = 0.5$, initial point $x^{(0)} = 0$, and stopping criterion $\lambda^2/2 \leq 10^{-8}$. Plot the
Newton decrement versus iteration, to verify asymptotic quadratic convergence. Plot the final
smoothed signal $x^\star$, along with the corrupted one $x^{\rm cor}$.

Solution.

(a) The gradient and Hessian of $\psi$ are most easily derived using the chain rule. We first start
with the expressions

\[ \nabla\psi(x) = 2(x - x^{\rm cor}) + \mu \nabla\phi_{\rm atv}(x), \qquad \nabla^2\psi(x) = 2I + \mu \nabla^2\phi_{\rm atv}(x), \]

so the challenge is to find the gradient and the Hessian of $\phi_{\rm atv}$. Let $f : \mathbf{R} \to \mathbf{R}$ denote the function

\[ f(u) = \sqrt{\epsilon^2 + u^2} - \epsilon, \]

which is our approximation of the absolute value function. Its first and second derivatives are

\[ f'(u) = u(\epsilon^2 + u^2)^{-1/2}, \qquad f''(u) = \epsilon^2(\epsilon^2 + u^2)^{-3/2}. \]
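As a quick sanity check on these two derivatives, one can compare them with central finite differences. The sketch below is only an illustration; the function names and the step size h are assumptions made for this example, not part of the original solution.

```python
import math


def f(u, eps):
    """Smooth approximation of |u|."""
    return math.sqrt(eps ** 2 + u ** 2) - eps


def f_prime(u, eps):
    """First derivative u (eps^2 + u^2)^(-1/2)."""
    return u / math.sqrt(eps ** 2 + u ** 2)


def f_second(u, eps):
    """Second derivative eps^2 (eps^2 + u^2)^(-3/2)."""
    return eps ** 2 / (eps ** 2 + u ** 2) ** 1.5


def central_diff(fun, u, h=1e-5):
    """Central difference approximation of fun'(u)."""
    return (fun(u + h) - fun(u - h)) / (2 * h)
```

For example, with eps = 0.1 and u = 0.3, central_diff(lambda s: f(s, 0.1), 0.3) should agree with f_prime(0.3, 0.1) to many digits, and likewise for f_second against the difference quotient of f_prime.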

We define $F : \mathbf{R}^{n-1} \to \mathbf{R}$ as

\[ F(u_1, \ldots, u_{n-1}) = \sum_{i=1}^{n-1} f(u_i), \]

i.e., $F(u)$ is the sum of the approximate absolute values of the components of $u$. Its gradient
and Hessian are

\[ \nabla F(u) = (f'(u_1), \ldots, f'(u_{n-1})), \qquad \nabla^2 F(u) = \mathbf{diag}(f''(u_1), \ldots, f''(u_{n-1})). \]

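We have $\phi_{\rm atv}(x) = F(Dx)$, where $D$ is the forward difference matrix

\[
D = \begin{bmatrix}
-1 & 1 & & & \\
 & -1 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & -1 & 1
\end{bmatrix} \in \mathbf{R}^{(n-1) \times n}.
\]

Using the chain rule, we obtain

\[ \nabla\phi_{\rm atv}(x) = D^T \nabla F(Dx), \qquad \nabla^2\phi_{\rm atv}(x) = D^T \nabla^2 F(Dx) D. \]

Putting it all together, we get

\[ \nabla\psi(x) = 2(x - x^{\rm cor}) + \mu D^T \nabla F(Dx), \qquad \nabla^2\psi(x) = 2I + \mu D^T \nabla^2 F(Dx) D. \]

Note that the Hessian is tridiagonal.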


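We can also write out the gradient and Hessian explicitly in terms of $x$, by components:

\[
\frac{\partial\psi}{\partial x_i} = \left\{ \begin{array}{ll}
2(x_1 - x^{\rm cor}_1) - \mu(x_2 - x_1)(\epsilon^2 + (x_2 - x_1)^2)^{-1/2} & i = 1 \\
2(x_i - x^{\rm cor}_i) - \mu(x_{i+1} - x_i)(\epsilon^2 + (x_{i+1} - x_i)^2)^{-1/2}
 + \mu(x_i - x_{i-1})(\epsilon^2 + (x_i - x_{i-1})^2)^{-1/2} & i = 2, \ldots, n-1 \\
2(x_n - x^{\rm cor}_n) + \mu(x_n - x_{n-1})(\epsilon^2 + (x_n - x_{n-1})^2)^{-1/2} & i = n.
\end{array} \right.
\]

The Hessian components are:

\[
\frac{\partial^2\psi}{\partial x_i \partial x_j} = \left\{ \begin{array}{ll}
2 + \mu\epsilon^2(\epsilon^2 + (x_2 - x_1)^2)^{-3/2} & i = j = 1 \\
2 + \mu\epsilon^2(\epsilon^2 + (x_i - x_{i-1})^2)^{-3/2} + \mu\epsilon^2(\epsilon^2 + (x_{i+1} - x_i)^2)^{-3/2} & i = j = 2, \ldots, n-1 \\
2 + \mu\epsilon^2(\epsilon^2 + (x_n - x_{n-1})^2)^{-3/2} & i = j = n \\
-\mu\epsilon^2(\epsilon^2 + (x_i - x_{i-1})^2)^{-3/2} & i - j = 1 \\
-\mu\epsilon^2(\epsilon^2 + (x_j - x_{j-1})^2)^{-3/2} & j - i = 1 \\
0 & \mbox{otherwise.}
\end{array} \right.
\]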

(b) Let $H = \nabla^2\psi(x)$ and $g = \nabla\psi(x)$. The Newton step is found by solving the linear equations
$H \Delta x_{\rm nt} = -g$. A generic method to solve the equations, that does not exploit structure in $H$,
costs $O(n^3)$ flops. Since $H$ is tridiagonal, it is banded with bandwidth $k = 1$. We can form its
Cholesky factor (which is lower bidiagonal), and carry out the forward and back substitutions,
in $O(nk^2)$ flops, which is the same as $O(n)$ flops. Thus, by exploiting the fact that $H$ is
tridiagonal, we can compute $\Delta x_{\rm nt}$ in $O(n)$ flops. Of course, that's much faster than $O(n^3)$
flops. (The memory requirement drops from $O(n^2)$ to $O(n)$.)
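To make the $O(n)$ Newton step concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the reference implementation: the names solve_tridiag, psi, newton_step and denoise are hypothetical, the tridiagonal system is solved with the Thomas algorithm (no pivoting, which is safe here because the Hessian is symmetric positive definite and diagonally dominant) rather than a Cholesky factorization, and the initial point is $x = 0$.

```python
import math


def solve_tridiag(lower, diag, upper, rhs):
    """Thomas algorithm: O(n) tridiagonal solve without pivoting."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    denom = diag[0]
    for i in range(n):
        if i > 0:
            denom = diag[i] - lower[i - 1] * c[i - 1]
            d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
        else:
            d[i] = rhs[i] / denom
        if i < n - 1:
            c[i] = upper[i] / denom
    x = d[:]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def psi(x, xcor, mu, eps):
    """Objective ||x - xcor||_2^2 + mu * phi_atv(x)."""
    fit = sum((a - b) ** 2 for a, b in zip(x, xcor))
    atv = sum(math.sqrt(eps ** 2 + (x[i + 1] - x[i]) ** 2) - eps
              for i in range(len(x) - 1))
    return fit + mu * atv


def newton_step(x, xcor, mu, eps):
    """Return the Newton step v and lambda^2 = -grad^T v using an O(n) solve."""
    n = len(x)
    d = [x[i + 1] - x[i] for i in range(n - 1)]               # d = D x
    f1 = [di / math.sqrt(eps ** 2 + di ** 2) for di in d]      # f'(d_i)
    f2 = [eps ** 2 / (eps ** 2 + di ** 2) ** 1.5 for di in d]  # f''(d_i)
    grad = [2.0 * (x[i] - xcor[i]) for i in range(n)]
    diag = [2.0] * n
    for i in range(n - 1):
        grad[i] -= mu * f1[i]        # contribution of D^T grad F(Dx)
        grad[i + 1] += mu * f1[i]
        diag[i] += mu * f2[i]        # diagonal of D^T diag(f'') D
        diag[i + 1] += mu * f2[i]
    off = [-mu * w for w in f2]      # off-diagonal of the Hessian
    v = solve_tridiag(off, diag, off, [-gi for gi in grad])
    lam2 = -sum(gi * vi for gi, vi in zip(grad, v))
    return v, lam2


def denoise(xcor, mu, eps, alpha=0.01, beta=0.5, tol=1e-8, max_iters=100):
    """Newton's method with backtracking line search, started at x = 0."""
    x = [0.0] * len(xcor)
    for _ in range(max_iters):
        v, lam2 = newton_step(x, xcor, mu, eps)
        if lam2 / 2 <= tol:
            break
        val, t = psi(x, xcor, mu, eps), 1.0
        while psi([xi + t * vi for xi, vi in zip(x, v)], xcor, mu, eps) > val - alpha * t * lam2:
            t *= beta
        x = [xi + t * vi for xi, vi in zip(x, v)]
    return x
```

The Hessian is never formed as a dense matrix; only its diagonal and one off-diagonal are stored, which is where the $O(n)$ memory figure comes from.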
(c) The only trick to implementing Newton's method is to be sure that Matlab exploits the fact
that $H$ is tridiagonal. This can be done by defining the Hessian $H$ to be a sparse matrix.
Once this is done, the Newton step can be computed extremely efficiently.
The following Matlab code implements Newton's method.
% Newton method for approximate total variation de-noising
%
% problem data
approx_tv_denoising_data;
D = spdiags([-1*ones(n,1) ones(n,1)], 0:1, n-1, n);

% Newton's method
ALPHA = 0.01;
BETA = 0.5;
MAXITERS = 100;
NTTOL = 1e-10;

x = zeros(n,1);
newt_dec = [];

for iter = 1:MAXITERS
    d = (D*x);
    val = (x-xcor)'*(x-xcor) + ...
        MU*sum(sqrt(EPSILON^2+d.^2)-EPSILON*ones(n-1,1));
    grad = 2*(x - xcor) + ...
        MU*D'*(d./sqrt(EPSILON^2+d.^2));
    hess = 2*speye(n) + ...
        MU*D'*spdiags(EPSILON^2*(EPSILON^2+d.^2).^(-3/2),0,n-1,n-1)*D;

    v = -hess\grad;
    lambdasqr = -grad'*v; newt_dec = [newt_dec sqrt(lambdasqr)];

    if (lambdasqr/2) < NTTOL, break; end;

    t = 1;
    while ((x+t*v-xcor)'*(x+t*v-xcor) + ...
            MU*sum(sqrt(EPSILON^2+(D*(x+t*v)).^2)-EPSILON*ones(n-1,1)) > ...
            val - ALPHA*t*lambdasqr )
        t = BETA*t;
    end;
    x = x+t*v;
end;

% plotting results
figure;
semilogy([1:iter],newt_dec,'o-');
xlabel('iter'); ylabel('newtdec'); grid on;

figure;
time = 1:5000;
plot(time,x,time,xcor,':');
xlabel('t'); legend('x(t)','xcor(t)');
We start our algorithm from x = 0. The resulting Newton decrement versus iteration is
plotted below. High accuracy is obtained in 14 steps, with quadratic convergence observed in
steps 10 through 14.
[Figure: Newton decrement (newtdec, log scale) versus iteration (iter).]

The final smoothed signal (shown in solid line type, in blue), along with the corrupted one
(shown in dotted line type, green), is shown below.

[Figure: de-noised signal x(t) and corrupted signal xcor(t) versus t.]

The approximate total variation de-noising has performed well, more or less removing the
noise spikes as well as the faster smaller noise, while preserving the jump discontinuity in the
signal.
Although we didn't ask you to do it, it's interesting to compare approximate total variation
de-noising with Tikhonov smoothing, with objective

\[ \|x - x^{\rm cor}\|_2^2 + \mu \|Dx\|_2^2. \]

Here we have a trade-off between preserving the jump discontinuity and attenuating noise
spikes in the signal. We got the best jump preservation and some spike attenuation for
$\mu = 250$, which gives the signal shown below.

[Figure: Tikhonov smoothing with mu = 250; xtikh(t) and xcor(t) versus t.]

To achieve a similar level of spike attenuation as the one obtained by approximate total
variation de-noising, we choose $\mu = 20000$, which gives the signal shown below. The spikes
are more or less removed, but we have significantly distorted the jump discontinuity.

[Figure: Tikhonov smoothing with mu = 20000; xtikh(t) and xcor(t) versus t.]

8.7 Derive the Newton equation for the unconstrained minimization problem

\[ \mbox{minimize} \quad (1/2)x^T x + \log \sum_{i=1}^m \exp(a_i^T x + b_i). \]

Give an efficient method for solving the Newton system, assuming the matrix $A \in \mathbf{R}^{m \times n}$ (with
rows $a_i^T$) is dense with $m \ll n$. Give an approximate flop count of your method.

Solution. The Hessian is

\[ H = I + A^T \left( \mathbf{diag}(z) - zz^T \right) A, \]

where

\[ z_i = \frac{\exp(a_i^T x + b_i)}{\sum_k \exp(a_k^T x + b_k)}, \]

so $H$ is diagonal plus a low rank term, and we can more or less follow the method of page 10-30
of the lecture notes. However $\mathbf{diag}(z) - zz^T$ is singular, since $(\mathbf{diag}(z) - zz^T)\mathbf{1} = 0$, so we cannot
directly factor it using the Cholesky factorization. Note that

\[ \mathbf{diag}(z) - zz^T = L \, \mathbf{diag}(z)^{-1} L^T \]

where $L = \mathbf{diag}(z) - zz^T$. The Newton system

\[ \left( I + A^T (\mathbf{diag}(z) - zz^T) A \right) \Delta x = -g \]

is therefore equivalent to

\[
\begin{bmatrix} I & A^T L \\ L^T A & -\mathbf{diag}(z) \end{bmatrix}
\begin{bmatrix} \Delta x \\ u \end{bmatrix} =
\begin{bmatrix} -g \\ 0 \end{bmatrix}.
\]

Eliminating $\Delta x$ gives an equation

\[ (\mathbf{diag}(z) + L^T A A^T L) u = -L^T A g \]

with $m$ variables.
The cost is roughly $(1/3)m^3 + m^2 n$ flops.
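The factorization identity used above relies on $\mathbf{1}^T z = 1$. A small numerical check, written as a hedged pure-Python sketch (the helper names softmax, matmul and factor_check are made up for this illustration), confirms both the identity and the singularity of $\mathbf{diag}(z) - zz^T$ for a given vector of exponents $s_i = a_i^T x + b_i$.

```python
import math


def softmax(s):
    """z_i = exp(s_i) / sum_k exp(s_k), computed with a shift for stability."""
    smax = max(s)
    e = [math.exp(si - smax) for si in s]
    total = sum(e)
    return [ei / total for ei in e]


def matmul(P, Q):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(P[i][l] * Q[l][j] for l in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def factor_check(s):
    """Return (z, max |L diag(z)^{-1} L^T - L|, row sums of L) for L = diag(z) - z z^T."""
    z = softmax(s)
    m = len(z)
    L = [[(z[i] if i == j else 0.0) - z[i] * z[j] for j in range(m)] for i in range(m)]
    Dinv = [[(1.0 / z[i] if i == j else 0.0) for j in range(m)] for i in range(m)]
    Lt = [list(col) for col in zip(*L)]
    R = matmul(matmul(L, Dinv), Lt)
    err = max(abs(R[i][j] - L[i][j]) for i in range(m) for j in range(m))
    row_sums = [sum(row) for row in L]  # L * 1, which should vanish
    return z, err, row_sums
```

Both the returned error and the row sums should be at the level of rounding error, for any input vector s.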

8.8 We consider the equality constrained problem

\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{tr}(CX) - \log\det X \\
\mbox{subject to} & \mathbf{diag}(X) = \mathbf{1}.
\end{array}
\]

The variable is the matrix $X \in \mathbf{S}^n$. The domain of the objective function is $\mathbf{S}^n_{++}$. The matrix
$C \in \mathbf{S}^n$ is a problem parameter. This problem is similar to the analytic centering problem discussed
in lecture 11 (p.18–19) and pages 553-555 of the textbook. The differences are the extra linear term
$\mathbf{tr}(CX)$ in the objective, and the special form of the equality constraints. (Note that the equality
constraints can be written as $\mathbf{tr}(A_i X) = 1$ with $A_i = e_i e_i^T$, a matrix of zeros except for the $i, i$
element, which is equal to one.)

(a) Show that $X$ is optimal if and only if

\[ X \succ 0, \qquad X^{-1} - C \mbox{ is diagonal}, \qquad \mathbf{diag}(X) = \mathbf{1}. \]

(b) The Newton step $\Delta X$ at a feasible $X$ is defined as the solution of the Newton equations

\[ X^{-1} \Delta X X^{-1} + \mathbf{diag}(w) = -C + X^{-1}, \qquad \mathbf{diag}(\Delta X) = 0, \]

with variables $\Delta X \in \mathbf{S}^n$, $w \in \mathbf{R}^n$. (Note the two meanings of the diag function: $\mathbf{diag}(w)$ is
the diagonal matrix with the vector $w$ on its diagonal; $\mathbf{diag}(X)$ is the vector of the diagonal
elements of $X$.) Eliminating $\Delta X$ from the first equation gives an equation

\[ \mathbf{diag}(X \, \mathbf{diag}(w) X) = \mathbf{1} - \mathbf{diag}(XCX). \]

This is a set of $n$ linear equations in $n$ variables, so it can be written as $Hw = g$. Give a
simple expression for the coefficients of the matrix $H$.
(c) Implement the feasible Newton method in Matlab. You can use $X = I$ as starting point. The
code should terminate when $\lambda(X)^2/2 \leq 10^{-6}$, where $\lambda(X)$ is the Newton decrement.
You can use the Cholesky factorization to evaluate the cost function: if $X = LL^T$ where $L$ is
triangular with positive diagonal then $\log\det X = 2\sum_i \log L_{ii}$.
To ensure that the iterates remain feasible, the line search has to consist of two phases. Starting
at $t = 1$, you first need to backtrack until $X + t\Delta X \succ 0$. Then you continue the backtracking
until the condition of sufficient decrease

\[ f_0(X + t\Delta X) \leq f_0(X) + \alpha t \, \mathbf{tr}(\nabla f_0(X) \Delta X) \]

is satisfied. To check that a matrix $X + t\Delta X$ is positive definite, you can use the Cholesky
factorization with two output arguments ([R, p] = chol(A) returns p > 0 if A is not positive
definite).
Test your code on randomly generated problems of sizes $n = 10, \ldots, 100$ (for example, using
n = 100; C = randn(n); C = C + C').

Solution.

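(a) The Lagrangian is

\[
\begin{array}{rcl}
L(X, w) & = & \mathbf{tr}(CX) - \log\det X + w^T \mathbf{diag}(X) - \mathbf{1}^T w \\
 & = & \mathbf{tr}(CX) - \log\det X + \mathbf{tr}(\mathbf{diag}(w) X) - \mathbf{1}^T w,
\end{array}
\]

and its gradient with respect to $X$ is

\[ \nabla_X L(X, w) = C - X^{-1} + \mathbf{diag}(w). \]

Therefore the optimality conditions are:

\[ X \in \mathbf{dom}\, f_0, \qquad X^{-1} - C - \mathbf{diag}(w) = 0, \qquad \mathbf{diag}(X) = \mathbf{1}. \]

(b) Eliminating $\Delta X$ from the first equation gives

\[ \Delta X = -XCX + X - X \, \mathbf{diag}(w) X. \]

Substituting this in the second equation gives

\[ \mathbf{diag}(\Delta X) = -\mathbf{diag}(XCX) + \mathbf{diag}(X) - \mathbf{diag}(X \, \mathbf{diag}(w) X) = 0. \]

Since $X$ is assumed to be feasible, we can simplify this as

\[ \mathbf{diag}(X \, \mathbf{diag}(w) X) = \mathbf{1} - \mathbf{diag}(XCX). \]

To write this as a linear equation $Hw = g$, we note that

\[ \left( \mathbf{diag}(X \, \mathbf{diag}(w) X) \right)_i = \sum_{j=1}^n X_{ij}^2 w_j, \qquad i = 1, \ldots, n. \]

Therefore $H_{ij} = X_{ij}^2$, i.e., $H$ is obtained from $X$ by squaring its components.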
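The identity behind $H_{ij} = X_{ij}^2$ is easy to confirm numerically. The following pure-Python sketch (function names are hypothetical, chosen only for this illustration) evaluates the diagonal of $X \, \mathbf{diag}(w) X$ directly and via the elementwise square of a symmetric $X$.

```python
def diag_of_X_diagw_X(X, w):
    """Diagonal of X diag(w) X, computed entry by entry."""
    n = len(X)
    return [sum(X[i][k] * w[k] * X[k][i] for k in range(n)) for i in range(n)]


def squared_entries_times(X, w):
    """H w with H_ij = X_ij^2 (elementwise square of X)."""
    n = len(X)
    return [sum(X[i][j] ** 2 * w[j] for j in range(n)) for i in range(n)]
```

For symmetric X the two functions return the same vector up to rounding, which is why the Matlab solution can form the coefficient matrix as X.^2.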


(c) A Matlab implementation:
maxiters = 50;
alpha = 0.01;
beta = 0.5;
tol = 1e-6;

X = eye(n);
for iter = 1:maxiters

    R = chol(X); % X = R' * R.
    val = C(:)'*X(:) - 2*sum(log(diag(R)));

    % Solve
    %
    % inv(X) * dX * inv(X) + diag(y) = inv(X) - C
    % diag(dX) = 0
    %
    % using block elimination:
    %
    % (X.^2) * y = 1 - diag(X*C*X)
    % dX = X - X * C * X - X*diag(y)*X.

    y = X.^2 \ ( 1 - diag(X*C*X) );
    dX = X - X * (C + diag(y)) * X;

    % fprime = -trace( dX * X^-1 * dX * X^-1 )
    %        = -|| R^{-T} * dX * R^{-1} ||_F^2

    fprime = -norm(R' \ dX / R, 'fro')^2;

    if -fprime/2 <= tol; break; end;
    t = 1;
    [R, p] = chol(X + t*dX);
    while p
        t = beta*t;
        [R, p] = chol(X + t*dX);
    end;
    while ( -2*sum(log(diag(R))) + C(:)'*(X(:)+t*dX(:)) > ...
            val + alpha*t*fprime )
        t = beta*t;
        R = chol(X + t*dX);
    end;
    X = X + t * dX;

end;

8.9 Estimation of a vector from one-bit measurements. A system of $m$ sensors is used to estimate an
unknown parameter $x \in \mathbf{R}^n$. Each sensor makes a noisy measurement of some linear combination
of the unknown parameters, and quantizes the measured value to one bit: it returns $+1$ if the
measured value exceeds a certain threshold, and $-1$ otherwise. In other words, the output of
sensor $i$ is given by

\[
y_i = \mathbf{sign}(a_i^T x + v_i - b_i) = \left\{ \begin{array}{rl}
1 & a_i^T x + v_i \geq b_i \\
-1 & a_i^T x + v_i < b_i,
\end{array} \right.
\]

where $a_i$ and $b_i$ are known, and $v_i$ is measurement error. We assume that the measurement errors
$v_i$ are independent random variables with a zero-mean unit-variance Gaussian distribution (i.e.,
with a probability density $\phi(v) = (1/\sqrt{2\pi}) e^{-v^2/2}$). As a consequence, the sensor outputs $y_i$ are
random variables with possible values $\pm 1$. We will denote $\mathbf{prob}(y_i = 1)$ as $P_i(x)$ to emphasize that
it is a function of the unknown parameter $x$:

\[
\begin{array}{rcl}
P_i(x) = \mathbf{prob}(y_i = 1) = \mathbf{prob}(a_i^T x + v_i \geq b_i)
 & = & \displaystyle \frac{1}{\sqrt{2\pi}} \int_{b_i - a_i^T x}^{\infty} e^{-t^2/2} \, dt \\
1 - P_i(x) = \mathbf{prob}(y_i = -1) = \mathbf{prob}(a_i^T x + v_i < b_i)
 & = & \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{b_i - a_i^T x} e^{-t^2/2} \, dt.
\end{array}
\]

The problem is to estimate $x$, based on observed values $y_1, y_2, \ldots, y_m$ of the $m$ sensor outputs.
We will apply the maximum likelihood (ML) principle to determine an estimate $\hat x$. In maximum
likelihood estimation, we calculate $\hat x$ by maximizing the log-likelihood function

\[
l(x) = \log \left( \prod_{y_i = 1} P_i(x) \prod_{y_i = -1} (1 - P_i(x)) \right)
 = \sum_{y_i = 1} \log P_i(x) + \sum_{y_i = -1} \log(1 - P_i(x)).
\]

(a) Show that the maximum likelihood estimation problem

\[ \mbox{maximize} \quad l(x) \]

is a convex optimization problem. The variable is $x$. The measured vector $y$, and the
parameters $a_i$ and $b_i$ are given.
(b) Solve the ML estimation problem with data defined in one_bit_meas_data.m, using Newton's
method with backtracking line search. This file will define a matrix $A$ (with rows $a_i^T$), a vector
$b$, and a vector $y$ with elements $\pm 1$.

Remark. The Matlab functions erfc and erfcx are useful to evaluate the following functions:

\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-t^2/2} \, dt = \frac{1}{2} \mathrm{erfc}\left(-\frac{u}{\sqrt 2}\right), \qquad
\frac{1}{\sqrt{2\pi}} \int_{u}^{\infty} e^{-t^2/2} \, dt = \frac{1}{2} \mathrm{erfc}\left(\frac{u}{\sqrt 2}\right)
\]
\[
\frac{1}{\sqrt{2\pi}} e^{u^2/2} \int_{-\infty}^{u} e^{-t^2/2} \, dt = \frac{1}{2} \mathrm{erfcx}\left(-\frac{u}{\sqrt 2}\right), \qquad
\frac{1}{\sqrt{2\pi}} e^{u^2/2} \int_{u}^{\infty} e^{-t^2/2} \, dt = \frac{1}{2} \mathrm{erfcx}\left(\frac{u}{\sqrt 2}\right).
\]
Solution.

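(a) The problem is

\[ \mbox{minimize} \quad -\sum_{y_i = 1} \log \Phi(-b_i + a_i^T x) - \sum_{y_i = -1} \log \Phi(b_i - a_i^T x), \]

where

\[ \Phi(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-t^2/2} \, dt. \]

$\Phi(u)$ is log-concave (it is the cumulative distribution function of a log-concave density; see
exercise T3.55). Therefore $\Phi(a_i^T x - b_i)$ and $\Phi(b_i - a_i^T x)$ are log-concave functions of $x$, and the
objective is convex.
(b) To simplify notation we redefine $A$ and $b$ as

\[ A := \mathbf{diag}(y) A, \qquad b := \mathbf{diag}(y) b. \]

This allows us to express the problem as

\[ \mbox{minimize} \quad h(Ax - b), \]

where $h : \mathbf{R}^m \to \mathbf{R}$ is defined as

\[ h(w) = -\sum_{i=1}^m \log \Phi(w_i). \]

The gradient and Hessian of $f(x) = h(Ax - b)$ are given by

\[ \nabla f(x) = A^T \nabla h(Ax - b), \qquad \nabla^2 f(x) = A^T \nabla^2 h(Ax - b) A. \]

The first derivatives of $h$ are

\[
\frac{\partial h(w)}{\partial w_i} = -\frac{\Phi'(w_i)}{\Phi(w_i)}
 = -\frac{1/\sqrt{2\pi}}{\exp(w_i^2/2) \Phi(w_i)}.
\]

The Hessian $\nabla^2 h(w)$ is diagonal with diagonal elements

\[
\frac{\partial^2 h(w)}{\partial w_i^2}
 = -\frac{\Phi''(w_i)}{\Phi(w_i)} + \frac{\Phi'(w_i)^2}{\Phi(w_i)^2}
 = \frac{w_i/\sqrt{2\pi}}{\exp(w_i^2/2) \Phi(w_i)} + \left( \frac{1/\sqrt{2\pi}}{\exp(w_i^2/2) \Phi(w_i)} \right)^2.
\]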
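These two formulas can be checked against finite differences with a few lines of Python. The sketch below is an illustration only: the names Phi, ratio, h, grad_h and hess_diag_h are assumptions for this example, and it uses math.erfc directly, since the Python standard library has no erfcx. For very negative arguments a scaled function such as Matlab's erfcx is the more robust choice.

```python
import math


def Phi(w):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-w / math.sqrt(2))


def ratio(w):
    """Phi'(w) / Phi(w), i.e. the normal density divided by the CDF."""
    return math.exp(-w * w / 2) / math.sqrt(2 * math.pi) / Phi(w)


def h(w):
    """h(w) = -sum_i log Phi(w_i)."""
    return -sum(math.log(Phi(wi)) for wi in w)


def grad_h(w):
    """Components -Phi'(w_i) / Phi(w_i)."""
    return [-ratio(wi) for wi in w]


def hess_diag_h(w):
    """Diagonal Hessian entries w_i r_i + r_i^2 with r_i = Phi'(w_i) / Phi(w_i)."""
    return [wi * ratio(wi) + ratio(wi) ** 2 for wi in w]
```

Every entry returned by hess_diag_h is positive, which matches the convexity argument in part (a).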

In the following Matlab implementation we take the least-squares solution as starting point.
one_bit_meas_data;
[m,n] = size(A);
A = diag(y)*A;
b = y.*b;
x = A\b;
for k=1:50
    w = A*x-b;
    Phi = 0.5*erfc( -w/sqrt(2) );
    Phix = 0.5*sqrt(2*pi) * erfcx(-w/sqrt(2));
    val = -sum(log(Phi));
    grad = -A'*(1./Phix);
    hess = A'*diag((w + 1./Phix)./Phix)*A;
    v = -hess\grad;
    fprime = grad'*v
    if (-fprime/2 < 1e-8), break; end;
    t = 1;
    while ( -sum(log(0.5*erfc(-(A*(x+t*v)-b)/sqrt(2)))) > ...
            val + 0.01*t*fprime )
        t = t/2;
    end;
    x = x + t*v;
end;

This converges in a few iterations to

x = (0.27, 9.15, 7.98, 6.70, 6.02, 5.0, 4.30, 2.68, 2.02, 0.68)

as shown in the plot.

[Figure: convergence plot (log scale) versus iteration k.]

8.10 Functions with bounded Newton decrement. Let $f : \mathbf{R}^n \to \mathbf{R}$ be a convex function with
$\nabla^2 f(x) \succ 0$ for all $x \in \mathbf{dom}\, f$ and Newton decrement bounded by a positive constant $c$:

\[ \lambda(x)^2 \leq c \qquad \forall x \in \mathbf{dom}\, f. \]

Show that the function $g(x) = \exp(-f(x)/c)$ is concave.

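Solution. The gradient and Hessian of $g$ are

\[
\begin{array}{rcl}
\nabla g(x) & = & \displaystyle -\frac{e^{-f(x)/c}}{c} \nabla f(x) \\
\nabla^2 g(x) & = & \displaystyle -\frac{e^{-f(x)/c}}{c} \nabla^2 f(x) + \frac{e^{-f(x)/c}}{c^2} \nabla f(x) \nabla f(x)^T.
\end{array}
\]

The Hessian is negative semidefinite if

\[ \nabla^2 f(x) - \frac{1}{c} \nabla f(x) \nabla f(x)^T \succeq 0. \]

Using Schur complements this can be written in the equivalent form

\[
\begin{bmatrix} \nabla^2 f(x) & \nabla f(x) \\ \nabla f(x)^T & c \end{bmatrix} \succeq 0.
\]

By another application of the Schur complement theorem, this is equivalent to

\[ c - \nabla f(x)^T \nabla^2 f(x)^{-1} \nabla f(x) \geq 0, \]

i.e., $\lambda(x)^2 \leq c$, which holds by assumption.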
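A one-dimensional instance makes the statement concrete. The choice $f(x) = -\log x$ below is an example picked for illustration; it does not appear in the original exercise.

```latex
% Worked example (illustrative choice of f): f(x) = -\log x on dom f = R_{++}.
\[
f'(x) = -\frac{1}{x}, \qquad f''(x) = \frac{1}{x^2}, \qquad
\lambda(x)^2 = \frac{f'(x)^2}{f''(x)} = 1 \quad \mbox{for all } x > 0 .
\]
% So the bound holds with c = 1, and
\[
g(x) = \exp(-f(x)/c) = \exp(\log x) = x ,
\]
% which is linear, hence concave. With the looser bound c = 2 we get
\[
g(x) = \exp\left(\tfrac{1}{2}\log x\right) = \sqrt{x} ,
\]
% which is again concave on R_{++}.
```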

8.11 Monotone convergence of Newton's method. Suppose $f : \mathbf{R} \to \mathbf{R}$ is strongly convex and smooth,
and in addition, $f''' \leq 0$. Let $x^\star$ minimize $f$, and suppose Newton's method is initialized with
$x^{(0)} < x^\star$. Show that the iterates $x^{(k)}$ converge to $x^\star$ monotonically, and that a backtracking line
search always takes a step size of one, i.e., $t^{(k)} = 1$.

9 Interior point methods
9.1 Dual feasible point from analytic center. We consider the problem

\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \leq 0, \quad i = 1, \ldots, m,
\end{array} \qquad (40)
\]

where the functions $f_i$ are convex and differentiable. For $u > p^\star$, define $x_{\rm ac}(u)$ as the analytic
center of the inequalities

\[ f_0(x) \leq u, \qquad f_i(x) \leq 0, \quad i = 1, \ldots, m, \]

i.e.,

\[ x_{\rm ac}(u) = \mathop{\rm argmin} \left( -\log(u - f_0(x)) - \sum_{i=1}^m \log(-f_i(x)) \right). \]

Show that $\lambda \in \mathbf{R}^m$, defined by

\[ \lambda_i = \frac{u - f_0(x_{\rm ac}(u))}{-f_i(x_{\rm ac}(u))}, \qquad i = 1, \ldots, m, \]

is dual feasible for the problem above. Express the corresponding dual objective value in terms of
$u$, $x_{\rm ac}(u)$ and the problem parameters.
Solution. Setting the gradient of the barrier function equal to zero we get

\[
\frac{1}{u - f_0(x_{\rm ac}(u))} \nabla f_0(x_{\rm ac}(u))
 + \sum_{i=1}^m \frac{1}{-f_i(x_{\rm ac}(u))} \nabla f_i(x_{\rm ac}(u)) = 0.
\]

This shows that $x_{\rm ac}(u)$ minimizes the Lagrangian

\[ L(x, \lambda) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) \]

for

\[ \lambda_i = \frac{u - f_0(x_{\rm ac}(u))}{-f_i(x_{\rm ac}(u))}. \]

The corresponding dual objective value is

\[ L(x_{\rm ac}(u), \lambda) = f_0(x_{\rm ac}(u)) - m(u - f_0(x_{\rm ac}(u))). \]
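The last expression follows in one line from the definition of $\lambda$; the short derivation below spells out that step and the resulting duality gap, using only quantities defined in the exercise.

```latex
% Each term of the Lagrangian sum collapses to the same constant:
\[
\lambda_i f_i(x_{\rm ac}(u)) = \frac{u - f_0(x_{\rm ac}(u))}{-f_i(x_{\rm ac}(u))} \, f_i(x_{\rm ac}(u))
 = -(u - f_0(x_{\rm ac}(u))), \qquad i = 1, \ldots, m,
\]
% so summing over i gives
\[
L(x_{\rm ac}(u), \lambda) = f_0(x_{\rm ac}(u)) - m (u - f_0(x_{\rm ac}(u))),
\]
% and the gap between the primal value at x_ac(u) and this dual value is
\[
f_0(x_{\rm ac}(u)) - L(x_{\rm ac}(u), \lambda) = m (u - f_0(x_{\rm ac}(u))).
\]
```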

9.2 Efficient solution of Newton equations. Explain how you would solve the Newton equations in the
barrier method applied to the quadratic program

\[
\begin{array}{ll}
\mbox{minimize} & (1/2)x^T x + c^T x \\
\mbox{subject to} & Ax \preceq b
\end{array}
\]

where $A \in \mathbf{R}^{m \times n}$ is dense. Distinguish two cases, $m \gg n$ and $n \gg m$, and give the most efficient
method in each case.

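Solution. The Newton equations for

\[ \mbox{minimize} \quad t((1/2)x^T x + c^T x) - \sum_i \log(b_i - a_i^T x) \]

are

\[ (tI + A^T \mathbf{diag}(d)^2 A) \Delta x = -t(x + c) - A^T d \]

where $d_k = 1/(b_k - a_k^T x)$.
If $m \gg n$ we can form this Newton system and solve it using the Cholesky factorization, at a cost
of roughly $mn^2 + (1/3)n^3$ operations.
If $m \ll n$, it is more efficient to first write the equations as

\[
\begin{bmatrix} tI & A^T \\ A & -\mathbf{diag}(d)^{-2} \end{bmatrix}
\begin{bmatrix} \Delta x \\ v \end{bmatrix} =
\begin{bmatrix} -t(x + c) - A^T d \\ 0 \end{bmatrix}
\]

and then eliminate $\Delta x$:

\[ \left( \mathbf{diag}(d)^{-2} + \frac{1}{t} AA^T \right) v = -A(x + c) - \frac{1}{t} AA^T d. \]

This can be solved at a cost of roughly $nm^2 + (1/3)m^3$ operations.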

9.3 Efficient solution of Newton equations. Describe an efficient method for solving the Newton
equation in the barrier method for the quadratic program

\[
\begin{array}{ll}
\mbox{minimize} & (1/2)(x - a)^T P^{-1} (x - a) \\
\mbox{subject to} & 0 \preceq x \preceq \mathbf{1},
\end{array}
\]

with variable $x \in \mathbf{R}^n$. The matrix $P \in \mathbf{S}^n$ and the vector $a \in \mathbf{R}^n$ are given.
Assume that the matrix $P$ is large, positive definite, and sparse, and that $P^{-1}$ is dense. 'Efficient'
means that the complexity of the method should be much less than $O(n^3)$.
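Solution. The barrier function is

\[ \phi(x) = -\sum_{i=1}^n \log x_i - \sum_{i=1}^n \log(1 - x_i), \]

and its derivatives are

\[
\nabla\phi(x) = -\mathbf{diag}(x)^{-1} \mathbf{1} + \mathbf{diag}(\mathbf{1} - x)^{-1} \mathbf{1}, \qquad
\nabla^2\phi(x) = \mathbf{diag}(x)^{-2} + \mathbf{diag}(\mathbf{1} - x)^{-2}.
\]

The Newton equation is

\[ (tP^{-1} + D) \Delta x = -g \]

where $D = \nabla^2\phi(x)$, a positive diagonal matrix, and $g = tP^{-1}(x - a) + \nabla\phi(x)$. To take advantage
of the structure in $P^{-1}$, we introduce $y = D\Delta x$ and rewrite the equation as

\[
\begin{bmatrix} tP^{-1} & I \\ I & -D^{-1} \end{bmatrix}
\begin{bmatrix} \Delta x \\ y \end{bmatrix} =
\begin{bmatrix} -g \\ 0 \end{bmatrix}.
\]

Eliminating $\Delta x$ from the first equation gives

\[ (tD^{-1} + P) y = -Pg, \qquad \Delta x = -\frac{1}{t} P(g + y). \]

If $P$ is sparse, then $D^{-1} + (1/t)P$ is sparse, with the same sparsity pattern as $P$. The equation in
$y$ can therefore be solved efficiently via a sparse Cholesky factorization. The computation of $\Delta x$
requires a sparse matrix-vector multiplication.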

9.4 Dual feasible point from incomplete centering. Consider the SDP

\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{1}^T x \\
\mbox{subject to} & W + \mathbf{diag}(x) \succeq 0,
\end{array}
\]

with variable $x \in \mathbf{R}^n$, and its dual

\[
\begin{array}{ll}
\mbox{maximize} & -\mathbf{tr}\, WZ \\
\mbox{subject to} & Z_{ii} = 1, \quad i = 1, \ldots, n \\
 & Z \succeq 0,
\end{array}
\]

with variable $Z \in \mathbf{S}^n$. (These problems arise in a relaxation of the two-way partitioning problem,
described on page 219; see also exercises 5.39 and 11.23.)
Standard results for the barrier method tell us that when $x$ is on the central path, i.e., minimizes
the function

\[ \phi(x) = t \mathbf{1}^T x + \log\det(W + \mathbf{diag}(x))^{-1} \]

for some parameter $t > 0$, the matrix

\[ Z = \frac{1}{t} (W + \mathbf{diag}(x))^{-1} \]

is dual feasible, with objective value $-\mathbf{tr}\, WZ = \mathbf{1}^T x - n/t$.
Now suppose that $x$ is strictly feasible, but not necessarily on the central path. (For example, $x$
might be the result of using Newton's method to minimize $\phi$, but with early termination.) Then
the matrix $Z$ defined above will not be dual feasible. In this problem we will show how to construct
a dual feasible $\hat Z$ (which agrees with $Z$ as given above when $x$ is on the central path), from any
point $x$ that is near the central path. Define $X = W + \mathbf{diag}(x)$, and let $v = -\nabla^2\phi(x)^{-1} \nabla\phi(x)$ be
the Newton step for the function $\phi$ defined above. Define

\[ \hat Z = \frac{1}{t} \left( X^{-1} - X^{-1} \mathbf{diag}(v) X^{-1} \right). \]

(a) Verify that when $x$ is on the central path, we have $\hat Z = Z$.
(b) Show that $\hat Z_{ii} = 1$, for $i = 1, \ldots, n$.
(c) Let $\lambda(x)^2 = \nabla\phi(x)^T \nabla^2\phi(x)^{-1} \nabla\phi(x)$ be the squared Newton decrement at $x$. Show that

\[ \lambda(x)^2 = \mathbf{tr}(X^{-1} \mathbf{diag}(v) X^{-1} \mathbf{diag}(v)) = \mathbf{tr}\left( (X^{-1/2} \mathbf{diag}(v) X^{-1/2})^2 \right). \]

(d) Show that $\lambda(x) < 1$ implies that $\hat Z \succ 0$. Thus, when $x$ is near the central path (meaning,
$\lambda(x) < 1$), $\hat Z$ is dual feasible.

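Solution.

(a) When $x$ is on the central path, we have $v = 0$, so $\hat Z = Z$.
(b) $\nabla^2\phi(x) v = -\nabla\phi(x) = -t\mathbf{1} + \mathbf{diag}(X^{-1})$, and $(\nabla^2\phi(x))_{ik} = (X^{-1})_{ik}^2$, hence

\[
\begin{array}{rcl}
\hat Z_{ii} & = & \displaystyle \frac{1}{t} \left( (X^{-1})_{ii} - \sum_{k=1}^n (X^{-1})_{ik} v_k (X^{-1})_{ik} \right) \\
 & = & \displaystyle \frac{1}{t} \left( (X^{-1})_{ii} - \sum_{k=1}^n (X^{-1})_{ik}^2 v_k \right) \\
 & = & \displaystyle \frac{1}{t} \left( (X^{-1})_{ii} - (\nabla^2\phi(x) v)_i \right) \\
 & = & 1.
\end{array}
\]

(c) $\lambda(x)^2 = v^T \nabla^2\phi(x) v = \sum_{ij} v_i v_j (X^{-1})_{ij}^2 = \mathbf{tr}\, X^{-1} \mathbf{diag}(v) X^{-1} \mathbf{diag}(v)$.
(d) From $\lambda(x)^2 < 1$ and part (c), all eigenvalues of $(X^{-1/2} \mathbf{diag}(v) X^{-1/2})^2$ are less than one.
Therefore the eigenvalues of $X^{-1/2} \mathbf{diag}(v) X^{-1/2}$ are less than one in absolute value, i.e.,

\[ I - X^{-1/2} \mathbf{diag}(v) X^{-1/2} \succ 0. \]

Since $\hat Z = (1/t) X^{-1/2} (I - X^{-1/2} \mathbf{diag}(v) X^{-1/2}) X^{-1/2}$, it follows that $\hat Z \succ 0$.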

9.5 Standard form LP barrier method. In the following three parts of this exercise, you will implement
a barrier method for solving the standard form LP

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax = b, \quad x \succeq 0,
\end{array}
\]

with variable $x \in \mathbf{R}^n$, where $A \in \mathbf{R}^{m \times n}$, with $m < n$. Throughout these exercises we will assume
that $A$ is full rank, and the sublevel sets $\{x \mid Ax = b, \; x \succeq 0, \; c^T x \leq \gamma\}$ are all bounded. (If this is
not the case, the centering problem is unbounded below.)

(a) Centering step. Implement Newton's method for solving the centering problem

\[
\begin{array}{ll}
\mbox{minimize} & c^T x - \sum_{i=1}^n \log x_i \\
\mbox{subject to} & Ax = b,
\end{array}
\]

with variable $x$, given a strictly feasible starting point $x_0$.
Your code should accept $A$, $b$, $c$, and $x_0$, and return $x^\star$, the primal optimal point, $\nu^\star$, a dual
optimal point, and the number of Newton steps executed.
Use the block elimination method to compute the Newton step. (You can also compute the
Newton step via the KKT system, and compare the result to the Newton step computed via
block elimination. The two steps should be close, but if any $x_i$ is very small, you might get a
warning about the condition number of the KKT matrix.)
Plot $\lambda^2/2$ versus iteration $k$, for various problem data and initial points, to verify that your
implementation gives asymptotic quadratic convergence. As stopping criterion, you can use
$\lambda^2/2 \leq 10^{-6}$. Experiment with varying the algorithm parameters $\alpha$ and $\beta$, observing the effect
on the total number of Newton steps required, for a fixed problem instance. Check that your
computed $x^\star$ and $\nu^\star$ (nearly) satisfy the KKT conditions.

To generate some random problem data (i.e., $A$, $b$, $c$, $x_0$), we recommend the following
approach. First, generate $A$ randomly. (You might want to check that it has full rank.) Then
generate a random positive vector $x_0$, and take $b = Ax_0$. (This ensures that $x_0$ is strictly
feasible.) The parameter $c$ can be chosen randomly. To be sure the sublevel sets are bounded,
you can add a row to $A$ with all positive elements. If you want to be able to repeat a run with
the same problem data, be sure to set the state for the uniform and normal random number
generators.
Here are some hints that may be useful.
• We recommend computing $\lambda^2$ using the formula $\lambda^2 = -\Delta x_{\rm nt}^T \nabla f(x)$. You don't really need
$\lambda$ for anything; you can work with $\lambda^2$ instead. (This is important for reasons described
below.)
• There can be small numerical errors in the Newton step $\Delta x_{\rm nt}$ that you compute. When
$x$ is nearly optimal, the computed value of $\lambda^2$, i.e., $\lambda^2 = -\Delta x_{\rm nt}^T \nabla f(x)$, can actually be
(slightly) negative. If you take the square root to get $\lambda$, you'll get a complex number,
and you'll never recover. Moreover, your line search will never exit. However, this only
happens when $x$ is nearly optimal. So if you exit on the condition $\lambda^2/2 \leq 10^{-6}$, everything
will be fine, even when the computed value of $\lambda^2$ is negative.
• For the line search, you must first multiply the step size $t$ by $\beta$ until $x + t\Delta x_{\rm nt}$ is feasible
(i.e., strictly positive). If you don't, when you evaluate $f$ you'll be taking the logarithm
of negative numbers, and you'll never recover.
(b) LP solver with strictly feasible starting point. Using the centering code from part (a),
implement a barrier method to solve the standard form LP

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax = b, \quad x \succeq 0,
\end{array}
\]

with variable $x \in \mathbf{R}^n$, given a strictly feasible starting point $x_0$. Your LP solver should take
as argument $A$, $b$, $c$, and $x_0$, and return $x^\star$.
You can terminate your barrier method when the duality gap, as measured by $n/t$, is smaller
than $10^{-3}$. (If you make the tolerance much smaller, you might run into some numerical
trouble.) Check your LP solver against the solution found by CVX*, for several problem
instances.
The comments in part (a) on how to generate random data hold here too.
Experiment with the parameter $\mu$ to see the effect on the number of Newton steps per centering
step, and the total number of Newton steps required to solve the problem.
Plot the progress of the algorithm, for a problem instance with $n = 500$ and $m = 100$, showing
duality gap (on a log scale) on the vertical axis, versus the cumulative total number of Newton
steps (on a linear scale) on the horizontal axis.
Your algorithm should return a $2 \times k$ matrix history, (where $k$ is the total number of centering
steps), whose first row contains the number of Newton steps required for each centering step,
and whose second row shows the duality gap at the end of each centering step. In order to
get a plot that looks like the ones in the book (e.g., figure 11.4, page 572), you should use the
following code:
[xx, yy] = stairs(cumsum(history(1,:)),history(2,:));
semilogy(xx,yy);

(c) LP solver. Using the code from part (b), implement a general standard form LP solver, that
takes arguments $A$, $b$, $c$, determines (strict) feasibility, and returns an optimal point if the
problem is (strictly) feasible.
You will need to implement a phase I method, that determines whether the problem is strictly
feasible, and if so, finds a strictly feasible point, which can then be fed to the code from
part (b). In fact, you can use the code from part (b) to implement the phase I method.
To find a strictly feasible initial point $x_0$, we solve the phase I problem

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & Ax = b \\
 & x \succeq (1 - t)\mathbf{1}, \quad t \geq 0,
\end{array}
\]

with variables $x$ and $t$. If we can find a feasible $(x, t)$, with $t < 1$, then $x$ is strictly feasible for
the original problem. The converse is also true, so the original LP is strictly feasible if and
only if $t^\star < 1$, where $t^\star$ is the optimal value of the phase I problem.
We can initialize $x$ and $t$ for the phase I problem with any $x^0$ satisfying $Ax^0 = b$, and
$t^0 = 2 - \min_i x^0_i$. (Here we can assume that $\min_i x^0_i \leq 0$; otherwise $x^0$ is already a strictly
feasible point, and we are done.) You can use a change of variable $z = x + (t - 1)\mathbf{1}$ to transform
the phase I problem into the form in part (b).
Check your LP solver against CVX* on several numerical examples, including both feasible
and infeasible instances.
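The change of variable in part (c) can be written out explicitly. With $z = x + (t - 1)\mathbf{1}$ we have $Ax = Az - (t - 1)A\mathbf{1}$, so the phase I problem becomes a standard form LP in $(z, t) \succeq 0$ with equality constraint $[A \;\; -A\mathbf{1}](z, t) = b - A\mathbf{1}$ and cost vector $(0, \ldots, 0, 1)$. The pure-Python sketch below assembles that data; the function name phase1_setup and its return layout are assumptions made for this illustration.

```python
def phase1_setup(A, b, x0):
    """Build standard-form data for the phase I LP in the stacked variable (z, t).

    A is a list of m rows, b a list of length m, and x0 any point with A x0 = b.
    Returns (A_tilde, b_tilde, c_tilde, y0), where y0 = (z0, t0) is strictly
    positive whenever min(x0) <= 0 (the case assumed in the exercise).
    """
    m, n = len(A), len(x0)
    t0 = 2.0 - min(x0)                          # t0 = 2 - min_i x0_i
    z0 = [xi + (t0 - 1.0) for xi in x0]         # z0 = x0 + (t0 - 1) 1
    A_ones = [sum(row) for row in A]            # the vector A 1
    A_tilde = [row[:] + [-A_ones[i]] for i, row in enumerate(A)]  # [A, -A 1]
    b_tilde = [b[i] - A_ones[i] for i in range(m)]                # b - A 1
    c_tilde = [0.0] * n + [1.0]                 # objective: minimize t
    return A_tilde, b_tilde, c_tilde, z0 + [t0]
```

The returned data can be passed to a solver for part (b); the original $x$ is then recovered as $x = z - (t - 1)\mathbf{1}$, and the LP is strictly feasible if and only if the optimal $t$ is below one.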

Solution.

(a) The Newton step $\Delta x_{\rm nt}$ is defined by the KKT system:

\[
\begin{bmatrix} H & A^T \\ A & 0 \end{bmatrix}
\begin{bmatrix} \Delta x_{\rm nt} \\ w \end{bmatrix} =
\begin{bmatrix} -g \\ 0 \end{bmatrix},
\]

where $H = \mathbf{diag}(1/x_1^2, \ldots, 1/x_n^2)$, and $g = c - (1/x_1, \ldots, 1/x_n)$. The KKT system can be
efficiently solved by block elimination, i.e., by solving

\[ AH^{-1}A^T w = -AH^{-1} g, \]

and setting $\Delta x_{\rm nt} = -H^{-1}(A^T w + g)$. The KKT optimality condition is

\[ A^T \nu^\star + c - (1/x_1^\star, \ldots, 1/x_n^\star) = 0. \]

When the Newton method converges, i.e., $\Delta x_{\rm nt} \approx 0$, $w$ is the dual optimal point $\nu^\star$.
The following functions compute the analytic center using Newton's method. Here is the
function in MATLAB.

function [x_star, nu_star, lambda_hist] = lp_acent(A,b,c,x_0)
% solves problem
% minimize c'*x - sum(log(x))
% subject to A*x = b
% using Newton's method, given strictly feasible starting point x0
% input (A, b, c, x_0)
% returns primal and dual optimal points
% lambda_hist is a vector showing lambda^2/2 for each newton step
% returns [], [] if MAXITERS reached, or x_0 not feasible

% algorithm parameters
ALPHA = 0.01;
BETA = 0.5;
EPSILON = 1e-3;
MAXITERS = 100;

if (min(x_0) <= 0) || (norm(A*x_0 - b) > 1e-3) % x0 not feasible
    fprintf('FAILED');
    nu_star = []; x_star = []; lambda_hist=[];
    return;
end

m = length(b);
n = length(x_0);

x = x_0; lambda_hist = [];

for iter = 1:MAXITERS
    H = diag(x.^(-2));
    g = c - x.^(-1);
    % lines below compute newton step via whole KKT system
    % M = [ H A'; A zeros(m,m)];
    % d = M\[-g; zeros(m,1)];
    % dx = d(1:n);
    % w = d(n+1:end);

    % newton step by elimination method
    w = (A*diag(x.^2)*A')\(-A*diag(x.^2)*g);
    dx = -diag(x.^2)*(A'*w + g);

    lambdasqr = -g'*dx; % dx'*H*dx;
    lambda_hist = [lambda_hist lambdasqr/2];
    if lambdasqr/2 <= EPSILON break; end

    % backtracking line search
    % first bring the point inside the domain
    t = 1; while min(x+t*dx) <= 0 t = BETA*t; end
    % now do backtracking line search
    while c'*(t*dx)-sum(log(x+t*dx))+sum(log(x))-ALPHA*t*g'*dx> 0
        t = BETA*t;
    end
    x = x + t*dx;
end

if iter == MAXITERS % MAXITERS reached
    fprintf('ERROR: MAXITERS reached.\n');
    x_star = []; nu_star = [];
else
    x_star = x;
    nu_star = w;
end
Here is the function in Python.
# solves the problem
#   minimize c^T x - sum(log(x))
#   subject to Ax = b
# using Newton's method, given strictly feasible starting point x_0
# returns
#   x_star: primal optimal point
#   nu_star: dual optimal point
#   lambda_hist: array of lambda^2/2 for each newton step
# x_star and nu_star will be empty arrays
# if max iterations reached or x_0 is not feasible
import numpy as np

def lp_acent(A, b, c, x_0):
    # parameters
    A = np.asarray(A, dtype=float)
    x_0 = np.asarray(x_0, dtype=float).reshape(-1)
    b = np.asarray(b, dtype=float).reshape(-1)
    c = np.asarray(c, dtype=float).reshape(-1)
    ALPHA = 0.01
    BETA = 0.5
    EPSILON = 1e-3
    MAXITERS = 100

    lambda_hist = np.array([])

    if min(x_0) <= 0 or np.linalg.norm(A @ x_0 - b) > EPSILON:
        print("ERROR: x_0 not feasible")
        return np.array([]), np.array([]), lambda_hist

    x = x_0.copy()
    for iter_num in range(MAXITERS):
        g = c - x ** (-1)

        # newton step by elimination method
        X = np.diag(x ** 2)  # inverse of the Hessian H = diag(x.^(-2))
        w = np.linalg.lstsq(A @ X @ A.T, -A @ X @ g, rcond=None)[0]
        dx = -X @ (A.T @ w + g)

        lambdasqr = -np.dot(g, dx)
        lambda_hist = np.append(lambda_hist, lambdasqr / 2)

        if lambdasqr / 2 <= EPSILON:
            return x, w, lambda_hist

        # backtracking line search
        t = 1.0
        # first bring the point inside the domain
        while min(x + t * dx) <= 0:
            t *= BETA
        # backtracking line search
        while (t * np.dot(c, dx) - np.sum(np.log(x + t * dx))
               + np.sum(np.log(x)) - ALPHA * t * np.dot(g, dx)) > 0:
            t *= BETA
        x += t * dx
    print("ERROR: MAXITERS reached")
    return np.array([]), np.array([]), lambda_hist
Here is the function in Julia.
# solves the problem
#   minimize    c'*x - sum(log(x))
#   subject to  A*x = b
# using Newton's method, given strictly feasible starting point x0
# returns
#   x_star : primal optimal point
#   nu_star : dual optimal point
#   lambda_hist : array of lambda^2/2 for each newton step
# x_star and nu_star will be empty arrays
# if max iterations reached or x_0 is not feasible
function lp_acent(A, b, c, x_0)
    # parameters
    ALPHA = 0.01
    BETA = 0.5
    EPSILON = 1e-3
    MAXITERS = 100

    lambda_hist = Float64[]

    # check feasibility of x0
    if minimum(x_0) <= 0 || norm(A*x_0 - b) > EPSILON
        println("ERROR: x_0 not feasible")
        return Float64[], Float64[], lambda_hist
    else
        m = length(b)
        n = length(x_0)
        x = x_0
        for iter = 1:MAXITERS
            H = diagm(x.^(-2))
            g = c - x.^(-1)
            # lines below compute newton step via whole KKT system
            # M = [H A'; A zeros(m,m)];
            # d = M\[-g; zeros(m,1)];
            # dx = d(1:n);
            # w = d(n+1:end);
            # newton step by elimination method
            X = diagm(x.^2)
            w = (A*X*A')\(-A*X*g)
            dx = -X*(A'*w + g)
            lambdasqr = -dot(g, dx)  # dx'*H*dx;
            push!(lambda_hist, lambdasqr/2)
            if lambdasqr/2 <= EPSILON
                # return the current iterate and multiplier; x_star and
                # nu_star are not yet defined in this iteration's scope
                return x, w, lambda_hist
            end
            # backtracking line search
            # first bring the point inside the domain
            t = 1
            while minimum(x + t*dx) <= 0
                t *= BETA
            end
            # now do backtracking line search
w h i l e t dot ( c , dx ) sum ( l o g ( x+t dx ) ) + sum ( l o g ( x ) ) ALPHA t dot ( g , dx ) >
                t *= BETA
            end
            x += t*dx
            x_star = x
            nu_star = w
        end
        println("ERROR: MAXITERS reached")
        return Float64[], Float64[], lambda_hist
    end
end
The random data is generated as given in the problem statement, with $A \in \mathbf{R}^{100\times 500}$. The
Newton decrement versus number of Newton steps is plotted below. Quadratic convergence
is clear. The Newton directions computed by the two methods are very close. The KKT
optimality conditions are verified for the points returned by the function.
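
The claim that the two ways of computing the Newton direction agree can be checked with a short sketch like the one below. It is only an illustration under stated assumptions: the function names `newton_step_kkt` and `newton_step_elim`, the problem size, and the random data are hypothetical and not taken from the solution code; it uses plain NumPy arrays in Python 3 rather than the `np.matrix` objects of the listing above.

```python
import numpy as np

def newton_step_kkt(A, c, x):
    """Newton step for  minimize c^T x - sum(log x)  s.t. Ax = b,
    computed by solving the full KKT system (hypothetical helper)."""
    m, n = A.shape
    H = np.diag(x ** -2.0)            # Hessian of the objective
    g = c - 1.0 / x                   # gradient of the objective
    M = np.block([[H, A.T], [A, np.zeros((m, m))]])
    d = np.linalg.solve(M, np.concatenate([-g, np.zeros(m)]))
    return d[:n], d[n:]

def newton_step_elim(A, c, x):
    """Same step computed by block elimination (hypothetical helper)."""
    g = c - 1.0 / x
    AX = A * x**2                     # equals A @ diag(x^2) by broadcasting
    w = np.linalg.solve(AX @ A.T, -AX @ g)
    dx = -(x**2) * (A.T @ w + g)
    return dx, w

# small random instance (assumed data, for illustration only)
rng = np.random.default_rng(0)
m, n = 10, 40
A = rng.standard_normal((m, n))
x = rng.random(n) + 0.1               # strictly positive point
c = rng.standard_normal(n)

dx1, w1 = newton_step_kkt(A, c, x)
dx2, w2 = newton_step_elim(A, c, x)
step_diff = np.linalg.norm(dx1 - dx2)
print("difference between the two Newton steps:", step_diff)
```

Both routines solve the same linear equations, so any difference is due to rounding; the elimination variant only factors an m-by-m matrix.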

[Figure: Newton decrement lambdasqr/2 (log scale) versus Newton iteration number (iters).]

(b) The following functions solve the LP using the barrier method. Here is the function in MATLAB.
function [x_star, history, gap] = lp_barrier(A,b,c,x_0)
% solves standard form LP
% minimize c^T x
% subject to Ax = b, x >=0;
% using barrier method, given strictly feasible x0
% uses function std_form_LP_acent() to carry out centering steps
% returns:
% - primal optimal point x_star
% - history, a 2xk matrix that returns number of newton steps
% in each centering step (top row) and duality gap (bottom row)
% (k is total number of centering steps)
% - gap, optimal duality gap

% barrier method parameters


T_0 = 1;
MU = 20;
EPSILON = 1e-3; % duality gap stopping criterion

n = length(x_0);
t = T_0;
x = x_0;
history = [];

while(1)
[x_star, nu_star, lambda_hist] = lp_acent(A,b,t*c,x);
x = x_star;
gap = n/t;
history = [history [length(lambda_hist); gap]];
if gap < EPSILON break; end
t = MU*t;
end

Here is the function in Python.


# solves standard form LP
#   minimize    c^T x
#   subject to  Ax = b, x >= 0;
# using barrier method, given strictly feasible x0
# uses function lp_acent() to carry out centering steps
# returns
#   x_star : primal optimal point x_star
#   gap : optimal duality gap
#   num_newton_steps : array of number of newton steps per iteration
#   duality_gaps : array of duality gap each iteration
# x_star will be empty array if the centering steps are infeasible or
# reach maximum iterations
def lp_barrier(A, b, c, x_0):
    # parameters
    T_0 = 1
    MU = 20
    EPSILON = 1e-3  # duality gap stopping criterion
    n = len(x_0)
    t = T_0
    x = x_0
    num_newton_steps = list()
    duality_gaps = list()
    gap = float(n)/t
    while True:
        x_star, nu_star, lambda_hist = lp_acent(A, b, t*c, x)
        # invalid centering step
        if len(x_star) == 0:
            return np.array([]), gap, num_newton_steps, duality_gaps
        x = x_star
        gap = float(n)/t
        num_newton_steps.append(len(lambda_hist))
        duality_gaps.append(gap)
        if gap < EPSILON:
            return x_star, gap, num_newton_steps, duality_gaps
        t *= MU
Here is the function in Julia.
# solves standard form LP
#   minimize    c^T x
#   subject to  Ax = b, x >= 0;
# using barrier method, given strictly feasible x0
# uses function lp_acent() to carry out centering steps
# returns
#   x_star : primal optimal point x_star
#   gap : optimal duality gap
#   num_newton_steps : array of number of newton steps per iteration
#   duality_gaps : array of duality gap each iteration
# x_star will be empty array if the centering steps are infeasible or
# reach maximum iterations
function lp_barrier(A, b, c, x_0)
    # barrier method parameters
    T_0 = 1
    MU = 20
    EPSILON = 1e-3  # duality gap stopping criterion
    n = length(x_0)
    t = T_0
    x = x_0
    num_newton_steps = Int64[]
    duality_gaps = Float64[]
    gap = n/t  # defined before the loop so the early return below can use it
    while (true)
        x_star, nu_star, lambda_hist = lp_acent(A, b, t*c, x)
        # invalid centering step
        if length(x_star) == 0
            return Float64[], gap, num_newton_steps, duality_gaps
        end
        x = x_star
        gap = n/t
        push!(num_newton_steps, length(lambda_hist))
        push!(duality_gaps, gap)
        if gap < EPSILON
            return x_star, gap, num_newton_steps, duality_gaps
        end
        t *= MU
    end
end
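
Because each centering step multiplies t by MU and the method stops once n/t drops below EPSILON, the number of centering steps is fixed in advance by n, T_0, MU, and EPSILON. The following Python sketch counts them; the function name `num_centering_steps` is a hypothetical helper for illustration, not part of the solution code.

```python
def num_centering_steps(n, t0=1.0, mu=20.0, eps=1e-3):
    """Count the centering steps of the barrier loop: center at t,
    stop if n/t < eps, otherwise multiply t by mu and repeat."""
    t = t0
    steps = 1                 # the first centering step is always carried out
    while n / t >= eps:       # duality gap after centering at the current t
        t *= mu
        steps += 1
    return steps

steps = num_centering_steps(200)   # n = 200 as in the test scripts
print(steps)
```

With the parameter values used in the listings and n = 200, the loop performs six centering steps, and the final duality gap is 200/20^5.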
The following scripts generate test data and plot the progress of the barrier method. The
scripts also check the computed solution against CVX*. Here is the script in MATLAB.

% script that generates data and tests the functions
% std_form_LP_acent
% std_form_LP_barrier

clear all;
m = 10;
n = 200;

rand('seed',0);
randn('seed',0);
A = [randn(m-1,n); ones(1,n)];
x_0 = rand(n,1) + 0.1;
b = A*x_0;
c = randn(n,1);

% analytic centering
figure
[x_star, nu_star, lambda_hist] = lp_acent(A,b,c,x_0);
semilogy(lambda_hist,'bo-')
xlabel('iters')
ylabel('lambdasqr/2')
print -depsc lp_acent_newtondec.eps

% solve the LP with barrier
figure
[x_star, history, gap] = lp_barrier(A,b,c,x_0);
[xx, yy] = stairs(cumsum(history(1,:)),history(2,:));
semilogy(xx,yy,'bo-');
xlabel('iters')
ylabel('gap')
print -depsc lp_barrier_iters.eps

p_star = c'*x_star;

% solve LP using cvx for comparison
cvx_begin
    variable x(n)
    minimize(c'*x)
    subject to
        A*x == b
        x >= 0
cvx_end

fprintf('\n\nOptimal value found by barrier method:\n');
p_star
fprintf('Optimal value found by CVX:\n');
cvx_optval
fprintf('Duality gap from barrier method:\n');
gap
Here is the script in Python.
## tests lp_acent and lp_barrier
m = 10
n = 200
np.random.seed(2)
A = np.vstack((np.random.randn(m-1, n), np.ones((1, n))))
A = np.matrix(A)
x_0 = np.random.rand(n, 1) + 0.1;
b = A*x_0;
c = np.random.randn(n, 1);

# test lp_acent
x_star, nu_star, lambda_hist = lp_acent(A, b, c, x_0);
plt.semilogy(range(1, len(lambda_hist)+1), lambda_hist)
plt.show()

# test lp_barrier
x_star, gap, num_newton_steps, duality_gaps = lp_barrier(A, b, c, x_0);
plt.figure()
plt.step(np.cumsum(num_newton_steps), duality_gaps, where='post')
plt.yscale('log')
plt.show()

# compare to CVXPY
x = Variable(n)
obj = Minimize(sum_entries(mul_elemwise(c, x)))
prob = Problem(obj, [A*x == b, x >= 0])
prob.solve()
p r i n t o p t i m a l v a l u e from b a r r i e r method : , np . dot ( c . r e s h a p e ( n ) , x s t a r . r e s h a p e
print 'optimal value from CVXPY:', prob.value;
print 'duality gap from barrier method:', gap;
Here is the script in Julia.
# tests lp_acent and lp_barrier
using Gadfly, Convex, SCS
m = 10;
n = 200;
srand(1);
A = [randn(m-1,n); ones(1,n)];
x_0 = rand(n) + 0.1;
b = A*x_0;
c = randn(n);

# analytic centering
x_star, nu_star, lambda_hist = lp_acent(A, b, c, x_0);
p1 = plot(
    x=1:length(lambda_hist), y=lambda_hist, Geom.path, Geom.point,
    Guide.xlabel("iters"), Guide.ylabel("lambda^2/2"),
    Scale.y_log10
);
display(p1);

# solve LP with barrier
x_star, gap, num_newton_steps, duality_gaps = lp_barrier(A, b, c, x_0);
p2 = plot(
    x=cumsum(num_newton_steps), y=duality_gaps, Geom.step, Geom.point,
    Guide.xlabel("iters"), Guide.ylabel("gap"),
    Scale.y_log10
);
display(p2);
p_star = dot(c, x_star);

# solve LP with Convex.jl
x = Variable(n);
p = minimize(dot(c, x), A*x == b, x >= 0);
solve!(p, SCSSolver(max_iters=100000));
println("optimal value from barrier method: ", p_star);
println("optimal value from Convex.jl: ", p.optval);
println("duality gap from barrier method: ", gap);

[Figure: duality gap (log scale) versus cumulative number of Newton steps (iters).]

(c) The following functions implement the full LP solver (phase I and phase II). Here is the
function in MATLAB.
function [x_star,p_star,gap,status,nsteps] = lp_solve(A,b,c);
% solves the LP
% minimize c^T x
% subject to Ax = b, x >= 0;
% using a barrier method
% computes a strictly feasible point by carrying out
% a phase I method
% returns:
% - a primal optimal point x_star
% - the primal optimal value p_star
% - status: either 'Infeasible' or 'Solved'
% - nsteps(1): number of newton steps for phase I
% - nsteps(2): number of newton steps for phase II

[m,n] = size(A);
nsteps = zeros(2,1);

% phase I
x0 = A\b; t0 = 2+max(0,-min(x0));
A1 = [A,-A*ones(n,1)];
b1 = b-A*ones(n,1);
z0 = x0+t0*ones(n,1)-ones(n,1);
c1 = [zeros(n,1);1];
[z_star, history, gap] = lp_barrier(A1,b1,c1,[z0;t0]);
if (z_star(n+1) >= 1)
    fprintf('\nProblem is infeasible\n');
    x_star = []; p_star = Inf; status = 'Infeasible';
    nsteps(1) = sum(history(1,:)); gap = [];
    return;
end
fprintf('\nFeasible point found\n');
nsteps(1) = sum(history(1,:));
x_0 = z_star(1:n)-z_star(n+1)*ones(n,1)+ones(n,1);

% phase II
[x_star, history, gap] = lp_barrier(A,b,c,x_0);
status = 'Solved'; p_star = c'*x_star;
nsteps(2) = sum(history(1,:));
Here is the function in Python.
# solves the LP
#   minimize    c^T x
#   subject to  Ax = b, x >= 0
# using a barrier method
# computes a strictly feasible point by carrying out a phase I method
# returns:
#   x_star : primal optimal point
#   p_star : primal optimal value
#   gap : optimal duality gap
#   status : either 'Infeasible' or 'Optimal'
#   nsteps : array of number of newton steps for phase I and phase II
def lp_solve(A, b, c):
    m, n = A.shape
    b = b.reshape(m, 1)
    nsteps = np.zeros(2)
    # phase I
    x0 = np.linalg.lstsq(A, b)[0];
    t0 = 2 + max(0, -min(x0))
    A1 = np.hstack((A, -np.dot(A, np.ones(n)).reshape(m, 1)))
    b1 = b - np.dot(A, np.ones(n)).reshape(m, 1)
    z0 = x0.reshape(n, 1) + t0*np.ones((n, 1)) - np.ones((n, 1))
    c1 = np.vstack((np.zeros((n, 1)), np.ones((1, 1))))
    x_0 = np.vstack((z0, t0)).reshape(n+1, 1)
    z_star, gap, num_newton_steps, duality_gaps = lp_barrier(A1, b1, c1, x_0)
    # z_star = mat[z_star]
    nsteps[0] = sum(num_newton_steps)
    if len(z_star) == 0:
        print 'Phase I: problem is infeasible'
        return np.array([]), np.inf, np.inf, 'Infeasible', nsteps

    print 'Phase I: feasible point found'
    x_0 = z_star[:n] - z_star[n][0]*np.ones((n, 1)) + np.ones((n, 1))

    # phase II
    x_star, gap, num_newton_steps, duality_gaps = lp_barrier(A, b, c, x_0)
    nsteps[1] = sum(num_newton_steps)
    if len(x_star) == 0:
        return np.array([]), np.inf, np.inf, 'Infeasible', nsteps

    p_star = np.dot(c.reshape(len(c)), x_star.reshape(len(c)))
    return x_star, p_star, gap, 'Optimal', nsteps
Here is the function in Julia.
# solves the LP
#   minimize    c^T x
#   subject to  Ax = b, x >= 0
# using a barrier method
# computes a strictly feasible point by carrying out a phase I method
# returns:
#   x_star : primal optimal point
#   p_star : primal optimal value
#   gap : optimal duality gap
#   status : either :Infeasible or :Optimal
#   nsteps : array of number of newton steps for phase I and phase II
function lp_solve(A, b, c)
    m, n = size(A)
    nsteps = zeros(2)
    # phase I
    x0 = A\b
    t0 = 2 + max(0, -minimum(x0))
    A1 = [A -A*ones(n)]
    b1 = b - A*ones(n)
    z0 = x0 + t0*ones(n) - ones(n)
    c1 = [zeros(n); 1]
    z_star, gap, num_newton_steps, duality_gaps = lp_barrier(A1, b1, c1, [z0; t0])
    nsteps[1] = sum(num_newton_steps)
    if z_star[n+1] >= 1
        println("Phase I: problem is infeasible")
        return Float64[], Inf, Inf, :Infeasible, nsteps
    end
    println("Phase I: feasible point found")
    x_0 = z_star[1:n] - z_star[n+1]*ones(n) + ones(n)
    # phase II
    x_star, gap, num_newton_steps, duality_gaps = lp_barrier(A, b, c, x_0)
    nsteps[2] = sum(num_newton_steps)
    if length(x_star) == 0
        return Float64[], Inf, Inf, :Infeasible, nsteps
    end
    p_star = dot(c, x_star)
    return x_star, p_star, gap, :Optimal, nsteps
end
We test our LP solver on two problem instances, one infeasible, and one feasible. We check
our results against the output of CVX*. Here is the script in MATLAB.
% solves standard form LP for two problem instances

clear all;
m = 100;
n = 500;

% infeasible problem instance
rand('state',0);
randn('state',0);
A = [randn(m-1,n); ones(1,n)];
b = randn(m,1);
c = randn(n,1);

[x_star,p_star,gap,status,nsteps] = lp_solve(A,b,c);

% solve LP using cvx for comparison
cvx_begin
    variable x(n)
    minimize(c'*x)
    subject to
        A*x == b
        x >= 0
cvx_end

% feasible problem instance
A = [randn(m-1,n); ones(1,n)];
v = rand(n,1) + 0.1;
b = A*v;
c = randn(n,1);

[x_star,p_star,gap,status,nsteps] = lp_solve(A,b,c);

% solve LP using cvx for comparison
cvx_begin
    variable x(n)
    minimize(c'*x)
    subject to
        A*x == b
        x >= 0
cvx_end

fprintf('\n\nOptimal value found by barrier method:\n');
p_star
fprintf('Optimal value found by CVX:\n');
cvx_optval
fprintf('Duality gap from barrier method:\n');
gap
Here is the script in Python.
## tests lp_solve for infeasible and feasible problem
m = 100
n = 500

# infeasible problem
A = np.vstack((np.random.randn(m-1, n), np.ones((1, n))))
b = np.random.randn(m);
c = np.random.randn(n, 1);
x_star, p_star, gap, status, nsteps = lp_solve(A, b, c);

# compare to CVXPY
x = Variable(n)
obj = Minimize(sum_entries(mul_elemwise(c, x)))
prob = Problem(obj, [A*x == b, x >= 0])
prob.solve()

print 'status from lp_solver:', status
print 'status from CVXPY:', prob.status
print

# feasible problem
A = np.vstack((np.random.randn(m-1, n), np.ones((1, n))))
v = np.random.rand(n) + 0.1
b = np.dot(A, v)
c = np.random.randn(n)
x_star, p_star, gap, status, nsteps = lp_solve(A, b, c);

# compare to CVXPY
x = Variable(n)
obj = Minimize(sum_entries(mul_elemwise(c, x)))
prob = Problem(obj, [A*x == b, x >= 0])
prob.solve()
p r i n t o p t i m a l v a l u e from b a r r i e r method : , np . dot ( c . r e s h a p e ( n ) , x s t a r . r e s h a p e
print 'optimal value from CVXPY:', prob.value;
print 'duality gap from barrier method:', gap;
Here is the script in Julia.
# tests lp_solve for infeasible and feasible problem
m = 100;
n = 500;

# infeasible problem
srand(1);
A = [randn(m-1,n); ones(1,n)];
b = randn(m);
c = randn(n);
x_star, p_star, gap, status, nsteps = lp_solve(A, b, c);
# compare to Convex.jl
x = Variable(n)
p = minimize(dot(c, x), A*x == b, x >= 0)
solve!(p, SCSSolver(verbose=false));

println("status from lp_solver: ", status)
println("status from Convex.jl: ", p.status)

# feasible problem
A = [randn(m-1,n); ones(1,n)];
v = rand(n) + 0.1;
b = A*v;
c = randn(n);
x_star, p_star, gap, status, nsteps = lp_solve(A, b, c);
# compare to Convex.jl
x = Variable(n)
p = minimize(dot(c, x), A*x == b, x >= 0)
solve!(p, SCSSolver(max_iters=100000, verbose=false));
println("optimal value from barrier method: ", p_star);
println("optimal value from Convex.jl: ", p.optval);
println("duality gap from barrier method: ", gap);

9.6 Primal and dual feasible points in the barrier method for LP. Consider a standard form LP and its
dual
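\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax = b \\
& x \succeq 0
\end{array}
\qquad
\begin{array}{ll}
\mbox{maximize} & b^T y \\
\mbox{subject to} & A^T y \preceq c,
\end{array}
\]
with $A \in \mathbf{R}^{m\times n}$ and $\mathop{\bf rank}(A) = m$. In the barrier method the (feasible) Newton method is applied
to the equality constrained problem
\[
\begin{array}{ll}
\mbox{minimize} & t c^T x + \phi(x) \\
\mbox{subject to} & Ax = b,
\end{array}
\]
where $t > 0$ and $\phi(x) = -\sum_{i=1}^n \log x_i$. The Newton equation at a strictly feasible $\bar x$
is given by
\[
\left[\begin{array}{cc} \nabla^2 \phi(\bar x) & A^T \\ A & 0 \end{array}\right]
\left[\begin{array}{c} \Delta x \\ w \end{array}\right]
=
\left[\begin{array}{c} -tc - \nabla\phi(\bar x) \\ 0 \end{array}\right].
\]
Suppose $\lambda(\bar x) \leq 1$ where $\lambda(\bar x)$ is the Newton decrement at $\bar x$.

(a) Show that $\bar x + \Delta x$ is primal feasible.
(b) Show that $y = -(1/t)w$ is dual feasible.
(c) Let $p^\star$ be the optimal value of the LP. Show that
\[
c^T \bar x - p^\star \leq \frac{n + \lambda(\bar x)\sqrt{n}}{t}.
\]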

Solution.

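(a) $\bar x + \Delta x$ satisfies the equality constraint: $A(\bar x + \Delta x) = A\bar x + A\Delta x = b$. It is nonnegative because,
from the first equation,
\[
\lambda(\bar x)^2 = \Delta x^T \nabla^2\phi(\bar x) \Delta x = \sum_{i=1}^n \left(\frac{\Delta x_i}{\bar x_i}\right)^2.
\]
Therefore $\lambda(\bar x) \leq 1$ implies $|\Delta x_i| \leq \bar x_i$.
(b) From the first part of the Newton equation
\[
c + \frac{1}{t} A^T w = -\frac{1}{t}\left(\nabla\phi(\bar x) + \nabla^2\phi(\bar x)\Delta x\right).
\]
The $i$th component is
\[
(c - A^T y)_i = \frac{1}{t \bar x_i}\left(1 - \frac{\Delta x_i}{\bar x_i}\right).
\]
This is nonnegative because $|\Delta x_i| \leq \bar x_i$.
(c) The gap between primal and dual objective is
\[
c^T \bar x - b^T y = \bar x^T (c - A^T y) = \frac{1}{t}\sum_{i=1}^n \left(1 - \frac{\Delta x_i}{\bar x_i}\right)
= \frac{1}{t}\left(n - \sum_{i=1}^n \frac{\Delta x_i}{\bar x_i}\right).
\]
To bound the last term we use the Cauchy-Schwarz inequality $|\mathbf{1}^T v| \leq \sqrt{n}\,\|v\|_2$ with $v_i = \Delta x_i/\bar x_i$:
\[
\left|\sum_{i=1}^n \frac{\Delta x_i}{\bar x_i}\right| \leq \sqrt{n}\,\lambda(\bar x).
\]
Since $b^T y \leq p^\star$ by weak duality, the bound in (c) follows.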
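
The three claims can be checked numerically with a sketch like the one below. Everything in it is an assumption made for illustration: the random problem data, the value of t, the helper name `newton`, and the use of damped Newton steps of length 1/(1 + lambda) to reach a point with Newton decrement at most one.

```python
import numpy as np

# random bounded standard form LP (assumed data, for illustration only)
rng = np.random.default_rng(1)
m, n, t = 5, 20, 2.0
A = np.vstack([rng.standard_normal((m - 1, n)), np.ones((1, n))])
x = rng.random(n) + 0.1                 # strictly feasible starting point
b = A @ x
c = rng.standard_normal(n)

def newton(x):
    """Newton step dx, multiplier w, and decrement for t*c^T x + phi(x)."""
    g = t * c - 1.0 / x                 # gradient t*c + grad phi(x)
    AX = A * x**2                       # A @ diag(x^2)
    w = np.linalg.solve(AX @ A.T, -AX @ g)
    dx = -(x**2) * (A.T @ w + g)
    lam = np.sqrt(np.sum((dx / x) ** 2))
    return dx, w, lam

# damped Newton steps until the Newton decrement is at most one
for _ in range(5000):
    dx, w, lam = newton(x)
    if lam <= 1.0:
        break
    x = x + dx / (1.0 + lam)

y = -w / t                              # dual point from part (b)
gap = c @ x - b @ y                     # c^T xbar - b^T y
bound = (n + lam * np.sqrt(n)) / t      # right-hand side in part (c)
print(lam, gap, bound)
```

The printed gap should not exceed the printed bound, and both x + dx and c - A^T y should be componentwise nonnegative up to rounding.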

10 Mathematical background
10.1 Some famous inequalities. The Cauchy-Schwarz inequality states that
\[
|a^T b| \leq \|a\|_2 \|b\|_2
\]
for all vectors $a, b \in \mathbf{R}^n$ (see page 633 of the textbook).

(a) Prove the Cauchy-Schwarz inequality.
Hint. A simple proof is as follows. With $a$ and $b$ fixed, consider the function $g(t) = \|a + tb\|_2^2$
of the scalar variable $t$. This function is nonnegative for all $t$. Find an expression for $\inf_t g(t)$
(the minimum value of $g$), and show that the Cauchy-Schwarz inequality follows from the fact
that $\inf_t g(t) \geq 0$.
(b) The 1-norm of a vector $x$ is defined as $\|x\|_1 = \sum_{k=1}^n |x_k|$. Use the Cauchy-Schwarz inequality
to show that
\[
\|x\|_1 \leq \sqrt{n}\,\|x\|_2
\]
for all $x$.
(c) The harmonic mean of a positive vector $x \in \mathbf{R}^n_{++}$ is defined as
\[
\left(\frac{1}{n}\sum_{k=1}^n \frac{1}{x_k}\right)^{-1}.
\]
Use the Cauchy-Schwarz inequality to show that the arithmetic mean $(\sum_k x_k)/n$ of a positive
$n$-vector is greater than or equal to its harmonic mean.
Solution.
(a) First note that the inequality is trivially satisfied if $b = 0$. Assume $b \neq 0$. We write $g$ as
\[
g(t) = (a + tb)^T (a + tb) = \|a\|_2^2 + 2t a^T b + t^2 \|b\|_2^2.
\]
Setting the derivative equal to zero gives $t = -(a^T b)/\|b\|_2^2$. Therefore
\[
\inf_t g(t) = \|a\|_2^2 - \frac{(a^T b)^2}{\|b\|_2^2}.
\]
The Cauchy-Schwarz inequality now follows from $\inf_t g(t) \geq 0$.
(b) Define $a_k = |x_k|$, $b_k = 1$. Then
\[
\|x\|_1 = a^T b \leq \|a\|_2 \|b\|_2 = \sqrt{n}\,\|x\|_2.
\]
(c) Define $a_k = \sqrt{x_k/n}$ and $b_k = 1/\sqrt{n x_k}$. Then
\[
1 = (a^T b)^2 \leq \|a\|_2^2 \|b\|_2^2 = \left(\frac{1}{n}\sum_{k=1}^n x_k\right)\left(\sum_{k=1}^n \frac{1}{n x_k}\right).
\]
Hence,
\[
\frac{1}{n}\sum_k x_k \geq \frac{n}{\sum_k 1/x_k}.
\]
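
As a quick sanity check of the three inequalities, the following Python sketch evaluates both sides on random data. The vectors and their dimension are arbitrary choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
a, b = rng.standard_normal(n), rng.standard_normal(n)
x = rng.random(n) + 0.1                      # a positive vector

# (a) Cauchy-Schwarz, and the minimum of g(t) = ||a + t b||_2^2
cauchy_schwarz_ok = abs(a @ b) <= np.linalg.norm(a) * np.linalg.norm(b)
t_star = -(a @ b) / (b @ b)                  # minimizer of g
g_min = np.linalg.norm(a + t_star * b) ** 2
g_formula = a @ a - (a @ b) ** 2 / (b @ b)   # closed form for inf_t g(t)

# (b) 1-norm versus 2-norm
one_norm_ok = np.linalg.norm(x, 1) <= np.sqrt(n) * np.linalg.norm(x)

# (c) arithmetic mean versus harmonic mean
arith_mean = x.mean()
harm_mean = n / np.sum(1.0 / x)

print(cauchy_schwarz_ok, one_norm_ok, arith_mean >= harm_mean)
```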

10.2 Schur complements. Consider a matrix $X = X^T \in \mathbf{R}^{n\times n}$ partitioned as
\[
X = \left[\begin{array}{cc} A & B \\ B^T & C \end{array}\right],
\]
where $A \in \mathbf{R}^{k\times k}$. If $\det A \neq 0$, the matrix $S = C - B^T A^{-1} B$ is called the Schur complement of
$A$ in $X$. Schur complements arise in many situations and appear in many important formulas and
theorems. For example, we have $\det X = \det A \det S$. (You don't have to prove this.)
(a) The Schur complement arises when you minimize a quadratic form over some of the variables.
Let $f(u, v) = (u, v)^T X (u, v)$, where $u \in \mathbf{R}^k$. Let $g(v)$ be the minimum value of $f$ over $u$, i.e.,
$g(v) = \inf_u f(u, v)$. Of course $g(v)$ can be $-\infty$.
Show that if $A \succ 0$, we have $g(v) = v^T S v$.
(b) The Schur complement arises in several characterizations of positive definiteness or semidefiniteness
of a block matrix. As examples we have the following three theorems:
• $X \succ 0$ if and only if $A \succ 0$ and $S \succ 0$.
• If $A \succ 0$, then $X \succeq 0$ if and only if $S \succeq 0$.
• $X \succeq 0$ if and only if $A \succeq 0$, $B^T (I - AA^\dagger) = 0$ and $C - B^T A^\dagger B \succeq 0$, where $A^\dagger$ is the
pseudo-inverse of $A$. ($C - B^T A^\dagger B$ serves as a generalization of the Schur complement in
the case where $A$ is positive semidefinite but singular.)
Prove one of these theorems. (You can choose which one.)
Solution.

(a) If $A \succ 0$, then $g(v) = v^T S v$.

We have $f(u, v) = u^T A u + 2 v^T B^T u + v^T C v$. If $A \succ 0$, we can minimize $f$ over $u$ by setting the
gradient with respect to $u$ equal to zero. We obtain $u^\star(v) = -A^{-1} B v$, and
\[
g(v) = f(u^\star(v), v) = v^T (C - B^T A^{-1} B) v = v^T S v.
\]
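
The formula for $g(v)$ can be checked numerically with the short Python sketch below. The block sizes and random data are assumptions made only for this illustration; $A$ is built as $GG^T + I$ so that it is positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
k, p = 3, 2                                  # sizes of the u and v blocks
G = rng.standard_normal((k, k))
A = G @ G.T + np.eye(k)                      # positive definite block
B = rng.standard_normal((k, p))
C = rng.standard_normal((p, p))
C = C + C.T                                  # symmetric, not necessarily PSD
X = np.block([[A, B], [B.T, C]])
S = C - B.T @ np.linalg.solve(A, B)          # Schur complement of A in X

def f(u, v):
    """Quadratic form (u, v)^T X (u, v)."""
    z = np.concatenate([u, v])
    return z @ X @ z

v = rng.standard_normal(p)
u_star = -np.linalg.solve(A, B @ v)          # minimizer over u
g_v = f(u_star, v)
schur_value = v @ S @ v
# moving u away from u_star can only increase f
perturbed = [f(u_star + 0.1 * rng.standard_normal(k), v) for _ in range(5)]
print(g_v, schur_value)
```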
(b) Positive definite and semidefinite block matrices.
• $X \succ 0$ if and only if $A \succ 0$ and $S \succ 0$.
Suppose $X \succ 0$. Then $f(u, v) > 0$ for all non-zero $(u, v)$, and in particular, $f(u, 0) =
u^T A u > 0$ for all non-zero $u$ (hence, $A \succ 0$), and $f(-A^{-1} B v, v) = v^T (C - B^T A^{-1} B) v > 0$
for all non-zero $v$ (hence, $S = C - B^T A^{-1} B \succ 0$). This proves the "only if" part.
To prove the "if" part, we have to show that if $A \succ 0$ and $S \succ 0$, then $f(u, v) > 0$ for
all nonzero $(u, v)$ (that is, for all $u$, $v$ that are not both zero). If $v \neq 0$, then it follows
from (a) that
\[
f(u, v) \geq \inf_u f(u, v) = v^T S v > 0.
\]
If $v = 0$ and $u \neq 0$, $f(u, 0) = u^T A u > 0$.
• If $A \succ 0$, then $X \succeq 0$ if and only if $S \succeq 0$.
From part (a) we know that if $A \succ 0$, then $\inf_u f(u, v) = v^T S v$. If $S \succeq 0$, then
\[
f(u, v) \geq \inf_u f(u, v) = v^T S v \geq 0
\]
for all $u$, $v$, and hence $X \succeq 0$. This proves the "if" part.
To prove the "only if" part we note that $f(u, v) \geq 0$ for all $(u, v)$ implies that $\inf_u f(u, v)
\geq 0$ for all $v$, i.e., $S \succeq 0$.

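• $X \succeq 0$ if and only if $A \succeq 0$, $B^T (I - AA^\dagger) = 0$, $C - B^T A^\dagger B \succeq 0$.
Suppose $A \in \mathbf{R}^{k\times k}$ with $\mathop{\bf rank}(A) = r$. Then there exist matrices $Q_1 \in \mathbf{R}^{k\times r}$, $Q_2 \in
\mathbf{R}^{k\times(k-r)}$ and an invertible diagonal matrix $\Lambda \in \mathbf{R}^{r\times r}$ such that
\[
A = \left[\begin{array}{cc} Q_1 & Q_2 \end{array}\right]
\left[\begin{array}{cc} \Lambda & 0 \\ 0 & 0 \end{array}\right]
\left[\begin{array}{cc} Q_1 & Q_2 \end{array}\right]^T,
\]
and $[Q_1 \; Q_2]^T [Q_1 \; Q_2] = I$. The matrix
\[
\left[\begin{array}{ccc} Q_1 & Q_2 & 0 \\ 0 & 0 & I \end{array}\right] \in \mathbf{R}^{n\times n}
\]
is nonsingular, and therefore
\begin{eqnarray*}
\left[\begin{array}{cc} A & B \\ B^T & C \end{array}\right] \succeq 0
& \Longleftrightarrow &
\left[\begin{array}{ccc} Q_1 & Q_2 & 0 \\ 0 & 0 & I \end{array}\right]^T
\left[\begin{array}{cc} A & B \\ B^T & C \end{array}\right]
\left[\begin{array}{ccc} Q_1 & Q_2 & 0 \\ 0 & 0 & I \end{array}\right] \succeq 0 \\
& \Longleftrightarrow &
\left[\begin{array}{ccc} \Lambda & 0 & Q_1^T B \\ 0 & 0 & Q_2^T B \\ B^T Q_1 & B^T Q_2 & C \end{array}\right] \succeq 0 \\
& \Longleftrightarrow &
Q_2^T B = 0, \quad
\left[\begin{array}{cc} \Lambda & Q_1^T B \\ B^T Q_1 & C \end{array}\right] \succeq 0.
\end{eqnarray*}
We have $\Lambda \succ 0$ if and only if $A \succeq 0$. It can be verified that
\[
A^\dagger = Q_1 \Lambda^{-1} Q_1^T, \qquad I - AA^\dagger = Q_2 Q_2^T.
\]
Therefore
\[
Q_2^T B = 0 \Longleftrightarrow Q_2 Q_2^T B = (I - A^\dagger A) B = 0.
\]
Moreover, since $\Lambda$ is invertible,
\[
\left[\begin{array}{cc} \Lambda & Q_1^T B \\ B^T Q_1 & C \end{array}\right] \succeq 0
\Longleftrightarrow
\Lambda \succ 0, \quad C - B^T Q_1 \Lambda^{-1} Q_1^T B = C - B^T A^\dagger B \succeq 0.
\]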
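
The third theorem can be illustrated on a small example with a singular $A$. In the Python sketch below the two test matrices and the helper name `psd_by_schur` are hypothetical, chosen only for illustration; the pseudo-inverse is computed with `numpy.linalg.pinv` using an explicit cutoff for small singular values.

```python
import numpy as np

def psd_by_schur(X, k, tol=1e-9):
    """Test X >= 0 through the three conditions of the theorem
    (hypothetical helper; A is the leading k-by-k block of X)."""
    A, B, C = X[:k, :k], X[:k, k:], X[k:, k:]
    A_pinv = np.linalg.pinv(A, 1e-10)        # pseudo-inverse of A
    cond_A = np.linalg.eigvalsh(A).min() >= -tol
    cond_range = np.linalg.norm(B.T @ (np.eye(k) - A @ A_pinv)) <= tol
    S = C - B.T @ A_pinv @ B                 # generalized Schur complement
    cond_S = np.linalg.eigvalsh(S).min() >= -tol
    return bool(cond_A and cond_range and cond_S)

A = np.diag([2.0, 1.0, 0.0])                 # positive semidefinite, singular
C = np.diag([2.0, 3.0])
B_good = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # range condition holds
B_bad = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])   # range condition fails

X_good = np.block([[A, B_good], [B_good.T, C]])
X_bad = np.block([[A, B_bad], [B_bad.T, C]])

for X in (X_good, X_bad):
    print(psd_by_schur(X, 3), np.linalg.eigvalsh(X).min())
```

For the first matrix all three conditions hold and the smallest eigenvalue is zero; for the second the condition $B^T(I - AA^\dagger) = 0$ fails and the matrix has a negative eigenvalue.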

11 Circuit design
11.1 Interconnect sizing. In this problem we will size the interconnecting wires of the simple circuit
shown below, with one voltage source driving three different capacitive loads Cload1 , Cload2 , and
Cload3 .

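[Figure: the circuit, with one voltage source driving the three capacitive loads through the interconnect wires.]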

We divide the wires into 6 segments of fixed length $l_i$; our variables will be the widths $w_i$ of the
segments. (The height of the wires is related to the particular IC technology process, and is fixed.)
The total area used by the wires is, of course,
\[
A = \sum_i w_i l_i.
\]
We'll take the lengths to be one, for simplicity. The wire widths must be between a minimum and
maximum allowable value:
\[
W_{\min} \leq w_i \leq W_{\max}.
\]
For our specific problem, we'll take $W_{\min} = 0.1$ and $W_{\max} = 10$.
Each of the wire segments will be modeled by a simple RC circuit, with the resistance
inversely proportional to the width of the wire and the capacitance proportional to the width. (A
far better model uses an extra constant term in the capacitance, but this complicates the equations.)
The capacitance and resistance of the $i$th segment is thus
\[
C_i = k_0 w_i, \qquad R_i = \rho / w_i,
\]
where $k_0$ and $\rho$ are positive constants, which we take to be one for simplicity. We also have
$C_{\rm load1} = 1.5$, $C_{\rm load2} = 1$, and $C_{\rm load3} = 5$.
Using the RC model for the wire segments yields the circuit shown below.

[Figure: RC model of the circuit, with segment resistances r1, ..., r6 and capacitances c1, ..., c6.]

We will use the Elmore delay to model the delay from the source to each of the loads. The Elmore
delays to loads 1, 2, and 3 are given by
\begin{eqnarray*}
T_1 &=& (C_3 + C_{\rm load1})(R_1 + R_2 + R_3) + C_2 (R_1 + R_2) \\
&& {} + (C_1 + C_4 + C_5 + C_6 + C_{\rm load2} + C_{\rm load3}) R_1 \\
T_2 &=& (C_5 + C_{\rm load2})(R_1 + R_4 + R_5) + C_4 (R_1 + R_4) \\
&& {} + (C_6 + C_{\rm load3})(R_1 + R_4) + (C_1 + C_2 + C_3 + C_{\rm load1}) R_1 \\
T_3 &=& (C_6 + C_{\rm load3})(R_1 + R_4 + R_6) + C_4 (R_1 + R_4) \\
&& {} + (C_1 + C_2 + C_3 + C_{\rm load1}) R_1 + (C_5 + C_{\rm load2})(R_1 + R_4).
\end{eqnarray*}
Our main interest is in the maximum of these delays,
\[
T = \max\{T_1, T_2, T_3\}.
\]
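
The delay formulas above can be evaluated directly. The Python sketch below is one way to do so, with $k_0 = \rho = 1$ and the load values from the problem statement; the function name `elmore_delays` is a hypothetical stand-in and is not the course file elm_del_example.m.

```python
import numpy as np

C_LOAD = (1.5, 1.0, 5.0)    # C_load1, C_load2, C_load3 from the problem

def elmore_delays(w):
    """Return (T1, T2, T3) for wire widths w[0..5] (segments 1..6),
    assuming k0 = rho = 1 so that C_i = w_i and R_i = 1/w_i."""
    w = np.asarray(w, dtype=float)
    C = np.concatenate([[0.0], w])          # pad so C[i] is segment i
    R = np.concatenate([[0.0], 1.0 / w])    # pad so R[i] is segment i
    L1, L2, L3 = C_LOAD
    T1 = ((C[3] + L1) * (R[1] + R[2] + R[3]) + C[2] * (R[1] + R[2])
          + (C[1] + C[4] + C[5] + C[6] + L2 + L3) * R[1])
    T2 = ((C[5] + L2) * (R[1] + R[4] + R[5]) + C[4] * (R[1] + R[4])
          + (C[6] + L3) * (R[1] + R[4]) + (C[1] + C[2] + C[3] + L1) * R[1])
    T3 = ((C[6] + L3) * (R[1] + R[4] + R[6]) + C[4] * (R[1] + R[4])
          + (C[1] + C[2] + C[3] + L1) * R[1] + (C[5] + L2) * (R[1] + R[4]))
    return T1, T2, T3

# all widths equal to one: area 6 and delays (19.5, 24.5, 28.5)
delays = elmore_delays(np.ones(6))
print(delays, max(delays))
```

Sweeping a common width between $W_{\min}$ and $W_{\max}$ and recording the area $6w$ against the maximum delay gives the trade-off of the simple method in part (c).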

(a) Explain how to find the optimal trade-off curve between area A and delay T .
(b) Optimal area-delay sizing. For the specific problem parameters given, plot the area-delay
trade-off curve, together with the individual Elmore delays. Comment on the results you
obtain.
(c) The simple method. Plot the area-delay trade-off obtained when you assign all wire widths
to be the same width (which varies between Wmin and Wmax ). Compare this curve to the
optimal one, obtained in part (b). How much better does the optimal method do than the
simple method? Note: for a large circuit, say with 1000 wires to size, the difference is far
larger.

For this problem you can use CVX in GP mode. We've also made available the function
elm_del_example.m, which evaluates the three delays, given the widths of the wires.
Solution. The dashed line in the figure is the trade-off curve for the simple method of part (c).
The solid line is the optimal trade-off curve of part (b).

[Figure: delay versus area; solid line: optimal trade-off curve, dashed line: equal wire widths.]

The figure was generated using the following MATLAB code.

Cload = [ 1.5; 1; 5];


maxw = 10;
minw = 0.1;

% Optimal trade-off.
N = 50;
mus = logspace(-3,3,N);
areas = zeros(1,N);
delays = zeros(1,N);
for i=1:N
cvx_begin gp
variables T w(6)
C = w;
R = 1./w;
minimize( sum(w) + mus(i)*T )
subject to
w / maxw <= 1.0;
minw ./ w <= 1;
(C(3) + Cload(1)) * sum(R([1,2,3])) ...
+ C(2) * sum(R([1,2])) ...
+ (sum(C([1,4,5,6])) + sum(Cload([2,3]))) * R(1) <= T;
(C(5) + Cload(2)) * sum(R([1,4,5])) ...
+ C(4) * sum(R([1,4])) ...
+ (C(6) + Cload(3)) * sum(R([1,4])) ...
+ (sum(C([1,2,3])) + Cload(1)) * R(1) <= T;
(C(6) + Cload(3)) * sum(R([1,4,6])) ...
+ C(4) * sum(R([1,4])) ...
+ (sum(C([1,2,3])) + Cload(1)) * R(1) ...
+ (C(5) + Cload(2)) * sum(R([1,4])) <= T;
cvx_end
delays(i) = T;
areas(i) = sum(w);
end;

% Equal wire widths


N=100;
ws = logspace(-1,1,N);
areas2 = zeros(1,N);
delays2 = zeros(1,N);
for i=1:length(ws)
w = ws(i)*ones(6,1);
areas2(i) = sum(w);
C = w;
R = 1./w;
T1 = (C(3) + Cload(1)) * sum(R([1,2,3])) ...
+ C(2) * sum(R([1,2])) ...
+ (sum(C([1,4,5,6])) + sum(Cload([2,3]))) * R(1);
T2 = (C(5) + Cload(2)) * sum(R([1,4,5])) ...
+ C(4) * sum(R([1,4])) ...
+ (C(6) + Cload(3)) * sum(R([1,4])) ...
+ (sum(C([1,2,3])) + Cload(1)) * R(1);
T3 = (C(6) + Cload(3)) * sum(R([1,4,6])) ...
+ C(4) * sum(R([1,4])) ...
+ (sum(C([1,2,3])) + Cload(1)) * R(1) ...
+ (C(5) + Cload(2)) * sum(R([1,4]));
delays2(i) = max([T1, T2, T3]);
end;

plot(areas,delays,'-', areas2,delays2,'--');

XXX old solution XXX. To find the optimal trade-off curve between area and delay, we can minimize
a weighted sum of area and delay. In particular the problem becomes:
\[
\begin{array}{ll}
\mbox{minimize} & \max_{i=1,2,3} T_i + \mu A \\
\mbox{subject to} & W_{\min} \leq w_i \leq W_{\max}, \quad i = 1, \ldots, 6.
\end{array}
\]
For each value of $\mu \geq 0$, we find a new point on the optimal trade-off curve, and $\mu$ represents the
slope of the curve at that point. We can introduce a new variable $t$ which upper bounds the three
delays, and by doing so we obtain an equivalent problem:
\[
\begin{array}{ll}
\mbox{minimize} & t + \mu A \\
\mbox{subject to} & T_i \leq t, \quad i = 1, 2, 3 \\
& W_{\min} \leq w_i \leq W_{\max}, \quad i = 1, \ldots, 6.
\end{array}
\]
Since the expressions for $T_i$ and the area are posynomials in the variables $w_i$, we can formulate this
minimization problem as a geometric program with variables $w_i$ and $t$:
\[
\begin{array}{ll}
\mbox{minimize} & t + \mu A \\
\mbox{subject to} & t^{-1} T_i \leq 1, \quad i = 1, 2, 3 \\
& W_{\min} w_i^{-1} \leq 1, \quad i = 1, \ldots, 6 \\
& W_{\max}^{-1} w_i \leq 1, \quad i = 1, \ldots, 6.
\end{array}
\]
We then use the geometric programming solver provided on the web to solve this minimization
problem. The MATLAB code used is reported at the end of the solution.

[Figure: delay versus area.]

In the figure above, we can see the optimal trade-off curve (solid line) together with the individual
delays (dashed lines). The dash-dot line is the area-delay trade-off of the simple design. As
we expected, the simple design is always worse than the optimal one. In particular the difference
between the two designs is larger when the area is smaller.
We can also see that the three delays are equal for a large part of the trade-off curve. This is what
we expect, because if one of the delays is less than the others, it means that we could reallocate
the area to have a higher delay on that path but lower delays on the other two. When the delays
are not equal it means that the constraints on the widths are active and are forcing the
design to have different delays.

clear all
close all

%We specify the constants of this problem
ro=1;
k=1;
L=1;
cload1=1.5;
cload2=1;
cload3=5;
maxw=10;
minw=0.1;
%A1 and B1 describe T1/t
A1= [-1 -1 0 0 0 0 0
-1 0 -1 0 0 0 0
-1 0 0 -1 0 0 0
-1 0 0 0 0 0 0
-1 -1 1 0 0 0 0
-1 -1 0 1 0 0 0
-1 -1 0 0 1 0 0
-1 -1 0 0 0 1 0
-1 -1 0 0 0 0 1
-1 0 -1 1 0 0 0
];
B1=[log((cload1+cload2+cload3)*ro)
log(cload1*ro)
log(cload1*ro)
log(+3*k*ro)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)];
%A2 and B2 describe T2/t
A2= [-1 -1 0 0 0 0 0
-1 0 0 0 -1 0 0
-1 0 0 0 0 -1 0
-1 0 0 0 0 0 0
-1 -1 1 0 0 0 0
-1 -1 0 1 0 0 0
-1 -1 0 0 1 0 0
-1 -1 0 0 0 1 0
-1 -1 0 0 0 0 1
-1 0 0 0 -1 1 0
-1 0 0 0 -1 0 1
];
B2=[log((cload1+cload2+cload3)*ro)

log((cload2+cload3)*ro)
log(cload2*ro)
log(+3*k*ro)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)];
%A3 and B3 describe T3/t
A3= [-1 -1 0 0 0 0 0
-1 0 0 0 -1 0 0
-1 0 0 0 0 0 -1
-1 0 0 0 0 0 0
-1 -1 1 0 0 0 0
-1 -1 0 1 0 0 0
-1 -1 0 0 1 0 0
-1 -1 0 0 0 1 0
-1 -1 0 0 0 0 1
-1 0 0 0 -1 1 0
-1 0 0 0 -1 0 1
];
B3=[log((cload1+cload2+cload3)*ro)
log((cload2+cload3)*ro)
log(cload3*ro)
log(+3*k*ro)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)
log(ro*k)];
%A4 and B4 are the size constraints on w_i
A4= [zeros(6,1) -eye(6)
zeros(6,1) eye(6)];
B4= [ones(6,1)*log(minw)
-ones(6,1)*log(maxw)];

%A0 and B0 describe the weighted sum of area and maximum delay
% For each value of mu we find a new point on the optimal trade-off curve
A0=eye(7);
mu=0.02;
for i=1:55,

B0=[0
log(mu*L)*ones(6,1)];

A=[A0 A1 A2 A3 A4];
B=[B0 B1 B2 B3 B4];
szs=[7 10 11 11 1 1 1 1 1 1 1 1 1 1 1 1];

%initial feasible point


x0=[100 1 1 1 1 1 1];

[x,nu,lambda] = gp(A,B,szs,x0);
o=exp(x);
tmax(i)=o(1);
W=o(2:7);
[t1(i),t2(i),t3(i)]=elm_del_example(W);
Area(i)=ones(1,6)*W*L;
mu=mu*1.2;
end

%we calculate the area versus delay for the suboptimal design
i=1;
for ws=0.1:0.1:5
[subt1(i),subt2(i),subt3(i)]=elm_del_example(ones(6,1)*ws);
subt(i)=max([subt1(i),subt2(i),subt3(i)]);
Areasub(i)=L*ws*6;
i=i+1;
end

%we plot the results together


plot(Area,tmax);
hold on
plot(Area,t1,'--');
plot(Area,t2,'--');
plot(Area,t3,'--');
plot(Areasub,subt,'-.');
xlabel('area')
ylabel('delay')
11.2 Optimal sizing of power and ground trees. We consider a system or VLSI device with many sub-
systems or subcircuits, each of which needs one or more power supply voltages. In this problem we
consider the case where the power supply network has a tree topology with the power supply (or
external pin connection) at the root. Each node of the tree is connected to some subcircuit that
draws power.
We model the power supply as a constant voltage source with value V . The m subcircuits are
modeled as current sources that draw currents i1 (t), . . . , im (t) from the node (to ground) (see the
figure below).

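[Figure: tree-structured power supply network with the supply at the root, wire resistances r1, ..., r6, and current sources i1, ..., i6 at the nodes.]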

The subcircuit current draws have two components:

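\[ i_k(t) = i_k^{\mathrm{dc}} + i_k^{\mathrm{ac}}(t), \]

where $i_k^{\mathrm{dc}}$ is the DC current draw (which is a positive constant), and $i_k^{\mathrm{ac}}(t)$ is the AC draw (which
has zero average value). We characterize the AC current draw by its RMS value, defined as

\[ \mathrm{RMS}(i_k^{\mathrm{ac}}) = \left( \lim_{T \to \infty} \frac{1}{T} \int_0^T i_k^{\mathrm{ac}}(t)^2 \, dt \right)^{1/2}. \]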

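For each subcircuit we are given maximum values for the DC and RMS AC current draws, i.e.,
constants $I_k^{\mathrm{dc}}$ and $I_k^{\mathrm{ac}}$ such that

\[ 0 \le i_k^{\mathrm{dc}} \le I_k^{\mathrm{dc}}, \qquad \mathrm{RMS}(i_k^{\mathrm{ac}}) \le I_k^{\mathrm{ac}}. \qquad (41) \]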

The n wires that form the distribution network are modeled as resistors Rk (which, presumably, have
small value). (Since the circuit has a tree topology, we can use the following labeling convention:
node k and the current source ik (t) are immediately following resistor Rk .) The resistance of the
wires is given by
\[ R_i = \alpha l_i / w_i, \]
where $\alpha$ is a constant and $l_i$ are the lengths of the wires, which are known and fixed. The variables
in the problem are the width of the wires, w1 , . . . , wn . Obviously by making the wires very wide,
the resistances become very low, and we have a nearly ideal power network. The purpose of this
problem is to optimally select wire widths, to minimize area while meeting certain specifications.
Note that in this problem we ignore dynamics, i.e., we do not model the capacitance or inductance
of the wires.
As a result of the current draws and the nonzero resistance of the wires, the voltage at node k
(which supplies subcircuit k) has a DC value less than the supply voltage, and also an AC voltage
(which is called power supply ripple or noise). By superposition these two effects can be analyzed
separately.

The DC voltage drop $V - v_k^{\mathrm{dc}}$ at node k is equal to the sum of the voltage drops across wires
on the (unique) path from node k to the root. It can be expressed as

\[ V - v_k^{\mathrm{dc}} = \sum_{j=1}^m i_j^{\mathrm{dc}} \sum_{i \in N(j,k)} R_i, \qquad (42) \]

where $N(j,k)$ consists of the indices of the branches upstream from nodes j and k, i.e.,
$i \in N(j,k)$ if and only if $R_i$ is in the path from node j to the root and in the path from node
k to the root.
The power supply noise at a node can be found as follows. The AC voltage at node k is equal
to

\[ v_k^{\mathrm{ac}}(t) = \sum_{j=1}^m i_j^{\mathrm{ac}}(t) \sum_{i \in N(j,k)} R_i. \]

We assume the AC current draws are independent, so the RMS value of $v_k^{\mathrm{ac}}(t)$ is given by the
square root of the sum of the squares of the RMS value of the ripple due to each other node,
i.e.,

\[ \mathrm{RMS}(v_k^{\mathrm{ac}}) = \left( \sum_{j=1}^m \Big( \mathrm{RMS}(i_j^{\mathrm{ac}}) \sum_{i \in N(j,k)} R_i \Big)^2 \right)^{1/2}. \qquad (43) \]

The problem is to choose wire widths $w_i$ that minimize the total wire area $\sum_{k=1}^n w_k l_k$ subject to
the following specifications:

• maximum allowable DC voltage drop at each node:

\[ V - v_k^{\mathrm{dc}} \le V_{\max}^{\mathrm{dc}}, \quad k = 1, \ldots, m, \qquad (44) \]

where $V - v_k^{\mathrm{dc}}$ is given by (42), and $V_{\max}^{\mathrm{dc}}$ is a given constant.

• maximum allowable power supply noise at each node:

\[ \mathrm{RMS}(v_k^{\mathrm{ac}}) \le V_{\max}^{\mathrm{ac}}, \quad k = 1, \ldots, m, \qquad (45) \]

where $\mathrm{RMS}(v_k^{\mathrm{ac}})$ is given by (43), and $V_{\max}^{\mathrm{ac}}$ is a given constant.

• upper and lower bounds on wire widths:

\[ w_{\min} \le w_i \le w_{\max}, \quad i = 1, \ldots, n, \qquad (46) \]

where $w_{\min}$ and $w_{\max}$ are given constants.

• maximum allowable DC current density in a wire:

\[ \Big( \sum_{j \in M(k)} i_j^{\mathrm{dc}} \Big) \Big/ w_k \le \rho_{\max}, \quad k = 1, \ldots, n, \qquad (47) \]

where $M(k)$ is the set of all indices of nodes downstream from resistor k, i.e., $j \in M(k)$ if
and only if $R_k$ is in the path from node j to the root, and $\rho_{\max}$ is a given constant.

• maximum allowable total DC power dissipation in supply network:

\[ \sum_{k=1}^n R_k \Big( \sum_{j \in M(k)} i_j^{\mathrm{dc}} \Big)^2 \le P_{\max}, \qquad (48) \]

where $P_{\max}$ is a given constant.

These specifications must be satisfied for all possible ik (t) that satisfy (41).
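
As a concrete illustration of formulas (42) and (43), the following sketch evaluates the DC drop and the RMS noise at each node of a small tree. The three-wire topology, the resistances, and the current limits are all made up for illustration; only the two formulas come from the text.

```python
import math

# Hypothetical tree: wire/node k hangs below parent[k]; node 0 is the root (supply).
parent = {1: 0, 2: 1, 3: 1}
R = {1: 0.10, 2: 0.20, 3: 0.05}      # assumed wire resistances
I_dc = {1: 1.0, 2: 2.0, 3: 0.5}      # assumed worst-case DC draws
I_ac = {1: 0.3, 2: 0.4, 3: 0.1}      # assumed worst-case RMS AC draws

def path(k):
    """Indices of the wires on the path from node k to the root."""
    wires = set()
    while k != 0:
        wires.add(k)
        k = parent[k]
    return wires

def shared_resistance(j, k):
    """Sum of R_i over i in N(j, k), the wires common to both root paths."""
    return sum(R[i] for i in path(j) & path(k))

def dc_drop(k):
    """Worst-case DC voltage drop at node k, formula (42)."""
    return sum(I_dc[j] * shared_resistance(j, k) for j in parent)

def rms_noise(k):
    """Worst-case RMS supply noise at node k, formula (43)."""
    return math.sqrt(sum((I_ac[j] * shared_resistance(j, k)) ** 2 for j in parent))

drops = {k: dc_drop(k) for k in parent}
noise = {k: rms_noise(k) for k in parent}
print(drops)
print(noise)
```

Both quantities are nondecreasing in every current limit and in every resistance, which is why the worst case over (41) is obtained by plugging in the limits themselves.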
Formulate this as a convex optimization problem in the standard form

\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, p \\
& Ax = b.
\end{array}
\]

You may introduce new variables, or use a change of variables, but you must say very clearly

• what the optimization variable x is, and how it corresponds to the problem variables w (i.e.,
is x equal to w, does it include auxiliary variables, . . . ?)
• what the objective $f_0$ and the constraint functions $f_i$ are, and how they relate to the objectives
and specifications of the problem description
• why the objective and constraint functions are convex
• what A and b are (if applicable).

Solution. There are at least three correct solutions.

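(a) Use the inverse widths as variables, i.e., define $y_i = 1/w_i$ and solve the problem

\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^n l_i / y_i \\
\mbox{subject to} & \alpha \sum_{j=1}^m I_j^{\mathrm{dc}} \sum_{i \in N(j,k)} l_i y_i \le V_{\max}^{\mathrm{dc}}, \quad k = 1, \ldots, m \\
& \left( \sum_{j=1}^m \Big( \alpha I_j^{\mathrm{ac}} \sum_{i \in N(j,k)} l_i y_i \Big)^2 \right)^{1/2} \le V_{\max}^{\mathrm{ac}}, \quad k = 1, \ldots, m \\
& 1/w_{\max} \le y_i \le 1/w_{\min}, \quad i = 1, \ldots, n \\
& \sum_{j \in M(k)} I_j^{\mathrm{dc}} y_k \le \rho_{\max}, \quad k = 1, \ldots, n \\
& \sum_{k=1}^n \alpha l_k y_k \Big( \sum_{j \in M(k)} I_j^{\mathrm{dc}} \Big)^2 \le P_{\max}.
\end{array}
\]

Every left-hand side in (44)-(48) is nondecreasing in the current draws, so the worst case over
all currents satisfying (41) is obtained by substituting the limits $I_j^{\mathrm{dc}}$ and $I_j^{\mathrm{ac}}$.
The objective is a convex function of $y_i$ since the lengths $l_i$ are positive constants. All the
constraints except the second are linear inequalities in y. The second constraint is a second-
order cone constraint.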

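(b) Express the problem as a geometric program in w:

\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^n l_i w_i \\
\mbox{subject to} & (V_{\max}^{\mathrm{dc}})^{-1} \alpha \sum_{j=1}^m I_j^{\mathrm{dc}} \sum_{i \in N(j,k)} l_i w_i^{-1} \le 1, \quad k = 1, \ldots, m \\
& \sum_{j=1}^m \Big( (V_{\max}^{\mathrm{ac}})^{-1} \alpha I_j^{\mathrm{ac}} \sum_{i \in N(j,k)} l_i w_i^{-1} \Big)^2 \le 1, \quad k = 1, \ldots, m \\
& w_i w_{\max}^{-1} \le 1, \quad i = 1, \ldots, n \\
& w_{\min} w_i^{-1} \le 1, \quad i = 1, \ldots, n \\
& \rho_{\max}^{-1} \sum_{j \in M(k)} I_j^{\mathrm{dc}} w_k^{-1} \le 1, \quad k = 1, \ldots, n \\
& P_{\max}^{-1} \sum_{k=1}^n \alpha l_k w_k^{-1} \Big( \sum_{j \in M(k)} I_j^{\mathrm{dc}} \Big)^2 \le 1.
\end{array}
\]

(Note that we have squared both sides of the second constraint.) The objective function and
all constraint functions are posynomial in w, so we have a geometric program. Writing the
geometric program in the exponential form yields a convex optimization problem.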
(c) In fact the problem is also convex in the original variables w:

\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^n l_i w_i \\
\mbox{subject to} & \alpha \sum_{j=1}^m I_j^{\mathrm{dc}} \sum_{i \in N(j,k)} l_i / w_i \le V_{\max}^{\mathrm{dc}}, \quad k = 1, \ldots, m \\
& \left( \sum_{j=1}^m \Big( \alpha I_j^{\mathrm{ac}} \sum_{i \in N(j,k)} l_i / w_i \Big)^2 \right)^{1/2} \le V_{\max}^{\mathrm{ac}}, \quad k = 1, \ldots, m \\
& w_{\min} \le w_i \le w_{\max}, \quad i = 1, \ldots, n \\
& \Big( \sum_{j \in M(k)} I_j^{\mathrm{dc}} \Big) \Big/ w_k \le \rho_{\max}, \quad k = 1, \ldots, n \\
& \sum_{k=1}^n \alpha (l_k / w_k) \Big( \sum_{j \in M(k)} I_j^{\mathrm{dc}} \Big)^2 \le P_{\max}.
\end{array}
\]

The objective function is linear in w. The left hand side of the first constraint is a convex
function of w, since $w \succ 0$, and all the constants in the expression are positive. To show that
the second constraint is convex, we can use one of the composition theorems. We know that
the function

\[ f(z_1, \ldots, z_m) = \Big( \sum_{j=1}^m z_j^2 \Big)^{1/2} \]

is convex. When restricted to the set $\{z \mid z \succeq 0\}$, it is also nondecreasing in each argument.
The function on the left hand side of the constraint is the composition of f with the nonnegative
convex functions

\[ z_j = g_j(w) = \alpha I_j^{\mathrm{ac}} \sum_{i \in N(j,k)} l_i / w_i. \]

The third set of constraints are linear inequalities. The last two constraints are convex because
$1/w_i$ is convex on $w_i > 0$, and all coefficients in the expressions are positive.

11.3 Optimal amplifier gains. We consider a system of n amplifiers connected (for simplicity) in a chain,
as shown below. The variables that we will optimize over are the gains a1 , . . . , an > 0 of the
amplifiers. The first specification is that the overall gain of the system, i.e., the product $a_1 \cdots a_n$,
is equal to $A^{\mathrm{tot}}$, which is given.

[Figure: chain of n amplifiers with gains a_1, a_2, ..., a_n.]

We are concerned about two effects: noise generated by the amplifiers, and amplifier overload.
These effects are modeled as follows.
We first describe how the noise depends on the amplifier gains. Let $N_i$ denote the noise level (RMS,
or root-mean-square) at the output of the ith amplifier. These are given recursively as

\[ N_0 = 0, \qquad N_i = a_i \left( N_{i-1}^2 + \alpha_i^2 \right)^{1/2}, \quad i = 1, \ldots, n, \]

where $\alpha_i > 0$ (which is given) is the (input-referred) RMS noise level of the ith amplifier. The
output noise level $N_{\mathrm{out}}$ of the system is given by $N_{\mathrm{out}} = N_n$, i.e., the noise level of the last amplifier.
Evidently $N_{\mathrm{out}}$ depends on the gains $a_1, \ldots, a_n$.
Now we describe the amplifier overload limits. $S_i$ will denote the signal level at the output of the
ith amplifier. These signal levels are related by

\[ S_0 = S_{\mathrm{in}}, \qquad S_i = a_i S_{i-1}, \quad i = 1, \ldots, n, \]

where $S_{\mathrm{in}} > 0$ is the input signal level. Each amplifier has a maximum allowable output level
$M_i > 0$ (which is given). (If this level is exceeded the amplifier will distort the signal.) Thus we
have the constraints $S_i \le M_i$, for $i = 1, \ldots, n$. (We can ignore the noise in the overload condition,
since the signal levels are much larger than the noise levels.)
The maximum output signal level $S_{\max}$ is defined as the maximum value of $S_n$, over all input signal
levels $S_{\mathrm{in}}$ that respect the overload constraints $S_i \le M_i$. Of course $S_{\max} \le M_n$, but it can be
smaller, depending on the gains $a_1, \ldots, a_n$.
The dynamic range D of the system is defined as D = Smax /Nout . Evidently it is a (rather
complicated) function of the amplifier gains a1 , . . . , an .
The goal is to choose the gains $a_i$ to maximize the dynamic range D, subject to the constraint
$\prod_i a_i = A^{\mathrm{tot}}$, and upper bounds on the amplifier gains, $a_i \le A_i^{\max}$ (which are given).
Explain how to solve this problem as a convex (or quasiconvex) optimization problem. If you intro-
duce new variables, or transform the variables, explain. Clearly give the objective and inequality
constraint functions, explaining why they are convex if it is not obvious. If your problem involves
equality constraints, give them explicitly.
Carry out your method on the specific instance with n = 4, and data

\[
\begin{array}{rcl}
A^{\mathrm{tot}} &=& 10000, \\
\alpha &=& (10^{-5}, 10^{-2}, 10^{-2}, 10^{-2}), \\
M &=& (0.1, 5, 10, 10), \\
A^{\max} &=& (40, 40, 40, 20).
\end{array}
\]

Give the optimal gains, and the optimal dynamic range.


Solution. We have the constraints

\[ a_1 \cdots a_n = A^{\mathrm{tot}}, \qquad a_i \le A_i^{\max}, \quad i = 1, \ldots, n. \]

You should already be getting that GP (geometric programming) feeling, given the form of these
two constraints. Indeed, that is what we are going to end up with.
To get an expression for the output noise level, we first write the recursion as

\[ N_0^2 = 0, \qquad N_i^2 = a_i^2 \left( N_{i-1}^2 + \alpha_i^2 \right), \quad i = 1, \ldots, n. \]

Then we can express the square of the output noise level (i.e., the output noise power) as

\[ N_{\mathrm{out}}^2 = \sum_{i=1}^n \alpha_i^2 \prod_{j=i}^n a_j^2. \]

The amplifier overload constraints can be expressed as

\[ S_i = S_{\mathrm{in}} \prod_{j=1}^i a_j \le M_i, \quad i = 1, \ldots, n. \]

Hence the maximum possible input signal level is

\[ S_{\mathrm{in,max}} = \min_i \frac{M_i}{\prod_{j=1}^i a_j}. \]

Since $S_n = A^{\mathrm{tot}} S_{\mathrm{in}}$, we can express the maximum output signal level as

\[ S_{\max} = A^{\mathrm{tot}} S_{\mathrm{in,max}} = \min_i M_i \prod_{j=i+1}^n a_j \]

(we interpret $\prod_{j=n+1}^n a_j$ as 1).
Therefore the dynamic range is given by

\[ D = \frac{\min_i M_i \prod_{j=i+1}^n a_j}{\left( \sum_{i=1}^n \alpha_i^2 \prod_{j=i}^n a_j^2 \right)^{1/2}} \]

(not exactly what you'd call a simple function of $a_i$).
To maximize D, we can minimize $1/D^2$. To do this, we introduce a new variable t, and minimize t
subject to

\[ \sum_{k=1}^n \alpha_k^2 \prod_{j=k}^n a_j^2 \le t M_i^2 \prod_{j=i+1}^n a_j^2, \quad i = 1, \ldots, n, \]

and the constraints

\[ a_1 \cdots a_n = A^{\mathrm{tot}}, \qquad a_i \le A_i^{\max}, \quad i = 1, \ldots, n. \]

This is a GP: in each inequality the left-hand side is a posynomial and the right-hand side is a
monomial in the variables (a, t).

% optimal amplifier gains


n = 4; % 4 amplifier stages
Atot = 10000; % 80dB total gain
alpha = [1e-5 1e-2 1e-2 1e-2]; % amplifier input-referred noise levels
M = [0.1 5 10 10]'; % amplifier overload limits (column, to match S)
Amax = [40 40 40 20]'; % amplifier max gains (column, to match a)

cvx_begin gp
variables a(n) S(n) Sin
prod(a) == Atot; % overall gain is A
a <= Amax; % max amplifier stage gains
% noise levels
Nsquare(1) = a(1)^2*alpha(1)^2;
for i=2:n
Nsquare(i) = a(i)^2*(Nsquare(i-1)+alpha(i)^2);
end
% signal levels
S(1) == a(1)*Sin;
for i=2:n
S(i) == a(i)*S(i-1);
end
S <= M; % signal level limits
maximize (S(n)/sqrt(Nsquare(n)))
cvx_end

opt_dyn_range = cvx_optval

opt_gains = a

signal_levels_and_limits = [S M]
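
The closed-form expressions for the noise power and the maximum signal level are easy to sanity-check numerically. The sketch below evaluates D for one arbitrary feasible choice of gains (all equal to 10, which is not the optimum) and compares the noise recursion with the closed-form sum; the function name `dynamic_range` is ours, not part of the exercise.

```python
import math

def dynamic_range(a, alpha, M):
    """Evaluate D = S_max / N_out for gains a, noise levels alpha, limits M."""
    n = len(a)
    # Output noise power from the recursion N_i^2 = a_i^2 (N_{i-1}^2 + alpha_i^2).
    N_sq = 0.0
    for ai, al in zip(a, alpha):
        N_sq = ai ** 2 * (N_sq + al ** 2)
    # Same quantity from the closed form sum_i alpha_i^2 prod_{j>=i} a_j^2.
    N_sq_closed = sum(alpha[i] ** 2 * math.prod(x ** 2 for x in a[i:])
                      for i in range(n))
    # S_max = min_i M_i prod_{j>i} a_j (an empty product equals 1).
    S_max = min(M[i] * math.prod(a[i + 1:]) for i in range(n))
    return S_max / math.sqrt(N_sq), N_sq, N_sq_closed

# Problem data from the exercise; the trial gains are an arbitrary feasible
# point (total gain 10^4, each gain within its limit), not the optimal gains.
alpha = [1e-5, 1e-2, 1e-2, 1e-2]
M = [0.1, 5.0, 10.0, 10.0]
a_trial = [10.0, 10.0, 10.0, 10.0]

D, N_sq, N_sq_closed = dynamic_range(a_trial, alpha, M)
print(D, N_sq, N_sq_closed)
```

Any value of D obtained this way is a lower bound on the optimal dynamic range reported by the GP.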

11.4 Blending existing circuit designs. In circuit design, we must select the widths of a set of n compo-
nents, given by the vector w = (w1 , . . . , wn ), which must satisfy width limits

\[ W^{\min} \le w_i \le W^{\max}, \quad i = 1, \ldots, n, \]

where W min and W max are given (positive) values. (You can assume there are no other constraints
on w.) The design is judged by three objectives, each of which we would like to be small: the
circuit power P (w), the circuit delay D(w), and the total circuit area A(w). These three objectives
are (complicated) posynomial functions of w.
You do not know the functions P , D, or A. (That is, you do not know the coefficients or exponents
in the posynomial expressions.) You do know a set of k designs, given by $w^{(1)}, \ldots, w^{(k)} \in \mathbf{R}^n$, and
their associated objective values

\[ P(w^{(j)}), \quad D(w^{(j)}), \quad A(w^{(j)}), \quad j = 1, \ldots, k. \]

You can assume that these designs satisfy the width limits. The goal is to find a design w that
satisfies the width limits, and the design specifications

\[ P(w) \le P_{\mathrm{spec}}, \qquad D(w) \le D_{\mathrm{spec}}, \qquad A(w) \le A_{\mathrm{spec}}, \]

where Pspec , Dspec , and Aspec are given.


Now consider the specific data given in blend_design_data.m. Give the following.

• A feasible design (i.e., w) that satisfies the specifications.
• A clear argument as to how you know that your design satisfies the specifications, even though
you do not know the formulas for P, D, and A.
• Your method for finding w, including any code that you write.

Hints/comments.

• You do not need to know anything about circuit design to solve this problem.
• See the title of this problem.

Solution. We're given very little to go on in this problem! Clearly, the solution must rely on some
basic properties of posynomials, since we are not even told what the posynomials P, D, and A are,
other than their values at the designs $w^{(1)}, \ldots, w^{(k)}$.

First ideas and attempt. The title of this course, and the title of the problem, suggest that
you should consider blended designs, which have the form

\[ w = \theta_1 w^{(1)} + \cdots + \theta_k w^{(k)}, \]

where $\theta \succeq 0$, $\mathbf{1}^T \theta = 1$. When the function f is convex, we can say that

\[ f(w) \le \theta_1 f(w^{(1)}) + \cdots + \theta_k f(w^{(k)}), \]

by Jensen's inequality. We're getting close, since we can now make a statement that our design has
an f-value less than some other number that depends on the f-values at our given designs, and our
blending parameters. But no cigar, since P, D, and A are posynomials, and therefore not known
to be convex.

An attempt that works. A little reflection leads us in the right direction. When f is a posyn-
omial function, the function g defined by

\[ g(x) = \log f(\exp x) \]

is convex, where exp x is meant elementwise. Therefore the functions log P, log D, and log A are
convex functions of the variables x = log w. And this means, for example, that

\[ \log P(\exp x) \le \theta_1 \log P(\exp x^{(1)}) + \cdots + \theta_k \log P(\exp x^{(k)}), \]

whenever $x = \theta_1 x^{(1)} + \cdots + \theta_k x^{(k)}$ with $\theta \succeq 0$, $\mathbf{1}^T \theta = 1$, and $x^{(j)} = \log w^{(j)}$. Letting $P^{(j)}$ denote $P(w^{(j)})$, and defining the
blended design w as

\[ \log w = \theta_1 \log w^{(1)} + \cdots + \theta_k \log w^{(k)}, \]

we write the inequality above as

\[ \log P(w) \le \theta_1 \log P^{(1)} + \cdots + \theta_k \log P^{(k)}. \]

We've now got a method to form a blended design, for which we can predict an upper bound on
how large the power can be. It's just linear interpolation on a log scale; equivalently, the geometric
interpolation

\[ w_i = (w_i^{(1)})^{\theta_1} \cdots (w_i^{(k)})^{\theta_k}. \]
Note that w will satisfy the width limits, since the original designs do.
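
The Jensen bound for geometric blending can be checked on a toy example. In the sketch below the posynomial, the three designs, and the blending weights are all invented for illustration (the actual P, D, and A are unknown by assumption); the point is only that the log of the blended value never exceeds the weighted average of the logs.

```python
import math

def posynomial(w):
    # Hypothetical posynomial, made up for illustration: positive
    # coefficients, arbitrary real exponents.
    return 2.0 * w[0] ** 1.5 / w[1] + 0.5 * w[1] ** 2 + 3.0 / (w[0] * w[1]) ** 0.5

def geometric_blend(designs, theta):
    """Componentwise prod_j (w_i^(j))^theta_j, i.e., averaging in log space."""
    n = len(designs[0])
    return [math.exp(sum(t * math.log(w[i]) for t, w in zip(theta, designs)))
            for i in range(n)]

designs = [[1.0, 4.0], [3.0, 0.5], [2.0, 2.0]]   # assumed designs w^(1), w^(2), w^(3)
theta = [0.2, 0.5, 0.3]                           # nonnegative, sums to one

w = geometric_blend(designs, theta)
log_f_blend = math.log(posynomial(w))
log_bound = sum(t * math.log(posynomial(d)) for t, d in zip(theta, designs))
print(w, log_f_blend, log_bound)
```

Each blended width is a weighted geometric mean of the corresponding original widths, so it stays between their minimum and maximum, which is why the width limits are preserved.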
Of course similar inequalities hold for the delay and area. We can try to find a blend that meets
the specifications by solving the LP feasibility problem

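\[
\begin{array}{ll}
\mbox{find} & \theta \\
\mbox{subject to} & \sum_{j=1}^k \theta_j \log P^{(j)} \le \log P_{\mathrm{spec}} \\
& \sum_{j=1}^k \theta_j \log D^{(j)} \le \log D_{\mathrm{spec}} \\
& \sum_{j=1}^k \theta_j \log A^{(j)} \le \log A_{\mathrm{spec}} \\
& \mathbf{1}^T \theta = 1, \quad \theta \succeq 0,
\end{array}
\]

with variable $\theta$ (the blending parameter).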


It's important to understand exactly what we have done here. If the LP above is feasible, then
the blend obtained from the feasible $\theta$ (and done geometrically, as described above) must satisfy
the design specifications. We can make this statement without knowing the specific posynomial
expressions for P, D, and A. It's nothing more than Jensen's inequality, but even so, it's pretty
interesting.
On the other hand, if the LP above is infeasible, then we can say nothing. The design specifications
could be infeasible, or feasible; we do not know.
The following code attempts to find a feasible blending parameter in Matlab.

% solution for circuit blending problem


blend_design_data;
% find feasible blend factors
cvx_begin

variable theta(k)
log(P)*theta <= log(P_spec);
log(D)*theta <= log(D_spec);
log(A)*theta <= log(A_spec);
sum(theta) == 1;
theta >= 0;
cvx_end
% now create design
w = exp(log(W)*theta)

Here is a Python version.

import numpy as np
import cvxpy as cvx
from blend_design_data import *

theta = cvx.Variable(k)
# log objective values of the k given designs, weighted by theta
constraints = [np.log(P) @ theta <= np.log(P_spec)]
constraints += [np.log(D) @ theta <= np.log(D_spec)]
constraints += [np.log(A) @ theta <= np.log(A_spec)]
constraints += [cvx.sum(theta) == 1, theta >= 0]

cvx.Problem(cvx.Minimize(0), constraints).solve()

# now create design (geometric blend: average the log widths)
w = np.exp(np.log(W) @ theta.value)
print(w)
print(theta.value)

And finally, Julia.

# solution for circuit blending problem


include("blend_design_data.jl");
# find feasible blend factors

using Convex
theta = Variable(k);
constraints = [
log(P)*theta <= log(P_spec);
log(D)*theta <= log(D_spec);
log(A)*theta <= log(A_spec);
sum(theta) == 1;
theta >= 0;
];
p = satisfy(constraints);
solve!(p);

# now create design
w = exp(log(W)*theta.value);

It turns out it's feasible, so we've solved the problem! So indeed there exists a design which satisfies
the specifications. Our blending parameter (from Matlab) is

\[ \theta = (0.0128, 0.5096, 0.0050, 0.4626, 0.0075, 0.0026), \]

resulting in widths

w = (2.6203, 3.3033, 2.9562, 3.2743, 2.3037, 3.6820, 2.9177, 3.7012, 3.9140, 3.4115).

This solution is not unique, so you might be interested to know how we graded the numerical
answers. (Of course, the main point was not to get the right answer, but to realize that you must
do the blending in log space, and to clearly explain why you could be certain that your design
meets the specifications.)
To grade the numerical answers, we found the bounding box (i.e., smallest and largest possible
value of each width) over the inequalities above. The bounds were:

\[
\begin{array}{rcl}
\underline{w} &=& (2.5064, 3.1177, 2.8817, 3.2293, 2.2109, 3.5823, 2.8526, 3.5293, 3.7406, 3.3164), \\
\overline{w} &=& (2.8151, 3.4431, 3.0924, 3.3212, 2.4757, 3.7381, 3.0056, 3.8082, 4.0065, 3.4695).
\end{array}
\]

Any number outside these ranges is wrong. (However, numbers inside the ranges need not be
correct.)

An approach that almost works. We mention another approach which is correct, but fails to
solve this particular problem instance. The functions P(exp x), D(exp x), and A(exp x) are convex,
where x = log w. (This follows immediately from the fact that their logs are convex, and the
exponential of a convex function is convex.) Using the notation above, we have that the inequality

\[ P(w) \le \theta_1 P^{(1)} + \cdots + \theta_k P^{(k)} \]

holds. It follows that if the feasibility LP

\[
\begin{array}{ll}
\mbox{find} & \theta \\
\mbox{subject to} & \sum_{j=1}^k \theta_j P^{(j)} \le P_{\mathrm{spec}} \\
& \sum_{j=1}^k \theta_j D^{(j)} \le D_{\mathrm{spec}} \\
& \sum_{j=1}^k \theta_j A^{(j)} \le A_{\mathrm{spec}} \\
& \mathbf{1}^T \theta = 1, \quad \theta \succeq 0,
\end{array}
\]

is feasible, then we've found a set of blending parameters that must achieve the given specs. The
logic here is absolutely correct. Note that the blending parameters here are used the same way as
above: they correspond to geometric or logarithmic blending, and not arithmetic blending. This LP
is more stringent than the first one given above; that is, it has a smaller feasible set (by concavity
of the logarithm, $\sum_j \theta_j \log P^{(j)} \le \log \sum_j \theta_j P^{(j)}$), which means
it will work in fewer cases than the one given above.
For the given example, this LP is not feasible, so for this particular problem instance, this approach
won't work. (You might ask: Did the teaching staff intentionally arrange for the given problem
instance to fail using this approach? Would they really do something like that?)

11.5 Solving nonlinear circuit equations using convex optimization. An electrical circuit consists of b
two-terminal devices (or branches) connected to n nodes, plus a so-called ground node. The goal is
to compute several sets of physical quantities that characterize the circuit operation. The vector of
branch voltages is $v \in \mathbf{R}^b$, where $v_j$ is the voltage appearing across device j. The vector of branch
currents is $i \in \mathbf{R}^b$, where $i_j$ is the current flowing through device j. (The symbol i, which is often
used to denote an index, is unfortunately the standard symbol used to denote current.) The vector
of node potentials is $e \in \mathbf{R}^n$, where $e_k$ is the potential of node k with respect to the ground node.
(The ground node has potential zero by definition.)
The circuit variables v, i, and e satisfy several physical laws. Kirchhoff's current law (KCL) can
be expressed as $Ai = 0$, and Kirchhoff's voltage law (KVL) can be expressed as $v = A^T e$, where
$A \in \mathbf{R}^{n \times b}$ is the reduced incidence matrix, which describes the circuit topology:

\[
A_{kj} = \left\{ \begin{array}{ll}
-1 & \mbox{branch $j$ enters node $k$} \\
+1 & \mbox{branch $j$ leaves node $k$} \\
0 & \mbox{otherwise,}
\end{array} \right.
\]

for $k = 1, \ldots, n$, $j = 1, \ldots, b$. (KCL states that current is conserved at each node, and KVL states
that the voltage across each branch is the difference of the potentials of the nodes it is connected
to.)
The branch voltages and currents are related by

\[ v_j = \phi_j(i_j), \quad j = 1, \ldots, b, \]

where $\phi_j$ is a given function that depends on the type of device j. We will assume that these
functions are continuous and nondecreasing. We give a few examples. If device j is a resistor with
resistance $R_j > 0$, we have $\phi_j(i_j) = R_j i_j$ (which is called Ohm's law). If device j is a voltage
source with voltage $V_j$ and internal resistance $r_j > 0$, we have $\phi_j(i_j) = V_j + r_j i_j$. And for a more
interesting example, if device j is a diode, we have $\phi_j(i_j) = V_T \log(1 + i_j / I_S)$, where $I_S$ and $V_T$ are
known positive constants.

(a) Find a method to solve the circuit equations, i.e., find v, i, and e that satisfy KCL, KVL,
and the branch equations, that relies on convex optimization. State the optimization problem
clearly, indicating what the variables are. Be sure to explain how solving the convex optimiza-
tion problem you propose leads to choices of the circuit variables that satisfy all of the circuit
equations. You can assume that no pathologies occur in the problem that you propose, for
example, it is feasible, a suitable constraint qualification holds, and so on.
Hint. You might find the function $\psi : \mathbf{R}^b \to \mathbf{R}$,

\[ \psi(i_1, \ldots, i_b) = \sum_{j=1}^b \int_0^{i_j} \phi_j(u_j) \, du_j, \]

useful.
(b) Consider the circuit shown in the diagram below. Device 1 is a voltage source with parameters
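$V_1 = 1000$, $r_1 = 1$. Devices 2 and 5 are resistors with resistance $R_2 = 1000$, and $R_5 = 100$
respectively. Devices 3 and 4 are identical diodes with parameters $V_T = 26$, $I_S = 1$. (The
units are mV, mA, and $\Omega$.)
The nodes are labeled $N_1$, $N_2$, and $N_3$; the ground node is at the bottom. The incidence
matrix A is

\[
A = \left[ \begin{array}{rrrrr}
1 & 1 & 0 & 0 & 0 \\
0 & -1 & 1 & 1 & 0 \\
0 & 0 & 0 & -1 & 1
\end{array} \right].
\]

(The reference direction for each edge is down or to the right.)
Use the method in part (a) to compute v, i, and e. Verify that all the circuit equations hold.

[Figure: circuit with voltage source V1 between N1 and ground, resistor R2 between N1 and N2, diode D3 between N2 and ground, diode D4 between N2 and N3, and resistor R5 between N3 and ground.]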

Solution.

(a) We first observe that the function $\psi$ given in the hint is convex: since $\phi_j$ is nondecreasing,

\[ \int_0^{i_j} \phi_j(u_j) \, du_j \]

is convex in $i_j$, and $\psi$ is the sum of these functions. We note that

\[ \nabla \psi(i) = (\phi_1(i_1), \ldots, \phi_b(i_b)). \]

The optimization problem to solve is

\[
\begin{array}{ll}
\mbox{minimize} & \psi(i) \\
\mbox{subject to} & Ai = 0,
\end{array}
\]

with variable $i \in \mathbf{R}^b$. The optimality conditions are $Ai = 0$ (which is KCL), and $\nabla \psi(i) + A^T \nu = 0$,
where $\nu$ is a dual variable associated with the constraint $Ai = 0$. Defining $v = \nabla \psi(i)$ and
$e = -\nu$, the optimality conditions can be expressed as

\[ Ai = 0, \qquad A^T e = v, \qquad v_j = \phi_j(i_j), \quad j = 1, \ldots, b. \]

These are exactly the circuit equations.

By the way, this characterization of a circuit in terms of an optimization problem was known
to J. C. Maxwell. The function $\psi$ is called the content function for the circuit.
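(b) Let us first compute the content function for each component.
For a resistor

\[ \int_0^i r u \, du = (1/2) r i^2. \]

For a voltage source

\[ \int_0^i (V + r u) \, du = V i + (1/2) r i^2. \]

For a diode

\[ \int_0^i V_T \log(1 + u/I_S) \, du = V_T I_S \left( (1 + i/I_S) \log(1 + i/I_S) - i/I_S \right). \]

We add the individual content functions for the circuit (writing $R_1 = r_1$), yielding the optimization problem

\[
\begin{array}{ll}
\mbox{minimize} & V_1 i_1 + (1/2) \sum_{j \in \{1,2,5\}} R_j i_j^2 + \sum_{j \in \{3,4\}} V_T I_S \left( (1 + i_j/I_S) \log(1 + i_j/I_S) - i_j/I_S \right) \\
\mbox{subject to} & Ai = 0.
\end{array}
\]

Solving this problem in CVX results in

\[
A^T e = \left[ \begin{array}{r} 999.017 \\ 982.966 \\ 16.051 \\ 3.154 \\ 12.897 \end{array} \right], \quad
i = \left[ \begin{array}{r} -0.983 \\ 0.983 \\ 0.854 \\ 0.129 \\ 0.129 \end{array} \right], \quad
e = \left[ \begin{array}{r} 999.017 \\ 16.051 \\ 12.897 \end{array} \right], \quad
\nabla \psi(i) = \left[ \begin{array}{r} 999.017 \\ 982.967 \\ 16.051 \\ 3.154 \\ 12.897 \end{array} \right],
\]

and

\[ \frac{\| \nabla \psi(i) - A^T e \|_2}{\| \nabla \psi(i) \|_2} = 3.996 \times 10^{-7}. \]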
We conclude that the solutions match.
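
The reported numbers can also be checked against the circuit equations by hand. The sketch below plugs the rounded values into KCL, KVL, and the branch equations, taking $i_1$ negative as KCL at node $N_1$ requires; since the values are rounded to three decimals, agreement is only expected to within a few hundredths of a mV (the factor $R_2 = 1000$ amplifies the rounding in $i_2$).

```python
import math

# Incidence matrix from the exercise (rows: nodes N1..N3, columns: branches 1..5).
A = [[1, 1, 0, 0, 0],
     [0, -1, 1, 1, 0],
     [0, 0, 0, -1, 1]]
V1, r1, R2, R5, VT, IS = 1000.0, 1.0, 1000.0, 100.0, 26.0, 1.0

# Rounded values reported in the solution (currents in mA, potentials in mV).
i = [-0.983, 0.983, 0.854, 0.129, 0.129]
e = [999.017, 16.051, 12.897]

# KCL residual A i, one entry per node.
kcl = [sum(A[k][j] * i[j] for j in range(5)) for k in range(3)]

# KVL: branch voltages v = A^T e.
v_kvl = [sum(A[k][j] * e[k] for k in range(3)) for j in range(5)]

# Branch equations v_j = phi_j(i_j) for source, resistor, two diodes, resistor.
v_branch = [V1 + r1 * i[0],
            R2 * i[1],
            VT * math.log(1 + i[2] / IS),
            VT * math.log(1 + i[3] / IS),
            R5 * i[4]]

mismatch = max(abs(x - y) for x, y in zip(v_kvl, v_branch))
print(kcl, v_kvl, v_branch, mismatch)
```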
In CVX (and POGS)
% Setup
A = [ 1 1 0 0 0
0 -1 1 1 0
0 0 0 -1 1
-1 0 -1 0 -1];

% Remove redundant ground constraint (ie force ground potential = 0)


A = A(1:end-1, :);

R1 = 1;
R2 = 1e3;
R5 = 1e2;

VT = 26;
IS = 1;
VS = 1e3;

cvx_begin
variable ii(5)
dual variable e

OBJ1 = VS*ii(1) + (1/2)*R1*ii(1)^2;
OBJ2 = (1/2)*R2*ii(2)^2;
OBJ3 = VT*IS*(-entr(1 + ii(3)/IS) - ii(3)/IS);
OBJ4 = VT*IS*(-entr(1 + ii(4)/IS) - ii(4)/IS);
OBJ5 = (1/2)*R5*ii(5)^2;
minimize(OBJ1 + OBJ2 + OBJ3 + OBJ4 + OBJ5)
subject to
e : A * ii == 0;
cvx_end

% Check Constraints
v = [VS + R1*ii(1)
R2*ii(2)
VT*log(1 + ii(3)/IS)
VT*log(1 + ii(4)/IS)
R5*ii(5)];

v_err = v - A' * e; % KVL: v = A^T e

fprintf('Relative error in voltage: %e\n', norm(v_err) / norm(v))

%% POGS

f.h = kIndEq0;
g.h = [kSquare; kSquare; kNegEntr; kNegEntr; kSquare];
g.a = [ 1; 1; 1/IS; 1/IS; 1];
g.b = [ 0; 0; -1; -1; 0];
g.c = [R1; R2; VT*IS; VT*IS; R5];
g.d = [VS; 0; -VT; -VT; 0];

[ii, ~, l] = pogs(A, f, g);


e = -l;

% Check Constraints
v = [VS + R1*ii(1)
R2*ii(2)
VT*log(1 + ii(3)/IS)
VT*log(1 + ii(4)/IS)
R5*ii(5)];

v_err = v - A' * e; % KVL: v = A^T e

fprintf('Relative error in voltage: %e\n', norm(v_err) / norm(v))

In Julia

# Pkg.update()
# Pkg.add("Convex")
# Pkg.add("SCS")

using Convex

# Setup
A = [ 1 1 0 0 0
0 -1 1 1 0
0 0 0 -1 1
-1 0 -1 0 -1];

# Remove redundant ground constraint


A = A[1:end-1, :];

R1 = 1
R2 = 1e3
R5 = 1e2

VT = 26
IS = 1
VS = 1e3

ii = Variable(5)

OBJ1 = VS*ii[1] + (1./2)*R1*ii[1]^2


OBJ2 = (1./2)*R2*ii[2]^2
OBJ3 = VT*IS*(-entropy(1. + ii[3]/IS) - ii[3]/IS)
OBJ4 = VT*IS*(-entropy(1. + ii[4]/IS) - ii[4]/IS)
OBJ5 = (1./2)*R5*ii[5]^2

problem = minimize(OBJ1 + OBJ2 + OBJ3 + OBJ4 + OBJ5, [A * ii == 0])


solve!(problem)
ee = -problem.constraints[1].dual

v = [VS + R1*ii.value[1]
R2*ii.value[2]
VT*log(1 + ii.value[3]/IS)
VT*log(1 + ii.value[4]/IS)
R5*ii.value[5]]

v_err = v - A' * ee  # KVL: v = A^T e

@printf("Relative error in voltage: %e\n", norm(v_err) / norm(v))

println(v)
println(ii)
In CVXPY
from cvxpy import *
import numpy as np
import math

A = np.array([[ 1, 1, 0, 0, 0],
[ 0, -1, 1, 1, 0],
[ 0, 0, 0, -1, 1],
[-1, 0, -1, 0, -1]], dtype=np.float64)
A = A[0:-1,:]

R1 = 1.
R2 = 1e3
R5 = 1e2

VT = 26
IS = 1
VS = 1e3

ii = Variable(5)

OBJ1 = VS*ii[0] + (1./2)*R1*square(ii[0])


OBJ2 = (1./2)*R2*square(ii[1])
OBJ3 = VT*IS*(-entr(1. + ii[2]/IS) - ii[2]/IS)
OBJ4 = VT*IS*(-entr(1. + ii[3]/IS) - ii[3]/IS)
OBJ5 = (1./2)*R5*square(ii[4])

obj = Minimize(OBJ1 + OBJ2 + OBJ3 + OBJ4 + OBJ5)


constr = [A * ii == 0.]
problem = Problem(obj, constr)

#problem.solve(verbose=True, solver=SCS, eps=1e-4)


problem.solve(verbose=True)

e = -problem.constraints[0].dual_value

v = np.array([[VS + R1*float(ii.value[0])],
[R2*float(ii.value[1])],
[VT*math.log(1. + float(ii.value[2])/IS)],
[VT*math.log(1. + float(ii.value[3])/IS)],
[R5*float(ii.value[4])]])

# KVL: v = A^T e (reshape e to a column so the shapes match v)
v_err = v - A.T @ np.asarray(e).reshape(-1, 1)

rel_err = np.linalg.norm(v_err) / np.linalg.norm(v)

print("Relative error in voltage: %e\n" % rel_err)

print(v)
print(ii.value)

12 Signal processing and communications
12.1 FIR low-pass filter design. Consider the (symmetric, linear phase) finite impulse response (FIR)
filter described by its frequency response

\[ H(\omega) = a_0 + \sum_{k=1}^N a_k \cos k\omega, \]

where $\omega \in [0, \pi]$ is the frequency. The design variables in our problems are the real coefficients
$a = (a_0, \ldots, a_N) \in \mathbf{R}^{N+1}$, where N is called the order or length of the FIR filter. In this problem
we will explore the design of a low-pass filter, with specifications:

• For $0 \le \omega \le \pi/3$, $0.89 \le H(\omega) \le 1.12$, i.e., the filter has about ±1dB ripple in the passband
$[0, \pi/3]$.
• For $\omega_c \le \omega \le \pi$, $|H(\omega)| \le \alpha$. In other words, the filter achieves an attenuation given by $\alpha$ in
the stopband $[\omega_c, \pi]$. Here $\omega_c$ is called the filter cutoff frequency.

(It is called a low-pass filter since low frequencies are allowed to pass, but frequencies above the
cutoff frequency are attenuated.) These specifications are depicted graphically in the figure below.

[Figure: the filter specifications, showing the passband bounds 0.89 and 1.12 on [0, π/3] and the stopband bounds ±α on [ω_c, π].]

For parts (a)–(c), explain how to formulate the given problem as a convex or quasiconvex optimiza-
tion problem.

(a) Maximum stopband attenuation. We fix $\omega_c$ and N, and wish to maximize the stopband atten-
uation, i.e., minimize $\alpha$.
(b) Minimum transition band. We fix N and $\alpha$, and want to minimize $\omega_c$, i.e., we set the stopband
attenuation and filter length, and wish to minimize the transition band (between $\pi/3$ and
$\omega_c$).
(c) Shortest length filter. We fix $\omega_c$ and $\alpha$, and wish to find the smallest N that can meet the
specifications, i.e., we seek the shortest length FIR filter that can meet the specifications.
(d) Numerical filter design. Use CVX to find the shortest length filter that satisfies the filter
specifications with
\[ \omega_c = 0.4\pi, \qquad \alpha = 0.0316. \]
(The attenuation corresponds to −30dB.) For this subproblem, you may sample the constraints
in frequency, which means the following. Choose K large (say, 500; an old rule of thumb is that
K should be at least 15N), and set $\omega_k = k\pi/K$, $k = 0, \ldots, K$. Then replace the specifications
with
• For k with $0 \le \omega_k \le \pi/3$, $0.89 \le H(\omega_k) \le 1.12$.
• For k with $\omega_c \le \omega_k \le \pi$, $|H(\omega_k)| \le \alpha$.
Plot $H(\omega)$ versus $\omega$ for your design.

Solution.

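(a) The first problem can be expressed as

\[
\begin{array}{ll}
\mbox{minimize} & \alpha \\
\mbox{subject to} & f_1(a) \le 1.12 \\
& f_2(a) \ge 0.89 \\
& f_3(a) \le \alpha \\
& f_4(a) \ge -\alpha,
\end{array}
\]

with variables a and $\alpha$, where

\[ f_1(a) = \sup_{0 \le \omega \le \pi/3} H(\omega), \qquad f_2(a) = \inf_{0 \le \omega \le \pi/3} H(\omega), \]

\[ f_3(a) = \sup_{\omega_c \le \omega \le \pi} H(\omega), \qquad f_4(a) = \inf_{\omega_c \le \omega \le \pi} H(\omega). \]

For each $\omega$, $H(\omega)$ is a linear function of a. Hence the functions $f_1$ and $f_3$ are convex, and $f_2$
and $f_4$ are concave. It follows that the problem is convex.
(b) This problem can be expressed

\[
\begin{array}{ll}
\mbox{minimize} & f_5(a) \\
\mbox{subject to} & f_1(a) \le 1.12 \\
& f_2(a) \ge 0.89
\end{array}
\]

where $f_1$ and $f_2$ are the same functions as above, and

\[ f_5(a) = \inf \{ \Omega \mid -\alpha \le H(\omega) \le \alpha \mbox{ for } \Omega \le \omega \le \pi \}. \]

This is a quasiconvex optimization problem in the variables a because $f_1$ is convex, $f_2$ is
concave, and $f_5$ is quasiconvex: its sublevel sets are

\[ \{ a \mid f_5(a) \le \Omega \} = \{ a \mid -\alpha \le H(\omega) \le \alpha \mbox{ for } \Omega \le \omega \le \pi \}, \]

i.e., the intersection of an infinite number of halfspaces.
(c) This problem can be expressed as

\[
\begin{array}{ll}
\mbox{minimize} & f_6(a) \\
\mbox{subject to} & f_1(a) \le 1.12 \\
& f_2(a) \ge 0.89 \\
& f_3(a) \le \alpha \\
& f_4(a) \ge -\alpha
\end{array}
\]

where $f_1$, $f_2$, $f_3$, and $f_4$ are defined above and

\[ f_6(a) = \min \{ k \mid a_{k+1} = \cdots = a_N = 0 \}. \]

The sublevel sets of $f_6$ are affine sets:

\[ \{ a \mid f_6(a) \le k \} = \{ a \mid a_{k+1} = \cdots = a_N = 0 \}. \]

This means $f_6$ is a quasiconvex function, and again we have a quasiconvex optimization prob-
lem.
(d) After discretizing we can express the problem in part (c) for a given length N as the feasibility
LP
\[
\begin{array}{l}
0.89 \le H(\omega_k) \le 1.12 \quad \mbox{for } 0 \le \omega_k \le \pi/3, \\
-\alpha \le H(\omega_k) \le \alpha \quad \mbox{for } \omega_c \le \omega_k \le \pi,
\end{array}
\]
with variable a. (For fixed $\omega_k$, $H(\omega_k)$ is an affine function of a, hence all constraints in this
problem are linear inequalities in a.)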
The following code carries out the quasi-convex optimization. We find that the shortest filter
has length N = 16.

clear all;
K = 500;
wp = pi/3; wc = .4*pi; alpha = 0.0316;
w = linspace(0, pi, K)'; % column vector of sampled frequencies
wi = max(find(w<=wp)); wo = min(find(w>=wc));
for N = 1:50 %we could have done bisection, but we were lazy this time
k = [0:1:N];
C = cos(w*k); % K x (N+1) matrix, C(i,j) = cos(k(j)*w(i))
cvx_begin
variable a(N+1)
%passband constraints
C(1:wi,:)*a <= 1.12;
C(1:wi,:)*a >= 0.89;
cos(wp*linspace(0,N,N+1))*a >= 0.89
%stopband constraints
C(wo:K,:)*a <= alpha;
C(wo:K,:)*a >= -alpha;
cos(wc*linspace(0,N,N+1))*a <= alpha
cvx_end
if (strcmp(cvx_status,'Solved') == 1)
break;
end
end

H = cos(w*k)*a; % frequency response at the sampled frequencies
plot(w,H)
set(gca,'YTick',[-alpha 0 alpha .89 1 1.12])
axis([0 pi -alpha 1.12])
hold on
plot([0 wp wp],[.89 .89 -alpha],':')
plot([wc wc pi],[1.12 alpha alpha],':')
xlabel('\omega')
ylabel('H(\omega)')
hold off
It yields the following filter.

[Figure: H(ω) versus ω on [0, π] for the resulting design, with the levels −0.0316, 0, 0.0316, 0.89, and 1.12 marked on the vertical axis.]
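
The sampled specifications from part (d) can also be checked directly for any candidate coefficient vector, without a solver. The following is a minimal sketch; the helper names `H` and `meets_sampled_specs` are ours, and the two candidate filters are deliberately trivial ones that fail, since the coefficients of the actual optimal design are not listed in the text.

```python
import math

def H(a, w):
    """Frequency response H(w) = a_0 + sum_k a_k cos(k w)."""
    return sum(ak * math.cos(k * w) for k, ak in enumerate(a))

def meets_sampled_specs(a, wc=0.4 * math.pi, alpha=0.0316, K=500):
    """Check the sampled passband and stopband inequalities at w_k = k*pi/K."""
    for k in range(K + 1):
        w = k * math.pi / K
        h = H(a, w)
        if w <= math.pi / 3 and not (0.89 <= h <= 1.12):
            return False          # passband ripple violated
        if w >= wc and abs(h) > alpha:
            return False          # stopband attenuation violated
    return True

# Two hypothetical candidates, both far too short to meet the specifications.
constant_ok = meets_sampled_specs([1.0])          # H(w) = 1 everywhere
raised_cos_ok = meets_sampled_specs([0.5, 0.5])   # H(w) = 0.5 + 0.5 cos w
print(constant_ok, raised_cos_ok)
```

Applied to the coefficient vector returned by the CVX run, the same function gives an independent check that the design satisfies every sampled inequality.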

12.2 SINR maximization. Solve the following instance of problem 4.20: We have n = 5 transmitters,
grouped into two groups: {1, 2} and {3, 4, 5}. The maximum power for each transmitter is 3, the
total power limit for the first group is 4, and the total power limit for the second group is 6. The
noise is equal to 0.5 and the limit on total received power is 5 for each receiver. Finally, the path
gain matrix is given by
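\[
G = \left[\begin{array}{ccccc}
1.0 & 0.1 & 0.2 & 0.1 & 0.0 \\
0.1 & 1.0 & 0.1 & 0.1 & 0.0 \\
0.2 & 0.1 & 2.0 & 0.2 & 0.2 \\
0.1 & 0.1 & 0.2 & 1.0 & 0.1 \\
0.0 & 0.0 & 0.2 & 0.1 & 1.0
\end{array}\right].
\]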
Find the transmitter powers $p_1, \ldots, p_5$ that maximize the minimum SINR over all receivers.
Also report the maximum SINR value. Solving the problem to an accuracy of 0.05 (in SINR) is
fine.
Hint. When implementing a bisection method in CVX, you will need to check feasibility of a convex
problem. You can do this using strcmpi(cvx_status, 'Solved').
Solution. The Matlab code used to solve the problem is given at the end of this solution. The
optimal values of the transmitted powers are: $p_1 = 2.1188$, $p_2 = 1.8812$, $p_3 = 1.6444$, $p_4 = 2.3789$,
$p_5 = 1.8011$. The maximum SINR is 1.6884.

% SINR maximization (an instance of exercise 4.20).
n = 5;
G = [1   0.1 0.2 0.1 0
     0.1 1   0.1 0.1 0
     0.2 0.1 2   0.2 0.2
     0.1 0.1 0.2 1   0.1
     0   0   0.2 0.1 1];
sigma = 0.5;
Pmax = 3;

% set up lower and upper bounds on t = 1/SINR
l = 0;
u = 100;
tol = 1e-4;
Gtilde = G - diag(diag(G));   % off-diagonal (interference) gains

% use bisection to solve linear-fractional problem
while u-l > tol
    t = (l+u)/2;

    % solve feasibility problem for this value of t
    cvx_begin
        cvx_quiet(true);
        variable p(n);
        Gtilde*p + sigma*ones(n,1) <= t * diag(G).*p;
        p >= 0;
        p <= Pmax;
        p(1)+p(2) <= 4;
        p(3)+p(4)+p(5) <= 6;
        G*p <= 5;
    cvx_end

    if strcmpi(cvx_status, 'Solved')
        u = t;
        % save best values
        pstar = p;
        sstar = 1/t;
    else
        l = t;
    end
end

% output results
pstar
sstar
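
As a quick sanity check on the numbers reported above, the following Python sketch evaluates the SINR at each receiver for the reported powers and tests the power limits. It is not part of the original solution; the helper names `received` and `sinr` are illustrative, and a small tolerance is used because the reported powers are rounded to four decimals.

```python
# Sanity check of the reported solution of exercise 12.2 (pure Python, no solver).
G = [[1.0, 0.1, 0.2, 0.1, 0.0],
     [0.1, 1.0, 0.1, 0.1, 0.0],
     [0.2, 0.1, 2.0, 0.2, 0.2],
     [0.1, 0.1, 0.2, 1.0, 0.1],
     [0.0, 0.0, 0.2, 0.1, 1.0]]
sigma = 0.5                                   # receiver noise power
p = [2.1188, 1.8812, 1.6444, 2.3789, 1.8011]  # reported optimal powers
n = len(p)

def received(i):
    """Total power received at receiver i (signal plus interference)."""
    return sum(G[i][j] * p[j] for j in range(n))

def sinr(i):
    """Signal power over interference-plus-noise power at receiver i."""
    signal = G[i][i] * p[i]
    return signal / (received(i) - signal + sigma)

sinrs = [sinr(i) for i in range(n)]
min_sinr = min(sinrs)
feasible = (all(0 <= pi <= 3 for pi in p)
            and p[0] + p[1] <= 4 + 1e-9
            and p[2] + p[3] + p[4] <= 6 + 1e-9
            and all(received(i) <= 5 for i in range(n)))
print([round(s, 4) for s in sinrs], feasible)
```

All five SINR values agree to about three decimals, which is what one expects at a max-min optimum where no receiver can be improved without hurting another.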

12.3 Power control for sum rate maximization in interference channel. We consider the optimization
problem
\[
\begin{array}{ll}
\mbox{maximize} & \displaystyle \sum_{i=1}^n \log\left(1 + \frac{p_i}{\sum_{j \neq i} A_{ij} p_j + v_i}\right) \\
\mbox{subject to} & \displaystyle \sum_{i=1}^n p_i = 1 \\
& p_i \geq 0, \quad i = 1, \ldots, n
\end{array}
\]
with variables $p \in \mathbf{R}^n$. The problem data are the matrix $A \in \mathbf{R}^{n \times n}$ and the vector $v \in \mathbf{R}^n$.
We assume $A$ and $v$ are componentwise nonnegative ($A_{ij} \geq 0$ and $v_i \geq 0$), and that the diagonal
elements of $A$ are equal to one. If the off-diagonal elements of $A$ are zero ($A = I$), the problem
has a simple solution, given by the waterfilling method. We are interested in the case where the
off-diagonal elements are nonzero.
We can give the following interpretation of the problem, which is not needed below. The variables
in the problem are the transmission powers in a communications system. We limit the total power
to one (for simplicity; we could have used any other number). The ith term in the objective is
the Shannon capacity of the ith channel; the fraction in the argument of the log is the signal to
interference plus noise ratio.
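We can express the problem as
\[
\begin{array}{ll}
\mbox{maximize} & \displaystyle \sum_{i=1}^n \log\left(\frac{\sum_{j=1}^n B_{ij} p_j}{\sum_{j=1}^n B_{ij} p_j - p_i}\right) \\
\mbox{subject to} & \displaystyle \sum_{i=1}^n p_i = 1 \\
& p_i \geq 0, \quad i = 1, \ldots, n,
\end{array}
\qquad (49)
\]
where $B \in \mathbf{R}^{n \times n}$ is defined as $B = A + v\mathbf{1}^T$, i.e., $B_{ij} = A_{ij} + v_i$, $i, j = 1, \ldots, n$. Suppose $B$ is
nonsingular and
\[
B^{-1} = I - C
\]
with $C_{ij} \geq 0$. Express the problem above as a convex optimization problem. Hint. Use $y = Bp$ as
variables.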

Solution. We make a change of variables
\[
y = Bp, \qquad p = B^{-1} y = y - Cy.
\]
If we substitute this in (49) the cost function reduces to
\[
\sum_{i=1}^n \log(y_i/(Cy)_i) = \log \prod_i (y_i/(Cy)_i),
\]
and the last constraint to
\[
y_i \geq (Cy)_i, \quad i = 1, \ldots, n.
\]
Since this constraint and the objective are homogeneous, we can replace the constraint $\mathbf{1}^T p = 1$ by
another normalization, e.g.,
\[
\prod_i y_i = 1.
\]
This results in the GP
\[
\begin{array}{ll}
\mbox{minimize} & \prod_i (Cy)_i \\
\mbox{subject to} & y_i \geq (Cy)_i, \quad i = 1, \ldots, n \\
& \prod_i y_i = 1.
\end{array}
\]
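
The change of variables can be checked numerically on a small instance. The Python sketch below uses a hypothetical $2 \times 2$ example (the data `A`, `v`, and the test point `p` are made up for illustration) and confirms that $C = I - B^{-1}$ is nonnegative for this data, that $y_i \geq (Cy)_i$, and that the original objective equals $\sum_i \log(y_i/(Cy)_i)$.

```python
import math

# Illustrative 2x2 instance (hypothetical data): unit diagonal, nonnegative entries.
A = [[1.0, 0.1], [0.1, 1.0]]
v = [1.0, 1.0]
n = 2
B = [[A[i][j] + v[i] for j in range(n)] for i in range(n)]   # B = A + v 1^T

# Explicit 2x2 inverse, then C = I - B^{-1}.
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
Binv = [[B[1][1] / det, -B[0][1] / det], [-B[1][0] / det, B[0][0] / det]]
C = [[(1.0 if i == j else 0.0) - Binv[i][j] for j in range(n)] for i in range(n)]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]

p = [0.3, 0.7]                    # a feasible power vector (sums to one)
y = matvec(B, p)                  # change of variables y = Bp
Cy = matvec(C, y)                 # equals y - p

# Original objective: sum_i log(1 + p_i / (sum_{j != i} A_ij p_j + v_i)).
original = sum(
    math.log(1 + p[i] / (sum(A[i][j] * p[j] for j in range(n) if j != i) + v[i]))
    for i in range(n))
# Transformed objective: sum_i log(y_i / (Cy)_i).
transformed = sum(math.log(y[i] / Cy[i]) for i in range(n))
print(original, transformed)
```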

12.4 Radio-relay station placement and power allocation. Radio relay stations are to be located at
positions $x_1, \ldots, x_n \in \mathbf{R}^2$, and transmit at power $p_1, \ldots, p_n \geq 0$. In this problem we will consider the
problem of simultaneously deciding on good locations and operating powers for the relay stations.
The received signal power $S_{ij}$ at relay station $i$ from relay station $j$ is proportional to the transmit
power and inversely proportional to the distance, i.e.,
\[
S_{ij} = \frac{\alpha p_j}{\|x_i - x_j\|^2},
\]
where $\alpha > 0$ is a known constant.
Relay station $j$ must transmit a signal to relay station $i$ at the rate (or bandwidth) $R_{ij} \geq 0$ bits
per second; $R_{ij} = 0$ means that relay station $j$ does not need to transmit any message (directly)
to relay station $i$. The matrix of bit rates $R_{ij}$ is given. Although it doesn't affect the problem, $R$
would likely be sparse, i.e., each relay station needs to communicate with only a few others.
To guarantee accurate reception of the signal from relay station $j$ to $i$, we must have
\[
S_{ij} \geq \beta R_{ij},
\]
where $\beta > 0$ is a known constant. (In other words, the minimum allowable received signal power
is proportional to the signal bit rate or bandwidth.)
The relay station positions $x_{r+1}, \ldots, x_n$ are fixed, i.e., problem parameters. The problem variables
are $x_1, \ldots, x_r$ and $p_1, \ldots, p_n$. The goal is to choose the variables to minimize the total transmit
power, i.e., $p_1 + \cdots + p_n$.
Explain how to solve this problem as a convex or quasiconvex optimization problem. If you intro-
duce new variables, or transform the variables, explain. Clearly give the objective and inequality
constraint functions, explaining why they are convex. If your problem involves equality constraints,
express them using an affine function.
Solution. The objective is just $\mathbf{1}^T p$. The requirement for accurate reception is
\[
\frac{\alpha p_j}{\|x_i - x_j\|^2} \geq \beta R_{ij}, \quad i, j = 1, \ldots, n,
\]
which can be written as
\[
\beta R_{ij} \|x_i - x_j\|^2 - \alpha p_j \leq 0, \quad i, j = 1, \ldots, n.
\]
This constraint is convex in $(x, p)$ since $\|x_i - x_j\|$ is convex and nonnegative (so its square is
convex) and $-\alpha p_j$ is linear in $(x, p)$ (hence convex). In fact this constraint is a second-order
cone constraint. Note that this constraint includes the nonnegativity constraint on $p_j$.
Therefore we can formulate this problem as the quadratically constrained linear program
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{1}^T p \\
\mbox{subject to} & \beta R_{ij} \|x_i - x_j\|^2 - \alpha p_j \leq 0, \quad i, j = 1, \ldots, n
\end{array}
\]
where the variables are $x_1, \ldots, x_r$ and $p_1, \ldots, p_n$. The problem parameters or data are $x_{r+1}, \ldots, x_n$,
$\alpha$, $\beta$, and $R_{ij}$, $i, j = 1, \ldots, n$.

12.5 Power allocation with coherent combining receivers. In this problem we consider a variation on
the power allocation problem described on pages 4-13 and 4-14 of the notes. In that problem we
have $m$ transmitters, each of which transmits (broadcasts) to $n$ receivers, so the total number of
receivers is $mn$. In this problem we have the converse: multiple transmitters send a signal to each
receiver.
More specifically we have $m$ receivers labeled $1, \ldots, m$, and $mn$ transmitters labeled $(j, k)$, $j =
1, \ldots, m$, $k = 1, \ldots, n$. The transmitters $(i, 1), \ldots, (i, n)$ all transmit the same message to the
receiver $i$, for $i = 1, \ldots, m$.
Transmitter $(j, k)$ operates at power $p_{jk}$, which must satisfy $0 \leq p_{jk} \leq P_{\max}$, where $P_{\max}$ is a given
maximum allowable transmitter power.
The path gain from transmitter $(j, k)$ to receiver $i$ is $A_{ijk} > 0$ (which are given and known). Thus
the power received at receiver $i$ from transmitter $(j, k)$ is given by $A_{ijk} p_{jk}$.
For $i \neq j$, the received power $A_{ijk} p_{jk}$ represents an interference signal. The total interference-plus-
noise power at receiver $i$ is given by
\[
I_i = \sum_{j \neq i,\; k=1,\ldots,n} A_{ijk} p_{jk} + \sigma
\]
where $\sigma > 0$ is the known, given (self) noise power of the receivers. Note that the powers of the
interference and noise signals add to give the total interference-plus-noise power.
The receivers use coherent detection and combining of the desired message signals, which means
the effective received signal power at receiver $i$ is given by
\[
S_i = \left( \sum_{k=1,\ldots,n} (A_{iik} p_{ik})^{1/2} \right)^2.
\]
(Thus, the amplitudes of the desired signals add to give the effective signal amplitude.)
The total signal to interference-plus-noise ratio (SINR) for receiver $i$ is given by $\gamma_i = S_i/I_i$.
The problem is to choose transmitter powers $p_{jk}$ that maximize the minimum SINR $\min_i \gamma_i$, subject
to the power limits.
Explain in detail how to solve this problem using convex or quasiconvex optimization. If you
transform the problem by using a different set of variables, explain completely. Identify the objective
function, and all constraint functions, indicating if they are convex or quasiconvex, etc.
Solution. The power limit constraints, $0 \leq p_{jk} \leq P_{\max}$, are evidently convex in $p$.
The S/I ratio for receiver $i$ is given by the (scary) expression
\[
\gamma_i = \frac{\left( \sum_{k=1,\ldots,n} (A_{iik} p_{ik})^{1/2} \right)^2}{\sum_{j \neq i,\; k=1,\ldots,n} A_{ijk} p_{jk} + \sigma}.
\]
To say that our objective exceeds $t$, i.e., $\min_i \gamma_i \geq t$, is the same as
\[
\left( \sum_{k=1,\ldots,n} (A_{iik} p_{ik})^{1/2} \right)^2 \geq t \left( \sum_{j \neq i,\; k=1,\ldots,n} A_{ijk} p_{jk} + \sigma \right), \quad i = 1, \ldots, m. \qquad (50)
\]
In fact, we will see that this describes a convex constraint on $p$, which means that the minimum
S/I ratio is a quasiconcave function of $p$. So our problem is, directly, a quasiconvex problem in the
variables $p_{jk}$.
To see this, we start by examining the function
\[
f(x) = \left( \sum_{k=1}^n x_k^{1/2} \right)^2
\]
for $x \succeq 0$. It looks reminiscent of a $p$-norm, except that $p = 1/2$, so we know the function isn't
convex. Well, in fact, the function is concave on $x \succeq 0$ as given in the notes.
Therefore, it follows that the constraint (50) is convex, since it has the form $g(p) \geq h(p)$ where $g$
is concave and $h$ is affine.
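
One way to see the concavity of $f$ directly: expanding the square gives $f(x) = \sum_k x_k + 2\sum_{k<l} (x_k x_l)^{1/2}$, a linear term plus geometric means, each of which is concave on $x \succeq 0$. The Python sketch below is only a numerical spot check of this claim (random test points, midpoint inequality); it is an illustration, not a proof, and the sample size and ranges are arbitrary choices.

```python
import math
import random

def f(x):
    """f(x) = (sum_k sqrt(x_k))^2, defined for x >= 0."""
    return sum(math.sqrt(xk) for xk in x) ** 2

random.seed(0)
n = 4
worst_gap = float("inf")
for _ in range(1000):
    x = [random.uniform(0, 10) for _ in range(n)]
    y = [random.uniform(0, 10) for _ in range(n)]
    mid = [(a + b) / 2 for a, b in zip(x, y)]
    # Concavity requires f(mid) >= (f(x) + f(y)) / 2, i.e. gap >= 0.
    gap = f(mid) - (f(x) + f(y)) / 2
    worst_gap = min(worst_gap, gap)
print(worst_gap)
```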

12.6 Antenna array weight design. We consider an array of $n$ omnidirectional antennas in a plane, at
positions $(x_k, y_k)$, $k = 1, \ldots, n$.

[Figure: antenna array in the plane, with antenna element $k$ and the angle of incidence $\theta$ marked.]

A unit plane wave with frequency $\omega$ is incident from an angle $\theta$. This incident wave induces in
the $k$th antenna element a (complex) signal $\exp(i(x_k \cos\theta + y_k \sin\theta - \omega t))$, where $i = \sqrt{-1}$. (For
simplicity we assume that the spatial units are normalized so that the wave number is one, i.e., the
wavelength is $\lambda = 2\pi$.) This signal is demodulated, i.e., multiplied by $e^{i\omega t}$, to obtain the baseband
signal (complex number) $\exp(i(x_k \cos\theta + y_k \sin\theta))$. The baseband signals of the $n$ antennas are
combined linearly to form the output of the antenna array
\[
\begin{array}{rcl}
G(\theta) &=& \displaystyle \sum_{k=1}^n w_k e^{i(x_k \cos\theta + y_k \sin\theta)} \\
&=& \displaystyle \sum_{k=1}^n \left( w_{{\rm re},k} \cos\gamma_k(\theta) - w_{{\rm im},k} \sin\gamma_k(\theta) \right) + i \left( w_{{\rm re},k} \sin\gamma_k(\theta) + w_{{\rm im},k} \cos\gamma_k(\theta) \right),
\end{array}
\]
if we define $\gamma_k(\theta) = x_k \cos\theta + y_k \sin\theta$. The complex weights in the linear combination,
\[
w_k = w_{{\rm re},k} + i w_{{\rm im},k}, \quad k = 1, \ldots, n,
\]
are called the antenna array coefficients or shading coefficients, and will be the design variables
in the problem. For a given set of weights, the combined output $G(\theta)$ is a function of the angle
of arrival $\theta$ of the plane wave. The design problem is to select weights $w_k$ that achieve a desired
directional pattern $G(\theta)$.
We now describe a basic weight design problem. We require unit gain in a target direction $\theta^{\rm tar}$,
i.e., $G(\theta^{\rm tar}) = 1$. We want $|G(\theta)|$ small for $|\theta - \theta^{\rm tar}| \geq \Delta$, where $2\Delta$ is our beamwidth. To do this,
we can minimize
\[
\max_{|\theta - \theta^{\rm tar}| \geq \Delta} |G(\theta)|,
\]
where the maximum is over all $\theta \in [-\pi, \pi]$ with $|\theta - \theta^{\rm tar}| \geq \Delta$. This number is called the sidelobe
level for the array; our goal is to minimize the sidelobe level. If we achieve a small sidelobe level,
then the array is relatively insensitive to signals arriving from directions more than $\Delta$ away from
the target direction. This results in the optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & \max_{|\theta - \theta^{\rm tar}| \geq \Delta} |G(\theta)| \\
\mbox{subject to} & G(\theta^{\rm tar}) = 1,
\end{array}
\]
with $w \in \mathbf{C}^n$ as variables.
The objective function can be approximated by discretizing the angle of arrival with (say) $N$ values
(say, uniformly spaced) $\theta_1, \ldots, \theta_N$ over the interval $[-\pi, \pi]$, and replacing the objective with
\[
\max\{ |G(\theta_k)| \mid |\theta_k - \theta^{\rm tar}| \geq \Delta \}.
\]

(a) Formulate the antenna array weight design problem as an SOCP.


(b) Solve an instance using CVX, with $n = 40$, $\theta^{\rm tar} = 15^\circ$, $\Delta = 15^\circ$, $N = 400$, and antenna
positions generated using
rand('state',0);
n = 40;
x = 30 * rand(n,1);
y = 30 * rand(n,1);
Compute the optimal weights and make a plot of $|G(\theta)|$ (on a logarithmic scale) versus $\theta$.
Hint. CVX can directly handle complex variables, and recognizes the modulus abs(x) of a
complex number as a convex function of its real and imaginary parts, so you do not need to
explicitly form the SOCP from part (a). Even more compactly, you can use norm(x,Inf)
with complex argument.

Solution.

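(a) The problem can be expressed as the SOCP
\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & \|A_k x\|_2 \leq t, \quad k \in I \\
& Bx = d
\end{array}
\]
with variables $x$, $t$, where $I = \{k \mid |\theta_k - \theta^{\rm tar}| \geq \Delta\}$ and
\[
x = \left[\begin{array}{c} w_{\rm re} \\ w_{\rm im} \end{array}\right] \in \mathbf{R}^{2n},
\]
\[
A_k = \left[\begin{array}{cccccc}
\cos\gamma_1(\theta_k) & \cdots & \cos\gamma_n(\theta_k) & -\sin\gamma_1(\theta_k) & \cdots & -\sin\gamma_n(\theta_k) \\
\sin\gamma_1(\theta_k) & \cdots & \sin\gamma_n(\theta_k) & \cos\gamma_1(\theta_k) & \cdots & \cos\gamma_n(\theta_k)
\end{array}\right],
\]
\[
B = \left[\begin{array}{cccccc}
\cos\gamma_1(\theta^{\rm tar}) & \cdots & \cos\gamma_n(\theta^{\rm tar}) & -\sin\gamma_1(\theta^{\rm tar}) & \cdots & -\sin\gamma_n(\theta^{\rm tar}) \\
\sin\gamma_1(\theta^{\rm tar}) & \cdots & \sin\gamma_n(\theta^{\rm tar}) & \cos\gamma_1(\theta^{\rm tar}) & \cdots & \cos\gamma_n(\theta^{\rm tar})
\end{array}\right],
\]
\[
d = \left[\begin{array}{c} 1 \\ 0 \end{array}\right].
\]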
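
The real formulation rests on the identity $\|A_k x\|_2 = |G(\theta_k)|$: the two rows of $A_k$ produce the real and imaginary parts of $G(\theta_k)$. The Python sketch below checks this on a small random instance; the array size, test angle, and weights are arbitrary illustrative values, not the data of part (b).

```python
import cmath
import math
import random

random.seed(1)
n = 5
xs = [30 * random.random() for _ in range(n)]          # antenna x-coordinates
ys = [30 * random.random() for _ in range(n)]          # antenna y-coordinates
w = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
theta = 0.7                                            # arbitrary test angle

gam = [xs[k] * math.cos(theta) + ys[k] * math.sin(theta) for k in range(n)]

# Direct evaluation of G(theta) = sum_k w_k exp(i gamma_k(theta)).
G = sum(w[k] * cmath.exp(1j * gam[k]) for k in range(n))

# Real form: x = (w_re, w_im) and the 2 x 2n matrix A_k from part (a).
x = [wk.real for wk in w] + [wk.imag for wk in w]
row_re = [math.cos(g) for g in gam] + [-math.sin(g) for g in gam]
row_im = [math.sin(g) for g in gam] + [math.cos(g) for g in gam]
Ax = [sum(r * xi for r, xi in zip(row, x)) for row in (row_re, row_im)]
norm_Ax = math.hypot(Ax[0], Ax[1])
print(abs(G), norm_Ax)
```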

(b) The figure below shows the output of the antenna array for different values of $\theta$.

[Figure: $|G(\theta)|$ on a logarithmic scale (from $10^{-3}$ to $10^{0}$) versus $\theta$ in degrees.]

The following code solves the problem.

rand('state',0);
n = 40;
X = 30*[rand(1,n); rand(1,n)];    % antenna positions, one column per antenna
N = 400;
beamwidth = 15*pi/180;
theta_tar = 15*pi/180;
% angles outside the beam (column vector)
theta = linspace(theta_tar+beamwidth, 2*pi+theta_tar-beamwidth, N)';
A = exp(i * [cos(theta), sin(theta)] * X);
Atar = exp(i * [cos(theta_tar), sin(theta_tar)] * X);
cvx_begin
    variable w(n) complex
    minimize(max(abs(A*w)))
    subject to
        Atar*w == 1;
cvx_end

12.7 Power allocation problem with analytic solution. Consider a system of $n$ transmitters and $n$
receivers. The $i$th transmitter transmits with power $x_i$, $i = 1, \ldots, n$. The vector $x$ will be the variable
in this problem. The path gain from each transmitter $j$ to each receiver $i$ will be denoted $A_{ij}$ and
is assumed to be known (obviously, $A_{ij} \geq 0$, so the matrix $A$ is elementwise nonnegative, and
$A_{ii} > 0$). The signal received by each receiver $i$ consists of three parts: the desired signal, arriving
from transmitter $i$ with power $A_{ii} x_i$, the interfering signal, arriving from the other transmitters with
power $\sum_{j \neq i} A_{ij} x_j$, and noise $\sigma_i$ (which are positive and known). We are interested in allocating
the powers $x_i$ in such a way that the signal to noise plus interference ratio at each of the receivers
exceeds a level $\gamma$. (Thus $\gamma$ is the minimum acceptable SNIR for the receivers; a typical value
might be around $\gamma = 3$, i.e., around 10dB). In other words, we want to find $x \succeq 0$ such that for
$i = 1, \ldots, n$
\[
A_{ii} x_i \geq \gamma \left( \sum_{j \neq i} A_{ij} x_j + \sigma_i \right).
\]
Equivalently, the vector $x$ has to satisfy
\[
x \succeq 0, \qquad Bx \succeq \gamma\sigma \qquad (51)
\]
where $B \in \mathbf{R}^{n \times n}$ is defined as
\[
B_{ii} = A_{ii}, \qquad B_{ij} = -\gamma A_{ij}, \quad j \neq i.
\]
(a) Show that (51) is feasible if and only if $B$ is invertible and $z = B^{-1}\mathbf{1} \succeq 0$ ($\mathbf{1}$ is the vector with
all components 1). Show how to construct a feasible power allocation $x$ from $z$.
(b) Show how to find the largest possible SNIR, i.e., how to maximize $\gamma$ subject to the existence
of a feasible power allocation.

To solve this problem you may need the following:


Hint. Let $T \in \mathbf{R}^{n \times n}$ be a matrix with nonnegative elements, and $s \in \mathbf{R}$. Then the following are
equivalent:

(a) $s > \rho(T)$, where $\rho(T) = \max_i |\lambda_i(T)|$ is the spectral radius of $T$.
(b) $sI - T$ is nonsingular and the matrix $(sI - T)^{-1}$ has nonnegative elements.
(c) there exists an $x \succeq 0$ with $(sI - T)x \succ 0$.

(For such $s$, the matrix $sI - T$ is called a nonsingular M-matrix.)

Remark. This problem gives an analytic solution to a very special form of transmitter power
allocation problem. Specifically, there are exactly as many transmitters as receivers, and no power
limits on the transmitters. One consequence is that the receiver noises $\sigma_i$ play no role at all in the
solution: just crank up all the transmitters to overpower the noises!
Solution. First note that the existence of a vector $x$ satisfying
\[
x \succeq 0, \qquad Bx \succeq \gamma\sigma \qquad (52)
\]
is equivalent to the existence of a vector $x$ satisfying
\[
x \succeq 0, \qquad Bx \succ 0. \qquad (53)
\]
The reason is simply that we can always multiply any $x$ satisfying (53) by a large enough positive
number $\alpha$ so that
\[
\alpha x \succeq 0, \qquad B(\alpha x) \succeq \gamma\sigma
\]
and hence $\alpha x$ would satisfy (52). Obviously, since $\gamma\sigma \succ 0$, any solution of (52) is also a solution
of (53).
We can assume without loss of generality that the diagonal elements of $A$ are equal to one (by
dividing each row of $A$ by $A_{ii}$). The matrix $B$ can then be written as
\[
B = (1 + \gamma) I - \gamma A,
\]
which has the form $B = sI - T$ where $T$ has nonnegative elements. Hence the following properties
are equivalent:

(a) $1 + \gamma > \gamma \rho(A)$
(b) $B$ is nonsingular and $B^{-1}$ has nonnegative elements
(c) there exists an $x \succeq 0$ with $Bx \succ 0$.

The two parts of the problem now follow immediately.

(a) The 'if' part is immediate since $z$ solves (53) if it is nonnegative. For the 'only if' part,
we combine the 2nd and 3rd properties listed above, and conclude that there exists an $x$
satisfying (53) if and only if $B^{-1}$ exists and has nonnegative elements. This implies that
$z = B^{-1}\mathbf{1} \succeq 0$. A feasible power allocation is then $x = \alpha z$ with $\alpha \geq \gamma \max_i \sigma_i$, since
$B(\alpha z) = \alpha \mathbf{1} \succeq \gamma\sigma$.
(b) $B$ is a nonsingular M-matrix if $1 + \gamma > \gamma \rho(A)$, i.e., if
\[
\gamma < \frac{1}{\rho(A) - 1}.
\]
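
The threshold can be illustrated on a $2 \times 2$ example, where the spectral radius is available in closed form: for $A$ with unit diagonal and off-diagonal entries $a_{12}$, $a_{21}$, the eigenvalues are $1 \pm \sqrt{a_{12} a_{21}}$. The Python sketch below uses hypothetical gains (chosen only for illustration) and shows that $z = B^{-1}\mathbf{1}$ is nonnegative just below the threshold and has negative entries just above it.

```python
import math

# Hypothetical 2x2 gain matrix with unit diagonal (after row normalization).
a12, a21 = 0.5, 0.2
A = [[1.0, a12], [a21, 1.0]]
rho = 1.0 + math.sqrt(a12 * a21)      # spectral radius of [[1, a12], [a21, 1]]
gamma_max = 1.0 / (rho - 1.0)         # supremum of achievable SNIR levels

def z_for(gamma):
    """Solve B z = 1 for B = (1 + gamma) I - gamma A (explicit 2x2 inverse)."""
    b11, b12 = 1.0, -gamma * a12       # diagonal entries are (1 + gamma) - gamma = 1
    b21, b22 = -gamma * a21, 1.0
    det = b11 * b22 - b12 * b21
    return [(b22 - b12) / det, (b11 - b21) / det]

z_below = z_for(0.9 * gamma_max)      # gamma below the threshold
z_above = z_for(1.1 * gamma_max)      # gamma above the threshold
print(gamma_max, z_below, z_above)
```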

12.8 Optimizing rates and time slot fractions. We consider a wireless system that uses time-domain
multiple access (TDMA) to support $n$ communication flows. The flows have (nonnegative) rates
$r_1, \ldots, r_n$, given in bits/sec. To support a rate $r_i$ on flow $i$ requires transmitter power
\[
p = a_i (e^{b r_i} - 1),
\]
where $b$ is a (known) positive constant, and $a_i$ are (known) positive constants related to the noise
power and gain of receiver $i$.
TDMA works like this. Time is divided up into periods of some fixed duration $T$ (seconds). Each
of these $T$-long periods is divided into $n$ time-slots, with durations $t_1, \ldots, t_n$, that must satisfy
$t_1 + \cdots + t_n = T$, $t_i \geq 0$. In time-slot $i$, communications flow $i$ is transmitted at an instantaneous
rate $r = T r_i / t_i$, so that over each $T$-long period, $T r_i$ bits from flow $i$ are transmitted. The power
required during time-slot $i$ is $a_i (e^{bT r_i / t_i} - 1)$, so the average transmitter power over each $T$-long
period is
\[
P = (1/T) \sum_{i=1}^n a_i t_i (e^{bT r_i / t_i} - 1).
\]
When $t_i$ is zero, we take $P = \infty$ if $r_i > 0$, and $P = 0$ if $r_i = 0$. (The latter corresponds to the case
when there is zero flow, and also, zero time allocated to the flow.)
The problem is to find rates $r \in \mathbf{R}^n$ and time-slot durations $t \in \mathbf{R}^n$ that maximize the log utility
function
\[
U(r) = \sum_{i=1}^n \log r_i,
\]
subject to $P \leq P^{\max}$. (This utility function is often used to ensure fairness; each communication
flow gets at least some positive rate.) The problem data are $a_i$, $b$, $T$ and $P^{\max}$; the variables are $t_i$
and $r_i$.

(a) Formulate this problem as a convex optimization problem. Feel free to introduce new variables,
if needed, or to change variables. Be sure to justify convexity of the objective or constraint
functions in your formulation.
(b) Give the optimality conditions for your formulation. Of course we prefer simpler optimality
conditions to complex ones. Note: We do not expect you to solve the optimality conditions;
you can give them as a set of equations (and possibly inequalities).

Hint. With a log utility function, we cannot have $r_i = 0$, and therefore we cannot have $t_i = 0$;
therefore the constraints $r_i \geq 0$ and $t_i \geq 0$ cannot be active or tight. This will allow you to simplify
the optimality conditions.
Solution. The problem is
\[
\begin{array}{ll}
\mbox{maximize} & \sum_{i=1}^n \log r_i \\
\mbox{subject to} & \mathbf{1}^T t = T \\
& P = (1/T) \sum_{i=1}^n a_i t_i (e^{bT r_i / t_i} - 1) \leq P^{\max},
\end{array}
\]
with variables $r \in \mathbf{R}^n$ and $t \in \mathbf{R}^n$. There is an implicit constraint that $r_i > 0$, and also that $t_i > 0$.
In fact, we don't need to introduce any new variables, or to change any variables. This is a convex
optimization problem just as it stands. The objective is clearly concave, and so can be maximized.
The only question is whether or not the function $P$ is convex in $r$ and $t$. To show this, we need to
show that the function $f(x, y) = y e^{x/y}$ is convex in $x$ and $y$, for $y > 0$. But this is nothing more
than the perspective of the exponential function, so it's convex. The function $P$ is just a positive
weighted sum of functions of this form, evaluated at $x = bT r_i$ and $y = t_i$ (plus an affine function),
so it's convex.
We introduce a Lagrange multiplier $\nu \in \mathbf{R}$ for the equality constraint, and $\lambda \in \mathbf{R}_+$ for the inequality
constraint. We don't need Lagrange multipliers for the implicit constraints $t \succ 0$, $r \succ 0$; even if we
did introduce them they'd be zero at the optimum, since these constraints cannot be tight.
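With the Lagrangian $L = -\sum_{i=1}^n \log r_i + \nu(\mathbf{1}^T t - T) + \lambda(P - P^{\max})$, the KKT conditions are: primal feasibility,
\[
\mathbf{1}^T t = T, \qquad (1/T) \sum_{i=1}^n a_i t_i (e^{bT r_i / t_i} - 1) \leq P^{\max},
\]
dual feasibility, $\lambda \geq 0$,
\[
\frac{\partial L}{\partial r_i} = -1/r_i + \lambda a_i b e^{bT r_i / t_i} = 0, \quad i = 1, \ldots, n,
\]
\[
\frac{\partial L}{\partial t_i} = \lambda (a_i/T) \left( e^{bT r_i / t_i} - 1 - (bT r_i / t_i) e^{bT r_i / t_i} \right) + \nu = 0, \quad i = 1, \ldots, n,
\]
and the complementarity condition $\lambda (P - P^{\max}) = 0$.
In fact, the constraint $P \leq P^{\max}$ must be tight at the optimum, because the utility is monotonic
increasing in $r$, and if the power constraint were slack, we could increase rates slightly, without
violating the power limit, and get more utility. In other words, we can replace $P \leq P^{\max}$ with
$P = P^{\max}$. This means we can replace the second primal feasibility condition with an equality, and
also, we conclude that the complementarity condition always holds.
Thus, the KKT conditions are
\[
\begin{array}{l}
\mathbf{1}^T t = T, \\
(1/T) \sum_{i=1}^n a_i t_i (e^{bT r_i / t_i} - 1) = P^{\max}, \\
-1/r_i + \lambda a_i b e^{bT r_i / t_i} = 0, \quad i = 1, \ldots, n, \\
\lambda (a_i/T) \left( e^{bT r_i / t_i} - 1 - (bT r_i / t_i) e^{bT r_i / t_i} \right) + \nu = 0, \quad i = 1, \ldots, n, \\
\lambda \geq 0.
\end{array}
\]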

We didn't ask you to solve these equations. As far as we know, there's no analytical solution.
But, after a huge and bloody algebra battle, it's possible to solve the KKT conditions using a one-
parameter search, as in water-filling. Although this appears to be a great solution, it actually has no
better computational complexity than a standard method, such as Newton's method, for solving the
KKT conditions, provided the special structure in the Newton step equations is exploited properly.
Either way, you end up with a method that involves say a few tens of iterations, each one requiring
$O(n)$ flops.
Remember, we didn't ask you to solve the KKT equations. And you should be grateful that we
didn't, because we certainly could have.
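
The two partial derivatives of $P$ that enter the stationarity conditions are easy to get wrong by a sign, so a finite-difference check is a cheap safeguard. The Python sketch below compares the closed forms $\partial P/\partial r_i = a_i b e^{bT r_i/t_i}$ and $\partial P/\partial t_i = (a_i/T)(e^{u} - 1 - u e^{u})$, $u = bT r_i/t_i$, against central differences. The data `a`, `b`, `T` and the test point are hypothetical values chosen for illustration.

```python
import math

# Hypothetical problem data, for illustration only.
a = [1.0, 2.0]
b, T = 0.5, 1.0

def P(r, t):
    """Average power P = (1/T) sum_i a_i t_i (exp(b T r_i / t_i) - 1)."""
    return sum(a[i] * t[i] * (math.exp(b * T * r[i] / t[i]) - 1) for i in range(len(a))) / T

def dP_dr(r, t, i):
    """Closed form used in the KKT conditions: a_i b exp(b T r_i / t_i)."""
    return a[i] * b * math.exp(b * T * r[i] / t[i])

def dP_dt(r, t, i):
    """Closed form: (a_i / T) (e^u - 1 - u e^u) with u = b T r_i / t_i."""
    u = b * T * r[i] / t[i]
    return (a[i] / T) * (math.exp(u) - 1 - u * math.exp(u))

def central_diff(fun, x, i, h=1e-6):
    """Central finite difference of fun with respect to component i of x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (fun(xp) - fun(xm)) / (2 * h)

r, t = [0.8, 1.3], [0.4, 0.6]
err_r = max(abs(central_diff(lambda rr: P(rr, t), r, i) - dP_dr(r, t, i)) for i in range(2))
err_t = max(abs(central_diff(lambda tt: P(r, tt), t, i) - dP_dt(r, t, i)) for i in range(2))
print(err_r, err_t)
```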
12.9 Optimal jamming power allocation. A set of $n$ jammers transmit with (nonnegative) powers
$p_1, \ldots, p_n$, which are to be chosen subject to the constraints
\[
p \succeq 0, \qquad Fp \preceq g.
\]
The jammers produce interference power at $m$ receivers, given by
\[
d_i = \sum_{j=1}^n G_{ij} p_j, \quad i = 1, \ldots, m,
\]
where $G_{ij}$ is the (nonnegative) channel gain from jammer $j$ to receiver $i$.
Receiver $i$ has capacity (in bits/s) given by
\[
C_i = \alpha \log(1 + \beta_i / (\sigma_i^2 + d_i)), \quad i = 1, \ldots, m,
\]
where $\alpha$, $\beta_i$, and $\sigma_i$ are positive constants. (Here $\beta_i$ is proportional to the signal power at receiver
$i$ and $\sigma_i^2$ is the receiver $i$ self-noise, but you won't need to know this to solve the problem.)
Explain how to choose $p$ to minimize the sum channel capacity, $C = C_1 + \cdots + C_m$, using convex
optimization. (This corresponds to the most effective jamming, given the power constraints.) The
problem data are $F$, $g$, $G$, $\alpha$, $\beta_i$, $\sigma_i$.
If you change variables, or transform your problem in any way that is not obvious (for example, you
form a relaxation), you must explain fully how your method works, and why it gives the solution.
If your method relies on any convex functions that we have not encountered before, you must show
that the functions are convex.
Disclaimer. The teaching staff does not endorse jamming, optimal or otherwise.
Solution. This is almost a trick question, because it is so easy: there is no need to change variables,
or introduce any relaxation. We'll show that the following problem is convex:
\[
\begin{array}{ll}
\mbox{minimize} & C \\
\mbox{subject to} & p \succeq 0, \quad Fp \preceq g.
\end{array}
\]
We observe that the function $f(x) = \log(1 + 1/x)$ is convex for $x > 0$: we have
\[
f'(x) = -\frac{1}{x^2 + x},
\]
which evidently is increasing in $x$, so $f''(x) > 0$.
Here's another proof that $f(x) = \log(1 + 1/x)$ is convex: it is the composition of $h(u_1, u_2) =
\log(e^{u_1} + e^{u_2})$, which is increasing and convex, with $u_1 = 0$ and $u_2 = -\log x$, which are convex.
Thus,
\[
h(g(x)) = \log(e^0 + e^{-\log x}) = \log(1 + 1/x)
\]
is convex.
And, here's yet another, just for fun. We write
\[
\log(1 + 1/x) = -\log(x/(x + 1)) = -\log(1 - 1/(x + 1)),
\]
the composition of $h(u) = -\log(1 - u)$, which is convex and increasing, with $u = 1/(x + 1)$, which
is convex.
For rows $g_i^T$ of $G$, we write
\[
C_i = \alpha f((g_i^T p + \sigma_i^2)/\beta_i),
\]
which is convex in $p$, since the argument to $f$ is affine in $p$. It follows that $C$ is a convex function
of $p$.
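
The derivative formula in the first proof can be confirmed numerically. The Python sketch below compares $f'(x) = -1/(x^2 + x)$ with a central difference and evaluates $f''(x) = (2x+1)/(x^2+x)^2$, which follows by differentiating $f'$ once more; the test points and step size are arbitrary illustrative choices.

```python
import math

def f(x):
    return math.log(1 + 1 / x)

def fprime(x):
    """Closed form from the solution: f'(x) = -1 / (x^2 + x)."""
    return -1.0 / (x * x + x)

def fsecond(x):
    """Derivative of fprime: f''(x) = (2x + 1) / (x^2 + x)^2, positive for x > 0."""
    return (2 * x + 1) / (x * x + x) ** 2

h = 1e-5
points = [0.1, 0.5, 1.0, 3.0, 10.0]
max_err = max(abs((f(x + h) - f(x - h)) / (2 * h) - fprime(x)) for x in points)
min_curv = min(fsecond(x) for x in points)
print(max_err, min_curv)
```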

12.10 2D filter design. A symmetric convolution kernel with support $\{-(N-1), \ldots, N-1\}^2$ is
characterized by $N^2$ coefficients
\[
h_{kl}, \quad k, l = 1, \ldots, N.
\]
These coefficients will be our variables. The corresponding 2D frequency response (Fourier
transform) $H : \mathbf{R}^2 \to \mathbf{R}$ is given by
\[
H(\omega_1, \omega_2) = \sum_{k,l=1,\ldots,N} h_{kl} \cos((k-1)\omega_1) \cos((l-1)\omega_2),
\]
where $\omega_1$ and $\omega_2$ are the frequency variables. Evidently we only need to specify $H$ over the region
$[0, \pi]^2$, although it is often plotted over the region $[-\pi, \pi]^2$. (It won't matter in this problem, but
we should mention that the coefficients $h_{kl}$ above are not exactly the same as the impulse response
coefficients of the filter.)
We will design a 2D filter (i.e., find the coefficients $h_{kl}$) to satisfy $H(0, 0) = 1$ and to minimize the
maximum response $R$ in the rejection region $\Omega_{\rm rej} \subset [0, \pi]^2$,
\[
R = \sup_{(\omega_1, \omega_2) \in \Omega_{\rm rej}} |H(\omega_1, \omega_2)|.
\]

(a) Explain why this 2D filter design problem is convex.
(b) Find the optimal filter for the specific case with $N = 5$ and
\[
\Omega_{\rm rej} = \{(\omega_1, \omega_2) \in [0, \pi]^2 \mid \omega_1^2 + \omega_2^2 \geq W^2\},
\]
with $W = \pi/4$.
You can approximate $R$ by sampling on a grid of frequency values. Define
\[
\omega^{(p)} = \pi (p - 1)/M, \quad p = 1, \ldots, M.
\]
(You can use $M = 25$.) We then replace the exact expression for $R$ above with
\[
\hat R = \max\{ |H(\omega^{(p)}, \omega^{(q)})| \mid p, q = 1, \ldots, M, \; (\omega^{(p)}, \omega^{(q)}) \in \Omega_{\rm rej} \}.
\]
Give the optimal value of $\hat R$. Plot the optimal frequency response using plot_2D_filt(h),
available on the course web site, where h is the matrix containing the coefficients $h_{kl}$.

Solution.

% 2D filter design problem via convex optimization

% Number of taps of the FIR filter
N = 5;

% Frequency discretization for design
M = 25;

% Specifications
W = pi/4;

cvx_begin
    variable h(N,N);
    variable t
    sum(sum(h)) == 1; % H(0,0)==1
    minimize (t);
    % loop over frequency
    for p=1:M
        for q=1:M
            w1 = pi*((p-1)/M); w2 = pi*((q-1)/M);
            cosw1k = cos(w1*[0:(N-1)]);
            cosw2l = cos(w2*[0:(N-1)]);
            % frequency response at (w1,w2), affine in h
            Hpq = cosw1k*h*cosw2l';
            if (w1^2+w2^2 >= W^2)
                abs(Hpq) <= t
            end
        end
    end
cvx_end

HH=plot_2D_filt(h);
print -depsc opt_2D_filt_resp

This yields the following optimal frequency response.

[Figure: surface plot of the optimal frequency response $H(\omega_1, \omega_2)$ versus $\omega_1$ and $\omega_2$.]
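
Two facts used above follow directly from the cosine form of $H$: the constraint $H(0,0) = 1$ is the linear equality $\sum_{k,l} h_{kl} = 1$, and $H$ is even in each frequency, which is why the region $[0, \pi]^2$ suffices. The Python sketch below illustrates both on a small hypothetical coefficient matrix (not the optimal filter); the function name `freq_response` is illustrative.

```python
import math

def freq_response(h, w1, w2):
    """H(w1, w2) = sum_{k,l} h[k][l] cos(k w1) cos(l w2), with 0-based k, l."""
    N = len(h)
    return sum(h[k][l] * math.cos(k * w1) * math.cos(l * w2)
               for k in range(N) for l in range(N))

# Hypothetical coefficients (not the optimal filter), normalized to sum to one.
raw = [[1.0 / (1 + k + l) for l in range(3)] for k in range(3)]
total = sum(sum(row) for row in raw)
h = [[v / total for v in row] for row in raw]

dc_gain = freq_response(h, 0.0, 0.0)          # equals the sum of all coefficients
even_gap = abs(freq_response(h, 0.7, 1.1) - freq_response(h, -0.7, 1.1))
print(dc_gain, even_gap)
```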

12.11 Maximizing log utility in a wireless system with interference. Consider a wireless network consisting
of $n$ data links, labeled $1, \ldots, n$. Link $i$ transmits with power $P_i > 0$, and supports a data rate
$R_i = \log(1 + \gamma_i)$, where $\gamma_i$ is the signal-to-interference-plus-noise ratio (SINR). These SINRs
depend on the transmit powers, as described below.
The system is characterized by the link gain matrix $G \in \mathbf{R}_{++}^{n \times n}$, where $G_{ij}$ is the gain from the
transmitter on link $j$ to the receiver for link $i$. The received signal power for link $i$ is $G_{ii} P_i$; the
noise plus interference power for link $i$ is given by
\[
\sigma_i^2 + \sum_{j \neq i} G_{ij} P_j,
\]
where $\sigma_i^2 > 0$ is the receiver noise power for link $i$. The SINR is the ratio
\[
\gamma_i = \frac{G_{ii} P_i}{\sigma_i^2 + \sum_{j \neq i} G_{ij} P_j}.
\]
The problem is to choose the transmit powers $P_1, \ldots, P_n$, subject to $0 < P_i \leq P_i^{\max}$, in order to
maximize the log utility function
\[
U(P) = \sum_{i=1}^n \log R_i.
\]
(This utility function can be argued to yield a fair distribution of rates.) The data are $G$, $\sigma_i^2$, and
$P_i^{\max}$.
Formulate this problem as a convex or quasiconvex optimization problem. If you make any
transformations or use any steps that are not obvious, explain.
Hints.
The function $\log\log(1 + e^x)$ is concave. (If you use this fact, you must show it.)
You might find the new variables defined by $z_i = \log P_i$ useful.
Solution. First we show that $f(x) = \log\log(1 + e^x)$ is concave. Its first and second derivatives are
\[
f'(x) = \frac{1}{\log(1 + e^x)\,(1 + e^{-x})},
\]
\[
\begin{array}{rcl}
f''(x) &=& \displaystyle \frac{-1}{(\log(1 + e^x))^2 (1 + e^{-x})^2} + \frac{e^{-x}}{\log(1 + e^x)(1 + e^{-x})^2} \\
&=& \displaystyle \frac{1}{\log(1 + e^x)(1 + e^{-x})^2} \left( \frac{-1}{\log(1 + e^x)} + \frac{1}{e^x} \right).
\end{array}
\]
The first factor is positive. The second is negative since $e^x \geq \log(1 + e^x)$, which follows from
$\log(1 + u) \leq u$. (This in turn follows since $\log(1 + u)$ is a concave function, and $u$ is its first order
Taylor approximation at 0.)
Now let's solve the problem. We can write $\log R_i$ as
\[
\log R_i = \log\log(1 + \gamma_i) = \log\log(1 + e^{\log \gamma_i}).
\]
Defining new variables $z_i = \log P_i$, and writing $N_i = \sigma_i^2$ for the noise power, we write $\log \gamma_i$ as
\[
\begin{array}{rcl}
\log \gamma_i &=& \displaystyle \log\left( \frac{G_{ii} e^{z_i}}{N_i + \sum_{j \neq i} G_{ij} e^{z_j}} \right) \\
&=& \displaystyle \log G_{ii} + z_i - \log\left( N_i + \sum_{j \neq i} G_{ij} e^{z_j} \right).
\end{array}
\]
The term $\log\left( N_i + \sum_{j \neq i} G_{ij} e^{z_j} \right)$ can be rewritten (with some effort) into log-sum-exp of an affine
transformation of $z$, using $N_i = e^{\log N_i}$ and $G_{ij} e^{z_j} = e^{z_j + \log G_{ij}}$, so we see that $\log \gamma_i$ is a
concave function of $z$. Using the fact that $\log\log(1 + e^u)$
is concave and increasing, we conclude that the function
\[
\sum_{i=1}^n \log R_i = \sum_{i=1}^n \log\log(1 + e^{\log \gamma_i})
\]
is a concave function of $z$. (It is not a simple function, to be sure, but it is concave!)
To solve the problem, then, we maximize this concave function, subject to the constraints $z_i \leq
\log P_i^{\max}$. We recover the optimal $P_i$ via $P_i^\star = e^{z_i^\star}$.
One common error was to argue that $\log(1 + \gamma_i)$ is quasiconcave in the powers (which is true), so
the problem is quasiconcave (which is false, since a sum of quasiconcave functions need not be quasiconcave).
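
The closed form for $f''$ can be checked against a second-order finite difference. The Python sketch below does this at a few arbitrary test points and also confirms that the closed form is negative there; it is a numerical illustration of the concavity argument, not a replacement for it.

```python
import math

def f(x):
    return math.log(math.log(1 + math.exp(x)))

def f2(x):
    """Closed form: f''(x) = (-1/L + e^{-x}) / (L (1 + e^{-x})^2), L = log(1 + e^x)."""
    L = math.log(1 + math.exp(x))
    return (-1.0 / L + math.exp(-x)) / (L * (1 + math.exp(-x)) ** 2)

h = 1e-4
points = [-3.0, -1.0, 0.0, 1.0, 3.0]
# Second-order central difference approximation of f''.
fd = [(f(x + h) - 2 * f(x) + f(x - h)) / h ** 2 for x in points]
max_err = max(abs(d - f2(x)) for d, x in zip(fd, points))
max_f2 = max(f2(x) for x in points)
print(max_err, max_f2)
```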
12.12 Spectral factorization via semidefinite programming. A Toeplitz matrix is a matrix that has constant
values on its diagonals. We use the notation
\[
T_m(x_1, \ldots, x_m) = \left[\begin{array}{cccccc}
x_1 & x_2 & x_3 & \cdots & x_{m-1} & x_m \\
x_2 & x_1 & x_2 & \cdots & x_{m-2} & x_{m-1} \\
x_3 & x_2 & x_1 & \cdots & x_{m-3} & x_{m-2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
x_{m-1} & x_{m-2} & x_{m-3} & \cdots & x_1 & x_2 \\
x_m & x_{m-1} & x_{m-2} & \cdots & x_2 & x_1
\end{array}\right]
\]
to denote the symmetric Toeplitz matrix in $\mathbf{S}^m$ constructed from $x_1, \ldots, x_m$. Consider the
semidefinite program
\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & T_n(x_1, \ldots, x_n) \succeq e_1 e_1^T,
\end{array}
\]
with variable $x = (x_1, \ldots, x_n)$, where $e_1 = (1, 0, \ldots, 0)$.

(a) Derive the dual of the SDP above. Denote the dual variable as $Z$. (Hence $Z \in \mathbf{S}^n$ and the
dual constraints include an inequality $Z \succeq 0$.)
(b) Show that $T_n(x_1, \ldots, x_n) \succ 0$ for every feasible $x$ in the SDP above. You can do this by
induction on $n$.
For $n = 1$, the constraint is $x_1 \geq 1$ which obviously implies $x_1 > 0$.
In the induction step, assume $n \geq 2$ and that $T_{n-1}(x_1, \ldots, x_{n-1}) \succ 0$. Use a Schur
complement argument and the Toeplitz structure of $T_n$ to show that $T_n(x_1, \ldots, x_n) \succeq e_1 e_1^T$
implies $T_n(x_1, \ldots, x_n) \succ 0$.
(c) Suppose the optimal value of the SDP above is finite and attained, and that $Z$ is dual optimal.
Use the result of part (b) to show that the rank of $Z$ is at most one, i.e., $Z$ can be expressed
as $Z = yy^T$ for some $n$-vector $y$. Show that $y$ satisfies
\[
\begin{array}{rcl}
y_1^2 + y_2^2 + \cdots + y_n^2 &=& c_1 \\
y_1 y_2 + y_2 y_3 + \cdots + y_{n-1} y_n &=& c_2/2 \\
&\vdots& \\
y_1 y_{n-1} + y_2 y_n &=& c_{n-1}/2 \\
y_1 y_n &=& c_n/2.
\end{array}
\]
This can be expressed as an identity $|Y(\omega)|^2 = R(\omega)$ between two functions
\[
\begin{array}{rcl}
Y(\omega) &=& y_1 + y_2 e^{-i\omega} + y_3 e^{-2i\omega} + \cdots + y_n e^{-i(n-1)\omega} \\
R(\omega) &=& c_1 + c_2 \cos\omega + c_3 \cos(2\omega) + \cdots + c_n \cos((n-1)\omega)
\end{array}
\]
(with $i = \sqrt{-1}$). The function $Y(\omega)$ is called a spectral factor of the trigonometric polynomial
$R(\omega)$.

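Solution.
(a) The Lagrangian is
\[
\begin{array}{rcl}
L(x, Z) &=& c^T x + \mathop{\bf tr}(Z(e_1 e_1^T - T_n(x_1, \ldots, x_n))) \\
&=& c^T x + Z_{11} - x_1 (Z_{11} + \cdots + Z_{nn}) - 2 x_2 (Z_{21} + \cdots + Z_{n,n-1}) \\
&& {} - 2 x_3 (Z_{31} + \cdots + Z_{n,n-2}) - \cdots - 2 x_n Z_{n1}.
\end{array}
\]
In the dual SDP we maximize $g(Z) = \inf_x L(x, Z)$ subject to $Z \succeq 0$:
\[
\begin{array}{ll}
\mbox{maximize} & Z_{11} \\
\mbox{subject to} & Z_{11} + Z_{22} + \cdots + Z_{nn} = c_1 \\
& 2(Z_{21} + Z_{32} + \cdots + Z_{n,n-1}) = c_2 \\
& 2(Z_{31} + Z_{42} + \cdots + Z_{n,n-2}) = c_3 \\
& \quad \vdots \\
& 2(Z_{n-1,1} + Z_{n,2}) = c_{n-1} \\
& 2 Z_{n1} = c_n \\
& Z \succeq 0.
\end{array}
\]

(b) The constraint $T_n(x_1, \ldots, x_n) \succeq e_1 e_1^T$ can be written as
\[
\left[\begin{array}{cc} x_1 - 1 & \bar x^T \\ \bar x & A \end{array}\right] \succeq 0,
\]
where $\bar x = (x_2, \ldots, x_n)$ and $A = T_{n-1}(x_1, \ldots, x_{n-1})$. By assumption the 2,2 block $A$ is positive
definite, so by the Schur complement theorem the inequality is equivalent to $x_1 - 1 - \bar x^T A^{-1} \bar x \geq
0$. Hence $x_1 - \bar x^T A^{-1} \bar x \geq 1 > 0$ and therefore
\[
T_n(x_1, \ldots, x_n) = \left[\begin{array}{cc} x_1 & \bar x^T \\ \bar x & A \end{array}\right] \succ 0.
\]

(c) By strong duality, the primal and dual optimal solutions satisfy $\mathop{\bf tr}(XZ) = 0$ where
\[
X = T_n(x_1, \ldots, x_n) - e_1 e_1^T.
\]
From part (b), $T_n(x_1, \ldots, x_n)$ is strictly positive definite and therefore the rank of $X$ is at
least $n - 1$, i.e., its nullspace has dimension at most one. Then $\mathop{\bf tr}(ZX) = 0$ and $Z \succeq 0$ imply
$Z = yy^T$ with $y$ in the nullspace of $X$.
Plugging in $Z_{ij} = y_i y_j$ in the equality constraints in the dual SDP gives
\[
y_1^2 + \cdots + y_n^2 = c_1, \qquad y_1 y_k + \cdots + y_{n-k+1} y_n = c_k/2, \quad k = 2, \ldots, n.
\]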
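
The identity $|Y(\omega)|^2 = R(\omega)$ can be checked numerically by starting from a vector $y$, forming $c$ from the equations above, and comparing the two functions at a few frequencies. The Python sketch below does this for a hypothetical $y$; the sign of the exponent in $Y$ is taken as negative here, which is an assumption about the convention (for real $y$ the identity holds with either sign).

```python
import cmath
import math

y = [1.0, -0.5, 0.25, 2.0]          # hypothetical spectral factor coefficients
n = len(y)

# c_1 = sum_j y_j^2 and c_k = 2 * sum_j y_j y_{j+k-1} for k = 2, ..., n (0-based lists).
c = [sum(v * v for v in y)]
c += [2 * sum(y[j] * y[j + k] for j in range(n - k)) for k in range(1, n)]

def Y(omega):
    """Y(omega) = sum_k y_k exp(-i k omega), with 0-based k."""
    return sum(y[k] * cmath.exp(-1j * k * omega) for k in range(n))

def R(omega):
    """R(omega) = sum_k c_k cos(k omega), with 0-based k."""
    return sum(c[k] * math.cos(k * omega) for k in range(n))

max_gap = max(abs(abs(Y(w)) ** 2 - R(w)) for w in [0.0, 0.3, 1.0, 2.5, math.pi])
print(max_gap)
```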

370
12.13 Bandlimited signal recovery from zero-crossings. Let $y \in \mathbf{R}^n$ denote a bandlimited signal, which means that it can be expressed as a linear combination of sinusoids with frequencies in a band:

$$
y_t = \sum_{j=1}^{B} a_j \cos\left(\frac{2\pi}{n}(f_{\min} + j - 1)t\right) + b_j \sin\left(\frac{2\pi}{n}(f_{\min} + j - 1)t\right), \quad t = 1, \ldots, n,
$$

where $f_{\min}$ is the lowest frequency in the band, $B$ is the bandwidth, and $a, b \in \mathbf{R}^B$ are the cosine and sine coefficients, respectively. We are given $f_{\min}$ and $B$, but not the coefficients $a$, $b$ or the signal $y$.
We do not know $y$, but we are given its sign $s = \mathop{\bf sign}(y)$, where $s_t = 1$ if $y_t \geq 0$ and $s_t = -1$ if $y_t < 0$. (Up to a change of overall sign, this is the same as knowing the zero-crossings of the signal, i.e., when it changes sign. Hence the name of this problem.)
We seek an estimate $\hat y$ of $y$ that is consistent with the bandlimited assumption and the given signs. Of course we cannot distinguish $\hat y$ and $\alpha \hat y$, where $\alpha > 0$, since both of these signals have the same sign pattern. Thus, we can only estimate $y$ up to a positive scale factor. To normalize $\hat y$, we will require that $\|\hat y\|_1 = n$, i.e., the average value of $|\hat y_i|$ is one. Among all $\hat y$ that are consistent with the bandlimited assumption, the given signs, and the normalization, we choose the one that minimizes $\|\hat y\|_2$.

(a) Show how to find $\hat y$ using convex or quasiconvex optimization.

(b) Apply your method to the problem instance with data in zero_crossings_data.*. The data files also include the true signal $y$ (which of course you cannot use to find $\hat y$). Plot $\hat y$ and $y$, and report the relative recovery error, $\|y - \hat y\|_2 / \|y\|_2$. Give one short sentence commenting on the quality of the recovery.

Solution.

(a) We can express our estimate as $\hat y = Ax$, where $x = (a, b) \in \mathbf{R}^{2B}$ is the vector of cosine and sine coefficients, and we define the matrix

$$A = [C \;\; S] \in \mathbf{R}^{n \times 2B},$$

where $C, S \in \mathbf{R}^{n \times B}$ have entries

$$C_{tj} = \cos(2\pi(f_{\min} + j - 1)t/n), \qquad S_{tj} = \sin(2\pi(f_{\min} + j - 1)t/n),$$

respectively.
To ensure that the signs of $\hat y$ are consistent with $s$, we need the constraints $s_t a_t^T x \geq 0$ for $t = 1, \ldots, n$, where $a_1^T, \ldots, a_n^T$ are the rows of $A$. To achieve the proper normalization, we also need the linear equality constraint $\|\hat y\|_1 = s^T A x = n$. (Note that an $\ell_1$-norm equality constraint is not convex in general, but here it is, since the signs are given.)
We have a convex objective and linear inequality and equality constraints, so our optimization problem is convex:

$$
\begin{array}{ll}
\mbox{minimize} & \|Ax\|_2 \\
\mbox{subject to} & s_t a_t^T x \geq 0, \quad t = 1, \ldots, n \\
& s^T A x = n.
\end{array}
$$

We get our estimate as $\hat y = Ax^\star$, where $x^\star$ is a solution of this problem.
One common mistake was to formulate the problem above without the normalization constraint. The (incorrect) argument was that you'd solve the problem, which is homogeneous, and then scale what you get so its $\ell_1$ norm is $n$. This doesn't work, since the (unique) solution to the homogeneous problem is $x = 0$ (since $x = 0$ is feasible). However, this method did give numerical results far better than $x = 0$. The reason is that the solvers returned a very small $x$, for which $Ax$ had the right sign. And no, that does not mean the error wasn't a bad one.
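The identity behind the normalization constraint in part (a) can be checked numerically. The following is a minimal NumPy sketch, not the solution code: the sizes `n`, `f_min`, `B`, the helper name `bandlimited_matrix`, and the random coefficients are all illustrative assumptions, and no optimization problem is solved. It only confirms that, when the signs of $Ax$ agree with $s$, the $\ell_1$ norm $\|Ax\|_1$ equals the linear expression $s^T A x$.

```python
import numpy as np

def bandlimited_matrix(n, f_min, B):
    """Build A = [C S] with C[t, j] = cos(2*pi*(f_min + j)*t/n) for j = 0..B-1
    (i.e. f_min + j - 1 for j = 1..B) and t = 1..n; S is the sine analogue."""
    t = np.arange(1, n + 1)[:, None]        # time indices 1..n as a column
    freqs = f_min + np.arange(B)[None, :]   # frequencies in the band as a row
    phase = 2 * np.pi * freqs * t / n
    return np.hstack((np.cos(phase), np.sin(phase)))

# Illustrative (hypothetical) problem sizes and coefficients.
rng = np.random.default_rng(0)
n, f_min, B = 200, 3, 4
A = bandlimited_matrix(n, f_min, B)
x = rng.standard_normal(2 * B)

y = A @ x
s = np.where(y >= 0, 1.0, -1.0)   # sign convention from the problem statement

l1_norm = np.sum(np.abs(y))       # ||Ax||_1
linear_form = s @ y               # s^T A x, linear in x once s is fixed
print(l1_norm, linear_form)
```

Because $s$ is data rather than a variable, the right-hand quantity is linear in $x$, which is why the equality constraint is acceptable in a convex problem here.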
(b) The recovery error is 0.1208. This is very impressive considering how little information we
were given.
The following matlab code solves the problem:
zero_crossings_data;

% Construct matrix A whose columns are bandlimited sinusoids


C = zeros(n,B);
S = zeros(n,B);
for j = 1:B
C(:,j) = cos(2*pi * (f_min+j-1) * (1:n) / n);
S(:,j) = sin(2*pi * (f_min+j-1) * (1:n) / n);
end
A = [C S];

% Minimize norm subject to L1 normalization and sign constraints


cvx_begin quiet
variable x(2*B)
minimize norm(A*x)
subject to
s .* (A*x) >= 0
s' * (A*x) == n
cvx_end

y_hat = A*x;
fprintf('Recovery error: %f\n', norm(y - y_hat) / norm(y));
figure
plot(y)
hold all
plot(y_hat)
xlim([0,n])
legend('original', 'recovered', 'Location', 'SouthEast');
title('original and recovered bandlimited signals');
Figure 12: The original bandlimited signal $y$ and the estimate $\hat y$ recovered from zero crossings.

The following Python code solves the problem:
import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

from zero_crossings_data import *

# Construct matrix A whose columns are bandlimited sinusoids
C = np.zeros((n, B))
S = np.zeros((n, B))
for j in range(B):
    C[:, j] = np.cos(2 * np.pi * (f_min + j) * np.arange(1, n + 1) / n)
    S[:, j] = np.sin(2 * np.pi * (f_min + j) * np.arange(1, n + 1) / n)
A = np.hstack((C, S))

# Minimize norm subject to L1 normalization and sign constraints
x = cvx.Variable(2 * B)
obj = cvx.norm(A * x)
constraints = [cvx.mul_elemwise(s, A * x) >= 0,
               s.T * (A * x) == n]
problem = cvx.Problem(cvx.Minimize(obj), constraints)
problem.solve()

y_hat = np.dot(A, x.value.A1)

print('Recovery error: {}'.format(np.linalg.norm(y - y_hat) / np.linalg.norm(y)))
plt.figure()
plt.plot(np.arange(0, n), y, label='original');
plt.plot(np.arange(0, n), y_hat, label='recovered');
plt.xlim([0, n])
plt.legend(loc='lower left')
plt.show()
The following Julia code solves the problem:
using Convex, SCS, Gadfly
set_default_solver(SCSSolver(verbose=false))

include("zero_crossings_data.jl")

# Construct matrix A whose columns are bandlimited sinusoids


C = zeros(n, B);
S = zeros(n, B);
for j in 1:B
C[:, j] = cos(2 * pi * (f_min + j - 1) * (1:n) / n);
S[:, j] = sin(2 * pi * (f_min + j - 1) * (1:n) / n);
end
A = [C S];

# Minimize norm subject to L1 normalization and sign constraints


x = Variable(2 * B)
obj = norm(A * x, 2)
constraints = [s .* (A * x) >= 0,
s' * (A * x) == n]
problem = minimize(obj, constraints)
solve!(problem)

y_hat = A * x.value
println("Recovery error: $(norm(y - y_hat) / norm(y))")
pl = plot(
layer(x=1:n, y=y, Geom.line, Theme(default_color = colorant"blue")),
layer(x=1:n, y=y_hat, Geom.line, Theme(default_color = colorant"green")),
);
display(pl);

13 Finance
13.1 Transaction cost. Consider a market for some asset or commodity, which we assume is infinitely divisible, i.e., can be bought or sold in quantities of shares that are real numbers (as opposed to integers). The order book at some time consists of a set of offers to sell or buy the asset, at a given price, up to a given quantity of shares. The $N$ offers to sell the asset have positive prices per share $p_1^{\rm sell}, \ldots, p_N^{\rm sell}$, sorted in increasing order, in positive share quantities $q_1^{\rm sell}, \ldots, q_N^{\rm sell}$. The $M$ offers to buy the asset have positive prices $p_1^{\rm buy}, \ldots, p_M^{\rm buy}$, sorted in decreasing order, and positive quantities $q_1^{\rm buy}, \ldots, q_M^{\rm buy}$. The price $p_1^{\rm sell}$ is called the (current) ask price for the asset; $p_1^{\rm buy}$ is the bid price for the asset. The ask price is larger than the bid price; the difference is called the spread. The average of the ask and bid prices is called the mid-price, denoted $p^{\rm mid}$.
Now suppose that you want to purchase $q > 0$ shares of the asset, where $q \leq q_1^{\rm sell} + \cdots + q_N^{\rm sell}$, i.e., your purchase quantity does not exceed the total amount of the asset currently offered for sale. Your purchase proceeds as follows. Suppose that

$$q_1^{\rm sell} + \cdots + q_k^{\rm sell} < q \leq q_1^{\rm sell} + \cdots + q_{k+1}^{\rm sell}.$$

Then you pay an amount

$$A = p_1^{\rm sell} q_1^{\rm sell} + \cdots + p_k^{\rm sell} q_k^{\rm sell} + p_{k+1}^{\rm sell} (q - q_1^{\rm sell} - \cdots - q_k^{\rm sell}).$$

Roughly speaking, you work your way through the offers in the order book, from the least (ask) price, and working your way up the order book until you fill the order. We define the transaction cost as
$$T(q) = A - p^{\rm mid} q.$$
This is the difference between what you pay, and what you would have paid had you been able to purchase the shares at the mid-price. It is always positive.
We handle the case of selling the asset in a similar way. Here we take $q < 0$ to mean that we sell $-q$ shares of the asset. Here you sell shares at the bid price, up to the quantity $q_1^{\rm buy}$ (or $-q$, whichever is smaller); if needed, you sell shares at the price $p_2^{\rm buy}$, and so on, until all $-q$ shares are sold. Here we assume that $-q \leq q_1^{\rm buy} + \cdots + q_M^{\rm buy}$, i.e., you are not selling more shares than the total quantity of offers to buy. Let $A$ denote the amount you receive from the sale. Here we define the transaction cost as
$$T(q) = -p^{\rm mid} q - A,$$
the difference between the amount you would have received had you sold the shares at the mid-price, and the amount you received. It is always positive. We set $T(0) = 0$.
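The definition of $T$ above can be turned into a short routine that walks through the order book. This is a sketch under stated assumptions: the function name `transaction_cost` and the small order book used to exercise it are hypothetical, chosen only to illustrate the definition.

```python
def transaction_cost(q, p_sell, q_sell, p_buy, q_buy):
    """Transaction cost T(q) for buying (q > 0) or selling (q < 0) |q| shares.

    p_sell is sorted in increasing order, p_buy in decreasing order, as in the
    problem statement. Raises ValueError if the order book cannot fill |q|.
    """
    if q == 0:
        return 0.0
    p_mid = 0.5 * (p_sell[0] + p_buy[0])   # average of ask and bid prices
    prices, sizes = (p_sell, q_sell) if q > 0 else (p_buy, q_buy)

    remaining = abs(q)
    amount = 0.0                           # A: amount paid (buy) or received (sell)
    for price, size in zip(prices, sizes):
        fill = min(remaining, size)
        amount += price * fill
        remaining -= fill
        if remaining <= 0:
            break
    if remaining > 1e-12:
        raise ValueError("order exceeds total quantity in the order book")

    # Buying: T = A - p_mid*q.  Selling: T = -p_mid*q - A (with q < 0).
    return amount - p_mid * q if q > 0 else -p_mid * q - amount


# Hypothetical order book: ask 10.1, bid 9.9, so mid-price 10.0 and spread 0.2.
p_sell, q_sell = [10.1, 10.3], [100.0, 200.0]
p_buy, q_buy = [9.9, 9.7], [100.0, 200.0]
for q in (50, 100, 150, -150):
    print(q, transaction_cost(q, p_sell, q_sell, p_buy, q_buy))
```

For orders that fit inside the first offer, the cost is exactly half the spread per share; larger orders cost more per share, which is the source of convexity asked about in part (a).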

(a) Show that $T$ is a convex piecewise linear function.

(b) Show that $T(q) \geq (s/2)|q|$, where $s$ is the spread. When would we have $T(q) = (s/2)|q|$ for all $q$ (in the range between the total shares offered to purchase or sell)?
(c) Give an interpretation of the conjugate function $T^*(y) = \sup_q (yq - T(q))$. Hint. Suppose you can purchase or sell the asset in another market, at the price $p^{\rm other}$.

Solution.

(a)
(b)
(c) Suppose you can purchase (or sell) elsewhere, in another market, any amount of the asset at the price $p^{\rm other}$. If you buy $q$ shares in the market, and then sell them in the other market, your profit is $q p^{\rm other} - q p^{\rm mid} - T(q)$. In fact, this formula works for $q < 0$ as well, which corresponds to buying $-q$ shares in the other market, and selling them in the given market. Naturally, you would choose $q$ to maximize your profit. Your optimal profit is $T^*(p^{\rm other} - p^{\rm mid})$. So, the conjugate function maps a price differential with respect to another market into the optimal profit you can attain.

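13.2 Risk-return trade-off in portfolio optimization. We consider the portfolio risk-return trade-off problem of page 185, with the following data:

$$
\bar p = \left[\begin{array}{c} 0.12 \\ 0.10 \\ 0.07 \\ 0.03 \end{array}\right], \qquad
\Sigma = \left[\begin{array}{rrrr}
0.0064 & 0.0008 & -0.0011 & 0 \\
0.0008 & 0.0025 & 0 & 0 \\
-0.0011 & 0 & 0.0004 & 0 \\
0 & 0 & 0 & 0
\end{array}\right].
$$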

(a) Solve the quadratic program

$$
\begin{array}{ll}
\mbox{minimize} & -\bar p^T x + \mu x^T \Sigma x \\
\mbox{subject to} & \mathbf{1}^T x = 1, \quad x \succeq 0
\end{array}
$$

for a large number of positive values of $\mu$ (for example, 100 values logarithmically spaced between 1 and $10^7$). Plot the optimal values of the expected return $\bar p^T x$ versus the standard deviation $(x^T \Sigma x)^{1/2}$. Also make an area plot of the optimal portfolios $x$ versus the standard deviation (as in figure 4.12).
(b) Assume the price change vector $p$ is a Gaussian random variable, with mean $\bar p$ and covariance $\Sigma$. Formulate the problem

$$
\begin{array}{ll}
\mbox{maximize} & \bar p^T x \\
\mbox{subject to} & \mathop{\bf prob}(p^T x \leq 0) \leq \eta \\
& \mathbf{1}^T x = 1, \quad x \succeq 0,
\end{array}
$$

as a convex optimization problem, where $\eta < 1/2$ is a parameter. In this problem we maximize the expected return subject to a constraint on the probability of a negative return. Solve the problem for a large number of values of $\eta$ between $10^{-4}$ and $10^{-1}$, and plot the optimal values of $\bar p^T x$ versus $\eta$. Also make an area plot of the optimal portfolios $x$ versus $\eta$.
Hint: The Matlab functions erfc and erfcinv can be used to evaluate
$$\Phi(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{x} e^{-t^2/2} \, dt$$
and its inverse:
$$\Phi(u) = \frac{1}{2} \mathop{\rm erfc}(-u/\sqrt{2}).$$
Since you will have to solve this problem for a large number of values of $\eta$, you may find the command cvx_quiet(true) helpful.

(c) Monte Carlo simulation. Let $x$ be the optimal portfolio found in part (b), with $\eta = 0.05$.
This portfolio maximizes the expected return, subject to the probability of a loss being no
more than 5%. Generate 10000 samples of p, and plot a histogram of the returns. Find the
empirical mean of the return samples, and calculate the percentage of samples for which a loss
occurs.
Hint: You can generate samples of the price change vector using
p=pbar+sqrtm(Sigma)*randn(4,1);

Solution.

(a) Risk-return tradeoff.

n=4;
pbar = [.12 .10 .07 .03]';
S = [0.0064 0.0008 -0.0011 0;
0.0008 0.0025 0 0;
-0.0011 0 0.0004 0;
0 0 0 0];
novals = 100;
returns = zeros(1,novals);
stdevs = zeros(1,novals);
pfs = zeros(n,novals);
muvals = logspace(0,7,novals);
for i=1:novals
mu = muvals(i);
cvx_begin
variable x(n);
minimize (-pbar'*x + mu*quad_form(x,S));
subject to
sum(x) == 1;
x >= 0;
cvx_end;
returns(i) = pbar'*x;
stdevs(i) = sqrt(x'*S*x);
pfs(:,i) = x;
end;

figure(1)
plot(stdevs,returns);
figure(2)
plot(stdevs, pfs(1,:)); hold on
plot(stdevs, (pfs(1,:)+pfs(2,:)));
plot(stdevs, (pfs(1,:)+pfs(2,:)+pfs(3,:)));
hold off
axis([-.01 .09 -0.1 1.1]);

[Figure: optimal expected return versus standard deviation of the return (left); area plot of the optimal portfolio allocations x1, x2, x3 versus standard deviation (right).]

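(b) Portfolio optimization with loss constraint. The problem

$$
\begin{array}{ll}
\mbox{maximize} & \bar p^T x \\
\mbox{subject to} & \Phi\left((\beta - \bar p^T x)/\sqrt{x^T \Sigma x}\right) \leq \eta \\
& \mathbf{1}^T x = 1, \quad x \succeq 0
\end{array}
$$

(with $\beta = 0$ for this problem) is equivalent to the SOCP

$$
\begin{array}{ll}
\mbox{maximize} & \bar p^T x \\
\mbox{subject to} & \bar p^T x + \Phi^{-1}(\eta) \|\Sigma^{1/2} x\|_2 \geq \beta \\
& \mathbf{1}^T x = 1, \quad x \succeq 0.
\end{array}
$$

(Note that $\Phi^{-1}(\eta) < 0$ if $\eta < 1/2$.)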
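The equivalence between the probability constraint and the second-order cone constraint can be checked for any fixed portfolio. The sketch below is an illustration only, using Python's standard-library `statistics.NormalDist` for $\Phi$ and $\Phi^{-1}$; the function names and the uniform test portfolio are assumptions, and it takes $\beta = 0$ and a portfolio with nonzero return variance.

```python
from statistics import NormalDist
import numpy as np

pbar = np.array([0.12, 0.10, 0.07, 0.03])
Sigma = np.array([[0.0064, 0.0008, -0.0011, 0.0],
                  [0.0008, 0.0025, 0.0, 0.0],
                  [-0.0011, 0.0, 0.0004, 0.0],
                  [0.0, 0.0, 0.0, 0.0]])
std_normal = NormalDist()   # zero mean, unit variance

def loss_probability(x):
    """prob(p^T x <= 0) for Gaussian p; assumes x^T Sigma x > 0."""
    mean = float(pbar @ x)
    std = float(np.sqrt(x @ Sigma @ x))
    return std_normal.cdf(-mean / std)

def socp_constraint_holds(x, eta):
    """Check pbar^T x + Phi^{-1}(eta) * ||Sigma^{1/2} x||_2 >= 0."""
    gamma = std_normal.inv_cdf(eta)        # negative when eta < 1/2
    std = float(np.sqrt(x @ Sigma @ x))    # equals ||Sigma^{1/2} x||_2
    return float(pbar @ x) + gamma * std >= 0

x = np.full(4, 0.25)   # hypothetical test portfolio (uniform allocation)
for eta in (0.05, 1e-4):
    print(eta, loss_probability(x) <= eta, socp_constraint_holds(x, eta))
```

For each value of $\eta$, the two printed booleans agree, as the derivation predicts.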

% compute S^{1/2}
[V,D] = eig(S);
sqrtS = V*diag(sqrt(diag(D)))*V';

beta = 0.0;
novals = 100;
etas = logspace(-4,-1,novals);

% maximize pbar'*x
% subject to Phi((beta-pbar'*x) / sqrt(x'*S*x)) <= eta
% 1'*x = 1, x >= 0
% gamma = Phi^{-1}(eta) where Phi(u) = .5 * erfc(-u/sqrt(2))

returns = zeros(1,novals);
pfs = zeros(n,novals);
for k = 1:novals
eta = etas(k);
gamma = -sqrt(2)*erfcinv(2*eta);
cvx_begin
variable x(n)

maximize(pbar'*x)
subject to
sum(x) == 1;
x >= 0;
pbar'*x + gamma * norm(sqrtS*x,2) >= beta
cvx_end
returns(k) = pbar'*x;
pfs(:,k) = x;
end;

figure(1)
semilogx(etas,returns)
figure(2)
semilogx(etas, pfs(1,:)); hold on
semilogx(etas, (pfs(1,:)+pfs(2,:)));
semilogx(etas, (pfs(1,:)+pfs(2,:)+pfs(3,:)));
grid on
hold off
[Figure: optimal expected return versus $\eta$, logarithmic horizontal axis from $10^{-4}$ to $10^{-1}$ (left); area plot of the optimal portfolio allocations x1, x2, x3 versus $\eta$ (right).]

(c) The following code was used for this part of the problem:
randn('state',0);

% Number of Monte Carlo samples


N=10000;

eta = 0.05;
gamma = sqrt(2)*erfcinv(2*(1-eta));

% Get maximizing portfolio for eta = 0.05


cvx_begin
variable x(n)
maximize(pbar'*x)
subject to
sum(x) == 1
x >= 0
pbar'*x + gamma * norm(sqrtS*x,2) >= 0
cvx_end

% Monte Carlo simulation


returns = [];
for i = 1:N
p = pbar+sqrtS*randn(n,1);
returns = [returns p'*x];
end

% Empirical mean, histogram and percentage of losses


emp_mean = mean(returns)
perc_losses = sum(returns < 0)/N
hist(returns,50)
hold on
plot([0;0],[0;700],'r--')
The following figure shows the histogram of the returns:
[Figure: histogram of the simulated returns; horizontal axis: return.]

In this case the empirical mean of the return is 0.1182 and approximately 4.97% of the samples
generate a loss.

13.3 Simple portfolio optimization. We consider a portfolio optimization problem as described on pages 155 and 185–186 of Convex Optimization, with data that can be found in the file simple_portfolio_data.*.

(a) Find minimum-risk portfolios with the same expected return as the uniform portfolio ($x = (1/n)\mathbf{1}$), with risk measured by portfolio return variance, and the following portfolio constraints (in addition to $\mathbf{1}^T x = 1$):

• No (additional) constraints.
• Long-only: $x \succeq 0$.
• Limit on total short position: $\mathbf{1}^T (x_-) \leq 0.5$, where $(x_-)_i = \max\{-x_i, 0\}$.

Compare the optimal risk in these portfolios with each other and the uniform portfolio.
(b) Plot the optimal risk-return trade-off curves for the long-only portfolio, and for total short-
position limited to 0.5, in the same figure. Follow the style of figure 4.12 (top), with horizontal
axis showing standard deviation of portfolio return, and vertical axis showing mean return.

Solution.
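(a) We can express these as QPs:

• No (additional) constraints:
$$
\begin{array}{ll}
\mbox{minimize} & x^T \Sigma x \\
\mbox{subject to} & \mathbf{1}^T x = 1, \quad \bar p^T x = \bar p^T (1/n)\mathbf{1}
\end{array}
$$
• Long only:
$$
\begin{array}{ll}
\mbox{minimize} & x^T \Sigma x \\
\mbox{subject to} & x \succeq 0, \quad \mathbf{1}^T x = 1 \\
& \bar p^T x = \bar p^T (1/n)\mathbf{1}
\end{array}
$$
• Limit on total short position:
$$
\begin{array}{ll}
\mbox{minimize} & x^T \Sigma x \\
\mbox{subject to} & \mathbf{1}^T x = 1, \quad \bar p^T x = \bar p^T (1/n)\mathbf{1} \\
& \mathbf{1}^T x_- \leq 0.5
\end{array}
$$

Although the portfolios all have the same expected return, their risk profiles differ drastically. The standard deviations are:

• uniform: 8.7%
• long-only: 5.1%
• limit on total short position: 2.1%
• unconstrained: 1.9%

for the MATLAB data, and

• uniform: 27.00%
• long-only: 3.95%
• limit on total short position: 1.50%
• unconstrained: 0.9%

for the Python data. Notice that as the size of the feasible set increases, the objective value improves.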
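The ordering of the optimal risks follows from the nesting of the three feasible sets, which in turn rests on the definition of the total short position $\mathbf{1}^T x_-$. The NumPy sketch below only illustrates that definition; the helper name `total_short` and the example portfolios are hypothetical.

```python
import numpy as np

def total_short(x):
    """Total short position 1^T x_-, where (x_-)_i = max(-x_i, 0)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.maximum(-x, 0.0)))

# Hypothetical portfolios, each summing to one.
long_only = np.array([0.5, 0.3, 0.2])
mild_short = np.array([0.7, 0.6, -0.3])
heavy_short = np.array([1.5, 0.4, -0.9])

for x in (long_only, mild_short, heavy_short):
    short = total_short(x)
    # Long-only means short == 0, so it always satisfies the 0.5 limit;
    # the unconstrained problem accepts all three portfolios.
    print(x, short, short == 0.0, short <= 0.5)
```

Every long-only portfolio has zero total short position and therefore satisfies the 0.5 limit, so the long-only feasible set is contained in the short-limited one, which is contained in the unconstrained one.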
(b) The optimal risk-return trade-off curves can be generated by scalarizing the bicriterion objective, $\bar p^T x - \mu x^T \Sigma x$, over a range of values for $\mu$. For example, the optimal curve for the long-only portfolio is generated by solving the following family of QPs:
$$
\begin{array}{ll}
\mbox{maximize} & \bar p^T x - \mu x^T \Sigma x \\
\mbox{subject to} & x \succeq 0, \quad \mathbf{1}^T x = 1
\end{array}
$$

[Figures: optimal risk-return trade-off curves (mean return versus standard deviation of portfolio return) for the long-only portfolio and for the portfolio with a limit on total short position; the second plot is labeled "Return in %" versus "Risk in %".]

For every level of risk, there is a portfolio with limits on total short positions (red) whose expected return is at least as large as that of the optimal long-only portfolio (blue).
The following CVX code solves this problem:
simple_portfolio_data;
%% part i
%minimum-risk unconstrained portfolio with same expected return as uniform
%allocation
cvx_begin
cvx_quiet(true)

variable x_unconstrained(n)
minimize(quad_form(x_unconstrained,S))
subject to
sum(x_unconstrained)==1;
pbar'*x_unconstrained==x_unif'*pbar;
cvx_end
%% part ii
%minimum-risk long-only portfolio with same expected return as uniform
%allocation
cvx_begin
cvx_quiet(true)
variable x_long(n)
minimize(quad_form(x_long,S))
subject to
x_long>=0;
sum(x_long)==1;
pbar'*x_long==x_unif'*pbar;
cvx_end
%% part iii
%minimum-risk constrained short portfolio with same expected return as uniform
%allocation
cvx_begin
cvx_quiet(true)
variable x_shortconstr(n)
minimize(quad_form(x_shortconstr,S))
subject to
sum(pos(-x_shortconstr))<=0.5;
sum(x_shortconstr)==1;
pbar'*x_shortconstr==x_unif'*pbar;
cvx_end
%% Generate risk-return trade-off curves
sprintf('unconstrained sd: %0.3g\n', sqrt(quad_form(x_unconstrained,S)))
sprintf('long only sd: %0.3g\n', sqrt(quad_form(x_long,S)))
sprintf('constrained short sd: %0.3g\n', sqrt(quad_form(x_shortconstr,S)))
sprintf('x_unif sd: %0.3g\n', sqrt(quad_form(x_unif,S)))
novals=100;
r_long = [];
r_shortconstr = [];
sd_long = [];
sd_shortconstr = [];
muvals = logspace(-1,4,novals);

for i=1:novals
mu = muvals(i);
%long only

cvx_begin
cvx_quiet(true)
variable x(n)
maximize(pbar'*x - mu*quad_form(x,S))
subject to
x>=0;
sum(x)==1;
cvx_end
r_long = [r_long, pbar'*x];
sd_long = [sd_long, sqrt(x'*S*x) ];
%constrained short
cvx_begin
cvx_quiet(true)
variables x(n)
maximize(pbar'*x - mu*quad_form(x,S))
subject to
sum(x)==1;
sum(pos(-x))<=.5;
cvx_end
r_shortconstr = [r_shortconstr, pbar'*x];
sd_shortconstr = [sd_shortconstr, sqrt(x'*S*x)];
end;

plot(sd_long, r_long);
hold; plot(sd_shortconstr, r_shortconstr, 'r');
The following CVXPY code implements this:
import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

np.random.seed(1)
n = 20
pbar = np.ones((n,1))*.03 + np.r_[np.random.rand(n-1,1), np.zeros((1,1))]*.12;
S = np.random.randn(n, n); S = np.asmatrix(S)
S = S.T*S
S = S/max(np.abs(np.diag(S)))*.2
S[:, -1] = np.zeros((n, 1))
S[-1, :] = np.zeros((n, 1)).T
x_unif = np.ones((n, 1))/n; x_unit = np.asmatrix(x_unif)

x = cvx.Variable(n)
risk = cvx.quad_form(x,S)
constraints = [cvx.sum_entries(x)==1, pbar.T*x==sum(pbar)/n]

#Uniform portfolio
print('Risk for uniform: %.2f%%' % float(np.sqrt(np.sum(pbar)/n)*100))

#No additional constraints
cvx.Problem(cvx.Minimize(risk), constraints).solve()
print('Risk for unconstrained: %.2f%%' % float(np.sqrt(risk.value)*100))

#Long only
cvx.Problem(cvx.Minimize(risk), constraints + [x>=0]).solve()
print('Risk for long only: %.2f%%' % float(np.sqrt(risk.value)*100))

#Limit on total short position
cvx.Problem(cvx.Minimize(risk), constraints\
    + [cvx.sum_entries(cvx.neg(x))<=0.5]).solve()
print('Risk for limit on short: %.2f%%' % float(np.sqrt(risk.value)*100))

gamma = cvx.Parameter(sign='positive')
expec_return = pbar.T*x
prob = cvx.Problem(cvx.Maximize(expec_return-gamma*risk), [])

N = 128

#Long only
gamma_vals = np.logspace(-1,5,num=N)
return_vec1 = np.zeros((N,1))
risk_vec1 = np.zeros((N,1))
prob.constraints = [cvx.sum_entries(x)==1, x>=0]
for i in range(N):
    gamma.value = gamma_vals[i]
    prob.solve()
    return_vec1[i] = expec_return.value
    risk_vec1[i] = risk.value

plt.figure()
plt.plot(np.sqrt(risk_vec1)*100, return_vec1*100, label='Long only')

#Limit on Short
return_vec2 = np.zeros((N,1))
risk_vec2 = np.zeros((N,1))
prob.constraints = [cvx.sum_entries(x)==1, cvx.sum_entries(cvx.neg(x))<=0.5]
for i in range(N):
    gamma.value = gamma_vals[i]
    prob.solve()
    return_vec2[i] = expec_return.value
    risk_vec2[i] = risk.value

plt.plot(np.sqrt(risk_vec2)*100, return_vec2*100, label='Limit on short')

plt.legend()
plt.xlabel('Risk in %')
plt.ylabel('Return in %')
plt.savefig('simple_portfolio.eps')
plt.show()
The following Convex.jl code implements this:
using Convex, Gadfly
include("simple_portfolio_data.jl")

# uniform portfolio
println("uniform sd: ", sqrt(x_unif'*S*x_unif));

# part i
# minimum-risk unconstrained portfolio with same expected return as uniform
# allocation
x_unconstrained = Variable(n);
p = minimize(quadform(x_unconstrained, S));
p.constraints += sum(x_unconstrained) == 1;
p.constraints += pbar'*x_unconstrained == pbar'*x_unif;
solve!(p);
println("unconstrained sd: ", sqrt(p.optval));

# part ii
# minimum-risk long-only portfolio with same expected return as uniform
# allocation
x_long = Variable(n);
p = minimize(quadform(x_long, S));
p.constraints += x_long >= 0;
p.constraints += sum(x_long) == 1;
p.constraints += pbar'*x_long == pbar'*x_unif;
solve!(p);
println("long only sd: ", sqrt(p.optval));

# part iii
# minimum-risk constrained short portfolio with same expected return as uniform
# allocation
x_shortconstr = Variable(n);
p = minimize(quadform(x_shortconstr, S));
p.constraints += sum(pos(-x_shortconstr)) <= 0.5;
p.constraints += sum(x_shortconstr) == 1;
p.constraints += pbar'*x_shortconstr == pbar'*x_unif;
solve!(p);
println("constrained short sd: ", sqrt(p.optval));

# Generate risk-return trade-off curves


novals=50;
r_long = [];
r_shortconstr = [];
sd_long = [];
sd_shortconstr = [];
muvals = logspace(-1,3,novals);

for i=1:novals
mu = muvals[i];

# long only
x = Variable(n);
ret = pbar'*x;
risk = quadform(x, S);
p = maximize(ret - mu*risk);
p.constraints += x >= 0;
p.constraints += sum(x) == 1;
solve!(p);
r_long = [r_long; evaluate(ret)];
sd_long = [sd_long; sqrt(evaluate(risk))];

# constrained short
x = Variable(n);
ret = pbar'*x;
risk = quadform(x, S);
p = maximize(ret - mu*risk);
p.constraints += sum(pos(-x)) <= 0.5;
p.constraints += sum(x) == 1;
solve!(p);
r_shortconstr = [r_shortconstr; evaluate(ret)];
sd_shortconstr = [sd_shortconstr; sqrt(evaluate(risk))];
end

plot(
layer(x=sd_long, y=r_long, Geom.line,
Theme(default_color=color("red"))),
layer(x=sd_shortconstr, y=r_shortconstr, Geom.line,
Theme(default_color=color("blue")))
)

13.4 Bounding portfolio risk with incomplete covariance information. Consider the following instance of the problem described in §4.6, on p171–173 of Convex Optimization. We suppose that $\Sigma_{ii}$, which are the squares of the price volatilities of the assets, are known. For the off-diagonal entries of $\Sigma$, all we know is the sign (or, in some cases, nothing at all). For example, we might be given that $\Sigma_{12} \geq 0$, $\Sigma_{23} \leq 0$, etc. This means that we do not know the correlation between $p_1$ and $p_2$, but we do know that they are nonnegatively correlated (i.e., the prices of assets 1 and 2 tend to rise or fall together).
Compute $\sigma_{\rm wc}^2$, the worst-case variance of the portfolio return, for the specific case

$$
x = \left[\begin{array}{r} 0.1 \\ 0.2 \\ -0.05 \\ 0.1 \end{array}\right], \qquad
\Sigma = \left[\begin{array}{cccc}
0.2 & + & + & \pm \\
+ & 0.1 & - & - \\
+ & - & 0.3 & + \\
\pm & - & + & 0.1
\end{array}\right],
$$

where a "+" entry means that the element is nonnegative, a "−" means the entry is nonpositive, and "±" means we don't know anything about the entry. (The negative value in $x$ represents a short position: you sold stocks that you didn't have, but must produce at the end of the investment period.) In addition to $\sigma_{\rm wc}^2$, give the covariance matrix $\Sigma_{\rm wc}$ associated with the maximum risk. Compare the worst-case risk with the risk obtained when $\Sigma$ is diagonal.
Solution. We can solve the problem with the following CVX and CVXPY code:

x = [0.1 0.2 -0.05 0.1]';

cvx_begin
variable Sigma(4,4) symmetric
maximize(x'*Sigma*x)
subject to
Sigma(1,1) == 0.2; Sigma(2,2) == 0.1;
Sigma(3,3) == 0.3; Sigma(4,4) == 0.1;
Sigma(1,2) >= 0; Sigma(1,3) >= 0;
Sigma(2,3) <= 0; Sigma(2,4) <= 0;
Sigma(3,4) >= 0;
Sigma == semidefinite(4);
cvx_end

s_wc = sqrt(cvx_optval)

import cvxpy as cvx
import numpy as np

x = np.matrix('0.1; 0.2; -0.05; 0.1');

Sigma = cvx.semidefinite(4)
constraints = [Sigma[0,0]==0.2, Sigma[1,1]==0.1]
constraints += [Sigma[2,2]==0.3, Sigma[3,3]==0.1]
constraints += [Sigma[0,1]>=0, Sigma[0,2]>=0]
constraints += [Sigma[1,2]<=0, Sigma[1,3]<=0, Sigma[2,3]>=0]
objective = cvx.Maximize(x.T*Sigma*x)

cvx.Problem(objective, constraints).solve()

sigma_wc = np.sqrt(objective.value)

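Running this script we get the optimal value to be 0.0151662, that is, we get $\sigma_{\rm wc} = 0.1232$. The associated matrix is given by

$$
\Sigma_{\rm wc} = \left[\begin{array}{rrrr}
0.2000 & 0.0790 & 0.0000 & 0.1118 \\
0.0790 & 0.1000 & -0.1387 & 0.0000 \\
0.0000 & -0.1387 & 0.3000 & 0.0752 \\
0.1118 & 0.0000 & 0.0752 & 0.1000
\end{array}\right]
$$
in Matlab,
$$
\Sigma_{\rm wc} = \left[\begin{array}{rrrr}
0.2000 & 0.0847 & 0.0000 & 0.1003 \\
0.0847 & 0.1000 & -0.1273 & 0.0000 \\
0.0000 & -0.1273 & 0.3000 & 0.0522 \\
0.1003 & 0.0000 & 0.0522 & 0.1000
\end{array}\right]
$$
in Python, and
$$
\Sigma_{\rm wc} = \left[\begin{array}{rrrr}
0.2000 & 0.0979 & 0.0000 & 0.0741 \\
0.0978 & 0.1000 & -0.1010 & 0.0000 \\
0.0000 & -0.1010 & 0.3000 & 0.0000 \\
0.0741 & 0.0000 & 0.0000 & 0.1000
\end{array}\right]
$$
in Julia.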
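The comparison with a diagonal $\Sigma$ requested in the problem statement needs no optimization. The NumPy sketch below is an illustrative check, not part of the original scripts: it evaluates $x^T \Sigma x$ for the diagonal matrix built from the known variances and, as a consistency check, for the worst-case matrix reported above for Matlab (entered to the four digits shown).

```python
import numpy as np

x = np.array([0.1, 0.2, -0.05, 0.1])
variances = np.array([0.2, 0.1, 0.3, 0.1])   # known diagonal entries of Sigma

# Risk when Sigma is diagonal (uncorrelated assets).
var_diag = float(x @ np.diag(variances) @ x)
sd_diag = np.sqrt(var_diag)

# Worst-case matrix as reported for Matlab, rounded to four digits.
Sigma_wc = np.array([[0.2000, 0.0790, 0.0000, 0.1118],
                     [0.0790, 0.1000, -0.1387, 0.0000],
                     [0.0000, -0.1387, 0.3000, 0.0752],
                     [0.1118, 0.0000, 0.0752, 0.1000]])
var_wc = float(x @ Sigma_wc @ x)
sd_wc = np.sqrt(var_wc)

print("diagonal:   variance %.5f, std. dev. %.4f" % (var_diag, sd_diag))
print("worst case: variance %.5f, std. dev. %.4f" % (var_wc, sd_wc))
```

With a diagonal $\Sigma$ the variance is 0.00775 (standard deviation about 0.088), noticeably smaller than the worst-case standard deviation 0.1232.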
13.5 Log-optimal investment strategy. In this problem you will solve a specific instance of the log-optimal investment problem described in exercise 4.60, with $n = 5$ assets and $m = 10$ possible outcomes in each period. The problem data are defined in log_opt_invest.*, with the rows of the matrix P giving the asset return vectors $p_j^T$. The outcomes are equiprobable, i.e., we have $\pi_j = 1/m$. Each column of the matrix P gives the return of the associated asset in the different possible outcomes. You can examine the columns to get an idea of the types of assets. For example, the last asset gives a fixed and certain return of 1%; the first asset is a very risky one, with occasional large return, and (more often) substantial loss.
Find the log-optimal investment strategy $x^\star$, and its associated long term growth rate $R_{\rm lt}^\star$. Compare this to the long term growth rate obtained with a uniform allocation strategy, i.e., $x = (1/n)\mathbf{1}$, and also with a pure investment in each asset.
For the optimal investment strategy, and also the uniform investment strategy, plot 10 sample trajectories of the accumulated wealth, i.e., $W(T) = W(0) \prod_{t=1}^{T} \lambda(t)$, for $T = 0, \ldots, 200$, with initial wealth $W(0) = 1$.
To save you the trouble of figuring out how to simulate the wealth trajectories or plot them nicely, we've included the simulation and plotting code in log_opt_invest.*; you just have to add the code needed to find $x^\star$.
Hint (MATLAB users only): The current version of CVX handles the logarithm via an iterative method, which can be slow and unreliable. You're better off using geo_mean(), which is directly handled by CVX, to solve the problem.
Solution.

(a) The following MATLAB code was used to solve this problem:
clear all

P = [3.5000 1.1100 1.1100 1.0400 1.0100;


0.5000 0.9700 0.9800 1.0500 1.0100;
0.5000 0.9900 0.9900 0.9900 1.0100;
0.5000 1.0500 1.0600 0.9900 1.0100;
0.5000 1.1600 0.9900 1.0700 1.0100;
0.5000 0.9900 0.9900 1.0600 1.0100;
0.5000 0.9200 1.0800 0.9900 1.0100;
0.5000 1.1300 1.1000 0.9900 1.0100;
0.5000 0.9300 0.9500 1.0400 1.0100;
3.5000 0.9900 0.9700 0.9800 1.0100];

[m,n] = size(P);

% Find log-optimal investment policy


cvx_begin
variable x_opt(n)
maximize(geo_mean(P*x_opt))
sum(x_opt) == 1
x_opt >= 0
cvx_end

x_opt
x_unif = ones(n,1)/n;
R_opt = sum(log(P*x_opt))/m
R_unif = sum(log(P*x_unif))/m
The following Python code solves the problem:
import numpy as np
import cvxpy as cvx
from scipy.stats import gmean
from cmath import log

P = np.array(np.mat(
    '3.5000 1.1100 1.1100 1.0400 1.0100;\
    0.5000 0.9700 0.9800 1.0500 1.0100;\
    0.5000 0.9900 0.9900 0.9900 1.0100;\
    0.5000 1.0500 1.0600 0.9900 1.0100;\
    0.5000 1.1600 0.9900 1.0700 1.0100;\
    0.5000 0.9900 0.9900 1.0600 1.0100;\
    0.5000 0.9200 1.0800 0.9900 1.0100;\
    0.5000 1.1300 1.1000 0.9900 1.0100;\
    0.5000 0.9300 0.9500 1.0400 1.0100;\
    3.5000 0.9900 0.9700 0.9800 1.0100'))

m=P.shape[0];
n=P.shape[1];
x_unif = np.ones(n)/n; # uniform resource allocation

x_opt = cvx.Variable(n)
objective = cvx.Maximize(cvx.sum_entries(cvx.log(P*x_opt)))
constraints = [ cvx.sum_entries(x_opt) == 1,
x_opt >= 0]
prob = cvx.Problem(objective, constraints)
result = prob.solve()

x_opt = x_opt.value
print "status:", prob.status
print "Log-optimal investment strategy:"
print x_opt
print "Optimal long term growth rate:" , prob.value/m
print "Long term growth associated with uniform allocation strategy:",
print np.log(gmean(P.dot(x_unif)))
The following Julia code solves the problem:
P = [3.5000 1.1100 1.1100 1.0400 1.0100;
0.5000 0.9700 0.9800 1.0500 1.0100;
0.5000 0.9900 0.9900 0.9900 1.0100;
0.5000 1.0500 1.0600 0.9900 1.0100;
0.5000 1.1600 0.9900 1.0700 1.0100;
0.5000 0.9900 0.9900 1.0600 1.0100;
0.5000 0.9200 1.0800 0.9900 1.0100;
0.5000 1.1300 1.1000 0.9900 1.0100;
0.5000 0.9300 0.9500 1.0400 1.0100;
3.5000 0.9900 0.9700 0.9800 1.0100];

m,n = size(P);
x_unif = ones(n)/n; # uniform resource allocation

using Convex, SCS

# Find log-optimal investment policy


x_rob = Variable(n);
constraint = sum(x_rob) == 1;
constraint += x_rob >=0;
problem = maximize(sum(log(P*x_rob)), constraint);
solve!(problem);

# Nominal cost of x_rob

x_opt = x_rob.value;
println("Log-optimal investment strategy:")
println(x_opt)
R_opt = sum(log(P*x_opt))/m
println("Optimal long term growth rate:")
println(R_opt)
R_unif = sum(log(P*x_unif))/m
println("Long term growth associated with uniform allocation strategy:")
println(R_unif)
It was found that the log-optimal investment strategy is:

$$x^{\rm opt} = (0.0580, 0.4000, 0.2923, 0.2497, 0.0000).$$

This strategy achieves a long term growth rate $R_{\rm lt}^\star = 0.0231$. In contrast, the uniform allocation strategy achieves a growth rate of $R^{\rm unif} = 0.0114$.

Clearly asset 1 is a high-risk asset. The amount that we invest in this asset will grow by a factor of 3.50 with probability 20% and will be halved with probability 80%. On the other hand, asset 5 is an asset with a certain return of 1% per time period. Finally, assets 2, 3 and 4 are low-risk assets. It turns out that the log-optimal policy in this case is to invest very little wealth in the high-risk asset and no wealth in the sure asset and to invest most of the wealth in asset 2.
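The growth rates quoted above, and the pure-investment growth rates that the problem asks for, can be evaluated directly from $R = \sum_j \pi_j \log(p_j^T x)$ without a solver. The NumPy sketch below is an illustration: `growth_rate` is a hypothetical helper name, and `x_opt` is the reported optimal strategy rounded to four digits, not a freshly computed solution.

```python
import numpy as np

# Rows are the m = 10 equiprobable outcomes, columns the n = 5 assets.
P = np.array([[3.50, 1.11, 1.11, 1.04, 1.01],
              [0.50, 0.97, 0.98, 1.05, 1.01],
              [0.50, 0.99, 0.99, 0.99, 1.01],
              [0.50, 1.05, 1.06, 0.99, 1.01],
              [0.50, 1.16, 0.99, 1.07, 1.01],
              [0.50, 0.99, 0.99, 1.06, 1.01],
              [0.50, 0.92, 1.08, 0.99, 1.01],
              [0.50, 1.13, 1.10, 0.99, 1.01],
              [0.50, 0.93, 0.95, 1.04, 1.01],
              [3.50, 0.99, 0.97, 0.98, 1.01]])
m, n = P.shape
pi = np.full(m, 1.0 / m)   # outcome probabilities

def growth_rate(x):
    """Long term growth rate sum_j pi_j * log(p_j^T x)."""
    return float(pi @ np.log(P @ x))

x_opt = np.array([0.0580, 0.4000, 0.2923, 0.2497, 0.0000])  # reported solution
x_unif = np.ones(n) / n

R_opt = growth_rate(x_opt)
R_unif = growth_rate(x_unif)
R_pure = [growth_rate(np.eye(n)[i]) for i in range(n)]  # all wealth in asset i

print("optimal: %.4f, uniform: %.4f" % (R_opt, R_unif))
for i, R in enumerate(R_pure):
    print("pure investment in asset %d: %.4f" % (i + 1, R))
```

A pure investment in asset 5 grows at $\log 1.01$ per period, while a pure investment in asset 1 has a strongly negative growth rate despite its occasional large return.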
(b) We show the scripts to generate the random event sequences and the trajectory plots. Since
the trajectories depend on the specific random numbers generated, we show the plots for all
languages, for completeness. In MATLAB:
% Generate random event sequences
rand('state',10);
N = 10; % number of random trajectories
T = 200; % time horizon
w_opt = []; w_unif = [];
for i = 1:N
events = ceil(rand(1,T)*m);
P_event = P(events,:);
w_opt = [w_opt [1; cumprod(P_event*x_opt)]];
w_unif = [w_unif [1; cumprod(P_event*x_unif)]];
end

% Plot wealth versus time


figure
semilogy(w_opt,'g')
hold on
semilogy(w_unif,'r--')
grid
axis tight
xlabel('time')
ylabel('wealth')
This generates the following plot:

[Figure: wealth (logarithmic scale) versus time for the MATLAB trajectories.]

In Python:
import matplotlib.pyplot as plt
np.random.seed(10);
N = 10; # number of random trajectories
T = 200; # time horizon
w_opt = np.zeros((N,T+1))
w_unif = np.zeros((N,T+1))
for i in range(N):
    events = np.floor(np.random.rand(T)*m)
    events = events.astype(int)
    P_event = P[events,:]
    w_opt[i,:] = np.append(1, np.cumprod(P_event.dot(x_opt)))
    w_unif[i,:] = np.append(1, np.cumprod(P_event.dot(x_unif)))

plt.figure(figsize=(10, 6), dpi=80)
plt.xlabel('$t$')
plt.ylabel('$W(t)$')
plt.gca().set_yscale('log')
plt.plot(range(T+1), np.transpose(w_opt), color="green", linewidth=2.0)
plt.plot(range(T+1), np.transpose(w_unif), color="red", linewidth=2.0, linestyle='--')
plt.savefig('log_opt_invest.eps')
plt.show()
This generates the following plot:

[Figure: wealth W(t) (logarithmic scale) versus t for the Python trajectories.]

In Julia:
using PyPlot, Distributions

srand(10);
N = 10; # number of random trajectories
t = 200; # time horizon
w_opt = zeros(N,t+1);
w_unif = zeros(N,t+1);
for i = 1:N
events = vec(ceil(rand(1,t)*m))
events = round(Int64, events)
P_event = P[events,:];
w_opt[i, :] = [1.0 ; cumprod(P_event*x_opt)];
w_unif[i, :] = [1.0 ; cumprod(P_event*x_unif)];
end

# Plot wealth versus time


figure();
hold("true");
for i = 1:N
semilogy(1:t+1, vec(w_opt[i,:]), "g", linestyle="--");
semilogy(1:t+1, vec(w_unif[i,:]), "r");
end
axis([0, t, 0, 1000])
xlabel("time");
ylabel("wealth");

savefig("log_opt_invest.eps");
This generates the following plot:

[Figure: wealth (logarithmic scale) versus time for the Julia trajectories.]

The log-optimal investment policy consistently increases the wealth. On the other hand the
uniform allocation policy generates quite random trajectories, a few with very large increases
in wealth, and many with poor performance. This is due to the fact that with this policy 20%
of the wealth is invested in the high-risk asset.
13.6 Optimality conditions and dual for log-optimal investment problem.
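(a) Show that the optimality conditions for the log-optimal investment problem described in
exercise 4.60 can be expressed as: $\mathbf{1}^T x = 1$, $x \succeq 0$, and for each $i$,
\[
x_i > 0 \;\Longrightarrow\; \sum_{j=1}^m \pi_j \frac{p_{ij}}{p_j^T x} = 1, \qquad
x_i = 0 \;\Longrightarrow\; \sum_{j=1}^m \pi_j \frac{p_{ij}}{p_j^T x} \leq 1.
\]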

We can interpret this as follows. $p_{ij}/p_j^T x$ is a random variable, which gives the ratio of the
investment gain with asset $i$ only, to the investment gain with our mixed portfolio $x$. The
optimality condition is that, for each asset we invest in, the expected value of this ratio is one,
and for each asset we do not invest in, the expected value cannot exceed one. Very roughly
speaking, this means our portfolio does as well as any of the assets that we choose to invest
in, and cannot do worse than any assets that we do not invest in.
Hint. You can start from the simple criterion given in §4.2.3 or the KKT conditions.
(b) In this part we will derive the dual of the log-optimal investment problem. We start by writing
the problem as
\[
\begin{array}{ll}
\mbox{minimize} & -\sum_{j=1}^m \pi_j \log y_j \\
\mbox{subject to} & y = P^T x, \quad x \succeq 0, \quad \mathbf{1}^T x = 1.
\end{array}
\]
Here, $P$ has columns $p_1, \ldots, p_m$, and we have introduced new variables $y_1, \ldots, y_m$, with
the implicit constraint $y \succ 0$. We will associate dual variables $\nu$, $\lambda$ and $\nu_0$ with the constraints
$y = P^T x$, $x \succeq 0$, and $\mathbf{1}^T x = 1$, respectively. Defining $\tilde\nu_j = \nu_j/\nu_0$ for $j = 1, \ldots, m$, show that
the dual problem can be written as
\[
\begin{array}{ll}
\mbox{maximize} & \sum_{j=1}^m \pi_j \log(\tilde\nu_j/\pi_j) \\
\mbox{subject to} & P\tilde\nu \preceq \mathbf{1},
\end{array}
\]
with variable $\tilde\nu$. The objective here is the (negative) Kullback-Leibler divergence between the
given distribution $\pi$ and the dual variable $\tilde\nu$.

Solution.

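(a) The problem is the same as minimizing $f(x) = -\sum_{j=1}^m \pi_j \log(p_j^T x)$ over the probability
simplex. If $x$ is feasible the optimality condition is that $\nabla f(x)^T (z - x) \geq 0$ for all $z$ with $z \succeq 0$,
$\mathbf{1}^T z = 1$. This holds if and only if for each $k$,
\[
x_k > 0 \;\Longrightarrow\; \frac{\partial f}{\partial x_k} = \min_{i=1,\ldots,n} \frac{\partial f}{\partial x_i} = \nabla f(x)^T x.
\]
From $f(x) = -\sum_{j=1}^m \pi_j \log(p_j^T x)$, we get
\[
\nabla f(x)_i = -\sum_{j=1}^m \pi_j (1/p_j^T x)\, p_{ij},
\]
so
\[
\nabla f(x)^T x = -\sum_{j=1}^m \pi_j (1/p_j^T x)\, p_j^T x = -1.
\]
The optimality conditions are $\mathbf{1}^T x = 1$, $x \succeq 0$, and for each $i$,
\[
x_i > 0 \;\Longrightarrow\; \sum_{j=1}^m \pi_j \frac{p_{ij}}{p_j^T x} = 1, \qquad
x_i = 0 \;\Longrightarrow\; \sum_{j=1}^m \pi_j \frac{p_{ij}}{p_j^T x} \leq 1.
\]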

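(b) The Lagrangian is
\[
L(x, y, \nu, \lambda, \nu_0) = -\sum_{j=1}^m \pi_j \log y_j + \nu^T (y - P^T x) - \lambda^T x + \nu_0 (\mathbf{1}^T x - 1).
\]
This is unbounded below unless $-\nu^T P^T - \lambda^T + \nu_0 \mathbf{1}^T = 0$. Since $L$ is separable in $y$ we can
minimize it over $y$ by minimizing over $y_j$. We find the minimum is obtained for $y_j = \pi_j/\nu_j$,
so we have
\[
g(\nu, \lambda, \nu_0) = 1 - \nu_0 + \sum_{j=1}^m \pi_j \log(\nu_j/\pi_j),
\]
provided $P\nu \preceq \nu_0 \mathbf{1}$. Thus we can write the dual problem as
\[
\begin{array}{ll}
\mbox{maximize} & 1 - \nu_0 + \sum_{j=1}^m \pi_j \log(\nu_j/\pi_j) \\
\mbox{subject to} & P\nu \preceq \nu_0 \mathbf{1}
\end{array}
\]
with variables $\nu \in \mathbf{R}^m$, $\nu_0 \in \mathbf{R}$. This has implicit constraint $\nu \succ 0$.
We can further simplify this, by analytically optimizing over $\nu_0$. From the constraint inequality
we see that $\nu_0 > 0$. Defining $\tilde\nu = \nu/\nu_0$, we get the problem in variables $\tilde\nu$, $\nu_0$
\[
\begin{array}{ll}
\mbox{maximize} & 1 - \nu_0 + \sum_{j=1}^m \pi_j \log(\tilde\nu_j \nu_0/\pi_j) \\
\mbox{subject to} & \nu_0 P\tilde\nu \preceq \nu_0 \mathbf{1}.
\end{array}
\]
Cancelling $\nu_0$ from the constraint (and using $\sum_j \pi_j = 1$ in the objective) we get
\[
\begin{array}{ll}
\mbox{maximize} & 1 - \nu_0 + \log \nu_0 + \sum_{j=1}^m \pi_j \log(\tilde\nu_j/\pi_j) \\
\mbox{subject to} & P\tilde\nu \preceq \mathbf{1}.
\end{array}
\]
The optimal value of $\nu_0$ is evidently $\nu_0 = 1$, so we end up with the dual problem
\[
\begin{array}{ll}
\mbox{maximize} & \sum_{j=1}^m \pi_j \log(\tilde\nu_j/\pi_j) \\
\mbox{subject to} & P\tilde\nu \preceq \mathbf{1}.
\end{array}
\]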
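The optimality conditions in part (a) can be checked numerically on a small instance. The sketch below is only an illustration: the return data and probabilities are assumed values chosen for this example, and the multiplicative update $x_i \leftarrow x_i \sum_j \pi_j p_{ij}/(p_j^T x)$ is a standard fixed-point iteration for this problem, not a method used in the solution above. Its fixed points with $x \succ 0$ satisfy exactly the condition that every ratio equals one.

```python
# Illustrative data (assumed for this sketch): 3 assets, 4 scenarios.
# returns[j][i] is the return p_ij of asset i in scenario j; prob[j] is pi_j.
returns = [
    [2.0, 1.3, 1.0],
    [2.0, 0.5, 1.0],
    [0.5, 1.3, 1.0],
    [0.5, 0.5, 1.0],
]
prob = [1 / 3, 1 / 6, 1 / 3, 1 / 6]
n = 3


def ratios(x):
    """Return g_i = sum_j pi_j * p_ij / (p_j^T x) for each asset i."""
    y = [sum(r[i] * x[i] for i in range(n)) for r in returns]
    return [sum(pj * r[i] / yj for pj, r, yj in zip(prob, returns, y))
            for i in range(n)]


# Multiplicative update x_i <- x_i * g_i(x). It keeps x on the simplex,
# because sum_i x_i g_i(x) = sum_j pi_j = 1.
x = [1.0 / n] * n
for _ in range(5000):
    g = ratios(x)
    x = [xi * gi for xi, gi in zip(x, g)]

g = ratios(x)  # at convergence: g_i = 1 where x_i > 0, g_i <= 1 where x_i = 0
print("x =", [round(xi, 4) for xi in x])
print("ratios =", [round(gi, 6) for gi in g])
```

For this instance all three assets end up with positive weight, so all three ratios should be numerically equal to one.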
13.7 Arbitrage and theorems of alternatives. Consider an event (for example, a sports game, political
elections, the evolution of the stock market over a certain period) with $m$ possible outcomes.
Suppose that $n$ wagers on the outcome are possible. If we bet an amount $x_j$ on wager $j$, and the
outcome of the event is $i$ ($i = 1, \ldots, m$), then our return will be equal to $r_{ij} x_j$. The return $r_{ij} x_j$ is
the net gain: we pay $x_j$ initially, and receive $(1 + r_{ij}) x_j$ if the outcome of the event is $i$. We allow
the bets $x_j$ to be positive, negative, or zero. The interpretation of a negative bet is as follows. If
$x_j < 0$, then initially we receive an amount of money $|x_j|$, with an obligation to pay $(1 + r_{ij})|x_j|$ if
outcome $i$ occurs. In that case, we lose $r_{ij}|x_j|$, i.e., our net gain is $r_{ij} x_j$ (a negative number).
We call the matrix $R \in \mathbf{R}^{m \times n}$ with elements $r_{ij}$ the return matrix. A betting strategy is a vector
$x \in \mathbf{R}^n$, with as components $x_j$ the amounts we bet on each wager. If we use a betting strategy
$x$, our total return in the event of outcome $i$ is equal to $\sum_{j=1}^n r_{ij} x_j$, i.e., the $i$th component of the
vector $Rx$.

(a) The arbitrage theorem. Suppose you are given a return matrix $R$. Prove the following theorem:
there is a betting strategy $x \in \mathbf{R}^n$ for which
\[
Rx \succ 0
\]
if and only if there exists no vector $p \in \mathbf{R}^m$ that satisfies
\[
R^T p = 0, \qquad p \succeq 0, \qquad p \neq 0.
\]
We can interpret this theorem as follows. If $Rx \succ 0$, then the betting strategy $x$ guarantees a
positive return for all possible outcomes, i.e., it is a sure-win betting scheme. In economics,
we say there is an arbitrage opportunity.
If we normalize the vector $p$ in the second condition, so that $\mathbf{1}^T p = 1$, we can interpret it as
a probability vector on the outcomes. The condition $R^T p = 0$ means that
\[
\mathbf{E}\, Rx = p^T R x = 0
\]
for all $x$, i.e., the expected return is zero for all betting strategies. In economics, $p$ is called a
risk neutral probability.
We can therefore rephrase the arbitrage theorem as follows: There is no sure-win betting
strategy (or arbitrage opportunity) if and only if there is a probability vector on the outcomes
that makes all bets fair (i.e., the expected gain is zero).

Country Odds Country Odds
Holland 3.5 Czech Republic 17.0
Italy 5.0 Romania 18.0
Spain 5.5 Yugoslavia 20.0
France 6.5 Portugal 20.0
Germany 7.0 Norway 20.0
England 10.0 Denmark 33.0
Belgium 14.0 Turkey 50.0
Sweden 16.0 Slovenia 80.0

Table 1: Odds for the 2000 European soccer championships.

(b) Betting. In a simple application, we have exactly as many wagers as there are outcomes
($n = m$). Wager $i$ is to bet that the outcome will be $i$. The returns are usually expressed as
odds. For example, suppose that a bookmaker accepts bets on the result of the 2000 European
soccer championships. If the odds against Belgium winning are 14 to one, and we bet $100 on
Belgium, then we win $1400 if they win the tournament, and we lose $100 otherwise.
In general, if we have $m$ possible outcomes, and the odds against outcome $i$ are $\lambda_i$ to one,
then the return matrix $R \in \mathbf{R}^{m \times m}$ is given by
\[
r_{ij} = \left\{ \begin{array}{ll} \lambda_i & \mbox{if } j = i \\ -1 & \mbox{otherwise.} \end{array} \right.
\]
Show that there is no sure-win betting scheme (or arbitrage opportunity) if
\[
\sum_{i=1}^m \frac{1}{1 + \lambda_i} = 1.
\]
In fact, you can verify that if this equality is not satisfied, then the betting strategy
\[
x_i = \frac{1/(1 + \lambda_i)}{1 - \sum_{i=1}^m 1/(1 + \lambda_i)}
\]
always results in a profit.
The common situation in real life is
\[
\sum_{i=1}^m \frac{1}{1 + \lambda_i} > 1,
\]
because the bookmakers take a cut on all bets.
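As a quick numerical illustration of the claim in part (b), the sketch below evaluates the sum $\sum_i 1/(1+\lambda_i)$ for the odds in Table 1 and checks the return of the stated strategy under every outcome. The variable and function names are choices made for this sketch only.

```python
# Odds from Table 1 (lambda_i to one against each team).
odds = [3.5, 5.0, 5.5, 6.5, 7.0, 10.0, 14.0, 16.0,
        17.0, 18.0, 20.0, 20.0, 20.0, 33.0, 50.0, 80.0]

implied = [1.0 / (1.0 + lam) for lam in odds]
total = sum(implied)  # sum_i 1/(1 + lambda_i)

# Strategy from the exercise. When total > 1 every x_i is negative,
# i.e., we take the bookmaker's side of each wager.
x = [q / (1.0 - total) for q in implied]


def net_return(i):
    """Net gain if outcome i occurs: lambda_i * x_i minus all other stakes."""
    return odds[i] * x[i] - sum(x[j] for j in range(len(x)) if j != i)


returns = [net_return(i) for i in range(len(odds))]
print("sum of 1/(1+lambda_i):", round(total, 4))
print("smallest and largest return:", min(returns), max(returns))
```

The return is the same for every outcome because $(1+\lambda_i)x_i$ does not depend on $i$ for this choice of $x$.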

Solution.

(a) This follows directly from the theorem of alternatives for strict linear inequalities on page 8-26
of the lecture notes.
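(b) Suppose $\sum_{i=1}^m 1/(1+\lambda_i) = 1$, and take $p_i = 1/(1+\lambda_i)$. Then $p \succ 0$, and for each wager $j$,
\[
(R^T p)_j = \lambda_j p_j - \sum_{i \neq j} p_i = (1+\lambda_j) p_j - \sum_{i=1}^m p_i = 1 - 1 = 0,
\]
so $R^T p = 0$, $p \succeq 0$, $p \neq 0$. By the theorem in part (a), there is no $x$ with $Rx \succ 0$, i.e., no
sure-win betting scheme.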

(c) Using the theorem proved in part (a), we know that if there is no arbitrage opportunity, then
there exists a solution to the conditions (2), i.e., there exist $p_1$, $p_2$ satisfying
\[
p_1 \geq 0, \qquad p_2 \geq 0, \qquad p_1 + p_2 \neq 0
\]
and $R^T p = 0$. We can assume that $p_1 + p_2 = 1$, so the condition $R^T p = 0$ reduces to
\[
p_1 \frac{u - 1 - r}{1 + r} + (1 - p_1) \frac{d - 1 - r}{1 + r} = 0
\]
and
\[
p_1 \left( \frac{\max\{0, Su - K\}}{(1 + r)C} - 1 \right) + (1 - p_1) \left( \frac{\max\{0, Sd - K\}}{(1 + r)C} - 1 \right) = 0.
\]
The first equality yields
\[
p_1 = \frac{1 + r - d}{u - d}.
\]
The second equality yields
\[
C = \frac{1}{1 + r} \left( p_1 \max\{0, Su - K\} + (1 - p_1) \max\{0, Sd - K\} \right).
\]
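The two formulas above can be sanity-checked numerically. The sketch below assumes the usual one-period reading of the symbols (the stock price $S$ moves to $Su$ or $Sd$, $r$ is the risk-free rate, $K$ is the strike, $C$ is the option price); the numerical values are hypothetical and chosen only for illustration.

```python
# Hypothetical one-period data (assumed for this sketch).
S, u, d, r, K = 1.0, 2.0, 0.5, 0.05, 1.0

p1 = (1.0 + r - d) / (u - d)  # risk-neutral probability of the up move
payoff_up = max(0.0, S * u - K)
payoff_down = max(0.0, S * d - K)
C = (p1 * payoff_up + (1.0 - p1) * payoff_down) / (1.0 + r)

# Residuals of the two equations in R^T p = 0 (stock wager and option wager).
stock_eq = p1 * (u - 1 - r) / (1 + r) + (1 - p1) * (d - 1 - r) / (1 + r)
option_eq = (p1 * (payoff_up / ((1 + r) * C) - 1)
             + (1 - p1) * (payoff_down / ((1 + r) * C) - 1))
print("p1 =", p1, " C =", C)
print("residuals:", stock_eq, option_eq)
```

Both residuals should vanish up to rounding, which confirms that the computed $p_1$ and $C$ make both wagers fair.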

13.8 Log-optimal investment. We consider an instance of the log-optimal investment problem described
in exercise 4.60 of Convex Optimization. In this exercise, however, we allow $x$, the allocation vector,
to have negative components. Investing a negative amount $x_i W(t)$ in an asset is called shorting
the asset. It means you borrow the asset, sell it for $|x_i W(t)|$, and have an obligation to purchase it
back later and return it to the lender.

(a) Let $R$ be the $n \times m$-matrix with columns $r_j$:
\[
R = \left[ \begin{array}{cccc} r_1 & r_2 & \cdots & r_m \end{array} \right].
\]
We assume that the elements $r_{ij}$ of $R$ are all positive, which implies that the log-optimal
investment problem is feasible. Show the following property: if there exists a $v \in \mathbf{R}^n$ with
\[
\mathbf{1}^T v = 0, \qquad R^T v \succeq 0, \qquad R^T v \neq 0 \qquad (54)
\]
then the log-optimal investment problem is unbounded (assuming that the probabilities $p_j$ are
all positive).
(b) Derive a Lagrange dual of the log-optimal investment problem (or an equivalent problem of
your choice). Use the Lagrange dual to show that the condition in part (a) is also necessary for
unboundedness. In other words, the log-optimal investment problem is bounded if and only
if there does not exist a $v$ satisfying (54).
(c) Consider the following small example. We have four scenarios and three investment options.
The return vectors for the four scenarios are
\[
r_1 = \left[ \begin{array}{c} 2 \\ 1.3 \\ 1 \end{array} \right], \quad
r_2 = \left[ \begin{array}{c} 2 \\ 0.5 \\ 1 \end{array} \right], \quad
r_3 = \left[ \begin{array}{c} 0.5 \\ 1.3 \\ 1 \end{array} \right], \quad
r_4 = \left[ \begin{array}{c} 0.5 \\ 0.5 \\ 1 \end{array} \right].
\]
The probabilities of the four scenarios are
\[
p_1 = 1/3, \quad p_2 = 1/6, \quad p_3 = 1/3, \quad p_4 = 1/6.
\]
The interpretation is as follows. We can invest in two stocks. The first stock doubles in value
in each period with a probability 1/2, or decreases by 50% with a probability 1/2. The second
stock either increases by 30% with a probability 2/3, or decreases by 50% with a probability
1/3. The fluctuations in the two stocks are independent, so we have four scenarios: both stocks
go up (probability 1/3), stock 1 goes up and stock 2 goes down (probability 1/6), stock 1 goes
down and stock 2 goes up (probability 1/3), both stocks go down (probability 1/6). The
fractions of our capital we invest in stocks 1 and 2 are denoted by $x_1$ and $x_2$, respectively.
The rest of our capital, $x_3 = 1 - x_1 - x_2$, is not invested.
What is the expected growth rate of the log-optimal strategy $x$? Compare with the strategies
$(x_1, x_2, x_3) = (1, 0, 0)$, $(x_1, x_2, x_3) = (0, 1, 0)$ and $(x_1, x_2, x_3) = (1/2, 1/2, 0)$. (Obviously the
expected growth rate for $(x_1, x_2, x_3) = (0, 0, 1)$ is zero.)
Remark. The figure below shows a simulation that compares three investment strategies over
200 periods. The solid line shows the log-optimal investment strategy. The dashed lines show
the growth for strategies $x = (1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$.
[Figure: wealth W versus period t on a logarithmic scale, for t from 0 to 250.]

Solution.
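(a) The objective function grows unboundedly along the line defined as $x + tv$, $t \geq 0$. To see this,
let $x$ be feasible ($\mathbf{1}^T x = 1$, $R^T x \succ 0$) and let $v$ satisfy (54). Then $\mathbf{1}^T (x + tv) = 1$ and
$R^T (x + tv) \succeq R^T x \succ 0$ for all $t \geq 0$, so $x + tv$ is feasible. Every term of the objective
$\sum_{j=1}^m p_j \log(r_j^T x + t r_j^T v)$ is nondecreasing in $t$, and at least one term has $r_j^T v > 0$ and
$p_j > 0$, so the objective tends to infinity as $t \to \infty$.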
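(b) We introduce new variables $y$, and write the problem as
\[
\begin{array}{ll}
\mbox{minimize} & -\sum_{j=1}^m p_j \log y_j \\
\mbox{subject to} & y = R^T x \\
& \mathbf{1}^T x = 1.
\end{array}
\]
The Lagrangian is
\[
L(x, y, \nu, \lambda) = -\sum_{j=1}^m p_j \log y_j + \nu^T (y - R^T x) + \lambda (\mathbf{1}^T x - 1),
\]
which is bounded below as a function of $x$ only if $R\nu = \lambda \mathbf{1}$. As a function of $y$, $L$ reaches its
minimum if $y_j = p_j/\nu_j$. We obtain the dual function
\[
g(\nu, \lambda) = -\sum_{j=1}^m p_j \log(p_j/\nu_j) + 1 - \lambda
\]
if $\nu \succ 0$ and $R\nu = \lambda \mathbf{1}$. The dual problem is
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{j=1}^m p_j \log(p_j/\nu_j) - 1 + \lambda \\
\mbox{subject to} & R\nu = \lambda \mathbf{1}.
\end{array}
\]
The variables are $\nu \in \mathbf{R}^m$ and $\lambda \in \mathbf{R}$.
To prove the condition for unboundedness we can use the following theorem of alternatives.
There exists a solution $x \succ 0$ satisfying $Ax = b$ if and only if there does not exist a $v$ with
$b^T v \leq 0$, $A^T v \succeq 0$, $A^T v \neq 0$. We can prove this as in lecture 8, or the reader on page 130. The
dual function of the feasibility problem
\[
x \succ 0, \qquad Ax = b
\]
is
\[
g(\lambda, \nu) = \inf_x \left( -\lambda^T x + \nu^T (Ax - b) \right)
= \left\{ \begin{array}{ll} -b^T \nu & \mbox{if } A^T \nu = \lambda \\ -\infty & \mbox{otherwise.} \end{array} \right.
\]
Therefore the alternative system is
\[
A^T \nu \succeq 0, \qquad A^T \nu \neq 0, \qquad b^T \nu \leq 0.
\]
Now we apply this result to the system (54). If (54) is infeasible, then
\[
\mathbf{1}^T v \leq 0, \qquad R^T v \succeq 0, \qquad R^T v \neq 0
\]
is infeasible. (This follows from the fact that all elements of $R$ are positive: if $v$ satisfied this
system with $\mathbf{1}^T v < 0$, then $v + t\mathbf{1}$ with $t = -\mathbf{1}^T v/n > 0$ would satisfy (54).) By the theorem
of alternatives (with $A = R$ and $b = \mathbf{1}$), there exists a $y \succ 0$ such that
\[
Ry = \mathbf{1},
\]
which means $\nu = y$, $\lambda = 1$ is feasible in the dual problem. Therefore the primal problem is
bounded above.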

(c) The following Matlab code solves the problem with Newton's method and a backtracking
line search. Here R contains the net returns of the two stocks, and x = (x1, x2).
R = [1 1 -1/2 -1/2;
     0.3 -0.5 0.3 -0.5];
p = [1/3; 1/6; 1/3; 1/6];
x = [0;0];
alpha = 0.2;
beta = 0.5;

for i=1:100
    val = -p'*log(1+R'*x);
    grad = -R*(p./(1+R'*x));
    hess = R*diag(p./(1+R'*x).^2)*R';
    dx = -hess\grad;
    fprime = grad'*dx;
    if (sqrt(-fprime) <= 1e-5), break; end;
    t=1;
    for k=1:50
        newx = x+t*dx;
        newval = -p'*log(1+R'*newx);
        % accept the step once it is feasible and gives sufficient decrease
        if ((min(1+R'*newx) > 0) & (newval <= val + t*alpha*fprime))
            break;
        else
            t = beta*t;
        end;
    end;
    x = x + t*dx;
end;
The optimal solution is

x1 = 0.4973, x2 = 0.1994, x3 = 0.3034.

The expected growth rate is 0.0623.
The expected growth rate for x = (1, 0, 0) is 0. The expected growth rate for x = (0, 1, 0) is
-0.0561. The expected growth rate for x = (1/2, 1/2, 0) is 0.0535.
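The growth rates quoted above can be reproduced directly from the scenario data of part (c). The following sketch simply evaluates $\sum_j p_j \log(r_j^T x)$ for each strategy; the function and dictionary names are choices made for this sketch, and the log-optimal allocation is the rounded value reported above rather than a freshly computed optimum.

```python
import math

# Scenario returns (stock 1, stock 2, uninvested cash) and probabilities from part (c).
scenarios = [(2.0, 1.3, 1.0), (2.0, 0.5, 1.0), (0.5, 1.3, 1.0), (0.5, 0.5, 1.0)]
prob = [1 / 3, 1 / 6, 1 / 3, 1 / 6]


def growth_rate(x):
    """Expected log growth sum_j p_j * log(r_j^T x) of allocation x."""
    return sum(p * math.log(sum(ri * xi for ri, xi in zip(r, x)))
               for p, r in zip(prob, scenarios))


strategies = {
    "log-optimal": (0.4973, 0.1994, 0.3034),  # rounded solution reported above
    "stock 1 only": (1.0, 0.0, 0.0),
    "stock 2 only": (0.0, 1.0, 0.0),
    "half and half": (0.5, 0.5, 0.0),
}
rates = {name: growth_rate(x) for name, x in strategies.items()}
for name, rate in rates.items():
    print(f"{name:14s} {rate: .4f}")
```

Note that investing everything in stock 2 has a negative expected growth rate, even though its expected one-period return exceeds one.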

13.9 Maximizing house profit in a gamble and imputed probabilities. A set of $n$ participants bet on
which one of $m$ outcomes, labeled $1, \ldots, m$, will occur. Participant $i$ offers to purchase up to $q_i > 0$
gambling contracts, at price $p_i > 0$, that the true outcome will be in the set $S_i \subset \{1, \ldots, m\}$. The
house then sells her $x_i$ contracts, with $0 \leq x_i \leq q_i$. If the true outcome $j$ is in $S_i$, then participant
$i$ receives $1 per contract, i.e., $x_i$. Otherwise, she loses, and receives nothing. The house collects a
total of $x_1 p_1 + \cdots + x_n p_n$, and pays out an amount that depends on the outcome $j$,
\[
\sum_{i:\, j \in S_i} x_i.
\]
The difference is the house profit.

(a) Optimal house strategy. How should the house decide on $x$ so that its worst-case profit (over the
possible outcomes) is maximized? (The house determines $x$ after examining all the participant
offers.)
(b) Imputed probabilities. Suppose $x^\star$ maximizes the worst-case house profit. Show that there
exists a probability distribution $\pi$ on the possible outcomes (i.e., $\pi \in \mathbf{R}^m_+$, $\mathbf{1}^T \pi = 1$) for which
$x^\star$ also maximizes the expected house profit. Explain how to find $\pi$.
Hint. Formulate the problem in part (a) as an LP; you can construct $\pi$ from optimal dual
variables for this LP.
Remark. Given $\pi$, the fair price for offer $i$ is $p_i^{\mathrm{fair}} = \sum_{j \in S_i} \pi_j$. All offers with $p_i > p_i^{\mathrm{fair}}$ will
be completely filled (i.e., $x_i = q_i$); all offers with $p_i < p_i^{\mathrm{fair}}$ will be rejected (i.e., $x_i = 0$).
Remark. This exercise shows how the probabilities of outcomes (e.g., elections) can be guessed
from the offers of a set of gamblers.
(c) Numerical example. Carry out your method on the simple example below with $n = 5$ participants,
$m = 5$ possible outcomes, and participant offers

Participant i    p_i     q_i    S_i
1                0.50    10     {1,2}
2                0.60    5      {4}
3                0.60    5      {1,4,5}
4                0.60    20     {2,5}
5                0.20    10     {3}

Compare the optimal worst-case house profit with the worst-case house profit, if all offers were
accepted (i.e., $x_i = q_i$). Find the imputed probabilities.

Solution.

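(a) The worst-case house profit is
\[
p^T x - \max_{j=1,\ldots,m} \sum_{i:\, j \in S_i} x_i,
\]
which is a piecewise-linear concave function of $x$. To find the $x$ that maximizes the worst-case
profit, we solve the problem,
\[
\begin{array}{ll}
\mbox{maximize} & p^T x - \max_{j=1,\ldots,m} a_j^T x \\
\mbox{subject to} & 0 \preceq x \preceq q,
\end{array}
\]
with variable $x$. Here $a_j^T$ are the rows of the subset matrix $A$, with
\[
A_{ji} = \left\{ \begin{array}{ll} 1 & j \in S_i \\ 0 & \mbox{otherwise.} \end{array} \right.
\]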

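(b) The problem from part (a) can be expressed as
\[
\begin{array}{ll}
\mbox{maximize} & p^T x - t \\
\mbox{subject to} & t\mathbf{1} \succeq Ax \\
& 0 \preceq x \preceq q,
\end{array} \qquad (55)
\]
where $t$ is a new scalar variable. The Lagrangian (of the equivalent problem of minimizing $t - p^T x$) is
\[
L(x, t, \lambda_1, \lambda_2, \lambda_3) = t - p^T x + \lambda_1^T (Ax - t\mathbf{1}) - \lambda_2^T x + \lambda_3^T (x - q).
\]
This is bounded below if and only if $\mathbf{1}^T \lambda_1 = 1$, and $A^T \lambda_1 - \lambda_2 + \lambda_3 = p$. The dual can be
written as
\[
\begin{array}{ll}
\mbox{maximize} & -q^T \lambda_3 \\
\mbox{subject to} & \mathbf{1}^T \lambda_1 = 1 \\
& A^T \lambda_1 - \lambda_2 + \lambda_3 = p \\
& \lambda_1 \succeq 0, \quad \lambda_2 \succeq 0, \quad \lambda_3 \succeq 0,
\end{array} \qquad (56)
\]
with variables $\lambda_1$, $\lambda_2$, and $\lambda_3$. Notice that $\lambda_1$ must satisfy $\mathbf{1}^T \lambda_1 = 1$, and $\lambda_1 \succeq 0$, hence it is a
probability distribution.
Suppose $x^\star_{\mathrm{wc}}$, $t^\star$, $\lambda_1^\star$, $\lambda_2^\star$, and $\lambda_3^\star$ are primal and dual optimal for problem (55), and let us set
$\pi = \lambda_1^\star$. To maximize the expected house profit we solve the problem,
\[
\begin{array}{ll}
\mbox{maximize} & p^T x - \pi^T A x \\
\mbox{subject to} & 0 \preceq x \preceq q.
\end{array} \qquad (57)
\]
Let $b_1^T, \ldots, b_n^T$ be the rows of $A^T$. We know that a point $x^\star_{\mathrm{e}}$ is optimal for problem (57) if
and only if $x^\star_{\mathrm{e},i} = q_i$ when $p_i - b_i^T \pi > 0$, $x^\star_{\mathrm{e},i} = 0$ when $p_i - b_i^T \pi < 0$, and $0 \leq x^\star_{\mathrm{e},i} \leq q_i$ when
$p_i - b_i^T \pi = 0$.
To see why $x^\star_{\mathrm{e}} = x^\star_{\mathrm{wc}}$, let us take a look at one of the KKT conditions for problem (55). This
can be written as
\[
p - A^T \pi = \lambda_3^\star - \lambda_2^\star,
\]
with $\pi = \lambda_1^\star$. If $p_i - b_i^T \pi > 0$, then we must have $\lambda_{3i}^\star - \lambda_{2i}^\star > 0$, which means that $\lambda_{2i}^\star = 0$
and $\lambda_{3i}^\star > 0$ (by complementary slackness), and so $x^\star_{\mathrm{wc},i} = q_i$. Similarly, if $p_i - b_i^T \pi < 0$,
then $\lambda_{3i}^\star - \lambda_{2i}^\star < 0$, which means that $\lambda_{2i}^\star > 0$ and $\lambda_{3i}^\star = 0$, and so $x^\star_{\mathrm{wc},i} = 0$. Finally, when
$p_i - b_i^T \pi = 0$, we must have $\lambda_{2i}^\star = 0$ and $\lambda_{3i}^\star = 0$, and so $0 \leq x^\star_{\mathrm{wc},i} \leq q_i$.
In summary, in order to find a probability distribution on the possible outcomes for which the
same $x^\star$ maximizes both the worst-case as well as the expected house profit, we solve the dual
LP (56), and set $\pi = \lambda_1^\star$.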
(c) The following Matlab code solves the problem.
% solution for gambling problem
A = [1 0 1 0 0;
     1 0 0 1 0;
     0 0 0 0 1;
     0 1 1 0 0;
     0 0 1 1 0];

p = [0.5; 0.6; 0.6; 0.6; 0.2];
q = [10; 5; 5; 20; 10];

n = 5; m = 5;

cvx_begin
    variables x(n) t
    dual variable lambda1
    maximize (p'*x-t)
    subject to
        lambda1: A*x <= t
        x >= 0
        x <= q
cvx_end

% optimal worst case house profit
pwc = cvx_optval
% worst case profit if all offers are accepted
pwc_accept = p'*q-max(A*q)
% imputed probabilities
pi = lambda1
% fair prices
pfair = A'*pi
% optimal purchase quantities
xopt = x
The following Python code solves the problem.
# Solution for gambling problem
import cvxpy as cvx
import numpy as np

A = np.array([[1, 0, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [0, 0, 0, 0, 1],
              [0, 1, 1, 0, 0],
              [0, 0, 1, 1, 0]])
p = np.array([0.5, 0.6, 0.6, 0.6, 0.2])
q = np.array([10, 5, 5, 20, 10])

n = 5
m = 5

x, t = cvx.Variable(n), cvx.Variable()
obj = p @ x - t
cons = [A @ x <= t]
cons += [x >= 0]
cons += [x <= q]

p_star = cvx.Problem(cvx.Maximize(obj), cons).solve()

lambda1 = cons[0].dual_value

# optimal worst case house profit
pwc = p_star
# worst case profit if all offers are accepted
pwc_accept = p @ q - max(A @ q)
# imputed probabilities
pi = lambda1
# fair prices
pfair = A.T @ pi
# optimal purchase quantities
xopt = x.value
The following Julia code solves the problem.
# solution for gambling problem
A = [1 0 1 0 0;
     1 0 0 1 0;
     0 0 0 0 1;
     0 1 1 0 0;
     0 0 1 1 0];

p = [0.5; 0.6; 0.6; 0.6; 0.2];
q = [10; 5; 5; 20; 10];

n = 5; m = 5;

x = Variable(n);
t = Variable();
constraints = [A*x <= t, x >= 0, x <= q];
problem = maximize(p'*x - t, constraints);
solve!(problem)

# optimal worst case house profit
pwc = problem.optval
# worst case profit if all offers are accepted
pwc_accept = p'*q - maximum(A*q)
# imputed probabilities
pi = constraints[1].dual
# fair prices
pfair = A'*pi
# optimal purchase quantities
xopt = x.value
We find that the optimal worst-case house profit is 3.5, and the worst-case house profit, if all
offers are accepted, is -5. The imputed probabilities are
\[
\pi = (0.1145, 0.3855, 0.0945, 0.1910, 0.2145)
\]
in Matlab,
\[
\pi = (0.0970, 0.4040, 0.1316, 0.1715, 0.1970)
\]
in Python, and
\[
\pi = (0.0567, 0.4433, 0.0784, 0.2649, 0.1567)
\]
in Julia. Note that although these are the imputed probabilities that we found, the probabilities
are not unique. The associated fair prices (computed from the Matlab probabilities) and optimal
contract numbers are shown below.

Participant i    p_i     p_i^fair    q_i    x_i    S_i
1                0.50    0.5000      10     5      {1,2}
2                0.60    0.1910      5      5      {4}
3                0.60    0.5200      5      5      {1,4,5}
4                0.60    0.6000      20     5      {2,5}
5                0.20    0.0945      10     10     {3}
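The numbers in this table can be cross-checked without a solver. The sketch below, in plain Python, evaluates the worst-case profit for the reported $x$ and for $x = q$, recomputes the fair prices from the Matlab probabilities, and evaluates the expected profit under those probabilities. Function and variable names are choices made for this sketch.

```python
# Offer data from part (c).
p = [0.50, 0.60, 0.60, 0.60, 0.20]
q = [10, 5, 5, 20, 10]
S = [{1, 2}, {4}, {1, 4, 5}, {2, 5}, {3}]
m = 5


def worst_case_profit(x):
    """House income p^T x minus the largest payout over the m outcomes."""
    income = sum(pi * xi for pi, xi in zip(p, x))
    payout = max(sum(xi for xi, Si in zip(x, S) if j in Si)
                 for j in range(1, m + 1))
    return income - payout


x_opt = [5, 5, 5, 5, 10]                         # reported optimal quantities
prob = [0.1145, 0.3855, 0.0945, 0.1910, 0.2145]  # Matlab imputed probabilities

profit_opt = worst_case_profit(x_opt)
profit_all = worst_case_profit(q)
p_fair = [sum(prob[j - 1] for j in Si) for Si in S]
expected_profit = sum((pi - pf) * xi for pi, pf, xi in zip(p, p_fair, x_opt))

print("worst-case profit, optimal x:", profit_opt)
print("worst-case profit, all offers accepted:", profit_all)
print("fair prices:", [round(v, 4) for v in p_fair])
print("expected profit under imputed probabilities:", round(expected_profit, 4))
```

The expected profit under the imputed probabilities coincides with the optimal worst-case profit, as strong duality for the LP (55) suggests.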

13.10 Optimal investment to fund an expense stream. An organization (such as a municipality) knows
its operating expenses over the next $T$ periods, denoted $E_1, \ldots, E_T$. (Normally these are positive;
but we can have negative $E_t$, which corresponds to income.) These expenses will be funded by a
combination of investment income, from a mixture of bonds purchased at $t = 0$, and a cash account.
The bonds generate investment income, denoted $I_1, \ldots, I_T$. The cash balance is denoted $B_0, \ldots, B_T$,
where $B_0 \geq 0$ is the amount of the initial deposit into the cash account. We can have $B_t < 0$ for
$t = 1, \ldots, T$, which represents borrowing.
After paying for the expenses using investment income and cash, in period $t$, we are left with
$B_t - E_t + I_t$ in cash. If this amount is positive, it earns interest at the rate $r_+ > 0$; if it is negative,
we must pay interest at rate $r_-$, where $r_- \geq r_+$. Thus the expenses, investment income, and cash
balances are linked as follows:
\[
B_{t+1} = \left\{ \begin{array}{ll}
(1 + r_+)(B_t - E_t + I_t) & B_t - E_t + I_t \geq 0 \\
(1 + r_-)(B_t - E_t + I_t) & B_t - E_t + I_t < 0,
\end{array} \right.
\]
for $t = 1, \ldots, T - 1$. We take $B_1 = (1 + r_+) B_0$, and we require that $B_T - E_T + I_T = 0$, which
means the final cash balance, plus income, exactly covers the final expense.
The initial investment will be a mixture of bonds, labeled $1, \ldots, n$. Bond $i$ has a price $P_i > 0$,
a coupon payment $C_i > 0$, and a maturity $M_i \in \{1, \ldots, T\}$. Bond $i$ generates an income stream
given by
\[
a_t^{(i)} = \left\{ \begin{array}{ll}
C_i & t < M_i \\
C_i + 1 & t = M_i \\
0 & t > M_i,
\end{array} \right.
\]
for $t = 1, \ldots, T$. If $x_i$ is the number of units of bond $i$ purchased (at $t = 0$), the total investment
cash flow is
\[
I_t = x_1 a_t^{(1)} + \cdots + x_n a_t^{(n)}, \quad t = 1, \ldots, T.
\]
We will require $x_i \geq 0$. (The $x_i$ can be fractional; they do not need to be integers.)
The total initial investment required to purchase the bonds, and fund the initial cash balance at
$t = 0$, is $x_1 P_1 + \cdots + x_n P_n + B_0$.

(a) Explain how to choose x and B0 to minimize the total initial investment required to fund the
expense stream.

(b) Solve the problem instance given in opt_funding_data.m. Give optimal values of x and B0 .
Give the optimal total initial investment, and compare it to the initial investment required if
no bonds were purchased (which would mean that all the expenses were funded from the cash
account). Plot the cash balance (versus period) with optimal bond investment, and with no
bond investment.
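Before formulating the optimization problem, it can help to simulate the cash balance recursion for a fixed choice of $x$ and $B_0$. The sketch below does this in plain Python. The data (three periods, one bond, the interest rates) are hypothetical values chosen for illustration and are unrelated to opt_funding_data.m; the function names are also choices made for this sketch.

```python
def bond_income(C, M, T):
    """Income stream a_t, t = 1..T, of one bond with coupon C and maturity M."""
    return [C if t < M else (C + 1.0 if t == M else 0.0)
            for t in range(1, T + 1)]


def cash_balances(B0, x, bonds, E, r_plus, r_minus):
    """Return ([B_0, ..., B_T], final residual B_T - E_T + I_T)."""
    T = len(E)
    streams = [bond_income(C, M, T) for (C, M) in bonds]
    I = [sum(xi * s[t] for xi, s in zip(x, streams)) for t in range(T)]
    B = [B0, (1.0 + r_plus) * B0]      # B_1 = (1 + r_+) B_0
    for t in range(1, T):              # t = 1, ..., T-1
        net = B[t] - E[t - 1] + I[t - 1]
        rate = r_plus if net >= 0 else r_minus
        B.append((1.0 + rate) * net)
    return B, B[T] - E[T - 1] + I[T - 1]


# Hypothetical instance: T = 3, one bond with coupon 0.1 maturing at t = 2.
E = [1.0, 1.0, 1.0]
bonds = [(0.1, 2)]
r_plus, r_minus = 0.02, 0.08

B, residual = cash_balances(2.0, [0.5], bonds, E, r_plus, r_minus)
print("balances:", B)
print("final residual B_T - E_T + I_T:", residual)
```

A negative residual means this particular $(x, B_0)$ does not fund the expense stream; the optimization problem looks for the cheapest choice that makes the residual exactly zero.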

Solution.
(a) We will give several solutions (which are close, but not the same).

Method 1. The cash balance propagation equations,
\[
B_{t+1} = \left\{ \begin{array}{ll}
(1 + r_+)(B_t - E_t + I_t) & B_t - E_t + I_t \geq 0 \\
(1 + r_-)(B_t - E_t + I_t) & B_t - E_t + I_t < 0,
\end{array} \right.
\]
which can be expressed as
\[
B_{t+1} = \min \{ (1 + r_+)(B_t - E_t + I_t), \; (1 + r_-)(B_t - E_t + I_t) \},
\]
are clearly not convex when $B_t$ are considered as variables. We will describe two ways to
handle them.
In the first method, we relax the propagation equations to the convex inequalities
\[
B_{t+1} \leq \min \{ (1 + r_+)(B_t - E_t + I_t), \; (1 + r_-)(B_t - E_t + I_t) \}.
\]
With this relaxation, the problem can be expressed as:
\[
\begin{array}{ll}
\mbox{minimize} & B_0 + \sum_{i=1}^n P_i x_i \\
\mbox{subject to} & \sum_{i=1}^n a_t^{(i)} x_i = I_t, \quad t = 1, \ldots, T \\
& x \succeq 0, \quad B_0 \geq 0 \\
& B_1 = (1 + r_+) B_0 \\
& B_{t+1} \leq \min \{ (1 + r_+)(B_t - E_t + I_t), \; (1 + r_-)(B_t - E_t + I_t) \}, \quad t = 1, \ldots, T - 1 \\
& B_T - E_T + I_T = 0,
\end{array}
\]
with variables $B_0, \ldots, B_T$, $I_1, \ldots, I_T$, $x_1, \ldots, x_n$. (We could treat $I_t$ as an affine expression,
instead of a variable; it makes no difference.)
We will now argue that any solution of this problem must satisfy
\[
B_{t+1} = \min \{ (1 + r_+)(B_t - E_t + I_t), \; (1 + r_-)(B_t - E_t + I_t) \}, \quad t = 1, \ldots, T - 1,
\]
which means that by solving the LP above, we are also solving the original problem. Suppose
this is not the case; suppose, for example, that $\tau$ is the first period for which
\[
B_{\tau+1} < \min \{ (1 + r_+)(B_\tau - E_\tau + I_\tau), \; (1 + r_-)(B_\tau - E_\tau + I_\tau) \}.
\]
Of course, this is kind of silly: it means we have elected to throw away some cash between
periods $\tau$ and $\tau + 1$. To give a formal argument that this cannot happen, we first
note that $B_{t+1}$ is monotonically nondecreasing in $B_t$ for all $t$, since each term within the
minimization is monotonically increasing in $B_t$. Thus, we can decrease $B_\tau$ until $B_{\tau+1} =
\min \{ (1 + r_+)(B_\tau - E_\tau + I_\tau), \; (1 + r_-)(B_\tau - E_\tau + I_\tau) \}$, and still maintain feasibility of the
cash stream. By induction, we can decrease $B_{\tau-1}, B_{\tau-2}, \ldots, B_0$ and still have a feasible
solution. This new stream is feasible, and reduces $B_0$, which shows that the original point was
not optimal.

Method 2. The second method carries out a different relaxation. In this method we do
not consider $B_1, \ldots, B_T$ as variables; instead we think of them as functions of $x$ and $B_0$,
defined by a recursion. We will show that $B_T(B_0, x)$ is a concave function of $B_0$ and $x$.
$B_1(B_0, x) = (1 + r_+) B_0$ is a concave (affine) function. If $B_t(B_0, x)$ is a concave function,
then
\[
B_{t+1}(B_0, x) = \min \{ (1 + r_+)(B_t(B_0, x) - E_t + I_t(x)), \;
(1 + r_-)(B_t(B_0, x) - E_t + I_t(x)) \}, \quad t = 1, \ldots, T - 1
\]
is also a concave function because it is the pointwise minimum of two concave functions.
Therefore, we can conclude that $B_T(B_0, x)$ is a concave function.
The final cash balance constraint $B_T(B_0, x) - E_T + I_T(x) = 0$ is clearly not convex because
$B_T(B_0, x)$ is a concave function. To handle this equation, we relax it to the convex inequality
\[
B_T(B_0, x) - E_T + I_T(x) \geq 0.
\]
With this relaxation, the problem can be expressed as:
\[
\begin{array}{ll}
\mbox{minimize} & B_0 + \sum_{i=1}^n P_i x_i \\
\mbox{subject to} & x \succeq 0, \quad B_0 \geq 0 \\
& B_T(B_0, x) - E_T + I_T(x) \geq 0,
\end{array}
\]
with variables $B_0$ and $x_1, \ldots, x_n$. Here $B_T(B_0, x)$ and $I_T(x)$ are defined by the following
recursive definitions:
\[
\begin{array}{rcl}
B_1(B_0, x) & = & (1 + r_+) B_0 \\
B_{t+1}(B_0, x) & = & \min \{ (1 + r_+)(B_t(B_0, x) - E_t + I_t(x)), \;
(1 + r_-)(B_t(B_0, x) - E_t + I_t(x)) \}, \quad t = 1, \ldots, T - 1 \\
I_t(x) & = & \sum_{i=1}^n a_t^{(i)} x_i, \quad t = 1, \ldots, T.
\end{array}
\]
We can argue that any solution of this problem must satisfy
\[
B_T(B_0, x) - E_T + I_T(x) = 0,
\]
which is very similar to the argument given above (for the inequality relaxation).

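Method 3. Here is yet another solution to this problem. The cash balance propagation
equations,
\[
B_{t+1} = \left\{ \begin{array}{ll}
(1 + r_+)(B_t - E_t + I_t) & B_t - E_t + I_t \geq 0 \\
(1 + r_-)(B_t - E_t + I_t) & B_t - E_t + I_t < 0,
\end{array} \right.
\]
are equivalent to LP constraints
\[
\begin{array}{l}
B_{t+1} = (1 + r_+) S_t^+ - (1 + r_-) S_t^-, \\
B_t - E_t + I_t = S_t^+ - S_t^-, \\
S_t^+ \geq 0, \\
S_t^- \geq 0.
\end{array}
\]
The equivalence can be shown easily: Suppose $B_t - E_t + I_t$ is positive, then $S_t^+ = B_t - E_t + I_t$
and $S_t^- = 0$, so $B_{t+1} = (1 + r_+)(B_t - E_t + I_t)$. If $B_t - E_t + I_t$ is negative, then $S_t^+ = 0$ and
$S_t^- = -(B_t - E_t + I_t)$, so $B_{t+1} = (1 + r_-)(B_t - E_t + I_t)$.
Therefore, the problem can be expressed as an LP:
\[
\begin{array}{ll}
\mbox{minimize} & B_0 + \sum_{i=1}^n P_i x_i \\
\mbox{subject to} & \sum_{i=1}^n a_t^{(i)} x_i = I_t, \quad t = 1, \ldots, T \\
& x \succeq 0, \quad B_0 \geq 0 \\
& B_1 = (1 + r_+) B_0 \\
& B_T - E_T + I_T = 0 \\
& B_{t+1} = (1 + r_+) S_t^+ - (1 + r_-) S_t^-, \quad t = 1, \ldots, T - 1 \\
& B_t - E_t + I_t = S_t^+ - S_t^-, \quad t = 1, \ldots, T - 1 \\
& S_t^+ \geq 0, \quad t = 1, \ldots, T - 1 \\
& S_t^- \geq 0, \quad t = 1, \ldots, T - 1,
\end{array}
\]
with variables $B_0, \ldots, B_T$, $I_1, \ldots, I_T$, $x_1, \ldots, x_n$, $S_1^+, \ldots, S_{T-1}^+$ and $S_1^-, \ldots, S_{T-1}^-$. With this
approach, we have to argue that for any $t$, we will either have $S_t^+$ or $S_t^-$ zero. (This is easily
seen: reducing $S_t^+$ and $S_t^-$ by the same amount keeps their difference unchanged and does not
decrease $B_{t+1}$, since $r_- \geq r_+$.)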
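The role of the condition that one of $S_t^+$, $S_t^-$ is zero can be illustrated numerically. The sketch below compares the min form of the propagation equation with the split form, first with a complementary split and then with both parts inflated by the same amount. The interest rates, the test values, and the `slack` parameter are hypothetical and introduced only for this sketch.

```python
r_plus, r_minus = 0.02, 0.08  # hypothetical rates with r_minus >= r_plus


def next_balance_min(net):
    """B_{t+1} from the min form, where net = B_t - E_t + I_t."""
    return min((1 + r_plus) * net, (1 + r_minus) * net)


def next_balance_split(net, slack=0.0):
    """B_{t+1} from the split net = S_plus - S_minus.

    slack = 0 gives the complementary split; slack > 0 makes both parts positive.
    """
    s_plus = max(net, 0.0) + slack
    s_minus = max(-net, 0.0) + slack
    return (1 + r_plus) * s_plus - (1 + r_minus) * s_minus


nets = [-3.0, -0.5, 0.0, 0.5, 3.0]
reference = [next_balance_min(v) for v in nets]
complementary = [next_balance_split(v) for v in nets]
wasteful = [next_balance_split(v, slack=1.0) for v in nets]

for v, a, b, c in zip(nets, reference, complementary, wasteful):
    print(f"net={v: .1f}  min form={a: .4f}  split={b: .4f}  split+slack={c: .4f}")
```

With a common slack of 1, the next balance drops by $r_- - r_+$ in every case, which is the same as throwing cash away; this is why an optimal point of the LP has no incentive to keep both parts positive when $r_- > r_+$.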
(b) The following CVX code solves the problem using the relaxation in method 1.
% Optimal investment to fund an expense stream
clear all; close all;
opt_funding_data;

cvx_begin
    variables x(n) B0 B(T) I(T)
    minimize( P'*x+B0 )
    subject to
        I == A*x
        x >= 0 % use x == 0 for no bond purchase
        B0 >= 0
        B(1) <= (1+rp)*B0
        for t = 1:T-1
            B(t+1) <= min( (1+rp)*(B(t)-E(t)+I(t)), (1+rn)*(B(t)-E(t)+I(t)) )
        end
        B(T)-E(T)+I(T) == 0
cvx_end

% optimal total initial investment
cvx_optval

% associated optimal cash balance plot
bar(0:T,[B0;B]);
ylabel('cash balance'); xlabel('t');
The code (CVX part) with recursive definition of $B_t(B_0, x)$ is given below.
cvx_begin
    variables x(n) B0
    minimize( P'*x+B0 )
    subject to
        x >= 0 % use x == 0 for no bond purchase
        B0 >= 0
        B(1) = (1+rp)*B0
        for t = 1:T-1
            B(t+1)=min((1+rp)*(B(t)-E(t)+A(t,:)*x),(1+rn)*(B(t)-E(t)+A(t,:)*x));
        end
        B(T)-E(T)+A(T,:)*x >= 0
cvx_end
The code (CVX part) with LP formulation is given below.
cvx_begin
    variables x(n) B0 B(T) I(T) Sp(T-1) Sn(T-1)
    minimize( P'*x+B0 )
    subject to
        I == A*x
        x >= 0 % use x == 0 for no bond purchase
        B0 >= 0
        B(1) == (1+rp)*B0
        Sp >= 0
        Sn >= 0
        for t = 1:T-1
            B(t)-E(t)+I(t) == Sp(t) - Sn(t);
            B(t+1) == (1+rp)*Sp(t)-(1+rn)*Sn(t)
        end
        B(T)-E(T)+I(T) == 0
cvx_end
The optimal bond investment is

x = (0.0000, 18.9326, 0.0000, 0.0000, 13.8493, 8.9228),

and the optimal initial deposit to the cash account is B0 = 0. The optimal total initial
investment is 40.7495.
With no bond investment (i.e., x = 0), an initial deposit of B0 = 41.7902 in cash is required,
around 2.5% more than when we invest in bonds.

[Figure 13: Cash balance with optimal bond purchase (cash balance versus t, for t from 0 to 12).]

[Figure 14: Cash balance with no bond purchase (cash balance versus t, for t from 0 to 12).]

13.11 Planning production with uncertain demand. You must order (nonnegative) amounts $r_1, \ldots, r_m$ of
raw materials, which are needed to manufacture (nonnegative) quantities $q_1, \ldots, q_n$ of $n$ different
products. To manufacture one unit of product $j$ requires at least $A_{ij}$ units of raw material $i$, so
we must have $r \succeq Aq$. (We will assume that $A_{ij}$ are nonnegative.) The per-unit cost of the raw
materials is given by $c \in \mathbf{R}^m_+$, so the total raw material cost is $c^T r$.
The (nonnegative) demand for product $j$ is denoted $d_j$; the number of units of product $j$ sold is
$s_j = \min\{q_j, d_j\}$. (When $q_j > d_j$, $q_j - d_j$ is the amount of product $j$ produced, but not sold; when
$d_j > q_j$, $d_j - q_j$ is the amount of unmet demand.) The revenue from selling the products is $p^T s$,
where $p \in \mathbf{R}^n_+$ is the vector of product prices. The profit is $p^T s - c^T r$. (Both $d$ and $q$ are real
vectors; their entries need not be integers.)
You are given $A$, $c$, and $p$. The product demand, however, is not known. Instead, a set of $K$
possible demand vectors, $d^{(1)}, \ldots, d^{(K)}$, with associated probabilities $\pi_1, \ldots, \pi_K$, is given. (These
satisfy $\mathbf{1}^T \pi = 1$, $\pi \succeq 0$.)
You will explore two different optimization problems that arise in choosing r and q (the variables).

I. Choose r and q ahead of time. You must choose r and q, knowing only the data listed
above. (In other words, you must order the raw materials, and commit to producing the chosen
quantities of products, before you know the product demand.) The objective is to maximize the
expected profit.

II. Choose r ahead of time, and q after d is known. You must choose r, knowing only the
data listed above. Some time after you have chosen r, the demand will become known to you.
This means that you will find out which of the K demand vectors is the true demand. Once you
know this, you must choose the quantities to be manufactured. (In other words, you must order
the raw materials before the product demand is known; but you can choose the mix of products to
manufacture after you have learned the true product demand.) The objective is to maximize the
expected profit.

(a) Explain how to formulate each of these problems as a convex optimization problem. Clearly
state what the variables are in the problem, what the constraints are, and describe the roles
of any auxiliary variables or constraints you introduce.
(b) Carry out the methods from part (a) on the problem instance with numerical data given in
planning_data.m. This file will define A, D, K, c, m, n, p and pi. The K columns of D are the
possible demand vectors. For both of the problems described above, give the optimal value of
r, and the expected profit.

Solution. We first consider the case when $r$ and $q$ must be decided before the demand is known.
The variables are $r \in \mathbf{R}^m$ and $q \in \mathbf{R}^n$. The cost of the resources is deterministic; it's just $c^T r$. The
revenue from product sales is a random variable, with values
\[
p^T \min\{q, d^{(k)}\}, \quad k = 1, \ldots, K,
\]
(where $\min\{q, d^{(k)}\}$ is meant elementwise), with associated probabilities $\pi_1, \ldots, \pi_K$. The expected
profit is therefore
\[
-c^T r + \sum_{k=1}^K \pi_k p^T \min\{q, d^{(k)}\}.
\]
Our problem is thus
\[
\begin{array}{ll}
\mbox{maximize} & -c^T r + \sum_{k=1}^K \pi_k p^T \min\{q, d^{(k)}\} \\
\mbox{subject to} & q \succeq 0, \quad r \succeq 0, \quad r \succeq Aq,
\end{array}
\]
with variables $r$ and $q$. The objective is concave (and piecewise-linear), and the constraints are all
linear inequalities, so this is a convex optimization problem.
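To make the objective concrete, the sketch below evaluates the expected profit for given $r$ and production plans on a tiny instance. All data here are hypothetical (two raw materials, two products, two demand scenarios) and unrelated to planning_data.m; `expected_profit` is a helper written for this sketch. Passing the same $q$ for every scenario corresponds to problem I, while passing a different $q^{(k)}$ per scenario corresponds to problem II.

```python
def expected_profit(r, q_by_scenario, A, c, p, demands, probs):
    """Expected profit -c^T r + sum_k pi_k p^T min(q^(k), d^(k)).

    q_by_scenario[k] is the production vector used when demand k occurs.
    """
    for q in q_by_scenario:
        need = [sum(a * qj for a, qj in zip(row, q)) for row in A]
        if any(ri < ni - 1e-12 for ri, ni in zip(r, need)):
            raise ValueError("infeasible: r must satisfy r >= A q")
    cost = sum(ci * ri for ci, ri in zip(c, r))
    revenue = sum(
        pk * sum(pj * min(qj, dj) for pj, qj, dj in zip(p, q, d))
        for pk, q, d in zip(probs, q_by_scenario, demands))
    return revenue - cost


# Hypothetical data, chosen only for illustration.
A = [[1.0, 2.0], [1.0, 1.0]]
c = [1.0, 1.0]
p = [5.0, 6.0]
demands = [[1.0, 4.0], [3.0, 1.0]]
probs = [0.5, 0.5]
r = [6.0, 4.0]

# Problem I: one production plan, committed before demand is known.
profit_one_q = expected_profit(r, [[2.0, 2.0]] * 2, A, c, p, demands, probs)
# Problem II: a separate plan for each demand scenario, same raw materials.
profit_adaptive = expected_profit(r, [[1.0, 2.5], [3.0, 1.0]], A, c, p, demands, probs)
print(profit_one_q, profit_adaptive)
```

The plans used here are feasible choices, not optimized ones; they only illustrate that adapting production to the observed demand can raise the expected profit for the same raw material order.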

Now we turn to the case when we must commit to the resource order, but can choose the product
mix after we know the demand. In this case we need to have a separate product quantity vector
for each possible demand vector. Thus we have variables $r \in \mathbf{R}^m_+$ and
\[
q^{(1)}, \ldots, q^{(K)} \in \mathbf{R}^n_+.
\]
Here $q^{(k)}$ is the product mix we produce if $d^{(k)}$ turns out to be the actual product demand. Our
problem can be formulated as
\[
\begin{array}{ll}
\mbox{maximize} & -c^T r + \sum_{k=1}^K \pi_k p^T \min\{q^{(k)}, d^{(k)}\} \\
\mbox{subject to} & r \succeq 0 \\
& q^{(k)} \succeq 0, \quad r \succeq Aq^{(k)}, \quad k = 1, \ldots, K.
\end{array}
\]
Note that the variables $r$ and $q^{(1)}, \ldots, q^{(K)}$ must be decided at the same time, taking all of them
into account; we cannot optimize them separately. Nor can we decide on $r$ first, without taking the
different $q^{(k)}$'s into account. (However, provided we have solved the problem above to get the right
$r$, we can then optimize over $q$, once we know which demand vector is the true one. But there is
no need for this, since we have already worked out the optimal $q$ for each possible demand vector.)
We can simplify this a bit. In this case, we will not produce any excess product, because this wastes
resources, and we know the demand before we decide how much to manufacture. So we will have
$q^{(k)} \preceq d^{(k)}$, and we can write the problem as
\[
\begin{array}{ll}
\mbox{maximize} & -c^T r + \sum_{k=1}^K \pi_k p^T q^{(k)} \\
\mbox{subject to} & r \succeq 0 \\
& d^{(k)} \succeq q^{(k)} \succeq 0, \quad r \succeq Aq^{(k)}, \quad k = 1, \ldots, K.
\end{array}
\]
With the values from planning_data.m, the optimal values of $r$ for the two problems, respectively,
are

rI = (45.5, 65.8, 44.6, 76.7, 72.0, 77.8, 59.3, 61.9, 83.2, 73.1)
rII = (65.7, 59.6, 55.2, 70.8, 69.7, 68.4, 55.3, 57.4, 63.8, 66.2)

These yield expected profits, respectively, of 114.3 and 133.0.


The CVX code that solves this problem is listed below.

disp('Problem I.')
cvx_begin
    variable r(m);       % amount of each raw material to buy.
    variable q(n);       % amount of each product to manufacture.
    r >= 0;
    q >= 0;
    r >= A*q;
    S = min(q*ones(1,K),D);   % sales in each demand scenario
    maximize(p'*S*pi - c'*r)
cvx_end
rI = r; disp(r);
disp(['profit: ' num2str(cvx_optval)]);
profitI = p'*S - c'*r;   % save for graphing.

disp('Problem II.')
cvx_begin
    variable r(m);       % amount of each raw material to buy.
    variable q(n, K);    % product to manufacture, for each demand.
    q >= 0;
    r >= 0;
    r*ones(1,K) >= A*q;
    S = min(q, D);
    maximize(p'*S*pi - c'*r)
cvx_end
rII = r; disp(rII);
disp(['profit: ' num2str(cvx_optval)]);
profitII = p'*S - c'*r;  % save for graphing.

We mention two other optimization problems that we didn't ask you to explore. First, suppose we
committed to buy $r^{I}$ resources under the first plan, but later learned the demand (before production
started). The question then is: how suboptimal is the expected profit when using the $r^{I}$ from the
first problem instead of $r^{II}$? For the given numerical instance, our expected profit drops from 133.0
to 128.7.
Finally, it's also interesting to look at performance when we know the demand before buying the
resources. This is the full prescience, or full information, case: we know everything beforehand. In
this case we can optimize the choice of $r$ and $q$ for each possible demand vector separately. This
obviously gives an upper bound on the profit available when we have less information. In this case,
the expected profit with full prescience is 161.4.
The code that solves these two additional problems is listed below.

disp('Problem II but with rI.')
r = rI;
cvx_begin
    variable q(n, K);    % product to manufacture, for each demand.
    q >= 0;
    r*ones(1,K) >= A*q;
    S = min(q, D);
    maximize(p'*S*pi - c'*r)
cvx_end
disp(['profit: ' num2str(p'*S*pi - c'*r)]);
profitIII = p'*S - c'*r; % save for graphing.

disp('Full prescience.')
cvx_begin
    variable r(m, K);    % amount of each raw material to buy, for each demand.
    variable q(n, K);    % product to manufacture, for each demand.
    q >= 0;
    r >= 0;
    r >= A*q;
    S = min(q, D);
    maximize((p'*S - c'*r)*pi)
cvx_end
disp(['profit: ' num2str((p'*S - c'*r)*pi)]);
profitIV = p'*S - c'*r;  % save for graphing.

Histograms showing profits with the different possible outcomes are shown below.

[Figure: four stacked histograms of profit over the possible demand outcomes, one for each of the
four cases above; horizontal axes run from 0 to 400, vertical axes from 0 to 15.]

13.12 Gini coefficient of inequality. Let $x_1, \ldots, x_n$ be a set of nonnegative numbers with positive sum,
which typically represent the wealth or income of $n$ individuals in some group. The Lorentz curve
is a plot of the fraction $f_i$ of total wealth held by the $i$ poorest individuals,

\[
f_i = (1/\mathbf{1}^T x) \sum_{j=1}^i x_{(j)}, \quad i = 0, \ldots, n,
\]

versus $i/n$, where $x_{(j)}$ denotes the $j$th smallest of the numbers $\{x_1, \ldots, x_n\}$, and we take $f_0 = 0$.
The Lorentz curve starts at $(0, 0)$ and ends at $(1, 1)$. Interpreted as a continuous curve (as, say,
$n \to \infty$) the Lorentz curve is convex and increasing, and lies on or below the straight line joining
the endpoints. The curve coincides with this straight line, i.e., $f_i = (i/n)$, if and only if the wealth
is distributed equally, i.e., the $x_i$ are all equal.
The Gini coefficient is defined as twice the area between the straight line corresponding to uniform
wealth distribution and the Lorentz curve:

\[
G(x) = (2/n) \sum_{i=1}^n ((i/n) - f_i).
\]

The Gini coefficient is used as a measure of wealth or income inequality: It ranges between 0 (for
equal distribution of wealth) and $1 - 1/n$ (when one individual holds all wealth).

(a) Show that $G$ is a quasiconvex function on $x \in \mathbf{R}^n_+ \setminus \{0\}$.

(b) Gini coefficient and marriage. Suppose that individuals $i$ and $j$ get married ($i \neq j$) and
therefore pool wealth. This means that $x_i$ and $x_j$ are both replaced with $(x_i + x_j)/2$. What
can you say about the change in Gini coefficient caused by this marriage?

Solution.

(a) Recall that the sum of the $i$ smallest values of $x$, $S_i = \sum_{j=1}^i x_{(j)}$, is concave, so
$Z(x) = \sum_{i=1}^n S_i$ is concave. We can express $G$ as $G = (n + 1)/n - (2/n)Z(x)/\mathbf{1}^T x$.
Thus, the sublevel set is

\[
\{x \mid G(x) \leq t\} = \{x \mid ((n + 1)/n - t)\mathbf{1}^T x - (2/n)Z(x) \leq 0\}.
\]

The righthand side is clearly convex, since the function $((n + 1)/n - t)\mathbf{1}^T x - (2/n)Z(x)$ is
convex.
(b) Marriage cannot increase the Gini coefficient. Here's a snappy proof, using part (a). Let $z$ be
the same as $x$, but with $i$th and $j$th entries swapped. Then $(x + z)/2$ is precisely the income
distribution after $i$ and $j$ are married. Clearly $G(z) = G(x)$. By quasiconvexity,

\[
G((x + z)/2) \leq \max\{G(x), G(z)\} = G(x).
\]
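The definition of $G$ and the marriage result can be checked numerically. The following is a small NumPy sketch; the helper names `gini` and `marry` and the sample wealth vector are our own choices for illustration.

```python
import numpy as np

def gini(x):
    """Gini coefficient G(x) = (2/n) * sum_i (i/n - f_i) for nonnegative x."""
    xs = np.sort(np.asarray(x, dtype=float))   # x_(1) <= ... <= x_(n)
    n = xs.size
    f = np.cumsum(xs) / xs.sum()               # f_1, ..., f_n on the Lorentz curve
    i = np.arange(1, n + 1)
    return float((2.0 / n) * np.sum(i / n - f))

def marry(x, i, j):
    """Return a copy of x in which x_i and x_j are replaced by their average."""
    y = np.array(x, dtype=float)
    y[i] = y[j] = 0.5 * (y[i] + y[j])
    return y

x = np.array([1.0, 2.0, 3.0, 10.0])            # hypothetical wealth vector
g_before = gini(x)
g_after = gini(marry(x, 0, 3))                 # poorest marries richest
print(g_before, g_after)
```

The two extreme cases in the problem statement give a quick sanity check: equal wealth yields $G = 0$, and one individual holding everything yields $G = 1 - 1/n$.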

13.13 Internal rate of return for cash streams with a single initial investment. We use the notation of
example 3.34 in the textbook. Let $x \in \mathbf{R}^{n+1}$ be a cash flow over $n$ periods, with $x$ indexed from 0
to $n$, where the index denotes period number. We assume that $x_0 < 0$, $x_j \geq 0$ for $j = 1, \ldots, n$, and
$x_0 + \cdots + x_n > 0$. This means that there is an initial positive investment; thereafter, only payments
are made, with the total of the payments exceeding the initial investment. (In the more general
setting of example 3.34, we allow additional investments to be made after the initial investment.)

(a) Show that $\mathrm{IRR}(x)$ is quasilinear in this case.

(b) Blending initial investment only streams. Use the result in part (a) to show the following.
Let $x^{(i)} \in \mathbf{R}^{n+1}$, $i = 1, \ldots, k$, be a set of $k$ cash flows over $n$ periods, each of which satisfies
the conditions above. Let $w \in \mathbf{R}^k_+$, with $\mathbf{1}^T w = 1$, and consider the blended cash flow given
by $x = w_1 x^{(1)} + \cdots + w_k x^{(k)}$. (We can think of this as investing a fraction $w_i$ in cash flow
$i$.) Show that $\mathrm{IRR}(x) \leq \max_i \mathrm{IRR}(x^{(i)})$. Thus, blending a set of cash flows (with initial
investment only) will not improve the IRR over the best individual IRR of the cash flows.

Solution.

(a) Consider any cash flow with $x_0 < 0$, $x_j \geq 0$ for $j = 1, \ldots, n$, and $x_0 + \cdots + x_n > 0$. These
assumptions imply that the present value function $\mathrm{PV}(x, r)$ is monotone decreasing in $r$, so
it crosses 0 at exactly one point. (This is not necessarily the case when some $x_j$ are negative
for $j > 0$, which corresponds to making additional investments in addition to the original
investment; cf. example 3.34.) Therefore we can say that

\[
\mathrm{IRR}(x) \leq R \iff \mathrm{PV}(x, R) \leq 0.
\]

The righthand side is a linear inequality in $x$, and so defines a convex set. It follows that IRR
is quasiconvex; but we know from example 3.34 that it is quasiconcave, so it is quasilinear (on
the set of $x$ satisfying the assumptions above).
(b) Now we can solve part (b). For a quasiconvex function, its value at any convex combination
of some points cannot exceed the maximum value at the points. This gives us our desired
conclusion.
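The blending result can be illustrated numerically. The sketch below assumes the present value is $\mathrm{PV}(x, r) = \sum_{i=0}^n x_i (1 + r)^{-i}$, as in example 3.34, and finds the IRR by bisection, which is valid here because PV is decreasing in $r$ under the assumptions above. The function names and the two sample cash flows are hypothetical.

```python
import numpy as np

def present_value(x, r):
    """PV(x, r) = sum_i x_i (1 + r)^(-i) for a cash flow x_0, ..., x_n."""
    x = np.asarray(x, dtype=float)
    i = np.arange(x.size)
    return float(np.sum(x * (1.0 + r) ** (-i)))

def irr(x, tol=1e-10):
    """IRR by bisection; assumes x_0 < 0, x_j >= 0 for j >= 1, and sum(x) > 0."""
    lo, hi = 0.0, 1.0                 # PV(x, 0) = sum(x) > 0, so IRR > 0
    while present_value(x, hi) > 0:   # grow the bracket until PV changes sign
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if present_value(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Two hypothetical cash flows with a single initial investment.
streams = np.array([[-1.0, 0.0, 1.21],    # IRR is 10%
                    [-1.0, 1.5, 0.0]])    # IRR is 50%
w = np.array([0.5, 0.5])                  # blending weights, sum to one
blended = w @ streams

irr_each = [irr(s) for s in streams]
irr_blend = irr(blended)
print(irr_each, irr_blend)
```

Since IRR is quasilinear on this set of cash flows, the blended IRR lies between the smallest and largest individual IRRs.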

13.14 Efficient solution of basic portfolio optimization problem. This problem concerns the simplest
possible portfolio optimization problem:

\[
\begin{array}{ll}
\mbox{maximize} & \mu^T w - (\gamma/2) w^T \Sigma w \\
\mbox{subject to} & \mathbf{1}^T w = 1,
\end{array}
\]

with variable $w \in \mathbf{R}^n$ (the normalized portfolio, with negative entries meaning short positions),
and data $\mu$ (mean return), $\Sigma \in \mathbf{S}^n_{++}$ (return covariance), and $\gamma > 0$ (the risk aversion parameter).
The return covariance has the factor form $\Sigma = F Q F^T + D$, where $F \in \mathbf{R}^{n \times k}$ (with rank $k$) is
the factor loading matrix, $Q \in \mathbf{S}^k_{++}$ is the factor covariance matrix, and $D$ is a diagonal matrix
with positive entries, called the idiosyncratic risk (since it describes the risk of each asset that is
independent of the factors). This form for $\Sigma$ is referred to as a $k$-factor risk model. Some typical
dimensions are $n = 2500$ (assets) and $k = 30$ (factors).

(a) What is the flop count for computing the optimal portfolio, if the low-rank plus diagonal
structure of $\Sigma$ is not exploited? You can assume that $\gamma = 1$ (which can be arranged by
absorbing it into $\Sigma$).
(b) Explain how to compute the optimal portfolio more efficiently, and give the flop count for your
method. You can assume that $k \ll n$. You do not have to give the best method; any method
that has linear complexity in $n$ is fine. You can assume that $\gamma = 1$.
Hints. You may want to introduce a new variable $y = F^T w$ (which is called the vector of
factor exposures). You may want to work with the matrix

\[
G = \left[\begin{array}{cc} \mathbf{1} & F \\ 0 & -I \end{array}\right] \in \mathbf{R}^{(n+k) \times (1+k)},
\]

treating it as dense, ignoring the (little) exploitable structure in it.


(c) Carry out your method from part (b) on some randomly generated data with dimensions
$n = 2500$, $k = 30$. For comparison (and as a check on your method), compute the optimal
portfolio using the method of part (a) as well. Give the (approximate) CPU time for each
method, using tic and toc. Hints. After you generate $D$ and $Q$ randomly, you might want to
add a positive multiple of the identity to each, to avoid any issues related to poor conditioning.
Also, to be able to invert a block diagonal matrix efficiently, you'll need to recast it as sparse.
(d) Risk return trade-off curve. Now suppose we want to compute the optimal portfolio for $M$
values of the risk aversion parameter $\gamma$. Explain how to do this efficiently, and give the
complexity in terms of $M$, $n$, and $k$. Compare to the complexity of using the method of
part (b) $M$ times. Hint. Show that the optimal portfolio is an affine function of $1/\gamma$.

Solution.

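(a) We compute the optimal portfolio by solving the (linear) KKT system

\[
\left[\begin{array}{cc} \Sigma & \mathbf{1} \\ \mathbf{1}^T & 0 \end{array}\right]
\left[\begin{array}{c} w \\ \nu \end{array}\right]
=
\left[\begin{array}{c} \mu \\ 1 \end{array}\right],
\]

where $\nu \in \mathbf{R}$ is the Lagrange multiplier associated with the budget constraint $\mathbf{1}^T w = 1$.
The flop count is $(1/3)(n + 1)^3$, which after dropping non-dominant terms is the same as
$(1/3)n^3$.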
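(b) There are several ways to describe an efficient method, and they are all related. Here is one.
We introduce a new variable $y = F^T w$ (which represents the factor exposures for the portfolio
$w$), and solve the problem

\[
\begin{array}{ll}
\mbox{maximize} & \mu^T w - (1/2) w^T D w - (1/2) y^T Q y \\
\mbox{subject to} & \mathbf{1}^T w = 1, \quad F^T w = y,
\end{array}
\]

with variables $w$, $y$. The KKT system is

\[
\left[\begin{array}{cccc}
D & 0 & \mathbf{1} & F \\
0 & Q & 0 & -I \\
\mathbf{1}^T & 0 & 0 & 0 \\
F^T & -I & 0 & 0
\end{array}\right]
\left[\begin{array}{c} w \\ y \\ \nu \\ \kappa \end{array}\right]
=
\left[\begin{array}{c} \mu \\ 0 \\ 1 \\ 0 \end{array}\right].
\]

We will use standard block elimination, as described in §C.4 of the textbook. To match the
notation of that section, we define

\[
A_{11} = \left[\begin{array}{cc} D & 0 \\ 0 & Q \end{array}\right], \quad
A_{12} = \left[\begin{array}{cc} \mathbf{1} & F \\ 0 & -I \end{array}\right], \quad
A_{21} = A_{12}^T, \quad A_{22} = 0,
\]

(with $A_{22} \in \mathbf{R}^{(k+1) \times (k+1)}$),

\[
b_1 = \left[\begin{array}{c} \mu \\ 0 \end{array}\right], \quad
b_2 = \left[\begin{array}{c} 1 \\ 0 \end{array}\right]
\]

(where the 0s in these expressions have dimension $k$), and

\[
x_1 = \left[\begin{array}{c} w \\ y \end{array}\right], \quad
x_2 = \left[\begin{array}{c} \nu \\ \kappa \end{array}\right].
\]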

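Now we solve the linear equations above by elimination. The Schur complement is

\[
\begin{array}{rcl}
S &=& A_{22} - A_{21} A_{11}^{-1} A_{12} \\[1ex]
&=& -\left[\begin{array}{cc} \mathbf{1}^T & 0 \\ F^T & -I \end{array}\right]
\left[\begin{array}{cc} D^{-1} & 0 \\ 0 & Q^{-1} \end{array}\right]
\left[\begin{array}{cc} \mathbf{1} & F \\ 0 & -I \end{array}\right] \\[1ex]
&=& -\left[\begin{array}{cc}
\mathbf{1}^T D^{-1} \mathbf{1} & \mathbf{1}^T D^{-1} F \\
F^T D^{-1} \mathbf{1} & F^T D^{-1} F + Q^{-1}
\end{array}\right].
\end{array}
\]

The dominant part of forming $S$ is computing $F^T D^{-1} F$, which costs $nk^2$ flops. (Computing
$Q^{-1}$ is order $k^3$, which is non-dominant.) We also compute the reduced righthand side,

\[
\tilde b = b_2 - A_{21} A_{11}^{-1} b_1
= \left[\begin{array}{c} 1 - \mathbf{1}^T D^{-1} \mu \\ -F^T D^{-1} \mu \end{array}\right],
\]

with negligible cost (even ignoring the fact that $F^T D^{-1}$ was already computed in forming $S$).
Now we solve the reduced system $S x_2 = \tilde b$, which costs $(1/3)(k + 1)^3$, and so is non-dominant.
We break up $x_2$ into its first component, which is $\nu$, and its last $k$ components, $\kappa$. Finally, we
find $x_1$ from

\[
x_1 = A_{11}^{-1} (b_1 - A_{12} x_2)
= \left[\begin{array}{c} D^{-1}(\mu - \nu \mathbf{1} - F\kappa) \\ Q^{-1} \kappa \end{array}\right],
\]

which has negligible cost. In fact, we only need the top part of $x_1$:

\[
w = D^{-1}(\mu - \nu \mathbf{1} - F\kappa).
\]

The overall flop count is $nk^2$ flops, quite a bit more efficient than the other method, which
requires $(1/3)n^3$ flops.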
(c) The Matlab code below carries out both methods.
% efficient solution of basic portfolio optimization problem
% generate some random data
n = 2500; % number of assets
k = 30; % number of factors
randn('state',0);
rand('state',0);
F = randn(n,k);
d = 0.1+rand(n,1);
Q = randn(k); Q=Q*Q'+eye(k);
Sigma = diag(d) + F*Q*F';
mu = rand(n,1);

% the slow way, solve full KKT
tic;
wnu = [Sigma ones(n,1); ones(n,1)' 0] \ [mu ; 1];
toc;
wslow = wnu(1:n);

% fast method (specific code)
tic;
% form Schur complement (some inefficiencies here...)
S= -[sum(1./d) (1./d)'*F; F'*(1./d) F'*diag(1./d)*F+inv(Q)];
btilde = [1-ones(1,n)*(mu./d); -F'*(mu./d)];
x2 = S\btilde;
nu = x2(1); kappa = x2(2:k+1);
wfast = (mu-nu*ones(n,1)-F*kappa)./d;
toc;
rel_err = norm(wfast-wslow)/norm(wslow)

% fast method (general code)
A11 = sparse([diag(d) zeros(n,k); zeros(k,n) Q]);
A12 = sparse([ones(n,1) F; zeros(k,1) -eye(k)]);
A21 = A12';
b1 = [mu; zeros(k,1)];
b2 = [1; zeros(k,1)];
tic;
R= chol(A11); % cholesky factorization (A11 = R'*R); we'll need A11^-1 several times
S=-A21*(R\(R'\A12));
btilde = b2-A21*(R\(R'\b1));
x2 = S\btilde;
x1 = R\(R'\(b1-A12*x2));
wfast = x1(1:n);
toc;

rel_err = norm(wfast-wslow)/norm(wslow)
The times for the two methods are around 2 seconds (slow method) and 60 milliseconds (fast
method). With a compiled language (e.g., C or C++), the difference would be even more dramatic.
Here is the corresponding Julia code. As of Winter 2015, Julia's sparse Cholesky factorization
still has some issues that are being worked out, so we do not include that method here. The
times were roughly 3 seconds for the slow method and 1.5 seconds for the fast method.

# efficient solution of basic portfolio optimization problem
# (written for the Julia release current in Winter 2015)

# generate some random data
n = 2500; # number of assets
k = 30; # number of factors
srand(0);
F = randn(n,k);
d = 0.1 + rand(n);
Q = randn(k,k);
Q = Q*Q'+eye(k);
Sigma = diagm(d) + F*Q*F';
mu = rand(n);

# the slow way, solve full KKT
tic();
wnu = [Sigma ones(n); ones(1,n) 0] \ [mu ; 1];
toc();
wslow = wnu[1:n];

# fast method (specific code)
tic();
# form Schur complement (some inefficiencies here...)
inv_d = 1./d;
S= -[sum(inv_d) inv_d'*F; F'*inv_d F'*diagm(inv_d)*F+inv(Q)];
btilde = [1-ones(1,n)*(mu./d); -F'*(mu./d)];
x2 = S\btilde;
nu = x2[1]; kappa = x2[2:k+1];
wfast = (mu-nu*ones(n)-F*kappa)./d;
toc();
rel_err = norm(wfast-wslow)/norm(wslow);
println(rel_err)

Here is the solution in Python. The times were roughly 0.34 seconds for the slow method and
0.02 seconds for the fast method.

# efficient solution of basic portfolio optimization problem


# generate some random data

import numpy as np
import scipy.sparse as sps
import scipy.sparse.linalg as splinalg
import time

n = 2500 # number of assets


k = 30 # number of factors
np.random.seed(0)

F = np.random.randn(n,k)
F = np.matrix(F)
d = 0.1 + np.random.rand(n)
d = np.matrix(d).T
Q = np.random.randn(k)
Q = np.matrix(Q).T
Q =Q * Q.T + np.eye(k)
Sigma = np.diag(d.A1) + F*Q*F.T

mu = np.random.rand(n)
mu = np.matrix(mu).T

# the slow way, solve full KKT


t = time.time()
kkt_matrix = np.vstack(
(np.hstack((Sigma, np.ones((n,1)))),
np.hstack((np.ones(n), [0.]))))
wnu = np.linalg.solve(kkt_matrix, np.vstack((mu, [1.])))
print("Elapsed time is %f seconds." % (time.time() - t))
wslow = wnu[:n]

# fast method (specific code)


t = time.time()
# form Schur complement (some inefficiencies here...)
S = -np.vstack(
(np.hstack(([np.sum(1./d)],((1./d).T*F).A1)),
np.hstack((F.T*(1./d), F.T*np.diag(1./d.A1)*F+np.linalg.inv(Q))))
)
btilde = np.vstack(
([1-np.sum(mu/d)],
-F.T*(mu/d))
)
x2 = np.linalg.solve(S,btilde)
nu = x2[0][0,0]
kappa = np.matrix(x2[1:k+1])
wfast = (mu-nu*np.ones((n,1))-F*kappa)/d
print("Elapsed time is %f seconds." % (time.time() - t))
rel_err = np.sqrt(np.sum((wfast-wslow).A1**2)/np.sum(wslow.A1**2))
print(rel_err)

# fast method (general code)


A11 = sps.vstack((
sps.hstack((sps.diags([d.A1],[0]), np.zeros((n,k)))),
sps.hstack((sps.csc_matrix((k,n)), Q))
))
A12 = sps.vstack((
sps.hstack((np.ones((n,1)), sps.csr_matrix(F))),
sps.hstack((sps.csc_matrix((k,1)), sps.eye(k)))
))
A21 = A12.T
b1 = np.vstack((mu, np.zeros((k,1))))
b2 = np.vstack((1., np.zeros((k,1))))
t = time.time()
factor_solve = splinalg.factorized(A11)

S = -A21*factor_solve(A12.todense())
btilde = b2-A21*factor_solve(b1)
x2 = np.linalg.solve(S,btilde)
x1 = factor_solve(b1-A12*x2)
wfast = x1[:n]
print("Elapsed time is %f seconds." % (time.time() - t))
rel_err = np.sqrt(np.sum((wfast-wslow).A1**2)/np.sum(wslow.A1**2))
print(rel_err)

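(d) Now we put $\gamma$ back in to the KKT system:

\[
\left[\begin{array}{cc} \gamma\Sigma & \mathbf{1} \\ \mathbf{1}^T & 0 \end{array}\right]
\left[\begin{array}{c} w \\ \nu \end{array}\right]
=
\left[\begin{array}{c} \mu \\ 1 \end{array}\right].
\]

Divide the first block row by $\gamma$, and change variables to $\tilde\nu = \nu/\gamma$ to get

\[
\left[\begin{array}{cc} \Sigma & \mathbf{1} \\ \mathbf{1}^T & 0 \end{array}\right]
\left[\begin{array}{c} w \\ \tilde\nu \end{array}\right]
=
\left[\begin{array}{c} \mu/\gamma \\ 1 \end{array}\right].
\]

This shows that the optimal portfolio is an affine function of $1/\gamma$. Therefore, $w$ has the form
$w = w_0 + (1/\gamma)w_1$, where $w_0$ is the optimal portfolio for $1/\gamma = 0$ (or $\gamma = \infty$), and $w_0 + w_1$ is
the optimal portfolio for $\gamma = 1$. We just solve for the optimal portfolio for these two values of
$\gamma$ (or any other two, for that matter!), and then we can construct the solution for any other
value with $2n$ flops.
Thus we perform two solves, which costs $(2/3)nk^2$; then we need $2nM$ flops to find all the
optimal portfolios. The complexity couldn't be any smaller, since to just write down the $M$
optimal portfolios, we touch $nM$ real numbers.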
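The trade-off curve computation in part (d) can be sketched in a few lines of NumPy. This is only an illustration under our own assumptions: the helper names `solve_factor_kkt` and `solve_dense`, the problem size, and the grid of $\gamma$ values are made up, and plain arrays are used instead of the matrix class in the listing above. The fast solver follows the block elimination of part (b), with $\mu$ replaced by a general righthand side $a$, and the dense solver is there only as a check.

```python
import numpy as np

def solve_factor_kkt(a, F, d, Q):
    """Solve Sigma w + nu 1 = a, 1^T w = 1, where Sigma = F Q F^T + diag(d).

    Uses the Schur complement of part (b); the cost is dominated by F^T D^{-1} F.
    """
    inv_d = 1.0 / d
    Fd = F * inv_d[:, None]                      # D^{-1} F, n x k
    col = Fd.sum(axis=0)                         # F^T D^{-1} 1, length k
    S = -np.block([
        [np.array([[inv_d.sum()]]), col[None, :]],
        [col[:, None], F.T @ Fd + np.linalg.inv(Q)],
    ])
    btilde = np.concatenate(([1.0 - inv_d @ a], -Fd.T @ a))
    x2 = np.linalg.solve(S, btilde)
    nu, kappa = x2[0], x2[1:]
    return (a - nu - F @ kappa) / d              # w = D^{-1}(a - nu 1 - F kappa)

def solve_dense(mu, Sigma, gamma):
    """Reference solution of the full KKT system for a given gamma."""
    n = mu.size
    K = np.block([[gamma * Sigma, np.ones((n, 1))],
                  [np.ones((1, n)), np.zeros((1, 1))]])
    return np.linalg.solve(K, np.append(mu, 1.0))[:n]

# Hypothetical random instance (smaller than n = 2500, k = 30 to keep it quick).
rng = np.random.default_rng(0)
n, k = 200, 5
F = rng.standard_normal((n, k))
d = 0.1 + rng.random(n)
G = rng.standard_normal((k, k))
Q = G @ G.T + np.eye(k)
mu = rng.random(n)

w0 = solve_factor_kkt(np.zeros(n), F, d, Q)      # portfolio for 1/gamma = 0
w1 = solve_factor_kkt(mu, F, d, Q) - w0          # (portfolio for gamma = 1) - w0
gammas = np.logspace(-1, 2, 50)                  # M = 50 risk aversion values
W = w0[:, None] + w1[:, None] / gammas           # column j is w0 + w1 / gammas[j]

# Compare every column with a dense solve of the original KKT system.
Sigma = F @ Q @ F.T + np.diag(d)
max_err = max(np.max(np.abs(W[:, j] - solve_dense(mu, Sigma, g)))
              for j, g in enumerate(gammas))
print(max_err)
```

Only two structured solves are needed however many values of $\gamma$ are requested; each further portfolio costs one scaled vector addition.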

13.15 Sparse index tracking. The (weekly, say) return of $n$ stocks is given by a random variable $r \in \mathbf{R}^n$,
with mean $\bar r$ and covariance $\mathbf{E}(r - \bar r)(r - \bar r)^T = \Sigma \succ 0$. An index (such as S&P 500 or Wilshire
5000) is a weighted sum of these returns, given by $z = c^T r$, where $c \in \mathbf{R}^n_+$. (For example, the
vector $c$ is nonzero only for the stocks in the index, and the coefficients $c_i$ might be proportional to
some measure of market capitalization of stock $i$.) We will assume that the index weights $c \in \mathbf{R}^n$,
as well as the return mean and covariance $\bar r$ and $\Sigma$, are known and fixed.
Our goal is to find a sparse weight vector $w \in \mathbf{R}^n$, which can include negative entries (meaning,
short positions), so that the RMS index tracking error, defined as

\[
E = \left( \frac{\mathbf{E}(z - w^T r)^2}{\mathbf{E}\, z^2} \right)^{1/2},
\]

does not exceed 0.10 (i.e., 10%). Of course, taking $w = c$ results in $E = 0$, but we are interested in
finding a weight vector with (we hope) many fewer nonzero entries than $c$ has.
Remark. This is the idea behind an index fund: You find a sparse portfolio that replicates or tracks
the return of the index (within some error tolerance). Acquiring (and rebalancing) the sparse
tracking portfolio will incur smaller transactions costs than trading in the full index.

(a) Propose a (simple) heuristic method for finding a sparse weight vector $w$ that satisfies $E \leq 0.10$.

(b) Carry out your method on the problem instance given in sparse_idx_track_data.m. Give
card(w), the number of nonzero entries in w. (To evaluate card(w), use sum(abs(w)>0.01),
which treats weight components smaller than 0.01 as zero.) (You might want to compare the
index weights and the weights you find by typing [c w]. No need to print or turn in the
resulting output, though.)

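Solution. We have

\[
\mathbf{E}(z - w^T r)^2 = \mathbf{E}((c - w)^T r)^2 = (c - w)^T (\Sigma + \bar r \bar r^T)(c - w),
\]

so

\[
E = \frac{\|(\Sigma + \bar r \bar r^T)^{1/2}(c - w)\|_2}{\|(\Sigma + \bar r \bar r^T)^{1/2} c\|_2}.
\]

The heuristic is to choose $w$ by solving the problem

\[
\begin{array}{ll}
\mbox{minimize} & \|w\|_1 \\
\mbox{subject to} & \|(\Sigma + \bar r \bar r^T)^{1/2}(c - w)\|_2 \leq 0.10\, \|(\Sigma + \bar r \bar r^T)^{1/2} c\|_2.
\end{array}
\]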

The code below carries this out for the given data.

sparse_idx_track_data
sigma_rr=Sigma+rbar*rbar';   % second moment matrix of the returns

cvx_begin
    variable w(n);
    minimize(norm(w,1))
    subject to
        quad_form(c-w,sigma_rr)<=0.10^2*quad_form(c,sigma_rr);
cvx_end
card_w = sum(abs(w)>0.01)

% we didn't ask you to do the following.
% we fix the chosen stocks, and optimize the weights for best tracking
zeros_idx = (abs(w)<0.01);
cvx_begin
    variable w2(n);
    minimize (quad_form(c-w2,sigma_rr));
    subject to
        w2(zeros_idx) == 0;
cvx_end

E2 = sqrt(cvx_optval/quad_form(c,sigma_rr));

The method finds a weight vector with card(w) = 50. This can be compared to card(c) = 500.
This weight vector (naturally) achieves a tracking error $E = 0.10$.

In fact we can improve this result, even though we didn't expect you to do any more. For example,
we can just take the selected stocks from the basic method, and then minimize $E$ over the weights
for those. This reduces the tracking error to under 4%.
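The tracking error formula and the refitting step just described can be sketched with NumPy alone. With the support $S$ fixed, minimizing $(c - w)^T M (c - w)$, where $M = \Sigma + \bar r \bar r^T$, is an unconstrained least-squares problem whose optimality condition is $M_{SS} w_S = M_{S,:}\, c$. The function names, the random data, and the rule used to pick the support below are assumptions made for illustration; they are not the data or the $\ell_1$ heuristic of the exercise.

```python
import numpy as np

def tracking_error(w, c, M):
    """RMS tracking error E, where M = Sigma + rbar rbar^T."""
    e = c - w
    return float(np.sqrt((e @ M @ e) / (c @ M @ c)))

def refit_on_support(support, c, M):
    """Minimize (c - w)^T M (c - w) over w with w_i = 0 outside the support."""
    w = np.zeros_like(c)
    S = np.flatnonzero(support)
    # Optimality condition on the free entries: M[S, S] w_S = M[S, :] c.
    w[S] = np.linalg.solve(M[np.ix_(S, S)], M[S, :] @ c)
    return w

# Hypothetical data: 30 stocks, index weights c, second moment matrix M.
rng = np.random.default_rng(0)
n = 30
B = rng.standard_normal((n, n))
Sigma = B @ B.T / n + 0.1 * np.eye(n)
rbar = 0.01 * rng.random(n)
M = Sigma + np.outer(rbar, rbar)
c = rng.random(n)

support = c >= np.sort(c)[-10]            # keep the 10 largest index weights
w_trunc = np.where(support, c, 0.0)       # naive: copy c on the support
w_refit = refit_on_support(support, c, M) # optimized weights on the same support

E_trunc = tracking_error(w_trunc, c, M)
E_refit = tracking_error(w_refit, c, M)
print(E_trunc, E_refit)
```

Because the truncated weights are feasible for the refitting problem, the refitted weights can never track worse on the same set of stocks.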
13.16 Option price bounds. In this problem we use the methods and results of example 5.10 to give
bounds on the arbitrage-free price of an option. (See exercise 5.38 for a simple version of option
pricing.) We will use all the notation and definitions from Example 5.10.
We consider here options on an underlying asset (such as a stock); these have a payoff or value that
depends on $S$, the value of the underlying asset at the end of the investment period. We will assume
that the underlying asset can only take on $m$ different values, $S^{(1)}, \ldots, S^{(m)}$. These correspond to
the $m$ possible scenarios or outcomes described in Example 5.10.
A risk-free asset has value $r > 1$ in every scenario.
A put option at strike price $K$ gives the owner the right to sell one unit of the underlying stock
at price $K$. At the end of the investment period, if the stock is trading at a price $S$, then the
put option has payoff $(K - S)_+ = \max\{0, K - S\}$ (since the option is exercised only if $K > S$).
Similarly a call option at strike price $K$ gives the buyer the right to buy a unit of stock at price $K$.
A call option has payoff $(S - K)_+ = \max\{0, S - K\}$.
A collar is an option with payoff

\[
\left\{ \begin{array}{ll}
C - S_0 & S > C \\
S - S_0 & F \leq S \leq C \\
F - S_0 & S < F,
\end{array} \right.
\]

where $F$ is the floor, $C$ is the cap and $S_0$ is the price of the underlying at the start of the investment
period. This option limits both the upside and downside of payoff.
Now we consider a specific problem. The price of the risk-free asset, with $r = 1.05$, is 1. The price
of the underlying asset is $S_0 = 1$. We will use $m = 200$ scenarios, with $S^{(i)}$ uniformly spaced from
$S^{(1)} = 0.5$ to $S^{(200)} = 2$. The following options are traded on an exchange, with prices listed below.

Type    Strike    Price
Call    1.1       0.06
Call    1.2       0.03
Put     0.8       0.02
Put     0.7       0.01

A collar with floor $F = 0.9$ and cap $C = 1.15$ is not traded on an exchange. Find the range of
prices for this collar, consistent with the absence of arbitrage and the prices given above.
Solution. This problem is exactly the one described at the end of Example 5.10. Let's label the
risk-free asset as asset 1, the underlying asset as asset 2, the four options traded on the exchange
as assets 3 to 6, and the collar as asset 7. A set of 7 prices for these, denoted $p$, is consistent with the
no-arbitrage assumption if and only if there is a $y \succeq 0$ with $V^T y = p$, where $V$ is the value matrix
as defined in Example 5.10. We are given the first 6 entries of $p$, and need to bound the last entry.
The lower bound is found by solving the LP

\[
\begin{array}{ll}
\mbox{minimize} & p_n \\
\mbox{subject to} & V^T y = p \\
& y \succeq 0,
\end{array}
\]

with variables $p_n \in \mathbf{R}$ and $y \in \mathbf{R}^m$. To find the upper bound we maximize $p_n$ instead. The
following code solves the problem:

% ee364a option price bounds
% additional exercise solution

r=1.05; % risk-free rate
m=200; % scenarios
n=7; % assets
V=zeros(m,n); % value/payoff matrix
V(:,1) = r; % risk-free asset
V(:,2) = linspace(0.5,2,m); % underlying
% the four exchange traded options:
V(:,3) = pos(V(:,2) - 1.1);
V(:,4) = pos(V(:,2) - 1.2);
V(:,5) = pos(0.8-V(:,2));
V(:,6) = pos(0.7-V(:,2));
% collar option:
F=0.9;C=1.15;
V(:,7) = min(max(V(:,2)-1,F-1),C-1);

p = [1; 1; 0.06; 0.03; 0.02; 0.01]; % asset prices (from exchange)

cvx_begin
    variables p_collar y(m)
    minimize p_collar
    %maximize p_collar
    y>=0
    V'*y== [p; p_collar]
cvx_end

The bounds were as follows:

[lb ub]=
0.033 0.065

We didn't ask you to carry out the following check, but had this been a real pricing exercise, you
should have done it. We vary both the range of final underlying asset prices (which we arbitrarily
took to be 0.5 and 2, representing a halving or doubling of value), and $m$, the number of samples of
the final underlying asset value (which we arbitrarily took to be the round number 200). Sure enough,
varying these arbitrary parameters doesn't change the collar price bounds very much.
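For readers working in Python, the same pair of LPs can be sketched with `scipy.optimize.linprog`. This is an illustration rather than a translation of the course code: here the collar price is eliminated by substituting $p_n = v^T y$, where $v$ is the collar payoff column, so the only variable is $y$. The helper name `collar_bound` is our own. The solution above reports bounds of 0.033 and 0.065 for this data.

```python
import numpy as np
from scipy.optimize import linprog

r, m = 1.05, 200                          # risk-free value and number of scenarios
S = np.linspace(0.5, 2.0, m)              # final prices of the underlying
S0, F, C = 1.0, 0.9, 1.15                 # initial price, collar floor and cap

def pos(v):
    """Elementwise positive part, max(v, 0)."""
    return np.maximum(v, 0.0)

# Payoff columns of the six priced assets: risk-free, underlying, 2 calls, 2 puts.
V = np.column_stack([
    np.full(m, r), S, pos(S - 1.1), pos(S - 1.2), pos(0.8 - S), pos(0.7 - S),
])
p = np.array([1.0, 1.0, 0.06, 0.03, 0.02, 0.01])               # exchange prices
collar = np.minimum(np.maximum(S - S0, F - S0), C - S0)        # collar payoff

def collar_bound(sign):
    """Minimize sign * collar^T y subject to V^T y = p, y >= 0; return the price."""
    res = linprog(sign * collar, A_eq=V.T, b_eq=p,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return sign * res.fun

lb = collar_bound(+1.0)                   # lower bound on the collar price
ub = collar_bound(-1.0)                   # upper bound (maximize via negation)
print(lb, ub)
```

Rerunning the sketch with a different scenario range or a different `m` is a quick way to carry out the robustness check described above.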

13.17 Portfolio optimization with qualitative return forecasts. We consider the risk-return portfolio
optimization problem described on pages 155 and 185 of the book, with one twist: We don't precisely
know the mean return vector $p$. Instead, we have a range of possible values for each asset, i.e., we
have $l, u \in \mathbf{R}^n$ with $l \preceq p \preceq u$. We use $l$ and $u$ to encode various qualitative forecasts we have
about the mean return vector $p$. For example, $l_7 = 0.02$ and $u_7 = 0.20$ means that we believe the
mean return for asset 7 is between 2% and 20%.
Define the worst-case mean return $R^{\mathrm{wc}}$, as a function of portfolio vector $x$, as the worst (minimum)
value of $p^T x$, over all $p$ consistent with the given bounds $l$ and $u$.

(a) Explain how to find a portfolio $x$ that maximizes $R^{\mathrm{wc}}$, subject to a budget constraint and risk
limit,
\[
\mathbf{1}^T x = 1, \qquad x^T \Sigma x \leq \sigma_{\max}^2,
\]
where $\Sigma \in \mathbf{S}^n_{++}$ and $\sigma_{\max} \in \mathbf{R}_{++}$ are given.
(b) Solve the problem instance given in port_qual_forecasts_data.m. Give the optimal worst-case
mean return achieved by the optimal portfolio $x^\star$.
In addition, construct a portfolio $x^{\mathrm{mid}}$ that maximizes $c^T x$ subject to the budget constraint
and risk limit, where $c = (1/2)(l + u)$. This is the optimal portfolio assuming that the mean
return has the midpoint value of the forecasts. Compare the midpoint mean returns $c^T x^{\mathrm{mid}}$
and $c^T x^\star$, and the worst-case mean returns of $x^{\mathrm{mid}}$ and $x^\star$.
Briefly comment on the results.

Solution.

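(a) There are several ways to solve this problem. All depend on coming up with an explicit
expression for $R^{\mathrm{wc}}$ that is clearly concave.
Here is one way. The value $R^{\mathrm{wc}}$ is the optimal value of the box-constrained LP

\[
\begin{array}{ll}
\mbox{minimize} & p^T x \\
\mbox{subject to} & l \preceq p \preceq u,
\end{array}
\]

with variable $p$. As in exercise 4.8(c), we can choose an optimal $p$ as

\[
p_i^\star = \left\{ \begin{array}{ll} l_i & x_i \geq 0 \\ u_i & x_i \leq 0. \end{array} \right.
\]

The worst case mean return can be written as

\[
\begin{array}{rcl}
R^{\mathrm{wc}}(x) &=& p^{\star T} x \\
&=& \sum_{i=1}^n \left( l_i \max\{0, x_i\} - u_i \max\{0, -x_i\} \right) \\
&=& \frac{1}{2} \sum_{i=1}^n \left( l_i (x_i + |x_i|) + u_i (x_i - |x_i|) \right) \\
&=& c^T x - \sum_{i=1}^n r_i |x_i|,
\end{array}
\]

where $c = (1/2)(l + u)$ and $r = (1/2)(u - l)$.

Thus the problem is

\[
\begin{array}{ll}
\mbox{maximize} & c^T x - \sum_{i=1}^n r_i |x_i| \\
\mbox{subject to} & \mathbf{1}^T x = 1, \quad x^T \Sigma x \leq \sigma_{\max}^2,
\end{array}
\]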
with variable $x \in \mathbf{R}^n$, which is convex. This problem has a very nice interpretation. The
objective is to maximize the midpoint mean return, minus a regularization term, which is a
weighted $\ell_1$ norm; the weights are one half the spread in the forecast returns.
Here is another way to derive a nice expression for $R^{\mathrm{wc}}$. We first note that the problem of
finding the worst-case return splits across the assets, and always occurs with either the lower
or upper limit of return. When $x_i \geq 0$, the worst-case return is obtained using $l_i$; when $x_i \leq 0$,
it is obtained using $u_i$. The associated worst case return for asset $i$ is then $\min(l_i x_i, u_i x_i)$.
Thus we have

\[
R^{\mathrm{wc}} = \sum_{i=1}^n \min(l_i x_i, u_i x_i).
\]

This is evidently concave (indeed, it is DCP compliant). We get the problem

\[
\begin{array}{ll}
\mbox{maximize} & \sum_{i=1}^n \min(l_i x_i, u_i x_i) \\
\mbox{subject to} & \mathbf{1}^T x = 1, \quad x^T \Sigma x \leq \sigma_{\max}^2.
\end{array}
\]

Following the same idea, we can split $x$ into a positive and negative vector, as $x = x_+ - x_-$,
with $x_+, x_- \succeq 0$. Then we write the objective as $l^T x_+ - u^T x_-$, which gives

\[
\begin{array}{ll}
\mbox{maximize} & l^T x_+ - u^T x_- \\
\mbox{subject to} & \mathbf{1}^T (x_+ - x_-) = 1 \\
& (x_+ - x_-)^T \Sigma (x_+ - x_-) \leq \sigma_{\max}^2 \\
& x_+ \succeq 0, \quad x_- \succeq 0.
\end{array}
\]

Of course all these methods are equivalent; they just look a little different.
(b) The following code constructs an optimal portfolio, as well as the optimal portfolio assuming
the mean return takes on the midpoint forecast value. These two portfolios are

   x          xmid
 0.0000    -0.1550
-0.9395    -1.0726
 0.0000     0.9360
 0.3559     0.8118
-0.0000    -0.7599
 0.0813     0.5137
 1.2577     1.0699
 0.2249    -0.4942
 0.0000    -0.1051
 0.0197     0.2553

They achieve midpoint returns 0.38, 0.55, and worst-case returns 0.17, $-0.08$.
Brief comments. If we assume the return will be the midpoint value, the worst-case return
is actually a loss! It's also interesting to note that the portfolio that maximizes worst-case
return is sparse; it doesn't invest in 4 assets. These happen to be ones that have large ranges
in the forecast mean return.

%% portfolio optimization with qualitative return forecasts

port_qual_forecasts_data

c = (l + u)/2;   % midpoint of the forecast range
r = (u - l)/2;   % half-width of the forecast range

cvx_begin
    variable x(n)
    maximize(c'*x - r'*abs(x))
    subject to
        sum(x) == 1;
        quad_form(x,Sigma) <= sigma_max^2;
cvx_end

% assume mean return takes on midpoint forecast value
cvx_begin
    variable x_mid(n)
    maximize(c'*x_mid)
    subject to
        sum(x_mid) == 1;
        quad_form(x_mid,Sigma) <= sigma_max^2;
cvx_end

[ x x_mid ]

fprintf('Midpoint return for x: %.2f\n', ...
    (c'*x));
fprintf('Midpoint return for x_mid: %.2f\n', ...
    (c'*x_mid));
fprintf('Worst-case return for x: %.2f\n', ...
    (c'*x - r'*abs(x)));
fprintf('Worst-case return for x_mid: %.2f\n', ...
    (c'*x_mid - r'*abs(x_mid)));
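The three expressions for the worst-case return derived in part (a) can be checked against each other numerically. The NumPy sketch below uses made-up bounds and a random portfolio; the function names are hypothetical. It also samples mean return vectors inside the box to confirm that none of them gives a smaller return than the closed-form worst case.

```python
import numpy as np

def rwc_abs(x, l, u):
    """Worst-case return as c^T x - r^T |x|, with c the midpoint, r the half-width."""
    c, r = 0.5 * (l + u), 0.5 * (u - l)
    return float(c @ x - r @ np.abs(x))

def rwc_min(x, l, u):
    """Worst-case return as sum_i min(l_i x_i, u_i x_i)."""
    return float(np.sum(np.minimum(l * x, u * x)))

def rwc_split(x, l, u):
    """Worst-case return as l^T x_+ - u^T x_-, with x = x_+ - x_-."""
    x_plus, x_minus = np.maximum(x, 0.0), np.maximum(-x, 0.0)
    return float(l @ x_plus - u @ x_minus)

# Hypothetical forecast ranges and a portfolio with long and short positions.
rng = np.random.default_rng(0)
n = 10
l = rng.uniform(-0.05, 0.05, n)
u = l + rng.uniform(0.0, 0.2, n)
x = rng.standard_normal(n)
x /= x.sum()                                   # enforce the budget constraint

values = (rwc_abs(x, l, u), rwc_min(x, l, u), rwc_split(x, l, u))

# Sample returns p with l <= p <= u; none should fall below the worst case.
P = l + (u - l) * rng.random((1000, n))
sampled_min = float((P @ x).min())
print(values, sampled_min)
```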

13.18 De-leveraging. We consider a multi-period portfolio optimization problem, with $n$ assets and $T$
time periods, where $x_t \in \mathbf{R}^n$ gives the holdings (say, in dollars) at time $t$, with negative entries
denoting, as usual, short positions. For each time period the return vector has mean $\mu \in \mathbf{R}^n$ and
covariance $\Sigma \in \mathbf{S}^n_{++}$. (These are known.)
The initial portfolio $x_0$ maximizes the risk-adjusted expected return $\mu^T x - \gamma x^T \Sigma x$, where $\gamma > 0$,
subject to the leverage limit constraint $\|x\|_1 \leq L^{\mathrm{init}}$, where $L^{\mathrm{init}} > 0$ is the given initial leverage
limit. (There are several different ways to measure leverage; here we use the sum of the total
short and long positions.) The final portfolio $x_T$ maximizes the risk-adjusted return, subject to
$\|x\|_1 \leq L^{\mathrm{new}}$, where $L^{\mathrm{new}} > 0$ is the given final leverage limit (with $L^{\mathrm{new}} < L^{\mathrm{init}}$). This uniquely
determines $x_0$ and $x_T$, since the objective is strictly concave.

The question is how to move from $x_0$ to $x_T$, i.e., how to choose $x_1, \ldots, x_{T-1}$. We will do this so as
to maximize the objective

\[
J = \sum_{t=1}^T \left( \mu^T x_t - \gamma x_t^T \Sigma x_t - \phi(x_t - x_{t-1}) \right),
\]

which is the total risk-adjusted expected return, minus the total transaction cost. The transaction
cost function $\phi$ has the form

\[
\phi(u) = \sum_{i=1}^n \left( \kappa_i |u_i| + \lambda_i u_i^2 \right),
\]

where $\kappa \succeq 0$ and $\lambda \succeq 0$ are known parameters. We will require that $\|x_t\|_1 \leq L^{\mathrm{init}}$, for $t =
1, \ldots, T - 1$. In other words, the leverage limit is the initial leverage limit up until the deadline $T$,
when it drops to the new lower value.

(a) Explain how to find the portfolio sequence $x_1^\star, \ldots, x_{T-1}^\star$ that maximizes $J$ subject to the
leverage limit constraints.
(b) Find the optimal portfolio sequence $x_t^\star$ for the problem instance with data given in deleveraging_data.m.
Compare this sequence with two others: $x_t^{\mathrm{lp}} = x_0$ for $t = 1, \ldots, T - 1$ (i.e., one that does all
trading at the last possible period), and the linearly interpolated portfolio sequence

\[
x_t^{\mathrm{lin}} = (1 - t/T)x_0 + (t/T)x_T, \quad t = 1, \ldots, T - 1.
\]

For each of these three portfolio sequences, give the objective value obtained, and plot the
risk and transaction cost adjusted return,

\[
\mu^T x_t - \gamma x_t^T \Sigma x_t - \phi(x_t - x_{t-1}),
\]

and the leverage $\|x_t\|_1$, versus $t$, for $t = 0, \ldots, T$. Also, for each of the three portfolio sequences,
generate a single plot that shows how the holdings $(x_t)_i$ of the $n$ assets change over time, for
$i = 1, \ldots, n$.
Give a very short (one or two sentence) intuitive explanation of the results.

Solution.

(a) It is easy to see that $\phi$ is convex in $u$. By the composition rule, it follows that each term
of $J$ (the risk-adjusted return minus the transaction cost at time $t$) is concave in $x_1, \ldots, x_{T-1}$.
Then, maximizing $J$ subject to the leverage limit constraints is a convex optimization problem.
The constraints for this optimization problem are straightforward: $\|x_i\|_1 \leq L^{\mathrm{init}}$,
$i = 1, \ldots, T - 1$, $x_0 = x_0^\star$ and $x_T = x_T^\star$, where $x_0^\star$ and $x_T^\star$
are both given by maximizing the risk adjusted return (without transaction cost) subject to
the initial and final leverage limits.
(b) The total net returns for the linearly interpolated portfolio sequence, the portfolio sequence
that does all the trading at the last period, and the optimal sequence are 506.18, 520.42 and
531.38 respectively. In the following graphs, the three sequences are drawn in green, red, and
blue, in this order.

431
40 10

30 9

20 8

Net return

Leverage
10
7
0
6
10
5
20
0 5 10 15 0 5 10 15
T T

1.5 1.5 1.5

1 1 1

0.5 0.5 0.5


holdings

holdings

holdings
0 0 0

0.5 0.5 0.5

1 1 1

1.5 1.5 1.5

2 2 2
0 5 10 15 0 5 10 15 0 5 10 15
T T T

Intuitive explanation. The linearly interpolated portfolio smoothly changes its positions, so
it doesn't incur much transaction cost. On the other hand, it starts deleveraging well before
the deadline, so it loses some opportunity to improve the objective in the early steps. When
we make all trades in the last period, we get a good objective value up until then, but then
we pay for it in transaction costs in the last period, and the total objective suffers. When
you transition just right, you combine the best of both strategies: you keep the portfolio near
the optimal one for high leverage for about 10 periods, then smoothly transition to the new
portfolio, in such a way that the total objective is maximized.
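To make the comparison concrete, the objective $J$ and the two reference sequences can be evaluated with a few lines of NumPy. This is a sketch under our own assumptions: the function names and the two-asset data are invented for illustration and have nothing to do with deleveraging_data.m; a portfolio sequence is stored as an $n \times (T+1)$ array whose columns are $x_0, \ldots, x_T$.

```python
import numpy as np

def transaction_cost(u, kappa, lam):
    """phi(u) = sum_i (kappa_i |u_i| + lam_i u_i^2)."""
    return float(kappa @ np.abs(u) + lam @ u**2)

def objective_J(X, mu, Sigma, gamma, kappa, lam):
    """Total risk-adjusted return minus transaction cost over t = 1, ..., T."""
    J = 0.0
    for t in range(1, X.shape[1]):
        x, dx = X[:, t], X[:, t] - X[:, t - 1]
        J += mu @ x - gamma * (x @ Sigma @ x) - transaction_cost(dx, kappa, lam)
    return float(J)

def linear_path(x0, xT, T):
    """Columns (1 - t/T) x0 + (t/T) xT for t = 0, ..., T."""
    t = np.arange(T + 1) / T
    return np.outer(x0, 1.0 - t) + np.outer(xT, t)

def last_period_path(x0, xT, T):
    """Hold x0 for t = 0, ..., T-1, then jump to xT at t = T."""
    return np.column_stack([np.tile(x0[:, None], (1, T)), xT])

# Hypothetical two-asset instance with T = 2 periods.
mu = np.array([1.0, 0.0])
Sigma = np.eye(2)
gamma = 0.5
kappa = np.array([0.1, 0.1])
lam = np.zeros(2)
x0 = np.array([1.0, 0.0])
xT = np.zeros(2)
T = 2

X_lin = linear_path(x0, xT, T)
X_lp = last_period_path(x0, xT, T)
J_lin = objective_J(X_lin, mu, Sigma, gamma, kappa, lam)
J_lp = objective_J(X_lp, mu, Sigma, gamma, kappa, lam)
leverage_lin = np.abs(X_lin).sum(axis=0)       # ||x_t||_1 for t = 0, ..., T
print(J_lin, J_lp, leverage_lin)
```

With a purely linear transaction cost, as in this toy instance, the total trading cost is the same for both paths, so waiting until the last period wins; a quadratic term ($\lambda \succ 0$) penalizes the single large trade and can reverse the ordering.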
The following code solves the problem.
% de-leveraging
deleveraging_data;

% compute initial and final portfolio
cvx_begin quiet
    variable x0_star(n)
    maximize (mu'*x0_star - gamma*quad_form(x0_star, Sigma))
    subject to
        norm(x0_star, 1) <= Linit;
cvx_end

cvx_begin quiet
    variable xT_star(n)
    maximize (mu'*xT_star - gamma*quad_form(xT_star, Sigma))
    subject to
        norm(xT_star, 1) <= Lnew;
cvx_end

% linear interpolation
Xlin = zeros(n, T+1);
for i = 1:T+1
Xlin(:, i) = (1-(i-1)/T)*x0_star + (i-1)/T*xT_star;
end

% trading at the last period


Xlp = [repmat(x0_star, 1, T) xT_star];

% optimization
cvx_begin quiet
    variable X(n, T+1)
    expressions dX(n, T) ret trs
    dX = X(:, 2:T+1) - X(:, 1:T);
    ret = 0; trs = 0;
    for t = 2:T+1
        ret = ret + mu'*X(:, t) - gamma*quad_form(X(:, t), Sigma);
        trs = trs + kappa'*abs(dX(:, t-1)) ...
              + quad_form(dX(:, t-1), diag(lambda));
    end
    maximize(ret - trs)
    subject to
        X(:, 1) == x0_star;
        X(:, T+1) == xT_star;
        norms(X(:, 2:T), 1) <= Linit;
cvx_end

% leverage of each sequence
levOpt = norms(X, 1);
levLin = norms(Xlin, 1);
levLp = norms(Xlp, 1);

% risk-adjusted return in each period
retOpt = X'*mu - gamma*diag(X'*Sigma*X);
retLin = Xlin'*mu - gamma*diag(Xlin'*Sigma*Xlin);
retLp = Xlp'*mu - gamma*diag(Xlp'*Sigma*Xlp);

dXlin = Xlin(:, 2:T+1) - Xlin(:, 1:T);
dXlp = Xlp(:, 2:T+1) - Xlp(:, 1:T);

% transaction cost in each period
trsOpt = abs(dX)'*kappa + diag(dX'*diag(lambda)*dX);
trsLin = abs(dXlin)'*kappa + diag(dXlin'*diag(lambda)*dXlin);
trsLp = abs(dXlp)'*kappa + diag(dXlp'*diag(lambda)*dXlp);

netReturnOpt = retOpt(2:T+1) - trsOpt;
netReturnLin = retLin(2:T+1) - trsLin;
netReturnLp = retLp(2:T+1) - trsLp;

fprintf('J of x^lin : %f\n', sum(netReturnLin));
fprintf('J of x^lp : %f\n', sum(netReturnLp));
fprintf('J of x^star: %f\n', sum(netReturnOpt));

% plot
clf;
subplot(2, 3, 1.5);
plot(1:T, netReturnLin, 'g', 1:T, netReturnLp, 'r', ...
     1:T, netReturnOpt, 'b');
xlabel('T'); ylabel('Net return'); xlim([0, T]);

subplot(2, 3, 2.5);
plot(0:T, levLin, 'g', 0:T, levLp, 'r', 0:T, levOpt, 'b');
xlabel('T'); ylabel('Leverage'); axis([0 T Lnew-0.5 Linit+0.5]);

subplot(2, 3, 4); % x^lin
hold on;
for i = 1:n
    plot(0:T, Xlin(i, :), 'g');
end
hold off;
xlabel('T'); ylabel('holdings'); xlim([0 T]);

subplot(2, 3, 5); % x^lp
hold on;
for i = 1:n
    plot(0:T, Xlp(i, :), 'r');
end
hold off;
xlabel('T'); ylabel('holdings'); xlim([0 T]);

subplot(2, 3, 6); % x^star
hold on;
for i = 1:n
    plot(0:T, X(i, :), 'b');
end
hold off;
xlabel('T'); ylabel('holdings'); xlim([0 T]);

print -depsc deleveraging.eps;
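The net-return bookkeeping in the script above can also be mirrored in NumPy, which is handy for checking a candidate sequence without a solver. The following is only a sketch: the function name `net_returns` and the toy data are hypothetical (they are not taken from deleveraging_data.m), and the per-period formula is read off the Matlab code above rather than from the problem statement.

```python
import numpy as np

def net_returns(X, mu, Sigma, gamma, kappa, lam):
    """Per-period net return for holdings X of shape (n, T+1).

    Period t (t = 1..T) earns mu^T x_t - gamma x_t^T Sigma x_t and pays
    kappa^T |x_t - x_{t-1}| + sum_i lam_i (x_t - x_{t-1})_i^2.
    """
    held = X[:, 1:]                       # x_1, ..., x_T
    dX = X[:, 1:] - X[:, :-1]             # trades x_t - x_{t-1}
    ret = mu @ held - gamma * np.einsum('it,ij,jt->t', held, Sigma, held)
    trs = kappa @ np.abs(dX) + lam @ dX**2
    return ret - trs

# Hypothetical toy data, chosen only for illustration.
n, T = 3, 4
mu = np.array([0.10, 0.05, 0.02])
Sigma = np.diag([0.04, 0.02, 0.01])
gamma = 1.0
kappa = np.full(n, 0.01)
lam = np.full(n, 0.1)
x0 = np.array([2.0, -1.0, 1.0])           # stand-in for x_0^star
xT = np.array([1.0, 0.0, 0.5])            # stand-in for x_T^star

# Linear interpolation, and the sequence that trades only in the last period.
s = np.arange(T + 1) / T
X_lin = np.outer(x0, 1 - s) + np.outer(xT, s)
X_lp = np.column_stack([np.tile(x0[:, None], T), xT])

J_lin = net_returns(X_lin, mu, Sigma, gamma, kappa, lam).sum()
J_lp = net_returns(X_lp, mu, Sigma, gamma, kappa, lam).sum()
print("J(lin) =", J_lin, " J(lp) =", J_lp)
```

Which of the two baseline sequences does better depends on the data; the optimal sequence found by the solver is at least as good as both.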

13.19 Worst-case variance. Suppose $Z$ is a random variable on $\mathbf{R}^n$ with covariance matrix $\Sigma \in \mathbf{S}^n_+$. Let $c \in \mathbf{R}^n$. The variance of $Y = c^T Z$ is $\mathbf{var}(Y) = c^T \Sigma c$. We define the worst-case variance of $Y$, denoted $\mathbf{wcvar}(Y)$, as the maximum possible value of $c^T \tilde\Sigma c$, over all $\tilde\Sigma \in \mathbf{S}^n_+$ that satisfy $\tilde\Sigma_{ii} = \Sigma_{ii}$, $i = 1, \ldots, n$. In other words, the worst-case variance of $Y$ is the maximum possible variance, if we are allowed to arbitrarily change the correlations between $Z_i$ and $Z_j$. Of course we have $\mathbf{wcvar}(Y) \geq \mathbf{var}(Y)$.

(a) Find a simple expression for $\mathbf{wcvar}(Y)$ in terms of $c$ and the diagonal entries of $\Sigma$. You must justify your expression.
(b) Portfolio optimization. Explain how to find the portfolio $x \in \mathbf{R}^n$ that maximizes the expected return $\mu^T x$ subject to a limit on risk, $\mathbf{var}(r^T x) = x^T \Sigma x \leq R$, and a limit on worst-case risk $\mathbf{wcvar}(r^T x) \leq R^{\mathrm{wc}}$, where $R > 0$ and $R^{\mathrm{wc}} > R$ are given. Here $\mu = \mathbf{E}\, r$ and $\Sigma = \mathbf{E}(r - \mu)(r - \mu)^T$ are the (given) mean and covariance of the (random) return vector $r \in \mathbf{R}^n$.
(c) Carry out the method of part (b) for the problem instance with data given in
wc_risk_portfolio_opt_data.m. Also find the optimal portfolio when the worst-case risk
limit is ignored. Find the expected return and worst-case risk for these two portfolios.

Remark. If a portfolio is highly leveraged, and the correlations in the returns change drastically,
you (the portfolio manager) can be in big trouble, since you are now exposed to much more risk
than you thought you were. And yes, this (almost exactly) has happened.
Solution.

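(a) We will show that

$$\mathbf{wcvar}(Y) = \left( \sum_{i=1}^n |c_i| \sqrt{\Sigma_{ii}} \right)^2.$$

This worst-case variance value is obtained by the matrix $\Sigma^{\mathrm{wc}} = dd^T$, where $d \in \mathbf{R}^n$ is defined as

$$d_i = \mathrm{sign}(c_i) \sqrt{\Sigma_{ii}}, \quad i = 1, \ldots, n.$$

Note that $\Sigma^{\mathrm{wc}} \in \mathbf{S}^n_+$ and its diagonal entries are the same as those of $\Sigma$.
To show this (and also, to derive it) we proceed as follows. For any $\tilde\Sigma \in \mathbf{S}^n_+$ that satisfies $\tilde\Sigma_{ii} = \Sigma_{ii}$ for all $i$, we have

$$c^T \tilde\Sigma c = \sum_{i=1}^n c_i^2 \tilde\Sigma_{ii} + \sum_{i \neq j} c_i c_j \tilde\Sigma_{ij} = \sum_{i=1}^n c_i^2 \Sigma_{ii} + \sum_{i \neq j} c_i c_j \tilde\Sigma_{ij}.$$

Since $\tilde\Sigma \succeq 0$, we must have

$$|\tilde\Sigma_{ij}| \leq \sqrt{\tilde\Sigma_{ii} \tilde\Sigma_{jj}} = \sqrt{\Sigma_{ii} \Sigma_{jj}}$$

for $i \neq j$. If we maximize $c^T \tilde\Sigma c$ over the off-diagonal elements of $\tilde\Sigma$ subject to this limit, we find that the maximum occurs when

$$\tilde\Sigma_{ij} = \mathrm{sign}(c_i c_j) \sqrt{\Sigma_{ii} \Sigma_{jj}}$$

for $i \neq j$. We can re-write this as

$$\tilde\Sigma_{ij} = \sqrt{\Sigma_{ii} \Sigma_{jj}}\; \mathrm{sign}(c_i)\, \mathrm{sign}(c_j), \quad i, j = 1, \ldots, n.$$

So far, we have not taken into account the constraint that $\tilde\Sigma \succeq 0$. Fortunately, the matrix above satisfies this, since we can express it as $\tilde\Sigma = dd^T$.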
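The construction in part (a) is easy to check numerically. The sketch below uses a randomly generated covariance matrix and vector $c$ (both hypothetical, not the exercise data), and the helper name `worst_case_variance` is ours; it assumes every entry of $c$ is nonzero, so that $d_i^2 = \Sigma_{ii}$ for all $i$.

```python
import numpy as np

def worst_case_variance(c, Sigma):
    """Return (wcvar, Sigma_wc) using the formula derived in part (a).

    Assumes all entries of c are nonzero.
    """
    sd = np.sqrt(np.diag(Sigma))          # standard deviations sqrt(Sigma_ii)
    d = np.sign(c) * sd
    Sigma_wc = np.outer(d, d)             # rank-one matrix d d^T
    wcvar = (np.abs(c) @ sd) ** 2
    return wcvar, Sigma_wc

# Hypothetical random instance.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 0.1 * np.eye(4)         # a positive definite covariance
c = rng.standard_normal(4)

wcvar, Sigma_wc = worst_case_variance(c, Sigma)
print("var(Y)   =", c @ Sigma @ c)
print("wcvar(Y) =", wcvar)
```

The printed worst-case variance should never be smaller than the nominal variance, and `Sigma_wc` should share its diagonal with `Sigma`.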
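(b) Given the data $\mu$, $\Sigma$, $R$, and $R^{\mathrm{wc}}$, we solve the optimization problem

$$\begin{array}{ll} \mbox{maximize} & \mu^T x \\ \mbox{subject to} & x^T \Sigma x \leq R \\ & \sum_{i=1}^n |x_i| \sqrt{\Sigma_{ii}} \leq \sqrt{R^{\mathrm{wc}}}. \end{array}$$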

(c) The optimal portfolio with the worst-case risk limit is

$x^\star = (2.263, 0, 0, 0, 0, 0, 0, 0.698, 1.673, 0)$,

with expected return $\mu^T x^\star = 8.195$, and worst-case risk $\mathbf{wcvar}(r^T x^\star) = 10$.
The optimal portfolio with the worst-case risk limit ignored is

$x^\star = (1.625, 1.214, 0.672, 1.821, 2.395, 0.892, 0.533, 3.967, 0.565, 2.769)$,

with expected return $\mu^T x^\star = 13.512$, and worst-case risk $\mathbf{wcvar}(r^T x^\star) = 117.197$.
The following code solves the problem.
% worst-case variance
wc_risk_portfolio_opt_data;

cvx_begin quiet
    variable x(n)
    maximize (mu'*x)
    subject to
        quad_form(x, Sigma) <= R;
        norm(sqrt(diag(Sigma)).*x, 1) <= sqrt(R_wc);
cvx_end

fprintf('Portfolio with the worst-case risk limit:\n')
fprintf('  Expected return: %.3f\n', mu'*x);
fprintf('  Variance: %.3f\n', quad_form(x, Sigma));
fprintf('  Worst case risk: %.3f\n', norm(sqrt(diag(Sigma)).*x, 1)^2);
fprintf('  x = (');
for i = 1:n
    fprintf('%.3f, ', x(i));
end
fprintf(')\n\n');

cvx_begin quiet
    variable x(n)
    maximize (mu'*x)
    subject to
        quad_form(x, Sigma) <= R;
cvx_end

fprintf('Portfolio without the worst-case risk limit:\n')
fprintf('  Expected return: %.3f\n', mu'*x);
fprintf('  Variance: %.3f\n', quad_form(x, Sigma));
fprintf('  Worst case risk: %.3f\n', norm(sqrt(diag(Sigma)).*x, 1)^2);
fprintf('  x = (');
for i = 1:n
    fprintf('%.3f, ', x(i));
end
fprintf(')\n');

13.20 Risk budget allocation. Suppose an amount $x_i > 0$ is invested in each of $n$ assets, labeled $i = 1, \ldots, n$, with asset return covariance matrix $\Sigma \in \mathbf{S}^n_{++}$. We define the risk of the investments as the standard deviation of the total return, $R(x) = (x^T \Sigma x)^{1/2}$.
We define the (relative) risk contribution of asset $i$ (in the portfolio $x$) as

$$\rho_i = \frac{\partial \log R(x)}{\partial \log x_i} = \frac{\partial R(x)}{R(x)} \frac{x_i}{\partial x_i}, \quad i = 1, \ldots, n.$$

Thus $\rho_i$ gives the fractional increase in risk per fractional increase in investment $i$. We can express the risk contributions as

$$\rho_i = \frac{x_i (\Sigma x)_i}{x^T \Sigma x}, \quad i = 1, \ldots, n,$$

from which we see that $\sum_{i=1}^n \rho_i = 1$. For general $x$, we can have $\rho_i < 0$, which means that a small increase in investment $i$ decreases the risk. Desirable investment choices have $\rho_i > 0$, in which case we can interpret $\rho_i$ as the fraction of the total risk contributed by the investment in asset $i$. Note that the risk contributions are homogeneous of degree zero, i.e., scaling $x$ by a positive constant does not affect $\rho_i$.
In the risk budget allocation problem, we are given $\Sigma$ and a set of desired risk contributions $\rho_i^{\mathrm{des}} > 0$ with $\mathbf{1}^T \rho^{\mathrm{des}} = 1$; the goal is to find an investment mix $x \succ 0$, $\mathbf{1}^T x = 1$, with these risk contributions. When $\rho^{\mathrm{des}} = (1/n)\mathbf{1}$, the problem is to find an investment mix that achieves so-called risk parity.

(a) Explain how to solve the risk budget allocation problem using convex optimization.
Hint. Minimize $(1/2) x^T \Sigma x - \sum_{i=1}^n \rho_i^{\mathrm{des}} \log x_i$.

(b) Find the investment mix that achieves risk parity for the return covariance matrix

$$\Sigma = \begin{bmatrix} 6.1 & 2.9 & 0.8 & 0.1 \\ 2.9 & 4.3 & 0.3 & 0.9 \\ 0.8 & 0.3 & 1.2 & 0.7 \\ 0.1 & 0.9 & 0.7 & 2.3 \end{bmatrix}.$$

For your convenience, this is contained in risk_alloc_data.m.

Solution.
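(a) Consider the problem of minimizing $f(x)$, with

$$f(x) = (1/2) x^T \Sigma x - \sum_{i=1}^n \rho_i^{\mathrm{des}} \log x_i.$$

The optimality condition for this problem is

$$\nabla f(x) = \Sigma x - \mathbf{diag}(x)^{-1} \rho^{\mathrm{des}} = 0.$$

Hence, at the optimal point $x^\star$, we have

$$x_i^\star (\Sigma x^\star)_i = \rho_i^{\mathrm{des}}.$$

The constraint $\mathbf{1}^T \rho^{\mathrm{des}} = 1$ implies that $x^{\star T} \Sigma x^\star = \sum_{i=1}^n x_i^\star (\Sigma x^\star)_i = 1$. Therefore, we also have

$$\rho_i^{\mathrm{des}} = \frac{x_i^\star (\Sigma x^\star)_i}{x^{\star T} \Sigma x^\star}.$$

Note that the right-hand side is homogeneous of degree zero in $x$, so choosing $x = x^\star / (\mathbf{1}^T x^\star)$ will do the trick. Since the objective function is strictly convex, the minimizer is unique; hence, the optimal investment mix is unique.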
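Because the objective in part (a) is smooth and strictly convex on $x \succ 0$, it can also be minimized without a modeling package. The following is a minimal sketch using a damped Newton method in NumPy; the function names, the tolerance, and the 3-asset covariance matrix are all hypothetical choices for illustration and are not part of the exercise data.

```python
import numpy as np

def risk_budget(Sigma, rho_des, tol=1e-14, max_iter=100):
    """Minimize (1/2) x^T Sigma x - sum_i rho_i log x_i, then normalize x."""
    f = lambda z: 0.5 * z @ Sigma @ z - rho_des @ np.log(z)
    x = np.ones(len(rho_des))
    for _ in range(max_iter):
        grad = Sigma @ x - rho_des / x
        hess = Sigma + np.diag(rho_des / x**2)
        step = -np.linalg.solve(hess, grad)
        if -(grad @ step) / 2 < tol:      # half the squared Newton decrement
            break
        t = 1.0
        # Backtrack until x stays positive and the objective decreases enough.
        while np.any(x + t * step <= 0) or \
                f(x + t * step) > f(x) + 0.25 * t * (grad @ step):
            t *= 0.5
        x = x + t * step
    return x / x.sum()

def risk_contributions(x, Sigma):
    """Relative risk contributions x_i (Sigma x)_i / (x^T Sigma x)."""
    return x * (Sigma @ x) / (x @ Sigma @ x)

# Hypothetical 3-asset covariance matrix.
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 2.0, 0.3],
                  [0.5, 0.3, 1.0]])
rho_des = np.ones(3) / 3                  # risk parity
x_hat = risk_budget(Sigma, rho_des)
print("mix:", x_hat)
print("risk contributions:", risk_contributions(x_hat, Sigma))
```

The printed risk contributions should all be close to 1/3, and passing a different `rho_des` (positive, summing to one) gives the corresponding risk budget allocation.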
(b) The following code implements the solution:
risk_alloc_data
n = size(Sigma,1);
rho = ones(n,1)/n;

cvx_begin
variable x(n)
minimize (1/2*quad_form(x,Sigma) - log( geo_mean(x)))
% could also use:
% minimize (1/2*quad_form(x,Sigma) - rho'*log(x))
cvx_end
x_hat = x/sum(x);

% verify risk parity


x_hat.*(Sigma*x_hat)./quad_form(x_hat,Sigma)

% compare with uniform allocation


u = ones(n,1)/n;
u.*(Sigma*u)./quad_form(u,Sigma)

The optimal investment mix is $\hat x = (0.1377, 0.1134, 0.4759, 0.2731)$.

13.21 Portfolio rebalancing. We consider the problem of rebalancing a portfolio of assets over multiple periods. We let $h_t \in \mathbf{R}^n$ denote the vector of our dollar value holdings in $n$ assets, at the beginning of period $t$, for $t = 1, \ldots, T$, with negative entries meaning short positions. We will work with the portfolio weight vector, defined as $w_t = h_t / (\mathbf{1}^T h_t)$, where we assume that $\mathbf{1}^T h_t > 0$, i.e., the total portfolio value is positive.
The target portfolio weight vector $w^\star$ is defined as the solution of the problem

$$\begin{array}{ll} \mbox{maximize} & \mu^T w - \frac{\gamma}{2} w^T \Sigma w \\ \mbox{subject to} & \mathbf{1}^T w = 1, \end{array}$$

where $w \in \mathbf{R}^n$ is the variable, $\mu$ is the mean return, $\Sigma \in \mathbf{S}^n_{++}$ is the return covariance, and $\gamma > 0$ is the risk aversion parameter. The data $\mu$, $\Sigma$, and $\gamma$ are given. In words, the target weights maximize the risk-adjusted expected return.
At the beginning of each period $t$ we are allowed to rebalance the portfolio by buying and selling assets. We call the post-trade portfolio weights $\tilde w_t$. They are found by solving the (rebalancing) problem

$$\begin{array}{ll} \mbox{maximize} & \mu^T w - \frac{\gamma}{2} w^T \Sigma w - \kappa^T |w - w_t| \\ \mbox{subject to} & \mathbf{1}^T w = 1, \end{array}$$

with variable $w \in \mathbf{R}^n$, where $\kappa \in \mathbf{R}^n_+$ is the vector of (so-called linear) transaction costs for the assets. (For example, these could model bid/ask spread.) Thus, we choose the post-trade weights to maximize the risk-adjusted expected return, minus the transaction costs associated with rebalancing the portfolio. Note that the pre-trade weight vector $w_t$ is known at the time we solve the problem. If we have $\tilde w_t = w_t$, it means that no rebalancing is done at the beginning of period $t$; we simply hold our current portfolio. (This happens if $w_t = w^\star$, for example.)
After holding the rebalanced portfolio over the investment period, the dollar value of our portfolio becomes $h_{t+1} = \mathbf{diag}(r_t) \tilde h_t$, where $r_t \in \mathbf{R}^n_{++}$ is the (random) vector of asset returns over period $t$, and $\tilde h_t$ is the post-trade portfolio given in dollar values (which you do not need to know). The next weight vector is then given by

$$w_{t+1} = \frac{\mathbf{diag}(r_t) \tilde w_t}{r_t^T \tilde w_t}.$$

(If $r_t^T \tilde w_t \leq 0$, which means our portfolio has negative value after the investment period, we have gone bust, and all trading stops.) The standard model is that $r_t$ are IID random variables with mean and covariance $\mu$ and $\Sigma$, but this is not relevant in this problem.

(a) No-trade condition. Show that $\tilde w_t = w_t$ is optimal in the rebalancing problem if

$$\gamma |\Sigma (w_t - w^\star)| \preceq \kappa$$

holds, where the absolute value on the left is elementwise.
Interpretation. The lefthand side measures the deviation of $w_t$ from the target portfolio $w^\star$; when this deviation is smaller than the cost of trading, you do not rebalance.
Hint. Find dual variables that, together with $w = w_t$, satisfy the KKT conditions for the rebalancing problem.

(b) Starting from $w_1 = w^\star$, compute a sequence of portfolio weights $\tilde w_t$ for $t = 1, \ldots, T$. For each $t$, find $\tilde w_t$ by solving the rebalancing problem (with $w_t$ a known constant); then generate a vector of returns $r_t$ (using our supplied function) to compute $w_{t+1}$. (The sequence of weights is random, so the results won't be the same each time you run your script. But they should look similar.)
Report the fraction of periods in which the no-trade condition holds and the fraction of periods in which the solution has only zero (or negligible) trades, defined as $\|\tilde w_t - w_t\|_\infty \leq 10^{-3}$. Plot the sequence $\tilde w_t$ for $t = 1, 2, \ldots, T$.
The file portf_weight_rebalance_data.* provides the data, a function to generate a (random) vector $r_t$ of market returns, and the code to plot the sequence $\tilde w_t$. (The plotting code also draws a dot for every non-negligible trade.)
Carry this out for two values of $\kappa$, $\kappa = \kappa_1$ and $\kappa = \kappa_2$. Briefly comment on what you observe.
Hint. In CVXPY we recommend using the solver ECOS. But if you use SCS you should increase the default accuracy, by passing eps=1e-4 to the cvxpy.Problem.solve() method.

Solution.

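(a) No-trade condition. The solution $w^\star$ of the problem without transaction costs satisfies the KKT conditions

$$-\mu + \gamma \Sigma w^\star + \nu^\star \mathbf{1} = 0, \qquad \mathbf{1}^T w^\star = 1,$$

where $\nu^\star \in \mathbf{R}$ is an optimal dual variable for the constraint $\mathbf{1}^T w = 1$. (This is a set of linear equations that we can easily solve, but we don't need this fact for this problem.)
Now we derive optimality conditions for the rebalancing problem. First we express it in the form

$$\begin{array}{ll} \mbox{maximize} & \mu^T w - \frac{\gamma}{2} w^T \Sigma w - \kappa^T s \\ \mbox{subject to} & \mathbf{1}^T w = 1, \\ & w - w_t \preceq s \\ & w_t - w \preceq s, \end{array}$$

with additional (slack) variable $s \in \mathbf{R}^n$. Defining dual variables $\nu$, $\lambda_+$, and $\lambda_-$ for the three constraints, the optimality conditions are

$$\begin{array}{rcl} -\mu + \gamma \Sigma w + \nu \mathbf{1} + \lambda_+ - \lambda_- &=& 0 \\ \kappa - \lambda_+ - \lambda_- &=& 0 \\ \lambda_+ &\succeq& 0 \\ \lambda_- &\succeq& 0 \\ \lambda_+^T (w - w_t - s) &=& 0 \\ \lambda_-^T (w_t - w - s) &=& 0 \\ \mathbf{1}^T w &=& 1 \\ w - w_t &\preceq& s \\ w_t - w &\preceq& s. \end{array}$$

We want to find a condition under which $w = w_t$, $s = 0$ is optimal. (This means we do not trade.) With these choices for primal variables, the last 5 conditions hold. So we need to find dual variables so that the first 4 conditions hold. We'll use the dual variable $\nu^\star$ from the original problem, and seek $\lambda_+$ and $\lambda_-$ so that the following 4 conditions hold:

$$\begin{array}{rcl} -\mu + \gamma \Sigma w_t + \nu^\star \mathbf{1} + \lambda_+ - \lambda_- &=& 0 \\ \kappa - \lambda_+ - \lambda_- &=& 0 \\ \lambda_+ &\succeq& 0 \\ \lambda_- &\succeq& 0. \end{array}$$

The first and second conditions can be written (subtracting the optimality condition of the problem without transaction costs)

$$\gamma \Sigma (w_t - w^\star) + \lambda_+ - \lambda_- = 0, \qquad \kappa = \lambda_+ + \lambda_-.$$

This holds for some $\lambda_+ \succeq 0$ and $\lambda_- \succeq 0$ if and only if

$$\gamma |\Sigma (w_t - w^\star)| \preceq \kappa.$$

This is what we were supposed to show. Under this condition, $w = w_t$ is optimal for the rebalancing problem, i.e., we do not trade.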
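The argument above is constructive: the two equations determine the dual variables, and nonnegativity is exactly the no-trade condition. The sketch below checks this numerically with the values of gamma, Sigma, mu and kappa_1 that appear in the solution scripts for this exercise; the helper name `no_trade_duals`, the names `lam_plus`/`lam_minus`, and the two perturbed weight vectors are hypothetical choices made for illustration.

```python
import numpy as np

gamma = 8.0
Sigma = np.array([[ 1.512e-02,  1.249e-03,  2.762e-04, -5.333e-03, -7.938e-04],
                  [ 1.249e-03,  1.030e-02,  6.740e-05, -1.301e-03, -1.937e-04],
                  [ 2.762e-04,  6.740e-05,  1.001e-02, -2.877e-04, -4.283e-05],
                  [-5.333e-03, -1.301e-03, -2.877e-04,  1.556e-02,  8.271e-04],
                  [-7.938e-04, -1.937e-04, -4.283e-05,  8.271e-04,  1.012e-02]])
mu = np.array([1.02, 1.028, 1.01, 1.034, 1.017])
kappa = np.full(5, 0.002)
n = len(mu)

# Target weights from the linear KKT system:
#   gamma*Sigma*w + nu*1 = mu,   1^T w = 1.
K = np.block([[gamma * Sigma, np.ones((n, 1))],
              [np.ones((1, n)), np.zeros((1, 1))]])
sol = np.linalg.solve(K, np.append(mu, 1.0))
w_star, nu_star = sol[:n], sol[n]

def no_trade_duals(w_t):
    """Dual variables solving the two equality conditions at w = w_t, s = 0."""
    v = gamma * Sigma @ (w_t - w_star)
    lam_plus = (kappa - v) / 2
    lam_minus = (kappa + v) / 2
    return lam_plus, lam_minus

# Hypothetical pre-trade weights: a small and a large deviation from w_star.
delta = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
w_small = w_star + 1e-3 * delta
w_large = w_star + 1e-1 * delta

for name, w_t in [("small", w_small), ("large", w_large)]:
    lp, lm = no_trade_duals(w_t)
    holds = np.all(gamma * np.abs(Sigma @ (w_t - w_star)) <= kappa)
    print(name, "deviation: no-trade condition holds =", holds,
          "| duals nonnegative =", bool(np.all(lp >= 0) and np.all(lm >= 0)))
```

In both cases the two printed flags agree, which is the "if and only if" in the last step of the derivation.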
(b) We report the average plus or minus one standard deviation of the results, computed over multiple runs of the solution script. (The results vary since the returns are generated randomly.) With $\kappa = \kappa_1$, the no-trade condition holds $(10 \pm 3)\%$ of the time and the trades are negligible $(17 \pm 4)\%$ of the time. With $\kappa = \kappa_2$ the no-trade condition holds $(27 \pm 6)\%$ of the time and the trades are negligible $(38 \pm 7)\%$ of the time. We observe that the no-trade condition holds in fewer periods than the number of periods with negligible trades. This is as expected, since the no-trade condition is a sufficient but not necessary condition for having negligible trades. We also observe that with transaction costs $\kappa_2$ we rebalance less than with $\kappa_1$, since $\kappa_2 \succeq \kappa_1$.
The plots show that the weights $w_t$ deviate randomly from $w^\star$ and the trades tend to bring them closer to $w^\star$ (that is, in fact, rebalancing). By increasing the transaction costs, $\kappa_2 \succeq \kappa_1$, we rebalance less and let the weights diverge farther from the target $w^\star$.

[Figure: post-trade weights versus period t. Top panel: optimal weights and trades with transaction cost $\kappa_1$. Bottom panel: optimal weights and trades with transaction cost $\kappa_2$.]

The following Python code solves the problem:


import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

T = 100
n = 5
gamma = 8.0
Sigma = np.array([[ 1.512e-02, 1.249e-03, 2.762e-04, -5.333e-03, -7.938e-04],
[ 1.249e-03, 1.030e-02, 6.740e-05, -1.301e-03, -1.937e-04],
[ 2.762e-04, 6.740e-05, 1.001e-02, -2.877e-04, -4.283e-05],
[ -5.333e-03, -1.301e-03, -2.877e-04, 1.556e-02, 8.271e-04],
[ -7.938e-04, -1.937e-04, -4.283e-05, 8.271e-04, 1.012e-02]])
mu = np.array([ 1.02 , 1.028, 1.01 , 1.034, 1.017])
kappa_1 = np.array([ 0.002, 0.002, 0.002, 0.002, 0.002])
kappa_2 = np.array([ 0.004, 0.004, 0.004, 0.004, 0.004])
threshold = 0.001

## Solve target weights problem
w = cvx.Variable(n)
cvx.Problem(cvx.Maximize(w.T*mu - (gamma/2.)*cvx.quad_form(w, Sigma)),
[cvx.sum_entries(w) == 1]).solve()
w_star = w.value.A1

generateReturns = lambda: np.random.multivariate_normal(mu,Sigma)

## Generate market scenario


plt.figure(figsize=(13,10))
for i, kappa in [(1,kappa_1), (2,kappa_2)]:
    ws = np.zeros((T,n))
    us = np.zeros((T,n))
    no_trade_cond = np.zeros(T)
    w_t = w_star
    for t in range(T):
        # check if no-trade condition holds
        no_trade_cond[t] = max(gamma*np.abs(np.dot(Sigma, w_t - w_star)) -
                               kappa) <= 0
        w = cvx.Variable(n)
        cvx.Problem(cvx.Maximize(w.T*mu -
                                 (gamma/2.)*cvx.quad_form(w, Sigma) -
                                 cvx.sum_entries(cvx.abs(w - w_t).T*kappa)),
                    [cvx.sum_entries(w) == 1]).solve()
        ws[t,:] = w.value.A1
        us[t,:] = w.value.A1 - w_t
        # apply the random returns and renormalize to get the next weights
        w_t = w.value.A1 * generateReturns()
        w_t /= sum(w_t)

    neglig_trades = np.max(np.abs(us),1) < threshold

    print("The no-trade condition holds %.1f%% of the times." % (
        sum(no_trade_cond)*100./T))
    print("The optimal solution has neglig. trades %.1f%% of the times." % (
        sum(neglig_trades)*100./T))

    plt.subplot(210+i)
    colors = ['b','r','g','c','m']
    for j in range(n):
        plt.plot(range(T), ws[:,j], colors[j])
        plt.plot(range(T), [w_star[j]]*T, colors[j]+'--')
        non_zero_trades = abs(us[:,j]) > threshold
        plt.plot(np.arange(T)[non_zero_trades],
                 ws[non_zero_trades, j], colors[j]+'o')
    plt.ylabel('post-trade weights')
    plt.xlabel('period $t$')
    plt.title(r'Optimal weights and trades with transaction cost $\kappa_%d$'%i)
plt.savefig("portfolio_weights.eps")
The following Matlab code solves the problem:
T = 100;
n = 5;
gamma = 8.0;
threshold = 0.001;
Sigma = [[ 1.512e-02, 1.249e-03, 2.762e-04, -5.333e-03, -7.938e-04],
[ 1.249e-03, 1.030e-02, 6.740e-05, -1.301e-03, -1.937e-04],
[ 2.762e-04, 6.740e-05, 1.001e-02, -2.877e-04, -4.283e-05],
[ -5.333e-03, -1.301e-03, -2.877e-04, 1.556e-02, 8.271e-04],
[ -7.938e-04, -1.937e-04, -4.283e-05, 8.271e-04, 1.012e-02]];
mu = [ 1.02 , 1.028, 1.01 , 1.034, 1.017];
kappa_1 = [ 0.002, 0.002, 0.002, 0.002, 0.002];
kappa_2 = [ 0.004, 0.004, 0.004, 0.004, 0.004];

% Solve target weights problem
cvx_begin quiet
    variable w_star(n)
    maximize(mu*w_star - (gamma/2.)*w_star'*Sigma*w_star)
    subject to
        sum(w_star) == 1
cvx_end

generateReturns = @() mu + randn(1, length(mu)) * chol(Sigma);

kappas = {kappa_1, kappa_2};

figure('position', [0, 0, 900, 800]);
for i = 1:2
    kappa = kappas{i};
    ws = zeros(T,n);
    us = zeros(T,n);
    no_trade_cond = zeros(T,1);
    w_t = w_star;
    for t = 1:T
        % check if no-trade condition holds
        no_trade_cond(t) = max(gamma*abs(Sigma*(w_t - w_star))-kappa') <= 0;

        cvx_begin quiet
            variable w_tilde(n)
            maximize(mu*w_tilde - (gamma/2.)*w_tilde'*Sigma*w_tilde - ...
                     kappa*(abs(w_tilde - w_t)))
            subject to
                sum(w_tilde) == 1
        cvx_end
        ws(t,:) = w_tilde;
        us(t,:) = (w_tilde - w_t);
        % apply the random returns and renormalize to get the next weights
        w_t = w_tilde.*generateReturns()';
        w_t = w_t/sum(w_t);
    end

    neglig_trades = max(abs(us')) < threshold;

    fprintf('The no-trade condition holds %f%% of the times.\n', ...
            sum(no_trade_cond)*100./T);
    fprintf(['The optimal solution has neglig. ' ...
             'trades %f%% of the times.\n'], ...
            sum(neglig_trades)*100./T);
    colors = ['b','r','g','c','m'];
    range = 1:T;
    subplot(2,1,i)
    for j = 1:n
        hold on
        plot(range, ws(:,j), ['-' colors(j)]);
        plot(range, ones(T,1)*w_star(j), ['--' colors(j)]);
        non_zero_trades = abs(us(:,j)) > threshold;
        plot(range(non_zero_trades), ws(non_zero_trades,j), ...
             ['o' colors(j)], 'MarkerFaceColor', colors(j));
        xlabel('Period t');
        ylabel('Post-trade weights w_t tilde');
        title(['Optimal weights and trades with transaction cost kappa_' ...
               num2str(i)]);
    end
end
print -depsc portfolio_weights
The following Julia code solves the problem:
# data and starter code for multiperiod portfolio rebalancing problem
T = 100;
n = 5;
gamma = 8.0;
threshold = 0.001;
Sigma = [[ 1.512e-02 1.249e-03 2.762e-04 -5.333e-03 -7.938e-04]
[ 1.249e-03 1.030e-02 6.740e-05 -1.301e-03 -1.937e-04]
[ 2.762e-04 6.740e-05 1.001e-02 -2.877e-04 -4.283e-05]
[ -5.333e-03 -1.301e-03 -2.877e-04 1.556e-02 8.271e-04]
[ -7.938e-04 -1.937e-04 -4.283e-05 8.271e-04 1.012e-02]];
mu = [ 1.02 , 1.028, 1.01 , 1.034, 1.017];
kappa_1 = [ 0.002, 0.002, 0.002, 0.002, 0.002];
kappa_2 = [ 0.004, 0.004, 0.004, 0.004, 0.004];

using Distributions, Convex, SCS, PyPlot
solver = SCSSolver(verbose=0)

# compute target weights


w_star = Variable(n);
problem = maximize(mu'*w_star - (gamma/2.)*quad_form(w_star,Sigma),
sum(w_star) == 1);
solve!(problem, solver);
w_star = w_star.value;

generateReturns() = rand(MvNormal(mu, Sigma));

kappas = {kappa_1, kappa_2};


figure(figsize=(13,10));
for i = 1:2
kappa = kappas[i];
us = zeros(T,n);
ws = zeros(T,n);
no_trade_cond = zeros(T);
w_t = w_star;
for t = 1:T
# check if no-trade condition holds
no_trade_cond[t] = maximum(gamma*abs(Sigma*(w_t - w_star))
- kappa) <= 0;
w_tilde = Variable(n);
problem = maximize(mu'*w_tilde - (gamma/2.)*quad_form(w_tilde,Sigma) -
kappa'*(abs(w_tilde - w_t)), sum(w_tilde) == 1);
solve!(problem, solver);
w_tilde = w_tilde.value;
ws[t,:] = w_tilde;
us[t,:] = (w_tilde - w_t);
w_t = w_tilde.*generateReturns();
w_t = w_t/sum(w_t);
end

neglig_trades = maximum(abs(us),2) .< threshold;


@printf("The no-trade condition holds %.1f%% of the times.\n",
sum(no_trade_cond)*100/T);
@printf("The optimal solution has neglig. trades %.1f%% of the times.\n",
sum(neglig_trades)*100./T);

colors = ["b","r","g","c","m"];
subplot(210+i);
for j = 1:n
plot(1:T, ws[:,j], colors[j]);

plot(1:T, w_star[j]*ones(T), colors[j]*"--");
non_zero_trades = abs(us[:,j]) .> threshold;
plot((1:T)[non_zero_trades], ws[non_zero_trades,j], colors[j]*"o");
end
ylabel("post-trade weights");
xlabel("period \$t\$");
title(@sprintf "Opt. weights and trades with trans. cost \$\\kappa_%d\$" i);
end
savefig(@sprintf "portfolio_weights.eps")

13.22 Portfolio optimization using multiple risk models. Let $w \in \mathbf{R}^n$ be a vector of portfolio weights, where negative values correspond to short positions, and the weights are normalized such that $\mathbf{1}^T w = 1$. The expected return of the portfolio is $\mu^T w$, where $\mu \in \mathbf{R}^n$ is the (known) vector of expected asset returns. As usual we measure the risk of the portfolio using the variance of the portfolio return. However, in this problem we do not know the covariance matrix $\Sigma$ of the asset returns; instead we assume that $\Sigma$ is one of $M$ (known) covariance matrices $\Sigma^{(k)} \in \mathbf{S}^n_{++}$, $k = 1, \ldots, M$. We can think of the $\Sigma^{(k)}$ as representing $M$ different risk models, associated with $M$ different market regimes (say). For a weight vector $w$, there are $M$ different possible values of the risk: $w^T \Sigma^{(k)} w$, $k = 1, \ldots, M$. The worst-case risk, across the different models, is given by $\max_{k=1,\ldots,M} w^T \Sigma^{(k)} w$. (This is the same as the worst-case risk over all covariance matrices in the convex hull of $\Sigma^{(1)}, \ldots, \Sigma^{(M)}$.)
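The parenthetical claim holds because $w^T \Sigma w$ is linear in $\Sigma$, so over the convex hull it is a convex combination of the $M$ individual risks. A short numerical illustration follows; the covariance matrices, the weight vector, and the sampled mixing weights are hypothetical and randomly generated, not the exercise data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 4, 3

# Hypothetical risk models: random positive definite matrices.
Sigmas = []
for _ in range(M):
    A = rng.standard_normal((n, n))
    Sigmas.append(A @ A.T + 0.1 * np.eye(n))

w = np.array([0.5, 0.8, -0.4, 0.1])       # weights summing to one
risks = np.array([w @ S @ w for S in Sigmas])
worst = risks.max()

# Risk under random convex combinations of the M models.
thetas = rng.dirichlet(np.ones(M), size=1000)
hull_risks = np.array([w @ sum(th * S for th, S in zip(theta, Sigmas)) @ w
                       for theta in thetas])

print("risks per model:", risks)
print("worst-case risk:", worst)
print("largest risk over sampled hull points:", hull_risks.max())
```

No sampled point of the convex hull exceeds the worst-case risk over the $M$ vertices.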
We will choose the portfolio weights in order to maximize the expected return, adjusted by the worst-case risk, i.e., as the solution $w^\star$ of the problem

$$\begin{array}{ll} \mbox{maximize} & \mu^T w - \gamma \max_{k=1,\ldots,M} w^T \Sigma^{(k)} w \\ \mbox{subject to} & \mathbf{1}^T w = 1, \end{array}$$

with variable $w$, where $\gamma > 0$ is a given risk-aversion parameter. We call this the mean-worst-case-risk portfolio problem.

(a) Show that there exist $\gamma_1, \ldots, \gamma_M \geq 0$ such that $\sum_{k=1}^M \gamma_k = \gamma$ and the solution $w^\star$ of the mean-worst-case-risk portfolio problem is also the solution of the problem

$$\begin{array}{ll} \mbox{maximize} & \mu^T w - \sum_{k=1}^M \gamma_k w^T \Sigma^{(k)} w \\ \mbox{subject to} & \mathbf{1}^T w = 1, \end{array}$$

with variable $w$.
Remark. The result above has a beautiful interpretation: We can think of the $\gamma_k$ as allocating our total risk aversion $\gamma$ in the mean-worst-case-risk portfolio problem across the $M$ different regimes.
Hint. The values $\gamma_k$ are not easy to find: you have to solve the mean-worst-case-risk problem to get them. Thus, this result does not help us solve the mean-worst-case-risk problem; it simply gives a nice interpretation of its solution.

(b) Find the optimal portfolio weights for the problem instance with data given in multi_risk_portfolio_data.*.
Report the weights and the values of $\gamma_k$, $k = 1, \ldots, M$. Give the $M$ possible values of the risk associated with your weights, and the worst-case risk.

Solution.

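(a) We can reformulate the mean-worst-case-risk portfolio problem as

$$\begin{array}{ll} \mbox{minimize} & -\mu^T w + \gamma t \\ \mbox{subject to} & w^T \Sigma^{(k)} w \leq t, \quad k = 1, \ldots, M, \\ & \mathbf{1}^T w = 1, \end{array}$$

with variables $w \in \mathbf{R}^n$ and $t \in \mathbf{R}$. Let $\gamma_k$ be a dual variable for the $k$th inequality constraint, and $\nu$ be a dual variable for the equality constraint. The KKT conditions are then

$$\begin{array}{rcl} -\mu + \nu \mathbf{1} + 2 \sum_k \gamma_k \Sigma^{(k)} w &=& 0 \\ \gamma - \sum_k \gamma_k &=& 0 \\ \mathbf{1}^T w &=& 1 \\ w^T \Sigma^{(k)} w &\leq& t, \quad k = 1, \ldots, M \\ \gamma_k \left( w^T \Sigma^{(k)} w - t \right) &=& 0, \quad k = 1, \ldots, M \\ \gamma_k &\geq& 0, \quad k = 1, \ldots, M. \end{array}$$

Similarly, the KKT conditions for the problem

$$\begin{array}{ll} \mbox{maximize} & \mu^T w - \sum_{k=1}^M \gamma_k w^T \Sigma^{(k)} w \\ \mbox{subject to} & \mathbf{1}^T w = 1 \end{array} \qquad (58)$$

are

$$\begin{array}{rcl} -\mu + \nu \mathbf{1} + 2 \sum_k \gamma_k \Sigma^{(k)} w &=& 0 \\ \mathbf{1}^T w &=& 1, \end{array} \qquad (59)$$

where $\nu$ is a dual variable for the equality constraint. Let $(w^\star, t^\star, \gamma^\star, \nu^\star)$ be optimal for the mean-worst-case-risk portfolio problem, and choose $\gamma_k = \gamma_k^\star$, $k = 1, \ldots, M$. With this choice of the $\gamma_k$, we have that $\sum_k \gamma_k = \sum_k \gamma_k^\star = \gamma$ from the optimality conditions for the mean-worst-case-risk portfolio problem. Moreover, $(w, \nu) = (w^\star, \nu^\star)$ satisfy (59), so $w^\star$ is optimal for (58).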
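Once the $\gamma_k$ are known, conditions (59) are just a set of linear equations in $(w, \nu)$, so problem (58) can be solved with one linear solve. The sketch below illustrates this; the function name `weighted_risk_portfolio`, the random covariance matrices, the mean vector, and the chosen $\gamma_k$ are all hypothetical (in the exercise, the $\gamma_k$ come from the dual variables of the mean-worst-case-risk problem).

```python
import numpy as np

def weighted_risk_portfolio(mu, Sigmas, gammas):
    """Solve the linear KKT system (59) for given nonnegative gammas.

    Assumes sum_k gamma_k Sigma_k is positive definite.
    """
    n = len(mu)
    S = sum(g * Sk for g, Sk in zip(gammas, Sigmas))
    K = np.block([[2 * S, np.ones((n, 1))],
                  [np.ones((1, n)), np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.append(mu, 1.0))
    return sol[:n], sol[n]                # weights w and multiplier nu

# Hypothetical instance.
rng = np.random.default_rng(2)
n, M = 4, 3
Sigmas = []
for _ in range(M):
    A = rng.standard_normal((n, n))
    Sigmas.append(A @ A.T + 0.1 * np.eye(n))
mu = np.array([0.05, 0.08, 0.03, 0.06])
gammas = np.array([0.5, 0.0, 1.5])        # nonnegative, summing to gamma = 2

w, nu = weighted_risk_portfolio(mu, Sigmas, gammas)

def objective(v):
    return mu @ v - sum(g * (v @ Sk @ v) for g, Sk in zip(gammas, Sigmas))

print("weights:", w, " sum:", w.sum())
print("objective of (58):", objective(w))
```

Any other weight vector with entries summing to one gives an objective value no larger than the one printed.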
(b) The results are
weights:
0.42474
0.66427
-0.11469
1.38055
1.42423
-1.52706
-0.61401
-0.49879
-0.25407
0.11484

gamma_k values:
0.29232
0.00000
0.00000
0.46580
0.14230
0.09958

risk values:
0.12188
0.08454
0.08247
0.12188
0.12188
0.12188

worst case risk:


0.12188
The following Matlab code solves the problem.
clear all; clc
multi_risk_portfolio_data;

cvx_begin quiet
    variables t w(n)
    dual variables gamma_dual{M}
    maximize(mu'*w - gamma*t)
    subject to
        gamma_dual{1}: w'*Sigma_1*w <= t
        gamma_dual{2}: w'*Sigma_2*w <= t
        gamma_dual{3}: w'*Sigma_3*w <= t
        gamma_dual{4}: w'*Sigma_4*w <= t
        gamma_dual{5}: w'*Sigma_5*w <= t
        gamma_dual{6}: w'*Sigma_6*w <= t
        sum(w) == 1
cvx_end

% weights
w

% dual variables
disp('gamma_k values');
disp(gamma_dual{1});
disp(gamma_dual{2});
disp(gamma_dual{3});
disp(gamma_dual{4});
disp(gamma_dual{5});
disp(gamma_dual{6});

% risk values
disp('risk values');
disp(w'*Sigma_1*w);
disp(w'*Sigma_2*w);
disp(w'*Sigma_3*w);
disp(w'*Sigma_4*w);
disp(w'*Sigma_5*w);
disp(w'*Sigma_6*w);

% worst case risk
t
The following Python code solves the problem.
import numpy as np
import cvxpy as cvx

from multi_risk_portfolio_data import *

w = cvx.Variable(n)
t = cvx.Variable()
risks = [cvx.quad_form(w, Sigma) for Sigma in
         (Sigma_1, Sigma_2, Sigma_3, Sigma_4, Sigma_5, Sigma_6)]
risk_constraints = [risk <= t for risk in risks]
prob = cvx.Problem(cvx.Maximize(w.T*mu - gamma * t),
                   risk_constraints + [cvx.sum_entries(w) == 1])
prob.solve()

print('\nweights:')
print('\n'.join(['{:.5f}'.format(weight) for weight in w.value.A1]))

print('\ngamma_k values:')
print('\n'.join(['{:.5f}'.format(risk.dual_value)
                 for risk in risk_constraints]))

print('\nrisk values:')
print('\n'.join(['{:.5f}'.format(risk.value) for risk in risks]))

print('\nworst case risk:\n{:.5f}'.format(t.value))


The following Julia code solves the problem.
using Convex, SCS
set_default_solver(SCSSolver(verbose=false))

include("multi_risk_portfolio_data.jl");

w = Variable(n)
t = Variable()

risk_1 = quadform(w, Sigma_1)


risk_2 = quadform(w, Sigma_2)
risk_3 = quadform(w, Sigma_3)
risk_4 = quadform(w, Sigma_4)
risk_5 = quadform(w, Sigma_5)
risk_6 = quadform(w, Sigma_6)

p = maximize(mu'*w - gamma * t, [sum(w) == 1])

p.constraints += risk_1 <= t


p.constraints += risk_2 <= t
p.constraints += risk_3 <= t
p.constraints += risk_4 <= t
p.constraints += risk_5 <= t
p.constraints += risk_6 <= t;

solve!(p)

println("\nweights:")
println(round(w.value, 5))

println("\ngamma_k values:")
for i = 2:7
println(round(p.constraints[i].dual,5))
end

println("\nrisk values:")
for sigma in (Sigma_1, Sigma_2, Sigma_3, Sigma_4, Sigma_5, Sigma_6)
println(round(w.value'*sigma*w.value,5))
end

println("\nworst case risk:")


println(round(t.value, 5))

13.23 Computing market-clearing prices. We consider $n$ commodities or goods, with $p \in \mathbf{R}^n_{++}$ the vector of prices (per unit quantity) of them. The (nonnegative) demand for the products is a function of the prices, which we denote $D : \mathbf{R}^n \to \mathbf{R}^n$, so $D(p)$ is the demand when the product prices are $p$. The (nonnegative) supply of the products (i.e., the amounts that manufacturers are willing to produce) is also a function of the prices, which we denote $S : \mathbf{R}^n \to \mathbf{R}^n$, so $S(p)$ is the supply when the product prices are $p$. We say that the market clears if $S(p) = D(p)$, i.e., supply equals demand, and we refer to $p$ in this case as a set of market-clearing prices.
Elementary economics courses consider the special case $n = 1$, i.e., a single commodity, so supply and demand can be plotted (vertically) against the price (on the horizontal axis). It is assumed that demand decreases with increasing price, and supply increases; the market-clearing price can be found graphically, as the point where the supply and demand curves intersect. In this problem we examine some cases in which market-clearing prices (for the general case $n > 1$) can be computed using convex optimization.
We assume that the demand function is Hicksian, which means it has the form $D(p) = \nabla E(p)$, where $E : \mathbf{R}^n \to \mathbf{R}$ is a differentiable function that is concave and increasing in each argument, called the expenditure function. (While not relevant in this problem, Hicksian demand arises from a model in which consumers make purchases by maximizing a concave utility function.)
We will assume that the producers are independent, so $S(p)_i = S_i(p_i)$, $i = 1, \ldots, n$, where $S_i : \mathbf{R} \to \mathbf{R}$ is the supply function for good $i$. We will assume that the supply functions are positive and increasing on their domain $\mathbf{R}_+$.

(a) Explain how to use convex optimization to find market-clearing prices under the assumptions
given above. (You do not need to worry about technical details like zero prices, or cases in
which there are no market-clearing prices.)
(b) Compute market-clearing prices for the specific case with $n = 4$,

$$E(p) = \left( \prod_{i=1}^4 p_i \right)^{1/4},$$

$$S(p) = (0.2 p_1 + 0.5,\; 0.02 p_2 + 0.1,\; 0.04 p_3,\; 0.1 p_4 + 0.2).$$

Give the market-clearing prices and the demand and supply (which should match) at those
prices.
Hint: In CVX and CVXPY, geo_mean gives the geometric mean of the entries of a vector
argument. Julia does not yet have a vector argument geom_mean function, but you can get the
geometric mean of 4 variables a, b, c, d using geomean(geomean(a, b), geomean(c, d)).

Solution.
(a) Consider the function $L(p) = \sum_{i=1}^n \int_0^{p_i} S_i(t)\, dt$. We claim that $L(p)$ is a convex function of $p$. Indeed, each separate term $\int_0^{p_i} S_i(t)\, dt$ is convex in $p$ because it is convex in $p_i$ (its derivative is $S_i(p_i)$, which is increasing by assumption). Thus, $L(p)$ is convex in $p$, being a sum of convex functions. Moreover, note that $S(p) = \nabla L(p)$. We also know that $E(p)$ is concave in $p$ and $D(p) = \nabla E(p)$ by assumption.
We now claim that the market-clearing prices $p$ are exactly the optimal point of the following convex optimization problem:

minimize $L(p) - E(p)$,

with variable $p$. This is a convex problem because $L$ is convex and $E$ is concave. The optimal point $p^\star$ of this problem has to satisfy the condition

$$\nabla \left( L(p) - E(p) \right) \Big|_{p = p^\star} = 0.$$

Since $S(p) = \nabla L(p)$ and $D(p) = \nabla E(p)$, we conclude that $p^\star$ satisfies $S(p^\star) = D(p^\star)$, which means that $p^\star$ is the market-clearing price vector.
(b) For this problem instance we have

$$L(p) = \sum_{i=1}^4 \int_0^{p_i} S_i(t)\, dt = 0.1 p_1^2 + 0.5 p_1 + 0.01 p_2^2 + 0.1 p_2 + 0.02 p_3^2 + 0.05 p_4^2 + 0.2 p_4,$$

so all we need to do is solve the problem

minimize $L(p) - E(p)$,

with $L$ as above and $E$ as given, i.e., the geometric mean.


Solving this problem we find the following market-clearing prices:

p = (0.6363, 2.6193, 3.1589, 1.2341).

The supply and demand at these prices are:

S(p) = D(p) = (0.6272, 0.1523, 0.1263, 0.3234).
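As a cross-check that does not rely on a modeling package, the same problem can be solved with a damped Newton method, since $L(p) - E(p)$ is smooth and strictly convex for $p \succ 0$. This is only a sketch: the function names, the starting point, and the tolerances are our own choices, while the supply coefficients are those of part (b).

```python
import numpy as np

a = np.array([0.2, 0.02, 0.04, 0.1])      # slopes of the supply functions
b = np.array([0.5, 0.1, 0.0, 0.2])        # intercepts of the supply functions

def supply(p):
    return a * p + b

def demand(p):
    g = np.prod(p) ** 0.25                # E(p), the geometric mean
    return g / (4 * p)                    # gradient of E

def objective(p):
    return np.sum(0.5 * a * p**2 + b * p) - np.prod(p) ** 0.25

def clearing_prices(p0, tol=1e-14, max_iter=100):
    p = np.array(p0, dtype=float)
    for _ in range(max_iter):
        g = np.prod(p) ** 0.25
        grad = supply(p) - demand(p)
        # Hessian of L minus Hessian of the geometric mean.
        hess = np.diag(a + g / (4 * p**2)) - (g / 16) * np.outer(1 / p, 1 / p)
        step = -np.linalg.solve(hess, grad)
        if -(grad @ step) < tol:          # squared Newton decrement
            break
        t = 1.0
        # Backtrack to keep prices positive and ensure sufficient decrease.
        while np.any(p + t * step <= 0) or \
                objective(p + t * step) > objective(p) + 0.25 * t * (grad @ step):
            t *= 0.5
        p = p + t * step
    return p

p = clearing_prices(np.ones(4))
print("prices:", p)
print("supply:", supply(p))
print("demand:", demand(p))
```

The printed prices should agree with the values reported above to the displayed precision, and the supply and demand vectors should coincide.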

The following Matlab code solves the problem.


cvx_begin quiet
variable p(4)
minimize(0.1 * square(p(1)) + 0.5 * p(1) ...
+ 0.01 * square(p(2)) + 0.1 * p(2) ...
+ 0.02 * square(p(3)) ...
+ 0.05 * square(p(4)) + 0.2 * p(4) ...
- geo_mean(p))
cvx_end
fprintf('The market-clearing prices are: %f, %f, %f, %f\n', p)
fprintf('The supply at these prices: %f, %f, %f, %f\n', ...
    [0.2 * p(1) + 0.5, 0.02 * p(2) + 0.1, 0.04 * p(3), 0.1 * p(4) + 0.2])
g = geo_mean(p);
fprintf('The demand at these prices: %f, %f, %f, %f\n', ...
    [g / (4.0 * p(1)), g / (4.0 * p(2)), g / (4.0 * p(3)), g / (4.0 * p(4))])
The following Python code solves the problem.
import cvxpy as cvx
import numpy as np

p = cvx.Variable(4)
obj = 0.1 * cvx.square(p[0]) + 0.5 * p[0] \
+ 0.01 * cvx.square(p[1]) + 0.1 * p[1] \
+ 0.02 * cvx.square(p[2]) \
+ 0.05 * cvx.square(p[3]) + 0.2 * p[3] \
- cvx.geo_mean(p)

problem = cvx.Problem(cvx.Minimize(obj))
problem.solve()

# flatten to a 1-D array; works whether p.value is an ndarray or a matrix
prices = np.asarray(p.value).flatten()

print("The market-clearing prices are: {}".format(prices))
supply = [0.2 * prices[0] + 0.5, 0.02 * prices[1] + 0.1, \
0.04 * prices[2], 0.1 * prices[3] + 0.2]
print("The supply at these prices: {}".format(supply))
g = np.power(np.prod(prices), 1.0 / 4)
demand = [g / (4 * prices[i]) for i in range(4)]
print("The demand at these prices: {}".format(demand))
The following Julia code solves the problem.
using Convex, SCS
set_default_solver(SCSSolver(verbose=false))

p = Variable(4)
obj = 0.1 * square(p[1]) + 0.5 * p[1] +
0.01 * square(p[2]) + 0.1 * p[2] +
0.02 * square(p[3]) +
0.05 * square(p[4]) + 0.2 * p[4] -
geomean(geomean(p[1], p[2]), geomean(p[3], p[4]))

problem = minimize(obj)
solve!(problem)
prices = vec(p.value)
println("Market-clearing prices are: $prices")
supply = [0.2 * prices[1] + 0.5, 0.02 * prices[2] + 0.1,
0.04 * prices[3], 0.1 * prices[4] + 0.2]
println("Supply at these prices: $supply")
g = (prices[1] * prices[2] * prices[3] * prices[4])^(1.0 / 4)
demand = [g / (4 * prices[1]), g / (4 * prices[2]),
g / (4 * prices[3]), g / (4 * prices[4])]
println("Demand at these prices: $demand")

14 Mechanical and aerospace engineering
14.1 Optimal design of a tensile structure. A tensile structure is modeled as a set of $n$ masses in $\mathbf{R}^2$,
some of which are fixed, connected by a set of $N$ springs. The masses are in equilibrium, with spring
forces, connection forces for the fixed masses, and gravity balanced. (This equilibrium occurs when
the position of the masses minimizes the total energy, defined below.)
We let $(x_i, y_i) \in \mathbf{R}^2$ denote the position of mass $i$, and $m_i > 0$ its mass value. The first $p$ masses
are fixed, which means that $x_i = x_i^{\rm fixed}$ and $y_i = y_i^{\rm fixed}$, for $i = 1, \ldots, p$. The gravitational potential
energy of mass $i$ is $g m_i y_i$, where $g \approx 9.8$ is the gravitational acceleration.
Suppose spring $j$ connects masses $r$ and $s$. Its elastic potential energy is
\[
(1/2) k_j \left( (x_r - x_s)^2 + (y_r - y_s)^2 \right),
\]
where $k_j \geq 0$ is the stiffness of spring $j$.

To describe the topology, i.e., which springs connect which masses, we will use the incidence matrix
$A \in \mathbf{R}^{n \times N}$, defined as
\[
A_{ij} = \left\{ \begin{array}{rl}
1 & \mbox{head of spring $j$ connects to mass $i$} \\
-1 & \mbox{tail of spring $j$ connects to mass $i$} \\
0 & \mbox{otherwise.}
\end{array} \right.
\]

Here we arbitrarily choose a head and tail for each spring, but in fact the springs are completely
symmetric, and the choice can be reversed without any effect. (Hopefully you will discover why it
is convenient to use the incidence matrix A to specify the topology of the system.)
The total energy is the sum of the gravitational energies, over all the masses, plus the sum of the
elastic energies, over all springs. The equilibrium position of the masses is the point that minimizes
the total energy, subject to the constraints that the first $p$ positions are fixed. (In the equilibrium
position, the total force on each mass is zero.) We let $E_{\min}$ denote the total energy of the system,
in its equilibrium position. (We assume the energy is bounded below; this occurs if and only if each
mass is connected, through some set of springs with positive stiffness, to a fixed mass.)
The total energy $E_{\min}$ is a measure of the stiffness of the structure, with larger $E_{\min}$ corresponding
to stiffer. (We can think of $E_{\min} = -\infty$ as an infinitely unstiff structure; in this case, at least one
mass is not even supported against gravity.)

(a) Suppose we know the fixed positions $x_1^{\rm fixed}, \ldots, x_p^{\rm fixed}$, $y_1^{\rm fixed}, \ldots, y_p^{\rm fixed}$, the mass values $m_1, \ldots, m_n$,
the spring topology $A$, and the constant $g$. You are to choose nonnegative $k_1, \ldots, k_N$, subject
to a budget constraint $\mathbf{1}^T k = k_1 + \cdots + k_N = k^{\rm tot}$, where $k^{\rm tot}$ is given. Your goal is to maximize
$E_{\min}$.
Explain how to do this using convex optimization.
(b) Carry out your method for the problem data given in tens_struct_data.m. This file defines
all the needed data, and also plots the equilibrium configuration when the stiffness is evenly
distributed across the springs (i.e., $k = (k^{\rm tot}/N)\mathbf{1}$).
Report the optimal value of $E_{\min}$. Plot the optimized equilibrium configuration, and compare
it to the equilibrium configuration with evenly distributed stiffness. (The code for doing this
is in the file tens_struct_data.m, but commented out.)

Solution.

(a) $A^T x$ gives a vector of the $x$-displacements of the springs, and $A^T y$ gives a vector of the
$y$-displacements of the springs.
The energy as a function of $x$, $y$, and $k$ is given by
\[
E(x, y, k) = (1/2) x^T A \,\mathbf{diag}(k) A^T x + (1/2) y^T A \,\mathbf{diag}(k) A^T y + c^T y,
\]
where $c_i = g m_i$. This is an affine function of $k$. Thus the minimum energy is the minimum of
this function over all $x$ and $y$ that satisfy the fixed constraints, i.e., a pointwise minimum of
affine functions of $k$. It follows immediately that $E_{\min}$ is a concave function of $k$. That's good,
since we want to maximize it.
We'll need an explicit formula for $E_{\min}$. To do this we partition $A$ into $A_1$ and $A_2$, where
$A_1 \in \mathbf{R}^{p \times N}$ is made up of the first $p$ rows of $A$, while $A_2 \in \mathbf{R}^{(n-p) \times N}$ is made up of the last
$n - p$ rows of $A$.
Let $\bar x, \bar y, \bar c \in \mathbf{R}^{n-p}$ denote the last $n - p$ (i.e., free) elements of $x$, $y$ and $c$ respectively. The
minimum energy can be written as
\[
E_{\min}(k) = \min_{\bar x, \bar y} \left( (1/2) z_x^T \,\mathbf{diag}(k) z_x + (1/2) z_y^T \,\mathbf{diag}(k) z_y + \bar c^T \bar y + C \right),
\]
where $C = \sum_{i=1}^p g m_i y_i^{\rm fixed}$ and
\[
z_x = A_2^T \bar x + b_x, \qquad
z_y = A_2^T \bar y + b_y, \qquad
b_x = A_1^T x^{\rm fixed}, \qquad
b_y = A_1^T y^{\rm fixed}.
\]
Thus to evaluate $E_{\min}$ we need to evaluate the minimum of an unconstrained quadratic in $\bar x$
and $\bar y$. This gives us
\[
E_{\min}(k) = (1/2)(b_x^T D b_x - v_x^T Q^{-1} v_x) + (1/2)(b_y^T D b_y - v_y^T Q^{-1} v_y) + C,
\]
where
\[
D = \mathbf{diag}(k), \qquad
Q = A_2 D A_2^T, \qquad
v_x = A_2 D b_x, \qquad
v_y = A_2 D b_y + \bar c.
\]

Note that all these terms are affine in k.
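
The closed-form expression for $E_{\min}(k)$ can be checked numerically. The sketch below is only an illustration on a made-up instance (the sizes, incidence matrix, masses and fixed positions are all hypothetical, not the data in tens_struct_data.m). It compares the formula with the energy evaluated at the minimizing positions, and spot-checks midpoint concavity in $k$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small instance: n masses, the first p fixed, N springs.
n, p, N = 5, 2, 7
edges = [(0, 2), (1, 2), (2, 3), (3, 4), (1, 4), (0, 3), (0, 4)]
A = np.zeros((n, N))
for j, (head, tail) in enumerate(edges):
    A[head, j], A[tail, j] = 1.0, -1.0

g, m = 9.8, rng.uniform(0.5, 1.5, n)
c = g * m
x_fixed, y_fixed = np.array([0.0, 1.0]), np.array([0.0, 0.5])
A1, A2, cbar = A[:p], A[p:], c[p:]
bx, by = A1.T @ x_fixed, A1.T @ y_fixed
C = c[:p] @ y_fixed


def emin_closed_form(k):
    """E_min(k) from the formula with D, Q, v_x, v_y."""
    D = np.diag(k)
    Q = A2 @ D @ A2.T
    vx, vy = A2 @ D @ bx, A2 @ D @ by + cbar
    return (0.5 * (bx @ D @ bx - vx @ np.linalg.solve(Q, vx))
            + 0.5 * (by @ D @ by - vy @ np.linalg.solve(Q, vy)) + C)


def emin_direct(k):
    """Total energy E(x, y, k) evaluated at the equilibrium positions."""
    D = np.diag(k)
    Q = A2 @ D @ A2.T
    x = np.concatenate([x_fixed, -np.linalg.solve(Q, A2 @ D @ bx)])
    y = np.concatenate([y_fixed, -np.linalg.solve(Q, A2 @ D @ by + cbar)])
    S = A @ D @ A.T
    return 0.5 * x @ S @ x + 0.5 * y @ S @ y + c @ y


k1, k2 = rng.uniform(0.5, 2.0, N), rng.uniform(0.5, 2.0, N)
gap = abs(emin_closed_form(k1) - emin_direct(k1))
mid = emin_closed_form(0.5 * (k1 + k2))
avg = 0.5 * (emin_closed_form(k1) + emin_closed_form(k2))
print("closed form vs direct evaluation:", gap)
print("midpoint concavity holds:", mid >= avg - 1e-9)
```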


We can therefore write down the problem of re-allocating stiffness as
\[
\begin{array}{ll}
\mbox{maximize} & b_x^T D b_x - v_x^T Q^{-1} v_x + b_y^T D b_y - v_y^T Q^{-1} v_y \\
\mbox{subject to} & \mathbf{1}^T k = k^{\rm tot}, \quad k \succeq 0.
\end{array}
\]
This is a convex optimization problem. The constraints are obviously convex in $k$. The
objective has two terms which are affine (and therefore concave) in $k$ and two terms which
are the negatives of matrix fractionals of affine terms in $k$ (and thus also concave). Thus the
objective is concave in $k$.
Here is another solution to this problem. The stiffness allocation problem is
\[
\begin{array}{ll}
\mbox{maximize} & E_{\min}(k) \\
\mbox{subject to} & \mathbf{1}^T k = k^{\rm tot}, \quad k \succeq 0.
\end{array}
\]
Let $(x^\star, y^\star)$ be the equilibrium mass coordinates for stiffness $k$ and let $\nu$ be the Lagrange
multiplier associated with the equality constraint on $k$. Then the Lagrangian of the stiffness
allocation problem is
\[
\begin{array}{rcl}
L(k, \nu) & = & (1/2)(x^{\star T} A \,\mathbf{diag}(k) A^T x^\star + y^{\star T} A \,\mathbf{diag}(k) A^T y^\star) + c^T y^\star + \nu (\mathbf{1}^T k - k^{\rm tot}) \\
& = & \sum_j k_j \left( (1/2)(x^{\star T} a_j a_j^T x^\star + y^{\star T} a_j a_j^T y^\star) + \nu \right) + c^T y^\star - \nu k^{\rm tot},
\end{array}
\]
where $a_j$ is the $j$th column of $A$. However, this is the same as the Lagrangian of the following
optimization problem
\[
\begin{array}{ll}
\mbox{minimize} & c^T y - \nu k^{\rm tot} \\
\mbox{subject to} & (1/2) x^T a_j a_j^T x + (1/2) y^T a_j a_j^T y + \nu \leq 0, \quad j = 1, \ldots, N \\
& x_i = x_i^{\rm fixed}, \quad i = 1, \ldots, p \\
& y_i = y_i^{\rm fixed}, \quad i = 1, \ldots, p.
\end{array}
\]
The value of this optimization problem is equal to the optimal $E_{\min}$. The optimal spring
stiffness $k^\star$ is equal to the optimal Lagrange multipliers corresponding to the inequality constraints.
Here is something that doesn't work (but was proposed by a large number of people): alternating
between minimizing over $x$ and $y$ and maximizing over $k$. Some people argued that this
would converge to a local optimum of the problem, thus a global optimum since the problem
is convex. However, this is not true.
(b) The following code solves the stiffness re-allocation problem.
tens_struct_data;

c = g*m;
% Optimum stiffness allocation problem
A1 = A(1:p,:); A2 = A(p+1:n,:); cbar = c(p+1:n);
cvx_begin
variable k(N)
D = diag(k);
bx = A1'*x_fixed;
by = A1'*y_fixed;
vx = A2*D*bx;
vy = A2*D*by+cbar;
maximize(bx'*D*bx-matrix_frac(vx,A2*D*A2')+...
by'*D*by-matrix_frac(vy,A2*D*A2'))
subject to
k >= 0; sum(k) == k_tot;
cvx_end

% Compute Emin
Eunif = 0.5*x_unif'*A*diag(k_unif)*A'*x_unif;
Eunif = Eunif + 0.5*y_unif'*A*diag(k_unif)*A'*y_unif;
Eunif = Eunif + c'*y_unif
Emin = 0.5*cvx_optval+c(1:p)'*y_fixed

% Form x and y
xmin = -(A2*D*A2')\(A2*D*A1'*x_fixed);
ymin = -(A2*D*A2')\(A2*D*A1'*y_fixed+cbar);

xopt = zeros(n,1);
xopt(1:p) = x_fixed;
xopt(p+1:n) = xmin;

yopt = zeros(n,1);
yopt(1:p) = y_fixed;
yopt(p+1:n) = ymin;

% Plot optimized structure and structure with uniform stiffness

figure
ind_ex = find(k_unif < 1e-2); %do not show springs with k < 1e-2
Aadj = A(:,setdiff(1:N,ind_ex));
Aadj2 = double(Aadj*Aadj'-diag(diag(Aadj*Aadj')) ~= 0);
gplot(Aadj2,[x_unif y_unif],'o-');
hold on
plot(x_fixed,y_fixed,'ro');
xlabel('x'); ylabel('y');
axis([0.1 0.8 -1.5 1])

figure
ind_ex = find(k < 1e-2); %do not show springs with k < 1e-2
Aadj = A(:,setdiff(1:N,ind_ex));
Aadj2 = double(Aadj*Aadj'-diag(diag(Aadj*Aadj')) ~= 0);
gplot(Aadj2,[xopt yopt],'o-');
hold on
plot(x_fixed,y_fixed,'ro');
xlabel('x'); ylabel('y');
axis([0.1 0.8 -1.5 1])
hold off
The optimal energy is Emin (k ? ) = 57.84. On the other hand the minimum energy is 18.37
when stiffness is uniformly allocated. The mass configurations for both uniform and optimal
stiffness allocations are shown in figure 15. Springs with stiffness less than $10^{-2}$ are not shown.

14.2 Equilibrium position of a system of springs. We consider a collection of $n$ masses in $\mathbf{R}^2$, with
locations $(x_1, y_1), \ldots, (x_n, y_n)$, and masses $m_1, \ldots, m_n$. (In other words, the vector $x \in \mathbf{R}^n$ gives
the $x$-coordinates, and $y \in \mathbf{R}^n$ gives the $y$-coordinates, of the points.) The masses $m_i$ are, of course,
positive.
For $i = 1, \ldots, n-1$, mass $i$ is connected to mass $i + 1$ by a spring. The potential energy in the $i$th
spring is a function of the (Euclidean) distance $d_i = \|(x_i, y_i) - (x_{i+1}, y_{i+1})\|_2$ between the $i$th and
$(i + 1)$st masses, given by
\[
E_i = \left\{ \begin{array}{ll}
0 & d_i < l_i \\
(k_i/2)(d_i - l_i)^2 & d_i \geq l_i
\end{array} \right.
\]
where $l_i \geq 0$ is the rest length, and $k_i > 0$ is the stiffness, of the $i$th spring. The gravitational
potential energy of the $i$th mass is $g m_i y_i$, where $g$ is a positive constant. The total potential energy
of the system is therefore
\[
E = \sum_{i=1}^{n-1} E_i + g m^T y.
\]
The locations of the first and last mass are fixed. The equilibrium location of the other masses is
the one that minimizes $E$.

(a) Show how to find the equilibrium positions of the masses $2, \ldots, n-1$ using convex optimization.
Be sure to justify convexity of any functions that arise in your formulation (if it is not obvious).
The problem data are $m_i$, $k_i$, $l_i$, $g$, $x_1$, $y_1$, $x_n$, and $y_n$.
(b) Carry out your method to find the equilibrium positions for a problem with $n = 10$, $m_i = 1$,
$k_i = 10$, $l_i = 1$, $x_1 = y_1 = 0$, $x_n = y_n = 10$, with $g$ varying from $g = 0$ (no gravity) to $g = 10$
(say). Verify that the results look reasonable. Plot the equilibrium configuration for several
values of $g$.

Solution. The only tricky part here is to show that $E_i$ is a convex function of $(x, y)$. Define
$g(u) = (k_i/2)(u - l_i)^2$ for $u \geq l_i$, and $g(u) = 0$ for $u < l_i$. This function is convex and nondecreasing
on its extended domain, so
\[
E_i = g(\|(x_i, y_i) - (x_{i+1}, y_{i+1})\|_2)
\]
is a convex function of $(x, y)$. So $E$ is convex, and we can minimize it subject to equality constraints
on $(x_1, y_1)$ and $(x_n, y_n)$. The CVX code below solves the problem for the given problem instance.

% equilibrium position of mass spring systems.


n=10; % number of masses
m=ones(n,1); % mass values
l=ones(n-1,1); % spring resting lengths
k=10*ones(n-1,1); % spring stiffnesses
g=3;
x1=0; y1=0;
xn=10; yn=10;

Figure 15: Mass configuration for uniform (top) and optimal (bottom) stiffness allocation.

cvx_quiet(true);

K = 10; % number of values of g


g= linspace(0,10,K);
X = zeros(n,K); % hold optimal X positions for K values of g
Y = zeros(n,K); % hold optimal Y positions for K values of g
for i=1:K
cvx_begin
variables x(n) y(n)
x(1) == x1;
x(n) == xn;
y(1) == y1;
y(n) == yn;
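% distances between consecutive masses: 2-norm of each row of the (n-1) x 2 difference matrix
d = norms([x(2:n)-x(1:n-1), y(2:n)-y(1:n-1)], 2, 2);
% spring energy (k_i/2)*pos(d_i - l_i)^2 plus gravitational energy g*m'*y
minimize (sum((k/2).*square_pos(d-l))+g(i)*m'*y)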
cvx_end
X(:,i)=x;
Y(:,i)=y;
end

plot(X,Y)
print -depsc mass_spr_pos

The plot below shows the equilibrium positions for 10 values of g between g = 0 and g = 10. We
can see that with no gravity, the equilibrium positions are on a straight line between the fixed
endpoints; as gravity increases, we see more sag in the positions.
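
The convexity claim for the spring energy can be spot-checked numerically. The following is a minimal sketch, not part of the original solution: it uses the instance values $k_i = 10$, $l_i = 1$ and tests the convexity inequality on random pairs of points; the helper names are hypothetical.

```python
import numpy as np


def spring_energy(d, rest_len=1.0, stiffness=10.0):
    """(k/2) * max(d - l, 0)^2: zero when slack, quadratic when stretched."""
    return 0.5 * stiffness * np.maximum(d - rest_len, 0.0) ** 2


def pair_energy(z):
    """Energy of one spring as a function of z = (x_i, y_i, x_{i+1}, y_{i+1})."""
    return spring_energy(np.linalg.norm(z[:2] - z[2:]))


rng = np.random.default_rng(1)
worst = -np.inf
for _ in range(2000):
    u, v = rng.normal(scale=3.0, size=4), rng.normal(scale=3.0, size=4)
    theta = rng.uniform()
    lhs = pair_energy(theta * u + (1 - theta) * v)
    rhs = theta * pair_energy(u) + (1 - theta) * pair_energy(v)
    worst = max(worst, lhs - rhs)

# For a convex function, lhs - rhs is never positive (up to rounding).
print("largest value of lhs - rhs:", worst)
```

A random test like this cannot prove convexity; it only fails to find a counterexample, which is consistent with the composition argument given in the solution.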

[Plot of the equilibrium configurations for the 10 values of $g$.]

14.3 Elastic truss design. In this problem we consider a truss structure with $m$ bars connecting a set
of nodes. Various external forces are applied at each node, which cause a (small) displacement in
the node positions. $f \in \mathbf{R}^n$ will denote the vector of (components of) external forces, and $d \in \mathbf{R}^n$
will denote the vector of corresponding node displacements. (By "corresponding" we mean if $f_i$ is,
say, the $z$-coordinate of the external force applied at node $k$, then $d_i$ is the $z$-coordinate of the
displacement of node $k$.) The vector $f$ is called a loading or load.
The structure is linearly elastic, i.e., we have a linear relation $f = Kd$ between the vector of
external forces $f$ and the node displacements $d$. The matrix $K = K^T \succ 0$ is called the stiffness
matrix of the truss. Roughly speaking, the "larger" $K$ is (i.e., the stiffer the truss) the smaller the
node displacement will be for a given loading.
We assume that the geometry (unloaded bar lengths and node positions) of the truss is fixed; we
are to design the cross-sectional areas of the bars. These cross-sectional areas will be the design
variables $x_i$, $i = 1, \ldots, m$. The stiffness matrix $K$ is a linear function of $x$:
\[
K(x) = x_1 K_1 + \cdots + x_m K_m,
\]
where $K_i = K_i^T \succeq 0$ depend on the truss geometry. You can assume these matrices are given or
known. The total weight $W_{\rm tot}$ of the truss also depends on the bar cross-sectional areas:
\[
W_{\rm tot}(x) = w_1 x_1 + \cdots + w_m x_m,
\]
where $w_i > 0$ are known, given constants (density of the material times the length of bar $i$). Roughly
speaking, the truss becomes stiffer, but also heavier, when we increase $x_i$; there is a tradeoff between
stiffness and weight.
Our goal is to design the stiffest truss, subject to bounds on the bar cross-sectional areas and total
truss weight:
\[
l \leq x_i \leq u, \quad i = 1, \ldots, m, \qquad W_{\rm tot}(x) \leq W,
\]
where $l$, $u$, and $W$ are given. You may assume that $K(x) \succ 0$ for all feasible vectors $x$. To obtain
a specific optimization problem, we must say how we will measure the stiffness, and what model of
the loads we will use.

(a) There are several ways to form a scalar measure of how stiff a truss is, for a given load $f$. In
this problem we will use the elastic stored energy
\[
E(x, f) = \frac{1}{2} f^T K(x)^{-1} f
\]
to measure the stiffness. Maximizing stiffness corresponds to minimizing $E(x, f)$.
Show that $E(x, f)$ is a convex function of $x$ on $\{x \mid K(x) \succ 0\}$.
Hint. Use Schur complements to prove that the epigraph is a convex set.
(b) We can consider several different scenarios that reflect our knowledge about the possible
loadings $f$ that can occur. The simplest is that $f$ is a single, fixed, known loading. In more
sophisticated formulations, the loading $f$ might be a random vector with known distribution,
or known only to lie in some set $\mathcal{F}$, etc.
Show that each of the following four problems is a convex optimization problem, with $x$ as
variable.

• Design for a fixed known loading. The vector $f$ is known and fixed. The design problem
is
\[
\begin{array}{ll}
\mbox{minimize} & E(x, f) \\
\mbox{subject to} & l \leq x_i \leq u, \quad i = 1, \ldots, m \\
& W_{\rm tot}(x) \leq W.
\end{array}
\]
• Design for multiple loadings. The vector $f$ can take any of $N$ known values $f^{(i)}$, $i =
1, \ldots, N$, and we are interested in the worst-case scenario. The design problem is
\[
\begin{array}{ll}
\mbox{minimize} & \max_{i=1,\ldots,N} E(x, f^{(i)}) \\
\mbox{subject to} & l \leq x_i \leq u, \quad i = 1, \ldots, m \\
& W_{\rm tot}(x) \leq W.
\end{array}
\]
• Design for worst-case, unknown but bounded load. Here we assume the vector $f$ can take
arbitrary values in a ball $B = \{f \mid \|f\|_2 \leq \alpha\}$, for a given value of $\alpha$. We are interested
in minimizing the worst-case stored energy, i.e.,
\[
\begin{array}{ll}
\mbox{minimize} & \sup_{\|f\|_2 \leq \alpha} E(x, f) \\
\mbox{subject to} & l \leq x_i \leq u, \quad i = 1, \ldots, m \\
& W_{\rm tot}(x) \leq W.
\end{array}
\]
• Design for a random load with known statistics. We can also use a stochastic model of the
uncertainty in the load, and model the vector $f$ as a random variable with known mean
and covariance:
\[
\mathbf{E}\, f = f^{(0)}, \qquad \mathbf{E}\, (f - f^{(0)})(f - f^{(0)})^T = \Sigma.
\]
In this case we would be interested in minimizing the expected stored energy, i.e.,
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{E}\, E(x, f) \\
\mbox{subject to} & l \leq x_i \leq u, \quad i = 1, \ldots, m \\
& W_{\rm tot}(x) \leq W.
\end{array}
\]
Hint. If $v$ is a random vector with zero mean and covariance $\Sigma$, then $\mathbf{E}\, v^T A v = \mathbf{E} \mathop{\bf tr} A v v^T =
\mathop{\bf tr} A \,\mathbf{E}\, v v^T = \mathop{\bf tr} A \Sigma$.
(c) Formulate the four problems in (b) as semidefinite programming problems.

Solution. There are several correct answers for each subproblem. We give only one or two.

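(a) The epigraph of $E(x, f)$ is
\[
\left\{ (x, t) \;\Big|\; \frac{1}{2} f^T K(x)^{-1} f \leq t, \; K(x) \succ 0 \right\}.
\]
Using Schur complements we can express this as the solution set of the linear matrix inequality
\[
\left[ \begin{array}{cc} K(x) & f \\ f^T & 2t \end{array} \right] \succeq 0,
\]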

i.e., as a convex set.
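
The Schur complement equivalence can be illustrated numerically. This is a sketch under stated assumptions: a random positive definite matrix stands in for $K(x)$, and the helper `lmi_holds` is a hypothetical name. It checks that the block matrix is positive semidefinite for $t$ slightly above the stored energy and fails to be for $t$ slightly below it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.normal(size=(n, n))
K = B @ B.T + n * np.eye(n)        # positive definite stand-in for K(x)
f = rng.normal(size=n)
energy = 0.5 * f @ np.linalg.solve(K, f)   # (1/2) f^T K^{-1} f


def lmi_holds(t, tol=1e-9):
    """True when [[K, f], [f^T, 2t]] is positive semidefinite (up to tol)."""
    block = np.block([[K, f[:, None]],
                      [f[None, :], np.array([[2.0 * t]])]])
    return np.linalg.eigvalsh(block).min() >= -tol


above = lmi_holds(energy + 1e-3)   # t above the energy: LMI should hold
below = lmi_holds(energy - 1e-3)   # t below the energy: LMI should fail
print("LMI holds for t > E:", above)
print("LMI holds for t < E:", below)
```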

(b) • Fixed known load. The objective function is convex (see part a). The constraints are
linear inequalities.
• Multiple loads. The objective function is the pointwise maximum of $N$ convex functions,
hence convex.
• Unknown but bounded load. The objective function is the pointwise supremum of an
(infinite) number of convex functions, and hence convex.
• Random load. Writing $f_0$ for the mean $f^{(0)}$, we have
\[
\begin{array}{rcl}
\mathbf{E}\, f^T K(x)^{-1} f & = & f_0^T K(x)^{-1} f_0 + \mathbf{E}\, (f - f_0)^T K(x)^{-1} (f - f_0) \\
& = & f_0^T K(x)^{-1} f_0 + \mathop{\bf tr} K(x)^{-1} \Sigma.
\end{array}
\]
We have already shown that the first term in the sum is convex. To show convexity of the
second term, we write the eigenvalue decomposition of $\Sigma$ as $\Sigma = \sum_{i=1}^n \lambda_i q_i q_i^T$. ($\lambda_i \geq 0$,
since a covariance matrix is always positive semidefinite.) Then we can write
\[
\begin{array}{rcl}
\mathop{\bf tr} K(x)^{-1} \Sigma & = & \mathop{\bf tr} K(x)^{-1} \sum_{i=1}^n \lambda_i q_i q_i^T \\
& = & \sum_{i=1}^n \lambda_i \mathop{\bf tr} K(x)^{-1} q_i q_i^T = \sum_{i=1}^n \lambda_i q_i^T K(x)^{-1} q_i,
\end{array}
\]

i.e., a nonnegative sum of convex functions of x.
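
The two identities used for the random load can be verified on a small example. The sketch below assumes a load that takes finitely many values with equal probability (a hypothetical distribution chosen so that the expectation is an exact average), with a random positive definite matrix standing in for $K(x)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
B = rng.normal(size=(n, n))
K_inv = np.linalg.inv(B @ B.T + n * np.eye(n))   # stand-in for K(x)^{-1}

# Hypothetical load: each row of F occurs with equal probability.
F = rng.normal(size=(50, n)) + np.array([1.0, -2.0, 0.5, 0.0])
f0 = F.mean(axis=0)                       # mean load
Sigma = (F - f0).T @ (F - f0) / len(F)    # covariance of the load

# E f^T K^{-1} f computed directly, and via the mean/covariance formula.
expected = np.mean([f @ K_inv @ f for f in F])
formula = f0 @ K_inv @ f0 + np.trace(K_inv @ Sigma)

# tr(K^{-1} Sigma) as a nonnegative weighted sum of q_i^T K^{-1} q_i.
lam, Qmat = np.linalg.eigh(Sigma)
trace_term = np.trace(K_inv @ Sigma)
eig_sum = sum(l * q @ K_inv @ q for l, q in zip(lam, Qmat.T))

print("direct expectation:", expected)
print("mean/covariance formula:", formula)
print("trace term vs eigenvalue sum:", trace_term, eig_sum)
```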


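(c) • Fixed known load.
\[
\begin{array}{ll}
\mbox{minimize} & t/2 \\
\mbox{subject to} & \left[ \begin{array}{cc} K(x) & f \\ f^T & t \end{array} \right] \succeq 0 \\
& W_{\rm tot}(x) \leq W \\
& l \preceq x \preceq u.
\end{array}
\]
• Multiple loadings.
\[
\begin{array}{ll}
\mbox{minimize} & t/2 \\
\mbox{subject to} & \left[ \begin{array}{cc} K(x) & f^{(i)} \\ f^{(i)T} & t \end{array} \right] \succeq 0, \quad i = 1, \ldots, N \\
& W_{\rm tot}(x) \leq W \\
& l \preceq x \preceq u.
\end{array}
\]
• Unknown but bounded load. Note that
\[
\begin{array}{rcl}
\sup_{\|f\|_2 \leq \alpha} \frac{1}{2} f^T K(x)^{-1} f
& = & \frac{\alpha^2}{2} \sup_{\|f\|_2 \leq 1} f^T K(x)^{-1} f \\
& = & \frac{\alpha^2}{2} \lambda_{\max}(K(x)^{-1}) \\
& = & \frac{\alpha^2}{2} \left( \lambda_{\min}(K(x)) \right)^{-1},
\end{array}
\]
where $\lambda_{\max}(K(x)^{-1})$ is the largest eigenvalue of $K(x)^{-1}$ and $\lambda_{\min}(K(x))$ is the smallest
eigenvalue of $K(x)$.
This observation leads to the SDP formulation:
\[
\begin{array}{ll}
\mbox{maximize} & t \\
\mbox{subject to} & K(x) \succeq tI \\
& W_{\rm tot}(x) \leq W \\
& l \preceq x \preceq u.
\end{array}
\]
(See page 7-9 of the lecture notes). The variables are $x$, $t$. From the optimal $t$ we obtain
the optimal value of the original problem as $\alpha^2/(2t)$.
• Random load. There are several ways to express the problem as an SDP. One possibility
is to follow our proof of convexity of the objective function, and to write the problem as
\[
\begin{array}{ll}
\mbox{minimize} & \frac{1}{2} \left( t_0 + \sum_{i=1}^n \lambda_i t_i \right) \\
\mbox{subject to} & \left[ \begin{array}{cc} K(x) & f_0 \\ f_0^T & t_0 \end{array} \right] \succeq 0 \\
& \left[ \begin{array}{cc} K(x) & q_i \\ q_i^T & t_i \end{array} \right] \succeq 0, \quad i = 1, \ldots, n \\
& W_{\rm tot}(x) \leq W \\
& l \preceq x \preceq u.
\end{array}
\]
The variables are $x$ and $t_0, t_1, \ldots, t_n$.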

14.4 A structural optimization problem. [?] The figure shows a two-bar truss with height $2h$ and width
$w$. The two bars are cylindrical tubes with inner radius $r$ and outer radius $R$. We are interested
in determining the values of $r$, $R$, $w$, and $h$ that minimize the weight of the truss subject to a
number of constraints. The structure should be strong enough for two loading scenarios. In the
first scenario a vertical force $F_1$ is applied to the node; in the second scenario the force is horizontal
with magnitude $F_2$.

[Figure: the two-bar truss.]

The weight of the truss is proportional to the total volume of the bars, which is given by
\[
2\pi (R^2 - r^2) \sqrt{w^2 + h^2}.
\]
This is the cost function in the design problem.
The first constraint is that the truss should be strong enough to carry the load $F_1$, i.e., the stress
caused by the external force $F_1$ must not exceed a given maximum value. To formulate this
constraint, we first determine the forces in each bar when the structure is subjected to the vertical
load $F_1$. From the force equilibrium and the geometry of the problem we can determine that the
magnitudes of the forces in two bars are equal and given by
\[
\frac{\sqrt{w^2 + h^2}}{2h} F_1.
\]
The maximum force in each bar is equal to the cross-sectional area times the maximum allowable
stress $\sigma$ (which is a given constant). This gives us the first constraint:
\[
\frac{\sqrt{w^2 + h^2}}{2h} F_1 \leq \sigma \pi (R^2 - r^2).
\]
The second constraint is that the truss should be strong enough to carry the load $F_2$. When $F_2$ is
applied, the magnitudes of the forces in two bars are again equal and given by
\[
\frac{\sqrt{w^2 + h^2}}{2w} F_2,
\]
which gives us the second constraint:
\[
\frac{\sqrt{w^2 + h^2}}{2w} F_2 \leq \sigma \pi (R^2 - r^2).
\]
We also impose limits $w_{\min} \leq w \leq w_{\max}$ and $h_{\min} \leq h \leq h_{\max}$ on the width and the height of the
structure, and limits $1.1 r \leq R \leq R_{\max}$ on the outer radius.
In summary, we obtain the following problem:
\[
\begin{array}{ll}
\mbox{minimize} & 2\pi (R^2 - r^2) \sqrt{w^2 + h^2} \\
\mbox{subject to} & \frac{\sqrt{w^2 + h^2}}{2h} F_1 \leq \sigma \pi (R^2 - r^2) \\
& \frac{\sqrt{w^2 + h^2}}{2w} F_2 \leq \sigma \pi (R^2 - r^2) \\
& w_{\min} \leq w \leq w_{\max} \\
& h_{\min} \leq h \leq h_{\max} \\
& 1.1 r \leq R \leq R_{\max} \\
& R > 0, \quad r > 0, \quad w > 0, \quad h > 0.
\end{array}
\]
The variables are $R$, $r$, $w$, $h$.


Formulate this as a geometric programming problem.
Solution.

We can introduce new variables
\[
u = R^2 - r^2, \qquad L = \sqrt{w^2 + h^2}
\]
and write the problem as
\[
\begin{array}{ll}
\mbox{minimize} & 2\pi u L \\
\mbox{subject to} & (F_1/(2\sigma\pi)) L h^{-1} u^{-1} \leq 1 \\
& (F_2/(2\sigma\pi)) L w^{-1} u^{-1} \leq 1 \\
& (1/w_{\max}) w \leq 1, \quad w_{\min} w^{-1} \leq 1 \\
& (1/h_{\max}) h \leq 1, \quad h_{\min} h^{-1} \leq 1 \\
& 0.21 r^2 u^{-1} \leq 1 \\
& (1/R_{\max}^2) u + (1/R_{\max}^2) r^2 \leq 1 \\
& w^2 L^{-2} + h^2 L^{-2} \leq 1 \\
& r > 0, \quad w > 0, \quad h > 0, \quad u > 0, \quad L > 0,
\end{array}
\]
with scalar variables $r$, $w$, $h$, $u$, and $L$. The desired values can be recovered from the GP by
calculating $R = \sqrt{u + r^2}$.
A geometric programming problem can only have monomial equality constraints, so we cannot add
an equality constraint $L^2 = w^2 + h^2$. Therefore we changed it to an inequality
\[
L^2 \geq w^2 + h^2,
\]
i.e., $w^2 L^{-2} + h^2 L^{-2} \leq 1$. To see why this works, notice that $L$ appears only in the objective and
the first two inequality constraints. Each of these involves an expression that is monotonically
increasing in $L$, so at the optimum $L$ will be equal to $\sqrt{w^2 + h^2}$. If this is not the case, then we
could make $L$ smaller, which maintains feasibility and strictly decreases the objective.
Also, note that we replaced the inequality $R \geq 1.1r$ with $u \geq (1.1^2 - 1) r^2 = 0.21 r^2$.
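
The change of variables can be checked on a single design point. In the sketch below all numerical data ($F_1$, $F_2$, $\sigma$, the bounds, and the candidate design) are hypothetical values chosen only for illustration. It maps a feasible design to the GP variables, evaluates the left-hand side of each GP constraint, and recovers $R$.

```python
import math

# Hypothetical problem data (not from the exercise).
F1, F2, sigma = 10.0, 20.0, 100.0
w_min, w_max, h_min, h_max, R_max = 1.0, 10.0, 1.0, 10.0, 2.0

# A candidate design in the original variables.
R, r, w, h = 1.0, 0.8, 3.0, 4.0

# Change of variables used in the GP.
u = R**2 - r**2
L = math.sqrt(w**2 + h**2)

original_cost = 2 * math.pi * (R**2 - r**2) * math.sqrt(w**2 + h**2)
gp_cost = 2 * math.pi * u * L

# Left-hand sides of the GP inequality constraints; each must be <= 1.
gp_constraints = [
    F1 / (2 * sigma * math.pi) * L / (h * u),
    F2 / (2 * sigma * math.pi) * L / (w * u),
    w / w_max, w_min / w,
    h / h_max, h_min / h,
    0.21 * r**2 / u,                      # equivalent to R >= 1.1 r
    u / R_max**2 + r**2 / R_max**2,       # equivalent to R <= R_max
    (w**2 + h**2) / L**2,                 # relaxed form of L^2 = w^2 + h^2
]

R_recovered = math.sqrt(u + r**2)
print("GP constraint values:", [round(v, 4) for v in gp_constraints])
print("recovered R:", R_recovered)
```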

14.5 Optimizing the inertia matrix of a 2D mass distribution. An object has density $\rho(z)$ at the point
$z = (x, y) \in \mathbf{R}^2$, over some region $\mathcal{R} \subset \mathbf{R}^2$. Its mass $m \in \mathbf{R}$ and center of gravity $c \in \mathbf{R}^2$ are given
by
\[
m = \int_{\mathcal{R}} \rho(z) \, dx\,dy, \qquad c = \frac{1}{m} \int_{\mathcal{R}} \rho(z) z \, dx\,dy,
\]
and its inertia matrix $M \in \mathbf{R}^{2 \times 2}$ is
\[
M = \int_{\mathcal{R}} \rho(z) (z - c)(z - c)^T \, dx\,dy.
\]
(You do not need to know the mechanics interpretation of $M$ to solve this problem, but here it is,
for those interested. Suppose we rotate the mass distribution around a line passing through the
center of gravity in the direction $q \in \mathbf{R}^2$ that lies in the plane where the mass distribution is, at
angular rate $\omega$. Then the total kinetic energy is $(\omega^2/2) q^T M q$.)
The goal is to choose the density $\rho$, subject to $0 \leq \rho(z) \leq \rho^{\max}$ for all $z \in \mathcal{R}$, and a fixed total
mass $m = m^{\rm given}$, in order to maximize $\lambda_{\min}(M)$.
To solve this problem numerically, we will discretize $\mathcal{R}$ into $N$ pixels each of area $a$, with pixel
$i$ having constant density $\rho_i$ and location (say, of its center) $z_i \in \mathbf{R}^2$. We will assume that the
integrands above don't vary too much over the pixels, and from now on use instead the expressions
\[
m = a \sum_{i=1}^N \rho_i, \qquad c = \frac{a}{m} \sum_{i=1}^N \rho_i z_i, \qquad M = a \sum_{i=1}^N \rho_i (z_i - c)(z_i - c)^T.
\]
The problem below refers to these discretized expressions.

(a) Explain how to solve the problem using convex (or quasiconvex) optimization.
(b) Carry out your method on the problem instance with data in inertia_dens_data.m. This
file includes code that plots a density. Give the optimal inertia matrix and its eigenvalues,
and plot the optimal density.

Solution.
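(a) We first express $M$ as
\[
M = a \sum_{i=1}^N \rho_i z_i z_i^T - m c c^T,
\]
so $\lambda_{\min}(M) \geq \gamma$ if and only if
\[
a \sum_{i=1}^N \rho_i z_i z_i^T - m c c^T \succeq \gamma I.
\]
Using Schur complements, and the given fixed mass, this can be written as
\[
\left[ \begin{array}{cc}
a \sum_{i=1}^N \rho_i z_i z_i^T - \gamma I & c \\
c^T & 1/m^{\rm given}
\end{array} \right] \succeq 0,
\]
which is an LMI in $\rho$, $c$, and $\gamma$.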
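
Both steps above (the rewriting of $M$ and the Schur complement form) can be checked numerically. The sketch below uses made-up pixel locations and densities, not the data in inertia_dens_data.m, and the function name `lmi_min_eig` is hypothetical. It confirms that the block matrix changes from positive definite to indefinite as $\gamma$ crosses $\lambda_{\min}(M)$.

```python
import numpy as np

rng = np.random.default_rng(5)
N, a = 30, 0.1                          # hypothetical pixel count and pixel area
Z = rng.uniform(-1, 1, size=(2, N))     # pixel centers z_i as columns
rho = rng.uniform(0, 1, size=N)         # pixel densities

m = a * rho.sum()
c = (a / m) * Z @ rho

# Inertia matrix from the definition, and from the expanded expression.
M_def = a * sum(rho[i] * np.outer(Z[:, i] - c, Z[:, i] - c) for i in range(N))
M_alt = a * Z @ np.diag(rho) @ Z.T - m * np.outer(c, c)

lam_min = np.linalg.eigvalsh(M_def).min()


def lmi_min_eig(gamma):
    """Smallest eigenvalue of the 3 x 3 block matrix in the LMI."""
    top = a * Z @ np.diag(rho) @ Z.T - gamma * np.eye(2)
    block = np.block([[top, c[:, None]],
                      [c[None, :], np.array([[1.0 / m]])]])
    return np.linalg.eigvalsh(block).min()


print("max |M_def - M_alt|:", np.abs(M_def - M_alt).max())
print("LMI min eigenvalue just below lambda_min:", lmi_min_eig(lam_min - 1e-3))
print("LMI min eigenvalue just above lambda_min:", lmi_min_eig(lam_min + 1e-3))
```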
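We can therefore express the problem as the SDP
\[
\begin{array}{ll}
\mbox{maximize} & \gamma \\
\mbox{subject to} & \left[ \begin{array}{cc}
a \sum_{i=1}^N \rho_i z_i z_i^T - \gamma I & c \\
c^T & 1/m^{\rm given}
\end{array} \right] \succeq 0 \\
& a \mathbf{1}^T \rho = m^{\rm given}, \quad 0 \preceq \rho \preceq \rho^{\max} \mathbf{1} \\
& c = (a/m^{\rm given}) \sum_{i=1}^N \rho_i z_i,
\end{array}
\]
with variables $\rho \in \mathbf{R}^N$, $c \in \mathbf{R}^2$, and $\gamma \in \mathbf{R}$.

We can also express the problem another way, without explicit LMIs. We can write the LMI
as
\[
c^T \left( a \sum_{i=1}^N \rho_i z_i z_i^T - \gamma I \right)^{-1} c \leq 1/m^{\rm given},
\]
where we assume the matrix inverted here is positive definite. This leads to the problem
\[
\begin{array}{ll}
\mbox{maximize} & \gamma \\
\mbox{subject to} & c^T \left( a \sum_{i=1}^N \rho_i z_i z_i^T - \gamma I \right)^{-1} c \leq 1/m^{\rm given} \\
& a \mathbf{1}^T \rho = m^{\rm given}, \quad 0 \preceq \rho \preceq \rho^{\max} \mathbf{1} \\
& c = (a/m^{\rm given}) \sum_{i=1}^N \rho_i z_i.
\end{array}
\]
(b) The code below solves the problem. The optimal inertia matrix is
\[
M = \left[ \begin{array}{cc} 0.6619 & 0.0001 \\ 0.0001 & 0.6619 \end{array} \right],
\]
which has two equal eigenvalues. The optimal mass distribution is shown below.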
% inertia density optimization

inertia_dens_data;

cvx_begin
variables rho(N) c(2) gamma
maximize (gamma)
[a*Z*diag(rho)*Z' - gamma*eye(2) c; c' 1/mgiven] == semidefinite(3);
a*sum(rho) == mgiven;
0 <= rho;
rho <= rhomax;
c == (a/mgiven)*Z*rho;
cvx_end
M = a*Z*diag(rho)*Z'-mgiven*c*c';

%%%%% Plot Solution %%%%%%

P = nan(n,n) ; P(ind)=rho;
pcolor(P); axis square;
colormap autumn ; colorbar ;
print -depsc inertia_dens

[Plot: optimal density over the pixel grid, with colorbar.]

14.6 Truss loading analysis. A truss (in 2D, for simplicity) consists of a set of $n$ nodes, with positions
$p^{(1)}, \ldots, p^{(n)} \in \mathbf{R}^2$, connected by a set of $m$ bars with tensions $t_1, \ldots, t_m \in \mathbf{R}$ ($t_j < 0$ means bar $j$
operates in compression).
Each bar puts a force on the two nodes which it connects. Suppose bar $j$ connects nodes $k$ and $l$.
The tension in this bar applies a force
\[
\frac{t_j}{\|p^{(l)} - p^{(k)}\|_2} (p^{(l)} - p^{(k)}) \in \mathbf{R}^2
\]
to node $k$, and the opposite force to node $l$. In addition to the forces imparted by the bars, each
node has an external force acting on it. We let $f^{(i)} \in \mathbf{R}^2$ be the external force acting on node $i$. For
the truss to be in equilibrium, the total force on each node, i.e., the sum of the external force and
the forces applied by all of the bars that connect to it, must be zero. We refer to this constraint as
force balance.
The tensions have given limits, $T_j^{\min} \leq t_j \leq T_j^{\max}$, with $T_j^{\min} \leq 0$ and $T_j^{\max} \geq 0$, for $j = 1, \ldots, m$.
(For example, if bar $j$ is a cable, then it can only apply a nonnegative tension, so $T_j^{\min} = 0$, and
we interpret $T_j^{\max}$ as the maximum tension the cable can carry.)
The first $p$ nodes, $i = 1, \ldots, p$, are free, while the remaining $n - p$ nodes, $i = p + 1, \ldots, n$, are
anchored (i.e., attached to a foundation). We will refer to the external forces on the free nodes
as load forces, and external forces at the anchor nodes as anchor forces. The anchor forces are
unconstrained. (More accurately, the foundations at these points are engineered to withstand any
total force that the bars attached to it can deliver.) We will assume that the load forces are just
dead weight, i.e., have the form
\[
f^{(i)} = \left[ \begin{array}{c} 0 \\ -w_i \end{array} \right], \quad i = 1, \ldots, p,
\]
where $w_i \geq 0$ is the weight supported at node $i$.

The set of weights $w \in \mathbf{R}^p_+$ is supportable if there exists a set of tensions $t \in \mathbf{R}^m$ and anchor forces
$f^{(p+1)}, \ldots, f^{(n)}$ that, together with the given load forces, satisfy the force balance equations and
respect the tension limits. (The tensions and anchor forces in a real truss will adjust themselves to
have such values when the load forces are applied.) If there does not exist such a set of tensions
and anchor forces, the set of load forces is said to be unsupportable. (In this case, a real truss will
fail, or collapse, when the load forces are applied.)
Finally, we get to the questions.

(a) Explain how to find the maximum total weight, $\mathbf{1}^T w$, that is supportable by the truss.
(b) Explain how to find the minimum total weight that is not supportable by the truss. (Here we
mean: Find the minimum value of $\mathbf{1}^T w$, for which $(1 + \epsilon) w$ is not supportable, for all $\epsilon > 0$.)
(c) Carry out the methods of parts (a) and (b) on the data given in truss_load_data.m. Give
the critical total weights from parts (a) and (b), as well as the individual weight vectors.

Notes.

• In parts (a) and (b), we don't need a fully formal mathematical justification; a clear argument
or explanation of anything not obvious is fine.
• The force balance equations can be expressed in the compact and convenient form
\[
At + \left[ \begin{array}{c} f^{\rm load,x} \\ f^{\rm load,y} \\ f^{\rm anch} \end{array} \right] = 0,
\]
where
\[
\begin{array}{rcl}
f^{\rm load,x} & = & (f_1^{(1)}, \ldots, f_1^{(p)}) \in \mathbf{R}^p, \\
f^{\rm load,y} & = & (f_2^{(1)}, \ldots, f_2^{(p)}) \in \mathbf{R}^p, \\
f^{\rm anch} & = & (f_1^{(p+1)}, \ldots, f_1^{(n)}, f_2^{(p+1)}, \ldots, f_2^{(n)}) \in \mathbf{R}^{2(n-p)},
\end{array}
\]
and $A \in \mathbf{R}^{2n \times m}$ is a matrix that can be found from the geometry data (truss topology and
node positions). You may refer to $A$ in your solutions to parts (a) and (b). For part (c), we
have very kindly provided the matrix $A$ for you in the m-file, to save you the time and trouble
of working out the force balance equations from the geometry of the problem.

Solution.

(a) The first step is to decompose $A$ into three submatrices
\[
A = \left[ \begin{array}{c} A_x \\ A_y \\ A_{\rm anch} \end{array} \right],
\]
(with $p$, $p$, and $2(n - p)$ rows, respectively), so force balance with the dead weight loads is
\[
A_x t = 0, \qquad A_y t = w, \qquad A_{\rm anch} t + f^{\rm anch} = 0.
\]
Since $f^{\rm anch}$ is unconstrained, the last set of force balance equations will always be satisfied,
no matter what value $t$ has (indeed, just take $f^{\rm anch} = -A_{\rm anch} t$). So we can just ignore these
equations. So the weight vector $w$ is supportable provided there exists $t$ which satisfies
\[
A_x t = 0, \qquad A_y t = w, \qquad T^{\min} \preceq t \preceq T^{\max}.
\]
To find the maximum supportable total weight, we can solve the LP
\[
\begin{array}{ll}
\mbox{maximize} & \mathbf{1}^T w \\
\mbox{subject to} & A_x t = 0, \quad A_y t = w, \quad w \succeq 0 \\
& T^{\min} \preceq t \preceq T^{\max},
\end{array}
\]
with variables $t$ and $w$. This tells us that the set of supportable weight vectors is a polyhedron,
which we'll denote as $S \subseteq \mathbf{R}^p_+$. Here we are maximizing a linear function ($\mathbf{1}^T w$) over $S$, which
is an LP.
(b) Finding the minimum unsupportable total weight is more difficult than finding the maximum
supportable total weight. The set of unsupportable weight vectors is
\[
U = \mathbf{R}^p_+ \setminus S,
\]
which is not convex. Minimizing over $U$ the total weight, a linear function of $w$, is not an LP.
The key insight is this: The minimum unsupportable weight will be concentrated at one of
the free nodes (which has the nice interpretation as the "weakest" node in the truss). Once
we know this, we solve the problem by considering each node in turn, finding the maximum
supportable weight at each node, by solving the LPs
\[
\begin{array}{ll}
\mbox{maximize} & \alpha \\
\mbox{subject to} & A_x t = 0, \quad A_y t = \alpha e_k \\
& T^{\min} \preceq t \preceq T^{\max},
\end{array}
\]
with variables $t$ and $\alpha$, for $k = 1, \ldots, p$. This number is the same as the minimum unsupportable
weight at the node. The minimum of these numbers gives us the minimum unsupportable
total weight.
The tricky part is to show that a minimum of $\mathbf{1}^T w$ over $\bar U$ occurs at a $w$ of the form $w = w_k e_k$,
where $e_k$ is the $k$th unit vector. (The bar above $U$ means its closure.) Here is one argument
for this, which relies on knowing some analysis. Suppose that $w^\star$ minimizes $\mathbf{1}^T w$ over $\bar U$,
with associated total weight $W^\star = \mathbf{1}^T w^\star$. This means that $w^\star$ is actually supportable, but
just barely; $(1 + \epsilon) w^\star$ is unsupportable for all $\epsilon > 0$. We claim that one of the points $W^\star e_k$
also minimizes $\mathbf{1}^T w$ over $\bar U$. (This is what we want to show.) Let $\epsilon > 0$, and consider the
point $(1 + \epsilon) w^\star$, which is unsupportable. But the vector $(1 + \epsilon) w^\star$ is a convex combination
of the vectors $(1 + \epsilon) W^\star e_k$; so if all these vectors were supportable, we would have $(1 + \epsilon) w^\star$
supportable, since the set of supportable weights is convex. So at least one of the vectors
$(1 + \epsilon) W^\star e_k$ is unsupportable. Since $\epsilon$ was arbitrary, we conclude that one of the vectors
$W^\star e_k$ is also on the boundary of $U$ (and $S$), and so also minimizes the total weight over $\bar U$.
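
The convex-combination step in this argument is easy to make concrete. The sketch below uses a hypothetical nonnegative weight vector and shows that $(1+\epsilon) w$ is the combination of the single-node vectors $(1+\epsilon) W e_k$ with coefficients $w_k / W$, which are nonnegative and sum to one.

```python
import numpy as np

w = np.array([0.3, 0.0, 1.1, 0.6])      # hypothetical weight vector, w >= 0
eps = 0.05
W = w.sum()                              # total weight 1^T w

theta = w / W                            # combination coefficients w_k / W
vertices = (1 + eps) * W * np.eye(len(w))    # row k is (1 + eps) * W * e_k
combo = theta @ vertices                 # sum_k theta_k (1 + eps) W e_k

print("coefficients:", theta, "sum:", theta.sum())
print("combination:", combo)
print("(1 + eps) w:", (1 + eps) * w)
```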

Another proof. The set of weights is not supportable if the equations above have no solution.
We'll derive the alternative inequalities. We form the Lagrangian

$$L = \nu^T A_x t - \mu^T (A_y t - w) + \lambda^T (T^{\min} - t) + \kappa^T (t - T^{\max}),$$

with $\lambda \succeq 0$, $\kappa \succeq 0$. We minimize over the variable $t$ to get

$$A_x^T \nu - A_y^T \mu - \lambda + \kappa = 0,$$

which leaves
$$g(\nu, \mu, \lambda, \kappa) = \mu^T w + \lambda^T T^{\min} - \kappa^T T^{\max}.$$
Thus, the alternative is:
$$A_x^T \nu - A_y^T \mu - \lambda + \kappa = 0, \qquad
\mu^T w + \lambda^T T^{\min} - \kappa^T T^{\max} = 1, \qquad
\lambda \succeq 0, \quad \kappa \succeq 0,$$
where the 1 on the righthand side of the second equation comes from homogeneity. We can
now say: $w$ is unsupportable if and only if there are $\nu$, $\mu$, $\lambda$, $\kappa$ satisfying the equations and
inequalities above.
Now suppose $w$ is unsupportable, with total weight $W = \mathbf{1}^T w$, and let $\nu$, $\mu$, $\lambda$, $\kappa$ be the
corresponding dual variables satisfying the equations and inequalities above.
Since $\lambda, \kappa \succeq 0$ and $T^{\min} \preceq 0 \preceq T^{\max}$, we have $\lambda^T T^{\min} - \kappa^T T^{\max} \le 0$. So $\mu^T w \ge 1$, which
implies that $\mu$ has at least one positive component, since $w$ is elementwise nonnegative. Let $k$ be
such that $\mu_k = \max_j \mu_j$. From above, we know that $\mu_k > 0$. Now define $\tilde w = \alpha e_k$, where
$\alpha = \mu^T w/\mu_k$. Then $\mu^T \tilde w = \alpha \mu_k = \mu^T w$, so the equations and inequalities above hold for $\tilde w$,
which tells us that $\tilde w$ is also unsupportable. Now we will show that $\mathbf{1}^T \tilde w \le W$, which means the new weight
distribution has a total weight not exceeding the original weight. Note that

$$\mathbf{1}^T \tilde w = \alpha = (\mu/\mu_k)^T w \le \mathbf{1}^T w = W,$$

because the vector $\mu/\mu_k$ has all entries $\le 1$ (and $w \succeq 0$).


So we've shown that given any unsupportable weight vector $w$, we can construct an unsupportable
weight vector, with no more total weight, that is concentrated at a single node.
We gave varying amounts of partial credit for explanations of why the least total weight that
is unsupportable should be concentrated at a single node.
(c) We find that the maximum total weight that is supportable is 5.79, with corresponding weight
vector
max_supp_w =

1.2053
0.8166
0.7599
0.9818
0.5918
1.4369.

We also find that the minimum unsupportable weight is 1.2 (placed at node 4). The following
code solves this problem.
% truss load analysis
truss_load_data;

Ax = A(1:p,:);
Ay = A(p+1:2*p,:);

% maximum supportable total weight


cvx_begin quiet
variables w(p) t(m)
maximize( sum(w) )
subject to
w >= 0;
t >= Tmin;
t <= Tmax;
Ax*t == 0;
Ay*t == w;
cvx_end

max_supp_w = w
max_supp_weight = sum(w)

% minimum unsupportable total weight


% to find this, we loop over the free nodes
w_min_single = zeros(p,1);
for k = 1:p
ek = zeros(p,1);
ek(k) = 1;
cvx_begin quiet
variables alpha t(m)
maximize( alpha )
subject to
alpha >= 0;
t >= Tmin;
t <= Tmax;
Ax*t == 0;
Ay*t == alpha*ek;
cvx_end
% store max supportable weight at node k
% (= min unsupportable weight at node k)
w_min_single(k) = alpha;
end

[min_unsupp_weight weakest_node] = min(w_min_single);

min_usupp_w = zeros(p,1);
min_usupp_w(weakest_node) = min_unsupp_weight
min_unsupp_weight

14.7 Quickest take-off. This problem concerns the braking and thrust profiles for an airplane during
take-off. For simplicity we will use a discrete-time model. The position (down the runway) and the
velocity in time period $t$ are $p_t$ and $v_t$, respectively, for $t = 0, 1, \ldots$. These satisfy $p_0 = 0$, $v_0 = 0$,
and $p_{t+1} = p_t + h v_t$, $t = 0, 1, \ldots$, where $h > 0$ is the sampling time period. The velocity updates as

$$v_{t+1} = (1-\eta) v_t + h(f_t - b_t), \quad t = 0, 1, \ldots,$$

where $\eta \in (0, 1)$ is a friction or drag parameter, $f_t$ is the engine thrust, and $b_t$ is the braking force,
at time period $t$. These must satisfy

$$0 \le b_t \le \min\{B^{\max}, f_t\}, \qquad 0 \le f_t \le F^{\max}, \quad t = 0, 1, \ldots,$$

as well as a constraint on how fast the engine thrust can be changed,

$$|f_{t+1} - f_t| \le S, \quad t = 0, 1, \ldots.$$

Here $B^{\max}$, $F^{\max}$, and $S$ are given parameters. The initial thrust is $f_0 = 0$. The take-off time is
$T^{\mathrm{to}} = \min\{t \mid v_t \ge V^{\mathrm{to}}\}$, where $V^{\mathrm{to}}$ is a given take-off velocity. The take-off position is $P^{\mathrm{to}} = p_{T^{\mathrm{to}}}$,
the position of the aircraft at the take-off time. The length of the runway is $L > 0$, so we must
have $P^{\mathrm{to}} \le L$.

(a) Explain how to find the thrust and braking profiles that minimize the take-off time $T^{\mathrm{to}}$,
respecting all constraints. Your solution can involve solving more than one convex problem,
if necessary.
(b) Solve the quickest take-off problem with data

$$h = 1, \quad \eta = 0.05, \quad B^{\max} = 0.5, \quad F^{\max} = 4, \quad S = 0.8, \quad V^{\mathrm{to}} = 40, \quad L = 300.$$

Plot $p_t$, $v_t$, $f_t$, and $b_t$ versus $t$. Comment on what you see. Report the take-off time and
take-off position for the profile you find.

Solution. To check if $T^{\mathrm{to}} = T$ is possible, we solve the (convex) feasibility inequalities

$$
\begin{array}{l}
p_0 = 0, \quad p_{t+1} = p_t + h v_t, \quad t = 0, \ldots, T-1 \\
v_0 = 0, \quad v_{t+1} = (1-\eta) v_t + h(f_t - b_t), \quad t = 0, \ldots, T-1 \\
0 \le b_t \le \min\{B^{\max}, f_t\}, \quad t = 0, \ldots, T \\
f_0 = 0, \quad 0 \le f_t \le F^{\max}, \quad t = 0, \ldots, T \\
|f_{t+1} - f_t| \le S, \quad t = 0, 1, \ldots, T-1 \\
v_T \ge V^{\mathrm{to}}, \quad p_T \le L,
\end{array}
$$

with variables $p, v, f, b \in \mathbf{R}^{T+1}$. To find the quickest take-off, we start with $T = 1$ (say) and
increment it until the constraints above are feasible.
The following code solves the problem.

% solution for quickest take-off problem
h = 1;
eta = 0.05;
Bmax = 0.5;
Fmax = 4;
S = 0.8;
Vto = 40;
L = 300;

T = 17; % infeasible for T=1,...,16


cvx_begin
variables p(T+1) v(T+1) f(T+1) b(T+1)
minimize (p(T+1)) % objective not needed
subject to
p(1) == 0;
p(2:T+1) == p(1:T) + h*v(1:T);
p(T+1) <= L;
v(1) == 0;
v(T+1) >= Vto;
v(2:T+1) == (1-eta)*v(1:T) + h*(f(1:T)-b(1:T));
f(1) == 0;
0 <= b;
b <= Bmax;
b <= f;
0 <= f;
f <= Fmax;
abs(f(2:T+1)-f(1:T)) <= S;
cvx_end

% plots
subplot(4,1,1); plot([0:T], p); xlabel('time'); ylabel('pt');
subplot(4,1,2); plot([0:T], v); xlabel('time'); ylabel('vt');
subplot(4,1,3); plot([0:T], f); xlabel('time'); ylabel('ft');
subplot(4,1,4); plot([0:T], b); xlabel('time'); ylabel('bt');
print -depsc quickest_takeoff
The quickest take-off time is $T^{\mathrm{to}\star} = 17$. The profiles for quickest take-off are shown below. You can
check that the use of the brake is not optional; the problem becomes infeasible when you disable it
(say, by setting $B^{\max} = 0$). At first it seems strange that a brake (which, after all, is used to slow
an airplane down) is critical in take-off. But a bit of thought reveals why it's needed: You lock the
aircraft down while the engine thrusts up. If you don't, you're already moving down the runway
at less than full thrust, which means you'll be farther along the runway when you take off. By the
way, if you make the runway a bit longer, the brake isn't needed; you just thrust up to max thrust
right from the start.
We can also note that if you can take off at time $T$, then you can take off in any later time period.
To see this, suppose that $s \ge 0$, and $f_t$, $b_t$ are trajectories that allow take-off at time $T$. Then we
set $\tilde f_t = \tilde b_t = 0$ for $t = 0, \ldots, s$, and $\tilde f_t = f_{t-s}$, $\tilde b_t = b_{t-s}$ for $t = s, \ldots, T+s$. This trajectory is also
feasible, and leads to take-off time $T + s$ and the same take-off location.
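Because feasibility is monotone in $T$ (if take-off at time $T$ is possible, so is take-off at any later time), the incremental search can be replaced by bisection. The sketch below is only an illustration in Python: `is_feasible` is a hypothetical callable that would wrap the convex feasibility problem for a given $T$, and the lambda used in the example is a stand-in oracle, not a solver.

```python
def smallest_feasible(is_feasible, lo, hi):
    """Return the smallest integer T in [lo, hi] with is_feasible(T) True.

    Assumes monotonicity: if T is feasible, every larger T is feasible,
    and assumes the upper bound hi itself is feasible.
    """
    if not is_feasible(hi):
        raise ValueError("upper bound hi is not feasible")
    while lo < hi:
        mid = (lo + hi) // 2
        if is_feasible(mid):
            hi = mid          # mid works, so the answer is mid or smaller
        else:
            lo = mid + 1      # mid fails, so the answer is strictly larger
    return lo


# Stand-in oracle for illustration only: pretend take-off is possible
# exactly when T >= 17 (the value reported in the text).
T_min = smallest_feasible(lambda T: T >= 17, 1, 64)
print(T_min)
```

With an upper bound of 64, this needs about six feasibility solves instead of seventeen.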

[Figure: four stacked panels showing $p_t$, $v_t$, $f_t$, and $b_t$ versus time for the quickest take-off profile.]

14.8 Optimal spacecraft landing. We consider the problem of optimizing the thrust profile for a spacecraft
to carry out a landing at a target position. The spacecraft dynamics are

$$m \ddot p = f - m g e_3,$$

where $m > 0$ is the spacecraft mass, $p(t) \in \mathbf{R}^3$ is the spacecraft position, with 0 the target landing
position and $p_3(t)$ representing height, $f(t) \in \mathbf{R}^3$ is the thrust force, and $g > 0$ is the gravitational
acceleration. (For simplicity we assume that the spacecraft mass is constant. This is not always
a good assumption, since the mass decreases with fuel use. We will also ignore any atmospheric
friction.) We must have $p(T^{\mathrm{td}}) = 0$ and $\dot p(T^{\mathrm{td}}) = 0$, where $T^{\mathrm{td}}$ is the touchdown time. The
spacecraft must remain in a region given by

$$p_3(t) \ge \alpha \|(p_1(t), p_2(t))\|_2,$$

where $\alpha > 0$ is a given minimum glide slope. The initial position $p(0)$ and velocity $\dot p(0)$
are given.
The thrust force $f(t)$ is obtained from a single rocket engine on the spacecraft, with a given
maximum thrust; an attitude control system rotates the spacecraft to achieve any desired direction
of thrust. The thrust force is therefore characterized by the constraint $\|f(t)\|_2 \le F^{\max}$. The fuel
use rate is proportional to the thrust force magnitude, so the total fuel use is

$$\int_0^{T^{\mathrm{td}}} \gamma \|f(t)\|_2 \, dt,$$

where $\gamma > 0$ is the fuel consumption coefficient. The thrust force is discretized in time, i.e., it is
constant over consecutive time periods of length $h > 0$, with $f(t) = f_k$ for $t \in [(k-1)h, kh)$, for
$k = 1, \ldots, K$, where $T^{\mathrm{td}} = Kh$. Therefore we have

$$v_{k+1} = v_k + (h/m) f_k - h g e_3, \qquad p_{k+1} = p_k + (h/2)(v_k + v_{k+1}),$$

where $p_k$ denotes $p((k-1)h)$, and $v_k$ denotes $\dot p((k-1)h)$.
We will work with this discrete-time model.
For simplicity, we will impose the glide slope constraint only at the times $t = 0, h, 2h, \ldots, Kh$.
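The discrete-time update follows from integrating the dynamics over one period with constant thrust. A short derivation, using only the symbols defined above (the shorthand $a_k$ for the constant acceleration is introduced here for convenience):

```latex
a_k = \frac{1}{m} f_k - g e_3 \quad \text{(constant on } [(k-1)h,\, kh)\text{)},
\qquad
v_{k+1} = v_k + h a_k = v_k + \frac{h}{m} f_k - h g e_3,
\\[4pt]
p_{k+1} = p_k + h v_k + \frac{h^2}{2} a_k
        = p_k + h v_k + \frac{h}{2}\,(v_{k+1} - v_k)
        = p_k + \frac{h}{2}\,(v_k + v_{k+1}).
```

So for piecewise-constant thrust the recursion is exact, not an approximation.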

(a) Minimum fuel descent. Explain how to find the thrust profile $f_1, \ldots, f_K$ that minimizes fuel
consumption, given the touchdown time $T^{\mathrm{td}} = Kh$ and discretization time $h$.
(b) Minimum time descent. Explain how to find the thrust profile that minimizes the touchdown
time, i.e., $K$, with $h$ fixed and given. Your method can involve solving several convex
optimization problems.
(c) Carry out the methods described in parts (a) and (b) above on the problem instance with
data given in spacecraft_landing_data.*. Report the optimal total fuel consumption for
part (a), and the minimum touchdown time for part (b). The data files also contain plotting
code (commented out) to help you visualize your solution. Use the code to plot the spacecraft
trajectory and thrust profiles you obtained for parts (a) and (b).

Hints.

In Julia, the plot will come out rotated.

Remarks. If you'd like to see the ideas of this problem in action, watch these videos:

http://www.youtube.com/watch?v=2t15vP1PyoA
https://www.youtube.com/watch?v=orUjSkc2pG0
https://www.youtube.com/watch?v=1B6oiLNyKKI
https://www.youtube.com/watch?v=ZCBE8ocOkAQ

Solution.

(a) To find the minimum fuel thrust profile for a given $K$, we solve

$$
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^K \|f_k\|_2 \\
\mbox{subject to} & v_{k+1} = v_k + (h/m) f_k - h g e_3, \quad p_{k+1} = p_k + (h/2)(v_k + v_{k+1}), \\
& \|f_k\|_2 \le F^{\max}, \quad (p_k)_3 \ge \alpha \|((p_k)_1, (p_k)_2)\|_2, \quad k = 1, \ldots, K, \\
& p_{K+1} = 0, \quad v_{K+1} = 0, \quad p_1 = p(0), \quad v_1 = \dot p(0),
\end{array}
$$

with variables $p_1, \ldots, p_{K+1}$, $v_1, \ldots, v_{K+1}$, and $f_1, \ldots, f_K$. This is a convex optimization problem.

(b) We can solve a sequence of convex feasibility problems to find the minimum touchdown time.
For each $K$ we solve

$$
\begin{array}{ll}
\mbox{minimize} & 0 \\
\mbox{subject to} & v_{k+1} = v_k + (h/m) f_k - h g e_3, \quad p_{k+1} = p_k + (h/2)(v_k + v_{k+1}), \\
& \|f_k\|_2 \le F^{\max}, \quad (p_k)_3 \ge \alpha \|((p_k)_1, (p_k)_2)\|_2, \quad k = 1, \ldots, K, \\
& p_{K+1} = 0, \quad v_{K+1} = 0, \quad p_1 = p(0), \quad v_1 = \dot p(0),
\end{array}
$$

with variables $p_1, \ldots, p_{K+1}$, $v_1, \ldots, v_{K+1}$, and $f_1, \ldots, f_K$. If the problem is feasible, we reduce
$K$, otherwise we increase $K$. We iterate until we find the smallest $K$ for which a feasible
trajectory can be found. (In fact, the problem is quasiconvex as long as $(1/m)F^{\max} \ge g$, so
we can use bisection to speed up our search.)
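A candidate thrust profile returned by a solver can be sanity-checked by rolling the discrete-time model forward. The following is a minimal Python sketch of such a check; the function names (`rollout`, `fuel_use`, `in_glide_cone`) are hypothetical helpers introduced here, not part of the data files or of any solver API.

```python
def rollout(p0, v0, thrusts, h, m, g):
    """Propagate the discrete-time model; return lists of positions and velocities."""
    p, v = list(p0), list(v0)
    ps, vs = [tuple(p)], [tuple(v)]
    for f in thrusts:
        # v_{k+1} = v_k + (h/m) f_k - h g e_3
        v_next = [v[i] + (h / m) * f[i] for i in range(3)]
        v_next[2] -= h * g
        # p_{k+1} = p_k + (h/2) (v_k + v_{k+1})
        p = [p[i] + (h / 2) * (v[i] + v_next[i]) for i in range(3)]
        v = v_next
        ps.append(tuple(p))
        vs.append(tuple(v))
    return ps, vs


def fuel_use(thrusts, h, gamma):
    """Total fuel: gamma * h * sum_k ||f_k||_2."""
    return gamma * h * sum(sum(c * c for c in f) ** 0.5 for f in thrusts)


def in_glide_cone(p, alpha, tol=1e-9):
    """Check the glide slope constraint p_3 >= alpha * ||(p_1, p_2)||_2."""
    return p[2] + tol >= alpha * (p[0] ** 2 + p[1] ** 2) ** 0.5


# Example: a thrust of m*g straight up exactly cancels gravity (hover).
ps, vs = rollout((0.0, 0.0, 5.0), (0.0, 0.0, 0.0),
                 [(0.0, 0.0, 1.0)] * 3, h=1.0, m=10.0, g=0.1)
print(ps[-1], vs[-1], fuel_use([(0.0, 0.0, 1.0)] * 3, h=1.0, gamma=1.0))
```

For a feasible solution one would expect the final position and velocity to be (numerically) zero and every position to pass `in_glide_cone`.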
(c) For part (a) the optimal total fuel consumption is 193.0. For part (b) the minimum touchdown
time is K = 25. The following plots show the trajectories we obtain. The blue line shows the
position of the spacecraft, the black arrows show the thrust profile, and the colored surface
shows the glide slope constraint.
Here is the minimum fuel trajectory for part (a). Notice that for a portion of the trajectory
the thrust is exactly equal to zero (which we would expect, given our cost function).

[Figure: 3-D plot of the minimum fuel trajectory for part (a), with thrust arrows and the glide slope cone.]

Here is a minimum time trajectory for part (b).

[Figure: 3-D plot of the minimum time trajectory for part (b), with thrust arrows and the glide slope cone.]

The following code solves the problem. In Matlab:


spacecraft_landing_data;

% solve part (a) (find minimum fuel trajectory)


cvx_solver sdpt3;
cvx_begin
variables p(3,K+1) v(3,K+1) f(3,K)
v(:,2:K+1) == v(:,1:K)+(h/m)*f-h*g*repmat([0;0;1],1,K);
p(:,2:K+1) == p(:,1:K)+(h/2)*(v(:,1:K)+v(:,2:K+1));
p(:,1) == p0; v(:,1) == v0;
p(:,K+1) == 0; v(:,K+1) == 0;
p(3,:) >= alpha*norms(p(1:2,:));
norms(f) <= Fmax;
minimize(sum(norms(f)))
cvx_end
min_fuel = cvx_optval*gamma*h;
p_minf = p; v_minf = v; f_minf = f;

% solve part (b) (find minimum K)


% we will use a linear search, but bisection is faster
Ki = K;
while(1)
cvx_begin
variables p(3,Ki+1) v(3,Ki+1) f(3,Ki)
v(:,2:Ki+1) == v(:,1:Ki)+(h/m)*f-h*g*repmat([0;0;1],1,Ki);

p(:,2:Ki+1) == p(:,1:Ki)+(h/2)*(v(:,1:Ki)+v(:,2:Ki+1));
p(:,1) == p0; v(:,1) == v0;
p(:,Ki+1) == 0; v(:,Ki+1) == 0;
p(3,:) >= alpha*norms(p(1:2,:));
norms(f) <= Fmax;
minimize(sum(norms(f)))
cvx_end
if(strcmp(cvx_status,'Infeasible') == 1)
Kmin = Ki+1;
break;
end
Ki = Ki-1;
p_mink = p; v_mink = v; f_mink = f;
end

% plot the glide cone


x = linspace(-40,55,30); y = linspace(0,55,30);
[X,Y] = meshgrid(x,y);
Z = alpha*sqrt(X.^2+Y.^2);
figure; colormap autumn; surf(X,Y,Z);
axis([-40,55,0,55,0,105]);
grid on; hold on;

% plot minimum fuel trajectory for part (a)


plot3(p_minf(1,:),p_minf(2,:),p_minf(3,:),'b','linewidth',1.5);
quiver3(p_minf(1,1:K),p_minf(2,1:K),p_minf(3,1:K),...
f_minf(1,:),f_minf(2,:),f_minf(3,:),0.3,'k','linewidth',1.5);
print('-depsc','spacecraft_landing_a.eps');

% plot minimum time trajectory for part (b)


figure; colormap autumn; surf(X,Y,Z);
axis([-40,55,0,55,0,105]); grid on; hold on;
plot3(p_mink(1,:),p_mink(2,:),p_mink(3,:),'b','linewidth',1.5);
quiver3(p_mink(1,1:Kmin),p_mink(2,1:Kmin),p_mink(3,1:Kmin),...
f_mink(1,:),f_mink(2,:),f_mink(3,:),0.3,'k','linewidth',1.5);
print('-depsc','spacecraft_landing_b.eps');
In Python:
import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm

h = 1.
g = 0.1

m = 10.
Fmax = 10.
p0 = np.matrix('50;50;100')
v0 = np.matrix('-10;0;-10')
alpha = 0.5
gamma = 1.
K = 35

#Solution begins
#--------------------------------------------------------------

e3 = np.matrix('0; 0; 1')

p = cvx.Variable(3, K+1)
v = cvx.Variable(3, K+1)
f = cvx.Variable (3, K)

fuel_use = h*gamma*sum([cvx.norm(f[:,i]) for i in range(K)])

const = [v[:,i+1] == v[:,i] + (h/m)*f[:,i]-h*g*e3 for i in range(K)]


const += [p[:,i+1] == p[:,i] + h/2*(v[:,i]+v[:,i+1]) for i in range(K)]
const += [p[:,0]==p0, v[:,0]==v0]
const += [p[:,K]==0, v[:,K]==0]
const += [p[2,i] >= alpha*cvx.norm(p[0:2,i]) for i in range(K+1)]
const += [cvx.norm(f[:,i]) <= Fmax for i in range(K)]

prob = cvx.Problem(cvx.Minimize(fuel_use), const)


prob.solve()
print "Minimum fuel use is %.2f" % fuel_use.value

# Minimum fuel trajectory and glide cone


fig = plt.figure()
ax = fig.gca(projection='3d')

X = np.linspace(-40, 55, num=30)


Y = np.linspace(0, 55, num=30)
X, Y = np.meshgrid(X, Y)
Z = alpha*np.sqrt(X**2+Y**2);
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=0)

ax.plot(xs=p.value[0,:].A1,ys=p.value[1,:].A1,zs=p.value[2,:].A1)
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')

#For minimum time descent, we do a linear search but bisection would be faster.

K = 1
while True:
    p = cvx.Variable(3, K+1)
    v = cvx.Variable(3, K+1)
    f = cvx.Variable(3, K)

    const = [v[:,i+1] == v[:,i] + (h/m)*f[:,i]-h*g*e3 for i in range(K)]
    const += [p[:,i+1] == p[:,i] + h/2*(v[:,i]+v[:,i+1]) for i in range(K)]
    const += [p[:,0]==p0, v[:,0]==v0]
    const += [p[:,K]==0, v[:,K]==0]
    const += [p[2,i] >= alpha*cvx.norm(p[0:2,i]) for i in range(K+1)]
    const += [cvx.norm(f[:,i]) <= Fmax for i in range(K)]

    prob = cvx.Problem(cvx.Minimize(0), const)
    prob.solve()

    # stop at the first (smallest) K for which the problem is feasible
    if prob.status == 'optimal':
        break
    K += 1

print "The minimum touchdown time is", K

#Minimum time trajectory and glide cone


fig = plt.figure()
ax = fig.gca(projection='3d')

X = np.linspace(-40, 55, num=30)


Y = np.linspace(0, 55, num=30)
X, Y = np.meshgrid(X, Y)
Z = alpha*np.sqrt(X**2+Y**2);
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=0)
ax.plot(xs=p.value[0,:].A1,ys=p.value[1,:].A1,zs=p.value[2,:].A1)
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()
In Julia:
include("spacecraft_landing_data.jl");
using Convex, SCS
solver = SCSSolver(max_iters=20000, verbose=false);

# solve part (a) (find minimum fuel trajectory)


p = Variable(3, K+1);
v = Variable(3, K+1);
f = Variable(3, K);

constraints = [];
constraints += v[:,2:K+1] == v[:,1:K] + (h/m)*f - h*g*repmat([0,0,1], 1, K);
constraints += p[:,2:K+1] == p[:,1:K] + (h/2)*(v[:,1:K] + v[:,2:K+1]);
constraints += p[:,1] == p0;
constraints += v[:,1] == v0;
constraints += p[:,K+1] == 0;
constraints += v[:,K+1] == 0;
for i = 1:K+1
constraints += p[3,i] >= alpha*norm(p[1:2,i]);
end
objective = 0;
for i = 1:K
constraints += norm(f[:,i]) <= Fmax;
objective += norm(f[:,i]);
end
problem = minimize(objective, constraints);
solve!(problem, solver);
min_fuel = problem.optval*gamma*h;
p_minf = p.value; v_minf = v.value; f_minf = f.value;

# solve part (b) (find minimum K)


# we will use a linear search, but bisection is faster
Kmin = K;
p_mink = nothing;
v_mink = nothing;
f_mink = nothing;
while true
p = Variable(3, Kmin+1);
v = Variable(3, Kmin+1);
f = Variable(3, Kmin);
constraints = [];
constraints += v[:,2:Kmin+1] == v[:,1:Kmin] + (h/m)*f - h*g*repmat([0,0,1], 1, Kmin);
constraints += p[:,2:Kmin+1] == p[:,1:Kmin] + (h/2)*(v[:,1:Kmin] + v[:,2:Kmin+1]);
constraints += p[:,1] == p0;
constraints += v[:,1] == v0;
constraints += p[:,Kmin+1] == 0;
constraints += v[:,Kmin+1] == 0;
for i = 1:Kmin+1
constraints += p[3,i] >= alpha*norm(p[1:2,i]);
end
objective = 0;
for i = 1:Kmin
constraints += norm(f[:,i]) <= Fmax;
objective += norm(f[:,i]);
end

problem = minimize(objective, constraints);
solve!(problem, solver);

if problem.status != :Optimal
Kmin += 1;
break;
end
Kmin -= 1;
p_mink = p.value; v_mink = v.value; f_mink = f.value;
end

# plot the glide cone


using PyPlot
x = linspace(-40,55,30); y = linspace(0,55,30);
X = repmat(x, length(x), 1);
Y = repmat(y, 1, length(y));
Z = alpha*sqrt(X.^2+Y.^2);
figure();
grid(true);
hold(true);
surf(X, Y, Z, cmap=get_cmap("autumn"));
xlim([-40, 55]);
ylim([0, 55]);
zlim([0, 105]);

# plot minimum fuel trajectory for part (a)


plot(p_minf[:,1], p_minf[:,2], p_minf[:,3]);
heads = p_minf[1:K,:] + f_minf;
quiver(heads[:,1], heads[:,2], heads[:,3],
f_minf[:,1], f_minf[:,2], f_minf[:,3], length=10);

# plot minimum time trajectory for part (b)


figure();
grid(true);
hold(true);
surf(X, Y, Z, cmap=get_cmap("autumn"));
xlim([-40, 55]);
ylim([0, 55]);
zlim([0, 105]);
plot(p_mink[:,1], p_mink[:,2], p_mink[:,3]);
heads_k = p_mink[1:Kmin,:] + f_mink;
quiver(heads_k[:,1], heads_k[:,2], heads_k[:,3],
f_mink[:,1], f_mink[:,2], f_mink[:,3], length=10);

14.9 Feedback gain optimization. A system (such as an industrial plant) is characterized by $y = Gu + v$,
where $y \in \mathbf{R}^n$ is the output, $u \in \mathbf{R}^n$ is the input, and $v \in \mathbf{R}^n$ is a disturbance signal. The matrix
$G \in \mathbf{R}^{n \times n}$, which is known, is called the system input-output matrix. The input signal $u$ is found
using a linear feedback (control) policy: $u = Fy$, where $F \in \mathbf{R}^{n \times n}$ is the feedback (gain) matrix,
which is what we need to determine. From the equations given above, we have

$$y = (I - GF)^{-1} v, \qquad u = F(I - GF)^{-1} v.$$

(You can simply assume that $I - GF$ will be invertible.)

The disturbance $v$ is random, with $\mathbf{E}\, v = 0$, $\mathbf{E}\, vv^T = \sigma^2 I$, where $\sigma$ is known. The objective is to
minimize $\max_{i=1,\ldots,n} \mathbf{E}\, y_i^2$, the maximum mean square value of the output components, subject to
the constraint that $\mathbf{E}\, u_i^2 \le 1$, $i = 1, \ldots, n$, i.e., each input component has a mean square value not
exceeding one. The variable to be chosen is the matrix $F \in \mathbf{R}^{n \times n}$.

(a) Explain how to use convex (or quasi-convex) optimization to find an optimal feedback gain
matrix. As usual, you must fully explain any change of variables or other transformations you
carry out, and why your formulation solves the problem described above. A few comments:
• You can assume that matrices arising in your change of variables are invertible; you do
not need to worry about the special cases when they are not.
• You can assume that $G$ is invertible if you need to, but we will deduct a few points from
these answers.
(b) Carry out your method for the problem instance with data

$$\sigma = 1, \qquad G = \left[ \begin{array}{rrr} 0.3 & -0.1 & -0.9 \\ -0.6 & 0.3 & -0.3 \\ -0.3 & 0.6 & 0.2 \end{array} \right].$$

Give an optimal $F$, and the associated optimal objective value.

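Solution. We first note that

$$\mathbf{E}\, yy^T = \sigma^2 (I - GF)^{-1} (I - GF)^{-T}, \qquad \mathbf{E}\, uu^T = \sigma^2 F (I - GF)^{-1} (I - GF)^{-T} F^T.$$

The diagonal entries of these are $(\mathbf{E}\, y_1^2, \ldots, \mathbf{E}\, y_n^2)$ and $(\mathbf{E}\, u_1^2, \ldots, \mathbf{E}\, u_n^2)$, respectively. We can also
express these quantities as $\sigma^2$ times the $\ell_2$-norm squared of the rows of the matrices $(I - GF)^{-1}$ and
$F(I - GF)^{-1}$, respectively.
The problem is therefore

$$
\begin{array}{ll}
\mbox{minimize} & \sigma^2 \max_{i=1,\ldots,n} \left( (I - GF)^{-1} (I - GF)^{-T} \right)_{ii} \\
\mbox{subject to} & \sigma^2 \left( F (I - GF)^{-1} (I - GF)^{-T} F^T \right)_{ii} \le 1, \quad i = 1, \ldots, n,
\end{array}
$$

with variable $F \in \mathbf{R}^{n \times n}$. In this form it is not a convex problem.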


We are clearly going to have to change variables to obtain a convex problem. In fact, several
changes of variables lead to a convex problem.

Approach #1. One change of variables uses $Z = F(I - GF)^{-1}$. First let's see how we can
recover $F$ from $Z$. We start with $Z(I - GF) = F$, from which we then get $F = (I + ZG)^{-1} Z$.
(Here we simply assume that $I + ZG$ is invertible.)
The constraints are $(ZZ^T)_{ii} \le 1/\sigma^2$, or equivalently, that the rows of $Z$ should have $\ell_2$ norm not
exceeding $1/\sigma$. So the constraints are convex in $Z$. Now let's handle the objective. We need to
express $(I - GF)^{-1}$ in terms of $Z = F(I - GF)^{-1}$. We start with

$$I = (I - GF)(I - GF)^{-1} = (I - GF)^{-1} - GF(I - GF)^{-1}.$$

We rewrite this as
$$(I - GF)^{-1} = I + GZ.$$
Thus the objective (ignoring the constant $\sigma^2$) is the maximum of the $\ell_2$ norm squared of the rows
of $I + GZ$, which is clearly convex. Whew!
Here is the final (convex) problem we solve:

$$
\begin{array}{ll}
\mbox{minimize} & \sigma^2 \max_{i=1,\ldots,n} \left( (I + GZ)(I + GZ)^T \right)_{ii} \\
\mbox{subject to} & \sigma^2 \left( ZZ^T \right)_{ii} \le 1, \quad i = 1, \ldots, n,
\end{array}
$$

with variable $Z$. (The two expressions are really just the $\ell_2$-norms squared of the rows of the
matrices.) We recover an optimal $F$ from an optimal $Z$ using $F = (I + ZG)^{-1} Z$.
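The two identities behind this change of variables, $(I - GF)^{-1} = I + GZ$ and $F = (I + ZG)^{-1}Z$, are easy to spot-check numerically. The sketch below does this in plain Python for a $2 \times 2$ example; the matrices `G` and `F` are arbitrary illustrative values (not the problem data), and the helper functions are written out by hand only to keep the example self-contained.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, sign=1.0):
    """Entrywise A + sign * B."""
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def inv2(A):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

I2 = [[1.0, 0.0], [0.0, 1.0]]
G = [[0.3, -0.1], [-0.6, 0.3]]    # arbitrary illustrative data
F = [[0.5, 0.2], [-0.4, 0.1]]     # arbitrary illustrative gain

Y = inv2(add(I2, matmul(G, F), sign=-1.0))       # Y = (I - GF)^{-1}
Z = matmul(F, Y)                                 # Z = F (I - GF)^{-1}

err_objective = max_abs_diff(Y, add(I2, matmul(G, Z)))   # (I - GF)^{-1} vs I + GZ
F_rec = matmul(inv2(add(I2, matmul(Z, G))), Z)           # (I + ZG)^{-1} Z
err_recovery = max_abs_diff(F_rec, F)

print(err_objective, err_recovery)   # both should be at rounding-error level
```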

Approach #2. Here we use the variable $Y = (I - GF)^{-1}$. We recover $F$ using $F = G^{-1}(I - Y^{-1})$,
so this approach requires that $G$ be invertible.
In this case the objective is simple and convex; it is the maximum of the $\ell_2$ norms of the rows of
$Y$ (squared, and multiplied by $\sigma^2$).
We'll have to work to handle the constraints, though. Using the formula above for $F$, we have

$$F(I - GF)^{-1} = FY = G^{-1}(I - Y^{-1})Y = G^{-1}(Y - I).$$

So now we see that the constraints are that the rows of the matrix $G^{-1}(Y - I)$ have $\ell_2$ norm not
exceeding $1/\sigma$, which is convex.
Our convex problem is then

$$
\begin{array}{ll}
\mbox{minimize} & \sigma^2 \max_{i=1,\ldots,n} \left( YY^T \right)_{ii} \\
\mbox{subject to} & \sigma^2 \left( G^{-1}(Y - I)(G^{-1}(Y - I))^T \right)_{ii} \le 1, \quad i = 1, \ldots, n,
\end{array}
$$

with variable $Y$, with $F$ recovered from $F = G^{-1}(I - Y^{-1})$.

Approach #3. Here is another approach used by several people, which is a hybrid of the
approaches given above. This approach uses both the new variables:

$$Y = (I - GF)^{-1}, \qquad Z = F(I - GF)^{-1},$$

with the constraint that $Z = FY = G^{-1}(Y - I)$ (assuming that $G$ is invertible here), which we
can also write as $GZ = Y - I$, which avoids the assumption that $G$ is invertible. This change of
variables is not a bijection, that is, a one-to-one correspondence between the old and new variables.
The correct statement is that for each $F$, there exists a pair $Z$ and $Y$ satisfying $GZ = Y - I$,
$Y = (I - GF)^{-1}$, $Z = F(I - GF)^{-1}$; conversely, for each pair $Z$, $Y$ satisfying $GZ = Y - I$, there
exists an $F$ for which $Y = (I - GF)^{-1}$, $Z = F(I - GF)^{-1}$. (Here we are informal about the
existence of inverses.)
We end up with the convex problem

$$
\begin{array}{ll}
\mbox{minimize} & \sigma^2 \max_{i=1,\ldots,n} \left( YY^T \right)_{ii} \\
\mbox{subject to} & \sigma^2 \left( ZZ^T \right)_{ii} \le 1, \quad i = 1, \ldots, n, \\
& GZ = Y - I,
\end{array}
$$

with variables $Z$ and $Y$, with $F$ recovered from $F = ZY^{-1}$.

Numerical instance. The code below solves the problem (using approaches 1 and 2, and sure
enough, the results agree). The optimal value is 0.1647, and an optimal F is

1.2990 3.6886 3.4530
F = 0.2103 0.8963 3.3598 .

15.5258 7.3230 9.1520

% feedback gain optimization


G = [0.3 -0.1 -0.9; -0.6 0.3 -0.3; -0.3 0.6 0.2];
sigma = 1;

% approach 1, using Z = F(I-GF)^{-1}


cvx_begin
variable Z(3,3);
minimize (sigma*max(norms((eye(3)+G*Z))))
norms(Z) <= 1/sigma; % rows of Z have norm less than 1/sigma
cvx_end
% recover feedback matrix
opt_val = cvx_optval^2
F = inv(eye(3)+Z*G)*Z

% approach 2, using Y=(I-GF)^{-1}


cvx_begin
variable Y(3,3);
minimize (sigma*max(norms(Y)))
norms((inv(G)*(Y-eye(3)))) <= 1/sigma;
cvx_end
% recover feedback matrix
opt_val2 = cvx_optval^2
F2 = inv(G)*(eye(3)-inv(Y))

14.10 Fuel use as function of distance and speed. A vehicle uses fuel at a rate $f(s)$, which is a function
of the vehicle speed $s$. We assume that $f : \mathbf{R} \to \mathbf{R}$ is a positive increasing convex function, with
$\mathbf{dom}\, f = \mathbf{R}_+$. The physical units of $s$ are m/s (meters per second), and the physical units of $f(s)$
are kg/s (kilograms per second).

(a) Let $g(d, t)$ be the total fuel used (in kg) when the vehicle moves a distance $d \ge 0$ (in meters)
in time $t > 0$ (in seconds) at a constant speed. Show that $g$ is convex.
(b) Let $h(d)$ be the minimum fuel used (in kg) to move a distance $d$ (in m) at a constant speed $s$
(in m/s). Show that $h$ is convex.

Solution.

(a) $g(d, t) = t f(d/t)$ is the perspective of $f$, and therefore convex.

(b) $h(d) = \inf_{t > 0} g(d, t)$ is obtained by partial minimization of $g$, so is convex.
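As a concrete illustration, the sketch below evaluates $g$ and a grid approximation of $h$ in Python. The fuel-rate function $f(s) = 1 + s^2$ is an assumption chosen only for the example (it is positive, increasing, and convex on $\mathbf{R}_+$); for this particular $f$ we have $g(d,t) = t + d^2/t$, which is minimized over $t$ at $t = d$, giving $h(d) = 2d$.

```python
def f(s):
    # Example fuel rate (an assumption): positive, increasing, convex for s >= 0.
    return 1.0 + s * s

def g(d, t):
    # Fuel used to cover distance d in time t at constant speed d/t (perspective of f).
    return t * f(d / t)

def h(d, ts):
    # Crude grid approximation of the infimum of g(d, t) over t > 0.
    return min(g(d, t) for t in ts)

# Midpoint convexity check of g on one pair of points (d, t).
a, b = (1.0, 1.0), (3.0, 1.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
midpoint_ok = g(*mid) <= 0.5 * (g(*a) + g(*b))

ts = [0.01 * k for k in range(1, 2001)]   # t = 0.01, 0.02, ..., 20
print(midpoint_ok, h(1.0, ts), h(2.0, ts), h(3.0, ts))
```

The grid values of $h$ should be close to $2d$, which is linear in $d$ and hence convex, in agreement with part (b).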

14.11 Minimum time speed profile along a road. A vehicle of mass $m > 0$ moves along a road in $\mathbf{R}^3$, which
is piecewise linear with given knot points $p_1, \ldots, p_{N+1} \in \mathbf{R}^3$, starting at $p_1$ and ending at $p_{N+1}$. We
let $h_i = (p_i)_3$, the $z$-coordinate of the knot point; these are the heights of the knot points (above
sea-level, say). For your convenience, these knot points are equidistant, i.e., $\|p_{i+1} - p_i\|_2 = d$ for all
$i$. (The points give an arc-length parametrization of the road.) We let $s_i > 0$ denote the (constant)
vehicle speed as it moves along road segment $i$, from $p_i$ to $p_{i+1}$, for $i = 1, \ldots, N$, and $s_{N+1} \ge 0$
denote the vehicle speed after it passes through knot point $p_{N+1}$. Our goal is to minimize the total
time to traverse the road, which we denote $T$.
We let $f_i \ge 0$ denote the total fuel burnt while traversing the $i$th segment. This fuel burn is turned
into an increase in vehicle energy given by $\eta f_i$, where $\eta > 0$ is a constant that includes the engine
efficiency and the energy content of the fuel. While traversing the $i$th road segment the vehicle is
subject to a drag force, given by $C_D s_i^2$, where $C_D > 0$ is the coefficient of drag, which results in an
energy loss $d C_D s_i^2$.
We derive equations that relate these quantities via energy balance:

$$\frac{1}{2} m s_{i+1}^2 + m g h_{i+1} = \frac{1}{2} m s_i^2 + m g h_i + \eta f_i - d C_D s_i^2, \quad i = 1, \ldots, N,$$

where $g = 9.8$ is the gravitational acceleration. The lefthand side is the total vehicle energy (kinetic
plus potential) after it passes through knot point $p_{i+1}$; the righthand side is the total vehicle energy
after it passes through knot point $p_i$, plus the energy gain from the fuel burn, minus the energy
lost to drag. To set up the first vehicle speed $s_1$ requires an additional initial fuel burn $f_0$, with
$\eta f_0 = \frac{1}{2} m s_1^2$.
Fuel is also used to power the on-board system of the vehicle. The total fuel used for this purpose is
$f_{\mathrm{ob}}$, where $\eta f_{\mathrm{ob}} = TP$, where $P > 0$ is the (constant) power consumption of the on-board system.
We have a fuel capacity constraint: $\sum_{i=0}^N f_i + f_{\mathrm{ob}} \le F$, where $F > 0$ is the total initial fuel.

The problem data are $m$, $d$, $h_1, \ldots, h_{N+1}$, $\eta$, $C_D$, $P$, and $F$. (You don't need the knot points $p_i$.)

(a) Explain how to find the fuel burn levels $f_0, \ldots, f_N$ that minimize the time $T$, subject to the
constraints.

(b) Carry out the method described in part (a) for the problem instance with data given in
min_time_speed_data.m. Give the optimal time $T^\star$, and compare it to the time $T^{\mathrm{unif}}$ achieved
if the fuel for propulsion were burned uniformly, i.e., $f_0 = \cdots = f_N$. For each of these cases,
plot speed versus distance along the road, using the plotting code in the data file as a template.

Solution.
(a) The time to traverse the $i$th segment is $d/s_i$, so the total time is

$$T = \sum_{i=1}^N \frac{d}{s_i}.$$

We are to solve the problem

$$
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^N d/s_i \\
\mbox{subject to} & \frac{1}{2} m s_{i+1}^2 + m g h_{i+1} = \frac{1}{2} m s_i^2 + m g h_i + \eta f_i - d C_D s_i^2, \quad i = 1, \ldots, N \\
& \sum_{i=0}^N f_i + PT/\eta \le F \\
& f_i \ge 0, \quad i = 0, \ldots, N \\
& \eta f_0 = \frac{1}{2} m s_1^2,
\end{array}
$$

with variables $f_i$ and $s_i$. The domain of the objective gives the implicit constraint $s_i > 0$,
$i = 1, \ldots, N$. The objective is convex, but unfortunately, the equality constraints are not linear
because of the $s_i^2$ terms. So we will try introducing the new variables $z_i = s_i^2$. Since $s_i \ge 0$,
we can recover speed from the new variables using $s_i = \sqrt{z_i}$. Note that $z_i$ is proportional to
the kinetic energy; therefore this change of variables can be thought of as solving the problem
in terms of kinetic energy instead of speed. Our equality constraints now become

$$\frac{1}{2} m z_{i+1} + m g h_{i+1} = \frac{1}{2} m z_i + m g h_i + \eta f_i - d C_D z_i, \quad i = 1, \ldots, N$$

and $\eta f_0 = \frac{1}{2} m z_1$, which are linear equations in the variables $z_i$ and $f_i$. We now return to our
objective which, under this change of variables, becomes

$$T = \sum_{i=1}^N \frac{d}{s_i} = \sum_{i=1}^N d z_i^{-1/2},$$

which is convex since $z_i^{-1/2}$ is convex. Having shown that $T$ is convex in our new variables, it
is clear that the fuel capacity constraint

$$\sum_{i=0}^N f_i + PT/\eta = \sum_{i=0}^N f_i + (P/\eta) \sum_{i=1}^N d z_i^{-1/2} \le F$$

is a convex constraint. We can therefore find $f_0, \ldots, f_N$ by solving the convex optimization
problem

$$
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^N d z_i^{-1/2} \\
\mbox{subject to} & \frac{1}{2} m z_{i+1} + m g h_{i+1} = \frac{1}{2} m z_i + m g h_i + \eta f_i - d C_D z_i, \quad i = 1, \ldots, N \\
& \sum_{i=0}^N f_i + (P/\eta) \sum_{i=1}^N d z_i^{-1/2} \le F \\
& f_i \ge 0, \quad i = 0, \ldots, N \\
& \eta f_0 = \frac{1}{2} m z_1,
\end{array}
$$

with variables $f_0, \ldots, f_N$ and $z_1, \ldots, z_{N+1}$. Then we recover $s_i$ using $s_i = \sqrt{z_i}$.
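Since the energy-balance equations are linear in $z$ and $f$, the fuel burns implied by any given speed profile can be read off directly. The following Python sketch does this; the function names and the small flat-road example are assumptions made for illustration, not values from the data file.

```python
def fuel_from_speeds(z, heights, m, d, eta, C_D, g=9.8):
    """Given z[i] = s_{i+1}^2 and knot heights (both of length N+1),
    return [f_0, f_1, ..., f_N] from the linear energy-balance equations."""
    N = len(z) - 1
    f = [0.5 * m * z[0] / eta]                    # eta * f_0 = (1/2) m z_1
    for i in range(N):
        # change in kinetic plus potential energy across segment i+1
        dE = 0.5 * m * (z[i + 1] - z[i]) + m * g * (heights[i + 1] - heights[i])
        f.append((dE + d * C_D * z[i]) / eta)     # fuel also covers the drag loss
    return f

def total_time(z, d):
    """T = sum over the N segments of d / sqrt(z_i)."""
    return sum(d / zi ** 0.5 for zi in z[:-1])

# Illustrative example: flat road, constant speed 2 on N = 3 segments.
z = [4.0, 4.0, 4.0, 4.0]
heights = [0.0, 0.0, 0.0, 0.0]
f = fuel_from_speeds(z, heights, m=1.0, d=10.0, eta=2.0, C_D=0.1)
print(f, total_time(z, d=10.0))
```

A speed profile is admissible only if every returned $f_i$ is nonnegative and the total, plus the on-board fuel $PT/\eta$, does not exceed $F$.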

(b) The following code solves the problem:

% Minimum time speed profile along a road.


min_time_speed_data;

% minimum time
cvx_begin
variables z(N+1) f(N+1)
minimize sum(d*inv_pos(sqrt(z(1:N))))
subject to
.5*m*z(2:N+1)+m*g*h(2:N+1) ==...
.5*m*z(1:N)+m*g*h(1:N)+eta*f(2:N+1)-d*C_D*z(1:N)
sum(f)+P/eta*sum(d*inv_pos(sqrt(z(1:N)))) <= F
f >= 0
eta*f(1) == .5*m*z(1)
cvx_end

T = cvx_optval

% constant fuel burn


cvx_solver sdpt3
% sedumi fails, but only on this 1 part of 1 problem
% from the entire exam, and only if you replaced
% the vector f, by the variable fc as below
cvx_begin
variables zc(N+1) fc
minimize sum(d*inv_pos(sqrt(zc(1:N))))
subject to
.5*m*zc(2:N+1)+m*g*h(2:N+1) ==...
.5*m*zc(1:N)+m*g*h(1:N)+eta*fc-d*C_D*zc(1:N)
(N+1)*fc+P/eta*sum(d*inv_pos(sqrt(zc(1:N)))) <= F
fc >= 0
eta*fc == .5*m*zc(1)
cvx_end

T_unif = cvx_optval

figure
subplot(3,1,1)
plot((0:N)*d,h);
ylabel('height');
subplot(3,1,2)
stairs((0:N)*d,sqrt(z),'b');
hold on
stairs((0:N)*d,sqrt(zc),'--r');
ylabel('speed')
legend('minimum time','constant burn')
subplot(3,1,3)
plot((0:N)*d,f,'b');
hold on
plot((0:N)*d,fc*ones(N+1,1),'--r')
xlabel('distance')
ylabel('fuel burned')

print -depsc min_time_speed;

We get $T^\star = 213.26$ and $T^{\mathrm{unif}} = 258.48$. Note that you can find $T^{\mathrm{unif}}$ by replacing the $f_i$'s by
a single parameter $f_c$ as we did, or by constraining all of the $f_i$'s to be equal. The plot below
shows the speed and fuel burn profiles for the two cases.

[Figure: three stacked panels versus distance along the road: height; speed (minimum time and constant burn); fuel burned.]

We see that the optimal fuel burn attempts to keep a constant velocity, except near the end
of the trajectory, where it coasts to the finish line. The uniform fuel burn wastes fuel (and
therefore loses time) by burning fuel on the downhill parts, leading to a large speed, and
therefore, large loss due to drag.

14.12 Least-cost road grading. A road is to be built along a given path. We must choose the height of
the roadbed (say, above sea level) along the path, minimizing the total cost of grading, subject to
some constraints. The cost of grading (i.e., moving earth to change the height of the roadbed from
the existing elevation) depends on the difference in height between the roadbed and the existing

492
elevation. When the roadbed is below the existing elevation it is called a cut; when it is above
it is called a fill. Each of these incurs engineering costs; for example, fill is created in a series of
lifts, each of which involves dumping just a few inches of soil and then compacting it. Deeper cuts
and higher fills require more work to be done on the road shoulders, and possibly, the addition of
reinforced concrete structures to stabilize the earthwork. This explains why the marginal cost of
cuts and fills increases with their depth/height.
We will work with a discrete model, specifying the road height as hi , i = 1, . . . , n, at points equally
spaced a distance d from each other along the given path. These are the variables to be chosen.
(The heights h1 , . . . , hn are called a grading plan.) We are given ei , i = 1, . . . , n, the existing
elevation, at the points. The grading cost is
n 
X 
C= fill ((hi ei )+ ) + cut ((ei hi )+ ) ,
i=1

where fill and cut are the fill and cut cost functions, respectively, and (a)+ = max{a, 0}. The fill
and cut functions are increasing and convex. The goal is to minimize the grading cost C.
The road height is constrained by given limits on the first, second, and third derivatives:

$$\begin{array}{ll}
|h_{i+1} - h_i|/d \leq D^{(1)}, & i = 1, \ldots, n-1 \\
|h_{i+1} - 2h_i + h_{i-1}|/d^2 \leq D^{(2)}, & i = 2, \ldots, n-1 \\
|h_{i+1} - 3h_i + 3h_{i-1} - h_{i-2}|/d^3 \leq D^{(3)}, & i = 3, \ldots, n-1,
\end{array}$$

where $D^{(1)}$ is the maximum allowable road slope, $D^{(2)}$ is the maximum allowable curvature, and
$D^{(3)}$ is the maximum allowable third derivative.
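
As a rough illustration of the model above, the following sketch evaluates the grading cost and checks the three difference limits for a candidate plan. It assumes quadratic-plus-linear fill and cut cost functions; the function names, the coefficient names, and the toy data are hypothetical and are not taken from road_grading_data.m.

```python
import numpy as np

def grading_cost(h, e, alpha_fill, beta_fill, alpha_cut, beta_cut):
    """Total cost C, assuming quadratic-plus-linear fill and cut costs."""
    fill = np.maximum(h - e, 0.0)   # (h_i - e_i)_+
    cut = np.maximum(e - h, 0.0)    # (e_i - h_i)_+
    return float(np.sum(alpha_fill * fill**2 + beta_fill * fill
                        + alpha_cut * cut**2 + beta_cut * cut))

def satisfies_limits(h, d, D1, D2, D3, tol=1e-9):
    """Check the first-, second-, and third-difference limits on h."""
    ok1 = np.all(np.abs(np.diff(h, 1)) <= D1 * d + tol)
    ok2 = np.all(np.abs(np.diff(h, 2)) <= D2 * d**2 + tol)
    ok3 = np.all(np.abs(np.diff(h, 3)) <= D3 * d**3 + tol)
    return bool(ok1 and ok2 and ok3)

# Toy data (hypothetical): existing elevation e and a candidate plan h.
e = np.array([0.0, 1.0, 3.0, 2.0, 0.0])
h = np.array([0.5, 1.0, 2.0, 2.0, 1.0])
cost = grading_cost(h, e, alpha_fill=2, beta_fill=30, alpha_cut=12, beta_cut=1)
feasible = satisfies_limits(h, d=1.0, D1=1.0, D2=1.0, D3=2.0)
print(cost, feasible)
```

A checker like this only evaluates a given plan; finding the optimal plan still requires solving the convex problem.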

(a) Explain how to find the optimal grading plan.


(b) Find the optimal grading plan for the problem with data given in
road_grading_data.m, and fill and cut cost functions

$$\phi^{\mathrm{fill}}(u) = 2(u)_+^2 + 30(u)_+, \qquad \phi^{\mathrm{cut}}(u) = 12(u)_+^2 + (u)_+.$$

Plot $h_i - e_i$ for the optimal grading plan and report the associated cost.
(c) Suppose the optimal grading problem with n = 1000 can be solved on a particular machine
(say, with one, or just a few, cores) in around one second. Assuming the author of the software
took EE364a, about how long will it take to solve the optimal grading problem with n = 10000?
Give a very brief justification of your answer, no more than a few sentences.

Solution.
(a) Fortunately, this problem is convex; $C$ is convex since $\phi^{\mathrm{fill}}$ and $\phi^{\mathrm{cut}}$ are convex and increasing
and max is convex. Therefore we simply solve the problem

$$\begin{array}{ll}
\mbox{minimize} & C \\
\mbox{subject to} & |h_{i+1} - h_i|/d \leq D^{(1)}, \quad i = 1, \ldots, n-1 \\
& |h_{i+1} - 2h_i + h_{i-1}|/d^2 \leq D^{(2)}, \quad i = 2, \ldots, n-1 \\
& |h_{i+1} - 3h_i + 3h_{i-1} - h_{i-2}|/d^3 \leq D^{(3)}, \quad i = 3, \ldots, n-1,
\end{array}$$

with optimization variables $h_i$.
(b) For this problem instance we have cost functions shown below.

[Figure: fill and cut cost functions; cost (0 to 1400) versus elevation change (0 to 10).]

The following code solves the problem:

% Least-cost road grading.
road_grading_data;

cvx_begin
    variables h(n);
    minimize sum(alpha_fill*square_pos(h-e)+beta_fill*pos(h-e)+...
        alpha_cut*square_pos(e-h)+beta_cut*pos(e-h))
    subject to
        abs(h(2:n)-h(1:n-1)) <= D1*d
        abs(h(3:n)-2*h(2:n-1)+h(1:n-2)) <= D2*d^2
        abs(-h(4:n)+3*h(3:n-1)-3*h(2:n-2)+h(1:n-3)) <= D3*d^3
cvx_end

figure
subplot(2,1,1)
plot((0:n-1)*d,e,'--r');
ylabel('elevation');
hold on
plot((0:n-1)*d,h,'b')
legend('e','h');
subplot(2,1,2)
plot((0:n-1)*d,h-e)
ylabel('elevation change')
xlabel('distance')

print -depsc road_grading;

[Figure: top panel, elevation of e and h versus distance; bottom panel, elevation change versus distance (0 to 100).]

We find that the optimal cost is 7562.82.


(c) Using our knowledge from EE364a we see immediately that this problem is well structured.
Our objective is separable and our constraints have bandwidth of at most 4. For each Newton
step we have to solve a banded system, which we know we can do in $O(nk^2)$ flops, where
$k = 4$ is the bandwidth. We can, therefore, take a Newton step in $O(n)$ flops. We have seen
that the number of iterations required to solve a problem with an interior point method is
practically independent of problem size. Thus, if an optimal grading problem with $n = 1000$
can be solved in about 1 second, we can solve a problem with $n = 10000$ in approximately 10
seconds.
To see how this banded system arises, using EE364a knowledge you might solve this problem
using an interior point barrier method. For a given $t$ you are now solving the unconstrained
minimization problem

$$\begin{array}{ll}
\mbox{minimize} & tC - \sum_{i=1}^{n-1} \left( \log(D^{(1)} d - h_{i+1} + h_i) + \log(D^{(1)} d + h_{i+1} - h_i) \right) \\
& {} - \sum_{i=2}^{n-1} \log(D^{(2)} d^2 - h_{i+1} + 2h_i - h_{i-1}) \\
& {} - \sum_{i=2}^{n-1} \log(D^{(2)} d^2 + h_{i+1} - 2h_i + h_{i-1}) \\
& {} - \sum_{i=3}^{n-1} \log(D^{(3)} d^3 - h_{i+1} + 3h_i - 3h_{i-1} + h_{i-2}) \\
& {} - \sum_{i=3}^{n-1} \log(D^{(3)} d^3 + h_{i+1} - 3h_i + 3h_{i-1} - h_{i-2}).
\end{array}$$

We can represent the Hessian of this function as the sum of 4 matrices: the Hessian relating to
$C$ and the Hessians relating to constraints with $D^{(1)}$, $D^{(2)}$, and $D^{(3)}$. Since $C$ is separable in the
optimization variables $h_i$, its Hessian is diagonal. The constraints with $D^{(1)}$ will contribute a
matrix of bandwidth 2 to the Hessian, as there is only coupling between $h_i$ and $h_{i+1}$. Similarly
the terms related to $D^{(2)}$ will contribute a matrix of bandwidth 3, and the terms related to $D^{(3)}$
will contribute a matrix of bandwidth 4. Therefore at each Newton step, as already stated, we
must solve a system with bandwidth 4. As the problem size increases, the bandwidth remains
constant, so we expect the problem to scale linearly in $n$ until we run into system-related
issues such as memory limitations. The code below generates problem instances from $n = 100$
to $n = 1000$:

% Least-cost road grading timing for different problem sizes.
road_grading_data

N = 10;
times = zeros(N,1);
e2 = [];
for i = 1:N
    e2 = [e2; e];
    tic
    cvx_begin
        variables h(i*n);
        minimize sum(alpha_fill*square_pos(h-e2)+...
            beta_fill*pos(h-e2)+...
            alpha_cut*square_pos(e2-h)+beta_cut*pos(e2-h))
        subject to
            abs(h(2:end)-h(1:end-1)) <= D1*d
            abs(h(3:end)-2*h(2:end-1)+h(1:end-2)) <= D2*d^2
            abs(-h(4:end)+3*h(3:end-1)-3*h(2:end-2)+h(1:end-3)) <= D3*d^3
    cvx_end
    times(i) = toc;
end

figure
plot((1:N)*n,times);
xlabel('n');
ylabel('time in seconds');
print -depsc road_grading_timing.eps

[Figure: solve time in seconds versus n, for n = 100 to 1000.]

We see that in SDPT3 the timing is indeed linear with problem size (with an offset for the
CVX overhead).
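
The banded structure described in part (c) can be checked numerically. The sketch below builds the sparsity pattern of a matrix of the form diagonal plus $A^T A$ for the first, second, and third difference operators, and reports how far from the diagonal its nonzeros extend. The helper names are illustrative, and the unit weights stand in for the positive barrier weights, which do not change the pattern.

```python
import numpy as np

def difference_matrix(n, order):
    """Matrix mapping h to its order-th differences (one row per constraint)."""
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)
    return D

def half_bandwidth(H, tol=1e-12):
    """Largest |i - j| over the nonzero entries H[i, j]."""
    rows, cols = np.nonzero(np.abs(H) > tol)
    return int(np.max(np.abs(rows - cols)))

def hessian_pattern(n):
    """Diagonal term (from the separable cost) plus one A^T A term per constraint family."""
    H = np.eye(n)
    for order in (1, 2, 3):
        A = difference_matrix(n, order)
        H += A.T @ A
    return H

hb_small = half_bandwidth(hessian_pattern(12))
hb_large = half_bandwidth(hessian_pattern(60))
print(hb_small, hb_large)
```

The nonzeros stay within three positions of the diagonal for both sizes, which corresponds to each third-difference constraint coupling four consecutive heights; the width does not grow with $n$.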

14.13 Lightest structure that resists a set of loads. We consider a mechanical structure in 2D (for simplicity)
which consists of a set of $m$ nodes, with known positions $p_1, \ldots, p_m \in \mathbf{R}^2$, connected by a set
of $n$ bars (also called struts or elements), with cross-sectional areas $a_1, \ldots, a_n \in \mathbf{R}_+$, and internal
tensions $t_1, \ldots, t_n \in \mathbf{R}$.
Bar $j$ is connected between nodes $r_j$ and $s_j$. (The indices $r_1, \ldots, r_n$ and $s_1, \ldots, s_n$ give the structure
topology.) The length of bar $j$ is $L_j = \|p_{r_j} - p_{s_j}\|_2$, and the total volume of the bars is
$V = \sum_{j=1}^n a_j L_j$. (The total weight is proportional to the total volume.)

Bar $j$ applies a force $(t_j/L_j)(p_{r_j} - p_{s_j}) \in \mathbf{R}^2$ to node $s_j$ and the negative of this force to node $r_j$.
Thus, positive tension in a bar pulls its two adjacent nodes towards each other; negative tension
(also called compression) pushes them apart. The ratio of the tension in a bar to its cross-sectional
area is limited by its yield strength, which is symmetric in tension and compression: $|t_j| \leq \sigma a_j$,
where $\sigma > 0$ is a known constant that depends on the material.
The nodes are divided into two groups: free and fixed. We will take nodes $1, \ldots, k$ to be free,
and nodes $k + 1, \ldots, m$ to be fixed. Roughly speaking, the fixed nodes are firmly attached to the
ground, or a rigid structure connected to the ground; the free ones are not.

A loading consists of a set of external forces, $f_1, \ldots, f_k \in \mathbf{R}^2$ applied to the free nodes. Each free
node must be in equilibrium, which means that the sum of the forces applied to it by the bars and
the external force is zero. The structure can resist a loading (without collapsing) if there exists a
set of bar tensions that satisfy the tension bounds and force equilibrium constraints. (For those
with knowledge of statics, these conditions correspond to a structure made entirely with pin joints.)
Finally, we get to the problem. You are given a set of $M$ loadings, i.e., $f_1^{(i)}, \ldots, f_k^{(i)} \in \mathbf{R}^2$,
$i = 1, \ldots, M$. The goal is to find the bar cross-sectional areas that minimize the structure volume
$V$ while resisting all of the given loadings. (Thus, you are to find one set of bar cross-sectional areas,
and $M$ sets of tensions.) Using the problem data provided in lightest_struct_data.m, report $V^\star$
and $V^{\mathrm{unif}}$, the smallest feasible structure volume when all bars have the same cross-sectional area.
The node positions are given as a $2 \times m$ matrix P, and the loadings as a $2 \times k \times M$ array F. Use
the code included in the data file to visualize the structure with the bar cross-sectional areas that
you find, and provide the plot in your solution.
Hint. You might find the graph incidence matrix $A \in \mathbf{R}^{m \times n}$ useful. It is defined as

$$A_{ij} = \left\{ \begin{array}{ll}
+1 & i = r_j \\
-1 & i = s_j \\
0 & \mbox{otherwise.}
\end{array} \right.$$
Remark. You could reasonably ask, "Does a mechanical structure really solve a convex optimization
problem to determine whether it should collapse?" It sounds odd, but the answer is, yes it does.
Solution. We can use the graph incidence matrix $A$ to express the force exerted by the bars on
each node. For a loading $f_1, \ldots, f_k$, define $G \in \mathbf{R}^{2 \times m}$ such that $g_i$, the $i$th column of $G$, is the sum
of the forces from each of the bars connected to node $i$. Then $G = -PADA^T$, where $P \in \mathbf{R}^{2 \times m}$
is a matrix whose $i$th column is $p_i$, and $D \in \mathbf{R}^{n \times n}$ is a diagonal matrix such that $D_{jj} = t_j/L_j$.
(To see this, note that the $i$th column of $-PAD$ is $-(t_i/L_i)(p_{r_i} - p_{s_i})$, the force that bar $i$ applies
to the adjacent node $r_i$.) The force equilibrium constraints for the loading can then be written as
$g_i + f_i = 0$, for $i = 1, \ldots, k$. A single set of bar cross-sectional areas must satisfy the equilibrium
constraints for each of the $M$ loadings. The remaining constraints are easily formulated.
We find that $V^\star = 188.55$, and $V^{\mathrm{unif}} = 492.00$.
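
To make the incidence-matrix expression concrete, here is a small sketch on a toy truss with three nodes and two bars. The node positions, topology, and tensions are hypothetical (they are not the data in lightest_struct_data.m), and indices are 0-based as usual in Python.

```python
import numpy as np

# Hypothetical toy truss: column i of P is the position p_i.
P = np.array([[0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
r = [0, 1]      # bar j connects nodes r[j] and s[j]
s = [1, 2]
m, n = P.shape[1], len(r)

# Graph incidence matrix: +1 at (r_j, j), -1 at (s_j, j).
A = np.zeros((m, n))
for j in range(n):
    A[r[j], j] = +1.0
    A[s[j], j] = -1.0

bars = P @ A                          # column j is p_{r_j} - p_{s_j}
L = np.linalg.norm(bars, axis=0)      # bar lengths L_j
t = np.array([2.0, -3.0])             # tensions; a negative value is compression
G = -P @ A @ np.diag(t / L) @ A.T     # column i is the net bar force on node i
print(G)
```

Bar 0 is in tension, so it pulls nodes 0 and 1 toward each other; bar 1 is in compression, so it pushes nodes 1 and 2 apart. The columns of G sum to zero because every bar applies equal and opposite forces to its two end nodes.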

The following code solves the problem.

% lightest structure that resists a set of loads
lightest_struct_data;

% form incidence matrix
A = zeros(m, n);
for i = 1:n
    A(r(i), i) = +1;
    A(s(i), i) = -1;
end

% bar lengths as a column vector
L = norms(P*A)';

% solve with all bars having same cross-sectional area
cvx_begin quiet
    variables a(n) t(n, M)
    expression G(2, m, M) % force due to bars
    minimize (a'*L)
    subject to
        a == mean(a);
        for i = 1:M
            abs(t(:, i)) <= sigma.*a;
            G(:, :, i) = -P*A*diag(t(:, i)./L)*A';
            G(:, 1:k, i) + F(:, :, i) == 0;
        end
cvx_end
fprintf('V^unif = %f\n', cvx_optval);

% plot
clf;
subplot(1,2,1); hold on;
for i = 1:n
    p1 = r(i); p2 = s(i);
    plt_str = 'b-';
    if a(i) < 0.001
        plt_str = 'r--';
    end
    plot([P(1, p1) P(1, p2)], [P(2, p1) P(2, p2)], ...
        plt_str, 'LineWidth', a(i));
end
axis([-0.5 N-0.5 -0.1 N-0.5]); axis square; box on;
set(gca, 'xtick', [], 'ytick', []);
hold off;

% solve with bars having different cross-sectional areas
cvx_begin quiet
    variables a(n) t(n, M)
    expression G(2, m, M) % force due to bars
    minimize (a'*L)
    subject to
        for i = 1:M
            abs(t(:, i)) <= sigma.*a;
            G(:, :, i) = -P*A*diag(t(:, i)./L)*A';
            G(:, 1:k, i) + F(:, :, i) == 0;
        end
cvx_end
fprintf('V^star = %f\n', cvx_optval);

% plot; bars with (numerically) zero area are drawn dashed in red
subplot(1,2,2); hold on;
for i = 1:n
    p1 = r(i); p2 = s(i);
    plt_str = 'b-';
    width = a(i);
    if a(i) < 0.001
        plt_str = 'r--';
        width = 1;
    end
    plot([P(1, p1) P(1, p2)], [P(2, p1) P(2, p2)], ...
        plt_str, 'LineWidth', width);
end
axis([-0.5 N-0.5 -0.1 N-0.5]); axis square; box on;
set(gca, 'xtick', [], 'ytick', []);
hold off;

print -depsc lightest_struct.eps;

14.14 Maintaining static balance. In this problem we study a human's ability to
maintain balance against an applied external force. We will use a planar
(two-dimensional) model to characterize the set of push forces a human
can sustain before he or she is unable to maintain balance. We model
the human as a linkage of 4 body segments, which we consider to be rigid
bodies: the foot, lower leg, upper leg, and pelvis (into which we lump the
upper body). The pose is given by the joint angles, but this won't matter
in this problem, since we consider a fixed pose. A set of 40 muscles act
on the body segments; each of these develops a (scalar) tension $t_i$ that
satisfies $0 \leq t_i \leq T_i^{\max}$, where $T_i^{\max}$ is the maximum possible tension for
muscle $i$. (The maximum muscle tensions depend on the pose, and the
person, but here they are known constants.) An external pushing force
$f^{\mathrm{push}} \in \mathbf{R}^2$ acts on the pelvis. Two (ground contact) forces act on the
foot: $f^{\mathrm{heel}} \in \mathbf{R}^2$ and $f^{\mathrm{toe}} \in \mathbf{R}^2$. (These are shown at right.) These must
satisfy
$$|f_1^{\mathrm{heel}}| \leq \mu f_2^{\mathrm{heel}}, \qquad |f_1^{\mathrm{toe}}| \leq \mu f_2^{\mathrm{toe}},$$
where $\mu > 0$ is the coefficient of friction of the ground. There are also joint
forces that act at the joints between the body segments, and gravity forces
for each body segment, but we won't need them explicitly in this problem.

To maintain balance, the net force and torque on each body segment must be zero. These
equations can be written out from the geometry of the body (e.g., attachment points for the
muscles) and the pose. They can be reduced to a set of 6 linear equations:

$$A^{\mathrm{musc}} t + A^{\mathrm{toe}} f^{\mathrm{toe}} + A^{\mathrm{heel}} f^{\mathrm{heel}} + A^{\mathrm{push}} f^{\mathrm{push}} = b,$$

where $t \in \mathbf{R}^{40}$ is the vector of muscle tensions, and $A^{\mathrm{musc}}$, $A^{\mathrm{toe}}$, $A^{\mathrm{heel}}$, and $A^{\mathrm{push}}$ are known matrices
and $b \in \mathbf{R}^6$ is a known vector. These data depend on the pose, body weight and dimensions, and
muscle lines of action. Fortunately for you, our biomechanics expert Apoorva has worked them
out; you will find them in static_balance_data.* (along with $T^{\max}$ and $\mu$).
We say that the push force $f^{\mathrm{push}}$ can be resisted if there exist muscle tensions and ground contact
forces that satisfy the constraints above. (This raises a philosophical question: Does a person solve
an optimization to decide whether he or she should lose their balance? In any case, this approach
makes good predictions.)
Find $F^{\mathrm{res}} \subseteq \mathbf{R}^2$, the set of push forces that can be resisted. Plot it as a shaded region.
Hints. Show that $F^{\mathrm{res}}$ is a convex set. For the given data, $0 \in F^{\mathrm{res}}$. Then for $\theta = 1^\circ, 2^\circ, \ldots, 360^\circ$,
determine the maximum push force, applied in the direction $\theta$, that can be resisted. To make
a filled region on a plot, you can use the command fill() in Matlab. For Python and Julia,
fill() is also available through PyPlot. In Julia, make sure to use the ECOS solver with
solver = ECOSSolver(verbose=false).
Remark. A person can resist a much larger force applied to the hip than you might think.

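Solution. The set of vectors $(t, f^{\mathrm{toe}}, f^{\mathrm{heel}}, f^{\mathrm{push}})$ which satisfy the constraints

$$\begin{array}{l}
A^{\mathrm{musc}} t + A^{\mathrm{toe}} f^{\mathrm{toe}} + A^{\mathrm{heel}} f^{\mathrm{heel}} + A^{\mathrm{push}} f^{\mathrm{push}} = b \\
|f_1^{\mathrm{toe}}| \leq \mu f_2^{\mathrm{toe}} \\
|f_1^{\mathrm{heel}}| \leq \mu f_2^{\mathrm{heel}} \\
0 \preceq t \preceq T^{\max},
\end{array}$$

forms a convex set, a polyhedron. It follows that the set of push forces that can be resisted, $F^{\mathrm{res}}$, is
a convex set, since it is the projection of this set onto the variable $f^{\mathrm{push}}$. In fact, $F^{\mathrm{res}}$ is also a
polyhedron.
To find the maximum push force for a given direction $\theta$, we solve the LP

$$\begin{array}{ll}
\mbox{maximize} & z \\
\mbox{subject to} & A^{\mathrm{musc}} t + A^{\mathrm{toe}} f^{\mathrm{toe}} + A^{\mathrm{heel}} f^{\mathrm{heel}} + A^{\mathrm{push}} z (\cos\theta, \sin\theta) = b \\
& |f_1^{\mathrm{toe}}| \leq \mu f_2^{\mathrm{toe}} \\
& |f_1^{\mathrm{heel}}| \leq \mu f_2^{\mathrm{heel}} \\
& 0 \preceq t \preceq T^{\max},
\end{array}$$

with variables $t$, $f^{\mathrm{toe}}$, $f^{\mathrm{heel}}$ and $z \in \mathbf{R}$. We solve this problem for a number of values of $\theta$, and for
each one we record the value $f^{\mathrm{push}} = z^\star (\cos\theta, \sin\theta)$, which is on the boundary of $F^{\mathrm{res}}$. We use
these values to fill out the set when plotting.
The set $F^{\mathrm{res}}$ for the given data is shown below.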

[Figure: the set of resistable push forces, shown as a shaded region; horizontal axis f1 (-400 to 400), vertical axis f2 (-3500 to 1000).]

We see some interesting things from the plot. One is that it's much easier to make someone lose
their static balance by pulling them up, instead of pushing them down. Another interpretation
(maybe more positive) is that a 75 kg person can maintain this posture even with a load of 3150
Newtons (321 kg, 708 lbs) attached to their hips. This is over 4 times body weight! And it's a little
easier to make them lose their balance pushing them forward, compared to pulling them backward.
This analysis does not take into account other factors such as the maximum compressive load that
can be supported by bones and joints. That said, the human body can produce and withstand
extremely high forces. For reference, in running, the force in the Achilles tendon can be 6 to 8
times body weight and compressive forces in the lower leg can be 10 to 14 times body weight.
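
The unit conversions quoted above are easy to verify. The short check below assumes standard gravity of 9.81 m/s^2 and 2.20462 pounds per kilogram; both constants are assumptions, since the text does not state which values were used.

```python
G_ACCEL = 9.81          # m/s^2, assumed value of standard gravity
LB_PER_KG = 2.20462     # pounds per kilogram

load_newtons = 3150.0   # load quoted in the text
body_mass_kg = 75.0     # body mass quoted in the text

load_kg = load_newtons / G_ACCEL                  # mass whose weight equals the load
load_lb = load_kg * LB_PER_KG
ratio = load_newtons / (body_mass_kg * G_ACCEL)   # load as a multiple of body weight

print(round(load_kg), round(load_lb), round(ratio, 2))
```

With these constants the load comes out at roughly 321 kg, 708 lb, and a little over 4 times body weight, in line with the figures in the text.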
The following Matlab code solves the problem.

static_balance_data

theta_push = pi/180.*(0:1:360);
f_push_max = zeros(size(theta_push));
t_muscle = zeros(n_musc, length(theta_push));

for i = 1:length(theta_push)
    theta = theta_push(i);
    cvx_begin
        cvx_quiet(true)
        variable f_push;
        variable t(40,1);
        variables f_toe(2,1) f_heel(2,1);

        maximize(f_push);

        A_musc*t + A_heel*f_heel + A_toe*f_toe + ...
            A_push*f_push*[cos(theta); sin(theta)] == b;
        abs(f_toe(1)) <= mu*f_toe(2);
        abs(f_heel(1)) <= mu*f_heel(2);
        0 <= t;
        t <= T_max;
    cvx_end
    f_push_max(i) = f_push;
    t_muscle(:,i) = t;
end

% plot results
figure
fill(f_push_max.*cos(theta_push), f_push_max.*sin(theta_push), 'c'), hold on
xlabel('f^{push}_1 (Newtons)')
ylabel('f^{push}_2 (Newtons)')
plot([-400,400], [0,0], 'k--'), hold on
plot([0,0], [-3500,1000], 'k--')

print -depsc static_balance_fres_mat

The following Python code solves the problem.

# solution to static balance problem
# Note: written for the pre-1.0 CVXPY interface
# (Variable(rows, cols), * for matrix products).
import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
import matplotlib

from static_balance_data import *

theta_push = np.pi/180. * np.arange(360)
f_push_max = np.zeros(len(theta_push))

for i in range(len(theta_push)):
    theta = theta_push[i]
    f_push = cvx.Variable(1)
    t = cvx.Variable(40,1)
    f_toe = cvx.Variable(2,1)
    f_heel = cvx.Variable(2,1)

    constr = [A_musc*t + A_heel*f_heel + A_toe*f_toe \
              + A_push*f_push*np.array([np.cos(theta), np.sin(theta)]) == b,
              cvx.abs(f_toe[0]) <= mu*f_toe[1],
              cvx.abs(f_heel[0]) <= mu*f_heel[1],
              0 <= t,
              t <= T_max]

    p = cvx.Problem(cvx.Maximize(f_push), constr)
    p.solve(verbose = False)

    f_push_max[i] = f_push.value

# plot results
plt.figure(1)
plt.fill(f_push_max*np.cos(theta_push), f_push_max*np.sin(theta_push), 'c')
plt.plot(np.array([-400,400]), np.array([0,0]), 'k--')
plt.plot(np.array([0,0]), np.array([-3500,1000]), 'k--')
plt.xlabel('$f^\\mathrm{push}_1$ (Newtons)')
plt.ylabel('$f^\\mathrm{push}_2$ (Newtons)')

plt.savefig('static_balance_fres_py.eps')

plt.show()

The following Julia code solves the problem.

include("static_balance_data.jl");
using Convex, ECOS, PyPlot
solver = ECOSSolver(verbose=false);

theta_push = pi/180 * collect(0:359);
f_push_max = zeros(length(theta_push));

for i = 1:length(theta_push)
    theta = theta_push[i];
    f_push = Variable(1);
    t = Variable(40);
    f_toe = Variable(2);
    f_heel = Variable(2);

    # the push direction is a column vector, as in the Matlab version
    constraints = A_musc*t + A_heel*f_heel + A_toe*f_toe +
        A_push*f_push*[cos(theta); sin(theta)] == b;
    constraints += abs(f_toe[1]) <= mu*f_toe[2];
    constraints += abs(f_heel[1]) <= mu*f_heel[2];
    constraints += 0 <= t;
    constraints += t <= T_max;

    prob = maximize(f_push, constraints);
    solve!(prob, solver)

    f_push_max[i] = prob.optval
end

# plot results
fill(f_push_max.*cos(theta_push), f_push_max.*sin(theta_push),"c")
plot([-400,400], [0,0], "k--")
plot([0,0], [-3500,1000], "k--")
xlabel("\$f^\\mathrm{push}_1\$ (Newtons)")
ylabel("\$f^\\mathrm{push}_2\$ (Newtons)")
savefig("static_balance_fres_jl.eps")

14.15 Minimum time maneuver for a crane. A crane manipulates a load with mass $m > 0$ in two
dimensions using two cables attached to the load. The cables maintain angles $\pm\theta$ with respect to
vertical, as shown below.

[Figure: the load suspended from the two cables.]

The (scalar) tensions $T^{\mathrm{left}}$ and $T^{\mathrm{right}}$ in the two cables are independently controllable, from 0 up
to a given maximum tension $T^{\max}$. The total force on the load is

$$F = T^{\mathrm{left}} \left[ \begin{array}{c} -\sin\theta \\ \cos\theta \end{array} \right]
+ T^{\mathrm{right}} \left[ \begin{array}{c} \sin\theta \\ \cos\theta \end{array} \right] + mg,$$

where $g = (0, -9.8)$ is the acceleration due to gravity. The acceleration of the load is then $F/m$.
We approximate the motion of the load using
$$p_{i+1} = p_i + h v_i, \qquad v_{i+1} = v_i + (h/m) F_i, \qquad i = 1, 2, \ldots,$$
where $p_i \in \mathbf{R}^2$ is the position of the load, $v_i \in \mathbf{R}^2$ is the velocity of the load, and $F_i \in \mathbf{R}^2$ is the
force on the load, at time $t = ih$. Here $h > 0$ is a small (given) time step.
The goal is to move the load, which is initially at rest at position $p^{\mathrm{init}}$, to the position $p^{\mathrm{des}}$, also at
rest, in minimum time. In other words, we seek the smallest $k$ for which
$$p_1 = p^{\mathrm{init}}, \qquad p_k = p^{\mathrm{des}}, \qquad v_1 = v_k = (0, 0)$$
is possible, subject to the constraints described above.
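
The discretized dynamics can be simulated directly for any given tension sequence. The following sketch, with a hypothetical function name and toy inputs, propagates the recursion from rest; it only simulates and does not search for the minimum-time trajectory.

```python
import math

def simulate(tensions, theta, m, h, p_init, g=(0.0, -9.8)):
    """Propagate p_{i+1} = p_i + h v_i, v_{i+1} = v_i + (h/m) F_i, starting at rest.

    tensions is a list of (T_left, T_right) pairs, one per time step.
    Returns the position and velocity lists (each one entry longer than tensions).
    """
    p, v = [tuple(p_init)], [(0.0, 0.0)]
    for t_left, t_right in tensions:
        # Total force: cable forces along the cable directions, plus gravity.
        fx = (t_right - t_left) * math.sin(theta) + m * g[0]
        fy = (t_left + t_right) * math.cos(theta) + m * g[1]
        px, py = p[-1]
        vx, vy = v[-1]
        p.append((px + h * vx, py + h * vy))
        v.append((vx + (h / m) * fx, vy + (h / m) * fy))
    return p, v

theta = math.radians(15)
m, h = 0.1, 0.1
# Equal tensions that exactly cancel gravity keep the load hovering at rest.
t_hover = m * 9.8 / (2 * math.cos(theta))
p_hover, v_hover = simulate([(t_hover, t_hover)] * 20, theta, m, h, (0.0, 0.0))
# Full tension in the right cable only accelerates the load to the right and upward.
p_right, v_right = simulate([(0.0, 2.0)] * 5, theta, m, h, (0.0, 0.0))
print(p_hover[-1], p_right[-1])
```

With $m = 0.1$ and $\theta = 15^\circ$ the hover tension is about 0.51 per cable, so holding the load at rest is possible whenever $T^{\max}$ is at least that large.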

(a) Explain how to solve this problem using convex (or quasiconvex) optimization.
(b) Carry out the method of part (a) for the problem instance with
$$m = 0.1, \quad \theta = 15^\circ, \quad T^{\max} = 2, \quad p^{\mathrm{init}} = (0, 0), \quad p^{\mathrm{des}} = (10, 2),$$
with time step $h = 0.1$. Report the minimum time $k^\star$. Plot the tensions versus time, and the
load trajectory, i.e., the points $p_1, \ldots, p_k$ in $\mathbf{R}^2$. Does the load move along the line segment
between $p^{\mathrm{init}}$ and $p^{\mathrm{des}}$ (i.e., the shortest path between $p^{\mathrm{init}}$ and $p^{\mathrm{des}}$)? Comment briefly.

Solution.
(a) The problem as stated is quasiconvex: To see if $k^\star \leq k$, we simply check if there exists a set
of variables that satisfy the constraints, together with $p_k = p^{\mathrm{des}}$, $v_k = 0$.
For a given value for $k$, we can solve a convex feasibility problem (in fact, an LP) to determine
if such a trajectory exists. Let $T \in \mathbf{R}^{2 \times (k-1)}$ be a matrix of the tensions, so that $T_{1i}$, $T_{2i}$
are $T^{\mathrm{left}}$, $T^{\mathrm{right}}$ at time $ih$, respectively. Then the force applied to the load at time $ih$ is
$F_i = M T_i + mg$, where $T_i$ is the $i$th column of $T$ and
$$M = \left[ \begin{array}{cc} -\sin\theta & \sin\theta \\ \cos\theta & \cos\theta \end{array} \right].$$
To find a feasible trajectory, we solve the LP
$$\begin{array}{ll}
\mbox{minimize} & 0 \\
\mbox{subject to} & 0 \preceq T \preceq T^{\max}, \\
& v_{i+1} = v_i + (h/m) F_i, \quad i = 1, \ldots, k-1, \\
& p_{i+1} = p_i + h v_i, \quad i = 1, \ldots, k-1, \\
& p_1 = p^{\mathrm{init}}, \quad p_k = p^{\mathrm{des}}, \quad v_1 = v_k = 0.
\end{array}$$
We can then find the minimum time by finding the smallest $k$ for which the above problem is
feasible. This can be done by bisection, or by simply increasing $k$ until the problem becomes
feasible.
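
The bisection over $k$ can be sketched independently of the solver. In the code below, the function name and the stand-in oracle are hypothetical: in the crane problem the oracle would solve the feasibility LP for horizon $k$, whereas here a simple threshold (set to 34 purely for illustration) plays that role. The sketch assumes feasibility is monotone in $k$.

```python
def smallest_feasible(is_feasible, lower, upper):
    """Smallest k in (lower, upper] for which is_feasible(k) is True.

    Assumes is_feasible(lower) is False, is_feasible(upper) is True, and that
    feasibility is monotone: once some k is feasible, every larger k is too.
    """
    while upper - lower > 1:
        k = (lower + upper) // 2
        if is_feasible(k):
            upper = k       # k works, so the answer is at most k
        else:
            lower = k       # k fails, so the answer is larger than k
    return upper

# Stand-in oracle (hypothetical): records each query and applies a threshold.
calls = []
def toy_oracle(k, threshold=34):
    calls.append(k)
    return k >= threshold

k_star = smallest_feasible(toy_oracle, lower=10, upper=50)
print(k_star, calls)
```

Starting from the bracket (10, 50], the loop needs only a handful of oracle calls, compared with up to 40 if $k$ were increased one step at a time.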

(b) We find that $k^\star = 34$, corresponding to $t = 3.4$ seconds. From the trajectory plot, we see that
the load does not travel along the line between the initial and final positions. Since the load
must cross a large horizontal distance, we maximize the horizontal force, which is accomplished
by setting the tension in the right cable to $T^{\max}$. As the tension produces a force along the
line of the cable, the load rises up in addition to accelerating in the horizontal direction.
The code is shown below in Matlab.
[Figure: top panel, trajectory; bottom panel, tensions versus time step (0 to 35).]


clear;
% Angle of cables with respect to vertical
% For convenience, we put the coefficients in a matrix
theta = 15*pi/180;
M = [-sin(theta), sin(theta); cos(theta), cos(theta)];
T_max = 2; % Max tension that each cable can apply (kNewtons)
m = 0.1; % Mass of the load (metric tons)
g = [0;-9.8]; % Gravity (m/s^2)
p_init = [0;0]; % Init position (m)
p_des = [10;2]; % Desired position (m)
h = 0.1; % Simulation timestep (s)

T_feasible = 0;
p_feasible = 0;

%% Run the problem
lower = 10; % A lower bound obtained by rough check (infeasible)
upper = 50; % An upper bound obtained by rough check (feasible)
while lower + 1 ~= upper
    k = floor((lower+upper)/2);
    disp(['checking: k= ' num2str(k) ...
          ', lower= ' num2str(lower) ...
          ', upper= ' num2str(upper)]);
    cvx_begin quiet
        variables T(2,k-1) p(2,k) v(2,k)
        F = M*T + m*repmat(g,1,k-1);
        minimize 0
        subject to
            p(:,1) == p_init; p(:,end) == p_des;
            v(:,1) == 0; v(:,end) == 0;
            0 <= T <= T_max;
            v(:,2:end) == v(:,1:end-1) + h/m*F;
            p(:,2:end) == p(:,1:end-1) + h*v(:,1:end-1);
    cvx_end
    if cvx_optval == 0
        upper = k;
        T_feasible = T;
        p_feasible = p;
    else
        lower = k;
    end
end
k = upper

%% Plotting
% Use the stored feasible trajectory; the last solve may have been infeasible.
figure(1); clf;
subplot(2,1,1); plot(p_feasible(1,:),p_feasible(2,:)); axis equal;
title('trajectory');
subplot(2,1,2); plot(T_feasible(1,:)); hold on; plot(T_feasible(2,:),'r');
title('tensions');
print -depsc crane_no_constraint

For Python, the code is given below.

[Figure: top panel, trajectory; bottom panel, tensions versus time step (0 to 35).]

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

# Note: written for the pre-1.0 CVXPY interface
# (Variable(rows, cols), * for matrix products).
theta = 15*np.pi/180.0
M = np.matrix([[-np.sin(theta), np.sin(theta)],
               [np.cos(theta), np.cos(theta)]])
T_max = 2.0 # Max tension that each cable can apply (kNewtons)
m = 0.1 # Mass of the load (Metric tons)
g = np.matrix('0;-9.8') # Gravity (m/s^2)
p_init = np.matrix('0.0;0.0') # Init position (m)
p_des = np.matrix('10.0;2.0') # Desired position (m)
h = 0.1 # Simulation timestep (s)

T_feasible = 0
p_feasible = 0

# Run bisection
lower = 10 # Determined by rough check, infeasible
upper = 50 # Determined by rough check, feasible
while not lower + 1 == upper:
    k = int((upper+lower)/2)
    print('checking k= ' + str(k) +
          ', lower= ' + str(lower) +
          ', upper= ' + str(upper))

    T = cvx.Variable(2,k-1)
    v = cvx.Variable(2,k)
    p = cvx.Variable(2,k)

    F = M*T + m*np.tile(g,(1,k-1))

    constraints = [0 <= T, T <= T_max]
    constraints += [p[:,0] == p_init, p[:,k-1] == p_des]
    constraints += [v[:,0] == 0, v[:,k-1] == 0]
    constraints += [v[:,1:k] == v[:,0:k-1] + (h/m)*F]
    constraints += [p[:,1:k] == p[:,0:k-1] + h*v[:,0:k-1]]

    prob = cvx.Problem(cvx.Minimize(0),constraints)
    opt_val = prob.solve(solver=cvx.ECOS,verbose=False)

    if opt_val == 0:
        upper = k
        T_feasible = T.value
        p_feasible = p.value
    else:
        lower = k

k = upper
print('minimum is ' + str(k))
plt.subplot(2,1,1)
plt.plot(p_feasible[0,:].T,p_feasible[1,:].T)
plt.title('trajectory')
plt.subplot(2,1,2)
plt.plot(T_feasible.T)
plt.title('tensions')
plt.savefig('crane_no_constraint.eps',format='ps')

The Julia code follows.

[Figure: top panel, trajectory; bottom panel, tensions versus time step (0 to 35).]

using Convex, SCS, PyPlot;
using ECOS;

theta = 15*pi/180;
M = [-sin(theta) sin(theta); cos(theta) cos(theta)];
T_max = 2.0; # Max tension that each cable can apply (kNewtons)
m = 0.1; # Mass of the load (Metric tons)
g = [0; -9.8]; # Gravity (m/s^2)
p_init = zeros(2,1); # Init position (m)
p_des = [10.0; 2]; # Desired position (m)
h = 0.1; # Simulation timestep (s)

T_feasible = 0;
p_feasible = 0;

lower = 10; # Determined by rough check (infeasible)
upper = 50; # Determined by rough check (feasible)
while lower + 1 != upper
    k = div(lower + upper, 2); # integer midpoint, as floor() in the Matlab version
    println(string("checking k=",k,", lower=",lower, ", upper=", upper));
    T = Variable(2,k-1);
    v = Variable(2,k);
    p = Variable(2,k);

    F = M*T + m*repmat(g,1,k-1);

    constraints = [0 <= T, T <= T_max];
    constraints += [p[:,1] == p_init, p[:,end] == p_des];
    constraints += [v[:,1] == 0, v[:,end] == 0];
    constraints += v[:,2:end] == v[:,1:end-1] + (h/m)*F;
    constraints += p[:,2:end] == p[:,1:end-1] + h*v[:,1:end-1];

    prob = satisfy(constraints);

    #solve!(prob,SCSSolver(verbose=false));
    solve!(prob,ECOSSolver(maxit=20000,eps=1e-4,verbose=false));

    println(prob.status);
    if prob.status == :Optimal
        upper = k;
        T_feasible = T.value;
        p_feasible = p.value;
    elseif prob.status == :Infeasible
        lower = k;
    else
        println("solve failed!!!");
        break; # neither bound can be updated, so stop instead of looping forever
    end
end
k = upper;
println(string("minimum is ",k));

fig = figure("fig1",figsize=(12,8));
subplot(211);
plot(p_feasible[1,:],p_feasible[2,:]);
title("trajectory")
subplot(212);
plot(T_feasible);
title("tensions")
savefig("crane_no_constraint.eps",format="ps");

15 Graphs and networks
15.1 A hypergraph with nodes $1, \ldots, m$ is a set of nonempty subsets of $\{1, 2, \ldots, m\}$, called edges. An
ordinary graph is a special case in which the edges contain no more than two nodes.
We consider a hypergraph with $m$ nodes and assume coordinate vectors $x_j \in \mathbf{R}^p$, $j = 1, \ldots, m$, are
associated with the nodes. Some nodes are fixed and their coordinate vectors $x_j$ are given. The
other nodes are free, and their coordinate vectors will be the optimization variables in the problem.
The objective is to place the free nodes in such a way that some measure of the physical size of the
nets is small.
As an example application, we can think of the nodes as modules in an integrated circuit, placed
at positions $x_j \in \mathbf{R}^2$. Every edge is an interconnect network that carries a signal from one module
to one or more other modules.
To define a measure of the size of a net, we store the vectors $x_j$ as columns of a matrix $X \in \mathbf{R}^{p \times m}$.
For each edge $S$ in the hypergraph, we use $X_S$ to denote the $p \times |S|$ submatrix of $X$ with the
columns associated with the nodes of $S$. We define

$$f_S(X) = \inf_y \|X_S - y\mathbf{1}^T\| \qquad (60)$$

as the size of the edge $S$, where $\|\cdot\|$ is a matrix norm, and $\mathbf{1}$ is a vector of ones of length $|S|$.

(a) Show that the optimization problem

$$\mbox{minimize} \quad \sum_{\mathrm{edges}\ S} f_S(X)$$

is convex in the free node coordinates $x_j$.

(b) The size $f_S(X)$ of a net $S$ obviously depends on the norm used in the definition (60). We
consider five norms.
Frobenius norm:
$$\|X_S - y\mathbf{1}^T\|_F = \left( \sum_{j \in S} \sum_{i=1}^p (x_{ij} - y_i)^2 \right)^{1/2}.$$
Maximum Euclidean column norm:
$$\|X_S - y\mathbf{1}^T\|_{2,1} = \max_{j \in S} \left( \sum_{i=1}^p (x_{ij} - y_i)^2 \right)^{1/2}.$$
Maximum column sum norm:
$$\|X_S - y\mathbf{1}^T\|_{1,1} = \max_{j \in S} \sum_{i=1}^p |x_{ij} - y_i|.$$
Sum of absolute values norm:
$$\|X_S - y\mathbf{1}^T\|_{\mathrm{sav}} = \sum_{j \in S} \sum_{i=1}^p |x_{ij} - y_i|.$$
Sum-row-max norm:
$$\|X_S - y\mathbf{1}^T\|_{\mathrm{srm}} = \sum_{i=1}^p \max_{j \in S} |x_{ij} - y_i|.$$
For which of these norms does $f_S$ have the following interpretations?
(i) $f_S(X)$ is the radius of the smallest Euclidean ball that contains the nodes of $S$.
(ii) $f_S(X)$ is (proportional to) the perimeter of the smallest rectangle that contains the nodes
of $S$:
$$f_S(X) = \frac{1}{4} \sum_{i=1}^p \left( \max_{j \in S} x_{ij} - \min_{j \in S} x_{ij} \right).$$
(iii) $f_S(X)$ is the square root of the sum of the squares of the Euclidean distances to the mean
of the coordinates of the nodes in $S$:
$$f_S(X) = \left( \sum_{j \in S} \|x_j - \bar{x}\|_2^2 \right)^{1/2} \quad \mbox{where} \quad \bar{x}_i = \frac{1}{|S|} \sum_{k \in S} x_{ik}, \quad i = 1, \ldots, p.$$
(iv) $f_S(X)$ is the sum of the $\ell_1$-distances to the (coordinate-wise) median of the coordinates
of the nodes in $S$:
$$f_S(X) = \sum_{j \in S} \|x_j - \hat{x}\|_1 \quad \mbox{where} \quad \hat{x}_i = \mathrm{median}(\{x_{ik} \mid k \in S\}), \quad i = 1, \ldots, p.$$

Solution.
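(a) Follows from the fact that $\|X_S - y\mathbf{1}^T\|$ is convex jointly in $X$ and $y$, so that $f_S$, obtained by
minimizing over $y$, is convex in $X$.
(b) (i) The maximum Euclidean column norm:
$$\|X_S - y\mathbf{1}^T\|_{2,1} = \max_{j \in S} \|x_j - y\|_2$$
is minimized by putting $y$ at the center of the smallest Euclidean ball containing $x_j$, $j \in S$.
(ii) The sum-row-max norm
$$\|X_S - y\mathbf{1}^T\|_{\mathrm{srm}} = \sum_{i=1}^p \max_{j \in S} |x_{ij} - y_i|$$
is minimized by placing $y$ at the midpoint $y_i = (1/2)(\max_{j \in S} x_{ij} + \min_{j \in S} x_{ij})$.
(iii) The Frobenius norm
$$\|X_S - y\mathbf{1}^T\|_F^2 = \sum_{j \in S} \|x_j - y\|_2^2$$
is minimized by $y = (1/|S|) \sum_{j \in S} x_j$.
(iv) The sum of absolute values norm
$$\|X_S - y\mathbf{1}^T\|_{\mathrm{sav}} = \sum_{i=1}^p \sum_{j \in S} |x_{ij} - y_i|$$
can be minimized for each $i$ independently. The minimizer of $\sum_{j \in S} |x_{ij} - y_i|$ is the median
of $\{x_{ij} \mid j \in S\}$.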
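
The three one-dimensional minimizers used in this solution (mean, median, and midpoint) can be checked by brute force on a single coordinate. The data and the grid search below are illustrative assumptions; a grid search is used only because it makes no appeal to the results being verified.

```python
import statistics

# One coordinate of the nodes in an edge S (toy data).
x = [1.0, 2.0, 4.0, 9.0, 10.0]

def sum_sq(y):
    """Sum of squared deviations (Frobenius case, one coordinate)."""
    return sum((xi - y) ** 2 for xi in x)

def sum_abs(y):
    """Sum of absolute deviations (sum-of-absolute-values case)."""
    return sum(abs(xi - y) for xi in x)

def max_abs(y):
    """Largest absolute deviation (sum-row-max case, one coordinate)."""
    return max(abs(xi - y) for xi in x)

grid = [k / 100 for k in range(0, 1101)]   # candidate centers y in [0, 11]
best_sq = min(grid, key=sum_sq)
best_abs = min(grid, key=sum_abs)
best_max = min(grid, key=max_abs)
print(best_sq, best_abs, best_max)
```

For this data the grid minimizers coincide with the mean (5.2), the median (4), and the midpoint of the range (5.5), respectively.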

15.2 Let $W \in \mathbf{S}^n$ be a symmetric matrix with nonnegative elements $w_{ij}$ and zero diagonal. We can
interpret $W$ as the representation of a weighted undirected graph with $n$ nodes. If $w_{ij} = w_{ji} > 0$,
there is an edge between nodes $i$ and $j$, with weight $w_{ij}$. If $w_{ij} = w_{ji} = 0$ then nodes $i$ and $j$ are
not connected. The Laplacian of the weighted graph is defined as

$$L(W) = -W + \mathbf{diag}(W\mathbf{1}).$$

This is a symmetric matrix with elements

$$L_{ij}(W) = \left\{ \begin{array}{ll}
\sum_{k=1}^n w_{ik} & i = j \\
-w_{ij} & i \neq j.
\end{array} \right.$$

The Laplacian has the useful property that

$$y^T L(W) y = \sum_{i \leq j} w_{ij} (y_i - y_j)^2$$

for all vectors $y \in \mathbf{R}^n$.

(a) Show that the function $f : \mathbf{S}^n \to \mathbf{R}$,

$$f(W) = \inf_{\mathbf{1}^T x = 0} n \lambda_{\max}(L(W) + \mathbf{diag}(x)),$$

is convex.
(b) Give a simple argument why $f(W)$ is an upper bound on the optimal value of the combinatorial
optimization problem

$$\begin{array}{ll}
\mbox{maximize} & y^T L(W) y \\
\mbox{subject to} & y_i \in \{-1, 1\}, \quad i = 1, \ldots, n.
\end{array}$$

This problem is known as the max-cut problem, for the following reason. Every vector $y$
with components $\pm 1$ can be interpreted as a partition of the nodes of the graph in a set
$S = \{i \mid y_i = 1\}$ and a set $T = \{i \mid y_i = -1\}$. Such a partition is called a cut of the graph.
The objective function in the max-cut problem is
$$y^T L(W) y = \sum_{i \leq j} w_{ij} (y_i - y_j)^2.$$
If $y$ is the $\pm 1$-vector corresponding to a partition in sets $S$ and $T$, then $y^T L(W) y$ equals four
times the sum of the weights of the edges that join a point in $S$ to a point in $T$. This is called
the weight of the cut defined by $y$. The solution of the max-cut problem is the cut with the
maximum weight.
(c) The function $f$ defined in part (a) can be evaluated, for a given $W$, by solving the optimization
problem
$$\begin{array}{ll}
\mbox{minimize} & n \lambda_{\max}(L(W) + \mathbf{diag}(x)) \\
\mbox{subject to} & \mathbf{1}^T x = 0,
\end{array}$$
with variable $x \in \mathbf{R}^n$. Express this problem as an SDP.
(d) Derive an alternative expression for $f(W)$, by taking the dual of the SDP in part (c). Show
that the dual SDP is equivalent to the following problem:
$$\begin{array}{ll}
\mbox{maximize} & \sum_{i \leq j} w_{ij} \|p_i - p_j\|_2^2 \\
\mbox{subject to} & \|p_i\|_2 = 1, \quad i = 1, \ldots, n,
\end{array}$$
with variables $p_i \in \mathbf{R}^n$, $i = 1, \ldots, n$. In this problem we place $n$ points $p_i$ on the unit sphere
in $\mathbf{R}^n$ in such a way that the weighted sum of their squared pair-wise distances is maximized.

Solution.

(a) Follows from the fact that max (L(W ) + diag(x)) is jointly convex in W and x.
(b) If 1T x = 0, then

sup y T L(W )y = sup y T (L(W ) + diag(x))y


y{1,+1}n y{1,+1}n

sup y T (L(W ) + diag(x))y


y T y=n
= nmax (L(W ) + diag(x)).

(c)

minimize nt
subject to L(W ) + diag(x)  tI
1T x = 0.

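(d) The Lagrangian is
$$\begin{array}{rcl}
G(t, x, Z, \nu) & = & nt + \mathrm{tr}(Z(L(W) + \mathrm{diag}(x) - tI)) - \nu \mathbf{1}^T x \\
& = & (n - \mathrm{tr}\, Z) t + (\mathrm{diag}(Z) - \nu \mathbf{1})^T x + \mathrm{tr}(L(W) Z),
\end{array}$$
where $Z \succeq 0$ is the multiplier for the matrix inequality and $\nu$ is the multiplier for $\mathbf{1}^T x = 0$.
$G$ is unbounded below unless $\mathrm{tr}\, Z = n$ and $\mathrm{diag}(Z) = \nu \mathbf{1}$, so the dual SDP is
$$\begin{array}{ll}
\mbox{maximize} & \mathrm{tr}(L(W) Z) \\
\mbox{subject to} & \mathrm{tr}\, Z = n \\
& \mathrm{diag}(Z) = \nu \mathbf{1} \\
& Z \succeq 0.
\end{array}$$
The two equality constraints together give $\nu = 1$, so this simplifies to
$$\begin{array}{ll}
\mbox{maximize} & \mathrm{tr}(L(W) Z) \\
\mbox{subject to} & \mathrm{diag}(Z) = \mathbf{1} \\
& Z \succeq 0.
\end{array}$$
In the geometric interpretation we interpret $Z$ as a Gram matrix with elements $z_{ij} = p_i^T p_j$
for some set of $n$ vectors $p_i \in \mathbf{R}^n$. The constraint $\mathrm{diag}(Z) = \mathbf{1}$ means that $\|p_i\|_2 = 1$. The
objective can be written in terms of the vectors $p_i$ using the definition of $L(W)$:
$$\begin{array}{rcl}
\mathrm{tr}(L(W) Z) & = & -\mathrm{tr}(W Z) + \mathrm{diag}(Z)^T (W \mathbf{1}) \\
& = & -2 \sum_{i<j} w_{ij} p_i^T p_j + \sum_{i=1}^n \Big( \sum_{j=1}^n w_{ij} \Big) \|p_i\|_2^2 \\
& = & \sum_{i<j} w_{ij} \left( -2 p_i^T p_j + \|p_i\|_2^2 + \|p_j\|_2^2 \right) \\
& = & \sum_{i<j} w_{ij} \|p_i - p_j\|_2^2.
\end{array}$$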
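The two identities used in this exercise can be checked numerically. The sketch below is only an illustration: the 4-node weight matrix W is a made-up example, and the helper names (laplacian, quad_form, cut_weight) are ours. It verifies that $y^T L(W) y$ equals four times the cut weight for every $\pm 1$ vector, and that $\mathrm{tr}(L(W)Z)$ equals the weighted sum of squared distances when $Z$ is the Gram matrix of unit vectors.

```python
import itertools
import math
import random

def laplacian(W):
    """Return L(W) = -W + diag(W 1) for a symmetric W with zero diagonal."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0) - W[i][j] for j in range(n)]
            for i in range(n)]

def quad_form(M, y):
    """Return y^T M y."""
    n = len(y)
    return sum(y[i] * M[i][j] * y[j] for i in range(n) for j in range(n))

def cut_weight(W, y):
    """Total weight of the edges whose endpoints carry different signs in y."""
    n = len(y)
    return sum(W[i][j] for i in range(n) for j in range(i + 1, n)
               if y[i] != y[j])

# Hypothetical 4-node weighted graph (a 4-cycle), for illustration only.
W = [[0, 1, 2, 0],
     [1, 0, 0, 3],
     [2, 0, 0, 1],
     [0, 3, 1, 0]]
n = len(W)
L = laplacian(W)

# Identity 1: y^T L(W) y = 4 * (cut weight) for every +/-1 vector y.
signs = list(itertools.product([-1, 1], repeat=n))
cut_identity_holds = all(quad_form(L, y) == 4 * cut_weight(W, y) for y in signs)
max_cut = max(cut_weight(W, y) for y in signs)  # brute force, small n only

# Identity 2: tr(L(W) Z) = sum_{i<j} w_ij ||p_i - p_j||^2 when z_ij = p_i^T p_j
# and every p_i has unit norm.
random.seed(0)

def random_unit_vector(dim):
    v = [random.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]

p = [random_unit_vector(n) for _ in range(n)]
Z = [[sum(s * t for s, t in zip(p[i], p[j])) for j in range(n)]
     for i in range(n)]
trace_LZ = sum(L[i][j] * Z[j][i] for i in range(n) for j in range(n))
pairwise = sum(W[i][j] * sum((s - t) ** 2 for s, t in zip(p[i], p[j]))
               for i in range(n) for j in range(i + 1, n))
gram_identity_holds = abs(trace_LZ - pairwise) < 1e-9

print(cut_identity_holds, gram_identity_holds, max_cut)
```

For this small example the graph is an even cycle, so the best partition cuts every edge and the maximum cut weight is the total edge weight, 7.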

15.3 Utility versus latency trade-off in a network. We consider a network with $m$ edges, labeled $1, \ldots, m$,
and $n$ flows, labeled $1, \ldots, n$. Each flow has an associated nonnegative flow rate $f_j$; each edge or
link has an associated positive capacity $c_i$. Each flow passes over a fixed set of links (its route);
the total traffic $t_i$ on link $i$ is the sum of the flow rates over all flows that pass through link $i$. The
flow routes are described by a routing matrix $R \in \mathbf{R}^{m \times n}$, defined as
$$R_{ij} = \left\{ \begin{array}{ll}
1 & \mbox{flow $j$ passes through link $i$} \\
0 & \mbox{otherwise.}
\end{array} \right.$$
Thus, the vector of link traffic, $t \in \mathbf{R}^m$, is given by $t = Rf$. The link capacity constraint can be
expressed as $Rf \preceq c$. The (logarithmic) network utility is defined as $U(f) = \sum_{j=1}^n \log f_j$.
The (average queuing) delay on link $i$ is given by
$$d_i = \frac{1}{c_i - t_i}$$
(multiplied by a constant that doesn't matter to us). We take $d_i = \infty$ for $t_i = c_i$. The delay or
latency for flow $j$, denoted $l_j$, is the sum of the link delays over all links that flow $j$ passes through.
We define the maximum flow latency as
$$L = \max\{l_1, \ldots, l_n\}.$$
We are given $R$ and $c$; we are to choose $f$.

(a) How would you find the flow rates that maximize the utility $U$, ignoring flow latency? (In
particular, we allow $L = \infty$.) We'll refer to this maximum achievable utility as $U^{\max}$.
(b) How would you find the flow rates that minimize the maximum flow latency $L$, ignoring utility?
(In particular, we allow $U = -\infty$.) We'll refer to this minimum achievable latency as $L^{\min}$.
(c) Explain how to find the optimal trade-off between utility $U$ (which we want to maximize) and
latency $L$ (which we want to minimize).
(d) Find $U^{\max}$, $L^{\min}$, and plot the optimal trade-off of utility versus latency for the network with
data given in net_util_data.m, showing $L^{\min}$ and $U^{\max}$ on the same plot. Your plot should
cover the range from $L = 1.1 L^{\min}$ to $L = 11 L^{\min}$. Plot $U$ vertically, on a linear scale, and $L$
horizontally, using a log scale.

Note. For parts (a), (b), and (c), your answer can involve solving one or more convex optimization
problems. But if there is a simpler solution, you should say so.
Solution.

(a) To maximize utility we solve the convex problem
$$\begin{array}{ll}
\mbox{maximize} & \sum_{j=1}^n \log f_j \\
\mbox{subject to} & Rf \preceq c,
\end{array}$$
with variable $f$, and the implicit constraint $f \succ 0$.
(b) Link delay is monotonically increasing in traffic, so it is minimized with zero traffic. Since
flow latency is the sum of link delays, it is also minimized by the choice $f = 0$. So zero
flow minimizes each flow latency. With $f = 0$ we have $d_i = 1/c_i$. To sum these delays over
the routes, we multiply by $R^T$ to get $l = R^T (1/c_1, \ldots, 1/c_m)$, where $l$ is the vector of flow
latencies. Thus we have
$$L^{\min} = \max \left( R^T (1/c_1, \ldots, 1/c_m) \right),$$
where the max is the maximum over the entries of the vector $R^T (1/c_1, \ldots, 1/c_m)$.
(c) Let $b_1^T, \ldots, b_m^T$ denote the rows of $R$. To find the optimal trade-off between $U$ and $L$, we solve
the problem
$$\begin{array}{ll}
\mbox{maximize} & \sum_{j=1}^n \log f_j \\
\mbox{subject to} & Rf \preceq c, \\
& \displaystyle \sum_{i=1}^m \frac{R_{ij}}{c_i - b_i^T f} \le L, \quad j = 1, \ldots, n,
\end{array}$$
for a range of values of $L$. The variable here is $f$, and there is an implicit constraint $f \succ 0$.
Evidently, if $Rf \preceq c$ we must have $c_i - b_i^T f \ge 0$, and a finite latency requires this quantity to be
positive. On that domain each term $R_{ij}/(c_i - b_i^T f)$ is a nonnegative multiple of the inverse of
a positive affine function of $f$, which is convex, so the constraints
$$\sum_{i=1}^m \frac{R_{ij}}{c_i - b_i^T f} \le L, \quad j = 1, \ldots, n,$$
are convex. This is therefore a convex optimization problem.


(d) The following code computes the utility versus latency trade-off.
% solution for network utility problem
clear all;
net_util_data;

% let's find max utility with no delay constraint
cvx_begin
    variable f(n)
    maximize geomean(f)   % same maximizer as sum(log(f))
    R*f <= c
    f >= 0; % not needed; enforced by geomean domain
cvx_end
Umax=sum(log(f));

% let's find min latency with no utility constraint
% can be done analytically: just take f=0
Lmin = max(R'*(1./c));

% now let's do pareto curve
N = 20;
ds = 1.10*Lmin*logspace(0,1,N); % go from 1.1*Lmin to 11*Lmin
Uopt = [];

for d = ds
    cvx_begin
        variable f(n)
        maximize geomean(f);
        R'*inv_pos(c-R*f) <= d*ones(n,1)
        f >= 0; % not needed; enforced by geomean domain
        R*f <= c; % not needed; enforced by inv_pos domain
    cvx_end
    Uopt = [Uopt n*log(cvx_optval)];
end

semilogx(ds,Uopt,'k-',[Lmin,ds],[Umax,ones(1,N)*Umax],...
    'k--',[1,1]*Lmin,[Uopt(1),Umax],'k--')
% axis([ds(1), ds(N), -480, -340]);
xlabel('L'); ylabel('U');
The trade-off curve is shown in figure 16.
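The latency computations in parts (b) and (c) are simple enough to sketch without a solver. The following is a minimal illustration, assuming a made-up network with 3 links and 2 flows; the function names link_delays and flow_latencies are ours, not part of the exercise files. It evaluates $L^{\min}$ (the maximum latency at $f = 0$) and the maximum latency at one feasible flow vector.

```python
def link_delays(R, c, f):
    """Return d_i = 1/(c_i - t_i) with t = R f (infinite on a saturated link)."""
    traffic = [sum(r_ij * f_j for r_ij, f_j in zip(row, f)) for row in R]
    return [1.0 / (c_i - t_i) if t_i < c_i else float("inf")
            for c_i, t_i in zip(c, traffic)]

def flow_latencies(R, c, f):
    """Return l = R^T d: each flow sums the delays of the links on its route."""
    d = link_delays(R, c, f)
    n = len(f)
    return [sum(d_i for d_i, row in zip(d, R) if row[j] == 1)
            for j in range(n)]

# Hypothetical routing matrix (3 links, 2 flows) and capacities.
R = [[1, 0],
     [1, 1],
     [0, 1]]
c = [2.0, 4.0, 1.0]

L_min = max(flow_latencies(R, c, [0.0, 0.0]))   # latency with zero traffic
L_at_f = max(flow_latencies(R, c, [1.0, 0.5]))  # latency at a feasible f

print(L_min, L_at_f)
```

Any nonzero flow can only increase the link delays, so the latency at a feasible $f$ is never below $L^{\min}$, which is why the trade-off curve starts to the right of the dashed line at $L^{\min}$.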

15.4 Allocation of interdiction effort. A smuggler moves along a directed acyclic graph with $m$ edges and
$n$ nodes, from a source node (which we take as node 1) to a destination node (which we take as node
$n$), along some (directed) path. Each edge $k$ has a detection failure probability $p_k$, which is the
probability that the smuggler passes over that edge undetected. The detection events on the edges
are independent, so the probability that the smuggler makes it to the destination node undetected
is $\prod_{j \in P} p_j$, where $P \subset \{1, \ldots, m\}$ is (the set of edges on) the smuggler's path. We assume that
the smuggler knows the detection failure probabilities and will take a path that maximizes the
probability of making it to the destination node undetected. We let $P^{\max}$ denote this maximum
probability (over paths). (Note that this is a function of the edge detection failure probabilities.)
The edge detection failure probability on an edge depends on the amount of interdiction resources
allocated to the edge. Here we will use a very simple model, with $x_j \in \mathbf{R}_+$ denoting the effort (say,
yearly budget) allocated to edge $j$, with associated detection failure probability $p_j = e^{-a_j x_j}$, where
$a_j \in \mathbf{R}_{++}$ are given. The constraints on $x$ are a maximum for each edge, $x \preceq x^{\max}$, and a total
budget constraint, $\mathbf{1}^T x \le B$.

(a) Explain how to solve the problem of choosing the interdiction effort vector $x \in \mathbf{R}^m$, subject
to the constraints, so as to minimize $P^{\max}$. Partial credit will be given for a method that
involves an enumeration over all possible paths (in the objective or constraints). Hint. For
each node $i$, let $P_i$ denote the maximum of $\prod_{k \in P} p_k$ over all paths $P$ from the source node 1
to node $i$ (so $P^{\max} = P_n$).

[Figure 16: Utility versus maximum latency trade-off ($U$ on a linear vertical axis, $L$ on a logarithmic
horizontal axis). Solid: trade-off curve, dashed: $L^{\min}$ and $U^{\max}$.]

(b) Carry out your method on the problem instance given in interdict_alloc_data.m. The data
file contains the data $a$, $x^{\max}$, $B$, and the graph incidence matrix $A \in \mathbf{R}^{n \times m}$, where
$$A_{ij} = \left\{ \begin{array}{ll}
-1 & \mbox{if edge $j$ leaves node $i$} \\
+1 & \mbox{if edge $j$ enters node $i$} \\
0 & \mbox{otherwise.}
\end{array} \right.$$
Give $P^{\max\star}$, the optimal value of $P^{\max}$, and compare it to the value of $P^{\max}$ obtained with
uniform allocation of resources, i.e., with $x = (B/m)\mathbf{1}$.
Hint. Given a vector $z \in \mathbf{R}^n$, $A^T z$ is the vector of edge differences: $(A^T z)_j = z_k - z_l$ if edge $j$
goes from node $l$ to node $k$.

The following figure shows the topology of the graph in question. (The data file contains $A$;
this figure, which is not needed to solve the problem, is shown here so you can visualize the
graph.)

[Figure: topology of the graph, with nodes labeled Node 1 through Node 10.]

Solution. We will work with the logs of the detection failure probabilities, $y_j = \log p_j = -a_j x_j$.
Consider these as edge weights on the graph. The smuggler chooses a path from source to destination
that maximizes the total path weight (i.e., the sum of weights along its edges). Thus $\log P^{\max}$
is the maximum, over the set of paths from source to destination, of a sum of linear functions of
$x$; so it is a piecewise-linear convex function of $x$. It follows that minimizing $\log P^{\max}$ subject to
the constraints $x^{\max} \succeq x \succeq 0$, $\mathbf{1}^T x \le B$, is a convex optimization problem. Unfortunately, this
formulation requires an enumeration of all paths. So let's find one that does not.
Following the given hint, we can write a recursion for $P_i$ as follows:
$$P_i = \max_{k \in \mathrm{pred}(i)} (p_{ik} P_k),$$
where $\mathrm{pred}(i)$ denotes the predecessor nodes of $i$, and $p_{ik}$ is the detection failure probability on the
edge from $k$ to $i$. We also have $P_1 = 1$, $P_n = P^{\max}$.
Letting $z_i = \log P_i$ we have the problem
$$\begin{array}{ll}
\mbox{minimize} & z_n \\
\mbox{subject to} & z_i = \max_{k \in \mathrm{pred}(i)} (-a_{ik} x_{ik} + z_k), \quad i = 2, \ldots, n \\
& z_1 = 0 \\
& x^{\max} \succeq x \succeq 0, \quad \mathbf{1}^T x \le B,
\end{array}$$
with variables $x \in \mathbf{R}^m$ and $z \in \mathbf{R}^n$. In the second line, we use $x_{ik}$ to denote $x_j$, where $j$ corresponds
to the edge from node $k$ to node $i$ (and similarly for $a_{ik}$).
The objective and constraints in this problem are all monotone nondecreasing in $z_i$, so it follows
that we can relax it to the convex problem
$$\begin{array}{ll}
\mbox{minimize} & z_n \\
\mbox{subject to} & z_i \ge \max_{k \in \mathrm{pred}(i)} (-a_{ik} x_{ik} + z_k), \quad i = 2, \ldots, n \\
& z_1 = 0 \\
& x^{\max} \succeq x \succeq 0, \quad \mathbf{1}^T x \le B.
\end{array}$$
Using the incidence matrix this can be written as the following LP:
$$\begin{array}{ll}
\mbox{minimize} & z_n \\
\mbox{subject to} & A^T z \succeq -\mathrm{diag}(a) x \\
& z_1 = 0 \\
& x^{\max} \succeq x \succeq 0, \quad \mathbf{1}^T x \le B.
\end{array}$$
GP formulation. Equivalently, we can formulate the problem as a GP:
$$\begin{array}{ll}
\mbox{minimize} & P_n \\
\mbox{subject to} & P_i \ge \max_{k \in \mathrm{pred}(i)} \left( P_k y_{ik}^{-a_{ik}} \right), \quad i = 2, \ldots, n \\
& P_1 = 1 \\
& y \succeq \mathbf{1} \\
& \prod_{i=1}^m y_i \le \exp(B),
\end{array}$$
over $P \in \mathbf{R}^n$ and $y \in \mathbf{R}^m$ where $y_i = \exp(x_i)$ (we use the notation $y_{ik}$ as defined above).
The optimal probability of detection failure is $P^{\max} = 0.043$, so the smuggler will get through with
probability 4.3%. Using uniform allocation of resources, we obtain $P^{\max} = 0.247$, so the smuggler
gets through with probability 24.7%.
The following code was used to solve the problem.

interdict_alloc_data

% dynamic programming solution


cvx_begin
variables x(m) z(n)
minimize(z(n))
z(1) == 0
A'*z >= -diag(a)*x
x >= 0
x <= x_max
sum(x) <= B
cvx_end

% uniform allocation
x_unif=B/m*ones(m,1);
cvx_begin
variable z_unif(n)
minimize(z_unif(n))
z_unif(1) == 0
A'*z_unif >= -diag(a)*x_unif
cvx_end
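For a fixed allocation $x$, the recursion over predecessor nodes can be evaluated directly, with no solver. The sketch below is an illustration under stated assumptions: the 4-node graph, the values of a, and the three allocations are made up, and the function name max_evasion_probability is ours. It also assumes that node labels are topologically ordered (every edge goes from a lower to a higher node number), which the exercise does not state.

```python
import math

def max_evasion_probability(n, edges, a, x):
    """Evaluate P^max = P_n by the recursion z_i = max_k (z_k - a_j x_j).

    edges[j] = (tail, head) with 1-based node labels. Assumes tail < head for
    every edge, so visiting heads in increasing order is a topological order.
    """
    z = [-math.inf] * (n + 1)   # z[i] = log P_i; index 0 is unused
    z[1] = 0.0                  # P_1 = 1
    for i in range(2, n + 1):
        for j, (tail, head) in enumerate(edges):
            if head == i:
                z[i] = max(z[i], z[tail] - a[j] * x[j])
    return math.exp(z[n])

# Hypothetical instance: two parallel two-edge paths from node 1 to node 4.
edges = [(1, 2), (1, 3), (2, 4), (3, 4)]
a = [1.0, 2.0, 1.0, 1.0]
B = 2.0

x_unif = [B / len(edges)] * len(edges)      # uniform allocation
x_skew = [B, 0.0, 0.0, 0.0]                 # all effort on one path
x_balanced = [4.0 / 3.0, 2.0 / 3.0, 0.0, 0.0]  # equalizes both path weights

P_unif = max_evasion_probability(4, edges, a, x_unif)
P_skew = max_evasion_probability(4, edges, a, x_skew)
P_balanced = max_evasion_probability(4, edges, a, x_balanced)

print(P_unif, P_skew, P_balanced)
```

Putting the whole budget on one path leaves the other path unguarded, so the smuggler gets through with probability one; balancing the two path weights does better than the uniform allocation. This mirrors, on a toy scale, the gap between the optimal and uniform allocations reported above.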

15.5 Network sizing. We consider a network with $n$ directed arcs. The flow through arc $k$ is denoted
$x_k$ and can be positive, negative, or zero. The flow vector $x$ must satisfy the network constraint
$Ax = b$ where $A$ is the node-arc incidence matrix and $b$ is the external flow supplied to the nodes.
Each arc has a positive capacity or width $y_k$. The quantity $|x_k|/y_k$ is the flow density in arc $k$.
The cost of the flow in arc $k$ depends on the flow density and the width of the arc, and is given by
$y_k \phi_k(|x_k|/y_k)$, where $\phi_k$ is convex and nondecreasing on $\mathbf{R}_+$.

(a) Define $f(y, b)$ as the optimal value of the network flow optimization problem
$$\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^n y_k \phi_k(|x_k|/y_k) \\
\mbox{subject to} & Ax = b
\end{array}$$
with variable $x$, for given values of the arc widths $y \succ 0$ and external flows $b$. Is $f$ a convex
function (jointly in $y$, $b$)? Carefully explain your answer.
(b) Suppose $b$ is a discrete random vector with possible values $b^{(1)}, \ldots, b^{(m)}$. The probability that
$b = b^{(j)}$ is $\pi_j$. Consider the problem of sizing the network (selecting the arc widths $y_k$) so that
the expected cost is minimized:
$$\mbox{minimize} \quad g(y) + \mathbf{E} f(y, b). \qquad (61)$$
The variable is $y$. Here $g$ is a convex function, representing the installation cost, and $\mathbf{E} f(y, b)$
is the expected optimal network flow cost
$$\mathbf{E} f(y, b) = \sum_{j=1}^m \pi_j f(y, b^{(j)}),$$
where $f$ is the function defined in part (a). Is (61) a convex optimization problem?

Solution.

(a) We first note that $\phi_k(|x_k|)$ is convex as a function of $x_k$. This follows from the composition
rules. The function is a composition $h(g(x_k))$ of two functions:
• the convex function $g(x_k) = |x_k|$
• the convex nondecreasing function $h(u)$ defined as $h(u) = \phi_k(u)$ if $u \ge 0$ and $h(u) = \phi_k(0)$
for $u \le 0$.
The function $y_k \phi_k(|x_k|/y_k)$ is jointly convex in $x_k$, $y_k$ because it is the perspective of a convex
function.
The function $f$ can be expressed as $f(y, b) = \inf_x F(x, y, b)$ where
$$F(x, y, b) = \left\{ \begin{array}{ll}
\sum_k y_k \phi_k(|x_k|/y_k) & Ax = b \\
+\infty & \mbox{otherwise.}
\end{array} \right.$$
This function $F$ is jointly convex in $(x, y, b)$: the sum of perspectives is convex, and the set
$\{(x, b) \mid Ax = b\}$ is convex. This implies that $f(y, b) = \inf_x F(x, y, b)$ is convex.
(b) The expression
$$\mathbf{E} f(y, b) = \sum_{j=1}^m \pi_j f(y, b^{(j)})$$
shows that $\mathbf{E} f(y, b)$ is a nonnegative weighted sum of convex functions of $y$, hence convex.
Since $g$ is also convex, (61) is a convex optimization problem.
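The joint convexity of the arc cost can be spot-checked numerically. The sketch below is only a sanity check, not a proof: it assumes one particular cost shape (phi, a made-up convex nondecreasing function on the nonnegative reals) and tests the midpoint convexity inequality for the perspective $y \phi(|x|/y)$ at randomly sampled pairs of points with positive widths.

```python
import random

def phi(u):
    """Hypothetical flow-density cost: convex and nondecreasing for u >= 0."""
    return u * u + u

def arc_cost(x, y):
    """Cost y * phi(|x| / y) of a flow x in an arc of width y > 0."""
    return y * phi(abs(x) / y)

random.seed(1)
violations = 0
for _ in range(10000):
    x1, x2 = random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0)
    y1, y2 = random.uniform(0.1, 5.0), random.uniform(0.1, 5.0)
    at_midpoint = arc_cost(0.5 * (x1 + x2), 0.5 * (y1 + y2))
    average = 0.5 * (arc_cost(x1, y1) + arc_cost(x2, y2))
    # Joint convexity requires cost(midpoint) <= average of the costs.
    if at_midpoint > average + 1e-9:
        violations += 1

print(violations)
```

With this choice of phi the arc cost is $x^2/y + |x|$, a quadratic-over-linear term plus an absolute value, so no violations are expected.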


15.6 Maximizing algebraic connectivity of a graph. Let $G = (V, E)$ be a weighted undirected graph with
$n = |V|$ nodes, $m = |E|$ edges, and weights $w_1, \ldots, w_m \in \mathbf{R}_+$ on the edges. If edge $k$ connects
nodes $i$ and $j$, then define $a_k \in \mathbf{R}^n$ as $(a_k)_i = 1$, $(a_k)_j = -1$, with other entries zero. The weighted
Laplacian (matrix) of the graph is defined as
$$L = \sum_{k=1}^m w_k a_k a_k^T = A \, \mathrm{diag}(w) A^T,$$
where $A = [a_1 \; \cdots \; a_m] \in \mathbf{R}^{n \times m}$ is the incidence matrix of the graph. Nonnegativity of the weights
implies $L \succeq 0$.
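As a small illustration of this definition, the sketch below assembles the weighted Laplacian edge by edge, as a sum of rank-one terms, one per edge. The 4-node graph and its weights are made up, and the function name weighted_laplacian is ours. It then checks two consequences: every row sums to zero (so the all-ones vector is in the nullspace), and the quadratic form equals a weighted sum of squared differences across edges, which is nonnegative.

```python
def weighted_laplacian(n, edges, w):
    """Return L = sum_k w_k a_k a_k^T for 0-based edges (i, j)."""
    L = [[0.0] * n for _ in range(n)]
    for (i, j), w_k in zip(edges, w):
        # a_k has +1 in position i and -1 in position j.
        L[i][i] += w_k
        L[j][j] += w_k
        L[i][j] -= w_k
        L[j][i] -= w_k
    return L

# Hypothetical graph: a triangle (0, 1, 2) with a pendant node 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
w = [0.4, 0.3, 0.2, 0.1]
n = 4
L = weighted_laplacian(n, edges, w)

# L 1 = 0, because 1^T a_k = 0 for every edge.
row_sums = [sum(row) for row in L]

# x^T L x = sum_k w_k (x_i - x_j)^2 >= 0, which is why L is PSD for w >= 0.
x = [1.0, -2.0, 0.5, 3.0]
quad = sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))
edge_sum = sum(w_k * (x[i] - x[j]) ** 2 for (i, j), w_k in zip(edges, w))

print(row_sums, quad, edge_sum)
```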
Denote the eigenvalues of the Laplacian $L$ as
$$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n,$$
which are functions of $w$. The minimum eigenvalue $\lambda_1$ is always zero, while the second smallest
eigenvalue $\lambda_2$ is called the algebraic connectivity of $G$ and is a measure of the connectedness of
a graph: The larger $\lambda_2$ is, the better connected the graph is. It is often used, for example, in
analyzing the robustness of computer networks.
Though not relevant for the rest of the problem, we mention a few other examples of how the
algebraic connectivity can be used. These results, which relate graph-theoretic properties of $G$
to properties of the spectrum of $L$, belong to a field called spectral graph theory. For example,
$\lambda_2 > 0$ if and only if the graph is connected. The eigenvector $v_2$ associated with $\lambda_2$ is often called
the Fiedler vector and is widely used in a graph partitioning technique called spectral partitioning,
which assigns nodes to one of two groups based on the sign of the relevant component in $v_2$. Finally,
$\lambda_2$ is also closely related to a quantity called the isoperimetric number or Cheeger constant of $G$,
which measures the degree to which a graph has a bottleneck.
The problem is to choose the edge weights $w \in \mathbf{R}^m_+$, subject to some linear inequalities (and the
nonnegativity constraint) so as to maximize the algebraic connectivity:
$$\begin{array}{ll}
\mbox{maximize} & \lambda_2 \\
\mbox{subject to} & w \succeq 0, \quad F w \preceq g,
\end{array}$$
with variable $w \in \mathbf{R}^m$. The problem data are $A$ (which gives the graph topology), and $F$ and $g$
(which describe the constraints on the weights).

(a) Describe how to solve this problem using convex optimization.
(b) Numerical example. Solve the problem instance given in max_alg_conn_data.m, which uses
$F = \mathbf{1}^T$ and $g = 1$ (so the problem is to allocate a total weight of 1 to the edges of the graph).
Compare the algebraic connectivity for the graph obtained with the optimal weights $w^\star$ to the
one obtained with $w^{\mathrm{unif}} = (1/m)\mathbf{1}$ (i.e., a uniform allocation of weight to the edges).
Use the function plotgraph(A,xy,w) to visualize the weighted graphs, with weight vectors
$w^\star$ and $w^{\mathrm{unif}}$. You will find that the optimal weight vector $w^\star$ has some zero entries (which,
due to the finite precision of the solver, will appear as small weight values); you may want to
round small values (say, those under $10^{-4}$) of $w^\star$ to exactly zero. Use the gplot function to
visualize the original (given) graph, and the subgraph associated with nonzero weights in $w^\star$.
Briefly comment on the following (incorrect) intuition: "The more edges a graph has, the more
connected it is, so the optimal weight assignment should make use of all available edges."

Solution.

(a) The only question is how to ensure that $\lambda_2$ is a concave function of $w$. In general, the second
lowest eigenvalue of a symmetric matrix is not a concave function. The trick is to form a new
matrix whose smallest eigenvalue is exactly $\lambda_2$.
First, note that $\mathbf{1}$ is an eigenvector of $L$ corresponding to $\lambda_1$; this follows because $\mathbf{1}^T a_k = 0$
for all $k$. We can thus express the algebraic connectivity $\lambda_2$ as the minimum eigenvalue of
$L$, restricted to the subspace $\mathbf{1}^\perp$, i.e., $\lambda_2 = \lambda_{\min}(Q^T L Q)$, where $Q \in \mathbf{R}^{n \times (n-1)}$ is a matrix
whose columns form an orthonormal basis for $\mathcal{N}(\mathbf{1}^T) = \mathbf{1}^\perp$, so that $Q^T L Q$ has eigenvalues
$\lambda_2(L), \ldots, \lambda_n(L)$. Since $\lambda_{\min}(Q^T L Q)$ is concave in $w$, we're done. We end up with the problem
$$\begin{array}{ll}
\mbox{maximize} & \lambda_{\min}(Q^T L Q) \\
\mbox{subject to} & w \succeq 0, \quad F w \preceq g,
\end{array}$$
with variable $w \in \mathbf{R}^m$.
(b) The solution and plots are below. From the first two plots, it is clear that the solution is
sparse, in the sense that many edges that would be active in the uniform weight graph have
been disabled in the optimal weight graph. In particular, many edges have been removed from
very densely connected sections of the graph.
We find that the algebraic connectivities of the uniform and optimal weights are 0.002204 and
0.005018, respectively. In other words, the optimal graph is better connected despite there
being fewer total edges, because additional weight (e.g., capacity) is assigned to particularly
important edges that substantially improve the global connectivity of the graph.

[Figure: two plots of the graph topology, titled "Constant-weight graph" and "Optimal weight graph".]

%% maximizing algebraic connectivity of a graph

max_alg_conn_data;

Q = null(ones(1,n)); % columns of Q are orthonormal basis for 1^\perp

cvx_begin
    variable w(m)
    L = A*diag(w)*A';
    maximize (lambda_min(Q'*L*Q))
    subject to
        w >= 0;
        F*w <= g;
cvx_end

w(abs(w) < 1e-4) = 0;

% compare algebraic connectivities
L_unif = (1/m)*A*A';
dunif = eig(L_unif);
dopt = eig(L);
fprintf(1, 'Algebraic connectivity of L_unif: %f\n', dunif(2));
fprintf(1, 'Algebraic connectivity of L_opt: %f\n', dopt(2));

% plot topology of constant-weight graph
figure(1), clf
gplot(L_unif,xy);
hold on;
plot(xy(:,1), xy(:,2), 'ko','LineWidth',4, 'MarkerSize',4);
axis([0.05 1.1 -0.1 0.95]);
title('Constant-weight graph')
hold off;
print -deps graph_plot1.eps;

% plot topology of optimal weight graph
figure(2), clf
gplot(L,xy);
hold on;
plot(xy(:,1), xy(:,2), 'ko','LineWidth',4, 'MarkerSize',4);
axis([0.05 1.1 -0.1 0.95]);
title('Optimal weight graph')
hold off;
print -deps graph_plot2.eps;

% plot optimal weight graph with edge thickness proportional to weight
figure(3), clf
plotgraph(A,xy,w);
text(0.3,1.05,'Optimal weights')
print -deps graph_plot3.eps;

15.7 Graph isomorphism via linear programming. An (undirected) graph with $n$ vertices can be described
by its adjacency matrix $A \in \mathbf{S}^n$, given by
$$A_{ij} = \left\{ \begin{array}{ll}
1 & \mbox{there is an edge between vertices $i$ and $j$} \\
0 & \mbox{otherwise.}
\end{array} \right.$$
Two (undirected) graphs are isomorphic if we can permute the vertices of one so it is the same as
the other (i.e., the same pairs of vertices are connected by edges). If we describe them by their
adjacency matrices $A$ and $B$, isomorphism is equivalent to the existence of a permutation matrix
$P \in \mathbf{R}^{n \times n}$ such that $P A P^T = B$. (Recall that a matrix $P$ is a permutation matrix if each row and
column has exactly one entry 1, and all other entries 0.) Determining if two graphs are isomorphic,
and if so, finding a suitable permutation matrix $P$, is called the graph isomorphism problem.
Remarks (not needed to solve the problem). It is not currently known if the graph isomorphism
problem is NP-complete or solvable in polynomial time. The graph isomorphism problem comes
up in several applications, such as determining if two descriptions of a molecule are the same, or
whether the physical layout of an electronic circuit correctly reflects the given circuit schematic
diagram.

(a) Find a set of linear equalities and inequalities on $P \in \mathbf{R}^{n \times n}$, that together with the Boolean
constraint $P_{ij} \in \{0, 1\}$, are necessary and sufficient for $P$ to be a permutation matrix satisfying
$P A P^T = B$. Thus, the graph isomorphism problem is equivalent to a Boolean feasibility LP.
(b) Consider the relaxed version of the Boolean feasibility LP found in part (a), i.e., the LP that
results when the constraints $P_{ij} \in \{0, 1\}$ are replaced with $P_{ij} \in [0, 1]$. When this LP is
infeasible, we can be sure that the two graphs are not isomorphic. If a solution of the LP
is found that satisfies $P_{ij} \in \{0, 1\}$, then the graphs are isomorphic and we have solved the
graph isomorphism problem. This of course does not always happen, even if the graphs are
isomorphic.
A standard trick to encourage the entries of $P$ to take on the values 0 and 1 is to add a random
linear objective to the relaxed feasibility LP. (This doesn't change whether the problem is
feasible or not.) In other words, we minimize $\sum_{i,j} W_{ij} P_{ij}$, where $W_{ij}$ are chosen randomly
(say, from $\mathcal{N}(0, 1)$). (This can be repeated with different choices of $W$.)
Carry out this scheme for the two isomorphic graphs with adjacency matrices $A$ and $B$ given
in graph_isomorphism_data.* to find a permutation matrix $P$ that satisfies $P A P^T = B$.
Report the permutation vector, given by the matrix-vector product $P v$, where $v = (1, 2, \ldots, n)$.
Verify that all the required conditions on $P$ hold. To check that the entries of the solution of
the LP are (close to) $\{0, 1\}$, report $\max_{i,j} P_{ij} (1 - P_{ij})$. And yes, you might have to try more
than one instance of the randomized method described above before you find a permutation
that establishes isomorphism of the two graphs.

Solution.

(a) $P$ is a permutation matrix if and only if $P \mathbf{1} = \mathbf{1}$, $P^T \mathbf{1} = \mathbf{1}$, and $P_{ij} \in \{0, 1\}$. The condition
$P A P^T = B$ is a quadratic equality on $P$. But we observe that since $P$ is a permutation
matrix, we have $P^{-1} = P^T$. Multiplying $P A P^T = B$ on the right by $P$ we get $P A = B P$, a
set of linear equations in $P$. So, $P$ is a permutation matrix that satisfies $P A P^T = B$ if and
only if
$$P \mathbf{1} = \mathbf{1}, \quad P^T \mathbf{1} = \mathbf{1}, \quad P A = B P, \quad P_{ij} \in \{0, 1\}.$$
This is a set of linear equations in $P$, together with the Boolean condition $P_{ij} \in \{0, 1\}$.
(b) The LP relaxation, with the random cost function suggested, is
$$\begin{array}{ll}
\mbox{minimize} & \mathrm{tr}(W^T P) \\
\mbox{subject to} & P \mathbf{1} = \mathbf{1}, \quad P^T \mathbf{1} = \mathbf{1}, \quad P A = B P \\
& 0 \le P_{ij} \le 1.
\end{array}$$
(The constraints $P_{ij} \le 1$ are redundant and can be removed.)
When we solve this problem for the given data, we find that there are two different permutation
vectors $\pi_1$ and $\pi_2$ that relate $A$ and $B$. The quantity $\max_{i,j} P_{ij} (1 - P_{ij})$ is suitably small, on
the order of $10^{-6}$. These two vectors are

$\pi_1$ = (16, 22, 27, 9, 5, 13, 1, 25, 21, 23, 19, 14, 26, 6, 2, 7,
30, 11, 10, 17, 15, 24, 8, 18, 20, 3, 12, 28, 4, 29),

and

$\pi_2$ = (6, 22, 27, 19, 15, 3, 11, 25, 21, 23, 9, 4, 26, 16, 12, 17,
30, 1, 20, 7, 5, 24, 18, 8, 10, 13, 2, 28, 14, 29).

It is interesting to notice that solving the feasibility problem by itself (i.e., with objective
function 0) usually doesn't end up finding a permutation matrix. However, adding the linear
random objective does help us find a permutation matrix.
The following MATLAB code solves the problem.
graph_isomorphism_data
n = size(A,1);
W = randn(n);

cvx_begin quiet
    variable P(n,n)
    minimize trace(W*P)
    subject to
        P*ones(n,1) == ones(n,1);
        P'*ones(n,1) == ones(n,1);
        P*A - B*P == 0;
        0 <= P <= 1
cvx_end
fprintf(['maximum p(1-p) where p is an entry of P is ' ...
    '%d\n'], max(max(P.*(1-P))))
fprintf(['norm of the residual for first constraint is ' ...
    '%d\n'], norm(P*ones(n,1) - ones(n,1)))
fprintf(['norm of the residual for second constraint is ' ...
    '%d\n'], norm(P'*ones(n,1) - ones(n,1)))
fprintf(['norm of the residual for third constraint is ' ...
    '%d\n'], norm(P*A - B*P))
P*(1:n)'
The following Python code solves the problem.
import numpy as np
from cvxpy import *
from graph_isomorphism_data import *

n = A.shape[0]
W = np.random.randn(n, n)

P = Variable(n,n)
objective = Minimize(trace(W*P))
constraints = [ P*np.ones([n,1]) == np.ones([n,1]),
                P.T*np.ones([n,1]) == np.ones([n,1]),
                P*A == B*P,
                0 <= P,
                P <= 1]

prob = Problem(objective, constraints)

result = prob.solve()
P = P.value
print("maximum p(1-p) where p is an entry of P is "
      + str(np.max(np.multiply(P,1-P))))
print("norm of the residual for first constraint is "
      + str(np.linalg.norm(P*np.ones([n,1])- np.ones([n,1]))))
print("norm of the residual for second constraint is "
      + str(np.linalg.norm(P.T*np.ones([n,1]) - np.ones([n,1]))))
print("norm of the residual for third constraint is "
      + str(np.linalg.norm(P*A-B*P)))
print(np.dot(P,np.arange(n)+1))
The following Julia code solves the problem

using Convex, SCS


include("graph_isomorphism_data.jl");

n = size(A,1);
W = randn(n,n);

P = Variable(n, n);
obj = trace(W*P);

constraints = [
sum(P, 2) == ones(n),
sum(P, 1) == ones(1, n),
P*A == B*P,
P >= 0,
P <= 1
];
prob = minimize(obj, constraints);
solve!(prob, SCSSolver(verbose=false, max_iters=20000));
P = P.value;
println(maximum(P.*(1-P)));
println(norm(sum(P, 2) - ones(n)));
println(norm(sum(P, 1) - ones(1, n)));
println(vecnorm(P*A-B*P));
println(P);
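The linear conditions derived in part (a) can be checked for a candidate permutation without any solver. The sketch below is a small illustration only: the 3-vertex path graph and the permutation pi are made up, and the helper names are ours. It builds $B = P A P^T$ from a known permutation, confirms the row sums, column sums, and $PA = BP$, and reports the permutation vector $Pv$ with $v = (1, \ldots, n)$.

```python
def perm_matrix(pi):
    """Return P with P[i][pi[i]] = 1, so that (P v)_i = v[pi[i]] (0-based pi)."""
    n = len(pi)
    return [[1 if j == pi[i] else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Plain-list matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# Hypothetical data: A is the path graph 0 - 1 - 2; pi swaps vertices 0 and 1.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
pi = [1, 0, 2]
n = len(pi)
P = perm_matrix(pi)
B = matmul(matmul(P, A), transpose(P))   # relabeled graph, B = P A P^T

rows_ok = all(sum(row) == 1 for row in P)          # P 1 = 1
cols_ok = all(sum(col) == 1 for col in zip(*P))    # P^T 1 = 1
linear_ok = matmul(P, A) == matmul(B, P)           # P A = B P

# The identity permutation does not relate A and B, so P A = B P fails for it.
identity = perm_matrix(list(range(n)))
identity_ok = matmul(identity, A) == matmul(B, identity)

perm_vector = [sum(P[i][j] * (j + 1) for j in range(n)) for i in range(n)]

print(rows_ok, cols_ok, linear_ok, identity_ok, perm_vector)
```

Here the relabeled graph is still a path, with the middle vertex now carrying label 0, and the permutation vector is (2, 1, 3).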

16 Energy and power
16.1 Power flow optimization with $N-1$ reliability constraint. We model a network of power lines as a
graph with $n$ nodes and $m$ edges. The power flow along line $j$ is denoted $p_j$, which can be positive,
which means power flows along the line in the direction of the edge, or negative, which means power
flows along the line in the direction opposite the edge. (In other words, edge orientation is only
used to determine the direction in which power flow is considered positive.) Each edge can support
power flow in either direction, up to a given maximum capacity $P_j^{\max}$, i.e., we have $|p_j| \le P_j^{\max}$.
Generators are attached to the first $k$ nodes. Generator $i$ provides power $g_i$ to the network. These
must satisfy $0 \le g_i \le G_i^{\max}$, where $G_i^{\max}$ is a given maximum power available from generator $i$.
The power generation costs are $c_i > 0$, which are given; the total cost of power generation is $c^T g$.
Electrical loads are connected to the nodes $k + 1, \ldots, n$. We let $d_i \ge 0$ denote the demand at node
$k + i$, for $i = 1, \ldots, n - k$. We will consider these loads as given. In this simple model we will
neglect all power losses on lines or at nodes. Therefore, power must balance at each node: the total
power flowing into the node must equal the sum of the power flowing out of the node. This power
balance constraint can be expressed as
$$A p = \left[ \begin{array}{c} -g \\ d \end{array} \right],$$
where $A \in \mathbf{R}^{n \times m}$ is the node-incidence matrix of the graph, defined by
$$A_{ij} = \left\{ \begin{array}{ll}
+1 & \mbox{edge $j$ enters node $i$,} \\
-1 & \mbox{edge $j$ leaves node $i$,} \\
0 & \mbox{otherwise.}
\end{array} \right.$$
In the basic power flow optimization problem, we choose the generator powers $g$ and the line flow
powers $p$ to minimize the total power generation cost, subject to the constraints listed above.
The (given) problem data are the incidence matrix $A$, line capacities $P^{\max}$, demands $d$, maximum
generator powers $G^{\max}$, and generator costs $c$.
In this problem we will add a basic (and widely used) reliability constraint, commonly called an
"$N-1$ constraint". ($N$ is not a parameter in the problem; "$N-1$" just means "all-but-one".) This
states that the system can still operate even if any one power line goes out, by re-routing the line
powers. The case when line $j$ goes out is called "failure contingency $j$"; this corresponds to replacing
$P_j^{\max}$ with 0. The requirement is that there must exist a contingency power flow vector $p^{(j)}$ that
satisfies all the constraints above, with $p_j^{(j)} = 0$, using the same given generator powers. (This
corresponds to the idea that power flows can be re-routed quickly, but generator power can only
be changed more slowly.) The "$N-1$ reliability constraint" requires that for each line, there is a
contingency power flow vector. The "$N-1$ reliability constraint" is (implicitly) a constraint on the
generator powers.
The questions below concern the specific instance of this problem with data given in rel_pwr_flow_data.*.
(Executing this file will also generate a figure showing the network you are optimizing.) Especially
for part (b) below, you must explain exactly how you set up the problem as a convex optimization
problem.

(a) Nominal optimization. Find the optimal generator and line power flows for this problem
instance (without the $N-1$ reliability constraint). Report the optimal cost and generator
powers. (You do not have to give the power line flows.)
(b) Nominal optimization with $N-1$ reliability constraint. Minimize the nominal cost, but you
must choose generator powers that meet the $N-1$ reliability requirement as well. Report the
optimal cost and generator powers. (You do not have to give the nominal power line flows, or
any of the contingency flows.)

Solution.

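(a) To find the optimal generators and line power flows we solve the LP
$$\begin{array}{ll}
\mbox{minimize} & c^T g \\
\mbox{subject to} & A p = \left[ \begin{array}{c} -g \\ d \end{array} \right] \\
& -P^{\max} \preceq p \preceq P^{\max} \\
& 0 \preceq g \preceq G^{\max},
\end{array}$$
with variables $g$ and $p$.
(b) To handle the additional $N-1$ reliability constraint, we must introduce a set of power flow
vectors for each contingency. We then solve the LP
$$\begin{array}{ll}
\mbox{minimize} & c^T g \\
\mbox{subject to} & A p^{(j)} = \left[ \begin{array}{c} -g \\ d \end{array} \right], \quad j = 1, \ldots, m \\
& p_j^{(j)} = 0, \quad j = 1, \ldots, m \\
& -P^{\max} \preceq p^{(j)} \preceq P^{\max}, \quad j = 1, \ldots, m \\
& 0 \preceq g \preceq G^{\max},
\end{array}$$
with variables $g \in \mathbf{R}^k$ and $p^{(1)}, \ldots, p^{(m)} \in \mathbf{R}^m$. No separate nominal flow vector is needed,
since any contingency flow $p^{(j)}$ also satisfies the nominal constraints.

The optimal costs are 44.60 and 56.20 for parts (a) and (b) respectively. The optimal generator
powers are
$$g^{\mathrm{nom}} = \left[ \begin{array}{c} 3.0 \\ 0.0 \\ 2.3 \\ 7.0 \end{array} \right], \qquad
g^{\mathrm{rel}} = \left[ \begin{array}{c} 1.9 \\ 1.9 \\ 4.0 \\ 4.5 \end{array} \right].$$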

We can say a little about these results. In the nominal case, it turns out that the line capacities
are not tight; that is, we have no congestion on the transmission lines. So we must select a set of
generator powers that deliver the required power, which is the sum of the demands (12.32 power
units). To do this most efficiently, we start with generator 4, which has the lowest cost, and we
set it to its maximum, which gives us a total of 7 power units. We then go to the second cheapest
generator, generator 1, and set it to its maximum (3 power units), which gives us a total of 10
power units. Finally, we go to the third cheapest generator, generator 3, and use it to satisfy the
remaining demand.
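The cheapest-first reasoning above can be sketched in a few lines. This is only an illustration of the greedy rule, which is valid here because the line capacities are not tight; it is not a substitute for the LP. The cost vector and the capacities of generators 2 and 3 below are hypothetical values, chosen to be consistent with the ordering and limits described in the text, and are not read from the data file. The function name merit_order_dispatch is ours.

```python
def merit_order_dispatch(cost, g_max, demand):
    """Fill the demand from the cheapest generator upward (no line limits)."""
    g = [0.0] * len(cost)
    remaining = demand
    for i in sorted(range(len(cost)), key=lambda idx: cost[idx]):
        g[i] = min(g_max[i], remaining)
        remaining -= g[i]
    if remaining > 1e-9:
        raise ValueError("demand exceeds total generator capacity")
    return g

# Hypothetical per-unit costs: generator 4 cheapest, then 1, then 3, then 2.
cost = [4.0, 8.0, 5.0, 3.0]
# Capacities 3 (generator 1) and 7 (generator 4) are from the text;
# the other two are assumed for the example.
g_max = [3.0, 2.0, 4.0, 7.0]
demand = 12.32   # total demand quoted in the text

g = merit_order_dispatch(cost, g_max, demand)
total_cost = sum(c_i * g_i for c_i, g_i in zip(cost, g))

print(g, total_cost)
```

With these assumed numbers the greedy rule sets generator 4 to 7, generator 1 to 3, and generator 3 to the remaining 2.32 units, leaving generator 2 off, which is the pattern of the nominal solution reported above.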

When we impose the additional $N-1$ reliability constraint, we are forced to shift some generation
to more expensive generators.
The following MATLAB code solves parts (a) and (b).

rel_pwr_flow_data;

% nominal case
cvx_begin
variables p(m) g_nom(k)
minimize (c'*g_nom)
subject to
A*p == [-g_nom;d];
abs(p) <= Pmax;
g_nom <= Gmax;
g_nom >= 0;
cvx_end
nom_cost = cvx_optval;

% N-1 case
cvx_begin
variables P(m,m) g_rel(k)
minimize (c'*g_rel)
subject to
A*P == [-g_rel;d]*ones(1,m);
diag(P) == 0;
abs(P) <= Pmax*ones(1,m);
g_rel <= Gmax;
g_rel >= 0;
cvx_end
rel_cost = cvx_optval;

% show nominal optimal and reliable optimal cost


[nom_cost rel_cost]
% show nominal optimal and reliable optimal generator powers
[g_nom g_rel]

The following Python code solves parts (a) and (b).

import numpy as np
import cvxpy as cvx

from rel_pwr_flow_data import *

# nominal case
p = cvx.Variable(m)
g_nom = cvx.Variable(k)
nom_cost = cvx.Problem(cvx.Minimize(c.T*g_nom),
[A[:k,:]*p == -g_nom,

A[k:,:]*p == d.T,
cvx.abs(p) <= Pmax.T,
g_nom <= Gmax,
g_nom >= 0.]).solve()

# N-1 case
P = cvx.Variable(m,m)
g_rel = cvx.Variable(k)
rel_cost = cvx.Problem(cvx.Minimize(c.T*g_rel),
[A[:k,:]*P == -g_rel*np.ones((1,m)),
A[k:,:]*P == d.T*np.ones((1,m)),
cvx.diag(P) == 0,
cvx.abs(P) <= Pmax.T*np.ones((1,m)),
g_rel <= Gmax,
g_rel >= 0.]).solve()

# show nominal optimal and reliable optimal cost


print "nom_cost", nom_cost
print "rel_cost", rel_cost
# show nominal optimal and reliable optimal generator powers
print "g_nom:", g_nom.value.A1
print "g_rel:", g_rel.value.A1

The following Julia code solves parts (a) and (b).

include("rel_pwr_flow_data.jl")

using Convex, SCS

# nominal case
p = Variable(m);
g_nom = Variable(k);
constraints = [];
constraints += A*p == [-g_nom; d];
constraints += abs(p) <= Pmax;
constraints += g_nom <= Gmax;
constraints += g_nom >= 0;
problem = minimize(c'*g_nom, constraints);
solve!(problem, SCSSolver(max_iters=20000));
nom_cost = problem.optval;
println("nominal cost: ", nom_cost)
println("nominal generator powers: ")
println(g_nom.value)

# N-1 case
P = Variable(m,m);

g_rel = Variable(k);
constraints = [];
constraints += A*P == [-g_rel; d]*ones(1,m);
constraints += abs(P) <= Pmax * ones(1,m);
constraints += diag(P) == 0;
constraints += g_rel <= Gmax;
constraints += g_rel >= 0;
problem = minimize(c'*g_rel, constraints);
solve!(problem, SCSSolver(max_iters=20000));
rel_cost = problem.optval;
println("reliable cost: ", rel_cost)
println("reliable generator powers: ")
println(g_rel.value)

16.2 Optimal generator dispatch. In the generator dispatch problem, we schedule the electrical output
power of a set of generators over some time interval, to minimize the total cost of generation while
exactly meeting the (assumed known) electrical demand. One challenge in this problem is that the
generators have dynamic constraints, which couple their output powers over time. For example,
every generator has a maximum rate at which its power can be increased or decreased.
We label the generators i = 1, . . . , n, and the time periods t = 1, . . . , T . We let pi,t denote the
(nonnegative) power output of generator i at time interval t. The (positive) electrical demand in
period t is dt . The total generated power in each period must equal the demand:
n
X
pi,t = dt , t = 1, . . . , T.
i=1

Each generator has a minimum and maximum allowed output power:

\[
P_i^{\min} \leq p_{i,t} \leq P_i^{\max}, \quad i = 1, \ldots, n, \quad t = 1, \ldots, T.
\]

The cost of operating generator $i$ at power output $u$ is $\phi_i(u)$, where $\phi_i$ is an increasing strictly
convex function. (Assuming the cost is mostly fuel cost, convexity of $\phi_i$ says that the thermal
efficiency of the generator decreases as its output power increases.) We will assume these cost
functions are quadratic: $\phi_i(u) = \alpha_i u + \beta_i u^2$, with $\alpha_i$ and $\beta_i$ positive.
Each generator has a maximum ramp-rate, which limits the amount its power output can change
over one time period:

\[
|p_{i,t+1} - p_{i,t}| \leq R_i, \quad i = 1, \ldots, n, \quad t = 1, \ldots, T-1.
\]

In addition, changing the power output of generator $i$ from $u_t$ to $u_{t+1}$ incurs an additional cost
$\psi_i(u_{t+1} - u_t)$, where $\psi_i$ is a convex function. (This cost can be a real one, due to increased fuel
use during a change of power, or a fictitious one that accounts for the increased maintenance cost
or decreased lifetime caused by frequent or large changes in power output.) We will use the power
change cost functions $\psi_i(v) = \gamma_i |v|$, where $\gamma_i$ are positive.
Power plants with large capacity (i.e., $P_i^{\max}$) are typically more efficient (i.e., have smaller $\alpha_i$, $\beta_i$),
but have smaller ramp-rate limits, and higher costs associated with changing power levels. Small
gas-turbine plants (peakers) are less efficient, have less capacity, but their power levels can be
rapidly changed.
The total cost of operating the generators is
\[
C = \sum_{i=1}^n \sum_{t=1}^T \phi_i(p_{i,t}) + \sum_{i=1}^n \sum_{t=1}^{T-1} \psi_i(p_{i,t+1} - p_{i,t}).
\]

Choosing the generator output schedules to minimize $C$, while respecting the constraints described
above, is a convex optimization problem. The problem data are $d_t$ (the demands), the generator
power limits $P_i^{\min}$ and $P_i^{\max}$, the ramp-rate limits $R_i$, and the cost function parameters $\alpha_i$, $\beta_i$, and
$\gamma_i$. We will assume that the problem is feasible, and that $p^\star_{i,t}$ are the (unique) optimal output powers.
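The total cost and the constraints above are easy to evaluate for a candidate schedule. The following Python sketch does this with NumPy; the function names and the small data set are made up for illustration and are not part of the exercise data.

```python
import numpy as np

def dispatch_cost(p, alpha, beta, gamma):
    """Total cost C of a schedule p with shape (n, T)."""
    p = np.asarray(p, dtype=float)
    # generation cost: sum over i, t of alpha_i p_it + beta_i p_it^2
    gen = np.sum(alpha[:, None] * p + beta[:, None] * p**2)
    # power change cost: sum over i, t of gamma_i |p_{i,t+1} - p_it|
    change = np.sum(gamma[:, None] * np.abs(np.diff(p, axis=1)))
    return gen + change

def is_feasible(p, d, Pmin, Pmax, R, tol=1e-9):
    """Check power balance, power limits, and ramp-rate limits."""
    p = np.asarray(p, dtype=float)
    balance = np.allclose(p.sum(axis=0), d, atol=tol)
    limits = np.all(p >= Pmin[:, None] - tol) and np.all(p <= Pmax[:, None] + tol)
    ramp = np.all(np.abs(np.diff(p, axis=1)) <= R[:, None] + tol)
    return bool(balance and limits and ramp)

# hypothetical data: n = 2 generators, T = 3 periods
alpha = np.array([1.0, 2.0])
beta = np.array([0.5, 1.0])
gamma = np.array([3.0, 0.5])
Pmin = np.array([0.0, 0.0])
Pmax = np.array([5.0, 3.0])
R = np.array([1.0, 1.0])
d = np.array([3.0, 4.0, 5.0])
p = np.array([[2.0, 3.0, 3.0],
              [1.0, 1.0, 2.0]])

print(dispatch_cost(p, alpha, beta, gamma), is_feasible(p, d, Pmin, Pmax, R))
```

This only evaluates a given schedule; finding the optimal one requires solving the convex problem.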

(a) Price decomposition. Show that there are power prices $Q_1, \ldots, Q_T$ for which the following
holds: For each $i$, $p^\star_{i,t}$ solves the optimization problem

\[
\begin{array}{ll}
\mbox{minimize} & \sum_{t=1}^T \left( \phi_i(p_{i,t}) - Q_t p_{i,t} \right) + \sum_{t=1}^{T-1} \psi_i(p_{i,t+1} - p_{i,t}) \\
\mbox{subject to} & P_i^{\min} \leq p_{i,t} \leq P_i^{\max}, \quad t = 1, \ldots, T \\
& |p_{i,t+1} - p_{i,t}| \leq R_i, \quad t = 1, \ldots, T-1.
\end{array}
\]

The objective here is the portion of the objective for generator $i$, minus the revenue generated
by the sale of power at the prices $Q_t$. Note that this problem involves only generator $i$; it can
be solved independently of the other generators (once the prices are known). How would you
find the prices $Q_t$?
You do not have to give a full formal proof; but you must explain your argument fully. You
are welcome to use results from the text book.
(b) Solve the generator dispatch problem with the data given in gen_dispatch_data.m, which
gives (fake, but not unreasonable) demand data for 2 days, at 15 minute intervals. This file
includes code to plot the demand, optimal generator powers, and prices. (You must replace
these variables with their correct values.) Comment on anything you see in your solution
that might at first seem odd. Using the prices found, solve the problems in part (a) for the
generators separately, to be sure they give the optimal powers (up to some small numerical
errors).

Remark. While beyond the scope of this course, we mention that there are very simple price update
mechanisms that adjust the prices in such a way that when the generators independently schedule
themselves using the prices (as described above), we end up with the total power generated in each
period matching the demand, i.e., the optimal solution of the whole (coupled) problem. This gives
a decentralized method for generator dispatch.
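To give a feel for such a price update, here is a minimal Python sketch under strong simplifying assumptions that are ours, not the exercise's: a single time period, no ramp-rate limits, and no power change cost. In that case each generator's response to a price $Q$ has a closed form (clip $(Q - \alpha_i)/(2\beta_i)$ to the power limits), and the price is raised when supply falls short of demand and lowered otherwise. The step size and data are made up.

```python
import numpy as np

def generator_response(Q, alpha, beta, Pmin, Pmax):
    """Each generator minimizes alpha*u + beta*u**2 - Q*u over [Pmin, Pmax]."""
    return np.clip((Q - alpha) / (2.0 * beta), Pmin, Pmax)

def price_iteration(d, alpha, beta, Pmin, Pmax, step=0.5, iters=200):
    """Simple price adjustment: move the price in proportion to the shortfall."""
    Q = 0.0
    for _ in range(iters):
        u = generator_response(Q, alpha, beta, Pmin, Pmax)
        Q += step * (d - u.sum())  # raise the price if supply is below demand
    return Q, generator_response(Q, alpha, beta, Pmin, Pmax)

# hypothetical single-period data for two generators
alpha = np.array([1.0, 2.0])
beta = np.array([0.5, 1.0])
Pmin = np.array([0.0, 0.0])
Pmax = np.array([5.0, 5.0])

Q, u = price_iteration(4.0, alpha, beta, Pmin, Pmax)
print(Q, u, u.sum())
```

The step size must be small enough relative to the generators' price sensitivity for the iteration to converge; the value used here is tuned to this toy data.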
Solution.

(a) We start by forming the partial Lagrangian, introducing a dual variable $\nu \in \mathbf{R}^T$ for the equality
constraint:
\[
L(p, \nu) = C + \sum_{t=1}^T \nu_t \left( d_t - \sum_{i=1}^n p_{i,t} \right),
\]
restricted to the set of $p$ that satisfy the various constraints. To get the dual function, we
minimize over $p$. But this is separable: regrouping the terms by generator gives
\[
L(p, \nu) = \sum_{t=1}^T \nu_t d_t + \sum_{i=1}^n \left( \sum_{t=1}^T \left( \phi_i(p_{i,t}) - \nu_t p_{i,t} \right) + \sum_{t=1}^{T-1} \psi_i(p_{i,t+1} - p_{i,t}) \right),
\]
so we can minimize over the power schedule for each generator separately. For each $i$ we solve the problem
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{t=1}^T \left( \phi_i(p_{i,t}) - \nu_t p_{i,t} \right) + \sum_{t=1}^{T-1} \psi_i(p_{i,t+1} - p_{i,t}) \\
\mbox{subject to} & P_i^{\min} \leq p_{i,t} \leq P_i^{\max}, \quad t = 1, \ldots, T \\
& |p_{i,t+1} - p_{i,t}| \leq R_i, \quad t = 1, \ldots, T-1.
\end{array}
\]

Now we observe that strong duality holds, since the constraints are linear and the problem
is feasible. This means that when $\nu$ is (dual) optimal, the dual function equals the optimal
(primal) objective value. Since the Lagrangian, with optimal dual variable, is strictly
convex in $p$, it has a unique minimizer; this must be the solution of the original problem.
Thus we see that the prices are none other than optimal dual variables for the power balance
equations, i.e., $Q_t = \nu_t^\star$.
(b) The code below solves the problem.

% solve generator dispatch problem


gen_dispatch_data;

cvx_begin
variable p(n,T);
dual variable Q;
p <= Pmax*ones(1,T);
p >= Pmin*ones(1,T);
Q: sum(p,1) == d; % get lagrange multipliers, which are the prices
abs(p(:,2:T)-p(:,1:T-1)) <= R*ones(1,T-1);
power_cost= sum(alpha'*p+beta'*(p.^2));
change_power_cost = sum(gamma'*abs(p(:,2:T)-p(:,1:T-1)));
minimize (power_cost+change_power_cost)
cvx_end

subplot(3,1,1)
plot(t,d);
title('demand')
subplot(3,1,2)
plot(t,p);
title('generator powers')
subplot(3,1,3)
plot(t,Q);
title('power prices')

print -depsc gen_dispatch

% now let's solve the problems separately, with the given prices
% we'll solve in one big problem, which is separable, so we're really
% solving n separate problems, one for each generator

cvx_begin
variable pp(n,T);
pp <= Pmax*ones(1,T);
pp >= Pmin*ones(1,T);
%sum(pp,1) == d;
abs(pp(:,2:T)-pp(:,1:T-1)) <= R*ones(1,T-1);
power_cost= sum(alpha'*pp+beta'*(pp.^2));
change_power_cost = sum(gamma'*abs(pp(:,2:T)-pp(:,1:T-1)));
minimize (power_cost+change_power_cost-sum(pp*Q'))
cvx_end

rel_error = norm(pp-p)/norm(p)

This produces the following plots. Note that in two periods, power prices become negative.
This is correct: the negative price is an incentive to the generators to rapidly ramp down
power levels at that time period, which has low demand. The relative error in computing the
powers using the prices found is on the order of $10^{-7}$.

[Figure: three stacked panels plotted against time: demand, generator powers, and power prices.]

16.3 Optimizing a portfolio of energy sources. We have $n$ different energy sources, such as coal-fired
plants, several wind farms, and solar farms. Our job is to size each of these, i.e., to choose its
capacity. We will denote by $c_i$ the capacity of plant $i$; these must satisfy $c_i^{\min} \leq c_i \leq c_i^{\max}$, where
$c_i^{\min}$ and $c_i^{\max}$ are given minimum and maximum values.

Each generation source has a cost to build and operate (including fuel, maintenance, government
subsidies and taxes) over some time period. We lump these costs together, and assume that the
cost is proportional to $c_i$, with (given) coefficient $b_i$. Thus, the total cost to build and operate the
energy sources is $b^T c$ (in, say, $/hour).
Each generation source is characterized by an availability $a_i$, which is a random variable with values
in $[0, 1]$. If source $i$ has capacity $c_i$, then the power available from the plant is $c_i a_i$; the total power
available from the portfolio of energy sources is $c^T a$, which is a random variable. A coal fired plant
has $a_i = 1$ almost always, with $a_i < 1$ when one of its units is down for maintenance. A wind farm,
in contrast, is characterized by strong fluctuations in availability, with $a_i = 1$ meaning a strong wind
is blowing, and $a_i = 0$ meaning no wind is blowing. A solar farm has $a_i = 1$ only during peak sun
hours, with no cloud cover; at other times (such as night) we have $a_i = 0$.
Energy demand $d \in \mathbf{R}_+$ is also modeled as a random variable. The components of $a$ (the availabilities)
and $d$ (the demand) are not independent. Whenever the total power available falls short of
the demand, the additional needed power is generated by (expensive) peaking power plants at a
fixed positive price $p$. The average cost of energy produced by the peakers is

\[
\mathbf{E}\, p(d - c^T a)_+,
\]

where $x_+ = \max\{0, x\}$. This average cost has the same units as the cost $b^T c$ to build and operate
the plants.
The objective is to choose $c$ to minimize the overall cost

\[
C = b^T c + \mathbf{E}\, p(d - c^T a)_+.
\]

Sample average approximation. To solve this problem, we will minimize a cost function based
on a sample average of peaker cost,
\[
C^{\mathrm{sa}} = b^T c + \frac{1}{N} \sum_{j=1}^N p(d^{(j)} - c^T a^{(j)})_+,
\]
where $(a^{(j)}, d^{(j)})$, $j = 1, \ldots, N$, are (given) samples from the joint distribution of $a$ and $d$. (These
might be obtained from historical data, weather and demand forecasting, and so on.)

Validation. After finding an optimal value of $c$, based on the set of samples, you should double
check or validate your choice of $c$ by evaluating the overall cost on another set of (validation)
samples, $(\tilde a^{(j)}, \tilde d^{(j)})$, $j = 1, \ldots, N^{\mathrm{val}}$,
\[
C^{\mathrm{val}} = b^T c + \frac{1}{N^{\mathrm{val}}} \sum_{j=1}^{N^{\mathrm{val}}} p(\tilde d^{(j)} - c^T \tilde a^{(j)})_+.
\]

(These could be another set of historical data, held back for validation purposes.) If $C^{\mathrm{sa}} \approx C^{\mathrm{val}}$,
our confidence that each of them is approximately the optimal value of $C$ is increased.
Finally we get to the problem. Get the data in energy_portfolio_data.m, which includes the
required problem data, and the samples, which are given as a $1 \times N$ row vector d for the scalars
$d^{(j)}$, and an $n \times N$ matrix A for $a^{(j)}$. A second set of samples is given for validation, with the names
d_val and A_val.
Carry out the optimization described above. Give the optimal cost obtained, $C^{\mathrm{sa}}$, and compare to
the cost evaluated using the validation data set, $C^{\mathrm{val}}$.
Compare your solution with the following naive (certainty-equivalent) approach: Replace $a$ and
$d$ with their (sample) means, and then solve the resulting optimization problem. Give the optimal
cost obtained, $C^{\mathrm{ce}}$ (using the average values of $a$ and $d$). Is this a lower bound on the optimal value
of the original problem? Now evaluate the cost for these capacities on the validation set, $C^{\mathrm{ce,val}}$.
Make a brief statement.
Solution. The problem can be formulated as

\[
\begin{array}{ll}
\mbox{minimize} & b^T c + \mathbf{E}\, p(d - c^T a)_+ \\
\mbox{subject to} & c^{\min} \preceq c \preceq c^{\max},
\end{array}
\]

with variable $c \in \mathbf{R}^n$. It is convex since $p(d - c^T a)_+$ is a convex function of $c$ for each $a$ and $d$
(using $p > 0$), and convexity is preserved under expectation.
The sample average approximation problem is

\[
\begin{array}{ll}
\mbox{minimize} & b^T c + (1/N) \sum_{j=1}^N p(d^{(j)} - c^T a^{(j)})_+ \\
\mbox{subject to} & c^{\min} \preceq c \preceq c^{\max},
\end{array}
\]

with variable $c$.
The naive (certainty-equivalent) problem is

\[
\begin{array}{ll}
\mbox{minimize} & b^T c + p(\bar d - c^T \bar a)_+ \\
\mbox{subject to} & c^{\min} \preceq c \preceq c^{\max},
\end{array}
\]

with variable $c$, where

\[
\bar d = (1/N) \sum_{j=1}^N d^{(j)}, \qquad \bar a = (1/N) \sum_{j=1}^N a^{(j)}.
\]

By Jensen's inequality, the objective function of this problem is less than or equal to the objective
for the original problem, so the optimal cost obtained by solving the naive problem is a lower bound
on the optimal cost of the original problem.
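The Jensen bound can be checked numerically for any fixed capacity vector. The Python sketch below uses synthetic samples (not the data in energy_portfolio_data.m) and hypothetical function names; it compares the certainty-equivalent objective, built from the sample means, with the sample average objective built from the same samples.

```python
import numpy as np

def sample_average_cost(c, b, p, A, d):
    """b^T c + (1/N) sum_j p (d_j - c^T a_j)_+, with A of shape (n, N)."""
    shortfall = np.maximum(d - c @ A, 0.0)
    return b @ c + p * shortfall.mean()

def certainty_equivalent_cost(c, b, p, A, d):
    """Same objective with a and d replaced by their sample means."""
    return b @ c + p * max(d.mean() - c @ A.mean(axis=1), 0.0)

# synthetic data, for illustration only
rng = np.random.default_rng(0)
n, N = 3, 500
A = rng.uniform(0.0, 1.0, size=(n, N))   # availabilities in [0, 1]
d = rng.uniform(1.0, 3.0, size=N)        # nonnegative demands
b = np.array([1.0, 0.5, 0.8])
p = 5.0

c = np.array([1.0, 2.0, 0.5])            # an arbitrary capacity choice
sa = sample_average_cost(c, b, p, A, d)
ce = certainty_equivalent_cost(c, b, p, A, d)
print(ce, sa)                            # ce never exceeds sa
```

Since the inequality holds for every $c$, it also holds after minimizing both sides over the same feasible set.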
We obtain the following numerical results. The
sampled problem has cost $C^{\mathrm{sa}} = 10176$. Evaluating the objective for the validation set gives
$C^{\mathrm{val}} = 9906$. Since these two values are reasonably close we can guess that our solution is close to
optimal for the original problem. The optimal cost for the naive approach is $C^{\mathrm{ce}} = 6695$, which is
indeed less than the value obtained by the sample average approximation. Evaluating the objective
on the validation set gives $C^{\mathrm{ce,val}} = 12441$, which shows that the naive approach is not very good.
By solving the stochastic problem (well, OK, approximately), we've saved around 22% in cost.
The following code solves the problem.

% solution to energy portfolio optimization problem


energy_portfolio_data;

% sample average approximation problem
cvx_begin
variable c(n)
minimize (b'*c + (1/N)*sum(p*pos(d-c'*A)));
cmin <= c;
c <= cmax;
cvx_end

Csa = cvx_optval % sample average optimal cost

% now evaluate objective on validation data


Cval = b'*c + (1/Nval)*sum(p*pos(dval-c'*Aval))

% naive method
dbar = (1/N)*sum(d);
Abar = (1/N)*sum(A,2);
cvx_begin
variable cce(n)
minimize (b'*cce+ p*pos(dbar-cce'*Abar))
cmin <= cce;
cce <= cmax;
cvx_end

Cce = cvx_optval % naive method optimal cost

% now evaluate objective on validation data


Cceval = b'*cce + (1/Nval)*sum(p*pos(dval-cce'*Aval))

16.4 Optimizing processor speed. A set of $n$ tasks is to be completed by $n$ processors. The variables
to be chosen are the processor speeds $s_1, \ldots, s_n$, which must lie between a given minimum value
$s_{\min}$ and a maximum value $s_{\max}$. The computational load of task $i$ is $\alpha_i$, so the time required to
complete task $i$ is $\tau_i = \alpha_i / s_i$.
The power consumed by processor $i$ is given by $p_i = f(s_i)$, where $f : \mathbf{R} \to \mathbf{R}$ is positive, increasing,
and convex. Therefore, the total energy consumed is
\[
E = \sum_{i=1}^n \frac{\alpha_i}{s_i} f(s_i).
\]

(Here we ignore the energy used to transfer data between processors, and assume the processors
are powered down when they are not active.)
There is a set of precedence constraints for the tasks, which is a set of $m$ ordered pairs $\mathcal{P} \subseteq
\{1, \ldots, n\} \times \{1, \ldots, n\}$. If $(i, j) \in \mathcal{P}$, then task $j$ cannot start until task $i$ finishes. (This would be
the case, for example, if task $j$ requires data that is computed in task $i$.) When $(i, j) \in \mathcal{P}$, we refer
to task $i$ as a precedent of task $j$, since it must precede task $j$. We assume that the precedence
constraints define a directed acyclic graph (DAG), with an edge from $i$ to $j$ if $(i, j) \in \mathcal{P}$.
If a task has no precedents, then it starts at time $t = 0$. Otherwise, each task starts as soon as all
of its precedents have finished. We let $T$ denote the time for all tasks to be completed.
To be sure the precedence constraints are clear, we consider the very small example shown below,
with $n = 6$ tasks and $m = 6$ precedence constraints.

\[
\mathcal{P} = \{(1, 4), (1, 3), (2, 3), (3, 6), (4, 6), (5, 6)\}.
\]

[Figure: directed acyclic graph of the precedence constraints for this example.]

In this example, tasks 1, 2, and 5 start at time $t = 0$ (since they have no precedents). Task 1
finishes at $t = \tau_1$, task 2 finishes at $t = \tau_2$, and task 5 finishes at $t = \tau_5$. Task 3 has tasks 1 and 2 as
precedents, so it starts at time $t = \max\{\tau_1, \tau_2\}$, and ends $\tau_3$ seconds later, at $t = \max\{\tau_1, \tau_2\} + \tau_3$.
Task 4 completes at time $t = \tau_1 + \tau_4$. Task 6 starts when tasks 3, 4, and 5 have finished, at time
$t = \max\{\max\{\tau_1, \tau_2\} + \tau_3, \; \tau_1 + \tau_4, \; \tau_5\}$. It finishes $\tau_6$ seconds later. In this example, task 6 is the
last task to be completed, so we have

\[
T = \max\{\max\{\tau_1, \tau_2\} + \tau_3, \; \tau_1 + \tau_4, \; \tau_5\} + \tau_6.
\]
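The rule "each task starts as soon as all of its precedents have finished" can be evaluated directly for given task times. The Python sketch below does so for the small example; the function name and the task times are made up for illustration.

```python
def completion_times(tau, prec):
    """Finish time of each task; tasks are numbered 1..n and prec lists pairs (i, j)."""
    n = len(tau)
    preds = {j: [] for j in range(1, n + 1)}
    for i, j in prec:
        preds[j].append(i)
    finish = {}

    def finish_time(j):
        # a task starts when its last precedent finishes (or at 0 if it has none)
        if j not in finish:
            start = max((finish_time(i) for i in preds[j]), default=0.0)
            finish[j] = start + tau[j - 1]
        return finish[j]

    return [finish_time(j) for j in range(1, n + 1)]

prec = [(1, 4), (1, 3), (2, 3), (3, 6), (4, 6), (5, 6)]
tau = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical task times
finish = completion_times(tau, prec)
T = max(finish)
print(finish, T)
```

For these task times the result agrees with the formula above: $\max\{\max\{1, 2\} + 3, \; 1 + 4, \; 5\} + 6 = 11$.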

(a) Formulate the problem of choosing processor speeds (between the given limits) to minimize
completion time $T$, subject to an energy limit $E \leq E_{\max}$, as a convex optimization problem.
The data in this problem are $\mathcal{P}$, $s_{\min}$, $s_{\max}$, $\alpha_1, \ldots, \alpha_n$, $E_{\max}$, and the function $f$. The variables
are $s_1, \ldots, s_n$.
Feel free to change variables or to introduce new variables. Be sure to explain clearly why
your formulation of the problem is convex, and why it is equivalent to the problem statement
above.
Important:
- Your formulation must be convex for any function $f$ that is positive, increasing, and
convex. You cannot make any further assumptions about $f$.
- This problem refers to the general case, not the small example described above.
(b) Consider the specific instance with data given in proc_speed_data.m, and processor power

\[
f(s) = 1 + s + s^2 + s^3.
\]

The precedence constraints are given by an $m \times 2$ matrix prec, where $m$ is the number of
precedence constraints, with each row giving one precedence constraint (the first column gives
the precedents).
Plot the optimal trade-off curve of energy $E$ versus time $T$, over a range of $T$ that extends
from its minimum to its maximum possible value. (These occur when all processors operate at
$s_{\max}$ and $s_{\min}$, respectively, since $T$ is monotone nonincreasing in $s$.) On the same plot, show
the energy-time trade-off obtained when all processors operate at the same speed $s$, which is
varied from $s_{\min}$ to $s_{\max}$.
Note: In this part of the problem there is no limit $E_{\max}$ on $E$ as in part (a); you are to find
the optimal trade-off of $E$ versus $T$.

Solution.
(a) First let's look at the energy $E$. In general it is not a convex function of $s$. For example
consider $f(s) = s^{1.5}$, which is increasing and convex. But $(1/s)f(s) = \sqrt{s}$, which is not
convex. So we're going to need to reformulate the problem somehow.
We introduce the variable $\tau \in \mathbf{R}^n$, defined as
\[
\tau_i = \alpha_i / s_i.
\]
The variable $\tau_i$ is the time required to complete task $i$. We can recover $s_i$ from $\tau_i$ as $s_i = \alpha_i / \tau_i$.
We'll use $\tau_i$ instead of $s_i$.
The energy $E$, as a function of $\tau$, is
\[
E = \sum_{i=1}^n \tau_i f(\alpha_i / \tau_i).
\]

This is a convex function of $\tau$, since each term is the perspective of $f$, $y f(x/y)$, evaluated at
$y = \tau_i$ and $x = \alpha_i$. (This shows that $E$ is jointly convex in $\tau$ and $\alpha$, but we take $\alpha$ constant
here.)
The processor speed limits $s_{\min} \leq s_i \leq s_{\max}$ are equivalent to
\[
\alpha_i / s_{\max} \leq \tau_i \leq \alpha_i / s_{\min}, \quad i = 1, \ldots, n.
\]

Now let's look at the precedence constraints. To tackle these, we introduce the variable $t \in \mathbf{R}^n$,
where $t_i$ is an upper bound on the completion time of task $i$. Thus, we have
\[
T \leq \max_i t_i.
\]

Task $i$ cannot start before all its precedents have finished; after that, it takes at least $\tau_i$ more
time. Thus, we have
\[
t_j \geq t_i + \tau_j, \quad (i, j) \in \mathcal{P}.
\]
Tasks that have no precedent must satisfy $t_i \geq \tau_i$. In fact, this holds for all tasks, so we have
\[
t_i \geq \tau_i, \quad i = 1, \ldots, n.
\]

We formulate the problem as

\[
\begin{array}{ll}
\mbox{minimize} & \max_i t_i \\
\mbox{subject to} & \sum_{i=1}^n \tau_i f(\alpha_i / \tau_i) \leq E_{\max} \\
& \alpha_i / s_{\max} \leq \tau_i \leq \alpha_i / s_{\min}, \quad i = 1, \ldots, n \\
& t_i \geq \tau_i, \quad i = 1, \ldots, n \\
& t_j \geq t_i + \tau_j, \quad (i, j) \in \mathcal{P},
\end{array}
\]
with variables $t$ and $\tau$. The energy constraint is convex, and the other constraints are linear.
The objective is convex.

(b) For this particular problem, we have

\[
\tau_i f(\alpha_i / \tau_i) = \tau_i + \alpha_i + \alpha_i^2 / \tau_i + \alpha_i^3 / \tau_i^2.
\]

To generate the optimal tradeoff curve we scalarize, and minimize $T + \lambda E$ for $\lambda$ varying over
some range that gives us the full range of $T$. Thus, we solve the problem

\[
\begin{array}{ll}
\mbox{minimize} & \max_i t_i + \lambda \sum_{i=1}^n \left( \tau_i + \alpha_i + \alpha_i^2 / \tau_i + \alpha_i^3 / \tau_i^2 \right) \\
\mbox{subject to} & \alpha_i / s_{\max} \leq \tau_i \leq \alpha_i / s_{\min}, \quad i = 1, \ldots, n \\
& t_i \geq \tau_i, \quad i = 1, \ldots, n \\
& t_j \geq t_i + \tau_j, \quad (i, j) \in \mathcal{P},
\end{array}
\]

for $\lambda$ taking values in some range.

If we constrain all processors to have the same speed $s$, we are in effect adding the constraint
$\tau = (1/s)\alpha$. In this case we can find the time required to complete all processes by solving
the problem
\[
\begin{array}{ll}
\mbox{minimize} & \max_i t_i \\
\mbox{subject to} & t_i \geq \alpha_i / s, \quad i = 1, \ldots, n \\
& t_j \geq t_i + \alpha_j / s, \quad (i, j) \in \mathcal{P}.
\end{array}
\]
(We don't really need to solve an optimization problem here; but it's easier to solve it than to
write the code to evaluate $T$.) To generate the tradeoff curve for the case when all processors
are running at the same speed, we solve the problem above for $s$ ranging between $s_{\min} = 1$
and $s_{\max} = 5$. This gives us the full range of possible values of $T$: when $s = s_{\max}$ we find
$T = 3.243$; when $s = s_{\min}$ we find $T = 16.212$.
The following matlab code was used to plot the two tradeoff curves:
cvx_quiet(true);
ps_data

% Optimal power-time tradeoff curve


Eopt = []; Topt = [];
fprintf(1,'Optimal tradeoff curve\n')
for lambda = logspace(0,-3,30);
fprintf(1,'Solving for lambda = %1.3f\n',lambda);
cvx_begin
variables t(n) tau(n)
E = sum(tau+alpha+alpha.^2.*inv_pos(tau)+...
alpha.^3.*square_pos(inv_pos(tau)));
minimize(lambda*E+max(t))
subject to
t(prec(:,2)) >= t(prec(:,1))+tau(prec(:,2))
t >= tau
tau >= alpha/s_max
tau <= alpha/s_min
cvx_end
E = sum(tau+alpha+alpha.^2./tau+alpha.^3./(tau.^2));

T = max(t);
Eopt = [Eopt E];
Topt = [Topt T];
end

% Tradeoff-curve for constant speed


fprintf(1,'\nConstant speed tradeoff curve\n')
Econst = []; Tconst = [];
for s_const = linspace(s_min,s_max,30);
fprintf(1,'Solving for s = %1.3f\n',s_const);
cvx_begin
variables t(n)
minimize(max(t))
subject to
t(prec(:,2)) >= t(prec(:,1))+alpha(prec(:,2))/s_const
t >= alpha/s_const
cvx_end
E = sum(alpha*(1/s_const+1+s_const+s_const^2));
T = max(t);
Econst = [Econst E];
Tconst = [Tconst T];
end

plot(Tconst,Econst,'r--')
hold on
plot(Topt,Eopt,'b-')
xlabel('Time')
ylabel('Energy')
grid on
axis([0 20 0 4000])
print -depsc processor_speed.eps
The two tradeoff curves are shown in the following plot. The solid line corresponds to the
optimal tradeoff curve, while the dotted line corresponds to the tradeoff curve with constant
processor speed.

[Figure: the two tradeoff curves; horizontal axis Time, vertical axis Energy.]

We see that the optimal processor speeds use significantly less energy than when all processors
have the same speed, adjusted to give the same $T$, especially when $T$ is small.
We note that this particular problem can be solved without using the formulation given in
part (a). For this particular power function we can actually use $s$ as the optimization variable;
we don't need to change coordinates to $\tau$. This is because $E$ is a convex function of $s$; it has
the form
\[
E = \sum_{i=1}^n \alpha_i \left( 1/s_i + 1 + s_i + s_i^2 \right).
\]
To get the tradeoff curve we can solve the problem
\[
\begin{array}{ll}
\mbox{minimize} & \max_i t_i + \lambda \sum_{i=1}^n \alpha_i \left( 1/s_i + 1 + s_i + s_i^2 \right) \\
\mbox{subject to} & s_{\min} \leq s_i \leq s_{\max}, \quad i = 1, \ldots, n \\
& t_i \geq \alpha_i / s_i, \quad i = 1, \ldots, n \\
& t_j \geq t_i + \alpha_j / s_j, \quad (i, j) \in \mathcal{P},
\end{array}
\]
with variables $t$ and $s$, for a range of positive values of $\lambda$.
cvx_begin
variables s(n) t(n)
E = alpha'*(inv_pos(s)+1+s+square_pos(s));
minimize(lambda*E+max(t))
subject to
t(prec(:,2)) >= t(prec(:,1))+alpha(prec(:,2)).*...
inv_pos(s(prec(:,2)))
t >= alpha.*inv_pos(s)
s >= s_min % speed limits: s_min <= s <= s_max
s <= s_max
cvx_end

Finally, we note that this specific problem can also be cast as a GP, since E is a posynomial
function of the speeds, and all the constraints can be written as posynomial inequalities.
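Two of the algebraic claims in this solution are easy to sanity-check numerically. The Python sketch below (with made-up loads and task times) confirms that the energy written in terms of $s$ matches the expanded expression in terms of $\tau$ for $f(s) = 1 + s + s^2 + s^3$, and exhibits a midpoint violation of convexity for $\sqrt{s}$, which is $(1/s)f(s)$ when $f(s) = s^{1.5}$.

```python
import numpy as np

def f(s):
    """Processor power used in part (b)."""
    return 1 + s + s**2 + s**3

alpha = np.array([1.0, 2.0, 0.5])   # hypothetical computational loads
tau = np.array([0.4, 1.5, 0.3])     # hypothetical task times
s = alpha / tau                     # corresponding speeds

# energy in terms of s, and the expanded perspective form in terms of tau
energy_in_s = np.sum(alpha / s * f(s))
energy_in_tau = np.sum(tau + alpha + alpha**2 / tau + alpha**3 / tau**2)

# a convex g would satisfy g((a+b)/2) <= (g(a)+g(b))/2; sqrt violates this
a, b = 1.0, 9.0
midpoint_gap = np.sqrt((a + b) / 2) - (np.sqrt(a) + np.sqrt(b)) / 2

print(energy_in_s, energy_in_tau, midpoint_gap)
```

A positive midpoint gap shows that $\sqrt{s}$ is not convex, which is why the change of variables to $\tau$ is needed for general $f$.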

16.5 Minimum energy processor speed scheduling. A single processor can adjust its speed in each of $T$
time periods, labeled $1, \ldots, T$. Its speed in period $t$ will be denoted $s_t$, $t = 1, \ldots, T$. The speeds
must lie between given (positive) minimum and maximum values, $S^{\min}$ and $S^{\max}$, respectively, and
must satisfy a slew-rate limit, $|s_{t+1} - s_t| \leq R$, $t = 1, \ldots, T-1$. (That is, $R$ is the maximum allowed
period-to-period change in speed.) The energy consumed by the processor in period $t$ is given by
$\phi(s_t)$, where $\phi : \mathbf{R} \to \mathbf{R}$ is increasing and convex. The total energy consumed over all the periods
is $E = \sum_{t=1}^T \phi(s_t)$.

The processor must handle $n$ jobs, labeled $1, \ldots, n$. Each job has an availability time $A_i \in
\{1, \ldots, T\}$, and a deadline $D_i \in \{1, \ldots, T\}$, with $D_i \geq A_i$. The processor cannot start work
on job $i$ until period $t = A_i$, and must complete the job by the end of period $D_i$. Job $i$ involves a
(nonnegative) total work $W_i$. You can assume that in each time period, there is at least one job
available, i.e., for each $t$, there is at least one $i$ with $A_i \leq t$ and $D_i \geq t$.
In period $t$, the processor allocates its effort across the $n$ jobs as $\theta_t$, where $\mathbf{1}^T \theta_t = 1$, $\theta_t \succeq 0$. Here
$\theta_{ti}$ (the $i$th component of $\theta_t$) gives the fraction of the processor effort devoted to job $i$ in period $t$.
Respecting the availability and deadline constraints requires that $\theta_{ti} = 0$ for $t < A_i$ or $t > D_i$. To
complete the jobs we must have
\[
\sum_{t=A_i}^{D_i} \theta_{ti} s_t \geq W_i, \quad i = 1, \ldots, n.
\]

(a) Formulate the problem of choosing the speeds $s_1, \ldots, s_T$, and the allocations $\theta_1, \ldots, \theta_T$, in
order to minimize the total energy $E$, as a convex optimization problem. The problem data
are $S^{\min}$, $S^{\max}$, $R$, $\phi$, and the job data, $A_i$, $D_i$, $W_i$, $i = 1, \ldots, n$. Be sure to justify any change
of variables, or introduction of new variables, that you use in your formulation.
(b) Carry out your method on the problem instance described in proc_sched_data.m, with
quadratic energy function $\phi(s_t) = \alpha + \beta s_t + \gamma s_t^2$. (The parameters $\alpha$, $\beta$, and $\gamma$ are given
in the data file.) Executing this file will also give a plot showing the availability times and
deadlines for the jobs.
Give the energy obtained by your speed profile and allocations. Plot these using the command
bar((s*ones(1,n)).*theta,1,'stacked'), where s is the $T \times 1$ vector of speeds, and theta is
the $T \times n$ matrix of allocations with components $\theta_{ti}$. This will show, at each time period, how
much effective speed is allocated to each job. The top of the plot will show the speed $s_t$. (You
don't need to turn in a color version of this plot; B&W is fine.)

Solution. The trick is to work with the variables $x_{ti} = \theta_{ti} s_t$, which must be nonnegative. Let
$X \in \mathbf{R}^{T \times n}$ denote the matrix with components $x_{ti}$. The job completion constraint can be expressed
as $X^T \mathbf{1} \succeq W$. The availability and deadline constraints can be expressed as $x_{ti} = 0$ for $t < A_i$ or
$t > D_i$, which are linear constraints. The speed can be expressed as $s = X\mathbf{1}$, a linear function of
our variable $X$. The objective $E$ is clearly a convex function of $s$ (and so, also of $X$).

Figure 17: Job availability diagram. Stars indicate the time in which each job becomes available.
Open circles indicate the required end of the job, which is at the end of the deadline time period.

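Our convex optimization problem is

\[
\begin{array}{ll}
\mbox{minimize} & E = \sum_{t=1}^T \phi(s_t) \\
\mbox{subject to} & S^{\min} \preceq s \preceq S^{\max}, \quad s = X\mathbf{1}, \quad X^T \mathbf{1} \succeq W, \quad X \succeq 0 \\
& |s_{t+1} - s_t| \leq R, \quad t = 1, \ldots, T-1 \\
& X_{ti} = 0, \quad t = 1, \ldots, A_i - 1, \quad i = 1, \ldots, n \\
& X_{ti} = 0, \quad t = D_i + 1, \ldots, T, \quad i = 1, \ldots, n
\end{array}
\]

with variables $s \in \mathbf{R}^T$ and $X \in \mathbf{R}^{T \times n}$. All generalized inequalities here, including $X \succeq 0$, are
elementwise nonnegativity. Evidently this is a convex problem.
Once we find an optimal $X^\star$ and $s^\star$, we can recover $\theta_t^\star$ as
\[
\theta^\star_{ti} = (1/s^\star_t) x^\star_{ti}, \quad t = 1, \ldots, T, \quad i = 1, \ldots, n.
\]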
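The change of variables and its inverse can be illustrated on a tiny hand-made matrix $X$ (hypothetical numbers, not a solution of the data instance). The Python sketch below forms the speeds as row sums, recovers the allocations, and evaluates the work delivered to each job.

```python
import numpy as np

# hypothetical X for T = 3 periods and n = 2 jobs; x_ti = theta_ti * s_t
X = np.array([[1.0, 0.0],
              [0.5, 1.5],
              [0.0, 2.0]])

s = X.sum(axis=1)            # s = X 1, the speed in each period
theta = X / s[:, None]       # theta_ti = x_ti / s_t (valid since s_t > 0)
work_done = X.sum(axis=0)    # X^T 1, the total work delivered to each job

print(s, theta, work_done)
```

The division is safe because the speed limits force $s_t \geq S^{\min} > 0$, and each row of theta sums to one by construction.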

For our data, the availability and deadlines of jobs are plotted in figure 17. The job duration lines
show a start time at the beginning of the available time period and a termination time at the very
end of the deadline time period. Thus, the length of the line shows the actual duration within
which the job is allowed to be completed.
The code that solves the problem is as follows.

proc_sched_data;

cvx_begin
variable X(T,n)

X >= 0;
s = sum(X,2); % s = X*1, the T x 1 vector of speeds
minimize(sum(alpha+beta*s+gamma*square(s)))
s >= Smin;
s <= Smax;
abs(s(2:end)-s(1:end-1))<=R; % slew rate constraint

% start/stop constraints
for i=1:n
for t=1:A(i)-1
X(t,i)==0;
end
for t=D(i)+1:T
X(t,i)==0;
end
end

sum(X,1)' >= W(:); % job completion, X'*1 >= W (W(:) forces a column vector)

cvx_end
theta = X./(s*ones(1,n));

figure;
bar((s*ones(1,n)).*theta,1,'stacked');
xlabel('t');
ylabel('st');

The optimal total energy is E = 162.125, which is obtained by allocating speeds according to the
plot shown in figure 18.

16.6 AC power flow analysis via convex optimization. This problem concerns an AC (alternating current)
power system consisting of $m$ transmission lines that connect $n$ nodes. We describe the topology
by the node-edge incidence matrix $A \in \mathbf{R}^{n \times m}$, where

\[
A_{ij} = \left\{ \begin{array}{ll}
+1 & \mbox{line $j$ leaves node $i$} \\
-1 & \mbox{line $j$ enters node $i$} \\
0 & \mbox{otherwise.}
\end{array} \right.
\]

The power flow on line $j$ is $p_j$ (with positive meaning in the direction of the line as defined in $A$,
negative meaning power flow in the opposite direction).
Node $i$ has voltage phase angle $\phi_i$, and external power input $s_i$. (If a generator is attached to node
$i$ we have $s_i > 0$; if a load is attached we have $s_i < 0$; if the node has neither, $s_i = 0$.) Neglecting
power losses in the lines, and assuming power is conserved at each node, we have $Ap = s$. (We must
have $\mathbf{1}^T s = 0$, which means that the total power pumped into the network by generators balances
the total power pulled out by the loads.)


Figure 18: Optimal speed allocation profile. The top of the plot gives the processor speed. The
colored regions indicate the portion of the speed allocated to a particular job at each time.

The line power flows are a nonlinear function of the difference of the phase angles at the nodes they
connect to:
\[
p_j = \kappa_j \sin(\phi_k - \phi_l),
\]
where line $j$ goes from node $k$ to node $l$. Here $\kappa_j$ is a known positive constant (related to the
inductance of the line). We can write this in matrix form as $p = \mathbf{diag}(\kappa) \sin(A^T \phi)$, where $\sin$ is
applied elementwise.
The DC power flow equations are

\[
Ap = s, \qquad p = \mathbf{diag}(\kappa) \sin(A^T \phi).
\]

In the power analysis problem, we are given $s$, and want to find $p$ and $\phi$ that satisfy these equations.
We are interested in solutions with voltage phase angle differences that are smaller than $90^\circ$.
(Under normal conditions, real power lines are never operated with voltage phase angle differences
more than $20^\circ$ or so.)
You will show that the DC power flow equations can be solved by solving the convex optimization
problem
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{j=1}^m \psi_j(p_j) \\
\mbox{subject to} & Ap = s,
\end{array}
\]
with variable $p$, where
\[
\psi_j(u) = \int_0^u \sin^{-1}(v/\kappa_j) \, dv = u \sin^{-1}(u/\kappa_j) + \kappa_j \left( \sqrt{1 - (u/\kappa_j)^2} - 1 \right),
\]

with domain $\mathbf{dom}\, \psi_j = (-\kappa_j, \kappa_j)$. (The second expression will be useless in this problem.)
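The closed-form expression for the integral of the inverse sine can be checked against a simple numerical quadrature. The Python sketch below does this; the function names, the line constant, and the grid size are our own choices for illustration.

```python
import numpy as np

def psi_closed_form(u, kappa):
    """u*asin(u/kappa) + kappa*(sqrt(1 - (u/kappa)^2) - 1), for |u| < kappa."""
    return u * np.arcsin(u / kappa) + kappa * (np.sqrt(1 - (u / kappa) ** 2) - 1)

def psi_quadrature(u, kappa, num=20001):
    """Composite trapezoid approximation of the integral of asin(v/kappa) over [0, u]."""
    v = np.linspace(0.0, u, num)
    y = np.arcsin(v / kappa)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(v)) / 2.0)

kappa = 2.0  # hypothetical line constant
for u in (1.2, -0.7, 0.0):
    print(u, psi_closed_form(u, kappa), psi_quadrature(u, kappa))
```

Differentiating the closed form gives $\sin^{-1}(u/\kappa_j)$ again, since the two terms involving the square root cancel.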

(a) Show that the problem above is convex.
(b) Suppose the problem above has solution $p^\star$, with optimal dual variable $\nu^\star$ associated with the
equality constraint $Ap = s$. Show that $p^\star$, $\phi = \nu^\star$ solves the DC power flow equation. Hint.
Write out the optimality conditions for the problem above.

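Solution. We have $\psi_j'(u) = \sin^{-1}(u/\kappa_j)$, which is a strictly increasing function of $u$, so $\psi_j$ is
strictly convex. The objective is therefore convex, and the equality constraint is affine, so the
problem is convex.
The optimality conditions for the problem above are

\[
Ap^\star = s, \qquad \nabla \psi(p^\star) - A^T \nu^\star = 0,
\]

where $\psi(p) = \sum_{j=1}^m \psi_j(p_j)$. The second equation (dual feasibility) can be written as

\[
\psi_j'(p^\star_j) = a_j^T \nu^\star, \quad j = 1, \ldots, m,
\]

where $a_j$ is the $j$th column of $A$. Thus we have

\[
\sin^{-1}(p^\star_j / \kappa_j) = a_j^T \nu^\star, \quad j = 1, \ldots, m,
\]

and so
\[
p^\star_j = \kappa_j \sin(a_j^T \nu^\star), \quad j = 1, \ldots, m.
\]
But this is exactly the same as $p^\star = \mathbf{diag}(\kappa) \sin(A^T \phi)$, with $\phi = \nu^\star$.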

16.7 Power transmission with losses. A power transmission grid is modeled as a set of $n$ nodes and
$m$ directed edges (which represent transmission lines), with topology described by the node-edge
incidence matrix $A \in \mathbf{R}^{n \times m}$, defined by

\[
A_{ij} = \left\{ \begin{array}{ll}
+1 & \mbox{edge $j$ enters node $i$,} \\
-1 & \mbox{edge $j$ leaves node $i$,} \\
0 & \mbox{otherwise.}
\end{array} \right.
\]

We let $p_j^{\mathrm{in}} \geq 0$ denote the power that flows into the tail of edge $j$, and $p_j^{\mathrm{out}} \geq 0$ the power that
emerges from the head of edge $j$, for $j = 1, \ldots, m$. Due to transmission losses, the power that flows
into each edge is more than the power that emerges:

\[
p_j^{\mathrm{in}} = p_j^{\mathrm{out}} + \alpha (L_j / R_j^2)(p_j^{\mathrm{out}})^2, \quad j = 1, \ldots, m,
\]

where $L_j > 0$ is the length of transmission line $j$, $R_j > 0$ is the radius of the conductors on line
$j$, and $\alpha > 0$ is a constant. (The second term on the righthand side above is the transmission line
power loss.) In addition, each edge has a maximum allowed input power, that also depends on the
conductor radius: $p_j^{\mathrm{in}} \leq \sigma R_j^2$, $j = 1, \ldots, m$, where $\sigma > 0$ is a constant.
Generators are attached to nodes $i = 1, \ldots, k$, and loads are attached to nodes $i = k + 1, \ldots, n$.
We let $g_i$ denote the (nonnegative) power injected into node $i$ by its generator, for $i = 1, \ldots, k$. We
let $l_i$ denote the (nonnegative) power pulled from node $i$ by the load, for $i = k + 1, \ldots, n$. These
load powers are known and fixed.

We must have power balance at each node. For $i = 1, \ldots, k$, the sum of all power entering the node
from incoming transmission lines, plus the power supplied by the generator, must equal the sum of
all power leaving the node on outgoing transmission lines:
\[
\sum_{j \in \mathcal{E}(i)} p_j^{\mathrm{out}} + g_i = \sum_{j \in \mathcal{L}(i)} p_j^{\mathrm{in}}, \quad i = 1, \ldots, k,
\]

where $\mathcal{E}(i)$ ($\mathcal{L}(i)$) is the set of edge indices for edges entering (leaving) node $i$. For the load nodes
$i = k + 1, \ldots, n$ we have a similar power balance condition:
\[
\sum_{j \in \mathcal{E}(i)} p_j^{\mathrm{out}} = \sum_{j \in \mathcal{L}(i)} p_j^{\mathrm{in}} + l_i, \quad i = k + 1, \ldots, n.
\]

Each generator can vary its power $g_i$ over a given range $[0, G_i^{\max}]$, and has an associated cost of
generation $\phi_i(g_i)$, where $\phi_i$ is convex and strictly increasing, for $i = 1, \ldots, k$.

(a) Minimum total cost of generation. Formulate the problem of choosing generator and edge input
and output powers, so as to minimize the total cost of generation, as a convex optimization
problem. (All other quantities described above are known.) Be sure to explain any additional
variables or terms you introduce, and to justify any transformations you make.
Hint: You may find the matrices A+ = (A)+ and A = (A)+ helpful in expressing the power
balance constraints.
(b) Marginal cost of power at load nodes. The (marginal) cost of power at node i, for i = k +
1, . . . , n, is the partial derivative of the minimum total power generation cost, with respect to
varying the load power li . (We will simply assume these partial derivatives exist.) Explain
how to find the marginal cost of power at node i, from your formulation in part (a).
(c) Optimal sizing of lines. Now suppose that you can optimize over generator powers, edge input
and output powers (as above), and the power line radii $R_j$, $j = 1, \ldots, m$. These must lie
between given limits, $R_j \in [R_j^{\min}, R_j^{\max}]$ ($R_j^{\min} > 0$), and we must respect a total volume
constraint on the lines,
\[
\sum_{j=1}^m L_j R_j^2 \le V^{\max}.
\]

Formulate the problem of choosing generator and edge input and output powers, as well as
power line radii, so as to minimize the total cost of generation, as a convex optimization
problem. (Again, explain anything that is not obvious.)
(d) Numerical example. Using the data given in ptrans_loss_data.m, find the minimum total
generation cost and the marginal cost of power at nodes k + 1, . . . , n, for the case described
in parts (a) and (b) (i.e., using the fixed given radii Rj ), and also for the case described in
part (c), where you are allowed to change the transmission line radii, keeping the same total
volume as the original lines. For the generator costs, use the quadratic functions

\[
\phi_i(g_i) = a_i g_i + b_i g_i^2, \quad i = 1, \ldots, k,
\]
where $a, b \in \mathbf{R}^k_+$. (These are given in the data file.)

Remark: In the m-file, we give you a load vector $l \in \mathbf{R}^{n-k}$. For consistency, the $i$th entry of
this vector corresponds to the load at node $k + i$.

Solution.

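(a) The problem as stated,
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^k \phi_i(g_i) \\
\mbox{subject to} & p_j^{\rm in} = p_j^{\rm out} + \alpha (L_j/R_j^2)(p_j^{\rm out})^2, \quad j = 1, \ldots, m \\
& 0 \le p_j^{\rm in} \le \sigma R_j^2, \quad j = 1, \ldots, m \\
& 0 \preceq p^{\rm out} \\
& 0 \preceq g \preceq G^{\max} \\
& (A_+ p^{\rm out})_i + g_i = (A_- p^{\rm in})_i, \quad i = 1, \ldots, k \\
& (A_+ p^{\rm out})_i = (A_- p^{\rm in})_i + l_i, \quad i = k+1, \ldots, n,
\end{array}
\]
with variables $p^{\rm in}$, $p^{\rm out}$, and $g$, is not convex, since the power line equations are not affine.
However, the power line loss equations can be relaxed to inequalities,
\[
p_j^{\rm in} \ge p_j^{\rm out} + \alpha (L_j/R_j^2)(p_j^{\rm out})^2, \quad j = 1, \ldots, m,
\]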

which are convex, and we can show that they must be tight at the optimal point. First we
argue intuitively. These inequalities allow us the option of putting more energy into a power
line than is needed; in other words, they allow us to throw away energy. Energy is valuable, so
we'd be foolish to do this. That's not really an argument, though. A more formal argument
goes like this. Suppose that the line loss inequality for line $j$ holds strictly. Then we can
reduce the power input to line $j$ without affecting its power output. Now trace a directed
path back to a generator. We move along the path from the node that feeds line $j$ back to
the generator. Along each line, we can reduce the input and output power slightly, while
maintaining power balance. Finally, we reduce the generator power as well, which reduces our
cost function, since the $\phi_i$ are strictly increasing. Thus, we have constructed a new feasible point
with lower cost, which means that the powers were not optimal.
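In summary, we solve the problem by solving the convex problem
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^k \phi_i(g_i) \\
\mbox{subject to} & p_j^{\rm in} \ge p_j^{\rm out} + \alpha (L_j/R_j^2)(p_j^{\rm out})^2, \quad j = 1, \ldots, m \\
& 0 \le p_j^{\rm in} \le \sigma R_j^2, \quad j = 1, \ldots, m \\
& 0 \preceq p^{\rm out} \\
& 0 \preceq g \preceq G^{\max} \\
& (A_+ p^{\rm out})_i + g_i = (A_- p^{\rm in})_i, \quad i = 1, \ldots, k \\
& (A_+ p^{\rm out})_i = (A_- p^{\rm in})_i + l_i, \quad i = k+1, \ldots, n.
\end{array}
\]
At the optimal point, the power line inequalities here will be tight, i.e., they will hold with
equality.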
(b) The marginal cost of power load at node $i$ is exactly equal to the negative of the Lagrange
multiplier associated with the constraint
\[
(A_+ p^{\rm out})_i = (A_- p^{\rm in})_i + l_i,
\]
since perturbing this equality constraint is the same as perturbing $l_i$, the load power.

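(c) The key insight is that the problem is convex in the cross-sectional area, $s_j = R_j^2$ (neglecting
the constant $\pi$), but not in $R_j$. With the variable $s$ instead of $R$, our optimization problem
becomes
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{i=1}^k \phi_i(g_i) \\
\mbox{subject to} & p_j^{\rm in} \ge p_j^{\rm out} + \alpha L_j (p_j^{\rm out})^2/s_j, \quad j = 1, \ldots, m \\
& (R_j^{\min})^2 \le s_j \le (R_j^{\max})^2, \quad j = 1, \ldots, m \\
& 0 \le p_j^{\rm in} \le \sigma s_j, \quad j = 1, \ldots, m \\
& 0 \preceq p^{\rm out} \\
& 0 \preceq g \preceq G^{\max} \\
& (A_+ p^{\rm out})_i + g_i = (A_- p^{\rm in})_i, \quad i = 1, \ldots, k \\
& (A_+ p^{\rm out})_i = (A_- p^{\rm in})_i + l_i, \quad i = k+1, \ldots, n \\
& L^T s \le V^{\max},
\end{array}
\]
with the $s \ge 0$ constraint implicit from the first set of constraints. This is a convex problem
in the variables $p^{\rm in}$, $p^{\rm out}$, $g$, and $s$. We recover the optimal $R_j$ via $R_j^\star = \sqrt{s_j^\star}$.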
(d) We find that the optimal cost of power generation for the unoptimized network is 113.96 and
the optimal cost of power generation for the optimized network is 99.82. We also find that
the marginal costs of load power for the unoptimized network are
marginalUnif =

14.4256
15.7128
14.3973
12.7591
11.8504
15.3929
11.4307
14.3772,
while the network optimized marginal costs are
marginalOpt =

7.1644
12.6325
6.5118
11.7082
5.8231
13.0379
5.8685
12.3710.
Lastly, the unoptimized radii R vs the optimized radii Ropt are
[R Ropt] =

1.0000 1.4731
1.0000 0.5000

1.0000 1.3480
1.0000 0.7491
1.0000 0.5000
1.0000 0.5000
1.0000 0.5000
1.0000 0.5000
1.0000 0.5000
1.0000 1.2073
1.0000 0.5000
1.0000 1.3503
1.0000 1.3478
1.0000 0.5000
1.0000 1.5000
1.0000 1.3828
1.0000 0.5000
1.0000 0.5000
1.0000 1.5000.
The following code solves the problem.
% optimal power transmission with losses

ptrans_loss_data;

Ap = max(A,0);
Am = max(-A,0);

% parts a and b
cvx_begin
variables p_in(m) p_out(m) g(k);
dual variable lambda;
minimize (a'*g + sum(b.*(g.^2)))
subject to
p_in <= sigma*R.^2;
p_in >= 0;
p_out >= 0;
g <= Gmax;
g >= 0;
for j = 1:m
p_in(j) >= p_out(j) + alpha*L(j)/(R(j)^2)*p_out(j).^2;
end
Ap(1:k,:)*p_out + g == Am(1:k,:)*p_in;
lambda : Ap(k+1:n,:)*p_out == Am(k+1:n,:)*p_in + l;
cvx_end

pinUnif = p_in;
poutUnif = p_out;

gUnif = g;
valUnif = cvx_optval;
marginalUnif = lambda;

% part c
cvx_begin
variables p_in(m) p_out(m) g(k) s(m);
dual variable lambda;
minimize (a'*g + sum(b.*(g.^2)))
subject to
p_in <= sigma*s;
p_in >= 0;
p_out >= 0;
g <= Gmax;
g >= 0;
s <= Rmax.^2;
s >= Rmin.^2;
L'*s <= Vmax;
for j = 1:m
p_in(j) >= p_out(j) + alpha*L(j)*quad_over_lin(p_out(j),s(j));
end
Ap(1:k,:)*p_out + g == Am(1:k,:)*p_in;
lambda : Ap(k+1:n,:)*p_out == Am(k+1:n,:)*p_in + l;
cvx_end

pinOpt = p_in;
poutOpt = p_out;
gOpt = g;
valOpt = cvx_optval;
marginalOpt = lambda;
Ropt = sqrt(s);
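
As a small sanity check of the power balance constraints used above, the following Python sketch builds $A_+$ and $A_-$ for a hypothetical two-node network with a single line and verifies that the generator must supply the load plus the line loss. The incidence convention (+1 for an edge entering a node, -1 for an edge leaving it), the helper names, and all numbers are assumptions made for illustration; they are not taken from ptrans_loss_data.m.

```python
# Hypothetical toy network: node 1 has a generator, node 2 has a load,
# and a single line (edge 1) runs from node 1 to node 2.
# Assumed convention: A[i][j] = +1 if edge j enters node i, -1 if it leaves.
A = [[-1.0],
     [1.0]]


def pos_part(M):
    """Elementwise nonnegative part (M)_+ of a matrix stored as nested lists."""
    return [[max(v, 0.0) for v in row] for row in M]


def matvec(M, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


A_plus = pos_part(A)                                  # selects entering edges
A_minus = pos_part([[-v for v in row] for row in A])  # selects leaving edges

alpha, L1, R1 = 0.05, 2.0, 1.0   # made-up loss coefficient, length, radius
load = 3.0                       # made-up load at node 2

# Balance at the load node forces p_out = load; the (tight) loss equation
# then gives the line input power.
p_out = [load]
p_in = [p_out[0] + alpha * (L1 / R1**2) * p_out[0] ** 2]

# Balance at the generator node: (A_+ p_out)_1 + g = (A_- p_in)_1.
g = matvec(A_minus, p_in)[0] - matvec(A_plus, p_out)[0]

# Residual of the load-node balance (A_+ p_out)_2 = (A_- p_in)_2 + load.
load_residual = matvec(A_plus, p_out)[1] - matvec(A_minus, p_in)[1] - load

print("generator power:", g)
print("load-node residual:", load_residual)
```

In this toy case the generator power equals the load plus the quadratic loss term, which is the behavior the tightness argument in part (a) predicts.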

16.8 Utility/power trade-off in a wireless network. In this problem we explore the trade-off between
total utility and total power usage in a wireless network in which the link transmit powers can
be adjusted. The network consists of a set of nodes and a set of links over which data can be
transmitted. There are $n$ routes, each corresponding to a sequence of links from a source to a
destination node. Route $j$ has a data flow rate $f_j \in \mathbf{R}_+$ (in units of bits/sec, say). The total utility
(which we want to maximize) is
\[
U(f) = \sum_{j=1}^n U_j(f_j),
\]
where $U_j : \mathbf{R} \to \mathbf{R}$ are concave increasing functions.


The network topology is specified by the routing matrix $R \in \mathbf{R}^{m \times n}$, defined as
\[
R_{ij} = \left\{ \begin{array}{ll} 1 & \mbox{route $j$ passes over link $i$} \\ 0 & \mbox{otherwise.} \end{array} \right.
\]
The total traffic on a link is the sum of the flows that pass over the link. The traffic (vector) is thus
$t = Rf \in \mathbf{R}^m$. The traffic on each link cannot exceed the capacity of the link, i.e., $t \preceq c$, where
$c \in \mathbf{R}^m_+$ is the vector of link capacities.
The link capacities, in turn, are functions of the link transmit powers, given by $p \in \mathbf{R}^m_+$, which
cannot exceed given limits, i.e., $p \preceq p^{\max}$. These are related by
\[
c_i = \alpha_i \log(1 + \beta_i p_i),
\]
where $\alpha_i$ and $\beta_i$ are positive parameters that characterize link $i$. The second objective (which we
want to minimize) is $P = \mathbf{1}^T p$, the total (transmit) power.

(a) Explain how to find the optimal trade-off curve of total utility and total power, using convex
or quasiconvex optimization.

(b) Plot the optimal trade-off curve for the problem instance with $m = 20$, $n = 10$, $U_j(x) = \sqrt{x}$
for $j = 1, \ldots, n$, $p_i^{\max} = 10$, $\alpha_i = \beta_i = 1$ for $i = 1, \ldots, m$, and network topology generated
using
rand('seed',3);
R = round(rand(m,n));
Your plot should have the total power on the horizontal axis.

Solution.

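(a) We can formulate the problem as follows:
\[
\begin{array}{ll}
\mbox{maximize} & \sum_{j=1}^n U_j(f_j) - \lambda \mathbf{1}^T p \\
\mbox{subject to} & t = Rf, \quad p \preceq p^{\max}, \\
& t_i \le \alpha_i \log(1 + \beta_i p_i), \quad i = 1, \ldots, m,
\end{array}
\]
with variables $t, p \in \mathbf{R}^m$, $f \in \mathbf{R}^n$, and tradeoff parameter $\lambda$. The objective (to be
maximized) is concave and the constraints are convex. To compute the optimal trade-off curve, we
solve the optimization problem over instances with $0 < \lambda < \infty$.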
(b) The following Matlab script solves the problem and plots the tradeoff curve.

clear all, close all;


% generate data
m = 20; n = 10;
pmax = 10;
alpha = 1; beta = 1;
rand('seed', 3);
R = round(rand(m,n));

% tradeoff curve
utility = [];
power = [];
num_evals = 50;

for lambda=logspace(-6,3,num_evals)
% compute optimal powers
cvx_begin;
variables t(m) p(m) f(n);
maximize( sum(sqrt(f)) - lambda*sum(p) );
subject to
t == R*f;
p <= pmax;
t <= alpha * log(1 + beta*p);
cvx_end;
utility = [utility, sum(sqrt(f))];
power = [power, sum(p)];
end

% plot results
plot(power, utility);
xlabel('Total power');
ylabel('Total utility');
grid on;
print -depsc util_pwr_tradeoff

[Figure: optimal trade-off curve, total utility (vertical axis) versus total power (horizontal axis).]
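
To illustrate the scalarization idea on something that can be checked by hand, here is a minimal Python sketch for a hypothetical network with a single link and a single route, so that $f = t \le \alpha \log(1 + \beta p)$. It replaces the convex solver by a brute-force search over a grid of powers, which is an assumption made purely to keep the example self-contained; the function name and grid size are made up.

```python
import math


def best_point(lam, pmax=10.0, alpha=1.0, beta=1.0, grid=2000):
    """Maximize sqrt(f) - lam*p over 0 <= p <= pmax, with f = alpha*log(1 + beta*p),
    by brute-force search over a uniform grid of powers."""
    best = None
    for i in range(grid + 1):
        p = pmax * i / grid
        # The utility is increasing, so the flow uses the full link capacity.
        f = alpha * math.log(1.0 + beta * p)
        val = math.sqrt(f) - lam * p
        if best is None or val > best[0]:
            best = (val, p, math.sqrt(f))
    return best[1], best[2]  # (total power, total utility)


lams = [10.0 ** e for e in (-3, -2, -1, 0, 1)]
curve = [best_point(lam) for lam in lams]
for lam, (p, util) in zip(lams, curve):
    print(f"lambda = {lam:g}: power = {p:.3f}, utility = {util:.3f}")
```

As the weight on power grows, both the power used and the utility achieved decrease, tracing out points on the trade-off curve from right to left.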

16.9 Energy storage trade-offs. We consider the use of a storage device (say, a battery) to reduce the
total cost of electricity consumed over one day. We divide the day into $T$ time periods, and let
$p_t$ denote the (positive, time-varying) electricity price, and $u_t$ denote the (nonnegative) usage or
consumption, in period $t$, for $t = 1, \ldots, T$. Without the use of a battery, the total cost is $p^T u$.
Let $q_t$ denote the (nonnegative) energy stored in the battery in period $t$. For simplicity, we neglect
energy loss (although this is easily handled as well), so we have $q_{t+1} = q_t + c_t$, $t = 1, \ldots, T-1$,
where $c_t$ is the charging of the battery in period $t$; $c_t < 0$ means the battery is discharged. We will
require that $q_1 = q_T + c_T$, i.e., we finish with the same battery charge that we start with. With
the battery operating, the net consumption in period $t$ is $u_t + c_t$; we require this to be nonnegative
(i.e., we do not pump power back into the grid). The total cost is then $p^T(u + c)$.
The battery is characterized by three parameters: the capacity $Q$, where $q_t \le Q$; the maximum
charge rate $C$, where $c_t \le C$; and the maximum discharge rate $D$, where $c_t \ge -D$. (The parameters
$Q$, $C$, and $D$ are nonnegative.)

(a) Explain how to find the charging profile $c \in \mathbf{R}^T$ (and associated stored energy profile $q \in \mathbf{R}^T$)
that minimizes the total cost, subject to the constraints.
(b) Solve the problem instance with data p and u given in storage_tradeoff_data.*, Q = 35,
and C = D = 3. Plot ut , pt , ct , and qt versus t.
(c) Storage trade-offs. Plot the minimum total cost versus the storage capacity Q, using p and
u from storage_tradeoff_data.*, and charge/discharge limits C = D = 3. Repeat for
charge/discharge limits C = D = 1. (Put these two trade-off curves on the same plot.) Give
an interpretation of the endpoints of the trade-off curves.

Solution.

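(a) The problem is an LP,
\[
\begin{array}{ll}
\mbox{minimize} & p^T(u + c) \\
\mbox{subject to} & -D\mathbf{1} \preceq c \preceq C\mathbf{1}, \quad u + c \succeq 0, \quad 0 \preceq q \preceq Q\mathbf{1} \\
& q_{t+1} = q_t + c_t, \quad t = 1, \ldots, T-1 \\
& q_1 = q_T + c_T
\end{array}
\]
with variables $c, q \in \mathbf{R}^T$.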
(b) The code for the problem is given in part (c). The results are shown below.

[Figure: three stacked plots versus t: u and c (top), pt (middle), and qt (bottom).]

(c) We vary $Q$ from 0 to 150 and solve the LP for the two cases of high charge/discharge
limit and low charge/discharge limit.
The following Matlab code solves the problem.
clear all; close all; storage_tradeoff_data;
Q = 35;
C = 3; D = 3;
cvx_quiet(true);
cvx_begin
variables q(T,1) c(T,1);
minimize(p'*(u+c));
subject to
c >= -D; c<= C;
q >= 0; q <= Q;
q(2:T) == q(1:T-1) + c(1:T-1);
q(1) == q(T) + c(T);
u+c >= 0;
cvx_end

figure;
ts = (1:T)/4;
subplot(3,1,1);
plot(ts,u,'r'); hold on
plot(ts,c,'b');
legend('u','c');
xlabel('t');
ylabel('uc');
subplot(3,1,2);
plot(ts, p);
ylabel('pt');
xlabel('t');
subplot(3,1,3);
plot(ts,q);
ylabel('qt');
xlabel('t');
print -depsc storage_tradeoff_time_trace.eps

%% Plot the trade-off curves


N = 31; Qs = linspace(0, 150,N);
C = 1; D = 1;
cvx_quiet(true);
for i=1:N
Q = Qs(i);
cvx_begin
variables q(T,1) c(T,1);
minimize(p'*(u+c));
subject to
c >= -D; c <= C;
q >= 0; q <= Q;
q(2:T) == q(1:T-1) + c(1:T-1);
q(1) == q(T) + c(T);
u + c >= 0;
cvx_end
qStore1(:,i) = q;
cStore1(:,i) = c;
cost1(i) = cvx_optval;
end
figure;
plot(Qs,cost1,'b.-');
hold on

C = 3; D = 3;

for i=1:N
Q = Qs(i);
cvx_begin
variables q(T,1) c(T,1);
minimize(p'*(u+c));
subject to
c >= -D; c <= C;
q >= 0; q <= Q;
q(2:T) == q(1:T-1) + c(1:T-1);
q(1) == q(T) + c(T);
u + c >= 0;
cvx_end
qStore2(:,i) = q;
cStore2(:,i) = c;
cost2(i) = cvx_optval;
end
plot(Qs,cost2,'g--');
xlabel('Q');
ylabel('cost');
print -depsc storage_tradeoff_curve.eps

The following Python code solves the problem.


import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)

T = 96
t = np.linspace(1, T, num=T).reshape(T,1)
p = np.exp(-np.cos((t-15)*2*np.pi/T)+0.01*np.random.randn(T,1))
u = 2*np.exp(-0.6*np.cos((t+40)*np.pi/T) - \
    0.7*np.cos(t*4*np.pi/T)+0.01*np.random.randn(T,1))

plt.figure(1)
plt.plot(t/4, p);
plt.plot(t/4, u, 'r');

Q = 35
C, D = 3, 3

# Variable shapes are passed as tuples and @ is used for the
# matrix product (CVXPY 1.x syntax).
q = cvx.Variable((T,1))
c = cvx.Variable((T,1))
obj = p.T @ (u+c)
cons = [c >= -D]
cons += [c <= C]
cons += [q >= 0]
cons += [q <= Q]
cons += [q[1:] == q[:T-1] + c[:T-1]]
cons += [q[0] == q[T-1] + c[T-1]]
cons += [u+c >= 0]
pstar = cvx.Problem(cvx.Minimize(obj), cons).solve()

plt.figure(2)
ts = np.linspace(1, T, num=T).reshape(T,1)/4
plt.subplot(3,1,1)
plt.plot(ts, u, 'r');
plt.plot(ts, c.value, 'b');
plt.xlabel('t')
plt.ylabel('uc')
plt.legend(['u', 'c'])
plt.subplot(3,1,2)
plt.plot(ts, p, 'b');
plt.xlabel('t')
plt.ylabel('pt')
plt.subplot(3,1,3)
plt.plot(ts, q.value, 'b');
plt.xlabel('t')
plt.ylabel('qt')
plt.ylim((0, 40))
plt.savefig('storage_tradeoff_time_trace.eps')

# Plot the tradeoff curves


N = 31
Qs = np.linspace(0, 150, num=N).reshape(N,1)
C, D = 1, 1
cost1 = np.zeros((N,1))
cost2 = np.zeros((N,1))

for i in range(N):
    Q = Qs[i, 0]
    q = cvx.Variable((T,1))
    c = cvx.Variable((T,1))
    obj = p.T @ (u+c)
    cons = [c >= -D]
    cons += [c <= C]
    cons += [q >= 0]
    cons += [q <= Q]
    cons += [q[1:] == q[:T-1] + c[:T-1]]
    cons += [q[0] == q[T-1] + c[T-1]]
    cons += [u+c >= 0]
    pstar = cvx.Problem(cvx.Minimize(obj), cons).solve()
    cost1[i] = pstar

C, D = 3, 3

for i in range(N):
    Q = Qs[i, 0]
    q = cvx.Variable((T,1))
    c = cvx.Variable((T,1))
    obj = p.T @ (u+c)
    cons = [c >= -D]
    cons += [c <= C]
    cons += [q >= 0]
    cons += [q <= Q]
    cons += [q[1:] == q[:T-1] + c[:T-1]]
    cons += [q[0] == q[T-1] + c[T-1]]
    cons += [u+c >= 0]
    pstar = cvx.Problem(cvx.Minimize(obj), cons).solve()
    cost2[i] = pstar

plt.figure(3)
plt.plot(Qs, cost2, 'g--');
plt.plot(Qs, cost1, 'b.-');
plt.xlabel('Q')
plt.ylabel('cost')
plt.savefig('storage_tradeoff_curve.eps')

plt.show()
The following Julia code solves the problem.
using Convex, Gadfly
include("storage_tradeoff_data.jl")

Q = 35;
C = 3;
D = 3;

q = Variable(T);
c = Variable(T);

constraints = c >= -D;


constraints += c <= C;
constraints += q >= 0;
constraints += q <= Q;

constraints += q[2:T] == q[1:T-1] + c[1:T-1];
constraints += q[1] == q[T] + c[T];
constraints += u+c >= 0;

prob = minimize(p'*(u+c), constraints);


solve!(prob);

ts = [1:T]/4;

p1 = plot(
layer(
x = ts,
y = u,
Geom.line,
Theme(default_color = color("red"))
),
layer(
x = ts,
y = c.value,
Geom.line,
Theme(default_color = color("blue"))
),
Guide.xlabel("t"),
Guide.ylabel("u<sub>t</sub>,c<sub>t</sub>"),
Guide.manual_color_key("",
["u<sub>t</sub>", "c<sub>t</sub>"],
["red", "blue"])
);

p2 = plot(
layer(
x = ts,
y = p,
Geom.line,
Theme(default_color = color("blue"))
),
Guide.xlabel("t"),
Guide.ylabel("p<sub>t</sub>")
);

p3 = plot(
layer(
x = ts,
y = q.value,
Geom.line,

Theme(default_color = color("blue"))
),
Guide.xlabel("t"),
Guide.ylabel("q<sub>t</sub>")
);
draw(PS("storage_tradeoff_time_trace.eps", 6inch, 6inch),
vstack(p1, p2, p3))

# Plot the trade-off curves


N = 31;
Qs = [0:5:150];
C = 1;
D = 1;
cost1 = zeros(N,1);

for i = 1:N
Q = Qs[i];
q = Variable(T);
c = Variable(T);

constraints = c >= -D;


constraints += c <= C;
constraints += q >= 0;
constraints += q <= Q;
constraints += q[2:T] == q[1:T-1] + c[1:T-1];
constraints += q[1] == q[T] + c[T];
constraints += u+c >= 0;

prob = minimize(p'*(u+c), constraints);


solve!(prob);
println(prob.status)
cost1[i] = prob.optval;
end

C = 3;
D = 3;
cost2 = zeros(N,1);

for i = 1:N
Q = Qs[i];
q = Variable(T);
c = Variable(T);

constraints = c >= -D;


constraints += c <= C;

constraints += q >= 0;
constraints += q <= Q;
constraints += q[2:T] == q[1:T-1] + c[1:T-1];
constraints += q[1] == q[T] + c[T];
constraints += u+c >= 0;

prob = minimize(p'*(u+c), constraints);


solve!(prob);
println(prob.status)
cost2[i] = prob.optval;
end

p4 = plot(
layer(
x = Qs,
y = cost1,
Geom.line,
Theme(default_color = color("blue"))
),
layer(
x = Qs,
y = cost2,
Geom.point,
Theme(default_color = color("green"))
),
Guide.xlabel("Q"),
Guide.ylabel("p<sup>T</sup>(c+u)"),
Guide.manual_color_key("", ["C=1, D=1", "C=3, D=3"], ["blue", "green"])
);
display(p4);
draw(PS("storage_tradeoff_curve.eps", 6inch, 3inch), p4)

The trade-off curves are shown below, where the blue solid curve corresponds to $C = D = 1$
and the green dashed curve corresponds to $C = D = 3$. The intersection of the trade-off curves
with the vertical axis (corresponding to $Q = 0$) gives the total cost if there were no battery, $p^T u$.
On the right end of the trade-off curves, the battery capacity constraint is no longer active,
so no further reduction in total cost is obtained. The total cost reduction here is limited by
the charge/discharge limits.

[Figure: minimum total cost (vertical axis) versus storage capacity Q (horizontal axis), for C = D = 1 and C = D = 3.]
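
The constraints of the LP are easy to check directly for a candidate charging profile. The sketch below is a minimal pure-Python illustration on made-up data with $T = 4$ (the helper names `stored_energy`, `is_feasible`, and `total_cost` are hypothetical, not part of the course files). It shows a hand-picked profile that charges when the price is low and discharges when it is high, and confirms that with $Q = 0$ only the no-battery cost $p^T u$ is attainable for that data.

```python
def stored_energy(q1, c):
    """Return [q_1, ..., q_T] from q_{t+1} = q_t + c_t, t = 1, ..., T-1."""
    q = [q1]
    for ct in c[:-1]:
        q.append(q[-1] + ct)
    return q


def is_feasible(q, c, u, Q, C, D, tol=1e-9):
    """Check charge limits, capacity limits, nonnegative net usage,
    and the periodicity condition q_1 = q_T + c_T."""
    T = len(c)
    return (all(-D - tol <= ct <= C + tol for ct in c)
            and all(-tol <= qt <= Q + tol for qt in q)
            and all(ut + ct >= -tol for ut, ct in zip(u, c))
            and abs(q[0] - (q[T - 1] + c[T - 1])) <= tol)


def total_cost(p, u, c):
    """Total cost p^T (u + c)."""
    return sum(pt * (ut + ct) for pt, ut, ct in zip(p, u, c))


# Made-up data: prices are low in periods 1-2 and high in periods 3-4.
p = [1.0, 1.0, 3.0, 3.0]
u = [2.0, 2.0, 2.0, 2.0]
c = [1.0, 1.0, -1.0, -1.0]   # charge when cheap, discharge when expensive
q = stored_energy(0.0, c)

print("q =", q)
print("feasible with Q=2, C=D=1:", is_feasible(q, c, u, Q=2, C=1, D=1))
print("cost with battery:", total_cost(p, u, c))
print("cost without battery:", total_cost(p, u, [0.0] * 4))
```

This only evaluates given profiles; finding the optimal profile still requires solving the LP as in the code above.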

16.10 Cost-comfort trade-off in air conditioning. A heat pump (air conditioner) is used to cool a residence
to temperature $T_t$ in hour $t$, on a day with outside temperature $T_t^{\rm out}$, for $t = 1, \ldots, 24$. These
temperatures are given in Kelvin, and we will assume that $T_t^{\rm out} \ge T_t$.
A total amount of heat $Q_t = \alpha(T_t^{\rm out} - T_t)$ must be removed from the residence in hour $t$, where
$\alpha$ is a positive constant (related to the quality of thermal insulation).
The electrical energy required to pump out this heat is given by $E_t = Q_t/\gamma_t$, where
\[
\gamma_t = \eta \frac{T_t}{T_t^{\rm out} - T_t}
\]
is the coefficient of performance of the heat pump and $\eta \in (0, 1]$ is the efficiency constant. The
efficiency is typically around 0.6 for a modern unit; the theoretical limit is $\eta = 1$. (When $T_t = T_t^{\rm out}$,
we take $\gamma_t = \infty$ and $E_t = 0$.)
Electrical energy prices vary with the hour, and are given by $P_t > 0$ for $t = 1, \ldots, 24$. The total
energy cost is $C = \sum_t P_t E_t$. We will assume that the prices are known.
Discomfort is measured using a piecewise-linear function of temperature,
\[
D_t = (T_t - T^{\rm ideal})_+,
\]
where $T^{\rm ideal}$ is an ideal temperature, below which there is no discomfort. The total daily discomfort
is $D = \sum_{t=1}^{24} D_t$. You can assume that $T^{\rm ideal} < T_t^{\rm out}$.
To get a point on the optimal cost-comfort trade-off curve, we will minimize $C + \lambda D$, where $\lambda > 0$.
The variables to be chosen are $T_1, \ldots, T_{24}$; all other quantities described above are given.
Show that this problem has an analytical solution of the form $T_t = \psi(P_t, T_t^{\rm out})$, where $\psi : \mathbf{R}^2 \to \mathbf{R}$.
The function $\psi$ can depend on the constants $\alpha$, $\eta$, $T^{\rm ideal}$, $\lambda$. Give $\psi$ explicitly. You are free (indeed,
encouraged) to check your formula using CVX, with made up values for the constants.
Disclaimer. The focus of this course is not on deriving 19th century pencil and paper solutions to
problems. But every now and then, a practical problem will actually have an analytical solution.
This is one of them.
Solution. We use the expression for $E_t$ and the efficiency to get
\[
E_t = (\alpha/\eta) \frac{(T_t^{\rm out} - T_t)^2}{T_t},
\]
which is a convex function of $T_t$. For any practical problem we can regard the denominator as a
constant, but it's cool to note that we can handle the nonlinearity of the thermodynamic efficiency
exactly. It follows that the cost $C$, which is a positive weighted sum of the $E_t$, is convex. The discomfort
is evidently convex in $T_t$, so the composite objective $C + \lambda D$ is convex in $T_t$. So we have a convex
problem here.
The composite objective $C + \lambda D$ is separable in $T_t$, i.e., a sum of functions of $T_t$:
\[
C + \lambda D = \sum_{t=1}^{24} \left( P_t (\alpha/\eta) \frac{(T_t^{\rm out} - T_t)^2}{T_t} + \lambda (T_t - T^{\rm ideal})_+ \right).
\]
It follows that we can find each $T_t$ (separately) by minimizing
\[
P_t (\alpha/\eta) \frac{(T_t^{\rm out} - T_t)^2}{T_t} + \lambda (T_t - T^{\rm ideal})_+.
\]
The derivative of the first term is
\[
P_t (\alpha/\eta) \frac{T_t^2 - (T_t^{\rm out})^2}{T_t^2}.
\]
First assume that $T_t > T^{\rm ideal}$. Then the optimality condition is
\[
P_t (\alpha/\eta) \frac{T_t^2 - (T_t^{\rm out})^2}{T_t^2} + \lambda = 0,
\]
which gives
\[
T_t = (1 + \lambda\eta/(\alpha P_t))^{-1/2} \, T_t^{\rm out}.
\]
If $T_t < T^{\rm ideal}$, the optimality condition is
\[
P_t (\alpha/\eta) \frac{T_t^2 - (T_t^{\rm out})^2}{T_t^2} = 0,
\]
which gives $T_t = T_t^{\rm out}$, which contradicts $T_t < T^{\rm ideal}$ (since $T^{\rm ideal} < T_t^{\rm out}$). So this case
cannot happen. But we can have $T_t = T^{\rm ideal}$; this happens when
\[
(1 + \lambda\eta/(\alpha P_t))^{-1/2} \, T_t^{\rm out} \le T^{\rm ideal}.
\]
So the optimal choice of temperature is simply
\[
T_t = \psi(P_t, T_t^{\rm out}) = \max\left\{ (1 + \lambda\eta/(\alpha P_t))^{-1/2} \, T_t^{\rm out}, \; T^{\rm ideal} \right\}.
\]
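
Before turning to a solver, the closed-form expression can be checked against a brute-force search for a single hour. The following Python sketch is only an illustration: the function names, the search interval, and the grid resolution are assumptions, and the constants mirror the made-up values used in the CVX test (with one extra value of $\lambda$ so that both branches of the max are exercised).

```python
def psi(P, T_out, alpha, eta, T_ideal, lam):
    """Closed-form optimal temperature from the derivation above."""
    unconstrained = (1.0 + lam * eta / (alpha * P)) ** -0.5 * T_out
    return max(unconstrained, T_ideal)


def hourly_objective(T, P, T_out, alpha, eta, T_ideal, lam):
    """Energy cost plus weighted discomfort for one hour."""
    energy = (alpha / eta) * (T_out - T) ** 2 / T
    return P * energy + lam * max(T - T_ideal, 0.0)


def grid_minimizer(P, T_out, alpha, eta, T_ideal, lam, width=20.0, steps=20000):
    """Brute-force minimizer over a uniform grid on [T_out - width, T_out]."""
    best_T, best_val = None, float("inf")
    for i in range(steps + 1):
        T = T_out - width + width * i / steps
        val = hourly_objective(T, P, T_out, alpha, eta, T_ideal, lam)
        if val < best_val:
            best_T, best_val = T, val
    return best_T


alpha, eta = 1.8, 0.6
T_ideal, T_out, P = 25 + 273.15, 29 + 273.15, 6.0
for lam in (0.1, 1.0):
    T_formula = psi(P, T_out, alpha, eta, T_ideal, lam)
    T_grid = grid_minimizer(P, T_out, alpha, eta, T_ideal, lam)
    print(f"lambda = {lam}: formula {T_formula:.3f}, grid search {T_grid:.3f}")
```

Because the hourly objective is convex, the grid minimizer lies within one grid step of the true minimizer, so agreement to about the grid spacing is what one should expect here.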

Let's test our formula using CVX.

% Cost-comfort trade-off in air conditioning.
N = 24;
Tout = 3*sin(2*pi*(1:N)'/24-pi/2)+29; Tideal = 25;
Tout = Tout + 273.15; Tideal = Tideal + 273.15;

eta = 0.6; alpha = 1.8; lambda = 1;

price = 6*[ones(8,1);1.5*ones(9,1);ones(7,1)];
k = price*alpha/eta;

cvx_begin
    variable T(N)
    C = k'*quad_over_lin(Tout-T,T,2);
    D = sum(pos(T-Tideal));
    minimize ( C + lambda*D )
    subject to
        T >= 0; T <= Tout; % they are not necessary
cvx_end

T_analytic = max(sqrt(k./(k+lambda)).*Tout, Tideal);

C_analytic = k'*quad_over_lin(Tout-T_analytic,T_analytic,2);
display(['Total cost obtained by cvx: ' num2str(C)]);
display(['Total cost obtained analytically: ' num2str(C_analytic)]);

plot((1:N),Tout,'.-r',(1:N),T,'.-b',(1:N),Tideal*ones(N,1),'-k');
xlabel('t'); ylabel('Temperature'); legend('Tout','T','Tideal');

print -depsc air_cond.eps

[Figure: Tout, T, and Tideal (temperature, vertical axis) versus t (horizontal axis).]

The total energy cost obtained from the analytical solution is 31.838. CVX returns the same answer
(31.838, when using sdpt3; and 31.8204, when using Sedumi).
16.11 Optimal electric motor drive currents. In this problem you will design the drive current waveforms
for an AC (alternating current) electric motor. The motor has a magnetic rotor which spins with
constant angular velocity $\omega \ge 0$ inside the stationary stator. The stator contains three circuits
(called phase windings) with (vector) current waveform $i : \mathbf{R} \to \mathbf{R}^3$ and (vector) voltage waveform
$v : \mathbf{R} \to \mathbf{R}^3$, which are $2\pi$-periodic functions of the angular position $\theta$ of the rotor. The circuit
dynamics are
\[
v(\theta) = R i(\theta) + \omega L \frac{d}{d\theta} i(\theta) + \omega k(\theta),
\]
where $R \in \mathbf{S}^3_{++}$ is the resistance matrix, $L \in \mathbf{S}^3_{++}$ is the inductance matrix, and $k : \mathbf{R} \to \mathbf{R}^3$, a
$2\pi$-periodic function of $\theta$, is the back-EMF waveform (which encodes the electromagnetic coupling
between the rotor permanent magnets and the phase windings). The angular velocity $\omega$, the
matrices $R$ and $L$, and the back-EMF waveform $k$, are known.
We must have $|v_i(\theta)| \le v^{\rm supply}$, $i = 1, 2, 3$, where $v^{\rm supply}$ is the (given) supply voltage. The output
torque of the motor at rotor position $\theta$ is $\tau(\theta) = k(\theta)^T i(\theta)$. We will require the torque to have a
given constant nonnegative value: $\tau(\theta) = \tau^{\rm des}$ for all $\theta$.
The average power loss in the motor is
\[
P^{\rm loss} = \frac{1}{2\pi} \int_0^{2\pi} i(\theta)^T R i(\theta) \, d\theta.
\]
The mechanical output power is $P^{\rm out} = \omega \tau^{\rm des}$, and the motor efficiency is
\[
\eta = P^{\rm out}/(P^{\rm out} + P^{\rm loss}).
\]
The objective is to choose the current and voltage waveforms to maximize $\eta$.

Discretization. To solve this problem we consider a discretized version in which $\theta$ takes on the $N$
values $\theta = h, 2h, \ldots, Nh$, where $h = 2\pi/N$. We impose the voltage and torque constraints for these
values of $\theta$. We approximate the power loss as
\[
P^{\rm loss} = (1/N) \sum_{j=1}^N i(jh)^T R i(jh).
\]
The circuit dynamics are approximated as
\[
v(jh) = R i(jh) + \omega L \frac{i((j+1)h) - i(jh)}{h} + \omega k(jh), \quad j = 1, \ldots, N,
\]
where here we take $i((N+1)h) = i(h)$ (by periodicity).
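
The forward difference with periodic wrap-around used in the discretized dynamics can be sketched as follows. This is a minimal illustration in Python on a scalar test waveform (a sine, chosen here only because its derivative is known); the function name is hypothetical.

```python
import math


def periodic_forward_difference(samples, h):
    """Approximate d/dtheta at theta = h, 2h, ..., N*h using
    (x((j+1)h) - x(jh)) / h, with x((N+1)h) = x(h) by periodicity."""
    N = len(samples)
    return [(samples[(j + 1) % N] - samples[j]) / h for j in range(N)]


N = 360
h = 2 * math.pi / N
theta = [(j + 1) * h for j in range(N)]   # theta = h, 2h, ..., N*h
x = [math.sin(t) for t in theta]          # test waveform with known derivative
dx = periodic_forward_difference(x, h)

max_err = max(abs(d - math.cos(t)) for d, t in zip(dx, theta))
print("max error vs. exact derivative:", max_err)
```

The index arithmetic `(j + 1) % N` plays the same role as the column permutation `[2:N, 1]` in a matrix-based implementation: the last sample is differenced against the first.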
Find optimal (discretized) current and voltage waveforms for the problem instance with data given
in ac_motor_data.m. The back-EMF waveform is given as a $3 \times N$ matrix K. Plot the three current
waveform components on one plot, and the three voltage waveforms on another. Give the efficiency
obtained.
Solution. We first note that all of the constraints are convex as stated. To maximize the efficiency,
we minimize the power loss, which is a sum of convex quadratic terms. The following code will
solve the problem.

% optimal electric motor drive currents


ac_motor_data;

Rhalf = chol(R);
cvx_begin
variables I(3, N) V(3, N)
minimize (norm(Rhalf*I, 'fro')) % sqrt(N*Ploss)
subject to
V == R*I + omega*K + omega*L*(I(:, [2:N, 1])-I)/h; % dynamics
tau_des == sum(K.*I); % torque constraint
abs(V) <= V_supply; % voltage limits
cvx_end

Ploss = cvx_optval^2/N;
Pout = omega*tau_des;
eta = Pout/(Pout+Ploss);

fprintf('Maximum efficiency: %f\n', eta);

% plot
subplot(2, 1, 1);

plot(I(:, 1:N)');
xlabel('theta'); ylabel('i(theta)');
axis([0, 360, -100, 100]);

subplot(2, 1, 2);
plot(V');
xlabel('theta'); ylabel('v(theta)');
axis([0, 360, -750, 750]);

print -depsc ac_motor.eps;

The maximum efficiency is 90.2%. The optimal current and voltage waveforms are shown below.

[Figure: optimal current waveforms i(theta) (top) and voltage waveforms v(theta) (bottom) versus theta.]

17 Miscellaneous applications
17.1 Earth mover's distance. In this exercise we explore a general method for constructing a distance
between two probability distributions on a finite set, called the earth mover's distance, Wasserstein
metric, or Dubroshkin metric. Let $x$ and $y$ be two probability distributions on $\{1, \ldots, n\}$, i.e.,
$\mathbf{1}^T x = \mathbf{1}^T y = 1$, $x \succeq 0$, $y \succeq 0$. We imagine that $x_i$ is the amount of earth stored at location $i$;
our goal is to move the earth between locations to obtain the distribution given by $y$. Let $C_{ij}$ be
the cost of moving one unit of earth from location $j$ to location $i$. We assume that $C_{ii} = 0$, and
$C_{ij} = C_{ji} > 0$ for $i \ne j$. (We allow $C_{ij} = \infty$, which means that earth cannot be moved directly from
node $j$ to node $i$.) Let $S_{ij} \ge 0$ denote the amount of earth moved from location $j$ to location $i$. The
total cost is $\sum_{i,j=1}^n S_{ij} C_{ij} = \mathbf{tr}\, C^T S$. The shipment matrix $S$ must satisfy the balance equations,
\[
\sum_{j=1}^n S_{ij} = y_i, \quad i = 1, \ldots, n, \qquad \sum_{i=1}^n S_{ij} = x_j, \quad j = 1, \ldots, n,
\]
which we can write compactly as $S\mathbf{1} = y$, $S^T\mathbf{1} = x$. (The first equation states that the total
amount shipped into location $i$ equals $y_i$; the second equation states that the total shipped out
from location $j$ is $x_j$.) The earth mover's distance between $x$ and $y$, denoted $d(x, y)$, is given by the
minimal cost of earth moving required to transform $x$ to $y$, i.e., the optimal value of the problem
\[
\begin{array}{ll}
\mbox{minimize} & \mathbf{tr}\, C^T S \\
\mbox{subject to} & S_{ij} \ge 0, \quad i, j = 1, \ldots, n \\
& S\mathbf{1} = y, \quad S^T\mathbf{1} = x,
\end{array}
\]
with variables $S \in \mathbf{R}^{n \times n}$.


We can also give a probability interpretation of $d(x, y)$. Consider a random variable $Z$ on $\{1, \ldots, n\}^2$
with values $C_{ij}$. We seek the joint distribution $S$ that minimizes the expected value of the random
variable $Z$, with given marginals $x$ and $y$.
The earth mover's distance is used to compare, for example, 2D images, with $C_{ij}$ equal to the
distance between pixels $i$ and $j$. If $x$ and $y$ represent two photographs of the same scene, from
slightly different viewpoints and with an offset in camera position (say), $d(x, y)$ will be small, but
the distance between $x$ and $y$ measured by most common norms (e.g., $\|x - y\|_1$) will be large.

(a) Show that $d$ satisfies the following. Symmetry: $d(x, y) = d(y, x)$; Nonnegativity: $d(x, y) \ge 0$;
Definiteness: $d(x, x) = 0$, and $d(x, y) > 0$ for $x \ne y$.
(b) Show that $d(x, y)$ is the optimal value of the problem
\[
\begin{array}{ll}
\mbox{maximize} & \nu^T x + \mu^T y \\
\mbox{subject to} & \nu_i + \mu_j \le C_{ij}, \quad i, j = 1, \ldots, n,
\end{array}
\]
with variables $\nu, \mu \in \mathbf{R}^n$.

Solution.

(a) First let's show that $d(x, y) \ge 0$. $S$ and $C$ are both elementwise nonnegative, so $\mathbf{tr}\, C^T S \ge 0$.
So the objective is always nonnegative.
Let's now show that $d(x, y) = 0$ when $x = y$. To do this, we observe that $S = \mathbf{diag}(x)$ is
feasible; the objective is then $\mathbf{tr}\, C^T S = 0$, since the diagonal of $C$ is zero.
We have $d(x, y) > 0$ when $x \ne y$. If $x \ne y$, then $S\mathbf{1} \ne S^T\mathbf{1}$, so $S$ cannot be symmetric. A
necessary condition for this is that $S$ have some nonzero off-diagonal elements. Since $S_{ij} \ge 0$,
$S$ must have some positive off-diagonal elements. We are given that $C_{ij} > 0$ for $i \ne j$. So, any
feasible $S$ for $x \ne y$ must satisfy $\mathbf{tr}\, C^T S > 0$.
To show symmetry, suppose $S$ is feasible for the problem. Then $S^T$ is feasible for the problem
with $x$ and $y$ swapped. The two costs are the same, since $\mathbf{tr}\, C^T S = \mathbf{tr}\, C^T S^T$ (using symmetry
of $C$).
Finally, we need to show the triangle inequality. From the definition of $d$ as the optimal value
of a convex optimization problem, with constraints that are jointly convex in $S$, $y$, $x$, we
deduce that $d$ is a convex function of $(x, y)$.
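(b) We show that the optimization problem in part (b) is the dual of the optimization problem in
the question statement, which defines $d(x, y)$. The Lagrangian of the problem is
\[
L(S, \Lambda, \nu, \mu) = \mathbf{tr}\, C^T S - \mathbf{tr}\, \Lambda^T S + \nu^T (y - S\mathbf{1}) + \mu^T (x - S^T\mathbf{1}).
\]
The dual function is
\[
\begin{array}{rcl}
g(\Lambda, \nu, \mu) &=& \inf_S L(S, \Lambda, \nu, \mu) \\
&=& \nu^T y + \mu^T x + \inf_S \left( \mathbf{tr}\, C^T S - \mathbf{tr}\, \Lambda^T S - \nu^T S\mathbf{1} - \mu^T S^T\mathbf{1} \right) \\
&=& \nu^T y + \mu^T x + \inf_S \, \mathbf{tr} \left( (C - \Lambda - \nu\mathbf{1}^T - \mathbf{1}\mu^T)^T S \right).
\end{array}
\]
Note that $\mathbf{tr}((C - \Lambda - \nu\mathbf{1}^T - \mathbf{1}\mu^T)^T S)$ is linear in $S$, so $L(S, \Lambda, \nu, \mu)$ is bounded below only when
$C - \Lambda - \nu\mathbf{1}^T - \mathbf{1}\mu^T = 0$. So,
\[
g(\Lambda, \nu, \mu) = \left\{ \begin{array}{ll}
\nu^T y + \mu^T x & C - \Lambda - \nu\mathbf{1}^T - \mathbf{1}\mu^T = 0 \\
-\infty & \mbox{otherwise.}
\end{array} \right.
\]
So, the dual problem is
\[
\begin{array}{ll}
\mbox{maximize} & \nu^T y + \mu^T x \\
\mbox{subject to} & \Lambda \succeq 0 \\
& C - \Lambda - \nu\mathbf{1}^T - \mathbf{1}\mu^T = 0,
\end{array}
\]
where $\Lambda \succeq 0$ is meant elementwise. Since $C$ is symmetric, we can exchange the roles of $\nu$ and $\mu$
(and replace $\Lambda$ with $\Lambda^T$), which gives the equivalent problem
\[
\begin{array}{ll}
\mbox{maximize} & \nu^T x + \mu^T y \\
\mbox{subject to} & \Lambda_{ij} \ge 0, \quad i, j = 1, \ldots, n \\
& \nu_i + \mu_j + \Lambda_{ij} = C_{ij}, \quad i, j = 1, \ldots, n.
\end{array}
\]
We treat $\Lambda$ as a slack variable and can therefore write the problem as
\[
\begin{array}{ll}
\mbox{maximize} & \nu^T x + \mu^T y \\
\mbox{subject to} & \nu_i + \mu_j \le C_{ij}, \quad i, j = 1, \ldots, n.
\end{array}
\]
By strong duality, $d(x, y)$, the optimal value of the original problem, is also the optimal value
of this problem.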
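
The primal and dual problems can be illustrated on a tiny instance where matching feasible points certify optimality. In the Python sketch below, the cost matrix $C_{ij} = |i - j|$, the distributions, the shipment matrix, and the dual point are all made up for illustration, and the helper names are hypothetical. A primal feasible $S$ and a dual feasible $(\nu, \mu)$ with equal objective values are both optimal, so their common value is $d(x, y)$.

```python
def transport_cost(C, S):
    """Total cost tr(C^T S) = sum_ij C_ij S_ij."""
    n = len(C)
    return sum(C[i][j] * S[i][j] for i in range(n) for j in range(n))


def primal_feasible(S, x, y, tol=1e-12):
    """Check S >= 0, S 1 = y (row sums), and S^T 1 = x (column sums)."""
    n = len(x)
    nonneg = all(S[i][j] >= 0 for i in range(n) for j in range(n))
    rows = all(abs(sum(S[i]) - y[i]) <= tol for i in range(n))
    cols = all(abs(sum(S[i][j] for i in range(n)) - x[j]) <= tol
               for j in range(n))
    return nonneg and rows and cols


def dual_feasible(nu, mu, C, tol=1e-12):
    """Check nu_i + mu_j <= C_ij for all i, j."""
    n = len(nu)
    return all(nu[i] + mu[j] <= C[i][j] + tol
               for i in range(n) for j in range(n))


def dual_objective(nu, mu, x, y):
    """Dual objective nu^T x + mu^T y."""
    return (sum(a * b for a, b in zip(nu, x))
            + sum(a * b for a, b in zip(mu, y)))


n = 3
C = [[float(abs(i - j)) for j in range(n)] for i in range(n)]
x = [0.5, 0.5, 0.0]
y = [0.0, 0.5, 0.5]

# S[i][j] is the amount moved from location j to location i:
# move 0.5 from location 1 to 2, and 0.5 from location 2 to 3.
S = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0]]

# Candidate dual point: nu_i + mu_j = j - i <= |i - j|.
nu = [0.0, -1.0, -2.0]
mu = [0.0, 1.0, 2.0]

print("primal cost:", transport_cost(C, S))
print("dual objective:", dual_objective(nu, mu, x, y))
```

For this instance both values equal 1, consistent with the strong duality statement above.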

17.2 Radiation treatment planning. In radiation treatment, radiation is delivered to a patient, with the
goal of killing or damaging the cells in a tumor, while carrying out minimal damage to other tissue.
The radiation is delivered in beams, each of which has a known pattern; the level of each beam can
be adjusted. (In most cases multiple beams are delivered at the same time, in one shot, with the
treatment organized as a sequence of shots.) We let $b_j$ denote the level of beam $j$, for $j = 1, \ldots, n$.
These must satisfy $0 \le b_j \le B^{\max}$, where $B^{\max}$ is the maximum possible beam level. The exposure
area is divided into $m$ voxels, labeled $i = 1, \ldots, m$. The dose $d_i$ delivered to voxel $i$ is linear in
the beam levels, i.e., $d_i = \sum_{j=1}^n A_{ij} b_j$. Here $A \in \mathbf{R}^{m \times n}_+$ is a (known) matrix that characterizes the
beam patterns. We now describe a simple radiation treatment planning problem.
A (known) subset of the voxels, $T \subset \{1, \ldots, m\}$, corresponds to the tumor or target region. We
require that a minimum radiation dose $D^{\rm target}$ be administered to each tumor voxel, i.e., $d_i \ge D^{\rm target}$
for $i \in T$. For all other voxels, we would like to have $d_i \le D^{\rm other}$, where $D^{\rm other}$ is a desired maximum
dose for non-target voxels. This is generally not feasible, so instead we settle for minimizing the
penalty
\[
E = \sum_{i \notin T} ((d_i - D^{\rm other})_+)^2,
\]
where $(\cdot)_+$ denotes the nonnegative part. We can interpret $E$ as the sum of the squares of the
nontarget excess doses.

(a) Show that the treatment planning problem is convex. The optimization variable is $b \in \mathbf{R}^n$;
the problem data are $B^{\max}$, $A$, $T$, $D^{\rm target}$, and $D^{\rm other}$.
(b) Solve the problem instance with data given in the file treatment_planning_data.m. Here we
have split the matrix A into Atarget, which contains the rows corresponding to the target
voxels, and Aother, which contains the rows corresponding to other voxels. Give the optimal
value. Plot the dose histogram for the target voxels, and also for the other voxels. Make a
brief comment on what you see. Remark. The beam pattern matrix in this problem instance
is randomly generated, but similar results would be obtained with realistic data.

Solution. There's not much to say here, except that the constraints are linear, and the objective
is convex. Clearly the function $(a_+)^2$, which is zero for $a \le 0$, and $a^2$ for $a \ge 0$, is convex; our
objective $E$ is simply a sum of functions of this form, with affine argument.
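
As a quick numerical illustration of the penalty, the Python sketch below evaluates $E$ for made-up non-target dose data and checks midpoint convexity along one segment of beam levels. The matrix `A_other`, the value `D_other`, and the function names are assumptions for illustration only and are unrelated to treatment_planning_data.m; a single midpoint check is of course not a proof of convexity.

```python
def doses(A, b):
    """Dose delivered to each voxel, d = A b, for A stored as nested lists."""
    return [sum(a * bj for a, bj in zip(row, b)) for row in A]


def excess_penalty(d, d_other):
    """Sum of squared excess doses: sum_i ((d_i - d_other)_+)^2."""
    return sum(max(di - d_other, 0.0) ** 2 for di in d)


# Made-up data: two non-target voxels, two beams.
A_other = [[1.0, 0.5],
           [0.2, 1.0]]
D_other = 0.25

b1 = [0.0, 0.0]
b2 = [1.0, 1.0]
b_mid = [(u + v) / 2 for u, v in zip(b1, b2)]

E1 = excess_penalty(doses(A_other, b1), D_other)
E2 = excess_penalty(doses(A_other, b2), D_other)
E_mid = excess_penalty(doses(A_other, b_mid), D_other)

print("E(b1) =", E1, " E(b2) =", E2, " E(midpoint) =", E_mid)
print("midpoint convexity holds:", E_mid <= 0.5 * (E1 + E2))
```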
The optimal penalty is $E^\star = 0.308$.
Here is the code to solve the problem:

% radiation treatment planning


treatment_planning_data;

cvx_begin
variable b(n); % beam intensities
0 <= b;
b <= Bmax;
Atarget*b >= Dtarget; % deliver minimum dose to tumor voxels
minimize (sum(square_pos(Aother*b-Dother))) % minimize square excess dose delivered to others
cvx_end

subplot(2,1,1);
hist(Atarget*b);
axis([0 2 0 60]);
hold on; plot([Dtarget Dtarget],[0 60],'r')
title('Tumor voxel dose histogram')
subplot(2,1,2);
hist(Aother*b);
axis([0 2 0 150]);
hold on; plot([Dother Dother],[0 150],r)
title(Other voxel dose histogram)
xlabel(Dosage)

print -depsc dose_histos

The resulting dose histograms are shown below, along with vertical red lines showing $D^{\mathrm{target}}$ and
$D^{\mathrm{other}}$.

[Figure: dose histograms. Top panel: tumor voxel dose histogram. Bottom panel: other voxel dose
histogram. Horizontal axis: dosage.]

17.3 Flux balance analysis in systems biology. Flux balance analysis is based on a very simple model of
the reactions going on in a cell, keeping track only of the gross rate of consumption and production
of various chemical species within the cell. Based on the known stoichiometry of the reactions, and
known upper bounds on some of the reaction rates, we can compute bounds on the other reaction
rates, or cell growth, for example.

We focus on $m$ metabolites in a cell, labeled $M_1, \ldots, M_m$. There are $n$ reactions going on, labeled
$R_1, \ldots, R_n$, with nonnegative reaction rates $v_1, \ldots, v_n$. Each reaction has a (known) stoichiometry,
which tells us the rate of consumption and production of the metabolites per unit of reaction rate.
The stoichiometry data is given by the stoichiometry matrix $S \in \mathbf{R}^{m \times n}$, defined as follows: $S_{ij}$
is the rate of production of $M_i$ due to unit reaction rate $v_j = 1$. Here we consider consumption
of a metabolite as negative production; so $S_{ij} = -2$, for example, means that reaction $R_j$ causes
metabolite $M_i$ to be consumed at a rate $2v_j$.
As an example, suppose reaction $R_1$ has the form $M_1 \to M_2 + 2M_3$. The consumption rate of $M_1$,
due to this reaction, is $v_1$; the production rate of $M_2$ is $v_1$; and the production rate of $M_3$ is $2v_1$.
(The reaction $R_1$ has no effect on metabolites $M_4, \ldots, M_m$.) This corresponds to a first column of
$S$ of the form $(-1, 1, 2, 0, \ldots, 0)$.
Reactions are also used to model flow of metabolites into and out of the cell. For example, suppose
that reaction $R_2$ corresponds to the flow of metabolite $M_1$ into the cell, with $v_2$ giving the flow
rate. This corresponds to a second column of $S$ of the form $(1, 0, \ldots, 0)$.
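To make the column convention concrete, the following sketch builds a toy stoichiometry matrix and computes the net production rate of each metabolite as the product $Sv$. The network (three metabolites, four reactions, with two outflow reactions added so that the rates can balance) and the helper name `mat_vec` are assumptions made for this illustration only.

```python
def mat_vec(S, v):
    """Return the matrix-vector product S v for a list-of-rows matrix S."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


# Hypothetical toy network (columns are reactions, rows are metabolites):
#   R1: M1 -> M2 + 2 M3    column (-1,  1,  2)
#   R2: inflow of M1       column ( 1,  0,  0)
#   R3: outflow of M2      column ( 0, -1,  0)
#   R4: outflow of M3      column ( 0,  0, -1)
S = [
    [-1, 1, 0, 0],   # M1
    [1, 0, -1, 0],   # M2
    [2, 0, 0, -1],   # M3
]
v = [1.0, 1.0, 1.0, 2.0]        # reaction rates v_1, ..., v_4
net_production = mat_vec(S, v)  # net production rate of each metabolite
print(net_production)
```

With these rates every metabolite is produced exactly as fast as it is consumed, so all three entries are zero.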
The last reaction, Rn , corresponds to biomass creation, or cell growth, so the reaction rate vn is
the cell growth rate. The last column of S gives the amounts of metabolites used or created per
unit of cell growth rate.
Since our reactions include metabolites entering or leaving the cell, as well as those converted
to biomass within the cell, we have conservation of the metabolites, which can be expressed as
$Sv = 0$. In addition, we are given upper limits on some of the reaction rates, which we express as
$v \preceq v^{\max}$, where we set $v_j^{\max} = \infty$ if no upper limit on reaction rate $j$ is known. The goal is to
find the maximum possible cell growth rate (i.e., largest possible value of $v_n$) consistent with the
constraints
\[
Sv = 0, \qquad v \succeq 0, \qquad v \preceq v^{\max}.
\]

The questions below pertain to the data found in fba_data.m.

(a) Find the maximum possible cell growth rate $G^\star$, as well as optimal Lagrange multipliers for
the reaction rate limits. How sensitive is the maximum growth rate to the various reaction
rate limits?
(b) Essential genes and synthetic lethals. For simplicity, we'll assume that each reaction is
controlled by an associated gene, i.e., gene $G_i$ controls reaction $R_i$. Knocking out a set of genes
associated with some reactions has the effect of setting the reaction rates (or equivalently, the
associated $v^{\max}$ entries) to zero, which of course reduces the maximum possible growth rate.
If the maximum growth rate becomes small enough or zero, it is reasonable to guess that
knocking out the set of genes will kill the cell. An essential gene is one that when knocked
out reduces the maximum growth rate below a given threshold $G^{\min}$. (Note that $G_n$ is always
an essential gene.) A synthetic lethal is a pair of non-essential genes that when knocked out
reduces the maximum growth rate below the threshold. Find all essential genes and synthetic
lethals for the given problem instance, using the threshold $G^{\min} = 0.2G^\star$.

Solution. There's not much to do here other than write the code for the problem. For determining
the essential genes and synthetic lethals, we simply loop over pairs of reactions to zero out
(a pair with both indices equal corresponds to knocking out a single gene).
Here is the solution in Matlab.

% flux balance analysis in systems biology

fba_data;

% part (a): find maximum growth rate


cvx_begin
variable v(n);
dual variable lambda;
maximize (v(n));
S*v == 0;
v >= 0;
lambda: v <= vmax;
cvx_end
Gstar=cvx_optval;
vopt=v;

[v vmax lambda]

% part (b): find essential genes and synthetic lethals


cvx_quiet(true);
G = zeros(n,n);
for i=1:n
for j=i:n
cvx_begin
variable v(n);
S*v == 0;
v >= 0;
v <= vmax;
v(i) == 0;
v(j) == 0;
maximize (v(n))
cvx_end
G(i,j)=cvx_optval;
G(j,i)=cvx_optval;
end
end

G<0.2*Gstar

Here it is in Python.

# flux balance analysis in systems biology

from fba_data import *


import numpy as np
import cvxpy as cvx

v = cvx.Variable(n)
constraints = [v <= vmax, v >= 0, S*v == 0]
prob = cvx.Problem(cvx.Maximize(v[n-1]),constraints)
Gstar = prob.solve()
vopt = v.value

# Look at optimal value, and see which constraints are tight


# Check if the dual value for the first constraint agrees
print(Gstar)
print(vopt)
print(vmax)
print(constraints[0].dual_value)

# Now see which genes are essential, and which pairs are synthetic lethals
G = np.zeros((n,n))
for i in range(n):
    # j starts at i, so the diagonal entry G[i,i] is the single-gene knockout
    for j in range(i, n):
        v = cvx.Variable(n)
        constraints = [v <= vmax, v >= 0, S*v == 0, v[i] == 0, v[j] == 0]
        prob = cvx.Problem(cvx.Maximize(v[n-1]),constraints)
        val = prob.solve()
        G[i,j] = val
        G[j,i] = val
print(G < 0.2*Gstar)

Here is a Julia version.

# flux balance analysis in systems biology

include("fba_data.jl");
using Convex, SCS

# part (a): find maximum growth rate


v = Variable(n);
constraints = [
S*v == 0,
v >= 0,
v <= vmax
];
p = maximize(v[n], constraints);
solve!(p);
lambda = constraints[3].dual;
Gstar = p.optval;
vopt = v.value;

println("G*: ", Gstar)
println("duals: ", lambda)

# part (b): find essential genes and synthetic lethals


G = zeros(n,n);
for i=1:n
push!(constraints, v[i] == 0)
for j=i:n
push!(constraints, v[j] == 0)
p = maximize(v[n], constraints)
solve!(p, SCSSolver(verbose=false))
G[i,j] = p.optval;
G[j,i] = p.optval;
pop!(constraints)
end
pop!(constraints)
end

# diagonal entries are true for essential genes


println(int(diag(G .< 0.2*Gstar)))
# entries close to 0 are synthetic lethals
println(int(abs(G) .< 0.001))

We find the maximum growth rate to be $G^\star = 13.55$. Only three reactions are limited: $R_1$, $R_3$,
and $R_5$; the corresponding Lagrange multipliers have optimal values 0.5, 0.5, and 1.5, respectively.
All other Lagrange multipliers are zero, of course. So it appears that the limit on reaction $R_5$ is
the one that would have the largest effect on the optimal growth rate, for a small change in the limit.
We see that the essential genes are $G_1$ and $G_9$. The synthetic lethals are
\[
\{G_2, G_3\}, \quad \{G_2, G_7\}, \quad \{G_4, G_7\}, \quad \{G_5, G_7\}.
\]
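The classification rule from part (b) can be sketched as a post-processing step on the table of knockout growth rates. This is an illustration under assumptions: `classify_knockouts` is a hypothetical helper, indices are 0-based, and the 4-by-4 table below is made up rather than computed from fba_data.m.

```python
def classify_knockouts(G, threshold):
    """Split knockouts into essential genes and synthetic lethal pairs.

    G[i][j] is the maximum growth rate with genes i and j knocked out;
    the diagonal entry G[i][i] is the single-gene knockout.
    """
    n = len(G)
    essential = [i for i in range(n) if G[i][i] < threshold]
    essential_set = set(essential)
    # A synthetic lethal is a pair of NON-essential genes below the threshold.
    lethal_pairs = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if i not in essential_set and j not in essential_set
        and G[i][j] < threshold
    ]
    return essential, lethal_pairs


# Made-up symmetric table for four genes; gene 3 plays the role of G_n.
G = [
    [5.0, 4.0, 0.5, 0.0],
    [4.0, 4.5, 4.0, 0.0],
    [0.5, 4.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
Gstar = 5.0
essential, lethal_pairs = classify_knockouts(G, 0.2 * Gstar)
print(essential, lethal_pairs)
```

Pairs that contain an essential gene are excluded on purpose, since the definition of a synthetic lethal requires both genes to be non-essential.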

17.4 Online advertising displays. When a user goes to a website, one of a set of n ads, labeled 1, . . . , n, is
displayed. This is called an impression. We divide some time interval (say, one day) into T periods,
labeled $t = 1, \ldots, T$. Let $N_{it} \geq 0$ denote the number of impressions in period $t$ for which we display
ad $i$. In period $t$ there will be a total of $I_t > 0$ impressions, so we must have $\sum_{i=1}^n N_{it} = I_t$, for
$t = 1, \ldots, T$. (The numbers $I_t$ might be known from past history.) You can treat all these numbers
as real. (This is justified since they are typically very large.)
The revenue for displaying ad $i$ in period $t$ is $R_{it} \geq 0$ per impression. (This might come from
click-through payments, for example.) The total revenue is $\sum_{t=1}^T \sum_{i=1}^n R_{it} N_{it}$. To maximize revenue, we
would simply display the ad with the highest revenue per impression, and no other, in each display
period.
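The revenue-only strategy described above can be sketched in a few lines of plain Python. The function name `greedy_display`, the list-of-lists layout of `R`, and the two-ad, two-period numbers are assumptions made for this illustration.

```python
def greedy_display(R, I):
    """Show only the highest-revenue ad in each period.

    R[i][t] is the revenue per impression of ad i in period t,
    and I[t] is the number of impressions in period t.
    """
    n, T = len(R), len(I)
    N = [[0.0] * T for _ in range(n)]
    revenue = 0.0
    for t in range(T):
        best = max(range(n), key=lambda i: R[i][t])  # best ad in period t
        N[best][t] = I[t]                            # all impressions go to it
        revenue += R[best][t] * I[t]
    return N, revenue


# Hypothetical data: 2 ads, 2 periods.
R = [[0.5, 0.25],
     [0.25, 0.75]]
I = [10.0, 4.0]
N, revenue = greedy_display(R, I)
print(N, revenue)
```

Each column of `N` sums to the corresponding entry of `I`, so the impression constraint holds by construction.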
We also have in place a set of m contracts that require us to display certain numbers of ads, or mixes
of ads (say, associated with the products of one company), over certain periods, with a penalty for
any shortfalls. Contract $j$ is characterized by a set of ads $\mathcal{A}_j \subseteq \{1, \ldots, n\}$ (while it does not affect
the math, these are often disjoint), a set of periods $\mathcal{T}_j \subseteq \{1, \ldots, T\}$, a target number of impressions
$q_j \geq 0$, and a shortfall penalty rate $p_j > 0$. The shortfall $s_j$ for contract $j$ is
\[
s_j = \Bigl( q_j - \sum_{t \in \mathcal{T}_j} \sum_{i \in \mathcal{A}_j} N_{it} \Bigr)_+ ,
\]
where $(u)_+$ means $\max\{u, 0\}$. (This is the number of impressions by which we fall short of the
target value $q_j$.) Our contracts require a total penalty payment equal to $\sum_{j=1}^m p_j s_j$. Our net profit
is the total revenue minus the total penalty payment.
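As a worked example of the shortfall and net profit definitions, here is a sketch that evaluates both for a fixed display matrix $N$. The dictionary layout for a contract (keys `ads`, `periods`, `q`, `p`), the 0-based indices, and all numbers are assumptions made for this illustration.

```python
def shortfalls(N, contracts):
    """Shortfall s_j = (q_j - impressions delivered to contract j)_+."""
    s = []
    for c in contracts:
        delivered = sum(N[i][t] for i in c["ads"] for t in c["periods"])
        s.append(max(c["q"] - delivered, 0.0))
    return s


def net_profit(N, R, contracts):
    """Return (revenue, penalty, revenue - penalty) for display numbers N."""
    revenue = sum(R[i][t] * N[i][t]
                  for i in range(len(N)) for t in range(len(N[0])))
    s = shortfalls(N, contracts)
    penalty = sum(c["p"] * s_j for c, s_j in zip(contracts, s))
    return revenue, penalty, revenue - penalty


# Hypothetical data: 2 ads, 2 periods, 2 contracts.
R = [[0.5, 0.25], [0.25, 0.75]]
N = [[10.0, 0.0], [0.0, 4.0]]
contracts = [
    {"ads": [1], "periods": [0, 1], "q": 6.0, "p": 0.5},  # falls short by 2
    {"ads": [0], "periods": [0], "q": 8.0, "p": 1.0},     # target is met
]
revenue, penalty, profit = net_profit(N, R, contracts)
print(revenue, penalty, profit)
```

Only the first contract is short, so the penalty consists of a single term.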

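(a) Explain how to find the display numbers $N_{it}$ that maximize net profit. The data in this
problem are $R \in \mathbf{R}^{n \times T}$, $I \in \mathbf{R}^T$ (here $I$ is the vector of impressions, not the identity matrix),
and the contract data $\mathcal{A}_j$, $\mathcal{T}_j$, $q_j$, and $p_j$, $j = 1, \ldots, m$.
(b) Carry out your method on the problem with data given in ad_disp_data.m or ad_disp_data.py.
The data $\mathcal{A}_j$ and $\mathcal{T}_j$, for $j = 1, \ldots, m$, are given by matrices $A^{\mathrm{contr}} \in \mathbf{R}^{n \times m}$ and
$T^{\mathrm{contr}} \in \mathbf{R}^{T \times m}$, with
\[
A^{\mathrm{contr}}_{ij} = \left\{ \begin{array}{ll} 1 & i \in \mathcal{A}_j \\ 0 & \mbox{otherwise,} \end{array} \right.
\qquad
T^{\mathrm{contr}}_{tj} = \left\{ \begin{array}{ll} 1 & t \in \mathcal{T}_j \\ 0 & \mbox{otherwise.} \end{array} \right.
\]
Report the optimal net profit, and the associated revenue and total penalty payment. Give
the same three numbers for the strategy of simply displaying in each period only the ad with
the largest revenue per impression.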

Solution.

(a) The constraints $N_{it} \geq 0$, $\sum_{i=1}^n N_{it} = I_t$ are linear. The contract shortfalls $s_j$ are convex
functions of $N$, so the net profit
\[
\sum_{t=1}^T \sum_{i=1}^n R_{it} N_{it} - \sum_{j=1}^m p_j s_j
\]
is concave. So we have a convex optimization problem.

(b) We can express the vector of shortfalls as
\[
s = \left( q - \mathop{\bf diag}\left( (A^{\mathrm{contr}})^T N T^{\mathrm{contr}} \right) \right)_+ .
\]
The following code solves the problem:


clear all;
% ad display problem
ad_disp_data;

% optimal display
cvx_begin
variable N(n,T)
s = pos(q-diag(Acontr'*N*Tcontr)); % contract shortfalls
maximize(R(:)'*N(:)-p'*s)
subject to
N >= 0;
sum(N) == I';
cvx_end
opt_s=s;
opt_net_profit = cvx_optval
opt_penalty = p'*opt_s
opt_revenue = opt_net_profit+opt_penalty

% display, ignoring contracts

cvx_begin
variable N(n,T)
maximize(R(:)'*N(:))
subject to
N >= 0;
sum(N) == I';
cvx_end
greedy_s = pos(q-diag(Acontr'*N*Tcontr));
greedy_net_profit = cvx_optval-p'*greedy_s
greedy_penalty = p'*greedy_s
greedy_revenue = cvx_optval

import cvxpy as cvx


import numpy as np

np.random.seed(0)
n = 100 #number of ads
m = 30 #number of contracts
T = 60 #number of periods

#number of impressions in each period


I = 10*np.random.rand(T, 1); I = np.asmatrix(I)
#revenue rate for each period and ad
R = np.random.rand(n, T); R = np.asmatrix(R)
#contract target number of impressions
q = T/float(n)*50*np.random.rand(m, 1); q = np.asmatrix(q)
#penalty rate for shortfall
p = np.random.rand(m, 1); p = np.asmatrix(p)
#one column per contract. 1s at the periods to be displayed
Tcontr = np.matrix(np.random.rand(T,m)>.8, dtype = float)
Acontr = np.zeros((n, m)); Acontr = np.asmatrix(Acontr)
for i in range(n):
    contract=int(np.floor(m*np.random.rand(1)))
    #one column per contract. 1s at the ads to be displayed
    Acontr[i,contract]=1

#Solution begins
#------------------------------------------------------------
N = cvx.Variable(n, T)
s = cvx.pos(q-cvx.diag(Acontr.T*N*Tcontr))

penalty = cvx.pos(p).T*s
revenue = cvx.sum_entries(cvx.mul_elemwise(R,N))

constraints = [N >= 0]
constraints += [cvx.sum_entries(N[:, t]) == I[t] for t in range(T)]
prob = cvx.Problem([], constraints)

#Optimal solution
prob.objective = cvx.Maximize(revenue - penalty)
optimal_profit = prob.solve()
optimal_revenue = revenue.value
optimal_penalty = penalty.value

#Greedy solution
prob.objective = cvx.Maximize(revenue)
greedy_revenue = prob.solve()
greedy_penalty = penalty.value
greedy_profit = greedy_revenue - greedy_penalty

print([optimal_revenue, optimal_penalty, optimal_profit])
print([greedy_revenue, greedy_penalty, greedy_profit])

The performance of the optimal and greedy strategies is given in the table below. The optimal
strategy gives up a bit of (gross) revenue, in order to pay substantially less penalty, and ends
up way ahead of the greedy strategy.

strategy    revenue    penalty    net profit
optimal     268.23      37.66     230.57
greedy      305.10     232.26      72.84

strategy    revenue    penalty    net profit
optimal     280.94      21.94     259.00
greedy      306.37     159.49     146.88

17.5 Ranking by aggregating preferences. We have $n$ objects, labeled $1, \ldots, n$. Our goal is to assign a
real valued rank $r_i$ to the objects. A preference is an ordered pair $(i, j)$, meaning that object $i$ is
preferred over object $j$. The ranking $r \in \mathbf{R}^n$ and preference $(i, j)$ are consistent if $r_i \geq r_j + 1$.
(This sets the scale of the ranking: a gap of one in ranking is the threshold for preferring one item
over another.) We define the preference violation of preference $(i, j)$ with ranking $r \in \mathbf{R}^n$ as
\[
v = (r_j + 1 - r_i)_+ = \max\{r_j + 1 - r_i, 0\}.
\]
We have a set of $m$ preferences among the objects, $(i^{(1)}, j^{(1)}), \ldots, (i^{(m)}, j^{(m)})$. (These may come
from several different evaluators of the objects, but this won't matter here.)
We will select our ranking $r$ as a minimizer of the total preference violation penalty, defined as
\[
J = \sum_{k=1}^m \phi(v^{(k)}),
\]
where $v^{(k)}$ is the preference violation of $(i^{(k)}, j^{(k)})$ with $r$, and $\phi$ is a nondecreasing convex penalty
function that satisfies $\phi(u) = 0$ for $u \leq 0$.

(a) Make a (simple, please) suggestion for $\phi$ for each of the following two situations:
(i) We don't mind some small violations, but we really want to avoid large violations.
(ii) We want as many preferences as possible to be consistent with the ranking, but will accept
some (hopefully, few) larger preference violations.
(b) Find the rankings obtained using the penalty functions proposed in part (a), on the data
set found in rank_aggr_data.m. Plot a histogram of preference violations for each case and
briefly comment on the differences between them. Give the number of positive preference
violations for each case. (Use sum(v>0.001) to determine this number.)

Remark. The objects could be candidates for a position, papers at a conference, movies, websites,
courses at a university, and so on. The preferences could arise in several ways. Each of a set of
evaluators provides some preferences, for example by rank ordering a subset of the objects. The
problem can be thought of as aggregating the preferences given by the evaluators, to come up with
a composite ranking.
Solution.

(a) Here are some simple suggestions. For the first case (i), we take a quadratic penalty (for
positive violations): $\phi(u) = u_+^2$. For the second case (ii), we take a linear penalty (for positive
violations): $\phi(u) = u_+$. If the violations were two-sided, these would correspond to a (squared) $\ell_2$ norm
and an $\ell_1$ norm, respectively.
(b) The problem that we need to solve is
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{k=1}^m \phi\left( (v^{(k)})_+ \right) \\
\mbox{subject to} & v^{(k)} = r_{j^{(k)}} + 1 - r_{i^{(k)}}, \quad k = 1, \ldots, m.
\end{array}
\]

The following Matlab code solves the problem for the particular problem instance using the
penalty functions proposed in part (a):
%ranking aggregation problem
%load data
rank_aggr_data;

%find rankings with sum of squared violations penalty


cvx_begin
variables r(n) v(m)
minimize (sum(square_pos(v)));

subject to
v == r(preferences(:,2)) + 1 - r(preferences(:,1))
cvx_end
v_sq = pos(v);
num_viol_sq = sum(v_sq>0.001)

%find rankings with sum of violations penalty


cvx_begin
variables r(n) v(m)
minimize (sum(pos(v)));
subject to
v == r(preferences(:,2)) + 1 - r(preferences(:,1))
cvx_end
v_lin = pos(v);
num_viol_lin = sum(v_lin>0.001)

%plot results
subplot(2,1,1);
hist(v_sq,[0:0.5:7.5]);
xlim([0 10]);ylim([0 1000]);
title('Histogram of v with sum of squares violations penalty');
subplot(2,1,2);
hist(v_lin,[0:0.5:7.5]);
xlim([0 10]);ylim([0 1000]);
title('Histogram of v with sum of violations penalty');
print -deps rank_aggr_hist
The resulting residual histograms are shown below.

[Figure: histograms of the preference violations v. Top panel: sum of squares violations penalty.
Bottom panel: sum of violations penalty.]

For case (i), the violations are often small, but rarely zero; of the 1000 preferences given,
781 are violated. But the largest violation is below 3. For case (ii), only 235 preferences are
violated, but we have a few larger violations, with values up to around 7.
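The two penalties from part (a) can be compared on a tiny instance. The sketch below only evaluates the violations and the total penalty $J$ for a fixed ranking; it does not solve the optimization problem. The helper names, the 0-based object indices, and the three-object data are assumptions made for this illustration.

```python
def violations(r, preferences):
    """Preference violations v = (r_j + 1 - r_i)_+ for each pair (i, j)."""
    return [max(r[j] + 1 - r[i], 0.0) for (i, j) in preferences]


def total_penalty(v, phi):
    """Total preference violation penalty J = sum_k phi(v_k)."""
    return sum(phi(u) for u in v)


def quadratic(u):
    return max(u, 0.0) ** 2   # case (i): discourages large violations


def linear(u):
    return max(u, 0.0)        # case (ii): tolerates a few large violations


# Hypothetical data: 3 objects and a cyclic (inconsistent) set of preferences.
r = [2.0, 1.0, 0.5]
preferences = [(0, 1), (1, 2), (2, 0)]
v = violations(r, preferences)
J_sq = total_penalty(v, quadratic)
J_lin = total_penalty(v, linear)
num_viol = sum(1 for u in v if u > 0.001)
print(v, J_sq, J_lin, num_viol)
```

Because the preferences form a cycle, no ranking can make all three violations zero, which is why both penalties are positive here.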
17.6 Time release formulation. A patient is treated with a drug (say, in pill form) at different times.
Each treatment (or pill) contains (possibly) different amounts of various formulations of the drug.
Each of the formulations, in turn, has a characteristic pattern as to how quickly it releases the
drug into the bloodstream. The goal is to optimize the blend of formulations that go into each
treatment, in order to achieve a desired drug concentration in the bloodstream over time.
We will use discrete time, $t = 1, 2, \ldots, T$, representing hours (say). There will be $K$ treatments,
administered at known times $1 = \tau_1 < \tau_2 < \cdots < \tau_K < T$. We have $m$ drug formulations; each
treatment consists of a mixture of these $m$ formulations. We let $a^{(k)} \in \mathbf{R}^m_+$ denote the amounts of
the $m$ formulations in treatment $k$, for $k = 1, \ldots, K$.
Each formulation $i$ has a time profile $p_i(t) \in \mathbf{R}_+$, for $t = 1, 2, \ldots$. If an amount $a_i^{(k)}$ of formulation
$i$ from treatment $k$ is administered at time $t_0$, the drug concentration in the bloodstream (due to
this formulation) is given by $a_i^{(k)} p_i(t - t_0)$ for $t > t_0$, and $0$ for $t \leq t_0$. To simplify notation, we will
define $p_i(t)$ to be zero for $t = 0, -1, -2, \ldots$. We assume the effects of the different formulations and
different treatments are additive, so the total bloodstream drug concentration is given by
\[
c(t) = \sum_{k=1}^K \sum_{i=1}^m p_i(t - \tau_k) a_i^{(k)}, \quad t = 1, \ldots, T.
\]
(This is just a vector convolution.) Recall that $p_i(t - \tau_k) = 0$ for $t \leq \tau_k$, which means that the
effect of treatment $k$ does not show up until time $\tau_k + 1$.
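The convolution formula for $c(t)$ can be sketched directly as a triple loop. The function name, the list layout (profiles stored from lag 1 onward, times 1-based as in the text), and the small data set are assumptions made for this illustration; they are not the data from the exercise.

```python
def concentration(a, profiles, tau, T):
    """Bloodstream concentration c(1), ..., c(T).

    a[k][i]        amount of formulation i in treatment k
    profiles[i][s-1] = p_i(s) for s = 1, 2, ... (zero beyond the list)
    tau[k]         administration time of treatment k (1-based hour)
    """
    c = [0.0] * T
    for k, t_k in enumerate(tau):
        for i, p_i in enumerate(profiles):
            for t in range(1, T + 1):
                lag = t - t_k
                # p_i(lag) is zero for lag <= 0 and beyond the profile length
                if 1 <= lag <= len(p_i):
                    c[t - 1] += p_i[lag - 1] * a[k][i]
    return c


# Hypothetical data: 2 formulations, 2 treatments, T = 6 hours.
profiles = [[1.0, 0.5], [0.0, 1.0, 1.0]]
tau = [1, 3]
a = [[2.0, 0.0],   # treatment 1: only formulation 1
     [0.0, 1.0]]   # treatment 2: only formulation 2
c = concentration(a, profiles, tau, T=6)
print(c)
```

The first entry of the result is zero, reflecting the fact that a treatment given at time 1 has no effect until time 2.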

We require that $c(t) \leq c^{\max}$ for $t = 1, \ldots, T$, where $c^{\max}$ is a given maximum permissible
concentration. We define the therapeutic time $T^{\mathrm{ther}}$ as
\[
T^{\mathrm{ther}} = \min\{ t \mid c(\tau) \geq c^{\min} \mbox{ for } \tau = t, \ldots, T \},
\]
with $T^{\mathrm{ther}} = \infty$ if $c(t) < c^{\min}$ for $t = 1, \ldots, T$. Here, $c^{\min}$ is the minimum concentration for the drug
to have therapeutic value. Thus, $T^{\mathrm{ther}}$ is the first time at which the drug concentration reaches,
and stays above, the minimum therapeutic level.
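The definition of the therapeutic time can be evaluated for a given concentration profile by scanning backward from the final time. This is a minimal sketch; the function name and the sample profile are assumptions made for this illustration.

```python
import math


def therapeutic_time(c, c_min):
    """Smallest t (1-based) with c(tau) >= c_min for all tau = t, ..., T.

    Returns math.inf if no such t exists.
    """
    t_ther = math.inf
    for t in range(len(c), 0, -1):   # scan backward from t = T
        if c[t - 1] >= c_min:
            t_ther = t               # the level still holds from t onward
        else:
            break                    # first failure ends the run
    return t_ther


# Hypothetical concentration profile over T = 6 hours.
c = [0.0, 2.0, 1.0, 0.0, 1.0, 1.0]
print(therapeutic_time(c, c_min=1.0))
```

Note that the early hours above the threshold do not count here, because the concentration later drops below the minimum level again.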
Finally, we get to the problem. The optimization variables are the treatment formulation vectors
$a^{(1)}, \ldots, a^{(K)}$. There are two objectives: $T^{\mathrm{ther}}$ (which we want to be small), and
\[
J^{\mathrm{ch}} = \sum_{k=1}^{K-1} \| a^{(k+1)} - a^{(k)} \|_\infty
\]
(which we also want to be small). This second objective is a penalty for changing the formulation
amounts in the treatments.
The rest of the problem concerns the specific instance with data given in the file time_release_form_data.m.
This gives data for $T = 168$ (one week, starting from 8AM Monday morning), with treatments
occurring 3 times each day, at 8AM, 2PM, and 11PM, so we have a total of $K = 21$ treatments. We
have $m = 6$ formulations, with profiles with length 96 (i.e., $p_i(t) = 0$ for $t > 96$).

• Explain how to find the optimal trade-off curve of $T^{\mathrm{ther}}$ versus $J^{\mathrm{ch}}$. Your method may involve
solving several convex optimization problems.
• Plot the trade-off curve over a reasonable range, and be sure to explain or at least comment
on the endpoints of the trade-off curve.
• Plot the treatment formulation amounts versus $k$, and the bloodstream concentration versus
$t$, for the two trade-off curve endpoints, and one corresponding to $T^{\mathrm{ther}} = 8$.

Warning. We've found that CVX can experience numerical problems when solving this problem
(depending on how it is formulated). In one case, cvx_status is "Solved/Inaccurate" when in fact
the problem has been solved (just not to the tolerances SeDuMi likes to see). You can ignore this
status, taking it to mean Optimal. You can also try switching to the SDPT3 solver. In any case,
please do not spend much time worrying about, or dealing with, these numerical problems.
Solution. We formulate the solution as the following bi-criterion optimization problem:
\[
\begin{array}{ll}
\mbox{minimize} & (J^{\mathrm{ch}}, T^{\mathrm{ther}}) \\
\mbox{subject to} & c(t) \leq c^{\max}, \quad t = 1, \ldots, T \\
& c(t) \geq c^{\min}, \quad t = T^{\mathrm{ther}}, \ldots, T \\
& a^{(k)} \succeq 0, \quad k = 1, \ldots, K.
\end{array}
\]
The key to this problem is to recognize that the objective $T^{\mathrm{ther}}$ is quasiconvex. The problem as
stated is convex for fixed values of $T^{\mathrm{ther}}$. To solve the problem, then, we will solve a sequence of $T$
LPs, with $T^{\mathrm{ther}} = 1, \ldots, T$. The code to do this is given below.
Note that an acceptable solution to this problem added the constraint that $c(T^{\mathrm{ther}} - 1) < c^{\min}$,
so that the fixed value of $T^{\mathrm{ther}}$ satisfies the definition. Points on the trade-off curve found this
way need not be Pareto optimal (i.e., minimal).

% solution for time release formulation
close all; clear all
time_release_form_data;

Tther = 1:T;

fastest = -1;
for i = 1:length(Tther)
cvx_begin quiet
variable a(m,K)
c = zeros(1,T);

% vector convolution performed manually


for k = 1:K,
P_shift = zeros(m,T);
P_shift(:, tau(k):tau(k) + Tp-1) = P;
P_shift = P_shift(:,1:T); % truncate if exceed "week"
c = c + sum((a(:,k)*ones(1,T)).*P_shift);
end

Jch = sum(norms(a(:,2:K) - a(:,1:K-1), Inf));

minimize (Jch)
subject to
c <= cmax
c(Tther(i):end) >= cmin
a >= 0
cvx_end

if (strcmpi(cvx_status, 'Solved') && fastest == -1)


fastest = i;
end

Jch_vals(i) = cvx_optval;
C(i,:) = c;
A{i} = a;
if(abs(cvx_optval) <= 1e-6)
break
end
end

% make plots
figure
plot(Tther(1:length(A)), Jch_vals)
xlabel('Tther'); ylabel('Jch');
print -depsc time_release_tradeoff.eps

p = 8;

% bloodstream concentration for the two endpoints and for Tther = 8
figure
plot(1:T,C(fastest,:), 1:T, C(end,:), 1:T, C(p,:) , ...
1:T, cmin*ones(T,1), 'k--', 1:T, cmax*ones(T,1), 'k--' )
axis([0 T 0 5.5]);
xlabel('t'); ylabel('ct')
print -depsc time_release_bloodstream.eps

% formulation amounts versus k for the same three values of Tther
figure
subplot(3,1,1)
plot(1:K, A{fastest}, 'k')
ylabel(['Tther' num2str(fastest)]); axis([1 K 0 40]);
subplot(3,1,2)
plot(1:K, A{p}, 'k')
ylabel('Tther8'); axis([1 K 0 40]);
subplot(3,1,3)
plot(1:K, A{end}, 'k')
ylabel(['Tther' num2str(length(A))]); axis([1 K 0 40]);
xlabel('k');
print -depsc time_release_formulation.eps

For small values of $T^{\mathrm{ther}}$, the LP will be infeasible. This means that we cannot formulate a treatment
that achieves the minimum therapeutic level by time $T^{\mathrm{ther}}$. As we increase the value of $T^{\mathrm{ther}}$, the
LP becomes feasible.
Once the LP becomes feasible, our solution gives the treatment that achieves the minimum
therapeutic level as quickly as possible. However, $J^{\mathrm{ch}}$ may be large. This means that in order to achieve
the minimum therapeutic level as quickly as possible, we must often change the formulation amounts
in our treatments.
If we let $T^{\mathrm{ther}}$ become large enough, $J^{\mathrm{ch}}$ goes to 0, since we no longer have to achieve a minimum
therapeutic level. At this point, our treatment consists of a constant drug formulation.
The figure below shows the optimal tradeoff curve. The problem is infeasible for $T^{\mathrm{ther}} = 1$, so
we cannot achieve the minimum therapeutic level after an hour of taking the pill. However, for
$T^{\mathrm{ther}} \geq 2$, we are able to achieve the minimum therapeutic level by some blend of our 6 drug
formulations. Therefore, $T^{\mathrm{ther}} = 2$ gives the fastest acting blend of the formulations; however, the
price we pay is that we must form 4 different pills. For $T^{\mathrm{ther}} \geq 26$, $J^{\mathrm{ch}} = 0$: a single pill (constant
blend) suffices to achieve the minimum therapeutic level after 26 hours. Note that the Pareto
optimal frontier is not convex.

[Figure: optimal trade-off curve, Jch (vertical axis) versus Tther (horizontal axis).]

The next plot shows the treatment formulation amounts versus $k$ for the three selected values of
$T^{\mathrm{ther}}$. Note that as $T^{\mathrm{ther}}$ increases, the formulation amounts vary less. For $T^{\mathrm{ther}} = 2$, there are 4
different pills; for $T^{\mathrm{ther}} = 8$, we have 2 or 3 different pills; and for $T^{\mathrm{ther}} = 26$, we have a single pill.

[Figure: treatment formulation amounts versus k, in three panels, for Tther = 2, Tther = 8, and
Tther = 26.]

The last plot shows the bloodstream concentration $c(t)$ versus $t$. The bloodstream concentration
corresponding to $T^{\mathrm{ther}} = 2$ (the fastest acting blend) is in blue; $T^{\mathrm{ther}} = 8$ is in red; and $T^{\mathrm{ther}} = 26$
(the constant blend) is in green.

[Figure: bloodstream concentration ct (vertical axis) versus t (horizontal axis).]

17.7 Sizing a gravity feed water supply network. A water supply network connects water supplies (such
as reservoirs) to consumers via a network of pipes. Water flow in the network is due to gravity
(as opposed to pumps, which could also be added to the formulation). The network is composed
of a set of n nodes and m directed edges between pairs of nodes. The first k nodes are supply or
reservoir nodes, and the remaining $n - k$ are consumer nodes. The edges correspond to the pipes
in the water supply network.
We let $f_j \geq 0$ denote the water flow in pipe (edge) $j$, and $h_i$ denote the (known) altitude or height
of node $i$ (say, above sea level). At nodes $i = 1, \ldots, k$, we let $s_i \geq 0$ denote the flow into the network
from the supply. For $i = 1, \ldots, n - k$, we let $c_i \geq 0$ denote the water flow taken out of the network
(by consumers) at node $k + i$. Conservation of flow can be expressed as
\[
Af = \left[ \begin{array}{c} -s \\ c \end{array} \right],
\]
where $A \in \mathbf{R}^{n \times m}$ is the incidence matrix for the supply network, given by
\[
A_{ij} = \left\{ \begin{array}{ll}
-1 & \mbox{if edge $j$ leaves node $i$} \\
+1 & \mbox{if edge $j$ enters node $i$} \\
0 & \mbox{otherwise.}
\end{array} \right.
\]
We assume that each edge is oriented from a node of higher altitude to a node of lower altitude; if
edge $j$ goes from node $i$ to node $l$, we have $h_i > h_l$. The pipe flows are determined by
\[
f_j = \frac{\alpha \theta_j R_j^2 (h_i - h_l)}{L_j},
\]
where edge $j$ goes from node $i$ to node $l$, $\alpha > 0$ is a known constant, $L_j > 0$ is the (known) length
of pipe $j$, $R_j > 0$ is the radius of pipe $j$, and $\theta_j \in [0, 1]$ corresponds to the valve opening in pipe $j$.
Finally, we have a few more constraints. The supply feed rates are limited: we have $s_i \leq S_i^{\max}$.
The pipe radii are limited: we have $R_j^{\min} \leq R_j \leq R_j^{\max}$. (These limits are all known.)
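The pipe flow formula can be evaluated with a short sketch. The helper names, the 0-based node indices, the edge list representation, and the three-pipe network are all assumptions made for this illustration.

```python
def max_pipe_flows(edges, heights, lengths, radii, alpha):
    """Flow in each pipe with the valve fully open (theta_j = 1).

    edges[j] = (i, l) means pipe j runs from node i down to node l.
    """
    flows = []
    for (i, l), L_j, R_j in zip(edges, lengths, radii):
        if heights[i] <= heights[l]:
            raise ValueError("edges must point from higher to lower altitude")
        flows.append(alpha * R_j ** 2 * (heights[i] - heights[l]) / L_j)
    return flows


def pipe_flows(theta, max_flows):
    """Actual flows f_j = theta_j * (fully open flow), with theta_j in [0, 1]."""
    return [th * f_max for th, f_max in zip(theta, max_flows)]


# Hypothetical network: 3 nodes, 3 pipes.
edges = [(0, 1), (0, 2), (1, 2)]
heights = [10.0, 6.0, 2.0]
lengths = [2.0, 4.0, 1.0]
radii = [1.0, 0.5, 2.0]
f_max = max_pipe_flows(edges, heights, lengths, radii, alpha=0.5)
f = pipe_flows([1.0, 0.5, 0.25], f_max)
print(f_max, f)
```

Since the flow is linear in the valve opening, the fully open flow acts as an upper bound on each pipe flow.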

(a) Supportable consumption vectors. Suppose that the pipe radii are fixed and known. We say
that $c \in \mathbf{R}^{n-k}_+$ is supportable if there is a choice of $f$, $s$, and $\theta$ for which all constraints
and conditions above are satisfied. Show that the set of supportable consumption vectors is
a polyhedron, and explain how to determine whether or not a given consumption vector is
supportable.
(b) Optimal pipe sizing. You must select the pipe radii $R_j$ to minimize the cost, which we take to
be (proportional to) the total volume of the pipes, $L_1 R_1^2 + \cdots + L_m R_m^2$, subject to being able to
support a set of consumption vectors, denoted $c^{(1)}, \ldots, c^{(N)}$, which we refer to as consumption
scenarios. (This means that any consumption vector in the convex hull of $\{c^{(1)}, \ldots, c^{(N)}\}$ will
be supportable.) Show how to formulate this as a convex optimization problem. Note. You
are asked to choose one set of pipe radii, and $N$ sets of valve parameters, flow vectors, and
source vectors; one for each consumption scenario.
(c) Solve the instance of the optimal pipe sizing problem with data defined in the file grav_feed_network_data.m,
and report the optimal value and the optimal pipe radii. The columns of the matrix C in the
data file are the consumption vectors $c^{(1)}, \ldots, c^{(N)}$.

Hint. $A^T h$ gives a vector containing the height differences across the edges.
Solution.

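(a) Let $P$ be the set of vectors $(f, \theta, s, c) \in \mathbf{R}^m \times \mathbf{R}^m \times \mathbf{R}^k \times \mathbf{R}^{n-k}$ that satisfy
\[
Af = \left[ \begin{array}{c} -s \\ c \end{array} \right], \quad
0 \preceq \theta \preceq \mathbf{1}, \quad 0 \preceq f, \quad 0 \preceq c, \quad
0 \preceq s \preceq S^{\max}, \quad f = D\theta,
\]
where $D = D_1 D_2$, with $D_1 = \alpha \mathop{\bf diag}(-A^T h) \mathop{\bf diag}(1/L_1, \ldots, 1/L_m)$ and
$D_2 = \mathop{\bf diag}(R_1^2, \ldots, R_m^2)$.
The set $P$ is clearly a polyhedron. Therefore, the set of supportable consumption vectors $c$,
which is just a projection of $P$, is also a polyhedron.
To determine whether or not a consumption vector $c \in \mathbf{R}^{n-k}_+$ is supportable, we can solve the
following feasibility problem:
\[
\begin{array}{ll}
\mbox{find} & (f, \theta, s) \\
\mbox{subject to} & 0 \preceq f \\
& 0 \preceq \theta \preceq \mathbf{1} \\
& Af = (-s, c) \\
& 0 \preceq s \preceq S^{\max} \\
& f = D\theta.
\end{array}
\]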

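(b) The problem that we would like to solve is
\[
\begin{array}{ll}
\mbox{minimize} & L_1 R_1^2 + \cdots + L_m R_m^2 \\
\mbox{subject to} & R^{\min} \preceq R \preceq R^{\max} \\
& 0 \preceq f^{(p)}, \quad p = 1, \ldots, N \\
& 0 \preceq \theta^{(p)} \preceq \mathbf{1}, \quad p = 1, \ldots, N \\
& Af^{(p)} = (-s^{(p)}, c^{(p)}), \quad p = 1, \ldots, N \\
& 0 \preceq s^{(p)} \preceq S^{\max}, \quad p = 1, \ldots, N \\
& f^{(p)} = D_1 D_2 \theta^{(p)}, \quad p = 1, \ldots, N.
\end{array}
\]
This problem, however, is not convex in the variables $R$, $f^{(p)}$, $s^{(p)}$, $\theta^{(p)}$, since the last
(equality) constraint is not a convex constraint: $D_2$ depends on $R$, so the right-hand side is not
affine in the variables.
The first trick is to not use the valve parameters directly and replace $\theta$ with $\mathbf{1}$, i.e.,
\[
f^{(p)} \preceq D_1 D_2 \mathbf{1}.
\]
After finding the flows and pipe radii, we can simply set
\[
\theta^{(p)} = (D_1 D_2)^{-1} f^{(p)} \in [0, 1]^m.
\]
This is a valid transformation, since $D_1$ and $D_2$ are diagonal and invertible, so $\theta$ can be
recovered uniquely.
However, the constraint $f^{(p)} \preceq D_1 D_2 \mathbf{1}$ is still not convex because $D_2$ is quadratic in $R$. The
second trick is to do a change of variables $z = D_2 \mathbf{1} = (R_1^2, \ldots, R_m^2)$. This is a valid change of
variables since $R \succ 0$.
Using these transformations, we obtain the equivalent problem
\[
\begin{array}{ll}
\mbox{minimize} & L_1 z_1 + \cdots + L_m z_m \\
\mbox{subject to} & (R_j^{\min})^2 \leq z_j \leq (R_j^{\max})^2, \quad j = 1, \ldots, m \\
& 0 \preceq f^{(p)}, \quad p = 1, \ldots, N \\
& Af^{(p)} = (-s^{(p)}, c^{(p)}), \quad p = 1, \ldots, N \\
& 0 \preceq s^{(p)} \preceq S^{\max}, \quad p = 1, \ldots, N \\
& f^{(p)} \preceq D_1 z, \quad p = 1, \ldots, N,
\end{array}
\]
which is now a convex problem.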


(c) The following Matlab script solves the problem.

grav_feed_network_data;
D1 = diag(alpha*(-A'*h))*diag(1./L);

cvx_begin
variables z(m) F(m,N) S(k,N)
minimize(L'*z)
subject to
F >= 0
Rmin.^2 <= z
z <= Rmax.^2
A*F == [-S;C]
S >= 0
max(S')' <= Smax % largest supply over the scenarios, per supply node
max(F')' <= D1*z % largest flow over the scenarios, per pipe
cvx_end
R = sqrt(z); % pipe radii
D2 = diag(R.^2);
theta = inv(D1*D2)*F; % valve opening

The optimal value was found to be 313.024 and the pipe radii are given in the table below.
pipe radius
1 0.5
2 0.5
3 2.21
4 0.5
5 0.5
6 0.5
7 1.93
8 1.23
9 0.5
10 2.5
11 2.5
12 0.5
13 0.5
14 1.20
15 0.5
16 2.5
17 1.05
18 1.89
19 1.73
20 0.5
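The recovery step used in part (b), going from the solution variables $z$ and $f^{(p)}$ back to the pipe radii and valve openings, can be sketched as follows. The function name, the argument `d1` (standing for the diagonal of $D_1$), and the two-pipe numbers are assumptions made for this illustration.

```python
import math


def recover_radii_and_valves(z, flows, d1):
    """Recover R_j = sqrt(z_j) and theta_j = f_j / (d1_j * z_j).

    z[j]     squared radius of pipe j
    flows[j] flow in pipe j for one consumption scenario
    d1[j]    j-th diagonal entry of D1, i.e. alpha * (h_i - h_l) / L_j
    """
    radii = [math.sqrt(z_j) for z_j in z]
    theta = [f_j / (d1_j * z_j) for f_j, d1_j, z_j in zip(flows, d1, z)]
    return radii, theta


# Hypothetical values for a network with two pipes.
z = [4.0, 0.25]
d1 = [2.0, 1.0]
flows = [2.0, 0.25]
radii, theta = recover_radii_and_valves(z, flows, d1)
print(radii, theta)
```

Because the flows satisfy the capacity constraint in this example, every recovered valve opening lies in the interval from 0 to 1; the second pipe is fully open.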
17.8 Optimal political positioning. A political constituency is a group of voters with similar views on a
set of political issues. The electorate (i.e., the set of voters in some election) is partitioned (by a
political analyst) into K constituencies, with (nonnegative) populations P1 , . . . , PK . A candidate in
the election has an initial or prior position on each of n issues, but is willing to consider (presumably
small) deviations from her prior positions in order to maximize the total number of votes she will
receive. We let $x_i \in \mathbf{R}$ denote the change in her position on issue $i$, measured on some appropriate
scale. (You can think of $x_i < 0$ as a move to the left and $x_i > 0$ as a move to the right on the
issue, if you like.) The vector $x \in \mathbf{R}^n$ characterizes the changes in her position on all issues; $x = 0$
represents the prior positions. On each issue she has a limit on how far in each direction she is
willing to move, which we express as $l \preceq x \preceq u$, where $l \prec 0$ and $u \succ 0$ are given.
The candidate's position change $x$ affects the fraction of voters in each constituency that will vote
for her. This fraction is modeled as a logistic function,
\[
f_k = g(w_k^T x + v_k), \quad k = 1, \ldots, K.
\]
Here $g(z) = 1/(1 + \exp(-z))$ is the standard logistic function, and $w_k \in \mathbf{R}^n$ and $v_k \in \mathbf{R}$ are given
data that characterize the views of constituency $k$ on the issues. Thus the total number of votes
the candidate will receive is
\[
V = P_1 f_1 + \cdots + P_K f_K.
\]
The problem is to choose $x$ (subject to the given limits) so as to maximize $V$. The problem data
are $l$, $u$, and $P_k$, $w_k$, and $v_k$ for $k = 1, \ldots, K$.
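The vote model can be evaluated with a few lines of plain Python. The function names and the two-constituency, two-issue data are assumptions made for this illustration; they are not the data from the exercise.

```python
import math


def logistic(z):
    """Standard logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def total_votes(x, P, W, v):
    """Votes per constituency and their total V for position change x.

    P[k] is the population, W[k] the vector w_k and v[k] the offset v_k.
    """
    votes = []
    for P_k, w_k, v_k in zip(P, W, v):
        score = sum(w_i * x_i for w_i, x_i in zip(w_k, x)) + v_k
        votes.append(P_k * logistic(score))
    return votes, sum(votes)


# Hypothetical data: K = 2 constituencies, n = 2 issues.
P = [100.0, 200.0]
W = [[1.0, 0.0], [-1.0, 2.0]]
v = [0.0, 0.0]
votes_prior, V_prior = total_votes([0.0, 0.0], P, W, v)
votes_shift, V_shift = total_votes([0.0, 0.5], P, W, v)
print(V_prior, V_shift)
```

In this example, moving on the second issue only changes the support of the second constituency, so the total goes up without losing any votes from the first.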

(a) The general political positioning problem. Show that the objective function V need not be
quasiconcave. (This means that the general optimal political positioning problem is not a
quasiconvex problem, and therefore also not a convex problem.) In other words, choose prob-
lem data for which V is not a quasiconcave function of x.
(b) The partisan political positioning problem. Now suppose the candidate focuses only on her
core constituencies, i.e., those for which a significant fraction will vote for her. In this case
we interpret the $K$ constituencies as her core constituencies; we assume that $v_k \ge 0$, which
means that with her prior position $x = 0$, at least half of each of her core constituencies will
vote for her. We add the constraint that $w_k^T x + v_k \ge 0$ for each $k$, which means that she will
not take positions that alienate a majority of voters from any of her core constituencies.
Show that the partisan political positioning problem (i.e., maximizing $V$ with the additional
assumptions and constraints) is convex.
(c) Numerical example. Find the optimal positions for the partisan political positioning problem
with data given in opt_pol_pos_data.m. Report the number of votes from each constituency
under the politician's prior positions ($x = 0$) and optimal positions, as well as the total number
of votes $V$ in each case.
You may use the function

\[ g_{\rm approx}(z) = \min\{1,\ g(i) + g'(i)(z - i) \mbox{ for } i = 0, 1, 2, 3, 4\} \]

as an approximation of $g$ for $z \ge 0$. (The function $g_{\rm approx}$ is also an upper bound on $g$ for
$z \ge 0$.) For your convenience, we have included function definitions for $g$ and $g_{\rm approx}$ (g and
gapx, respectively) in the data file. You should report the results (votes from each constituency
and total) using $g$, but be sure to check that these numbers are close to the results using $g_{\rm approx}$
(say, within one percent or so).

Solution.

(a) The simplest counterexample has only one issue and two constituencies. We can take (for
example) $w_1 = 1$, $w_2 = -1$, and $v_1 = v_2 < 0$, and (say) $P_1 = P_2 = 1$. This means that
neither constituency is very supportive of the position $x = 0$; one wants the position to be
negative, and the other positive. As a result, the total number of votes $V$ increases for either
$x$ decreasing or increasing. It follows that $V$ is not quasiconcave. The plot below shows $V$ as
a function of $x$.

[Figure: total votes $V$ versus $x$ for the counterexample, $-4 \le x \le 4$.]
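The counterexample can also be checked numerically. The sketch below uses only the Python standard library and the particular choice $v_1 = v_2 = -1$, which is the value used in the plotting code of this solution; the function name total_votes is ours. Quasiconcavity would require $V(0) \ge \min\{V(-4), V(4)\}$, since $0$ is the midpoint of $-4$ and $4$, so one violated inequality is enough.

```python
import math

def g(z):
    """Standard logistic function g(z) = 1/(1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def total_votes(x, v=-1.0):
    """V(x) for one issue, w1 = 1, w2 = -1, v1 = v2 = v, P1 = P2 = 1."""
    return g(x + v) + g(-x + v)

v_left = total_votes(-4.0)
v_mid = total_votes(0.0)
v_right = total_votes(4.0)
print(v_left, v_mid, v_right)

# A quasiconcave V would satisfy V(0) >= min(V(-4), V(4)).
violates_quasiconcavity = v_mid < min(v_left, v_right)
print(violates_quasiconcavity)
```

The two end points give the same value by symmetry, and the midpoint value is strictly smaller, so the superlevel sets of $V$ are not convex.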

(b) The logistic function $g$ is neither convex nor concave. But if we restrict the argument to
nonnegative values, it is concave. In the partisan positioning problem, our constraint of not
alienating any constituency, i.e., $w_k^T x + v_k \ge 0$, means that for feasible $x$, $g(w_k^T x + v_k)$ is
concave. Thus the total number of votes $V$, which is a nonnegative weighted sum of the
concave functions $g(w_k^T x + v_k)$, is also concave. The constraints, which are the lower and
upper bounds on $x$, as well as the constraints $w_k^T x + v_k \ge 0$, are clearly convex.
(c) We didn't ask you to check the given approximation of $g$, but we do so here for the record.
The first plot below shows $g$ and $g_{\rm approx}$; the second shows the error $g_{\rm approx} - g$. The maximum
error is around .01. (We could easily reduce this by adding more terms to the minimum.)

[Figure: $g$ and $g_{\rm approx}$ versus $z$, for $0 \le z \le 6$.]

[Figure: approximation error $g_{\rm approx} - g$ versus $z$, for $0 \le z \le 6$.]
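The same check of the approximation can be sketched without the data file. The code below is a standard-library Python version, not the g and gapx functions shipped with the exercise; it builds $g_{\rm approx}$ from the tangent lines of $g$ at $i = 0, \ldots, 4$ and measures the error on a grid over $[0, 6]$.

```python
import math

def g(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def g_prime(z):
    """Derivative of the logistic function, g'(z) = g(z)(1 - g(z))."""
    s = g(z)
    return s * (1.0 - s)

def g_approx(z, knots=(0, 1, 2, 3, 4)):
    """Minimum of 1 and the tangent lines of g at the given knots."""
    return min([1.0] + [g(i) + g_prime(i) * (z - i) for i in knots])

grid = [k * 0.01 for k in range(601)]          # z = 0, 0.01, ..., 6
errors = [g_approx(z) - g(z) for z in grid]
max_error = max(errors)
min_error = min(errors)
print(max_error, min_error)
```

Because $g$ is concave for $z \ge 0$, every tangent line taken at a nonnegative point lies above $g$ there, so the error should never be negative; the largest error on this grid is consistent with the value of about .01 quoted above.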

The following code solves the problem:


% optimal political positioning.
% ee364a
opt_pol_pos_data;

% Counterexample to quasiconcavity for general political positioning problem
plot_counterexample = true;
if plot_counterexample
    x = (-4:.01:4);
    v_counterexample = -1;
    figure;
    plot(x,g(x+v_counterexample)+g(-x+v_counterexample))
    xlabel('x')
    ylabel('V')
    print -deps opt_pol_pos_counterexample.eps
end

% Plot approximation quality
plot_apx_quality = true;
if plot_apx_quality
    x = 0:.01:6;
    figure;
    plot(x,gapx(x),'r',x,g(x),'b')
    legend('gapprox','g','Location','Best')
    xlabel('z')
    print -deps opt_pol_pos_logit_approx.eps;

    figure;
    plot(x,gapx(x)-g(x))
    ylabel('gapproxminusg')
    xlabel('z')
    print -deps opt_pol_pos_logit_error.eps;
end

% Compute optimal positions for partisan political positioning problem
% (P is taken to be a column vector, hence the transposes below)
cvx_begin
    variable x(n)
    maximize(P'*gapx(W*x + v))
    subject to
        x >= l;
        x <= u;
        W*x + v >= 0;
cvx_end

% Vote before position optimization
Vi = P'*g(v)
Viapx = P'*gapx(v)
% Vote after position optimization
Vf = P'*g(W*x + v)
Vfapx = P'*gapx(W*x + v)
% Votes per constituency, before and after
delta = [P.*g(v),P.*g(W*x+v)]
The total vote improves by 16% from 373432 to 433884. The changes in votes for each
constituency are given below. We can see that the change in positions increases the number
of votes from most constituencies, but also lowers the number of votes in a few.
These numbers are computed using $g$, not the approximation $g_{\rm approx}$. But if we calculate
these numbers using $g_{\rm approx}$, they differ by less than 1%. A stronger statement can be made
by noting that $g_{\rm approx}$ is an upper bound on $g$ when the argument is nonnegative, so the
optimal value of our problem using $g_{\rm approx}$, which is 435579, is an upper bound on the true
optimal value. Since the objective value we obtained is 433884, we are guaranteed that this is
no more than 0.4% suboptimal. (We didn't ask you to do this analysis.)
Initial Vote Final Vote

48295 52894
21352 28223
27384 40443
26583 22150
46370 47055
41511 35228
25744 34561
14850 10463
31419 43309
24806 21445
35174 41241
29945 56873
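The percentages quoted above can be reproduced from the table and the two totals. The sketch below is plain Python arithmetic on the numbers printed in this solution; the per-constituency figures are rounded, so their sums can differ from the reported totals by a vote or so.

```python
initial = [48295, 21352, 27384, 26583, 46370, 41511,
           25744, 14850, 31419, 24806, 35174, 29945]
final = [52894, 28223, 40443, 22150, 47055, 35228,
         34561, 10463, 43309, 21445, 41241, 56873]

V_initial = 373432      # total reported for x = 0
V_final = 433884        # total reported at the computed x (using g)
V_upper = 435579        # optimal value of the problem solved with g_approx

improvement = (V_final - V_initial) / V_initial
# g_approx >= g on the feasible set, so V_upper bounds the true optimum.
suboptimality_bound = (V_upper - V_final) / V_final
losers = [k + 1 for k in range(len(initial)) if final[k] < initial[k]]

print(round(100 * improvement, 1), round(100 * suboptimality_bound, 2), losers)
```

The improvement comes out at roughly 16%, the suboptimality bound at just under 0.4%, and four constituencies lose votes, in line with the remarks above.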
17.9 Resource allocation in stream processing. A large data center is used to handle a stream of $J$ types
of jobs. The traffic (number of instances per second) of each job type is denoted $t \in \mathbf{R}^J_+$. Each
instance of each job type (serially) invokes or calls a set of processes. There are $P$ types of processes,
and we describe the job-process relation by the $P \times J$ matrix
\[ R_{pj} = \begin{cases} 1 & \mbox{job $j$ invokes process $p$} \\ 0 & \mbox{otherwise.} \end{cases} \]

The process loads (number of instances per second) are given by $\lambda = Rt \in \mathbf{R}^P$, i.e., $\lambda_p$ is the sum
of the traffic from the jobs that invoke process $p$.
The latency of a process or job type is the average time that it takes one instance to complete.
These are denoted $l^{\rm proc} \in \mathbf{R}^P$ and $l^{\rm job} \in \mathbf{R}^J$, respectively, and are related by $l^{\rm job} = R^T l^{\rm proc}$, i.e.,
$l^{\rm job}_j$ is the sum of the latencies of the processes called by $j$. Job latency is important to users, since
$l^{\rm job}_j$ is the average time the data center takes to handle an instance of job type $j$. We are given a
maximum allowed job latency: $l^{\rm job} \preceq l^{\max}$.

The process latencies depend on the process load and also on how much of $n$ different resources are
made available to them. These resources might include, for example, number of cores, disk storage,
and network bandwidth. Here, we represent amounts of these resources as (nonnegative) real
numbers, so $x_p \in \mathbf{R}^n_+$ represents the resources allocated to process $p$. The process latencies are
given by
\[ l^{\rm proc}_p = \psi_p(x_p, \lambda_p), \quad p = 1, \ldots, P, \]
where $\psi_p : \mathbf{R}^n \times \mathbf{R} \to \mathbf{R} \cup \{\infty\}$ is a known (extended-valued) convex function. These functions are
nonincreasing in their first (vector) arguments, and nondecreasing in their second arguments (i.e.,
more resources or less load cannot increase latency). We interpret $\psi_p(x_p, \lambda_p) = \infty$ to mean that
the resources given by $x_p$ are not sufficient to handle the load $\lambda_p$.
We wish to allocate a total resource amount $x^{\rm tot} \in \mathbf{R}^n_{++}$ among the $P$ processes, so we have
$\sum_{p=1}^P x_p \preceq x^{\rm tot}$. The goal is to minimize the objective function

\[ \sum_{j=1}^J w_j (t^{\rm tar}_j - t_j)_+, \]

where $t^{\rm tar}_j$ is the target traffic level for job type $j$, $w_j > 0$ give the priorities, and $(u)_+$ is the
nonnegative part of a vector, i.e., $((u)_+)_i = \max\{u_i, 0\}$. (Thus the objective is a weighted penalty for
missing the target job traffic.) The variables are $t \in \mathbf{R}^J_+$ and $x_p \in \mathbf{R}^n_+$, $p = 1, \ldots, P$. The problem
data are the matrix $R$, the vectors $l^{\max}$, $x^{\rm tot}$, $t^{\rm tar}$, and $w$, and the functions $\psi_p$, $p = 1, \ldots, P$.

(a) Explain why this is a convex optimization problem.


(b) Solve the problem instance with data given in res_alloc_stream_data.m, with latency functions
\[ \psi_p(x_p, \lambda_p) = \begin{cases} 1/(a_p^T x_p - \lambda_p) & a_p^T x_p > \lambda_p, \ x_p \succeq x_p^{\min} \\ \infty & \mbox{otherwise} \end{cases} \]
where $a_p \in \mathbf{R}^n_{++}$ and $x_p^{\min} \in \mathbf{R}^n_{++}$ are given data. The vectors $a_p$ and $x_p^{\min}$ are stored as the
columns of the matrices A and x_min, respectively.
Give the optimal objective value and job traffic. Compare the optimal job traffic with the
target job traffic.

Solution.

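(a) The problem is
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{j=1}^J w_j (t^{\rm tar}_j - t_j)_+ \\
\mbox{subject to} & \sum_p x_p \preceq x^{\rm tot} \\
& x_p \succeq 0, \quad p = 1, \ldots, P \\
& t \succeq 0 \\
& \lambda = Rt \\
& l^{\rm job} = R^T l^{\rm proc} \\
& l^{\rm job} \preceq l^{\max}
\end{array}
\]
with variables $x_p$ and $t$, where $l^{\rm proc}_p = \psi_p(x_p, \lambda_p)$.
The variables are $x_p$ and $t$, so the constraints $t \succeq 0$, $x_p \succeq 0$, and $\sum_p x_p \preceq x^{\rm tot}$ are convex,
as are the implicit constraints $\psi_p(x_p, \lambda_p) < \infty$. The objective is a positive weighted sum of
convex functions of the variables, and so is convex. The process loads $\lambda = Rt$ are linear in the variables,
so each entry of $l^{\rm proc}$ is a convex function of the variables (a convex function composed with an
affine map), and therefore (since the coefficients of $R^T$ are nonnegative) so is every entry of $l^{\rm job}$.
Thus the constraints $l^{\rm job} \preceq l^{\max}$ are convex.
The constraint $x_p \succeq x_p^{\min}$ used in part (b) is also convex.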
(b) The following code solves the given problem instance.

% resource allocation for stream processing.

res_alloc_stream_data;

% solution
% (x_tot, t_tar, w, and l_max are taken to be column vectors)
cvx_begin
    variable x(n,P)   % resource allocation, one column per process
    variable t(J)     % job traffic
    t >= 0;
    sum(x,2) <= x_tot;   % resource limits (sum over processes)
    lambda = R*t;        % process loads
    x >= x_min;          % minimum allowable resources for the processes
    lproc = inv_pos(sum(A.*x)'-lambda);   % process latencies
    ljob = R'*lproc;     % job latencies
    ljob <= l_max;       % job latency limit
    minimize (w'*pos(t_tar-t))
cvx_end

cvx_optval

[t t_tar]

[ljob l_max]

The optimal objective value is 7.74. All job types are handled at their target loads, except for
job 4. (We should not be surprised by sparsity of the job load missed target vector.) All the
latency limits are respected (as we require); some of the jobs have latency smaller than the
maximum allowed value.

17.10 Optimal parimutuel betting. In parimutuel betting, participants bet nonnegative amounts on each
of n outcomes, exactly one of which will actually occur. (For example, the outcome can be which
of n horses wins a race.) The total amount bet by all participants on all outcomes is called the pool
or tote. The house takes a commission from the pool (typically around 20%), and the remaining
pool is divided among those who bet on the outcome that occurs, in proportion to their bets on
the outcome. This problem concerns the choice of the amount to bet on each outcome.
Let $x_i \ge 0$ denote the amount we bet on outcome $i$, so the total amount we bet on all outcomes is
$\mathbf{1}^T x$. Let $a_i > 0$ denote the amount bet by all other participants on outcome $i$, so after the house
commission, the remaining pool is $P = (1 - c)(\mathbf{1}^T a + \mathbf{1}^T x)$, where $c \in (0, 1)$ is the house commission
rate. Our payoff if outcome $i$ occurs is then
\[ p_i = \left( \frac{x_i}{x_i + a_i} \right) P. \]

The goal is to choose $x$, subject to $\mathbf{1}^T x = B$ (where $B$ is the total amount to be bet, which is
given), so as to maximize the expected utility
\[ \sum_{i=1}^n \pi_i U(p_i), \]

where $\pi_i$ is the probability that outcome $i$ occurs, and $U$ is a concave increasing utility function,
with $U(0) = 0$. You can assume that $a_i$, $\pi_i$, $c$, $B$, and the function $U$ are known.

(a) Explain how to find an optimal x using convex or quasiconvex optimization. If you use a
change of variables, be sure to explain how your variables are related to x.
(b) Suggest a fast method for computing an optimal x. You can assume that U is strictly concave,
and that scalar optimization problems involving U (such as evaluating the conjugate of U )
are easily and quickly solved.

Remarks.

• To carry out this betting strategy, you'd need to know $a_i$, and then be the last participant
to place your bets (so that $a_i$ don't subsequently change). You'd also need to know the
probabilities $\pi_i$. These could be estimated using sophisticated machine learning techniques or
insider information.
• The formulation above assumes that the total amount to bet (i.e., $B$) is known. If it is not
known, you could solve the problem above for a range of values of $B$ and use the value of $B$
that yields the largest optimal expected utility.

Solution.

(a) It turns out the problem is convex, exactly as stated. The constraints $\mathbf{1}^T x = B$ and $x \succeq 0$
are clearly convex. The objective is concave, which can be seen as follows. Since we know
$\mathbf{1}^T x = B$, the remaining pool is a (known) constant, $P = (1 - c)(\mathbf{1}^T a + B)$. The objective is
a convex combination of terms of the form
\[ U\left( \frac{P x_i}{x_i + a_i} \right), \]

which are concave functions of $x$. To see this we note that $x_i/(x_i + a_i) = 1 - a_i/(x_i + a_i)$ is concave in $x_i$ for
$x_i \ge 0$, and by the composition rules, a concave increasing function of a concave function is
concave.
(b) The problem can be expressed as
\[
\begin{array}{ll}
\mbox{maximize} & \sum_{i=1}^n f_i(x_i) \\
\mbox{subject to} & \mathbf{1}^T x = B, \quad x \succeq 0,
\end{array}
\]
where
\[ f_i(x_i) = \pi_i U(P x_i/(x_i + a_i)) \]
(which are concave functions of $x_i$, as noted in part (a)). We can use a water-filling method
to solve the problem.
We form the partial Lagrangian (keeping $x \succeq 0$ implicit)
\[ L(x, \nu) = \sum_{i=1}^n f_i(x_i) + \nu (B - \mathbf{1}^T x), \]

with $\mathop{\bf dom} L = \mathbf{R}^n_+ \times \mathbf{R}$. This is separable in $x$, so to maximize it over $x$ we simply maximize
over each $x_i$ separately:
\[ x_i = \mathop{\rm argmax}_{x_i \ge 0} \left( f_i(x_i) - \nu x_i \right). \]

Our assumption that $U$ is strictly concave implies that $f_i$ is strictly concave, so this has a
unique solution. This is a condition sufficient to allow us to solve the primal problem by solving
the dual. These scalar maximization problems are easily solved (that's our assumption), so
we need only adjust $\nu$ so that $\mathbf{1}^T x = B$. Since each $x_i$ is monotone nonincreasing in $\nu$, we can
use bisection.
When $\nu$ is negative, the problems are unbounded above, so we have $\nu \ge 0$. When
$\nu \ge \max_i f_i'(0)$ (we assume here $f_i$ are differentiable at 0), we have $x = 0$. So we can start bisection
with the initial interval $[0, \max_i f_i'(0)]$.
Each step of the bisection requires solving $n$ scalar optimization problems, and we have to
perform a modest number (say, 20) of bisection steps.
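The bisection described above can be sketched in a few lines of standard-library Python. Everything specific here is an assumption made for illustration: the utility $U(p) = \log(1 + p)$ (strictly concave, increasing, $U(0) = 0$), the three-outcome data, and the function name solve_bets. The scalar problems are solved by an inner bisection on $f_i'(x_i) = \nu$, and each $x_i$ is capped at $B$, which is harmless since any feasible bet satisfies $x_i \le B$.

```python
def solve_bets(a, pi, c, B, U_prime, iters=60):
    """Water-filling by bisection on the multiplier nu (illustrative sketch)."""
    n = len(a)
    P = (1.0 - c) * (sum(a) + B)   # remaining pool; constant because 1^T x = B

    def f_prime(i, x):
        # derivative of f_i(x) = pi_i * U(P x / (x + a_i)); decreasing in x
        payoff = P * x / (x + a[i])
        return pi[i] * U_prime(payoff) * P * a[i] / (x + a[i]) ** 2

    def x_of_nu(i, nu):
        # maximizer of f_i(x) - nu * x over 0 <= x <= B
        if f_prime(i, 0.0) <= nu:
            return 0.0
        if f_prime(i, B) >= nu:
            return B
        lo, hi = 0.0, B
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if f_prime(i, mid) > nu:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    nu_lo, nu_hi = 0.0, max(f_prime(i, 0.0) for i in range(n))
    for _ in range(iters):
        nu = 0.5 * (nu_lo + nu_hi)
        total = sum(x_of_nu(i, nu) for i in range(n))
        if total > B:
            nu_lo = nu    # betting too much: raise nu
        else:
            nu_hi = nu    # betting too little: lower nu
    nu = 0.5 * (nu_lo + nu_hi)
    return [x_of_nu(i, nu) for i in range(n)], nu

# Hypothetical data: three outcomes, 20% commission, log utility.
a = [100.0, 50.0, 20.0]
pi = [0.5, 0.3, 0.2]
x, nu = solve_bets(a, pi, c=0.2, B=10.0, U_prime=lambda p: 1.0 / (1.0 + p))
print(x, nu)
```

The sketch runs 60 bisection steps simply to make the budget constraint hold to high accuracy; the 20 or so steps suggested above would be enough in practice.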

17.11 Perturbing a Hamiltonian to maximize an energy gap. A finite dimensional approximation of
a quantum mechanical system is described by its Hamiltonian matrix $H \in \mathbf{S}^n$. We label the
eigenvalues of $H$ as $\lambda_1 \le \cdots \le \lambda_n$, with corresponding orthonormal eigenvectors $v_1, \ldots, v_n$. In this
context the eigenvalues are called the energy levels of the system, and the eigenvectors are called
the eigenstates. The eigenstate $v_1$ is called the ground state, and $\lambda_1$ is the ground energy. The
energy gap (between the ground and next state) is $\lambda_2 - \lambda_1$.
By changing the environment (say, applying external fields), we can perturb a nominal Hamiltonian
matrix to obtain the perturbed Hamiltonian, which has the form
\[ H = H^{\rm nom} + \sum_{i=1}^k x_i H_i. \]

Here $H^{\rm nom} \in \mathbf{S}^n$ is the nominal (unperturbed) Hamiltonian, $x \in \mathbf{R}^k$ gives the strength or value of
the perturbations, and $H_1, \ldots, H_k \in \mathbf{S}^n$ characterize the perturbations. We have limits for each
perturbation, which we express as $|x_i| \le 1$, $i = 1, \ldots, k$. The problem is to choose $x$ to maximize
the gap of the perturbed Hamiltonian, subject to the constraint that the perturbed Hamiltonian
$H$ has the same ground state (up to scaling, of course) as the unperturbed Hamiltonian $H^{\rm nom}$. The
problem data are the nominal Hamiltonian matrix $H^{\rm nom}$ and the perturbation matrices $H_1, \ldots, H_k$.

(a) Explain how to formulate this as a convex or quasiconvex optimization problem. If you change
variables, explain the change of variables clearly.

(b) Carry out the method of part (a) for the problem instance with data given in hamiltonian_gap_data.m.
Give the optimal perturbations, and the energy gap for the nominal and perturbed systems.
The data Hi are given as a cell array; H{i} gives Hi .

Solution.

(a) Let $v^{\rm nom}$ be the ground state of the nominal Hamiltonian. The condition that it be an
eigenstate of the perturbed Hamiltonian is $H v^{\rm nom} = \lambda v^{\rm nom}$ for some $\lambda \in \mathbf{R}$, which is a linear
equality constraint on $x$ and $\lambda$. The condition $\lambda_{\min}(H) \ge \lambda$, which is convex, ensures that $\lambda$ is
the ground energy level of the perturbed Hamiltonian. (It can be shown that this constraint
is not needed.)
We still have to deal with the gap. In general, the second lowest eigenvalue of a symmetric
matrix is not a concave function. But here we know the eigenvector associated with the lowest
eigenvalue. The trick is to form a new matrix whose smallest eigenvalue is exactly $\lambda_2$ for the
perturbed Hamiltonian.
One way is to project onto the subspace orthogonal to $v^{\rm nom}$, which we do by finding a matrix
$Q \in \mathbf{R}^{n \times (n-1)}$ whose columns are an orthonormal basis for $\mathcal{N}((v^{\rm nom})^T)$. The matrix $Q^T H Q$,
which is in $\mathbf{S}^{n-1}$, has eigenvalues $\lambda_2(H), \ldots, \lambda_n(H)$, so $\lambda_{\min}(Q^T H Q) = \lambda_2$. This is a concave
function of $x$, so we're done. We end up with the problem
\[
\begin{array}{ll}
\mbox{maximize} & \lambda_{\min}(Q^T H Q) - \lambda \\
\mbox{subject to} & H v^{\rm nom} = \lambda v^{\rm nom}, \quad \|x\|_\infty \le 1,
\end{array}
\]
with variables $x$ and $\lambda$.

There are several other ways to get a handle on $\lambda_2$. The matrix
\[ \tilde H = H + \alpha v^{\rm nom} (v^{\rm nom})^T \]
has eigenvalues
\[ \lambda_1(H) + \alpha, \ \lambda_2(H), \ldots, \lambda_n(H). \]
Thus we have
\[ \lambda_{\min}(\tilde H) = \min\{\lambda_1(H) + \alpha, \ \lambda_2(H)\}. \]
This is a concave function of $x$ and $\alpha$; if we maximize this function over $\alpha$, we can take any
value that ends up with
\[ \lambda_{\min}(\tilde H) = \lambda_2(H), \]
which is just what we wanted. So another formulation is
\[
\begin{array}{ll}
\mbox{maximize} & \lambda_{\min}(H + \alpha v^{\rm nom} (v^{\rm nom})^T) - \lambda \\
\mbox{subject to} & H v^{\rm nom} = \lambda v^{\rm nom}, \quad \|x\|_\infty \le 1,
\end{array}
\]
with variables $x$, $\lambda$, and $\alpha$.


(b) The following code solves the problem (using both ways, just to check).
% hamiltonian perturbation optimization
% ee364a, s. boyd
hamiltonian_gap_data;

[V,D]=eig(Hnom);
v_nom = V(:,1); % nominal ground state

cvx_begin
    variables x(k) lambda
    variable alpha
    x >= -1;
    x <= 1;
    H = Hnom;
    for i=1:k
        H=H+x(i)*Hpert{i};
    end
    H*v_nom == lambda*v_nom; % preserve ground state
    % first method
    Q = null(v_nom'); % columns of Q are o.n. basis for v_nom^\perp
    maximize (lambda_min(Q'*H*Q)-lambda)
    % second method
    %maximize (lambda_min(H + alpha*v_nom*v_nom') -lambda)
cvx_end

nom_energies = eig(Hnom);
pert_energies = eig(H);

[nom_energies pert_energies]

nom_gap = nom_energies(2)-nom_energies(1)

pert_gap = pert_energies(2)-pert_energies(1)
The optimal perturbations are given below. We can see that several of the perturbations are
at the limits. The energies for the unperturbed and perturbed systems are also given. In the
listing, both the ground energy and the second energy level are lower after the perturbation, but the
ground energy drops by more. The gap increases by a factor around two, from 3.56 to 7.14.
x =

0.3546
0.1552
0.8463
-0.5188
0.1474
-0.7314
0.3515
-1.0000

-0.4400
-0.2201
-0.3730
0.6952
-0.5168
-1.0000
-1.0000

ans =

-7.2432 -21.2974
-3.6805 -14.1566
-2.1072 -12.7372
-1.3870 -10.2474
-1.0429 -5.4235
-0.2360 -1.8341
2.4893 -1.2973
3.6856 3.1430
5.5773 11.8985
8.2111 22.3972

nom_gap =

3.5627

pert_gap =

7.1408

17.12 Theory-applications split in a course. A professor teaches an advanced course with 20 lectures,
labeled i = 1, . . . , 20. The course involves some interesting theoretical topics, and many practical
applications of the theory. The professor must decide how to split each lecture between theory and
applications. Let Ti and Ai denote the fraction of the ith lecture devoted to theory and applications,
for $i = 1, \ldots, 20$. (We have $T_i \ge 0$, $A_i \ge 0$, and $T_i + A_i = 1$.)
A certain amount of theory has to be covered before the applications can be taught. We model
this in a crude way as

\[ A_1 + \cdots + A_i \le \phi(T_1 + \cdots + T_i), \quad i = 1, \ldots, 20, \]

where $\phi : \mathbf{R} \to \mathbf{R}$ is a given nondecreasing function. We interpret $\phi(u)$ as the cumulative amount
of applications that can be covered, when the cumulative amount of theory covered is $u$. We will
use the simple form $\phi(u) = a(u - b)_+$, with $a, b > 0$, which means that no applications can be
covered until $b$ lectures of the theory are covered; after that, each lecture of theory covered opens
the possibility of covering $a$ lectures on applications.
The theory-applications split affects the emotional state of students differently. We let $s_i$ denote
the emotional state of a student after lecture $i$, with $s_i = 0$ meaning neutral, $s_i > 0$ meaning happy,
and $s_i < 0$ meaning unhappy. Careful studies have shown that $s_i$ evolves via a linear recursion
(dynamics)
\[ s_i = (1 - \theta) s_{i-1} + \theta(\alpha T_i + \beta A_i), \quad i = 1, \ldots, 20, \]
with $s_0 = 0$. Here $\alpha$ and $\beta$ are parameters (naturally interpreted as how much the student likes or
dislikes theory and applications, respectively), and $\theta \in [0, 1]$ gives the emotional volatility of the
student (i.e., how quickly he or she reacts to the content of recent lectures). The student's terminal
emotional state is $s_{20}$.
Now consider a specific instance of the problem, with course material parameters $a = 2$, $b = 3$, and
three groups of students, with emotional dynamics parameters given as follows.

            Group 1   Group 2   Group 3
$\theta$      0.05      0.1       0.3
$\alpha$     -0.1       0.8      -0.3
$\beta$       1.4      -0.3       0.7
Find (four different) theory-applications splits that maximize the terminal emotional state of the
first group, the terminal emotional state of the second group, the terminal emotional state of the
third group, and, finally, the minimum of the terminal emotional states of all three groups.
For each case, plot Ti and the emotional state si for the three groups, versus i. Report the numerical
values of the terminal emotional states for each group, for each of the four theory-applications splits.
Solution. Because of the way that $\phi$ was chosen, the first $b$ lectures have to be theory only,
i.e., $T_i = 1$, $A_i = 0$ for $i = 1, \ldots, b$. Using this observation, we can rewrite the condition on the
theory-applications split:

\[ A_{b+1} + \cdots + A_i \le a(T_{b+1} + \cdots + T_i), \quad i = b + 1, \ldots, 20. \]

This is a linear inequality in the variables $T_{b+1}, \ldots, T_{20}$, and $A_{b+1}, \ldots, A_{20}$.
Note that each $s_i$ is also linear in the same set of variables. Thus, maximizing the given objective
functions is a linear program.
Solving the numerical instance gives the following plots. The black curve shows Ti . The red, green,
and blue curves show the emotional states of the three student groups.

[Figure: four panels, Plan 1 through Plan 4, each showing $T_i$ and the emotional states $s_i$ of the three groups versus lecture $i = 1, \ldots, 20$.]

The terminal emotional states for the three groups under the four different lecture plans are given
by the table below.

Group 1 Group 2 Group 3


Plan 1 0.597 -0.064 0.682
Plan 2 -0.064 0.703 -0.300
Plan 3 0.597 -0.064 0.682
Plan 4 0.306 0.306 0.306
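The recursion and the feasibility condition are easy to check by direct simulation. The sketch below is standard-library Python with our own function names simulate and feasible. It evaluates one candidate split, the all-theory plan $T_i = 1$ for every lecture, and the terminal states it gives agree with the Plan 2 row of the table to three decimals. That is consistent with (though it does not by itself prove) Plan 2 being an all-theory course.

```python
def simulate(T, theta, alpha, beta):
    """Emotional states s_1, ..., s_n for theory fractions T, with s_0 = 0."""
    s, states = 0.0, []
    for Ti in T:
        s = (1.0 - theta) * s + theta * (alpha * Ti + beta * (1.0 - Ti))
        states.append(s)
    return states

def feasible(T, a=2.0, b=3.0, tol=1e-9):
    """Check 0 <= T_i <= 1 and A_1+...+A_i <= a * (T_1+...+T_i - b)_+ for all i."""
    cum_T = cum_A = 0.0
    for Ti in T:
        if Ti < -tol or Ti > 1.0 + tol:
            return False
        cum_T += Ti
        cum_A += 1.0 - Ti
        if cum_A > a * max(cum_T - b, 0.0) + tol:
            return False
    return True

# (theta, alpha, beta) for the three groups, from the table in the problem
groups = [(0.05, -0.1, 1.4), (0.1, 0.8, -0.3), (0.3, -0.3, 0.7)]

all_theory = [1.0] * 20
terminal = [simulate(all_theory, *grp)[-1] for grp in groups]
print(feasible(all_theory), [round(s, 3) for s in terminal])
```

An all-applications plan, by contrast, fails the feasibility check at the first lecture, since no theory has yet been covered.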

The following code solves the problem and generates the plots shown above.

% theory-applications split in a course
clear all; clf;

% set up the parameters
a = 2; b = 3;
theta = [ 0.05; 0.1; 0.3];
alpha = [-0.1; 0.8; -0.3];
beta = [ 1.4; -0.3; 0.7];
n = 20; % number of lectures
m = 3; % number of student groups

for plan = 1:m+1
    cvx_begin quiet
        variable T(n)
        expressions s(m, n+1) obj

        % compute the emotional states (s(:,1) is s_0 = 0)
        for i = 1:n
            s(:, i+1) = (1-theta).*s(:, i) ...
                + theta.*(alpha*T(i)+beta*(1-T(i)));
        end

        if plan == 4
            obj = min(s(:, n+1));
        else
            obj = s(plan, n+1);
        end

        maximize obj
        subject to
            T >= 0;
            T <= 1;
            T(1:b) == 1;
            cumsum(1-T(b+1:n)) <= a*cumsum(T(b+1:n));
    cvx_end

    % plot
    subplot(4, 1, plan);
    plot(1:n, T, 'k', ...
        1:n, s(1, 2:n+1), 'r', ...
        1:n, s(2, 2:n+1), 'g', ...
        1:n, s(3, 2:n+1), 'b');
    title(sprintf('Plan %d', plan));
    axis([1 n -0.5 1.5]);
    fprintf('Plan %d: %f %f %f\n', plan, ...
        s(1, n+1), s(2, n+1), s(3, n+1));
end

print -depsc theory_appls.eps;

17.13 Lyapunov analysis of a dynamical system. We consider a discrete-time time-varying linear dynamical
system with state $x_t \in \mathbf{R}^n$. The state propagates according to the linear recursion $x_{t+1} = A_t x_t$,
for $t = 0, 1, \ldots$, where the matrices $A_t$ are unknown but satisfy $A_t \in \mathcal{A} = \{A^{(1)}, \ldots, A^{(K)}\}$, where
$A^{(1)}, \ldots, A^{(K)}$ are known. (In computer science, this would be called a non-deterministic linear
automaton.) We call the sequence $x_0, x_1, \ldots$ a trajectory of the system. There are infinitely many
trajectories, one for each sequence $A_0, A_1, \ldots$.
The Lyapunov exponent $\kappa$ of the system is defined as
\[ \kappa = \sup_{A_0, A_1, \ldots} \ \limsup_{t \to \infty} \|x_t\|_2^{1/t}. \]

(If you don't know what sup and lim sup mean, you can replace them with max and lim, respectively.)
Roughly speaking, this means that all trajectories grow no faster than $\kappa^t$. When $\kappa < 1$,
the system is called exponentially stable.
It is a hard problem to determine the Lyapunov exponent of the system, or whether the system is
exponentially stable, given the data $A^{(1)}, \ldots, A^{(K)}$. In this problem we explore a powerful method
for computing an upper bound on the Lyapunov exponent.

(a) Let $P \in \mathbf{S}^n_{++}$ and define $V(x) = x^T P x$. Suppose $V$ satisfies

\[ V(A^{(i)} x) \le \gamma^2 V(x) \mbox{ for all } x \in \mathbf{R}^n, \quad i = 1, \ldots, K. \]

Show that $\kappa \le \gamma$. Thus $\gamma$ is an upper bound on the Lyapunov exponent $\kappa$. (The function $V$
is called a quadratic Lyapunov function for the system.)
(b) Explain how to use convex or quasiconvex optimization to find a matrix $P \in \mathbf{S}^n_{++}$ with the
smallest value of $\gamma$, i.e., with the best upper bound on $\kappa$. You must justify your formulation.
(c) Carry out the method of part (b) for the specific problem with data given in lyap_exp_bound_data.m.
Report the best upper bound on $\kappa$, to a tolerance of 0.01. The data $A^{(i)}$ are given as a cell
array; A{i} gives $A^{(i)}$.
(d) Approximate worst-case trajectory simulation. The quadratic Lyapunov function found in
part (c) can be used to generate sequences of $A_t$ that tend to result in large values of $\|x_t\|_2^{1/t}$.
Start from a random vector $x_0$. At each $t$, generate $x_{t+1}$ by choosing $A_t = A^{(i)}$ that maximizes
$V(A^{(i)} x_t)$, where $P$ is computed from part (c). Do this for 50 time steps, and generate 5 such
trajectories. Plot $\|x_t\|_2^{1/t}$ and $\gamma$ against $t$ to verify that the bound you obtained in the previous
part is valid. Report the lower bound on the Lyapunov exponent that the trajectories suggest.

Solution.

(a) Suppose $V$ satisfies the conditions given above, and $x_0, x_1, \ldots$ is any trajectory of the system.
Then, $V(x_{t+1}) \le \gamma^2 V(x_t)$ for all $t \ge 0$. It follows that $V(x_t) \le \gamma^{2t} V(x_0)$. So we have

\[ \|x_t\|_2^2 = \frac{x_t^T x_t}{x_t^T P x_t} \, x_t^T P x_t \le \left( \sup_{x \neq 0} \frac{x^T x}{x^T P x} \right) V(x_t) \le \lambda_{\min}(P)^{-1} \gamma^{2t} V(x_0), \]

and thus,
\[ \|x_t\|_2^{1/t} \le \gamma \, \lambda_{\min}(P)^{-1/(2t)} V(x_0)^{1/(2t)}. \]
Taking the limit as $t \to \infty$ we get $\kappa \le \gamma$.
(b) The given condition on $V$ is equivalent to the matrix inequalities
\[ A^{(i)T} P A^{(i)} \preceq \gamma^2 P, \quad i = 1, \ldots, K. \]

For any fixed $\gamma$, these are linear matrix inequalities in $P$, hence convex constraints.
Note that the LMIs above are homogeneous in $P$. Therefore, without loss of generality, we
can require $P \succeq I$ in order to enforce $P$ to be positive definite. Thus, there is a quadratic
Lyapunov function that establishes the bound $\gamma$ if and only if the LMIs
\[ A^{(i)T} P A^{(i)} \preceq \gamma^2 P, \quad i = 1, \ldots, K, \qquad P \succeq I \]

are feasible. This defines a convex set of $P$, so finding the smallest possible value of $\gamma$ is a
quasiconvex problem. Then, we can use bisection on $\gamma$ to solve this problem.
(c) We find that the best bound obtained by the method above is $\gamma = 0.97$. The following code
solves the problem.
% lyapunov analysis of a dynamical system
lyap_exp_bound_data;

% compute lower and upper bounds for bisection
l = 0; u = 0;
for i = 1:K
    l = max(l, max(abs(eig(A{i}))));
    u = max(u, norm(A{i}));
end
bisection_tol = 1e-4;

while u-l >= bisection_tol
    fprintf('bisection bounds: %f %f\n', l, u);
    gamma = (l+u)/2;
    cvx_begin quiet
        variable P(n, n) symmetric
        minimize trace(P)
        subject to
            for i = 1:K
                gamma^2*P - A{i}'*P*A{i} == semidefinite(n)
            end
            P-eye(n) == semidefinite(n)
    cvx_end

    if strcmp(cvx_status, 'Solved')
        % LMIs feasible: gamma is a valid bound, try a smaller one
        u = gamma;
        bound = gamma;
        P_opt = P;
    else
        l = gamma;
    end
end

fprintf('smallest value of gamma = %f\n', bound);

We mention the initial lower and upper bound used in the code, that we didn't ask you to
explore. Suppose that the initial state was an eigenvector of some $A^{(i)}$ with eigenvalue $\lambda$, and
that this particular $A^{(i)}$ was chosen at every time step. Under this condition, it is easy to see
that $|\lambda|$ gives a lower bound on $\kappa$. On the other hand, the ratio $\|x_{t+1}\|_2/\|x_t\|_2$ is bounded by
$\max_i \|A^{(i)}\|$, so we obtain an upper bound on $\kappa$.
(d) The figure shows five random trajectories in blue, and the bound in red. The initial lower
and upper bound used in the bisection method are shown in green. The trajectories suggest
that the Lyapunov exponent of the system is 0.86 or higher.
[Figure: $\|x_t\|_2^{1/t}$ versus $t$, for $0 \le t \le 50$.]

The following code generates the plot.


% approximate worst-case trajectory
% simulation of a dynamical system
clf; hold on;
T = 50;
for traj = 1:5
    x = randn(n, 1); % random initial state
    x = 1.1*x/norm(x);
    ys = [];
    for t = 1:T
        % greedy step: pick the A{i} that maximizes V(A{i}*x)
        best = 0;
        for i = 1:K
            if best < x'*A{i}'*P_opt*A{i}*x
                best = x'*A{i}'*P_opt*A{i}*x;
                x_next = A{i}*x;
            end
        end
        x = x_next;
        ys(end+1) = norm(x)^(1/t);
    end
    plot(1:T, ys);
end

l = 0; u = 0;
for i = 1:K
    l = max(l, max(abs(eig(A{i}))));
    u = max(u, norm(A{i}));
end

plot([1 T], [bound bound], 'r', ...
    [1 T], [l l], 'g', ...
    [1 T], [u u], 'g');

xlabel('t'); ylabel('|x_t|_2^{1/t}');
hold off;

print -depsc lyap_exp_bound.eps;
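The argument of part (a) and the greedy simulation of part (d) can be illustrated on a toy instance without a solver. The sketch below is standard-library Python with two hypothetical $2 \times 2$ matrices (not the exercise data) and the trivial choice $P = I$, for which $V(A^{(i)} x) \le \gamma^2 V(x)$ holds with $\gamma = \max_i \|A^{(i)}\|_2$. This is only the crude upper bound used to start the bisection, not the optimized bound of part (c).

```python
import math

def matvec(A, x):
    """Product of a 2x2 matrix (list of rows) and a 2-vector."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def norm2(x):
    return math.hypot(x[0], x[1])

def spectral_norm(A):
    """Largest singular value of a 2x2 matrix, from the eigenvalues of A^T A."""
    p = A[0][0] ** 2 + A[1][0] ** 2
    q = A[0][1] ** 2 + A[1][1] ** 2
    r = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    return math.sqrt(0.5 * (p + q) + math.sqrt(0.25 * (p - q) ** 2 + r ** 2))

# Hypothetical system matrices, chosen only for illustration.
mats = [[[0.6, 0.5], [0.0, 0.7]],
        [[0.8, 0.0], [-0.4, 0.5]]]
gamma = max(spectral_norm(A) for A in mats)   # valid bound for P = I

x0 = [1.0, 0.5]
x, norms = x0, []
for t in range(1, 51):
    # greedy step: choose the matrix that maximizes V(A x) = ||A x||_2^2
    x = max((matvec(A, x) for A in mats), key=norm2)
    norms.append(norm2(x))

# Part (a) with P = I gives ||x_t||_2 <= gamma^t ||x_0||_2 for every t.
bound_holds = all(norms[t - 1] <= gamma ** t * norm2(x0) * (1 + 1e-9)
                  for t in range(1, 51))
growth = norms[-1] ** (1.0 / 50)
print(gamma, growth, bound_holds)
```

The quantity growth plays the role of the blue curves in the figure above at $t = 50$; it is an empirical lower estimate of the exponent, while gamma is a guaranteed upper bound.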

17.14 Optimal material blending. A standard industrial operation is to blend or mix raw materials
(typically fluids such as different grades of crude oil) to create blended materials or products.
This problem addresses optimizing the blending operation. We produce n blended materials from
m raw materials. Each raw and blended material is characterized by a vector that gives the
concentration of each of $q$ constituents (such as different octane hydrocarbons). Let $c_1, \ldots, c_m \in \mathbf{R}^q_+$
and $\tilde c_1, \ldots, \tilde c_n \in \mathbf{R}^q_+$ be the concentration vectors of the raw materials and the blended materials,
respectively. We have $\mathbf{1}^T c_j = \mathbf{1}^T \tilde c_i = 1$ for $i = 1, \ldots, n$ and $j = 1, \ldots, m$. The raw material
concentrations are given; the blended product concentrations must lie between some given bounds,
$\tilde c_i^{\min} \preceq \tilde c_i \preceq \tilde c_i^{\max}$.
Each blended material is created by pumping raw materials (continuously) into a vat or container
where they are mixed to produce the blended material (which continuously flows out of the mixing
vat). Let $f_{ij} \ge 0$ denote the flow of raw material $j$ (say, in kg/s) into the vat for product $i$, for
$i = 1, \ldots, n$, $j = 1, \ldots, m$. These flows are limited by the total availability of each raw material:
$\sum_{i=1}^n f_{ij} \le F_j$, $j = 1, \ldots, m$, where $F_j > 0$ is the maximum total flow of raw material $j$ available.
Let $\tilde f_i \ge 0$ denote the flow rates of the blended materials. These also have limits: $\tilde f_i \le \tilde F_i$,
$i = 1, \ldots, n$.
The raw and blended material flows are related by the (mass conservation) equations
\[ \sum_{j=1}^m f_{ij} c_j = \tilde f_i \tilde c_i, \quad i = 1, \ldots, n. \]

(The lefthand side is the vector of incoming constituent mass flows and the righthand side is the
vector of outgoing constituent mass flows.)
Each raw and blended material has a (positive) price, $p_j$, $j = 1, \ldots, m$ (for the raw materials), and
$\tilde p_i$, $i = 1, \ldots, n$ (for the blended materials). We pay for the raw materials, and get paid for the
blended materials. The total profit for the blending process is
\[ -\sum_{i=1}^n \sum_{j=1}^m f_{ij} p_j + \sum_{i=1}^n \tilde f_i \tilde p_i. \]

The goal is to choose the variables $f_{ij}$, $\tilde f_i$, and $\tilde c_i$ so as to maximize the profit, subject to the
constraints. The problem data are $c_j$, $\tilde c_i^{\min}$, $\tilde c_i^{\max}$, $F_j$, $\tilde F_i$, $p_j$, and $\tilde p_i$.

(a) Explain how to solve this problem using convex or quasi-convex optimization. You must justify
any change of variables or problem transformation, and explain how you recover the solution
of the blending problem from the solution of your proposed problem.
(b) Carry out the method of part (a) on the problem instance given in
material_blending_data.*. Report the optimal profit, and the associated values of fij , fi ,
and ci .

Solution.

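(a) The problem we are to solve is
\[
\begin{array}{ll}
\mbox{maximize} & -\sum_{i,j} f_{ij} p_j + \sum_i \tilde f_i \tilde p_i \\
\mbox{subject to} & \sum_j f_{ij} c_j = \tilde f_i \tilde c_i \\
& \mathbf{1}^T \tilde c_i = 1 \\
& \tilde c_i^{\min} \preceq \tilde c_i \preceq \tilde c_i^{\max} \\
& 0 \le \tilde f_i \le \tilde F_i \\
& 0 \le f_{ij} \\
& \sum_i f_{ij} \le F_j
\end{array}
\]
with variables $f_{ij}$, $\tilde f_i$, and $\tilde c_i$. Each constraint that is indexed by $i$ must hold for $i = 1, \ldots, n$,
and each constraint that is indexed by $j$ must hold for $j = 1, \ldots, m$.
The objective and all constraints except the first set of equality constraints are linear. On the
right hand side of the first set of equality constraints we have the product of two variables $\tilde f_i$ and $\tilde c_i$,
so these constraints are not convex.
To deal with this, we introduce new variables $m_i = \tilde f_i \tilde c_i \in \mathbf{R}^q$ for $i = 1, \ldots, n$, and reformulate
the problem as an optimization problem with decision variables $f_{ij}$, $\tilde f_i$, and $m_i$, removing the
variables $\tilde c_i$. The vectors $m_i$ are the blended product constituent mass flows.
The variables $\tilde c_i$ only appear in the first three sets of constraints. In the first set of equality
constraints, we can simply replace $\tilde f_i \tilde c_i$ with $m_i$, which results in a set of linear equality
constraints. We express $\mathbf{1}^T \tilde c_i = 1$ as $\mathbf{1}^T m_i = \tilde f_i$; these are equivalent since $\mathbf{1}^T m_i = \tilde f_i \mathbf{1}^T \tilde c_i = \tilde f_i$.
The third set of constraints is equivalent to $\tilde f_i \tilde c_i^{\min} \preceq m_i \preceq \tilde f_i \tilde c_i^{\max}$. Therefore, the problem
becomes
\[
\begin{array}{ll}
\mbox{maximize} & -\sum_{i,j} f_{ij} p_j + \sum_i \tilde f_i \tilde p_i \\
\mbox{subject to} & \sum_j f_{ij} c_j = m_i, \quad i = 1, \ldots, n \\
& \mathbf{1}^T m_i = \tilde f_i, \quad i = 1, \ldots, n \\
& \tilde f_i \tilde c_i^{\min} \preceq m_i \preceq \tilde f_i \tilde c_i^{\max}, \quad i = 1, \ldots, n \\
& 0 \le f_{ij}, \quad i = 1, \ldots, n, \ j = 1, \ldots, m \\
& 0 \le \tilde f_i \le \tilde F_i, \quad i = 1, \ldots, n \\
& \sum_i f_{ij} \le F_j, \quad j = 1, \ldots, m,
\end{array}
\]
with variables $f_{ij}$, $\tilde f_i$, and $m_i$. This is an LP. In order to reconstruct the solution to the original
problem, we find $\tilde c_i$ by $\tilde c_i = m_i/\tilde f_i$ (for each product with $\tilde f_i > 0$).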
(b) The following MATLAB code solves the problem.
clear all
material_blending_data

% f is n x m (products by raw materials), m is q x n;
% ftilde, FTilde, and F are taken to be column vectors
cvx_begin
    variables f(2,4) ftilde(2) m(3,2)
    maximize( -sum(f*p) + sum(m*pTilde) )
    subject to
        m == C*f'
        sum(m) == ftilde'
        0 <= f
        0 <= ftilde <= FTilde
        c_minTilde*diag(ftilde) <= m <= c_maxTilde*diag(ftilde)
        sum(f) <= F'
cvx_end
The following Python code solves the problem.
from cvxpy import *
import numpy as np
from material_blending_data import *

f = Variable(2,4)
ftilde = Variable(2)
m = Variable(3,2)

objective = Maximize(-sum_entries(f * p) + sum_entries(m * pTilde))

constraints = [m == C*f.T,
               np.ones([1,3])*m == ftilde.T,
               0 <= f,
               0 <= ftilde,
               ftilde <= FTilde,
               c_minTilde * diag(ftilde) <= m,
               m <= c_maxTilde * diag(ftilde),
               (np.ones([1,2]) * f).T <= F]

prob = Problem(objective, constraints)

result = prob.solve()
print(prob.value)
The following Julia code solves the problem.

using Convex, ECOS

include("material_blending_data.jl");

f = Variable(2,4)
ftilde = Variable(2)
m = Variable(3,2)

obj = (-sum(f * p) + sum(m * pTilde))

prob = maximize(obj)
prob.constraints += [m == C*f',
                     ones(1,3)*m == ftilde',
                     0 <= f,
                     0 <= ftilde,
                     ftilde <= FTilde,
                     c_minTilde * diagm(ftilde) <= m,
                     m <= c_maxTilde * diagm(ftilde),
                     f'*ones(2,1) <= F]

solve!(prob, ECOSSolver(verbose=0,max_iters=20000))

println(prob.optval)
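We find that the optimal value is 127, with a solution
\[
f = \left[ \begin{array}{cccc}
6.0555 & 0.9167 & 0.7065 & 0.3213 \\
0.9445 & 1.0833 & 5.2935 & 2.6787
\end{array} \right],
\]
\[
\tilde f = \left[ \begin{array}{c} 8 \\ 10 \end{array} \right], \qquad
\tilde c_1 = \left[ \begin{array}{c} 0.8588 \\ 0.1000 \\ 0.0412 \end{array} \right], \qquad
\tilde c_2 = \left[ \begin{array}{c} 0.7029 \\ 0.1800 \\ 0.1171 \end{array} \right].
\]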
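The reported numbers can be sanity-checked by hand. The sketch below (plain Python, added for illustration) copies the printed values and checks that each product concentration vector sums to one and that the raw material routed to each product matches the product flow. The second check assumes total mass is conserved, i.e., that the raw-material concentration vectors in the data file also sum to one; the data file is not reproduced here.

```python
# Reported optimal values, copied from the solution above.
f = [[6.0555, 0.9167, 0.7065, 0.3213],
     [0.9445, 1.0833, 5.2935, 2.6787]]
f_tilde = [8.0, 10.0]
c_tilde = [[0.8588, 0.1000, 0.0412],
           [0.7029, 0.1800, 0.1171]]

tol = 1e-3  # the values are printed to four decimals

# 1^T c_tilde_i should equal 1 for each product i.
conc_sums = [sum(c) for c in c_tilde]

# Total raw material routed to product i; equals f_tilde_i if mass is conserved
# (assumption: every raw-material concentration vector sums to one).
row_sums = [sum(row) for row in f]

# Total use of each raw material j, to compare against the capacities F_j in the data file.
col_sums = [sum(f[i][j] for i in range(2)) for j in range(4)]

print(conc_sums, row_sums, col_sums)
```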

17.15 Optimal evacuation planning. We consider the problem of evacuating people from a dangerous area
in a way that minimizes risk exposure. We model the area as a connected graph with n nodes and
m edges; people can assemble or collect at the nodes, and travel between nodes (in either direction)
over the edges. We let $q_t \in \mathbf{R}^n_+$ denote the vector of the numbers of people at the nodes, in time
period $t$, for $t = 1, \ldots, T$, where $T$ is the number of periods we consider. (We will consider the
entries of $q_t$ as real numbers, not integers.) The initial population distribution $q_1$ is given. The
nodes have capacity constraints, given by $q_t \preceq Q$, where $Q \in \mathbf{R}^n_+$ is the vector of node capacities.
We use the incidence matrix $A \in \mathbf{R}^{n \times m}$ to describe the graph. We assign an arbitrary reference
direction to each edge, and take
\[
A_{ij} = \left\{ \begin{array}{ll}
+1 & \mbox{if edge $j$ enters node $i$} \\
-1 & \mbox{if edge $j$ exits node $i$} \\
0 & \mbox{otherwise.}
\end{array} \right.
\]
The population dynamics are given by $q_{t+1} = A f_t + q_t$, $t = 1, \ldots, T-1$, where $f_t \in \mathbf{R}^m$ is the
vector of population movement (flow) across the edges, for $t = 1, \ldots, T-1$. A positive flow
denotes movement in the direction of the edge; negative flow denotes population flow in the reverse
direction. Each edge has a capacity, i.e., $|f_t| \preceq F$, where $F \in \mathbf{R}^m_+$ is the vector of edge capacities,
and $|f_t|$ denotes the elementwise absolute value of $f_t$.
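To make the incidence-matrix convention concrete, here is a minimal sketch in plain Python on a hypothetical three-node path graph (not the graph from the data file). The helper name `step` and the flow values are made up for illustration. Each column of the incidence matrix sums to zero, so the total population is preserved by the dynamics.

```python
def step(A, q, f):
    """One period of the population dynamics: q_next = A f + q."""
    n, m = len(A), len(A[0])
    return [q[i] + sum(A[i][j] * f[j] for j in range(m)) for i in range(n)]


# Hypothetical path graph 0 -> 1 -> 2: edge 0 exits node 0 and enters node 1,
# edge 1 exits node 1 and enters node 2.
A = [[-1, 0],
     [1, -1],
     [0, 1]]

q1 = [1.0, 0.0, 0.0]                           # everyone starts at node 0
flows = [[0.5, 0.0], [0.5, 0.5], [0.0, 0.5]]   # made-up edge flows f_1, f_2, f_3

traj = [q1]
for f in flows:
    traj.append(step(A, traj[-1], f))

for t, q in enumerate(traj, start=1):
    print(t, q, sum(q))
```

A negative entry in a flow vector would move people against the reference direction of that edge, with the same update rule.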
An evacuation plan is a sequence $q_1, q_2, \ldots, q_T$ and $f_1, f_2, \ldots, f_{T-1}$ obeying the constraints above.
The goal is to find an evacuation plan that minimizes the total risk exposure, defined as
\[
R^{\mathrm{tot}} = \sum_{t=1}^{T} \left( r^T q_t + s^T q_t^2 \right) + \sum_{t=1}^{T-1} \left( \tilde r^T |f_t| + \tilde s^T f_t^2 \right),
\]
where $r, s \in \mathbf{R}^n_+$ are given vectors of risk exposure coefficients associated with the nodes, and
$\tilde r, \tilde s \in \mathbf{R}^m_+$ are given vectors of risk exposure coefficients associated with the edges. The notation
$q_t^2$ and $f_t^2$ refers to elementwise squares of the vectors. Roughly speaking, the risk exposure is a
quadratic function of the occupancy of a node, or the (absolute value of the) flow of people along
an edge. The linear terms can be interpreted as the risk exposure per person; the quadratic terms
can be interpreted as the additional risk associated with crowding.
A subset of nodes have zero risk ($r_i = s_i = 0$), and are designated as safe nodes. The population is
considered evacuated at time $t$ if $r^T q_t + s^T q_t^2 = 0$. The evacuation time $t^{\mathrm{evac}}$ of an evacuation plan
is the smallest such $t$. We will assume that $T$ is sufficiently large and that the total capacity of the
safe nodes exceeds the total initial population, so evacuation is possible.
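The definitions of the risk terms and of the evacuation time can be illustrated on a small hand-made plan. This is a sketch in plain Python with hypothetical data (three nodes, two edges, node 2 safe); the function names are illustrative and do not come from the exercise files.

```python
def node_risk(q, r, s):
    """r^T q + s^T q^2, with an elementwise square."""
    return sum(ri * qi + si * qi ** 2 for ri, si, qi in zip(r, s, q))


def edge_risk(f, r_tilde, s_tilde):
    """r_tilde^T |f| + s_tilde^T f^2, with elementwise absolute value and square."""
    return sum(ri * abs(fi) + si * fi ** 2 for ri, si, fi in zip(r_tilde, s_tilde, f))


def evacuation_time(traj, r, s, tol=1e-4):
    """Smallest 1-based period t whose node risk is at most tol, or None if there is none."""
    for t, q in enumerate(traj, start=1):
        if node_risk(q, r, s) <= tol:
            return t
    return None


# Hypothetical plan: occupancies q_1..q_4 and flows f_1..f_3.
traj = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
flows = [[0.5, 0.0], [0.5, 0.5], [0.0, 0.5]]
r, s = [1.0, 0.5, 0.0], [0.2, 0.1, 0.0]        # node 2 has zero risk (safe)
r_tilde, s_tilde = [0.1, 0.1], [0.1, 0.1]

total_risk = (sum(node_risk(q, r, s) for q in traj)
              + sum(edge_risk(f, r_tilde, s_tilde) for f in flows))
t_evac = evacuation_time(traj, r, s)
print(total_risk, t_evac)
```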
Use CVX* to find an optimal evacuation plan for the problem instance with data given in opt_evac_data.*.
(We display the graph below, with safe nodes denoted as squares.)

[Figure: the evacuation graph, with numbered nodes and edges; safe nodes are drawn as squares.]

Report the associated optimal risk exposure $R^{\mathrm{tot}\star}$. Plot the time period risk
\[
R_t = r^T q_t + s^T q_t^2 + \tilde r^T |f_t| + \tilde s^T f_t^2
\]
versus time. (For $t = T$, you can take the edge risk to be zero.) Plot the node occupancies $q_t$, and
edge flows $f_t$ versus time. Briefly comment on the results you see. Give the evacuation time $t^{\mathrm{evac}}$
(considering any $r^T q_t + s^T q_t^2 \leq 10^{-4}$ to be zero).

Hint. With CVXPY, use the ECOS solver with p.solve(solver=cvxpy.ECOS).
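Solution. The optimization problem is given by
\[
\begin{array}{ll}
\mbox{minimize} & \sum_{t=1}^{T} \left( r^T q_t + s^T q_t^2 \right) + \sum_{t=1}^{T-1} \left( \tilde r^T |f_t| + \tilde s^T f_t^2 \right) \\
\mbox{subject to} & q_{t+1} = A f_t + q_t, \quad t = 1, \ldots, T-1 \\
& 0 \preceq q_t \preceq Q, \quad t = 2, \ldots, T \\
& |f_t| \preceq F, \quad t = 1, \ldots, T-1,
\end{array}
\]
with variables $q_2, \ldots, q_T$ and $f_1, \ldots, f_{T-1}$. This is evidently a convex optimization problem:
the objective is convex because the coefficients $r$, $s$, $\tilde r$, $\tilde s$ are nonnegative, and the constraints
are all linear (the constraint $|f_t| \preceq F$ is equivalent to $-F \preceq f_t \preceq F$).
The optimal evacuation time is $t^{\mathrm{evac}} = 17$, with total risk exposure $R^{\mathrm{tot}\star} = 6.59$.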
Note that two of the edge flows reverse direction during the evacuation. This is because the entire
population starts at node 1, but not everyone can move to the safe nodes immediately, due to the
edge capacity constraints. To avoid accumulating risk, some people move to the safer nodes 2 and
3. Once the bottleneck clears, people flow back in the reverse direction, past node 1, and towards
the safe nodes.

[Figure: three panels showing the time period risk $R_t$, the node occupancies $q_t$, and the edge flows $f_t$, each plotted versus $t$.]

The following Python code solves the problem.

# solution to optimal evacuation problem

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
import matplotlib

from opt_evac_data import *

def opt_evac(A, Q, F, q1, r, s, rtild, stild, T):
    n, m = A.shape
    q = cvx.Variable(n, T)
    f = cvx.Variable(m, T-1)

    # per-period node and edge risk; the edge risk at t = T is taken to be zero
    node_risk = q.T*r + cvx.square(q).T*s
    edge_risk = cvx.vstack(cvx.abs(f).T*rtild + cvx.square(f).T*stild, 0)
    risk = node_risk + edge_risk

    constr = [q[:,0] == q1,
              q[:,1:] == A*f + q[:,:-1],
              0 <= q, q <= np.tile(Q,(T,1)).T,
              cvx.abs(f) <= np.tile(F,(T-1,1)).T]

    p = cvx.Problem(cvx.Minimize(sum(risk)), constr)
    p.solve(verbose=True, solver=cvx.ECOS)

    # convert the optimal values to numpy arrays
    arr = lambda _: np.array(_.value)
    q, f, risk, node_risk = map(arr, (q, f, risk, node_risk))

    print("Total risk: {}".format(p.value))
    # first period (1-based) in which the node risk is numerically zero
    print("Evacuated at t = {}".format((node_risk <= 1e-4).nonzero()[0][0] + 1))

    return q, f, risk, node_risk

# solve
q, f, risk, node_risk = opt_evac(A, Q, F, q1, r, s, rtild, stild, T)

# plot
plt.rc('text', usetex=True)
plt.rcParams.update({'font.size': 20})
fig, axs = plt.subplots(1,3,figsize=(15,5))

axs[0].plot(np.arange(1,T+1), risk)
axs[0].set_ylabel('$R_t$')

axs[1].plot(np.arange(1,T+1), q.T)
axs[1].set_ylabel('$q_t$')

axs[2].plot(np.arange(1,T), f.T)
axs[2].set_ylabel('$f_t$')

for ax in axs:
    ax.set_xlabel('$t$')

if matplotlib.get_backend().lower() in ['agg', 'macosx']:
    fig.set_tight_layout(True)
else:
    fig.tight_layout()

fig.savefig('opt_evac.pdf')
fig.savefig('opt_evac.eps')

The following MATLAB code solves the problem.

% solution to optimal evacuation problem

opt_evac_data

[n, m] = size(A);

cvx_begin
    variable q(n,T)
    variable f(m,T-1)
    % per-period risk; the edge risk at t = T is taken to be zero
    risk = q'*r + square(q)'*s + [abs(f)'*rtild + square(f)'*stild; 0];
    minimize( sum(risk) )
    subject to
        q(:,2:end) == A*f + q(:,1:end-1)
        q(:,1) == q1
        0 <= q
        q <= repmat(Q,1,T)
        abs(f) <= repmat(F,1,T-1)
cvx_end

fprintf('Total risk: %f\n', sum(risk))
fprintf('Evacuated at t = %d\n', find(q'*r + (q.^2)'*s < 1e-4, 1))

subplot(1,3,1)
plot(risk)
ylabel('R_t')
xlabel('t')
subplot(1,3,2)
plot(q')
ylabel('q_t')
xlabel('t')
subplot(1,3,3)
plot(f')
ylabel('f_t')
xlabel('t')
print(gcf,'-deps','opt_evac.eps')

The following Julia code solves the problem.

# solution to optimal evacuation problem

using Convex, ECOS, PyPlot
include("opt_evac_data.jl");

n,m = size(A)
q = Variable(n,T)
f = Variable(m,T-1)
# per-period risk; the edge risk at t = T is taken to be zero
risk = q'*r + square(q')*s + [abs(f')*rtild + square(f')*stild; 0]
p = minimize(sum(risk))
p.constraints += [q[:,1] == q1,
                  q[:,2:end] == A*f + q[:,1:end-1],
                  0 <= q, q <= repmat(Q,1,T),
                  abs(f) <= repmat(F,1,T-1)]
solve!(p, ECOSSolver(verbose=1))

risk = evaluate(risk)
q = q.value
f = f.value
println("Total risk: $(round(sum(risk),2))")
println("Evacuated at t = $(findfirst(q'*r + (q.*q)'*s .<= 1e-4))")

fig = figure("stuff",figsize=(22,5))
subplot(131)
plot(risk)
ylabel(L"R_t")
xlabel(L"t")
subplot(132)
plot(q')
ylabel(L"q_t")
xlabel(L"t")
subplot(133)
plot(f')
ylabel(L"f_t")
xlabel(L"t")
savefig("opt_evac.eps")

17.16 Ideal preference point. A set of $K$ choices for a decision maker is parametrized by a set of vectors
$c^{(1)}, \ldots, c^{(K)} \in \mathbf{R}^n$. We will assume that the entries $c_i$ of each choice are normalized to lie in the
range $[0, 1]$. The ideal preference point model posits that there is an ideal choice vector $c^{\mathrm{ideal}}$ with
entries in the range $[0, 1]$; when the decision maker is asked to choose between two candidate choices
$c$ and $\tilde c$, she will choose the one that is closest (in Euclidean norm) to her ideal point. Now suppose
that the decision maker has chosen between all $K(K-1)/2$ pairs of given choices $c^{(1)}, \ldots, c^{(K)}$.
The decisions are represented by a list of pairs of integers, where the pair $(i, j)$ means that $c^{(i)}$ is
chosen when given the choices $c^{(i)}$, $c^{(j)}$. You are given these vectors and the associated choices.

(a) How would you determine if the decision maker's choices are consistent with the ideal preference
point model?
(b) Assuming they are consistent, how would you determine the bounding box of ideal choice vectors
consistent with her decisions? (That is, how would you find the minimum and maximum
values of $c^{\mathrm{ideal}}_i$, for $c^{\mathrm{ideal}}$ consistent with being the ideal preference point.)
(c) Carry out the method of part (b) using the data given in ideal_pref_point_data.*. These
files give the points $c^{(1)}, \ldots, c^{(K)}$ and the choices, and include the code for plotting the results.
Report the width and the height of the bounding box and include your plot.

Solution.

(a) The decision that $c^{\mathrm{ideal}}$ is closer to $c^{(i)}$ than $c^{(j)}$ means that $c^{\mathrm{ideal}}$ lies in the half-space
\[
\left\{ x \in \mathbf{R}^n \;\middle|\; (c^{(j)} - c^{(i)})^T x \leq \frac{1}{2} (c^{(i)} + c^{(j)})^T (c^{(j)} - c^{(i)}) \right\}.
\]
Thus, the decision maker's choices are consistent with an ideal preference point model if and
only if the polyhedron obtained by intersecting the hypercube $[0, 1]^n$ and those half-spaces for
all decisions $(i, j)$ is nonempty.
Remark: It is insufficient to only check whether the decisions satisfy transitivity (i.e., if $(i, j)$
and $(j, k)$ are decisions, then so is $(i, k)$). Consider $c^{(i)} = [i] \in \mathbf{R}^1$, $i = 1, 2, 3$; then the decisions
$(1, 2)$, $(1, 3)$, $(3, 2)$ satisfy transitivity, though there is no $c^{\mathrm{ideal}}$ satisfying the constraints.
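The half-space description and the counterexample in the remark can be checked with a few lines of plain Python. This is an illustrative sketch, not the method used for part (c): it handles only the one-dimensional case by intersecting half-lines, and the names `halfspace` and `feasible_interval_1d` are hypothetical. As in the remark, the box constraint is left out, which can only enlarge the feasible set.

```python
def halfspace(ci, cj):
    """Return (a, b) such that x is at least as close to ci as to cj iff a^T x <= b."""
    a = [y - x for x, y in zip(ci, cj)]
    b = 0.5 * sum((x + y) * (y - x) for x, y in zip(ci, cj))
    return a, b


def feasible_interval_1d(points, decisions):
    """Intersect the half-lines implied by the decisions (n = 1); None if the set is empty."""
    lo, hi = float("-inf"), float("inf")
    for i, j in decisions:
        a, b = halfspace(points[i], points[j])
        if a[0] > 0:
            hi = min(hi, b / a[0])
        elif a[0] < 0:
            lo = max(lo, b / a[0])
        elif b < 0:  # the constraint 0 * x <= b with b < 0 cannot hold
            return None
    return (lo, hi) if lo <= hi else None


points = {1: [1.0], 2: [2.0], 3: [3.0]}

# Decisions from the remark: transitive, but they require x <= 1.5 and x >= 2.5.
inconsistent = feasible_interval_1d(points, [(1, 2), (1, 3), (3, 2)])

# A consistent set of decisions, for comparison.
consistent = feasible_interval_1d(points, [(1, 2), (1, 3), (2, 3)])

print(inconsistent, consistent)
```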
(b) The minimum/maximum value of $c^{\mathrm{ideal}}_k$ can be obtained by minimizing/maximizing $c^{\mathrm{ideal}}_k$
subject to the constraint that $c^{\mathrm{ideal}}$ lies in the aforementioned polyhedron. In other words, for
each $k = 1, \ldots, n$, to find the minimum value of $c^{\mathrm{ideal}}_k$, we solve the problem
\[
\begin{array}{ll}
\mbox{minimize} & c^{\mathrm{ideal}}_k \\
\mbox{subject to} & 0 \preceq c^{\mathrm{ideal}} \preceq \mathbf 1, \\
& (c^{(j)} - c^{(i)})^T c^{\mathrm{ideal}} \leq \frac{1}{2} (c^{(i)} + c^{(j)})^T (c^{(j)} - c^{(i)}) \quad \mbox{for decisions } (i, j).
\end{array}
\]
The maximum value is obtained by using maximize instead of minimize.

Remark: It is possible to reduce the number of decisions needed to be considered in the convex
problem from $K(K-1)/2$ to $K-1$, as long as the decisions satisfy transitivity (which should
be the case if there is no tie). Assume the candidate choices are ordered as $c^{(k_1)}, \ldots, c^{(k_K)}$
from closest to farthest from the ideal point (the ordering can be obtained by counting the
number of times $c^{(i)}$ is preferred in the decisions $(i, j)$); then we need only consider the
decisions $(k_1, k_2), \ldots, (k_{K-1}, k_K)$.
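The reduction described in the remark can be sketched in plain Python using the decision list from the Python listing for part (c) (0-based indices). The variable names below are illustrative; the sketch only computes the ordering and the reduced list of decisions, and does not solve the LPs.

```python
# Decisions [i, j]: choice i preferred over choice j (0-based, as in the Python listing).
d = [[0, 1], [2, 0], [2, 1], [0, 3], [1, 3], [2, 3], [4, 0],
     [4, 1], [2, 4], [4, 3], [0, 5], [5, 1], [2, 5], [5, 3],
     [4, 5], [6, 0], [6, 1], [2, 6], [6, 3], [4, 6], [6, 5],
     [0, 7], [7, 1], [2, 7], [7, 3], [4, 7], [5, 7], [6, 7]]
K = 8

# Count how many times each choice is preferred.
wins = [0] * K
for i, j in d:
    wins[i] += 1

# With no ties, the win counts are distinct and give the ordering k_1, ..., k_K
# from closest to farthest from the ideal point.
order = sorted(range(K), key=lambda k: -wins[k])

# The K - 1 consecutive pairs (k_1, k_2), ..., (k_{K-1}, k_K).
reduced = list(zip(order[:-1], order[1:]))

# Each reduced pair should itself be one of the recorded decisions.
recorded = set(tuple(pair) for pair in d)
all_recorded = all(pair in recorded for pair in reduced)

print(order, reduced, all_recorded)
```

By the triangle inequality on distances, the remaining decisions are implied by these consecutive ones, so dropping them does not change the feasible polyhedron.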
(c) The width and the height of the bounding box are 0.140 and 0.098 respectively.
[Figure 19: Plot of the points $c^{(i)}$ and the bounding box.]

The following Matlab code solves the problem:
% Problem data
K = 8;
n = 2;

% List of candidate choices as row vectors
c = [0.314 0.509; 0.185 0.282; 0.670 0.722; 0.116 0.253; ...
     0.781 0.382; 0.519 0.952; 0.953 0.729; 0.406 0.110];

% List of decisions. Row [i j] means c(i) preferred over c(j)
d = [1 2; 3 1; 3 2; 1 4; 2 4; 3 4; 5 1; ...
     5 2; 3 5; 5 4; 1 6; 6 2; 3 6; 6 4; ...
     5 6; 7 1; 7 2; 3 7; 7 4; 5 7; 7 6; ...
     1 8; 8 2; 3 8; 8 4; 5 8; 6 8; 7 8];

box = zeros(n, 2);

% box(i, 1) and box(i, 2) should be the lower and upper bounds
% of the i-th coordinate respectively.
for a = 1:n
    for s = [-1, 1]
        % s = -1 gives the lower bound, s = +1 the upper bound
        cvx_begin
            variables c_ideal(n)
            maximize c_ideal(a) * s
            subject to
                0 <= c_ideal;
                c_ideal <= 1;
                for b = 1:size(d,1)
                    c_ideal' * (c(d(b,2),:) - c(d(b,1),:))' <= (c(d(b,1),:) ...
                        + c(d(b,2),:)) * (c(d(b,2),:) - c(d(b,1),:))' / 2;
                end
        cvx_end
        box(a, (s + 3) / 2) = cvx_optval * s;
    end
end

% Drawing the bounding box
figure;
scatter(c(:,1), c(:,2));
hold on
plot([box(1,1);box(1,2);box(1,2);box(1,1);box(1,1)], ...
     [box(2,1);box(2,1);box(2,2);box(2,2);box(2,1)]);
hold off
xlim([0 1]);
ylim([0 1]);
disp(['Width of bounding box = ' num2str(box(1,2) - box(1,1))])
disp(['Height of bounding box = ' num2str(box(2,2) - box(2,1))])

The following Python code solves the problem:


from cvxpy import *
import numpy as np
import matplotlib.pyplot as plt

# Problem data
K = 8
n = 2

# List of candidate choices
c = [[0.314, 0.509], [0.185, 0.282], [0.670, 0.722], [0.116, 0.253],
     [0.781, 0.382], [0.519, 0.952], [0.953, 0.729], [0.406, 0.110]]
c = [np.array(x) for x in c]

# List of decisions. [i, j] means c[i] preferred over c[j]
d = [[0, 1], [2, 0], [2, 1], [0, 3], [1, 3], [2, 3], [4, 0],
     [4, 1], [2, 4], [4, 3], [0, 5], [5, 1], [2, 5], [5, 3],
     [4, 5], [6, 0], [6, 1], [2, 6], [6, 3], [4, 6], [6, 5],
     [0, 7], [7, 1], [2, 7], [7, 3], [4, 7], [5, 7], [6, 7]]

box = [[0] * 2 for a in range(n)]

# box[i][0] and box[i][1] should be the lower and upper bounds
# of the i-th coordinate respectively.
for a in range(n):
    for s in [-1, 1]:
        # s = -1 gives the lower bound, s = +1 the upper bound
        c_ideal = Variable(n)
        objective = Maximize(c_ideal[a] * s)
        constraints = [0 <= c_ideal, c_ideal <= 1]
        for b in d:
            constraints.append(c_ideal.T * (c[b[1]] - c[b[0]])
                <= np.dot(c[b[0]] + c[b[1]], c[b[1]] - c[b[0]]) / 2)
        prob = Problem(objective, constraints)
        box[a][(s + 1) // 2] = prob.solve() * s

# Drawing the bounding box
plt.figure()
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.scatter([c[i][0] for i in range(K)], [c[i][1] for i in range(K)])
plt.plot([box[0][0], box[0][1], box[0][1], box[0][0], box[0][0]],
         [box[1][0], box[1][0], box[1][1], box[1][1], box[1][0]])
plt.gca().set_aspect('equal', adjustable='box')
plt.show()
print('Width of bounding box = ' + str(box[0][1] - box[0][0]))
print('Height of bounding box = ' + str(box[1][1] - box[1][0]))

The following Julia code solves the problem:


using Convex
using PyPlot

# Problem data
K = 8
n = 2

# List of candidate choices as row vectors
c = [0.314 0.509; 0.185 0.282; 0.670 0.722; 0.116 0.253;
     0.781 0.382; 0.519 0.952; 0.953 0.729; 0.406 0.110]

# List of decisions. Row [i j] means c[i] preferred over c[j]
d = [1 2; 3 1; 3 2; 1 4; 2 4; 3 4; 5 1;
     5 2; 3 5; 5 4; 1 6; 6 2; 3 6; 6 4;
     5 6; 7 1; 7 2; 3 7; 7 4; 5 7; 7 6;
     1 8; 8 2; 3 8; 8 4; 5 8; 6 8; 7 8]

box = zeros(n, 2)

# box[i, 1] and box[i, 2] should be the lower and upper bounds
# of the i-th coordinate respectively.
for a in 1:n
    for s in [-1, 1]
        # s = -1 gives the lower bound, s = +1 the upper bound
        c_ideal = Variable(n)
        constraints = [0 <= c_ideal; c_ideal <= 1]
        for b = 1:size(d,1)
            push!(constraints, c_ideal' * (c[d[b,2],:] - c[d[b,1],:])'
                <= (c[d[b,1],:] + c[d[b,2],:]) * (c[d[b,2],:] - c[d[b,1],:])' / 2)
        end
        prob = maximize(c_ideal[a] * s, constraints)
        solve!(prob)
        # integer column index: 1 for s = -1, 2 for s = +1
        box[a, div(s + 3, 2)] = prob.optval * s
    end
end

# Drawing the bounding box
figure()
xlim(0, 1)
ylim(0, 1)
scatter(c[:,1], c[:,2])
plot([box[1,1];box[1,2];box[1,2];box[1,1];box[1,1]],
     [box[2,1];box[2,1];box[2,2];box[2,2];box[2,1]])
println("Width of bounding box = $(box[1,2] - box[1,1])")
println("Height of bounding box = $(box[2,2] - box[2,1])")

17.17 Matrix equilibration. We say that a matrix is $\ell_p$ equilibrated if each of its rows has the same $\ell_p$
norm, and each of its columns has the same $\ell_p$ norm. (The row and column $\ell_p$ norms are related
by $m$, $n$, and $p$.) Suppose we are given a matrix $A \in \mathbf{R}^{m \times n}$. We seek diagonal invertible matrices
$D \in \mathbf{R}^{m \times m}$ and $E \in \mathbf{R}^{n \times n}$ for which $DAE$ is $\ell_p$ equilibrated.

(a) Explain how to find $D$ and $E$ using convex optimization. (Some matrices cannot be equilibrated.
But you can assume that all entries of $A$ are nonzero, which is enough to guarantee
that it can be equilibrated.)
(b) Equilibrate the matrix $A$ given in the file matrix_equilibration_data.*, with
$m = 20$, $n = 10$, $p = 2$.
Print the row $\ell_p$ norms and the column $\ell_p$ norms of the equilibrated matrix as vectors to check
that each matches.

Hints.
• Work with the matrix $B$, with $B_{ij} = |A_{ij}|^p$.
• Consider the problem of minimizing $\sum_{i=1}^m \sum_{j=1}^n B_{ij} e^{u_i + v_j}$ subject to $\mathbf 1^T u = 0$, $\mathbf 1^T v = 0$.
(Several variations on this idea will work.)
• We have found that expressing the terms in the objective as $e^{\log B_{ij} + u_i + v_j}$ leads to fewer
numerical problems.
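Solution. Following the hint, we find the optimality conditions for the suggested problem. The
Lagrangian is
\[
L(u, v, \nu, \lambda) = \sum_{i=1}^m \sum_{j=1}^n B_{ij} e^{u_i + v_j} + \nu \mathbf 1^T u + \lambda \mathbf 1^T v,
\]
with dual variables $\nu$ and $\lambda$. The optimality conditions are
\[
\frac{\partial L}{\partial u_i} = \sum_{j=1}^n B_{ij} e^{u_i + v_j} + \nu = 0, \quad i = 1, \ldots, m,
\]
and
\[
\frac{\partial L}{\partial v_j} = \sum_{i=1}^m B_{ij} e^{u_i + v_j} + \lambda = 0, \quad j = 1, \ldots, n,
\]
along with $\mathbf 1^T u = \mathbf 1^T v = 0$. Defining $D = \mathbf{diag}(e^{u/p})$ and $E = \mathbf{diag}(e^{v/p})$, where the exponentials
are elementwise, we can write the optimality conditions as
\[
\mathbf 1^T D^p B E^p = -\lambda \mathbf 1^T, \qquad D^p B E^p \mathbf 1 = -\nu \mathbf 1,
\]
i.e., $D^p B E^p$ has all column sums equal to $-\lambda$, and all row sums equal to $-\nu$. Therefore, the matrix
$DAE$ is $\ell_p$ equilibrated, since $|(DAE)_{ij}|^p = (D^p B E^p)_{ij}$.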
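The optimality conditions can be illustrated without a solver. The sketch below is plain Python on a hypothetical 3 by 2 matrix, and it swaps in a different technique from the CVX-based solutions: block coordinate descent on the hinted objective, alternating exact minimization over u and over v. Each block update has a closed form, since with v fixed the minimizer makes all row sums of the scaled matrix equal while keeping the entries of u summing to zero. The function names and the test matrix are made up, and no convergence guarantee is claimed beyond what the check at the end shows for this instance.

```python
import math


def equilibrate(A, p=2.0, iters=200):
    """Alternately minimize sum_ij B_ij exp(u_i + v_j) over u and v (each with zero sum)."""
    m, n = len(A), len(A[0])
    B = [[abs(a) ** p for a in row] for row in A]
    u, v = [0.0] * m, [0.0] * n
    for _ in range(iters):
        # u-update: make exp(u_i) * sum_j B_ij exp(v_j) equal for all i, with sum(u) = 0.
        log_r = [math.log(sum(B[i][j] * math.exp(v[j]) for j in range(n))) for i in range(m)]
        mean_r = sum(log_r) / m
        u = [mean_r - lr for lr in log_r]
        # v-update: the same step applied to the columns.
        log_c = [math.log(sum(B[i][j] * math.exp(u[i]) for i in range(m))) for j in range(n)]
        mean_c = sum(log_c) / n
        v = [mean_c - lc for lc in log_c]
    d = [math.exp(ui / p) for ui in u]   # diagonal of D
    e = [math.exp(vj / p) for vj in v]   # diagonal of E
    return d, e


def lp_norms(M, p):
    """Row and column l_p norms of a matrix given as a list of rows."""
    rows = [sum(abs(x) ** p for x in row) ** (1.0 / p) for row in M]
    cols = [sum(abs(M[i][j]) ** p for i in range(len(M))) ** (1.0 / p)
            for j in range(len(M[0]))]
    return rows, cols


# Hypothetical matrix with all entries nonzero.
A = [[1.0, -2.0], [3.0, 4.0], [-5.0, 0.5]]
p = 2.0
d, e = equilibrate(A, p)
A_eq = [[d[i] * A[i][j] * e[j] for j in range(2)] for i in range(3)]
rows, cols = lp_norms(A_eq, p)
print(rows, cols)
```

Summing the p-th powers of all entries by rows and by columns gives the relation between the two common norms: m times the p-th power of the row norm equals n times the p-th power of the column norm.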
For the given matrix A, we find the equilibrated matrix has row norms and column norms as
follows.
Code solutions for each language follow.
Matlab:

matrix_equilibration_data;
B = abs(A).^p;

cvx_begin
variables u(m) v(n);
expression obj;
for i = 1:m
for j = 1:n
obj = obj + exp(log(B(i,j))+u(i)+v(j));
end
end
minimize(obj);
subject to
sum(u) == 0;
sum(v) == 0;
cvx_end

D = diag(exp(u./p));
E = diag(exp(v./p));
A_eq= D*A*E;

row_norms = norms(A_eq,p,2)
col_norms = norms(A_eq,p,1)

Python:

from cvxpy import *
import numpy as np

from matrix_equilibration_data import *

B = np.power(np.abs(A), p)

u = Variable(m)
v = Variable(n)

obj = 0
for i in range(m):
    for j in range(n):
        obj += exp(log(B[i, j]) + u[i] + v[j])
obj = Minimize(obj)

constraints = [sum(u) == 0, sum(v) == 0]

prob = Problem(obj, constraints)

prob.solve(verbose=True)

D = np.diagflat(np.exp(u.value / p))
E = np.diagflat(np.exp(v.value / p))
# explicit matrix products, correct whether A is an ndarray or a matrix
A_eq = np.dot(np.dot(D, A), E)

row_norms = np.linalg.norm(A_eq, p, 1)
col_norms = np.linalg.norm(A_eq.T, p, 1)

print(row_norms)
print(col_norms)

Julia:

# Compute the equilibration

include("matrix_equilibration_data.jl");
B = abs(A).^p;

using Convex;
u = Variable(m); v = Variable(n);

objective = 0;
for i = 1:m
    for j = 1:n
        objective += exp(log(B[i,j])+u[i]+v[j]);
    end
end
constraints = [sum(u) == 0, sum(v) == 0];
problem = minimize(objective,constraints);
solve!(problem);

D = diagm(exp(u.value[:,1]./p));
E = diagm(exp(v.value[:,1]./p));
A_eq = D*A*E;

# row norms sum along dimension 2, column norms along dimension 1
row_norms = sum(abs(A_eq).^p,2).^(1/p)
col_norms = sum(abs(A_eq).^p,1).^(1/p)
