Solutions To Exercises in An Introduction To Convexity
Øyvind Ryan
June 2020
Chapter 1
The basic concepts
$y = a_1 x_1 + \dots + a_t x_t, \quad z = b_1 x_1 + \dots + b_t x_t,$
Exercise 1.10. Let $S = \{(x, y, z) : z \geq x^2 + y^2\} \subset \mathbb{R}^3$. Sketch the set and verify that it is a convex set. Is $S$ a finitely generated cone?
Solution: Assume that $a_1 = (x_1, y_1, z_1)$ and $a_2 = (x_2, y_2, z_2)$ are both in $S$, so that $z_1 \geq x_1^2 + y_1^2$ and $z_2 \geq x_2^2 + y_2^2$. Convexity of $S$ is the same as showing, for $0 \leq \lambda \leq 1$, that $(1-\lambda)a_1 + \lambda a_2 \in S$, i.e., that
$$((1-\lambda)x_1 + \lambda x_2)^2 + ((1-\lambda)y_1 + \lambda y_2)^2 \leq (1-\lambda)z_1 + \lambda z_2.$$
We have that
$$((1-\lambda)x_1 + \lambda x_2)^2 \leq (1-\lambda)x_1^2 + \lambda x_2^2$$
(since the difference can be reorganized to $\lambda(1-\lambda)(x_1^2 + x_2^2 - 2x_1 x_2) = \lambda(1-\lambda)(x_1 - x_2)^2 \geq 0$; convexity of the function $f(x) = x^2$ is really what is at play here, and we will return to this later), and similarly
$$((1-\lambda)y_1 + \lambda y_2)^2 \leq (1-\lambda)y_1^2 + \lambda y_2^2.$$
It follows that
$$((1-\lambda)x_1 + \lambda x_2)^2 + ((1-\lambda)y_1 + \lambda y_2)^2 \leq (1-\lambda)(x_1^2 + y_1^2) + \lambda(x_2^2 + y_2^2) \leq (1-\lambda)z_1 + \lambda z_2,$$
so that $(1-\lambda)a_1 + \lambda a_2 \in S$, and $S$ is convex.
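As a quick numerical sanity check of this argument (not part of the original solution; a sketch assuming Python with NumPy), one can sample random points of $S$ and verify that their convex combinations stay in $S$:

```python
import numpy as np

rng = np.random.default_rng(0)

def in_S(p):
    # Membership test for S = {(x, y, z) : z >= x^2 + y^2}, with a small
    # tolerance for round-off.
    x, y, z = p
    return z >= x**2 + y**2 - 1e-12

for _ in range(1000):
    # Two random points of S (random (x, y) plus nonnegative slack in z).
    x1, y1, x2, y2 = rng.normal(size=4)
    a1 = np.array([x1, y1, x1**2 + y1**2 + abs(rng.normal())])
    a2 = np.array([x2, y2, x2**2 + y2**2 + abs(rng.normal())])
    lam = rng.uniform()
    assert in_S((1 - lam) * a1 + lam * a2)
```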
$\max\{c^T x : x \in P\}$.
Solution:
(i): The disk with center (3, 4) and radius 1.
(ii): The left half of the disk with center (0, 0) and radius 1, combined with the
rectangle with corners (0, 1), (0, −1), (1, 1), (1, −1), combined with the disk with
center (1, 0) and radius 1.
(iii): We can write $A = \begin{pmatrix} 5 \\ 0 \end{pmatrix} + y \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ and $B = x \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $0 \leq x \leq 1$. If we set $x = 1$ we obtain the line
$$\begin{pmatrix} 5 \\ 0 \end{pmatrix} + y \begin{pmatrix} -2 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 1 \end{pmatrix} + y \begin{pmatrix} -2 \\ 1 \end{pmatrix}.$$
It follows that $A + B$ is the region between the parallel lines
$$\begin{pmatrix} 5 \\ 0 \end{pmatrix} + y \begin{pmatrix} -2 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 6 \\ 1 \end{pmatrix} + y \begin{pmatrix} -2 \\ 1 \end{pmatrix}.$$
(iv): The rectangle with vertices (0, 1), (0, 4), (3, 1), (3, 4).
Exercise 1.14. (i) Prove that, for every λ ∈ IR and A, B ⊆ IRn , it holds that
λ(A + B) = λA + λB.
(ii) Is it true that (λ + µ)A = λA + µA for every λ, µ ∈ IR and A ⊆ IRn ? If
not, find a counterexample.
(iii) Show that, if λ, µ ≥ 0 and A ⊆ IRn is convex, then (λ + µ)A = λA + µA.
Solution:
(i): A general element in $A + B$ is of the form $a + b$ with $a \in A$, $b \in B$. Since $\lambda(a + b) = \lambda a + \lambda b \in \lambda A + \lambda B$, it follows that $\lambda(A + B) \subseteq \lambda A + \lambda B$. The reverse inclusion follows in the same way.
(ii): In general no. If $\lambda = -\mu \neq 0$, the set on the left is $\{0\}$, while the set on the right is $\lambda(A - A)$, which need not be $\{0\}$: for $A = [0, 1]$, $\lambda = 1$, $\mu = -1$, the left hand side is $\{0\}$ while the right hand side is $[-1, 1]$.
(iii): Let $A$ be convex, and let $a_1, a_2 \in A$. If $\lambda + \mu > 0$, then
$$\lambda a_1 + \mu a_2 = (\lambda + \mu)\left(\frac{\lambda}{\lambda+\mu} a_1 + \frac{\mu}{\lambda+\mu} a_2\right) \in (\lambda + \mu)A,$$
since the coefficients inside the parentheses are nonnegative and sum to one. This shows $\lambda A + \mu A \subseteq (\lambda+\mu)A$. The reverse inclusion $(\lambda+\mu)A \subseteq \lambda A + \mu A$ holds for any set, since $(\lambda+\mu)a = \lambda a + \mu a$. The case $\lambda = \mu = 0$ is trivial.
Exercise 1.19. Show that the unit ball $B_\infty = \{x \in \mathbb{R}^n : \|x\|_\infty \leq 1\}$ is convex. Here $\|x\|_\infty = \max_j |x_j|$ is the max norm of $x$. Show that $B_\infty$ is a polyhedron. Illustrate when $n = 2$.
Solution: It is straightforward to show that $\|\cdot\|_\infty$ is a norm. Convexity thus follows from Exercise 1.4, since the proof therein applies for any norm. That $B_\infty$ is a polyhedron follows from writing $\|x\|_\infty \leq 1$ equivalently as the system of linear inequalities $x_i \leq 1$, $-x_i \leq 1$ for $i = 1, \dots, n$. For $n = 2$ we obtain the square with vertices (1, 1), (−1, 1), (1, −1), (−1, −1).
Exercise 1.20. Show that the unit ball $B_1 = \{x \in \mathbb{R}^n : \|x\|_1 \leq 1\}$ is convex. Here $\|x\|_1 = \sum_{j=1}^n |x_j|$ is the absolute norm of $x$. Show that $B_1$ is a polyhedron. Illustrate when $n = 2$.
Solution: It is straightforward to show that $\|\cdot\|_1$ is a norm. Convexity thus follows as above. That $B_1$ is a polyhedron follows from writing $\|x\|_1 = \sum_{j=1}^n |x_j| \leq 1$ equivalently as $\sum_{j=1}^n \pm x_j \leq 1$, where the signs traverse all possible $2^n$ combinations. For $n = 2$ there are four possible sign choices, leading to the polyhedron defined by $x + y \leq 1$, $x - y \leq 1$, $-x + y \leq 1$, $-x - y \leq 1$. This gives the square with vertices (1, 0), (0, 1), (−1, 0), (0, −1).
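The equivalence between the norm description and the $2^n$ sign inequalities can also be checked numerically; the sketch below (ours, assuming Python with NumPy) compares the two membership tests on random points for $n = 3$:

```python
import itertools
import numpy as np

def in_B1_by_norm(x):
    # Direct test ||x||_1 <= 1.
    return np.sum(np.abs(x)) <= 1 + 1e-12

def in_B1_by_inequalities(x):
    # Test all 2^n inequalities sum_j s_j x_j <= 1 with signs s_j in {+1, -1}.
    return all(np.dot(s, x) <= 1 + 1e-12
               for s in itertools.product([1, -1], repeat=len(x)))

rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.uniform(-1.5, 1.5, size=3)
    assert in_B1_by_norm(x) == in_B1_by_inequalities(x)
```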
Exercise 1.21. Prove Proposition 1.5.1.
Solution: Let $x_0 \in C$ ($C$ is assumed nonempty), and let $L = C - x_0 = \{x - x_0 : x \in C\}$ (so that $C = L + x_0$). Assume that $C$ is affine. For $x \in C$ and $\lambda \in \mathbb{R}$ we have that
$$\lambda(x - x_0) = \lambda x + (1 - \lambda)x_0 - x_0 \in L,$$
since $\lambda x + (1-\lambda)x_0$ is an affine combination of points in $C$; a similar computation shows that $L$ is closed under addition, so $L$ is a linear subspace. Note also that $L$ does not depend on the choice of $x_0$: for another $x_1 \in C$ we have $C - x_1 = (C - x_0) - (x_1 - x_0)$, and $x_1 - x_0 \in L$, so $C - x_1 = L$. The other way, if $L$ is a linear subspace of $\mathbb{R}^n$ and $x_0 \in \mathbb{R}^n$, we can find a matrix $A$ with null space $L$. If $Ax_0 = b$, the solution set of $Ax = b$ is then the affine set $C = L + x_0$.
$x_1 - x_2 = l_1 - l_2 \in L,$
This is a convex combination of the three points, since the coefficients sum to
1 − µ + µ(1 − λ + λ) = 1.
Exercise 2.3. Show that conv(S) is convex for all $S \subseteq \mathbb{R}^n$. (Hint: look at two convex combinations $\sum_j \lambda_j x_j$ and $\sum_j \mu_j y_j$, and note that both these points may be written as a convex combination of the same set of vectors.)
Solution: Let $x = \sum_{j=1}^t \lambda_j x_j$ and $y = \sum_{j=1}^s \mu_j y_j$ be convex combinations of points in $S$. By taking the union of the points $x_j$ and $y_j$, both $x$ and $y$ can be written as convex combinations of the same set of points $\{z_j\}_{j=1}^r$ (some of the coefficients may be zero). Writing $x = \sum_{j=1}^r \lambda_j z_j$ and $y = \sum_{j=1}^r \mu_j z_j$, we obtain for $0 \leq \lambda \leq 1$ that
$$(1-\lambda)x + \lambda y = \sum_{j=1}^r \big((1-\lambda)\lambda_j + \lambda\mu_j\big) z_j.$$
Clearly the coefficients (1 − λ)λj + λµj sum to one, so that this is also a convex
combination of points in S. It follows that conv(S) is convex.
Exercise 2.4. Give an example of two distinct sets S and T having the same
convex hull. It makes sense to look for a smallest possible subset S0 of a set S
such that S = conv(S0 ). We study this question later.
Solution: We can set S = {−1, 0, 1}, and T = {−1, 1}. Both have [−1, 1] as
convex hull.
Exercise 2.5. Prove that if S ⊆ T , then conv(S) ⊆ conv(T ).
Solution: This follows from the fact that a convex combination of points in $S$ is also a convex combination of points in $T$.
Exercise 2.6. If S is convex, then conv(S) = S. Show this!
Solution: This is a compulsory exercise. We already know that $S \subseteq \text{conv}(S)$. The converse statement is also shown in the text, but let us repeat it. Assume that we have shown that any convex combination of $t - 1$ points in $S$ lies in $S$. We can write
$$\sum_{j=1}^t \lambda_j x_j = \Big(\sum_{j=1}^{t-1} \lambda_j\Big) \sum_{j=1}^{t-1} \frac{\lambda_j}{\sum_{i=1}^{t-1} \lambda_i} x_j + \lambda_t x_t.$$
By the induction hypothesis $\sum_{j=1}^{t-1} \frac{\lambda_j}{\sum_{i=1}^{t-1} \lambda_i} x_j \in S$. But then also $\sum_{j=1}^t \lambda_j x_j \in S$ by convexity of $S$.
Exercise 2.7. Let $S = \{x \in \mathbb{R}^2 : \|x\|_2 = 1\}$; this is the unit circle in $\mathbb{R}^2$. Determine conv(S) and cone(S).
Solution: conv(S) must contain any point inside the unit circle. This can be seen
if you take any line through this point. This line will intersect the unit circle at
two points, so that the point is a convex combination of two points on the unit
circle. Therefore D ⊆ conv(S), where D = {x ∈ IR2 : kxk2 ≤ 1}. Since conv(S)
is the smallest convex set that contains S and since D is convex and contains S,
we obtain that conv(S) ⊆ D. It follows that conv(S) = D.
If $x \in \mathbb{R}^2$, $x \neq 0$, then $x = \|x\|_2 u$ with $u = x/\|x\|_2 \in S$. It follows that $x \in \text{cone}(S)$. Since also $0 \in \text{cone}(S)$, we get that $\text{cone}(S) = \mathbb{R}^2$.
Exercise 2.8. Does affine independence imply linear independence? Does linear
independence imply affine independence? Prove or disprove!
Solution: Linear independence (of the columns of $A$) is the same as
$$Ax = 0 \Rightarrow x = 0.$$
Affine independence of the columns of $A$ is the same as
$$\begin{pmatrix} A \\ 1 \cdots 1 \end{pmatrix} x = 0 \Rightarrow x = 0,$$
i.e., it is the same as linear independence of the columns of $A$ with a last component with a one added. Clearly then linear independence implies affine independence (since equality in the first $n$ components already implies that the coefficients must be zero). Affine independence does not imply linear independence, however: if $A$ has $n$ rows, $\begin{pmatrix} A \\ 1 \cdots 1 \end{pmatrix}$ can have rank $n + 1$, but $A$ can have rank at most $n$. In particular, there can be $n + 1$ linearly independent column vectors in $\begin{pmatrix} A \\ 1 \cdots 1 \end{pmatrix}$, but only $n$ in $A$.
Exercise 2.9. Let x1 , . . . , xt ∈ IRn be affinely independent and let w ∈ IRn . Show
that x1 + w, . . . , xt + w are also affinely independent.
Solution: The condition
$$\sum_i \lambda_i (x_i + w) = 0 \quad\text{and}\quad \sum_i \lambda_i = 0$$
is equivalent to
$$\sum_i \lambda_i x_i = 0 \quad\text{and}\quad \sum_i \lambda_i = 0,$$
since $\sum_i \lambda_i w = 0$ when $\sum_i \lambda_i = 0$. The result follows.
Exercise 2.10. Let L be a linear subspace of dimension (in the usual linear
algebra sense) t. Check that this coincides with our new definition of dimension
above. (Hint: add O to a “suitable” set of vectors).
Solution: Let x1 , . . . , xt be a basis for L. {0, x1 , . . . , xt } are affinely independent
since x1 − 0, . . . , xt − 0 are linearly independent. Therefore the affine dimension of
L is ≥ t. If the affine dimension of L was larger than t, we could find at least t + 2
affinely independent points x1 , . . . , xt+2 in L, so that x2 − x1 , . . . , xt+2 − x1 are
linearly independent. There are t + 1 vectors here, all in L, so that the dimension
of L is ≥ t + 1. This is a contradiction. It follows that the affine dimension equals
the dimension.
Exercise 2.11. Prove the last statements in the previous paragraph.
Solution: With $A$ being the set of all $\sum_{j=1}^t \lambda_j x_j$ with $\sum_{j=1}^t \lambda_j = 1$, we have that
$$(1 - \lambda) \sum_j \lambda_j x_j + \lambda \sum_j \mu_j x_j = \sum_j \big((1 - \lambda)\lambda_j + \lambda \mu_j\big) x_j \in A,$$
since the coefficients sum to one (by expansion we can clearly assume the same base set $x_j$ for the two combinations). It follows that $A$ is affine. Choosing one $\lambda_j = 1$ and the others zero, we see that $x_1, \dots, x_{d+1}$ are in $A$. Then $A$ must in particular contain all the convex combinations of these points. Since conv(C) = C, $A$ contains $C$. Any affine set that contains $C$ must also contain these combinations, so $A$ is the smallest one.
Exercise 2.12. Construct a set which is neither open nor closed.
Solution: An example is the half-open interval A = [0, 1).
Exercise 2.13. Show that xk → x if and only if xkj → xj for j = 1, . . . , n. Thus,
convergence of a point sequence simply means that all the component sequences
are convergent.
Solution: If $x^k \to x$, then given $\epsilon > 0$ we can find an $N$ so that $\|x^k - x\|_2 \leq \epsilon$ for $k \geq N$. Since $|x^k_j - x_j| \leq \|x^k - x\|_2$, it follows that also $|x^k_j - x_j| \leq \epsilon$ for $k \geq N$, so that $x^k_j \to x_j$ for $j = 1, \dots, n$.
The other way, if $x^k_j \to x_j$ for $j = 1, \dots, n$, we can find an $N$ so that $|x^k_j - x_j| \leq \epsilon/\sqrt{n}$ for all $k \geq N$ and $j = 1, \dots, n$. But then, for $k \geq N$,
$$\|x^k - x\|_2 = \sqrt{\sum_{j=1}^n |x^k_j - x_j|^2} \leq \sqrt{\sum_{j=1}^n \epsilon^2/n} = \epsilon,$$
so that $x^k \to x$.
Exercise 2.15. Prove that x ∈ bd(S) if and only if each ball with center x
intersects both S and the complement of S.
Solution: That $x \in \text{bd}(S)$ is the same as $x \in \text{cl}(S)$ and $x \notin \text{int}(S)$. Now, $x \notin \text{int}(S)$ is the same as: every ball with center $x$ intersects the complement of $S$; and $x \in \text{cl}(S)$ is the same as: every ball with center $x$ intersects $S$. The result follows.
Exercise 2.16. Consider again the set $C = \{(x_1, x_2, 0) \in \mathbb{R}^3 : x_1^2 + x_2^2 \leq 1\}$. Verify that
(i) C is closed,
(ii) dim(C) = 2,
(iii) int(C) = ∅,
(iv) bd(C) = C,
(v) rint(C) = $\{(x_1, x_2, 0) \in \mathbb{R}^3 : x_1^2 + x_2^2 < 1\}$ and
(vi) rbd(C) = $\{(x_1, x_2, 0) \in \mathbb{R}^3 : x_1^2 + x_2^2 = 1\}$.
Solution:
(i): Assume that $x^k \to x$ with $x^k = (x_k, y_k, 0) \in C$. Since all $x_k^2 + y_k^2 \leq 1$, then also $x^2 + y^2 \leq 1$ in the limit. Since all third components are zero, the third component of the limit is also zero. It follows that $x \in C$, so that $C$ is closed.
(ii): It is possible to find only 3 affinely independent points in $C$, since the third component is always zero. The dimension is thus $\leq 2$. To see that the dimension is exactly 2, choose for instance the three affinely independent points (0, 0, 0), (1, 0, 0), and (0, 1, 0).
(iii): Any ball around a point in $C$ will contain points with nonzero third component, so that no ball can be entirely contained in $C$. It follows that int(C) = ∅.
(iv): bd(C) = cl(C) \ int(C) = C \ ∅ = C.
(v): Clearly aff(C) is the entire $x_1 x_2$-plane. If $x_1^2 + x_2^2 < 1$, then in $\mathbb{R}^2$ we can find a ball $B^o((x_1, x_2), r)$ contained in $B^o((0, 0), 1)$. We then obtain
$$B^o(x, r) \cap \text{aff}(C) = \{(z_1, z_2, 0) \in \mathbb{R}^3 : (z_1 - x_1)^2 + (z_2 - x_2)^2 < r^2\} = B^o((x_1, x_2), r) \times \{0\} \subseteq C,$$
so that $x \in \text{rint}(C)$.
Exercise 2.17. Show that every polytope in $\mathbb{R}^n$ is bounded. (Hint: use the properties of the norm: $\|x + y\| \leq \|x\| + \|y\|$ and $\|\lambda x\| = \lambda\|x\|$ when $\lambda \geq 0$.)
Solution: Let $C = \text{conv}(\{x_1, \dots, x_t\})$, and let $x = \sum_{j=1}^t \lambda_j x_j$ with the $\lambda_j$ nonnegative and summing to 1. Using the triangle inequality we obtain
$$\|x\| = \Big\|\sum_{j=1}^t \lambda_j x_j\Big\| \leq \sum_{j=1}^t \lambda_j \|x_j\| \leq \max_j \|x_j\| \sum_{j=1}^t \lambda_j = \max_j \|x_j\|.$$
Solution: Any convex combination of values between 0 and 1 lies between 0 and
1. Since the coordinates of all the points lie between 0 and 1, conv(S) must consist
of points with coordinates between 0 and 1 as well, i.e., conv(S) ⊆ {(x1 , x2 , x3 ) ∈
IR3 : 0 ≤ xi ≤ 1 for i = 1, 2, 3}. The other way, since the points constitute all
vertices of the unit cube, conv(S) will contain all the edges of the unit cube (by
taking convex combinations of adjacent vertices). By taking convex combinations
of edges which lie on different sides of a face, we see that all faces of the cube are also
contained in conv(S). By taking convex combinations of different faces we obtain
the entire cube, so that {(x1 , x2 , x3 ) ∈ IR3 : 0 ≤ xi ≤ 1 for i = 1, 2, 3} ⊆ conv(S).
Equality now follows.
Now, let us exclude the point (1, 1, 1). In all remaining seven points the coordinates sum to at most 2, so that this applies to conv(S \ {(1, 1, 1)}) as well, which is thus contained in the halfspace $x_1 + x_2 + x_3 \leq 2$. Consider the polyhedron described by $0 \leq x_1, x_2, x_3 \leq 1$ and $x_1 + x_2 + x_3 \leq 2$. It is easily verified that the vertices of this set are exactly the points of S \ {(1, 1, 1)}. Since the polyhedron is bounded, and since the vertices equal the extreme points (see Chapter 4), it follows that conv(S \ {(1, 1, 1)}) equals this polyhedron.
Exercise 2.25. Let $A, B \subseteq \mathbb{R}^n$. Prove that conv(A + B) = conv(A) + conv(B). Hint: it is useful to consider the sum $\sum_{j,k} \lambda_j \mu_k (a_j + b_k)$ where $a_j \in A$, $b_k \in B$, $\lambda_j \geq 0$, $\mu_k \geq 0$, $\sum_j \lambda_j = 1$ and $\sum_k \mu_k = 1$.
Solution: We have that
$$\sum_{j,k} \lambda_j \mu_k (a_j + b_k) = \Big(\sum_j \lambda_j\Big)\Big(\sum_k \mu_k b_k\Big) + \Big(\sum_k \mu_k\Big)\Big(\sum_j \lambda_j a_j\Big) = \sum_k \mu_k b_k + \sum_j \lambda_j a_j.$$
The right hand side is a general element in conv(A) + conv(B). Since the left
hand side is a convex combination of elements in A + B, it follows that conv(A) +
conv(B) ⊆ conv(A + B). The other way, conv(A) + conv(B) is clearly convex
(the sum of two convex sets is always convex), so that it equals its convex hull.
Therefore
conv(A + B) ⊆ conv(conv(A) + conv(B)) = conv(A) + conv(B),
and the result follows.
Exercise 2.26. When $S \subset \mathbb{R}^n$ is a finite set, say $S = \{x_1, \dots, x_t\}$, then we have
$$\text{conv}(S) = \Big\{\sum_{j=1}^t \lambda_j x_j : \lambda_j \geq 0 \text{ for each } j, \; \sum_j \lambda_j = 1\Big\}.$$
Solution: Let S be the integers. Clearly conv(S) = IR, but, for any finite subset
S0 , conv(S0 ) = [min(S0 ), max(S0 )]. This means that we can’t use any finite subset
to describe conv(S).
Exercise 2.27. Let x0 ∈ IRn and let C ⊆ IRn be a convex set. Show that
Solution:
Affine independence of $x_1, \dots, x_t$ is the same as linear independence of the columns of $\begin{pmatrix} X \\ 1 \cdots 1 \end{pmatrix}$. That $x_i$ can be written as an affine combination of the remaining ones means that column $i$ in $\begin{pmatrix} X \\ 1 \cdots 1 \end{pmatrix}$ can be written as a linear combination of the other columns. The result now follows from the fact that a set of vectors is linearly independent if and only if none of the vectors can be written as a linear combination of the others.
Exercise 2.31. Prove that x1 , . . . , xt ∈ IRn are affinely independent if and only
if the vectors (x1 , 1), . . . , (xt , 1) ∈ IRn+1 are linearly independent.
Solution: See Exercise 2.8.
Exercise 2.32. Prove Proposition 2.3.2.
Solution: Assume that $x = \sum_{j=1}^t \lambda_j x_j$ where $\sum_j \lambda_j = 1$. Then
$$\begin{pmatrix} X \\ 1 \cdots 1 \end{pmatrix} \lambda = \begin{pmatrix} x \\ 1 \end{pmatrix}.$$
The convex hull consists of the vectors $x$ for which this has a nonnegative solution $\lambda$. Affine independence means that the columns on the left hand side are linearly independent, which implies that the solution $\lambda$ is unique when it exists. This in turn implies that $x$ has a unique representation.
Exercise 2.33. Prove that cl(A1 ∪ . . . ∪ At ) = cl(A1 ) ∪ . . . ∪ cl(At ) holds whenever
A1 , . . . , At ⊆ IRn .
Solution: cl(A1 ) ∪ . . . ∪ cl(At ) is a closed set containing A1 ∪ . . . ∪ At . Since
cl(A1 ∪ . . . ∪ At ) is the smallest such set, it follows that
cl(A1 ∪ . . . ∪ At ) ⊆ cl(A1 ) ∪ . . . ∪ cl(At ).
The other way, since Ai ⊆ A1 ∪ . . . ∪ At , cl(Ai ) ⊆ cl(A1 ∪ . . . ∪ At ) (since a closed
set S containing A1 ∪ · · · ∪ At also must contain Ai ). But then also
cl(A1 ) ∪ . . . ∪ cl(At ) ⊆ cl(A1 ∪ . . . ∪ At ),
and the result follows.
Exercise 2.34. Prove that every bounded point sequence in IRn has a convergent
subsequence.
Solution: We first prove the result for $n = 1$. Due to boundedness there exists an $M$ so that $|x_i| \leq M$ for all $i$. Partition the interval $[-M, M]$ into two parts of equal length. One of these parts must contain infinitely many of the $x_i$; choose one $x_{i_1}$ from this part, split the part in two again, choose an $x_{i_2}$ with $i_2 > i_1$ from a half that again contains infinitely many of the $x_i$, and so on. The new sequence $y$ defined by $y_k = x_{i_k}$ is clearly a Cauchy sequence (from index $k$ on its terms lie in an interval of length $2M/2^k$), so it is convergent. For general $n$, extract first a subsequence where the first components converge, from this a further subsequence where the second components converge, and so on; since any subsequence of a convergent sequence is also convergent, the final subsequence converges in every component, and hence converges by Exercise 2.13.
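The bisection construction can be mimicked in code. The sketch below (ours) works on a long finite prefix of the sequence, so the test "contains infinitely many terms" is replaced by "contains the most remaining terms"; the picked values then lie in nested halved intervals and are nearly constant at the end:

```python
import math

def bisection_subsequence(x, steps=20):
    # Keep the half-interval holding most of the remaining terms and pick one
    # term from it with a larger index than the previous pick.
    lo, hi = min(x), max(x)
    picks, last = [], -1
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = [i for i in range(last + 1, len(x)) if lo <= x[i] <= mid]
        right = [i for i in range(last + 1, len(x)) if mid < x[i] <= hi]
        half, (lo, hi) = (left, (lo, mid)) if len(left) >= len(right) else (right, (mid, hi))
        if not half:
            break
        last = half[0]
        picks.append(x[last])
    return picks

seq = [math.sin(k) for k in range(100000)]   # bounded but not convergent
print(bisection_subsequence(seq)[-5:])        # the last picks are nearly equal
```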
Exercise 2.35. Find an infinite set of closed intervals whose union is the open interval (0, 1). This proves that the union of an infinite set of closed intervals may not be a closed set.
Solution: We have that $(0, 1) = \bigcup_{n=3}^{\infty} [1/n, 1 - 1/n]$.
Exercise 2.36. Let S be a bounded set in IRn . Prove that cl(S) is compact.
Solution: Since cl(S) is closed, we only have to prove that cl(S) is bounded.
Since S is bounded, we can find a (closed) ball B(0, r) so that S ⊆ B(0, r). Since
B(0, r) is a closed set containing S, it follows that cl(S) ⊆ B(0, r). Since B(0, r)
is bounded, so is cl(S).
Exercise 2.37. Let S ⊆ IRn . Show that either int(S) = rint(S) or int(S) = ∅.
Solution: We know that we can write aff(S) = L + x0 , where L is a vector space,
and where x0 can be chosen to be any x0 ∈ aff(S) (in particular we can choose
any x0 ∈ S). If L has dimension n, aff(S) = IRn . In this case clearly the relative
interior equals the interior (their definitions coincide).
Assume now that $L$ has dimension $m < n$. Let $x_0, \dots, x_m$ be a maximum number of affinely independent points in $S$ (i.e., the $x_i - x_0$ are linearly independent and span $L$). Find a vector $x$ so that $x - x_0$ cannot be written as a linear combination of the $\{x_i - x_0\}_{i=1}^m$ (possible since $m < n$). I claim that, for any choice of $\lambda_i$ and any $\epsilon > 0$,
$$x_0 + \sum_{i=1}^m \lambda_i (x_i - x_0) + \epsilon(x - x_0)$$
cannot be in aff(S). Otherwise we would have that
$$x_0 + \sum_{i=1}^m \lambda_i (x_i - x_0) + \epsilon(x - x_0) = x_0 + \sum_{i=1}^m \mu_i (x_i - x_0)$$
for some $\mu_i$ (since $x_0 + \sum_{i=1}^m \mu_i (x_i - x_0)$ describes a general element in aff(S)). From this one could write $x - x_0$ as a linear combination of the $x_i - x_0$, which is a contradiction. Since $\epsilon$ was arbitrary, every ball around a point of aff(S) contains points outside aff(S), so that int(aff(S)) = ∅. But then also int(S) = ∅.
Exercise 2.38. Prove Theorem 2.4.3. Hint: To prove that rint(C) is convex, use
Theorem 2.4.2. Concerning int(C), use Exercise 2.37. Finally, to show that cl(C)
is convex, let x, y ∈ cl(C) and consider two point sequences that converge to x
and y, respectively. Then look at a convex combination of x and y and construct
a suitable sequence!
Solution: This is a compulsory exercise. Let C be convex.
Let $x, y \in \text{rint}(C)$. Since $y \in \text{rint}(C) \subseteq C \subseteq \text{cl}(C)$, it follows from Theorem 2.4.2 that $(1 - \lambda)x + \lambda y \in \text{rint}(C)$ for all $0 < \lambda < 1$. It follows that rint(C) is convex.
Exercise 2.37 says that int(C) is either ∅ or rint(C). In either case int(C) is convex.
Let $x, y \in \text{cl}(C)$, and let $\{x_k\}$, $\{y_k\}$ be sequences from $C$ which converge to $x$ and $y$, respectively. Then $(1 - \lambda)x_k + \lambda y_k$ is a sequence from $C$ (due to convexity), and it converges to $(1 - \lambda)x + \lambda y$. It follows that $(1 - \lambda)x + \lambda y \in \text{cl}(C)$, so that cl(C) is convex.
Note that, for small α, this is still a nonnegative linear combination, and when
α = 0 it is just the original representation of x. But now we gradually increase or
decrease α from zero until one of the coefficients λj − αµj becomes zero, say this
happens for α = α0 . Recall here that each λj is positive and that µ1 6= 0. Then
each coefficient λj − α0 µj is nonnegative and at least one of them is zero. But this
means that we have found a new representation of x as a nonnegative combination
of t − 1 vectors from S. Clearly, this reduction process may be continued until
we have x written as a nonnegative combination of, say, m linearly independent
points in S. Finally, there are at most n linearly independent points in IRn , so
m ≤ n.
where the equality between the first and second line results from Exercise 1.14(i),
and the equality between the second and third line results from convexity of B
combined with (iii) in the same exercise. For sufficiently small $\epsilon$ we have that $x_1 + \frac{1+\lambda}{1-\lambda}\epsilon B \subseteq C$, so that the above is in $(1 - \lambda)C + \lambda C \subseteq C$, due to convexity. It follows that $(1 - \lambda)x_1 + \lambda x_2 + \epsilon B \subseteq C$, so that $(1 - \lambda)x_1 + \lambda x_2 \in \text{int}(C) = \text{rint}(C)$. This completes the proof.
Note that this proof simplifies in the sense that the case x2 ∈ C needs not be
handled separately first.
Chapter 3
Exercise 3.1. Give an example where the nearest point is unique, and one where
it is not. Find a point x and a set S such that every point of S is a nearest point
to x!
Solution: Let S be the unit disk in IR2 , and let x = (2, 0). Then clearly s = (1, 0)
is the unique nearest point.
Let S be the unit circle in IR2 , and let x = (0, 0). Then there is no unique nearest
point, since every point in S has the same distance to x, so that every point in
S is a nearest point.
Exercise 3.2. Let a ∈ IRn \ {O} and x0 ∈ IRn . Then there is a unique hyperplane
H that contains x0 and has normal vector a. Verify this and find the value of the
constant α (see above).
Solution: The hyperplane is aT x = aT x0 . This hyperplane is unique since any
other value of α will exclude x0 from the set.
Exercise 3.3. Give an example of two disjoint sets S and T that cannot be
separated by a hyperplane.
Solution: Let S be the circle in IR2 consisting of points with modulus 1, T the
circle in IR2 consisting of points with modulus 2.
Exercise 3.4. In view of the previous remark, what about the separation of S
and a point p 6∈ aff(S)? Is there an easy way to find a separating hyperplane?
Solution:
Exercise 3.5. Let C ⊆ IRn be convex. Recall that if a point x0 ∈ C satisfies ( 3.2)
for any y ∈ C, then x0 is the (unique) nearest point to x in C. Now, let C be the
unit ball in IRn and let x ∈ IRn satisfy kxk > 1. Find the nearest point to x in C.
What if kxk ≤ 1?
Solution: If $\|x\| > 1$ then clearly $x_0 = x/\|x\|$ is the nearest point. If $\|x\| \leq 1$ then $x$ itself is the nearest point.
Exercise 3.6. Let L be a line in IRn . Find the nearest point in L to a point x ∈
IRn . Use your result to find the nearest point on the line L = {(x, y) : x + 3y = 5}
to the point (1, 2).
Solution: This is a compulsory exercise. Let the line be written on the form $x_0 + ta$ ($a$ being the direction vector of the line). The closest point to $x$ in the subspace spanned by $a$ is $\frac{\langle x, a\rangle}{\langle a, a\rangle} a$. It follows that the closest point on the line to $x$ is
$$\frac{\langle x - x_0, a\rangle}{\langle a, a\rangle}\, a + x_0.$$
The line $L$ can be parametrized as $(5 - 3y, y) = (5, 0) + y(-3, 1)$. We get that $x - x_0 = (1, 2) - (5, 0) = (-4, 2)$, and
$$\frac{\langle x - x_0, a\rangle}{\langle a, a\rangle}\, a + x_0 = \frac{12 + 2}{10}(-3, 1) + (5, 0) = (-21/5, 7/5) + (5, 0) = (4/5, 7/5).$$
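A small numerical check of this projection formula (a sketch assuming Python with NumPy; the helper name is ours):

```python
import numpy as np

def nearest_point_on_line(x, x0, a):
    # Projection of x onto the line {x0 + t*a : t in R}.
    t = np.dot(x - x0, a) / np.dot(a, a)
    return x0 + t * a

# L = {(x, y) : x + 3y = 5}, parametrized as (5, 0) + y*(-3, 1).
p = nearest_point_on_line(np.array([1.0, 2.0]),
                          np.array([5.0, 0.0]),
                          np.array([-3.0, 1.0]))
print(p)                 # [0.8 1.4], i.e. (4/5, 7/5)
print(p[0] + 3 * p[1])   # 5.0, so the point indeed lies on L
```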
Exercise 3.7. Let H be a hyperplane in IRn . Find the nearest point in H to a
point x ∈ IRn . In particular, find the nearest point to each of the points (0, 0, 0)
and (1, 2, 2) in the hyperplane H = {(x1 , x2 , x3 ) : x1 + x2 + x3 = 1}.
Solution: A general hyperplane in $\mathbb{R}^n$ can be written on the form $x_0 + L$, where $L$ is an $(n-1)$-dimensional subspace. Let $a$ span the orthogonal complement of $L$. We subtract $x_0$, subtract the projection onto $a$ (i.e., project onto the orthogonal complement of $a$), and add $x_0$ back to obtain the nearest point
$$x_0 + (x - x_0) - \frac{\langle x - x_0, a\rangle}{\langle a, a\rangle}\, a = x - \frac{\langle x - x_0, a\rangle}{\langle a, a\rangle}\, a.$$
For the plane in question $a = (1, 1, 1)$, so $\langle a, a\rangle = 3$, and we can use $x_0 = (1, 0, 0)$. For the point (0, 0, 0) we obtain
$$x - \frac{\langle x - x_0, a\rangle}{\langle a, a\rangle}\, a = \frac{1}{3}(1, 1, 1),$$
while for the point (1, 2, 2) we obtain
$$x - \frac{\langle x - x_0, a\rangle}{\langle a, a\rangle}\, a = (1, 2, 2) - \frac{4}{3}(1, 1, 1) = (-1/3, 2/3, 2/3).$$
Exercise 3.8. Let $L$ be a linear subspace in $\mathbb{R}^n$ and let $q_1, \dots, q_t$ be an orthonormal basis for $L$. Thus, $q_1, \dots, q_t$ span $L$, $q_i^T q_j = 0$ when $i \neq j$ and $\|q_j\| = 1$ for each $j$. Let $Q$ be the $n \times t$-matrix whose $j$th column is $q_j$, for $j = 1, \dots, t$. Define the associated matrix $P = QQ^T$. Show that $Px$ is the nearest point in $L$ to $x$. (The matrix $P$ is called an orthogonal projector (or projection matrix).) Thus, performing the projection is simply to apply the linear transformation given by $P$. Let $L^{\perp}$ be the orthogonal complement of $L$. Explain why $(I - P)x$ is the nearest point in $L^{\perp}$ to $x$.
Solution: The orthogonal decomposition theorem states that the closest point is
$$\sum_{j=1}^t \langle x, q_j\rangle q_j = \sum_{j=1}^t (Q^T x)_j\, q_j = QQ^T x = Px.$$
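A quick numerical illustration (ours, assuming Python with NumPy): build an orthonormal basis with a QR factorization, form $P = QQ^T$, and check that $Px$ beats every other candidate point of $L$, while $(I - P)x$ is orthogonal to $L$:

```python
import numpy as np

rng = np.random.default_rng(2)

A = rng.normal(size=(4, 2))
Q, _ = np.linalg.qr(A)      # columns of Q: orthonormal basis of a subspace L
P = Q @ Q.T                 # orthogonal projector onto L

x = rng.normal(size=4)
px = P @ x

for _ in range(1000):       # Px is at least as close to x as any Qc in L
    c = rng.normal(size=2)
    assert np.linalg.norm(x - px) <= np.linalg.norm(x - Q @ c) + 1e-12

assert np.allclose(Q.T @ (x - px), 0)   # (I - P)x is orthogonal to L
```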
Exercise 3.13. Let C = [0, 1]×[0, 1] ⊂ IR2 and let a = (2, 2). Find all hyperplanes
that separates C and a.
Solution: Consider lines $y = sx + b$ (here $s$ denotes the slope; the exercise uses $a$ for the point (2, 2)). Let $0 \leq s \leq 1/2$. Then $y = sx + b$ separates the two sets if $b > 1$ and $2s + b < 2$ (the square lies below the line, the point above it).
Let $s \leq 0$. Then we have separation if $s + b > 1$ and $2s + b < 2$.
Let $s \geq 2$. Then we have separation if $s + b < 0$ and $2s + b > 2$ (the square lies above the line, the point below it).
Exercise 3.14. Let C be the unit ball in IRn and let a 6∈ C. Find a hyperplane
that separates C and a.
Solution: Set $x_0 = \frac{1}{2}(a + a/\|a\|)$. We can then use the hyperplane $a^T x = a^T x_0 = \alpha$.
Exercise 3.15. Find an example in IR2 of two sets that have a unique separating
hyperplane.
Solution: The left and right half planes give an example.
Exercise 3.16. Let S, T ⊆ IRn . Explain the following fact: there exists a hyper-
plane that separates S and T if and only if there is a linear function l : IRn → IR
such that l(s) ≤ l(t) for all s ∈ S and t ∈ T . Is there a similar equivalence for
the notion of strong separation?
Solution: Separation means that $a^T x \leq \alpha$ on $S$ and $a^T x \geq \alpha$ on $T$. Setting $l(x) = a^T x$ (which is linear), this means that $l(x) \leq \alpha$ on $S$ and $l(x) \geq \alpha$ on $T$. This proves one direction. The other direction follows by setting $\alpha$ to be any number in the interval $[\sup_{x \in S} l(x), \inf_{y \in T} l(y)]$.
For strong separation, the corresponding condition is that there must be some $\epsilon > 0$ so that $l(x) \leq \alpha - \epsilon$ on $S$ and $l(x) \geq \alpha + \epsilon$ on $T$.
Exercise 3.17. Let $C$ be a nonempty closed convex set in $\mathbb{R}^n$. Then the associated projection operator $p_C$ is Lipschitz continuous with Lipschitz constant 1, i.e.,
$$\|p_C(x) - p_C(y)\| \leq \|x - y\| \quad\text{for all } x, y \in \mathbb{R}^n.$$
(Such an operator is called nonexpansive.) You are asked to prove this using the following procedure. Define $a = x - p_C(x)$ and $b = y - p_C(y)$. Verify that $(a - b)^T (p_C(x) - p_C(y)) \geq 0$. (Show first that $a^T(p_C(y) - p_C(x)) \leq 0$ and $b^T(p_C(x) - p_C(y)) \leq 0$ using (3.2). Then consider $\|x - y\|^2 = \|(a - b) + (p_C(x) - p_C(y))\|^2$ and do some calculations.)
Solution: From (3.2) it follows that (x − pC (x))T (y − pC (x)) = aT (y − pC (x)) ≤ 0
for all y ∈ C. If we in particular set y = pC (y) we obtain aT (pC (y) − pC (x)) ≤ 0.
Similarly, (x − pC (y))T (y − pC (y)) = bT (x − pC (y)) ≤ 0 for all x ∈ C. If we in
particular set x = pC (x) we obtain bT (pC (x) − pC (y)) ≤ 0.
We now obtain $(a - b)^T(p_C(x) - p_C(y)) = -a^T(p_C(y) - p_C(x)) - b^T(p_C(x) - p_C(y)) \geq 0$, and therefore
$$\|x - y\|^2 = \|(a - b) + (p_C(x) - p_C(y))\|^2 = \|a - b\|^2 + 2(a - b)^T(p_C(x) - p_C(y)) + \|p_C(x) - p_C(y)\|^2 \geq \|p_C(x) - p_C(y)\|^2,$$
so that $\|p_C(x) - p_C(y)\| \leq \|x - y\|$.
Exercise 4.1. Consider the polytope P ⊂ IR2 being the convex hull of the points
(0, 0), (1, 0) and (0, 1) (so P is a simplex in IR2 ).
(i) Find the unique face of P that contains the point (1/3, 1/2).
(ii) Find all the faces of P that contain the point (1/3, 2/3).
(iii) Determine all the faces of P .
Solution:
(i) The face is the entire polytope.
(ii) The faces containing (1/3, 2/3) are the edge from (1, 0) to (0, 1), and the polytope P itself.
(iii) The three vertices, the three edges, and the polytope itself.
Exercise 4.2. Explain why an equivalent definition of face is obtained using the
condition: if whenever x1 , x2 ∈ C and (1/2)(x1 + x2 ) ∈ F , then x1 , x2 ∈ F .
Solution: The old condition clearly implies that the new condition holds (simply choose λ = 1/2). The other way, assume that the new condition holds, and let $x_1, x_2 \in C$ be such that $x = (1 - \lambda)x_1 + \lambda x_2 \in F$ for some $0 < \lambda < 1$. If $\lambda \leq 1/2$, then $x$ is the midpoint between $x_1$ and $(1 - 2\lambda)x_1 + 2\lambda x_2$, and since both these are on the line segment between $x_1$ and $x_2$ (and hence in $C$), it follows from the new condition that $x_1 \in F$ and $(1 - 2\lambda)x_1 + 2\lambda x_2 \in F$. This can be repeated until we find a point of $F$ lying on the second half of the line segment between $x_1$ and $x_2$. In the same way, if $\lambda \geq 1/2$, it follows that $x_2 \in F$. Combining these two proves the result.
Exercise 4.3. Prove this proposition!
Solution: Let x1 , x2 ∈ C, and let (1 − λ)x1 + λx2 ∈ F2 . Since F2 ⊆ F1 and F1
is a face of C, it follows that x1 , x2 ∈ F1 . Since F2 is a face of F1 it follows that
x1 , x2 ∈ F2 as well. It follows that F2 is a face of C.
Exercise 4.4. Define C = {(x1 , x2 ) ∈ IR2 : x1 ≥ 0, x2 ≥ 0, x1 + x2 ≤ 1}. Why
does C not have any extreme halfline? Find all the extreme points of C.
Solution: C is bounded. Existence of an extreme halfline would imply C to be
unbounded. The extreme points of C are clearly (0, 0), (1, 0), and (0, 1).
Since $x_1 \in P$ and $\sum_{j=2}^t \frac{\lambda_j}{1 - \lambda_1} x_j \in P$, the extreme point property implies that $x = x_1$.
Not every xj needs to be an extreme point: simply add an xj which already is in
the convex hull of the previous ones to see this.
Exercise 4.6. Show that rec(C, x) is a closed convex cone. First, verify that $z \in \text{rec}(C, x)$ implies that $\mu z \in \text{rec}(C, x)$ for each $\mu \geq 0$. Next, in order to verify convexity you may show that
$$\text{rec}(C, x) = \bigcap_{\lambda > 0} \frac{1}{\lambda}(C - x)$$
for some xi ∈ C, λi ≥ 0. The λi must sum to one, so that the right hand side can
be written as (1, w) for some w ∈ C (by convexity). It follows that x + z = w.
i.e., adding any element from C to x gives another element in C. We prove the
following statement, which implies that x ∈ rec(C):
x ∈ rec(C) ⇐⇒ C + x ⊆ C.
rint(M ∩ C) = M ∩ rint(C)
cl(M ∩ C) = M ∩ cl(C).
Solution: Since polytopes are bounded, we have that rec(P ) = lin(P ) = {0}.
Exercise 4.10. Let C be a closed convex cone in IRn . Show that rec(C) = C.
Solution: If x ∈ C then (1 + λ)x = x + λx ∈ C for any λ ≥ 0, so that x ∈ rec(C).
The other way, if x ∈ rec(C) then 0 + 1 · x = x ∈ C since 0 ∈ C. The result
follows.
Exercise 4.11. Prove that lin(C) is a linear subspace of IRn .
Solution: Recall that lin(C) = rec(C) ∩ (−rec(C)). If $x \in \text{rec}(C)$ then $-x \in -\text{rec}(C)$, and if $x \in -\text{rec}(C)$ then $-x \in \text{rec}(C)$; hence lin(C) is closed under multiplication with −1. Since rec(C) and −rec(C) are closed under multiplication with nonnegative scalars, lin(C) is thus closed under multiplication with all scalars. Since both rec(C) and −rec(C) are closed under addition, lin(C) is closed under addition as well, and the result follows.
Exercise 4.12. Show that rec({x : Ax ≤ b}) = {x : Ax ≤ O}.
Solution: Let C = {x : Ax ≤ b}, and assume that z ∈ rec(C, x). Then Ax ≤ b
and A(x + λz) ≤ b for all λ ≥ 0. Since A(x + λz) = Ax + λAz, this is clearly only
possible if Az ≤ 0. This implies the ⊆-direction.
The other way, if Az ≤ 0, then, for any x ∈ C, λ ≥ 0, we have that A(x + λz) ≤
b + 0 = b, so that x + λz ∈ C. It follows that z ∈ rec(C). This proves the other
direction.
Exercise 4.13. Let C be a line-free closed convex set and let F be an extreme
halfline of C. Show that then there is an x ∈ C and a z ∈ rec(C) such that
F = x + cone({z}).
Solution: An extreme halfline can be written as $\{x + tz\}_{t \geq 0}$ for some $x \in C$ and some vector $z \neq 0$. Since the extreme halfline is contained in $C$, it follows that $z \in \text{rec}(C, x) = \text{rec}(C)$.
Exercise 4.14. Decide if the following statement is true: if z ∈ rec(C) then
x + cone({z}) is an extreme halfline of C.
Solution: No. Let C = IRn . Then rec(C) = IRn , and C has no extreme half lines.
Exercise 4.15. Consider again the set $C = \{(x_1, x_2, 0) \in \mathbb{R}^3 : x_1^2 + x_2^2 \leq 1\}$
from Exercise 2.16. Convince yourself that C equals the convex hull of its relative
boundary. Note that we here have bd(C) = C so the fact that C is the convex
hull of its boundary is not very impressive!
Solution: Its relative boundary has been shown to be $\{(x_1, x_2, 0) : x_1^2 + x_2^2 = 1\}$, and the convex hull of this circle is clearly $C$.
Exercise 4.16. Let H be a hyperplane in IRn . Prove that H 6= conv(rbd(H)).
Solution: The relative boundary of a hyperplane is ∅, so that conv(rbd(H)) = ∅ ≠ H.
Exercise 4.17. Consider a polyhedral cone C = {x ∈ IRn : Ax ≤ O} (where, as
usual, A is a real m × n-matrix). Show that O is the unique vertex of C.
Solution: This is a compulsory exercise. We can write the system as $Ax \leq b$, where
$$A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \\ 0 & -2 \\ 8 & -1 \\ -1 & -1 \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 1 \\ -5 \\ 16 \\ -4 \end{pmatrix}.$$
These equations give $\binom{5}{2} = 10$ possible $2 \times 2$ subsystems, of which 9 are invertible.
For four of the systems we have that 2y = 5 (where there is equality in the third
equation) so that y = 5/2. For these we get
• The first equation gives equality: We get the point (5/2, 5/2). This violates
the fourth inequality, however.
• The second equation gives equality: We get the point (3/2, 5/2). This is
feasible.
• The fourth equation gives equality: We get the point (37/16, 5/2). This is
feasible.
• The fifth equation gives equality: We get the point (3/2, 5/2), which was
obtained above.
So, so far we have obtained two extreme points. We also get two systems where
the first inequality gives equality (i.e. x = y):
• The fourth equation gives equality: We get the point (16/7, 16/7). This
violates the third inequality, however.
• The fifth equation gives equality: We get the point (2, 2). This also violates
the third inequality.
We also get two systems where the second inequality gives equality (i.e. y = x+1):
• The fourth equation gives equality: We get the point (17/7, 24/7). This is
feasible.
• The fifth equation gives equality: We get the point (3/2, 5/2), which was
obtained above.
Finally, if there is equality in the last two inequalities, we obtain the point
(20/9, 16/9), but this violates the third inequality.
The extreme points are thus
• (3/2, 5/2) (equality in the second, third, fifth inequalities)
• (37/16, 5/2) (equality in the third, fourth inequalities)
• (17/7, 24/7) (equality in the second, fourth inequalities)
When sketching this area, one sees that the first and last constraints can be elimi-
nated. The extreme points are the intersections between the remaining constraints
(number 2, 3, and 4).
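The enumeration of $2 \times 2$ subsystems is easy to automate. A sketch (ours, assuming Python with NumPy, and assuming the system $Ax \leq b$ as reconstructed above):

```python
import itertools
import numpy as np

A = np.array([[1, -1], [-1, 1], [0, -2], [8, -1], [-1, -1]], dtype=float)
b = np.array([0, 1, -5, 16, -4], dtype=float)

extreme_points = set()
for i, j in itertools.combinations(range(len(b)), 2):
    sub_A, sub_b = A[[i, j]], b[[i, j]]
    if abs(np.linalg.det(sub_A)) < 1e-12:
        continue                        # singular 2x2 subsystem, skip
    x = np.linalg.solve(sub_A, sub_b)   # point where both constraints are active
    if np.all(A @ x <= b + 1e-9):       # keep only the feasible candidates
        extreme_points.add(tuple(np.round(x, 6)))

print(sorted(extreme_points))
# [(1.5, 2.5), (2.3125, 2.5), (2.428571, 3.428571)]
```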
Exercise 4.23. Find all the extreme halflines of the cone IRn+ .
Solution: Clearly any halfline where all except one component is zero, i.e., cone({$e_i$}) for a standard unit vector $e_i$, is an extreme halfline. Assume conversely that an extreme halfline $F$ contains a point with two nonzero components; for simplicity, say $(x_1, x_2, 0, \dots, 0) \in F$ with $x_1, x_2 > 0$. Then $(x_1, x_2, 0, \dots, 0) = \frac{1}{2}(2x_1, 0, \dots, 0) + \frac{1}{2}(0, 2x_2, 0, \dots, 0)$, so since $F$ is a face, both $(2x_1, 0, \dots, 0)$ and $(0, 2x_2, 0, \dots, 0)$ must lie in $F$. These two points do not lie on a common halfline, so $F$ cannot be a halfline. The extreme halflines are therefore exactly the nonnegative coordinate axes.
Exercise 4.24. Determine the recession cone of the set {(x1 , x2 ) ∈ IR2 : x1 >
0, x2 ≥ 1/x1 }. What are the extreme points?
Solution: Clearly the recession cone is IR2+ . All points in the first quadrant on
the graph y = 1/x are extreme points.
Exercise 4.25. Let $B$ be the unit ball in $\mathbb{R}^n$ (in the Euclidean norm). Show that every point in $B$ can be written as a convex combination of two of the extreme points of $B$.
Solution: The extreme points of $B$ are the boundary points of $B$ (the points of norm one). A boundary point is trivially such a convex combination (use weights 1 and 0). Any interior point of $B$ lies on a chord of the ball and can therefore be written as a convex combination of the two endpoints of that chord, which are boundary points.
Exercise 4.26. Let $C$ be a compact convex set in $\mathbb{R}^n$ and let $f : C \to \mathbb{R}$ be a function satisfying
$$f\Big(\sum_{j=1}^t \lambda_j x_j\Big) \leq \sum_{j=1}^t \lambda_j f(x_j)$$
for all convex combinations of points in $C$.
Solution: The inequality implies that if the maximum of $f$ over the extreme points of $C$ is attained at some extreme point $x$, then the maximum of $f$ over all of $C$ is attained at $x$ (every point of $C$ is a convex combination of extreme points). It turns out, however, that this maximum may not be attained: Let $C = \{x \in \mathbb{R}^2 : \|x\| \leq 1\}$, and consider the function $f$ defined to be zero on the interior of $C$, and defined on the boundary so that it is both $\geq 0$ and unbounded there. This $f$ satisfies the inequality above, but the maximum is not attained, due to unboundedness on the boundary.
Exercise 4.27. Prove Corollary 4.4.5 using the Main theorem for polyhedra.
Solution: Clearly a polytope is a bounded polyhedron. Assume now that P =
conv(V ) + cone(Z) is a bounded polyhedron, where V and Z are finite sets.
Boundedness implies that Z is empty, so that P = conv(V ). It follows that P is
a polytope.
Exercise 4.28. Let S ⊆ {0, 1}n , i.e., S is a set of (0, 1)-vectors. Define the
polytope P = conv(S). Show that x is a vertex of P if and only if x ∈ S.
Solution: For polytopes, vertices and extreme points coincide.
Clearly any $x \in S$ is an extreme point of $P$: a line through $x$ must strictly increase or decrease at least one component, and since that component of $x$ is 0 or 1 while every component of a vector in $P$ lies between 0 and 1, $x$ cannot lie in the relative interior of a segment contained in $P$. It follows that every point of $S$ is an extreme point, hence a vertex of $P$.
Conversely, assume that $x$ is an extreme point of $P$ with $0 < x_i < 1$ for some $i$. Since all points of $S$ have components 0 or 1, $x$ must then be a nontrivial convex combination of two different points of $P$, contradicting that it is an extreme point. Therefore all components of an extreme point are either 0 or 1, and since a 0/1 vector outside $S$ lies outside $P$ (the extreme points of conv(S) all belong to $S$), the vertices of $P$ are exactly the points of $S$.
Exercise 4.29. Let S ⊆ {0, 1}3 consist of the points (0, 0, 0), (1, 1, 1), (0, 1, 0)
and (1, 0, 1). Consider the polytope P = conv(S) and find a linear system defining
it.
Solution: Since all four points have equal first and third coordinates, we can eliminate the third component by requiring $x_3 = x_1$. The convex hull of the four points (0, 0), (0, 1), (1, 0), and (1, 1) in $\mathbb{R}^2$ is clearly described by $0 \leq x_1, x_2 \leq 1$. A possible linear system defining the polytope is thus
$$x_1, x_2, x_3 \geq 0, \quad x_1, x_2 \leq 1, \quad x_3 - x_1 = 0.$$
x ∈ cone({b0 , b1 , . . . , bk }).
Exercise 4.33. Let P = conv({v1 , . . . , vk }) + cone({z1 , . . . , zm }) ⊆ IRn . Define
new vectors in IRn+1 by adding a component which is 1 for all the v. -vectors and
a component which is 0 for all the z. -vectors, and let C be the cone spanned by
these new vectors. Thus,
for x ∈ P as well, so that valid inequalities can be summed to obtain new valid
inequalities. For weighted sums,
so that we get new valid inequalities here as well, as long as the weights are nonnegative. It follows that the stated set is a cone.
Chapter 5
Convex functions
$$f(x_2) \leq \frac{f(x_3) - f(x_1)}{x_3 - x_1}(x_2 - x_1) + f(x_1).$$
Reorganizing this gives that
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} \leq \frac{f(x_3) - f(x_1)}{x_3 - x_1},$$
which states that slope($P_{x_1}, P_{x_2}$) $\leq$ slope($P_{x_1}, P_{x_3}$) (i.e. (ii)). This proves that (i) and (ii) are equivalent.
(i) is also equivalent to
$$f(x_2) \leq \frac{f(x_3) - f(x_1)}{x_3 - x_1}(x_2 - x_3) + f(x_3).$$
Reorganizing this gives that
$$\frac{f(x_3) - f(x_1)}{x_3 - x_1} \leq \frac{f(x_3) - f(x_2)}{x_3 - x_2},$$
which states that slope($P_{x_1}, P_{x_3}$) $\leq$ slope($P_{x_2}, P_{x_3}$) (i.e. (iii)).
Exercise 5.2. Show that the sum of convex functions is a convex function, and
that λf is convex if f is convex and λ ≥ 0 (here λf is the function given by
(λf )(x) = λf (x)).
Solution: We have that
$$\sum_k f_k((1-\lambda)x + \lambda y) \leq \sum_k \big((1-\lambda)f_k(x) + \lambda f_k(y)\big) = (1-\lambda)\sum_k f_k(x) + \lambda \sum_k f_k(y),$$
which shows that $\sum_k f_k$ also is convex. We also have that
$$\mu f((1-\lambda)x + \lambda y) \leq \mu\big((1-\lambda)f(x) + \lambda f(y)\big) = (1-\lambda)\mu f(x) + \lambda \mu f(y),$$
so that $\mu f$ is convex when $\mu \geq 0$.
which leads to
$$\prod_{j=1}^r x_j^{\lambda_j} \leq \sum_{j=1}^r \lambda_j x_j.$$
Exercise 5.5. Repeat Exercise 5.2, but now for convex functions defined on some
convex set in IRn .
Solution: The proof goes in the same way.
Exercise 5.6. Verify that every linear function from IRn to IR is convex.
Solution: If $f$ is linear we have that
$$f((1-\lambda)x + \lambda y) = (1-\lambda)f(x) + \lambda f(y),$$
so that $f$ is convex, with equality holding in the definition of convexity.
where we used that h is affine and that f is convex. It follows that f ◦ h also is
convex.
Exercise 5.8. Let f : C → IR be convex and let w ∈ IRn . Show that the function
x → f (x + w) is convex.
Solution: This follows from the previous exercise since x → x + w is affine.
Exercise 5.9. Prove Theorem 5.2.3 (just apply the definitions).
Solution: Assume that f is convex. Let (x, s) ∈ epi(f ), (y, t) ∈ epi(f ). We have
that
f ((1 − λ)x + λy) ≤ (1 − λ)f (x) + λf (y) ≤ (1 − λ)s + λt.
It follows that (1 − λ)(x, s) + λ(y, t) = ((1 − λ)x + λy, (1 − λ)s + λt) ∈ epi(f ), so
that epi(f ) is convex.
Assume now that epi(f ) is convex. Since (x, f (x)) and (y, f (y)) are in epi(f ), also
((1−λ)x+λy, (1−λ)f (x)+λf (y)) ∈ epi(f ). This implies that f ((1−λ)x+λy) ≤
(1 − λ)f (x) + λf (y), so that f is convex.
Exercise 5.10. By the result above we have that if f and g are convex functions,
then the function max{f, g} is also convex. Prove this result directly from the
definition of a convex function.
Solution: This is a compulsory exercise. We have that
$$f((1-\lambda)x + \lambda y) \leq (1-\lambda)f(x) + \lambda f(y) \leq (1-\lambda)\max\{f, g\}(x) + \lambda \max\{f, g\}(y),$$
$$g((1-\lambda)x + \lambda y) \leq (1-\lambda)g(x) + \lambda g(y) \leq (1-\lambda)\max\{f, g\}(x) + \lambda \max\{f, g\}(y).$$
Taking the maximum of the two left hand sides gives $\max\{f, g\}((1-\lambda)x + \lambda y) \leq (1-\lambda)\max\{f, g\}(x) + \lambda \max\{f, g\}(y)$, so that $\max\{f, g\}$ is convex.
$$f(x) = x^T A x + c^T x + \alpha$$
for some (symmetric) matrix $A \in \mathbb{R}^{n \times n}$, a vector $c \in \mathbb{R}^n$ and a scalar $\alpha \in \mathbb{R}$. Discuss whether $f$ is convex.
Solution: The Hessian of f is 2A, so f is convex if and only if A is positive
semidefinite.
Exercise 5.16. Assume that f and g are convex functions defined on an interval I. Determine which of the following functions are convex or concave:
(i) λf where λ ∈ IR,
(ii) min{f, g},
(iii) |f |.
Solution:
Exercise 5.27. Let $C \subseteq \mathbb{R}^n$ be a convex set and consider the distance function $d_C$ defined in (3.1), i.e., $d_C(x) = \inf\{\|x - c\| : c \in C\}$. Show that $d_C$ is a convex function.
Solution: Let $x, y$ be given, and given $\epsilon > 0$ find $x_1, y_1 \in C$ so that $\|x - x_1\| \leq d_C(x) + \epsilon$ and $\|y - y_1\| \leq d_C(y) + \epsilon$. Since $C$ is convex, $(1 - \lambda)x_1 + \lambda y_1 \in C$. We have that
$$d_C((1-\lambda)x + \lambda y) \leq \|(1-\lambda)x + \lambda y - ((1-\lambda)x_1 + \lambda y_1)\| \leq (1-\lambda)\|x - x_1\| + \lambda\|y - y_1\| \leq (1-\lambda)d_C(x) + \lambda d_C(y) + \epsilon.$$
Since this applies for all $\epsilon > 0$, it follows that $d_C((1-\lambda)x + \lambda y) \leq (1-\lambda)d_C(x) + \lambda d_C(y)$ as well, so that $d_C$ is convex.
Exercise 5.28. Prove Corollary 6.1.1 using Theorem 5.3.5.
Solution: Let $x^*$ be a local minimum. Since $f$ is convex, Theorem 5.3.5 says that $f(x) \geq f(x^*) + \nabla f(x^*)^T(x - x^*)$ for all $x \in C$. If $\nabla f(x^*) = 0$ this says that $f(x) \geq f(x^*)$, i.e., $x^*$ is a global minimum. Therefore (iii) implies (ii), and (ii) obviously implies (i).
Assume finally that $x^*$ is a local minimum, and assume for contradiction that $\nabla f(x^*) \neq 0$. Note that $\nabla f(y)^T \nabla f(x^*) > 0$ for $y$ in some neighbourhood of $x^*$ (at least if the gradient is continuous). It is now better to use the following Taylor formula:
$$f(x) = f(x^*) + \nabla f(x^* + t(x - x^*))^T (x - x^*)$$
for some $0 < t < 1$. By choosing $x = x^* - \alpha \nabla f(x^*)$ and $\alpha > 0$ small enough, we get
$$f(x) = f(x^*) - \alpha \nabla f(x^* + t(x - x^*))^T \nabla f(x^*) < f(x^*),$$
which contradicts that we have a local minimum. This proves that (i) implies (iii), and the proof is complete.
Exercise 5.29. Compare the notion of support for a convex function to the notion
of supporting hyperplane of a convex set (see section 3.2). Have in mind that f
is convex if and only if epi(f ) is a convex set. Let f : IRn → IR be convex and
consider a supporting hyperplane of epi(f ). Interpret the hyperplane in terms of
functions, and derive a result saying that every convex function has a support at
every point.
so that
$$t = -(a_1, \dots, a_n)^T(x_1 - y_1, \dots, x_n - y_n) + f(y).$$
Denote this affine function of $x$ by $h(x)$. Since $a^T x \geq \alpha$ on epi(f), it follows from the above that $h(x) \leq f(x)$ for all $x$, so that $h$ is a support of $f$ at $y$.
All this can be more compactly explained in terms of graphs: The graph of the
hyperplane lies below the graph of f . The supporting hyperplane is viewed as the
tangent plane of the graph.
Chapter 6
Exercise 6.1. Consider the least squares problem minimize kAx − bk over all
x ∈ IRn . From linear algebra we know that the optimal solutions to this problem
are precisely the solutions to the linear system (called the normal equations)
AT Ax = AT b.
Show this using optimization theory by considering the function f (x) = kAx−bk2 .
Solution: The gradient of $f(x) = \|Ax - b\|^2 = x^T A^T A x - 2b^T A x + b^T b$ is $\nabla f(x) = 2A^T A x - 2A^T b$. Since $f$ is convex, the minimizers are exactly the points where the gradient is zero, i.e., where $A^T A x = A^T b$.
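A small numerical illustration (ours, assuming Python with NumPy): solving the normal equations gives the same minimizer as NumPy's built-in least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)

# Solve the normal equations A^T A x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Compare with NumPy's least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))   # True
```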
Exercise 6.2. Prove that the optimality condition is correct in Example 6.2.1.
Solution: Assume that $x^*_i = 0$. Let $x = x^* + \epsilon e_i \in C$ for $\epsilon > 0$. Then $\nabla f(x^*)^T(x - x^*) = \epsilon \frac{\partial f}{\partial x_i}(x^*)$, which by the optimality condition must be greater than or equal to zero, so $\frac{\partial f}{\partial x_i}(x^*) \geq 0$.
Assume that $x^*_i > 0$. Then $x = x^* \pm \epsilon e_i \in C$ for $\epsilon > 0$ small enough. These two choices give $\pm \epsilon \frac{\partial f}{\partial x_i}(x^*)$ as values for $\nabla f(x^*)^T(x - x^*)$. If both of these are $\geq 0$ then clearly $\frac{\partial f}{\partial x_i}(x^*) = 0$.
Exercise 6.3. Consider the problem to minimize a (continuously differentiable)
convex function f subject to x ∈ C = {x ∈ IRn : O ≤ x ≤ p} where p is some
nonnegative vector. Find the optimality conditions for this problem. Suggest a
numerical algorithm for solving this problem.
Solution: The constraints can be written as $-x_i \leq 0$ and $x_i - p_i \leq 0$, which have gradients $-e_i$ and $e_i$, respectively. Go through all possibilities of active constraints.
If $0 < x_i < p_i$, the gradient equation says that $\frac{\partial f}{\partial x_i} = 0$. If for instance $x_i = 0$, we would add $-\mu_i e_i$ to the gradient equation; this is the same as $\frac{\partial f}{\partial x_i} = \mu_i \geq 0$. If for instance $x_i = p_i$, we would add $\mu_i e_i$ to the gradient equation; this is the same as $\frac{\partial f}{\partial x_i} = -\mu_i \leq 0$.
We thus arrive at the following optimality conditions: $\frac{\partial f}{\partial x_i} \geq 0$ if $x_i = 0$, $\frac{\partial f}{\partial x_i} \leq 0$ if $x_i = p_i$, and $\frac{\partial f}{\partial x_i} = 0$ if $0 < x_i < p_i$.
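The exercise also asks for a numerical algorithm. A natural suggestion (ours; the original solution does not spell one out) is projected gradient descent: take a gradient step and project back onto the box by clamping each coordinate to $[0, p_i]$. A minimal sketch, with an assumed test function and step size:

```python
import numpy as np

def projected_gradient(grad_f, p, x0, step=0.1, iters=1000):
    # Minimize a smooth convex f over the box 0 <= x <= p.
    x = np.clip(x0, 0, p)
    for _ in range(iters):
        x = np.clip(x - step * grad_f(x), 0, p)   # gradient step, then project
    return x

# Example: f(x) = ||x - c||^2 with c partly outside the box, so some
# box constraints are active at the minimizer.
c = np.array([2.0, -1.0, 0.5])
p = np.array([1.0, 1.0, 1.0])
print(projected_gradient(lambda x: 2 * (x - c), p, np.zeros(3)))
# approximately [1.0, 0.0, 0.5]
```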
We see that x ≥ 0.
From the second component we see that the second constraint must be active
and y > 0. The third constraint can thus never be active. We are left with two
possibilities: the first constraint may or may not be active.
Assume first that the first constraint is active. We then have that y = x − 2 and
y 2 = x, so that y 2 − y − 2 = 0. This gives that y = 2 or y = −1. y = −1 can be
discarded, so that we obtain the candidate (4, 2). We have that f (4, 2) = 14.
Finally assume that the first constraint is not active. The gradient equation can now be written as
$$\begin{pmatrix} 2x \\ -1 \end{pmatrix} = \mu_2 \begin{pmatrix} 1 \\ -2y \end{pmatrix}.$$
Taking ratios we see that $-2x = -\frac{1}{2y}$, so that $4xy = 1$ (and both $x$ and $y$ must be $> 0$). Inserting this in $y^2 = x$ gives that $4y^3 = 1$, so that $x = 4^{-2/3}$, $y = 4^{-1/3}$, and
$$f(4^{-2/3}, 4^{-1/3}) = 4^{-4/3} - 4^{-1/3} = -\tfrac{3}{4}\, 4^{-1/3} < 0.$$
It follows that this is the constrained minimum.
We should also comment on the possibility of having linearly dependent active
constraint gradients. The only problematic part here can be when the first two
constraints both are active, but this is covered by the calculations above, which
lead to f (4, 2) = 14.
We should also comment that the area we minimize over is bounded, so that there
actually exists a global minimum, which is the one we have found.
More on the proof of Theorem 1.2 in [2]
F (G) ⊆ Q follows from Lemma 1.1: Since (1.4) holds for all incidence vectors
of forests, it also holds for their convex hull, i.e., F (G) ⊆ Q. The other way,
as Q is compact and convex, Minkowski’s theorem yields that Q is the convex
hull of its extreme points. According to Chapter 4 in [1], as Q is a polyhedron,
vertices and extreme points coincide, and faces and exposed faces are the same.
It follows that it is enough to show that any unique optimal solution to a problem
on the form max{cT x : x ∈ Q} is also in F (G). We will actually show that such
a unique optimal solution must be an incidence vector for a forest, which is an
even stronger statement.
It is smart to consult section 1.4 in [21] here, where one learns the following
greedy algorithm for constructing a maximum weight forest: If you at any stage
in the algorithm have the components U1 , . . . , Uk , join two of the components
with an edge of maximum overall weight (there may be more than one such),
and terminate when there are no such edges left with positive weight. Since the
later edges added were also candidates at previous iterations, the weights are decreasing: $c(e_1) \geq c(e_2) \geq \dots \geq c(e_r) > 0$ (the edges $e_i$ are the ones found by the algorithm). This is a crucial point, which is not commented on in the proof.
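A sketch of the greedy procedure just described (our illustration, in Python; a union-find structure keeps track of the components):

```python
def greedy_max_weight_forest(n, edges):
    # edges: list of (weight, u, v) with vertices 0..n-1. Repeatedly add the
    # heaviest remaining positive-weight edge that joins two components.
    parent = list(range(n))

    def find(v):                        # union-find with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = []
    for w, u, v in sorted(edges, reverse=True):   # decreasing weights
        if w <= 0:
            break                       # no positive-weight edges left
        ru, rv = find(u), find(v)
        if ru != rv:                    # the edge joins two different components
            parent[ru] = rv
            forest.append((u, v, w))
    return forest

print(greedy_max_weight_forest(4, [(3, 0, 1), (2, 1, 2), (-1, 2, 3), (1, 0, 2)]))
# [(0, 1, 3), (1, 2, 2)]  -- a maximum weight forest of total weight 5
```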
Let us comment on why the upper triangular system
$$y_{V_j} + \sum_{i : e_j \in E(V_i)} y_{V_i} = c(e_j) \tag{6.1}$$
can now be repeated to show that yVj1 ≥ 0, yVj2 ≥ 0, and so on, proving that all
yVj ≥ 0.
For dual feasibility we also need to explain why
$$\sum_{i : e \in E(V_i)} y_{V_i} \geq c(e)$$
when e is different from the ei added by the algorithm. If e joins two different
components of the forest we end up with, we must have that c(e) ≤ 0, and the
equation follows. Assume thus that e joins two vertices in the same component
of the forest. At some point in the algorithm, the two end nodes of e are joined
into the same component by means of one edge $e_k$. As $e$ was also a candidate for this join, we must have that $c(e_k) \geq c(e)$ by maximality of $c(e_k)$ in the algorithm. But then
$$\sum_{i : e \in E(V_i)} y_{V_i} = \sum_{i : e_k \in E(V_i)} y_{V_i} = c(e_k) \geq c(e).$$
Chapter 4 in [2]
This is the statement on the sixth line. Since $\sum_{k \in K} \mu_k x_k$ is a general element in $P_{I_1}$, the last line follows.
and that
1. An edge in E(Ti ∩ H) contributes both in (ii) and (iv).
2. An edge in E(Ti \ H) contributes both in (ii) and (iii).
3. An edge in δ(H) ∩ E(Ti ) contributes both in (ii) and (6.2).
Therefore, if (i), (ii), and (iii) are scaled by $\frac{1}{2}$, and these three are added for all $i$ together with (6.2), the left hand side will become $x(E(H)) + \sum_{i=1}^k x(E(T_i))$. On the right hand side we obtain
$$|H| + \frac{1}{2}\sum_{i=1}^k \big(|T_i| - 1 + |T_i \setminus H| - 1 + |T_i \cap H| - 1\big) = |H| + \frac{1}{2}\sum_{i=1}^k (2|T_i| - 3) = |H| + \sum_{i=1}^k (|T_i| - 1) - \frac{k}{2}.$$
Exercise 4.1: The coordinate change $x_j = \frac{b}{a_j} y_j$ turns the problem into maximizing $b\sum_{j=1}^n \frac{c_j}{a_j} y_j$ under the constraints $\sum_{j=1}^n y_j \leq 1$ and $0 \leq y_j \leq \frac{a_j}{b}$. Clearly we must choose $y_1 = \min(1, \frac{a_1}{b})$, i.e., $x_1 = \min(\frac{b}{a_1}, 1)$.
$$a_k \geq b - a_1 - \dots - a_{k-1}, \tag{6.3}$$
i.e., $\sum_{i=1}^{k-1} a_i < b \leq \sum_{i=1}^k a_i$. We obtain the optimal solution
$$\Big(1, \dots, 1, \frac{b - \sum_{i=1}^{k-1} a_i}{a_k}, 0, \dots, 0\Big), \tag{6.4}$$
and the optimal value is $c_1 + \dots + c_{k-1} + \frac{c_k\big(b - \sum_{i=1}^{k-1} a_i\big)}{a_k}$,
which is the same expression we found when solving the primal problem.
The vertices are obtained by collecting all possible maxima as the $c_i$ are varied (this gives all exposed faces, which constitute the vertices for polyhedra). By permuting the $c_i$ in particular we see that all vectors obtained by permuting the entries in (6.4) are also vertices. Let us take a look at how many possible such
entries in (6.4) also are vertices. Let us take a look at how many possible such
permutations there are. The actual number depends on the k we found in (6.3).
If the middle number in (6.4) is in $(0, 1)$, the number of vertices is $n\binom{n-1}{k}$, where the binomial coefficient comes from the number of ways to place $k$ zeros among $n - 1$ numbers.
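The greedy structure of the optimal solution (6.4) can be illustrated in code. A sketch (ours), assuming the problem is $\max \sum_j c_j x_j$ subject to $\sum_j a_j x_j \leq b$, $0 \leq x_j \leq 1$, with the variables already ordered by decreasing ratio $c_j/a_j$:

```python
def greedy_lp(c, a, b):
    # Fill the variables to their upper bound 1 in the given (ratio-sorted)
    # order until the budget b is used up; the last variable may be fractional.
    x, remaining = [], b
    for cj, aj in zip(c, a):
        xj = min(1.0, max(0.0, remaining / aj))
        x.append(xj)
        remaining -= aj * xj
    return x, sum(cj * xj for cj, xj in zip(c, x))

# Example with ratios 3 > 2 > 1 and budget b = 5.
print(greedy_lp(c=[6, 4, 2], a=[2, 2, 2], b=5))
# ([1.0, 1.0, 0.5], 11.0)
```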
Exercise 4.2: No. That $v(R(u)) > v(Q)$ only implies that the LP relaxation has an optimal value which is not integral. The optimal node in the enumeration tree may still be below $u$, so that we cannot prune.
Exercise 4.3: Assume that x represents a Hamilton tour. Then clearly (i) holds.
Also, if W is as described in (ii), along the Hamilton tour we must pass at least
twice between an element in W and an element outside W , so that x(δ(W )) ≥ 2.
On the other hand, suppose (i) and (ii) are fulfilled. (i) ensures that exactly two edges are used at each vertex, so that the only possibility is to have one tour, or
several subtours. Assume the latter, and let W be the node set of one of those
subtours. Then no edges are entering or leaving W , so that x(δ(W )) = 0, contra-
dicting (ii). It follows that x represents a Hamilton tour.
Exercise 4.4:
Exercise 4.5: The constraints x(E[S]) ≤ |S| − 1 (for all S ⊆ V , S 6= ∅, S 6= V ),
xe ≥ 0 forces x to be the incidence vector of a forest.
We should now add the degree constraints x(δ(v)) ≤ bv (bv is the constrained
degree at v), for all v ∈ V .
Finally we should add constraints enforcing a tree. For this we can add the con-
straints x(δ(W )) ≥ 1 (for all W ⊆ V , W 6= ∅, W 6= V ).
Exercise 4.6:
Exercise 4.7:
Exercise 4.8:
Exercises from [3]
Exercise 10: Since the capacities are integral, there exists a maximum flow which is integral (Theorem 1.6). This implies that all edges have either unit or zero flow. Due to flow conservation, each vertex has the same number of incoming unit-flow edges as outgoing unit-flow edges. Start at $s$ and follow edges with unit flow, all the way to $t$ (if this were impossible, we would have a contradiction to flow balancing). If one removes this $st$-path from $D$, one still has a flow of the same type. One can in this way take out one edge-disjoint $st$-path at a time, until there are no edges left.
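The peeling argument can be turned into a small algorithm. A sketch (ours, in Python), where a 0/1 flow is given as a collection of directed edges carrying unit flow:

```python
def peel_paths(unit_edges, s, t):
    # Decompose a 0/1 flow into edge-disjoint s-t paths by repeatedly
    # following unit-flow edges from s; flow conservation guarantees that
    # every vertex we reach (other than t) has an unused outgoing edge.
    out = {}
    for u, v in unit_edges:
        out.setdefault(u, []).append(v)
    paths = []
    while out.get(s):                  # while some unit-flow edge leaves s
        path, u = [s], s
        while u != t:
            v = out[u].pop()           # use (and remove) one unit-flow edge
            path.append(v)
            u = v
        paths.append(path)
    return paths

flow = [('s', 'a'), ('a', 'b'), ('b', 't'), ('s', 'c'), ('c', 't')]
print(peel_paths(flow, 's', 't'))
# [['s', 'c', 't'], ['s', 'a', 'b', 't']] -- two edge-disjoint s-t paths
```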
Exercise 11: From the previous exercise it is clear that the maximum number of edge-disjoint $st$-paths equals the maximum flow, which again equals the capacity of the minimum cut, which is $\sum_{e \in K} c(e) = |K|$.
Exercise 12: The upper and lower bounds should be defined as follows:
• For e = (ui , vj ), we set l(e) = baij c, r(e) = daij e.
• For e = (s, ui ), we set l(e) = bri c, r(e) = dri e.
• For e = (vj , t), we set l(e) = bsj c, r(e) = dsj e.
Hoffman's circulation theorem ensures that, if a circulation exists in this graph, an integral circulation also exists. Such an integral circulation represents a solution to the matrix rounding problem.
Exercise 13: See the proof of Exercise 4.28.
Exercise 14: That every permutation matrix is a vertex follows directly from the previous exercise. Integral matrices in $\Omega_n$ must have exactly one 1 in each row and column, and the rest zeros (in order for each row and column to sum to one). But this is equivalent to being a permutation matrix.