Advanced Algorithms Lecture 07
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
18.415/6.854 Advanced Algorithms October 3, 2001
Lecture 7
Lecturer: Michel X. Goemans Scribe: Nick Hanssens and Nick Matsakis
Proof of Theorem 1:
Take any x > 0 such that Ax = b, and let (y, s) be any strictly feasible dual solution, i.e. A^T y + s = c with s > 0. We have:

  c^T x - μ Σ_j ln x_j = (s + A^T y)^T x - μ Σ_j ln x_j
                       = s^T x + b^T y - μ Σ_j ln x_j
                       = b^T y + Σ_j (s_j x_j - μ ln x_j),
implying that the objective function is bounded below by a constant (this follows from the fact that, for μ > 0 and s_j > 0, the function s_j x - μ ln x tends to +∞ as x goes to 0 or to +∞). Therefore the infimum of BP(μ) is finite for every μ > 0. To show that the infimum is attained (that there exists an optimum solution), it suffices to notice that the argument above also yields upper and lower bounds on each x_j in order to have a value below the one at our fixed feasible x, which means that we can restrict our attention to a compact set; this implies that the infimum is attained. Finally, we showed last time that if an optimum solution exists then it is unique.
For any μ > 0, the unique solution to BP(μ) is called the μ-center.
To compute the μ-center, we apply Newton's method to the system of equations defining it¹. Linearizing the condition (x_j + Δx_j)(s_j + Δs_j) = μ by dropping the quadratic term Δx_j Δs_j, the update (Δx, Δy, Δs) must satisfy:

  A Δx = 0,                                          (4)
  A^T Δy + Δs = 0,                                   (5)
  x_j s_j + Δx_j s_j + x_j Δs_j = μ   for all j,     (6)

whose solution is given by

  Δy = (A X S^{-1} A^T)^{-1} (b - μ A S^{-1} e),     (7)
  Δs = -A^T Δy,                                      (8)
  Δx = μ S^{-1} e - x - X S^{-1} Δs,                 (9)

where X = diag(x), S = diag(s) and e denotes the all-ones vector.

¹If such a system were easy to solve, then this would give a simple algorithm for linear programming, by setting μ to 0 and replacing the strict inequalities by inequalities.
Indeed, (6) implies that Δx_j + x_j s_j^{-1} Δs_j = μ s_j^{-1} - x_j, or in vector notation,

  Δx + X S^{-1} Δs = μ S^{-1} e - x.                 (10)

Observe that this alone is not a square system of equations (it gives n equations in the unknowns Δx and Δs). Substituting Δs by -A^T Δy (because of (5)), premultiplying by A and using (4), we get

  A X S^{-1} A^T Δy = b - μ A S^{-1} e.

But A X S^{-1} A^T is an m × m matrix of rank m since A has rank m and X and S^{-1} are diagonal matrices with positive diagonal elements. Thus A X S^{-1} A^T is invertible and we derive (7). (5) then immediately implies (8), and (10) implies (9).
At each step, then, we replace x and s with the values x + Δx and s + Δs (y can always be derived from x and s). We will show that this iteration converges to the μ-center of BP(μ).
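To make one iteration concrete, here is a minimal NumPy sketch of the update given by (7)-(9). The function name newton_step is ours, and the routine assumes (rather than checks) that A has full row rank m, Ax = b, and x, s > 0.

```python
import numpy as np

def newton_step(A, b, x, s, mu):
    """One Newton step toward the mu-center, following (7)-(9).

    Assumes A is m x n with full row rank, Ax = b, and x, s > 0.
    """
    xs = x / s                                 # diagonal of X S^{-1}
    M = A @ (xs[:, None] * A.T)                # A X S^{-1} A^T, m x m and invertible
    dy = np.linalg.solve(M, b - mu * (A @ (1.0 / s)))   # (7)
    ds = -A.T @ dy                             # (8)
    dx = mu / s - x - xs * ds                  # (9)
    return dx, dy, ds
```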
3.2 ds and dx
As (x, s, μ) → (x + Δx, s + Δs, μ), our proximity vector v (recall v_j = x_j s_j / μ - 1 and σ(x, s, μ) = ||v||) becomes w, where:

  w_j = (x_j + Δx_j)(s_j + Δs_j)/μ - 1
      = (x_j s_j + Δx_j s_j + x_j Δs_j + Δx_j Δs_j)/μ - 1
      = Δx_j Δs_j / μ    (using (6)),

which we are hoping will be small.
For the analysis, it will be useful to rescale the x-space and the s-space so that the current iterates x and s become equal, but in a way that x_j s_j remains constant. For this, we rescale component j of any vector in the x-space by √(s_j / x_j) and component j of any vector in the s-space by √(x_j / s_j). Rescaling Δx and Δs (and dividing by √μ), we express w_j as w_j = dx_j · ds_j where

  dx_j = Δx_j √(s_j / (x_j μ))   and   ds_j = Δs_j √(x_j / (s_j μ)).
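As a quick sanity check (on made-up data; nothing here comes from the notes), the identity w_j = dx_j · ds_j holds for arbitrary directions Δx, Δs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu = 5, 2.0
x, s = rng.uniform(1, 2, n), rng.uniform(1, 2, n)   # arbitrary positive iterates
Dx, Ds = rng.normal(size=n), rng.normal(size=n)     # arbitrary directions

dx = Dx * np.sqrt(s / (x * mu))                     # rescaled Delta x
ds = Ds * np.sqrt(x / (s * mu))                     # rescaled Delta s
assert np.allclose(dx * ds, Dx * Ds / mu)           # w_j = dx_j * ds_j
```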
3.3 Properties
Property 1  Δx ⊥ Δs.

Proof of Property 1: This is true because Δx is in the nullspace of A while Δs is in the column space of A^T. Indeed, premultiplying (5) by Δx^T and using (4), we get that Δx^T Δs = 0.
Observe that although x + Δx and s + Δs do not necessarily satisfy (and will not) that (x_j + Δx_j)(s_j + Δs_j) = μ, on average they do, since the duality gap

  (x + Δx)^T (s + Δs) = Σ_j (x_j s_j + Δx_j s_j + x_j Δs_j + Δx_j Δs_j) = nμ

by the above property and (6).
Property 2  dx ⊥ ds.

Proof of Property 2: dx^T ds = Σ_j dx_j · ds_j = (1/μ) Σ_j Δx_j · Δs_j = 0, using the definitions of dx_j and ds_j and Property 1.
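The following self-contained check (again on made-up data, recomputing the step from (7)-(9)) verifies Property 1, Property 2, and the duality-gap observation above numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, mu = 3, 6, 0.7
A = rng.normal(size=(m, n))            # full row rank with probability 1
x = rng.uniform(0.5, 2.0, n)           # strictly positive primal iterate
s = rng.uniform(0.5, 2.0, n)           # strictly positive dual slack
b = A @ x                              # so that Ax = b holds

M = A @ ((x / s)[:, None] * A.T)       # Newton step, as in (7)-(9)
dy = np.linalg.solve(M, b - mu * (A @ (1.0 / s)))
Ds = -A.T @ dy
Dx = mu / s - x - (x / s) * Ds

dx = Dx * np.sqrt(s / (x * mu))
ds = Ds * np.sqrt(x / (s * mu))
assert abs(Dx @ Ds) < 1e-9                        # Property 1: Delta x ⊥ Delta s
assert abs(dx @ ds) < 1e-9                        # Property 2: dx ⊥ ds
assert np.isclose((x + Dx) @ (s + Ds), n * mu)    # duality gap equals n * mu
```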
Property 3  dx_j + ds_j = -v_j √(μ / (x_j s_j)) for all j, and hence ||dx + ds||^2 = Σ_j v_j^2 · μ/(x_j s_j).

Proof of Property 3: Rewriting (6) as Δx_j s_j + x_j Δs_j = μ - x_j s_j and dividing by √(μ x_j s_j), we get

  dx_j + ds_j = (μ - x_j s_j) / √(μ x_j s_j) = -v_j √(μ / (x_j s_j)).
4 Theorems 2 and 3
Theorem 2  If σ(x, s, μ) < 1, then x + Δx > 0 and s + Δs > 0.

Theorem 3  If σ = σ(x, s, μ) < 1, then σ(x + Δx, s + Δs, μ) ≤ σ^2 / (2(1 - σ)).
These two theorems guarantee that, provided σ(x, s, μ) is sufficiently small, the repeated updates x → x + Δx, s → s + Δs given in the algorithm will converge to the μ-center. Theorem 2 was not proved in this lecture. The proof of Theorem 3 is provided below; it shows that the convergence is quadratic (provided we are close enough).
Proof of Theorem 3: We have that

  σ^2(x + Δx, s + Δs, μ) = ||w||^2 = Σ_j w_j^2 = Σ_j dx_j^2 · ds_j^2.

Using the fact that 4·a·b ≤ (a + b)^2 (with a = dx_j^2 and b = ds_j^2) and Property 2,

  Σ_j dx_j^2 · ds_j^2 ≤ (1/4) Σ_j (dx_j^2 + ds_j^2)^2 ≤ (1/4) (Σ_j (dx_j^2 + ds_j^2))^2 = (1/4) ||dx + ds||^4.

Using Property 3,

  ||dx + ds||^2 = Σ_j v_j^2 · μ/(x_j s_j) ≤ (Σ_j v_j^2) · max_j μ/(x_j s_j).

Taking the square root, we get

  σ(x + Δx, s + Δs, μ) ≤ (1/2) (Σ_j v_j^2) · max_j μ/(x_j s_j).    (11)

Now, consider these two terms. The first, Σ_j v_j^2, is equal to σ^2 by definition. The second, max_j μ/(x_j s_j), is at most 1/(1 - σ) by the following argument: for every j,

  x_j s_j / μ = 1 + v_j ≥ 1 - |v_j| ≥ 1 - σ,

and hence μ/(x_j s_j) ≤ 1/(1 - σ), since 1 - σ is strictly positive under the conditions of the theorem. Using these upper bounds in (11), we can conclude the proof by stating

  σ(x + Δx, s + Δs, μ) ≤ σ^2 / (2(1 - σ)).
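A small numeric illustration of the theorem (a made-up instance, with iterates chosen close enough to the μ-center that σ < 1):

```python
import numpy as np

def sigma(x, s, mu):
    return np.linalg.norm(x * s / mu - 1.0)   # ||v||, with v_j = x_j s_j / mu - 1

rng = np.random.default_rng(2)
m, n, mu = 3, 6, 1.0
A = rng.normal(size=(m, n))
x = rng.uniform(0.9, 1.1, n)     # close to the mu-center, so sigma < 1
s = rng.uniform(0.9, 1.1, n)
b = A @ x

M = A @ ((x / s)[:, None] * A.T)               # Newton step, as in (7)-(9)
dy = np.linalg.solve(M, b - mu * (A @ (1.0 / s)))
Ds = -A.T @ dy
Dx = mu / s - x - (x / s) * Ds

sig = sigma(x, s, mu)
assert sig < 1
assert sigma(x + Dx, s + Ds, mu) <= sig**2 / (2 * (1 - sig)) + 1e-12   # Theorem 3
```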
5 Theorem 5
Theorems 2 and 3 show that the updates Δx, Δs given in the algorithm converge to the μ-center. However, instead of letting the Newton updates converge to the μ-center for a fixed value of μ, we take one step to get closer to the μ-center (σ now becomes σ^2/(2(1 - σ))) and then decrease μ (since our goal is to let μ tend to 0) in such a way that our new iterate is within the original value of σ, but with respect to the updated μ. Theorem 5 shows how the proximity measure changes as we change μ.
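Before stating Theorem 5, here is a sketch of this strategy as code. The step size θ = 1/(4√n), the stopping rule nμ ≤ ε, and the name follow_path are illustrative assumptions, not values fixed in the lecture.

```python
import numpy as np

def follow_path(A, b, x, s, mu, eps, theta=None):
    """Alternate one Newton step toward the mu-center with a decrease of mu.

    A sketch only: assumes A has full row rank and that (x, s, mu) starts
    close to the central path (small sigma), so Theorems 2, 3 and 5 apply.
    """
    n = len(x)
    if theta is None:
        theta = 1.0 / (4.0 * np.sqrt(n))   # hypothetical step size
    while n * mu > eps:                    # stop once duality gap n*mu <= eps
        M = A @ ((x / s)[:, None] * A.T)   # Newton step, as in (7)-(9)
        dy = np.linalg.solve(M, b - mu * (A @ (1.0 / s)))
        ds = -A.T @ dy
        dx = mu / s - x - (x / s) * ds
        x, s = x + dx, s + ds
        mu *= 1.0 - theta                  # decrease mu (analyzed by Theorem 5)
    return x, s, mu
```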
Theorem 5  Suppose x^T s = nμ and σ = σ(x, s, μ). Then σ(x, s, μ(1 - θ)) = (1/(1 - θ)) √(σ^2 + θ^2 n).
Observe that our new iterates x + Δx and s + Δs satisfy the condition that (x + Δx)^T (s + Δs) = nμ, and hence Theorem 5 can be applied to see how the proximity measure changes when we modify μ after a Newton iterate.
Proof of Theorem 5: Let v' be the vector v after having changed μ, i.e. v'_j = x_j s_j / (μ(1 - θ)) - 1. We have that

  v'_j = (1 + v_j)/(1 - θ) - 1 = (v_j + θ)/(1 - θ).

Thus v' = (1/(1 - θ)) v + (θ/(1 - θ)) e. Our assumption that x^T s = nμ can be translated into v^T e = 0, since v^T e = Σ_j v_j = Σ_j (x_j s_j/μ - 1) = x^T s/μ - n. Therefore, we get that

  σ^2(x, s, μ(1 - θ)) = ||v'||^2 = (1/(1 - θ)^2) (||v||^2 + 2θ v^T e + θ^2 ||e||^2) = (σ^2 + θ^2 n)/(1 - θ)^2,

and taking the square root proves the theorem.
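A quick numeric check of this formula (made-up data, rescaled so that x^T s = nμ holds exactly):

```python
import numpy as np

rng = np.random.default_rng(3)
n, mu, theta = 6, 1.3, 0.2
x = rng.uniform(0.5, 2.0, n)
s = rng.uniform(0.5, 2.0, n)
s *= n * mu / (x @ s)                      # rescale so that x^T s = n * mu

sig = np.linalg.norm(x * s / mu - 1.0)     # sigma(x, s, mu)
lhs = np.linalg.norm(x * s / (mu * (1 - theta)) - 1.0)
rhs = np.sqrt(sig**2 + theta**2 * n) / (1 - theta)
assert np.isclose(lhs, rhs)                # Theorem 5
```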