Interior Point Methods for Linear Programming
based on Newton's Method

Robert M. Freund

March, 2004
1 The Problem
$$
P: \quad \begin{array}{rl}
\text{minimize} & c^T x \\
\text{s.t.} & Ax = b \\
& x \ge 0,
\end{array}
\qquad\qquad
D: \quad \begin{array}{rl}
\text{maximize} & b^T \pi \\
\text{s.t.} & A^T \pi + s = c \\
& s \ge 0.
\end{array}
$$
If $x$ is feasible for $P$ and $(\pi, s)$ is feasible for $D$, then the duality gap is
$$
c^T x - b^T \pi = x^T s \ge 0.
$$
Given a vector $x > 0$, define the diagonal matrices
$$
X = \begin{pmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_n \end{pmatrix},
\qquad
X^2 = \begin{pmatrix} (x_1)^2 & 0 & \cdots & 0 \\ 0 & (x_2)^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & (x_n)^2 \end{pmatrix},
$$
$$
X^{-1} = \begin{pmatrix} 1/x_1 & 0 & \cdots & 0 \\ 0 & 1/x_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/x_n \end{pmatrix},
$$
and
$$
X^{-2} = \begin{pmatrix} 1/(x_1)^2 & 0 & \cdots & 0 \\ 0 & 1/(x_2)^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/(x_n)^2 \end{pmatrix}.
$$
Throughout, $e = (1, 1, \ldots, 1)^T$ denotes the vector of ones.
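For numerical work (for instance Exercise 3 at the end of these notes), this notation maps directly onto code. The following small numpy sketch is illustrative only (the variable names are not from the notes):

import numpy as np

x = np.array([2.0, 0.5, 4.0])    # any vector x > 0
X      = np.diag(x)              # X
X_sq   = np.diag(x ** 2)         # X^2
X_inv  = np.diag(1.0 / x)        # X^{-1}
X_inv2 = np.diag(1.0 / x ** 2)   # X^{-2}
e = np.ones_like(x)              # e, the vector of ones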
$$
P(\theta): \quad \begin{array}{rl}
\text{minimize} & c^T x - \theta \displaystyle\sum_{j=1}^{n} \ln(x_j) \\
\text{s.t.} & Ax = b \\
& x > 0.
\end{array}
$$
The Karush-Kuhn-Tucker conditions for $P(\theta)$ are:
$$
\left\{
\begin{array}{l}
Ax = b, \; x > 0 \\[4pt]
c - \theta X^{-1} e = A^T \pi.
\end{array}
\right.
\tag{1}
$$
If we define $s = \theta X^{-1} e$, then
$$
\frac{1}{\theta} X s = e,
$$
equivalently,
$$
\frac{1}{\theta} X S e = e,
$$
and we can rewrite the Karush-Kuhn-Tucker conditions as:
$$
\left\{
\begin{array}{l}
Ax = b, \; x > 0 \\[4pt]
A^T \pi + s = c \\[4pt]
\dfrac{1}{\theta} X S e - e = 0.
\end{array}
\right.
\tag{2}
$$
From the equations of (2) it follows that if $(x, \pi, s)$ is a solution of (2), then $x$ is feasible for $P$, $(\pi, s)$ is feasible for $D$, and the resulting duality gap is
$$
x^T s = e^T X S e = \theta\, e^T e = \theta n.
$$
This motivates the following approximate version of (2): we say that $(x, \pi, s)$ is a $\beta$-approximate solution of $P(\theta)$ if
$$
\left\{
\begin{array}{l}
Ax = b, \; x > 0 \\[4pt]
A^T \pi + s = c \\[4pt]
\left\| \dfrac{1}{\theta} X s - e \right\| \le \beta.
\end{array}
\right.
\tag{3}
$$
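The conditions in (3) are also easy to check numerically. The following sketch is illustrative only (it is not part of the notes); it tests primal feasibility, dual feasibility, and the proximity condition of (3) for a given point and tolerance:

import numpy as np

def is_beta_approximate(A, b, c, x, pi, s, theta, beta, tol=1e-10):
    # Check the three conditions of (3) for P(theta).
    primal_ok = bool(np.all(x > 0)) and np.allclose(A @ x, b, atol=tol)
    dual_ok = np.allclose(A.T @ pi + s, c, atol=tol)
    proximity_ok = np.linalg.norm(x * s / theta - np.ones_like(x)) <= beta
    return primal_ok and dual_ok and proximity_ok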
Lemma 1.1 If $(\bar x, \bar\pi, \bar s)$ is a $\beta$-approximate solution of $P(\theta)$ and $\beta < 1$, then $\bar x$ is feasible for $P$, $(\bar\pi, \bar s)$ is feasible for $D$, and the duality gap satisfies:
$$
n\theta(1-\beta) \le c^T \bar x - b^T \bar\pi = \bar x^T \bar s \le n\theta(1+\beta).
\tag{4}
$$
Proof: From (3) we have, for every $j = 1, \ldots, n$,
$$
-\beta \le \frac{\bar x_j \bar s_j}{\theta} - 1 \le \beta,
$$
whereby
$$
n\theta(1-\beta) = \sum_{j=1}^{n} \theta(1-\beta) \le \bar x^T \bar s = \sum_{j=1}^{n} \bar x_j \bar s_j \le \sum_{j=1}^{n} \theta(1+\beta) = n\theta(1+\beta).
$$
Based on the analysis just presented, we are motivated to develop the following algorithm:
Step 1. Set current values. $(\bar x, \bar\pi, \bar s) = (x^k, \pi^k, s^k)$, $\theta = \theta^k$.

Step 2. Shrink $\theta$. Set $\theta' = \alpha\theta$ for some $\alpha \in (0, 1)$.

Step 3. Compute the primal-dual Newton direction. Compute the Newton direction $(\Delta x, \Delta\pi, \Delta s)$ at $(\bar x, \bar\pi, \bar s)$ for the value $\theta = \theta'$.

Step 4. Update values.
$$
(x', \pi', s') = (\bar x, \bar\pi, \bar s) + (\Delta x, \Delta\pi, \Delta s).
$$

Step 5. Reset counter and continue. $(x^{k+1}, \pi^{k+1}, s^{k+1}) = (x', \pi', s')$, $\theta^{k+1} = \theta'$, $k \leftarrow k + 1$. Go to Step 1.
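In code, the outer loop of this conceptual algorithm is only a few lines. The sketch below is schematic (not from the notes); it assumes a routine newton_direction that returns the primal-dual Newton direction derived later in these notes:

def conceptual_ipm(x, pi, s, theta, alpha, newton_direction, num_steps):
    # Steps 1-5 of the conceptual algorithm: shrink theta, take one full Newton step.
    for _ in range(num_steps):
        theta = alpha * theta                              # Step 2: shrink theta
        dx, dpi, ds = newton_direction(x, pi, s, theta)    # Step 3: Newton direction at theta'
        x, pi, s = x + dx, pi + dpi, s + ds                # Step 4: update values
    return x, pi, s, theta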
[Figure: the central path, with points marked for θ = 100, θ = 10, θ = 1/10, and θ = 0.]
• whether or not successive iterative values $(x^k, \pi^k, s^k)$ are $\beta$-approximate solutions to $P(\theta^k)$
$$
P(\theta): \quad \begin{array}{rl}
\text{minimize}_{x} & c^T x - \theta \displaystyle\sum_{j=1}^{n} \ln(x_j) \\
\text{s.t.} & Ax = b \\
& x > 0.
\end{array}
$$
The Karush-Kuhn-Tucker conditions for $P(\theta)$ are:
$$
\left\{
\begin{array}{l}
Ax = b, \; x > 0 \\[4pt]
c - \theta X^{-1} e = A^T \pi.
\end{array}
\right.
\tag{7}
$$
We define $s = \theta X^{-1} e$, whereby
$$
\frac{1}{\theta} X s = e,
$$
equivalently,
$$
\frac{1}{\theta} X S e = e,
$$
and we can rewrite the Karush-Kuhn-Tucker conditions as:
$$
\left\{
\begin{array}{l}
Ax = b, \; x > 0 \\[4pt]
A^T \pi + s = c \\[4pt]
\dfrac{1}{\theta} X S e - e = 0.
\end{array}
\right.
\tag{8}
$$
Let $(\bar x, \bar\pi, \bar s)$ be our current iterate, which we assume is primal and dual feasible, namely:
$$
A\bar x = b, \; \bar x > 0, \qquad A^T \bar\pi + \bar s = c, \; \bar s > 0.
\tag{9}
$$
We seek a direction $(\Delta x, \Delta\pi, \Delta s)$ so that the new point $(\bar x + \Delta x, \bar\pi + \Delta\pi, \bar s + \Delta s)$ satisfies (8):
$$
\left\{
\begin{array}{l}
A(\bar x + \Delta x) = b, \; \bar x + \Delta x > 0 \\[4pt]
A^T(\bar\pi + \Delta\pi) + (\bar s + \Delta s) = c \\[4pt]
\dfrac{1}{\theta} (\bar X + \Delta X)(\bar S + \Delta S)e - e = 0.
\end{array}
\right.
$$
Using the primal and dual feasibility of $(\bar x, \bar\pi, \bar s)$ in (9), this system is equivalent to:
$$
\left\{
\begin{array}{l}
A\Delta x = 0 \\[4pt]
A^T \Delta\pi + \Delta s = 0 \\[4pt]
\bar S \Delta x + \bar X \Delta s = \theta e - \bar X \bar S e - \Delta X \Delta S e.
\end{array}
\right.
$$
Notice that the only nonlinear term in the above system of equations in $(\Delta x, \Delta\pi, \Delta s)$ is the term $\Delta X \Delta S e$ in the last equation. If we erase this term, which is the same as taking the linearized version of the equations, we obtain the following primal-dual Newton equation system:
$$
\left\{
\begin{array}{l}
A\Delta x = 0 \\[4pt]
A^T \Delta\pi + \Delta s = 0 \\[4pt]
\bar S \Delta x + \bar X \Delta s = \theta e - \bar X \bar S e.
\end{array}
\right.
\tag{10}
$$
The solution $(\Delta x, \Delta\pi, \Delta s)$ of the system (10) is called the primal-dual Newton step. We can manipulate these equations to yield the following formulas for the solution:
$$
\begin{array}{rcl}
\Delta x & \leftarrow & \left[\, I - \bar X \bar S^{-1} A^T \left( A \bar X \bar S^{-1} A^T \right)^{-1} A \,\right] \left( -\bar x + \theta \bar S^{-1} e \right), \\[8pt]
\Delta\pi & \leftarrow & \left( A \bar X \bar S^{-1} A^T \right)^{-1} A \left( \bar x - \theta \bar S^{-1} e \right), \\[8pt]
\Delta s & \leftarrow & A^T \left( A \bar X \bar S^{-1} A^T \right)^{-1} A \left( -\bar x + \theta \bar S^{-1} e \right).
\end{array}
\tag{11}
$$
In practice, however, it is easier to first solve the following system for $\Delta\pi$:
$$
A \bar X \bar S^{-1} A^T \, \Delta\pi = A\left( \bar x - \theta \bar S^{-1} e \right),
$$
and then compute
$$
\Delta s \leftarrow -A^T \Delta\pi, \qquad \Delta x \leftarrow -\bar x + \theta \bar S^{-1} e - \bar S^{-1} \bar X \Delta s.
\tag{12}
$$
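As a numerical illustration (not part of the notes), the three-step computation in (12) can be coded directly in numpy; here the diagonal matrices X̄ and S̄ are represented by the vectors x and s:

import numpy as np

def primal_dual_newton_step(A, x, s, theta):
    # Solve the feasible primal-dual Newton system (10) via the formulas (12).
    d = x / s                                        # diagonal of X S^{-1}
    M = A @ (d[:, None] * A.T)                       # A X S^{-1} A^T
    dpi = np.linalg.solve(M, A @ (x - theta / s))    # solve for Delta pi
    ds = -A.T @ dpi                                  # Delta s = -A^T Delta pi
    dx = -x + theta / s - d * ds                     # Delta x = -x + theta S^{-1} e - S^{-1} X Delta s
    return dx, dpi, ds

One can verify numerically that the resulting direction satisfies $A\Delta x = 0$, $A^T\Delta\pi + \Delta s = 0$, and $\bar S\Delta x + \bar X\Delta s = \theta e - \bar X\bar S e$.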
However, let us instead simply work with the primal-dual Newton system
(10). Suppose that (∆x, ∆π, ∆s) is the (unique) solution of the primal-dual
Newton system (10). We obtain the new value of the variables (x, π, s) by
taking the Newton step:
$$
(x', \pi', s') = (\bar x, \bar\pi, \bar s) + (\Delta x, \Delta\pi, \Delta s).
$$
Note from the first two equations of (14) that $\Delta x^T \Delta s = 0$. From the third equation of (13) we have
$$
1 - \beta \le \frac{\bar s_j \bar x_j}{\theta} \le 1 + \beta, \qquad j = 1, \ldots, n,
\tag{15}
$$
which implies:
$$
\bar x_j \ge \frac{(1-\beta)\theta}{\bar s_j} \quad \text{and} \quad \bar s_j \ge \frac{(1-\beta)\theta}{\bar x_j}, \qquad j = 1, \ldots, n.
\tag{16}
$$
$$
\left\| \bar X^{-1} \Delta x \right\| \le \frac{\beta}{1-\beta} < 1.
$$
Therefore
$$
x' = \bar x + \Delta x = \bar X\left( e + \bar X^{-1} \Delta x \right) > 0.
$$
We have the exact same chain of inequalities for the dual variables:
$$
\left\| \bar S^{-1} \Delta s \right\| \le \frac{\beta}{1-\beta} < 1, \qquad \text{and so} \qquad s' = \bar s + \Delta s = \bar S\left( e + \bar S^{-1} \Delta s \right) > 0.
$$
Next, from the third equation of the Newton system, $x'_j s'_j = (\bar x_j + \Delta x_j)(\bar s_j + \Delta s_j) = \bar x_j \bar s_j + \bar s_j \Delta x_j + \bar x_j \Delta s_j + \Delta x_j \Delta s_j = \theta + \Delta x_j \Delta s_j$ for each $j$. Therefore
$$
\left( e - \frac{1}{\theta} X' S' e \right)_j = -\frac{\Delta x_j \Delta s_j}{\theta}.
$$
From this we obtain:
$$
\begin{array}{rcl}
\left\| e - \dfrac{1}{\theta} X' S' e \right\| & \le & \left\| e - \dfrac{1}{\theta} X' S' e \right\|_1 \\[10pt]
& = & \displaystyle\sum_{j=1}^{n} \frac{|\Delta x_j \Delta s_j|}{\theta} \\[10pt]
& = & \displaystyle\sum_{j=1}^{n} \frac{|\Delta x_j|}{\bar x_j} \, \frac{|\Delta s_j|}{\bar s_j} \, \frac{\bar x_j \bar s_j}{\theta} \\[10pt]
& \le & \displaystyle\sum_{j=1}^{n} \frac{|\Delta x_j|}{\bar x_j} \, \frac{|\Delta s_j|}{\bar s_j} \, (1+\beta) \\[10pt]
& \le & \left\| \bar X^{-1} \Delta x \right\| \left\| \bar S^{-1} \Delta s \right\| (1+\beta) \\[10pt]
& \le & \left( \dfrac{\beta}{1-\beta} \right)^2 (1+\beta)
\end{array}
$$
$$
\le \frac{3}{40} \cdot \frac{1}{\alpha} + \frac{1-\alpha}{\alpha} \sqrt{n} = \frac{\frac{3}{40} + \sqrt{n}}{\alpha} - \sqrt{n} = \frac{1}{5}.
$$
Proof: By induction, suppose that the theorem is true for iterates $0, 1, 2, \ldots, k$. Then $(x^k, \pi^k, s^k)$ is a $\beta = \frac{3}{40}$-approximate solution of $P(\theta^k)$. From the Relaxation Theorem, $(x^k, \pi^k, s^k)$ is a $\frac{1}{5}$-approximate solution of $P(\theta^{k+1})$, where $\theta^{k+1} = \alpha \theta^k$. From the Quadratic Convergence Theorem, $(x^{k+1}, \pi^{k+1}, s^{k+1})$ is a $\beta$-approximate solution of $P(\theta^{k+1})$ for
$$
\beta = \left( 1 + \frac{1}{5} \right) \left( \frac{\frac{1}{5}}{1 - \frac{1}{5}} \right)^2 = \frac{3}{40}.
$$
$$
k = 10\sqrt{n}\, \ln\left( \frac{43\,(x^0)^T s^0}{37\,\epsilon} \right)
$$
iterations.
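For a rough sense of scale (an illustrative calculation, not from the notes): with $n = 100$, $(x^0)^T s^0 = 10^4$, and $\epsilon = 10^{-8}$, this bound gives $k = 10\sqrt{100}\,\ln\!\left(\frac{43 \cdot 10^{4}}{37 \cdot 10^{-8}}\right) \approx 100 \cdot 27.8 \approx 2{,}780$ iterations.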
[Figure: points x̂, x̃, and x̄, together with the values θ = 80, θ = 90, and θ = 100.]
Therefore
$$
\theta^k \le \left( 1 - \frac{1}{10\sqrt{n}} \right)^k \theta^0.
$$
This implies that
$$
c^T x^k - b^T \pi^k = (x^k)^T s^k \le \theta^k n (1+\beta) \le \left( 1 - \frac{1}{10\sqrt{n}} \right)^k n\, \frac{43}{40}\, \theta^0 \le \left( 1 - \frac{1}{10\sqrt{n}} \right)^k n\, \frac{43}{40} \cdot \frac{40\,(x^0)^T s^0}{37\, n},
$$
from (4). Taking logarithms, we obtain
$$
\begin{array}{rcl}
\ln\left( c^T x^k - b^T \pi^k \right) & \le & k \ln\left( 1 - \dfrac{1}{10\sqrt{n}} \right) + \ln\left( \dfrac{43}{37} (x^0)^T s^0 \right) \\[10pt]
& \le & \dfrac{-k}{10\sqrt{n}} + \ln\left( \dfrac{43}{37} (x^0)^T s^0 \right) \\[10pt]
& \le & -\ln\left( \dfrac{43\,(x^0)^T s^0}{37\,\epsilon} \right) + \ln\left( \dfrac{43}{37} (x^0)^T s^0 \right) = \ln(\epsilon).
\end{array}
$$
Therefore $c^T x^k - b^T \pi^k \le \epsilon$.
• We do not assume that the current point is near the central path. In
fact, we do not assume that the current point is even feasible.
• The fractional decrease parameter $\alpha$ is set to $\alpha = \frac{1}{10}$ rather than the conservative value of
$$
\alpha = 1 - \frac{1/8}{\frac{1}{5} + \sqrt{n}}.
$$
• We do not necessarily take the full Newton step at each iteration, and
we take different step-sizes in the primal and dual.
At the current point $(\bar x, \bar\pi, \bar s)$, which is not necessarily feasible, we seek a direction $(\Delta x, \Delta\pi, \Delta s)$ for which the new point satisfies:
$$
\begin{array}{rl}
(1) & A(\bar x + \Delta x) = b \\[4pt]
(2) & A^T(\bar\pi + \Delta\pi) + (\bar s + \Delta s) = c \\[4pt]
(3) & (\bar X + \Delta X)(\bar S + \Delta S)e = \theta e.
\end{array}
$$
Linearizing as before (erasing the $\Delta X \Delta S e$ term) yields the equation system:
$$
\begin{array}{rl}
(1) & A\Delta x = b - A\bar x =: r_1 \\[4pt]
(2) & A^T \Delta\pi + \Delta s = c - A^T \bar\pi - \bar s =: r_2 \\[4pt]
(3) & \bar S \Delta x + \bar X \Delta s = \theta e - \bar X \bar S e =: r_3.
\end{array}
$$
We refer to the solution $(\Delta x, \Delta\pi, \Delta s)$ of the above system as the primal-dual Newton direction at the point $(\bar x, \bar\pi, \bar s)$. It differs from that derived earlier only in that earlier it was assumed that $r_1 = 0$ and $r_2 = 0$.
Given our current point $(\bar x, \bar\pi, \bar s)$ and a given value of $\theta$, we compute the Newton direction $(\Delta x, \Delta\pi, \Delta s)$ and we update our variables by choosing primal and dual step-sizes $\alpha_P$ and $\alpha_D$ to obtain new values:
$$
(\tilde x, \tilde\pi, \tilde s) \leftarrow (\bar x + \alpha_P \Delta x, \; \bar\pi + \alpha_D \Delta\pi, \; \bar s + \alpha_D \Delta s).
$$
In order to ensure that $\tilde x > 0$ and $\tilde s > 0$, we choose a value of $r$ satisfying $0 < r < 1$ ($r = 0.99$ is a common value in practice), and determine $\alpha_P$ and $\alpha_D$ as follows:
$$
\alpha_P = \min\left\{ 1, \; r \min_{\Delta x_j < 0} \frac{\bar x_j}{-\Delta x_j} \right\}, \qquad
\alpha_D = \min\left\{ 1, \; r \min_{\Delta s_j < 0} \frac{\bar s_j}{-\Delta s_j} \right\}.
$$
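This ratio test is straightforward to implement. The helper below is illustrative (not from the notes); it returns the damped step-size for a vector v > 0 and a direction dv:

import numpy as np

def ratio_test(v, dv, r=0.99):
    # Largest step in [0, 1] keeping v + alpha * dv > 0, damped by r.
    neg = dv < 0
    if not np.any(neg):
        return 1.0
    return min(1.0, r * float(np.min(v[neg] / -dv[neg])))

# alpha_P = ratio_test(x_bar, dx);  alpha_D = ratio_test(s_bar, ds)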
The value of $\theta$ corresponding to the current point is
$$
\theta \approx \frac{\bar x^T \bar s}{n}.
$$
We then re-set $\theta$ to
$$
\theta \leftarrow \frac{1}{10} \left( \frac{\bar x^T \bar s}{n} \right),
$$
where the fractional decrease $\frac{1}{10}$ is user-specified.
We set our tolerance $\epsilon$ to be a small positive number, typically $\epsilon = 10^{-8}$, for example, and we stop when:
$$
\begin{array}{rl}
(1) & \| A\bar x - b \| \le \epsilon \\[4pt]
(2) & \| A^T \bar\pi + \bar s - c \| \le \epsilon \\[4pt]
(3) & \bar s^T \bar x \le \epsilon.
\end{array}
$$
At iteration $k$, these conditions become:
$$
\begin{array}{rl}
(1) & \| A x^k - b \| \le \epsilon \\[4pt]
(2) & \| A^T \pi^k + s^k - c \| \le \epsilon \\[4pt]
(3) & (s^k)^T x^k \le \epsilon.
\end{array}
$$
4. Compute the primal-dual Newton direction $(\Delta x, \Delta\pi, \Delta s)$ at $(x^k, \pi^k, s^k)$ for the current value of $\theta$, by solving:
$$
\begin{array}{rl}
(1) & A\Delta x = b - A x^k =: r_1 \\[4pt]
(2) & A^T \Delta\pi + \Delta s = c - A^T \pi^k - s^k =: r_2 \\[4pt]
(3) & S^k \Delta x + X^k \Delta s = \theta e - X^k S^k e =: r_3.
\end{array}
$$
5. Determine the step-sizes:
$$
\alpha_P = \min\left\{ 1, \; r \min_{\Delta x_j < 0} \frac{x^k_j}{-\Delta x_j} \right\}, \qquad
\alpha_D = \min\left\{ 1, \; r \min_{\Delta s_j < 0} \frac{s^k_j}{-\Delta s_j} \right\}.
$$
6. Update values:
$$
(x^{k+1}, \pi^{k+1}, s^{k+1}) \leftarrow (x^k + \alpha_P \Delta x, \; \pi^k + \alpha_D \Delta\pi, \; s^k + \alpha_D \Delta s).
$$
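Collecting the pieces above, a minimal sketch of this practical (possibly infeasible) primal-dual method might look as follows. It is illustrative only: the function name, the max_iters safeguard, and the dense solve of the block Newton system are choices made here for clarity, and a serious implementation would instead exploit structure (for example, via the normal equations as in (12)):

import numpy as np

def practical_ipm(A, b, c, x, pi, s, r=0.99, eps=1e-8, shrink=0.1, max_iters=500):
    # Sketch of the practical primal-dual interior-point method; requires x > 0 and s > 0.
    m, n = A.shape
    for _ in range(max_iters):
        # Stopping criterion (1)-(3).
        if (np.linalg.norm(A @ x - b) <= eps
                and np.linalg.norm(A.T @ pi + s - c) <= eps
                and s @ x <= eps):
            break
        theta = shrink * (x @ s) / n            # re-set theta to (1/10) * x^T s / n
        r1 = b - A @ x                          # primal residual
        r2 = c - A.T @ pi - s                   # dual residual
        r3 = theta * np.ones(n) - x * s         # centering residual
        # Block Newton system in (dx, dpi, ds), equations (1)-(3) above.
        K = np.block([
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [np.diag(s),       np.zeros((n, m)), np.diag(x)],
        ])
        sol = np.linalg.solve(K, np.concatenate([r1, r2, r3]))
        dx, dpi, ds = sol[:n], sol[n:n + m], sol[n + m:]
        # Ratio tests for the primal and dual step-sizes.
        aP = min(1.0, r * float(np.min(x[dx < 0] / -dx[dx < 0]))) if np.any(dx < 0) else 1.0
        aD = min(1.0, r * float(np.min(s[ds < 0] / -ds[ds < 0]))) if np.any(ds < 0) else 1.0
        x, pi, s = x + aP * dx, pi + aD * dpi, s + aD * ds
    return x, pi, s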
1. (Interior Point Method) Verify that the formulas given in (11) indeed
solve the equation system (10).
2. (Interior Point Method) Prove Proposition 6.1 of the notes on New-
ton’s method for a logarithmic barrier algorithm for linear program-
ming.
3. (Interior Point Method) Implement the primal-dual interior-point algorithm described in these notes, and use it to solve the following LP:
$$
\begin{array}{rl}
\text{minimize} & c^T x \\
\text{s.t.} & Ax = b \\
& x \ge 0,
\end{array}
$$
where
$$
A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & -1 & 0 & 1 \end{pmatrix}, \qquad
b = \begin{pmatrix} 100 \\ 50 \end{pmatrix},
$$
and
$$
c^T = \begin{pmatrix} -9 & -10 & 0 & 0 \end{pmatrix}.
$$
In running your algorithm, you will need to specify the starting point
(x0 , π 0 , s0 ), the starting value of the barrier parameter θ0 , and the
value of the fractional decrease parameter α. Use the following values
and make eight runs of the algorithm as follows:
• Starting Points:
– (x0 , π 0 , s0 ) = ((1, 1, 1, 1), (0, 0), (1, 1, 1, 1))
– (x0 , π 0 , s0 ) = ((100, 100, 100, 100), (0, 0), (100, 100, 100, 100))
• Values of θ0 :
– θ0 = 1000.0
– θ0 = 10.0
• Values of $\alpha$:
– $\alpha = 1 - \frac{1}{10\sqrt{n}}$
– $\alpha = \frac{1}{10}$
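As an illustration of how the eight runs might be set up in Python/numpy (the data below is exactly the problem data above; the solver call is left as a comment because its interface depends on your own implementation, and my_ipm is a hypothetical name):

import numpy as np
from itertools import product

A = np.array([[1.0,  1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])
b = np.array([100.0, 50.0])
c = np.array([-9.0, -10.0, 0.0, 0.0])

n = 4
starting_points = [
    (np.ones(n), np.zeros(2), np.ones(n)),                # (x0, pi0, s0) = (e, 0, e)
    (100 * np.ones(n), np.zeros(2), 100 * np.ones(n)),    # (x0, pi0, s0) = (100e, 0, 100e)
]
theta0_values = [1000.0, 10.0]
alpha_values = [1.0 - 1.0 / (10.0 * np.sqrt(n)), 1.0 / 10.0]

for (x0, pi0, s0), theta0, alpha in product(starting_points, theta0_values, alpha_values):
    # x, pi, s = my_ipm(A, b, c, x0, pi0, s0, theta0, alpha)   # your implementation here
    print("run: theta0 =", theta0, " alpha =", round(float(alpha), 4), " x0[0] =", x0[0])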
4. Suppose that we are given the logarithmic barrier problem
$$
P(\theta): \quad \begin{array}{rl}
\text{minimize} & c^T x - \theta \displaystyle\sum_{j=1}^{n} \ln(x_j) \\
\text{s.t.} & Ax = b \\
& x > 0,
\end{array}
$$
and that we wish to test if $\bar x$ is a $\beta$-approximate solution to $P(\theta)$ for a given value of $\theta$. The conditions for $\bar x$ to be a $\beta$-approximate solution to $P(\theta)$ are that there exist values $(\pi, s)$ for which:
$$
\left\{
\begin{array}{l}
A\bar x = b, \; \bar x > 0 \\[4pt]
A^T \pi + s = c \\[4pt]
\left\| \dfrac{1}{\theta} \bar X s - e \right\| \le \beta.
\end{array}
\right.
\tag{17}
$$
Construct an analytic test for whether or not such $(\pi, s)$ exist by solving an associated equality-constrained quadratic program (which has a closed-form solution). HINT: Think of trying to minimize $\left\| \frac{1}{\theta} \bar X s - e \right\|$ over appropriate values of $(\pi, s)$.
5. Consider the logarithmic barrier problem:
$$
P(\theta): \quad \begin{array}{rl}
\min & c^T x - \theta \displaystyle\sum_{j=1}^{n} \ln x_j \\
\text{s.t.} & Ax = b \\
& x > 0.
\end{array}
$$
Suppose $\bar x$ is a feasible point, i.e., $\bar x > 0$ and $A\bar x = b$. Let $\bar X = \mathrm{diag}(\bar x)$.
a. Construct the (primal) Newton direction $d$ at $\bar x$. Show that $d$ can be written as:
$$
d = \bar X \left( I - \bar X A^T \left( A \bar X^2 A^T \right)^{-1} A \bar X \right) \left( e - \frac{1}{\theta} \bar X c \right).
$$
b. Show in the above that $P = \left( I - \bar X A^T \left( A \bar X^2 A^T \right)^{-1} A \bar X \right)$ is a projection matrix.
c. Suppose $A^T \bar\pi + \bar s = c$ and $\bar s > 0$ for some $\bar\pi, \bar s$. Show that $d$ can also be written as:
$$
d = \bar X \left( I - \bar X A^T \left( A \bar X^2 A^T \right)^{-1} A \bar X \right) \left( e - \frac{1}{\theta} \bar X \bar s \right).
$$