Inexact Newton Method for Minimization of Convex Piecewise Quadratic Functions
1 Introduction
The present paper is devoted to a theoretical and experimental study of novel techniques for incorporating a preconditioned conjugate gradient linear solver into an inexact Newton method. Earlier, a similar method was successfully applied to optimization problems arising in numerical grid generation [7, 8, 13]; here we consider its application to the numerical solution of piecewise quadratic unconstrained optimization problems [9, 15, 16, 17]. The latter include such problems as finding the projection of a given point onto the set of nonnegative solutions of an underdetermined system of linear equations [6] or finding the distance between two convex polyhedra [3] (both are tightly related to the standard linear programming problem).
The paper is organized as follows. In Section 2, a typical problem of minimization
A.I. Golikov
Dorodnicyn Computing Center of FRC CSC RAS, Vavilova 40, 119333 Moscow, Russia;
Moscow Institute of Physics and Technology (State University), 9 Institutskiy per., Dolgoprudny,
Moscow Region, 141701, Russia; e-mail: [email protected]
I.E. Kaporin
Dorodnicyn Computing Center of FRC CSC RAS, Vavilova 40, 119333 Moscow, Russia; e-mail:
[email protected]
where the standard notation ξ_+ = max(0, ξ) = (ξ + |ξ|)/2 is used. Problem (1) can be viewed as the dual of the problem of finding the projection of a vector onto the set of nonnegative solutions of an underdetermined linear system of equations [6, 9]:
$$x_* = \arg\min_{Ax=b,\; x\ge 0} \frac{1}{2}\,\|x - \hat{x}\|^2,$$
the solution of which is expressed via p_* as x_* = (x̂ + A^T p_*)_+. Therefore, we consider the piecewise quadratic function ϕ : R^m → R^1 defined as
$$\varphi(p) = \frac{1}{2}\,\bigl\|(\hat{x} + A^T p)_+\bigr\|^2 - b^T p, \qquad (2)$$
which is convex and differentiable. Its gradient g(p) = grad ϕ(p) is given by
$$g(p) = A\,(\hat{x} + A^T p)_+ - b. \qquad (3)$$
The relation of H(p) to ϕ (p) and g(p) will be explained later in Remark 1.
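In computational form, the quantities introduced above can be evaluated as in the following minimal Python/NumPy sketch (the function names are ours, and the explicit form of the generalized Hessian H(p) = A Diag(sign((x̂ + A^T p)_+)) A^T is assumed here following the generalized Newton literature [9, 16], not quoted from the text):

import numpy as np

# Sketch of evaluating the objective (2), its gradient (3), and a generalized
# Hessian of the assumed form H(p) = A Diag(d) A^T, d_j = 1 iff (xhat + A^T p)_j > 0.
def phi(p, A, b, xhat):
    y = np.maximum(xhat + A.T @ p, 0.0)              # (xhat + A^T p)_+
    return 0.5 * y @ y - b @ p                       # objective (2)

def grad_phi(p, A, b, xhat):
    return A @ np.maximum(xhat + A.T @ p, 0.0) - b   # gradient (3)

def gen_hessian(p, A, xhat):
    d = (xhat + A.T @ p > 0.0).astype(float)         # active-set indicator
    return (A * d) @ A.T                             # A Diag(d) A^T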
The following result is a special case of the Taylor expansion with the remainder term in integral form.
LEMMA 1. For any real scalars η and ζ it holds
$$\frac{1}{2}\bigl((\eta+\zeta)_+\bigr)^2 - \frac{1}{2}(\eta_+)^2 - \zeta\,\eta_+ \;=\; \zeta^2 \int_0^1 \!\left(\int_0^1 \operatorname{sign}(\eta + st\,\zeta)_+\, ds\right) t\, dt. \qquad (5)$$
PROOF. Setting in (6) and (7) y = x̂ + A^T p and z = A^T q readily yields the required result (taking into account the cancellation of the linear terms involving b in the left-hand side of (8)).
REMARK 1. As is seen from (9), if the condition
$$\operatorname{sign}\bigl(\hat{x} + A^T p + \vartheta A^T q\bigr)_+ = \operatorname{sign}\bigl(\hat{x} + A^T p\bigr)_+ \qquad (10)$$
holds, which is the case, in particular, if
$$|(A^T q)_j| \le |(\hat{x} + A^T p)_j| \quad \text{whenever} \quad (A^T q)_j\,(\hat{x} + A^T p)_j < 0; \qquad (12)$$
that is, if certain components of the increment q are relatively small, then ϕ is exactly quadratic (11) in the corresponding neighborhood of p.
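For a quick illustration: if (x̂ + A^T p)_j = 2 and (A^T q)_j = −1.5, then (x̂ + A^T p + ϑ A^T q)_j = 2 − 1.5ϑ ≥ 0.5 > 0 for all ϑ ∈ [0, 1], so the j-th entry of the sign pattern in (10) is unchanged; if instead (A^T q)_j = −3, the sign switches for ϑ > 2/3 and (10) fails.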
As condition (10) and its consequence (11) suggest, one can try to minimize ϕ numerically by a Newton-type method p_{k+1} = p_k − d_k, where d_k = H(p_k)^{-1} g(p_k). Note that, by (11), this immediately gives the exact minimizer p_* = p_{k+1} if the magnitudes of the components of d_k are small enough to satisfy (12) with p = p_k and q = −d_k. However, initially p_k may be rather far from the solution, and only gradual improvements are possible. First, a damping factor α_k must be used to guarantee monotone convergence (with respect to the decrease of ϕ(p_k) as k increases). Second, H(p_k) must be replaced by an appropriate approximation M_k in order to ensure its invertibility with a reasonable bound for the inverse.
Therefore, we propose the following prototype scheme
$$p_{k+1} = p_k - \alpha_k M_k^{-1} g(p_k),$$
where
$$M_k = H(p_k) + \delta\,\mathrm{Diag}(AA^T). \qquad (13)$$
The parameters 0 < α_k ≤ 1 and 0 ≤ δ ≪ 1 must be chosen properly for better convergence. Furthermore, at the initial stages of the iteration, the most efficient strategy is to use approximate Newton directions d_k ≈ M_k^{-1} g(p_k), which can be obtained by the preconditioned conjugate gradient (PCG) method applied to the Newton equation M_k d_k = g(p_k). As will be seen later, it suffices to use any vector d_k satisfying the conditions
$$d_k^T g_k = d_k^T M_k d_k = \vartheta_k^2\, g_k^T M_k^{-1} g_k \qquad (14)$$
with 0 < ϑ_k < 1 sufficiently separated from zero. For any preconditioning, the approximations constructed by the PCG method satisfy (14); see Section 5.3 below.
Taking into account the Armijo-type criterion
$$\varphi(p_k - \alpha d_k) \le \varphi(p_k) - \frac{\alpha}{2}\, d_k^T g(p_k), \qquad \alpha \in \{1,\, 1/2,\, 1/4, \ldots\}, \qquad (15)$$
where the maximum steplength α satisfying (15) is used, the inexact Newton algorithm can be presented as follows:
Algorithm 1.
Input: A ∈ R^{m×n}, b ∈ R^m, x̂ ∈ R^n;
Initialization: δ = 10^{-6}, ε = 10^{-12}, τ = 10^{-15}
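The remaining steps of Algorithm 1 implement the damped iteration (13)–(15). A minimal Python sketch of this loop is given below; it reuses the helper functions from the sketch in Section 2, uses a dense solve in place of the PCG solver for brevity, and the way the tolerance ε enters the gradient-norm test (and the omission of τ) are our assumptions:

import numpy as np

# Sketch (a reconstruction, not the authors' listing) of the damped inexact
# Newton iteration (13)-(15); phi, grad_phi, gen_hessian are from the sketch above.
def inexact_newton(A, b, xhat, delta=1e-6, eps=1e-12, max_iter=100):
    p = np.zeros(A.shape[0])
    diag_AAt = np.einsum('ij,ij->i', A, A)            # Diag(A A^T)
    for _ in range(max_iter):
        g = grad_phi(p, A, b, xhat)
        if np.linalg.norm(g) <= eps:                  # assumed role of eps
            break
        M = gen_hessian(p, A, xhat) + delta * np.diag(diag_AAt)   # regularization (13)
        d = np.linalg.solve(M, g)                     # exact solve instead of PCG
        phi_p = phi(p, A, b, xhat)
        alpha = 1.0
        while phi(p - alpha * d, A, b, xhat) > phi_p - 0.5 * alpha * (d @ g):
            alpha *= 0.5                              # Armijo backtracking (15)
            if alpha < 1e-16:                         # safeguard, not in the original scheme
                break
        p = p - alpha * d
    return p, np.maximum(xhat + A.T @ p, 0.0)         # p_*, x_* = (xhat + A^T p_*)_+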
It appears that Algorithm 1 conforms exactly to the convergence analysis presented in [13] (see also [2]). For completeness of the presentation and compatibility of notation, we reproduce here the main results of [13].
The main assumptions needed for the function ϕ(p) under consideration are that it is bounded from below, has a gradient g(p) ∈ R^m, and satisfies
$$\varphi(p+q) - \varphi(p) - q^T g(p) \le \frac{\gamma}{2}\, q^T M q \qquad (16)$$
for the symmetric positive definite m × m matrix M = M(p) defined above in (13) and some constant γ ≥ 1. Note that exact knowledge of γ is not necessary for the actual calculations. The existence of such a γ follows from (13) and Lemma 3. Indeed, denoting D = (Diag(AA^T))^{1/2} and Â = D^{-1} A, for the right-hand side of (8) one has, taking into account ‖Diag(d)‖ ≤ 1 and H(p) ≥ 0,
$$A\,\mathrm{Diag}(d)\,A^T \;\le\; AA^T \;\le\; \|\hat{A}\|^2\,\mathrm{Diag}(AA^T) \;\le\; \frac{\|\hat{A}\|^2}{\delta}\Bigl(\delta\,\mathrm{Diag}(AA^T) + H(p)\Bigr) \;=\; \frac{\|\hat{A}\|^2}{\delta}\, M;$$
therefore, (16) holds with
$$\gamma = \|\hat{A}\|^2/\delta. \qquad (17)$$
The latter formula explains our choice of M, which is more appropriate in cases of large variation in the norms of the rows of A (see the examples and discussion in Section 6).
Next we estimate the reduction in the value of ϕ attained by descent along the direction (−d) satisfying (14). One can show the following estimate for the decrease of the objective function value at each iteration (here the simplified notation p = p_k, p̂ = p_{k+1}, etc. is used), where p̂ = p − αd with α = 2^{-l}, l = 0, 1, ..., chosen according to (15):
$$\varphi(\hat{p}) \le \varphi(p) - \frac{\vartheta^2}{4\gamma}\, g^T M^{-1} g. \qquad (18)$$
Noting that the coefficient ϑ²/(4γ) in the latter estimate is bounded away from zero uniformly in k, and that ϕ is bounded from below, summing (18) over k finally yields
$$\lim_{k\to\infty} \|g(p_k)\| = 0,$$
where the PCG direction vectors are pairwise M-orthogonal: (s^{(j)})^T M s^{(l)} = 0, j ≠ l. Let us also denote the squared M-norms of the PCG directions by η^{(j)} = (s^{(j)})^T M s^{(j)}, j = 0, 1, ..., i−1. Therefore, from (21), one obtains
$$\zeta^{(i)} = (d^{(i)})^T M d^{(i)} = \sum_{j=0}^{i-1} \eta^{(j)},$$
where k is the Newton iteration number. Summing up the latter inequalities for 0 ≤ k ≤ m − 1, we get
$$c_0 \equiv 4\gamma(\varphi_0 - \varphi_*) \;\ge\; \sum_{k=0}^{m-1} \sum_{j=0}^{i_k-1} \eta_k^{(j)}. \qquad (22)$$
On the other hand, the cost measure related to the total time needed to perform m inexact Newton iterations with i_k PCG iterations at each Newton step can be estimated as proportional to
$$T_m = \sum_{k=0}^{m-1}\bigl(\varepsilon_{CG}^{-1} + i_k\bigr) \;\le\; c_0\, \frac{\sum_{k=0}^{m-1}\bigl(\varepsilon_{CG}^{-1} + i_k\bigr)}{\sum_{k=0}^{m-1}\sum_{j=0}^{i_k-1}\eta_k^{(j)}} \;\le\; c_0 \max_{k<m} \frac{\varepsilon_{CG}^{-1} + i_k}{\sum_{j=0}^{i_k-1}\eta_k^{(j)}}$$
(the first inequality uses (22), and the second one uses the fact that a ratio of sums does not exceed the maximum of the termwise ratios).
Here ε_CG is a small parameter reflecting the ratio of the cost of one linear PCG iteration to the cost of one Newton iteration (the latter including, in particular, the construction of the preconditioning and several evaluations of ϕ needed for backtracking), plus a possible efficiency loss due to early PCG termination. Thus, introducing the function ψ(i) = (ε_CG^{-1} + i)/ζ^{(i)} (here we omit the index k), one obtains a reasonable criterion to stop the PCG iterations in the form ψ(i) > ψ(i − 1). Note that the use of smaller values of ε_CG generally corresponds to an increase of the resulting iteration number bound.
Rewriting the latter condition, one obtains the final form of the PCG stopping rule:
$$\bigl(\varepsilon_{CG}^{-1} + i\bigr)\,\eta^{(i-1)} \le \zeta^{(i)}. \qquad (23)$$
Note that by this rule, the PCG iteration number is always no less than 2.
Finally, we explicitly present the resulting formulae for the PCG algorithm incorporating the new stopping rule. Following [9], we use the Jacobi preconditioning
$$C = (\mathrm{Diag}(M))^{-1}. \qquad (24)$$
Moreover, the reformulation [14] of the CG algorithm [11, 1] is used; this may give a more efficient parallel implementation, see, e.g., [9].
Following [14], recall that at each PCG iteration the M^{-1}-norm of the (i+1)-th residual r^{(i+1)} = g − M d^{(i+1)} attains its minimum over the corresponding Krylov subspace. Using the standard PCG recurrences (see Section 5.3 below), one can find d^{(i+1)} = d^{(i)} + C r^{(i)} α^{(i)} + s^{(i−1)} α^{(i)} β^{(i−1)}. Therefore, the optimum increment s^{(i)} in the recurrence d^{(i+1)} = d^{(i)} + s^{(i)}, where s^{(i)} = V^{(i)} h^{(i)} and V^{(i)} = [ C r^{(i)} | s^{(i−1)} ], can be determined via the solution of the following 2-dimensional linear least squares problem:
$$h^{(i)} = \begin{bmatrix} \alpha^{(i)} \\ \beta^{(i)} \end{bmatrix} = \arg\min_{h\in\mathbb{R}^2} \bigl\|g - M d^{(i+1)}\bigr\|_{M^{-1}} = \arg\min_{h\in\mathbb{R}^2} \bigl\|r^{(i)} - M V^{(i)} h\bigr\|_{M^{-1}}.$$
By redefining r^{(i)} := −r^{(i)} and introducing the vectors t^{(i)} = M s^{(i)}, the required PCG reformulation follows:
Algorithm 2.
r^{(0)} = −g, d^{(0)} = s^{(−1)} = t^{(−1)} = 0, ζ^{(−1)} = 0;
for i = 0, 1, ..., itmax:
  w^{(i)} = C r^{(i)},
  z^{(i)} = M w^{(i)},
  γ^{(i)} = (r^{(i)})^T w^{(i)}, ξ^{(i)} = (w^{(i)})^T z^{(i)}, η^{(i−1)} = (s^{(i−1)})^T t^{(i−1)},
  ζ^{(i)} = ζ^{(i−1)} + η^{(i−1)},
  if (ε_CG^{-1} + i) η^{(i−1)} ≤ ζ^{(i)} or γ^{(i)} ≤ ε_CG² γ^{(0)} then return {d^{(i)}};
  if i = 0 then
    α^{(i)} = −γ^{(i)}/ξ^{(i)}, β^{(i)} = 0;
  else
    δ^{(i)} = γ^{(i)} / (ξ^{(i)} η^{(i−1)} − (γ^{(i)})²),
    α^{(i)} = −η^{(i−1)} δ^{(i)}, β^{(i)} = γ^{(i)} δ^{(i)};
  end if
  t^{(i)} = z^{(i)} α^{(i)} + t^{(i−1)} β^{(i)}, r^{(i+1)} = r^{(i)} + t^{(i)},
  s^{(i)} = w^{(i)} α^{(i)} + s^{(i−1)} β^{(i)}, d^{(i+1)} = d^{(i)} + s^{(i)}.
For maximum reliability, the new stopping rule (23) is used along with the standard one; however, in almost all cases the new rule provides earlier CG termination.
Despite a somewhat larger workspace and number of vector operations compared to the standard algorithm, the above version of the CG algorithm enables a more efficient parallel implementation of the scalar product operations. At each iteration of the above algorithm, it suffices to use one MPI_Allreduce(*,*,3,...) operation instead of the two MPI_Allreduce(*,*,1,...) operations in the standard PCG recurrences. This is especially important when many MPI processes are used and the start-up time of MPI_Allreduce operations is relatively large. For other equivalent PCG reformulations that allow proper reordering of the scalar product operations, see [5] and the references cited therein.
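For concreteness, the recurrences of Algorithm 2 together with the Jacobi preconditioning (24) can be sketched in Python as follows (this is a serial, dense-matrix illustration; the function name, the default value of ε_CG, and the explicit guard excluding the degenerate check at i = 0 are our assumptions):

import numpy as np

# Sketch of Algorithm 2 (a reconstruction): PCG for M d = g with Jacobi
# preconditioning (24) and the stopping rule (23).
def pcg_algorithm2(M, g, eps_cg=0.05, itmax=200):
    C = 1.0 / np.diag(M)                         # Jacobi preconditioner, applied elementwise
    d = np.zeros_like(g); s = np.zeros_like(g); t = np.zeros_like(g)
    r = -np.asarray(g, dtype=float)              # r^(0) = -g
    zeta = 0.0
    gamma0 = None
    for i in range(itmax + 1):
        w = C * r                                # w^(i) = C r^(i)
        z = M @ w                                # z^(i) = M w^(i)
        gamma = r @ w; xi = w @ z; eta = s @ t   # gamma^(i), xi^(i), eta^(i-1)
        zeta = zeta + eta                        # zeta^(i) = zeta^(i-1) + eta^(i-1)
        if gamma0 is None:
            gamma0 = gamma
        stop_new = i > 0 and (1.0 / eps_cg + i) * eta <= zeta   # rule (23); i > 0 guard added
        stop_std = gamma <= (eps_cg ** 2) * gamma0              # standard relative test
        if stop_new or stop_std:
            return d                             # returns d^(i)
        if i == 0:
            alpha = -gamma / xi; beta = 0.0
        else:
            delta = gamma / (xi * eta - gamma ** 2)
            alpha = -eta * delta; beta = gamma * delta
        t = z * alpha + t * beta                 # t^(i)
        r = r + t                                # r^(i+1)
        s = w * alpha + s * beta                 # s^(i)
        d = d + s                                # d^(i+1)
    return d

In the Newton context, a call of the form d = pcg_algorithm2(M_k, g_k) would replace the exact solve used in the sketch of Algorithm 1 above.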
Let us recall some basic properties of the PCG algorithm; see, e.g., [1]. The standard PCG algorithm (algebraically equivalent to Algorithm 2) for the solution of the problem Md = g can be written as follows (the initial guess for the solution d_0 is set to zero):
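For comparison, a minimal Python sketch of these standard recurrences (with the Jacobi preconditioning (24) and a residual-based exit test, the latter being an illustrative assumption) reads:

import numpy as np

# Sketch of the standard PCG recurrences for M d = g with zero initial guess;
# note the two separate scalar products per iteration, in contrast to Algorithm 2.
def pcg_standard(M, g, tol=1e-12, itmax=200):
    C = 1.0 / np.diag(M)
    d = np.zeros_like(g)
    r = np.asarray(g, dtype=float)   # residual g - M d with d = 0
    w = C * r
    rho = r @ w
    s = w.copy()                     # first search direction
    for _ in range(itmax):
        q = M @ s
        alpha = rho / (s @ q)
        d = d + alpha * s
        r = r - alpha * q
        if np.linalg.norm(r) <= tol * np.linalg.norm(g):
            break
        w = C * r
        rho_new = r @ w
        beta = rho_new / rho         # standard PCG direction update
        s = w + beta * s
        rho = rho_new
    return d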
The scaling property (14) (omitting the upper and lower indices at d, it reads d^T g = d^T M d) can be proved as follows. Let d = d^{(i)} be obtained after i iterations of the PCG method applied to Md = g with zero initial guess d^{(0)} = 0. Therefore, d ∈ K_i = span{Cg, CMCg, ..., (CM)^{i−1} Cg}, and, by the PCG optimality property, for any scalar α it holds
$$\frac{1}{2}\, d^T M d - d^T g \;\le\; \frac{1}{2}\, \alpha^2 d^T M d - \alpha\, d^T g.$$
Setting here α = d^T g / d^T M d, one can easily transform this inequality into 0 ≥ (d^T M d − d^T g)², which readily yields (14). Furthermore, by the well-known estimate of the PCG iteration error [1] using Chebyshev polynomials, one gets
$$1 - \theta^2 \equiv \frac{(g - Md)^T M^{-1} (g - Md)}{g^T M^{-1} g} \;\le\; \cosh^{-2}\!\bigl(2i/\sqrt{\kappa}\bigr),$$
where
$$\kappa = \mathrm{cond}(CM) \equiv \lambda_{\max}(CM)/\lambda_{\min}(CM).$$
By the scaling condition, this gives
$$\theta^2 = d^T M d / g^T M^{-1} g \;\ge\; \tanh^2\!\bigl(2i/\sqrt{\kappa}\bigr). \qquad (26)$$
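For instance, if κ = cond(CM) = 100, then already after i = 5 PCG iterations estimate (26) gives θ² ≥ tanh²(1) ≈ 0.58, so the factor ϑ in (14) is well separated from zero.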
Below we consider two families of test problems which can be solved via the minimization of piecewise quadratic functions. The first one was described above in Section 2 (see also [6]), while the second coincides with the problem setting for the evaluation
of the distance between two convex polyhedra used in [3]. The latter problem is of key importance, e.g., in robotics and computer animation.
Matrix data from the following 11 linear programming problems (the same selection from the NETLIB collection as considered in [15]) were used to form test problems (1). Note that in what follows we only consider the case x̂ = 0. Recall also the notation x_* = (x̂ + A^T p_*)_+. The problems in Table 1 below are ordered by the number of nonzero elements nz(A) in A ∈ R^{m×n}.
It is readily seen that 3 out of the 11 matrices have null rows, and more than half of them have a rather large variation of row norms. This explains the proposed Hessian regularization (13) instead of the earlier construction [6, 15] M_k = H(p_k) + δ I_m. The latter is a proper choice only for matrices with rows of nearly equal length, such as the maros r7 example or various matrices with uniformly distributed quasirandom entries, as used for testing in [9, 15]. In particular, estimate (17) with D = I would take the form γ = ‖A‖²/δ, so the resulting method appears to be rather sensitive to the choice of δ.
In Table 2, the results presented in [15] are reproduced along with similar data obtained with our version of the Generalized Newton method. It must be stressed that we used the same fixed set of tuning parameters for all problems. Note that in [15] the parameter choice for the Armijo procedure was not specified.
Table 3, where the timing (in seconds) and precision results averaged over the same 11 problems are given. One can see that nearly the same average residual norm ‖Ax − b‖_∞ can be obtained considerably faster and with a less critical dependence on ε_CG when using the new PCG iteration stopping rule.
Let the two convex polyhedra X_1 and X_2 be described by the following two systems of linear inequalities:
and the vector b = [b_1^T, b_2^T]^T ∈ R^{n_1+n_2}, we consider the problem
$$x_*(\varepsilon) = \arg\min_{x\in\mathbb{R}^{2s}} \Bigl\{ \frac{\varepsilon}{2}\,\|x\|^2 + \frac{1}{2}\, x^T B x + \frac{1}{2\varepsilon}\,\bigl\|(A^T x - b)_+\bigr\|^2 \Bigr\},$$
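One natural way to assemble the data of this penalized problem is sketched below; the block structure of A and B and the description X_i = {y ∈ R^s : A_i^T y ≤ b_i} are our illustrative assumptions and may differ in details from the setting used in the experiments:

import numpy as np

# Illustrative assembly of the penalized polyhedra-distance problem; the block
# structure below (x = [y1; y2], x^T B x = ||y1 - y2||^2) is an assumption.
def distance_problem_data(A1, b1, A2, b2):
    s = A1.shape[0]
    A = np.block([[A1, np.zeros((s, A2.shape[1]))],
                  [np.zeros((s, A1.shape[1])), A2]])   # A^T x <= b collects both systems
    b = np.concatenate([b1, b2])
    I = np.eye(s)
    B = np.block([[I, -I], [-I, I]])                   # x^T B x = ||y1 - y2||^2
    return A, b, B

def penalized_objective(x, A, b, B, eps):
    viol = np.maximum(A.T @ x - b, 0.0)                # constraint violations
    return 0.5 * eps * (x @ x) + 0.5 * x @ (B @ x) + 0.5 / eps * (viol @ viol)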
Acknowledgements This work was supported by the Russian Foundation for Basic Research grant No. 17-07-00510 and by Program No. 26 of the Presidium of the Russian Academy of Sciences. The authors are grateful to the anonymous referees and to Prof. V. Garanzha for many useful comments which greatly improved the exposition of the paper.
Table 4 Performance of the generalized Newton method (ε = 10^{-4}) for the problem of the distance between two quasirandom convex polyhedra with n/2 faces each

n       ‖x1 − x2‖2   ‖A^T x∗ − c‖∞   time (sec)   ‖g(x∗)‖∞    #NewtIter
8       0.001815     9.69e−09        < 0.001      7.89e−13    15
16      0.481528     8.63e−05        < 0.001      1.27e−13    3
32      0.795116     8.80e−05        < 0.001      1.46e−12    28
64      1.102286     1.32e−04        < 0.001      5.58e−13    13
128     1.446262     1.36e−04        < 0.001      7.12e−13    17
256     1.449913     9.54e−05        < 0.001      4.37e−13    11
512     1.460197     1.31e−04        0.001        8.16e−13    15
1024    1.460063     1.46e−04        0.002        1.09e−12    14
2048    1.463320     1.04e−04        0.005        6.58e−13    19
4096    1.463766     1.26e−04        0.009        3.59e−13    20
8192    1.463879     1.03e−04        0.009        8.32e−14    12
16384   1.463976     7.58e−05        0.009        1.64e−12    13
32768   1.464046     3.28e−05        0.018        1.54e−12    13
References
1. Axelsson, O.: A class of iterative methods for finite element equations. Computer Meth. Appl.
Mech. Engrg. 9, 123–137 (1976)
2. Axelsson, O., Kaporin, I.E.: Error norm estimation and stopping criteria in preconditioned
conjugate gradient iterations. Numer. Linear Algebra Appls. 8 (4), 265–286 (2001)
3. Bobrow, J.E.: A direct minimization approach for obtaining the distance between convex
polyhedra. The International Journal of Robotics Research, 8(3), 65–76 (1989)
4. Dembo, R., Steihaug, T.: Truncated Newton algorithms for large-scale unconstrained optimization. Math. Program. 26, 190–212 (1983)
5. Dongarra, J., Eijkhout, V.: Finite-choice algorithm optimization in Conjugate Gradients. LAPACK Working Note 159, University of Tennessee Computer Science Report UT-CS-03-502 (2003)
6. Ganin, B.V., Golikov, A.I., Evtushenko, Y.G.: Projective-dual method for solving systems of
linear equations with nonnegative variables. Comput. Math. and Math. Phys. 58 (2), 159–169
(2018)
7. Garanzha V.A., Kaporin I.E.: Regularization of the barrier variational method of grid genera-
tion. Comput. Math. and Math. Phys. 39 (9), 1426–1440 (1999)
8. Garanzha, V., Kaporin, I., Konshin, I.: Truncated Newton type solver with application to grid
untangling problem. Numer. Linear Algebra Appls. 11 (5-6), 525–533 (2004)
9. Garanzha, V.A., Golikov, A.I., Evtushenko, Y.G., Nguen, M.K.: Parallel implementation of
Newton’s method for solving large-scale linear programs. Comput. Math. and Math. Phys. 49
(8), 1303–1317 (2009)
10. Hiriart-Urruty, J.-B., Strodiot, J.-J., Nguyen, V.H.: Generalized Hessian matrix and second-order optimality conditions for problems with C^{1,1} data. Applied Mathematics and Optimization 11 (1), 43–56 (1984)
11. Hestenes, M.R., Stiefel, E.L.: Methods of conjugate gradients for solving linear systems. J.
Research Nat. Bur. Standards 49 (1), 409–436 (1952)
12. Kaporin, I.E., Axelsson, O.: On a class of nonlinear equation solvers based on the residual
norm reduction over a sequence of affine subspaces. SIAM J. Sci. Comput. 16 (1), 228–249
(1995)
13. Kaporin, I.E.: Using inner conjugate gradient iterations in solving large-scale sparse nonlinear
optimization problems. Comput. Math. and Math. Phys. 43 (6), 766–771 (2003)
14. Kaporin, I.E., Milyukova, O.Y.: The massively parallel preconditioned conjugate gradient
method for the numerical solution of linear algebraic equations. In: Zhadan, V.G. (ed.) Col-
lection of Papers of the Department of Applied Optimization of the Dorodnicyn Computing
Center, pp. 132–157, Russian Academy of Sciences, Moscow (2011)
15. Ketabchi, S., Moosaei, H., Parandegan, M., Navidi, H.: Computing minimum norm solution
of linear systems of equations by the generalized Newton method. Numerical Algebra, Con-
trol and Optimization, 7 (2), 113–119 (2017)
16. Mangasarian, O.L.: A finite Newton method for classification. Optimization Methods and
Software, 17 (5), 913–929 (2002)
17. Mangasarian, O.L.: A Newton method for linear programming. Journal of Optimization The-
ory and Applications, 121 (1), 1–18 (2004)
18. Yu, L., Barbot, J.P., Zheng, G., Sun, H.: Compressive sensing with chaotic sequence. IEEE
Signal Processing Letters, 17(8), 731–734 (2010)