
LEVENBERG-MARQUARDT METHODS FOR CONSTRAINED NONLINEAR EQUATIONS WITH STRONG LOCAL CONVERGENCE PROPERTIES¹

Christian Kanzow², Nobuo Yamashita³ and Masao Fukushima³

Preprint 244, April 2002

² Institute of Applied Mathematics and Statistics, University of Würzburg, Am Hubland, 97074 Würzburg, Germany; e-mail: [email protected]

³ Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan; e-mail: [email protected], [email protected]

¹ This research was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Science, Sports, and Culture of Japan.
Abstract. We consider the problem of finding a solution of a constrained (and not necessarily square) system of equations, i.e., we consider systems of nonlinear equations and want to find a solution that belongs to a certain feasible set. To this end, we present two Levenberg-Marquardt-type algorithms that differ in the way they compute their search directions. The first method solves a strictly convex minimization problem at each iteration, whereas the second one solves only one system of linear equations in each step. Both methods are shown to converge locally quadratically under an error bound assumption that is much weaker than the standard nonsingularity condition. Both methods can be globalized in an easy way. Some numerical results for the second method indicate that the algorithm works quite well in practice.

Key Words. Constrained equations, Levenberg-Marquardt method, projected gradients, quadratic convergence, error bounds.

1 Introduction
In this paper we consider the problem of finding a solution of the constrained system of
nonlinear equations
F (x) = 0, x ∈ X, (1)
where X ⊆ Rn is a nonempty, closed and convex set and F : O → Rm is a given mapping
defined on an open neighbourhood O of the set X. Note that the dimensions n and m do
not necessarily coincide. We denote by X ∗ the set of solutions to (1).
The solution of an unconstrained square system of nonlinear equations, where X = Rn and
n = m in (1), is a classical problem in mathematics for which many well-known
solution techniques like Newton’s method, quasi-Newton methods, Gauss-Newton methods,
Levenberg-Marquardt methods etc. are available, see, e.g., [19, 4, 14] for three standard
books on this subject.
The solution of a constrained (and possibly nonsquare) system of equations like problem
(1), however, has not been the subject of intense research. In fact, the authors are currently
only aware of the recent papers [9, 15, 12, 13, 18, 23, 22, 1, 20] that deal with constrained
(typically box constrained) systems of equations. Most of these papers describe algorithms
that have certain global and local fast convergence properties under a nonsingularity as-
sumption at the solution.
The nonsingularity assumption implies that the solution is locally unique. Here we
present some Levenberg-Marquardt-type algorithms that are locally quadratically conver-
gent under a weaker assumption that, in particular, allows the solution set to be (locally)
nonunique. To this end, we replace the nonsingularity assumption by an error bound con-
dition. This is motivated by the recent paper [24] that deals with unconstrained equations
only. See also [3, 7] for some subsequent related results for the unconstrained case.
On the other hand, the possibility of dealing with constrained equations is very important.
In fact, systems of nonlinear equations arising in several applications are often constrained.
For example, in chemical equilibrium systems (see, e.g., [16, 17]), the variables correspond to
the concentration of certain elements that are naturally nonnegative. Furthermore, in many
economic equilibrium problems, the mapping F is not defined everywhere (see, e.g., [6]) so
that one is urged to impose suitable constraints on the variables. Finally, engineers often
have a good guess regarding the area where they expect their solution to lie; such a priori
knowledge can then easily be incorporated by adding suitable constraints to the system of
equations.
The organization of this paper is as follows: Section 2 describes a constrained Levenberg-
Marquardt method for the solution of problem (1). It is shown that this method has some nice
local convergence properties under fairly mild assumptions. We also note that the method
can be globalized quite easily. The main disadvantage of this method is that it has to solve
relatively complicated subproblems at each iteration, namely (strictly convex) quadratic pro-
grams in the special case where the set X is polyhedral, and convex minimization problems
in the general case.
In order to avoid this drawback, we present a variant of the constrained Levenberg-
Marquardt method in Section 3 (called the projected Levenberg-Marquardt method) that
solves only a system of linear equations per iteration. This method is shown to have es-

sentially the same local (and global) convergence properties as the method of Section 2.
Numerical results for this method are presented in Section 4. We conclude the paper with
some remarks in Section 5.
The notation used in this paper is standard: The Euclidean norm is denoted by k · k, Bδ (x) := {y ∈ Rn | ky − xk ≤ δ} is the closed ball centered at x with radius δ > 0, dist(y, X ∗ ) := inf{ky − xk | x ∈ X ∗ } denotes the distance from a point y to the solution set X ∗ , and PX (x) is the projection of a point x ∈ Rn onto the feasible set X.
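To fix ideas, both operations are inexpensive when X is a box. The following small Python sketch is ours (it is not part of the paper) and approximates dist(y, X ∗ ) over a finite sample of known solutions:

import numpy as np

def project_box(x, lower, upper):
    """P_X(x) for the box X = {z : lower <= z <= upper}, computed componentwise."""
    return np.clip(x, lower, upper)

def dist_to_solution_set(y, solutions):
    """dist(y, X*), approximated over a finite sample of points from X*."""
    return min(np.linalg.norm(y - s) for s in solutions)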

2 Constrained Levenberg-Marquardt Method


This section describes and investigates a constrained Levenberg-Marquardt method for the
solution of the constrained system of nonlinear equations (1). The algorithm and the assump-
tions will be given in detail in Subsection 2.1. The convergence of the distance from the iter-
ates to the solution set will be discussed in Subsection 2.2, while Subsection 2.3 considers the
local behaviour of the iterates themselves. A globalized version of the Levenberg-Marquardt
method is given in Subsection 2.4.

2.1 Algorithm and Assumptions


For solving (1) we consider the related optimization problem
min f (x) s.t. x ∈ X, (2)
where
f (x) := kF (x)k2
denotes the natural merit function corresponding to the mapping F . A Gauss-Newton-type
method for this (not necessarily square) system of equations generates a sequence {xk } by
setting xk+1 := xk + dk , where dk is a solution of the linearized subproblem
min f k (d) s.t. xk + d ∈ X (3)
with the objective function
f k (d) := kF (xk ) + Hk dk2 ,
where matrix Hk ∈ Rm×n is an approximation to the (not necessarily existing) Jacobian
F 0 (xk ). However, since we allow the solution of problem (1) to be nonunique and nonisolated,
we replace subproblem (3) by a regularized problem of the form
min θk (d) s.t. xk + d ∈ X (4)
with the objective function
θk (d) := kF (xk ) + Hk dk2 + µk kdk2 , (5)
where µk is a positive parameter. Note that θk is a strictly convex quadratic function. Hence
the solution dk of subproblem (4) always exists uniquely.
Formally, we arrive at the following method for the solution of the constrained system of
nonlinear equations (1).

Algorithm 2.1 (Constrained Levenberg-Marquardt Method: Local Version)

(S.0) Choose x0 ∈ X, µ > 0, and set k := 0.

(S.1) If F (xk ) = 0, STOP.

(S.2) Choose Hk ∈ Rm×n , set µk := µkF (xk )k2 , and compute dk as the solution of (4).

(S.3) Set xk+1 := xk + dk , k ← k + 1, and go to (S.1).
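When X is a box [l, u], subproblem (4) is a bound-constrained linear least-squares problem: θk (d) equals kAd − bk2 with A obtained by stacking Hk on top of √µk I and b = (−F (xk ), 0). The following Python sketch of one iteration of Algorithm 2.1 is ours (the authors' implementation in Section 4 is in MATLAB); it assumes Hk = F 0 (xk ) and uses SciPy's generic bound-constrained least-squares solver lsq_linear:

import numpy as np
from scipy.optimize import lsq_linear

def constrained_lm_step(F, J, x, lower, upper, mu=1.0):
    """Steps (S.2)-(S.3) of Algorithm 2.1 for a box X = [lower, upper].
    Subproblem (4) is rewritten as min ||A d - b||^2 subject to
    lower - x <= d <= upper - x, with A = [H; sqrt(mu_k) I], b = [-F(x); 0].
    Per (S.1), the caller should stop beforehand if F(x) = 0."""
    Fx, H = F(x), J(x)
    n = H.shape[1]
    mu_k = mu * (Fx @ Fx)                 # mu_k := mu * ||F(x^k)||^2
    A = np.vstack([H, np.sqrt(mu_k) * np.eye(n)])
    b = np.concatenate([-Fx, np.zeros(n)])
    d = lsq_linear(A, b, bounds=(lower - x, upper - x)).x
    return x + d                          # x^{k+1} := x^k + d^k

For a general polyhedral X, a strictly convex QP solver would take the place of lsq_linear.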

Note that the algorithm is well-defined and that all iterates xk belong to the feasible set
X. To establish our (local) convergence results for Algorithm 2.1, we need the following
assumptions.

Assumption 2.2 The solution set X ∗ of problem (1) is nonempty. For some solution x∗ ∈
X ∗ , there exist constants δ > 0, c1 > 0, c2 > 0 and L > 0 such that the following inequalities
hold:

c1 dist(x, X ∗ ) ≤ kF (x)k ∀x ∈ Bδ (x∗ ) ∩ X, (6)


kF (x) − F (xk ) − Hk (x − xk )k ≤ c2 kx − xk k2 ∀x, xk ∈ Bδ (x∗ ) ∩ X, (7)
kF (x) − F (y)k ≤ Lkx − yk ∀x, y ∈ Bδ (x∗ ) ∩ X. (8)

Assumption (6) is a local error bound condition and known to be much weaker than the more
standard nonsingularity of the Jacobian F 0 (x∗ ) in the case where this Jacobian exists and is
a square matrix (i.e., if F is differentiable and n = m). For example, this local error bound
condition is satisfied when F is affine and X is polyhedral. To see this, let F (x) = Ax + a
and X = {x | Bx ≤ b} with appropriate matrices A, B and vectors a, b. Due to Hoffman’s
[11] famous error bound result, there exists τ > 0 such that

τ dist(x, X ∗ ) ≤ kF (x)k + kx − PX (x)k. (9)

If x ∈ Bδ (x∗ ) ∩ X for some x∗ ∈ X ∗ , then x − PX (x) = 0. So, (9) reduces to

τ dist(x, X ∗ ) ≤ kF (x)k,

which implies condition (6).


Furthermore, assumption (7) may be viewed as a smoothness condition on F together
with a requirement on the choice of matrix Hk . For example, this condition is satisfied with
the choice Hk := F 0 (xk ) if F is continuously differentiable with F 0 being locally Lipschitzian.
Finally, assumption (8) only says that F is locally Lipschitzian in a neighbourhood of
the solution x∗ . Of course, this condition is automatically satisfied if F is a continuously
differentiable function.

2.2 Local Convergence of Distance Function
Throughout this subsection, we suppose that Assumption 2.2 holds. The constants δ, c1 , c2 ,
and L that appear in the subsequent analysis are always the constants from Assumption 2.2.
Our aim is to show that Algorithm 2.1 is locally quadratically convergent in the sense that
the distance from the iterates xk to the solution set X ∗ goes down to zero with a quadratic
rate. In order to verify this result, we need to prove a couple of technical lemmas. These
lemmas can be derived by suitable modifications of the corresponding unconstrained results
in [24].

Lemma 2.3 There exist constants c3 > 0 and c4 > 0 such that the following inequalities
hold for each xk ∈ Bδ/2 (x∗ ) ∩ X:
(a) kdk k ≤ c3 dist(xk , X ∗ ).
(b) kF (xk ) + Hk dk k ≤ c4 dist(xk , X ∗ )2 .
Proof. (a) Let x̄k ∈ X ∗ denote the closest solution to xk so that
kxk − x̄k k = dist(xk , X ∗ ). (10)
Since dk is the global minimum of subproblem (4) and xk + d¯k ∈ X holds for the vector
d¯k := x̄k − xk , we have
θk (dk ) ≤ θk (d¯k ) = θk (x̄k − xk ). (11)
Furthermore, since xk ∈ Bδ/2 (x∗ ) by assumption, we obtain
kx̄k − x∗ k ≤ kx̄k − xk k + kxk − x∗ k ≤ kx∗ − xk k + kxk − x∗ k ≤ δ
so that x̄k ∈ Bδ (x∗ ) ∩ X. Moreover, the definition of µk in Algorithm 2.1 together with (6)
and (10) gives
µk = µkF (xk )k2 ≥ µc21 dist(xk , X ∗ )2 = µc21 kxk − x̄k k2 . (12)
Using (10), (11), (12) and (7), we obtain from the definition of the function θk in (5) that
$$\begin{aligned}
\|d^k\|^2 &\le \frac{1}{\mu_k}\,\theta_k(d^k)\\
&\le \frac{1}{\mu_k}\,\theta_k(\bar x^k - x^k)\\
&= \frac{1}{\mu_k}\Bigl(\bigl\|F(x^k) + H_k(\bar x^k - x^k)\bigr\|^2 + \mu_k\|\bar x^k - x^k\|^2\Bigr)\\
&= \frac{1}{\mu_k}\,\bigl\|F(x^k) - \underbrace{F(\bar x^k)}_{=0} - H_k(x^k - \bar x^k)\bigr\|^2 + \|\bar x^k - x^k\|^2\\
&\le \frac{1}{\mu_k}\,c_2^2\,\|x^k - \bar x^k\|^4 + \|x^k - \bar x^k\|^2\\
&\le \frac{c_2^2}{\mu c_1^2}\,\|x^k - \bar x^k\|^2 + \|x^k - \bar x^k\|^2\\
&= \Bigl(\frac{c_2^2}{\mu c_1^2} + 1\Bigr)\,\mathrm{dist}(x^k, X^*)^2.
\end{aligned}$$

Therefore, statement (a) holds with c3 := √(c2²/(µc1²) + 1).

(b) The definition of θk in (5) implies


kF (xk ) + Hk dk k2 ≤ θk (dk ). (13)
On the other hand, from (11), (5) and (7), we have
θk (dk ) ≤ θk (x̄k − xk )
≤ kF (xk ) − F (x̄k ) − Hk (xk − x̄k )k2 + µk kx̄k − xk k2 (14)
≤ c22 kxk − x̄k k4 + µk kxk − x̄k k2 .
Since (8) yields
µk = µkF (xk )k2 = µkF (xk ) − F (x̄k )k2 ≤ µL2 kxk − x̄k k2 ,
we obtain from (13) and (14) that
$$\begin{aligned}
\|F(x^k) + H_k d^k\|^2 &\le \theta_k(d^k)\\
&\le c_2^2\,\|x^k - \bar x^k\|^4 + \mu_k\,\|x^k - \bar x^k\|^2\\
&\le c_2^2\,\|x^k - \bar x^k\|^4 + \mu L^2\,\|x^k - \bar x^k\|^4\\
&= (c_2^2 + \mu L^2)\,\|x^k - \bar x^k\|^4.
\end{aligned}$$
Hence statement (b) holds with c4 := √(c2² + µL²). □

The next result is a major step in verifying local quadratic convergence of the distance
function.

Lemma 2.4 Assume that both xk−1 and xk belong to the ball Bδ/2 (x∗ ) for each k ∈ N. Then
there is a constant c5 > 0 such that
dist(xk , X ∗ ) ≤ c5 dist(xk−1 , X ∗ )2
for each k ∈ N.
Proof. Since xk , xk−1 ∈ Bδ/2 (x∗ ) and xk = xk−1 + dk−1 , we obtain from (7) that

$$\bigl\|F(x^{k-1} + d^{k-1}) - F(x^{k-1}) - H_{k-1}\,d^{k-1}\bigr\| \le c_2\,\|d^{k-1}\|^2.$$
Using the error bound assumption (6) and Lemma 2.3, we therefore obtain
$$\begin{aligned}
c_1\,\mathrm{dist}(x^k, X^*) &\le \|F(x^k)\| = \|F(x^{k-1} + d^{k-1})\|\\
&\le \bigl\|F(x^{k-1}) + H_{k-1}\,d^{k-1}\bigr\| + c_2\,\|d^{k-1}\|^2\\
&\le c_4\,\mathrm{dist}(x^{k-1}, X^*)^2 + c_2 c_3^2\,\mathrm{dist}(x^{k-1}, X^*)^2\\
&= (c_4 + c_2 c_3^2)\,\mathrm{dist}(x^{k-1}, X^*)^2,
\end{aligned}$$
and this completes the proof by setting c5 := (c4 + c2 c3²)/c1 . □

The next result shows that the assumption of Lemma 2.4 is satisfied if the starting point x0
in Algorithm 2.1 is chosen sufficiently close to the solution set X ∗ . Let
$$r := \min\Bigl\{\frac{\delta}{2(1 + 2c_3)},\; \frac{1}{2c_5}\Bigr\}. \tag{15}$$

Lemma 2.5 Assume that the starting point x0 ∈ X used in Algorithm 2.1 belongs to the ball
Br (x∗ ), where r is defined by (15). Then all iterates xk generated by Algorithm 2.1 belong
to the ball Bδ/2 (x∗ ).
Proof. The proof is by induction on k. We start with k = 0. By assumption, we have
x0 ∈ Br (x∗ ). Since r ≤ δ/2, this implies x0 ∈ Bδ/2 (x∗ ). Now let k ≥ 0 be arbitrarily given
and assume that xl ∈ Bδ/2 (x∗ ) for all l = 0, . . . , k. In order to show that xk+1 belongs to
Bδ/2 (x∗ ), first note that

kxk+1 − x∗ k = kxk + dk − x∗ k
≤ kxk − x∗ k + kdk k
= kxk−1 + dk−1 − x∗ k + kdk k
≤ kxk−1 − x∗ k + kdk−1 k + kdk k
.. ..
. .
Xk
≤ kx0 − x∗ k + kdl k
l=0
k
X
≤ r + c3 dist(xl , X ∗ ),
l=0

where the last inequality follows from Lemma 2.3. Since Lemma 2.4 implies

$$\mathrm{dist}(x^l, X^*) \le c_5\,\mathrm{dist}(x^{l-1}, X^*)^2, \qquad l = 1, \ldots, k,$$

we have

$$\begin{aligned}
\mathrm{dist}(x^l, X^*) &\le c_5\,\mathrm{dist}(x^{l-1}, X^*)^2\\
&\le c_5\, c_5^2\,\mathrm{dist}(x^{l-2}, X^*)^{2^2}\\
&\;\;\vdots\\
&\le c_5\, c_5^2 \cdots c_5^{2^{l-1}}\,\mathrm{dist}(x^0, X^*)^{2^l}\\
&= c_5^{2^l - 1}\,\mathrm{dist}(x^0, X^*)^{2^l}\\
&\le c_5^{2^l - 1}\,\|x^0 - x^*\|^{2^l}\\
&\le c_5^{2^l - 1}\, r^{2^l},
\end{aligned}$$

for all l = 0, . . . , k. Using r ≤ 1/(2c5 ), we therefore get
$$\begin{aligned}
\|x^{k+1} - x^*\| &\le r + c_3 \sum_{l=0}^{k} c_5^{2^l - 1}\, r^{2^l}\\
&= r + c_3\, r \sum_{l=0}^{k} c_5^{2^l - 1}\, r^{2^l - 1}\\
&\le r + c_3\, r \sum_{l=0}^{k} \left(\frac{1}{2}\right)^{2^l - 1}\\
&\le r + c_3\, r \sum_{l=0}^{\infty} \left(\frac{1}{2}\right)^{l}\\
&= (1 + 2c_3)\, r\\
&\le \frac{\delta}{2},
\end{aligned}$$
where the last inequality follows from the definition (15) of r. This completes the induction. □

We now obtain the following quadratic convergence result for the distance function as an
immediate consequence of Lemmas 2.4 and 2.5.

Theorem 2.6 Let Assumption 2.2 be satisfied and {xk } be a sequence generated by Algo-
rithm 2.1 with starting point x0 ∈ Br (x∗ ), where r is defined by (15). Then the sequence
{dist(xk , X ∗ )} converges to zero quadratically, i.e., the iterates xk approach the solution set
X ∗ locally quadratically.
Theorem 2.6 is the main result in this subsection and shows that the constrained Levenberg-
Marquardt method of Algorithm 2.1 is locally quadratically convergent under fairly mild
assumptions.

2.3 Local Convergence of Iterates


The aim of this subsection is to investigate the local behaviour of the sequence {xk } generated
by Algorithm 2.1. To this end, we also assume throughout this subsection that the conditions
in Assumption 2.2 are satisfied. Moreover, the constants δ and ci , i = 1, · · · , 5 will be those
from the previous subsections, i.e., from Assumption 2.2 and Lemmas 2.3–2.5.
In view of Theorem 2.6, we know that the distance dist(xk , X ∗ ) from the iterates xk to
the solution set X ∗ converges to zero locally quadratically. However, this says little about
the behaviour of the sequence {xk } itself. In this subsection, we will see that this sequence
converges to a solution of (1), and that the rate of convergence is also locally quadratic.
We start by showing that the sequence is convergent.

Theorem 2.7 Let Assumption 2.2 be satisfied and {xk } be a sequence generated by Algo-
rithm 2.1 with starting point x0 ∈ Br (x∗ ), where r is defined by (15). Then the sequence
{xk } converges to a solution x̄ of (1) belonging to the ball Bδ/2 (x∗ ).
Proof. Since the entire sequence {xk } remains in the closed ball Bδ/2 (x∗ ) by Lemma 2.5,
every limit point of this sequence belongs to this set, too. Hence it remains to show that the
sequence {xk } converges. To this end, we first note that, for any positive integers k and m
such that k > m, we have

$$\begin{aligned}
\|x^k - x^m\| &= \|x^{k-1} + d^{k-1} - x^m\|\\
&\le \|x^{k-1} - x^m\| + \|d^{k-1}\|\\
&= \|x^{k-2} + d^{k-2} - x^m\| + \|d^{k-1}\|\\
&\le \|x^{k-2} - x^m\| + \|d^{k-2}\| + \|d^{k-1}\|\\
&\;\;\vdots\\
&\le \sum_{l=m}^{k-1} \|d^l\|\\
&\le \sum_{l=m}^{\infty} \|d^l\|.
\end{aligned}$$

Now, as in the proof of Lemma 2.5, we have
$$\|d^l\| \le c_3\,\mathrm{dist}(x^l, X^*) \le c_3\, c_5^{2^l - 1}\, r^{2^l} \le c_3 \left(\tfrac{1}{2}\right)^{2^l - 1} r \le c_3 \left(\tfrac{1}{2}\right)^{l} r,$$
where the first inequality follows from Lemma 2.3 and the third inequality follows from r ≤ 1/(2c5 ). Consequently, we get
$$\|x^k - x^m\| \le c_3 \sum_{l=m}^{\infty} \left(\tfrac{1}{2}\right)^{l} r \to 0 \quad \text{as } m \to \infty.$$
This means {xk } is a Cauchy sequence and hence convergent. □

In order to prove that the sequence {xk } converges locally quadratically, we need some
further preparatory results.

Lemma 2.8 Let x0 ∈ Br (x∗ ) and {xk } be a sequence generated by Algorithm 2.1. Then
there is a constant c6 > 0 such that

dist(xk , X ∗ ) ≤ c6 kdk k

for all k ∈ N sufficiently large.


Proof. In view of Theorem 2.6, we have
$$\mathrm{dist}(x^{k+1}, X^*) \le \tfrac{1}{2}\,\mathrm{dist}(x^k, X^*)$$

for all k ∈ N sufficiently large. Letting x̄k+1 ∈ X ∗ denote the closest solution to xk+1 , we then obtain
$$\begin{aligned}
\|d^k\| &= \|x^k - x^{k+1}\|\\
&\ge \|x^k - \bar x^{k+1}\| - \|\bar x^{k+1} - x^{k+1}\|\\
&\ge \mathrm{dist}(x^k, X^*) - \mathrm{dist}(x^{k+1}, X^*)\\
&\ge \mathrm{dist}(x^k, X^*) - \tfrac{1}{2}\,\mathrm{dist}(x^k, X^*)\\
&= \tfrac{1}{2}\,\mathrm{dist}(x^k, X^*)
\end{aligned}$$
for all k ∈ N large enough. □

The next result shows that the length of the search direction dk goes down to zero locally
quadratically.

Lemma 2.9 Let x0 ∈ Br (x∗ ) and {xk } be a sequence generated by Algorithm 2.1. Then
there is a constant c7 > 0 such that

kdk+1 k ≤ c7 kdk k2

for all k ∈ N sufficiently large.


Proof. In view of Lemmas 2.3, 2.4, and 2.8, we have

kdk+1 k ≤ c3 dist(xk+1 , X ∗ )
≤ c3 c5 dist(xk , X ∗ )2
≤ c3 c5 c26 kdk k2

for all k ∈ N sufficiently large. Setting c7 := c3 c5 c6² gives the desired result. □

We next show that the length of the search direction dk is eventually in the same order as
the distance from the current iterate xk to the limit point x̄ of the sequence {xk }.

Lemma 2.10 Let x0 ∈ Br (x∗ ) and {xk } be a sequence generated by Algorithm 2.1 and
converging to x̄. Then there exist constants c8 > 0 and c9 > 0 such that

c8 kxk − x̄k ≤ kdk k ≤ c9 kxk − x̄k

for all k ∈ N sufficiently large.


Proof. The right inequality holds with c9 := c3 since Lemma 2.3 implies

kdk k ≤ c3 dist(xk , X ∗ ) ≤ c3 kxk − x̄k

for all k ∈ N. In order to verify the left inequality, let k ∈ N be sufficiently large so that
Lemma 2.9 applies and
$$c_7\,\|d^k\| \le 1$$
holds. Without loss of generality, we may also assume that
$$\|d^{k+1}\| \le \tfrac{1}{2}\,\|d^k\|$$
holds. We can then apply Lemma 2.9 successively to obtain
$$\begin{aligned}
\|d^{k+2}\| &\le c_7\,\|d^{k+1}\|^2 \le \left(\tfrac{1}{2}\right)^{2} c_7\,\|d^k\|^2 \le \left(\tfrac{1}{2}\right)^{2} \|d^k\|,\\
\|d^{k+3}\| &\le c_7\,\|d^{k+2}\|^2 \le \left(\tfrac{1}{2}\right)^{4} c_7\,\|d^k\|^2 \le \left(\tfrac{1}{2}\right)^{3} \|d^k\|,\\
\|d^{k+4}\| &\le c_7\,\|d^{k+3}\|^2 \le \left(\tfrac{1}{2}\right)^{6} c_7\,\|d^k\|^2 \le \left(\tfrac{1}{2}\right)^{4} \|d^k\|,\\
&\;\;\vdots
\end{aligned}$$
i.e.,
$$\|d^{k+j}\| \le \left(\tfrac{1}{2}\right)^{j} \|d^k\| \quad \text{for all } j = 0, 1, 2, \ldots$$
Since
$$x^{k+l} = x^k + \sum_{j=0}^{l-1} d^{k+j}$$
and
$$\bar x = \lim_{l\to\infty} x^{k+l},$$
we therefore get
$$\begin{aligned}
\|x^k - \bar x\| &= \Bigl\|x^k - \lim_{l\to\infty} x^{k+l}\Bigr\|\\
&= \lim_{l\to\infty} \Bigl\|\sum_{j=0}^{l-1} d^{k+j}\Bigr\|\\
&\le \lim_{l\to\infty} \sum_{j=0}^{l-1} \bigl\|d^{k+j}\bigr\|\\
&= \sum_{j=0}^{\infty} \bigl\|d^{k+j}\bigr\|\\
&\le \sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^{j} \|d^k\|\\
&= 2\,\|d^k\|.
\end{aligned}$$
Setting c8 := 1/2 gives the desired result. □

As a consequence of the previous lemmas, we now obtain our main local convergence result
of this subsection.

Theorem 2.11 Let Assumption 2.2 be satisfied and {xk } be a sequence generated by Al-
gorithm 2.1 with starting point x0 ∈ Br (x∗ ) and limit point x̄. Then the sequence {xk }
converges locally quadratically to x̄.
Proof. Using Lemmas 2.9 and 2.10, we immediately obtain

c8 kxk+1 − x̄k ≤ kdk+1 k ≤ c7 kdk k2 ≤ c7 c29 kxk − x̄k2

for all k ∈ N sufficiently large. This shows that {xk } converges locally quadratically to the
limit point x̄. □

2.4 Globalized Method


So far, we have presented only a local version of the constrained Levenberg-Marquardt
method. Although this is the main emphasis of this paper, we also present, for the sake
of completeness, a globalized version of Algorithm 2.1. The globalization given here is very
simple and might not be the best choice from the computational point of view. Nevertheless,
we can show that it preserves the nice local properties of Algorithm 2.1. Throughout this
subsection, we assume that the mapping F is continuously differentiable.
The globalized Levenberg-Marquardt method is based on a simple descent condition for
the function kF (x)k: If a full Levenberg-Marquardt step gives a sufficient decrease of this
merit function, we accept this point as the new iterate. Otherwise we switch to a projected
gradient step, see, e.g., Bertsekas [2] for more details on projected gradients. Formally, the
globalized method looks as follows. (Recall that we define f (x) := kF (x)k2 .)

Algorithm 2.12 (Constrained Levenberg-Marquardt Method: Globalized Version)

(S.0) Choose x0 ∈ X, µ > 0, β, σ, γ ∈ (0, 1), and set k := 0.

(S.1) If F (xk ) = 0, STOP.

(S.2) Choose Hk ∈ Rm×n , set µk := µkF (xk )k2 , and compute dk as the solution of (4).

(S.3) If
kF (xk + dk )k ≤ γkF (xk )k, (16)
then set xk+1 := xk + dk , k ← k + 1, and go to (S.1); otherwise go to (S.4).

(S.4) Compute a stepsize $t_k = \max\{\beta^\ell \mid \ell = 0, 1, 2, \ldots\}$ such that
$$f(x^k(t_k)) \le f(x^k) + \sigma\,\nabla f(x^k)^T \bigl(x^k(t_k) - x^k\bigr),$$
where $x^k(t) := P_X[x^k - t\,\nabla f(x^k)]$. Set xk+1 := xk (tk ), k ← k + 1, and go to (S.1).
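For box constraints, step (S.4) admits a direct realization. The following Python sketch is our illustration (not the authors' code); it uses ∇f (x) = 2F 0 (x)T F (x) and a safeguard t_min borrowed from the termination rule in Section 4:

import numpy as np

def projected_gradient_step(F, J, x, lower, upper,
                            beta=0.9, sigma=1e-4, t_min=1e-12):
    """Step (S.4): Armijo search along the projected gradient path
    x^k(t) = P_X[x^k - t * grad f(x^k)] for f(x) = ||F(x)||^2."""
    Fx = F(x)
    grad = 2.0 * J(x).T @ Fx            # grad f(x) = 2 F'(x)^T F(x)
    f_x = Fx @ Fx
    t = 1.0                             # t_k = beta^l for l = 0, 1, 2, ...
    while t > t_min:
        x_t = np.clip(x - t * grad, lower, upper)   # x^k(t) = P_X[...]
        F_t = F(x_t)
        if F_t @ F_t <= f_x + sigma * grad @ (x_t - x):
            return x_t                  # sufficient decrease: accept x^k(t_k)
        t *= beta
    raise RuntimeError("stepsize fell below t_min (cf. Section 4)")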

The convergence properties of Algorithm 2.12 are summarized in the following theorem.

Theorem 2.13 Let {xk } be a sequence generated by Algorithm 2.12. Then any accumulation
point of this sequence is a stationary point of (2). Moreover, if an accumulation point x∗
of the sequence {xk } is a solution of (1) and Assumption 2.2 is satisfied at this point, then
the entire sequence {xk } converges to x∗ , the rate of convergence is locally quadratic, and the
sequence {dist(xk , X ∗ )} also converges locally quadratically.
Based on our previous results, the proof can be carried out in exactly the same way as that
of Theorem 3.1 in [24]. We therefore skip the details here.

3 Projected Levenberg-Marquardt Method


This section deals with another Levenberg-Marquardt method for the solution of constrained
nonlinear systems. The main difference from Algorithm 2.1 lies in the fact that the search
direction can be obtained by the solution of a single system of linear equations rather than
a constrained optimization problem. This method is shown to have the same convergence
properties as the Levenberg-Marquardt method of Algorithm 2.1.
The organization of this section is similar to the previous one. We first state the algorithm
and assumptions in Subsection 3.1. Then we investigate the local behaviour of the distance
function in Subsection 3.2. Subsection 3.3 deals with the local behaviour of the iterates.
Finally, Subsection 3.4 contains a simple globalization strategy for the modified Levenberg-
Marquardt method.

3.1 Algorithm and Assumptions


We consider again the constrained system of nonlinear equations (1). In the previous section,
we presented a constrained Levenberg-Marquardt method that generates a sequence {xk } by

xk+1 := xk + dk k = 0, 1, . . . ,

where dk is the solution of the constrained optimization problem

min θk (d) s.t. xk + d ∈ X

with θk being defined by (5).


In this section, we adopt a different approach that uses the formula

xk+1 := PX (xk + dkU ) k = 0, 1, . . . ,

where dkU is the unique solution of the unconstrained (hence the subscript ‘U’) subproblem

min θk (dU ), dU ∈ Rn .

We call this the projected Levenberg-Marquardt method since the unconstrained step gets projected onto the feasible region X. Note that, whenever the projection can be carried out efficiently (as in the box constrained case), this method requires significantly less work per iteration, since the strict convexity of the function θk ensures that dkU is a global minimum of this function if and only if ∇θk (dkU ) = 0, i.e., if and only if dkU is the unique solution of the system of linear equations
$$\bigl(H_k^T H_k + \mu_k I\bigr)\, d_U = -H_k^T F(x^k). \tag{17}$$

Specifically we consider the following algorithm.

Algorithm 3.1 (Projected Levenberg-Marquardt Method: Local Version)

(S.0) Choose x0 ∈ X, µ > 0, and set k := 0.

(S.1) If F (xk ) = 0, STOP.

(S.2) Choose Hk ∈ Rm×n , set µk := µkF (xk )k2 , and compute dkU as the solution of (17).

(S.3) Set xk+1 := PX (xk + dkU ), k ← k + 1, and go to (S.1).

Note that the algorithm is well-defined since the coefficient matrix in (17) is always symmetric
positive definite. Furthermore, all iterates xk belong to the feasible set X.
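Since the coefficient matrix in (17) is symmetric positive definite, dkU can be obtained from a single (Cholesky-based) linear solve. A minimal Python sketch of steps (S.2) and (S.3) for a box X is given below; it is our illustration, not the authors' MATLAB code, and assumes Hk = F 0 (xk ):

import numpy as np

def projected_lm_step(F, J, x, lower, upper, mu=1.0):
    """Steps (S.2)-(S.3) of Algorithm 3.1 for a box X = [lower, upper]:
    solve (17), (H^T H + mu_k I) d_U = -H^T F(x^k), then project."""
    Fx, H = F(x), J(x)
    mu_k = mu * (Fx @ Fx)                 # mu_k := mu * ||F(x^k)||^2
    n = H.shape[1]
    # The coefficient matrix is symmetric positive definite for mu_k > 0.
    d_U = np.linalg.solve(H.T @ H + mu_k * np.eye(n), -H.T @ Fx)
    return np.clip(x + d_U, lower, upper) # x^{k+1} := P_X(x^k + d_U^k)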
The following assumption is supposed to hold throughout this section.

Assumption 3.2 The solution set X ∗ of problem (1) is nonempty. For some solution x∗ ∈
X ∗ , there exist constants ε > 0, κ1 > 0, κ2 > 0 and L > 0 such that the following inequalities
hold:

κ1 dist(x, X ∗ ) ≤ kF (x)k ∀x ∈ Bε (x∗ ), (18)


kF (x) − F (xk ) − Hk (x − xk )k ≤ κ2 kx − xk k2 ∀x, xk ∈ Bε (x∗ ), (19)
kF (x) − F (y)k ≤ Lkx − yk ∀x, y ∈ Bε (x∗ ). (20)

We tacitly assume that the constant ε > 0 in Assumption 3.2 is taken sufficiently small so
that the mapping F is defined in the entire ball Bε (x∗ ). Note that this is always possible
since F is assumed to be defined on an open set O containing the feasible region X.
Apart from this, the only difference between Assumptions 2.2 and 3.2 lies in the fact
that we now assume that the three conditions (18)–(20) hold in the entire ball Bε (x∗ ),
whereas before it was only assumed that the corresponding conditions (6)–(8) hold in the
intersection Bδ (x∗ ) ∩ X. The reason for this slight modification is that we sometimes have
to apply conditions (18)–(20) to the vector xk + dkU that may lie outside X.
Without the restriction on X, condition (18) is more restrictive than the corresponding condition (6). Whenever there exists a point x such that F (x) = 0 and x ∉ X, (18) may fail even if F is affine and X is polyhedral. Nevertheless, condition (18) is still significantly weaker than the nonsingularity of the Jacobian of F . To see this, consider the example with F : R2 → R and X ⊆ R2 being defined by
$$F(x) = \sqrt{x_1^2 + x_2^2} - 1$$
and
$$X = \{x \mid -1 \le x_1 \le 1,\; -1 \le x_2 \le 0\},$$
respectively. Note that the solution set of F (x) = 0 without the constraint is the unit circle,
while the solution set of the constrained equation F (x) = 0, x ∈ X, is the lower half of the
unit circle. By substituting x := (r cos θ, r sin θ) with r ≥ 0, we have |F (x)| = |r − 1|. It is
easy to see that dist(x, X ∗ ) = |r − 1| when x is an interior point of X. Therefore (18) holds
on the interior of X. However, when x∗ = (−1, 0)T , which is a boundary point of X, (18)
fails since F (x) = 0 but dist(x, X ∗ ) > 0 for any x such that r = 1 and 0 < θ < π. On the
other hand, when x∗ = (0, −1)T , which is also a boundary point of X, (18) is satisfied for
sufficiently small ε > 0.

3.2 Local Convergence of Distance Function


This subsection deals with the behaviour of the sequence {dist(xk , X ∗ )}. The analysis is
similar to that of Subsection 2.2, and many of our results can be found in the related paper
[24] that deals with the convergence properties of a Levenberg-Marquardt method for the
solution of unconstrained systems of equations. We therefore skip some of the proofs here.

Lemma 3.3 There exist constants κ3 > 0 and κ4 > 0 such that the following inequalities
hold for each xk ∈ Bε/2 (x∗ ):
(a) kdkU k ≤ κ3 dist(xk , X ∗ ).

(b) kF (xk ) + Hk dkU k ≤ κ4 dist(xk , X ∗ )2 .


Proof. The proof is similar to that of Lemma 2.3 and may also be found in [24]. □

We next state the counterpart of Lemma 2.4. Note, however, that the vector xk−1 + dUk−1 is no longer equal to the next iterate xk in the method considered here. Hence the assumption in the following result is somewhat different from the assumption in the corresponding result in Lemma 2.4.

Lemma 3.4 Assume that both xk−1 and xk−1 + dUk−1 belong to the ball Bε/2 (x∗ ) for each
k ∈ N. Then there is a constant κ5 > 0 such that

dist(xk , X ∗ ) ≤ κ5 dist(xk−1 , X ∗ )2

for each k ∈ N.
Proof. The definition of xk and the nonexpansiveness of the projection operator imply that
$$\begin{aligned}
\kappa_1\,\mathrm{dist}(x^k, X^*) &= \kappa_1\,\mathrm{dist}\bigl(P_X(x^{k-1} + d_U^{k-1}),\, X^*\bigr)\\
&= \kappa_1 \inf_{\bar x \in X^*} \bigl\|P_X(x^{k-1} + d_U^{k-1}) - \bar x\bigr\|\\
&= \kappa_1 \inf_{\bar x \in X^*} \bigl\|P_X(x^{k-1} + d_U^{k-1}) - P_X(\bar x)\bigr\|\\
&\le \kappa_1 \inf_{\bar x \in X^*} \bigl\|x^{k-1} + d_U^{k-1} - \bar x\bigr\| \qquad (21)\\
&= \kappa_1\,\mathrm{dist}\bigl(x^{k-1} + d_U^{k-1},\, X^*\bigr)\\
&\le \bigl\|F(x^{k-1} + d_U^{k-1})\bigr\|,
\end{aligned}$$

where the last inequality follows from (18) together with our assumption that xk−1 + dUk−1 ∈ Bε/2 (x∗ ). Now, using (19) as well as xk−1 , xk−1 + dUk−1 ∈ Bε/2 (x∗ ), we have
$$\bigl\|F(x^{k-1} + d_U^{k-1}) - F(x^{k-1}) - H_{k-1}\,d_U^{k-1}\bigr\| \le \kappa_2\,\|d_U^{k-1}\|^2. \tag{22}$$

Using (21), (22) and Lemma 3.3, we obtain
$$\begin{aligned}
\kappa_1\,\mathrm{dist}(x^k, X^*) &\le \bigl\|F(x^{k-1}) + H_{k-1}\,d_U^{k-1}\bigr\| + \kappa_2\,\|d_U^{k-1}\|^2\\
&\le \kappa_4\,\mathrm{dist}(x^{k-1}, X^*)^2 + \kappa_2 \kappa_3^2\,\mathrm{dist}(x^{k-1}, X^*)^2\\
&= (\kappa_4 + \kappa_2 \kappa_3^2)\,\mathrm{dist}(x^{k-1}, X^*)^2.
\end{aligned}$$
This completes the proof by setting κ5 := (κ4 + κ2 κ3²)/κ1 . □

The next result is the counterpart of Lemma 2.5 and states that the assumptions in Lemma
3.4 are satisfied if the starting point x0 is chosen sufficiently close to the solution set. Let
$$r := \min\Bigl\{\frac{\varepsilon}{2(1 + 2\kappa_3)},\; \frac{1}{2\kappa_5}\Bigr\}. \tag{23}$$

Lemma 3.5 Assume that the starting point x0 ∈ X used in Algorithm 3.1 belongs to the
ball Br (x∗ ), where x∗ denotes a solution of (1) satisfying Assumption 3.2 and r is defined
by (23). Then
xk−1 , xk−1 + dUk−1 ∈ Bε/2 (x∗ )
holds for all k ∈ N.
Proof. The proof is by induction on k. We start with k = 1. By assumption, we have
x0 ∈ Br (x∗ ). Since r ≤ ε/2, this implies x0 ∈ Bε/2 (x∗ ). Furthermore, we obtain from Lemma
3.3

$$\begin{aligned}
\|x^0 + d_U^0 - x^*\| &\le \|x^0 - x^*\| + \|d_U^0\|\\
&\le r + \|d_U^0\|\\
&\le r + \kappa_3\,\mathrm{dist}(x^0, X^*)\\
&\le r + \kappa_3\,\|x^0 - x^*\|\\
&\le (1 + \kappa_3)\,r.
\end{aligned}$$
Since (1 + κ3 )r ≤ ε/2, it follows that x0 + d0U ∈ Bε/2 (x∗ ).


Now let k ≥ 1 be arbitrarily given and assume that xl−1 , xl−1 + dUl−1 ∈ Bε/2 (x∗ ) for all l = 1, . . . , k. We have to show that xk and xk + dkU belong to Bε/2 (x∗ ). Since xk−1 + dUk−1 ∈ Bε/2 (x∗ ), we immediately obtain xk = PX (xk−1 + dUk−1 ) ∈ Bε/2 (x∗ ) from the inequality
$$\|x^k - x^*\| = \bigl\|P_X(x^{k-1} + d_U^{k-1}) - P_X(x^*)\bigr\| \le \bigl\|x^{k-1} + d_U^{k-1} - x^*\bigr\|.$$

To see that xk + dkU ∈ Bε/2 (x∗ ), first note that
$$\begin{aligned}
\|x^k + d_U^k - x^*\| &\le \|x^k - x^*\| + \|d_U^k\|\\
&\le \|x^{k-1} + d_U^{k-1} - x^*\| + \|d_U^k\|\\
&\le \|x^{k-1} - x^*\| + \|d_U^{k-1}\| + \|d_U^k\|\\
&\;\;\vdots\\
&\le \|x^0 - x^*\| + \sum_{l=0}^{k} \|d_U^l\|\\
&\le r + \kappa_3 \sum_{l=0}^{k} \mathrm{dist}(x^l, X^*),
\end{aligned}$$
where the last inequality follows from Lemma 3.3. Using Lemma 3.4, the induction can then be completed by following the arguments in the proof of Lemma 2.5. □

We are now able to state our main local convergence result of this subsection. It is an
immediate consequence of Lemmas 3.4 and 3.5.

Theorem 3.6 Let Assumption 3.2 be satisfied and {xk } be a sequence generated by Algo-
rithm 3.1 with starting point x0 ∈ Br (x∗ ), where r is defined by (23). Then the sequence
{dist(xk , X ∗ )} converges to zero locally quadratically.

3.3 Local Convergence of Iterates


This subsection deals with the local behaviour of the sequence {xk } itself. In order to
investigate its behaviour, we suppose that Assumption 3.2 holds throughout this subsection.
Our first result states that the sequence {xk } generated by Algorithm 3.1 is convergent.

Theorem 3.7 Let Assumption 3.2 be satisfied and {xk } be a sequence generated by Algo-
rithm 3.1 with starting point x0 ∈ Br (x∗ ), where r is defined by (23). Then the sequence
{xk } converges to a solution x̄ of (1) belonging to the ball Bε/2 (x∗ ).

Proof. Similar to the proof of Theorem 2.7, we verify that {xk } is a Cauchy sequence.
Indeed, for any integers k and m such that k > m, we have
$$\begin{aligned}
\|x^k - x^m\| &= \|P_X(x^{k-1} + d_U^{k-1}) - P_X(x^m)\|\\
&\le \|x^{k-1} + d_U^{k-1} - x^m\|\\
&\le \|x^{k-1} - x^m\| + \|d_U^{k-1}\|\\
&= \|P_X(x^{k-2} + d_U^{k-2}) - P_X(x^m)\| + \|d_U^{k-1}\|\\
&\le \|x^{k-2} + d_U^{k-2} - x^m\| + \|d_U^{k-1}\|\\
&\le \|x^{k-2} - x^m\| + \|d_U^{k-2}\| + \|d_U^{k-1}\|\\
&\;\;\vdots\\
&\le \sum_{l=m}^{k-1} \|d_U^l\|\\
&\le \sum_{l=m}^{\infty} \|d_U^l\|.
\end{aligned}$$
The rest of the proof is identical to that of Theorem 2.7. □

We next want to show that the sequence {xk } is locally quadratically convergent. To this
end, we begin with the following preliminary result.
Lemma 3.8 Let x0 ∈ Br (x∗ ) and {xk } be a sequence generated by Algorithm 3.1. Then
there is a constant κ6 > 0 such that
dist(xk , X ∗ ) ≤ κ6 kdkU k
for all k ∈ N sufficiently large.
Proof. The proof is a modification of that of Lemma 2.8. First note that Theorem 3.6
implies that
$$\mathrm{dist}(x^{k+1}, X^*) \le \tfrac{1}{2}\,\mathrm{dist}(x^k, X^*)$$
for all k ∈ N sufficiently large. Let x̄k+1 be the closest solution to xk+1 , i.e., dist(xk+1 , X ∗ ) = kxk+1 − x̄k+1 k. Then we obtain from the nonexpansiveness of the projection operator
$$\begin{aligned}
\|d_U^k\| &= \|x^k + d_U^k - x^k\|\\
&\ge \|P_X(x^k + d_U^k) - P_X(x^k)\|\\
&= \|x^{k+1} - x^k\|\\
&\ge \|\bar x^{k+1} - x^k\| - \|x^{k+1} - \bar x^{k+1}\|\\
&\ge \mathrm{dist}(x^k, X^*) - \mathrm{dist}(x^{k+1}, X^*)\\
&\ge \mathrm{dist}(x^k, X^*) - \tfrac{1}{2}\,\mathrm{dist}(x^k, X^*)\\
&= \tfrac{1}{2}\,\mathrm{dist}(x^k, X^*)
\end{aligned}$$
for all k ∈ N large enough. □

The next result shows that the length of the unconstrained search direction dkU goes down
to zero locally quadratically.

Lemma 3.9 Let x0 ∈ Br (x∗ ) and {xk } be a sequence generated by Algorithm 3.1. Then
there is a constant κ7 > 0 such that

$$\|d_U^{k+1}\| \le \kappa_7\,\|d_U^k\|^2$$

for all k ∈ N sufficiently large.


Proof. Lemmas 3.3, 3.4, and 3.8 immediately imply
$$\|d_U^{k+1}\| \le \kappa_3\,\mathrm{dist}(x^{k+1}, X^*) \le \kappa_3\kappa_5\,\mathrm{dist}(x^k, X^*)^2 \le \kappa_3\kappa_5\kappa_6^2\,\|d_U^k\|^2$$
for all k ∈ N sufficiently large. The desired result then follows by setting κ7 := κ3 κ5 κ6². □

We next state the counterpart of Lemma 2.10 that relates the length of dkU with the distance
from the iterates xk to their limit point x̄.

Lemma 3.10 Let x0 ∈ Br (x∗ ) and {xk } be a sequence generated by Algorithm 3.1 and
converging to x̄. Then there exist constants κ8 > 0 and κ9 > 0 such that

κ8 kxk − x̄k ≤ kdkU k ≤ κ9 kxk − x̄k

for all k ∈ N sufficiently large.


Proof. Lemma 3.3 (a) yields the right inequality with κ9 = κ3 . We will show the left
inequality. Following the proof of Lemma 2.10 and exploiting Lemma 3.9 (instead of Lemma
2.9), we can show that the following inequality holds for some sufficiently large (but fixed)
index k ∈ N:
$$\|d_U^{k+j}\| \le \left(\tfrac{1}{2}\right)^{j} \|d_U^k\| \quad \text{for all } j = 0, 1, 2, \ldots$$
Furthermore, the nonexpansiveness of the projection operator yields

$$\begin{aligned}
\|x^k - x^{k+l}\| &= \|P_X(x^k) - P_X(x^{k+l-1} + d_U^{k+l-1})\|\\
&\le \|x^k - x^{k+l-1} - d_U^{k+l-1}\|\\
&\le \|x^k - x^{k+l-1}\| + \|d_U^{k+l-1}\|\\
&\;\;\vdots\\
&\le \sum_{j=0}^{l-1} \|d_U^{k+j}\|.
\end{aligned}$$
Since
$$\bar x = \lim_{l\to\infty} x^{k+l},$$
we therefore obtain from the continuity of the norm
$$\begin{aligned}
\|x^k - \bar x\| &= \lim_{l\to\infty} \|x^k - x^{k+l}\|\\
&\le \lim_{l\to\infty} \sum_{j=0}^{l-1} \|d_U^{k+j}\|\\
&\le \|d_U^k\| \lim_{l\to\infty} \sum_{j=0}^{l-1} \left(\tfrac{1}{2}\right)^{j}\\
&= \|d_U^k\| \sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)^{j}\\
&= 2\,\|d_U^k\|.
\end{aligned}$$
Since this holds for an arbitrary (sufficiently large) k ∈ N, we obtain the desired result by setting κ8 := 1/2. □

Using Lemmas 3.9 and 3.10, we get the following local convergence result for the iterates xk
in exactly the same way as in the proof of the corresponding Theorem 2.11.

Theorem 3.11 Let Assumption 3.2 be satisfied and {xk } be a sequence generated by Al-
gorithm 3.1 with starting point x0 ∈ Br (x∗ ) and limit point x̄. Then the sequence {xk }
converges locally quadratically to x̄.
Hence it turns out that the projected Levenberg-Marquardt method of Algorithm 3.1 has
essentially the same local convergence properties as the constrained Levenberg-Marquardt
method of Algorithm 2.1.

3.4 Globalized Method


Although we are mainly interested in the local behaviour of the projected Levenberg-Mar-
quardt method, we can globalize this method in a simple way by introducing a projected
gradient step whenever the full projected Levenberg-Marquardt step does not provide a
sufficient decrease for kF (x)k. The globalization strategy is therefore very similar to the one
discussed in Subsection 2.4. Assuming that F is continuously differentiable, we may formally
state the algorithm as follows.

Algorithm 3.12 (Projected Levenberg-Marquardt Method: Globalized Version)


(S.0) Choose x0 ∈ X, µ > 0, β, σ, γ ∈ (0, 1), and set k := 0.

(S.1) If F (xk ) = 0, STOP.

(S.2) Choose Hk ∈ Rm×n , set µk := µkF (xk )k2 , and compute dkU as the solution of (17).

(S.3) If
kF (PX (xk + dkU ))k ≤ γkF (xk )k, (24)
then set xk+1 := PX (xk + dkU ), k ← k + 1, and go to (S.1); otherwise go to (S.4).

(S.4) Compute a stepsize $t_k = \max\{\beta^\ell \mid \ell = 0, 1, 2, \ldots\}$ such that
$$f(x^k(t_k)) \le f(x^k) + \sigma\,\nabla f(x^k)^T \bigl(x^k(t_k) - x^k\bigr),$$
where $x^k(t) := P_X[x^k - t\,\nabla f(x^k)]$. Set xk+1 := xk (tk ), k ← k + 1, and go to (S.1).

Algorithm 3.12 has the advantage of having simpler subproblems than Algorithm 2.12. How-
ever, this advantage is realized only if the projections onto the feasible set X can be computed
in a convenient manner, which is particularly the case when X is described by some box
constraints.
Based on our previous results, it is not difficult to see that the counterpart of Theorem
2.13 also holds for Algorithm 3.12. We skip the details here.

4 Numerical Results
We have implemented Algorithm 3.12 in MATLAB and tested it on a number of examples
from different areas. The implementation differs slightly from the description of Algorithm
3.12. Specifically, Algorithm 3.12 considers two types of steps only, namely Levenberg-
Marquardt and projected gradient steps, whereas our implementation uses the following
three types of steps:

• LM-step (Levenberg-Marquardt step): This is used when the descent condition (24) is
satisfied, i.e., (S.3) is carried out.

• LS-step (Line Search step): This step occurs if condition (24) is not satisfied but the
search direction sk := PX (xk + dkU ) − xk is a descent direction for f in the sense that
∇f (xk )T sk ≤ −ρksk kp for some constants ρ > 0 and p > 1. We then use an Armijo-
type line search to reduce f along the direction sk .

• PG-step (Projected Gradient step): If neither an LM-step nor an LS-step can be used,
we apply a projected gradient step as described in (S.4) of Algorithm 3.12.

It is easy to see that this modification does not change the local and global convergence
properties of Algorithm 3.12.
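Schematically, the resulting step selection may be summarized as follows. This is our Python reading of the implementation (the actual code is in MATLAB); it reuses the routines projected_lm_step and projected_gradient_step from the sketches above, and the parameter values are those reported below:

import numpy as np

def select_step(F, J, x, lower, upper,
                gamma=0.99995, rho=1e-8, p=2.1, beta=0.9, sigma=1e-4):
    """Choose between LM-, LS- and PG-step as described above."""
    x_lm = projected_lm_step(F, J, x, lower, upper)   # P_X(x^k + d_U^k)
    Fx = F(x)
    if np.linalg.norm(F(x_lm)) <= gamma * np.linalg.norm(Fx):
        return x_lm                                   # LM-step: condition (24) holds
    s = x_lm - x                                      # s^k := P_X(x^k + d_U^k) - x^k
    grad = 2.0 * J(x).T @ Fx                          # gradient of f = ||F||^2
    if grad @ s <= -rho * np.linalg.norm(s) ** p:     # descent test: LS-step
        t, f_x = 1.0, Fx @ Fx
        while t > 1e-12:
            F_t = F(x + t * s)                        # x + t*s stays in X (convexity)
            if F_t @ F_t <= f_x + sigma * t * (grad @ s):
                return x + t * s                      # Armijo-type LS-step accepted
            t *= beta
    return projected_gradient_step(F, J, x, lower, upper)  # PG-step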
The parameters used for our test runs are
$$\beta = 0.9, \quad \sigma = 10^{-4}, \quad \gamma = 0.99995, \quad \rho = 10^{-8}, \quad p = 2.1,$$
and we terminate the iteration if
$$\|F(x^k)\| \le \varepsilon \quad\text{or}\quad k \ge k_{\max} \quad\text{or}\quad t_k \le t_{\min}$$
with
$$\varepsilon = 10^{-5}, \qquad k_{\max} = 100 \qquad\text{and}\qquad t_{\min} = 10^{-12}.$$
The computational results obtained with these parameters are shown in Tables 1–6.
Tables 1 and 2 give the results for some square systems of equations. All these systems
have some bound constraints. For example, many of the test examples come from chemical
equilibrium problems where the components of the vector x correspond to chemical concen-
trations, so that these problems have some nonnegativity constraints. Other examples are
obtained from complementarity problems

G(x) − y = 0, x ≥ 0, y ≥ 0, xi yi = 0 ∀i.
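Such a problem fits the form (1) by stacking the equations and moving the sign conditions into the feasible set: with z = (x, y), one sets F (z) := (G(x) − y, x1 y1 , . . . , xn yn ) and X := {z | z ≥ 0}. A small Python sketch (ours) of this standard reformulation:

import numpy as np

def ncp_as_constrained_equations(G):
    """Recast G(x) - y = 0, x >= 0, y >= 0, x_i y_i = 0 (for all i)
    as F(z) = 0, z in X := {z : z >= 0}, with z = (x, y)."""
    def F(z):
        n = z.size // 2
        x, y = z[:n], z[n:]
        return np.concatenate([G(x) - y, x * y])
    return F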

Also some convex optimization problems are solved by applying the algorithm to the corre-
sponding KKT conditions.
The starting point taken for all test examples is the vector of lower bounds except for
those examples which arise from complementarity or optimization problems. For the latter
problems we used the standard starting point from the literature (filled with zero Lagrange
multipliers).
The columns in Table 1 contain the name of the test problem (together with a hint to the
literature that, however, is usually not the original reference for that particular example),
the dimension n(= m) of this example, the number of iterations, the number of LM-, LS-
and PG-steps, the number of function evaluations as well as the final value of the merit
function f . Table 2 has a similar structure except that the first column gives the value of a
parameter for the particular problem (we use all three different parameters given in [8]).
Table 3 states the results obtained for some underdetermined systems taken from [3].
The columns have a similar meaning to those of Table 1 except that we added one more
column that gives the dimension m of the corresponding (nonsquare) system.
Finally, Tables 4–6 contain numerical results for some parameter-dependent problems
where the starting point of a problem is equal to the solution of the previous problem, i.e.,
we apply Algorithm 3.12 in the framework of a path-following method. Note, however, that
the dependence of these problems on the corresponding parameters might be nonsmooth,
e.g., the number of (known) solutions in the example given in Table 4 varies significantly
with the values of parameters.
To summarize the results shown in the tables, we were able to solve most of the test
problems without any difficulties. Only in a few cases, we were not able to find an approx-
imate solution (the same is true for the method of [1], which has also been tested on many
of the examples used here). This is typically due to the fact that the step size gets too small
(except for the circuit design problem in Table 1, for which we observed convergence to a
non-optimal stationary point). For some examples, we also needed a relatively large number
of function evaluations (at least compared to the number of iterations), but this is mainly
due to the fact that the stepsize reduction factor β was chosen equal to 0.9 (both for LS-
and PG-steps).

Test problem, source n iter LM/LS/PG F-eval. f (x)
Himmelblau function, [8, 14.1.1] 2 8 8/0/0 9 1.1e-11
Equilibrium combustion, [8, 14.1.2] 5 10 6/4/0 11 5.2e-11
Bullard-Biegler system, [8, 14.1.3] 2 11 9/2/0 40 9.5e-15
Ferraris-Tronconi system, [8, 14.1.4] 2 3 3/0/0 4 8.9e-15
Brown’s almost lin. syst., [8, 14.1.5] 5 10 10/0/0 11 9.1e-16
Robot kinematics system, [8, 14.1.6] 8 5 5/0/0 6 2.1e-19
Circuit design problem, [8, 14.1.7] 9 – –/–/– – –
Chem. equil. system, [17, system 1] 11 15 13/1/1 64 6.5e-11
Chem. equil. system, [17, system 2] 5 – –/–/– – –
Combust. system (Lean case), [16] 10 7 5/2/0 99 2.0e-11
Combust. system (Rich case), [16] 10 – –/–/– – –
Kojima-Shindo problem, [6] 4 5 4/1/1 21 3.1e-13
Josephy problem, [6] 4 11 8/2/1 80 9.5e-21
Mathiesen problem, [6] 4 3 3/0/0 4 2.0e-16
Hock-Schittkowski 34, [10] 16 8 7/1/0 32 7.6e-18
Hock-Schittkowski 35, [10] 8 2 2/0/0 3 1.2e-13
Hock-Schittkowski 66, [10] 16 65 35/30/0 253 3.4e-11
Hock-Schittkowski 76, [10] 14 43 23/0/20 428 7.1e-11

Table 1: Numerical results for different test problems (square systems)

∆H n iter LM/LS/PG F-eval. f (x)


-50,000 1 3 3/0/0 4 2.8e-15
-35,958 1 3 3/0/0 4 2.9e-17
-35,510 1 3 3/0/0 4 2.3e-17

Table 2: Numerical results for test problem 14.1.9 from [8] (Smith steady state temperature)

5 Final Remarks
This paper described two Levenberg-Marquardt-type methods for the solution of a con-
strained system of equations. Both methods were shown to possess a local quadratic rate
of convergence under a suitable error bound condition. This property is motivated by the recent research on unconstrained equations in [24] and appears to be stronger than the corresponding properties of any other method for constrained equations known to the authors.
The globalization strategy used in this paper is quite standard and can certainly be
improved, although the numerical results indicate that the method works quite well with this
strategy. However, numerical experiments were carried out for the case of box constraints
only since otherwise the computation of the projections onto the feasible set becomes very
expensive and, in fact, dominates the overall cost of the algorithm. The question of how to
deal with a general convex set X in a numerically efficient way is still open.

Test problem, source n m iter LM/LS/PG F-eval. f (x)
Linear system, [3, Problem 2] 100 50 3 3/0/0 4 1.3e-11
Linear system, [3, Problem 2] 200 100 6 6/0/0 7 1.8e-14
Linear system, [3, Problem 2] 300 150 13 13/0/0 14 7.8e-29
Quadratic system, [3, Problem 4] 100 50 11 11/0/0 12 1.2e-11
Quadratic system, [3, Problem 4] 200 100 26 26/0/0 27 5.0e-12
Quadratic system, [3, Problem 4] 300 150 72 72/0/0 73 2.6e-15

Table 3: Numerical results for some underdetermined systems from [3]

R n iter LM/LS/PG F-eval. f (x)


0.995 2 8 8/0/0 9 1.6e-10
0.990 2 9 9/0/0 10 8.5e-11
0.985 2 9 9/0/0 10 1.7e-10
0.980 2 10 10/0/0 11 1.2e-19
0.975 2 11 11/0/0 12 1.1e-10
0.970 2 12 12/0/0 13 1.2e-10
0.965 2 13 13/0/0 14 1.8e-10
0.960 2 15 15/0/0 16 1.9e-10
0.955 2 18 18/0/0 19 2.0e-10
0.950 2 24 24/0/0 25 1.5e-10
0.945 2 – –/–/– – –
0.940 2 – –/–/– – –
0.935 2 – –/–/– – –

Table 4: Numerical results for test problem 14.1.8 from [8] (CSTR)

Acknowledgment. The authors would like to thank Stefania Bellavia for sending them
some of the test problems.

References
[1] S. Bellavia, M. Macconi and B. Morini, An affine scaling trust-region approach to
bound-constrained nonlinear systems, Technical Report, Dipartimento di Energetica,
University of Florence, Italy, 2001.

[2] D.P. Bertsekas, Nonlinear Programming, Athena Scientific, Belmont, MA, 1995.

[3] H. Dan, N. Yamashita and M. Fukushima, Convergence properties of the inexact
Levenberg-Marquardt method under local error bound conditions, Technical Report
2001-001, Department of Applied Mathematics and Physics, Kyoto University (January
2001).

c n iter LM/LS/PG F-eval. f (x)
c=0.5 100 4 4/0/0 5 4.1e-11
c=0.6 100 4 4/0/0 5 2.3e-11
c=0.7 100 5 5/0/0 6 1.3e-10
c=0.8 100 9 9/0/0 10 5.1e-11
c=0.9 100 95 3/92/0 383 1.6e-10
c=0.99 100 98 97/1/1 102 1.7e-10

Table 5: Numerical results for Chandrasekhar H-equation, see [14]

c n iter LM/LS/PG F-eval. f (x)


c=3.0 10 14 14/0/0 15 1.0e-10
c=3.1 10 11 7/2/2 177 1.6e-10
c=3.2 10 2 2/0/0 3 6.4e-13
c=3.3 10 2 2/0/0 3 3.0e-15
c=3.4 10 2 2/0/0 3 1.1e-15
c=3.5 10 2 2/0/0 3 2.9e-15
c=3.6 10 2 2/0/0 3 1.4e-15
c=3.7 10 2 2/0/0 3 2.2e-15
c=3.8 10 2 2/0/0 3 1.9e-15
c=3.9 10 2 2/0/0 3 2.3e-15
c=4.0 10 2 2/0/0 3 2.8e-15

Table 6: Numerical results for a chemical equilibrium problem (propane), see [5]

[4] J.E. Dennis and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and
Nonlinear Equations, Prentice-Hall, Englewood Cliffs, 1983.

[5] P. Deuflhard and A. Hohmann, Numerische Mathematik, Walter de Gruyter, 1991.

[6] S.P. Dirkse and M.C. Ferris, MCPLIB: A collection of nonlinear mixed complementarity
problems, Optimization Methods and Software, 5 (1995), pp. 319–345.

[7] J.Y. Fan and Y.X. Yuan, On the convergence of a new Levenberg-Marquardt method,
Technical Report, AMSS, Chinese Academy of Sciences, 2001.

[8] C.A. Floudas, P.M. Pardalos, C.S. Adjiman, W.R. Esposito, Z.H. Gumus, S.T. Harding,
J.L. Klepeis, C.A. Meyer and C.A. Schweiger, Handbook of Test Problems in Local
and Global Optimization, Nonconvex Optimization and Its Applications 33, Kluwer
Academic Publishers, 1999.

[9] S.A. Gabriel and J.-S. Pang, A trust region method for constrained nonsmooth equa-
tions, in: W.W. Hager, D.W. Hearn and P.M. Pardalos (eds.), Large Scale Optimization
– State of the Art, Kluwer Academic Press, 1994, pp. 155–181.

[10] W. Hock and K. Schittkowski, Test Examples for Nonlinear Programming Codes, Lec-
ture Notes in Economics and Mathematical Systems 187, Springer, 1981.

[11] A.J. Hoffman, On approximate solutions of systems of linear inequalities, Journal of the
National Bureau of Standards, 49 (1952), pp. 263–265.

[12] C. Kanzow, An active set-type Newton method for constrained nonlinear systems, in:
M.C. Ferris, O.L. Mangasarian and J.-S. Pang (eds.): Complementarity: Applications,
Algorithms and Extensions, Kluwer Academic Publishers, 2001, pp. 179–200.

[13] C. Kanzow, Strictly feasible equation-based methods for mixed complementarity prob-
lems, Numerische Mathematik, 89 (2001), pp. 135–160.

[14] C.T. Kelley, Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia,
PA, 1995.

[15] D.N. Kozakevich, J.M. Martinez and S.A. Santos, Solving nonlinear systems of equations
with simple bounds, Journal of Computational and Applied Mathematics, 16 (1997), pp.
215–235.

[16] K. Meintjes and A.P. Morgan, A methodology for solving chemical equilibrium systems,
Applied Mathematics and Computation, 22 (1987), pp. 333–361.

[17] K. Meintjes and A.P. Morgan, Chemical equilibrium systems as numerical test problems,
ACM Transactions on Mathematical Software, 16 (1990), pp. 143–151.

[18] R.D.C. Monteiro and J.-S. Pang, A potential reduction Newton method for constrained
equations, SIAM Journal on Optimization, 9 (1999), pp. 729–754.

[19] J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, NY, 1970.

[20] L. Qi, X.-J. Tong and D.-H. Li, An active-set projected trust region algorithm for box
constrained nonsmooth equations, Technical Report, Department of Applied Mathe-
matics, Hong Kong Polytechnic University, Hong Kong, October 2001.

[21] S.M. Robinson, Some continuity properties of polyhedral multifunctions, Mathematical
Programming Study, 14 (1981), pp. 206–214.

[22] M. Ulbrich, Nonmonotone trust-region method for bound-constrained semismooth equa-
tions with applications to nonlinear mixed complementarity problems, SIAM Journal
on Optimization, 11 (2001), pp. 889–917.

[23] T. Wang, R.D.C. Monteiro and J.-S. Pang, An interior point potential reduction method
for constrained equations, Mathematical Programming, 74 (1996), pp. 159–195.

[24] N. Yamashita and M. Fukushima, On the rate of convergence of the Levenberg-
Marquardt method, Computing, 15 [Suppl] (2001), pp. 239–249.
