ITERATIVE ALGORITHMS I
No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.
MATHEMATICS RESEARCH DEVELOPMENTS
ITERATIVE ALGORITHMS I
IOANNIS K. ARGYROS
AND
Á. ALBERTO MAGREÑÁN
New York
Copyright © 2017 by Nova Science Publishers, Inc.
All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in
any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or
otherwise without the written permission of the Publisher.
We have partnered with Copyright Clearance Center to make it easy for you to obtain permissions to reuse
content from this publication. Simply navigate to this publication’s page on Nova’s website and locate the
“Get Permission” button below the title description. This button is linked directly to the title’s permission
page on copyright.com. Alternatively, you can visit copyright.com and search by title, ISBN, or ISSN.
For further questions about using the service on copyright.com, please contact:
Copyright Clearance Center
Phone: +1-(978) 750-8400 Fax: +1-(978) 750-4470 E-mail: [email protected].
Independent verification should be sought for any data, advice or recommendations contained in this book. In
addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property
arising from any methods, products, instructions, ideas or otherwise contained in this publication.
This publication is designed to provide accurate and authoritative information with regard to the subject
matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering
legal or any other professional services. If legal or any other expert assistance is required, the services of a
competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED
BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF
PUBLISHERS.
Additional color graphics may be available in the e-book version of this book.
Contents
1 Secant-Type Methods
1.1. Introduction
1.2. Majorizing Sequences for the Secant-Type Method
1.3. Semilocal Convergence of the Secant-Type Method
1.4. Local Convergence of the Secant-Type Method
1.5. Numerical Examples
Index
Dedicated to
My mother Anastasia
Dedicated to
My parents Alberto and Mercedes
My grandmother Ascensión
My beloved Lara
Preface
It is a well-known fact that iterative methods have been studied ever since problems appeared whose solutions cannot be found in closed form. There exist methods with different behaviors when they are applied to different functions: methods with higher order of convergence, methods with large convergence domains, methods which do not require the evaluation of any derivative, etc., and researchers frequently develop new iterative methods.
Once these iterative methods appear, several researchers study them in different respects: convergence conditions, real dynamics, complex dynamics, optimal order of convergence, etc. These phenomena motivated the authors to study the most used and classical ones, for example Newton's method or its derivative-free alternative, the secant method.
Related to the convergence of iterative methods, the most well-known conditions are the Kantorovich ones; Kantorovich developed a theory which has allowed many researchers to continue and experiment with these conditions. In recent years many authors have studied modifications of these conditions related, for example, to centered conditions, ω-conditions or even convergence in Hilbert spaces.
In this monograph, we present the complete recent work of the past decade of the authors on Convergence and Dynamics of iterative methods. It is the natural outgrowth of their related publications in these areas. Chapters are self-contained and can be read independently. Moreover, an extensive list of references is given in each chapter, in order to allow the reader to follow up on the underlying ideas. For these reasons, we think that several advanced courses can be taught using this book.
The list of topics of our related studies follows.
Secant-type methods;
Efficient Steffensen-type algorithms for solving nonlinear equations;
On the semilocal convergence of Halley’s method under a center-Lipschitz condition on
the second Fréchet derivative;
An improved convergence analysis of Newton’s method for twice Fréchet differentiable
operators;
Expanding the applicability of Newton’s method using Smale’s α-theory;
Newton-type methods on Riemannian Manifolds under Kantorovich-type conditions;
Improved local convergence analysis of inexact Gauss-Newton like methods;
Expanding the Applicability of Lavrentiev Regularization Methods for Ill-posed Problems;
A semilocal convergence for a uniparametric family of efficient secant-like methods;
On the semilocal convergence of a two-step Newton-like projection method for ill-posed equations;
New Approach to Relaxed Proximal Point Algorithms Based on A−maximal;
Newton-type Iterative Methods for Nonlinear Ill-posed Hammerstein-type Equations;
Enlarging the convergence domain of secant-like methods for equations;
Solving nonlinear equations system via an efficient genetic algorithm with symmetric and
harmonious individuals;
On the Semilocal Convergence of Modified Newton-Tikhonov Regularization Method for
Nonlinear Ill-posed Problems;
Local convergence analysis of proximal Gauss-Newton method for penalized nonlinear
least squares problems;
On the convergence of a Damped Newton method with modified right-hand side vector;
Local convergence of inexact Newton-like method Under weak Lipschitz conditions;
Expanding the applicability of Secant method with applications;
Expanding the convergence domain for Chun-Stanica-Neta family of third order methods
in Banach spaces;
Local convergence of modified Halley-like methods with less computation of inversion;
Local convergence for an improved Jarratt-type method in Banach space.
The book's results are expected to find applications in many areas of applied mathematics, engineering, computer science and real-world problems. As such, this monograph is suitable for researchers, graduate students and seminars in the above subjects, and also for all science and engineering libraries.
The preparation of this book took place during 2015-2016 in Lawton, Oklahoma, USA
and Logroño, La Rioja, Spain.
Ioannis K. Argyros
Á. Alberto Magreñán
April 2016
Chapter 1
Secant-Type Methods
1.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solution x∗ of the nonlinear equation
F(x) = 0, (1.1.1)
where, F is a Fréchet-differentiable operator defined on a nonempty subset D of a Banach
space X with values in a Banach space Y . A lot of problems from Applied Sciences can
be expressed in a form like (1.1.1) using mathematical modelling [3]. The solutions of
these equations can be found in closed form only in special cases. That is why most solution methods for these equations are iterative. The convergence analysis of iterative
methods is usually divided into two categories: semilocal and local convergence analysis.
In the semilocal convergence analysis one derives convergence criteria from the information
around an initial point whereas in the local analysis one finds estimates of the radii of
convergence balls from the information around a solution. If X = Y and Q(x) = F(x) + x,
then the solution x∗ of equation (1.1.1) is very important in fixed point theory.
We study the convergence of the secant-type method xn+1 = xn − An⁻¹F(xn), An = δF(xn, yn), yn = θn xn + (1 − θn)xn−1 for each n = 0, 1, 2, · · ·, where x−1, x0 ∈ D are initial points and δF(·, ·) denotes a divided difference of order one for F.
Other choices of θn are also possible [1, 2, 6, 8, 9, 12, 14, 15, 21, 22]. There is a plethora
of sufficient convergence criteria for special cases of secant-type methods (1.1.3)-(1.1.5)
under Lipschitz-type conditions (1.1.2) (see [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22] and the references therein) or even graphical tools to study them [13].
Therefore, it is important to study the convergence of the secant-type method in a unified
way. It is interesting to notice that although we use very general majorizing sequences for
{xn} our technique leads in the semilocal case to: weaker sufficient convergence criteria; more precise estimates on the distances ‖xn − xn−1‖, ‖xn − x∗‖; and at least as precise information on the location of the solution x∗ in many interesting special cases such as
Newton’s method or the secant method (see Remark 3.3 and the Examples). Moreover,
in the local case: a larger radius of convergence and more precise error estimates than in
earlier studies such as [8, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22] are obtained in this
chapter (see Remark 4.2 and the Examples).
The chapter is organized as follows. In Section 1.2 we study the convergence of the
majorizing sequences for {xn }. Section 1.3 contains the semilocal and Section 1.4 the
local convergence analysis for {xn }. The numerical examples are given in the concluding
Section 1.5. In particular, in the local case we present an example where the radius of
convergence is larger than the one given by Rheinboldt [18] and Traub [19] for Newton’s
method. Moreover, in the semilocal case we provide an example involving a nonlinear
integral equation of Chandrasekhar type [7] appearing in radiative transfer as well as an
example involving a two point boundary value problem.
α−1 = 0, α0 = c, α1 = c + ν,
αn+2 = αn+1 + [l(αn+1 − αn + λ(αn − αn−1))(αn+1 − αn)] / [1 − l0(µ(αn+1 − c) + λ(αn − c) + c)] for each n = 0, 1, 2, · · · (1.2.1)
Special cases of the sequence {αn } have been used as majorizing sequences for secant-type
method by several authors. For example: Case 1 (secant method) l0 = l, λ = 1 and µ = 1 has
been studied in [6, 8, 9, 12, 14, 15, 20, 21] and for l0 ≤ l in [2, 4]. Case 2 (Newton’s method)
l0 = l, λ = 0, c = 0 and µ = 2 has been studied in [1, 8, 10, 11, 12, 14, 15, 17, 18, 19, 21, 22]
and for l0 ≤ l in [2, 3, 4]. In the present chapter we shall study the convergence of sequence
{αn } by first simplifying it. Indeed, the purpose of the following transformations is to
study the sequence (1.2.1) after using easier to study sequences defined by (1.2.3), (1.2.6)
and (1.2.8). Let
L0 = l0 / (1 + (µ + λ − 1)l0 c) and L = l / (1 + (µ + λ − 1)l0 c). (1.2.2)
α−1 = 0, α0 = c, α1 = c + ν,
αn+2 = αn+1 + [L(αn+1 − αn + λ(αn − αn−1))(αn+1 − αn)] / [1 − L0(µαn+1 + λαn)] for each n = 0, 1, 2, · · · (1.2.3)
Moreover, let
L = bL0 for some b ≥ 1 (1.2.4)
and
βn = L0 αn . (1.2.5)
Then, we can define sequence {βn } by
β−1 = 0, β0 = L0 c, β1 = L0(c + ν),
βn+2 = βn+1 + [b(βn+1 − βn + λ(βn − βn−1))(βn+1 − βn)] / [1 − (µβn+1 + λβn)] for each n = 0, 1, 2, · · · (1.2.6)
Furthermore, let
γn = 1/(µ + λ) − βn for each n = −1, 0, 1, 2, · · · . (1.2.7)
Then, sequence {γn } is defined by
γ−1 = 1/(µ + λ), γ0 = 1/(µ + λ) − L0 c, γ1 = 1/(µ + λ) − L0(c + ν),
γn+2 = γn+1 − [b(γn+1 − γn + λ(γn − γn−1))(γn+1 − γn)] / (µγn+1 + λγn) for each n = 0, 1, 2, · · · (1.2.8)
Finally, let
δn = 1 − γn/γn−1 for each n = 0, 1, 2, · · · (1.2.9)
Then, we define the sequence {δn } by
δ0 = 1 − γ0/γ−1, δ1 = 1 − γ1/γ0,
δn+2 = [bδn+1(λδn + δn+1(1 − δn))] / [(1 − δn)(1 − δn+1)(λ + µ(1 − δn+1))] for each n = 0, 1, 2, · · · (1.2.10)
It is convenient for the study of the convergence of the sequence {αn } to define polynomial
p by
p(t) = µt 3 − (λ + 3µ + b)t 2 + (2λ + 3µ + b(λ + 1))t − (µ + λ). (1.2.11)
We have that p(0) = −(µ + λ) < 0 and p(1) = bλ > 0 for λ > 0. It follows from the
intermediate value theorem that p has roots in (0, 1). Denote the smallest root by δ. If
If λ = 0, then p(t) = (t − 1)(µt² − (2µ + b)t + µ). Hence, we can choose the smallest root of p, given by (2µ + b − √(b² + 4µb))/(2µ) ∈ (0, 1), to be δ in this case. Note that in particular for Newton's method and the secant method, respectively, we have that
p(t) = (t − 1)(2t² − (4 + b)t + 2)
and
p(t) = (t − 2)(t² − (b + 2)t + 1).
Hence, we obtain, respectively, that
δ = 4 / (b + 4 + √(b² + 8b)) (1.2.12)
and
δ = 2 / (b + 2 + √(b² + 4b)). (1.2.13)
Notice also that
p(t) ≤ 0 for each t ∈ (−∞, δ]. (1.2.14)
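As a quick numerical check, the smallest root of p in (0, 1), located by a scan plus bisection, agrees with the closed forms (1.2.12) and (1.2.13); the value of b below is illustrative:

```python
import math

def smallest_root_p(lam, mu, b):
    """Smallest root in (0, 1) of the polynomial p(t) of (1.2.11)."""
    def p(t):
        return (mu * t**3 - (lam + 3 * mu + b) * t**2
                + (2 * lam + 3 * mu + b * (lam + 1)) * t - (mu + lam))
    t, step = 0.0, 1e-4
    while p(t + step) < 0 and t + step < 1.0:  # p(0) < 0: find first sign change
        t += step
    lo, hi = t, t + step
    for _ in range(80):                        # bisection refinement
        mid = 0.5 * (lo + hi)
        if p(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

b = 1.5
delta_newton = 4.0 / (b + 4 + math.sqrt(b * b + 8 * b))  # (1.2.12): lam = 0, mu = 2
delta_secant = 2.0 / (b + 2 + math.sqrt(b * b + 4 * b))  # (1.2.13): lam = 1, mu = 1
```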
Next, we study the convergence of these sequences starting from {δn }.
Lemma 1.2.1. Let δ1 > 0, δ2 > 0 and b ≥ 1 be given parameters. Suppose that
0 < δ2 ≤ δ1 ≤ δ, (1.2.15)
where δ is the smallest root in (0, 1) of the polynomial p defined in (1.2.11). Let {δn} be the scalar sequence defined by (1.2.10). Then,
the following assertions hold:
(A1 ) If
δ1 = δ2 = δ, (1.2.16)
then
δn = δ for each n = 1, 2, 3, · · · (1.2.17)
(A2 ) If
0 < δ2 < δ1 < δ (1.2.18)
then, sequence {δn } is decreasing and converges to 0.
Proof. It follows from (1.2.10) and δ2 ≤ δ1 that δ3 > 0. We shall show that
δ3 ≤ δ2 . (1.2.19)
Hence, p1 has two distinct roots δs and δl with δs < δl . Polynomial p1 is quadratic with
respect to δ2 and the leading coefficient (µ(1 − δ1 )) is positive. Therefore, we have that
and
p1 (t) ≤ 0 for each t ∈ [δs , δl ].
Then, (1.2.20) shall be true, if
δ2 ≤ δs . (1.2.22)
By hypothesis (1.2.15) we have δ1 ≤ δ. Then by (1.2.14) we get that p(δ1) ≤ 0, which implies δ1 ≤ δs and hence (1.2.22), since δ2 ≤ δ1 by hypothesis (1.2.15). Hence, we showed (1.2.19). Therefore, relation
0 < δk+1 < δk , (1.2.23)
holds for k = 2. Then, we must show that
It follows from (1.2.10), δk < 1 and δk+1 < 1 that δk+2 > 0. Then, in view of (1.2.10) the
right hand side of (1.2.24) is true, if
Lemma 1.2.2. Suppose that the hypothesis (1.2.18) is satisfied. Then, the sequence {γn} is decreasing and convergent, and the sequences {αn} and {βn} are increasing and convergent. Moreover, the following estimate holds:
l0 c < 1. (1.2.28)
and by the preceding equation we deduce that γn > 0 for each n = 1, 2, . . . and γn+1 = (1 − δn+1)γn < γn, since δn < 1. Hence, sequence {γn} converges to its unique greatest lower bound denoted by γ∗. We also have that βn = 1/(µ + λ) − γn < 1/(µ + λ). Thus, the sequence {βn} is increasing, bounded from above by 1/(µ + λ) and as such it converges to its unique least upper bound denoted by β∗. Then, in view of (1.2.5), sequence {αn} is also increasing, bounded from above by 1/(L0(µ + λ)) and as such it also converges to its unique least upper bound denoted by α∗.
Lemma 1.2.3. Suppose that (1.2.15) and (1.2.16) are satisfied. Then, the following assertions hold for each n = 1, 2, · · ·:
δn = δ,
γn = (1 − δ)ⁿγ0, γ∗ = lim_{n→∞} γn = 0,
βn = 1/(µ + λ) − (1 − δ)ⁿγ0, β∗ = lim_{n→∞} βn = 1/(µ + λ)
and
αn = (1/L0)[1/(µ + λ) − (1 − δ)ⁿγ0], α∗ = lim_{n→∞} αn = 1/(L0(µ + λ)).
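The closed forms of Lemma 1.2.3 can be checked against the recursion (1.2.8). In the sketch below (secant case λ = µ = 1, illustrative b) the starting values are placed on the geometric orbit so that δ1 = δ2 = δ, and the iterates of (1.2.8) then follow γn = (1 − δ)ⁿγ0:

```python
import math

b, lam, mu = 1.2, 1.0, 1.0
delta = 2.0 / (b + 2 + math.sqrt(b * b + 4 * b))  # closed form (1.2.13)
q = 1.0 - delta

# gamma_{-1}, gamma_0, gamma_1 chosen on the orbit gamma_n = q**n * gamma_0
g = [1.0 / q, 1.0, q]
for _ in range(8):
    g_pp, g_p, g_c = g[-3], g[-2], g[-1]
    num = b * (g_c - g_p + lam * (g_p - g_pp)) * (g_c - g_p)
    den = mu * g_c + lam * g_p
    g.append(g_c - num / den)  # recursion (1.2.8)
# g[k] holds gamma_{k-1}; e.g. g[10] should equal q**9
```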
Corollary 1.2.4. Suppose that the hypotheses of Lemma 2.1 and Lemma 2.2 hold. Then,
sequence {αn } defined in (1.2.1) is nondecreasing and converges to
α∗ = [(1 + (µ + λ − 1)l0 c) / l0] β∗.
Next, we present lower and upper bounds on the limit point α∗ .
Lemma 1.2.5. Suppose that the condition (1.2.18) is satisfied. Then, the following assertion
holds
b1 ≤ α∗ ≤ b2, (1.2.29)
where
b1 = [(1 + (µ + λ − 1)l0 c) / l0] [1/(µ + λ) − γ0 exp(−2(δ1/(2 − δ1) + δ2/(2 − δ2)))],
b2 = [(1 + (µ + λ − 1)l0 c) / l0] [1/(µ + λ) − exp(δ∗)], (1.2.30)
δ∗ = −[(1/(1 − δ1))(δ1 + δ2/(1 − r)) + ln((µ + λ)(1 + (µ + λ − 1)l0 c)/(1 − l0 c))]
and
r = b(λδ1 + δ2(1 − δ1)) / [(1 − δ1)(1 − δ2)(λ + µ(1 − δ2))].
Proof. Using (1.2.18) and (1.2.28) we have that 0 < δ3 < δ2 < δ1. Let us assume that 0 < δk+1 < δk < · · · < δ1. Then, it follows from the induction hypotheses and (1.2.10) that
δk+2 = [bδk+1(λδk + δk+1(1 − δk))] / [(1 − δk)(1 − δk+1)(λ + µ(1 − δk+1))] < rδk+1 < r²δk ≤ · · · ≤ rᵏ⁻¹δ3 ≤ rᵏδ2.
We have that
γ∗ = lim_{n→∞} γn = ∏_{n=1}^{∞} (1 − δn) γ0.
This is equivalent to
ln(1/γ∗) = Σ_{n=1}^{∞} ln(1/(1 − δn)) + ln[(µ + λ)(1 + (µ + λ − 1)l0 c)/(1 − l0 c)],
then, it follows from (1.2.1) that sequence {αn} is increasing, bounded from above by (1 + (µ + λ − 1)l0 c)/(l0(µ + λ)) and as such it converges to its unique least upper bound α∗. Criterion (1.2.31) is the weakest of all the preceding convergence criteria for sequence {αn}.
Clearly all the preceding criteria imply (1.2.31). Finally, define the criteria for N ≥ 1
(I^N) = (C∗^N), if the criteria (C∗^N) hold; (1.2.31), if the criteria (C∗^N) fail. (1.2.32)
Lemma 1.2.7. Suppose that the conditions (1.2.18) and (1.2.28) hold. Then, the following
assertion holds
b1 ≤ α∗ ≤ b2, (1.2.33)
where
b1 = [(1 + (µ + λ − 1)l0 c) / l0] [1/(µ + λ) − γ0 exp(−2(δ1/(2 − δ1) + δ2/(2 − δ2)))],
b2 = [(1 + (µ + λ − 1)l0 c) / l0] [1/(µ + λ) − exp(δ∗)], (1.2.34)
δ∗ = −[(1/(1 − δ1))(δ1 + δ2/(1 − r)) + ln((µ + λ)(1 + (µ + λ − 1)l0 c)/(1 − l0 c))]
and
r = b(λδ1 + δ2(1 − δ1)) / [(1 − δ1)(1 − δ2)(λ + µ(1 − δ2))].
Proof. Using (1.2.18) and (1.2.28) we have that 0 < δ3 < δ2 < δ1. Let us assume that 0 < δk+1 < δk < · · · < δ1. Then, it follows from the induction hypotheses and (1.2.10) that
δk+2 = [bδk+1(λδk + δk+1(1 − δk))] / [(1 − δk)(1 − δk+1)(λ + µ(1 − δk+1))] < rδk+1 < r²δk ≤ · · · ≤ rᵏ⁻¹δ3 ≤ rᵏδ2.
We have that
γ∗ = lim_{n→∞} γn = ∏_{n=1}^{∞} (1 − δn) γ0.
This is equivalent to
ln(1/γ∗) = Σ_{n=1}^{∞} ln(1/(1 − δn)) + ln[(µ + λ)(1 + (µ + λ − 1)l0 c)/(1 − l0 c)]
≤ (1/(1 − δ1))[δ1 + δ2(1 + r + r² + · · · + rⁿ + · · ·)] + ln[(µ + λ)(1 + (µ + λ − 1)l0 c)/(1 − l0 c)]
= (1/(1 − δ1))(δ1 + δ2/(1 − r)) + ln[(µ + λ)(1 + (µ + λ − 1)l0 c)/(1 − l0 c)] = −δ∗.
Definition 1.3.1. Let l0, l, ν, c, λ, µ be constants satisfying the hypotheses (I^N) for some fixed integer N ≥ 1. A triplet (F, x−1, x0) belongs to the class K = K(l0, l, ν, c, λ, µ) if:
(D2) x−1 and x0 are two points belonging to the interior D0 of D and satisfying the inequality
‖x0 − x−1‖ ≤ c.
(D3 ) There exists a sequence {θn } of real numbers and λ, µ such that |1 − θn | ≤ λ and
1 + |θn | ≤ µ for each n = 0, 1, 2, · · ·.
(D5)
U(x0 , α∗0 ) ⊆ Dc = {x ∈ D : F is continuous at x} ⊆ D,
where α∗0 = (µ + λ − 1)(α∗ − c) and α∗ is given in Lemma 2.3.
Next, we present the semilocal convergence result for the secant method.
Theorem 1.3.2. If (F, x−1, x0) ∈ K(l0, l, ν, c, λ, µ), then the sequence {xn} (n ≥ −1) generated by the secant-type method is well defined, remains in U(x0, α∗0) for each n = 0, 1, 2, · · · and converges to a unique solution x∗ ∈ U(x0, α∗ − c) of (1.1.1). Moreover, the following
assertions hold for each n = 0, 1, 2, · · ·
and
‖x∗ − xn‖ ≤ α∗ − αn, (1.3.2)
where sequence {αn } (n ≥ 0) is given in (1.2.1). Furthermore, if there exists R such that
Proof. First, we show that M = δF (xk+1, yk+1 ) is invertible for xk+1 , yk+1 ∈ U(x0 , α∗0 ). By
(D2 ),(D3 ) and (D4 ), we have that
and
‖I − A⁻¹M‖ = ‖A⁻¹(M − A)‖
≤ ‖A⁻¹(M − F′(x0))‖ + ‖A⁻¹(F′(x0) − A)‖
≤ l0(‖xk+1 − x0‖ + ‖yk+1 − x0‖ + ‖x0 − x−1‖)
≤ l0(‖xk+1 − x0‖ + |θk+1|‖xk+1 − x0‖ + |1 − θk+1|‖xk − x0‖ + c)
≤ l0(µ(αk+1 − c) + λ(αk+1 − c) + c) < 1. (1.3.4)
Using the Banach Lemma on invertible operators [9], [10], [15], [18], [20] and (1.3.4), we
deduce that M is invertible and
By (D4 ), we have
and
and
‖xk+2 − xk+1‖ = ‖Ak+1⁻¹F(xk+1)‖
≤ ‖Ak+1⁻¹A‖ ‖A⁻¹F(xk+1)‖
≤ [l(αk+1 − αk + |1 − θk|(αk − αk−1)) / (1 − l0[(1 + |θk+1|)(αk+1 − c) + |1 − θk+1|(αk − c) + c])] (αk+1 − αk)
≤ αk+2 − αk+1.
The induction for (1.3.1) is complete. It follows from (1.3.1) and Lemma 2.1 that {xn }
(n ≥ −1) is a complete sequence in a Banach space X and as such it converges to some x∗ ∈
U(x0 , α∗ − c) (since U(x0 , α∗ − c) is a closed set). By letting k → ∞ in (1.3.12), we obtain
F(x∗) = 0. Moreover, estimate (1.3.2) follows from (1.3.1) by using standard majorization
techniques [8, 12, 14]. Finally, to show the uniqueness in U(x0, R), let y∗ ∈ U(x0, R) be a solution of (1.1.1). Set
T = ∫₀¹ F′(y∗ + t(x∗ − y∗)) dt.
Using (D4) and (1.3.3) we get in turn the estimate (1.3.13). It follows from (1.3.13) and the Banach lemma on invertible operators that T⁻¹ exists. Using the identity
F(x∗) − F(y∗) = T(x∗ − y∗), (1.3.14)
we deduce that x∗ = y∗.
Remark 1.3.3. It follows from the proof of Theorem 3.2 that the sequences {rn}, {sn} defined by
r−1 = 0, r0 = c, r1 = c + ν,
r2 = r1 + [l0(r1 − r0 + |1 − θ0|(r0 − r−1))(r1 − r0)] / [1 − l0(1 + |θ1|)(r1 − r0)],
rn+2 = rn+1 + [l(rn+1 − rn + |1 − θn|(rn − rn−1))(rn+1 − rn)] / [1 − l0((1 + |θn+1|)(rn+1 − r0) + |1 − θn+1|(rn − r0) + c)] (1.3.15)
and
s−1 = 0, s0 = c, s1 = c + ν,
s2 = s1 + [l0(s1 − s0 + λ(s0 − s−1))(s1 − s0)] / [1 − l0(1 + |θ1|)(s1 − s0)],
sn+2 = sn+1 + [l(sn+1 − sn + λ(sn − sn−1))(sn+1 − sn)] / [1 − l0(µ(sn+1 − s0) + λ(sn − s0) + c)] (1.3.16)
respectively, are more precise majorizing sequences for {xn}. Clearly, these sequences also converge under the (I^N) hypotheses.
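The gain from the finer sequences can be illustrated numerically. The sketch below (illustrative constants; secant case θn = 0, so λ = µ = 1, in which {rn} and {sn} coincide) compares {sn} of (1.3.16) with {αn} of (1.2.1):

```python
def alpha_seq(l0, l, c, nu, n, lam=1.0, mu=1.0):
    """Majorizing sequence (1.2.1)."""
    a = [0.0, c, c + nu]
    for _ in range(n):
        num = l * (a[-1] - a[-2] + lam * (a[-2] - a[-3])) * (a[-1] - a[-2])
        den = 1.0 - l0 * (mu * (a[-1] - c) + lam * (a[-2] - c) + c)
        a.append(a[-1] + num / den)
    return a

def s_seq(l0, l, c, nu, n, lam=1.0, mu=1.0, theta1=0.0):
    """Finer majorizing sequence (1.3.16); the first step uses l0."""
    s = [0.0, c, c + nu]  # s_{-1}, s_0, s_1
    num = l0 * (s[2] - s[1] + lam * (s[1] - s[0])) * (s[2] - s[1])
    den = 1.0 - l0 * (1.0 + abs(theta1)) * (s[2] - s[1])
    s.append(s[2] + num / den)
    for _ in range(n - 1):
        num = l * (s[-1] - s[-2] + lam * (s[-2] - s[-3])) * (s[-1] - s[-2])
        den = 1.0 - l0 * (mu * (s[-1] - s[1]) + lam * (s[-2] - s[1]) + c)
        s.append(s[-1] + num / den)
    return s

a = alpha_seq(0.5, 1.0, 0.1, 0.1, 6)  # l0 < l: the s-sequence is tighter
s = s_seq(0.5, 1.0, 0.1, 0.1, 6)
```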
A simple inductive argument shows that, if l0 < l, these sequences are tighter than {αn} for each n = 2, 3, · · ·.
In practice, one must choose {θn} so that the best error bounds are obtained (see also Section 4). Note also that sequences {rn} or {sn} may converge under even weaker hypotheses. The sufficient convergence criterion (1.2.15) determines the smallness of c and ν. This criterion can be solved for c and ν (see, for example, the h criteria or (1.3.29) that follow). Indeed, let us demonstrate the advantages in two popular cases:
Case 1. Newton's method (i.e., c = 0, λ = 0, µ = 2). Then, it can easily be seen that {sn} (and consequently {rn}) converges provided that (see also [3])
h2 = l2 ν ≤ 1, (1.3.20)
where
l2 = (1/4)[4κ0 + √(κ0 κ) + √(κ0 κ + 8κ0²)], (1.3.21)
whereas sequence {xn} converges if
h1 = l1 ν ≤ 1, (1.3.22)
where
l1 = (1/4)[4κ0 + κ + √(κ² + 8κκ0)]. (1.3.23)
In the case κ0 = κ (i.e., b = 1), we obtain the Kantorovich sufficient convergence criterion [2], famous for its simplicity and clarity, given by
h = 2κν ≤ 1. (1.3.24)
Moreover, we have that
h1/h → 1/4, h2/h → 0, h2/h1 → 0 as κ0/κ → 0. (1.3.26)
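The relationship between these criteria is easy to verify numerically. In the sketch below the constants κ0, κ, ν are illustrative, chosen so that the Kantorovich criterion (1.3.24) fails while (1.3.20) and (1.3.22) hold:

```python
import math

kappa0, kappa = 0.4, 1.0  # center-Lipschitz and Lipschitz constants (kappa0 <= kappa)
nu = 0.55                 # illustrative bound on the first Newton step

h = 2 * kappa * nu                                                           # (1.3.24)
l1 = 0.25 * (4 * kappa0 + kappa + math.sqrt(kappa**2 + 8 * kappa * kappa0))  # (1.3.23)
l2 = 0.25 * (4 * kappa0 + math.sqrt(kappa0 * kappa)
             + math.sqrt(kappa0 * kappa + 8 * kappa0**2))                    # (1.3.21)
h1, h2 = l1 * nu, l2 * nu
# Here h > 1 (Kantorovich fails) while h1 <= 1 and h2 <= 1 (weaker criteria hold).
```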
Case 2. Secant method (i.e., θn = 0). Schmidt [20], Potra-Pták [15], Dennis [8] and Ezquerro et al. [9] used the majorizing sequence {αn} for θn ∈ [0, 1] and l0 = l. That is, they used the sequence {tn} given by
t−1 = 0, t0 = c, t1 = c + ν,
tn+2 = tn+1 + [l(tn+1 − tn−1)(tn+1 − tn)] / [1 − l(tn+1 + tn − c)]. (1.3.27)
Then, in the case l0 < l our sequence is more precise (see also (1.3.17)-(1.3.19)). Notice also that in the preceding references the sufficient convergence criterion associated to {tn} is given by
lc + 2√(lν) ≤ 1. (1.3.29)
Our sufficient convergence criteria can also be weaker in this case (see also the numerical examples). It is worth noting that if c = 0, (1.3.29) reduces to (1.3.24) (since κ = 2l). Similar observations can be made for other choices of parameters.
yn − xn = (1 − θn )(xn−1 − xn ),
and
yn − x∗ = θn (xn − x∗ ) + (1 − θn )(xn−1 − x∗ )
we easily arrive at:
Theorem 1.4.1. Suppose that (D1) and (D3) hold. Moreover, suppose that there exist x∗ ∈ D, K0 > 0, K > 0 such that F(x∗) = 0, F′(x∗)⁻¹ ∈ L(Y, X),
where
ên = [K(‖xn − x∗‖ + |1 − θn|‖xn−1 − xn‖)] / [1 − K0((1 + |θn|)‖xn − x∗‖ + |1 − θn|‖xn−1 − x∗‖)],
en = [K(‖xn − x∗‖ + λ‖xn−1 − xn‖)] / [1 − K0(µ‖xn − x∗‖ + λ‖xn−1 − x∗‖)],
ēn = [K(2λ + 1)R∗] / [1 − K0(λ + µ)R∗]
and
K = κ0 if n = 0, K = κ if n > 0.
Remark 1.4.2. Comments similar to the ones given in Remark 3.3 can also follow for this case. For example, notice again that in the case of Newton's method
R∗ = 2/(2κ0 + κ),
whereas the convergence ball given independently by Rheinboldt [18] and Traub [19] is given by
R1∗ = 2/(3κ).
Note that
R1∗ ≤ R∗.
Strict inequality holds in the preceding inequality if κ0 < κ. Moreover, the error bounds are tighter if κ0 < κ. Finally, note that κ0/κ can be arbitrarily small and
R∗/R1∗ → 3 as κ0/κ → 0.
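A two-line computation (with illustrative κ0 < κ) confirms both the inequality R1∗ ≤ R∗ and the limiting ratio:

```python
kappa0, kappa = 0.5, 2.0             # illustrative center-Lipschitz and Lipschitz constants
R_star = 2.0 / (2 * kappa0 + kappa)  # radius from the present analysis
R1_star = 2.0 / (3 * kappa)          # Rheinboldt/Traub radius

# The ratio R_star / R1_star tends to 3 as kappa0 / kappa tends to 0.
ratio_small = (2.0 / (2 * 1e-8 + kappa)) / R1_star
```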
Example 1.5.1. Consider the equation
x³ − 0.49 = 0, (1.5.1)
to which we apply the secant method (λ = 1, µ = 1, θn = 0) to find the solution of (1.5.1). We take the starting points x−1 = 1.14216 · · ·, x0 = 1 and we consider the domain Ω = B(x0, 2). In this case, we obtain
ν = 0.147967 · · · , (1.5.2)
c = 0.14216 · · · , (1.5.3)
l = 2.61119 · · · , (1.5.4)
l0 = 1.74079 · · · . (1.5.5)
Notice that the hypothesis lc + 2√(lν) ≤ 1 is not satisfied, but the hypotheses of Theorem 3.2 are, so the secant method starting from x0 ∈ B(x0, 2) converges to the solution of (1.5.1).
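This example can be reproduced in a few lines. The divided difference at the starting points recovers ν ≈ 0.147967 of (1.5.2) and c = 0.14216 of (1.5.3), and the secant iterates converge to the root 0.49^(1/3):

```python
f = lambda x: x**3 - 0.49
x_prev, x = 1.14216, 1.0                 # starting points x_{-1}, x_0 of Example 1.5.1

dd0 = (f(x) - f(x_prev)) / (x - x_prev)  # divided difference [x_{-1}, x_0; f]
nu = abs(f(x) / dd0)                     # ~ 0.147967, cf. (1.5.2)
c = abs(x - x_prev)                      # = 0.14216, cf. (1.5.3)

for _ in range(10):                      # secant method (theta_n = 0)
    fx, fxp = f(x), f(x_prev)
    if x == x_prev or fx == fxp or fx == 0.0:
        break                            # iteration has converged to machine precision
    dd = (fx - fxp) / (x - x_prev)
    x_prev, x = x, x - fx / dd
```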
Example 1.5.2. Let X = Y = C [0, 1], equipped with the max-norm. Consider the following
nonlinear boundary value problem
u″ = −u³ − γu²,
u(0) = 0, u(1) = 1.
It is well known that this problem can be formulated as the integral equation
u(s) = s + ∫₀¹ Q(s, t)(u³(t) + γu²(t)) dt, (1.5.6)
Using the norm of the maximum of the rows and (1.5.7)–(1.5.8), we see that since F′(x∗) = diag{1, 1, 1}, we can define the parameters for Newton's method by
K = e/2, (1.5.9)
K0 = 1, (1.5.10)
R∗ = 2/(e + 4), (1.5.11)
R∗0 = R∗, (1.5.12)
since θn = 1, µ = 2, λ = 0. Then Newton's method starting from x0 ∈ B(x∗, R∗) converges to a solution of (1.5.7). Note that using only the Lipschitz condition we obtain the Rheinboldt or Traub ball R∗TR = 2/(3e) < R∗.
Example 1.5.4. In this example we present an application of the previous analysis to the Chandrasekhar equation:
x(s) = 1 + (s/4) x(s) ∫₀¹ x(t)/(s + t) dt, s ∈ [0, 1], (1.5.13)
which arises in the theory of radiative transfer [7]; x(s) is the unknown function which is sought in C[0, 1]. The physical background of this equation is fairly elaborate. It was developed by Chandrasekhar [7] to solve the problem of determining the angular distribution of the radiant flux emerging from a plane radiation field. This radiation field must be isotropic at a point, that is, the distribution is independent of direction at that point. Explicit definitions of these terms may be found in the literature [7]. It is considered to be the prototype of the equation
x(s) = 1 + λs x(s) ∫₀¹ [ϕ(t)/(s + t)] x(t) dt, s ∈ [0, 1],
for more general laws of scattering, where ϕ(s) is an even polynomial in s with
∫₀¹ ϕ(s) ds ≤ 1/2.
Integral equations of the above form also arise in other studies [7]. We determine where a solution is located, along with its region of uniqueness.
Note that solving (1.5.13) is equivalent to solving F(x) = 0, where F : C[0, 1] → C[0, 1] and
[F(x)](s) = x(s) − 1 − (s/4) x(s) ∫₀¹ x(t)/(s + t) dt, s ∈ [0, 1]. (1.5.14)
To obtain a numerical solution of (1.5.13), we first discretize the problem and approximate the integral by a Gauss-Legendre numerical quadrature with eight nodes,
∫₀¹ f(t) dt ≈ Σ_{j=1}^{8} wj f(tj),
where
where
t1 = 0.019855072, t2 = 0.101666761, t3 = 0.237233795, t4 = 0.408282679,
t5 = 0.591717321, t6 = 0.762766205, t7 = 0.898333239, t8 = 0.980144928,
w1 = 0.050614268, w2 = 0.111190517, w3 = 0.156853323, w4 = 0.181341892,
w5 = 0.181341892, w6 = 0.156853323, w7 = 0.111190517, w8 = 0.050614268.
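A minimal sketch of this discretization: the eight-node quadrature turns (1.5.13) into the 8 × 8 nonlinear system Fi(x) = xi − 1 − (ti/4)xi Σj wj xj/(ti + tj) = 0. For brevity we solve it below with a plain fixed-point (Picard) iteration, which converges for this equation; the chapter's analysis of course concerns secant/Newton-type methods instead:

```python
t = [0.019855072, 0.101666761, 0.237233795, 0.408282679,
     0.591717321, 0.762766205, 0.898333239, 0.980144928]
w = [0.050614268, 0.111190517, 0.156853323, 0.181341892,
     0.181341892, 0.156853323, 0.111190517, 0.050614268]
n = 8

def F(x):
    """Residual of the discretized Chandrasekhar equation, cf. (1.5.14)."""
    return [x[i] - 1.0 - (t[i] / 4.0) * x[i]
            * sum(w[j] * x[j] / (t[i] + t[j]) for j in range(n))
            for i in range(n)]

x = [1.0] * n             # initial guess x(s) = 1
for _ in range(200):      # Picard iteration: x <- 1 + (s/4) x * quadrature sum
    x = [1.0 + (t[i] / 4.0) * x[i]
         * sum(w[j] * x[j] / (t[i] + t[j]) for j in range(n))
         for i in range(n)]
```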
Table 1.5.1. The comparison results of ‖xn+1 − xn‖ for Example 1.5.4 using various methods
[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004), 397–405.
[2] Argyros, I.K., Computational theory of iterative methods. Series: Studies in Compu-
tational Mathematics, 15, Editors: C.K. Chui and L. Wuytack, 2007, Elsevier Publ.
Co. New York, U.S.A.
[3] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, AMS, 28 (2012), 364–387.
[4] Argyros, I. K., Cho, Y. J., Hilout, S., Numerical method for equations and its applica-
tions. CRC Press/Taylor and Francis, New York, 2012.
[5] Argyros, I. K., George, S., Ball convergence for Steffensen-type fourth-order methods.
Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 37–42.
[6] Cătinaş, E., The inexact, inexact perturbed, and quasi-Newton methods are equivalent
models, Math. Comp., 74(249) (2005), 291–301.
[7] Chandrasekhar, S., Radiative transfer, Dover Publ., New York, 1960.
[8] Dennis, J.E., Toward a unified convergence theory for Newton-like methods, in Non-
linear Functional Analysis and Applications (L. B. Rall, ed.) Academic Press, New
York, (1971), 425–472.
[9] Ezquerro, J.A., Hernández, M.A., Rubio, M.J., Secant-like methods for solving nonlinear integral equations of the Hammerstein type, J. Comput. Appl. Math., 115 (2000), 245–254.
[10] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. R. Soc. Mat. Esp.,
13(1) (2010), 53–76.
[11] Gragg, W.B., Tapia, R.A., Optimal error bounds for the Newton-Kantorovich theorem,
SIAM J. Numer. Anal., 11 (1974), 10–13.
[12] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[13] Magreñán, Á. A., A new tool to study real dynamics: The convergence plane, Appl. Math. Comput., 248 (2014), 215–224.
[14] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[15] Potra, F.A., Pták, V., Nondiscrete induction and iterative processes. Research Notes in
Mathematics, 103. Pitman (Advanced Publishing Program), Boston, MA, 1984.
[16] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.
[17] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton-Kantorovich type theorems, J. Complexity, 26 (2010), 3–42.
[18] Rheinboldt, W.C., An adaptive continuation process for solving systems of nonlinear equations, Banach Center Publ., 3 (1975), 129–142.
[19] Traub, J. F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood Cliffs, NJ, 1964.
[20] Schmidt, J. W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Math. Hungar., 9 (1978), 241–247.
[21] Yamamoto, T., A convergence theorem for Newton-like methods in Banach spaces,
Numer. Math., 51 (1987), 545–557.
[22] Zabrejko, P.P., Nguen, D.F., The majorant method in the theory of Newton-
Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim.,
9 (1987), 671–684.
Chapter 2

Efficient Steffensen-Type Algorithms for Solving Nonlinear Equations
2.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solution x∗ of an equation
F(x) = 0, (2.1.1)
where F is an operator defined on a non–empty, open subset Ω of a Banach space X with
values in a Banach space Y .
Many problems in computational sciences can be brought in the form of equation
(2.1.1). For example, the unknowns of engineering equations can be functions (differ-
ence, differential, and integral equations), vectors (systems of linear or nonlinear algebraic
equations), or real or complex numbers (single algebraic equations with single unknowns).
The solutions of these equations can rarely be found in closed form. That is why the most commonly used solution methods are iterative. The practice of numerical analysis is usually connected to Newton-like methods [1, 3, 5, 7–16, 18, 19, 21–27].
The study of the convergence of iterative procedures is usually divided into two types: semilocal and local convergence analysis. The semilocal convergence analysis uses the information around an initial point to give conditions ensuring the convergence of the iterative method, while the local one uses the information around a solution to find estimates of the radii of the convergence balls.
A classic iterative process for solving nonlinear equations is Chebyshev’s method (see
[5], [8], [17]):
x0 ∈ Ω,
yk = xk − F′(xk)⁻¹F(xk),
xk+1 = yk − (1/2)F′(xk)⁻¹F″(xk)(yk − xk)², k ≥ 0.
This one-point iterative process depends explicitly on the first two derivatives of F (namely, xk+1 = ψ(xk, F(xk), F′(xk), F″(xk))). Ezquerro and Hernández introduced in [15] some modifications of Chebyshev's method that avoid the computation of the second derivative of F and reduce the number of evaluations of the first derivative of F. Actually, these authors have obtained a modification of the Chebyshev iterative process which only needs to evaluate the first derivative of F (namely, xk+1 = ψ(xk, F′(xk))), but with third order of convergence. In this chapter we recall this method as the Chebyshev–Newton-type method (CNTM) and it is written as follows:
x_0 ∈ Ω,
y_k = x_k - F'(x_k)^{-1} F(x_k),
z_k = x_k + a (y_k - x_k),
x_{k+1} = x_k - (1/a^2) F'(x_k)^{-1} ((a^2 + a - 1) F(x_k) + F(z_k)),   k ≥ 0.
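A scalar sketch of (CNTM) under the same conventions; note that only the first derivative is evaluated, and a ≠ 0 is the free parameter:

```python
# Scalar sketch of the Chebyshev-Newton-type method (CNTM); only the
# first derivative df is evaluated.  a != 0 is the method parameter.
def cntm(f, df, x0, a=0.5, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if abs(fx) < tol:
            break
        y = x - fx / dfx
        z = x + a * (y - x)
        x = x - ((a**2 + a - 1) * fx + f(z)) / (a**2 * dfx)
    return x

root = cntm(lambda x: x**3 - 2, lambda x: 3*x**2, 1.0)
```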
There is an interest in constructing families of iterative processes free of derivatives. To
obtain a new family, in [8] we considered an approximation of the first derivative of F by
a divided difference of first order, that is, F'(x_k) ≈ [x_{k-1}, x_k; F], where [x, y; F] is a divided
difference of order one for the operator F at the points x, y ∈ Ω. Then, we introduced the
Chebyshev–Secant-type method (CSTM)

x_{-1}, x_0 ∈ Ω,
y_k = x_k - B_k^{-1} F(x_k),   B_k = [x_{k-1}, x_k; F],
z_k = x_k + a (y_k - x_k),
x_{k+1} = x_k - B_k^{-1} (b F(x_k) + c F(z_k)),   k ≥ 0.
Secondly, we give the radius of convergence of (STTM) for a nonlinear integral equation. Thirdly,
we discretize a nonlinear integral equation and approximate a numerical solution by using
(STTM).
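For a scalar equation, the divided difference [x_{k-1}, x_k; F] reduces to (F(x_k) - F(x_{k-1}))/(x_k - x_{k-1}), so (CSTM) can be sketched as follows (parameters assumed to satisfy (1 - a)c = 1 - b):

```python
# Scalar sketch of the Chebyshev-Secant-type method (CSTM); derivative
# free: B_k is the first-order divided difference [x_{k-1}, x_k; F].
def cstm(f, x_prev, x0, a=1.0, b=1.0, c=1.0, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol or x == x_prev:
            break
        B = (fx - f(x_prev)) / (x - x_prev)   # divided difference B_k
        y = x - fx / B
        z = x + a * (y - x)
        x_prev, x = x, x - (b * fx + c * f(z)) / B
    return x

root = cstm(lambda x: x**3 - 2, 0.99, 1.0)   # x_{-1} = 0.99, x_0 = 1
```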
Lemma 2.2.1. Suppose a ∈ [0, 1], b ∈ [0, 1], c ≥ 0 are parameters satisfying (1 - a)c = 1 - b,
and L_0, L and N are positive constants with L_0 ≤ L. Let ψ be the function defined on [0, +∞) by
Hence, in either case ψ(R0 ) > 0. It follows from the intermediate value theorem that there
exists a zero of function ψ in (0, R0) and the minimal such zero must satisfy 0 < R < R0 .
That completes the proof of the lemma.
Remark 2.2.2. We are especially interested in the case when a = b = c = 1. It follows from
(2.2.1) that in this case we can write
We have φ(0) = -1 < 0 and φ(1) = (N + 2)^2 + (N + 3)(N + 2) - 1 > 0. Then, again by the
intermediate value theorem, there exists R_1 ∈ (0, 1) such that φ(R_1) = 0. Moreover, we get

That is, φ is increasing on [0, +∞). Hence, φ crosses the x-axis only once. Therefore, R_1 is
the unique zero of φ in (0, 1). In this case, by setting

L R / (2(1 - ((N + 1)/2) L_0 R)) = R_1,   (2.2.7)
We can show the main result of this section concerning the local convergence of
(STTM).
(b) Point x* is a solution of equation F(x) = 0, F'(x*)^{-1} ∈ L(Y, X), and there exists a constant L > 0 such that

||F'(x*)^{-1}([x, y; F] - [u, v; F])|| ≤ (L/2)(||x - u|| + ||y - v||)   for all x, y, u, v ∈ Ω;   (2.2.10)

(c) There exists a constant L_0 > 0 such that

||F'(x*)^{-1}([x, y; F] - F'(x*))|| ≤ (L_0/2)(||x - x*|| + ||y - x*||)   for all x, y ∈ Ω;   (2.2.11)
Efficient Steffensen-Type Algorithms for Solving Nonlinear Equations 25
It follows from (2.2.24) and the Banach lemma on invertible operators that A_0^{-1} ∈ L(Y, X)
and

||A_0^{-1} F'(x*)|| ≤ 1/(1 - (L_0/2)(N+1)||e_0||) < 1/(1 - (L_0/2)(N+1)R).   (2.2.25)

So,

u_0 = y_0 - x* = x_0 - x* - A_0^{-1} F(x_0)
    = A_0^{-1} F'(x*) F'(x*)^{-1} (A_0 - [x*, x_0; F]) e_0.   (2.2.27)

Hence,

||u_0|| ≤ ||A_0^{-1} F'(x*)|| (L/2)(||x_0 - x*|| + ||G(x_0) - x_0||) ||e_0||
       ≤ (L/2)(||x_0 - x*|| + ||G(x_0) - G(x*)|| + ||x* - x_0||) ||e_0|| / (1 - (L_0/2)(N+1)||e_0||)
       ≤ (L/2)(2||x_0 - x*|| + N||x* - x_0||) ||e_0|| / (1 - (L_0/2)(N+1)||e_0||)   (2.2.28)
       ≤ (L(N+2)R_0 / (2(1 - (L_0/2)(N+1)R_0))) ||e_0|| = ||e_0|| < R,

we get

||v_0|| ≤ a ||u_0|| + (1 - a) ||e_0|| ≤ ||e_0|| < R.   (2.2.30)
As in (2.2.26), we have
e_1 = e_0 - A_0^{-1} (b F(x_0) + c F(z_0))
    = A_0^{-1} ([x_0, G(x_0); F] e_0 - (b [x*, x_0; F] e_0 + c [x*, z_0; F] v_0))
    = A_0^{-1} ([x_0, G(x_0); F] e_0 - b [x*, x_0; F] e_0 - c [x*, z_0; F](a u_0 + (1 - a) e_0))
    = A_0^{-1} ([x_0, G(x_0); F] e_0 - b [x*, x_0; F] e_0 - (1 - b)[x*, z_0; F] e_0 - a c [x*, z_0; F] u_0)
    = A_0^{-1} ([x_0, G(x_0); F] - [x*, x_0; F]) e_0 + (1 - b) A_0^{-1} ([x*, x_0; F] - [x*, z_0; F]) e_0
      + a c A_0^{-1} ([x*, x_0; F] - [x*, z_0; F] + [x_0, G(x_0); F] - [x*, x_0; F]) u_0 - a c u_0.
                                                                                   (2.2.32)
Define

D_0 = A_0^{-1} ([x_0, G(x_0); F] - [x*, x_0; F]),
E_0 = A_0^{-1} ([x*, x_0; F] - [x*, z_0; F]);   (2.2.33)

then, we have from (2.2.27) that

u_0 = D_0 e_0.   (2.2.34)
We need to find upper bounds on the norms ||D_0|| and ||E_0||. Using (2.2.10) and (2.2.33)
we get in turn

||D_0|| ≤ ||A_0^{-1} F'(x*)|| ||F'(x*)^{-1}([x_0, G(x_0); F] - [x*, x_0; F])||
       ≤ (L/2)(||x_0 - x*|| + ||G(x_0) - x_0||) / (1 - (L_0/2)(N+1)||e_0||)   (2.2.36)
       ≤ (L/2)(N+2)||x_0 - x*|| / (1 - (L_0/2)(N+1)||x_0 - x*||) = L(N+2)||e_0|| / (2(1 - (L_0/2)(N+1)||e_0||))

and

||E_0|| ≤ ||A_0^{-1} F'(x*)|| ||F'(x*)^{-1}([x*, x_0; F] - [x*, z_0; F])||
       ≤ (L/2)||z_0 - x_0|| / (1 - (L_0/2)(N+1)||x_0 - x*||) = (aL/2)||y_0 - x_0|| / (1 - (L_0/2)(N+1)||e_0||)
       ≤ (aL/2)(||u_0|| + ||e_0||) / (1 - (L_0/2)(N+1)||e_0||)
       ≤ (aL/2)(||D_0|| + 1)||e_0|| / (1 - (L_0/2)(N+1)||e_0||)   (2.2.37)
       ≤ (aL / (2(1 - (L_0/2)(N+1)||e_0||))) [(L/2)(N+2)||e_0|| / (1 - (L_0/2)(N+1)||e_0||) + 1] ||e_0||
       ≤ aL^2(N+2)||e_0||^2 / (4(1 - (L_0/2)(N+1)||e_0||)^2) + aL||e_0|| / (2(1 - (L_0/2)(N+1)||e_0||)).
||A_k^{-1} F'(x*)|| ≤ 1/(1 - (L_0/2)(N+1)||e_k||) < 1/(1 - (L_0/2)(N+1)R);   (2.2.39)
= |∫_0^1 (e^{t x + (1-t) y} - 1) dt|
= |∫_0^1 (t x + (1-t) y)(1 + (t x + (1-t) y)/2! + (t x + (1-t) y)^2/3! + ···) dt|   (2.3.3)
≤ |∫_0^1 (t x + (1-t) y)(1 + 1/2! + 1/3! + ···) dt|
≤ ((e - 1)/2)(|x - x*| + |y - x*|).
That is to say, the Lipschitz condition (2.2.10) and the center-Lipschitz condition (2.2.11)
are true for L = e and L0 = e − 1, respectively.
Choose G(x) = x - h F(x), where h ∈ (0, 2/(e-1)) is a constant. Then, G : Ω ⊆ X → X is
continuous and such that G(x*) = x*. Moreover, for any x ∈ Ω, we have

|G(x) - G(x*)| = |x - h(e^x - 1)| = |x - h(x + x^2/2! + x^3/3! + ···)|
              = |(1 - h)x - h(x^2/2! + x^3/3! + ···)| ≤ (|1 - h| + h|1/2! + 1/3! + ···|)|x|
              = (|1 - h| + h(e - 2))|x - x*|,   (2.3.4)
which means condition (2.2.12) is true for N = |1 − h| + h(e − 2) and N ∈ (0, 1).
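As a quick numerical illustration (not part of the proof), N = |1 - h| + h(e - 2) indeed lies in (0, 1) for sample values of h in (0, 2/(e - 1)):

```python
import math

# N from (2.3.4); 2/(e-1) ~ 1.1639 is the upper end of the h-range.
def N(h):
    return abs(1 - h) + h * (math.e - 2)

for h in (0.1, 0.5, 1.0, 1.15):
    assert 0 < N(h) < 1
```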
Table 2.3.1. The comparison results of R0 and R for Example 2.3.1 using various
choices of a, b, c and h
Table 2.3.1 gives the comparison results of R_0 and R for Example 2.3.1 using various
choices of a, b, c and h. They show that the convergence radius R obtained by using both
conditions (2.2.10) and (2.2.11) is always larger than the one obtained by using only
condition (2.2.10). The same result is true for R_0.
Let us set h = 1 and choose x_0 = 0.11. Suppose sequence {x_n} is generated by (STTM).
Table 2.3.2 gives the error estimates for Example 2.3.1 using various choices of a, b and c,
which show that all error estimates given by (2.2.14) are satisfied. Moreover, the
error estimates in the case a = b = c = 1 are the smallest among all choices of a, b, c.
Example 2.3.2. Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1],
equipped with the max norm, and Ω = U(0, 1). Define the function F on Ω by

F(x)(s) = x(s) - 5 ∫_0^1 s t x^3(t) dt.   (2.3.5)
Table 2.3.2. The comparison results of error estimates for Example 2.3.1 using
various choices of a, b and c
Then, we have

[F'(x)y](s) = y(s) - 15 ∫_0^1 s t x^2(t) y(t) dt,   for all y ∈ Ω.   (2.3.7)
In the last example we are not interested in checking whether the hypotheses of Theorem 2.2.3
are satisfied, but in comparing the numerical behavior of (STTM) with that of earlier methods.
Example 2.3.3. In this example we present an application of the previous analysis to the
significant Chandrasekhar integral equation [7]:

x(s) = 1 + (s/4) x(s) ∫_0^1 x(t)/(s + t) dt,   s ∈ [0, 1].   (2.3.11)

Integral equations of the form (2.3.11) are very important and appear in the areas of neutron
transport, radiative transfer and the kinetic theory of gases. We refer the interested reader
to [1, 11, 17], where a detailed description of the physical phenomena described by (2.3.11)
can be found. We determine where a solution is located, along with its region of uniqueness.
Later, the solution is approximated by an iterative method of (STTM).
Note that solving (2.3.11) is equivalent to solving F(x) = 0, where F : C[0, 1] → C[0, 1]
and

[F(x)](s) = x(s) - 1 - (s/4) x(s) ∫_0^1 x(t)/(s + t) dt,   s ∈ [0, 1].   (2.3.12)
To obtain a numerical solution of (2.3.11), we first discretize the problem; after testing
several numbers of nodes, we find it convenient to approximate the integral by a Gauss–Legendre
numerical quadrature with eight nodes (see also [1], [11], [17]):

∫_0^1 f(t) dt ≈ Σ_{j=1}^{8} w_j f(t_j),
where
t1 = 0.019855072, t2 = 0.101666761, t3 = 0.237233795, t4 = 0.408282679,
t5 = 0.591717321, t6 = 0.762766205, t7 = 0.898333239, t8 = 0.980144928,
w1 = 0.050614268, w2 = 0.111190517, w3 = 0.156853323, w4 = 0.181341892,
w5 = 0.181341892, w6 = 0.156853323, w7 = 0.111190517, w8 = 0.050614268.
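As a sanity check, this eight-node rule on [0, 1] is exact for polynomials of degree up to 15; testing it on f(t) = t^5, whose integral is 1/6:

```python
# Nodes t_j and weights w_j of the 8-point Gauss-Legendre rule on [0,1],
# copied from the text above.
t = [0.019855072, 0.101666761, 0.237233795, 0.408282679,
     0.591717321, 0.762766205, 0.898333239, 0.980144928]
w = [0.050614268, 0.111190517, 0.156853323, 0.181341892,
     0.181341892, 0.156853323, 0.111190517, 0.050614268]

approx = sum(wj * tj**5 for wj, tj in zip(w, t))
assert abs(sum(w) - 1.0) < 1e-8     # weights sum to the interval length
assert abs(approx - 1/6) < 1e-7     # exact value of the integral of t^5
```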
Table 2.3.3. The comparison results of kxn+1 − xn k for Example 2.3.3 using various
methods
Denote now x = (x_1, x_2, ..., x_8)^T, 1 = (1, 1, ..., 1)^T, A = (a_ij), and write the last nonlinear
system in the matrix form

x = 1 + (1/4) x ⊙ (A x),   (2.3.13)

where ⊙ represents the componentwise product. Set G(x) = x. We choose x_0 = (1, 1, ..., 1)^T and
x_{-1} = (0.99, 0.99, ..., 0.99)^T, and assume sequence {x_n} is generated by (STTM) (or (CSTM))
with different choices of the parameters a, b and c. Table 2.3.3 gives the comparison results for
||x_{n+1} - x_n|| equipped with the max-norm for this example, which show that (STTM) is faster
than (CSTM). Here, we performed the computations with Maple 11 on a computer equipped with
an Intel(R) Core(TM) i3-2310M CPU.
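The discretized system (2.3.13) can also be solved by plain fixed-point iteration; the entries a_ij = t_i w_j/(t_i + t_j) used below are an assumption, since the excerpt introduces A = (a_ij) without stating the formula:

```python
# Fixed-point iteration for x = 1 + (1/4) x * (A x) (componentwise),
# with the assumed discretization a_ij = t_i w_j / (t_i + t_j).
t = [0.019855072, 0.101666761, 0.237233795, 0.408282679,
     0.591717321, 0.762766205, 0.898333239, 0.980144928]
w = [0.050614268, 0.111190517, 0.156853323, 0.181341892,
     0.181341892, 0.156853323, 0.111190517, 0.050614268]
A = [[t[i] * w[j] / (t[i] + t[j]) for j in range(8)] for i in range(8)]

x = [1.0] * 8                       # x_0 = (1, ..., 1)^T
for _ in range(100):
    Ax = [sum(A[i][j] * x[j] for j in range(8)) for i in range(8)]
    x = [1.0 + 0.25 * x[i] * Ax[i] for i in range(8)]

# Residual of the discretized equation at the computed vector.
residual = max(abs(x[i] - 1.0 - 0.25 * x[i] *
                   sum(A[i][j] * x[j] for j in range(8))) for i in range(8))
```

The computed vector increases with the node index, as expected from the kernel s/(s + t).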
In future results we shall use higher precision instead of a fixed number of digits in all
computations. We shall also use an adaptive arithmetic in each step of the iterative method.
Note that this higher precision is only necessary in the last step of the iterative process.
Table 2.3.3 shows the usefulness of (STTM) since it is faster than other relevant methods
in the literature like (CSTM).
References
[1] Argyros, I.K., Polynomial operator equations in abstract spaces and applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, 1998, Boca Raton, Florida, U.S.A.
[2] Argyros, I.K., On the Newton–Kantorovich hypothesis for solving equations, J. Com-
put. Appl. Math., 169 (2004), 315–332.
[3] Argyros, I.K., A unifying local–semilocal convergence analysis and applications for
two–point Newton–like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374–397.
[4] Argyros, I.K., New sufficient convergence conditions for the Secant method, Czechoslovak
Math. J., 55 (2005), 175–187.
[6] Argyros, I.K., Hilout, S., On the weakening of the convergence of Newton’s method
using recurrent functions, J. Complexity, 25 (2009), 530–543.
[7] Argyros, I.K., Hilout, S., On the convergence of two-step Newton-type methods of
high efficiency order, Applicationes Mathematicae, 36(4) (2009), 465-499.
[8] Argyros, I.K., Ezquerro, J., Gutiérrez, J.M., Hernández, M., Hilout, S., On the semilo-
cal convergence of efficient Chebyshev-Secant-type methods, J. Comput. Appl. Math.,
235 (2011), 3195–3206.
[9] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and Its Appli-
cations, CRC Press/Taylor and Francis Group, New York, 2012.
[10] Argyros, I. K., George, S. Ball convergence for Steffensen-type fourth-order methods.
Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 37–42.
[11] Argyros, I. K., González, D., Local convergence for an improved Jarratt-type method
in Banach space. Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 20–25.
[12] Catinas, E., On some iterative methods for solving nonlinear equations, Revue d’ anal-
yse numerique et de theorie de l’approximation, 23(1) (1994), 47–53.
[14] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Non-
linear Funct. Anal. App. (L.B. Rall, ed.), Academic Press, New York, (1971), 425–
472.
[16] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by Secant–like method, Appl. Math. Comp., 169 (2005), 926–942.
[17] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant–like methods for solving non-
linear integral equations of the Hammerstein type, J. Comp. Appl. Math., 115 (2000),
245–254.
[18] Grau, M., Noguera, M., A variant of Cauchy’s method with accelerated fifth-order
convergence. Appl. Math. Lett., 17 (2004), 509–517.
[19] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[20] Laasonen, P., Ein überquadratisch konvergenter iterativer algorithmus, Ann. Acad. Sci.
Fenn. Ser I, 450 (1969), 1–10.
[22] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.
[23] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.
[25] Petković, M.S., Ilić, S., Džunić, J., Derivative free two-point methods with and without
memory for solving nonlinear equations, Appl. Math. Comp., 217 (2010), 1887–1895.
[26] Schmidt, J.W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Hungar.,
9 (1978), 241–247.
[27] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545–557.
[28] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153–174.
Chapter 3

Semilocal Convergence of Halley's Method
3.1. Introduction
Let X and Y be Banach spaces and D be a non-empty, open and convex subset of X. The aim
of this chapter is to show, using a numerical example, that the convergence theorem of Ref.
[15] is false under the stated hypotheses. Reference [15] was concerned with the semilocal
convergence of Halley's method for solving a nonlinear operator equation

F(x) = 0,   (3.1.1)

where L_F(x) = (1/2) F'(x)^{-1} F''(x) F'(x)^{-1} F(x). Let U(x, R) and Ū(x, R) stand, respectively, for
the open and closed balls in X with center x and radius R > 0. Halley's method is cubically
convergent and has been studied extensively (see [1-13] and the references therein). In
particular, recurrence relations have been used by Parida [13], Parida and Gupta [14], and Chun,
Stănică and Neta [9], together with different continuity conditions on the second Fréchet
derivative F'' of F (such as F'' being Lipschitz or Hölder continuous), to provide a semilocal
convergence analysis for third-order methods such as Halley's method, Chebyshev's method,
super-Halley's method and other high-order methods. The sufficient conditions usually
associated with the semilocal convergence of Halley's method are the (C) conditions [1, 5],
given by
(C1) ||F'(x_0)^{-1} F(x_0)|| ≤ η,

(C2) ||F'(x_0)^{-1} F''(x)|| ≤ β,

(C3) ||F'(x_0)^{-1} (F''(x) - F''(y))|| ≤ M ||x - y||,

(C4) h = 3 M^2 η / ((β^2 + 2M)^{3/2} - β(β^2 + 3M)) ≤ 1,

(C5) U(x_0, R_0) ⊆ D, where R_0 is the smallest positive root of

p(t) = (M/6) t^3 + (β/2) t^2 - t + η.
Similar conditions, but with different (C4) and (C5), have been given by us in [5, Theorem
2.3], where the radius corresponding to R_0 is given in closed form. There are many interesting
examples in the literature (see [3, 4, 11, 15] and Example 3.5.2) where the Lipschitz
condition (C3) (used in [9, 13, 14]) is violated but the center-Lipschitz condition

L t^2 + β t - 1 = 0.   (3.1.6)
Semilocal Convergence of Halley’s Method 37
Then, the Halley sequence {x_k} generated by (3.1.2) remains in the open ball U(x_0, R), and
converges to the unique solution x* ∈ U(x_0, R) of Eq. (3.1.1). Moreover, the following error
estimate holds:

||x* - x_k|| ≤ (a / (c(1 - τ)γ)) Σ_{i=k+1}^{∞} γ^{2^i},   (3.1.7)

where a = βη, c = 1/R and γ = a(a + 4)/(2 - 3a)^2.
In the present chapter we expand the applicability of Halley's method using (3.1.3)
instead of (C3). The chapter is organized as follows: in Section 3.2 we present a
counterexample to show that the result in [15] using (3.1.3) is false. The mistakes in the proof
are pointed out in Section 3.3. Section 3.4 contains our semilocal convergence analysis of
Halley's method using (3.1.3). Numerical examples are given in the concluding Section
3.5.
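In the scalar case one step of Halley's method reads x_{k+1} = x_k - F(x_k)/(F'(x_k)(1 - L_F(x_k))), which is exactly the quotient appearing in (3.2.4) below; a minimal sketch:

```python
# Scalar sketch of Halley's method, using
# L_F(x) = F''(x) F(x) / (2 F'(x)^2) from the text.
def halley(f, df, d2f, x0, tol=1e-12, max_iter=25):
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if abs(fx) < tol:
            break
        Lf = d2f(x) * fx / (2 * dfx**2)
        x = x - fx / (dfx * (1 - Lf))
    return x

root = halley(lambda x: x**3 - 2, lambda x: 3*x**2, lambda x: 6*x, 1.0)
```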
So, F(x_0) = 3, F'(x_0) = 12, F''(x_0) = 12. We can choose η = 1/4 and β = 1 in Theorem 3.1.1.
Moreover, we have for any x ∈ D that

Hence, the center-Lipschitz condition (3.1.3) is true for the constant L = 10. We can also verify
that the condition (1/2)βη = 1/8 < τ = 0.134065... is true. By (3.1.6), we get
R = (√(β^2 + 4L) - β)/(2L) = (√41 - 1)/20 = 0.270156....   (3.2.3)
Then, condition U(x_0, R) = [x_0 - R, x_0 + R] ≈ [0.729844, 1.270156] ⊂ D is also true. Hence,
all conditions in Theorem 3.1.1 are satisfied. However, we can verify that the point x_1
generated by Halley's method (3.1.2) does not remain in the open ball U(x_0, R). In fact,
we have that

|x_1 - x_0| = |F'(x_0)^{-1} F(x_0)| / |1 - (1/2) F'(x_0)^{-1} F''(x_0) F'(x_0)^{-1} F(x_0)| = 2/7 = 0.285714... > R.   (3.2.4)
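The arithmetic of (3.2.4) is easy to reproduce from the stated values F(x_0) = 3, F'(x_0) = 12, F''(x_0) = 12, β = 1, η = 1/4 and L = 10:

```python
import math

# First Halley step length |x1 - x0| and the radius R from (3.2.3).
LF0  = 0.5 * (1/12) * 12 * (1/12) * 3        # L_F(x0) = 1/8
step = (3/12) / (1 - LF0)                    # = 2/7 = 0.285714...
R    = (math.sqrt(1 + 4*10) - 1) / (2*10)    # = (sqrt(41)-1)/20 = 0.270156...

assert abs(step - 2/7) < 1e-12
assert step > R                              # so x1 leaves U(x0, R)
```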
and

r = lim_{k→∞} r_k,   (3.3.5)

provided the limit exists. Using induction, Ref. [15] shows that the following relation holds
for k ≥ 0 if a_k ≥ 1:

a_{k+1} = 1/(1 - c r_k).   (3.3.6)

Next, Ref. [15] claims that from the initial relations it follows by induction that, for k =
0, 1, 2, ...,

c a_k d_k = c a_k^2 b_k / (1 - c_k) = 2 c_k / (1 - c_k).   (3.3.7)
Here, we point out that the relation (3.3.7) is not always true. In fact, for k = 0, the first and
second equalities of (3.3.7) are obtained from

d_0 = a_0 b_0 / (1 - c_0)   (3.3.8)

and

c_0 = (c/2) a_0^2 b_0,   (3.3.9)

respectively. We can easily verify that (3.3.8) is really true from (3.3.2). However, (3.3.9) is not
true in general. Otherwise, using (3.3.2) and (3.3.9), we demand that
a = bc. (3.3.10)
Clearly, this condition is introduced improperly and will be violated frequently. Since some
lemmas of Ref. [15] are established on the basis of the above basic relation (3.3.7), they
will not always be true. Therefore, the main theorem of Ref. [15] (Theorem 3.1.1 stated
above) will not always be true, because it is based on these lemmas.
increases monotonically on [0, (2 - √2)/4), and there exists a unique point τ ≈ 0.134065 ∈
(0, (2 - √2)/4) such that q(τ) = 1.
and real sequences {a_k}, {b_k}, {c_k} and {d_k} be defined by (3.3.2)-(3.3.3). Assume that

c_1 = c^2 b^2 (a + 4) / (2 (2 - a - 2bc)^2) < τ,   (3.4.4)

where τ is the constant defined in Lemma 3.4.1. Then, sequence {c_k} is bounded, strictly
decreasing, and c_k ∈ (0, τ) for all k ≥ 1. Moreover, we have that

c_{k+1} = c_k^2 (c_k + 2) / (1 - 3 c_k)^2,   k = 1, 2, ....   (3.4.5)
Proof. By conditions (3.4.3) and (3.4.4), a_1, b_1, c_1 and d_1 are well defined. Since

c a_1 d_1 = c a_1^2 b_1 / (1 - c_1) = 2 c_1 / (1 - c_1) < 1,   (3.4.6)

a_2, b_2 and c_2 are well defined. We have that

thus (3.4.5) holds for k = n + 1. So, c_{n+2} < c_{n+1} < ··· < c_2 < c_1 < 1, and d_{n+2} is well defined.
That completes the induction and the proof of the lemma.
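The recurrence (3.4.5) can be explored numerically; the following illustration of Lemma 3.4.2 uses the sample value c_1 = 0.13 < τ:

```python
# Iterating c_{k+1} = c_k^2 (c_k + 2) / (1 - 3 c_k)^2 from (3.4.5):
# below tau ~ 0.134065 the sequence decreases rapidly to 0.
c, prev = 0.13, float("inf")
for _ in range(8):
    assert 0 < c < prev        # strictly decreasing and positive
    prev = c
    c = c**2 * (c + 2) / (1 - 3*c)**2

assert c < 1e-10               # essentially quadratic decay
```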
Lemma 3.4.3. Under the assumptions of Lemma 3.4.2, if we set γ = c_2/c_1, then for k ≥ 0:

(i) c_{k+1} ≤ c_1 γ^{2^k - 1};

(ii) d_{k+1} ≤ (2 c_1 / (c a_1 (1 - c_1))) γ^{2^k - 1};

(iii) r - r_k ≤ Σ_{i=k+1}^{∞} (2 c_1 / (c a_1 (1 - c_1))) γ^{2^{i-1} - 1},

where r is defined in (3.3.5).
Proof. Obviously, (i) is true for k = 0. By Lemma 3.4.2, for any k ≥ 1, we have that

c_{k+1} = c_k^2 (c_k + 2)/(1 - 3c_k)^2 ≤ c_k^2 (c_{k-1} + 2)/(1 - 3c_{k-1})^2 ≤ ··· ≤ c_k^2 (c_1 + 2)/(1 - 3c_1)^2 = λ c_k^2,   (3.4.12)

where

λ = (c_1 + 2)/(1 - 3c_1)^2 = c_1^2 (c_1 + 2)/((1 - 3c_1)^2 c_1^2) = c_2/c_1^2 = γ/c_1.   (3.4.13)

Multiplying both sides of (3.4.12) by λ yields

λ c_{k+1} ≤ (λ c_k)^2 ≤ (λ c_{k-1})^{2^2} ≤ ··· ≤ (λ c_1)^{2^k} = γ^{2^k},   (3.4.14)

which shows (i). Consequently, for k ≥ 1, we have

d_{k+1} = c a_{k+1} d_{k+1} / (c a_{k+1}) = c a_{k+1}^2 b_{k+1} / (c a_{k+1} (1 - c_{k+1})) = 2 c_{k+1} / (c a_{k+1} (1 - c_{k+1})) ≤ (2 c_1 / (c a_1 (1 - c_1))) γ^{2^k - 1}.   (3.4.15)

That is, (ii) is true for k ≥ 1. For the case k = 0, we have that

d_1 = a_1 b_1 / (1 - c_1) = c a_1^2 b_1 / (c a_1 (1 - c_1)) = 2 c_1 / (c a_1 (1 - c_1)),   (3.4.16)

which means (ii) is also true for k = 0. Moreover, (iii) is true by using (ii) and the definitions
of r_k and r. The proof is complete.
Lemma 3.4.4. Under the assumptions of Lemma 3.4.2, if we set a = βη, b = η, c = 1/R, then
r = lim_{k→∞} r_k < R.
Proof. By the definition of r_k and Lemmas 3.4.1-3.4.3, for any k ≥ 1, we have that

r_k ≤ d_0 + Σ_{i=1}^{k} (2c_1/(c a_1 (1 - c_1))) γ^{2^{i-1} - 1} < d_0 + (2 R c_1/(a_1 (1 - c_1)))(1 + γ/(1 - γ^2))
    = d_0 + (2(1 - c d_0) R c_1/(1 - c_1)) (1 + (c_1(c_1 + 2)/(1 - 3c_1)^2) / (1 - (c_1(c_1 + 2)/(1 - 3c_1)^2)^2))   (3.4.17)
    = d_0 + q(c_1)(R - d_0) < d_0 + q(τ)(R - d_0) = R.

Here, we used R = 1/c > b/(1 - a/2) = d_0 by (3.4.3). Hence, r = lim_{k→∞} r_k exists and r < R. The
proof is complete.
Lemma 3.4.5. Set a = βη, b = η and c = 1/R. Let {a_k}, {b_k}, {c_k}, {d_k} be the sequences
generated by (3.3.2)-(3.3.3). Suppose that conditions (3.4.3) and (3.4.4) are true. Then, for
any k ≥ 0, we have
(i) F'(x_k)^{-1} exists and ||F'(x_k)^{-1} F'(x_0)|| ≤ a_k;
(ii) ||F'(x_0)^{-1} F(x_k)|| ≤ b_k;
(iii) [I - L_F(x_k)]^{-1} exists and ||L_F(x_k)|| ≤ c_k;
(iv) ||x_{k+1} - x_k|| ≤ d_k;
(v) ||x_{k+1} - x_0|| ≤ r_k < R.
Proof. The proof is similar to the one in [15], and we shall omit it.
Proof. Using Lemma 3.4.5, we have x_k ∈ U(x_0, R) for all k ≥ 0. From Lemma 3.4.5 and
Lemma 3.4.3, we get for any integers k, m ≥ 1

||x_{k+m} - x_k|| ≤ Σ_{i=k}^{k+m-1} ||x_{i+1} - x_i|| ≤ Σ_{i=k}^{k+m-1} d_i ≤ Σ_{i=k}^{k+m-1} (2c_1/(c a_1 (1 - c_1))) γ^{2^{i-1} - 1}
               ≤ (2c_1/(c a_1 (1 - c_1) γ)) Σ_{i=k}^{∞} γ^{2^{i-1}} ≤ (2c_1/(c a_1 (1 - c_1) γ)) γ^{2^{k-1}}/(1 - γ^{2^{k-1}}).   (3.4.19)
That is, {x_k} is a Cauchy sequence. So, there exists a point x* ∈ Ū(x_0, R) such that {x_k}
converges to x* as k → ∞. Using Lemma 3.4.3, clearly we have d_k → 0 as k → ∞. Using
Lemma 3.4.5 and Lemma 3.4.2, for any k ≥ 0, we have

||F'(x_0)^{-1} F(x_{k+1})|| ≤ b_{k+1} = c(1 + c_k^2) d_k^2 ≤ c(1 + c_1^2) d_k^2 → 0 as k → ∞.   (3.4.20)

The continuity of F gives

||F'(x_0)^{-1} F(x*)|| = lim_{k→∞} ||F'(x_0)^{-1} F(x_{k+1})|| = 0,   (3.4.21)

that is, F(x*) = 0. By letting m → ∞ in (3.4.19), (3.4.18) is obtained immediately.
Finally, we can show the uniqueness of x* in U(x_0, R) by using the same technique as
in [2, 3, 4, 5, 15]. The proof is complete.
Remark 3.4.7. (a) Let us compare our sufficient convergence condition (3.4.3) with condition
(C4). Condition (3.4.3) can be rewritten as

h_0 = ((2β + √(β^2 + 4L))/2) η < 1   (3.4.22)

if we use the choices of a, b, c given in Lemma 3.4.5 and R given by (3.2.3). Then, we have
that

h_0 ≤ h.   (3.4.23)

Estimate (3.4.23) shows that one of our convergence conditions is at least as weak as (C4).
However, a direct comparison between (3.4.4) and (C4) is not practical. A similar favorable
comparison can be made with all other sufficient convergence conditions of the form
(C4) already in the literature using M instead of L (see [4, 5, 10, 11, 13, 14, 15] and the
references therein).
(b) It is possible that (C3) is satisfied (hence (3.1.3) too) but not (C4) (or (C5)). In this
case we test to see if our conditions are satisfied. If they are, then although we predict only
quadratic convergence of Halley's method (3.1.2) (see, e.g., Lemma 3.4.3), after a certain
iterate x_N, where N is a natural number, (C4) and (C5) will be satisfied for x_0 = x_N.
Therefore, the usual error estimates for the cubic convergence of Halley's method (3.1.2)
will hold. We refer the reader to [3, 4], where we show how to choose N in the case of
Newton's method. The N for Halley's method (3.1.2) can be found in an analogous way.
Hence, the weak Lipschitz condition (3.1.3) is true for the constant L = 4. By (3.1.6), we get

R = (√(β^2 + 4L) - β)/(2L) = (√17 - 1)/8 = 0.390388....   (3.5.3)
Then, condition U(x_0, R) = [x_0 - R, x_0 + R] ≈ [0.609612, 1.390388] ⊂ D is true. We can
also verify that the conditions 2 - a - 2bc = 2 - βη - 2η/R ≈ 1.326458 > 0 and c_1 =
c^2 b^2 (a + 4)/(2(2 - a - 2bc)^2) ≈ 0.092729212 < τ = 0.134065... are true. Hence, all conditions
in Theorem 3.4.6 are satisfied.
Example 3.5.2. In this example we provide an application of our results to a special nonlinear
Hammerstein integral equation of the second kind. Consider the integral equation

u(s) = f(s) + λ ∫_{a_0}^{b_0} k(s, t) u(t)^{2 + 1/n} dt,   λ ∈ R, n ∈ N,   (3.5.4)

where f is a given continuous function satisfying f(s) > 0 for s ∈ [a_0, b_0] and the kernel k is
continuous and positive on [a_0, b_0] × [a_0, b_0].
Let X = Y = C[a_0, b_0] and D = {u ∈ C[a_0, b_0] : u(s) ≥ 0, s ∈ [a_0, b_0]}. Define F : D → Y
by

F(u)(s) = u(s) - f(s) - λ ∫_{a_0}^{b_0} k(s, t) u(t)^{2 + 1/n} dt,   s ∈ [a_0, b_0].   (3.5.5)

We use the max-norm. The first and second derivatives of F are given by

F'(u)v(s) = v(s) - λ(2 + 1/n) ∫_{a_0}^{b_0} k(s, t) u(t)^{1 + 1/n} v(t) dt,   v ∈ D, s ∈ [a_0, b_0],   (3.5.6)

and

F''(u)(vw)(s) = -λ(1 + 1/n)(2 + 1/n) ∫_{a_0}^{b_0} k(s, t) u(t)^{1/n} (vw)(t) dt,   v, w ∈ D, s ∈ [a_0, b_0],   (3.5.7)

respectively.
Let x_0(t) = f(t), α = min_{s∈[a_0,b_0]} f(s), δ = max_{s∈[a_0,b_0]} f(s) and M =
max_{s∈[a_0,b_0]} ∫_{a_0}^{b_0} |k(s, t)| dt. Then, for any v, w ∈ D,

||[F''(x) - F''(x_0)](vw)|| ≤ |λ|(1 + 1/n)(2 + 1/n) max_{s∈[a_0,b_0]} ∫_{a_0}^{b_0} |k(s, t)| |x(t)^{1/n} - f(t)^{1/n}| dt ||vw||
  = |λ|(1 + 1/n)(2 + 1/n) max_{s∈[a_0,b_0]} ∫_{a_0}^{b_0} |k(s, t)| |x(t) - f(t)| / (x(t)^{(n-1)/n} + x(t)^{(n-2)/n} f(t)^{1/n} + ··· + f(t)^{(n-1)/n}) dt ||vw||
  ≤ |λ|(1 + 1/n)(2 + 1/n) max_{s∈[a_0,b_0]} ∫_{a_0}^{b_0} |k(s, t)| |x(t) - f(t)| / f(t)^{(n-1)/n} dt ||vw||
  ≤ (|λ|(1 + 1/n)(2 + 1/n) / α^{(n-1)/n}) max_{s∈[a_0,b_0]} ∫_{a_0}^{b_0} |k(s, t)| |x(t) - f(t)| dt ||vw||
  ≤ (|λ|(1 + 1/n)(2 + 1/n) M / α^{(n-1)/n}) ||x - x_0|| ||vw||,   (3.5.8)

which means

||F''(x) - F''(x_0)|| ≤ (|λ|(1 + 1/n)(2 + 1/n) M / α^{(n-1)/n}) ||x - x_0||.   (3.5.9)
Next, we give a bound for ||F'(x_0)^{-1}||. Using (3.5.6), we have that

||I - F'(x_0)|| ≤ |λ|(2 + 1/n) δ^{1 + 1/n} M.   (3.5.10)

It follows from the Banach theorem that F'(x_0)^{-1} exists if |λ|(2 + 1/n) δ^{1 + 1/n} M < 1, and

||F'(x_0)^{-1}|| ≤ 1 / (1 - |λ|(2 + 1/n) δ^{1 + 1/n} M).   (3.5.11)

On the other hand, we have from (3.5.5) and (3.5.7) that ||F(x_0)|| ≤ |λ| δ^{2 + 1/n} M and
||F''(x_0)|| ≤ |λ|(1 + 1/n)(2 + 1/n) δ^{1/n} M. Hence, if |λ|(2 + 1/n) δ^{1 + 1/n} M < 1, the weak Lipschitz
condition (3.1.3) is true for

L = |λ|(1 + 1/n)(2 + 1/n) M / (α^{(n-1)/n} [1 - |λ|(2 + 1/n) δ^{1 + 1/n} M]).   (3.5.12)
Next we let [a_0, b_0] = [0, 1], n = 2, f(s) = 1, λ = 1.1, and k(s, t) the Green's kernel on
[0, 1] × [0, 1] defined by

G(s, t) = { t(1 - s), t ≤ s;   s(1 - t), s ≤ t.   (3.5.14)

Consider the following particular case of (3.5.4):

u(s) = f(s) + 1.1 ∫_0^1 G(s, t) u(t)^{5/2} dt,   s ∈ [0, 1].   (3.5.15)

η = 22/105,   β = 11/14,   L = 11/14.   (3.5.16)
Therefore 2 − a − 2bc ≈ 1.264456 > 0, τ − c1 ≈ 0.027938 > 0 and R ≈ 0.733988. Hence,
U(x0 , R) ⊂ D. Thus, all conditions of Theorem 3.4.6 are satisfied. Consequently, sequence
{xk } generated by Halley’s method (3.1.2) with initial point x0 converges to the unique
solution x? of Eq. (3.5.15) on U(x0 , 0.733988). The Lipschitz condition (C3 ) is not satisfied
[3, 4, 11, 15]. Hence, we have expanded the applicability of Halley’s method. Note also
that verifying (3.1.3) is less expensive than verifying (C3 ).
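The constants of Example 3.5.2 can be re-derived in a few lines (τ ≈ 0.134065 is the constant of Lemma 3.4.1):

```python
import math

eta, beta, L, tau = 22/105, 11/14, 11/14, 0.134065
R = (math.sqrt(beta**2 + 4*L) - beta) / (2*L)          # radius from (3.1.6)
a, b, c = beta * eta, eta, 1/R
c1 = c**2 * b**2 * (a + 4) / (2 * (2 - a - 2*b*c)**2)  # condition (3.4.4)

assert abs(R - 0.733988) < 1e-5
assert 2 - a - 2*b*c > 0
assert abs((tau - c1) - 0.027938) < 1e-4
```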
References
[1] Amat, S., Busquier, S., Third-order iterative methods under Kantorovich conditions, J.
Math. Anal. Appl., 336 (2007), 243–261.
[2] Argyros, I.K., The convergence of Halley-Chebyshev type method under Newton-
Kantorovich hypotheses, Appl. Math. Lett. 6 (1993), 71–74.
[3] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point, Rev. Anal. Numer. Theor. Approx,
36 (2007), 123–138.
[4] Argyros, I.K., Computational theory of iterative methods, Series: Studies in Computational
Mathematics 15, Editors, C.K. Chui and L. Wuytack, Elsevier Publ. Co., New
York, USA, 2007.
[5] Argyros, I.K., Cho, Y.J., Hilout, S., On the semilocal convergence of the Halley
method using recurrent functions, J. Appl. Math. Computing. 37 (2011), 221–246.
[6] Argyros, I.K., Ren, H.M., Ball convergence theorems for Halley’s method in Banach
spaces, J. Appl. Math. Computing 38 (2012), 453–465.
[7] Argyros, I. K., and S. George, Ball convergence for Steffensen-type fourth-order
methods. Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 37–42.
[8] Argyros, I. K., and González, D., Local convergence for an improved Jarratt-type
method in Banach space. Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 20–25.
[9] Chun, C., Stǎnicǎ, P., Neta, B., Third-order family of methods in Banach spaces,
Comp. Math. with App., 61 (2011), 1665–1675.
[10] Deuflhard, P., Newton Methods for Nonlinear Problems: Affine Invariance and Adap-
tive Algorithms, Springer-Verlag, Berlin, Heidelberg, 2004.
[11] Gutiérrez, J.M., Hernández, M.A., Newton’s method under weak Kantorovich condi-
tions, IMA J. Numer. Anal. 20 (2000), 521–532.
[13] Parida, P.K., Study of third order methods for nonlinear equations in Banach spaces,
PhD dissertation, IIT Kharagpur, India, 2007.
[15] Xu, X.B., Ling, Y.H., Semilocal convergence for Halley’s method under weak Lips-
chitz condition, App. Math. Comp. 215 (2009), 3057–3067.
Chapter 4

Newton's Method
4.1. Introduction
In this chapter, we are concerned with the problem of approximating a locally unique solution x* of the equation

F(x) = 0,   (4.1.1)
where, F is a twice Fréchet differentiable operator defined on a convex subset D of a
Banach space X with values in a Banach space Y . Numerous problems in science and
engineering – such as optimization of chemical processes or multiphase, multicomponent
flow – can be reduced to solving the above equation [7, 8, 9, 14, 15, 16]. Consequently,
solving these equations is an important scientific field of research. For most problems,
finding a closed form solution for the non-linear equation (4.1.1) is not possible. Therefore,
iterative solution techniques are employed for solving these equations. The study of the
convergence analysis of iterative methods is usually divided into two categories: semilocal
and local convergence analysis. The semilocal convergence analysis is based upon
information around an initial point to give criteria ensuring the convergence of the iterative
procedure, while the local convergence analysis is based on information around a
solution to find estimates of the radii of convergence balls.
The most popular iterative method for solving problem (4.1.1) is Newton's method
We shall refer to (C1) – (C5) as the (C) conditions. The following conditions have also
been employed [9, 10, 11, 12, 14, 17]:

From here onwards, conditions (C1), (C2), (C5), (C6) and (C7) are referred to as the (H) conditions.
For the semilocal convergence of Newton's method, the conditions (C1), (C2), (C3)
together with the following sufficient conditions have been given [1, 2, 3, 4, 9, 10, 11, 12,
14, 15, 16, 17, 18]:
η ≤ (4M + K^2 - K √(K^2 + 2M)) / (3M (K + √(K^2 + 2M))),   (4.1.3)

U(x_0, R_1) ⊆ D,   (4.1.4)

P_1(t) = (M/6) t^3 + (K/2) t^2 - t + η.   (4.1.5)
Whereas the conditions (C1), (C2), (C6), (C7) together with

η ≤ (4M_0 + K_0^2 - K_0 √(K_0^2 + 2M_0)) / (3M_0 (K_0 + √(K_0^2 + 2M_0))),   (4.1.6)

U(x_0, R_2) ⊆ D,   (4.1.7)

P_2(t) = (M_0/6) t^3 + (K_0/2) t^2 - t + η   (4.1.8)
have also been used for the semilocal convergence of Newton's method. Conditions (4.1.3)
and (4.1.6) cannot be directly compared with ours given in Sections 4.2 and 4.3, since
we use L_0, which does not appear in (4.1.3) and (4.1.6). However, comparisons can be made
on concrete numerical examples. Let us consider X = Y = R, x_0 = 1 and D = [ζ, 2 - ζ] for
ζ ∈ (0, 1). Define function F on D by
F (x) = x5 − ζ. (4.1.9)
Newton’s Method 49
[Figure 4.1.1 appears here: plot of η, h_1 and h_2 (scaled by 10^{-2}) against ζ ∈ [0, 1], with crossings at ζ ≈ 0.514 and ζ ≈ 0.723.]

Figure 4.1.1. Convergence criteria (4.1.3) and (4.1.6) for the equation (4.1.9). Here, h_1 and
h_2 stand, respectively, for the right-hand sides of conditions (4.1.3) and (4.1.6).
Then, through some simple calculations, the conditions (C2), (C3), (C4), (C5), (C6) and
(C7) yield

η = (1 - ζ)/5,   K = 4(2 - ζ)^3,   M = 12(2 - ζ)^2,   K_0 = 4,
M_0 = 4ζ^2 - 20ζ + 28,   L_0 = 15 - 17ζ + 7ζ^2 - ζ^3.
Figure 4.1.1 plots the criteria (4.1.3) and (4.1.6) for the problem (4.1.9); h_1 stands for the
right-hand side of condition (4.1.3) and h_2 for the right-hand side of condition (4.1.6). We
observe that for ζ < 0.723 the criterion (4.1.3) does not hold, while for ζ < 0.514 the
criterion (4.1.6) does not hold. However, one may verify that the method (4.1.2) is
nevertheless convergent.
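The curves of Figure 4.1.1 can be recomputed directly from the constants listed above:

```python
import math

def eta(z):
    return (1 - z) / 5

def rhs(K, M):
    # Right-hand side of conditions (4.1.3)/(4.1.6).
    s = math.sqrt(K**2 + 2*M)
    return (4*M + K**2 - K*s) / (3*M*(K + s))

def h1(z):
    return rhs(4*(2 - z)**3, 12*(2 - z)**2)   # condition (4.1.3)

def h2(z):
    return rhs(4.0, 4*z**2 - 20*z + 28)       # condition (4.1.6)

# At zeta = 0.6, criterion (4.1.3) fails while (4.1.6) holds;
# the crossing points are near zeta ~ 0.723 and zeta ~ 0.514.
assert h1(0.6) < eta(0.6) < h2(0.6)
assert abs(eta(0.723) - h1(0.723)) < 1e-3
assert abs(eta(0.514) - h2(0.514)) < 1e-3
```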
In this chapter, we expand the applicability of Newton's method (4.1.2) first under the
(C) conditions and secondly under the (H) conditions. The local convergence analysis of
Newton's method (4.1.2) is also carried out under similar conditions.
The chapter is organized as follows. In Sections 4.2 and 4.3, we study majorizing
sequences for the Newton iterates {x_n}. Section 4.4 contains the semilocal convergence
of Newton's method. The local convergence is given in Section 4.5. Finally, numerical
examples are given in Section 4.6.
t_{n+1} - t_n ≤ α^n η   (4.2.7)

and

t* - t_n ≤ α^n η / (1 - α).   (4.2.8)

Proof. We use mathematical induction to prove (4.2.7). Set

α_k = (K + (M/3)(t_{k+1} - t_k)) (t_{k+1} - t_k) / (2(1 - L_0 t_{k+1})).   (4.2.9)
αk ≤ α. (4.2.10)
Estimate (4.2.10) holds for k = 0 by (4.2.4) and the choice of η1 given in (4.2.3). Then, we
also have
t2 − t1 ≤ α(t1 − t0 )
and
t_2 \le t_1 + \alpha(t_1 - t_0) = \eta + \alpha\eta = (1+\alpha)\eta = \frac{1-\alpha^2}{1-\alpha}\,\eta < \frac{\eta}{1-\alpha} = t^{\star\star}.
Let us assume that (4.2.9) holds for all k ≤ n. Then, we also have by (4.2.5) that
tk+1 − tk ≤ αk η
and
t_{k+1} \le \frac{1-\alpha^{k+1}}{1-\alpha}\,\eta < t^{\star\star}.
Then, we must prove that
\left(\frac{K}{2} + \frac{M}{6}\,\alpha^k\eta\right)\alpha^k\eta + \alpha L_0\,\frac{1-\alpha^{k+1}}{1-\alpha}\,\eta - \alpha \le 0.   (4.2.11)
Estimate (4.2.11) motivates us to define recurrent functions f k on [0, 1) for each k = 1, 2, . . .
by
f_k(t) = \frac{1}{2}\left(K + \frac{M}{3}\,t^k\eta\right)t^{k-1}\eta + L_0(1 + t + \cdots + t^k)\eta - 1.   (4.2.12)
We need a relationship between two consecutive functions f_k. Using (4.2.12) we get that
f_{k+1}(t) = f_k(t) + g_k(t),   (4.2.13)
where
g_k(t) = \left[\frac{1}{2}\left(K + \frac{M}{3}\,t^{k+1}\eta\right)t - \frac{1}{2}\left(K + \frac{M}{3}\,t^k\eta\right) + L_0 t^2\right] t^{k-1}\eta
= \left[\frac{1}{2}\left(2L_0 t^2 + K t - K\right) + \frac{M}{6}\,t^k\eta\,(t^2-1)\right] t^{k-1}\eta.   (4.2.14)
In particular, we get that
gk (α) ≤ 0, (4.2.15)
since α ∈ (0, 1) and
2L_0\alpha^2 + K\alpha - K = 0.   (4.2.16)
Let us denote by γ0 and γ1 , respectively, the minimal positive zeros of the following
equations with respect to η
\left[\frac{K}{2} + \frac{M}{6}\,\alpha(t_2-t_1)\right](t_2-t_1) + L_0(1+\alpha)(t_2-t_1) + L_0 t_1 - 1 = 0   (4.2.20)
and
\left[\frac{K}{2} + \frac{M}{6}\,(t_2-t_1)\right](t_2-t_1) + \alpha L_0 t_2 - \alpha = 0.   (4.2.21)
Let us set
γ = min{γ0 , γ1 , 1/L0}. (4.2.22)
Then, we can show the following result.
Proof. As in Lemma 4.2.1 we shall prove (4.2.25) using mathematical induction. We have
by the choice of γ1 that
\alpha_1 = \frac{\left(K + \frac{M}{3}(t_2-t_1)\right)(t_2-t_1)}{2(1 - L_0 t_2)} \le \alpha.   (4.2.26)
Then, it follows from (4.2.26) and (4.2.20) that
0 < t3 − t2 ≤ α(t2 − t1 )
t3 ≤ t2 + α(t2 − t1 )
t3 ≤ t2 + (1 + α)(t2 − t1 ) − (t2 − t1 )
t_3 \le t_1 + \frac{1-\alpha^2}{1-\alpha}\,(t_2-t_1) < t^{\star\star}.
Assume that
0 < αk ≤ α (4.2.27)
holds for all n ≤ k. Then, we get by (4.2.5) and (4.2.27) that
and
t_{k+2} \le t_1 + \frac{1-\alpha^{k+1}}{1-\alpha}\,(t_2-t_1) < t^{\star\star}.   (4.2.29)
Estimate (4.2.27) is true, if k is replaced by k + 1 provided that
\left[\frac{K}{2} + \frac{M}{6}\,(t_{k+2}-t_{k+1})\right](t_{k+2}-t_{k+1}) \le \alpha\,(1 - L_0 t_{k+2})
or
\left[\frac{K}{2} + \frac{M}{6}\,\alpha^k(t_2-t_1)\right]\alpha^k(t_2-t_1) + \alpha L_0\left[t_1 + \frac{1-\alpha^{k+1}}{1-\alpha}\,(t_2-t_1)\right] - \alpha \le 0.   (4.2.30)
Estimate (4.2.30) motivates us to define recurrent functions f k on [0, 1) by
f_k(t) = \left[\frac{K}{2} + \frac{M}{6}\,t^k(t_2-t_1)\right] t^k (t_2-t_1) + t L_0 (1 + t + \cdots + t^k)(t_2-t_1) - t\,(1 - L_0 t_1).   (4.2.31)
We have that
f_{k+1}(t) = f_k(t) + \left[\frac{1}{2}\left(2L_0 t^2 + K t - K\right) + \frac{M}{6}\,t^k (t^2-1)(t_2-t_1)\right] t^k (t_2-t_1).   (4.2.32)
In particular, we have by the choice of α that
f_k(\alpha) \le 0,   (4.2.33)
which, in view of (4.2.32), holds if
f_1(\alpha) \le 0.   (4.2.34)
Lemmas 4.2.1 and 4.2.2 admit the following useful extensions. The proofs are omitted
since they can simply be obtained by replacing η = t1 −t0 with tN+1 −tN where N = 1, 2, . . .
for Lemma 4.2.3 and N = 2, 3, . . . for Lemma 4.2.4.
Remark 4.2.5. Another sequence related to Newton’s method (4.1.2) is given by (see The-
orem 4.4.1)
s_0 = 0, \quad s_1 = \eta, \quad s_2 = s_1 + \frac{K_0 + \frac{M_1}{3}(s_1-s_0)}{2(1 - L_0 s_1)}\,(s_1-s_0)^2,
   (4.2.35)
s_{n+2} = s_{n+1} + \frac{K + \frac{M}{3}(s_{n+1}-s_n)}{2(1 - L_0 s_{n+1})}\,(s_{n+1}-s_n)^2
for each n = 1, 2, . . . and some K0 ∈ (0, K ], M1 ∈ (0, M ]. Then, a simple inductive argument
shows that
sn ≤ tn (4.2.36)
sn+1 − sn ≤ tn+1 − tn (4.2.37)
and
s? = lim sn ≤ t ? . (4.2.38)
n→∞
Moreover, if K0 < K or M1 < M then (4.2.36) and (4.2.37) hold as strict inequalities.
Clearly, sequence {sn } converges under the hypotheses of Lemma 4.2.1 or Lemma 4.2.2.
However, {sn } can converge under weaker hypotheses than those of Lemma 4.2.2. Indeed,
denote by γ10 and γ11 , respectively, the minimal positive zeros of equations
\left[\frac{K}{2} + \frac{M}{6}\,\alpha(s_2-s_1)\right](s_2-s_1) + L_0(1+\alpha)(s_2-s_1) + L_0 s_1 - 1 = 0   (4.2.39)
and
\left[\frac{K_0}{2} + \frac{M_1}{6}\,(s_2-s_1)\right](s_2-s_1) + \alpha L_0 s_2 - \alpha = 0.   (4.2.40)
Set
γ1 = min{γ10 , γ11 , 1/L0}. (4.2.41)
Then, we have that
γ ≤ γ1 . (4.2.42)
Moreover, the conclusions of Lemma 4.2.2 hold for sequence {sn } if (4.2.42) replaces
(4.2.23).
Note also that strict inequality can hold in (4.2.42) which implies that the sequence {sn }
– which is tighter than {tn } – converges under weaker conditions.
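The recurrences for {t_n} and {s_n} are straightforward to evaluate numerically. The sketch below is our own illustration with hypothetical constants (chosen so that K_0 < K and M_1 < M, as in Remark 4.2.5); it assumes, consistently with (4.2.35), that {t_n} uses K and M in every step, while {s_n} uses K_0, M_1 in the first step only. It then checks the comparison (4.2.36).

```python
# Majorizing sequences: {t_n} per (4.2.5) and the tighter {s_n} per (4.2.35).
# Constants are illustrative only (they satisfy K0 < K and M1 < M).
def t_seq(eta, K, M, L0, n=25):
    t = [0.0, eta]
    for _ in range(n):
        d = t[-1] - t[-2]
        t.append(t[-1] + (K + M / 3 * d) * d**2 / (2 * (1 - L0 * t[-1])))
    return t

def s_seq(eta, K, M, K0, M1, L0, n=25):
    s = [0.0, eta]
    # first step uses the smaller constants K0, M1
    s.append(s[1] + (K0 + M1 / 3 * s[1]) * s[1]**2 / (2 * (1 - L0 * s[1])))
    for _ in range(n - 1):
        d = s[-1] - s[-2]
        s.append(s[-1] + (K + M / 3 * d) * d**2 / (2 * (1 - L0 * s[-1])))
    return s

t = t_seq(0.1, K=1.5, M=0.9, L0=0.7)
s = s_seq(0.1, K=1.5, M=0.9, K0=0.7, M1=0.6, L0=0.7)
```

Both sequences are increasing and bounded, and s_n ≤ t_n holds termwise, as Remark 4.2.5 asserts.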
Lemma 4.3.1. Let K_0 > 0, L_0 > 0, M_0 > 0 and η > 0 with K_0 ≤ L_0. Define parameters a,
θ_0 and θ_1 by
a = \frac{2K_0}{K_0 + \sqrt{K_0^2 + 8K_0 L_0}},   (4.3.1)
\theta_0 = \frac{2}{\frac{K_0}{2} + (1+a)L_0 + \sqrt{\left(\frac{K_0}{2} + (1+a)L_0\right)^2 + \frac{2M_0(a+3)}{3}}}   (4.3.2)
and
\theta_1 = \frac{2a}{\frac{K_0}{2} + aL_0 + \sqrt{\left(\frac{K_0}{2} + aL_0\right)^2 + \frac{2M_0 a}{3}}}.   (4.3.3)
Suppose that
\eta \le \begin{cases} \theta_1 & \text{if } L_0\eta \le \dfrac{1-a^2}{2+2a-a^2},\\[4pt] \theta_0 & \text{if } \dfrac{1-a^2}{2+2a-a^2} \le L_0\eta. \end{cases}   (4.3.4)
Then, sequence {v_n} generated by
v_0 = 0, \quad v_1 = \eta, \quad v_{n+2} = v_{n+1} + \frac{\frac{M_0}{6}(v_{n+1}-v_n) + \frac{M_0}{2}\,v_n + \frac{K_0}{2}}{1 - L_0 v_{n+1}}\,(v_{n+1}-v_n)   (4.3.5)
is well defined, increasing, bounded from above by
v^{\star\star} = \frac{\eta}{1-a}   (4.3.6)
and converges to its unique least upper bound v? which satisfies v? ∈ [0, v??]. Moreover the
following estimates hold
v_{n+1} - v_n \le a^n \eta   (4.3.7)
and
v^\star - v_n \le \frac{a^n \eta}{1-a}.   (4.3.8)
Proof. As in Lemma 4.2.1 we use mathematical induction to prove that
\beta_k = \frac{\frac{K_0}{2} + \frac{M_0}{2}\,v_k + \frac{M_0}{6}(v_{k+1}-v_k)}{1 - L_0 v_{k+1}}\,(v_{k+1}-v_k) \le a.   (4.3.9)
Estimate (4.3.9) holds for k = 0 by the choice of θ1 . Let us assume that (4.3.9) holds for all
k ≤ n. Then, we must prove that
\left[\frac{K_0}{2} + \frac{M_0}{2}\,\frac{1-a^k}{1-a}\,\eta + \frac{M_0}{6}\,a^k\eta\right] a^k\eta + aL_0\,\frac{1-a^{k+1}}{1-a}\,\eta - a \le 0.   (4.3.10)
Define recurrent functions f_k on [0, 1) for each k = 1, 2, ... by
f_k(t) = \left[\frac{K_0}{2} + \frac{M_0}{2}\,(1 + t + \cdots + t^{k-1})\eta + \frac{M_0}{6}\,a^k\eta\right] t^{k-1}\eta + L_0(1 + t + \cdots + t^k)\eta - 1.   (4.3.11)
and
\left[\frac{M_0}{6}\,(v_2-v_1) + \frac{M_0}{2}\,v_1 + \frac{K_0}{2}\right](v_2-v_1) + aL_0 v_2 - a = 0.   (4.3.14)
Set
δ = min{δ0 , δ1 , 1/L0 }. (4.3.15)
Then, we can show:
\left[\frac{K_0}{2} + \frac{M_0}{2}\left(v_1 + \frac{1-a^k}{1-a}\,(v_2-v_1)\right) + \frac{M_0}{6}\,a^k(v_2-v_1)\right] a^k(v_2-v_1)
+ aL_0\left[v_1 + \frac{1-a^{k+1}}{1-a}\,(v_2-v_1)\right] - a \le 0.   (4.3.19)
Thus
f_{k+1}(a) \le f_k(a) \le \cdots \le f_1(a).   (4.3.21)
But by the choice of η0 we have that f 1 (a) ≤ 0.
Remark 4.3.3. A sequence related to Newton’s method (4.1.2) under the (H) conditions is
defined by
u_0 = 0, \quad u_1 = \eta, \quad u_2 = u_1 + \frac{K_0 + \frac{M_1}{3}(u_1-u_0)}{2(1 - L_0 u_1)}\,(u_1-u_0)^2,
   (4.3.22)
u_{n+2} = u_{n+1} + \frac{K_0 + \frac{M_0}{3}(u_{n+1}-u_n)}{2(1 - L_0 u_{n+1})}\,(u_{n+1}-u_n)^2
for each n = 1, 2, . . . and M1 ∈ (0, M ]. Then, a simple inductive argument shows that for
each n = 2, 3, . . .
un ≤ vn (4.3.23)
un+1 − un ≤ vn+1 − vn (4.3.24)
and
u^\star = \lim_{n\to\infty} u_n \le v^\star.   (4.3.25)
Moreover, if K0 < K or M1 < M0 then (4.3.23) and (4.3.24) hold as strict inequalities.
Sequence {un } converges under the hypotheses of Lemma 4.3.1 or 4.3.2. However, {un }
can converge under weaker hypotheses than those of Lemma 4.3.2. Indeed, denote by δ10
and δ11 , respectively, the minimal positive zeros of equations
\left[\frac{K_0}{2} + \frac{M_0}{2}\,u_2 + \frac{M_0}{6}\,(u_2-u_1)\right](u_2-u_1) + L_0\big(u_1 + (1+a)(u_2-u_1)\big) - 1 = 0   (4.3.26)
and
\left[\frac{M_0}{6}\,(u_2-u_1) + \frac{M_0}{2}\,u_1 + \frac{K_0}{2}\right](u_2-u_1) + aL_0 u_2 - a = 0.   (4.3.27)
Set
δ1 = min{δ10 , δ11 , 1/L0}. (4.3.28)
Then, we have that
δ ≤ δ1 .
Moreover, the conclusions of Lemma 4.3.2 hold for sequence {un } if (4.3.28) replaces
(4.3.16). Note also that strict inequality may hold in (4.3.28), which implies that the sequence
{u_n}, which is tighter than {v_n}, converges under weaker conditions. Finally, note that
sequence {t_n} is tighter than {v_n}, although the sufficient convergence conditions for {v_n}
are weaker than those for {t_n}.
Lemmas similar to Lemma 4.2.3 and Lemma 4.2.4 for sequence {v_n} can follow in an
analogous way.
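The (H)-condition recurrences (4.3.5) and (4.3.22) can also be sketched directly. The following is our own illustration with hypothetical constants; it checks the comparison (4.3.23) between {u_n} and {v_n}.

```python
# {v_n} per (4.3.5) and the tighter {u_n} per (4.3.22), with illustrative constants.
def v_seq(eta, K0, M0, L0, n=30):
    v = [0.0, eta]
    for _ in range(n):
        d = v[-1] - v[-2]
        num = M0 / 6 * d + M0 / 2 * v[-2] + K0 / 2  # numerator of (4.3.5)
        v.append(v[-1] + num / (1 - L0 * v[-1]) * d)
    return v

def u_seq(eta, K0, M0, M1, L0, n=30):
    u = [0.0, eta]
    # first step of (4.3.22) uses M1
    u.append(u[1] + (K0 + M1 / 3 * u[1]) * u[1]**2 / (2 * (1 - L0 * u[1])))
    for _ in range(n - 1):
        d = u[-1] - u[-2]
        u.append(u[-1] + (K0 + M0 / 3 * d) * d**2 / (2 * (1 - L0 * u[-1])))
    return u

v = v_seq(0.1, K0=0.7, M0=0.8, L0=0.9)
u = u_seq(0.1, K0=0.7, M0=0.8, M1=0.8, L0=0.9)
```

As expected, u_n ≤ v_n for every n, while both sequences are increasing and bounded.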
U(x0 ,t ? ) ⊆ D . (4.4.1)
Then, the sequence {xn } defined by Newton’s method (4.1.2) is well defined, remains in
U(x0 ,t ? ) for all n ≥ 0 and converges to a unique solution x? ∈ U(x0 ,t ? ) of equation F (x) =
0. Moreover, the following estimates hold for all n ≥ 0
and
kxn − x? k ≤ t ? − tn , (4.4.3)
U(x0 , R) ⊆ D (4.4.4)
and
L0 (t ? + R) ≤ 2. (4.4.5)
and
kz − x0 k ≤ kz − x1 k + kx1 − x0 k
≤ (t ? − t1 ) + (t1 − t0 ) = t ? − t0 ,
Thus estimates (4.4.6) and (4.4.7) hold for k = 0. Suppose they hold for n = 0, 1, 2, ..., k; then
we have
k+1 k+1
kxk+1 − x0 k ≤ ∑ kxi − xi−1 k ≤ ∑ (ti − ti−1) = tk+1 − t0 = tk+1 (4.4.8)
i=1 i=1
and
where
\bar{K} = \begin{cases} K_0, & k = 0,\\ K, & k > 0 \end{cases} \quad\text{and}\quad \bar{M} = \begin{cases} M_0, & k = 0,\\ M, & k > 0. \end{cases}
Using (C5 ), we obtain that
\| F'(x_0)^{-1}(F'(x_{k+1}) - F'(x_0)) \| \le L_0 \|x_{k+1} - x_0\| \le L_0 t_{k+1} \le L_0 t^\star < 1.   (4.4.12)
It follows from the Banach lemma on invertible operators [7, 8, 14, 15, 16] and (4.4.12) that
F 0 (xk+1)−1 exists and
\| F'(x_{k+1})^{-1} F'(x_0) \| \le (1 - L_0 \|x_{k+1} - x_0\|)^{-1} \le (1 - L_0 t_{k+1})^{-1}.   (4.4.13)
That is,
z ∈ U(xk+1 ,t ? − tk+1 ). (4.4.16)
Estimates (4.4.13) and (4.4.16) imply that (4.4.6) and (4.4.7) hold for n = k + 1. The proof
of (4.4.6) and (4.4.7) is now complete by induction.
Lemma 4.2.1 implies that sequence {tn } is a Cauchy sequence. From (4.4.6) and (4.4.7),
{xn } (n ≥ 0) becomes a Cauchy sequence too and as such it converges to some x? ∈ U(x0 ,t ? )
(since U(x0 ,t ? ) is a closed set). Estimate (4.4.3) follows from (4.4.2) by using standard
majorization techniques [7, 8, 14, 15, 16, 18]. Moreover, by letting k → ∞ in (4.4.11), we
obtain F (x? ) = 0. Finally, to show uniqueness let y? be a solution of equation F (x) = 0 in
U(x0 , R). It follows from (C5 ) for x = y? + θ(x? − y? ), θ ∈ [0, 1], the estimate
\left\| \int_0^1 F'(x_0)^{-1}\big(F'(y^\star + \theta(x^\star - y^\star)) - F'(x_0)\big)\, d\theta \right\|
\le L_0 \int_0^1 \| y^\star + \theta(x^\star - y^\star) - x_0 \| \, d\theta
\le L_0 \int_0^1 \big(\theta \|x^\star - x_0\| + (1-\theta)\|y^\star - x_0\|\big)\, d\theta
\le \frac{L_0}{2}\,(t^\star + R) \le 1 \quad (\text{by } (4.4.5)),
and the Banach lemma on invertible operators implies that the linear operator
T = \int_0^1 F'(y^\star + \theta(x^\star - y^\star))\, d\theta is invertible. Using the identity 0 = F(x^\star) - F(y^\star) = T(x^\star - y^\star), we deduce that x^\star = y^\star.
Similarly, we show the uniqueness in U(x0 ,t ? ) by setting t ? = R. That completes the
proof of Theorem 4.4.1.
Remark 4.4.2. The conclusions of Theorem 4.4.1 hold if {tn }, t ? are replaced by {rn }, r? ,
respectively.
instead of (4.4.11) and (C6 ), (C7 ) instead of, respectively, (C3 ), (C4 ), we arrive at the
following semilocal convergence result under the (H) conditions [8, Theorem 6.3.7 p. 210
for proof].
and
kxn − x? k ≤ v? − vn (4.4.20)
U(x0 , R) ⊆ D (4.4.21)
and
L0 (t ? + R) ≤ 2. (4.4.22)
Note also that in view of (A3 ) and (A4 ), respectively, there exist c0 ∈ (0, c] and d0 ∈ (0, d]
such that for each θ ∈ [0, 1]
(A'_3) \| F'(x^\star)^{-1}\big(F''(x_0 + \theta(x^\star - x_0)) - F''(x^\star)\big) \| \le c_0 (1-\theta) \|x_0 - x^\star\|,
(A'_4) \| F'(x^\star)^{-1}\big(F'(x_0) - F'(x^\star)\big) \| \le d_0 \|x_0 - x^\star\|.
U(x^\star, r) \subseteq D,   (4.5.1)
where
r = \frac{2}{\frac{b}{2} + d + \sqrt{\left(\frac{b}{2} + d\right)^2 + \frac{4c}{3}}}.   (4.5.2)
Then, sequence {xn } (starting from x0 ∈ U(x? , r)) generated by Newton’s method (4.1.2) is
well defined, remains in U(x? , r) for all n ≥ 0 and converges to x? . Moreover the following
estimates hold
where
( (
c0 if n = 0, d0 if n = 0,
c= d=
c if n > 0, d if n > 0.
Proof. By hypothesis, the starting point x_0 ∈ U(x^\star, r). Suppose that x_k ∈ U(x^\star, r) for all k ≤ n.
Using (A_4) and the definition of r we get that
\| F'(x^\star)^{-1}(F'(x_k) - F'(x^\star)) \| \le d\,\|x_k - x^\star\| < d\,r < 1.   (4.5.5)
It follows from (4.5.5) and the Banach lemma on invertible operators that F 0 (xk )−1 exists
and
\| F'(x_k)^{-1} F'(x^\star) \| \le \frac{1}{1 - d\,\|x_k - x^\star\|}.   (4.5.6)
Hence, xk+1 exists. Using (4.1.2), we obtain the approximation
x^\star - x_{k+1} = -F'(x_k)^{-1} F'(x^\star) \int_0^1 F'(x^\star)^{-1}\big[F''(x_k + \theta(x^\star - x_k)) - F''(x^\star) + F''(x^\star)\big] (x^\star - x_k)^2 (1-\theta)\, d\theta.   (4.5.7)
In view of (A_2), (A_3), (A_4), (4.5.6), (4.5.7) and the choice of r we have in turn that
\|x_{k+1} - x^\star\| \le \frac{c \int_0^1 (1-\theta)^2\, d\theta\, \|x_k - x^\star\|^3 + b \int_0^1 (1-\theta)\, d\theta\, \|x_k - x^\star\|^2}{1 - d\,\|x_k - x^\star\|}
\le e_k \|x_k - x^\star\|^2 < q(r)\,\|x_k - x^\star\| \le \|x_k - x^\star\|.   (4.5.8)
Remark 4.5.2. The local results can be used for projection methods such as Arnoldi's method, the
generalized minimum residual method (GMRES), the generalized conjugate residual method (GCR),
for combined Newton/finite projection methods, and in connection with the mesh inde-
pendence principle to develop the cheapest and most efficient mesh refinement strategies
[7, 8, 4, 15, 16]. These results can also be used to solve equations of the form (4.1.1),
where F', F'' satisfy differential equations of the form
F'(x) = P(F(x)), \quad F''(x) = Q(F(x)),
where P and Q are known operators. Since F'(x^\star) = P(F(x^\star)) = P(0) and F''(x^\star) =
Q(F(x^\star)) = Q(0), we can apply our results without actually knowing the solution x^\star of
equation (4.1.1).
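For instance, for F(x) = e^x − 1 one may take P(x) = Q(x) = x + 1 (this is the situation of Example 3 below). A quick finite-difference check of the identities F′(x) = P(F(x)) and F″(x) = Q(F(x)) for this choice, as an illustration of our own:

```python
# Finite-difference verification that F'(x) = P(F(x)) and F''(x) = Q(F(x))
# for F(x) = exp(x) - 1 with P(x) = Q(x) = x + 1 (so P(F(x)) = Q(F(x)) = exp(x)).
import math

F = lambda x: math.exp(x) - 1.0
P = lambda t: t + 1.0
Q = lambda t: t + 1.0

x, h = 0.3, 1e-5
fd1 = (F(x + h) - F(x - h)) / (2 * h)              # central difference for F'(x)
h2 = 1e-4
fd2 = (F(x + h2) - 2 * F(x) + F(x - h2)) / h2**2   # central difference for F''(x)
print(abs(fd1 - P(F(x))), abs(fd2 - Q(F(x))))      # both differences are tiny
```

The point of the remark is that P(0) and Q(0) are computable without knowing x⋆.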
respectively. Thus we observe that the criterion (4.1.3) fails while the criterion (4.1.6)
holds. From the hypothesis of Lemma 4.2.1, we get
0.09730789545 \le \begin{cases} 0.2017739733 & \text{if } 0.08268226632 \le 0.2499999999,\\ 0.2036729480 & \text{if } 0.2499999999 \le 0.08268226632. \end{cases}
Thus the hypotheses of Lemma 4.2.1 hold. As a consequence, we can apply Theorem
4.4.1. Table 4.6.1 reports the convergence behavior of Newton's method (4.1.2) applied to
(4.4.11) with x_0 = 1 and ψ = 0.55. Numerical computations are performed to the decimal
point accuracy of 2005 by employing the high-precision library ARPREC. Table 4.6.2
reports the behavior of the sequence {t_n} from (4.2.5). Comparing Tables 4.6.1 and 4.6.2, we observe that
n tn tn+2 − tn+1 t ? − tn
0 0.00e + 00 4.95e − 02 1.69e − 01
1 9.73e − 02 1.87e − 02 7.16e − 02
2 1.47e − 01 3.26e − 03 2.21e − 02
3 1.66e − 01 1.02e − 04 3.36e − 03
4 1.69e − 01 1.01e − 07 1.02e − 04
5 1.69e − 01 9.75e − 14 1.01e − 07
6 1.69e − 01 9.16e − 26 9.75e − 14
7 1.69e − 01 8.08e − 50 9.16e − 26
8 1.69e − 01 6.30e − 98 8.08e − 50
9 1.69e − 01 3.82e − 194 6.30e − 98
Hammerstein integral equation of the second kind. Consider the integral equation
x(s) = 1 + \frac{4}{5} \int_0^1 G(s,t)\, x(t)^3\, dt, \quad s \in [0, 1],   (4.6.2)
and
[F''(x) y z](s) = \frac{24}{5} \int_0^1 G(s,t)\, x(t)\, y(t)\, z(t)\, dt, \quad s \in [0, 1],   (4.6.6)
respectively. We use the max-norm. Let x0 (s) = 1 for all s ∈ [0, 1]. Then, for any y ∈ D , we
have
[(I - F'(x_0))(y)](s) = \frac{12}{5} \int_0^1 G(s,t)\, y(t)\, dt, \quad s \in [0, 1],   (4.6.7)
which means
\| I - F'(x_0) \| \le \frac{12}{5} \max_{s \in [0,1]} \int_0^1 G(s,t)\, dt = \frac{12}{5 \times 8} = \frac{3}{10} < 1.   (4.6.8)
It follows from the Banach theorem that F 0 (x0 )−1 exists and
\| F'(x_0)^{-1} \| \le \frac{1}{1 - \frac{3}{10}} = \frac{10}{7}.   (4.6.9)
On the other hand, we have from (4.4.7) that
\| F(x_0) \| = \frac{4}{5} \max_{s \in [0,1]} \int_0^1 G(s,t)\, dt = \frac{1}{10}.
Then, we get η = 1/7. Note that F 00 (x) is not bounded in X or its subset X1 . Take into
account that a solution x? of equation (4.1.1) with F given by (4.4.6) must satisfy
1 ? 3
kx? k − 1 − kx k ≤ 0, (4.6.10)
10
i.e., \|x^\star\| \le \rho_1 = 1.153467305 or \|x^\star\| \ge \rho_2 = 2.423622140, where \rho_1 and \rho_2 are the
positive roots of the real equation z - 1 - z^3/10 = 0. Consequently, if we look for a solution
such that \|x^\star\| < \rho_1, x^\star \in X_1, we can consider D := \{x : x \in X_1 \text{ and } \|x\| < r\}, with r \in (\rho_1, \rho_2),
as a nonempty open convex subset of X. For example, choose r = 1.7. Using (4.3.7) and
(4.3.8), we have that for any x, y, z \in D
\| [(F'(x) - F'(x_0)) y](s) \| = \frac{12}{5} \left\| \int_0^1 G(s,t)\, (x(t)^2 - x_0(t)^2)\, y(t)\, dt \right\|
\le \frac{12}{5} \int_0^1 G(s,t)\, \|x(t) - x_0(t)\|\, \|x(t) + x_0(t)\|\, \|y(t)\|\, dt
\le \frac{12}{5} \int_0^1 G(s,t)\, (r+1)\, \|x(t) - x_0(t)\|\, \|y(t)\|\, dt, \quad s \in [0, 1],   (4.6.11)
and
\| (F''(x) y z)(s) \| = \frac{24}{5} \left\| \int_0^1 G(s,t)\, x(t)\, y(t)\, z(t)\, dt \right\|, \quad s \in [0, 1].   (4.6.12)
Then, we get
\| F'(x) - F'(x_0) \| \le \frac{12}{5} \cdot \frac{1}{8}\, (r+1)\, \|x - x_0\| = \frac{81}{100}\, \|x - x_0\|,   (4.6.13)
\| F''(x) \| \le \frac{24}{5} \times \frac{r}{8} = \frac{51}{50}   (4.6.14)
and
\| [(F''(x) - F''(\bar{x})) y z](s) \| = \frac{24}{5} \left\| \int_0^1 G(s,t)\, (x(t) - \bar{x}(t))\, y(t)\, z(t)\, dt \right\|   (4.6.15)
\le \frac{24}{5} \cdot \frac{1}{8}\, \|x - \bar{x}\| = \frac{3}{5}\, \|x - \bar{x}\|.   (4.6.16)
Now we can choose the constants as follows:
\eta = \frac{1}{7}, \quad M = \frac{6}{7}, \quad M_0 = \frac{6}{7}, \quad K = \frac{51}{35}, \quad L_0 = \frac{49}{70} \quad \text{and} \quad K_0 = \frac{11}{15}.
From (4.1.3) and (4.1.5), we obtain
Table 4.6.3. Comparison among the sequences (4.2.5), (4.2.35), (4.3.5) and (4.3.22)
n tn sn vn un
0 0.000000e + 00 0.000000e + 00 0.000000e + 00 0.000000e + 00
1 1.428571e − 01 1.428571e − 01 1.428571e − 01 1.428571e − 01
2 1.598408e − 01 1.514801e − 01 2.042976e − 01 1.516343e − 01
3 1.600782e − 01 1.515408e − 01 2.356037e − 01 1.516661e − 01
4 1.600783e − 01 1.515408e − 01 2.527997e − 01 1.516661e − 01
5 1.600783e − 01 1.515408e − 01 2.626215e − 01 1.516661e − 01
6 1.600783e − 01 1.515408e − 01 2.683548e − 01 1.516661e − 01
7 1.600783e − 01 1.515408e − 01 2.717435e − 01 1.516661e − 01
8 1.600783e − 01 1.515408e − 01 2.737612e − 01 1.516661e − 01
9 1.600783e − 01 1.515408e − 01 2.749678e − 01 1.516661e − 01
and
\frac{1}{7} \le \begin{cases} 0.6257238049 & \text{if } 0.1000000000 \le 0.2691240473,\\ 0.5832936968 & \text{if } 0.2691240473 \le 0.1000000000, \end{cases}
respectively. Thus hypotheses (4.2.4) and (4.3.4) hold. A comparison among the sequences
(4.2.5), (4.2.35), (4.3.5) and (4.3.22) is reported in Table 4.6.3, where we observe that the
estimates (4.2.36) and (4.3.23) hold.
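The entries of Table 4.6.3 can be regenerated directly from the recurrences. The sketch below is our own check, using the constants η = 1/7, K = 51/35, M = M_0 = M_1 = 6/7, K_0 = 11/15, L_0 = 49/70 listed above; it reproduces the first few terms of the columns t_n, v_n and u_n.

```python
# Reproduce columns t_n, v_n, u_n of Table 4.6.3 from (4.2.5), (4.3.5), (4.3.22).
eta, K, M, M0, M1, K0, L0 = 1/7, 51/35, 6/7, 6/7, 6/7, 11/15, 49/70

t = [0.0, eta]
v = [0.0, eta]
u = [0.0, eta]
u.append(u[1] + (K0 + M1/3 * u[1]) * u[1]**2 / (2 * (1 - L0 * u[1])))
for n in range(8):
    dt = t[-1] - t[-2]
    t.append(t[-1] + (K + M/3 * dt) * dt**2 / (2 * (1 - L0 * t[-1])))
    dv = v[-1] - v[-2]
    v.append(v[-1] + (M0/6 * dv + M0/2 * v[-2] + K0/2) * dv / (1 - L0 * v[-1]))
    du = u[-1] - u[-2]
    u.append(u[-1] + (K0 + M0/3 * du) * du**2 / (2 * (1 - L0 * u[-1])))

print(t[2], v[2], u[2])  # ≈ 0.1598408, 0.2042976, 0.1516343 as in Table 4.6.3
```

The computed terms agree with the tabulated values to the displayed precision, and u_n ≤ v_n throughout, confirming (4.3.23).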
Concerning the uniqueness balls: from equation (4.1.5) we get R_1 = 0.1627780248, and
from equation (4.1.8) we get R_2 = 0.1518068730, whereas from Theorem 4.4.1 we get
R \le 1.257142857. Therefore, the new approach provides the largest uniqueness ball.
Example 3. Let us consider the case when X = Y = R, D = U(0, 1) and define F on D
by
F (x) = ex − 1. (4.6.17)
Then, we can define P (x) = x + 1 and Q (x) = x + 1. In order for us to compare our radius
of convergence with earlier ones, let us introduce the Lipschitz condition
\| F'(x^\star)^{-1}(F'(x) - F'(y)) \| \le L\, \|x - y\| \quad \text{for each } x, y \in D.   (4.6.18)
Using the (A) and (A_0) conditions, and F'(x^\star) = \mathrm{diag}\{1, 1, 1\}, we set
b = 1.0, \quad c = c_0 = \bar{c} = d = d_0 = \bar{d} = e - 1, \quad L = e \quad \text{and} \quad L_0 = e - 1.
We obtain
r = 0.4078499356.
Thus, r0 < r.
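The value r = 0.4078499356 follows from (4.5.2) with b = 1 and c = d = e − 1, and can be checked in a couple of lines:

```python
# Local convergence radius (4.5.2) for Example 3: b = 1.0, c = d = e - 1.
import math

b, c, d = 1.0, math.e - 1.0, math.e - 1.0
r = 2.0 / (b / 2 + d + math.sqrt((b / 2 + d) ** 2 + 4 * c / 3))
print(r)  # ≈ 0.4078499356
```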
and
n ξn kxn − x? k2
1 0.240445748047369
2 0.000661013573819
3 0.000000224531576
4 0.000000000000060
5 0.000000000000000
where
p_n = \frac{\frac{L}{3}\,\|x_n - x^\star\| + \frac{b}{2}}{1 - d\,\|x_n - x^\star\|}, \quad \lambda_n = \frac{L/2}{1 - L_0\,\|x_n - x^\star\|},
\mu_n = \frac{L/2}{1 - L\,\|x_n - x^\star\|} \quad \text{and} \quad \xi_n = \frac{\frac{L}{3}\,\|x_n - x^\star\| + \frac{b}{2}}{1 - \left(\frac{L}{2}\,\|x_n - x^\star\| + b\right)\|x_n - x^\star\|}.
To compare the above iterations with the iteration (4.5.3), we produce the comparison
Tables 4.6.4 and 4.6.5, in which we apply Newton's method (4.1.2) to the equation (4.6.21)
with x_0 = (0.21, 0.21, 0.21)^T. In Table 4.6.4, we note that the estimates (4.5.3) of Theo-
rem 4.5.1 hold.
References
[1] Amat, S., Bermúdez, C., Busquier, S., Legaz, M. J., Plaza, S., On a Family of High-
Order Iterative Methods under Kantorovich Conditions and Some Applications, Abst.
Appl. Anal., 2012, (2012).
[2] Amat, S., Bermúdez, C., Busquier, S., Plaza, S., On a third-order Newton-type
method free of bilinear operators, Numer. Linear Alg. with Appl., 17(4) (2010), 639-
653.
[3] Amat, S., Busquier, S., Third-order iterative methods under Kantorovich conditions,
J. Math. Anal. App., 336(1) (2007), 243-261.
[4] Amat, S., Busquier, S., Gutiérrez, J. M., Third-order iterative methods with applica-
tions to Hammerstein equations: A unified approach, J. Comput. App. Math., 235(9)
(2011), 2936-2943.
[6] Argyros, I.K., On the Newton-Kantorovich hypothesis for solving equations, Journal
of Computational and Applied Mathematics, 169(2) (2004), 315-332.
[7] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its appli-
cations, CRC Press/Taylor and Francis Publications, New York, 2012.
[9] Ezquerro, J.A., González, D., Hernández, M. A., Majorizing sequences for Newton’s
method from initial value problems, J. Comp. Appl. Math., 236 (2012), 2246-2258.
[10] Ezquerro, J.A., Hernández, M.A., Generalized differentiability conditions for New-
ton’s method, IMA J. Numer. Anal., 22(2) (2002), 187-205.
[11] Ezquerro, J.A., González, D., Hernández, M.A., On the local convergence of New-
ton’s method under generalized conditions of Kantorovich, App. Math. Let., 26
(2013), 566-570.
[12] Gutiérrez, J.M., A new semilocal convergence theorem for Newton’s method, J.
Comp. App. Math., 79 (1997), 131 - 145.
[13] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.
[14] Kantorovich,L.V., The majorant principle and Newton’s method, Doklady Akademii
Nauk SSSR, 76, 17-20, 1951. (In Russian).
[15] Ostrowski, A.M., Solution of equations in Euclidean and Banach spaces, Academic
Press, New York, 3rd Ed. 1973.
[16] Traub, J. F., Iterative Methods for the Solution of Equations, American Mathematical
Soc., 1982.
[17] Werner, W., Some improvements of classical iterative methods for the solution of
nonlinear equations, Numerical Solution of Nonlinear Equations Lecture Notes in
Mathematics, 878/1981 (1981), 426-440.
[18] Yamamoto, T., On the method of tangent hyperbolas in Banach spaces, Journal of
Computational and Applied Mathematics, 21(1), (1988), 75-86.
Chapter 5

EXPANDING THE APPLICABILITY OF NEWTON'S METHOD

5.1. Introduction
Let X , Y be Banach spaces. Let U(x, r) and U(x, r) stand, respectively, for the open and
closed ball in X with center x and radius r > 0. Denote by L (X , Y ) the space of bounded
linear operators from X into Y . In the present chapter we are concerned with the problem
of approximating a locally unique solution x? of equation
F(x) = 0, (5.1.1)
of an approximate zero was proposed and the convergence criteria were provided to deter-
mine an approximate zero for analytic function, depending on the information at the initial
point. Wang and Han [32, 31] generalized Smale’s result by introducing the γ-condition (see
(5.1.3)). For more details on Smale’s theory, the reader can refer to the excellent Dedieu’s
book [15, Chapter 3.3].
Newton’s method defined by
x0 is an initial point
(5.1.2)
xn+1 = xn − F 0 (xn )−1 F(xn ) for each n = 0, 1, 2, · · ·
is undoubtedly the most popular iterative process for generating a sequence {xn } approxi-
mating x? . Here, F 0 (x) denotes the Fréchet-derivative of F at x ∈ U(x0 , R).
In the present chapter we expand the applicability of Newton's method under the γ-
condition by introducing the notion of the center γ_0-condition (to be made precise in Defini-
tion 5.3.1) for some γ_0 ≤ γ. This way we obtain tighter upper bounds on the norms of
k F 0 (x)−1 F 0 (x0 ) k for each x ∈ U(x0 , R) (see (5.2.4), (5.2.2) and (5.2.3)) leading to weaker
sufficient convergence conditions and a tighter convergence analysis than in earlier studies
such as [14, 19, 27, 28, 31, 32]. The approach of introducing center-Lipschitz condition
has already been fruitful for expanding the applicability of Newton’s method under the
Kantorovich-type theory [3, 9, 11, 13].
Wang in his work [31] on approximate zeros of Smale (cf. [28, 29]) used the γ-Lipschitz
condition at x0
\| F'(x_0)^{-1} F''(x) \| \le \frac{2\gamma}{(1 - \gamma \|x - x_0\|)^3}
\quad \text{for each } x \in U(x_0, r),\ 0 < r \le R,   (5.1.3)
where γ > 0 and x0 are such that γ k x − x0 k< 1 and F 0 (x0 )−1 ∈ L (Y , X ) to show the
following semilocal convergence result for Newton’s method.
t ? ≤ R, (5.1.6)
where
t^\star = \frac{1 + \alpha - \sqrt{(1+\alpha)^2 - 8\alpha}}{4\gamma} \le \left(1 - \frac{1}{\sqrt{2}}\right)\frac{1}{\gamma}.   (5.1.7)
Then, sequence {xn } generated by Newton’s method is well defined, remains in U(x0 ,t ?) for
each n = 0, 1, · · · and converges to a unique solution x? ∈ U(x0 ,t ? ) of equation F(x) = 0.
Moreover, the following error estimates hold
and
k xn+1 − x? k≤ t ? − tn , (5.1.9)
where scalar sequence {t_n} is defined by
t_0 = 0, \quad t_1 = \eta,
t_{n+1} = t_n + \frac{\gamma\,(t_n - t_{n-1})^2}{\left(2 - \frac{1}{(1 - \gamma t_n)^2}\right)(1 - \gamma t_n)(1 - \gamma t_{n-1})^2} = t_n - \frac{\varphi(t_n)}{\varphi'(t_n)}   (5.1.10)
for each n = 1, 2, ...,
where
\varphi(t) = \eta - t + \frac{\gamma t^2}{1 - \gamma t}.   (5.1.11)
Notice that t^\star is the smaller zero of the equation \varphi(t) = 0, which exists under the hypothesis
(5.1.5).
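Since the Newton form in (5.1.10) is just Newton's method applied to the scalar function ϕ of (5.1.11), the sequence {t_n} and its limit t⋆ from (5.1.7) are easy to generate. The sketch below is our own illustration; the values γ = 1, η = 0.15 are assumptions chosen so that the square root in (5.1.7) is real (with α = γη).

```python
# Scalar majorizing sequence (5.1.10): Newton's method applied to
# phi(t) = eta - t + gamma*t**2/(1 - gamma*t), converging to the smaller zero t_star (5.1.7).
import math

gamma, eta = 1.0, 0.15
alpha = gamma * eta

phi = lambda t: eta - t + gamma * t**2 / (1 - gamma * t)
dphi = lambda t: -1.0 + gamma * t * (2 - gamma * t) / (1 - gamma * t) ** 2

t = [0.0]
for _ in range(20):
    t.append(t[-1] - phi(t[-1]) / dphi(t[-1]))

t_star = (1 + alpha - math.sqrt((1 + alpha) ** 2 - 8 * alpha)) / (4 * gamma)
print(t[1], t[-1], t_star)  # t_1 = eta; t_n increases to t_star = 0.2
```

For these values t⋆ = 0.2 exactly, and the iterates approach it monotonically from below, as the theory prescribes.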
The chapter is organized as follows: sections 5.2. and 5.3. contain the semilocal and
local convergence analysis of Newton’s method. Applications and numerical examples are
given in the concluding section 5.4.
and
\ell_0(r) := \frac{\gamma_0 (2 - \gamma_0 r)}{(1 - \gamma_0 r)^2}.   (5.2.6)
Notice that with preceding choices of functions ` and `0 and since condition (5.2.4) is
satisfied, we can always choose γ0 , γ such that
γ0 ≤ γ. (5.2.7)
From now on we assume that condition (5.2.7) is satisfied. We also need a result by Zabre-
jko and Nguen.
for some increasing function λ : [0, R] → [0, +∞). Then, the following assertion holds,
where
\Lambda(r) = \int_0^r \lambda(t)\, dt.
In particular, if
\| F'(x_0)^{-1}(F'(x) - F'(x_0)) \| \le \lambda_0(r)\, \|x - x_0\|
\quad \text{for each } x \in U(x_0, r),\ 0 < r \le R,
for some increasing function λ_0 : [0, R] → [0, +∞), then the following assertion holds,
where
\Lambda_0(r) = \int_0^r \lambda_0(t)\, dt.
Using the center-Lipschitz condition and Lemma 5.2.3, we can show the following
result on invertible operators.
Expanding the Applicability of Newton’s Method 77
Using (5.1.3), a Banach lemma similar to Lemma 5.2.1 was given in [31] (see also
[27, 28, 29]).
Lemma 5.2.6. Let F : U(x_0, R) → Y be twice Fréchet-differentiable on U(x_0, R). Suppose
F'(x_0)^{-1} ∈ L(Y, X) and γR < 1 for some γ > 0 and x_0 ∈ X; condition (5.1.3) holds on
V_0 = U(x_0, r_0), where r_0 = \left(1 - \frac{1}{\sqrt{2}}\right)\frac{1}{\gamma}. Then F'(x)^{-1} ∈ L(Y, X) on V_0 and satisfies
\| F'(x)^{-1} F'(x_0) \| \le \left(2 - \frac{1}{(1 - \gamma r)^2}\right)^{-1}.   (5.2.9)
Remark 5.2.7. It follows from (5.2.7)–(5.2.9) that (5.2.8) is a more precise upper bound on
the norm of F'(x)^{-1} F'(x_0). This observation leads to a tighter majorizing sequence for
{x_n} (see Proposition 5.2.10).
We can show the main following semilocal convergence theorem for Newton’s method.
Theorem 5.2.8. Suppose that
(a) There exist x0 ∈ X and η > 0 such that
F 0 (x0 )−1 ∈ L (Y , X ) and k F 0 (x0 )−1 F(x0 ) k≤ η;
(b) Operator F 0 satisfies Lipschitz and center-Lipschitz conditions (5.2.2) and (5.2.3) on
U(x0 , r0 ) with `(r) and `(r) are given by (5.2.5) and (5.2.6), respectively;
(c) U0 ⊆ U(x0 , R);
(d) Scalar sequence {s_n} defined by
s_0 = 0, \quad s_1 = \eta,
s_2 = s_1 + \frac{\gamma_0\,(s_1 - s_0)^2}{\left(2 - \frac{1}{(1 - \gamma_0 s_1)^2}\right)(1 - \gamma s_1)},
   (5.2.10)
s_{n+2} = s_{n+1} + \frac{\gamma\,(s_{n+1} - s_n)^2}{\left(2 - \frac{1}{(1 - \gamma_0 s_{n+1})^2}\right)(1 - \gamma s_{n+1})(1 - \gamma s_n)^2}
for each n = 1, 2, ...
(i) Sequence {sn } is increasingly convergent to its unique least upper bound s? which
satisfies s? ∈ [s2 , b], where b is given in (5.2.11).
(ii) Sequence {xn } generated by Newton’s method is well defined, remains in U(x0 , s? )
for each n = 0, 1, · · · and converges to a unique solution x? ∈ U(x0 , s? ) of equation
F(x) = 0. Moreover, the following estimates hold
and
k xn − x? k≤ s? − sn for each n = 0, 1, 2, · · · . (5.2.13)
Proof. (i) It follows from (5.2.8) and (5.2.9) that sequence {sn } is increasing and
bounded above by 1/γ. Hence, it converges to s? ∈ [s2 , b].
and
k z − x0 k≤k z − x1 k + k x1 − x0 k≤ s? − s1 + s1 − s0 = s? − s0 ,
Hence, estimates (5.2.14) and (5.2.15) hold for k = 0. Suppose these estimates hold
for natural integers n ≤ k. Then, we have that
\| x_{k+1} - x_0 \| \le \sum_{i=1}^{k+1} \| x_i - x_{i-1} \| \le \sum_{i=1}^{k+1} (s_i - s_{i-1}) = s_{k+1} - s_0 = s_{k+1}
and
Using (5.2.2), Lemma 5.2.1 for x = xk+1 and the induction hypotheses we get that
\| F'(x_0)^{-1}(F'(x_{k+1}) - F'(x_0)) \| \le \frac{1}{(1 - \gamma_0 \|x_{k+1} - x_0\|)^2} - 1
\le \frac{1}{(1 - \gamma_0 s_{k+1})^2} - 1 < 1.   (5.2.16)
It follows from (5.2.16) and the Banach lemma 5.2.1 on invertible operators that
F 0 (xk+1 )−1 exists and
\| F'(x_{k+1})^{-1} F'(x_0) \| \le \left(2 - \frac{1}{(1 - \gamma_0 s_{k+1})^2}\right)^{-1}.   (5.2.17)
Moreover, it follows from Lemma 5.2.4, (5.2.2) and (5.2.18) in turn for k = 1, 2, · · ·
that
\| F'(x_0)^{-1} F(x_{k+1}) \|
\le \int_0^1 \| F'(x_0)^{-1}(F'(x_k^\tau) - F'(x_k)) \|\, d\tau\, \| x_{k+1} - x_k \|
\le \int_0^1 \int_0^1 \frac{2\gamma\tau}{(1 - \gamma\|x_k^{\tau s} - x_0\|)^3}\, ds\, d\tau\, \| x_{k+1} - x_k \|^2
\le \int_0^1 \int_0^1 \frac{2\gamma\tau}{(1 - \gamma\|x_k - x_0\| - \gamma\tau s\|x_{k+1} - x_k\|)^3}\, ds\, d\tau\, \| x_{k+1} - x_k \|^2   (5.2.19)
= \frac{\gamma\,\|x_{k+1} - x_k\|^2}{(1 - \gamma\|x_k - x_0\| - \gamma\|x_{k+1} - x_k\|)(1 - \gamma\|x_k - x_0\|)^2}
\le \frac{\gamma\,(s_{k+1} - s_k)^2}{(1 - \gamma s_{k+1})(1 - \gamma s_k)^2} \left(\frac{\|x_{k+1} - x_k\|}{s_{k+1} - s_k}\right)^2
\le \frac{\gamma\,(s_{k+1} - s_k)^2}{(1 - \gamma s_{k+1})(1 - \gamma s_k)^2}.
(see also [27, p. 33, estimate (3.19)]) Then, in view of (5.1.2), (5.2.10), (5.2.17) and
and for k = 1, 2, · · ·
That is w ∈ U(xk+1, s? − sk+1 ). The induction for (5.2.14) and (5.2.15) is now com-
pleted. Lemma 5.2.5 implies that {sn } is a complete sequence. It follows from
(5.2.14) and (5.2.15) that {xn } is also a complete sequence in a Banach space X
and as such it converges to some x? ∈ U(x0 , s? ) (since U(x0 , s? ) is a closed set). By
letting k −→ ∞ in (5.2.19) we get F(x? ) = 0. Estimate (5.2.13) is obtained from
(5.2.12) by using standard majorization techniques (cf. [5, 13, 21, 28, 29]). Finally,
to show the uniqueness part, let y^\star ∈ U(x_0, s^\star) be a solution of equation (5.1.1). Using
(5.2.3) for x replaced by z_\tau = x^\star + \tau(y^\star - x^\star) and G = \int_0^1 F'(z_\tau)\, d\tau, we get as in
(5.2.9) that \| F'(x_0)^{-1}(G - F'(x_0)) \| < 1. That is, G^{-1} ∈ L(Y, X). Using the identity
0 = F(x^\star) - F(y^\star) = G(x^\star - y^\star), we deduce x^\star = y^\star.
Remark 5.2.9. (a) The convergence criteria in Theorem 5.2.8 are weaker than in The-
orem 5.1.1. In particular, Theorem 5.1.1 requires that operator F is twice Fréchet-
differentiable but our Theorem 5.2.8 requires only that F is Fréchet-differentiable.
Notice also that if F is twice Fréchet-differentiable, then (5.2.2) implies (5.1.3).
Moreover, in view of (5.1.7) and (5.2.9), we have that (5.1.5) =⇒ (5.2.11) but not
necessarily vice versa. Therefore, Theorem 5.2.8 can apply in cases when Theorem
5.1.1 cannot.
(b) Estimate (5.2.11) can be checked, since the scalar sequence is based on the initial data
γ_0, γ and η, especially in the case when s_i = s_{i+n} for some finite i. At this point, we
would like to know if it is possible to find convergence criteria stronger than (5.2.11)
but weaker than (5.1.5). To this end we first compare our majorizing sequence
{s_n} to the old majorizing sequence {t_n}.
Expanding the Applicability of Newton’s Method 81
(a) Scalar sequences {tn } and {sn } are increasingly convergent to t ? , s? , respectively.
(b) Sequence {xn } generated by Newton’s method is well defined, remains in U(x0 , r0 )
for each n = 0, 1, · · · and converges to a unique solution x? ∈ U(x0 , r0 ) of equation
F(x) = 0. Moreover, the following estimates hold for each n = 0, 1, · · ·
sn ≤ tn , (5.2.22)
Proof. According to Theorems 5.1.1 and 5.2.8 we only need to show (5.2.22) and (5.2.23),
since (5.2.24) follows from (5.2.22) by letting n → ∞. It follows from the definition of
sequences {t_n} and {s_n} (see (5.1.10) and (5.2.10)) that t_0 = s_0, t_1 = s_1, s_2 ≤ t_2 and s_2 - s_1 ≤ t_2 - t_1,
since γ_0 ≤ γ,
\frac{1}{1 - \gamma_0 s_0} \le \frac{1}{1 - \gamma t_0}, \quad \frac{1}{1 - \gamma s_1} = \frac{1}{1 - \gamma t_1}   (5.2.25)
and
\left(2 - \frac{1}{(1 - \gamma_0 s_1)^2}\right)^{-1} \le \left(2 - \frac{1}{(1 - \gamma_0 t_1)^2}\right)^{-1}.   (5.2.26)
Hence, (5.2.22) and (5.2.23) hold for n = 0, 1, 2. Suppose that (5.2.22) and (5.2.23) hold
for all k ≤ n. Then, we have that sk+1 ≤ tk+1 and sk+1 − sk ≤ tk+1 − tk , since γ0 ≤ γ,
\frac{1}{1 - \gamma s_{k-1}} \le \frac{1}{1 - \gamma t_{k-1}}, \quad \frac{1}{1 - \gamma s_k} \le \frac{1}{1 - \gamma t_k}   (5.2.27)
and
\left(2 - \frac{1}{(1 - \gamma_0 s_k)^2}\right)^{-1} \le \left(2 - \frac{1}{(1 - \gamma_0 t_k)^2}\right)^{-1}.   (5.2.28)
Remark 5.2.11. In view of (5.2.22)–(5.2.24), our error analysis is tighter and the informa-
tion on the location of the solution x? is at least as precise as the old one. Notice also that
estimates (5.2.22) and (5.2.23) hold as strict inequalities for n > 1 if γ0 < γ (see also the
numerical examples) and these advantages hold under the same or less computational cost
as before (see Remark 5.2.9).
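Proposition 5.2.10 is easy to illustrate numerically. The sketch below uses illustrative values of our own (γ = 1, γ_0 = 0.5, η = 0.15); it generates {t_n} from (5.1.10) and {s_n} from (5.2.10) and checks s_n ≤ t_n termwise.

```python
# Compare the old majorizing sequence {t_n} (5.1.10) with the tighter {s_n} (5.2.10).
gamma, gamma0, eta = 1.0, 0.5, 0.15

t = [0.0, eta]
for _ in range(15):
    tn, tm = t[-1], t[-2]
    t.append(tn + gamma * (tn - tm) ** 2
             / ((2 - 1 / (1 - gamma * tn) ** 2) * (1 - gamma * tn) * (1 - gamma * tm) ** 2))

s = [0.0, eta]
# second step of (5.2.10) uses gamma0 in the numerator
s.append(s[1] + gamma0 * (s[1] - s[0]) ** 2
         / ((2 - 1 / (1 - gamma0 * s[1]) ** 2) * (1 - gamma * s[1])))
for _ in range(14):
    sn, sm = s[-1], s[-2]
    s.append(sn + gamma * (sn - sm) ** 2
             / ((2 - 1 / (1 - gamma0 * sn) ** 2) * (1 - gamma * sn) * (1 - gamma * sm) ** 2))
```

With γ_0 strictly smaller than γ, the inequalities (5.2.22)–(5.2.23) are strict from n = 2 on, as Remark 5.2.11 notes.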
Next, we present our [11, Theorem 3.2]. This theorem shall be used to show that (5.1.5)
can be weakened.
Theorem 5.2.12. Let F : U(x0 , R) ⊆ X −→ Y be Fréchet-differentiable. Suppose there
exist parameters L ≥ L0 > 0 and η > 0 such that for all x, y ∈ U(x0 , R)
(b) Sequence {xn } generated by Newton’s method is well defined, remains in U(x0 , s? )
for each n = 0, 1, · · · and converges to a solution x? ∈ U(x0 , s? ) of equation F(x) = 0.
Moreover, the following estimates hold for each n = 0, 1, · · ·
\| x_{n+1} - x_n \| \le \frac{L\, \|x_n - x_{n-1}\|^2}{2\,(1 - L_0 \|x_n - x_0\|)} \le s_{n+1} - s_n
and
k xn − x? k≤ s? − sn .
(c) If there exists ς > s? such that ς < R and L0 (s? + ς) ≤ 2, then, the solution x? of
equation F(x) = 0 is unique in U(x0 , ς).
Remark 5.2.13. (a) If L_0 = L, convergence criterion (5.2.31) reduces to the Kantorovich
hypothesis [21], famous for its simplicity and clarity, for solving equations
h = 2 L \eta \le 1.   (5.2.32)
Notice that
L0 ≤ L (5.2.33)
holds in general and L0 /L can be arbitrarily small [11, 13]. We also have that
h \le 1 \implies h_1 \le 1   (5.2.34)
and
\frac{h_1}{h} \to 0 \quad \text{as} \quad \frac{L_0}{L} \to 0.
Moreover, if {\bar{s}_n} denotes the Kantorovich majorizing sequence, then
s_n \le \bar{s}_n,
s_{n+1} - s_n \le \bar{s}_{n+1} - \bar{s}_n
and
s^\star \le \bar{s}^\star = \lim_{n\to\infty} \bar{s}_n = \frac{2\eta}{1 + \sqrt{1 - h}}.
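This comparison can be checked numerically. The sketch below is our own illustration (values L = 1, L_0 = 0.5, η = 0.4, so h = 0.8 ≤ 1); it iterates the recursion suggested by the estimate in part (b) and recovers the Kantorovich sequence as the special case L_0 = L.

```python
# {s_n} of Theorem 5.2.12: s_{n+1} = s_n + L*(s_n - s_{n-1})**2 / (2*(1 - L0*s_n)),
# versus the Kantorovich sequence {sbar_n} obtained by setting L0 = L.
import math

def majorize(L, L0, eta, n=30):
    s = [0.0, eta]
    for _ in range(n):
        s.append(s[-1] + L * (s[-1] - s[-2]) ** 2 / (2 * (1 - L0 * s[-1])))
    return s

L, L0, eta = 1.0, 0.5, 0.4
h = 2 * L * eta                      # Kantorovich hypothesis: h <= 1
s = majorize(L, L0, eta)
sbar = majorize(L, L, eta)           # classical Kantorovich majorizing sequence
limit = 2 * eta / (1 + math.sqrt(1 - h))
print(s[-1], sbar[-1], limit)        # s_n <= sbar_n; sbar_n -> 2*eta/(1 + sqrt(1-h))
```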
(b) Let us show that Wang’s convergence criterion (5.1.5) can be weakened under the
Kantorovich hypotheses. In particular, suppose that (5.2.30) and (5.2.32) are satis-
fied. Then, (5.2.31) is also satisfied. Moreover, if F is twice Fréchet-differentiable on
U(x0 , 1/γ), then Wang’s condition (5.1.3) is certainly satisfied, if we choose γ = L/2.
Then, condition (5.2.32) becomes

γ η ≤ 1/4,   (5.2.35)

which improves (5.1.5). We must also show that s⋆ ≤ 1/γ. But the preceding inequality reduces to showing that h − 2 ≤ 2√(1 − h), which is true by (5.2.32). Clearly, in view of (5.2.31), (5.2.33)–(5.2.35), criterion (5.1.5) (i.e., (5.2.35)) can be improved even further, for γ0 = L0/2, if L0 ≤ L.
(c) Suppose Wang’s condition (5.1.3) is satisfied as well as criterion (5.1.5) on U(x0, r0). Recall that r0 = (1 − 1/√2)(1/γ). Then, for r ∈ [0, r0], we have that 1/(1 − γ r) ≤ √2. Then, in view of (5.1.3) and (5.2.30), we can choose L = 4√2 γ. Then, (5.1.5) becomes

L η ≤ 4 (3√2 − 4) = .970562748.

However, we must also have that t⋆ ≤ 1/L, where t⋆ is given in (5.1.7). By direct algebraic manipulation, we see that the preceding inequality is satisfied, if

.078526267 = 2√2 (2√2 − 1)/(8 − √2) ≤ L η ≤ 4√2/(8√2 − 1) = .548479169.
Hence, the last two estimates on ”L η” are satisfied, if the preceding inequality is
satisfied, which is weaker than (5.2.32) for L η ∈ (.5, .548479169]. However, the
preceding inequality may not be weaker than (5.2.31) (if γ0 = L0 /2) for L0 sufficiently
smaller than L.
f(r) = g(r) r α − 1/2,   H1(r) = α ε (2 − r)/(1 − ε r)²

and

H(r) = [α ε (2 − r)/(1 − ε r)²] / [2 (1 − β) (1 − α ε (2 − r)/(1 − ε r)²)] + α − r,

where

g(r) = (1/8) [4 ε (2 − r)/(1 − ε r)² + √(2 ε (2 − r)/((1 − ε r)² (1 − r)³)) + √(2 ε (2 − r)/((1 − ε r)² (1 − r)³) + 8 ε² (2 − r)²/(1 − ε r)⁴)].
Suppose that there exist intervals I_f, I_H and I_{H1} such that for some α ∈ I

I_f ⊂ I,   I_H ⊂ I,   I_{H1} ⊂ I

and

I0 = I_f ∩ I_H ∩ I_{H1} ≠ ∅.
Denote by r⋆ = r⋆(α) the largest element in I0. Moreover, suppose there exists a point x0 ∈ U(x0, R) such that F′(x0)⁻¹ ∈ L(Y, X) and

r⋆/γ ≤ R.   (5.2.39)
Then, the following assertions hold
(a) Scalar sequence {sn} is increasingly convergent to s⋆, which satisfies

η ≤ s⋆ ≤ s⋆⋆ = δ η,

where

δ = 1 + L0 η / (2 (1 − β)(1 − L0 η)),

β = 2L/(L + √(L² + 8 L0 L)) = 2M/(M + √(M² + 8 M0 M)),

M0 = L0/γ,   M = L/γ,

L0 = γ (2 − ε r⋆)/(1 − ε r⋆)²   and   L = 2γ/(1 − r⋆)³,   (5.2.40)
where sequence {sn }, s? and s?? are given in Theorem 5.2.12.
Remark 5.2.15. (a) It follows from the proof of Theorem 5.2.8 that function f can be replaced by f1 defined by

f1(r) = g(r) α − 1/2.   (5.2.41)
In practice, we shall employ both functions to see which one will produce the largest
possible upper bound r? for α.
α ≤ r1 = 0.179939475 · · · .   (5.2.43)
The results obtained in this chapter can be connected to the following notion [14].
Definition 5.2.17. A point x0 is said to be an approximate zero of the first kind for F if {xn} is well defined for each n = 0, 1, · · · and satisfies

‖xn+1 − xn‖ ≤ Ξ^(2ⁿ − 1) ‖x1 − x0‖ for some Ξ ∈ (0, 1).   (5.2.45)
Notice that if we start from an approximate zero x0 of the first kind, then the convergence of the Newton-Kantorovich method to x⋆ is very fast.
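The decay rate in (5.2.45) can be observed numerically. The sketch below (my own illustrative example, F(x) = x² − 2 with x0 = 1.5, not from the text) extracts, for each n, the smallest Ξ making the inequality hold at that step, and checks that it stays well below 1:

```python
def newton_sqrt2(x0, n):
    """Newton iterates for the illustrative function F(x) = x^2 - 2."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - (x * x - 2.0) / (2.0 * x))
    return xs

xs = newton_sqrt2(1.5, 5)
d0 = abs(xs[1] - xs[0])
# smallest Xi_n with |x_{n+1} - x_n| <= Xi_n^(2^n - 1) |x_1 - x_0|, for n >= 1
xis = [(abs(xs[n + 1] - xs[n]) / d0) ** (1.0 / (2 ** n - 1)) for n in range(1, 4)]
```

For a well-behaved starting point every Ξ_n is small and roughly constant, which is exactly the "approximate zero" behavior described above.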
provided that

((2 − r)/(1 − r)²) (H(r) + r) < 1   (5.2.46)

and

0 ≤ α ψ(r) ≤ Ξ < 1,   (5.2.47)

where

ψ(r) = 1 / ((1 − r)³ [1 − ((2 − r)/(1 − r)²) (H(r) + r)]).
Conditions (5.2.46) and (5.2.47) must hold respectively in Theorem 5.2.8 and Proposition
5.2.10 for r = r? , r1 . Then, x0 is an approximate zero of the first kind in all these results. If
γ0 = γ, then (5.2.42) holds for r1 = .179939475 · · ·. Using (5.2.45) we notice that (5.2.46)
and (5.2.47) hold at r1 . It then follows that (5.2.45) is satisfied with factor Ξ/η, where Ξ is
given by Ξ = α ψ(r1 ).
or γ = ∞ if F′(x0) is not invertible or the supremum in γ does not exist. Then, if D = X, the sufficient convergence condition of the Newton-Kantorovich method is given by α ≤ 0.130707. Rheinboldt in [25] improved Smale’s result by showing convergence of Newton’s method when α ≤ 0.15229240. Here, we showed convergence for α ≤ r1 = .179939475 · · · .
‖F′(x⋆)⁻¹ F″(x)‖ ≤ 2γ/(1 − γ ‖x − x⋆‖)³ for each x ∈ U(x⋆, r).   (5.3.1)
Remark 5.3.4. (a) Notice again that `? (r) ≤ `(r) and `/`? can be arbitrarily large.
(b) In order for us to cover the local convergence analysis of Newton’s method, let us define function f_ε : I_ε = [0, (1/ε)(1 − 1/√2)] −→ R by
Suppose that
ε > (1/2)(1 − 1/√2).   (5.3.5)

Then, we have that

f_ε(0) = −1 < 0 and f_ε((1/ε)(1 − 1/√2)) = (1/√2)(1 − 1/√2)(2 − (1 − 1/√2)) > 0.
Hence, it follows from the intermediate value theorem that function f_ε has a zero in the interior of I_ε. Denote by µ⋆_ε the minimal such zero. Define function g_ε : I_ε −→ R by

g_ε(t) = (1 − ε t)² (2 − t) t / ((1 − t)² (2 (1 − ε t)² − 1)).   (5.3.6)
Then, we have that
Set

R_ε = µ⋆_ε / γ.   (5.3.8)

It follows from the definition of f_ε, µ⋆_ε and g_ε that

R_1 = (3 − √6)/(3γ) ≤ R_ε.   (5.3.9)
3γ
Moreover, strict inequality holds if ε ≠ 1. Let us assume that F satisfies the γ⋆-center-Lipschitz condition at x⋆ on U(x⋆, (1/(ε γ))(1 − 1/√2)) with F(x⋆) = 0 and the γ-Lipschitz condition at x⋆ on U(x⋆, 1/γ). Then, for x0 ∈ U(x⋆, R_ε), we have the identity

x^{τs}_{n,⋆} = xn + τ s (x⋆ − xn) for each 0 ≤ τ ≤ 1, 0 ≤ s ≤ 1
and

‖∫₀¹ (F′(x^τ_{n,⋆}) − F′(xn)) dτ (x⋆ − xn)‖
≤ ∫₀¹ ∫₀¹ 2γ ‖x^{τs}_{n,⋆} − x⋆‖ / (1 − γ s ‖x^{τs}_{n,⋆} − x⋆‖)³ ds dτ ‖xn − x⋆‖   (5.3.12)
≤ [1/(1 − γ ‖xn − x⋆‖)² − 1] ‖xn − x⋆‖.
That is, we have by (5.3.10)–(5.3.12) that

‖xn+1 − x⋆‖ ≤ [(1 − ε γ ‖xn − x⋆‖)² (1 − (1 − γ ‖xn − x⋆‖)²)] / [(2 (1 − ε γ ‖xn − x⋆‖)² − 1) (1 − γ ‖xn − x⋆‖)²] ‖xn − x⋆‖   (5.3.13)
< g_ε(µ⋆_ε) ‖xn − x⋆‖ = ‖xn − x⋆‖ < R_ε.
Hence, we arrive at the following result on the local convergence of Newton’s method.
(a) There exists x? ∈ U(x0 , R) such that F(x? ) = 0 and F 0 (x? )−1 ∈ L (Y , X ).
(b) Operator F satisfies the γ⋆-center-Lipschitz condition at x⋆ on U(x⋆, (1/(ε γ))(1 − 1/√2)) for ε satisfying (5.3.5) and the γ-Lipschitz condition at x⋆ on U(x⋆, 1/γ).
Then, if x0 ∈ U(x? , Rε ), sequence {xn } generated by Newton’s method is well defined, re-
mains in U(x? , Rε ) for each n = 0, 1, · · · and converges to x? . Moreover, the following
estimate holds
‖xn+1 − xn‖ ≤ [γ (2 − γ ‖xn − x⋆‖) (1 − ε γ ‖xn − x⋆‖)²] / [(1 − γ ‖xn − x⋆‖)² (2 (1 − ε γ ‖xn − x⋆‖)² − 1)] ‖xn − x⋆‖².   (5.3.14)
Remark 5.3.6. If ε = 1 (i.e., γ⋆ = γ), our results reduce to the ones given by Wang [31] (see also [30, 32]). Otherwise, if

(1/2)(1 − 1/√2) ≤ γ⋆/γ < 1,   (5.3.15)

then, according to (5.3.9), our convergence radius is larger. Moreover, our error bounds are tighter if γ⋆ < γ.
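For ε = 1 the radius in (5.3.9) can be recovered numerically: solving g_1(t) = 1 with g_ε of (5.3.6) by bisection on (0, 1 − 1/√2) should return µ⋆_1 = (3 − √6)/3. A minimal sketch:

```python
import math

def g1(t):
    """g_eps of (5.3.6) with eps = 1, for 0 <= t < 1 - 1/sqrt(2)."""
    return ((1.0 - t) ** 2 * (2.0 - t) * t
            / ((1.0 - t) ** 2 * (2.0 * (1.0 - t) ** 2 - 1.0)))

def bisect(fun, lo, hi, tol=1e-13):
    """Bisection for a continuous fun with fun(lo) < 0 < fun(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fun(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# g1 - 1 changes sign on (0, 1 - 1/sqrt(2)); its zero is mu*_1
mu = bisect(lambda t: g1(t) - 1.0, 1e-9, 1.0 - 1.0 / math.sqrt(2.0) - 1e-9)
```

The computed zero matches the closed-form value (3 − √6)/3 ≈ 0.18350 that produces R_1 = (3 − √6)/(3γ).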
Remark 5.3.7. Let us define function f_{ε1} : I_{ε1} = [0, 1/ε1] −→ R for ε1 > 0 by
Suppose that

ε1 > 1/2.   (5.3.17)

Then, we have

f_{ε1}(0) = −1 < 0 and f_{ε1}(1/ε1) > 0.
Denote by µ⋆_{ε1} the minimal zero of f_{ε1} in the interior of I_{ε1}. Define function g_{ε1} : I_{ε1} −→ R by

g_{ε1}(t) = (2 − t) t / ((1 − t)² (1 − ε1 t)).   (5.3.18)
(a) There exists x? ∈ U(x0 , R) such that F(x? ) = 0 and F 0 (x? )−1 ∈ L (Y , X ).
(b) Operator F satisfies the center L0-Lipschitz condition at x⋆ on U(x⋆, 1/ε1) for ε1 satisfying (5.3.17) and the γ-Lipschitz condition at x⋆ on U(x⋆, 1/γ).
Then, if x0 ∈ U(x? , Rε1 ), sequence {xn } generated by Newton’s method is well defined,
remains in U(x? , Rε1 ) for each n = 0, 1, · · · and converges to x? . Moreover, the following
estimate holds

‖xn+1 − xn‖ ≤ γ (2 − γ ‖xn − x⋆‖) ‖xn − x⋆‖² / ((1 − γ ‖xn − x⋆‖)² (1 − ε1 ‖xn − x⋆‖)).   (5.3.20)
Example 5.4.1. (a) Consider γ = 1.8, γ0 = .44 and η = .1. Using (5.2.10) and (5.2.11), we get that

γ0/γ = .2444444444 ≤ 1 − 1/√2 = .2928932190,   1/γ = .5555555556,
(b) Consider now γ = .5, γ0 = .44 and η = .1. Using (5.2.10) and (5.2.11), we get that

γ0/γ = .88 > 1 − 1/√2 = .2928932190,   (1 − 1/√2)(1/γ0) = .665666406,
s2 = .1051130691, s3 = .1051300426, s4 = .1051300428
and
sn = s4 = .1051300428 for each n = 5, 6, · · · .
That is, sn < (1 − 1/√2)/γ0 for each n = 1, 2, · · · and condition (5.2.11) holds. Hence, our Theorem 5.2.8 is applicable. We also have that
α = .05 ≤ .171572876.
Hence the convergence criterion in [31] is also satisfied. We can now compare our
results of Theorem 5.2.8 (see also sequence {sn } given by (5.2.10)) to ones given in
[31, 32] (see also {tn } given by (5.1.10)). Table 5.4.1 shows that our error bounds
using sequence {sn } are tighter than those given in [32].
Example 5.4.2. Let function h : R −→ R be defined by

h(x) = 0 if x ≤ 0 and h(x) = x if x ≥ 0.

Define function F by

F(x) = ϖ − x + (1/18) x³ + x²/(1 − x) if x ≤ 1/2 and F(x) = ϖ − 71/144 + 2 x² if x ≥ 1/2,   (5.4.1)
L(u) = 2/(1 − u)³ + 1/6 for each u ∈ [0, 1)   (5.4.4)

L0(u) = 2/(1 − u)³ + 1/12 for each u ∈ [0, 1).   (5.4.5)

and

0 < F″(u) < F″(|u|) < L(|u|) for each u ≠ 1/2 with |u| < 1.   (5.4.7)
Let x, y ∈ U(0, 1) with |y| + |x − y| < 1. Then, it follows from (5.4.6) and (5.4.7) that

|F′(x) − F′(y)| ≤ |x − y| ∫₀¹ F″(y + t (x − y)) dt ≤ |x − y| ∫₀¹ L(|y| + t |x − y|) dt.   (5.4.8)
Hence, F′ satisfies the L-Lipschitz condition (5.2.2) on U(0, 1). Similarly, using (5.4.2) and (5.4.5), we deduce that F′ satisfies the L0-center-Lipschitz condition (5.2.3) on U(0, 1). Notice that

L0(u) < L(u) for each u ∈ [0, 1).   (5.4.9)

Table 5.4.2 shows that our error bounds sn+1 − sn are finer than tn+1 − tn.
Example 5.4.3. Let X = Y = R², x0 = (1, 0), D = U(x0, 1 − κ) for κ ∈ (0, 1). Let us define function F on D as follows:

F(x) = (ζ1³ − ζ2 − κ, ζ1 + 3 ζ2 − ∛κ) with x = (ζ1, ζ2).   (5.4.10)

Using (5.4.10) we see that the γ-Lipschitz condition is satisfied for γ = 2 − κ. We also have that η = (1 − κ)/3.
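The system in (5.4.10), as reconstructed above, can be solved directly with Newton's method. A sketch (with the Case II value κ = .7 and a hand-coded 2×2 linear solve; the expected root (∛κ, 0) follows from substituting ζ2 = 0):

```python
def newton_example(kappa, x0, n_steps=25):
    """Newton's method for F(z1, z2) = (z1^3 - z2 - kappa, z1 + 3 z2 - kappa^(1/3)),
    with the 2x2 Jacobian [[3 z1^2, -1], [1, 3]] inverted via its determinant."""
    root3 = kappa ** (1.0 / 3.0)
    z1, z2 = x0
    for _ in range(n_steps):
        f1 = z1 ** 3 - z2 - kappa
        f2 = z1 + 3.0 * z2 - root3
        det = 9.0 * z1 ** 2 + 1.0
        # inverse Jacobian is (1/det) [[3, 1], [-1, 3 z1^2]]
        z1, z2 = z1 - (3.0 * f1 + f2) / det, z2 - (-f1 + 3.0 * z1 ** 2 * f2) / det
    return z1, z2

z = newton_example(0.7, (1.0, 0.0))
```

Since the reduced cubic is strictly increasing, the real solution is unique, and the iterates from x0 = (1, 0) converge quadratically to it.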
Case I. Let κ = .6255. Then we notice that (5.1.5) is not satisfied, since α = .1715834166 > 3 − 2√2 = .171572875. Hence there is no guarantee that Newton’s method starting from x0 will converge to x⋆ = (∛κ, 0) = (.85521599, 0) (cf. [14, 19, 26, 27, 30, 31, 32]). However, our results can apply. Indeed, using the definition of the Lipschitz and center-Lipschitz conditions, we have that L0 = 3 − κ and L = 4√2 (2 − κ). Hence, (5.2.31) is satisfied, since h1 = L1 η = .3396683409 < .5. We conclude that Theorem 5.2.12 is applicable and iteration {xn} converges to x⋆.
Case II. Let κ = .7. It can be seen that condition (5.1.5) holds, since α = .13 ≤ 3 − 2√2. We also obtain that h1 = .2626128133 < .5. We get in turn that 1/γ = 0.7692307,

(1/γ0)(1 − 1/√2) = 0.2899932 and 1 − 1/√2 = .29289321 < 0.776923.
Then condition (5.2.31) also holds. Using Theorem 5.2.8, the γ0-center-Lipschitz condition is satisfied if

‖F′(x0)⁻¹ (F′(x) − F′(x0))‖ ≤ 1/(1 − γ0 ‖x − x0‖)² − 1,
which is certainly satisfied for, say, γ0 = 1.01. Note that γ0 < 1.3 = γ. Table 5.4.3 compares the sequences {sn}, {tn} and the error bounds tn+1 − tn, sn+1 − sn. We also observe that {sn} is a finer majorizing sequence than {tn}.
Conclusion
A convergence analysis of Newton’s method is provided for approximating a locally unique solution of a nonlinear equation in a Banach space setting. Using Smale’s α-theory and
the center-Lipschitz condition, we presented a new convergence analysis with larger con-
vergence domain and weaker sufficient convergence conditions. Moreover, these ad-
vantages are obtained under the same computational cost as in earlier studies such as
[14, 19, 27, 30, 31, 32]. Numerical examples validating the theoretical results are also
provided in this chapter.
References
[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators.
Numer. Funct. Anal. Optim. 25 (2004), 397–405.
[2] Argyros, I.K., A convergence analysis for Newton’s method based on Lipschitz, center
Lipschitz conditions and analytic operators. PanAmer. Math. J. 13 (2003), 35–42.
[3] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space. J. Math. Anal. Appl. 298 (2004),
374–397.
[4] Argyros, I.K., On the Newton-Kantorovich hypothesis for solving equations. J. Comp.
Appl. Math. 169 (2004), 315–332.
[5] Argyros, I.K., Computational theory of iterative methods. Series: Studies in Comp.
Math., 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co. New York, U.S.A,
2007.
[7] Argyros, I.K., A new semilocal convergence theorem for Newton’s method under
a gamma-type condition. Atti Semin. Mat. Fis. Univ. Modena Reggio Emilia 56
(2008/09), 31–40.
[8] Argyros, I.K., Semilocal convergence of Newton’s method under a weak gamma con-
dition. Adv. Nonlinear Var. Inequal. 13 (2010), 65–73.
[9] Argyros, I.K., A semilocal convergence analysis for directional Newton methods.
Math. Comput. 80 (2011), 327–343.
[10] Argyros, I.K., Hilout, S., Extending the Newton-Kantorovich hypothesis for solving
equations. J. Comput. Appl. Math. 234 (2010), 2993–3006.
[11] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method.
J. Complexity 28 (2012), 364–387.
[12] Argyros, I.K., Hilout, S., Convergence of Newton’s method under weak majorant condition. J. Comput. Appl. Math. 236 (2012), 1892–1902.
[13] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press/Taylor and Francis Publ., New York, 2012.
[14] Cianciaruso, F. Convergence of Newton-Kantorovich approximations to an approxi-
mate zero. Numer. Funct. Anal. Optim. 28 (2007), 631–645.
[15] Dedieu, J.P., Points fixes, zéros et la méthode de Newton, 54, Springer, Berlin, 2006.
[16] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish) Gac. R. Soc. Mat. Esp. 13
(2010), 53–76.
[17] Ezquerro, J.A., Hernández, M.A., An improvement of the region of accessibility of
Chebyshev’s method from Newton’s method. Math. Comp. 78 (2009), 1613–1627.
[18] Ezquerro, J.A., Hernández, M.A., Romero, N., Newton-type methods of high order
and domains of semilocal and global convergence. App. Math. Comp. 214 (2009),
142–154.
[19] Guo, X., On semilocal convergence of inexact Newton methods. J. Comput. Math. 25
(2007), 231–242.
[20] Hernández, M.A., A modification of the classical Kantorovich conditions for New-
ton’s method. J. Comp. Appl. Math. 137 (2001), 201–205.
[21] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[22] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[23] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method. J. Complexity 25 (2009) 38–62.
[24] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton-Kantorovich type theorems. J. Complexity 26 (2010), 3–42.
[25] Rheinboldt, W.C., On a theorem of S. Smale about Newton’s method for analytic mappings. Appl. Math. Lett. 1 (1988), 69–72.
[26] Shen, W., Li, C., Kantorovich-type convergence criterion for inexact Newton methods.
Appl. Numer. Math. 59 (2009) 1599–1611.
[27] Shen, W., Li, C., Smale’s α-theory for inexact Newton methods under the γ-condition.
J. Math. Anal. Appl. 369 (2010), 29–42.
[28] Smale, S., Newton’s method estimates from data at one point. The merging of dis-
ciplines: new directions in pure, applied, and computational mathematics (Laramie,
Wyo., 1985) (1986), 185–196, Springer, New York.
[29] Smale, S., Algorithms for solving equations. Proceedings of the International
Congress of Mathematicians, Vol. 1, 2 (Berkeley, Calif., 1986) (1987), 172–195, Amer.
Math. Soc., Providence, RI.
[30] Wang, D.R., Zhao, F.G. The theory of Smale’s point estimation and its applications,
Linear/nonlinear iterative methods and verification of solution (Matsuyama, 1993). J.
Comput. Appl. Math. 60 (1995), 253–269.
[31] Wang, X.H., Convergence of Newton’s method and inverse function theorem in Ba-
nach space. Math. Comp. 68 (1999), 169–186.
[32] Wang, X.H., Han, D.F., On dominating sequence method in the point estimate and
Smale theorem. Sci. China Ser. A 33 (1990), 135–144.
[33] Yakoubsohn, J.C., Finding zeros of analytic functions: α–theory for Secant type
method. J. Complexity 15 (1999), 239–281.
[34] Zabrejko, P.P., Nguen, D.F., The majorant method in the theory of Newton-
Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim.
9 (1987), 671–684.
Chapter 6
Newton-Type Methods on
Riemannian Manifolds under
Kantorovich-Type Conditions
6.1. Introduction
Let us suppose that F is an operator defined on an open convex subset Ω of a Banach space
E. Let us denote by DF (xn ) the first Fréchet derivative of F at xn .
Given an integer m and an initial point x0 ∈ E, we move from xn to xn+1 through an intermediate sequence {y_n^i}_{i=0}^m, y_n^0 = xn , which is a generalization of the Newton (m = 1) and simplified Newton (m = ∞) methods:

y_n^1 = y_n^0 − DF(y_n^0)⁻¹ F(y_n^0)
y_n^2 = y_n^1 − DF(y_n^0)⁻¹ F(y_n^1)
⋮
x_{n+1} = y_n^m = y_n^{m−1} − DF(y_n^0)⁻¹ F(y_n^{m−1}).
This family of methods was introduced by E. Shamanskii [43]. Under appropriate conditions, these iterative methods converge to a root x∗ of the equation F (x) = 0. Moreover, if x0 is sufficiently near x∗, the method has order of convergence at least m + 1. See [33, 38, 43, 46]. In particular, notice that [38] uses a modification of D F(xn ) at each sub-step. In [39, 40, 41], Parida and Gupta provided some recurrence relations to establish a convergence analysis for third-order Newton-type methods under Lipschitz or Hölder conditions on the second Fréchet derivative. A modification of the approach used in [39] and some applications are presented by Chun et al. in [19]. Recently, Argyros and Ren [17] expanded the applicability of Halley’s method using a center-Lipschitz condition on the second Fréchet derivative instead of a Lipschitz condition.
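A minimal scalar sketch of the family above (my own illustrative implementation and test problem, not from the text): each outer step freezes the derivative at y_n^0 = x_n and reuses it across the m sub-steps.

```python
def shamanskii(F, dF, x0, m=2, n_outer=20):
    """Shamanskii-type iteration: freeze the derivative at y_n^0 = x_n
    and apply m corrections with it (m = 1 is plain Newton)."""
    x = x0
    for _ in range(n_outer):
        d = dF(x)            # evaluated (and, in R^n, factored) once per outer step
        y = x
        for _ in range(m):
            y = y - F(y) / d
        x = y
    return x

# illustrative test problem, not from the text: F(x) = x^3 - 8, root 2
root = shamanskii(lambda x: x ** 3 - 8.0, lambda x: 3.0 * x ** 2, x0=3.0, m=2)
```

The payoff is that the expensive derivative evaluation (a Jacobian factorization in R^n) is amortized over m cheap corrections per outer step.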
On the other hand, in recent years attention has been paid to studying Newton’s method on manifolds, since there are many numerical problems posed on manifolds that arise naturally in many contexts. Some examples include eigenvalue problems, minimization problems with orthogonality constraints, optimization problems with equality constraints, and invariant subspace computations. See for instance [1, 2, 3, 7, 15, 20, 21, 27, 29, 35, 36, 48, 49]. For these problems, one has to compute solutions of equations or to find zeros of a vector field on Riemannian manifolds.
The study of the convergence of iterative methods usually centers on two types: semilocal and local convergence analysis. The semilocal convergence analysis is based on the information around an initial point and gives criteria ensuring the convergence of iterative methods, while the local one is based on the information around a solution and provides estimates of the radii of convergence balls. There is a plethora of studies on the weakening and/or extension of the hypotheses made on the underlying operators; see for example [4, 5, 6, 12, 14, 32, 34, 48, 49].
The semilocal convergence analysis of Newton’s method is based on the celebrated Kantorovich theorem. This theorem is a fundamental result in numerical analysis, e.g., for providing an iterative method for computing zeros of polynomials or of systems of nonlinear equations. Moreover, this theorem is a very useful result in nonlinear functional analysis, e.g., for establishing that a nonlinear equation in an abstract space has a solution. Let us recall Kantorovich’s theorem in a Banach space setting.
Suppose that for some x0 ∈ Ω, DF (x0 ) is invertible and that for some a > 0 and b ≥ 0 :

‖DF(x0)⁻¹‖ ≤ a,

‖DF(x0)⁻¹ F(x0)‖ ≤ b,

h = a b l ≤ 1/2   (6.1.1)

and

B(x0, t∗) ⊆ Ω, where t∗ = (1/(a l))(1 − √(1 − 2h)).

If

v_k = −DF(x_k)⁻¹ F(x_k),
x_{k+1} = x_k + v_k,

then {x_k}_{k∈N} ⊆ B(x0, t∗) and x_k −→ p∗, which is the unique zero of F in B[x0, t∗]. Furthermore, if h < 1/2 and B(x0, r) ⊆ Ω with

t∗ < r ≤ t∗∗ = (1/(a l))(1 + √(1 − 2h)),

then p∗ is also the unique zero of F in B(x0, r). Also, the error bound is:

‖x_k − x∗‖ ≤ (b/h)(2h)^(2^k);   k = 1, 2, . . .
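The theorem's quantities are easy to check on a scalar instance; the sketch below (my own illustrative data, F(x) = x² − 2 with x0 = 1.5, so DF(x) = 2x is Lipschitz with l = 2) computes a, b, h and t∗, then verifies that the Newton iterates never leave B(x0, t∗):

```python
import math

# illustrative scalar instance, not from the text: F(x) = x^2 - 2 on R, x0 = 1.5
x0, l = 1.5, 2.0
a = 1.0 / (2.0 * x0)                    # ||DF(x0)^{-1}||
b = abs((x0 * x0 - 2.0) / (2.0 * x0))   # ||DF(x0)^{-1} F(x0)||
h = a * b * l
t_star = (1.0 - math.sqrt(1.0 - 2.0 * h)) / (a * l)
t_star_star = (1.0 + math.sqrt(1.0 - 2.0 * h)) / (a * l)

# run Newton and record how far the iterates stray from x0
x, max_dev = x0, 0.0
for _ in range(10):
    x = x - (x * x - 2.0) / (2.0 * x)
    max_dev = max(max_dev, abs(x - x0))
```

Here h = 1/18 ≤ 1/2, so the theorem applies and the zero √2 lies within t∗ of x0.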
Newton-Type Methods on Riemannian Manifolds 101
Although the concepts will be defined later on, to extend the method to Riemannian manifolds, preliminarily we will say that the derivative of F at xn is replaced by the covariant derivative of X at pn :

∇_{(.)} X(pn) : T_{pn}M −→ T_{pn}M,   v −→ ∇_Y X,

where Y is a vector field satisfying Y (p) = v. We adopt the notation D X(p) v = ∇_Y X(p); hence D X(p) is a linear mapping of T_pM into T_pM. So, in this new context, the Newton correction −DF(xn)⁻¹ F(xn) is written as

−D X(pn)⁻¹ X(pn)

or

−(∇_{(.)} X(pn))⁻¹ X(pn).

Now we can write Kantorovich’s theorem in the new context. A proof of this theorem can be found in [27]. We will say that a singularity of a vector field X is a point p ∈ M for which X(p) = 0.
t∗ < r ≤ t∗∗ = (1/(a l))(1 + √(1 − 2h)),

then p∗ is also the unique singularity of X in B(p0, r). The error bound is:

d(p_k, p∗) ≤ (b/h)(2h)^(2^k);   k = 1, 2, . . .   (6.1.3)
Clearly,

l0 ≤ l   (6.1.4)

holds in general and l/l0 can be arbitrarily large [6, 12, 14]. In particular, we show that in the case of the modified Newton’s method, condition (3) of Theorem 6.1.2 can be replaced by

h0 = a b l0 ≤ 1/2,   (6.1.5)

whereas in the case of Newton’s method, condition (3) of Theorem 6.1.2 can be replaced by

h1 = (a b/8)(l + 4 l0 + √(l² + 8 l0 l)) ≤ 1/2,   (6.1.6)

or by

h2 = (a b/8)(4 l0 + √(l0 l) + √(8 l0² + l0 l)) ≤ 1/2.   (6.1.7)

Notice that

h ≤ 1/2 =⇒ h1 ≤ 1/2 =⇒ h2 ≤ 1/2 =⇒ h0 ≤ 1/2   (6.1.8)

but not necessarily vice versa unless if l0 = l. Moreover, we have that

h1/h −→ 1/4,   h2/h1 −→ 0,   h2/h −→ 0 and h0/h −→ 0 as l0/l −→ 0.   (6.1.9)
The preceding estimates show by how many times (at most) the applicability of the modified
Newton’s method or Newton’s method can be extended. Moreover, we show that under the
new convergence conditions, the error estimates on the distances d(pn , pn−1 ), d(pn , p∗ )
can be tighter and the information on the location of the solution at least as precise as in
Theorem 6.1.2.
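The ordering behind (6.1.8) and the limits in (6.1.9) can be spot-checked numerically. A sketch with arbitrary sample values of a, b, l0, l (my own choices, not from the text):

```python
import math

def conditions(a, b, l0, l):
    """The quantities h, h0, h1, h2 of (6.1.1) and (6.1.5)-(6.1.7)."""
    h = a * b * l
    h0 = a * b * l0
    h1 = a * b / 8.0 * (l + 4.0 * l0 + math.sqrt(l * l + 8.0 * l0 * l))
    h2 = a * b / 8.0 * (4.0 * l0 + math.sqrt(l0 * l)
                        + math.sqrt(8.0 * l0 * l0 + l0 * l))
    return h, h0, h1, h2

# arbitrary sample values with l0 <= l
h, h0, h1, h2 = conditions(a=1.0, b=0.1, l0=1.0, l=4.0)
# near the limit l0/l -> 0
hh, hh0, hh1, hh2 = conditions(a=1.0, b=0.1, l0=1e-12, l=4.0)
```

With l0 ≤ l one observes h0 ≤ h2 ≤ h1 ≤ h, so each successive condition is weaker, and as l0/l → 0 the ratios approach the limits stated in (6.1.9).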
The chapter is organized as follows: Section 6.2. contains some definitions and fundamental properties of Riemannian manifolds. The convergence of the simplified Newton’s method and the order of convergence using normal coordinates are given in Sections 6.3. and 6.4.. A family of high-order Newton-type methods, precise majorizing sequences and the corresponding convergence results are provided in Sections 6.5. and 6.6..
Definition 6.2.1. [28, 37, 38, 47] A differentiable manifold of dimension m is a set M and a family of injective mappings x_α : U_α ⊂ R^m −→ M of open sets U_α of R^m into M such that:

(i) ∪_α x_α(U_α) = M.

(ii) for any pair α, β with x_α(U_α) ∩ x_β(U_β) = W ≠ ∅, the sets x_α⁻¹(W) and x_β⁻¹(W) are open sets in R^m and the mappings x_β⁻¹ ∘ x_α are differentiable.

(iii) The family {(U_α, x_α)} is maximal relative to the conditions (i) and (ii).
The pair (U_α, x_α) (or the mapping x_α) with p ∈ x_α(U_α) is called a parametrization (or system of coordinates) of M at p; x_α(U_α) is then called a neighborhood at p and (x_α(U_α), x_α⁻¹) is called a coordinate chart. A family {(U_α, x_α)} satisfying (i) and (ii) is called a differentiable structure on M.
and provides a differentiable structure of dimension 2m [22]. Next, we define the concept
of Riemannian metric:
in which dx⁻¹ is the tangent map of x⁻¹ and each g_{ij} is a differentiable operator on U, i, j = 1, 2, . . ., n. The operators g_{ij} are called the local representatives of the Riemannian metric.
The inner product ⟨·, ·⟩_p induces in a natural way the norm ‖·‖_p. The subscript p is usually deleted whenever there is no possibility of confusion.

If p and q are two elements of the manifold M and c : [0, 1] −→ M is a piecewise smooth curve connecting p and q, then the arc length of c is defined by

l(c) = ∫₀¹ ‖c′(t)‖ dt = ∫₀¹ ⟨dc/dt, dc/dt⟩^{1/2} dt,   (6.2.1)
Definition 6.2.3. Let χ(M) be the set of all vector fields of class C∞ on M and D(M) the ring of real-valued operators of class C∞ defined on M, that is:

χ(M) = C∞(M, T_{(.)}M),   D(M) = C∞(M, R).

An affine connection ∇ on M is a mapping (X, Y) −→ ∇_X Y of χ(M) × χ(M) into χ(M) such that:

i) ∇_{f X + g Y} Z = f ∇_X Z + g ∇_Y Z.

ii) ∇_X(Y + Z) = ∇_X Y + ∇_X Z.

iii) ∇_X(f Y) = f ∇_X Y + X(f) Y,
D X(p) : T_pM −→ T_pM,   v −→ D X(p)(v) = ∇_Y X(p),   (6.2.4)

where Y is a vector field satisfying Y (p) = v. The value D X(p)(v) depends only on the tangent vector v = Y (p), since ∇ is linear in Y; thus we can write
smooth curve c and any pair of parallel vector fields P and P′ along c, we have that ⟨P, P′⟩ is constant, or equivalently

(d/dt)⟨X, Y⟩ = ⟨∇_{c′(t)} X, Y⟩ + ⟨X, ∇_{c′(t)} Y⟩,

where X and Y are vector fields along the differentiable curve c : I −→ M (see [22], [45]).
We say that ∇ is symmetric if
The theorem of Levi-Civita (see [45]) establishes that there exists a unique symmetric affine connection ∇ on M compatible with the metric. This connection is called the Levi-Civita connection.

Sometimes, by abuse of language, we refer to the image γ(I) of a geodesic γ as a geodesic. A basic property of a geodesic is that γ′(t) is parallel along γ(t); this implies that ‖γ′(t)‖ is constant.
Let B (p, r) and B [p, r] be respectively the open geodesic and the closed geodesic ball
with center p and radius r, that is:
also if v ∈ T_pM, there exists a unique minimizing geodesic γ such that γ(0) = p and γ′(0) = v. The point γ(1) is called the image of v by the exponential map at p; that is, there exists a well defined map

exp_p : T_pM −→ M   (6.2.6)

such that

exp_p(v) = γ(1),

and for any t ∈ [0, 1]

γ(t) = exp_p(t v).

It can be shown that exp_p defines a diffeomorphism of a neighborhood Û of the origin 0_p ∈ T_pM onto a neighborhood U of p ∈ M, called a normal neighborhood of p (see [22]).
ϕ := exp_p ∘ f : R^m −→ U.

One of the most important properties of the normal coordinates is that the geodesics passing through p are given by linear equations (see [44]).

The exponential map has many important properties [22], [44]. When the exponential map is defined for each value of the parameter t ∈ R, we will say that the Riemannian manifold M is geodesically complete (or simply complete). The Hopf–Rinow theorem (see [22]) also establishes that the property of a Riemannian manifold being geodesically complete is equivalent to being complete as a metric space.
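A concrete instance of these notions (my own example, not from the text): on the unit sphere S² ⊂ R³ the geodesics through p are great circles, the exponential map has the closed form exp_p(v) = cos(‖v‖) p + sin(‖v‖) v/‖v‖ for v tangent at p, and the geodesic distance is d(p, q) = arccos⟨p, q⟩. A sketch checking that exp_p maps the ray tv isometrically onto a geodesic, as the linearity of geodesics in normal coordinates suggests:

```python
import math

def exp_sphere(p, v):
    """Exponential map on the unit sphere S^2 at p, for v tangent at p
    (closed form: exp_p(v) = cos(|v|) p + sin(|v|) v/|v|)."""
    nv = math.sqrt(sum(vi * vi for vi in v))
    if nv == 0.0:
        return p
    return tuple(math.cos(nv) * pi + math.sin(nv) * vi / nv
                 for pi, vi in zip(p, v))

def dist_sphere(p, q):
    """Geodesic distance on the unit sphere: arccos of the inner product."""
    c = sum(pi * qi for pi, qi in zip(p, q))
    return math.acos(max(-1.0, min(1.0, c)))

p = (1.0, 0.0, 0.0)
v = (0.0, 0.3, 0.4)       # tangent at p, norm 0.5
q = exp_sphere(p, v)
```

Since the sphere is compact, it is geodesically complete, and exp_p is indeed defined for all t ∈ R here.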
Definition 6.2.6. [28, 37, 38, 47] Let c be a piecewise smooth curve. For any pair a, b ∈ R, we define the parallel transport along c, denoted by P_{c,a,b}, as the map sending v ∈ T_{c(a)}M to V(c(b)) ∈ T_{c(b)}M, where V is the unique vector field along c such that ∇_{c′(t)} V = 0 and V(c(a)) = v.

It is easy to show that P_{c,a,b} is linear and one-one; thus P_{c,a,b} is an isomorphism between any two tangent spaces T_{c(a)}M and T_{c(b)}M. Its inverse is the parallel translation along the reversed portion of c from V(c(b)) to V(c(a)); actually, P_{c,a,b} is an isometry between T_{c(a)}M and T_{c(b)}M. Moreover, for a positive integer i and for all (v1, v2, . . ., vi) ∈ (T_{c(a)}M)^i, we define P^i_c as

P^i_{c,a,b} : (T_{c(a)}M)^i −→ (T_{c(b)}M)^i,

where

P^i_{c,a,b}(v1, v2, . . ., vi) = (P_{c,a,b}(v1), P_{c,a,b}(v2), . . ., P_{c,a,b}(vi)).
The parallel transport has the important properties:

D X : C^k(TM) −→ C^{k−1}(TM),   Y −→ D X(Y) = ∇_Y X,   (6.2.9)

where TM is the tangent bundle; this is similar to the higher order Fréchet derivative, see [18]. We define the higher order covariant derivatives, see [45], as the multilinear map or j-tensor:

D^j X : (C^k(TM))^j −→ C^{k−j}(TM)
given by

D² X : C^k(TM) × C^k(TM) −→ C^{k−2}(TM)

and

The multilinearity refers to the structure of C^k(M)-module, such that the value of D^j X(Y1, Y2, . . ., Y_{j−1}, Y) at p depends only on the j-tuple

(v1, v2, . . ., v_j) = (Y1(p), Y2(p), . . ., Y_{j−1}(p), Y(p)) ∈ (T_pM)^j.

Hence we can define

D^j X(p) : (T_pM)^j −→ T_pM

by

D^j X(p)(v1, v2, . . ., v_j) = D^j X(Y1, Y2, . . ., Y_{j−1}, Y)(p).   (6.2.12)
Definition 6.2.7. [28, 37, 38, 47] Let M be a Riemannian manifold, Ω ⊆ M an open convex set and X ∈ χ(M). The covariant derivative D X = ∇_{(.)} X is Lipschitz with constant l > 0, if for any geodesic γ and a, b ∈ R so that γ([a, b]) ⊆ Ω, it holds that:

‖P_{γ,b,a} D X(γ(b)) P_{γ,a,b} − D X(γ(a))‖ ≤ l ∫_a^b ‖γ′(t)‖ dt.   (6.2.13)

Note that P_{γ,b,a} D X(γ(b)) P_{γ,a,b} and D X(γ(a)) are both operators defined on the same tangent plane T_{γ(a)}M. If M is a Euclidean space, the above definition coincides with the usual Lipschitz definition for the operator DF : M −→ M.
Proposition 6.2.8. [28, 37, 38, 47] Let c be a curve in M and X be a C1 vector field on M,
then the covariant derivative of X in the direction of c0 (s) is
Note that if M = R^n the previous proposition agrees with the definition of the classic directional derivative in R^n (see [45]).
It is also possible to obtain a version of the fundamental theorem of calculus for mani-
folds:
P_{c,t,0} D X(c(t)) c′(t) = D X(c(0)) c′(0) + ∫_0^t P_{c,s,0} D² X(c(s))(c′(s), c′(s)) ds.   (6.2.16)
Proof. Let us consider the vector field along the geodesic c(s)

hence

P_{c,t,0} D X(c(t)) c′(t) = D X(c(0)) c′(0) + ∫_0^t P_{c,s,0} D(D X(c(s)) c′(s)) c′(s) ds;

by (6.2.11),

D² X(c(s))(c′(s), c′(s)) = ∇_{c′(s)}(D X(c(s)) c′(s)) − D X(c(s)) ∇_{c′(s)} c′(s)
= D(D X(c(s)) c′(s)) c′(s) − D X(c(s)) ∇_{c′(s)} c′(s),
Theorem 6.2.11. Let c be a geodesic in M, [0, 1] ⊆ Dom(c) and X be a C² vector field on M; then

P_{c,1,0} X(c(1)) = X(c(0)) + D X(c(0)) c′(0) + ∫_0^1 (1 − t) P_{c,t,0} D² X(c(t))(c′(t), c′(t)) dt.   (6.2.18)
Proof. Consider the curve

f(s) = P_{c,s,0} X(c(s))

in T_{c(0)}M. We have that

f^{(n)}(s) = P_{c,s,0} D^{(n)} X(c(s))(c′(s), c′(s), · · · , c′(s))   (n times).   (6.2.19)

Then

f″(s) = P_{c,s,0} D² X(c(s))(c′(s), c′(s)),

and from Taylor’s theorem

f(1) = f(0) + f′(0)(1 − 0) + ∫_0^1 (1 − t) f″(t) dt.

Therefore

P_{c,1,0} X(c(1)) = X(c(0)) + D X(c(0)) c′(0) + ∫_0^1 (1 − t) P_{c,t,0} D² X(c(t))(c′(t), c′(t)) dt.
If

v_k = −P_{σ_k,0,1} D X(p0)⁻¹ P_{σ_k,1,0} X(p_k),
p_{k+1} = exp_{p_k}(v_k),   (6.3.1)

where {σ_k : [0, 1] −→ M}_{k∈N} is the minimizing geodesic family connecting p0, p_k, then {p_k}_{k∈N} ⊆ B(p0, t∗) and p_k −→ p∗, which is the only singularity of X in B[p0, t∗]. Furthermore, if h < 1/2 and B(p0, r) ⊆ Ω with

t∗ < r ≤ t∗∗ = (1/(a l))(1 + √(1 − 2h)),

then p∗ is also the only singularity of X in B(p0, r). The error bound is:

d(p_k, p∗) ≤ (b/h)(1 − √(1 − 2h))^{k+1};   k = 1, 2, . . .   (6.3.2)
First, we establish some results that are of primary relevance in this proof.
Lemma 6.3.2. Let M be a Riemannian manifold, Ω ⊆ M an open convex set, X ∈ χ(M) and D X ∈ Lip_l(Ω). Take p ∈ B(p0, r) ⊆ Ω, v ∈ T_pM, let σ : [0, 1] −→ M be a minimizing geodesic connecting p0, p and

γ(t) = exp_p(t v).

Then

P_{γ,t,0} X(γ(t)) = X(p) + P_{σ,0,1} t D X(p0) P_{σ,1,0} v + R(t)

with

‖R(t)‖ ≤ l ((t/2) ‖v‖ + d(p0, p)) t ‖v‖.
Proof. From Theorem 6.2.9, it follows that

P_{γ,t,0} X(γ(t)) − X(γ(0)) = ∫_0^t P_{γ,s,0} D X(γ(s)) γ′(s) ds;

since γ is a minimizing geodesic, γ′(t) is parallel and γ′(s) = P_{γ,0,s} γ′(0). Moreover γ′(0) = v, then

P_{γ,t,0} X(γ(t)) − X(p) = ∫_0^t P_{γ,s,0} D X(γ(s)) P_{γ,0,s} v ds.
Thus

P_{γ,t,0} X(γ(t)) − X(p) − P_{σ,0,1} t D X(p0) P_{σ,1,0} v
= ∫_0^t P_{γ,s,0} D X(γ(s)) P_{γ,0,s} v ds − P_{σ,0,1} t D X(p0) P_{σ,1,0} v
= ∫_0^t (P_{γ,s,0} D X(γ(s)) P_{γ,0,s} v − P_{σ,0,1} D X(p0) P_{σ,1,0} v) ds;

letting

R(t) = ∫_0^t (P_{γ,s,0} D X(γ(s)) P_{γ,0,s} v − P_{σ,0,1} D X(p0) P_{σ,1,0} v) ds,

and since D X ∈ Lip_l(Ω), we obtain

‖R(t)‖ ≤ ∫_0^t ‖P_{γ,s,0} D X(γ(s)) P_{γ,0,s} − D X(p) + D X(p) − P_{σ,0,1} D X(p0) P_{σ,1,0}‖ ‖v‖ ds
≤ ∫_0^t (‖P_{γ,s,0} D X(γ(s)) P_{γ,0,s} − D X(p)‖ + ‖D X(p) − P_{σ,0,1} D X(p0) P_{σ,1,0}‖) ‖v‖ ds
= ∫_0^t (‖P_{γ,s,0} D X(γ(s)) P_{γ,0,s} − D X(γ(0))‖ + ‖D X(σ(1)) − P_{σ,0,1} D X(σ(0)) P_{σ,1,0}‖) ‖v‖ ds
≤ l ∫_0^t (∫_0^s ‖γ′(τ)‖ dτ + d(p0, p)) ‖v‖ ds
= l ∫_0^t (∫_0^s ‖γ′(0)‖ dτ + d(p0, p)) ‖v‖ ds
= l ∫_0^t (s ‖γ′(0)‖ + d(p0, p)) ‖v‖ ds
= l ((t²/2) ‖v‖ + t d(p0, p)) ‖v‖.

Therefore,

‖R(t)‖ ≤ l ((t/2) ‖v‖ + d(p0, p)) t ‖v‖.
Now we can prove the simplified Kantorovich theorem on Riemannian manifolds. The proof of this theorem will be divided into two parts. First, we will prove that the simplified Kantorovich method is well defined, i.e., {pk}k∈N ⊆ B(p0, t∗); we will also prove the convergence of the method. In the second part, we will establish uniqueness.
112 Ioannis K. Argyros and Á. Alberto Magreñán
• CONVERGENCE
We consider the auxiliary real function f : R −→ R, defined by
    f(t) = (l/2) t² − (1/a) t + b/a.                                      (6.3.4)
Its discriminant is
    Δ = (1/a²)(1 − 2 l b a),
which is nonnegative because a b l ≤ 1/2. Thus f has at least one real root (which is unique when h = 1/2). If t∗ is the smallest root, a direct calculation shows that f′(t) < 0 for 0 ≤ t < t∗, so f is strictly decreasing in [0, t∗]. Therefore the (scalar) simplified Newton method can be applied to f; in other words, if t0 ∈ [0, t∗), for k = 0, 1, 2, . . ., we define
    tk+1 = tk − f(tk)/f′(0).
Then {tk}k∈N is well defined, strictly increasing and converges to t∗. Furthermore, if h = a b l < 1/2, then
    t∗ − tk ≤ (b/h)(1 − √(1 − 2h))^{k+1},  k = 1, 2, . . .,               (6.3.5)
see [32].
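For intuition, the scalar iteration above is easy to run numerically. The following sketch uses illustrative constants a, b, l (not from the text) with h = a b l < 1/2, iterates t_{k+1} = t_k − f(t_k)/f′(0) on the majorant (6.3.4), and checks convergence to the smallest root t∗:

```python
import math

# Illustrative constants (not from the text) with h = a*b*l = 0.4 < 1/2.
a, b, l = 1.0, 0.2, 2.0
h = a * b * l

def f(t):
    # Majorant function (6.3.4): f(t) = (l/2) t^2 - t/a + b/a.
    return 0.5 * l * t * t - t / a + b / a

# Smallest root of f: t* = (1 - sqrt(1 - 2h)) / (a l).
t_star = (1 - math.sqrt(1 - 2 * h)) / (a * l)

t = 0.0                        # t0 = 0
for k in range(60):
    t = t - f(t) / (-1 / a)    # t_{k+1} = t_k - f(t_k)/f'(0), with f'(0) = -1/a

assert abs(t - t_star) < 1e-10
```

The iterates increase monotonically toward t∗, at the linear rate 1 − √(1 − 2h) predicted by (6.3.5).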
Let us take as starting point t0 = 0. We want to show that Newton's iterates are well defined for any q ∈ B(p0, t∗) ⊆ Ω.
We define
    K(t) = { q ∈ B(p0, t) : ‖Pσ,0,1 D X (p0)⁻¹ Pσ,1,0 X (q)‖ ≤ f(t)/| f′(0)| = a f(t) },  0 ≤ t < t∗,   (6.3.6)
where σ : [0, 1] −→ M is the minimizing geodesic connecting p0 and q. Note that K(t) ≠ ∅, since p0 ∈ K(t).
Proposition 6.3.4. Under the hypotheses of either the Kantorovich or the simplified Kantorovich method, if q ∈ B(p0, t∗), then D X (q) is nonsingular and
    ‖D X (q)⁻¹‖ ≤ 1/| f′(λ)|,  where λ = d(p0, q) < t∗.
Since Pα,1,0 and Pα,0,1 are linear and isometric and D X (p0) is nonsingular, we have that φ is linear and nonsingular, and
    ‖φ⁻¹‖ = ‖D X (p0)⁻¹‖ ≤ a = 1/| f′(0)|,
with α([0, 1]) ⊆ B(p0, t∗). Since d(p0, q) < t∗, D X ∈ Lip_l(Ω) and ‖α′(0)‖ = λ. Therefore
    ‖D X (q) − φ‖ ≤ l λ.                                                  (6.3.8)
Therefore, for any q ∈ B(p0, t∗), we can apply the Kantorovich methods
    t+ = t + f(t)/| f′(0)|,
    q+ = exp_q( −Pσ,0,1 D X (p0)⁻¹ Pσ,1,0 X (q) ).
Since
    γ(1) = exp_q( −Pσ,0,1 D X (p0)⁻¹ Pσ,1,0 X (q) ) = q+,
we have
    d(p0, q+) = d(p0, γ(1)) ≤ t + f(t)/| f′(0)| = t+,
therefore
    q+ ∈ B(p0, t+) ⊂ B(p0, t∗).
Moreover, if σ+ : [0, 1] −→ M is the minimizing geodesic connecting p0 and q+, then
    ‖−Pσ+,0,1 D X (p0)⁻¹ Pσ+,1,0 X (q+)‖ ≤ ‖D X (p0)⁻¹‖ ‖X (q+)‖.
By Theorem 6.2.10,
    ‖X (q+)‖ = ‖X (γ(1))‖
        ≤ l ( (1/2) ‖v‖ + d(p0, p) ) ‖v‖
        ≤ l ( (1/2) ‖Pσ,0,1 D X (p0)⁻¹ Pσ,1,0 X (q)‖ + t ) ‖Pσ,0,1 D X (p0)⁻¹ Pσ,1,0 X (q)‖
        ≤ l ( (1/2) f(t)/| f′(0)| + t ) f(t)/| f′(0)|.
We thus conclude
    ‖X (q+)‖ ≤ f(t+)/| f′(0)|,
and therefore
    q+ ∈ K(t+).
Now we are going to prove that starting from any point of K (t) the simplified Newton
method converges.
Lemma 6.3.5. Let 0 ≤ t < t∗, q ∈ K(t) and define
    τ0 = t,   τk+1 = τk − f(τk)/f′(0)  for each k = 0, 1, · · · .
Then the sequence generated by the simplified Newton method starting with the point q0 = q is well defined for any k and
    qk ∈ K(τk).                                                           (6.3.9)
Moreover {qk}k∈N converges to some q∗ ∈ B(p0, t∗), X(q∗) = 0 and
    d(q∗, qk) ≤ (b/h)(1 − √(1 − 2h))^{k+1},  k = 1, 2, . . .
Proof. It is clear that the sequence {τk}k∈N is the sequence generated by Newton's method for solving f(t) = 0. Therefore, {τk}k∈N is well defined, strictly increasing and converges to the root t∗ (see the definition of f). By hypothesis, q0 ∈ K(τ0); suppose that the points q0, q1, . . ., qk are well defined. Then, using Banach's Lemma, we conclude that qk+1 is well defined. Furthermore,
    d(qk+1, qk) ≤ ‖−Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 X (qk)‖.
Since
    qk+1 = exp_{qk}( −Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 X (qk) )
and σk : [0, 1] −→ M is the minimizing geodesic connecting p0, qk, from Lemma 6.3.5 and using (6.3.9) we obtain
    d(qk+1, qk) ≤ f(τk)/| f′(0)| = τk+1 − τk.                             (6.3.10)
Hence, for k ≥ s, s ∈ N,
    d(qk, qs) ≤ τk − τs.                                                  (6.3.11)
It follows that {qk}k∈N is a Cauchy sequence. Since M is complete, it converges to some q∗ ∈ M. Moreover qk ∈ K(τk) ⊆ B[p0, t∗], therefore q∗ ∈ B[p0, t∗].
Next, we prove that X(q∗) = 0. We have
    ‖X (qk)‖ = ‖Pσk,0,1 D X (p0) Pσk,1,0 Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 X (qk)‖
        ≤ ‖D X (p0)‖ ‖Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 X (qk)‖
        ≤ ‖D X (p0)‖ f(τk)/| f′(0)|
        = ‖D X (p0)‖ (τk+1 − τk),
so ‖X(qk)‖ −→ 0 and, by continuity, X(q∗) = 0. Finally, letting k −→ ∞ in (6.3.11) and using (6.3.5), we obtain
    d(q∗, qk) ≤ (b/h)(1 − √(1 − 2h))^{k+1},  k = 1, 2, . . .
By hypothesis, p0 ∈ K(0); thus, by Lemma 6.3.5, the sequence {pk}k∈N generated by (6.3.1) is well defined, contained in B(p0, t∗) and converges to some p∗, which is a singular point of X in B[p0, t∗]. Moreover, if h < 1/2, then
    d(pk, p∗) ≤ (b/h)(1 − √(1 − 2h))^{k+1}.
• UNIQUENESS
This proof proceeds in an indirect way, by contradiction. But first we are going to establish some auxiliary results.
    τ(θ) = t + θ a f(t),
    γ(θ) = exp_q(θ v),
with
    R(θ) = ∫₀^θ [ Pγ,s,0 D X (γ(s)) Pγ,0,s v − Pσ,0,1 D X (p0) Pσ,1,0 v ] ds
and
    ‖R(θ)‖ ≤ l ( (θ/2) ‖v‖ + d(p0, q) ) θ ‖v‖.
This yields
    ‖A⁻¹ X (γ(θ))‖
        = ‖A⁻¹ Pγ,0,θ ( X (q) − ∫₀^θ Pγ,s,0 D X (γ(s)) Pγ,0,s v ds )‖
        = ‖A⁻¹ Pγ,0,θ ( (1 − θ) X (q) − ∫₀^θ [ Pγ,s,0 D X (γ(s)) Pγ,0,s − D X (q) ] v ds )‖
        ≤ ‖A⁻¹ Pγ,0,θ (1 − θ) X (q)‖ + ‖A⁻¹ Pγ,0,θ ∫₀^θ [ Pγ,s,0 D X (γ(s)) Pγ,0,s − D X (q) ] v ds‖
        ≤ (1 − θ) a f(t) + a ‖R(θ)‖
        ≤ (1 − θ) a f(t) + a l ( (θ/2) ‖v‖ + d(p0, q) ) θ ‖v‖
        ≤ (1 − θ) a f(t) + a l ( (θ/2) a f(t) + t ) θ a f(t)
        = a f(τ(θ)).
Therefore
γ(θ) ∈ K (τ (θ)),
and the Lemma is proved.
Lemma 6.3.8. Let 0 ≤ t < t∗ and q ∈ K(t). Suppose that q∗ ∈ B[p0, t∗] is a singularity of the vector field X and
    t + d(q, q∗) = t∗.
Then
    d(p0, q) = t.
Moreover, letting
    t+ = t + a f(t),   q+ = exp_q( −A⁻¹ X (q) ),
then t < t+ < t∗, q+ ∈ K(t+) and
    t+ + d(q+, q∗) = t∗.
Proof. Let α : [0, 1] −→ M be the minimizing geodesic joining q to q∗. It follows that α([0, 1]) ⊂ B(p0, t∗). Taking u = α′(0), by Lemma 6.3.2 we have
    0 = Pα,1,0 X (q∗) = X (q) + A u + R(1),
with
    ‖R(1)‖ ≤ l ( (1/2) ‖u‖ + d(p0, q) ) ‖u‖.
Therefore
    ‖R(1)‖ ≤ l ( (1/2) d(q, q∗) + d(p0, q) ) d(q, q∗)                     (6.3.13)
        = l ( (1/2)(t∗ − t) + d(p0, q) )(t∗ − t)
        ≤ l ( (1/2)(t∗ − t) + t )(t∗ − t)                                  (6.3.14)
        = (l/2)(t∗ + t)(t∗ − t).
On the other hand, since | f(t)| is strictly decreasing in [0, t∗] and 0 ≤ d(p0, q) ≤ t < t∗,
because
    f″(t) = l,
    0 = f(t∗) = f(t) + f′(t)(t∗ − t) + (1/2) f″(t)(t∗ − t)²
and
    f′(t) = f′(0) + ∫₀ᵗ f″(s) ds,
we therefore have
    0 = f(t) + ( f′(0) + l t )(t∗ − t) + (l/2)(t∗ − t)²,
hence
    (l/2)(t∗ + t)(t∗ − t) = −f′(0)(t∗ − t) − f(t).
Thus the last term in (6.3.13) is equal to the last term in inequality (6.3.15); we conclude that all the inequalities in (6.3.15) are equalities, in particular
    1/‖D X (p0)⁻¹‖ = | f′(0)| = a,
    ‖u‖ − ‖A⁻¹ X (q)‖ = ‖A⁻¹ X (q) + u‖ > 0,
    ‖A⁻¹ X (q)‖ = a f(t),                                                 (6.3.16)
    l ( (1/2)(t∗ − t) + d(p0, q) )(t∗ − t) = l ( (1/2)(t∗ − t) + t )(t∗ − t).
From the last equality in (6.3.16) we obtain
    d(p0, q) = t,
and the second equation in (6.3.16) implies that u and A⁻¹X(q) are linearly dependent vectors in Tq M, so that there exists r ∈ R such that
    1 − |r| = |1 − r|.
Moreover, given that α is a minimizing geodesic joining q to q∗, the points q, α(r) and q∗ lie on the same geodesic line; therefore,
    d(q, q+) + d(q+, q∗) = d(q, q∗).
Moreover,
    d(q, q+) = ‖r u‖ = ‖A⁻¹ X (q)‖ = a f(t) = t+ − t,
hence
    d(q+, q∗) = d(q, q∗) − d(q, q+) = (t∗ − t) − (t+ − t) = t∗ − t+,
that is,
    d(q+, q∗) + t+ = t∗.
Corollary 6.3.9. Suppose that q∗ ∈ B [p0 ,t∗ ] is a zero of the vector field X. If there exist t˜
and q̃ such that
0 ≤ t˜ < t∗ , q̃ ∈ K (t˜) and t˜ + d (q̃, q∗ ) = t∗ ,
then
d (p0 , q∗ ) = t∗ .
{τk}k∈N converges to t∗, {qk}k∈N converges to some q̃∗ ∈ B(p0, t∗), and X(q̃∗) = 0. Moreover, by Lemma 6.3.8 and applying induction, it is easy to show that for all k,
    τk + d(qk, q∗) = t∗.
Lemma 6.3.10. The limit p∗ of the sequence {pk }k∈N is the unique singularity of X in
B [p0 ,t∗ ] .
Proof. Let q∗ ∈ B[p0, t∗] be a singularity of the vector field X. Using induction, we will show that
    d(pk, q∗) + tk ≤ t∗.                                                  (6.3.17)
We need to consider two cases: d(p0, q∗) < t∗ and d(p0, q∗) = t∗.
In the first case, (6.3.17) is immediately true for k = 0, because t0 = 0. Now, suppose the property is true for some k. Let us take the geodesic
    γk(θ) = exp_{pk}(θ vk),
where vk is defined in (6.3.1). From Lemma 6.3.7, for all θ ∈ [0, 1],
    γk(θ) ∈ K( tk + θ (tk+1 − tk) ).                                      (6.3.18)
Define φ : [0, 1] −→ R by
    φ(θ) = d( γk(θ), q∗ ) + tk + θ (tk+1 − tk).                           (6.3.19)
We know that
    φ(0) = d(pk, q∗) + tk < t∗.
We next show, by contradiction, that φ(θ) ≠ t∗ for all θ ∈ [0, 1]. Suppose that there exists a θ̃ ∈ [0, 1] such that φ(θ̃) = t∗, and let q̃ = γk(θ̃) and t̃ = tk + θ̃ (tk+1 − tk). By (6.3.18) and (6.3.19), q̃ ∈ K(t̃) and t̃ + d(q̃, q∗) = t∗, so Corollary 6.3.9 yields
    d(p0, q∗) = t∗,
which contradicts our assumption. Thus φ(θ) ≠ t∗ for all θ ∈ [0, 1]. Since φ(0) < t∗ and φ is continuous, we have that φ(θ) < t∗ for all θ ∈ [0, 1]. In particular, by (6.3.19),
    φ(1) = d(pk+1, q∗) + tk+1 < t∗;
in this way (6.3.17) is true for all k ∈ N.
In the second case, where d(p0, q∗) = t∗, we show that for all k,
    d(pk, q∗) + tk = t∗.                                                  (6.3.20)
Indeed, for k = 0, this is immediately true, because t0 = 0. Now, suppose that (6.3.20) is true for some k. Since pk ∈ K(tk), by Lemma 6.3.8 we conclude that
    d(pk+1, q∗) + tk+1 = t∗.
In either case, letting k −→ ∞ in
    d(pk, q∗) + tk ≤ t∗
and using tk −→ t∗ and pk −→ p∗, we obtain d(p∗, q∗) = 0, that is,
    p∗ = q∗.
Lemma 6.3.11. If h = a b l < 1/2 and B(p0, r) ⊆ Ω, with
    t∗ < r ≤ t∗∗ = (1 + √(1 − 2h)) / (a l),
then the limit p∗ of the sequence {pk }k∈N is the unique singularity of the vector field X in
B (p0 , r).
Proof. Let q∗ ∈ B(p0, r) be a singularity of the vector field X in B(p0, r). Let us consider the minimizing geodesic α : [0, 1] −→ M joining p0 to q∗. By Lemma 6.3.2 (with u = α′(0)),
    0 = Pα,1,0 X (q∗) = X (p0) + D X (p0) u + R(1),
where
    ‖R(1)‖ ≤ l ( (1/2) ‖u‖ + d(p0, p0) ) ‖u‖ = (l/2) d(p0, q∗)²  and  ‖u‖ = d(p0, q∗).   (6.3.21)
In a similar way to inequality (6.3.15), it is easy to prove that
    ‖R(1)‖ ≥ (1/a) ( ‖u‖ − ‖D X (p0)⁻¹ X (p0)‖ )
        ≥ (1/a) d(p0, q∗) − b/a.
Therefore
    (l/2) d(p0, q∗)² ≥ (1/a) d(p0, q∗) − b/a,
hence
    f(d(p0, q∗)) ≥ 0.
Since d(p0, q∗) ≤ r ≤ t∗∗, then
    d(p0, q∗) ≤ t∗.
Finally, from Lemma 6.3.10,
    p∗ = q∗.
Definition 6.4.1. Let M be a manifold and let {pk}k∈N be a sequence on M converging to p∗. If there exist a system of coordinates (U, x) of M with p∗ ∈ U and constants p > 0, c ≥ 0 and K ≥ 0 such that, for all k ≥ K, {pk}∞k=K ⊆ U and the following inequality holds:
    ‖x⁻¹(pk+1) − x⁻¹(p∗)‖ ≤ c ‖x⁻¹(pk) − x⁻¹(p∗)‖^p,                      (6.4.1)
then we say that {pk}k∈N converges to p∗ with order at least p.
It can be shown that the definition above does not depend on the choice of the coordinate system; the multiplicative constant c depends on the chart, but for any chart such a constant exists (see [1]).
Notice that in normal coordinates centered at pk,
    ‖exp_{pk}⁻¹(p) − exp_{pk}⁻¹(q)‖ = d(p, q),
so that (6.4.1) takes the form
    d(pk+1, p∗) ≤ c d(pk, p∗)^p.
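In practice, the order p in the inequality above can be estimated from computed iterates. The sketch below is a hypothetical illustration on the trivial manifold M = R, using Newton's method on the test equation x² − 2 = 0 and the standard estimate log e_{k+1}/log e_k, valid for superlinear sequences:

```python
import math

# Hypothetical illustration on M = R: Newton iterates for x^2 - 2 = 0,
# estimating p from log(e_{k+1}) / log(e_k).
root = math.sqrt(2.0)
x, errs = 3.0, []
for _ in range(5):
    x = x - (x * x - 2.0) / (2 * x)   # Newton step
    errs.append(abs(x - root))        # distance to the limit point

p_est = math.log(errs[3]) / math.log(errs[2])
assert 1.8 < p_est < 2.2              # consistent with quadratic order p = 2
```

The same estimator applies verbatim to d(pk, p∗) computed on a manifold, by (6.4.1) in normal coordinates.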
If γ([0, t)) ⊆ Ω, then
    Pγ,t,0 X (γ(t)) = X (p) + t D X (p) v + R(t),
with
    ‖R(t)‖ ≤ (l/2) t² ‖v‖².
Proof. Given that γ is a geodesic, γ′(t) is parallel and γ′(s) = Pγ,0,s γ′(0). Moreover, since γ′(0) = v,
    Pγ,t,0 X (γ(t)) − X (p) = ∫₀ᵗ Pγ,s,0 D X (γ(s)) Pγ,0,s v ds.
Therefore
    Pγ,t,0 X (γ(t)) − X (p) − t D X (p) v = ∫₀ᵗ [ Pγ,s,0 D X (γ(s)) Pγ,0,s v − D X (p) v ] ds.
Let
    R(t) = ∫₀ᵗ [ Pγ,s,0 D X (γ(s)) Pγ,0,s v − D X (p) v ] ds.
By hypothesis, D X ∈ Lip_l(Ω), hence
    ‖R(t)‖ ≤ ∫₀ᵗ ‖Pγ,s,0 D X (γ(s)) Pγ,0,s v − D X (p) v‖ ds
        ≤ ∫₀ᵗ ‖Pγ,s,0 D X (γ(s)) Pγ,0,s − D X (p)‖ ‖v‖ ds
        ≤ l ∫₀ᵗ ( ∫₀ˢ ‖γ′(τ)‖ dτ ) ‖v‖ ds.
Thus
    ‖R(t)‖ ≤ l ∫₀ᵗ ( ∫₀ˢ ‖v‖ dτ ) ‖v‖ ds = l ∫₀ᵗ s ‖v‖² ds = (l/2) t² ‖v‖².
Lemma 6.4.3. i) The convergence order of Newton's method on Riemannian manifolds is two (quadratic convergence).
ii) The convergence order of the simplified Newton method on Riemannian manifolds is one (linear convergence).
Proof. Let k be sufficiently large, in such a way that p∗, pk, pk+1, . . . belong to a normal neighborhood U of pk. Let us consider the geodesic γk joining pk to p∗, defined by
    γk(t) = exp_{pk}(t uk).
i) By Lemma 6.3.2,
    Pγk,1,0 X (p∗) = X (pk) + D X (pk) uk + R(1),
with
    ‖R(1)‖ ≤ (l/2) ‖uk‖²  and  ‖uk‖ = d(pk, p∗).
Hence,
    0 = D X (pk)⁻¹ X (pk) + uk + D X (pk)⁻¹ R(1).
Since
    −D X (pk)⁻¹ X (pk) = exp_{pk}⁻¹(pk+1)  and  uk = exp_{pk}⁻¹(p∗),
we have
    exp_{pk}⁻¹(pk+1) − exp_{pk}⁻¹(p∗) = D X (pk)⁻¹ R(1),
thus
    d(pk+1, p∗) ≤ (l/2) ‖D X (pk)⁻¹‖ ‖uk‖².
Moreover, by Banach's Lemma,
    ‖D X (pk)⁻¹‖ ≤ a / (1 − a l d(pk, p0)) ≤ a / (1 − a l τk) ≤ a / (1 − a l t∗) = a / √(1 − 2 a b l).
Therefore
    d(pk+1, p∗) ≤ C d(pk, p∗)²,
with
    C = a l / (2 √(1 − 2 a b l)).
ii) Let p0 be sufficiently near to p∗, in such a way that p0 is in the normal neighborhood U of pk. By Lemma 6.3.2, if σk : [0, 1] −→ M is the minimizing geodesic connecting p0, pk, then
    Pγ,1,0 X (p∗) = X (pk) + Pσk,0,1 D X (p0) Pσk,1,0 uk + R(1),
with
    ‖R(1)‖ ≤ l ( (1/2) ‖uk‖ + d(p0, pk) ) ‖uk‖.
Therefore
    0 = Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 X (pk) + uk + Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 R(1).
Since
    −Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 X (pk) = exp_{pk}⁻¹(pk+1)  and  uk = exp_{pk}⁻¹(p∗),
we have
    exp_{pk}⁻¹(pk+1) − exp_{pk}⁻¹(p∗) = Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 R(1).
We thus conclude that
    d(pk+1, p∗) = ‖exp_{pk}⁻¹(pk+1) − exp_{pk}⁻¹(p∗)‖
        = ‖Pσk,0,1 D X (p0)⁻¹ Pσk,1,0 R(1)‖
        ≤ ‖D X (p0)⁻¹‖ ‖R(1)‖
        ≤ a l ( (1/2) ‖uk‖ + d(p0, pk) ) ‖uk‖
        = a l ( (1/2) d(pk, p∗) + d(p0, pk) ) d(pk, p∗)
        = a l ( (1/2) d(pk, p∗)/d(p0, pk) + 1 ) d(p0, pk) d(pk, p∗).
If k is sufficiently large, then d(pk, p∗) ≤ d(p0, pk), and therefore
    (1/2) d(pk, p∗)/d(p0, pk) + 1 ≤ 3/2,
and then, for p0 sufficiently close to p∗,
    d(pk+1, p∗) ≤ K0 d(p0, pk) d(pk, p∗),
with K0 ≤ 3 a l / 2.
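The two rates in Lemma 6.4.3 are easy to observe numerically. The sketch below is a hypothetical Euclidean illustration (M = R, so exp reduces to addition), comparing Newton's method with the simplified method, whose derivative stays frozen at p0 = 1, on the test field F(x) = x³ − 2:

```python
# Hypothetical Euclidean comparison (M = R): Newton's method versus the
# simplified method with D X frozen at p0, for F(x) = x^3 - 2, p0 = 1.
def F(x):  return x ** 3 - 2.0
def dF(x): return 3 * x ** 2

root = 2.0 ** (1.0 / 3.0)
x_full = x_simp = 1.0
errs_full, errs_simp = [], []
for _ in range(6):
    x_full -= F(x_full) / dF(x_full)   # derivative re-evaluated: quadratic
    x_simp -= F(x_simp) / dF(1.0)      # derivative frozen at p0: linear
    errs_full.append(abs(x_full - root))
    errs_simp.append(abs(x_simp - root))

assert errs_full[-1] < 1e-12           # quadratic: machine accuracy quickly
assert errs_simp[-1] > 1e-6            # linear: still far after 6 steps
```

After six steps the full Newton iteration is accurate to machine precision while the frozen-derivative iteration has only gained a few digits, consistent with orders two and one.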
Remark 6.4.4. Note that if, instead of using the point p0 in the Kantorovich method, we fix pj sufficiently close to p∗, we obtain a new convergent method. Indeed, the calculations made in the previous lemma become
    d(pk+1, p∗) ≤ Kj d(pj, p∗) d(pk, p∗),
with Kj ≤ 3 a l / 2. Thus,
    d(pk+1, p∗) ≤ K d(pj, p∗) d(pk, p∗),                                  (6.4.2)
with K ≤ 3 a l / 2.
where σk : [0, 1] −→ M is the minimizing geodesic joining the points pn and qn^k, k = 1, 2, . . ., m − 1; thus
    σk(0) = pn  and  σk(1) = qn^k.
Theorem 6.5.1. Under the hypotheses of Kantorovich's theorem, the method described in (6.5.1) converges with order of convergence m + 1.
Proof. Let us observe that
    d(pn+1, pn) ≤ d(pn+1, qn^{m−1}) + d(qn^{m−1}, qn^{m−2}) + · · · + d(qn², qn¹) + d(qn¹, pn).
Now, if we define pn+1 = qn^m and pn = qn¹, looking at each step as a different method according to (6.5.1), then by the Kantorovich theorem for the first step and by the simplified Kantorovich theorem for the following steps, each one of the sequences {qn^m}m∈N, for fixed n, is convergent to the same point p∗ ∈ M. Therefore, {pn}n∈N is convergent to p∗. Moreover, by Lemma 6.4.3 i) and (6.4.2),
    d(pn+1, p∗) ≤ K d(pn, p∗) d(qn^{m−1}, p∗) ≤ K d(pn, p∗) · K d(pn, p∗) d(qn^{m−2}, p∗)
        ≤ · · · ≤ K^{m−1} d(pn, p∗)^{m−1} d(qn¹, p∗) ≤ K^{m−1} d(pn, p∗)^{m−1} C d(pn, p∗)².
Therefore,
    d(pn+1, p∗) ≤ C K^{m−1} d(pn, p∗)^{m+1}.
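A scalar sketch of the acceleration in Theorem 6.5.1 (Euclidean case, with a hypothetical test equation F(x) = x³ − 2): each outer step evaluates the derivative once and reuses it for m inner corrections, which is the Shamanskii-type scheme of order m + 1:

```python
# Hypothetical scalar instance of the multi-step scheme: one derivative
# evaluation per outer step, reused for m inner corrections (order m + 1).
def F(x):  return x ** 3 - 2.0
def dF(x): return 3 * x ** 2

def outer_step(x, m):
    d = dF(x)                 # derivative frozen for the whole cycle
    for _ in range(m):        # inner iterates q^1, ..., q^m = next outer point
        x = x - F(x) / d
    return x

root = 2.0 ** (1.0 / 3.0)
x = 1.1
for _ in range(3):
    x = outer_step(x, m=3)    # expected order m + 1 = 4

assert abs(x - root) < 1e-12
```

Three outer cycles (nine residual evaluations, three derivative evaluations) already reach machine accuracy, illustrating why the inner steps are nearly free compared with re-evaluating the derivative.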
for the Newton method. Kantorovich criterion (6.1.1) may not be satisfied for a particular problem, but Newton's method may still converge to p∗ [27]. Next, we shall show that condition (6.1.1) can be weakened by introducing the center-Lipschitz condition and relying on tighter majorizing sequences instead of the majorizing function f.
Note that
    l0 ≤ l                                                                (6.6.3)
holds in general, and l/l0 can be arbitrarily large [14].
We present the semilocal convergence of the simplified Newton method using only the
center-Lipschitz condition.
    t′∗ < r ≤ t′∗∗ = (1 + √(1 − 2 h0)) / (a l0),
and p∗ is also the only singularity of X in B(p0, r). Furthermore, the following error bounds are satisfied for each k = 1, 2, · · ·, where
    t′0 = 0,  t′1 = b,
    t′k+1 = t′k + (a l0 / 2)(t′k − t′k−1)²  for each k = 1, 2, · · · .
Proof. Simply notice that l0, h0, {t′k}, t′∗, t′∗∗ can replace l, h, {tk}, t∗, t∗∗, respectively, in the proof of Theorem 6.3.1.
Remark 6.6.3. Under Kantorovich criterion (6.1.1), a simple inductive argument shows that
    t′k ≤ tk  and  t′k+1 − t′k ≤ tk+1 − tk  for each k = 0, 1, · · · .
Moreover, we have that
    t′∗ ≤ t∗,  t′∗∗ ≤ t∗∗  and  h ≤ 1/2 =⇒ h0 ≤ 1/2,
and
    h0/h −→ 0  as  l0/l −→ 0.
Furthermore, strict inequality holds in these estimates (for k > 1) if l0 < l.
The convergence order of the simplified method is only linear, whereas the convergence order of Newton's method is quadratic if h < 1/2. If criterion h ≤ 1/2 is not satisfied but the weaker h0 ≤ 1/2 is satisfied, we can start with the simplified method until a certain iterate xN (N a finite natural number) at which criterion h ≤ 1/2 is satisfied; such an integer N exists, since the simplified Newton method converges [8], [12], [14]. This approach was not possible before, since h ≤ 1/2 was used as the convergence criterion for both methods.
Remark 6.6.4. Under the hypotheses of Theorem 6.1.2, we see in the proof of this theorem that the sequences {rk}, {sk} defined by
    r0 = 0,  r1 = b,  r2 = r1 + a l0 (r1 − r0)² / (2 (1 − a l0 r1)),
    rk+1 = rk + a l (rk − rk−1)² / (2 (1 − a l0 rk))  for each k = 2, 3, · · ·   (6.6.5)
and
    s0 = 0,  s1 = b,
    sk+1 = sk + a l (sk − sk−1)² / (2 (1 − a l0 sk))  for each k = 1, 2, · · ·   (6.6.6)
are also majorizing sequences for {pk}, such that
    d(pk+1, pk) ≤ rk+1 − rk ≤ sk+1 − sk ≤ uk+1 − uk
and
    r∗ = lim_{k→∞} rk ≤ s∗ = lim_{k→∞} sk ≤ t∗ = lim_{k→∞} uk.
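The effect of the center-Lipschitz constant l0 on the majorizing sequences can be checked numerically. The sketch below uses hypothetical constants with l0 < l and compares the limit of {sk} from (6.6.6) with the limit of the classical Kantorovich sequence, which uses l in the denominator as well:

```python
# Hypothetical constants with l0 < l and h = a*b*l = 0.36 < 1/2.
a, b, l0, l = 1.0, 0.12, 1.0, 3.0

def limit(center):
    """Iterate t_{k+1} = t_k + a*l*(t_k - t_{k-1})**2 / (2*(1 - a*L*t_k)),
    with L = l0 (sequence {s_k}) or L = l (classical Kantorovich sequence)."""
    t_prev, t = 0.0, b
    for _ in range(60):
        L = l0 if center else l
        t_prev, t = t, t + a * l * (t - t_prev) ** 2 / (2 * (1 - a * L * t))
    return t

s_star = limit(center=True)    # limit of {s_k} from (6.6.6)
u_star = limit(center=False)   # limit of the classical sequence {u_k}

assert s_star <= u_star        # tighter sequence has a smaller limit point
```

With these constants the classical limit equals the closed-form root t∗ = (1 − √(1 − 2abl))/(al), while the center-Lipschitz limit s∗ is strictly smaller, matching r∗ ≤ s∗ ≤ t∗ above.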
Simply notice that, for the computation of the upper bound on the norms ‖D X (q)⁻¹‖ (see (6.3.8)), using the center-Lipschitz condition we have
    ‖D X (q)⁻¹‖ ≤ ‖φ⁻¹‖ / (1 − ‖φ⁻¹‖ ‖D X (q) − φ‖) ≤ a / (1 − a l0 λ)
instead of the less tight (if l0 < l) and more expensive to compute estimate
    ‖D X (q)⁻¹‖ ≤ a / (1 − a l λ)
obtained in the proofs of Theorems 6.1.2 and 6.3.1 using the Lipschitz condition. Hence, the results of Theorem 6.1.2 involving sequence {uk} can be rewritten using the tighter sequences {rk} or {sk}. Note that the introduction of the center-Lipschitz condition is not an additional hypothesis to the Lipschitz condition, since in practice the computation of l requires the computation of l0. So far we have shown that, under Kantorovich criterion (6.1.1), the estimates of the distances d(pk, pk−1), d(pk, p∗) are improved (if l0 < l) by using the tighter sequences {rk}, {sk} for the computation of the upper bounds on these distances. Moreover, the information
Next, we shall show that Kantorovich criterion (6.1.1) can be weakened if one directly
(and not through majorizing function f ) studies the convergence of sequences {rk } and
{sk }. First, we present the results for sequence {sk }.
Lemma 6.6.5. [13] Assume there exist constants l0 ≥ 0, l ≥ 0, a > 0 and b ≥ 0 with l0 ≤ l such that
    h1 = l̄ b ≤ 1/2 if l0 ≠ 0,   h1 = l̄ b < 1/2 if l0 = 0,                (6.6.7)
where
    l̄ = (a/8) ( l + 4 l0 + √(l² + 8 l0 l) ).
Then, sequence {sn} given by (6.6.6) is nondecreasing, bounded from above by s⋆⋆ and converges to its unique least upper bound s⋆ ∈ [0, s⋆⋆], where
    s⋆⋆ = 2b / (2 − δ)  and  δ = 4 l / ( l + √(l² + 8 l0 l) ) < 1 for l0 ≠ 0.   (6.6.8)
Moreover, the following estimates hold:
    a l0 s⋆ ≤ 1,                                                           (6.6.9)
    0 ≤ sn+1 − sn ≤ (δ/2)(sn − sn−1) ≤ · · · ≤ (δ/2)^n b  for each n = 1, 2, · · · ,   (6.6.10)
    sn+1 − sn ≤ (δ/2)^n (2 h1)^(2^n − 1) b  for each n = 0, 1, · · ·       (6.6.11)
and
    0 ≤ s⋆ − sn ≤ (δ/2)^n (2 h1)^(2^n − 1) b / (1 − (2 h1)^(2^n)),  (2 h1 < 1),  for each n = 0, 1, · · · .   (6.6.12)
Lemma 6.6.6. [16] Suppose that the hypotheses of Lemma 6.6.5 hold. Assume that
    h2 = l2 b ≤ 1/2,                                                       (6.6.13)
where l2 = (a/8) ( 4 l0 + (l l0 + 8 l0²)^{1/2} + (l0 l)^{1/2} ). Then, scalar sequence {rn} given by (6.6.5) is well defined, increasing, bounded from above by
    r⋆⋆ = b + a l0 b² / ( 2 (1 − (δ/2)) (1 − a l0 b) )                     (6.6.14)
and converges to its unique least upper bound r⋆, which satisfies 0 ≤ r⋆ ≤ r⋆⋆. Moreover, the following estimates hold:
    0 < rn+2 − rn+1 ≤ (δ/2)^n a l0 b² / ( 2 (1 − a l0 b) )  for each n = 1, 2, · · · .   (6.6.15)
Lemma 6.6.7. [16] Suppose that the hypotheses of Lemma 6.6.5 hold and there exists a minimum integer N > 1 such that the iterates ri (i = 0, 1, · · · , N − 1) given by (6.6.5) are well defined,
    ri < ri+1 < 1/(a l0)  for each i = 0, 1, · · · , N − 2                  (6.6.16)
and
    rN ≤ (1/(a l0)) ( 1 − (δ/2)(1 − a l0 rN−1) ).                          (6.6.17)
Then, the following assertions hold:
    a l0 rN < 1,                                                           (6.6.18)
    rN+1 ≤ (1/(a l0)) ( 1 − (δ/2)(1 − a l0 rN) ),                          (6.6.19)
    δN−1 ≤ δ/2 ≤ 1 − a l0 (rN+1 − rN) / (1 − a l0 rN),                     (6.6.20)
sequence {rn} given by (6.6.5) is well defined, increasing, bounded from above by
    r⋆⋆ = rN−1 + (2/(2 − δ)) (rN − rN−1)
and converges to its unique least upper bound r⋆, which satisfies 0 ≤ r⋆ ≤ r⋆⋆, where δ is given in Lemma 6.6.5 and
    δn = a l (rn+2 − rn+1) / ( 2 (1 − a l0 rn+2) ).
Moreover, the following estimates hold:
    0 < rN+n − rN+n−1 ≤ (δ/2)^{n−1} (rN+1 − rN)  for each n = 1, 2, · · · .
When N = 2, condition (6.6.17) reduces to
    r2 = b + a l0 b² / (2 (1 − a l0 b)) ≤ (a l b + δ) / (a l + δ a l0),
which is (6.6.13). When N > 2 we no longer have closed-form inequalities (solved for n) of the form
    c0 η ≤ c1,
where c0 and c1 may depend on l0 and l (see e.g. (6.6.7) or (6.6.13)). However, the corresponding inequalities can still be checked, since only computations involving b, l0 and l are carried out (see also [16]). Clearly, the sufficient convergence conditions of the form (6.6.17) become weaker as N increases.
Remark 6.6.9. In [14], [16], tighter upper bounds on the limit points of the majorizing sequences {rn}, {sn}, {uk} than in [6, 8, 32] are given. Indeed, we have that
    r⋆ = lim_{n→∞} rn ≤ r̄3 = ( 1 + a l0 b / ((2 − δ)(1 − a l0 b)) ) b.
Note that
    r̄3 ≤ r̄2 if l0 ≤ l,  r̄3 < r̄2 if l0 < l,  and  r̄3 ≤ r̄1 if l0 ≤ l,  r̄3 < r̄1 if l0 < l,
where
    r̄2 = 2b / (2 − δ)  and  r̄1 = 2b.
Moreover, r̄2 can be smaller than s⋆ for sufficiently small l0. We also have that
    h ≤ 1/2 =⇒ h1 ≤ 1/2 =⇒ h2 ≤ 1/2,
but not necessarily vice versa unless l0 = l. Moreover, we have that
    h1/h −→ 1/4,  h2/h −→ 0  and  h2/h1 −→ 0  as  l0/l −→ 0.
Example 6.6.10. We consider a simple example to test the "h" conditions in one dimension. Let X = R, x0 = 1, Ω = [d, 2 − d], d ∈ [0, 0.5). Define function F on Ω by
    F(x) = x³ − d.                                                         (6.6.21)
We get that
    b = (1/3)(1 − d)  and  l = 2 (2 − d).
Kantorovich condition (6.1.1) is not satisfied, since
    h = (2/3)(1 − d)(2 − d) > 1/2  for all d ∈ (0, 0.5).
Figure 6.6.1. Functions h, h1, h2 (from top to bottom) with respect to d in the interval (0, 0.999). The horizontal blue line has equation y = 0.5.
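The comparison plotted in Figure 6.6.1 can be reproduced numerically. In the sketch below, b and l are the scaled constants from the example; the center-Lipschitz constant l0 = 3 − d is an assumption (computed from |F′(x) − F′(x0)| = 3|x + 1||x − x0| on Ω = [d, 2 − d] with x0 = 1 and the same scaling):

```python
import math

# Example 6.6.10 with scaled constants: b = (1 - d)/3 and l = 2(2 - d) as in
# the text; l0 = 3 - d is an ASSUMED center-Lipschitz constant for this sketch.
def criteria(d):
    b, l, l0 = (1 - d) / 3, 2 * (2 - d), 3 - d
    h = l * b                                                   # Kantorovich
    h1 = (l + 4 * l0 + math.sqrt(l * l + 8 * l0 * l)) * b / 8   # as in (6.6.7)
    h2 = (4 * l0 + math.sqrt(l * l0 + 8 * l0 ** 2)
          + math.sqrt(l0 * l)) * b / 8                          # as in (6.6.13)
    return h, h1, h2

h, h1, h2 = criteria(0.45)
assert h > 0.5           # Kantorovich criterion fails for d = 0.45 ...
assert h2 < 0.5          # ... yet the weaker condition h2 <= 1/2 holds
assert h2 <= h1 <= h     # the ordering h2 <= h1 <= h of Remark 6.6.9
```

For d = 0.45 the classical condition h ≤ 1/2 fails while h2 ≤ 1/2 holds, matching the ordering of the three curves in the figure.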
References
[1] Absil, P.A. Mahony, R., Sepulchre, R., Optimization Algorithms on Matrix Manifolds,
Princeton University Press, Princeton NJ, 2008.
[2] Adler, R.L., Dedieu, J.P., Margulies, J.Y., Martens, M., Shub,M., Newton’s method
on Riemannian manifolds and a geometric model for the human spine, IMA J. Numer.
Anal., 22 (2002), 359–390.
[3] Alvarez, F., Bolte, J., Munier, J., A unifying local convergence result for Newton’s
method in Riemannian manifolds, Foundations Comput. Math., 8 (2008), 197–226.
[4] Amat, S., Busquier, S., Third-order iterative methods under Kantorovich conditions,
J. Math. Anal. Appl., 336 (2007), 243–261.
[5] Amat, S., Busquier, S., Gutiérrez, J. M., Third-order iterative methods with appli-
cations to Hammerstein equations: A unified approach, J. App. Math. Comp., 235
(2011), 2936–2943.
[6] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374–397.
[8] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point, Rev. Anal. Numér. Théor. Ap-
prox., 36 (2007), 123–138.
[9] Argyros, I.K., Chebysheff-Halley like methods in Banach spaces, Korean J. Comp.
Appl. Math., 4 (1997), 83–107.
[10] Argyros, I.K., Improved error bounds for a Chebysheff-Halley-type method, Acta
Math. Hungar., 84 (1999), 209–219.
[11] Argyros, I.K., Improving the order and rates of convergence for the super-Halley
method in Banach spaces, Korean J. Comput. Appl. Math., 5 (1998), 465–474.
[13] Argyros, I.K., A semilocal convergence analysis for directional Newton methods,
Math. Comp., 80 (2011), 327–343.
[14] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press/Taylor and Francis Publ., New York, 2012.
[15] Argyros, I.K., Hilout, S., Newton’s method for approximating zeros of vector fields
on Riemannian manifolds, J. App. Math. Comp., 29 (2009), 417–427.
[16] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364–387.
[17] Argyros, I.K., Ren, H., On the semilocal convergence of Halley’s method under a
center-Lipschitz condition on the second Fréchet derivative, App. Math. Comp., 218
(2012), 11488–11495.
[18] Averbuh, V.I., Smoljanov, O.G., Differentiation theory in linear topological spaces, Uspehi Mat. Nauk, 6 (1967), 201–260; English transl.: Russian Math. Surveys, 6 (1967), 201–258.
[19] Chun, C., Stanica, P., Neta, B., Third-order family of methods in Banach spaces,
Comput. Math. Appl., 61 (2011), 1665–1675.
[20] Dedieu, J.P., Priouret, P., Malajovich, G., Newton's method on Riemannian manifolds: covariant alpha theory, IMA J. Numer. Anal., 23 (2003), 395–419.
[21] Dedieu, J.P., Nowicki, D., Symplectic methods for the approximation of the expo-
nential map and the Newton iteration on Riemannian submanifolds, J. Complexity, 21
(2005), 487–501.
[23] Ezquerro, J.A., Hernández, M.A., New Kantorovich-type conditions for Halley’s
method, Appl. Numer. Anal. Comput. Math., 2 (2005), 70–77.
[24] Ezquerro, J.A., Gutiérrez, J. M., Hernández, M.A., Salanova, M.A., Chebyshev-like
methods and quadratic equations, Rev. Anal. Numér. Théor. Approx., 28 (2000), 23–35.
[25] Ezquerro, J.A., A modification of the Chebyshev method, IMA J. Numer. Anal., 17
(1997) 511–525.
[27] Ferreira, O., Svaiter, B., Kantorovich’s Theorem on Newton’s Method in Riemannian
Manifolds, J. Complexity, 18 (2002), 304–329.
[28] Gabay, D., Minimizing a differentiable function over a differential manifold, J. Optim.
Theory Appl., 37 (1982), 177–219.
[29] Groisser, D., Newton’s method, zeros of vector fields, and the Riemannian center of
mass, Adv. Appl. Math., 33 (2004), 95–135.
[30] Hernández, M.A., Romero, N., General study of iterative processes of R-order at least
three under weak convergence conditions, J. Optim. Theory Appl., 133 (2007), 163–
177.
[32] Kantorovich, L.V., Akilov, G.P., Functional Analysis in Normed Spaces, Pergamon,
Oxford, 1964.
[33] Kelley, C.T., A Shamanskii-like acceleration scheme for nonlinear equations at singu-
lar roots, Math. Comp., 47 (1986), 609–623.
[35] Li, C., Wang, J., Newton’s method on Riemannian manifolds: Smale’s point estimate
theory under the γ-condition, IMA J. Numer. Anal., 26 (2006), 228–251.
[36] Li, C., Wang, J., Newton’s method for sections on Riemannian manifolds: Generalized
covariant α-theory, J. Complexity, 25 (2009), 128–151.
[37] Manton, J.H., Optimization algorithms exploiting unitary constraints, IEEE Trans.
Signal Process., 50 (2002), 635–650.
[38] Neta, B., A new iterative method for the solution of systems of nonlinear equations.
Approximation theory and applications (Proc. Workshop, Technion Israel Inst. Tech.,
Haifa, 1980), Academic Press, New York-London, (1981), 249–263.
[39] Parida, P.K., Gupta, D.K., Recurrence relations for a Newton-like method in Banach
spaces, 206 (2007), 873–887.
[40] Parida, P.K., Gupta, D.K., Recurrence relations for semilocal convergence of a
Newton-like method in Banach spaces, J. Math. Anal. Appl. 345 (2008), 350–361.
[41] Parida, P.K., Gupta, D.K., Semilocal convergence of a family of third-order methods in
Banach spaces under Hölder continuous second derivative, Nonlinear Anal. 69 (2008),
4163–4173.
[42] Romero, N., PhD Thesis. Familias paramétricas de procesos iterativos de alto orden
de convergencia. http://dialnet.unirioja.es/
[43] Shamanskii, V.E., A modification of Newton’s method, Ukrain. Mat. Zh., 19 (1967),
133–138.
[44] Spivak, M., A comprehensive introduction to differential geometry, Vol. I, third ed.,
Publish or Perish Inc., Houston, Texas., 2005.
[45] Spivak, M., A comprehensive introduction to differential geometry, Vol. II, third ed.,
Publish or Perish Inc., Houston, Texas, 2005.
[46] Traub, J.F., Iterative methods for the solution of equations, Prentice Hall, Englewood
Cliffs, N. J., 1964.
[47] Udriste, C., Convex functions and optimization methods on Riemannian manifolds,
Mathematics and its Applications, 297, Kluwer Academic Publishers Group, Dor-
drecht, 1994.
[48] Wang, J.H., Convergence of Newton’s method for sections on Riemannian manifolds,
J. Optim. Theory Appl., 148 (2011), 125–145.
[49] Zhang, L.H., Riemannian Newton method for the multivariate eigenvalue problem,
SIAM J. Matrix Anal. Appl., 31 (2010), 2972–2996.
Chapter 7
7.1. Introduction
Let X and Y be Banach spaces. Let D ⊆ X be an open set and F : D −→ Y be continuously differentiable. In this chapter we are concerned with the problem of approximating a locally unique solution x⋆ of the nonlinear least squares problem
    min_{x∈D} ‖F(x)‖².                                                     (7.1.1)
A solution x⋆ ∈ D of (7.1.1) is also called a least squares solution of the equation F(x) = 0. Many problems from computational sciences and other disciplines can be brought into a form similar to equation (7.1.1) using mathematical modelling [8], [11]. For example, in data fitting we have X = Rⁱ, Y = Rʲ, where i is the number of parameters and j is the number of observations [23].
The solution of (7.1.1) can rarely be found in closed form. That is why the solution
methods for these equations are usually iterative. In particular, the practice of numerical
analysis for finding such solutions is essentially connected to Newton-type methods [8].
The study of the convergence of iterative procedures usually centers on two types: semilocal and local convergence analysis. The semilocal convergence analysis is based on the information around an initial point and gives criteria ensuring the convergence of iterative procedures, while the local one is based on the information around a solution and finds estimates of the radii of convergence balls. A plethora of sufficient conditions for the local
as well as the semilocal convergence of Newton-type methods as well as an error analysis
for such methods can be found in [1]–[47].
In the present chapter we use the inexact Gauss-Newton like method
    xn+1 = xn + sn,  B(xn) sn = −F′(xn)⋆ F(xn) + rn  for each n = 0, 1, · · · ,   (7.1.2)
where x0 ∈ D is an initial point, to generate a sequence {xn} approximating x⋆. Here, A⋆ denotes the adjoint of the operator A, and B(x) ∈ L(X, Y), the space of bounded linear operators
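A minimal sketch of iteration (7.1.2) in the simplest setting: B(x) = F′(x)⋆F′(x) and zero residuals rn (i.e., exact Gauss-Newton), for a hypothetical one-parameter least-squares problem with F : R → R²:

```python
import math

# Exact Gauss-Newton (B(x) = F'(x)* F'(x), r_n = 0) for a hypothetical
# zero-residual problem: minimize ||F(x)||^2 with F: R -> R^2.
def F(x):          # residual vector
    return [x * x - 2.0, x - math.sqrt(2.0)]

def J(x):          # Jacobian F'(x), a 2x1 "matrix"
    return [2 * x, 1.0]

x = 1.0            # initial point x0
for _ in range(20):
    r, j = F(x), J(x)
    g = j[0] * r[0] + j[1] * r[1]    # gradient term F'(x)* F(x)
    B = j[0] * j[0] + j[1] * j[1]    # B(x) = F'(x)* F'(x) (a 1x1 matrix)
    x += -g / B                      # s_n solves B(x_n) s_n = -F'(x_n)* F(x_n)

assert abs(x - math.sqrt(2.0)) < 1e-8
```

Because the residual at x⋆ = √2 is zero and F′(x⋆) is injective, this sketch converges quadratically; nonzero residuals rn or an approximate B(x) would degrade the rate, which is exactly what the analysis of this chapter quantifies.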
7.2. Background
Let U(x, r) and Ū(x, r) stand, respectively, for the open and closed ball in X with center x ∈ D and radius r > 0. Let A : X −→ Y be continuous, linear and injective with closed image,
Improved Local Convergence Analysis of Inexact Gauss-Newton Like Methods 139
Suppose that F′(x⋆)⋆ F(x⋆) = 0, F′(x⋆) is injective and there exist continuously differentiable functions f0, f : [0, R) −→ (−∞, +∞) such that the following assumptions hold:
(H0)
    ‖F′(x) − F′(x⋆)‖ ≤ f0′(‖x − x⋆‖) − f0′(0),                            (7.3.1)
    ‖F′(x) − F′(x⋆ + τ (x − x⋆))‖ ≤ f′(‖x − x⋆‖) − f′(τ ‖x − x⋆‖),        (7.3.2)
for all x ∈ U(x⋆, κ) and τ ∈ [0, 1];
(H3)
    α0 = √2 c β² D⁺f0′(0) < 1.
The functions defined above are increasing. Note also that h2/h1 and h3/h1 are increasing on (0, R).
Lemma 7.3.3. Suppose that (H0)–(H3) hold. Then, the constant ρ0 defined by (7.3.5) is positive and the following holds for each t ∈ (0, ρ0):
    (1 + ϑ) ω1 β [ t f′(t) − f(t) + √2 c β (f0′(t) + 1) ] / [ t (1 − β (f0′(t) + 1)) ] + ω1 ϑ + ω2 < 1,
where ϑ, ω1 and ω2 are defined in (7.3.3).
Proof. Using (H1), we have that
    lim_{t→0} [ t f′(t) − f(t) ] / [ t (1 − β (f0′(t) + 1)) ]
        = lim_{t→0} [ f′(t) − (f(t) − f(0))/t ] / [ 1 − β (f′(t) + 1) ] · [ 1 − β (f′(t) + 1) ] / [ 1 − β (f0′(t) + 1) ]
        = [ 1 − β (f′(0) + 1) ] / [ 1 − β (f0′(0) + 1) ] · lim_{t→0} [ f′(t) − (f(t) − f(0))/t ] / [ 1 − β (f′(t) + 1) ] = 0.
We deduce that
    lim_{t→0} { (1 + ϑ) ω1 β [ t f′(t) − f(t) + √2 c β (f0′(t) + 1) ] / [ t (1 − β (f0′(t) + 1)) ] + ω1 ϑ + ω2 }
        = (1 + ϑ) ω1 α0 + ω1 ϑ + ω2.
By (7.3.3), we have that ω1 (α0 + α0 ϑ + ϑ) + ω2 < 1. Then, there exists δ0 such that
    (1 + ϑ) ω1 β [ t f′(t) − f(t) + √2 c β (f0′(t) + 1) ] / [ t (1 − β (f0′(t) + 1)) ] + ω1 ϑ + ω2 < 1  for each t ∈ (0, δ0).
Lemma 7.3.4. Suppose that (H0)–(H3) hold. Then, for each x ∈ D such that x ∈
U(x⋆, min{ν0, κ}), F′(x)∗F′(x) is invertible and the following estimates hold:
‖F′(x)⁺‖ ≤ β / (1 − β (f0′(‖x − x⋆‖) + 1))
and
‖F′(x)⁺ − F′(x⋆)⁺‖ ≤ √2 β² (f0′(‖x − x⋆‖) + 1) / (1 − β (f0′(‖x − x⋆‖) + 1)).
In particular, F′(x)∗F′(x) is invertible in U(x⋆, r0).
Proof. Since x ∈ D is such that x ∈ U(x⋆, min{ν0, κ}), we have ‖x − x⋆‖ ≤ ν0. By Lemma 7.3.1,
(7.3.1) and the definition of β, we have that
We shall bound this error by the error in the linearization of the majorant function f:
e_f(t, u) := f(u) − (f(t) + f′(t)(u − t)) for each t, u ∈ [0, R).   (7.3.8)
Lemma 7.3.5. Suppose that (H0)–(H3) hold. If ‖x⋆ − x‖ < κ, then the following assertion holds:
‖E_F(x, x⋆)‖ ≤ e_f(‖x − x⋆‖, 0).
Lemma 7.3.6. Suppose that (H0)–(H3) hold. Then, for each x ∈ D such that x ∈
U(x⋆, min{ν0, κ}), the following estimate holds:
‖S_F(x)‖ ≤ (β e_f(‖x − x⋆‖, 0) + √2 c β² (f0′(‖x − x⋆‖) + 1)) / (1 − β (f0′(‖x − x⋆‖) + 1)) + ‖x − x⋆‖.
Proof. Let x ∈ D be such that x ∈ U(x⋆, min{ν0, κ}). Using (7.3.7) and (7.3.9), we have that
‖S_F(x)‖ = ‖F′(x)⁺(F(x⋆) − (F(x) − F′(x)(x⋆ − x))) − (F′(x)⁺ − F′(x⋆)⁺) F(x⋆) + (x⋆ − x)‖
≤ ‖F′(x)⁺‖ ‖E_F(x, x⋆)‖ + ‖F′(x)⁺ − F′(x⋆)⁺‖ ‖F(x⋆)‖ + ‖x⋆ − x‖.
Then, we deduce the desired result by Lemmas 7.3.4 and 7.3.5. That completes the
proof of Lemma 7.3.6.
Lemma 7.3.7. Let the parameters ϑ, ω1 and ω2 be defined by (7.3.3), and let ν0, ρ0 and r0 be as defined
in (7.3.4), (7.3.5) and (7.3.6), respectively. Suppose that (H0)–(H3) hold. For each x ∈
U(x⋆, r0) \ {x⋆}, define
Suppose also that the forcing term θ and the residual r (as defined in (7.1.3)) satisfy
‖x₊ − x⋆‖ ≤ (1 + ϑ) ω1 β [ (f′(‖x⋆ − x‖) ‖x⋆ − x‖ − f(‖x⋆ − x‖)) / (‖x⋆ − x‖² (1 − β (f0′(‖x⋆ − x‖) + 1))) ] ‖x⋆ − x‖²
+ [ (1 + ϑ) ω1 √2 c β² (f0′(‖x − x⋆‖) + 1) / (‖x⋆ − x‖ (1 − β (f0′(‖x − x⋆‖) + 1))) + ω1 ϑ + ω2 ] ‖x⋆ − x‖.
(7.3.13)
In particular, ‖x₊ − x⋆‖ < ‖x⋆ − x‖.
Proof. By Lemma 7.3.4 and since x ∈ U(x⋆, r0), we have that F′(x)∗F′(x) is invertible. In
view of (7.1.2) and (7.3.10), we obtain the identity
Next, we provide the main local convergence result for the inexact Gauss-Newton-like
method.
Theorem 7.3.8. Let F : D ⊆ X −→ Y be a continuously differentiable operator. Let the parameters ϑ, ω1 and ω2 be defined by (7.3.3), and let ν0, ρ0 and r0 be as defined in (7.3.4), (7.3.5)
and (7.3.6), respectively. Suppose that (H0)–(H3) hold. Then, the sequence {xn} generated by
(7.1.2), starting at x0 ∈ U(x⋆, r0) \ {x⋆}, with the forcing term θn, the residual rn and the
invertible preconditioning matrix Pn satisfying the following estimates for each n = 0, 1, · · ·:
‖Pn rn‖ ≤ θn ‖Pn F′(xn)∗ F(xn)‖,  0 ≤ θn cond(Pn F′(xn)∗ F′(xn)) ≤ ϑ,
‖B(xn)⁻¹ F′(xn)∗ F′(xn)‖ ≤ ω1 and ‖B(xn)⁻¹ F′(xn)∗ F′(xn) − I‖ ≤ ω2,
is well defined, remains in U(x⋆, r0) for all n ≥ 0 and converges to x⋆. Moreover, the
following estimate holds for each n = 0, 1, · · ·:
‖x_{n+1} − x⋆‖ ≤ Ξn ‖xn − x⋆‖,   (7.3.17)
where
Ξn = (1 + ϑ) ω1 β [ (f′(‖x⋆ − x0‖) ‖x⋆ − x0‖ − f(‖x⋆ − x0‖)) / (‖x⋆ − x0‖² (1 − β (f0′(‖x⋆ − x0‖) + 1))) ] ‖x⋆ − xn‖
+ (1 + ϑ) ω1 √2 c β² (f0′(‖x0 − x⋆‖) + 1) / (‖x⋆ − x0‖ (1 − β (f0′(‖x0 − x⋆‖) + 1))) + ω1 ϑ + ω2.
Proof. By an induction argument and Lemmas 7.3.4 and 7.3.7, the sequence {xn} starting at x0 ∈ U(x⋆, r0) \
{x⋆} is well defined in U(x⋆, r0). By letting x₊ = x_{n+1}, x = xn, r = rn, P = Pn and θ = θn
in (7.3.10)–(7.3.12), we get that
‖x_{n+1} − x⋆‖ ≤ (1 + ϑ) ω1 β [ (f′(‖x⋆ − xn‖) ‖x⋆ − xn‖ − f(‖x⋆ − xn‖)) / (‖x⋆ − xn‖² (1 − β (f0′(‖x⋆ − xn‖) + 1))) ] ‖x⋆ − xn‖²
+ [ (1 + ϑ) ω1 √2 c β² (f0′(‖xn − x⋆‖) + 1) / (‖x⋆ − xn‖ (1 − β (f0′(‖xn − x⋆‖) + 1))) + ω1 ϑ + ω2 ] ‖x⋆ − xn‖.
We also have by Lemma 7.3.7 that ‖xn − x⋆‖ ≤ ‖x0 − x⋆‖ for each n = 1, 2, · · ·. Hence,
(7.3.17) holds. Lemma 7.3.3 implies that x_{n+1} ∈ U(x⋆, r0) and lim_{n→∞} xn = x⋆. The proof
of Theorem 7.3.8 is complete.
Remark 7.3.9. If f0(t) = f(t) for each t ∈ [0, R), then Theorem 7.3.8 reduces to [28,
Theorem 7]. In particular, we have in this case that ν = ν0, ρ = ρ0, δ = δ0, α = α0, r = r0
and D⁺f′(0) = D⁺f0′(0), where ν, ρ, δ, α, r and D⁺f′(0) are defined, respectively, as ν0,
ρ0, δ0, α0, r0 and D⁺f0′(0) by setting f0(t) = f(t). Otherwise, i.e., if
f0(t) < f(t) and f0′(t) < f′(t) for each t ∈ [0, R),   (7.3.18)
then the new results provide a larger convergence radius and tighter error estimates than the ones in [28].
Note that these advantages are obtained under the same computational cost, since in practice
the computation of the function f requires that of f0. Note also that the local results in
[18], [19], [24]–[27] are also extended, since these are special cases of Theorem 7.3.8. In
particular, if ϑ = 0 (i.e., if θn = rn = 0 for each n = 0, 1, · · ·) in Theorem 7.3.8, we improve
the convergence of the Gauss-Newton-like method under the majorant condition, which for ω1 = 1
and ω2 = 0 was obtained in [26, Theorem 7]. These results extend the ones obtained
by Chen and Li in [18], [19], given only for the case c = 0. Moreover, if c = 0 and
F′(x⋆) is invertible, we extend the convergence of inexact Newton-like methods under the majorant
condition, which was obtained in [24, Theorem 4]. Furthermore, if c = ϑ = ω2 = 0,
ω1 = 1 and F′(x⋆) is invertible in Theorem 7.3.8, we extend the convergence of Newton's
method under the majorant condition obtained in [24, Theorem 2.1].
In the next section, we shall show how to choose the functions f0 and f so that (7.3.18) is
satisfied.
Define
f0(t) = (L0 t²)/2 − t and f(t) = (L t²)/2 − t,
where L0 and L are the center-Lipschitz and Lipschitz constants, respectively. We have that
f0(0) = f(0) = 0 and f0′(0) = f′(0) = −1. Set also R = 1/L. Then, it can easily be seen
that Theorem 7.3.8 specializes to the following proposition.
Then, the sequence {xn} generated by (7.1.2) with rn = 0 and B(xn) = F′(xn)∗F′(xn), starting
at x0 ∈ U(x⋆, r0) \ {x⋆}, with the forcing term θn and the invertible preconditioning matrix
Pn satisfying the following estimates for each n = 0, 1, · · ·:
‖Pn rn‖ ≤ θn ‖Pn F′(xn)∗ F(xn)‖,  0 ≤ θn cond(Pn F′(xn)∗ F′(xn)) ≤ ϑ,
is well defined, remains in U(x⋆, r0) for all n ≥ 0 and converges to x⋆. Moreover, the following estimate holds for each n = 0, 1, · · ·:
‖x_{n+1} − x⋆‖ ≤ Δn ‖xn − x⋆‖,   (7.4.1)
where
Δn = (1 + ϑ) ω1 β L ‖x⋆ − xn‖ / (2 (1 − β L0 ‖x0 − x⋆‖)) + (1 + ϑ) ω1 √2 c β² L0 / (1 − β L0 ‖x0 − x⋆‖) + ω1 ϑ + ω2.
Remark 7.4.3. (a) If L0 = L, Proposition 7.4.2 reduces to [28, Theorem 16]. Moreover,
if ϑ = 0, Proposition 7.4.2 reduces to [28, Corollary 17]. Furthermore, if c = 0, then
Proposition 7.4.2 reduces to [19, Corollary 6.1].
(b) If F(x⋆) = 0, F′(x⋆)⁺ = F′(x⋆)⁻¹ and L0 < L, then Theorem 7.3.8 improves the corresponding
results on inexact Newton-like methods [30], [33], [37], [38], and in particular
for Newton's method. Set c = ϑ = ω2 = 0 and ω1 = 1. Then, we get that
r0 = min{κ, 2/(β(2L0 + L))}.
This radius is at least as large as the one provided by Traub [46], which is given by
r0′ = min{κ, 2/(3βL)}.
Let us provide a numerical example for this case.
Example 7.4.4. Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1], be
equipped with the max norm, and let D = U(0, 1). Define the function F on D by
F(h)(x) = h(x) − 5 ∫₀¹ x θ h(θ)³ dθ.   (7.4.2)
Using Proposition 7.4.2, we see that the hypotheses hold for x⋆(x) = 0, where x ∈ [0, 1], with
β = 1, L = 15 and L0 = 7.5. Then, we get that
r0′ = min{κ, 2/45} ≤ min{κ, 1/15} = r0.
Clearly, if min{κ, 2/45} = 2/45, then we deduce in particular that r0′ < r0.
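The two radii can be compared numerically for the data of Example 7.4.4; the value of κ below is an assumed placeholder, chosen large enough not to be the binding term in either minimum.

```python
# Radii for Example 7.4.4: beta = 1, L = 15, L0 = 7.5; kappa = 1 is an assumed value.
kappa, beta, L, L0 = 1.0, 1.0, 15.0, 7.5

r_traub = min(kappa, 2.0 / (3.0 * beta * L))          # Traub: 2/(3 beta L) = 2/45
r_new   = min(kappa, 2.0 / (beta * (2.0 * L0 + L)))   # new:   2/(beta(2 L0 + L)) = 1/15

assert r_traub < r_new    # the new radius is strictly larger here
```

With L0 = L the two formulas coincide, so the improvement comes entirely from the center-Lipschitz constant L0 being smaller than L.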
Remark 7.4.5. Let γ ≥ γ0. Let us define the functions f0, f : [0, κ] −→ ℝ by
f0(t) = t/(1 − γ0 t) − 2t and f(t) = t/(1 − γ t) − 2t.
Then, we have that f0(0) = f(0) = 0 and f0′(0) = f′(0) = −1. Set also R = 1/γ. Note also that
f0′(t) = 1/(1 − γ0 t)² − 2,  f′(t) = 1/(1 − γ t)² − 2,
f0″(t) = 2γ0/(1 − γ0 t)³ and f″(t) = 2γ/(1 − γ t)³.
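The derivative formulas above can be checked with a quick finite-difference computation; the sample values γ0 = 0.5 and γ = 2 are arbitrary choices made for this illustration.

```python
# Finite-difference check of f0'(t), f'(t) and f0'(0) = f'(0) = -1 for the
# majorant functions f0(t) = t/(1 - g0 t) - 2t and f(t) = t/(1 - g t) - 2t.
g0, g = 0.5, 2.0   # sample gamma_0 <= gamma
f0 = lambda t: t / (1 - g0 * t) - 2 * t
f  = lambda t: t / (1 - g * t) - 2 * t
d  = lambda h, t, eps=1e-6: (h(t + eps) - h(t - eps)) / (2 * eps)  # central difference

t = 0.1            # any t with t < 1/g
assert abs(d(f0, t) - (1 / (1 - g0 * t) ** 2 - 2)) < 1e-8
assert abs(d(f, t) - (1 / (1 - g * t) ** 2 - 2)) < 1e-8
assert abs(d(f0, 0.0) + 1) < 1e-8 and abs(d(f, 0.0) + 1) < 1e-8
```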
We introduce the definition of the center γ0 -condition.
Definition 7.4.6. Let γ0 > 0 and let 0 < µ ≤ 1/γ0 be such that U(x⋆, µ) ⊆ D. The operator
F is said to satisfy the center γ0-condition at x⋆ on U(x⋆, µ) if
‖F′(x) − F′(x⋆)‖ ≤ 1/(1 − γ0 ‖x − x⋆‖)² − 1 for each x ∈ U(x⋆, µ).
We also need the definition of the γ-condition due to Wang [47].
Definition 7.4.7. Let γ > 0 and let 0 < µ ≤ 1/γ be such that U(x⋆, µ) ⊆ D. The operator F
is said to satisfy the γ-condition at x⋆ on U(x⋆, µ) if
‖F″(x)‖ ≤ 2γ/(1 − γ ‖x − x⋆‖)³ for each x ∈ U(x⋆, µ).
Remark 7.4.8. (a) Note that γ0 ≤ γ holds in general and γ/γ0 can be arbitrarily large
[7]–[16].
(b) If F is an analytic function, Smale [44] used the choice
γ = sup_{n ≥ 2} ‖F′(x⋆)⁻¹ F⁽ⁿ⁾(x⋆)/n!‖^{1/(n−1)} < +∞.
Using the above definitions and choices of functions (see Remark 7.4.5, Definitions
7.4.6 and 7.4.7), the corresponding specialization of Theorem 7.3.8 along the lines
of Proposition 7.4.2 can be obtained. However, we leave this part to the interested
reader. Note that clearly if γ0 = γ, this result reduces to [28, Theorem 18], which in
turn reduces to [19, Example 1] if c = 0. Otherwise (i.e., if γ0 < γ), our result is an
improvement.
Next, we provide an example where γ0 < γ in the case when F(x⋆) = 0, F′(x⋆)⁺ =
F′(x⋆)⁻¹, c = ϑ = ω2 = 0 and ω1 = 1.
References
[1] Amat, S., Bermúdez, C., Busquier, S., Legaz, M.J., Plaza, S., On a family of high-
order iterative methods under Kantorovich conditions and some applications, Abstr.
Appl. Anal. 2012, Art. ID 782170, 14 pp.
[2] Amat, S., Bermúdez, C., Busquier, S., Plaza, S., On a third-order Newton-type method
free of bilinear operators, Numer. Linear Algebra Appl., 17 (2010), 639–653.
[3] Amat, S., Busquier, S., Gutiérrez, J.M., Geometric constructions of iterative functions
to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197–205.
[4] Amat, S., Busquier, S., Gutiérrez, J.M., Third-order iterative methods with applica-
tions to Hammerstein equations: a unified approach, J. Comput. Appl. Math., 235
(2011), 2936–2943.
[5] Argyros, I.K., Forcing sequences and inexact Newton iterates in Banach space, Appl.
Math. Lett., 13 (2000), 69–75.
[6] Argyros, I.K., A unifying local–semilocal convergence analysis and applications for
two–point Newton–like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374–397.
[7] Argyros, I.K., On the semilocal convergence of the Gauss–Newton method, Adv. Non-
linear Var. Inequal., 8 (2005), 93–99.
[9] Argyros, I.K., On the semilocal convergence of inexact Newton methods in Banach
spaces, J. Comput. Appl. Math., 228 (2009), 434–443.
[10] Argyros, I.K., Cho, Y.J., Hilout, S., On the local convergence analysis of inexact
Gauss–Newton–like methods, Panamer. Math. J., 21 (2011), 11–18.
[11] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, Science Publishers, New Hampshire, USA, 2012.
[12] Argyros, I.K., Hilout, S., On the local convergence of the Gauss-Newton method,
Punjab Univ. J. Math., 41 (2009), 23–33.
[13] Argyros, I.K., Hilout, S., Improved generalized differentiability conditions for
Newton–like methods, J. Complexity, 26 (2010), 316–333.
[14] Argyros, I.K., Hilout, S., Extending the applicability of the Gauss-Newton method
under average Lipschitz-type conditions, Numer. Algorithms, 58 (2011), 23–52.
[15] Argyros, I.K., Hilout, S., Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236 (2012), 1892–1902.
[16] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364–387.
[17] Chen, J., The convergence analysis of inexact Gauss–Newton methods for nonlinear
problems, Comput. Optim. Appl., 40 (2008), 97–118.
[18] Chen, J., Li, W., Convergence of Gauss–Newton’s method and uniqueness of the so-
lution, App. Math. Comp., 170 (2005), 686–705.
[19] Chen, J., Li, W., Local convergence results of Gauss-Newton’s like method in weak
conditions, J. Math. Anal. Appl., 324 (2006), 1381–1394.
[20] Chen, J., Li, W., Convergence behaviour of inexact Newton methods under weak Lip-
schitz condition, J. Comput. Appl. Math., 191 (2006), 143–164.
[21] Dedieu, J.P., Shub, M., Newton’s method for overdetermined systems of equations,
Math. Comp., 69 (2000), 1099–1115.
[22] Dembo, R.S., Eisenstat, S.C., Steihaug, T., Inexact Newton methods, SIAM J. Numer.
Anal., 19 (1982), 400–408.
[23] Dennis, J.E., Schnabel, R.B., Numerical methods for unconstrained optimization and
nonlinear equations (Corrected reprint of the 1983 original), Classics in Appl. Math.,
16, SIAM, Philadelphia, PA, 1996.
[24] Ferreira, O.P., Local convergence of Newton’s method in Banach space from the view-
point of the majorant principle, IMA J. Numer. Anal., 29 (2009), 746–759.
[25] Ferreira, O.P., Local convergence of Newton’s method under majorant condition, J.
Comput. Appl. Math., 235 (2011), 1515–1522.
[26] Ferreira, O.P., Gonçalves, M.L.N., Local convergence analysis of inexact
Newton-like methods under majorant condition, Comput. Optim. Appl., 48 (2011),
1–21.
[27] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of
Gauss–Newton like methods under majorant condition, J. Complexity, 27 (2011), 111–
125.
[28] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of inex-
act Gauss–Newton like methods under majorant condition, J. Comput. Appl. Math.,
236 (2012), 2487–2498.
[29] Ferreira, O.P., Svaiter, B.F., Kantorovich’s majorants principle for Newton’s method,
Comput. Optim. Appl., 42 (2009), 213–229.
[30] Guo, X., On semilocal convergence of inexact Newton methods, J. Comput. Math., 25
(2007), 231–242.
[31] Gutiérrez, J.M., Hernández, M.A., Newton’s method under weak Kantorovich condi-
tions, IMA J. Numer. Anal., 20 (2000), 521–532.
[33] Huang, Z.A., Convergence of inexact Newton method, J. Zhejiang Univ. Sci. Ed., 30
(2003), 393–396.
[34] Hiriart-Urruty, J.B, Lemaréchal, C., Convex analysis and minimization algorithms
(two volumes). I. Fundamentals. II. Advanced theory and bundle methods, 305 and
306, Springer–Verlag, Berlin, 1993.
[35] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[36] Li, C., Hu, N., Wang, J., Convergence behavior of Gauss–Newton's method and ex-
tensions to the Smale point estimate theory, J. Complexity, 26 (2010), 268–295.
[37] Li, C., Shen, W.P., Local convergence of inexact methods under the Hölder condition,
J. Comput. Appl. Math., 222 (2008), 544–560.
[38] Li, C., Zhang, W–H., Jin, X–Q., Convergence and uniqueness properties of Gauss-
Newton’s method, Comput. Math. Appl., 47 (2004), 1057–1067.
[39] Martı́nez, J.M., Qi, L.Q., Inexact Newton methods for solving nonsmooth equations.
Linear/nonlinear iterative methods and verification of solution (Matsuyama, 1993), J.
Comput. Appl. Math., 60 (1995), 127–145.
[40] Morini, B., Convergence behaviour of inexact Newton methods, Math. Comp., 68
(1999), 1605–1613.
[41] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.
[42] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.
[43] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton–Kantorovich type theorems, J. Complexity, 26 (2010), 3–42.
[44] Smale, S., Newton’s method estimates from data at one point. The merging of dis-
ciplines: new directions in pure, applied, and computational mathematics (Laramie,
Wyo., 1985), 185-196, Springer, New York, 1986.
[45] Stewart, G.W., On the continuity of the generalized inverse, SIAM J. Appl. Math., 17
(1969), 33–45.
[46] Traub, J.F., Iterative Methods for the Solution of Equations, Englewood Cliffs, New
Jersey: Prentice Hall, 1964.
[47] Wang, X.H., Convergence of Newton’s method and uniqueness of the solution of equa-
tions in Banach spaces, IMA J. Numer. Anal., 20 (2000), 123–134.
Chapter 8
8.1. Introduction
In this chapter, we are interested in obtaining a stable approximate solution for a nonlin-
ear ill-posed operator equation of the form
F(x) = y, (8.1.1)
where the available noisy data yδ satisfy
‖y − yδ‖ ≤ δ   (8.1.3)
and (8.1.1) has a solution x̂. Since (8.1.1) is ill-posed, its solution need not depend continuously
on the data, i.e., small perturbations in the data can cause large deviations in the
solution, so regularization methods are used ([9, 10, 11, 13, 14, 17, 19, 20]). Since F is
monotone, the Lavrentiev regularization is used to obtain a stable approximate solution of
(8.1.1). In the Lavrentiev regularization, the approximate solution is obtained as a solution
of the equation
F(x) + α(x − x0 ) = yδ , (8.1.4)
where α > 0 is the regularization parameter and x0 is an initial guess for the solution x̂.
In [8], Bakushinskii and Smirnova proposed the iterative method
x^δ_{k+1} = x^δ_k − (αk I + A_{k,δ})⁻¹ [(F(x^δ_k) − yδ) + αk (x^δ_k − x0)],  x^δ_0 = x0,   (8.1.5)
where A_{k,δ} := F′(x^δ_k) and (αk) is a sequence of positive real numbers satisfying αk → 0 as
k → ∞. It is important to stop the iteration at an appropriate step, say k = kδ, and to show that
x^δ_k is well defined for 0 < k ≤ kδ and that x^δ_{kδ} → x̂ as δ → 0 (see [15]).
In [6]–[8], Bakushinskii and Smirnova chose the stopping index kδ by requiring it to
satisfy
‖F(x^δ_{kδ}) − yδ‖² ≤ τδ < ‖F(x^δ_k) − yδ‖²
for k = 0, 1, · · · , kδ − 1, with τ > 1. In fact, they showed that x^δ_{kδ} → x̂ as δ → 0 under the
following assumptions:
(1) There exists L > 0 such that ‖F′(x) − F′(y)‖ ≤ L‖x − y‖ for all x, y ∈ D(F);
(2) There exists p > 0 such that
(αk − αk+1)/(αk αk+1) ≤ p   (8.1.6)
for all k ∈ ℕ;
(3) (2 + Lσ)‖x0 − x̂‖ t d ≤ σ − 2‖x0 − x̂‖ t ≤ d α0, where
σ := (√τ − 1)²,  t := pα0 + 1,  d = 2(t‖x0 − x̂‖ + pσ).
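A one-dimensional sketch of iteration (8.1.5), combined with a discrepancy-type stopping rule, may help fix ideas. Everything concrete here is an assumption made for the illustration: the monotone operator F(x) = x³ + x, the noise level δ, the parameter τ and the geometric choice αk = 0.5ᵏ are not taken from the text.

```python
# Toy monotone operator and noisy data (illustrative assumptions only).
F  = lambda x: x**3 + x          # monotone on R, since F'(x) > 0
dF = lambda x: 3 * x**2 + 1
x_hat, delta, tau = 0.5, 1e-3, 1.5
y_delta = F(x_hat) + delta       # data with |y - y_delta| <= delta

x0 = x = 1.0                     # initial guess
alpha = 1.0                      # alpha_k = 0.5**k -> 0 as k -> infinity
for k in range(60):
    if abs(F(x) - y_delta) <= tau * delta:   # discrepancy-type stopping index k_delta
        break
    # One step of (8.1.5):
    # x_{k+1} = x_k - (alpha_k I + F'(x_k))^{-1} [(F(x_k) - y_delta) + alpha_k (x_k - x0)]
    x = x - (F(x) - y_delta + alpha * (x - x0)) / (alpha + dF(x))
    alpha *= 0.5
```

With these choices the loop terminates via the stopping test at an iterate close to x̂ = 0.5, and the final error shrinks as δ decreases.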
The condition (2) in Assumption 8.2.1 weakens the popular hypotheses given in [15],
[16] and [18].
Assumption 8.2.2. There exists a constant K > 0 such that, for all x, y ∈ U(x̂, r) and v ∈ X,
there exists an element denoted by P(x, y, v) ∈ X satisfying
Clearly, Assumption 8.2.2 implies Assumption 8.2.1 (2) with K0 = K, but not necessarily
vice versa. Note that K0 ≤ K holds in general and K/K0 can be arbitrarily large [1]–[5].
Indeed, there are many classes of operators satisfying Assumption 8.2.1 (2) but not Assumption
8.2.2 (see the numerical examples at the end of this chapter). Moreover, if K0
is sufficiently smaller than K, which can happen since K/K0 can be arbitrarily large, then the
results obtained in this chapter provide a tighter error analysis than the one in [15].
Finally, note that the computation of the constant K is more expensive than the computation
of K0.
We need the following auxiliary results based on Assumption 8.2.1.
Proposition 8.2.3. For any u ∈ U(x̂, r) and α > 0,
‖(F′(u) + αI)⁻¹[F(x̂) − F(u) − F′(u)(x̂ − u)]‖ ≤ (3K0/2) ‖x̂ − u‖².
Proof. Using the fundamental theorem of integration, for any u ∈ U(x̂, r), we get
F(x̂) − F(u) = ∫₀¹ F′(u + t(x̂ − u))(x̂ − u) dt.
Then, by (2) and (3) in Assumption 8.2.1 and the inequality ‖(F′(u) + αI)⁻¹F′(u_θ)‖ ≤ 1, we
obtain in turn
Proof. Let T_{x̂,u} v = α((F′(x̂) + αI)⁻¹ − (F′(u) + αI)⁻¹)v for all v ∈ X. Then we have, by
Assumption 8.2.2,
Note that the condition (8.2.3) on {αk} is weaker than (8.1.6) considered by Bakushinskii
and Smirnova [8] (see [15]). In fact, if (8.1.6) is satisfied, then (8.2.3) also holds
with µ = pα0 + 1, but the converse need not be true (see [15]). Further, note that, for these
choices of {αk}, αk/αk+1 is bounded, whereas (αk − αk+1)/(αk αk+1) → ∞ as k → ∞. The condition (2) in
Assumption 8.2.1 is used in the literature for the regularization of many nonlinear ill-posed
problems (see [12], [13], [19]–[21]).
In the following, we establish the existence of such a kδ. First, we consider the positive
integer N ∈ ℕ satisfying
αN ≤ (c − 1)δ/‖x0 − x̂‖ < αk   (8.3.2)
Expending the Applicability of Lavrentiev Regularization Methods ... 155
for all k ∈ {0, 1, · · · , N − 1}, where c > 1 and α0 > (c − 1)δ/‖x0 − x̂‖.
The following technical lemma from [15] is used to prove some of the results of this
chapter.
Lemma 8.3.1. ([15], Lemma 3.1) Let a > 0 and b ≥ 0 be such that 4ab ≤ 1 and θ :=
(1 − √(1 − 4ab))/(2a). Let θ1, · · · , θn be non-negative real numbers such that θ_{k+1} ≤ a θ_k² + b
and θ0 ≤ θ. Then θk ≤ θ for all k = 1, 2, · · · , n.
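The mechanism behind Lemma 8.3.1 can be illustrated numerically (the values a = 1 and b = 0.2 are arbitrary sample choices with 4ab ≤ 1): θ is a fixed point of the increasing map t ↦ a t² + b on [0, θ], so iterates starting at or below θ remain at or below θ.

```python
import math

a, b = 1.0, 0.2                          # sample values with 4ab = 0.8 <= 1
theta = (1 - math.sqrt(1 - 4 * a * b)) / (2 * a)

t = 0.1 * theta                          # theta_0 <= theta
history = []
for _ in range(50):
    t = a * t**2 + b                     # theta_{k+1} <= a theta_k^2 + b (equality here)
    history.append(t)

assert abs(a * theta**2 + b - theta) < 1e-12          # theta is a fixed point
assert all(s <= theta + 1e-12 for s in history)       # iterates never exceed theta
```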
The rest of the results in this chapter can be proved along the same lines as the proofs in
[15]. In order to make the chapter as self-contained as possible, we present the proof
of one of them and refer the reader to [15] for the proofs of the rest.
Theorem 8.3.2. ([15], Theorem 3.2) Let (8.1.2), (8.1.3), (8.2.3) and Assumption 8.2.1 be
satisfied. Let N be as in (8.3.2) for some c > 1 with 6cK0‖x0 − x̂‖/(c − 1) ≤ 1. Then x^δ_k is
defined iteratively for each k ∈ {0, 1, · · · , N} and
‖x^δ_k − x̂‖ ≤ 2c‖x0 − x̂‖/(c − 1)   (8.3.3)
for all k ∈ {0, 1, · · · , N}. In particular, if r > 2c‖x0 − x̂‖/(c − 1), then x^δ_k ∈ B_r(x̂) for k ∈
{0, 1, · · · , N}. Moreover,
‖αN (A^δ_N + αN I)⁻¹(F(x^δ_N) − yδ)‖ ≤ c0 δ   (8.3.4)
for c0 := 7c/3 + 1.
Proof. We show (8.3.3) by induction. It is obvious that (8.3.3) holds for k = 0. Now, assume
that (8.3.3) holds for some k ∈ {0, 1, · · · , N − 1}. Then it follows from (8.1.5) that
x^δ_{k+1} − x̂ = x^δ_k − x̂ − (A^δ_k + αk I)⁻¹[F(x^δ_k) − yδ + αk(x^δ_k − x0)]
= (A^δ_k + αk I)⁻¹((A^δ_k + αk I)(x^δ_k − x̂) − [F(x^δ_k) − yδ + αk(x^δ_k − x0)])
= (A^δ_k + αk I)⁻¹[A^δ_k(x^δ_k − x̂) + yδ − F(x^δ_k) + αk(x0 − x̂)]
= αk(A^δ_k + αk I)⁻¹(x0 − x̂) + (A^δ_k + αk I)⁻¹(yδ − y)   (8.3.5)
+ (A^δ_k + αk I)⁻¹[F(x̂) − F(x^δ_k) + A^δ_k(x^δ_k − x̂)].
Using (8.1.3), the estimates ‖(A^δ_k + αk I)⁻¹‖ ≤ 1/αk, ‖(A^δ_k + αk I)⁻¹A^δ_k‖ ≤ 1 and Proposition
8.2.3, we have
‖αk(A^δ_k + αk I)⁻¹(x0 − x̂) + (A^δ_k + αk I)⁻¹(yδ − y)‖ ≤ ‖x0 − x̂‖ + δ/αk
and
‖(A^δ_k + αk I)⁻¹[F(x̂) − F(x^δ_k) + A^δ_k(x^δ_k − x̂)]‖ ≤ (3K0/2)‖x^δ_k − x̂‖².
Thus we have
‖x^δ_{k+1} − x̂‖ ≤ ‖x0 − x̂‖ + δ/αk + (3K0/2)‖x^δ_k − x̂‖².
But, by (8.3.2), δ/αk ≤ ‖x0 − x̂‖/(c − 1), and so
θ_{k+1} ≤ a θ_k² + b,
where
θk = ‖x^δ_k − x̂‖,  a = 3K0/2,  b = c‖x0 − x̂‖/(c − 1).
From the hypothesis of the theorem, we have 4ab = 6cK0‖x0 − x̂‖/(c − 1) ≤ 1. It is obvious that
θ0 ≤ ‖x0 − x̂‖ ≤ θ := (1 − √(1 − 4ab))/(2a) = 2b/(1 + √(1 − 4ab)) ≤ 2b = 2c‖x0 − x̂‖/(c − 1).
Hence, by Lemma 8.3.1,
‖x^δ_k − x̂‖ ≤ θ ≤ 2c‖x0 − x̂‖/(c − 1)   (8.3.6)
for all k ∈ {0, 1, · · · , N}. In particular, if r > 2c‖x0 − x̂‖/(c − 1), then we have x^δ_k ∈ B_r(x̂)
for all k ∈ {0, 1, · · · , N}.
Next, let γ = ‖αN (A^δ_N + αN I)⁻¹(F(x^δ_N) − yδ)‖. Then, using the estimates above, we get
γ ≤ δ + ‖αN (A^δ_N + αN I)⁻¹(F(x^δ_N) − y + A^δ_N(x^δ_N − x̂) − A^δ_N(x^δ_N − x̂))‖
= δ + ‖αN (A^δ_N + αN I)⁻¹[F(x^δ_N) − F(x̂) − A^δ_N(x^δ_N − x̂) + A^δ_N(x^δ_N − x̂)]‖
≤ δ + αN [3K0 ‖x^δ_N − x̂‖²/2 + ‖x^δ_N − x̂‖]
≤ δ + αN ‖x^δ_N − x̂‖ [1 + 3K0 ‖x^δ_N − x̂‖/2]   (8.3.7)
≤ δ + (2αN c‖x0 − x̂‖/(c − 1)) [1 + 3K0 c‖x0 − x̂‖/(c − 1)] ≤ δ + 2cδ[1 + 1/6] ≤ (7c/3 + 1)δ.
This completes the proof.
Lemma 8.4.1. ([15], Lemma 4.1) Let Assumption 8.2.1 hold. Suppose that, for all k ∈
{0, 1, · · · , n}, xk in (8.4.1) is well defined and ρk := ‖αk (Ak + αk I)⁻¹(x0 − x̂)‖ for some
n ∈ ℕ. Then we have
ρk − (3K0/2)‖xk − x̂‖² ≤ ‖x_{k+1} − x̂‖ ≤ ρk + (3K0/2)‖xk − x̂‖²   (8.4.2)
for all k ∈ {0, 1, · · · , n}.
Theorem 8.4.2. ([15], Theorem 4.2) Let Assumption 8.2.1 hold. If 6K0‖x0 − x̂‖ ≤ 1 and
r > 2‖x0 − x̂‖, then, for all k ∈ ℕ, the iterates xk in (8.4.1) are well defined and
‖xk − x̂‖ ≤ 2‖x0 − x̂‖ / (1 + √(1 − 6K0‖x0 − x̂‖)) ≤ 2‖x0 − x̂‖   (8.4.3)
for all k ∈ ℕ.
Lemma 8.4.3. ([15], Lemma 4.3) Let Assumptions 8.2.1 and 8.2.6 hold and let r > 2‖x0 −
x̂‖. Assume that ‖A‖ ≤ ηα0 and 4µ(1 + η⁻¹)K0‖x0 − x̂‖ ≤ 1 for some η with 0 < η < 1.
Then, for all k ∈ ℕ, we have
(1/((1 + η)µ)) ‖xk − x̂‖ ≤ ‖αk (Ak + αk I)⁻¹(x0 − x̂)‖ ≤ (1/(1 − η)) ‖xk − x̂‖   (8.4.4)
and
((1 − η)/((1 + η)µ)) ‖xk − x̂‖ ≤ ‖x_{k+1} − x̂‖ ≤ (1/(1 − η) + η/((1 + η)µ)) ‖xk − x̂‖.   (8.4.5)
The following corollary follows from Lemma 8.4.3 by taking η = 1/3. This
particular case of Lemma 8.4.3 is better suited for our later results.
Corollary 8.4.4. ([15], Corollary 4.4) Let Assumptions 8.2.1 and 8.2.6 hold and let r >
2‖x0 − x̂‖. Assume that ‖A‖ ≤ α0/3 and 16µK0‖x0 − x̂‖ ≤ 1. Then, for all k ∈ ℕ, we have
(3/(4µ)) ‖xk − x̂‖ ≤ ‖αk (A + αk I)⁻¹(x0 − x̂)‖ ≤ (3/2) ‖xk − x̂‖   (8.4.6)
and
(1/(2µ)) ‖xk − x̂‖ ≤ ‖x_{k+1} − x̂‖ ≤ 2 ‖xk − x̂‖.
Theorem 8.4.5. ([15], Theorem 4.5) Let the assumptions of Lemma 8.4.3 hold. If x0 is
chosen such that x0 − x̂ ∈ N(F′(x̂))⊥, then lim_{k→∞} xk = x̂.
Lemma 8.4.6. ([15], Lemma 4.6) Let the assumptions of Lemma 8.4.3 hold for η satisfying
(1 − √(1 − η/((1 + η)µ)))[1 + (2µ − 1)η + 2µ] + 2η < 4/3.   (8.4.7)
Remark 8.4.7. ([15], Remark 4.7) It can be seen that (8.4.7) is satisfied if η ≤ 1/3 + 1/24.
Now, if we take η = 1/3, that is, K0‖x0 − x̂‖µ ≤ 1/16, in Lemma 8.4.6, then it takes the
following form.
Lemma 8.4.8. ([15], Lemma 4.8) Let the assumptions of Lemma 8.4.3 hold with η = 1/3.
Then, for all k ≥ l ≥ 0, we have
‖xl − x̂‖ ≤ c_{1/3} [ ‖xk − x̂‖ + ‖αl (A + αl I)⁻¹(F(xl) − y)‖/αk ],
where
c_{1/3} = [1 − (8µ + (8µ + 1)3ε)/(16µ)]⁻¹ max{µ, 1 + (3ε + 1)/8},
ε := √(4µ) / (√(4µ) + √(4µ − 1)).
c > (m² − 4m − 6)/(m² − 7m − 6).
‖x^δ_k − xk‖ ≤ δ/((1 − κ)αk),   (8.5.1)
where
κ := (1/m)(4 + 3c/(c − 1) + 6/m).
Corollary 8.5.2. ([15], Corollary 5.2) Let Assumption 8.2.1 hold and let 16K0‖x0 − x̂‖ ≤ 1.
Let N be the integer defined by (8.3.2) with c > 13. Then, for all k ∈ {0, 1, · · · , N}, we have
‖x^δ_k − xk‖ ≤ δ/((1 − κ)αk),
where
κ := (31c − 19)/(32(c − 1)).
Lemma 8.5.3. ([15], Lemma 5.3) Let the assumptions of Lemma 8.5.1 hold. Then we have
Theorem 8.5.4. ([15], Theorem 5.4) Let Assumptions 8.2.1 and 8.2.6 hold. If 16K0µ‖x0 −
x̂‖ ≤ 1 and the integer kδ is chosen according to the stopping rule (8.3.1) with c0 > 94/3, then
we have
‖x^δ_{kδ} − x̂‖ ≤ ξ inf{ ‖xk − x̂‖ + δ/αk : k ≥ 0 },   (8.5.2)
where ξ = max{2µρ, c_{1/3}/(1 − κ), c}, ρ := 1 + µ(1 + 3K0‖x0 − x̂‖)(c1 + 1)/(c2(1 − κ)), with c_{1/3} and κ as in Lemma 8.4.8
and Corollary 8.5.2, respectively, and c1, c2 as in Lemma 8.5.3.
where ξ0 := 8µξ/3 with ξ as in Theorem 8.5.4 and ψ : (0, ϕ(a)] → (0, aϕ(a)] is defined as
ψ(λ) := λϕ−1 (λ), λ ∈ (0, ϕ(a)].
Proof. From (8.4.6) and (8.5.2), we get
‖x^δ_{kδ} − x̂‖ ≤ ξ′0 inf{ ‖αk (A + αk I)⁻¹(x0 − x̂)‖ + δ/αk : k = 0, 1, · · · },   (8.6.1)
where ξ′0 = (4µ/3) max{2µ(1 + µ(1 + 3K0‖x0 − x̂‖)(c1 + 1)/(c2(1 − κ))), c_{1/3}/(1 − κ), c}. Now, we choose an integer mδ such
that mδ = max{k : αk ≥ √δ}. Then, we have
‖x^δ_{kδ} − x̂‖ ≤ ξ′0 ( ‖α_{mδ}(A + α_{mδ} I)⁻¹(x0 − x̂)‖ + δ/α_{mδ} ).   (8.6.2)
Note that α_{mδ} ≥ √δ, so δ/α_{mδ} ≤ √δ → 0 as δ → 0. Therefore, by (8.6.2), to show that x^δ_{kδ} → x̂ as
δ → 0, it is enough to prove that ‖α_{mδ}(A + α_{mδ} I)⁻¹(x0 − x̂)‖ → 0 as δ → 0. Observe that,
for w ∈ R(F′(x̂)), i.e., w = F′(x̂)u for some u ∈ D(F), we have ‖α_{mδ}(A + α_{mδ} I)⁻¹w‖ ≤
α_{mδ}‖u‖ → 0 as δ → 0. Now, since R(F′(x̂)) is a dense subset of N(F′(x̂))⊥, it follows that
‖α_{mδ}(A + α_{mδ} I)⁻¹(x0 − x̂)‖ → 0 as δ → 0. Using Assumption 8.2.5, we get that
From (8.6.4), ‖x^δ_{kδ} − x̂‖ ≤ ξ′0 (ϕ(α_{k̂δ}) + δ/α_{k̂δ}). Now, using (8.6.5) and (8.6.6), we get
‖x^δ_{kδ} − x̂‖ ≤ 2ξ′0 δ/α_{k̂δ} ≤ 2ξ′0 µ δ/α_{k̂δ−1} ≤ 2ξ′0 µ ψ⁻¹(δ).
This completes the proof.
Example 8.7.2. Let X = C([0, 1]) (the space of continuous functions defined on [0, 1],
equipped with the max norm) and D(F) = U(0, 1). Define an operator F on D(F) by
F(h)(x) = h(x) − 5 ∫₀¹ x θ h(θ)³ dθ.   (8.7.2)
Then the Fréchet derivative is given by
F′(h)[u](x) = u(x) − 15 ∫₀¹ x θ h(θ)² u(θ) dθ   (8.7.3)
for all u ∈ D(F). Using (8.7.2), (8.7.3) and Assumptions 8.2.1 (2) and 8.2.2 for x̂ = 0, we get
K0 = 7.5 < K = 15.
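The constant K0 = 7.5 can be checked numerically on the convenient (assumed) test family of constant functions h ≡ c: for such h, the kernel of F′(h) − F′(0) is 15 x θ c², whose max-norm operator norm equals 15 ∫₀¹ θ c² dθ = 7.5 c² ≤ 7.5 ‖h − x̂‖ on the unit ball.

```python
import numpy as np

theta = np.linspace(0.0, 1.0, 2001)

def op_norm_diff(c):
    # For h == c, the kernel of F'(h) - F'(0) is 15 x theta c^2; its max-norm
    # operator norm is attained at x = 1 and equals 15 * int_0^1 theta c^2 dtheta.
    w = 15.0 * theta * c**2
    return float(np.sum((w[:-1] + w[1:]) * np.diff(theta)) / 2.0)  # trapezoid rule

for c in (0.25, 0.5, 1.0):
    # center-Lipschitz bound with K0 = 7.5 and ||h - x_hat|| = c (x_hat = 0)
    assert op_norm_diff(c) <= 7.5 * c + 1e-9
```

The bound is attained in the limit c → 1, which is why K0 = 7.5 cannot be improved on the unit ball, while the full Lipschitz constant there is K = 15.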
Next, we provide an example where K/K0 can be arbitrarily large.
Example 8.7.3. Let X = D(F) = ℝ, x̂ = 0 and define the function F on D(F) by
F(x) = d0 x − d1 sin 1 + d1 sin e^{d2 x},   (8.7.4)
where d0, d1 and d2 are given parameters. Note that F(x̂) = F(0) = 0. Then it can easily
be seen that, for d2 sufficiently large and d1 sufficiently small, K/K0 can be arbitrarily large.
We now present two examples where Assumption 8.2.2 is not satisfied, but Assumption
8.2.1 (2) is satisfied.
Example 8.7.4. Let X = D(F) = ℝ, x̂ = 1 and define the function F on D(F) by
F(x) = x^{1+1/i}/(1 + 1/i) + c1 x − c1 − i/(i + 1),   (8.7.5)
where c1 is a real parameter and i > 2 is an integer. Then F′(x) = x^{1/i} + c1 is not Lipschitz
on D(F). Hence, Assumption 8.2.2 is not satisfied. However, the central Lipschitz condition in
Assumption 8.2.1 (2) holds with K0 = 1. We also have that F(x̂) = 0. Indeed, we have
‖F′(x) − F′(x̂)‖ = |x^{1/i} − x̂^{1/i}| = |x − x̂| / (x̂^{(i−1)/i} + · · · + x^{(i−1)/i})
and so
‖F′(x) − F′(x̂)‖ ≤ K0 |x − x̂|.
Example 8.7.5. Consider the nonlinear integral equation
u(s) = f(s) + λ ∫_a^b G(s, t) u(t)^{1+1/n} dt   (8.7.6)
for n ∈ ℕ, where f is a given continuous function satisfying f(s) > 0 for all s ∈ [a, b], λ
is a real number and the kernel G is continuous and positive on [a, b] × [a, b].
For example, when G(s, t) is the Green kernel, the corresponding integral equation is
equivalent to the boundary value problem
u″ = λ u^{1+1/n},  u(a) = f(a),  u(b) = f(b).
Problems of this type have been considered in [1]–[5]. Equations of the form
(8.7.6) generalize equations of the form
u(s) = ∫_a^b G(s, t) u(t)^n dt,   (8.7.7)
which were studied in [1]–[5]. Instead of (8.7.6), we can try to solve the equation F(u) = 0,
where
F : Ω ⊆ C[a, b] → C[a, b],  Ω = {u ∈ C[a, b] : u(s) ≥ 0, s ∈ [a, b]},
and
F(u)(s) = u(s) − f(s) − λ ∫_a^b G(s, t) u(t)^{1+1/n} dt.
The norm we consider is the max-norm. The derivative F′ is given by
F′(u)v(s) = v(s) − λ(1 + 1/n) ∫_a^b G(s, t) u(t)^{1/n} v(t) dt
for all v ∈ Ω. First of all, we notice that F′ does not satisfy a Lipschitz-type condition
on Ω. Let us consider, for instance, [a, b] = [0, 1], G(s, t) = 1 and y(t) = 0. Then we have
F′(y)v(s) = v(s) and
‖F′(x) − F′(y)‖ = |λ|(1 + 1/n) ∫_a^b x(t)^{1/n} dt.
If a Lipschitz-type condition held, then
‖F′(x) − F′(y)‖ ≤ L2 ‖x − y‖
would hold for all x ∈ Ω and some constant L2. But this is not true. Consider, for example,
the functions
x_j(t) = t/j
for all j ≥ 1 and t ∈ [0, 1]. If these are substituted into the above inequality, then we have
1/(j^{1/n}(1 + 1/n)) ≤ L2/j ⟺ j^{1−1/n} ≤ L2(1 + 1/n)
for all j ≥ 1. This inequality is not true when j → ∞. Therefore, Assumption 8.2.2 is not
satisfied in this case. However, Assumption 8.2.1 (2) holds. To show this, suppose that
x̂(t) = f(t) and γ = min_{s∈[a,b]} f(s). Then, for all v ∈ Ω, we have
‖[F′(x) − F′(x̂)]v‖ = |λ|(1 + 1/n) max_{s∈[a,b]} | ∫_a^b G(s, t)(x(t)^{1/n} − f(t)^{1/n}) v(t) dt |
≤ |λ|(1 + 1/n) max_{s∈[a,b]} ∫_a^b G_n(s, t) dt,
where
G_n(s, t) = G(s, t) |x(t) − f(t)| ‖v‖ / (x(t)^{(n−1)/n} + x(t)^{(n−2)/n} f(t)^{1/n} + · · · + f(t)^{(n−1)/n}).
Hence, it follows that
‖[F′(x) − F′(x̂)]v‖ ≤ (|λ|(1 + 1/n)/γ^{(n−1)/n}) max_{s∈[a,b]} ∫_a^b G(s, t) dt ‖x − x̂‖ ‖v‖ ≤ K0 ‖x − x̂‖ ‖v‖,
where K0 = (|λ|(1 + 1/n)/γ^{(n−1)/n}) N and N = max_{s∈[a,b]} ∫_a^b G(s, t) dt. Then Assumption 8.2.1 (2) holds
for sufficiently small λ.
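The failure of Assumption 8.2.2 here can also be checked numerically. Under sample choices made for this sketch (n = 3, λ = 1, G ≡ 1 on [0, 1]), the quotient ‖F′(x_j) − F′(0)‖/‖x_j‖ equals |λ| j^{1−1/n} in closed form, so it is unbounded in j:

```python
# Sample assumptions for illustration: n = 3, lambda = 1, G == 1 and [a, b] = [0, 1].
n, lam = 3, 1.0

def ratio(j):
    # ||F'(x_j) - F'(0)|| = |lam| (1 + 1/n) int_0^1 (t/j)^(1/n) dt = |lam| j^(-1/n),
    # while ||x_j|| = 1/j, so the quotient is |lam| j^(1 - 1/n).
    return lam * j ** (-1.0 / n) / (1.0 / j)

assert ratio(10) < ratio(100) < ratio(1000)  # grows without bound: no Lipschitz constant L2
```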
In the next remarks, we compare our results with the corresponding ones in [15].
Remark 8.7.6. Note that the results in [15] were shown using Assumption 8.2.2, whereas
we used the weaker Assumption 8.2.1 (2) in this chapter. Next, our result Proposition 8.2.3
was shown with 3K0 replacing K; therefore, if 3K0 < K (see Example 8.7.3), then our result
is tighter. Proposition 8.2.4 was shown with K0 replacing K; hence, if K0 < K, our result
is tighter. Theorem 8.3.2 was shown with 6K0 replacing 2K; hence, if 3K0 < K, our result
is tighter. Similar observations in our favor can be made for Lemma 8.4.1, Theorem 8.4.2
and the rest of the results in [15].
Remark 8.7.7. The results obtained here can also be realized for operators F satisfying
an autonomous differential equation of the form
F′(x) = P(F(x)),
where P is a known continuous operator.
[2] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point, Rev. Anal. Numer. Theor. Approx.
36 (2007), 123–138.
[3] Argyros, I.K., A semilocal convergence for directional Newton methods, Math. Com-
put. (AMS) 80 (2011), 327–343.
[4] Argyros, I.K., and Hilout, S., Weaker conditions for the convergence of Newton’s
method, J. Complexity 28 (2012), 364–387.
[5] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press, Taylor and Francis, New York, 2012.
[7] Bakushinskii, A., Smirnova, A., A posteriori stopping rule for regularized fixed point
iterations, Nonlinear Anal. 64 (2006), 1255–1261.
[8] Bakushinskii, A., Smirnova, A., Iterative regularization and generalized discrepancy
principle for monotone operator equations, Numer. Funct. Anal. Optim. 28 (2007), 13–
25.
[9] Binder, A., Engl, H.W., Groetsch, C.W., Neubauer, A., Scherzer, O., Weakly closed
nonlinear operators and parameter identification in parabolic equations by Tikhonov
regularization, Appl. Anal. 55(1994), 215–235.
[10] Engl, H.W., Hanke, M., Neubauer, A., Regularization of Inverse Problems, Kluwer,
Dordrecht, 1996.
[11] Engl, H.W., Kunisch, K., Neubauer, A., Convergence rates for Tikhonov regulariza-
tion of nonlinear ill-posed problems, Inverse Problems 5(1989), 523–540.
[12] Jin, Q., On the iteratively regularized Gauss-Newton method for solving nonlinear
ill-posed problems, Math. Comp. 69(2000), 1603–1623.
[13] Jin, Q., Hou, Z.Y., On the choice of the regularization parameter for ordinary and
iterated Tikhonov regularization of nonlinear ill-posed problems, Inverse Problems
13(1997), 815–827.
[14] Jin, Q., Hou, Z.Y., On an a posteriori parameter choice strategy for Tikhonov regular-
ization of nonlinear ill-posed problems, Numer. Math. 83 (1999), 139–159.
[15] Mahale, P., Nair, M.T., Iterated Lavrentiev regularization for nonlinear ill-posed prob-
lems, ANZIAM J. 51 (2009), 191–217.
[16] Mahale, P., Nair, M.T., General source conditions for nonlinear ill-posed problems,
Numer. Funct. Anal. Optim. 28 (2007), 111–126.
[17] Scherzer, O., Engl, H.W., Kunisch, K., Optimal a posteriori parameter choice for
Tikhonov regularization for solving nonlinear ill-posed problems, SIAM J. Numer.
Anal. 30(1993), 1796–1838.
[18] Semenova, E.V., Lavrentiev regularization and balancing principle for solving ill-
posed problems with monotone operators, Comput. Methods Appl. Math. 10(4) (2010),
444–454.
[20] Tautenhahn, U., On the method of Lavrentiev regularization for nonlinear ill-posed
problems, Inverse Problems 18(2002), 191–207.
[21] Tautenhahn, U., Jin, Q., Tikhonov regularization and a posteriori rule for solving non-
linear ill-posed problems, Inverse Problems 19(2003), 1–21.
Chapter 9

Efficient Secant-Like Methods

9.1. Introduction
Let U(x, r) and U̅(x, r) stand, respectively, for the open and closed ball in X with center
x ∈ X and radius r > 0. Denote by L(X, Y) the space of bounded linear operators from X
into Y.
In this chapter we are concerned with the problem of approximating a locally unique
solution x∗ of nonlinear equation
F(x) = 0, (9.1.1)
where F is a Fréchet-differentiable operator defined on a non-empty convex subset D of a
Banach space X with values in a Banach space Y .
Many problems from computational sciences, physics and other disciplines can be brought
in the form of equation (9.1.1) using mathematical modelling [5, 6, 8, 9, 12, 22, 25]. The
solutions of these equations can rarely be found in closed form. That is why the solution
methods for these equations are iterative. In particular, the practice of numerical analysis
for finding such solutions is essentially connected to variants of Newton’s method [5, 6, 9,
12, 19, 21, 22, 24, 25]. The study of the convergence of iterative procedures is usually
focused on two types: semilocal and local convergence analysis. The semilocal convergence
analysis is based on the information around an initial point and gives criteria ensuring the
convergence of the iterative procedure, while the local one is based on the information
around a solution and provides estimates of the radii of convergence balls. There are many
studies on the weakening and/or extension of the hypotheses made on the underlying operators;
see, for example, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
and the references therein.
Ezquerro and Rubio used in [17] the uniparametric family of secant-like methods defined by

x−1, x0 given in D,
yn = µxn + (1 − µ)xn−1, µ ∈ [0, 1],   (9.1.2)
xn+1 = xn − Bn⁻¹F(xn), Bn = [yn, xn; F], for each n = 0, 1, . . .
and the method of recurrent relations to generate a sequence {xn} approximating x∗. Here,
[z, w; F] for each z, w ∈ D is a divided difference of order one, which is a bounded linear
operator satisfying [4, 5, 7, 9, 19, 22, 25]
[z, w; F](z − w) = F(z) − F(w).   (9.1.3)
The secant-like method (9.1.2) can be considered as a combination of the secant and New-
ton's methods. Indeed, if µ = 0 we obtain the secant method and if µ = 1 we get Newton's
method, provided that F is Fréchet-differentiable on D, since then yn = xn and
[yn, xn; F] = F′(xn).
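A minimal scalar sketch of the family (9.1.2) (our illustration; in one dimension the divided difference is [z, w; F] = (F(z) − F(w))/(z − w)): µ = 0 reproduces the secant method, while values of µ close to 1 approach Newton's method (µ = 1 itself requires F′, since then yn = xn):

```python
def secant_like(F, x_prev, x0, mu=0.0, steps=30, tol=1e-12):
    """Family (9.1.2): y_n = mu*x_n + (1-mu)*x_{n-1},
    x_{n+1} = x_n - F(x_n)/[y_n, x_n; F], scalar divided difference."""
    xm, x = x_prev, x0
    for _ in range(steps):
        y = mu * x + (1 - mu) * xm
        B = (F(y) - F(x)) / (y - x)     # divided difference [y_n, x_n; F]
        xm, x = x, x - F(x) / B
        if abs(F(x)) < tol:             # stop before y_n and x_n coincide
            break
    return x

F = lambda t: t * t - 2.0               # solution sqrt(2)
r_secant = secant_like(F, 1.0, 2.0, mu=0.0)
r_mixed = secant_like(F, 1.0, 2.0, mu=0.5)
```

Both runs converge to √2; the intermediate values of µ interpolate between the two classical methods.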
It was shown in [15, 16] that the R-order of convergence is at least (1 + √5)/2 for λ ∈ [0, 1),
the same as that of the secant method. Later, in [12] another uniparametric family of secant-
like methods was defined by

x−1, x0 given in D,
yn = λxn + (1 − λ)xn−1, λ ≥ 1,   (9.1.4)
xn+1 = xn − An⁻¹F(xn), An = [yn, xn−1; F] for each n = 0, 1, . . .
(A2 ) Tighter estimates on the distances kxn+1 − xn k and kxn − x∗ k for each n = 0, 1, . . .,
(A4) The results are presented in affine invariant form, whereas the ones in [12] are given
in non-affine invariant form. The advantages of affine versus non-affine invariant
results have been explained in [4, 5, 7, 9, 19, 22, 25].
Our hypotheses for the semilocal convergence of secant-like method (9.1.4) are:
(C1 ) There exists a divided difference of order one [z, w; F] ∈ L (X , Y ) satisfying (9.1.3),
kx0 − x−1k ≤ c,
kA0⁻¹([x, y; F] − [v, w; F])k ≤ K(kx − vk + ky − wk) for each x, y, v, w ∈ D.
We shall denote by (C) conditions (C1)–(C4). In view of (C4) there exist H0, H1, H > 0
such that
(C5) kA0⁻¹([x1, x0; F] − A0)k ≤ H0(kx1 − y0k + kx0 − x−1k),
(C6) kA0⁻¹(A1 − A0)k ≤ H1(ky1 − y0k + kx0 − x−1k) and
(C7) kA0⁻¹([x, y; F] − A0)k ≤ H(kx − y0k + ky − x−1k) for each x, y ∈ D.
Clearly,
H0 ≤ H1 ≤ H ≤ K   (9.1.5)
hold in general, and K/H, H/H1 can be arbitrarily large [5, 6, 9]. Note that (C5), (C6), (C7) are
not hypotheses additional to (C4). In practice, the computation of K requires the computation
of H0, H1 and H. It also follows from (C4) that F is differentiable [5, 6, 19, 21].
The chapter is organized as follows. In Section 9.2 we show that, under the same hy-
potheses as in [18] and using recurrent relations, we obtain at least as precise information
on the location of the solution. Section 9.3 contains the semilocal convergence analysis us-
ing weaker hypotheses and recurrent functions. We also show the advantages (A). The
results are also extended to cover the case of equations with nondifferentiable operators.
Numerical examples are presented in the concluding Section 9.4.
a−1 = η/(c + η),  b−1 = Kc²/(c + η),
U(x0, R) ⊆ D
and
kxn − x∗k ≤ [(f(a0)a0)ⁿ/(1 − f(a0)a0)] kx1 − x0k.
Furthermore, the solution x∗ is unique in D0 = U(x0, σ0) ∩ D, where σ0 = 1/H − λc − R,
provided that
R < (1/2)(1/H − λc) = R1.
Proof. The proof, with the exception of the uniqueness part, is given in Theorem 3 of [12] if
we use A0⁻¹F instead of F and set b = 1, where kA0⁻¹k ≤ b. For the uniqueness part, let
y∗ ∈ D0 be a solution of equation (9.1.1) and set L = [y∗, x∗; F]. Then, using (C7), we get
kA0⁻¹(L − A0)k ≤ H(ky∗ − y0k + kx∗ − x−1k) < H(σ0 + λc + R) = 1.   (9.2.1)
It follows from (9.2.1) and the Banach lemma on invertible operators [4, 5, 6, 9, 19, 22, 25]
that L⁻¹ ∈ L(Y, X). Using the identity 0 = F(y∗) − F(x∗) = L(y∗ − x∗) we deduce that
x∗ = y∗. That completes the proof of the Theorem.
Remark 9.2.2. Notice that
σ < σ0   (9.2.2)
and
R0 < R1,   (9.2.3)
where
σ = 1/K − λc − R
and
R0 = (1/2)(1/K − λc)
were given in [12] (for b = 1). Hence, (9.2.2) and (9.2.3) justify our claim for this section
made in the Introduction of this chapter.
Lemma 9.3.1. Let c ≥ 0, η > 0, H > 0, K > 0 and λ ≥ 1. Set t−1 = 0, t0 = c and t1 = c + η.
Define scalar sequences {qn}, {tn}, {αn} for each n = 0, 1, . . . by
qn = Hλ(tn+1 + tn − c),   (9.3.1)
tn+2 = tn+1 + [K(tn+1 − tn + λ(tn − tn−1)) / (1 − qn)] (tn+1 − tn),
αn = K(tn+1 − tn + λ(tn − tn−1)) / (1 − qn),   (9.3.2)
functions fn on [0, 1) for each n = 1, 2, . . . by
fn(t) = K(tⁿ + λtⁿ⁻¹)η + Hλ((1 + t + · · · + tⁿ⁺¹)η + (1 + t + · · · + tⁿ)η + c) − 1   (9.3.3)
and polynomial p on [0, 1) by
p(t) = Hλt³ + (Hλ + K)t² + K(λ − 1)t − λK.   (9.3.4)
Denote by α the only root of polynomial p in (0, 1). Suppose that
0 ≤ α0 ≤ α ≤ (1 − Hλ(c + 2η)) / (1 − Hλc).   (9.3.5)
Then, sequence {tn} is non-decreasing, bounded from above by t∗∗ defined by
t∗∗ = η/(1 − α) + c   (9.3.6)
and converges to its unique least upper bound t∗ which satisfies
c + η ≤ t∗ ≤ t∗∗.   (9.3.7)
Moreover, the following estimates are satisfied for each n = 0, 1, 2, . . .
0 ≤ tn+1 − tn ≤ αⁿη   (9.3.8)
and
t∗ − tn ≤ αⁿη/(1 − α).   (9.3.9)
Proof. We shall first show that polynomial p has a root in (0, 1). Indeed, we have p(0) =
−λK < 0 and p(1) = 2Hλ > 0. Using the intermediate value theorem, we deduce that there
exists at least one root of p in (0, 1). Moreover, p′(t) > 0 for t > 0. Hence p crosses the positive
axis only once. Denote by α the only root of p in (0, 1). It follows from (9.3.1) and (9.3.2) that
estimate (9.3.8) is certainly satisfied if
0 ≤ αn ≤ α.   (9.3.10)
We have
t2 − t1 ≤ α(t1 − t0) ⇒ t2 ≤ t1 + α(t1 − t0) ⇒ t2 ≤ η + t0 + αη = c + (1 + α)η = c + [(1 − α²)/(1 − α)]η < t∗∗.
Suppose that
tk+1 − tk ≤ αᵏη and tk+1 ≤ c + [(1 − αᵏ⁺¹)/(1 − α)]η for each k ≤ n.   (9.3.11)
Estimate (9.3.10) shall be true for k + 1 replacing n if
0 ≤ αk+1 ≤ α (9.3.12)
or
fk (α) ≤ 0, (9.3.13)
where fk is defined by (9.3.3). We need a relationship between two consecutive recurrent
functions fk for each k = 1, 2, . . .. Using (9.3.3) and (9.3.4), it suffices to have
f∞(α) ≤ 0,   (9.3.17)
which is true by (9.3.5). The induction for (9.3.8) is complete. That is, sequence {tn} is non-
decreasing, bounded from above by t∗∗ given by (9.3.6) and as such it converges to some t∗
which satisfies (9.3.7). Estimate (9.3.9) follows from (9.3.8) by using standard majorization
techniques [4, 5, 6, 9, 19, 22, 25]. The proof of Lemma 9.3.1 is complete.
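The recurrence (9.3.1)-(9.3.2) is straightforward to tabulate. The sketch below is our illustration, with sample values c = 0, η = 0.1, K = 1, H = 1, λ = 1 chosen so that (9.3.5) holds, and the closed form (9.3.34) for α in the case λ = 1; it checks monotonicity and the bound t∗∗ = η/(1 − α) + c numerically:

```python
def majorizing_t(c, eta, K, H, lam, N=60):
    """Tabulate t_n from (9.3.1)-(9.3.2): t_{-1}=0, t_0=c, t_1=c+eta."""
    t = [0.0, c, c + eta]                 # t_{-1}, t_0, t_1
    for _ in range(N):
        q = H * lam * (t[-1] + t[-2] - c)               # q_n, (9.3.1)
        step = K * (t[-1] - t[-2] + lam * (t[-2] - t[-3])) / (1 - q)
        t.append(t[-1] + step * (t[-1] - t[-2]))        # t_{n+2}
    return t

c, eta, K, H, lam = 0.0, 0.1, 1.0, 1.0, 1.0
t = majorizing_t(c, eta, K, H, lam)
alpha = 2 * K / (K + (K * K + 4 * H * K) ** 0.5)  # root of p for lam = 1
t_star_star = eta / (1 - alpha) + c
```

For these data the sequence increases monotonically and settles well below t∗∗ ≈ 0.2618.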
Lemma 9.3.2. Let c ≥ 0, η > 0, H0 > 0, H1 > 0, H > 0, K > 0 and λ ≥ 1. Set s−1 = 0,
s0 = c, s1 = c + η. Define scalar sequences {sn}, {bn} for each n = 1, 2, . . . by
s2 = s1 + [H0(s1 − s0 + λ(s0 − s−1)) / (1 − H1λ(s1 + s0 − c))] (s1 − s0),
sn+2 = sn+1 + [K(sn+1 − sn + λ(sn − sn−1)) / (1 − Hλ(sn+1 + sn − c))] (sn+1 − sn),   (9.3.18)
b1 = H0(s1 − s0 + λ(s0 − s−1)) / (1 − H1λ(s1 + s0 − c)),
bn = K(sn+1 − sn + λ(sn − sn−1)) / (1 − Hλ(sn+1 + sn − c)).   (9.3.19)
Suppose that
0 ≤ b1 ≤ α ≤ (1 − Hλ(2s2 − c)) / (1 − Hλ(2s1 − c)),   (9.3.21)
where α is defined in Lemma 9.3.1. Then, sequence {sn} is non-decreasing, bounded from
above by s∗∗ defined by
s∗∗ = c + η + (s2 − s1)/(1 − α)   (9.3.22)
and converges to its unique least upper bound s∗ which satisfies
c + η ≤ s∗ ≤ s∗∗.   (9.3.23)
0 ≤ bn ≤ α.   (9.3.25)
0 ≤ s3 − s2 ≤ α(s2 − s1) ⇒ s3 ≤ s2 + α(s2 − s1) ⇒ s3 ≤ s2 + (1 + α)(s2 − s1) − (s2 − s1)   (9.3.26)
⇒ s3 ≤ s1 + [(1 − α²)/(1 − α)](s2 − s1) ≤ s∗∗.
Suppose (9.3.25) holds for each n ≤ k. Then, using (9.3.18), we get that
sk+2 ≤ s1 + [(1 − αᵏ⁺¹)/(1 − α)](s2 − s1).   (9.3.28)
Estimate (9.3.25) shall be satisfied if
gk(α) ≤ 0.   (9.3.29)
Using (9.3.20) we get the following relationship between two consecutive recurrent func-
tions gk:
gk+1(α) = gk(α) + p(α)αᵏ⁻¹(s2 − s1) = gk(α),   (9.3.30)
since p(α) = 0.
Define function g∞ on [0, 1) by
Remark 9.3.3. (a) Let us consider an interesting choice for λ. Let λ = 1 (secant
method). Then, using (9.3.4) and (9.3.5) we have that
α = 2K / (K + √(K² + 4HK))   (9.3.34)
and
K(c + η)/(1 − H(c + η)) ≤ α ≤ (1 − H(c + 2η))/(1 − Hc).   (9.3.35)
The corresponding condition for the secant method is given by [6, 9, 18, 21]:
Kc + 2√(Kη) ≤ 1.   (9.3.36)
Condition (9.3.35) can be weaker than (9.3.36) (see also the numerical examples
at the end of the chapter). Moreover, the majorizing sequence {un } for the secant
method related to (9.3.36) is given by
u−1 = 0, u0 = c, u1 = c + η,
un+2 = un+1 + [K(un+1 − un−1) / (1 − K(un+1 + un − c))] (un+1 − un).   (9.3.37)
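When H < K, the new sequence {tn} is termwise tighter than the classical secant majorizing sequence {un} of (9.3.37). A quick numerical comparison (our sketch; the values c = 0, η = 0.1, K = 1, H = 0.5 are arbitrary sample data):

```python
def old_u(c, eta, K, N=60):
    """Classical secant majorizing sequence (9.3.37); K in the denominator."""
    u = [0.0, c, c + eta]
    for _ in range(N):
        u.append(u[-1] + K * (u[-1] - u[-3])
                 / (1 - K * (u[-1] + u[-2] - c)) * (u[-1] - u[-2]))
    return u

def new_t(c, eta, K, H, N=60):
    """Sequence (9.3.1)-(9.3.2) with lam = 1; only H enters the denominator."""
    t = [0.0, c, c + eta]
    for _ in range(N):
        t.append(t[-1] + K * (t[-1] - t[-3])
                 / (1 - H * (t[-1] + t[-2] - c)) * (t[-1] - t[-2]))
    return t

c, eta, K, H = 0.0, 0.1, 1.0, 0.5
u, t = old_u(c, eta, K), new_t(c, eta, K, H)
```

Every tn stays below the corresponding un, so the limit t∗ (and hence the error bounds) is tighter.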
(c) Clearly, iteration {sn} is tighter than {tn}, and as in (9.3.40) we have strict inequality
for H0 < K or H1 < H.
Next, we present obvious and useful extensions of Lemma 9.3.1 and Lemma 9.3.2,
respectively.
Lemma 9.3.4. Let N = 1, 2, . . . be fixed. Suppose that
t1 ≤ t2 ≤ · · · ≤ tN ≤ tN+1,   (9.3.42)
1/(Hλ) > tN+1 − tN + λ(tN − tN−1)   (9.3.43)
and
0 ≤ αN ≤ α ≤ (1 − Hλ(tN − tN−1 + 2(tN+1 − tN))) / (1 − Hλ(tN − tN−1)).   (9.3.44)
Then, sequence {tn} generated by (9.3.2) is nondecreasing, bounded from above by t∗∗ and
converges to t∗ which satisfies t∗ ∈ [tN+1, t∗∗]. Moreover, the following estimates are satisfied
for each n = 0, 1, . . .
0 ≤ tN+n+1 − tN+n ≤ αⁿ(tN+1 − tN)   (9.3.45)
and
t∗ − tN+n ≤ [αⁿ/(1 − α)](tN+1 − tN).   (9.3.46)
Lemma 9.3.5. Let N = 1, 2, . . . be fixed. Suppose that
s1 ≤ s2 ≤ · · · ≤ sN ≤ sN+1,   (9.3.47)
1/(Hλ) > sN+1 − sN + λ(sN − sN−1)   (9.3.48)
and
0 ≤ bN ≤ α ≤ (1 − Hλ(2sN+1 − sN−1)) / (1 − Hλ(2sN − sN−1)).   (9.3.49)
Then, sequence {sn} generated by (9.3.18) is nondecreasing, bounded from above by s∗∗
and converges to s∗ which satisfies s∗ ∈ [sN+1, s∗∗]. Moreover, the following estimates are
satisfied for each n = 0, 1, . . .
0 ≤ sN+n+1 − sN+n ≤ αⁿ(sN+1 − sN)   (9.3.50)
and
s∗ − sN+n ≤ [αⁿ/(1 − α)](sN+1 − sN).   (9.3.51)
Next, we present the following semilocal convergence result for the secant-like method
under the (C) conditions.
Theorem 9.3.6. Suppose that the (C) conditions, the conditions of Lemma 9.3.1 (or
Lemma 9.3.4) and U := U(x0, t∗) ⊆ D hold. Then, sequence {xn} generated by the secant-like
method is well defined, remains in U for each n = −1, 0, 1, 2, . . . and converges to a solution
x∗ ∈ U(x0, t∗ − c) of equation F(x) = 0. Moreover, the following estimates are satisfied for
each n = 0, 1, . . .
kxn+1 − xnk ≤ tn+1 − tn   (9.3.53)
and
kxn − x∗k ≤ t∗ − tn.   (9.3.54)
Furthermore, if there exists T ≥ t∗ − c such that
U(x0, T) ⊆ D   (9.3.55)
and
H(T + t∗ + (λ − 1)c) < 1,   (9.3.56)
then the solution x∗ is unique in U(x0, T).
Proof. We shall show, using mathematical induction, that
kxk+1 − xkk ≤ tk+1 − tk   (9.3.57)
and
U(xk+1, t∗ − tk+1) ⊆ U(xk, t∗ − tk)   (9.3.58)
for each k = −1, 0, 1, . . .. Let w ∈ U(x1, t∗ − t1). Then we obtain that
kw − x0k ≤ kw − x1k + kx1 − x0k ≤ t∗ − t1 + t1 − t0 = t∗ − t0,
hence, w ∈ U(x0, t∗ − t0). Note that kx−1 − x0k ≤ c = t0 − t−1 and kx1 − x0k = kA0⁻¹F(x0)k ≤
η = t1 − t0 < t∗. That is, x1 ∈ U(x0, t∗) ⊆ D. Hence, estimates (9.3.57) and (9.3.58) hold for
k = −1 and k = 0. Suppose that (9.3.57) and (9.3.58) hold for all n ≤ k. Then, we obtain
that
kxk+1 − x0k ≤ ∑_{i=1}^{k+1} kxi − xi−1k ≤ ∑_{i=1}^{k+1} (ti − ti−1) = tk+1 − t0 ≤ t∗ − c ≤ t∗
and
kA0⁻¹(Ak+1 − A0)k ≤ H(kyk+1 − y0k + kxk − x−1k)
≤ H(λkxk+1 − x0k + |1 − λ|kxk − x−1k + kxk − x−1k)
≤ Hλ(kxk+1 − x0k + kxk − x0k + kx0 − x−1k)   (9.3.59)
≤ Hλ(tk+1 − t0 + tk − t0 + c)
= Hλ(tk+1 + tk − c) < 1.
It follows from (9.3.59) and the Banach lemma on invertible operators [4, 5, 6, 9, 19, 22, 25]
that Ak+1⁻¹ exists and
kAk+1⁻¹A0k ≤ (1 − Hλ(tk+1 + tk − c))⁻¹.   (9.3.60)
In view of (9.1.4), we obtain the identity
Using (9.1.4), (9.3.16) and the induction hypotheses we get in turn that
kA0⁻¹F(xk+1)k ≤ K(kxk+1 − ykk + kxk − xk−1k)kxk+1 − xkk
≤ K(kxk+1 − xkk + λkxk − xk−1k)kxk+1 − xkk   (9.3.62)
≤ K(tk+1 − tk + λ(tk − tk−1))(tk+1 − tk),
which completes the induction for (9.3.57). Moreover, let v ∈ U(xk+2, t∗ − tk+2). Then, we
get that
kv − xk+1k ≤ kv − xk+2k + kxk+2 − xk+1k ≤ t∗ − tk+2 + tk+2 − tk+1 = t∗ − tk+1,
which completes the induction for (9.3.58). For the uniqueness part, let y∗ ∈ U(x0, T) be a
solution of equation F(x) = 0. Then, using (C7), we get in turn that
kA0⁻¹([y∗, x∗; F] − A0)k ≤ H(ky∗ − y0k + kx∗ − x−1k)
≤ H(ky∗ − x0k + (λ − 1)kx0 − x−1k + kx∗ − x0k + kx0 − x−1k)   (9.3.64)
≤ H(T + t∗ + (λ − 1)c) < 1.
It follows from (9.3.64) and the Banach lemma on invertible operators that [y∗, x∗; F]⁻¹
exists. Then, using the identity 0 = F(y∗) − F(x∗) = [y∗, x∗; F](y∗ − x∗), we deduce that
x∗ = y∗. The proof of Theorem 9.3.6 is complete.
Remark 9.3.7. (a) The limit point t ∗ can be replaced in Theorem 9.3.6 by t ∗∗ given in
closed form by (9.3.6).
(b) It follows from the proof of Theorem 9.3.6 that {sn } is also a majorizing sequence for
{xn }. Hence, Lemma 9.3.2 (or Lemma 9.3.5), {sn }, s∗ can replace Lemma 9.3.1 (or
Lemma 9.3.4) {tn }, t ∗ in Theorem 9.3.6.
Hence we arrive at:
Theorem 9.3.8. Suppose that the (C) conditions, the conditions of Lemma 9.3.2 (or
Lemma 9.3.5) and U := U(x0, s∗) ⊆ D hold. Then, sequence {xn} generated by the secant-like
method is well defined, remains in U for each n = −1, 0, 1, 2, . . . and converges to a solution
x∗ ∈ U(x0, s∗ − c) of equation F(x) = 0. Moreover, the following estimates are satisfied for
each n = 0, 1, . . .
kxn+1 − xn k ≤ sn+1 − sn
and
kxn − x∗ k ≤ s∗ − sn .
Furthermore, if there exists T ≥ s∗ − c such that
U(x0, T) ⊆ D
and
H(T + s∗ + (λ − 1)c) < 1,
then, the solution x∗ is unique in U(x0 , T ).
For equations of the form F(x) + G(x) = 0, where G : D → Y is a continuous operator,
we use the secant-like method
xn+1 = xn − An⁻¹(F(xn) + G(xn)) for each n = 0, 1, 2, . . . ,   (9.3.66)
together with the conditions
(C8) kA0⁻¹(G(x) − G(y))k ≤ Mkx − yk for each x, y ∈ D   (9.3.67)
and
(C9) kA0⁻¹(G(x1) − G(x0))k ≤ M0kx1 − x0k.   (9.3.68)
Clearly,
M0 ≤ M   (9.3.69)
holds and M/M0 can be arbitrarily large [4, 5, 6, 8, 9].
We shall denote by (C∗ ) the conditions (C) and (C8 ), (C9 ). Then, we can present the
corresponding result along the same lines as in Lemma 9.3.1, Lemma 9.3.2, Lemma 9.3.4,
Lemma 9.3.5, Theorem 9.3.6 and Theorem 9.3.8. However, we shall only present the results
corresponding to Lemma 9.3.2 and Theorem 9.3.8, respectively. The remaining combinations
of results can be given in an analogous way.
Lemma 9.3.9. Let c ≥ 0, η > 0, H0 > 0, H1 > 0, H > 0, M0 > 0, M > 0, K > 0 and λ ≥ 1.
Set γ−1 = 0, γ0 = c, γ1 = c + η. Define scalar sequences {γn}, {δn} by
γ2 = γ1 + [(H0(γ1 − γ0 + λ(γ0 − γ−1)) + M0) / (1 − H1λ(γ1 + γ0 − c))] (γ1 − γ0),
γn+2 = γn+1 + [(K(γn+1 − γn + λ(γn − γn−1)) + M) / (1 − Hλ(γn+1 + γn − c))] (γn+1 − γn),
δ1 = (H0(γ1 − γ0 + λ(γ0 − γ−1)) + M0) / (1 − H1λ(γ1 + γ0 − c)),
δn = (K(γn+1 − γn + λ(γn − γn−1)) + M) / (1 − Hλ(γn+1 + γn − c)).
Suppose that
0 ≤ δ1 ≤ α ≤ a,
where α was defined in Lemma 9.3.1. Then, sequence {γn } is non-decreasing, bounded
from above by γ∗∗ defined by
γ∗∗ = c + η + (γ2 − γ1)/(1 − α)
and converges to its unique least upper bound γ∗ which satisfies
c + η ≤ γ∗ ≤ γ∗∗.
Proof. Simply use {γn}, {δn}, {hn}, ϕ, a instead of {sn}, {bn}, {gn}, p, α in the proof of
Lemma 9.3.2.
Suppose that the (C∗) conditions, the conditions of Lemma 9.3.9 and
U ⊆ D
hold, where U was defined in Theorem 9.3.6 and kA0⁻¹(F(x0) + G(x0))k ≤ η. Then, se-
quence {xn} generated by the secant-like method (9.3.66) is well defined, remains in U
for each n = −1, 0, 1, 2, . . . and converges to a solution x∗ ∈ U(x0, γ∗ − c) of equation
F(x) + G(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, . . .
kxn+1 − xn k ≤ γn+1 − γn
and
kxn − x∗ k ≤ γ∗ − γn .
Furthermore, if there exist γ ≥ γ∗ − c and µ ∈ (0, 1) such that
U(x0, γ) ⊆ D
and
0 < (K((λ − 1)c + γ) + M) / (1 − Hλ(2γ − c)) ≤ µ,
then the solution x∗ is unique in U(x0, γ).
Proof. The proof, up to the uniqueness part, follows as in Theorem 9.3.6, but using the
corresponding identity instead of (9.3.61). Finally, for the uniqueness part, let y∗ ∈ U(x0, γ)
be such that F(y∗) + G(y∗) = 0. Then, we get from (9.3.66) the identity
xn+1 − y∗ = xn − An⁻¹(F(xn) + G(xn)) − y∗
= −An⁻¹(F(xn) − F(y∗) − An(xn − y∗) + (G(xn) − G(y∗)))
= −An⁻¹(([xn, y∗; F] − [yn, xn−1; F])(xn − y∗) + (G(xn) − G(y∗))).
Hence, we obtain in turn that
kxn+1 − y∗k ≤ [(K(kxn − ynk + kxn−1 − y∗k) + M) / (1 − Hλ(γn+1 + γn − c))] kxn − y∗k
≤ [(K((λ − 1)kxn − xn−1k + kxn−1 − y∗k) + M) / (1 − Hλ(2γ − c))] kxn − y∗k
≤ [(K((λ − 1)c + γ) + M) / (1 − Hλ(2γ − c))] kxn − y∗k ≤ µkxn − y∗k
≤ · · · ≤ µⁿ⁺¹kx0 − y∗k ≤ µⁿ⁺¹γ.
9.4. Numerical Examples

It is well known that this problem can be formulated as the integral equation
u(s) = s + ∫₀¹ Q(s,t)(u³(t) + γu²(t)) dt.   (9.4.1)
We observe that
max_{0≤s≤1} ∫₀¹ |Q(s,t)| dt = 1/8.
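The value 1/8 is consistent with Q being the Green's function of u″ = 0 with u(0) = u(1) = 0, that is Q(s,t) = t(1 − s) for t ≤ s and s(1 − t) for t > s; this concrete kernel is an assumption on our part (the kernel is introduced earlier in the chapter). Then ∫₀¹ Q(s,t) dt = s(1 − s)/2, maximized at s = 1/2:

```python
def Q(s, t):
    """Assumed Green's function kernel on [0,1] x [0,1]."""
    return t * (1 - s) if t <= s else s * (1 - t)

def integral(s, m=20000):
    """Midpoint rule for the integral of |Q(s, .)| over [0, 1]."""
    h = 1.0 / m
    return sum(abs(Q(s, (k + 0.5) * h)) for k in range(m)) * h

vals = [integral(s) for s in (0.25, 0.5, 0.75)]
peak = integral(0.5)   # close to 1/8 = 0.125
```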
Then problem (9.4.1) is in the form (9.1.1), where F : D −→ Y is defined as
[F(x)](s) = x(s) − s − ∫₀¹ Q(s,t)(x³(t) + γx²(t)) dt.
Moreover, we have that
kF′(x0)⁻¹k ≤ 1/(5 − 2γ).
Choose x−1(s) such that kx−1 − x0k ≤ c and l0c < 1. Then, for λ = 1, we have
kδF(x−1, x0)⁻¹F′(x0)k ≤ 1/(1 − l0c),
where l0 is such that
kF′(x0)⁻¹(F′(x0) − A0)k ≤ l0c.
Set u0(s) = s and D = U(u0, R0). It is easy to verify that U(u0, R0) ⊂ U(0, R0 + 1) since
ku0k = 1. If 2γ < 5 and l0c < 1, the operator F′ satisfies the conditions of Theorem 9.2.6 with
η = (1 + γ)/((1 − l0c)(5 − 2γ)),  K = (γ + 6R0 + 3)/(8(5 − 2γ)(1 − l0c)),  H = (2γ + 3R0 + 6)/(16(5 − 2γ)(1 − l0c)).
l0 = 0.1938137822 . . . , η = 0.465153 . . . , K = 0.368246 . . . and H = 0.193814 . . . .
Moreover, we obtain that a−1 = 0.317477 and b−1 = 0.251336, but the conditions of Theo-
rem 2.1 are not satisfied, since
b−1 = 0.251336 > 0.147893 = a−1(1 − a−1)² / (2(1 − a−1) − λ(1 − 2a−1)).
Notice also that the popular condition (9.3.36) is also not satisfied, since Kc + 2√(Kη) =
1.19599 > 1. Hence, there is no guarantee under the old conditions that the secant-type
method converges to x∗. However, the conditions of Lemma 9.3.1 are satisfied, since
0 < α = 0.724067 ≤ 0.776347 = (1 − Hλ(c + 2η)) / (1 − Hλc).
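These reported values can be reproduced from (9.3.34)-(9.3.36). The sketch below is our check; we take c = 1, which is consistent with a−1 = η/(c + η) = 0.317477 and b−1 = Kc²/(c + η) = 0.251336 reported above:

```python
c, eta, K, H, lam = 1.0, 0.465153, 0.368246, 0.193814, 1.0

alpha = 2 * K / (K + (K * K + 4 * H * K) ** 0.5)            # (9.3.34)
upper = (1 - H * lam * (c + 2 * eta)) / (1 - H * lam * c)   # right side of (9.3.5)
old = K * c + 2 * (K * eta) ** 0.5                          # left side of (9.3.36)
```

Evaluating gives α ≈ 0.724, the upper bound ≈ 0.776 and Kc + 2√(Kη) ≈ 1.196 > 1, in agreement with the text.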
The convergence of the secant-type method is also ensured by Theorem 9.3.6.
Next, consider the function
F(x) = x³ − 2
and we are going to apply the secant-type method with λ = 2.5. We take the starting points
x0 = 1, x−1 = 0.25 and we consider the domain Ω = B(x0, 3/4). In this case, we obtain
c = 0.75, η = 0.120301 . . . , K = 0.442105 . . . , H = 0.180451 . . . .
Notice that the conditions of Theorem 9.2.1 and Lemma 9.3.1 are satisfied, and since H < K,
Remark 9.2.2 ensures that our uniqueness ball is larger. Indeed, R1 = 1.83333 . . . >
0.193452 . . . = R0.
References
[1] Amat, S., Bermúdez, C., Busquier, S., Gretay, J. O., Convergence by nondiscrete
mathematical induction of a two step secant’s method, Rocky Mountain Journal of
Mathematics, 37(2) (2007), 359–369.
[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, International Journal of Computer Mathematics, 81(8) (2004), 1153–1161.
[3] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numerical Functional Analysis and Optimization, 25(5) (2004), 397–405.
[4] Argyros, I. K., Polynomial operator equations in abstract spaces and applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, Boca Raton, Florida, U.S.A 1998.
[6] Argyros, I. K., Cho, Y. J., Hilout, S., Numerical methods for equations and its appli-
cations, CRC Press/Taylor & Francis, New York, 2012.
[7] Argyros, I. K., Hilout, S., Convergence conditions for secant-type methods,
Czechoslovak Mathematical Journal, 60 (2010), 11–18.
[8] Argyros, I. K., Hilout, S., Weaker conditions for the convergence of Newton's method,
Journal of Complexity, 28(3) (2012), 364–387.
[9] Argyros, I. K., Hilout, S., Tabatabai, M.A., Mathematical modelling with applications
in biosciences and engineering, Nova Publishers, New York, 2011.
[10] Bosarge, W. E., Falb, P. L., A multipoint method of third order, Journal of Optimization
Theory and Applications, 4 (1969), 156–166.
[11] Dennis, J. E., Toward a unified convergence theory for Newton-like methods, Func-
tional Analysis and Applications (L.B. Rall, ed.), Academic Press, New York, 1971.
[12] Ezquerro, J. A., Grau-Sánchez, M., Hernández, M. A., Noguera, M., Semilocal con-
vergence of secant-like methods for differentiable and nondifferentiable operators
equations, J. Math. Anal. App., 398(1) (2013), 100-112.
[13] Ezquerro, J. A., Gutiérrez, J. M., Hernández, M.A., Romero, N., Rubio, M. J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. Real Soc. Mat. Esp.,
13 (2010), 53–76.
[14] Ezquerro, J. A., Hernández, M.A., Romero, N., Velasco, A. I., App. Math. Comp.,
219(8) (2012), 3677-3692.
[15] Ezquerro, J. A., Hernández, M.A., Rubio, M.J., Secant-like methods for solving non-
linear integral equations of the Hammerstein type, Proceedings of the 8th Interna-
tional Congress on Computational and Applied Mathematics, ICCAM-98 (Leuven), J.
Comp. Appl. Math., 115 (2000), 245–254.
[16] Ezquerro, J. A., Hernández, M.A., Rubio, M.J., Solving a special case of conservative
problems by secant-like methods, App. Math. Comp., 169 (2005), 926–942.
[17] Ezquerro, J. A., Rubio, M.J., A uniparametric family of iterative processes for solving
nondifferentiable equations, J. Math. Anal. App., 275 (2002), 821–834.
[18] Kantorovich, L. V., Akilov, G. P., Functional analysis, Pergamon Press, Oxford, 1982.
[20] Magreñán, Á. A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215–224.
[21] Ortega, J. M., Rheinboldt, W. C., Iterative solution of nonlinear equations in several
variables, Academic Press, New York, 1970.
[22] Potra, F. A., Pták, V., Nondiscrete induction and iterative processes, Research Notes
in Mathematics, 103, Pitman (Advanced Publishing Program), Boston, Massachusetts,
1984.
[23] Proinov, P. D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, Journal of Complexity, 25 (2009), 38–62.
[24] Schmidt, J. W., Untere fehlerschranken fur regula-falsi verhafren, Periodica Mathe-
matica Hungarica, 9 (1978), 241–247.
[25] Traub, J. F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood
Cliffs, New Jersey, 1964.
[26] Yamamoto, T., A convergence theorem for Newton-like methods in Banach spaces,
Numerische Mathematik, 51, 545–557, 1987.
[27] Wolfe, M. A., Extended iterative methods for the solution of operator equations, Nu-
merische Mathematik, 31 (1978), 153–174.
Chapter 10

Two-Step Newton-Like Projection Method

10.1. Introduction
Let X be a real Hilbert space with inner product ⟨·, ·⟩ and norm k · k. Let U(x, R) and U̅(x, R)
stand, respectively, for the open and closed ball in X with center x and radius R > 0. Let also
L(X) be the space of all bounded linear operators from X into itself.
In this chapter we are concerned with the problem of approximately solving the ill-
posed equation
F(x) = y, (10.1.1)
where F : D(F) ⊆ X → X is a nonlinear operator satisfying ⟨F(v) − F(w), v − w⟩ ≥ 0 for
all v, w ∈ D(F), and y ∈ X.
It is assumed that (10.1.1) has a solution, namely x̂, and that F possesses a locally uniformly
bounded Fréchet derivative F′(x) for all x ∈ D(F) (cf. [18]), i.e.,
kF′(x)k ≤ CF, x ∈ D(F).
Further, the available data yδ are assumed to be noisy, with
ky − yδk ≤ δ.
Then the problem of recovery of x̂ from noisy equation F(x) = yδ is ill-posed, in the sense
that a small perturbation in the data can cause a large deviation in the solution. For solving
(10.1.1) with monotone operators (see [12, 17, 18, 19]) one usually uses the Lavrentiev
regularization method. In this method the regularized approximation xδα is obtained by
solving the operator equation
F(x) + α(x − x0) = yδ.   (10.1.2)
It is known (cf. [19], Theorem 1.1) that equation (10.1.2) has a unique solution xδα for
α > 0, provided F is Fréchet differentiable and monotone in the ball Br(x̂) ⊂ D(F) with
radius r = kx̂ − x0k + δ/α. However, the regularized equation (10.1.2) remains nonlinear
and one may have difficulties in solving it numerically.
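A scalar sketch of Lavrentiev regularization (our illustration, not from the text): take the monotone operator F(x) = x³ with exact data y = 8, so x̂ = 2. For noisy data yδ, the regularized equation F(x) + α(x − x0) = yδ has a strictly increasing left-hand side and is solved here by bisection; for small δ and α the regularized solution stays close to x̂:

```python
def lavrentiev(F, y_delta, alpha, x0, lo=0.0, hi=10.0, iters=200):
    """Solve F(x) + alpha*(x - x0) = y_delta by bisection.
    The left-hand side is strictly increasing for monotone F and alpha > 0."""
    g = lambda x: F(x) + alpha * (x - x0) - y_delta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

F = lambda x: x ** 3              # monotone on [0, 10]
delta, alpha = 1e-3, 1e-2
x_reg = lavrentiev(F, 8.0 + delta, alpha, x0=1.5)   # close to x_hat = 2
```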
In [6], George and Elmahdy considered an iterative regularization method which con-
verges linearly to xδα, with its finite dimensional realization studied in [7]. Later, in [8],
George and Elmahdy considered an iterative regularization method which converges
quadratically to xδα, with its finite dimensional realization studied in [9].
Recall that a sequence (xn) in X with limn xn = x∗ is said to be convergent of order
p > 1 if there exist positive reals β, γ such that
kxn − x∗k ≤ βe^{−γpⁿ} for all n ∈ N.
If the sequence (xn) has the property that kxn − x∗k ≤ βqⁿ, 0 < q < 1, then (xn) is said to be
linearly convergent. For an extensive discussion of convergence rates, see [13].
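In practice the order p of a computed sequence can be estimated from three consecutive errors via pn ≈ log(en+1/en)/log(en/en−1). A small sketch of this estimator (our illustration), applied to Newton's method for x² = 2, which is of order two:

```python
import math

# Newton iterates for x**2 = 2; errors e_n = |x_n - sqrt(2)|.
x, root, errs = 1.5, math.sqrt(2.0), []
for _ in range(4):
    errs.append(abs(x - root))
    x = 0.5 * (x + 2.0 / x)

# Order estimate from three consecutive errors.
p = math.log(errs[2] / errs[1]) / math.log(errs[1] / errs[0])
```

For this run p comes out close to 2, as expected for Newton's method.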
Note that the methods considered in [6], [7], [8] and [9] are analyzed using a suitably
constructed majorizing sequence which depends heavily on the initial guess, and hence
they are not suitable for practical considerations.
Recently, George and Pareth [10] introduced a two-step Newton-like projection
method (TSNLPM) of convergence order four to solve (10.1.2). (TSNLPM) was realized
as follows:
Let {Ph }h>0 be a family of orthogonal projections on X. Our aim in this section is to
obtain an approximation for xδα , in the finite dimensional space R(Ph ), the range of Ph . For
the results that follow, we impose the following conditions.
Let
εh (x) := kF 0 (x)(I − Ph )k, ∀x ∈ D(F)
and {bh : h > 0} is such that limh→0 k(I−P h )x0 k
bh = 0 and limh→0bh = 0. We assume that
εh (x) → 0, ∀x ∈ D(F) as h → 0. The above assumption is satisfied if, Ph → I pointwise
and if F 0 (x) is a compact operator. Further we assume that εh (x) ≤ ε0 , ∀x ∈ D(F), bh ≤ b0
and δ ∈ (0, δ0 ].
Assumption 10.1.1. (cf. [18], Assumption 3) There exists a constant k0 ≥ 0 such that
for every x, u ∈ D(F) and v ∈ X there exists an element Φ(x, u, v) ∈ X such that [F′(x) −
F′(u)]v = F′(u)Φ(x, u, v) with kΦ(x, u, v)k ≤ k0kvkkx − uk.
Assumption 10.1.2. There exists a continuous, strictly monotonically increasing function
ϕ : (0, a] → (0, ∞) with a ≥ kF′(x̂)k satisfying
(i) limλ→0 ϕ(λ) = 0,
(ii) there exists v ∈ X such that
x0 − x̂ = ϕ(F′(x̂))v.
Remark 10.1.5. The hypotheses of Assumption 10.1.1 may not hold, or may be very ex-
pensive or impossible to verify in general. In particular, as is the case for well-posed
nonlinear equations, the computation of the Lipschitz constant k0, even if this constant exists,
is very difficult. Moreover, there are classes of operators for which Assumption 10.1.1 is
not satisfied but the (TSNLPM) converges.
In this chapter, we expand the applicability of (TSNLPM) under less computational cost.
Let us explain how we achieve this goal.
(1) Assumption 10.1.3 is weaker than Assumption 10.1.1. Notice that there are classes of
operators that satisfy Assumption 10.1.3 but do not satisfy Assumption 10.1.1;
(2) The computational cost of constant K0 is less than that of constant k0 , even when
K0 = k0 ;
(4) The computable error bounds on the distances involved (including K0 ) are less costly;
(5) The convergence domain of (TSNLPM) with Assumption 10.1.3 can be larger, since
K0/k0 can be arbitrarily small (see Example 10.5.4);
and
(7) Note that Assumption 10.1.2 involves the Fréchet derivative at the exact solution
x̂, which is unknown in practice, whereas Assumption 10.1.4 depends on the Fréchet
derivative of F at x0.
These advantages are also very important in computational mathematics, since they are
obtained under less computational cost.
The chapter is organized as follows. In Section 10.2 we present the convergence anal-
ysis of (TSNLPM). Section 10.3 contains the error analysis and parameter choice strategy.
The algorithm for implementing (TSNLPM) is given in Section 10.4. Finally, numerical
examples are presented in the concluding Section 10.5.
10.2. Convergence Analysis of (TSNLPM)

Suppose that
0 < K0 < 1/(4(1 + ε0/α0))   (10.2.2)
and
(4δ0/α0)(1 + ε0/α0) < 1.   (10.2.3)
Define polynomial P on (0, ∞) by
P(t) = (1 + ε0/α0)(K0/2)t² + (1 + ε0/α0)t + δ0/α0 − 1/(4(1 + ε0/α0)).   (10.2.4)
It follows from (10.2.3) that P has a unique positive root, given in closed form by the
quadratic formula. Denote this root by p0. Let
b0 < p0, kx̂ − x0k ≤ ρ,   (10.2.5)
where
ρ < p0 − b0.   (10.2.6)
Define
γρ := (1 + ε0/α0)(K0/2)(ρ + b0)² + (ρ + b0) + δ0/α0,   (10.2.7)
r := 4γρ / (1 + √(1 + 32γρ(1 + ε0/α0)))   (10.2.8)
and
b := 4(1 + ε0/α0)K0 r.   (10.2.9)
Then we have by (10.2.2)–(10.2.9) that
0 < γρ < 1/4,   (10.2.10)
0 < r < 1   (10.2.11)
and
0 < b < 1.   (10.2.12)
Indeed, we have by (10.2.4) and (10.2.6) that γρ − 1/4 ≤ P(p0) = 0, so that 0 < γρ < 1/4,
which is (10.2.10). Estimate (10.2.11) follows from (10.2.8) and (10.2.10). Moreover, estimate
(10.2.12) follows from (10.2.2) and (10.2.11). We also have that
γρ < r.   (10.2.13)
In view of (10.2.7) and (10.2.8), estimate (10.2.13) reduces to showing that 4γρ(1 + ε0/α0) < 1,
which is true by the choice of p0 and (10.2.4). Finally, it follows from (10.2.13) that
Lemma 10.2.3. Suppose that (10.2.2), (10.2.3) and δ ∈ (0, δ0 ] hold and let Assumption
10.1.3 be satisfied. Then the following estimates hold for (TSNLPM):
192 Ioannis K. Argyros and Á. Alberto Magreñán
(a)
and
(b)
$$P_h[F(x^{h,\delta}_{n-1,\alpha}) - f^\delta + \alpha(x^{h,\delta}_{n-1,\alpha} - x_0)] = y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha} - R_\alpha^{-1}(y^{h,\delta}_{n-1,\alpha})P_h$$
where
$$\Gamma_1 := R_\alpha^{-1}(y^{h,\delta}_{n-1,\alpha})P_h\big[F'(y^{h,\delta}_{n-1,\alpha})(y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha}) - (F(y^{h,\delta}_{n-1,\alpha}) - F(x^{h,\delta}_{n-1,\alpha}))\big]$$
and
$$\Gamma_2 := R_\alpha^{-1}(y^{h,\delta}_{n-1,\alpha})P_h\big[F'(y^{h,\delta}_{n-1,\alpha}) - F'(x^{h,\delta}_{n-1,\alpha})\big](x^{h,\delta}_{n-1,\alpha} - y^{h,\delta}_{n-1,\alpha}).$$
Note that,
$$\begin{aligned}
\|\Gamma_1\| &= \Big\|R_\alpha^{-1}(y^{h,\delta}_{n-1,\alpha})P_h\int_0^1\big[F'(y^{h,\delta}_{n-1,\alpha}) - F'(x^{h,\delta}_{n-1,\alpha} + t(y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha}))\big](y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha})\,dt\Big\| \\
&= \Big\|R_\alpha^{-1}(y^{h,\delta}_{n-1,\alpha})P_hF'(y^{h,\delta}_{n-1,\alpha})\int_0^1\big[\varphi(x^{h,\delta}_{n-1,\alpha} + t(y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha}),\,y^{h,\delta}_{n-1,\alpha},\,x^{h,\delta}_{n-1,\alpha} - y^{h,\delta}_{n-1,\alpha})\big]\,dt\Big\| \\
&\le K_0\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)\Big[\int_0^1\|x^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha} + t(y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha})\|\,dt + \|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\|\Big]\times\|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha}\| \qquad (10.2.20) \\
&\le K_0\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)\int_0^1\Big[(1-t)\|x^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\| + t\|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\| + \|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\|\Big]dt\;\|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha}\| \\
&\le \frac{K_0}{2}\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)\Big[\|x^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\| + 3\|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\|\Big]e^{h,\delta}_{n-1,\alpha};
\end{aligned}$$
the last step follows from Assumption 10.1.3 and Lemma 10.2.1. Similarly,
$$\|\Gamma_2\| \le K_0\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)\Big[\|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\| + \|x^{h,\delta}_{0,\alpha} - x^{h,\delta}_{n-1,\alpha}\|\Big]e^{h,\delta}_{n-1,\alpha}. \qquad (10.2.21)$$
So, (a) follows from (10.2.19), (10.2.20) and (10.2.21), and (b) follows from (a) and the triangle inequality:
$$\|x^{h,\delta}_{n,\alpha} - x^{h,\delta}_{n-1,\alpha}\| \le \|x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n-1,\alpha}\| + \|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{n-1,\alpha}\|.$$
Theorem 10.2.4. Under the hypotheses of Lemma 10.2.3 the following estimates hold for
(TSNLPM):
Proof. We have,
$$\begin{aligned}
y^{h,\delta}_{n,\alpha} - x^{h,\delta}_{n,\alpha} ={}& x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n-1,\alpha} - R_\alpha^{-1}(x^{h,\delta}_{n,\alpha})P_h\big[F(x^{h,\delta}_{n,\alpha}) - f^\delta + \alpha(x^{h,\delta}_{n,\alpha} - x_0)\big] \\
&+ R_\alpha^{-1}(y^{h,\delta}_{n-1,\alpha})P_h\big[F(y^{h,\delta}_{n-1,\alpha}) - f^\delta + \alpha(y^{h,\delta}_{n-1,\alpha} - x_0)\big] \\
={}& R_\alpha^{-1}(x^{h,\delta}_{n,\alpha})P_h\big[F'(x^{h,\delta}_{n,\alpha})(x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n-1,\alpha})
\end{aligned}$$
and
$$\|\Gamma_4\| \le K_0\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)\Big[\|x^{h,\delta}_{n,\alpha} - x^{h,\delta}_{0,\alpha}\| + \|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\|\Big]\|x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n-1,\alpha}\|.$$
Now
$$\begin{aligned}
e^{h,\delta}_{n,\alpha} &\le \frac{K_0}{2}\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)\Big[5\|x^{h,\delta}_{n,\alpha} - x^{h,\delta}_{0,\alpha}\| + 3\|y^{h,\delta}_{n-1,\alpha} - x^{h,\delta}_{0,\alpha}\|\Big]\|x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n-1,\alpha}\| \qquad (10.2.24) \\
&\le \frac{K_0}{2}\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)(8r)\,\frac{K_0}{2}\Big(1+\frac{\varepsilon_0}{\alpha_0}\Big)(8r)\,\|x^{h,\delta}_{n-1,\alpha} - y^{h,\delta}_{n-1,\alpha}\| \\
&\le b^2\,e^{h,\delta}_{n-1,\alpha} \le \cdots \le b^{2n}\,e^{h,\delta}_{0,\alpha} \le b^{2n}\gamma_\rho.
\end{aligned}$$
Theorem 10.2.5. Suppose that the hypotheses of Theorem 10.2.4 hold. Then, sequences
$\{x^{h,\delta}_{n,\alpha}\}$ and $\{y^{h,\delta}_{n,\alpha}\}$ generated by (TSNLPM) are well defined and remain in $U(P_hx_0, r)$ for all $n \ge 0$.
i.e., $x^{h,\delta}_{1,\alpha} \in B_r(P_hx_0)$. Again note that from (10.2.25) and Theorem 10.2.4 we get,
i.e., $y^{h,\delta}_{1,\alpha} \in B_r(P_hx_0)$. Further by (10.2.25) and (b) of Lemma 10.2.3 we have,
and
$$\begin{aligned}
\|y^{h,\delta}_{2,\alpha} - P_hx_0\| &\le \|y^{h,\delta}_{2,\alpha} - x^{h,\delta}_{2,\alpha}\| + \|x^{h,\delta}_{2,\alpha} - P_hx_0\| \\
&\le b^4\gamma_\rho + 2(1+b)\gamma_\rho \\
&\le b^2\gamma_\rho + 2(1+b)\gamma_\rho \\
&\le \Big[\frac{1-b^3}{1-b} + \frac{1-b^2}{1-b}\Big]\gamma_\rho \qquad (\text{since } b < 1) \\
&< \frac{2\gamma_\rho}{1-b} < r.
\end{aligned}$$
Theorem 10.2.6. Suppose that the hypotheses of Theorem 10.2.5 hold. Then the following
assertions hold
(c)
$$\|x^{h,\delta}_{n,\alpha} - x^{h,\delta}_{\alpha}\| \le \frac{(1+b)b^{2n}\gamma_\rho}{1-b^2},$$
where γρ and b are defined by (10.2.7) and (10.2.9), respectively.
$$\|x^{h,\delta}_{n+i+1,\alpha} - x^{h,\delta}_{n+i,\alpha}\| \le (1+b)b\,\|x^{h,\delta}_{n+i,\alpha} - y^{h,\delta}_{n+i-1,\alpha}\|.$$
So,
$$\begin{aligned}
\|x^{h,\delta}_{n+m,\alpha} - x^{h,\delta}_{n,\alpha}\| &\le \sum_{i=0}^{m-1}\|x^{h,\delta}_{n+i+1,\alpha} - x^{h,\delta}_{n+i,\alpha}\| \\
&\le (1+b)b^{2n}\sum_{i=0}^{m-1}b^{2i}\gamma_\rho \\
&= (1+b)b^{2n}\,\frac{1-b^{2m}}{1-b^2}\,\gamma_\rho \to \frac{(1+b)b^{2n}}{1-b^2}\,\gamma_\rho,
\end{aligned}$$
as $m \to \infty$. Thus $x^{h,\delta}_{n,\alpha}$ is a Cauchy sequence in $U(P_hx_0, r)$ and hence it converges, say to $x^{h,\delta}_{\alpha} \in U(P_hx_0, r)$.
Observe that,
$$\begin{aligned}
\|P_h[F(x^{h,\delta}_{n,\alpha}) - f^\delta + \alpha(x^{h,\delta}_{n,\alpha} - x_0)]\| &= \|R_\alpha(x^{h,\delta}_{n,\alpha})(x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n,\alpha})\| \\
&\le \|R_\alpha(x^{h,\delta}_{n,\alpha})\|\,\|x^{h,\delta}_{n,\alpha} - y^{h,\delta}_{n,\alpha}\| \\
&= \|P_hF'(x^{h,\delta}_{n,\alpha})P_h + \alpha P_h\|\,e^{h,\delta}_{n,\alpha} \\
&\le (C_F + \alpha)\,e^{h,\delta}_{n,\alpha}. \qquad (10.2.26)
\end{aligned}$$
$$P_h[F(x^{h,\delta}_{\alpha}) + \alpha(x^{h,\delta}_{\alpha} - x_0)] = P_hy^\delta. \qquad (10.2.27)$$
Remark 10.2.7. (a) The convergence order of (TSNLPM) is four [10] under Assumption 10.1.1. In Theorem 10.2.6 the error bounds are too pessimistic. That is why in practice we shall use the computational order of convergence (COC) (see, e.g., [5]) defined by
$$\rho \approx \ln\left(\frac{\|x_{n+1} - x^\delta_\alpha\|}{\|x_n - x^\delta_\alpha\|}\right)\Big/\ln\left(\frac{\|x_n - x^\delta_\alpha\|}{\|x_{n-1} - x^\delta_\alpha\|}\right).$$
The (COC) $\rho$ will then be close to 4, which is the order of convergence of (TSNLPM).
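As a quick illustration, the (COC) can be estimated from three consecutive iterates once a good approximation $x^\delta_\alpha$ is at hand. The sketch below uses hypothetical error norms (not data from this chapter) whose decay mimics a fourth-order method:

```python
import math

def coc(errors):
    """Computational order of convergence from three successive error
    norms ||x_{n-1} - x||, ||x_n - x||, ||x_{n+1} - x||, via the ratio of
    logarithms in Remark 10.2.7."""
    e_prev, e_mid, e_next = errors
    return math.log(e_next / e_mid) / math.log(e_mid / e_prev)

# Hypothetical error norms decaying like e_{n+1} ~ C e_n^4
errs = [1e-1, 1e-4, 1e-16]
rho = coc(errs)
print(round(rho, 2))  # close to 4
```

In practice the errors are replaced by $\|x_n - x^\delta_\alpha\|$ for the last computed iterates.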
(b) Note that from the proof of Theorem 10.2.5 a larger $r$ can be obtained by solving the equation
$$[b^4t + 2(1+bt)]\gamma_\rho - rt = 0.$$
Note that this equation has a minimal root $r^* > r$. Then, $r^*$ can replace $r$ in Theorem 10.2.5. However, we decided to use $r$, which is given in closed form. Using Mathematica or Maple we found $r^*$ in closed form, but it has a complicated and long form. That is why we decided not to include $r^*$ in this paper.
Proposition 10.3.1. Let $F : D(F) \subseteq X \to X$ be a monotone operator in $X$. Let $x^{h,\delta}_{\alpha}$ be the solution of (10.2.27) and $x^h_\alpha := x^{h,0}_\alpha$. Then
$$\|x^{h,\delta}_{\alpha} - x^h_\alpha\| \le \frac{\delta}{\alpha}.$$
Proof. The result follows from the monotonicity of $F$ and the relation
$$P_h\big[F(x^{h,\delta}_{\alpha}) - F(x^h_\alpha) + \alpha(x^{h,\delta}_{\alpha} - x^h_\alpha)\big] = P_h(y^\delta - y).$$
Theorem 10.3.2. Let $\rho < \frac{2}{K_0(1+\varepsilon_0/\alpha_0)}$ and let $\hat{x} \in D(F)$ be a solution of (10.1.1). And let Assumption 10.1.3, Assumption 10.1.4 and the assumptions in Proposition 10.3.1 be satisfied. Then
$$\|x^h_\alpha - \hat{x}\| \le \tilde{C}\Big(\varphi(\alpha) + \frac{\varepsilon_h}{\alpha}\Big),$$
where
$$\tilde{C} := \frac{\max\big\{1+\big(1+\frac{\varepsilon_0}{\alpha_0}\big)K_0(2b_0+\rho),\ \rho+\|\hat{x}\|\big\}}{1-\big(1+\frac{\varepsilon_0}{\alpha_0}\big)\frac{K_0}{2}\rho}.$$
Proof. Let $M := \int_0^1 F'(\hat{x} + t(x^h_\alpha - \hat{x}))\,dt$. Then from the relation
we have,
$$(P_hMP_h + \alpha P_h)(x^h_\alpha - \hat{x}) = P_h\alpha(x_0 - \hat{x}) + P_hM(I - P_h)\hat{x}.$$
Hence,
where $\zeta_1 = (P_hMP_h + \alpha P_h)^{-1}P_h[F'(x_0) - M + M(I-P_h)](F'(x_0)+\alpha I)^{-1}\alpha(x_0-\hat{x})$ and $\zeta_2 = (F'(x_0)+\alpha I)^{-1}\alpha(x_0-\hat{x}) + (P_hMP_h+\alpha P_h)^{-1}P_hM(I-P_h)\hat{x}$. Observe that,
$$\begin{aligned}
\|\zeta_1\| \le{}& \Big\|(P_hMP_h+\alpha P_h)^{-1}P_h\int_0^1\big[F'(x_0) - F'(\hat{x}+t(x^h_\alpha-\hat{x}))\big]\,dt\,(F'(x_0)+\alpha I)^{-1}\alpha(x_0-\hat{x})\Big\| \\
&+ \big\|(P_hMP_h+\alpha P_h)^{-1}P_hM(I-P_h)(F'(x_0)+\alpha I)^{-1}\alpha(x_0-\hat{x})\big\| \\
\le{}& \Big\|(P_hMP_h+\alpha P_h)^{-1}P_h\int_0^1\big[F'(\hat{x}+t(x^h_\alpha-\hat{x}))(P_h+I-P_h)\,\varphi(x_0,\ \hat{x}+t(x^h_\alpha-\hat{x}),\ (F'(x_0)+\alpha I)^{-1}\alpha(x_0-\hat{x}))\big]\,dt\Big\| + \frac{\varepsilon_h}{\alpha}\rho,
\end{aligned}$$
where, here and below, $\varepsilon_h := \varepsilon_h(\hat{x}+t(x^h_\alpha-\hat{x}))$. So
$$\begin{aligned}
\|\zeta_1\| &\le \Big(1+\frac{\varepsilon_h}{\alpha}\Big)K_0\int_0^1\big[\|x_0-P_hx_0\| + \|\hat{x}+t(x^h_\alpha-\hat{x})-P_hx_0\|\big]\,\|(F'(x_0)+\alpha I)^{-1}\alpha(x_0-\hat{x})\|\,dt + \frac{\varepsilon_h}{\alpha}\rho \\
&\le \Big(1+\frac{\varepsilon_h}{\alpha}\Big)K_0\Big[(b_0+\|\hat{x}-x_0+x_0-P_hx_0\|)\varphi(\alpha) + \frac12\|x^h_\alpha-\hat{x}\|\rho\Big] + \frac{\varepsilon_h}{\alpha}\rho \\
&\le \Big(1+\frac{\varepsilon_h}{\alpha}\Big)K_0\Big[(2b_0+\rho)\varphi(\alpha) + \frac12\|x^h_\alpha-\hat{x}\|\rho\Big] + \frac{\varepsilon_h}{\alpha}\rho \qquad (10.3.2)
\end{aligned}$$
and
$$\|\zeta_2\| \le \varphi(\alpha) + \frac{\varepsilon_h}{\alpha}\|\hat{x}\|. \qquad (10.3.3)$$
The result now follows from (10.3.1), (10.3.2) and (10.3.3).
Theorem 10.3.3. Let $x^{h,\delta}_{n,\alpha}$ be as in (10.1.4), and let the assumptions in Theorem 10.2.6 and Theorem 10.3.2 hold. Then
$$\|x^{h,\delta}_{n,\alpha} - \hat{x}\| \le \frac{1+b}{1-b^2}\gamma_\rho b^{2n} + \max\{1,\tilde{C}\}\Big(\varphi(\alpha)+\frac{\delta+\varepsilon_h}{\alpha}\Big).$$
Proof.
$$\begin{aligned}
\|x^{h,\delta}_{n,\alpha} - \hat{x}\| &\le \frac{1+b}{1-b^2}\gamma_\rho b^{2n} + \frac{\delta}{\alpha} + \tilde{C}\Big(\varphi(\alpha)+\frac{\varepsilon_h}{\alpha}\Big) \\
&\le \frac{1+b}{1-b^2}\gamma_\rho b^{2n} + \max\{1,\tilde{C}\}\Big(\varphi(\alpha)+\frac{\delta+\varepsilon_h}{\alpha}\Big).
\end{aligned}$$
Let
$$n_\delta := \min\Big\{n : b^{2n} \le \frac{\delta+\varepsilon_h}{\alpha}\Big\} \qquad (10.3.4)$$
and
$$C_0 = \frac{1+b}{1-b^2}\gamma_\rho + \max\{1,\tilde{C}\}. \qquad (10.3.5)$$
Theorem 10.3.4. Let $n_\delta$ and $C_0$ be as in (10.3.4) and (10.3.5), respectively. And let $x^{h,\delta}_{n_\delta,\alpha}$ be as in (10.1.4) and the assumptions in Theorem 10.3.3 be satisfied. Then
$$\|x^{h,\delta}_{n_\delta,\alpha} - \hat{x}\| \le C_0\Big(\varphi(\alpha)+\frac{\delta+\varepsilon_h}{\alpha}\Big). \qquad (10.3.6)$$
So the relation (10.3.6) gives the optimal order error estimate (see [10]) for $\varphi(\alpha)+\frac{\delta+\varepsilon_h}{\alpha}$.
Theorem 10.3.5. Let $\psi(\lambda) := \lambda\varphi^{-1}(\lambda)$ for $0 < \lambda \le a$, and let the assumptions in Theorem 10.3.4 hold. For $\delta > 0$, let $\alpha_\delta = \varphi^{-1}(\psi^{-1}(\delta+\varepsilon_h))$ and let $n_\delta$ be as in (10.3.4). Then
$$\|x^{h,\delta}_{n_\delta,\alpha} - \hat{x}\| = O(\psi^{-1}(\delta+\varepsilon_h)).$$
$$D_N(\alpha) := \{\alpha_i = \mu^i\alpha_0,\ i = 0, 1, \cdots, N\}$$
$$\|x^{h,\delta}_{n_i,\alpha_i} - x^{h,\delta}_{\alpha_i}\| \le C\,\frac{\delta+\varepsilon_h}{\alpha_i}, \qquad \forall i = 0, 1, \cdots, N.$$
Let $x_i := x^{h,\delta}_{n_i,\alpha_i}$. In this paper we select $\alpha = \alpha_i$ from $D_N(\alpha)$ for computing $x_i$, for each $i = 0, 1, \cdots, N$.
Theorem 10.3.6. (cf. [18], Theorem 3.1) Assume that there exists $i \in \{0, 1, 2, \cdots, N\}$ such that $\varphi(\alpha_i) \le \frac{\delta+\varepsilon_h}{\alpha_i}$. Let the assumptions of Theorem 10.3.4 and Theorem 10.3.5 hold and let
$$l := \max\Big\{i : \varphi(\alpha_i) \le \frac{\delta+\varepsilon_h}{\alpha_i}\Big\} < N,$$
$$k := \max\Big\{i : \|x_i - x_j\| \le 4C_0\,\frac{\delta+\varepsilon_h}{\alpha_j},\ j = 0, 1, 2, \cdots, i\Big\}.$$
Then $l \le k$ and $\|\hat{x} - x_k\| \le c\,\psi^{-1}(\delta+\varepsilon_h)$, where $c = 6C_0\mu$.
• Choose $\alpha_i := \mu^i\alpha_0$, $i = 0, 1, 2, \cdots, N$.

10.4.1. Algorithm

1. Set $i = 0$.
2. Choose $n_i := \min\big\{n : b^{2n} \le \frac{\delta+\varepsilon_h}{\alpha_i}\big\}$.
3. Solve $x_i := x^{h,\delta}_{n_i,\alpha_i}$ by using the iterations (10.1.3) and (10.1.4).
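The steps above, together with the balancing test of Theorem 10.3.6, can be sketched as follows. Here `tsnlpm(alpha, n)` is a hypothetical placeholder for the two-step iteration producing $x^{h,\delta}_{n,\alpha}$; the stopping rule implements (10.3.4) and the acceptance test implements the definition of $k$:

```python
import math

def stopping_index(b, noise, alpha):
    """Smallest n with b**(2n) <= (delta + eps_h)/alpha, cf. (10.3.4)."""
    n = 0
    while b ** (2 * n) > noise / alpha:
        n += 1
    return n

def adaptive_choice(tsnlpm, alpha0, mu, N, b, noise, C0):
    """Balancing-principle selection of Theorem 10.3.6.

    `tsnlpm(alpha, n)` is assumed to return the iterate x^{h,delta}_{n,alpha};
    `noise` stands for delta + eps_h.  Returns the selected iterate x_k."""
    alphas = [alpha0 * mu ** i for i in range(N + 1)]
    xs, k = [], 0
    for i, a in enumerate(alphas):
        x_i = tsnlpm(a, stopping_index(b, noise, a))
        # accept alpha_i while x_i stays within 4*C0*noise/alpha_j of all x_j
        if all(abs(x_i - xs[j]) <= 4 * C0 * noise / alphas[j]
               for j in range(len(xs))):
            xs.append(x_i)
            k = i
        else:
            break
    return xs[k]
```

In an actual implementation the scalar `abs(...)` would be replaced by the norm of the discretized space.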
Example 10.5.1. (see [18], Section 4.3) Let $F : D(F) \subseteq L^2(0,1) \to L^2(0,1)$ be defined by
$$F(u) := \int_0^1 k(t,s)u^3(s)\,ds,$$
where
$$k(t,s) = \begin{cases} (1-t)s, & 0 \le s \le t \le 1 \\ (1-s)t, & 0 \le t \le s \le 1. \end{cases}$$
Then for all $x(t), y(t)$ with $x(t) > y(t)$:
$$\langle F(x)-F(y),\,x-y\rangle = \int_0^1\Big[\int_0^1 k(t,s)(x^3-y^3)(s)\,ds\Big](x-y)(t)\,dt \ge 0.$$
Thus the operator $F$ is monotone. The Fréchet derivative of $F$ is given by
$$F'(u)w = 3\int_0^1 k(t,s)u^2(s)w(s)\,ds. \qquad (10.5.1)$$
As in [10] one can see that $F'$ satisfies Assumption 10.1.2. In our computation, we take $f(t) = (t-t^{11})/110$ and $f^\delta = f + \delta$. Then the exact solution is
$$\hat{x}(t) = t^3.$$
We use
$$x_0(t) = t^3 + \frac{3}{56}(t-t^8)$$
as our initial guess, so that the function $x_0 - \hat{x}$ satisfies the source condition
$$x_0 - \hat{x} = \varphi(F'(\hat{x}))\,1,$$
where $\varphi(\lambda) = \lambda$.
For the operator $F'(\cdot)$ defined in (10.5.1), $\varepsilon_h = O(n^{-2})$ (cf. [11]). Thus we expect to obtain the rate of convergence $O((\delta+\varepsilon_h)^{1/2})$.
We choose $\alpha_0 = (1.1)(\delta+\varepsilon_h)$, $\mu = 1.1$, $\rho = 0.11$, $\gamma_\rho = 0.7818$ and $b = 0.99$. The results of the computation are presented in Table 1. The plots of the exact solution and the approximate solution obtained are given in Figures 1 and 2.
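The data of this example can be checked directly: with the Green kernel $k$ and $\hat{x}(t)=t^3$, the integral $\int_0^1 k(t,s)\hat{x}(s)^3\,ds$ should reproduce $f(t)=(t-t^{11})/110$. A small quadrature sketch (composite trapezoidal rule, written out to stay library-agnostic):

```python
import numpy as np

def k(t, s):
    """Green kernel of -u'' with zero boundary values on (0, 1)."""
    return np.where(s <= t, (1 - t) * s, (1 - s) * t)

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

s = np.linspace(0.0, 1.0, 4001)
t = np.linspace(0.0, 1.0, 11)
# F(x_hat)(t) = int_0^1 k(t,s) x_hat(s)^3 ds with x_hat(s) = s^3
Fx = np.array([trapezoid(k(ti, s) * s ** 9, s) for ti in t])
f = (t - t ** 11) / 110.0
print(bool(np.max(np.abs(Fx - f)) < 1e-6))  # True
```

This also confirms the source-condition computation, since $u(t)=\int_0^1 k(t,s)g(s)\,ds$ solves $-u''=g$ with zero boundary values.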
where $c_1$, $c_2$ are real parameters and $i > 2$ an integer. Then $F'(x) = x^{1/i} + c_1$ is not Lipschitz on $D$. However, Assumption 10.1.3 holds for $K_0 = 1$.
Figure 10.5.1. Curves of the exact and approximate solutions (panels for n = 8, 16, 32, 64, 128, 256, 512, 1024).
Indeed, we have
$$\|F'(x) - F'(x_0)\| = |x^{1/i} - x_0^{1/i}| = \frac{|x-x_0|}{x_0^{\frac{i-1}{i}} + \cdots + x^{\frac{i-1}{i}}},$$
so
$$\|F'(x) - F'(x_0)\| \le K_0|x-x_0|.$$
Table 1. Numerical results for Example 10.5.1.

n    | k  | n_k | δ + ε_h | α      | ‖x_k − x̂‖ | ‖x_k − x̂‖/(δ + ε_h)^{1/2}
-----|----|-----|---------|--------|-----------|---------------------------
8    | 2  | 2   | 0.0134  | 0.0178 | 0.2217    | 1.9158
16   | 2  | 2   | 0.0133  | 0.0178 | 0.1835    | 1.5885
32   | 2  | 2   | 0.0133  | 0.0177 | 0.1383    | 1.1981
64   | 2  | 2   | 0.0133  | 0.0177 | 0.0998    | 0.8647
128  | 2  | 2   | 0.0133  | 0.0177 | 0.0699    | 0.6051
256  | 30 | 2   | 0.0133  | 0.2559 | 0.0470    | 0.4070
512  | 30 | 2   | 0.0133  | 0.2559 | 0.0290    | 0.2509
1024 | 30 | 2   | 0.0133  | 0.2559 | 0.0121    | 0.1049
Here, $f$ is a given continuous function satisfying $f(s) > 0$, $s \in [a,b]$, $\lambda$ is a real number, and the kernel $G$ is continuous and positive in $[a,b]\times[a,b]$.
For example, when $G(s,t)$ is the Green kernel, the corresponding integral equation is equivalent to the boundary value problem
$$u'' = \lambda u^{1+1/n}, \qquad u(a) = f(a),\ u(b) = f(b),$$
studied in [1]-[5]. Instead of (10.5.3) we can try to solve the equation $F(u) = 0$, where
$$F(u)(s) = u(s) - f(s) - \lambda\int_a^b G(s,t)u(t)^{1+1/n}\,dt.$$
The norm we consider is the max-norm. The derivative $F'$ is given by
$$F'(u)v(s) = v(s) - \lambda\Big(1+\frac{1}{n}\Big)\int_a^b G(s,t)u(t)^{1/n}v(t)\,dt, \qquad v \in \Omega.$$
First of all, we notice that $F'$ does not satisfy a Lipschitz-type condition in $\Omega$. Let us consider, for instance, $[a,b] = [0,1]$, $G(s,t) = 1$ and $y(t) = 0$. Then $F'(y)v(s) = v(s)$ and
$$\|F'(x) - F'(y)\| = |\lambda|\Big(1+\frac{1}{n}\Big)\int_a^b x(t)^{1/n}\,dt,$$
so a Lipschitz condition (10.5.5) would have to hold for all $x \in \Omega$ and for a constant $L_2$. But this is not true. Consider, for example, the functions
$$x_j(t) = \frac{t}{j}, \qquad j \ge 1,\ t \in [0,1].$$
If these are substituted into (10.5.5), we obtain
$$\frac{1}{j^{1/n}(1+1/n)} \le \frac{L_2}{j} \Longleftrightarrow j^{1-1/n} \le L_2\Big(1+\frac{1}{n}\Big), \qquad \forall j \ge 1.$$
This inequality cannot hold for all $j$, since $j^{1-1/n} \to \infty$ as $j \to \infty$.
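The failure of the Lipschitz condition can also be seen numerically. Following the text's computation for $G \equiv 1$ on $[0,1]$ with $x_j(t)=t/j$ and $y=0$, and taking $|\lambda| = 1$ purely for illustration, the candidate Lipschitz ratio grows like $j^{1-1/n}$:

```python
# Per the text: ||F'(x_j) - F'(0)|| = |lambda| (1 + 1/n) * int_0^1 (t/j)^(1/n) dt,
# and in the max-norm ||x_j - 0|| = 1/j.  With |lambda| = 1 (illustrative),
# the ratio ||F'(x_j) - F'(0)|| / ||x_j|| equals j**(1 - 1/n) -> infinity.
n = 3  # exponent parameter of the example

def lipschitz_ratio(j):
    integral = (1.0 / j) ** (1.0 / n) / (1.0 + 1.0 / n)  # int_0^1 (t/j)^(1/n) dt
    diff_norm = (1.0 + 1.0 / n) * integral               # ||F'(x_j) - F'(0)||
    return diff_norm / (1.0 / j)                         # divide by ||x_j||

for j in (1, 10, 100, 1000):
    print(j, round(lipschitz_ratio(j), 3))  # grows like j**(2/3) for n = 3
```

Any fixed $L_2$ is eventually exceeded, exactly as the displayed inequality asserts.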
Therefore, condition (10.5.5) is not satisfied in this case. However, Assumption 10.1.3 holds. To show this, let $x_0(t) = f(t)$ and $\gamma = \min_{s\in[a,b]} f(s) > 0$. Then for $v \in \Omega$,
$$\begin{aligned}
\|[F'(x)-F'(x_0)]v\| &= |\lambda|\Big(1+\frac{1}{n}\Big)\max_{s\in[a,b]}\Big|\int_a^b G(s,t)\big(x(t)^{1/n}-f(t)^{1/n}\big)v(t)\,dt\Big| \\
&\le |\lambda|\Big(1+\frac{1}{n}\Big)\max_{s\in[a,b]}\int_a^b G_n(s,t)\,dt\,\|v\|,
\end{aligned}$$
where
$$G_n(s,t) = \frac{G(s,t)|x(t)-f(t)|}{x(t)^{(n-1)/n} + x(t)^{(n-2)/n}f(t)^{1/n} + \cdots + f(t)^{(n-1)/n}}.$$
Hence,
$$\|[F'(x)-F'(x_0)]v\| \le \frac{|\lambda|(1+1/n)}{\gamma^{(n-1)/n}}\max_{s\in[a,b]}\int_a^b G(s,t)\,dt\,\|x-x_0\|\,\|v\| \le K_0\|x-x_0\|\,\|v\|,$$
where $K_0 = \frac{|\lambda|(1+1/n)}{\gamma^{(n-1)/n}}N$ and $N = \max_{s\in[a,b]}\int_a^b G(s,t)\,dt$. Then Assumption 10.1.3 holds for sufficiently small $\lambda$.
Example 10.5.4. Let $X = D(F) = \mathbb{R}$, $x_0 = 0$, and define the function $F$ on $D(F)$ by
$$F(x) = d_0x + d_1 + d_2\sin e^{d_3x},$$
where $d_0$, $d_1$, $d_2$ and $d_3$ are given parameters. Then, it can easily be seen that for $d_3$ sufficiently large and $d_1$ sufficiently small, $K_0/k_0$ can be arbitrarily small.
References
[2] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point. Rev. Anal. Numer. Theor. Approx.
36, (2007), 123-138.
[3] Argyros, I.K., A Semilocal convergence for directional Newton methods, Math. Com-
put. (AMS). 80, (2011), 327-343.
[4] Argyros, I.K., and Hilout, S., Weaker conditions for the convergence of Newton’s
method, J. Complexity, 28, (2012), 364-387.
[5] Argyros, I.K., Cho, Y.J., and Hilout, S., Numerical methods for equations and its
applications, CRC Press, Taylor and Francis, New York, 2012.
[6] George, S., Elmahdy, A.I., An analysis of Lavrentiev regularization for nonlinear ill-
posed problems using an iterative regularization method, Int. J. Comput. Appl. Math.,
5(3) (2010),369-381.
[7] George, S., Elmahdy, A.I., An iteratively regularized projection method for nonlinear ill-posed problems, Int. J. Contemp. Math. Sciences, 5(52) (2010), 2547-2565.
[8] George, S., Elmahdy, A.I., A quadratic convergence yielding iterative method for non-
linear ill-posed operator equations, Comput. Methods Appl. Math, 12(1) (2012), 32-45
[9] George, S., Elmahdy, A.I., An iteratively regularized projection method with quadratic
convergence for nonlinear ill-posed problems, Int. J. of Math. Analysis, 4(45) (2010),
2211-2228.
[10] George, S., Pareth, S., An application of Newton type iterative method for Lavren-
tiev regularization for ill-posed equations: Finite dimensional realization, IJAM, 42(3)
(2012), 164-170.
[11] Groetsch, C.W., King, J.T., Murio, D., Asymptotic analysis of a finite element method
for Fredholm equations of the first kind, in Treatment of Integral Equations by Nu-
merical Methods, Eds.: C.T.H. Baker and G.F. Miller, Academic Press, London, 1982,
1-11.
206 Ioannis K. Argyros and Á. Alberto Magreñán
[12] Janno, J., Tautenhahn, U., On Lavrentiev regularization for ill-posed problems in
Hilbert scales, Numer. Funct. Anal. Optim., 24(5-6) (2003), 531-555.
[13] Kelley, C.T., Iterative methods for linear and nonlinear equations, SIAM Philadelphia,
1995.
[14] Mathé, P., Pereverzev, S.V., Geometry of linear ill-posed problems in variable Hilbert
scales, Inverse Problems, 19(3) (2003), 789-803.
[15] Mahale, P., Nair, M. T., Iterated Lavrentiev regularization for nonlinear ill-posed prob-
lems, ANZIAM Journal, 51 (2009), 191-217.
[16] Pareth, S., George, S., Newton type methods for Lavrentiev regularization of nonlinear ill-posed operator equations, (2012), (submitted).
[17] Pereverzev, S.V., Schock, E., On the adaptive selection of the parameter in regularization of ill-posed problems, SIAM J. Numer. Anal., 43 (2005), 2060-2076.
[18] Semenova, E.V., Lavrentiev regularization and balancing principle for solving ill-
posed problems with monotone operators, Comput. Methods Appl. Math., 4 (2010),
444-454.
[19] Tautenhahn, U., On the method of Lavrentiev regularization for nonlinear ill-posed
problems, Inverse Problems, 18 (2002), 191-207.
Chapter 11
11.1. Introduction
Let $X$ be a real Hilbert space with the norm $\|\cdot\|$ and the inner product $\langle\cdot,\cdot\rangle$. Here we consider the inclusion problem of the form: find a solution to
$$0 \in M(x). \qquad (11.1.1)$$
nique, while this work was followed by accelerated research developments. Furthermore, it generalizes the existing theory of maximal monotone operators (based on the classical resolvent), including the $H$-maximal monotonicity of Fang and Huang [4], which concerns the generalization of classical maximal monotonicity. Fang and Huang [4] introduced the notion of $H$-maximal monotonicity while investigating the solvability of a general class of inclusion problems. They applied $(H,\eta)$-maximal monotonicity [5] in the context of approximating the solutions of inclusion problems using the generalized resolvent operator technique. The generalized resolvent operator technique is equally effective when applied to several other problems, such as equilibrium problems in economics, global optimization and control theory, operations research, mathematical finance, management and decision sciences, mathematical programming, and engineering science. For more details on the resolvent operator technique, its applications, and further developments, we refer the reader to [1]-[33] and the references therein.
$$D(M) = \{x \in X : \exists\, y \in X,\ (x,y) \in M\} = \{x \in X : M(x) \neq \emptyset\}.$$
$\mathrm{dom}(M) = X$ shall denote the full domain of $M$, and the range of $M$ is defined by
The inverse $M^{-1}$ of $M$ is $\{(y,x) : (x,y) \in M\}$. For a real number $\rho$ and a mapping $M$, let $\rho M = \{(x,\rho y) : (x,y) \in M\}$. If $L$ and $M$ are any mappings, we define
(i) $(r)$-strongly monotone if there exists a positive constant $r$ such that
Lemma 11.3.1. ([18]) Let $X$ be a real Hilbert space, let $A : X \to X$ be $(r)$-strongly monotone, and let $M : X \to 2^X$ be $A$-maximal monotone. Then the generalized resolvent operator associated with $M$ and defined by
$$J^M_{\rho,A}(u) = (A+\rho M)^{-1}(u) \qquad \forall\, u \in X,$$
is $\big(\frac{1}{r-\rho m}\big)$-Lipschitz continuous for $r - \rho m > 0$.
satisfies
$$\|J^M_{\rho,A}(A(u)) - J^M_{\rho,A}(A(v))\| \le \frac{1}{r-\rho m}\|A(u)-A(v)\|, \qquad (11.3.1)$$
where $r - \rho m > 0$.
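A quick finite-dimensional sanity check of this Lipschitz property: take $A = rI$ (which is $(r)$-strongly monotone) and $M$ a symmetric positive semidefinite matrix, which is plainly monotone, so $m = 0$ and the bound reduces to $1/r$. This is an illustrative numerical sketch, not part of the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)
r, rho = 2.0, 0.5
# A = r*I is (r)-strongly monotone; M = B^T B is monotone (so m = 0 here)
B = rng.standard_normal((5, 5))
M = B.T @ B
A = r * np.eye(5)
J = np.linalg.inv(A + rho * M)  # generalized resolvent J^M_{rho,A}

u, v = rng.standard_normal(5), rng.standard_normal(5)
lhs = np.linalg.norm(J @ (A @ u) - J @ (A @ v))
rhs = np.linalg.norm(A @ u - A @ v) / r   # bound (11.3.1) with m = 0
print(bool(lhs <= rhs))  # True
```

The check works because $\|(rI+\rho M)^{-1}\| = 1/(r+\rho\lambda_{\min}(M)) \le 1/r$ for positive semidefinite $M$.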
where
$$J^M_{\rho,A}(u) = (A+\rho M)^{-1}(u).$$
where
$$J^M_{\rho,H}(u) = (H+\rho M)^{-1}(u).$$
$$\langle (J^M_{\rho,A}\circ A)(u) - (J^M_{\rho,A}\circ A)(v),\ A(u)-A(v)\rangle \le \frac{1}{r-\rho m}\|A(u)-A(v)\|^2 \qquad \forall\, u, v \in X,$$
where $r - \rho m > 0$.
$$\langle (J^M_{\rho,H}\circ H)(u) - (J^M_{\rho,H}\circ H)(v),\ H(u)-H(v)\rangle \le \frac{1}{r}\|H(u)-H(v)\|^2 \qquad \forall\, u, v \in X.$$
In the following theorem, we apply the generalized relaxed proximal point algorithm to approximate the solution of (11.1.1) and, as a result, we succeed in achieving linear convergence.
and $y_k$ satisfies
$$\|y_k - A(J^M_{\rho_k,A}(A(x_k)))\| \le \delta_k\|y_k - A(x_k)\|,$$
where $J^M_{\rho_k,A} = (A+\rho_kM)^{-1}$, and
$$r - \rho_k m \ge 1 + \frac{2\gamma^2(s^2-1)}{1-2\gamma+\sqrt{(1-2\gamma)^2-4\gamma^4(s^2-1)}}, \qquad (11.3.3)$$
$$1 < s \le \sqrt{1+\frac{1-2\gamma}{2\gamma^2}}, \qquad (11.3.4)$$
$\sum_{k=0}^\infty\delta_k < \infty$, $\delta_k \to 0$, and $\alpha = \limsup_{k\to\infty}\alpha_k$, $\rho = \limsup_{k\to\infty}\rho_k$.
Then the sequence $\{x_k\}$ converges linearly to a solution $x^*$ of (11.1.1) with the convergence rate
$$\begin{aligned}
\theta_k &= \sqrt{(1-\alpha_k)^2 + 2\alpha_k(1-\alpha_k)\frac{1}{r-\rho_km} + \alpha_k^2\frac{s^2}{(r-\rho_km)^2}} \\
&= \frac{1}{r-\rho_km}\sqrt{\big(s^2+(r-\rho_km)^2-2(r-\rho_km)\big)\alpha_k^2 - 2\big(1-(r-\rho_km)\big)\alpha_k + 1} \\
&= \frac{1}{r-\rho_km}\sqrt{P_k(\alpha_k)} \in (0,1),
\end{aligned}$$
where
$$P_k(\alpha_k) = \big(s^2+(r-\rho_km)^2-2(r-\rho_km)\big)\alpha_k^2 - 2\big(1-(r-\rho_km)\big)\alpha_k + 1 = \big(1-\alpha_k(1-(r-\rho_km))\big)^2 + \alpha_k^2(s^2-1).$$
Proof. Note that it follows from hypotheses (11.3.2) and (11.3.3) that $\theta_k \in (0,1)$. Suppose that $x^*$ is a zero of $M$. Then from Theorem 11.3.1 it follows that any solution to (11.1.1) is a fixed point of $J^M_{\rho_k,A}\circ A$. For all $k \ge 0$, we express
where
$$\theta_k^2 = \frac{P_k(\alpha_k)}{(r-\rho_km)^2}.$$
Thus, we have
$$\begin{aligned}
\|A(x_{k+1})-A(z_{k+1})\| &= \big\|(1-\alpha_k)A(x_k)+\alpha_ky_k - \big[(1-\alpha_k)A(x_k)+\alpha_kA(J^M_{\rho_k,A}(A(x_k)))\big]\big\| \\
&= \|\alpha_k(y_k - A(J^M_{\rho_k,A}(A(x_k))))\| \\
&\le \alpha_k\delta_k\|y_k - A(x_k)\|.
\end{aligned}$$
$$\begin{aligned}
\|A(x_{k+1})-A(x^*)\| &\le \|A(z_{k+1})-A(x^*)\| + \|A(x_{k+1})-A(z_{k+1})\| \\
&\le \|A(z_{k+1})-A(x^*)\| + \alpha_k\delta_k\|y_k-A(x_k)\| \\
&= \|A(z_{k+1})-A(x^*)\| + \delta_k\|A(x_{k+1})-A(x_k)\| \\
&\le \|A(z_{k+1})-A(x^*)\| + \delta_k\|A(x_{k+1})-A(x^*)\| + \delta_k\|A(x_k)-A(x^*)\|. \qquad (11.3.6)
\end{aligned}$$
$$\begin{aligned}
(1-\delta_k)\|A(x_{k+1})-A(x^*)\| &\le \|A(z_{k+1})-A(x^*)\| + \delta_k\|A(x_k)-A(x^*)\| \\
&\le \theta_k\|A(x_k)-A(x^*)\| + \delta_k\|A(x_k)-A(x^*)\| \\
&= (\theta_k+\delta_k)\|A(x_k)-A(x^*)\|. \qquad (11.3.7)
\end{aligned}$$
Therefore,
$$\|A(x_{k+1})-A(x^*)\| \le \frac{\theta_k+\delta_k}{1-\delta_k}\|A(x_k)-A(x^*)\|, \qquad (11.3.8)$$
where
$$\limsup_{k\to\infty}\frac{\theta_k+\delta_k}{1-\delta_k} = \limsup_{k\to\infty}\theta_k = \limsup_{k\to\infty}\frac{1}{r-\rho_km}\sqrt{P_k(\alpha_k)}. \qquad (11.3.9)$$
$P_k$ is a quadratic polynomial for each $k$ whose leading coefficient
$$s^2+(r-\rho_km)^2-2(r-\rho_km) = \big(1-(r-\rho_km)\big)^2 + s^2 - 1$$
Now, it follows from (11.3.8), in light of (11.3.9), that the sequence $\{A(x_k)\}$ converges to $A(x^*)$. On the other hand, since $A$ is $(r)$-strongly monotone (and hence $\|A(x)-A(y)\| \ge r\|x-y\|$), we have that
$$\|x_k - x^*\| \le \frac{1}{r}\|A(x_k)-A(x^*)\| \to 0, \qquad (11.3.10)$$
which completes the proof.
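A minimal numerical sketch of the relaxed scheme, under simplifying assumptions not made in the theorem: $A = I$ (classical resolvent), exact resolvent evaluations ($\delta_k = 0$), constant $\alpha_k$ and $\rho_k$, and the linear monotone operator $M(x) = Qx - b$ with $Q$ symmetric positive definite. The update $A(x_{k+1}) = (1-\alpha_k)A(x_k) + \alpha_ky_k$ then reduces to $x_{k+1} = (1-\alpha)x_k + \alpha J_\rho(x_k)$:

```python
import numpy as np

rng = np.random.default_rng(1)
Q0 = rng.standard_normal((4, 4))
Q = Q0.T @ Q0 + np.eye(4)       # symmetric positive definite => M monotone
b = rng.standard_normal(4)
x_star = np.linalg.solve(Q, b)  # the zero of M(x) = Qx - b

rho, alpha = 1.0, 0.9
J = np.linalg.inv(np.eye(4) + rho * Q)   # resolvent (I + rho*M)^{-1} acts via J @ (x + rho*b)

x = np.zeros(4)
for _ in range(200):
    y = J @ (x + rho * b)                # exact resolvent step (delta_k = 0)
    x = (1 - alpha) * x + alpha * y      # relaxed update

print(bool(np.linalg.norm(x - x_star) < 1e-8))  # True
```

The iterates contract linearly toward $x^*$, in line with the linear convergence asserted above.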
and $y_k$ satisfies
$$\|y_k - H(J^M_{\rho_k,H}(H(x_k)))\| \le \delta_k\|y_k - H(x_k)\|,$$
where $J^M_{\rho_k,H} = (H+\rho_kM)^{-1}$, and
$$r \ge 1 + \frac{2\gamma^2(s^2-1)}{1-2\gamma+\sqrt{(1-2\gamma)^2-4\gamma^4(s^2-1)}},$$
$$1 < s \le \sqrt{1+\frac{1-2\gamma}{2\gamma^2}}.$$
Then the sequence $\{x_k\}$ converges linearly to a solution of (11.1.1) with convergence rate
$$\theta_k = \frac{1}{r}\sqrt{(s^2+r^2-2r)\alpha_k^2 - 2(1-r)\alpha_k + 1},$$
where $\sum_{k=0}^\infty\delta_k < \infty$, $\delta_k \to 0$, and $\alpha = \limsup_{k\to\infty}\alpha_k$, $\rho = \limsup_{k\to\infty}\rho_k$.
Relaxed Proximal Point Algorithms 215
11.4. An Application

Let $X$ be a real Hilbert space and let $f : X \to \mathbb{R}$ be a locally Lipschitz functional on $X$. We consider the inclusion problem: determine a solution to
$$0 \in \partial f(x). \qquad (11.4.1)$$
Theorem 11.4.1. Let $X$ be a real Hilbert space, and let $A : X \to X$ be $(r)$-strongly monotone and $(s)$-Lipschitz continuous. Let $f : X \to \mathbb{R}$ be a locally Lipschitz functional on $X$, and let $\partial f : X \to 2^X$ be $A$-maximal monotone. For an arbitrarily chosen initial point $x_0$, suppose that the sequence $\{x_k\}$ is generated by the generalized proximal point algorithm
and $y_k$ satisfies
$$\|y_k - A(J^{\partial f}_{\rho_k,A}(A(x_k)))\| \le \delta_k\|y_k - A(x_k)\|,$$
where $J^{\partial f}_{\rho_k,A} = (A+\rho_k\partial f)^{-1}$, and
$$r - \rho_k m \ge 1 + \frac{2\gamma^2(s^2-1)}{1-2\gamma+\sqrt{(1-2\gamma)^2-4\gamma^4(s^2-1)}},$$
$$1 < s \le \sqrt{1+\frac{1-2\gamma}{2\gamma^2}}.$$
Then the sequence $\{x_k\}$ converges linearly to a solution of (11.4.1) with convergence rate given in Theorem 11.3.3.
References
[1] Agarwal, R.P., Verma, R.U., General system of (A, η)−maximal relaxed monotone
variational inclusion problems based on generalized hybrid algorithms, Communica-
tions in Nonlinear Science and Numerical Simulations 15 (2010), 238–251.
[2] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and its Appli-
cations, CRC Press, Taylor & Francis, New York, 2012.
[21] Dhage, B.C., Verma, R.U., Second order boundary value problems of discontinuous
differential inclusions, Communications on Applied Nonlinear Analysis 12(3) (2005),
37-44.
[3] Eckstein, J., Bertsekas, D.P., On the Douglas-Rachford splitting method and the proxi-
mal point algorithm for maximal monotone operators, Mathematical Programming 55
(1992), 293–318.
[4] Fang, Y.P., Huang, N.J., H− monotone operators and system of variational inclusions,
Communications on Applied Nonlinear Analysis 11(1) (2004), 93–101.
[5] Fang, Y.P., Huang, N.J., Thompson, H.B., A new system of variational inclusions with
(H, η)− monotone operators, Computers and Mathematics with Applications 49(2-3)
(2005), 365–374.
[6] Fukushima, M., The primal Douglas-Rachford splitting algorithm for a class of mono-
tone operators with applications to the traffic equilibrium problem, Mathematical Pro-
gramming 72(1996), 1–15.
[7] Glowinski, R., Le Tallec, P., Augmented Lagrangians and Operator-Splitting Methods in Continuum Mechanics, SIAM, Philadelphia, PA, 1989.
[8] Lan, H.Y., Kim, J.H., Cho, Y.J., On a new class of nonlinear A−monotone multivalued
variational inclusions, Journal of Mathematical Analysis and Applications 327(1)
(2007), 481–493.
[9] Moudafi, A., Mixed equilibrium problems: Sensitivity analysis and algorithmic aspect, Computers and Mathematics with Applications 44 (2002), 1099-1108.
[11] Rockafellar, R.T., Monotone operators and the proximal point algorithm, SIAM Jour-
nal of Control and Optimization 14 (1976), 877-898.
[12] Rockafellar, R.T., Augmented Lagrangians and applications of the proximal point al-
gorithm in convex programming, Mathematics of Operations Research 1 (1976b),
97–116.
[13] Tseng, P., Alternating projection-proximal methods for convex programming and vari-
ational inequalities, SIAM Journal of Optimization 7 (1997), 951–965.
[14] Tseng, P., A modified forward-backward splitting method for maximal monotone
mappings, SIAM Journal of Control and Optimization 38 (2000), 431–446.
[15] Verma, R.U., Sensitivity analysis for generalized strongly monotone variational inclu-
sions based on the (A, η)− resolvent operator technique, Applied Mathematics Letters
19 (2006), 1409–1413.
[16] Verma, R.U., A− monotonicity and its role in nonlinear variational inclusions, Journal
of Optimization Theory and Applications 129(3) (2006), 457-467.
[17] Verma, R.U., General system of A− monotone nonlinear variational inclusion prob-
lems, Journal of Optimization Theory and Applications 131(1) (2006), 151-157.
[18] Verma, R.U., A− monotone nonlinear relaxed cocoercive variational inclusions, Cen-
tral European Journal of Mathematics 5(2) (2007), 1-11.
[19] Verma, R.U., General system of (A, η)−monotone variational inclusion problems
based on generalized hybrid algorithm, Nonlinear Analysis: Hybrid Systems 1 (3)
(2007), 326-335.
[22] Verma, R.U., Auxiliary problem principle and its extension applied to variational in-
equalities, Mathematical Sciences Research Hot-Line 4(2) (2000), 55-63.
[23] Verma, R.U., On a class of nonlinear variational inequalities involving partially re-
laxed monotone and partially strongly monotone mappings, Mathematical Sciences
Research Hot-Line 3 (12) (1999), 7-26.
[24] Verma, R.U., A class of generalized implicit variational inequality type algorithms and
their applications, Mathematical Sciences Research Hot-Line 4 (3) (2000), 17-30.
[25] Verma, R.U., RKKM mappings and their applications, Mathematical Sciences Re-
search Hot-Line 4(10) (2000), 23–27.
[26] Verma, R.U., A class of new minimax inequalities in generalized H-spaces, Mathe-
matical Sciences Research Hot-Line 4(10) (2000), 29-32.
[29] Verma, R.U., Averaging techniques and cocoercively monotone mappings, Mathemat-
ical Sciences Research Journal 10(3) (2006), 79-82.
[30] Verma, R.U., General class of implicit variational inclusions and graph convergence
on A−maximal relaxed monotonicity, Journal of Optimization Theory and Applica-
tions 155(1) (2012), 196-214.
[31] Xu, H.K., Iterative algorithms for nonlinear operators, Journal of London Mathemat-
ical Society 66 (2002), 240-256.
[32] Zeidler, E., Nonlinear Functional Analysis and its Applications I, Springer-Verlag,
New York, New York, 1986.
[33] Zeidler, E., Nonlinear Functional Analysis and its Applications II/A, Springer-Verlag,
New York, New York, 1990.
Chapter 12
12.1. Introduction
This chapter is devoted to the study of nonlinear ill-posed Hammerstein type operator equations. Recall ([13, 14, 15, 16]) that an equation of the form
$$(KF)x = y \qquad (12.1.1)$$
and $F : D(F) \subseteq L^2[0,1] \to L^2[0,1]$ is a nonlinear superposition operator (cf. [24]) defined as
$$Fx(s) = f(s, x(s)). \qquad (12.1.2)$$
The first author and his collaborators ([13, 14, 15, 16]) studied ill-posed Hammerstein type equations extensively under some assumptions on the Fréchet derivative of $F$. Precisely, in [13, 15] it is assumed that $F'(x_0)^{-1}$ exists, and in [16] it is assumed that $F'(x)^{-1}$ exists for all $x$ in a ball of radius $r$ around $x_0$.
Note that if the function $f$ in (12.1.2) is differentiable with respect to the second variable and, for all $x \in B_r(x_0)$ and $t \in [0,1]$, $\partial_2 f(t, x(t)) \ge \kappa_1$, then $F'(u)^{-1}$ exists and is a bounded operator for all $u \in B_r(x_0)$ (see Remark 2.1 in [15]); here $\partial_2 f(t,s)$ denotes the partial derivative of $f$ with respect to the second variable.
Throughout this chapter it is assumed that the available data is $y^\delta$ with
$$\|y-y^\delta\|_Y \le \delta,$$
so one has to solve
$$(KF)x = y^\delta \qquad (12.1.3)$$
instead of (12.1.1). Observe that the solution $x$ of (12.1.3) can be obtained by solving
$$Kz = y^\delta \qquad (12.1.4)$$
and then
$$F(x) = z. \qquad (12.1.5)$$
In [16], for solving (12.1.5), George and Kunhanandan considered the sequence defined iteratively by
$$x^\delta_{n+1,\alpha} = x^\delta_{n,\alpha} - F'(x^\delta_{n,\alpha})^{-1}\big(F(x^\delta_{n,\alpha}) - z^\delta_\alpha\big),$$
where $x^\delta_{0,\alpha} := x_0$,
$$z^\delta_\alpha = (K^*K+\alpha I)^{-1}K^*\big(y^\delta - KF(x_0)\big) + F(x_0), \qquad (12.1.6)$$
and obtained local quadratic convergence.
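The two-stage idea behind (12.1.4)-(12.1.6) can be sketched in finite dimensions. The matrices and data below are purely illustrative (not from the chapter): Tikhonov regularization produces $z^\delta_\alpha$ as in (12.1.6), and a Newton iteration then solves $F(x)=z^\delta_\alpha$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
# Illustrative ill-conditioned linear part K and smooth monotone nonlinearity F
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
K = U @ np.diag(0.9 ** np.arange(n)) @ U.T
F = lambda x: x + 0.1 * np.sin(x)
dF = lambda x: np.diag(1 + 0.1 * np.cos(x))       # Jacobian of F

x_true = np.sin(np.linspace(0, 3, n))
y_delta = K @ F(x_true) + 1e-8 * rng.standard_normal(n)  # noisy data

x0 = np.zeros(n)
alpha = 1e-6
# z_alpha^delta = (K*K + alpha I)^{-1} K*(y^delta - K F(x0)) + F(x0), cf. (12.1.6)
z = np.linalg.solve(K.T @ K + alpha * np.eye(n),
                    K.T @ (y_delta - K @ F(x0))) + F(x0)

x = x0.copy()
for _ in range(10):                                # Newton iteration for F(x) = z
    x = x - np.linalg.solve(dF(x), F(x) - z)

print(bool(np.linalg.norm(x - x_true) < 1e-2))  # True
```

The point of the split, as the chapter emphasizes, is that the linear stage and the nonlinear stage can be treated with independent, well-understood methods.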
Recall that a sequence $(x_n)$ in $X$ with $\lim x_n = x^*$ is said to be convergent of order $p > 1$ if there exist positive reals $c_1$, $c_2$ such that for all $n \in \mathbb{N}$
$$\|x_n - x^*\|_X \le c_1e^{-c_2p^n}.$$
If the sequence $(x_n)$ has the property that $\|x_n - x^*\|_X \le c_1q^n$, $0 < q < 1$, then $(x_n)$ is said to be linearly convergent. For an extensive discussion of convergence rates see Kelley [23].
And in [15], George and Nair studied the modified Lavrentiev regularization
for obtaining an approximate solution of (12.1.4), and introduced the modified Newton iterations
$$x^\delta_{n,\alpha} = x^\delta_{n-1,\alpha} - F'(x_0)^{-1}\big(F(x^\delta_{n-1,\alpha}) - F(x_0) - z^\delta_\alpha\big)$$
for solving (12.1.5), and obtained local linear convergence. In fact, in [15] and [16] a solution $\hat{x}$ of (12.1.1) is called an $x_0$-minimum norm solution if it satisfies
We also assume throughout that the solution x̂ satisfies (12.1.7). In all these papers ([13, 14,
15, 16]), it is assumed that the ill-posedness of (12.1.1) is due to the nonclosedness of the
operator K. In this chapter we consider two cases:
Newton-Type Methods for Ill-Posed Equations 223
Case (1): $F'(x_0)^{-1}$ exists and is a bounded operator, i.e., (12.1.5) is regular.
Case (2): $F$ is monotone ([26], [31]), $Z = X$ is a real Hilbert space and $F'(x_0)^{-1}$ does not exist, i.e., (12.1.5) is also ill-posed.
The case when $F$ is not monotone and $F'(x_0)^{-1}$ does not exist is the subject matter of the forthcoming chapter.
One of the advantages of (approximately) solving (12.1.4) and (12.1.5) to obtain an approximate solution of (12.1.3) is that one can use any regularization method ([8, 22]) for the linear ill-posed equation (12.1.4) and any iterative method ([10, 12]) for solving (12.1.5). In fact, in this chapter we consider Tikhonov regularization ([11, 13, 16, 19, 20]) for approximately solving (12.1.4) and a modified two-step Newton method ([1, 6, 7, 9, 21, 25]) for solving (12.1.5). Note that the regularization parameter $\alpha$ is chosen according to the adaptive method of Pereverzev and Schock [28] for linear ill-posed operator equations, and the same parameter $\alpha$ is used for solving the nonlinear operator equation (12.1.5); hence the choice of the regularization parameter does not depend on the nonlinear operator $F$, which is another advantage over treating (12.1.3) as a single nonlinear operator equation.
This chapter is organized as follows. Preparatory results are given in Section 12.2, and Section 12.3 comprises the proposed iterative method for Case (1) and Case (2). Section 12.4 deals with the algorithm for implementing the proposed method. Numerical examples are given in Section 12.5. Finally, the chapter ends with a conclusion in Section 12.6.
• $\lim_{\lambda\to 0}\varphi(\lambda) = 0$;
• $\sup_{\lambda\ge 0}\dfrac{\alpha\varphi(\lambda)}{\lambda+\alpha} \le \varphi(\alpha)$, $\forall\alpha \in (0, a]$;
and
Theorem 12.2.2. (see (4.3) in [16]) Let $z^\delta_\alpha$ be as in (12.1.6) and let Assumption 12.2.1 hold. Then
$$\|F(\hat{x}) - z^\delta_\alpha\|_Z \le \varphi(\alpha) + \frac{\delta}{\sqrt{\alpha}}. \qquad (12.2.1)$$
$$l := \max\Big\{i : \varphi(\alpha_i) \le \frac{\delta}{\sqrt{\alpha_i}}\Big\}. \qquad (12.2.3)$$
We will be using the following theorem from [16] for our error analysis.
We will be using the following theorem from [16] for our error analysis.
Theorem 12.2.3. (cf. [16], Theorem 4.3) Let $l$ be as in (12.2.3), $k$ be as in (12.2.2) and $z^\delta_{\alpha_k}$ be as in (12.1.6) with $\alpha = \alpha_k$. Then $l \le k$ and
$$\|F(\hat{x}) - z^\delta_{\alpha_k}\|_Z \le \Big(2 + \frac{4\mu}{\mu-1}\Big)\mu\,\psi^{-1}(\delta).$$
Assumption 12.3.2. Let $x_0 \in X$ be fixed. There exists a constant $k_0$ such that for every $u \in B_r(x_0) \subseteq D(F)$ and $v \in X$, there exists an element $\Phi(x_0, u, v) \in X$ satisfying
$$[F'(x_0) - F'(u)]v = F'(x_0)\Phi(x_0, u, v), \qquad \|\Phi(x_0, u, v)\|_X \le k_0\|v\|_X\|x_0 - u\|_X.$$
Note that
$$k_0 \le K_0$$
holds in general and $K_0/k_0$ can be arbitrarily large. The advantages of the new approach are:
(1) Assumption 12.3.2 is weaker than Assumption 12.3.1. Notice that there are classes
of operators that satisfy Assumption 12.3.2 but do not satisfy Assumption 12.3.1;
(2) The computational cost of finding the constant k0 is less than that of constant K0 ,
even when K0 = k0 ;
(4) The computable error bounds on the distances involved (including k0 ) are less costly
and more precise than the old ones (including K0 );
$$M \ge \|F'(x_0)\|_{X\to Z};$$
$$\beta := \|F'(x_0)^{-1}\|_{Z\to X};$$
$$k_0 < \frac{1}{4}\min\Big\{1, \frac{1}{\beta}\Big\};$$
$$\delta_0 < \frac{\sqrt{\alpha_0}}{4k_0\beta};$$
$$\rho := \frac{1}{M}\Big(\frac{1}{4k_0\beta} - \frac{\delta_0}{\sqrt{\alpha_0}}\Big);$$
$$\gamma_\rho := \beta\Big[M\rho + \frac{\delta_0}{\sqrt{\alpha_0}}\Big];$$
and
$$e^\delta_{n,\alpha_k} := \|y^\delta_{n,\alpha_k} - x^\delta_{n,\alpha_k}\|_X, \qquad \forall n \ge 0. \qquad (12.3.3)$$
For convenience, we use the notation $x_n$, $y_n$ and $e_n$ for $x^\delta_{n,\alpha_k}$, $y^\delta_{n,\alpha_k}$ and $e^\delta_{n,\alpha_k}$, respectively. Further we define
$$q := k_0r, \qquad r \in (r_1, r_2), \qquad (12.3.4)$$
where
$$r_1 = \frac{1-\sqrt{1-4k_0\gamma_\rho}}{2k_0}$$
and
$$r_2 = \min\Big\{\frac{1}{k_0},\ \frac{1+\sqrt{1-4k_0\gamma_\rho}}{2k_0}\Big\}.$$
Note that $r$ is well defined because $\gamma_\rho \le \frac{1}{4k_0}$. We will be using the relation $e_0 \le \gamma_\rho$ for proving our results, which can be seen as follows;
(c) en ≤ q^{2n}γρ, ∀n ≥ 0.
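The admissible interval (r1, r2) from (12.3.4) and the resulting contraction factor q = k0r can be computed directly. The sketch below uses illustrative values of k0 and γρ (assumptions, not taken from the text) satisfying γρ ≤ 1/(4k0), and tabulates the decay bound q^{2n}γρ from item (c):

```python
import math

# Illustrative values (assumed): must satisfy gamma_rho <= 1/(4*k0)
# so that the square root in r1, r2 of (12.3.4) is real.
k0, gamma_rho = 0.23, 0.5

disc = math.sqrt(1 - 4 * k0 * gamma_rho)
r1 = (1 - disc) / (2 * k0)
r2 = min(1 / k0, (1 + disc) / (2 * k0))
r = (r1 + r2) / 2          # any r in (r1, r2) is admissible
q = k0 * r                 # q < 1, so e_n <= q**(2*n) * gamma_rho -> 0

bounds = [q ** (2 * n) * gamma_rho for n in range(5)]
```

Any choice of r strictly inside (r1, r2) yields q < 1, which is what drives the quadratic-looking decay e_n ≤ q^{2n}γρ.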
Newton-Type Methods for Ill-Posed Equations 227
The last but one step follows from Assumption 12.3.2 and the last step follows from (a).
This completes the proof of (b) and (c) follows from (b). Now we shall show that xn , yn ∈
Br (x0 ) by induction. For n = 1,
‖y1 − x1‖X ≤ q²e0.    (12.3.6)
i.e., ym+1 ∈ Br (x0 ). Thus by induction xn , yn ∈ Br (x0 ). This completes the proof of the
Theorem.
The main result of this section is the following Theorem.
Theorem 12.3.4. Let xn and yn be as in (12.3.2) and (12.3.1), respectively, and suppose the assumptions of Theorem 12.3.3 hold. Then (xn) is a Cauchy sequence in Br(x0) and converges to xδαk ∈ Br(x0). Further, F(xδαk) = zδαk and
‖xn − xδαk‖X ≤ Cq^{2n},  where C = γρ/(1 − q).
Proof. Using the relation (b) and (c) of Theorem 12.3.3, we obtain
‖x_{n+m} − xn‖X ≤ ∑_{i=0}^{m−1} ‖x_{n+i+1} − x_{n+i}‖X
 ≤ ∑_{i=0}^{m−1} (1 + q)e_{n+i}
 ≤ ∑_{i=0}^{m−1} (1 + q)q^{2(n+i)}e0
 = (1 + q)q^{2n}e0 + (1 + q)q^{2(n+1)}e0 + · · · + (1 + q)q^{2(n+m−1)}e0
 ≤ (1 + q)q^{2n}(1 + q² + q⁴ + · · · + q^{2(m−1)})e0
 ≤ q^{2n}[(1 − q^{2m})/(1 − q)]γρ
 ≤ Cq^{2n}.
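The telescoping estimate above is a plain geometric-series bound and can be checked numerically. The sketch below uses illustrative values of q and e0 (assumptions; the worst case e0 = γρ is taken) and verifies that the partial tails never exceed Cq^{2n} with C = γρ/(1 − q):

```python
# Numerical check of the geometric-tail estimate in the proof of
# Theorem 12.3.4 (illustrative q, e0 assumed, with e0 = gamma_rho):
#   sum_{i=0}^{m-1} (1+q) * q**(2*(n+i)) * e0  <=  C * q**(2*n),
# where C = gamma_rho / (1 - q).
q = 0.5
gamma_rho = e0 = 0.3
C = gamma_rho / (1 - q)

def tail(n, m):
    """Upper bound on ||x_{n+m} - x_n|| from the telescoping sum."""
    return sum((1 + q) * q ** (2 * (n + i)) * e0 for i in range(m))

# Largest violation of the bound over a range of n and m (should be < 0).
worst = max(tail(n, m) - C * q ** (2 * n) for n in range(4) for m in range(1, 6))
```

The margin is exactly Cq^{2(n+m)}, which is why the bound is strict for every finite m.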
Thus xn is a Cauchy sequence in Br (x0 ) and hence it converges, say to xδαk ∈ Br (x0 ). Observe
that
Now by letting n → ∞ in (12.3.7) we obtain F(xδαk ) = zδαk . This completes the proof.
Hereafter we assume that
‖x̂ − x0‖X < ρ ≤ r.
Theorem 12.3.5. Suppose that the hypothesis of Assumption 12.3.2 holds. Then
‖x̂ − xδαk‖X ≤ (β/(1 − k0r))‖F(x̂) − zδαk‖Z.
Proof. Note that k0r < 1 and by Assumption 12.3.2, we have
‖x̂ − xδαk‖X ≤ ‖x̂ − xδαk + F′(x0)⁻¹[F(xδαk) − F(x̂) + F(x̂) − zδαk]‖X
 ≤ ‖F′(x0)⁻¹[F′(x0)(x̂ − xδαk) + F(xδαk) − F(x̂)]‖X + ‖F′(x0)⁻¹(F(x̂) − zδαk)‖X
 ≤ k0‖x0 − x̂ − t(xδαk − x̂)‖X‖x̂ − xδαk‖X + β‖F(x̂) − zδαk‖Z
 ≤ k0r‖x̂ − xδαk‖X + β‖F(x̂) − zδαk‖Z.
This completes the proof. The following Theorem is a consequence of Theorem 12.3.4 and
Theorem 12.3.5.
Theorem 12.3.6. Let xn be as in (12.3.2) and suppose the assumptions of Theorem 12.3.4 and Theorem 12.3.5 hold. Then
‖x̂ − xn‖X ≤ Cq^{2n} + (β/(1 − k0r))‖F(x̂) − zδαk‖Z,
where C is as in Theorem 12.3.4.
Observe that from Section 12.2, l ≤ k and αδ ≤ αl+1 ≤ µαl, so we have
δ/√αk ≤ δ/√αl ≤ µδ/√αδ = µϕ(αδ) = µψ⁻¹(δ).
This leads to the following theorem.
Theorem 12.3.7. Let xn be as in (12.3.2) and suppose the assumptions of Theorem 12.2.3, Theorem 12.3.4 and Theorem 12.3.5 hold. Let
nk := min{n : q^{2n} ≤ δ/√αk}.
Then
‖x̂ − xnk‖X = O(ψ⁻¹(δ)).
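The stopping index nk is the first n at which the geometric term q^{2n} drops below the noise level δ/√αk, and is cheap to compute. A minimal sketch (the values of q, δ and αk below are illustrative assumptions; αk is taken from the α0 = (1.3)²δ² choice used in Section 12.5):

```python
import math

def stopping_index(q, delta, alpha_k):
    """Smallest n with q**(2*n) <= delta / sqrt(alpha_k) (Theorem 12.3.7)."""
    target = delta / math.sqrt(alpha_k)
    n = 0
    while q ** (2 * n) > target:
        n += 1
    return n

# Illustrative values, mirroring Section 12.5: q = 0.23, delta = 0.0667,
# and alpha_k = (1.3*delta)**2 as a stand-in for the selected parameter.
n_k = stopping_index(0.23, 0.0667, (1.3 * 0.0667) ** 2)
```

Since q^{2n} decays geometrically, nk grows only logarithmically in 1/δ.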
Assumption 12.3.8. There exists v ∈ X such that x0 − x̂ = ϕ1(F′(x0))v.
Assumption 12.3.9. For each x ∈ Br̃ (x0 ) there exists a bounded linear operator G(x, x0 )
(see [29]) such that
F 0 (x) = F 0 (x0 )G(x, x0 )
with ‖G(x, x0)‖X→X ≤ k2.
where ṽδ0,αk := x0 is the initial guess and R(x0) := F′(x0) + (αk/c)I, with c ≤ αk. Let
and
x̃δn+1,αk = ṽδ2n,αk (12.3.9)
for n > 0.
First we prove that x̃n,αk converges to the zero xδc,αk of
F(x) + (αk/c)(x − x0) = zδαk.    (12.3.10)
ρ < (1/M)(1 − δ0/√α0)
with δ0 < √α0. Let
γ̃ρ := Mρ + δ0/√α0.
and we define
q1 := k0r̃,  r̃ ∈ (r̃1, r̃2),    (12.3.12)
where
r̃1 = (1 − √(1 − 4k0γ̃ρ))/(2k0)
and
r̃2 = min{1/k0, (1 + √(1 − 4k0γ̃ρ))/(2k0)}.
Theorem 12.3.10. Let ẽn and q1 be as in equation (12.3.11) and (12.3.12) respectively, x̃n
and ỹn be as in (12.3.9) and (12.3.8) respectively with δ ∈ (0, δ0 ] and suppose Assumption
12.3.2 holds. Then we have the following.
Now since ‖R(x0)⁻¹F′(x0)‖X→X ≤ 1, the proof of (a) follows as in Theorem 12.3.3. Again observe that
ẽn ≤ ‖x̃n − ỹn−1 − R(x0)⁻¹(F(x̃n) − zδαk + (αk/c)(x̃n − x0))‖X + ‖R(x0)⁻¹(F(ỹn−1) − zδαk + (αk/c)(ỹn−1 − x0))‖X
 ≤ ‖R(x0)⁻¹[R(x0)(x̃n − ỹn−1) − (F(x̃n) − F(ỹn−1)) − (αk/c)(x̃n − ỹn−1)]‖X
 ≤ ‖R(x0)⁻¹ ∫₀¹ [F′(x0) − F′(ỹn−1 + t(x̃n − ỹn−1))] dt (x̃n − ỹn−1)‖X.
So the remaining part of the proof is analogous to the proof of Theorem 12.3.3.
Theorem 12.3.11. Let ỹn and x̃n be as in (12.3.8) and (12.3.9), respectively, and suppose the assumptions of Theorem 12.3.10 hold. Then (x̃n) is a Cauchy sequence in Br̃(x0) and converges to xδc,αk ∈ Br̃(x0). Further, F(xδc,αk) + (αk/c)(xδc,αk − x0) = zδαk and
‖x̃n − xδc,αk‖X ≤ C̃q1^{2n},
where C̃ = γ̃ρ/(1 − q1).
Proof. Analogous to the proof of Theorem 12.3.4, one can prove that x̃n is a Cauchy se-
quence in Br̃ (x0 ) and hence it converges, say to xδc,αk ∈ Br̃ (x0 ) and
‖F(x̃n) − zδαk + (αk/c)(x̃n − x0)‖X = ‖R(x0)(x̃n − ỹn)‖X
 ≤ ‖R(x0)‖X→X‖x̃n − ỹn‖X
 ≤ (‖F′(x0)‖X→X + αk/c)ẽn
 ≤ (‖F′(x0)‖X→X + αk/c)q1^{2n}ẽ0    (12.3.13)
 ≤ (‖F′(x0)‖X→X + αk/c)q1^{2n}γ̃ρ.
Now by letting n → ∞ in (12.3.13) we obtain F(xδc,αk) + (αk/c)(xδc,αk − x0) = zδαk. This completes the proof.
Assume that k2 < (1 − k0r̃)/(1 − c) and, for the sake of simplicity, that ϕ1(α) ≤ ϕ(α) for α > 0.
Theorem 12.3.12. Suppose xδc,αk is the solution of (12.3.10) and Assumptions 12.3.2, 12.3.8 and 12.3.9 hold. Then
‖x̂ − xδc,αk‖X = O(ψ⁻¹(δ)).
Thus
‖xδc,αk − x̂‖X ≤ ‖αk(F′(x0) + αkI)⁻¹(x0 − x̂)‖X + ‖(F′(x0) + αkI)⁻¹c(F(x̂) − zδαk)‖X + ‖(F′(x0) + αkI)⁻¹[F′(x0)(xδc,αk − x̂) − c(F(xδc,αk) − F(x̂))]‖X
 ≤ ‖αk(F′(x0) + αkI)⁻¹(x0 − x̂)‖X + ‖F(x̂) − zδαk‖X + ‖(F′(x0) + αkI)⁻¹ ∫₀¹ [F′(x0) − cF′(x̂ + t(xδc,αk − x̂))](xδc,αk − x̂) dt‖X
 ≤ ‖αk(F′(x0) + αkI)⁻¹(x0 − x̂)‖X + ‖F(x̂) − zδαk‖X + Γ,    (12.3.14)
where Γ := ‖(F′(x0) + αkI)⁻¹ ∫₀¹ [F′(x0) − cF′(x̂ + t(xδc,αk − x̂))](xδc,αk − x̂) dt‖X. So by Assumption 12.3.9, we obtain
Γ ≤ ‖(F′(x0) + αkI)⁻¹ ∫₀¹ [F′(x0) − F′(x̂ + t(xδc,αk − x̂))](xδc,αk − x̂) dt‖X + (1 − c)‖(F′(x0) + αkI)⁻¹F′(x0) ∫₀¹ G(x̂ + t(xδc,αk − x̂), x0)(xδc,αk − x̂) dt‖X
 ≤ k0r̃‖xδc,αk − x̂‖X + (1 − c)k2‖xδc,αk − x̂‖X.    (12.3.15)
Theorem 12.3.13. Let x̃n be as in (12.3.9) and suppose the assumptions of Theorem 12.3.11 and Theorem 12.3.12 hold. Then
‖x̂ − x̃n‖X ≤ C̃q1^{2n} + O(ψ⁻¹(δ)).
Let nk := min{n : q1^{2n} ≤ δ/√αk}. Then
‖x̂ − x̃nk‖X = O(ψ⁻¹(δ)).
Remark 12.3.15. Let us denote by r̄1 , γ̄ρ , q̄, δ̄0 the parameters using K0 instead of k0 for
Case 1 (Similarly for Case 2). Then we have,
r1 ≤ r̄1 ,
δ̄0 ≤ δ0 ,
γ̄ρ ≤ γρ ,
q ≤ q̄.
Moreover, strict inequality holds in the preceding estimates if k0 < K0. Let h0 = 4k0γρ and h = 4K0γ̄ρ. We can certainly choose γρ sufficiently close to γ̄ρ. Then, we have that h ≤ 1 ⇒ h0 ≤ 1, but not necessarily vice versa unless k0 = K0 and γρ = γ̄ρ. Finally, we have that h0/h → 0 as k0/K0 → 0. The last estimate shows by how many times the new approach using k0 can expand the applicability of the old approach using K0 for these methods. Hence, all the above justify the claims made in the introduction of the chapter. Finally, we note that the results obtained here are useful even if Assumption 12.3.1 is satisfied but the sufficient convergence condition h ≤ 1 fails while h0 ≤ 1 holds. Indeed, we can start with the iterative method described in Case 1 (or Case 2) until a finite step N such that h ≤ 1 is satisfied, with xδN+1,αN as a starting point for faster methods such as (12.1.6). Such an approach has already been employed in [2], [4] and [5], where the modified Newton's method is used as a predictor for Newton's method.
12.4. Algorithm
Note that for i, j ∈ {0, 1, 2, · · · , M}
The algorithm for implementing the iterative methods considered in section 3 involves
the following steps.
• α0 = δ²;
• αi = µ^{2i}α0, µ > 1;
• solve xnk using the iteration (12.3.2) or x̃nk using the iteration (12.3.9).
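The steps above can be sketched end to end on a scalar toy problem. In the sketch below, a frozen-derivative (modified Newton) iteration stands in for (12.3.2) — the exact two-step form of (12.3.2) is given earlier in the chapter — and the operator F(u) = u³, the target z, the starting point and all parameter values are illustrative assumptions:

```python
import math

# Schematic scalar illustration of the algorithm's steps; the frozen-
# derivative iteration below is a stand-in for (12.3.2), and F, z, x0
# and the parameter values are assumptions for the demo only.
def modified_newton(F, dF_x0, x0, z, n_steps):
    x = x0
    for _ in range(n_steps):
        x = x - (F(x) - z) / dF_x0   # derivative frozen at F'(x0)
    return x

delta, mu = 1e-3, 1.2
alpha0 = delta ** 2                                    # step 1
alphas = [mu ** (2 * i) * alpha0 for i in range(30)]   # step 2

F = lambda x: x ** 3          # toy operator, cf. Example 12.5.1
x0, z = 0.6, 0.125            # exact solution of x**3 = 0.125 is 0.5
x = modified_newton(F, 3 * x0 ** 2, x0, z, 25)
```

Freezing the derivative at x0 trades the quadratic convergence of Newton's method for one Jacobian factorization, which is exactly the trade-off the chapter's k0-based analysis quantifies.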
defined by F(u) := u³.
Then the Fréchet derivative of F is given by F′(u)w = 3u²w.
In our computation, we take y(t) = 837t/6160 − t²/16 − t¹¹/110 − 3t⁵/80 − 3t⁸/112 and yδ = y + δ. Then the exact solution is
x̂(t) = 0.5 + t³.
We use
x0(t) = 0.5 + t³ − (3/56)(t − t⁸)
as our initial guess.
We choose α0 = (1.3)²δ², µ = 1.2 and δ = 0.0667; the Lipschitz constant k0 equals approximately 0.23 and r = 1, so that q = k0r = 0.23. The iterations and corresponding error estimates are given in Table 12.5.1. The last column of Table 12.5.1 shows that the error ‖xnk − x̂‖X is of order O(δ^{1/2}).
Example 12.5.2. In this example for Case (2), we consider the operator KF : D(KF) ⊆
L2 (0, 1) −→ L2 (0, 1) where K : L2 (0, 1) −→ L2 (0, 1) defined by
K(x)(t) = ∫₀¹ k(t, s)x(s) ds,
where
k(t, s) = (1 − t)s if 0 ≤ s ≤ t ≤ 1 and k(t, s) = (1 − s)t if 0 ≤ t ≤ s ≤ 1.
Then, for all x(t), y(t) with x(t) > y(t) (see Section 4.3 in [30]),
⟨F(x) − F(y), x − y⟩ = ∫₀¹ (∫₀¹ k(t, s)(x³ − y³)(s) ds)(x − y)(t) dt ≥ 0.
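The monotonicity inequality above can be sanity-checked on a grid. The following sketch (the midpoint discretization, the sample functions and the grid size are all illustrative assumptions) evaluates the double integral numerically for one pair x > y:

```python
# Numerical sanity check (assumed midpoint discretization) of the
# monotonicity inequality for the Green-type kernel k(t, s) above.
m = 50
h = 1.0 / m
t = [(i + 0.5) * h for i in range(m)]       # midpoint grid on (0, 1)

def k(ti, si):
    return (1 - ti) * si if si <= ti else (1 - si) * ti

x = [0.5 + ti ** 3 for ti in t]             # sample functions with
y = [0.3 * ti for ti in t]                  # x(t) > y(t) on (0, 1)

inner = 0.0
for i in range(m):
    Ki = sum(k(t[i], t[j]) * (x[j] ** 3 - y[j] ** 3) * h for j in range(m))
    inner += Ki * (x[i] - y[i]) * h
```

Since the kernel is nonnegative and x³ − y³ shares the sign of x − y, every summand is nonnegative, which is the discrete shadow of the monotonicity of KF.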
In the next two cases, we present examples for nonlinear equations where Assumption
12.3.2 is satisfied but not Assumption 12.3.1.
Example 12.5.3. Let X = Y = R, D = [0, ∞), x0 = 1 and define function F on D by
F(x) = x^{1+1/i}/(1 + 1/i) + c1x + c2,    (12.5.1)
where c1, c2 are real parameters and i > 2 is an integer. Then F′(x) = x^{1/i} + c1 is not Lipschitz on D, so Assumption 12.3.1 is not satisfied. However, the central Lipschitz condition of Assumption 12.3.2 holds with k0 = 1.
Indeed, we have
‖F′(x) − F′(x0)‖X = |x^{1/i} − x0^{1/i}| = |x − x0|/(x0^{(i−1)/i} + · · · + x^{(i−1)/i}),
so
‖F′(x) − F′(x0)‖X ≤ k0|x − x0|.
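The contrast between the two conditions is easy to see numerically. The sketch below (with the illustrative choice i = 3; c1 drops out of every difference of F′) checks that the center-Lipschitz quotient about x0 = 1 stays below 1, while the two-point Lipschitz quotient blows up near 0:

```python
# Numerical check for Example 12.5.3 (i = 3 chosen for illustration):
# F'(x) = x**(1/i) + c1, and c1 cancels in all differences of F'.
# Center quotient about x0 = 1 stays <= 1 (k0 = 1), while the ordinary
# Lipschitz quotient is unbounded near 0 (Assumption 12.3.1 fails).
i, x0 = 3, 1.0
dF = lambda x: x ** (1.0 / i)

xs = [n * 0.01 for n in range(1, 1001) if n != 100]   # grid, skipping x0
center_q = max(abs(dF(x) - dF(x0)) / abs(x - x0) for x in xs)
blowup_q = abs(dF(2e-9) - dF(1e-9)) / 1e-9            # quotient near 0
```

The center quotient is controlled because the denominator x0^{(i−1)/i} + · · · + x^{(i−1)/i} always contains the term x0^{(i−1)/i} = 1.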
Here, f is a given continuous function satisfying f (s) > 0, s ∈ [a, b], λ is a real number, and
the kernel G is continuous and positive in [a, b] × [a, b].
For example, when G(s,t) is the Green kernel, the corresponding integral equation is equivalent to the boundary value problem
u″ = λu^{1+1/n},
u(a) = f(a), u(b) = f(b),
studied in [1]–[5]. Instead of (12.5.2) we can try to solve the equation F(u) = 0, where
F(u)(s) = u(s) − f(s) − λ ∫ₐᵇ G(s,t)u(t)^{1+1/n} dt.
The norm we consider is the max-norm.
The derivative F′ is given by
F′(u)v(s) = v(s) − λ(1 + 1/n) ∫ₐᵇ G(s,t)u(t)^{1/n}v(t) dt,  v ∈ Ω.
First of all, we notice that F′ does not satisfy a Lipschitz-type condition in Ω. Let us consider, for instance, [a, b] = [0, 1], G(s,t) = 1 and y(t) = 0. Then F′(y)v(s) = v(s) and
‖F′(x) − F′(y)‖_{C[a,b]→C[a,b]} = |λ|(1 + 1/n) ∫ₐᵇ x(t)^{1/n} dt.
would hold for all x ∈ Ω and for a constant L2 . But this is not true. Consider, for example,
the functions
xj(t) = t/j,  j ≥ 1,  t ∈ [0, 1].
If these are substituted into (12.5.4), we get
1/(j^{1/n}(1 + 1/n)) ≤ L2/j ⟺ j^{1−1/n} ≤ L2(1 + 1/n), ∀ j ≥ 1,
which is impossible, since j^{1−1/n} → ∞ as j → ∞.
On the other hand,
‖[F′(x) − F′(x0)]v‖C[a,b] = |λ|(1 + 1/n) max_{s∈[a,b]} |∫ₐᵇ G(s,t)(x(t)^{1/n} − f(t)^{1/n})v(t) dt|
 ≤ |λ|(1 + 1/n) max_{s∈[a,b]} ∫ₐᵇ Gn(s,t) dt ‖v‖C[a,b],
where
Gn(s,t) = G(s,t)|x(t) − f(t)|/(x(t)^{(n−1)/n} + x(t)^{(n−2)/n}f(t)^{1/n} + · · · + f(t)^{(n−1)/n}).
Hence,
‖[F′(x) − F′(x0)]v‖C[a,b] ≤ (|λ|(1 + 1/n)/γ^{(n−1)/n}) max_{s∈[a,b]} ∫ₐᵇ G(s,t) dt ‖x − x0‖C[a,b] ≤ k0‖x − x0‖C[a,b],
where k0 = (|λ|(1 + 1/n)/γ^{(n−1)/n})N and N = max_{s∈[a,b]} ∫ₐᵇ G(s,t) dt. Then Assumption 12.3.2 holds for sufficiently small λ.
Example 12.5.5. Define the scalar function F by F(x) = d0x + d1 + d2 sin e^{d3x}, x0 = 0, where di, i = 0, 1, 2, 3, are given parameters. Then, it can easily be seen that for d3 large and d2 sufficiently small, k0/K0 can be arbitrarily small.
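This can be made concrete with a numerical estimate. In the sketch below (the parameter values, the grid and the use of sup|F″| as a stand-in for the full Lipschitz constant K0 are all illustrative assumptions), the center quotient about x0 = 0 stays moderate while the Lipschitz constant explodes with d3:

```python
import math

# Numerical illustration (assumed parameter values) for Example 12.5.5:
# F(x) = d0*x + d1 + d2*sin(exp(d3*x)), so
#   F'(x)  = d0 + d2*d3*exp(d3*x)*cos(exp(d3*x)),
#   F''(x) = d2*d3**2*exp(d3*x)*(cos(exp(d3*x)) - exp(d3*x)*sin(exp(d3*x))).
d0, d1, d2, d3 = 1.0, 0.0, 1e-3, 10.0

def dF(x):
    return d0 + d2 * d3 * math.exp(d3 * x) * math.cos(math.exp(d3 * x))

def d2F(x):
    e = math.exp(d3 * x)
    return d2 * d3 ** 2 * e * (math.cos(e) - e * math.sin(e))

xs = [i / 1000 for i in range(1, 1001)]              # grid on (0, 1]
k0_est = max(abs(dF(x) - dF(0.0)) / x for x in xs)   # center quotient ~ k0
K0_est = max(abs(d2F(x)) for x in xs)                # sup |F''| ~ K0
ratio = k0_est / K0_est                              # tiny for large d3
```

The e^{2d3x} factor in F″ is what makes K0 blow up, while the centered difference only ever sees a single factor e^{d3x} divided by x.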
12.6. Conclusion
We presented an iterative method, a combination of the modified Newton method and Tikhonov regularization, for obtaining an approximate solution of a nonlinear ill-posed Hammerstein-type operator equation KF(x) = y with the available noisy data yδ in place of the exact data y. We considered two cases: in the first it is assumed that F′(x0)⁻¹ exists, and in the second it is assumed that F is monotone but F′(x0)⁻¹ does not exist. In both cases, the error estimates derived using an a priori parameter choice and the balancing principle are of optimal order with respect to the general source condition. The results of the computational experiments give evidence of the reliability of our approach.
References
[2] Argyros, I. K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point. Rev. Anal. Numer. Theor. Approx.
36 (2007), 123-138.
[3] Argyros, I. K., A Semilocal convergence for directional Newton methods, Math. Com-
put.(AMS). 80 (2011), 327-343.
[4] Argyros, I. K., Hilout, S. Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.
[5] Argyros, I. K., Cho, Y. J., Hilout, S., Numerical methods for equations and its appli-
cations, (CRC Press, Taylor and Francis, New York, 2012).
[6] Argyros, I. K., Hilout, S., A convergence analysis for directional two-step Newton
methods, Numer. Algor., 55 (2010), 503-528.
[7] Bakushinskii, A. B., The problem of convergence of the iteratively regularized Gauss-Newton method, Comput. Math. Math. Phys., 32 (1992), 1353-1359.
[8] Bakushinskii, A. B., Kokurin, M. Y., Iterative Methods for Approximate Solution of
Inverse Problems, (Springer, Dordrecht, 2004).
[9] Blaschke, B., Neubauer, A., Scherzer, O., On convergence rates for the iteratively
regularized Gauss-Newton method IMA J. Numer. Anal., 17 (1997), 421-436.
[10] Engl, H. W., Regularization methods for the stable solution of inverse problems, Sur-
veys on Mathematics for Industry, 3 (1993), 71-143.
[11] Engl, H. W., Kunisch, K., Neubauer, A., Convergence rates for Tikhonov regulariza-
tion of nonlinear ill-posed problems, Inverse Problems, 5 (1989), 523-540.
[12] Engl, H. W., Kunisch, K., Neubauer, A., Regularization of Inverse Problems, (Kluwer,
Dordrecht, 1996).
[23] Kelley, C. T., Iterative Methods for Linear and Nonlinear Equations (SIAM, Philadel-
phia 1995).
[24] Krasnoselskii, M. A., Zabreiko, P. P., Pustylnik, E. I., Sobolevskii, P. E., Integral
operators in spaces of summable functions (Translated by T. Ando, Noordhoff Inter-
national publishing, Leyden, 1976).
[25] Langer, S., Hohage, T., Convergence analysis of an inexact iteratively regularized
Gauss-Newton method under general source conditions, J. Inverse Ill-Posed Probl.,
15 (2007), 19-35.
[26] Mahale, P., Nair, M. T., A simplified generalized Gauss-Newton method for nonlinear
ill-posed problems, Math. Comp., 78(265) (2009), 171-184.
[27] Nair, M.T., Ravishankar, P., Regularized versions of continuous Newton's method and continuous modified Newton's method under general source conditions, Numer. Funct. Anal. Optim., 29(9-10) (2008), 1140-1165.
[28] Pereverzev, S., Schock, E., On the adaptive selection of the parameter in regularization
of ill-posed problems, SIAM. J. Numer. Anal., 43(5) (2005), 2060-2076.
[29] Ramm, A. G., Smirnova, A. B., Favini, A., Continuous modified Newton’s-type
method for nonlinear operator equations. Ann. Mat. Pura Appl. 182 (2003), 37-52.
[30] Semenova, E.V., Lavrentiev regularization and balancing principle for solving ill-
posed problems with monotone operators, Comput. Methods Appl. Math., 4 (2010),
444-454.
[31] Tautenhahn, U., On the method of Lavrentiev regularization for nonlinear ill-posed
problems, Inverse Problems, 18 (2002), 191-207.
Chapter 13
13.1. Introduction
Let X , Y be Banach spaces and D be a non-empty, convex and open subset in X . Let
U(x, r) and U(x, r) stand, respectively, for the open and closed ball in X with center x and
radius r > 0. Denote by L (X , Y ) the space of bounded linear operators from X into Y . In
the present chapter we are concerned with the problem of approximating a locally unique
solution x? of equation
F(x) = 0, (13.1.1)
where F is a Fréchet continuously differentiable operator defined on D with values in Y .
A lot of problems from computational sciences and other disciplines can be brought in
the form of equation (13.1.1) using Mathematical Modelling [8, 10, 14]. The solution of
these equations can rarely be found in closed form. That is why most solution methods for
these equations are iterative. In particular, the practice of numerical analysis for finding
such solutions is essentially connected to variants of Newton’s method [8, 10, 14, 23, 26,
28, 33].
A very important aspect in the study of iterative procedures is the convergence domain. In general the convergence domain is small, so it is important to enlarge it without additional hypotheses. This is our goal in this chapter.
In the present chapter we study the secant-like method defined by
x−1, x0 ∈ D,  yn = λxn + (1 − λ)xn−1,  xn+1 = xn − Bn⁻¹F(xn),  Bn = [yn, xn; F],  n = 0, 1, 2, . . .,    (13.1.2)
with λ ∈ [0, 1].
The family of secant-like methods reduces to the secant method if λ = 0 and to Newton’s
method if λ = 1. It was shown in [28] (see also [7, 8, 20, 22] and the references therein) that the R-order of convergence is at least (1 + √5)/2 if λ ∈ [0, 1), the same as that of the secant method. In the real case, the closer xn and yn are, the higher the speed of convergence.
Moreover, in [19] it was shown that as λ approaches 1 the speed of convergence approaches that of Newton's method. An advantage of using the secant-like method instead of Newton's method is that the former avoids the computation of F′(xn)⁻¹ at each step. The study of the convergence of iterative procedures is usually centered on two types: semilocal and local convergence analysis. The semilocal convergence analysis is based on the information around an initial point and gives criteria ensuring the convergence of the iterative procedure, while the local analysis is based on the information around a solution and finds estimates of the radii of convergence balls. There is a plethora of studies on the weakening and/or extension of the hypotheses made on the underlying operators; see for example [1]–[35], or even graphical tools to study this method [25].
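The secant-like family is simple to realize in the scalar case. The following sketch assumes the standard form of the family — yn = λxn + (1 − λ)xn−1 with the one-dimensional divided difference [u, v; F] = (F(u) − F(v))/(u − v), falling back to F′(xn) when λ = 1 makes yn = xn — and the test function is an illustrative choice:

```python
# Scalar sketch of the secant-like family, assuming the form
#   y_n = lam*x_n + (1 - lam)*x_{n-1},
#   x_{n+1} = x_n - F(x_n) / [y_n, x_n; F],
# where [u, v; F] = (F(u) - F(v)) / (u - v) is the divided difference
# (equal to F'(x_n) when y_n == x_n, i.e. lam = 1 gives Newton's method).
def secant_like(F, dF, x_prev, x0, lam, steps):
    xm, x = x_prev, x0
    for _ in range(steps):
        y = lam * x + (1 - lam) * xm
        dd = dF(x) if y == x else (F(y) - F(x)) / (y - x)
        xm, x = x, x - F(x) / dd
    return x

F = lambda x: x ** 3 - 2          # illustrative: root is 2**(1/3)
dF = lambda x: 3 * x ** 2
roots = [secant_like(F, dF, 1.0, 1.5, lam, 12) for lam in (0.0, 0.5, 1.0)]
```

λ = 0 reproduces the secant method, λ = 1 Newton's method, and intermediate λ interpolates between them, as the surrounding discussion describes.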
The hypotheses used for the semilocal convergence of secant-like method are (see [8,
18, 19, 22]):
(C1) There exists a divided difference of order one, denoted by [x, y; F] ∈ L(X, Y), satisfying
[x, y; F](x − y) = F(x) − F(y) for all x, y ∈ D;
‖x0 − x−1‖ ≤ c;
‖A0⁻¹([x, y; F] − [u, v; F])‖ ≤ M(‖x − u‖ + ‖y − v‖) for all x, y, u, v ∈ D;
‖A0⁻¹([x, y; F] − [v, y; F])‖ ≤ L‖x − v‖ for all x, y, v ∈ D;
(C3⋆⋆) There exist x−1, x0 ∈ D and K > 0 such that F′(x0)⁻¹ ∈ L(Y, X) and
‖A0⁻¹F(x0)‖ ≤ η;
‖B0⁻¹F(x0)‖ ≤ η.
We shall refer to (C1)–(C4) as the (C) conditions. From the analysis of the semilocal convergence of the simplified secant method, it was shown in [18] that the convergence criteria are milder than those of the secant-like method given in [21]. Consequently, the decreasing and accessibility regions of (13.1.2) can be improved. Moreover, the semilocal convergence of (13.1.2) is guaranteed.
In the present chapter we show that an even larger convergence domain can be obtained under the same or weaker sufficient convergence criteria for method (13.1.2). In view of (C3) we have that
Secant-Like Methods 247
‖A0⁻¹([x, y; F] − [x−1, x0; F])‖ ≤ M0(‖x − x−1‖ + ‖y − x0‖) for all x, y ∈ D;
(C7 ) There exist x0 ∈ D and M2 > 0 such that F 0 (x0 )−1 ∈ L (Y , X ) and
qn = (1 − λ)(tn − t0) + (1 + λ)(tn+1 − t0),
tn+2 = tn+1 + [K(tn+1 − tn + (1 − λ)(tn − tn−1))/(1 − M1qn)](tn+1 − tn),    (13.2.1)
αn = K(tn+1 − tn + (1 − λ)(tn − tn−1))/(1 − M1qn),    (13.2.2)
function { f n } for each n = 1, 2, · · · by
and polynomial p by
0 < α0 ≤ α ≤ 1 − 2M1η.    (13.2.5)
c + η ≤ t⋆ ≤ t⋆⋆.    (13.2.7)
0 ≤ tn+1 − tn ≤ αⁿη    (13.2.8)
and
t⋆ − tn ≤ αⁿη/(1 − α).    (13.2.9)
Proof. We shall first prove that polynomial p has roots in (0, 1). If λ ≠ 1, p(0) = −(1 − λ)K < 0 and p(1) = 2M1 > 0. If λ = 1, then p(t) = t p̄(t), where p̄(0) = −K < 0 and p̄(1) = 2M1 > 0.
In either case it follows from the intermediate value theorem that there exist roots in (0, 1).
Denote by α the minimal root of p in (0, 1). Note that, in particular for Newton's method (i.e., for λ = 1) and for the secant method (i.e., for λ = 0), we have, respectively, by (13.2.4) that
α = 2K/(K + √(K² + 4M1K))    (13.2.10)
and
α = 2K/(K + √(K² + 8M1K)).    (13.2.11)
It follows from (13.2.1) and (13.2.2) that estimate (13.2.8) is satisfied if
0 ≤ αn ≤ α. (13.2.12)
t2 − t1 ≤ α(t1 − t0) ⟹ t2 ≤ t1 + α(t1 − t0) ⟹ t2 ≤ η + t0 + αη = c + (1 + α)η = c + [(1 − α²)/(1 − α)]η < t⋆⋆.
Suppose that
tk+1 − tk ≤ αᵏη and tk+1 ≤ c + [(1 − αᵏ⁺¹)/(1 − α)]η.    (13.2.13)
Estimate (13.2.12) shall be true for k + 1 replacing n if
0 ≤ αk+1 ≤ α (13.2.14)
or
fk (α) ≤ 0. (13.2.15)
We need a relationship between two consecutive recurrent functions f k for each k = 1, 2, · · ·.
It follows from (13.2.3) and (13.2.4) that
f∞(α) = M1η[(1 − λ) lim_{n→∞}(1 + α + · · · + αⁿ) + (1 + λ) lim_{n→∞}(1 + α + · · · + αⁿ⁺¹)] − 1    (13.2.18)
 = M1η[(1 − λ)/(1 − α) + (1 + λ)/(1 − α)] − 1 = 2M1η/(1 − α) − 1,
since α ∈ (0, 1). In view of (13.2.15), (13.2.16) and (13.2.18) we can show instead of
(13.2.15) that
f∞ (α) ≤ 0, (13.2.19)
which is true by (13.2.5). The induction for (13.2.8) is complete. It follows that sequence
{tn } is non-decreasing, bounded from above by t ?? given by (13.2.6) and as such it con-
verges to t ? which satisfies (13.2.7). Estimate (13.2.9) follows from (13.2.8) by using stan-
dard majorization techniques [8, 10, 23]. The proof of Lemma 13.2.1 is complete.
Lemma 13.2.2. Let c ≥ 0, η > 0, M1 > 0, K > 0 and λ ∈ [0, 1]. Set r−1 = 0, r0 = c and r1 = c + η. Define the scalar sequence {rn} for each n = 1, · · · by
r2 = r1 + β1(r1 − r0),
rn+2 = rn+1 + βn(rn+1 − rn),    (13.2.20)
where
β1 = M1(r1 − r0 + (1 − λ)(r0 − r−1))/(1 − M1q1)
and
βn = K(rn+1 − rn + (1 − λ)(rn − rn−1))/(1 − M1qn) for each n = 2, 3, · · ·,
and the functions {gn} on [0, 1) for each n = 1, 2, · · · by
gn(t) = K(tⁿ + (1 − λ)tⁿ⁻¹)(r2 − r1) + M1[(1 − λ)t(1 − tⁿ⁺¹)/(1 − t) + (1 + λ)(1 − tⁿ⁺²)/(1 − t)](r2 − r1) + (2M1η − 1)t.    (13.2.21)
Suppose that
0 ≤ β1 ≤ α ≤ 1 − 2M1(r2 − r1)/(1 − 2M1η),    (13.2.22)
where α is defined in Lemma 13.2.1. Then, the sequence {rn} is non-decreasing, bounded from above by r⋆⋆ defined by
r⋆⋆ = c + η + (r2 − r1)/(1 − α)    (13.2.23)
and converges to its unique least upper bound r⋆ which satisfies
c + η ≤ r⋆ ≤ r⋆⋆.    (13.2.24)
Moreover, the following estimates are satisfied for each n = 1, · · ·
0 ≤ rn+2 − rn+1 ≤ αⁿ(r2 − r1).    (13.2.25)
Proof. We shall use mathematical induction to show that
0 ≤ βn ≤ α. (13.2.26)
Estimate (13.2.26) is true for n = 1 by (13.2.22). Then, we have by (13.2.20) that
0 ≤ r3 − r2 ≤ α(r2 − r1) ⟹ r3 ≤ r2 + α(r2 − r1) ⟹ r3 ≤ r2 + (1 + α)(r2 − r1) − (r2 − r1) ⟹ r3 ≤ r1 + [(1 − α²)/(1 − α)](r2 − r1) ≤ r⋆⋆.
Suppose (13.2.26) holds for each n ≤ k; then, using (13.2.20), we obtain that
0 ≤ rk+2 − rk+1 ≤ αᵏ(r2 − r1) and rk+2 ≤ r1 + [(1 − αᵏ⁺¹)/(1 − α)](r2 − r1).    (13.2.27)
Estimate (13.2.26) is certainly satisfied, if
gk (α) ≤ 0, (13.2.28)
where gk is defined by (13.2.21). Using (13.2.21), we obtain the following relationship
between two consecutive recurrent functions gk for each k = 1, 2, · · ·
gk+1(α) = gk (α) + p(α) αk−1 (r2 − r1 ) = gk (α), (13.2.29)
since p(α) = 0. Define function g∞ on [0, 1) by
g∞ (t) = lim gk (t). (13.2.30)
k→∞
Remark 13.2.3. Let us see how the sufficient convergence criterion (13.2.5) for the sequence {tn} simplifies in the interesting case of Newton's method, that is, when c = 0 and λ = 1. Then, (13.2.5) can be written for L0 = 2M1 and L = 2K as
h0 = (1/8)(L + 4L0 + √(L² + 8L0L))η ≤ 1/2.    (13.2.32)
The convergence criterion in [18] reduces to the Kantorovich hypothesis, famous for its simplicity and clarity,
h = Lη ≤ 1/2.    (13.2.33)
Note however that L0 ≤ L holds in general and L/L0 can be arbitrarily large [6, 7, 8, 9, 10,
14]. We also have that
h ≤ 1/2 ⟹ h0 ≤ 1/2    (13.2.34)
but not necessarily vice versa unless L0 = L, and
h0/h → 1/4 as L/L0 → ∞.    (13.2.35)
Similarly, it can easily be seen that the sufficient convergence criterion (13.2.22) for the sequence {rn} is given by
h1 = (1/8)(4L0 + √(L0L) + √(8L0² + L0L))η ≤ 1/2.    (13.2.36)
We also have that
h0 ≤ 1/2 ⟹ h1 ≤ 1/2    (13.2.37)
and
h1/h → 0, h1/h0 → 0 as L0/L → 0.    (13.2.38)
Note that sequence {rn } is tighter than {tn } and converges under weaker conditions. In-
deed, a simple inductive argument shows that for each n = 2, 3, · · ·, if M1 < K, then
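The hierarchy of the three criteria can be demonstrated with concrete numbers. The sketch below (the values of L, L0 and η are illustrative assumptions) evaluates (13.2.33), (13.2.32) and (13.2.36) and exhibits a case where the Kantorovich condition fails but the weaker criteria hold:

```python
import math

# Numerical comparison (illustrative L, L0, eta assumed) of the criteria
# h (13.2.33), h0 (13.2.32), h1 (13.2.36); by (13.2.34) and (13.2.37),
# h <= 1/2 is the strongest requirement and h1 <= 1/2 the weakest.
def criteria(L, L0, eta):
    h = L * eta
    h0 = (L + 4 * L0 + math.sqrt(L ** 2 + 8 * L0 * L)) * eta / 8
    h1 = (4 * L0 + math.sqrt(L0 * L)
          + math.sqrt(8 * L0 ** 2 + L0 * L)) * eta / 8
    return h, h0, h1

h, h0, h1 = criteria(L=10.0, L0=1.0, eta=0.06)
# Here the Kantorovich condition fails (h > 1/2), yet h0, h1 < 1/2.
```

The gap widens as L/L0 grows, in line with (13.2.35) and (13.2.38).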
We have the following useful and obvious extensions of Lemma 13.2.1 and Lemma 13.2.2, respectively.
Lemma 13.2.4. Let N = 1, 2, · · · be fixed. Suppose that
t1 ≤ t2 ≤ · · · ≤ tN ≤ tN+1,    (13.2.40)
1/M1 > (1 − λ)(tN − t0) + (1 + λ)(tN+1 − t0)    (13.2.41)
and
0 ≤ αN ≤ α ≤ 1 − 2M1(tN+1 − tN).    (13.2.42)
Then, the sequence {tn} generated by (13.2.1) is nondecreasing, bounded from above by t⋆⋆ and converges to t⋆ which satisfies t⋆ ∈ [tN+1, t⋆⋆]. Moreover, the following estimates are satisfied for each n = 0, 1, · · ·
0 ≤ tN+n+1 − tN+n ≤ αⁿ(tN+1 − tN)    (13.2.43)
and
t⋆ − tN+n ≤ [αⁿ/(1 − α)](tN+1 − tN).    (13.2.44)
Lemma 13.2.5. Let N = 1, 2, · · · be fixed. Suppose that
r1 ≤ r2 ≤ · · · ≤ rN ≤ rN+1,    (13.2.45)
1/M1 > (1 − λ)(rN − r0) + (1 + λ)(rN+1 − r0)    (13.2.46)
and
0 ≤ βN ≤ α ≤ 1 − 2M1(rN+1 − rN)/(1 − 2M1(rN − rN−1)).    (13.2.47)
Then, sequence {rn } generated by (13.2.20) is nondecreasing, bounded from above by r??
and converges to r? which satisfies r? ∈ [rN+1 , r?? ]. Moreover, the following estimates are
satisfied for each n = 0, 1, · · ·
0 ≤ rN+n+1 − rN+n ≤ αⁿ(rN+1 − rN)    (13.2.48)
and
r⋆ − rN+n ≤ [αⁿ/(1 − α)](rN+1 − rN).    (13.2.49)
Next, we present the following semilocal convergence result for the secant-like method under the (C⋆) conditions.
Theorem 13.2.6. Suppose that the (C⋆) conditions, the conditions of Lemma 13.2.1 (or Lemma 13.2.4) and
U(x0, t⋆) ⊆ D    (13.2.50)
hold. Then, the sequence {xn} generated by the secant-like method is well defined, remains in U(x0, t⋆) for each n = −1, 0, 1, · · · and converges to a solution x⋆ ∈ U(x0, t⋆ − c) of equation F(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, · · ·
‖xn+1 − xn‖ ≤ tn+1 − tn    (13.2.51)
and
‖xn − x⋆‖ ≤ t⋆ − tn.    (13.2.52)
Furthermore, if there exists r ≥ t⋆ such that
U(x0, r) ⊆ D    (13.2.53)
and
r + t⋆ < 1/M1 or r + t⋆ < 2/M2,    (13.2.54)
then the solution x⋆ is unique in U(x0, r).
and
U(xk+1, t⋆ − tk+1) ⊆ U(xk, t⋆ − tk)    (13.2.56)
for each k = −1, 0, 1, · · ·. Let w ∈ U(x1, t⋆ − t1). Then, we obtain that
‖w − x0‖ ≤ ‖w − x1‖ + ‖x1 − x0‖ ≤ t⋆ − t1 + t1 − t0 = t⋆ − t0,
which implies w ∈ U(x0, t⋆ − t0); in particular x1 ∈ U(x0, t⋆) ⊆ D. Hence, estimates (13.2.51) and (13.2.52) hold for k = −1
and k = 0. Suppose (13.2.51) and (13.2.52) hold for all n ≤ k. Then, we obtain that
‖xk+1 − x0‖ ≤ ∑_{i=1}^{k+1} ‖xi − xi−1‖ ≤ ∑_{i=1}^{k+1} (ti − ti−1) = tk+1 − t0 ≤ t⋆
and
‖yk − x0‖ ≤ λ‖xk − x0‖ + (1 − λ)‖xk−1 − x0‖ ≤ λt⋆ + (1 − λ)t⋆ = t⋆.
Hence, xk+1 , yk ∈ U(x0 ,t ? ). Let Ek := [xk+1, xk ; F] for each k = 0, 1, · · ·. Using (13.1.2),
Lemma 13.2.1 and the induction hypotheses, we get that
It follows from (13.2.57) and the Banach lemma on invertible operators that Bk+1⁻¹ exists and
‖Bk+1⁻¹F′(x0)‖ ≤ 1/(1 − Θk) ≤ 1/(1 − M1qk+1),    (13.2.58)
Then, using the induction hypotheses, the (C⋆) conditions and (13.2.59), we get in turn that (13.2.60) holds, which completes the induction for (13.2.55). Furthermore, let v ∈ U(xk+2, t⋆ − tk+2). Then, we have that
‖v − xk+1‖ ≤ ‖v − xk+2‖ + ‖xk+2 − xk+1‖ ≤ t⋆ − tk+2 + tk+2 − tk+1 = t⋆ − tk+1,
which implies v ∈ U(xk+1, t⋆ − tk+1). The induction for (13.2.55) and (13.2.56) is complete.
Lemma 13.2.1 implies that {tk} is a complete sequence. It follows from (13.2.55) and
(13.2.56) that {xk } is a complete sequence in a Banach space X and as such it converges
to some x? ∈ U(x0 ,t ? ) (since U(x0 ,t ?) is a closed set). By letting k −→ ∞ in (13.2.60), we
get that F(x? ) = 0. Moreover, estimate (13.2.52) follows from (13.2.51) by using standard
majorization techniques [8, 10, 23]. To show the uniqueness part, let y⋆ ∈ U(x0, r) be such that
F(y⋆) = 0, where r satisfies (13.2.53) and (13.2.54). We have that
It follows by (13.2.61) and the Banach lemma on invertible operators that linear operator
[y? , x? ; F]−1 exists. Then, using the identity 0 = F(y? ) − F(x? ) = [y? , x? ; F] (y? − x? ), we
deduce that x? = y? . The proof of Theorem 13.2.6 is complete.
In order for us to present the semilocal result for the secant-like method under the (C⋆⋆) conditions, we first need a result on a majorizing sequence. The proof is as in Lemma 13.2.1.
Remark 13.2.7. Clearly, (13.2.22) (or (13.2.47)) and {rn} can replace (13.2.5) (or (13.2.42)) and {tn}, respectively, in Theorem 13.2.6.
Lemma 13.2.8. Let c ≥ 0, η > 0, L > 0, M0 > 0 with M0c < 1 and λ ∈ [0, 1]. Set
s−1 = 0, s0 = c, s1 = c + η, K̃ = L/(1 − M0c) and M̃1 = M0/(1 − M0c).
and polynomial p̃ by
0 ≤ sn+1 − sn ≤ α̃ⁿη and s⋆ − sn ≤ α̃ⁿη/(1 − α̃).
Next, we present the semilocal convergence result for the secant-like method under the (C⋆⋆) conditions.
Theorem 13.2.9. Suppose that the (C⋆⋆) conditions, (13.2.62) (or the Lemma 13.2.2 conditions with α̃n, α̃, M̃1 replacing, respectively, αn, α, M1) and U(x0, s⋆) ⊆ D hold. Then, the sequence {xn} generated by the secant-like method is well defined, remains in U(x0, s⋆) for each n = −1, 0, 1, · · · and converges to a solution x⋆ ∈ U(x0, s⋆) of equation F(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, · · ·
Furthermore, if there exists r ≥ s⋆ such that U(x0, r) ⊆ D and r + s⋆ + c < 1/M0, then the solution x⋆ is unique in U(x0, r).
Proof. The proof is analogous to Theorem 13.2.6. Simply notice that in view of (C5 ), we
obtain instead of (13.2.57) that
‖A0⁻¹(Bk+1 − A0)‖ ≤ M0(‖yk+1 − x−1‖ + ‖xk+1 − x0‖)
 ≤ M0((1 − λ)‖xk − x0‖ + λ‖xk+1 − x0‖ + ‖x0 − x−1‖ + ‖xk+1 − x0‖)
 ≤ M0((1 − λ)(sk − s0) + (1 + λ)(sk+1 − s0) + c) < 1,
so that Bk+1⁻¹ exists and
‖Bk+1⁻¹A0‖ ≤ 1/(1 − Ξk),
where Ξk = M0((1 − λ)(sk − s0) + (1 + λ)(sk+1 − s0) + c). Moreover, using (C3⋆) instead of (C3⋆⋆), we get that
‖A0⁻¹F(xk+1)‖ ≤ L(sk+1 − sk + (1 − λ)(sk − sk−1))(sk+1 − sk).
U(x0, t⋆⋆) ⊆ D,    (13.2.63)
u−1 = 0, u0 = c, u1 = c + η,
un+2 = un+1 + [M(un+1 − un + (1 − λ)(un − un−1))/(1 − Mq⋆n)](un+1 − un),    (13.2.64)
where
q⋆n = (1 − λ)(un − u0) + (1 + λ)(un+1 − u0).
Then, if K < M or M1 < M, a simple inductive argument shows that for each n =
2, 3, · · ·
Clearly {tn} converges under the (C) conditions and the conditions of Lemma 13.2.1. Moreover, as we already showed in Remark 13.2.3, the sufficient convergence criteria of Theorem 13.2.6 can be weaker than those of Theorem 13.2.9. Similarly, if L ≤ M, {sn} is a tighter sequence than {un}. In general, we shall test the convergence criteria and use the tightest sequence to estimate the error bounds.
(c) Clearly, the conclusions of Theorem 13.2.9 hold if {sn}, (13.2.62) are replaced by {r̃n}, (13.2.22), where {r̃n} is defined as {rn} with M0 replacing M1 in the definition of β1 (only in the numerator) and the tilde letters replacing the non-tilde letters in (13.2.22).
· For fixed λ ∈ [0, 1], the operator B0 = [y0, x0; F] is invertible and such that ‖B0⁻¹‖ ≤ β,
· ‖B0⁻¹F(x0)‖ ≤ η,
13.3.1. Example 1
We illustrate the above with an application involving a system of nonlinear equations. We shall see that Theorem 13.3.1 cannot guarantee the semilocal convergence of the secant-like methods (13.1.2), but Theorem 13.2.6 can.
It is well known that energy is dissipated in the action of any real dynamical system,
usually through some form of friction. However, in certain situations this dissipation is
so slow that it can be neglected over relatively short periods of time. In such cases we
assume the law of conservation of energy, namely, that the sum of the kinetic energy and
the potential energy is constant. A system of this kind is said to be conservative.
If ϕ and ψ are arbitrary functions with the property that ϕ(0) = 0 and ψ(0) = 0, the general equation
µ d²x(t)/dt² + ψ(dx(t)/dt) + ϕ(x(t)) = 0,    (13.3.2)
can be interpreted as the equation of motion of a mass µ under the action of a restoring force
−ϕ(x) and a damping force −ψ(dx/dt). In general these forces are nonlinear, and equation
(13.3.2) can be regarded as the basic equation of nonlinear mechanics. In this chapter we
shall consider the special case of a nonlinear conservative system described by the equation
µ d²x(t)/dt² + ϕ(x(t)) = 0,
in which the damping force is zero and there is consequently no dissipation of energy.
Extensive discussions of (13.3.2), with applications to a variety of physical problems, can
be found in classical references [4] and [32].
Now, we consider the special case of a nonlinear conservative system described by the equation
d²x(t)/dt² + φ(x(t)) = 0    (13.3.3)
with the boundary conditions
x(0) = x(1) = 0. (13.3.4)
After that, we use a process of discretization to transform problem (13.3.3)–(13.3.4) into a
finite-dimensional problem and look for an approximated solution of it when a particular
function φ is considered. So, we transform problem (13.3.3)–(13.3.4) into a system of non-
linear equations by approximating the second derivative by a standard numerical formula.
Firstly, we introduce the points t_j = jh, j = 0, 1, . . ., m + 1, where h = 1/(m+1) and m is
an appropriate integer. A scheme is then designed for the determination of numbers x_j
that, it is hoped, approximate the values x(t_j) of the true solution at the points t_j. A standard
approximation for the second derivative at these points is
\[ x_j'' \approx \frac{x_{j-1} - 2x_j + x_{j+1}}{h^2}, \qquad j = 1, 2, \ldots, m. \]
A natural way to obtain such a scheme is to demand that the x j satisfy at each interior mesh
point t j the difference equation
x j−1 − 2x j + x j+1 + h2 φ(x j ) = 0. (13.3.5)
Since x0 and xm+1 are determined by the boundary conditions, the unknowns are
x1 , x2 , . . ., xm .
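As a quick sanity check (not part of the original text), the stencil above can be verified numerically on a smooth function with a known second derivative; the test function sin(πt) below is our own choice:

```python
import numpy as np

# Sketch: the three-point stencil (x_{j-1} - 2 x_j + x_{j+1}) / h^2 from the text,
# checked on a smooth function whose second derivative is known exactly.
def second_derivative_stencil(x, h):
    """Approximate x'' at the interior points of a sampled grid."""
    return (x[:-2] - 2.0 * x[1:-1] + x[2:]) / h ** 2

m = 99                        # number of interior points
h = 1.0 / (m + 1)             # mesh width h = 1/(m+1), as in the text
t = np.linspace(0.0, 1.0, m + 2)
x = np.sin(np.pi * t)         # test function with x'' = -pi^2 sin(pi t)
approx = second_derivative_stencil(x, h)
exact = -np.pi ** 2 * np.sin(np.pi * t[1:-1])
err = float(np.max(np.abs(approx - exact)))   # O(h^2) accuracy
```

The error shrinks like h², consistent with the standard truncation analysis of this stencil.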
A further discussion is simplified by the use of matrix and vector notation. Introducing
the vectors
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}, \qquad v_x = \begin{pmatrix} \phi(x_1) \\ \phi(x_2) \\ \vdots \\ \phi(x_m) \end{pmatrix} \]
and the matrix
\[ A = \begin{pmatrix} -2 & 1 & 0 & \cdots & 0 \\ 1 & -2 & 1 & \cdots & 0 \\ 0 & 1 & -2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -2 \end{pmatrix}, \]
Secant-Like Methods 259
the system of equations, arising from demanding that (13.3.5) holds for j = 1, 2, . . ., m, can
be written compactly in the form
\[ F(x) \equiv A x + h^2 v_x = 0, \tag{13.3.6} \]
where ℓ = (ℓ_1, ℓ_2, . . ., ℓ_8)^t ∈ Ω̃ and h = 1/9, so that
\[ \|F'(x) - F'(y)\| \le \frac{7}{4}\, h^2\, \|x - y\|. \tag{13.3.7} \]
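For concreteness, the discretized operator (13.3.6) can be assembled as follows; the particular nonlinearity φ used in the book's example is not recoverable from this excerpt, so φ(u) = u² below is only a hypothetical stand-in (chosen so that φ(0) = 0):

```python
import numpy as np

# Sketch of the discretized operator (13.3.6): F(x) = A x + h^2 * phi(x),
# with phi applied componentwise.  phi(u) = u**2 is a hypothetical choice.
def make_A(m):
    """Tridiagonal matrix with -2 on the diagonal and 1 on the off-diagonals."""
    return (np.diag(-2.0 * np.ones(m))
            + np.diag(np.ones(m - 1), 1)
            + np.diag(np.ones(m - 1), -1))

def F(x, A, h, phi):
    return A @ x + h ** 2 * phi(x)

m = 8                 # as in the text, so that h = 1/9
h = 1.0 / (m + 1)
A = make_A(m)
phi = lambda u: u ** 2          # hypothetical nonlinearity
residual = F(np.zeros(m), A, h, phi)   # F(0) = 0 since phi(0) = 0
```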
Considering (see [28])
\[ [x, y; F] = \int_0^1 F'(\tau x + (1-\tau) y)\, d\tau, \]
Table 13.3.2. Absolute errors obtained by secant-like method (13.1.2) with λ = 1/2 and
{‖F(x_n)‖}

n     ‖x_n − x*‖            ‖F(x_n)‖
−1    1.3893. . .× 10^{−1}    8.6355. . .× 10^{−2}
0     4.5189. . .× 10^{−2}    1.2345. . .× 10^{−2}
1     1.43051. . .× 10^{−4}   2.3416. . .× 10^{−5}
2     1.14121. . .× 10^{−7}   1.9681. . .× 10^{−8}
3     4.30239. . .× 10^{−13}  5.7941. . .× 10^{−14}
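The scalar case gives a compact illustration of how the family (13.1.2) operates: the divided difference [u, v; F] reduces to (F(u) − F(v))/(u − v). The cubic test equation below is our own choice, not the system of the example:

```python
# Sketch of the secant-like family (13.1.2) in the scalar case:
#   y_n = lambda*x_n + (1 - lambda)*x_{n-1},
#   x_{n+1} = x_n - F(x_n) / [y_n, x_n; F].
def secant_like(F, x_prev, x0, lam=0.5, tol=1e-12, max_iter=50):
    xm, xn = x_prev, x0
    for _ in range(max_iter):
        if abs(F(xn)) < tol:
            break
        y = lam * xn + (1.0 - lam) * xm        # y_n = lam*x_n + (1-lam)*x_{n-1}
        dd = (F(y) - F(xn)) / (y - xn)         # divided difference [y_n, x_n; F]
        xm, xn = xn, xn - F(xn) / dd           # secant-like step
    return xn

# Hypothetical scalar test equation F(x) = x^3 - 2, root 2**(1/3).
root = secant_like(lambda x: x ** 3 - 2.0, 1.0, 1.5, lam=0.5)
```

With λ = 1/2 this reproduces the midpoint variant used for Table 13.3.2; λ = 0 gives the classical secant method.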
13.3.2. Example 2
Consider the following nonlinear boundary value problem
\[ \begin{cases} u'' = -u^3 - \dfrac{1}{4} u^2, \\[4pt] u(0) = 0, \quad u(1) = 1. \end{cases} \]
It is well known that this problem can be formulated as the integral equation
\[ u(s) = s + \int_0^1 Q(s,t)\left( u^3(t) + \frac{1}{4} u^2(t) \right) dt \tag{13.3.8} \]
where Q is the Green function:
\[ Q(s,t) = \begin{cases} t\,(1-s), & t \le s \\ s\,(1-t), & s < t. \end{cases} \]
We observe that
\[ \max_{0 \le s \le 1} \int_0^1 |Q(s,t)|\, dt = \frac{1}{8}. \]
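This bound is easy to confirm numerically (Q is nonnegative, so the absolute value is harmless); the integral works out to s(1 − s)/2, maximized at s = 1/2. A short check, with grid sizes chosen arbitrarily:

```python
import numpy as np

# Numerical check that max over s of the integral of Q(s, .) equals 1/8,
# attained at s = 1/2.
def Q(s, t):
    return np.where(t <= s, t * (1.0 - s), s * (1.0 - t))

s_grid = np.linspace(0.0, 1.0, 101)     # includes s = 1/2 exactly
t_grid = np.linspace(0.0, 1.0, 2001)
# averaging over t approximates the integral since the interval has length 1
integrals = Q(s_grid[:, None], t_grid[None, :]).mean(axis=1)
max_int = float(integrals.max())        # close to 1/8 = 0.125
```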
Then problem (13.3.8) is in the form (13.1.1), where F is defined as
\[ [F(x)](s) = x(s) - s - \int_0^1 Q(s,t)\left( x^3(t) + \frac{1}{4} x^2(t) \right) dt. \]
The Fréchet derivative of the operator F is given by
\[ [F'(x)y](s) = y(s) - 3\int_0^1 Q(s,t)\, x^2(t)\, y(t)\, dt - \frac{1}{2}\int_0^1 Q(s,t)\, x(t)\, y(t)\, dt. \]
Choosing x_0(s) = s and R = 1 we have that ‖F(x_0)‖ ≤ (1 + 1/4)/8 = 5/32. Define the divided difference
\[ [x, y; F] = \int_0^1 F'(\tau x + (1-\tau) y)\, d\tau. \]
Taking into account that
\[
\begin{aligned}
\|[x, y; F] - [v, y; F]\| &\le \int_0^1 \|F'(\tau x + (1-\tau)y) - F'(\tau v + (1-\tau)y)\|\, d\tau \\
&\le \frac{1}{8} \int_0^1 \left( 3\tau^2 \|x^2 - v^2\| + 6\tau(1-\tau)\|y\|\,\|x - v\| + \frac{\tau}{2}\|x - v\| \right) d\tau \\
&\le \frac{1}{8} \left( \|x^2 - v^2\| + \|y\|\,\|x - v\| + \frac{1}{4}\|x - v\| \right) \\
&\le \frac{1}{8} \left( \|x + v\| + \|y\| + \frac{1}{4} \right) \|x - v\| \\
&\le \frac{25}{32} \|x - v\|.
\end{aligned}
\]
Choosing x_{−1}(s) = 9s/10, we find that
\[
\begin{aligned}
\|1 - A_0\| &\le \int_0^1 \|1 - F'(\tau x_0 + (1-\tau) x_{-1})\|\, d\tau \\
&\le \frac{1}{8} \int_0^1 \left( 3\left(\tau + \frac{9}{10}(1-\tau)\right)^2 + \frac{1}{2}\left(\tau + \frac{9}{10}(1-\tau)\right) \right) d\tau \\
&\le 0.409375\ldots
\end{aligned}
\]
and so
\[ L \ge \frac{25}{32} \|A_0^{-1}\| = 1.32275\ldots \]
In an analogous way, choosing λ = 0.8 we obtain
\[ M_0 = 0.899471\ldots, \qquad \|B_0^{-1}\| = 1.75262\ldots \]
and
\[ \eta = 0.273847\ldots \]
Notice that we cannot guarantee the convergence of the secant method by Theorem
13.3.1, since the first condition of (3.1) is not satisfied:
\[ a = \frac{\eta}{c + \eta} = 0.732511\ldots > \frac{3 - \sqrt{5}}{2} = 0.381966\ldots \]
On the other hand,
\[ \tilde{K} = 1.45349\ldots, \qquad \alpha_0 = 0.434072\ldots, \qquad \alpha = 0.907324\ldots \]
and
\[ 1 - 2\tilde{M}_1 \eta = 0.945868\ldots \]
Condition (2.62), 0 < α₀ ≤ α ≤ 1 − 2M̃₁η, is satisfied, and as a consequence we can
ensure the convergence of the secant method by Theorem 13.2.9.
Conclusion
We presented a new semilocal convergence analysis of the secant-like method for approximating
a locally unique solution of an equation in a Banach space. Using a combination
of Lipschitz and center-Lipschitz conditions, instead of only the Lipschitz conditions used
in [18], we provided a finer analysis with a larger convergence domain and weaker sufficient
convergence conditions than in [15, 18, 19, 22, 27, 28]. Numerical examples validate our
theoretical results.
References
[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004) 397-405.
[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, International Journal of Computer Mathematics, 81 (8) (2004), 1153-1161.
[3] Amat, S., Bermúdez, C., Busquier, S., Gretay, J. O., Convergence by nondiscrete
mathematical induction of a two step secant’s method, Rocky Mountain Journal of
Mathematics, 37 (2) (2007), 359-369.
[4] Andronow, A.A., Chaikin, C.E., Theory of Oscillations, Princeton University Press,
New Jersey, 1949.
[5] Argyros, I.K., Polynomial Operator Equations in Abstract Spaces and Applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, 1998, Boca Raton, Florida, U.S.A.
[6] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374-397.
[7] Argyros, I.K., New sufficient convergence conditions for the secant method, Czechoslovak
Math. J., 55 (2005), 175-187.
[9] Argyros, I.K., A semilocal convergence analysis for directional Newton methods,
Math. Comput., 80 (2011), 327-343.
[10] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and its Applications,
CRC Press/Taylor and Francis, Boca Raton, Florida, USA, 2012.
[11] Argyros, I.K., Hilout, S., Convergence conditions for secant-type methods, Czechoslovak
Math. J., 60 (2010), 253-272.
[12] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.
[13] Argyros, I.K., Hilout, S., Estimating upper bounds on the limit points of majorizing
sequences for Newton’s method, Numer. Algorithms, 62 (1) (2013), 115-132.
[14] Argyros, I.K., Hilout, S., Numerical methods in nonlinear analysis, World Scientific
Publ. Comp., New Jersey, 2013
[15] Argyros, I.K., Ezquerro, J.A., Hernández, M.Á. Hilout, S., Romero, N., Velasco, A.I.,
Expanding the applicability of secant-like methods for solving nonlinear equations,
Carp. J. Math. 31 (1) (2015), 11-30.
[16] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Non-
linear Functional Analysis and Applications (L.B. Rall, ed.), Academic Press, New
York, (1971), 425-472.
[17] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. R. Soc. Mat. Esp.,
13 (2010), 53-76.
[18] Ezquerro, J.A., Hernández, M.A., Romero, N., Velasco, A.I., Improving the domain of
starting points for secant-like methods, App. Math. Comp., 219 (8) (2012), 3677-3692.
[19] Ezquerro, J.A., Rubio, M.J., A uniparametric family of iterative processes for solving
nondifferentiable equations, J. Math. Anal. Appl., 275 (2002), 821-834.
[20] A. Fraile, E. Larrodé, Á. A. Magreñán, J. A. Sicilia. Decision model for siting trans-
port and logistic facilities in urban environments: A methodological approach. J.
Comp. Appl. Math., 291 (2016), 478-487.
[21] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant-like methods for solving nonlinear
integral equations of the Hammerstein type. Proceedings of the 8th International
Congress on Computational and Applied Mathematics, ICCAM-98 (Leuven), J. Com-
put. Appl. Math., 115 (2000), 245-254.
[22] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by secant-like methods, App. Math. Comp., 169 (2005), 926-942.
[23] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[24] Laasonen, P., Ein überquadratisch konvergenter iterativer algorithmus, Ann. Acad.
Sci. Fenn. Ser I, 450 (1969), 1–10.
[25] Magreñán, Á. A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215-224.
[26] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.
[27] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71-84.
[28] Potra, F.A., Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, New
York, 1984.
[29] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38-62.
[30] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton–Kantorovich type theorems, J. Complexity, 26 (2010), 3-42.
[31] Schmidt, J.W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Hungar.,
9 (1978), 241-247.
[33] Traub, J.F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood
Cliffs, New Jersey, 1964.
[34] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545-557.
[35] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153-174.
Chapter 14
14.1. Introduction
In this chapter, we introduce genetic algorithms as a general tool for solving optimization
problems. As a special case, we use these algorithms to find the solution of the system of
nonlinear equations
\[
\begin{aligned}
f_1(x_1, x_2, \ldots, x_n) &= 0, \\
f_2(x_1, x_2, \ldots, x_n) &= 0, \\
&\;\;\vdots \\
f_n(x_1, x_2, \ldots, x_n) &= 0,
\end{aligned}
\tag{14.1.1}
\]
where f = (f_1, f_2, . . ., f_n) : D = [a_1, b_1] × [a_2, b_2] × · · · × [a_n, b_n] ⊆ R^n → R^n is continuous.
Genetic algorithms (GA) were first introduced in the 1970s by John Holland at the University
of Michigan [8]. Since then, a great number of developments of GA have appeared;
see [5, 6] and the references therein. In their early period of development, GA were used
as an adaptive machine learning approach, and they have since been successfully applied
in numerous areas such as artificial intelligence, self-adaptive control, systems engineering,
image processing, combinatorial optimization and financial systems. GA thus have very
broad application prospects. Genetic algorithms are search algorithms based on the mechanics
of natural genetics.
In a genetic algorithm, a population of candidate solutions (called individuals or phe-
notypes) to an optimization problem is evolved towards better solutions. Each candidate
solution has a set of properties (its chromosomes or genotype) which can be mutated and
altered; traditionally, solutions are represented in binary as strings of 0s and 1s, but other
encodings are also possible [5]. The evolution usually starts from a population of randomly
generated individuals and proceeds in generations. In each generation, the fitness of every
individual in the population is evaluated, the more fit individuals are stochastically selected
from the current population, and each individual’s genome is modified (recombined and
possibly randomly mutated) to form a new population. The new population is then used
in the next iteration of the algorithm. Commonly, the algorithm terminates when either a
maximum number of generations has been produced, or a satisfactory fitness level has been
reached for the population.
In the use of genetic algorithms to solve practical problems, premature convergence
often appears, which limits the search performance of the genetic algorithm. The reason
for premature convergence is that individuals become highly similar after some generations,
so the opportunity to generate new individuals by further genetic manipulation is greatly
reduced. Many ideas have been proposed to avoid premature convergence [6, 15, 16, 18].
In the problem of finding the minimal distance between surfaces, [18] introduces two special
individuals at each generation of the genetic algorithm so that the population maintains
diversity, and the computational efficiency is thereby improved. In [15], we introduced two
other special individuals at each generation of the genetic algorithm for the same problem,
and the computational efficiency was further improved. Furthermore, in [16] we suggested
putting some symmetric and harmonious individuals into each generation of a genetic algorithm
applied to a general optimization problem, and good computational efficiency was
obtained. An application of our methods to reservoir mid-to-long-term hydraulic power
operation is given in [17].
Recently, many authors have used GA to solve systems of nonlinear equations; see
[9, 3, 11, 10, 14]. These works give the impression that GA are effective methods for
solving systems of nonlinear equations. However, efforts are still needed to solve such
systems more effectively. In this chapter, we present a new genetic algorithm to solve Eq.
(14.1.1).
The chapter is organized as follows: we convert the equation problem (14.1.1) to an
optimization problem in Section 14.2; in Section 14.3 we present our new genetic algorithm
with symmetric and harmonious individuals for the corresponding optimization problem;
in Section 14.4 we give a mixed method combining our method with Newton's method;
in Section 14.5 we provide some numerical examples to show that our new methods are
very effective. Some remarks and conclusions are given in the concluding Section 14.6.
\[ F(x_1, x_2, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} | f_i(x_1, x_2, \ldots, x_n) |. \tag{14.2.1} \]
\[ \begin{aligned} \min\; & F(x_1, x_2, \ldots, x_n) \\ \text{s.t.}\; & (x_1, x_2, \ldots, x_n) \in D. \end{aligned} \tag{14.2.2} \]
Suppose x? ∈ D is a solution of Eq. (14.1.1), then we have F(x? ) = 0 from the definition
(14.2.1) of F. Since F(x) ≥ 0 holds for all x ∈ D, we deduce that x? is the solution of
Efficient Genetic Algorithm 271
problem (14.2.2). On the other hand, assume x? ∈ D is a solution of problem (14.2.2). Then
for all x ∈ D, we have F(x) ≥ F(x? ). Now we suppose (14.1.1) has at least one solution
denoted as y? . Then, we have 0 ≤ F(x? ) ≤ F(y? ) = 0, that is F(x? ) = 0 is true and x? is
a solution of Eq. (14.1.1). Hence, Eq. (14.1.1) is equivalent to problem (14.2.2) if Eq.
(14.1.1) has at least one solution.
From now on, we always suppose (14.1.1) has at least one solution, and try to find its
solution by finding a solution of (14.2.2) via a genetic algorithm.
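A minimal sketch of this conversion (the 2×2 system below is our own illustration, not one of the chapter's examples):

```python
# Sketch of (14.2.1): the system f(x) = 0 becomes the problem of minimizing
# the mean of the residual magnitudes over the box D.
def make_objective(fs):
    """Build F(x) = (1/n) * sum_i |f_i(x)| from a list of component functions."""
    n = len(fs)
    return lambda x: sum(abs(f(x)) for f in fs) / n

# Hypothetical 2x2 system: unit circle intersected with the diagonal.
fs = [lambda x: x[0] ** 2 + x[1] ** 2 - 1.0,
      lambda x: x[0] - x[1]]
F = make_objective(fs)
at_solution = F([2 ** -0.5, 2 ** -0.5])   # a root of the system, so F vanishes there
```

Since F ≥ 0 everywhere and F = 0 exactly at roots of the system, minimizers of F coincide with solutions of (14.1.1), as argued above.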
14.3.1. Coding
We use the binary code in our method, as used in the simple genetic algorithm [6]. The
binary code is the most used code in GA; it represents a candidate solution by a string of
0s and 1s. The length of the string is related to the degree of accuracy needed for the
solution, and satisfies the inequality
\[ L_i \ge \log_2 \frac{b_i - a_i}{\varepsilon_i}, \tag{14.3.1} \]
where L_i is the length of the string standing for the i-th component of an individual, and
ε_i is the degree of accuracy needed for x_i. In fact, we choose for L_i the minimal positive
integer satisfying (14.3.1). Usually, all the ε_i are equal to one another, and in that case they
are denoted by ε_x. For example, let a_1 = 0, b_1 = 1 and ε_x = 10^{−6}; then we can choose
L_1 = 20, since L_1 ≥ log₂ 10^6 ≈ 19.93156857. Although the variable x_1 ∈ [a_1, b_1] takes
real values, it can be represented by 20 digits of the binary code: 00000000000000000000
stands for 0, 00000000000000000001 stands for 1/2^{L_1}, 00000000000000000010 stands for
2/2^{L_1}, . . . , 11111111111111111111 stands for 1.
14.3.3. Selection
We use Roulette Wheel Selection (also called the proportional selection operator) in our
method, as used in the simple genetic algorithm. An individual is selected with probability
proportional to its fitness value. Suppose the size of the population is N and the fitness of
individual i is g_i. Individual i is selected for the next generation with probability p_i given by
\[ p_i = \frac{g_i}{\sum_{i=1}^{N} g_i} \qquad (i = 1, 2, \cdots, N). \tag{14.3.3} \]
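A direct implementation of (14.3.3), with an empirical check that selection frequencies track the fitness proportions (the fitness values below are arbitrary):

```python
import random

# Roulette Wheel Selection: individual i is chosen with probability
# p_i = g_i / sum_j g_j, proportional to its fitness (14.3.3).
def roulette_select(fitness, rng=random):
    total = sum(fitness)
    r = rng.uniform(0.0, total)
    acc = 0.0
    for i, g in enumerate(fitness):
        acc += g
        if r <= acc:
            return i
    return len(fitness) - 1   # guard against floating-point round-off

random.seed(0)
fitness = [1.0, 3.0, 6.0]    # selection probabilities [0.1, 0.3, 0.6]
counts = [0, 0, 0]
for _ in range(10000):
    counts[roulette_select(fitness)] += 1
```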
14.3.7. Procedure
For the convenience of discussion, we call our genetic algorithm with symmetric and harmonious
individuals SHEGA, and call the simple genetic algorithm with the elitist model
EGA. The procedure of SHEGA is as follows:
Step 1. Assignment of the parameters of the genetic algorithm: the size N of the population,
the number n of variables of (14.2.2), the lengths L_1, L_2, . . ., L_n (computed from (14.3.1)) of
the binary strings of the components of an individual, the symmetry and harmony factor λ,
the controlled precision ε_1 (used in Subsection 14.3.4 to introduce the symmetric and harmonious
individuals), the probability p_c of the crossover operator, the probability p_m of the mutation
operator, and the largest genetic generation G.
Step 2. Generate the initial population randomly.
Step 3. Calculate the fitness value of each individual of the contemporary population,
and carry the optimal individual of the contemporary population over to the next generation.
Step 4. If the distance between the best fitness value of the last generation and that of
the current generation is less than the preset precision ε_1, we generate N − 2⌊λN⌋ − 1
individuals using Roulette Wheel Selection and ⌊λN⌋ pairs of symmetric and harmonious
individuals randomly. Otherwise we generate N − 1 individuals using Roulette Wheel Selection
directly. The population is then divided into two parts: one is the seed subpopulation
constituted by the symmetric and harmonious individuals, and the other is a subpopulation ready
to be bred, constituted by the residual individuals.
Step 5. Apply the crossover operator between each individual in the seed subpopulation
and one individual selected randomly from the other subpopulation. In the subpopulation
ready to be bred, apply the crossover operator to the residual individuals pairwise with
probability p_c, and apply the mutation operator to each residual individual with probability p_m.
Step 6. Repeat Steps 3-5 until the maximum genetic generation G is reached.
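The loop structure of Steps 1-6 can be sketched as follows. This is only a schematic: it is real-coded rather than binary-coded, uses a simple tournament in place of Roulette Wheel Selection, and, since the construction of the symmetric and harmonious individuals belongs to the omitted Subsection 14.3.4, the "pair" injected below (a reflection of the best individual about the midpoint of D, plus a random point) is a placeholder of our own, not the book's definition:

```python
import random

# Schematic SHEGA loop (Steps 1-6).  Real-coded for brevity; the injected
# "symmetric/harmonious" pair is a PLACEHOLDER, not the construction of
# Subsection 14.3.4.
def shega(objective, bounds, N=40, lam=0.15, eps1=1e-9, pm=0.2, G=200, rng=random):
    pop = [[rng.uniform(a, b) for a, b in bounds] for _ in range(N)]
    prev_best = min(objective(p) for p in pop)
    for _ in range(G):                      # Step 6: loop over generations
        pop.sort(key=objective)
        best = pop[0]                       # Step 3: elitist model keeps the best
        stalled = abs(objective(best) - prev_best) < eps1
        prev_best = objective(best)
        nxt = [best[:]]
        if stalled:                         # Step 4: inject diversity on stagnation
            for _ in range(int(lam * N)):
                # placeholder pair: reflection of best + a random individual
                nxt.append([a + b - x for (a, b), x in zip(bounds, best)])
                nxt.append([rng.uniform(a, b) for a, b in bounds])
        while len(nxt) < N:                 # Steps 4-5: selection, then mutation
            parent = min(rng.sample(pop, 2), key=objective)
            nxt.append([min(max(x + (rng.random() - 0.5) * (b - a) * pm, a), b)
                        for x, (a, b) in zip(parent, bounds)])
        pop = nxt
    return min(pop, key=objective)

random.seed(1)
sol = shega(lambda x: abs(x[0] ** 2 - 2.0), [(0.0, 2.0)])   # minimizer near sqrt(2)
```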
that Newton's method converges to the solution quadratically provided that some necessary
conditions are satisfied. However, two deficiencies limit the application of Newton's
method. First, the function f must be differentiable, which may not be satisfied in
practical applications. Second, a good initial point is key to ensure the convergence of
the iterative sequence, but it is a difficult task to choose the initial point in advance. In
fact, choosing good initial points to begin the corresponding iteration is a common issue
for all the classical iterative methods used to solve equation (14.1.1) [2, 13].
Here, we use Newton's method as an example. In fact, one can develop other methods
by mixing SHEGA with other iterative methods. We state the SHEGA-Newton method as
follows:
Step 1. Given the maximal iterative step S and the precision accuracy ε_y, set s = 1.
Step 2. Find an initial guess x^{(0)} ∈ D by using SHEGA given in Section 14.3.
Step 3. Compute f_i(x_1^{(s)}, x_2^{(s)}, . . ., x_n^{(s)}) (i = 1, 2, . . ., n). If
F(x_1^{(s)}, x_2^{(s)}, . . ., x_n^{(s)}) ≤ ε_y, report that the approximate solution
x^{(s)} = (x_1^{(s)}, x_2^{(s)}, . . ., x_n^{(s)}) is found and exit from the loop, where F is
defined in (14.2.1).
Step 4. Compute the Jacobian J_s = f′(x^{(s)}) and solve the linear equations
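The Newton refinement stage can be sketched as follows, with a forward-difference Jacobian standing in for an analytic one; the 2×2 test system is a hypothetical example, and x0 plays the role of the guess returned by SHEGA:

```python
import numpy as np

# Sketch of the Newton refinement (Steps 3-4): starting from a rough guess x0,
# iterate x_{s+1} = x_s - J(x_s)^{-1} f(x_s) with a finite-difference Jacobian.
def newton_refine(f, x0, eps_y=1e-12, S=50, h=1e-7):
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(S):
        fx = f(x)
        if np.mean(np.abs(fx)) <= eps_y:   # F(x) <= eps_y, with F as in (14.2.1)
            break
        J = np.empty((n, n))
        for j in range(n):                 # forward-difference Jacobian, column j
            e = np.zeros(n); e[j] = h
            J[:, j] = (f(x + e) - fx) / h
        x = x + np.linalg.solve(J, -fx)    # solve J * dx = -f(x)  (Step 4)
    return x

# Hypothetical system: unit circle intersected with the diagonal.
f = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 1.0, x[0] - x[1]])
x = newton_refine(f, [0.8, 0.6])           # rough guess standing in for SHEGA's output
```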
Table 14.5.1. The comparison results of convergence number of times for Example 1
Table 14.5.2. The comparison results of the average of the best function value F for
Example 1
G = 300
EGA 0.129795093953972
SHEGA(λ = 0.05) 0.053101422547384
SHEGA(λ = 0.10) 0.041427669194903
SHEGA(λ = 0.15) 0.034877317449825
SHEGA(λ = 0.20) 0.035701604675096
SHEGA(λ = 0.25) 0.038051665705034
SHEGA(λ = 0.30) 0.039332883168632
SHEGA(λ = 0.35) 0.035780879206619
SHEGA(λ = 0.40) 0.034509424138501
SHEGA(λ = 0.45) 0.037425326021257
the fixed maximal generation G = 500 in Table 14.5.4. Here, we say the corresponding
genetic algorithm is convergent if the function value F(x_1, x_2, . . ., x_n) is less than a fixed
precision ε_y. We set ε_y = 0.02 for this example. Tables 14.5.3 and 14.5.4 show that
SHEGA with a proper symmetry and harmony factor λ performs better than EGA.
Table 14.5.3. The comparison results of convergence number of times for Example 2
Table 14.5.4. The comparison results of the average of the best function value F for
Example 2
G = 500
EGA 0.1668487275323187
SHEGA(λ = 0.05) 0.0752619855962109
SHEGA(λ = 0.10) 0.0500864062405815
SHEGA(λ = 0.15) 0.0358268275921585
SHEGA(λ = 0.20) 0.0257859494269335
SHEGA(λ = 0.25) 0.0239622084932336
SHEGA(λ = 0.30) 0.0247106452514721
SHEGA(λ = 0.35) 0.0171980128114993
SHEGA(λ = 0.40) 0.0179659124369376
SHEGA(λ = 0.45) 0.0158999282064303
Table 14.5.5. The comparison results of convergence number of times for Example 3
14.6. Conclusion
We presented a genetic algorithm as a general tool for solving optimization problems. Note
that in the special case of approximating solutions of systems of nonlinear equations there
are many deficiencies that limit the application of the usually employed methods. For example,
in the case of Newton's method the function f must be differentiable and a good initial
point must be found. To avoid these problems we have introduced some pairs of symmetric
and harmonious individuals in each generation of a genetic algorithm. The population diversity
is preserved this way and the method guarantees convergence to a solution of the system.
Numerical examples illustrate the efficiency of the new algorithm.
Table 14.5.6. The comparison results of the average of the best function value F for
Example 3
G = 300
EGA 0.0081958187235953
SHEGA(λ = 0.05) 0.0021219384775699
SHEGA(λ = 0.10) 0.0019286950245614
SHEGA(λ = 0.15) 0.0018367719544782
SHEGA(λ = 0.20) 0.0022816080103967
SHEGA(λ = 0.25) 0.0023297925904943
SHEGA(λ = 0.30) 0.0023318357433983
SHEGA(λ = 0.35) 0.0021392510790106
SHEGA(λ = 0.40) 0.0022381534380744
SHEGA(λ = 0.45) 0.0025798012930550
References
[1] Amat, S., Busquier, S., M. Negra, Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004), 397-405.
[2] Argyros, I.K., Computational Theory of Iterative Methods, Series: Studies in Com-
putational Mathematics 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co.,
New York, U.S.A., 2007.
[3] El-Emary, I.M., El-Kareem, M.A., Towards using genetic algorithm for solving non-
linear equation systems, World Applied Sciences Journal 5 (2008), 282-289.
[5] Fogel, D.B., Evolutionary Computation: Toward a New Philosophy of Machine Intel-
ligence, IEEE Press, New York, 2000.
[6] Goldberg, D.E., Genetic Algorithms in Search, Optimization and Machine Learning,
Addison-Wesley, Second Edition, 1989.
[7] Gutiérrez, J.M., A new semilocal convergence theorem for Newton’s method, J. Com-
put. Appl. Math. 79 (1997), 131-145.
[8] Holland, J.H., Adaptation in Natural and Artificial System, Michigan Univ Press, Ann
Arbor, 1975.
[9] Kuri-Morales, A. F., No, R. H., Solution of simultaneous non-linear equations using
genetic algorithms, WSEAS Transactions on SYSTEMS, 2 (2003), 44-51.
[10] Mastorakis, N.E., Solving non-linear equations via genetic algorithms, Proceedings of
the 6th WSEAS Int. Conf. on Evolutionary Computing, Lisbon, Portugal, June 16-18,
2005, 24-28.
[11] Nasira, G.N., Devi, D., Solving nonlinear equations through jacobian sparsity pattern
using genetic algorithm, International Journal of Communications and Engineering,
5 (2012), 78-82.
[12] Neta, B., A new iterative method for the solution of systems of nonlinear equations,
in Approximation Theory and Applications (Z. Ziegler, ed.), Academic Press, 1981, 249-263.
[13] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.
[14] Mhetre, P.H., Genetic algorithm for linear and nonlinear equation, International Jour-
nal of Advanced Engineering Technology, 3 (2012), 114-118.
[15] Ren, H., Bi, W., Wu, Q., Calculation of minimum distance between free-form surfaces
by a type of new improved genetic algorithm, Computer Engineering and Application,
23 (2004), 62-64 (in Chinese).
[16] Ren, H., Wu, Q., Bi, W., A genetic algorithm with symmetric and harmonious individuals,
Computer Engineering and Application, 5 (2005), 24-26, 87 (in Chinese).
[17] Wan, X., Zhou, J., Application of genetic algorithms for self-adaption, symmetry and
congruity in reservoir mid-to-long-term hydraulic power operation, Advances in Water Science,
18 (2007), 598-603 (in Chinese).
[18] Xi, G., Cai, Y., Finding the minimized distance between surfaces, Journal of CAD and
Graphics, 14 (2002), 209–213 (in Chinese).
Chapter 15
15.1. Introduction
In this chapter we are concerned with the problem of approximately solving the nonlinear
ill-posed operator equation
F(x) = f , (15.1.1)
where F : D(F) ⊆ X → Y is a nonlinear operator between the Hilbert spaces X and Y.
Here and below, ⟨·, ·⟩ denotes the inner product and ‖·‖ the corresponding norm. We
assume throughout that f^δ ∈ Y are the available data with
\[ \| f - f^\delta \| \le \delta \]
and that (15.1.1) has a solution x̂ (which need not be unique). Then the problem of recovering
x̂ from the noisy equation F(x) = f^δ is ill-posed, in the sense that a small perturbation in the
data can cause a large deviation in the solution.
Further it is assumed that F possesses a locally uniformly bounded Fréchet derivative
F 0 (.) in the domain D(F) of F. A large number of problems in mathematical physics and
engineering are solved by finding the solutions of equations in a form like (15.1.1). If one
works with such problems, the measurement data will be distorted by some measurement er-
ror. Therefore, one has to consider appropriate regularization techniques for approximately
solving (15.1.1).
Iterative regularization methods are used for approximately solving (15.1.1). Recall
([20]) that an iterative method with iterations defined, for a known function Φ, by
where x^δ_0 := x_0 ∈ D(F) is a known initial approximation of x̂, together with a stopping
rule which determines a stopping index k_δ ∈ N, is called an iterative regularization method
if ‖x^δ_{k_δ} − x̂‖ → 0 as δ → 0.
The Levenberg-Marquardt method ([18], [21], [9], [10], [11], [14], [24], [6]) and the iteratively
regularized Gauss-Newton method (IRGNA) ([3], [5]) are well-known iterative
regularization methods. In the Levenberg-Marquardt method, the iterations are defined by
where A^*_{k,δ} := F′(x^δ_k)^* is, as usual, the adjoint of A_{k,δ} := F′(x^δ_k) and (α_k) is a positive
sequence of regularization parameters ([5]). In the Gauss-Newton method, the iterations are defined by
with ν ≥ 1, w 6= 0 and F 0 (.) is Lipschitz continuous; N(F 0 (x̂)) denotes the nullspace of
F 0 (x̂). For noise free case Bakushinskii ([3]) obtained the rate
with τ > 1 chosen sufficiently large. Subsequently, many authors extended, modified, and
generalized Bakushinskii's work to obtain error bounds under various contexts (see [4], [12],
[13], [15], [16], [17], [7]).
Modified Newton-Tikhonov Regularization Method 283
In [20], Mahale and Nair considered a method in which the iterations are defined by
for some constant µ_1 > 1, where each g_α, for α > 0, is a positive real-valued piecewise
continuous function defined on [0, M] with M ≥ ‖A_0‖². They choose the stopping index k_δ
for this iteration as the positive integer which satisfies
In fact, Mahale and Nair obtained an order optimal error estimate, in the sense that an improved
order estimate, applicable to the case of linear ill-posed problems as well, is not
possible, under the following new source condition on x_0 − x̂.
In [7], the author considered a particular case of this method, namely, the regularized
modified Newton's method defined iteratively by
\[ x^\delta_{k+1} = x^\delta_k - (A_0^* A_0 + \alpha I)^{-1} \left[ A_0^* (F(x^\delta_k) - y^\delta) + \alpha (x^\delta_k - x_0) \right], \qquad x^\delta_0 := x_0 \tag{15.1.9} \]
for approximately solving (15.1.1). Using a suitably constructed majorizing sequence (see
[1], page 28), it is proved that the sequence (x^δ_k) converges linearly to a solution x^δ_α of the
equation
\[ A_0^* F(x^\delta_\alpha) + \alpha (x^\delta_\alpha - x_0) = A_0^* y^\delta \tag{15.1.10} \]
and that xδα is an approximation of x̂. The error estimate in this chapter was obtained under
the following source condition on x0 − x̂
1. lim_{λ→0} ϕ(λ) = 0,
2. for α ≤ 1, ϕ(α) ≥ α,
3. \( \sup_{\lambda \ge 0} \frac{\alpha \varphi(\lambda)}{\lambda + \alpha} \le c_\varphi\, \varphi(\alpha), \quad \forall \lambda \in (0, a_1]. \)
Later in [8], using a two step Newton method (see [2]), the author proved that the sequence
(x^δ_k) in (15.1.9) converges linearly to the solution x^δ_α of (15.1.10). The error estimate
in [8] was based on the following source condition
\[ x_0 - \hat{x} = \varphi(A_0^* A_0) w, \]
where ϕ is as in Assumption 15.1.1 with a_1 ≥ ‖A_0‖². In the present chapter we improve the
semilocal convergence by modifying the method (15.1.9):
\[ x^\delta_{n+1,\alpha} = x^\delta_{n,\alpha} - (A_0^* A_n + \alpha I)^{-1} \left[ A_0^* (F(x^\delta_{n,\alpha}) - y^\delta) + \alpha (x^\delta_{n,\alpha} - x_0) \right], \qquad x^\delta_{0,\alpha} := x_0. \tag{15.1.12} \]
Recall that a sequence (x_n) converges to x^* linearly if
\[ \|x_{n+1} - x^*\| \le M_0 \|x_n - x^*\|. \]
A quadratically convergent sequence will always eventually converge faster than a linearly
convergent one.
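To make the structure of (15.1.9) concrete: the derivative is frozen at x0 (A0 = F′(x0)) and the Tikhonov term α(x − x0) stabilizes each step, giving a linearly convergent fixed-point iteration toward the solution x^δ_α of (15.1.10). The two-dimensional operator F below is a toy example of our own, not an ill-posed problem from the literature:

```python
import numpy as np

# Sketch of the regularized modified Newton iteration (15.1.9) on a
# hypothetical 2-D operator F.
def F(x):
    return np.array([x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * x[0] ** 2])

def Fprime(x):
    return np.array([[1.0, 0.2 * x[1]], [0.2 * x[0], 1.0]])

x0 = np.array([0.5, 0.5])
y_delta = np.array([1.0, 1.0])       # (noisy) data
alpha = 1e-2                         # regularization parameter
A0 = Fprime(x0)                      # derivative frozen at x0
M = A0.T @ A0 + alpha * np.eye(2)

x = x0.copy()
for _ in range(100):                 # linearly convergent fixed-point iteration
    x = x - np.linalg.solve(M, A0.T @ (F(x) - y_delta) + alpha * (x - x0))

# at the limit, x satisfies (15.1.10): A0^T F(x) + alpha*(x - x0) = A0^T y_delta
residual = A0.T @ (F(x) - y_delta) + alpha * (x - x0)
```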
We choose the regularization parameter α from some finite set
Assumption 15.2.1. There exist constants k_0 > 0, r > 0 such that for every x, u ∈ B(x_0, r) ∪
B(x̂, r) ⊂ D(F) and v ∈ X, there exists an element Φ(x, u, v) ∈ X such that
\[ [F'(x) - F'(x_0)] v = F'(x_0)\, \Phi_0(x, x_0, v), \qquad \|\Phi_0(x, x_0, v)\| \le l_0 \|v\| \|x - x_0\|. \]
Note that
\[ l_0 \le k_0 \]
holds in general and k_0 / l_0 can be arbitrarily large [1], [2]. Let δ_0 < \(\sqrt{\alpha_0}\),
\[ \rho < \frac{\sqrt{1 + 2 l_0 \left(1 - \frac{\delta_0}{\sqrt{\alpha_0}}\right)} - 1}{l_0}, \]
and
\[ \gamma_\rho := \frac{l_0}{2} \rho^2 + \rho + \frac{\delta_0}{\sqrt{\alpha_0}}. \]
For \( r \le \frac{2 - 3k_0}{(2 + 3l_0) k_0} \), \( k_0 \le \frac{2}{3} \), let g : (0, 1) → (0, 1) be the function defined by
\[ g(t) := \frac{3(1 + l_0 r) k_0}{2(1 - l_0 r)}\, t \qquad \forall t \in (0, 1). \]
Lemma 15.2.2. Let l_0 r < 1 and u ∈ B_r(x_0). Then (A_0^* A_u + αI) is invertible:
(i)
\[ (A_0^* A_u + \alpha I)^{-1} = \left[ I + (A_0^* A_0 + \alpha I)^{-1} A_0^* (A_u - A_0) \right]^{-1} (A_0^* A_0 + \alpha I)^{-1} \]
and
(ii)
\[ \|(A_0^* A_u + \alpha I)^{-1} A_0^* A_u\| \le \frac{1 + l_0 r}{1 - l_0 r}, \]
where A_u := F′(u).
\[ \|(A_0^* A_0 + \alpha I)^{-1} A_0^* (A_u - A_0)\| = \sup_{\|v\| \le 1} \|(A_0^* A_0 + \alpha I)^{-1} A_0^* (A_u - A_0) v\| \]
So I + (A_0^* A_0 + αI)^{-1} A_0^* (A_u − A_0) is invertible. Now (i) follows from the following
relation for all n ≥ 0:
\[ \|x^\delta_{n,\alpha} - x^\delta_\alpha\| \le r\, e^{-\gamma 2^n} \tag{15.2.1} \]
where γ = −ln(g(γ_ρ)).
where ζ_1 = (A_0^* A_n + αI)^{-1} A_0^* [A_n (x^δ_{n,α} − x^δ_{n−1,α}) − (F(x^δ_{n,α}) − F(x^δ_{n−1,α}))] and
ζ_2 = (A_0^* A_n + αI)^{-1} A_0^* (A_n − A_{n−1})(x^δ_{n,α} − x^δ_{n−1,α}). So, by the Fundamental Theorem of Integration,
\[ \zeta_1 = (A_0^* A_n + \alpha I)^{-1} A_0^* \left[ \int_0^1 \big( A_n - F'(x^\delta_{n-1,\alpha} + t(x^\delta_{n,\alpha} - x^\delta_{n-1,\alpha})) \big)\, dt \right] (x^\delta_{n,\alpha} - x^\delta_{n-1,\alpha}) \]
and hence by Assumption 15.2.1.
Similarly,
\[ \|x^\delta_{n+1,\alpha} - x^\delta_{n,\alpha}\| \le \frac{3(1 + l_0 r) k_0}{2(1 - l_0 r)} \|x^\delta_{n,\alpha} - x^\delta_{n-1,\alpha}\|^2 \le g(e_n)\, e_n, \tag{15.2.6} \]
where
\[ e_n := \|x^\delta_{n,\alpha} - x^\delta_{n-1,\alpha}\|, \qquad n = 1, 2, \cdots. \]
Now using induction we shall prove that x^δ_{n,α} ∈ B_r(x_0). Note that
\[
\begin{aligned}
e_1 &= \|x^\delta_{1,\alpha} - x_0\| \\
&= \|(A_0^* A_0 + \alpha I)^{-1} A_0^* (F(x_0) - y^\delta)\| \\
&= \|(A_0^* A_0 + \alpha I)^{-1} A_0^* (F(x_0) - F(\hat{x}) - F'(x_0)(x_0 - \hat{x}) + F'(x_0)(x_0 - \hat{x}) + F(\hat{x}) - y^\delta)\| \\
&\le \Big\| (A_0^* A_0 + \alpha I)^{-1} A_0^* \Big( \int_0^1 [F'(\hat{x} + t(x_0 - \hat{x})) - F'(x_0)](x_0 - \hat{x})\, dt \\
&\qquad\qquad + F'(x_0)(x_0 - \hat{x}) + F(\hat{x}) - y^\delta \Big) \Big\| \\
&\le \Big\| (A_0^* A_0 + \alpha I)^{-1} A_0^* A_0 \int_0^1 \Phi(x_0, \hat{x} + t(x_0 - \hat{x}), x_0 - \hat{x})\, dt \Big\| \\
&\qquad + \|(A_0^* A_0 + \alpha I)^{-1} A_0^* F'(x_0)(x_0 - \hat{x})\| + \|(A_0^* A_0 + \alpha I)^{-1} A_0^* (F(\hat{x}) - y^\delta)\| \\
&\le \frac{l_0}{2} \rho^2 + \rho + \frac{\delta}{\sqrt{\alpha}} \le \gamma_\rho \le r,
\end{aligned}
\tag{15.2.7}
\]
i.e., x^δ_{1,α} ∈ B_r(x_0).
Now since γ_ρ < 1, by (15.2.7), e_1 < 1. Therefore by (15.2.6) and the fact that g(µt) ≤
µg(t) for all t ∈ (0, 1), we have that e_n < 1 for all n ≥ 1 and
\[ e_{n+1} \le g(e_1)^{2^n - 1}\, e_1. \]
Thus (x^δ_{n,α}) is a Cauchy sequence in B_r(x_0) and hence converges, say to x^δ_α ∈ B_r(x_0).
Further, by letting n → ∞ in (15.1.12) we obtain
and
\[ r_2 := \min\left\{ \frac{2 + (2l_0 - 3k_0)\gamma_\rho + \sqrt{(4l_0^2 + 9k_0^2 - 36k_0 l_0)\gamma_\rho^2 - (12k_0 + 8l_0)\gamma_\rho + 4}}{2l_0 (2 + 3k_0 \gamma_\rho)},\; \frac{2 - 3k_0}{(2 + 3l_0) k_0} \right\}, \]
with
\[ \gamma_\rho \le c_{l_0 k_0} := \min\left\{ 1,\; \frac{\sqrt{(8l_0 - 12k_0)^2 + 16(36k_0 l_0 - 9k_0^2 - 4l_0^2)} - (8l_0 + 12k_0)}{2(36k_0 l_0 - 9k_0^2 - 4l_0^2)} \right\}; \]
then \( \frac{\gamma_\rho}{1 - g(\gamma_\rho)} \le r \) and l_0 r < 1.
•
\[ \sup_{\lambda \ge 0} \frac{\alpha \varphi(\lambda)}{\lambda + \alpha} \le \varphi(\alpha), \qquad \forall \lambda \in (0, a]. \]
• there exists v ∈ X such that
\[ x_0 - \hat{x} = \varphi(A_0^* A_0)\, v. \]
\[ \|x^\delta_\alpha - \hat{x}\| \le \frac{\max\{1, \|v\|\}}{1 - q} \left( \frac{\delta}{\sqrt{\alpha}} + \varphi(\alpha) \right) \]
where q = l_0 r.
Proof. Let \( M = \int_0^1 F'(\hat{x} + t(x^\delta_\alpha - \hat{x}))\, dt \). Then
and hence by (15.1.10), we have (A∗0 M + αI)(xδα − x̂) = A∗0 (yδ − y) + α(x0 − x̂). Thus
x^δ_α − x̂ = (A₀*A₀ + αI)⁻¹[A₀*(y^δ − y) + α(x₀ − x̂) + A₀*(A₀ − M)(x^δ_α − x̂)] = s₁ + s₂ + s₃,   (15.3.1)
where s₁ := (A₀*A₀ + αI)⁻¹A₀*(y^δ − y), s₂ := (A₀*A₀ + αI)⁻¹α(x₀ − x̂) and s₃ := (A₀*A₀ + αI)⁻¹A₀*(A₀ − M)(x^δ_α − x̂). Note that
‖s₁‖ ≤ δ/√α,   (15.3.2)
by Assumption 15.3.1,
‖s₂‖ ≤ ϕ(α)‖v‖   (15.3.3)
and by Assumption 15.2.1,
‖s₃‖ ≤ l₀r‖x^δ_α − x̂‖.   (15.3.4)
The result now follows from (15.3.1), (15.3.2), (15.3.3) and (15.3.4).
Theorem 15.3.3. Let the assumptions in Theorem 15.2.3 and Theorem 15.3.2 hold and let x^δ_{n,α} be as in (15.1.12). Then
‖x^δ_{n,α} − x̂‖ ≤ r e^{−γ2ⁿ} + (max{1, ‖v‖}/(1 − q)) (δ/√α + ϕ(α)).
Further, if n_δ := min{n : e^{−γ2ⁿ} < δ/√α}, then
‖x^δ_{n_δ,α} − x̂‖ ≤ C̃ (δ/√α + ϕ(α)),
where C̃ := r + max{1, ‖v‖}/(1 − q).
290 Ioannis K. Argyros and Á. Alberto Magreñán
Let
n_i = min{n : e^{−γ2ⁿ} ≤ δ/√α_i}
and let x^δ_{α_i} := x^δ_{n_i,α_i}, where x^δ_{n_i,α_i} is as in (15.1.12) with α = α_i and n = n_i. Then from Theorem 15.3.3 we have
‖x^δ_{α_i} − x̂‖ ≤ C̃ (δ/√α_i + ϕ(α_i)),  ∀i = 1, 2, …, N.
Precisely, we choose the regularization parameter α = α_k from the set D_N defined by
D_N := {α_i = µ^i α₀, i = 1, 2, …, N},
where µ > 1.
To draw a conclusion from this parameter choice we consider all possible functions ϕ satisfying Assumption 15.2.1 and ϕ(α_i) ≤ δ/√α_i. Any such function is called admissible for x̂, and it can be used as a measure of the convergence x^δ_α → x̂ (see [19]).
The main result of this section is the following theorem, whose proof is analogous to that of Theorem 4.4 in [7].
Theorem 15.3.5. Assume that there exists i ∈ {0, 1, …, N} such that ϕ(α_i) ≤ δ/√α_i. Let the assumptions of Theorem 15.3.3 be satisfied and let
l := max{i : ϕ(α_i) ≤ δ/√α_i} < N,
k = max{i : ∀j = 1, 2, …, i; ‖x^δ_{α_i} − x^δ_{α_j}‖ ≤ 4C̃ δ/√α_j},
where C̃ is as in Theorem 15.3.3. Then l ≤ k and
• Choose r ∈ (r1 , r2 ).
15.4.1. Algorithm
1. Set i = 0.
2. Choose n_i = min{n : e^{−γ2ⁿ} ≤ δ/√α_i}.
3. Solve x^δ_{n_i,α_i} = x^δ_{α_i} by using the iteration (15.1.12) with n = n_i and α = α_i.
4. If ‖x^δ_{α_i} − x^δ_{α_j}‖ > 4C̃ δ/√α_j for some j < i, then take k = i − 1 and return x^δ_{α_k}.
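In code, the steps above amount to computing the approximations for increasing α_i and stopping at the first violation of the 4C̃δ/√α_j comparison test. A minimal sketch, where `x_for(alpha)` is a hypothetical user-supplied routine implementing step 3 via (15.1.12):

```python
import numpy as np

def choose_alpha(x_for, alphas, delta, C_tilde):
    """Adaptive choice of Algorithm 15.4.1: alphas = [mu**i * alpha_0, ...], mu > 1.
    Returns (alpha_k, x_alpha_k) with k = i - 1 for the first i at which the
    comparison test of step 4 fails, or the last pair if it never fails."""
    xs = []
    for i, a in enumerate(alphas):
        xs.append(np.asarray(x_for(a), dtype=float))
        for j in range(i):
            if np.linalg.norm(xs[i] - xs[j]) > 4 * C_tilde * delta / np.sqrt(alphas[j]):
                return alphas[i - 1], xs[i - 1]
    return alphas[-1], xs[-1]
```

This is only a sketch of the selection logic; the expensive part in practice is `x_for`, i.e. running (15.1.12) up to the index n_i for each α_i.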
[2] Argyros, I.K., Hilout, S., A convergence analysis for directional two-step Newton
methods, Numer. Algor., 55 (2010), 503-528.
[3] Bakushinskii, A.B., The problem of the convergence of the iteratively regularized
Gauss-Newton method, Comput. Math. Phy., 32 (1992), 1353-1359.
[4] Bakushinskii, A.B., Iterative methods without saturation for solving degenerate non-
linear operator equations, Dokl. Akad. Nauk, 344 (1995), 7-8, MR1361018.
[5] Blaschke, B., Neubauer, A., Scherzer, O., On convergence rates for the iteratively
regularized Gauss-Newton method, IMA Journal of Numerical Analysis, 17 (1997),
421-436.
[6] Bockmann, C., Kammanee, A., Braun, A., Logarithmic convergence of Levenberg-
Marquardt method with application to an inverse potential problem, J. Inv. Ill-Posed
Problems, 19 (2011), 345-367.
[7] George, S., On convergence of regularized modified Newton’s method for nonlinear
ill-posed problems, J. Inv. Ill-Posed Problems, 18 (2010), 133-146.
[8] George, S., “Newton type iteration for Tikhonov regularization of nonlinear ill-posed
problems,” J. Math., 2013 (2013), Article ID 439316, 9 pages.
[10] Hanke, M., The regularizing Levenberg-Marquardt scheme is of optimal order, J. In-
tegral Equations Appl. 22 (2010), 259-283.
[12] Hohage, T., Logarithmic convergence rate of the iteratively regularized Gauss-Newton
method for an inverse potential and an inverse scattering problem, Inverse Problems,
13 (1997), 1279-1299.
[14] Jin, Q., On a regularized Levenberg-Marquardt method for solving nonlinear inverse
problems, Numer. Math. 16 (2010), 229-259.
[15] Kaltenbacher, B., A posteriori parameter choice strategies for some Newton-type
methods for the regularization of nonlinear ill-posed problems, Numerische Mathe-
matik, 79 (1998), 501-528.
[16] Kaltenbacher, B., A note on logarithmic convergence rates for nonlinear Tikhonov
regularization, J. Inv. Ill-Posed Problems, 16 (2008), 79-88.
[17] Langer, S., Hohage T., Convergence analysis of an inexact iteratively regularized
Gauss-Newton method under general source conditions, J. of Inverse & Ill-Posed
Problems, 15 (2007), 19-35.
[18] Levenberg, K., A Method for the solution of certain problems in least squares, Quart.
Appl. Math. 2 (1944), 164-168.
[19] Lu, S., Pereverzev, S.V., Sparsity reconstruction by the standard Tikhonov method,
RICAM-Report No. 2008-17.
[20] Mahale, P., Nair, M.T., A Simplified generalized Gauss-Newton method for nonlinear
ill-posed problems, Mathematics of Computation, 78(265) (2009), 171-184.
[22] Ortega, J.M., Rheinboldt, W.C., Iterative solution of nonlinear equations in several
variables, Academic Press, New York and London (1970).
[23] Pereverzev, S.V., Schock, E., “On the adaptive selection of the parameter in regulariza-
tion of ill-posed problems”, SIAM J. Numer. Anal. 43 (2005), 2060-2076.
[24] Pornsawad, P., Bockmann, C., Convergence rate analysis of the first stage Runge-
Kutta-type regularizations, Inverse Problems, 26 (2010), 035005.
Chapter 16

Proximal Gauss-Newton Method
16.1. Introduction
Let X and Y be Hilbert spaces, let D ⊆ X be an open set and let F : D −→ Y be continuously Fréchet-differentiable. Moreover, let J : D → R ∪ {∞} be proper, convex and lower semicontinuous. In this study we are concerned with the problem of approximating a locally unique solution x* of the penalized nonlinear least squares problem
min_{x∈D} ‖F(x)‖² + J(x).   (16.1.1)
A solution x* ∈ D of (16.1.1) is also called a least squares solution of the equation F(x) = 0.
Many problems from computational sciences and other disciplines can be brought into a form similar to equation (16.1.1) using mathematical modelling [3, 6, 14, 16]. For example, in data fitting we have X = R^i and Y = R^j, where i is the number of parameters and j is the number of observations.
The solution of (16.1.1) can rarely be found in closed form. That is why the solution
methods for these equations are usually iterative. In particular, the practice of numeri-
cal analysis for finding such solutions is essentially connected to Newton-type methods
[1, 2, 3, 5, 4, 6, 7, 14, 17]. The study of the convergence of iterative procedures usually centers on two types of analysis: semilocal and local. The semilocal convergence analysis uses information around an initial point to give criteria ensuring the convergence of iterative procedures, while the local analysis uses information around a solution to find estimates of the radii of convergence balls. A plethora of sufficient conditions for the local as well as the semilocal convergence of Newton-type methods, together with an error analysis for such methods, can be found in [1]–[20].
If J = 0, we obtain the well-known Gauss-Newton method defined by
x_{n+1} = xₙ − F′(xₙ)⁺ F(xₙ), for each n = 0, 1, 2, …,   (16.1.2)
where x₀ ∈ D is an initial point [12] and F′(xₙ)⁺ is the Moore-Penrose inverse of the linear operator F′(xₙ). In the present paper we use the proximal Gauss-Newton method (to be made precise in Section 16.2, see (16.2.6)) for solving the penalized nonlinear least squares problem (16.1.1). Notice that if J = 0, x* is a solution of (16.1.1), F(x*) = 0 and F′(x*) is invertible,
then the theories of Gauss-Newton methods merge into those of Newton method. A sur-
vey of convergence results under various Lipschitz-type conditions for Gauss-Newton-type
methods can be found in [2, 6] (see also [5, 9, 10, 12, 15, 18]). The convergence of these
methods requires among other hypotheses that F 0 satisfies a Lipschitz condition or F 00 is
bounded in D . Several authors have relaxed these hypotheses [4, 8, 9, 10, 15]. In particular,
Ferreira et al. [1, 9, 10] have used the majorant condition in the local as well as semilocal
convergence of Newton-type method. Argyros and Hilout [3, 4, 5, 6, 7] have also used
the majorant condition to provide a tighter convergence analysis and weaker convergence
criteria for Newton-type method. The local convergence of inexact Gauss-Newton method
was examined by Ferreira et al. [9] using the majorant condition. It was shown that this condition is better than Wang's condition [15], [20] in some sense. A certain relationship between the majorant function and the operator F was established that unifies two previously unrelated results pertaining to inexact Gauss-Newton methods: the result for analytical functions and the one for operators with Lipschitz derivative.
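For a concrete illustration of the classical iteration (16.1.2) (the case J = 0), the sketch below fits a one-parameter exponential model by Gauss-Newton; the residual, data and starting point are our own assumptions, and the Moore-Penrose step is computed by a least-squares solve rather than by forming an inverse:

```python
import numpy as np

def gauss_newton(F, J, x0, tol=1e-12, maxit=50):
    # x_{n+1} = x_n - F'(x_n)^+ F(x_n); the pseudoinverse step is the
    # minimum-norm least-squares solution of F'(x_n) s = F(x_n).
    x = np.asarray(x0, dtype=float)
    for _ in range(maxit):
        s = np.linalg.lstsq(J(x), F(x), rcond=None)[0]
        x = x - s
        if np.linalg.norm(s) < tol:
            break
    return x

t = np.linspace(0.0, 1.0, 20)
y = np.exp(0.7 * t)                                  # synthetic zero-residual data
F = lambda b: np.exp(b[0] * t) - y                   # residual to drive to zero
J = lambda b: (t * np.exp(b[0] * t)).reshape(-1, 1)  # its Jacobian (20 x 1)
b = gauss_newton(F, J, [0.0])
```

In this zero-residual setting the iteration behaves like Newton's method and recovers the parameter 0.7 quickly.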
In [7] motivated by the elegant work in [10] and optimization considerations we pre-
sented a new local convergence analysis for inexact Gauss-Newton-like methods by using
a majorant and center majorant function (which is a special case of the majorant function)
instead of just a majorant function with the following advantages: larger radius of con-
vergence; tighter error estimates on the distances k xn − x? k for each n = 0, 1, · · · and a
clearer relationship between the majorant function and the associated least squares prob-
lems (16.1.1). Moreover, these advantages are obtained under the same computational cost,
since as we will see in Section 16.3. and Section 16.4., the computation of the majorant
function requires the computation of the center-majorant function. Furthermore, these ad-
vantages are very important in computational mathematics, since we have a wider choice
of initial guesses x0 and fewer computations to obtain a desired error tolerance on the dis-
tances k xn − x? k for each n = 0, 1, · · ·. In the present paper, we obtain the same advantages
over the work by Allende and Gonçalves [1] but using the proximal Gauss-Newton method
[6, 18].
The paper is organized as follows. In order to make the paper as self-contained as possible, we provide the necessary background in Section 16.2. Section 16.3 contains the local convergence analysis of inexact Gauss-Newton-like methods. Some proofs are abbreviated to avoid repetition with the corresponding ones in [18]. Special cases and applications are given in the concluding Section 16.4.
16.2. Background
Let U(x, r) and Ū(x, r) stand, respectively, for the open and closed ball in X with center x ∈ D and radius r > 0. Let A : X −→ Y be continuous, linear and injective with closed image; the Moore-Penrose inverse [3] A⁺ : Y −→ X is defined by A⁺ = (A*A)⁻¹A*. I denotes the identity operator on X (or Y). Let L(X, Y) be the space of bounded linear
operators from X into Y . Let M ∈ L (X , Y ), the Ker(M) and Im(M) denote the Kernel
Proximal Gauss-Newton Method 297
and image of M, respectively and M ∗ its adjoint operator. Let M ∈ L (X , Y ) with a closed
image. Recall that the Moore-Pentose inverse of M is the linear operator M + ∈ L (Y , X )
which satisfies
It follows from (16.2.1) that if Π_S denotes the projection of X onto a subspace S, then
M⁺M = I_X − Π_{Ker(M)},  MM⁺ = Π_{Im(M)}.   (16.2.2)
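For a finite-dimensional, injective A the formula A⁺ = (A*A)⁻¹A* and the projection identities (16.2.2) can be checked directly; a small numerical sketch (the random matrix is an assumption for illustration; with probability one it has full column rank):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))            # full column rank, hence injective
A_plus = np.linalg.inv(A.T @ A) @ A.T      # (A* A)^{-1} A*
assert np.allclose(A_plus, np.linalg.pinv(A))
assert np.allclose(A_plus @ A, np.eye(3))  # A+ A = I_X, since Ker(A) = {0}
P = A @ A_plus                             # A A+ = projection onto Im(A)
assert np.allclose(P @ P, P)               # idempotent, as a projection must be
```

Note that for injective A the projection Π_{Ker(A)} is zero, which is why A⁺A reduces to the identity here.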
Lemma 16.2.2. [1, 3, 6, 10] Let A, E : X −→ Y be two continuous linear operators with closed images. Suppose B = A + E, A is injective and ‖EA⁺‖ < 1. Then B is injective.
Lemma 16.2.3. [1, 3, 6, 10] Let A, E : X −→ Y be two continuous linear operators with closed images. Suppose B = A + E and ‖A⁺‖‖E‖ < 1. Then the following estimates hold:
‖B⁺‖ ≤ ‖A⁺‖/(1 − ‖A⁺‖‖E‖)  and  ‖B⁺ − A⁺‖ ≤ √2 ‖A⁺‖²‖E‖/(1 − ‖A⁺‖‖E‖).
prox^Q_J : X → X
y ↦ Γ(y) = argmin_{x∈X} { J(x) + (1/2)‖x − y‖²_Q }.   (16.2.5)
The first-order optimality condition for (16.2.4) leads to
z = prox^Q_J(y) ⇔ 0 ∈ ∂J(z) + Q(z − y) ⇔ Q(y) ∈ (∂J + Q)(z),
which leads to
prox^Q_J(y) = (∂J + Q)⁻¹(Q(y))
by using the minimum in (16.2.4). In view of the above, we can define the proximal Gauss-Newton method by
x_{n+1} = prox^{H(xₙ)}_J (xₙ − F′(xₙ)⁺F(xₙ)) for each n = 0, 1, 2, …,   (16.2.6)
where x₀ is an initial point, H(xₙ) = F′(xₙ)*F′(xₙ) and prox^{H(xₙ)}_J is defined in (16.2.5).
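For the simple (assumed) choice J(x) = (µ/2)‖x‖², the scaled proximal map has the closed form prox^Q_J(y) = (µI + Q)⁻¹Qy, which makes the first-order optimality condition 0 ∈ ∂J(z) + Q(z − y) easy to verify numerically; a sketch, with Q playing the role of the metric H(x) = F′(x)*F′(x):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
Q = B.T @ B + 4.0 * np.eye(4)      # positive definite metric, like H(x)
mu = 0.5                           # J(x) = (mu/2)||x||^2, so dJ(z) = mu*z
y = rng.standard_normal(4)
z = np.linalg.solve(mu * np.eye(4) + Q, Q @ y)   # (dJ + Q)^{-1}(Q y)
# first-order optimality of the scaled prox: mu*z + Q(z - y) = 0
assert np.allclose(mu * z + Q @ (z - y), np.zeros(4))
```

For nonsmooth J (e.g. an l1 penalty) the prox in the H-metric has no such closed form and must itself be computed iteratively, which is the practical cost of method (16.2.6).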
Next, we present some auxiliary results.
Lemma 16.2.4. [18] Let Q₁ and Q₂ be continuous, positive, self-adjoint operators, bounded from below on X. Then the following holds:
‖prox^{Q₁}_J(y₁) − prox^{Q₂}_J(y₂)‖ ≤ √(‖Q₁‖‖Q₁⁻¹‖) ‖y₁ − y₂‖ + ‖Q₁⁻¹‖ ‖(Q₁ − Q₂)(y₂ − prox^{Q₂}_J(y₂))‖   (16.2.7)
for each y₁, y₂ ∈ X.
Lemma 16.2.5. [18] Given xₙ ∈ X, if F′(xₙ) is injective with closed image, then x_{n+1} satisfies
x_{n+1} = argmin_{x∈X} (1/2)‖F(xₙ) + F′(xₙ)(x − xₙ)‖² + J(x).   (16.2.8)
Lemma 16.2.6. [18] Suppose: x* ∈ D satisfies −F′(x*)*F(x*) ∈ ∂J(x*); F′(x*) is injective and Im(F′(x*)) is closed. Then x* satisfies
x* = prox^{H(x*)}_J (x* − F′(x*)⁺F(x*)).   (16.2.9)
Proposition 16.2.7. [10] Let R > 0. Suppose g : [0, R) −→ R is convex. Then the following holds:
D⁺g(0) = lim_{u→0⁺} (g(u) − g(0))/u = inf_{u>0} (g(u) − g(0))/u.
Proposition 16.2.8. [10] Let R > 0 and θ ∈ [0, 1]. Suppose g : [0, R) −→ R is convex. Then,
h : (0, R) −→ R defined by h(t) = (g(t) − g(θt))/t is increasing.
(H₁) Let x* ∈ D, R > 0, α := ‖F(x*)‖, β := ‖F′(x*)⁺‖, γ := β‖F′(x*)‖ and δ := sup{t ∈ [0, R) : U(x*, t) ⊂ D}. The operator −F′(x*)*F(x*) ∈ ∂J(x*), F′(x) is injective and there exist continuously differentiable f₀, f : [0, R) → R such that for each x ∈ U(x*, δ), θ ∈ [0, 1] and λ(x) = ‖x − x*‖:
and
β‖F′(x) − F′(x* + θ(x − x*))‖ ≤ f′(λ(x)) − f′(θλ(x));   (16.3.2)
Remark 16.3.1. In the literature, with the exception of our works [2, 3, 4, 5, 6, 7], only (16.3.2) is used. However, notice that (16.3.2) always implies (16.3.1); that is, (16.3.1) is not an additional hypothesis to (16.3.2). Moreover,
Theorem 16.3.2. Under the (H) hypotheses, let x₀ ∈ U(x*, r) \ {x*}. Then the sequence {xₙ} generated by the proximal Gauss-Newton method (16.2.6) for solving the penalized nonlinear least squares problem (16.1.1) is well defined, remains in U(x*, r) and converges to x*. Moreover, the following estimates hold for each n = 0, 1, 2, …:
where
ϕ(λ₀, λₙ, f, f₀) = [(f₀′(λ₀) + 1 + γ)(f′(λ₀)λ₀ − f(λ₀)) / (λ₀ f₀′(λ₀))²] λₙ²
 + [(1 + √2) αβ (f₀′(λ₀) + 1)² / (λ₀ f₀′(λ₀))²] λₙ²
 + [αβ((1 + √2)γ + 1)(f₀′(λ₀) + 1) / (λ₀ (f₀′(λ₀))²)] λₙ.
In order to prove Theorem 16.3.2 we shall need several auxiliary results. The proofs of the next four lemmas are omitted, since they have been given, respectively, in Lemmas 16.3.1–16.3.4 in [7]. From now on we assume that the hypotheses (H) are satisfied.
Lemma 16.3.3. The following hold: ν > 0 and f₀′(t) < 0 for each t ∈ [0, ν).
Lemma 16.3.4. The functions gᵢ, i = 1, 2, …, 7, defined by
g₁(t) = −1/f₀′(t),
g₂(t) = −(f₀′(t) + 1 + γ)/f₀′(t),
g₃(t) = (t f′(t) − f(t))/t²,
g₄(t) = (f₀′(t) + 1)/t,
g₅(t) = (f₀′(t) + 1 + γ)(t f′(t) − f(t))/(t f₀′(t))²,
g₆(t) = (f₀′(t) + 1)²/(t f₀′(t))²
and
g₇(t) = (f₀′(t) + 1)/(t (f₀′(t))²),
for each t ∈ [0, ν), are positive and increasing.
Lemma 16.3.5. The following hold: ρ > 0 and 0 ≤ ψ(t) < 1 for each t ∈ [0, ρ), where the function ψ is defined in the (H) hypotheses.
Lemma 16.3.6. Let x ∈ D. Suppose that λ(x) < min{ν, ρ} and that the (H) hypotheses excluding (16.3.2) hold. Then the following items hold:
‖F′(x)⁺‖ ≤ −β/f₀′(λ(x)),
‖F′(x)⁺ − F′(x*)⁺‖ ≤ −√2 β(f₀′(λ(x)) + 1)/f₀′(λ(x))
and
H(x) = F′(x)*F′(x) is invertible on U(x*, r).
Remark 16.3.7. It is worth noticing (see also Remark 16.3.2) that the estimates in Lemma 16.3.6 hold with f₀ replaced by f (i.e., using (16.3.2) instead of (16.3.1)). However, in this case these estimates are less tight.
Lemma 16.3.8. Let x ∈ D. Suppose that λ(x) < min{ν, δ} and that the (H) hypotheses excluding (16.3.2) hold. Then the following items hold for each x ∈ D:
(a) ‖H(x)‖^{1/2} ≤ (f₀′(λ(x)) + 1 + γ)/β;
(b) ‖H(x)⁻¹‖^{1/2} ≤ −β/f₀′(λ(x))
and
Proof.
(a) Using (16.3.1) and the definition of γ, β‖H(x)‖^{1/2} = β‖F′(x)‖ ≤ β‖F′(x*)‖ + β‖F′(x) − F′(x*)‖ ≤ f₀′(λ(x)) + 1 + γ.
(b) Use Lemma 16.3.6, the definition of H and the last property in (16.2.3).
(c) We use (16.2.2), (b) and (16.3.1) to obtain in turn that
Lemma 16.3.9. Let x ∈ D . Suppose that λ(x) < δ, then the following items hold:
Remark 16.3.10. (a) Using (16.3.2) only, according to Lemma 16.3.9 we have that
G : U(x*, r) → X
x ↦ prox^{H(x)}_J(G(x)),
where
G(x) = x − F′(x)⁺F(x).
Notice that, according to Lemma 16.3.8, H(x) is invertible in U(x*, r). Hence, F′(x)⁺ and prox^{H(x)}_J are well defined in U(x*, r).
Proof. Let x ∈ D and suppose that λ(x) < r. Then we shall first show that the operator G is well defined and
‖G(x) − x*‖ ≤ ϕ(λ(x), λ(x), f, f₀),   (16.3.5)
where the function ϕ was defined in Theorem 16.3.2. Using Lemma 16.2.6, since −F′(x*)*F(x*) ∈ ∂J(x*) and F′(x) is injective, we have that x* = prox^{H(x*)}_J(G(x*)). Then, according to
Lemma 16.2.4 we have in turn that
‖G(x) − x*‖ = ‖prox^{H(x)}_J(G(x)) − prox^{H(x*)}_J(G(x*))‖
 ≤ (‖H(x)‖‖H(x)⁻¹‖)^{1/2} ‖G(x) − G(x*)‖ + ‖H(x)⁻¹‖ ‖(H(x) − H(x*))(G(x*) − prox^{H(x*)}_J(G(x*)))‖
 ≤ P₁(x, x*) + P₂(x, x*),   (16.3.6)
where for simplicity we set
P₁(x, x*) = (‖H(x)‖‖H(x)⁻¹‖)^{1/2} ‖G(x) − G(x*)‖
and
P₂(x, x*) = ‖H(x)⁻¹‖ ‖(H(x) − H(x*))F′(x*)⁺‖ ‖F(x*)‖.
Using the definition of P₂ and items (b) and (c) of Lemma 16.3.8 we get that
P₂(x, x*) ≤ [αβ/(f₀′(λ(x)))²] (f₀′(λ(x)) + 2 + γ)(f₀′(λ(x)) + 1).   (16.3.7)
Then, to find an upper bound on P₁, we first need to find an upper bound on ‖G(x) − G(x*)‖.
Indeed, we have in turn that
‖G(x) − G(x*)‖ = ‖F′(x)⁺[F′(x)(x − x*) − F(x) + F(x*)] + (F′(x*)⁺ − F′(x)⁺)F(x*)‖,
where
q(x) = [(f₀′(λ(x)) + 1 + γ)(λ(x)f′(λ(x)) − f(λ(x))) + αβ(1 + √2)(f₀′(λ(x)) + 1)] / (λ(x)[f₀′(λ(x))]²)
 + αβ(f₀′(λ(x)) + 1) / (λ(x)[f₀′(λ(x))]²).
But q(x) ∈ [0, 1) by Lemma 16.3.5, since x ∈ U(x*, r) \ {x*}, so that 0 < λ(x) < r < ρ. That is, we have
‖G(x) − x*‖ < ‖x − x*‖.   (16.3.10)
In particular, x₀ ∈ U(x*, r) \ {x*}, that is, 0 < λ(x₀) < r. Then, using mathematical induction, Lemma 16.3.6 and (16.3.10) for x = x₀, we get that λ(x₁) = ‖x₁ − x*‖ < ‖x₀ − x*‖ = λ(x₀) < r. Similarly, we get as in (16.3.9) that
‖x_{k+1} − x*‖ ≤ q(x₀) ‖x_k − x*‖ < ‖x_k − x*‖ < r,
from which it follows that lim_{k→∞} x_k = x* and the sequence {x_k} remains in U(x*, r) \ {x*}.
Remark 16.3.11. If f₀ = f, then the results of this section reduce to the corresponding ones in [1] (see also [9]). Otherwise, i.e., if strict inequality holds in (16.3.3), then: our sufficient convergence condition (H₄) is weaker than the one in [1] obtained using f instead of f₀ (i.e., the applicability of the method is extended to cases that could not be covered before); our convergence ball is larger and the estimates on the distances ‖xₙ − x*‖ are more precise, which implies that we have a wider choice of initial guesses and that fewer iterates are required to obtain a given error tolerance. Notice also that these advantages are obtained under the same computational cost as in [1, 9], since in practice the computation of the function f requires the computation of f₀ as a special case. Therefore, these developments are very important in computational mathematics.
β‖F′(x) − F′(x*)‖ ≤ L₀‖x − x*‖
and
β‖F′(x) − F′(x* + θ(x − x*))‖ ≤ L(1 − θ)‖x − x*‖.
Let
r := min{ [4 + γ − √((4 + γ)² − 8)] / (2L₀), δ }.
Then the sequence {xₙ} generated by the proximal Gauss-Newton method (16.2.6) for solving the penalized nonlinear least squares problem (16.1.1) is well defined, remains in U(x*, r) and converges to x* provided that x₀ ∈ U(x*, r) \ {x*}. Moreover, the following estimates hold:
‖x_{n+1} − x*‖ ≤ [L(γ + 2L₀‖x₀ − x*‖) / (2(1 − L₀‖x₀ − x*‖))] ‖xₙ − x*‖² for each n = 0, 1, 2, ….
The preceding results improve earlier ones [1, 8, 9, 10, 12, 15, 18] when L₀ < L (see also Remark 16.3.11). Next, we present an example where L₀ < L. More examples where L₀ < L, in the Lipschitz case or in Smale's alpha theory, can be found in [3, 4, 5, 6, 7].
Example 16.4.3. Let X = Y = R³, D = U(0, 1), x* = (0, 0, 0) and define the function F on D by
F(x, y, z) = (eˣ − 1, ((e − 1)/2) y² + y, z).   (16.4.2)
For simplicity we consider the nonlinear equation F(x) = 0 instead of (16.1.1). We have that, for u = (x, y, z),
F′(u) = diag(eˣ, (e − 1)y + 1, 1).   (16.4.3)
Using the norm of the maximum of the rows and (16.4.2)–(16.4.3), since F′(x*) = diag(1, 1, 1), we can define the parameters L₀ and L by
L₀ = e − 1 < L = e.
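A quick numerical sanity check of these constants (a sampling estimate, not a proof), together with the resulting convergence radius of the Lipschitz special case above; here γ = 1 since β = 1 and F′(x*) = I, and δ = 1 since D = U(0, 1):

```python
import numpy as np

e = np.e
xs = np.linspace(-1.0, 1.0, 2001)
# center-Lipschitz: sup |e^x - 1| / |x| over U(0,1) is attained at x = 1
L0 = max(abs(np.exp(x) - 1.0) / abs(x) for x in xs if x != 0.0)
# full Lipschitz: sup |e^x - e^y| / |x - y| = sup e^x = e on [-1, 1]
L = max(np.exp(xs))
assert L0 < L                      # e - 1 < e

# radius r = min{ (4 + gamma - sqrt((4+gamma)^2 - 8)) / (2 L0), delta }
gamma, delta = 1.0, 1.0
r = min((4 + gamma - np.sqrt((4 + gamma) ** 2 - 8)) / (2 * L0), delta)
```

Using L₀ = e − 1 instead of L = e in the denominator enlarges the computed radius of convergence, which is precisely the advantage claimed above.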
References
[1] Allende, G.B., Gonçalves, M.L.N., Local convergence analysis of a proximal Gauss-
Newton method under a majorant condition, preprint http://arxiv.org/abs/1304.6461
[2] Amat, S., Busquier, S., Gutiérrez, J.M., Geometric constructions of iterative functions
to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197-205.
[4] Argyros, I.K., Hilout, S., Extending the applicability of the Gauss-Newton method
under average Lipschitz-type conditions, Numer. Algorithms, 58 (2011), 23-52.
[5] Argyros, I.K., Hilout, S., Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236 (2012), 1892-1902.
[6] Argyros, I.K., Hilout, S., Computational methods in Nonlinear Analysis, World Sci-
entific Publ. Comp. New Jersey, 2013.
[7] Argyros, I.K., Hilout, S., Improved local convergence analysis of inexact Gauss-
Newton like methods under the majorizing condition in Banach space, J. Franklin
Institute, 350 (2013), 1531-1544.
[8] Chen, J., Li, W., Local convergence results of Gauss-Newton’s like method in weak
conditions, J. Math. Anal. Appl., 324 (2006), 1381–1394.
[9] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of
Gauss–Newton like methods under majorant condition, J. Complexity, 27 (2011), 111–
125.
[10] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of inex-
act Gauss–Newton like methods under majorant condition, J. Comput. Appl. Math.,
236 (2012), 2487–2498.
[11] Gutiérrez, J.M., Hernández, M.A., Newton’s method under weak Kantorovich condi-
tions, IMA J. Numer. Anal., 20 (2000), 521–532.
[13] Huang, Z.A., Convergence of inexact Newton method, J. Zhejiang Univ. Sci. Ed., 30
(2003), 393–396.
[14] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[15] Li, C., Hu, N., Wang, J., Convergence behavior of Gauss–Newton’s method and ex-
tensions to the Smale point estimate theory, J. Complexity, 26 (2010), 268–295.
[16] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.
[17] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.
[18] Salzo, S., Villa, S., Convergence analysis of a proximal Gauss-Newton method, Com-
put. Optim. Appl., 53 (2012), 557–589.
[19] Smale, S., Newton’s method estimates from data at one point. The merging of dis-
ciplines: new directions in pure, applied, and computational mathematics (Laramie,
Wyo., 1985), 185-196, Springer, New York, 1986.
[20] Wang, X.H., Convergence of Newton’s method and uniqueness of the solution of equa-
tions in Banach spaces, IMA J. Numer. Anal., 20 (2000), 123-134.
Chapter 17

Damped Newton Method with Modified Right-Hand Side
17.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solu-
tion x∗ of the nonlinear equation
F(x) = 0,   (17.1.1)
where F is a Fréchet-differentiable operator defined on an open convex subset D of a Banach space X with values in a Banach space Y.
Many problems from computational sciences and other disciplines can be brought into a form similar to equation (17.1.1) using mathematical modeling [2, 6, 10]. For example, in data fitting we have X = Y = R^i, where i is both the number of parameters and the number of observations.
The solution of (17.1.1) can rarely be found in closed form. That is why the solution
methods for these equations are usually iterative. In particular, the practice of Numerical
Analysis for finding such solutions is essentially connected to Newton-type methods [1]–
[15]. The study of the convergence of iterative procedures usually centers on two types of analysis: semilocal and local. The semilocal convergence analysis uses information around an initial point to give criteria ensuring the convergence of iteration procedures, while the local analysis uses information around a solution to find estimates of the radii of the convergence balls.
In the present chapter, we study the convergence of the Damped Newton method defined
by
x_{n+1} = xₙ − A⁻¹[I − αₙ(F′(xₙ) − A)]F(xₙ), for each n = 0, 1, 2, …,   (17.1.2)
where A ∈ L(X, Y), the space of bounded linear operators from X into Y, A⁻¹ ∈ L(Y, X), {αₙ} is a sequence of real numbers chosen to force convergence of the sequence {xₙ}, and x₀ is an initial point. If A = F′(x₀) and αₙ = 0 for each n = 0, 1, 2, …, we obtain the modified Newton's method
yₙ₊₁ = yₙ − F′(x₀)⁻¹F(yₙ),  y₀ = x₀,  for each n = 0, 1, 2, ….   (17.1.3)
converges quadratically provided that the iteration starts close enough to the solution. However, a Newton iterate may be very expensive, since all the elements of the Jacobian matrix involved must be computed, and a system of linear equations with a new matrix must be solved exactly at every iterate. As noted in [13], the Newton-like method (17.1.2) uses a modification of the right-hand side vector, which is cheaper than the Newton method and faster than the modified Newton method. One step of the method requires the solution of a linear system, but the system matrix is the same in all iterations.
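A minimal sketch of iteration (17.1.2) in matrix form: the fixed matrix A = F′(x₀) is used in every solve and only the right-hand side changes per step (in production one would factor A once and reuse the factors). The 2×2 test system, starting point, and choice αₙ = 0 are our own assumptions:

```python
import numpy as np

def damped_newton(F, Fprime, x0, alphas, maxit=50, tol=1e-12):
    # x_{n+1} = x_n - A^{-1}[I - alpha_n (F'(x_n) - A)] F(x_n), with A = F'(x_0)
    x = np.asarray(x0, dtype=float)
    A = Fprime(x)                    # fixed system matrix for all iterations
    for n in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        rhs = Fx - alphas[n] * (Fprime(x) - A) @ Fx   # modified right-hand side
        x = x - np.linalg.solve(A, rhs)
    return x

# toy system F(x) = (x1^2 + x2 - 2, x1 + x2^2 - 2), with solution (1, 1)
F = lambda x: np.array([x[0] ** 2 + x[1] - 2.0, x[0] + x[1] ** 2 - 2.0])
Fp = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
x_star = damped_newton(F, Fp, [1.3, 0.8], alphas=[0.0] * 50)  # alpha_n = 0
```

With αₙ = 0 this reduces to the modified Newton method (17.1.3); nonzero αₙ modifies the right-hand side using the current Jacobian while still solving with the single matrix A.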
We present a new local and semilocal convergence analysis for the Newton-like method. In contrast to the work in [11, 13], in the local case the radius of convergence can be computed, as well as the error bounds on the distances ‖xₙ − x*‖ for each n = 0, 1, 2, …. In the semilocal case, we present estimates on the smallness of ‖F(x₀)‖ as well as computable estimates for ‖xₙ − x*‖ (not given in [11, 13]) in terms of the Lipschitz constants and other initial data.
We denote by U(ν, µ) the open ball centered at ν ∈ X and of radius µ > 0. Moreover, by Ū(ν, µ) we denote the closure of U(ν, µ).
The chapter is organized as follows. Sections 17.2. and 17.3. contain the semilocal and
local convergence analysis of Newton-like method (17.1.2), respectively. The numerical
examples are presented in the concluding Section 17.4..
holds;
C2 There exists L₀ > 0 such that for each x ∈ D the center-Lipschitz condition
holds;
Damped Newton Method with Modified Right-Hand Side 311
Hence, x1 ∈ U(x0 , r) and (17.2.7) holds for n = 0. Using (17.1.2) it can easily be seen that
the Ostrowski-type approximation
F(x_{n+1}) = ∫₀¹ [F′(xₙ + θ(x_{n+1} − xₙ)) − F′(xₙ)](x_{n+1} − xₙ) dθ + (F′(xₙ) − A)(αₙF(xₙ) + (x_{n+1} − xₙ))   (17.2.9)
holds. Using (17.2.9), and the (C) conditions, for n = 0 we get in turn that
‖F(x₁)‖ = ‖∫₀¹ [F′(x₀ + θ(x₁ − x₀)) − F′(x₀)](x₁ − x₀) dθ + (F′(x₀) − A)(α₀F(x₀) + (x₁ − x₀))‖
 ≤ (L₀/2)‖x₁ − x₀‖² + ‖F′(x₀) − A‖(|α₀|‖F(x₀)‖ + ‖x₁ − x₀‖)
 ≤ (L₀/2)q²‖F(x₀)‖² + ‖F′(x₀) − A‖(α‖F(x₀)‖ + q‖F(x₀)‖)
 ≤ [(L₀/2)q²‖F(x₀)‖ + ‖F′(x₀) − A‖(α + q)]‖F(x₀)‖
 ≤ q‖F(x₀)‖.
That is, (17.2.8) holds for n = 0. It follows from the existence of x₁ ∈ U(x₀, r) and A⁻¹ ∈ L(Y, X) that x₂ is well defined. Using (17.1.2) for n = 1, we get by (0), (2), (3) that
‖x₂ − x₁‖ = ‖A⁻¹[I − α₁(F′(x₁) − A)]F(x₁)‖
 ≤ [‖A⁻¹‖ + α‖A⁻¹‖(‖F′(x₁) − F′(x₀)‖ + ‖F′(x₀) − A‖)]‖F(x₁)‖
 ≤ [‖A⁻¹‖ + α(‖A⁻¹‖L₀‖x₁ − x₀‖ + ‖A⁻¹(F′(x₀) − A)‖)]‖F(x₁)‖
 ≤ q‖F(x₁)‖ ≤ q²‖F(x₀)‖.
That is, x₂ ∈ U(x₀, r). Then, using (17.2.9) for n = 1, as above we get in turn that
‖F(x₂)‖ ≤ (L/2)‖x₂ − x₁‖² + [L₀‖x₁ − x₀‖ + ‖F′(x₀) − A‖](|α₁|‖F(x₁)‖ + ‖x₂ − x₁‖)
 ≤ (L/2)q²‖F(x₁)‖² + [L₀q‖F(x₀)‖ + ‖F′(x₀) − A‖](α‖F(x₁)‖ + q‖F(x₁)‖)
 ≤ [(L/2)q²‖F(x₁)‖ + (L₀q‖F(x₀)‖ + ‖F′(x₀) − A‖)(α + q)]‖F(x₁)‖
 ≤ q‖F(x₁)‖ ≤ q²‖F(x₀)‖.
Similarly, we have, using (17.1.2), that
‖x₃ − x₂‖ ≤ [‖A⁻¹‖ + α‖A⁻¹‖(‖F′(x₂) − F′(x₀)‖ + ‖F′(x₀) − A‖)]‖F(x₂)‖
 ≤ [‖A⁻¹‖ + α(‖A⁻¹‖L₀‖x₂ − x₀‖ + ‖A⁻¹(F′(x₀) − A)‖)]‖F(x₂)‖
 ≤ q‖F(x₂)‖ ≤ q³‖F(x₀)‖.
We also have that
‖x₃ − x₀‖ ≤ ‖x₃ − x₂‖ + ‖x₂ − x₁‖ + ‖x₁ − x₀‖
 ≤ (q³ + q² + q)‖F(x₀)‖
 = [(1 − q³)/(1 − q)] q‖F(x₀)‖ < r,
and
‖F(x₃)‖ ≤ (L/2)‖x₃ − x₂‖² + [L₀‖x₂ − x₀‖ + ‖F′(x₀) − A‖](|α₂|‖F(x₂)‖ + ‖x₃ − x₂‖)
 ≤ (L/2)q²‖F(x₂)‖² + [L₀ q‖F(x₀)‖/(1 − q) + ‖F′(x₀) − A‖](α‖F(x₂)‖ + q‖F(x₂)‖)
 ≤ [(L/2)q²‖F(x₂)‖ + (L₀ q‖F(x₀)‖/(1 − q) + ‖F′(x₀) − A‖)(α + q)]‖F(x₂)‖
 ≤ q‖F(x₂)‖ ≤ q³‖F(x₀)‖.
The rest follows in an analogous way using induction (simply replace x₂, x₃ by xₙ, x_{n+1} in the above estimates). By letting n → ∞ in (17.2.7) we obtain F(x*) = 0.
Condition (1) may not be satisfied, but the weaker condition (2) may be satisfied. In this case (1) can be dropped. Then, using instead of the approximation (17.2.9) the approximation
F(x_{n+1}) = ∫₀¹ [F′(xₙ + θ(x_{n+1} − xₙ)) − F′(x₀)](x_{n+1} − xₙ) dθ
 + (F′(x₀) − F′(xₙ))(x_{n+1} − xₙ)
 + [(F′(xₙ) − F′(x₀)) + (F′(x₀) − A)](αₙF(xₙ) + (x_{n+1} − xₙ)),   (17.2.11)
Theorem 17.2.2. Suppose that the (C′) conditions hold. Then the sequence {xₙ} generated by the Damped Newton method (17.1.2) is well defined, remains in U(x₀, r) for each n = 0, 1, 2, …, and converges to a solution x* ∈ U(x₀, r) of equation (17.1.1). Moreover, the following estimates hold for each n = 0, 1, 2, …,
and
‖F(x_{n+1})‖ ≤ q‖F(xₙ)‖ ≤ qⁿ⁺¹‖F(x₀)‖,
where q is defined in (4) and r in (5).
Concerning the uniqueness of the solution x∗ in U(x0 , r) we have the following result.
Proposition 17.2.3. Suppose that the (C) or (C′) conditions hold. Moreover, suppose that there exist x₀ ∈ D and r₁ ≥ r such that F′(x₀)⁻¹ ∈ L(Y, X) and
Then the solution x* is the only solution of equation (17.1.1) in U(x₀, r₁), where r is defined in (5).
Proof. The existence of the solution x* is guaranteed by the conditions (C) or (C′). To show uniqueness, let y* ∈ U(x₀, r₁) with F(y*) = 0. Define M = ∫₀¹ F′(x* + θ(y* − x*)) dθ. Then,
C1 There exists L > 0 such that for each x, y ∈ D the Lipschitz condition (17.2.1) holds;
C2 There exists l₀ > 0 such that for each x ∈ D the center-Lipschitz condition (17.2.2)
‖F′(x) − F′(x*)‖ ≤ l₀‖x − x*‖
holds;
C4 U(x∗ , R) ⊆ D, where R is R1 or R2 .
Then, using (17.2.9), and the (H) conditions, it is standard to arrive at [2, 3, 4, 5, 6, 8, 9, 10,
14, 15]:
Theorem 17.3.1. Suppose that the (H) conditions hold. Then the sequence {xₙ} generated by the Damped Newton method (17.1.2) is well defined, remains in U(x*, R₁) for each n = 0, 1, 2, …, and converges to x* provided that x₀ ∈ U(x*, R₁). Moreover, the following estimates hold for each n = 0, 1, 2, …,
where
eₙ = (La/2)‖xₙ − x*‖ + [β₁ + αβ₁(β + 1)]‖xₙ − x*‖ + (αl₀/2)‖xₙ − x*‖ + l₀a‖xₙ − x*‖ + αl₀aβ‖xₙ − x*‖ + (αl₀²a/2)‖xₙ − x*‖²
 < p₁(R₁) + 1 < 1.
In case (1) cannot be verified but (2) holds, we can present the local convergence of the Damped Newton method (17.1.2) under the (H′) conditions using the following modification of the Ostrowski representation (17.3.4), given by
x_{n+1} − x* = −A⁻¹[ ∫₀¹ (F′(x* + θ(xₙ − x*)) − F′(x*)) dθ + (F′(x*) − F′(xₙ))
 − ((A − F′(x*)) + (F′(x*) − F′(xₙ)))(I − αₙF′(x*) − αₙ∫₀¹ (F′(x* + θ(xₙ − x*)) − F′(x*)) dθ) ](xₙ − x*).   (17.3.6)
Theorem 17.3.2. Suppose that the (H′) conditions hold. Then the sequence {xₙ} generated by the Damped Newton method (17.1.2) is well defined, remains in U(x*, R₂) for each n = 0, 1, 2, …, and converges to x* provided that x₀ ∈ U(x*, R₂). Moreover, the following estimates hold for each n = 0, 1, 2, …,
where
e′ₙ = (3l₀a/2)‖xₙ − x*‖ + [β₁ + αβ₁(β + 1)]‖xₙ − x*‖ + (αl₀/2)‖xₙ − x*‖ + l₀a‖xₙ − x*‖ + αl₀aβ‖xₙ − x*‖ + (αl₀²a/2)‖xₙ − x*‖²
 < p₂(R₂) + 1 < 1.
must be isotropic at a point, that is, the distribution is independent of direction at that point. Explicit definitions of these terms may be found in the literature [7]. It is considered to be
the prototype of the equation,
x(s) = 1 + λ s x(s) ∫₀¹ [ϕ(t)/(s + t)] x(t) dt,  s ∈ [0, 1],
for more general laws of scattering, where ϕ(s) is an even polynomial in s with
∫₀¹ ϕ(s) ds ≤ 1/2.
Integral equations of the above form also arise in other studies [7]. We determine where
a solution is located, along with its region of uniqueness.
Note that solving (17.4.1) is equivalent to solving F(x) = 0, where F : C[0, 1] → C[0, 1] and
$$[F(x)](s) = x(s) - 1 - \frac{s}{4}\, x(s) \int_0^1 \frac{x(t)}{s+t}\, dt, \qquad s \in [0,1]. \tag{17.4.2}$$
To obtain a numerical solution of (17.4.1), we first discretize the problem and approximate the integral by a Gauss-Legendre numerical quadrature with eight nodes,
$$\int_0^1 f(t)\, dt \approx \sum_{j=1}^{8} w_j f(t_j),$$
where
t1 = 0.019855072, t2 = 0.101666761, t3 = 0.237233795, t4 = 0.408282679,
t5 = 0.591717321, t6 = 0.762766205, t7 = 0.898333239, t8 = 0.980144928,
w1 = 0.050614268, w2 = 0.111190517, w3 = 0.156853323, w4 = 0.181341892,
w5 = 0.181341892, w6 = 0.156853323, w7 = 0.111190517, w8 = 0.050614268.
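These values are the standard 8-point Gauss–Legendre nodes and weights mapped from [−1, 1] to [0, 1]; a short sketch (using NumPy, an assumption of this illustration and not part of the chapter) regenerates them:

```python
import numpy as np

# 8-point Gauss-Legendre rule on [-1, 1], mapped to [0, 1]:
# nodes t = (u + 1)/2, weights pick up the Jacobian factor 1/2.
u, v = np.polynomial.legendre.leggauss(8)
t = 0.5 * (u + 1.0)
w = 0.5 * v

print(t[0], w[0])  # ≈ 0.019855072 and 0.050614268, matching t1 and w1 above
```

The weights sum to 1, the length of [0, 1], which is a quick sanity check.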
If we denote xi = x(ti ), i = 1, 2, . . ., 8, equation (17.4.1) is transformed into the following nonlinear system:
$$x_i = 1 + \frac{x_i}{4} \sum_{j=1}^{8} a_{ij} x_j, \qquad i = 1, 2, \ldots, 8, \qquad \text{where } a_{ij} = \frac{t_i w_j}{t_i + t_j}.$$
Denote now x = (x1 , x2 , . . ., x8 )T , 1 = (1, 1, . . ., 1)T , A = (ai j ) and write the last nonlinear system in the matrix form:
$$x = 1 + \frac{1}{4}\, x \odot (Ax), \tag{17.4.3}$$
where ⊙ represents the componentwise (Hadamard) product. Set G(x) = x. We choose x0 = (1, 1, . . ., 1)T and x−1 = (0, 0, . . ., 0)T , and assume sequence {xn } is generated with different choices of αn and
A = F 0 (x0 ). The computational order of convergence (COC) is shown in Table 17.4.1 for
various methods. Here the COC is defined in [12] by
$$\rho \approx \ln\frac{\|x_{n+1} - x^\star\|_\infty}{\|x_n - x^\star\|_\infty} \Big/ \ln\frac{\|x_n - x^\star\|_\infty}{\|x_{n-1} - x^\star\|_\infty}, \qquad n \in \mathbb{N}.$$
Table 17.4.1 shows the COC.
318 Ioannis K. Argyros and Á. Alberto Magreñán
Table 17.4.1. The comparison results of the COC for Example 1 using various αn
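A minimal end-to-end sketch of this example: it solves the discretized system (17.4.3) with plain (undamped) Newton steps — the specific damped iterations (17.1.2) and the αn choices compared in Table 17.4.1 are not reproduced here — and estimates the COC from the decay of the errors:

```python
import numpy as np

# a_ij = t_i w_j / (t_i + t_j) built from the 8-point Gauss-Legendre rule on [0, 1].
u, v = np.polynomial.legendre.leggauss(8)
t, w = 0.5 * (u + 1.0), 0.5 * v
A = t[:, None] * w[None, :] / (t[:, None] + t[None, :])

def F(x):
    # Discretized equation: F(x) = x - 1 - (1/4) x * (A x), componentwise product.
    return x - 1.0 - 0.25 * x * (A @ x)

def Fprime(x):
    # Jacobian of F: I - (1/4) (diag(A x) + diag(x) A).
    return np.eye(8) - 0.25 * (np.diag(A @ x) + np.diag(x) @ A)

x, iterates = np.ones(8), []
for _ in range(8):
    x = x - np.linalg.solve(Fprime(x), F(x))
    iterates.append(x.copy())

xs = iterates[-1]                                  # converged iterate, used as x*
e = [np.linalg.norm(y - xs, np.inf) for y in iterates[:3]]
rho = np.log(e[2] / e[1]) / np.log(e[1] / e[0])    # COC estimate as in [12]
print(rho)  # close to 2, the quadratic rate expected of Newton's method
```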
Example 17.4.2. In this example, we consider the Singular Broyden [13] problem defined as
$$\begin{aligned}
F_1(x) &= ((3 - h x_1)x_1 - 2x_2 + 1)^2,\\
F_i(x) &= ((3 - h x_i)x_i - x_{i-1} - 2x_{i+1} + 1)^2, \qquad i = 2, \ldots, n-1,\\
F_n(x) &= ((3 - h x_n)x_n - x_{n-1} + 1)^2.
\end{aligned}$$
Taking x0 = (−1, . . ., −1)T as the starting approximation and h = 2, Table 17.4.2 shows the COC computed as in the previous example.
Table 17.4.2. The comparison results of the COC for Example 2 using various αn
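A sketch of this example (interior equations taken for i = 2, …, n − 1, with n = 10 chosen here purely for illustration). Because every residual is squared, the Jacobian F′(x) = 2 diag(g(x)) g′(x) is singular at the solution, so the Newton step on F reduces to half a Newton step on the unsquared residuals g, and convergence degrades to a linear rate — the regime the αn variants compared in Table 17.4.2 are meant to handle. The code exploits that identity rather than forming the singular Jacobian directly:

```python
import numpy as np

n, h = 10, 2.0   # dimension (illustrative choice) and the value h = 2 of the chapter

def g(x):
    # Unsquared residuals of the Singular Broyden problem [13].
    r = np.empty(n)
    r[0] = (3 - h * x[0]) * x[0] - 2 * x[1] + 1
    r[1:-1] = (3 - h * x[1:-1]) * x[1:-1] - x[:-2] - 2 * x[2:] + 1
    r[-1] = (3 - h * x[-1]) * x[-1] - x[-2] + 1
    return r

def Jg(x):
    # Tridiagonal Jacobian of g: diagonal 3 - 2h x_i, subdiagonal -1, superdiagonal -2.
    return (np.diag(3 - 2 * h * x)
            + np.diag(-np.ones(n - 1), -1)
            + np.diag(-2 * np.ones(n - 1), 1))

x = -np.ones(n)          # starting approximation (-1, ..., -1)^T
for _ in range(80):
    # Newton step on F = g**2: (2 diag(g) Jg) s = g**2, i.e. s = (1/2) Jg^{-1} g.
    x = x - 0.5 * np.linalg.solve(Jg(x), g(x))

residual = float(np.max(g(x) ** 2))   # max_i F_i(x)
```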
[1] Amat, S., Busquier, S., Gutiérrez, J. M., Geometric constructions of iterative functions
to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197-205.
[3] Argyros, I. K., Chen, J., Improved results on estimating and extending the radius of
an attraction ball, Appl. Math. Lett., 23 (2010), 404-408.
[4] Argyros, I. K., Chen, J., On local convergence of a Newton-type method in Banach
space, Int. J. Comput. Math., 86 (2009), 1366-1374.
[5] Argyros, I.K., Hilout, S., Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236 (2012), 1892–1902.
[6] Argyros, I.K., S. Hilout, Computational methods in Nonlinear Analysis, World Scien-
tific Publ. Comp. New Jersey, 2013.
[7] Chandrasekhar, S., Radiative transfer, Dover Publ., New York, 1960.
[8] Chen, J., Sun, Q., The convergence ball of Newton-like methods in Banach space and
applications, Taiwanese J. Math., 11 (2007), 383-397.
[9] Chen, J, Li, W., Convergence behaviour of inexact Newton methods under weak Lip-
schitz condition, J. Comput. Appl. Math., 191 (2006), 143-164.
[10] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[11] Herceg, D., Krejić, N., Lužanin, Z., Quasi-Newton’s method with correction, Novi
Sad J. Math., 26 (1996), 115-127.
[12] Grau, M., Noguera, M., Gutiérrez, J. M., On some computational orders of conver-
gence. Appl. Math. Let., 23 (2010), 472-478.
[13] Krejić, N., Lužanin, Z., Newton-like method with modification of the right-hand-side
vector, Math. Comp. 71 (2002), 237-250
[14] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71-84.
[15] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.
Chapter 18
18.1. Introduction
Let X , Y be Banach spaces and D be a non-empty, convex and open subset in X . Let
U(x, r) and U(x, r) stand, respectively, for the open and closed ball in X with center x and
radius r > 0. Denote by L (X , Y ) the space of bounded linear operators from X into Y . In
this chapter, we are concerned with the problem of approximating a solution x? of equation
F(x) = 0 (18.1.1)
using Newton's method (NM)
$$x_{n+1} = x_n - F'(x_n)^{-1} F(x_n) \quad \text{for each } n = 0, 1, 2, \ldots, \tag{18.1.2}$$
where x0 is an initial point. There are two difficulties with the implementation of (NM).
The first is to evaluate F 0 and the second difficulty is to exactly solve the following Newton
equation
F 0 (xn )(xn+1 − xn ) = −F(xn ) for each n = 0, 1, 2, . . .. (18.1.3)
It is well-known that evaluating F 0 and solving equation (18.1.3) may be computationally
expensive [1, 5, 6, 8, 12, 13, 14]. That is why the inexact Newton-like method (INLM) has been used [1, 2, 8, 9, 12, 13, 14]:
for some θ ∈ [0, 1] and for each n = 0, 1, 2, . . .. Here, {ηn }, {θn } are
sequences in [0, 1], {Pn } is a sequence in L (Y , X ) and F 0 (x? )−1 F 0 satisfies a Lipschitz or
Hölder condition on U(x? , r) [1]-[6], [8, 9, 10, 13, 14].
In this chapter, we are motivated by the works of Argyros et al.[1, 2], Chen et al.[5]
and Zhang et al.[13] and optimization considerations. We suppose that F has a continuous
Fréchet-derivative in U(x? , r), F(x? ) = 0, F 0 (x? )−1 F 0 exists and F 0 (x? )−1 F 0 satisfies the
radius Lipschitz condition with L-average
$$\|F'(x^\star)^{-1}(F'(x) - F'(x^\tau))\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\, du, \qquad x^\tau = x^\star + \tau(x - x^\star),\ \tau \in [0,1], \tag{18.1.6}$$
where ρ(x) = ‖x − x⋆‖, the center-Lipschitz condition with L0-average
$$\|F'(x^\star)^{-1}(F'(x) - F'(x^\star))\| \le \int_0^{\rho(x)} L_0(u)\, du, \tag{18.1.7}$$
and
$$L_0(u) \le L(u) \tag{18.1.8}$$
for each u ∈ [0, r]; the ratio L/L0 can be arbitrarily large [1, 2, 4] (see also the numerical example at the end of the chapter). It is worth noticing that (18.1.7) is not an additional hypothesis to (18.1.6), since in practice the computation of the function L requires the computation of L0.
Inexact Newton Like-Method 325
In the computation of ‖F′(x)⁻¹F′(x⋆)‖ we use condition (18.1.7), which is tighter than (18.1.6), and the Banach lemma on invertible operators [7], to obtain the perturbation bound
$$\|F'(x)^{-1}F'(x^\star)\| \le \left(1 - \int_0^{\rho(x)} L_0(u)\, du\right)^{-1} \quad \text{for each } x \in U(x^\star, r). \tag{18.1.9}$$
Notice that (18.1.6) and (18.1.10) have been used in [5], [13], [14]. It turns out that using
(18.1.9) instead of (18.1.10), in the case when L0 (u) < L (u) for each u ∈ [0, r], leads to
tighter majorizing sequences for (INLM). This observation in turn leads to the following
advantages over the earlier works (for ηn = 0 for each n = 0, 1, 2, . . . or not and L being a
constant or not):
The rest of the chapter is organized as follows. In Section 18.2 we present some auxiliary
results. Section 18.3 contains the local convergence analysis of (INLM). In Section 18.4, we
present special cases. The numerical example appears in Section 18.5 and the conclusion
in Section 18.6.
18.2. Background
In this section we present three auxiliary results. The first two are Banach-type perturbation
lemmas.
Lemma 18.2.1. Suppose that F′(x⋆)⁻¹F′ satisfies the center-Lipschitz condition (18.1.7) and
$$\int_0^r L_0(u)\, du \le 1. \tag{18.2.1}$$
Then, for each x ∈ U(x⋆, r), F′(x)⁻¹ exists and
$$\|F'(x)^{-1}F'(x^\star)\| \le \frac{1}{1 - \int_0^{\rho(x)} L_0(u)\, du}. \tag{18.2.2}$$
Proof. Let x ∈ U(x? , r). Using (18.1.7) and (18.2.1) we get in turn that
$$\|F'(x^\star)^{-1}(F'(x) - F'(x^\star))\| \le \int_0^{\rho(x)} L_0(u)\, du < \int_0^r L_0(u)\, du \le 1. \tag{18.2.3}$$
It follows from (18.2.3) and the Banach Lemma on invertible operators [7] that F 0 (x)−1 ∈
L(Y , X ) and (18.2.2) holds.
Lemma 18.2.2. Suppose that F is such that F 0 is continuously Fréchet-differentiable in
U(x? , r), F 0 (x? )−1 ∈ L(Y , X ) and F 0 (x? )−1 F 0 satisfies the radius Lipschitz condition with
L −average and the center-Lipschitz condition with L0−average. Then, we have
$$\|F'(x)^{-1}F(x)\| \le \rho(x) + \frac{\int_0^{\rho(x)} L(u)\,u\, du - \int_0^{\rho(x)} (L(u) - L_0(u))\,\rho(x)\, du}{1 - \int_0^{\rho(x)} L_0(u)\, du} \tag{18.2.4}$$
$$\le \rho(x) + \frac{\int_0^{\rho(x)} L(u)\,u\, du}{1 - \int_0^{\rho(x)} L_0(u)\, du}. \tag{18.2.5}$$
If F′(x⋆)⁻¹F′ satisfies the center-Lipschitz condition, then we have
$$\|F'(y)^{-1}F(x)\| \le \frac{\rho(x) + \int_0^{\rho(x)} L_0(u)(\rho(x) - u)\, du}{1 - \int_0^{\rho(y)} L_0(u)\, du}, \tag{18.2.6}$$
which implies (18.2.4), and since L0 (u) ≤ L (u), (18.2.4) implies (18.2.5). Estimate (18.2.6) is shown in [5, Lemma 2.2, 1.3].
Remark 18.2.3. If L0 = L, then our two preceding results reduce to the corresponding ones in [5, 13]. Otherwise, i.e., if strict inequality holds in (18.1.8), then our estimates are more precise, since
$$\frac{1}{1 - \int_0^{\rho(x)} L_0(u)\, du} < \frac{1}{1 - \int_0^{\rho(x)} L(u)\, du} \tag{18.2.9}$$
and
$$\rho(x) + \frac{\int_0^{\rho(x)} L(u)\,u\, du}{1 - \int_0^{\rho(x)} L_0(u)\, du} < \rho(x) + \frac{\int_0^{\rho(x)} L(u)\,u\, du}{1 - \int_0^{\rho(x)} L(u)\, du}. \tag{18.2.10}$$
Notice that the right hand sides of (18.2.9) and (18.2.10) are the upper bounds of the norms ‖F′(x)⁻¹F′(x⋆)‖ and ‖F′(x)⁻¹F(x)‖, respectively, obtained in the corresponding lemmas in
[5], [13].
In view of estimates (18.2.9) and (18.2.10), we obtain the advantages of our approach, already mentioned in the introduction of this chapter, over the corresponding results in [5, 13, 14].
Next, we present another auxiliary result due to Wang [14, Lemma 2.2].
Lemma 18.2.4. Suppose that the function Lα defined by
$$L_\alpha(t) = \frac{1}{t^{\alpha}} \int_0^t L(u)\, du \tag{18.2.11}$$
is nondecreasing for some α with α ∈ [0, 1], where L is a positive integrable function. Then, for each β ≥ 0, the function ϕβ,α defined by
$$\varphi_{\beta,\alpha}(t) = \frac{1}{t^{\alpha+\beta}} \int_0^t u^\beta L(u)\, du \tag{18.2.12}$$
is also nondecreasing.
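A quick numerical illustration of the lemma for the sample choice L(u) = 1 + u (positive and integrable) with α = 1 and β = 2; the closed form taken for (18.2.11) above is a reconstruction and is assumed here:

```python
def integral(f, a, b, m=2000):
    # composite midpoint rule, accurate enough for these smooth integrands
    step = (b - a) / m
    return step * sum(f(a + (i + 0.5) * step) for i in range(m))

L = lambda u: 1.0 + u        # positive integrable sample function
alpha, beta = 1.0, 2.0

L_alpha = lambda t: integral(L, 0.0, t) / t ** alpha
phi = lambda t: integral(lambda u: u ** beta * L(u), 0.0, t) / t ** (alpha + beta)

ts = [0.1 * k for k in range(1, 21)]
grid_L = [L_alpha(t) for t in ts]       # equals 1 + t/2, nondecreasing
grid_phi = [phi(t) for t in ts]         # equals 1/3 + t/4, also nondecreasing
```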
$$(1 + v)\,\frac{\int_0^{\tilde r} L(u)\,u\, du}{\tilde r\left(1 - \int_0^{\tilde r} L_0(u)\, du\right)} + v \le 1. \tag{18.3.4}$$
Then (INLM) (for Bn = F 0 (xn )) is convergent for all x0 ∈ U(x? , r̃) and
$$\|x_{n+1} - x^\star\| \le \left((1 + v)\,\frac{\int_0^{\rho(x_0)} L(u)\,u\, du}{\rho(x_0)^{1+\alpha}\left(1 - \int_0^{\rho(x_0)} L_0(u)\, du\right)}\,\rho(x_n)^{\alpha} + v\right)\|x_n - x^\star\|, \tag{18.3.5}$$
where
$$\tilde q = (1 + v)\,\frac{\int_0^{\rho(x_0)} L(u)\,u\, du}{\rho(x_0)\left(1 - \int_0^{\rho(x_0)} L_0(u)\, du\right)} + v \tag{18.3.6}$$
is less than 1.
Proof. Let x0 ∈ U(x⋆, r), where r satisfies (18.3.1); then q given by (18.3.3) is such that q ∈ (0, 1). Indeed, by the positivity of L, we have
$$q = (1 + v)\,\frac{\int_0^{\rho(x_0)} L(u)\, du}{1 - \int_0^{\rho(x_0)} L_0(u)\, du} + v < (1 + v)\,\frac{\int_0^{r} L(u)\, du}{1 - \int_0^{r} L_0(u)\, du} + v \le 1.$$
Suppose that (notice that x0 ∈ U(x? , r)) xn ∈ U(x? , r), we have by (18.1.3)
where xθ = x⋆ + θ(xn − x⋆). It follows, by Lemmas 18.2.1 and 18.2.2 and conditions (18.1.6) and (18.1.7), that we can obtain in turn
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le \|F'(x_n)^{-1}F'(x^\star)\|\int_0^1 \|F'(x^\star)^{-1}(F'(x_n) - F'(x_\theta))\|\,\|x_n - x^\star\|\, d\theta + \theta_n \|F'(x_n)^{-1}P_n^{-1}\|\,\|P_n F(x_n)\|\\
&\le \frac{1}{1 - \int_0^{\rho(x_n)} L_0(u)\, du}\int_0^1\int_{\theta\rho(x_n)}^{\rho(x_n)} L(u)\, du\;\rho(x_n)\, d\theta + \theta_n \|F'(x_n)^{-1}P_n^{-1}\|\,\|P_n F(x_n)\|.
\end{aligned}$$
If n = 0 in (18.3.1), we get ‖x1 − x⋆‖ ≤ q̃ ‖x0 − x⋆‖ < ‖x0 − x⋆‖. Hence, x1 ∈ U(x⋆, r̃). That is, (INLM) can be continued an infinite number of times. It follows by mathematical induction that all xn belong to U(x⋆, r̃) and ρ(xn ) = ‖xn − x⋆‖ decreases monotonically. Therefore, for all k ≥ 0, from (18.3.7) and Lemma 18.2.4 we get in turn that
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le (1 + v_n)\,\frac{\int_0^{\rho(x_n)} L(u)\,u\, du}{1 - \int_0^{\rho(x_n)} L_0(u)\, du} + v_n\,\rho(x_n)\\
&= (1 + v_n)\,\frac{\varphi_{1,\alpha}(\rho(x_n))}{1 - \int_0^{\rho(x_n)} L_0(u)\, du}\,\rho(x_n)^{1+\alpha} + v_n\,\rho(x_n)\\
&\le (1 + v_n)\,\frac{\varphi_{1,\alpha}(\rho(x_0))}{1 - \int_0^{\rho(x_0)} L_0(u)\, du}\,\rho(x_n)^{1+\alpha} + v_n\,\rho(x_n)\\
&\le (1 + v)\,\frac{\varphi_{1,\alpha}(\rho(x_0))}{1 - \int_0^{\rho(x_0)} L_0(u)\, du}\,\rho(x_n)^{1+\alpha} + v\,\rho(x_n).
\end{aligned}$$
Remark 18.3.2. If L0 = L, our Theorem 18.3.1 reduces to Theorem 18.3.1 in [13] (see also [5]). Otherwise, i.e., if L0 < L, then our Theorem 18.3.1 constitutes an improvement. In particular, for v = 0, the estimates for the radii of the convergence ball for Newton's method are given by
$$\int_0^r L_0(u)\, du \le \frac{1}{2}$$
and
$$\frac{1}{\tilde r} \int_0^{\tilde r} \big(L_0(u)\,\tilde r + L(u)\,u\big)\, du \le 1,$$
which reduce to the ones in [14] if L0 = L. We can then conclude that, for vanishing residual, Theorem 18.3.1 merges into the theory of the Newton method. Besides, if the function Lα defined by (18.2.11) is nondecreasing for α = 1, we improve the result in [5].
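For constant L and L0 both integrals evaluate in closed form, so the two radii can be compared directly. With the constants L0 = e − 1 < L = e of Example 18.5.1 below, the second condition gives r̃ = 2/(2L0 + L), which strictly exceeds the radius 2/(3L) obtained in the L0 = L case of [14]. A small sketch:

```python
import math

L0, L = math.e - 1.0, math.e   # constants from Example 18.5.1

# For constant functions, (1/r) * integral_0^r (L0*r + L*u) du = r (L0 + L/2) <= 1:
r_new = 2.0 / (2.0 * L0 + L)   # radius using both L0 and L
r_old = 2.0 / (3.0 * L)        # the L0 = L specialization

print(r_new, r_old)            # r_new > r_old, since L0 < L
```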
Next, we present a result analogous to Theorem 18.3.1 for the inexact Newton-like method, where Bn = B(xn ) approximates F′(xn ).
Theorem 18.3.3. Suppose x⋆ satisfies (18.1.1), F has a continuous derivative in U(x⋆, r), F′(x⋆)⁻¹ exists and F′(x⋆)⁻¹F′ satisfies the radius Lipschitz condition (18.1.6) and the center Lipschitz condition (18.1.7). Let B(x) be an approximation to F′(x) for all x ∈ U(x⋆, r) such that B(x) is invertible and
$$\|B(x)^{-1}F'(x)\| \le \omega_1, \qquad \|B(x)^{-1}F'(x) - I\| \le \omega_2, \tag{18.3.8}$$
where vn = θn ‖(Pn F′(xn ))⁻¹‖ ‖Pn F′(xn )‖ = θn Cond(Pn F′(xn )) with vn ≤ v < 1. Let r > 0 satisfy
$$\int_0^r L_0(u)\, du < \frac{1 - \omega_2 - \omega_1 v}{1 + \omega_1 - \omega_2}. \tag{18.3.9}$$
Then the (INLM) method is convergent for all x0 ∈ U(x? , r) and
$$\|x_{n+1} - x^\star\| \le \left((1 + v)\,\frac{\omega_1 \int_0^{\rho(x_0)} L(u)\, du}{1 - \int_0^{\rho(x_0)} L_0(u)\, du} + \omega_2 + \omega_1 v\right)\|x_n - x^\star\|, \tag{18.3.10}$$
where
$$q = (1 + v)\,\frac{\omega_1 \int_0^{\rho(x_0)} L(u)\, du}{1 - \int_0^{\rho(x_0)} L_0(u)\, du} + \omega_2 + \omega_1 v \tag{18.3.11}$$
is less than 1. Further, suppose that the function Lα defined by (18.2.11) is nondecreasing
for some α with 0 < α ≤ 1. Let r̃ satisfy
$$(1 + v)\,\frac{\omega_1 \int_0^{\tilde r} L(u)\, du}{1 - \int_0^{\tilde r} L_0(u)\, du} + \omega_2 + \omega_1 v \le 1. \tag{18.3.12}$$
$$\begin{aligned}
x_{n+1} - x^\star &= x_n - x^\star - B_n^{-1}\big(F(x_n) - F(x^\star)\big) + B_n^{-1} r_n\\
&= x_n - x^\star - B_n^{-1}\int_0^1 F'(x_\theta)\, d\theta\,(x_n - x^\star) + B_n^{-1}P_n^{-1}P_n r_n\\
&= -B_n^{-1}F'(x_n)\int_0^1 F'(x_n)^{-1}F'(x^\star)\,F'(x^\star)^{-1}\big(F'(x^\star) - F'(x_\theta)\big)(x_n - x^\star)\, d\theta\\
&\quad + B_n^{-1}\big(F'(x_n) - B_n\big)(x_n - x^\star) + B_n^{-1}P_n^{-1}P_n r_n,
\end{aligned}$$
where xθ = x⋆ + θ(xn − x⋆). Using Lemmas 18.2.1 and 18.2.2 and condition (18.3.8), we obtain
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le \|B_n^{-1}F'(x_n)\|\int_0^1 \|F'(x_n)^{-1}F'(x^\star)\|\,\|F'(x^\star)^{-1}(F'(x^\star) - F'(x_\theta))\|\,\|x_n - x^\star\|\, d\theta\\
&\quad + \|B_n^{-1}(F'(x_n) - B_n)\|\,\|x_n - x^\star\| + \theta_n \|B_n^{-1}P_n^{-1}\|\,\|P_n F(x_n)\|\\
&\le \frac{\omega_1}{1 - \int_0^{\rho(x_n)} L_0(u)\, du}\int_0^1\int_{\theta\rho(x_n)}^{\rho(x_n)} L(u)\, du\;\rho(x_n)\, d\theta + \omega_2\,\rho(x_n)\\
&\quad + \theta_n \|P_n^{-1}F'(x_n)\|\,\|(P_n F'(x_n))^{-1}\|\,\|P_n F'(x_n)\|\,\|F'(x_n)^{-1}F(x_n)\|\\
&\le \frac{\omega_1\int_0^{\rho(x_n)} L(u)\,u\, du}{1 - \int_0^{\rho(x_n)} L_0(u)\, du} + \omega_2\,\rho(x_n) + \omega_1 v_n\left(\rho(x_n) + \frac{\int_0^{\rho(x_n)} L(u)\,u\, du}{1 - \int_0^{\rho(x_n)} L_0(u)\, du}\right)\\
&\le (1 + v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)\,u\, du}{1 - \int_0^{\rho(x_n)} L_0(u)\, du} + (\omega_2 + \omega_1 v_n)\,\rho(x_n) \qquad (18.3.15)\\
&\le \left((1 + v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)\,u\, du}{\rho(x_n)\left(1 - \int_0^{\rho(x_n)} L_0(u)\, du\right)} + \omega_2 + \omega_1 v_n\right)\rho(x_n).
\end{aligned}$$
If n = 0 in (18.3.15), we obtain ‖x1 − x⋆‖ ≤ q‖x0 − x⋆‖ < ‖x0 − x⋆‖. Hence x1 ∈ U(x⋆, r); this shows that the iteration can be continued an infinite number of times. By mathematical induction, xn ∈ U(x⋆, r) and ρ(xn ) = ‖xn − x⋆‖ decreases monotonically. Therefore, for all n ≥ 0, we have in turn that
$$\|x_{n+1} - x^\star\| \le \left((1 + v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)\, du}{1 - \int_0^{\rho(x_n)} L_0(u)\, du} + \omega_2 + \omega_1 v_n\right)\rho(x_n) \le \left((1 + v)\,\frac{\omega_1\int_0^{\rho(x_0)} L(u)\, du}{1 - \int_0^{\rho(x_0)} L_0(u)\, du} + \omega_2 + \omega_1 v\right)\rho(x_n).$$
If n = 0 in (18.3.15), we obtain ‖x1 − x⋆‖ ≤ q̃‖x0 − x⋆‖ < ‖x0 − x⋆‖. Hence, x1 ∈ U(x⋆, r̃); this shows that (18.1.4) can be continued an infinite number of times. By mathematical induction, xn ∈ U(x⋆, r̃) and ρ(xn ) = ‖xn − x⋆‖ decreases monotonically. Therefore, for all n ≥ 0, we have
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le (1 + v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)\,u\, du}{1 - \int_0^{\rho(x_n)} L_0(u)\, du} + (\omega_2 + \omega_1 v_n)\,\rho(x_n)\\
&\le (1 + v)\,\frac{\omega_1\,\varphi_{1,\alpha}(\rho(x_n))}{1 - \int_0^{\rho(x_0)} L_0(u)\, du}\,\rho(x_n)^{1+\alpha} + (\omega_2 + \omega_1 v)\,\rho(x_n)\\
&\le (1 + v)\,\frac{\omega_1\,\varphi_{1,\alpha}(\rho(x_0))}{1 - \int_0^{\rho(x_0)} L_0(u)\, du}\,\rho(x_n)^{1+\alpha} + (\omega_2 + \omega_1 v)\,\rho(x_n).
\end{aligned}$$
Remark 18.3.4. If L0 = L, our Theorem 18.3.3 reduces to Theorem 18.3.2 in [13] (see also [5]). Otherwise, i.e., if L0 < L, then our Theorem 18.3.3 constitutes an improvement. If, moreover, the function Lα defined by (18.2.11) is nondecreasing for α = 1, we improve the result of [5]. In particular, for v = 0, we can get the radii of the convergence ball for the Newton-like method [14].
(b) The results in section 18.5 of [5, 13] using only center-Lipschitz condition can be
improved, if rewritten using L0 instead of L .
18.5. Examples
Finally, we provide an example where L0 < L .
Example 18.5.1. Let X = Y = R3 , D = U(0, 1) and x? = (0, 0, 0). Define function F on D
for w = (x, y, z) by
$$F(w) = \left(e^x - 1,\; \frac{e-1}{2}\,y^2 + y,\; z\right). \tag{18.5.1}$$
Then, the Fréchet derivative of F is given by
$$F'(w) = \begin{pmatrix} e^x & 0 & 0\\ 0 & (e - 1)\,y + 1 & 0\\ 0 & 0 & 1 \end{pmatrix}.$$
Notice that we have F(x⋆) = 0, F′(x⋆) = F′(x⋆)⁻¹ = diag{1, 1, 1} and L0 = e − 1 < L = e.
More examples where L0 < L can be found in [1, 2, 3].
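A small numerical check of the claimed center-Lipschitz constant L0 = e − 1 for this example (sup-norm, grid sampling — an illustrative sketch, not a proof):

```python
import math

E = math.e

def Fprime_diag(w):
    # Fréchet derivative of F(w) = (e^x - 1, ((e-1)/2) y^2 + y, z):
    # the diagonal matrix diag(e^x, (e-1) y + 1, 1).
    x, y, _ = w
    return (math.exp(x), (E - 1.0) * y + 1.0, 1.0)

def center_ratio(w):
    # ||F'(x*)^{-1}(F'(w) - F'(x*))|| / ||w - x*|| with x* = 0, F'(x*) = I;
    # for a diagonal operator the sup-norm is the largest |entry|.
    d = Fprime_diag(w)
    num = max(abs(d[0] - 1.0), abs(d[1] - 1.0), abs(d[2] - 1.0))
    return num / max(abs(w[0]), abs(w[1]), abs(w[2]))

# Largest observed ratio over a grid in U(0, 1): it should not exceed L0 = e - 1.
grid = [(i / 10.0, j / 10.0, 0.0) for i in range(-10, 11) for j in range(-10, 11)
        if (i, j) != (0, 0)]
worst = max(center_ratio(w) for w in grid)
print(worst, E - 1.0)  # both ≈ e - 1; the bound is attained at x = 1
```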
18.6. Conclusion
Under the hypothesis that F 0 (x? )F 0 satisfies the center Lipschitz condition (18.1.7) and the
radius Lipschitz condition (18.1.6), we presented a more precise local convergence analysis
for the inexact Newton method under the same computational cost as in earlier studies such
as Chen and Li [5], Zhang, Li and Xie [13]. Numerical examples are provided to show that
the center Lipschitz function can be smaller than the radius Lipschitz function.
References
[1] Argyros, I.K., Cho, Y.J, Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press, Taylor and Francis, New York, 2012.
[2] Argyros, I.K., Hilout, S., Weak convergence conditions for inexact Newton-type meth-
ods, App. Math. Comp., 218 (2011), 2800-2809.
[3] Argyros, I.K., Hilout, S., On the semilocal convergence of a Damped Newton’s
method, Appl.Math. Comput., 219, 5(2012), 2808-2824.
[4] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.
[5] Chen, J., Li, W., Convergence behaviour of inexact Newton methods under weaker
Lipschitz condition, J. Comput. Appl. Math., 191 (2006), 143-164.
[6] Dembo, R.S., Eisenstat, S.C., Steihaug, T., Inexact Newton methods, SIAM J. Numer.
Anal., 19 (1982), 400-408.
[7] Kantorovich, L.V., Akilov, G.P., Functional Analysis, 2nd ed., Pergamon Press, Ox-
ford, 1982.
[8] Morini, B., Convergence behaviour of inexact Newton method, Math. Comp. 61
(1999), 1605-1613.
[9] Ortega, J.M., Rheinholdt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.
[10] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38- 62.
[11] Rheinholdt, W.C., An adaptive continuation process for solving systems of nonlinear
equations, Polish Academy of Science, Banach Ctr. Publ. 3 (1977), 129-142.
[12] Traub, J.F., Iterative methods for the solution of equations, Prentice- Hall Series in
Automatic Computation, Englewood Cliffs, N. J., 1964.
[13] Zhang, H., Li, W., Xie, L., Convergence of inexact Newton methods of nonlinear
operator equations in Banach spaces.
[14] Wang, X., Convergence of Newton’s method and uniqueness of the solution of equa-
tions in Banach space, IMA J. Numer. Anal., 20 (2000), 123-134.
Chapter 19
19.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solu-
tion x? of equation
F(x) = 0, (19.1.1)
where F is a Fréchet–differentiable operator defined on a convex subset D of a Banach
space X with values in a Banach space Y .
A vast number of problems from applied science, including engineering, can be solved by finding the solutions of equations of the form (19.1.1) using mathematical
modelling [7, 10, 15, 18]. For example, dynamic systems are mathematically modeled by
difference or differential equations, and their solutions usually represent the states of the
systems. Except in special cases, the solutions of these equations cannot be found in closed
form. This is the main reason why the most commonly used solution methods are iterative.
Iteration methods are also applied for solving optimization problems. In such cases, the it-
eration sequences converge to an optimal solution of the problem at hand. Since all of these
methods have the same recursive structure, they can be introduced and discussed in a gen-
eral framework. The convergence analysis of iterative methods is usually divided into two
categories: semilocal and local convergence analysis. In the semilocal convergence analy-
sis one derives convergence criteria from the information around an initial point whereas in
the local analysis one finds estimates of the radii of convergence balls from the information
around a solution.
We consider the Secant method in the form
$$x_{n+1} = x_n - \delta F(x_{n-1}, x_n)^{-1} F(x_n) \quad \text{for each } n = 0, 1, 2, \ldots, \tag{19.1.2}$$
where x−1 , x0 are initial points and δF(x, y) ∈ L (X , Y ) (x, y ∈ D ), the space of bounded linear operators from X into Y , is a consistent approximation of the Fréchet–derivative of F [15, 18].
The semilocal convergence matter is, based on the information around an initial point,
to give criteria ensuring the convergence of iteration procedures. A very important problem
in the study of iterative procedures is the convergence domain. In general the convergence domain is small. The Secant method was first studied by S. Ulm [25]. The first semilocal convergence analysis was given by P. Laasonen [21]. His results were improved by F. A. Potra and V. Pták [20, 21, 22]. A
semilocal convergence analysis for general secant-type methods was given in general by J.
E. Dennis [13]. Bosarge and Falb [9], Dennis [10], Potra [20, 21, 22], Argyros [5, 6, 7, 8],
Hernández et al. [13] and others [14], [18], [26], have provided sufficient convergence
conditions for the Secant method based on Lipschitz–type conditions on δF. Moreover,
there exist new graphical tools to study this kind of methods [17].
The conditions usually associated with the semilocal convergence of the Secant method (19.1.2) are
$$\|A^{-1} F(x_0)\| \le \eta,$$
where A = δF(x−1 , x0 ), together with Lipschitz-type conditions on δF and the criterion
$$\ell c + 2\sqrt{\ell\eta} \le 1. \tag{19.1.3}$$
The sufficient convergence condition (19.1.3) is easily violated (see the Numerical Ex-
amples). Hence, there is no guarantee in these cases that equation (19.1.1) under the in-
formation (`, c, η) has a solution that can be found using Secant method (19.1.2). In this
chapter we are motivated by optimization considerations, and the above observation.
Expanding the Applicability of Secant Method with Applications 339
The use of Lipschitz and center–Lipschitz conditions is one way to enlarge the convergence domain of iterative methods. This technique consists of using both conditions together, instead of only the Lipschitz condition, which allows us to find a finer majorizing sequence, that is, a larger convergence domain. It has been used in order to find weaker convergence criteria for Newton's method by Argyros in [8]. Gutiérrez et al. in [12] gave sufficient conditions for Newton's method using both Lipschitz and center-Lipschitz conditions and for the damped Newton's method, and Amat et al. in [3, 4] and García-Olivo [11] did so for other methods.
Here using Lipschitz and center–Lipschitz conditions, we provide a new semilocal con-
vergence analysis for (19.1.2). It turns out that our new convergence criteria can always be
weaker than the old ones given in earlier studies such as [2, 14, 16, 18, 20, 21, 22, 23, 26,
27]. The chapter is organized as follows: The semilocal convergence analysis of the secant
method is presented in Section 19.2. Numerical examples are provided in Section 19.3.
Lemma 19.2.1. Let ℓ0 > 0, ℓ > 0, c > 0 and η > 0 be constants with ℓ0 ≤ ℓ, and set
$$b = \frac{2}{1 + \sqrt{1 + 4\,\ell_0/\ell}}.$$
Then, the following items hold:
(i)
$$0 < \frac{\ell(c+\eta)}{1 - \ell_0(c+\eta)} \le \frac{2\ell}{\ell + \sqrt{\ell^2 + 4\ell_0\ell}} < \frac{1 - \ell_0(c+\eta)}{1 - \ell_0 c} \;\Longleftrightarrow\; c + \eta \le \frac{4\ell}{\left(\ell + \sqrt{\ell^2 + 4\ell_0\ell}\right)^2}; \tag{19.2.1}$$
(ii)
$$\ell c \le \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}} \;\Longleftrightarrow\; \frac{(1 - \ell c)^2}{4} \le b^2 - \ell c; \tag{19.2.2}$$
(iii)
$$\ell c \ge \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}} \;\Longleftrightarrow\; \frac{(1 - \ell c)^2}{4} \ge b^2 - \ell c; \tag{19.2.3}$$
(iv)
$$\ell c \le \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}} \ \text{ and } \ \ell c + 2\sqrt{\ell\eta} \le 1 \;\Longrightarrow\; c + \eta \le \frac{4\ell}{\left(\ell + \sqrt{\ell^2 + 4\ell_0\ell}\right)^2}; \tag{19.2.4}$$
(v)
$$\ell c \ge \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}} \ \text{ and } \ c + \eta \le \frac{4\ell}{\left(\ell + \sqrt{\ell^2 + 4\ell_0\ell}\right)^2} \;\Longrightarrow\; \ell c + 2\sqrt{\ell\eta} \le 1. \tag{19.2.5}$$
Proof. Let x = 1 − ℓc, y = ℓη, a = ℓ0 /ℓ and b = 2/(1 + √(1 + 4a)). Then, we have that ab² + b − 1 = 0 and ab + 1 = 1/b.
(i) The triple inequality in (19.2.1) holds, if
$$\frac{\ell c + \ell\eta}{1 - a\ell(c+\eta)} \le \frac{2\ell}{\ell + \sqrt{\ell^2 + 4a\ell^2}} = b, \tag{19.2.6}$$
$$b < \frac{1 - a\ell(c+\eta)}{1 - a\ell c} \tag{19.2.7}$$
and
$$\ell(c+\eta) < \frac{1}{a} \tag{19.2.8}$$
or, if
$$y \le b^2 - (1 - x), \tag{19.2.9}$$
$$y < \frac{1-b}{a} - (1-b)(1-x) = b^2 - (1-b)(1-x), \tag{19.2.10}$$
and
$$y \le \frac{1}{a} - (1 - x), \tag{19.2.11}$$
respectively. We have that ab² = 1 − b < 1 by the definition of a and b. It follows that
$$b^2 - (1 - x) < \frac{1}{a} - (1 - x) \tag{19.2.12}$$
and from (1 − b)(1 − x) < (1 − x) we get that
$$b^2 - (1 - x) < b^2 - (1 - b)(1 - x). \tag{19.2.13}$$
Hence, it follows from (19.2.12) and (19.2.13) that (19.2.6)–(19.2.8) are satisfied if (19.2.9) holds. But (19.2.9) is equivalent to the right hand side inequality in (19.2.1). Conversely, if the right hand side inequality in (19.2.1) holds, then (19.2.9), (19.2.12) and (19.2.13) imply (19.2.10) and (19.2.11), which in turn imply (19.2.6)–(19.2.8), and these imply the triple inequality in (19.2.1).
(ii)
$$\ell c \le \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}} \;\Longleftrightarrow\; 2(1-b) \le x \le 2(1+b) \;\Longleftrightarrow\; x^2 - 4x + 4(1 - b^2) \le 0 \;\Longleftrightarrow\; \frac{x^2}{4} \le b^2 - (1 - x) \;\Longleftrightarrow\; \frac{(1 - \ell c)^2}{4} \le b^2 - \ell c.$$
(iii)
$$\frac{(1 - \ell c)^2}{4} \ge b^2 - \ell c \;\Longleftrightarrow\; \frac{x^2}{4} \ge b^2 - (1 - x) \;\Longleftrightarrow\; x^2 - 4x + 4(1 - b^2) \ge 0 \;\Longrightarrow\; x \le 2(1 - b) \;\Longleftrightarrow\; \ell c \ge \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}}.$$
We need the following result on majorizing sequences for the Secant method (19.1.2).
Lemma 19.2.2. Let ℓ0 > 0, ℓ > 0, c > 0, and η > 0 be constants with ℓ0 ≤ ℓ. Suppose:
$$c + \eta \le \frac{4\ell}{\left(\ell + \sqrt{\ell^2 + 4\ell_0\ell}\right)^2}. \tag{19.2.14}$$
Then, the scalar sequence {tn } (n ≥ −1) given by
$$t_{-1} = 0, \quad t_0 = c, \quad t_1 = c + \eta, \quad t_{n+2} = t_{n+1} + \frac{\ell\,(t_{n+1} - t_{n-1})(t_{n+1} - t_n)}{1 - \ell_0\,(t_{n+1} - t_0 + t_n)} \tag{19.2.15}$$
is increasing, bounded from above by
$$t^{\star\star} = \frac{\eta}{1 - b} + c, \tag{19.2.16}$$
and converges to some t⋆ with
$$c + \eta \le t^\star \le t^{\star\star}. \tag{19.2.17}$$
Proof. We must show that
$$0 < \frac{\ell\,(t_1 - t_{-1})}{1 - \ell_0 t_1} \le b, \quad\text{that is,}\quad 0 < \frac{\ell\,(c + \eta)}{1 - \ell_0(c + \eta)} \le b,$$
which is true by (19.2.1) and (19.2.14). Assume now that (19.2.19) holds for k ≤ n + 1.
$$\ell\,(b^k + b^{k+1})\,\eta + \frac{b\,\ell_0}{1 - b}\,(2 - b^{k+1} - b^{k+2})\,\eta + b\,\ell_0\, c \le b \tag{19.2.22}$$
or
$$\ell\,(b^{k-1} + b^k)\,\eta + \ell_0\big((1 + b + \cdots + b^k) + (1 + b + \cdots + b^{k+1})\big)\,\eta + \ell_0\, c - 1 \le 0. \tag{19.2.23}$$
We denote by U(z, ρ) the open ball centered at z ∈ X and of radius ρ > 0. We also denote by Ū(z, ρ) the closure of U(z, ρ). We shall study the Secant method (19.1.2) for triplets (F, x−1 , x0 ) belonging to the class C (ℓ, ℓ0 , η, c) defined as follows:
Definition 19.2.3. Let `, `0 , η, c be positive constants satisfying the hypotheses of Lemma
19.2.2.
We say that a triplet (F, x−1 , x0 ) belongs to the class C (`, `0, η, c) if:
(c1 ) F is a nonlinear operator defined on a convex subset D of a Banach space X with
values in a Banach space Y ;
(c2 ) x−1 and x0 are two points belonging to the interior D 0 of D and satisfying the in-
equality
k x0 − x−1 k≤ c;
(c3 ) F is Fréchet–differentiable on D 0 , and there exists an operator δF : D 0 × D 0 →
L (X , Y ) such that:
the linear operator A = δF(x−1 , x0 ) is invertible, its inverse A⁻¹ is bounded, and:
$$\|A^{-1} F(x_0)\| \le \eta;$$
$$\|A^{-1}[\delta F(x, y) - F'(z)]\| \le \ell\,(\|x - z\| + \|y - z\|);$$
$$\|A^{-1}[\delta F(x, y) - F'(x_0)]\| \le \ell_0\,(\|x - x_0\| + \|y - x_0\|)$$
for all x, y, z ∈ D .
344 Ioannis K. Argyros and Á. Alberto Magreñán
We present the following semilocal convergence theorem for Secant method (19.1.2).
Theorem 19.2.4. If (F, x−1 , x0 ) ∈ C (`, `0, η, c), then sequence {xn } (n ≥ −1) generated by
Secant method (19.1.2) is well defined, remains in U(x0 ,t ? − t0 ) for all n ≥ 0 and converges
to a unique solution x? ∈ U(x0 ,t ? − t0 ) of equation F(x) = 0. Moreover the following
estimates hold for all n ≥ 0:
$$\|x_n - x_{n+1}\| \le t_{n+1} - t_n \tag{19.2.28}$$
and
$$\|x_n - x^\star\| \le t^\star - t_n, \tag{19.2.29}$$
where the sequence {tn } (n ≥ 0) is given by (19.2.15). Furthermore, if there exists R ≥ t⋆ − t0 ,
such that
$$\ell_0\left(c + \frac{\eta}{1 - b} + R\right) \le 1, \tag{19.2.30}$$
and
U(x0 , R) ⊆ D , (19.2.31)
then, the solution x? is unique in U(x0 , R).
By the identity
$$F(x) - F(y) = \int_0^1 F'(y + t(x - y))\, dt\,(x - y) \tag{19.2.35}$$
we get
$$\|A_0^{-1}[F(x) - F(y) - F'(u)(x - y)]\| \le \ell\,(\|x - u\| + \|y - u\|)\,\|x - y\| \tag{19.2.36}$$
and
$$\|A_0^{-1}[F(x) - F(y) - \delta F(u, v)(x - y)]\| \le \ell\,(\|x - v\| + \|y - v\| + \|u - v\|)\,\|x - y\| \tag{19.2.37}$$
for all x, y, u, v ∈ D 0 . By a continuity argument (19.2.34)–(19.2.37) remain valid if x and/or
y belong to Dc . We first show (19.2.28). If (19.2.28) holds for all n ≤ k and if {xn } (n ≥ 0)
is well defined for n = 0, 1, 2, · · · , k then
k x0 − xn k≤ tn − t0 < t ? − t0 , n ≤ k. (19.2.38)
That is, (19.1.2) is well defined for n = k + 1. For n = −1 and n = 0, (19.2.28) reduces to ‖x−1 − x0 ‖ ≤ c and ‖x0 − x1 ‖ ≤ η. Suppose (19.2.28) holds for n = −1, 0, 1, · · · , k (k ≥ 0). Using (19.2.33), (19.2.37) and
we obtain in turn:
and
$$\begin{aligned}
\|x_{k+2} - x_{k+1}\| &= \|\delta F(x_k, x_{k+1})^{-1} F(x_{k+1})\|\\
&\le \|\delta F(x_k, x_{k+1})^{-1} A\|\,\|A^{-1} F(x_{k+1})\|\\
&\le \frac{\ell\,(t_{k+1} - t_k + t_k - t_{k-1})}{1 - \ell_0\,(t_{k+1} - t_0 + t_k - t_0 + t_0 - t_{-1})}\,(t_{k+1} - t_k) \qquad (19.2.41)\\
&= t_{k+2} - t_{k+1}.
\end{aligned}$$
The induction for (19.2.28) is completed. It follows from (19.2.28) and Lemma 19.2.2
that sequence {xn } (n ≥ −1) is complete in a Banach space X , and as such it converges to
some x? ∈ U(x0 ,t ? − t0 ) (since U(x0 ,t ? − t0 ) is a closed set). By letting k → ∞ in (19.2.41),
we obtain F(x? ) = 0. Estimate (19.2.29) follows from (19.2.28) by using standard ma-
joration techniques [7, 15, 18, 22]. We shall first show uniqueness in U(x0 ,t ? − t0 ). Let
y? ∈ U(x0 ,t ? − t0 ) be a solution of equation (19.1.1).
Set Z 1
M= F 0 (y? + t (y? − x? )) dt.
0
It then follows by (c3 ) that
$$\begin{aligned}
\|A^{-1}(A - M)\| &\le \ell_0\,(\|y^\star - x_0\| + \|x^\star - x_0\| + \|x_0 - x_{-1}\|)\\
&\le \ell_0\,\big((t^\star - t_0) + (t^\star - t_0) + t_0\big)\\
&\le \ell_0\left(2\left(\frac{\eta}{1 - b} + c\right) - c\right) \qquad (19.2.42)\\
&= \ell_0\left(\frac{2\eta}{1 - b} + c\right) < 1.
\end{aligned}$$
It follows from (19.2.42) and the Banach lemma on invertible operators that M⁻¹ exists on U(x0 , t⋆ − t0 ). Using the identity:
F(x? ) − F(y? ) = M (x? − y? ) (19.2.43)
we deduce x⋆ = y⋆. Finally, we shall show uniqueness in U(x0 , R). As in (19.2.42), we arrive at
$$\|A^{-1}(A - M)\| < \ell_0\left(\frac{\eta}{1 - b} + c + R\right) \le 1,$$
by (19.2.30).
Remark 19.2.5. (a) Let us define the majorizing sequence {wn } used in earlier studies such as [2, 14, 16, 18, 20, 21, 22, 23, 26, 27] (under condition (19.1.3)):
$$w_{-1} = 0, \quad w_0 = c, \quad w_1 = c + \eta, \quad w_{n+2} = w_{n+1} + \frac{\ell\,(w_{n+1} - w_{n-1})(w_{n+1} - w_n)}{1 - \ell\,(w_{n+1} - w_0 + w_n)}. \tag{19.2.44}$$
Note that in general
$$\ell_0 \le \ell \tag{19.2.45}$$
holds, and ℓ/ℓ0 can be arbitrarily large [5, 6, 7, 8]. In the case ℓ0 = ℓ, we have tn = wn (n ≥ −1). Otherwise:
$$t_{n+1} - t_n \le w_{n+1} - w_n, \tag{19.2.46}$$
$$0 \le t^\star - t_n \le w^\star - w_n, \qquad w^\star = \lim_{n\to\infty} w_n. \tag{19.2.47}$$
Note also that strict inequality holds in (19.2.46) for n ≥ 1 if ℓ0 < ℓ. It is worth noticing that the center-Lipschitz condition is not an additional hypothesis to the Lipschitz condition, since in practice the computation of the constant ℓ requires the computation of ℓ0 . It follows from the proof of Theorem 19.2.4 that the sequence {sn } defined by
$$s_{-1} = 0, \quad s_0 = c, \quad s_1 = c + \eta, \quad s_2 = s_1 + \frac{\ell_0\,(s_1 - s_{-1})(s_1 - s_0)}{1 - \ell_0 s_1},$$
$$s_{n+2} = s_{n+1} + \frac{\ell\,(s_{n+1} - s_{n-1})(s_{n+1} - s_n)}{1 - \ell_0\,(s_{n+1} - s_0 + s_n)} \quad \text{for } n = 1, 2, \ldots$$
is also a majorizing sequence for {xn } which is tighter than {tn }.
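The three sequences are easy to compare numerically. The sketch below uses the data of the second numerical example of Section 19.3 (k = 0.99, x−1 = 1.2, so ℓ = 6/3.64, ℓ0 = 9/(2 · 3.64), c = 0.2 and η = 0.01/3.64 — values taken from that example):

```python
# {w_n}: classical majorizing sequence (19.2.44); {t_n}: (19.2.15); {s_n}: above.
l, l0 = 6.0 / 3.64, 9.0 / (2.0 * 3.64)
c, eta = 0.2, 0.01 / 3.64

def step(seq, lip, cen):
    # common recursion: next = last + lip (last - prev2)(last - prev) /
    #                          (1 - cen (last - seq[1] + prev)),  seq[1] = t_0 = c
    prev2, prev, last = seq[-3], seq[-2], seq[-1]
    return last + lip * (last - prev2) * (last - prev) / (1.0 - cen * (last - seq[1] + prev))

w = [0.0, c, c + eta]
t = [0.0, c, c + eta]
s = [0.0, c, c + eta]
s.append(s[2] + l0 * (s[2] - s[0]) * (s[2] - s[1]) / (1.0 - l0 * s[2]))
for _ in range(20):
    w.append(step(w, l, l))      # Lipschitz constant everywhere
    t.append(step(t, l, l0))     # center constant in the denominator
    s.append(step(s, l, l0))

print(s[-1], t[-1], w[-1])       # tighter sequences give smaller limits
```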
(b) In practice constant c depends on initial guesses x−1 and x0 which can be chosen to be
as close to each other as we wish. Therefore, in particular, we can always choose
$$\ell c < \frac{3 - \sqrt{1 + 4\,\ell_0/\ell}}{1 + \sqrt{1 + 4\,\ell_0/\ell}},$$
which according to (iv) in Lemma 19.2.1 implies that the new sufficient convergence
criterion (19.2.14) is weaker than the old one given by (19.1.3).
It is well known that this problem can be formulated as the integral equation
$$u(s) = s + \int_0^1 Q(s, t)\,\big(u^3(t) + \gamma\, u^2(t)\big)\, dt. \tag{19.3.1}$$
We observe that
$$\max_{0 \le s \le 1} \int_0^1 |Q(s, t)|\, dt = \frac{1}{8}.$$
Then problem (19.3.1) is in the form (19.1.1), where F : D → Y is defined as
$$[F(x)](s) = x(s) - s - \int_0^1 Q(s, t)\,\big(x^3(t) + \gamma\, x^2(t)\big)\, dt.$$
We choose x−1 (s) such that ‖x−1 − x0 ‖ ≤ c and k0 c < 1. Then, we have
$$\|\delta F(x_{-1}, x_0)^{-1} F'(x_0)\| \le \frac{1}{1 - k_0 c},$$
where k0 is such that
$$\|F'(x_0)^{-1}(F'(x_0) - A_0)\| \le k_0 c.$$
Set u0 (s) = s and D = U(u0 , R). It is easy to verify that U(u0 , R) ⊂ U(0, R + 1), since ‖u0 ‖ = 1. If 2γ < 5 and k0 c < 1, the operator F′ satisfies the conditions of Theorem 19.2.4, with
$$\eta = \frac{1 + \gamma}{(1 - k_0 c)(5 - 2\gamma)}, \qquad l = \frac{\gamma + 6R + 3}{8\,(5 - 2\gamma)(1 - k_0 c)}, \qquad l_0 = \frac{2\gamma + 3R + 6}{16\,(5 - 2\gamma)(1 - k_0 c)}.$$
Choosing, for instance, γ = 0.5, R = 0.9 and c = 1, we obtain
k0 = 0.1938137822 . . .,
η = 0.465153 . . .,
l = 0.344989 . . .
and
l0 = 0.187999 . . ..
Then, criterion (19.1.3) is not satisfied, since lc + 2√(lη) = 1.14617 . . . > 1, but criterion (19.2.14) is satisfied, since
$$\eta + c = 1.46515\ldots \le \frac{4l}{\left(l + \sqrt{l^2 + 4 l_0 l}\right)^2} = 1.49682\ldots.$$
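These numbers are easy to reproduce; the sketch below uses γ = 0.5, R = 0.9 and c = 1, choices inferred from the printed values and therefore illustrative:

```python
import math

k0c = 0.1938137822          # k0 * c with c = 1
gamma, R, c = 0.5, 0.9, 1.0
eta = (1 + gamma) / ((1 - k0c) * (5 - 2 * gamma))
l = (gamma + 6 * R + 3) / (8 * (5 - 2 * gamma) * (1 - k0c))
l0 = (2 * gamma + 3 * R + 6) / (16 * (5 - 2 * gamma) * (1 - k0c))

old = l * c + 2 * math.sqrt(l * eta)                         # criterion (19.1.3)
bound = 4 * l / (l + math.sqrt(l * l + 4 * l0 * l)) ** 2     # RHS of (19.2.14)

print(old, c + eta, bound)   # ≈ 1.14617 > 1, while 1.46515 <= 1.49682
```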
$$F(x) = x^3 - k,$$
where k ∈ ℝ, and we are going to apply the Secant method to find the solution of F(x) = 0. We take the starting point x0 = 1, we consider the domain Ω = B(x0 , 1) and we let x−1 be free, in order to find a relation between k and x−1 for which criterion (19.1.3) is not satisfied but the new criterion (19.2.14) is satisfied. In this case, we obtain
$$l = \frac{6}{|1 + x_{-1} + x_{-1}^2|}, \qquad l_0 = \frac{9}{2\,|1 + x_{-1} + x_{-1}^2|}.$$
Taking all this data into account we obtain the following criteria:
p(t) = −73 + 24k + (22 + 48k)t + (−111 + 72k)t^2 + (−38 + 48k)t^3 + (−25 + 24k)t^4,
p(t) = −49 + 24k + (22 + 48k)t + (−111 + 72k)t^2 + (−62 + 48k)t^3 + (−25 + 24k)t^4,
p(t) = 25 + 24k + (−118 + 48k)t + (−33 + 72k)t^2 + (−58 + 48k)t^3 + (−23 + 24k)t^4,
p(t) = 1 + 24k + (−118 + 48k)t + (−33 + 72k)t^2 + (−34 + 48k)t^3 + (−23 + 24k)t^4.
Now we consider a case in which both criteria (19.1.3) and (19.2.14) are satisfied to
compare the majorizing sequences. We choose k = 0.99 and x−1 = 1.2 and we obtain
Table 19.3.1. Comparison between the sequences {sn }, {tn } and {wn }
Conclusion
We present a new semilocal convergence analysis for the secant method in order to approximate a locally unique solution of a nonlinear equation in a Banach space setting. We
showed that the new convergence criteria can always be weaker than the corresponding ones
in earlier studies such as [2, 14, 16, 18, 20, 21, 22, 23, 26, 27]. Numerical examples where
the old results cannot guarantee the convergence but our new convergence criteria can are
also provided in this chapter.
References
[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004), 397-405.
[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, Int. J. Comp. Math., 81(8) (2004), 1153-1161.
[3] Amat, S., Busquier, S., Magreñán, Á.A., Reducing chaos and bifurcations in
Newton-type methods, Abst. Appl. Anal., 2013 (2013), Article ID 726701, 10 pages,
http://dx.doi.org/10.1155/2013/726701.
[4] Amat, S., Magreñán, Á.A., Romero, N., On a family of two step Newton-type meth-
ods, App. Math. Comp., 219(4) (2013), 11341-11347.
[5] Argyros, I.K., A unifying local–semilocal convergence analysis and applications for
two–point Newton–like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374-397.
[6] Argyros, I.K., New sufficient convergence conditions for the Secant method, Czechoslovak Math. J., 55 (2005), 175-187.
[8] Argyros, I.K., Weaker conditions for the convergence of Newton’s method, J. Com-
plexity, 28 (2012), 364-387.
[9] Bosarge, W.E., Falb, P.L., A multipoint method of third order, J. Optimiz. Th. Appl., 4
(1969), 156-166.
[10] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Nonl.
Funct. Anal. Appl. (L.B. Rall, ed.), Academic Press, New York, (1971), 425-472.
[11] Garcı́a-Olivo, M., El método de Chebyshev para el cálculo de las raı́ces de ecuaciones
no lineales (PhD Thesis), Servicio de Publicaciones, Universidad de La Rioja, 2013.
http://dialnet.unirioja.es/descarga/tesis/37844.pdf
[13] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by Secant–like method, Appl. Math. Comp., 169 (2005), 926-942.
[14] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant–like methods for solving non-
linear integral equations of the Hammerstein type, J. Comp. Appl. Math., 115 (2000),
245-254.
[15] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[16] Laasonen, P., Ein überquadratisch konvergenter iterativer algorithmus, Ann. Acad. Sci.
Fenn. Ser I, 450 (1969), 1–10.
[17] Magreñán, Á.A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215-224.
[18] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.
[19] Ostrowski, A.M., Solution of equations in Euclidian and Banach Spaces, Academic
Press, New York, 1972.
[20] Potra, F.A., An error analysis for the secant method, Numer. Math., 38 (1982), 427-
445.
[21] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71-84.
[22] Potra, F.A., Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, New
York, 1984.
[23] Schmidt, J.W., Untere Fehlerschranken fur Regula–Falsi Verhafren, Period. Hungar.,
9 (1978), 241-247.
[24] Sergeev, A.S., On the method of Chords (in Russian), Sibirsk, Math. J., 11, (1961),
282–289.
[25] Ulm, S., Majorant principle and the method of Chords (in Russian), Izv. Akad. Nauk
Eston. SSR, Ser. Fiz.-Mat., 13 (1964), 217-227.
[26] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545-557.
[27] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153-174.
Chapter 20

Third Order Family of Methods in Banach Spaces
20.1. Introduction
In this study we are concerned with the problem of approximating a locally unique solution
x∗ of the equation
F(x) = 0, (20.1.1)
where F is a Fréchet-differentiable operator defined on a convex subset D of a Banach space
X with values in a Banach space Y.
Many problems in computational mathematics and other disciplines can be brought in
a form like (20.1.1) using mathematical modelling [1, 3, 11, 15, 18, 19]. The solutions of
these equations can rarely be found in closed form. That is why most solution methods
for these equations are usually iterative. In particular the practice of Numerical Functional
Analysis for finding such solutions is essentially connected to Newton-like methods [1, 3,
15, 17, 18, 19]. The study of convergence of iterative procedures normally centers on two types: semilocal and local convergence analysis. The semilocal convergence analysis is based on the information around an initial point and gives criteria ensuring the convergence of the iterative procedures, while the local analysis is based on the information around a solution and finds estimates of the radii of convergence balls. There exist many studies which
deal with the local and the semilocal convergence analysis of Newton-like methods such as
[1]-[20].
Majorizing sequences in connection to the Kantorovich theorem have been used ex-
tensively for studying the convergence of these methods [1, 2, 3, 4, 11, 15, 10]. Rall [19]
suggested a different approach for the convergence of these methods, based on recurrent
relations. Candela and Marquina [5, 6], Parida [16], Parida and Gupta [17], Ezquerro and Hernández [7], Gutiérrez and Hernández [8, 9] and Argyros [1, 2, 3] used this idea for several high-order methods. In particular, Kou and Li [12] introduced a third order family of
354 Ioannis K. Argyros and Á. Alberto Magreñán
(2) ‖F″(x)‖ ≤ M, for each x ∈ D;

(3) ‖F′(x_0)^{−1}‖ ≤ β;

(4) ‖F′(x_0)^{−1} F(x_0)‖ ≤ η.
Third Order Family of Methods in Banach Spaces 355
a = Kβη,

α = (|θ^2 + θ − 1| + |1 − θ|)/θ^2,

γ = (M/2)βη,

a_0 = b_0 = 1,  d_0 = α + γ,  b_{−1} = 0,

a_{n+1} = a_n/(1 − a a_n d_n),

b_{n+1} = a_{n+1} βη c_n,

k_n = ((|1 + θ|(θ − 1)^2 + |1 − θ|)/θ^2) b_n + (M/2) a_n β b_n^2 η,

c_n = (M/2) k_n^2 + K|θ| b_n k_n + (M/2)|θ^2 − 1| b_n^2

and

d_{n+1} = α b_{n+1} + γ a_{n+1} b_{n+1}^2.
We suppose the conditions (C′):

(1) ‖F′(x_0)^{−1}(F′(x) − F′(y))‖ ≤ K‖x − y‖, for each x, y ∈ D;

(2) ‖F′(x_0)^{−1}(F′(x) − F′(x_0))‖ ≤ K_0‖x − x_0‖, for each x ∈ D;

(3) ‖F′(x_0)^{−1} F(x_0)‖ ≤ η.
Notice that the new conditions are given in affine invariant form and the condition on the
second Fréchet-derivative has been dropped. The advantages of presenting results in affine
invariant form instead of non-affine invariant form are well known [1, 3, 11, 15, 18]. If
operator F is twice Fréchet differentiable, then (1) in (C′) implies (2) in (C).
In order for us to compare the old approach with the new, let us rewrite the conditions
(C ) in affine invariant form. We shall call these conditions again (C ).
(C2) ‖F′(x_0)^{−1} F″(x)‖ ≤ M, for each x ∈ D;

(C4) ‖F′(x_0)^{−1} F(x_0)‖ ≤ η.
The parameters and sequences are defined as before but β = 1. Then, we can certainly set
K = M. Define parameters
a′ = Kη,

α′ = α,

γ′ = (K/2)η,

a′_0 = b′_0 = 1,  d′_0 = α′ + γ′,  b′_{−1} = 0,

a′_{n+1} = 1/(1 − K_0 η (d′_n + d′_{n−1} + ··· + d′_0)),

b′_{n+1} = a′_{n+1} η c′_n,

c′_n = K [ (k′_n)^2/2 + |θ| b′_n k′_n + (|θ^2 − 1|/2)(b′_n)^2 ],

k′_n = ((|θ + 1|(θ − 1)^2 + |1 − θ|)/θ^2) b′_n + (K/2) a′_n (b′_n)^2 η

and

d′_{n+1} = α′ b′_{n+1} + γ′ a′_{n+1} (b′_{n+1})^2.
We have that

K_0 ≤ K  (20.2.1)

holds in general, and K/K_0 can be arbitrarily large [1]-[3]. Notice that the center Lipschitz condition is not an additional condition to the Lipschitz condition, since in practice the computation of K involves the computation of K_0 as a special case. We have by the definition of a_{n+1} in turn that
tion of an+1 in turn that
an
an+1 =
1 − Kηan dn
an
= an−1
1 − Kηdn 1−Kηa n−1 dn−1
an (1 − Kηan−1 dn−1)
=
1 − Kηan−1 (dn + dn−1 )
an−1
1−Kηan−1 dn−1 (1 − Kηan−1 dn−1)
=
1 − Kηan−1 (dn + dn−1 )
an−1
=
1 − Kηan−1 (dn + dn−1 )
..
.
a0
=
1 − Kηan−1 (dn + dn−1 + · · · + d0 )
1
= .
1 − Kη(dn + dn−1 + · · · + d0 )
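The telescoping argument above can be confirmed numerically: for a_0 = 1 and arbitrary positive d_n (the sample values below are illustrative only), the recursion and the closed form agree exactly.

```python
def a_sequence(K, eta, d, a0=1.0):
    """Recursion a_{n+1} = a_n / (1 - K*eta*a_n*d_n), starting from a0 = 1."""
    a = [a0]
    for dn in d:
        a.append(a[-1] / (1.0 - K * eta * a[-1] * dn))
    return a

K, eta = 0.5, 0.3                 # illustrative sample values
d = [0.4, 0.2, 0.1, 0.05]
a = a_sequence(K, eta, d)

# Closed form from the telescoping derivation:
# a_{n+1} = 1 / (1 - K*eta*(d_n + d_{n-1} + ... + d_0)).
partial = 0.0
for n, dn in enumerate(d):
    partial += dn
    closed = 1.0 / (1.0 - K * eta * partial)
    assert abs(a[n + 1] - closed) < 1e-12
```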
Hence, we deduce that

a′_{n+1} ≤ a_{n+1}.  (20.2.2)

Moreover, strict inequality holds in (20.2.2) if K_0 < K. Hence, using a simple inductive argument we also have that

k′_n ≤ k_n,  (20.2.3)

c′_n ≤ c_n,  (20.2.4)

b′_{n+1} ≤ b_{n+1}  (20.2.5)

and

d′_{n+1} ≤ d_{n+1}.  (20.2.6)
Proof. It follows from the proof of Lemma 1 in [4] by simply noticing that the expressions involving

∫_0^1 F″(x_n + t(y_n − x_n))(1 − t)(y_n − x_n)^2 dt

and

∫_0^1 F″(y_n + t(x_{n+1} − y_n))(1 − t)(x_{n+1} − y_n)^2 dt

are not needed and can be replaced, respectively, by

∫_0^1 [F′(y_n + t(x_n − y_n)) − F′(x_n)](y_n − x_n) dt

and

∫_0^1 [F′(y_n + t(x_{n+1} − y_n)) − F′(y_n)](x_{n+1} − y_n) dt.

Hence, condition (2) in (C) is not needed and can be replaced by condition (1) in (C′) to produce the same bounds as in [4] (for K = M) (see also the proof of Theorem 20.3.2 that follows).
(ii) The computation of the upper bounds on ‖F′(x_n)^{−1} F′(x_0)‖ in [4] uses condition (1) in (C) and the estimate

‖F′(x_n)^{−1}(F′(x_n) − F′(x_{n+1}))‖ ≤ ‖F′(x_n)^{−1} F′(x_0)‖ K ‖x_n − x_{n+1}‖

to arrive at

‖F′(x_{n+1})^{−1} F′(x_0)‖ ≤ a_{n+1},  (20.2.7)

whereas we use (2) in (C′) and the estimate

‖F′(x_0)^{−1}(F′(x_{n+1}) − F′(x_0))‖ ≤ K_0‖x_{n+1} − x_0‖ ≤ K_0(‖x_{n+1} − x_n‖ + ··· + ‖x_1 − x_0‖) ≤ K_0 η (d′_n + d′_{n−1} + ··· + d′_0)

to arrive at the estimate

‖F′(x_{n+1})^{−1} F′(x_0)‖ ≤ a′_{n+1},  (20.2.8)

which is more precise (see also (20.2.2)).
Lemma 20.2.2. Suppose that
a′_1 b′_1 < 1.  (20.2.9)

Then, the sequence {p′_n} defined by p′_n = a′_n b′_n is decreasingly convergent to 0 and satisfies

p′_{n+1} ≤ (1/ξ_1) ξ_1^{2^{n+1}},  ξ_1 := a′_1 b′_1,

and

d′_n ≤ (α′ + γ′)(1/ξ_1) ξ_1^{2^n}.

Moreover, if

a_1 b_1 < 1,  (20.2.10)

then the sequence {p_n} defined by p_n = a_n b_n is also decreasingly convergent to 0 and satisfies

p′_{n+1} ≤ p_{n+1} ≤ (1/ξ) ξ^{2^{n+1}},  ξ := a_1 b_1,

d′_n ≤ d_n ≤ (α + γ)(1/ξ) ξ^{2^n}

and

ξ_1 ≤ ξ.
Proof. It follows from the proof of Lemma 3 in [4] by simply using {p′_n}, a′_1, b′_1, ξ_1 instead of {p_n}, a_1, b_1, ξ, respectively.
Next, we present the main semilocal convergence result for the third order method (20.1.2) under the (C′) conditions, (20.2.9) and the convergence criterion

a(α + γ) < 1.  (20.2.11)

The proof follows from the proof of Theorem 5 in [4] (with the exception of the uniqueness of the solution part) by simply replacing the (C) conditions and (20.2.10) by the (C′) conditions and (20.2.9), respectively.
Theorem 20.2.3. Suppose that conditions (C′), (20.2.9) and (20.2.11) hold. Moreover, suppose that

U′_0 = U(x_0, r′η) ⊂ D,  (20.2.12)

where

r′ = Σ_{n=0}^∞ d′_n.  (20.2.13)

Then, the sequence {x_n} generated by the third order method (20.1.2) is well defined, remains in U′_0 for each n = 0, 1, 2, ··· and converges to a unique solution x* of the equation F(x) = 0 in U(x_0, 2/K_0 − r′η) ∩ D. Moreover, the following estimates hold:

‖x_{n+1} − x*‖ ≤ Σ_{k=n+1}^∞ d′_k η ≤ ((α + γ)/ξ_1) η Σ_{k=n+1}^∞ ξ_1^{2^k}.  (20.2.14)
Proof. As already noted above, we only need to show the uniqueness part. Let y* ∈ U(x_0, 2/K_0 − r′η) be such that F(y*) = 0. Define Q = ∫_0^1 F′(x* + t(y* − x*)) dt. Using condition (2) in (C′), we get ‖F′(x_0)^{−1}(F′(x_0) − Q)‖ < 1, so Q^{−1} exists, and from the identity 0 = F(x*) − F(y*) = Q(x* − y*) we conclude that x* = y*.
and
g(t) = K_0 t^4 + (K/(2θ^2))[1 + |1 − θ|(1 + |1 − θ^2|)] t^3
     + (K/(2θ^2))[|1 − θ|(1 + |1 − θ^2|) − 1] t^2
     + (K/θ^2)|1 − θ|(1 + |1 − θ^2|)(|θ^2 − 1|/(2θ^2) − 1) t
     − (K/(2θ^4))|1 − θ||1 − θ^2|(1 + |1 − θ^2|).  (20.3.2)
We have f(0) = −K|θ^2 − 1|/(2|θ|) < 0 for θ ≠ ±1, and f(1) = K_0 > 0 for K_0 ≠ 0. It follows from the intermediate value theorem that the polynomial f has roots in (0, 1). Denote by δ_f the smallest root of f in (0, 1). Similarly, we have g(0) = −(K/(2θ^4))|1 − θ||θ^2 − 1|(1 + |1 − θ^2|) < 0 for θ ≠ ±1, and g(1) = K_0 + K/(2θ^2) > 0. Denote by δ_g the smallest root of g in (0, 1). Set

δ = min{δ_f, δ_g}.  (20.3.3)
" #
K|θ| |θ2 − 1| δ2
0< + + δ (s0 − t0 ) ≤ δ (20.3.5)
1 − K0 (1 + δ)s0 2θ2 2
and
K
0 < 2
{|1 − θ|(1 + |1 − θ2 |)
θ (1 − K0 (1 + δ)s0 )
" #
|θ2 − 1| δ2 δ2
2
+ + δ + }(s0 − t0 ) ≤ δ2 . (20.3.6)
2θ 2 2
We shall assume from now on that δ satisfies conditions (20.3.3)-(20.3.6). These conditions shall be referred to as the (4) conditions. Moreover, define scalar sequences {t_n}, {s_n} by

t_0 = 0,  s_0 = t_0 + θη,

t_1 = s_0 + [ (|1 − θ|/|θ^3|)(1 + |1 − θ^2|) + (s_0 − t_0)K/(2θ^2) ](s_0 − t_0)

and, for each n = 0, 1, 2, ···,

s_{n+1} = t_{n+1} + (K|θ|/(1 − K_0 t_{n+1})) [ (|1 − θ^2|/(2θ^2))(s_n − t_n)^2 + (t_{n+1} − s_n)^2/2 + (s_n − t_n)(t_{n+1} − s_n) ],  (20.3.7)

t_{n+2} = s_{n+1} + (K/(θ^2(1 − K_0 t_{n+1}))) { |1 − θ|(1 + |1 − θ^2|) [ (|1 − θ^2|/(2θ^2))(s_n − t_n)^2 + (t_{n+1} − s_n)^2/2 + (s_n − t_n)(t_{n+1} − s_n) ] + (s_{n+1} − t_{n+1})^2/2 }.  (20.3.8)
Then, we can show the following auxiliary result for majorizing sequences {tn }, {sn } under
the (4) conditions.
Lemma 20.3.1. Suppose that the (4) conditions hold. Then, the sequences {t_n}, {s_n} defined by (20.3.7) and (20.3.8) are increasingly convergent to their common unique least upper bound t*, which satisfies

θη ≤ t* ≤ t** := θη/(1 − δ).  (20.3.9)

Moreover, the following estimates hold for each n = 0, 1, 2, ···:

0 < s_n − t_n ≤ δ^n θη  (20.3.10)

and

0 < t_{n+1} − s_n ≤ δ^{n+1} θη.  (20.3.11)
Proof. We shall show estimates (20.3.10) and (20.3.11) using induction. If n = 0, (20.3.10) holds by the definition of t_0 and s_0, whereas (20.3.11) holds by (20.3.4). We then have that

t_1 ≤ s_0 + δs_0 = (1 + δ)s_0 = ((1 − δ^2)/(1 − δ))s_0 < t**.  (20.3.12)

If n = 1, estimates (20.3.10) and (20.3.11) hold by (20.3.5), (20.3.6), (20.3.12) and (20.3.10), (20.3.11) for n = 0. Suppose that (20.3.10) and (20.3.11) hold for all m ≤ n. Then, (20.3.10) holds with m + 1 replacing n, provided that

f_m(δ) ≤ 0.  (20.3.17)

It then follows from (20.3.17) and (20.3.18) that (20.3.17) holds if

f_0(δ) ≤ 0,  (20.3.19)
which is true by (20.3.5). Hence, we showed (20.3.10) for m + 1 replacing n. Next, we shall
show (20.3.11) for m + 1 replacing n. We have in turn that
t_{m+2} − s_{m+1} ≤ (K/(θ^2(1 − K_0((1 − δ^{m+2})/(1 − δ))s_0))) { |1 − θ|(1 + |θ^2 − 1|) [ (|θ^2 − 1|/(2θ^2))(δ^m(s_0 − t_0))^2 + (δ^{m+1}(s_0 − t_0))^2/2 + δ^{2m+1}(s_0 − t_0)^2 ] + (δ^{m+1}(s_0 − t_0))^2/2 }

must be smaller than or equal to δ^{m+2}(s_0 − t_0). As in the preceding case we are motivated to define polynomials g_m on [0, 1] by

g_m(t) = K { (|1 − θ|(1 + |θ^2 − 1|)/θ^2) [ (|θ^2 − 1|/(2θ^2)) t^m + t^{m+2}/2 + t^{m+1} ] + t^{m+2}/(2θ^2) } (s_0 − t_0) + t^2 K_0 (1 + t + ··· + t^{m+1})(s_0 − t_0) − t^2.  (20.3.20)
g_0(δ) ≤ 0,  (20.3.24)

which is true by (20.3.6). The induction for (20.3.11) is completed. It then follows that

t_{m+2} ≤ ((1 − δ^{m+3})/(1 − δ))s_0 < t**.  (20.3.25)

Hence, the sequences {t_n}, {s_n} are increasing, bounded above by t** and, as such, they converge to their unique least upper bound t*, which satisfies (20.3.9).
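The recursion (20.3.7)-(20.3.8) is straightforward to iterate. The sketch below uses sample values K = K_0 = 1, θ = 0.9, η = 0.1 (assumptions, not taken from the text) and checks the interlacing and convergence asserted by the lemma:

```python
# Sample parameters (assumptions, for illustration only).
K, K0, theta, eta = 1.0, 1.0, 0.9, 0.1

t = [0.0]
s = [theta * eta]
# t1 from the definition above (with t0 = 0).
t.append(s[0] + (abs(1 - theta) / abs(theta)**3 * (1 + abs(1 - theta**2))
                 + (s[0] - t[0]) * K / (2 * theta**2)) * (s[0] - t[0]))

for n in range(20):
    br = (abs(1 - theta**2) / (2 * theta**2) * (s[n] - t[n])**2
          + (t[n + 1] - s[n])**2 / 2
          + (s[n] - t[n]) * (t[n + 1] - s[n]))          # shared bracket in (20.3.7)/(20.3.8)
    s.append(t[n + 1] + K * abs(theta) / (1 - K0 * t[n + 1]) * br)
    t.append(s[n + 1] + K / (theta**2 * (1 - K0 * t[n + 1]))
             * (abs(1 - theta) * (1 + abs(1 - theta**2)) * br
                + (s[n + 1] - t[n + 1])**2 / 2))

print(t[-1])   # the limit t* of the majorizing sequence
```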
We can show the main semilocal convergence result for the third order method (20.1.2)
under the (C 0 ) and (4) conditions using {tn } and {sn } as majorizing sequences.
Theorem 20.3.2. Suppose that the (C′) and (4) conditions hold and that U(x_0, t*) ⊆ D. Then, the sequences {x_n}, {y_n} generated by the third order method (20.1.2) are well defined, remain in U(x_0, t*) for each n = 0, 1, 2, ··· and converge to a solution x* of the equation F(x) = 0. Moreover, the following estimates hold:

‖y_n − x_n‖ ≤ s_n − t_n,  (20.3.27)

‖x_{n+1} − y_n‖ ≤ t_{n+1} − s_n,  (20.3.28)

‖x_{n+1} − x_n‖ ≤ t_{n+1} − t_n  (20.3.29)

and

‖x_n − x*‖ ≤ t* − t_n.  (20.3.30)

Furthermore, if there exists R > t* such that

K_0(t* + R) < 2,  (20.3.31)

then the point x* is the only solution of equation F(x) = 0 in U(x_0, R).
Proof. We shall first show (20.3.27) and (20.3.28) using induction. We have by (20.1.2) and (20.3.7) that ‖y_0 − x_0‖ = |θ|‖F′(x_0)^{−1}F(x_0)‖ ≤ θη = s_0 − t_0. Hence, (20.3.27) holds for n = 0. It follows from the first substep of (20.1.2) that

F(y_0) = F(y_0) − F(x_0) − F′(x_0)(y_0 − x_0) + (1 − 1/θ)F′(x_0)(y_0 − x_0).  (20.3.32)

Composing (20.3.32) by F′(x_0)^{−1} and using (2), (3) in (C′) and (20.3.7), we get

‖F′(x_0)^{−1}F(y_0)‖ ≤ ( (K/2)(s_0 − t_0) + |1 − θ|/|θ| )(s_0 − t_0).  (20.3.33)

We also have

x_1 − y_0 = −((θ + 1)(θ − 1)^2/θ^2)F′(x_0)^{−1}F(x_0) − (1/θ^2)F′(x_0)^{−1}F(y_0).  (20.3.34)

Hence, using (20.3.33) and (20.3.34), we get that

‖x_1 − y_0‖ ≤ (|θ + 1||θ − 1|^2/θ^2)‖F′(x_0)^{−1}F(x_0)‖ + (1/θ^2)‖F′(x_0)^{−1}F(y_0)‖
  ≤ (|θ + 1||θ − 1|^2/θ^2)η + (1/θ^2)( |1 − θ|/|θ| + (K/2)(s_0 − t_0) )(s_0 − t_0)
  = t_1 − s_0,  (20.3.35)
Then, we have x_1 ∈ U(x_0, t*). Notice that K_0 t* < 1 from the proof of Lemma 20.3.1. Let us suppose x ∈ U(x_0, t*). Then, using (2) in (C′), we have that
‖F′(x_0)^{−1}(F′(x) − F′(x_0))‖ ≤ K_0‖x − x_0‖ ≤ K_0 t* < 1.  (20.3.36)

It follows from (20.3.36) and the Banach lemma that F′(x)^{−1} ∈ L(Y, X) and

‖F′(x_1)^{−1}F′(x_0)‖ ≤ 1/(1 − K_0‖x_1 − x_0‖) ≤ 1/(1 − K_0 t_1).  (20.3.37)
Suppose that (20.3.27)-(20.3.29) hold for all m ≤ n and xm ∈ U(x0 ,t ∗). Using the first step
in (20.1.2) we get that
Subtracting the first step in (20.1.2) from the second step, we obtain

F′(x_m)(x_{m+1} − y_m) = ((θ^3 − θ^2 − θ + 1)/θ^2)F(x_m) − (1/θ^2)F(y_m),  (20.3.39)

so that

x_{m+1} − y_m = ((θ^3 − θ^2 − θ + 1)/θ^2)F′(x_m)^{−1}F(x_m) − (1/θ^2)F′(x_m)^{−1}F(y_m).  (20.3.42)
We get, in turn, that

‖x_{m+2} − y_{m+1}‖ ≤ (1/(1 − K_0 t_{m+1})) [ (|1 + θ|(θ − 1)^2/θ^2)‖F′(x_0)^{−1}F(x_{m+1})‖ + (1/θ^2)‖F′(x_0)^{−1}F(y_{m+1})‖ ]
  ≤ (K/(θ^2(1 − K_0 t_{m+1}))) { |1 + θ|(θ − 1)^2 [ (|θ^2 − 1|/(2θ^2))(s_m − t_m)^2 + (t_{m+1} − s_m)^2/2 + (s_m − t_m)(t_{m+1} − s_m) ]
  + |1 − θ| [ (|θ^2 − 1|/(2θ^2))(s_m − t_m)^2 + (t_{m+1} − s_m)^2/2 + (s_m − t_m)(t_{m+1} − s_m) ]
  + (s_{m+1} − t_{m+1})^2/2 }
  = t_{m+2} − s_{m+1}.
It follows that Q^{−1} exists. Then, from the identity 0 = F(x*) − F(y*) = Q(x* − y*) we deduce that x* = y*. Similarly, if F(y*) = 0 and y* ∈ U(x_0, R), we have that

‖F′(x_0)^{−1}(F′(x_0) − Q)‖ ≤ (K_0/2)(R + t*) < 1,

by (20.3.31). Hence, again we deduce that x* = y*.
Remark 20.3.3. (a) It follows from the proof of Theorem 20.3.2 that sequences
{t¯n }, {s̄n} defined by
t̄_0 = 0,  s̄_0 = t̄_0 + θη,

t̄_1 = s̄_0 + [ (|1 − θ|/|θ^3|)(1 + |1 − θ^2|) + (s̄_0 − t̄_0)K_0/(2θ^2) ](s̄_0 − t̄_0),

s̄_1 = t̄_1 + (|θ|/(1 − K_0 t̄_1)) [ (K/2)(|θ^2 − 1|/θ^2)(s̄_0 − t̄_0)^2 + (K/2)(t̄_1 − s̄_0)^2 + K_0(s̄_0 − t̄_0)(t̄_1 − s̄_0) ],

s̄_{n+1} = t̄_{n+1} + (K|θ|/(1 − K_0 t̄_{n+1})) [ (|θ^2 − 1|/(2θ^2))(s̄_n − t̄_n)^2 + (t̄_{n+1} − s̄_n)^2/2 + (s̄_n − t̄_n)(t̄_{n+1} − s̄_n) ]

and

t̄_{n+2} = s̄_{n+1} + (K/(θ^2(1 − K_0 t̄_{n+1}))) { |1 − θ|(1 + |1 − θ^2|) [ (|θ^2 − 1|/(2θ^2))(s̄_n − t̄_n)^2 + (t̄_{n+1} − s̄_n)^2/2 + (s̄_n − t̄_n)(t̄_{n+1} − s̄_n) ] + (s̄_{n+1} − t̄_{n+1})^2/2 },  for each n = 1, 2, ···.
Then, a simple induction argument shows that
s̄_n ≤ s_n,  t̄_n ≤ t_n,  s̄_n − t̄_n ≤ s_n − t_n,  t̄_{n+1} − s̄_n ≤ t_{n+1} − s_n

and

t̄* = lim_{n→∞} t̄_n ≤ t*.

Clearly, {t̄_n}, {s̄_n} and t̄* can replace {t_n}, {s_n} and t* in Theorem 20.3.2.
(b) The limit point t ∗ can be replaced by t ∗∗ given in closed form by (20.3.9).
(c) Criteria (4) or (20.2.9) and (20.2.11) are sufficient for the convergence of the third order method (20.1.2). However, these criteria are not necessary. In practice,
we shall test to see which of these criteria are satisfied (if any) and then use the best
possible error bounds and uniqueness results (see also the numerical examples in the
next section).
δ = min{δ_f, δ_g} = 0.4104586…, and the quantity on the left-hand side of (20.3.6) equals 0.136162… ≤ 0.168476… = δ^2.
Consequently, convergence to the solution is guaranteed by Theorem 20.3.2. Moreover, the computational order of convergence (COC), defined by

ρ ≈ ln( ‖x_{n+1} − x*‖_∞ / ‖x_n − x*‖_∞ ) / ln( ‖x_n − x*‖_∞ / ‖x_{n−1} − x*‖_∞ ),  n ∈ N,

is shown in Table 20.4.1.
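The COC can be estimated from any three consecutive error norms. As an illustration (Newton's method on x^2 − 2 = 0, an example not from the text), ρ comes out close to the expected order 2:

```python
import math

f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
root = math.sqrt(2.0)

xs = [1.0]
for _ in range(5):
    xs.append(xs[-1] - f(xs[-1]) / df(xs[-1]))   # Newton iteration

e = [abs(x - root) for x in xs]
# COC from three consecutive errors, per the definition above.
rho = math.log(e[3] / e[2]) / math.log(e[2] / e[1])
print(rho)   # close to 2 for Newton's method
```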
Example 20.4.2. Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1] equipped with the max-norm. Let Ω = {x ∈ C[0, 1] : ‖x‖ ≤ R}, with R > 1, and let F be defined on Ω and given by
F(x)(s) = x(s) − f(s) − λ ∫_0^1 G(s,t) x(t)^3 dt,  x ∈ C[0, 1], s ∈ [0, 1],
Table 20.4.1. Computational order of convergence (COC)

n    COC
1    2.73851
2    2.99157
3    2.99999
4    3.00000
5    3.00000

ρ = 3.00000
where f ∈ C[0, 1] is a given function, λ is a real constant and the kernel G is the Green function

G(s,t) = (1 − s)t if t ≤ s,  G(s,t) = s(1 − t) if s ≤ t.
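The estimate ‖I − F′(x_0)‖ ≤ 3|λ|/8 quoted below rests on the bound max_s ∫_0^1 G(s,t) dt = 1/8, which is easy to confirm by quadrature:

```python
def kernel_integral(s, m=2000):
    """Trapezoid rule for ∫_0^1 G(s,t) dt with the Green kernel above."""
    h = 1.0 / m
    total = 0.0
    for j in range(m + 1):
        t = j * h
        g = (1.0 - s) * t if t <= s else s * (1.0 - t)
        total += (0.5 if j in (0, m) else 1.0) * g * h
    return total

# Closed form: ∫_0^1 G(s,t) dt = s(1-s)/2, maximized at s = 1/2 with value 1/8.
print(kernel_integral(0.5))   # ≈ 0.125
```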
In this case, for each x ∈ Ω, F′(x) is a linear operator defined on Ω by the following expression:

[F′(x)(v)](s) = v(s) − 3λ ∫_0^1 G(s,t) x(t)^2 v(t) dt,  v ∈ C[0, 1], s ∈ [0, 1].
If we choose x_0(s) = f(s) = 1, it follows that ‖I − F′(x_0)‖ ≤ 3|λ|/8. Thus, if |λ| < 8/3, F′(x_0)^{−1} is defined and

‖F′(x_0)^{−1}‖ ≤ 8/(8 − 3|λ|).
Moreover,

‖F(x_0)‖ ≤ |λ|/8  and  ‖F′(x_0)^{−1}F(x_0)‖ ≤ |λ|/(8 − 3|λ|).
On the other hand, for x, y ∈ Ω we have

[(F′(x) − F′(y))v](s) = 3λ ∫_0^1 G(s,t)(x(t)^2 − y(t)^2)v(t) dt

and

‖F″(x)‖ ≤ 6|λ|/8.
Consequently, we obtain

β = 0.677966…,  η = 0.127119…,  M = 4.95,  a = 0.426602…,  α = 1.16529…  and  γ = 0.213301….
So, as a_1 b_1 = 1.25402 > 1, condition (20.2.10) is violated. Hence, there is no guarantee under the conditions given in [4] that the sequence {x_n} converges to x*. Calculating now δ_f and δ_g, the smallest solutions of the polynomials f(t) and g(t) given in (20.3.1) and (20.3.2), respectively, between 0 and 1, we obtain that
δ = min{δ_f, δ_g} = 0.370693…, and the quantity on the left-hand side of (20.3.6) equals 0.0871515… ≤ 0.137413… = δ^2.
Consequently, convergence to the solution is guaranteed by Theorem 20.3.2.
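The value a_1 b_1 = 1.25402 can be reproduced from the recursion of Section 20.2. In the sketch below, θ = 1.1 is an assumption, chosen so that α = (|θ^2 + θ − 1| + |1 − θ|)/θ^2 matches the reported 1.16529…:

```python
# Reproducing a1*b1 for Example 20.4.2; theta = 1.1 is an assumption,
# consistent with the reported alpha.
K = M = 4.95
beta, eta, theta = 0.677966, 0.127119, 1.1

a = K * beta * eta
alpha = (abs(theta**2 + theta - 1) + abs(1 - theta)) / theta**2
gamma = M / 2 * beta * eta

a0 = b0 = 1.0
d0 = alpha + gamma
k0 = ((abs(1 + theta) * (theta - 1)**2 + abs(1 - theta)) / theta**2) * b0 \
     + M / 2 * a0 * beta * b0**2 * eta
c0 = M / 2 * k0**2 + K * abs(theta) * b0 * k0 + M / 2 * abs(theta**2 - 1) * b0**2
a1 = a0 / (1 - a * a0 * d0)
b1 = a1 * beta * eta * c0
print(a1 * b1)   # ≈ 1.254 > 1, so (20.2.10) fails
```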
References
[1] Argyros, I.K., Computational theory of iterative methods, Series: Studies in Compu-
tational Mathematics, 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co. New
York, U.S.A, 2007.
[2] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method.
J. Complexity, 28 (2012), 364–387.
[3] Argyros, I.K., Hilout, S., Computational methods in nonlinear analysis, World Scien-
tific Publ. Comp., New Jersey, USA 2013.
[4] Chun, C., Stanica, P., Neta, B., Third order family of methods in Banach spaces, Comp
Math. Appl., 61 (2011), 1665-1675.
[5] Candela, V., Marquina, A., Recurrence relations for rational cubic methods I: The
Halley method, Computing, 44 (1990), 169-184.
[6] Candela, V., Marquina, A., Recurrence relations for rational cubic methods II: The
Chebyshev method, Computing, 45 (1990), 355-367.
[7] Ezquerro, J. A., Hernández, M.A., Recurrence relations for Chebyshev-type methods,
Appl. Math. Optim., 41 (2000), 227-236.
[8] Gutiérrez, J.M., Hernández, M.A., Recurrence relations for the super-Halley method,
Computers Math. Applic., 36 (1998), 1-8.
[9] Gutiérrez, J.M., Hernández, M.A., Third-order iterative methods for operators with
bounded second derivative, J. Comp. Appl. Math., 82 (1997), 171-183.
[10] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.
[11] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[12] Kou, J.S., Li, T., Modified Chebyshev’s method free from second derivative for non-
linear equations, App. Math. Comp., 187 (2007), 1027-1032.
[13] Kou, J.S., Li, T., Wang, X.H., A modification of Newton method with third-order
convergence, App. Math. Comp., 181 (2006), 1106-1111.
[14] Magreñán, Á. A., Different anomalies in a Jarratt family of iterative root-finding meth-
ods, App. Math. Comp., 233 (2014), 29–38.
[15] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic press, New York, 1970.
[16] Parida, P.K., Study of some third order methods for nonlinear equations in Banach spaces, Ph.D. Dissertation, Indian Institute of Technology, Department of Mathematics, Kharagpur, India, 2007.
[17] Parida, P.K., Gupta, D.K., Recurrence relations for semilocal convergence of a
Newton-like method in Banach spaces, J. Math. Anal. Applic., 345 (2008), 350-361.
[18] Potra, F.A., Pták, V., Nondiscrete induction and iterative processes, in Research Notes in Mathematics, 103, Pitman, Boston, 1984.
[20] Wu, Q., Zhao, Y., Third order convergence theorem by using majorizing function for
a modified Newton method in Banach space, App. Math. Comp., 175 (2006), 1515-
1524.
Chapter 21

Modified Halley-Like Methods
21.1. Introduction
In this chapter we are concerned with the problem of approximating a solution x∗ of the
nonlinear equation
F(x) = 0, (21.1.1)
where F is a Fréchet-differentiable operator defined on a subset D of a Banach space X with
values in a Banach space Y.
Many problems in computational sciences and other disciplines can be brought in a
form like (21.1.1) using mathematical modeling [3]. The solutions of equation (21.1.1) can
rarely be found in closed form. That is why most solution methods for these equations are
usually iterative. In particular, the practice of Numerical Functional Analysis for finding such solutions is essentially connected to Newton-like methods [1]-[20]. The study of convergence of iterative procedures is usually based on two types: semilocal and local convergence analyses. The semilocal convergence analysis is based on the information around an initial point and gives conditions ensuring the convergence of the iterative procedure, while the local one is based on the information around a solution and finds estimates of the radii of convergence balls. There exist many studies which deal with the local and
semilocal convergence analyses of Newton-like methods such as [1]-[20].
We present a local convergence analysis for the modified Halley-like method [30] defined for each n = 0, 1, 2, ··· by
(C2) ‖F′(x_0)^{−1}F(x_0)‖ ≤ β_1;

(C3) ‖F′(x_0)^{−1}F″(x)‖ ≤ β_2, for each x ∈ D;

(C4) ‖F′(x_0)^{−1}(F″(x) − F″(y))‖ ≤ β_3‖x − y‖^q, for each x, y ∈ D and some q ∈ (0, 1].

Under the (C) conditions, for α = γ = 1 and θ ∈ (0, 1], the convergence order was shown to be 3 + 2q in [30]. Moreover, for γ = 1, α = 0 and θ ∈ (0, 1], the convergence order was shown to be 2 + q in [10].
Similar conditions have been used by several authors on other high convergence order methods [1]-[20]. The corresponding conditions for the local convergence analysis are
given by simply replacing x_0 by x* in the preceding (C) conditions. These conditions, however, are very restrictive. As a motivational example, let us define the function f on D = [−1/2, 5/2] by

f(x) = x^3 ln x^2 + x^5 − x^4 if x ≠ 0,  and f(0) = 0.

Choose x* = 1. We have that

f‴(x) = 6 ln x^2 + 60x^2 − 24x + 22.

Then, e.g., function f cannot satisfy condition (C4), say for q = 1, since f‴ is unbounded on D. In the present chapter we only use hypotheses on the first Fréchet derivative (see conditions (21.2.12)-(21.2.15)). Notice also that [30] used θ ∈ (0, 1] and γ = α = 1, whereas in this chapter θ can belong to a wider interval than (0, 1]. This way we expand the applicability of method (21.1.2).
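A quick check that f‴ is indeed unbounded near 0; the closed form for f‴ below is obtained by differentiating f three times (a derived expression, not quoted from the text):

```python
import math

# f(x) = x^3 ln x^2 + x^5 - x^4 (x != 0), f(0) = 0; three derivatives give
# f'''(x) = 6 ln x^2 + 60 x^2 - 24 x + 22, which blows up as x -> 0.
def f3(x):
    return 6.0 * math.log(x * x) + 60.0 * x**2 - 24.0 * x + 22.0

print(f3(1.0))     # = 58.0, bounded at the solution x* = 1
print(f3(1e-8))    # large in magnitude (≈ -199) near 0
```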
The chapter is organized as follows. The local convergence of method (21.1.2) is given
in Section 21.2, whereas the numerical examples are given in Section 21.3. Finally, some
remarks are given in the concluding Section 21.4.
Modified Halley-Like Methods 375
21.2. Local Convergence

Define functions on the interval [0, 1/L_0) by

g_1(r) = Lr/(2(1 − L_0 r)),

g_2(r) = g_1(r) + M|1 − θ|/(1 − L_0 r),

g_3(r) = L_0(1 + g_2(r))/(2|θ|(1 − L_0 r)),

g_4(r) = 1 + g_3(r)r + g_3^2(r)r^2,

g_5(r) = g_1(r) + |γ|M g_4(r)/(1 − L_0 r),

g_6(r) = 1 + 2g_{1,3}(r)r + 4g_3^2(r)r^2,

g_{1,3}(r) = L_0(1 + g_1(r))/(2(1 − L_0 r))

and

g_7(r) = [1 + |α|M g_6(r)/(1 − L_0 r)] g_5(r).
Moreover, define the parameter r_2 as the largest number in (0, 1/L_0) such that

0 < g_1(r) < 1 and 0 < g_2(r) < 1, for each r ∈ (0, r_2).

Evidently, g_5(r) ∈ (0, 1) if, for each r ∈ (0, r_5) and r_5 < 1/L_0 to be determined, we have that

0 < g_1(r) + |γ|g_4(r)M/(1 − L_0 r) < 1, for each r ∈ (0, r_5).

We have that

p_5((1/L_0)^−) = |γ|M g_4((1/L_0)^−) > 0.
Suppose that

|γ|M < 1.

Then, we have that

p_5(0) = M|γ| − 1 < 0.
It follows from the intermediate value theorem that function p5 has zeros in the interval
(0, L10 ). Denote by r5 the smallest such zero. Then, we have that
We get that

p_7((1/L_0)^−) > 0

and

p_7(0) = (1 + |α|M g_6(0)) g_5(0) − 1 = (1 + |α|M)|γ|M − 1.
Suppose that
(1 + |α|M)|γ|M < 1.
Then, we have p7 (0) < 0. It follows that function p7 has zeros in the interval (0, L10 ). Denote
by r7 the smallest such zero. Then, we obtain that
Set

r* = min{r_2, r_5, r_7}.  (21.2.1)

Then, we have that

0 < g_7(r) < 1, for each r ∈ (0, r*).  (21.2.8)
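Once the constants are known, the radii r_2, r_5, r_7 and r* can be computed by bisection on the monotone functions g_i. The sketch below borrows the constants of Example 21.3.2 purely as sample inputs:

```python
import math

# Sample constants (those of Example 21.3.2), used as illustrative inputs.
L0, L, M = math.e - 1.0, math.e, math.e
theta, gamma, alpha = 0.75, 0.3, 0.03

def g1(r): return L * r / (2.0 * (1.0 - L0 * r))
def g2(r): return g1(r) + M * abs(1.0 - theta) / (1.0 - L0 * r)
def g3(r): return L0 * (1.0 + g2(r)) / (2.0 * abs(theta) * (1.0 - L0 * r))
def g4(r): return 1.0 + g3(r) * r + g3(r)**2 * r**2
def g5(r): return g1(r) + abs(gamma) * M * g4(r) / (1.0 - L0 * r)
def g13(r): return L0 * (1.0 + g1(r)) / (2.0 * (1.0 - L0 * r))
def g6(r): return 1.0 + 2.0 * g13(r) * r + 4.0 * g3(r)**2 * r**2
def g7(r): return (1.0 + abs(alpha) * M * g6(r) / (1.0 - L0 * r)) * g5(r)

def smallest_root(g, hi):
    """Bisection for g(r) = 1 on (0, hi); each g_i is increasing in r."""
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if g(mid) < 1.0 else (lo, mid)
    return lo

hi = (1.0 - 1e-9) / L0
r2, r5, r7 = (smallest_root(g, hi) for g in (g2, g5, g7))
r_star = min(r2, r5, r7)
print(r_star)
```

Since g_7 ≥ g_5 pointwise, r_7 ≤ r_5 always, so r* is determined by g_7 here.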
Next, we present the local convergence analysis of method (21.1.2).
It follows from (21.2.24) and the Banach lemma on invertible operators [3] that F′(x_0)^{−1} ∈ L(Y, X) and

‖F′(x_0)^{−1}F′(x*)‖ ≤ 1/(1 − L_0‖x_0 − x*‖) < 1/(1 − L_0 r*).  (21.2.25)
Hence, y_0 and u_0 are well defined. Using the first substep in method (21.1.2) for n = 0, (21.2.2), (21.2.14), (21.2.25) and the definition of the function g_1, we obtain in turn that

‖y_0 − x*‖ ≤ ‖F′(x_0)^{−1}F′(x*)‖ ‖F′(x*)^{−1}[F(x_0) − F(x*) − F′(x_0)(x_0 − x*)]‖
  ≤ L‖x_0 − x*‖^2/(2(1 − L_0‖x_0 − x*‖))
  = g_1(‖x_0 − x*‖)‖x_0 − x*‖ < ‖x_0 − x*‖ < r*,
which shows (21.2.17) for n = 0. We also have from the second substep of method (21.1.2)
for n = 0, (21.2.9), (21.2.15), (21.2.17) and the definition of functions g1 and g2 that
which shows (21.2.19) for n = 0. We also need an estimate on ‖A_{θ,0}‖. It follows from (21.2.27) and the definition of A_{θ,0}, g_3 and g_4 that

‖A_{θ,0}‖ ≤ 1 + (1/2)‖H_{θ,0}‖ + (1/4)‖H_{θ,0}‖^2
  ≤ 1 + g_3(‖x_0 − x*‖)‖x_0 − x*‖ + g_3^2(‖x_0 − x*‖)‖x_0 − x*‖^2
  = g_4(‖x_0 − x*‖),  (21.2.28)
which shows (21.2.20) for n = 0. Then, from the third substep of method (21.1.2) for n = 0,
(21.2.19), (21.2.20), (21.2.28) the definition of functions g1 , g5 and radius r∗ , we have that
which shows (21.2.21) for n = 0. Next, we need an estimate on kBθ,0 k. We have by the
definition of operator Bθ,0 and functions g1,3 , g3 , g6 that
M(r) = 1 + L0 r.
Moreover, condition (21.2.14) can be replaced by the popular but stronger conditions
or
2. The results obtained here can be used for operators F satisfying autonomous differ-
ential equations [3] of the form
F 0 (x) = P(F(x))
where P is a continuous operator. Then, since F 0 (x∗ ) = P(F(x∗ )) = P(0), we can
apply the results without actually knowing x∗ . For example, let F(x) = ex − 1. Then,
we can choose: P(x) = x + 1.
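A minimal check of this autonomous structure for the stated example F(x) = e^x − 1 with P(y) = y + 1:

```python
import math

# F(x) = e^x - 1 satisfies F'(x) = P(F(x)) with P(y) = y + 1,
# so F'(x*) = P(F(x*)) = P(0) is available without knowing x*.
F = lambda x: math.exp(x) - 1.0
dF = lambda x: math.exp(x)
P = lambda y: y + 1.0

for x in (-1.0, 0.0, 0.5, 2.0):
    assert abs(dF(x) - P(F(x))) < 1e-12
```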
3. The local results obtained here can be used for projection methods such as Arnoldi's method, the generalized minimum residual method (GMRES), the generalized conjugate residual method (GCR), for combined Newton/finite projection methods and, in connection with the mesh independence principle, can be used to develop the cheapest and most efficient mesh refinement strategies [3, 4].
4. The radius r_A given by

r_A = 1/(L_0 + L/2) = 2/(2L_0 + L)  (21.2.33)
was shown by us to be the convergence radius of Newton’s method [3, 4]
xn+1 = xn − F 0 (xn )−1 F(xn ) for each n = 0, 1, 2, · · · (21.2.34)
under the conditions (21.2.13) and (21.2.32). It follows from (21.2.1) and (21.2.33) that the convergence radius r* of the method (21.1.2) cannot be larger than the convergence radius r_A of the second order Newton's method (21.2.34). As already noted in [3, 4], r_A is at least as large as the convergence ball given by Rheinboldt [3, 4]:
r_R = 2/(3L).  (21.2.35)
In particular, for L0 < L we have that
rR < rA
and
r_R/r_A → 1/3 as L_0/L → 0.

That is, our convergence ball r_A is at most three times larger than Rheinboldt's. The
same value for rR was given by Traub [3, 4].
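For concreteness, the two radii can be compared numerically; the constants L_0 = e − 1 and L = e of Example 21.3.2 are used here only as sample values:

```python
import math

# Radii comparison for sample constants L0 = e - 1, L = e.
L0, L = math.e - 1.0, math.e
rA = 1.0 / (L0 + L / 2.0)    # = 2/(2*L0 + L), the Argyros radius (21.2.33)
rR = 2.0 / (3.0 * L)         # Rheinboldt/Traub radius (21.2.35)
print(rR, rA)                # rR < rA
```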
5. It is worth noticing that method (21.1.2) does not change when we use the conditions of Theorem 21.2.1 instead of the stronger (C) conditions used in [30]. Moreover, we can compute the computational order of convergence (COC) defined by

ξ = ln( ‖x_{n+1} − x*‖ / ‖x_n − x*‖ ) / ln( ‖x_n − x*‖ / ‖x_{n−1} − x*‖ )

or the approximate computational order of convergence

ξ_1 = ln( ‖x_{n+1} − x_n‖ / ‖x_n − x_{n−1}‖ ) / ln( ‖x_n − x_{n−1}‖ / ‖x_{n−1} − x_{n−2}‖ ).

This way we obtain in practice the order of convergence in a way that avoids the bounds given in [30] involving estimates up to the second Fréchet derivative of the operator F.
F(x) = (sin x, (1/3)(e^x + 2x − 1)).  (21.3.1)
F(v) = (e^x − 1, ((e − 1)/2)y^2 + y, z).  (21.3.2)
2
Then, the Fréchet-derivative is given by
x
e 0 0
F 0 (v) = 0 (e − 1)y + 1 0 .
0 0 1
Notice that x* = (0, 0, 0), F′(x*) = F′(x*)^{−1} = diag(1, 1, 1), L_0 = e − 1 < L = e, M = e, θ = 3/4, γ = 3/10 and α = 3/100. Then, by (21.2.1) we obtain
Example 21.3.3. Returning to the motivational example in the introduction of this chapter, we see that conditions (21.2.12)-(21.2.15) are satisfied for x* = 1, f′(x*) = 3, f(1) = 0, L_0 = L = 146.6629073 and M = 101.5578008. Hence, the results of Theorem 21.2.1 apply, but not the ones in [30]. In particular, for θ = 0.9902, α = 0.008 and γ = 0.005, hypotheses (21.2.9)-(21.2.15) are satisfied. Moreover, we obtain
21.4. Conclusion
We presented a local convergence analysis of modified Halley-like methods with less computation of inversion in order to approximate a solution of an equation in a Banach space setting. Earlier convergence analyses are based on Lipschitz and Hölder-type hypotheses up to the second Fréchet derivative [1]–[20]. In this chapter the local convergence analysis is based only on Lipschitz hypotheses on the first Fréchet derivative. Hence, the applicability of these methods is expanded under less computational cost of the constants involved in the convergence analysis.
References
[1] Ahmad, F., Hussain, S., Mir, N.A., Rafiq, A., New sixth order Jarratt method for solving nonlinear equations, Int. J. Appl. Math. Mech., 5(5) (2009), 27-35.
[2] Amat, S., Hernández, M.A., Romero, N., A modified Chebyshev’s iterative method
with at least sixth order of convergence, App. Math. Comp., 206(1) (2008), 164-174.
[4] Argyros, I. K., Hilout, S., A convergence analysis for directional two-step Newton
methods, Numer. Algor., 55 (2010), 503-528.
[5] Argyros, I. K., Magreñán, Á.A., On the convergence of an optimal fourth-order family
of methods and its dynamics. App. Math. Comp., 252 (2015), 336-346.
[6] Bruns, D.D., Bailey, J.E., Nonlinear feedback control for operating a nonisothermal
CSTR near an unstable steady state, Chem. Eng. Sci., 32 (1977), 257-264.
[7] Candela, V., Marquina, A., Recurrence relations for rational cubic methods I: The
Halley method, Computing, 44 (1990), 169-184.
[8] Candela, V., Marquina, A., Recurrence relations for rational cubic methods II: The
Chebyshev method, Computing, 45 (1990), 355-367.
[9] Chun, C., Some improvements of Jarratt’s method with sixth-order convergence, App.
Math. Comp., 190(2) (2007), 1432-1437.
[10] Ezquerro, J.A., Hernández, M.A., A uniparametric Halley-type iteration with free second derivative, Int. J. Pure Appl. Math., 6(1) (2003), 99-110.
[11] Ezquerro, J.A., Hernández, M.A., New iterations of R-order four with reduced com-
putational cost. BIT Numer. Math., 49 (2009), 325-342.
[12] Ezquerro, J.A., Hernández, M.A., On the R-order of the Halley method, J. Math. Anal.
Appl., 303 (2005), 591-601.
[13] Gutiérrez, J.M., Hernández, M.A., Recurrence relations for the super-Halley method,
Computers Math. Applic. 36(7) (1998), 1-8.
384 Ioannis K. Argyros and Á. Alberto Magreñán
[14] Ganesh, M., Joshi, M.C., Numerical solvability of Hammerstein integral equations of
mixed type, IMA J. Numer. Anal., 11 (1991), 21-31.
[15] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.
[17] Hernández, M.A., Salanova, M.A., Sufficient conditions for semilocal convergence of a fourth order multipoint iterative method for solving equations in Banach spaces, Southwest J. Pure Appl. Math., (1) (1999), 29-40.
[18] Jarratt, P., Some fourth order multipoint methods for solving equations, Math. Comp.
20(95) (1966), 434-437.
[19] Kou, J., Li, Y., An improvement of the Jarratt method, App. Math. Comp., 189 (2007),
1816-1821.
[20] Kou, J., Wang, X., Semilocal convergence of a modified multi-point Jarratt method
in Banach spaces under general continuity conditions, Numer. Algorithms, 60 (2012),
369-390.
[21] Magreñán, Á.A., Different anomalies in a Jarratt family of iterative root-finding meth-
ods, App. Math. Comp., 233 (2014), 29–38.
[22] Parhi, S.K., Gupta, D.K., Semilocal convergence of a Stirling-like method in Banach
spaces, Int. J. Comput. Methods, 7(02) (2010), 215-228.
[23] Parhi, S.K., Gupta, D.K., Recurrence relations for a Newton-like method in Banach
spaces, J. Comput. Appl. Math., 206(2) (2007), 873-887.
[24] Rall, L.B., Computational solution of nonlinear operator equations, Robert E. Krieger,
New York (1979).
[25] Ren, H., Wu, Q., Bi, W., New variants of Jarratt method with sixth-order convergence,
Numer. Algorithms 52(4) (2009), 585-603.
[26] Wang, X., Kou, J., Li, Y., Modified Jarratt method with sixth order convergence, Appl.
Math. Let., 22 (2009), 1798-1802.
[27] Ye, X., Li, C., Convergence of the family of the deformed Euler-Halley iterations
under the Hölder condition of the second derivative, J. Comp. Appl. Math., 194(2)
(2006), 294-308.
[28] Ye, X., Li, C., Shen, W., Convergence of the variants of the Chebyshev-Halley itera-
tion family under the Hölder condition of the first derivative, J. Comput. Appl. Math.,
203(1) (2007), 279-288.
[29] Zhao, Y., Wu, Q., Newton-Kantorovich theorem for a family of modified Halley’s
method under Hölder continuity condition in Banach spaces, App. Math. Comp.,
202(1) (2008), 243-251.
[30] Wang, X., Kou, J., Convergence for modified Halley-like methods with less computa-
tion of inversion, J. Diff. Eq. and Appl., 19(9) (2013), 1483-1500.
Chapter 22
22.1. Introduction
In this chapter we are concerned with the problem of approximating a solution x∗ of the
equation
F(x) = 0, (22.1.1)
where F is a Fréchet-differentiable operator defined on a convex subset D of a Banach space
X with values in a Banach space Y .
Many problems in computational sciences and other disciplines can be brought in a
form like (22.1.1) using mathematical modelling [11, 12, 29, 31]. The solutions of these
equations can rarely be found in closed form. That is why most solution methods for
these equations are iterative. The study of the convergence of iterative procedures usually falls into two types: semilocal and local convergence analysis. Semilocal convergence analysis uses information around an initial point to give conditions ensuring convergence of the iterative procedure, while local analysis uses information around a solution to find estimates of the radii of convergence balls. In particular, the practice of numerical functional analysis for finding a solution x^* of equation (22.1.1) is essentially connected to variants of Newton's method. This method converges quadratically to x^* if the initial guess is close enough to the solution. Iterative methods of convergence order higher than two, such as Chebyshev–Halley-type methods [5, 6, 11, 14, 23, 30, 12, 19, 20, 21, 22, 24, 25, 26, 27, 28, 31, 33], require the evaluation of the second Fréchet derivative, which is very expensive in general. There are, however, integral equations where the second Fréchet derivative is diagonal by blocks and inexpensive to evaluate, and for quadratic equations the second Fréchet derivative is constant. Moreover, in some applications involving stiff systems, high-order methods are useful. That is why, in a unified way, we study the local convergence of the improved Jarratt-type method (IJTM) defined
for each n = 0, 1, 2, . . . by
where x_0 is an initial point and I is the identity operator. If we set H_n = F'(x_n)^{-1}(F'(y_n) - F'(x_n)), then using some algebraic manipulation we obtain that
J_n = \frac{1}{2}\left[I + \left(I + \frac{3}{2}H_n\right)^{-1}\right] = I - \frac{3}{4}\left(I + \frac{3}{2}H_n\right)^{-1} H_n. \qquad (22.1.3)
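The algebraic identity behind (22.1.3) can be sanity-checked in the scalar case, where the operator H_n is replaced by a real number h and I by 1 (a quick check only, not part of the analysis):

```python
# scalar check of (1/2)*(I + (I + (3/2)H)^(-1)) = I - (3/4)*(I + (3/2)H)^(-1)*H
def lhs(h):
    return 0.5 * (1.0 + 1.0 / (1.0 + 1.5 * h))

def rhs(h):
    return 1.0 - 0.75 * h / (1.0 + 1.5 * h)

for h in (-0.3, 0.0, 0.17, 2.5):        # sample values with 1 + 1.5*h != 0
    assert abs(lhs(h) - rhs(h)) < 1e-14, h
print("identity verified on sample values")
```

The same computation goes through for operators, since both sides commute with (I + (3/2)H)^{-1}.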
This method has been shown to be of convergence order between 5 and 6 [29, 33]. The
usual conditions for the semilocal convergence of these methods are (C ):
or
or \|F'''(x) - F'''(y)\| \le \varphi(\|x - y\|) for each x, y \in D, where \varphi : [0, +\infty) \to [0, +\infty) is a non-decreasing function.
The local convergence conditions are similar, but x_0 is replaced by x^* in (C1) and (C2). There is a plethora of local and semilocal convergence results under the (C) conditions [1]–[33]. These conditions restrict the applicability of these methods. That is why, in this chapter, we assume the conditions (A):
and
Notice that the (A ) conditions are weaker than the (C ) conditions. Hence, the applicability
of (IJTM) is expanded under the (A ) conditions.
As a motivational example, let us define function f on D = U(1, 3/2) by

f(x) = \begin{cases} x^3 \ln x^2 + x^5 - x^4, & x \neq 0, \\ 0, & x = 0. \end{cases}

Choose x^* = 1. We have that

f'''(x) = 6 \ln x^2 + 60x^2 - 24x + 22,

so f''' is unbounded on D, and conditions involving the third derivative cannot hold on all of D.
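Numerically, the trouble with this function is easy to see. In the sketch below (plain Python), f'(1) = 3 is confirmed by a central difference, and the third derivative — computed by hand here as f'''(x) = 6 ln x² + 60x² − 24x + 22 — blows up as x → 0, so hypotheses on higher derivatives fail on D:

```python
import math

def f(x):
    return x**3 * math.log(x**2) + x**5 - x**4 if x != 0 else 0.0

def fppp(x):
    # third derivative, computed by hand: 6*ln(x^2) + 60*x^2 - 24*x + 22
    return 6.0 * math.log(x**2) + 60.0 * x**2 - 24.0 * x + 22.0

h = 1e-6
fp1 = (f(1.0 + h) - f(1.0 - h)) / (2.0 * h)   # central difference at x* = 1
assert abs(f(1.0)) < 1e-15 and abs(fp1 - 3.0) < 1e-5

for x in (1e-1, 1e-4, 1e-8):
    print(x, fppp(x))   # tends to -infinity as x -> 0
```

The first Fréchet derivative, in contrast, does satisfy a (large) Lipschitz condition on D, which is why the (A)-type conditions still apply.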
Define functions f_1 and f_2 on the interval [0, 1/L_0) by

f_1(t) = \frac{Lt}{2(1 - L_0 t)} \qquad (22.2.4)

and

f_2(t) = \frac{1}{3}\left(1 + \frac{Lt}{1 - L_0 t}\right). \qquad (22.2.5)
Then, we have by the choice of rA that
f1 (t) ≤ 1 for each t ∈ [0, rA] (22.2.6)
and
f2 (t) ≤ 1 for each t ∈ [0, rA]. (22.2.7)
Define function f_3 on the interval [0, 1/L_0) by

f_3(t) = \frac{(Lt)^2}{2(1 - L_0 t)^2}. \qquad (22.2.8)
Then, we have that
f3 (t) ≤ 1 for each t ∈ [0, r0] (22.2.9)
and
f3 (t) < 1 for each t ∈ [0, r0). (22.2.10)
Moreover, define functions f_4 and f_5 on the interval [0, r_0) by

f_4(t) = \frac{Lt}{2(1 - L_0 t)}\left(1 + \frac{2L^2 K t}{2(1 - L_0 t)^2 - L^2 t^2}\right) \qquad (22.2.11)

and

f_5(t) = \left(1 + \frac{2K}{2(1 - L_0 t)^2 - L^2 t^2}\right) f_4(t). \qquad (22.2.12)

Furthermore, define functions \bar{f}_4 and \bar{f}_5 on the interval [0, r_0) by

\bar{f}_4(t) = f_4(t) - 1 \qquad (22.2.13)

and

\bar{f}_5(t) = f_5(t) - 1. \qquad (22.2.14)

We have that \bar{f}_4(0) = \bar{f}_5(0) = -1 < 0 and \bar{f}_4(t) \to +\infty, \bar{f}_5(t) \to +\infty as t \to r_0^-. It follows from the intermediate value theorem that \bar{f}_4 and \bar{f}_5 have zeros in (0, r_0). Denote by r_4 and r_5 the minimal zeros of \bar{f}_4 and \bar{f}_5 on (0, r_0), respectively. Finally, define

r = \min\{r_4, r_5\}. \qquad (22.2.15)
Then, we have by the choice of r that
f1 (t) < 1, (22.2.16)
f2 (t) < 1, (22.2.17)
f3 (t) < 1, (22.2.18)
f4 (t) < 1, (22.2.19)
and
f5 (t) < 1 for each t ∈ [0, r). (22.2.20)
Next, we present the main local convergence result for IJTM under the (A) conditions.
Improved Jarratt-Type Method 391
Theorem 22.2.1. Suppose that the (A) conditions hold and U(x^*, r) \subseteq D, where r is given by (22.2.15). Then, sequence \{x_n\} generated by IJTM (22.1.2) for any x_0 \in U(x^*, r) is well defined, remains in U(x^*, r) for each n = 0, 1, 2, \ldots and converges to x^*. Moreover, the following estimates hold for each n = 0, 1, 2, \ldots
Proof. We shall use induction to show that estimates (22.2.20) hold for each n = 0, 1, 2, . . .
Using (A2 ) and the hypothesis x0 ∈ U(x∗ , r), we have that
by the choice of r. It follows from (22.2.22) and the Banach lemma on invertible operators [11, 12, 28] that F'(x_0)^{-1} \in L(Y, X) and

\|F'(x_0)^{-1} F'(x^*)\| \le \frac{1}{1 - L_0\|x_0 - x^*\|} < \frac{1}{1 - L_0 r}. \qquad (22.2.23)
Using the first substep of IJTM for n = 0, F(x^*) = 0, (A1), (A2), (22.2.22) and the choice of r, we get that

u_0 - x^* = -\left[F'(x_0)^{-1} F'(x^*)\right] \left[\int_0^1 F'(x^*)^{-1}\left(F'(x^* + \theta(x_0 - x^*)) - F'(x_0)\right) d\theta\,(x_0 - x^*)\right], \qquad (22.2.24)
so

\|u_0 - x^*\| \le \frac{L\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)} = f_1(\|x_0 - x^*\|)\,\|x_0 - x^*\| < \|x_0 - x^*\| < r, \qquad (22.2.25)

which shows u_0 \in U(x^*, r). Using the second substep of IJTM, we get by (22.2.25) and (22.2.17) that

y_0 - x^* = x_0 - x^* + \frac{2}{3}(u_0 - x_0) = x_0 - x^* + \frac{2}{3}(u_0 - x^*) + \frac{2}{3}(x^* - x_0) = \frac{1}{3}(x_0 - x^*) + \frac{2}{3}(u_0 - x^*),

so

\|y_0 - x^*\| \le \frac{1}{3}\|x_0 - x^*\| + \frac{2}{3}\|u_0 - x^*\| \le f_2(r)\,\|x_0 - x^*\| < r,

which shows that y_0 \in U(x^*, r).
Next, we shall find upper bounds on \|H_0\| and \|J_0\|. Using (A1), (22.2.24) and (22.2.18), we get that

\left\|\frac{3}{2}H_0\right\| \le \frac{3}{2}\,\|F'(x_0)^{-1} F'(x^*)\|\,\|F'(x^*)^{-1}(F'(y_0) - F'(x_0))\| \le \frac{3}{2}\,\frac{L\|y_0 - x_0\|}{1 - L_0\|x_0 - x^*\|} \le \frac{3}{2}\cdot\frac{2}{3}\,\frac{L\|u_0 - x_0\|}{1 - L_0\|x_0 - x^*\|} \le \frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2} \le \left(\frac{Lr}{\sqrt{2}\,(1 - L_0 r)}\right)^2 = f_3(r) < 1. \qquad (22.2.26)
It follows from (22.2.26) and the Banach lemma on invertible operators that \left(I + \frac{3}{2}H_0\right)^{-1} \in L(Y, X) and

\left\|\left(I + \frac{3}{2}H_0\right)^{-1}\right\| \le \frac{1}{1 - \dfrac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}} < \frac{1}{1 - \dfrac{L^2 r^2}{2(1 - L_0 r)^2}}. \qquad (22.2.27)
Then, from the fourth substep of IJTM for n = 0, (22.2.25), (22.2.26), (22.2.27), (22.2.19) and (A4),

z_0 = x_0 - F'(x_0)^{-1} F(x_0) + \frac{3}{4}\left(I + \frac{3}{2}H_0\right)^{-1} H_0\, F'(x_0)^{-1} F(x_0)
so

\|F'(x^*)^{-1} F(x_0)\| \le K\|x_0 - x^*\| \quad \text{by (A4)}. \qquad (22.2.30)
Next, using the last substep of IJTM for n = 0, (22.2.23), (22.2.27), (22.2.19) and (22.2.30) (with x_0 replaced by z_0), we get in turn that

\|x_1 - x^*\| \le \|z_0 - x^*\| + \frac{1}{1 - \dfrac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}}\cdot\frac{K\|z_0 - x^*\|}{1 - L_0\|x_0 - x^*\|}\left(1 + \frac{2K(1 - L_0\|x_0 - x^*\|)}{2(1 - L_0\|x_0 - x^*\|)^2 - L^2\|x_0 - x^*\|^2}\right) \le f_5(\|x_0 - x^*\|)\,\|x_0 - x^*\| \le f_5(r)\,\|x_0 - x^*\| < \|x_0 - x^*\|, \qquad (22.2.31)
Remark 22.2.2. (a) Condition (A2) can be dropped, since it follows from (A3). Notice, however, that

L_0 \le L \qquad (22.2.32)

holds in general and L/L_0 can be arbitrarily large [2, 3, 4, 5, 6].

K(r) = 1 + L_0 r. \qquad (22.2.33)
The convergence ball of radius r_A was given by us in [2, 3, 5] for Newton's method under conditions (A1)–(A3). Estimate (22.2.22) shows that the convergence ball of the higher-than-two-order IJTM is smaller than the convergence ball of the quadratically convergent Newton's method. The convergence ball given by Rheinboldt [31] for Newton's method is

r_R = \frac{2}{3L} < r_A \qquad (22.2.35)

if L_0 < L, and r_R/r_A \to 1/3 as L_0/L \to 0. Hence, we do not expect r to be larger than r_A no matter how we choose L_0, L and K. Finally, note that if \alpha = 0, then IJTM reduces to Newton's method and r = r_A.
(d) The local results can be used for projection methods such as Arnoldi’s method, the
generalized minimum residual method (GMREM), the generalized conjugate method
(GCM) for combined Newton/finite projection methods and in connection to the mesh
independence principle in order to develop the cheapest and most efficient mesh re-
finement strategy [11, 12, 31].
(e) The results can also be used to solve equations where the operator F' satisfies the autonomous differential equation [11, 12, 29, 31]:
(f) It is worth noticing that IJTM does not change if we use the (A) instead of the (C) conditions. Moreover, for the error bounds in practice we can use the computational order of convergence (COC) [1, 2, 3, 4, 11, 12, 15]:

\xi = \sup_{n} \frac{\ln\left(\|x_{n+2} - x_{n+1}\| / \|x_{n+1} - x_n\|\right)}{\ln\left(\|x_{n+1} - x_n\| / \|x_n - x_{n-1}\|\right)}, \quad n = 1, 2, \ldots
Notice that x^* = (0, 0, 0)^T, F'(x^*) = F'(x^*)^{-1} = diag\{1, 1, 1\}, L_0 = e - 1 < L = K = e, r_0 = 0.274695\ldots < r_A = 0.324967\ldots < 1/L_0 = 0.581977\ldots, and r = 0.144926\ldots.
Example 22.3.2. Let X = Y = C([0, 1]), the space of continuous functions defined on [0, 1], equipped with the max norm. Let D = U(0, 1). Define function F on D by

F(\varphi)(x) = \varphi(x) - 5\int_0^1 x\theta\,\varphi(\theta)^3\, d\theta. \qquad (22.3.2)

We have that

F'(\varphi)(\xi)(x) = \xi(x) - 15\int_0^1 x\theta\,\varphi(\theta)^2 \xi(\theta)\, d\theta \quad \text{for each } \xi \in D.

Then, we get that x^* = 0, L_0 = 7.5, L = 15 and K = K(t) = 1 + 7.5t, r_0 = 0.055228\ldots < r_A = 0.066666\ldots < 1/L_0 = 0.133333\ldots, r = 0.0370972\ldots.
Example 22.3.3. Returning to the motivational example at the introduction of this chapter, let f be the function defined on D = U(1, 3/2) by

f(x) = \begin{cases} x^3 \ln x^2 + x^5 - x^4, & x \neq 0, \\ 0, & x = 0. \end{cases}
[1] Amat, S., Busquier, S., Gutiérrez, J. M., Geometric constructions of iterative functions to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197-205.
[2] Amat, S., Busquier, S., Plaza, S., Review of some iterative root-finding methods from
a dynamical point of view, Scientia Series A: Mathematical Sciences, 10 (2004), 3-35.
[3] Amat, S., Busquier, S., Plaza, S., Dynamics of the King and Jarratt iteration, Aequa-
tiones Math., 69, (2005), 3, 212-223.
[4] Amat, S., Busquier, S., Plaza, S., Chaotic dynamics of a third-order Newton-type
method, J. Math. Anal. Appl., 366(1) (2010), 24-32.
[6] Argyros, I. K., A note on the Halley method in Banach spaces, App. Math. Comp., 58
(1993), 215-224.
[7] Argyros, I. K., The Jarratt method in a Banach space setting, J. Comp. Appl. Math.,
51 (1994), 103-106.
[9] Argyros, I. K., A new convergence theorem for the Jarratt method in Banach spaces, Computers and Mathematics with Applications, 36(8) (1998), 13-18.
[10] Argyros, I. K., Chen, D., An inverse-free Jarratt type approximation in a Banach space, J. Approx. Th. Appl., 12(1) (1996), 19-30.
[11] Argyros, I. K., Computational theory of iterative methods. Series: Studies in Com-
putational Mathematics, 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co.
New York, U.S.A, 2007.
[12] Argyros, I. K., Hilout, S., Numerical methods in Nonlinear Analysis, World Scientific
Publ. Comp. New Jersey, 2013.
[13] Argyros, I. K., Hilout, S., Weaker conditions for the convergence of Newton's method, J. Complexity, 28 (2012), 364-387.
[14] Argyros, I. K., Magreñán, Á.A., On the convergence of an optimal fourth-order family
of methods and its dynamics. App. Math. Comp., 252 (2015), 336-346.
[15] Chicharro, F., Cordero, A., Gutiérrez, J. M., Torregrosa, J. R., Dynamics of derivative-free methods for nonlinear equations, App. Math. Comp., 219 (2013), 7023-7035.
[16] Chun, C., Lee, M. Y., Neta, B., Dzunić, J., On optimal fourth-order iterative methods
free from second derivative and their dynamics, App. Math. Comp., 218 (2012), 6427-
6438.
[17] Candela, V., Marquina, A., Recurrence relations for rational cubic methods I: The
Halley method, Computing, 44 (1990), 169-184.
[18] Candela, V., Marquina, A., Recurrence relations for rational cubic methods II: The
Chebyshev method, Computing, 45 (1990), 355-367.
[19] Ezquerro, J.A., Hernández, M. A., Avoiding the computation of the second Fréchet-
derivative in the convex acceleration of Newton’s method, J. Comp. Appl. Math., 96
(1998), 1-12.
[20] Ezquerro, J.A., Hernández, M. A., On Halley-type iterations with free second deriva-
tive, J. Comp. Appl. Math., 170 (2004), 455-459.
[21] Gutiérrez, J. M., Hernández, M. A., Recurrence relations for the super-Halley method,
Computers Math. Applic. 36 (1998), 1-8.
[22] Gutiérrez, J. M., Hernández, M. A., Third-order iterative methods for operators with
bounded second derivative, J. Comp. Appl. Math., 82 (1997), 171-183.
[23] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79–88.
[24] Hernández, M. A., Salanova, M. A., Modification of the Kantorovich assumptions for
semilocal convergence of the Chebyshev method, J. Comp. Appl. Math., 126 (2000),
131-143.
[26] Hernández, M. A., Reduced recurrence relations for the Chebyshev method, J. Optim.
Theory App., 98 (1998), 385-397.
[27] Jarratt, P., Some fourth order multipoint iterative methods for solving equations, Math. Comp., 20(95) (1966), 434-437.
[28] Kantorovich, L.V., Akilov, G. P., Functional Analysis, Pergamon Press, Oxford, 1982.
[29] Kou, J., Li, Y., An improvement of the Jarratt method, App. Math. Comp., 189, 2,
(2007), 1816-1821.
[30] Magreñán, Á.A., Different anomalies in a Jarratt family of iterative root-finding meth-
ods, App. Math. Comp., 233 (2014), 29–38.
[31] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic press, New York, 1970.
[32] Parida, P.K., Gupta, D.K., Semilocal convergence of a family of third order methods
in Banach spaces under Hölder continuous second derivative, Nonl. Anal., 69 (2008),
4163-4173.
[33] Ren, H., Wu, Q., Bi, W., New variants of Jarratt method with sixth-order convergence, Numer. Algorithms, 52(4) (2009), 585-603.
Chapter 23
23.1. Introduction
Let X , Y be Banach spaces and D be a non-empty, convex and open subset in X . Let
U(x, r) and U(x, r) stand, respectively, for the open and closed ball in X with center x and
radius r > 0. Denote by L (X , Y ) the space of bounded linear operators from X into Y . In
the present chapter we are concerned with the problem of approximating a locally unique solution x^* of the equation

F(x) = 0, \qquad (23.1.1)
where F is a Fréchet continuously differentiable operator defined on D with values in Y .
A lot of problems from computational sciences and other disciplines can be brought in
the form of equation (23.1.1) using Mathematical Modelling [8, 10, 14]. The solution of
these equations can rarely be found in closed form. That is why most solution methods for
these equations are iterative. In particular, the practice of numerical analysis for finding
such solutions is essentially connected to variants of Newton’s method [8, 10, 14, 22, 25,
27, 32].
A very important aspect in the study of iterative procedures is the convergence domain. In general the convergence domain is small, so it is important to enlarge it without additional hypotheses. This is our goal in this chapter.
In the present chapter we study the secant-like method defined by

x_{-1}, x_0 \in D, \quad y_n = \lambda x_n + (1 - \lambda)x_{n-1}, \quad x_{n+1} = x_n - B_n^{-1} F(x_n), \quad B_n = [y_n, x_n; F], \qquad (23.1.2)

for each n = 0, 1, 2, \ldots, where \lambda \in [0, 1]. The family of secant-like methods reduces to the secant method if \lambda = 0 and to Newton's method if \lambda = 1. It was shown in [27] (see also [7, 8, 21] and the references therein) that the R-order of convergence is at least (1 + \sqrt{5})/2 if \lambda \in [0, 1), the same as that of the secant method. In the real case, the closer x_n and y_n are, the higher the speed of convergence.
Moreover, in [19] it was shown that as \lambda approaches 1 the speed of convergence approaches that of Newton's method. There also exist new graphical tools [24]. Furthermore, an advantage of using the secant-like method instead of Newton's method is that the former avoids the computation of F'(x_n)^{-1} at each step. The study of the convergence of iterative procedures usually centers on two types: semilocal and local convergence analysis. Semilocal convergence analysis uses information around an initial point to give criteria ensuring convergence of the iterative procedure, while local analysis uses information around a solution to find estimates of the radii of convergence balls. There is a plethora of studies on the weakening and/or extension of the hypotheses made on the underlying operators; see for example [1]–[34].
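A scalar sketch of the family (plain Python). The divided difference here is the usual scalar one, [x, y; F] = (F(x) − F(y))/(x − y), and the iteration is taken in the standard form y_n = λx_n + (1 − λ)x_{n−1}, x_{n+1} = x_n − F(x_n)/[y_n, x_n; F], so that λ = 0 gives the secant method and λ = 1 Newton's method (with the derivative approximated by a symmetric difference):

```python
def dd(F, x, y):
    # scalar divided difference [x, y; F]; for x == y it is F'(x),
    # approximated here by a symmetric difference
    if x == y:
        h = 1e-7
        return (F(x + h) - F(x - h)) / (2.0 * h)
    return (F(x) - F(y)) / (x - y)

def secant_like(F, x_prev, x0, lam, steps=25, tol=1e-12):
    # y_n = lam*x_n + (1 - lam)*x_{n-1};  x_{n+1} = x_n - F(x_n)/[y_n, x_n; F]
    xm, x = x_prev, x0
    for _ in range(steps):
        y = lam * x + (1.0 - lam) * xm
        xm, x = x, x - F(x) / dd(F, y, x)
        if abs(x - xm) < tol:
            break
    return x

F = lambda x: x**3 - 2.0                 # root: 2**(1/3)
for lam in (0.0, 0.5, 1.0):
    print(lam, secant_like(F, 1.0, 1.5, lam))
```

Intermediate values of λ interpolate between the two methods, which is the point of the family.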
The hypotheses used for the semilocal convergence of the secant-like method are (see [8, 18, 19, 21]):

(C1) There exists a divided difference of order one, denoted by [x, y; F] \in L(X, Y), satisfying

[x, y; F](x - y) = F(x) - F(y) \quad \text{for all } x, y \in D;

\|x_0 - x_{-1}\| \le c;

\|A_0^{-1}([x, y; F] - [u, v; F])\| \le M\left(\|x - u\| + \|y - v\|\right) \quad \text{for all } x, y, u, v \in D;

\|A_0^{-1}([x, y; F] - [v, y; F])\| \le L\|x - v\| \quad \text{for all } x, y, v \in D;

(C3**) There exist x_{-1}, x_0 \in D and K > 0 such that F'(x_0)^{-1} \in L(Y, X) and

\|A_0^{-1} F(x_0)\| \le \eta;

\|B_0^{-1} F(x_0)\| \le \eta.
We shall refer to (C1)–(C4) as the (C) conditions. From analyzing the semilocal convergence of the simplified secant method, it was shown in [18] that the convergence criteria are milder than those of the secant-like method given in [20]. Consequently, the decreasing and accessibility regions of (23.1.2) can be improved. Moreover, the semilocal convergence of (23.1.2) is guaranteed.

In the present chapter we show that an even larger convergence domain can be obtained under the same or weaker sufficient convergence criteria for method (23.1.2). In view of (C3) we have that
Enlarging the Convergence Domain of Secant-Like Methods for Equations 403
\|A_0^{-1}([x, y; F] - [x_{-1}, x_0; F])\| \le M_0\left(\|x - x_{-1}\| + \|y - x_0\|\right) \quad \text{for all } x, y \in D.

(C7) There exist x_0 \in D and M_2 > 0 such that F'(x_0)^{-1} \in L(Y, X) and
q_n = (1 - \lambda)(t_n - t_0) + (1 + \lambda)(t_{n+1} - t_0),

t_{n+2} = t_{n+1} + \frac{K\left(t_{n+1} - t_n + (1 - \lambda)(t_n - t_{n-1})\right)}{1 - M_1 q_n}\,(t_{n+1} - t_n), \qquad (23.2.1)

\alpha_n = \frac{K\left(t_{n+1} - t_n + (1 - \lambda)(t_n - t_{n-1})\right)}{1 - M_1 q_n}, \qquad (23.2.2)

functions \{f_n\} for each n = 1, 2, \ldots by
and polynomial p by

0 < \alpha_0 \le \alpha \le 1 - 2M_1\eta. \qquad (23.2.5)

c + \eta \le t^* \le t^{**}. \qquad (23.2.7)

0 \le t_{n+1} - t_n \le \alpha^n \eta \qquad (23.2.8)

and

t^* - t_n \le \frac{\alpha^n \eta}{1 - \alpha}. \qquad (23.2.9)
Proof. We shall first prove that polynomial p has roots in (0, 1). If \lambda \neq 1, then p(0) = -(1 - \lambda)K < 0 and p(1) = 2M_1 > 0. If \lambda = 1, then p(t) = t\,\bar{p}(t) with \bar{p}(0) = -K < 0 and \bar{p}(1) = 2M_1 > 0. In either case it follows from the intermediate value theorem that p has roots in (0, 1). Denote by \alpha the minimal root of p in (0, 1). Note that, in particular, for the secant method (i.e., \lambda = 0) and for Newton's method (i.e., \lambda = 1), we have, respectively, by (23.2.4) that

\alpha = \frac{2K}{K + \sqrt{K^2 + 4M_1K}} \qquad (23.2.10)

and

\alpha = \frac{2K}{K + \sqrt{K^2 + 8M_1K}}. \qquad (23.2.11)
It follows from (23.2.1) and (23.2.2) that estimate (23.2.8) is satisfied if
0 ≤ αn ≤ α. (23.2.12)
t_2 - t_1 \le \alpha(t_1 - t_0) \implies t_2 \le t_1 + \alpha(t_1 - t_0) \implies t_2 \le \eta + t_0 + \alpha\eta = c + (1 + \alpha)\eta = c + \frac{1 - \alpha^2}{1 - \alpha}\,\eta < t^{**}.
Suppose that

t_{k+1} - t_k \le \alpha^k \eta \quad \text{and} \quad t_{k+1} \le c + \frac{1 - \alpha^{k+1}}{1 - \alpha}\,\eta. \qquad (23.2.13)
Estimate (23.2.12) shall be true for k + 1 replacing n if
0 ≤ αk+1 ≤ α (23.2.14)
Enlarging the Convergence Domain of Secant-Like Methods for Equations 405
or
fk (α) ≤ 0. (23.2.15)
We need a relationship between two consecutive recurrent functions f_k for each k = 1, 2, \ldots. It follows from (23.2.3) and (23.2.4) that

f_\infty(\alpha) := \lim_{n\to\infty} f_n(\alpha) = M_1\eta\left[(1 - \lambda)\lim_{n\to\infty}(1 + \alpha + \cdots + \alpha^n) + (1 + \lambda)\lim_{n\to\infty}(1 + \alpha + \cdots + \alpha^{n+1})\right] - 1 = M_1\eta\left(\frac{1 - \lambda}{1 - \alpha} + \frac{1 + \lambda}{1 - \alpha}\right) - 1 = \frac{2M_1\eta}{1 - \alpha} - 1, \qquad (23.2.18)

since \alpha \in (0, 1). In view of (23.2.15), (23.2.16) and (23.2.18), we can show instead of (23.2.15) that

f_\infty(\alpha) \le 0, \qquad (23.2.19)

which is true by (23.2.5). The induction for (23.2.8) is complete. It follows that sequence \{t_n\} is non-decreasing, bounded from above by t^{**} given by (23.2.6), and as such it converges to t^*, which satisfies (23.2.7). Estimate (23.2.9) follows from (23.2.8) by using standard majorization techniques [8, 10, 22]. The proof of Lemma 23.2.1 is complete.
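The recursion (23.2.1) is straightforward to run. A sketch (plain Python; the constants c, η, M_1, K and λ below are illustrative sample values, chosen small enough that the denominators 1 − M_1 q_n stay positive):

```python
def majorizing_t(c, eta, M1, K, lam, n_terms=30):
    # t_{-1} = 0, t_0 = c, t_1 = c + eta, then (23.2.1):
    # t_{n+2} = t_{n+1} + K*(t_{n+1} - t_n + (1 - lam)*(t_n - t_{n-1}))
    #                       / (1 - M1*q_n) * (t_{n+1} - t_n),
    # q_n = (1 - lam)*(t_n - t_0) + (1 + lam)*(t_{n+1} - t_0)
    t = [0.0, c, c + eta]
    for _ in range(n_terms):
        tm, tn, tp = t[-3], t[-2], t[-1]
        q = (1.0 - lam) * (tn - c) + (1.0 + lam) * (tp - c)
        alpha_n = K * (tp - tn + (1.0 - lam) * (tn - tm)) / (1.0 - M1 * q)
        t.append(tp + alpha_n * (tp - tn))
    return t[1:]                         # drop the auxiliary t_{-1}

t = majorizing_t(c=0.1, eta=0.1, M1=1.0, K=1.5, lam=0.5)
assert all(b >= a for a, b in zip(t, t[1:]))   # non-decreasing
print(t[-1])                             # close to the limit t*
```

With these sample constants the increments shrink roughly geometrically, as the lemma predicts.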
Lemma 23.2.2. Let c \ge 0, \eta > 0, M_1 > 0, K > 0 and \lambda \in [0, 1]. Set r_{-1} = 0, r_0 = c and r_1 = c + \eta. Define scalar sequence \{r_n\} for each n = 1, 2, \ldots by

r_2 = r_1 + \beta_1(r_1 - r_0), \qquad r_{n+2} = r_{n+1} + \beta_n(r_{n+1} - r_n), \qquad (23.2.20)

where

\beta_1 = \frac{M_1\left(r_1 - r_0 + (1 - \lambda)(r_0 - r_{-1})\right)}{1 - M_1 q_1}, \qquad \beta_n = \frac{K\left(r_{n+1} - r_n + (1 - \lambda)(r_n - r_{n-1})\right)}{1 - M_1 q_n} \quad \text{for each } n = 2, 3, \ldots,

and functions \{g_n\} on [0, 1) for each n = 1, 2, \ldots by

g_n(t) = K\left(t^n + (1 - \lambda)t^{n-1}\right)(r_2 - r_1) + M_1 t\left(\frac{(1 - \lambda)(1 - t^n)}{1 - t} + \frac{(1 + \lambda)(1 - t^{n+1})}{1 - t}\right)(r_2 - r_1) + (2M_1\eta - 1)\,t. \qquad (23.2.21)
Suppose that

0 \le \beta_1 \le \alpha \le 1 - \frac{2M_1(r_2 - r_1)}{1 - 2M_1\eta}, \qquad (23.2.22)

where \alpha is defined in Lemma 23.2.1. Then, sequence \{r_n\} is non-decreasing, bounded from above by r^{**} defined by

r^{**} = c + \eta + \frac{r_2 - r_1}{1 - \alpha} \qquad (23.2.23)

and converges to its unique least upper bound r^*, which satisfies

c + \eta \le r^* \le r^{**}. \qquad (23.2.24)

Moreover, the following estimates are satisfied for each n = 1, 2, \ldots

0 \le r_{n+2} - r_{n+1} \le \alpha^n (r_2 - r_1). \qquad (23.2.25)
Proof. We shall use mathematical induction to show that

0 \le \beta_n \le \alpha. \qquad (23.2.26)

Estimate (23.2.26) is true for n = 1 by (23.2.22). Then, we have by (23.2.20) that

0 \le r_3 - r_2 \le \alpha(r_2 - r_1) \implies r_3 \le r_2 + \alpha(r_2 - r_1) \implies r_3 \le r_1 + \frac{1 - \alpha^2}{1 - \alpha}\,(r_2 - r_1) \le r^{**}.

Suppose (23.2.26) holds for each n \le k; then, using (23.2.20), we obtain that

0 \le r_{k+2} - r_{k+1} \le \alpha^k (r_2 - r_1) \quad \text{and} \quad r_{k+2} \le r_1 + \frac{1 - \alpha^{k+1}}{1 - \alpha}\,(r_2 - r_1). \qquad (23.2.27)
Estimate (23.2.26) is certainly satisfied, if
gk (α) ≤ 0, (23.2.28)
where gk is defined by (23.2.21). Using (23.2.21), we obtain the following relationship
between two consecutive recurrent functions gk for each k = 1, 2, · · ·
g_{k+1}(\alpha) = g_k(\alpha) + p(\alpha)\,\alpha^{k-1}(r_2 - r_1) = g_k(\alpha), \qquad (23.2.29)

since p(\alpha) = 0. Define function g_\infty on [0, 1) by

g_\infty(t) = \lim_{k\to\infty} g_k(t). \qquad (23.2.30)
Remark 23.2.3. Let us see how the sufficient convergence criterion (23.2.5) for sequence \{t_n\} simplifies in the interesting case of Newton's method, that is, when c = 0 and \lambda = 1. Then, (23.2.5) can be written for L_0 = 2M_1 and L = 2K as

h_0 = \frac{1}{8}\left(L + 4L_0 + \sqrt{L^2 + 8L_0 L}\right)\eta \le \frac{1}{2}. \qquad (23.2.32)

The convergence criterion in [18] reduces to the Kantorovich hypothesis, famous for its simplicity and clarity,

h = L\eta \le \frac{1}{2}. \qquad (23.2.33)

Note, however, that L_0 \le L holds in general and L/L_0 can be arbitrarily large [6, 7, 8, 9, 10, 14]. We also have that

h \le \frac{1}{2} \implies h_0 \le \frac{1}{2} \qquad (23.2.34)

but not necessarily vice versa unless L_0 = L, and

\frac{h_0}{h} \to \frac{1}{4} \quad \text{as} \quad \frac{L}{L_0} \to \infty. \qquad (23.2.35)
Similarly, it can easily be seen that the sufficient convergence criterion (23.2.22) for sequence \{r_n\} is given by

h_1 = \frac{1}{8}\left(4L_0 + \sqrt{L_0 L} + \sqrt{8L_0^2 + L_0 L}\right)\eta \le \frac{1}{2}. \qquad (23.2.36)

We also have that

h_0 \le \frac{1}{2} \implies h_1 \le \frac{1}{2} \qquad (23.2.37)

and

\frac{h_1}{h} \to 0, \quad \frac{h_1}{h_0} \to 0 \quad \text{as} \quad \frac{L_0}{L} \to 0. \qquad (23.2.38)
Note that sequence \{r_n\} is tighter than \{t_n\} and converges under weaker conditions. Indeed, a simple inductive argument shows that, for each n = 2, 3, \ldots, if M_1 < K, then
We have the following useful and obvious extensions of Lemma 23.2.1 and Lemma 23.2.2, respectively.

t_1 \le t_2 \le \cdots \le t_N \le t_{N+1}, \qquad (23.2.40)

\frac{1}{M_1} > (1 - \lambda)(t_N - t_0) + (1 + \lambda)(t_{N+1} - t_0) \qquad (23.2.41)

and

0 \le \alpha_N \le \alpha \le 1 - 2M_1(t_{N+1} - t_N). \qquad (23.2.42)

and

t^* - t_{N+n} \le \frac{\alpha^n}{1 - \alpha}\,(t_{N+1} - t_N). \qquad (23.2.44)
Lemma 23.2.5. Let N = 1, 2, \ldots be fixed. Suppose that

r_1 \le r_2 \le \cdots \le r_N \le r_{N+1}, \qquad (23.2.45)

\frac{1}{M_1} > (1 - \lambda)(r_N - r_0) + (1 + \lambda)(r_{N+1} - r_0) \qquad (23.2.46)

and

0 \le \beta_N \le \alpha \le 1 - \frac{2M_1(r_{N+1} - r_N)}{1 - 2M_1(r_N - r_{N-1})}. \qquad (23.2.47)

Then, sequence \{r_n\} generated by (23.2.20) is non-decreasing, bounded from above by r^{**} and converges to r^*, which satisfies r^* \in [r_{N+1}, r^{**}]. Moreover, the following estimates are satisfied for each n = 0, 1, \ldots

and

r^* - r_{N+n} \le \frac{\alpha^n}{1 - \alpha}\,(r_{N+1} - r_N). \qquad (23.2.49)
Next, we present the following semilocal convergence result for the secant-like method under the (C*) conditions.

Theorem 23.2.6. Suppose that the (C*) conditions, the conditions of Lemma 23.2.1 (or Lemma 23.2.4) and

U(x_0, t^*) \subseteq D \qquad (23.2.50)

hold. Then, sequence \{x_n\} generated by the secant-like method is well defined, remains in U(x_0, t^*) for each n = -1, 0, 1, \ldots and converges to a solution x^* \in U(x_0, t^* - c) of equation F(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, \ldots

and

\|x_n - x^*\| \le t^* - t_n. \qquad (23.2.52)

Furthermore, if there exists r \ge t^* such that

U(x_0, r) \subseteq D \qquad (23.2.53)

and

r + t^* < \frac{1}{M_1} \quad \text{or} \quad r + t^* < \frac{2}{M_2}, \qquad (23.2.54)

then the solution x^* is unique in U(x_0, r).
and

U(x_{k+1}, t^* - t_{k+1}) \subseteq U(x_k, t^* - t_k) \qquad (23.2.56)

for each k = -1, 0, 1, \ldots. Let w \in U(x_1, t^* - t_1). Then, we obtain that

\|w - x_0\| \le \|w - x_1\| + \|x_1 - x_0\| \le t^* - t_1 + t_1 - t_0 = t^* - t_0,

so U(x_1, t^* - t_1) \subseteq U(x_0, t^* - t_0), and in particular x_1 \in U(x_0, t^*) \subseteq D. Hence, estimates (23.2.51) and (23.2.52) hold for k = -1 and k = 0. Suppose (23.2.51) and (23.2.52) hold for all n \le k. Then, we obtain that

\|x_{k+1} - x_0\| \le \sum_{i=1}^{k+1} \|x_i - x_{i-1}\| \le \sum_{i=1}^{k+1} (t_i - t_{i-1}) = t_{k+1} - t_0 \le t^*

and

\|y_k - x_0\| \le \lambda\|x_k - x_0\| + (1 - \lambda)\|x_{k-1} - x_0\| \le \lambda t^* + (1 - \lambda)t^* = t^*.
Hence, x_{k+1}, y_k \in U(x_0, t^*). Let E_k := [x_{k+1}, x_k; F] for each k = 0, 1, \ldots. Using (23.1.2), Lemma 23.2.1 and the induction hypotheses, we get that

It follows from (23.2.57) and the Banach lemma on invertible operators that B_{k+1}^{-1} exists and

\|B_{k+1}^{-1} F'(x_0)\| \le \frac{1}{1 - \Theta_k} \le \frac{1}{1 - M_1 q_{k+1}}, \qquad (23.2.58)
Then, using the induction hypotheses, the (C*) conditions and (23.2.59), we get in turn that

which completes the induction for (23.2.55). Furthermore, let v \in U(x_{k+2}, t^* - t_{k+2}). Then, we have that

\|v - x_{k+1}\| \le \|v - x_{k+2}\| + \|x_{k+2} - x_{k+1}\| \le t^* - t_{k+2} + t_{k+2} - t_{k+1} = t^* - t_{k+1},

which implies v \in U(x_{k+1}, t^* - t_{k+1}). The induction for (23.2.55) and (23.2.56) is complete. Lemma 23.2.1 implies that \{t_k\} is a complete sequence. It follows from (23.2.55) and (23.2.56) that \{x_k\} is a complete sequence in the Banach space X and as such it converges
to some x^* \in U(x_0, t^*) (since U(x_0, t^*) is a closed set). By letting k \to \infty in (23.2.60), we get that F(x^*) = 0. Moreover, estimate (23.2.52) follows from (23.2.51) by using standard majorization techniques [8, 10, 22]. To show the uniqueness part, let y^* \in U(x_0, r) be such that F(y^*) = 0, where r satisfies (23.2.53) and (23.2.54). We have that
It follows from (23.2.61) and the Banach lemma on invertible operators that the linear operator [y^*, x^*; F]^{-1} exists. Then, using the identity 0 = F(y^*) - F(x^*) = [y^*, x^*; F](y^* - x^*), we deduce that x^* = y^*. The proof of Theorem 23.2.6 is complete.
In order to present the semilocal result for the secant-like method under the (C**) conditions, we first need a result on a majorizing sequence. The proof is given in Lemma 23.2.1.

Remark 23.2.7. Clearly, (23.2.22) (or (23.2.47)) and \{r_n\} can replace (23.2.5) (or (23.2.42)) and \{t_n\}, respectively, in Theorem 23.2.6.
Lemma 23.2.8. Let c \ge 0, \eta > 0, L > 0, M_0 > 0 with M_0 c < 1 and \lambda \in [0, 1]. Set

s_{-1} = 0, \quad s_0 = c, \quad s_1 = c + \eta, \quad \tilde{K} = \frac{L}{1 - M_0 c} \quad \text{and} \quad \tilde{M}_1 = \frac{M_0}{1 - M_0 c},

and polynomial \tilde{p} by

0 \le s_{n+1} - s_n \le \tilde{\alpha}^n \eta \quad \text{and} \quad s^* - s_n \le \frac{\tilde{\alpha}^n \eta}{1 - \tilde{\alpha}}.
Next, we present the semilocal convergence result for the secant-like method under the
(C**) conditions.

Theorem 23.2.9. Suppose that the (C**) conditions, (23.2.62) (or the conditions of Lemma 23.2.2
with α̃_n, α̃, M̃_1 replacing α_n, α, M_1, respectively) and U(x_0, s*) ⊆ D hold. Then, the sequence
{x_n} generated by the secant-like method is well defined, remains in U(x_0, s*) for each
n = −1, 0, 1, ... and converges to a solution x* ∈ U(x_0, s*) of the equation F(x) = 0. Moreover,
the error estimates of Theorem 23.2.6 hold for each n = 0, 1, ... with {s_n} replacing {t_n}.
Furthermore, if there exists r ≥ s* such that U(x_0, r) ⊆ D and r + s* + c < 1/M_0, then the
solution x* is unique in U(x_0, r).
412 Ioannis K. Argyros and Á. Alberto Magreñán
Proof. The proof is analogous to that of Theorem 23.2.6. Simply notice that, in view of (C_5), we
obtain instead of (23.2.57) that

‖A_0^{−1}(B_{k+1} − A_0)‖ ≤ M_0 (‖y_{k+1} − x_{−1}‖ + ‖x_{k+1} − x_0‖)
  ≤ M_0 ((1 − λ)‖x_k − x_0‖ + λ‖x_{k+1} − x_0‖ + ‖x_0 − x_{−1}‖ + ‖x_{k+1} − x_0‖)
  ≤ M_0 ((1 − λ)(s_k − s_0) + (1 + λ)(s_{k+1} − s_0) + c) < 1,

so that B_{k+1}^{−1} exists and

‖B_{k+1}^{−1} A_0‖ ≤ 1/(1 − Ξ_k),

where Ξ_k = M_0 ((1 − λ)(s_k − s_0) + (1 + λ)(s_{k+1} − s_0) + c). Moreover, using (C_3*) instead of
(C_3**), we get that

‖A_0^{−1} F(x_{k+1})‖ ≤ L (s_{k+1} − s_k + (1 − λ)(s_k − s_{k−1})) (s_{k+1} − s_k).
For comparison, the corresponding conditions used in earlier studies such as [18] require

U(x_0, t**) ⊆ D,   (23.2.63)

and employ the majorizing sequence {u_n} given by

u_{−1} = 0,  u_0 = c,  u_1 = c + η,

u_{n+2} = u_{n+1} + [M (u_{n+1} − u_n + (1 − λ)(u_n − u_{n−1})) / (1 − M q*_n)] (u_{n+1} − u_n),   (23.2.64)

where

q*_n = (1 − λ)(u_n − u_0) + (1 + λ)(u_{n+1} − u_0).

Then, if K < M or M_1 < M, a simple inductive argument shows that the corresponding comparison
estimates hold for each n = 2, 3, ....
Clearly, {t_n} converges under the (C) conditions and the conditions of Lemma 23.2.1. Moreover,
as we already showed in Remark 23.2.3, the sufficient convergence criteria of
Theorem 23.2.6 can be weaker than those of Theorem 23.2.9. Similarly, if L ≤ M, {s_n}
is a tighter sequence than {u_n}. In general, we shall test the convergence criteria and
use the tightest sequence to estimate the error bounds.
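The majorizing sequence {u_n} defined by (23.2.64) is easy to tabulate numerically. The sketch below does exactly that; the values of M, λ, c and η are illustrative (these constants are problem-dependent and are not the chapter's), so the numbers only demonstrate the typical monotone, convergent behaviour.

```python
# Sketch of the majorizing sequence (23.2.64); M, lam, c, eta are
# illustrative values, not the chapter's constants.
def majorizing_sequence(M, lam, c, eta, n_terms=20):
    u = [0.0, c, c + eta]  # u_{-1}, u_0, u_1 (list index k holds u_{k-1})
    for n in range(n_terms):
        # q*_n = (1 - lam)(u_n - u_0) + (1 + lam)(u_{n+1} - u_0)
        q = (1 - lam) * (u[n + 1] - u[1]) + (1 + lam) * (u[n + 2] - u[1])
        num = M * (u[n + 2] - u[n + 1] + (1 - lam) * (u[n + 1] - u[n]))
        u.append(u[n + 2] + num / (1 - M * q) * (u[n + 2] - u[n + 1]))
    return u[1:]  # u_0, u_1, u_2, ...

u = majorizing_sequence(M=1.0, lam=0.5, c=0.1, eta=0.2)
assert all(a <= b for a, b in zip(u, u[1:]))  # nondecreasing
assert u[-1] - u[-2] < 1e-10                  # increments die out rapidly
```

The superlinear decay of the increments u_{n+2} − u_{n+1} is what makes {u_n} (and likewise {s_n}, {t_n}) usable as majorizing sequences for the iterates.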
(c) Clearly, the conclusions of Theorem 23.2.9 hold if {s_n} and (23.2.62) are replaced by
{r̃_n} and (23.2.22), where {r̃_n} is defined as {r_n} with M_0 replacing M_1 in the definition
of β_1 (only in the numerator) and the tilde letters replacing the non-tilde letters in
(23.2.22).
· For fixed λ ∈ [0, 1], the operator B_0 = [y_0, x_0; F] is invertible and ‖B_0^{−1}‖ ≤ β,
· ‖B_0^{−1} F(x_0)‖ ≤ η,
23.3.1. Example 1
We illustrate the above-mentioned results with an application involving a system of nonlinear
equations. We will see that Theorem 23.3.1 cannot guarantee the semilocal convergence
of secant-like method (23.1.2), but Theorem 23.2.6 can.
It is well known that energy is dissipated in the action of any real dynamical system,
usually through some form of friction. However, in certain situations this dissipation is
so slow that it can be neglected over relatively short periods of time. In such cases we
assume the law of conservation of energy, namely, that the sum of the kinetic energy and
the potential energy is constant. A system of this kind is said to be conservative.
If ϕ and ψ are arbitrary functions with the property that ϕ(0) = 0 and ψ(0) = 0, the
general equation

μ d²x(t)/dt² + ψ(dx(t)/dt) + ϕ(x(t)) = 0,   (23.3.2)
can be interpreted as the equation of motion of a mass µ under the action of a restoring force
−ϕ(x) and a damping force −ψ(dx/dt). In general these forces are nonlinear, and equation
(23.3.2) can be regarded as the basic equation of nonlinear mechanics. In this chapter we
shall consider the special case of a nonlinear conservative system described by the equation
μ d²x(t)/dt² + ϕ(x(t)) = 0,
in which the damping force is zero and there is consequently no dissipation of energy.
Extensive discussions of (23.3.2), with applications to a variety of physical problems, can
be found in classical references [4] and [31].
Now, we consider the special case of a nonlinear conservative system described by the
equation
d²x(t)/dt² + φ(x(t)) = 0   (23.3.3)
with the boundary conditions
x(0) = x(1) = 0. (23.3.4)
After that, we use a process of discretization to transform problem (23.3.3)–(23.3.4) into a
finite-dimensional problem and look for an approximate solution of it when a particular
function φ is considered. So, we transform problem (23.3.3)–(23.3.4) into a system of nonlinear
equations by approximating the second derivative by a standard numerical formula.
Firstly, we introduce the points t_j = jh, j = 0, 1, ..., m + 1, where h = 1/(m + 1) and m is
an appropriate integer. A scheme is then designed for the determination of numbers x_j which,
it is hoped, approximate the values x(t_j) of the true solution at the points t_j. A standard
approximation for the second derivative at these points is

x″_j ≈ (x_{j−1} − 2x_j + x_{j+1}) / h²,   j = 1, 2, ..., m.
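As a quick numerical sanity check of this second-derivative approximation (with the illustrative choice x(t) = sin t, which is not the chapter's example):

```python
import math

# Central-difference check: x''(t) ≈ (x(t-h) - 2 x(t) + x(t+h)) / h^2.
# x(t) = sin(t) is an illustrative choice; (sin t)'' = -sin t.
h, t = 1e-3, 0.3
approx = (math.sin(t - h) - 2 * math.sin(t) + math.sin(t + h)) / h**2
exact = -math.sin(t)
assert abs(approx - exact) < 1e-6  # O(h^2) accuracy
```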
A natural way to obtain such a scheme is to demand that the x j satisfy at each interior mesh
point t j the difference equation
x j−1 − 2x j + x j+1 + h2 φ(x j ) = 0. (23.3.5)
Since x0 and xm+1 are determined by the boundary conditions, the unknowns are
x1 , x2 , . . ., xm .
A further discussion is simplified by the use of matrix and vector notation. Introducing
the vectors

x = (x_1, x_2, ..., x_m)^t,   v_x = (φ(x_1), φ(x_2), ..., φ(x_m))^t,

and the m × m tridiagonal matrix

A =
  ⎡ −2   1   0  ···   0 ⎤
  ⎢  1  −2   1  ···   0 ⎥
  ⎢  0   1  −2  ···   0 ⎥
  ⎢  ⋮    ⋮    ⋮   ⋱    ⋮ ⎥
  ⎣  0   0   0  ···  −2 ⎦ ,
the system of equations, arising from demanding that (23.3.5) holds for j = 1, 2, ..., m, can
be written compactly in the form

F(x) ≡ Ax + h² v_x = 0,   (23.3.6)

where ℓ = (ℓ_1, ℓ_2, ..., ℓ_8)^t ∈ Ω̃ and h = 1/9, so that

‖F′(x) − F′(y)‖ ≤ (7/4) h² ‖x − y‖.   (23.3.7)
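Assembling the discrete operator (23.3.6) is straightforward. The sketch below evaluates F(x) = Ax + h²v_x row by row for a generic φ; since the chapter's particular φ is not reproduced here, the check uses the illustrative choice φ ≡ 2, for which x(t) = t(1 − t) satisfies the scheme (23.3.5) exactly under the zero boundary values of (23.3.4).

```python
# Sketch: evaluate F(x) = A x + h^2 v_x from (23.3.6), with the boundary
# values x_0 = x_{m+1} = 0 of (23.3.4) folded into the first/last rows.
def make_F(phi, m):
    h = 1.0 / (m + 1)
    def F(x):  # x = [x_1, ..., x_m]
        out = []
        for j in range(m):
            left = x[j - 1] if j > 0 else 0.0
            right = x[j + 1] if j < m - 1 else 0.0
            out.append(left - 2.0 * x[j] + right + h * h * phi(x[j]))
        return out
    return F

# Illustrative check: for phi ≡ 2, x(t) = t(1 - t) satisfies (23.3.5) exactly
# (the central difference is exact on quadratics).
m = 8
h = 1.0 / (m + 1)
x = [j * h * (1 - j * h) for j in range(1, m + 1)]
assert all(abs(r) < 1e-12 for r in make_F(lambda s: 2.0, m)(x))
```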
Considering (see [27]) the divided difference

[x, y; F] = ∫₀¹ F′(τx + (1 − τ)y) dτ,

we obtain the results shown in Table 23.3.2.

Table 23.3.2. Absolute errors obtained by secant-like method (23.1.2) with λ = 1/2, and {‖F(x_n)‖}

n     ‖x_n − x*‖               ‖F(x_n)‖
−1    1.3893...  × 10^{−1}     8.6355... × 10^{−2}
 0    4.5189...  × 10^{−2}     1.2345... × 10^{−2}
 1    1.43051... × 10^{−4}     2.3416... × 10^{−5}
 2    1.14121... × 10^{−7}     1.9681... × 10^{−8}
 3    4.30239... × 10^{−13}    5.7941... × 10^{−14}
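A compact sketch of secant-like method (23.1.2) on a discretized system of this kind follows. With y_n = λx_n + (1 − λ)x_{n−1}, the divided difference reduces here to [y_n, x_n; F] = A + h² diag((φ(x_j) − φ(y_j))/(x_j − y_j)), because v_x acts componentwise. The nonlinearity φ and the starting points below are illustrative, not the chapter's test problem, so the printed errors of Table 23.3.2 are not reproduced.

```python
import numpy as np

def secant_like(phi, m, x_prev, x_cur, lam=0.5, iters=8):
    # Secant-like method (23.1.2): y_n = lam*x_n + (1-lam)*x_{n-1},
    # x_{n+1} = x_n - [y_n, x_n; F]^{-1} F(x_n), for F(x) = A x + h^2 v_x.
    h = 1.0 / (m + 1)
    A = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1))
    F = lambda x: A @ x + h * h * phi(x)
    for _ in range(iters):
        y = lam * x_cur + (1 - lam) * x_prev
        d = x_cur - y
        safe = np.abs(d) > 1e-14
        # componentwise divided difference of phi; 0 is a crude fallback
        # for this sketch where the two points coincide
        slope = np.where(safe, (phi(x_cur) - phi(y)) / np.where(safe, d, 1.0), 0.0)
        B = A + h * h * np.diag(slope)
        x_prev, x_cur = x_cur, x_cur - np.linalg.solve(B, F(x_cur))
    return x_cur, np.linalg.norm(F(x_cur))

sol, res = secant_like(lambda u: u**2 + 1.0, m=10,
                       x_prev=np.zeros(10), x_cur=np.full(10, 0.1))
assert res < 1e-8  # rapid convergence on this mildly nonlinear system
```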
23.3.2. Example 2
Consider the following nonlinear boundary value problem

u″ = −u³ − (1/4) u²,   u(0) = 0,  u(1) = 1.

It is well known that this problem can be formulated as the integral equation

u(s) = s + ∫₀¹ Q(s, t) (u³(t) + (1/4) u²(t)) dt,   (23.3.8)

where Q is the Green function

Q(s, t) = t(1 − s) if t ≤ s,   Q(s, t) = s(1 − t) if s < t.

We observe that

max_{0≤s≤1} ∫₀¹ |Q(s, t)| dt = 1/8.
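The bound 1/8 is easy to confirm numerically; the midpoint-rule check below is only a verification of the stated constant (Q ≥ 0 on the unit square, so |Q| = Q).

```python
# Midpoint-rule check that max_{0<=s<=1} ∫_0^1 |Q(s,t)| dt = 1/8.
n = 500
mid = [(k + 0.5) / n for k in range(n)]

def q_integral(s):
    # Q(s,t) = t(1-s) for t <= s and s(1-t) for s < t
    return sum(t * (1 - s) if t <= s else s * (1 - t) for t in mid) / n

assert abs(max(q_integral(s) for s in mid) - 1 / 8) < 1e-3
```

Analytically, ∫₀¹ Q(s, t) dt = s(1 − s)/2, which is maximized at s = 1/2 with value 1/8, in agreement with the check.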
Then problem (23.3.8) is in the form (23.1.1), where F is defined as

[F(x)](s) = x(s) − s − ∫₀¹ Q(s, t) (x³(t) + (1/4) x²(t)) dt.

The Fréchet derivative of the operator F is given by

[F′(x)y](s) = y(s) − 3 ∫₀¹ Q(s, t) x²(t) y(t) dt − (1/2) ∫₀¹ Q(s, t) x(t) y(t) dt.
Choosing x_0(s) = s and R = 1, we have that ‖F(x_0)‖ ≤ (1 + 1/4)/8 = 5/32. Define the divided
difference by

[x, y; F] = ∫₀¹ F′(τx + (1 − τ)y) dτ.
Taking into account that

‖[x, y; F] − [v, y; F]‖ ≤ ∫₀¹ ‖F′(τx + (1 − τ)y) − F′(τv + (1 − τ)y)‖ dτ
  ≤ (1/8) ∫₀¹ ( 3τ² ‖x² − v²‖ + 2τ(1 − τ) ‖y‖ ‖x − v‖ + (τ/2) ‖x − v‖ ) dτ
  ≤ (1/8) ( ‖x² − v²‖ + (‖y‖ + 1/4) ‖x − v‖ )
  ≤ (1/8) ( ‖x + v‖ + ‖y‖ + 1/4 ) ‖x − v‖
  ≤ (25/32) ‖x − v‖.
Choosing x_{−1}(s) = 9s/10, we find that

‖I − A_0‖ ≤ ∫₀¹ ‖I − F′(τx_0 + (1 − τ)x_{−1})‖ dτ
  ≤ (1/8) ∫₀¹ ( 3 (τ + (9/10)(1 − τ))² + (1/2)(τ + (9/10)(1 − τ)) ) dτ
  ≤ 0.409375...,

and so we can take

L = (25/32) ‖A_0^{−1}‖ = 1.32275....
In an analogous way, choosing λ = 0.8, we obtain

M_0 = 0.899471...,   ‖B_0^{−1}‖ = 1.75262...   and   η = 0.273847....
Notice that we cannot guarantee the convergence of the secant-like method by Theorem 23.3.1,
since the first condition of (23.3.1) is not satisfied:

a = η/(c + η) = 0.732511... > (3 − √5)/2 = 0.381966....
On the other hand, we obtain

K̃ = 1.45349...,   α_0 = 0.434072...,   α = 0.907324...   and   1 − 2M̃_1 η = 0.945868....

Hence, condition (23.2.62), 0 < α_0 ≤ α ≤ 1 − 2M̃_1 η, is satisfied and, as a consequence, we can
ensure the convergence of the secant-like method by Theorem 23.2.9.
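The arithmetic behind these two criteria can be re-checked directly. Here c = ‖x_0 − x_{−1}‖ = max_{0≤s≤1} |s − 9s/10| = 1/10, and the remaining constants are the values quoted above.

```python
import math

c, eta = 0.1, 0.273847            # c = ||x0 - x_{-1}|| = 1/10, eta as above
a = eta / (c + eta)
assert abs(a - 0.732511) < 1e-4
assert a > (3 - math.sqrt(5)) / 2  # Theorem 23.3.1's first condition fails

alpha0, alpha, bound = 0.434072, 0.907324, 0.945868  # bound = 1 - 2*Mtilde1*eta
assert 0 < alpha0 <= alpha <= bound                  # condition (23.2.62) holds
```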
Conclusion
We presented a new semilocal convergence analysis of the secant-like method for approximating
a locally unique solution of an equation in a Banach space. Using a combination
of Lipschitz and center-Lipschitz conditions, instead of the Lipschitz conditions alone used
in [18], we provided a finer analysis with a larger convergence domain and weaker sufficient
convergence conditions than in [15, 18, 19, 21, 26, 27]. Numerical examples validate our
theoretical results.
References
[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004) 397-405.
[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, Int. J. Comp. Math., 81(8) (2004), 1153-1161.
[3] Amat, S., Bermúdez, C., Busquier, S., Gretay, J. O., Convergence by nondiscrete
mathematical induction of a two step secant’s method, Rocky Mountain J. Math., 37(2)
(2007), 359-369.
[4] Andronow, A.A., Chaikin, C.E., Theory of Oscillations, Princeton University Press,
New Jersey, 1949.
[5] Argyros, I.K., Polynomial Operator Equations in Abstract Spaces and Applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, 1998, Boca Raton, Florida, U.S.A.
[6] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374-397.
[7] Argyros, I.K., New sufficient convergence conditions for the secant method, Che-
choslovak Math. J., 55 (2005), 175-187.
[9] Argyros, I.K., A semilocal convergence analysis for directional Newton methods,
Math. Comput., 80 (2011), 327-343.
[10] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and its Appli-
cations, CRC Press/Taylor and Francis, Boca Raton, Florida, USA, 2012
[11] Argyros, I.K., Hilout, S., Convergence conditions for secant–type methods, Che-
choslovak Math. J., 60 (2010), 253–272.
[12] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.
[13] Argyros, I.K., Hilout, S., Estimating upper bounds on the limit points of majorizing
sequences for Newton’s method, Numer. Algorithms, 62(1) (2013), 115-132.
[14] Argyros, I.K., Hilout, S., Numerical methods in nonlinear analysis, World Scientific
Publ. Comp., New Jersey, 2013.
[15] Argyros, I.K., Ezquerro, J.A., Hernández, M.Á. Hilout, S., Romero, N., Velasco, A.I.,
Expanding the applicability of secant-like methods for solving nonlinear equations,
Carp. J. Math. 31 (1) (2015), 11-30.
[16] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Non-
linear Functional Analysis and Applications (L.B. Rall, ed.), Academic Press, New
York, (1971), 425–472.
[17] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. R. Soc. Mat. Esp.,
13 (2010), 53–76.
[18] Ezquerro, J.A., Hernández, M.A., Romero, N., Velasco, A.I., Improving the domain of
starting point for secant-like methods, App. Math. Comp., 219 (8) (2012), 3677–3692.
[19] Ezquerro, J.A., Rubio, M.J., A uniparametric family of iterative processes for solving
nondifferentiable equations, J. Math. Anal. Appl., 275 (2002), 821–834.
[20] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant-like methods for solving nonlinear
integral equations of the Hammerstein type. Proceedings of the 8th International
Congress on Computational and Applied Mathematics, ICCAM-98 (Leuven), J. Comput.
Appl. Math., 115 (2000), 245–254.
[21] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by secant-like methods, App. Math. Comp., 169 (2005), 926–942.
[22] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[23] Laasonen, P., Ein überquadratisch konvergenter iterativer Algorithmus, Ann. Acad. Sci.
Fenn. Ser. I, 450 (1969), 1–10.
[24] Magreñán, Á.A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215–224.
[25] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.
[26] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.
[27] Potra, F.A., Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, New
York, 1984.
[28] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.
[29] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton–Kantorovich type theorems, J. Complexity, 26 (2010), 3–42.
[30] Schmidt, J.W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Hungar.,
9 (1978), 241–247.
[32] Traub, J.F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood
Cliffs, New Jersey, 1964.
[33] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545–557.
[34] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153–174.
Author Contact Information
Ioannis K. Argyros
Professor
Department of Mathematical Sciences
Cameron University
Lawton, OK, US
Tel: (580) 581-2908
Email: [email protected]