
MATHEMATICS RESEARCH DEVELOPMENTS

ITERATIVE ALGORITHMS I

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.
MATHEMATICS RESEARCH DEVELOPMENTS

Additional books in this series can be found on Nova’s website under the Series tab.

Additional e-books in this series can be found on Nova’s website under the e-book tab.
MATHEMATICS RESEARCH DEVELOPMENTS

ITERATIVE ALGORITHMS I

IOANNIS K. ARGYROS
AND
Á. ALBERTO MAGREÑÁN

New York
Copyright © 2017 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in
any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or
otherwise without the written permission of the Publisher.

We have partnered with Copyright Clearance Center to make it easy for you to obtain permissions to reuse
content from this publication. Simply navigate to this publication’s page on Nova’s website and locate the
“Get Permission” button below the title description. This button is linked directly to the title’s permission
page on copyright.com. Alternatively, you can visit copyright.com and search by title, ISBN, or ISSN.

For further questions about using the service on copyright.com, please contact:
Copyright Clearance Center
Phone: +1-(978) 750-8400 Fax: +1-(978) 750-4470 E-mail: [email protected].

NOTICE TO THE READER


The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied
warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for
incidental or consequential damages in connection with or arising out of information contained in this book.
The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or
in part, from the readers’ use of, or reliance upon, this material. Any parts of this book based on government
reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of
such works.

Independent verification should be sought for any data, advice or recommendations contained in this book. In
addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property
arising from any methods, products, instructions, ideas or otherwise contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the subject
matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering
legal or any other professional services. If legal or any other expert assistance is required, the services of a
competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED
BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF
PUBLISHERS.

Additional color graphics may be available in the e-book version of this book.

Library of Congress Cataloging-in-Publication Data


Names: Argyros, Ioannis K., editor. | Magreñán, Á. Alberto (Angel Alberto), editor.
Title: Iterative algorithms / editors, Ioannis K. Argyros and Á. Alberto Magreñán
(Cameron University, Department of Mathematical Sciences,
Lawton, OK, USA, and others).
Description: Hauppauge, New York: Nova Science Publishers, Inc., [2016]- |
Series: Mathematics research developments | Includes index.
Identifiers: LCCN 2016021559 (print) | LCCN 2016025038 (ebook) | ISBN 9781634854061 (hardcover: v. 1) |
ISBN 9781634854221
Subjects: LCSH: Iterative methods (Mathematics) | Algorithms. | Numerical analysis.
Classification: LCC QA297.8 .I834 2016 (print) | LCC QA297.8 (ebook) | DDC
518/.26--dc23
LC record available at https://lccn.loc.gov/2016021559

Published by Nova Science Publishers, Inc. † New York


Contents
Preface xiii

1 Secant-Type Methods 1
1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2. Majorizing Sequences for the Secant-Type Method . . . . . . . . . . . . . 2
1.3. Semilocal Convergence of the Secant-Type Method . . . . . . . . . . . . . 9
1.4. Local Convergence of the Secant-Type Method . . . . . . . . . . . . . . . 14
1.5. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2 Efficient Steffensen-Type Algorithms for Solving Nonlinear Equations 21


2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2. Local Convergence of (STTM) . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3 On the Semilocal Convergence of Halley’s Method under a Center-Lipschitz Condition on the Second Fréchet Derivative 35
3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2. Motivational example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3. Mistakes in the Proof of Theorem 3.1.1 . . . . . . . . . . . . . . . . . . . 38
3.4. New Semilocal Convergence Theorem . . . . . . . . . . . . . . . . . . . . 39
3.5. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4 An Improved Convergence Analysis of Newton’s Method for Twice Fréchet Differentiable Operators 47
4.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2. Majorizing Sequences I . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3. Majorizing Sequences II . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.4. Semilocal Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.5. Local Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.6. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

5 Expanding the Applicability of Newton’s Method Using Smale’s α-Theory 73


5.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.2. Semilocal Convergence of Newton’s Method . . . . . . . . . . . . . . . . 75
5.3. Local Convergence Analysis of Newton’s Method . . . . . . . . . . . . . . 87
5.4. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

6 Newton-Type Methods on Riemannian Manifolds under Kantorovich-Type Conditions 99
6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2. Basic Definitions and Preliminary Results . . . . . . . . . . . . . . . . . . 103
6.3. Simplified Newton’s Method on Riemannian Manifolds (m = ∞) . . . . . . 110
6.4. Order of Convergence of Newton-Type Methods . . . . . . . . . . . . . . . 122
6.5. One Family of High Order Newton-Type Methods . . . . . . . . . . . . . . 125
6.6. Expanding the Applicability of Newton Methods . . . . . . . . . . . . . . 126

7 Improved Local Convergence Analysis of Inexact Gauss-Newton Like Methods 137
7.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.3. Local Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.4. Special Case and Numerical Examples . . . . . . . . . . . . . . . . . . . . 144

8 Expanding the Applicability of Lavrentiev Regularization Methods for Ill-Posed Problems 151
8.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.2. Basic Assumptions and Some Preliminary Results . . . . . . . . . . . . . . 152
8.3. Stopping Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
8.4. Error Bound for the Case of Noise-Free Data . . . . . . . . . . . . . . . . 157
8.5. Error Analysis with Noisy Data . . . . . . . . . . . . . . . . . . . . . . . . 158
8.6. Order Optimal Result with an a Posteriori Stopping Rule . . . . . . . . . . 160
8.7. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

9 A Semilocal Convergence for a Uniparametric Family of Efficient Secant-Like Methods 167
9.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
9.2. Semilocal Convergence Using Recurrent Relations . . . . . . . . . . . . . 169
9.3. Semilocal Convergence Using Recurrent Functions . . . . . . . . . . . . . 171
9.4. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

10 On the Semilocal Convergence of a Two-Step Newton-Like Projection Method for Ill-Posed Equations 187
10.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
10.1.1. Projection Method . . . . . . . . . . . . . . . . . . . . . . . . . . 188
10.2. Semilocal Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
10.3. Error Bounds under Source Conditions . . . . . . . . . . . . . . . . . . . . 197
10.3.1. A Priori Choice of the Parameter . . . . . . . . . . . . . . . . . . . 199
10.3.2. An Adaptive Choice of the Parameter . . . . . . . . . . . . . . . . 199
10.4. Implementation of Adaptive Choice Rule . . . . . . . . . . . . . . . . . . 200
10.4.1. Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
10.5. Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

11 New Approach to Relaxed Proximal Point Algorithms Based on A−Maximal 207
11.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
11.2. A−Maximal Monotonicity and Auxiliary Results . . . . . . . . . . . . . . 208
11.3. The Generalized Relaxed Proximal Point Algorithm . . . . . . . . . . . . . 210
11.4. An Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

12 Newton-Type Iterative Methods for Nonlinear Ill-Posed Hammerstein-Type Equations 221
12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
12.2. Preparatory Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
12.2.1. A Priori Choice of the Parameter . . . . . . . . . . . . . . . . . . . 224
12.2.2. An Adaptive Choice of the Parameter . . . . . . . . . . . . . . . . 224
12.3. Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
12.3.1. Iterative Method for Case (1) . . . . . . . . . . . . . . . . . . . . . 225
12.3.2. Iterative Method for Case (2) . . . . . . . . . . . . . . . . . . . . . 230
12.4. Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
12.5. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
12.6. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

13 Enlarging the Convergence Domain of Secant-Like Methods for Equations 245


13.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
13.2. Semilocal Convergence of Secant-Like Method . . . . . . . . . . . . . . . 247
13.3. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
13.3.1. Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
13.3.2. Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

14 Solving Nonlinear Equations System via an Efficient Genetic Algorithm with Symmetric and Harmonious Individuals 269
14.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
14.2. Convert (14.1.1) to an Optimal Problem . . . . . . . . . . . . . . . . . . . 270
14.3. New Genetic Algorithm: SHEGA . . . . . . . . . . . . . . . . . . . . . . 271
14.3.1. Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
14.3.2. Fitness Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
14.3.3. Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
14.3.4. Symmetric and Harmonious Individuals . . . . . . . . . . . . . . . 272
14.3.5. Crossover and Mutation . . . . . . . . . . . . . . . . . . . . . . . 272
14.3.6. Elitist Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
14.3.7. Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
14.4. Mixed Algorithm: SHEGA-Newton Method . . . . . . . . . . . . . . . . . 273
14.5. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
14.6. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277

15 On the Semilocal Convergence of Modified Newton-Tikhonov Regularization Method for Nonlinear Ill-Posed Problems 281
15.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
15.1.1. The New Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
15.2. Convergence Analysis of (15.1.12) . . . . . . . . . . . . . . . . . . . . . 284
15.3. Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
15.3.1. Error Bounds under Source Conditions . . . . . . . . . . . . . . . 289
15.3.2. A Priori Choice of the Parameter . . . . . . . . . . . . . . . . . . . 290
15.3.3. Adaptive Choice of the Parameter . . . . . . . . . . . . . . . . . . 290
15.4. Implementation of the Method . . . . . . . . . . . . . . . . . . . . . . . . 291
15.4.1. Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291

16 Local Convergence Analysis of Proximal Gauss-Newton Method for Penalized Nonlinear Least Squares Problems 295
16.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
16.2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
16.3. Local Convergence Analysis of the Proximal Gauss-Newton Method . . . . 298
16.4. Special Cases and Numerical Examples . . . . . . . . . . . . . . . . . . . 304

17 On the Convergence of a Damped Newton Method with Modified Right-Hand Side Vector 309
17.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
17.2. Semilocal Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
17.3. Local Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
17.4. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

18 Local Convergence of Inexact Newton-Like Method under Weak Lipschitz Conditions 323
18.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
18.2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
18.3. Local Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
18.4. Special Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
18.5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
18.6. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

19 Expanding the Applicability of Secant Method with Applications 337


19.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
19.2. Semilocal Convergence Analysis of the Secant Method . . . . . . . . . . . 339
19.3. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347

20 Expanding the Convergence Domain for Chun-Stanica-Neta Family of Third Order Methods in Banach Spaces 353
20.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
20.2. Semilocal Convergence I . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
20.3. Semilocal Convergence II . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
20.4. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

21 Local Convergence of Modified Halley-Like Methods with Less Computation of Inversion 373
21.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
21.2. Local Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 375
21.3. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
21.4. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381

22 Local Convergence for an Improved Jarratt-Type Method in Banach Space 387


22.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
22.2. Local Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
22.3. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395

23 Enlarging the Convergence Domain of Secant-Like Methods for Equations 401


23.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
23.2. Semilocal Convergence of Secant-Like Method . . . . . . . . . . . . . . . 403
23.3. Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
23.3.1. Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
23.3.2. Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417

Author Contact Information 425

Index 427
Dedicated to
My mother Anastasia

Dedicated to
My parents Alberto and Mercedes
My grandmother Ascensión
My beloved Lara
Preface

It is a well-known fact that iterative methods have been studied ever since problems appeared for which a solution cannot be found in closed form. There exist methods with different behaviors when they are applied to different functions: methods with higher order of convergence, methods with large convergence domains, methods which do not require the evaluation of any derivative, etc., and researchers frequently develop new iterative methods.
Once these iterative methods appeared, several researchers studied them from different points of view: convergence conditions, real dynamics, complex dynamics, optimal order of convergence, etc. This phenomenon motivated the authors to study the most widely used classical ones, for example Newton’s method or its derivative-free alternative, the secant method.
Related to the convergence of iterative methods, the best-known conditions are the Kantorovich ones; Kantorovich developed a theory which has allowed many researchers to continue and experiment with these conditions. In recent years many authors have studied modifications of these conditions related, for example, to centered conditions, ω-conditions or even convergence in Hilbert spaces.
In this monograph, we present the complete recent work of the past decade of the authors on the convergence and dynamics of iterative methods. It is the natural outgrowth of their related publications in these areas. Chapters are self-contained and can be read independently. Moreover, an extensive list of references is given in each chapter, in order to allow the reader to consult the underlying ideas. For these reasons, we think that several advanced courses can be taught using this book.
The list of topics presented in our related studies follows.

Secant-type methods;
Efficient Steffensen-type algorithms for solving nonlinear equations;
On the semilocal convergence of Halley’s method under a center-Lipschitz condition on
the second Fréchet derivative;
An improved convergence analysis of Newton’s method for twice Fréchet differentiable
operators;
Expanding the applicability of Newton’s method using Smale’s α-theory;
Newton-type methods on Riemannian Manifolds under Kantorovich-type conditions;
Improved local convergence analysis of inexact Gauss-Newton like methods;
Expanding the Applicability of Lavrentiev Regularization Methods for Ill-posed Problems;
A semilocal convergence for a uniparametric family of efficient secant-like methods;
On the semilocal convergence of a two-step Newton-like projection method for ill-posed equations;
New Approach to Relaxed Proximal Point Algorithms Based on A−maximal;
Newton-type Iterative Methods for Nonlinear Ill-posed Hammerstein-type Equations;
Enlarging the convergence domain of secant-like methods for equations;
Solving nonlinear equations system via an efficient genetic algorithm with symmetric and
harmonious individuals;
On the Semilocal Convergence of Modified Newton-Tikhonov Regularization Method for
Nonlinear Ill-posed Problems;
Local convergence analysis of proximal Gauss-Newton method for penalized nonlinear
least squares problems;
On the convergence of a Damped Newton method with modified right-hand side vector;
Local convergence of inexact Newton-like method Under weak Lipschitz conditions;
Expanding the applicability of Secant method with applications;
Expanding the convergence domain for Chun-Stanica-Neta family of third order methods
in Banach spaces;
Local convergence of modified Halley-like methods with less computation of inversion;
Local convergence for an improved Jarratt-type method in Banach space;
Enlarging the convergence domain of secant-like methods for equations.

The book’s results are expected to find applications in many areas of applied mathematics, engineering and computer science, as well as in real-world problems. As such, this monograph is suitable for researchers, graduate students and seminars in the above subjects, and for all science and engineering libraries.
The preparation of this book took place during 2015-2016 in Lawton, Oklahoma, USA
and Logroño, La Rioja, Spain.

Ioannis K. Argyros
Á. Alberto Magreñán
April 2016
Chapter 1

Secant-Type Methods

1.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solution x∗ of the nonlinear equation
F(x) = 0, (1.1.1)
where, F is a Fréchet-differentiable operator defined on a nonempty subset D of a Banach
space X with values in a Banach space Y . A lot of problems from Applied Sciences can
be expressed in a form like (1.1.1) using mathematical modelling [3]. The solutions of
these equations can be found in closed form only in special cases. That is why most solution methods for these equations are iterative. The convergence analysis of iterative
methods is usually divided into two categories: semilocal and local convergence analysis.
In the semilocal convergence analysis one derives convergence criteria from the information
around an initial point whereas in the local analysis one finds estimates of the radii of
convergence balls from the information around a solution. If X = Y and Q(x) = F(x) + x,
then the solution x∗ of equation (1.1.1) is very important in fixed point theory.
We study the convergence of the secant-type method
\[
x_{n+1} = x_n - A_n^{-1} F(x_n), \qquad A_n = \delta F(x_n, y_n) \quad \text{for each } n = 0, 1, 2, \dots, \tag{1.1.2}
\]
where x−1, x0 are initial points and yn = θn xn + (1 − θn)xn−1 with θn ∈ R. Here An ∈ L(X, Y) and δF(x, y), x, y ∈ D, is a consistent approximation of the Fréchet-derivative of F (see page 182 of [15] or the second estimate in condition (D4) of Definition 3.1). L(X, Y) stands for the space of
bounded linear operators from X to Y . Many iterative methods are special cases of (1.1.2).
Indeed, if θn = 1, then we obtain Newton’s method
\[
x_{n+1} = x_n - F'(x_n)^{-1} F(x_n) \quad \text{for each } n = 0, 1, 2, \dots; \tag{1.1.3}
\]
if θn = 0, we obtain the secant method
\[
x_{n+1} = x_n - \delta F(x_n, x_{n-1})^{-1} F(x_n) \quad \text{for each } n = 0, 1, 2, \dots; \tag{1.1.4}
\]
if θn = 2, we obtain the Kurchatov method
\[
x_{n+1} = x_n - \delta F(x_n, 2x_n - x_{n-1})^{-1} F(x_n) \quad \text{for each } n = 0, 1, 2, \dots. \tag{1.1.5}
\]



Other choices of θn are also possible [1, 2, 6, 8, 9, 12, 14, 15, 21, 22]. There is a plethora
of sufficient convergence criteria for special cases of secant-type methods (1.1.3)-(1.1.5)
under Lipschitz-type conditions (1.1.2) (see [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16,
17, 18, 19, 20, 21, 22] and the references therein) or even graphical tools to study them [13].
Therefore, it is important to study the convergence of the secant-type method in a unified
way. It is interesting to notice that although we use very general majorizing sequences for {xn}, our technique leads in the semilocal case to: weaker sufficient convergence criteria; more precise estimates on the distances ‖xn − xn−1‖, ‖xn − x∗‖; and at least as precise information on the location of the solution x∗ in many interesting special cases such as Newton’s method or the secant method (see Remark 3.3 and the Examples). Moreover, in the local case, a larger radius of convergence and more precise error estimates than in earlier studies such as [8, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22] are obtained in this chapter (see Remark 4.2 and the Examples).
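To make the unified iteration (1.1.2) concrete, here is a minimal numerical sketch (ours, not the book's) for a scalar equation F(x) = 0, using the first-order divided difference δF(x, y) = (F(x) − F(y))/(x − y); the parameter theta recovers Newton's method (1.1.3), the secant method (1.1.4) and the Kurchatov method (1.1.5):

def secant_type(F, x_prev, x0, theta=0.0, tol=1e-12, max_iter=50):
    # Secant-type iteration (1.1.2): x_{n+1} = x_n - A_n^{-1} F(x_n),
    # with A_n = dF(x_n, y_n) and y_n = theta*x_n + (1 - theta)*x_{n-1}.
    xn_1, xn = x_prev, x0
    for _ in range(max_iter):
        yn = theta * xn + (1.0 - theta) * xn_1
        if yn == xn:
            # theta = 1 (Newton): dF(x, x) = F'(x); approximated here by a
            # central difference since only F is available in this sketch.
            h = 1e-7
            An = (F(xn + h) - F(xn - h)) / (2.0 * h)
        else:
            An = (F(xn) - F(yn)) / (xn - yn)   # divided difference dF(x_n, y_n)
        xn_1, xn = xn, xn - F(xn) / An
        if abs(xn - xn_1) < tol:
            break
    return xn

# theta = 0: secant; theta = 1: Newton; theta = 2: Kurchatov.
root = secant_type(lambda x: x**3 - 0.49, 1.14216, 1.0, theta=0.0)

The same scalar problem reappears in Example 1.5.1 below.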
The chapter is organized as follows. In Section 1.2 we study the convergence of the
majorizing sequences for {xn }. Section 1.3 contains the semilocal and Section 1.4 the
local convergence analysis for {xn }. The numerical examples are given in the concluding
Section 1.5. In particular, in the local case we present an example where the radius of
convergence is larger than the one given by Rheinboldt [18] and Traub [19] for Newton’s
method. Moreover, in the semilocal case we provide an example involving a nonlinear
integral equation of Chandrasekhar type [7] appearing in radiative transfer as well as an
example involving a two point boundary value problem.

1.2. Majorizing Sequences for the Secant-Type Method


In this Section, we shall first study some scalar sequences which are related to the secant-
type method.
Let there be parameters c ≥ 0, ν ≥ 0, λ ≥ 0, µ ≥ 1, l0 > 0 and l > 0 with l0 ≤ l. Define
the scalar sequence {αn} by
\[
\begin{cases}
\alpha_{-1} = 0, \quad \alpha_0 = c, \quad \alpha_1 = c + \nu,\\[4pt]
\alpha_{n+2} = \alpha_{n+1} + \dfrac{l\,(\alpha_{n+1} - \alpha_n + \lambda(\alpha_n - \alpha_{n-1}))(\alpha_{n+1} - \alpha_n)}{1 - l_0\,[\mu(\alpha_{n+1} - c) + \lambda(\alpha_n - c) + c]} \quad \text{for each } n = 0, 1, 2, \dots
\end{cases}
\tag{1.2.1}
\]
Special cases of the sequence {αn} have been used as majorizing sequences for the secant-type method by several authors. For example: Case 1 (secant method) l0 = l, λ = 1 and µ = 1 has
been studied in [6, 8, 9, 12, 14, 15, 20, 21] and for l0 ≤ l in [2, 4]. Case 2 (Newton’s method)
l0 = l, λ = 0, c = 0 and µ = 2 has been studied in [1, 8, 10, 11, 12, 14, 15, 17, 18, 19, 21, 22]
and for l0 ≤ l in [2, 3, 4]. In the present chapter we shall study the convergence of sequence
{αn } by first simplifying it. Indeed, the purpose of the following transformations is to
study the sequence (1.2.1) after using easier to study sequences defined by (1.2.3), (1.2.6)
and (1.2.8). Let
\[
L_0 = \frac{l_0}{1 + (\mu + \lambda - 1)l_0 c} \quad \text{and} \quad L = \frac{l}{1 + (\mu + \lambda - 1)l_0 c}. \tag{1.2.2}
\]

Using (1.2.1) and (1.2.2), sequence {αn} can be written as
\[
\begin{cases}
\alpha_{-1} = 0, \quad \alpha_0 = c, \quad \alpha_1 = c + \nu,\\[4pt]
\alpha_{n+2} = \alpha_{n+1} + \dfrac{L\,(\alpha_{n+1} - \alpha_n + \lambda(\alpha_n - \alpha_{n-1}))(\alpha_{n+1} - \alpha_n)}{1 - L_0\,(\mu\alpha_{n+1} + \lambda\alpha_n)} \quad \text{for each } n = 0, 1, 2, \dots
\end{cases}
\tag{1.2.3}
\]
Moreover, let
L = bL0 for some b ≥ 1 (1.2.4)
and
βn = L0 αn . (1.2.5)
Then, we can define sequence {βn} by
\[
\begin{cases}
\beta_{-1} = 0, \quad \beta_0 = L_0 c, \quad \beta_1 = L_0 (c + \nu),\\[4pt]
\beta_{n+2} = \beta_{n+1} + \dfrac{b\,(\beta_{n+1} - \beta_n + \lambda(\beta_n - \beta_{n-1}))(\beta_{n+1} - \beta_n)}{1 - (\mu\beta_{n+1} + \lambda\beta_n)} \quad \text{for each } n = 0, 1, 2, \dots
\end{cases}
\tag{1.2.6}
\]
Furthermore, let
\[
\gamma_n = \frac{1}{\mu + \lambda} - \beta_n \quad \text{for each } n = 0, 1, 2, \dots. \tag{1.2.7}
\]
Then, sequence {γn} is defined by
\[
\begin{cases}
\gamma_{-1} = \dfrac{1}{\mu+\lambda}, \quad \gamma_0 = \dfrac{1}{\mu+\lambda} - L_0 c, \quad \gamma_1 = \dfrac{1}{\mu+\lambda} - L_0 (c + \nu),\\[4pt]
\gamma_{n+2} = \gamma_{n+1} - \dfrac{b\,(\gamma_{n+1} - \gamma_n + \lambda(\gamma_n - \gamma_{n-1}))(\gamma_{n+1} - \gamma_n)}{\mu\gamma_{n+1} + \lambda\gamma_n} \quad \text{for each } n = 0, 1, 2, \dots
\end{cases}
\tag{1.2.8}
\]
Finally, let
\[
\delta_n = 1 - \frac{\gamma_n}{\gamma_{n-1}} \quad \text{for each } n = 0, 1, 2, \dots \tag{1.2.9}
\]
Then, we define the sequence {δn} by
\[
\begin{cases}
\delta_0 = 1 - \dfrac{\gamma_0}{\gamma_{-1}}, \quad \delta_1 = 1 - \dfrac{\gamma_1}{\gamma_0},\\[4pt]
\delta_{n+2} = \dfrac{b\,\delta_{n+1}\,(\lambda\delta_n + (1 - \delta_n)\delta_{n+1})}{(1 - \delta_n)(1 - \delta_{n+1})\,(\mu(1 - \delta_{n+1}) + \lambda)} \quad \text{for each } n = 0, 1, 2, \dots
\end{cases}
\tag{1.2.10}
\]

It is convenient for the study of the convergence of the sequence {αn} to define the polynomial p by
\[
p(t) = \mu t^3 - (\lambda + 3\mu + b)t^2 + (2\lambda + 3\mu + b(\lambda + 1))t - (\mu + \lambda). \tag{1.2.11}
\]
We have that p(0) = −(µ + λ) < 0 and p(1) = bλ > 0 for λ > 0. It follows from the
intermediate value theorem that p has roots in (0, 1). Denote the smallest root by δ. If
λ = 0, then p(t) = (t − 1)(µt² − (2µ + b)t + µ). Hence, we can choose the smallest root of p, given by
\[
\frac{2\mu + b - \sqrt{b^2 + 4\mu b}}{2\mu} \in (0, 1),
\]
to be δ in this case. Note that in particular for Newton’s method and the secant method, respectively, we have that
\[
p(t) = (t - 1)(2t^2 - (b + 4)t + 2)
\]
and
\[
p(t) = (t - 2)(t^2 - (b + 2)t + 1).
\]
Hence, we obtain, respectively, that
\[
\delta = \frac{4}{b + 4 + \sqrt{b^2 + 8b}} \tag{1.2.12}
\]
and
\[
\delta = \frac{2}{b + 2 + \sqrt{b^2 + 4b}}. \tag{1.2.13}
\]
Notice also that
\[
p(t) \le 0 \quad \text{for each } t \in (-\infty, \delta]. \tag{1.2.14}
\]
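For readers who wish to experiment with these scalar sequences, the following sketch (ours, not part of the original text; any parameter values passed to it are arbitrary illustrations) evaluates the majorizing sequence (1.2.1) and the smallest root δ of the polynomial p in (1.2.11):

import numpy as np

def majorizing_alpha(c, nu, lam, mu, l0, l, n_terms=25):
    # The majorizing sequence {alpha_n} of (1.2.1).
    a = [0.0, c, c + nu]                 # alpha_{-1}, alpha_0, alpha_1
    for _ in range(n_terms):
        num = l * (a[-1] - a[-2] + lam * (a[-2] - a[-3])) * (a[-1] - a[-2])
        den = 1.0 - l0 * (mu * (a[-1] - c) + lam * (a[-2] - c) + c)
        a.append(a[-1] + num / den)
    return a[1:]                         # drop alpha_{-1}

def smallest_root_delta(lam, mu, b):
    # Smallest root of p(t) in (1.2.11) lying in (0, 1); it exists since
    # p(0) = -(mu + lam) < 0 and p(1) = b*lam >= 0.
    coeffs = [mu, -(lam + 3*mu + b), 2*lam + 3*mu + b*(lam + 1), -(mu + lam)]
    roots = np.roots(coeffs)
    return min(r.real for r in roots if abs(r.imag) < 1e-12 and 0 < r.real < 1)

For the secant case (λ = µ = b = 1) this returns 2/(3 + √5) ≈ 0.381966, in agreement with (1.2.13).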
Next, we study the convergence of these sequences starting from {δn }.

Lemma 1.2.1. Let δ1 > 0, δ2 > 0 and b ≥ 1 be given parameters. Suppose that

0 < δ2 ≤ δ1 ≤ δ, (1.2.15)

where δ is the smallest root of the polynomial p defined in (1.2.11). Let {δn} be the scalar sequence defined by (1.2.10). Then,
the following assertions hold:

(A1 ) If
δ1 = δ2 (1.2.16)
then,
δn = δ for each n = 1, 2, 3, · · · (1.2.17)

(A2 ) If
0 < δ2 < δ1 < δ (1.2.18)
then, sequence {δn } is decreasing and converges to 0.

Proof. It follows from (1.2.10) and δ2 ≤ δ1 that δ3 > 0. We shall show that

δ3 ≤ δ2 . (1.2.19)

In view of (1.2.10) for n = 1, it suffices to show that
\[
p_1(\delta_2) = \mu(1 - \delta_1)\delta_2^2 - (1 - \delta_1)(2\mu + \lambda + b)\delta_2 - (\mu + (1 + b)\lambda)\delta_1 + \mu + \lambda \ge 0. \tag{1.2.20}
\]

The discriminant Δ of the quadratic polynomial p₁ is given by
\[
\Delta = (1 - \delta_1)\left[(1 - \delta_1)(\lambda^2 + 2(2\mu + \lambda)b + b^2) + 4\mu\lambda b\,\delta_1\right] > 0. \tag{1.2.21}
\]

Hence, p1 has two distinct roots δs and δl with δs < δl . Polynomial p1 is quadratic with
respect to δ2 and the leading coefficient (µ(1 − δ1 )) is positive. Therefore, we have that

p1 (t) ≥ 0 for each t ∈ (−∞, δs ] ∪ [δl , +∞)

and
p1 (t) ≤ 0 for each t ∈ [δs , δl ].
Then, (1.2.20) shall be true, if
δ2 ≤ δs . (1.2.22)
By hypothesis (1.2.15) we have δ1 ≤ δ. Then by (1.2.14) we get that p(δ1) ≤ 0, which implies δ1 ≤ δs and hence (1.2.22), since δ2 ≤ δ1 by hypothesis (1.2.15). Hence, we showed (1.2.19). Therefore,
relation
0 < δk+1 < δk , (1.2.23)
holds for k = 2. Then, we must show that

0 < δk+2 < δk+1 . (1.2.24)

It follows from (1.2.10), δk < 1 and δk+1 < 1 that δk+2 > 0. Then, in view of (1.2.10), the right hand side of (1.2.24) is true if
\[
\frac{b\,\delta_{k+1}\,[\lambda\delta_k + (1 - \delta_k)\delta_{k+1}]}{(1 - \delta_k)(1 - \delta_{k+1})\,[\lambda + \mu(1 - \delta_{k+1})]} \le \delta_{k+1} \tag{1.2.25}
\]
or
p(δk ) ≤ 0, (1.2.26)
which is true by (1.2.14) since δk ≤ δ1 ≤ δ. The induction for (1.2.23) is complete. If
δ1 = δ2 = δ, then it follows from (1.2.10) for n = 1 that δ3 = δ and δn = δ for n = 4, 5, · · ·,
which shows (1.2.17). If δ2 < δ1, the sequence {δn} is decreasing, bounded below by 0 and as such it converges to its greatest lower bound, denoted by γ. We then have from (1.2.10) that
\[
\gamma = \frac{b\gamma[\lambda\gamma + (1 - \gamma)\gamma]}{(1 - \gamma)^2[\lambda + \mu(1 - \gamma)]} \;\Rightarrow\; \gamma = \delta \ \text{ or } \ \gamma = 0. \tag{1.2.27}
\]
But γ ≤ δ1 ≤ δ. Hence, we conclude that γ = 0. 
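A quick numerical check of Lemma 2.1 (ours, not part of the original text): iterating (1.2.10) from two consecutive terms with 0 < δ₂ < δ₁ < δ produces a decreasing sequence tending to 0, while starting from δ₁ = δ₂ = δ keeps it constant.

def delta_sequence(d1, d2, lam, mu, b, n_terms=20):
    # Iterate (1.2.10) starting from two consecutive terms d1, d2.
    d = [d1, d2]
    for _ in range(n_terms):
        dn, dn1 = d[-2], d[-1]
        d.append(b * dn1 * (lam * dn + (1.0 - dn) * dn1)
                 / ((1.0 - dn) * (1.0 - dn1) * (mu * (1.0 - dn1) + lam)))
    return d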
Next, we present three results for the convergence of sequences {αn }, {βn } and {γn }
under conditions that are not all the same as the ones in Lemma 2.1 (see e.g. (1.2.28)).

Lemma 1.2.2. Suppose that the hypothesis (1.2.18) is satisfied. Then, the sequence {γn} is decreasingly convergent and the sequences {αn} and {βn} are increasingly convergent. Moreover, the following estimate holds:
\[
l_0 c < 1. \tag{1.2.28}
\]

Proof. Using (1.2.2) and (1.2.9) we get that
\[
\gamma_n = (1 - \delta_n)\gamma_{n-1} = \cdots = (1 - \delta_n)\cdots(1 - \delta_1)\gamma_0 > 0.
\]
In view of (1.2.18) we have in turn that
\[
\delta_1 > 0 \;\Rightarrow\; 1 - \frac{\gamma_1}{\gamma_0} > 0 \;\Rightarrow\; \gamma_0 = \frac{1 - (\mu + \lambda)L_0 c}{\mu + \lambda} > 0 \;\Rightarrow\; \gamma_0 = \frac{1 - l_0 c}{(\mu + \lambda)[1 + (\mu + \lambda - 1)l_0 c]} > 0 \;\Rightarrow\; (1.2.28),
\]
and by the preceding equation we deduce that γn > 0 for each n = 1, 2, . . . and
\[
\gamma_n < \gamma_{n-1} \quad \text{for each } n = 1, 2, \dots,
\]
since δn < 1. Hence, the sequence {γn} converges to its greatest lower bound, denoted by γ∗. We also have that βn = 1/(µ + λ) − γn < 1/(µ + λ). Thus, the sequence {βn} is increasing, bounded from above by 1/(µ + λ) and as such it converges to its least upper bound, denoted by β∗. Then, in view of (1.2.5), the sequence {αn} is also increasing, bounded from above by L₀⁻¹/(µ + λ), and as such it also converges to its least upper bound, denoted by α∗. 

Lemma 1.2.3. Suppose that (1.2.15) and (1.2.16) are satisfied. Then, the following assertions hold for each n = 1, 2, · · ·:
\[
\delta_n = \delta,
\]
\[
\gamma_n = (1 - \delta)^n \gamma_0, \qquad \gamma^* = \lim_{n\to\infty} \gamma_n = 0,
\]
\[
\beta_n = \frac{1}{\mu + \lambda} - (1 - \delta)^n \gamma_0, \qquad \beta^* = \lim_{n\to\infty} \beta_n = \frac{1}{\mu + \lambda}
\]
and
\[
\alpha_n = \frac{1}{L_0}\left(\frac{1}{\mu + \lambda} - (1 - \delta)^n \gamma_0\right), \qquad \alpha^* = \lim_{n\to\infty} \alpha_n = \frac{1}{L_0(\mu + \lambda)}.
\]
Corollary 1.2.4. Suppose that the hypotheses of Lemma 2.1 and Lemma 2.2 hold. Then, the sequence {αn} defined in (1.2.1) is nondecreasing and converges to
\[
\alpha^* = \frac{1 + (\mu + \lambda - 1)l_0 c}{l_0}\,\beta^*.
\]
Next, we present lower and upper bounds on the limit point α∗ .

Lemma 1.2.5. Suppose that the condition (1.2.18) is satisfied. Then, the following assertion holds
\[
b_1^1 \le \alpha^* \le b_2^1, \tag{1.2.29}
\]
where
\[
b_1^1 = \frac{1 + (\mu + \lambda - 1)l_0 c}{l_0}\left[\frac{1}{\mu + \lambda} - \exp\left(-2\left(\frac{\delta_1}{2 - \delta_1} + \frac{\delta_2}{2 - \delta_2}\right)\right)\right],
\]
\[
b_2^1 = \frac{1 + (\mu + \lambda - 1)l_0 c}{l_0}\left[\frac{1}{\mu + \lambda} - \exp(\delta^*)\right], \tag{1.2.30}
\]
\[
\delta^* = -\left[\frac{1}{1 - \delta_1}\left(\delta_1 + \frac{\delta_2}{1 - r}\right) + \ln\left(\frac{(\mu + \lambda)(1 + (\mu + \lambda - 1)l_0 c)}{1 - l_0 c}\right)\right]
\]
and
\[
r = b\,\frac{\lambda\delta_1 + \delta_2(1 - \delta_1)}{(1 - \delta_1)(1 - \delta_2)(\lambda + \mu(1 - \delta_2))}.
\]
Proof. Using (1.2.18) and (1.2.28) we have that 0 < δ3 < δ2 < δ1. Let us assume that 0 < δk+1 < δk < · · · < δ1. Then, it follows from the induction hypotheses and (1.2.10) that
\[
\delta_{k+2} = b\,\delta_{k+1}\,\frac{\lambda\delta_k + (1 - \delta_k)\delta_{k+1}}{(1 - \delta_k)(1 - \delta_{k+1})(\lambda + \mu(1 - \delta_{k+1}))} < r\,\delta_{k+1} < r^2\delta_k \le \cdots \le r^{k-1}\delta_3 \le r^k\delta_2.
\]
We have that
\[
\gamma^* = \lim_{n\to\infty}\gamma_n = \prod_{n=1}^{\infty}(1 - \delta_n)\,\gamma_0.
\]
This is equivalent to
\[
\ln\frac{1}{\gamma^*} = \sum_{n=1}^{\infty}\ln\frac{1}{1 - \delta_n} + \ln\left(\frac{(\mu + \lambda)(1 + (\mu + \lambda - 1)l_0 c)}{1 - l_0 c}\right),
\]
recalling that γ0 = (1 − l0c)/((µ + λ)(1 + (µ + λ − 1)l0c)). We shall use the following bounds for ln t, t > 1:
\[
2\,\frac{t - 1}{t + 1} \le \ln t \le \frac{t^2 - 1}{2t}.
\]
First, we shall find an upper bound for ln(1/γ∗). We have that
\[
\begin{aligned}
\ln(1/\gamma^*) &\le \sum_{n=1}^{\infty}\frac{\delta_n(2 - \delta_n)}{2(1 - \delta_n)} + \ln\left(\frac{(\mu + \lambda)(1 + (\mu + \lambda - 1)l_0 c)}{1 - l_0 c}\right)\\
&\le \frac{1}{1 - \delta_1}\,(\delta_1 + \delta_2 + \delta_3 + \cdots) + \ln\left(\frac{(\mu + \lambda)(1 + (\mu + \lambda - 1)l_0 c)}{1 - l_0 c}\right)\\
&\le \frac{1}{1 - \delta_1}\,(\delta_1 + \delta_2 + r\delta_2 + r^2\delta_2 + \cdots) + \ln\left(\frac{(\mu + \lambda)(1 + (\mu + \lambda - 1)l_0 c)}{1 - l_0 c}\right)\\
&\le \frac{1}{1 - \delta_1}\left(\delta_1 + \frac{\delta_2}{1 - r}\right) + \ln\left(\frac{(\mu + \lambda)(1 + (\mu + \lambda - 1)l_0 c)}{1 - l_0 c}\right) = -\delta^*.
\end{aligned}
\]
As β∗ = 1/(µ + λ) − γ∗ and α∗ = L₀⁻¹β∗, we obtain the upper bound in (1.2.29). Moreover, in order to obtain the lower bound for ln(1/γ∗), we have that
\[
\ln(1/\gamma^*) \ge 2\sum_{n=1}^{\infty}\frac{\delta_n}{2 - \delta_n} > 2\left(\frac{\delta_1}{2 - \delta_1} + \frac{\delta_2}{2 - \delta_2}\right),
\]
which implies the lower bound in (1.2.29). 


From now on we shall denote by (C1 ) the hypothesis of Lemma 2.1 and Lemma 2.2.

Remark 1.2.6. (a) Let us introduce the notation

cN = αN−1 − αN−2 , νN = αN − αN−1

for some integer N ≥ 1. Notice that c1 = α0 − α−1 = c and ν1 = α1 − α0 = ν. The re-


sults in the preceding Lemmas can be weakened even further as follows. Consider the convergence criteria (C_N^*) for N > 1: (C1) with c, ν replaced by cN, νN, respectively;
\[
\alpha_{-1} < \alpha_0 < \alpha_1 < \cdots < \alpha_N < \alpha_{N+1},
\]
\[
l_0\left[\mu(\alpha_{N+1} - c_N) + \lambda(\alpha_N - c_N) + c_N\right] < 1.
\]
Then, the preceding results hold with c, ν, δ1, δ2, b_1^1, b_2^1 replaced, respectively, by c_N, ν_N, δ_N, δ_{N+1}, b_1^N, b_2^N.

(b) Notice that if
\[
l_0\left[\mu(\alpha_{n+1} - c) + \lambda(\alpha_n - c) + c\right] < 1 \quad \text{holds for each } n = 0, 1, 2, \dots, \tag{1.2.31}
\]
then, it follows from (1.2.1) that the sequence {αn} is increasing, bounded from above by (1 + (µ + λ − 1)l0c)/(l0(µ + λ)) and as such it converges to its least upper bound α∗. Criterion (1.2.31) is the weakest of all the preceding convergence criteria for the sequence {αn}. Clearly all the preceding criteria imply (1.2.31). Finally, define the criteria for N ≥ 1
\[
(I^N) = \begin{cases} (C_N^*), & \text{if the criteria } (C_N^*) \text{ hold},\\ (1.2.31), & \text{if the criteria } (C_N^*) \text{ fail}. \end{cases} \tag{1.2.32}
\]


1.3. Semilocal Convergence of the Secant-Type Method


In this section, we first present the semilocal convergence of the secant-type method using
{αn} (defined in (1.2.1)) as a majorizing sequence. Let U(x, R) stand for the open ball centered at x ∈ X with radius R > 0, and let Ū(x, R) denote its closure. We shall study the
secant method for triplets (F , x−1 , x0 ) belonging to the class K = K (l0 , l, ν, c, λ, µ) defined
as follows.

Definition 1.3.1. Let l0 , l, ν, c, λ, µ be constants satisfying the hypotheses (I N ) for some fixed
integer N ≥ 1. A triplet (F , x−1 , x0 ) belongs to the class K = K (l0 , l, ν, c, λ, µ) if:

(D1 ) F is a nonlinear operator defined on a convex subset D of a Banach space X with


values in a Banach space Y .

(D2 ) x−1 and x0 are two points belonging to the interior D0 of D and satisfying the in-
equality
‖x0 − x−1‖ ≤ c.

(D3 ) There exists a sequence {θn } of real numbers and λ, µ such that |1 − θn | ≤ λ and
1 + |θn | ≤ µ for each n = 0, 1, 2, · · ·.

(D4) F is Fréchet-differentiable on D0 and there exists an operator δF : D0 × D0 → L(X, Y) such that A⁻¹ = δF(x0, y0)⁻¹ ∈ L(Y, X) and, for all x, y, z ∈ D, the following hold:
\[
\|A^{-1}F(x_0)\| \le \nu,
\]
\[
\|A^{-1}(\delta F(x, y) - F'(z))\| \le l\,(\|x - z\| + \|y - z\|)
\]
and
\[
\|A^{-1}(\delta F(x, y) - F'(x_0))\| \le l_0\,(\|x - x_0\| + \|y - x_0\|),
\]
where y0 = θ0 x0 + (1 − θ0)x−1.

(D5)
\[
U(x_0, \alpha_0^*) \subseteq D_c = \{x \in D : F \text{ is continuous at } x\} \subseteq D,
\]
where α₀* = (µ + λ − 1)(α∗ − c) and α∗ is given in Lemma 2.3.

Next, we present the semilocal convergence result for the secant method.

Theorem 1.3.2. If (F, x−1, x0) ∈ K(l0, l, ν, c, λ, µ), then the sequence {xn} (n ≥ −1) generated by the secant-type method is well defined, remains in U(x0, α₀*) for each n = 0, 1, 2, · · · and converges to a unique solution x∗ ∈ Ū(x0, α∗ − c) of (1.1.1). Moreover, the following assertions hold for each n = 0, 1, 2, · · ·:
\[
\|x_n - x_{n-1}\| \le \alpha_n - \alpha_{n-1} \tag{1.3.1}
\]
and
\[
\|x^* - x_n\| \le \alpha^* - \alpha_n, \tag{1.3.2}
\]
where the sequence {αn} (n ≥ 0) is given in (1.2.1). Furthermore, if there exists R such that
\[
U(x_0, R) \subseteq D, \qquad R \ge \alpha^* - c \qquad \text{and} \qquad l_0(\alpha^* - c + R) + \|A^{-1}(F'(x_0) - A)\| < 1, \tag{1.3.3}
\]
then, the solution x∗ is unique in U(x0, R).

Proof. First, we show that M = δF(xk+1, yk+1) is invertible for xk+1, yk+1 ∈ U(x0, α₀*). By (D2), (D3) and (D4), we have that
\[
\begin{aligned}
\|y_{k+1} - x_0\| &= \|\theta_{k+1}(x_{k+1} - x_0) + (1 - \theta_{k+1})(x_k - x_0)\|\\
&\le |\theta_{k+1}|\,\|x_{k+1} - x_0\| + |1 - \theta_{k+1}|\,\|x_k - x_0\| \le (\mu - 1)(\alpha^* - c) + \lambda(\alpha^* - c) = \alpha_0^*
\end{aligned}
\]
and
\[
\begin{aligned}
\|I - A^{-1}M\| &= \|A^{-1}(M - A)\|\\
&\le \|A^{-1}(M - F'(x_0))\| + \|A^{-1}(F'(x_0) - A)\|\\
&\le l_0\,(\|x_{k+1} - x_0\| + \|y_{k+1} - x_0\| + \|x_0 - x_{-1}\|)\\
&\le l_0\,(\|x_{k+1} - x_0\| + |\theta_{k+1}|\,\|x_{k+1} - x_0\| + |1 - \theta_{k+1}|\,\|x_k - x_0\| + c)\\
&\le l_0\,(\mu(\alpha_{k+1} - c) + \lambda(\alpha_{k+1} - c) + c) < 1.
\end{aligned}
\tag{1.3.4}
\]
Using the Banach lemma on invertible operators [9], [10], [15], [18], [20] and (1.3.4), we deduce that M is invertible and
\[
\|M^{-1}A\| \le \left(1 - l_0\,(\mu(\alpha_{k+1} - c) + \lambda(\alpha_{k+1} - c) + c)\right)^{-1}. \tag{1.3.5}
\]

By (D4), we have
\[
\|A^{-1}(F'(u) - F'(v))\| \le 2l\,\|u - v\|, \qquad u, v \in D_0. \tag{1.3.6}
\]

We can write the identity
\[
F(x) - F(y) = \int_0^1 F'(y + t(x - y))\,dt\,(x - y). \tag{1.3.7}
\]
Then, for all x, y, u, v ∈ D0, we obtain
\[
\|A^{-1}(F(x) - F(y) - F'(u)(x - y))\| \le l\,(\|x - u\| + \|y - u\|)\,\|x - y\| \tag{1.3.8}
\]
and
\[
\|A^{-1}(F(x) - F(y) - \delta F(u, v)(x - y))\| \le l\,(\|x - v\| + \|y - v\| + \|u - v\|)\,\|x - y\|. \tag{1.3.9}
\]

By a continuity argument, (1.3.6)–(1.3.9) remain valid if x and/or y belong to Dc. Next, we show (1.3.1). If (1.3.1) holds for all n ≤ k and if {xn} (n ≥ 0) is well defined for n = 0, 1, 2, · · · , k, then
\[
\|x_n - x_0\| \le \alpha_n - \alpha_0 < \alpha^* - \alpha_0, \qquad n \le k. \tag{1.3.10}
\]
That is, (1.1.2) is well defined for n = k + 1. For n = −1 and n = 0, (1.3.1) reduces to ‖x−1 − x0‖ ≤ c and ‖x0 − x1‖ ≤ ν. Suppose (1.3.1) holds for n = −1, 0, 1, · · · , k (k ≥ 0). By (1.3.5), (1.3.9), and
\[
F(x_{k+1}) = F(x_{k+1}) - F(x_k) - A_k(x_{k+1} - x_k), \tag{1.3.11}
\]

we obtain in turn the following estimates
\[
\begin{aligned}
\|A^{-1}F(x_{k+1})\| &= \|A^{-1}(\delta F(x_{k+1}, x_k) - A_k)(x_{k+1} - x_k)\|\\
&\le \left[\|A^{-1}(\delta F(x_{k+1}, x_k) - F'(x_k))\| + \|A^{-1}(F'(x_k) - A_k)\|\right]\|x_{k+1} - x_k\|\\
&\le l\left[\|x_{k+1} - x_k\| + \|x_k - y_k\|\right]\|x_{k+1} - x_k\|\\
&\le l\,(\alpha_{k+1} - \alpha_k + |1 - \theta_k|(\alpha_k - \alpha_{k-1}))(\alpha_{k+1} - \alpha_k)
\end{aligned}
\tag{1.3.12}
\]
and
\[
\begin{aligned}
\|x_{k+2} - x_{k+1}\| &= \|A_{k+1}^{-1}F(x_{k+1})\| \le \|A_{k+1}^{-1}A\|\,\|A^{-1}F(x_{k+1})\|\\
&\le \frac{l\,(\alpha_{k+1} - \alpha_k + |1 - \theta_k|(\alpha_k - \alpha_{k-1}))}{1 - l_0\left[(1 + |\theta_{k+1}|)(\alpha_{k+1} - c) + |1 - \theta_{k+1}|(\alpha_k - c) + c\right]}\,(\alpha_{k+1} - \alpha_k)\\
&\le \alpha_{k+2} - \alpha_{k+1}.
\end{aligned}
\]
The induction for (1.3.1) is complete. It follows from (1.3.1) and Lemma 2.1 that {xn} (n ≥ −1) is a complete sequence in the Banach space X and as such it converges to some x∗ ∈ Ū(x0, α∗ − c) (since Ū(x0, α∗ − c) is a closed set). By letting k → ∞ in (1.3.12), we obtain F(x∗) = 0. Moreover, estimate (1.3.2) follows from (1.3.1) by using standard majorization techniques [8, 12, 14]. Finally, to show the uniqueness in U(x0, R), let y∗ ∈ U(x0, R) be a solution of (1.1.1). Set
\[
T = \int_0^1 F'(x^* + t(y^* - x^*))\,dt.
\]
Using (D4) and (1.3.3) we get in turn that
\[
\begin{aligned}
\|A^{-1}(A - T)\| &\le l_0\,(\|y^* - x_0\| + \|x^* - x_0\|) + \|A^{-1}(F'(x_0) - A)\|\\
&\le l_0\left[(\alpha^* - \alpha_0) + R\right] + \|A^{-1}(F'(x_0) - A)\| < 1.
\end{aligned}
\tag{1.3.13}
\]
It follows from (1.3.13) and the Banach lemma on invertible operators that T⁻¹ exists. Using the identity
\[
F(x^*) - F(y^*) = T(x^* - y^*), \tag{1.3.14}
\]
we deduce that x∗ = y∗. 
Remark 1.3.3. It follows from the proof of Theorem 3.2 that the sequences {rn}, {sn} defined by
\[
\begin{cases}
r_{-1} = 0, \quad r_0 = c, \quad r_1 = c + \nu,\\[4pt]
r_2 = r_1 + \dfrac{l_0\,(r_1 - r_0 + |1 - \theta_0|(r_0 - r_{-1}))(r_1 - r_0)}{1 - l_0\,(1 + |\theta_1|)(r_1 - r_0)},\\[4pt]
r_{n+2} = r_{n+1} + \dfrac{l\,(r_{n+1} - r_n + |1 - \theta_n|(r_n - r_{n-1}))(r_{n+1} - r_n)}{1 - l_0\left[(1 + |\theta_{n+1}|)(r_{n+1} - r_0) + |1 - \theta_{n+1}|(r_n - r_0) + c\right]}
\end{cases}
\tag{1.3.15}
\]
and
\[
\begin{cases}
s_{-1} = 0, \quad s_0 = c, \quad s_1 = c + \nu,\\[4pt]
s_2 = s_1 + \dfrac{l_0\,(s_1 - s_0 + \lambda(s_0 - s_{-1}))(s_1 - s_0)}{1 - l_0\,(1 + |\theta_1|)(s_1 - s_0)},\\[4pt]
s_{n+2} = s_{n+1} + \dfrac{l\,(s_{n+1} - s_n + \lambda(s_n - s_{n-1}))(s_{n+1} - s_n)}{1 - l_0\left[\mu(s_{n+1} - s_0) + \lambda(s_n - s_0) + c\right]}
\end{cases}
\tag{1.3.16}
\]
respectively, are more precise majorizing sequences for {xn}. Clearly, these sequences also converge under the (I^N) hypotheses.
A simple inductive argument shows that if l0 < l, then for each n = 2, 3, · · ·
\[
r_n < s_n < \alpha_n, \tag{1.3.17}
\]
\[
r_{n+1} - r_n < s_{n+1} - s_n < \alpha_{n+1} - \alpha_n \tag{1.3.18}
\]
and
\[
r^* = \lim_{n\to\infty} r_n \le s^* = \lim_{n\to\infty} s_n \le \alpha^* = \lim_{n\to\infty} \alpha_n. \tag{1.3.19}
\]

In practice, one must choose {θn} so that the best error bounds are obtained (see also Section 4). Note also that the sequences {rn} or {sn} may converge under even weaker hypotheses. The sufficient convergence criterion (1.2.15) determines the smallness of c and ν. This criterion can be solved for c and ν (see, for example, the h criteria or (1.3.29) that follow). Indeed, let us demonstrate the advantages in two popular cases:

Case 1. Newton's method (i.e., c = 0, λ = 0, µ = 1). Then, it can easily be seen that {sn} (and consequently {rn}) converges provided that (see also [3])
\[
h_2 = l_2\nu \le 1, \tag{1.3.20}
\]
where
\[
l_2 = \frac{1}{4}\left(4\kappa_0 + \sqrt{\kappa_0\kappa} + \sqrt{\kappa_0\kappa + 8\kappa_0^2}\right), \tag{1.3.21}
\]
whereas the sequence {xn} converges if
\[
h_1 = l_1\nu \le 1, \tag{1.3.22}
\]
where
\[
l_1 = \frac{1}{4}\left(4\kappa_0 + \kappa + \sqrt{\kappa^2 + 8\kappa_0\kappa}\right). \tag{1.3.23}
\]
In the case κ0 = κ (i.e., b = 1), we obtain the Kantorovich sufficient convergence criterion [2], famous for its simplicity and clarity, given by
\[
h = 2\kappa\nu \le 1. \tag{1.3.24}
\]
Notice however that
\[
h \le 1 \Rightarrow h_1 \le 1 \Rightarrow h_2 \le 1 \tag{1.3.25}
\]
but not necessarily vice versa unless κ0 = κ. Moreover, we have that
\[
\frac{h_1}{h} \to \frac{1}{4}, \qquad \frac{h_2}{h} \to 0, \qquad \frac{h_2}{h_1} \to 0 \qquad \text{as } \frac{\kappa_0}{\kappa} \to 0. \tag{1.3.26}
\]
Case 2. Secant method (i.e., θn = 0). Schmidt [20], Potra–Pták [15], Dennis [8] and Ezquerro et al. [9] used the majorizing sequence {αn} for θn ∈ [0, 1] and l0 = l. That is, they used the sequence {tn} given by
\[
\begin{cases}
t_{-1} = 0, \quad t_0 = c, \quad t_1 = c + \nu,\\[4pt]
t_{n+2} = t_{n+1} + \dfrac{l\,(t_{n+1} - t_{n-1})(t_{n+1} - t_n)}{1 - l\,(t_{n+1} + t_n - c)},
\end{cases}
\tag{1.3.27}
\]
whereas our sequence {αn} reduces to
\[
\begin{cases}
\alpha_{-1} = 0, \quad \alpha_0 = c, \quad \alpha_1 = c + \nu,\\[4pt]
\alpha_{n+2} = \alpha_{n+1} + \dfrac{l\,(\alpha_{n+1} - \alpha_{n-1})(\alpha_{n+1} - \alpha_n)}{1 - l_0\,(\alpha_{n+1} + \alpha_n - c)}.
\end{cases}
\tag{1.3.28}
\]
Then, in case l0 < l our sequence is more precise (see also (1.3.17)–(1.3.19)). Notice also that in the preceding references the sufficient convergence criterion associated to {tn} is given by
\[
lc + 2\sqrt{l\nu} \le 1. \tag{1.3.29}
\]
Our sufficient convergence criteria can also be weaker in this case (see also the numerical examples). It is worth noting that if c = 0, then (1.3.29) reduces to (1.3.24) (since κ = 2l). Similar observations can be made for other choices of parameters.
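As a small numerical illustration (ours, not the book's), the following sketch evaluates the three Newton criteria (1.3.24), (1.3.22) and (1.3.20); when κ₀ is much smaller than κ, the weaker criteria can succeed where the Kantorovich criterion fails (the values in the demo call are arbitrary):

from math import sqrt

def newton_criteria(kappa0, kappa, nu):
    # h of (1.3.24), h1 = l1*nu of (1.3.22)-(1.3.23), h2 = l2*nu of (1.3.20)-(1.3.21).
    h = 2.0 * kappa * nu
    l1 = 0.25 * (4.0*kappa0 + kappa + sqrt(kappa**2 + 8.0*kappa0*kappa))
    l2 = 0.25 * (4.0*kappa0 + sqrt(kappa0*kappa) + sqrt(kappa0*kappa + 8.0*kappa0**2))
    return h, l1 * nu, l2 * nu

h, h1, h2 = newton_criteria(kappa0=0.1, kappa=10.0, nu=0.06)
# h = 1.2 > 1 (Kantorovich fails), while h1 ≈ 0.31 and h2 ≈ 0.04 are below 1.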

1.4. Local Convergence of the Secant-Type Method


In this section, we present the local convergence analysis of the secant-type method. Let x∗ ∈ X be such that F(x∗) = 0 and F′(x∗)⁻¹ ∈ L(Y, X). Using the identities
\[
x_{n+1} - x^* = \left(A_n^{-1}F'(x^*)\right)F'(x^*)^{-1}\left[(\delta F(x_n, y_n) - F'(x_n)) + (F'(x_n) - \delta F(x_n, x^*))\right](x_n - x^*),
\]
\[
y_n - x_n = (1 - \theta_n)(x_{n-1} - x_n)
\]
and
\[
y_n - x^* = \theta_n(x_n - x^*) + (1 - \theta_n)(x_{n-1} - x^*),
\]
we easily arrive at:

Theorem 1.4.1. Suppose that (D1) and (D3) hold. Moreover, suppose that there exist x∗ ∈ D, K0 > 0, K > 0 such that F(x∗) = 0, F′(x∗)⁻¹ ∈ L(Y, X),
\[
\|F'(x^*)^{-1}(\delta F(x, y) - F'(x^*))\| \le K_0\,(\|x - x^*\| + \|y - x^*\|),
\]
\[
\|F'(x^*)^{-1}(\delta F(x, y) - F'(z))\| \le K\,(\|x - z\| + \|y - z\|) \quad \text{for each } x, y, z \in D,
\]
and
\[
U(x^*, R_0^*) \subseteq D,
\]
where
\[
R^* = \frac{1}{(2\lambda + 1)K + (\lambda + \mu)K_0}
\]
and
\[
R_0^* = (\mu + \lambda - 1)R^*.
\]
Then, the sequence {xn} generated by the secant-type method is well defined, remains in U(x∗, R∗) for each n = −1, 0, 1, 2, · · · and converges to x∗, provided that x−1, x0 ∈ U(x∗, R∗). Moreover, the following estimates hold:
\[
\|x_{n+1} - x^*\| \le \hat e_n\,\|x_n - x^*\| \le e_n\,\|x_n - x^*\| \le \bar e_n\,\|x_n - x^*\|,
\]
where
\[
\hat e_n = \frac{K\,(\|x_n - x^*\| + |1 - \theta_n|\,\|x_{n-1} - x_n\|)}{1 - K_0\left[(1 + |\theta_n|)\|x_n - x^*\| + |1 - \theta_n|\,\|x_{n-1} - x^*\|\right]},
\]
\[
e_n = \frac{K\,(\|x_n - x^*\| + \lambda\,\|x_{n-1} - x_n\|)}{1 - K_0\left[\mu\,\|x_n - x^*\| + \lambda\,\|x_{n-1} - x^*\|\right]},
\]
\[
\bar e_n = \frac{K(2\lambda + 1)R^*}{1 - K_0(\lambda + \mu)R^*}
\]
and
\[
K = \begin{cases} \kappa_0, & \text{if } n = 0,\\ \kappa, & \text{if } n > 0. \end{cases}
\]
Remark 1.4.2. Comments similar to the ones given in Remark 3.3 can also be made for this case. For example, notice again that in the case of Newton's method
\[
R^* = \frac{2}{2\kappa_0 + \kappa},
\]
whereas the convergence ball given independently by Rheinboldt [18] and Traub [19] is given by
\[
R_1^* = \frac{2}{3\kappa}.
\]
Note that
\[
R_1^* \le R^*.
\]
Strict inequality holds in the preceding inequality if κ0 < κ. Moreover, the error bounds are tighter if κ0 < κ. Finally, note that κ0/κ can be arbitrarily small and
\[
\frac{R^*}{R_1^*} \to 3 \quad \text{as } \frac{\kappa_0}{\kappa} \to 0.
\]
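The limit R∗/R∗₁ → 3 is easy to visualize numerically (a small check of ours, not from the book):

# R*/R*_1 = (2/(2*kappa0 + kappa)) / (2/(3*kappa)) = 3*kappa/(2*kappa0 + kappa).
for ratio in (1.0, 0.5, 0.1, 0.01):           # values of kappa0/kappa
    print(ratio, 3.0 / (2.0 * ratio + 1.0))   # 1.0, 1.5, 2.5, 2.94..., tending to 3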

1.5. Numerical Examples


Related to the semilocal case we present the following examples.

Example 1.5.1. Let X = Y = R and let us consider the equation
\[
x^3 - 0.49 = 0, \tag{1.5.1}
\]
to which we apply the secant method (λ = 1, µ = 1, θn = 0) to find the solution of (1.5.1). We take the starting points x−1 = 1.14216 · · ·, x0 = 1 and we consider the domain Ω = B(x0, 2). In this case, we obtain
\[
\nu = 0.147967\cdots, \tag{1.5.2}
\]
\[
c = 0.14216\cdots, \tag{1.5.3}
\]
\[
l = 2.61119\cdots, \tag{1.5.4}
\]
\[
l_0 = 1.74079\cdots. \tag{1.5.5}
\]
Notice that the hypothesis lc + 2√(lν) ≤ 1 is not satisfied, but the hypotheses of Theorem 3.2 are satisfied, so the secant method starting from x−1, x0 converges to the solution of (1.5.1).
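The constants above are easy to reproduce numerically; a short sketch (ours, not the book's) for this example:

# Example 1.5.1: secant method data for x^3 - 0.49 = 0.
F = lambda x: x**3 - 0.49
dF = lambda x, y: (F(x) - F(y)) / (x - y)      # first-order divided difference
x_prev, x0 = 1.14216, 1.0
c = abs(x0 - x_prev)                           # c  = 0.14216...
nu = abs(F(x0) / dF(x0, x_prev))               # nu = 0.147967...
l, l0 = 2.61119, 1.74079                       # Lipschitz constants from the text
print(l*c + 2.0*(l*nu)**0.5)                   # about 1.61 > 1: criterion (1.3.29) fails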

Example 1.5.2. Let X = Y = C[0, 1], equipped with the max-norm. Consider the following nonlinear boundary value problem:
\[
u'' = -u^3 - \gamma u^2, \qquad u(0) = 0, \quad u(1) = 1.
\]
It is well known that this problem can be formulated as the integral equation
\[
u(s) = s + \int_0^1 Q(s, t)\,(u^3(t) + \gamma u^2(t))\,dt, \tag{1.5.6}
\]
where Q is the Green's function:
\[
Q(s, t) = \begin{cases} t(1 - s), & t \le s,\\ s(1 - t), & s < t. \end{cases}
\]
We observe that
\[
\max_{0 \le s \le 1} \int_0^1 |Q(s, t)|\,dt = \frac{1}{8}.
\]
Then problem (1.5.6) is in the form (1.1.1), where F : D −→ Y is defined as
\[
[F(x)](s) = x(s) - s - \int_0^1 Q(s, t)\,(x^3(t) + \gamma x^2(t))\,dt.
\]
We define the divided difference by δF(x, y) = ∫₀¹ F′(y + t(x − y)) dt. Set u0(s) = s and D = U(u0, R0). It is easy to verify that U(u0, R0) ⊂ U(0, R0 + 1) since ‖u0‖ = 1. If 2γ < 5, the operator F′ satisfies the conditions of Theorem 3.2, with
\[
\theta_n = 0, \qquad \nu = \frac{1 + \gamma}{(1 - l_0 c)(5 - 2\gamma)}, \qquad l = \frac{\gamma + 6R_0 + 3}{(1 - l_0 c)(5 - 2\gamma)}, \qquad l_0 = \frac{2\gamma + 3R_0 + 6}{(1 - l_0 c)(5 - 2\gamma)},
\]
since
\[
\|\delta F(x_0, x_{-1})^{-1} F(x_0)\| \le \|\delta F(x_0, x_{-1})^{-1} F'(x_0)\|\,\|F'(x_0)^{-1} F(x_0)\| \le \frac{1}{1 - l_0 c}\,\frac{1 + \gamma}{5 - 2\gamma}.
\]
Note that l0 < l. Therefore, the hypothesis of Kantorovich may not be satisfied, but the conditions of Theorem 3.2 may be satisfied.
Finally, for the local case we study the following one.
Example 1.5.3. Let X = Y = R³, D = U(0, 1), x∗ = (0, 0, 0) and define the function F on D by
\[
F(x, y, z) = (e^x - 1,\; y^2 + y,\; z). \tag{1.5.7}
\]
We have that for u = (x, y, z),
\[
F'(u) = \begin{pmatrix} e^x & 0 & 0\\ 0 & 2y + 1 & 0\\ 0 & 0 & 1 \end{pmatrix}. \tag{1.5.8}
\]
Using the norm of the maximum of the rows and (1.5.7)–(1.5.8), we see that since F′(x∗) = diag{1, 1, 1}, we can define the parameters for Newton's method by
\[
K = e/2, \tag{1.5.9}
\]
\[
K_0 = 1, \tag{1.5.10}
\]
\[
R^* = \frac{2}{e + 4}, \tag{1.5.11}
\]
\[
R_0^* = R^*, \tag{1.5.12}
\]
since θn = 1, µ = 2, λ = 0. Then Newton's method starting from x0 ∈ B(x∗, R∗) converges to a solution of (1.5.7). Note that using only the Lipschitz condition we obtain the Rheinboldt or Traub ball R∗_{TR} = 2/(3e) < R∗.
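A quick numerical verification of the radii in this example (ours, not from the book):

from math import e
K, K0 = e/2, 1.0               # constants (1.5.9), (1.5.10)
R_star = 1.0/(K + 2.0*K0)      # = 2/(e+4), about 0.2977 (Newton case of Theorem 4.1)
R_TR   = 2.0/(3.0*e)           # Rheinboldt/Traub ball with kappa = 2K = e, about 0.2453
assert R_TR < R_star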

Example 1.5.4. In this example we present an application of the previous analysis to the Chandrasekhar equation:
\[
x(s) = 1 + \frac{s}{4}\,x(s)\int_0^1 \frac{x(t)}{s + t}\,dt, \qquad s \in [0, 1], \tag{1.5.13}
\]
which arises in the theory of radiative transfer [7]; x(s) is the unknown function, which is sought in C[0, 1]. The physical background of this equation is fairly elaborate. It was developed by Chandrasekhar [7] to solve the problem of determining the angular distribution of the radiant flux emerging from a plane radiation field. This radiation field must be isotropic at a point, that is, the distribution is independent of direction at that point. Explicit definitions of these terms may be found in the literature [7]. It is considered to be the prototype of the equation
\[
x(s) = 1 + \lambda s\,x(s)\int_0^1 \frac{\varphi(s)}{s + t}\,x(t)\,dt, \qquad s \in [0, 1],
\]
for more general laws of scattering, where φ(s) is an even polynomial in s with
\[
\int_0^1 \varphi(s)\,ds \le \frac{1}{2}.
\]
Integral equations of the above form also arise in other studies [7]. We determine where a solution is located, along with its region of uniqueness.
Note that solving (1.5.13) is equivalent to solving F(x) = 0, where F : C[0, 1] → C[0, 1] and
\[
[F(x)](s) = x(s) - 1 - \frac{s}{4}\,x(s)\int_0^1 \frac{x(t)}{s + t}\,dt, \qquad s \in [0, 1]. \tag{1.5.14}
\]
To obtain a numerical solution of (1.5.13), we first discretize the problem and approximate the integral by a Gauss–Legendre numerical quadrature with eight nodes,
\[
\int_0^1 f(t)\,dt \approx \sum_{j=1}^{8} w_j\,f(t_j),
\]

where
t1 = 0.019855072, t2 = 0.101666761, t3 = 0.237233795, t4 = 0.408282679,
t5 = 0.591717321, t6 = 0.762766205, t7 = 0.898333239, t8 = 0.980144928,
w1 = 0.050614268, w2 = 0.111190517, w3 = 0.156853323, w4 = 0.181341892,
w5 = 0.181341892, w6 = 0.156853323, w7 = 0.111190517, w8 = 0.050614268.

If we denote xi = x(ti), i = 1, 2, . . ., 8, equation (1.5.13) is transformed into the following nonlinear system:
\[
x_i = 1 + \frac{x_i}{4}\sum_{j=1}^{8} a_{ij}\,x_j, \qquad i = 1, 2, \dots, 8,
\]
where a_{ij} = t_i w_j/(t_i + t_j).
Denote now x = (x1, x2, . . ., x8)^T, 1 = (1, 1, . . ., 1)^T, A = (a_{ij}) and write the last nonlinear system in the matrix form:
\[
x = 1 + \frac{1}{4}\,x \circ (Ax), \tag{1.5.15}
\]
where ◦ represents the componentwise product of vectors. Set G(x) = x − 1 − (1/4) x ◦ (Ax), so that (1.5.15) reads G(x) = 0. If we choose x0 = (1, 1, . . ., 1)^T and x−1 = (0, 0, . . ., 0)^T, and the sequence {xn} is generated by secant-type methods with different choices of θn, then Table 1.5.1 gives the comparison results for ‖xn+1 − xn‖ equipped with the max-norm for this example. The computational order of convergence (COC) is also shown in Table 1.5.1 for the various methods. Here the (COC) is defined in [1], [4] by
\[
\rho \approx \ln\left(\frac{\|x_{n+1} - x^\star\|_\infty}{\|x_n - x^\star\|_\infty}\right) \Big/ \ln\left(\frac{\|x_n - x^\star\|_\infty}{\|x_{n-1} - x^\star\|_\infty}\right), \qquad n \in \mathbb{N}.
\]
The last line in Table 1.5.1 shows the (COC).

Table 1.5.1. The comparison results of ‖xn+1 − xn‖ for Example 1.5.4 using various methods

 n   θn = 1 (Newton)         θn = 0 (secant)        θn = 2 (Kurchatov)      θn = 1/2 (midpoint)
 1   9.49639 × 10^{-6}       4.70208 × 10^{-2}      4.33999 × 10^{-1}       1.42649 × 10^{-1}
 2   8.18823 × 10^{-12}      7.77292 × 10^{-3}      3.28371 × 10^{-2}       1.51900 × 10^{-2}
 3   5.15077 × 10^{-24}      5.14596 × 10^{-5}      2.33370 × 10^{-3}       1.66883 × 10^{-4}
 4   1.79066 × 10^{-48}      3.89016 × 10^{-8}      9.32850 × 10^{-6}       1.34477 × 10^{-7}
 5   1.95051 × 10^{-97}      1.77146 × 10^{-13}     2.214411 × 10^{-9}      1.03094 × 10^{-12}
 6   2.12404 × 10^{-195}     5.35306 × 10^{-22}     1.801201 × 10^{-15}     5.63911 × 10^{-21}
 ρ   2.00032                 1.61815                1.61854                 1.61817
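The discretization and a Newton run are easy to reproduce; the sketch below (ours, not the book's code) builds the matrix A with numpy's Gauss–Legendre rule, whose mapped nodes and weights match the values tabulated above, and solves G(x) = x − 1 − (1/4) x ◦ (Ax) = 0:

import numpy as np

t, w = np.polynomial.legendre.leggauss(8)      # nodes/weights on [-1, 1]
t, w = 0.5 * (t + 1.0), 0.5 * w                # map to [0, 1]
A = t[:, None] * w[None, :] / (t[:, None] + t[None, :])   # a_ij = t_i w_j/(t_i+t_j)

def G(x):                                      # system (1.5.15) written as G(x) = 0
    return x - 1.0 - 0.25 * x * (A @ x)

def J(x):                                      # Jacobian of G
    return np.eye(8) - 0.25 * (np.diag(A @ x) + np.diag(x) @ A)

x = np.ones(8)                                 # x0 = (1, ..., 1)^T
for _ in range(10):
    dx = np.linalg.solve(J(x), -G(x))          # Newton step
    x += dx
    if np.linalg.norm(dx, np.inf) < 1e-14:
        break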
References

[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004), 397–405.

[2] Argyros, I.K., Computational theory of iterative methods. Series: Studies in Compu-
tational Mathematics, 15, Editors: C.K. Chui and L. Wuytack, 2007, Elsevier Publ.
Co. New York, U.S.A.

[3] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364–387.

[4] Argyros, I. K., Cho, Y. J., Hilout, S., Numerical method for equations and its applica-
tions. CRC Press/Taylor and Francis, New York, 2012.

[5] Argyros, I. K., George, S., Ball convergence for Steffensen-type fourth-order methods.
Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 37–42.

[6] Cătinaş, E., The inexact, inexact perturbed, and quasi-Newton methods are equivalent
models, Math. Comp., 74(249) (2005), 291–301.

[7] Chandrasekhar, S., Radiative transfer, Dover Publ., New York, 1960.

[8] Dennis, J.E., Toward a unified convergence theory for Newton-like methods, in Non-
linear Functional Analysis and Applications (L. B. Rall, ed.) Academic Press, New
York, (1971), 425–472.

[9] Ezquerro, J.A., Hernández, M.A., Rubio, M.J., Secant-like methods for solving nonlinear
integral equations of the Hammerstein type, J. Comput. Appl. Math., 115 (2000), 245–
254.

[10] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. R. Soc. Mat. Esp.,
13(1) (2010), 53–76.

[11] Gragg, W.B., Tapia, R.A., Optimal error bounds for the Newton-Kantorovich theorem,
SIAM J. Numer. Anal., 11 (1974), 10–13.

[12] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[13] Magreñán, Á.A., A new tool to study real dynamics: The convergence plane, Appl. Math. Comput., 248 (2014), 215–224.

[14] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.

[15] Potra, F.A., Pták, V., Nondiscrete induction and iterative processes. Research Notes in
Mathematics, 103. Pitman (Advanced Publishing Program), Boston, MA, 1984.

[16] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.

[17] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton-Kantorovich type theorems, J. Complexity, 26 (2010), 3–42.

[18] Rheinboldt, W.C., An adaptive continuation process for solving systems of nonlinear equations, Banach Center Publ., 3 (1975), 129–142.

[19] Traub, J.F., Iterative Methods for the Solution of Equations, Prentice-Hall, New Jersey, 1964.

[20] Schmidt, J.W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Hungar., 9 (1978), 241–247.

[21] Yamamoto, T., A convergence theorem for Newton-like methods in Banach spaces,
Numer. Math., 51 (1987), 545–557.

[22] Zabrejko, P.P., Nguen, D.F., The majorant method in the theory of Newton-
Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim.,
9 (1987), 671–684.
Chapter 2

Efficient Steffensen-Type Algorithms for Solving Nonlinear Equations

2.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solu-
tion x? of an equation
F(x) = 0, (2.1.1)
where F is an operator defined on a non–empty, open subset Ω of a Banach space X with
values in a Banach space Y .
Many problems in computational sciences can be brought in the form of equation
(2.1.1). For example, the unknowns of engineering equations can be functions (differ-
ence, differential, and integral equations), vectors (systems of linear or nonlinear algebraic
equations), or real or complex numbers (single algebraic equations with single unknowns).
The solutions of these equations can rarely be found in closed form. That is why the most commonly used solution methods are iterative. The practice of numerical analysis is usually connected to Newton-like methods [1, 3, 5, 7–9, 10–16, 18, 19, 21–27].
The study of the convergence of iterative procedures usually follows two lines: semilocal and local convergence analysis. Semilocal analysis uses the information around an initial point to give conditions ensuring the convergence of the iterative method, while local analysis uses the information around a solution to find estimates of the radii of the convergence balls.
A classic iterative process for solving nonlinear equations is Chebyshev's method (see [5], [8], [17]):
\[
\begin{cases}
x_0 \in \Omega, \\
y_k = x_k - F'(x_k)^{-1} F(x_k), \\
x_{k+1} = y_k - \tfrac{1}{2} F'(x_k)^{-1} F''(x_k)(y_k - x_k)^2, \quad k \ge 0.
\end{cases}
\]

This one-point iterative process depends explicitly on the first two derivatives of F (namely, x_{k+1} = ψ(x_k, F(x_k), F′(x_k), F″(x_k))). Ezquerro and Hernández introduced in [15] some modifications of Chebyshev's method that avoid the computation of the second derivative of F and reduce the number of evaluations of the first derivative of F. In fact, these authors obtained a modification of the Chebyshev iterative process which only needs to evaluate the first derivative of F (namely, x_{k+1} = ψ(x_k, F′(x_k))), but retains the third order of convergence. In this chapter we recall this method as the Chebyshev–Newton–type method (CNTM); it is written as follows:

\[
\begin{cases}
x_0 \in \Omega, \\
y_k = x_k - F'(x_k)^{-1} F(x_k), \\
z_k = x_k + a\,(y_k - x_k), \\
x_{k+1} = x_k - \dfrac{1}{a^2}\, F'(x_k)^{-1} \big((a^2 + a - 1)\, F(x_k) + F(z_k)\big), \quad k \ge 0.
\end{cases}
\]
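As a quick illustration of (CNTM), the sketch below runs the iteration on the scalar equation F(x) = e^x − 1 = 0; the test function, the starting point and the step count are illustrative assumptions and not part of the analysis.

```python
import math

# A minimal scalar sketch of (CNTM) with F(x) = e^x - 1 (solution x* = 0).
def cntm(F, dF, x0, a, steps=6):
    x = x0
    for _ in range(steps):
        d = dF(x)
        y = x - F(x) / d                                   # Newton step
        z = x + a * (y - x)
        x = x - ((a**2 + a - 1) * F(x) + F(z)) / (a**2 * d)
        print(x)
    return x

cntm(lambda x: math.exp(x) - 1.0, math.exp, 0.5, a=1.0)    # converges to 0
```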
There is an interest in constructing families of iterative processes free of derivatives. To obtain a new family, in [8] we considered an approximation of the first derivative of F by a divided difference of first order, that is, F′(x_k) ≈ [x_{k−1}, x_k; F], where [x, y; F] is a divided difference of order one for the operator F at the points x, y ∈ Ω. Then, we introduced the Chebyshev–Secant–type method (CSTM)


\[
\begin{cases}
x_{-1},\ x_0 \in \Omega, \\
y_k = x_k - B_k^{-1} F(x_k), \quad B_k = [x_{k-1}, x_k; F], \\
z_k = x_k + a\,(y_k - x_k), \\
x_{k+1} = x_k - B_k^{-1} \big(b\, F(x_k) + c\, F(z_k)\big), \quad k \ge 0,
\end{cases}
\]

where a, b, c are non-negative parameters to be chosen so that the sequence {x_k} converges to x⋆. Note that (CSTM) reduces to the secant method (SM) if a = 0, b = c = 1/2, and y_k = x_{k+1}. Moreover, if x_{k−1} = x_k and F is differentiable on Ω, then F′(x_k) = [x_k, x_k; F], and (CSTM) reduces to Newton's method (NM).
We provided a semilocal convergence analysis for (CSTM) using recurrence sequences, and also illustrated its effectiveness through numerical examples. Dennis [14], Potra [23], Argyros [1]–[11], Ezquerro et al. [15] and others [16], [22], [25] have provided sufficient convergence conditions for the (SM) based on Lipschitz-type conditions on the divided difference operator (see also the relevant works in [12], [13], [17], [19], [20]).
In this chapter, we continue the study of derivative-free iterative processes. We introduce the Steffensen-type method (STTM):
\[
\begin{cases}
x_0 \in \Omega, \\
y_k = x_k - A_k^{-1} F(x_k), \quad A_k = [x_k, G(x_k); F], \\
z_k = x_k + a\,(y_k - x_k), \\
x_{k+1} = x_k - A_k^{-1} \big(b\, F(x_k) + c\, F(z_k)\big), \quad k \ge 0,
\end{cases}
\]
where G : X → X. Note that (STTM) reduces to (CNTM) if G(x) = x, b = (a² + a − 1)/a² and c = 1/a², provided that F is Fréchet-differentiable on Ω.
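A minimal scalar sketch of (STTM) is given below; the choices F(x) = e^x − 1 and G(x) = x − F(x) anticipate Example 2.3.1 (with h = 1) and are assumptions made purely for illustration. The divided difference of order one in the scalar case is [u, v; F] = (F(u) − F(v))/(u − v).

```python
import math

# A minimal scalar sketch of (STTM); the guard breaks the loop when the
# divided difference [u, v; F] degenerates at the solution.
def sttm(F, G, x0, a, b, c, steps=5):
    x = x0
    for _ in range(steps):
        u, v = x, G(x)
        if u == v or F(u) == F(v):
            break                            # [u, v; F] degenerates at x*
        A = (F(u) - F(v)) / (u - v)          # A_k = [x_k, G(x_k); F]
        y = x - F(x) / A
        z = x + a * (y - x)
        x = x - (b * F(x) + c * F(z)) / A
        print(x)
    return x

F = lambda x: math.exp(x) - 1.0
sttm(F, lambda x: x - F(x), 0.11, a=1.0, b=1.0, c=1.0)
```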
In the special case a = 0, b = c = 21 , xk+1 = yk the quadratic convergence of (CNTM) is
established in [15]. The semilocal convergence analysis of (CNTM) when Ak is replaced by
the more general Gk ∈ L(X,Y ) is given by us in [8]. In the present chapter we provide a local
convergence analysis for (STTM). Then, we give numerical examples to show that (STTM)
is faster than (CSTM). In particular, three numerical examples are also provided. Firstly,
we consider a scalar equation where the main study of the chapter is applied. Secondly,
Efficient Steffensen-Type Algorithms for Solving Nonlinear Equations 23

we give the radius of convergence of (STTM) for a nonlinear integral equation. Thirdly,
we discretize a nonlinear integral equation and approximate a numerical solution by using
(STTM).

2.2. Local Convergence of (STTM)


In this section we provide a local convergence analysis for (STTM). The radius of conver-
gence is also found. The convergence order of (STTM) is at least quadratic if (1 − a)c =
1 − b and at least cubic if a = b = c = 1.
We need a result for zeros of functions related to our local convergence analysis of
(STTM).

Lemma 2.2.1. Suppose a ∈ [0, 1], b ∈ [0, 1], c ≥ 0 are parameters satisfying (1 − a)c = 1 − b, and L₀, L and N are positive constants with L₀ ≤ L. Let ψ be the function defined on [0, +∞) by
\[
\begin{aligned}
\psi(r) = {} & a^2 c (N+2)^2 L^3 r^3 + 2\,[1-b+ac+c(N+2)]\, a (N+2) L^2 r^2 \Big(1-\tfrac{N+1}{2} L_0 r\Big) \\
& + 4\,[\,|1-ac|(N+2)+a(1-b)\,]\, L r \Big(1-\tfrac{N+1}{2} L_0 r\Big)^2 - 8\Big(1-\tfrac{N+1}{2} L_0 r\Big)^3.
\end{aligned}
\tag{2.2.1}
\]
Then, ψ has a least positive zero in (0, R₀], with R₀ given by
\[
R_0 = \frac{2}{(N+2)L + (N+1)L_0}. \tag{2.2.2}
\]

Proof. We shall consider two possibilities.

Case ac = 0. If a = 0, then the function ψ reduces to
\[
\psi(r) = 4\Big(1-\tfrac{N+1}{2}L_0 r\Big)^2 \Big[(N+2)Lr - 2\Big(1-\tfrac{N+1}{2}L_0 r\Big)\Big]
\]
with minimal zero R₀ given by (2.2.2), since
\[
R_0 < \frac{2}{(N+1)L_0}. \tag{2.2.3}
\]
If c = 0, then b = 1 from the condition (1 − a)c = 1 − b, and ψ becomes
\[
\psi(r) = 4\Big(1-\tfrac{N+1}{2}L_0 r\Big)^2 \Big[(N+2)Lr - 2\Big(1-\tfrac{N+1}{2}L_0 r\Big)\Big],
\]
leading to the same value for the minimal zero R₀.


Case ac > 0. Using the definition of R₀ we get
\[
1 - \tfrac{N+1}{2} L_0 R_0 = \frac{(N+2)L}{(N+2)L + (N+1)L_0}.
\]
Then, we have ψ(0) = −8 < 0 and
\[
\begin{aligned}
\psi(R_0) = {} & 4\,[\,|1-ac|(N+2)+a(1-b)\,]\,\frac{2L}{(N+2)L+(N+1)L_0}\left(\frac{(N+2)L}{(N+2)L+(N+1)L_0}\right)^2 \\
& + 2\,[1-b+ac+c(N+2)]\,\frac{4a(N+2)L^2}{[(N+2)L+(N+1)L_0]^2}\,\frac{(N+2)L}{(N+2)L+(N+1)L_0} \\
& + \frac{8a^2c(N+2)^2L^3}{((N+2)L+(N+1)L_0)^3} - \frac{8(N+2)^3L^3}{((N+2)L+(N+1)L_0)^3} \\
= {} & \frac{8L^3(N+2)^2}{((N+2)L+(N+1)L_0)^3}\,\big[(|1-ac|+ac-1)(N+2) + 2a(1-b+ac)\big].
\end{aligned}
\tag{2.2.4}
\]

If 1 − ac ≥ 0, the bracket in (2.2.4) becomes 2a(1 − b + ac) = 2ac > 0, whereas if 1 − ac ≤ 0, we have
\[
2\big[(ac-1)(N+2) + a(1-b+ac)\big] = 2\big[(ac-1)(N+2) + ac\big] > 0.
\]
Hence, in either case ψ(R₀) > 0. It follows from the intermediate value theorem that there exists a zero of ψ in (0, R₀), and the minimal such zero R must satisfy 0 < R < R₀. That completes the proof of the lemma.
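The least zero guaranteed by the lemma can be located numerically. The following sketch scans (0, R₀] for the first sign change of ψ; the parameter values are illustrative assumptions satisfying (1 − a)c = 1 − b and L₀ ≤ L.

```python
# A minimal sketch locating the least positive zero of psi from (2.2.1):
# psi(0) = -8 < 0 and psi(R0) > 0, so the first sign change on a fine
# grid over (0, R0] brackets the least zero.
def psi(r, a=1.0, b=1.0, c=0.5, L=2.0, L0=1.0, N=0.5):
    p = 1.0 - 0.5 * (N + 1) * L0 * r          # the factor (1 - (N+1)/2 L0 r)
    return (a*a*c*(N + 2)**2 * L**3 * r**3
            + 2*(1 - b + a*c + c*(N + 2))*a*(N + 2)*L*L*r*r*p
            + 4*(abs(1 - a*c)*(N + 2) + a*(1 - b))*L*r*p*p
            - 8*p**3)

R0 = 2.0 / ((0.5 + 2)*2.0 + (0.5 + 1)*1.0)    # (2.2.2) for these values
n = 100000
R = next(R0*k/n for k in range(1, n + 1) if psi(R0*k/n) >= 0)
print(R, psi(R))                               # first sign change: the least zero
```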

Remark 2.2.2. We are especially interested in the case a = b = c = 1. It follows from (2.2.1) that in this case we can write
\[
\psi(r) = 8\Big(1-\tfrac{N+1}{2}L_0 r\Big)^3 \left[(N+2)^2\left(\frac{Lr}{2\big(1-\tfrac{N+1}{2}L_0 r\big)}\right)^3 + (N+3)(N+2)\left(\frac{Lr}{2\big(1-\tfrac{N+1}{2}L_0 r\big)}\right)^2 - 1\right]. \tag{2.2.5}
\]
Define the function φ on [0, +∞) by
\[
\varphi(r) = (N+2)^2 r^3 + (N+3)(N+2) r^2 - 1. \tag{2.2.6}
\]
We have φ(0) = −1 < 0 and φ(1) = (N+2)² + (N+3)(N+2) − 1 > 0. Then, again by the intermediate value theorem, there exists R₁ ∈ (0, 1) such that φ(R₁) = 0. Moreover, we get
\[
\varphi'(r) = 3(N+2)^2 r^2 + 2(N+3)(N+2) r > 0 \quad \text{for } r > 0.
\]
That is, φ is increasing on [0, +∞). Hence, φ crosses the x-axis only once. Therefore R₁ is the unique zero of φ in (0, 1). In this case, by setting
\[
\frac{LR}{2\big(1-\tfrac{N+1}{2}L_0 R\big)} = R_1 \tag{2.2.7}
\]

and solving for R we obtain
\[
R^\star := R = \frac{2R_1}{L + (N+1)L_0 R_1}. \tag{2.2.8}
\]

We can show the main result of this section concerning the local convergence of
(STTM).

Theorem 2.2.3. Suppose:

(a) F : Ω ⊆ X → Y and there exists a divided difference [x, y; F] satisfying
\[
[x, y; F](x - y) = F(x) - F(y) \quad \text{for all } x, y \in \Omega; \tag{2.2.9}
\]

(b) The point x⋆ is a solution of equation F(x) = 0, F′(x⋆)⁻¹ ∈ L(Y, X) and there exists a constant L > 0 such that
\[
\|F'(x^\star)^{-1}([x,y;F] - [u,v;F])\| \le \frac{L}{2}\big(\|x-u\| + \|y-v\|\big) \quad \text{for all } x, y, u, v \in \Omega; \tag{2.2.10}
\]

(c) There exists a constant L₀ > 0 such that
\[
\|F'(x^\star)^{-1}([x,y;F] - F'(x^\star))\| \le \frac{L_0}{2}\big(\|x-x^\star\| + \|y-x^\star\|\big) \quad \text{for all } x, y \in \Omega; \tag{2.2.11}
\]

(d) G : Ω ⊆ X → X is continuous and such that G(x⋆) = x⋆;

(e) There exists N ∈ (0, 1] such that
\[
\|G(x) - G(x^\star)\| \le N\|x - x^\star\| \quad \text{for all } x \in \Omega; \tag{2.2.12}
\]

(f) The relation (1 − a)c = 1 − b holds;

(g)
\[
U(x^\star, R) = \{x \in \Omega : \|x - x^\star\| < R\} \subseteq \Omega, \tag{2.2.13}
\]
where R is the least positive zero of the function ψ given in (2.2.1).

Then, the sequence {xₙ} generated by (STTM) is well defined, remains in U(x⋆, R) for all n ≥ 0 and converges to x⋆, provided that x₀ ∈ U(x⋆, R). Moreover, the following error estimates are satisfied for eₙ = xₙ − x⋆:
\[
\|e_{n+1}\| \le \xi_n \|e_n\|^2 \le \xi \|e_n\|^2, \tag{2.2.14}
\]
where
\[
h_n = \frac{L}{2\big(1 - \tfrac{L_0}{2}(N+1)\|e_n\|\big)}, \tag{2.2.15}
\]
\[
\xi_n = [\,|1-ac|(N+2)+a(1-b)\,]\, h_n + [1-b+ac+c(N+2)]\, a(N+2)\, h_n^2 \|e_n\| + a^2 c (N+2)^2 h_n^3 \|e_n\|^2, \tag{2.2.16}
\]
\[
H = \frac{L}{2\big(1 - \tfrac{L_0}{2}(N+1)R\big)}, \tag{2.2.17}
\]
\[
\xi = [\,|1-ac|(N+2)+a(1-b)\,]\, H + [1-b+ac+c(N+2)]\, a(N+2)\, H^2 R + a^2 c (N+2)^2 H^3 R^2. \tag{2.2.18}
\]
In particular, if
\[
a = b = c = 1, \tag{2.2.19}
\]
the optimal (STTM) is obtained, which is cubically convergent. Furthermore, the error estimates (2.2.14) become
\[
\|e_{n+1}\| \le \lambda_n \|e_n\|^3 \le \lambda \|e_n\|^3, \tag{2.2.20}
\]
where
\[
\lambda_n = (N+2)\, h_n^2\, [\,N+3+(N+2)h_n\|e_n\|\,] \tag{2.2.21}
\]
and
\[
\lambda = (N+2)\, H^2\, [\,N+3+(N+2)H R^\star\,], \tag{2.2.22}
\]
where R⋆ is given by (2.2.8).
Proof. We shall show the assertions of the theorem using induction. Let uₙ = yₙ − x⋆ and vₙ = zₙ − x⋆ (n ≥ 0). Using x₀ ∈ U(x⋆, R), G(x⋆) = x⋆ and (2.2.12) we obtain
\[
\|G(x_0) - x^\star\| = \|G(x_0) - G(x^\star)\| \le N\|x_0 - x^\star\| \le \|x_0 - x^\star\| < R, \tag{2.2.23}
\]
which implies G(x₀) ∈ U(x⋆, R). Then, we have by (2.2.11) and (2.2.3):
\[
\begin{aligned}
\|F'(x^\star)^{-1}(F'(x^\star) - [x_0, G(x_0); F])\| & \le \tfrac{L_0}{2}\big(\|x^\star - x_0\| + \|x^\star - G(x_0)\|\big) \\
& = \tfrac{L_0}{2}\big(\|x^\star - x_0\| + \|G(x^\star) - G(x_0)\|\big) \\
& \le \tfrac{L_0}{2}\big(\|x^\star - x_0\| + N\|x^\star - x_0\|\big) \\
& < \tfrac{L_0}{2}(N+1)R < \tfrac{L_0}{2}(N+1)R_0 < 1.
\end{aligned}
\tag{2.2.24}
\]

It follows from (2.2.24) and the Banach lemma on invertible operators that A₀⁻¹ ∈ L(Y, X) and
\[
\|A_0^{-1} F'(x^\star)\| \le \frac{1}{1 - \tfrac{L_0}{2}(N+1)\|e_0\|} < \frac{1}{1 - \tfrac{L_0}{2}(N+1)R}. \tag{2.2.25}
\]

Thus, y₀ is well defined. Using (2.2.9) we have
\[
F(x_0) = F(x_0) - F(x^\star) = -(F(x^\star) - F(x_0)) = -[x^\star, x_0; F](x^\star - x_0) = [x^\star, x_0; F]\, e_0 = [G(x^\star), x_0; F]\, e_0. \tag{2.2.26}
\]
So,
\[
u_0 = y_0 - x^\star = x_0 - x^\star - A_0^{-1}F(x_0) = A_0^{-1} F'(x^\star)\, F'(x^\star)^{-1}\big(A_0 - [x^\star, x_0; F]\big)\, e_0. \tag{2.2.27}
\]

By (2.2.10), (2.2.25) and (2.2.27) we get in turn
\[
\begin{aligned}
\|u_0\| & \le \|A_0^{-1}F'(x^\star)\|\, \tfrac{L}{2}\big(\|x_0 - x^\star\| + \|G(x_0) - x_0\|\big)\,\|e_0\| \\
& \le \frac{1}{1-\tfrac{L_0}{2}(N+1)\|e_0\|}\, \tfrac{L}{2}\big(\|x_0 - x^\star\| + \|G(x_0) - G(x^\star)\| + \|x^\star - x_0\|\big)\,\|e_0\| \\
& \le \frac{1}{1-\tfrac{L_0}{2}(N+1)\|e_0\|}\, \tfrac{L}{2}\big(2\|x_0 - x^\star\| + N\|x^\star - x_0\|\big)\,\|e_0\| \\
& \le \frac{L(N+2)R_0}{2\big[1-\tfrac{L_0}{2}(N+1)R_0\big]}\,\|e_0\| = \|e_0\| < R,
\end{aligned}
\tag{2.2.28}
\]

which implies y₀ ∈ U(x⋆, R). Noting that
\[
v_0 = z_0 - x^\star = a u_0 + (1-a) e_0, \tag{2.2.29}
\]
we get
\[
\|v_0\| \le a\|u_0\| + (1-a)\|e_0\| \le \|e_0\| < R. \tag{2.2.30}
\]
As in (2.2.26), we have
\[
F(z_0) = [x^\star, z_0; F]\, v_0 = [G(x^\star), z_0; F]\, v_0. \tag{2.2.31}
\]

Using (2.2.29) and (2.2.31), we get
\[
\begin{aligned}
e_1 & = e_0 - A_0^{-1}\big(b F(x_0) + c F(z_0)\big) \\
& = A_0^{-1}\big([x_0, G(x_0); F]\, e_0 - (b[x^\star, x_0; F]\, e_0 + c[x^\star, z_0; F]\, v_0)\big) \\
& = A_0^{-1}\big([x_0, G(x_0); F]\, e_0 - b[x^\star, x_0; F]\, e_0 - c[x^\star, z_0; F](a u_0 + (1-a) e_0)\big) \\
& = A_0^{-1}\big([x_0, G(x_0); F]\, e_0 - b[x^\star, x_0; F]\, e_0 - (1-b)[x^\star, z_0; F]\, e_0 - a c [x^\star, z_0; F]\, u_0\big) \\
& = A_0^{-1}\big(([x_0, G(x_0); F] - [x^\star, x_0; F])\, e_0 + (1-b)([x^\star, x_0; F] - [x^\star, z_0; F])\, e_0\big) \\
& \quad + a c\, A_0^{-1}\big([x^\star, x_0; F] - [x^\star, z_0; F] + [x_0, G(x_0); F] - [x^\star, x_0; F]\big)\, u_0 - a c\, u_0.
\end{aligned}
\tag{2.2.32}
\]
Define
\[
D_0 = A_0^{-1}\big([x_0, G(x_0); F] - [x^\star, x_0; F]\big), \qquad E_0 = A_0^{-1}\big([x^\star, x_0; F] - [x^\star, z_0; F]\big); \tag{2.2.33}
\]
then, we have from (2.2.27) that
\[
u_0 = D_0 e_0. \tag{2.2.34}
\]

Moreover, we can rewrite (2.2.32) as
\[
e_1 = D_0 e_0 + (1-b) E_0 e_0 + a c E_0 u_0 + a c D_0 u_0 - a c u_0 = (1-ac) D_0 e_0 + (1-b) E_0 e_0 + a c E_0 D_0 e_0 + a c D_0^2 e_0. \tag{2.2.35}
\]

We need to find upper bounds on the norms ‖D₀‖ and ‖E₀‖. Using (2.2.10) and (2.2.33) we get in turn
\[
\begin{aligned}
\|D_0\| & \le \|A_0^{-1}F'(x^\star)\|\,\|F'(x^\star)^{-1}([x_0, G(x_0); F] - [x^\star, x_0; F])\| \\
& \le \frac{1}{1-\tfrac{L_0}{2}(N+1)\|e_0\|}\,\tfrac{L}{2}\big(\|x_0 - x^\star\| + \|G(x_0) - x_0\|\big) \\
& \le \frac{\tfrac{L}{2}(N+2)\|x_0 - x^\star\|}{1-\tfrac{L_0}{2}(N+1)\|x_0 - x^\star\|} = \frac{L(N+2)\|e_0\|}{2\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)}
\end{aligned}
\tag{2.2.36}
\]
and
\[
\begin{aligned}
\|E_0\| & \le \|A_0^{-1}F'(x^\star)\|\,\|F'(x^\star)^{-1}([x^\star, x_0; F] - [x^\star, z_0; F])\| \\
& \le \frac{\tfrac{L}{2}\|z_0 - x_0\|}{1-\tfrac{L_0}{2}(N+1)\|e_0\|} = \frac{\tfrac{aL}{2}\|y_0 - x_0\|}{1-\tfrac{L_0}{2}(N+1)\|e_0\|} \le \frac{\tfrac{aL}{2}(\|u_0\| + \|e_0\|)}{1-\tfrac{L_0}{2}(N+1)\|e_0\|} \\
& \le \frac{\tfrac{aL}{2}(\|D_0\| + 1)\|e_0\|}{1-\tfrac{L_0}{2}(N+1)\|e_0\|} \le \frac{\tfrac{aL}{2}}{1-\tfrac{L_0}{2}(N+1)\|e_0\|}\left[\frac{\tfrac{L}{2}(N+2)\|e_0\|}{1-\tfrac{L_0}{2}(N+1)\|e_0\|} + 1\right]\|e_0\| \\
& \le \frac{aL^2(N+2)\|e_0\|^2}{4\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)^2} + \frac{aL\|e_0\|}{2\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)}.
\end{aligned}
\tag{2.2.37}
\]

Using (2.2.35)–(2.2.37) we get
\[
\begin{aligned}
\|e_1\| \le {} & |1-ac|\,\frac{L(N+2)\|e_0\|^2}{2\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)} + \frac{(1-b)aL^2(N+2)\|e_0\|^3}{4\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)^2} + \frac{(1-b)aL\|e_0\|^2}{2\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)} \\
& + ac\left(\frac{aL^2(N+2)\|e_0\|^2}{4\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)^2} + \frac{aL\|e_0\|}{2\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)}\right)\frac{L(N+2)\|e_0\|^2}{2\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)} \\
& + ac\,\frac{L^2(N+2)^2\|e_0\|^3}{4\big(1-\tfrac{L_0}{2}(N+1)\|e_0\|\big)^2} \\
\le {} & h_0 |1-ac|(N+2)\|e_0\|^2 + (1-b)a(N+2)h_0^2\|e_0\|^3 + (1-b)a h_0 \|e_0\|^2 \\
& + ac\big(a h_0^2 (N+2)\|e_0\|^2 + a \|e_0\| h_0\big)(N+2) h_0 \|e_0\|^2 + ac(N+2)^2 h_0^2 \|e_0\|^3 \\
= {} & \xi_0 \|e_0\|^2 \le \xi \|e_0\|^2 \\
\le {} & \big\{[\,|1-ac|(N+2)+a(1-b)\,] H R + [1-b+ac+c(N+2)]\,a(N+2)H^2R^2 + a^2c(N+2)^2H^3R^3\big\}\,\|e_0\| \\
= {} & \Big\{[\,|1-ac|(N+2)+a(1-b)\,]\frac{LR}{2\big(1-\tfrac{N+1}{2}L_0R\big)} + [1-b+ac+c(N+2)]\,a(N+2)\frac{L^2R^2}{4\big(1-\tfrac{N+1}{2}L_0R\big)^2} \\
& \quad + a^2c(N+2)^2\frac{L^3R^3}{8\big(1-\tfrac{N+1}{2}L_0R\big)^3}\Big\}\,\|e_0\| \\
= {} & \frac{\psi(R) + 8\big(1-\tfrac{N+1}{2}L_0R\big)^3}{8\big(1-\tfrac{N+1}{2}L_0R\big)^3}\,\|e_0\| = \|e_0\| < R,
\end{aligned}
\tag{2.2.38}
\]
which implies x1 ∈ U(x? , R). Hence, assertion (2.2.14) is true for n = 0.
Let us assume that {xₙ} is well defined and xₙ ∈ U(x⋆, R) for all 0 ≤ n ≤ k (k ≥ 1). Proceeding in an analogous way with x₀ replaced by xₖ, we deduce:
(i) G(xk ) ∈ U(x? , R);
(ii) Aₖ⁻¹ ∈ L(Y, X) and
\[
\|A_k^{-1}F'(x^\star)\| \le \frac{1}{1-\tfrac{L_0}{2}(N+1)\|e_k\|} < \frac{1}{1-\tfrac{L_0}{2}(N+1)R}; \tag{2.2.39}
\]

(iii) yₖ is well defined, yₖ ∈ U(x⋆, R) and
\[
\begin{aligned}
\|u_k\| & \le \|A_k^{-1}F'(x^\star)\|\,\tfrac{L}{2}\big(\|x_k - x^\star\| + \|G(x_k) - x_k\|\big)\,\|e_k\| \\
& \le \frac{1}{1-\tfrac{L_0}{2}(N+1)\|e_k\|}\,\tfrac{L}{2}\big(\|x_k - x^\star\| + \|G(x_k) - x^\star\| + \|x^\star - x_k\|\big)\,\|e_k\| \\
& \le \frac{L(N+2)\|e_k\|^2}{2\big(1-\tfrac{N+1}{2}L_0\|e_k\|\big)} \le \frac{L(N+2)R_0\,\|e_k\|}{2\big(1-\tfrac{N+1}{2}L_0R_0\big)} = \|e_k\| < R;
\end{aligned}
\tag{2.2.40}
\]

(iv) zₖ is well defined, zₖ ∈ U(x⋆, R);

(v) x_{k+1} is well defined and
\[
\begin{aligned}
\|e_{k+1}\| \le {} & h_k|1-ac|(N+2)\|e_k\|^2 + (1-b)a(N+2)h_k^2\|e_k\|^3 + (1-b)a h_k\|e_k\|^2 \\
& + ac\big(a h_k^2(N+2)\|e_k\|^2 + a\|e_k\|h_k\big)(N+2)h_k\|e_k\|^2 + ac(N+2)^2h_k^2\|e_k\|^3 \\
\le {} & \xi_k\|e_k\|^2 \le \xi\|e_k\|^2 \le \|e_k\| < R.
\end{aligned}
\tag{2.2.41}
\]
The induction is complete, and by (2.2.41) lim_{k→∞} xₖ = x⋆. In the special case a = b = c = 1, in view of ξₙ = λₙ‖eₙ‖ and R⋆ = R, we deduce that the error estimates (2.2.20) hold for any n ≥ 0. That completes the proof of the theorem.
Remark 2.2.4. (a) In view of (2.2.10), condition (2.2.11) always holds. Hence, (2.2.11) is not an additional hypothesis to (2.2.10), since in practice the computation of L requires that of L₀.
(b) It follows from (2.2.22) that λ is directly proportional to R⋆, since L₀, L and N are constants. Clearly, the smaller R⋆ is, the smaller the ratio of convergence in (2.2.22) will be.
(c) Note that (2.2.10) implies that F is a differentiable operator in Ω [5, 17, 19].

2.3. Numerical Examples


In this section, we present numerical examples where we verify the conditions of Theorem 2.2.3.
Example 2.3.1. Let X = Y = ℝ, Ω = (−1, 1) and define F on Ω by
\[
F(x) = e^x - 1. \tag{2.3.1}
\]
Then, x⋆ = 0 is a solution of Eq. (2.1.1), and F′(x⋆) = 1. Note that for any x, y, u, v ∈ Ω, we have
\[
\begin{aligned}
|F'(x^\star)^{-1}([x,y;F]-[u,v;F])| & = \left|\int_0^1 \big(F'(tx+(1-t)y) - F'(tu+(1-t)v)\big)\,dt\right| \\
& = \left|\int_0^1\!\!\int_0^1 F''\big(\theta(tx+(1-t)y)+(1-\theta)(tu+(1-t)v)\big)\big(tx+(1-t)y-(tu+(1-t)v)\big)\,d\theta\,dt\right| \\
& = \left|\int_0^1\!\!\int_0^1 e^{\theta(tx+(1-t)y)+(1-\theta)(tu+(1-t)v)}\big(tx+(1-t)y-(tu+(1-t)v)\big)\,d\theta\,dt\right| \\
& \le \int_0^1 e\,|t(x-u)+(1-t)(y-v)|\,dt \le \frac{e}{2}\big(|x-u|+|y-v|\big)
\end{aligned}
\tag{2.3.2}
\]
and
\[
\begin{aligned}
|F'(x^\star)^{-1}([x,y;F]-[x^\star,x^\star;F])| & = \left|\int_0^1 F'(tx+(1-t)y)\,dt - F'(x^\star)\right| = \left|\int_0^1\big(e^{tx+(1-t)y}-1\big)\,dt\right| \\
& = \left|\int_0^1 (tx+(1-t)y)\Big(1+\frac{tx+(1-t)y}{2!}+\frac{(tx+(1-t)y)^2}{3!}+\cdots\Big)\,dt\right| \\
& \le \left|\int_0^1 (tx+(1-t)y)\Big(1+\frac{1}{2!}+\frac{1}{3!}+\cdots\Big)\,dt\right| \\
& \le \frac{e-1}{2}\big(|x-x^\star|+|y-x^\star|\big).
\end{aligned}
\tag{2.3.3}
\]

That is to say, the Lipschitz condition (2.2.10) and the center-Lipschitz condition (2.2.11) are true for L = e and L₀ = e − 1, respectively.
Choose G(x) = x − hF(x), where h ∈ (0, 2/(e−1)) is a constant. Then, G : Ω ⊆ X → X is continuous and such that G(x⋆) = x⋆. Moreover, for any x ∈ Ω, we have
\[
\begin{aligned}
|G(x)-G(x^\star)| & = |x-h(e^x-1)| = \Big|x-h\Big(x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots\Big)\Big| \\
& = \Big|(1-h)x - h\Big(\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots\Big)\Big| \le \Big(|1-h|+h\Big|\frac{1}{2!}+\frac{1}{3!}+\cdots\Big|\Big)|x| \\
& = \big(|1-h|+h(e-2)\big)\,|x-x^\star|,
\end{aligned}
\tag{2.3.4}
\]
which means that condition (2.2.12) is true for N = |1 − h| + h(e − 2), with N ∈ (0, 1).

Table 2.3.1. The comparison results of R₀ and R for Example 2.3.1 using various choices of a, b, c and h

 Case                h     R₀ (using both          R₀ (using only   R (using both           R (using only
                           (2.2.10) and (2.2.11))  (2.2.10))        (2.2.10) and (2.2.11))  (2.2.10))
 a = b = c = 1       0.99  0.193161183             0.165629465      0.113779771             0.103632763
                     1.00  0.193394634             0.165839812      0.113940329             0.103781113
                     1.01  0.191979459             0.164565090      0.112967093             0.102882059
 a = b = 1, c = 0.5  0.99  0.193161183             0.165629465      0.130274315             0.117141837
                     1.00  0.193394634             0.165839812      0.130452367             0.117305161
                     1.01  0.191979459             0.164565090      0.129373022             0.116315329
 a = b = 0.5, c = 1  0.99  0.193161183             0.165629465      0.128109720             0.115388719
                     1.00  0.193394634             0.165839812      0.128282409             0.115547600
                     1.01  0.191979459             0.164565090      0.127235492             0.114584632

Table 2.3.1 gives the comparison results of R₀ and R for Example 2.3.1 using various choices of a, b, c and h, which show that the convergence radius R obtained by using both conditions (2.2.10) and (2.2.11) is always larger than the one obtained by using only condition (2.2.10). The same is true for R₀.
Let us set h = 1 and choose x₀ = 0.11. Suppose the sequence {xₙ} is generated by (STTM). Table 2.3.2 gives the error estimates for Example 2.3.1 using various choices of a, b and c, which shows that all error estimates given by (2.2.14) (or (2.2.20)) are satisfied. Moreover, the error estimates of the case a = b = c = 1 are the smallest among all choices of a, b, c.

Example 2.3.2. Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1], equipped with the max-norm, and Ω = U(0, 1). Define the function F on Ω by
\[
F(x)(s) = x(s) - 5\int_0^1 s\,t\,x^3(t)\,dt, \tag{2.3.5}
\]
where the divided difference of F is defined by
\[
[x, y; F] = \int_0^1 F'(tx + (1-t)y)\,dt. \tag{2.3.6}
\]

Table 2.3.2. The comparison results of error estimates for Example 2.3.1 using various choices of a, b and c

 Case                 n   ‖e_{n+1}‖      λ_n‖e_n‖³ (or ξ_n‖e_n‖²)   λ‖e_n‖³ (or ξ‖e_n‖²)
 a = b = c = 1        0   1.73843e-05    0.040042622                0.040806467
                      1   1.32733e-15    9.80995e-14                1.61073e-13
 a = b = 1, c = 0.5   0   0.000178345    0.039055027                0.053915139
                      1   7.08930e-13    4.32947e-08                1.41726e-07
 a = b = 0.5, c = 1   0   0.001347553    0.043408200                0.057192845
                      1   2.26682e-07    3.11418e-06                8.58318e-06

Then, we have
\[
[F'(x)y](s) = y(s) - 15\int_0^1 s\,t\,x^2(t)\,y(t)\,dt \quad \text{for all } y \in \Omega. \tag{2.3.7}
\]
We have x⋆(s) = 0 for all s ∈ [0, 1], L₀ = 7.5 and L = 15 [5].


Choose G(x) = x and a = b = c = 1. Then, N = 1. Using Theorem 2.2.3 and Remark 2.2.4, we deduce that R₁ is the unique positive zero of the function
\[
\varphi(r) = 9r^3 + 12r^2 - 1, \tag{2.3.8}
\]
which leads to R₁ = 0.263762616. Moreover, the radius of convergence of (STTM) is given by
\[
R^\star = R = \frac{2R_1}{L + (N+1)L_0R_1} = 0.027828287, \tag{2.3.9}
\]
which is bigger than the corresponding radius
\[
R_0 = \frac{2R_1}{L + (N+1)LR_1} = 0.023023089 \tag{2.3.10}
\]
obtained by only using the Lipschitz condition (2.2.10).
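The three numbers above can be checked with a few lines of arithmetic; the sketch below evaluates (2.3.8) at the stated R₁ and then the closed forms (2.3.9) and (2.3.10).

```python
# A minimal arithmetic check of (2.3.8)-(2.3.10) with the values of
# Example 2.3.2 (L = 15, L0 = 7.5, N = 1).
L, L0, N, R1 = 15.0, 7.5, 1.0, 0.263762616
print(9*R1**3 + 12*R1**2 - 1)          # ~ 0: R1 is the root of (2.3.8)
print(2*R1 / (L + (N + 1)*L0*R1))      # 0.027828... = R* from (2.3.9)
print(2*R1 / (L + (N + 1)*L*R1))       # 0.023023... = R0 from (2.3.10)
```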

In the last example we are not interested in checking whether the hypotheses of Theorem 2.2.3 are satisfied, but in comparing the numerical behavior of (STTM) with earlier methods.

Example 2.3.3. In this example we present an application of the previous analysis to the significant Chandrasekhar integral equation [7]:
\[
x(s) = 1 + \frac{s}{4}\,x(s)\int_0^1 \frac{x(t)}{s+t}\,dt, \quad s \in [0, 1]. \tag{2.3.11}
\]
Integral equations of the form (2.3.11) are very important and appear in the areas of neutron transport, radiative transfer and the kinetic theory of gases. We refer the interested reader to [1, 11, 17], where a detailed description of the physical phenomenon described by (2.3.11) can be found. We determine where a solution is located, along with its region of uniqueness. Later, the solution is approximated by an iterative method of (STTM) type.

Note that solving (2.3.11) is equivalent to solving F(x) = 0, where F : C[0, 1] → C[0, 1] and
\[
[F(x)](s) = x(s) - 1 - \frac{s}{4}\,x(s)\int_0^1 \frac{x(t)}{s+t}\,dt, \quad s \in [0, 1]. \tag{2.3.12}
\]
To obtain a numerical solution of (2.3.11), we first discretize the problem; after testing several numbers of nodes, we find it convenient to approximate the integral by a Gauss–Legendre numerical quadrature with eight nodes (see also [1], [11], [17]):
\[
\int_0^1 f(t)\,dt \approx \sum_{j=1}^{8} w_j f(t_j),
\]

where
t₁ = 0.019855072, t₂ = 0.101666761, t₃ = 0.237233795, t₄ = 0.408282679,
t₅ = 0.591717321, t₆ = 0.762766205, t₇ = 0.898333239, t₈ = 0.980144928,
w₁ = 0.050614268, w₂ = 0.111190517, w₃ = 0.156853323, w₄ = 0.181341892,
w₅ = 0.181341892, w₆ = 0.156853323, w₇ = 0.111190517, w₈ = 0.050614268.

If we denote xᵢ = x(tᵢ), i = 1, 2, ..., 8, equation (2.3.11) is transformed into the following nonlinear system:
\[
x_i = 1 + \frac{x_i}{4}\sum_{j=1}^{8} a_{ij}x_j, \quad i = 1, 2, \ldots, 8, \qquad \text{where } a_{ij} = \frac{t_i w_j}{t_i + t_j}.
\]

Table 2.3.3. The comparison results of ‖x_{n+1} − x_n‖ for Example 2.3.3 using various methods

 n   STTM              STTM                      CSTM              CSTM
     (a = b = c = 1)   (a = 0.5, b = 0, c = 2)   (a = b = c = 1)   (a = 0.5, b = 0, c = 2)
 1   2.49e-01          2.45e-01                  2.49e-01          2.45e-01
 2   5.69e-04          4.85e-03                  6.14e-04          4.87e-03
 3   3.40e-12          1.33e-06                  5.76e-07          6.18e-06
 4   4.34e-37          8.02e-14                  1.91e-15          3.28e-12
 5   6.36e-112         2.46e-28                  4.34e-30          1.33e-24
 6   1.54e-336         2.04e-57                  8.04e-62          1.40e-49

Denote now x = (x₁, x₂, ..., x₈)ᵀ, 1 = (1, 1, ..., 1)ᵀ, A = (a_{ij}), and write the last nonlinear system in the matrix form
\[
x = 1 + \frac{1}{4}\,x \odot (Ax), \tag{2.3.13}
\]
where ⊙ denotes the componentwise product of vectors. Set G(x) = x. We choose x₀ = (1, 1, ..., 1)ᵀ and x₋₁ = (0.99, 0.99, ..., 0.99)ᵀ, and assume that the sequence {xₙ} is generated by (STTM) (or (CSTM)) with different choices of the parameters a, b and c. Table 2.3.3 gives the comparison results for ‖x_{n+1} − x_n‖ equipped with the max-norm for this example, which show that (STTM) is faster than (CSTM). Here, we perform the computations by Maple 11 on a computer equipped with an Intel(R) Core(TM) i3-2310M CPU.
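For readers without access to Maple, the following double-precision sketch reproduces the discretized system (2.3.13) and the a = b = c = 1 iteration. Since G(x) = x, the divided difference A_k = [x_k, x_k; F] collapses to the Jacobian F′(x_k), which is formed analytically below; regenerating the nodes and weights with numpy instead of hard-coding them is an assumption made for brevity, and in double precision only the first few rows of Table 2.3.3 can be matched.

```python
import numpy as np

# Gauss-Legendre nodes/weights on [0, 1] (eight points, as listed above).
t, w = np.polynomial.legendre.leggauss(8)
t = 0.5 * (t + 1.0); w = 0.5 * w
A = t[:, None] * w[None, :] / (t[:, None] + t[None, :])   # a_ij = t_i w_j/(t_i + t_j)

F = lambda x: x - 1.0 - 0.25 * x * (A @ x)                 # (2.3.13), componentwise
def J(x):                                                  # Jacobian of F at x
    return np.eye(8) - 0.25 * np.diag(A @ x) - 0.25 * np.diag(x) @ A

x = np.ones(8)                                             # x0 = (1, ..., 1)^T
for _ in range(4):
    Ak = J(x)                                              # A_k = [x_k, x_k; F] = F'(x_k)
    y = x - np.linalg.solve(Ak, F(x))
    z = x + (y - x)                                        # a = 1
    step = np.linalg.solve(Ak, F(x) + F(z))                # b = c = 1
    print(np.linalg.norm(step, np.inf))                    # ~ first rows of Table 2.3.3
    x = x - step
```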
In future results we shall use higher precision instead of a fixed number of digits in all computations. We shall also use an adaptive arithmetic in each step of the iterative method. Note that this higher precision is only necessary in the last steps of the iterative process. Table 2.3.3 shows the usefulness of (STTM), since it is faster than other relevant methods in the literature such as (CSTM).
References

[1] Argyros, I.K., Polynomial operator equations in abstract spaces and applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, 1998, Boca Raton, Florida, U.S.A.

[2] Argyros, I.K., On the Newton–Kantorovich hypothesis for solving equations, J. Com-
put. Appl. Math., 169 (2004), 315–332.

[3] Argyros, I.K., A unifying local–semilocal convergence analysis and applications for
two–point Newton–like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374–397.

[4] Argyros, I.K., New sufficient convergence conditions for the Secant method, Che-
choslovak Math. J., 55 (2005), 175–187.

[5] Argyros, I.K., Convergence and applications of Newton-type iterations, Springer-Verlag Publ., New York, 2008.

[6] Argyros, I.K., Hilout, S., On the weakening of the convergence of Newton’s method
using recurrent functions, J. Complexity, 25 (2009), 530–543.

[7] Argyros, I.K., Hilout, S., On the convergence of two-step Newton-type methods of
high efficiency order, Applicationes Mathematicae, 36(4) (2009), 465-499.

[8] Argyros, I.K., Ezquerro, J., Gutiérrez, J.M., Hernández, M., Hilout, S., On the semilo-
cal convergence of efficient Chebyshev-Secant-type methods, J. Comput. Appl. Math.,
235 (2011), 3195–3206.

[9] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and Its Appli-
cations, CRC Press/Taylor and Francis Group, New York, 2012.

[10] Argyros, I. K., George, S. Ball convergence for Steffensen-type fourth-order methods.
Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 37–42.

[11] Argyros, I. K., González, D., Local convergence for an improved Jarratt-type method
in Banach space. Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 20–25.

[12] Cătinaş, E., On some iterative methods for solving nonlinear equations, Rev. Anal. Numér. Théor. Approx., 23(1) (1994), 47–53.

[13] Chandrasekhar, S., Radiative transfer, Dover Publ., New York, 1960.



[14] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Non-
linear Funct. Anal. App. (L.B. Rall, ed.), Academic Press, New York, (1971), 425–
472.

[15] Ezquerro, J.A., Hernández, M.A., An optimization of Chebyshev’s method, J. Com-


plexity, 25 (2009), 343–361.

[16] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by Secant–like method, Appl. Math. Comp., 169 (2005), 926–942.

[17] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant–like methods for solving non-
linear integral equations of the Hammerstein type, J. Comp. Appl. Math., 115 (2000),
245–254.

[18] Grau, M., Noguera, M., A variant of Cauchy’s method with accelerated fifth-order
convergence. Appl. Math. Lett., 17 (2004), 509–517.

[19] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[20] Laasonen, P., Ein überquadratisch konvergenter iterativer algorithmus, Ann. Acad. Sci.
Fenn. Ser I, 450 (1969), 1–10.

[21] Magreñán, Á. A. , Different anomalies in a Jarratt family of iterative root-finding


methods, App. Math. Comp. 233 (2014), 29–38.

[22] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.

[23] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.

[24] Petković, M.S., Petković, L.D., Construction of zero-finding methods by Weierstrass


functions, App. Math. Comp., 184 (2007), 351–359.

[25] Petković, M.S., Ilić, S., Džunić, J., Derivative free two-point methods with and without memory for solving nonlinear equations, App. Math. Comp., 217 (2010), 1887–1895.

[26] Schmidt, J.W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Hungar., 9 (1978), 241–247.

[27] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545–557.

[28] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153–174.
Chapter 3

On the Semilocal Convergence of Halley's Method under a Center-Lipschitz Condition on the Second Fréchet Derivative

3.1. Introduction
Let X and Y be Banach spaces and D be a non-empty, open and convex subset of X. The aim
of this chapter is to show using a numerical example that the convergence theorem of Ref.
[15] is false under the stated hypotheses. Reference [15] was concerned with the semilocal
convergence of Halley’s method for solving a nonlinear operator equation

F(x) = 0, (3.1.1)

where F : D ⊂ X → Y is continuously twice Fréchet differentiable.


Many problems from computational sciences and other disciplines can be brought into a form similar to equation (3.1.1) using mathematical modelling [1, 2, 3, 4, 7, 8, 10, 12]. The solutions of these equations can rarely be found in closed form. That is why most solution methods for these equations are iterative. The study of the convergence of iterative procedures is usually based on two types of analysis: semilocal and local. The semilocal convergence analysis is based on the information around an initial point and gives conditions ensuring the convergence of the iterative procedure, while the local one is based on the information around a solution and yields estimates of the radii of the convergence balls.
Halley's method with initial point x₀ ∈ D is defined by [1, 5, 6, 9, 13, 14, 15]
\[
x_{k+1} = x_k - [I - L_F(x_k)]^{-1} F'(x_k)^{-1} F(x_k), \quad k = 0, 1, 2, \ldots, \tag{3.1.2}
\]
where L_F(x) = ½F′(x)⁻¹F″(x)F′(x)⁻¹F(x). Let U(x, R), Ū(x, R) stand, respectively, for the open and closed balls in X with center x and radius R > 0. Halley's method is cubically convergent and has been studied extensively (see [1]–[13] and the references therein). In

particular, recurrence relations have been used by Parida [13], Parida and Gupta [14], Chun,
Stǎnicǎ and Neta [9] together with different continuity conditions on the second Fréchet
derivative F 00 of F such as F 00 is Lipschitz or Hölder continuous to provider a semilocal
convergence analysis for third order methods such as Halley’s, Chebyshev’s method,
super-Halley’s method and other high order methods. The sufficient conditions usually
associated with the semilocal convergence of Halley’s method are the (C) conditions [1, 5]
given by
(C1 ) kF 0 (x0 )−1 F(x0 )k ≤ η,
(C2 ) kF 0 (x0 )−1 F 00 (x)k ≤ β,
(C3 ) kF 0 (x0 )−1 (F 00 (x) − F 00 (y))k ≤ Mkx − yk,
3M 2
(C4 ) h = 2 3 2
η ≤ 1,
(β +2M) 2 −β(β +3M)
(C5 ) U(x0 , R0 ) ⊆ D, where R0 is the small positive root of

M 3 β 2
p(t) = t + t − t + η.
6 2
Similar conditions, but with different (C₄) and (C₅), have been given by us in [5, Theorem 2.3], where the radius corresponding to R₀ is given in closed form. There are many interesting examples in the literature (see [3, 4, 11, 15] and Example 3.5.2) where the Lipschitz condition (C₃) (used in [9, 13, 14]) is violated but the center-Lipschitz condition
\[
\|F'(x_0)^{-1}[F''(x) - F''(x_0)]\| \le L\|x - x_0\| \quad \text{for each } x \in D \tag{3.1.3}
\]
is satisfied. Note that
\[
L \le M
\]
holds in general and M/L can be arbitrarily large [4, 5]. A local convergence analysis for Halley's method under (3.1.3) and more general conditions has been given by us in [5, 6]. Relevant work, but for Newton's method, can be found in [3, 4, 11].
The following semilocal convergence theorem was established in [15].

Theorem 3.1.1. Let F : D ⊂ X → Y be continuously twice Fréchet differentiable, D open and convex. Assume that there exists a starting point x₀ ∈ D such that F′(x₀)⁻¹ exists, and the following conditions hold:
(i) ‖F′(x₀)⁻¹F(x₀)‖ ≤ η;
(ii) ‖F′(x₀)⁻¹F″(x₀)‖ ≤ β;
(iii) (3.1.3) is true;
(iv) ½βη < τ, where
\[
\tau = \frac{3s^\star + 1 - \sqrt{7s^\star + 1}}{9s^\star - 1} = 0.134065\ldots, \tag{3.1.4}
\]
s⋆ = 0.800576... is such that q(s⋆) = 1, and
\[
q(s) = \frac{(6s+2) - 2\sqrt{7s+1}}{(6s-2) + \sqrt{7s+1}}\left(1 + \frac{s}{1-s^2}\right); \tag{3.1.5}
\]
(v) U(x₀, R) ⊂ D, where R is the positive solution of
\[
Lt^2 + \beta t - 1 = 0. \tag{3.1.6}
\]

Then, the Halley sequence {xₖ} generated by (3.1.2) remains in the open ball U(x₀, R) and converges to the unique solution x⋆ ∈ U(x₀, R) of Eq. (3.1.1). Moreover, the following error estimate holds:
\[
\|x^\star - x_k\| \le \frac{a}{c(1-\tau)\gamma}\sum_{i=k+1}^{\infty}\gamma^{2^i}, \tag{3.1.7}
\]
where a = βη, c = 1/R and γ = a(a+4)/(2−3a)².

In the present chapter we expand the applicability of Halley’s method using (3.1.3)
instead of (C3 ). The chapter is organized as follows: In Section 3.2 we present a coun-
terexample to show that the result in [15] using (3.1.3) is false. The mistakes in the proof
are pointed out in Section 3.3. Section 3.4 contains our semilocal convergence analysis of
Halley’s method using (3.1.3). The numerical examples are given in the concluding Section
3.5.

3.2. Motivational Example

Example 3.2.1. Let us define the scalar function F(x) = 20x³ − 54x² + 60x − 23 on D = (0, 3) with initial point x₀ = 1. Then, we have that
\[
F'(x) = 12(5x^2 - 9x + 5), \quad F''(x) = 12(10x - 9). \tag{3.2.1}
\]
So, F(x₀) = 3, F′(x₀) = 12, F″(x₀) = 12. We can choose η = 1/4 and β = 1 in Theorem 3.1.1. Moreover, we have for any x ∈ D that
\[
|F'(x_0)^{-1}[F''(x) - F''(x_0)]| = 10|x - x_0|. \tag{3.2.2}
\]
Hence, the center-Lipschitz condition (3.1.3) is true for the constant L = 10. We can also verify that condition ½βη = 1/8 < τ = 0.134065... is true. By (3.1.6), we get
\[
R = \frac{\sqrt{\beta^2 + 4L} - \beta}{2L} = \frac{\sqrt{41} - 1}{20} = 0.270156\ldots. \tag{3.2.3}
\]
Then, condition Ū(x₀, R) = [x₀ − R, x₀ + R] ≈ [0.729844, 1.270156] ⊂ D is also true. Hence, all conditions in Theorem 3.1.1 are satisfied. However, we can verify that the point x₁ generated by Halley's method (3.1.2) does not remain in the open ball U(x₀, R). In fact, we have that
\[
|x_1 - x_0| = \frac{|F'(x_0)^{-1}F(x_0)|}{|1 - \tfrac{1}{2}F'(x_0)^{-1}F''(x_0)F'(x_0)^{-1}F(x_0)|} = \frac{2}{7} = 0.285714\ldots > R. \tag{3.2.4}
\]
Clearly, the remaining conclusions of Theorem 3.1.1 cannot be reached.
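The failure is easy to reproduce numerically; the sketch below evaluates one Halley step for Example 3.2.1.

```python
import math

# A minimal numerical check of Example 3.2.1: one Halley step from
# x0 = 1 leaves the ball U(x0, R) with R = (sqrt(41) - 1)/20.
F   = lambda x: 20*x**3 - 54*x**2 + 60*x - 23
dF  = lambda x: 12*(5*x**2 - 9*x + 5)
d2F = lambda x: 12*(10*x - 9)

x0 = 1.0
R  = (math.sqrt(41) - 1) / 20                 # 0.270156...
LF = 0.5 * d2F(x0) * F(x0) / dF(x0)**2        # L_F(x0)
x1 = x0 - (F(x0) / dF(x0)) / (1 - LF)
print(abs(x1 - x0), R, abs(x1 - x0) > R)      # 0.285714... > R: True
```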



3.3. Mistakes in the Proof of Theorem 3.1.1


One crucial mistake exists in the proof of Theorem 3.1.1. To show this, let us introduce real constants a, b and c, and real sequences {aₖ}, {bₖ}, {cₖ} and {dₖ}, given in Ref. [15] as follows:
\[
0 < a < 2\tau, \quad b > 0, \quad c > 0, \quad 2bc < 2 - a, \tag{3.3.1}
\]
\[
a_0 = 1, \quad b_0 = b, \quad c_0 = \frac{a}{2}, \quad d_0 = \frac{b}{1 - \frac{a}{2}}, \tag{3.3.2}
\]
and
\[
\begin{cases}
a_{k+1} = \dfrac{a_k}{1 - c\,a_k d_k}, \\[4pt]
b_{k+1} = c\Big(1 + \dfrac{c_k}{2}\Big)d_k^2, \\[4pt]
c_{k+1} = \dfrac{c}{2}\,a_{k+1}^2 b_{k+1}, \\[4pt]
d_{k+1} = \dfrac{a_{k+1} b_{k+1}}{1 - c_{k+1}}
\end{cases} \tag{3.3.3}
\]

for all k ≥ 0. In addition, the author in [15] sets
\[
r_k = d_0 + d_1 + \cdots + d_k = \sum_{i=0}^{k} d_i \tag{3.3.4}
\]
and
\[
r = \lim_{k\to\infty} r_k, \tag{3.3.5}
\]
provided the limit exists. Using induction, Ref. [15] shows that the following relation holds for k ≥ 0 if aₖ ≥ 1:
\[
a_{k+1} = \frac{1}{1 - c\,r_k}. \tag{3.3.6}
\]
Next, Ref. [15] claims that, from the initial relations, it follows by induction that, for k = 0, 1, 2, ...,
\[
c\,a_k d_k = \frac{c\,a_k^2 b_k}{1 - c_k} = \frac{2c_k}{1 - c_k}. \tag{3.3.7}
\]
Here, we point out that relation (3.3.7) is not always true. In fact, for k = 0, the first and second equalities of (3.3.7) are obtained from
\[
d_0 = \frac{a_0 b_0}{1 - c_0} \tag{3.3.8}
\]
and
\[
c_0 = \frac{c}{2}\,a_0^2 b_0, \tag{3.3.9}
\]
respectively. We can easily verify that (3.3.8) is really true from (3.3.2). However, (3.3.9) is not true in general. Otherwise, using (3.3.2) and (3.3.9), we would demand that
\[
a = bc. \tag{3.3.10}
\]
Clearly, this condition is introduced improperly and will be violated frequently. Since some lemmas of Ref. [15] are established on the basis of the above basic relation (3.3.7), they will not always be true. Therefore, the main theorem of Ref. [15] (Theorem 3.1.1 stated above) will not always be true, because it is based on these lemmas.
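The failure of (3.3.9) is immediate to observe numerically. The sketch below instantiates (3.3.2)–(3.3.3) for illustrative parameter values satisfying (3.3.1) but with a ≠ bc.

```python
# A minimal sketch of the sequences (3.3.2)-(3.3.3); for a != bc the
# claimed identity (3.3.9), i.e. c0 = (c/2) a0^2 b0, already fails at k = 0.
a, b, c = 0.2, 0.5, 0.5                    # 0 < a < 2*tau, 2bc = 0.5 < 2 - a
ak, bk, ck, dk = 1.0, b, a/2, b/(1 - a/2)  # (3.3.2)
print(ck, 0.5*c*ak**2*bk)                  # 0.1 vs 0.125: (3.3.9) fails
for _ in range(3):                         # the recursion (3.3.3) itself
    ak = ak / (1 - c*ak*dk)
    bk = c * (1 + ck/2) * dk**2
    ck = 0.5 * c * ak**2 * bk
    dk = ak * bk / (1 - ck)
    print(ak, bk, ck, dk)
```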

3.4. New Semilocal Convergence Theorem

In this section, we will give a new semilocal convergence theorem for Halley's method under the center-Lipschitz condition. We first need some auxiliary lemmas.

Lemma 3.4.1. The function q(s) given by
\[
q(s) = \frac{2s}{1-s}\left(1 + \frac{\dfrac{s(s+2)}{(1-3s)^2}}{1 - \Big(\dfrac{s(s+2)}{(1-3s)^2}\Big)^2}\right) \tag{3.4.1}
\]
increases monotonically on [0, (2−√2)/4), and there exists a unique point τ ≈ 0.134065 ∈ (0, (2−√2)/4) such that q(τ) = 1.

Proof. It is easy to verify that the function h(s) given by
\[
h(s) = \frac{s(s+2)}{(1-3s)^2} \tag{3.4.2}
\]
increases monotonically on [0, 1/3), and is such that h((2−√2)/4) = 1 and h(s) < 1 on [0, (2−√2)/4). By (3.4.1), q(s) increases monotonically on [0, (2−√2)/4). Moreover, q(0) = 0 and q(s) tends to +∞ as s tends to (2−√2)/4 from the left. Therefore, there exists a unique point τ ∈ (0, (2−√2)/4) such that q(τ) = 1. We can use iterative methods such as the secant method to obtain τ ≈ 0.134065. The proof is complete. □
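The value of τ is readily reproduced; a minimal sketch applying the secant method to q(s) − 1, with illustrative starting points, follows.

```python
# A minimal sketch computing tau of Lemma 3.4.1 by the secant method
# applied to q(s) - 1 on (0, (2 - sqrt(2))/4).
h = lambda s: s*(s + 2) / (1 - 3*s)**2
q = lambda s: (2*s/(1 - s)) * (1 + h(s)/(1 - h(s)**2))

s0, s1 = 0.10, 0.14
for _ in range(8):                       # secant iteration for q(s) = 1
    f0, f1 = q(s0) - 1, q(s1) - 1
    if f1 == f0:
        break                            # converged to machine precision
    s0, s1 = s1, s1 - f1*(s1 - s0)/(f1 - f0)
print(s1)                                # ~ 0.134065
```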

Lemma 3.4.2. Let the real constants a, b and c be defined by
\[
a > 0, \quad b > 0, \quad c > 0, \quad 2bc < 2 - a, \tag{3.4.3}
\]
and let the real sequences {aₖ}, {bₖ}, {cₖ} and {dₖ} be defined by (3.3.2)–(3.3.3). Assume that
\[
c_1 = \frac{c^2 b^2 (a + 4)}{2(2 - a - 2bc)^2} < \tau, \tag{3.4.4}
\]
where τ is the constant defined in Lemma 3.4.1. Then, the sequence {cₖ} is bounded, strictly decreasing, and cₖ ∈ (0, τ) for all k ≥ 1. Moreover, we have that
\[
c_{k+1} = \frac{c_k^2}{(1-3c_k)^2}(c_k + 2), \quad k = 1, 2, \ldots. \tag{3.4.5}
\]

Proof. By conditions (3.4.3) and (3.4.4), a₁, b₁, c₁ and d₁ are well defined. Since
\[
c\,a_1 d_1 = \frac{c\,a_1^2 b_1}{1 - c_1} = \frac{2c_1}{1 - c_1} < 1, \tag{3.4.6}
\]
a₂, b₂ and c₂ are well defined. We have that
\[
c_2 = \frac{c}{2}\,\frac{a_1^2}{\Big(1 - \frac{2c_1}{1-c_1}\Big)^2}\,c\Big(1 + \frac{c_1}{2}\Big)\frac{a_1^2 b_1^2}{(1-c_1)^2} = \frac{c_1^2}{(1-3c_1)^2}(c_1 + 2). \tag{3.4.7}
\]

Hence, (3.4.5) holds for k = 1. By (3.4.7), c₂ < c₁ < 1 is true, since
\[
(1-3c_1)^2 - c_1(c_1+2) = 8c_1^2 - 8c_1 + 1 > 0, \tag{3.4.8}
\]
which is equivalent to
\[
c_1 < \frac{2-\sqrt{2}}{4}. \tag{3.4.9}
\]
Then, d₂ is well defined.
Suppose the sequences {aₖ}, {bₖ}, {cₖ} and {dₖ} are well defined for k = 0, 1, ..., n + 1, that c_{n+1} < c_n < ··· < c₂ < c₁, and that (3.4.5) holds for k = 1, 2, ..., n, where n ≥ 1 is a fixed integer. Since
\[
c\,a_{n+1}d_{n+1} = \frac{c\,a_{n+1}^2 b_{n+1}}{1 - c_{n+1}} = \frac{2c_{n+1}}{1 - c_{n+1}} < 1, \tag{3.4.10}
\]
it follows that a_{n+2}, b_{n+2} and c_{n+2} are well defined. We have that
\[
c_{n+2} = \frac{c}{2}\,\frac{a_{n+1}^2}{\Big(1 - \frac{2c_{n+1}}{1-c_{n+1}}\Big)^2}\,c\Big(1 + \frac{c_{n+1}}{2}\Big)\frac{a_{n+1}^2 b_{n+1}^2}{(1-c_{n+1})^2} = \frac{c_{n+1}^2}{(1-3c_{n+1})^2}(c_{n+1} + 2), \tag{3.4.11}
\]
thus (3.4.5) holds for k = n + 1. So, c_{n+2} < c_{n+1} < ··· < c₂ < c₁ < 1, and d_{n+2} is well defined. That completes the induction and the proof of the lemma. □
Lemma 3.4.3. Under the assumptions of Lemma 3.4.2, if we set γ = c₂/c₁, then for k ≥ 0:

(i) $c_{k+1} \le c_1\gamma^{2^k-1}$;

(ii) $d_{k+1} \le \dfrac{2c_1}{c\,a_1(1-c_1)}\,\gamma^{2^k-1}$;

(iii) $r - r_k \le \displaystyle\sum_{i=k+1}^{\infty}\frac{2c_1}{c\,a_1(1-c_1)}\,\gamma^{2^{i-1}-1}$;

where r is defined in (3.3.5).
Proof. Obviously, (i) is true for k = 0. By Lemma 3.4.2, for any k ≥ 1, we have that
\[
c_{k+1} = \frac{c_k^2}{(1-3c_k)^2}(c_k+2) \le \frac{c_k^2}{(1-3c_{k-1})^2}(c_{k-1}+2) \le \cdots \le \frac{c_k^2}{(1-3c_1)^2}(c_1+2) = \lambda c_k^2, \tag{3.4.12}
\]
where
\[
\lambda = \frac{c_1+2}{(1-3c_1)^2} = \frac{c_1^2(c_1+2)}{(1-3c_1)^2 c_1^2} = \frac{c_2}{c_1^2} = \frac{\gamma}{c_1}. \tag{3.4.13}
\]
Multiplying both sides of (3.4.12) by λ yields
\[
\lambda c_{k+1} \le (\lambda c_k)^2 \le (\lambda c_{k-1})^{2^2} \le \cdots \le (\lambda c_1)^{2^k} = \gamma^{2^k}, \tag{3.4.14}
\]
which shows (i). Consequently, for k ≥ 1, we have
\[
d_{k+1} = \frac{c\,a_{k+1}d_{k+1}}{c\,a_{k+1}} = \frac{c\,a_{k+1}^2 b_{k+1}}{c\,a_{k+1}(1-c_{k+1})} = \frac{2c_{k+1}}{c\,a_{k+1}(1-c_{k+1})} \le \frac{2c_1\,\gamma^{2^k-1}}{c\,a_1(1-c_1)}. \tag{3.4.15}
\]
That is, (ii) is true for k ≥ 1. For the case k = 0, we have that
\[
d_1 = \frac{a_1 b_1}{1-c_1} = \frac{c\,a_1^2 b_1}{c\,a_1(1-c_1)} = \frac{2c_1}{c\,a_1(1-c_1)}, \tag{3.4.16}
\]
which means (ii) is also true for k = 0. Moreover, (iii) is true by using (ii) and the definitions of rₖ and r. The proof is complete. □

Lemma 3.4.4. Under the assumptions of Lemma 3.4.2, if we set a = βη, b = η, c = 1/R, then r = lim_{k→∞} rₖ < R.

Proof. By the definition of rₖ and Lemmas 3.4.1–3.4.3, for any k ≥ 1, we have that
\[
\begin{aligned}
r_k & \le d_0 + \sum_{i=1}^{k}\frac{2c_1}{c\,a_1(1-c_1)}\gamma^{2^{i-1}-1} < d_0 + \frac{2Rc_1}{a_1(1-c_1)}\left(1+\frac{\gamma}{1-\gamma^2}\right) \\
& = d_0 + \frac{2(1-cd_0)Rc_1}{1-c_1}\left(1+\frac{\dfrac{c_1(c_1+2)}{(1-3c_1)^2}}{1-\Big(\dfrac{c_1(c_1+2)}{(1-3c_1)^2}\Big)^2}\right) = d_0 + q(c_1)(R-d_0) \\
& < d_0 + q(\tau)(R-d_0) = R.
\end{aligned}
\tag{3.4.17}
\]
Here, we used R = 1/c > b/(1 − a/2) = d₀ by (3.4.3). Hence, r = lim_{k→∞} rₖ exists and r < R. The proof is complete. □
Lemma 3.4.5. Set a = βη, b = η and c = 1/R. Let {aₖ}, {bₖ}, {cₖ}, {dₖ} be the sequences generated by (3.3.2)–(3.3.3). Suppose that conditions (3.4.3) and (3.4.4) are true. Then, for any k ≥ 0, we have:
(i) F′(xₖ)⁻¹ exists and ‖F′(xₖ)⁻¹F′(x₀)‖ ≤ aₖ;
(ii) ‖F′(x₀)⁻¹F(xₖ)‖ ≤ bₖ;
(iii) [I − L_F(xₖ)]⁻¹ exists and ‖L_F(xₖ)‖ ≤ cₖ;
(iv) ‖x_{k+1} − xₖ‖ ≤ dₖ;
(v) ‖x_{k+1} − x₀‖ ≤ rₖ < R.

Proof. The proof is similar to the one in [15], and we shall omit it. □

Now, we can state our main theorem.

Theorem 3.4.6. Let F : D ⊂ X → Y be continuously twice Fréchet differentiable, D open and convex. Assume that there exists a starting point x₀ ∈ D such that F′(x₀)⁻¹ exists, and the following conditions hold:
(i) ‖F′(x₀)⁻¹F(x₀)‖ ≤ η;
(ii) ‖F′(x₀)⁻¹F″(x₀)‖ ≤ β;
(iii) the center-Lipschitz condition (3.1.3) is true;
(iv) conditions (3.4.3) and (3.4.4) are true;
(v) U(x₀, R) ⊂ D, where R is the positive solution of (3.1.6).
Then, the Halley sequence {xₖ} generated by (3.1.2) remains in the open ball U(x₀, R) and converges to the unique solution x⋆ ∈ U(x₀, R) of Eq. (3.1.1). Moreover, the following error estimate holds for any k ≥ 1:
\[
\|x^\star - x_k\| \le \sum_{i=k}^{\infty}\frac{2c_1}{c\,a_1(1-c_1)}\gamma^{2^{i-1}-1}, \tag{3.4.18}
\]
where a = βη, b = η, c = 1/R and γ = c₂/c₁ = c₁(c₁+2)/(1−3c₁)².

Proof. Using Lemma 3.4.5, we have xₖ ∈ U(x₀, R) for all k ≥ 0. From Lemma 3.4.5 and Lemma 3.4.3, we get for any integers k, m ≥ 1
\[
\|x_{k+m} - x_k\| \le \sum_{i=k}^{k+m-1}\|x_{i+1}-x_i\| \le \sum_{i=k}^{k+m-1} d_i \le \sum_{i=k}^{k+m-1}\frac{2c_1}{c\,a_1(1-c_1)}\gamma^{2^{i-1}-1} \le \frac{2c_1}{c\,a_1(1-c_1)}\sum_{i=k}^{\infty}\gamma^{2^{i-1}-1} \le \frac{2c_1}{c\,a_1(1-c_1)\gamma}\,\frac{\gamma^{2^{k-1}}}{1-\gamma^2}. \tag{3.4.19}
\]

That is, {xₖ} is a Cauchy sequence. So, there exists a point x⋆ ∈ U(x₀, R) such that {xₖ} converges to x⋆ as k → ∞. Using Lemma 3.4.3, clearly we have dₖ → 0 as k → ∞. Using Lemma 3.4.5 and Lemma 3.4.2, for any k ≥ 0, we have
\[
\|F'(x_0)^{-1}F(x_{k+1})\| \le b_{k+1} = c\Big(1+\frac{c_k}{2}\Big)d_k^2 \le c\Big(1+\frac{c_1}{2}\Big)d_k^2 \to 0 \quad \text{as } k \to \infty. \tag{3.4.20}
\]
The continuity of F gives
\[
\|F'(x_0)^{-1}F(x^\star)\| = \lim_{k\to\infty}\|F'(x_0)^{-1}F(x_{k+1})\| = 0, \tag{3.4.21}
\]
that is, F(x⋆) = 0. By letting m → ∞ in (3.4.19), (3.4.18) is obtained immediately.
Finally, we can show the uniqueness of x⋆ ∈ U(x₀, R) by using the same technique as in [2, 3, 4, 5, 15]. The proof is complete. □

Remark 3.4.7. (a) Let us compare our sufficient convergence condition (3.4.3) with condition (C₄). Condition (3.4.3) can be rewritten as
\[
h_0 = \frac{2\beta + \sqrt{\beta^2+4L}}{2}\,\eta < 1 \tag{3.4.22}
\]
if we use the choices of a, b, c given in Lemma 3.4.5 and R given by (3.2.3). Then, we have that
\[
h_0 \le h. \tag{3.4.23}
\]
Estimate (3.4.23) shows that one of our convergence conditions is at least as weak as (C₄). However, a direct comparison between (3.4.4) and (C₄) is not practical. A similar favorable comparison can be made with all other sufficient convergence conditions of the form (C₄) already in the literature that use M instead of L (see [4, 5, 10, 11, 13, 14, 15] and the references therein).
(b) It is possible that (C₃) is satisfied (hence, (3.1.3) too) but not (C₄) (or (C₅)). In this case we test to see if our conditions are satisfied. If they are, then, although we predict only quadratic convergence of the Halley method (3.1.2) (see, e.g., Lemma 3.4.3), after a certain iterate x_N, where N is a finite natural integer, (C₄) and (C₅) will be satisfied for x₀ = x_N. Therefore, the usual error estimates for the cubical convergence of the Halley method (3.1.2) will hold. We refer the reader to [3, 4], where we show how to choose N in the case of Newton's method. The N for Halley's method (3.1.2) can be found in an analogous way.

3.5. Numerical Examples

In this section, we will give some examples to show the application of our Theorem 3.4.6.

Example 3.5.1. Let us define the scalar function F(x) = x³ − 2.25x² + 3x − 1.585 on D = (0, 3) with initial point x₀ = 1. Then, we have that
\[
F'(x) = 3x^2 - 4.5x + 3, \quad F''(x) = 6x - 4.5. \tag{3.5.1}
\]
So, F(x₀) = 0.165, F′(x₀) = 1.5, F″(x₀) = 1.5. We can choose η = 0.11 and β = 1 in Theorem 3.4.6. Moreover, we have for any x ∈ D that
\[
|F'(x_0)^{-1}[F''(x) - F''(x_0)]| = 4|x - x_0|. \tag{3.5.2}
\]
Hence, the center-Lipschitz condition (3.1.3) is true for the constant L = 4. By (3.1.6), we get
\[
R = \frac{\sqrt{\beta^2+4L}-\beta}{2L} = \frac{\sqrt{17}-1}{8} = 0.390388\ldots. \tag{3.5.3}
\]
Then, condition Ū(x₀, R) = [x₀ − R, x₀ + R] ≈ [0.609612, 1.390388] ⊂ D is true. We can also verify that the conditions 2 − a − 2bc = 2 − βη − 2η/R ≈ 1.326458 > 0 and c₁ = c²b²(a+4)/(2(2−a−2bc)²) ≈ 0.092729212 < τ = 0.134065... are true. Hence, all conditions in Theorem 3.4.6 are satisfied.

Example 3.5.2. In this example we provide an application of our results to a special nonlinear Hammerstein integral equation of the second kind. Consider the integral equation
\[
u(s) = f(s) + \lambda\int_{a_0}^{b_0} k(s,t)\,u(t)^{2+\frac{1}{n}}\,dt, \quad \lambda \in \mathbb{R},\ n \in \mathbb{N}, \tag{3.5.4}
\]
where f is a given continuous function satisfying f(s) > 0 for s ∈ [a₀, b₀], and the kernel k is continuous and positive on [a₀, b₀] × [a₀, b₀].
Let X = Y = C[a₀, b₀] and D = {u ∈ C[a₀, b₀] : u(s) ≥ 0, s ∈ [a₀, b₀]}. Define F : D → Y by
\[
F(u)(s) = u(s) - f(s) - \lambda\int_{a_0}^{b_0} k(s,t)\,u(t)^{2+\frac{1}{n}}\,dt, \quad s \in [a_0, b_0]. \tag{3.5.5}
\]
We use the max-norm. The first and second derivatives of F are given by
\[
F'(u)v(s) = v(s) - \lambda\Big(2+\frac{1}{n}\Big)\int_{a_0}^{b_0} k(s,t)\,u(t)^{1+\frac{1}{n}}\,v(t)\,dt, \quad v \in D,\ s \in [a_0, b_0], \tag{3.5.6}
\]
and
\[
F''(u)(vw)(s) = -\lambda\Big(1+\frac{1}{n}\Big)\Big(2+\frac{1}{n}\Big)\int_{a_0}^{b_0} k(s,t)\,u(t)^{\frac{1}{n}}\,(vw)(t)\,dt, \quad v, w \in D,\ s \in [a_0, b_0], \tag{3.5.7}
\]
respectively.
Let x₀(t) = f(t), α = min_{s∈[a₀,b₀]} f(s), δ = max_{s∈[a₀,b₀]} f(s) and M = max_{s∈[a₀,b₀]} ∫_{a₀}^{b₀} |k(s,t)| dt. Then, for any v, w ∈ D,
\[
\begin{aligned}
\|[F''(x)-F''(x_0)](vw)\| & \le |\lambda|\Big(1+\frac{1}{n}\Big)\Big(2+\frac{1}{n}\Big)\max_{s\in[a_0,b_0]}\int_{a_0}^{b_0}|k(s,t)|\,\big|x(t)^{\frac{1}{n}}-f(t)^{\frac{1}{n}}\big|\,dt\,\|vw\| \\
& = |\lambda|\Big(1+\frac{1}{n}\Big)\Big(2+\frac{1}{n}\Big)\max_{s\in[a_0,b_0]}\int_{a_0}^{b_0}|k(s,t)|\,\frac{|x(t)-f(t)|}{x(t)^{\frac{n-1}{n}}+x(t)^{\frac{n-2}{n}}f(t)^{\frac{1}{n}}+\cdots+f(t)^{\frac{n-1}{n}}}\,dt\,\|vw\| \\
& \le |\lambda|\Big(1+\frac{1}{n}\Big)\Big(2+\frac{1}{n}\Big)\max_{s\in[a_0,b_0]}\int_{a_0}^{b_0}|k(s,t)|\,\frac{|x(t)-f(t)|}{f(t)^{\frac{n-1}{n}}}\,dt\,\|vw\| \\
& \le \frac{|\lambda|\big(1+\frac{1}{n}\big)\big(2+\frac{1}{n}\big)}{\alpha^{\frac{n-1}{n}}}\max_{s\in[a_0,b_0]}\int_{a_0}^{b_0}|k(s,t)|\,|x(t)-f(t)|\,dt\,\|vw\| \\
& \le \frac{|\lambda|\big(1+\frac{1}{n}\big)\big(2+\frac{1}{n}\big)M}{\alpha^{\frac{n-1}{n}}}\,\|x-x_0\|\,\|vw\|,
\end{aligned}
\tag{3.5.8}
\]
which means
\[
\|F''(x)-F''(x_0)\| \le \frac{|\lambda|\big(1+\frac{1}{n}\big)\big(2+\frac{1}{n}\big)M}{\alpha^{\frac{n-1}{n}}}\,\|x-x_0\|. \tag{3.5.9}
\]

Next, we give a bound for ‖F′(x₀)⁻¹‖. Using (3.5.6), we have that
\[
\|I - F'(x_0)\| \le |\lambda|\Big(2+\frac{1}{n}\Big)\delta^{1+\frac{1}{n}}M. \tag{3.5.10}
\]
It follows from the Banach theorem that F′(x₀)⁻¹ exists if |λ|(2 + 1/n)δ^{1+1/n}M < 1, and
\[
\|F'(x_0)^{-1}\| \le \frac{1}{1-|\lambda|\big(2+\frac{1}{n}\big)\delta^{1+\frac{1}{n}}M}. \tag{3.5.11}
\]
On the other hand, we have from (3.5.5) and (3.5.7) that ‖F(x₀)‖ ≤ |λ|δ^{2+1/n}M and ‖F″(x₀)‖ ≤ |λ|(1 + 1/n)(2 + 1/n)δ^{1/n}M. Hence, if |λ|(2 + 1/n)δ^{1+1/n}M < 1, the center-Lipschitz condition (3.1.3) is true for
\[
L = \frac{|\lambda|\big(1+\frac{1}{n}\big)\big(2+\frac{1}{n}\big)M}{\alpha^{\frac{n-1}{n}}\Big[1-|\lambda|\big(2+\frac{1}{n}\big)\delta^{1+\frac{1}{n}}M\Big]}, \tag{3.5.12}
\]
and the constants η and β in Theorem 3.4.6 can be given by
\[
\eta = \frac{|\lambda|\,\delta^{2+\frac{1}{n}}M}{1-|\lambda|\big(2+\frac{1}{n}\big)\delta^{1+\frac{1}{n}}M}, \qquad \beta = \frac{|\lambda|\big(1+\frac{1}{n}\big)\big(2+\frac{1}{n}\big)\delta^{\frac{1}{n}}M}{1-|\lambda|\big(2+\frac{1}{n}\big)\delta^{1+\frac{1}{n}}M}. \tag{3.5.13}
\]

Next we let [a₀, b₀] = [0, 1], n = 2, f(s) = 1, λ = 1.1, and k(s,t) the Green's kernel on [0, 1] × [0, 1] defined by
\[
G(s,t) = \begin{cases} t(1-s), & t \le s; \\ s(1-t), & s \le t. \end{cases} \tag{3.5.14}
\]
Consider the following particular case of (3.5.4):
\[
u(s) = f(s) + 1.1\int_0^1 G(s,t)\,u(t)^{\frac{5}{2}}\,dt, \quad s \in [0, 1]. \tag{3.5.15}
\]
Then, α = δ = 1 and M = 1/8. Moreover, we have that
\[
\eta = \frac{22}{105}, \quad \beta = \frac{11}{14}, \quad L = \frac{11}{14}. \tag{3.5.16}
\]
Therefore 2 − a − 2bc ≈ 1.264456 > 0, τ − c1 ≈ 0.027938 > 0 and R ≈ 0.733988. Hence,
U(x0 , R) ⊂ D. Thus, all conditions of Theorem 3.4.6 are satisfied. Consequently, sequence
{xk } generated by Halley’s method (3.1.2) with initial point x0 converges to the unique
solution x? of Eq. (3.5.15) on U(x0 , 0.733988). The Lipschitz condition (C3 ) is not satisfied
[3, 4, 11, 15]. Hence, we have expanded the applicability of Halley’s method. Note also
that verifying (3.1.3) is less expensive than verifying (C3 ).
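A minimal sketch checking the constants (3.5.16) and the quantities of Theorem 3.4.6 from the formulas (3.5.12)–(3.5.13) follows.

```python
import math

# A minimal check of (3.5.16) and the quantities of Theorem 3.4.6 for
# equation (3.5.15), with lam = 1.1, n = 2, alpha = delta = 1, M = 1/8.
lam, n, M, alpha, delta = 1.1, 2, 0.125, 1.0, 1.0
den  = 1 - lam*(2 + 1/n)*delta**(1 + 1/n)*M
eta  = lam*delta**(2 + 1/n)*M / den                        # 22/105
beta = lam*(1 + 1/n)*(2 + 1/n)*delta**(1/n)*M / den        # 11/14
L    = lam*(1 + 1/n)*(2 + 1/n)*M / (alpha**((n-1)/n)*den)  # 11/14
R = (math.sqrt(beta**2 + 4*L) - beta) / (2*L)
a, b, c = beta*eta, eta, 1.0/R
c1 = c**2*b**2*(a + 4)/(2*(2 - a - 2*b*c)**2)
print(eta, beta, L)                       # 0.20952..., 0.78571..., 0.78571...
print(2 - a - 2*b*c, 0.134065 - c1, R)    # ~1.264456, ~0.027938, ~0.733988
```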
References

[1] Amat, S., Busquier, S, Third-order iterative methods under Kantorovich conditions, J.
Math. Anal. Appl. 336 (2007), 243–261.

[2] Argyros, I.K., The convergence of Halley-Chebyshev type method under Newton-
Kantorovich hypotheses, Appl. Math. Lett. 6 (1993), 71–74.

[3] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point, Rev. Anal. Numer. Theor. Approx,
36 (2007), 123–138.

[4] Argyros, I.K., Computational theory of iterative methods, Series: Studies in Compu-
tational Mathematics 15, Editors, C.K. Chui and L. Wuytack, Elservier Publ. Co. New
York, USA, 2007.

[5] Argyros, I.K., Cho, Y.J., Hilout, S., On the semilocal convergence of the Halley
method using recurrent functions, J. Appl. Math. Computing. 37 (2011), 221–246.

[6] Argyros, I.K., Ren, H.M., Ball convergence theorems for Halley’s method in Banach
spaces, J. Appl. Math. Computing 38 (2012), 453–465.

[7] Argyros, I.K., George, S., Ball convergence for Steffensen-type fourth-order methods. Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 37–42.

[8] Argyros, I.K., González, D., Local convergence for an improved Jarratt-type method in Banach space. Int. J. Interac. Multim. Art. Intell., 3(4) (2015), 20–25.

[9] Chun, C., Stǎnicǎ, P., Neta, B., Third-order family of methods in Banach spaces,
Comp. Math. with App., 61 (2011), 1665–1675.

[10] Deuflhard, P., Newton Methods for Nonlinear Problems: Affine Invariance and Adap-
tive Algorithms, Springer-Verlag, Berlin, Heidelberg, 2004.

[11] Gutiérrez, J.M., Hernández, M.A., Newton’s method under weak Kantorovich condi-
tions, IMA J. Numer. Anal. 20 (2000), 521–532.

[12] Magreñán, Á. A. , Different anomalies in a Jarratt family of iterative root-finding


methods, App. Math. Comp. 233 (2014), 29–38.

[13] Parida, P.K., Study of third order methods for nonlinear equations in Banach spaces,
PhD dissertation, 2007, IIT kharagpur, India.

[14] Parida, P.K., Gupta, P.K., Semilocal convergence of a family of third-order


Chebyshev-type methods under a mild differentiability condition. Int. J. Comput.
Math. 87 (2010), 3405–3419.

[15] Xu, X.B., Ling, Y.H., Semilocal convergence for Halley’s method under weak Lips-
chitz condition, App. Math. Comp. 215 (2009), 3057–3067.
Chapter 4

An Improved Convergence Analysis of Newton's Method for Twice Fréchet Differentiable Operators

4.1. Introduction
In this chapter, we are concerned with the problem of approximating a locally unique solu-
tion x? of equation
F (x) = 0, (4.1.1)
where, F is a twice Fréchet differentiable operator defined on a convex subset D of a
Banach space X with values in a Banach space Y . Numerous problems in science and
engineering – such as optimization of chemical processes or multiphase, multicomponent
flow – can be reduced to solving the above equation [7, 8, 9, 14, 15, 16]. Consequently,
solving these equations is an important scientific field of research. For most problems,
finding a closed form solution for the non-linear equation (4.1.1) is not possible. Therefore,
iterative solution techniques are employed for solving these equations. The study of the convergence analysis of iterative methods is usually divided into two categories: semilocal and local convergence analysis. The semilocal convergence analysis is based upon the information around an initial point to give criteria ensuring the convergence of the iterative procedure, while the local convergence analysis is based on the information around a solution to find estimates of the radii of the convergence balls.
The most popular iterative method for solving problem (4.1.1) is Newton's method

xn+1 = xn − F 0 (xn )−1 F (xn ) for each n = 0, 1, 2, . . ., (4.1.2)


where x₀ ∈ D is an initial point. There exist extensive local as well as semilocal convergence analysis results under various Lipschitz-type conditions for Newton's method (4.1.2) [1–17]. The following four conditions have been used to perform the semilocal convergence
analysis of Newton's method (4.1.2) [3, 5, 7, 8, 9, 13, 14]:

C₁. there exists x₀ ∈ D such that F′(x₀)⁻¹ ∈ L(Y, X),

C₂. ‖F′(x₀)⁻¹F(x₀)‖ ≤ η,

C₃. ‖F′(x₀)⁻¹F″(x)‖ ≤ K for each x ∈ D,

C₄. ‖F′(x₀)⁻¹(F″(x) − F″(y))‖ ≤ M‖x − y‖ for each x, y ∈ D.

Let us also introduce the center-Lipschitz condition

C₅. ‖F′(x₀)⁻¹(F′(x) − F′(x₀))‖ ≤ L₀‖x − x₀‖ for each x ∈ D.

We shall refer to (C₁)–(C₅) as the (C) conditions. The following conditions have also been employed [9, 10, 11, 12, 17, 14]:

C₆. ‖F′(x₀)⁻¹F″(x₀)‖ ≤ K₀,

C₇. ‖F′(x₀)⁻¹(F″(x) − F″(x₀))‖ ≤ M₀‖x − x₀‖ for each x ∈ D.

Here onwards, the conditions (C₁), (C₂), (C₅), (C₆), (C₇) are referred to as the (H) conditions.
For the semilocal convergence of Newton's method, the conditions (C₁), (C₂), (C₃) together with the following sufficient conditions have been used [1, 2, 3, 4, 9, 10, 11, 12, 17, 14, 15, 16, 18]:
\[
\eta \le \frac{4M + K^2 - K\sqrt{K^2+2M}}{3M\big(K+\sqrt{K^2+2M}\big)}, \tag{4.1.3}
\]
\[
U(x_0, R_1) \subseteq D, \tag{4.1.4}
\]
where R₁ is the smallest positive root of
\[
P_1(t) = \frac{M}{6}t^3 + \frac{K}{2}t^2 - t + \eta. \tag{4.1.5}
\]
Whereas the conditions (C₁), (C₂), (C₆), (C₇) together with
\[
\eta \le \frac{4M_0 + K_0^2 - K_0\sqrt{K_0^2+2M_0}}{3M_0\big(K_0+\sqrt{K_0^2+2M_0}\big)}, \tag{4.1.6}
\]
\[
U(x_0, R_2) \subseteq D, \tag{4.1.7}
\]
where R₂ is the smallest positive root of
\[
P_2(t) = \frac{M_0}{6}t^3 + \frac{K_0}{2}t^2 - t + \eta, \tag{4.1.8}
\]
have also been used for the semilocal convergence of Newton's method. Conditions (4.1.3) and (4.1.6) cannot be directly compared with ours given in Sections 4.2 and 4.3, since we use L₀, which does not appear in (4.1.3) and (4.1.6). However, comparisons can be made on concrete numerical examples. Let us consider X = Y = ℝ, x₀ = 1 and D = [ζ, 2 − ζ] for ζ ∈ (0, 1). Define the function F on D by
\[
F(x) = x^5 - \zeta. \tag{4.1.9}
\]
[Figure 4.1.1 appears here: plot of η, h₁ and h₂ against ζ ∈ (0, 1) (vertical scale ×10⁻²), with crossing points marked at ζ ≈ 0.514 and ζ ≈ 0.723.]

Figure 4.1.1. Convergence criteria (4.1.3) and (4.1.6) for the equation (4.1.9). Here, h₁ and h₂ stand, respectively, for the right-hand sides of conditions (4.1.3) and (4.1.6).

Then, through some simple calculations, the conditions (C₂), (C₃), (C₄), (C₅), (C₆) and (C₇) yield
\[
\eta = \frac{1-\zeta}{5}, \quad K = 4(2-\zeta)^3, \quad M = 12(2-\zeta)^2, \quad K_0 = 4,
\]
\[
M_0 = 4\zeta^2 - 20\zeta + 28, \quad L_0 = 15 - 17\zeta + 7\zeta^2 - \zeta^3.
\]
Figure 4.1.1 plots the criteria (4.1.3) and (4.1.6) for the problem (4.1.9); h₁ stands for the right-hand side of condition (4.1.3) and h₂ for the right-hand side of condition (4.1.6). We observe that for ζ < 0.723 the criterion (4.1.3) does not hold, while for ζ < 0.514 the criterion (4.1.6) does not hold. However, one may see that the method (4.1.2) is convergent.
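The thresholds quoted above can be reproduced from the closed forms; the following sketch evaluates η and the right-hand sides h₁, h₂ of (4.1.3) and (4.1.6) at a few sample values of ζ.

```python
import math

# A minimal sketch reproducing the quantities behind Figure 4.1.1 for
# equation (4.1.9); h1 and h2 are the right-hand sides of (4.1.3), (4.1.6).
def criteria(z):
    eta = (1 - z) / 5
    K, M = 4*(2 - z)**3, 12*(2 - z)**2
    K0, M0 = 4.0, 4*z**2 - 20*z + 28
    h1 = (4*M + K*K - K*math.sqrt(K*K + 2*M)) / (3*M*(K + math.sqrt(K*K + 2*M)))
    h2 = (4*M0 + K0*K0 - K0*math.sqrt(K0*K0 + 2*M0)) / (3*M0*(K0 + math.sqrt(K0*K0 + 2*M0)))
    return eta, h1, h2

for z in (0.5, 0.6, 0.8):
    eta, h1, h2 = criteria(z)
    print(z, eta <= h1, eta <= h2)   # (4.1.3), (4.1.6) hold iff True
```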
In this chapter, we expand the applicability of Newton's method (4.1.2), first under the (C) conditions and secondly under the (H) conditions. The local convergence of Newton's method (4.1.2) is also performed under similar conditions.
The chapter is organized as follows. In Sections 4.2 and 4.3, we study majorizing sequences for the Newton iterates {xₙ}. Section 4.4 contains the semilocal convergence of Newton's method. The local convergence is given in Section 4.5. Finally, numerical examples are given in Section 4.6.

4.2. Majorizing Sequences I

In this section, we present scalar sequences and prove that these sequences are majorizing for Newton's method (4.1.2). We need the following convergence results for majorizing sequences under the (C) conditions.

Lemma 4.2.1. Let K, L₀, M > 0 and η > 0. Define the parameters α, η₀ and η₁ by
\[
\alpha = \frac{2K}{K + \sqrt{K^2 + 8L_0K}}, \tag{4.2.1}
\]
\[
\eta_0 = \frac{2}{\dfrac{K}{2} + (1+\alpha)L_0 + \sqrt{\Big(\dfrac{K}{2} + (1+\alpha)L_0\Big)^2 + \dfrac{2M\alpha}{3}}} \tag{4.2.2}
\]
and
\[
\eta_1 = \frac{2}{\dfrac{K}{2} + \alpha L_0 + \sqrt{\Big(\dfrac{K}{2} + \alpha L_0\Big)^2 + \dfrac{2M\alpha}{3}}}. \tag{4.2.3}
\]
Suppose that
\[
\eta \le \begin{cases} \eta_1 & \text{if } L_0\eta \le \dfrac{1-\alpha^2}{2+2\alpha-\alpha^2}, \\[6pt] \eta_0 & \text{if } \dfrac{1-\alpha^2}{2+2\alpha-\alpha^2} \le L_0\eta. \end{cases} \tag{4.2.4}
\]
Then, the sequence {tₙ} generated by
\[
t_0 = 0, \quad t_1 = \eta, \quad t_{n+2} = t_{n+1} + \frac{K + \frac{M}{3}(t_{n+1}-t_n)}{2(1-L_0t_{n+1})}\,(t_{n+1}-t_n)^2 \tag{4.2.5}
\]
is well defined, increasing, bounded from above by
\[
t^{\star\star} = \frac{\eta}{1-\alpha} \tag{4.2.6}
\]
and converges to its unique least upper bound t⋆, which satisfies t⋆ ∈ [η, t⋆⋆]. Moreover, the following estimates hold:
\[
t_{n+1} - t_n \le \alpha^n\eta \tag{4.2.7}
\]
and
\[
t^\star - t_n \le \frac{\alpha^n\eta}{1-\alpha}. \tag{4.2.8}
\]
Proof. We use mathematical induction to prove (4.2.7). Set
\[
\alpha_k = \frac{K + \frac{M}{3}(t_{k+1}-t_k)}{2(1-L_0t_{k+1})}\,(t_{k+1}-t_k). \tag{4.2.9}
\]

According to (4.2.5) and (4.2.9), we must prove that
\[
\alpha_k \le \alpha. \tag{4.2.10}
\]
Estimate (4.2.10) holds for k = 0 by (4.2.4) and the choice of η₁ given in (4.2.3). Then, we also have
\[
t_2 - t_1 \le \alpha(t_1 - t_0)
\]
and
\[
t_2 \le t_1 + \alpha(t_1 - t_0) = \eta + \alpha\eta = (1+\alpha)\eta = \frac{1-\alpha^2}{1-\alpha}\,\eta < \frac{\eta}{1-\alpha} = t^{\star\star}.
\]
Let us assume that (4.2.10) holds for all k ≤ n. Then, we also have by (4.2.5) that
\[
t_{k+1} - t_k \le \alpha^k\eta
\]
and
\[
t_{k+1} \le \frac{1-\alpha^{k+1}}{1-\alpha}\,\eta < t^{\star\star}.
\]
Then, we must prove that
\[
\Big(\frac{K}{2} + \frac{M}{6}\alpha^k\eta\Big)\alpha^k\eta + \alpha L_0\eta\,\frac{1-\alpha^{k+1}}{1-\alpha} - \alpha \le 0. \tag{4.2.11}
\]
Estimate (4.2.11) motivates us to define recurrent functions f k on [0, 1) for each k = 1, 2, . . .
by
1 M 
fk (t) = K + t k η t k−1η + L0 (1 + t + · · · + t k )η − 1. (4.2.12)
2 3
We need a relationship between two consecutive functions f k . Using (4.2.12) we get that

fk+1 (t) = f k (t) + gk (t), (4.2.13)

where
gk(t) = [(1/2)(K + (M/3)t^(k+1)η)t − (1/2)(K + (M/3)t^k η) + L0t²] t^(k−1)η
      = [(1/2)(2L0t² + Kt − K) + (M/6)t^k η(t² − 1)] t^(k−1)η.   (4.2.14)
In particular, we get that
gk(α) ≤ 0,   (4.2.15)
since α ∈ (0, 1) and
2L0α² + Kα − K = 0   (4.2.16)

by the choice of α. Evidently (4.2.11) holds if

fk (α) ≤ 0 for each k = 1, 2, . . .. (4.2.17)

But in view of (4.2.13), (4.2.14) and (4.2.15) we have that

fk (α) ≤ f k−1 (α) ≤ · · · ≤ f 1 (α). (4.2.18)

Hence, (4.2.17) holds if
f1(α) ≤ 0,   (4.2.19)
which is true by the choice of η0. The induction for (4.2.7) is complete. Hence, the sequence {tn} is increasing, bounded from above by t?? and as such it converges to t?. Estimate (4.2.8) follows from (4.2.7) by standard majorization techniques [7, 8, 14, 15, 16, 18].
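The quantities in Lemma 4.2.1 are easy to script. The following minimal Python sketch (ours, not part of the original text; the constants are placeholders to be supplied by the user) evaluates α, η0 and η1 from (4.2.1)-(4.2.3), checks the criterion (4.2.4) and iterates the majorizing sequence (4.2.5):

import math

def majorizing_t(K, L0, M, eta, steps=10):
    alpha = 2 * K / (K + math.sqrt(K**2 + 8 * L0 * K))              # (4.2.1)
    A0 = K / 2 + (1 + alpha) * L0
    eta0 = 2 / (A0 + math.sqrt(A0**2 + 2 * M * alpha / 3))          # (4.2.2)
    A1 = K / 2 + alpha * L0
    eta1 = 2 * alpha / (A1 + math.sqrt(A1**2 + 2 * M * alpha / 3))  # (4.2.3)
    # criterion (4.2.4)
    bound = eta1 if L0 * eta <= (1 - alpha**2) / (2 + 2 * alpha - alpha**2) else eta0
    assert eta <= bound, "criterion (4.2.4) fails"
    t = [0.0, eta]
    for _ in range(steps):
        d = t[-1] - t[-2]
        t.append(t[-1] + (K + M * d / 3) * d**2 / (2 * (1 - L0 * t[-1])))  # (4.2.5)
    return alpha, t

For constants satisfying (4.2.4), the returned sequence is increasing and converges to t? below t?? = η/(1 − α), in agreement with (4.2.6)-(4.2.8).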

Let us denote by γ0 and γ1, respectively, the minimal positive zeros of the following equations with respect to η:

[K/2 + (M/6)α(t2 − t1)](t2 − t1) + L0(1 + α)(t2 − t1) + L0t1 − 1 = 0   (4.2.20)

and

[K/2 + (M/6)(t2 − t1)](t2 − t1) + αL0t2 − α = 0.   (4.2.21)

Let us set
γ = min{γ0, γ1, 1/L0}.   (4.2.22)
Then, we can show the following result.

Lemma 4.2.2. Suppose that

η ≤ γ  if  γ ≠ 1/L0,
η < γ  if  γ = 1/L0.   (4.2.23)
Then, the sequence {tn} generated by (4.2.5) is well defined, increasing, bounded from above by

t1?? = t1 + (t2 − t1)/(1 − α)   (4.2.24)

and converges to its unique least upper bound t1? ∈ [0, t1??]. Moreover, the following estimates hold for each n = 1, 2, ...:

tn+2 − tn+1 ≤ α^n (t2 − t1).   (4.2.25)



Proof. As in Lemma 4.2.1, we shall prove (4.2.25) using mathematical induction. We have by the choice of γ1 that

α1 = [K + (M/3)(t2 − t1)](t2 − t1) / [2(1 − L0t2)] ≤ α.   (4.2.26)

Then, it follows from (4.2.26) and (4.2.20) that

0 < t3 − t2 ≤ α(t2 − t1)

and

t3 ≤ t2 + α(t2 − t1) = t1 + (1 + α)(t2 − t1) = t1 + [(1 − α²)/(1 − α)](t2 − t1) < t1??.
1−α
Assume that
0 < αk ≤ α (4.2.27)
holds for all n ≤ k. Then, we get by (4.2.5) and (4.2.27) that

0 < tk+2 − tk+1 ≤ α^k (t2 − t1)   (4.2.28)

and

tk+2 ≤ t1 + [(1 − α^(k+1))/(1 − α)](t2 − t1) < t1??.   (4.2.29)
1−α
Estimate (4.2.27) is true, if k is replaced by k + 1 provided that
[K/2 + (M/6)(tk+2 − tk+1)](tk+2 − tk+1) ≤ α(1 − L0tk+2)
or
[K/2 + (M/6)α^k(t2 − t1)] α^k(t2 − t1) + αL0[t1 + ((1 − α^(k+1))/(1 − α))(t2 − t1)] − α ≤ 0.   (4.2.30)
2 6 1−α
Estimate (4.2.30) motivates us to define recurrent functions fk on [0, 1) by

fk(t) = [K/2 + (M/6)t^k(t2 − t1)] t^k(t2 − t1) + tL0(1 + t + ··· + t^k)(t2 − t1) − t(1 − L0t1).   (4.2.31)

We have that

fk+1(t) = fk(t) + [(1/2)(2L0t² + Kt − K) + (M/6)t^k(t² − 1)(t2 − t1)] t^k(t2 − t1).   (4.2.32)
2 6
In particular, we have by the choice of α that

fk+1(α) ≤ f k (α) ≤ · · · ≤ f 1 (α) ≤ 0. (4.2.33)



Evidently, estimate (4.2.30) holds if fk(α) ≤ 0, or, by (4.2.33), if

f1(α) ≤ 0,   (4.2.34)

which is true by the choice of γ0. The proof of the Lemma is complete.

Lemmas 4.2.1 and 4.2.2 admit the following useful extensions. The proofs are omitted, since they can simply be obtained by replacing η = t1 − t0 with tN+1 − tN, where N = 1, 2, ... for Lemma 4.2.3 and N = 2, 3, ... for Lemma 4.2.4.

Lemma 4.2.3. Suppose there exists N = 1, 2, ... such that

t0 < t1 < t2 < ··· < tN < tN+1 < 1/L0

and

tN+1 − tN ≤ η1  if  L0η ≤ (1 − α²)/(2 + 2α − α²),
tN+1 − tN ≤ η0  if  (1 − α²)/(2 + 2α − α²) ≤ L0η.
Then, the conclusions of Lemma 4.2.1 for sequence {tn } hold.

Lemma 4.2.4. Suppose there exists N = 2, 3, ... such that

t0 < t1 < t2 < ··· < tN < tN+1 < 1/L0

and

η ≤ γ  if  γ ≠ 1/L0,
η < γ  if  γ = 1/L0,
where γ is defined by (4.2.22) where t2 − t1 , t1 , t2 are replaced, respectively, by tN+1 − tN ,
tN , tN+1 . Then, the conclusions of Lemma 4.2.1 for sequence {tn } hold.

Remark 4.2.5. Another sequence related to Newton's method (4.1.2) is given by (see Theorem 4.4.1)

s0 = 0,  s1 = η,  s2 = s1 + [K0 + (M1/3)(s1 − s0)](s1 − s0)² / [2(1 − L0s1)],
sn+2 = sn+1 + [K + (M/3)(sn+1 − sn)](sn+1 − sn)² / [2(1 − L0sn+1)]   (4.2.35)

for each n = 1, 2, . . . and some K0 ∈ (0, K ], M1 ∈ (0, M ]. Then, a simple inductive argument
shows that

sn ≤ tn (4.2.36)
sn+1 − sn ≤ tn+1 − tn (4.2.37)

and

s? = lim_{n→∞} sn ≤ t?.   (4.2.38)

Moreover, if K0 < K or M1 < M then (4.2.36) and (4.2.37) hold as strict inequalities.
Clearly, sequence {sn } converges under the hypotheses of Lemma 4.2.1 or Lemma 4.2.2.
However, {sn } can converge under weaker hypotheses than those of Lemma 4.2.2. Indeed,
denote by γ10 and γ11 , respectively, the minimal positive zeros of equations
[K/2 + (M/6)α(s2 − s1)](s2 − s1) + L0(1 + α)(s2 − s1) + L0s1 − 1 = 0   (4.2.39)

and

[K0/2 + (M1/6)(s2 − s1)](s2 − s1) + αL0s2 − α = 0.   (4.2.40)

Set
γ1 = min{γ10, γ11, 1/L0}.   (4.2.41)
Then, we have that
γ ≤ γ1 . (4.2.42)
Moreover, the conclusions of Lemma 4.2.2 hold for the sequence {sn} if γ1 replaces γ in (4.2.23).
Note also that strict inequality can hold in (4.2.42) which implies that the sequence {sn }
– which is tighter than {tn } – converges under weaker conditions.

4.3. Majorizing Sequences II


We show convergence of sequences that are majorizing for Newton’s method (4.1.2) under
the (H) conditions.

Lemma 4.3.1. Let K0 > 0, L0 > 0, M0 > 0 and η > 0 with K0 ≤ L0. Define parameters a, θ0 and θ1 by

a = 2K0 / (K0 + √(K0² + 8K0L0)),   (4.3.1)

θ0 = 2 / [ K0/2 + (1 + a)L0 + √( (K0/2 + (1 + a)L0)² + 2M0(a + 3)/3 ) ]   (4.3.2)

and

θ1 = 2a / [ K0/2 + aL0 + √( (K0/2 + aL0)² + 2M0a/3 ) ].   (4.3.3)
Suppose that

η ≤ θ1  if  L0η ≤ (1 − a²)/(2 + 2a − a²),
η ≤ θ0  if  (1 − a²)/(2 + 2a − a²) ≤ L0η.   (4.3.4)
Then, the sequence {vn} generated by

v0 = 0,  v1 = η,  vn+2 = vn+1 + [(M0/6)(vn+1 − vn) + (M0/2)vn + K0/2](vn+1 − vn) / (1 − L0vn+1)   (4.3.5)

is well defined, increasing, bounded from above by

v?? = η/(1 − a)   (4.3.6)
and converges to its unique least upper bound v? which satisfies v? ∈ [0, v??]. Moreover the
following estimates hold

vn+1 − vn ≤ a^n η   (4.3.7)

and

v? − vn ≤ a^n η/(1 − a).   (4.3.8)
Proof. As in Lemma 4.2.1, we use mathematical induction to prove that

βk = [K0/2 + (M0/2)vk + (M0/6)(vk+1 − vk)](vk+1 − vk) / (1 − L0vk+1) ≤ a.   (4.3.9)

Estimate (4.3.9) holds for k = 0 by the choice of θ1. Let us assume that (4.3.9) holds for all k ≤ n. Then, we must prove that

[K0/2 + (M0/2)((1 − a^k)/(1 − a))η + (M0/6)a^k η] a^k η + aL0[(1 − a^(k+1))/(1 − a)]η − a ≤ 0.   (4.3.10)
Define recurrent functions fk on [0, 1) for each k = 1, 2, ... by

fk(t) = [K0/2 + (M0/2)(1 + t + ··· + t^(k−1))η + (M0/6)a^k η] t^(k−1)η + L0(1 + t + ··· + t^k)η − 1.   (4.3.11)

Using (4.3.11), we get that

fk+1(a) = fk(a) + [(1/2)(2L0a² + K0a − K0) + (M0/6)(a^(k+2) + 3a^(k+1) + 2a^k − 3)η] a^(k−1)η ≤ fk(a),   (4.3.12)

since a given by (4.3.1) solves the equation 2L0a² + K0a − K0 = 0, and a^(k+2) + 3a^(k+1) + 2a^k − 3 ≤ 0 for each k = 1, 2, ... if a ∈ [0, 1/2]. Evidently, it follows from (4.3.12) that (4.3.10) holds, which is true by the choice of θ0.
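As a quick illustration, the sequence (4.3.5) can be generated numerically. The Python sketch below is ours; the constants in the sample call are those of Example 2 in Section 4.6, and the run reproduces the column vn of Table 4.6.3:

def majorizing_v(K0, L0, M0, eta, steps=8):
    v = [0.0, eta]
    for _ in range(steps):
        d = v[-1] - v[-2]
        # one step of (4.3.5)
        v.append(v[-1] + (M0 * d / 6 + M0 * v[-2] / 2 + K0 / 2) * d / (1 - L0 * v[-1]))
    return v

print(majorizing_v(11 / 15, 49 / 70, 6 / 7, 1 / 7))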

Denote by δ0 and δ1, respectively, the minimal positive zeros of the equations

[K0/2 + (M0/2)v2 + (M0/6)a(v2 − v1)](v2 − v1) + L0(v1 + (1 + a)(v2 − v1)) − 1 = 0   (4.3.13)

and

[(M0/6)(v2 − v1) + (M0/2)v1 + K0/2](v2 − v1) + aL0v2 − a = 0.   (4.3.14)

Set
δ = min{δ0, δ1, 1/L0}.   (4.3.15)
Then, we can show:

Lemma 4.3.2. Suppose that

η ≤ δ  if  δ ≠ 1/L0,
η < δ  if  δ = 1/L0.   (4.3.16)

Then, the sequence {vn} generated by equation (4.3.5) is well defined, increasing, bounded from above by

v1?? = v1 + (v2 − v1)/(1 − a)   (4.3.17)

and converges to its unique least upper bound v1? which satisfies v1? ∈ [0, v1??]. Moreover, the following estimates hold for each n = 1, 2, 3, ...:

vn+2 − vn+1 ≤ a^n (v2 − v1).   (4.3.18)

Proof. We have that β1 ≤ a by the choice of δ1. This time we must have

[K0/2 + (M0/2)(v1 + ((1 − a^k)/(1 − a))(v2 − v1)) + (M0/6)a^k(v2 − v1)] a^k(v2 − v1) + aL0[v1 + ((1 − a^(k+1))/(1 − a))(v2 − v1)] − a ≤ 0.   (4.3.19)

Define functions fk on [0, 1) by

fk(t) = [K0/2 + (M0/2)(v1 + ((1 − t^k)/(1 − t))(v2 − v1)) + (M0/6)a^k(v2 − v1)] t^k(v2 − v1) + tL0[v1 + ((1 − t^(k+1))/(1 − t))(v2 − v1)] − t.   (4.3.20)

We have that

fk+1(a) = fk(a) + [(1/2)(2L0a² + K0a − K0) + (M0/6)(v2 − v1)(3(a − 1)a + a^(k+2) + 3a^(k+1) + 2a^k − 3)] a^k(v2 − v1).

Thus
fk+1(a) ≤ fk(a) ≤ ··· ≤ f1(a).   (4.3.21)

But by the choice of δ0 we have that f1(a) ≤ 0.

Remark 4.3.3. A sequence related to Newton's method (4.1.2) under the (H) conditions is defined by

u0 = 0,  u1 = η,  u2 = u1 + [K0 + (M1/3)(u1 − u0)](u1 − u0)² / [2(1 − L0u1)],
un+2 = un+1 + [K0 + (M0/3)(un+1 − un)](un+1 − un)² / [2(1 − L0un+1)]   (4.3.22)
for each n = 1, 2, . . . and M1 ∈ (0, M ]. Then, a simple inductive argument shows that for
each n = 2, 3, . . .

un ≤ vn (4.3.23)
un+1 − un ≤ vn+1 − vn (4.3.24)

and

u? = lim_{n→∞} un ≤ v?.   (4.3.25)

Moreover, if K0 < K or M1 < M0 then (4.3.23) and (4.3.24) hold as strict inequalities.
Sequence {un } converges under the hypotheses of Lemma 4.3.1 or 4.3.2. However, {un }
can converge under weaker hypotheses than those of Lemma 4.3.2. Indeed, denote by δ10
and δ11 , respectively, the minimal positive zeros of equations
[K0/2 + (M0/2)u2 + (M0/6)(u2 − u1)](u2 − u1) + L0(u1 + (1 + a)(u2 − u1)) − 1 = 0   (4.3.26)

and

[(M0/6)(u2 − u1) + (M0/2)u1 + K0/2](u2 − u1) + aL0u2 − a = 0.   (4.3.27)

Set
δ1 = min{δ10 , δ11 , 1/L0}. (4.3.28)
Then, we have that
δ ≤ δ1 .
Moreover, the conclusions of Lemma 4.3.2 hold for the sequence {un} if δ1 replaces δ in (4.3.16). Note also that strict inequality may hold in δ ≤ δ1, which implies that the sequence {un} (which is tighter than {vn}) converges under weaker conditions. Finally, note that the sequence {tn} is tighter than {vn}, although the sufficient convergence conditions for {vn} are weaker than those for {tn}.

Lemmas similar to Lemma 4.2.3 and Lemma 4.2.4 for the sequence {vn} follow in an analogous way.

4.4. Semilocal Convergence


We present the semilocal convergence of Newton’s method (4.1.2) first under the (C) and
then under the (H) conditions. Let u(x, R) and U(x, R) stand, respectively, for the open and
closed balls in X centered at x ∈ X and of radius R > 0.

Theorem 4.4.1. Let F : D ⊆ X → Y be twice Fréchet differentiable. Suppose that the (C) conditions and the hypotheses of Lemma 4.2.1 hold, and

U(x0, t?) ⊆ D.   (4.4.1)

Then, the sequence {xn } defined by Newton’s method (4.1.2) is well defined, remains in
U(x0 ,t ? ) for all n ≥ 0 and converges to a unique solution x? ∈ U(x0 ,t ? ) of equation F (x) =
0. Moreover, the following estimates hold for all n ≥ 0

kxn+2 − xn+1 k ≤ tn+2 − tn+1 (4.4.2)

and

kxn − x? k ≤ t ? − tn , (4.4.3)

where the sequence {tn} (n ≥ 0) is given by (4.2.5). Furthermore, if there exists R ≥ t? such that

U(x0, R) ⊆ D   (4.4.4)

and

L0(t? + R) ≤ 2,   (4.4.5)

then the solution x? is unique in U(x0, R).



Proof. Let us prove that

kxk+1 − xk k ≤ tk+1 − tk (4.4.6)

and

U(xk+1 ,t ? − tk+1 ) ⊆ U(xk ,t ? − tk ) (4.4.7)

hold for all k ≥ 0. For every z ∈ U(x1 ,t ? − t1 )

kz − x0 k ≤ kz − x1 k + kx1 − x0 k
≤ (t ? − t1 ) + (t1 − t0 ) = t ? − t0 ,

implies that z ∈ U(x0, t? − t0). Since also

‖x1 − x0‖ = ‖F′(x0)⁻¹F(x0)‖ ≤ η = t1 − t0,

estimates (4.4.6) and (4.4.7) hold for k = 0. Given that they hold for n = 0, 1, 2, ..., k, we have
k+1 k+1
kxk+1 − x0 k ≤ ∑ kxi − xi−1 k ≤ ∑ (ti − ti−1) = tk+1 − t0 = tk+1 (4.4.8)
i=1 i=1

and

kxk + θ(xk+1 − xk ) − x0 k ≤ tk + θ(tk+1 − tk ) ≤ t ? , (4.4.9)

for all θ ∈ [0, 1]. Using (4.1.2), we obtain the approximation

F (xk+1) = F (xk+1) − F (xk ) − F 0 (xk )(xk+1 − xk )


Z 1
= [F 0 (xk + θ(xk+1 − xk )) − F 0 (xk )](xk+1 − xk )dθ (4.4.10)
0
Z 1
= F 00 (xk + θ(xk+1 − xk ))(1 − θ)(xk+1 − xk )2dθ.
0

Then, we get by (C3), (C4) and (4.4.1) that

‖F′(x0)⁻¹F(xk+1)‖ ≤ ∫₀¹ (‖F′(x0)⁻¹[F″(xk + θ(xk+1 − xk)) − F″(x?)]‖ + ‖F′(x0)⁻¹F″(x?)‖)(1 − θ)dθ ‖xk+1 − xk‖²
 ≤ [K̄/2 + M̄‖xk+1 − xk‖ ∫₀¹ θ(1 − θ)dθ] ‖xk+1 − xk‖²
 = (M̄/6)‖xk+1 − xk‖³ + (K̄/2)‖xk+1 − xk‖²
 ≤ [(M̄/6)(tk+1 − tk) + K̄/2](tk+1 − tk)²,   (4.4.11)

where K̄ = K0 and M̄ = M0 if k = 0, while K̄ = K and M̄ = M if k > 0.
Using (C5), we obtain that

‖F′(x0)⁻¹(F′(xk+1) − F′(x0))‖ ≤ L0‖xk+1 − x0‖ ≤ L0tk+1 ≤ L0t? < 1.   (4.4.12)

It follows from the Banach lemma on invertible operators [7, 8, 14, 15, 16] and (4.4.12) that F′(xk+1)⁻¹ exists and

‖F′(xk+1)⁻¹F′(x0)‖ ≤ (1 − L0‖xk+1 − x0‖)⁻¹ ≤ (1 − L0tk+1)⁻¹.   (4.4.13)

Therefore, by (4.1.2), (4.4.11) and (4.4.13), we obtain in turn

‖xk+2 − xk+1‖ ≤ ‖F′(xk+1)⁻¹F(xk+1)‖ ≤ ‖F′(xk+1)⁻¹F′(x0)‖ ‖F′(x0)⁻¹F(xk+1)‖ ≤ tk+2 − tk+1.   (4.4.14)

Thus for every z ∈ U(xk+2 ,t ? − tk+2 ), we have

kz − xk+1 k ≤ kz − xk+2 k + kxk+2 − xk+1 k


≤ t ? − tk+2 + tk+2 − tk+1 = t ? − tk+1. (4.4.15)

That is,
z ∈ U(xk+1 ,t ? − tk+1 ). (4.4.16)
Estimates (4.4.13) and (4.4.16) imply that (4.4.6) and (4.4.7) hold for n = k + 1. The proof
of (4.4.6) and (4.4.7) is now complete by induction.
Lemma 4.2.1 implies that sequence {tn } is a Cauchy sequence. From (4.4.6) and (4.4.7),
{xn } (n ≥ 0) becomes a Cauchy sequence too and as such it converges to some x? ∈ U(x0 ,t ? )
(since U(x0 ,t ? ) is a closed set). Estimate (4.4.3) follows from (4.4.2) by using standard
majorization techniques [7, 8, 14, 15, 16, 18]. Moreover, by letting k → ∞ in (4.4.11), we
obtain F (x? ) = 0. Finally, to show uniqueness let y? be a solution of equation F (x) = 0 in
U(x0, R). It follows from (C5), for x = y? + θ(x? − y?), θ ∈ [0, 1], that

‖∫₀¹ F′(x0)⁻¹(F′(y? + θ(x? − y?)) − F′(x0))dθ‖ ≤ L0 ∫₀¹ ‖y? + θ(x? − y?) − x0‖dθ
 ≤ L0 ∫₀¹ (θ‖x? − x0‖ + (1 − θ)‖y? − x0‖)dθ ≤ (L0/2)(t? + R) ≤ 1  (by (4.4.5)),

and the Banach lemma on invertible operators implies that the linear operator T?? = ∫₀¹ F′(y? + θ(x? − y?))dθ is invertible. Using the identity 0 = F(x?) − F(y?) = T??(x? − y?), we deduce that x? = y?.
Similarly, we show the uniqueness in U(x0 ,t ? ) by setting t ? = R. That completes the
proof of Theorem 4.4.1.

Remark 4.4.2. The conclusions of Theorem 4.4.1 hold if {tn }, t ? are replaced by {rn }, r? ,
respectively.

Using the approximation

F′(x0)⁻¹F(xk+1) = ∫₀¹ F′(x0)⁻¹[F″(xk + θ(xk+1 − xk)) − F″(x0)](xk+1 − xk)²(1 − θ)dθ + ∫₀¹ F′(x0)⁻¹F″(x0)(1 − θ)dθ (xk+1 − xk)²   (4.4.17)

instead of (4.4.11), and (C6), (C7) instead of (C3), (C4), respectively, we arrive at the following semilocal convergence result under the (H) conditions (see [8, Theorem 6.3.7, p. 210] for the proof).

Theorem 4.4.3. Let F : D ⊆ X → Y be twice Fréchet differentiable. Furthermore, suppose that the (H) conditions,

U(x0, v?) ⊆ D,   (4.4.18)

and the hypotheses of Lemma 4.3.1 hold. Then, the sequence {xn} generated by Newton's method (4.1.2) is well defined, remains in U(x0, v?) for all n ≥ 0 and converges to a unique solution x? ∈ U(x0, v?) of equation F(x) = 0. Moreover, the following estimates hold for all n ≥ 0:
kxn+2 − xn+1 k ≤ vn+2 − vn+1 (4.4.19)

and

kxn − x? k ≤ v? − vn (4.4.20)

where the sequence {vn} (n ≥ 0) is given by (4.3.5). Furthermore, if there exists R ≥ v? such that

U(x0, R) ⊆ D   (4.4.21)

and

L0(v? + R) ≤ 2,   (4.4.22)

then the solution x? is unique in U(x0, R).



4.5. Local Convergence


We study the local convergence of Newton's method under the (A) conditions:

A1. there exists x? ∈ D such that F(x?) = 0 and F′(x?)⁻¹ ∈ L(Y, X);

A2. ‖F′(x?)⁻¹F″(x?)‖ ≤ b;

A3. ‖F′(x?)⁻¹[F″(x) − F″(x?)]‖ ≤ c‖x − x?‖ for each x ∈ D;

A4. ‖F′(x?)⁻¹[F′(x) − F′(x?)]‖ ≤ d‖x − x?‖ for each x ∈ D.

Note also that in view of (A3) and (A4), respectively, there exist c0 ∈ (0, c] and d0 ∈ (0, d] such that for each θ ∈ [0, 1]

A′3. ‖F′(x?)⁻¹[F″(x0 + θ(x? − x0)) − F″(x?)]‖ ≤ c0(1 − θ)‖x0 − x?‖;

A′4. ‖F′(x?)⁻¹[F′(x0) − F′(x?)]‖ ≤ d0‖x0 − x?‖.

Then, we can show:

Theorem 4.5.1. Suppose that (A) conditions hold and

U(x? , r) ⊆ D , (4.5.1)

where

r = 2 / [ b/2 + d + √( (b/2 + d)² + 4c/3 ) ].   (4.5.2)
Then, sequence {xn } (starting from x0 ∈ U(x? , r)) generated by Newton’s method (4.1.2) is
well defined, remains in U(x? , r) for all n ≥ 0 and converges to x? . Moreover the following
estimates hold

‖xn+1 − x?‖ ≤ en‖xn − x?‖²,   (4.5.3)

where

en = [(c̄/3)‖xn − x?‖ + b/2] / (1 − d̄‖xn − x?‖)  and  q(t) = [(c̄/3)t + b/2] t / (1 − d̄t),   (4.5.4)

with c̄ = c0, d̄ = d0 if n = 0 and c̄ = c, d̄ = d if n > 0.

Proof. The starting point x0 ∈ U(x?, r). Suppose that xk ∈ U(x?, r) for all k ≤ n. Using (A4) and the definition of r, we get that

‖F′(x?)⁻¹(F′(xk) − F′(x?))‖ ≤ d̄‖xk − x?‖ < d̄r < 1.   (4.5.5)

It follows from (4.5.5) and the Banach lemma on invertible operators that F′(xk)⁻¹ exists and

‖F′(xk)⁻¹F′(x?)‖ ≤ 1 / (1 − d̄‖xk − x?‖).   (4.5.6)

Hence, xk+1 exists. Using (4.1.2), we obtain the approximation

x? − xk+1 = −F′(xk)⁻¹F′(x?) ∫₀¹ F′(x?)⁻¹[F″(xk + θ(x? − xk)) − F″(x?) + F″(x?)](x? − xk)²(1 − θ)dθ.   (4.5.7)

In view of (A2), (A3), (A4), (4.5.6), (4.5.7) and the choice of r, we have in turn that

‖xk+1 − x?‖ ≤ [c̄ ∫₀¹(1 − θ)²dθ ‖xk − x?‖³ + b ∫₀¹(1 − θ)dθ ‖xk − x?‖²] / (1 − d̄‖xk − x?‖)
 ≤ ek‖xk − x?‖² < q(r)‖xk − x?‖ = ‖xk − x?‖,   (4.5.8)

which implies that xk+1 ∈ U(x?, r) and lim_{k→∞} xk = x?.

Remark 4.5.2. The local results can be used for projection methods such as Arnoldi's, the generalized minimum residual method (GMRES) and the generalized conjugate residual method (GCR), for combined Newton/finite projection methods, and in connection with the mesh independence principle to develop the cheapest and most efficient mesh refinement strategies
[7, 8, 4, 15, 16]. These results can also be used to solve equations of the form (4.1.1),
where F 0 , F 00 satisfy differential equations of the form

F 0 (x) = P (F (x)) and F 00 (x) = Q (F (x)). (4.5.9)

where, P and Q are known operators. Since, F 0 (x? ) = P (F (x? )) = P (0) and F 00 (x? ) =
Q (F (x?)) = Q (0) we can apply our results without actually knowing the solution x? of
equation (4.1.1).

4.6. Numerical Examples


Example 1. Let X = Y = R be equipped with the max-norm, x0 = ω and D = [−exp(1), exp(1)]. Let us define F on D by

F(x) = x³ − exp(1).   (4.6.1)

Here, ω ∈ D. Through some algebraic manipulations, we obtain

η = |ω³ − exp(1)|/(3ω²),  K = 4exp(1)/ω²,  L0 = (2exp(1) + ω)/ω²,  K0 = 2/ω,
M = 2/ω²,  M0 = 2/ω².

For ω = 0.48exp(1), the criteria (4.1.3) and (4.1.6) yield

0.09730789545 ≤ 0.07755074734  and  0.09730789545 ≤ 0.2856823952,



Table 4.6.1. Newton's method applied to (4.6.1)

n xn kxn+2 − xn+1 k kxn − x? k


0 1.30e + 00 6.44e − 03 9.08e − 02
1 1.40e + 00 2.98e − 05 6.47e − 03
2 1.40e + 00 6.37e − 10 2.98e − 05
3 1.40e + 00 2.91e − 19 6.37e − 10
4 1.40e + 00 6.06e − 38 2.91e − 19
5 1.40e + 00 2.63e − 75 6.06e − 38
6 1.40e + 00 4.95e − 150 2.63e − 75
7 1.40e + 00 1.76e − 299 4.95e − 150
8 1.40e + 00 2.22e − 598 1.76e − 299
9 1.40e + 00 3.52e − 1196 2.22e − 598

respectively. Thus we observe that the criterion (4.1.3) fails while the criterion (4.1.6) holds. From the hypothesis of Lemma 4.2.1, we get

0.09730789545 ≤ 0.2017739733  if 0.08268226632 ≤ 0.2499999999,
0.09730789545 ≤ 0.2036729480  if 0.2499999999 ≤ 0.08268226632.

Thus the hypotheses of Lemma 4.2.1 hold. As a consequence, we can apply Theorem 4.4.1. Table 4.6.1 reports the convergence behavior of Newton's method (4.1.2) applied to (4.6.1) with x0 = ω = 0.48exp(1). Numerical computations are performed to 2005 decimal digits of accuracy by employing the high-precision library ARPREC. Table 4.6.2 reports the behavior of the sequence {tn} of (4.2.5). Comparing Tables 4.6.1 and 4.6.2, we observe that

Table 4.6.2. The sequence {tn} of (4.2.5)

n tn tn+2 − tn+1 t ? − tn
0 0.00e + 00 4.95e − 02 1.69e − 01
1 9.73e − 02 1.87e − 02 7.16e − 02
2 1.47e − 01 3.26e − 03 2.21e − 02
3 1.66e − 01 1.02e − 04 3.36e − 03
4 1.69e − 01 1.01e − 07 1.02e − 04
5 1.69e − 01 9.75e − 14 1.01e − 07
6 1.69e − 01 9.16e − 26 9.75e − 14
7 1.69e − 01 8.08e − 50 9.16e − 26
8 1.69e − 01 6.30e − 98 8.08e − 50
9 1.69e − 01 3.82e − 194 6.30e − 98

the estimates of Theorem 4.4.1 hold.
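For the reader who wishes to reproduce these tables, the following Python sketch (ours) runs Newton's method (4.1.2) on (4.6.1) alongside the majorizing sequence (4.2.5); it works in double precision only, so the displayed errors stall near 10^(-16) rather than reaching the 2005-digit accuracy of ARPREC:

import math

omega = 0.48 * math.e
F = lambda x: x**3 - math.e
dF = lambda x: 3 * x**2
x_star = math.e ** (1.0 / 3.0)                # exact solution of (4.6.1)

eta = abs(omega**3 - math.e) / (3 * omega**2)
K = 4 * math.e / omega**2
M = 2 / omega**2
L0 = (2 * math.e + omega) / omega**2

x, t = omega, [0.0, eta]
for n in range(9):
    x = x - F(x) / dF(x)                      # Newton's method (4.1.2)
    d = t[-1] - t[-2]
    t.append(t[-1] + (K + M * d / 3) * d**2 / (2 * (1 - L0 * t[-1])))  # (4.2.5)
    print(n, abs(x - x_star), t[-1] - t[-2])  # compare with (4.4.2)-(4.4.3)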

Example 2. In this example, we provide an application of our results to a special nonlinear



Hammerstein integral equation of the second kind. Consider the integral equation
x(s) = 1 + (4/5) ∫₀¹ G(s, t)x(t)³dt, s ∈ [0, 1],   (4.6.2)

where, G is the Green kernel on [0, 1] × [0, 1] defined by


G(s, t) = t(1 − s) if t ≤ s,  and  G(s, t) = s(1 − t) if s ≤ t.   (4.6.3)

Let X = Y = C [0, 1] and D be a suitable open convex subset of X1 := {x ∈ X : x(s) > 0, s ∈


[0, 1]}, which will be given below. Define F : D → Y by
[F(x)](s) = x(s) − 1 − (4/5) ∫₀¹ G(s, t)x(t)³dt, s ∈ [0, 1].   (4.6.4)

The first and second derivatives of F are given by


[F′(x)y](s) = y(s) − (12/5) ∫₀¹ G(s, t)x(t)²y(t)dt, s ∈ [0, 1],   (4.6.5)

and

[F″(x)yz](s) = (24/5) ∫₀¹ G(s, t)x(t)y(t)z(t)dt, s ∈ [0, 1],   (4.6.6)

respectively. We use the max-norm. Let x0(s) = 1 for all s ∈ [0, 1]. Then, for any y ∈ D, we have

[(I − F′(x0))y](s) = (12/5) ∫₀¹ G(s, t)y(t)dt, s ∈ [0, 1],   (4.6.7)

which means

‖I − F′(x0)‖ ≤ (12/5) max_{s∈[0,1]} ∫₀¹ G(s, t)dt = 12/(5 × 8) = 3/10 < 1.   (4.6.8)

It follows from the Banach theorem that F′(x0)⁻¹ exists and

‖F′(x0)⁻¹‖ ≤ 1/(1 − 3/10) = 10/7.   (4.6.9)
On the other hand, we have from (4.6.4) that

‖F(x0)‖ = (4/5) max_{s∈[0,1]} ∫₀¹ G(s, t)dt = 1/10.

Then, we get η = 1/7. Note that F″(x) is not bounded in X or its subset X1. Taking into account that a solution x? of equation (4.1.1) with F given by (4.6.4) must satisfy
‖x?‖ − 1 − (1/10)‖x?‖³ ≤ 0,   (4.6.10)

i.e., ‖x?‖ ≤ ρ1 = 1.153467305 or ‖x?‖ ≥ ρ2 = 2.423622140, where ρ1 and ρ2 are the positive roots of the real equation z − 1 − z³/10 = 0. Consequently, if we look for a solution such that ‖x?‖ < ρ1 and x? ∈ X1, we can consider D := {x ∈ X1 : ‖x‖ < r}, with r ∈ (ρ1, ρ2), as a nonempty open convex subset of X. For example, choose r = 1.7. Using (4.6.5) and (4.6.6), we have that for any x, y, z ∈ D,
‖[(F′(x) − F′(x0))y](s)‖ = (12/5) |∫₀¹ G(s, t)(x(t)² − x0(t)²)y(t)dt|
 ≤ (12/5) ∫₀¹ G(s, t) ‖x − x0‖ ‖x + x0‖ ‖y‖ dt
 ≤ (12/5)(r + 1) ‖x − x0‖ ‖y‖ ∫₀¹ G(s, t)dt, s ∈ [0, 1],   (4.6.11)

and

‖[F″(x)yz](s)‖ = (24/5) |∫₀¹ G(s, t)x(t)y(t)z(t)dt|, s ∈ [0, 1].   (4.6.12)

Then, we get

‖F′(x) − F′(x0)‖ ≤ (12/5)(1/8)(r + 1)‖x − x0‖ = (81/100)‖x − x0‖,   (4.6.13)

‖F″(x)‖ ≤ (24/5) × (r/8) = 51/50   (4.6.14)

and, for any x, x̄ ∈ D,

‖[(F″(x) − F″(x̄))yz](s)‖ = (24/5) |∫₀¹ G(s, t)(x(t) − x̄(t))y(t)z(t)dt|   (4.6.15)
 ≤ (24/5)(1/8)‖x − x̄‖ = (3/5)‖x − x̄‖.   (4.6.16)
Now we can choose the constants as follows:

η = 1/7,  M = 6/7,  M0 = 6/7,  K = 51/35,  L0 = 49/70  and  K0 = 11/15.
From (4.1.3) and (4.1.5), we obtain

0.1428571429 < 0.3070646192 and R1 = 0.1627780248.

From (4.1.6) and (4.1.8), we get

0.1428571429 < 0.4988741112 and R2 = 0.1518068730.

From the hypotheses (4.2.4) and (4.3.4) we get

1/7 ≤ 0.5047037049  if 0.1000000000 ≤ 0.2131833880,
1/7 ≤ 0.5228360736  if 0.2131833880 ≤ 0.1000000000

Table 4.6.3. Comparison among the sequences (4.2.5), (4.2.35), (4.3.5) and (4.3.22)

n tn sn vn un
0 0.000000e + 00 0.000000e + 00 0.000000e + 00 0.000000e + 00
1 1.428571e − 01 1.428571e − 01 1.428571e − 01 1.428571e − 01
2 1.598408e − 01 1.514801e − 01 2.042976e − 01 1.516343e − 01
3 1.600782e − 01 1.515408e − 01 2.356037e − 01 1.516661e − 01
4 1.600783e − 01 1.515408e − 01 2.527997e − 01 1.516661e − 01
5 1.600783e − 01 1.515408e − 01 2.626215e − 01 1.516661e − 01
6 1.600783e − 01 1.515408e − 01 2.683548e − 01 1.516661e − 01
7 1.600783e − 01 1.515408e − 01 2.717435e − 01 1.516661e − 01
8 1.600783e − 01 1.515408e − 01 2.737612e − 01 1.516661e − 01
9 1.600783e − 01 1.515408e − 01 2.749678e − 01 1.516661e − 01

and

1/7 ≤ 0.6257238049  if 0.1000000000 ≤ 0.2691240473,
1/7 ≤ 0.5832936968  if 0.2691240473 ≤ 0.1000000000,

respectively. Thus hypotheses (4.2.4) and (4.3.4) hold. A comparison among the sequences (4.2.5), (4.2.35), (4.3.5) and (4.3.22) is reported in Table 4.6.3, where we observe that the estimates (4.2.36) and (4.3.23) hold.
Concerning the uniqueness balls: from equation (4.1.5) we get R1 = 0.1627780248 and from equation (4.1.8) we get R2 = 0.1518068730, whereas from Theorem 4.4.1 we get R ≤ 1.257142857. Therefore, the new approach provides the largest uniqueness ball.
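A discretized version of this example can also be solved directly. The sketch below is a hypothetical collocation of (4.6.2) on a uniform grid with trapezoidal quadrature (the grid size and the tolerance are our own choices), followed by Newton's method (4.1.2):

import numpy as np

m = 50
s = np.linspace(0, 1, m + 2)[1:-1]                          # interior nodes
w = 1.0 / (m + 1)                                           # quadrature weight on a uniform grid
G = np.minimum.outer(s, s) * (1 - np.maximum.outer(s, s))   # Green kernel (4.6.3)

x = np.ones(m)                                              # x0(s) = 1
for _ in range(10):
    Fx = x - 1 - 0.8 * w * G @ x**3                         # discretized (4.6.4), 4/5 = 0.8
    J = np.eye(m) - 2.4 * w * G * x**2                      # discretized (4.6.5), 12/5 = 2.4
    step = np.linalg.solve(J, Fx)
    x -= step
    if np.max(np.abs(step)) < 1e-14:
        break
print(np.max(np.abs(x)))                                    # approximates kx?k, below ρ1 = 1.1534...

The computed solution indeed stays below ρ1, consistent with the uniqueness discussion above.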
Example 3. Let us consider the case when X = Y = R, D = U(0, 1) and define F on D
by
F (x) = ex − 1. (4.6.17)
Then, we can define P (x) = x + 1 and Q (x) = x + 1. In order for us to compare our radius
of convergence with earlier ones, let us introduce the Lipschitz condition
‖F′(x?)⁻¹(F′(x) − F′(y))‖ ≤ L‖x − y‖ for each x, y ∈ D.   (4.6.18)

The radius of convergence given by Traub-Wozniakowski [7, 8, 16] is

r0 = 2/(3L),   (4.6.19)

and the radius of convergence given by us in [5, 6, 7, 8] is

r1 = 2/(2d + L).   (4.6.20)
Using (A2 ), (A3 ), (A4 ) and (4.6.18), we get that b = 1, c = d = e − 1 and L = e. Then, using
(4.5.2), (4.6.19) and (4.6.20), we obtain

r = 0.4078499356 > r1 = 0.324947231 > r0 = 0.245252961.
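These radii are straightforward to verify numerically; here is a short check (ours) of (4.5.2), (4.6.19) and (4.6.20):

import math

b, L = 1.0, math.e
c = d = math.e - 1
A = b / 2 + d
r = 2 / (A + math.sqrt(A**2 + 4 * c / 3))   # (4.5.2)
r1 = 2 / (2 * d + L)                        # (4.6.20)
r0 = 2 / (3 * L)                            # (4.6.19)
print(r, r1, r0)                            # 0.40785 > 0.32495 > 0.24525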



Example 4. Let X = Y = R³, D = U(0, 1), x∗ = (0, 0, 0)ᵀ and define the function F on D by

F(x, y, z) = (eˣ − 1, ((e − 1)/2)y² + y, z)ᵀ.   (4.6.21)

We have that, for u = (x, y, z),

F′(u) = diag(eˣ, (e − 1)y + 1, 1),

while the only nonzero entries of the bilinear operator F″(u) and the trilinear operator F‴(u) are

[F″(u)]¹₁₁ = eˣ,  [F″(u)]²₂₂ = e − 1  and  [F‴(u)]¹₁₁₁ = eˣ.

Using the (A) and (A′) conditions and F′(x∗) = diag(1, 1, 1), we set

b = 1.0,  c = c0 = c̄ = d = d0 = d̄ = e − 1,  L = e  and  L0 = e − 1.

We obtain r = 0.4078499356. Thus, r0 < r.

Table 4.6.4. Comparison among various iterative procedures

n kxn+1 − x? k en kxn − x? k2 λn kxn − x? k2 µn kxn − x? k2


1 0.034624745433299 0.292667362771974 0.479494429606589 15.944478671072201
2 0.000669491177317 0.000677347930013 0.001732513344520 0.001798733838791
3 0.000000347374133 0.000000224639537 0.000000609893622 0.000000610302684
4 0.000000000000103 0.000000000000060 0.000000000000164 0.000000000000164
5 0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000

The following error estimates have been used before:

‖xn+1 − x?‖ ≤ pn‖xn − x?‖²  [6, 7, 8, 12],
‖xn+1 − x?‖ ≤ λn‖xn − x?‖²  [6, 7, 8],
‖xn+1 − x?‖ ≤ µn‖xn − x?‖²  [16]

and

‖xn+1 − x?‖ ≤ ξn‖xn − x?‖²  [6, 7, 8, 16],



Table 4.6.5. Comparison among various iterative procedures

n ξn kxn − x? k2
1 0.240445748047369
2 0.000661013573819
3 0.000000224531576
4 0.000000000000060
5 0.000000000000000

where

pn = [(L/3)‖xn − x?‖ + b/2] / (1 − d‖xn − x?‖),  λn = (L/2) / (1 − L0‖xn − x?‖),
µn = (L/2) / (1 − L‖xn − x?‖)  and  ξn = [(L/3)‖xn − x?‖ + b/2] / [1 − ((L/2)‖xn − x?‖ + b)‖xn − x?‖].

To compare the above bounds with the estimate (4.5.3), we produce the comparison Tables 4.6.4 and 4.6.5 by applying Newton's method (4.1.2) to the equation (4.6.21) with x0 = (0.21, 0.21, 0.21)ᵀ. In Table 4.6.4, we note that the estimate (4.5.3) of Theorem 4.5.1 holds.
References

[1] Amat, S., C. Bermúdez, Busquier, S., Legaz, M. J., Plaza, S., On a Family of High-
Order Iterative Methods under Kantorovich Conditions and Some Applications, Abst.
Appl. Anal., 2012, (2012).

[2] Amat, S., Bermúdez, C., Busquier, S., Plaza, S., On a third-order Newton-type
method free of bilinear operators, Numer. Linear Alg. with Appl., 17(4) (2010), 639-
653.

[3] Amat, S., Busquier, S., Third-order iterative methods under Kantorovich conditions,
J. Math. Anal. App., 336(1) (2007), 243-261.

[4] Amat, S., Busquier, S., Gutiérrez, J. M., Third-order iterative methods with applica-
tions to Hammerstein equations: A unified approach, J. Comput. App. Math., 235(9)
(2011), 2936-2943.

[5] Argyros, I.K., A Newton-Kantorovich theorem for equations involving m-Fréchet


differentiable operators and applications in radiative transfer, J. Comp. App. Math.,
131 (2001), 149-159.

[6] Argyros, I.K., On the Newton-Kantorovich hypothesis for solving equations, Journal
of Computation and Applied Mathematics, 169(2) (2004), 315-332.

[7] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its appli-
cations, CRC Press/Taylor and Francis Publications, New York, 2012.

[8] Argyros, I.K., Computational theory of iterative methods, Elsevier, 2007.

[9] Ezquerro, J.A., González, D., Hernández, M. A., Majorizing sequences for Newton’s
method from initial value problems, J. Comp. Appl. Math., 236 (2012), 2246-2258.

[10] Ezquerro, J.A., Hernández, M.A., Generalized differentiability conditions for New-
ton’s method, IMA J. Numer. Anal., 22(2) (2002), 187-205.

[11] Ezquerro, J.A., González, D., Hernández, M.A., On the local convergence of New-
ton’s method under generalized conditions of Kantorovich, App. Math. Let., 26
(2013), 566-570.

[12] Gutiérrez, J.M., A new semilocal convergence theorem for Newton's method, J. Comp. App. Math., 79 (1997), 131-145.

[13] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.

[14] Kantorovich, L.V., The majorant principle and Newton's method, Doklady Akademii Nauk SSSR, 76 (1951), 17-20. (In Russian).

[15] Ostrowski, A.M., Solution of equations in Euclidean and Banach spaces, Academic
Press, New York, 3rd Ed. 1973.

[16] Traub, J. F., Iterative Methods for the Solution of Equations, American Mathematical
Soc., 1982.

[17] Werner, W., Some improvements of classical iterative methods for the solution of
nonlinear equations, Numerical Solution of Nonlinear Equations Lecture Notes in
Mathematics, 878/1981 (1981), 426-440.

[18] Yamamoto, T., On the method of tangent hyperbolas in Banach spaces, Journal of
Computational and Applied Mathematics, 21(1), (1988), 75-86.
Chapter 5

Expanding the Applicability of Newton's Method Using Smale's α-Theory

5.1. Introduction
Let X , Y be Banach spaces. Let U(x, r) and U(x, r) stand, respectively, for the open and
closed ball in X with center x and radius r > 0. Denote by L (X , Y ) the space of bounded
linear operators from X into Y . In the present chapter we are concerned with the problem
of approximating a locally unique solution x? of equation

F(x) = 0, (5.1.1)

where F is a Fréchet continuously differentiable operator defined on U(x0 , R) for some


R > 0 with values in Y .
A lot of problems from computational sciences and other disciplines can be brought in
the form of equation (5.1.1) using Mathematical Modelling [5, 13]. The solution of these
equations can rarely be found in closed form. That is why the solution methods for these
equations are iterative. In particular, the practice of numerical analysis for finding such
solutions is essentially connected to variants of Newton’s method [5, 13, 21, 22]. The study
about convergence matter of Newton methods is usually centered on two types: semilo-
cal and local convergence analysis. The semilocal convergence matter is, based on the
information around an initial point, to give criteria ensuring the convergence of Newton
methods; while the local one is, based on the information around a solution, to find es-
timates of the radii of convergence balls. We find in the literature several studies on the
weakness and/or extension of the hypothesis made on the underlying operators. There is a
plethora on local as well as semil-local convergence results, we refer the reader to [1]–[34].
The most famous among the semilocal convergence of iterative methods is the celebrated
Kantorovich theorem for solving nonlinear equations. This theorem provides a simple and
transparent convergence criterion for operators with bounded second derivatives F 00 or the
Lipschitz continuous first derivatives [5, 13, 21, 22]. Another important theorem inaugu-
rated by Smale at the International Conference of Mathematics (cf. [28]), where the concept

of an approximate zero was proposed and the convergence criteria were provided to deter-
mine an approximate zero for analytic function, depending on the information at the initial
point. Wang and Han [32, 31] generalized Smale’s result by introducing the γ-condition (see
(5.1.3)). For more details on Smale’s theory, the reader can refer to the excellent Dedieu’s
book [15, Chapter 3.3].
Newton’s method defined by

x0 is an initial point
(5.1.2)
xn+1 = xn − F 0 (xn )−1 F(xn ) for each n = 0, 1, 2, · · ·

is undoubtedly the most popular iterative process for generating a sequence {xn } approxi-
mating x? . Here, F 0 (x) denotes the Fréchet-derivative of F at x ∈ U(x0 , R).
In the present chapter we expand the applicability of Newton's method under the γ-condition by introducing the notion of the center γ0-condition (to be made precise in Definition 5.3.1) for some γ0 ≤ γ. This way we obtain tighter upper bounds on the norms ‖F′(x)⁻¹F′(x0)‖ for each x ∈ U(x0, R) (see (5.2.4), (5.2.2) and (5.2.3)), leading to weaker sufficient convergence conditions and a tighter convergence analysis than in earlier studies such as [14, 19, 27, 28, 31, 32]. The approach of introducing a center-Lipschitz condition has already been fruitful for expanding the applicability of Newton's method under the Kantorovich-type theory [3, 9, 11, 13].
Wang in his work [31] on approximate zeros of Smale (cf. [28, 29]) used the γ-Lipschitz condition at x0:

‖F′(x0)⁻¹F″(x)‖ ≤ 2γ / (1 − γ‖x − x0‖)³ for each x ∈ U(x0, r), 0 < r ≤ R,   (5.1.3)

where γ > 0 and x0 are such that γ‖x − x0‖ < 1 and F′(x0)⁻¹ ∈ L(Y, X), to show the following semilocal convergence result for Newton's method.

Theorem 5.1.1. Let F : U(x0 , R) ⊆ X −→ Y be twice-Fréchet differentiable. Suppose there


exists x0 ∈ U(x0 , R) such that F 0 (x0 )−1 ∈ L (Y , X ) and

k F 0 (x0 )−1 F(x0 ) k≤ η; (5.1.4)

condition (5.1.3) holds, and for α = γη,

α ≤ 3 − 2√2;   (5.1.5)

t? ≤ R,   (5.1.6)

where

t? = [1 + α − √((1 + α)² − 8α)] / (4γ) ≤ (1 − 1/√2)(1/γ).   (5.1.7)
Then, sequence {xn } generated by Newton’s method is well defined, remains in U(x0 ,t ?) for
each n = 0, 1, · · · and converges to a unique solution x? ∈ U(x0 ,t ? ) of equation F(x) = 0.
Moreover, the following error estimates hold

k xn+1 − xn k≤ tn+1 − tn (5.1.8)



and
k xn+1 − x? k≤ t ? − tn , (5.1.9)
where the scalar sequence {tn} is defined by

t0 = 0, t1 = η,
tn+1 = tn + γ(tn − tn−1)² / {[2 − 1/(1 − γtn)²](1 − γtn)(1 − γtn−1)²} = tn − ϕ(tn)/ϕ′(tn)   (5.1.10)

for each n = 1, 2, ..., where

ϕ(t) = η − t + γt²/(1 − γt).   (5.1.11)

Notice that t? is the smaller zero of the equation ϕ(t) = 0, which exists under the hypothesis (5.1.5).
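A small Python sketch (ours; the values of γ and η are illustrative and satisfy (5.1.5)) iterates the majorizing sequence (5.1.10) and exhibits its convergence to t? of (5.1.7):

import math

def wang_t(gamma, eta, steps=8):
    t = [0.0, eta]
    for _ in range(steps):
        tn, tm = t[-1], t[-2]
        num = gamma * (tn - tm) ** 2
        den = (2 - 1 / (1 - gamma * tn) ** 2) * (1 - gamma * tn) * (1 - gamma * tm) ** 2
        t.append(tn + num / den)            # one step of (5.1.10)
    return t

gamma, eta = 1.0, 0.1                       # alpha = gamma * eta <= 3 - 2*sqrt(2)
alpha = gamma * eta
t_star = (1 + alpha - math.sqrt((1 + alpha) ** 2 - 8 * alpha)) / (4 * gamma)  # (5.1.7)
print(wang_t(gamma, eta)[-1], t_star)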
The chapter is organized as follows: sections 5.2. and 5.3. contain the semilocal and
local convergence analysis of Newton’s method. Applications and numerical examples are
given in the concluding section 5.4.

5.2. Semilocal Convergence of Newton’s Method


We need some auxiliary results. We shall use the Banach lemma on invertible operators
[5, 13, 21, 22]
Lemma 5.2.1. Let A, B be bounded linear operators, where A is invertible. Moreover,
k A−1 kk B k< 1. Then, A + B is invertible and
k A−1 k
k (A + B)−1 k≤ . (5.2.1)
1− k A−1 k k B k
We shall also use the following definition of Lipschitz and local Lipschitz conditions.
Definition 5.2.2. (see [14, p. 634], [34, p. 673]) Let F : U(x0 , R) −→ Y be Fréchet-
differentiable on U(x0 , R). We say that F 0 satisfies the Lipschitz condition at x0 if there
exists an increasing function ` : [0, R] −→ [0, +∞) such that
k F 0 (x0 )−1 (F 0 (x) − F 0 (y)) k≤ `(r) k x − y k
(5.2.2)
f or each x, y ∈ U(x0 , r), 0 < r ≤ R.
In view of (5.2.2), there exists an increasing function `0 : [0, R] −→ [0, +∞) such that
the center-Lipschitz condition
k F 0 (x0 )−1 (F 0 (x) − F 0 (x0 )) k≤ `0 (r) k x − x0 k
(5.2.3)
f or each x ∈ U(x0 , r), 0 < r ≤ R
holds. Clearly,
`0 (r) ≤ `(r) for each r ∈ (0, R] (5.2.4)
holds in general and `(r)/`0(r) can be arbitrarily large [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13].

Lemma 5.2.3. (see [14, p. 638]) Let F : U(x0 , R) −→ Y be Fréchet-differentiable on


U(x0 , R). Suppose F 0 (x0 )−1 ∈ L (Y , X ) and there exist γ0 ≥ 0, γ ≥ 0 such that γ0 R < 1,
γ R < 1. Then, F 0 satisfies conditions (5.2.2) and (5.2.3), respectively, with

ℓ(r) := 2γ / (1 − γr)³   (5.2.5)

and

ℓ0(r) := γ0(2 − γ0r) / (1 − γ0r)².   (5.2.6)
Notice that with preceding choices of functions ` and `0 and since condition (5.2.4) is
satisfied, we can always choose γ0 , γ such that

γ0 ≤ γ. (5.2.7)

From now on we assume that condition (5.2.7) is satisfied. We also need a result by Zabre-
jko and Nguen.

Lemma 5.2.4. (see [34, p. 673]) Let F : U(x0 , R) −→ Y be Fréchet-differentiable on


U(x0 , R). Suppose F 0 (x0 )−1 ∈ L (Y , X ) and

k F 0 (x0 )−1 (F 0 (x) − F 0 (y)) k≤ λ(r) k x − y k


f or each x, y ∈ U(x0 , r), 0 < r ≤ R

for some increasing function λ : [0, R] −→ [0, +∞). Then, the following assertion holds

k F 0 (x0 )−1 (F 0 (x + p) − F 0 (x)) k≤ Λ(r+ k p k) − Λ(r)


f or each x ∈ U(x0 , r), 0 < r ≤ R and k p k≤ R − r,

where Λ(r) = ∫₀ʳ λ(t)dt.
In particular, if
k F 0 (x0 )−1 (F 0 (x) − F 0 (x0 )) k≤ λ0 (r) k x − x0 k
f or each x ∈ U(x0 , r), 0 < r ≤ R
for some increasing function λ0 : [0, R] −→ [0, +∞). Then, the following assertion holds

k F 0 (x0 )−1 (F 0 (x0 + p) − F 0 (x0 )) k≤ Λ0 (k p k)


f or each 0 < r ≤ R and k p k≤ R − r,

where Λ0(r) = ∫₀ʳ λ0(t)dt.

Using the center-Lipschitz condition and Lemma 5.2.3, we can show the following
result on invertible operators.
Expanding the Applicability of Newton’s Method 77

Lemma 5.2.5. Let F : U(x0, R) → Y be Fréchet-differentiable on U(x0, R). Suppose F′(x0)⁻¹ ∈ L(Y, X) and γ0R < 1 for some γ0 > 0 and x0 ∈ X; the center-Lipschitz condition (5.2.3) holds on U0 = U(x0, r0), where ℓ0(r) is given by (5.2.6) and r0 = (1 − 1/√2)(1/γ0). Then F′(x)⁻¹ ∈ L(Y, X) on U0 and satisfies

‖F′(x)⁻¹F′(x0)‖ ≤ [2 − 1/(1 − γ0r)²]⁻¹.   (5.2.8)
Proof. We have by (5.2.3), (5.2.6) and x ∈ U0 that

‖F′(x0)⁻¹(F′(x) − F′(x0))‖ ≤ ℓ0(r)r = 1/(1 − γ0r)² − 1 < 1.

The result now follows from Lemma 5.2.1. The proof of Lemma 5.2.5 is complete.

Using (5.1.3), a Banach-type lemma similar to Lemma 5.2.5 was given in [31] (see also [27, 28, 29]).
Lemma 5.2.6. Let F : U(x0, R) → Y be twice Fréchet-differentiable on U(x0, R). Suppose F′(x0)⁻¹ ∈ L(Y, X) and γR < 1 for some γ > 0 and x0 ∈ X; condition (5.1.3) holds on V0 = U(x0, r0), where r0 = (1 − 1/√2)(1/γ). Then F′(x)⁻¹ ∈ L(Y, X) on V0 and satisfies

‖F′(x)⁻¹F′(x0)‖ ≤ [2 − 1/(1 − γr)²]⁻¹.   (5.2.9)
Remark 5.2.7. It follows from (5.2.7)–(5.2.9) that (5.2.8) is a more precise upper bound on the norm of F′(x)⁻¹F′(x0). This observation leads to a tighter majorizing sequence for
{xn } (see Proposition 5.2.10).
We can now show the main semilocal convergence theorem for Newton's method.
Theorem 5.2.8. Suppose that
(a) There exist x0 ∈ X and η > 0 such that
F 0 (x0 )−1 ∈ L (Y , X ) and k F 0 (x0 )−1 F(x0 ) k≤ η;

(b) Operator F′ satisfies the Lipschitz and center-Lipschitz conditions (5.2.2) and (5.2.3) on U(x0, r0), where ℓ(r) and ℓ0(r) are given by (5.2.5) and (5.2.6), respectively;
(c) U0 ⊆ U(x0 , R);
(d) The scalar sequence {sn} defined by

s0 = 0,  s1 = η,
s2 = s1 + γ0(s1 − s0)² / {[2 − 1/(1 − γ0s1)²](1 − γ0s1)},
sn+2 = sn+1 + γ(sn+1 − sn)² / {[2 − 1/(1 − γ0sn+1)²](1 − γsn+1)(1 − γsn)²}
for each n = 1, 2, ...   (5.2.10)

satisfies for each n = 1, 2, ...

sn < b = 1/γ  if γ0/γ ≤ 1 − 1/√2,
sn < b = (1 − 1/√2)(1/γ0)  if γ0/γ ≥ 1 − 1/√2.   (5.2.11)

Then, the following assertions hold

(i) Sequence {sn } is increasingly convergent to its unique least upper bound s? which
satisfies s? ∈ [s2 , b], where b is given in (5.2.11).

(ii) Sequence {xn } generated by Newton’s method is well defined, remains in U(x0 , s? )
for each n = 0, 1, · · · and converges to a unique solution x? ∈ U(x0 , s? ) of equation
F(x) = 0. Moreover, the following estimates hold

k xn+1 − xn k≤ sn+1 − sn (5.2.12)

and
k xn − x? k≤ s? − sn for each n = 0, 1, 2, · · · . (5.2.13)

Proof. (i) It follows from (5.2.8) and (5.2.9) that sequence {sn } is increasing and
bounded above by 1/γ. Hence, it converges to s? ∈ [s2 , b].

(ii) We use Mathematical Induction to prove that

k xk+1 − xk k≤ sk+1 − sk (5.2.14)

and

U(xk+1 , s? − sk+1 ) ⊆ U(xk , s? − sk ) for each k = 1, 2, · · · . (5.2.15)

Let z ∈ U(x1 , s? − s1 ). Then, we obtain that

k z − x0 k≤k z − x1 k + k x1 − x0 k≤ s? − s1 + s1 − s0 = s? − s0 ,

which implies z ∈ U(x0 , s? − s0 ). Note also that

k x1 − x0 k=k F 0 (x0 )−1 F(x0 ) k≤ η = s1 − s0 .

Hence, estimates (5.2.14) and (5.2.15) hold for k = 0. Suppose these estimates hold
for natural integers n ≤ k. Then, we have that
k+1 k+1
k xk+1 − x0 k≤ ∑ k xi − xi−1 k≤ ∑ (si − si−1 ) = sk+1 − s0 = sk+1
i=1 i=1

and

k xk + θ (xk+1 − xk ) − x0 k≤ sk + θ (sk+1 − sk ) ≤ s? for all θ ∈ (0, 1).



Using (5.2.3), (5.2.6) for x = xk+1 and the induction hypotheses, we get that

‖F′(x0)⁻¹(F′(xk+1) − F′(x0))‖ ≤ 1/(1 − γ0‖xk+1 − x0‖)² − 1 ≤ 1/(1 − γ0sk+1)² − 1 < 1.   (5.2.16)

It follows from (5.2.16) and the Banach lemma 5.2.1 on invertible operators that F′(xk+1)⁻¹ exists and

‖F′(xk+1)⁻¹F′(x0)‖ ≤ [2 − 1/(1 − γ0sk+1)²]⁻¹.   (5.2.17)

Using (5.1.2), we obtain the approximation

F(xk+1) = F(xk+1) − F(xk) − F′(xk)(xk+1 − xk) = ∫₀¹ (F′(xτk) − F′(xk))dτ (xk+1 − xk),   (5.2.18)

where xτk = xk + τ(xk+1 − xk) and xτsk = xk + τs(xk+1 − xk) for each 0 ≤ τ, s ≤ 1. Then, by (5.2.18) for k = 0, (5.2.3) and (5.2.6), we get that

‖F′(x0)⁻¹F(x1)‖ ≤ ∫₀¹ ‖F′(x0)⁻¹(F′(x0 + τ(x1 − x0)) − F′(x0))‖dτ ‖x1 − x0‖
 ≤ ∫₀¹ [1/(1 − γ0τ‖x1 − x0‖)² − 1]dτ ‖x1 − x0‖
 = γ0‖x1 − x0‖² / (1 − γ0‖x1 − x0‖) ≤ γ0(s1 − s0)² / (1 − γ0s1).

Moreover, it follows from Lemma 5.2.4, (5.2.2) and (5.2.18), in turn for k = 1, 2, ..., that

‖F′(x0)⁻¹F(xk+1)‖ ≤ ∫₀¹ ‖F′(x0)⁻¹(F′(xτk) − F′(xk))‖dτ ‖xk+1 − xk‖
 ≤ ∫₀¹∫₀¹ [2γτ / (1 − γ‖xτsk − x0‖)³] ds dτ ‖xk+1 − xk‖²
 ≤ ∫₀¹∫₀¹ [2γτ / (1 − γ‖xk − x0‖ − γτs‖xk+1 − xk‖)³] ds dτ ‖xk+1 − xk‖²
 = γ‖xk+1 − xk‖² / [(1 − γ‖xk − x0‖ − γ‖xk+1 − xk‖)(1 − γ‖xk − x0‖)²]
 ≤ [γ(sk+1 − sk)² / ((1 − γsk+1)(1 − γsk)²)] (‖xk+1 − xk‖/(sk+1 − sk))²
 ≤ γ(sk+1 − sk)² / [(1 − γsk+1)(1 − γsk)²]   (5.2.19)

(see also [27, p. 33, estimate (3.19)]). Then, in view of (5.1.2), (5.2.10), (5.2.17) and the preceding two estimates, we obtain that

‖x2 − x1‖ ≤ ‖F′(x1)⁻¹F′(x0)‖ ‖F′(x0)⁻¹F(x1)‖ ≤ [2 − 1/(1 − γ0s1)²]⁻¹ γ0(s1 − s0)²/(1 − γ0s1) = s2 − s1

and, for k = 1, 2, ...,

‖xk+2 − xk+1‖ = ‖(F′(xk+1)⁻¹F′(x0))(F′(x0)⁻¹F(xk+1))‖
 ≤ ‖F′(xk+1)⁻¹F′(x0)‖ ‖F′(x0)⁻¹F(xk+1)‖
 ≤ [2 − 1/(1 − γ0sk+1)²]⁻¹ γ(sk+1 − sk)²/[(1 − γsk+1)(1 − γsk)²] = sk+2 − sk+1.   (5.2.20)

Hence, we showed (5.2.14) holds for all k ≥ 0. Furthermore, let w ∈ U(xk+2 , s? −


sk+2). Then, we have that

k w − xk+1 k ≤ k w − xk+2 k + k xk+2 − xk+1 k


(5.2.21)
≤ s? − sk+2 + sk+2 − sk+1 = s? − sk+1 .

That is, w ∈ U(xk+1, s? − sk+1). The induction for (5.2.14) and (5.2.15) is now complete. Assertion (i) implies that {sn} is a Cauchy sequence. It follows from (5.2.14) and (5.2.15) that {xn} is also a Cauchy sequence in the Banach space X and as such it converges to some x? ∈ U(x0, s?) (since U(x0, s?) is a closed set). By letting k → ∞ in (5.2.19) we get F(x?) = 0. Estimate (5.2.13) is obtained from (5.2.12) by using standard majorization techniques (cf. [5, 13, 21, 28, 29]). Finally, to show the uniqueness part, let y? ∈ U(x0, s?) be a solution of equation (5.1.1). Using (5.2.3) for x replaced by x? + τ(y? − x?) and G = ∫₀¹ F′(x? + τ(y? − x?))dτ, we get as in (5.2.9) that ‖F′(x0)⁻¹(G − F′(x0))‖ < 1. That is, G⁻¹ ∈ L(Y, X). Using the identity 0 = F(x?) − F(y?) = G(x? − y?), we deduce x? = y?.
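The sequence (5.2.10) and the bound b of (5.2.11) can be checked numerically; the sketch below is ours, with illustrative values γ0 < γ:

import math

def s_sequence(gamma0, gamma, eta, steps=8):
    s = [0.0, eta]
    # first step of (5.2.10), driven by gamma0 only
    s.append(eta + gamma0 * eta**2 / ((2 - 1 / (1 - gamma0 * eta)**2) * (1 - gamma0 * eta)))
    for _ in range(steps):
        sn, sm = s[-1], s[-2]
        num = gamma * (sn - sm) ** 2
        den = (2 - 1 / (1 - gamma0 * sn)**2) * (1 - gamma * sn) * (1 - gamma * sm)**2
        s.append(sn + num / den)
    return s

gamma0, gamma, eta = 0.5, 1.0, 0.2
b = 1 / gamma if gamma0 / gamma <= 1 - 1 / math.sqrt(2) else (1 - 1 / math.sqrt(2)) / gamma0
print(s_sequence(gamma0, gamma, eta)[-1], "<", b)   # (5.2.11) holds along the run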

Remark 5.2.9. (a) The convergence criteria in Theorem 5.2.8 are weaker than those in Theorem 5.1.1. In particular, Theorem 5.1.1 requires that the operator F be twice Fréchet-differentiable, but our Theorem 5.2.8 requires only that F be Fréchet-differentiable.
Notice also that if F is twice Fréchet-differentiable, then (5.2.2) implies (5.1.3).
Moreover, in view of (5.1.7) and (5.2.9), we have that (5.1.5) =⇒ (5.2.11) but not
necessarily vice versa. Therefore, Theorem 5.2.8 can apply in cases when Theorem
5.1.1 cannot.

(b) Estimate (5.2.11) can be checked, since the scalar sequence is based on the initial data γ0, γ and η, especially in the case when si = si+n for some finite i. At this point, we would like to know if it is possible to find convergence criteria stronger than (5.2.11) but weaker than (5.1.5). To this end we first compare our majorizing sequence {sn} to the old majorizing sequence {tn}.

Proposition 5.2.10. Let F : U(x0 , R) −→ Y be twice Fréchet-differentiable on U(x0 , R).


Suppose that hypotheses of Theorem 5.1.1 and the center-Lipschitz condition hold on
U(x0 , r0 ). Then, the following assertions hold

(a) Scalar sequences {tn } and {sn } are increasingly convergent to t ? , s? , respectively.

(b) Sequence {xn } generated by Newton’s method is well defined, remains in U(x0 , r0 )
for each n = 0, 1, · · · and converges to a unique solution x? ∈ U(x0 , r0 ) of equation
F(x) = 0. Moreover, the following estimates hold for each n = 0, 1, · · ·

sn ≤ tn , (5.2.22)

sn+1 − sn ≤ tn+1 − tn , (5.2.23)


s? ≤ t ? , (5.2.24)
k xn+1 − xn k≤ sn+1 − sn
and
k xn − x? k≤ s? − sn .

Proof. According to Theorems 5.1.1 and 5.2.8 we only need to show (5.2.22) and (5.2.23),
since (5.2.24) follows from (5.2.22) by letting n → ∞. It follows from the definition of
sequences {tn} and {sn} (see (5.1.10) and (5.2.10)) that t0 = s0, t1 = s1, s2 ≤ t2 and s2 − s1 ≤ t2 − t1,

1/(1 − γ0s0) ≤ 1/(1 − γt0),  1/(1 − γs1) = 1/(1 − γt1)   (5.2.25)

and

[2 − 1/(1 − γ0s1)²]⁻¹ ≤ [2 − 1/(1 − γ0t1)²]⁻¹.   (5.2.26)

Hence, (5.2.22) and (5.2.23) hold for n = 0, 1, 2. Suppose that (5.2.22) and (5.2.23) hold
for all k ≤ n. Then, we have that sk+1 ≤ tk+1 and sk+1 − sk ≤ tk+1 − tk , since γ0 ≤ γ,

1 1 1 1
≤ , ≤ (5.2.27)
1 − γ sk−1 1 − γ tk−1 1 − γ sk 1 − γtk

and

1 1
≤ . (5.2.28)
1 1
2− 2−
(1 − γ0 sk )2 (1 − γ0 tk )2

Remark 5.2.11. In view of (5.2.22)–(5.2.24), our error analysis is tighter and the informa-
tion on the location of the solution x? is at least as precise as the old one. Notice also that
estimates (5.2.22) and (5.2.23) hold as strict inequalities for n > 1 if γ0 < γ (see also the
numerical examples) and these advantages hold under the same or less computational cost
as before (see Remark 5.2.9).
Next, we present our [11, Theorem 3.2]. This theorem shall be used to show that (5.1.5)
can be weakened.
Theorem 5.2.12. Let F : U(x0 , R) ⊆ X −→ Y be Fréchet-differentiable. Suppose there
exist parameters L ≥ L0 > 0 and η > 0 such that for all x, y ∈ U(x0 , R)

F 0 (x0 )−1 ∈ L (Y , X ), k F 0 (x0 )−1 F(x0 ) k≤ η,

k F 0 (x0 )−1 (F 0 (x) − F 0 (x0 )) k≤ L0 k x − x0 k, (5.2.29)


‖F′(x0)⁻¹(F′(x) − F′(y))‖ ≤ L‖x − y‖,   (5.2.30)
s? := lim_{n→∞} sn ≤ R
and
h1 = 2L1η ≤ 1,   (5.2.31)
where

s0 = 0,  s1 = η,  s2 = η + L0η²/(2(1 − L0η)),
sn+1 = sn + L(sn − sn−1)²/(2(1 − L0sn)) for each n = 2, 3, ...

and L1 = (1/8)(4L0 + √(L0L) + √(L0L + 8L0²)). Then, the following assertions hold:
8
(a) Sequence {sn } is increasing convergent to its unique least upper bound s? , which
satisfies
s2 ≤ s? ≤ s?? = δ η,
where
L0 η
δ = 1+
2 (1 − β) (1 − L0 η)
and
2L
β= p .
L+ L2 + 8 L0 L

(b) Sequence {xn } generated by Newton’s method is well defined, remains in U(x0 , s? )
for each n = 0, 1, · · · and converges to a solution x? ∈ U(x0 , s? ) of equation F(x) = 0.
Moreover, the following estimates hold for each n = 0, 1, · · ·
‖xn+1 − xn‖ ≤ L‖xn − xn−1‖² / (2(1 − L0‖xn − x0‖)) ≤ sn+1 − sn
and
‖xn − x?‖ ≤ s? − sn.

(c) If there exists ς > s? such that ς < R and L0 (s? + ς) ≤ 2, then, the solution x? of
equation F(x) = 0 is unique in U(x0 , ς).

Remark 5.2.13. (a) If L0 = L, convergence criterion (5.2.31) reduces to the Kantorovich hypothesis [21] for solving equations, famous for its simplicity and clarity:

h = 2Lη ≤ 1.   (5.2.32)

Notice that
L0 ≤ L   (5.2.33)
holds in general and L0/L can be arbitrarily small [11, 13]. We also have that

h ≤ 1 ⟹ h1 ≤ 1   (5.2.34)

and h1/h → 0 as L0/L → 0.
Moreover, the Kantorovich majorizing sequence is given by

s̄0 = 0,  s̄n+1 = s̄n − p(s̄n)/p′(s̄n) for each n = 0, 1, 2, ...,

where p(t) = (L/2)t² − t + η. If (5.2.32) is satisfied, then (see [11])

sn ≤ s̄n,  sn+1 − sn ≤ s̄n+1 − s̄n

and

s? ≤ s̄? = lim_{n→∞} s̄n = 2η/(1 + √(1 − h)).
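The gap between the criteria (5.2.31) and (5.2.32) is easy to quantify; in the sketch below (ours; the values of L0, L and η are illustrative, with L0 < L), the Kantorovich criterion fails while (5.2.31) holds:

import math

def h_classic(L, eta):
    return 2 * L * eta                                    # (5.2.32)

def h_weak(L0, L, eta):
    L1 = (4 * L0 + math.sqrt(L0 * L) + math.sqrt(L0 * L + 8 * L0**2)) / 8
    return 2 * L1 * eta                                   # (5.2.31)

L0, L, eta = 0.2, 2.0, 0.3
print(h_classic(L, eta), h_weak(L0, L, eta))              # 1.2 > 1 but h1 = 0.171 < 1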
(b) Let us show that Wang's convergence criterion (5.1.5) can be weakened under the Kantorovich hypotheses. In particular, suppose that (5.2.30) and (5.2.32) are satisfied. Then, (5.2.31) is also satisfied. Moreover, if F is twice Fréchet-differentiable on U(x0, 1/γ), then Wang's condition (5.1.3) is certainly satisfied if we choose γ = L/2. Then, condition (5.2.32) becomes

γη ≤ 1/4,   (5.2.35)

which improves (5.1.5). We must also show that s̄? ≤ 1/γ. But the preceding inequality reduces to showing that h − 2 ≤ 2√(1 − h), which is true by (5.2.32). Clearly, in view of (5.2.31), (5.2.33)–(5.2.35), criterion (5.1.5) (i.e., (5.2.35)) can be improved even further, for γ0 = L0/2, if L0 ≤ L.

(c) Suppose Wang’s condition (5.1.3) is satisfied as well as criterion (5.1.5) on√ U(x0 , r0 ).
Recall that r0 = (1− √12 ) 1γ . Then, for r ∈ [0, r0 ], we have that 1/(1−γ r) ≤ 2. Then,

in view of (5.1.3) and (5.2.30), we can choose L = 4 2 γ. Then, (5.1.5) becomes

L η ≤ 4 (3 2 − 4) = .970562748.

However, we must also have that t ? ≤ 1/L, where t ? is given in (5.1.7). By direct
algebraic manipulation, we see that the preceding inequality is satisfied, if
√ √ √
2 2 (2 2 − 1) 4 2
.078526267 = √ ≤ Lη ≤ √ = .548479169.
8− 2 8 2−1
Hence, the last two estimates on ”L η” are satisfied, if the preceding inequality is
satisfied, which is weaker than (5.2.32) for L η ∈ (.5, .548479169]. However, the
preceding inequality may not be weaker than (5.2.31) (if γ0 = L0 /2) for L0 sufficiently
smaller than L.

Next, we present the following specialization of Theorem 5.2.12 for the Newton-Kantorovich method and analytic operators defined on U(x0, R). In the following theorem, and for ε = γ0/γ, the interval I is defined by

I = (0, 1)  if ε ≤ (√2 − 1)/√2,
I = (0, ε⁻¹(√2 − 1)/√2)  if ε > (√2 − 1)/√2.

Theorem 5.2.14. Let F : U(x0, R) ⊆ X → Y be Fréchet-differentiable in U(x0, R). Define functions f, H and H1 on the interval I by

f(r) = g(r)rα − 1/2,  H1(r) = αε(2 − r)/(1 − εr)²

and

H(r) = [1 + H1(r) / (2(1 − β)(1 − H1(r)))] α − r,

where

g(r) = (1/8)[ 4ε(2 − r)/(1 − εr)² + √( 2ε(2 − r)/((1 − εr)²(1 − r)³) ) + √( 2ε(2 − r)/((1 − εr)²(1 − r)³) + 8ε²(2 − r)²/(1 − εr)⁴ ) ].

Suppose that there exist intervals If, IH and IH1 such that for some α ∈ I

If ⊂ I,  IH ⊂ I,  IH1 ⊂ I,

f(r) ≤ 0 for each r ∈ If,   (5.2.36)

H(r) ≤ 0 for each r ∈ IH,   (5.2.37)

H1(r) ≤ 1 for each r ∈ IH1   (5.2.38)

and

I0 = If ∩ IH ∩ IH1 ≠ ∅.
Denote by r? = r?(α) the largest element in I0. Moreover, suppose there exists a point x0 ∈ U(x0, R) such that F′(x0)⁻¹ ∈ L(Y, X) and

r?/γ ≤ R.   (5.2.39)
Then, the following assertions hold
(a) The scalar sequence {sn} is increasingly convergent to s?, which satisfies

η ≤ s? ≤ s?? = δη,

where

δ = 1 + L0η / (2(1 − β)(1 − L0η)),
β = 2L/(L + √(L² + 8L0L)) = 2M/(M + √(M² + 8M0M)),
M0 = L0/γ,  M = L/γ,
L0 = γ(2 − εr?)/(1 − εr?)²  and  L = 2γ/(1 − r?)³,   (5.2.40)

where the sequence {sn}, s? and s?? are given in Theorem 5.2.12.

(b) The conclusions (a) and (b) of Theorem 5.2.12 hold.


Proof. Notice that it follows from (5.2.38) that H(r) + r ≥ 0 for each r ∈ IH. We have by s? ≤ s?? and (5.2.36) that γs? ≤ γs?? ≤ H(r?) + r? ≤ r? < 1. Then, we showed in [5] (see also [13]) that (5.2.2) and (5.2.3) are satisfied for the functions L0 and L given by (5.2.40). Using these choices of L0 and L, we must show that (5.2.31) is satisfied. That is, we must have

h3 = g(r?)α ≤ α/(2r?) ≤ 1/2,

which is true by the choice of r? in (5.2.36) and (5.2.39). Notice also that by (5.2.37) and (5.2.38), we have s? ≤ r?/γ. The rest follows from Theorem 5.2.8.

Remark 5.2.15. (a) It follows from the proof of Theorem 5.2.8 that the function f can be replaced by f1 defined by

f1(r) = g(r)α − 1/2.   (5.2.41)

In practice, we shall employ both functions to see which one will produce the largest possible upper bound r? for α.

(b) It is worth noticing that

L0 (r) < L(r) for all r ∈ (0, 1).



(c) Notice that it follows from (5.2.37) and (5.2.38) that α ≤ r? .

In the case when F is Fréchet-differentiable on X , we have the following result.


Proposition 5.2.16. Let F : X → Y be analytic. Suppose that there exists a point x0 ∈ D such that F′(x0)⁻¹ ∈ L(Y, X) and, for each r in some interval IH2 with ∅ ≠ IH2 ⊂ I, we have that

H2(r) = g(r) − 1/(2r) ≤ 0.   (5.2.42)

Denote by r1 the largest element in IH2. Moreover, suppose

α ≤ r1 = 0.179939475....   (5.2.43)

Then, the conclusions of Theorem 5.2.8 hold.

Proof. It follows by the choice of r1 that

g(r1) ≤ 1/(2r1).   (5.2.44)

Using (5.2.43) and (5.2.44), we get

h3 = g(r1)α ≤ α/(2r1) ≤ 1/2.

Notice that condition (5.2.39) is satisfied automatically.

The results obtained in this chapter can be connected to the following notion [14].

Definition 5.2.17. A point x0 is said to be an approximate zero of the first kind for F if {xn} is well defined for each n = 0, 1, ... and satisfies

‖xn+1 − xn‖ ≤ Ξ^(2^n − 1) ‖x1 − x0‖ for some Ξ ∈ (0, 1).   (5.2.45)

Notice that if we start from an approximate zero x0 of the first kind, then the convergence of the Newton-Kantorovich method to x? is very fast.

In view of the estimate

‖xn+1 − xn‖ ≤ [L / (2(1 − L0sn))] ‖xn − xn−1‖²,

we get that

L/(2(1 − L0sn)) ≤ L/(2(1 − L0s?)) ≤ [γ/(1 − r)³] · 1/[1 − ((2 − r)/(1 − r)²)(H(r) + r)] · (1/η),

provided that

((2 − r)/(1 − r)²)(H(r) + r) < 1   (5.2.46)

and

0 ≤ αψ(r) ≤ Ξ < 1,   (5.2.47)

where

ψ(r) = 1 / [(1 − r)³ (1 − ((2 − r)/(1 − r)²)(H(r) + r))].

Conditions (5.2.46) and (5.2.47) must hold, respectively, in Theorem 5.2.14 and Proposition 5.2.16 for r = r?, r1. Then, x0 is an approximate zero of the first kind in all these results. If γ0 = γ, then (5.2.42) holds for r1 = .179939475.... Using (5.2.45), we notice that (5.2.46) and (5.2.47) hold at r1. It then follows that (5.2.45) is satisfied with factor Ξ/η, where Ξ is given by Ξ = αψ(r1).

Remark 5.2.18. If F : D ⊆ X → Y is an analytic operator and x0 ∈ D, let γ be defined by (see [28, 29])

γ = sup_{j>1} ‖(1/j!) F′(x0)⁻¹F^(j)(x0)‖^(1/(j−1)),

or γ = ∞ if F′(x0) is not invertible or the supremum does not exist. Then, if D = X, the sufficient convergence condition of the Newton-Kantorovich method is given by α ≤ 0.130707. Rheinboldt in [25] improved Smale's result by showing convergence of Newton's method when α ≤ 0.15229240. Here, we showed convergence for α ≤ r1 = .179939475.
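The bound r1 = 0.179939475... of (5.2.43) can be recovered numerically. The following bisection (ours, with g as defined in Theorem 5.2.14 and ε = 1) locates the zero of H2(r) = g(r) − 1/(2r):

import math

def g(r, eps=1.0):
    p = eps * (2 - r) / (1 - eps * r) ** 2      # plays the role of L0 / gamma
    q = 2 / (1 - r) ** 3                        # plays the role of L / gamma
    return (4 * p + math.sqrt(p * q) + math.sqrt(p * q + 8 * p * p)) / 8

lo, hi = 1e-9, 1 - 1 / math.sqrt(2) - 1e-9
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) - 1 / (2 * mid) < 0:              # H2(mid) < 0: move right
        lo = mid
    else:
        hi = mid
print(lo)                                       # ~0.17993947..., matching (5.2.43)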

5.3. Local Convergence Analysis of Newton’s Method


We shall use similar definitions to (5.1.3), (5.2.2) and (5.2.3) to study the local convergence
of Newton’s method.

Definition 5.3.1. (see [30]) Let F : U(x?, R) ⊆ X → Y be twice Fréchet-differentiable on U(x?, R) and F(x?) = 0. Let γ > 0 and let 0 < r ≤ 1/γ be such that r ≤ R. The operator F″ is said to satisfy the γ-Lipschitz condition at x? on U(x?, r) if

‖F′(x?)⁻¹F″(x)‖ ≤ 2γ / (1 − γ‖x − x?‖)³ for each x ∈ U(x?, r).   (5.3.1)

Definition 5.3.2. Let F : U(x? , R) −→ Y be Fréchet-differentiable on U(x? , R) and F(x? ) =


0. We say that F 0 satisfies the γ-Lipschitz condition at x? if there exists an increasing function
` : [0, R] −→ [0, +∞) such that

k F 0 (x? )−1 (F 0 (x) − F 0 (y)) k≤ `(r) k x − y k


(5.3.2)
f or each x, y ∈ U(x? , r), 0 < r ≤ R.

Definition 5.3.3. Let F : U(x? , R) −→ Y be Fréchet-differentiable on U(x? , R) and F(x? ) =


0. We say that F 0 satisfies the γ? -center-Lipschitz condition at x? if there exists an increasing
function `? : [0, R] −→ [0, +∞) such that

k F 0 (x? )−1 (F 0 (x) − F 0 (x? )) k≤ `? (r) k x − x? k


(5.3.3)
f or each x ∈ U(x? , r), 0 < r ≤ R.

Remark 5.3.4. (a) Notice again that `? (r) ≤ `(r) and `/`? can be arbitrarily large.

(b) In order for us to cover the local convergence analysis of Newton's method, let us define the function fε : Iε = [0, (1/ε)(1 − 1/√2)] → R by

fε(t) = (1 − εt)²(2 − t)t − (1 − t)²(2(1 − εt)² − 1).   (5.3.4)

Suppose that

ε > (1/2)(1 − 1/√2).   (5.3.5)

Then, we have that

fε(0) = −1 < 0  and  fε((1/ε)(1 − 1/√2)) = (1/(2ε))(1 − 1/√2)(2 − (1/ε)(1 − 1/√2)) > 0.

Hence, it follows from the intermediate value theorem that the function fε has a zero in Iε. Denote by µ?ε the minimal such zero. Define the function gε : Iε → R by
(1 − εt)2 (2 − t)t
gε (t) = . (5.3.6)
(1 − t)2 (2 (1 − εt)2 − 1)
Then, we have that

0 ≤ gε (t) < 1 f or each t ∈ [0, µ?ε ]. (5.3.7)

Set
$$R_\varepsilon = \frac{\mu^\star_\varepsilon}{\gamma}. \tag{5.3.8}$$
It follows from the definition of $f_\varepsilon$, $\mu^\star_\varepsilon$ and $g_\varepsilon$ that
$$R_1 = \frac{3 - \sqrt{6}}{3\gamma} \le R_\varepsilon. \tag{5.3.9}$$

Moreover, strict inequality holds if $\varepsilon \neq 1$. Let us assume that $F$ satisfies the $\gamma_\star$-center-Lipschitz condition at $x^\star$ on $U\!\left(x^\star, \frac{1}{\varepsilon\gamma}\left(1 - \frac{1}{\sqrt{2}}\right)\right)$ with $F(x^\star) = 0$ and the $\gamma$-Lipschitz condition at $x^\star$ on $U(x^\star, \frac{1}{\gamma})$. Then, for $x_0 \in U(x^\star, R_\varepsilon)$, we have the identity
$$x_{n+1} - x^\star = F'(x_n)^{-1}\left( F(x^\star) - F(x_n) - F'(x_n)(x^\star - x_n) \right) = F'(x_n)^{-1} F'(x^\star) \int_0^1 F'(x^\star)^{-1}\left( F'(x^\tau_{n,\star}) - F'(x_n) \right) d\tau\, (x^\star - x_n), \tag{5.3.10}$$
where $x^\tau_{n,\star} = x_n + \tau\,(x^\star - x_n)$. Set also
$$x^{\tau s}_{n,\star} = x_n + \tau s\,(x^\star - x_n) \quad \text{for each } 0 \le \tau \le 1,\ 0 \le s \le 1.$$

As in Theorem 5.2.8, but using (5.3.2) and (5.3.3) with
$$\ell(r) = \frac{2\gamma}{(1 - \gamma r)^3} \quad \text{and} \quad \ell_\star(r) = \frac{\gamma_\star\,(2 - r)}{(1 - \gamma_\star r)^2},$$

we get in turn, as in (5.2.17) and (5.2.19) respectively, that
$$\left\| F'(x_n)^{-1} F'(x^\star) \right\| \le \left( 2 - \frac{1}{(1 - \varepsilon\gamma\,\|x_n - x^\star\|)^2} \right)^{-1} \tag{5.3.11}$$
and
$$\left\| \int_0^1 \left( F'(x^\tau_{n,\star}) - F'(x_n) \right) d\tau\, (x^\star - x_n) \right\| \le \int_0^1\!\!\int_0^1 \frac{2\gamma\, \left\| x^\tau_{n,\star} - x_n \right\|}{\left(1 - \gamma\,\big\|x^{\tau s}_{n,\star} - x^\star\big\|\right)^3}\, ds\, d\tau\; \|x^\star - x_n\| \le \left( \frac{1}{(1 - \gamma\,\|x_n - x^\star\|)^2} - 1 \right) \|x_n - x^\star\|. \tag{5.3.12}$$
That is, we have by (5.3.10)--(5.3.12) that
$$\|x_{n+1} - x^\star\| \le \frac{(1 - \varepsilon\gamma\,\|x_n - x^\star\|)^2\left(1 - (1 - \gamma\,\|x_n - x^\star\|)^2\right)}{\left(2\,(1 - \varepsilon\gamma\,\|x_n - x^\star\|)^2 - 1\right)(1 - \gamma\,\|x_n - x^\star\|)^2}\, \|x_n - x^\star\| < g_\varepsilon(\mu^\star_\varepsilon)\, \|x_n - x^\star\| = \|x_n - x^\star\| < R_\varepsilon. \tag{5.3.13}$$

Estimate (5.3.13) shows that $x_{n+1} \in U(x^\star, R_\varepsilon)$ and $\lim_{n \to \infty} x_n = x^\star$.
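A short sketch of our own that locates the minimal zero $\mu^\star_\varepsilon$ of $f_\varepsilon$ in (5.3.4) by bisection and evaluates the radius $R_\varepsilon = \mu^\star_\varepsilon/\gamma$ of (5.3.8); the sample values of $\varepsilon$ all satisfy (5.3.5):

```python
import math

def f_eps(t, eps):
    # f_eps from (5.3.4)
    return (1 - eps*t)**2 * (2 - t) * t - (1 - t)**2 * (2*(1 - eps*t)**2 - 1)

def minimal_zero(eps, tol=1e-12):
    lo, hi = 0.0, (1.0/eps) * (1 - 1/math.sqrt(2))   # the interval I_eps
    # f_eps(0) = -1 < 0 and f_eps(hi) > 0 under (5.3.5), so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_eps(mid, eps) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = 1.0
for eps in (1.0, 0.75, 0.5):      # eps plays the role of gamma_star/gamma
    print(f"eps = {eps}: R_eps = {minimal_zero(eps)/gamma:.9f}")
# eps = 1 reproduces R_1 = (3 - sqrt(6))/(3*gamma) ~ 0.18350/gamma, and
# smaller eps yields a larger radius, illustrating (5.3.9).
```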

Hence we arrived at the following result on the local convergence for Newton’s method.

Theorem 5.3.5. Let $F : U(x^\star, R) \subseteq X \longrightarrow Y$ be Fréchet-differentiable on $U(x^\star, R)$. Suppose that

(a) There exists $x^\star \in U(x_0, R)$ such that $F(x^\star) = 0$ and $F'(x^\star)^{-1} \in L(Y, X)$.

(b) Operator $F$ satisfies the $\gamma_\star$-center-Lipschitz condition at $x^\star$ on $U\!\left(x^\star, \frac{1}{\varepsilon\gamma}\left(1 - \frac{1}{\sqrt{2}}\right)\right)$ for $\varepsilon$ satisfying (5.3.5) and the $\gamma$-Lipschitz condition at $x^\star$ on $U(x^\star, \frac{1}{\gamma})$.

Then, if $x_0 \in U(x^\star, R_\varepsilon)$, the sequence $\{x_n\}$ generated by Newton's method is well defined, remains in $U(x^\star, R_\varepsilon)$ for each $n = 0, 1, \cdots$ and converges to $x^\star$. Moreover, the following estimate holds:
$$\|x_{n+1} - x_n\| \le \frac{\gamma\,(2 - \gamma\,\|x_n - x^\star\|)\,(1 - \varepsilon\gamma\,\|x_n - x^\star\|)^2}{(1 - \gamma\,\|x_n - x^\star\|)^2\left(2\,(1 - \varepsilon\gamma\,\|x_n - x^\star\|)^2 - 1\right)}\, \|x_n - x^\star\|^2. \tag{5.3.14}$$

Remark 5.3.6. If $\varepsilon = 1$ (i.e. $\gamma_\star = \gamma$), our results reduce to the ones given by Wang [31] (see also [30, 32]). Otherwise, if
$$\frac{1}{2}\left(1 - \frac{1}{\sqrt{2}}\right) \le \frac{\gamma_\star}{\gamma} < 1, \tag{5.3.15}$$

then, according to (5.3.9), our convergence radius is larger. Moreover, our error bounds are tighter if $\gamma_\star < \gamma$.
Remark 5.3.7. Let us define the function $f_{\varepsilon_1} : I_{\varepsilon_1} = \left[0, \frac{1}{\varepsilon_1}\right] \longrightarrow \mathbb{R}$ for $\varepsilon_1 > 0$ by
$$f_{\varepsilon_1}(t) = (2 - t)\,t - (1 - t)^2\,(1 - \varepsilon_1 t). \tag{5.3.16}$$
Suppose that
$$\varepsilon_1 > \frac{1}{2}. \tag{5.3.17}$$
Then, we have
$$f_{\varepsilon_1}(0) = -1 < 0 \quad \text{and} \quad f_{\varepsilon_1}\!\left(\frac{1}{\varepsilon_1}\right) > 0.$$
Denote by $\mu^\star_{\varepsilon_1}$ the minimal zero of $f_{\varepsilon_1}$ on $I_{\varepsilon_1}$. Define the function $g_{\varepsilon_1} : I_{\varepsilon_1} \longrightarrow \mathbb{R}$ by
$$g_{\varepsilon_1}(t) = \frac{(2 - t)\,t}{(1 - t)^2\,(1 - \varepsilon_1 t)}. \tag{5.3.18}$$
Then, we have that
$$0 \le g_{\varepsilon_1}(t) < 1 \quad \text{for each } t \in [0, \mu^\star_{\varepsilon_1}).$$
Set
$$L_0 = \varepsilon_1\gamma \quad \text{and} \quad R_{\varepsilon_1} = \frac{\mu^\star_{\varepsilon_1}}{\gamma}. \tag{5.3.19}$$
Hence, we arrive at the following result.

Theorem 5.3.8. Suppose that

(a) There exists $x^\star \in U(x_0, R)$ such that $F(x^\star) = 0$ and $F'(x^\star)^{-1} \in L(Y, X)$.

(b) Operator $F$ satisfies the center $L_0$-Lipschitz condition at $x^\star$ on $U(x^\star, \frac{1}{\varepsilon_1})$ for $\varepsilon_1$ satisfying (5.3.17) and the $\gamma$-Lipschitz condition at $x^\star$ on $U(x^\star, \frac{1}{\gamma})$.

Then, if $x_0 \in U(x^\star, R_{\varepsilon_1})$, the sequence $\{x_n\}$ generated by Newton's method is well defined, remains in $U(x^\star, R_{\varepsilon_1})$ for each $n = 0, 1, \cdots$ and converges to $x^\star$. Moreover, the following estimate holds:
$$\|x_{n+1} - x_n\| \le \frac{\gamma\,(2 - \gamma\,\|x_n - x^\star\|)\, \|x_n - x^\star\|^2}{(1 - \gamma\,\|x_n - x^\star\|)^2\,(1 - \varepsilon_1\,\|x_n - x^\star\|)}. \tag{5.3.20}$$

5.4. Numerical Examples


In this section we provide numerical examples.

Example 5.4.1. (a) Consider $\gamma = 1.8$, $\gamma_0 = 0.44$ and $\eta = 0.1$. Using (5.2.10) and (5.2.11), we get that
$$\frac{\gamma_0}{\gamma} = 0.2444444444 \le 1 - \frac{1}{\sqrt{2}} = 0.2928932190, \qquad \frac{1}{\gamma} = 0.5555555556,$$

Table 5.4.1. Comparison Table


n sn tn sn+1 − sn tn+1 − tn
1 .1 .1 .0051130691 .0059006211
2 .1051130691 .1059006211 .0000169735 .0000230132
3 .1051300426 .1059236343 2e-10 4e-10
4 .1051300428 .1059236347 0 0
5 ∼ ∼ ∼ ∼

$$s_2 = 0.1059236776, \quad s_3 = 0.1060526606, \quad s_4 = 0.1060527234$$
and
$$s_n = s_4 = 0.1060527234 \quad \text{for each } n = 5, 6, 7, \cdots.$$
That is, $s_n < 1/\gamma$ for each $n = 1, 2, \cdots$ and condition (5.2.11) holds. Hence, our Theorem 5.2.8 is applicable. We have that
$$\alpha = 0.18 > 3 - 2\sqrt{2} = 0.171572876.$$
Hence the older convergence criteria in [32] do not hold.

(b) Consider now $\gamma = 0.5$, $\gamma_0 = 0.44$ and $\eta = 0.1$. Using (5.2.10) and (5.2.11), we get that
$$\frac{\gamma_0}{\gamma} = 0.88 > 1 - \frac{1}{\sqrt{2}} = 0.2928932190, \qquad \frac{1}{\gamma_0}\left(1 - \frac{1}{\sqrt{2}}\right) = 0.665666406,$$
$$s_2 = 0.1051130691, \quad s_3 = 0.1051300426, \quad s_4 = 0.1051300428$$
and
$$s_n = s_4 = 0.1051300428 \quad \text{for each } n = 5, 6, \cdots.$$
That is, $s_n < (1 - 1/\sqrt{2})/\gamma_0$ for each $n = 1, 2, \cdots$ and condition (5.2.11) holds. Hence, our Theorem 5.2.8 is applicable. We also have that
$$\alpha = 0.05 \le 0.171572876.$$

Hence the convergence criterion in [31] is also satisfied. We can now compare our results of Theorem 5.2.8 (see also the sequence $\{s_n\}$ given by (5.2.10)) to the ones given in [31, 32] (see also $\{t_n\}$ given by (5.1.10)). Table 5.4.1 shows that our error bounds using the sequence $\{s_n\}$ are tighter than those given in [32].
Example 5.4.2. Let the function $h : \mathbb{R} \longrightarrow \mathbb{R}$ be defined by
$$h(x) = \begin{cases} 0 & \text{if } x \le 0 \\ x & \text{if } x \ge 0. \end{cases}$$

Define the function $F$ by
$$F(x) = \begin{cases} \varpi - x + \dfrac{1}{18}\,x^3 + \dfrac{x^2}{1-x} & \text{if } x \le \dfrac{1}{2} \\[1ex] \varpi - \dfrac{71}{144} + 2x^2 & \text{if } x \ge \dfrac{1}{2}, \end{cases} \tag{5.4.1}$$

where $\varpi > 0$ is a constant. Then, we have that
$$F'(x) = \begin{cases} -2 + \dfrac{1}{(1-x)^2} + \dfrac{x^2}{6} & \text{if } x \le \dfrac{1}{2} \\[1ex] 4x & \text{if } x \ge \dfrac{1}{2} \end{cases} \tag{5.4.2}$$
and
$$F''(x) = \begin{cases} \dfrac{2}{(1-x)^3} + \dfrac{x}{3} & \text{if } x \le \dfrac{1}{2} \\[1ex] 4 & \text{if } x \ge \dfrac{1}{2}. \end{cases} \tag{5.4.3}$$
We shall first show that $F'$ satisfies the $L$-Lipschitz condition (5.2.2) on $U(0, 1)$, where
$$L(u) = \frac{2}{(1-u)^3} + \frac{1}{6} \quad \text{for each } u \in [0, 1), \tag{5.4.4}$$
and the $L_0$-center-Lipschitz condition (5.2.3) on $U(0, 1)$, where
$$L_0(u) = \frac{2}{(1-u)^3} + \frac{1}{12} \quad \text{for each } u \in [0, 1). \tag{5.4.5}$$

It follows from (5.4.3) that
$$L(u) < L(v) \quad \text{for each } 0 \le u < v < 1 \tag{5.4.6}$$
and
$$0 < F''(u) < F''(|u|) < L(|u|) \quad \text{for each } u \neq \frac{1}{2},\ |u| < 1. \tag{5.4.7}$$
2
Let $x, y \in U(0, 1)$ with $|y| + |x - y| < 1$. Then, it follows from (5.4.6) and (5.4.7) that
$$|F'(x) - F'(y)| \le |x - y| \int_0^1 F''\big(y + t\,(x - y)\big)\, dt \le |x - y| \int_0^1 L\big(|y| + t\,|x - y|\big)\, dt. \tag{5.4.8}$$
Hence, $F'$ satisfies the $L$-Lipschitz condition (5.2.2) on $U(0, 1)$. Similarly, using (5.4.2) and (5.4.5), we deduce that $F'$ satisfies the $L_0$-center-Lipschitz condition (5.2.3) on $U(0, 1)$.
Notice that
$$L_0(u) < L(u) \quad \text{for each } u \in [0, 1). \tag{5.4.9}$$
Table 5.4.2 shows that our error bounds $s_{n+1} - s_n$ are finer than $t_{n+1} - t_n$.

Example 5.4.3. Let $X = Y = \mathbb{R}^2$, $x_0 = (1, 0)$, $D = U(x_0, 1 - \kappa)$ for $\kappa \in (0, 1)$. Let us define the function $F$ on $D$ as follows:
$$F(x) = \left( \zeta_1^3 - \zeta_2 - \kappa,\ \zeta_1 + 3\zeta_2 - 3\kappa \right) \quad \text{with } x = (\zeta_1, \zeta_2). \tag{5.4.10}$$
Using (5.4.10) we see that the $\gamma$-Lipschitz condition is satisfied for $\gamma = 2 - \kappa$. We also have that $\eta = (1 - \kappa)/3$.

Table 5.4.2. Comparison Table


n sn tn sn+1 − sn tn+1 − tn
0 0 0 .05 .05
1 .05 .05 .00308148876 .00321390287
2 .05308148876 .05321390287 .00001307052 .00001479064
3 .05309455928 .05322869351

Table 5.4.3. Comparison Table


n sn tn sn+1 − sn tn+1 − tn
0 0.000000e + 00 0.000000e + 00 1.000000e − 01 1.000000e − 01
1 1.000000e − 01 1.000000e − 01 1.52215005e − 02 2.201246e − 02
2 1.152215005e − 01 1.220125e − 01 6.507434e − 04 1.683820e − 03
3 1.158722439e − 01 1.236963e − 01 1.2499e − 06 1.069600e − 05
4 1.158734938e − 01 1.237070e − 01 0 4.338887e − 10
5 ∼ 1.237070e − 01 ∼ 7.140132e − 19
6 ∼ 1.237070e − 01 ∼ 1.933579e − 36
7 ∼ 1.237070e − 01 ∼ 1.417992e − 71
8 ∼ 1.237070e − 01 ∼ 7.626002e − 142
9 ∼ 1.237070e − 01 ∼ 2.205685e − 282

Case I. Let $\kappa = 0.6255$. Then we notice that (5.1.5) is not satisfied since $\alpha = 0.1715834166 > 3 - 2\sqrt{2} = 0.171572875$. Hence there is no guarantee that Newton's method starting from $x_0$ will converge to $x^\star = (\sqrt[3]{\kappa}, 0) = (0.85521599, 0)$ (cf. [14, 19, 26, 27, 30, 31, 32]). However, our results can apply. Indeed, using the definition of the Lipschitz and center-Lipschitz conditions we have that $L_0 = 3 - \kappa$ and $L = 4\sqrt{2}\,(2 - \kappa)$. Hence, (5.2.31) is satisfied since $h = L_1\eta = 0.3396683409 < 0.5$. We conclude that Theorem 5.2.12 is applicable and the iteration $\{s_n\}$ converges to $x^\star$.

Case II. Let $\kappa = 0.7$. It can be seen that condition (5.1.5) holds since $\alpha = 0.13 \le 3 - 2\sqrt{2}$. We also obtain that $h = 0.2626128133 < 0.5$. We get in turn that $1/\gamma = 0.7692307$,
$$\frac{1}{\gamma_0}\left(1 - \frac{1}{\sqrt{2}}\right) = 0.2899932 \quad \text{and} \quad 1 - \frac{1}{\sqrt{2}} = 0.29289321 < 0.7692307.$$
Then condition (5.2.31) also holds. Using Theorem 5.2.8, the $\gamma_0$-center-Lipschitz condition is satisfied if
$$\left\| F'(x_0)^{-1}\left( F'(x) - F'(x_0) \right) \right\| < \frac{1}{(1 - \gamma_0\,\|x - x_0\|)^2} - 1,$$
which is certainly satisfied for, say, $\gamma_0 = 1.01$. Note that $\gamma_0 < 1.3 = \gamma$. Table 5.4.3 compares the sequences $\{s_n\}$, $\{t_n\}$ and the error bounds $t_{n+1} - t_n$, $s_{n+1} - s_n$. We also observe that $\{s_n\}$ is a finer majorizing sequence than $\{t_n\}$.
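A small sketch of our own that simply runs Newton's method on $F$ from (5.4.10) as printed, starting from $x_0 = (1, 0)$ with $\kappa = 0.7$ (Case II); we only report the residual norms, which exhibit the expected quadratic decrease:

```python
import numpy as np

kappa = 0.7

def F(z):
    z1, z2 = z
    return np.array([z1**3 - z2 - kappa, z1 + 3*z2 - 3*kappa])

def DF(z):
    z1, _ = z
    return np.array([[3*z1**2, -1.0],
                     [1.0,      3.0]])

x = np.array([1.0, 0.0])
for n in range(8):
    r = F(x)
    print(f"n = {n}: ||F(x_n)|| = {np.linalg.norm(r):.3e}")
    x = x - np.linalg.solve(DF(x), r)   # Newton step in R^2
```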

Conclusion
A convergence analysis of Newton's method is provided for approximating a locally unique solution of a nonlinear equation in a Banach space setting. Using Smale's α-theory and the center-Lipschitz condition, we presented a new convergence analysis with a larger convergence domain and weaker sufficient convergence conditions. Moreover, these advantages are obtained under the same computational cost as in earlier studies such as [14, 19, 27, 30, 31, 32]. Numerical examples validating the theoretical results are also provided in this chapter.
References

[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators.
Numer. Funct. Anal. Optim. 25 (2004), 397–405.

[2] Argyros, I.K., A convergence analysis for Newton’s method based on Lipschitz, center
Lipschitz conditions and analytic operators. PanAmer. Math. J. 13 (2003), 35–42.

[3] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space. J. Math. Anal. Appl. 298 (2004),
374–397.

[4] Argyros, I.K., On the Newton-Kantorovich hypothesis for solving equations. J. Comp.
Appl. Math. 169 (2004), 315–332.

[5] Argyros, I.K., Computational theory of iterative methods. Series: Studies in Comp.
Math., 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co. New York, U.S.A,
2007.

[6] Argyros, I.K., On the semilocal convergence of a Newton-type method in Banach


spaces under a gamma-type condition. J. Concr. Appl. Math. 6 (2008), 33–44.

[7] Argyros, I.K., A new semilocal convergence theorem for Newton’s method under
a gamma-type condition. Atti Semin. Mat. Fis. Univ. Modena Reggio Emilia 56
(2008/09), 31–40.

[8] Argyros, I.K., Semilocal convergence of Newton’s method under a weak gamma con-
dition. Adv. Nonlinear Var. Inequal. 13 (2010), 65–73.

[9] Argyros, I.K., A semilocal convergence analysis for directional Newton methods.
Math. Comput. 80 (2011), 327–343.

[10] Argyros, I.K., Hilout, S., Extending the Newton-Kantorovich hypothesis for solving
equations. J. Comput. Appl. Math. 234 (2010), 2993–3006.

[11] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method.
J. Complexity 28 (2012), 364–387.

[12] Argyros, I.K., Hilout, S., Convergence of Newton's method under a weak majorant condition. J. Comput. Appl. Math. 236 (2012), 1892–1902.

[13] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press/Taylor and Francis Publ., New York, 2012.
[14] Cianciaruso, F. Convergence of Newton-Kantorovich approximations to an approxi-
mate zero. Numer. Funct. Anal. Optim. 28 (2007), 631–645.
[15] Dedieu, J.P., Points fixes, zéros et la méthode de Newton, 54, Springer, Berlin, 2006.
[16] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish) Gac. R. Soc. Mat. Esp. 13
(2010), 53–76.
[17] Ezquerro, J.A., Hernández, M.A., An improvement of the region of accessibility of
Chebyshev’s method from Newton’s method. Math. Comp. 78 (2009), 1613–1627.
[18] Ezquerro, J.A., Hernández, M.A., Romero, N., Newton-type methods of high order
and domains of semilocal and global convergence. App. Math. Comp. 214 (2009),
142–154.
[19] Guo, X., On semilocal convergence of inexact Newton methods. J. Comput. Math. 25
(2007), 231–242.
[20] Hernández, M.A., A modification of the classical Kantorovich conditions for New-
ton’s method. J. Comp. Appl. Math. 137 (2001), 201–205.
[21] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.
[22] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[23] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method. J. Complexity 25 (2009) 38–62.
[24] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton-Kantorovich type theorems. J. Complexity 26 (2010), 3–42.
[25] Rheinboldt, W.C., On a theorem of S. Smale about Newton's method for analytic mappings. Appl. Math. Lett. 1 (1988), 3–42.
[26] Shen, W., Li, C., Kantorovich-type convergence criterion for inexact Newton methods.
Appl. Numer. Math. 59 (2009) 1599–1611.
[27] Shen, W., Li, C., Smale’s α-theory for inexact Newton methods under the γ-condition.
J. Math. Anal. Appl. 369 (2010), 29–42.
[28] Smale, S., Newton’s method estimates from data at one point. The merging of dis-
ciplines: new directions in pure, applied, and computational mathematics (Laramie,
Wyo., 1985) (1986), 185–196, Springer, New York.
[29] Smale, S., Algorithms for solving equations. Proceedings of the International
Congress of Mathematicians, Vol. 1, 2 (Berkeley, Calif., 1986) (1987), 172–195, Amer.
Math. Soc., Providence, RI.

[30] Wang, D.R., Zhao, F.G. The theory of Smale’s point estimation and its applications,
Linear/nonlinear iterative methods and verification of solution (Matsuyama, 1993). J.
Comput. Appl. Math. 60 (1995), 253–269.

[31] Wang, X.H., Convergence of Newton’s method and inverse function theorem in Ba-
nach space. Math. Comp. 68 (1999), 169–186.

[32] Wang, X.H., Han, D.F., On dominating sequence method in the point estimate and
Smale theorem. Sci. China Ser. A 33 (1990), 135–144.

[33] Yakoubsohn, J.C., Finding zeros of analytic functions: α–theory for Secant type
method. J. Complexity 15 (1999), 239–281.

[34] Zabrejko, P.P., Nguen, D.F., The majorant method in the theory of Newton-
Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim.
9 (1987), 671–684.
Chapter 6

Newton-Type Methods on
Riemannian Manifolds under
Kantorovich-Type Conditions

6.1. Introduction
Let us suppose that $F$ is an operator defined on an open convex subset $\Omega$ of a Banach space $E$. Let us denote by $DF(x_n)$ the first Fréchet derivative of $F$ at $x_n$.
Given an integer $m$ and an initial point $x_0 \in E$, we move from $x_n$ to $x_{n+1}$ through an intermediate sequence $\{y_n^i\}_{i=0}^m$, $y_n^0 = x_n$, which is a generalization of the Newton ($m = 1$) and simplified Newton ($m = \infty$) methods:
$$\begin{cases} y_n^1 = y_n^0 - DF(y_n^0)^{-1}\, F(y_n^0) \\ y_n^2 = y_n^1 - DF(y_n^0)^{-1}\, F(y_n^1) \\ \quad \vdots \\ y_n^m = x_{n+1} = y_n^{m-1} - DF(y_n^0)^{-1}\, F(y_n^{m-1}). \end{cases}$$

This family of methods was introduced by E. Shamanskii [43]. Under appropriate con-
ditions, these iterative methods converge to a root x∗ of the equation F (x) = 0. More-
over, if x0 is sufficiently near x∗ the method has order of convergence at least m + 1. See
[33, 38, 43, 46]. In particular, notice that [38] uses a modification of $DF(x_n)$ at each sub-step. In [39, 40, 41], Parida and Gupta provided some recurrence relations to establish a convergence analysis for third order Newton-type methods under Lipschitz or Hölder conditions on the second Fréchet derivative. A modification of the approach used in [39]
and some applications are presented by Chun et al. in [19]. Recently, Argyros and Ren
[17] expanded the applicability of Halley’s method using a center-Lipschitz condition on
the second Fréchet derivative instead of Lipschitz’s condition.
On the other hand, in recent years attention has been paid to studying Newton's method on manifolds, since there are many numerical problems posed on manifolds that arise naturally in many contexts. Some examples include eigenvalue problems, minimization problems with orthogonality constraints, optimization problems with equality constraints and invariant subspace computations. See for instance [1, 2, 3, 7, 15, 20, 21, 27, 29, 35, 36, 48, 49]. For these problems, one has to compute solutions of equations or to find zeros of a vector field on Riemannian manifolds.
The study of the convergence of iterative methods usually centers on two types of analysis: semilocal and local. The semilocal convergence analysis gives criteria, based on the information around an initial point, ensuring the convergence of iterative methods; the local one finds estimates of the radii of convergence balls based on the information around a solution. There is a plethora of studies on the weakness and/or extension of the hypotheses made on the underlying operators; see for example [4, 5, 6, 12, 14, 32, 34, 48, 49].
The semilocal convergence analysis of Newton's method is based on the celebrated Kantorovich theorem. This theorem is a fundamental result in numerical analysis, e.g., for providing an iterative method for computing zeros of polynomials or of systems of nonlinear equations. Moreover, this theorem is a very useful result in nonlinear functional analysis, e.g., for establishing that a nonlinear equation in an abstract space has a solution. Let us recall Kantorovich's theorem in a Banach space setting.

Theorem 6.1.1. [32] Let $E$ be a Banach space, $\Omega \subseteq E$ be an open convex set and $F : \Omega \longrightarrow E$ be a continuous operator such that $F \in C^1$ and $DF$ is Lipschitz on $\Omega$:
$$\|DF(x) - DF(y)\| \le l\, \|x - y\| \quad \text{for all } x, y \in \Omega,\ l > 0.$$
Suppose that for some $x_0 \in \Omega$, $DF(x_0)$ is invertible and that for some $a > 0$ and $b \ge 0$:
$$\left\| DF(x_0)^{-1} \right\| \le a, \qquad \left\| DF(x_0)^{-1} F(x_0) \right\| \le b,$$
$$h = a\,b\,l \le \frac{1}{2} \tag{6.1.1}$$
and
$$B(x_0, t_*) \subseteq \Omega \quad \text{where} \quad t_* = \frac{1}{a\,l}\left(1 - \sqrt{1 - 2h}\right).$$
If
$$v_k = -DF(x_k)^{-1} F(x_k), \qquad x_{k+1} = x_k + v_k,$$
then $\{x_k\}_{k\in\mathbb{N}} \subseteq B(x_0, t_*)$ and $x_k \longrightarrow x_*$, which is the unique zero of $F$ in $B[x_0, t_*]$. Furthermore, if $h < \frac{1}{2}$ and $B(x_0, r) \subseteq \Omega$ with
$$t_* < r \le t_{**} = \frac{1}{a\,l}\left(1 + \sqrt{1 - 2h}\right),$$
then $x_*$ is also the unique zero of $F$ in $B(x_0, r)$. Also, the error bound is:
$$\|x_k - x_*\| \le \frac{b}{h}\,(2h)^{2^k}, \qquad k = 1, 2, \ldots$$
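The following hedged sketch checks criterion (6.1.1) for a concrete scalar problem of our own choosing, $F(x) = x^2 - 2$ on $\Omega = (1, 2)$ with $x_0 = 1.5$ (so $DF(x) = 2x$ and $l = 2$ works), and compares the actual Newton errors with the bound of Theorem 6.1.1:

```python
import math

def F(x):  return x*x - 2.0
def DF(x): return 2.0*x

x0, l = 1.5, 2.0
a = 1.0 / abs(DF(x0))            # ||DF(x0)^{-1}|| <= a
b = abs(F(x0) / DF(x0))          # ||DF(x0)^{-1} F(x0)|| <= b
h = a * b * l
print(f"h = a*b*l = {h:.6f} <= 1/2: {h <= 0.5}")
print(f"t* = {(1 - math.sqrt(1 - 2*h)) / (a*l):.6f}")

x = x0
for k in range(1, 5):
    x = x - F(x)/DF(x)                          # Newton iteration
    bound = (b/h) * (2*h)**(2**k)               # error bound of Theorem 6.1.1
    print(f"k = {k}: error = {abs(x - math.sqrt(2)):.3e}, bound = {bound:.3e}")
```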

Although the concepts will be defined later on, to extend the method to Riemannian manifolds we preliminarily say that the derivative of $F$ at $x_n$ is replaced by the covariant derivative of $X$ at $p_n$:
$$\nabla_{(.)} X(p_n) : T_{p_n}M \longrightarrow T_{p_n}M, \qquad v \longmapsto \nabla_Y X,$$
where $Y$ is a vector field satisfying $Y(p) = v$. We adopt the notation $DX(p)\,v = \nabla_Y X(p)$; hence $DX(p)$ is a linear mapping of $T_pM$ into $T_pM$. So, in this new context,
$$-F'(x_n)^{-1} F(x_n)$$
is written as
$$-DX(p_n)^{-1} X(p_n) \quad \text{or} \quad -\left( \nabla_{(.)} X(p_n) \right)^{-1} X(p_n).$$
Now we can write Kantorovich's theorem in the new context. A proof of this theorem can be found in [27]. We will say that a singularity of a vector field $X$ is a point $p \in M$ for which $X(p) = 0$.

Theorem 6.1.2. [27] (Kantorovich's theorem on Riemannian manifolds) Let $M$ be a Riemannian manifold, $\Omega \subseteq M$ be an open convex set, $X \in \chi(M)$ and $DX \in \mathrm{Lip}_l(\Omega)$. Suppose that for some $p_0 \in \Omega$, $DX(p_0)$ is invertible and that for some $a > 0$ and $b \ge 0$:
$$\left\| DX(p_0)^{-1} \right\| \le a \quad \left( \left\| \left( \nabla_{(.)} X(p_0) \right)^{-1} \right\| \le a \right),$$
$$\left\| DX(p_0)^{-1} X(p_0) \right\| \le b \quad \left( \left\| \left( \nabla_{(.)} X(p_0) \right)^{-1} X(p_0) \right\| \le b \right),$$
$$h = a\,b\,l \le \frac{1}{2} \tag{6.1.2}$$
and
$$B(p_0, t_*) \subseteq \Omega \quad \text{where} \quad t_* = \frac{1}{a\,l}\left(1 - \sqrt{1 - 2h}\right).$$
If
$$v_k = -DX(p_k)^{-1} X(p_k), \qquad p_{k+1} = \exp_{p_k}(v_k),$$
then $\{p_k\}_{k\in\mathbb{N}} \subseteq B(p_0, t_*)$ and $p_k \longrightarrow p_*$, which is the unique singularity of $X$ in $B[p_0, t_*]$, where $\exp_{p_k}$ is defined in (6.2.6). Furthermore, if $h < \frac{1}{2}$ and $B(p_0, r) \subseteq \Omega$ with
$$t_* < r \le t_{**} = \frac{1}{a\,l}\left(1 + \sqrt{1 - 2h}\right),$$
then $p_*$ is also the unique singularity of $X$ in $B(p_0, r)$. The error bound is:
$$d(p_k, p_*) \le \frac{b}{h}\,(2h)^{2^k}, \qquad k = 1, 2, \ldots \tag{6.1.3}$$

The Kantorovich hypothesis (6.1.2) in Theorem 6.1.2 is only a sufficient convergence criterion for Newton's method as well as for the modified Newton's method. There are numerical examples in the literature showing that Newton's method converges although the Kantorovich hypothesis is not satisfied (see [6, 9, 12, 14, 32] and the references therein). In the present chapter we show how to expand the convergence domain of Newton's method without additional hypotheses. We achieve this goal by introducing more precise majorizing sequences for Newton's method than in the earlier studies in the field. Notice that if $DF$ is Lipschitz on $\Omega$, then there exists a constant $l_0 > 0$ such that $DF$ is center-Lipschitz on $\Omega$:
$$\|DF(x) - DF(x_0)\| \le l_0\, \|x - x_0\| \quad \text{for all } x \in \Omega.$$

Clearly,
$$l_0 \le l \tag{6.1.4}$$
holds in general and $l/l_0$ can be arbitrarily large [6, 12, 14]. In particular, we show that in the case of the modified Newton's method, condition (3) of Theorem 6.1.2 can be replaced by
$$h_0 = a\,b\,l_0 \le \frac{1}{2}, \tag{6.1.5}$$
whereas in the case of Newton's method, condition (3) of Theorem 6.1.2 can be replaced by
$$h_1 = \frac{a\,b}{8}\left( l + 4\,l_0 + \sqrt{l^2 + 8\,l_0\,l} \right) \le \frac{1}{2} \tag{6.1.6}$$
or by
$$h_2 = \frac{a\,b}{8}\left( 4\,l_0 + \sqrt{l\,l_0} + \sqrt{8\,l_0^2 + l_0\,l} \right) \le \frac{1}{2}. \tag{6.1.7}$$
Notice that
$$h \le \frac{1}{2} \Longrightarrow h_1 \le \frac{1}{2} \Longrightarrow h_2 \le \frac{1}{2} \Longrightarrow h_0 \le \frac{1}{2} \tag{6.1.8}$$
but not necessarily vice versa unless $l_0 = l$. Moreover, we have that
$$\frac{h_1}{h} \longrightarrow \frac{1}{4}, \quad \frac{h_2}{h_1} \longrightarrow 0, \quad \frac{h_2}{h} \longrightarrow 0 \quad \text{and} \quad \frac{h_0}{h} \longrightarrow 0 \quad \text{as} \quad \frac{l_0}{l} \longrightarrow 0. \tag{6.1.9}$$
The preceding estimates show by how many times (at most) the applicability of the modified
Newton’s method or Newton’s method can be extended. Moreover, we show that under the
new convergence conditions, the error estimates on the distances d(pn , pn−1 ), d(pn , p∗ )
can be tighter and the information on the location of the solution at least as precise as in
Theorem 6.1.2.
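A small sketch with sample numbers of our own, comparing the criteria (6.1.1) and (6.1.5)-(6.1.7) as the ratio $l_0/l$ shrinks:

```python
import math

a, b, l = 1.0, 0.4, 1.0      # sample data (ours); h = a*b*l stays fixed
for ratio in (1.0, 0.5, 0.1, 0.01):
    l0 = ratio * l
    h  = a*b*l
    h1 = (a*b/8) * (l + 4*l0 + math.sqrt(l*l + 8*l0*l))          # (6.1.6)
    h2 = (a*b/8) * (4*l0 + math.sqrt(l*l0) + math.sqrt(8*l0*l0 + l0*l))  # (6.1.7)
    h0 = a*b*l0                                                  # (6.1.5)
    print(f"l0/l = {ratio}: h = {h:.4f}, h1 = {h1:.4f}, h2 = {h2:.4f}, h0 = {h0:.4f}")
# As l0/l -> 0 one sees h1/h -> 1/4 and h2/h, h0/h -> 0, in line with (6.1.9):
# (6.1.6) and (6.1.7) can hold even when the Kantorovich condition h <= 1/2 fails.
```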
The chapter is organized as follows: Section 6.2. contains some definitions and fundamental properties of Riemannian manifolds. The convergence of the simplified Newton's method and the order of convergence using normal coordinates are given in Sections 6.3. and 6.4.. A family of high order Newton-type methods, precise majorizing sequences and the corresponding convergence results are provided in Sections 6.5. and 6.6..

6.2. Basic Definitions and Preliminary Results


In this section, we introduce some definitions and fundamental properties of Riemannian
manifolds in order to make this chapter as self-contained as possible. These definitions and
properties can be found in [20, 21, 22, 28, 32, 35, 37, 38, 47]. The preceding references are
recommended to the interested reader for further study.

Definition 6.2.1. [28, 37, 38, 47] A differentiable manifold of dimension $m$ is a set $M$ together with a family of injective mappings $x_\alpha : U_\alpha \subset \mathbb{R}^m \longrightarrow M$ of open sets $U_\alpha$ of $\mathbb{R}^m$ into $M$ such that:
(i) $\bigcup_\alpha x_\alpha(U_\alpha) = M$.
(ii) for any pair $\alpha, \beta$ with $x_\alpha(U_\alpha) \cap x_\beta(U_\beta) = W \neq \emptyset$, the sets $x_\alpha^{-1}(W)$ and $x_\beta^{-1}(W)$ are open sets in $\mathbb{R}^m$ and the mappings $x_\beta^{-1} \circ x_\alpha$ are differentiable.
(iii) The family $\{(U_\alpha, x_\alpha)\}$ is maximal relative to the conditions (i) and (ii).
The pair $(U_\alpha, x_\alpha)$ (or the mapping $x_\alpha$) with $p \in x_\alpha(U_\alpha)$ is called a parametrization (or system of coordinates) of $M$ at $p$; $x_\alpha(U_\alpha)$ is then called a neighborhood at $p$ and $\left( x_\alpha(U_\alpha), x_\alpha^{-1} \right)$ is called a coordinate chart. A family $\{(U_\alpha, x_\alpha)\}$ satisfying (i) and (ii) is called a differentiable structure on $M$.

Let $M$ be a real manifold, $p \in M$, and denote by $T_pM$ the tangent space at $p$ to $M$. Let $x : U \subset \mathbb{R}^m \longrightarrow M$ be a system of coordinates around $p$ with $x(x_1, x_2, \cdots, x_m) = p$ and its associated basis
$$\left\{ \frac{\partial}{\partial x_1}\bigg|_p, \frac{\partial}{\partial x_2}\bigg|_p, \cdots, \frac{\partial}{\partial x_m}\bigg|_p \right\}$$
in $T_pM$. The tangent bundle $TM$ is defined as
$$TM = \{ (p, v) : p \in M \text{ and } v \in T_pM \} = \bigcup_{p \in M} T_pM$$

and provides a differentiable structure of dimension 2m [22]. Next, we define the concept
of Riemannian metric:

Definition 6.2.2. A Riemannian metric on a differentiable manifold $M$ is a correspondence which associates to each point $p$ of $M$ an inner product $\langle \cdot, \cdot \rangle_p$ (that is, a symmetric, bilinear, positive-definite form) on the tangent space $T_pM$, which varies differentiably in the following sense: if $x : U \subset \mathbb{R}^m \longrightarrow M$ is a system of coordinates around $p$ with $x(x_1, x_2, \ldots, x_m) = p$, then
$$g_{ij}(x_1, x_2, \cdots, x_m) := \left\langle \frac{\partial}{\partial x_i}\bigg|_p, \frac{\partial}{\partial x_j}\bigg|_p \right\rangle_p = \left\langle dx^{-1}\!\left( \frac{\partial}{\partial x_i}\bigg|_p \right), dx^{-1}\!\left( \frac{\partial}{\partial x_j}\bigg|_p \right) \right\rangle,$$
in which $dx^{-1}$ is the tangent map of $x^{-1}$, and $g_{ij}$ is a differentiable operator on $U$ for each $i, j = 1, 2, \ldots, m$. The operators $g_{ij}$ are called the local representatives of the Riemannian metric.

The inner product $\langle \cdot, \cdot \rangle_p$ induces in a natural way the norm $\|\cdot\|_p$. The subscript $p$ is usually deleted whenever there is no possibility of confusion.
If $p$ and $q$ are two elements of the manifold $M$ and $c : [0, 1] \longrightarrow M$ is a piecewise smooth curve connecting $p$ and $q$, then the arc length of $c$ is defined by
$$l(c) = \int_0^1 \|c'(t)\|\, dt = \int_0^1 \left\langle \frac{dc}{dt}, \frac{dc}{dt} \right\rangle^{1/2} dt, \tag{6.2.1}$$
and the Riemannian distance from $p$ to $q$ by
$$d(p, q) = \inf_c\, l(c). \tag{6.2.2}$$

Definition 6.2.3. Let $\chi(M)$ be the set of all vector fields of class $C^\infty$ on $M$ and $\mathcal{D}(M)$ the ring of real-valued operators of class $C^\infty$ defined on $M$, that is:
$$\chi(M) = C^\infty\big(M, T_{(.)}M\big), \qquad \mathcal{D}(M) = C^\infty(M, \mathbb{R}).$$
An affine connection $\nabla$ on $M$ is a mapping
$$\nabla : \chi(M) \times \chi(M) \longrightarrow \chi(M), \qquad (X, Y) \longmapsto \nabla_X Y \tag{6.2.3}$$
which satisfies the following properties:
i) $\nabla_{fX + gY}\, Z = f\,\nabla_X Z + g\,\nabla_Y Z$.
ii) $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$.
iii) $\nabla_X (fY) = f\,\nabla_X Y + X(f)\,Y$,
where $X, Y, Z \in \chi(M)$ and $f, g \in \mathcal{D}(M)$.

Definition 6.2.4. Let $X$ be a $C^1$ vector field on $M$. The covariant derivative of $X$ determined by the Levi-Civita connection $\nabla$ defines on each $p \in M$ a linear application of $T_pM$ into itself:
$$DX(p) : T_pM \longrightarrow T_pM, \qquad v \longmapsto DX(p)(v) = \nabla_Y X(p), \tag{6.2.4}$$
where $Y$ is a vector field satisfying $Y(p) = v$. The value $DX(p)(v)$ depends only on the tangent vector $v = Y(p)$ since $\nabla$ is linear in $Y$; thus we can write
$$DX(p)(v) = \nabla_v X(p).$$


Let us consider a curve c : [a, b] −→ M and a vector field X along c, i.e. X (p) ∈ Tc0 (t) M
where c (t) = p for all t. We say that a vector field X is parallel along c (with respect to ∇) if
D X (p) (c0 (t)) = 0, the affine connection is compatible with the metric h., .i, when for any
Newton-Type Methods on Riemannian Manifolds 105

smooth curve c and any pair of parallel vector fields P and P0 along c, we have that hP, P0 i
is constant or equivalently
d


hX,Y i = ∇c0 (t) X,Y + X, ∇c0(t)Y ,
dt
where X and Y are vector fields along the differentiable curve c : I −→ M (see [22], [45]).
We say that ∇ is symmetric if

∇X Y − ∇Y X = [X,Y ] for all X,Y ∈ χ (M).

The theorem of Levi-Civita (see [45]) establishes that there exists a unique symmetric affine connection $\nabla$ on $M$ compatible with the metric. This connection is called the Levi-Civita connection.

Definition 6.2.5. [28, 37, 38, 47] A parameterized curve $\gamma : I \longrightarrow M$ is a geodesic at $t_0 \in I$ if $\nabla_{\gamma'(t)}\,\gamma'(t) = 0$ at the point $t_0$. If $\gamma$ is a geodesic at $t$ for all $t \in I$, we say that $\gamma$ is a geodesic. If $[a, b] \subseteq I$, the restriction of $\gamma$ to $[a, b]$ is called a geodesic segment joining $\gamma(a)$ to $\gamma(b)$.

Sometimes, by abuse of language, we refer to the image $\gamma(I)$ of a geodesic $\gamma$ as a geodesic. A basic property of a geodesic is that $\gamma'(t)$ is parallel along $\gamma(t)$; this implies that $\|\gamma'(t)\|$ is constant.
Let $B(p, r)$ and $B[p, r]$ be respectively the open geodesic ball and the closed geodesic ball with center $p$ and radius $r$, that is:
$$B(p, r) = \{ q \in M : d(p, q) < r \}, \qquad B[p, r] = \{ q \in M : d(p, q) \le r \}.$$

We define an open set U of M to be convex if given p, q ∈ U there exists a unique geodesic


in U joining p to q, and such that the length of the geodesic is d (p, q).
The Hopf and Rinow theorem (see [22]) gives necessary and sufficient conditions for $M$ to be a complete metric space. In particular, if $M$ is a complete metric space, then for any $q \in M$ there exists a geodesic $\gamma$, called a minimizing geodesic, joining $p$ to $q$ with
$$l(\gamma) = d(p, q). \tag{6.2.5}$$
Also, if $v \in T_pM$, there exists a unique minimizing geodesic $\gamma$ such that $\gamma(0) = p$ and $\gamma'(0) = v$. The point $\gamma(1)$ is called the image of $v$ by the exponential map at $p$; that is, there exists a well defined map
$$\exp_p : T_pM \longrightarrow M \tag{6.2.6}$$
such that
$$\exp_p(v) = \gamma(1),$$
and for any $t \in [0, 1]$,
$$\gamma(t) = \exp_p(tv).$$
It can be shown that $\exp_p$ defines a diffeomorphism of a neighborhood $\widehat{U}$ of the origin $0_p \in T_pM$ onto a neighborhood $U$ of $p \in M$, called a normal neighborhood of $p$ (see [22]).

Let $p \in M$ and $U$ a normal neighborhood of $p$. Let us consider an orthonormal basis $\{e_i\}_{i=1}^m$ of $T_pM$. This basis gives the isomorphism $f : \mathbb{R}^m \longrightarrow T_pM$ defined by $f(u_1, \cdots, u_m) = \sum_{i=1}^m u_i e_i$. If $q = \exp_p\left( \sum_{i=1}^m u_i e_i \right)$, we say that $(u_1, \ldots, u_m)$ are normal coordinates of $q$ in the normal neighborhood $U$ of $p$, and the coordinate chart is the composition
$$\varphi := \exp_p \circ f : \mathbb{R}^m \longrightarrow U.$$
One of the most important properties of normal coordinates is that the geodesics passing through $p$ are given by linear equations (see [44]).
The exponential map has many important properties [22], [44]. When the exponential
map is defined for each value of the parameter t ∈ R we will say that the Riemannian man-
ifold M is geodesically complete (or simply complete). The Hopf and Rinow theorem (see
[22]), also establishes that the property of the Riemannian manifold of being geodesically
complete is equivalent to being complete as a metric space.

Definition 6.2.6. [28, 37, 38, 47] Let $c$ be a piecewise smooth curve. For any pair $a, b \in \mathbb{R}$, we define the parallel transport along $c$, denoted by $P_c$, as
$$P_{c,a,b} : T_{c(a)}M \longrightarrow T_{c(b)}M, \qquad v \longmapsto V(c(b)), \tag{6.2.7}$$
where $V$ is the unique vector field along $c$ such that $\nabla_{c'(t)} V = 0$ and $V(c(a)) = v$.

It is easy to show that $P_{c,a,b}$ is linear and one-to-one; thus it is an isomorphism between any two tangent spaces $T_{c(a)}M$ and $T_{c(b)}M$. Its inverse is the parallel translation along the reversed portion of $c$ from $V(c(b))$ to $V(c(a))$; actually $P_{c,a,b}$ is an isometry between $T_{c(a)}M$ and $T_{c(b)}M$. Moreover, for a positive integer $i$ and for all $(v_1, v_2, \ldots, v_i) \in \left( T_{c(a)}M \right)^i$, we define $P_c^i$ as
$$P^i_{c,a,b} : \left( T_{c(a)}M \right)^i \longrightarrow \left( T_{c(b)}M \right)^i,$$
where
$$P^i_{c,a,b}(v_1, v_2, \ldots, v_i) = \left( P_{c,a,b}(v_1), P_{c,a,b}(v_2), \ldots, P_{c,a,b}(v_i) \right).$$
The parallel transport has the important properties:
$$P_{c,a,b} \circ P_{c,b,d} = P_{c,a,d}, \qquad P^{-1}_{c,b,a} = P_{c,a,b}. \tag{6.2.8}$$

Next, we generalize the concept of covariant derivative. We observe that
$$DX : C^k(TM) \longrightarrow C^{k-1}(TM), \qquad Y \longmapsto DX(Y) = \nabla_Y X, \tag{6.2.9}$$
where $TM$ is the tangent bundle, similarly to the higher order Fréchet derivative; see [18]. We define the higher order covariant derivatives, see [45], as the multilinear map or $j$-tensor
$$D^j X : \left( C^k(TM) \right)^j \longrightarrow C^{k-j}(TM)$$

given by
$$D^j X(Y_1, Y_2, \ldots, Y_{j-1}, Y) = \nabla_Y\left( D^{j-1} X(Y_1, Y_2, \ldots, Y_{j-1}) \right) - \sum_{i=1}^{j-1} D^{j-1} X(Y_1, Y_2, \ldots, \nabla_Y Y_i, \ldots, Y_{j-1}) \tag{6.2.10}$$
for each $Y_1, Y_2, \ldots, Y_{j-1} \in C^k(TM)$. In the case $j = 2$ we have
$$D^2 X : C^k(TM) \times C^k(TM) \longrightarrow C^{k-2}(TM)$$
and
$$D^2 X(Y_1, Y) = \nabla_Y\left( DX(Y_1) \right) - DX(\nabla_Y Y_1) = \nabla_Y\left( \nabla_{Y_1} X \right) - \nabla_{\nabla_Y Y_1} X. \tag{6.2.11}$$

The multilinearity refers to the structure of $C^k(M)$-module, such that the value of $D^j X(Y_1, Y_2, \ldots, Y_{j-1}, Y)$ at $p \in M$ only depends on the $j$-tuple of tangent vectors
$$(v_1, v_2, \ldots, v_j) = \left( Y_1(p), Y_2(p), \ldots, Y_{j-1}(p), Y(p) \right) \in (T_pM)^j.$$
Therefore, for any $p \in M$, we can define the map
$$D^j X(p) : (T_pM)^j \longrightarrow T_pM$$
by
$$D^j X(p)(v_1, v_2, \ldots, v_j) = D^j X(Y_1, Y_2, \ldots, Y_{j-1}, Y)(p). \tag{6.2.12}$$

Definition 6.2.7. [28, 37, 38, 47] Let $M$ be a Riemannian manifold, $\Omega \subseteq M$ an open convex set and $X \in \chi(M)$. The covariant derivative $DX = \nabla_{(.)} X$ is Lipschitz with constant $l > 0$ if for any geodesic $\gamma$ and $a, b \in \mathbb{R}$ such that $\gamma([a, b]) \subseteq \Omega$, it holds that:
$$\left\| P_{\gamma,b,a}\, DX(\gamma(b))\, P_{\gamma,a,b} - DX(\gamma(a)) \right\| \le l \int_a^b \|\gamma'(t)\|\, dt. \tag{6.2.13}$$
We will write $DX \in \mathrm{Lip}_l(\Omega)$.

Note that $P_{\gamma,b,a}\, DX(\gamma(b))\, P_{\gamma,a,b}$ and $DX(\gamma(a))$ are both operators defined on the same tangent plane $T_{\gamma(a)}M$. If $M$ is a Euclidean space, the above definition coincides with the usual Lipschitz definition for the operator $DF : M \longrightarrow M$.

Proposition 6.2.8. [28, 37, 38, 47] Let $c$ be a curve in $M$ and $X$ be a $C^1$ vector field on $M$. Then the covariant derivative of $X$ in the direction of $c'(s)$ is
$$DX(c(s))\, c'(s) = \nabla_{c'(s)} X_{c(s)} = \lim_{h \to 0} \frac{1}{h}\left( P_{c,s+h,s}\, X(c(s+h)) - X(c(s)) \right). \tag{6.2.14}$$

Note that if M = Rn the previous proposition agrees with the definition of classic direc-
tional derivative in Rn ; (see [45]).
It is also possible to obtain a version of the fundamental theorem of calculus for mani-
folds:

Theorem 6.2.9. [27] Let $c$ be a geodesic in $M$ and $X$ be a $C^1$ vector field on $M$. Then
$$P_{c,t,0}\, X(c(t)) = X(c(0)) + \int_0^t P_{c,s,0}\left( DX(c(s))\, c'(s) \right) ds. \tag{6.2.15}$$

Theorem 6.2.10. Let $c$ be a geodesic in $M$ and $X$ be a $C^2$ vector field on $M$. Then
$$P_{c,t,0}\, DX(c(t))\, c'(t) = DX(c(0))\, c'(0) + \int_0^t P_{c,s,0}\left( D^2 X(c(s))\big(c'(s), c'(s)\big) \right) ds. \tag{6.2.16}$$

Proof. Let us consider the vector field along the geodesic $c(s)$
$$Y(c(s)) = DX(c(s))\, c'(s).$$
By the previous theorem,
$$P_{c,t,0}\, Y(c(t)) = Y(c(0)) + \int_0^t P_{c,s,0}\left( DY(c(s))\, c'(s) \right) ds,$$
hence
$$P_{c,t,0}\, DX(c(t))\, c'(t) = DX(c(0))\, c'(0) + \int_0^t P_{c,s,0}\left( D\big( DX(c(s))\, c'(s) \big)\, c'(s) \right) ds.$$
By (6.2.11),
$$D^2 X(c(s))\big(c'(s), c'(s)\big) = \nabla_{c'(s)}\big( DX(c(s))\, c'(s) \big) - DX(c(s))\, \nabla_{c'(s)} c'(s) = D\big( DX(c(s))\, c'(s) \big)\, c'(s) - DX(c(s))\, \nabla_{c'(s)} c'(s).$$
Since $c(s)$ is a geodesic, we have $\nabla_{c'(s)} c'(s) = 0$, hence
$$D^2 X(c(s))\big(c'(s), c'(s)\big) = D\big( DX(c(s))\, c'(s) \big)\, c'(s).$$
Therefore
$$P_{c,t,0}\, DX(c(t))\, c'(t) = DX(c(0))\, c'(0) + \int_0^t P_{c,s,0}\left( D^2 X(c(s))\big(c'(s), c'(s)\big) \right) ds.$$

In a similar way, using an induction strategy, we can prove that
$$P_{c,s,0}\, D^n X(c(s))\, P^n_{c,0,s} - D^n X(c(0)) = \int_0^s P_{c,t,0}\, D^{n+1} X(c(t))\, P^n_{c,0,t}\big( c'(0), \ldots, c'(0) \big)\, dt. \tag{6.2.17}$$

Theorem 6.2.11. Let $c$ be a geodesic in $M$, $[0, 1] \subseteq \mathrm{Dom}(c)$ and $X$ be a $C^2$ vector field on $M$. Then
$$P_{c,1,0}\, X(c(1)) = X(c(0)) + DX(c(0))\, c'(0) + \int_0^1 (1-t)\, P_{c,t,0}\left( D^2 X(c(t))\big(c'(t), c'(t)\big) \right) dt. \tag{6.2.18}$$
Proof. Consider the curve
$$f(s) = P_{c,s,0}\, X(c(s))$$
in $T_{c(0)}M$. We have that
$$f^{(n)}(s) = P_{c,s,0}\, D^{(n)} X(c(s))\big( \underbrace{c'(s), c'(s), \cdots, c'(s)}_{n \text{ times}} \big). \tag{6.2.19}$$
Then
$$f''(s) = P_{c,s,0}\, D^2 X(c(s))\big( c'(s), c'(s) \big),$$
and from Taylor's theorem
$$f(1) = f(0) + f'(0)(1 - 0) + \int_0^1 (1-t)\, f''(t)\, dt.$$
Therefore
$$P_{c,1,0}\, X(c(1)) = X(c(0)) + DX(c(0))\, c'(0) + \int_0^1 (1-t)\, P_{c,t,0}\left( D^2 X(c(t))\big(c'(t), c'(t)\big) \right) dt.$$

Let us recall that if $A : T_pM \longrightarrow T_pM$, we can define $\|A\| = \sup\{ \|Av\| : v \in T_pM,\ \|v\| = 1 \}$. The following is an important lemma that allows one to know when an operator is invertible and also gives an estimate for its inverse.
Lemma 6.2.12. (Banach's Lemma [32]) Let $A$ be an invertible bounded linear operator in a Banach space $E$ and $B$ be a bounded linear operator in $E$. If
$$\left\| A^{-1} B - I \right\| < 1,$$
then $B^{-1}$ exists and
$$\left\| B^{-1} \right\| \le \frac{\left\| A^{-1} \right\|}{1 - \left\| A^{-1} B - I \right\|} \le \frac{\left\| A^{-1} \right\|}{1 - \left\| A^{-1} \right\| \|B - A\|}.$$
Moreover,
$$\left\| B^{-1} A \right\| \le \frac{1}{1 - \left\| A^{-1} B - I \right\|} \le \frac{1}{1 - \left\| A^{-1} \right\| \|B - A\|}.$$

6.3. Simplified Newton's Method on Riemannian Manifolds (m = ∞)
Next we will prove the semilocal convergence of the simplified Newton's method on Riemannian manifolds (fixing $DX(p_0)^{-1}$ in each iteration). Our main result is:

Theorem 6.3.1. Let $M$ be a Riemannian manifold, $\Omega \subseteq M$ be an open convex set, $X \in \chi(M)$, and $DX \in \mathrm{Lip}_l(\Omega)$. Suppose that for some $p_0 \in \Omega$, $DX(p_0)$ is invertible and that for some $a > 0$ and $b \ge 0$:

(1) $\left\| DX(p_0)^{-1} \right\| \le a$,
(2) $\left\| DX(p_0)^{-1} X(p_0) \right\| \le b$,
(3) $h = a\,b\,l \le \frac{1}{2}$,
(4) $B(p_0, t_*) \subseteq \Omega$ where $t_* = \frac{1}{a\,l}\left(1 - \sqrt{1 - 2h}\right)$.

If
$$v_k = -P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(p_k), \qquad p_{k+1} = \exp_{p_k}(v_k), \tag{6.3.1}$$
where $\{\sigma_k : [0, 1] \longrightarrow M\}_{k\in\mathbb{N}}$ is the minimizing geodesic family connecting $p_0$, $p_k$, then $\{p_k\}_{k\in\mathbb{N}} \subseteq B(p_0, t_*)$ and $p_k \longrightarrow p_*$, which is the unique singularity of $X$ in $B[p_0, t_*]$. Furthermore, if $h < \frac{1}{2}$ and $B(p_0, r) \subseteq \Omega$ with
$$t_* < r \le t_{**} = \frac{1}{a\,l}\left(1 + \sqrt{1 - 2h}\right),$$
then $p_*$ is also the unique singularity of $X$ in $B(p_0, r)$. The error bound is:
$$d(p_k, p_*) \le \frac{b}{h}\left(1 - \sqrt{1 - 2h}\right)^{k+1}, \qquad k = 1, 2, \ldots \tag{6.3.2}$$

First, we establish some results that are of primary relevance in this proof.

Lemma 6.3.2. Let $M$ be a Riemannian manifold, $\Omega \subseteq M$ an open convex set, $X \in \chi(M)$ and $DX \in \mathrm{Lip}_l(\Omega)$. Take $p \in B(p_0, r) \subseteq \Omega$, $v \in T_pM$, let $\sigma : [0, 1] \longrightarrow M$ be a minimizing geodesic connecting $p_0$, $p$ and
$$\gamma(t) = \exp_p(tv).$$
Then
$$P_{\gamma,t,0}\, X(\gamma(t)) = X(p) + P_{\sigma,0,1}\, t\, DX(p_0)\, P_{\sigma,1,0}\, v + R(t)$$
with
$$\|R(t)\| \le l\left( \frac{t}{2}\|v\| + d(p_0, p) \right) t\, \|v\|.$$
2
Proof. From Theorem 6.2.9, it follows that
$$P_{\gamma,t,0}\, X(\gamma(t)) - X(\gamma(0)) = \int_0^t P_{\gamma,s,0}\left( DX(\gamma(s))\, \gamma'(s) \right) ds.$$
Since $\gamma$ is a minimizing geodesic, $\gamma'(t)$ is parallel and $\gamma'(s) = P_{\gamma,0,s}\,\gamma'(0)$. Moreover, $\gamma'(0) = v$, so
$$P_{\gamma,t,0}\, X(\gamma(t)) - X(p) = \int_0^t P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s}\, v\, ds.$$
Thus, letting
$$R(t) = \int_0^t \left( P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s}\, v - P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, v \right) ds,$$
and since $DX \in \mathrm{Lip}_l(\Omega)$, we obtain
$$\begin{aligned} \|R(t)\| &\le \int_0^t \left( \left\| P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s} - DX(p) \right\| + \left\| DX(p) - P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0} \right\| \right) \|v\|\, ds \\ &\le l \int_0^t \left( \int_0^s \|\gamma'(\tau)\|\, d\tau + d(p_0, p) \right) \|v\|\, ds = l \int_0^t \left( s\, \|\gamma'(0)\| + d(p_0, p) \right) \|v\|\, ds \\ &= l\left( \frac{t^2}{2}\|v\| + t\, d(p_0, p) \right) \|v\|. \end{aligned}$$
Therefore,
$$\|R(t)\| \le l\left( \frac{t}{2}\|v\| + d(p_0, p) \right) t\, \|v\|.$$

Corollary 6.3.3. Let $M$ be a Riemannian manifold, $\Omega \subseteq M$ be an open convex set, $X \in \chi(M)$, $DX \in \mathrm{Lip}_l(\Omega)$. Take $p \in \Omega$, $v \in T_pM$ and let
$$\gamma(t) = \exp_p(tv).$$
If $\gamma([0, t)) \subseteq \Omega$ and $P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, v = -X(p)$, then
$$\left\| P_{\gamma,1,0}\, X(\gamma(1)) \right\| \le l\left( \frac{1}{2}\|v\| + d(p_0, p) \right) \|v\|. \tag{6.3.3}$$

Now we can prove the simplified Kantorovich theorem on Riemannian manifolds. The proof will be divided into two parts. First, we will prove that the simplified Kantorovich method is well defined, i.e. $\{p_k\}_{k\in\mathbb{N}} \subseteq B(p_0, t_*)$; we will also prove the convergence of the method. In the second part, we will establish uniqueness.

• CONVERGENCE
We consider the auxiliary real function $f : \mathbb{R} \longrightarrow \mathbb{R}$ defined by
$$f(t) = \frac{l}{2}\,t^2 - \frac{1}{a}\,t + \frac{b}{a}. \tag{6.3.4}$$
Its discriminant is
$$\triangle = \frac{1}{a^2}\,(1 - 2\,l\,b\,a),$$
which is nonnegative because $a\,b\,l \le \frac{1}{2}$. Thus $f$ has at least one real root (unique when $h = \frac{1}{2}$). If $t_*$ is the smallest root, a direct calculation shows that $f'(t) < 0$ for $0 \le t < t_*$, so $f$ is strictly decreasing in $[0, t_*]$. Therefore the (scalar) simplified Newton's method can be applied to $f$; in other words, if $t_0 \in [0, t_*)$, for $k = 0, 1, 2, \ldots$ we define
$$t_{k+1} = t_k - \frac{f(t_k)}{f'(0)}.$$
Then $\{t_k\}_{k\in\mathbb{N}}$ is well defined, strictly increasing and converges to $t_*$. Furthermore, if $h = a\,b\,l < \frac{1}{2}$, then
$$t_* - t_k \le \frac{b}{h}\left(1 - \sqrt{1 - 2h}\right)^{k+1}, \quad k = 1, 2, \ldots \quad \text{(see [32])}. \tag{6.3.5}$$
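A short sketch of our own of this scalar majorizing iteration, with sample data $a = 1$, $b = 0.2$, $l = 2$ (so $h = 0.4 \le 1/2$), checking the bound (6.3.5):

```python
import math

a, b, l = 1.0, 0.2, 2.0
h = a * b * l
t_star = (1 - math.sqrt(1 - 2*h)) / (a * l)     # smallest root of f

def f(t):  return 0.5*l*t*t - t/a + b/a
fp0 = -1.0/a                                    # f'(0)

t = 0.0
for k in range(1, 8):
    t = t - f(t)/fp0                            # t_{k+1} = t_k - f(t_k)/f'(0)
    bound = (b/h) * (1 - math.sqrt(1 - 2*h))**(k+1)
    print(f"k = {k}: t_k = {t:.9f}, t* - t_k = {t_star - t:.3e} <= {bound:.3e}")
```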
Let us take as starting point $t_0 = 0$. We want to show that Newton's iterations are well defined for any $q \in B(p_0, t_*) \subseteq \Omega$. We define
$$K(t) = \left\{ q \in B(p_0, t) : \left\| P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) \right\| \le \frac{f(t)}{|f'(0)|} = a\,f(t) \right\}, \quad 0 \le t < t_*, \tag{6.3.6}$$
where $\sigma : [0, 1] \longrightarrow M$ is the minimizing geodesic connecting $p_0$ and $q$. Note that $K(t) \neq \emptyset$ since $p_0 \in K(t)$.

Proposition 6.3.4. Under the hypotheses of either the Kantorovich or the simplified Kantorovich method, if $q \in B(p_0, t_*)$, then $DX(q)$ is nonsingular and
$$\left\| DX(q)^{-1} \right\| \le \frac{1}{|f'(\lambda)|}, \quad \text{where } \lambda = d(p_0, q) < t_*.$$

Proof. Let $\lambda = d(p_0, q)$ and $\alpha : [0, 1] \longrightarrow M$ be a geodesic with $\alpha(0) = p_0$, $\alpha(1) = q$ and $\|\alpha'(0)\| = \lambda$. Define $\phi : T_qM \longrightarrow T_qM$ by letting
$$\phi = P_{\alpha,1,0}\, DX(p_0)\, P_{\alpha,0,1}. \tag{6.3.7}$$
Since $P_{\alpha,1,0}$ and $P_{\alpha,0,1}$ are linear, isometric and $DX(p_0)$ is nonsingular, we have that $\phi$ is linear, nonsingular and
$$\left\| \phi^{-1} \right\| = \left\| DX(p_0)^{-1} \right\| \le a = \frac{1}{|f'(0)|},$$

with α ([0, 1]) ⊆ B (p0 ,t∗ ). Since d (p0 , q) < t∗ , D X ∈ LipL (Ω) and kα0 (0)k = λ.
Therefore
kD X (q) − φk ≤ lλ. (6.3.8)

By (6.3.7) and (6.3.8), we have


−1
φ kD X (q) − φk ≤ alλ
≤ alt∗
1  √ 
= al 1− 1−2abl
al
≤ 1.

Using Banach’s lemma, we conclude that D X (q) is nonsingular, and


−1
φ
−1
D X (q) ≤ −1
1 − φ kD X (q) − φk
a

1−al λ
1
≤ 0 .
| f (λ)|

Therefore, for any q ∈ B (p0 ,t∗ ), we can apply the Kantorovich methods.

Lemma 6.3.5. Let $q \in K(t)$ and define
$$t_+ = t + \frac{f(t)}{|f'(0)|} = t + a\,f(t), \qquad q_+ = \exp_q\left( -P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) \right).$$
Then $t < t_+ < t_*$ and $q_+ \in K(t_+)$.

Proof. Consider the geodesic $\gamma : [0, 1] \longrightarrow M$ defined by
$$\gamma(\theta) = \exp_q\left( -\theta\, P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) \right).$$
We have
$$d(p_0, \gamma(\theta)) \le d(p_0, q) + d(q, \gamma(\theta)) \le t + \theta\left\| P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) \right\| \le t + \theta\,\frac{f(t)}{|f'(0)|}.$$
Since
$$\gamma(1) = \exp_q\left( -P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) \right) = q_+,$$

this implies that

f (t)
d (p0 , q+) = d (p0 , γ(1)) ≤ t + = t+ ,
| f 0 (0)|
therefore
q+ ∈ B (p0 ,t+) ⊂ B (p0 ,t∗ ) .
Moreover, if $\sigma_+ : [0, 1] \longrightarrow M$ is the minimizing geodesic connecting $p_0$ and $q_+$, then
$$\left\| -P_{\sigma_+,0,1}\, DX(p_0)^{-1}\, P_{\sigma_+,1,0}\, X(q_+) \right\| \le \left\| DX(p_0)^{-1} \right\| \|X(q_+)\|.$$
Furthermore, if $v = -P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q)$, then
$$P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, v = P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\left( -P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) \right) = -P_{\sigma,0,1}\, DX(p_0)\, DX(p_0)^{-1}\, P_{\sigma,1,0}\, X(q) = -X(q).$$

By Corollary 6.3.3,
$$\|X(q_+)\| = \|X(\gamma(1))\| \le l\left( \frac{1}{2}\|v\| + d(p_0, q) \right)\|v\| \le l\left( \frac{1}{2}\,\frac{f(t)}{|f'(0)|} + t \right)\frac{f(t)}{|f'(0)|}.$$
Thus, by (6.3.6), after some calculations,
$$\left\| DX(p_0)^{-1} \right\| \|X(q_+)\| \le \frac{1}{|f'(0)|}\, l\left( \frac{1}{2}\,\frac{f(t)}{|f'(0)|} + t \right)\frac{f(t)}{|f'(0)|} = \frac{a\,l}{8}\left( 2b - 2t + a\,l\,t^2 \right)\left( 2b + 2t + a\,l\,t^2 \right) = \frac{f(t_+)}{|f'(0)|}.$$
We thus conclude
$$\left\| P_{\sigma_+,0,1}\, DX(p_0)^{-1}\, P_{\sigma_+,1,0}\, X(q_+) \right\| \le \frac{f(t_+)}{|f'(0)|} = a\,f(t_+),$$
and therefore
$$q_+ \in K(t_+).$$

Now we are going to prove that starting from any point of K (t) the simplified Newton
method converges.

Corollary 6.3.6. Take $0 \le t < t_*$ and $q \in K(t)$, and define
$$\tau_0 = t, \qquad \tau_{k+1} = \tau_k - \frac{f(\tau_k)}{f'(0)} \quad \text{for each } k = 0, 1, \cdots$$
Then the sequence generated by the simplified Newton method starting with the point $q_0 = q$ is well defined for any $k$ and
$$q_k \in K(\tau_k). \tag{6.3.9}$$
Moreover $\{q_k\}_{k\in\mathbb{N}}$ converges to some $q_* \in B(p_0, t_*)$, $X(q_*) = 0$ and
$$d(q_k, q_*) \le t_* - \tau_k \quad \text{for each } k = 0, 1, \cdots.$$

Proof. It is clear that the sequence $\{\tau_k\}_{k\in\mathbb{N}}$ is the sequence generated by the simplified Newton method for solving $f(t) = 0$. Therefore, $\{\tau_k\}_{k\in\mathbb{N}}$ is well defined, strictly increasing and converges to the root $t_*$ (see the definition of $f$). By hypothesis, $q_0 \in K(\tau_0)$; suppose that the points $q_0, q_1, \ldots, q_k$ are well defined. Then, using Banach's Lemma, we conclude that $q_{k+1}$ is well defined. Furthermore,
$$d(q_{k+1}, q_k) \le \left\| -P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(q_k) \right\|.$$
Since
$$q_{k+1} = \exp_{q_k}\left( -P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(q_k) \right)$$
and $\sigma_k : [0, 1] \longrightarrow M$ is the minimizing geodesic connecting $p_0$, $q_k$, from Lemma 6.3.5 and using (6.3.9) we obtain
$$d(q_{k+1}, q_k) \le \frac{f(\tau_k)}{|f'(0)|} = \tau_{k+1} - \tau_k. \tag{6.3.10}$$
Hence, for $k \ge s$, $s \in \mathbb{N}$,
$$d(q_k, q_s) \le \tau_k - \tau_s. \tag{6.3.11}$$
It follows that $\{q_k\}_{k\in\mathbb{N}}$ is a Cauchy sequence. Since $M$ is complete, it converges to some $q_* \in M$. Moreover $q_k \in K(\tau_k) \subseteq B[p_0, t_*]$, therefore $q_* \in B[p_0, t_*]$.
Next, we prove that $X(q_*) = 0$. We have
$$\begin{aligned} \|X(q_k)\| &= \left\| P_{\sigma_k,0,1}\, DX(p_0)\, P_{\sigma_k,1,0}\, P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(q_k) \right\| \\ &\le \|DX(p_0)\| \left\| DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(q_k) \right\| \le \|DX(p_0)\|\, \frac{f(\tau_k)}{|f'(0)|} = \|DX(p_0)\|\, (\tau_{k+1} - \tau_k). \end{aligned}$$
Passing to the limit in $k$, we conclude $X(q_*) = 0$. Finally, letting $k \longrightarrow \infty$ in (6.3.11), we get
$$d(q_*, q_s) \le t_* - \tau_s.$$

• Finally, from (6.3.5),
$$d(q_*, q_k) \le \frac{b}{h}\left(1 - \sqrt{1 - 2h}\right)^{k+1}, \quad k = 1, 2, \ldots$$
By hypothesis, $p_0 \in K(0)$; thus by Lemma 6.3.5, the sequence $\{p_k\}_{k\in\mathbb{N}}$ generated by (6.3.1) is well defined, contained in $B(p_0, t_*)$ and converges to some $p_*$, which is a singular point of $X$ in $B[p_0, t_*]$. Moreover, if $h < 1/2$, then
$$d(p_k, p_*) \le \frac{b}{h}\left(1 - \sqrt{1 - 2h}\right)^{k+1}.$$

• UNIQUENESS
This proof will be made in an indirect way, by contradiction. But first, we are going to establish some results.

Lemma 6.3.7. Take $0 \le t < t_*$ and $q \in K(t)$, and let
$$A^{-1} = -P_{\sigma,0,1}\, DX(p_0)^{-1}\, P_{\sigma,1,0}, \qquad v = A^{-1} X(q),$$
where $\sigma : [0, 1] \longrightarrow M$ is a minimizing geodesic connecting $p_0$, $q$. Define for $\theta \in \mathbb{R}$
$$\tau(\theta) = t + \theta\, a\, f(t), \qquad \gamma(\theta) = \exp_q(\theta v).$$
Then, for $\theta \in [0, 1]$,
$$t \le \tau(\theta) \le t_* \quad \text{and} \quad \gamma(\theta) \in K(\tau(\theta)).$$

Proof. Because $\gamma$ is a minimizing geodesic, for all $\theta \in [0, 1]$ we have
$$d(p_0, \gamma(\theta)) \le d(p_0, q) + d(q, \gamma(\theta)) \le t + \theta\,\|v\| \le t + \theta\, a\, f(t) = \tau(\theta).$$
This implies that
$$t \le \tau(\theta) \le \tau(1) \le t_* \quad \text{and} \quad \gamma([0, \theta]) \subset B(p_0, t_*). \tag{6.3.12}$$

Using Lemma 6.3.2 and the identity $P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, v = -X(q)$, we obtain
$$P_{\gamma,\theta,0}\, X(\gamma(\theta)) = (1 - \theta)\, X(q) + R(\theta),$$
with
$$R(\theta) = \int_0^\theta \left( P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s}\, v - P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, v \right) ds, \qquad \|R(\theta)\| \le l\left( \frac{\theta}{2}\|v\| + d(p_0, q) \right)\theta\,\|v\|.$$
This yields
$$\left\| A^{-1} X(\gamma(\theta)) \right\| \le (1 - \theta)\left\| A^{-1} X(q) \right\| + \left\| A^{-1} \right\| \|R(\theta)\| \le (1 - \theta)\, a\, f(t) + a\, l\left( \frac{\theta}{2}\, a\, f(t) + t \right)\theta\, a\, f(t) = a\, f(\tau(\theta)).$$
Therefore
$$\gamma(\theta) \in K(\tau(\theta)),$$
and the lemma is proved.

Lemma 6.3.8. Let $0 \le t < t_*$ and $q \in K(t)$. Suppose that $q_* \in B[p_0, t_*]$ is a singularity of the vector field $X$ and
$$t + d(q, q_*) = t_*.$$
Then
$$d(p_0, q) = t.$$
Moreover, letting
$$t_+ = t + a\, f(t), \qquad q_+ = \exp_q\left( A^{-1} X(q) \right),$$
then $t < t_+ < t_*$, $q_+ \in K(t_+)$ and
$$t_+ + d(q_+, q_*) = t_*.$$

Proof. Consider the minimizing geodesic $\alpha : [0, 1] \longrightarrow M$ joining $q$ to $q_*$. Since $q \in K(t)$, we have
$$d(p_0, \alpha(\theta)) \le d(p_0, q) + d(q, \alpha(\theta)) \le t + \theta\, d(q, q_*) \le t + d(q, q_*) = t_*.$$
It follows that $\alpha([0, 1]) \subset B(p_0, t_*)$. Taking $u = \alpha'(0)$, by Lemma 6.3.2 we have
$$P_{\alpha,1,0}\, X(\alpha(1)) = X(q) + P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, u + R(1),$$


with
$$\|R(1)\| \le l\left( \frac{1}{2}\|u\| + d(p_0, q) \right)\|u\|.$$
Therefore
$$\|R(1)\| \le l\left( \frac{1}{2}\, d(q, q_*) + d(p_0, q) \right) d(q, q_*) \tag{6.3.13}$$
$$= l\left( \frac{1}{2}(t_* - t) + d(p_0, q) \right)(t_* - t) \le l\left( \frac{1}{2}(t_* - t) + t \right)(t_* - t) = \frac{1}{2}\, l\, (t_* + t)(t_* - t). \tag{6.3.14}$$
On the other hand, since $|f(t)|$ is strictly decreasing in $[0, t_*]$ and $0 \le d(p_0, q) \le t < t_*$,
$$\begin{aligned} \|R(1)\| &= \|X(q) - Au\| \ge \frac{1}{\left\| DX(p_0)^{-1} \right\|}\left\| u - A^{-1} X(q) \right\| \ge |f'(0)|\left\| u - A^{-1} X(q) \right\| \\ &\ge |f'(0)|\left( \|u\| - \left\| A^{-1} X(q) \right\| \right) \ge |f'(0)|\,\big( \|u\| - a\, f(t) \big) = -f'(0)(t_* - t) - f(t) > 0. \end{aligned} \tag{6.3.15}$$

Because
$$f''(t) = l, \qquad 0 = f(t_*) = f(t) + f'(t)(t_* - t) + \frac{1}{2}\, f''(t)(t_* - t)^2,$$
and
$$f'(t) = f'(0) + \int_0^t f''(s)\, ds,$$
therefore
$$0 = f(t) + \left( f'(0) + t\, l \right)(t_* - t) + \frac{1}{2}\, l\, (t_* - t)^2,$$
hence
$$\frac{1}{2}\, l\, (t_* + t)(t_* - t) = -f'(0)(t_* - t) - f(t).$$
Thus, the last term in (6.3.14) is equal to the last term in inequality (6.3.15); we conclude that all the inequalities in (6.3.13)--(6.3.15) are equalities. In particular,
$$\frac{1}{\left\| DX(p_0)^{-1} \right\|} = |f'(0)| = \frac{1}{a}, \qquad \left\| u - A^{-1} X(q) \right\| = \|u\| - \left\| A^{-1} X(q) \right\| > 0, \qquad \left\| A^{-1} X(q) \right\| = a\, f(t), \tag{6.3.16}$$
$$l\left( \frac{1}{2}(t_* - t) + d(p_0, q) \right)(t_* - t) = l\left( \frac{1}{2}(t_* - t) + t \right)(t_* - t).$$

From the last equation in (6.3.16), we obtain
$$d(p_0, q) = t;$$
the second equation in (6.3.16) implies that $u$ and $A^{-1}X(q)$ are linearly dependent vectors in $T_qM$, so there exists $r \in \mathbb{R}$ such that
$$A^{-1} X(q) = r\, u.$$
Thus, the second equation implies
$$1 - |r| = |1 - r|,$$
and because $r \neq 0$ and $r \neq 1$, we have $0 < r < 1$; thus
$$q_+ = \exp_q(r\, u) = \alpha(r).$$
Moreover, given that $\alpha$ is a minimizing geodesic joining $q$ to $q_*$, we have that $q$, $\alpha(r)$ and $q_*$ lie on the same geodesic line, thus
$$d(q, \alpha(r)) + d(\alpha(r), q_*) = d(q, q_*),$$
therefore,
$$d(q, q_+) + d(q_+, q_*) = d(q, q_*).$$
Moreover,
$$d(q, q_+) = \|r\, u\| = \left\| A^{-1} X(q) \right\| = a\, f(t) = t_+ - t,$$
hence
$$d(q_+, q_*) = d(q, q_*) - d(q, q_+) = (t_* - t) - (t_+ - t) = t_* - t_+,$$
that is
$$d(q_+, q_*) + t_+ = t_*.$$

Corollary 6.3.9. Suppose that q∗ ∈ B [p0 ,t∗ ] is a zero of the vector field X. If there exist t˜
and q̃ such that
0 ≤ t˜ < t∗ , q̃ ∈ K (t˜) and t˜ + d (q̃, q∗ ) = t∗ ,
then
d (p0 , q∗ ) = t∗ .

Proof. Replacing $\tau_0$ by $\tilde{t}$ and $q_0$ by $\tilde{q}$ in Corollary 6.3.6, we obtain that
$$q_k \in K(\tau_k) \quad \text{for all } k \in \mathbb{N},$$
$\{\tau_k\}_{k\in\mathbb{N}}$ converges to $t_*$, $\{q_k\}_{k\in\mathbb{N}}$ converges to some $\tilde{q}_* \in B(p_0, t_*)$, and $X(\tilde{q}_*) = 0$. Moreover, by Lemma 6.3.8 and applying induction, it is easy to show that for all $k$,
$$d(p_0, q_k) = \tau_k \quad \text{and} \quad d(q_k, q_*) = t_* - \tau_k.$$
Passing to the limit, we obtain
$$d(p_0, \tilde{q}_*) = t_* \quad \text{and} \quad d(\tilde{q}_*, q_*) = 0.$$
Therefore $\tilde{q}_* = q_*$ and
$$d(p_0, q_*) = t_*.$$

The following two lemmas complete the proof of uniqueness.

Lemma 6.3.10. The limit p∗ of the sequence {pk }k∈N is the unique singularity of X in
B [p0 ,t∗ ] .

Proof. Let $q_* \in B[p_0, t_*]$ be a singularity of the vector field $X$. Using induction, we will show that
$$d(p_k, q_*) + t_k \le t_*.$$
We need to consider two cases:

Case 1. (d (p0 , q∗ ) < t∗ ). First we show by induction that for all k ∈ N,

d (pk , q∗ ) + tk < t∗ . (6.3.17)

Indeed, for k = 0 (6.3.17) is immediately true, because t0 = 0. Now, suppose the property
is true for some k. Let us take the geodesic

$$\gamma_k(\theta) = \exp_{p_k}(\theta\, v_k),$$

where vk is defined in (6.3.1). From Lemma 6.3.7, for all θ ∈ [0, 1],

γk (θ) ∈ K (tk + θ (tk+1 − tk )) . (6.3.18)

Define $\phi : [0, 1] \longrightarrow \mathbb{R}$ by
$$\phi(\theta) = d(\gamma_k(\theta), q_*) + t_k + \theta\,(t_{k+1} - t_k). \tag{6.3.19}$$

We know that
φ (0) = d (pk , q∗ ) + tk < t∗ .
We next show, by contradiction, that $\phi(\theta) \neq t_*$ for all $\theta \in [0, 1]$. Suppose that there exists $\tilde{\theta} \in [0, 1]$ such that $\phi(\tilde{\theta}) = t_*$, and let $\tilde{q} = \gamma_k(\tilde{\theta})$ and $\tilde{t} = t_k + \tilde{\theta}\,(t_{k+1} - t_k)$. By (6.3.18) and (6.3.19),
$$\tilde{q} \in K(\tilde{t}) \quad \text{and} \quad d(\tilde{q}, q_*) + \tilde{t} = t_*.$$

Applying Corollary 6.3.9, we conclude that

d (p0 , q∗ ) = t∗ ,

which contradicts our assumption. Thus $\phi(\theta) \neq t_*$ for all $\theta \in [0, 1]$. Since $\phi(0) < t_*$ and $\phi$ is continuous, we have that $\phi(\theta) < t_*$ for all $\theta \in [0, 1]$. In particular, by (6.3.19),
$$d(\gamma_k(1), q_*) + t_{k+1} = \phi(1) < t_*.$$
Thus,
$$d(p_{k+1}, q_*) + t_{k+1} < t_*;$$
in this way (6.3.17) is true for all $k \in \mathbb{N}$.

Case 2. (d (p0 , q∗ ) = t∗ ). Using induction, let us prove that for all k ∈ N,

d (pk , q∗ ) + tk = t∗ . (6.3.20)
Indeed, for k = 0, this is immediately true, because t0 = 0. Now, suppose that (6.3.20) is
true for some k. Since pk ∈ K (tk ), by Lemma 6.3.8 we conclude that

d (pk+1 , q∗ ) + tk+1 = t∗ .

Finally, by (6.3.17) and (6.3.20) we conclude that for all k ∈ N,

d (pk , q∗ ) + tk ≤ t∗ ,

and passing to the limit k −→ ∞, we obtain d (p∗ , q∗ ) = 0, and therefore

p∗ = q∗ .

Lemma 6.3.11. If $h = a\,b\,l < \frac{1}{2}$ and $B(p_0, r) \subseteq \Omega$ with
$$t_* < r \le t_{**} = \frac{1}{a\,l}\left(1 + \sqrt{1 - 2h}\right),$$
then the limit $p_*$ of the sequence $\{p_k\}_{k\in\mathbb{N}}$ is the unique singularity of the vector field $X$ in $B(p_0, r)$.
Proof. Let $q_* \in B(p_0, r)$ be a singularity of the vector field $X$ in $B(p_0, r)$. Let us consider the minimizing geodesic $\alpha : [0, 1] \longrightarrow M$ joining $p_0$ to $q_*$. By Lemma 6.3.2,
$$P_{\alpha,1,0}\, X(\alpha(1)) = X(p_0) + P_{\sigma,0,1}\, DX(p_0)\, P_{\sigma,1,0}\, u + R(1),$$
where
$$\|R(1)\| \le l\left( \frac{1}{2}\|u\| + d(p_0, p_0) \right)\|u\| = \frac{l}{2}\, d(p_0, q_*)^2 \quad \text{and} \quad \|u\| = d(p_0, q_*). \tag{6.3.21}$$
In a similar way to inequality (6.3.15), it is easy to prove that
$$\|R(1)\| \ge \frac{1}{a}\left( \|u\| - \left\| DX(p_0)^{-1} X(p_0) \right\| \right) \ge \frac{1}{a}\, d(p_0, q_*) - \frac{b}{a}.$$

Therefore
$$\frac{l}{2}\, d(p_0, q_*)^2 \ge \frac{1}{a}\, d(p_0, q_*) - \frac{b}{a},$$
hence
$$f\big( d(p_0, q_*) \big) \ge 0;$$
since $d(p_0, q_*) \le r \le t_{**}$, then
$$d(p_0, q_*) \le t_*.$$
Finally, from Lemma 6.3.10,
$$p_* = q_*.$$

6.4. Order of Convergence of Newton-Type Methods


The analysis of the order of convergence is performed in a local way, that is, in a neighborhood of the zero of the vector field. We can then define the order of convergence on Riemannian manifolds in the following way:

Definition 6.4.1. Let $M$ be a manifold and let $\{p_k\}_{k\in\mathbb{N}}$ be a sequence on $M$ converging to $p_*$. If there exists a system of coordinates $(U, x)$ of $M$ with $p_* \in U$, constants $p > 0$, $c \ge 0$ and $K \ge 0$ such that, for all $k \ge K$, $\{p_k\}_{k=K}^\infty \subseteq U$ and the following inequality holds:
$$\left\| x^{-1}(p_{k+1}) - x^{-1}(p_*) \right\| \le c\, \left\| x^{-1}(p_k) - x^{-1}(p_*) \right\|^p, \tag{6.4.1}$$
then we say that $\{p_k\}_{k\in\mathbb{N}}$ converges to $p_*$ with order at least $p$.

It can be shown that the definition above does not depend on the choice of the coordinate system; the multiplicative constant $c$ depends on the chart, but for any chart such a constant exists (see [1]).
Notice that in normal coordinates at $p_k$,
$$\left\| \exp_{p_k}^{-1}(p) - \exp_{p_k}^{-1}(q) \right\| = d(p, q);$$
thus, in normal coordinates, (6.4.1) is transformed into
$$d(p_{k+1}, p_*) \le c\, d(p_k, p_*)^p.$$

Lemma 6.4.2. Let $M$ be a Riemannian manifold, $\Omega \subseteq M$ be an open set, $X \in \chi(M)$ and $DX \in \mathrm{Lip}_l(\Omega)$. Let us take $p \in \Omega$, $v \in T_pM$ and
$$\gamma(t) = \exp_p(tv).$$
If $\gamma([0, t)) \subseteq \Omega$, then
$$P_{\gamma,t,0}\, X(\gamma(t)) = X(p) + t\, DX(p)\, v + R(t)$$
with
$$\|R(t)\| \le \frac{l}{2}\, t^2\, \|v\|^2.$$

Proof. From Theorem 6.2.9, it follows that
$$P_{\gamma,t,0}\, X(\gamma(t)) - X(\gamma(0)) = \int_0^t P_{\gamma,s,0}\left( DX(\gamma(s))\, \gamma'(s) \right) ds.$$
Given that $\gamma$ is a geodesic, $\gamma'(t)$ is parallel and $\gamma'(s) = P_{\gamma,0,s}\,\gamma'(0)$. Moreover, since $\gamma'(0) = v$,
$$P_{\gamma,t,0}\, X(\gamma(t)) - X(p) = \int_0^t P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s}\, v\, ds.$$
Therefore, letting
$$R(t) = \int_0^t \left( P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s}\, v - DX(p)\, v \right) ds,$$
we have $P_{\gamma,t,0}\, X(\gamma(t)) - X(p) - t\, DX(p)\, v = R(t)$. By hypothesis, $DX \in \mathrm{Lip}_l(\Omega)$, hence
$$\|R(t)\| \le \int_0^t \left\| P_{\gamma,s,0}\, DX(\gamma(s))\, P_{\gamma,0,s} - DX(p) \right\| \|v\|\, ds \le l \int_0^t \left( \int_0^s \|\gamma'(\tau)\|\, d\tau \right) \|v\|\, ds.$$
Since $\gamma$ is a geodesic, $\|\gamma'(\tau)\|$ is constant. Therefore,
$$\|\gamma'(\tau)\| = \|\gamma'(0)\| = \|v\|,$$
thus
$$\|R(t)\| \le l \int_0^t \left( \int_0^s \|v\|\, d\tau \right) \|v\|\, ds = l\,\|v\|^2 \int_0^t s\, ds = \frac{l}{2}\, t^2\, \|v\|^2.$$

Lemma 6.4.3. (Order of convergence)

i) The convergence order of Newton's method on Riemannian manifolds is two (quadratic convergence).

ii) The convergence order of the simplified Newton method on Riemannian manifolds is one (linear convergence).

Proof. Let $k$ be sufficiently large so that $p_*$, $p_k$, $p_{k+1}$, $\ldots$ belong to a normal neighborhood $U$ of $p_k$. Let us consider the geodesic $\gamma_k$ joining $p_k$ to $p_*$, defined by
$$\gamma_k(t) = \exp_{p_k}(t\, u_k),$$
where $u_k \in T_{p_k}M$ and $d(p_k, p_*) = \|u_k\|$. We know that if $p$, $q$ are in one normal neighborhood $U$ of $p_k$, then
$$\left\| \exp_{p_k}^{-1}(p) - \exp_{p_k}^{-1}(q) \right\| = d(p, q).$$

i) By Lemma 6.4.2,
$$P_{\gamma,1,0}\, X(p_*) = X(p_k) + DX(p_k)\, u_k + R(1),$$
with
$$\|R(1)\| \le \frac{l}{2}\,\|u_k\|^2 \quad \text{and} \quad \|u_k\| = d(p_k, p_*).$$
Hence,
$$0 = DX(p_k)^{-1} X(p_k) + u_k + DX(p_k)^{-1} R(1).$$
Since
$$-DX(p_k)^{-1} X(p_k) = \exp_{p_k}^{-1}(p_{k+1}) \quad \text{and} \quad u_k = \exp_{p_k}^{-1}(p_*),$$
we have
$$\exp_{p_k}^{-1}(p_{k+1}) - \exp_{p_k}^{-1}(p_*) = DX(p_k)^{-1} R(1),$$
thus
$$d(p_{k+1}, p_*) \le \left\| DX(p_k)^{-1} \right\| \frac{l}{2}\,\|u_k\|^2.$$
Moreover, by Banach's Lemma,
$$\left\| DX(p_k)^{-1} \right\| \le \frac{a}{1 - a\,l\,d(p_k, p_0)} \le \frac{a}{1 - a\,l\,\tau_k} \le \frac{a}{1 - a\,l\,t_*} = \frac{a}{\sqrt{1 - 2\,a\,b\,l}}.$$
Therefore
$$d(p_{k+1}, p_*) \le C\, d(p_k, p_*)^2, \quad \text{with} \quad C = \frac{l\,a}{2\sqrt{1 - 2\,a\,b\,l}}.$$
ii) Let $p_0$ be sufficiently near $p_*$ so that $p_0$ is in the normal neighborhood $U$ of $p_k$. By Lemma 6.3.2, if $\sigma_k : [0, 1] \longrightarrow M$ is the minimizing geodesic connecting $p_0$, $p_k$, then
$$P_{\gamma,1,0}\, X(p_*) = X(p_k) + P_{\sigma_k,0,1}\, DX(p_0)\, P_{\sigma_k,1,0}\, u_k + R(1),$$
with
$$\|R(1)\| \le l\left( \frac{1}{2}\|u_k\| + d(p_0, p_k) \right)\|u_k\|.$$
Therefore
$$0 = P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(p_k) + u_k + P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, R(1).$$
Since
$$-P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, X(p_k) = \exp_{p_k}^{-1}(p_{k+1}) \quad \text{and} \quad u_k = \exp_{p_k}^{-1}(p_*),$$
we have
$$\exp_{p_k}^{-1}(p_{k+1}) - \exp_{p_k}^{-1}(p_*) = P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, R(1).$$
We thus conclude that
$$\begin{aligned} d(p_{k+1}, p_*) &= \left\| \exp_{p_k}^{-1}(p_{k+1}) - \exp_{p_k}^{-1}(p_*) \right\| = \left\| P_{\sigma_k,0,1}\, DX(p_0)^{-1}\, P_{\sigma_k,1,0}\, R(1) \right\| \le \left\| DX(p_0)^{-1} \right\| \|R(1)\| \\ &\le a\,l\left( \frac{1}{2}\|u_k\| + d(p_0, p_k) \right)\|u_k\| = a\,l\left( \frac{1}{2}\, d(p_k, p_*) + d(p_0, p_k) \right) d(p_k, p_*) \\ &= a\,l\left( \frac{1}{2}\,\frac{d(p_k, p_*)}{d(p_0, p_k)} + 1 \right) d(p_0, p_k)\, d(p_k, p_*). \end{aligned}$$
If $k$ is sufficiently large, then $d(p_k, p_*) \le d(p_0, p_k)$, and therefore
$$\frac{1}{2}\,\frac{d(p_k, p_*)}{d(p_0, p_k)} + 1 \le \frac{3}{2};$$
then, for $p_0$ sufficiently close to $p_*$,
$$d(p_{k+1}, p_*) \le K_0\, d(p_0, p_k)\, d(p_k, p_*), \quad \text{with} \quad K_0 \le \frac{3\,a\,l}{2}.$$

Remark 6.4.4. Note that if, instead of the point p_0, we fix in the simplified (Kantorovich) method a point p_j sufficiently close to p∗, we obtain a new convergent method. Indeed, the calculations made in the previous lemma become
$$d(p_{k+1}, p_*) \le K_j\, d(p_j, p_*)\, d(p_k, p_*), \quad \text{with} \quad K_j \le \frac{3\, a\, l}{2}.$$
Thus,
$$d(p_{k+1}, p_*) \le K\, d(p_j, p_*)\, d(p_k, p_*), \tag{6.4.2}$$
with $K \le \frac{3\, a\, l}{2}$.

6.5. One Family of High Order Newton-Type Methods


We recall the Shamanskii family of iterative methods. Given an integer m and an initial point x_0 in a Banach space, we move from x_n to x_{n+1} through an intermediate sequence $\{y_n^i\}_{i=0}^m$, with $y_n^0 = x_n$, which is a generalization of the Newton (m = 1) and simplified Newton (m = ∞) methods:
$$\begin{cases} y_n^1 = y_n^0 - DF(y_n^0)^{-1} F(y_n^0) \\ y_n^2 = y_n^1 - DF(y_n^0)^{-1} F(y_n^1) \\ \quad \vdots \\ y_n^m = x_{n+1} = y_n^{m-1} - DF(y_n^0)^{-1} F(y_n^{m-1}). \end{cases}$$
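The following minimal Python sketch illustrates this family in the finite-dimensional case X = ℝⁿ, where DF is the Jacobian; the test system and all names are illustrative and not part of the original text (a practical code would also reuse an LU factorization of the frozen Jacobian instead of re-solving).

```python
import numpy as np

def shamanskii(F, DF, x0, m=2, tol=1e-12, max_outer=50):
    """One outer sweep freezes DF at y_n^0 = x_n and reuses it for m inner
    corrector steps; m = 1 is Newton, large m approaches simplified Newton."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_outer):
        J = DF(x)                        # Jacobian frozen at the outer iterate
        y = x.copy()
        for _ in range(m):               # y^{i+1} = y^i - DF(y^0)^{-1} F(y^i)
            y = y - np.linalg.solve(J, F(y))
        if np.linalg.norm(y - x) < tol:
            return y
        x = y
    return x

# Illustrative test system with root (1, 2): F(x) = (x1^2 + x2 - 3, x1 + x2^2 - 5).
F  = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])
DF = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
print(shamanskii(F, DF, [1.0, 1.0], m=3))   # -> approximately [1. 2.]
```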

For a problem on Riemannian manifolds, let us consider the family
$$\begin{cases} q_n^1 = \exp_{p_n}\!\big( -DX(p_n)^{-1} X(p_n) \big) \\ q_n^2 = \exp_{q_n^1}\!\big( -P_{\sigma_1,0,1}\, DX(p_n)^{-1}\, P_{\sigma_1,1,0}\, X(q_n^1) \big) \\ \quad \vdots \\ q_n^m = p_{n+1} = \exp_{q_n^{m-1}}\!\big( -P_{\sigma_{m-1},0,1}\, DX(p_n)^{-1}\, P_{\sigma_{m-1},1,0}\, X(q_n^{m-1}) \big), \end{cases} \tag{6.5.1}$$
where σ_k : [0, 1] → M is the minimizing geodesic joining the points p_n and $q_n^k$, k = 1, 2, …, m − 1; thus
$$\sigma_k(0) = p_n \quad \text{and} \quad \sigma_k(1) = q_n^k.$$
Theorem 6.5.1. Under the hypotheses of Kantorovich's theorem, the method described in (6.5.1) converges with order of convergence m + 1.
Proof. Let us observe that
$$d(p_{n+1}, p_n) \le d\big(p_{n+1}, q_n^{m-1}\big) + d\big(q_n^{m-1}, q_n^{m-2}\big) + \cdots + d\big(q_n^2, q_n^1\big) + d\big(q_n^1, p_n\big).$$
Now, setting $p_{n+1} = q_n^m$ and $q_n^0 = p_n$, and looking at each step as a different method according to (6.5.1), the Kantorovich theorem applied to the first step and the simplified Kantorovich theorem applied to the following steps show that each of the sequences $\{q_n^m\}_{m\in\mathbb{N}}$, for fixed n, converges to the same point p∗ ∈ M. Therefore, $\{p_n\}_{n\in\mathbb{N}}$ converges to p∗. Moreover, by Lemma 6.4.3 i) and (6.4.2),
$$d(p_{n+1}, p_*) \le K\, d(p_n, p_*)\, d\big(q_n^{m-1}, p_*\big) \le K\, d(p_n, p_*)\, K\, d(p_n, p_*)\, d\big(q_n^{m-2}, p_*\big) \le \cdots \le K^{m-1}\, d(p_n, p_*)^{m-1}\, d\big(q_n^1, p_*\big) \le K^{m-1}\, d(p_n, p_*)^{m-1}\, C\, d(p_n, p_*)^2.$$
Therefore,
$$d(p_{n+1}, p_*) \le C\, K^{m-1}\, d(p_n, p_*)^{m+1}.$$
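As a rough empirical check of this order, the following sketch estimates the convergence order of the Shamanskii sweep on a scalar equation using high-precision arithmetic; the equation, starting point and precision are illustrative choices, not part of the text.

```python
from decimal import Decimal, getcontext

# Rough empirical check of the order m + 1 of Theorem 6.5.1 on the scalar
# equation x^3 = 8 (root 2); purely illustrative.
getcontext().prec = 400

def sweep(x, m):
    d = 3 * x * x                       # derivative frozen at the outer iterate
    for _ in range(m):
        x = x - (x * x * x - 8) / d
    return x

for m in (1, 2, 3):
    x, e = Decimal("2.1"), []
    for _ in range(4):
        x = sweep(x, m)
        e.append(abs(x - 2))
    # slope of consecutive log-errors approximates the convergence order m + 1
    print(m, float((e[3].ln() - e[2].ln()) / (e[2].ln() - e[1].ln())))
```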

6.6. Expanding the Applicability of Newton Methods


We have used the Lipschitz condition (6.2.13) and the famous Kantorovich sufficient convergence criterion (6.1.1), in connection with the majorizing function f, for the semilocal convergence of both the simplified Newton and Newton methods. According to the proof of Lemma 6.3.2, the corresponding majorizing sequences for these methods are given by (see [27])
$$t_0 = 0, \quad t_1 = b, \quad t_{k+1} = t_k + \frac{f(t_k)}{|f'(0)|} = t_k + \frac{a\, l}{2}\,(t_k - t_{k-1})^2 \quad \text{for each } k = 1, 2, \cdots \tag{6.6.1}$$
for the simplified Newton method, and
$$u_0 = 0, \quad u_1 = b, \quad u_{k+1} = u_k + \frac{f(u_k)}{|f'(u_k)|} = u_k + \frac{a\, l\,(u_k - u_{k-1})^2}{2\,(1 - a\, l\, u_k)} \quad \text{for each } k = 1, 2, \cdots \tag{6.6.2}$$

for the Newton method. The Kantorovich criterion (6.1.1) may not be satisfied for a particular problem, yet the Newton methods may still converge to p∗ [27]. Next, we shall show that condition (6.1.1) can be weakened by introducing the center-Lipschitz condition and relying on tighter majorizing sequences instead of the majorizing function f.

Definition 6.6.1. Let E be a Banach space, Ω ⊆ E be an open convex set, F : Ω → Ω be a continuous operator and x_0 be a point in Ω, such that F ∈ C¹ and DF is center-Lipschitz in Ω at x_0:
$$\| DF(x) - DF(x_0) \| \le l_0\, \| x - x_0 \| \quad \text{for each } x \in \Omega \text{ and some } l_0 > 0.$$
As in the case of Definition 6.2.7, we will write $DF \in \mathrm{Lip}_{l_0}(\Omega)$ at x_0 ∈ Ω.

Note that
$$l_0 \le l \tag{6.6.3}$$
holds in general, and l/l_0 can be arbitrarily large [14].
We present the semilocal convergence of the simplified Newton method using only the
center-Lipschitz condition.

Theorem 6.6.2. Let M be a Riemannian manifold, Ω ⊆ M be an open convex set and X ∈ χ(M). Suppose that for some p_0 ∈ Ω, $DX \in \mathrm{Lip}_{l_0}(\Omega)$ at p_0, DX(p_0) is invertible and that for some a > 0 and b ≥ 0 the following hold:
$$\left\| DX(p_0)^{-1} \right\| \le a, \quad \left\| DX(p_0)^{-1} X(p_0) \right\| \le b,$$
$$h_0 = a\, b\, l_0 \le \frac{1}{2} \tag{6.6.4}$$
and
$$\overline{B}(p_0, t_*^0) \subseteq \Omega, \quad \text{where} \quad t_*^0 = \frac{1}{a\, l_0}\left(1 - \sqrt{1 - 2\, h_0}\right).$$
Then, the sequence {p_k} generated by (6.3.1) is such that $\{p_k\} \subseteq B(p_0, t_*^0)$ and p_k → p∗, which is the only singularity of X in $B(p_0, t_*^0)$. Moreover, if h_0 < 1/2 and B(p_0, r) ⊆ Ω with
$$t_*^0 < r \le t_{**}^0 = \frac{1}{a\, l_0}\left(1 + \sqrt{1 - 2\, h_0}\right),$$
then p∗ is also the only singularity of X in B(p_0, r). Furthermore, the following error bounds are satisfied for each k = 1, 2, ⋯:
$$d(p_k, p_{k-1}) \le t_k^0 - t_{k-1}^0, \quad d(p_k, p_*) \le t_*^0 - t_k^0$$
and
$$d(p_k, p_*) \le \frac{b}{h_0}\left(1 - \sqrt{1 - 2\, h_0}\right)^{k+1},$$

where the sequence $\{t_k^0\}$ is defined by
$$t_0^0 = 0, \quad t_1^0 = b, \quad t_{k+1}^0 = t_k^0 + \frac{a\, l_0}{2}\,(t_k^0 - t_{k-1}^0)^2 \quad \text{for each } k = 1, 2, \cdots.$$
Proof. Simply notice that $l_0, h_0, \{t_k^0\}, t_*^0, t_{**}^0$ can replace $l, h, \{t_k\}, t_*, t_{**}$, respectively, in the proof of Theorem 6.3.1.

Remark 6.6.3. Under the Kantorovich criterion (6.1.1), a simple inductive argument shows that
$$t_k^0 \le t_k \quad \text{and} \quad t_{k+1}^0 - t_k^0 \le t_{k+1} - t_k \quad \text{for each } k = 0, 1, \cdots.$$
Moreover, we have that
$$t_*^0 \le t_*, \quad t_{**}^0 \le t_{**}, \quad h \le \frac{1}{2} \Longrightarrow h_0 \le \frac{1}{2}$$
and
$$\frac{h_0}{h} \longrightarrow 0 \quad \text{as} \quad \frac{l_0}{l} \longrightarrow 0.$$
Furthermore, strict inequality holds in these estimates (for k > 1) if l_0 < l.
The convergence order of the simplified method is only linear, whereas the convergence order of the Newton method is quadratic if h < 1/2. If the criterion h ≤ 1/2 is not satisfied, but the weaker h_0 ≤ 1/2 is, we can start with the simplified method until a certain iterate x_N (N a finite natural integer) at which the criterion h ≤ 1/2 is satisfied. Such an integer N exists, since the simplified Newton method converges [8], [12], [14]. This approach was not possible before, since h ≤ 1/2 was used as the convergence criterion for both methods.
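A rough numerical sketch of this switching idea follows, for the scalar function F(x) = x³ − d of Example 6.6.10 below; the constants a = 1/|F′(x)|, b = |F(x)/F′(x)| and l = sup|F″| used here are a simplified (unscaled) variant of the affine-invariant ones in the text, chosen only to illustrate the mechanism.

```python
# Hybrid sketch: iterate the simplified (frozen-derivative) method until the
# Kantorovich quantity h = a*b*l at the current iterate drops to 1/2, then
# switch (for good) to full Newton steps.
d = 0.49
F, dF = lambda x: x**3 - d, lambda x: 3 * x**2
l = 6 * (2 - d)                       # sup of |F''| on Omega = [d, 2 - d]
x, J0, newton = 1.0, 3.0, False       # J0 = F'(x0), frozen for the first phase
for k in range(25):
    a = 1 / abs(dF(x))
    h = a * (a * abs(F(x))) * l       # h = a * b * l at the current iterate
    newton = newton or h <= 0.5
    x = x - F(x) / (dF(x) if newton else J0)
print(x, 0.49 ** (1 / 3))             # both approximately 0.7884
```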

Remark 6.6.4. Under the hypotheses of Theorem 6.1.2, we see in the proof of this theorem that the sequences {r_k}, {s_k} defined by
$$r_0 = 0, \quad r_1 = b, \quad r_2 = r_1 + \frac{a\, l_0\,(r_1 - r_0)^2}{2\,(1 - a\, l_0\, r_1)}, \quad r_{k+1} = r_k + \frac{a\, l\,(r_k - r_{k-1})^2}{2\,(1 - a\, l_0\, r_k)} \quad \text{for each } k = 2, 3, \cdots \tag{6.6.5}$$
and
$$s_0 = 0, \quad s_1 = b, \quad s_{k+1} = s_k + \frac{a\, l\,(s_k - s_{k-1})^2}{2\,(1 - a\, l_0\, s_k)} \quad \text{for each } k = 1, 2, \cdots \tag{6.6.6}$$
are also majorizing sequences for {p_k} such that
$$r_k \le s_k \le u_k, \quad d(p_k, p_{k-1}) \le r_k - r_{k-1} \le s_k - s_{k-1} \le u_k - u_{k-1}$$
and
$$r_* = \lim_{k\to\infty} r_k \le s_* = \lim_{k\to\infty} s_k \le t_* = \lim_{k\to\infty} u_k.$$

Simply notice that for the computation of the upper bound on the norms ‖DX(q)⁻¹‖ (see (6.3.8)), using the center-Lipschitz condition we can have
$$\| DX(q)^{-1} \| \le \frac{\| \varphi^{-1} \|}{1 - \| \varphi^{-1} \|\, \| DX(q) - \varphi \|} \le \frac{a}{1 - a\, l_0\, \lambda}$$
instead of the less tight (if l_0 < l) and more expensive to compute estimate
$$\| DX(q)^{-1} \| \le \frac{a}{1 - a\, l\, \lambda}$$
obtained in the proofs of Theorems 6.1.2 and 6.3.1 using the Lipschitz condition. Hence, the results of Theorem 6.1.2 involving the sequence {u_k} can be rewritten using the tighter sequences {r_k} or {s_k}. Note that the introduction of the center-Lipschitz condition is not an additional hypothesis to the Lipschitz condition since, in practice, the computation of l requires the computation of l_0. So far we have shown that, under the Kantorovich criterion (6.1.1), the estimates of the distances d(p_k, p_{k−1}), d(p_k, p∗) are improved (if l_0 < l) using the tighter sequences {r_k}, {s_k} for the computation of the upper bounds of these distances. Moreover, the information on the location of the solution is at least as precise.
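The following small Python sketch compares the limits of the three recursions numerically; the data a, b, l, l_0 are illustrative, chosen with l_0 < l and h = a b l ≤ 1/2.

```python
# Compare the limits of the majorizing sequences {r_k} (6.6.5), {s_k} (6.6.6)
# and {u_k} (6.6.2); illustrative data only.
a, b, l, l0 = 1.0, 0.2, 2.0, 1.2

def limit(num2, num, den, n=100):
    """Iterate x_{k+1} = x_k + coeff*(x_k - x_{k-1})**2 / (2*(1 - den*x_k)),
    where coeff is num2 for the very first corrector step and num afterwards."""
    prev, cur, coeff = 0.0, b, num2
    for _ in range(n):
        prev, cur, coeff = cur, cur + coeff * (cur - prev) ** 2 / (2 * (1 - den * cur)), num
    return cur

u = limit(a * l,  a * l,  a * l)    # (6.6.2): Lipschitz constant only
s = limit(a * l,  a * l,  a * l0)   # (6.6.6): center-Lipschitz in the denominator
r = limit(a * l0, a * l,  a * l0)   # (6.6.5): l0 also in the second step
print(r, s, u)                      # r* <= s* <= u*, as claimed above
```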

Next, we shall show that the Kantorovich criterion (6.1.1) can be weakened if one studies the convergence of the sequences {r_k} and {s_k} directly (and not through the majorizing function f). First, we present the results for the sequence {s_k}.

Lemma 6.6.5. [13] Assume there exist constants l_0 ≥ 0, l ≥ 0, a > 0 and b ≥ 0 with l_0 ≤ l such that
$$h_1 = l_1\, b \begin{cases} \le 1/2 & \text{if } l_0 \ne 0 \\ < 1/2 & \text{if } l_0 = 0, \end{cases} \tag{6.6.7}$$
where
$$l_1 = \frac{a}{8}\left( l + 4\, l_0 + \sqrt{l^2 + 8\, l_0\, l} \right).$$
Then, the sequence {s_n} given by (6.6.6) is nondecreasing, bounded from above by $s^{\star\star}$ and converges to its unique least upper bound $s^\star \in [0, s^{\star\star}]$, where
$$s^{\star\star} = \frac{2\, b}{2 - \delta} \quad \text{and} \quad \delta = \frac{4\, l}{l + \sqrt{l^2 + 8\, l_0\, l}} < 1 \quad \text{for } l_0 \ne 0. \tag{6.6.8}$$
Moreover, the following estimates hold:
$$a\, l_0\, s^\star \le 1, \tag{6.6.9}$$
$$0 \le s_{n+1} - s_n \le \frac{\delta}{2}\,(s_n - s_{n-1}) \le \cdots \le \left(\frac{\delta}{2}\right)^n b \quad \text{for each } n = 1, 2, \cdots, \tag{6.6.10}$$
$$s_{n+1} - s_n \le \left(\frac{\delta}{2}\right)^n (2\, h_1)^{2^n - 1}\, b \quad \text{for each } n = 0, 1, \cdots \tag{6.6.11}$$
and
$$0 \le s^\star - s_n \le \left(\frac{\delta}{2}\right)^n \frac{(2\, h_1)^{2^n - 1}\, b}{1 - (2\, h_1)^{2^n}}, \quad (2\, h_1 < 1) \quad \text{for each } n = 0, 1, \cdots. \tag{6.6.12}$$

Lemma 6.6.6. [16] Suppose that the hypotheses of Lemma 6.6.5 hold. Assume that
$$h_2 = l_2\, b \le \frac{1}{2}, \tag{6.6.13}$$
where
$$l_2 = \frac{a}{8}\left( 4\, l_0 + \left(l\, l_0 + 8\, l_0^2\right)^{1/2} + \left(l_0\, l\right)^{1/2} \right).$$
Then, the scalar sequence {r_n} given by (6.6.5) is well defined, increasing, bounded from above by
$$r^{\star\star} = b + \frac{a\, l_0\, b^2}{2\,(1 - (\delta/2))\,(1 - a\, l_0\, b)} \tag{6.6.14}$$
and converges to its unique least upper bound $r^\star$, which satisfies $0 \le r^\star \le r^{\star\star}$. Moreover, the following estimates hold:
$$0 < r_{n+2} - r_{n+1} \le \left(\frac{\delta}{2}\right)^n \frac{a\, l_0\, b^2}{2\,(1 - a\, l_0\, b)} \quad \text{for each } n = 1, 2, \cdots. \tag{6.6.15}$$

Lemma 6.6.7. [16] Suppose that the hypotheses of Lemma 6.6.5 hold and that there exists a minimum integer N > 1 such that the iterates r_i (i = 0, 1, ⋯, N − 1) given by (6.6.5) are well defined,
$$r_i < r_{i+1} < \frac{1}{a\, l_0} \quad \text{for each } i = 0, 1, \cdots, N-2 \tag{6.6.16}$$
and
$$r_N \le \frac{1}{a\, l_0}\left(1 - (1 - a\, l_0\, r_{N-1})\,\frac{\delta}{2}\right). \tag{6.6.17}$$
Then, the following assertions hold:
$$a\, l_0\, r_N < 1, \tag{6.6.18}$$
$$r_{N+1} \le \frac{1}{a\, l_0}\left(1 - (1 - a\, l_0\, r_N)\,\frac{\delta}{2}\right), \tag{6.6.19}$$
$$\delta_{N-1} \le \frac{\delta}{2} \le 1 - \frac{a\, l_0\,(r_{N+1} - r_N)}{1 - a\, l_0\, r_N}, \tag{6.6.20}$$
the sequence {r_n} given by (6.6.5) is well defined, increasing, bounded from above by
$$r^{\star\star} = r_{N-1} + \frac{2}{2 - \delta}\,(r_N - r_{N-1})$$
and converges to its unique least upper bound $r^\star$, which satisfies $0 \le r^\star \le r^{\star\star}$, where δ is given in Lemma 6.6.5 and
$$\delta_n = \frac{a\, l\,(r_{n+2} - r_{n+1})}{2\,(1 - a\, l_0\, r_{n+2})}.$$
Moreover, the following estimates hold:
$$0 < r_{N+n} - r_{N+n-1} \le \left(\frac{\delta}{2}\right)^{n-1}(r_{N+1} - r_N) \quad \text{for each } n = 1, 2, \cdots.$$

Remark 6.6.8. If N = 2, we must have
$$r_2 = b + \frac{a\, l_0\, b^2}{2\,(1 - a\, l_0\, b)} \le \frac{a\, l\, b + \delta}{a\, l + \delta\, a\, l_0},$$
which is (6.6.13). When N > 2, we no longer have closed-form inequalities (solved for n) of the form
$$c_0\, b \le c_1,$$
where c_0 and c_1 may depend on l_0 and l; see, e.g., (6.6.7) or (6.6.13). However, the corresponding inequalities can still be checked, since only computations involving b, l_0 and l are carried out (see also [16]). Clearly, the sufficient convergence conditions of the form (6.6.17) become weaker as N increases.

Remark 6.6.9. In [14], [16], tighter upper bounds on the limit points of the majorizing sequences {r_n}, {s_n}, {u_k} than those in [6, 8, 32] are given. Indeed, we have that
$$r^\star = \lim_{n\to\infty} r_n \le r_3 = \left(1 + \frac{a\, l_0\, b}{(2 - \delta)\,(1 - a\, l_0\, b)}\right) b.$$
Note that
$$r_3 \begin{cases} \le r_2 & \text{if } l_0 \le l \\ < r_2 & \text{if } l_0 < l \end{cases} \quad \text{and} \quad r_3 \begin{cases} \le r_1 & \text{if } l_0 \le l \\ < r_1 & \text{if } l_0 < l, \end{cases}$$
where
$$r_2 = \frac{2\, b}{2 - \delta} \quad \text{and} \quad r_1 = 2\, b.$$
Moreover, r_2 can be smaller than $s^\star$ for sufficiently small l_0. We also have that
$$h \le \frac{1}{2} \Longrightarrow h_1 \le \frac{1}{2} \Longrightarrow h_2 \le \frac{1}{2},$$
but not necessarily vice versa, unless l_0 = l. Moreover, we have that
$$\frac{h_1}{h} \longrightarrow \frac{1}{4}, \quad \frac{h_2}{h} \longrightarrow 0 \quad \text{and} \quad \frac{h_2}{h_1} \longrightarrow 0 \quad \text{as} \quad \frac{l_0}{l} \longrightarrow 0.$$
Example 6.6.10. We consider a simple example to test the "h" conditions in one dimension. Let X = ℝ, x_0 = 1, Ω = [d, 2 − d], d ∈ [0, 0.5). Define the function F on Ω by
$$F(x) = x^3 - d. \tag{6.6.21}$$
We get that
$$b = \frac{1}{3}(1 - d) \quad \text{and} \quad l = 2\,(2 - d).$$
The Kantorovich condition (6.1.1) is given by
$$h = \frac{2}{3}(1 - d)(2 - d) > 0.5 \quad \text{for all } d \in (0, 0.5).$$
Hence, there is no guarantee that Newton's method starting at x_0 = 1 converges to $x^\star$. However, one can easily see that if, for example, d = 0.49, Newton's method converges to $x^\star = \sqrt[3]{0.49}$. In view of (6.6.21), we deduce for the center-Lipschitz condition that
$$l_0 = 3 - d < l = 2\,(2 - d) \quad \text{for all } d \in (0, 0.5). \tag{6.6.22}$$
We consider the "h" conditions of Remark 6.6.9. Then, we obtain that
$$h_1 = \frac{1}{12}\left(8 - 3\, d + (5\, d^2 - 24\, d + 28)^{1/2}\right)(1 - d) \le 0.5 \quad \text{for all } d \in [0.450339002, 0.5)$$
and
$$h_2 = \frac{1}{24}(1 - d)\left(12 - 4\, d + (84 - 58\, d + 10\, d^2)^{1/2} + (12 - 10\, d + 2\, d^2)^{1/2}\right) \le 0.5$$
for all d ∈ [0.4271907643, 0.5).
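The three quantities can be checked numerically with the following sketch (a = 1 in the affine-scaled setting used here; the sample values of d are illustrative):

```python
import numpy as np

# Check the "h" quantities of this example for a few sample values of d.
for d in (0.40, 0.43, 0.46, 0.49):
    b, L, L0 = (1 - d) / 3, 2 * (2 - d), 3 - d
    h  = L * b
    h1 = (L + 4 * L0 + np.sqrt(L**2 + 8 * L0 * L)) / 8 * b            # (6.6.7)
    h2 = (4 * L0 + np.sqrt(L * L0 + 8 * L0**2) + np.sqrt(L0 * L)) / 8 * b
    print(f"d={d:.2f}: h={h:.4f} h1={h1:.4f} h2={h2:.4f}")
# h stays above 0.5 on (0, 0.5), while h1 drops below 0.5 near d = 0.4503 and
# h2 near d = 0.4272, in agreement with the thresholds above.
```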

In Fig. 6.6.1, we compare the "h" conditions for d ∈ (0, 0.999).
Figure 6.6.1. Functions h, h_1, h_2 (from top to bottom) with respect to d in the interval (0, 0.999). The horizontal line is y = 0.5.
References

[1] Absil, P.A. Mahony, R., Sepulchre, R., Optimization Algorithms on Matrix Manifolds,
Princeton University Press, Princeton NJ, 2008.

[2] Adler, R.L., Dedieu, J.P., Margulies, J.Y., Martens, M., Shub,M., Newton’s method
on Riemannian manifolds and a geometric model for the human spine, IMA J. Numer.
Anal., 22 (2002), 359–390.

[3] Alvarez, F., Bolte, J., Munier, J., A unifying local convergence result for Newton’s
method in Riemannian manifolds, Foundations Comput. Math., 8 (2008), 197–226.

[4] Amat, S., Busquier, S., Third-order iterative methods under Kantorovich conditions,
J. Math. Anal. Appl., 336 (2007), 243–261.

[5] Amat, S., Busquier, S., Gutiérrez, J. M., Third-order iterative methods with appli-
cations to Hammerstein equations: A unified approach, J. App. Math. Comp., 235
(2011), 2936–2943.

[6] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374–397.

[7] Argyros, I.K., An improved unifying convergence analysis of Newton’s method in


Riemannian manifolds, J. App. Math. Comp., 25 (2007), 345–351.

[8] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point, Rev. Anal. Numér. Théor. Ap-
prox., 36 (2007), 123–138.

[9] Argyros, I.K., Chebysheff-Halley like methods in Banach spaces, Korean J. Comp.
Appl. Math., 4 (1997), 83–107.

[10] Argyros, I.K., Improved error bounds for a Chebysheff-Halley-type method, Acta
Math. Hungar., 84 (1999), 209–219.

[11] Argyros, I.K., Improving the order and rates of convergence for the super-Halley
method in Banach spaces, Korean J. Comput. Appl. Math., 5 (1998), 465–474.

[12] Argyros, I.K., Convergence and applications of Newton–type iterations, Springer–


Verlag Publ., New–York, 2008.

[13] Argyros, I.K., A semilocal convergence analysis for directional Newton methods,
Math. Comp., 80 (2011), 327–343.

[14] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press/Taylor and Francis Publ., New York, 2012.

[15] Argyros, I.K., Hilout, S., Newton’s method for approximating zeros of vector fields
on Riemannian manifolds, J. App. Math. Comp., 29 (2009), 417–427.

[16] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364–387.

[17] Argyros, I.K., Ren, H., On the semilocal convergence of Halley’s method under a
center-Lipschitz condition on the second Fréchet derivative, App. Math. Comp., 218
(2012), 11488–11495.

[18] Averbuh, V.I., Smoljanov, O.G., Differentiation theory in linear topological spaces, Uspehi Mat. Nauk, 6 (1967), 201–260; English transl.: Russian Math. Surveys, 6 (1967), 201–258.

[19] Chun, C., Stanica, P., Neta, B., Third-order family of methods in Banach spaces,
Comput. Math. Appl., 61 (2011), 1665–1675.

[20] Dedieu, J.P., Priouret, P., Malajovich, G., Newton’s method on Riemannian manifolds:
convariant alpha theory, IMA J. Numer. Anal., 23 (2003), 395–419.

[21] Dedieu, J.P., Nowicki, D., Symplectic methods for the approximation of the expo-
nential map and the Newton iteration on Riemannian submanifolds, J. Complexity, 21
(2005), 487–501.

[22] Do Carmo, M., Riemannian geometry, Birkhäuser, Boston, 1992.

[23] Ezquerro, J.A., Hernández, M.A., New Kantorovich-type conditions for Halley’s
method, Appl. Numer. Anal. Comput. Math., 2 (2005), 70–77.

[24] Ezquerro, J.A., Gutiérrez, J. M., Hernández, M.A., Salanova, M.A., Chebyshev-like
methods and quadratic equations, Rev. Anal. Numér. Théor. Approx., 28 (2000), 23–35.

[25] Ezquerro, J.A., A modification of the Chebyshev method, IMA J. Numer. Anal., 17
(1997) 511–525.

[26] Ezquerro, J.A., Hernández, M.A., A super-Halley type approximation in Banach


spaces, Approx. Theory Appl., 17 (2001), 14–25.

[27] Ferreira, O., Svaiter, B., Kantorovich’s Theorem on Newton’s Method in Riemannian
Manifolds, J. Complexity, 18 (2002), 304–329.

[28] Gabay, D., Minimizing a differentiable function over a differential manifold, J. Optim.
Theory Appl., 37 (1982), 177–219.

[29] Groisser, D., Newton’s method, zeros of vector fields, and the Riemannian center of
mass, Adv. Appl. Math., 33 (2004), 95–135.

[30] Hernández, M.A., Romero, N., General study of iterative processes of R-order at least
three under weak convergence conditions, J. Optim. Theory Appl., 133 (2007), 163–
177.

[31] Hernández, M.A., Romero, N., On a characterization of some Newton-like methods


of R-order at least three, J. Comput. Appl. Math., 183 (2005), 53–66.

[32] Kantorovich, L.V., Akilov, G.P., Functional Analysis in Normed Spaces, Pergamon,
Oxford, 1964.

[33] Kelley, C.T., A Shamanskii-like acceleration scheme for nonlinear equations at singu-
lar roots, Math. Comp., 47 (1986), 609–623.

[34] Kress, R., Numerical Analysis, Springer-Verlag, New York, 1998.

[35] Li, C., Wang, J., Newton’s method on Riemannian manifolds: Smale’s point estimate
theory under the γ-condition, IMA J. Numer. Anal., 26 (2006), 228–251.

[36] Li, C., Wang, J., Newton’s method for sections on Riemannian manifolds: Generalized
covariant α-theory, J. Complexity, 25 (2009), 128–151.

[37] Manton, J.H., Optimization algorithms exploiting unitary constraints, IEEE Trans.
Signal Process., 50 (2002), 635–650.

[38] Neta, B., A new iterative method for the solution of systems of nonlinear equations.
Approximation theory and applications (Proc. Workshop, Technion Israel Inst. Tech.,
Haifa, 1980), Academic Press, New York-London, (1981), 249–263.

[39] Parida, P.K., Gupta, D.K., Recurrence relations for a Newton-like method in Banach
spaces, 206 (2007), 873–887.

[40] Parida, P.K., Gupta, D.K., Recurrence relations for semilocal convergence of a
Newton-like method in Banach spaces, J. Math. Anal. Appl. 345 (2008), 350–361.

[41] Parida, P.K., Gupta, D.K., Semilocal convergence of a family of third-order methods in
Banach spaces under Hölder continuous second derivative, Nonlinear Anal. 69 (2008),
4163–4173.

[42] Romero, N., PhD Thesis. Familias paramétricas de procesos iterativos de alto orden
de convergencia. http://dialnet.unirioja.es/

[43] Shamanskii, V.E., A modification of Newton’s method, Ukrain. Mat. Zh., 19 (1967),
133–138.

[44] Spivak, M., A comprehensive introduction to differential geometry, Vol. I, third ed.,
Publish or Perish Inc., Houston, Texas., 2005.

[45] Spivak, M., A comprehensive introduction to differential geometry, Vol. II, third ed.,
Publish or Perish Inc., Houston, Texas, 2005.

[46] Traub, J.F., Iterative methods for the solution of equations, Prentice Hall, Englewood
Cliffs, N. J., 1964.

[47] Udriste, C., Convex functions and optimization methods on Riemannian manifolds,
Mathematics and its Applications, 297, Kluwer Academic Publishers Group, Dor-
drecht, 1994.

[48] Wang, J.H., Convergence of Newton’s method for sections on Riemannian manifolds,
J. Optim. Theory Appl., 148 (2011), 125–145.

[49] Zhang, L.H., Riemannian Newton method for the multivariate eigenvalue problem,
SIAM J. Matrix Anal. Appl., 31 (2010), 2972–2996.
Chapter 7

Improved Local Convergence Analysis of Inexact Gauss-Newton Like Methods

7.1. Introduction
Let X and Y be Banach spaces, let D ⊆ X be an open set and let F : D → Y be continuously differentiable. In this chapter we are concerned with the problem of approximating a locally unique solution $x^\star$ of the nonlinear least squares problem
$$\min_{x \in D}\, \| F(x) \|^2. \tag{7.1.1}$$

A solution $x^\star \in D$ of (7.1.1) is also called a least squares solution of the equation F(x) = 0. Many problems from computational sciences and other disciplines can be brought into a form similar to equation (7.1.1) using mathematical modelling [8], [11]. For example, in data fitting we have X = ℝⁱ, Y = ℝʲ, where i is the number of parameters and j is the number of observations [23].
The solution of (7.1.1) can rarely be found in closed form. That is why the solution methods for these equations are usually iterative. In particular, the practice of numerical analysis for finding such solutions is essentially connected to Newton-type methods [8]. The study of the convergence of iterative procedures usually centers on two types: semilocal and local convergence analysis. The semilocal convergence analysis is based on the information around an initial point and gives criteria ensuring the convergence of iterative procedures, while the local one is based on the information around a solution and finds estimates of the radii of convergence balls. A plethora of sufficient conditions for the local as well as the semilocal convergence of Newton-type methods, together with an error analysis for such methods, can be found in [1]–[47].
In the present chapter we use the inexact Gauss-Newton like method
$$x_{n+1} = x_n + s_n, \quad B(x_n)\, s_n = -F'(x_n)^\star F(x_n) + r_n \quad \text{for each } n = 0, 1, \cdots, \tag{7.1.2}$$
where x_0 ∈ D is an initial point, to generate a sequence {x_n} approximating $x^\star$. Here, $A^\star$ denotes the adjoint of the operator A; B(x) ∈ L(X, Y), the space of bounded linear operators from X into Y, is an approximation of the derivative $F'(x)^\star F'(x)$ (x ∈ D); r_n is the residual, and the tolerance θ_n together with the invertible preconditioning matrix P_n for the linear systems defining the step s_n satisfy
$$\| P_n\, r_n \| \le \theta_n\, \| P_n\, F'(x_n)^\star F(x_n) \| \quad \text{for each } n = 0, 1, \cdots. \tag{7.1.3}$$
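For orientation, the following Python sketch shows the exact Gauss-Newton special case of (7.1.2) in ℝⁿ, i.e., $B(x) = F'(x)^\star F'(x)$ and r_n = 0; the data-fitting residual, data values and names are illustrative only.

```python
import numpy as np

def gauss_newton(F, J, x0, tol=1e-10, max_iter=100):
    """Sketch of (7.1.2) with B(x) = J(x)^T J(x) and r_n = 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Jx, Fx = J(x), F(x)
        # Normal-equations step: solve J^T J s = -J^T F (no residual term r_n)
        s = np.linalg.solve(Jx.T @ Jx, -Jx.T @ Fx)
        x = x + s
        if np.linalg.norm(s) < tol:
            break
    return x

# Illustrative residual for data fitting: F_i(p) = p0 * exp(p1 * t_i) - y_i.
t = np.linspace(0.0, 1.0, 6)
y = 2.0 * np.exp(-1.0 * t) + 0.01 * np.array([1, -1, 1, -1, 1, -1])
F = lambda p: p[0] * np.exp(p[1] * t) - y
J = lambda p: np.column_stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)])
print(gauss_newton(F, J, np.array([1.0, 0.0])))   # approximately (2, -1)
```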
If θ_n = 0 for each n = 0, 1, ⋯, the inexact Gauss-Newton method reduces to the Gauss-Newton method. If $x^\star$ is a solution of (7.1.1), $F(x^\star) = 0$ and $F'(x^\star)$ is invertible, then the theories of Gauss-Newton methods merge into those of Newton's method. A survey of convergence results under various Lipschitz-type conditions for Gauss-Newton-type methods can be found in [8] (see also [5]–[15], [17]–[40]). The convergence of these methods requires, among other hypotheses, that F′ satisfies a Lipschitz condition or that F″ is bounded in D. Several authors have relaxed these hypotheses [9]–[15]. In particular, Ferreira et al. [24]–[29] have used the majorant condition in the local as well as the semilocal convergence of Newton-type methods. Argyros and Hilout [12]–[16] have also used the majorant condition to provide a tighter convergence analysis and weaker convergence criteria for Newton-type methods. The local convergence of the inexact Gauss-Newton method was examined by Ferreira et al. [28] using the majorant condition. It was shown that this condition is better than Wang's condition [36], [47] in some sense. A certain relationship between the majorant function and the operator F was established that unifies two previously unrelated results pertaining to inexact Gauss-Newton methods: the result for analytic functions and the one for operators with Lipschitz derivative.
In the present chapter, we are motivated by the elegant work in [28] and by optimization considerations. Using a more precise majorant condition and functions, we provide a new local convergence analysis for inexact Gauss-Newton-like methods under the same computational cost and with the following advantages: a larger radius of convergence; tighter error estimates on the distances $\| x_n - x^\star \|$ for each n = 0, 1, ⋯; and a clearer relationship between the majorant function and the associated least squares problem (7.1.1). These advantages are obtained because we use a center-type majorant condition (see (7.3.1)) for the computation of the inverses involved, which is more precise than the majorant condition used in [28]. Moreover, these advantages are obtained under the same computational cost since, as we will see in Sections 7.3. and 7.4., the computation of the majorant function requires the computation of the center-majorant function. Furthermore, these advantages are very important in computational mathematics, since we have a wider choice of initial guesses x_0 and fewer computations to obtain a desired error tolerance on the distances $\| x_n - x^\star \|$ for each n = 0, 1, ⋯.
The chapter is organized as follows. In order to make the chapter as self-contained as possible, we provide the necessary background in Section 7.2.. Section 7.3. contains the local convergence analysis of inexact Gauss-Newton-like methods. Some proofs are abbreviated to avoid repetition with the corresponding ones in [28]. Special cases and applications are given in the concluding Section 7.4..

7.2. Background
Let U(x, r) and U(x, r) stand, respectively, for the open and closed ball in X with center x ∈
D and radius r > 0. Let A : X −→ Y be continuous linear and injective with closed image,
Improved Local Convergence Analysis of Inexact Gauss-Newton Like Methods 139

the Moore-Penrose inverse [8], [11], [34] A+ : Y −→ X is defined by A+ = (A? A)−1 A? . I


denotes the identity operator on X (or Y ).
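In the finite-dimensional case this formula can be evaluated directly for a full-column-rank matrix; the tiny check below (illustrative, not how one should compute a pseudoinverse in practice) confirms it agrees with a library routine.

```python
import numpy as np

# For injective A (full column rank), A^+ = (A^* A)^{-1} A^*.
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
A_plus = np.linalg.solve(A.T @ A, A.T)           # (A^T A)^{-1} A^T
print(np.allclose(A_plus, np.linalg.pinv(A)))    # True
```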
Lemma 7.2.1. [8, 11, 35] (Banach's Lemma) Let A : X → X be a continuous linear operator. If ‖A − I‖ < 1, then $A^{-1} \in L(X, X)$ and $\| A^{-1} \| \le 1/(1 - \| A - I \|)$.
Lemma 7.2.2. [8] Let A, E : X → Y be two continuous linear operators with closed images. Suppose B = A + E, A is injective and $\| E\, A^+ \| < 1$. Then, B is injective.
Lemma 7.2.3. [8, 11, 35] Let A, E : X → Y be two continuous linear operators with closed images. Suppose B = A + E and $\| A^+ \|\, \| E \| < 1$. Then, the following estimates hold:
$$\| B^+ \| \le \frac{\| A^+ \|}{1 - \| A^+ \|\, \| E \|} \quad \text{and} \quad \| B^+ - A^+ \| \le \frac{\sqrt{2}\, \| A^+ \|^2\, \| E \|}{1 - \| A^+ \|\, \| E \|}.$$
Proposition 7.2.4. [34] Let R > 0. Suppose g : [0, R) → ℝ is convex. Then, the following holds:
$$D^+ g(0) = \lim_{u \to 0^+} \frac{g(u) - g(0)}{u} = \inf_{u > 0} \frac{g(u) - g(0)}{u}.$$
Proposition 7.2.5. [34] Let R > 0 and θ ∈ [0, 1]. Suppose g : [0, R) → ℝ is convex. Then, h : (0, R) → ℝ defined by h(t) = (g(t) − g(θt))/t is increasing.

7.3. Local Convergence Analysis


We examine the local convergence of the inexact Gauss-Newton-like method. In order for us to show the main Theorem 7.3.8, we need some auxiliary results. The proofs of some of the results are omitted, since these proofs can be found in [28] by simply replacing the function f by f_0. Assume that the map $x \in D \mapsto F(x)^\star F(x)$ has $x^\star$ as a stationary point. Let R > 0,
$$c = \| F(x^\star) \|, \quad \beta = \| F'(x^\star)^+ \| \quad \text{and} \quad \kappa = \sup\{ t \in [0, R) : U(x^\star, t) \subseteq D \}.$$
Suppose that $F'(x^\star)^\star F(x^\star) = 0$, $F'(x^\star)$ is injective and there exist continuously differentiable functions $f_0, f : [0, R) \to (-\infty, +\infty)$ such that the following assumptions hold:
(H_0)
$$\| F'(x) - F'(x^\star) \| \le f_0'(\| x - x^\star \|) - f_0'(0), \tag{7.3.1}$$
$$\| F'(x) - F'(x^\star + \tau\,(x - x^\star)) \| \le f'(\| x - x^\star \|) - f'(\tau\, \| x - x^\star \|), \tag{7.3.2}$$
for all $x \in U(x^\star, \kappa)$ and τ ∈ [0, 1];
(H_1) $f_0(0) = f(0) = 0$ and $f_0'(0) = f'(0) = -1$;
(H_2) $f_0'$, $f'$ are strictly increasing, and
$$f_0(t) \le f(t) \quad \text{and} \quad f_0'(t) \le f'(t) \quad \text{for each } t \in [0, R);$$
(H_3)
$$\alpha_0 = \sqrt{2}\, c\, \beta^2\, D^+ f_0'(0) < 1.$$

Let
$$0 \le \vartheta < 1, \quad 0 \le \omega_2 < \omega_1 \quad \text{such that} \quad \omega_1\,(\alpha_0 + \alpha_0\, \vartheta + \vartheta) + \omega_2 < 1, \tag{7.3.3}$$
where α_0 is defined in (H_3). Define the parameters ν_0, ρ_0 and r_0 by
$$\nu_0 := \sup\{ t \in [0, R) : \beta\,(f_0'(t) + 1) < 1 \}, \tag{7.3.4}$$
$$\rho_0 := \sup\Big\{ t \in [0, \nu_0) : (1 + \vartheta)\, \omega_1\, \beta\, \frac{t\, f'(t) - f(t) + \sqrt{2}\, c\, \beta\,(f_0'(t) + 1)}{t\,(1 - \beta\,(f_0'(t) + 1))} + \omega_1\, \vartheta + \omega_2 < 1 \Big\} \tag{7.3.5}$$
and
$$r_0 := \min\{ \kappa, \rho_0 \}. \tag{7.3.6}$$
We provide the following auxiliary lemmas.
Lemma 7.3.1. Suppose that (H_0)–(H_3) hold. Then, the constant ν_0 defined by (7.3.4) is positive and $\beta\,(f_0'(t) + 1) < 1$ for each t ∈ (0, ν_0).
Lemma 7.3.2. Suppose that (H_0)–(H_3) hold. Then, the real functions h_i (i = 1, 2, 3) defined on (0, R) by
$$h_1(t) = \frac{1}{1 - \beta\,(f_0'(t) + 1)}, \quad h_2(t) = \frac{t\, f'(t) - f(t)}{t^2} \quad \text{and} \quad h_3(t) = \frac{f_0'(t) + 1}{t}$$
are increasing. Note also that h_2 h_1 and h_3 h_1 are increasing on (0, R).
Lemma 7.3.3. Suppose that (H_0)–(H_3) hold. Then, the constant ρ_0 defined by (7.3.5) is positive and the following holds for each t ∈ (0, ρ_0):
$$(1 + \vartheta)\, \omega_1\, \beta\, \frac{t\, f'(t) - f(t) + \sqrt{2}\, c\, \beta\,(f_0'(t) + 1)}{t\,(1 - \beta\,(f_0'(t) + 1))} + \omega_1\, \vartheta + \omega_2 < 1,$$
where ϑ, ω_1 and ω_2 are defined in (7.3.3).
Proof. Using (H_1), we have that
$$\lim_{t\to 0} \frac{t\, f'(t) - f(t)}{t\,(1 - \beta\,(f_0'(t) + 1))} = \lim_{t\to 0} \frac{f'(t) - (f(t) - f(0))/t}{1 - \beta\,(f'(t) + 1)} \cdot \frac{1 - \beta\,(f'(t) + 1)}{1 - \beta\,(f_0'(t) + 1)} = \frac{1 - \beta\,(f'(0) + 1)}{1 - \beta\,(f_0'(0) + 1)}\, \lim_{t\to 0} \frac{f'(t) - (f(t) - f(0))/t}{1 - \beta\,(f'(t) + 1)} = 0.$$
By the convexity of f_0 and $f_0'$ and Proposition 7.2.4, we get that
$$\lim_{t\to 0} \frac{f_0'(t) + 1}{t\,(1 - \beta\,(f_0'(t) + 1))} = \lim_{t\to 0} \frac{(f_0'(t) - f_0'(0))/t}{1 - \beta\,(f_0'(t) + 1)} = D^+ f_0'(0).$$
We deduce that
$$\lim_{t\to 0}\left[ (1 + \vartheta)\, \omega_1\, \beta\, \frac{t\, f'(t) - f(t) + \sqrt{2}\, c\, \beta\,(f_0'(t) + 1)}{t\,(1 - \beta\,(f_0'(t) + 1))} + \omega_1\, \vartheta + \omega_2 \right] = (1 + \vartheta)\, \omega_1\, \alpha_0 + \omega_1\, \vartheta + \omega_2.$$

By (7.3.3), we have that $\omega_1\,(\alpha_0 + \alpha_0\, \vartheta + \vartheta) + \omega_2 < 1$. Then, there exists δ_0 > 0 such that
$$(1 + \vartheta)\, \omega_1\, \beta\, \frac{t\, f'(t) - f(t) + \sqrt{2}\, c\, \beta\,(f_0'(t) + 1)}{t\,(1 - \beta\,(f_0'(t) + 1))} + \omega_1\, \vartheta + \omega_2 < 1 \quad \text{for each } t \in (0, \delta_0).$$
The definition of ρ_0 gives that δ_0 ≤ ρ_0. The proof of Lemma 7.3.3 is complete. □

Lemma 7.3.4. Suppose that (H_0)–(H_3) hold. Then, for each x ∈ D such that $x \in U(x^\star, \min\{\nu_0, \kappa\})$, $F'(x)^\star F'(x)$ is invertible and the following estimates hold:
$$\| F'(x)^+ \| \le \frac{\beta}{1 - \beta\,(f_0'(\| x - x^\star \|) + 1)}$$
and
$$\| F'(x)^+ - F'(x^\star)^+ \| \le \frac{\sqrt{2}\, \beta^2\,(f_0'(\| x - x^\star \|) + 1)}{1 - \beta\,(f_0'(\| x - x^\star \|) + 1)}.$$
In particular, $F'(x)^\star F'(x)$ is invertible in $U(x^\star, r_0)$.

Proof. Since x ∈ D is such that $x \in U(x^\star, \min\{\nu_0, \kappa\})$, we have $\| x - x^\star \| \le \nu_0$. By Lemma 7.3.1, (7.3.1) and the definition of β, we have that
$$\| F'(x^\star)^+ \|\, \| F'(x) - F'(x^\star) \| \le \beta\,(f_0'(\| x - x^\star \|) - f_0'(0)) < 1.$$
Consider the operators $A = F'(x^\star)$, $B = F'(x)$ and E = B − A. Hence, we have that $\| E\, A^+ \| \le \| E \|\, \| A^+ \| < 1$. Then, we deduce the desired result by Lemmas 7.2.2 and 7.2.3. That completes the proof of Lemma 7.3.4. □

Newton's iteration at a point is a zero of the linearization of F at that point. Hence, we shall study the linearization error at a point in D:
$$E_F(x, y) := F(y) - \big( F(x) + F'(x)\,(y - x) \big) \quad \text{for each } x, y \in D. \tag{7.3.7}$$
We shall bound this error by the error in the linearization of the majorant function f:
$$e_f(t, u) := f(u) - \big( f(t) + f'(t)\,(u - t) \big) \quad \text{for each } t, u \in [0, R). \tag{7.3.8}$$
Define also the Gauss-Newton step for the operator F by
$$S_F(x) = -F'(x)^+ F(x) \quad \text{for each } x \in D. \tag{7.3.9}$$

Lemma 7.3.5. Suppose that (H_0)–(H_3) hold. If $\| x^\star - x \| < \kappa$, then the following assertion holds:
$$\| E_F(x, x^\star) \| \le e_f(\| x - x^\star \|, 0).$$
Lemma 7.3.6. Suppose that (H_0)–(H_3) hold. Then, for each x ∈ D such that $x \in U(x^\star, \min\{\nu_0, \kappa\})$, the following estimate holds:
$$\| S_F(x) \| \le \frac{\beta\, e_f(\| x - x^\star \|, 0) + \sqrt{2}\, c\, \beta^2\,(f_0'(\| x - x^\star \|) + 1)}{1 - \beta\,(f_0'(\| x - x^\star \|) + 1)} + \| x - x^\star \|.$$

Proof. Let x ∈ D be such that $x \in U(x^\star, \min\{\nu_0, \kappa\})$. Using (7.3.7) and (7.3.9), we have that
$$\| S_F(x) \| = \left\| F'(x)^+ \big( F(x^\star) - (F(x) + F'(x)\,(x^\star - x)) \big) - \big( F'(x)^+ - F'(x^\star)^+ \big) F(x^\star) + (x^\star - x) \right\| \le \| F'(x)^+ \|\, \| E_F(x, x^\star) \| + \| F'(x)^+ - F'(x^\star)^+ \|\, \| F(x^\star) \| + \| x^\star - x \|.$$
Then, we deduce the desired result by Lemmas 7.3.4 and 7.3.5. That completes the proof of Lemma 7.3.6. □

Lemma 7.3.7. Let the parameters ϑ, ω_1 and ω_2 be defined by (7.3.3). Let ν_0, ρ_0 and r_0 be as defined in (7.3.4), (7.3.5) and (7.3.6), respectively. Suppose that (H_0)–(H_3) hold. For each $x \in U(x^\star, r_0) \setminus \{x^\star\}$, define
$$x_+ = x + s, \quad B(x)\, s = -F'(x)^\star F(x) + r, \tag{7.3.10}$$
where B(x) is an invertible approximation of $F'(x)^\star F'(x)$ satisfying
$$\| B(x)^{-1} F'(x)^\star F'(x) \| \le \omega_1, \quad \| B(x)^{-1} F'(x)^\star F'(x) - I \| \le \omega_2. \tag{7.3.11}$$
Suppose also that the forcing term θ and the residual r (as defined in (7.1.3)) satisfy
$$\| P\, r \| \le \theta\, \| P\, F'(x)^\star F(x) \| \quad \text{and} \quad \theta\, \mathrm{cond}(P\, F'(x)^\star F'(x)) \le \vartheta. \tag{7.3.12}$$
Then, $x_+$ is well defined and the following estimate holds:
$$\| x_+ - x^\star \| \le (1 + \vartheta)\, \omega_1\, \beta\, \frac{f'(\| x^\star - x \|)\, \| x^\star - x \| - f(\| x^\star - x \|)}{\| x^\star - x \|^2\,(1 - \beta\,(f_0'(\| x^\star - x \|) + 1))}\, \| x^\star - x \|^2 + \left( \frac{(1 + \vartheta)\, \omega_1\, \sqrt{2}\, c\, \beta^2\,(f_0'(\| x - x^\star \|) + 1)}{\| x^\star - x \|\,(1 - \beta\,(f_0'(\| x - x^\star \|) + 1))} + \omega_1\, \vartheta + \omega_2 \right) \| x^\star - x \|. \tag{7.3.13}$$
In particular, $\| x_+ - x^\star \| < \| x^\star - x \|$.

Proof. By Lemma 7.3.4 and since $x \in U(x^\star, r_0)$, we have that $F'(x)^\star F'(x)$ is invertible. In view of (7.1.2) and (7.3.10), we obtain the identity
$$\begin{aligned} x_+ - x^\star &= x - x^\star - B(x)^{-1} F'(x)^\star (F(x) - F(x^\star)) + B(x)^{-1} r + B(x)^{-1} F'(x)^\star F'(x)\,\big( F'(x^\star)^+ F(x^\star) - F'(x)^+ F(x^\star) \big) \\ &= B(x)^{-1} F'(x)^\star F'(x)\, F'(x)^+ \big( F(x^\star) - (F(x) + F'(x)\,(x^\star - x)) \big) + B(x)^{-1} r + B(x)^{-1} \big( F'(x)^\star F'(x) - B(x) \big)(x - x^\star) \\ &\quad + B(x)^{-1} F'(x)^\star F'(x)\,\big( F'(x^\star)^+ F(x^\star) - F'(x)^+ F(x^\star) \big). \end{aligned} \tag{7.3.14}$$
Using (7.3.7), (7.3.9), (7.3.11), (7.3.12) and (7.3.14), we get that
$$\begin{aligned} \| x_+ - x^\star \| &\le \omega_1\, \| F'(x)^+ \|\, \| E_F(x, x^\star) \| + \| B(x)^{-1} r \| + \omega_2\, \| x^\star - x \| + \omega_1\, \| F'(x)^+ - F'(x^\star)^+ \|\, \| F(x^\star) \| \\ &\le \omega_1\, \| F'(x)^+ \|\, \| E_F(x, x^\star) \| + \omega_1\, \vartheta\, \| S_F(x) \| + \omega_2\, \| x^\star - x \| + \omega_1\, \| F'(x)^+ - F'(x^\star)^+ \|\, \| F(x^\star) \|. \end{aligned} \tag{7.3.15}$$

Using (7.3.8), (7.3.15) and Lemmas 7.3.4–7.3.6, we deduce that
$$\begin{aligned} \| x_+ - x^\star \| &\le (1 + \vartheta)\, \beta\, \omega_1\, \frac{e_f(\| x - x^\star \|, 0) + \sqrt{2}\, c\, \beta\,(f_0'(\| x - x^\star \|) + 1)}{1 - \beta\,(f_0'(\| x - x^\star \|) + 1)} + \omega_1\, \vartheta\, \| x - x^\star \| + \omega_2\, \| x^\star - x \| \\ &\le (1 + \vartheta)\, \beta\, \omega_1\, \frac{f'(\| x - x^\star \|)\, \| x - x^\star \| - f(\| x - x^\star \|) + \sqrt{2}\, c\, \beta\,(f_0'(\| x - x^\star \|) + 1)}{1 - \beta\,(f_0'(\| x - x^\star \|) + 1)} + \omega_1\, \vartheta\, \| x - x^\star \| + \omega_2\, \| x^\star - x \|. \end{aligned} \tag{7.3.16}$$
Hence, (7.3.13) holds. Note that if we factor out $\| x^\star - x \|$ in the right-hand side of (7.3.13), we deduce that $\| x_+ - x^\star \| < \| x^\star - x \|$. The proof of Lemma 7.3.7 is complete. □

Next, we provide the main local convergence result for the inexact Gauss-Newton-like method.
Theorem 7.3.8. Let F : D ⊆ X → Y be a continuously differentiable operator. Let the parameters ϑ, ω_1 and ω_2 be defined by (7.3.3). Let ν_0, ρ_0 and r_0 be as defined in (7.3.4), (7.3.5) and (7.3.6), respectively. Suppose that (H_0)–(H_3) hold. Then, the sequence {x_n} generated by (7.1.2), starting at $x_0 \in U(x^\star, r_0) \setminus \{x^\star\}$, with the forcing term θ_n, the residual r_n and the invertible preconditioning matrix P_n satisfying the following estimates for each n = 0, 1, ⋯:
$$\| P_n\, r_n \| \le \theta_n\, \| P_n\, F'(x_n)^\star F(x_n) \|, \quad 0 \le \theta_n\, \mathrm{cond}(P_n\, F'(x_n)^\star F'(x_n)) \le \vartheta,$$
$$\| B(x_n)^{-1} F'(x_n)^\star F'(x_n) \| \le \omega_1 \quad \text{and} \quad \| B(x_n)^{-1} F'(x_n)^\star F'(x_n) - I \| \le \omega_2,$$
is well defined, remains in $U(x^\star, r_0)$ for all n ≥ 0 and converges to $x^\star$. Moreover, the following estimate holds for each n = 0, 1, ⋯:
$$\| x_{n+1} - x^\star \| \le \Xi_n\, \| x_n - x^\star \|, \tag{7.3.17}$$
where
$$\Xi_n = (1 + \vartheta)\, \omega_1\, \beta\, \frac{f'(\| x^\star - x_0 \|)\, \| x^\star - x_0 \| - f(\| x^\star - x_0 \|)}{\| x^\star - x_0 \|^2\,(1 - \beta\,(f_0'(\| x^\star - x_0 \|) + 1))}\, \| x^\star - x_n \| + \frac{(1 + \vartheta)\, \omega_1\, \sqrt{2}\, c\, \beta^2\,(f_0'(\| x_0 - x^\star \|) + 1)}{\| x^\star - x_0 \|\,(1 - \beta\,(f_0'(\| x_0 - x^\star \|) + 1))} + \omega_1\, \vartheta + \omega_2.$$
Proof. By an induction argument and Lemmas 7.3.4 and 7.3.7, the sequence {x_n} starting at $x_0 \in U(x^\star, r_0) \setminus \{x^\star\}$ is well defined in $U(x^\star, r_0)$. By letting $x_+ = x_{n+1}$, x = x_n, r = r_n, θ = θ_n and P = P_n in (7.3.10)–(7.3.12), we get that
$$\| x_{n+1} - x^\star \| \le (1 + \vartheta)\, \omega_1\, \beta\, \frac{f'(\| x^\star - x_n \|)\, \| x^\star - x_n \| - f(\| x^\star - x_n \|)}{\| x^\star - x_n \|^2\,(1 - \beta\,(f_0'(\| x^\star - x_n \|) + 1))}\, \| x^\star - x_n \|^2 + \left( \frac{(1 + \vartheta)\, \omega_1\, \sqrt{2}\, c\, \beta^2\,(f_0'(\| x_n - x^\star \|) + 1)}{\| x^\star - x_n \|\,(1 - \beta\,(f_0'(\| x_n - x^\star \|) + 1))} + \omega_1\, \vartheta + \omega_2 \right) \| x^\star - x_n \|.$$
We also have, by Lemma 7.3.7, that $\| x_n - x^\star \| \le \| x_0 - x^\star \|$ for each n = 1, 2, ⋯. Hence, (7.3.17) holds. Lemma 7.3.3 implies that $x_{n+1} \in U(x^\star, r_0)$ and $\lim_{n\to\infty} x_n = x^\star$. The proof of Theorem 7.3.8 is complete. □

Remark 7.3.9. If f(t) = f_0(t) for each t ∈ [0, R), then Theorem 7.3.8 reduces to [28, Theorem 7]. In particular, we have in this case that ν = ν_0, ρ = ρ_0, δ = δ_0, α = α_0, r = r_0 and $D^+ f(0) = D^+ f_0(0)$, where ν, ρ, δ, α, r and $D^+ f(0)$ are defined, respectively, as ν_0, ρ_0, δ_0, α_0, r_0 and $D^+ f_0(0)$ by setting f_0(t) = f(t). Otherwise, i.e., if
$$f_0(t) < f(t) \quad \text{and} \quad f_0'(t) < f'(t) \quad \text{for each } t \in [0, R), \tag{7.3.18}$$
then we have that
$$\nu \ge \nu_0, \quad \rho \le \rho_0, \quad \delta \le \delta_0, \quad \alpha \ge \alpha_0, \quad r \le r_0 \quad \text{and} \quad D^+ f(0) \ge D^+ f_0(0). \tag{7.3.19}$$
Note that these advantages are obtained under the same computational cost, since in practice the computation of the function f requires that of f_0. Note also that the local results in [18], [19], [24]–[27] are also extended, since these are special cases of Theorem 7.3.8. In particular, if ϑ = 0 (i.e., if θ_n = r_n = 0 for each n = 0, 1, ⋯) in Theorem 7.3.8, we improve the convergence of the Gauss-Newton like method under the majorant condition, which for ω_1 = 1 and ω_2 = 0 has been obtained in [26, Theorem 7]. These results extend the ones obtained by Chen and Li in [18], [19], given only for the case c = 0. Moreover, if c = 0 and $F'(x^\star)$ is invertible, we extend the convergence of inexact Newton-like methods under the majorant condition, which was obtained in [24, Theorem 4]. Furthermore, if c = ϑ = ω_2 = 0, ω_1 = 1 and $F'(x^\star)$ is invertible in Theorem 7.3.8, we extend the convergence of Newton's method under the majorant condition obtained in [24, Theorem 2.1].
In the next section, we shall show how to choose the functions f_0 and f so that (7.3.18) is satisfied.

7.4. Special Case and Numerical Examples


We present two special cases of Theorem 7.3.8. The first one is based on the center-Lipschitz and Lipschitz conditions [8], [11]. The second one is based on Wang's condition [47], which generalized Smale's alpha theory for analytic functions [44].
Remark 7.4.1. Let us define the functions f, f_0 : [0, κ] → ℝ by
$$f_0(t) = \frac{L_0\, t^2}{2} - t \quad \text{and} \quad f(t) = \frac{L\, t^2}{2} - t,$$
where L_0 and L are the center-Lipschitz and Lipschitz constants, respectively. We have that f_0(0) = f(0) = 0 and $f_0'(0) = f'(0) = -1$. Set also R = 1/L. Then, it can easily be seen that Theorem 7.3.8 specializes to the following proposition.

Proposition 7.4.2. Let F : D ⊂ X → Y be a continuously differentiable operator. Let $x^\star \in D$ be such that $F'(x^\star)^\star F(x^\star) = 0$ and $F'(x^\star)$ is injective. Let $c = \| F(x^\star) \|$, $\beta = \| F'(x^\star)^+ \|$ and $\kappa = \sup\{ t \ge 0 : U(x^\star, t) \subseteq D \}$. Suppose that there exist L_0 > 0, L > 0 such that
$$\| F'(x) - F'(x^\star) \| \le L_0\, \| x - x^\star \| \quad \text{and} \quad \| F'(x) - F'(y) \| \le L\, \| x - y \|$$
for each $x, y \in U(x^\star, \kappa)$. Suppose that $\alpha_0 = \sqrt{2}\, c\, \beta^2\, L < 1$. Let the parameters ϑ, ω_1 and ω_2 be defined by (7.3.3). Let
$$r_0 = \min\Big\{ \kappa,\ \frac{2\,(1 - \omega_1\, \vartheta - \omega_2) - 2\sqrt{2}\, c\, L_0\, \beta^2\, \omega_1\,(1 + \vartheta)}{\beta\,\big(L\,(1 + \vartheta)\, \omega_1 + 2\, L_0\,(1 - \omega_1\, \vartheta - \omega_2)\big)} \Big\}.$$
Then, the sequence {x_n} generated by (7.1.2), with r_n = 0 and $B(x_n) = F'(x_n)^\star F'(x_n)$, starting at $x_0 \in U(x^\star, r_0) \setminus \{x^\star\}$, with the forcing term θ_n and the invertible preconditioning matrix P_n satisfying the following estimates for each n = 0, 1, ⋯:
$$\| P_n\, r_n \| \le \theta_n\, \| P_n\, F'(x_n)^\star F(x_n) \|, \quad 0 \le \theta_n\, \mathrm{cond}(P_n\, F'(x_n)^\star F'(x_n)) \le \vartheta,$$
$$\| B(x_n)^{-1} F'(x_n)^\star F'(x_n) \| \le \omega_1 \quad \text{and} \quad \| B(x_n)^{-1} F'(x_n)^\star F'(x_n) - I \| \le \omega_2,$$
is well defined, remains in $U(x^\star, r_0)$ for all n ≥ 0 and converges to $x^\star$. Moreover, the following estimate holds for each n = 0, 1, ⋯:
$$\| x_{n+1} - x^\star \| \le \Delta_n\, \| x_n - x^\star \|, \tag{7.4.1}$$
where
$$\Delta_n = \frac{(1 + \vartheta)\, \omega_1\, \beta\, L}{2\,(1 - \beta\, L_0\, \| x_0 - x^\star \|)}\, \| x^\star - x_n \| + \frac{(1 + \vartheta)\, \omega_1\, \sqrt{2}\, c\, \beta^2\, L_0}{1 - \beta\, L_0\, \| x_0 - x^\star \|} + \omega_1\, \vartheta + \omega_2.$$
Remark 7.4.3. (a) If L_0 = L, Proposition 7.4.2 reduces to [28, Theorem 16]. Moreover, if ϑ = 0, Proposition 7.4.2 reduces to [28, Corollary 17]. Furthermore, if c = 0, then Proposition 7.4.2 reduces to [19, Corollary 6.1].
(b) If $F(x^\star) = 0$, $F'(x^\star)^+ = F'(x^\star)^{-1}$ and L_0 < L, then Theorem 7.3.8 improves the corresponding results on inexact Newton-like methods [30], [33], [37], [38], in particular for Newton's method. Set c = ϑ = ω_2 = 0 and ω_1 = 1. Then, we get that
$$r_0 = \min\Big\{ \kappa,\ \frac{2}{\beta\,(2\, L_0 + L)} \Big\}.$$
This radius is at least as large as the one provided by Traub [46], which is given by
$$\bar r_0 = \min\Big\{ \kappa,\ \frac{2}{3\, \beta\, L} \Big\}.$$
Let us provide a numerical example for this case.

Example 7.4.4. Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1], equipped with the max norm, and let D = U(0, 1). Define the function F on D by
$$F(h)(x) = h(x) - 5 \int_0^1 x\, \theta\, h(\theta)^3\, d\theta. \tag{7.4.2}$$
Then, we have that
$$F'(h)[u](x) = u(x) - 15 \int_0^1 x\, \theta\, h(\theta)^2\, u(\theta)\, d\theta \quad \text{for each } u \in D.$$
Using Proposition 7.4.2, we see that the hypotheses hold for $x^\star(x) = 0$, where x ∈ [0, 1], β = 1, L = 15 and L_0 = 7.5. Then, we get that
$$\bar r_0 = \min\Big\{ \kappa, \frac{2}{45} \Big\} \le \min\Big\{ \kappa, \frac{1}{15} \Big\} = r_0.$$
Clearly, if $\min\{\kappa, \frac{2}{45}\} = \frac{2}{45}$, then we deduce in particular that $\bar r_0 < r_0$.
Remark 7.4.5. Let γ ≥ γ_0 > 0. Let us define the functions f, f_0 : [0, κ] → ℝ by
$$f_0(t) = \frac{t}{1 - \gamma_0\, t} - 2\, t \quad \text{and} \quad f(t) = \frac{t}{1 - \gamma\, t} - 2\, t.$$
Then, we have that f_0(0) = f(0) = 0 and $f_0'(0) = f'(0) = -1$. Set also R = 1/γ. Note also that
$$f_0'(t) = \frac{1}{(1 - \gamma_0\, t)^2} - 2, \quad f'(t) = \frac{1}{(1 - \gamma\, t)^2} - 2,$$
$$f_0''(t) = \frac{2\, \gamma_0}{(1 - \gamma_0\, t)^3} \quad \text{and} \quad f''(t) = \frac{2\, \gamma}{(1 - \gamma\, t)^3}.$$
We introduce the definition of the center γ_0-condition.
Definition 7.4.6. Let γ_0 > 0 and let 0 < µ ≤ 1/γ_0 be such that $U(x^\star, \mu) \subseteq D$. The operator F is said to satisfy the center γ_0-condition at $x^\star$ on $U(x^\star, \mu)$ if
$$\| F'(x) - F'(x^\star) \| \le \frac{1}{(1 - \gamma_0\, \| x - x^\star \|)^2} - 1 \quad \text{for each } x \in U(x^\star, \mu).$$
We also need the definition of the γ-condition due to Wang [47].
Definition 7.4.7. Let γ > 0 and let 0 < µ ≤ 1/γ be such that $U(x^\star, \mu) \subseteq D$. The operator F is said to satisfy the γ-condition at $x^\star$ on $U(x^\star, \mu)$ if
$$\| F''(x) \| \le \frac{2\, \gamma}{(1 - \gamma\, \| x - x^\star \|)^3} \quad \text{for each } x \in U(x^\star, \mu).$$
Remark 7.4.8. (a) Note that γ_0 ≤ γ holds in general and γ/γ_0 can be arbitrarily large [7]–[16].
(b) If F is an analytic function, Smale [44] used the following choice:
$$\gamma = \sup_{n \ge 2} \left\| \frac{F^{(n)}(x^\star)}{n!} \right\|^{\frac{1}{n-1}} < +\infty.$$
Using the above definitions and choices of functions (see Remark 7.4.5, Definitions 7.4.6 and 7.4.7), the corresponding specialization of Theorem 7.3.8 along the lines of Proposition 7.4.2 can be obtained. However, we leave this part to the interested reader. Note that, clearly, if γ_0 = γ, this result reduces to [28, Theorem 18], which in turn reduces to [19, Example 1] if c = 0. Otherwise (i.e., if γ_0 < γ), our result is an improvement.
Next, we provide an example where γ_0 < γ in the case when $F(x^\star) = 0$, $F'(x^\star)^+ = F'(x^\star)^{-1}$ and c = ϑ = ω_2 = 0, ω_1 = 1.
References

[1] Amat, S., Bermúdez, C., Busquier, S., Legaz, M.J., Plaza, S., On a family of high-
order iterative methods under Kantorovich conditions and some applications, Abstr.
Appl. Anal. 2012, Art. ID 782170, 14 pp.

[2] Amat, S., Bermúdez, C., Busquier, S., Plaza, S., On a third-order Newton-type method
free of bilinear operators, Numer. Linear Algebra Appl., 17 (2010), 639–653.

[3] Amat, S., Busquier, S., Gutiérrez, J.M., Geometric constructions of iterative functions
to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197–205.

[4] Amat, S., Busquier, S., Gutiérrez, J.M., Third-order iterative methods with applica-
tions to Hammerstein equations: a unified approach, J. Comput. Appl. Math., 235
(2011), 2936–2943.

[5] Argyros, I.K., Forcing sequences and inexact Newton iterates in Banach space, Appl.
Math. Lett., 13 (2000), 69–75.

[6] Argyros, I.K., A unifying local–semilocal convergence analysis and applications for
two–point Newton–like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374–397.

[7] Argyros, I.K., On the semilocal convergence of the Gauss–Newton method, Adv. Non-
linear Var. Inequal., 8 (2005), 93–99.

[8] Argyros, I.K., Convergence and applications of Newton–type iterations, Springer–


Verlag Publ., New York, 2008.

[9] Argyros, I.K., On the semilocal convergence of inexact Newton methods in Banach
spaces, J. Comput. Appl. Math., 228 (2009), 434–443.

[10] Argyros, I.K., Cho, Y.J., Hilout, S., On the local convergence analysis of inexact
Gauss–Newton–like methods, Panamer. Math. J., 21 (2011), 11–18.

[11] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, Science Publishers, New Hampshire, USA, 2012.

[12] Argyros, I.K., Hilout, S., On the local convergence of the Gauss-Newton method,
Punjab Univ. J. Math., 41 (2009), 23–33.


[13] Argyros, I.K., Hilout, S., Improved generalized differentiability conditions for
Newton–like methods, J. Complexity, 26 (2010), 316–333.

[14] Argyros, I.K., Hilout, S., Extending the applicability of the Gauss-Newton method
under average Lipschitz-type conditions, Numer. Algorithms, 58 (2011), 23–52.

[15] Argyros, I.K., Hilout, S., Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236 (2012), 1892–1902.

[16] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364–387.

[17] Chen, J., The convergence analysis of inexact Gauss–Newton methods for nonlinear
problems, Comput. Optim. Appl., 40 (2008), 97–118.

[18] Chen, J., Li, W., Convergence of Gauss–Newton’s method and uniqueness of the so-
lution, App. Math. Comp., 170 (2005), 686–705.

[19] Chen, J., Li, W., Local convergence results of Gauss-Newton’s like method in weak
conditions, J. Math. Anal. Appl., 324 (2006), 1381–1394.

[20] Chen, J., Li, W., Convergence behaviour of inexact Newton methods under weak Lip-
schitz condition, J. Comput. Appl. Math., 191 (2006), 143–164.

[21] Dedieu, J.P., Shub, M., Newton’s method for overdetermined systems of equations,
Math. Comp., 69 (2000), 1099–1115.

[22] Dembo, R.S., Eisenstat, S.C., Steihaug, T., Inexact Newton methods, SIAM J. Numer.
Anal., 19 (1982), 400–408.

[23] Dennis, J.E., Schnabel, R.B., Numerical methods for unconstrained optimization and
nonlinear equations (Corrected reprint of the 1983 original), Classics in Appl. Math.,
16, SIAM, Philadelphia, PA, 1996.

[24] Ferreira, O.P., Local convergence of Newton’s method in Banach space from the view-
point of the majorant principle, IMA J. Numer. Anal., 29 (2009), 746–759.

[25] Ferreira, O.P., Local convergence of Newton’s method under majorant condition, J.
Comput. Appl. Math., 235 (2011), 1515–1522.

[26] Ferreira, O.P., Gonçalves, M.L.N, Local convergence analysis of inexact inexact-
Newton like methods under majorant condition, Comput. Optim. Appl., 48 (2011),
1–21.

[27] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of
Gauss–Newton like methods under majorant condition, J. Complexity, 27 (2011), 111–
125.

[28] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of inex-
act Gauss–Newton like methods under majorant condition, J. Comput. Appl. Math.,
236 (2012), 2487–2498.

[29] Ferreira, O.P., Svaiter, B.F., Kantorovich’s majorants principle for Newton’s method,
Comput. Optim. Appl., 42 (2009), 213–229.

[30] Guo, X., On semilocal convergence of inexact Newton methods, J. Comput. Math., 25
(2007), 231–242.

[31] Gutiérrez, J.M., Hernández, M.A., Newton’s method under weak Kantorovich condi-
tions, IMA J. Numer. Anal., 20 (2000), 521–532.

[32] Häußler, W.M., A Kantorovich-type convergence analysis for the Gauss-Newton-method, Numer. Math., 48 (1986), 119–125.

[33] Huang, Z.A., Convergence of inexact Newton method, J. Zhejiang Univ. Sci. Ed., 30
(2003), 393–396.

[34] Hiriart-Urruty, J.B, Lemaréchal, C., Convex analysis and minimization algorithms
(two volumes). I. Fundamentals. II. Advanced theory and bundle methods, 305 and
306, Springer–Verlag, Berlin, 1993.

[35] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[36] Li, C., Hu, N., Wang, J., Convergence behavior of Gauss–Newton's method and extensions to the Smale point estimate theory, J. Complexity, 26 (2010), 268–295.

[37] Li, C., Shen, W.P., Local convergence of inexact methods under the Hölder condition,
J. Comput. Appl. Math., 222 (2008), 544–560.

[38] Li, C., Zhang, W–H., Jin, X–Q., Convergence and uniqueness properties of Gauss-
Newton’s method, Comput. Math. Appl., 47 (2004), 1057–1067.

[39] Martínez, J.M., Qi, L.Q., Inexact Newton methods for solving nonsmooth equations. Linear/nonlinear iterative methods and verification of solution (Matsuyama, 1993), J. Comput. Appl. Math., 60 (1995), 127–145.

[40] Morini, B., Convergence behaviour of inexact Newton methods, Math. Comp., 68
(1999), 1605–1613.

[41] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.

[42] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.

[43] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton–Kantorovich type theorems, J. Complexity, 26 (2010), 3–42.

[44] Smale, S., Newton’s method estimates from data at one point. The merging of dis-
ciplines: new directions in pure, applied, and computational mathematics (Laramie,
Wyo., 1985), 185-196, Springer, New York, 1986.

[45] Stewart, G.W., On the continuity of the generalized inverse, SIAM J. Appl. Math., 17
(1969), 33–45.

[46] Traub, J.F., Iterative Methods for the Solution of Equations, Englewood Cliffs, New
Jersey: Prentice Hall, 1964.

[47] Wang, X.H., Convergence of Newton’s method and uniqueness of the solution of equa-
tions in Banach spaces, IMA J. Numer. Anal., 20 (2000), 123–134.
Chapter 8

Expanding the Applicability of Lavrentiev Regularization Methods for Ill-Posed Problems

8.1. Introduction
In this chapter, we are interested in obtaining a stable approximate solution for a nonlinear ill-posed operator equation of the form
$$F(x) = y, \tag{8.1.1}$$
where F : D(F) ⊂ X → X is a monotone operator and X is a Hilbert space. We denote the inner product and the corresponding norm on a Hilbert space by ⟨·, ·⟩ and ‖·‖, respectively. Let U(x, r) stand for the open ball in X with center x ∈ X and radius r > 0. Note that F is a monotone operator if it satisfies the relation
$$\langle F(x_1) - F(x_2), x_1 - x_2 \rangle \ge 0 \tag{8.1.2}$$
for all x_1, x_2 ∈ D(F).


We assume, throughout this chapter, that yδ ∈ Y is the available noisy data with

ky − yδ k ≤ δ (8.1.3)

and (8.1.1) has a solution x̂. Since (8.1.1) is ill-posed, its solution need not depend con-
tinuously on the data, i.e., small perturbation in the data can cause large deviations in the
solution. So the regularization methods are used ([9, 10, 11, 13, 14, 17, 19, 20]). Since F is
monotone, the Lavrentiev regularization is used to obtain a stable approximate solution of
(8.1.1). In the Lavrentiev regularization, the approximate solution is obtained as a solution
of the equation
F(x) + α(x − x0 ) = yδ , (8.1.4)
where α > 0 is the regularization parameter and x0 is an initial guess for the solution x̂.
In [8], Bakushinskii and Seminova proposed an iterative method

xδk+1 = xδk − (αk I + Ak,δ )−1 [(F(xδk ) − yδ ) + αk (xδk − x0 )], xδ0 = x0 , (8.1.5)

where $A_{k,\delta} := F'(x_k^\delta)$ and (α_k) is a sequence of positive real numbers satisfying α_k → 0 as k → ∞. It is important to stop the iteration at an appropriate step, say k = k_δ, and to show that $x_k^\delta$ is well defined for 0 < k ≤ k_δ and that $x_{k_\delta}^\delta \to \hat{x}$ as δ → 0 (see [15]).
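For intuition, here is a minimal Python sketch of one run of (8.1.5) in the finite-dimensional case; the monotone test function, noise level and parameter sequence are illustrative choices, not part of the original text.

```python
import numpy as np

def lavrentiev_iteration(F, dF, x0, y_delta, alphas):
    """Minimal sketch of (8.1.5) for a monotone F : R^n -> R^n; alphas is a
    decreasing sequence of positive regularization parameters."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for alpha in alphas:
        A = dF(x)                            # A_{k,delta} = F'(x_k^delta)
        rhs = (F(x) - y_delta) + alpha * (x - x0)
        x = x - np.linalg.solve(alpha * np.eye(n) + A, rhs)
    return x

# Illustrative monotone example: F(x) = x + x^3, noisy data for x-hat = 0.5.
F, dF = lambda x: x + x**3, lambda x: np.diag(1 + 3 * x**2)
delta = 1e-3
y_delta = F(np.array([0.5])) + delta
alphas = [0.5 * 0.8**k for k in range(40)]   # satisfies 1 <= a_k/a_{k+1} <= mu
print(lavrentiev_iteration(F, dF, np.array([0.0]), y_delta, alphas))
```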
In [6]–[8], Bakushinskii and Smirnova chose the stopping index k_δ by requiring it to satisfy
$$\| F(x_{k_\delta}^\delta) - y^\delta \|^2 \le \tau\, \delta < \| F(x_k^\delta) - y^\delta \|^2 \quad \text{for } k = 0, 1, \cdots, k_\delta - 1, \ \tau > 1.$$
In fact, they showed that $x_{k_\delta}^\delta \to \hat{x}$ as δ → 0 under the following assumptions:
(1) there exists L > 0 such that ‖F′(x) − F′(y)‖ ≤ L‖x − y‖ for all x, y ∈ D(F);
(2) there exists p > 0 such that
$$\frac{\alpha_k - \alpha_{k+1}}{\alpha_k\, \alpha_{k+1}} \le p \tag{8.1.6}$$
for all k ∈ ℕ;
(3) $(2 + L\sigma)\, \| x_0 - \hat{x} \|\, t\, d \le \sigma - 2\, \| x_0 - \hat{x} \|\, t \le d\, \alpha_0$, where
$$\sigma := (\sqrt{\tau} - 1)^2, \quad t := p\, \alpha_0 + 1, \quad d = 2\,(t\, \| x_0 - \hat{x} \| + p\, \sigma).$$


In [15], Mahale and Nair, motivated by the work of Qi-Nian Jin [12] for an iteratively
regularized Gauss-Newton method, considered an alternate stopping criterion which not
only ensures the convergence, but also derives an order optimal error estimate under a gen-
eral source condition on x̂ − x0 . Moreover, the condition that they imposed on {αk } is
weaker than (8.1.6).
In the present chapter, we are motivated by [15]. In particular, we expand the applica-
bility of the method (8.1.5) by weakening one of the major hypotheses in [15] (see below
Assumption 8.2.1 (ii) in the next section).
In Section 8.2, we consider some basic assumptions required throughout the chapter.
Section 8.3 deals with the stopping rule and a result that establishes the existence of the
stopping index. In Section 8.4, we prove results for the iterations based on the exact data
and, in Section 8.5, the error analysis for the noisy data case is proved. The main order
optimal result using the a posteriori stopping rule is provided in Section 8.6.

8.2. Basic Assumptions and Some Preliminary Results


We use the following assumptions to prove the results in this chapter.
Assumption 8.2.1. (1) There exists r > 0 such that U(x̂, r) ⊆ D(F) and F : U(x̂, r) → X is Fréchet differentiable.
(2) There exists K_0 > 0 such that, for all $u_\theta = u + \theta(\hat{x} - u) \in U(\hat{x}, r)$, θ ∈ [0, 1] and v ∈ X, there exists an element, say $\varphi(\hat{x}, u_\theta, v) \in X$, satisfying
$$[F'(\hat{x}) - F'(u_\theta)]\, v = F'(u_\theta)\, \varphi(\hat{x}, u_\theta, v), \quad \| \varphi(\hat{x}, u_\theta, v) \| \le K_0\, \| v \|\, \| \hat{x} - u_\theta \|$$
for all $u_\theta \in U(\hat{x}, r)$ and v ∈ X.
(3) $\| (F'(u) + \alpha I)^{-1} F'(u_\theta) \| \le 1$ for all $u_\theta \in U(\hat{x}, r)$ and α > 0.
(4) $\| (F'(u) + \alpha I)^{-1} \| \le \frac{1}{\alpha}$ for all u ∈ U(x̂, r) and α > 0.

The condition (2) in Assumption 8.2.1 weakens the popular hypothesis given in [15], [16] and [18].
Assumption 8.2.2. There exists a constant K > 0 such that, for all x, u ∈ U(x̂, r) and v ∈ X, there exists an element denoted by P(x, u, v) ∈ X satisfying
$$[F'(x) - F'(u)]\, v = F'(u)\, P(x, u, v), \quad \| P(x, u, v) \| \le K\, \| v \|\, \| x - u \|.$$
Clearly, Assumption 8.2.2 implies Assumption 8.2.1 (2) with K_0 = K, but not necessarily vice versa. Note that K_0 ≤ K holds in general and K/K_0 can be arbitrarily large [1]–[5]. Indeed, there are many classes of operators satisfying Assumption 8.2.1 (2) but not Assumption 8.2.2 (see the numerical examples at the end of this chapter). Moreover, if K_0 is sufficiently smaller than K, which can happen since K/K_0 can be arbitrarily large, then the results obtained in this chapter provide a tighter error analysis than the one in [15].
Finally, note that the computation of the constant K is more expensive than the computation of K_0.
We need the following auxiliary results based on Assumption 8.2.1.
Proposition 8.2.3. For any u ∈ U(x̂, r) and α > 0,
$$\left\| (F'(u) + \alpha I)^{-1}\big[ F(\hat{x}) - F(u) - F'(u)(\hat{x} - u) \big] \right\| \le \frac{3\, K_0}{2}\, \| \hat{x} - u \|^2.$$

Proof. Using the fundamental theorem of integration, for any u ∈ U(x̂, r) we get
$$F(\hat{x}) - F(u) = \int_0^1 F'(u + t(\hat{x} - u))\,(\hat{x} - u)\, dt.$$
Hence, by Assumption 8.2.1 (2),
$$F(\hat{x}) - F(u) - F'(u)(\hat{x} - u) = \int_0^1 \big[ F'(u + t(\hat{x} - u)) - F'(\hat{x}) + F'(\hat{x}) - F'(u) \big](\hat{x} - u)\, dt = \int_0^1 F'(\hat{x})\big[ \varphi(u + t(\hat{x} - u), \hat{x}, \hat{x} - u) - \varphi(u, \hat{x}, \hat{x} - u) \big]\, dt.$$
Then, by (2), (3) in Assumption 8.2.1 and the inequality $\| (F'(u) + \alpha I)^{-1} F'(u_\theta) \| \le 1$, we obtain in turn
$$\begin{aligned} \left\| (F'(u) + \alpha I)^{-1}\big[ F(\hat{x}) - F(u) - F'(u)(\hat{x} - u) \big] \right\| &\le \int_0^1 \big( \| \varphi(u + t(\hat{x} - u), \hat{x}, \hat{x} - u) \| + \| \varphi(u, \hat{x}, \hat{x} - u) \| \big)\, dt \\ &\le \int_0^1 K_0\, \| \hat{x} - u \|^2\, t\, dt + K_0\, \| \hat{x} - u \|^2 \\ &\le \frac{3\, K_0}{2}\, \| \hat{x} - u \|^2. \end{aligned}$$
This completes the proof.

Proposition 8.2.4. For any u ∈ U(x̂, r) and α > 0,
$$\alpha\, \left\| (F'(\hat{x}) + \alpha I)^{-1} - (F'(u) + \alpha I)^{-1} \right\| \le K_0\, \| \hat{x} - u \|. \tag{8.2.1}$$
Proof. Let $T_{\hat{x},u} = \alpha\big( (F'(\hat{x}) + \alpha I)^{-1} - (F'(u) + \alpha I)^{-1} \big)$. Then we have, by Assumption 8.2.1 (2),
$$\| T_{\hat{x},u}\, v \| = \| \alpha\,(F'(\hat{x}) + \alpha I)^{-1}\,(F'(u) - F'(\hat{x}))\,(F'(u) + \alpha I)^{-1} v \| = \| (F'(\hat{x}) + \alpha I)^{-1} F'(\hat{x})\, \varphi\big(u, \hat{x}, \alpha\,(F'(u) + \alpha I)^{-1} v\big) \| \le K_0\, \| \hat{x} - u \|\, \| v \|$$
for all v ∈ X. This completes the proof.


Assumption 8.2.5. There exists a continuous and strictly monotonically increasing func-
tion ϕ : (0, a] → (0, ∞) with a ≥ kF 0 (x̂)k satisfying
(1) limλ→0 ϕ(λ) = 0;
αϕ(λ)
(2) supλ≥0 λ+α ≤ ϕ(α) for all α ∈ (0, a];
(3) there exists v ∈ X with kvk ≤ 1 such that

x̂ − x0 = ϕ(F 0 (x̂))v. (8.2.2)

Next, we assume a condition on the sequence {α_k} considered in (8.1.5).
Assumption 8.2.6. ([15], Assumption 2.6) The sequence {α_k} of positive real numbers is such that
$$1 \le \frac{\alpha_k}{\alpha_{k+1}} \le \mu, \quad \lim_{k\to\infty} \alpha_k = 0 \tag{8.2.3}$$
for some constant µ > 1.
Note that the condition (8.2.3) on {α_k} is weaker than (8.1.6), considered by Bakushinskii and Smirnova [8] (see [15]). In fact, if (8.1.6) is satisfied, then (8.2.3) is also satisfied with µ = pα_0 + 1, but the converse need not be true (see [15]). Further, note that, for these choices of {α_k}, α_k/α_{k+1} is bounded, whereas (α_k − α_{k+1})/(α_k α_{k+1}) → ∞ as k → ∞. Condition (2) in Assumption 8.2.1 is used in the literature for the regularization of many nonlinear ill-posed problems (see [12], [13], [19]–[21]).

8.3. Stopping Rule


Let c0 > 4 and choose k_δ to be the first non-negative integer such that x_k^δ in (8.1.5) is
defined for each k ∈ {0, 1, 2, ···, k_δ} and

‖α_{k_δ}(A_{k_δ}^δ + α_{k_δ}I)⁻¹(F(x_{k_δ}^δ) − y^δ)‖ ≤ c0δ.   (8.3.1)
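
To make the rule concrete, the following minimal Python sketch runs the Newton-type Lavrentiev iteration (8.1.5), whose form can be read off from the proof of Theorem 8.3.2 below, on a scalar monotone problem and stops at the first index satisfying (8.3.1). All problem data (the operator F, the noise level, the geometric choice α_k = α0 qᵏ, which satisfies (8.2.3) with µ = 1/q) are illustrative assumptions, not prescriptions from the text.

    import numpy as np

    F = lambda x: np.exp(x) - 1.0       # monotone operator with solution x_hat = 0
    dF = lambda x: np.exp(x)            # A_k^delta = F'(x_k^delta)

    x_hat, x0, delta = 0.0, 0.4, 1e-4
    y_delta = F(x_hat) + delta          # noisy data, |y - y_delta| <= delta
    c0 = 5.0                            # the rule requires c0 > 4
    alpha0, q = 1.0, 0.5                # alpha_k = alpha0 * q**k, mu = 1/q in (8.2.3)

    x = x0
    for k in range(200):
        alpha = alpha0 * q ** k
        A = dF(x)
        # residual of the stopping rule (8.3.1), scalar version
        if abs(alpha / (A + alpha) * (F(x) - y_delta)) <= c0 * delta:
            break
        x -= (F(x) - y_delta + alpha * (x - x0)) / (A + alpha)

    print(k, x, abs(x - x_hat))         # k_delta, final iterate, error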

In the following, we establish the existence of such a k_δ. First, we consider the positive
integer N ∈ ℕ satisfying

α_N ≤ (c − 1)δ/‖x0 − x̂‖ < α_k   (8.3.2)

for all k ∈ {0, 1, · · · , N − 1}, where c > 1 and α0 > (c − 1)δ/kx0 − x̂k.
The following technical lemma from [15] is used to prove some of the results of this
chapter.

Lemma 8.3.1. ([15], Lemma 3.1) Let a > 0 and b ≥ 0 be such that 4ab ≤ 1 and θ :=
(1 − √(1 − 4ab))/(2a). Let θ1, ···, θn be non-negative real numbers such that θ_{k+1} ≤ aθ_k² + b
and θ0 ≤ θ. Then θ_k ≤ θ for all k = 1, 2, ···, n.

The rest of the results in this chapter can be proved along the same lines as the proofs in
[15]. In order to make the chapter as self-contained as possible, we present the proof of one
of them; for the proofs of the rest we refer the reader to [15].
Theorem 8.3.2. ([15], Theorem 3.2) Let (8.1.2), (8.1.3), (8.2.3) and Assumption 8.2.1 be
satisfied. Let N be as in (8.3.2) for some c > 1 and 6cK0‖x0 − x̂‖/(c − 1) ≤ 1. Then x_k^δ is
defined iteratively for each k ∈ {0, 1, ···, N} and

‖x_k^δ − x̂‖ ≤ 2c‖x0 − x̂‖/(c − 1)   (8.3.3)

for all k ∈ {0, 1, ···, N}. In particular, if r > 2c‖x0 − x̂‖/(c − 1), then x_k^δ ∈ B_r(x̂) for k ∈
{0, 1, ···, N}. Moreover,

‖α_N(A_N^δ + α_N I)⁻¹(F(x_N^δ) − y^δ)‖ ≤ c0δ   (8.3.4)

for c0 := (7/3)c + 1.

Proof. We show (8.3.3) by induction. It is obvious that (8.3.3) holds for k = 0. Now, assume
that (8.3.3) holds for some k ∈ {0, 1, ···, N}. Then it follows from (8.1.5) that

x_{k+1}^δ − x̂
   = x_k^δ − x̂ − (A_k^δ + α_k I)⁻¹[F(x_k^δ) − y^δ + α_k(x_k^δ − x0)]
   = (A_k^δ + α_k I)⁻¹((A_k^δ + α_k I)(x_k^δ − x̂) − [F(x_k^δ) − y^δ + α_k(x_k^δ − x0)])
   = (A_k^δ + α_k I)⁻¹[A_k^δ(x_k^δ − x̂) + y^δ − F(x_k^δ) + α_k(x0 − x̂)]
   = α_k(A_k^δ + α_k I)⁻¹(x0 − x̂) + (A_k^δ + α_k I)⁻¹(y^δ − y)   (8.3.5)
     + (A_k^δ + α_k I)⁻¹[F(x̂) − F(x_k^δ) + A_k^δ(x_k^δ − x̂)].

Using (8.1.3), the estimates ‖(A_k^δ + α_k I)⁻¹‖ ≤ 1/α_k, ‖(A_k^δ + α_k I)⁻¹A_k^δ‖ ≤ 1 and Proposition
8.2.3, we have

‖α_k(A_k^δ + α_k I)⁻¹(x0 − x̂) + (A_k^δ + α_k I)⁻¹(y^δ − y)‖ ≤ ‖x0 − x̂‖ + δ/α_k

and

‖(A_k^δ + α_k I)⁻¹[F(x̂) − F(x_k^δ) + A_k^δ(x_k^δ − x̂)]‖ ≤ (3K0/2)‖x_k^δ − x̂‖².

Thus we have

‖x_{k+1}^δ − x̂‖ ≤ ‖x0 − x̂‖ + δ/α_k + (3K0/2)‖x_k^δ − x̂‖².
But, by (8.3.2), δ/α_k ≤ ‖x0 − x̂‖/(c − 1) and so

‖x_{k+1}^δ − x̂‖ ≤ c‖x0 − x̂‖/(c − 1) + (3K0/2)‖x_k^δ − x̂‖²,

which leads to the recurrence relation

θ_{k+1} ≤ aθ_k² + b,

where

θ_k = ‖x_k^δ − x̂‖,   a = 3K0/2,   b = c‖x0 − x̂‖/(c − 1).

From the hypothesis of the theorem, we have 4ab = 6cK0‖x0 − x̂‖/(c − 1) ≤ 1. It is obvious that

θ0 ≤ ‖x0 − x̂‖ ≤ θ := (1 − √(1 − 4ab))/(2a) = 2b/(1 + √(1 − 4ab)) ≤ 2b = 2c‖x0 − x̂‖/(c − 1).

Hence, by Lemma 8.3.1, we get

‖x_k^δ − x̂‖ ≤ θ ≤ 2c‖x0 − x̂‖/(c − 1)   (8.3.6)

for all k ∈ {0, 1, ···, N}. In particular, if r > 2c‖x0 − x̂‖/(c − 1), then we have x_k^δ ∈ B_r(x̂)
for all k ∈ {0, 1, ···, N}.
Next, let γ = ‖α_N(A_N^δ + α_N I)⁻¹(F(x_N^δ) − y^δ)‖. Then, using the estimates

‖α_N(A_N^δ + α_N I)⁻¹‖ ≤ 1,   ‖α_N(A_N^δ + α_N I)⁻¹A_N^δ‖ ≤ α_N

and Proposition 8.2.3, we have

γ ≤ δ + ‖α_N(A_N^δ + α_N I)⁻¹(F(x_N^δ) − y + A_N^δ(x_N^δ − x̂) − A_N^δ(x_N^δ − x̂))‖
  = δ + ‖α_N(A_N^δ + α_N I)⁻¹[F(x_N^δ) − F(x̂) − A_N^δ(x_N^δ − x̂) + A_N^δ(x_N^δ − x̂)]‖
  ≤ δ + α_N[(3K0/2)‖x_N^δ − x̂‖² + ‖x_N^δ − x̂‖]
  ≤ δ + α_N‖x_N^δ − x̂‖[1 + (3K0/2)‖x_N^δ − x̂‖]   (8.3.7)
  ≤ δ + (2α_N c‖x0 − x̂‖/(c − 1))[1 + 3K0c‖x0 − x̂‖/(c − 1)] ≤ δ + 2cδ[1 + 1/6] ≤ ((7/3)c + 1)δ.

Therefore, we have ‖α_N(A_N^δ + α_N I)⁻¹(F(x_N^δ) − y^δ)‖ ≤ c0δ, where c0 := (7/3)c + 1. This
completes the proof.

8.4. Error Bound for the Case of Noise-Free Data

Let

x_{k+1} = x_k − (A_k + α_k I)⁻¹[F(x_k) − y + α_k(x_k − x0)]   (8.4.1)

for all k ≥ 0.
We show that each x_k is well defined and belongs to U(x̂, r) for r > 2‖x0 − x̂‖. For this,
we make use of the following lemma.

Lemma 8.4.1. ([15], Lemma 4.1) Let Assumption 8.2.1 hold. Suppose that, for all k ∈
{0, 1, ···, n}, x_k in (8.4.1) is well defined and ρ_k := ‖α_k(A_k + α_k I)⁻¹(x0 − x̂)‖ for some
n ∈ ℕ. Then we have

ρ_k − (3K0/2)‖x_k − x̂‖² ≤ ‖x_{k+1} − x̂‖ ≤ ρ_k + (3K0/2)‖x_k − x̂‖²   (8.4.2)

for all k ∈ {0, 1, ···, n}.

Theorem 8.4.2. ([15], Theorem 4.2) Let Assumption 8.2.1 hold. If 6K0‖x0 − x̂‖ ≤ 1 and
r > 2‖x0 − x̂‖, then, for all k ∈ ℕ, the iterates x_k in (8.4.1) are well defined and

‖x_k − x̂‖ ≤ 2‖x0 − x̂‖/(1 + √(1 − 6K0‖x0 − x̂‖)) ≤ 2‖x0 − x̂‖   (8.4.3)

for all k ∈ ℕ.

Lemma 8.4.3. ([15], Lemma 4.3) Let Assumptions 8.2.1 and 8.2.6 hold and let r > 2‖x0 − x̂‖.
Assume that ‖A‖ ≤ ηα0 and 4µ(1 + η⁻¹)K0‖x0 − x̂‖ ≤ 1 for some η with 0 < η < 1.
Then, for all k ∈ ℕ, we have

(1/((1 + η)µ))‖x_k − x̂‖ ≤ ‖α_k(A_k + α_k I)⁻¹(x0 − x̂)‖ ≤ (1/(1 − η))‖x_k − x̂‖   (8.4.4)

and

((1 − η)/((1 + η)µ))‖x_k − x̂‖ ≤ ‖x_{k+1} − x̂‖ ≤ (1/(1 − η) + η/((1 + η)µ))‖x_k − x̂‖.   (8.4.5)

The following corollary follows from Lemma 8.4.3 by taking η = 1/3. We show that
this particular case of Lemma 8.4.3 is better suited for our later results.
Corollary 8.4.4. ([15], Corollary 4.4) Let Assumptions 8.2.1 and 8.2.6 hold and let r >
2‖x0 − x̂‖. Assume that ‖A‖ ≤ α0/3 and 16µK0‖x0 − x̂‖ ≤ 1. Then, for all k ∈ ℕ, we have

(3/(4µ))‖x_k − x̂‖ ≤ ‖α_k(A + α_k I)⁻¹(x0 − x̂)‖ ≤ (3/2)‖x_k − x̂‖   (8.4.6)

and

(1/(2µ))‖x_k − x̂‖ ≤ ‖x_{k+1} − x̂‖ ≤ 2‖x_k − x̂‖.


Theorem 8.4.5. ([15], Theorem 4.5) Let the assumptions of Lemma 8.4.3 hold. If x0 is
chosen such that x0 − x̂ ∈ N(F′(x̂))^⊥, then lim_{k→∞} x_k = x̂.

Lemma 8.4.6. ([15], Lemma 4.6) Let the assumptions of Lemma 8.4.3 hold for η satisfying

(1 − √(1 − η/((1 + η)µ)))[1 + (2µ − 1)η + 2µ] + 2η < 4/3.   (8.4.7)

Then, for all k, l ∈ ℕ ∪ {0} with k ≥ l, we have

‖x_l − x̂‖ ≤ c_η[‖x_k − x̂‖ + ‖α_l(A + α_l I)⁻¹(F(x_l) − y)‖/α_k],

where

c_η := (1 − b_η)⁻¹ max{µ, 1 + (3ε + 1)η/(4(1 − η))},

b_η := (3ε + 1)η/(1 − η) + 3εa/4,   ε := (1 − √(1 − a))/a,   a := η/((1 + η)µ).

Remark 8.4.7. ([15], Remark 4.7) It can be seen that (8.4.7) is satisfied if η ≤ 1/3 + 1/24.

Now, if we take η = 1/3, that is, µK0‖x0 − x̂‖ ≤ 1/16, in Lemma 8.4.6, then it takes the
following form.

Lemma 8.4.8. ([15], Lemma 4.8) Let the assumptions of Lemma 8.4.3 hold with η = 1/3.
Then, for all k ≥ l ≥ 0, we have

‖x_l − x̂‖ ≤ c_{1/3}[‖x_k − x̂‖ + ‖α_l(A + α_l I)⁻¹(F(x_l) − y)‖/α_k],

where

c_{1/3} = [1 − (8µ + (8µ + 1)3ε)/(16µ)]⁻¹ max{µ, 1 + (3ε + 1)/8},

ε := √(4µ)/(√(4µ) + √(4µ − 1)).

8.5. Error Analysis with Noisy Data


The first result in this section gives an error estimate for ‖x_k^δ − x_k‖ under Assumption
8.2.5, where k = 0, 1, 2, ···, N.

Lemma 8.5.1. ([15], Lemma 5.1) Let Assumption 8.2.1 hold and let K0‖x0 − x̂‖ ≤ 1/m,
where m > (7 + √73)/2, and let N be the integer satisfying (8.3.2) with

c > (m² − 4m − 6)/(m² − 7m − 6).

Then, for all k ∈ {0, 1, ···, N}, we have

‖x_k^δ − x_k‖ ≤ δ/((1 − κ)α_k),   (8.5.1)

where

κ := (1/m)(4 + 3c/(c − 1) + 6/m).

If we take m = 8 in Lemma 8.5.1, then we get the following corollary as a particular
case. We make use of it in the error analysis that follows.

Corollary 8.5.2. ([15], Corollary 5.2) Let Assumption 8.2.1 hold and let 16K0‖x0 − x̂‖ ≤ 1.
Let N be the integer defined by (8.3.2) with c > 13. Then, for all k ∈ {0, 1, ···, N}, we have

‖x_k^δ − x_k‖ ≤ δ/((1 − κ)α_k),

where

κ := (31c − 19)/(32(c − 1)).

Lemma 8.5.3. ([15], Lemma 5.3) Let the assumptions of Lemma 8.5.1 hold. Then we have

‖α_{k_δ}(A + α_{k_δ}I)⁻¹(F(x_{k_δ}) − y)‖ ≤ c1δ.

Moreover, if k_δ > 0, then, for all 0 ≤ k < k_δ, we have

‖α_k(A + α_k I)⁻¹(F(x_k) − y)‖ ≥ c2δ,

where

c1 = (1 + 2cK0‖x0 − x̂‖/(c − 1))(c0 + (2 − κ)/(1 − κ) + 3K0µ‖x0 − x̂‖/(2(1 − κ)²(c − 1))),

c2 = (c0 − (2 − κ)/(1 − κ) − 3K0‖x0 − x̂‖/(2(1 − κ)²(c − 1)))/(1 + 2cK0‖x0 − x̂‖/(c − 1))

with c0 = (7/3)c + 1 and κ as in Lemma 8.5.1.

Theorem 8.5.4. ([15], Theorem 5.4) Let Assumptions 8.2.1 and 8.2.6 hold. If 16K0µ‖x0 −
x̂‖ ≤ 1 and the integer k_δ is chosen according to the stopping rule (8.3.1) with c0 > 94/3, then
we have

‖x_{k_δ}^δ − x̂‖ ≤ ξ inf{‖x_k − x̂‖ + δ/α_k : k ≥ 0},   (8.5.2)

where ξ = max{2µρ, c_{1/3}(c1 + 1)/(1 − κ), c}, ρ := 1 + µ(1 + 3K0‖x0 − x̂‖)/(c2(1 − κ)), with c_{1/3} and κ as in Lemma 8.4.8
and Corollary 8.5.2, respectively, and c1, c2 as in Lemma 8.5.3.

8.6. Order Optimal Result with an a Posteriori Stopping Rule

In this section, we show the convergence x_{k_δ}^δ → x̂ as δ → 0 and also give an optimal error
estimate for ‖x_{k_δ}^δ − x̂‖.

Theorem 8.6.1. ([15], Theorem 6.1) Let the assumptions of Theorem 8.5.4 hold and let k_δ
be the integer chosen by (8.3.1). If x0 is chosen such that x0 − x̂ ∈ N(F′(x̂))^⊥, then we have
lim_{δ→0} x_{k_δ}^δ = x̂. Moreover, if Assumption 8.2.5 is satisfied, then we have

‖x_{k_δ}^δ − x̂‖ ≤ ξ0µψ⁻¹(δ),

where ξ0 := 8µξ/3 with ξ as in Theorem 8.5.4 and ψ : (0, ϕ(a)] → (0, aϕ(a)] is defined as
ψ(λ) := λϕ⁻¹(λ), λ ∈ (0, ϕ(a)].
Proof. From (8.4.6) and (8.5.2), we get

‖x_{k_δ}^δ − x̂‖ ≤ ξ″ inf{‖α_k(A + α_k I)⁻¹(x0 − x̂)‖ + δ/α_k : k = 0, 1, ···},   (8.6.1)

where ξ″ = (4µ/3) max{2µ(1 + µ(1 + 3K0‖x0 − x̂‖)/(c2(1 − κ))), c_{1/3}(c1 + 1)/(1 − κ), c}. Now, we choose an integer m_δ such
that m_δ = max{k : α_k ≥ √δ}. Then, we have

‖x_{k_δ}^δ − x̂‖ ≤ ξ″(‖α_{m_δ}(A + α_{m_δ}I)⁻¹(x0 − x̂)‖ + δ/α_{m_δ}).   (8.6.2)

Note that δ/α_{m_δ} ≤ √δ, so δ/α_{m_δ} → 0 as δ → 0. Therefore, by (8.6.2), to show that x_{k_δ}^δ → x̂ as
δ → 0, it is enough to prove that ‖α_{m_δ}(A + α_{m_δ}I)⁻¹(x0 − x̂)‖ → 0 as δ → 0. Observe that,
for w ∈ R(F′(x̂)), i.e., w = F′(x̂)u for some u ∈ D(F), we have ‖α_{m_δ}(A + α_{m_δ}I)⁻¹w‖ ≤
α_{m_δ}‖u‖ → 0 as δ → 0. Now, since R(F′(x̂)) is a dense subset of N(F′(x̂))^⊥, it follows that
‖α_{m_δ}(A + α_{m_δ}I)⁻¹(x0 − x̂)‖ → 0 as δ → 0. Using Assumption 8.2.5, we get that

‖α_k(A + α_k I)⁻¹(x0 − x̂)‖ ≤ ϕ(α_k).   (8.6.3)

So by (8.6.2) and (8.6.3) we obtain that

‖x_{k_δ}^δ − x̂‖ ≤ ξ″ inf{ϕ(α_k) + δ/α_k : k = 0, 1, ···}.   (8.6.4)

Choose k̂_δ such that

ϕ(α_{k̂_δ})α_{k̂_δ} ≤ δ < ϕ(α_k)α_k for k = 0, 1, ···, k̂_δ − 1.   (8.6.5)

This also implies that

ψ(ϕ(α_{k̂_δ})) ≤ δ < ψ(ϕ(α_k)) for k = 0, 1, ···, k̂_δ − 1.   (8.6.6)

From (8.6.4), ‖x_{k_δ}^δ − x̂‖ ≤ ξ″{ϕ(α_{k̂_δ}) + δ/α_{k̂_δ}}. Now, using (8.6.5) and (8.6.6), we get
‖x_{k_δ}^δ − x̂‖ ≤ 2ξ″δ/α_{k̂_δ} ≤ 2ξ″µδ/α_{k̂_δ−1} ≤ 2ξ″µψ⁻¹(δ). This completes the proof.

8.7. Numerical Examples

We provide two numerical examples, where K0 < K.

Example 8.7.1. Let X = ℝ, D(F) = U(0, 1), x̂ = 0 and define a function F on D(F) by

F(x) = eˣ − 1.   (8.7.1)

Then, using (8.7.1) and Assumptions 8.2.1 (2) and 8.2.2, we get

K0 = e − 1 < K = e.
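
The constant K0 of Assumption 8.2.1 (2) can be checked numerically here. For F(x) = eˣ − 1 and x̂ = 0 one has F′(x̂) − F′(u) = F′(u)(e^{−u} − 1), so φ(x̂, u, v) = (e^{−u} − 1)v and the best constant is sup_{0<|u|≤1} |e^{−u} − 1|/|u|. A short Python sketch (the sampling grid is an arbitrary choice):

    import numpy as np

    u = np.linspace(-1.0, 1.0, 200001)
    u = u[u != 0.0]
    ratios = np.abs(np.exp(-u) - 1.0) / np.abs(u)   # |phi| / (|v| |x_hat - u|) with v = 1
    print(ratios.max(), np.e - 1.0)                 # both approximately 1.71828

The maximum is attained at u = −1, confirming K0 = e − 1.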

Example 8.7.2. Let X = C([0, 1]) (the space of continuous functions defined on [0, 1],
equipped with the max norm) and D(F) = U(0, 1). Define an operator F on D(F) by

F(h)(x) = h(x) − 5∫₀¹ xθh(θ)³ dθ.   (8.7.2)

Then the Fréchet derivative is given by

F′(h)[u](x) = u(x) − 15∫₀¹ xθh(θ)²u(θ) dθ   (8.7.3)

for all u ∈ D(F). Using (8.7.2), (8.7.3) and Assumptions 8.2.1 (2), 8.2.2 for x̂ = 0, we get

K0 = 7.5 < K = 15.
Next, we provide an example where K/K0 can be arbitrarily large.

Example 8.7.3. Let X = D(F) = ℝ, x̂ = 0 and define a function F on D(F) by

F(x) = d0x − d1 sin 1 + d1 sin e^{d2x},   (8.7.4)

where d0, d1 and d2 are given parameters. Note that F(x̂) = F(0) = 0. Then it can easily
be seen that, for d2 sufficiently large and d1 sufficiently small, K/K0 can be arbitrarily large.

We now present two examples where Assumption 8.2.2 is not satisfied, but Assumption
8.2.1 (2) is satisfied.

Example 8.7.4. Let X = D(F) = ℝ, x̂ = 1 and define a function F on D(F) by

F(x) = x^{1+1/i}/(1 + 1/i) + c1x − c1 − i/(i + 1),   (8.7.5)

where c1 is a real parameter and i > 2 is an integer. Then F′(x) = x^{1/i} + c1 is not Lipschitz
on D(F). Hence Assumption 8.2.2 is not satisfied. However, the central Lipschitz condition in
Assumption 8.2.1 (2) holds for K0 = 1. We also have that F(x̂) = 0. Indeed, we have

‖F′(x) − F′(x̂)‖ = |x^{1/i} − x̂^{1/i}| = |x − x̂|/(x̂^{(i−1)/i} + ··· + x^{(i−1)/i})

and so

‖F′(x) − F′(x̂)‖ ≤ K0|x − x̂|.

Example 8.7.5. We consider the integral equation

u(s) = f(s) + λ∫_a^b G(s, t)u(t)^{1+1/n} dt   (8.7.6)

for all n ∈ ℕ, where f is a given continuous function satisfying f(s) > 0 for all s ∈ [a, b], λ
is a real number and the kernel G is continuous and positive on [a, b] × [a, b].

For example, when G(s, t) is the Green kernel, the corresponding integral equation is
equivalent to the boundary value problem

u″ = λu^{1+1/n},   u(a) = f(a), u(b) = f(b).

Problems of this type have been considered in [1]-[5]. Equations of the form
(8.7.6) generalize equations of the form

u(s) = ∫_a^b G(s, t)u(t)ⁿ dt,   (8.7.7)

which were studied in [1]-[5]. Instead of (8.7.6), we can try to solve the equation F(u) = 0,
where

F : Ω ⊆ C[a, b] → C[a, b],   Ω = {u ∈ C[a, b] : u(s) ≥ 0, s ∈ [a, b]}

and

F(u)(s) = u(s) − f(s) − λ∫_a^b G(s, t)u(t)^{1+1/n} dt.
The norm we consider is the max-norm. The derivative F′ is given by

F′(u)v(s) = v(s) − λ(1 + 1/n)∫_a^b G(s, t)u(t)^{1/n}v(t) dt

for all v ∈ Ω. First of all, we notice that F′ does not satisfy a Lipschitz-type condition
on Ω. Let us consider, for instance, [a, b] = [0, 1], G(s, t) = 1 and y(t) = 0. Then we have
F′(y)v(s) = v(s) and

‖F′(x) − F′(y)‖ = |λ|(1 + 1/n)∫_a^b x(t)^{1/n} dt.

If F′ were Lipschitz, then we would have

‖F′(x) − F′(y)‖ ≤ L1‖x − y‖

or, equivalently, the inequality

∫₀¹ x(t)^{1/n} dt ≤ L2 max_{s∈[0,1]} x(s)   (8.7.8)

would hold for all x ∈ Ω and for a constant L2. But this is not true. Consider, for example,
the functions

x_j(t) = t/j

for all j ≥ 1 and t ∈ [0, 1]. If these are substituted into (8.7.8), then we have

1/(j^{1/n}(1 + 1/n)) ≤ L2/j ⟺ j^{1−1/n} ≤ L2(1 + 1/n)

for all j ≥ 1. This inequality is not true when j → ∞. Therefore, Assumption 8.2.2 is not
satisfied in this case. However, Assumption 8.2.1 (2) holds. To show this, suppose that
x̂(t) = f(t) and γ = min_{s∈[a,b]} f(s). Then, for all v ∈ Ω, we have

‖[F′(x) − F′(x̂)]v‖ = |λ|(1 + 1/n) max_{s∈[a,b]} |∫_a^b G(s, t)(x(t)^{1/n} − f(t)^{1/n})v(t) dt|
                   ≤ |λ|(1 + 1/n) max_{s∈[a,b]} ∫_a^b G_n(s, t) dt ‖v‖,

where G_n(s, t) = G(s, t)|x(t) − f(t)|/(x(t)^{(n−1)/n} + x(t)^{(n−2)/n} f(t)^{1/n} + ··· + f(t)^{(n−1)/n}). Hence it follows that

‖[F′(x) − F′(x̂)]v‖ ≤ (|λ|(1 + 1/n)/γ^{(n−1)/n}) max_{s∈[a,b]} ∫_a^b G(s, t) dt ‖x − x̂‖‖v‖ ≤ K0‖x − x̂‖‖v‖,

where K0 = (|λ|(1 + 1/n)/γ^{(n−1)/n})N and N = max_{s∈[a,b]} ∫_a^b G(s, t) dt. Then Assumption 8.2.1 (2) holds
for sufficiently small λ.

In the next remarks, we compare our results with the corresponding ones in [15].

Remark 8.7.6. Note that the results in [15] were shown using Assumption 8.2.2, whereas
we used the weaker Assumption 8.2.1 (2) in this chapter. Next, our result, Proposition 8.2.3,
was shown with 3K0 replacing K. Therefore, if 3K0 < K (see Example 8.7.3), then our result
is tighter. Proposition 8.2.4 was shown with K0 replacing K. Hence, if K0 < K, then our result
is tighter. Theorem 8.3.2 was shown with 6K0 replacing 2K. Hence, if 3K0 < K, our result
is tighter. Similar observations, favorable to us, can be made for Lemma 8.4.1, Theorem 8.4.2
and the rest of the results in [15].

Remark 8.7.7. The results obtained here can also be realized for operators F satisfying
an autonomous differential equation of the form

F′(x) = P(F(x)),

where P : X → X is a known continuous operator. Since F′(x̂) = P(F(x̂)) = P(0), we
can compute K0 in Assumption 8.2.1 (2) without actually knowing x̂. Returning to
Example 8.7.1, we see that we can set P(x) = x + 1, since F′(x) = eˣ = (eˣ − 1) + 1 = F(x) + 1.
References

[1] Argyros, I.K., Convergence and Application of Newton-type Iterations, Springer,


2008.

[2] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point, Rev. Anal. Numer. Theor. Approx.
36 (2007), 123–138.

[3] Argyros, I.K., A semilocal convergence for directional Newton methods, Math. Com-
put. (AMS) 80 (2011), 327–343.

[4] Argyros, I.K., and Hilout, S., Weaker conditions for the convergence of Newton’s
method, J. Complexity 28 (2012), 364–387.

[5] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press, Taylor and Francis, New York, 2012.

[6] Bakushinskii, A., Smirnova, A., On application of generalized discrepancy principle
to iterative methods for nonlinear ill-posed problems, Numer. Funct. Anal. Optim.
26(2005), 35–48.

[7] Bakushinskii, A., Smirnova, A., A posteriori stopping rule for regularized fixed point
iterations, Nonlinear Anal. 64(2006), 1255–1261.

[8] Bakushinskii, A., Smirnova, A., Iterative regularization and generalized discrepancy
principle for monotone operator equations, Numer. Funct. Anal. Optim. 28(2007), 13–
25.

[9] Binder, A., Engl, H.W., Groetsch, C.W., Neubauer, A., Scherzer, O., Weakly closed
nonlinear operators and parameter identification in parabolic equations by Tikhonov
regularization, Appl. Anal. 55(1994), 215–235.

[10] Engl, H.W., Hanke, M., Neubauer, A., Regularization of Inverse Problems, Dordrecht,
Kluwer, 1993.

[11] Engl, H.W., Kunisch, K., Neubauer, A., Convergence rates for Tikhonov regulariza-
tion of nonlinear ill-posed problems, Inverse Problems 5(1989), 523–540.

[12] Jin, Q., On the iteratively regularized Gauss-Newton method for solving nonlinear
ill-posed problems, Math. Comp. 69(2000), 1603–1623.

[13] Jin, Q., Hou, Z.Y., On the choice of the regularization parameter for ordinary and
iterated Tikhonov regularization of nonlinear ill-posed problems, Inverse Problems
13(1997), 815–827.

[14] Jin, Q., Hou, Z.Y., On an a posteriori parameter choice strategy for Tikhonov regular-
ization of nonlinear ill-posed problems, Numer. Math. 83(1990), 139–159.

[15] Mahale, P., Nair, M.T., Iterated Lavrentiev regularization for nonlinear ill-posed prob-
lems, ANZIAM J. 51(2009), 191–217.

[16] Mahale, P., Nair, M.T., General source conditions for nonlinear ill-posed problems,
Numer. Funct. Anal. Optim. 28(2007), 111–126.

[17] Scherzer, O., Engl, H.W., Kunisch, K., Optimal a posteriori parameter choice for
Tikhonov regularization for solving nonlinear ill-posed problems, SIAM J. Numer.
Anal. 30(1993), 1796–1838.

[18] Semenova, E.V., Lavrentiev regularization and balancing principle for solving ill-
posed problems with monotone operators, Comput. Methods Appl. Math. 4(2010),
444–454.

[19] Tautenhahn, U., Lavrentiev regularization of nonlinear ill-posed problems, Vietnam J.


Math. 32(2004), 29–41.

[20] Tautenhahn, U., On the method of Lavrentiev regularization for nonlinear ill-posed
problems, Inverse Problems 18(2002), 191–207.

[21] Tautenhahn, U., Jin, Q., Tikhonov regularization and a posteriori rule for solving non-
linear ill-posed problems, Inverse Problems 19(2003), 1–21.
Chapter 9

A Semilocal Convergence for a


Uniparametric Family of Efficient
Secant-Like Methods

9.1. Introduction
Let U(x, r) and U̅(x, r) stand, respectively, for the open and closed balls in X with center
x ∈ X and radius r > 0. Denote by L(X, Y) the space of bounded linear operators from X
into Y.
In this chapter we are concerned with the problem of approximating a locally unique
solution x∗ of nonlinear equation
F(x) = 0, (9.1.1)
where F is a Fréchet-differentiable operator defined on a non-empty convex subset D of a
Banach space X with values in a Banach space Y.
Many problems from computational sciences, physics and other disciplines can be brought
into the form of equation (9.1.1) using mathematical modelling [5, 6, 8, 9, 12, 22, 25]. The
solution of these equations can rarely be found in closed form. That is why the solution
methods for these equations are iterative. In particular, the practice of numerical analysis
for finding such solutions is essentially connected to variants of Newton’s method [5, 6, 9,
12, 19, 21, 22, 24, 25]. The study about the convergence of iterative procedures is usually fo-
cussed on two types: semilocal and local convergence analysis. The semilocal convergence
is, based on the information around an initial point, to give criteria ensuring the convergence
of iterative procedure; while the local one is, based on the information around a solution,
to find estimates of the radii of convergence balls. There are many studies on the weakening
and/or extension of the hypotheses made on the underlying operators; see, for example,
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
and the references therein.
Ezquerro and Rubio used in [17] the uniparametric family of secant-like methods defined by

x_{−1}, x0 given in D,
y_n = µx_n + (1 − µ)x_{n−1},   µ ∈ [0, 1],   (9.1.2)
x_{n+1} = x_n − B_n⁻¹F(x_n),   B_n = [y_n, x_n; F], for each n = 0, 1, ...

and the method of recurrent relations to generate a sequence {xn } approximating x∗ . Here,
[z, w; F] for each z, w ∈ D is a divided difference of order one, which is a bounded linear
operator such that [4, 5, 7, 9, 19, 22, 25]

[z, w; F] : D → Y and [z, w; F](z − w) = F(z) − F(w). (9.1.3)

Secant-like method (9.1.2) can be considered as a combination of the secant and New-
ton's methods. Indeed, if µ = 0 we obtain the secant method, and if µ = 1 we get New-
ton's method provided that F′ is Fréchet-differentiable on D, since then x_n = y_n and
[y_n, x_n; F] = F′(x_n).

It was shown in [15, 16] that the R-order of convergence is at least (1 + √5)/2 for µ ∈ [0, 1),
the same as that of the secant method. Later, in [12], another uniparametric family of secant-
like methods, defined by

x_{−1}, x0 given in D,
y_n = λx_n + (1 − λ)x_{n−1},   λ ≥ 1,   (9.1.4)
x_{n+1} = x_n − A_n⁻¹F(x_n),   A_n = [y_n, x_{n−1}; F] for each n = 0, 1, ...,

was studied. It was shown that there exists λ0 ≥ 2 such that the R-order of convergence is at
least (1 + √5)/2 if λ ∈ [1, λ0] and λ ≠ 2, and that if λ = 2 the R-order of convergence is quadratic.
Note that if λ = 1 we obtain the secant method, whereas if λ = 2 we obtain the Kurchatov
method [9, 12, 19, 25].
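
Before proceeding to the analysis, a minimal Python sketch of the family (9.1.4) on the real line may help fix ideas; here the divided difference [z, w; F] = (F(z) − F(w))/(z − w) realizes (9.1.3) in ℝ, and the tolerance and iteration cap are arbitrary choices:

    def secant_like(F, x_prev, x0, lam=2.0, tol=1e-12, itmax=100):
        # lam = 1 gives the secant method, lam = 2 the Kurchatov method
        dd = lambda z, w: (F(z) - F(w)) / (z - w)   # [z, w; F] on the real line
        xm, x = x_prev, x0
        for _ in range(itmax):
            y = lam * x + (1.0 - lam) * xm          # y_n = lam*x_n + (1 - lam)*x_{n-1}
            A = dd(y, xm)                           # A_n = [y_n, x_{n-1}; F]
            xm, x = x, x - F(x) / A                 # x_{n+1} = x_n - A_n^{-1} F(x_n)
            if abs(x - xm) < tol:
                break
        return x

This sketch is reused for the numerical examples of Section 9.4.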
We present a semilocal convergence analysis for secant-like method (9.1.4) using our
idea of recurrent functions instead of recurrent relations, together with tighter majorizing
sequences. In this way our analysis provides the following advantages (A) over the work in [12]
under the same computational cost:

(A1 ) Weaker sufficient convergence conditions,

(A2 ) Tighter estimates on the distances kxn+1 − xn k and kxn − x∗ k for each n = 0, 1, . . .,

(A3 ) At least as precise information on the location of the solution and

(A4) The results are presented in affine invariant form, whereas the ones in [12] are given
in non-affine invariant form. The advantages of affine versus non-affine results
have been explained in [4, 5, 7, 9, 19, 22, 25].

Our hypotheses for the semilocal convergence of secant-like method (9.1.4) are:

(C1 ) There exists a divided difference of order one [z, w; F] ∈ L (X , Y ) satisfying (9.1.3),

(C2) There exist x0 ∈ D, η ≥ 0 such that A0⁻¹ ∈ L(Y, X) and ‖A0⁻¹F(x0)‖ ≤ η,

(C3 ) There exist x−1 , x0 ∈ D and c ≥ 0 such that

kx0 − x−1 k ≤ c,

(C4) There exists K > 0 such that

‖A0⁻¹([x, y; F] − [v, w; F])‖ ≤ K(‖x − v‖ + ‖y − w‖) for each x, y, v, w ∈ D.

We shall denote by (C) the conditions (C1)-(C4). In view of (C4) there exist H0, H1, H > 0
such that

(C5) ‖A0⁻¹([x1, x0; F] − A0)‖ ≤ H0(‖x1 − y0‖ + ‖x0 − x_{−1}‖),

(C6) ‖A0⁻¹(A1 − A0)‖ ≤ H1(‖y1 − y0‖ + ‖x0 − x_{−1}‖) and

(C7) ‖A0⁻¹([x, y; F] − A0)‖ ≤ H(‖x − y0‖ + ‖y − x_{−1}‖) for each x, y ∈ D.

Clearly

H0 ≤ H1 ≤ H ≤ K   (9.1.5)

hold in general and K/H, H/H1 can be arbitrarily large [5, 6, 9]. Note that (C5), (C6), (C7) are
not hypotheses additional to (C4). In practice the computation of K requires the computation
of H0, H1 and H. It also follows from (C4) that F is differentiable [5, 6, 19, 21].
The chapter is organized as follows. In Section 9.2 we show that, under the same hy-
potheses as in [12] and using recurrent relations, we obtain at least as precise information
on the location of the solution. Section 9.3 contains the semilocal convergence analysis us-
ing weaker hypotheses and recurrent functions. We also show the advantages (A). The
results are also extended to cover the case of equations with nondifferentiable operators.
Numerical examples are presented in the concluding Section 9.4.

9.2. Semilocal Convergence Using Recurrent Relations


As in [12] let us define sequences {a_n} and {b_n} for each n = 0, 1, ... by

a_{−1} = η/(c + η),   b_{−1} = Kc²/(c + η),

a_n = f(a_{n−1})g(a_{n−1})b_{n−1},   b_n = f(a_{n−1})²a_{n−1}b_{n−1}

and functions f, g on [0, 1) by

f(t) = 1/(1 − t) and g(t) = (2 − λ) + λf(t)t.
Next, we present the main result in this section in affine invariant form.

Theorem 9.2.1. Under the (C) hypotheses, suppose further that

U(x0, R) ⊆ D

and, for λ ∈ [1, λ0],

a_{−1} < (3 − √5)/2,   b_{−1} < a_{−1}(1 − a_{−1})²/(2(1 − a_{−1}) − λ(1 − 2a_{−1})),

where

R = ((1 − a0)/(1 − 2a0))λη

and

λ0 ∈ [2, 2c/(c − η)).

Then, the sequence {x_n} generated by secant-like method (9.1.4) is well defined, remains in
U(x0, R) for each n = 0, 1, 2, ... and converges to a solution x* ∈ U(x0, R) of equation
F(x) = 0. Moreover, the following estimates hold

‖x_{n+1} − x_n‖ ≤ f(a_{n−1})a_{n−1}‖x_n − x_{n−1}‖

and

‖x_n − x*‖ ≤ ((f(a0)a0)ⁿ/(1 − f(a0)a0))‖x1 − x0‖.

Furthermore, the solution x* is unique in D0 = U(x0, σ0) ∩ D, where σ0 = 1/H − λc − R,
provided that

R < (1/2)(1/H − λc) = R1.
Proof. The proof, with the exception of the uniqueness part, is given in Theorem 3 of [12] if
we use A0⁻¹F instead of F and set b = 1, where ‖A0⁻¹‖ ≤ b.
To prove the uniqueness of the solution, let us assume y* ∈ D0 is a solution of F(x) = 0.
Let L = [y*, x*; F]. Then, using (C7) and the definition of σ0, we get in turn that

‖A0⁻¹(L − A0)‖ ≤ H(‖y* − y0‖ + ‖x* − x_{−1}‖) < H(σ0 + λc + R) = 1.   (9.2.1)

It follows from (9.2.1) and the Banach lemma on invertible operators [4, 5, 6, 9, 19, 22, 25]
that L⁻¹ ∈ L(Y, X). Using the identity 0 = F(y*) − F(x*) = L(y* − x*) we deduce that
x* = y*. This completes the proof of the theorem. 

Remark 9.2.2. If K = H, Theorem 9.2.1 reduces to Theorem 3 in [12]. Otherwise, i.e. if
H < K, then our Theorem 9.2.1 constitutes an improvement over Theorem 3, since

σ < σ0   (9.2.2)

and

R0 < R1,   (9.2.3)

where

σ = 1/K − λc − R

and

R0 = (1/2)(1/K − λc)

were given in [12] (for b = 1). Hence, (9.2.2) and (9.2.3) justify the claim for this section
made in the introduction of this chapter.

9.3. Semilocal Convergence Using Recurrent Functions


We present the semilocal convergence of secant-like methods. First, we need some auxiliary
results on majorizing sequences for secant-like method.

Lemma 9.3.1. Let c ≥ 0, η > 0, H > 0, K > 0 and λ ≥ 1. Set t_{−1} = 0, t0 = c and t1 = c + η.
Define scalar sequences {q_n}, {t_n}, {α_n} for each n = 0, 1, ... by

q_n = Hλ(t_{n+1} + t_n − c),   (9.3.1)

t_{n+2} = t_{n+1} + (K(t_{n+1} − t_n + λ(t_n − t_{n−1}))/(1 − q_n))(t_{n+1} − t_n),

α_n = K(t_{n+1} − t_n + λ(t_n − t_{n−1}))/(1 − q_n),   (9.3.2)

functions f_n on [0, 1) for each n = 1, 2, ... by

f_n(t) = K(tⁿ + λt^{n−1})η + Hλ((1 + t + ··· + t^{n+1})η + (1 + t + ··· + tⁿ)η + c) − 1   (9.3.3)

and polynomial p on [0, 1) by

p(t) = Hλt³ + (Hλ + K)t² + K(λ − 1)t − λK.   (9.3.4)

Denote by α the only root of polynomial p in (0, 1). Suppose that

0 ≤ α0 ≤ α ≤ (1 − Hλ(c + 2η))/(1 − Hλc).   (9.3.5)

Then, the sequence {t_n} is non-decreasing, bounded from above by t** defined by

t** = η/(1 − α) + c   (9.3.6)

and converges to its unique least upper bound t* which satisfies

c + η ≤ t* ≤ t**.   (9.3.7)

Moreover, the following estimates are satisfied for each n = 0, 1, 2, ...

0 ≤ t_{n+1} − t_n ≤ αⁿη   (9.3.8)

and

t* − t_n ≤ αⁿη/(1 − α).   (9.3.9)

Proof. We shall first show that polynomial p has roots in (0, 1). Indeed, we have p(0) =
−λK < 0 and p(1) = 2Hλ > 0. Using the intermediate value theorem we deduce that there
exists at least one root of p in (0, 1). Moreover, p′(t) > 0. Hence p crosses the positive axis
only once. Denote by α the only root of p in (0, 1). It follows from (9.3.1) and (9.3.2) that
estimate (9.3.8) is certainly satisfied if

0 ≤ α_n ≤ α.   (9.3.10)

Estimate (9.3.10) is true by (9.3.5) for n = 0. Then, we have by (9.3.1) that

t2 − t1 ≤ α(t1 − t0) ⇒ t2 ≤ t1 + α(t1 − t0) ⇒ t2 ≤ η + t0 + αη = c + (1 + α)η = c + ((1 − α²)/(1 − α))η < t**.
Suppose that

t_{k+1} − t_k ≤ αᵏη and t_{k+1} ≤ c + ((1 − α^{k+1})/(1 − α))η for each k ≤ n.   (9.3.11)
Estimate (9.3.10) shall be true for k + 1 replacing n if

0 ≤ αk+1 ≤ α (9.3.12)

or
fk (α) ≤ 0, (9.3.13)
where f k is defined by (9.3.3). We need a relationship between two consecutive recurrent
functions f k for each k = 1, 2 . . . Using (9.3.3) and (9.3.4) we deduce that

fk+1 (α) = f k (α) + p(α)αk−1η = f k (α), (9.3.14)

since p(α) = 0. Define function f_∞ on (0, 1) by

f_∞(t) = lim_{k→+∞} f_k(t).   (9.3.15)

Then, we get from (9.3.3) and (9.3.15) that

f_∞(α) = Hλ(2η/(1 − α) + c) − 1.   (9.3.16)

Hence, by (9.3.14)-(9.3.16), (9.3.13) is satisfied if

f_∞(α) ≤ 0,   (9.3.17)

which is true by (9.3.5). The induction for (9.3.8) is complete. That is sequence {tn } is non-
decreasing, bounded from above by t ∗∗ given by (9.3.6) and as such it converges to some t ∗
which satisfies (9.3.7). Estimate (9.3.9) follows from (9.3.8) by using standard majorization
techniques [4, 5, 6, 9, 19, 22, 25]. The proof of Lemma 9.3.1 is complete. 

Lemma 9.3.2. Let c ≥ 0, η > 0, H0 > 0, H1 > 0, H > 0, K > 0 and λ ≥ 1. Set s_{−1} = 0,
s0 = c, s1 = c + η. Define scalar sequences {s_n}, {b_n} for each n = 1, 2, ... by

s2 = s1 + (H0(s1 − s0 + λ(s0 − s_{−1}))/(1 − H1λ(s1 + s0 − c)))(s1 − s0),

s_{n+2} = s_{n+1} + (K(s_{n+1} − s_n + λ(s_n − s_{n−1}))/(1 − Hλ(s_{n+1} + s_n − c)))(s_{n+1} − s_n),   (9.3.18)

b1 = H0(s1 − s0 + λ(s0 − s_{−1}))/(1 − H1λ(s1 + s0 − c)),

b_n = K(s_{n+1} − s_n + λ(s_n − s_{n−1}))/(1 − Hλ(s_{n+1} + s_n − c)),   (9.3.19)

and functions g_n on [0, 1) by

g_n(t) = K(t + λ)t^{n−1}(s2 − s1) + Hλt[2s1 + ((1 − t^{n+1})/(1 − t))(s2 − s1) + ((1 − tⁿ)/(1 − t))(s2 − s1)]
         − (1 + Hλc)t.   (9.3.20)

Suppose that

0 ≤ b1 ≤ α ≤ (1 − Hλ(2s2 − c))/(1 − Hλ(2s1 − c)),   (9.3.21)

where α is defined in Lemma 9.3.1. Then, the sequence {s_n} is non-decreasing, bounded from
above by s** defined by

s** = c + η + (s2 − s1)/(1 − α)   (9.3.22)

and converges to its unique least upper bound s* which satisfies

c + η ≤ s* ≤ s**.   (9.3.23)

Moreover, the following estimates are satisfied for each n = 1, 2, ...

0 ≤ s_{n+2} − s_{n+1} ≤ αⁿ(s2 − s1).   (9.3.24)

Proof. We shall show using induction that

0 ≤ b_n ≤ α.   (9.3.25)

Estimate (9.3.25) is true for n = 0 by (9.3.21). Then, we have by (9.3.18) that

0 ≤ s3 − s2 ≤ α(s2 − s1) ⇒ s3 ≤ s2 + α(s2 − s1) ⇒ s3 ≤ s1 + ((1 − α²)/(1 − α))(s2 − s1) ≤ s**.   (9.3.26)

Suppose (9.3.25) holds for each n ≤ k. Then, using (9.3.18), we get that

0 ≤ s_{k+2} − s_{k+1} ≤ αᵏ(s2 − s1)   (9.3.27)

and

s_{k+2} ≤ s1 + ((1 − α^{k+2})/(1 − α))(s2 − s1).   (9.3.28)
Estimate (9.3.25) shall be satisfied if

g_k(α) ≤ 0.   (9.3.29)

Using (9.3.20) we get the following relationship between two consecutive recurrent func-
tions g_k:

g_{k+1}(α) = g_k(α) + p(α)α^{k−1}(s2 − s1) = g_k(α).   (9.3.30)

Define function g_∞ on [0, 1) by

g_∞(t) = lim_{k→+∞} g_k(t).   (9.3.31)

Then, we get from (9.3.20) that

g_∞(α) = 2αHλ(s1 + (s2 − s1)/(1 − α)) − α(1 + Hλc).   (9.3.32)

Then, (9.3.29) is satisfied if

g_∞(α) ≤ 0,   (9.3.33)

which is true by the choice of α and the right hand side inequality in hypothesis (9.3.21).
The induction for (9.3.25) (i.e. (9.3.24)) is complete. The rest of the proof, being identical to
that of Lemma 9.3.1, is omitted. The proof is complete. 

Remark 9.3.3. (a) Let us consider an interesting choice for λ. Let λ = 1 (secant
method). Then, using (9.3.4) and (9.3.5), we have that

α = 2K/(K + √(K² + 4HK))   (9.3.34)

and

K(c + η)/(1 − H(c + η)) ≤ α ≤ (1 − H(c + 2η))/(1 − Hc).   (9.3.35)

The corresponding condition for the secant method is given by [6, 9, 18, 21]:

Kc + 2√(Kη) ≤ 1.   (9.3.36)

Condition (9.3.35) can be weaker than (9.3.36) (see also the numerical examples
at the end of the chapter). Moreover, the majorizing sequence {u_n} for the secant
method related to (9.3.36) is given by

u_{−1} = 0, u0 = c, u1 = c + η,
u_{n+2} = u_{n+1} + (K(u_{n+1} − u_{n−1})/(1 − K(u_{n+1} + u_n − c)))(u_{n+1} − u_n).   (9.3.37)

A simple inductive argument shows that if H < K, then, for each n = 2, 3, ...,

t_n < u_n,   t_{n+1} − t_n < u_{n+1} − u_n and t* ≤ u* = lim_{n→+∞} u_n.   (9.3.38)

(b) The majorizing sequence {v_n} used in [12] is essentially given by

v_{−1} = 0, v0 = c, v1 = c + η,
v_{n+2} = v_{n+1} + (K(v_{n+1} − v_n + λ(v_n − v_{n−1}))/(1 − Kλ(v_{n+1} + v_n − c)))(v_{n+1} − v_n).   (9.3.39)

Then, again we have

t_n < v_n,   t_{n+1} − t_n < v_{n+1} − v_n and t* ≤ v* = lim_{n→+∞} v_n.   (9.3.40)

Moreover, our sufficient convergence conditions can be weaker than those in [12].

Moreover, our sufficient convergence conditions can be weaker than [12].

(c) Clearly, iteration {s_n} is tighter than {t_n} and we have, as in (9.3.40), that for H0 < K
or H1 < H

s_n < t_n,   s_{n+1} − s_n < t_{n+1} − t_n and s* = lim_{n→+∞} s_n < t*.   (9.3.41)
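
The tightness claims (9.3.38)-(9.3.41) are easy to observe numerically. The Python sketch below generates {t_n} from (9.3.2) and {v_n} from (9.3.39) (that is, with K in place of H) for the data of Example 9.4.1 below; the iteration count is an arbitrary choice:

    def majorizing(c, eta, K, H, lam, n=8):
        t = [0.0, c, c + eta]                       # t_{-1}, t_0, t_1
        for _ in range(n):
            den = 1.0 - H * lam * (t[-1] + t[-2] - c)
            if den <= 0.0:                          # sequence no longer defined
                break
            t.append(t[-1] + K * (t[-1] - t[-2] + lam * (t[-2] - t[-3]))
                     / den * (t[-1] - t[-2]))
        return t

    c, eta, K, H, lam = 1.0, 0.465153, 0.368246, 0.193814, 1.0
    print(majorizing(c, eta, K, H, lam))            # {t_n}: monotone and bounded
    print(majorizing(c, eta, K, K, lam))            # {v_n}: denominator degenerates

Running it, {t_n} settles down while {v_n} breaks down after a few steps, illustrating why the weaker conditions of Lemma 9.3.1 matter.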

Next, we present obvious and useful extensions of Lemma 9.3.1 and Lemma 9.3.2,
respectively.

Lemma 9.3.4. Let N = 0, 1, 2, ... be fixed. Suppose that

t1 ≤ t2 ≤ ··· ≤ t_N ≤ t_{N+1},   (9.3.42)

1 > t_{N+1} − t_N + λ(t_N − t_{N−1})   (9.3.43)

and

0 ≤ α_N ≤ α ≤ (1 − Hλ(t_N − t_{N−1} + 2(t_{N+1} − t_N)))/(1 − Hλ(t_N − t_{N−1})).   (9.3.44)

Then, the sequence {t_n} generated by (9.3.2) is non-decreasing, bounded from above by t** and
converges to t* which satisfies t* ∈ [t_{N+1}, t**]. Moreover, the following estimates are satisfied
for each n = 0, 1, ...

0 ≤ t_{N+n+1} − t_{N+n} ≤ αⁿ(t_{N+1} − t_N)   (9.3.45)

and

t* − t_{N+n} ≤ (αⁿ/(1 − α))(t_{N+1} − t_N).   (9.3.46)
Lemma 9.3.5. Let N = 1, 2, ... be fixed. Suppose that

s1 ≤ s2 ≤ ··· ≤ s_N ≤ s_{N+1},   (9.3.47)

1 > s_{N+1} − s_N + λ(s_N − s_{N−1})   (9.3.48)

and

0 ≤ b_N ≤ α ≤ (1 − Hλ(2s_{N+1} − s_{N−1}))/(1 − Hλ(2s_N − s_{N−1})).   (9.3.49)

Then, the sequence {s_n} generated by (9.3.18) is non-decreasing, bounded from above by s**
and converges to s* which satisfies s* ∈ [s_{N+1}, s**]. Moreover, the following estimates are
satisfied for each n = 0, 1, ...

0 ≤ s_{N+n+1} − s_{N+n} ≤ αⁿ(s_{N+1} − s_N)   (9.3.50)

and

s* − s_{N+n} ≤ (αⁿ/(1 − α))(s_{N+1} − s_N).   (9.3.51)
Next, we present the following semilocal convergence result for the secant-like method
under the (C) conditions.

Theorem 9.3.6. Suppose that the (C) and Lemma 9.3.1 (or Lemma 9.3.4) conditions and

U = U(x0, (2λ − 1)t*) ⊆ D   (9.3.52)

hold. Then, the sequence {x_n} generated by the secant-like method is well defined, remains in U for
each n = −1, 0, 1, 2, ... and converges to a solution x* ∈ U(x0, t* − c) of equation F(x) = 0.
Moreover, the following estimates are satisfied for each n = 0, 1, ...

‖x_{n+1} − x_n‖ ≤ t_{n+1} − t_n   (9.3.53)

and

‖x_n − x*‖ ≤ t* − t_n.   (9.3.54)

Furthermore, if there exists T ≥ t* − c such that

U(x0, T) ⊆ D   (9.3.55)

and

H(T + t* + (λ − 1)c) < 1,   (9.3.56)

then the solution x* is unique in U(x0, T).

Proof. We use mathematical induction to prove that

‖x_{k+1} − x_k‖ ≤ t_{k+1} − t_k   (9.3.57)

and

U(x_{k+1}, t* − t_{k+1}) ⊆ U(x_k, t* − t_k)   (9.3.58)

for each k = −1, 0, 1, .... Let z ∈ U(x0, t* − t0). Then we obtain that

‖z − x_{−1}‖ ≤ ‖z − x0‖ + ‖x0 − x_{−1}‖ ≤ t* − t0 + c = t* = t* − t_{−1},

which implies z ∈ U(x_{−1}, t* − t_{−1}). Let also w ∈ U(x1, t* − t1). We get that

‖w − x0‖ ≤ ‖w − x1‖ + ‖x1 − x0‖ ≤ t* − t1 + t1 − t0 = t* − t0,

hence, w ∈ U(x0, t* − t0). Note that ‖x_{−1} − x0‖ ≤ c = t0 − t_{−1} and ‖x1 − x0‖ = ‖A0⁻¹F(x0)‖ ≤
η = t1 − t0 < t*. That is, x1 ∈ U(x0, t*) ⊆ D. Hence, estimates (9.3.57) and (9.3.58) hold for
k = −1 and k = 0. Suppose that (9.3.57) and (9.3.58) hold for all n ≤ k. Then, we obtain
that

‖x_{k+1} − x0‖ ≤ Σ_{i=1}^{k+1} ‖x_i − x_{i−1}‖ ≤ Σ_{i=1}^{k+1} (t_i − t_{i−1}) = t_{k+1} − t0 ≤ t* − c ≤ t*

and

‖y_k − x0‖ = ‖λx_k + (1 − λ)x_{k−1} − x0‖ = ‖λ(x_k − x0) + (1 − λ)(x_{k−1} − x0)‖
           ≤ λ‖x_k − x0‖ + (λ − 1)‖x_{k−1} − x0‖ ≤ λt* + (λ − 1)t* = (2λ − 1)t*.

Hence, x_{k+1}, y_k ∈ U(x0, (2λ − 1)t*) ⊆ D.


Using (C7), Lemma 9.3.1 and the induction hypotheses, we get that

‖A0⁻¹(A_{k+1} − A0)‖ ≤ H(‖y_{k+1} − y0‖ + ‖x_k − x_{−1}‖)
   ≤ H(λ‖x_{k+1} − x0‖ + |1 − λ|‖x_k − x_{−1}‖ + ‖x_k − x_{−1}‖)
   ≤ Hλ(‖x_{k+1} − x0‖ + ‖x_k − x0‖ + ‖x0 − x_{−1}‖)   (9.3.59)
   ≤ Hλ(t_{k+1} − t0 + t_k − t0 + c) = Hλ(t_{k+1} + t_k − c) < 1.

It follows from (9.3.59) and the Banach lemma on invertible operators [4, 5, 6, 9, 19, 22, 25]
that A_{k+1}⁻¹ exists and

‖A_{k+1}⁻¹A0‖ ≤ (1 − Hλ(t_{k+1} + t_k − c))⁻¹.   (9.3.60)
In view of (9.1.4), we obtain the identity

F(x_{k+1}) = F(x_{k+1}) − F(x_k) − [y_k, x_{k−1}; F](x_{k+1} − x_k)
           = ([x_{k+1}, x_k; F] − [y_k, x_{k−1}; F])(x_{k+1} − x_k).   (9.3.61)

Using (9.1.4), (C4) and the induction hypotheses we get in turn that

‖A0⁻¹F(x_{k+1})‖ ≤ K(‖x_{k+1} − y_k‖ + ‖x_k − x_{k−1}‖)‖x_{k+1} − x_k‖
   ≤ K(‖x_{k+1} − x_k‖ + λ‖x_k − x_{k−1}‖)‖x_{k+1} − x_k‖   (9.3.62)
   ≤ K(t_{k+1} − t_k + λ(t_k − t_{k−1}))(t_{k+1} − t_k).

It now follows from (9.1.4), (9.3.1), (9.3.61) and (9.3.62) that

‖x_{k+2} − x_{k+1}‖ ≤ ‖A_{k+1}⁻¹A0‖‖A0⁻¹F(x_{k+1})‖
   ≤ (K(t_{k+1} − t_k + λ(t_k − t_{k−1}))/(1 − Hλ(t_{k+1} + t_k − c)))(t_{k+1} − t_k)   (9.3.63)
   = t_{k+2} − t_{k+1},

which completes the induction for (9.3.57). Moreover, let v ∈ U(x_{k+2}, t* − t_{k+2}). Then, we
get that

‖v − x_{k+1}‖ ≤ ‖v − x_{k+2}‖ + ‖x_{k+2} − x_{k+1}‖ ≤ t* − t_{k+2} + t_{k+2} − t_{k+1} = t* − t_{k+1},

which implies v ∈ U(x_{k+1}, t* − t_{k+1}). The induction for (9.3.58) is complete.
Lemma 9.3.1 implies that {t_k} is a Cauchy sequence. It follows from (9.3.57) and
(9.3.58) that {x_k} is a Cauchy sequence in the Banach space X and as such it converges to
some x* ∈ U(x0, t* − c) (since U(x0, t* − c) is a closed set). By letting k → +∞ in (9.3.62) we
obtain F(x*) = 0. Furthermore, estimate (9.3.54) follows from (9.3.53) by using standard
majorization techniques [5, 6, 8, 9, 19, 22, 25]. To show the uniqueness part, let y* ∈
U(x0, T) be such that F(y*) = 0. We have that

‖A0⁻¹([y*, x*; F] − A0)‖ ≤ H(‖y* − y0‖ + ‖x* − x_{−1}‖)
   ≤ H(‖y* − x0‖ + (λ − 1)‖x0 − x_{−1}‖ + ‖x* − x0‖ + ‖x0 − x_{−1}‖)   (9.3.64)
   ≤ H(T + t* + (λ − 1)c) < 1.

It follows from (9.3.64) and the Banach lemma on invertible operators that [y*, x*; F]⁻¹
exists. Then, using the identity 0 = F(y*) − F(x*) = [y*, x*; F](y* − x*), we deduce that
x* = y*. The proof of Theorem 9.3.6 is complete. 

Remark 9.3.7. (a) The limit point t ∗ can be replaced in Theorem 9.3.6 by t ∗∗ given in
closed form by (9.3.6).

(b) It follows from the proof of Theorem 9.3.6 that {sn } is also a majorizing sequence for
{xn }. Hence, Lemma 9.3.2 (or Lemma 9.3.5), {sn }, s∗ can replace Lemma 9.3.1 (or
Lemma 9.3.4) {tn }, t ∗ in Theorem 9.3.6.
Hence we arrive at:

Theorem 9.3.8. Suppose that the (C) conditions, Lemma 9.3.2 (or Lemma 9.3.5) and

U = U(x0, (2λ − 1)s*) ⊆ D

hold. Then the sequence {x_n} generated by the secant-like method is well defined, remains in U for
each n = −1, 0, 1, 2, ... and converges to a solution x* ∈ U(x0, s* − c) of equation F(x) = 0.
Moreover, the following estimates are satisfied for each n = 0, 1, ...

‖x_{n+1} − x_n‖ ≤ s_{n+1} − s_n

and

‖x_n − x*‖ ≤ s* − s_n.

Furthermore, if there exists T ≥ s* − c such that

U(x0, T) ⊆ D

and

H(T + s* + (λ − 1)c) < 1,

then the solution x* is unique in U(x0, T).

Let us consider the equation

F(x) + G(x) = 0,   (9.3.65)

where F is as before and G : D → Y is continuous. The corresponding secant-like method
is given by

x_{n+1} = x_n − A_n⁻¹(F(x_n) + G(x_n)) for each n = 0, 1, 2, ...,   (9.3.66)

where x0 is an initial guess.


Suppose that

(C8) ‖A0⁻¹(G(x) − G(y))‖ ≤ M‖x − y‖ for each x, y ∈ D,   (9.3.67)

and

(C9) ‖A0⁻¹(G(x1) − G(x0))‖ ≤ M0‖x1 − x0‖.   (9.3.68)

Clearly,

M0 ≤ M   (9.3.69)

holds and M/M0 can be arbitrarily large [4, 5, 6, 8, 9].
We shall denote by (C*) the conditions (C) together with (C8) and (C9). Then, we can present
corresponding results along the same lines as in Lemma 9.3.1, Lemma 9.3.2, Lemma 9.3.4,
Lemma 9.3.5, Theorem 9.3.6 and Theorem 9.3.8. However, we shall only present the results
corresponding to Lemma 9.3.2 and Theorem 9.3.8, respectively. The rest of the combinations of
results can be given in an analogous way.

Lemma 9.3.9. Let c ≥ 0, η > 0, H0 > 0, H1 > 0, H > 0, M0 > 0, M > 0, K > 0 and λ ≥ 1.
Set γ_{−1} = 0, γ0 = c, γ1 = c + η. Define scalar sequences {γ_n}, {δ_n} by

γ2 = γ1 + ((H0(γ1 − γ0 + λ(γ0 − γ_{−1})) + M0)/(1 − H1λ(γ1 + γ0 − c)))(γ1 − γ0),

γ_{n+2} = γ_{n+1} + ((K(γ_{n+1} − γ_n + λ(γ_n − γ_{n−1})) + M)/(1 − Hλ(γ_{n+1} + γ_n − c)))(γ_{n+1} − γ_n),

δ1 = (H0(γ1 − γ0 + λ(γ0 − γ_{−1})) + M0)/(1 − H1λ(γ1 + γ0 − c)),

δ_n = (K(γ_{n+1} − γ_n + λ(γ_n − γ_{n−1})) + M)/(1 − Hλ(γ_{n+1} + γ_n − c)),

and functions h_n on [0, 1) by

h_n(t) = K(t + λ)t^{n−1}(γ2 − γ1) + M
         + Hλt[2γ1 + ((1 − t^{n+1})/(1 − t))(γ2 − γ1) + ((1 − tⁿ)/(1 − t))(γ2 − γ1)]
         − (1 + Hλc)t.

Suppose that the function ϕ given by

ϕ(t) = 2Hλ(γ1 + (γ2 − γ1)/(1 − t))t − (1 + Hλc)t + M

has a minimal zero a in [0, 1) and

0 ≤ δ1 ≤ α ≤ a,

where α was defined in Lemma 9.3.1. Then, the sequence {γ_n} is non-decreasing, bounded
from above by γ** defined by

γ** = c + η + (γ2 − γ1)/(1 − α)

and converges to its unique least upper bound γ* which satisfies

c + η ≤ γ* ≤ γ**.

Moreover, the following estimates are satisfied for each n = 1, 2, ...

0 ≤ γ_{n+2} − γ_{n+1} ≤ αⁿ(γ2 − γ1).

Proof. Simply use {γ_n}, {δ_n}, {h_n}, ϕ, a instead of {s_n}, {b_n}, {g_n}, p, α in the proof of
Lemma 9.3.2. 

Theorem 9.3.10. Suppose that the (C*) and Lemma 9.3.9 conditions and

U ⊆ D

hold, where U was defined in Theorem 9.3.6, and that ‖A0⁻¹(F(x0) + G(x0))‖ ≤ η. Then, the se-
quence {x_n} generated by the secant-like method (9.3.66) is well defined, remains in U
for each n = −1, 0, 1, 2, ... and converges to a solution x* ∈ U(x0, γ* − c) of equation
F(x) + G(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, ...

‖x_{n+1} − x_n‖ ≤ γ_{n+1} − γ_n

and

‖x_n − x*‖ ≤ γ* − γ_n.

Furthermore, if there exists γ ≥ γ* − c such that

U(x0, γ) ⊆ D

and

0 < (K((λ − 1)c + γ) + M)/(1 − Hλ(2γ − c)) ≤ µ for some µ ∈ (0, 1),

then the solution x* is unique in U(x0, γ).

Proof. The proof, until the uniqueness part, follows as in Theorem 9.3.6, but using the
identity

F(x_{k+1}) + G(x_{k+1}) = ([x_{k+1}, x_k; F] − A_k)(x_{k+1} − x_k) + (G(x_{k+1}) − G(x_k))

instead of (9.3.61). Finally, for the uniqueness part, let y* ∈ U(x0, γ) be such that F(y*) +
G(y*) = 0. Then, we get from (9.3.66) the identity

x_{n+1} − y* = x_n − A_n⁻¹(F(x_n) + G(x_n)) − y*
   = −A_n⁻¹(F(x_n) − F(y*) − A_n(x_n − y*) + (G(x_n) − G(y*)))
   = −A_n⁻¹(([x_n, y*; F] − [y_n, x_{n−1}; F])(x_n − y*) + (G(x_n) − G(y*))).

This identity leads to

‖x_{n+1} − y*‖ ≤ ((K(‖x_n − y_n‖ + ‖x_{n−1} − y*‖) + M)/(1 − Hλ(γ_{n+1} + γ_n − c)))‖x_n − y*‖
   ≤ ((K((λ − 1)‖x_n − x_{n−1}‖ + ‖x_{n−1} − y*‖) + M)/(1 − Hλ(2γ − c)))‖x_n − y*‖
   ≤ ((K((λ − 1)c + γ) + M)/(1 − Hλ(2γ − c)))‖x_n − y*‖ ≤ µ‖x_n − y*‖
   ≤ ··· ≤ µ^{n+1}‖x0 − y*‖ ≤ µ^{n+1}γ.

Hence, we deduce lim_{n→+∞} x_n = y*. But we know that lim_{n→+∞} x_n = x*. That is, we conclude
x* = y*. This completes the proof of the theorem. 

9.4. Numerical Examples

Example 9.4.1. Let X = Y = C[0, 1], equipped with the max-norm. Consider the following
nonlinear boundary value problem

u″ = −u³ − γu²,   u(0) = 0, u(1) = 1.

It is well known that this problem can be formulated as the integral equation

u(s) = s + ∫₀¹ Q(s, t)(u³(t) + γu²(t)) dt,   (9.4.1)

where Q is the Green function:

Q(s, t) = t(1 − s) if t ≤ s,   Q(s, t) = s(1 − t) if s < t.

We observe that

max_{0≤s≤1} ∫₀¹ |Q(s, t)| dt = 1/8.

Then problem (9.4.1) is in the form (9.1.1), where F : D → Y is defined as

[F(x)](s) = x(s) − s − ∫₀¹ Q(s, t)(x³(t) + γx²(t)) dt.

The Fréchet derivative of the operator F is given by

[F′(x)y](s) = y(s) − 3∫₀¹ Q(s, t)x²(t)y(t) dt − 2γ∫₀¹ Q(s, t)x(t)y(t) dt.

Then, we have that

[(I − F′(x0))(y)](s) = 3∫₀¹ Q(s, t)x0²(t)y(t) dt + 2γ∫₀¹ Q(s, t)x0(t)y(t) dt.

Hence, if 2γ < 5, then

‖I − F′(x0)‖ ≤ (3 + 2γ)/8 < 1.

It follows that F′(x0)⁻¹ exists and

‖F′(x0)⁻¹‖ ≤ 8/(5 − 2γ).

We also have that ‖F(x0)‖ ≤ (1 + γ)/8. Define the divided difference by

δF(x, y) = ∫₀¹ F′(y + t(x − y)) dt.

Choose x_{−1}(s) such that ‖x_{−1} − x0‖ ≤ c and l0c < 1. Then we have for λ = 1

‖δF(x_{−1}, x0)⁻¹F(x0)‖ ≤ ‖δF(x_{−1}, x0)⁻¹F′(x0)‖‖F′(x0)⁻¹F(x0)‖

and

‖δF(x_{−1}, x0)⁻¹F′(x0)‖ ≤ 1/(1 − l0c),

where l0 is such that

‖F′(x0)⁻¹(F′(x0) − A0)‖ ≤ l0c.
Set u0(s) = s and D = U(u0, R0). It is easy to verify that U(u0, R0) ⊂ U(0, R0 + 1) since
‖u0‖ = 1. If 2γ < 5 and l0c < 1, the operator F′ satisfies the conditions of Theorem 9.3.6, with

η = (1 + γ)/((1 − l0c)(5 − 2γ)),   K = (γ + 6R0 + 3)/(8(5 − 2γ)(1 − l0c)),   H = (2γ + 3R0 + 6)/(16(5 − 2γ)(1 − l0c)).

Choosing R0 = 1, γ = 0.5 and c = 1, we obtain that

l0 = 0.1938137822...,   η = 0.465153...,   K = 0.368246...   and   H = 0.193814....

Moreover, we obtain that a_{−1} = 0.317477 and b_{−1} = 0.251336, but the conditions of Theo-
rem 9.2.1 are not satisfied, since

b_{−1} = 0.251336 > 0.147893 = a_{−1}(1 − a_{−1})²/(2(1 − a_{−1}) − λ(1 − 2a_{−1})).

Notice also that the popular condition (9.3.36) is not satisfied either, since Kc + 2√(Kη) =
1.19599 > 1. Hence, there is no guarantee under the old conditions that the secant-type
method converges to x*. However, the conditions of Lemma 9.3.1 are satisfied, since

0 < α = 0.724067 ≤ 0.776347 = (1 − Hλ(c + 2η))/(1 − Hλc).

The convergence of the secant-type method is thus also ensured by Theorem 9.3.6.
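
The quantities just quoted can be reproduced with a few lines of Python; p is the cubic (9.3.4), which for λ = 1 has a unique root α in (0, 1):

    import numpy as np

    c, eta, K, H, lam = 1.0, 0.465153, 0.368246, 0.193814, 1.0
    coeffs = [H * lam, H * lam + K, K * (lam - 1.0), -lam * K]      # p(t) of (9.3.4)
    alpha = [r.real for r in np.roots(coeffs)
             if abs(r.imag) < 1e-12 and 0.0 < r.real < 1.0][0]
    bound = (1.0 - H * lam * (c + 2.0 * eta)) / (1.0 - H * lam * c)
    print(alpha, bound)     # ~0.724067 and ~0.776347, as stated above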

Example 9.4.2. Let X = Y = ℝ and let us consider the real function

F(x) = x³ − 2,

to which we apply the secant-type method with λ = 2.5. We take the starting points
x0 = 1, x_{−1} = 0.25 and we consider the domain Ω = B(x0, 3/4). In this case, we obtain

c = 0.75,   η = 0.120301...,   K = 0.442105...,   H = 0.180451....

Notice that the conditions of Theorem 9.2.1 and Lemma 9.3.1 are satisfied, and since H < K,
Remark 9.2.2 ensures that our uniqueness ball is larger. This is clear, since R1 = 1.83333... >
0.193452... = R0.
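
Running the secant_like sketch given after (9.1.4) in Section 9.1 with these data confirms the convergence predicted by the theory:

    x = secant_like(lambda t: t ** 3 - 2.0, 0.25, 1.0, lam=2.5)
    print(x, abs(x - 2.0 ** (1.0 / 3.0)))   # converges to the cube root of 2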
References

[1] Amat, S., Bermúdez, C., Busquier, S., Gretay, J. O., Convergence by nondiscrete
mathematical induction of a two step secant’s method, Rocky Mountain Journal of
Mathematics, 37(2) (2007), 359–369.

[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, International Journal of Computer Mathematics, 81(8) (2004), 1153–1161.

[3] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numerical Functional Analysis and Optimization, 25(5) (2004), 397–405.

[4] Argyros, I. K., Polynomial operator equations in abstract spaces and applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, Boca Raton, Florida, U.S.A 1998.

[5] Argyros, I. K., Convergence and applications of Newton-type iterations, Springer-


Verlag Publications, New York, 2008.

[6] Argyros, I. K., Cho, Y. J., Hilout, S., Numerical methods for equations and its appli-
cations, CRC Press/Taylor & Francis, New York, 2012.

[7] Argyros, I. K., Hilout, S., Convergence conditions for secant-type methods,
Czechoslovak Mathematical Journal, 60 (2010), 11–18.

[8] Argyros, I. K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
Journal of Complexity, 28(3) (2012), 346–387.

[9] Argyros, I. K., Hilout, S., Tabatabai, M.A., Mathematical modelling with applications
in biosciences and engineering, Nova Publishers, New York, 2011.

[10] Bosarge, W. E., Falb, P. L., A multipoint method of third order, Journal of Optimization
Theory and Applications, 4 (1969), 156–166.

[11] Dennis, J. E., Toward a unified convergence theory for Newton-like methods, Func-
tional Analysis and Applications (L.B. Rall, ed.), Academic Press, New York, 1971.

[12] Ezquerro, J. A., Grau-Sánchez, M., Hernández, M. A., Noguera, M., Semilocal con-
vergence of secant-like methods for differentiable and nondifferentiable operators
equations, J. Math. Anal. App., 398(1) (2013), 100-112.

[13] Ezquerro, J. A., Gutiérrez, J. M., Hernández, M.A., Romero, N., Rubio, M. J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. Real Soc. Mat. Esp.,
13 (2010), 53–76.

[14] Ezquerro, J. A., Hernández, M.A., Romero, N., Velasco, A. I., App. Math. Comp.,
219(8) (2012), 3677-3692.

[15] Ezquerro, J. A., Hernández, M.A., Rubio, M.J., Secant-like methods for solving non-
linear integral equations of the Hammerstein type, Proceedings of the 8th Interna-
tional Congress on Computational and Applied Mathematics, ICCAM-98 (Leuven), J.
Comp. Appl. Math., 115 (2000), 245–254.

[16] Ezquerro, J. A., Hernández, M.A., Rubio, M.J., Solving a special case of conservative
problems by secant-like methods, App. Math. Comp., 169 (2005), 926–942.

[17] Ezquerro, J. A., Rubio, M. J., A uniparametric family of iterative processes for solving
nondifferentiable equations, J. Math. Anal. App., 275 (2002), 821–834.

[18] Kantorovich, L. V., Akilov, G. P., Functional analysis, Pergamon Press, Oxford, 1982.

[19] Laasonen, P., Ein überquadratisch konvergenter iterativer Algorithmus, Annales
Academiae Scientiarum Fennicae Mathematica. Ser. I, 450 (1969), 1–10.

[20] Magreñán, Á. A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215–224.

[21] Ortega, J. M., Rheinboldt, W. C., Iterative solution of nonlinear equations in several
variables, Academic Press, New York, 1970.

[22] Potra, F. A., Pták, V., Nondiscrete induction and iterative processes. Research Notes
in Mathematics, 103, Pitman (Advanced Publishing Program), Boston, Massachusetts,
1984.

[23] Proinov, P. D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, Journal of Complexity, 25 (2009), 38–62.

[24] Schmidt, J. W., Untere Fehlerschranken für Regula-falsi-Verfahren, Periodica Mathe-
matica Hungarica, 9 (1978), 241–247.

[25] Traub, J. F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood
Cliffs, New Jersey, 1964.

[26] Yamamoto, T., A convergence theorem for Newton-like methods in Banach spaces,
Numerische Mathematik, 51 (1987), 545–557.

[27] Wolfe, M. A., Extended iterative methods for the solution of operator equations, Nu-
merische Mathematik, 31 (1978), 153–174.
Chapter 10

On the Semilocal Convergence of a


Two-Step Newton-Like Projection
Method for Ill-Posed Equations

10.1. Introduction
Let X be a real Hilbert space with inner product ⟨·, ·⟩ and norm ‖·‖. Let U(x, R) and U̅(x, R)
stand, respectively, for the open and closed balls in X with center x and radius R > 0. Let also
L(X) be the space of all bounded linear operators from X into itself.
In this chapter we are concerned with the problem of approximately solving the ill-
posed equation
F(x) = y, (10.1.1)
where F : D(F) ⊆ X → X is a nonlinear operator satisfying ⟨F(v) − F(w), v − w⟩ ≥ 0,
∀v, w ∈ D(F), and y ∈ X.
It is assumed that (10.1.1) has a solution, namely x̂, and that F possesses a locally uniformly
bounded Fréchet derivative F′(x) for all x ∈ D(F) (cf. [18]), i.e.,

‖F′(x)‖ ≤ C_F,   x ∈ D(F)

for some constant C_F.


In applications, usually only noisy data y^δ are available, such that

‖y − y^δ‖ ≤ δ.

Then the problem of recovering x̂ from the noisy equation F(x) = y^δ is ill-posed, in the sense
that a small perturbation in the data can cause a large deviation in the solution. For solving
(10.1.1) with monotone operators (see [12, 17, 18, 19]) one usually uses the Lavrentiev
regularization method. In this method the regularized approximation x_α^δ is obtained by
solving the operator equation

F(x) + α(x − x0) = y^δ.   (10.1.2)



It is known (cf. [19], Theorem 1.1) that the equation (10.1.2) has a unique solution x_α^δ for
α > 0, provided F is Fréchet differentiable and monotone in the ball B_r(x̂) ⊂ D(F) with
radius r = ‖x̂ − x0‖ + δ/α. However, the regularized equation (10.1.2) remains nonlinear
and one may have difficulties in solving it numerically.
In [6], George and Elmahdy considered an iterative regularization method which con-
verges linearly to x_α^δ, and its finite dimensional realization in [7]. Later, in [8], George and
Elmahdy considered an iterative regularization method which converges quadratically to x_α^δ,
and its finite dimensional realization in [9].
Recall that a sequence (x_n) in X with lim_n x_n = x* is said to be convergent of order
p > 1 if there exist positive reals β, γ such that, for all n ∈ ℕ, ‖x_n − x*‖ ≤ βe^{−γpⁿ}. If the
sequence (x_n) has the property that ‖x_n − x*‖ ≤ βqⁿ, 0 < q < 1, then (x_n) is said to be
linearly convergent. For an extensive discussion of convergence rates see [13].
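
A common numerical companion to this definition is the estimate p ≈ log(e_{n+1}/e_n)/log(e_n/e_{n−1}) for the errors e_n = ‖x_n − x*‖; the short Python sketch below applies it to an illustrative error sequence (the data are invented for demonstration and exhibit order two, i.e., a doubling of correct digits per step):

    import numpy as np

    e = np.array([1e-1, 1e-2, 1e-4, 1e-8, 1e-16])      # hypothetical errors
    p = np.log(e[2:] / e[1:-1]) / np.log(e[1:-1] / e[:-2])
    print(p)                                           # approximately [2, 2, 2]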
Note that the methods considered in [6], [7], [8] and [9] are analyzed using a suitably
constructed majorizing sequence which depends heavily on the initial guess and hence is not
suitable for practical consideration.
Recently, George and Pareth [10] introduced a two-step Newton-like projection
method (TSNLPM) of convergence order four to solve (10.1.2). (TSNLPM) was realized
as follows.
Let {P_h}_{h>0} be a family of orthogonal projections on X. Our aim in this section is to
obtain an approximation for x_α^δ in the finite dimensional space R(P_h), the range of P_h. For
the results that follow, we impose the following conditions.
Let

ε_h(x) := ‖F′(x)(I − P_h)‖,   ∀x ∈ D(F),

and let {b_h : h > 0} be such that lim_{h→0} ‖(I − P_h)x0‖/b_h = 0 and lim_{h→0} b_h = 0. We assume that
ε_h(x) → 0, ∀x ∈ D(F), as h → 0. The above assumption is satisfied if P_h → I pointwise
and if F′(x) is a compact operator. Further, we assume that ε_h(x) ≤ ε0, ∀x ∈ D(F), b_h ≤ b0
and δ ∈ (0, δ0].

10.1.1. Projection Method


We consider the following sequence defined iteratively by

y_{n,α}^{h,δ} = x_{n,α}^{h,δ} − R_α(x_{n,α}^{h,δ})⁻¹P_h[F(x_{n,α}^{h,δ}) − f^δ + α(x_{n,α}^{h,δ} − x0)]   (10.1.3)

and

x_{n+1,α}^{h,δ} = y_{n,α}^{h,δ} − R_α(y_{n,α}^{h,δ})⁻¹P_h[F(y_{n,α}^{h,δ}) − f^δ + α(y_{n,α}^{h,δ} − x0)],   (10.1.4)

where R_α(x) := P_hF′(x)P_h + αP_h and x_{0,α}^{h,δ} := P_hx0, for obtaining an approximation for x_α^δ in
the finite dimensional subspace R(P_h) of X. Note that the iterations (10.1.3) and (10.1.4) are
the finite dimensional realizations of the corresponding iterations in [16]. In [10], the
parameter α = α_i was chosen from some finite set
parameter α = αi was chosen from some finite set

DN = {αi : 0 < α0 < α1 < α2 < · · · < αN }

using the adaptive method considered by Perverzev and Schock in [17].
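
A minimal finite dimensional sketch of (10.1.3)-(10.1.4) in Python may clarify the roles of P_h and R_α. Everything below (the monotone operator on ℝ³, the projection onto the first two coordinates, the fixed α and the noise) is an illustrative assumption, not the setting of [10]:

    import numpy as np

    N, m = 3, 2
    A = np.array([[2.0, 0.5, 0.0], [0.5, 2.0, 0.3], [0.0, 0.3, 2.0]])
    F = lambda x: A @ x + x ** 3            # monotone: Jacobian A + 3 diag(x^2) is PSD
    dF = lambda x: A + np.diag(3.0 * x ** 2)

    x_hat = np.array([0.2, -0.1, 0.1])
    delta = 1e-6
    f_delta = F(x_hat) + delta              # noisy data of size ~delta
    alpha, x0 = 1e-2, np.zeros(N)

    def substep(x):
        # solve R_alpha(x) h = P_h [F(x) - f_delta + alpha (x - x0)] on R(P_h),
        # where P_h projects onto the first m coordinates
        R = dF(x)[:m, :m] + alpha * np.eye(m)
        rhs = (F(x) - f_delta + alpha * (x - x0))[:m]
        h = np.zeros(N)
        h[:m] = np.linalg.solve(R, rhs)
        return x - h

    x = np.zeros(N)                         # x_{0,alpha} = P_h x0
    for _ in range(10):
        x = substep(substep(x))             # y_n, then x_{n+1}
    print(x)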


The convergence analysis in [10] was carried out using the following assumptions.

Assumption 10.1.1. (cf. [18], Assumption 3) There exists a constant k0 ≥ 0 such that
for every x, u ∈ D(F) and v ∈ X there exists an element Φ(x, u, v) ∈ X such that [F′(x) −
F′(u)]v = F′(u)Φ(x, u, v), ‖Φ(x, u, v)‖ ≤ k0‖v‖‖x − u‖.
Assumption 10.1.2. There exists a continuous, strictly monotonically increasing function
ϕ : (0, a] → (0, ∞) with a ≥ ‖F′(x̂)‖ satisfying:

(i) lim_{λ→0} ϕ(λ) = 0,

(ii) sup_{λ≥0} αϕ(λ)/(λ + α) ≤ c_ϕϕ(α), ∀α ∈ (0, a], and

(iii) there exists v ∈ X with ‖v‖ ≤ 1 (cf. [15]) such that

x0 − x̂ = ϕ(F′(x̂))v.

In the present chapter we extend the applicability of (TSNLPM) by weakening Assump-
tion 10.1.1, which is very difficult to verify (or does not hold) in general (see the numerical
examples in the last section of the chapter). In particular, we replace Assumption 10.1.1 by
the weaker and easier to verify:
Assumption 10.1.3. Let x0 ∈ X be fixed. There exists a constant K0 ≥ 0 such that for each
x, u ∈ D(F) and v ∈ X there exists an element Φ(x, u, v) ∈ X depending on x0 such that
[F′(x) − F′(u)]v = F′(u)Φ(x, u, v), ‖Φ(x, u, v)‖ ≤ K0‖v‖(‖x − P_hx0‖ + ‖u − P_hx0‖).
Note that Assumption 10.1.1⇒ Assumption 10.1.3 but not necessarily vice versa. At
the end of the chapter we have provided examples, where Assumption 10.1.3 is satisfied but
not Assumption 10.1.1.
We also replace Assumption 10.1.2 by
Assumption 10.1.4. There exists a continuous, strictly monotonically increasing function
ϕ : (0, a] → (0, ∞) with a ≥ ‖F′(x0)‖ satisfying:

(i) lim_{λ→0} ϕ(λ) = 0,

(ii) sup_{λ≥0} αϕ(λ)/(λ + α) ≤ ϕ(α), ∀α ∈ (0, a], and

(iii) there exists v ∈ X with ‖v‖ ≤ 1 (cf. [15]) such that

x0 − x̂ = ϕ(F′(x0))v.

Remark 10.1.5. The hypotheses of Assumption 10.1.1 may not hold or may be very ex-
pensive or impossible to verify in general. In particular, as is the case for well-posed
nonlinear equations, the computation of the Lipschitz constant k0, even if this constant exists,
is very difficult. Moreover, there are classes of operators for which Assumption 10.1.1 is
not satisfied but (TSNLPM) converges.
In this chapter, we expand the applicability of (TSNLPM) under less computational cost.
Let us explain how we achieve this goal.
(1) Assumption 10.1.3 is weaker than Assumption 10.1.1. Notice that there are classes of
operators that satisfy Assumption 10.1.3 but do not satisfy Assumption 10.1.1;

(2) The computational cost of constant K0 is less than that of constant k0 , even when
K0 = k0 ;

(3) The sufficient convergence criteria are weaker;

(4) The computable error bounds on the distances involved (including K0 ) are less costly;

(5) The convergence domain of (TSNLPM) with Assumption 10.1.3 can be larger, since
K_0/k_0 can be arbitrarily small (see Example 10.5.4);

(6) The information on the location of the solution is more precise;

and

(7) Assumption 10.1.2 involves the Fréchet derivative at the exact solution x̂, which is
unknown in practice, whereas Assumption 10.1.4 depends on the Fréchet derivative
of F at x_0.

These advantages are also very important in computational mathematics, since they
provide, under less computational cost, a wider choice of initial guesses and the computation
of fewer iterates to achieve a desired error tolerance.

The chapter is organized as follows: In Section 10.2 we present the convergence anal-
ysis of (TSNLPM). Section 10.3 contains the error analysis and the parameter choice strategy.
The algorithm for implementing (TSNLPM) is given in Section 10.4. Finally, numerical
examples are presented in the concluding Section 10.5.

10.2. Semilocal Convergence


In order to present the semilocal convergence of (TSNLPM), it is convenient to intro-
duce some parameters.

Let
e^{h,δ}_{n,α} := ‖y^{h,δ}_{n,α} − x^{h,δ}_{n,α}‖, ∀n ≥ 0.                (10.2.1)

Suppose that
0 < K_0 < 1/(4(1 + ε_0/α_0))                (10.2.2)
and
(4δ_0/α_0)(1 + ε_0/α_0) < 1.                (10.2.3)

Define the polynomial P on (0, ∞) by

P(t) = (1 + ε_0/α_0)(K_0/2) t² + (1 + ε_0/α_0) t + δ_0/α_0 − 1/(4(1 + ε_0/α_0)).                (10.2.4)

It follows from (10.2.3) that P has a unique positive root, given in closed form by the
quadratic formula. Denote this root by p_0.

Let
b_0 < p_0,   ‖x̂ − x_0‖ ≤ ρ,                (10.2.5)
where
ρ < p_0 − b_0.                (10.2.6)

Define
γ_ρ := (1 + ε_0/α_0)[(K_0/2)(ρ + b_0)² + (ρ + b_0)] + δ_0/α_0,                (10.2.7)

r := 4γ_ρ / (1 + √(1 + 32γ_ρ(1 + ε_0/α_0)))                (10.2.8)

and
b := 4(1 + ε_0/α_0) K_0 r.                (10.2.9)

Then we have by (10.2.2)-(10.2.9) that
0 < γ_ρ < 1/4,                (10.2.10)
0 < r < 1                (10.2.11)
and
0 < b < 1.                (10.2.12)

Indeed, by (10.2.4)-(10.2.6) we have γ_ρ − 1/(4(1 + ε_0/α_0)) = P(ρ + b_0) ≤ P(p_0) = 0, so
0 < γ_ρ < 1/4, which is (10.2.10). Estimate (10.2.11) follows from (10.2.8) and (10.2.10).
Moreover, estimate (10.2.12) follows from (10.2.2) and (10.2.11). We also have that

γ_ρ < r.                (10.2.13)

In view of (10.2.7) and (10.2.8), estimate (10.2.13) reduces to showing that
4γ_ρ(1 + ε_0/α_0) < 1, which is true by the choice of p_0 and (10.2.4). Finally, it follows
from (10.2.13) that

0 < γ_ρ < 1.                (10.2.14)

Lemma 10.2.1. ([10], Lemma 1) Let x ∈ D(F). Then

‖R_α^{−1}(x) P_h F′(x)‖ ≤ 1 + ε_0/α_0.

Lemma 10.2.2. ([10], Lemma 2) Let e_0 = e^{h,δ}_{0,α} and let γ_ρ be as in (10.2.7). Then
e_0 ≤ γ_ρ.

Lemma 10.2.3. Suppose that (10.2.2), (10.2.3) and δ ∈ (0, δ0 ] hold and let Assumption
10.1.3 be satisfied. Then the following estimates hold for (TSNLPM):

(a)
‖x^{h,δ}_{n,α} − y^{h,δ}_{n−1,α}‖ ≤ (K_0/2)(1 + ε_0/α_0)[3‖x^{h,δ}_{n−1,α} − x^{h,δ}_{0,α}‖ + 5‖y^{h,δ}_{n−1,α} − x^{h,δ}_{0,α}‖] e^{h,δ}_{n−1,α}                (10.2.15)

and

(b)
‖x^{h,δ}_{n,α} − x^{h,δ}_{n−1,α}‖ ≤ {1 + (K_0/2)(1 + ε_0/α_0)[3‖x^{h,δ}_{n−1,α} − x^{h,δ}_{0,α}‖ + 5‖y^{h,δ}_{n−1,α} − x^{h,δ}_{0,α}‖]} e^{h,δ}_{n−1,α}.                (10.2.17)

Proof. For brevity, write x_n, y_n, e_n for x^{h,δ}_{n,α}, y^{h,δ}_{n,α}, e^{h,δ}_{n,α}, and x_{0,α} for x^{h,δ}_{0,α} = P_h x_0.
Observe that

x_n − y_{n−1} = y_{n−1} − x_{n−1} − R_α^{−1}(y_{n−1}) P_h[F(y_{n−1}) − f^δ + α(y_{n−1} − x_0)]
                + R_α^{−1}(x_{n−1}) P_h[F(x_{n−1}) − f^δ + α(x_{n−1} − x_0)]
             = y_{n−1} − x_{n−1} − R_α^{−1}(y_{n−1}) P_h[F(y_{n−1}) − F(x_{n−1}) + α(y_{n−1} − x_{n−1})]
                + [R_α^{−1}(x_{n−1}) − R_α^{−1}(y_{n−1})] P_h[F(x_{n−1}) − f^δ + α(x_{n−1} − x_0)]
             = R_α^{−1}(y_{n−1}) P_h[F′(y_{n−1})(y_{n−1} − x_{n−1}) − (F(y_{n−1}) − F(x_{n−1}))]
                + R_α^{−1}(y_{n−1}) P_h(F′(y_{n−1}) − F′(x_{n−1}))(x_{n−1} − y_{n−1})
            := Γ_1 + Γ_2,                (10.2.19)

where
Γ_1 := R_α^{−1}(y_{n−1}) P_h[F′(y_{n−1})(y_{n−1} − x_{n−1}) − (F(y_{n−1}) − F(x_{n−1}))]
and
Γ_2 := R_α^{−1}(y_{n−1}) P_h[F′(y_{n−1}) − F′(x_{n−1})](x_{n−1} − y_{n−1}).

Note that

‖Γ_1‖ = ‖R_α^{−1}(y_{n−1}) P_h ∫_0^1 [F′(y_{n−1}) − F′(x_{n−1} + t(y_{n−1} − x_{n−1}))](y_{n−1} − x_{n−1}) dt‖
      = ‖R_α^{−1}(y_{n−1}) P_h F′(y_{n−1}) ∫_0^1 Φ(x_{n−1} + t(y_{n−1} − x_{n−1}), y_{n−1}, x_{n−1} − y_{n−1}) dt‖
      ≤ K_0 (1 + ε_0/α_0) [∫_0^1 ‖x_{n−1} + t(y_{n−1} − x_{n−1}) − x_{0,α}‖ dt + ‖y_{n−1} − x_{0,α}‖] ‖y_{n−1} − x_{n−1}‖                (10.2.20)
      ≤ K_0 (1 + ε_0/α_0) ∫_0^1 [(1 − t)‖x_{n−1} − x_{0,α}‖ + t‖y_{n−1} − x_{0,α}‖ + ‖y_{n−1} − x_{0,α}‖] dt ‖y_{n−1} − x_{n−1}‖
      = (K_0/2)(1 + ε_0/α_0)[‖x_{n−1} − x_{0,α}‖ + 3‖y_{n−1} − x_{0,α}‖] e_{n−1},

where the last steps follow from Assumption 10.1.3 and Lemma 10.2.1. Similarly,

‖Γ_2‖ ≤ K_0 (1 + ε_0/α_0)[‖y_{n−1} − x_{0,α}‖ + ‖x_{0,α} − x_{n−1}‖] e_{n−1}.                (10.2.21)

So, (a) follows from (10.2.19), (10.2.20) and (10.2.21), and (b) follows from (a) and the
triangle inequality

‖x_n − x_{n−1}‖ ≤ ‖x_n − y_{n−1}‖ + ‖y_{n−1} − x_{n−1}‖.

Theorem 10.2.4. Under the hypotheses of Lemma 10.2.3, the following estimates hold for
(TSNLPM):

e^{h,δ}_{n,α} ≤ (K_0/2)(1 + ε_0/α_0)[5‖x^{h,δ}_{n,α} − x^{h,δ}_{0,α}‖ + 3‖y^{h,δ}_{n−1,α} − x^{h,δ}_{0,α}‖] ‖y^{h,δ}_{n−1,α} − x^{h,δ}_{n,α}‖
          ≤ b² e^{h,δ}_{n−1,α} ≤ b^{2n} e^{h,δ}_{0,α} ≤ b^{2n} γ_ρ.

Proof. With the abbreviations of Lemma 10.2.3, we have

y_n − x_n = x_n − y_{n−1} − R_α^{−1}(x_n) P_h[F(x_n) − f^δ + α(x_n − x_0)]
             + R_α^{−1}(y_{n−1}) P_h[F(y_{n−1}) − f^δ + α(y_{n−1} − x_0)]
          = x_n − y_{n−1} − R_α^{−1}(x_n) P_h[F(x_n) − F(y_{n−1}) + α(x_n − y_{n−1})]
             + [R_α^{−1}(y_{n−1}) − R_α^{−1}(x_n)] P_h[F(y_{n−1}) − f^δ + α(y_{n−1} − x_0)]
          = R_α^{−1}(x_n) P_h[F′(x_n)(x_n − y_{n−1}) − (F(x_n) − F(y_{n−1}))]
             + R_α^{−1}(x_n) P_h[F′(x_n) − F′(y_{n−1})](y_{n−1} − x_n)
         := Γ_3 + Γ_4,                (10.2.22)

where Γ_3 and Γ_4 denote the two summands in the last expression. Analogously to the proofs
of (10.2.20) and (10.2.21), one can show that

‖Γ_3‖ ≤ (K_0/2)(1 + ε_0/α_0)[3‖x_n − x_{0,α}‖ + ‖y_{n−1} − x_{0,α}‖] ‖x_n − y_{n−1}‖                (10.2.23)

and

‖Γ_4‖ ≤ K_0 (1 + ε_0/α_0)[‖x_n − x_{0,α}‖ + ‖y_{n−1} − x_{0,α}‖] ‖x_n − y_{n−1}‖.

Now, since the iterates remain in B_r(P_h x_0) (see Theorem 10.2.5), using (a) of Lemma 10.2.3,

e^{h,δ}_{n,α} ≤ (K_0/2)(1 + ε_0/α_0)[5‖x_n − x_{0,α}‖ + 3‖y_{n−1} − x_{0,α}‖] ‖x_n − y_{n−1}‖                (10.2.24)
          ≤ (K_0/2)(1 + ε_0/α_0)(8r) · (K_0/2)(1 + ε_0/α_0)(8r) ‖x_{n−1} − y_{n−1}‖
          ≤ b² ‖x_{n−1} − y_{n−1}‖
          ≤ b^{2n} e^{h,δ}_{0,α} ≤ b^{2n} γ_ρ.

This completes the proof of the theorem.

Theorem 10.2.5. Suppose that the hypotheses of Theorem 10.2.4 hold. Then the sequences
{x^{h,δ}_{n,α}} and {y^{h,δ}_{n,α}} generated by (TSNLPM) are well defined and remain in U(P_h x_0, r) for
all n ≥ 0.

Proof. We keep the abbreviations of Lemma 10.2.3. By (b) of Lemma 10.2.3 we have

‖x_1 − P_h x_0‖ = ‖x_1 − x_{0,α}‖ ≤ [1 + (K_0/2)(1 + ε_0/α_0)(8r)] γ_ρ                (10.2.25)
              ≤ (1 + b) γ_ρ = (1 − b²)/(1 − b) · γ_ρ < r,

i.e., x_1 ∈ B_r(P_h x_0). Again, from (10.2.25) and Theorem 10.2.4 we get

‖y_1 − P_h x_0‖ ≤ ‖y_1 − x_1‖ + ‖x_1 − P_h x_0‖
              ≤ [1 + 4(1 + ε_0/α_0)K_0 r + (4(1 + ε_0/α_0)K_0 r)²] γ_ρ
              = (1 + b + b²) γ_ρ = (1 − b³)/(1 − b) · γ_ρ < r,

i.e., y_1 ∈ B_r(P_h x_0). Further, by (10.2.25) and (b) of Lemma 10.2.3 we have

‖x_2 − P_h x_0‖ ≤ ‖x_2 − x_1‖ + ‖x_1 − P_h x_0‖ ≤ (1 + b)γ_ρ + (1 + b)γ_ρ = 2(1 + b)γ_ρ < r

and

‖y_2 − P_h x_0‖ ≤ ‖y_2 − x_2‖ + ‖x_2 − P_h x_0‖
              ≤ b⁴ γ_ρ + 2(1 + b)γ_ρ ≤ b² γ_ρ + 2(1 + b)γ_ρ
              ≤ [(1 − b³)/(1 − b) + (1 − b²)/(1 − b)] γ_ρ     (since b < 1)
              < 2γ_ρ/(1 − b) < r

by the choice of r, i.e., x_2, y_2 ∈ B_r(P_h x_0). Continuing in this way, one shows that
x_n, y_n ∈ B_r(P_h x_0) for all n ≥ 0. This completes the proof.

Theorem 10.2.6. Suppose that the hypotheses of Theorem 10.2.5 hold. Then the following
assertions hold:

(a) {x^{h,δ}_{n,α}} is a Cauchy sequence in U(P_h x_0, r) and converges to some x^{h,δ}_α ∈ U(P_h x_0, r).

(b) P_h[F(x^{h,δ}_α) + α(x^{h,δ}_α − x_0)] = P_h y^δ.

(c)
‖x^{h,δ}_{n,α} − x^{h,δ}_α‖ ≤ (1 + b) b^{2n} γ_ρ / (1 − b²),

where γ_ρ and b are defined by (10.2.7) and (10.2.9), respectively.

Proof. With the abbreviations of Lemma 10.2.3, we have by (a) and (b) of that lemma

‖x_{n+i+1} − x_{n+i}‖ ≤ (1 + b) e_{n+i} ≤ (1 + b) b^{2(n+i)} e_0 ≤ (1 + b) b^{2(n+i)} γ_ρ.

So,

‖x_{n+m} − x_n‖ ≤ Σ_{i=0}^{m−1} ‖x_{n+i+1} − x_{n+i}‖ ≤ (1 + b) b^{2n} Σ_{i=0}^{m−1} b^{2i} γ_ρ
             = (1 + b) b^{2n} (1 − b^{2m})/(1 − b²) γ_ρ ≤ (1 + b) b^{2n} γ_ρ/(1 − b²).

Thus {x_n} is a Cauchy sequence in U(P_h x_0, r), and hence it converges, say to
x^{h,δ}_α ∈ U(P_h x_0, r); letting m → ∞ above yields (c).

Observe that

‖P_h[F(x_n) − f^δ + α(x_n − x_0)]‖ = ‖R_α(x_n)(x_n − y_n)‖ ≤ ‖R_α(x_n)‖ ‖x_n − y_n‖
                                 = ‖P_h F′(x_n) P_h + αP_h‖ e_n ≤ (C_F + α) e_n,                (10.2.26)

where C_F denotes a uniform bound for ‖F′(·)‖. Now, letting n → ∞ in (10.2.26), we obtain

P_h[F(x^{h,δ}_α) + α(x^{h,δ}_α − x_0)] = P_h y^δ.                (10.2.27)

This completes the proof.

Remark 10.2.7. (a) The convergence order of (TSNLPM) is four [10] under Assump-
tion 10.1.1. The error bounds in Theorem 10.2.6 are too pessimistic. That is why, in
practice, we shall use the computational order of convergence (COC) (see, e.g., [5]),
defined by

ρ ≈ ln( ‖x_{n+1} − x^δ_α‖ / ‖x_n − x^δ_α‖ ) / ln( ‖x_n − x^δ_α‖ / ‖x_{n−1} − x^δ_α‖ ).

The COC ρ will then be close to 4, which is the order of convergence of (TSNLPM).

(b) Note that from the proof of Theorem 10.2.5 a larger radius can be obtained by solving
the equation
[b_t⁴ + 2(1 + b_t)] γ_ρ − t = 0,   b_t := 4(1 + ε_0/α_0)K_0 t.
This equation has a minimal root r* > r, and r* can replace r in Theorem
10.2.5. However, we decided to use r, which is given in closed form. Using Mathe-
matica or Maple we found r* in closed form, but it has a complicated and long form.
That is why we decided not to include it in this chapter.

10.3. Error Bounds under Source Conditions


The objective of this section is to obtain an error estimate for ‖x^{h,δ}_{n,α} − x̂‖ under a source
condition on x_0 − x̂.

Proposition 10.3.1. Let F : D(F) ⊆ X → X be a monotone operator in X. Let x^{h,δ}_α be the
solution of (10.2.27) and let x^h_α := x^{h,0}_α. Then

‖x^{h,δ}_α − x^h_α‖ ≤ δ/α.

Proof. The result follows from the monotonicity of F and the relation

P_h[F(x^{h,δ}_α) − F(x^h_α) + α(x^{h,δ}_α − x^h_α)] = P_h(y^δ − y).

Theorem 10.3.2. Let ρ < 2/(K_0(1 + ε_0/α_0)), let x̂ ∈ D(F) be a solution of (10.1.1), and let
Assumption 10.1.3, Assumption 10.1.4 and the assumptions of Proposition 10.3.1 be satisfied.
Then
‖x^h_α − x̂‖ ≤ C̃ (ϕ(α) + ε_h/α),

where
C̃ := max{1 + (1 + ε_0/α_0)K_0(2b_0 + ρ), ρ + ‖x̂‖} / (1 − (1 + ε_0/α_0)(K_0/2)ρ).

Proof. Let M := ∫_0^1 F′(x̂ + t(x^h_α − x̂)) dt. Then from the relation

P_h[F(x^h_α) − F(x̂) + α(x^h_α − x_0)] = 0

we have
(P_h M P_h + αP_h)(x^h_α − x̂) = P_h α(x_0 − x̂) + P_h M(I − P_h)x̂.

Hence,

x^h_α − x̂ = [(P_h M P_h + αP_h)^{−1} P_h − (F′(x_0) + αI)^{−1}] α(x_0 − x̂)
            + (F′(x_0) + αI)^{−1} α(x_0 − x̂)
            + (P_h M P_h + αP_h)^{−1} P_h M(I − P_h)x̂
         = (P_h M P_h + αP_h)^{−1} P_h [F′(x_0) − M + M(I − P_h)] (F′(x_0) + αI)^{−1} α(x_0 − x̂)
            + (F′(x_0) + αI)^{−1} α(x_0 − x̂)
            + (P_h M P_h + αP_h)^{−1} P_h M(I − P_h)x̂
        := ζ_1 + ζ_2,                (10.3.1)

where
ζ_1 = (P_h M P_h + αP_h)^{−1} P_h [F′(x_0) − M + M(I − P_h)] (F′(x_0) + αI)^{−1} α(x_0 − x̂)
and
ζ_2 = (F′(x_0) + αI)^{−1} α(x_0 − x̂) + (P_h M P_h + αP_h)^{−1} P_h M(I − P_h)x̂.

Observe that

‖ζ_1‖ ≤ ‖(P_h M P_h + αP_h)^{−1} P_h ∫_0^1 [F′(x_0) − F′(x̂ + t(x^h_α − x̂))] dt (F′(x_0) + αI)^{−1} α(x_0 − x̂)‖
         + ‖(P_h M P_h + αP_h)^{−1} P_h M(I − P_h)(F′(x_0) + αI)^{−1} α(x_0 − x̂)‖
      ≤ ‖(P_h M P_h + αP_h)^{−1} P_h ∫_0^1 F′(x̂ + t(x^h_α − x̂))(P_h + I − P_h)
           Φ(x_0, x̂ + t(x^h_α − x̂), (F′(x_0) + αI)^{−1} α(x_0 − x̂)) dt‖ + (ε_h/α)ρ,

where, here and below, ε_h := ε_h(x̂ + t(x^h_α − x̂)). So

‖ζ_1‖ ≤ (1 + ε_h/α) K_0 ∫_0^1 [‖x_0 − P_h x_0‖ + ‖x̂ + t(x^h_α − x̂) − P_h x_0‖] dt
           · ‖(F′(x_0) + αI)^{−1} α(x_0 − x̂)‖ + (ε_h/α)ρ
      ≤ (1 + ε_h/α) K_0 [(b_0 + ‖x̂ − x_0 + x_0 − P_h x_0‖) ϕ(α) + (1/2)‖x^h_α − x̂‖ ρ] + (ε_h/α)ρ
      ≤ (1 + ε_h/α) K_0 [(2b_0 + ρ) ϕ(α) + (1/2)‖x^h_α − x̂‖ ρ] + (ε_h/α)ρ                (10.3.2)

and
‖ζ_2‖ ≤ ϕ(α) + (ε_h/α)‖x̂‖.                (10.3.3)

The result now follows from (10.3.1), (10.3.2) and (10.3.3).
Theorem 10.3.3. Let x^{h,δ}_{n,α} be as in (10.1.4), and suppose that the assumptions of Theorem
10.2.6 and Theorem 10.3.2 hold. Then

‖x^{h,δ}_{n,α} − x̂‖ ≤ (1 + b)/(1 − b²) γ_ρ b^{2n} + max{1, C̃} (ϕ(α) + (δ + ε_h)/α).

Proof. Observe that

‖x^{h,δ}_{n,α} − x̂‖ ≤ ‖x^{h,δ}_{n,α} − x^{h,δ}_α‖ + ‖x^{h,δ}_α − x^h_α‖ + ‖x^h_α − x̂‖,

so, by Proposition 10.3.1, Theorem 10.2.6 and Theorem 10.3.2 we obtain

‖x^{h,δ}_{n,α} − x̂‖ ≤ (1 + b)/(1 − b²) γ_ρ b^{2n} + δ/α + C̃ (ϕ(α) + ε_h/α)
              ≤ (1 + b)/(1 − b²) γ_ρ b^{2n} + max{1, C̃} (ϕ(α) + (δ + ε_h)/α).

Let
n_δ := min{ n : b^{2n} ≤ (δ + ε_h)/α }                (10.3.4)
and
C_0 := (1 + b)/(1 − b²) γ_ρ + max{1, C̃}.                (10.3.5)
Theorem 10.3.4. Let n_δ and C_0 be as in (10.3.4) and (10.3.5), respectively, let x^{h,δ}_{n_δ,α} be
as in (10.1.4), and let the assumptions of Theorem 10.3.3 be satisfied. Then

‖x^{h,δ}_{n_δ,α} − x̂‖ ≤ C_0 (ϕ(α) + (δ + ε_h)/α).                (10.3.6)

10.3.1. A Priori Choice of the Parameter

Let ψ(λ) := λ ϕ^{−1}(λ), 0 < λ ≤ a. Then the choice

α_δ = ϕ^{−1}(ψ^{−1}(δ + ε_h))

gives the optimal order error estimate (see [10]) for ϕ(α) + (δ + ε_h)/α. So the relation
(10.3.6) leads to the following.

Theorem 10.3.5. Let ψ(λ) := λ ϕ^{−1}(λ) for 0 < λ ≤ a, and let the assumptions of Theorem
10.3.4 hold. For δ > 0, let α_δ = ϕ^{−1}(ψ^{−1}(δ + ε_h)) and let n_δ be as in (10.3.4). Then

‖x^{h,δ}_{n_δ,α} − x̂‖ = O(ψ^{−1}(δ + ε_h)).

10.3.2. An Adaptive Choice of the Parameter

As in [10], the parameter α is chosen according to the balancing principle studied in [14],
[17], i.e., the parameter α is selected from the finite set

D_N(α) := {α_i = µ^i α_0, i = 0, 1, · · · , N},

where µ > 1 and α_0 > 0, and we let

n_i := min{ n : b^{2n} ≤ (δ + ε_h)/α_i }.

Then for i = 0, 1, · · · , N we have

‖x^{h,δ}_{n_i,α_i} − x^{h,δ}_{α_i}‖ ≤ C (δ + ε_h)/α_i,   ∀i = 0, 1, · · · , N.

Let x_i := x^{h,δ}_{n_i,α_i}. In this chapter we select α = α_i from D_N(α) for computing x_i, for each
i = 0, 1, · · · , N.

Theorem 10.3.6. (cf. [18], Theorem 3.1) Assume that there exists i ∈ {0, 1, 2, · · · , N} such
that ϕ(α_i) ≤ (δ + ε_h)/α_i. Let the assumptions of Theorem 10.3.4 and Theorem 10.3.5 hold,
and let

l := max{ i : ϕ(α_i) ≤ (δ + ε_h)/α_i } < N,

k := max{ i : ‖x_i − x_j‖ ≤ 4C_0 (δ + ε_h)/α_j, j = 0, 1, 2, · · · , i }.

Then l ≤ k and ‖x̂ − x_k‖ ≤ c ψ^{−1}(δ + ε_h), where c = 6C_0 µ.

10.4. Implementation of the Adaptive Choice Rule

The balancing algorithm associated with the choice of the parameter specified in Theorem
10.3.6 involves the following steps:

• Choose α_0 > 0 such that δ_0 < α_0, and µ > 1.

• Choose α_i := µ^i α_0, i = 0, 1, 2, · · · , N.

10.4.1. Algorithm

1. Set i = 0.

2. Choose n_i := min{ n : b^{2n} ≤ (δ + ε_h)/α_i }.

3. Compute x_i := x^{h,δ}_{n_i,α_i} using the iterations (10.1.3) and (10.1.4).

4. If ‖x_i − x_j‖ > 4C_0 (δ + ε_h)/α_j for some j < i, then take k = i − 1 and return x_k.

5. Otherwise, set i = i + 1 and go to Step 2.

10.5. Numerical Examples

In this section we consider the example studied in [18] to illustrate the algorithm
of Section 10.4. We apply the algorithm by choosing a sequence of finite dimen-
sional subspaces (V_n) of X with dim V_n = n + 1. Precisely, we choose V_n as the linear span
of {v_1, v_2, · · · , v_{n+1}}, where v_i, i = 1, 2, · · · , n + 1, are the linear splines on a uniform grid
of n + 1 points in [0, 1] (see [10] for details).

Example 10.5.1. (see [18], Section 4.3) Let F : D(F) ⊆ L²(0, 1) → L²(0, 1) be defined by

F(u) := ∫_0^1 k(t, s) u³(s) ds,

where
k(t, s) = { (1 − t)s,  0 ≤ s ≤ t ≤ 1;   (1 − s)t,  0 ≤ t ≤ s ≤ 1 }.

Then for all x(t), y(t) with x(t) > y(t),

⟨F(x) − F(y), x − y⟩ = ∫_0^1 ( ∫_0^1 k(t, s)(x³ − y³)(s) ds ) (x − y)(t) dt ≥ 0.

Thus the operator F is monotone. The Fréchet derivative of F is given by

F′(u)w = 3 ∫_0^1 k(t, s) u²(s) w(s) ds.                (10.5.1)

As in [10], one can see that F′ satisfies Assumption 10.1.2. In our computation, we take
f(t) = (t − t¹¹)/110 and f^δ = f + δ. Then the exact solution is

x̂(t) = t³.

We use
x_0(t) = t³ + (3/56)(t − t⁸)

as our initial guess, so that the function x_0 − x̂ satisfies the source condition

x_0 − x̂ = ϕ(F′(x̂))1,

where ϕ(λ) = λ.

For the operator F′(·) defined in (10.5.1), ε_h = O(n^{−2}) (cf. [11]). Thus we expect to
obtain the rate of convergence O((δ + ε_h)^{1/2}).

We choose α_0 = (1.1)(δ + ε_h), µ = 1.1, ρ = 0.11, γ_ρ = 0.7818 and b = 0.99. The re-
sults of the computation are presented in Table 10.5.1. The plots of the exact and
approximate solutions are given in Figures 10.5.1 and 10.5.2.

Example 10.5.2. Let X = Y = R, D = [0, ∞), x_0 = 1, and define the function F on D by

F(x) = x^{1+1/i}/(1 + 1/i) + c_1 x + c_2,                (10.5.2)

where c_1, c_2 are real parameters and i > 2 is an integer. Then F′(x) = x^{1/i} + c_1 is not
Lipschitz on D. However, Assumption 10.1.3 holds with K_0 = 1.

Figure 10.5.1. Curves of the exact and approximate solutions (panels n = 8, 16, 32, 64).

Figure 10.5.2. Curves of the exact and approximate solutions (panels n = 128, 256, 512, 1024).

Indeed, we have

|F′(x) − F′(x_0)| = |x^{1/i} − x_0^{1/i}| = |x − x_0| / (x_0^{(i−1)/i} + · · · + x^{(i−1)/i}),

so
|F′(x) − F′(x_0)| ≤ K_0 |x − x_0|.

Table 10.5.1. Iterations and corresponding error estimates

   n      k    n_k   δ + ε_h    α        ‖x_k − x̂‖   ‖x_k − x̂‖/(δ + ε_h)^{1/2}
   8      2    2     0.0134     0.0178   0.2217      1.9158
   16     2    2     0.0133     0.0178   0.1835      1.5885
   32     2    2     0.0133     0.0177   0.1383      1.1981
   64     2    2     0.0133     0.0177   0.0998      0.8647
   128    2    2     0.0133     0.0177   0.0699      0.6051
   256    30   2     0.0133     0.2559   0.0470      0.4070
   512    30   2     0.0133     0.2559   0.0290      0.2509
   1024   30   2     0.0133     0.2559   0.0121      0.1049

Example 10.5.3. We consider the integral equations

u(s) = f(s) + λ ∫_a^b G(s, t) u(t)^{1+1/n} dt,   n ∈ N.                (10.5.3)

Here, f is a given continuous function satisfying f(s) > 0 for s ∈ [a, b], λ is a real number,
and the kernel G is continuous and positive on [a, b] × [a, b].

For example, when G(s, t) is the Green kernel, the corresponding integral equation is
equivalent to the boundary value problem

u″ = λ u^{1+1/n},
u(a) = f(a),  u(b) = f(b).

These types of problems have been considered in [1]-[5].

Equations of the form (10.5.3) generalize equations of the form

u(s) = ∫_a^b G(s, t) u(t)^n dt                (10.5.4)

studied in [1]-[5]. Instead of (10.5.3) we can try to solve the equation F(u) = 0, where

F : Ω ⊆ C[a, b] → C[a, b],   Ω = {u ∈ C[a, b] : u(s) ≥ 0, s ∈ [a, b]},

and
F(u)(s) = u(s) − f(s) − λ ∫_a^b G(s, t) u(t)^{1+1/n} dt.

The norm we consider is the max-norm.

The derivative F′ is given by

F′(u)v(s) = v(s) − λ(1 + 1/n) ∫_a^b G(s, t) u(t)^{1/n} v(t) dt,   v ∈ Ω.

First of all, we notice that F′ does not satisfy a Lipschitz-type condition on Ω. Let us con-
sider, for instance, [a, b] = [0, 1], G(s, t) = 1 and y(t) = 0. Then F′(y)v(s) = v(s) and

‖F′(x) − F′(y)‖ = |λ|(1 + 1/n) ∫_0^1 x(t)^{1/n} dt.

If F′ were a Lipschitz function, then

‖F′(x) − F′(y)‖ ≤ L_1 ‖x − y‖,

or, equivalently, the inequality

∫_0^1 x(t)^{1/n} dt ≤ L_2 max_{s∈[0,1]} x(s)                (10.5.5)

would hold for all x ∈ Ω and some constant L_2. But this is not true. Consider, for example,
the functions

x_j(t) = t/j,   j ≥ 1,  t ∈ [0, 1].

If these are substituted into (10.5.5), we get

1/(j^{1/n}(1 + 1/n)) ≤ L_2/j  ⇔  j^{1−1/n} ≤ L_2(1 + 1/n),   ∀j ≥ 1.

This inequality fails as j → ∞.
Therefore, condition (10.5.5) is not satisfied in this case. However, Assumption 10.1.3
holds. To show this, let x_0(t) = f(t) and γ := min_{s∈[a,b]} f(s) > 0. Then for v ∈ Ω,

‖[F′(x) − F′(x_0)]v‖ = |λ|(1 + 1/n) max_{s∈[a,b]} | ∫_a^b G(s, t)(x(t)^{1/n} − f(t)^{1/n}) v(t) dt |
                     ≤ |λ|(1 + 1/n) max_{s∈[a,b]} ∫_a^b G_n(s, t) dt ‖v‖,

where
G_n(s, t) = G(s, t)|x(t) − f(t)| / (x(t)^{(n−1)/n} + x(t)^{(n−2)/n} f(t)^{1/n} + · · · + f(t)^{(n−1)/n}).

Hence,

‖[F′(x) − F′(x_0)]v‖ ≤ (|λ|(1 + 1/n)/γ^{(n−1)/n}) max_{s∈[a,b]} ∫_a^b G(s, t) dt ‖x − x_0‖ ‖v‖
                     ≤ K_0 ‖x − x_0‖ ‖v‖,

where K_0 = (|λ|(1 + 1/n)/γ^{(n−1)/n}) N and N = max_{s∈[a,b]} ∫_a^b G(s, t) dt. Then Assumption
10.1.3 holds for sufficiently small λ.
Example 10.5.4. Let X = D(F) = R, x_0 = 0, and define the function F on D(F) by

F(x) = d_0 x + d_1 + d_2 sin(e^{d_3 x}),

where d_0, d_1, d_2 and d_3 are given parameters. Then it can easily be seen that for d_3
sufficiently large and d_2 sufficiently small, K_0/k_0 can be arbitrarily small.
References

[1] Argyros, I.K., Convergence and Applications of Newton-type Iterations, Springer,
New York, 2008.

[2] Argyros, I.K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point. Rev. Anal. Numer. Theor. Approx.
36, (2007), 123-138.

[3] Argyros, I.K., A Semilocal convergence for directional Newton methods, Math. Com-
put. (AMS). 80, (2011), 327-343.

[4] Argyros, I.K., and Hilout, S., Weaker conditions for the convergence of Newton’s
method, J. Complexity, 28, (2012), 364-387.

[5] Argyros, I.K., Cho, Y.J., and Hilout, S., Numerical methods for equations and its
applications, CRC Press, Taylor and Francis, New York, 2012.

[6] George, S., Elmahdy, A.I., An analysis of Lavrentiev regularization for nonlinear ill-
posed problems using an iterative regularization method, Int. J. Comput. Appl. Math.,
5(3) (2010),369-381.

[7] George, S., Elmahdy, A.I., An iteratively regularized projection method for nonlinear
ill-posed problems”, Int. J. Contemp. Math. Sciences, 5 (52) (2010), 2547-2565.

[8] George, S., Elmahdy, A.I., A quadratic convergence yielding iterative method for non-
linear ill-posed operator equations, Comput. Methods Appl. Math., 12(1) (2012), 32-45.

[9] George, S., Elmahdy, A.I., An iteratively regularized projection method with quadratic
convergence for nonlinear ill-posed problems, Int. J. of Math. Analysis, 4(45) (2010),
2211-2228.

[10] George, S., Pareth, S., An application of Newton type iterative method for Lavren-
tiev regularization for ill-posed equations: Finite dimensional realization, IJAM, 42(3)
(2012), 164-170.

[11] Groetsch, C.W., King, J.T., Murio, D., Asymptotic analysis of a finite element method
for Fredholm equations of the first kind, in Treatment of Integral Equations by Nu-
merical Methods, Eds.: C.T.H. Baker and G.F. Miller, Academic Press, London, 1982,
1-11.

[12] Janno, J., Tautenhahn, U., On Lavrentiev regularization for ill-posed problems in
Hilbert scales, Numer. Funct. Anal. Optim., 24(5-6) (2003), 531-555.

[13] Kelley, C.T., Iterative methods for linear and nonlinear equations, SIAM Philadelphia,
1995.

[14] Mathé, P., Pereverzev, S.V., Geometry of linear ill-posed problems in variable Hilbert
scales, Inverse Problems, 19(3) (2003), 789-803.

[15] Mahale, P., Nair, M. T., Iterated Lavrentiev regularization for nonlinear ill-posed prob-
lems, ANZIAM Journal, 51 (2009), 191-217.

[16] Pareth, S., George, S., “Newton type methods for Lavrentiev regularization of nonlin-
ear ill-posed operator equations”, (2012), (submitted).

[17] Pereverzev, S.V., Schock, E., On the adaptive selection of the parameter in regulariza-
tion of ill-posed problems, SIAM J. Numer. Anal., 43 (2005), 2060-2076.

[18] Semenova, E.V., Lavrentiev regularization and balancing principle for solving ill-
posed problems with monotone operators, Comput. Methods Appl. Math., 4 (2010),
444-454.

[19] Tautenhahn, U., On the method of Lavrentiev regularization for nonlinear ill-posed
problems, Inverse Problems, 18 (2002), 191-207.
Chapter 11

New Approach to Relaxed Proximal Point Algorithms Based on A-Maximal Monotonicity

11.1. Introduction

Let X be a real Hilbert space with norm ‖·‖ and inner product ⟨·, ·⟩. Here we consider
the inclusion problem: find a solution to

0 ∈ M(x),                (11.1.1)

where M : X → 2^X is a set-valued mapping on X.


Based on the work of Rockafellar [11] on the proximal point algorithm and its applica-
tions to certain computational methods, Eckstein and Bertsekas [3] introduced the relaxed
proximal point algorithm and then applied it to the Douglas-Rachford splitting method
for finding a zero of the sum of two monotone operators. Furthermore, they showed that it
was, in fact, a special case of the proximal point algorithm. Fukushima [6] applied the pri-
mal Douglas-Rachford splitting method to a class of monotone operators with applications
to the traffic equilibrium problem.
Highly motivated by these algorithmic developments (see [1]-[18] and the references
therein), we generalize the relaxed proximal point algorithm based on the notions of A−
maximal monotonicity (also referred to as A−monotonicity in the literature [16]) and (A, η)−
maximal monotonicity (also referred to as (A, η)−monotonicity [15]) for solving general
inclusion problems in Hilbert space settings. These concepts generalize the general theory
of maximal monotone set-valued mappings in a Hilbert space setting. Our approach differs
significantly from the one used by Rockafellar [11], where a locally Lipschitz-type con-
dition on the mapping M^{−1} is imposed to achieve the convergence rate estimate. The main
ingredients of our approach are a more general framework for the relaxed proximal point
algorithm based on A−maximal monotonicity, and the treatment of the convergence rate as a
quadratic polynomial in terms of α_k, where {α_k} is a scalar sequence.

The notion of A−maximal monotonicity was introduced and studied by Verma [16]
in the context of solving variational inclusion problems using the resolvent operator tech-

nique; this work was followed by accelerated research developments. Furthermore,
it generalizes the existing theory of maximal monotone operators (based on the classical
resolvent), including the H−maximal monotonicity of Fang and Huang [4], which concerns
the generalization of classical maximal monotonicity. Fang and Huang [4] in-
troduced the notion of H−maximal monotonicity while investigating the solvability of a
general class of inclusion problems. They applied (H, η)−maximal monotonicity [5] in the
context of approximating the solutions of inclusion problems using the generalized resol-
vent operator technique. The generalized resolvent operator technique applies equally
effectively to several other problems, such as equilibrium problems in economics, global op-
timization and control theory, operations research, mathematical finance, management and
decision sciences, mathematical programming, and engineering science. For more details
on the resolvent operator technique, its applications, and further developments, we refer
the reader to [1]-[33] and the references therein.

11.2. A−Maximal Monotonicity and Auxiliary Results

In this section we discuss basic properties of, and auxiliary results on, A−maximal
monotonicity (also referred to as A−monotonicity in the literature) and its variant
forms. Let M : X → 2^X be a multivalued mapping on X. We shall denote both the map M
and its graph by M, that is, the set {(x, y) : y ∈ M(x)}. This is equivalent to stating that a
mapping is any subset M of X × X, with M(x) = {y : (x, y) ∈ M}. If M is single-valued, we
shall still use M(x) to represent the unique y such that (x, y) ∈ M, rather than the singleton
set {y}. This interpretation depends much on the context. The domain of a map M is
defined (as its projection onto the first argument) by

D(M) = {x ∈ X : ∃ y ∈ X : (x, y) ∈ M} = {x ∈ X : M(x) ≠ ∅};

dom(M) = X shall denote the full domain of M, and the range of M is defined by

R(M) = {y ∈ X : ∃ x ∈ X : (x, y) ∈ M}.

The inverse M^{−1} of M is {(y, x) : (x, y) ∈ M}. For a real number ρ and a mapping M, let
ρM = {(x, ρy) : (x, y) ∈ M}. If L and M are any mappings, we define

L + M = {(x, y + z) : (x, y) ∈ L, (x, z) ∈ M}.

Definition 11.2.1. Let M : X → 2^X be a multivalued mapping on X. The map M is said to
be:

(i) (r)−strongly monotone if there exists a positive constant r such that

⟨u* − v*, u − v⟩ ≥ r‖u − v‖²   ∀ (u, u*), (v, v*) ∈ graph(M);

(ii) (m)−relaxed monotone if there exists a positive constant m such that

⟨u* − v*, u − v⟩ ≥ (−m)‖u − v‖²   ∀ (u, u*), (v, v*) ∈ graph(M).



Definition 11.2.2. ([16]). Let A : X → X be a single-valued mapping. The map M : X → 2^X
is said to be A−maximal monotone if

(i) M is (m)−relaxed monotone for m > 0,

(ii) R(A + ρM) = X for ρ > 0.

Example 11.2.1. Let A : X → X be an (r)−strongly monotone mapping on X for r > 0.
Let f : X → R be a locally Lipschitz functional such that ∂f, the subdifferential of f, is
(m)−relaxed monotone, where m > 0. Then A + ∂f is (r − m)−strongly monotone for
r − m > 0, and it follows that A + ∂f is pseudomonotone, which is, in fact, maximal
monotone. This is equivalent to stating that ∂f is A−maximal monotone.

Definition 11.2.3. ([16]). Let A : X → X be an (r)−strongly monotone mapping and let
M : X → 2^X be an A−maximal monotone mapping. Then the generalized resolvent operator
J^M_{ρ,A} : X → X is defined by

J^M_{ρ,A}(u) = (A + ρM)^{−1}(u).

Definition 11.2.4. ([5]). Let H : X → X be a single-valued mapping. The map M : X → 2^X
is said to be H−maximal monotone if

(i) M is monotone,

(ii) R(H + ρM) = X for ρ > 0.

Definition 11.2.5. ([4]). Let H : X → X be an (r)−strongly monotone mapping and let
M : X → 2^X be an H−monotone mapping. Then the generalized resolvent operator
J^M_{ρ,H} : X → X is defined by

J^M_{ρ,H}(u) = (H + ρM)^{−1}(u).

Proposition 11.2.1. ([18]). Let A : X → X be an (r)−strongly monotone mapping and let
M : X → 2^X be an A−maximal monotone mapping. Then (A + ρM) is maximal monotone
for ρ > 0.

Proposition 11.2.2. ([18]). Let A : X → X be an (r)−strongly monotone mapping and let
M : X → 2^X be an A−maximal monotone mapping. Then the operator (A + ρM)^{−1} is
single-valued for r − ρm > 0.

Proposition 11.2.3. ([4]). Let H : X → X be an (r)−strongly monotone mapping and let
M : X → 2^X be an H−maximal monotone mapping. Then (H + ρM) is maximal monotone
for ρ > 0.

Proposition 11.2.4. ([4]). Let H : X → X be an (r)−strongly monotone mapping and let
M : X → 2^X be an H−maximal monotone mapping. Then the operator (H + ρM)^{−1} is
single-valued.

11.3. The Generalized Relaxed Proximal Point Algorithm

This section deals with a generalized version of the relaxed proximal point algorithm and
its applications to the approximation solvability of the inclusion problem (11.1.1) based on
A−maximal monotonicity.

Lemma 11.3.1. ([18]) Let X be a real Hilbert space, let A : X → X be (r)−strongly mono-
tone, and let M : X → 2^X be A−maximal monotone. Then the generalized resolvent opera-
tor associated with M and defined by

J^M_{ρ,A}(u) = (A + ρM)^{−1}(u)   ∀ u ∈ X

is (1/(r − ρm))−Lipschitz continuous for r − ρm > 0.

Lemma 11.3.2. Let X be a real Hilbert space, let A : X → X be (r)−strongly monotone
and (s)−Lipschitz continuous, and let M : X → 2^X be A−maximal monotone. Then the
generalized resolvent operator associated with M and defined by

J^M_{ρ,A}(u) = (A + ρM)^{−1}(u)   ∀ u ∈ X

satisfies

‖J^M_{ρ,A}(A(u)) − J^M_{ρ,A}(A(v))‖ ≤ (1/(r − ρm)) ‖A(u) − A(v)‖,                (11.3.1)

where r − ρm > 0.

Theorem 11.3.3. Let X be a real Hilbert space, let A : X → X be (r)−strongly monotone,
and let M : X → 2^X be A−maximal monotone. Then the following statements are equiva-
lent:

(i) An element u ∈ X is a solution of (11.1.1).

(ii) The element u ∈ X satisfies

u = J^M_{ρ,A}(A(u)),

where
J^M_{ρ,A}(u) = (A + ρM)^{−1}(u).

Theorem 11.3.4. Let X be a real Hilbert space, let H : X → X be (r)−strongly mono-
tone, and let M : X → 2^X be H−maximal monotone. Then the following statements are
equivalent:

(i) An element u ∈ X is a solution of (11.1.1).

(ii) The element u ∈ X satisfies

u = J^M_{ρ,H}(H(u)),

where
J^M_{ρ,H}(u) = (H + ρM)^{−1}(u).

Lemma 11.3.5. Let X be a real Hilbert space, let A : X → X be (r)−strongly monotone
and (s)−Lipschitz continuous, and let M : X → 2^X be A−maximal monotone. Then

⟨(J^M_{ρ,A} ∘ A)(u) − (J^M_{ρ,A} ∘ A)(v), A(u) − A(v)⟩ ≤ (1/(r − ρm)) ‖A(u) − A(v)‖²   ∀ u, v ∈ X,

where r − ρm > 0.

Lemma 11.3.6. Let X be a real Hilbert space, let H : X → X be (r)−strongly monotone
and (s)−Lipschitz continuous, and let M : X → 2^X be H−maximal monotone. Then

⟨(J^M_{ρ,H} ∘ H)(u) − (J^M_{ρ,H} ∘ H)(v), H(u) − H(v)⟩ ≤ (1/r) ‖H(u) − H(v)‖²   ∀ u, v ∈ X.

In the following theorem, we apply the generalized relaxed proximal point algorithm to
approximate the solution of (11.1.1) and, as a result, we achieve linear convergence.

Theorem 11.3.7. Let X be a real Hilbert space, let A : X → X be (r)−strongly mono-
tone and (s)−Lipschitz continuous, and let M : X → 2^X be A−maximal monotone. For
an arbitrarily chosen initial point x_0, suppose that the sequence {x_k} is generated by the
generalized proximal point algorithm

A(x_{k+1}) = (1 − α_k)A(x_k) + α_k y_k   ∀ k ≥ 0,                (11.3.2)

where y_k satisfies

‖y_k − A(J^M_{ρ_k,A}(A(x_k)))‖ ≤ δ_k ‖y_k − A(x_k)‖,

with J^M_{ρ_k,A} = (A + ρ_k M)^{−1}, and

{δ_k}, {α_k}, {ρ_k} ⊆ [0, ∞)

are scalar sequences such that for γ ∈ (0, 1/2), α_k ≤ γ,

r − ρ_k m ≥ 1 + 2γ²(s² − 1) / (1 − 2γ + √((1 − 2γ)² − 4γ⁴(s² − 1))),                (11.3.3)

1 < s ≤ √(1 + ((1 − 2γ)/(2γ²))²),                (11.3.4)

Σ_{k=0}^∞ δ_k < ∞, δ_k → 0, α = lim sup_{k→∞} α_k, and ρ = lim sup_{k→∞} ρ_k.

Then the sequence {x_k} converges linearly to a solution x* of (11.1.1) with the conver-
gence rate

θ_k = √( (1 − α_k)² + 2α_k(1 − α_k)/(r − ρ_k m) + α_k² s²/(r − ρ_k m)² )
    = (1/(r − ρ_k m)) √( [s² + (r − ρ_k m)² − 2(r − ρ_k m)] α_k² − 2[1 − (r − ρ_k m)] α_k + 1 )
    = (1/(r − ρ_k m)) √(P_k(α_k)) ∈ (0, 1),

where

P_k(α_k) = [s² + (r − ρ_k m)² − 2(r − ρ_k m)] α_k² − 2[1 − (r − ρ_k m)] α_k + 1
         = [1 − α_k(r − ρ_k m − 1)]² + α_k²(s² − 1).

Proof. Note that it follows from hypotheses (11.3.3) and (11.3.4) that θ_k ∈ (0, 1). Suppose
that x* is a zero of M. Then, from Theorem 11.3.3, it follows that any solution of (11.1.1) is
a fixed point of J^M_{ρ_k,A} ∘ A. For all k ≥ 0, we write

A(z_{k+1}) = (1 − α_k)A(x_k) + α_k A(J^M_{ρ_k,A}(A(x_k))).

Next, applying Lemma 11.3.2 and Lemma 11.3.5, we find the estimate

‖A(z_{k+1}) − A(x*)‖²
  = ‖(1 − α_k)A(x_k) + α_k A(J^M_{ρ_k,A}(A(x_k))) − [(1 − α_k)A(x*) + α_k A(J^M_{ρ_k,A}(A(x*)))]‖²
  = ‖(1 − α_k)(A(x_k) − A(x*)) + α_k (A(J^M_{ρ_k,A}(A(x_k))) − A(J^M_{ρ_k,A}(A(x*))))‖²
  = (1 − α_k)² ‖A(x_k) − A(x*)‖²
    + 2α_k(1 − α_k) ⟨A(x_k) − A(x*), A(J^M_{ρ_k,A}(A(x_k))) − A(J^M_{ρ_k,A}(A(x*)))⟩
    + α_k² ‖A(J^M_{ρ_k,A}(A(x_k))) − A(J^M_{ρ_k,A}(A(x*)))‖²
  ≤ (1 − α_k)² ‖A(x_k) − A(x*)‖² + 2α_k(1 − α_k)/(r − ρ_k m) ‖A(x_k) − A(x*)‖²
    + α_k² s² ‖J^M_{ρ_k,A}(A(x_k)) − J^M_{ρ_k,A}(A(x*))‖²
  ≤ (1 − α_k)² ‖A(x_k) − A(x*)‖² + 2α_k(1 − α_k)/(r − ρ_k m) ‖A(x_k) − A(x*)‖²
    + α_k² s²/(r − ρ_k m)² ‖A(x_k) − A(x*)‖²
  = [ (1 − α_k)² + 2α_k(1 − α_k)/(r − ρ_k m) + α_k² s²/(r − ρ_k m)² ] ‖A(x_k) − A(x*)‖²
  = (1/(r − ρ_k m)²)[ (s² + (r − ρ_k m)² − 2(r − ρ_k m)) α_k² − 2(1 − (r − ρ_k m)) α_k + 1 ]
    · ‖A(x_k) − A(x*)‖²
  = θ_k² ‖A(x_k) − A(x*)‖²,

where
θ_k² = P_k(α_k)/(r − ρ_k m)².

Thus, we have

‖A(z_{k+1}) − A(x*)‖ ≤ θ_k ‖A(x_k) − A(x*)‖ = (1/(r − ρ_k m)) √(P_k(α_k)) ‖A(x_k) − A(x*)‖.                (11.3.5)

Since A(x_{k+1}) = (1 − α_k)A(x_k) + α_k y_k, we have A(x_{k+1}) − A(x_k) = α_k(y_k − A(x_k)).
It follows that

‖A(x_{k+1}) − A(z_{k+1})‖ = ‖(1 − α_k)A(x_k) + α_k y_k − [(1 − α_k)A(x_k) + α_k A(J^M_{ρ_k,A}(A(x_k)))]‖
                        = ‖α_k (y_k − A(J^M_{ρ_k,A}(A(x_k))))‖
                        ≤ α_k δ_k ‖y_k − A(x_k)‖.

Next, we estimate, using the above arguments, that

‖A(x_{k+1}) − A(x*)‖ ≤ ‖A(z_{k+1}) − A(x*)‖ + ‖A(x_{k+1}) − A(z_{k+1})‖
                    ≤ ‖A(z_{k+1}) − A(x*)‖ + α_k δ_k ‖y_k − A(x_k)‖
                    = ‖A(z_{k+1}) − A(x*)‖ + δ_k ‖A(x_{k+1}) − A(x_k)‖
                    ≤ ‖A(z_{k+1}) − A(x*)‖ + δ_k ‖A(x_{k+1}) − A(x*)‖ + δ_k ‖A(x_k) − A(x*)‖.                (11.3.6)

This implies, from (11.3.6) on applying (11.3.5), that

(1 − δ_k)‖A(x_{k+1}) − A(x*)‖ ≤ ‖A(z_{k+1}) − A(x*)‖ + δ_k ‖A(x_k) − A(x*)‖
                             ≤ θ_k ‖A(x_k) − A(x*)‖ + δ_k ‖A(x_k) − A(x*)‖
                             = (θ_k + δ_k) ‖A(x_k) − A(x*)‖.                (11.3.7)

Therefore,
‖A(x_{k+1}) − A(x*)‖ ≤ ((θ_k + δ_k)/(1 − δ_k)) ‖A(x_k) − A(x*)‖,                (11.3.8)

where
lim sup (θ_k + δ_k)/(1 − δ_k) = lim sup θ_k = lim sup (1/(r − ρ_k m)) √(P_k(α_k)).                (11.3.9)
P_k is a quadratic polynomial for each k whose leading coefficient

s² + (r − ρ_k m)² − 2(r − ρ_k m) = [1 − (r − ρ_k m)]² + s² − 1

is positive, since s > 1. Hence, each P_k attains a minimum, whose value is

( [s² + (r − ρ_k m)² − 2(r − ρ_k m)] − [1 − (r − ρ_k m)]² ) / ( s² + (r − ρ_k m)² − 2(r − ρ_k m) )
   = 1 − [1 − (r − ρ_k m)]² / ( [1 − (r − ρ_k m)]² + s² − 1 ) < 1.

Now, it follows from (11.3.8), in light of (11.3.9), that the sequence {A(x_k)} converges to
A(x*). On the other hand, since A is (r)−strongly monotone (and hence ‖A(x) − A(y)‖ ≥
r‖x − y‖), we have

‖x_k − x*‖ ≤ (1/r) ‖A(x_k) − A(x*)‖ → 0,                (11.3.10)

which completes the proof.

Corollary 11.3.8. Let X be a real Hilbert space, let H : X → X be (r)−strongly mono-
tone and (s)−Lipschitz continuous, and let M : X → 2^X be H−maximal monotone. For
an arbitrarily chosen initial point x_0, suppose that the sequence {x_k} is generated by the
generalized proximal point algorithm

H(x_{k+1}) = (1 − α_k)H(x_k) + α_k y_k   ∀ k ≥ 0,                (11.3.11)

where y_k satisfies

‖y_k − H(J^M_{ρ_k,H}(H(x_k)))‖ ≤ δ_k ‖y_k − H(x_k)‖,

with J^M_{ρ_k,H} = (H + ρ_k M)^{−1}, and

{δ_k}, {α_k}, {ρ_k} ⊆ [0, ∞)

are scalar sequences such that for γ ∈ (0, 1/2), α_k ≤ γ,

r ≥ 1 + 2γ²(s² − 1) / (1 − 2γ + √((1 − 2γ)² − 4γ⁴(s² − 1))),

1 < s ≤ √(1 + ((1 − 2γ)/(2γ²))²).

Then the sequence {x_k} converges linearly to a solution of (11.1.1) with convergence
rate

θ_k = (1/r) √( (s² + r² − 2r)α_k² − 2(1 − r)α_k + 1 ),

where Σ_{k=0}^∞ δ_k < ∞, δ_k → 0, α = lim sup_{k→∞} α_k, and ρ = lim sup_{k→∞} ρ_k.

11.4. An Application

Let X be a real Hilbert space and let f : X → R be a locally Lipschitz functional on X. We
consider the inclusion problem: determine a solution to

0 ∈ ∂f(x),                (11.4.1)

where ∂f : X → 2^X is a set-valued mapping on X. It turns out that A + ∂f is
(r − m)−strongly monotone for r − m > 0 if A : X → X is (r)−strongly monotone and
∂f : X → 2^X is (m)−relaxed monotone. This is equivalent to stating that ∂f is A−maximal
monotone. Since all the conditions of Theorem 11.3.7 are then satisfied, one can apply
Theorem 11.3.7 to the solvability of (11.4.1) in the following form:

Theorem 11.4.1. Let X be a real Hilbert space, and let A : X → X be (r)−strongly monotone
and (s)−Lipschitz continuous. Let f : X → R be a locally Lipschitz functional on X, and let
∂f : X → 2^X be A−maximal monotone. For an arbitrarily chosen initial point x_0, suppose
that the sequence {x_k} is generated by the generalized proximal point algorithm

A(x_{k+1}) = (1 − α_k)A(x_k) + α_k y_k   ∀ k ≥ 0,                (11.4.2)

where y_k satisfies

‖y_k − A(J^{∂f}_{ρ_k,A}(A(x_k)))‖ ≤ δ_k ‖y_k − A(x_k)‖,

with J^{∂f}_{ρ_k,A} = (A + ρ_k ∂f)^{−1}, and

{δ_k}, {α_k}, {ρ_k} ⊆ [0, ∞)

are scalar sequences such that for γ ∈ (0, 1/2), α_k ≤ γ,

r − ρ_k m ≥ 1 + 2γ²(s² − 1) / (1 − 2γ + √((1 − 2γ)² − 4γ⁴(s² − 1))),

1 < s ≤ √(1 + ((1 − 2γ)/(2γ²))²).

Then the sequence {x_k} converges linearly to a solution of (11.4.1) with the convergence rate
given in Theorem 11.3.7.
References

[1] Agarwal, R.P., Verma,R.U., General system of (A, η)−maximal relaxed monotone
variational inclusion problems based on generalized hybrid algorithms, Communica-
tions in Nonlinear Science and Numerical Simulations 15 (2010), 238–251.

[2] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and its Appli-
cations, CRC Press, Taylor & Francis, New York, 2012.

[21] Dhage, B.C., Verma, R.U., Second order boundary value problems of discontinuous
differential inclusions, Communications on Applied Nonlinear Analysis 12(3) (2005),
37-44.

[3] Eckstein, J., Bertsekas, D.P., On the Douglas-Rachford splitting method and the proxi-
mal point algorithm for maximal monotone operators, Mathematical Programming 55
(1992), 293–318.

[4] Fang, Y.P., Huang, N.J., H− monotone operators and system of variational inclusions,
Communications on Applied Nonlinear Analysis 11(1) (2004), 93–101.

[5] Fang, Y.P., Huang, N.J., Thompson, H.B., A new system of variational inclusions with
(H, η)− monotone operators, Computers and Mathematics with Applications 49(2-3)
(2005), 365–374.

[6] Fukushima, M., The primal Douglas-Rachford splitting algorithm for a class of mono-
tone operators with applications to the traffic equilibrium problem, Mathematical Pro-
gramming 72(1996), 1–15.

[7] Glowinski, R., Le Tallec, P., Augmented Lagrangians and Operator-Splitting Methods
in Continuum Mechanics, SIAM, Philadelphia, PA, 1989.

[8] Lan, H.Y., Kim, J.H., Cho, Y.J., On a new class of nonlinear A−monotone multivalued
variational inclusions, Journal of Mathematical Analysis and Applications 327(1)
(2007), 481–493.

[9] Moudafi, A., Mixed equilibrium problems: Sensitivity analysis and algorithmic as-
pect, Computers and Mathematics with Applications 44 (2002), 1099-1108.

[10] Robinson, S.M., Composition duality and maximal monotonicity, Mathematical Pro-
gramming 85 (1999), 1-13.

[11] Rockafellar, R.T., Monotone operators and the proximal point algorithm, SIAM Jour-
nal of Control and Optimization 14 (1976), 877-898.

[12] Rockafellar, R.T., Augmented Lagrangians and applications of the proximal point al-
gorithm in convex programming, Mathematics of Operations Research 1 (1976),
97-116.

[13] Tseng, P, Alternating projection-proximal methods for convex programming and vari-
ational inequalities, SIAM Journal of Optimization 7 (1997), 951–965.

[14] Tseng, P., A modified forward-backward splitting method for maximal monotone
mappings, SIAM Journal of Control and Optimization 38 (2000), 431–446.

[15] Verma, R.U., Sensitivity analysis for generalized strongly monotone variational inclu-
sions based on the (A, η)− resolvent operator technique, Applied Mathematics Letters
19 (2006), 1409–1413.

[16] Verma, R.U., A− monotonicity and its role in nonlinear variational inclusions, Journal
of Optimization Theory and Applications 129(3) (2006), 457-467.

[17] Verma, R.U., General system of A− monotone nonlinear variational inclusion prob-
lems, Journal of Optimization Theory and Applications 131(1) (2006), 151-157.

[18] Verma, R.U., A− monotone nonlinear relaxed cocoercive variational inclusions, Cen-
tral European Journal of Mathematics 5(2) (2007), 1-11.

[19] Verma, R.U., General system of (A, η)−monotone variational inclusion problems
based on generalized hybrid algorithm, Nonlinear Analysis: Hybrid Systems 1 (3)
(2007), 326-335.

[20] Verma, R.U., Approximation solvability of a class of nonlinear set-valued inclusions


involving (A, η)− monotone mappings, Journal of Mathematical Analysis and Appli-
cations 337 (2008), 969-975.

[22] Verma, R.U., Auxiliary problem principle and its extension applied to variational in-
equalities, Mathematical Sciences Research Hot-Line 4(2) (2000), 55-63.

[23] Verma, R.U., On a class of nonlinear variational inequalities involving partially re-
laxed monotone and partially strongly monotone mappings, Mathematical Sciences
Research Hot-Line 3 (12) (1999), 7-26.

[24] Verma, R.U., A class of generalized implicit variational inequality type algorithms and
their applications, Mathematical Sciences Research Hot-Line 4 (3) (2000), 17-30.

[25] Verma, R.U., RKKM mappings and their applications, Mathematical Sciences Re-
search Hot-Line 4(10) (2000), 23–27.

[26] Verma, R.U., A class of new minimax inequalities in generalized H-spaces, Mathe-
matical Sciences Research Hot-Line 4(10) (2000), 29-32.

[27] Verma, R.U., On a system of nonlinear variational inequalities involving multivalued


mappings, Mathematical Sciences Research Hot-Line 4(12) (2000), 21-31.

[28] Verma, R.U., A class of generalized iterative algorithms nonlinear quasivariational


inequalities, Mathematical Sciences Research Hot-Line 4(12) (2000), 33-45.

[29] Verma, R.U., Averaging techniques and cocoercively monotone mappings, Mathemat-
ical Sciences Research Journal 10(3) (2006), 79-82.

[30] Verma, R.U., General class of implicit variational inclusions and graph convergence
on A−maximal relaxed monotonicity, Journal of Optimization Theory and Applica-
tions 155(1) (2012), 196-214.

[31] Xu, H.K., Iterative algorithms for nonlinear operators, Journal of London Mathemat-
ical Society 66 (2002), 240-256.

[32] Zeidler, E., Nonlinear Functional Analysis and its Applications I, Springer-Verlag,
New York, New York, 1986.

[33] Zeidler, E., Nonlinear Functional Analysis and its Applications II/A, Springer-Verlag,
New York, New York, 1990.
Chapter 12

Newton-Type Iterative Methods for


Nonlinear Ill-Posed
Hammerstein-Type Equations

12.1. Introduction
This chapter is devoted to the study of non-linear ill-posed Hammerstein type operator
equations. Recall that ([13, 14, 15, 16]) an equation of the form

(KF)x = y (12.1.1)

is called a nonlinear ill-posed Hammerstein-type operator equation. Here F : D(F) ⊆ X →
Z is a nonlinear operator, K : Z → Y is a bounded linear operator, and X, Z, Y are Hilbert
spaces with corresponding inner products ⟨·, ·⟩_X, ⟨·, ·⟩_Z, ⟨·, ·⟩_Y and norms ‖·‖_X, ‖·‖_Z, ‖·‖_Y,
respectively. A typical example of a Hammerstein-type operator is the nonlinear integral
operator

(Ax)(t) := ∫_0^1 k(s, t) f(s, x(s)) ds,

where k(s, t) ∈ L²([0, 1] × [0, 1]), x ∈ L²[0, 1] and t ∈ [0, 1].

where k(s,t) ∈ L2 ([0, 1] × [0, 1]), x ∈ L2 [0, 1] and t ∈ [0, 1].


The above integral operator A admits a representation of the form A = KF where K :
L2 [0, 1] → L2 [0, 1] is a linear integral operator with kernel k(t, s) : defined as
Z 1
Kx(t) = k(t, s)x(s)ds
0

and F : D(F) ⊆ L2 [0, 1] → L2 [0, 1] is a nonlinear superposition operator (cf. [24]) defined
as
Fx(s) = f (s, x(s)). (12.1.2)
The first author and his collaborators ([13, 14, 15, 16]), studied ill-posed Hammerstein type
equation extensively under some assumptions on the Fréchet derivative of F. Precisely, in
[13, 15], it is assumed that F 0 (x0 )−1 exists and in [16] it is assumed that F 0 (x)−1 exists for
all x in a ball of radius r around x0 .

Note that if the function f in (12.1.2) is differentiable with respect to the second variable
and ∂₂f(t, x(t)) ≥ κ₁ > 0 for all x ∈ B_r(x_0) and t ∈ [0, 1], then F′(u)^{−1} exists and is a
bounded operator for all u ∈ B_r(x_0) (see Remark 2.1 in [15]); here ∂₂f(t, s) denotes the
partial derivative of f with respect to the second variable.
Throughout this chapter it is assumed that the available data is yδ with

ky − yδ kY ≤ δ

and hence one has to consider the equation

(KF)x = yδ (12.1.3)

instead of (12.1.1). Observe that the solution x of (12.1.3) can be obtained by solving

Kz = yδ (12.1.4)

for z and then solving the non-linear problem

F(x) = z. (12.1.5)

In [16], for solving (12.1.5), George and Kunhanandan considered the sequence defined
iteratively by

x^δ_{n+1,α} = x^δ_{n,α} − F′(x^δ_{n,α})^{−1}(F(x^δ_{n,α}) − z^δ_α),

where x^δ_{0,α} := x_0,

z^δ_α = (K*K + αI)^{−1} K*(y^δ − KF(x_0)) + F(x_0),                (12.1.6)

and obtained local quadratic convergence.
Recall that a sequence (x_n) in X with lim x_n = x* is said to be convergent of order p > 1
if there exist positive reals c_1, c_2 such that for all n ∈ N

‖x_n − x*‖_X ≤ c_1 e^{−c_2 pⁿ}.

If the sequence (x_n) has the property that ‖x_n − x*‖_X ≤ c_1 qⁿ, 0 < q < 1, then (x_n) is said
to be linearly convergent. For an extensive discussion of convergence rates see Kelley [23].
In [15], George and Nair studied the modified Lavrentiev regularization

z^δ_α = (K + αI)^{−1}(y^δ − KF(x_0))

for obtaining an approximate solution of (12.1.4), and introduced the modified Newton
iterations

x^δ_{n,α} = x^δ_{n−1,α} − F′(x_0)^{−1}(F(x^δ_{n−1,α}) − F(x_0) − z^δ_α)

for solving (12.1.5), obtaining local linear convergence. In fact, in [15] and [16], a solution
x̂ of (12.1.1) is called an x_0-minimum norm solution if it satisfies

‖F(x̂) − F(x_0)‖_Z := min{‖F(x) − F(x_0)‖_Z : KF(x) = y, x ∈ D(F)}.                (12.1.7)

We also assume throughout that the solution x̂ satisfies (12.1.7). In all these papers ([13, 14,
15, 16]), it is assumed that the ill-posedness of (12.1.1) is due to the nonclosedness of the
operator K. In this chapter we consider two cases:

Case (1): F′(x_0)^{−1} exists and is a bounded operator, i.e., (12.1.5) is regular.

Case (2): F is monotone ([26], [31]), Z = X is a real Hilbert space and F′(x_0)^{−1} does
not exist, i.e., (12.1.5) is also ill-posed.

The case when F is not monotone and F′(x_0)^{−1} does not exist is the subject matter of
the forthcoming chapter.

One of the advantages of (approximately) solving (12.1.4) and (12.1.5) to obtain an
approximate solution of (12.1.3) is that one can use any regularization method ([8, 22])
for linear ill-posed equations to solve (12.1.4), and any iterative method ([10, 12]) to
solve (12.1.5). In fact, in this chapter we consider Tikhonov regularization ([11, 13, 16,
19, 20]) for approximately solving (12.1.4), and a modified two-step Newton
method ([1, 6, 7, 9, 21, 25]) for solving (12.1.5). Note that the regularization parameter α
is chosen according to the adaptive method considered by Pereverzev and Schock in [28]
for linear ill-posed operator equations, and the same parameter α is used for solving the
nonlinear operator equation (12.1.5); thus the choice of the regularization parameter does
not depend on the nonlinear operator F, which is another advantage over treating (12.1.3)
as a single nonlinear operator equation.
This chapter is organized as follows. Preparatory results are given in Section 12.2, and
Section 12.3 comprises the proposed iterative method for case (1) and case (2). Section
12.4 deals with the algorithm for implementing the proposed method. Numerical examples
are given in Section 12.5. Finally, the chapter ends with a conclusion in Section 12.6.

12.2. Preparatory Results

In this section we consider the Tikhonov regularized solution z^δ_α defined in (12.1.6) and
obtain a priori and a posteriori error estimates for ‖F(x̂) − z^δ_α‖_Z. The following assumption
is required to obtain the error estimate.

Assumption 12.2.1. There exists a continuous, strictly monotonically increasing function
ϕ : (0, a] → (0, ∞) with a ≥ ‖K*K‖_{Y→X} satisfying:

• lim_{λ→0} ϕ(λ) = 0,

• sup_{λ≥0} αϕ(λ)/(λ + α) ≤ ϕ(α), ∀α ∈ (0, a],

and

• there exists v ∈ X, ‖v‖_X ≤ 1, such that

F(x̂) − F(x_0) = ϕ(K*K)v.

Theorem 12.2.2. (see (4.3) in [16]) Let z^δ_α be as in (12.1.6) and let Assumption 12.2.1 hold.
Then
‖F(x̂) − z^δ_α‖_Z ≤ ϕ(α) + δ/√α.                (12.2.1)

12.2.1. A Priori Choice of the Parameter

Note that the estimate ϕ(α) + δ/√α in (12.2.1) is of optimal order for the choice α := α_δ
which satisfies ϕ(α_δ) = δ/√α_δ. Let ψ(λ) := λ √(ϕ^{−1}(λ)), 0 < λ ≤ ‖K‖². Then
δ = √α_δ ϕ(α_δ) = ψ(ϕ(α_δ)) and

α_δ = ϕ^{−1}(ψ^{−1}(δ)).

So the relation (12.2.1) leads to ‖F(x̂) − z^δ_α‖_Z ≤ 2ψ^{−1}(δ).

12.2.2. An Adaptive Choice of the Parameter

In this chapter, we propose to choose the parameter α according to the adaptive choice
established by Pereverzev and Schock [28] for solving ill-posed problems. We denote by
D_M the set of possible values of the parameter α:

D_M = {α_i = α_0 µ^{2i}, i = 0, 1, 2, . . ., M},   µ > 1.

Then the selection of the numerical value k for the parameter α according to the adaptive
choice is performed using the rule

k := max{ i : α_i ∈ D⁺_M },                (12.2.2)

where D⁺_M = { α_i ∈ D_M : ‖z^δ_{α_i} − z^δ_{α_j}‖_Z ≤ 4δ/√α_j, j = 0, 1, 2, . . ., i − 1 }. Let

l := max{ i : ϕ(α_i) ≤ δ/√α_i }.                (12.2.3)
We will be using the following theorem from [16] in our error analysis.

Theorem 12.2.3. (cf. [16], Theorem 4.3) Let l be as in (12.2.3), k be as in (12.2.2) and
z^δ_{α_k} be as in (12.1.6) with α = α_k. Then l ≤ k and

‖F(x̂) − z^δ_{α_k}‖_Z ≤ (2 + 4µ/(µ − 1)) µ ψ^{−1}(δ).

12.3. Convergence Analysis

Throughout this chapter we assume that the operator F possesses a uniformly bounded
Fréchet derivative F′(·) for all x ∈ D(F). In the earlier papers [16, 17, 18] the authors
used the following assumption:

Assumption 12.3.1. (cf. [30], Assumption 3 (A3)) There exists a constant K_0 ≥ 0 such that
for every x, u ∈ B_r(x_0) ∪ B_r(x̂) ⊆ D(F) and v ∈ X there exists an element Φ(x, u, v) ∈ X
such that [F′(x) − F′(u)]v = F′(u)Φ(x, u, v), ‖Φ(x, u, v)‖_X ≤ K_0 ‖v‖_X ‖x − u‖_X.

The hypotheses of Assumption 12.3.1 may not hold, or may be very expensive or impos-
sible to verify in general. In particular, as is the case for well-posed nonlinear equations,
the computation of the Lipschitz constant K_0, even if this constant exists, is very difficult.
Moreover, there are classes of operators for which Assumption 12.3.1 is not satisfied but
the iterative method converges.

In the present chapter, we expand the applicability of the Newton-type iterative method
under less computational cost. We achieve this goal through the following weaker assumption.

Assumption 12.3.2. Let x_0 ∈ X be fixed. There exists a constant k_0 such that for every
u ∈ B_r(x_0) ⊆ D(F) and v ∈ X, there exists an element Φ_0(x_0, u, v) ∈ X satisfying

[F′(x_0) − F′(u)]v = F′(x_0)Φ_0(x_0, u, v),   ‖Φ_0(x_0, u, v)‖_X ≤ k_0 ‖v‖_X ‖x_0 − u‖_X.

Note that
k_0 ≤ K_0

holds in general and K_0/k_0 can be arbitrarily large. The advantages of the new approach are:
(1) Assumption 12.3.2 is weaker than Assumption 12.3.1. Notice that there are classes
of operators that satisfy Assumption 12.3.2 but do not satisfy Assumption 12.3.1;

(2) The computational cost of finding the constant k0 is less than that of constant K0 ,
even when K0 = k0 ;

(3) The sufficient convergence criteria are weaker;

(4) The computable error bounds on the distances involved (including k0 ) are less costly
and more precise than the old ones (including K0 );

(5) The information on the location of the solution is more precise;


and
(6) The convergence domain of the iterative method is larger.
These advantages are also very important in computational mathematics since they pro-
vide, under less computational cost, a wider choice of initial guesses for the iterative method
and the computation of fewer iterates to achieve a desired error tolerance. Numerical
examples for (1)-(6) are presented in Section 12.5.

12.3.1. Iterative Method for Case (1)

In this subsection, for an initial guess x_0 ∈ X, we consider the sequence v^δ_{n,α_k} defined
iteratively by

v^δ_{n+1,α_k} = v^δ_{n,α_k} − F′(x_0)^{−1}(F(v^δ_{n,α_k}) − z^δ_{α_k}),

where v^δ_{0,α_k} = x_0, for obtaining an approximation x^δ_{α_k} of x such that F(x) = z^δ_{α_k}.

Let
y^δ_{n,α_k} = v^δ_{2n−1,α_k}                (12.3.1)
and
x^δ_{n+1,α_k} = v^δ_{2n,α_k},                (12.3.2)

for n > 0. We will be using the following notation:

M ≥ ‖F′(x_0)‖_{X→Z};

β := ‖F′(x_0)^{−1}‖_{Z→X};

k_0 < (1/4) min{1, 1/β};

δ_0 < α_0/(4k_0 β);

ρ := (1/M)(1/(4k_0 β) − δ_0/√α_0);

γ_ρ := β[Mρ + δ_0/√α_0];

and

e^δ_{n,α_k} := ‖y^δ_{n,α_k} − x^δ_{n,α_k}‖_X,   ∀n ≥ 0.                (12.3.3)

For convenience, we use the notation x_n, y_n and e_n for x^δ_{n,α_k}, y^δ_{n,α_k} and e^δ_{n,α_k},
respectively. Further, we define
q := k_0 r,   r ∈ (r_1, r_2),                (12.3.4)
where
r_1 = (1 − √(1 − 4k_0 γ_ρ))/(2k_0)
and
r_2 = min{ 1/k_0, (1 + √(1 − 4k_0 γ_ρ))/(2k_0) }.

Note that r is well defined because γ_ρ ≤ 1/(4k_0). We will be using the relation e_0 ≤ γ_ρ in
proving our results, which can be seen as follows:

e_0 = ‖y_0 − x_0‖_X = ‖F′(x_0)^{−1}(F(x_0) − z^δ_{α_k})‖_X
    ≤ ‖F′(x_0)^{−1}‖_{Z→X} ‖F(x_0) − z^δ_{α_k}‖_Z
    ≤ β ‖F(x_0) − z_{α_k} + z_{α_k} − z^δ_{α_k}‖_Z
    ≤ β [‖F(x_0) − F(x̂)‖_Z + ‖z_{α_k} − z^δ_{α_k}‖_Z]
    ≤ β [Mρ + δ/√α]
    ≤ β [Mρ + δ_0/√α_0]
    = γ_ρ.

Theorem 12.3.3. Let e_n and q be as in (12.3.3) and (12.3.4), respectively, and let x_n, y_n be
as in (12.3.2) and (12.3.1), respectively, with δ ∈ (0, δ_0]. Then, under Assumption 12.3.2 and
Theorem 12.2.3, x_n, y_n ∈ B_r(x_0) and the following estimates hold for all n ≥ 0:

(a) ‖x_{n+1} − y_n‖_X ≤ q ‖y_n − x_n‖_X;

(b) ‖y_{n+1} − x_{n+1}‖_X ≤ q² ‖y_n − x_n‖_X;

(c) e_n ≤ q^{2n} γ_ρ, ∀n ≥ 0.

Proof. Suppose x_n, y_n ∈ B_r(x_0). Then

x_{n+1} − y_n = y_n − x_n − F′(x_0)^{−1}(F(y_n) − F(x_n))
            = F′(x_0)^{−1}[F′(x_0)(y_n − x_n) − (F(y_n) − F(x_n))]
            = F′(x_0)^{−1} ∫_0^1 [F′(x_0) − F′(x_n + t(y_n − x_n))](y_n − x_n) dt,

and hence by Assumption 12.3.2 we have

‖x_{n+1} − y_n‖_X ≤ k_0 r ‖y_n − x_n‖_X ≤ q ‖y_n − x_n‖_X.

This proves (a). To prove (b), we observe that

e_{n+1} = ‖y_{n+1} − x_{n+1}‖_X = ‖x_{n+1} − y_n − F′(x_0)^{−1}(F(x_{n+1}) − F(y_n))‖_X
       = ‖F′(x_0)^{−1} ∫_0^1 [F′(x_0) − F′(y_n + t(x_{n+1} − y_n))] dt (x_{n+1} − y_n)‖_X
       ≤ k_0 r ‖y_n − x_{n+1}‖_X
       ≤ q² ‖x_n − y_n‖_X.

The last but one step follows from Assumption 12.3.2 and the last step follows from (a).
This completes the proof of (b), and (c) follows from (b). Now we shall show by induction
that x_n, y_n ∈ B_r(x_0). For n = 1,

x_1 − y_0 = y_0 − x_0 − F′(x_0)^{−1}(F(y_0) − F(x_0))
         = F′(x_0)^{−1}[F′(x_0)(y_0 − x_0) − (F(y_0) − F(x_0))]
         = F′(x_0)^{−1} ∫_0^1 [F′(x_0) − F′(x_0 + t(y_0 − x_0))](y_0 − x_0) dt,

and hence by Assumption 12.3.2 we have

‖x_1 − y_0‖_X ≤ (k_0/2) ‖y_0 − x_0‖²_X ≤ k_0 r e_0.                (12.3.5)

So, by the triangle inequality and (12.3.5),

‖x_1 − x_0‖_X ≤ ‖x_1 − y_0‖_X + ‖y_0 − x_0‖_X ≤ (1 + q)e_0 ≤ e_0/(1 − q) ≤ γ_ρ/(1 − q) ≤ r,

i.e., x_1 ∈ B_r(x_0). Observe that

‖y_1 − x_1‖_X = ‖x_1 − y_0 − F′(x_0)^{−1}(F(x_1) − F(y_0))‖_X ≤ k_0 r ‖x_1 − y_0‖_X,
228 Ioannis K. Argyros and Á. Alberto Magreñán

and hence by (12.3.5)

ky1 − x1 kX ≤ q2 e0 . (12.3.6)

Therefore, by (12.3.4), (12.3.6) and the triangle inequality,
$$\|y_1 - x_0\|_X \le \|y_1 - x_1\|_X + \|x_1 - x_0\|_X \le (1 + q + q^2)e_0 \le \frac{e_0}{1-q} \le \frac{\gamma_\rho}{1-q} \le r,$$
i.e., $y_1 \in B_r(x_0)$. Suppose $x_m, y_m \in B_r(x_0)$. Then

$$\begin{aligned}
\|x_{m+1} - x_0\|_X &\le \|x_{m+1} - x_m\|_X + \|x_m - x_{m-1}\|_X + \cdots + \|x_1 - x_0\|_X\\
&\le (q+1)e_m + (q+1)e_{m-1} + \cdots + (q+1)e_0\\
&\le (q+1)(q^{2m} + q^{2(m-1)} + \cdots + 1)e_0\\
&\le (q+1)\frac{1 - q^{2(m+1)}}{1 - q^2}e_0 \le \frac{e_0}{1-q} \le \frac{\gamma_\rho}{1-q} \le r,
\end{aligned}$$
i.e., $x_{m+1} \in B_r(x_0)$, and

$$\begin{aligned}
\|y_{m+1} - x_0\|_X &\le \|y_{m+1} - x_{m+1}\|_X + \|x_{m+1} - x_0\|_X\\
&\le q^2 e_m + (q+1)e_m + (q+1)e_{m-1} + \cdots + (q+1)e_0\\
&\le (q^2 + q + 1)e_m + (q+1)e_{m-1} + \cdots + (q+1)e_0\\
&\le (q^{2(m+1)} + \cdots + q^3 + q^2 + q + 1)e_0 \le \frac{e_0}{1-q} \le \frac{\gamma_\rho}{1-q} \le r,
\end{aligned}$$
i.e., $y_{m+1} \in B_r(x_0)$. Thus, by induction, $x_n, y_n \in B_r(x_0)$. This completes the proof of the Theorem.
The main result of this section is the following Theorem.
Theorem 12.3.4. Let $x_n$ and $y_n$ be as in (12.3.2) and (12.3.1), respectively, and let the assumptions of Theorem 12.3.3 hold. Then $(x_n)$ is a Cauchy sequence in $B_r(x_0)$ and converges to $x_{\alpha_k}^{\delta} \in B_r(x_0)$. Further, $F(x_{\alpha_k}^{\delta}) = z_{\alpha_k}^{\delta}$ and
$$\|x_n - x_{\alpha_k}^{\delta}\|_X \le Cq^{2n},$$
where $C = \dfrac{\gamma_\rho}{1-q}$.

Proof. Using relations (b) and (c) of Theorem 12.3.3, we obtain
$$\begin{aligned}
\|x_{n+m} - x_n\|_X &\le \sum_{i=0}^{m-1}\|x_{n+i+1} - x_{n+i}\|_X \le \sum_{i=0}^{m-1}(1+q)e_{n+i}\\
&\le \sum_{i=0}^{m-1}(1+q)q^{2(n+i)}e_0 = (1+q)q^{2n}(1 + q^2 + q^4 + \cdots + q^{2(m-1)})e_0\\
&\le q^{2n}\frac{(1+q)(1-q^{2m})}{1-q^2}\gamma_\rho \le \frac{q^{2n}}{1-q}\gamma_\rho = Cq^{2n}.
\end{aligned}$$
Thus $\{x_n\}$ is a Cauchy sequence in $B_r(x_0)$ and hence it converges, say to $x_{\alpha_k}^{\delta} \in B_r(x_0)$. Observe that
$$\begin{aligned}
\|F(x_n) - z_{\alpha_k}^{\delta}\|_Z &= \|F'(x_0)(x_n - y_n)\|_Z\\
&\le \|F'(x_0)\|_{X\to Z}\,\|x_n - y_n\|_X \le Me_n \le Mq^{2n}\gamma_\rho. \qquad (12.3.7)
\end{aligned}$$
Now, by letting $n \to \infty$ in (12.3.7), we obtain $F(x_{\alpha_k}^{\delta}) = z_{\alpha_k}^{\delta}$. This completes the proof.
Hereafter we assume that
kx̂ − x0 kX < ρ ≤ r.
Theorem 12.3.5. Suppose that the hypothesis of Assumption 12.3.2 holds. Then
$$\|\hat{x} - x_{\alpha_k}^{\delta}\|_X \le \frac{\beta}{1 - k_0 r}\|F(\hat{x}) - z_{\alpha_k}^{\delta}\|_Z.$$
Proof. Note that $k_0 r < 1$. Since $F(x_{\alpha_k}^{\delta}) = z_{\alpha_k}^{\delta}$, by Assumption 12.3.2 we have
$$\begin{aligned}
\|\hat{x} - x_{\alpha_k}^{\delta}\|_X &= \|\hat{x} - x_{\alpha_k}^{\delta} + F'(x_0)^{-1}[F(x_{\alpha_k}^{\delta}) - F(\hat{x}) + F(\hat{x}) - z_{\alpha_k}^{\delta}]\|_X\\
&\le \|F'(x_0)^{-1}[F'(x_0)(\hat{x} - x_{\alpha_k}^{\delta}) + F(x_{\alpha_k}^{\delta}) - F(\hat{x})]\|_X + \|F'(x_0)^{-1}(F(\hat{x}) - z_{\alpha_k}^{\delta})\|_X\\
&\le k_0\sup_{0\le t\le 1}\|x_0 - \hat{x} - t(x_{\alpha_k}^{\delta} - \hat{x})\|_X\,\|\hat{x} - x_{\alpha_k}^{\delta}\|_X + \beta\|F(\hat{x}) - z_{\alpha_k}^{\delta}\|_Z\\
&\le k_0 r\|\hat{x} - x_{\alpha_k}^{\delta}\|_X + \beta\|F(\hat{x}) - z_{\alpha_k}^{\delta}\|_Z,
\end{aligned}$$
and solving this inequality for $\|\hat{x} - x_{\alpha_k}^{\delta}\|_X$ gives the assertion. This completes the proof. The following Theorem is a consequence of Theorem 12.3.4 and Theorem 12.3.5.
Theorem 12.3.6. Let $x_n$ be as in (12.3.2) and let the assumptions of Theorem 12.3.4 and Theorem 12.3.5 hold. Then
$$\|\hat{x} - x_n\|_X \le Cq^{2n} + \frac{\beta}{1 - k_0 r}\|F(\hat{x}) - z_{\alpha_k}^{\delta}\|_Z,$$
where $C$ is as in Theorem 12.3.4.

Observe that, from Section 12.2, $l \le k$ and $\alpha_\delta \le \alpha_{l+1} \le \mu\alpha_l$, so we have
$$\frac{\delta}{\sqrt{\alpha_k}} \le \frac{\delta}{\sqrt{\alpha_l}} \le \mu\frac{\delta}{\sqrt{\alpha_\delta}} = \mu\varphi(\alpha_\delta) = \mu\psi^{-1}(\delta).$$
This leads to the following theorem.
Theorem 12.3.7. Let $x_n$ be as in (12.3.2) and let the assumptions of Theorem 12.2.3, Theorem 12.3.4 and Theorem 12.3.5 hold. Let
$$n_k := \min\Big\{n : q^{2n} \le \frac{\delta}{\sqrt{\alpha_k}}\Big\}.$$
Then
$$\|\hat{x} - x_{n_k}\|_X = O(\psi^{-1}(\delta)).$$

12.3.2. Iterative Method for Case (2)

In this case $F$ is a monotone operator (i.e., $\langle F(x) - F(y), x - y\rangle \ge 0$ for all $x, y \in D(F)$), $Z = X$ is a real Hilbert space and $F'(x_0)^{-1}$ does not exist. Thus the ill-posedness of (12.1.1) in this case is due to the ill-posedness of $F$ as well as the nonclosedness of the range of the linear operator $K$. The following assumptions are needed in addition to the earlier assumptions for our convergence analysis.

Assumption 12.3.8. There exists a continuous, strictly monotonically increasing function $\varphi_1 : (0, b] \to (0, \infty)$ with $b \ge \|F'(x_0)\|_{X\to X}$ satisfying:

• $\lim_{\lambda\to 0}\varphi_1(\lambda) = 0$,
$$\sup_{\lambda \ge 0}\frac{\alpha\varphi_1(\lambda)}{\lambda + \alpha} \le \varphi_1(\alpha) \quad \forall\alpha \in (0, b],$$
and

• there exists $v \in X$ with $\|v\|_X \le 1$ (cf. [26]) such that
$$x_0 - \hat{x} = \varphi_1(F'(x_0))v.$$

Assumption 12.3.9. For each $x \in B_{\tilde{r}}(x_0)$ there exists a bounded linear operator $G(x, x_0)$ (see [29]) such that
$$F'(x) = F'(x_0)G(x, x_0)$$
with $\|G(x, x_0)\|_{X\to X} \le k_2$.

The iterative method for this case is defined by
$$\tilde{v}_{n+1,\alpha_k}^{\delta} = \tilde{v}_{n,\alpha_k}^{\delta} - R(x_0)^{-1}\Big[F(\tilde{v}_{n,\alpha_k}^{\delta}) - z_{\alpha_k}^{\delta} + \frac{\alpha_k}{c}(\tilde{v}_{n,\alpha_k}^{\delta} - x_0)\Big],$$
where $\tilde{v}_{0,\alpha_k}^{\delta} := x_0$ is the initial guess and $R(x_0) := F'(x_0) + \frac{\alpha_k}{c}I$, with $c \le \alpha_k$. Let
$$\tilde{y}_{n,\alpha_k}^{\delta} = \tilde{v}_{2n-1,\alpha_k}^{\delta} \qquad (12.3.8)$$
and
$$\tilde{x}_{n+1,\alpha_k}^{\delta} = \tilde{v}_{2n,\alpha_k}^{\delta} \qquad (12.3.9)$$
for $n > 0$.
First we prove that $\tilde{x}_{n,\alpha_k}$ converges to the zero $x_{c,\alpha_k}^{\delta}$ of
$$F(x) + \frac{\alpha_k}{c}(x - x_0) = z_{\alpha_k}^{\delta} \qquad (12.3.10)$$
and then we prove that $x_{c,\alpha_k}^{\delta}$ is an approximation for $\hat{x}$.
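A matching numerical sketch for Case (2), under the same discretization assumptions as for Case (1); here only the regularized operator $R(x_0) = F'(x_0) + (\alpha_k/c)I$ is factored, so $F'(x_0)^{-1}$ itself is never needed (the names are again illustrative, not from the original).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def case2_method(apply_F, J0, x0, z_delta, alpha_k, c, n_steps):
    # Regularized frozen-derivative iteration
    # v_{n+1} = v_n - R(x0)^{-1}[F(v_n) - z_delta + (alpha_k/c)(v_n - x0)],
    # with R(x0) = F'(x0) + (alpha_k/c) I; well defined for monotone F
    # even when F'(x0) is not invertible.
    m = len(x0)
    lu = lu_factor(J0 + (alpha_k / c) * np.eye(m))
    v = [np.asarray(x0, dtype=float)]
    for _ in range(2 * n_steps):
        resid = apply_F(v[-1]) - z_delta + (alpha_k / c) * (v[-1] - x0)
        v.append(v[-1] - lu_solve(lu, resid))
    e = [np.linalg.norm(v[2 * n + 1] - v[2 * n]) for n in range(n_steps)]
    return v, e
```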


Let
ẽδn,αk := kỹδn,αk − x̃δn,αk kX , ∀n ≥ 0. (12.3.11)
For the sake of simplicity, we use the notation x̃n , ỹn and ẽn for x̃δn,αk , ỹδn,αk and ẽδn,αk
respectively.
Hereafter we assume that $\|\hat{x} - x_0\|_X < \rho \le \tilde{r}$, where
$$\rho < \frac{1}{M}\Big(1 - \frac{\delta_0}{\sqrt{\alpha_0}}\Big)$$
with $\delta_0 < \sqrt{\alpha_0}$. Let
$$\tilde{\gamma}_\rho := M\rho + \frac{\delta_0}{\sqrt{\alpha_0}},$$
and we define
$$q_1 = k_0\tilde{r}, \qquad \tilde{r} \in (\tilde{r}_1, \tilde{r}_2), \qquad (12.3.12)$$
where
$$\tilde{r}_1 = \frac{1 - \sqrt{1 - 4k_0\tilde{\gamma}_\rho}}{2k_0} \qquad\text{and}\qquad \tilde{r}_2 = \min\Big\{\frac{1}{k_0}, \frac{1 + \sqrt{1 - 4k_0\tilde{\gamma}_\rho}}{2k_0}\Big\}.$$
Theorem 12.3.10. Let $\tilde{e}_n$ and $q_1$ be as in (12.3.11) and (12.3.12), respectively, let $\tilde{x}_n$ and $\tilde{y}_n$ be as in (12.3.9) and (12.3.8), respectively, with $\delta \in (0, \delta_0]$, and suppose Assumption 12.3.2 holds. Then we have the following:

(a) $\|\tilde{x}_n - \tilde{y}_{n-1}\|_X \le q_1\|\tilde{y}_{n-1} - \tilde{x}_{n-1}\|_X$;

(b) $\|\tilde{y}_n - \tilde{x}_n\|_X \le q_1^2\|\tilde{y}_{n-1} - \tilde{x}_{n-1}\|_X$;

(c) $\tilde{e}_n \le q_1^{2n}\tilde{\gamma}_\rho$, $\forall n \ge 0$.

Proof. Suppose $\tilde{x}_n, \tilde{y}_n \in B_{\tilde{r}}(x_0)$. Then
$$\begin{aligned}
\tilde{x}_n - \tilde{y}_{n-1} &= \tilde{y}_{n-1} - \tilde{x}_{n-1} - R(x_0)^{-1}\Big(F(\tilde{y}_{n-1}) - F(\tilde{x}_{n-1}) + \frac{\alpha_k}{c}(\tilde{y}_{n-1} - \tilde{x}_{n-1})\Big)\\
&= R(x_0)^{-1}\Big[R(x_0)(\tilde{y}_{n-1} - \tilde{x}_{n-1}) - (F(\tilde{y}_{n-1}) - F(\tilde{x}_{n-1})) - \frac{\alpha_k}{c}(\tilde{y}_{n-1} - \tilde{x}_{n-1})\Big]\\
&= R(x_0)^{-1}\int_0^1\big[F'(x_0) - F'(\tilde{x}_{n-1} + t(\tilde{y}_{n-1} - \tilde{x}_{n-1}))\big](\tilde{y}_{n-1} - \tilde{x}_{n-1})\,dt.
\end{aligned}$$
Now, since $\|R(x_0)^{-1}F'(x_0)\|_{X\to X} \le 1$, the proof of (a) follows as in Theorem 12.3.3. Again, observe that
$$\begin{aligned}
\tilde{e}_n &\le \Big\|\tilde{x}_n - \tilde{y}_{n-1} - R(x_0)^{-1}\Big(F(\tilde{x}_n) - z_{\alpha_k}^{\delta} + \frac{\alpha_k}{c}(\tilde{x}_n - x_0)\Big)\Big\|_X + \Big\|R(x_0)^{-1}\Big(F(\tilde{y}_{n-1}) - z_{\alpha_k}^{\delta} + \frac{\alpha_k}{c}(\tilde{y}_{n-1} - x_0)\Big)\Big\|_X\\
&\le \Big\|R(x_0)^{-1}\Big[R(x_0)(\tilde{x}_n - \tilde{y}_{n-1}) - (F(\tilde{x}_n) - F(\tilde{y}_{n-1})) - \frac{\alpha_k}{c}(\tilde{x}_n - \tilde{y}_{n-1})\Big]\Big\|_X\\
&\le \Big\|R(x_0)^{-1}\int_0^1\big[F'(x_0) - F'(\tilde{y}_{n-1} + t(\tilde{x}_n - \tilde{y}_{n-1}))\big](\tilde{x}_n - \tilde{y}_{n-1})\,dt\Big\|_X.
\end{aligned}$$
So the remaining part of the proof is analogous to the proof of Theorem 12.3.3.

Theorem 12.3.11. Let $\tilde{y}_n$ and $\tilde{x}_n$ be as in (12.3.8) and (12.3.9), respectively, and let the assumptions of Theorem 12.3.10 hold. Then $(\tilde{x}_n)$ is a Cauchy sequence in $B_{\tilde{r}}(x_0)$ and converges to $x_{c,\alpha_k}^{\delta} \in B_{\tilde{r}}(x_0)$. Further, $F(x_{c,\alpha_k}^{\delta}) + \frac{\alpha_k}{c}(x_{c,\alpha_k}^{\delta} - x_0) = z_{\alpha_k}^{\delta}$ and
$$\|\tilde{x}_n - x_{c,\alpha_k}^{\delta}\|_X \le \tilde{C}q_1^{2n},$$
where $\tilde{C} = \dfrac{\tilde{\gamma}_\rho}{1-q_1}$.

Proof. Analogously to the proof of Theorem 12.3.4, one can prove that $\tilde{x}_n$ is a Cauchy sequence in $B_{\tilde{r}}(x_0)$ and hence it converges, say to $x_{c,\alpha_k}^{\delta} \in B_{\tilde{r}}(x_0)$, and
$$\begin{aligned}
\Big\|F(\tilde{x}_n) - z_{\alpha_k}^{\delta} + \frac{\alpha_k}{c}(\tilde{x}_n - x_0)\Big\|_X &= \|R(x_0)(\tilde{x}_n - \tilde{y}_n)\|_X\\
&\le \|R(x_0)\|_{X\to X}\,\|\tilde{x}_n - \tilde{y}_n\|_X\\
&\le \Big(\|F'(x_0)\|_{X\to X} + \frac{\alpha_k}{c}\Big)\tilde{e}_n\\
&\le \Big(\|F'(x_0)\|_{X\to X} + \frac{\alpha_k}{c}\Big)q_1^{2n}\tilde{e}_0 \qquad (12.3.13)\\
&\le \Big(\|F'(x_0)\|_{X\to X} + \frac{\alpha_k}{c}\Big)q_1^{2n}\tilde{\gamma}_\rho.
\end{aligned}$$
Now, by letting $n \to \infty$ in (12.3.13), we obtain $F(x_{c,\alpha_k}^{\delta}) + \frac{\alpha_k}{c}(x_{c,\alpha_k}^{\delta} - x_0) = z_{\alpha_k}^{\delta}$. This completes the proof.
Assume that $k_2 < \dfrac{1 - k_0\tilde{r}}{1 - c}$ and, for the sake of simplicity, assume that $\varphi_1(\alpha) \le \varphi(\alpha)$ for $\alpha > 0$.

Theorem 12.3.12. Suppose $x_{c,\alpha_k}^{\delta}$ is the solution of (12.3.10) and Assumptions 12.3.2, 12.3.8 and 12.3.9 hold. Then
$$\|\hat{x} - x_{c,\alpha_k}^{\delta}\|_X = O(\psi^{-1}(\delta)).$$

Proof. Note that $c(F(x_{c,\alpha_k}^{\delta}) - z_{\alpha_k}^{\delta}) + \alpha_k(x_{c,\alpha_k}^{\delta} - x_0) = 0$, so
$$\begin{aligned}
(F'(x_0) + \alpha_k I)(x_{c,\alpha_k}^{\delta} - \hat{x}) &= (F'(x_0) + \alpha_k I)(x_{c,\alpha_k}^{\delta} - \hat{x}) - c(F(x_{c,\alpha_k}^{\delta}) - z_{\alpha_k}^{\delta}) - \alpha_k(x_{c,\alpha_k}^{\delta} - x_0)\\
&= \alpha_k(x_0 - \hat{x}) + F'(x_0)(x_{c,\alpha_k}^{\delta} - \hat{x}) - c[F(x_{c,\alpha_k}^{\delta}) - z_{\alpha_k}^{\delta}]\\
&= \alpha_k(x_0 - \hat{x}) + F'(x_0)(x_{c,\alpha_k}^{\delta} - \hat{x}) - c[F(x_{c,\alpha_k}^{\delta}) - F(\hat{x}) + F(\hat{x}) - z_{\alpha_k}^{\delta}]\\
&= \alpha_k(x_0 - \hat{x}) - c(F(\hat{x}) - z_{\alpha_k}^{\delta}) + F'(x_0)(x_{c,\alpha_k}^{\delta} - \hat{x}) - c[F(x_{c,\alpha_k}^{\delta}) - F(\hat{x})].
\end{aligned}$$
Thus
$$\begin{aligned}
\|x_{c,\alpha_k}^{\delta} - \hat{x}\|_X &\le \|\alpha_k(F'(x_0) + \alpha_k I)^{-1}(x_0 - \hat{x})\|_X + \|(F'(x_0) + \alpha_k I)^{-1}c(F(\hat{x}) - z_{\alpha_k}^{\delta})\|_X\\
&\quad + \|(F'(x_0) + \alpha_k I)^{-1}[F'(x_0)(x_{c,\alpha_k}^{\delta} - \hat{x}) - c(F(x_{c,\alpha_k}^{\delta}) - F(\hat{x}))]\|_X\\
&\le \|\alpha_k(F'(x_0) + \alpha_k I)^{-1}(x_0 - \hat{x})\|_X + \|F(\hat{x}) - z_{\alpha_k}^{\delta}\|_X + \Gamma, \qquad (12.3.14)
\end{aligned}$$
where $\Gamma := \big\|(F'(x_0) + \alpha_k I)^{-1}\int_0^1[F'(x_0) - cF'(\hat{x} + t(x_{c,\alpha_k}^{\delta} - \hat{x}))](x_{c,\alpha_k}^{\delta} - \hat{x})\,dt\big\|_X$. So, by Assumption 12.3.9, we obtain
$$\begin{aligned}
\Gamma &\le \Big\|(F'(x_0) + \alpha_k I)^{-1}\int_0^1\big[F'(x_0) - F'(\hat{x} + t(x_{c,\alpha_k}^{\delta} - \hat{x}))\big](x_{c,\alpha_k}^{\delta} - \hat{x})\,dt\Big\|_X\\
&\quad + (1-c)\Big\|(F'(x_0) + \alpha_k I)^{-1}F'(x_0)\int_0^1 G(\hat{x} + t(x_{c,\alpha_k}^{\delta} - \hat{x}), x_0)(x_{c,\alpha_k}^{\delta} - \hat{x})\,dt\Big\|_X\\
&\le k_0\tilde{r}\|x_{c,\alpha_k}^{\delta} - \hat{x}\|_X + (1-c)k_2\|x_{c,\alpha_k}^{\delta} - \hat{x}\|_X \qquad (12.3.15)
\end{aligned}$$
and hence, by (12.3.14) and (12.3.15), we have
$$\begin{aligned}
\|x_{c,\alpha_k}^{\delta} - \hat{x}\|_X &\le \frac{\|\alpha_k(F'(x_0) + \alpha_k I)^{-1}(x_0 - \hat{x})\|_X + \|F(\hat{x}) - z_{\alpha_k}^{\delta}\|_X}{1 - (1-c)k_2 - k_0\tilde{r}}\\
&\le \frac{\varphi_1(\alpha_k) + (2 + \mu^{-1})\mu\psi^{-1}(\delta)}{1 - (1-c)k_2 - k_0\tilde{r}} = O(\psi^{-1}(\delta)).
\end{aligned}$$
This completes the proof of the Theorem.


The following Theorem is a consequence of Theorem 12.3.11 and Theorem 12.3.12.

Theorem 12.3.13. Let $\tilde{x}_n$ be as in (12.3.9) and let the assumptions of Theorem 12.3.11 and Theorem 12.3.12 hold. Then
$$\|\hat{x} - \tilde{x}_n\|_X \le \tilde{C}q_1^{2n} + O(\psi^{-1}(\delta)),$$
where $\tilde{C}$ is as in Theorem 12.3.11.

Theorem 12.3.14. Let $\tilde{x}_n$ be as in (12.3.9) and let the assumptions of Theorem 12.2.3, Theorem 12.3.11 and Theorem 12.3.12 hold. Let
$$n_k := \min\Big\{n : q_1^{2n} \le \frac{\delta}{\sqrt{\alpha_k}}\Big\}.$$
Then
$$\|\hat{x} - \tilde{x}_{n_k}\|_X = O(\psi^{-1}(\delta)).$$

Remark 12.3.15. Let us denote by $\bar{r}_1, \bar{\gamma}_\rho, \bar{q}, \bar{\delta}_0$ the parameters obtained using $K_0$ instead of $k_0$ for Case 1 (similarly for Case 2). Then we have
$$r_1 \le \bar{r}_1, \qquad \bar{\delta}_0 \le \delta_0, \qquad \bar{\gamma}_\rho \le \gamma_\rho, \qquad q \le \bar{q}.$$
Moreover, strict inequality holds in the preceding estimates if $k_0 < K_0$. Let $h_0 = 4k_0\gamma_\rho$ and $h = 4K_0\bar{\gamma}_\rho$. We can certainly choose $\gamma_\rho$ sufficiently close to $\bar{\gamma}_\rho$. Then we have that $h \le 1 \Rightarrow h_0 \le 1$, but not necessarily vice versa, unless $k_0 = K_0$ and $\gamma_\rho = \bar{\gamma}_\rho$. Finally, we have that $\frac{h_0}{h} \to 0$ as $\frac{k_0}{K_0} \to 0$. The last estimate shows by how many times our new approach using $k_0$ can expand the applicability of the old approach using $K_0$ for these methods. Hence, all the above justify the claims made in the introduction of the chapter. Finally, we note that the results obtained here are useful even if Assumption 12.3.1 is satisfied but the sufficient convergence condition $h \le 1$ is not satisfied, while $h_0 \le 1$ is satisfied. Indeed, we can start with the iterative method described in Case 1 (or Case 2) until a finite step $N$ such that $h \le 1$ is satisfied, with $x_{N+1,\alpha_N}^{\delta}$ as a starting point for faster methods such as (12.1.6). Such an approach has already been employed in [2], [4] and [5], where the modified Newton's method is used as a predictor for Newton's method.

12.4. Algorithm
Note that for $i, j \in \{0, 1, 2, \cdots, M\}$
$$z_{\alpha_i}^{\delta} - z_{\alpha_j}^{\delta} = (\alpha_j - \alpha_i)(K^*K + \alpha_j I)^{-1}(K^*K + \alpha_i I)^{-1}\big[K^*(y^{\delta} - KF(x_0))\big].$$
The algorithm for implementing the iterative methods considered in Section 12.3 involves the following steps (a sketch in code is given after the list):

• $\alpha_0 = \delta^2$;

• $\alpha_i = \mu^{2i}\alpha_0$, $\mu > 1$;

• solve for $w_i$: $(K^*K + \alpha_i I)w_i = K^*(y^{\delta} - KF(x_0))$;

• solve for $z_{ij}$, $j < i$: $(K^*K + \alpha_j I)z_{ij} = (\alpha_j - \alpha_i)w_i$;

• if $\|z_{ij}\|_X > \dfrac{4}{\mu^{j}}$, then take $k = i - 1$;

• otherwise, repeat with $i + 1$ in place of $i$;

• choose $n_k = \min\{n : q^{2n} \le \frac{\delta}{\sqrt{\alpha_k}}\}$ in case (1) and $n_k = \min\{n : q_1^{2n} \le \frac{\delta}{\sqrt{\alpha_k}}\}$ in case (2);

• compute $x_{n_k}$ using the iteration (12.3.2) or $\tilde{x}_{n_k}$ using the iteration (12.3.9).
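The following sketch puts the steps above together for a discretized problem, assuming $K$ is given as a matrix, $F(x_0)$ as a vector, and $q$ (or $q_1$) is already known; the helper names and the cap on $i$ are our illustrative choices.

```python
import numpy as np

def balancing_parameter(K, F_x0, y_delta, delta, mu=1.2, i_max=50):
    # Adaptive choice of k: alpha_i = mu^{2i} delta^2, w_i and z_ij solve the
    # normal equations above; stop the first time some ||z_ij|| exceeds 4/mu^j.
    rhs = K.T @ (y_delta - K @ F_x0)
    KtK = K.T @ K
    m = KtK.shape[0]
    alphas = []
    for i in range(i_max):
        alpha_i = mu ** (2 * i) * delta ** 2
        w_i = np.linalg.solve(KtK + alpha_i * np.eye(m), rhs)
        for j, alpha_j in enumerate(alphas):            # all j < i
            z_ij = np.linalg.solve(KtK + alpha_j * np.eye(m),
                                   (alpha_j - alpha_i) * w_i)
            if np.linalg.norm(z_ij) > 4.0 / mu ** j:
                return i - 1, alphas[i - 1]              # take k = i - 1
        alphas.append(alpha_i)
    return i_max - 1, alphas[-1]

def stopping_index(q, delta, alpha_k):
    # n_k = min{ n : q^{2n} <= delta / sqrt(alpha_k) }
    n = 0
    while q ** (2 * n) > delta / np.sqrt(alpha_k):
        n += 1
    return n
```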

12.5. Numerical Examples


We present five numerical examples in this section. First, we consider two examples illustrating the algorithm considered in the above sections. We apply the algorithm by choosing a sequence of finite dimensional subspaces $(V_N)$ of $X$ with $\dim V_N = N + 1$. Precisely, we choose $V_N$ as the space of linear splines on a uniform grid of $N + 1$ points in $[0, 1]$. Then we present two examples where Assumption 12.3.1 is not satisfied but Assumption 12.3.2 is satisfied. In the last example we show that $k_0/K_0$ can be arbitrarily small.
Example 12.5.1. In this example for Case (1), we consider the operator $KF : D(KF) \subseteq L^2(0,1) \to L^2(0,1)$ with $K : L^2(0,1) \to L^2(0,1)$ defined by
$$K(x)(t) = \int_0^1 k(t,s)x(s)\,ds,$$
where
$$k(t,s) = \begin{cases}(1-t)s, & 0 \le s \le t \le 1,\\ (1-s)t, & 0 \le t \le s \le 1,\end{cases}$$
and $F : D(F) \subseteq L^2(0,1) \to L^2(0,1)$ defined by $F(u) := u^3$. Then the Fréchet derivative of $F$ is given by $F'(u)w = 3u^2w$.
In our computation, we take
$$y(t) = \frac{837t}{6160} - \frac{t^2}{16} - \frac{3t^5}{80} - \frac{3t^8}{112} - \frac{t^{11}}{110}$$
and $y^{\delta} = y + \delta$. Then the exact solution is
$$\hat{x}(t) = 0.5 + t^3.$$

We use
$$x_0(t) = 0.5 + t^3 - \frac{3}{56}(t - t^8)$$
as our initial guess.
We choose $\alpha_0 = (1.3)^2\delta^2$, $\mu = 1.2$ and $\delta = 0.0667$; the Lipschitz constant $k_0$ equals approximately $0.23$ and $r = 1$, so that $q = k_0 r = 0.23$. The iterations and corresponding error estimates are given in Table 12.5.1. The last column of Table 12.5.1 shows that the error $\|x_{n_k} - \hat{x}\|_X$ is of order $O(\delta^{1/2})$.

Table 12.5.1. Different errors

N      k    $\alpha_k$    $\|x_{n_k} - \hat{x}\|_X$    $\|x_{n_k} - \hat{x}\|_X/\delta^{1/2}$
16     4    0.0231        0.5376                       2.0791
32     4    0.0230        0.5301                       2.0523
64     4    0.0229        0.5257                       2.0359
128    4    0.0229        0.5234                       2.0270
256    4    0.0229        0.5222                       2.0224
512    4    0.0229        0.5216                       2.0200
1024   4    0.0229        0.5213                       2.0188
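For reference, the Green-kernel operator $K$ of this example can be discretized by simple quadrature on a uniform grid; a sketch follows (the grid size and midpoint quadrature are our choices, not prescribed by the text).

```python
import numpy as np

def green_matrix(m):
    # Midpoint discretization of K(x)(t) = \int_0^1 k(t,s) x(s) ds with
    # k(t,s) = (1-t)s for s <= t and (1-s)t otherwise.
    s = (np.arange(m) + 0.5) / m
    T, S = np.meshgrid(s, s, indexing="ij")
    k = np.where(S <= T, (1 - T) * S, (1 - S) * T)
    return k / m                     # quadrature weight 1/m

# With F(u) = u**3, the composite operator of this example acts as
# x -> green_matrix(m) @ x**3 on the grid values of x.
```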

Example 12.5.2. In this example for Case (2), we consider the operator $KF : D(KF) \subseteq L^2(0,1) \to L^2(0,1)$, where $K : L^2(0,1) \to L^2(0,1)$ is defined by
$$K(x)(t) = \int_0^1 k(t,s)x(s)\,ds$$
and $F : D(F) \subseteq L^2(0,1) \to L^2(0,1)$ is defined by
$$F(u) := \int_0^1 k(t,s)u^3(s)\,ds,$$
where
$$k(t,s) = \begin{cases}(1-t)s, & 0 \le s \le t \le 1,\\ (1-s)t, & 0 \le t \le s \le 1.\end{cases}$$
Then, for all $x(t), y(t)$ with $x(t) > y(t)$ (see Section 4.3 in [30]),
$$\langle F(x) - F(y), x - y\rangle = \int_0^1\Big(\int_0^1 k(t,s)(x^3 - y^3)(s)\,ds\Big)(x - y)(t)\,dt \ge 0.$$
Thus the operator $F$ is monotone. The Fréchet derivative of $F$ is given by
$$F'(u)w = 3\int_0^1 k(t,s)(u(s))^2w(s)\,ds.$$
So, for any $u \in B_r(x_0)$ with $x_0(s) \ge k_3 > 0$ for all $s \in (0,1)$, we have
$$F'(u)w = F'(x_0)G(u, x_0)w,$$
where $G(u, x_0) = \left(\dfrac{u}{x_0}\right)^2$.


In our computation, we take
$$y(t) = \frac{1}{110}\Big(\frac{25t}{156} - \frac{t^3}{6} + \frac{t^{13}}{156}\Big)$$
and $y^{\delta} = y + \delta$. Then the exact solution is
$$\hat{x}(t) = t^3.$$
We use
$$x_0(t) = t^3 + \frac{3}{56}(t - t^8)$$
as our initial guess, so that the function $x_0 - \hat{x}$ satisfies the source condition
$$x_0 - \hat{x} = \frac{3}{56}(t - t^8) = F'(x_0)\Big(\frac{t^6}{x_0(t)^2}\Big) = \varphi_1(F'(x_0))\Big(\frac{t^6}{x_0(t)^2}\Big),$$
where $\varphi_1(\lambda) = \lambda$. Thus we expect an accuracy of order at least $O(\delta^{1/2})$.
We choose $\alpha_0 = (1.3)\delta$, $\delta = 0.0667 =: c$; the Lipschitz constant $k_0$ equals approximately $0.21$, as in [30], and $\tilde{r} = 1$, so that $q_1 = k_0\tilde{r} = 0.21$. The results of the computation are presented in Table 12.5.2.

Table 12.5.2. Results of computation

N      k    $\alpha_k$    $\|\tilde{x}_{n_k} - \hat{x}\|_X$    $\|\tilde{x}_{n_k} - \hat{x}\|_X/\delta^{1/2}$
8      4    0.0494        0.1881                               0.7200
16     4    0.0477        0.1432                               0.5531
32     4    0.0473        0.1036                               0.4010
64     4    0.0472        0.0726                               0.2812
128    4    0.0471        0.0491                               0.1900
256    4    0.0471        0.0306                               0.1187
512    4    0.0471        0.0140                               0.0543
1024   4    0.0471        0.0133                               0.0515

In the next two cases, we present examples of nonlinear equations where Assumption 12.3.2 is satisfied but Assumption 12.3.1 is not.

Example 12.5.3. Let $X = Y = \mathbb{R}$, $D = [0, \infty)$, $x_0 = 1$ and define the function $F$ on $D$ by
$$F(x) = \frac{x^{1+\frac{1}{i}}}{1 + \frac{1}{i}} + c_1x + c_2, \qquad (12.5.1)$$
where $c_1, c_2$ are real parameters and $i > 2$ is an integer. Then $F'(x) = x^{1/i} + c_1$ is not Lipschitz on $D$. Hence, Assumption 12.3.1 is not satisfied. However, the central Lipschitz condition of Assumption 12.3.2 holds for $k_0 = 1$.
Indeed, we have
$$\|F'(x) - F'(x_0)\|_X = |x^{1/i} - x_0^{1/i}| = \frac{|x - x_0|}{x_0^{\frac{i-1}{i}} + \cdots + x^{\frac{i-1}{i}}},$$
so
$$\|F'(x) - F'(x_0)\|_X \le k_0|x - x_0|.$$
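A quick numerical illustration of this dichotomy (the exponent $i$ and sample points are our choices):

```python
# F'(x) = x**(1/i) + c1 on D = [0, inf): the central ratio
# |F'(x) - F'(x0)| / |x - x0| with x0 = 1 stays bounded by k0 = 1,
# while the Lipschitz ratio over small intervals near 0 blows up.
i = 5
fp = lambda x: x ** (1.0 / i)        # c1 cancels in all differences
x0 = 1.0
for x in (1e-2, 1e-4, 1e-8):
    central = abs(fp(x) - fp(x0)) / abs(x - x0)
    local = abs(fp(2 * x) - fp(x)) / x       # ratio over [x, 2x] -> infinity
    print(f"x={x:.0e}  central={central:.3f}  local={local:.3e}")
```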

Example 12.5.4. We consider the integral equations
$$u(s) = f(s) + \lambda\int_a^b G(s,t)u(t)^{1+1/n}\,dt, \qquad n \in \mathbb{N}. \qquad (12.5.2)$$
Here, $f$ is a given continuous function satisfying $f(s) > 0$, $s \in [a,b]$, $\lambda$ is a real number, and the kernel $G$ is continuous and positive in $[a,b]\times[a,b]$.
For example, when $G(s,t)$ is the Green kernel, the corresponding integral equation is equivalent to the boundary value problem
$$u'' = \lambda u^{1+1/n}, \qquad u(a) = f(a), \quad u(b) = f(b).$$
These types of problems have been considered in [1]-[5].


Equations of the form (12.5.2) generalize equations of the form
$$u(s) = \int_a^b G(s,t)u(t)^n\,dt \qquad (12.5.3)$$
studied in [1]-[5]. Instead of (12.5.2) we can try to solve the equation $F(u) = 0$, where
$$F : \Omega \subseteq C[a,b] \to C[a,b], \qquad \Omega = \{u \in C[a,b] : u(s) \ge 0,\ s \in [a,b]\},$$
and
$$F(u)(s) = u(s) - f(s) - \lambda\int_a^b G(s,t)u(t)^{1+1/n}\,dt.$$
The norm we consider is the max-norm.
The derivative $F'$ is given by
$$F'(u)v(s) = v(s) - \lambda\Big(1 + \frac{1}{n}\Big)\int_a^b G(s,t)u(t)^{1/n}v(t)\,dt, \qquad v \in \Omega.$$
First of all, we notice that $F'$ does not satisfy a Lipschitz-type condition in $\Omega$. Let us consider, for instance, $[a,b] = [0,1]$, $G(s,t) = 1$ and $y(t) = 0$. Then $F'(y)v(s) = v(s)$ and
$$\|F'(x) - F'(y)\|_{C[a,b]\to C[a,b]} = |\lambda|\Big(1 + \frac{1}{n}\Big)\int_a^b x(t)^{1/n}\,dt.$$

If $F'$ were a Lipschitz function, then
$$\|F'(x) - F'(y)\|_{C[a,b]\to C[a,b]} \le L_1\|x - y\|_{C[a,b]},$$
or, equivalently, the inequality
$$\int_0^1 x(t)^{1/n}\,dt \le L_2\max_{s\in[0,1]}x(s) \qquad (12.5.4)$$
would hold for all $x \in \Omega$ and for a constant $L_2$. But this is not true. Consider, for example, the functions
$$x_j(t) = \frac{t}{j}, \qquad j \ge 1,\ t \in [0,1].$$
If these are substituted into (12.5.4),
$$\frac{1}{j^{1/n}(1 + 1/n)} \le \frac{L_2}{j} \iff j^{1-1/n} \le L_2(1 + 1/n), \qquad \forall j \ge 1.$$
This inequality is not true when $j \to \infty$.


Therefore, condition (12.5.4) is not satisfied in this case. Hence Assumption 12.3.1 is not satisfied. However, Assumption 12.3.2 holds. To show this, let $x_0(t) = f(t)$ and $\gamma = \min_{s\in[a,b]}f(s) > 0$. Then, for $v \in \Omega$,
$$\begin{aligned}
\|[F'(x) - F'(x_0)]v\|_{C[a,b]} &= |\lambda|\Big(1 + \frac{1}{n}\Big)\max_{s\in[a,b]}\Big|\int_a^b G(s,t)\big(x(t)^{1/n} - f(t)^{1/n}\big)v(t)\,dt\Big|\\
&\le |\lambda|\Big(1 + \frac{1}{n}\Big)\max_{s\in[a,b]}\int_a^b G_n(s,t)\,dt\;\|v\|_{C[a,b]},
\end{aligned}$$
where
$$G_n(s,t) = \frac{G(s,t)\,|x(t) - f(t)|}{x(t)^{(n-1)/n} + x(t)^{(n-2)/n}f(t)^{1/n} + \cdots + f(t)^{(n-1)/n}}.$$
Hence,
$$\|[F'(x) - F'(x_0)]v\|_{C[a,b]} \le \frac{|\lambda|(1 + 1/n)}{\gamma^{(n-1)/n}}\max_{s\in[a,b]}\int_a^b G(s,t)\,dt\;\|x - x_0\|_{C[a,b]} \le k_0\|x - x_0\|_{C[a,b]},$$
where $k_0 = \frac{|\lambda|(1+1/n)}{\gamma^{(n-1)/n}}N$ and $N = \max_{s\in[a,b]}\int_a^b G(s,t)\,dt$. Then Assumption 12.3.2 holds for sufficiently small $\lambda$.

Example 12.5.5. Define the scalar function $F$ by $F(x) = d_0x + d_1 + d_2\sin e^{d_3x}$, $x_0 = 0$, where $d_i$, $i = 0, 1, 2, 3$, are given parameters. Then it can easily be seen that, for $d_3$ large and $d_2$ sufficiently small, $k_0/K_0$ can be arbitrarily small.

12.6. Conclusion

We presented an iterative method, a combination of the modified Newton method and Tikhonov regularization, for obtaining an approximate solution of a nonlinear ill-posed Hammerstein type operator equation $KF(x) = y$, with the available noisy data $y^{\delta}$ in place of the exact data $y$. In fact, we considered two cases: in the first case it is assumed that $F'(x_0)^{-1}$ exists, and in the second case it is assumed that $F$ is monotone but $F'(x_0)^{-1}$ does not exist. In both cases, the error estimates derived using an a priori parameter choice and the balancing principle are of optimal order with respect to the general source condition. The results of the computational experiments give evidence of the reliability of our approach.
References

[1] Argyros, I. K., Convergence and Application of Newton-type Iterations, (Springer, 2008).

[2] Argyros, I. K., Approximating solutions of equations using Newton’s method with a
modified Newton’s method iterate as a starting point. Rev. Anal. Numer. Theor. Approx.
36 (2007), 123-138.

[3] Argyros, I. K., A Semilocal convergence for directional Newton methods, Math. Com-
put.(AMS). 80 (2011), 327-343.

[4] Argyros, I. K., Hilout, S. Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.

[5] Argyros, I. K., Cho, Y. J., Hilout, S., Numerical methods for equations and its appli-
cations, (CRC Press, Taylor and Francis, New York, 2012).

[6] Argyros, I. K., Hilout, S., A convergence analysis for directional two-step Newton
methods, Numer. Algor., 55 (2010), 503-528.

[7] Bakushinskii, A. B., The problem of convergence of the iteratively regularized Gauss-
Newton method, Comput. Math. Math. Phys., 32 (1982), 1353-1359.

[8] Bakushinskii, A. B., Kokurin, M. Y., Iterative Methods for Approximate Solution of
Inverse Problems, (Springer, Dordrecht, 2004).

[9] Blaschke, B., Neubauer, A., Scherzer, O., On convergence rates for the iteratively
regularized Gauss-Newton method IMA J. Numer. Anal., 17 (1997), 421-436.

[10] Engl, H. W., Regularization methods for the stable solution of inverse problems, Sur-
veys on Mathematics for Industry, 3 (1993), 71-143.

[11] Engl, H. W., Kunisch, K., Neubauer, A., Convergence rates for Tikhonov regulariza-
tion of nonlinear ill-posed problems, Inverse Problems, 5 (1989), 523-540.

[12] Engl, H. W., Kunisch, K., Neubauer, A., Regularization of Inverse Problems, (Kluwer,
Dordrecht, 1996).

[13] George, S., Newton-Tikhonov regularization of ill-posed Hammerstein operator equation, J. Inverse and Ill-Posed Problems, 14(2) (2006), 135-146.

[14] George, S., Newton-Lavrentiev regularization of ill-posed Hammerstein operator equation, J. Inverse and Ill-Posed Problems, 14(6) (2006), 573-582.
[15] George, S., Nair, M.T., A modified Newton-Lavrentiev regularization for nonlinear
ill-posed Hammerstein operator equations, J. Complexity and Ill-Posed Problems, 24
(2008), 228-240.
[16] George, S., Kunhanandan, M., An iterative regularization method for Ill-posed Ham-
merstein type operator equation, J. Inv. Ill-Posed Problems 17 (2009), 831-844.
[17] George, S., Shobha, M. E., A regularized dynamical system method for nonlinear ill-
posed Hammerstein type operator equations, J. Appl. Math. Bio, 1(1) (2011), 65-78.
[18] Shobha, M. E., George, S., Dynamical System Method for Ill-posed Hammerstein
Type Operator Equations with Monotone Operators, International Journal of Pure
and Applied Mathematics, ISSN 1311-8080, 81(1) (2012), 129-143.
[19] George, S., Newton type iteration for Tikhonov regularization of nonlinear ill-
posed problems, Journal of Mathematics, 2013 (2013), Article ID 439316, 9 pages,
doi:10.1155/2013/439316.
[20] Groetsch, C. W., Theory of Tikhonov regularization for Fredholm Equation of the first
kind (Pitmann Books, 1984).
[21] Kaltenbacher, B., A posteriori parameter choice strategies for some Newton-type
methods for the regularization of nonlinear ill-posed problems, Numer. Math., 79
(1998), 501-528.
[22] Kaltenbacher, B., Neubauer, A., Scherzer, O., Iterative regularisation methods for nonlinear ill-posed problems (de Gruyter, Berlin, New York 2008).

[23] Kelley, C. T., Iterative Methods for Linear and Nonlinear Equations (SIAM, Philadel-
phia 1995).
[24] Krasnoselskii, M. A., Zabreiko, P. P., Pustylnik, E. I., Sobolevskii, P. E., Integral
operators in spaces of summable functions (Translated by T. Ando, Noordhoff Inter-
national publishing, Leyden, 1976).
[25] Langer, S., Hohage, T., Convergence analysis of an inexact iteratively regularized
Gauss-Newton method under general source conditions, J. Inverse Ill-Posed Probl.,
15 (2007), 19-35.
[26] Mahale, P., Nair, M. T., A simplified generalized Gauss-Newton method for nonlinear
ill-posed problems, Math. Comp., 78(265) (2009), 171-184.
[27] Nair, M.T., Ravishankar, P., Regularized versions of continuous Newton's method and continuous modified Newton's method under general source conditions, Numer. Funct. Anal. Optim. 29(9-10) (2008), 1140-1165.
[28] Pereverzev, S., Schock, E., On the adaptive selection of the parameter in regularization
of ill-posed problems, SIAM. J. Numer. Anal., 43(5) (2005), 2060-2076.

[29] Ramm, A. G., Smirnova, A. B., Favini, A., Continuous modified Newton’s-type
method for nonlinear operator equations. Ann. Mat. Pura Appl. 182 (2003), 37-52.

[30] Semenova, E.V., Lavrentiev regularization and balancing principle for solving ill-
posed problems with monotone operators, Comput. Methods Appl. Math., 4 (2010),
444-454.

[31] Tautenhahn, U., On the method of Lavrentiev regularization for nonlinear ill-posed
problems, Inverse Problems, 18 (2002), 191-207.
Chapter 13

Enlarging the Convergence Domain of Secant-Like Methods for Equations
13.1. Introduction
Let $X$, $Y$ be Banach spaces and $D$ be a non-empty, convex and open subset of $X$. Let $U(x, r)$ and $\overline{U}(x, r)$ stand, respectively, for the open and closed balls in $X$ with center $x$ and radius $r > 0$. Denote by $L(X, Y)$ the space of bounded linear operators from $X$ into $Y$. In the present chapter we are concerned with the problem of approximating a locally unique solution $x^{*}$ of the equation
$$F(x) = 0, \qquad (13.1.1)$$
where $F$ is a Fréchet continuously differentiable operator defined on $D$ with values in $Y$.
A lot of problems from computational sciences and other disciplines can be brought in
the form of equation (13.1.1) using Mathematical Modelling [8, 10, 14]. The solution of
these equations can rarely be found in closed form. That is why most solution methods for
these equations are iterative. In particular, the practice of numerical analysis for finding
such solutions is essentially connected to variants of Newton’s method [8, 10, 14, 23, 26,
28, 33].
A very important aspect in the study of iterative procedures is the convergence domain. In general the convergence domain is small, so it is important to enlarge it without requiring additional hypotheses. This is our goal in this chapter.
In the present chapter we study the secant-like method defined by
$$\begin{cases}
x_{-1},\ x_0 \ \text{are initial points},\\[2pt]
y_n = \lambda x_n + (1-\lambda)x_{n-1}, \quad \lambda \in [0,1],\\[2pt]
x_{n+1} = x_n - B_n^{-1}F(x_n), \quad B_n = [y_n, x_n; F] \quad \text{for each } n = 0, 1, 2, \cdots.
\end{cases} \qquad (13.1.2)$$
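For concreteness, here is a minimal sketch of (13.1.2) for $F : \mathbb{R}^m \to \mathbb{R}^m$, with the divided difference realized as the integral mean $[x, y; F] = \int_0^1 F'(\tau x + (1-\tau)y)\,d\tau$, approximated by the midpoint rule $F'((x+y)/2)$; this realization is one common choice and an assumption of the sketch, not prescribed by the chapter. The sketch also covers the limiting cases $\lambda = 0$ and $\lambda = 1$ discussed next.

```python
import numpy as np

def secant_like(F, Jac, x_minus1, x0, lam=0.5, tol=1e-12, max_iter=50):
    # Secant-like method (13.1.2): y_n = lam*x_n + (1-lam)*x_{n-1},
    # x_{n+1} = x_n - B_n^{-1} F(x_n) with B_n = [y_n, x_n; F].
    # [u, v; F] = \int_0^1 F'(t u + (1-t) v) dt is approximated here by
    # the midpoint rule F'((u+v)/2) -- adequate for smooth F.
    x_prev, x = np.asarray(x_minus1, float), np.asarray(x0, float)
    for _ in range(max_iter):
        y = lam * x + (1.0 - lam) * x_prev
        B = Jac(0.5 * (y + x))                  # approximates [y_n, x_n; F]
        step = np.linalg.solve(B, F(x))
        x_prev, x = x, x - step
        if np.linalg.norm(step, ord=np.inf) < tol:
            break
    return x
```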

The family of secant-like methods reduces to the secant method if $\lambda = 0$ and to Newton's method if $\lambda = 1$. It was shown in [28] (see also [7, 8, 20, 22] and the references therein) that the R-order of convergence is at least $(1 + \sqrt{5})/2$ if $\lambda \in [0, 1)$, the same as that of the secant method. In the real case, the closer $x_n$ and $y_n$ are, the higher the speed of convergence.

Moreover, in [19] it was shown that as $\lambda$ approaches 1 the speed of convergence approaches that of Newton's method. An advantage of using the secant-like method instead of Newton's method is that the former avoids the computation of $F'(x_n)^{-1}$ at each step. The study of the convergence of iterative procedures is usually centered on two types: semilocal and local convergence analysis. The semilocal convergence analysis is based on the information around an initial point and gives criteria ensuring the convergence of the iterative procedure, while the local one is based on the information around a solution and provides estimates of the radii of convergence balls. There is a plethora of studies on the weakening and/or extension of the hypotheses made on the underlying operators; see, for example, [1]-[35], or even graphical tools to study these methods [25].
The hypotheses used for the semilocal convergence of the secant-like method are (see [8, 18, 19, 22]):

(C1) There exists a divided difference of order one, denoted by $[x, y; F] \in L(X, Y)$, satisfying
$$[x, y; F](x - y) = F(x) - F(y) \quad \text{for all } x, y \in D;$$

(C2) There exist $x_{-1}, x_0$ in $D$ and $c > 0$ such that
$$\|x_0 - x_{-1}\| \le c;$$

(C3) There exist $x_{-1}, x_0 \in D$ and $M > 0$ such that $A_0^{-1} \in L(Y, X)$ and
$$\|A_0^{-1}([x, y; F] - [u, v; F])\| \le M(\|x - u\| + \|y - v\|) \quad \text{for all } x, y, u, v \in D;$$

(C3*) There exist $x_{-1}, x_0 \in D$ and $L > 0$ such that $A_0^{-1} \in L(Y, X)$ and
$$\|A_0^{-1}([x, y; F] - [v, y; F])\| \le L\|x - v\| \quad \text{for all } x, y, v \in D;$$

(C3**) There exist $x_{-1}, x_0 \in D$ and $K > 0$ such that $F'(x_0)^{-1} \in L(Y, X)$ and
$$\|F'(x_0)^{-1}([x, y; F] - [v, y; F])\| \le K\|x - v\| \quad \text{for all } x, y, v \in D;$$

(C4) There exists $\eta > 0$ such that
$$\|A_0^{-1}F(x_0)\| \le \eta;$$

(C4*) There exists $\eta > 0$ for each $\lambda \in [0, 1]$ such that
$$\|B_0^{-1}F(x_0)\| \le \eta.$$

We shall refer to (C1 )–(C4 ) as the (C ) conditions. From analyzing the semilocal conver-
gence of the simplified secant method, it was shown [18] that the convergence criteria are
milder than those of secant-like method given in [21]. Consequently, the decreasing and
accessibility regions of (13.1.2) can be improved. Moreover, the semilocal convergence of
(13.1.2) is guaranteed.
In the present chapter we show that an even larger convergence domain can be obtained under the same or weaker sufficient convergence criteria for method (13.1.2). In view of (C3), we have:

(C5) There exists $M_0 > 0$ such that
$$\|A_0^{-1}([x, y; F] - [x_{-1}, x_0; F])\| \le M_0(\|x - x_{-1}\| + \|y - x_0\|) \quad \text{for all } x, y \in D.$$

We shall also use the conditions

(C6) There exist $x_0 \in D$ and $M_1 > 0$ such that $F'(x_0)^{-1} \in L(Y, X)$ and
$$\|F'(x_0)^{-1}([x, y; F] - F'(x_0))\| \le M_1(\|x - x_0\| + \|y - x_0\|) \quad \text{for all } x, y \in D;$$

(C7) There exist $x_0 \in D$ and $M_2 > 0$ such that $F'(x_0)^{-1} \in L(Y, X)$ and
$$\|F'(x_0)^{-1}(F'(x) - F'(x_0))\| \le M_2(\|x - x_0\| + \|y - x_0\|) \quad \text{for all } x, y \in D.$$

Note that $M_0 \le M$, $M_2 \le M_1$, $L \le M$ hold in general and $M/M_0$, $M_1/M_2$, $M/L$ can be arbitrarily large [6, 7, 8, 9, 10, 14]. We shall refer to (C1), (C2), (C3**), (C4*), (C6) as the (C*) conditions and to (C1), (C2), (C3*), (C4*), (C5) as the (C**) conditions. Note that (C5) is not an additional hypothesis to (C3), since in practice the computation of the constant $M$ requires that of $M_0$. Note that if (C6) holds, then we can set $M_2 = 2M_1$ in (C7).
The chapter is organized as follows. In Section 13.2 we use the (C*) and (C**) conditions instead of the (C) conditions to provide new semilocal convergence analyses for method (13.1.2) under weaker sufficient criteria than those given in [18, 19, 22, 27, 28]. This way we obtain a larger convergence domain and a tighter convergence analysis. Two numerical examples, illustrating the improvement of the domain of starting points achieved with the new semilocal convergence results, are given in Section 13.3.

13.2. Semilocal Convergence of Secant-Like Method


We present the semilocal convergence of secant-like method. First, we need some results
on majorizing sequences for secant-like method.
Lemma 13.2.1. Let $c \ge 0$, $\eta > 0$, $M_1 > 0$, $K > 0$ and $\lambda \in [0, 1]$. Set $t_{-1} = 0$, $t_0 = c$ and $t_1 = c + \eta$. Define scalar sequences $\{q_n\}$, $\{t_n\}$, $\{\alpha_n\}$ for each $n = 0, 1, \cdots$ by
$$q_n = (1-\lambda)(t_n - t_0) + (1+\lambda)(t_{n+1} - t_0),$$
$$t_{n+2} = t_{n+1} + \frac{K(t_{n+1} - t_n + (1-\lambda)(t_n - t_{n-1}))}{1 - M_1q_n}(t_{n+1} - t_n), \qquad (13.2.1)$$
$$\alpha_n = \frac{K(t_{n+1} - t_n + (1-\lambda)(t_n - t_{n-1}))}{1 - M_1q_n}, \qquad (13.2.2)$$
functions $\{f_n\}$ for each $n = 1, 2, \cdots$ by
$$f_n(t) = K\eta t^n + K(1-\lambda)\eta t^{n-1} + M_1\eta\big((1-\lambda)(1 + t + \cdots + t^n) + (1+\lambda)(1 + t + \cdots + t^{n+1})\big) - 1 \qquad (13.2.3)$$
and the polynomial $p$ by
$$p(t) = M_1(1+\lambda)t^3 + (M_1(1-\lambda) + K)t^2 - K\lambda t - K(1-\lambda). \qquad (13.2.4)$$

Denote by α the smallest root of polynomial p in (0, 1). Suppose that

0 < α0 ≤ α ≤ 1 − 2 M1 η. (13.2.5)

Then, sequence {tn} is non-decreasing, bounded from above by t ?? defined by


η
t ?? = +c (13.2.6)
1−α
and converges to its unique least upper bound t ? which satisfies

c + η ≤ t ? ≤ t ?? . (13.2.7)

Moreover, the following estimates are satisfied for each n = 0, 1, · · ·

0 ≤ tn+1 − tn ≤ αn η (13.2.8)

and
αn η
t ? − tn ≤ . (13.2.9)
1−α
Proof. We shall first prove that the polynomial $p$ has roots in $(0, 1)$. If $\lambda \ne 1$, $p(0) = -(1-\lambda)K < 0$ and $p(1) = 2M_1 > 0$. If $\lambda = 1$, $p(t) = t\bar{p}(t)$ with $\bar{p}(t) = 2M_1t^2 + Kt - K$, $\bar{p}(0) = -K < 0$ and $\bar{p}(1) = 2M_1 > 0$. In either case it follows from the intermediate value theorem that there exist roots in $(0, 1)$. Denote by $\alpha$ the minimal root of $p$ in $(0, 1)$. Note that, in particular for Newton's method (i.e., for $\lambda = 1$) and for the secant method (i.e., for $\lambda = 0$), we have, respectively, by (13.2.4) that
$$\alpha = \frac{2K}{K + \sqrt{K^2 + 8M_1K}} \qquad (13.2.10)$$
and
$$\alpha = \frac{2K}{K + \sqrt{K^2 + 4M_1K}}. \qquad (13.2.11)$$
It follows from (13.2.1) and (13.2.2) that estimate (13.2.8) is satisfied if
$$0 \le \alpha_n \le \alpha. \qquad (13.2.12)$$
Estimate (13.2.12) is true by (13.2.5) for $n = 0$. Then we have by (13.2.1) that
$$t_2 - t_1 \le \alpha(t_1 - t_0) \implies t_2 \le t_1 + \alpha(t_1 - t_0) = c + (1+\alpha)\eta = c + \frac{1-\alpha^2}{1-\alpha}\eta < t^{**}.$$
Suppose that
$$t_{k+1} - t_k \le \alpha^k\eta \quad\text{and}\quad t_{k+1} \le c + \frac{1-\alpha^{k+1}}{1-\alpha}\eta. \qquad (13.2.13)$$
Estimate (13.2.12) shall be true for $k + 1$ replacing $n$ if
$$0 \le \alpha_{k+1} \le \alpha \qquad (13.2.14)$$
or
$$f_k(\alpha) \le 0. \qquad (13.2.15)$$
We need a relationship between two consecutive recurrent functions $f_k$ for each $k = 1, 2, \cdots$. It follows from (13.2.3) and (13.2.4) that
$$f_{k+1}(\alpha) = f_k(\alpha) + p(\alpha)\alpha^{k-1}\eta = f_k(\alpha), \qquad (13.2.16)$$
since $p(\alpha) = 0$. Define the function $f_\infty$ on $(0, 1)$ by
$$f_\infty(t) = \lim_{n\to\infty}f_n(t). \qquad (13.2.17)$$
Then we get from (13.2.3) and (13.2.17) that
$$f_\infty(\alpha) = M_1\eta\Big(\frac{1-\lambda}{1-\alpha} + \frac{1+\lambda}{1-\alpha}\Big) - 1 = \frac{2M_1\eta}{1-\alpha} - 1, \qquad (13.2.18)$$
since $\alpha \in (0, 1)$. In view of (13.2.15), (13.2.16) and (13.2.18), we can show instead of (13.2.15) that
$$f_\infty(\alpha) \le 0, \qquad (13.2.19)$$
which is true by (13.2.5). The induction for (13.2.8) is complete. It follows that the sequence $\{t_n\}$ is non-decreasing, bounded from above by $t^{**}$ given by (13.2.6), and as such it converges to $t^{*}$, which satisfies (13.2.7). Estimate (13.2.9) follows from (13.2.8) by using standard majorization techniques [8, 10, 23]. The proof of Lemma 13.2.1 is complete. □
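Since the criterion (13.2.5) is stated through the root $\alpha$ of $p$ and the sequence (13.2.1), both are easy to evaluate numerically; the following is a small sketch (the function name is ours). With the constants of Example 1 below ($c = 0.1$, $\eta = 0.138304$, $M_1 = K = 0.120657$, $\lambda = 1/2$) it reproduces $\alpha_0 \approx 0.0233$, $\alpha \approx 0.5774$ and $1 - 2M_1\eta \approx 0.9666$.

```python
import numpy as np

def check_criterion_13_2_5(c, eta, M1, K, lam, n_terms=25):
    # Smallest root alpha of p in (0,1), the ratios alpha_n of (13.2.2),
    # and the majorizing sequence {t_n} of (13.2.1).
    roots = np.roots([M1 * (1 + lam), M1 * (1 - lam) + K, -K * lam, -K * (1 - lam)])
    alpha = min(r.real for r in roots if abs(r.imag) < 1e-12 and 0 < r.real < 1)
    t = [0.0, c, c + eta]                   # t_{-1}, t_0, t_1
    a = []                                  # alpha_0, alpha_1, ...
    for n in range(n_terms):
        q_n = (1 - lam) * (t[n + 1] - t[1]) + (1 + lam) * (t[n + 2] - t[1])
        a_n = (K * (t[n + 2] - t[n + 1] + (1 - lam) * (t[n + 1] - t[n]))
               / (1 - M1 * q_n))
        a.append(a_n)
        t.append(t[n + 2] + a_n * (t[n + 2] - t[n + 1]))
    ok = 0 < a[0] <= alpha <= 1 - 2 * M1 * eta   # criterion (13.2.5)
    return ok, a[0], alpha, t
```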
Lemma 13.2.2. Let $c \ge 0$, $\eta > 0$, $M_1 > 0$, $K > 0$ and $\lambda \in [0, 1]$. Set $r_{-1} = 0$, $r_0 = c$ and $r_1 = c + \eta$. Define scalar sequences $\{r_n\}$ for each $n = 1, \cdots$ by
$$r_2 = r_1 + \beta_1(r_1 - r_0), \qquad r_{n+2} = r_{n+1} + \beta_n(r_{n+1} - r_n), \qquad (13.2.20)$$
where
$$\beta_1 = \frac{M_1(r_1 - r_0 + (1-\lambda)(r_0 - r_{-1}))}{1 - M_1q_1}, \qquad \beta_n = \frac{K(r_{n+1} - r_n + (1-\lambda)(r_n - r_{n-1}))}{1 - M_1q_n} \quad \text{for each } n = 2, 3, \cdots,$$
and functions $\{g_n\}$ on $[0, 1)$ for each $n = 1, 2, \cdots$ by
$$g_n(t) = K\big(t^n + (1-\lambda)t^{n-1}\big)(r_2 - r_1) + M_1t\big((1-\lambda)(1 + t + \cdots + t^{n-1}) + (1+\lambda)(1 + t + \cdots + t^n)\big)(r_2 - r_1) + (2M_1\eta - 1)t. \qquad (13.2.21)$$
Suppose that
$$0 \le \beta_1 \le \alpha \le 1 - \frac{2M_1(r_2 - r_1)}{1 - 2M_1\eta}, \qquad (13.2.22)$$
where $\alpha$ is defined in Lemma 13.2.1. Then the sequence $\{r_n\}$ is non-decreasing, bounded from above by $r^{**}$ defined by
$$r^{**} = c + \eta + \frac{r_2 - r_1}{1-\alpha} \qquad (13.2.23)$$
and converges to its unique least upper bound $r^{*}$, which satisfies
$$c + \eta \le r^{*} \le r^{**}. \qquad (13.2.24)$$
Moreover, the following estimates are satisfied for each $n = 1, \cdots$:
$$0 \le r_{n+2} - r_{n+1} \le \alpha^n(r_2 - r_1). \qquad (13.2.25)$$

Proof. We shall use mathematical induction to show that
$$0 \le \beta_n \le \alpha. \qquad (13.2.26)$$
Estimate (13.2.26) is true for $n = 1$ by (13.2.22). Then we have by (13.2.20) that
$$0 \le r_3 - r_2 \le \alpha(r_2 - r_1) \implies r_3 \le r_2 + \alpha(r_2 - r_1) \implies r_3 \le r_1 + \frac{1-\alpha^2}{1-\alpha}(r_2 - r_1) \le r^{**}.$$
Suppose (13.2.26) holds for each $n \le k$; then, using (13.2.20), we obtain that
$$0 \le r_{k+2} - r_{k+1} \le \alpha^k(r_2 - r_1) \quad\text{and}\quad r_{k+2} \le r_1 + \frac{1-\alpha^{k+1}}{1-\alpha}(r_2 - r_1). \qquad (13.2.27)$$
Estimate (13.2.26) is certainly satisfied if
$$g_k(\alpha) \le 0, \qquad (13.2.28)$$
where $g_k$ is defined by (13.2.21). Using (13.2.21), we obtain the following relationship between two consecutive recurrent functions $g_k$ for each $k = 1, 2, \cdots$:
$$g_{k+1}(\alpha) = g_k(\alpha) + p(\alpha)\alpha^{k-1}(r_2 - r_1) = g_k(\alpha), \qquad (13.2.29)$$
since $p(\alpha) = 0$. Define the function $g_\infty$ on $[0, 1)$ by
$$g_\infty(t) = \lim_{k\to\infty}g_k(t). \qquad (13.2.30)$$
Then we get from (13.2.21) and (13.2.30) that
$$g_\infty(\alpha) = \alpha\Big(\frac{2M_1(r_2 - r_1)}{1-\alpha} + 2M_1\eta - 1\Big). \qquad (13.2.31)$$
In view of (13.2.28)-(13.2.31), to show (13.2.28) it suffices to have $g_\infty(\alpha) \le 0$, which is true by the right-hand hypothesis in (13.2.22). The induction for (13.2.26) (i.e., for (13.2.25)) is complete. The rest of the proof is omitted (as identical to the proof of Lemma 13.2.1). The proof of Lemma 13.2.2 is complete. □

Remark 13.2.3. Let us see how the sufficient convergence criterion (13.2.5) for the sequence $\{t_n\}$ simplifies in the interesting case of Newton's method, that is, when $c = 0$ and $\lambda = 1$. Then (13.2.5) can be written, for $L_0 = 2M_1$ and $L = 2K$, as
$$h_0 = \frac{1}{8}\Big(L + 4L_0 + \sqrt{L^2 + 8L_0L}\Big)\eta \le \frac{1}{2}. \qquad (13.2.32)$$
The convergence criterion in [18] reduces to the famous, for its simplicity and clarity, Kantorovich hypothesis
$$h = L\eta \le \frac{1}{2}. \qquad (13.2.33)$$
Note, however, that $L_0 \le L$ holds in general and $L/L_0$ can be arbitrarily large [6, 7, 8, 9, 10, 14]. We also have that
$$h \le \frac{1}{2} \implies h_0 \le \frac{1}{2} \qquad (13.2.34)$$
but not necessarily vice versa, unless $L_0 = L$, and
$$\frac{h_0}{h} \to \frac{1}{4} \quad\text{as}\quad \frac{L}{L_0} \to \infty. \qquad (13.2.35)$$
Similarly, it can easily be seen that the sufficient convergence criterion (13.2.22) for the sequence $\{r_n\}$ is given by
$$h_1 = \frac{1}{8}\Big(4L_0 + \sqrt{L_0L} + \sqrt{8L_0^2 + L_0L}\Big)\eta \le \frac{1}{2}. \qquad (13.2.36)$$
We also have that
$$h_0 \le \frac{1}{2} \implies h_1 \le \frac{1}{2} \qquad (13.2.37)$$
and
$$\frac{h_1}{h} \to 0, \quad \frac{h_1}{h_0} \to 0 \quad\text{as}\quad \frac{L_0}{L} \to 0. \qquad (13.2.38)$$
Note that the sequence $\{r_n\}$ is tighter than $\{t_n\}$ and converges under weaker conditions. Indeed, a simple inductive argument shows that, for each $n = 2, 3, \cdots$, if $M_1 < K$, then
$$r_n < t_n, \qquad r_{n+1} - r_n < t_{n+1} - t_n \qquad\text{and}\qquad r^{*} \le t^{*}. \qquad (13.2.39)$$
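The three criteria of this remark are straightforward to compare numerically; a small sketch, under the assumption that $L_0$, $L$ and $\eta$ are already known (the function name and the sample values are ours):

```python
import math

def newton_criteria(L0, L, eta):
    # Kantorovich (13.2.33) and the weaker criteria (13.2.32), (13.2.36).
    h  = L * eta
    h0 = (L + 4 * L0 + math.sqrt(L ** 2 + 8 * L0 * L)) * eta / 8
    h1 = (4 * L0 + math.sqrt(L0 * L) + math.sqrt(8 * L0 ** 2 + L0 * L)) * eta / 8
    return {"h": h, "h0": h0, "h1": h1,
            "kantorovich": h <= 0.5, "new": h0 <= 0.5, "newer": h1 <= 0.5}

# For L0 << L the chain h <= 1/2 => h0 <= 1/2 => h1 <= 1/2 is strict:
# e.g. newton_criteria(0.1, 10.0, 0.06) fails h (h = 0.6) but satisfies
# h0 (~0.156) and h1 (~0.018).
```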

We have the following useful and obvious extensions of Lemma 13.2.1 and Lemma 13.2.2, respectively.

Lemma 13.2.4. Let $N = 0, 1, 2, \cdots$ be fixed. Suppose that
$$t_1 \le t_2 \le \cdots \le t_N \le t_{N+1}, \qquad (13.2.40)$$
$$\frac{1}{M_1} > (1-\lambda)(t_N - t_0) + (1+\lambda)(t_{N+1} - t_0) \qquad (13.2.41)$$
and
$$0 \le \alpha_N \le \alpha \le 1 - 2M_1(t_{N+1} - t_N). \qquad (13.2.42)$$
Then the sequence $\{t_n\}$ generated by (13.2.1) is nondecreasing, bounded from above by $t^{**}$, and converges to $t^{*}$, which satisfies $t^{*} \in [t_{N+1}, t^{**}]$. Moreover, the following estimates are satisfied for each $n = 0, 1, \cdots$:
$$0 \le t_{N+n+1} - t_{N+n} \le \alpha^n(t_{N+1} - t_N) \qquad (13.2.43)$$
and
$$t^{*} - t_{N+n} \le \frac{\alpha^n}{1-\alpha}(t_{N+1} - t_N). \qquad (13.2.44)$$

Lemma 13.2.5. Let $N = 1, 2, \cdots$ be fixed. Suppose that
$$r_1 \le r_2 \le \cdots \le r_N \le r_{N+1}, \qquad (13.2.45)$$
$$\frac{1}{M_1} > (1-\lambda)(r_N - r_0) + (1+\lambda)(r_{N+1} - r_0) \qquad (13.2.46)$$
and
$$0 \le \beta_N \le \alpha \le 1 - \frac{2M_1(r_{N+1} - r_N)}{1 - 2M_1(r_N - r_{N-1})}. \qquad (13.2.47)$$
Then the sequence $\{r_n\}$ generated by (13.2.20) is nondecreasing, bounded from above by $r^{**}$, and converges to $r^{*}$, which satisfies $r^{*} \in [r_{N+1}, r^{**}]$. Moreover, the following estimates are satisfied for each $n = 0, 1, \cdots$:
$$0 \le r_{N+n+1} - r_{N+n} \le \alpha^n(r_{N+1} - r_N) \qquad (13.2.48)$$
and
$$r^{*} - r_{N+n} \le \frac{\alpha^n}{1-\alpha}(r_{N+1} - r_N). \qquad (13.2.49)$$
Next, we present the following semilocal convergence result for the secant-like method under the (C*) conditions.

Theorem 13.2.6. Suppose that the (C*) conditions, the conditions of Lemma 13.2.1 (or Lemma 13.2.4) and
$$\overline{U}(x_0, t^{*}) \subseteq D \qquad (13.2.50)$$
hold. Then the sequence $\{x_n\}$ generated by the secant-like method is well defined, remains in $\overline{U}(x_0, t^{*})$ for each $n = -1, 0, 1, \cdots$ and converges to a solution $x^{*} \in \overline{U}(x_0, t^{*} - c)$ of the equation $F(x) = 0$. Moreover, the following estimates are satisfied for each $n = 0, 1, \cdots$:
$$\|x_{n+1} - x_n\| \le t_{n+1} - t_n \qquad (13.2.51)$$
and
$$\|x_n - x^{*}\| \le t^{*} - t_n. \qquad (13.2.52)$$
Furthermore, if there exists $r \ge t^{*}$ such that
$$U(x_0, r) \subseteq D \qquad (13.2.53)$$
and
$$r + t^{*} < \frac{1}{M_1} \qquad\text{or}\qquad r + t^{*} < \frac{2}{M_2}, \qquad (13.2.54)$$
then the solution $x^{*}$ is unique in $U(x_0, r)$.

Proof. We use mathematical induction to prove that
$$\|x_{k+1} - x_k\| \le t_{k+1} - t_k \qquad (13.2.55)$$
and
$$\overline{U}(x_{k+1}, t^{*} - t_{k+1}) \subseteq \overline{U}(x_k, t^{*} - t_k) \qquad (13.2.56)$$
for each $k = -1, 0, 1, \cdots$. Let $z \in \overline{U}(x_0, t^{*} - t_0)$. Then we obtain that
$$\|z - x_{-1}\| \le \|z - x_0\| + \|x_0 - x_{-1}\| \le t^{*} - t_0 + c = t^{*} = t^{*} - t_{-1},$$
which implies $z \in \overline{U}(x_{-1}, t^{*} - t_{-1})$. Let also $w \in \overline{U}(x_1, t^{*} - t_1)$. We get that
$$\|w - x_0\| \le \|w - x_1\| + \|x_1 - x_0\| \le t^{*} - t_1 + t_1 - t_0 = t^{*} - t_0,$$
that is, $w \in \overline{U}(x_0, t^{*} - t_0)$. Note that
$$\|x_{-1} - x_0\| \le c = t_0 - t_{-1} \qquad\text{and}\qquad \|x_1 - x_0\| = \|B_0^{-1}F(x_0)\| \le \eta = t_1 - t_0 < t^{*},$$
which implies $x_1 \in \overline{U}(x_0, t^{*}) \subseteq D$. Hence, estimates (13.2.55) and (13.2.56) hold for $k = -1$ and $k = 0$. Suppose (13.2.55) and (13.2.56) hold for all $n \le k$. Then we obtain that
$$\|x_{k+1} - x_0\| \le \sum_{i=1}^{k+1}\|x_i - x_{i-1}\| \le \sum_{i=1}^{k+1}(t_i - t_{i-1}) = t_{k+1} - t_0 \le t^{*}$$
and
$$\|y_k - x_0\| \le \lambda\|x_k - x_0\| + (1-\lambda)\|x_{k-1} - x_0\| \le \lambda t^{*} + (1-\lambda)t^{*} = t^{*}.$$
Hence, $x_{k+1}, y_k \in \overline{U}(x_0, t^{*})$. Let $E_k := [x_{k+1}, x_k; F]$ for each $k = 0, 1, \cdots$. Using (13.1.2), Lemma 13.2.1 and the induction hypotheses, we get that
$$\begin{aligned}
\|F'(x_0)^{-1}(B_{k+1} - F'(x_0))\| &\le M_1(\|y_{k+1} - x_0\| + \|x_{k+1} - x_0\|)\\
&\le M_1\big((1-\lambda)\|x_k - x_0\| + \lambda\|x_{k+1} - x_0\| + \|x_{k+1} - x_0\|\big)\\
&\le M_1\big((1-\lambda)(t_k - t_0) + (1+\lambda)(t_{k+1} - t_0)\big) < 1, \qquad (13.2.57)
\end{aligned}$$
since $y_{k+1} - x_0 = \lambda(x_{k+1} - x_0) + (1-\lambda)(x_k - x_0)$ and
$$\|y_{k+1} - x_0\| = \|\lambda(x_{k+1} - x_0) + (1-\lambda)(x_k - x_0)\| \le \lambda\|x_{k+1} - x_0\| + (1-\lambda)\|x_k - x_0\|.$$
It follows from (13.2.57) and the Banach lemma on invertible operators that $B_{k+1}^{-1}$ exists and
$$\|B_{k+1}^{-1}F'(x_0)\| \le \frac{1}{1 - \Theta_k} \le \frac{1}{1 - M_1q_{k+1}}, \qquad (13.2.58)$$
where $\Theta_k = M_1\big((1-\lambda)\|x_k - x_0\| + (1+\lambda)\|x_{k+1} - x_0\|\big)$. In view of (13.1.2), we obtain the identity
$$F(x_{k+1}) = F(x_{k+1}) - F(x_k) - B_k(x_{k+1} - x_k) = (E_k - B_k)(x_{k+1} - x_k). \qquad (13.2.59)$$
Then, using the induction hypotheses, the (C*) conditions and (13.2.59), we get in turn that
$$\begin{aligned}
\|F'(x_0)^{-1}F(x_{k+1})\| &= \|F'(x_0)^{-1}(E_k - B_k)(x_{k+1} - x_k)\|\\
&\le K\|x_{k+1} - y_k\|\,\|x_{k+1} - x_k\|\\
&\le K\big(\|x_{k+1} - x_k\| + (1-\lambda)\|x_k - x_{k-1}\|\big)\|x_{k+1} - x_k\|\\
&\le K\big(t_{k+1} - t_k + (1-\lambda)(t_k - t_{k-1})\big)(t_{k+1} - t_k), \qquad (13.2.60)
\end{aligned}$$
since $x_{k+1} - y_k = x_{k+1} - x_k + (1-\lambda)(x_k - x_{k-1})$ and
$$\|x_{k+1} - y_k\| \le \|x_{k+1} - x_k\| + (1-\lambda)\|x_k - x_{k-1}\| \le t_{k+1} - t_k + (1-\lambda)(t_k - t_{k-1}).$$
It now follows from (13.1.2), (13.2.1) and (13.2.58)-(13.2.60) that
$$\|x_{k+2} - x_{k+1}\| \le \|B_{k+1}^{-1}F'(x_0)\|\,\|F'(x_0)^{-1}F(x_{k+1})\| \le \frac{K(t_{k+1} - t_k + (1-\lambda)(t_k - t_{k-1}))(t_{k+1} - t_k)}{1 - M_1q_{k+1}} = t_{k+2} - t_{k+1},$$
which completes the induction for (13.2.55). Furthermore, let $v \in \overline{U}(x_{k+2}, t^{*} - t_{k+2})$. Then we have that
$$\|v - x_{k+1}\| \le \|v - x_{k+2}\| + \|x_{k+2} - x_{k+1}\| \le t^{*} - t_{k+2} + t_{k+2} - t_{k+1} = t^{*} - t_{k+1},$$
which implies $v \in \overline{U}(x_{k+1}, t^{*} - t_{k+1})$. The induction for (13.2.55) and (13.2.56) is complete. Lemma 13.2.1 implies that $\{t_k\}$ is a complete sequence. It follows from (13.2.55) and (13.2.56) that $\{x_k\}$ is a complete sequence in the Banach space $X$ and as such it converges to some $x^{*} \in \overline{U}(x_0, t^{*})$ (since $\overline{U}(x_0, t^{*})$ is a closed set). By letting $k \to \infty$ in (13.2.60), we get that $F(x^{*}) = 0$. Moreover, estimate (13.2.52) follows from (13.2.51) by using standard majorization techniques [8, 10, 23]. To show the uniqueness part, let $y^{*} \in U(x_0, r)$ be such that $F(y^{*}) = 0$, where $r$ satisfies (13.2.53) and (13.2.54). We have that
$$\|F'(x_0)^{-1}([y^{*}, x^{*}; F] - F'(x_0))\| \le M_1(\|y^{*} - x_0\| + \|x^{*} - x_0\|) \le M_1(t^{*} + r) < 1. \qquad (13.2.61)$$
It follows from (13.2.61) and the Banach lemma on invertible operators that the linear operator $[y^{*}, x^{*}; F]^{-1}$ exists. Then, using the identity $0 = F(y^{*}) - F(x^{*}) = [y^{*}, x^{*}; F](y^{*} - x^{*})$, we deduce that $x^{*} = y^{*}$. The proof of Theorem 13.2.6 is complete. □

In order to present the semilocal result for the secant-like method under the (C**) conditions, we first need a result on a majorizing sequence; its proof is analogous to that of Lemma 13.2.1.

Remark 13.2.7. Clearly, (13.2.22) (or (13.2.47)) and $\{r_n\}$ can replace (13.2.5) (or (13.2.42)) and $\{t_n\}$, respectively, in Theorem 13.2.6.

Lemma 13.2.8. Let $c \ge 0$, $\eta > 0$, $L > 0$, $M_0 > 0$ with $M_0c < 1$ and $\lambda \in [0, 1]$. Set
$$s_{-1} = 0, \qquad s_0 = c, \qquad s_1 = c + \eta, \qquad \tilde{K} = \frac{L}{1 - M_0c} \qquad\text{and}\qquad \tilde{M}_1 = \frac{M_0}{1 - M_0c}.$$

Define scalar sequences $\{\tilde{q}_n\}$, $\{s_n\}$, $\{\tilde{\alpha}_n\}$ for each $n = 0, 1, \cdots$ by
$$\tilde{q}_n = (1-\lambda)(s_n - s_0) + (1+\lambda)(s_{n+1} - s_0),$$
$$s_{n+2} = s_{n+1} + \frac{\tilde{K}(s_{n+1} - s_n + (1-\lambda)(s_n - s_{n-1}))}{1 - \tilde{M}_1\tilde{q}_n}(s_{n+1} - s_n),$$
$$\tilde{\alpha}_n = \frac{\tilde{K}(s_{n+1} - s_n + (1-\lambda)(s_n - s_{n-1}))}{1 - \tilde{M}_1\tilde{q}_n},$$
functions $\{\tilde{f}_n\}$ for each $n = 1, 2, \cdots$ by
$$\tilde{f}_n(t) = \tilde{K}\eta t^n + \tilde{K}(1-\lambda)\eta t^{n-1} + \tilde{M}_1\eta\big((1-\lambda)(1 + t + \cdots + t^n) + (1+\lambda)(1 + t + \cdots + t^{n+1})\big) - 1$$
and the polynomial $\tilde{p}$ by
$$\tilde{p}(t) = \tilde{M}_1(1+\lambda)t^3 + (\tilde{M}_1(1-\lambda) + \tilde{K})t^2 - \tilde{K}\lambda t - \tilde{K}(1-\lambda).$$
Denote by $\tilde{\alpha}$ the smallest root of the polynomial $\tilde{p}$ in $(0, 1)$. Suppose that
$$0 \le \tilde{\alpha}_0 \le \tilde{\alpha} \le 1 - 2\tilde{M}_1\eta. \qquad (13.2.62)$$
Then the sequence $\{s_n\}$ is non-decreasing, bounded from above by $s^{**}$ defined by
$$s^{**} = \frac{\eta}{1-\tilde{\alpha}} + c$$
and converges to its unique least upper bound $s^{*}$, which satisfies $c + \eta \le s^{*} \le s^{**}$. Moreover, the following estimates are satisfied for each $n = 0, 1, \cdots$:
$$0 \le s_{n+1} - s_n \le \tilde{\alpha}^n\eta \qquad\text{and}\qquad s^{*} - s_n \le \frac{\tilde{\alpha}^n\eta}{1-\tilde{\alpha}}.$$
Next, we present the semilocal convergence result for the secant-like method under the (C**) conditions.

Theorem 13.2.9. Suppose that the (C**) conditions, (13.2.62) (or the conditions of Lemma 13.2.2 with $\tilde{\alpha}_n, \tilde{\alpha}, \tilde{M}_1$ replacing $\alpha_n, \alpha, M_1$, respectively) and $\overline{U}(x_0, s^{*}) \subseteq D$ hold. Then the sequence $\{x_n\}$ generated by the secant-like method is well defined, remains in $\overline{U}(x_0, s^{*})$ for each $n = -1, 0, 1, \cdots$ and converges to a solution $x^{*} \in \overline{U}(x_0, s^{*})$ of the equation $F(x) = 0$. Moreover, the following estimates are satisfied for each $n = 0, 1, \cdots$:
$$\|x_{n+1} - x_n\| \le s_{n+1} - s_n \qquad\text{and}\qquad \|x_n - x^{*}\| \le s^{*} - s_n.$$
Furthermore, if there exists $r \ge s^{*}$ such that $U(x_0, r) \subseteq D$ and $r + s^{*} + c < 1/M_0$, then the solution $x^{*}$ is unique in $U(x_0, r)$.

Proof. The proof is analogous to that of Theorem 13.2.6. Simply notice that, in view of (C5), we obtain instead of (13.2.57) that
$$\begin{aligned}
\|A_0^{-1}(B_{k+1} - A_0)\| &\le M_0(\|y_{k+1} - x_{-1}\| + \|x_{k+1} - x_0\|)\\
&\le M_0\big((1-\lambda)\|x_k - x_0\| + \lambda\|x_{k+1} - x_0\| + \|x_0 - x_{-1}\| + \|x_{k+1} - x_0\|\big)\\
&\le M_0\big((1-\lambda)(s_k - s_0) + (1+\lambda)(s_{k+1} - s_0) + c\big) < 1,
\end{aligned}$$
so that $B_{k+1}^{-1}$ exists and
$$\|B_{k+1}^{-1}A_0\| \le \frac{1}{1 - \Xi_k},$$
where $\Xi_k = M_0\big((1-\lambda)(s_k - s_0) + (1+\lambda)(s_{k+1} - s_0) + c\big)$. Moreover, using (C3*) instead of (C3**), we get that
$$\|A_0^{-1}F(x_{k+1})\| \le L\big(s_{k+1} - s_k + (1-\lambda)(s_k - s_{k-1})\big)(s_{k+1} - s_k).$$
Hence, we have that
$$\begin{aligned}
\|x_{k+2} - x_{k+1}\| &\le \|B_{k+1}^{-1}A_0\|\,\|A_0^{-1}F(x_{k+1})\|\\
&\le \frac{L(s_{k+1} - s_k + (1-\lambda)(s_k - s_{k-1}))(s_{k+1} - s_k)}{1 - M_0((1+\lambda)(s_{k+1} - s_0) + (1-\lambda)(s_k - s_0) + c)}\\
&\le \frac{\tilde{K}(s_{k+1} - s_k + (1-\lambda)(s_k - s_{k-1}))(s_{k+1} - s_k)}{1 - \tilde{M}_1((1+\lambda)(s_{k+1} - s_0) + (1-\lambda)(s_k - s_0))} = s_{k+2} - s_{k+1}.
\end{aligned}$$
The uniqueness part is given in Theorem 13.2.6 with $r$, $s^{*}$ replacing $R_2$ and $R_0$, respectively. The proof of Theorem 13.2.9 is complete. □
Remark 13.2.10. (a) Condition (13.2.50) can be replaced by
$$\overline{U}(x_0, t^{**}) \subseteq D, \qquad (13.2.63)$$
where $t^{**}$ is given in closed form by (13.2.6).

(b) The majorizing sequence $\{u_n\}$ essentially used in [18] is defined by
$$u_{-1} = 0, \qquad u_0 = c, \qquad u_1 = c + \eta,$$
$$u_{n+2} = u_{n+1} + \frac{M(u_{n+1} - u_n + (1-\lambda)(u_n - u_{n-1}))}{1 - Mq_n^{*}}(u_{n+1} - u_n), \qquad (13.2.64)$$
where
$$q_n^{*} = (1-\lambda)(u_n - u_0) + (1+\lambda)(u_{n+1} - u_0).$$
Then, if $K < M$ or $M_1 < M$, a simple inductive argument shows that for each $n = 2, 3, \cdots$
$$t_n < u_n, \qquad t_{n+1} - t_n < u_{n+1} - u_n \qquad\text{and}\qquad t^{*} \le u^{*} = \lim_{n\to\infty}u_n. \qquad (13.2.65)$$
Clearly, $\{t_n\}$ converges under the (C) conditions and the conditions of Lemma 13.2.1. Moreover, as we already showed in Remark 13.2.3, the sufficient convergence criteria of Theorem 13.2.6 can be weaker than those of Theorem 13.2.9. Similarly, if $L \le M$, $\{s_n\}$ is a tighter sequence than $\{u_n\}$. In general, we shall test the convergence criteria and use the tightest sequence to estimate the error bounds.

(c) Clearly, the conclusions of Theorem 13.2.9 hold if $\{s_n\}$, (13.2.62) are replaced by $\{\tilde{r}_n\}$, (13.2.22), where $\{\tilde{r}_n\}$ is defined as $\{r_n\}$ with $M_0$ replacing $M_1$ in the definition of $\beta_1$ (only in the numerator) and the tilde letters replacing the non-tilde letters in (13.2.22).

13.3. Numerical Examples


Now we check numerically, with two examples, that the new semilocal convergence results obtained in Theorems 13.2.6 and 13.2.9 improve the domain of starting points obtained by the following classical result given in [21].

Theorem 13.3.1. Let $X$ and $Y$ be two Banach spaces and $F : \Omega \subseteq X \to Y$ be a nonlinear operator defined on a non-empty open convex domain $\Omega$. Let $x_{-1}, x_0 \in \Omega$ and $\lambda \in [0, 1]$. Suppose that there exists $[u, v; F] \in L(X, Y)$ for all $u, v \in \Omega$ ($u \ne v$), and that the following four conditions

• $\|x_0 - x_{-1}\| = c \ne 0$ with $x_{-1}, x_0 \in \Omega$;

• for fixed $\lambda \in [0, 1]$, the operator $B_0 = [y_0, x_0; F]$ is invertible and such that $\|B_0^{-1}\| \le \beta$;

• $\|B_0^{-1}F(x_0)\| \le \eta$;

• $\|[x, y; F] - [u, v; F]\| \le Q(\|x - u\| + \|y - v\|)$, $Q \ge 0$, $x, y, u, v \in \Omega$, $x \ne y$, $u \ne v$;

are satisfied. If $B(x_0, \rho) \subseteq \Omega$, where $\rho = \dfrac{1-a}{1-2a}\eta$,
$$a = \frac{\eta}{c+\eta} < \frac{3-\sqrt{5}}{2} \qquad\text{and}\qquad b = \frac{Q\beta c^2}{c+\eta} < \frac{a(1-a)^2}{1 + \lambda(2a-1)}, \qquad (13.3.1)$$
then the secant-like methods defined by (13.1.2) converge to a solution $x^{*}$ of the equation $F(x) = 0$ with R-order of convergence at least $\frac{1+\sqrt{5}}{2}$. Moreover, $x_n, x^{*} \in \overline{B}(x_0, \rho)$, and the solution $x^{*}$ is unique in $B(x_0, \tau) \cap \Omega$, where $\tau = \frac{1}{Q\beta} - \rho - (1-\lambda)\alpha$.

13.3.1. Example 1

We illustrate the above with an application where a system of nonlinear equations is involved. We see that Theorem 13.3.1 cannot guarantee the semilocal convergence of the secant-like methods (13.1.2), but Theorem 13.2.6 can.
It is well known that energy is dissipated in the action of any real dynamical system, usually through some form of friction. However, in certain situations this dissipation is so slow that it can be neglected over relatively short periods of time. In such cases we assume the law of conservation of energy, namely, that the sum of the kinetic energy and the potential energy is constant. A system of this kind is said to be conservative.
If $\varphi$ and $\psi$ are arbitrary functions with the property that $\varphi(0) = 0$ and $\psi(0) = 0$, the general equation
$$\mu\frac{d^2x(t)}{dt^2} + \psi\Big(\frac{dx(t)}{dt}\Big) + \varphi(x(t)) = 0 \qquad (13.3.2)$$

can be interpreted as the equation of motion of a mass µ under the action of a restoring force
−ϕ(x) and a damping force −ψ(dx/dt). In general these forces are nonlinear, and equation
(13.3.2) can be regarded as the basic equation of nonlinear mechanics. In this chapter we
shall consider the special case of a nonlinear conservative system described by the equation
d 2 x(t)
µ + ϕ(x(t)) = 0,
dt 2
in which the damping force is zero and there is consequently no dissipation of energy.
Extensive discussions of (13.3.2), with applications to a variety of physical problems, can
be found in classical references [4] and [32].
Now, we consider the special case of a nonlinear conservative system described by the
equation
d 2 x(t)
+ φ(x(t)) = 0 (13.3.3)
dt 2
with the boundary conditions
x(0) = x(1) = 0. (13.3.4)
After that, we use a process of discretization to transform problem (13.3.3)-(13.3.4) into a finite-dimensional problem and look for an approximate solution of it when a particular function $\phi$ is considered. So, we transform problem (13.3.3)-(13.3.4) into a system of nonlinear equations by approximating the second derivative by a standard numerical formula.
Firstly, we introduce the points $t_j = jh$, $j = 0, 1, \ldots, m+1$, where $h = \frac{1}{m+1}$ and $m$ is an appropriate integer. A scheme is then designed for the determination of numbers $x_j$ which, it is hoped, approximate the values $x(t_j)$ of the true solution at the points $t_j$. A standard approximation for the second derivative at these points is
$$x_j'' \approx \frac{x_{j-1} - 2x_j + x_{j+1}}{h^2}, \qquad j = 1, 2, \ldots, m.$$
A natural way to obtain such a scheme is to demand that the $x_j$ satisfy at each interior mesh point $t_j$ the difference equation
$$x_{j-1} - 2x_j + x_{j+1} + h^2\phi(x_j) = 0. \qquad (13.3.5)$$
Since $x_0$ and $x_{m+1}$ are determined by the boundary conditions, the unknowns are $x_1, x_2, \ldots, x_m$.
A further discussion is simplified by the use of matrix and vector notation. Introducing the vectors
$$x = \begin{pmatrix}x_1\\ x_2\\ \vdots\\ x_m\end{pmatrix}, \qquad v_x = \begin{pmatrix}\phi(x_1)\\ \phi(x_2)\\ \vdots\\ \phi(x_m)\end{pmatrix}$$
and the matrix
$$A = \begin{pmatrix}-2 & 1 & 0 & \cdots & 0\\ 1 & -2 & 1 & \cdots & 0\\ 0 & 1 & -2 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & -2\end{pmatrix},$$
the system of equations, arising from demanding that (13.3.5) holds for $j = 1, 2, \ldots, m$, can be written compactly in the form
$$F(x) \equiv Ax + h^2v_x = 0, \qquad (13.3.6)$$
where $F$ is a function from $\mathbb{R}^m$ into $\mathbb{R}^m$.
From now on, the focus of our attention is to solve a particular system of the form (13.3.6). We choose $m = 8$ and the infinity norm.
The steady temperature distribution is known in a homogeneous rod of length 1 in which, as a consequence of a chemical reaction or some such heat-producing process, heat is generated at a rate $\phi(x(t))$ per unit time per unit length, $\phi(x(t))$ being a given function of the excess temperature $x$ of the rod over the temperature of the surroundings. If the ends of the rod, $t = 0$ and $t = 1$, are kept at given temperatures, we are to solve the boundary value problem given by (13.3.3)-(13.3.4), measured along the axis of the rod. For an example, we choose an exponential law $\phi(x(t)) = \exp(x(t))$ for the heat generation.
Taking into account that the solution of (13.3.3)-(13.3.4) with $\phi(x(t)) = \exp(x(t))$ is of the form
$$x(s) = \int_0^1 G(s,t)\exp(x(t))\,dt,$$
where $G(s,t)$ is the Green function in $[0,1]\times[0,1]$, we can locate the solution $x^{*}(s)$ in some domain. So we have
$$\|x^{*}(s)\| - \frac{1}{8}\exp(\|x^{*}(s)\|) \le 0,$$
so that $\|x^{*}(s)\| \in [0, \rho_1] \cup [\rho_2, +\infty)$, where $\rho_1 = 0.1444\ldots$ and $\rho_2 = 3.2616\ldots$ are the two positive real roots of the scalar equation $8t - \exp(t) = 0$.
Observing the semilocal convergence results presented in this chapter, we can only guarantee the semilocal convergence to a solution $x^{*}(s)$ such that $\|x^{*}(s)\| \in [0, \rho_1]$. For this, we can consider the domain
$$\Omega = \{x(s) \in C^2[0,1] : \|x(s)\| < \log(7/4),\ s \in [0,1]\},$$
since $\rho_1 < \log\frac{7}{4} < \rho_2$.
In view of what the domain $\Omega$ is for equation (13.3.3), we then consider (13.3.6) with $F : \tilde{\Omega} \subset \mathbb{R}^8 \to \mathbb{R}^8$ and
$$\tilde{\Omega} = \{x \in \mathbb{R}^8 : \|x\| < \log(7/4)\}.$$
According to the above, $v_x = (\exp(x_1), \exp(x_2), \ldots, \exp(x_8))^t$ if $\phi(x(t)) = \exp(x(t))$. Consequently, the first derivative of the function $F$ defined in (13.3.6) is given by
$$F'(x) = A + h^2\,\mathrm{diag}(v_x).$$
Moreover,
$$F'(x) - F'(y) = h^2\,\mathrm{diag}(z),$$
where $y = (y_1, y_2, \ldots, y_8)^t$ and $z = (\exp(x_1) - \exp(y_1), \exp(x_2) - \exp(y_2), \ldots, \exp(x_8) - \exp(y_8))$. In addition,
$$\|F'(x) - F'(y)\| \le h^2\max_{1\le i\le 8}|\exp(\ell_i)|\,\|x - y\|,$$
where $\ell = (\ell_1, \ell_2, \ldots, \ell_8)^t \in \tilde{\Omega}$ and $h = \frac{1}{9}$, so that
$$\|F'(x) - F'(y)\| \le \frac{7}{4}h^2\|x - y\|. \qquad (13.3.7)$$
Considering (see [28])
$$[x, y; F] = \int_0^1 F'(\tau x + (1-\tau)y)\,d\tau,$$
taking into account
$$\int_0^1\|\tau(x - u) + (1-\tau)(y - v)\|\,d\tau \le \frac{1}{2}(\|x - u\| + \|y - v\|),$$
and (13.3.7), we have
$$\begin{aligned}
\|[x, y; F] - [u, v; F]\| &\le \int_0^1\|F'(\tau x + (1-\tau)y) - F'(\tau u + (1-\tau)v)\|\,d\tau\\
&\le \frac{7}{4}h^2\int_0^1\big(\tau\|x - u\| + (1-\tau)\|y - v\|\big)\,d\tau\\
&= \frac{7}{8}h^2\big(\|x - u\| + \|y - v\|\big).
\end{aligned}$$
From the last, we have $L = \frac{7}{648}$ and $M_1 = \frac{7}{648}\|[F'(x_0)]^{-1}\|$.
If we choose $\lambda = \frac{1}{2}$ and the starting points $x_{-1} = (\frac{1}{10}, \frac{1}{10}, \ldots, \frac{1}{10})^t$ and $x_0 = (0, 0, \ldots, 0)^t$, we obtain $c = \frac{1}{10}$, $\beta = 11.202658\ldots$ and $\eta = 0.138304\ldots$, so that (13.3.1) of Theorem 13.3.1 is not satisfied, since
$$a = \frac{\eta}{c+\eta} = 0.580368\ldots > \frac{3-\sqrt{5}}{2} = 0.381966\ldots$$
Thus, according to Theorem 13.3.1, we cannot guarantee the convergence of the secant-like method (13.1.2) with $\lambda = \frac{1}{2}$ for approximating a solution of (13.3.6) with $\phi(s) = \exp(s)$.
However, we can do it by Theorem 13.2.6, since all the inequalities which appear in (13.2.5) are satisfied:
$$0 < \alpha_0 = 0.023303\ldots \le \alpha = 0.577350\ldots \le 1 - 2M_1\eta = 0.966625\ldots,$$
where $\|[F'(x_0)]^{-1}\| = 11.169433\ldots$, $M_1 = 0.120657\ldots$ and
$$p(t) = (0.180986\ldots)t^3 + (0.180986\ldots)t^2 - (0.060328\ldots)t - (0.060328\ldots).$$
Then we can use the secant-like method (13.1.2) with $\lambda = \frac{1}{2}$ to approximate a solution of (13.3.6) with $\phi(u) = \exp(u)$; the approximation, given by the vector $x^{*} = (x_1^{*}, x_2^{*}, \ldots, x_8^{*})^t$ shown in Table 13.3.1, is reached after four iterations with a tolerance of $10^{-16}$. In Table 13.3.2 we show the errors $\|x_n - x^{*}\|$ using the stopping criterion $\|x_n - x_{n-1}\| < 10^{-16}$. Notice that the vector shown in Table 13.3.1 is a good approximation of the solution of (13.3.6) with $\phi(u) = \exp(u)$, since $\|F(x^{*})\| \le C\times 10^{-16}$. See the sequence $\{\|F(x_n)\|\}$ in Table 13.3.2.

Table 13.3.1. Approximation of the solution $x^{*}$ of (13.3.6) with $\phi(u) = \exp(u)$

i    $x_i^{*}$          i    $x_i^{*}$          i    $x_i^{*}$          i    $x_i^{*}$
1    0.05481058...       3    0.12475178...       5    0.13893761...       7    0.09657993...
2    0.09657993...       4    0.13893761...       6    0.12475178...       8    0.05481058...

Table 13.3.2. Absolute errors obtained by the secant-like method (13.1.2) with $\lambda = \frac{1}{2}$, and $\{\|F(x_n)\|\}$

n     $\|x_n - x^{*}\|$           $\|F(x_n)\|$
-1    1.3893... × 10^{-1}         8.6355... × 10^{-2}
0     4.5189... × 10^{-2}         1.2345... × 10^{-2}
1     1.43051... × 10^{-4}        2.3416... × 10^{-5}
2     1.14121... × 10^{-7}        1.9681... × 10^{-8}
3     4.30239... × 10^{-13}       5.7941... × 10^{-14}
13.3.2. Example 2

Consider the following nonlinear boundary value problem
$$u'' = -u^3 - \frac{1}{4}u^2, \qquad u(0) = 0, \quad u(1) = 1.$$
It is well known that this problem can be formulated as the integral equation
$$u(s) = s + \int_0^1 Q(s,t)\Big(u^3(t) + \frac{1}{4}u^2(t)\Big)\,dt, \qquad (13.3.8)$$
where $Q$ is the Green function:
$$Q(s,t) = \begin{cases}t(1-s), & t \le s,\\ s(1-t), & s < t.\end{cases}$$
We observe that
$$\max_{0\le s\le 1}\int_0^1|Q(s,t)|\,dt = \frac{1}{8}.$$
Then problem (13.3.8) is in the form (13.1.1), where $F$ is defined as
$$[F(x)](s) = x(s) - s - \int_0^1 Q(s,t)\Big(x^3(t) + \frac{1}{4}x^2(t)\Big)\,dt.$$
The Fréchet derivative of the operator $F$ is given by
$$[F'(x)y](s) = y(s) - 3\int_0^1 Q(s,t)x^2(t)y(t)\,dt - \frac{1}{2}\int_0^1 Q(s,t)x(t)y(t)\,dt.$$
Choosing $x_0(s) = s$ and $R = 1$, we have that $\|F(x_0)\| \le \dfrac{1 + \frac{1}{4}}{8} = \dfrac{5}{32}$. Define the divided difference by
$$[x, y; F] = \int_0^1 F'(\tau x + (1-\tau)y)\,d\tau.$$
Taking into account that
$$\begin{aligned}
\|[x, y; F] - [v, y; F]\| &\le \int_0^1\|F'(\tau x + (1-\tau)y) - F'(\tau v + (1-\tau)y)\|\,d\tau\\
&\le \frac{1}{8}\int_0^1\Big(3\tau\|x^2 - v^2\| + 2\tau(1-\tau)\|y\|\|x - v\| + \frac{\tau}{2}\|x - v\|\Big)\,d\tau\\
&\le \frac{1}{8}\Big(\|x^2 - v^2\| + \Big(\|y\| + \frac{1}{4}\Big)\|x - v\|\Big)\\
&\le \frac{1}{8}\Big(\|x + v\| + \|y\| + \frac{1}{4}\Big)\|x - v\|\\
&\le \frac{25}{32}\|x - v\|.
\end{aligned}$$
Choosing $x_{-1}(s) = \dfrac{9s}{10}$, we find that
$$\begin{aligned}
\|1 - A_0\| &\le \int_0^1\|1 - F'(\tau x_0 + (1-\tau)x_{-1})\|\,d\tau\\
&\le \frac{1}{8}\int_0^1\bigg(3\Big(\tau + (1-\tau)\frac{9}{10}\Big)^2 + \frac{1}{2}\Big(\tau + (1-\tau)\frac{9}{10}\Big)\bigg)\,d\tau\\
&\le 0.409375\ldots
\end{aligned}$$
Using the Banach lemma on invertible operators, we obtain
$$\|A_0^{-1}\| \le 1.69312\ldots$$
and so
$$L \ge \frac{25}{32}\|A_0^{-1}\| = 1.32275\ldots$$
In an analogous way, choosing λ = 0.8, we obtain

M_0 = 0.899471...,  ‖B_0^{-1}‖ = 1.75262...  and  η = 0.273847...

Notice that we cannot guarantee the convergence of the secant method by Theorem 13.3.1, since the first condition of (3.1) is not satisfied:

a = η/(c + η) = 0.732511... > (3 − √5)/2 = 0.381966...

On the other hand, observe that

M̃_1 = 0.0988372...,  K̃ = 1.45349...,  α_0 = 0.434072...,  α = 0.907324...

and

1 − 2M̃_1η = 0.945868...

Hence condition (2.62), 0 < α_0 ≤ α ≤ 1 − 2M̃_1η, is satisfied and, as a consequence, we can ensure the convergence of the secant method by Theorem 13.2.9.

Conclusion

We presented a new semilocal convergence analysis of the secant-like method for approximating a locally unique solution of an equation in a Banach space. Using a combination of Lipschitz and center-Lipschitz conditions, instead of only the Lipschitz conditions used in [18], we provided a finer analysis with a larger convergence domain and weaker sufficient convergence conditions than in [15, 18, 19, 22, 27, 28]. Numerical examples validate the theoretical results.
References

[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004) 397-405.

[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, International Journal of Computer Mathematics, 81 (8) (2004), 1153-1161.

[3] Amat, S., Bermúdez, C., Busquier, S., Gretay, J. O., Convergence by nondiscrete
mathematical induction of a two step secant’s method, Rocky Mountain Journal of
Mathematics, 37 (2) (2007), 359-369.

[4] Andronow, A.A., Chaikin, C.E., Theory of Oscillations, Princeton University Press, New Jersey, 1949.

[5] Argyros, I.K., Polynomial Operator Equations in Abstract Spaces and Applications,
St.Lucie/CRC/Lewis Publ. Mathematics series, 1998, Boca Raton, Florida, U.S.A.

[6] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for
two-point Newton-like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374-397.

[7] Argyros, I.K., New sufficient convergence conditions for the secant method, Czechoslovak Math. J., 55 (2005), 175-187.

[8] Argyros, I.K., Convergence and Applications of Newton-type Iterations, Springer-Verlag Publ., New York, 2008.

[9] Argyros, I.K., A semilocal convergence analysis for directional Newton methods,
Math. Comput., 80 (2011), 327-343.

[10] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and its Appli-
cations, CRC Press/Taylor and Francis, Boca Raton, Florida, USA, 2012

[11] Argyros, I.K., Hilout, S., Convergence conditions for secant-type methods, Czechoslovak Math. J., 60 (2010), 253-272.

[12] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.

[13] Argyros, I.K., Hilout, S., Estimating upper bounds on the limit points of majorizing
sequences for Newton’s method, Numer. Algorithms, 62 (1) (2013), 115-132.

[14] Argyros, I.K., Hilout, S., Numerical methods in nonlinear analysis, World Scientific
Publ. Comp., New Jersey, 2013

[15] Argyros, I.K., Ezquerro, J.A., Hernández, M.Á., Hilout, S., Romero, N., Velasco, A.I., Expanding the applicability of secant-like methods for solving nonlinear equations, Carp. J. Math., 31 (1) (2015), 11-30.

[16] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Non-
linear Functional Analysis and Applications (L.B. Rall, ed.), Academic Press, New
York, (1971), 425-472.

[17] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The
Newton method: from Newton to Kantorovich. (Spanish), Gac. R. Soc. Mat. Esp.,
13 (2010), 53-76.

[18] Ezquerro, J.A., Hernández, M.A., Romero, N., Velasco, A.I., Improving the domain of starting points for secant-like methods, App. Math. Comp., 219 (8) (2012), 3677-3692.

[19] Ezquerro, J.A., Rubio, M.J., A uniparametric family of iterative processes for solving nondifferentiable equations, J. Math. Anal. Appl., 275 (2002), 821-834.

[20] A. Fraile, E. Larrodé, Á. A. Magreñán, J. A. Sicilia. Decision model for siting trans-
port and logistic facilities in urban environments: A methodological approach. J.
Comp. Appl. Math., 291 (2016), 478-487.

[21] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant-like methods for solving nonlinear integral equations of the Hammerstein type, Proceedings of the 8th International Congress on Computational and Applied Mathematics, ICCAM-98 (Leuven), J. Comput. Appl. Math., 115 (2000), 245-254.

[22] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by secant-like methods, App. Math. Comp., 169 (2005), 926-942.

[23] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[24] Laasonen, P., Ein überquadratisch konvergenter iterativer Algorithmus, Ann. Acad. Sci. Fenn. Ser. I, 450 (1969), 1-10.

[25] Magreñán, Á. A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215-224.

[26] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.

[27] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71-84.

[28] Potra, F.A., Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, New
York, 1984.

[29] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38-62.

[30] Proinov, P.D., New general convergence theory for iterative processes and its applica-
tions to Newton–Kantorovich type theorems, J. Complexity, 26 (2010), 3-42.

[31] Schmidt, J.W., Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Hungar., 9 (1978), 241-247.

[32] Stoker, J. J., Nonlinear vibrations, Interscience-Wiley, New York, 1950.

[33] Traub, J.F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood Cliffs, New Jersey, 1964.

[34] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545-557.

[35] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153-174.
Chapter 14

Solving Nonlinear Equations System via an Efficient Genetic Algorithm with Symmetric and Harmonious Individuals

14.1. Introduction

In this chapter, we introduce genetic algorithms as a general tool for solving optimization problems. As a special case, we use these algorithms to find the solution of the system of nonlinear equations

f_1(x_1, x_2, ..., x_n) = 0,
f_2(x_1, x_2, ..., x_n) = 0,
  ⋮
f_n(x_1, x_2, ..., x_n) = 0,    (14.1.1)

where f = (f_1, f_2, ..., f_n) : D = [a_1, b_1] × [a_2, b_2] × ··· × [a_n, b_n] ⊆ R^n → R^n is continuous.
Genetic algorithms (GA) were first introduced in the 1970s by John Holland at the University of Michigan [8]. Since then, a great deal of development of GA has taken place; see [5, 6] and the references therein. GA were used as an adaptive machine learning approach in their early period of development, and they have since been successfully applied in numerous areas such as artificial intelligence, self-adaptive control, systems engineering, image processing, combinatorial optimization and financial systems; GA thus show very extensive application prospects. Genetic algorithms are search algorithms based on the mechanics of natural genetics.
In a genetic algorithm, a population of candidate solutions (called individuals or phe-
notypes) to an optimization problem is evolved towards better solutions. Each candidate
solution has a set of properties (its chromosomes or genotype) which can be mutated and
altered; traditionally, solutions are represented in binary as strings of 0s and 1s, but other
encodings are also possible [5]. The evolution usually starts from a population of randomly
generated individuals and happens in generations. In each generation, the fitness of every
individual in the population is evaluated, the more fit individuals are stochastically selected
from the current population, and each individual’s genome is modified (recombined and
possibly randomly mutated) to form a new population. The new population is then used
in the next iteration of the algorithm. Commonly, the algorithm terminates when either a
maximum number of generations has been produced, or a satisfactory fitness level has been
reached for the population.
When genetic algorithms are used to solve practical problems, the premature convergence phenomenon often appears, which limits the search performance of the genetic algorithm. The cause of premature convergence is that individuals become highly similar after some generations, so the opportunity of generating new individuals by further genetic manipulation is greatly reduced. Many ideas have been proposed to avoid premature convergence [6, 15, 16, 18]. In [18], two special individuals are introduced at each generation of the genetic algorithm so that the population maintains diversity in the problem of finding the minimal distance between surfaces, and the computational efficiency is thereby improved. In [15], we introduced two other special individuals at each generation of the genetic algorithm for the same problem, and the computational efficiency was further improved. Furthermore, in [16] we suggested putting some symmetric and harmonious individuals into each generation of a genetic algorithm applied to a general optimization problem, and good computational efficiency was obtained. An application of our methods to reservoir mid-long term hydraulic power operation is given in [17].
Recently, many authors have used GA to solve nonlinear equation systems; see [9, 3, 11, 10, 14]. These works give the impression that GA are effective methods for solving nonlinear equation systems. However, efforts are still needed to solve nonlinear equation systems more effectively. In this chapter, we present a new genetic algorithm to solve Eq. (14.1.1).

The chapter is organized as follows: We convert the equation problem (14.1.1) to an optimization problem in Section 14.2; in Section 14.3 we present our new genetic algorithm with symmetric and harmonious individuals for the corresponding optimization problem; in Section 14.4 we give a mixed method combining our method with Newton's method; and in Section 14.5 we provide some numerical examples to show that our new methods are very effective. Some remarks and conclusions are given in the concluding Section 14.6.

14.2. Convert (14.1.1) to an Optimization Problem

Let us define the function F : D = [a_1, b_1] × [a_2, b_2] × ··· × [a_n, b_n] ⊆ R^n → R as follows:

F(x_1, x_2, ..., x_n) = (1/n) ∑_{i=1}^{n} |f_i(x_1, x_2, ..., x_n)|.    (14.2.1)

Then, we convert problem (14.1.1) to the following optimization problem:

min F(x_1, x_2, ..., x_n)  s.t. (x_1, x_2, ..., x_n) ∈ D.    (14.2.2)

Suppose x* ∈ D is a solution of Eq. (14.1.1); then F(x*) = 0 by the definition (14.2.1) of F. Since F(x) ≥ 0 holds for all x ∈ D, we deduce that x* is a solution of problem (14.2.2). On the other hand, assume x* ∈ D is a solution of problem (14.2.2). Then for all x ∈ D, we have F(x) ≥ F(x*). Now suppose (14.1.1) has at least one solution, denoted y*. Then 0 ≤ F(x*) ≤ F(y*) = 0, that is, F(x*) = 0 and x* is a solution of Eq. (14.1.1). Hence, Eq. (14.1.1) is equivalent to problem (14.2.2) whenever Eq. (14.1.1) has at least one solution.

From now on, we always suppose (14.1.1) has at least one solution, and we try to find it by finding a solution of (14.2.2) via a genetic algorithm.
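
As a brief illustration of (14.2.1)-(14.2.2), the following Python sketch builds the objective F from a list of residual functions; the helper names and the sample 2×2 system are ours:

```python
import numpy as np

def make_objective(fs):
    """Build F(x) = (1/n) * sum_i |f_i(x)|, as in (14.2.1).
    A point x* solves the system (14.1.1) iff F(x*) = 0."""
    def F(x):
        return sum(abs(f(x)) for f in fs) / len(fs)
    return F

# Sample 2x2 system with solution (2, 1).
fs = [lambda x: x[0]**2 + x[1]**2 + x[0] + x[1] - 8,
      lambda x: x[0] * x[1] + x[0] + x[1] - 5]
F = make_objective(fs)
print(F(np.array([2.0, 1.0])))   # 0.0
```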

14.3. New Genetic Algorithm: SHEGA

Since the simple genetic algorithm (also called the standard genetic algorithm) is not very efficient in practical computation, many varieties have been given [6]. In this chapter, we give a new genetic algorithm. Our main idea is to put pairs of symmetric and harmonious individuals into the generations.

14.3.1. Coding

We use the binary code in our method, as used in the simple genetic algorithm [6]. The binary code is the code most used in GA; it represents a candidate solution by a string of 0s and 1s. The length of the string is related to the degree of accuracy needed for the solution, and satisfies the following inequality:

L_i ≥ log_2((b_i − a_i)/ε_i),    (14.3.1)

where L_i is the length of the string standing for the i-th component of an individual, and ε_i is the degree of accuracy needed for x_i. In fact, we choose L_i as the minimal positive integer satisfying (14.3.1). Usually, all ε_i are equal to one another, so they can be denoted by ε_x. For example, let a_1 = 0, b_1 = 1 and ε_x = 10^{-6}; then we can choose L_1 = 20, since L_1 ≥ log_2 10^6 ≈ 19.93156857. The variable x_1 ∈ [a_1, b_1] is in the form of a real number, and it can be represented by 20 digits of the binary code: 00000000000000000000 stands for 0, 00000000000000000001 stands for 1/2^{L_1}, 00000000000000000010 stands for 2/2^{L_1}, ..., 11111111111111111111 stands for 1.
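
A minimal Python sketch of this encoding; the helper names, and the decoding convention that maps the all-ones string exactly to b_i, are our own assumptions:

```python
import math

def string_length(a, b, eps):
    """Minimal positive integer L satisfying (14.3.1): L >= log2((b - a)/eps)."""
    return math.ceil(math.log2((b - a) / eps))

def decode_bits(bits, a, b):
    """Map a bit string to a real in [a, b]; '00...0' -> a and '11...1' -> b."""
    return a + int(bits, 2) * (b - a) / (2 ** len(bits) - 1)

L = string_length(0.0, 1.0, 1e-6)           # L = 20
print(L, decode_bits("1" * L, 0.0, 1.0))    # 20 1.0
```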

14.3.2. Fitness Function

Since the simple genetic algorithm is used to find a solution of a maximization problem, one should make a change to the fitness function. In this chapter, we define the fitness function as follows:

g(x_1, x_2, ..., x_n) = 1/(1 + F(x_1, x_2, ..., x_n)).    (14.3.2)

This function g satisfies: (1) the fitness value is always positive, which is needed in the following genetic operators; (2) the fitness value becomes bigger as the point (x_1, x_2, ..., x_n) gets closer to a solution x* of problem (14.2.2).

14.3.3. Selection

We use Roulette Wheel Selection (also called the proportional selection operator) in our method, as used in the simple genetic algorithm: the probability that an individual is selected is proportional to its fitness value. Suppose the size of the population is N and the fitness of individual i is g_i. Then individual i is selected for the next generation with probability p_i given by

p_i = g_i / ∑_{i=1}^{N} g_i   (i = 1, 2, ..., N).    (14.3.3)
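
The fitness (14.3.2) and the selection rule (14.3.3) fit in a few lines of Python (names are illustrative only):

```python
import random

def fitness(F_value):
    """Fitness (14.3.2): g = 1/(1 + F); always positive, larger near a root."""
    return 1.0 / (1.0 + F_value)

def roulette_select(population, g):
    """Pick one individual with probability proportional to its fitness (14.3.3)."""
    r = random.uniform(0.0, sum(g))
    acc = 0.0
    for individual, gi in zip(population, g):
        acc += gi
        if acc >= r:
            return individual
    return population[-1]
```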

14.3.4. Symmetric and Harmonious Individuals

We first give the definition of symmetric and harmonious individuals.

Definition 14.3.1. Suppose individuals M_1 = (x_1, x_2, ..., x_n) and M_2 = (y_1, y_2, ..., y_n) are represented in the binary code by M_1′ = (x_11 x_12 ... x_1L_1 x_21 x_22 ... x_2L_2 ... x_n1 x_n2 ... x_nL_n) and M_2′ = (y_11 y_12 ... y_1L_1 y_21 y_22 ... y_2L_2 ... y_n1 y_n2 ... y_nL_n), respectively. They are called symmetric and harmonious individuals if and only if x_ij = 1 − y_ij holds for every i = 1, 2, ..., n and j = 1, 2, ..., L_i.

In Definition 14.3.1, the individuals M_1′ and M_2′ are complements in the binary sense.

In order to avoid premature convergence of the genetic algorithm, we introduce some pairs of symmetric and harmonious individuals into the generations. We do not use fixed symmetric and harmonious individuals as in [15] and [18]; on the contrary, we generate the pairs of symmetric and harmonious individuals randomly. On one hand, these pairs of symmetric individuals continually enrich the diversity of the population. On the other hand, they continue to explore the space even if they have not been selected to participate in the genetic manipulations of crossover or mutation.

Suppose the size of the population is N, and let λ ∈ [0, 0.5) be a parameter. Let ⌊r⌋ be the biggest integer less than or equal to r. We introduce ⌊λN⌋ pairs of symmetric and harmonious individuals into the current generation provided that the difference between the best fitness value of the last generation and that of the current generation is less than a preset precision denoted by ε_1. We call λ the symmetry and harmony factor.
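
Generating one such random pair is immediate under the bit-string representation above; a Python sketch:

```python
import random

def random_symmetric_pair(total_length):
    """Return a random individual and its bitwise complement, i.e. a
    symmetric and harmonious pair in the sense of Definition 14.3.1."""
    m1 = "".join(random.choice("01") for _ in range(total_length))
    m2 = "".join("1" if bit == "0" else "0" for bit in m1)
    return m1, m2
```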

14.3.5. Crossover and Mutation

We use the one-point crossover operator in our genetic algorithm, just as used in the simple genetic algorithm. That is, a single crossover point on both parents' organism strings is selected; all data beyond that point in either organism string is swapped between the two parent organisms, and the resulting organisms are the children. An example is shown as follows:

A : 10110111 | 001        A′ : 10110111 | 110
                     ⇒
B : 00011100 | 110        B′ : 00011100 | 001.

We use bit string mutation, just as used in the simple genetic algorithm. That is, mutation of bit strings ensues through bit flips at random positions. The following example is provided to show this:

A : 1010 1 0101010 ⇒ A′ : 1010 0 0101010.
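
Both operators sketched in Python (the caller applies them with probabilities p_c and p_m; names are illustrative):

```python
import random

def one_point_crossover(a, b):
    """Swap the tails of two equal-length bit strings beyond a random cut point."""
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(a, pm):
    """Flip each bit independently with probability pm."""
    return "".join(("1" if c == "0" else "0") if random.random() < pm else c
                   for c in a)
```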

14.3.6. Elitist Model

It is well known that the global convergence of the simple genetic algorithm cannot be assured [6]. In order to guarantee the global convergence of the genetic algorithm, we use an elitist model (also called the optimal preservation strategy) in this chapter. That is, at each generation we reserve the individual with the maximum fitness value for the next generation.

14.3.7. Procedure

For the convenience of discussion, we call our genetic algorithm with symmetric and harmonious individuals SHEGA, and we call the simple genetic algorithm with the elitist model EGA. The procedure of SHEGA is as follows (a sketch of the main loop in code is given after the steps):

Step 1. Assign the parameters of the genetic algorithm: the size N of the population, the number n of variables of (14.2.2), the lengths L_1, L_2, ..., L_n (computed from (14.3.1)) of the binary strings of the components of an individual, the symmetry and harmony factor λ, the controlled precision ε_1 of Subsection 14.3.4 used to introduce the symmetric and harmonious individuals, the probability p_c of the crossover operator, the probability p_m of the mutation operator, and the largest genetic generation G.

Step 2. Generate the initial population randomly.

Step 3. Calculate the fitness value of each individual of the current population, and reserve the optimal individual of the current population for the next generation.

Step 4. If the difference between the best fitness value of the last generation and that of the current generation is less than the preset precision ε_1, generate N − 2⌊λN⌋ − 1 individuals using Roulette Wheel Selection and ⌊λN⌋ pairs of symmetric and harmonious individuals randomly. Otherwise, generate N − 1 individuals using Roulette Wheel Selection directly. The population is then divided into two parts: one is the seed subpopulation constituted by the symmetric and harmonious individuals, and the other is the subpopulation ready to be bred, constituted by the remaining individuals.

Step 5. Apply the crossover operator between each individual in the seed subpopulation and one individual selected randomly from the other subpopulation. Apply the crossover operator pairwise with probability p_c, and apply the mutation operator with probability p_m, to the remaining individuals in the subpopulation ready to be bred.

Step 6. Repeat Steps 3-5 until the maximum genetic generation G is reached.
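
The following Python sketch assembles the helpers from the preceding subsections (fitness, roulette_select, random_symmetric_pair, one_point_crossover, mutate) into a simplified SHEGA main loop. It condenses Steps 1-6 — in particular, the seed-subpopulation crossover of Step 5 is simplified — and all names are ours, not an implementation from the text:

```python
import random

def shega(F_of_bits, total_len, N, G, lam, eps1, pc, pm):
    """Simplified SHEGA; F_of_bits decodes a bit string and returns F(x) of (14.2.1)."""
    pop = ["".join(random.choice("01") for _ in range(total_len)) for _ in range(N)]
    best_fit_prev = -1.0
    for _ in range(G):
        fits = [fitness(F_of_bits(ind)) for ind in pop]
        new_pop = [pop[fits.index(max(fits))]]        # elitist model (Step 3)
        if abs(max(fits) - best_fit_prev) < eps1:     # stagnation trigger (Step 4)
            for _ in range(int(lam * N)):
                new_pop.extend(random_symmetric_pair(total_len))
        while len(new_pop) < N:                       # breed the rest (Steps 4-5)
            a = roulette_select(pop, fits)
            b = roulette_select(pop, fits)
            if random.random() < pc:
                a, b = one_point_crossover(a, b)
            new_pop.extend([mutate(a, pm), mutate(b, pm)])
        best_fit_prev = max(fits)
        pop = new_pop[:N]
    return min(pop, key=F_of_bits)                    # best individual found
```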

14.4. Mixed Algorithm: SHEGA-Newton Method

In order to further improve the efficiency of SHEGA, we can mix it with a classical iterative method such as Newton's method [2, 13]:

x^(k+1) = x^(k) − f′(x^(k))^{-1} f(x^(k))  (k ≥ 0)  (x^(0) ∈ D).    (14.4.1)

Here, f′(x) denotes the Fréchet derivative of the function f.

Local as well as semilocal convergence results for Newton's method (14.4.1) under various assumptions have been given by many authors [1, 2, 4, 7, 12, 13]. It is well known
that Newton's method converges to the solution quadratically provided that certain necessary conditions are satisfied. However, two deficiencies limit the application of Newton's method. First, the function f must be differentiable, which may not hold in practical applications. Second, a good initial point for beginning the iteration is key to ensuring the convergence of the iterative sequence, but it is a difficult task to choose such an initial point in advance. In fact, choosing good initial points to begin the corresponding iteration is a common issue for all the classical iterative methods used to solve equation (14.1.1) [2, 13].

Here, we use Newton's method as an example; in fact, one can develop other methods by mixing SHEGA with other iterative methods. We can state the SHEGA-Newton method as follows (a sketch in code follows the steps):

Step 1. Given the maximal iterative step S and the precision accuracy ε_y, set s = 0.

Step 2. Find an initial guess x^(0) ∈ D by using SHEGA as given in Section 14.3.

Step 3. Compute f_i(x_1^(s), x_2^(s), ..., x_n^(s)) (i = 1, 2, ..., n). If F(x_1^(s), x_2^(s), ..., x_n^(s)) ≤ ε_y, report that the approximate solution x^(s) = (x_1^(s), x_2^(s), ..., x_n^(s)) has been found and exit from the loop, where F is defined in (14.2.1).

Step 4. Compute the Jacobian J_s = f′(x^(s)) and solve the linear system

J_s u^(s) = −f(x^(s)).    (14.4.2)

Set x^(s+1) = x^(s) + u^(s).

Step 5. If s ≤ S, set s = s + 1 and go to Step 3. Otherwise, report that an approximate solution could not be found.
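
A compact Python sketch of Steps 3-5 (the callables f and jac, and the default tolerances, are assumptions of this illustration; the initial guess x0 would come from SHEGA):

```python
import numpy as np

def newton_refine(f, jac, x0, S=50, eps_y=1e-12):
    """Steps 3-5 of SHEGA-Newton: iterate x <- x + u with J_s u = -f(x), cf. (14.4.2)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(S):
        fx = f(x)
        if np.mean(np.abs(fx)) <= eps_y:     # F(x) of (14.2.1) below tolerance
            return x
        x = x + np.linalg.solve(jac(x), -fx)
    raise RuntimeError("approximate solution not found within S steps")
```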

14.5. Numerical Examples

In this section, we provide some examples to show the efficiency of our new method.

Example 14.5.1. Let f be defined in D = [−3.5, 2.5] × [−3.5, 2.5] by

f_1(x_1, x_2) = x_1^2 + x_2^2 + x_1 + x_2 − 8 = 0,
f_2(x_1, x_2) = x_1x_2 + x_1 + x_2 − 5 = 0.    (14.5.1)

Let us choose the parameters as follows:

N = 40, p_c = 0.9, p_m = 0.005, ε_x = 0.001, ε_1 = 0.001.    (14.5.2)


Since genetic algorithms are random algorithms, we run each method 30 times, and we compare the number of convergent runs under various maximal generations G for EGA and SHEGA. The comparison results are given in Table 14.5.1. We also give the comparison results for the average of the best function value F under the fixed maximal generation G = 300 in Table 14.5.2. Here, we say the corresponding genetic algorithm is convergent if the function value F(x_1, x_2, ..., x_n) is less than a fixed precision ε_y. We set ε_y = 0.05 for this example. Tables 14.5.1 and 14.5.2 show that SHEGA with a proper symmetry and harmony factor λ performs better than EGA.
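
As an aside, the Newton refinement sketched in Section 14.4 applies directly to (14.5.1); for instance (with a fixed guess standing in for the SHEGA stage):

```python
f = lambda x: np.array([x[0]**2 + x[1]**2 + x[0] + x[1] - 8,
                        x[0] * x[1] + x[0] + x[1] - 5])
jac = lambda x: np.array([[2 * x[0] + 1, 2 * x[1] + 1],
                          [x[1] + 1,     x[0] + 1]])
print(newton_refine(f, jac, [2.2, 0.8]))   # converges to a root such as (2, 1)
```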

Table 14.5.1. Comparison results of the number of convergent runs for Example 1

G = 50 G = 100 G = 150 G = 200 G = 250 G = 300


EGA 11 11 11 11 12 12
SHEGA(λ = 0.05) 13 15 18 20 20 21
SHEGA(λ = 0.10) 8 20 23 26 29 29
SHEGA(λ = 0.15) 16 21 27 28 28 29
SHEGA(λ = 0.20) 16 26 29 30 30 30
SHEGA(λ = 0.25) 20 28 29 29 30 30
SHEGA(λ = 0.30) 22 28 30 30 30 30
SHEGA(λ = 0.35) 17 27 29 30 30 30
SHEGA(λ = 0.40) 17 29 30 30 30 30
SHEGA(λ = 0.45) 19 30 30 30 30 30

Table 14.5.2. The comparison results of the average of the best function value F for
Example 1

G = 300
EGA 0.129795093953972
SHEGA(λ = 0.05) 0.053101422547384
SHEGA(λ = 0.10) 0.041427669194903
SHEGA(λ = 0.15) 0.034877317449825
SHEGA(λ = 0.20) 0.035701604675096
SHEGA(λ = 0.25) 0.038051665705034
SHEGA(λ = 0.30) 0.039332883168632
SHEGA(λ = 0.35) 0.035780879206619
SHEGA(λ = 0.40) 0.034509424138501
SHEGA(λ = 0.45) 0.037425326021257

Example 14.5.2. Let f be defined in D = [−5, 5] × [−1, 3] × [−5, 5] by

f_1(x_1, x_2, x_3) = 3x_1^2 + sin(x_1x_2) − x_3^2 + 2 = 0,
f_2(x_1, x_2, x_3) = 2x_1^3 − x_2^2 − x_3 + 3 = 0,    (14.5.3)
f_3(x_1, x_2, x_3) = sin(2x_1) + cos(x_2x_3) + x_2 − 1 = 0.

Let us choose the parameters as follows:

N = 50, p_c = 0.8, p_m = 0.05, ε_x = 0.0001, ε_1 = 0.001.    (14.5.4)


We run each method 20 times, and we compare the number of convergent runs under various maximal generations G for EGA and SHEGA. The comparison results are given in Table 14.5.3. We also give the comparison results for the average of the best function value F under the fixed maximal generation G = 500 in Table 14.5.4. Here, we say the corresponding genetic algorithm is convergent if the function value F(x_1, x_2, ..., x_n) is less than a fixed precision ε_y. We set ε_y = 0.02 for this example. Tables 14.5.3 and 14.5.4 show that SHEGA with a proper symmetry and harmony factor λ performs better than EGA.

Table 14.5.3. Comparison results of the number of convergent runs for Example 2

G = 100 G = 200 G = 300 G = 400 G = 500


EGA 3 4 4 4 4
SHEGA(λ = 0.05) 2 6 8 9 11
SHEGA(λ = 0.10) 2 7 11 14 15
SHEGA(λ = 0.15) 5 11 14 14 16
SHEGA(λ = 0.20) 3 11 14 17 18
SHEGA(λ = 0.25) 8 11 14 17 17
SHEGA(λ = 0.30) 8 16 16 17 17
SHEGA(λ = 0.35) 6 14 18 19 20
SHEGA(λ = 0.40) 5 14 19 19 20
SHEGA(λ = 0.45) 6 17 19 19 20

Table 14.5.4. The comparison results of the average of the best function value F for
Example 2

G = 500
EGA 0.1668487275323187
SHEGA(λ = 0.05) 0.0752619855962109
SHEGA(λ = 0.10) 0.0500864062405815
SHEGA(λ = 0.15) 0.0358268275921585
SHEGA(λ = 0.20) 0.0257859494269335
SHEGA(λ = 0.25) 0.0239622084932336
SHEGA(λ = 0.30) 0.0247106452514721
SHEGA(λ = 0.35) 0.0171980128114993
SHEGA(λ = 0.40) 0.0179659124369376
SHEGA(λ = 0.45) 0.0158999282064303

Next, we provide an example where f is non-differentiable. Newton's method (14.4.1) and SHEGA-Newton cannot be used to solve this problem since f is non-differentiable; however, SHEGA still applies. Moreover, with the same idea used for the SHEGA-Newton method, one can develop mixed methods that do not require the differentiability of the function f.

Example 14.5.3. Let f be defined in D = [−2, 2] × [1, 6] by

f_1(x_1, x_2) = x_1^2 − x_2 + 1 + (1/9)|x_1 − 1| = 0,
f_2(x_1, x_2) = x_2^2 + x_1 − 7 + (1/9)|x_2| = 0.    (14.5.5)

Let us choose the parameters as follows:

N = 40, p_c = 0.85, p_m = 0.008, ε_x = 0.001, ε_1 = 0.005.    (14.5.6)


We run each method 20 times, and we compare the number of convergent runs under various maximal generations G for EGA and SHEGA. The comparison results are given in Table 14.5.5. We also give the comparison results for the average of the best function value F under the fixed maximal generation G = 300 in Table 14.5.6. Here, we say the corresponding genetic algorithm is convergent if the function value F(x_1, x_2, ..., x_n) is less than a fixed precision ε_y. We set ε_y = 0.003 for this example. Tables 14.5.5 and 14.5.6 show that SHEGA with a proper symmetry and harmony factor λ performs better than EGA.

Table 14.5.5. Comparison results of the number of convergent runs for Example 3

G = 100 G = 150 G = 200 G = 250 G = 300


EGA 6 9 10 10 10
SHEGA(λ = 0.05) 10 15 16 17 19
SHEGA(λ = 0.10) 12 15 17 18 20
SHEGA(λ = 0.15) 15 18 19 20 20
SHEGA(λ = 0.20) 15 20 20 20 20
SHEGA(λ = 0.25) 11 14 17 17 18
SHEGA(λ = 0.30) 10 14 16 17 19
SHEGA(λ = 0.35) 16 17 18 19 20
SHEGA(λ = 0.40) 10 12 18 18 20
SHEGA(λ = 0.45) 9 12 15 15 17

14.6. Conclusion

We presented a genetic algorithm as a general tool for solving optimization problems. Note that in the special case of approximating solutions of systems of nonlinear equations there are many deficiencies that limit the application of the usually employed methods; for example, in the case of Newton's method the function f must be differentiable and a good initial point must be found. To avoid these problems, we have introduced some pairs of symmetric and harmonious individuals into the generations of a genetic algorithm. The population diversity is preserved this way, and the method guarantees convergence to a solution of the system. Numerical examples illustrate the efficiency of the new algorithm.

Table 14.5.6. The comparison results of the average of the best function value F for
Example 3

G = 300
EGA 0.0081958187235953
SHEGA(λ = 0.05) 0.0021219384775699
SHEGA(λ = 0.10) 0.0019286950245614
SHEGA(λ = 0.15) 0.0018367719544782
SHEGA(λ = 0.20) 0.0022816080103967
SHEGA(λ = 0.25) 0.0023297925904943
SHEGA(λ = 0.30) 0.0023318357433983
SHEGA(λ = 0.35) 0.0021392510790106
SHEGA(λ = 0.40) 0.0022381534380744
SHEGA(λ = 0.45) 0.0025798012930550
References

[1] Amat, S., Busquier, S., M. Negra, Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004), 397-405.

[2] Argyros, I.K., Computational Theory of Iterative Methods, Series: Studies in Com-
putational Mathematics 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co.,
New York, U.S.A., 2007.

[3] El-Emary, I.M., El-Kareem, M.A., Towards using genetic algorithm for solving non-
linear equation systems, World Applied Sciences Journal 5 (2008), 282-289.

[4] Ezquerro, J.A., Hernández, M.A., On an application of Newton's method to nonlinear equations with w-conditioned second derivative, BIT, 42 (2002), 519-532.

[5] Fogel, D.B., Evolutionary Computation: Toward a New Philosophy of Machine Intel-
ligence, IEEE Press, New York, 2000.

[6] Goldberg, D.E., Genetic Algorithms in Search, Optimization and Machine Learning,
Addison-Wesley, Second Edition, 1989.

[7] Gutiérrez, J.M., A new semilocal convergence theorem for Newton’s method, J. Com-
put. Appl. Math. 79 (1997), 131-145.

[8] Holland, J.H., Adaptation in Natural and Artificial System, Michigan Univ Press, Ann
Arbor, 1975.

[9] Kuri-Morales, A. F., No, R. H., Solution of simultaneous non-linear equations using
genetic algorithms, WSEAS Transactions on SYSTEMS, 2 (2003), 44-51.

[10] Mastorakis, N.E., Solving non-linear equations via genetic algorithms, Proceedings of
the 6th WSEAS Int. Conf. on Evolutionary Computing, Lisbon, Portugal, June 16-18,
2005, 24-28.

[11] Nasira, G.N., Devi, D., Solving nonlinear equations through jacobian sparsity pattern
using genetic algorithm, International Journal of Communications and Engineering,
5 (2012), 78-82.

[12] Neta, B., A new iterative method for the solution of system of nonlinear equations,
Approx. Th. and Applic (Z. Ziegler, ed.), Academic Press, 249-263 1981.

[13] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.

[14] Mhetre, P.H., Genetic algorithm for linear and nonlinear equation, International Jour-
nal of Advanced Engineering Technology, 3 (2012), 114-118.

[15] Ren, H., Bi, W., Wu, Q., Calculation of minimum distance between free-form surfaces
by a type of new improved genetic algorithm, Computer Engineering and Application,
23 (2004), 62-64 (in Chinese).

[16] Ren, H., Wu, Q., Bi, W., A genetic algorithm with symmetric and harmonious indi-
viduals, Computer Engineering and Application, 5(87) (2005) 24-26, 87 (in Chinese).

[17] Wan, X., Zhou, J., Application of genetic algorithm for self adaption, symmetry and congruity in reservoir mid-long hydraulic power operation, Advances in Water Science, 18 (2007), 598-603 (in Chinese).

[18] Xi, G., Cai, Y., Finding the minimized distance between surfaces, Journal of CAD and
Graphics, 14 (2002), 209–213 (in Chinese).
Chapter 15

On the Semilocal Convergence of Modified Newton-Tikhonov Regularization Method for Nonlinear Ill-Posed Problems

15.1. Introduction

In this chapter we are concerned with the problem of approximately solving the nonlinear ill-posed operator equation

F(x) = f,    (15.1.1)

where F : D(F) ⊆ X → Y is a nonlinear operator between the Hilbert spaces X and Y. Here and below, ⟨·, ·⟩ denotes the inner product and ‖·‖ the corresponding norm. We assume throughout that f^δ ∈ Y are the available data with

‖f − f^δ‖ ≤ δ,

and that (15.1.1) has a solution x̂ (which need not be unique). Then the problem of recovering x̂ from the noisy equation F(x) = f^δ is ill-posed, in the sense that a small perturbation in the data can cause a large deviation in the solution.

Further, it is assumed that F possesses a locally uniformly bounded Fréchet derivative F′(·) in the domain D(F) of F. A large number of problems in mathematical physics and engineering are solved by finding the solutions of equations of the form (15.1.1). When one works with such problems, the measured data are distorted by some measurement error. Therefore, one has to consider appropriate regularization techniques for approximately solving (15.1.1).

Iterative regularization methods are used for approximately solving (15.1.1). Recall ([20]) that an iterative method with iterations defined by

x^δ_{k+1} = Φ(x^δ_0, x^δ_1, ..., x^δ_k; y^δ),



where xδ0 := x0 ∈ D(F) is a known initial approximation of x̂, for a known function Φ
together with a stopping rule which determines a stopping index kδ ∈ N is called an iterative
regularization method if kxδkδ − x̂k → 0 as δ → 0.
The Levenberg-Marquardt method([18], [21], [9], [10], [11], [14], [24], [6]) and iter-
atively regularized Gauss-Newton method (IRGNA) ([3], [5]) are the well-known iterative
regularization methods. In Levenberg-Marquardt method, the iterations are defined by,

xδk+1 = xδk − (A∗k,δ Ak,δ + αk I)−1 A∗k,δ (F(xδk ) − yδ ), (15.1.2)

where A∗k,δ := F 0 (xδk )∗ is as usual the adjoint of Ak,δ := F 0 (xδk ) and (αk ) is a positive sequence
of regularization parameter ([5]). In Gauss-Newton method, the iterations are defined by

xδk+1 = xδk − (A∗k,δ Ak,δ + αk I)−1 [A∗k,δ (F(xδk ) − yδ ) + αk (xδk − x0 )] (15.1.3)

where xδ0 := x0 and (αk ) is as in (15.1.2).


In [3], Bakushinskii obtained local convergence of the method (15.1.3) under the smoothness assumption

x̂ − x_0 = (F′(x̂)^* F′(x̂))^ν w,  w ∈ N(F′(x̂))^⊥,    (15.1.4)

with ν ≥ 1, w ≠ 0 and F′(·) Lipschitz continuous; N(F′(x̂)) denotes the nullspace of F′(x̂). For the noise-free case, Bakushinskii ([3]) obtained the rate

‖x^δ_k − x̂‖ = O(α_k),

and Blaschke et al. ([5]) obtained the rate

‖x^δ_k − x̂‖ = O(α_k^ν),    (15.1.5)

for 1/2 ≤ ν < 1. It is proved in [5] that the rate (15.1.5) can be obtained for 0 ≤ ν < 1/2 provided F′(·) satisfies the following conditions:

F′(x̄) = R(x̄, x)F′(x) + Q(x̄, x),
‖I − R(x̄, x)‖ ≤ C_R,  x̄, x ∈ B_{2ρ}(x_0),
‖Q(x̄, x)‖ ≤ C_Q ‖F′(x̂)(x̄ − x)‖,

with ρ, C_R and C_Q sufficiently small. In fact, in [5] Blaschke et al. obtained the rate

‖x^δ_k − x̂‖ = o(α_k^{2ν/(2ν+1)}),  0 ≤ ν < 1/2,

by choosing the stopping index k_δ according to the discrepancy principle

‖F(x^δ_{k_δ}) − y^δ‖ ≤ τδ < ‖F(x^δ_k) − y^δ‖,  0 ≤ k < k_δ,

with τ > 1 chosen sufficiently large. Subsequently, many authors extended, modified, and generalized Bakushinskii's work to obtain error bounds in various contexts (see [4], [12], [13], [15], [16], [17], [7]).

In [20], Mahale and Nair considered a method in which the iterations are defined by

x^δ_{k+1} = x_0 − g_{α_k}(A_0^* A_0) A_0^* [F(x^δ_k) − y^δ − A_0(x^δ_k − x_0)],  x^δ_0 := x_0,    (15.1.6)

where A_0 := F′(x_0) and (α_k) is a sequence of regularization parameters which satisfies

α_k > 0,  1 ≤ α_k/α_{k+1} ≤ µ_1,  lim_{k→∞} α_k = 0    (15.1.7)

for some constant µ_1 > 1, and each g_α, for α > 0, is a positive real-valued piecewise continuous function defined on [0, M] with M ≥ ‖A_0‖^2. They choose the stopping index k_δ for this iteration as the positive integer which satisfies

max{‖F(x^δ_{k_δ−1}) − y^δ‖, β̃_{k_δ}} ≤ τδ < max{‖F(x^δ_{k−1}) − y^δ‖, β̃_k},  1 ≤ k < k_δ,

where τ > 1 is a sufficiently large constant not depending on δ, and

β̃_k := ‖F(x^δ_{k−1}) − y^δ + A_0(x^δ_k − x^δ_{k−1})‖.

In fact, Mahale and Nair obtained an order optimal error estimate, in the sense that an improved order estimate, applicable to the case of linear ill-posed problems as well, is not possible, under the following new source condition on x_0 − x̂.

Assumption 15.1.1. There exists a continuous, strictly monotonically increasing function ϕ : (0, M] → (0, ∞) satisfying lim_{λ→0} ϕ(λ) = 0 and ρ_0 > 0 such that

x_0 − x̂ = [ϕ(A_0^* A_0)]^{1/2} w    (15.1.8)

for some w ∈ X with ‖w‖ ≤ ρ_0.

In [7], the author considered a particular case of this method, namely, the regularized modified Newton's method defined iteratively by

x^δ_{k+1} = x^δ_k − (A_0^* A_0 + αI)^{-1} [A_0^*(F(x^δ_k) − y^δ) + α(x^δ_k − x_0)],  x^δ_0 := x_0,    (15.1.9)

for approximately solving (15.1.1). Using a suitably constructed majorizing sequence (see [1], page 28), it is proved that the sequence (x^δ_k) converges linearly to a solution x^δ_α of the equation

A_0^* F(x^δ_α) + α(x^δ_α − x_0) = A_0^* y^δ    (15.1.10)

and that x^δ_α is an approximation of x̂. The error estimate in that work was obtained under the following source condition on x_0 − x̂.

Assumption 15.1.2. There exists a continuous, strictly monotonically increasing function ϕ : (0, a_1] → (0, ∞) with a_1 ≥ ‖F′(x̂)‖^2 satisfying:

1. lim_{λ→0} ϕ(λ) = 0;

2. for α ≤ 1, ϕ(α) ≥ α;

3. sup_λ αϕ(λ)/(λ + α) ≤ c_ϕ ϕ(α), ∀λ ∈ (0, a_1];

4. there exists w ∈ X such that

x_0 − x̂ = ϕ(F′(x̂)^* F′(x̂))w.    (15.1.11)

Later, in [8], using a two-step Newton method (see [2]), the author proved that the sequence (x^δ_k) in (15.1.9) converges linearly to the solution x^δ_α of (15.1.10). The error estimate in [8] was based on the following source condition:

x_0 − x̂ = ϕ(A_0^* A_0)w,

where ϕ is as in Assumption 15.1.1 with a_1 ≥ ‖A_0‖^2. In the present chapter we improve the semilocal convergence analysis by modifying the method (15.1.9).

15.1.1. The New Method

In this chapter we define the new iteration procedure

x^δ_{n+1,α} = x^δ_{n,α} − (A_0^* A_n + αI)^{-1} [A_0^*(F(x^δ_{n,α}) − y^δ) + α(x^δ_{n,α} − x_0)],  x^δ_{0,α} := x_0,    (15.1.12)

where A_n := F′(x^δ_{n,α}) and α > 0 is the regularization parameter. Using an assumption on the Fréchet derivative of F, we prove that the iteration in (15.1.12) converges quadratically to the solution x^δ_α of (15.1.10).

Recall ([22]) that a sequence (x_n) is said to converge quadratically to x^* if there exist positive reals β, γ such that

‖x_{n+1} − x^*‖ ≤ βe^{−γ2^n}

for all n ∈ N. The convergence of (x_n) to x^* is said to be linear if there exists a positive number M_0 ∈ (0, 1) such that

‖x_{n+1} − x^*‖ ≤ M_0‖x_n − x^*‖.

A quadratically convergent sequence will always eventually converge faster than a linearly convergent sequence.
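
In finite dimensions, one step of (15.1.12) amounts to a single linear solve; a minimal numpy sketch (all names are ours; A_0 is the Jacobian frozen at x_0, A_n the current one):

```python
import numpy as np

def modified_newton_tikhonov(F, Fprime, x0, y_delta, alpha, n_steps):
    """Iteration (15.1.12): solve (A0^T An + alpha I) d = A0^T (F(x) - y) + alpha (x - x0)."""
    A0 = Fprime(x0)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        An = Fprime(x)
        rhs = A0.T @ (F(x) - y_delta) + alpha * (x - x0)
        x = x - np.linalg.solve(A0.T @ An + alpha * np.eye(len(x)), rhs)
    return x
```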
We choose the regularization parameter α from some finite set

{α_0 < α_1 < ··· < α_N}

using the balancing principle considered by Pereverzev and Schock in [23].

The rest of this chapter is organized as follows. In Section 15.2 we provide the convergence analysis of the proposed method, and in Section 15.3 we provide the error analysis. Finally, in Section 15.4 we provide the details for implementing the method and the algorithm.

15.2. Convergence Analysis of (15.1.12)


The following assumption is used extensively for proving the results in this chapter.

Assumption 15.2.1. There exist constants k_0 > 0, r > 0 such that for every x, u ∈ B(x_0, r) ∪ B(x̂, r) ⊂ D(F) and v ∈ X, there exists an element Φ(x, u, v) ∈ X such that

[F′(x) − F′(u)]v = F′(u)Φ(x, u, v),  ‖Φ(x, u, v)‖ ≤ k_0‖v‖‖x − u‖.

In view of Assumption 15.2.1, there exists an element Φ_0(x, x_0, v) ∈ X such that

[F′(x) − F′(x_0)]v = F′(x_0)Φ_0(x, x_0, v),  ‖Φ_0(x, x_0, v)‖ ≤ l_0‖v‖‖x − x_0‖.

Note that l_0 ≤ k_0 holds in general and k_0/l_0 can be arbitrarily large [1], [2]. Let δ_0 < √α_0,

ρ < (√(1 + 2l_0(1 − δ_0/√α_0)) − 1)/l_0,

and

γ_ρ := (l_0/2)ρ^2 + ρ + δ_0/√α_0.

For r ≤ (2 − 3k_0)/((2 + 3l_0)k_0), k_0 ≤ 2/3, let g : (0, 1) → (0, 1) be the function defined by

g(t) := (3(1 + l_0 r)k_0 / (2(1 − l_0 r))) t  for all t ∈ (0, 1).

Lemma 15.2.2. Let l_0 r < 1 and u ∈ B_r(x_0). Then (A_0^* A_u + αI) is invertible and:

(i) (A_0^* A_u + αI)^{-1} = [I + (A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0)]^{-1} (A_0^* A_0 + αI)^{-1};

(ii) ‖(A_0^* A_u + αI)^{-1} A_0^* A_u‖ ≤ (1 + l_0 r)/(1 − l_0 r),

where A_u := F′(u).

Proof. Note that by Assumption 15.2.1, we have

‖(A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0)‖ = sup_{‖v‖≤1} ‖(A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0)v‖
 = sup_{‖v‖≤1} ‖(A_0^* A_0 + αI)^{-1} A_0^* A_0 Φ_0(u, x_0, v)‖
 ≤ l_0‖u − x_0‖ ≤ l_0 r < 1.

So I + (A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0) is invertible. Now (i) follows from the relation

A_0^* A_u + αI = (A_0^* A_0 + αI)[I + (A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0)].


To prove (ii), observe that by Assumption 15.2.1 and (i), we have

‖(A_0^* A_u + αI)^{-1} A_0^* A_u‖ = sup_{‖v‖≤1} ‖(A_0^* A_u + αI)^{-1} A_0^* A_u v‖
 = sup_{‖v‖≤1} ‖(A_0^* A_u + αI)^{-1} A_0^*(A_u − A_0 + A_0)v‖
 = sup_{‖v‖≤1} ‖[I + (A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0)]^{-1} (A_0^* A_0 + αI)^{-1} A_0^*(A_u − A_0 + A_0)v‖
 ≤ (1/(1 − l_0 r)) [‖(A_0^* A_0 + αI)^{-1} A_0^* A_0 Φ_0(u, x_0, v)‖ + ‖(A_0^* A_0 + αI)^{-1} A_0^* A_0 v‖]
 ≤ (1 + l_0 r)/(1 − l_0 r).

This completes the proof.
Theorem 15.2.3. Suppose Assumption 15.2.1 holds. Let γ_ρ/(1 − g(γ_ρ)) ≤ r ≤ (2 − 3k_0)/((2 + 3l_0)k_0) and δ ∈ (0, δ_0]. Then the sequence (x^δ_{n,α}) defined in (15.1.12) is a Cauchy sequence in B_r(x_0) and hence converges to x^δ_α ∈ B_r(x_0). Further, x^δ_α satisfies (15.1.10), and the following estimate holds for all n ≥ 0:

‖x^δ_{n,α} − x^δ_α‖ ≤ r e^{−γ2^n},    (15.2.1)

where γ = −ln(g(γ_ρ)).

Proof. Suppose x^δ_{n,α} ∈ B_r(x_0) for all n ≥ 0. Then

x^δ_{n+1,α} − x^δ_{n,α} = (A_0^* A_n + αI)^{-1} [A_0^* A_n(x^δ_{n,α} − x^δ_{n−1,α}) − A_0^*(F(x^δ_{n,α}) − F(x^δ_{n−1,α}))]
  + (A_0^* A_n + αI)^{-1} A_0^*(A_n − A_{n−1})(A_0^* A_{n−1} + αI)^{-1} [A_0^*(F(x^δ_{n−1,α}) − y^δ) + α(x^δ_{n−1,α} − x_0)]    (15.2.2)
 = (A_0^* A_n + αI)^{-1} A_0^* [A_n(x^δ_{n,α} − x^δ_{n−1,α}) − (F(x^δ_{n,α}) − F(x^δ_{n−1,α}))]
  + (A_0^* A_n + αI)^{-1} A_0^*(A_n − A_{n−1})(x^δ_{n,α} − x^δ_{n−1,α})
 := ζ_1 + ζ_2,    (15.2.3)

where ζ_1 = (A_0^* A_n + αI)^{-1} A_0^* [A_n(x^δ_{n,α} − x^δ_{n−1,α}) − (F(x^δ_{n,α}) − F(x^δ_{n−1,α}))] and ζ_2 = (A_0^* A_n + αI)^{-1} A_0^*(A_n − A_{n−1})(x^δ_{n,α} − x^δ_{n−1,α}). By the Fundamental Theorem of Integration,

ζ_1 = (A_0^* A_n + αI)^{-1} A_0^* [∫_0^1 (A_n − F′(x^δ_{n−1,α} + t(x^δ_{n,α} − x^δ_{n−1,α}))) dt] (x^δ_{n,α} − x^δ_{n−1,α}),

and hence by Assumption 15.2.1 and Lemma 15.2.2,

‖ζ_1‖ ≤ ‖(A_0^* A_n + αI)^{-1} A_0^* A_n ∫_0^1 Φ(x^δ_{n−1,α} + t(x^δ_{n,α} − x^δ_{n−1,α}), x^δ_{n,α}, x^δ_{n−1,α} − x^δ_{n,α}) dt‖
 ≤ ((1 + l_0 r)/(1 − l_0 r)) ‖∫_0^1 Φ(x^δ_{n−1,α} + t(x^δ_{n,α} − x^δ_{n−1,α}), x^δ_{n,α}, x^δ_{n−1,α} − x^δ_{n,α}) dt‖
 ≤ ((1 + l_0 r)k_0 / (2(1 − l_0 r))) ‖x^δ_{n,α} − x^δ_{n−1,α}‖^2.    (15.2.4)

Similarly,

‖ζ_2‖ ≤ ‖(A_0^* A_n + αI)^{-1} A_0^*(A_n − A_{n−1})(x^δ_{n−1,α} − x^δ_{n,α})‖
 ≤ ‖(A_0^* A_n + αI)^{-1} A_0^* A_n Φ(x^δ_{n,α}, x^δ_{n−1,α}, x^δ_{n−1,α} − x^δ_{n,α})‖
 ≤ ((1 + l_0 r)k_0 / (1 − l_0 r)) ‖x^δ_{n,α} − x^δ_{n−1,α}‖^2.    (15.2.5)

So by (15.2.3), (15.2.4) and (15.2.5), we have

‖x^δ_{n+1,α} − x^δ_{n,α}‖ ≤ (3(1 + l_0 r)k_0 / (2(1 − l_0 r))) ‖x^δ_{n,α} − x^δ_{n−1,α}‖^2 ≤ g(e_n)e_n,    (15.2.6)

where e_n := ‖x^δ_{n,α} − x^δ_{n−1,α}‖, n = 1, 2, ....

Now, using induction, we shall prove that x^δ_{n,α} ∈ B_r(x_0). Note that

e_1 = ‖x^δ_{1,α} − x_0‖ = ‖(A_0^* A_0 + αI)^{-1} A_0^*(F(x_0) − y^δ)‖
 = ‖(A_0^* A_0 + αI)^{-1} A_0^*(F(x_0) − F(x̂) − F′(x_0)(x_0 − x̂) + F′(x_0)(x_0 − x̂) + F(x̂) − y^δ)‖
 ≤ ‖(A_0^* A_0 + αI)^{-1} A_0^* A_0 ∫_0^1 Φ(x_0, x̂ + t(x_0 − x̂), x_0 − x̂) dt‖
  + ‖(A_0^* A_0 + αI)^{-1} A_0^* F′(x_0)(x_0 − x̂)‖ + ‖(A_0^* A_0 + αI)^{-1} A_0^*(F(x̂) − y^δ)‖
 ≤ (l_0/2)ρ^2 + ρ + δ/√α ≤ γ_ρ ≤ r,    (15.2.7)

i.e., x^δ_{1,α} ∈ B_r(x_0).

Now, since γ_ρ < 1, by (15.2.7) we have e_1 < 1. Therefore, by (15.2.6) and the fact that g(µt) ≤ µg(t) for all t ∈ (0, 1), we have e_n < 1 for all n ≥ 1 and

e_{n+1} ≤ g(e_1)^{2^n − 1} e_1.

Now suppose x^δ_{k,α} ∈ B_r(x_0) for some k. Then

‖x^δ_{k+1,α} − x_0‖ ≤ ‖x^δ_{k+1,α} − x^δ_{k,α}‖ + ‖x^δ_{k,α} − x^δ_{k−1,α}‖ + ··· + ‖x^δ_{1,α} − x_0‖
 ≤ (g(e_1)^{2^k − 1} + g(e_1)^{2^{k−1} − 1} + ··· + 1) e_1
 ≤ e_1/(1 − g(e_1)) ≤ γ_ρ/(1 − g(γ_ρ)) ≤ r.

Thus, by induction, x^δ_{n,α} ∈ B_r(x_0) for all n ≥ 0.

Next we shall prove that (x^δ_{n,α}) is a Cauchy sequence in B_r(x_0). We have

‖x^δ_{n+m,α} − x^δ_{n,α}‖ ≤ ∑_{i=0}^{m} ‖x^δ_{n+i+1,α} − x^δ_{n+i,α}‖ ≤ ∑_{i=0}^{m} g(e_1)^{2^{n+i} − 1} e_1    (15.2.8)
 ≤ (g(e_1)^{2^n − 1} e_1)/(1 − g(e_1)) ≤ (g(γ_ρ)^{2^n − 1} γ_ρ)/(1 − g(γ_ρ)) ≤ r e^{−γ2^n}.    (15.2.9)

Thus (x^δ_{n,α}) is a Cauchy sequence in B_r(x_0) and hence converges, say to x^δ_α ∈ B_r(x_0). Further, by letting n → ∞ in (15.1.12), we obtain

F′(x_0)^*(F(x^δ_α) − y^δ) + α(x^δ_α − x_0) = 0.

The estimate in (15.2.1) follows by letting m tend to ∞ in (15.2.9).


Remark 15.2.4. Note that if r ∈ (r_1, r_2), where

r_1 := (2 + (2l_0 − 3k_0)γ_ρ − √((4l_0^2 + 9k_0^2 − 36k_0l_0)γ_ρ^2 − (12k_0 + 8l_0)γ_ρ + 4)) / (2l_0(2 + 3k_0γ_ρ))

and

r_2 := min{ (2 + (2l_0 − 3k_0)γ_ρ + √((4l_0^2 + 9k_0^2 − 36k_0l_0)γ_ρ^2 − (12k_0 + 8l_0)γ_ρ + 4)) / (2l_0(2 + 3k_0γ_ρ)),  (2 − 3k_0)/((2 + 3l_0)k_0) },

with

γ_ρ ≤ c_{l_0k_0} := min{1, (√((8l_0 − 12k_0)^2 + 16(36k_0l_0 − 9k_0^2 − 4l_0^2)) − (8l_0 + 12k_0)) / (2(36k_0l_0 − 9k_0^2 − 4l_0^2))},

then γ_ρ/(1 − g(γ_ρ)) ≤ r and l_0 r < 1.

15.3. Error Analysis

We use the following assumption to obtain an error estimate for ‖x^δ_α − x̂‖.

Assumption 15.3.1. There exists a continuous, strictly monotonically increasing function ϕ : (0, a] → (0, ∞) with a ≥ ‖F′(x_0)‖^2 satisfying:

• lim_{λ→0} ϕ(λ) = 0;

• sup_λ αϕ(λ)/(λ + α) ≤ ϕ(α), ∀λ ∈ (0, a];

• there exists v ∈ X such that x_0 − x̂ = ϕ(A_0^* A_0)v.

Theorem 15.3.2. Let x^δ_α be as in (15.1.10). Then

‖x^δ_α − x̂‖ ≤ (max{1, ‖v‖}/(1 − q)) (δ/√α + ϕ(α)),

where q = l_0 r.

Proof. Let M = ∫_0^1 F′(x̂ + t(x^δ_α − x̂)) dt. Then

F(x^δ_α) − F(x̂) = M(x^δ_α − x̂),

and hence by (15.1.10) we have (A_0^* M + αI)(x^δ_α − x̂) = A_0^*(y^δ − y) + α(x_0 − x̂). Thus

x^δ_α − x̂ = (A_0^* A_0 + αI)^{-1} [A_0^*(y^δ − y) + α(x_0 − x̂) + A_0^*(A_0 − M)(x^δ_α − x̂)] = s_1 + s_2 + s_3,    (15.3.1)

where s_1 := (A_0^* A_0 + αI)^{-1} A_0^*(y^δ − y), s_2 := (A_0^* A_0 + αI)^{-1} α(x_0 − x̂) and s_3 := (A_0^* A_0 + αI)^{-1} A_0^*(A_0 − M)(x^δ_α − x̂). Note that

‖s_1‖ ≤ δ/√α,    (15.3.2)

by Assumption 15.3.1,

‖s_2‖ ≤ ϕ(α)‖v‖,    (15.3.3)

and by Assumption 15.2.1,

‖s_3‖ ≤ l_0 r ‖x^δ_α − x̂‖.    (15.3.4)

The result now follows from (15.3.1), (15.3.2), (15.3.3) and (15.3.4).

15.3.1. Error Bounds under Source Conditions

Combining the estimates in Theorem 15.2.3 and Theorem 15.3.2, we obtain the following.

Theorem 15.3.3. Let the assumptions of Theorem 15.2.3 and Theorem 15.3.2 hold, and let x^δ_{n,α} be as in (15.1.12). Then

‖x^δ_{n,α} − x̂‖ ≤ r e^{−γ2^n} + (max{1, ‖v‖}/(1 − q)) (δ/√α + ϕ(α)).

Further, if n_δ := min{n : e^{−γ2^n} < δ/√α}, then

‖x^δ_{n_δ,α} − x̂‖ ≤ C̃ (δ/√α + ϕ(α)),

where C̃ := r + max{1, ‖v‖}/(1 − q).

15.3.2. A Priori Choice of the Parameter

Observe that the estimate δ/√α + ϕ(α) in Theorem 15.3.3 is of optimal order for the choice α := α_δ which satisfies δ/√α_δ = ϕ(α_δ). Now, using the function ψ(λ) := λ√(ϕ^{-1}(λ)), 0 < λ ≤ a, we have δ = √α_δ ϕ(α_δ) = ψ(ϕ(α_δ)), so that α_δ = ϕ^{-1}[ψ^{-1}(δ)].

Theorem 15.3.4. Let ψ(λ) = λ√(ϕ^{-1}(λ)), 0 < λ ≤ a, and let the assumptions of Theorem 15.3.3 hold. For δ > 0, let α_δ = ϕ^{-1}[ψ^{-1}(δ)] and let n_δ be as in Theorem 15.3.3. Then

‖x^δ_{n_δ,α} − x̂‖ = O(ψ^{-1}(δ)).

15.3.3. Adaptive Choice of the Parameter

In the balancing principle considered by Pereverzev and Schock in [23], the regularization parameter α = α_i is selected from some finite set

D_N := {α_i : 0 < α_0 < α_1 < ··· < α_N}.

Let

n_i = min{n : e^{−γ2^n} ≤ δ/√α_i}

and let x^δ_{α_i} := x^δ_{n_i,α_i}, where x^δ_{n_i,α_i} is as in (15.1.12) with α = α_i and n = n_i. Then from Theorem 15.3.3 we have

‖x^δ_{α_i} − x̂‖ ≤ C̃ (δ/√α_i + ϕ(α_i)),  ∀i = 1, 2, ..., N.

Precisely, we choose the regularization parameter α = α_k from the set D_N defined by

D_N := {α_i = µ^i α_0, i = 1, 2, ..., N},

where µ > 1.

To obtain a conclusion from this parameter choice we consider all possible functions ϕ satisfying Assumption 15.3.1 and ϕ(α_i) ≤ δ/√α_i. Any such function is called admissible for x̂, and it can be used as a measure of the convergence x^δ_α → x̂ (see [19]).
The main result of this section is the following theorem, the proof of which is analogous to the proof of Theorem 4.4 in [7].

Theorem 15.3.5. Assume that there exists i ∈ {0, 1, ..., N} such that ϕ(α_i) ≤ δ/√α_i. Let the assumptions of Theorem 15.3.3 be satisfied, and let

l := max{i : ϕ(α_i) ≤ δ/√α_i} < N,

k := max{i : ∀j = 1, 2, ..., i; ‖x^δ_{α_i} − x^δ_{α_j}‖ ≤ 4C̃ δ/√α_j},

where C̃ is as in Theorem 15.3.3. Then l ≤ k and

‖x^δ_{α_k} − x̂‖ ≤ 6C̃µψ^{-1}(δ).



15.4. Implementation of the Method

Finally, the balancing algorithm associated with the choice of the parameter specified in Theorem 15.3.5 involves the following steps:

• Choose α_0 > 0 such that δ_0 < c_{l_0k_0}√α_0, and µ > 1.

• Choose N big enough, but not too large, and α_i := µ^i α_0, i = 0, 1, 2, ..., N.

• Choose ρ ≤ (√(1 + 2l_0(c_{l_0k_0} − δ_0/√α_0)) − 1)/l_0, where c_{l_0k_0} is as in Remark 15.2.4.

• Choose r ∈ (r_1, r_2).

15.4.1. Algorithm

The steps are as follows (a schematic code rendering is given after the list):

1. Set i = 0.

2. Choose n_i = min{n : e^{−γ2^n} ≤ δ/√α_i}.

3. Compute x^δ_{n_i,α_i} = x^δ_{α_i} by using the iteration (15.1.12) with n = n_i and α = α_i.

4. If ‖x^δ_{α_i} − x^δ_{α_j}‖ > 4C̃ δ/√α_j for some j < i, then take k = i − 1 and return x^δ_{α_k}.

5. Else set i = i + 1 and return to Step 2.
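
A schematic Python rendering of this adaptive loop; solve_for stands for the iteration (15.1.12) (e.g., the sketch of Section 15.1.1), and the bookkeeping and names are illustrative:

```python
import math
import numpy as np

def balancing_choice(solve_for, delta, alpha0, mu, N, gamma, C_tilde):
    """Balancing principle of Section 15.4.1: stop at the first i whose iterate
    drifts more than 4*C_tilde*delta/sqrt(alpha_j) from an earlier x_{alpha_j}."""
    xs, alphas = [], []
    for i in range(N + 1):
        alpha = alpha0 * mu ** i
        n = 0                                 # n_i of Step 2
        while math.exp(-gamma * 2 ** n) > delta / math.sqrt(alpha):
            n += 1
        x = solve_for(alpha, n)               # x_{alpha_i} via (15.1.12), Step 3
        for xj, aj in zip(xs, alphas):
            if np.linalg.norm(x - xj) > 4 * C_tilde * delta / math.sqrt(aj):
                return xs[-1]                 # Step 4: take k = i - 1
        xs.append(x)
        alphas.append(alpha)
    return xs[-1]
```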


References

[1] Argyros, I.K., Convergence and Application of Newton-type Iterations, Springer, 2008.

[2] Argyros, I.K., Hilout, S., A convergence analysis for directional two-step Newton
methods, Numer. Algor., 55 (2010), 503-528.

[3] Bakushinskii, A.B., The problem of the convergence of the iteratively regularized
Gauss-Newton method, Comput. Math. Phy., 32 (1992), 1353-1359.

[4] Bakushinskii, A.B., Iterative methods without saturation for solving degenerate non-
linear operator equations, Dokl. Akad. Nauk, 344 (1995), 7-8, MR1361018.

[5] Blaschke, B., Neubauer, A., Scherzer, O., On convergence rates for the iteratively
regularized Gauss-Newton method, IMA Journal on Numerical Analysis, 17 (1997),
421- 436.

[6] Bockmann, C., Kammanee, A., Braun, A., Logarithmic convergence of Levenberg-Marquardt method with application to an inverse potential problem, J. Inv. Ill-Posed Problems, 19 (2011), 345-367.

[7] George, S., On convergence of regularized modified Newton’s method for nonlinear
ill-posed problems, J. Inv. Ill-Posed Problems, 18 (2010), 133-146.

[8] George, S., “Newton type iteration for Tikhonov regularization of nonlinear ill-posed
problems,” J. Math., 2013 (2013), Article ID 439316, 9 pages.

[9] Hanke, M., A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems, 13 (1997), 79-95.

[10] Hanke, M., The regularizing Levenberg-Marquardt scheme is of optimal order, J. Integral Equations Appl., 22 (2010), 259-283.

[11] Hochbruck, M., Hönig, M., On the convergence of a regularizing Levenberg-Marquardt scheme for nonlinear ill-posed problems, Numer. Math., 115 (2010), 71-79.

[12] Hohage, T., Logarithmic convergence rate of the iteratively regularized Gauss-Newton
method for an inverse potential and an inverse scattering problem, Inverse Problems,
13 (1997), 1279-1299.

[13] Hohage, T., Regularization of exponentially ill-posed problems, Numer. Funct. Anal. Optim., 21 (2000), 439-464.

[14] Jin, Q., On a regularized Levenberg-Marquardt method for solving nonlinear inverse
problems, Numer. Math. 16 (2010), 229-259.

[15] Kaltenbacher, B., A posteriori parameter choice strategies for some Newton-type
methods for the regularization of nonlinear ill-posed problems, Numerische Mathe-
matik, 79 (1998), 501-528.

[16] Kaltenbacher, B., A note on logarithmic convergence rates for nonlinear Tikhonov
regularization, J. Inv. Ill-Posed Problems, 16 (2008), 79-88.

[17] Langer, S., Hohage T., Convergence analysis of an inexact iteratively regularized
Gauss-Newton method under general source conditions, J. of Inverse & Ill-Posed
Problems, 15 (2007), 19-35.

[18] Levenberg, K., A Method for the solution of certain problems in least squares, Quart.
Appl. Math. 2 (1944), 164-168.

[19] Lu, S., Pereverzev, S.V., Sparsity reconstruction by the standard Tikhonov method,
RICAM-Report No. 2008-17.

[20] Mahale, P., Nair, M.T., A Simplified generalized Gauss-Newton method for nonlinear
ill-posed problems, Mathematics of Computation, 78(265) (2009), 171-184.

[21] Marquardt, D., An algorithm for least-squares estimation of nonlinear parameters, SIAM J. Appl. Math., 11 (1963), 431-441.

[22] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York and London, 1970.

[23] Pereverzev, S.V., Schock, E., On the adaptive selection of the parameter in regularization of ill-posed problems, SIAM J. Numer. Anal., 43 (2005), 2060-2076.

[24] Pornsawad, P., Bockmann, C., Convergence rate analysis of the first-stage Runge-Kutta-type regularizations, Inverse Problems, 26 (2010), 035005.
Chapter 16

Local Convergence Analysis of Proximal Gauss-Newton Method for Penalized Nonlinear Least Squares Problems

16.1. Introduction

Let X and Y be Hilbert spaces, let D ⊆ X be an open set and let F : D → Y be continuously Fréchet-differentiable. Moreover, let J : D → R ∪ {∞} be proper, convex and lower semicontinuous. In this study we are concerned with the problem of approximating a locally unique solution x* of the penalized nonlinear least squares problem

min_{x∈D} ‖F(x)‖^2 + J(x).    (16.1.1)

A solution x* ∈ D of (16.1.1) is also called a least squares solution of the equation F(x) = 0.

Many problems from computational sciences and other disciplines can be brought into a form similar to equation (16.1.1) using mathematical modelling [3, 6, 14, 16]. For example, in data fitting we have X = R^i, Y = R^j, where i is the number of parameters and j is the number of observations.

The solution of (16.1.1) can rarely be found in closed form. That is why the solution methods for these equations are usually iterative. In particular, the practice of numerical analysis for finding such solutions is essentially connected to Newton-type methods [1, 2, 3, 5, 4, 6, 7, 14, 17]. The study of the convergence of iterative procedures usually centers on two types of analysis: semilocal and local. The semilocal convergence analysis is based on information around an initial point and gives criteria ensuring the convergence of iterative procedures, while the local analysis is based on information around a solution and provides estimates of the radii of the convergence balls. A plethora of sufficient conditions for the local as well as the semilocal convergence of Newton-type methods, together with an error analysis for such methods, can be found in [1]-[20].

If J = 0, we obtain the well-known Gauss-Newton method defined by

x_{n+1} = x_n − F′(x_n)^+ F(x_n),  for each n = 0, 1, 2, ...,    (16.1.2)

where x_0 ∈ D is an initial point [12] and F′(x_n)^+ is the Moore-Penrose inverse of the linear operator F′(x_n). In the present paper we use the proximal Gauss-Newton method (made precise in Section 16.2; see (16.2.6)) for solving the penalized nonlinear least squares problem (16.1.1). Notice that if J = 0, x* is a solution of (16.1.1), F(x*) = 0 and F′(x*) is invertible, then the theories of Gauss-Newton methods merge into those of Newton's method. A survey of convergence results under various Lipschitz-type conditions for Gauss-Newton-type methods can be found in [2, 6] (see also [5, 9, 10, 12, 15, 18]). The convergence of these methods requires, among other hypotheses, that F′ satisfies a Lipschitz condition or that F″ is bounded in D. Several authors have relaxed these hypotheses [4, 8, 9, 10, 15]. In particular, Ferreira et al. [1, 9, 10] have used the majorant condition in the local as well as the semilocal convergence of Newton-type methods. Argyros and Hilout [3, 4, 5, 6, 7] have also used the majorant condition to provide a tighter convergence analysis and weaker convergence criteria for Newton-type methods. The local convergence of the inexact Gauss-Newton method was examined by Ferreira et al. [9] using the majorant condition. It was shown that this condition is better than Wang's condition [15], [20] in some sense. A certain relationship between the majorant function and the operator F was established that unifies two previously unrelated results pertaining to inexact Gauss-Newton methods: the result for analytic functions and the one for operators with Lipschitz derivative.

In [7], motivated by the elegant work in [10] and optimization considerations, we presented a new local convergence analysis for inexact Gauss-Newton-like methods by using a majorant and a center-majorant function (which is a special case of the majorant function) instead of just a majorant function, with the following advantages: a larger radius of convergence; tighter error estimates on the distances ‖x_n − x*‖ for each n = 0, 1, ...; and a clearer relationship between the majorant function and the associated least squares problem (16.1.1). Moreover, these advantages are obtained at the same computational cost, since, as we will see in Section 16.3 and Section 16.4, the computation of the majorant function requires the computation of the center-majorant function. Furthermore, these advantages are very important in computational mathematics, since we have a wider choice of initial guesses x_0 and fewer computations to obtain a desired error tolerance on the distances ‖x_n − x*‖ for each n = 0, 1, .... In the present paper, we obtain the same advantages over the work by Allende and Gonçalves [1], but using the proximal Gauss-Newton method [6, 18].

The paper is organized as follows. In order to make the paper as self-contained as possible, we provide the necessary background in Section 16.2. Section 16.3 contains the local convergence analysis of inexact Gauss-Newton-like methods. Some proofs are abbreviated to avoid repetition with the corresponding ones in [18]. Special cases and applications are given in the concluding Section 16.4.

16.2. Background

Let U(x, r) and Ū(x, r) stand, respectively, for the open and closed balls in X with center x ∈ D and radius r > 0. Let A : X → Y be continuous, linear and injective with closed image; the Moore-Penrose inverse [3] A^+ : Y → X is defined by A^+ = (A^* A)^{-1} A^*. I denotes the identity operator on X (or Y). Let L(X, Y) be the space of bounded linear operators from X into Y. For M ∈ L(X, Y), Ker(M) and Im(M) denote the kernel

and image of $M$, respectively, and $M^*$ its adjoint operator. Let $M \in L(X,Y)$ have a closed image. Recall that the Moore-Penrose inverse of $M$ is the linear operator $M^+ \in L(Y,X)$ which satisfies
$$MM^+M = M,\quad M^+MM^+ = M^+,\quad (MM^+)^* = MM^+,\quad (M^+M)^* = M^+M. \qquad (16.2.1)$$

It follows from (16.2.1) that, if $\Pi_S$ denotes the projection onto a subspace $S$, then
$$M^+M = I_X - \Pi_{\mathrm{Ker}(M)},\qquad MM^+ = \Pi_{\mathrm{Im}(M)}. \qquad (16.2.2)$$

Moreover, if M is injective, then

$$M^+ = (M^*M)^{-1}M^*,\qquad M^+M = I_X,\qquad \|M^+\|^2 = \|(M^*M)^{-1}\|. \qquad (16.2.3)$$
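These identities are easy to check numerically. The following minimal sketch (Python with NumPy; the matrix M is a hypothetical injective example, not taken from the text) verifies (16.2.1) and (16.2.3):

```python
import numpy as np

# A hypothetical tall injective matrix; its image is automatically closed
# in finite dimensions, so the Moore-Penrose inverse exists.
M = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
Mp = np.linalg.pinv(M)

assert np.allclose(M @ Mp @ M, M)                      # M M^+ M = M
assert np.allclose(Mp @ M @ Mp, Mp)                    # M^+ M M^+ = M^+
assert np.allclose((M @ Mp).T, M @ Mp)                 # (M M^+)^* = M M^+
assert np.allclose((Mp @ M).T, Mp @ M)                 # (M^+ M)^* = M^+ M
assert np.allclose(Mp, np.linalg.inv(M.T @ M) @ M.T)   # (16.2.3), M injective
```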

Lemma 16.2.1. [3, 6, 14] (Banach's Lemma) Let $A : X \to X$ be a continuous linear operator. If $\|A - I\| < 1$, then $A^{-1} \in L(X,X)$ and $\|A^{-1}\| \le 1/(1 - \|A - I\|)$.

Lemma 16.2.2. [1, 3, 6, 10] Let A, E : X −→ Y be two continuous linear operators with
closed images. Suppose B = A + E, A is injective and k E A+ k< 1. Then, B is injective.

Lemma 16.2.3. [1, 3, 6, 10] Let $A, E : X \to Y$ be two continuous linear operators with closed images. Suppose $B = A + E$ and $\|A^+\|\,\|E\| < 1$. Then, the following estimates hold:
$$\|B^+\| \le \frac{\|A^+\|}{1 - \|A^+\|\,\|E\|} \qquad\text{and}\qquad \|B^+ - A^+\| \le \frac{\sqrt{2}\,\|A^+\|^2\|E\|}{1 - \|A^+\|\,\|E\|}.$$

The semilocal convergence of the proximal Gauss-Newton method using Wang's condition was introduced in [18]. Next, in order to make the paper as self-contained as possible, we briefly illustrate how this method is defined. Let $Q : X \to X$ be continuous, positive, self-adjoint and bounded from below. It follows that $Q^{-1} \in L(X,X)$. Define a scalar product on $X$ by $\langle u, v\rangle_Q = \langle u, Qv\rangle$. Then, the corresponding induced norm $\|\cdot\|_Q$ is equivalent to the given norm on $X$, since $\|Q^{-1}\|^{-1}\|x\|^2 \le \|x\|_Q^2 \le \|Q\|\,\|x\|^2$. The Moreau approximation of $J$ [18] with respect to the scalar product induced by $Q$ is the functional $\Gamma : X \to \mathbb{R}$ defined by
$$\Gamma(y) = \inf_{x\in X}\left(J(x) + \frac{1}{2}\|x - y\|_Q^2\right). \qquad (16.2.4)$$
It follows from the properties of $J$ that the infimum in (16.2.4) is attained at a unique point. Let us denote by $\mathrm{prox}_J^Q(y)$ the proximity operator:
$$\mathrm{prox}_J^Q : X \to X,\qquad y \mapsto \operatorname{argmin}_{x\in X}\left(J(x) + \frac{1}{2}\|x - y\|_Q^2\right). \qquad (16.2.5)$$
The first-order optimality condition for (16.2.4) leads to
$$z = \mathrm{prox}_J^Q(y) \iff 0 \in \partial J(z) + Q(z - y) \iff Q(y) \in (\partial J + Q)(z),$$
which leads to
$$\mathrm{prox}_J^Q(y) = (\partial J + Q)^{-1}(Q(y))$$
by using the minimum in (16.2.4). In view of the above, we can define the proximal Gauss-Newton method by
$$x_{n+1} = \mathrm{prox}_J^{H(x_n)}\left(x_n - F'(x_n)^+F(x_n)\right) \quad\text{for each } n = 0, 1, 2, \ldots \qquad (16.2.6)$$
where $x_0$ is an initial point, $H(x_n) = F'(x_n)^*F'(x_n)$ and $\mathrm{prox}_J^{H(x_n)}$ is defined in (16.2.5).
Next, we present some auxiliary results.

Lemma 16.2.4. [18] Let $Q_1$ and $Q_2$ be continuous, positive, self-adjoint operators, bounded from below on $X$. Then, the following holds for each $y_1, y_2 \in X$:
$$\|\mathrm{prox}_J^{Q_1}(y_1) - \mathrm{prox}_J^{Q_2}(y_2)\| \le \sqrt{\|Q_1\|\,\|Q_1^{-1}\|}\;\|y_1 - y_2\| + \|Q_1^{-1}\|\,\big\|(Q_1 - Q_2)\big(y_2 - \mathrm{prox}_J^{Q_2}(y_2)\big)\big\|. \qquad (16.2.7)$$

Lemma 16.2.5. [18] Given xn ∈ X, if F 0 (xn ) is injective with closed image, then xn+1
satisfies
$$x_{n+1} = \operatorname{argmin}_{x\in X}\,\frac{1}{2}\|F(x_n) + F'(x_n)(x - x_n)\|^2 + J(x). \qquad (16.2.8)$$
Lemma 16.2.6. [18] Suppose: x∗ ∈ D satisfies −F 0 (x∗ )∗ F(x∗ ) ∈ ∂J(x∗ ); F 0 (x∗ ) is injective
and Im(F 0 (x∗ )) is closed. Then x∗ satisfies
$$x_* = \mathrm{prox}_J^{H(x_*)}\left(x_* - F'(x_*)^+F(x_*)\right). \qquad (16.2.9)$$

Proposition 16.2.7. [10] Let R > 0. Suppose g : [0, R) −→ R is convex. Then, the following
holds:
$$D^+g(0) = \lim_{u\to 0^+}\frac{g(u) - g(0)}{u} = \inf_{u>0}\frac{g(u) - g(0)}{u}.$$
Proposition 16.2.8. [10] Let R > 0 and θ ∈ [0, 1]. Suppose g : [0, R) −→ R is convex. Then,
h : (0, R) −→ R defined by h(t) = (g(t) − g(θt))/t is increasing.

16.3. Local Convergence Analysis of the Proximal Gauss-Newton Method
We shall prove the main local convergence results for the proximal Gauss-Newton method
(16.2.6) for solving the penalized nonlinear least squares problem (16.1.1) under the (H)
conditions given as follows:

(H0) Let $D \subseteq X$ be open, $J : D \to \mathbb{R}\cup\{+\infty\}$ be proper, convex and lower semicontinuous, and $F : D \to Y$ be continuously Fréchet-differentiable such that $F'$ has a closed image in $D$;

(H1) Let $x_* \in D$, $R > 0$, $\alpha := \|F(x_*)\|$, $\beta := \|F'(x_*)^+\|$, $\gamma := \beta\|F'(x_*)\|$ and $\delta := \sup\{t \in [0,R) : U(x_*,t) \subset D\}$. Suppose that $-F'(x_*)^*F(x_*) \in \partial J(x_*)$, $F'(x)$ is injective and there exist continuously differentiable functions $f_0, f : [0,R) \to \mathbb{R}$ such that for each $x \in U(x_*,\delta)$, $\theta \in [0,1]$ and $\lambda(x) = \|x - x_*\|$:
$$\beta\|F'(x) - F'(x_*)\| \le f_0'(\lambda(x)) - f_0'(0) \qquad (16.3.1)$$
and
$$\beta\|F'(x) - F'(x_* + \theta(x - x_*))\| \le f'(\lambda(x)) - f'(\theta\lambda(x)); \qquad (16.3.2)$$

(H2) $f_0(0) = f(0) = 0$ and $f_0'(0) = f'(0) = -1$;
(H3) $f_0'$, $f'$ are strictly increasing and for each $t \in [0,R)$
$$f_0(t) \le f(t) \quad\text{and}\quad f_0'(t) \le f'(t);$$
(H4) $\left[\left(1+\sqrt{2}\right)\gamma + 1\right]\alpha\beta\, D^+f_0'(0) < 1$.
Let the positive constants $\nu$, $\rho$, $r$ and the function $\Psi$ be defined by
$$\nu := \sup\{t \in [0,R) : f_0'(t) < 0\},\qquad \rho := \sup\{t \in [0,\nu) : \Psi(t) < 1\},\qquad r := \min\{\nu, \rho\}$$
and
$$\Psi(t) := \frac{(f_0'(t)+1+\gamma)\left[t f'(t) - f(t) + \alpha\beta\left(1+\sqrt{2}\right)(f_0'(t)+1)\right] + \alpha\beta(f_0'(t)+1)}{\left(t\, f_0'(t)\right)^2}.$$

Remark 16.3.1. In the literature, with the exception of our works [2, 3, 4, 5, 6, 7], only (16.3.2) is used. However, notice that (16.3.2) always implies (16.3.1). That is, (16.3.1) is not an additional hypothesis to (16.3.2). Moreover,
$$f_0'(t) \le f'(t) \quad\text{for each } t \in [0,R) \qquad (16.3.3)$$
holds in general and $\frac{f'(t)}{f_0'(t)}$ can be arbitrarily large [3, 6]. Using the more precise (16.3.1) instead of (16.3.2) for the computation of the upper bounds on the norms $\|F'(x)^+\|$ and $\|F'(x)^+ - F'(x_*)^+\|$ leads to tighter error estimates on $\|x_n - x_*\|$ and a larger radius of convergence (if $f_0'(t) < f'(t)$) than if only (16.3.2) were used (see also Remark 16.3.11, the last section and the numerical example).

Theorem 16.3.2. Under the (H) hypotheses, let x0 ∈ U(x∗ , r) \ {x∗ }. Then, sequence {xn }
generated by proximal Gauss-Newton method (16.2.6) for solving penalized nonlinear least
squares problem (16.1.1) is well defined, remains in U(x∗ , r) and converges to x∗ . Moreover,
the following estimates hold for each n = 0, 1, 2, . . .:

$$\lambda_{n+1} = \lambda(x_{n+1}) \le \varphi_{n+1} := \varphi(\lambda_0, \lambda_n, f, f_0, f_0'), \qquad (16.3.4)$$
where
$$\begin{aligned}
\varphi(\lambda_0, \lambda_n, f, f_0, f_0') ={}& \frac{(f_0'(\lambda_0)+1+\gamma)\,[f'(\lambda_0)\lambda_0 - f(\lambda_0)]}{(\lambda_0 f_0'(\lambda_0))^2}\,\lambda_n^2
+ \frac{\left(1+\sqrt{2}\right)\alpha\beta\,(f_0'(\lambda_0)+1)^2}{(\lambda_0 f_0'(\lambda_0))^2}\,\lambda_n^2 \\
&+ \frac{\alpha\beta\left[\left(1+\sqrt{2}\right)\gamma + 1\right](f_0'(\lambda_0)+1)}{\lambda_0\,(f_0'(\lambda_0))^2}\,\lambda_n.
\end{aligned}$$
In order for us to prove Theorem 16.3.2 we shall need several auxiliary results. The proofs of the next four lemmas are omitted, since they have been given, respectively, in Lemmas 16.3.1-16.3.4 of [7]. From now on we assume that the hypotheses (H) are satisfied.
Lemma 16.3.3. The following hold: $\nu > 0$ and $f_0'(t) < 0$ for each $t \in [0,\nu)$.
Lemma 16.3.4. The functions $g_i$, $i = 1, 2, \ldots, 7$, defined by
$$g_1(t) = -\frac{1}{f_0'(t)},\qquad g_2(t) = -\frac{f_0'(t)+1+\gamma}{f_0'(t)},\qquad g_3(t) = \frac{t f'(t) - f(t)}{t^2},\qquad g_4(t) = \frac{f_0'(t)+1}{t},$$
$$g_5(t) = \frac{(f_0'(t)+1+\gamma)(t f'(t) - f(t))}{(t f_0'(t))^2},\qquad g_6(t) = \frac{(f_0'(t)+1)^2}{(t f_0'(t))^2} \qquad\text{and}\qquad g_7(t) = \frac{f_0'(t)+1}{t\,(f_0'(t))^2}$$
for each $t \in [0,\nu)$ are positive and increasing.
Lemma 16.3.5. The following hold: $\rho > 0$ and $0 \le \Psi(t) < 1$ for each $t \in [0,\rho)$, where the function $\Psi$ is defined in the (H) hypotheses.
Lemma 16.3.6. Let $x \in D$. Suppose that $\lambda(x) < \min\{\nu,\rho\}$ and the (H) hypotheses, excluding (16.3.2), hold. Then, the following items hold:
$$\|F'(x)^+\| \le -\frac{\beta}{f_0'(\lambda(x))},$$
$$\|F'(x)^+ - F'(x_*)^+\| \le -\frac{\sqrt{2}\,\beta\,(f_0'(\lambda(x))+1)}{f_0'(\lambda(x))}$$
and $H(x) = F'(x)^*F'(x)$ is invertible on $U(x_*, r)$.

Remark 16.3.7. It is worth noticing (see also Remark 16.3.1) that the estimates in Lemma 16.3.6 hold with $f_0$ replaced by $f$ (i.e., using (16.3.2) instead of (16.3.1)). However, in this case these estimates are less tight.

Lemma 16.3.8. Let $x \in D$. Suppose that $\lambda(x) < \min\{\nu,\delta\}$ and the (H) hypotheses, excluding (16.3.2), hold. Then, the following items hold for each $x \in D$:
(a) $\|H(x)\|^{1/2} \le \dfrac{f_0'(\lambda(x))+1+\gamma}{\beta}$;
(b) $\|H(x)^{-1}\|^{1/2} \le -\dfrac{\beta}{f_0'(\lambda(x))}$;
and
(c) $\beta\,\|(H(x)-H(x_*))F'(x_*)^+\| \le (f_0'(\lambda(x))+2+\gamma)(f_0'(\lambda(x))+1)$.

Proof.
(a) It follows from (16.3.1) that
$$\beta\|F'(x)\| = \|F'(x_*)^+\|\,\|F'(x)\| \le \beta(\|F'(x) - F'(x_*)\| + \|F'(x_*)\|) \le f_0'(\lambda(x)) + 1 + \gamma.$$
Then (a) follows from the preceding estimate and
$$\|H(x)\|^{1/2} = \|F'(x)^*F'(x)\|^{1/2} = \|F'(x)\|.$$
(b) Use Lemma 16.3.6, the definition of $H$ and the last property in (16.2.3).
(c) We use (16.2.2), (b) and (16.3.1) to obtain in turn that
$$\begin{aligned}
\beta\|(H(x) - H(x_*))F'(x_*)^+\| &= \beta\big\|F'(x)^*(F'(x) - F'(x_*))F'(x_*)^+ + (F'(x) - F'(x_*))^*\Pi_{\mathrm{Im}(F'(x_*))}\big\| \\
&\le (\|F'(x)\|\,\|F'(x_*)^+\| + 1)\,\beta\|F'(x) - F'(x_*)\| \\
&\le (f_0'(\lambda(x)) + 2 + \gamma)(f_0'(\lambda(x)) + 1).
\end{aligned}$$
The proof of the Lemma is complete. $\square$


As in [1, 7, 10, 18] we define the linearization error at a point in $D$ by
$$E_F(x,y) := F(y) - [F(x) + F'(x)(y - x)] \quad\text{for each } x, y \in D.$$
Then, using (16.3.2), we bound this error by the majorant function
$$e_f(t,u) = f(u) - [f(t) + f'(t)(u - t)] \quad\text{for each } t, u \in [0,R).$$
In particular we have (see Lemma 16.3.5 in [7] for the proof):



Lemma 16.3.9. Let $x \in D$. Suppose that $\lambda(x) < \delta$. Then, the following item holds:
$$\beta\|E_F(x,x_*)\| \le e_f(\lambda(x), 0).$$
Remark 16.3.10. (a) Using (16.3.1) only, we have, analogously to Lemma 16.3.9, that
$$\beta\|E_F(x,x_*)\| \le e_{f_0}(\lambda(x), 0) + 2(f_0(\lambda(x)) + \lambda(x)).$$

(b) Let us denote by $G$ the proximal Gauss-Newton iteration operator
$$G : U(x_*, r) \to X,\qquad x \mapsto \mathrm{prox}_J^{H(x)}(G_F(x)),$$
where
$$G_F(x) = x - F'(x)^+F(x).$$
Notice that, according to Lemma 16.3.8, $H(x)$ is invertible in $U(x_*,r)$. Hence, $F'(x)^+$ and $\mathrm{prox}_J^{H(x)}$ are well defined in $U(x_*,r)$.

Next, we provide the proof of Theorem 16.3.2.
Proof. Let $x \in D$ and suppose that $\lambda(x) < r$. Then, we shall first show that the operator $G$ is well defined and
$$\|G(x) - x_*\| \le \varphi(\lambda(x), \lambda(x), f, f_0, f_0'), \qquad (16.3.5)$$
where the function $\varphi$ was defined in Theorem 16.3.2. Using Lemma 16.2.6, since $-F'(x_*)^*F(x_*) \in \partial J(x_*)$ and $F'(x)$ is injective, we have that $x_* = \mathrm{prox}_J^{H(x_*)}(G_F(x_*))$. Then, according to Lemma 16.2.4 we have in turn that
$$\begin{aligned}
\|G(x) - x_*\| &= \big\|\mathrm{prox}_J^{H(x)}(G_F(x)) - \mathrm{prox}_J^{H(x_*)}(G_F(x_*))\big\| \\
&\le (\|H(x)\|\,\|H(x)^{-1}\|)^{1/2}\,\|G_F(x) - G_F(x_*)\| + \|H(x)^{-1}\|\,\big\|(H(x)-H(x_*))\big(G_F(x_*) - \mathrm{prox}_J^{H(x_*)}(G_F(x_*))\big)\big\| \\
&\le P_1(x,x_*) + P_2(x,x_*),
\end{aligned} \qquad (16.3.6)$$
where for simplicity we set
$$P_1(x,x_*) = (\|H(x)\|\,\|H(x)^{-1}\|)^{1/2}\,\|G_F(x) - G_F(x_*)\|$$
and
$$P_2(x,x_*) = \|H(x)^{-1}\|\,\|(H(x)-H(x_*))F'(x_*)^+\|\,\|F(x_*)\|.$$
Using the definition of $P_2$ and items (b) and (c) of Lemma 16.3.8, we get that
$$P_2(x,x_*) \le \frac{\alpha\beta}{(f_0'(\lambda(x)))^2}\,(f_0'(\lambda(x))+2+\gamma)(f_0'(\lambda(x))+1). \qquad (16.3.7)$$

Then, to find an upper bound on $P_1$, we first need an upper bound on $\|G_F(x) - G_F(x_*)\|$. Indeed, we have in turn that
$$\begin{aligned}
\|G_F(x) - G_F(x_*)\| &= \big\|F'(x)^+[F'(x)(x-x_*) - F(x) + F(x_*)] + (F'(x_*)^+ - F'(x)^+)F(x_*)\big\| \\
&\le \|F'(x)^+\|\,\|E_F(x,x_*)\| + \|F'(x_*)^+ - F'(x)^+\|\,\|F(x_*)\| \\
&\le -\frac{e_f(\lambda(x),0)}{f_0'(\lambda(x))} - \frac{\sqrt{2}\,\alpha\beta\,(f_0'(\lambda(x))+1)}{f_0'(\lambda(x))},
\end{aligned}$$
so that
$$P_1(x,x_*) \le \frac{f_0'(\lambda(x))+1+\gamma}{f_0'(\lambda(x))^2}\left(e_f(\lambda(x),0) + \sqrt{2}\,\alpha\beta\,(f_0'(\lambda(x))+1)\right), \qquad (16.3.8)$$
where we used Lemma 16.3.6, Lemma 16.3.8 (a) and (b) and Lemma 16.3.9. Then, (16.3.5) follows from (16.3.6) by summing up (16.3.7) and (16.3.8). Hence, we have that
$$\|G(x) - x_*\| \le q(x)\lambda(x), \qquad (16.3.9)$$
where
$$q(x) = \frac{(f_0'(\lambda(x))+1+\gamma)\left[\lambda(x) f'(\lambda(x)) - f(\lambda(x)) + \alpha\beta\left(1+\sqrt{2}\right)(f_0'(\lambda(x))+1)\right]}{\lambda(x)\,[f_0'(\lambda(x))]^2} + \frac{\alpha\beta\,(f_0'(\lambda(x))+1)}{\lambda(x)\,[f_0'(\lambda(x))]^2}.$$
But $q(x) \in [0,1)$ by Lemma 16.3.5, since $x \in U(x_*,r)\setminus\{x_*\}$, so that $0 < \lambda(x) < r \le \rho$. That is, we have
$$\|G(x) - x_*\| < \|x - x_*\|. \qquad (16.3.10)$$
In particular, $x_0 \in U(x_*,r)\setminus\{x_*\}$, that is $0 < \lambda(x_0) < r$. Then, using mathematical induction, Lemma 16.3.6 and (16.3.10) for $x = x_0$, we get that $\lambda(x_1) = \|x_1 - x_*\| < \|x_0 - x_*\| = \lambda(x_0) < r$. Similarly, we get as in (16.3.9) that
$$\|x_{k+1} - x_*\| \le q(x_0)\|x_k - x_*\| < \|x_k - x_*\| < r,$$
from which it follows that $\lim_{k\to\infty} x_k = x_*$ and the sequence $\{x_k\}$ remains in $U(x_*,r)\setminus\{x_*\}$. $\square$

Remark 16.3.11. If $f_0 = f$, then the results of this section reduce to the corresponding ones in [1] (see also [9]). Otherwise, i.e., if strict inequality holds in (16.3.3), then: our sufficient convergence condition (H4) is weaker than the one in [1], since it uses $f_0'$ instead of $f'$ (i.e., the applicability of the method is extended to cases that could not be covered before); our convergence ball is larger; and the estimates on the distances $\|x_n - x_*\|$ are more precise, which imply that we have a wider choice of initial guesses and that fewer iterates are required to obtain a given error tolerance. Notice also that these advantages are obtained under the same computational cost as in [1, 9], since in practice the computation of the function $f$ requires the computation of $f_0$ as a special case. Therefore, these developments are very important in computational mathematics.

16.4. Special Cases and Numerical Examples


We present a special case of Theorem 16.3.2. This case is based on the center-Lipschitz and
Lipschitz conditions [2, 3, 4, 5, 7]. We refer the reader to [3, 6] for another case based on
Smale’s alpha theory [19].
Remark 16.4.1. Let us define functions $f_0, f : [0,\gamma] \to \mathbb{R}$ by
$$f_0(t) = \frac{L_0}{2}t^2 - t \quad\text{and}\quad f(t) = \frac{L}{2}t^2 - t, \qquad (16.4.1)$$
where $0 < L_0 < L$ are the center-Lipschitz and Lipschitz constants, respectively. We have that $f_0(0) = f(0) = 0$ and $f_0'(0) = f'(0) = -1$. Notice that (16.3.3) holds as a strict inequality in this case. Then, one can specialize Theorem 16.3.2 using the above choices. Clearly, the results improve the corresponding ones (with the advantages already stated in the introduction of this study and in Remark 16.3.11) using only (16.3.2) (i.e., if $f_0 = f$). Since such results, as far as we know, are not available, let us at least consider the case $\alpha = 0$, that is, the case of zero-residual problems. Then, Theorem 16.3.2 specializes to:
Corollary 16.4.2. Let $D \subseteq X$ be open, $J : D \to \mathbb{R}\cup\{+\infty\}$ be proper, convex and lower semicontinuous, and $F : D \to Y$ be continuously Fréchet-differentiable with $F'$ of closed image in $D$. Let $x_* \in D$, $R > 0$, $\beta = \|F'(x_*)^+\|$, $\gamma = \beta\|F'(x_*)\|$ and $\delta = \sup\{t\in[0,R) : U(x_*,t)\subset D\}$. Suppose that $F(x_*) = 0$, $0 \in \partial J(x_*)$, $F'(x_*)$ is injective and there exist $L_0 > 0$ and $L > 0$ such that for each $x \in U(x_*,\delta)$ and $\theta\in[0,1]$:
$$\beta\|F'(x) - F'(x_*)\| \le L_0\|x - x_*\|$$
and
$$\beta\|F'(x) - F'(x_* + \theta(x-x_*))\| \le L(1-\theta)\|x - x_*\|.$$
Let
$$r := \min\left\{\frac{4+\gamma-\sqrt{(4+\gamma)^2-8}}{2L_0},\ \delta\right\}.$$
Then, the sequence $\{x_n\}$ generated by the proximal Gauss-Newton method (16.2.6) for solving the penalized nonlinear least squares problem (16.1.1) is well defined, remains in $U(x_*,r)$ and converges to $x_*$, provided that $x_0 \in U(x_*,r)\setminus\{x_*\}$. Moreover, the following estimates hold:
$$\|x_{n+1} - x_*\| \le \frac{L(\gamma + 2L_0\|x_0-x_*\|)}{2(1 - L_0\|x_0-x_*\|)}\,\|x_n - x_*\|^2 \quad\text{for each } n = 0, 1, 2, \ldots.$$
The preceding results improve earlier ones [1, 8, 9, 10, 12, 15, 18] when $L_0 < L$ (see also Remark 16.3.11). Next, we present an example where $L_0 < L$. More examples where $L_0 < L$, in the Lipschitz case or in Smale's alpha theory, can be found in [3, 4, 5, 6, 7].
Example 16.4.3. Let $X = Y = \mathbb{R}^3$, $D = U(0,1)$, $x_* = (0,0,0)$ and define the function $F$ on $D$ by
$$F(x,y,z) = \left(e^x - 1,\ \frac{e-1}{2}y^2 + y,\ z\right). \qquad (16.4.2)$$
For simplicity we consider the nonlinear equation $F(x) = 0$ instead of (16.1.1). We have that for $u = (x,y,z)$
$$F'(u) = \begin{pmatrix} e^x & 0 & 0 \\ 0 & (e-1)y + 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (16.4.3)$$
Using the max-row-sum norm and (16.4.2)-(16.4.3), we see that, since $F'(x_*) = \mathrm{diag}\{1,1,1\}$, we can define the parameters $L_0$ and $L$ by
$$L_0 = e - 1 < L = e.$$
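These constants can be confirmed numerically. A small sketch (assuming the max norm, for which the induced operator norm is the maximum absolute row sum; sampling is random, so the estimate approaches $e-1$ from below):

```python
import numpy as np

# Numerical check of L0 = e - 1 for Example 16.4.3 (here beta = 1, since
# F'(x*) is the identity).
def dF(u):
    x, y, z = u
    return np.diag([np.exp(x), (np.e - 1.0) * y + 1.0, 1.0])

op_norm = lambda A: np.abs(A).sum(axis=1).max()   # max-row-sum norm
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(5000, 3))
L0_est = max(op_norm(dF(u) - np.eye(3)) / np.abs(u).max() for u in samples)
print(L0_est)   # approaches e - 1 = 1.718..., attained as x -> 1
```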
References

[1] Allende, G.B., Gonçalves, M.L.N., Local convergence analysis of a proximal Gauss-
Newton method under a majorant condition, preprint http://arxiv.org/abs/1304.6461

[2] Amat, S., Busquier, S., Gutiérrez, J.M., Geometric constructions of iterative functions
to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197-205.

[3] Argyros, I.K., Convergence and applications of Newton–type iterations, Springer–


Verlag Publ., New York, 2008.

[4] Argyros, I.K., Hilout, S., Extending the applicability of the Gauss-Newton method
under average Lipschitz-type conditions, Numer. Algorithms, 58 (2011), 23-52.

[5] Argyros, I.K., Hilout, S., Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236 (2012), 1892-1902.

[6] Argyros, I.K., Hilout, S., Computational methods in Nonlinear Analysis, World Sci-
entific Publ. Comp. New Jersey, 2013.

[7] Argyros, I.K., Hilout, S., Improved local convergence analysis of inexact Gauss-
Newton like methods under the majorizing condition in Banach space, J. Franklin
Institute, 350 (2013), 1531-1544.

[8] Chen, J., Li, W., Local convergence results of Gauss-Newton’s like method in weak
conditions, J. Math. Anal. Appl., 324 (2006), 1381–1394.

[9] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of
Gauss–Newton like methods under majorant condition, J. Complexity, 27 (2011), 111–
125.

[10] Ferreira, O.P., Gonçalves, M.L.N, Oliveira, P.R., Local convergence analysis of inex-
act Gauss–Newton like methods under majorant condition, J. Comput. Appl. Math.,
236 (2012), 2487–2498.

[11] Gutiérrez, J.M., Hernández, M.A., Newton’s method under weak Kantorovich condi-
tions, IMA J. Numer. Anal., 20 (2000), 521–532.

[12] Häubler, W.M., A Kantorovich–type convergence analysis for the Gauss–Newton–


method, Numer. Math., 48 (1986), 119–125.

[13] Huang, Z.A., Convergence of inexact Newton method, J. Zhejiang Univ. Sci. Ed., 30
(2003), 393–396.

[14] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[15] Li, C., Hu, N., Wang, J., Convergence behavior of Gauss–Newton's method and extensions to the Smale point estimate theory, J. Complexity, 26 (2010), 268–295.

[16] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71–84.

[17] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.

[18] Salzo, S., Villa, S., Convergence analysis of a proximal Gauss-Newton method, Com-
put. Optim. Appl., 53 (2012), 557–589.

[19] Smale, S., Newton’s method estimates from data at one point. The merging of dis-
ciplines: new directions in pure, applied, and computational mathematics (Laramie,
Wyo., 1985), 185-196, Springer, New York, 1986.

[20] Wang, X.H., Convergence of Newton’s method and uniqueness of the solution of equa-
tions in Banach spaces, IMA J. Numer. Anal., 20 (2000), 123-134.
Chapter 17

On the Convergence of a Damped


Newton Method with Modified
Right-Hand Side Vector

17.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solu-
tion x∗ of the nonlinear equation
F(x) = 0, (17.1.1)
where F is a Fréchet-differentiable operator defined on a open convex subset D of a Banach
space X with values in a Banach space Y.
Many problems from Computational Sciences and other disciplines can be brought in a form similar to equation (17.1.1) using Mathematical Modeling [2, 6, 10]. For example, in data fitting we have $X = Y = \mathbb{R}^i$, where $i$ is both the number of parameters and the number of observations.
The solution of (17.1.1) can rarely be found in closed form. That is why the solution
methods for these equations are usually iterative. In particular, the practice of Numerical
Analysis for finding such solutions is essentially connected to Newton-type methods [1]–
[15]. The study about convergence matter of iterative procedures is usually centered on
two types: semilocal and local convergence analysis. The semilocal convergence matter is,
based on the information around an initial point, to give criteria ensuring the convergence
of iteration procedures; while the local one is, based on the information around a solution,
to find estimates of the radii of the convergence balls.
In the present chapter, we study the convergence of the Damped Newton method defined by
$$x_{n+1} = x_n - A^{-1}\left(I - \alpha_n\left(F'(x_n) - A\right)\right)F(x_n), \quad\text{for each } n = 0, 1, 2, \ldots, \qquad (17.1.2)$$
where $A \in L(X,Y)$, the space of bounded linear operators from $X$ into $Y$, with $A^{-1} \in L(Y,X)$, $\{\alpha_n\}$ is a sequence of real numbers chosen to force convergence of the sequence $\{x_n\}$ and $x_0$ is an initial point. If $A = F'(x_0)$ and $\alpha_n = 0$ for each $n = 0, 1, 2, \ldots$, we obtain the modified Newton's method
$$y_{n+1} = y_n - F'(x_0)^{-1}F(y_n),\quad y_0 = x_0, \quad\text{for each } n = 0, 1, 2, \ldots, \qquad (17.1.3)$$

which converges linearly [2, 10].


The local convergence of the Newton-like method (17.1.2) was studied by Krejić and Lužanin [13] (see also [11]) in the case when $X = Y = \mathbb{R}^i$.
Newton’s method

zn+1 = zn − F 0 (zn )F(zn ), for each n = 0, 1, 2, . . ., (17.1.4)

converges quadratically provided that the iteration starts close enough to the solution. How-
ever, the cost of a Newton iterate may be very expensive, since all the elements of the Ja-
cobian matrix involved must be computed, as well as the need for an exact slowdown of a
system of linear equations using a new matrix for every iterate. As noted in [13] Newton-
like method (17.1.2) uses a modification of the right-hand side vector, which is cheaper than Newton's method and faster than the modified Newton method. One step of the method requires the solution of a linear system, but the system matrix is the same in all iterations.
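A minimal sketch of one possible implementation of (17.1.2) (Python; the helper names are hypothetical, F and dF supply the operator and its Fréchet derivative, and alphas is the sequence {αn}):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def damped_newton(F, dF, x0, alphas, tol=1e-12, max_iter=100):
    """Sketch of (17.1.2): the matrix A = F'(x0) is factored once, and only
    the right-hand side vector changes from iteration to iteration."""
    x = np.asarray(x0, dtype=float)
    A = dF(x)                       # A = F'(x0), fixed for all iterations
    lu = lu_factor(A)               # one factorization, reused every step
    for n in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        # right-hand side (I - alpha_n (F'(x_n) - A)) F(x_n) of (17.1.2)
        rhs = Fx - alphas[n] * (dF(x) - A) @ Fx
        x = x - lu_solve(lu, rhs)
    return x
```

For αn = 0 this reduces to the modified Newton method (17.1.3); the correction term only modifies the right-hand side, so the single LU factorization is reused throughout.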
We present a new local and semilocal convergence analysis for Newton-like method.
In contrast to the work in [11, 13], in the local case the radius of convergence can be
computed as well as the error bounds on the distances kxn − x∗ k for each n = 0, 1, 2, . . .. In
the semilocal case, we present estimates on the smallness of kF(x0 )k as well as computable
estimates for kxn − x∗ k (not given in [11, 13] in terms of the Lipschitz constants and other
initial data).
We denote by U(ν, µ) the open ball centered at ν ∈ X and of radius µ > 0. Moreover,
by U(ν, µ) we denote the closure of U(ν, µ).
The chapter is organized as follows. Sections 17.2. and 17.3. contain the semilocal and
local convergence analysis of Newton-like method (17.1.2), respectively. The numerical
examples are presented in the concluding Section 17.4..

17.2. Semilocal Convergence


In this section we present the semilocal convergence of Damped Newton method (17.1.2).
We shall use the following conditions:

(C0) $F : D \subseteq X \to Y$ is Fréchet-differentiable and there exists $A \in L(X,Y)$ such that $A^{-1} \in L(Y,X)$, with $\|A^{-1}\| \le a$;
(C1) There exists $L > 0$ such that for each $x, y \in D$ the Lipschitz condition
$$\|F'(x) - F'(y)\| \le L\|x - y\| \qquad (17.2.1)$$
holds;
(C2) There exists $L_0 > 0$ such that for each $x \in D$ the center-Lipschitz condition
$$\|F'(x) - F'(x_0)\| \le L_0\|x - x_0\| \qquad (17.2.2)$$
holds;

(C3) There exist $x_0 \in D$, $\alpha \ge 0$, $a_0 \ge 0$, $a_1 \ge 0$ and $q \in (0,1)$ such that, for $\|A^{-1}(F'(x_0) - A)\| \le a_0$ and $\|F'(x_0) - A\| \le a_1$, the following hold:
$$|\alpha_n| \le \alpha,\qquad a + \alpha\left(\frac{aL_0 q}{1-q}\|F(x_0)\| + a_0\right) \le q \qquad (17.2.3)$$
and
$$\frac{Lq^2}{2}\|F(x_0)\| + \left(\frac{L_0 q}{1-q}\|F(x_0)\| + a_1\right)(\alpha + q) \le q; \qquad (17.2.4)$$
(C4) There exist $x_0 \in D$, $\alpha \ge 0$, $a_0 \ge 0$, $a_1 \ge 0$ and $q \in (0,1)$ such that, for $\|A^{-1}(F'(x_0) - A)\| \le a_0$ and $\|F'(x_0) - A\| \le a_1$, the inequality (17.2.3) and
$$\left(\frac{2}{1-q} + \frac{1}{2}\right)L_0 q\|F(x_0)\| + \left(\frac{L_0 q}{1-q}\|F(x_0)\| + a_1\right)(\alpha + q) \le q \qquad (17.2.5)$$
hold;
(C5) $U(x_0, r) \subseteq D$ with $r = \dfrac{q\|F(x_0)\|}{1-q}$.
Notice that (C1) implies (C2),
$$L_0 \le L \qquad (17.2.6)$$
holds in general and $\frac{L}{L_0}$ can be arbitrarily large [2, 3, 6]. The conditions involving $\|F(x_0)\|$ and $q$ in (C3) and (C4) can be solved for $\|F(x_0)\|$ and $q$. However, these representations are very long and unattractive. That is why we decided to leave these conditions as uncluttered as possible. Notice also that these conditions determine the smallness of $\|F(x_0)\|$ and $q$. From now on we shall denote (C0), (C1), (C2), (C3), (C5) and (C0), (C2), (C4), (C5) as the (C) and (C′) conditions, respectively. Next, we present the semilocal convergence of the Damped Newton method (17.1.2), first under the (C) conditions.
Theorem 17.2.1. Suppose that the (C) conditions hold. Then the sequence $\{x_n\}$ generated by the Damped Newton method (17.1.2) is well defined, remains in $U(x_0,r)$ for each $n = 0, 1, 2, \ldots$, and converges to a solution $x_* \in U(x_0,r)$ of equation (17.1.1). Moreover, the following estimates hold for each $n = 0, 1, 2, \ldots$:
$$\|x_{n+1} - x_n\| \le q\|F(x_n)\| \le q^{n+1}\|F(x_0)\| \qquad (17.2.7)$$
and
$$\|F(x_{n+1})\| \le q\|F(x_n)\| \le q^{n+1}\|F(x_0)\|, \qquad (17.2.8)$$
where $q$ is defined in (C3) and $r$ in (C5).
Proof. We have by (17.1.2) and $A^{-1} \in L(Y,X)$ that the sequence $\{x_n\}$ is well defined. Then, we shall show that $x_1 \in U(x_0,r)$, $\|x_1 - x_0\| \le q\|F(x_0)\|$ and $\|F(x_1)\| \le q\|F(x_0)\|$. Indeed, we have by (17.1.2) for $n = 0$ and the second condition in (C3) that
$$\begin{aligned}
\|x_1 - x_0\| &= \|A^{-1}(I - \alpha_0(F'(x_0) - A))F(x_0)\| \\
&\le (\|A^{-1}\| + |\alpha_0|\,\|A^{-1}(F'(x_0) - A)\|)\,\|F(x_0)\| \\
&\le (\|A^{-1}\| + \alpha\,\|A^{-1}(F'(x_0) - A)\|)\,\|F(x_0)\| \\
&\le q\|F(x_0)\| < r.
\end{aligned}$$

Hence, $x_1 \in U(x_0,r)$ and (17.2.7) holds for $n = 0$. Using (17.1.2), it can easily be seen that the Ostrowski-type approximation
$$F(x_{n+1}) = \int_0^1\left(F'(x_n + \theta(x_{n+1}-x_n)) - F'(x_n)\right)(x_{n+1}-x_n)\,d\theta + (F'(x_n) - A)\left(\alpha_n F(x_n) + (x_{n+1}-x_n)\right) \qquad (17.2.9)$$
holds. Using (17.2.9) and the (C) conditions, for $n = 0$ we get in turn that
$$\begin{aligned}
\|F(x_1)\| &\le \frac{L}{2}\|x_1 - x_0\|^2 + \|F'(x_0) - A\|\left(|\alpha_0|\,\|F(x_0)\| + \|x_1 - x_0\|\right) \\
&\le \frac{L}{2}q^2\|F(x_0)\|^2 + \|F'(x_0) - A\|(\alpha\|F(x_0)\| + q\|F(x_0)\|) \\
&\le \left(\frac{L}{2}q^2\|F(x_0)\| + \|F'(x_0) - A\|(\alpha + q)\right)\|F(x_0)\| \\
&\le q\|F(x_0)\|.
\end{aligned}$$

That is, (17.2.8) holds for $n = 0$. It follows from the existence of $x_1 \in U(x_0,r)$ and $A^{-1} \in L(Y,X)$ that $x_2$ is well defined. Using (17.1.2) for $n = 1$, we get by (C0), (C2) and (C3) that
$$\begin{aligned}
\|x_2 - x_1\| &= \|A^{-1}(I - \alpha_1(F'(x_1) - A))F(x_1)\| \\
&\le \left(\|A^{-1}\| + \alpha\,\|A^{-1}(F'(x_1) - F'(x_0)) + A^{-1}(F'(x_0) - A)\|\right)\|F(x_1)\| \\
&\le \left(\|A^{-1}\| + \alpha\left(\|A^{-1}\|L_0\|x_1 - x_0\| + \|A^{-1}(F'(x_0) - A)\|\right)\right)\|F(x_1)\| \\
&\le q\|F(x_1)\| \le q^2\|F(x_0)\|.
\end{aligned}$$
We also have that
$$\|x_2 - x_0\| \le \|x_2 - x_1\| + \|x_1 - x_0\| \le q^2\|F(x_0)\| + q\|F(x_0)\| = \frac{1-q^2}{1-q}\,q\|F(x_0)\| < \frac{q\|F(x_0)\|}{1-q} = r. \qquad (17.2.10)$$

That is, $x_2 \in U(x_0,r)$. Then, using (17.2.9) for $n = 1$, as above we get in turn that
$$\begin{aligned}
\|F(x_2)\| &\le \frac{L}{2}\|x_2 - x_1\|^2 + \left(L_0\|x_1 - x_0\| + \|F'(x_0) - A\|\right)\left(|\alpha_1|\,\|F(x_1)\| + \|x_2 - x_1\|\right) \\
&\le \frac{L}{2}q^2\|F(x_1)\|^2 + \left(L_0 q\|F(x_0)\| + \|F'(x_0) - A\|\right)(\alpha\|F(x_1)\| + q\|F(x_1)\|) \\
&\le \left(\frac{L}{2}q^2\|F(x_1)\| + \left(L_0 q\|F(x_0)\| + \|F'(x_0) - A\|\right)(\alpha + q)\right)\|F(x_1)\| \\
&\le q\|F(x_1)\| \le q^2\|F(x_0)\|.
\end{aligned}$$
Similarly, we have, using (17.1.2), that
$$\begin{aligned}
\|x_3 - x_2\| &\le \left(\|A^{-1}\| + \alpha\,\|A^{-1}(F'(x_2) - F'(x_0)) + A^{-1}(F'(x_0) - A)\|\right)\|F(x_2)\| \\
&\le \left(\|A^{-1}\| + \alpha\left(\|A^{-1}\|L_0\|x_2 - x_0\| + \|A^{-1}(F'(x_0) - A)\|\right)\right)\|F(x_2)\| \\
&\le q\|F(x_2)\| \le q^3\|F(x_0)\|.
\end{aligned}$$
We also have that
$$\|x_3 - x_0\| \le \|x_3 - x_2\| + \|x_2 - x_1\| + \|x_1 - x_0\| \le (q^3 + q^2 + q)\|F(x_0)\| = \frac{1-q^3}{1-q}\,q\|F(x_0)\| < r,$$
and
$$\begin{aligned}
\|F(x_3)\| &\le \frac{L}{2}\|x_3 - x_2\|^2 + \left(L_0\|x_2 - x_0\| + \|F'(x_0) - A\|\right)\left(|\alpha_2|\,\|F(x_2)\| + \|x_3 - x_2\|\right) \\
&\le \frac{L}{2}q^2\|F(x_2)\|^2 + \left(\frac{L_0 q\|F(x_0)\|}{1-q} + \|F'(x_0) - A\|\right)(\alpha\|F(x_2)\| + q\|F(x_2)\|) \\
&\le \left(\frac{L}{2}q^2\|F(x_2)\| + \left(\frac{L_0 q\|F(x_0)\|}{1-q} + \|F'(x_0) - A\|\right)(\alpha + q)\right)\|F(x_2)\| \\
&\le q\|F(x_2)\| \le q^3\|F(x_0)\|.
\end{aligned}$$
The rest follows in an analogous way using induction (simply replace $x_2$, $x_3$ by $x_n$, $x_{n+1}$ in the above estimates). By letting $n \to \infty$ in (17.2.8) we obtain $F(x_*) = 0$. $\square$

Condition (C1) may not be satisfied while the weaker condition (C2) is. In this case (C1) can be dropped. Then, using instead of the approximation (17.2.9) the approximation
$$\begin{aligned}
F(x_{n+1}) ={}& \int_0^1\left(F'(x_n + \theta(x_{n+1}-x_n)) - F'(x_0)\right)(x_{n+1}-x_n)\,d\theta \\
&+ \left(F'(x_0) - F'(x_n)\right)(x_{n+1}-x_n) \\
&+ \left((F'(x_n) - F'(x_0)) + (F'(x_0) - A)\right)\left(\alpha_n F(x_n) + (x_{n+1}-x_n)\right),
\end{aligned} \qquad (17.2.11)$$
we arrive, in a way analogous to Theorem 17.2.1, at the following semilocal convergence result for the Damped Newton method (17.1.2) under the (C′) conditions.

Theorem 17.2.2. Suppose that the (C′) conditions hold. Then the sequence $\{x_n\}$ generated by the Damped Newton method (17.1.2) is well defined, remains in $U(x_0,r)$ for each $n = 0, 1, 2, \ldots$, and converges to a solution $x_* \in U(x_0,r)$ of equation (17.1.1). Moreover, the following estimates hold for each $n = 0, 1, 2, \ldots$:
$$\|x_{n+1} - x_n\| \le q\|F(x_n)\| \le q^{n+1}\|F(x_0)\|$$
and
$$\|F(x_{n+1})\| \le q\|F(x_n)\| \le q^{n+1}\|F(x_0)\|,$$
where $q$ is defined in (C4) and $r$ in (C5).

Concerning the uniqueness of the solution $x_*$ in $U(x_0,r)$ we have the following result.
Proposition 17.2.3. Suppose that the (C) or (C′) conditions hold. Moreover, suppose that there exist $x_0 \in D$ and $r_1 \ge r$ such that $F'(x_0)^{-1} \in L(Y,X)$ and
$$\|F'(x_0)^{-1}\|\,L_0(r_1 + r) < 2. \qquad (17.2.12)$$
Then the solution $x_*$ is the only solution of equation (17.1.1) in $U(x_0,r_1)$, where $r$ is defined in (C5).

Proof. The existence of the solution $x_*$ is guaranteed by the (C) or (C′) conditions. To show uniqueness, let $y_* \in U(x_0,r_1)$ with $F(y_*) = 0$, and define $M = \int_0^1 F'(x_* + \theta(y_* - x_*))\,d\theta$. Then, using (C2) and (17.2.12), we obtain in turn that
$$\begin{aligned}
\|F'(x_0)^{-1}\|\,\|M - F'(x_0)\| &\le \|F'(x_0)^{-1}\|\,L_0\int_0^1\|(x_* - x_0) + \theta(y_* - x_*)\|\,d\theta \\
&\le \|F'(x_0)^{-1}\|\,L_0\int_0^1\|(1-\theta)(x_* - x_0) + \theta(y_* - x_0)\|\,d\theta \\
&\le \|F'(x_0)^{-1}\|\,\frac{L_0}{2}(r + r_1) < 1.
\end{aligned} \qquad (17.2.13)$$
It follows from (17.2.13) and the Banach lemma on invertible operators [10] that $M^{-1} \in L(Y,X)$. Moreover, we have that $0 = F(y_*) - F(x_*) = M(y_* - x_*)$, which implies $x_* = y_*$. $\square$

17.3. Local Convergence
In this section we present the local convergence of the Damped Newton method (17.1.2). We shall use the following conditions:
(C0) $F : D \subseteq X \to Y$ is Fréchet-differentiable and there exist $A \in L(X,Y)$, $x_* \in D$ such that $A^{-1} \in L(Y,X)$ and $F(x_*) = 0$, with $\|A^{-1}\| \le a$ and $\|F'(x_*)\| \le \beta$;
(C1) There exists $L > 0$ such that for each $x, y \in D$ the Lipschitz condition (17.2.1) holds;
(C2) There exists $l_0 > 0$ such that for each $x \in D$ the center-Lipschitz condition
$$\|F'(x) - F'(x_*)\| \le l_0\|x - x_*\|$$
holds;
(C3) Let $\|A^{-1}(F'(x_*) - A)\| \le \beta_1$,
$$|\alpha_n| \le \alpha,\qquad \beta_1(1 + \alpha\beta) < 1.$$
Denote by $R_1$ the positive root of the quadratic polynomial
$$p_1(t) = \frac{a\alpha l_0^2}{2}\,t^2 + \left(\frac{La}{2} + \frac{\alpha l_0\beta_1}{2} + l_0 a + \alpha l_0 a\beta\right)t + \beta_1(1+\alpha\beta) - 1. \qquad (17.3.1)$$
Moreover, denote by $R_2$ the positive root of the quadratic polynomial
$$p_2(t) = \frac{a\alpha l_0^2}{2}\,t^2 + \left(\frac{3al_0}{2} + \frac{\alpha l_0\beta_1}{2} + l_0 a + \alpha l_0 a\beta\right)t + \beta_1(1+\alpha\beta) - 1; \qquad (17.3.2)$$
(C4) $U(x_*, R) \subseteq D$, where $R$ is $R_1$ or $R_2$.
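Since $p_1$ and $p_2$ have positive leading and $t$-coefficients and a negative constant term (by (C3)), each has exactly one positive root. A small sketch computing $R_1$ with hypothetical constants (any values satisfying $\beta_1(1+\alpha\beta) < 1$ will do):

```python
import numpy as np

# Hypothetical constants satisfying beta1 * (1 + alpha * beta) < 1.
a, alpha, l0, L, beta, beta1 = 1.0, 0.1, 1.7, 2.7, 1.0, 0.05

c2 = a * alpha * l0**2 / 2                                     # t^2 coefficient
c1 = L * a / 2 + alpha * l0 * beta1 / 2 + l0 * a + alpha * l0 * a * beta
c0 = beta1 * (1 + alpha * beta) - 1                            # negative constant
R1 = max(np.roots([c2, c1, c0]))    # p1 has one positive, one negative root
print(R1)
```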

Notice that (C1) implies (C2),
$$l_0 \le L \qquad (17.3.3)$$
holds in general and $\frac{L}{l_0}$ can be arbitrarily large [2, 3, 6]. The quadratic polynomials in (C3) have a positive root, by the second hypothesis in (C3) and since the coefficients of $t$ and $t^2$ are positive. From now on we shall denote (C0), (C1), (C2), (C3), (C4) and (C0), (C2), (C3), (C4) as the (H) and (H′) conditions, respectively. Next, we present the local convergence of the Damped Newton method (17.1.2), first under the (H) conditions. In view of (17.1.2) and $F(x_*) = 0$, we have the following identity:
$$\begin{aligned}
x_{n+1} - x_* = -A^{-1}\Big[&\int_0^1\left(F'(x_* + \theta(x_n - x_*)) - F'(x_n)\right)d\theta \\
&- \left((A - F'(x_*)) + (F'(x_*) - F'(x_n))\right)\Big(I - \alpha_n F'(x_*) \\
&\quad - \alpha_n\int_0^1\left(F'(x_* + \theta(x_n - x_*)) - F'(x_*)\right)d\theta\Big)\Big](x_n - x_*). \qquad (17.3.4)
\end{aligned}$$
Then, using (17.3.4) and the (H) conditions, it is standard to arrive at [2, 3, 4, 5, 6, 8, 9, 10, 14, 15]:

Theorem 17.3.1. Suppose that the (H) conditions hold. Then the sequence $\{x_n\}$ generated by the Damped Newton method (17.1.2) is well defined, remains in $U(x_*,R_1)$ for each $n = 0, 1, 2, \ldots$, and converges to $x_*$, provided that $x_0 \in U(x_*,R_1)$. Moreover, the following estimates hold for each $n = 0, 1, 2, \ldots$:
$$\|x_{n+1} - x_*\| \le e_n\|x_n - x_*\| < \|x_n - x_*\| < R_1, \qquad (17.3.5)$$
where
$$e_n = \frac{La}{2}\|x_n - x_*\| + \beta_1 + \alpha\beta_1\beta + \frac{\alpha l_0\beta_1}{2}\|x_n - x_*\| + l_0 a\|x_n - x_*\| + \alpha l_0 a\beta\|x_n - x_*\| + \frac{\alpha l_0^2 a}{2}\|x_n - x_*\|^2 < p_1(R_1) + 1 < 1.$$

In cases where (C1) cannot be verified but (C2) holds, we can present the local convergence of the Damped Newton method (17.1.2) under the (H′) conditions, using the following modification of the Ostrowski representation (17.3.4):
$$\begin{aligned}
x_{n+1} - x_* = -A^{-1}\Big[&\int_0^1\left(F'(x_* + \theta(x_n - x_*)) - F'(x_*)\right)d\theta + \left(F'(x_*) - F'(x_n)\right) \\
&- \left((A - F'(x_*)) + (F'(x_*) - F'(x_n))\right)\Big(I - \alpha_n F'(x_*) \\
&\quad - \alpha_n\int_0^1\left(F'(x_* + \theta(x_n - x_*)) - F'(x_*)\right)d\theta\Big)\Big](x_n - x_*). \qquad (17.3.6)
\end{aligned}$$

Theorem 17.3.2. Suppose that the (H′) conditions hold. Then the sequence $\{x_n\}$ generated by the Damped Newton method (17.1.2) is well defined, remains in $U(x_*,R_2)$ for each $n = 0, 1, 2, \ldots$, and converges to $x_*$, provided that $x_0 \in U(x_*,R_2)$. Moreover, the following estimates hold for each $n = 0, 1, 2, \ldots$:
$$\|x_{n+1} - x_*\| \le e_n'\|x_n - x_*\| < \|x_n - x_*\| < R_2, \qquad (17.3.7)$$
where
$$e_n' = \frac{3l_0 a}{2}\|x_n - x_*\| + \beta_1 + \alpha\beta_1\beta + \frac{\alpha l_0\beta_1}{2}\|x_n - x_*\| + l_0 a\|x_n - x_*\| + \alpha l_0 a\beta\|x_n - x_*\| + \frac{\alpha l_0^2 a}{2}\|x_n - x_*\|^2 < p_2(R_2) + 1 < 1.$$

17.4. Numerical Examples


Example 17.4.1. In this example we present an application of the previous analysis to the Chandrasekhar equation
$$x(s) = 1 + \frac{s}{4}\,x(s)\int_0^1\frac{x(t)}{s+t}\,dt,\qquad s \in [0,1], \qquad (17.4.1)$$
which arises in the theory of radiative transfer [7]; $x(s)$ is the unknown function, which is sought in $C[0,1]$. The physical background of this equation is fairly elaborate. It was developed by Chandrasekhar [7] to solve the problem of determining the angular distribution of the radiant flux emerging from a plane radiation field. This radiation field must be isotropic at a point, that is, the distribution is independent of direction at that point. Explicit definitions of these terms may be found in the literature [7]. It is considered to be the prototype of the equation
$$x(s) = 1 + \lambda s\,x(s)\int_0^1\frac{\varphi(t)}{s+t}\,x(t)\,dt,\qquad s \in [0,1],$$
for more general laws of scattering, where $\varphi(s)$ is an even polynomial in $s$ with
$$\int_0^1\varphi(s)\,ds \le \frac{1}{2}.$$
Integral equations of the above form also arise in other studies [7]. We determine where a solution is located, along with its region of uniqueness.
Note that solving (17.4.1) is equivalent to solving $F(x) = 0$, where $F : C[0,1] \to C[0,1]$ and
$$[F(x)](s) = x(s) - 1 - \frac{s}{4}\,x(s)\int_0^1\frac{x(t)}{s+t}\,dt,\qquad s \in [0,1]. \qquad (17.4.2)$$
To obtain a numerical solution of (17.4.1), we first discretize the problem and approximate the integral by a Gauss-Legendre numerical quadrature with eight nodes,
$$\int_0^1 f(t)\,dt \approx \sum_{j=1}^{8} w_j f(t_j),$$
where
t1 = 0.019855072, t2 = 0.101666761, t3 = 0.237233795, t4 = 0.408282679,
t5 = 0.591717321, t6 = 0.762766205, t7 = 0.898333239, t8 = 0.980144928,
w1 = 0.050614268, w2 = 0.111190517, w3 = 0.156853323, w4 = 0.181341892,
w5 = 0.181341892, w6 = 0.156853323, w7 = 0.111190517, w8 = 0.050614268.
If we denote $x_i = x(t_i)$, $i = 1, 2, \ldots, 8$, equation (17.4.1) is transformed into the following nonlinear system:
$$x_i = 1 + \frac{x_i}{4}\sum_{j=1}^{8}a_{ij}x_j,\qquad i = 1, 2, \ldots, 8,\qquad\text{where}\quad a_{ij} = \frac{t_i w_j}{t_i + t_j}.$$
Denote now $x = (x_1, x_2, \ldots, x_8)^T$, $\mathbf{1} = (1, 1, \ldots, 1)^T$, $A = (a_{ij})$, and write the last nonlinear system in the matrix form
$$x = \mathbf{1} + \frac{1}{4}\,x \circ (Ax), \qquad (17.4.3)$$
where $\circ$ represents the componentwise product. We choose $x_0 = (1, 1, \ldots, 1)^T$ and $x_{-1} = (0, 0, \ldots, 0)^T$, and the sequence $\{x_n\}$ is generated with different choices of $\alpha_n$ and $A = F'(x_0)$. The computational order of convergence (COC), shown in Table 17.4.1 for the various choices, is defined in [12] by
$$\rho \approx \ln\left(\frac{\|x_{n+1} - x^\star\|_\infty}{\|x_n - x^\star\|_\infty}\right)\Big/\ln\left(\frac{\|x_n - x^\star\|_\infty}{\|x_{n-1} - x^\star\|_\infty}\right),\qquad n \in \mathbb{N}.$$

Table 17.4.1. The comparison results of the COC for Example 1 using various αn

n αn = 0 αn = 0.0001 αn = 0.001 αn = 0.01 αn = 0.1 αn = 1


ρ 1.0183391 1.0645848 1.0645952 1.0646989 1.0657398 1.0764689
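A minimal sketch reproducing the setting of this experiment (Python; the quadrature, the matrix of the $a_{ij}$ and the damped iteration with fixed $A = F'(x_0)$ follow the text, while the fixed iteration count is a simplification):

```python
import numpy as np

# Discretized Chandrasekhar system (17.4.3): F(x) = x - 1 - x*(Ax)/4 = 0.
t, w = np.polynomial.legendre.leggauss(8)
t, w = (t + 1) / 2, w / 2                        # map nodes/weights to [0, 1]
Aq = t[:, None] * w[None, :] / (t[:, None] + t[None, :])   # a_ij = t_i w_j / (t_i + t_j)

F = lambda x: x - 1 - x * (Aq @ x) / 4
dF = lambda x: np.eye(8) - (np.diag(Aq @ x) + np.diag(x) @ Aq) / 4

x, alpha = np.ones(8), 0.0001                    # x0 = (1, ..., 1)^T
B = dF(x)                                        # A = F'(x0), fixed
for n in range(50):
    x = x - np.linalg.solve(B, F(x) - alpha * (dF(x) - B) @ F(x))
print(np.linalg.norm(F(x)))                      # linear convergence, cf. Table 17.4.1
```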

Example 17.4.2. In this example, we consider the Singular Broyden problem [13], defined as
$$\begin{aligned}
F_1(x) &= ((3 - hx_1)x_1 - 2x_2 + 1)^2, \\
F_i(x) &= ((3 - hx_i)x_i - x_{i-1} - 2x_{i+1} + 1)^2, \\
F_n(x) &= ((3 - hx_n)x_n - x_{n-1} + 1)^2.
\end{aligned}$$
We take as starting approximation $x_0 = (-1, \ldots, -1)^T$ and $h = 2$. Table 17.4.2 shows the COC, computed as in the previous example.

Table 17.4.2. The comparison results of the COC for Example 2 using various αn

n αn = 0 αn = 0.01 αn = 0.02 αn = 0.03 αn = 0.04 αn = 0.05


ρ 1.7039443 1.7041146 1.7048251 1.7178472 1.5650132 1.6619946
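A sketch of the residual for this problem (the dimension n = 10 is a hypothetical choice):

```python
import numpy as np

# Residual of the Singular Broyden problem with h = 2, following the
# componentwise definition above.
def F(x, h=2.0):
    y = np.empty_like(x)
    y[0] = ((3 - h * x[0]) * x[0] - 2 * x[1] + 1) ** 2
    for i in range(1, len(x) - 1):
        y[i] = ((3 - h * x[i]) * x[i] - x[i - 1] - 2 * x[i + 1] + 1) ** 2
    y[-1] = ((3 - h * x[-1]) * x[-1] - x[-2] + 1) ** 2
    return y

x0 = -np.ones(10)     # starting approximation (-1, ..., -1)^T
print(F(x0))          # each component is squared, so the Jacobian is
                      # singular at any solution, lowering the observed COC
```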

Example 17.4.3. Let $X = Y = \mathbb{R}^2$, $D = U(1,1)$ and $x_0 = (1, 0.5)$. Define the function $F$ on $D$ for $w = (x,y)$ by
$$F(w) = (x^3 - 3xy^2 - 1,\ 3x^2y - y^3). \qquad (17.4.4)$$
Then, the Fréchet derivative of $F$ is given by
$$F'(w) = \begin{pmatrix} 3(x^2 - y^2) & -6xy \\ 6xy & 3(x^2 - y^2) \end{pmatrix}.$$
Moreover, we see in Figure 17.4.1 the number of iterations needed to arrive at the solution with 300 digits, starting at $x_0 = (1, 0.5)$.
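The system (17.4.4) is the real form of $z^3 - 1 = 0$ with $z = x + iy$. A double-precision sketch of the Newton iteration from $x_0$ (the 300-digit runs behind the figure would require multiprecision arithmetic, which we omit):

```python
import numpy as np

# Example 17.4.3: F(x, y) = (x^3 - 3xy^2 - 1, 3x^2 y - y^3).
F = lambda x, y: np.array([x**3 - 3*x*y**2 - 1, 3*x**2*y - y**3])
dF = lambda x, y: np.array([[3*(x**2 - y**2), -6*x*y],
                            [6*x*y, 3*(x**2 - y**2)]])

w = np.array([1.0, 0.5])                   # x0 = (1, 0.5)
for n in range(25):
    w = w - np.linalg.solve(dF(*w), F(*w))
print(w)                                   # converges to a cube root of unity
```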
Example 17.4.4. Let $X = Y = \mathbb{R}^3$, $D = U(0,1)$ and $x_* = (0,0,0)$. Define the function $F$ on $D$ for $w = (x,y,z)$ by
$$F(w) = \left(e^x - 1,\ \frac{e-1}{2}y^2 + y,\ z\right). \qquad (17.4.5)$$
Then, the Fréchet derivative of $F$ is given by
$$F'(w) = \begin{pmatrix} e^x & 0 & 0 \\ 0 & (e-1)y + 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Notice that we have $F(x_*) = 0$ and $F'(x_*) = F'(x_*)^{-1} = \mathrm{diag}\{1,1,1\}$.
Moreover, we see in Figure 17.4.2 the number of iterations needed to arrive at the solution with 300 digits.

Figure 17.4.1. Number of iterations needed.

Figure 17.4.2. Number of iterations needed.


References

[1] Amat, S., Busquier, S., Gutiérrez, J. M., Geometric constructions of iterative functions
to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197-205.

[2] Argyros, I.K., Convergence and applications of Newton–type iterations, Springer–


Verlag Publ., New York, 2008.

[3] Argyros, I. K., Chen, J., Improved results on estimating and extending the radius of
an attraction ball, Appl. Math. Lett., 23 (2010), 404-408.

[4] Argyros, I. K., Chen, J., On local convergence of a Newton-type method in Banach
space, Int. J. Comput. Math., 86 (2009), 1366-1374.

[5] Argyros, I.K., Hilout, S., Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236 (2012), 1892–1902.

[6] Argyros, I.K., S. Hilout, Computational methods in Nonlinear Analysis, World Scien-
tific Publ. Comp. New Jersey, 2013.

[7] Chandrasekhar, S., Radiative transfer, Dover Publ., New York, 1960.

[8] Chen, J., Sun, Q., The convergence ball of Newton-like methods in Banach space and
applications, Taiwanese J. Math., 11 (2007), 383-397.

[9] Chen, J, Li, W., Convergence behaviour of inexact Newton methods under weak Lip-
schitz condition, J. Comput. Appl. Math., 191 (2006), 143-164.

[10] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[11] Herceg, D., Krejić, N., Lužanin, Z., Quasi-Newton’s method with correction, Novi
Sad J. Math., 26 (1996), 115-127.

[12] Grau, M., Noguera, M., Gutiérrez, J. M., On some computational orders of conver-
gence. Appl. Math. Let., 23 (2010), 472-478.

[13] Krejić, N., Lužanin, Z., Newton-like method with modification of the right-hand-side vector, Math. Comp., 71 (2002), 237-250.

[14] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71-84.

[15] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38–62.
Chapter 18

Local Convergence of Inexact


Newton-Like Method under Weak
Lipschitz Conditions

18.1. Introduction
Let X , Y be Banach spaces and D be a non-empty, convex and open subset in X . Let
U(x, r) and U(x, r) stand, respectively, for the open and closed ball in X with center x and
radius r > 0. Denote by L (X , Y ) the space of bounded linear operators from X into Y . In
this chapter, we are concerned with the problem of approximating a solution x? of equation

F(x) = 0 (18.1.1)

where F is a Fréchet continuously differentiable operator defined on D with values in Y .


Many problems from computational sciences and other disciplines can be brought in the
form of equation (18.1.1) using Mathematical Modelling [1, 3, 6, 7, 9, 12]. The solution of
these equations can rarely be found in closed form. That is why the solution methods for
these equations are iterative. In particular, the practice of numerical analysis for finding
such solutions is essentially connected to variants of Newton’s method [1]-[14]. The study
about convergence matter of iterative procedures is usually centered on two types: semilocal
and local convergence analysis. The semilocal convergence matter is, based on the informa-
tion around an initial point, to give criteria ensuring the convergence of iterative procedure;
while the local one is, based on the information around a solution, to find estimates of the
radii of convergence balls. There is a plethora of studies on the weakness and/or extension
of the hypothesis made on the underlying operators; see for example [1]-[14].
Undoubtedly the most popular iterative method for generating a sequence approximating $x^\star$ is Newton's method (NM), which is defined as
$$x_{n+1} = x_n - F'(x_n)^{-1}F(x_n) \quad\text{for each } n = 0, 1, 2, \ldots \qquad (18.1.2)$$

where x0 is an initial point. There are two difficulties with the implementation of (NM).
The first is to evaluate F 0 and the second difficulty is to exactly solve the following Newton

equation
$$F'(x_n)(x_{n+1} - x_n) = -F(x_n) \quad\text{for each } n = 0, 1, 2, \ldots. \qquad (18.1.3)$$
It is well known that evaluating $F'$ and solving equation (18.1.3) exactly may be computationally expensive [1, 5, 6, 8, 12, 13, 14]. That is why the inexact Newton-like method (INLM) has been used [1, 2, 8, 9, 12, 13, 14]:

For $n = 0$ step 1 until convergence do:
find the step $\Delta_n$ which satisfies
$$B_n\Delta_n = -F(x_n) + r_n,\quad\text{where}\quad \frac{\|P_nr_n\|}{\|P_nF(x_n)\|} \le \eta_n, \qquad (18.1.4)$$
and set $x_{n+1} = x_n + \Delta_n$, where $P_n$ is an invertible operator for each $n = 0, 1, 2, \ldots$. Here, $\{r_n\}$


is a null-sequence in the Banach space $Y$. Clearly, the convergence behavior of (INLM) depends on the residual controls of $\{r_n\}$ and the hypotheses on $F'$. In particular, Lipschitz continuity conditions on $F'$ have been used, and residual controls of the form
$$\begin{aligned}
\|r_n\| &\le \eta_n\|F(x_n)\|,\\
\|F'(x^\star)^{-1}r_n\| &\le \eta_n\|F'(x^\star)^{-1}F(x_n)\|,\\
\|F'(x^\star)^{-1}r_n\| &\le \eta_n\|F'(x^\star)^{-1}F(x_n)\|^{1+\theta},\\
\|P_nr_n\| &\le \theta_n\|P_nF(x_n)\|^{1+\theta},
\end{aligned} \qquad (18.1.5)$$

for some θ ∈ [0, 1] and for each n = 0, 1, 2, . . ., have been employed. Here, {ηn }, {θn } are
sequences in [0, 1], {Pn } is a sequence in L (Y , X ) and F 0 (x? )−1 F 0 satisfies a Lipschitz or
Hölder condition on U(x? , r) [1]-[6], [8, 9, 10, 13, 14].
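To fix ideas, the following minimal sketch implements (18.1.4) with $B_n = F'(x_n)$ and $P_n = I$, enforcing the first control in (18.1.5); the inner solver (a Richardson-type iteration on the normal equations) is one simple choice among many:

```python
import numpy as np

def inexact_newton(F, dF, x0, eta=0.1, tol=1e-10, max_iter=50):
    """(18.1.4) with B_n = F'(x_n) and P_n = I: the inner linear solve is
    stopped as soon as the residual r_n = B_n d + F(x_n) satisfies
    ||r_n|| <= eta ||F(x_n)||, the first control in (18.1.5)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        B, d = dF(x), np.zeros_like(x)
        step = 1.0 / np.linalg.norm(B, 2) ** 2       # gradient step length
        while np.linalg.norm(B @ d + Fx) > eta * np.linalg.norm(Fx):
            d = d - step * B.T @ (B @ d + Fx)        # descent on ||B d + F||^2
        x = x + d
    return x
```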
In this chapter, we are motivated by the works of Argyros et al. [1, 2], Chen et al. [5] and Zhang et al. [13], and by optimization considerations. We suppose that $F$ has a continuous Fréchet derivative in $U(x^\star,r)$, $F(x^\star) = 0$, $F'(x^\star)^{-1}$ exists and $F'(x^\star)^{-1}F'$ satisfies the Lipschitz with $L$-average radius condition
$$\|F'(x^\star)^{-1}(F'(x) - F'(x^\tau))\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\,du \qquad (18.1.6)$$
for each $x \in U(x^\star,r)$. Here, $\rho(x) = \|x - x^\star\|$, $x^\tau = x^\star + \tau(x - x^\star)$, $\tau \in [0,1]$ and $L$ is a monotone function on $[0,r]$. Condition (18.1.6) was inaugurated by Wang in [14].
In view of (18.1.6) there exists a monotone function $L_0$ on $[0,r]$ such that the center-Lipschitz with $L_0$-average condition
$$\|F'(x^\star)^{-1}(F'(x) - F'(x^\star))\| \le \int_0^{\rho(x)} L_0(u)\,du \qquad (18.1.7)$$
holds for each $x \in U(x^\star,r)$. Clearly, we have
$$L_0(u) \le L(u) \qquad (18.1.8)$$
for each $u \in [0,r]$, and $L/L_0$ can be arbitrarily large [1, 2, 4] (see also the numerical example at the end of the chapter). It is worth noticing that (18.1.7) is not an additional hypothesis to

(18.1.6), since in practice the use of (18.1.6) requires the computation of (18.1.7).
In the computation of $\|F'(x)^{-1}F'(x^\star)\|$ we use the condition (18.1.7), which is tighter than (18.1.6), together with the Banach lemma on invertible operators [7], to obtain the perturbation bound
$$\|F'(x)^{-1}F'(x^\star)\| \le \left(1 - \int_0^{\rho(x)} L_0(u)\,du\right)^{-1} \quad\text{for each } x \in U(x^\star,r), \qquad (18.1.9)$$
instead of using (18.1.6) to obtain
$$\|F'(x)^{-1}F'(x^\star)\| \le \left(1 - \int_0^{\rho(x)} L(u)\,du\right)^{-1} \quad\text{for each } x \in U(x^\star,r). \qquad (18.1.10)$$

Notice that (18.1.6) and (18.1.10) have been used in [5], [13], [14]. It turns out that using
(18.1.9) instead of (18.1.10), in the case when L0 (u) < L (u) for each u ∈ [0, r], leads to
tighter majorizing sequences for (INLM). This observation in turn leads to the following
advantages over the earlier works (for ηn = 0 for each n = 0, 1, 2, . . . or not and L being a
constant or not):

1. Larger radius of convergence.

2. Tighter error estimates on the distances kxn+1 − xn k, kxn − x? k for each n = 0, 1, 2, . . ..

3. Fewer iterations to achieve a desired error tolerance.

The rest of the chapter is organized as follows. In Section 18.2 we present some auxiliary
results. Section 18.3 contains the local convergence analysis of (INLM). In Section 18.4, we
present special cases. The numerical example appears in Section 18.5 and the conclusion
in Section 18.6.

18.2. Background
In this section we present three auxiliary results. The first two are Banach-type perturbation
lemmas.

Lemma 18.2.1. Suppose that $F$ has a continuous Fréchet derivative in $U(x^\star,r)$, $F'(x^\star)^{-1} \in L(Y,X)$ and $F'(x^\star)^{-1}F'$ satisfies the center-Lipschitz condition with $L_0$-average. Let $r$ satisfy
$$\int_0^r L_0(u)\,du \le 1. \qquad (18.2.1)$$
Then, for each $x \in U(x^\star,r)$, $F'(x)$ is invertible and
$$\|F'(x)^{-1}F'(x^\star)\| \le \frac{1}{1 - \int_0^{\rho(x)} L_0(u)\,du}. \qquad (18.2.2)$$

Proof. Let $x \in U(x^\star,r)$. Using (18.1.7) and (18.2.1) we get in turn that
$$\|F'(x^\star)^{-1}(F'(x) - F'(x^\star))\| \le \int_0^{\rho(x)} L_0(u)\,du < \int_0^r L_0(u)\,du \le 1. \qquad (18.2.3)$$
It follows from (18.2.3) and the Banach lemma on invertible operators [7] that $F'(x)^{-1} \in L(Y,X)$ and (18.2.2) holds. $\square$
Lemma 18.2.2. Suppose that $F$ has a continuous Fréchet derivative in $U(x^\star,r)$, $F'(x^\star)^{-1} \in L(Y,X)$ and $F'(x^\star)^{-1}F'$ satisfies the radius Lipschitz condition with $L$-average and the center-Lipschitz condition with $L_0$-average. Then, we have
$$\|F'(x)^{-1}F(x)\| \le \rho(x) + \frac{\int_0^{\rho(x)} L(u)u\,du - \int_0^{\rho(x)} (L(u) - L_0(u))\rho(x)\,du}{1 - \int_0^{\rho(x)} L_0(u)\,du} \qquad (18.2.4)$$
$$\le \rho(x) + \frac{\int_0^{\rho(x)} L(u)u\,du}{1 - \int_0^{\rho(x)} L_0(u)\,du}. \qquad (18.2.5)$$
If $F'(x^\star)^{-1}F'$ satisfies the center-Lipschitz condition, then we have
$$\|F'(y)^{-1}F(x)\| \le \frac{\rho(x) + \int_0^{\rho(x)} L_0(u)(\rho(x) - u)\,du}{1 - \int_0^{\rho(y)} L_0(u)\,du}. \qquad (18.2.6)$$
Proof. Let $x \in U(x^\star,r)$. We have that
$$\|F'(x)^{-1}F(x)\| \le \|F'(x)^{-1}F'(x^\star)\|\,\|F'(x^\star)^{-1}F(x)\|. \qquad (18.2.7)$$
But in view of (18.2.2) and the estimate
$$\|F'(x^\star)^{-1}F(x)\| \le \rho(x) + \int_0^{\rho(x)} L(u)(u - \rho(x))\,du, \qquad (18.2.8)$$
shown in [5, Lemma 2.1, 1.3], we obtain that
$$\|F'(x)^{-1}F(x)\| \le \frac{\rho(x) + \int_0^{\rho(x)} L(u)(u - \rho(x))\,du}{1 - \int_0^{\rho(x)} L_0(u)\,du},$$
which implies (18.2.4); since $L_0(u) \le L(u)$, (18.2.4) implies (18.2.5). Estimate (18.2.6) is shown in [5, Lemma 2.2, 1.3]. $\square$
Remark 18.2.3. If $L_0 = L$, then our two preceding results reduce to the corresponding ones in [5, 13]. Otherwise, i.e., if strict inequality holds in (18.1.8), then our estimates are more precise, since
$$\frac{1}{1 - \int_0^{\rho(x)} L_0(u)\,du} < \frac{1}{1 - \int_0^{\rho(x)} L(u)\,du} \qquad (18.2.9)$$
and
$$\rho(x) + \frac{\int_0^{\rho(x)} L(u)u\,du}{1 - \int_0^{\rho(x)} L_0(u)\,du} < \rho(x) + \frac{\int_0^{\rho(x)} L(u)u\,du}{1 - \int_0^{\rho(x)} L(u)\,du}. \qquad (18.2.10)$$

Notice that the right-hand sides of (18.2.9) and (18.2.10) are the upper bounds on the norms $\|F'(x)^{-1}F'(x^\star)\|$ and $\|F'(x)^{-1}F(x)\|$, respectively, obtained in the corresponding lemmas in [5, 13]. It turns out that, in view of the estimates (18.2.9) and (18.2.10), our approach obtains the advantages over the corresponding results in [5, 13, 14] already mentioned in the introduction of this chapter.

Next, we present another auxiliary result due to Wang [14, Lemma 2.2].

Lemma 18.2.4. Suppose that the function $L_\alpha$ defined by
$$L_\alpha(t) := t^{1-\alpha}L(t) \qquad (18.2.11)$$
is nondecreasing for some $\alpha \in [0,1]$, where $L$ is a positive integrable function. Then, for each $\beta \ge 0$, the function $\varphi_{\beta,\alpha}$ defined by
$$\varphi_{\beta,\alpha}(t) = \frac{1}{t^{\alpha+\beta}}\int_0^t u^\beta L(u)\,du \qquad (18.2.12)$$
is also nondecreasing.

18.3. Local Convergence


In this section we present the local convergence of the inexact Newton method using (18.1.6) and (18.1.7). We shall first consider the case $B_n = F'(x_n)$ for each $n = 0, 1, 2, \ldots$.

Theorem 18.3.1. Suppose $x^\star$ satisfies (18.1.1), $F$ has a continuous Fréchet derivative in $U(x^\star,r)$, $F'(x^\star)^{-1}$ exists and $F'(x^\star)^{-1}F'$ satisfies the radius Lipschitz condition (18.1.6) and the center-Lipschitz condition (18.1.7). Assume $B_n = F'(x_n)$ for each $n$ in (18.1.4), and $v_n = \theta_n\|(P_nF'(x_n))^{-1}\|\,\|P_nF'(x_n)\| = \theta_n\,\mathrm{Cond}(P_nF'(x_n))$ with $v_n \le v < 1$. Let $r > 0$ satisfy
$$\int_0^r L_0(u)\,du \le \frac{1-v}{2}. \qquad (18.3.1)$$
Then (INLM) (for $B_n = F'(x_n)$) is convergent for all $x_0 \in U(x^\star,r)$ and
$$\|x_{n+1} - x^\star\| \le \left((1+v)\,\frac{\int_0^{\rho(x_0)} L(u)\,du}{1 - \int_0^{\rho(x_0)} L_0(u)\,du} + v\right)\|x_n - x^\star\|, \qquad (18.3.2)$$
where
$$q = (1+v)\,\frac{\int_0^{\rho(x_0)} L(u)\,du}{1 - \int_0^{\rho(x_0)} L_0(u)\,du} + v \qquad (18.3.3)$$
is less than 1. Further, suppose that the function $L_\alpha$ defined in (18.2.11) is nondecreasing for some $\alpha$ with $0 < \alpha \le 1$. Let $\tilde r$ satisfy
$$(1+v)\,\frac{\int_0^{\tilde r} L(u)u\,du}{\tilde r\left(1 - \int_0^{\tilde r} L_0(u)\,du\right)} + v \le 1. \qquad (18.3.4)$$
Then (INLM) (for $B_n = F'(x_n)$) is convergent for all $x_0 \in U(x^\star,\tilde r)$ and
$$\|x_{n+1} - x^\star\| \le \left((1+v)\,\frac{\int_0^{\rho(x_0)} L(u)u\,du}{\rho(x_0)^{1+\alpha}\left(1 - \int_0^{\rho(x_0)} L_0(u)\,du\right)}\,\rho(x_n)^\alpha + v\right)\|x_n - x^\star\|, \qquad (18.3.5)$$
where
$$\tilde q = (1+v)\,\frac{\int_0^{\rho(x_0)} L(u)u\,du}{\rho(x_0)\left(1 - \int_0^{\rho(x_0)} L_0(u)\,du\right)} + v \qquad (18.3.6)$$
is less than 1.

Proof. Let $x_0 \in U(x^\star,r)$, where $r$ satisfies (18.3.1); then $q$ given by (18.3.3) is such that $q \in (0,1)$. Indeed, by the positivity of $L$, we have
$$q = (1+v)\,\frac{\int_0^{\rho(x_0)} L(u)\,du}{1 - \int_0^{\rho(x_0)} L_0(u)\,du} + v < (1+v)\,\frac{\int_0^r L(u)\,du}{1 - \int_0^r L_0(u)\,du} + v \le 1.$$
Suppose that $x_n \in U(x^\star,r)$ (notice that $x_0 \in U(x^\star,r)$). We have by (18.1.4)
$$x_{n+1} - x^\star = x_n - x^\star - F'(x_n)^{-1}(F(x_n) - F(x^\star)) + F'(x_n)^{-1}r_n = F'(x_n)^{-1}F'(x^\star)\int_0^1 F'(x^\star)^{-1}(F'(x_n) - F'(x^\theta))(x_n - x^\star)\,d\theta + F'(x_n)^{-1}P_n^{-1}P_nr_n,$$

where $x^\theta = x^\star + \theta(x_n - x^\star)$. It follows, by Lemmas 18.2.1 and 18.2.2 and the conditions (18.1.6) and (18.1.7), that we can obtain in turn
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le \|F'(x_n)^{-1}F'(x^\star)\|\int_0^1\|F'(x^\star)^{-1}(F'(x_n) - F'(x^\theta))\|\,\|x_n - x^\star\|\,d\theta + \theta_n\|F'(x_n)^{-1}P_n^{-1}\|\,\|P_nF(x_n)\| \\
&\le \frac{1}{1 - \int_0^{\rho(x_n)} L_0(u)\,du}\int_0^1\int_{\theta\rho(x_n)}^{\rho(x_n)} L(u)\,du\,\rho(x_n)\,d\theta + \theta_n\|(P_nF'(x_n))^{-1}\|\,\|P_nF'(x_n)\|\,\|F'(x_n)^{-1}F(x_n)\| \\
&\le \frac{\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + \theta_n\,\mathrm{Cond}(P_nF'(x_n))\left(\|x_n - x^\star\| + \frac{\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du}\right) \\
&\le (1+v_n)\,\frac{\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + v_n\,\rho(x_n) \\
&\le \left((1+v_n)\,\frac{\int_0^{\rho(x_n)} L(u)\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + v_n\right)\rho(x_n). \qquad (18.3.7)
\end{aligned}$$

In particular, for $n = 0$ in (18.3.7), we obtain $\|x_1 - x^\star\| \le q\|x_0 - x^\star\|$. Hence $x_1 \in U(x^\star,r)$; this shows that (INLM) can be continued an infinite number of times. By mathematical induction, all $x_n \in U(x^\star,r)$ and $\rho(x_n) = \|x_n - x^\star\|$ decreases monotonically. Consequently, we have for each $n = 0, 1, 2, \ldots$
$$\|x_{n+1} - x^\star\| \le \left((1+v_n)\,\frac{\int_0^{\rho(x_n)} L(u)\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + v_n\right)\|x_n - x^\star\| \le \left((1+v)\,\frac{\int_0^{\rho(x_0)} L(u)\,du}{1 - \int_0^{\rho(x_0)} L_0(u)\,du} + v\right)\|x_n - x^\star\|.$$

Hence, we showed (18.3.2). Moreover, if $\tilde r$ satisfies (18.3.4) and $L_\alpha$ defined by (18.2.11) is nondecreasing for some $\alpha$ with $0 < \alpha \le 1$, then we get
$$\tilde q = (1+v)\,\frac{\int_0^{\rho(x_0)} L(u)u\,du}{\rho(x_0)^{1+\alpha}\left(1 - \int_0^{\rho(x_0)} L_0(u)\,du\right)}\,\rho(x_0)^\alpha + v < (1+v)\,\frac{\int_0^{\tilde r} L(u)u\,du}{\tilde r^{1+\alpha}\left(1 - \int_0^{\tilde r} L_0(u)\,du\right)}\,\tilde r^\alpha + v \le 1.$$
For $n = 0$ in (18.3.7), we get $\|x_1 - x^\star\| \le \tilde q\|x_0 - x^\star\| < \|x_0 - x^\star\|$. Hence $x_1 \in U(x^\star,\tilde r)$; that is, (INLM) can be continued an infinite number of times. It follows by mathematical induction that all $x_n$ belong to $U(x^\star,\tilde r)$ and $\rho(x_n) = \|x_n - x^\star\|$ decreases monotonically. Therefore, for all $n \ge 0$, from (18.3.7) and Lemma 18.2.4 we get in turn that
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le (1+v_n)\,\frac{\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + v_n\,\rho(x_n) = (1+v_n)\,\frac{\varphi_{1,\alpha}(\rho(x_n))}{1 - \int_0^{\rho(x_n)} L_0(u)\,du}\,\rho(x_n)^{1+\alpha} + v_n\,\rho(x_n) \\
&\le (1+v_n)\,\frac{\varphi_{1,\alpha}(\rho(x_0))}{1 - \int_0^{\rho(x_0)} L_0(u)\,du}\,\rho(x_n)^{1+\alpha} + v_n\,\rho(x_n) \le (1+v)\,\frac{\varphi_{1,\alpha}(\rho(x_0))}{1 - \int_0^{\rho(x_0)} L_0(u)\,du}\,\rho(x_n)^{1+\alpha} + v\,\rho(x_n).\ \square
\end{aligned}$$


Remark 18.3.2. If $L_0 = L$, our Theorem 18.3.1 reduces to Theorem 3.1 of [13] (see also [5]). Otherwise, i.e., if $L_0 < L$, our Theorem 18.3.1 constitutes an improvement. In particular, for $v = 0$, the estimates for the radii of the convergence ball for Newton's method are given by
$$\int_0^r L_0(u)\,du \le \frac{1}{2}$$
and
$$\frac{1}{\tilde r}\int_0^{\tilde r}\left(L_0(u)\tilde r + L(u)u\right)du \le 1,$$

which reduce to the ones in [14] if L0 = L . Then, we can conclude that vanishing residual,
Theorem 18.3.1 merges into the theory of the Newton method. Besides, if the function Lα
defined by (18.2.11) is nondecreasing for α = 1, we improve the result in [5].
Next, we present a result analogous to Theorem 18.3.1 can also be proven for inexact
Newton-like method, where Bn = B(xn ) approximates F 0 (xn ).
Theorem 18.3.3. Suppose x? satisfies (18.1.1), F has a continuous derivative in U(x? , r),
F 0 (x? )−1 exists and F 0 (x? )F 0 satisfies the radius Lipschitz condition (18.1.6) and the
center Lipschitz condition (18.1.7). Let B(x) be an approximation to the F 0 (x) for all
x ∈ U(x? , r), B(x) is invertible and
kB(x)−1F 0 (x)k ≤ ω1 , kB(x)−1 F 0 (x) − Ik ≤ ω2 , (18.3.8)
where vn = θn k(Pn F 0 (xn ))−1 kkPn F 0 (xn )k = θnCond(Pn F 0 (xn ))k with vn ≤ v < 1. Let r > 0
satisfy Z r
1 − ω2 − ω1 v
L0(u)du < . (18.3.9)
0 1 + ω1 − ω2
Then the (INLM) method is convergent for all x0 ∈ U(x? , r) and
R ρ(x ) !
? ω1 0 0 L (u)du
kxn+1 − x k ≤ (1 + v) R ρ(x0 ) + ω2 + ω1 v kxn − x? k, (18.3.10)
1− 0 L0 (u)du
where R ρ(x0 )
ω1 L (u)du
0
q = (1 + v) R ρ(x0 ) + ω2 + ω1 v (18.3.11)
1− 0 L0(u)du
is less than 1. Further, suppose that the function Lα defined by (18.2.11) is nondecreasing
for some α with 0 < α ≤ 1. Ler r̃ satisfy
ω1 0r̃ L (u)du
R
(1 + v) + ω2 + ω1 v ≤ 1. (18.3.12)
1 − 0r̃ L0 (u)du
R

Then (INLM) is convergent for all x0 ∈ U(x? , r̃) and


R ρ(x0 )
? ω1 L (u)du
kxn+1 − x k ≤ (1 + v) 0
R ρ(x0 )ρ(xn )1+α + (ω2 + ω1 v)ρ(xn ),
1+α
ρ(x0 ) (1 − 0 L0(u)du)
(18.3.13)
where R ρ(x0 )
ω1 0 L (u)du
q̃ = (1 + v) R ρ(x0 ) + ω2 + ω1 v (18.3.14)
1− 0 L0(u)du
is less than 1.
Proof. Let $x_0 \in U(x^\star,r)$, where $r$ satisfies (18.3.9); then $q$ given by (18.3.11) is such that $q \in (0,1)$. Indeed, by the positivity of $L$, we have
$$q = (1+v)\,\frac{\omega_1\int_0^{\rho(x_0)} L(u)\,du}{1 - \int_0^{\rho(x_0)} L_0(u)\,du} + \omega_2 + \omega_1 v < (1+v)\,\frac{\omega_1\int_0^r L(u)\,du}{1 - \int_0^r L_0(u)\,du} + \omega_2 + \omega_1 v \le 1.$$

Moreover, if $x_n \in U(x^\star,r)$, we have by (18.1.4) in turn that
$$\begin{aligned}
x_{n+1} - x^\star &= x_n - x^\star - B_n^{-1}(F(x_n) - F(x^\star)) + B_n^{-1}r_n \\
&= x_n - x^\star - B_n^{-1}\int_0^1 F'(x^\theta)\,d\theta\,(x_n - x^\star) + B_n^{-1}P_n^{-1}P_nr_n \\
&= -B_n^{-1}F'(x_n)\int_0^1 F'(x_n)^{-1}F'(x^\star)\,F'(x^\star)^{-1}\left(F'(x^\theta) - F'(x_n)\right)(x_n - x^\star)\,d\theta \\
&\quad + \left(I - B_n^{-1}F'(x_n)\right)(x_n - x^\star) + B_n^{-1}P_n^{-1}P_nr_n,
\end{aligned}$$
where $x^\theta = x^\star + \theta(x_n - x^\star)$. Using Lemmas 18.2.1 and 18.2.2 and condition (18.3.8), we obtain
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le \|B_n^{-1}F'(x_n)\|\int_0^1\|F'(x_n)^{-1}F'(x^\star)\|\,\|F'(x^\star)^{-1}(F'(x^\theta) - F'(x_n))\|\,\|x_n - x^\star\|\,d\theta \\
&\quad + \|I - B_n^{-1}F'(x_n)\|\,\|x_n - x^\star\| + \theta_n\|B_n^{-1}P_n^{-1}\|\,\|P_nF(x_n)\| \\
&\le \frac{\omega_1}{1 - \int_0^{\rho(x_n)} L_0(u)\,du}\int_0^1\int_{\theta\rho(x_n)}^{\rho(x_n)} L(u)\,du\,\rho(x_n)\,d\theta + \omega_2\,\rho(x_n) \\
&\quad + \theta_n\|B_n^{-1}F'(x_n)\|\,\|(P_nF'(x_n))^{-1}\|\,\|P_nF'(x_n)\|\,\|F'(x_n)^{-1}F(x_n)\| \\
&\le \frac{\omega_1\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + \omega_2\,\rho(x_n) + \omega_1 v_n\left(\rho(x_n) + \frac{\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du}\right) \\
&\le (1+v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + (\omega_2 + \omega_1 v_n)\rho(x_n) \qquad (18.3.15) \\
&\le \left((1+v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + \omega_2 + \omega_1 v_n\right)\rho(x_n).
\end{aligned}$$
For $n = 0$ in (18.3.15), we obtain $\|x_1 - x^\star\| \le q\|x_0 - x^\star\| < \|x_0 - x^\star\|$. Hence $x_1 \in U(x^\star,r)$; this shows that the iteration can be continued an infinite number of times. By mathematical induction, $x_n \in U(x^\star,r)$ and $\rho(x_n) = \|x_n - x^\star\|$ decreases monotonically. Therefore, for all $n \ge 0$, we have in turn that
$$\|x_{n+1} - x^\star\| \le \left((1+v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + \omega_2 + \omega_1 v_n\right)\rho(x_n) \le \left((1+v)\,\frac{\omega_1\int_0^{\rho(x_0)} L(u)u\,du}{1 - \int_0^{\rho(x_0)} L_0(u)\,du} + \omega_2 + \omega_1 v\right)\rho(x_n),$$

which implies (18.3.10). Furthermore, if $\tilde r$ satisfies (18.3.12) and $L_\alpha$ defined by (18.2.11) is nondecreasing for some $\alpha$ with $0 < \alpha \le 1$, then we get
$$\tilde q = (1+v)\,\frac{\omega_1\int_0^{\rho(x_0)} L(u)u\,du}{\rho(x_0)^{1+\alpha}\left(1 - \int_0^{\rho(x_0)} L_0(u)\,du\right)}\,\rho(x_0)^\alpha + \omega_2 + \omega_1 v < (1+v)\,\frac{\omega_1\int_0^{\tilde r} L(u)u\,du}{\tilde r^{1+\alpha}\left(1 - \int_0^{\tilde r} L_0(u)\,du\right)}\,\tilde r^\alpha + \omega_2 + \omega_1 v \le 1.$$

For $n = 0$ in (18.3.15), we obtain $\|x_1 - x^\star\| \le \tilde q\|x_0 - x^\star\| < \|x_0 - x^\star\|$. Hence $x_1 \in U(x^\star,\tilde r)$; this shows that (18.1.4) can be continued an infinite number of times. By mathematical induction, $x_n \in U(x^\star,\tilde r)$ and $\rho(x_n) = \|x_n - x^\star\|$ decreases monotonically. Therefore, for all $n \ge 0$, we have
$$\begin{aligned}
\|x_{n+1} - x^\star\| &\le (1+v_n)\,\frac{\omega_1\int_0^{\rho(x_n)} L(u)u\,du}{1 - \int_0^{\rho(x_n)} L_0(u)\,du} + (\omega_2 + \omega_1 v_n)\rho(x_n) \\
&\le (1+v)\,\frac{\omega_1\,\varphi_{1,\alpha}(\rho(x_n))}{1 - \int_0^{\rho(x_0)} L_0(u)\,du}\,\rho(x_n)^{1+\alpha} + (\omega_2 + \omega_1 v)\rho(x_n) \\
&\le (1+v)\,\frac{\omega_1\,\varphi_{1,\alpha}(\rho(x_0))}{1 - \int_0^{\rho(x_0)} L_0(u)\,du}\,\rho(x_n)^{1+\alpha} + (\omega_2 + \omega_1 v)\rho(x_n).\ \square
\end{aligned}$$

Remark 18.3.4. If $L_0 = L$, our Theorem 18.3.3 reduces to Theorem 18.3.2 in [13] (see also [5]). Otherwise, i.e., if $L_0 < L$, our Theorem 18.3.3 constitutes an improvement. If, in Theorem 18.3.2, the function $L_\alpha$ defined by (18.2.11) is nondecreasing for $\alpha = 1$, we improve the result of [5]. In particular, for $v = 0$, we obtain the radii of the convergence ball for the Newton-like method [14].
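To make the role of the forcing terms $\theta_n$ concrete, the following sketch (our own illustration, not code from [13]; all names and tolerances are ours) runs an inexact Newton iteration in which the inner linear solve is deliberately perturbed so that its residual $r_n$ satisfies $\|r_n\| = \theta\|F(x_n)\|$, and applies it to the function of Example 18.5.1 below:

```python
import numpy as np

def inexact_newton(F, J, x0, theta=0.1, tol=1e-10, max_iter=50):
    """Inexact Newton sketch: the step s solves J(x) s = -F(x) + r,
    where the residual r obeys the forcing bound ||r|| = theta * ||F(x)||."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        r = np.random.randn(*x.shape)                         # random residual ...
        r *= theta * np.linalg.norm(Fx) / np.linalg.norm(r)   # ... scaled to theta*||F||
        s = np.linalg.solve(J(x), -Fx + r)                    # inexact inner solve
        x = x + s
    return x

# F(w) = (e^x - 1, ((e-1)/2) y^2 + y, z) with solution x* = (0, 0, 0).
e = np.e
F = lambda w: np.array([np.exp(w[0]) - 1, (e - 1) / 2 * w[1]**2 + w[1], w[2]])
J = lambda w: np.diag([np.exp(w[0]), (e - 1) * w[1] + 1, 1.0])
print(inexact_newton(F, J, [0.5, 0.5, 0.5]))   # approaches (0, 0, 0) linearly
```

With $\theta = 0.1$ the contraction factor stays well below 1 near $x^\star$, and the iterates converge linearly, as (18.3.10) predicts.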

18.4. Special Cases


In this section, we consider the following special cases of Theorem 18.3.1 and Theorem
18.3.3:
Corollary 18.4.1. Suppose $x^\star$ satisfies (18.1.1), $F$ has a continuous derivative in $U(x^\star,r)$, $F'(x^\star)^{-1}$ exists, and $F'(x^\star)^{-1}F'$ satisfies the radius Lipschitz condition with $L(u)=c\alpha u^{\alpha-1}$:
$$\|F'(x^\star)^{-1}(F'(x)-F'(x_\theta))\| \le c\,(1-\theta^\alpha)\,\|x-x^\star\|^\alpha \eqno(18.4.1)$$
for each $x\in U(x^\star,r)$ and $0\le\theta\le1$, where $x_\theta=x^\star+\theta(x-x^\star)$, and the center Lipschitz condition with $L_0(u)=c_0\alpha u^{\alpha-1}$:
$$\|F'(x^\star)^{-1}(F'(x)-F'(x^\star))\| \le c_0\,\|x-x^\star\|^\alpha \eqno(18.4.2)$$
for each $x\in U(x^\star,r)$, for some $c_0\le c$. Assume $B_n=F'(x_n)$ for each $n$ in (18.1.3), and $v_n=\theta_n\|(P_nF'(x_n))^{-1}\|\,\|P_nF'(x_n)\|=\theta_n\,\mathrm{Cond}(P_nF'(x_n))$ with $v_n\le v<1$. Let $\tilde r>0$ satisfy
$$\tilde r=\left(\frac{(1-v)(1+\alpha)}{c(1+v)\alpha+c_0(1-v)(1+\alpha)}\right)^{1/\alpha}.$$
Then the inexact Newton method is convergent for all $x_0\in U(x^\star,\tilde r)$ and
$$\|x_{n+1}-x^\star\| \le \left(\frac{c\,\alpha(1+v)}{(1+\alpha)\big(1-c_0\|x_0-x^\star\|^\alpha\big)}\,\|x_0-x^\star\|^\alpha+v\right)\|x_n-x^\star\|,$$
where
$$q=\frac{c\,\alpha(1+v)}{(1+\alpha)\big(1-c_0\|x_0-x^\star\|^\alpha\big)}\,\|x_0-x^\star\|^\alpha+v$$
is less than 1.

Corollary 18.4.2. Suppose $x^\star$ satisfies (18.1.1), $F$ has a continuous derivative in $U(x^\star,r)$, $F'(x^\star)^{-1}$ exists, and $F'(x^\star)^{-1}F'$ satisfies the radius Lipschitz condition (18.4.1) and the center Lipschitz condition (18.4.2). Let $B(x)$ be an approximation of $F'(x)$ for all $x\in U(x^\star,r)$ such that $B(x)$ is invertible and satisfies condition (18.3.8), and let $v_n=\theta_n\|(P_nF'(x_n))^{-1}\|\,\|P_nF'(x_n)\|=\theta_n\,\mathrm{Cond}(P_nF'(x_n))$ with $v_n\le v<1$. Let $\tilde r>0$ satisfy
$$\tilde r=\left(\frac{(1+\alpha)(1-\omega_2-\omega_1 v)}{c(1+v)\omega_1\alpha+c_0(1+\alpha)(1-\omega_2-\omega_1 v)}\right)^{1/\alpha}.$$
Then the inexact Newton method is convergent for all $x_0\in U(x^\star,\tilde r)$ and
$$\|x_{n+1}-x^\star\| \le \left(\frac{c\,\alpha(1+v)\,\omega_1}{(1+\alpha)\big(1-c_0\|x_0-x^\star\|^\alpha\big)}\,\|x_0-x^\star\|^\alpha+\omega_2+\omega_1 v\right)\|x_n-x^\star\|,$$
where
$$q=\frac{c\,\alpha(1+v)\,\omega_1}{(1+\alpha)\big(1-c_0\|x_0-x^\star\|^\alpha\big)}\,\|x_0-x^\star\|^\alpha+\omega_2+\omega_1 v$$
is less than 1.
Remark 18.4.3. (a) If $v = 0$ in Corollary 18.4.1, the estimate for the radius of the convergence ball for Newton's method is given by
$$\tilde r=\left(\frac{1+\alpha}{c\,\alpha+c_0(1+\alpha)}\right)^{1/\alpha},$$
which improves the result in [5, 13] for $c_0<c$. Moreover, if $\alpha=1$, our radius reduces to $\tilde r=\dfrac{2}{2c_0+c}$, which is larger than the radius $\tilde r=\dfrac{2}{3c}$ obtained by Rheinboldt and Traub [11, 12] if $c_0<c$ (see also the numerical examples at the end of the chapter).

(b) The results in Section 18.5 of [5, 13] using only the center-Lipschitz condition can be improved, if rewritten using $L_0$ instead of $L$.

18.5. Examples

Finally, we provide an example where $L_0 < L$.

Example 18.5.1. Let $X = Y = \mathbb{R}^3$, $D = U(0,1)$ and $x^\star = (0,0,0)$. Define the function $F$ on $D$ for $w = (x,y,z)$ by
$$F(w)=\left(e^x-1,\ \frac{e-1}{2}\,y^2+y,\ z\right). \eqno(18.5.1)$$
Then, the Fréchet derivative of $F$ is given by
$$F'(w)=\begin{pmatrix} e^x & 0 & 0\\ 0 & (e-1)\,y+1 & 0\\ 0 & 0 & 1\end{pmatrix}.$$
Notice that we have $F(x^\star)=0$, $F'(x^\star)=F'(x^\star)^{-1}=\mathrm{diag}\{1,1,1\}$ and $L_0=e-1<L=e$. More examples where $L_0<L$ can be found in [1, 2, 3].
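For these constants, the radii of Remark 18.4.3(a) are immediate to compare; the following sketch (our own, taking $c_0 = L_0 = e-1$ and $c = L = e$ with $\alpha = 1$, as suggested by the example) shows that the new convergence ball is strictly larger:

```python
import math

c0, c = math.e - 1, math.e     # center and radius Lipschitz constants (Example 18.5.1)
r_new = 2 / (2 * c0 + c)       # radius from Remark 18.4.3(a) with alpha = 1
r_old = 2 / (3 * c)            # Rheinboldt/Traub radius
print(r_new, r_old)            # 0.3249... versus 0.2452...
```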

18.6. Conclusion

Under the hypothesis that $F'(x^\star)^{-1}F'$ satisfies the center Lipschitz condition (18.1.7) and the radius Lipschitz condition (18.1.6), we presented a more precise local convergence analysis for the inexact Newton method under the same computational cost as in earlier studies such as Chen and Li [5] and Zhang, Li and Xie [13]. Numerical examples are provided to show that the center Lipschitz function can be smaller than the radius Lipschitz function.
References

[1] Argyros, I.K., Cho, Y.J, Hilout, S., Numerical methods for equations and its applica-
tions, CRC Press, Taylor and Francis, New York, 2012.

[2] Argyros, I.K., Hilout, S., Weak convergence conditions for inexact Newton-type meth-
ods, App. Math. Comp., 218 (2011), 2800-2809.

[3] Argyros, I.K., Hilout, S., On the semilocal convergence of a Damped Newton’s
method, Appl.Math. Comput., 219, 5(2012), 2808-2824.

[4] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method,
J. Complexity, 28 (2012), 364-387.

[5] Chen, J., Li, W., Convergence behaviour of inexact Newton methods under weaker
Lipschitz condition, J. Comput. Appl. Math., 191 (2006), 143-164.

[6] Dembo, R.S., Eisenstat, S.C., Steihaug, T., Inexact Newton methods, SIAM J. Numer.
Anal., 19 (1982), 400-408.

[7] Kantorovich, L.V., Akilov, G.P., Functional Analysis, 2nd ed., Pergamon Press, Ox-
ford, 1982.

[8] Morini, B., Convergence behaviour of inexact Newton methods, Math. Comp., 68
(1999), 1605-1613.

[9] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.

[10] Proinov, P.D., General local convergence theory for a class of iterative processes and
its applications to Newton’s method, J. Complexity, 25 (2009), 38- 62.

[11] Rheinboldt, W.C., An adaptive continuation process for solving systems of nonlinear
equations, Polish Academy of Science, Banach Ctr. Publ., 3 (1977), 129-142.

[12] Traub, J.F., Iterative methods for the solution of equations, Prentice- Hall Series in
Automatic Computation, Englewood Cliffs, N. J., 1964.

[13] Zhang, H., Li, W., Xie, L., Convergence of inexact Newton methods of nonlinear
operator equations in Banach spaces.
[14] Wang, X., Convergence of Newton’s method and uniqueness of the solution of equa-
tions in Banach space, IMA J. Numer. Anal., 20 (2000), 123-134.
Chapter 19

Expanding the Applicability of


Secant Method with Applications

19.1. Introduction
In this chapter we are concerned with the problem of approximating a locally unique solu-
tion x? of equation
F(x) = 0, (19.1.1)
where F is a Fréchet–differentiable operator defined on a convex subset D of a Banach
space X with values in a Banach space Y .
A vast number of problems from Applied Science, including engineering, can be solved
by means of finding the solutions of equations in the form (19.1.1) using mathematical
modelling [7, 10, 15, 18]. For example, dynamic systems are mathematically modeled by
difference or differential equations, and their solutions usually represent the states of the
systems. Except in special cases, the solutions of these equations cannot be found in closed
form. This is the main reason why the most commonly used solution methods are iterative.
Iteration methods are also applied for solving optimization problems. In such cases, the it-
eration sequences converge to an optimal solution of the problem at hand. Since all of these
methods have the same recursive structure, they can be introduced and discussed in a gen-
eral framework. The convergence analysis of iterative methods is usually divided into two
categories: semilocal and local convergence analysis. In the semilocal convergence analy-
sis one derives convergence criteria from the information around an initial point whereas in
the local analysis one finds estimates of the radii of convergence balls from the information
around a solution.
We consider the Secant method in the form
$$x_{n+1}=x_n-\delta F(x_{n-1},x_n)^{-1}F(x_n)\quad(n\ge0),\qquad(x_{-1},x_0\in \mathcal{D}), \eqno(19.1.2)$$
where $\delta F(x,y)\in L(X,Y)$ $(x,y\in\mathcal{D})$, the space of bounded linear operators from $X$ into $Y$, is a divided difference of order one, i.e., an approximation of the Fréchet–derivative of $F$ [15, 18].
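For scalar equations the divided difference reduces to $\delta F(x,y)=(F(x)-F(y))/(x-y)$, and (19.1.2) takes the familiar form sketched below (our own minimal illustration; names are ours):

```python
def secant(F, x_prev, x0, tol=1e-12, max_iter=50):
    """Secant method (19.1.2) on the real line:
    x_{n+1} = x_n - F(x_n) / dF(x_{n-1}, x_n),
    with dF(x, y) = (F(x) - F(y)) / (x - y)."""
    for _ in range(max_iter):
        dF = (F(x_prev) - F(x0)) / (x_prev - x0)   # divided difference of order one
        x_prev, x0 = x0, x0 - F(x0) / dF
        if abs(x0 - x_prev) < tol:
            break
    return x0

# Example 19.3.2 with k = 0.99: F(x) = x^3 - 0.99, root 0.99**(1/3) = 0.99665...
print(secant(lambda x: x**3 - 0.99, 1.2, 1.0))
```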
The semilocal convergence matter is, based on the information around an initial point,
to give criteria ensuring the convergence of iteration procedures. A very important problem
in the study of iterative procedures is the convergence domain. In general the convergence
domain is small. Therefore, it is important to enlarge the convergence domain without additional hypotheses. Another important problem is to find more precise error estimates
on the distances kxn+1 − xn k, kxn − x? k. These are our objectives in this chapter.
The Secant method, also known under the name of Regula Falsi or the method of chords, is one of the most used iterative procedures for solving nonlinear equations. According to A. M. Ostrowski [19], this method is known from the time of the early Italian algebraists. In the case of equations defined on the real line, the Secant method is better than Newton's method from the point of view of the efficiency index [7]. The Secant method was extended to the solution of nonlinear equations in Banach spaces by A. S. Sergeev [24] and J. W. Schmidt [23].
The simplified Secant method
$$x_{n+1}=x_n-\delta F(x_{-1},x_0)^{-1}F(x_n)\quad(n\ge0),\qquad(x_{-1},x_0\in \mathcal{D})$$
was first studied by S. Ulm [25]. The first semilocal convergence analysis was given by P. Laasonen [16]. His results were improved by F. A. Potra and V. Pták [20, 21, 22]. A semilocal convergence analysis for general secant-type methods was given by J. E. Dennis [10]. Bosarge and Falb [9], Dennis [10], Potra [20, 21, 22], Argyros [5, 6, 7, 8], Hernández et al. [13, 14] and others [18], [26] have provided sufficient convergence conditions for the Secant method based on Lipschitz–type conditions on $\delta F$. Moreover, there exist new graphical tools to study this kind of method [17].

The conditions usually associated with the semilocal convergence of the Secant method (19.1.2) are:

• $F$ is a nonlinear operator defined on a convex subset $\mathcal{D}$ of a Banach space $X$ with values in a Banach space $Y$;

• $x_{-1}$ and $x_0$ are two points belonging to the interior $\mathcal{D}^0$ of $\mathcal{D}$ and satisfying the inequality
$$\|x_0-x_{-1}\|\le c;$$

• $F$ is Fréchet–differentiable on $\mathcal{D}^0$, and there exists an operator $\delta F:\mathcal{D}^0\times\mathcal{D}^0\to L(X,Y)$ such that the linear operator $A=\delta F(x_{-1},x_0)$ is invertible, its inverse $A^{-1}$ is bounded, and
$$\|A^{-1}F(x_0)\|\le\eta;$$
$$\|A^{-1}[\delta F(x,y)-F'(z)]\|\le\ell\,(\|x-z\|+\|y-z\|)$$
for all $x,y,z\in\mathcal{D}$;

•
$$\ell c+2\sqrt{\ell\eta}\le 1. \eqno(19.1.3)$$

The sufficient convergence condition (19.1.3) is easily violated (see the numerical examples). Hence, there is no guarantee in these cases that equation (19.1.1) under the information $(\ell,c,\eta)$ has a solution that can be found using the Secant method (19.1.2). In this chapter we are motivated by optimization considerations and the above observation.

The use of Lipschitz and center–Lipschitz conditions is one way to enlarge the convergence domain of different methods. This technique consists of using both conditions together instead of only the Lipschitz condition, which allows us to find a finer majorizing sequence, that is, a larger convergence domain. It has been used to find weaker convergence criteria for Newton's method by Argyros in [8]. Gutiérrez et al. in [12] give sufficient conditions for Newton's method using both Lipschitz and center-Lipschitz conditions, as do Amat et al. in [3, 4] for damped Newton-type methods and García-Olivo in [11] for other methods.
Here, using Lipschitz and center–Lipschitz conditions, we provide a new semilocal convergence analysis for (19.1.2). It turns out that our new convergence criteria can always be weaker than the old ones given in earlier studies such as [2, 14, 16, 18, 20, 21, 22, 23, 26, 27]. The chapter is organized as follows: the semilocal convergence analysis of the Secant method is presented in Section 19.2. Numerical examples are provided in Section 19.3.

19.2. Semilocal Convergence Analysis of the Secant Method


In this section, we present the semilocal convergence analysis of the Secant method (19.1.2). First, we present two auxiliary results concerning convergence criteria and majorizing sequences.

Lemma 19.2.1. Let $\ell_0>0$, $\ell>0$, $c>0$ and $\eta>0$ be constants with $\ell_0\le\ell$. Then, the following items hold:

(i)
$$0<\frac{\ell(c+\eta)}{1-\ell_0(c+\eta)}\le\frac{2\ell}{\ell+\sqrt{\ell^2+4\ell_0\ell}}<\frac{1-\ell_0(c+\eta)}{1-\ell_0 c}
\iff c+\eta\le\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2}; \eqno(19.2.1)$$

(ii)
$$\ell c\le\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}} \iff \frac{(1-\ell c)^2}{4}\le b^2-\ell c; \eqno(19.2.2)$$

(iii)
$$\ell c\ge\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}} \iff \frac{(1-\ell c)^2}{4}\ge b^2-\ell c; \eqno(19.2.3)$$

(iv)
$$\ell c\le\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}}\ \text{ and }\ \ell c+2\sqrt{\ell\eta}\le1 \;\Longrightarrow\; c+\eta\le\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2}; \eqno(19.2.4)$$

(v)
$$\ell c\ge\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}}\ \text{ and }\ c+\eta\le\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2} \;\Longrightarrow\; \ell c+2\sqrt{\ell\eta}\le1. \eqno(19.2.5)$$

Proof. Let $x=1-\ell c$, $y=\ell\eta$, $a=\dfrac{\ell_0}{\ell}$ and $b=\dfrac{2}{1+\sqrt{1+4a}}$. Then, we have that $ab^2+b-1=0$ and $ab+1=\dfrac{1}{b}$.

(i) The triple inequality in (19.2.1) holds if
$$\frac{\ell c+\ell\eta}{1-a\ell(c+\eta)}\le\frac{2\ell}{\ell+\sqrt{\ell^2+4a\ell^2}}=b, \eqno(19.2.6)$$
$$b<\frac{1-a\ell(c+\eta)}{1-a\ell c} \eqno(19.2.7)$$
and
$$\ell(c+\eta)<\frac{1}{a} \eqno(19.2.8)$$
or, if
$$y\le b^2-(1-x), \eqno(19.2.9)$$
$$y<\frac{1-b}{a}-(1-b)(1-x)=b^2-(1-b)(1-x) \eqno(19.2.10)$$
and
$$y\le\frac{1}{a}-(1-x), \eqno(19.2.11)$$
respectively. We have that $ab^2=1-b<1$ by the definition of $a$ and $b$. It follows that
$$b^2-(1-x)<\frac{1}{a}-(1-x) \eqno(19.2.12)$$
and from $(1-b)(1-x)<(1-x)$ we get that
$$b^2-(1-x)<b^2-(1-b)(1-x). \eqno(19.2.13)$$
Hence, it follows from (19.2.12) and (19.2.13) that (19.2.6)–(19.2.8) are satisfied if (19.2.9) holds. But (19.2.9) is equivalent to the right-hand side inequality in (19.2.1). Conversely, if the right-hand side inequality in (19.2.1) holds, then (19.2.9), (19.2.12) and (19.2.13) imply (19.2.10) and (19.2.11), which imply (19.2.6)–(19.2.8), which imply the triple inequality in (19.2.1).

(ii)
$$\ell c\le\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}} \iff 2(1-b)<x<2(1+b) \iff x^2-4x+4(1-b^2)\le0
\iff \frac{x^2}{4}\le b^2-(1-x) \iff \frac{(1-\ell c)^2}{4}\le b^2-\ell c.$$

(iii)
$$\frac{(1-\ell c)^2}{4}\ge b^2-\ell c \iff \frac{x^2}{4}\ge b^2-(1-x) \iff x^2-4x+4(1-b^2)\ge0 \Rightarrow x\le2(1-b)
\iff \ell c\ge\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}}$$
(since $x\ge2(1+b)$ cannot hold).

(iv) The hypotheses in (19.2.4) and (19.2.2) imply $\ell\eta\le\dfrac{(1-\ell c)^2}{4}\le b^2-\ell c$, which is
$$c+\eta\le\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2}.$$

(v) The hypotheses in (19.2.5) and (19.2.3) imply $\ell\eta\le b^2-\ell c\le\dfrac{(1-\ell c)^2}{4}$, so that
$$\ell c+2\sqrt{\ell\eta}\le1. \qquad\Box$$

We need the following result on majorizing sequences for the Secant method (19.1.2).
Lemma 19.2.2. Let $\ell_0>0$, $\ell>0$, $c>0$ and $\eta>0$ be constants with $\ell_0\le\ell$. Suppose:
$$c+\eta\le\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2}. \eqno(19.2.14)$$
Then, the scalar sequence $\{t_n\}$ $(n\ge-1)$ given by
$$t_{-1}=0,\quad t_0=c,\quad t_1=c+\eta,\quad t_{n+2}=t_{n+1}+\frac{\ell\,(t_{n+1}-t_{n-1})(t_{n+1}-t_n)}{1-\ell_0\,(t_{n+1}-t_0+t_n)} \eqno(19.2.15)$$
is increasing, bounded from above by
$$t^{\star\star}=\frac{\eta}{1-b}+c, \eqno(19.2.16)$$
and converges to its unique least upper bound $t^\star$ such that
$$c+\eta\le t^\star\le t^{\star\star}. \eqno(19.2.17)$$
Moreover, the following estimates hold for all $n\ge0$:
$$0\le t_{n+2}-t_{n+1}\le b\,(t_{n+1}-t_n)\le b^{n+1}\eta, \eqno(19.2.18)$$
where $b$ is given in Lemma 19.2.1.

Proof. We shall show using induction on $k\ge0$ that
$$0\le t_{k+2}-t_{k+1}\le b\,(t_{k+1}-t_k). \eqno(19.2.19)$$
Using (19.2.15) for $k=0$, we must show
$$0<\frac{\ell\,(t_1-t_{-1})}{1-\ell_0 t_1}\le b
\quad\text{or}\quad
0<\frac{\ell\,(c+\eta)}{1-\ell_0(c+\eta)}\le b,$$
which is true by (19.2.1) and (19.2.14). Assume that (19.2.19) holds for all $k\le n+1$. It then follows from the induction hypotheses that
$$\begin{aligned}
t_{k+2} &\le t_{k+1}+b\,(t_{k+1}-t_k)\le t_k+b\,(t_k-t_{k-1})+b\,(t_{k+1}-t_k)\\
&\le t_1+b\,(t_1-t_0)+\cdots+b\,(t_{k+1}-t_k)\le c+\eta+b\,\eta+\cdots+b^{k+1}\eta\\
&= c+\frac{1-b^{k+2}}{1-b}\,\eta<\frac{\eta}{1-b}+c=t^{\star\star}.
\end{aligned} \eqno(19.2.20)$$
Moreover, we have in turn that
$$\begin{aligned}
&\ell\,(t_{k+2}-t_{k+1})+b\,\ell_0\,(t_{k+2}-t_0+t_{k+1})\\
&\quad\le \ell\,\big((t_{k+2}-t_{k+1})+(t_{k+1}-t_k)\big)+b\,\ell_0\left(\frac{1-b^{k+2}}{1-b}+\frac{1-b^{k+1}}{1-b}\right)\eta+b\,\ell_0\,c\\
&\quad\le \ell\,(b^k+b^{k+1})\,\eta+\frac{b\,\ell_0}{1-b}\,(2-b^{k+1}-b^{k+2})\,\eta+b\,\ell_0\,c.
\end{aligned} \eqno(19.2.21)$$
In view of (19.2.21), inequality (19.2.19) holds if
$$\ell\,(b^k+b^{k+1})\,\eta+\frac{b\,\ell_0}{1-b}\,(2-b^{k+1}-b^{k+2})\,\eta+b\,\ell_0\,c\le b \eqno(19.2.22)$$
or
$$\ell\,(b^{k-1}+b^k)\,\eta+\ell_0\big((1+b+\cdots+b^k)+(1+b+\cdots+b^{k+1})\big)\eta+\ell_0\,c-1\le0. \eqno(19.2.23)$$
In view of (19.2.23), we are motivated to define recurrent functions for $k\ge1$ on $[0,1)$ by
$$f_k(t)=\ell\,(t^{k-1}+t^k)\,\eta+\ell_0\big(2\,(1+t+\cdots+t^k)+t^{k+1}\big)\eta+\ell_0\,c-1. \eqno(19.2.24)$$
We need the relationship between two consecutive functions $f_k$. Using (19.2.24), we obtain
$$\begin{aligned}
f_{k+1}(t)&=\ell\,(t^k+t^{k+1})\,\eta+\ell_0\big(2\,(1+t+\cdots+t^{k+1})+t^{k+2}\big)\eta+\ell_0\,c-1\\
&=\ell\,(t^{k-1}+t^k)\,\eta+\ell\,(t^k+t^{k+1})\,\eta-\ell\,(t^{k-1}+t^k)\,\eta\\
&\quad+\ell_0\big(2\,(1+t+\cdots+t^k)+t^{k+1}\big)\eta+\ell_0\,(2\,t^{k+1}+t^{k+2})\,\eta-\ell_0\,t^{k+1}\eta+\ell_0\,c-1\\
&=f_k(t)+\ell\,(t^{k+1}-t^{k-1})\,\eta+\ell_0\,(t^{k+1}+t^{k+2})\,\eta\\
&=p(t)\,t^{k-1}\eta+f_k(t),
\end{aligned} \eqno(19.2.25)$$
where $p(t)=\ell_0 t^3+(\ell_0+\ell)\,t^2-\ell$. Notice that by Descartes' rule of signs, $b$ is the only positive root of the polynomial $p$. We can show instead of (19.2.23) that
$$f_k(b)\le0,\qquad k\ge1. \eqno(19.2.26)$$
Define the function $f_\infty$ on the interval $[0,1)$ by $f_\infty(t)=\lim_{k\to\infty}f_k(t)$. Then, in view of (19.2.24) we get that
$$f_\infty(t)=\frac{2\,\ell_0\,\eta}{1-t}+\ell_0\,c-1. \eqno(19.2.27)$$
We have that $f_k(b)=f_{k+1}(b)=f_\infty(b)$. Hence, we can show instead of (19.2.26) that $f_\infty(b)\le0$, which is true by (19.2.1), (19.2.14) and (19.2.27). Hence, we showed that the sequence $\{t_n\}$ $(n\ge-1)$ is increasing and bounded from above by $t^{\star\star}$, so that (19.2.18) holds. It follows that there exists $t^\star\in[c+\eta,t^{\star\star}]$ such that $\lim_{n\to\infty}t_n=t^\star$. $\Box$
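The majorizing sequence (19.2.15) is easy to tabulate. The sketch below (our own code) generates $t_n$ for the data of Example 19.3.2 below and checks the bound (19.2.16):

```python
def majorizing_t(l, l0, c, eta, n_terms=10):
    """Sequence (19.2.15): t_{-1}=0, t_0=c, t_1=c+eta,
    t_{n+2} = t_{n+1} + l (t_{n+1}-t_{n-1}) (t_{n+1}-t_n)
                        / (1 - l0 (t_{n+1} - t_0 + t_n))."""
    t = [0.0, c, c + eta]                 # t_{-1}, t_0, t_1
    for _ in range(n_terms):
        num = l * (t[-1] - t[-3]) * (t[-1] - t[-2])
        den = 1 - l0 * (t[-1] - t[1] + t[-2])
        t.append(t[-1] + num / den)
    return t[1:]                          # drop t_{-1}

l, l0, c, eta = 1.64835, 1.23626, 0.2, 0.0364
b = 2 * l / (l + (l**2 + 4 * l0 * l) ** 0.5)
t = majorizing_t(l, l0, c, eta)
print(t[-1], eta / (1 - b) + c)           # t_n stays below t** = eta/(1-b) + c
```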

We denote by $U(z,\rho)$ the open ball centered at $z\in X$ and of radius $\rho>0$. We also denote by $\bar U(z,\rho)$ the closure of $U(z,\rho)$. We shall study the Secant method (19.1.2) for triplets $(F,x_{-1},x_0)$ belonging to the class $\mathcal{C}(\ell,\ell_0,\eta,c)$ defined as follows:

Definition 19.2.3. Let $\ell$, $\ell_0$, $\eta$, $c$ be positive constants satisfying the hypotheses of Lemma 19.2.2. We say that a triplet $(F,x_{-1},x_0)$ belongs to the class $\mathcal{C}(\ell,\ell_0,\eta,c)$ if:

(c$_1$) $F$ is a nonlinear operator defined on a convex subset $\mathcal{D}$ of a Banach space $X$ with values in a Banach space $Y$;

(c$_2$) $x_{-1}$ and $x_0$ are two points belonging to the interior $\mathcal{D}^0$ of $\mathcal{D}$ and satisfying the inequality
$$\|x_0-x_{-1}\|\le c;$$

(c$_3$) $F$ is Fréchet–differentiable on $\mathcal{D}^0$, and there exists an operator $\delta F:\mathcal{D}^0\times\mathcal{D}^0\to L(X,Y)$ such that the linear operator $A=\delta F(x_{-1},x_0)$ is invertible, its inverse $A^{-1}$ is bounded, and
$$\|A^{-1}F(x_0)\|\le\eta;$$
$$\|A^{-1}[\delta F(x,y)-F'(z)]\|\le\ell\,(\|x-z\|+\|y-z\|);$$
$$\|A^{-1}[\delta F(x,y)-F'(x_0)]\|\le\ell_0\,(\|x-x_0\|+\|y-x_0\|)$$
for all $x,y,z\in\mathcal{D}$;

(c$_4$) the set $\mathcal{D}_c=\{x\in\mathcal{D};\ F\ \text{is continuous at}\ x\}$ contains the closed ball $\bar U(x_0,t^\star-t_0)$, where $t^\star$ is given in Lemma 19.2.2.

We present the following semilocal convergence theorem for the Secant method (19.1.2).

Theorem 19.2.4. If $(F,x_{-1},x_0)\in\mathcal{C}(\ell,\ell_0,\eta,c)$, then the sequence $\{x_n\}$ $(n\ge-1)$ generated by the Secant method (19.1.2) is well defined, remains in $\bar U(x_0,t^\star-t_0)$ for all $n\ge0$ and converges to a unique solution $x^\star\in \bar U(x_0,t^\star-t_0)$ of the equation $F(x)=0$. Moreover, the following estimates hold for all $n\ge0$:
$$\|x_{n+2}-x_{n+1}\|\le t_{n+2}-t_{n+1} \eqno(19.2.28)$$
and
$$\|x_n-x^\star\|\le t^\star-t_n, \eqno(19.2.29)$$
where the sequence $\{t_n\}$ $(n\ge0)$ is given by (19.2.15). Furthermore, if there exists $R\ge t^\star-t_0$ such that
$$\ell_0\left(c+\frac{\eta}{1-b}+R\right)\le1 \eqno(19.2.30)$$
and
$$U(x_0,R)\subseteq\mathcal{D}, \eqno(19.2.31)$$
then the solution $x^\star$ is unique in $U(x_0,R)$.

Proof. We first show that the operator $L=\delta F(u,v)$ is invertible for $u,v\in \bar U(x_0,t^\star-t_0)$. It follows from (19.2.1), (c$_2$) and (c$_3$) that:
$$\begin{aligned}
\|I-A^{-1}L\|=\|A^{-1}(L-A)\| &\le \|A^{-1}(L-F'(x_0))\|+\|A^{-1}(F'(x_0)-A)\|\\
&\le \ell_0\,(\|u-x_0\|+\|v-x_0\|+\|x_0-x_{-1}\|)\\
&\le \ell_0\,(t^\star-t_0+t^\star-t_0+c)\\
&\le \ell_0\left(2\left(\frac{\eta}{1-b}+c\right)-c\right)<1.
\end{aligned} \eqno(19.2.32)$$
According to the Banach lemma on invertible operators [8], [15], and (19.2.32), $L$ is invertible and
$$\|L^{-1}A\|\le\Big(1-\ell_0\,(\|u-x_0\|+\|v-x_0\|+c)\Big)^{-1}. \eqno(19.2.33)$$
The second condition in (c$_3$) implies the Lipschitz condition for $F'$:
$$\|A^{-1}(F'(u)-F'(v))\|\le2\,\ell\,\|u-v\|,\qquad u,v\in\mathcal{D}^0. \eqno(19.2.34)$$
By the identity
$$F(x)-F(y)=\int_0^1 F'(y+t(x-y))\,dt\,(x-y) \eqno(19.2.35)$$
we get
$$\|A^{-1}[F(x)-F(y)-F'(u)(x-y)]\|\le\ell\,(\|x-u\|+\|y-u\|)\,\|x-y\| \eqno(19.2.36)$$
and
$$\|A^{-1}[F(x)-F(y)-\delta F(u,v)(x-y)]\|\le\ell\,(\|x-v\|+\|y-v\|+\|u-v\|)\,\|x-y\| \eqno(19.2.37)$$
for all $x,y,u,v\in\mathcal{D}^0$. By a continuity argument, (19.2.34)–(19.2.37) remain valid if $x$ and/or $y$ belong to $\mathcal{D}_c$. We first show (19.2.28). If (19.2.28) holds for all $n\le k$ and if $\{x_n\}$ $(n\ge0)$ is well defined for $n=0,1,2,\cdots,k$, then
$$\|x_0-x_n\|\le t_n-t_0<t^\star-t_0,\qquad n\le k. \eqno(19.2.38)$$
That is, (19.1.2) is well defined for $n=k+1$. For $n=-1$ and $n=0$, (19.2.28) reduces to $\|x_{-1}-x_0\|\le c$ and $\|x_0-x_1\|\le\eta$. Suppose (19.2.28) holds for $n=-1,0,1,\cdots,k$ $(k\ge0)$. Using (19.2.33), (19.2.37) and
$$F(x_{k+1})=F(x_{k+1})-F(x_k)-\delta F(x_{k-1},x_k)\,(x_{k+1}-x_k), \eqno(19.2.39)$$
we obtain in turn:
$$\begin{aligned}
\|A^{-1}F(x_{k+1})\| &\le \ell\,(\|x_{k+1}-x_k\|+\|x_k-x_{k-1}\|)\,\|x_{k+1}-x_k\|\\
&\le \ell\,(t_{k+1}-t_k+t_k-t_{k-1})(t_{k+1}-t_k)\\
&= \ell\,(t_{k+1}-t_{k-1})(t_{k+1}-t_k)
\end{aligned} \eqno(19.2.40)$$
and
$$\begin{aligned}
\|x_{k+2}-x_{k+1}\| &= \|\delta F(x_k,x_{k+1})^{-1}F(x_{k+1})\|\\
&\le \|\delta F(x_k,x_{k+1})^{-1}A\|\,\|A^{-1}F(x_{k+1})\|\\
&\le \frac{\ell\,(t_{k+1}-t_k+t_k-t_{k-1})}{1-\ell_0\,(t_{k+1}-t_0+t_k-t_0+t_0-t_{-1})}\,(t_{k+1}-t_k)\\
&= t_{k+2}-t_{k+1}.
\end{aligned} \eqno(19.2.41)$$
The induction for (19.2.28) is complete. It follows from (19.2.28) and Lemma 19.2.2 that the sequence $\{x_n\}$ $(n\ge-1)$ is complete in the Banach space $X$, and as such it converges to some $x^\star\in \bar U(x_0,t^\star-t_0)$ (since $\bar U(x_0,t^\star-t_0)$ is a closed set). By letting $k\to\infty$ in (19.2.40), we obtain $F(x^\star)=0$. Estimate (19.2.29) follows from (19.2.28) by using standard majorization techniques [7, 15, 18, 22]. We shall first show uniqueness in $\bar U(x_0,t^\star-t_0)$. Let $y^\star\in \bar U(x_0,t^\star-t_0)$ be a solution of equation (19.1.1). Set
$$\mathcal{M}=\int_0^1 F'(x^\star+t\,(y^\star-x^\star))\,dt.$$
It then follows by (c$_3$) that:
$$\begin{aligned}
\|A^{-1}(A-\mathcal{M})\| &\le \ell_0\,(\|y^\star-x_0\|+\|x^\star-x_0\|+\|x_0-x_{-1}\|)\\
&\le \ell_0\,\big((t^\star-t_0)+(t^\star-t_0)+t_0\big)\\
&\le \ell_0\left(2\left(\frac{\eta}{1-b}+c\right)-c\right)\\
&= \ell_0\left(\frac{2\eta}{1-b}+c\right)<1.
\end{aligned} \eqno(19.2.42)$$
It follows from (19.2.1) and the Banach lemma on invertible operators that $\mathcal{M}^{-1}$ exists on $\bar U(x_0,t^\star-t_0)$. Using the identity
$$F(x^\star)-F(y^\star)=\mathcal{M}\,(x^\star-y^\star) \eqno(19.2.43)$$
we deduce $x^\star=y^\star$. Finally, we shall show uniqueness in $U(x_0,R)$. As in (19.2.42), we arrive at
$$\|A^{-1}(A-\mathcal{M})\|<\ell_0\left(\frac{\eta}{1-b}+c+R\right)\le1,$$
by (19.2.30). $\Box$

Remark 19.2.5. (a) Let us define the majorizing sequence $\{w_n\}$ used in earlier studies such as [2, 14, 16, 18, 20, 21, 22, 23, 26, 27] (under condition (19.1.3)):
$$w_{-1}=0,\quad w_0=c,\quad w_1=c+\eta,\quad w_{n+2}=w_{n+1}+\frac{\ell\,(w_{n+1}-w_{n-1})(w_{n+1}-w_n)}{1-\ell\,(w_{n+1}-w_0+w_n)}. \eqno(19.2.44)$$
Note that in general
$$\ell_0\le\ell \eqno(19.2.45)$$
holds, and $\dfrac{\ell}{\ell_0}$ can be arbitrarily large [5, 6, 7, 8]. In the case $\ell_0=\ell$, we have $t_n=w_n$ $(n\ge-1)$. Otherwise:
$$t_{n+1}-t_n\le w_{n+1}-w_n, \eqno(19.2.46)$$
$$0\le t^\star-t_n\le w^\star-w_n,\qquad w^\star=\lim_{n\to\infty}w_n. \eqno(19.2.47)$$
Note also that strict inequality holds in (19.2.46) for $n\ge1$ if $\ell_0<\ell$. It is worth noticing that the center-Lipschitz condition is not an additional hypothesis to the Lipschitz condition, since in practice the computation of the constant $\ell$ requires the computation of $\ell_0$. It follows from the proof of Theorem 19.2.4 that the sequence $\{s_n\}$ defined by
$$s_{-1}=0,\quad s_0=c,\quad s_1=c+\eta,\quad s_2=s_1+\frac{\ell_0\,(s_1-s_{-1})(s_1-s_0)}{1-\ell_0\,s_1},$$
$$s_{n+2}=s_{n+1}+\frac{\ell\,(s_{n+1}-s_{n-1})(s_{n+1}-s_n)}{1-\ell_0\,(s_{n+1}-s_0+s_n)}\qquad\text{for }n=1,2,\ldots$$
is also a majorizing sequence for $\{x_n\}$ which is tighter than $\{t_n\}$.

(b) In practice the constant $c$ depends on the initial guesses $x_{-1}$ and $x_0$, which can be chosen to be as close to each other as we wish. Therefore, in particular, we can always choose
$$\ell c<\frac{3-\sqrt{1+4\,\ell_0/\ell}}{1+\sqrt{1+4\,\ell_0/\ell}},$$
which according to (iv) in Lemma 19.2.1 implies that the new sufficient convergence criterion (19.2.14) is weaker than the old one given by (19.1.3).

19.3. Numerical Examples


Example 19.3.1. Let $X=Y=C[0,1]$, equipped with the max-norm. Consider the following nonlinear boundary value problem:
$$u''=-u^3-\gamma u^2,\qquad u(0)=0,\quad u(1)=1.$$
It is well known that this problem can be formulated as the integral equation
$$u(s)=s+\int_0^1 Q(s,t)\,\big(u^3(t)+\gamma\,u^2(t)\big)\,dt, \eqno(19.3.1)$$
where $Q$ is the Green function:
$$Q(s,t)=\begin{cases} t\,(1-s), & t\le s,\\ s\,(1-t), & s<t.\end{cases}$$
We observe that
$$\max_{0\le s\le1}\int_0^1|Q(s,t)|\,dt=\frac{1}{8}.$$
Then problem (19.3.1) is in the form (19.1.1), where $F:\mathcal{D}\to Y$ is defined as
$$[F(x)](s)=x(s)-s-\int_0^1 Q(s,t)\,\big(x^3(t)+\gamma\,x^2(t)\big)\,dt.$$
The Fréchet derivative of the operator $F$ is given by
$$[F'(x)y](s)=y(s)-3\int_0^1 Q(s,t)\,x^2(t)\,y(t)\,dt-2\gamma\int_0^1 Q(s,t)\,x(t)\,y(t)\,dt.$$
Then, we have that
$$[(I-F'(x_0))(y)](s)=3\int_0^1 Q(s,t)\,x_0^2(t)\,y(t)\,dt+2\gamma\int_0^1 Q(s,t)\,x_0(t)\,y(t)\,dt.$$
Hence, if $2\gamma<5$, then
$$\|I-F'(x_0)\|\le\frac{3+2\gamma}{8}<1.$$
It follows that $F'(x_0)^{-1}$ exists and
$$\|F'(x_0)^{-1}\|\le\frac{8}{5-2\gamma}.$$
We also have that $\|F(x_0)\|\le\dfrac{1+\gamma}{8}$. Define the divided difference by
$$\delta F(x,y)=\int_0^1 F'(y+t(x-y))\,dt.$$
Choose $x_{-1}(s)$ such that $\|x_{-1}-x_0\|\le c$ and $k_0c<1$. Then, we have
$$\|\delta F(x_{-1},x_0)^{-1}F(x_0)\|\le\|\delta F(x_{-1},x_0)^{-1}F'(x_0)\|\,\|F'(x_0)^{-1}F(x_0)\|$$
and
$$\|\delta F(x_{-1},x_0)^{-1}F'(x_0)\|\le\frac{1}{1-k_0c},$$
where $k_0$ is such that
$$\|F'(x_0)^{-1}(F'(x_0)-\delta F(x_{-1},x_0))\|\le k_0c.$$
Set $u_0(s)=s$ and $\mathcal{D}=U(u_0,R)$. It is easy to verify that $U(u_0,R)\subset U(0,R+1)$, since $\|u_0\|=1$. If $2\gamma<5$ and $k_0c<1$, the operator $F'$ satisfies the conditions of Theorem 19.2.4 with
$$\eta=\frac{1+\gamma}{(1-k_0c)(5-2\gamma)},\qquad \ell=\frac{\gamma+6R+3}{8\,(5-2\gamma)(1-k_0c)},\qquad \ell_0=\frac{2\gamma+3R+6}{16\,(5-2\gamma)(1-k_0c)}.$$
Choosing $R=0.9$, $\gamma=0.5$ and $c=1$, we obtain
$$k_0=0.1938137822\ldots,\quad \eta=0.465153\ldots,\quad \ell=0.344989\ldots\quad\text{and}\quad \ell_0=0.187999\ldots.$$
Then, criterion (19.1.3) is not satisfied, since $\ell c+2\sqrt{\ell\eta}=1.14617\ldots>1$, but criterion (19.2.14) is satisfied, since
$$\eta+c=1.46515\ldots\le\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2}=1.49682\ldots.$$
As a consequence, the convergence of the Secant method is guaranteed by Theorem 19.2.4.
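The computation above is easy to reproduce; the following sketch (our own code) evaluates both criteria for the stated data:

```python
import math

k0, gamma, c, R = 0.1938137822, 0.5, 1.0, 0.9
eta = (1 + gamma) / ((1 - k0 * c) * (5 - 2 * gamma))
l   = (gamma + 6 * R + 3) / (8 * (5 - 2 * gamma) * (1 - k0 * c))
l0  = (2 * gamma + 3 * R + 6) / (16 * (5 - 2 * gamma) * (1 - k0 * c))

old = l * c + 2 * math.sqrt(l * eta)                      # criterion (19.1.3)
new_rhs = 4 * l / (l + math.sqrt(l**2 + 4 * l0 * l))**2   # bound in (19.2.14)
print(old, c + eta, new_rhs)   # 1.146... > 1, but 1.465... <= 1.496...
```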

Example 19.3.2. Let $X=Y=\mathbb{R}$ and let us consider the real function
$$F(x)=x^3-k,$$
where $k\in\mathbb{R}$, and apply the Secant method to find the solution of $F(x)=0$. We take the starting point $x_0=1$, we consider the domain $\Omega=B(x_0,1)$, and we leave $x_{-1}$ free in order to find a relation between $k$ and $x_{-1}$ for which criterion (19.1.3) is not satisfied but the new criterion (19.2.14) is satisfied. In this case, we obtain
$$\eta=|(1-k)(1+x_{-1}+x_{-1}^2)|,\qquad \ell=\frac{6}{|1+x_{-1}+x_{-1}^2|},\qquad \ell_0=\frac{9}{2\,|1+x_{-1}+x_{-1}^2|}.$$
Taking all these data into account, we obtain the following criteria:
(i) If $55/54<k\le25/24$ and
$$\alpha<x_{-1}\le\frac{2-27k}{2(-29+27k)}-\frac{1}{2}\sqrt{3-\frac{2164-3024k+729k^2}{(-29+27k)^2}},$$
where $\alpha$ is the smallest positive root of
$$p(t)=-73+24k+(22+48k)\,t+(-111+72k)\,t^2+(-38+48k)\,t^3+(-25+24k)\,t^4.$$

(ii) If $25/24<k<29/27$ and
$$1<x_{-1}\le\frac{2-27k}{2(-29+27k)}-\frac{1}{2}\sqrt{3-\frac{2164-3024k+729k^2}{(-29+27k)^2}}.$$

(iii) If $55/54<k<25/24$ and
$$\frac{56-27k}{2(-29+27k)}+\frac{1}{2}\sqrt{3-\frac{-968-108k+729k^2}{(-29+27k)^2}}\le x_{-1}<\alpha,$$
where $\alpha$ is the greatest positive root of
$$p(t)=-49+24k+(22+48k)\,t+(-111+72k)\,t^2+(-62+48k)\,t^3+(-25+24k)\,t^4.$$

(iv) If $25/24\le k<29/27$ and
$$\frac{56-27k}{2(-29+27k)}+\frac{1}{2}\sqrt{3-\frac{-968-108k+729k^2}{(-29+27k)^2}}\le x_{-1}<1.$$

(v) If $25/27<k<23/24$ and
$$1\le x_{-1}<\frac{52-27k}{2(-25+27k)}-\frac{1}{2}\sqrt{3-\frac{-968+108k+729k^2}{(-25+27k)^2}}.$$

(vi) If $23/24\le k<53/54$ and
$$\alpha\le x_{-1}<\frac{52-27k}{2(-25+27k)}-\frac{1}{2}\sqrt{3-\frac{-968+108k+729k^2}{(-25+27k)^2}},$$
where $\alpha$ is the smallest positive root of
$$p(t)=25+24k+(-118+48k)\,t+(-33+72k)\,t^2+(-58+48k)\,t^3+(-23+24k)\,t^4.$$

(vii) If $25/27<k\le23/24$ and
$$\frac{-2-27k}{2(-25+27k)}+\frac{1}{2}\sqrt{3-\frac{1732-2808k+729k^2}{(-25+27k)^2}}\le x_{-1}<1.$$

(viii) If $23/24<k<53/54$ and
$$\frac{-2-27k}{2(-25+27k)}+\frac{1}{2}\sqrt{3-\frac{1732-2808k+729k^2}{(-25+27k)^2}}\le x_{-1}<\alpha,$$
where $\alpha$ is the greatest positive root of
$$p(t)=1+24k+(-118+48k)\,t+(-33+72k)\,t^2+(-34+48k)\,t^3+(-23+24k)\,t^4.$$

Now we consider a case in which both criteria (19.1.3) and (19.2.14) are satisfied, in order to compare the majorizing sequences. We choose $k=0.99$ and $x_{-1}=1.2$, and we obtain
$$c=0.2,\quad \eta=0.0364\ldots,\quad \ell=1.64835\ldots,\quad \ell_0=1.23626\ldots.$$
Moreover, criterion (19.1.3),
$$\ell c+2\sqrt{\ell\eta}=0.819568\ldots<1,$$
is satisfied, and criterion (19.2.14),
$$c+\eta=0.2364\ldots\le0.26963\ldots=\frac{4\ell}{\left(\ell+\sqrt{\ell^2+4\ell_0\ell}\right)^2},$$
is also satisfied. Table 19.3.1 shows that $\{s_n\}$, $\{t_n\}$ and $\{w_n\}$ are majorizing sequences, and that the tightest sequence is $\{s_n\}$.

Table 19.3.1. Comparison between the sequences {s_n}, {t_n} and {w_n}

 n   ‖s_{n+1} − s_n‖    ‖t_{n+1} − t_n‖    ‖w_{n+1} − w_n‖
 1   0.0150308…         0.0200411…         0.0232399…
 2   0.00197814…        0.00292257…        0.00446203…
 3   0.0000890021…      0.000181477…       0.000339709…
 4   4.88677 × 10^{-7}  1.53289 × 10^{-6}  4.52784 × 10^{-6}
 5   1.16179 × 10^{-10} 7.63675 × 10^{-10} 4.32958 × 10^{-9}
 6   1.66533 × 10^{-16} 3.16414 × 10^{-15} 5.45120 × 10^{-14}
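These columns can be reproduced with a few lines (our own sketch; the constants are rounded as above, and $\{s_n\}$ follows Remark 19.2.5(a), which uses $\ell_0$ in its first step):

```python
def majorize(l_num, l_den, l0_first, c, eta, n=7):
    """Secant majorizing sequence: l_num in the numerator and l_den in the
    denominator of the recurrence; l0_first (or None) replaces l_num in the
    first step only, as in the definition of {s_n} in Remark 19.2.5(a)."""
    t = [0.0, c, c + eta]                 # t_{-1}, t_0, t_1
    for i in range(n):
        l_top = l0_first if (l0_first and i == 0) else l_num
        num = l_top * (t[-1] - t[-3]) * (t[-1] - t[-2])
        den = 1 - l_den * (t[-1] - t[1] + t[-2])
        t.append(t[-1] + num / den)
    return t

l, l0, c, eta = 1.64835, 1.23626, 0.2, 0.0364
s = majorize(l, l0, l0, c, eta)           # {s_n}: l0 in the first step
t = majorize(l, l0, None, c, eta)         # {t_n}: recurrence (19.2.15)
w = majorize(l, l,  None, c, eta)         # {w_n}: recurrence (19.2.44)
for seq in (s, t, w):                     # first rows of Table 19.3.1
    print([seq[i + 1] - seq[i] for i in range(2, 6)])
```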

Conclusion

We presented a new semilocal convergence analysis for the Secant method in order to approximate a locally unique solution of a nonlinear equation in a Banach space setting. We showed that the new convergence criteria can always be weaker than the corresponding ones in earlier studies such as [2, 14, 16, 18, 20, 21, 22, 23, 26, 27]. Numerical examples where the old results cannot guarantee convergence but our new convergence criteria can are also provided in this chapter.
References

[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators,
Numer. Funct. Anal. Optim., 25 (2004), 397-405.

[2] Amat, S., Busquier, S., Gutiérrez, J. M., On the local convergence of secant-type
methods, Int. J. Comp. Math., 81(8) (2004), 1153-1161.

[3] Amat, S., Busquier, S., Magreñán, Á.A., Reducing chaos and bifurcations in
Newton-type methods, Abst. Appl. Anal., 2013 (2013), Article ID 726701, 10 pages,
http://dx.doi.org/10.1155/2013/726701.

[4] Amat, S., Magreñán, Á.A., Romero, N., On a family of two step Newton-type meth-
ods, App. Math. Comp., 219(4) (2013), 11341-11347.

[5] Argyros, I.K., A unifying local–semilocal convergence analysis and applications for
two–point Newton–like methods in Banach space, J. Math. Anal. Appl., 298 (2004),
374-397.

[6] Argyros, I.K., New sufficient convergence conditions for the Secant method, Che-
choslovak Math. J., 55 (2005), 175-187.

[7] Argyros, I.K., Convergence and applications of Newton–type iterations, Springer–


Verlag Publ., New–York, 2008.

[8] Argyros, I.K., Weaker conditions for the convergence of Newton’s method, J. Com-
plexity, 28 (2012), 364-387.

[9] Bosarge, W.E., Falb, P.L., A multipoint method of third order, J. Optimiz. Th. Appl., 4
(1969), 156-166.

[10] Dennis, J.E., Toward a unified convergence theory for Newton–like methods, in Nonl.
Funct. Anal. Appl. (L.B. Rall, ed.), Academic Press, New York, (1971), 425-472.

[11] Garcı́a-Olivo, M., El método de Chebyshev para el cálculo de las raı́ces de ecuaciones
no lineales (PhD Thesis), Servicio de Publicaciones, Universidad de La Rioja, 2013.
http://dialnet.unirioja.es/descarga/tesis/37844.pdf

[12] Gutiérrez, J. M., Magreñán, Á.A., N. Romero, On the semilocal convergence of


Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.

[13] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative
problems by Secant–like method, Appl. Math. Comp., 169 (2005), 926-942.

[14] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant–like methods for solving non-
linear integral equations of the Hammerstein type, J. Comp. Appl. Math., 115 (2000),
245-254.

[15] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[16] Laasonen, P., Ein überquadratisch konvergenter iterativer algorithmus, Ann. Acad. Sci.
Fenn. Ser I, 450 (1969), 1–10.

[17] Magreñán, Á.A., A new tool to study real dynamics: The convergence plane, App.
Math. Comp., 248 (2014), 215-224.

[18] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic Press, New York, 1970.

[19] Ostrowski, A.M., Solution of equations in Euclidian and Banach Spaces, Academic
Press, New York, 1972.

[20] Potra, F.A., An error analysis for the secant method, Numer. Math., 38 (1982), 427-
445.

[21] Potra, F.A., Sharp error bounds for a class of Newton–like methods, Libertas Mathe-
matica, 5 (1985), 71-84.

[22] Potra, F.A., Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, New
York, 1984.

[23] Schmidt, J.W., Untere Fehlerschranken fur Regula–Falsi Verhafren, Period. Hungar.,
9 (1978), 241-247.

[24] Sergeev, A.S., On the method of Chords (in Russian), Sibirsk. Math. J., 11 (1961),
282-289.

[25] Ulm, S., Majorant principle and the method of Chords (in Russian), Izv. Akad. Nauk
Eston. SSR, Ser. Fiz.-Mat., 13 (1964), 217-227.

[26] Yamamoto, T., A convergence theorem for Newton–like methods in Banach spaces,
Numer. Math., 51 (1987), 545-557.

[27] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Nu-
mer. Math., 31 (1978), 153-174.
Chapter 20

Expanding the Convergence Domain


for Chun-Stanica-Neta Family of
Third Order Methods in Banach
Spaces

20.1. Introduction
In this study we are concerned with the problem of approximating a locally unique solution
x∗ of the equation
F(x) = 0, (20.1.1)
where F is a Fréchet-differentiable operator defined on a convex subset D of a Banach space
X with values in a Banach space Y.
Many problems in computational mathematics and other disciplines can be brought in
a form like (20.1.1) using mathematical modelling [1, 3, 11, 15, 18, 19]. The solutions of
these equations can rarely be found in closed form. That is why most solution methods
for these equations are usually iterative. In particular the practice of Numerical Functional
Analysis for finding such solutions is essentially connected to Newton-like methods [1, 3,
15, 17, 18, 19]. The study of the convergence of iterative procedures is normally centered on two types: semilocal and local convergence analysis. The semilocal convergence matter is, based on the information around an initial point, to give criteria ensuring the convergence of the iterative procedures, while the local analysis is based on the information around a solution, to find estimates of the radii of the convergence balls. There exist many studies which deal with the local and the semilocal convergence analysis of Newton-like methods, such as [1]-[20].
Majorizing sequences in connection with the Kantorovich theorem have been used extensively for studying the convergence of these methods [1, 2, 3, 4, 10, 11, 15]. Rall [19] suggested a different approach for the convergence of these methods, based on recurrent relations. Candela and Marquina [5, 6], Parida [16], Parida and Gupta [17], Ezquerro and Hernández [7], Gutiérrez and Hernández [8, 9], and Argyros [1, 2, 3] used this idea for several high-order methods. In particular, Kou and Li [12] introduced a third order family of methods for solving equation (20.1.1), when $X=Y=\mathbb{R}$, defined by
$$\begin{aligned}
y_n &= x_n-\theta\,F'(x_n)^{-1}F(x_n),\qquad\text{for each }n=0,1,2,\cdots\\
x_{n+1} &= x_n-\frac{\theta^2+\theta-1}{\theta^2}\,F'(x_n)^{-1}F(x_n)-\frac{1}{\theta^2}\,F'(x_n)^{-1}F(y_n),
\end{aligned} \eqno(20.1.2)$$
where x0 is an initial point and θ ∈ R − {0}. This family uses two evaluations of F and
one evaluation of $F'$. Third order methods requiring one evaluation of $F$ and two evaluations of $F'$ can be found in [1, 3, 12, 18]. It is well known that the convergence domain of
high order methods is in general very small. This fact limits the applicability of these
methods. In the present study we are motivated by this fact and recent work by Chun,
Stanica and Neta [4] who provided a semilocal convergence analysis of the third order
method (20.1.2) in a Banach space setting. Their semilocal convergence analysis is based
on recurrent relations. In Section 20.2 we show convergence of the third order method
(20.1.2) using more precise recurrent relations under less computational cost and weaker
convergence criterion. Moreover, the error estimates on the distances kxn+1 − xn k, kxn − x∗ k
are more precise and the information on the location of the solution at least as precise. In
Section 20.3 using our technique of recurrent functions we present a semilocal convergence
analysis using majorizing sequence. The convergence criterion can be weaker than the older
convergence criteria or the criteria of Section 20.2. Numerical examples are presented in
Section 20.4 that show the advantages of our work over the older works.
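On the real line, one pass of the family (20.1.2) is only a few operations; the following sketch (our own illustration, with names and tolerances that are ours) iterates it on the function of Example 20.4.1 below:

```python
def chun_stanica_neta(F, dF, x0, theta=1.15, tol=1e-12, max_iter=25):
    """Third order family (20.1.2) on the real line:
    y_n     = x_n - theta * F(x_n)/F'(x_n)
    x_{n+1} = x_n - ((theta**2 + theta - 1)/theta**2) * F(x_n)/F'(x_n)
                  - (1/theta**2) * F(y_n)/F'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = F(x), dF(x)
        y = x - theta * fx / dfx
        x_new = x - (theta**2 + theta - 1) / theta**2 * fx / dfx \
                  - F(y) / (theta**2 * dfx)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example 20.4.1: F(x) = x^3 - 0.49, x0 = 1, theta = 1.15.
print(chun_stanica_neta(lambda x: x**3 - 0.49, lambda x: 3 * x**2, 1.0))
# converges to 0.49**(1/3) = 0.78827...
```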

20.2. Semilocal Convergence I


Let U(w, ρ), U(w, ρ) stand for the open and closed ball, respectively, with center w ∈ X and
of radius ρ > 0. Let also L(X,Y ) denote the space of bounded linear operators from X into
Y.
The semilocal convergence analysis of third order method (20.1.2),given by Chun, Stan-
ica and Neta [4] is based on the following conditions. Suppose:
(C):

(1) $\|F'(x)-F'(y)\|\le K\,\|x-y\|$ for each $x$ and $y\in D$;

(2) $\|F''(x)\|\le M$ for each $x\in D$;

(3) $\|F'(x_0)^{-1}\|\le\beta$;

(4) $\|F'(x_0)^{-1}F(x_0)\|\le\eta$.
They defined certain parameters and sequences by
$$\begin{aligned}
a &= K\beta\eta,\\
\alpha &= \frac{|\theta^2+\theta-1|+|1-\theta|}{\theta^2},\\
\gamma &= \frac{M}{2}\,\beta\eta,\\
a_0 &= b_0=1,\quad d_0=\alpha+\gamma,\quad b_{-1}=0,\\
a_{n+1} &= \frac{a_n}{1-a\,a_n d_n},\\
b_{n+1} &= a_{n+1}\,\beta\eta\,c_n,\\
k_n &= \frac{|1+\theta|\,(\theta-1)^2+|1-\theta|}{\theta^2}\,b_n+\frac{M}{2}\,a_n\beta\,b_n^2\,\eta,\\
c_n &= \frac{M}{2}\,k_n^2+K|\theta|\,b_n k_n+\frac{M}{2}\,|\theta^2-1|\,b_n^2
\end{aligned}$$
and
$$d_{n+1}=\alpha\,b_{n+1}+\gamma\,a_{n+1}\,b_{n+1}^2.$$
We suppose (C′):

(1) $\|F'(x_0)^{-1}(F'(x)-F'(y))\|\le K\,\|x-y\|$ for each $x,y\in D$;

(2) $\|F'(x_0)^{-1}(F'(x)-F'(x_0))\|\le K_0\,\|x-x_0\|$ for each $x\in D$;

(3) $\|F'(x_0)^{-1}F(x_0)\|\le\eta$.

Notice that the new conditions are given in affine invariant form and the condition on the
second Fréchet-derivative has been dropped. The advantages of presenting results in affine
invariant form instead of non-affine invariant form are well known [1, 3, 11, 15, 18]. If
operator F is twice Fréchet differentiable, then (1) in (C 0 ) implies (2) in (C ).
In order for us to compare the old approach with the new, let us rewrite the conditions (C) in affine invariant form. We shall call these conditions again (C).

(C$_1$) $\|F'(x_0)^{-1}(F'(x)-F'(y))\|\le K\,\|x-y\|$ for each $x$ and $y\in D$;

(C$_2$) $\|F'(x_0)^{-1}F''(x)\|\le M$ for each $x\in D$;

(C$_4$) $\|F'(x_0)^{-1}F(x_0)\|\le\eta$.
The parameters and sequences are defined as before, but with $\beta=1$. Then, we can certainly set $K=M$. Define the parameters
$$a_0=K\eta,\qquad \alpha_0=\alpha,\qquad \gamma_0=\frac{K}{2}\,\eta,$$
$$a_0'=b_0'=1,\qquad d_0'=\alpha_0+\gamma_0,\qquad b_{-1}'=0,$$
$$a_{n+1}'=\frac{1}{1-K_0\,\eta\,(d_n'+d_{n-1}'+\cdots+d_0')},$$
$$b_{n+1}'=a_{n+1}'\,\eta\,c_n',$$
$$c_n'=K\left[\frac{(k_n')^2}{2}+|\theta|\,b_n'k_n'+\frac{|\theta^2-1|}{2}\,(b_n')^2\right],$$
$$k_n'=\frac{|\theta+1|\,(\theta-1)^2+|1-\theta|}{\theta^2}\,b_n'+\frac{K}{2}\,a_n'\,(b_n')^2\,\eta$$
and
$$d_{n+1}'=\alpha_0\,b_{n+1}'+\gamma_0\,a_{n+1}'\,(b_{n+1}')^2.$$
We have that
$$K_0\le K \eqno(20.2.1)$$
holds in general, and $\dfrac{K}{K_0}$ can be arbitrarily large [1]-[3]. Notice that the center Lipschitz condition is not an additional condition to the Lipschitz condition, since in practice the computation of $K$ involves the computation of $K_0$ as a special case. We have by the definition of $a_{n+1}$ in turn that
$$\begin{aligned}
a_{n+1} &= \frac{a_n}{1-K\eta\,a_n d_n}
= \frac{\dfrac{a_{n-1}}{1-K\eta\,a_{n-1}d_{n-1}}}{1-K\eta\,d_n\,\dfrac{a_{n-1}}{1-K\eta\,a_{n-1}d_{n-1}}}\\
&= \frac{a_{n-1}}{1-K\eta\,a_{n-1}\,(d_n+d_{n-1})}\\
&\;\;\vdots\\
&= \frac{a_0}{1-K\eta\,a_0\,(d_n+d_{n-1}+\cdots+d_0)}\\
&= \frac{1}{1-K\eta\,(d_n+d_{n-1}+\cdots+d_0)}.
\end{aligned}$$
Hence, we deduce that
$$a_{n+1}'\le a_{n+1}\qquad\text{for each }n=0,1,2,\cdots. \eqno(20.2.2)$$
Moreover, strict inequality holds in (20.2.2) if $K_0<K$. Hence, using a simple inductive argument we also have that
$$b_{n+1}'\le b_{n+1}, \eqno(20.2.3)$$
$$c_n'\le c_n, \eqno(20.2.4)$$
$$k_n'\le k_n \eqno(20.2.5)$$
and
$$d_{n+1}'\le d_{n+1}. \eqno(20.2.6)$$
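The recurrences above are easy to tabulate; the following sketch (our own illustration with arbitrary constants satisfying $K_0<K$, not data from [4]) confirms (20.2.2) numerically:

```python
theta, K, K0, eta = 1.1, 1.0, 0.5, 0.1   # illustrative values with K0 < K
alpha = (abs(theta**2 + theta - 1) + abs(1 - theta)) / theta**2

def step(a, b, d_sum, center):
    """One pass of the Section 20.2 recurrences; 'center' is K0 for the
    primed sequences and K for the original ones."""
    k = ((abs(1 + theta) * (theta - 1)**2 + abs(1 - theta)) / theta**2) * b \
        + K / 2 * a * b**2 * eta
    c = K * (k**2 / 2 + abs(theta) * b * k + abs(theta**2 - 1) / 2 * b**2)
    d = alpha * b + K / 2 * eta * a * b**2
    d_sum += d
    a_next = 1 / (1 - center * eta * d_sum)
    return a_next, a_next * eta * c, d_sum

a, b, s = 1.0, 1.0, 0.0          # primed: a'_0 = b'_0 = 1
ap, bp, sp = 1.0, 1.0, 0.0       # unprimed (center constant K instead of K0)
for n in range(4):
    a, b, s = step(a, b, s, K0)
    ap, bp, sp = step(ap, bp, sp, K)
    print(n + 1, a, ap)          # observe a'_{n+1} <= a_{n+1}
```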

Lemma 20.2.1. Under the (C′) conditions the following hold:
$$\|F'(x_n)^{-1}F'(x_0)\|\le a_n',\qquad
\|F'(x_n)^{-1}F(x_n)\|\le b_n'\,\eta,$$
$$\|x_{n+1}-x_n\|\le d_n'\,\eta,\qquad
\|x_{n+1}-y_n\|\le(d_n'+2k_{n-1}'+\theta b_n')\,\eta.$$
Moreover, under the (C) conditions the following hold:
$$\|F'(x_n)^{-1}F'(x_0)\|\le a_n'\le a_n,\qquad
\|F'(x_n)^{-1}F(x_n)\|\le b_n'\,\eta\le b_n\,\eta,$$
$$\|x_{n+1}-x_n\|\le d_n'\,\eta\le d_n\,\eta,\qquad
\|x_{n+1}-y_n\|\le(d_n'+2k_{n-1}'+\theta b_n')\,\eta\le(d_n+2k_{n-1}+\theta b_n)\,\eta.$$

Proof. It follows from the proof of Lemma 1 in [4] by simply noticing that:

(i) the expressions involving the second Fréchet-derivative,
$$\int_0^1 F''(x_n+t(y_n-x_n))\,(1-t)\,(y_n-x_n)^2\,dt$$
and
$$\int_0^1 F''(y_n+t(x_{n+1}-y_n))\,(1-t)\,(x_{n+1}-y_n)^2\,dt,$$
are not needed and can be replaced, respectively, by
$$\int_0^1\big[F'(y_n+t(x_n-y_n))-F'(x_n)\big](y_n-x_n)\,dt$$
and
$$\int_0^1\big[F'(y_n+t(x_{n+1}-y_n))-F'(y_n)\big](x_{n+1}-y_n)\,dt.$$
Hence, condition (2) in (C) is not needed and can be replaced by condition (1) in (C′) to produce the same bounds as in [4] (for $K=M$) (see also the proof of Theorem 20.3.2 that follows).

(ii) The computation of the upper bounds on $\|F'(x_n)^{-1}F'(x_0)\|$ in [4] uses condition (1) in (C) and the estimate
$$\|F'(x_n)^{-1}(F'(x_n)-F'(x_{n+1}))\|\le\|F'(x_n)^{-1}F'(x_0)\|\,K\,\|x_n-x_{n+1}\|$$
to arrive at
$$\|F'(x_n)^{-1}F'(x_0)\|\le a_{n+1}, \eqno(20.2.7)$$
whereas we use (2) in (C′) and the estimate
$$\begin{aligned}
\|F'(x_0)^{-1}(F'(x_{n+1})-F'(x_0))\| &\le K_0\,\|x_{n+1}-x_0\|\\
&\le K_0\,(\|x_{n+1}-x_n\|+\cdots+\|x_1-x_0\|)\\
&\le K_0\,(d_n'+d_{n-1}'+\cdots+d_0')\,\eta
\end{aligned}$$
to arrive at the estimate
$$\|F'(x_n)^{-1}F'(x_0)\|\le a_{n+1}', \eqno(20.2.8)$$
which is more precise (see also (20.2.2)). $\Box$

Lemma 20.2.2. Suppose that
$$a_1'b_1'<1. \eqno(20.2.9)$$
Then, the sequence $\{p_n'\}$ defined by $p_n'=a_n'b_n'$ is decreasingly convergent to 0 and such that
$$p_{n+1}'\le\frac{1}{\xi_1}\,\xi_1^{2^{n+1}},\qquad \xi_1:=a_1'b_1',$$
and
$$d_n'\le(\alpha_0+\gamma_0)\,\frac{1}{\xi_1}\,\xi_1^{2^{n}}.$$
Moreover, if
$$a_1b_1<1, \eqno(20.2.10)$$
then the sequence $\{p_n\}$ defined by $p_n=a_nb_n$ is also decreasingly convergent to 0 and such that
$$p_{n+1}'\le p_{n+1}\le\frac{1}{\xi}\,\xi^{2^{n+1}},\qquad \xi=a_1b_1,$$
$$d_n'\le d_n\le(\alpha+\gamma)\,\frac{1}{\xi}\,\xi^{2^{n}},$$
and $\xi_1\le\xi$.

Proof. It follows from the proof of Lemma 3 in [4] by simply using $\{p_n'\}$, $a_1'$, $b_1'$, $\xi_1$ instead of $\{p_n\}$, $a_1$, $b_1$, $\xi$, respectively. $\Box$
Next, we present the main semilocal convergence result for the third order method (20.1.2) under the (C′) conditions, (20.2.9) and the convergence criterion
$$a\,(\alpha+\gamma)<1. \eqno(20.2.11)$$
The proof follows from the proof of Theorem 5 in [4] (with the exception of the uniqueness-of-the-solution part) by simply replacing the (C) conditions and (20.2.10) by the (C′) conditions and (20.2.9), respectively.
Theorem 20.2.3. Suppose that conditions (C′), (20.2.9) and (20.2.11) hold. Moreover, suppose that
$$U_0'=U(x_0,r_0\eta)\subset D, \eqno(20.2.12)$$
where
$$r_0=\sum_{n=0}^{\infty}d_n'. \eqno(20.2.13)$$
Then, the sequence $\{x_n\}$ generated by the third order method (20.1.2) is well defined, remains in $U_0'$ for each $n=0,1,2,\cdots$ and converges to a unique solution $x^*$ of the equation $F(x)=0$ in $U\left(x_0,\frac{2}{K_0}-r_0\eta\right)\cap D$. Moreover, the following estimates hold:
$$\|x_{n+1}-x^*\|\le\sum_{k=n+1}^{\infty}d_k'\,\eta\le\frac{\alpha+\gamma}{\xi_1}\,\eta\sum_{k=n+1}^{\infty}\xi_1^{2^{k}}. \eqno(20.2.14)$$

Proof. As already noted above, we only need to show the uniqueness part. Let $y^*\in U\left(x_0,\frac{2}{K_0}-r_0\eta\right)$ be such that $F(y^*)=0$. Define $Q=\int_0^1 F'(x^*+t(y^*-x^*))\,dt$. Using condition (2) in (C′) we get in turn that
$$\begin{aligned}
\|F'(x_0)^{-1}(F'(x_0)-Q)\| &\le K_0\int_0^1\|x^*+t(y^*-x^*)-x_0\|\,dt\\
&\le K_0\int_0^1\big[(1-t)\,\|x^*-x_0\|+t\,\|y^*-x_0\|\big]\,dt\\
&< \frac{K_0}{2}\left[r_0\eta+\frac{2}{K_0}-r_0\eta\right]=1. \qquad(20.2.15)
\end{aligned}$$
It follows from (20.2.15) and the Banach lemma on invertible operators [1, 3, 11, 15, 18] that $Q^{-1}\in L(Y,X)$. Then, using the identity
$$0=F(x^*)-F(y^*)=Q\,(x^*-y^*),$$
we deduce that $x^*=y^*$. $\Box$

Remark 20.2.4. If $K_0=K$ and the operator $F$ is twice Fréchet differentiable, then Lemma 20.2.1, Lemma 20.2.2 and Theorem 20.2.3 reduce to Lemma 1, Lemma 3 and Theorem 5 in [4], respectively. Otherwise, i.e., if $K_0<K$ or if the twice Fréchet differentiability of the operator $F$ is not assumed, our results constitute an improvement. It is worth noticing that if $K_0<K$, then (20.2.10) implies (20.2.9) (but not necessarily vice versa) and $\xi_1<\xi$.

20.3. Semilocal Convergence II


We need to introduce some scalar sequences that shall be shown to be majorizing for the
third order methods (20.1.2) in Theorem 20.3.2.
Let K0 > 0, K > 0, η > 0 and θ ∈ R − {0}. Set t0 = 0 and s0 = |θ|η. Define polynomials
f and g by
K|θ| |θ| |θ2 − 1|
f (t) = ( + K0 )t 3 + Kt 2 + K( − |θ|)t
2 2 2|θ|
K |θ2 − 1|
− (20.3.1)
2 |θ|
360 Ioannis K. Argyros and Á. Alberto Magreñán

and
K
g(t) = K0 t 4 + 2
[1 + |1 − θ|(1 + |1 − θ2 |)]t 3

K
+ [|1 − θ|(1 + |1 − θ2 |) − 1]t 2
2θ2
K |θ2 − 1|
+ 2 |1 − θ|(1 + |1 − θ2 |)( − 1)t
θ 2θ2
K
− 4 |1 − θ||1 − θ2 |(1 + |1 − θ2 |). (20.3.2)

|θ2 −1|
We have f (0) = − K2 |θ| < 0 for θ 6= ±1 and f (1) = K0 > 0 for K0 6= 0. It follows from the
intermediate value theorem that polynomial f has roots in (0, 1). Denote by δ f the smallest
root of f in (0, 1). Similarly, we have g(0) = − 2θK4 |1 − θ||θ2 − 1|(1 + |1 − θ2 |) < 0 for
θ 6= ±1 and g(1) = K0 + 2θK2 > 0. Denote by δg the smallest root of g in (0, 1). Set

δ = min{δ f , δg }. (20.3.3)

Moreover, suppose that $\delta$ satisfies
$$\frac{|1-\theta|}{|\theta|^3}\,(1+|1-\theta^2|)+\frac{K\eta}{2\theta}\le\delta, \eqno(20.3.4)$$
$$0<\frac{K|\theta|}{1-K_0(1+\delta)s_0}\left[\frac{|\theta^2-1|}{2\theta^2}+\frac{\delta^2}{2}+\delta\right](s_0-t_0)\le\delta \eqno(20.3.5)$$
and
$$0<\frac{K}{\theta^2\,(1-K_0(1+\delta)s_0)}\left\{|1-\theta|(1+|1-\theta^2|)\left[\frac{|\theta^2-1|}{2\theta^2}+\frac{\delta^2}{2}+\delta\right]+\frac{\delta^2}{2}\right\}(s_0-t_0)\le\delta^2. \eqno(20.3.6)$$
We shall assume from now on that $\delta$ satisfies conditions (20.3.3)–(20.3.6). These conditions shall be referred to as the ($\Delta$) conditions. Moreover, define the scalar sequences $\{t_n\}$, $\{s_n\}$ by

$$t_0=0,\qquad s_0=t_0+|\theta|\eta,$$
$$t_1=s_0+\left[\frac{|1-\theta|}{|\theta|^3}\,(1+|1-\theta^2|)+\frac{(s_0-t_0)\,K}{2\theta^2}\right](s_0-t_0)$$
and, for each $n=0,1,2,\cdots$,
$$s_{n+1}=t_{n+1}+\frac{K|\theta|}{1-K_0\,t_{n+1}}\left[\frac{|1-\theta^2|}{2\theta^2}\,(s_n-t_n)^2+\frac{(t_{n+1}-s_n)^2}{2}+(s_n-t_n)(t_{n+1}-s_n)\right], \eqno(20.3.7)$$
$$\begin{aligned}
t_{n+2}=s_{n+1}+\frac{K}{\theta^2\,(1-K_0\,t_{n+1})}\Big\{&|1-\theta|(1+|1-\theta^2|)\Big[\frac{|1-\theta^2|}{2\theta^2}\,(s_n-t_n)^2\\
&+\frac{(t_{n+1}-s_n)^2}{2}+(s_n-t_n)(t_{n+1}-s_n)\Big]+\frac{(s_{n+1}-t_{n+1})^2}{2}\Big\}.
\end{aligned} \eqno(20.3.8)$$
Then, we can show the following auxiliary result for the majorizing sequences $\{t_n\}$, $\{s_n\}$ under the ($\Delta$) conditions.

Lemma 20.3.1. Suppose that the ($\Delta$) conditions hold. Then, the sequences $\{t_n\}$, $\{s_n\}$ defined by (20.3.7) and (20.3.8) are increasingly convergent to their unique least upper bound, denoted by $t^*$, which satisfies
$$\theta\eta\le t^*\le t^{**}:=\frac{\theta\eta}{1-\delta}. \eqno(20.3.9)$$
Moreover, the following estimates hold for each $n=0,1,2,\cdots$:
$$0<s_n-t_n\le\delta^n\,\theta\eta \eqno(20.3.10)$$
and
$$0<t_{n+1}-s_n\le\delta^{n+1}\,\theta\eta. \eqno(20.3.11)$$
Proof. We shall show estimates (20.3.10) and (20.3.11) using induction. If $n=0$, (20.3.10) holds by the definition of $t_0$ and $s_0$, whereas (20.3.11) holds by (20.3.4). We then have that
$$t_1\le s_0+\delta s_0=(1+\delta)s_0=\frac{1-\delta^2}{1-\delta}\,s_0<t^{**}. \eqno(20.3.12)$$
If $n=1$, estimates (20.3.10) and (20.3.11) hold by (20.3.5), (20.3.6), (20.3.12) and (20.3.10), (20.3.11) for $n=0$. Suppose that (20.3.10) and (20.3.11) hold for all $m\le n$. Then, we have that
$$\begin{aligned}
t_{m+1} &\le s_m+\delta^{m+1}(s_0-t_0)\le t_m+\delta^m(s_0-t_0)+\delta^{m+1}(s_0-t_0)\le\cdots\\
&\le t_0+(s_0-t_0)+\delta(s_0-t_0)+\cdots+\delta^{m+1}(s_0-t_0)=\frac{1-\delta^{m+2}}{1-\delta}\,(s_0-t_0)<t^{**}.
\end{aligned} \eqno(20.3.13)$$
Next, we shall show (20.3.10) for $m+1$ replacing $n$. We have by the induction hypotheses and (20.3.13) that
$$s_{m+1}-t_{m+1}\le\frac{K|\theta|}{1-K_0\frac{1-\delta^{m+2}}{1-\delta}(s_0-t_0)}\left[\frac{|\theta^2-1|}{2\theta^2}\,(\delta^m(s_0-t_0))^2+\frac{(\delta^{m+1}(s_0-t_0))^2}{2}+\delta^{2m+1}(s_0-t_0)^2\right]$$
must be smaller than or equal to $\delta^{m+1}(s_0-t_0)$, or
$$\frac{K|\theta|}{1-K_0\frac{1-\delta^{m+2}}{1-\delta}(s_0-t_0)}\left[\frac{|\theta^2-1|}{2\theta^2}\,\delta^m+\frac{\delta^{m+2}}{2}+\delta^{m+1}\right](s_0-t_0)\le\delta. \eqno(20.3.14)$$
Estimate (20.3.14) motivates us to define recurrent polynomials $f_m$ on $(0,1)$ by
$$f_m(t)=K\left[\frac{|\theta|}{2}\,t^{m+2}+|\theta|\,t^{m+1}+\frac{|\theta^2-1|}{2|\theta|}\,t^{m}\right](s_0-t_0)+K_0\,t\,(1+t+\cdots+t^{m+1})(s_0-t_0)-t. \eqno(20.3.15)$$
We need a relationship between two consecutive polynomials $f_m$. Using (20.3.15) and (20.3.1), by direct algebraic manipulation we get that
$$f_{m+1}(t)=f_m(t)+f(t)\,t^{m}\,(s_0-t_0). \eqno(20.3.16)$$
Evidently, condition (20.3.14) is satisfied if
$$f_m(\delta)\le0. \eqno(20.3.17)$$
We also have from (20.3.16) that
$$f_{m+1}(\delta)\le f_m(\delta), \eqno(20.3.18)$$
since $f(\delta)\le0$. It then follows from (20.3.17) and (20.3.18) that (20.3.17) holds if
$$f_0(\delta)\le0, \eqno(20.3.19)$$
which is true by (20.3.5). Hence, we showed (20.3.10) for $m+1$ replacing $n$. Next, we shall show (20.3.11) for $m+1$ replacing $n$. We have in turn that
$$\begin{aligned}
t_{m+2}-s_{m+1}\le\frac{K}{\theta^2\left(1-K_0\frac{1-\delta^{m+2}}{1-\delta}(s_0-t_0)\right)}\Big\{&|1-\theta|(1+|\theta^2-1|)\Big[\frac{|\theta^2-1|}{2\theta^2}\,(\delta^m(s_0-t_0))^2\\
&+\frac{(\delta^{m+1}(s_0-t_0))^2}{2}+\delta^{2m+1}(s_0-t_0)^2\Big]+\frac{(\delta^{m+1}(s_0-t_0))^2}{2}\Big\}
\end{aligned}$$
must be smaller than or equal to $\delta^{m+2}(s_0-t_0)$. As in the preceding case, we are motivated to define polynomials $g_m$ on $[0,1]$ by
$$\begin{aligned}
g_m(t)=K\Big\{\frac{|1-\theta|(1+|\theta^2-1|)}{\theta^2}\Big[\frac{|\theta^2-1|}{2\theta^2}\,t^{m}+\frac{t^{m+2}}{2}+t^{m+1}\Big]+\frac{t^{m+2}}{2\theta^2}\Big\}(s_0-t_0)&\\
+t^2\,K_0\,(1+t+\cdots+t^{m+1})(s_0-t_0)-t^2.&
\end{aligned} \eqno(20.3.20)$$
Using (20.3.20) and (20.3.2), by direct algebraic manipulation we get that
$$g_{m+1}(t)=g_m(t)+g(t)\,t^{m}\,(s_0-t_0). \eqno(20.3.21)$$
Condition (20.3.11) is satisfied if
$$g_m(\delta)\le0. \eqno(20.3.22)$$
We also have from (20.3.21) and ($\Delta$) that
$$g_{m+1}(\delta)\le g_m(\delta), \eqno(20.3.23)$$
since $g(\delta)\le0$. Hence, (20.3.22) is satisfied if
$$g_0(\delta)\le0, \eqno(20.3.24)$$
which is true by (20.3.6). The induction for (20.3.11) is complete. It then follows that
$$t_{m+2}\le\frac{1-\delta^{m+3}}{1-\delta}\,s_0<t^{**}. \eqno(20.3.25)$$
Hence, the sequences $\{t_n\}$, $\{s_n\}$ are increasing, bounded above by $t^{**}$, and as such they converge to their unique least upper bound $t^*$, which satisfies (20.3.9). $\Box$
We can show the main semilocal convergence result for the third order method (20.1.2) under the (C′) and ($\Delta$) conditions using $\{t_n\}$ and $\{s_n\}$ as majorizing sequences.

Theorem 20.3.2. Suppose that
$$U(x_0,t^*)\subset D \eqno(20.3.26)$$
and that the (C′) and ($\Delta$) conditions hold. Then, the sequences $\{x_n\}$, $\{y_n\}$ generated by the third order method (20.1.2) are well defined, remain in $U(x_0,t^*)$ for each $n=0,1,2,\cdots$ and converge to a unique solution $x^*$ of the equation $F(x)=0$ in $U(x_0,t^*)\cap D$. Moreover, the following estimates hold for each $n=0,1,2,\cdots$:
$$\|y_n-x_n\|\le s_n-t_n, \eqno(20.3.27)$$
$$\|x_{n+1}-y_n\|\le t_{n+1}-s_n, \eqno(20.3.28)$$
$$\|x_{n+1}-x_n\|\le t_{n+1}-t_n \eqno(20.3.29)$$
and
$$\|x_n-x^*\|\le t^*-t_n. \eqno(20.3.30)$$
Furthermore, if there exists $R>t^*$ such that
$$K_0\,(t^*+R)<2, \eqno(20.3.31)$$
then the point $x^*$ is the only solution of the equation $F(x)=0$ in $U(x_0,R)$.

Proof. We shall first show (20.3.27) and (20.3.28) using induction. We have by (20.1.2) and (20.3.7) that
$$\|y_0-x_0\|=|\theta|\,\|F'(x_0)^{-1}F(x_0)\|\le|\theta|\eta=s_0=s_0-t_0.$$
Hence, (20.3.27) holds for $n=0$. It follows from the first substep of (20.1.2) that
$$\begin{aligned}
F(y_0) &= F(y_0)-\theta F(x_0)-F'(x_0)(y_0-x_0)\\
&= (1-\theta)F(x_0)+\int_0^1\big[F'(x_0+t(y_0-x_0))-F'(x_0)\big](y_0-x_0)\,dt.
\end{aligned} \eqno(20.3.32)$$
Composing (20.3.32) with $F'(x_0)^{-1}$ and using (2), (3) in (C′) and (20.3.7),
$$\begin{aligned}
\|F'(x_0)^{-1}F(y_0)\| &\le |1-\theta|\,\|F'(x_0)^{-1}F(x_0)\|+\left\|\int_0^1 F'(x_0)^{-1}\big[F'(x_0+t(y_0-x_0))-F'(x_0)\big](y_0-x_0)\,dt\right\|\\
&\le \frac{|1-\theta|}{|\theta|}\,(s_0-t_0)+\frac{K_0}{2}\,\|y_0-x_0\|^2\\
&\le \left(\frac{|1-\theta|}{|\theta|}+\frac{K_0}{2}\,(s_0-t_0)\right)(s_0-t_0).
\end{aligned} \eqno(20.3.33)$$
Subtracting the first from the second substep in (20.1.2) we get that
$$x_1-y_0=\frac{(\theta+1)(\theta-1)^2}{\theta^2}\,F'(x_0)^{-1}F(x_0)-\frac{1}{\theta^2}\,F'(x_0)^{-1}F(y_0). \eqno(20.3.34)$$
Hence, using (20.3.33) and (20.3.34), we get that
$$\begin{aligned}
\|x_1-y_0\| &\le \frac{|\theta+1|\,|\theta-1|^2}{\theta^2}\,\|F'(x_0)^{-1}F(x_0)\|+\frac{1}{\theta^2}\,\|F'(x_0)^{-1}F(y_0)\|\\
&\le \frac{|\theta+1|\,|\theta-1|^2}{\theta^2}\,(s_0-t_0)+\frac{1}{\theta^2}\left(\frac{|1-\theta|}{|\theta|}+\frac{K}{2}\,(s_0-t_0)\right)(s_0-t_0)\\
&= t_1-s_0,
\end{aligned} \eqno(20.3.35)$$
which shows (20.3.28) for $n=0$. Then, (20.3.29) holds for $n=0$, since
$$\|x_1-x_0\|\le\|x_1-y_0\|+\|y_0-x_0\|\le t_1-s_0+s_0-t_0=t_1-t_0\le t^*.$$
Then, we have $x_1\in U(x_0,t^*)$. Notice that $K_0\,t^*<1$ from the proof of Lemma 20.3.1. Let us suppose $x\in U(x_0,t^*)$. Then, using (2) in (C′) we have that
$$\|F'(x_0)^{-1}(F'(x)-F'(x_0))\|\le K_0\,\|x-x_0\|\le K_0\,t^*<1. \eqno(20.3.36)$$
It follows from (20.3.36) and the Banach lemma that $F'(x)^{-1}\in L(Y,X)$ and
$$\|F'(x_1)^{-1}F'(x_0)\|\le\frac{1}{1-K_0\,\|x_1-x_0\|}\le\frac{1}{1-K_0\,t_1}. \eqno(20.3.37)$$

Suppose that (20.3.27)–(20.3.29) hold for all $m\le n$ and $x_m\in U(x_0,t^*)$. Using the first substep in (20.1.2) we get that
$$\begin{aligned}
F(y_m) &= F(y_m)-\theta F(x_m)-F'(x_m)(y_m-x_m)\\
&= (1-\theta)F(x_m)+\int_0^1\big[F'(x_m+t(y_m-x_m))-F'(x_m)\big](y_m-x_m)\,dt.
\end{aligned} \eqno(20.3.38)$$
Subtracting the first substep in (20.1.2) from the second, we obtain
$$F'(x_m)(x_{m+1}-y_m)=\frac{\theta^3-\theta^2-\theta+1}{\theta^2}\,F(x_m)-\frac{1}{\theta^2}\,F(y_m). \eqno(20.3.39)$$
We also have by (20.3.38) that
$$\begin{aligned}
F(x_{m+1}) &= F'(x_m)(x_{m+1}-y_m)+F(y_m)+\big[F'(y_m)-F'(x_m)\big](x_{m+1}-y_m)\\
&\quad+F(x_{m+1})-F(y_m)-F'(y_m)(x_{m+1}-y_m)\\
&= \frac{1-\theta}{\theta^2}\,F(x_m)-\frac{1}{\theta^2}\,F(y_m)+\int_0^1\big[F'(x_m+t(y_m-x_m))-F'(x_m)\big](y_m-x_m)\,dt\\
&\quad+\int_0^1\big[F'(y_m+t(x_{m+1}-y_m))-F'(y_m)\big](x_{m+1}-y_m)\,dt\\
&\quad+\big[F'(y_m)-F'(x_m)\big](x_{m+1}-y_m).
\end{aligned} \eqno(20.3.40)$$
Hence, we get by (20.3.40) that
$$\begin{aligned}
\|F'(x_0)^{-1}F(x_{m+1})\| &\le K\left[\frac{|\theta^2-1|}{2\theta^2}\,\|y_m-x_m\|^2+\frac{\|x_{m+1}-y_m\|^2}{2}+\|y_m-x_m\|\,\|x_{m+1}-y_m\|\right]\\
&\le K\left[\frac{|\theta^2-1|}{2\theta^2}\,(s_m-t_m)^2+\frac{(t_{m+1}-s_m)^2}{2}+(s_m-t_m)(t_{m+1}-s_m)\right].
\end{aligned} \eqno(20.3.41)$$
Then, we get that
$$\begin{aligned}
\|y_{m+1}-x_{m+1}\| &\le \|F'(x_{m+1})^{-1}F'(x_0)\|\,\|F'(x_0)^{-1}F(x_{m+1})\|\\
&\le \frac{K}{1-K_0\,t_{m+1}}\left[\frac{|\theta^2-1|}{2\theta^2}\,(s_m-t_m)^2+\frac{(t_{m+1}-s_m)^2}{2}+(s_m-t_m)(t_{m+1}-s_m)\right]\\
&= s_{m+1}-t_{m+1},
\end{aligned}$$
where we used (20.3.37) for $x=x_{m+1}$ and
$$\|x_{m+1}-x_0\|\le\|x_{m+1}-x_m\|+\cdots+\|x_1-x_0\|\le t_{m+1}-t_m+\cdots+t_1-t_0=t_{m+1}.$$
Hence, we showed (20.3.27). Then, we have by (20.3.39) that
$$x_{m+1}-y_m=\frac{\theta^3-\theta^2-\theta+1}{\theta^2}\,F'(x_m)^{-1}F(x_m)-\frac{1}{\theta^2}\,F'(x_m)^{-1}F(y_m). \eqno(20.3.42)$$
It follows from (20.3.42) that
$$\begin{aligned}
\|x_{m+2}-y_{m+1}\| &\le \frac{K}{\theta^2\,(1-K_0\,t_{m+1})}\Big\{|1+\theta|(\theta-1)^2\Big[\frac{|\theta^2-1|}{2\theta^2}\,(s_m-t_m)^2+\frac{(t_{m+1}-s_m)^2}{2}\\
&\quad+(s_m-t_m)(t_{m+1}-s_m)\Big]+|1-\theta|\Big[\frac{|\theta^2-1|}{2\theta^2}\,(s_m-t_m)^2+\frac{(t_{m+1}-s_m)^2}{2}\\
&\quad+(s_m-t_m)(t_{m+1}-s_m)\Big]+\frac{(s_{m+1}-t_{m+1})^2}{2}\Big\}\\
&= t_{m+2}-s_{m+1}.
\end{aligned}$$
Hence, we showed (20.3.28). Then, we have that
$$\|x_{m+2}-x_{m+1}\|\le\|x_{m+2}-y_{m+1}\|+\|y_{m+1}-x_{m+1}\|\le t_{m+2}-s_{m+1}+s_{m+1}-t_{m+1}=t_{m+2}-t_{m+1},$$
which shows (20.3.29). We also have that
$$\|x_{m+2}-x_0\|\le\|x_{m+2}-x_{m+1}\|+\|x_{m+1}-x_m\|+\cdots+\|x_1-x_0\|\le t_{m+2}-t_{m+1}+t_{m+1}-t_m+\cdots+t_1-t_0=t_{m+2}<t^*.$$
Hence, we get $x_{m+2}\in U(x_0,t^*)$.

We showed in Lemma 20.3.1 that the sequences $\{t_n\}$, $\{s_n\}$ are complete. Hence, it follows from (20.3.27)–(20.3.29) that the sequences $\{x_n\}$, $\{y_n\}$ are complete in the Banach space $X$, and as such they converge to some $x^*\in \bar U(x_0,t^*)$ (since $\bar U(x_0,t^*)$ is a closed set). By letting $m\to\infty$ in (20.3.41), we obtain $F(x^*)=0$. Estimate (20.3.30) follows from (20.3.29) by using standard majorization techniques [1, 3, 11, 15, 18, 19]. Let us show uniqueness, first in $U(x_0,t^*)\cap D$. Let $y^*\in U(x_0,t^*)$ be such that $F(y^*)=0$. Set $Q=\int_0^1 F'(x^*+t(y^*-x^*))\,dt$. Then, using (2) in (C′) we get that
$$\begin{aligned}
\|F'(x_0)^{-1}(F'(x_0)-Q)\| &\le K_0\int_0^1\|x^*+t(y^*-x^*)-x_0\|\,dt\\
&\le K_0\int_0^1\big[(1-t)\,\|x^*-x_0\|+t\,\|y^*-x_0\|\big]\,dt\\
&\le K_0\,t^*<1.
\end{aligned}$$
It follows that $Q^{-1}$ exists. Then, from the identity $0=F(x^*)-F(y^*)=Q\,(x^*-y^*)$ we deduce that $x^*=y^*$. Similarly, if $F(y^*)=0$ and $y^*\in U(x_0,R)$, we have that
$$\|F'(x_0)^{-1}(F'(x_0)-Q)\|\le\frac{K_0}{2}\,(R+t^*)<1,$$
by (20.3.31). Hence, again we deduce that $x^*=y^*$. $\Box$
Remark 20.3.3. (a) It follows from the proof of Theorem 20.3.2 that the sequences $\{\bar t_n\}$, $\{\bar s_n\}$ defined by
$$\bar t_0=0,\qquad \bar s_0=\bar t_0+\theta\eta,$$
$$\bar t_1=\bar s_0+\left[\frac{|1-\theta|}{|\theta|^3}\,(1+|1-\theta^2|)+\frac{(\bar s_0-\bar t_0)\,K_0}{2\theta^2}\right](\bar s_0-\bar t_0),$$
$$\bar s_1=\bar t_1+\frac{|\theta|}{1-K_0\,\bar t_1}\left[\frac{K}{2}\,\frac{|\theta^2-1|}{\theta^2}\,(\bar s_0-\bar t_0)^2+\frac{K}{2}\,(\bar t_1-\bar s_0)^2+K_0\,(\bar s_0-\bar t_0)(\bar t_1-\bar s_0)\right],$$
$$\bar s_{n+1}=\bar t_{n+1}+\frac{K|\theta|}{1-K_0\,\bar t_{n+1}}\left[\frac{|\theta^2-1|}{2\theta^2}\,(\bar s_n-\bar t_n)^2+\frac{(\bar t_{n+1}-\bar s_n)^2}{2}+(\bar s_n-\bar t_n)(\bar t_{n+1}-\bar s_n)\right],$$
$$\begin{aligned}
\bar t_{n+2}=\bar s_{n+1}+\frac{K}{\theta^2\,(1-K_0\,\bar t_{n+1})}\Big\{&|1-\theta|(1+|1-\theta^2|)\Big[\frac{|\theta^2-1|}{2\theta^2}\,(\bar s_n-\bar t_n)^2+\frac{(\bar t_{n+1}-\bar s_n)^2}{2}\\
&+(\bar s_n-\bar t_n)(\bar t_{n+1}-\bar s_n)\Big]+\frac{(\bar s_{n+1}-\bar t_{n+1})^2}{2}\Big\}\quad\text{for each }n=1,2,\cdots
\end{aligned}$$
are such that, by a simple induction argument,
$$\bar s_n\le s_n,\qquad \bar t_n\le t_n,\qquad \bar s_n-\bar t_n\le s_n-t_n,\qquad \bar t_{n+1}-\bar s_n\le t_{n+1}-s_n$$
and
$$\bar t^*=\lim_{n\to\infty}\bar t_n\le t^*.$$
Clearly, $\{\bar t_n\}$, $\{\bar s_n\}$, $\bar t^*$ can replace $\{t_n\}$, $\{s_n\}$, $t^*$ in Theorem 20.3.2.

(b) The limit point $t^*$ can be replaced by $t^{**}$, given in closed form by (20.3.9).

(c) The criteria ($\Delta$), or (20.2.9) and (20.2.11), are sufficient for the convergence of the third order method (20.1.2). However, these criteria are not also necessary. In practice, we shall test to see which of these criteria are satisfied (if any) and then use the best possible error bounds and uniqueness results (see also the numerical examples in the next section).

20.4. Numerical Examples


Example 20.4.1. Let x ∈ D, X = Y = R, x0 = 1 and D = U(1, 1). Define function F on D
by
F(x) = x3 − 0.49. (20.4.1)
Then, we get that
1
β= η = 0.17, M = 12.
3
Now choosing θ = 1.15 we obtain that

a = 0.68, α = 0.68, γ = 0.34

and as a consequence a1 b1 = 134.091 ≤ 1 condition (20.2.9) is violated. Hence, there is no


guarantee under the conditions given in [4] that sequence {xn } converges to x∗ . Calculating
now δ f and δg , the smallest solutions of the polynomials f (t) and g(t) given in (20.3.1) and
(20.3.2) respectively between 0 and 1, we obtain that

\delta = \min\{\delta_f, \delta_g\} = 0.4104586\ldots

Moreover, we observe that the ∆ conditions are satisfied since



\frac{|1-\theta|}{\theta^3}(1+|1-\theta^2|) + \frac{K\eta}{2\theta^2} = 0.278261\ldots \le \delta,
0 < \frac{K|\theta|}{1-K_0(1+\delta)s_0}\left[\frac{|\theta^2-1|}{2\theta^2} + \frac{\delta^2}{2} + \delta\right](s_0-t_0) = 0.360324\ldots \le \delta
and
0 < \frac{K}{\theta^2(1-K_0(1+\delta)s_0)}\left\{|1-\theta|(1+|1-\theta^2|)\left[\frac{|\theta^2-1|}{2\theta^2} + \frac{\delta^2}{2} + \delta\right] + \frac{\delta^2}{2}\right\}(s_0-t_0) = 0.136162\ldots \le 0.168476\ldots = \delta^2.
Consequently, convergence to the solution is guaranteed by Theorem 20.3.2. Moreover,
the computational order of convergence (COC) is shown in Table 20.4.1. Here (COC) is
defined by
\rho \approx \ln\Big(\frac{\|x_{n+1}-x^\star\|_\infty}{\|x_n-x^\star\|_\infty}\Big) \Big/ \ln\Big(\frac{\|x_n-x^\star\|_\infty}{\|x_{n-1}-x^\star\|_\infty}\Big), \quad n \in \mathbb{N}.
Table 20.4.1 shows the COC.
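As a minimal illustration of how such a table is produced, the quotient above can be evaluated by a short Python routine once the iterates and the solution are available. The Newton iterates generated below serve only as a stand-in sequence (so the estimated order tends to 2 rather than the 3 of method (20.1.2)); any list of iterates may be supplied instead.

import numpy as np

def coc(xs, x_star):
    # rho_n = ln(e_{n+1}/e_n) / ln(e_n/e_{n-1}), with e_n = |x_n - x*|
    e = [abs(x - x_star) for x in xs]
    return [np.log(e[n + 1] / e[n]) / np.log(e[n] / e[n - 1])
            for n in range(1, len(e) - 1)]

F = lambda x: x ** 3 - 0.49
dF = lambda x: 3 * x ** 2
x_star = 0.49 ** (1.0 / 3.0)
xs = [1.0]                      # x0 = 1 as in (20.4.1)
for _ in range(4):              # Newton iterates: stand-in sequence only
    xs.append(xs[-1] - F(xs[-1]) / dF(xs[-1]))
print(coc(xs, x_star))          # values approach 2 for Newton's method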

Example 20.4.2. Let X = Y = C [0, 1], the space of continuous functions defined in [0, 1]
equipped with the max-norm. Let Ω = {x ∈ C [0, 1]; kxk ≤ R}, such that R > 1 and F defined
on Ω and given by
F(x)(s) = x(s) - f(s) - \lambda\int_0^1 G(s,t)x(t)^3\,dt, \quad x \in C[0,1],\ s \in [0,1],

Table 20.4.1. COC for Example 1 using θ = 1.15

n COC
1 2.73851
2 2.99157
3 2.99999
4 3.00000
5 3.00000
ρ = 3.00000

where f ∈ C [0, 1] is a given function, λ is a real constant and the kernel G is the Green
function
G(s,t) = \begin{cases} (1-s)t, & t \le s, \\ s(1-t), & s \le t. \end{cases}
In this case, for each x ∈ Ω, F 0 (x) is a linear operator defined on Ω by the following
expression:
[F'(x)(v)](s) = v(s) - 3\lambda\int_0^1 G(s,t)x(t)^2 v(t)\,dt, \quad v \in C[0,1],\ s \in [0,1].

If we choose x0 (s) = f (s) = 1, it follows kI − F 0 (x0 )k ≤ 3|λ|/8. Thus, if |λ| < 8/3, F 0 (x0 )−1
is defined and
\|F'(x_0)^{-1}\| \le \frac{8}{8-3|\lambda|}.
Moreover,
\|F(x_0)\| \le \frac{|\lambda|}{8}, \qquad \|F'(x_0)^{-1}F(x_0)\| \le \frac{|\lambda|}{8-3|\lambda|}.
On the other hand, for x, y ∈ Ω we have
[(F'(x) - F'(y))v](s) = 3\lambda\int_0^1 G(s,t)(x(t)^2 - y(t)^2)v(t)\,dt

and for x ∈ Ω we get in turn that

\|F''(x)\| \le \frac{6|\lambda|}{8}.
Consequently,

\|F'(x)-F'(y)\| \le \frac{3|\lambda|(\|x\|+\|y\|)}{8}\|x-y\| \le \frac{6R|\lambda|}{8}\|x-y\|,
\|F'(x)-F'(1)\| \le \frac{1+3|\lambda|(\|x\|+1)}{8}\|x-1\| \le \frac{1+3(1+R)|\lambda|}{8}\|x-1\|.
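The factor 1/8 behind the bounds 3|\lambda|/8 and 8/(8-3|\lambda|) above is \max_{s\in[0,1]}\int_0^1 G(s,t)\,dt = \max_s s(1-s)/2 = 1/8. A quick numerical check (an illustration only) in Python:

import numpy as np

s = np.linspace(0.0, 1.0, 1001)[:, None]
t = np.linspace(0.0, 1.0, 2001)[None, :]
G = np.where(t <= s, (1.0 - s) * t, s * (1.0 - t))
# mean over t approximates the integral on [0, 1] for each s
print(G.mean(axis=1).max())     # ~0.125 = 1/8, attained near s = 1/2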

Choosing λ = 1.5, R = 4.4 and θ = 1.1 we have

β = 0.677966 . . .,

η = 0.127119 . . .,
M = 4.95,
a = 0.426602 . . .,
α = 1.16529 . . .,
and
γ = 0.213301 . . .
So, as a_1 b_1 = 1.25402 > 1, condition (20.2.9) is violated. Hence, there is no guarantee
under the conditions given in [4] that sequence {xn } converges to x∗ . Calculating now δ f
and δg , the smallest solutions of the polynomials f (t) and g(t) given in (20.3.1) and (20.3.2)
respectively between 0 and 1, we obtain that

δ = min{δ f , δg } = 0.370693 . . .

Moreover, we observe that the ∆ conditions are satisfied since



\frac{|1-\theta|}{\theta^3}(1+|1-\theta^2|) + \frac{K\eta}{2\theta^2} = 0.284819\ldots \le \delta,
0 < \frac{K|\theta|}{1-K_0(1+\delta)s_0}\left[\frac{|\theta^2-1|}{2\theta^2} + \frac{\delta^2}{2} + \delta\right](s_0-t_0) = 0.334767\ldots \le \delta
and
0 < \frac{K}{\theta^2(1-K_0(1+\delta)s_0)}\left\{|1-\theta|(1+|1-\theta^2|)\left[\frac{|\theta^2-1|}{2\theta^2} + \frac{\delta^2}{2} + \delta\right] + \frac{\delta^2}{2}\right\}(s_0-t_0) = 0.0871515\ldots \le 0.137413\ldots = \delta^2.
Consequently, convergence to the solution is guaranteed by Theorem 20.3.2.
References

[1] Argyros, I.K., Computational theory of iterative methods, Series: Studies in Compu-
tational Mathematics, 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co. New
York, U.S.A, 2007.

[2] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton’s method.
J. Complexity, 28 (2012), 364–387.

[3] Argyros, I.K., Hilout, S., Computational methods in nonlinear analysis, World Scien-
tific Publ. Comp., New Jersey, USA 2013.

[4] Chun, C., Stanica, P., Neta, B., Third order family of methods in Banach spaces, Comput. Math. Appl., 61 (2011), 1665-1675.

[5] Candela, V., Marquina, A., Recurrence relations for rational cubic methods I: The
Halley method, Computing, 44 (1990), 169-184.

[6] Candela, V., Marquina, A., Recurrence relations for rational cubic methods II: The
Chebyshev method, Computing, 45 (1990), 355-367.

[7] Ezquerro, J. A., Hernández, M.A., Recurrence relations for Chebyshev-type methods,
Appl. Math. Optim., 41 (2000), 227-236.

[8] Gutiérrez, J.M., Hernández, M.A., Recurrence relations for the super-Halley method,
Computers Math. Applic., 36 (1998), 1-8.

[9] Gutiérrez, J.M., Hernández, M.A., Third-order iterative methods for operators with
bounded second derivative, J. Comp. Appl. Math., 82 (1997), 171-183.

[10] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.

[11] Kantorovich, L.V., G.P. Akilov, Functional Analysis, Pergamon Press, Oxford, 1982.

[12] Kou, J.S., Li, T., Modified Chebyshev’s method free from second derivative for non-
linear equations, App. Math. Comp., 187 (2007), 1027-1032.

[13] Kou, J.S., Li, T., Wang, X.H., A modification of Newton method with third-order
convergence, App. Math. Comp., 181 (2006), 1106-1111.

[14] Magreñán, Á. A., Different anomalies in a Jarratt family of iterative root-finding meth-
ods, App. Math. Comp., 233 (2014), 29–38.

[15] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic press, New York, 1970.

[16] Parida, P.K., Study of some third order methods for nonlinear equations in Banach
spaces, Ph.D. Dissertation, Indian Institute of Technology, Department of Mathemat-
ics, Kharagpur, India, 2007.

[17] Parida, P.K., Gupta, D.K., Recurrence relations for semilocal convergence of a
Newton-like method in Banach spaces, J. Math. Anal. Applic., 345 (2008), 350-361.

[18] Potra, F.A., Pták, V., Nondiscrete induction and iterative processes, in Research Notes
in Mathematics, 103, Pitman, Boston, 1984.

[19] Rall, L.B., Computational solution of Nonlinear operator equations, Robert E.


Krieger, New York, 1979.

[20] Wu, Q., Zhao, Y., Third order convergence theorem by using majorizing function for
a modified Newton method in Banach space, App. Math. Comp., 175 (2006), 1515-
1524.
Chapter 21

Local Convergence of Modified


Halley-Like Methods with Less
Computation of Inversion

21.1. Introduction
In this chapter we are concerned with the problem of approximating a solution x∗ of the
nonlinear equation
F(x) = 0, (21.1.1)
where F is a Fréchet-differentiable operator defined on a subset D of a Banach space X with
values in a Banach space Y.
Many problems in computational sciences and other disciplines can be brought in a
form like (21.1.1) using mathematical modeling [3]. The solutions of equation (21.1.1) can
rarely be found in closed form. That is why most solution methods for these equations are
usually iterative. In particular, the practice of Numerical Functional Analysis for finding
such solutions is essentially connected to Newton-like methods [1]-[20]. The study of the convergence of iterative procedures is usually based on two types: semilocal and local convergence analyses. The semilocal convergence analysis uses information around an initial point to give conditions ensuring the convergence of the iterative procedure, while the local analysis uses information around a solution to find estimates of the radii of convergence balls. There exist many studies which deal with the local and semilocal convergence analyses of Newton-like methods such as [1]-[20].
We present a local convergence analysis for the modified Halley-Like Method [30] de-
fined for each n = 0, 1, 2, · · · by

y_n = x_n - F'(x_n)^{-1}F(x_n),
u_n = x_n - \theta F'(x_n)^{-1}F(x_n) = y_n + (1-\theta)F'(x_n)^{-1}F(x_n),  (21.1.2)
z_n = y_n - \gamma A_{\theta,n}F'(x_n)^{-1}F(x_n),
x_{n+1} = z_n - \alpha B_{\theta,n}F'(x_n)^{-1}F(z_n),

where x_0 is an initial point, \alpha, \gamma, \theta \in (-\infty,\infty) - \{0\} are given parameters, H_{\theta,n} = \frac{1}{\theta}F'(x_n)^{-1}(F'(u_n) - F'(x_n)), A_{\theta,n} = I - \frac{1}{2}H_{\theta,n}(I - \frac{1}{2}H_{\theta,n}) and B_{\theta,n} = I - H_{1,n} + H_{\theta,n}^2. The
semilocal convergence of method (21.1.2) was studied in [30] in the special case when
α = γ = 1 and θ ∈ [0, 1]. Moreover, if γ = 1, α = 0 and θ ∈ (0, 1], the semilocal conver-
gence of the resulting method (21.1.2) was given in [30].
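For concreteness, a minimal scalar (X = Y = R) Python sketch of method (21.1.2) follows; the parameter values in the example call are placeholders, and the point of the method is that only F'(x_n) is inverted (here: divided by) at each step.

import math

def halley_like(F, dF, x0, theta, gamma, alpha, tol=1e-12, nmax=100):
    x = x0
    for _ in range(nmax):
        inv = 1.0 / dF(x)                 # the single inversion per step
        y = x - inv * F(x)
        u = x - theta * inv * F(x)
        H = inv * (dF(u) - dF(x)) / theta
        H1 = inv * (dF(y) - dF(x))        # H_{1,n}, i.e. H with theta = 1
        A = 1.0 - 0.5 * H * (1.0 - 0.5 * H)
        B = 1.0 - H1 + H ** 2
        z = y - gamma * A * inv * F(x)
        x_new = z - alpha * B * inv * F(z)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

print(halley_like(math.sin, math.cos, 0.5, 0.75, 0.3, 0.03))  # tends to x* = 0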
The semilocal convergence results in [30] were given in a non-affine invariant form.
However, the results obtained in our chapter are given in affine invariant form. The sufficient
semilocal convergence conditions (given in affine invariant form) used in [30] are (C ):

(C1 ) There exists F 0 (x0 )−1 ∈ L(Y, X) and kF 0 (x0 )−1 k ≤ β;

(C2 )
kF 0 (x0 )−1 F(x0 )k ≤ β1 ;

(C3 )
kF 0 (x0 )−1 F 00 (x)k ≤ β2 for each x ∈ D;

(C4 )

kF 0 (x0 )−1 (F 00 (x) − F 00 (y))k ≤ β3 kx − ykq


for each x, y ∈ D, and some q ∈ [0, 1].

Under the (C ) conditions for α = γ = 1 and θ ∈ (0, 1] the convergence order was shown
to be 3 + 2q in [30]. Moreover, for γ = 1, α = 0 and θ ∈ (0, 1] the convergence order was
shown to be 2 + q in [10].
Similar conditions have been used by several authors on other high convergence or-
der methods [1]-[20]. The corresponding conditions for the local convergence analysis are
given by simply replacing x0 by x∗ in the preceding (C ) conditions. These conditions how-
ever are very restrictive. As a motivational example, let us define function f on D = [-\frac{1}{2}, \frac{5}{2}] by
f(x) = \begin{cases} x^3 \ln x^2 + x^5 - x^4, & x \neq 0, \\ 0, & x = 0. \end{cases}
Choose x∗ = 1. We have that

f'(x) = 3x^2 \ln x^2 + 5x^4 - 4x^3 + 2x^2, \quad f'(1) = 3,
f''(x) = 6x \ln x^2 + 20x^3 - 12x^2 + 10x,
f'''(x) = 6 \ln x^2 + 60x^2 - 24x + 22.

Then, e.g., function f cannot satisfy condition (C_4), say for q = 1, since f''' is unbounded on D. In the present chapter we only use hypotheses on the first Fréchet derivative (see conditions (21.2.12)-(21.2.15)). Notice also that [30] used \theta \in (0, 1] and \gamma = \alpha = 1, whereas in this chapter \theta can belong to a wider interval than (0, 1]. This way we expand the applicability of method (21.1.2).
The chapter is organized as follows. The local convergence of method (21.1.2) is given
in Section 21.2, whereas the numerical examples are given in Section 21.3. Finally, some
remarks are given in the concluding Section 21.4.

21.2. Local Convergence Analysis


We present the local convergence analysis of method (21.1.2) in this section. Denote by
U(v, ρ), Ū(v, ρ) the open and closed balls, respectively, in X of center v ∈ X and of radius
ρ > 0.
Let L0 > 0, L > 0, θ ∈ (−∞, ∞) − {0}, α, γ ∈ (−∞, ∞) and M > 0 be given parameters.
Define the following functions on the interval [0, \frac{1}{L_0}):
g_1(r) = \frac{Lr}{2(1 - L_0 r)},
g_2(r) = g_1(r) + \frac{M|1 - \theta|}{1 - L_0 r},
g_3(r) = \frac{L_0(1 + g_2(r))}{2|\theta|(1 - L_0 r)},
g_4(r) = 1 + g_3(r)r + g_3^2(r)r^2,
g_5(r) = g_1(r) + \frac{|\gamma|Mg_4(r)}{1 - L_0 r},
g_6(r) = 1 + 2g_{1,3}(r)r + 4g_3^2(r)r^2, \quad \text{where} \quad g_{1,3}(r) = \frac{L_0(1 + g_1(r))}{2(1 - L_0 r)},
and
g_7(r) = \Big[1 + \frac{|\alpha|Mg_6(r)}{1 - L_0 r}\Big]g_5(r).
Moreover, define the parameter
r_2 = \frac{2(1 - M|1-\theta|)}{2L_0 + L}.
Suppose
M|1 − θ| < 1.
Then, it follows from the definition of functions g1 and g2 that

0 < g1 (r) < 1, and 0 < g2 (r) < 1, for each r ∈ (0, r2).
Evidently, g_5(r) \in (0, 1) for each r \in (0, r_5), with r_5 < \frac{1}{L_0} to be determined, provided that
0 < g_1(r) + \frac{|\gamma|g_4(r)M}{1 - L_0 r} < 1 \quad \text{for each } r \in (0, r_5).

Define the function p_5 on the interval [0, \frac{1}{L_0}] by
p_5(r) = |\gamma|Mg_4(r) - (1 - L_0 r)(1 - g_1(r)).
We have that
p_5\big((\tfrac{1}{L_0})^-\big) = |\gamma|Mg_4\big((\tfrac{1}{L_0})^-\big) > 0.

Suppose that
|γ|M < 1.
Then, we have that
p5 (0) = M|γ| − 1 < 0.
It follows from the intermediate value theorem that function p_5 has zeros in the interval (0, \frac{1}{L_0}). Denote by r_5 the smallest such zero. Then, we have that

p5 (r) < 0 ⇒ 0 < g5 (r) < 1 for each r ∈ (0, r5).


Similarly, g_7(r) \in (0, 1) for each r \in (0, r_7), with r_7 < \frac{1}{L_0} to be determined, provided that p_7(r) < 0 for each r \in [0, r_7), where

p7 (r) = (1 − L0 r + |α|Mg6 (r))g5(r) − (1 − L0 r).

We get that
p_7\big((\tfrac{1}{L_0})^-\big) = |\alpha|Mg_6\big((\tfrac{1}{L_0})^-\big)\,g_5\big((\tfrac{1}{L_0})^-\big) > 0
and
p_7(0) = (1 + |\alpha|Mg_6(0))g_5(0) - 1 = (1 + |\alpha|M)|\gamma|M - 1.
Suppose that
(1 + |α|M)|γ|M < 1.
Then, we have p_7(0) < 0. It follows that function p_7 has zeros in the interval (0, \frac{1}{L_0}). Denote by r_7 the smallest such zero. Then, we obtain that

p_7(r) < 0 \Rightarrow 0 < g_7(r) < 1 \quad \text{for each } r \in (0, r_7).

Set
r∗ = min{r2 , r5 , r7 }. (21.2.1)
Then, we have that

0 < g1 (r) < 1, (21.2.2)


0 < g2 (r) < 1 (21.2.3)
0 < g3 (r) (21.2.4)
0 < g4 (r) (21.2.5)
0 < g5 (r) < 1 (21.2.6)
0 < g6 (r) (21.2.7)

and
0 < g7 (r) < 1, for each r ∈ (0, r∗ ). (21.2.8)
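The radius r^* can be computed numerically from these definitions. The following Python sketch (an illustration; it assumes (21.2.9)-(21.2.11) hold for the supplied parameters, so p_5(0) < 0 and p_7(0) < 0, and uses bisection, which is adequate here since p_5, p_7 tend to +\infty as r \to (1/L_0)^-):

def radius(L0, L, M, theta, gamma, alpha):
    g1 = lambda r: L * r / (2 * (1 - L0 * r))
    g2 = lambda r: g1(r) + M * abs(1 - theta) / (1 - L0 * r)
    g3 = lambda r: L0 * (1 + g2(r)) / (2 * abs(theta) * (1 - L0 * r))
    g4 = lambda r: 1 + g3(r) * r + g3(r) ** 2 * r ** 2
    g5 = lambda r: g1(r) + abs(gamma) * M * g4(r) / (1 - L0 * r)
    g13 = lambda r: L0 * (1 + g1(r)) / (2 * (1 - L0 * r))
    g6 = lambda r: 1 + 2 * g13(r) * r + 4 * g3(r) ** 2 * r ** 2
    p5 = lambda r: abs(gamma) * M * g4(r) - (1 - L0 * r) * (1 - g1(r))
    p7 = lambda r: (1 - L0 * r + abs(alpha) * M * g6(r)) * g5(r) - (1 - L0 * r)

    def smallest_zero(p):           # bisection on (0, 1/L0); p(0) < 0
        a, b = 0.0, 1.0 / L0 - 1e-12
        for _ in range(200):
            m = 0.5 * (a + b)
            a, b = (m, b) if p(m) < 0 else (a, m)
        return 0.5 * (a + b)

    r2 = 2 * (1 - M * abs(1 - theta)) / (2 * L0 + L)
    return min(r2, smallest_zero(p5), smallest_zero(p7))

print(radius(1.0, 1.0, 1.2, 0.9, 0.1, 0.05))   # illustrative parameters only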
Next, we present the local convergence analysis of method (21.1.2).

Theorem 21.2.1. Let F : D ⊆ X → Y be a Fréchet-differentiable operator. Suppose that


there exist x∗ ∈ D, parameters L0 > 0, L > 0, M > 0, θ ∈ (−∞, ∞) − {0} and α, γ ∈ (−∞, ∞)
such that for each x ∈ D
M|1 − θ| < 1, (21.2.9)
M|γ| < 1, (21.2.10)
(1 + |α|M)|γ|M < 1, (21.2.11)
F(x^*) = 0, \quad F'(x^*)^{-1} \in L(Y, X),  (21.2.12)
\|F'(x^*)^{-1}(F'(x) - F'(x^*))\| \le L_0\|x - x^*\|,  (21.2.13)
\|F'(x^*)^{-1}(F(x) - F(x^*) - F'(x)(x - x^*))\| \le \frac{L}{2}\|x - x^*\|^2,  (21.2.14)
\|F'(x^*)^{-1}F'(x)\| \le M  (21.2.15)
and
Ū(x∗ , r∗ ) ⊆ D, (21.2.16)
where r∗ is given in (21.2.1). Then, the sequence {xn } generated by method (21.1.2) for
x0 ∈ U(x∗ , r∗ ) is well defined, remains in U(x∗ , r∗ ) for each n = 0, 1, 2, · · · and converges to
x∗ . Moreover, the following estimates hold for each n = 0, 1, 2, · · · ,

kyn − x∗ k ≤ g1 (kxn − x∗ k)kxn − x∗ k < kxn − x∗ k < r∗ , (21.2.17)

kun − x∗ k ≤ g2 (kxn − x∗ k)kxn − x∗ k < kxn − x∗ k, (21.2.18)


∗ ∗
kHθ,n k ≤ 2g3 (kxn − x k)kxn − x k, (21.2.19)
kAθ,n k ≤ g4 (kxn − x∗ k) (21.2.20)
kzn − x∗ k ≤ g5 (kxn − x∗ k)kxn − x∗ k < kxn − x∗ k, (21.2.21)
kBθ,n k ≤ g6 (kxn − x∗ k) (21.2.22)
and
kxn+1 − x∗ k ≤ g7 (kxn − x∗ k)kxn − x∗ k < kxn − x∗ k. (21.2.23)
where the "g" functions are defined above Theorem 21.2.1.
Proof. Using (21.2.13), the definition of r∗ and the hypothesis x0 ∈ U(x∗ , r∗ ) we get that

kF 0 (x∗ )−1 (F 0 (x0 ) − F 0 (x∗ ))k ≤ L0 kx0 − x∗ k < L0 r∗ < 1. (21.2.24)

It follows from (21.2.24) and the Banach Lemma on invertible operators [3] that F'(x_0)^{-1} \in L(Y, X) and
\|F'(x_0)^{-1}F'(x^*)\| \le \frac{1}{1 - L_0\|x_0 - x^*\|} < \frac{1}{1 - L_0 r^*}.  (21.2.25)

Hence, y0 and u0 are well defined. Using the first substep in method (21.1.2) for n = 0,
(21.2.2), (21.2.14), (21.2.25) and the definition of function g1 we obtain in turn that

y0 − x∗ = x0 − x∗ − F 0 (x0 )−1 F(x0 )


= −F 0 (x0 )−1 F 0 (x∗ )F 0 (x∗ )−1 [F(x0 ) − F(x∗ ) − F 0 (x0 )(x0 − x∗ )]

so,
\|y_0 - x^*\| \le \|F'(x_0)^{-1}F'(x^*)\| \|F'(x^*)^{-1}[F(x_0) - F(x^*) - F'(x_0)(x_0 - x^*)]\| \le \frac{L\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)} = g_1(\|x_0 - x^*\|)\|x_0 - x^*\| < \|x_0 - x^*\| < r^*,

which shows (21.2.17) for n = 0. We also have from the second substep of method (21.1.2)
for n = 0, (21.2.9), (21.2.15), (21.2.17) and the definition of functions g1 and g2 that

\|u_0 - x^*\| \le \|y_0 - x^*\| + |1-\theta| \|F'(x_0)^{-1}F'(x^*)\| \Big\|\int_0^1 F'(x^*)^{-1}F'(x^* + t(x_0 - x^*))\,dt\Big\| \|x_0 - x^*\|
\le \Big[g_1(\|x_0 - x^*\|) + \frac{M|1-\theta|}{1 - L_0\|x_0 - x^*\|}\Big]\|x_0 - x^*\|
= g_2(\|x_0 - x^*\|)\|x_0 - x^*\| < \|x_0 - x^*\| < r^*,  (21.2.26)

which shows (21.2.18) for n = 0.


Next, we need an estimate on \frac{1}{2}\|H_{\theta,0}\|. We have from (21.2.4), (21.2.13), (21.2.25), (21.2.26) and the definition of functions g_2 and g_3 that
\frac{1}{2}\|H_{\theta,0}\| \le \frac{1}{2|\theta|}\|F'(x_0)^{-1}F'(x^*)\|\big(\|F'(x^*)^{-1}(F'(u_0) - F'(x^*))\| + \|F'(x^*)^{-1}(F'(x_0) - F'(x^*))\|\big)
\le \frac{L_0(\|u_0 - x^*\| + \|x_0 - x^*\|)}{2|\theta|(1 - L_0\|x_0 - x^*\|)}
\le \frac{L_0(\|x_0 - x^*\| + g_2(\|x_0 - x^*\|)\|x_0 - x^*\|)}{2|\theta|(1 - L_0\|x_0 - x^*\|)}
= \frac{L_0(1 + g_2(\|x_0 - x^*\|))\|x_0 - x^*\|}{2|\theta|(1 - L_0\|x_0 - x^*\|)}
= g_3(\|x_0 - x^*\|)\|x_0 - x^*\|,  (21.2.27)

which shows (21.2.19) for n = 0. We also need an estimate on kAθ,0 k. It follows from
(21.2.27) and the definition of Aθ,0 , g3 , g4 that

\|A_{\theta,0}\| \le 1 + \frac{1}{2}\|H_{\theta,0}\| + \frac{1}{4}\|H_{\theta,0}\|^2
\le 1 + g_3(\|x_0 - x^*\|)\|x_0 - x^*\| + g_3^2(\|x_0 - x^*\|)\|x_0 - x^*\|^2
= g_4(\|x_0 - x^*\|),  (21.2.28)

which shows (21.2.20) for n = 0. Then, from the third substep of method (21.1.2) for n = 0,

(21.2.19), (21.2.20), (21.2.28), the definition of functions g_1, g_5 and radius r^*, we have that
\|z_0 - x^*\| \le \|y_0 - x^*\| + |\gamma| \|A_{\theta,0}\| \|F'(x_0)^{-1}F'(x^*)\| \Big\|\int_0^1 F'(x^*)^{-1}F'(x^* + t(x_0 - x^*))\,dt\Big\| \|x_0 - x^*\|
\le \Big[g_1(\|x_0 - x^*\|) + \frac{M|\gamma|g_4(\|x_0 - x^*\|)}{1 - L_0\|x_0 - x^*\|}\Big]\|x_0 - x^*\|
= g_5(\|x_0 - x^*\|)\|x_0 - x^*\| < \|x_0 - x^*\| < r^*,  (21.2.29)

which shows (21.2.21) for n = 0. Next, we need an estimate on kBθ,0 k. We have by the
definition of operator Bθ,0 and functions g1,3 , g3 , g6 that

\|B_{\theta,0}\| \le 1 + 2g_{1,3}(\|x_0 - x^*\|)\|x_0 - x^*\| + 4g_3^2(\|x_0 - x^*\|)\|x_0 - x^*\|^2 = g_6(\|x_0 - x^*\|),  (21.2.30)
which shows (21.2.22) for n = 0. Using the fourth substep in method (21.1.2) for n = 0,
(21.2.3), (21.2.15), (21.2.21), (21.2.22), (21.2.29), the definition of functions g_5, g_6, g_7 and radius r^*, we obtain that
\|x_1 - x^*\| \le \|z_0 - x^*\| + |\alpha| \|B_{\theta,0}\| \|F'(x_0)^{-1}F'(x^*)\| \Big\|\int_0^1 F'(x^*)^{-1}F'(x^* + t(z_0 - x^*))\,dt\Big\| \|z_0 - x^*\|
\le \Big(1 + \frac{M|\alpha|g_6(\|x_0 - x^*\|)}{1 - L_0\|x_0 - x^*\|}\Big)\|z_0 - x^*\|
\le \Big(1 + \frac{M|\alpha|g_6(\|x_0 - x^*\|)}{1 - L_0\|x_0 - x^*\|}\Big)g_5(\|x_0 - x^*\|)\|x_0 - x^*\|,  (21.2.31)
which shows (21.2.23) for n = 0. By simply replacing y0 , u0 , z0 , x1 by yk , uk , zk , xk+1 in the
preceding estimates we arrive at estimates (21.2.17)-(21.2.23). Finally, from the estimate
kxk+1 − x∗ k < kxk − x∗ k, we deduce that limk→∞ xk = x∗ .

Remark 21.2.2. 1. In view of (21.2.13) and the estimate

kF 0 (x∗ )−1 F 0 (x)k = kF 0 (x∗ )−1 (F 0 (x) − F 0 (x∗ )) + Ik


≤ 1 + kF 0 (x∗ )−1 (F 0 (x) − F 0 (x∗ ))k ≤ 1 + L0 kx − x∗ k

condition (21.2.15) can be dropped and M can be replaced by

M(r) = 1 + L0 r.

Moreover, condition (21.2.14) can be replaced by the popular but stronger conditions

kF 0 (x∗ )−1 (F 0 (x) − F 0 (y))k ≤ Lkx − yk for each x, y ∈ D (21.2.32)

or

kF 0 (x∗ )−1 (F 0 (x∗ + t(x − x∗ )) − F 0 (x))k ≤ L(1 − t)kx − x∗ k for each


x, y ∈ D and t ∈ [0, 1].

2. The results obtained here can be used for operators F satisfying autonomous differ-
ential equations [3] of the form
F 0 (x) = P(F(x))
where P is a continuous operator. Then, since F 0 (x∗ ) = P(F(x∗ )) = P(0), we can
apply the results without actually knowing x∗ . For example, let F(x) = ex − 1. Then,
we can choose: P(x) = x + 1.
3. The local results obtained here can be used for projection methods such as the
Arnoldi’s method, the generalized minimum residual method (GMRES), the gener-
alized conjugate residual method (GCR) for combined Newton/finite projection methods and
in connection to the mesh independence principle can be used to develop the cheapest
and most efficient mesh refinement strategies [3, 4].
4. The radius r_A given by
r \le r_A = \frac{1}{L_0 + \frac{L}{2}}  (21.2.33)
was shown by us to be the convergence radius of Newton’s method [3, 4]
xn+1 = xn − F 0 (xn )−1 F(xn ) for each n = 0, 1, 2, · · · (21.2.34)
under the conditions (21.2.13) and (21.2.32). It follows from (21.2.1) and (21.2.33)
that the convergence radius r∗ of the method (21.1.2) cannot be larger than the con-
vergence radius rA of the second order Newton’s method (21.2.33). As already noted
in [3, 4] rA is at least as large as the convergence ball given by Rheinboldt [3, 4]
r_R = \frac{2}{3L}.  (21.2.35)
In particular, for L_0 < L we have that
r_R < r_A
and
\frac{r_R}{r_A} \to \frac{1}{3} \quad \text{as} \quad \frac{L_0}{L} \to 0.
That is our convergence ball rA is at most three times larger than Rheinboldt’s. The
same value for rR was given by Traub [3, 4].
5. It is worth noticing that method (21.1.2) is not changing when we use the conditions
of Theorem 21.2.1 instead of the stronger (C ) conditions used in [30]. Moreover, we
can compute the computational order of convergence (COC) defined by
\xi = \ln\Big(\frac{\|x_{n+1} - x^*\|}{\|x_n - x^*\|}\Big) \Big/ \ln\Big(\frac{\|x_n - x^*\|}{\|x_{n-1} - x^*\|}\Big)
or the approximate computational order of convergence
\xi_1 = \ln\Big(\frac{\|x_{n+1} - x_n\|}{\|x_n - x_{n-1}\|}\Big) \Big/ \ln\Big(\frac{\|x_n - x_{n-1}\|}{\|x_{n-1} - x_{n-2}\|}\Big).
This way we obtain in practice the order of convergence in a way that avoids the
bounds given in [30] involving estimates up to the second Fréchet derivative of oper-
ator F.

21.3. Numerical Examples


We present numerical examples in this section.

Example 21.3.1. Let X = Y = R^2, D = \bar{U}(0, 1), x^* = 0 and define function F on D by
F(x) = \Big(\sin x, \frac{1}{3}(e^x + 2x - 1)\Big).  (21.3.1)
Then, using (21.2.9)-(21.2.15), we get L_0 = L = 1, M = \frac{1}{3}(e + 2), \theta = \frac{4}{3}, \gamma = \frac{3}{5}, \alpha = \frac{3}{100}. Then, by (21.2.1) we obtain
r^* = 0.3161 < r_R = r_A = 0.6667.

Example 21.3.2. Let X = Y = R^3, D = U(0, 1). Define F on D for v = (x, y, z) by
F(v) = \Big(e^x - 1, \frac{e-1}{2}y^2 + y, z\Big).  (21.3.2)
Then, the Fréchet derivative is given by
F'(v) = \begin{pmatrix} e^x & 0 & 0 \\ 0 & (e-1)y + 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
Notice that x^* = (0, 0, 0), F'(x^*) = F'(x^*)^{-1} = diag\{1, 1, 1\}, L_0 = e - 1 < L = e, M = e, \theta = \frac{3}{4}, \gamma = \frac{3}{10}, \alpha = \frac{3}{100}. Then, by (21.2.1) we obtain
r^* = 0.2136 < r_R = 0.2453 < r_A = 0.3249.

Example 21.3.3. Returning to the motivational example at the introduction of this chapter, we see that conditions (21.2.12)–(21.2.15) are satisfied for x^* = 1, f'(x^*) = 3, f(1) = 0, L_0 = L = 146.6629073 and M = 101.5578008. Hence, the results of Theorem 21.2.1 can apply but not the ones in [30]. In particular, for \theta = 0.9902, \alpha = 0.008 and \gamma = 0.005 hypotheses (21.2.9)-(21.2.15) are satisfied. Moreover, we obtain

r∗ = 0.0032 < rR = 0.0045 ≤ rA = 0.0045.
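The radii r_A of (21.2.33) and r_R of (21.2.35) quoted in the three examples follow directly from the pairs (L_0, L); a one-line Python check (an added illustration):

import math

for L0, L in [(1.0, 1.0),                      # Example 21.3.1
              (math.e - 1.0, math.e),          # Example 21.3.2
              (146.6629073, 146.6629073)]:     # Example 21.3.3
    print(2.0 / (3.0 * L), 2.0 / (2.0 * L0 + L))
# (0.6667, 0.6667), (0.2453, 0.3249) and (0.0045, 0.0045), as quoted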

21.4. Conclusion
We present a local convergence analysis of Modified Halley-Like Methods with less com-
putation of inversion in order to approximate a solution of an equation in a Banach space
setting. Earlier convergence analyses are based on Lipschitz- and Hölder-type hypotheses up
to the second Fréchet-derivative [1]–[20]. In this chapter the local convergence analysis is
based only on Lipschitz hypotheses of the first Fréchet-derivative. Hence, the applicability
of these methods is expanded under less computational cost of the constants involved in the
convergence analysis.
References

[1] Ahmad, F., Hussain, S., Mir, N.A., A. Rafiq, New sixth order Jarratt method for solv-
ing nonlinear equations, Int. J. Appl. Math. Mech., 5(5) (2009), 27-35.

[2] Amat, S., Hernández, M.A., Romero, N., A modified Chebyshev’s iterative method
with at least sixth order of convergence, App. Math. Comp., 206(1) (2008), 164-174.

[3] Argyros, I.K., Convergence and Applications of Newton-type Iterations, Springer,


2008.

[4] Argyros, I. K., Hilout, S., A convergence analysis for directional two-step Newton
methods, Numer. Algor., 55 (2010), 503-528.

[5] Argyros, I. K., Magreñán, Á.A., On the convergence of an optimal fourth-order family
of methods and its dynamics. App. Math. Comp., 252 (2015), 336-346.

[6] Bruns, D.D., Bailey, J.E., Nonlinear feedback control for operating a nonisothermal
CSTR near an unstable steady state, Chem. Eng. Sci., 32 (1977), 257-264.

[7] Candela, V., Marquina, A., Recurrence relations for rational cubic methods I: The
Halley method, Computing, 44 (1990), 169-184.

[8] Candela, V., Marquina, A., Recurrence relations for rational cubic methods II: The
Chebyshev method, Computing, 45 (1990), 355-367.

[9] Chun, C., Some improvements of Jarratt’s method with sixth-order convergence, App.
Math. Comp., 190(2) (2007), 1432-1437.

[10] Ezquerro, J.A., Hernández, M.A., A uniparametric Halley-type iteration with free sec-
ond derivative, Int. J. Pure Appl. Math., 6(1) (2003), 99-110.

[11] Ezquerro, J.A., Hernández, M.A., New iterations of R-order four with reduced com-
putational cost. BIT Numer. Math., 49 (2009), 325-342.

[12] Ezquerro, J.A., Hernández, M.A., On the R-order of the Halley method, J. Math. Anal.
Appl., 303 (2005), 591-601.

[13] Gutiérrez, J.M., Hernández, M.A., Recurrence relations for the super-Halley method,
Computers Math. Applic. 36(7) (1998), 1-8.

[14] Ganesh, M., Joshi, M.C., Numerical solvability of Hammerstein integral equations of
mixed type, IMA J. Numer. Anal., 11 (1991), 21-31.

[15] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79-88.

[16] Hernández, M.A., Chebyshev’s approximation algorithms and applications, Comput.


Math. Appl., 41 (3-4) (2001), 433-455.

[17] Hernández, M.A., Salanova, M.A., Sufficient conditions for semilocal convergence of
a fourth order multipoint iterative method for solving equations in Banach spaces.
Southwest J. Pure Appl. Math, (1) (1999), 29-40.

[18] Jarratt, P., Some fourth order multipoint methods for solving equations, Math. Comp.
20(95) (1966), 434-437.

[19] Kou, J., Li, Y., An improvement of the Jarratt method, App. Math. Comp., 189 (2007),
1816-1821.

[20] Kou, J., Wang, X., Semilocal convergence of a modified multi-point Jarratt method
in Banach spaces under general continuity conditions, Numer. Algorithms, 60 (2012),
369-390.

[21] Magreñán, Á.A., Different anomalies in a Jarratt family of iterative root-finding meth-
ods, App. Math. Comp., 233 (2014), 29–38.

[22] Parhi, S.K., Gupta, D.K., Semilocal convergence of a Stirling-like method in Banach
spaces, Int. J. Comput. Methods, 7(02) (2010), 215-228.

[23] Parhi, S.K., Gupta, D.K., Recurrence relations for a Newton-like method in Banach
spaces, J. Comput. Appl. Math., 206(2) (2007), 873-887.

[24] Rall, L.B., Computational solution of nonlinear operator equations, Robert E. Krieger,
New York (1979).

[25] Ren, H., Wu, Q., Bi, W., New variants of Jarratt method with sixth-order convergence,
Numer. Algorithms 52(4) (2009), 585-603.

[26] Wang, X., Kou, J., Li, Y., Modified Jarratt method with sixth order convergence, Appl.
Math. Let., 22 (2009), 1798-1802.

[27] Ye, X., Li, C., Convergence of the family of the deformed Euler-Halley iterations
under the Hölder condition of the second derivative, J. Comp. Appl. Math., 194(2)
(2006), 294-308.

[28] Ye, X., Li, C., Shen, W., Convergence of the variants of the Chebyshev-Halley itera-
tion family under the Hölder condition of the first derivative, J. Comput. Appl. Math.,
203(1) (2007), 279-288.

[29] Zhao, Y., Wu, Q., Newton-Kantorovich theorem for a family of modified Halley’s
method under Hölder continuity condition in Banach spaces, App. Math. Comp.,
202(1) (2008), 243-251.

[30] Wang, X., Kou, J., Convergence for modified Halley-like methods with less computa-
tion of inversion, J. Diff. Eq. and Appl., 19(9) (2013), 1483-1500.
Chapter 22

Local Convergence for an Improved


Jarratt-Type Method in Banach
Space

22.1. Introduction
In this chapter we are concerned with the problem of approximating a solution x∗ of the
equation
F(x) = 0, (22.1.1)
where F is a Fréchet-differentiable operator defined on a convex subset D of a Banach space
X with values in a Banach space Y .
Many problems in computational sciences and other disciplines can be brought in a
form like (22.1.1) using mathematical modelling [11, 12, 29, 31]. The solutions of these
equations can rarely be found in closed form. That is why most solution methods for
these equations are iterative. The study of the convergence of iterative procedures is usually based on two types: semilocal and local convergence analysis. The semilocal convergence analysis uses information around an initial point to give conditions ensuring the convergence of the iterative procedure, while the local analysis uses information around a solution to find estimates of the radii of convergence balls. In particular, the practice of Numerical Functional Analysis for finding a solution x^* of equa-
tion (22.1.1) is essentially connected to variants of Newton’s method. This method con-
verges quadratically to x∗ if the initial guess is close enough to the solution. Iterative
methods of convergence order higher than two such as Chebyshev-Halley-type methods
[5, 6, 11, 14, 23, 30, 12, 19, 20, 21, 22, 24, 25, 26, 27, 28, 31, 33] require the evaluation of
the second Fréchet-derivative, which is very expensive in general. However, there are inte-
gral equations, where the second Fréchet-derivative is diagonal by blocks and inexpensive
or for quadratic equations the second Fréchet-derivative is constant. Moreover, in some ap-
plications involving stiff systems, high order methods are useful. That is why in a unified
way we study the local convergence of the improved Jarratt-type method (IJTM) defined

for each n = 0, 1, 2, . . . by

u_n = x_n - F'(x_n)^{-1}F(x_n),
y_n = x_n + \frac{2}{3}(u_n - x_n),
J_n = \big(6F'(y_n) - 2F'(x_n)\big)^{-1}\big(3F'(y_n) + F'(x_n)\big),  (22.1.2)
z_n = x_n - J_nF'(x_n)^{-1}F(x_n),
x_{n+1} = z_n - (2J_n - I)F'(x_n)^{-1}F(z_n),

where x0 is an initial point and I is the identity operator. If we set Hn = F 0 (xn )−1 (F 0 (yn ) −
F 0 (xn )), then using some algebraic manipulation we obtain that
J_n = \frac{1}{2}\left(I + \Big(I + \frac{3}{2}H_n\Big)^{-1}\right) = I - \frac{3}{4}\Big(I + \frac{3}{2}H_n\Big)^{-1}H_n.  (22.1.3)
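A scalar Python sketch of IJTM, using the H_n form (22.1.3) so that only F'(x_n) is "inverted" (divided by) at each step, reads as follows; it is an illustration only.

def ijtm(F, dF, x0, tol=1e-14, nmax=10):
    x = x0
    for _ in range(nmax):
        fx, dfx = F(x), dF(x)
        u = x - fx / dfx
        y = x + (2.0 / 3.0) * (u - x)
        H = (dF(y) - dfx) / dfx
        J = 1.0 - 0.75 * H / (1.0 + 1.5 * H)     # scalar form of (22.1.3)
        z = x - J * fx / dfx
        x_new = z - (2.0 * J - 1.0) * F(z) / dfx
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

print(ijtm(lambda t: t ** 3 - 0.49, lambda t: 3 * t ** 2, 1.0))  # 0.49 ** (1/3)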

This method has been shown to be of convergence order between 5 and 6 [29, 33]. The
usual conditions for the semilocal convergence of these methods are (C ):

(C1 ) There exists Γ0 = F 0 (x0 )−1 and kΓ0 k ≤ β, β > 0;

(C2 ) kΓ0 F(x0 )k ≤ η, η ≥ 0;

(C3 ) kF 00 (x)k ≤ β1 for each x ∈ D, β1 ≥ 0;

(C4 ) kF 000 (x)k ≤ β2 for each x ∈ D, β2 ≥ 0

or

(C 04 ) kF 000 (x0 )k ≤ β2 for each x ∈ D, β2 ≥ 0 and some x0 ∈ D;

(C5 ) kF 000 (x) − F 000 (y)k ≤ β3 kx − yk for each x, y ∈ D

or \|F'''(x) - F'''(y)\| \le \varphi(\|x - y\|) for each x, y \in D, where \varphi : [0, +\infty) \to [0, +\infty) is
a non-decreasing function.

The local convergence conditions are similar but x0 is x∗ in (C1 ) and (C2 ). There is
a plethora of local and semilocal convergence results under the (C ) conditions [1]–[33].
These conditions restrict the applicability of these methods. That is why, in our chapter we
assume the conditions (A ):

(A1 ) F : D → Y is Fréchet-differentiable and there exists x∗ ∈ D such that F(x∗ ) = 0 and


F 0 (x∗ )−1 ∈ L(Y, X);

(A2 ) kF 0 (x∗ )−1 (F 0 (x) − F 0 (x∗ ))k ≤ L0 kx − x∗ k for each x ∈ D;

(A3 ) kF 0 (x∗ )−1 (F 0 (x) − F 0 (y))k ≤ Lkx − yk for each x, y ∈ D;

and

(A_4) \|F'(x^*)^{-1}F'(x)\| \le K for each x \in D, K > 0.



Notice that the (A ) conditions are weaker than the (C ) conditions. Hence, the applicability
of (IJTM) is expanded under the (A ) conditions.

As a motivational example, let us define function f on D = U(1, \frac{3}{2}) by
f(x) = \begin{cases} x^3 \ln x^2 + x^5 - x^4, & x \neq 0, \\ 0, & x = 0. \end{cases}
Choose x^* = 1. We have that
f'(x) = 3x^2 \ln x^2 + 5x^4 - 4x^3 + 2x^2,
f''(x) = 6x \ln x^2 + 20x^3 - 12x^2 + 10x
and
f'''(x) = 6 \ln x^2 + 60x^2 - 24x + 22.
Notice that f'''(x) is unbounded on D. That is, condition (C_4) is not satisfied. Hence, the results depending on (C_4) cannot apply in this case. However, we have f'(x^*) = 3 and f(1) = 0. That is, condition (A_1) is satisfied. Moreover, conditions (A_2), (A_3) are satisfied for L_0 = L = 146.6629073\ldots and K = 101.5578008\ldots. Then, condition (A_4) is also satisfied. Hence, the results of our Theorem 22.2.1 that follows can apply to solve equation f(x) = 0 using IJTM. Hence, the applicability of IJTM is expanded under the conditions (A).
The chapter is organized as follows: In Section 22.2. we present the local convergence
of these methods. The numerical examples are given in the concluding Section 22.3..
In the rest of this chapter, U(w, q) and \bar{U}(w, q) stand, respectively, for the open and closed ball in X with center w \in X and of radius q > 0.

22.2. Local Convergence


In this section we present the local convergence of IJTM under the (A ) conditions. It is
convenient for the local convergence of IJTM to introduce some functions and parameters. Let L_0 > 0, L > 0 and K > 0 be given constants. Define parameters r_A and r_0 by
r_A = \frac{2}{2L_0 + L}  (22.2.1)
and
r_0 = \frac{\sqrt{2}}{\sqrt{2}\,L_0 + L}.  (22.2.2)
Notice that
r_0 < r_A < \frac{1}{L_0}.  (22.2.3)
Define functions f_1 and f_2 on the interval [0, \frac{1}{L_0}) by

f_1(t) = \frac{Lt}{2(1 - L_0 t)}  (22.2.4)

and
f_2(t) = \frac{1}{3}\Big(1 + \frac{Lt}{1 - L_0 t}\Big).  (22.2.5)
Then, we have by the choice of r_A that
f_1(t) \le 1 for each t \in [0, r_A]  (22.2.6)
and
f_2(t) \le 1 for each t \in [0, r_A].  (22.2.7)
Define function f_3 on the interval [0, \frac{1}{L_0}) by
f_3(t) = \frac{(Lt)^2}{2(1 - L_0 t)^2}.  (22.2.8)
Then, we have that
f_3(t) \le 1 for each t \in [0, r_0]  (22.2.9)
and
f_3(t) < 1 for each t \in [0, r_0).  (22.2.10)
Moreover, define functions f_4 and f_5 on the interval [0, r_0) by
f_4(t) = \frac{Lt}{2(1 - L_0 t)}\Big(1 + \frac{2L^2Kt}{2(1 - L_0 t)^2 - L^2t^2}\Big)  (22.2.11)
and
f_5(t) = \Big(1 + \frac{2K}{2(1 - L_0 t)^2 - L^2t^2}\Big)f_4(t).  (22.2.12)
Furthermore, define functions \bar{f}_4 and \bar{f}_5 on the interval [0, r_0) by
\bar{f}_4(t) = f_4(t) - 1  (22.2.13)
and
\bar{f}_5(t) = f_5(t) - 1.  (22.2.14)
We have that \bar{f}_4(0) = \bar{f}_5(0) = -1 < 0 and \bar{f}_4(t) \to +\infty, \bar{f}_5(t) \to +\infty as t \to r_0^-. It follows by the intermediate value theorem that the functions \bar{f}_4 and \bar{f}_5 have zeros in (0, r_0). Denote by r_4 and r_5 the minimal zeros of \bar{f}_4 and \bar{f}_5 on the interval (0, r_0), respectively. Finally,
Then, we have by the choice of r that
f1 (t) < 1, (22.2.16)
f2 (t) < 1, (22.2.17)
f3 (t) < 1, (22.2.18)
f4 (t) < 1, (22.2.19)
and
f5 (t) < 1 for each t ∈ [0, r). (22.2.20)
Next, we present the main local convergence for IJTM under the (A ) conditions.

Theorem 22.2.1. Suppose that the (A) conditions hold and U(x^*, r) \subseteq D, where r is given by (22.2.15). Then, the sequence \{x_n\} generated by IJTM (22.1.2) for any x_0 \in U(x^*, r) is well
defined, remains in U(x∗ , r) for each n = 0, 1, 2, . . . and converges to x∗ . Moreover, the
following estimates hold for each n = 0, 1, 2, . . .

kxn+1 − x∗ k ≤ f 5 (kxn − x∗ k)kxn − x∗ k < kxn − x∗ k < r, (22.2.21)

where function f 5 is defined by (22.2.12).

Proof. We shall use induction to show that estimate (22.2.21) holds for each n = 0, 1, 2, \ldots
Using (A2 ) and the hypothesis x0 ∈ U(x∗ , r), we have that

kF 0 (x∗ )−1 (F 0 (x0 ) − F 0 (x∗ ))k ≤ L0 kx0 − x∗ k < L0 r < 1, (22.2.22)

by the choice of r. It follows from (22.2.22) and the Banach lemma on invertible operators [11, 12, 28] that F'(x_0)^{-1} \in L(Y, X) and
\|F'(x_0)^{-1}F'(x^*)\| \le \frac{1}{1 - L_0\|x_0 - x^*\|} < \frac{1}{1 - L_0 r}.  (22.2.23)

Using the first substep of IJTM for n = 0, F(x^*) = 0, (A_1), (A_2), (22.2.22) and the choice of r we get that
u_0 - x^* = x_0 - x^* - F'(x_0)^{-1}F(x_0)
= -\big(F'(x_0)^{-1}F'(x^*)\big)\left[F'(x^*)^{-1}\int_0^1 \big(F'(x^* + \theta(x_0 - x^*)) - F'(x_0)\big)\,d\theta\,(x_0 - x^*)\right],  (22.2.24)
so
\|u_0 - x^*\| \le \|F'(x_0)^{-1}F'(x^*)\| \left\|F'(x^*)^{-1}\int_0^1 \big(F'(x^* + \theta(x_0 - x^*)) - F'(x_0)\big)\,d\theta\right\| \|x_0 - x^*\|
\le \frac{L\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)} = f_1(\|x_0 - x^*\|)\|x_0 - x^*\| \le f_1(r)\|x_0 - x^*\| < \|x_0 - x^*\| < r,  (22.2.25)

which shows u0 ∈ U(x∗ , r). Using the second substep of IJTM, we get by (22.2.25) and
(22.2.17) that
y_0 - x^* = x_0 - x^* + \frac{2}{3}(u_0 - x_0) = x_0 - x^* + \frac{2}{3}(u_0 - x^*) + \frac{2}{3}(x^* - x_0) = \frac{1}{3}(x_0 - x^*) + \frac{2}{3}(u_0 - x^*),

so,
\|y_0 - x^*\| \le \frac{1}{3}\|x_0 - x^*\| + \frac{2}{3}\|u_0 - x^*\| \le f_2(r)\|x_0 - x^*\| < r,

which shows that y_0 \in U(x^*, r).
Next, we shall find upper bounds on \|H_0\| and \|J_0\|. Using (A_1), (22.2.24) and (22.2.18), we get that
\frac{3}{2}\|H_0\| \le \frac{3}{2}\|F'(x_0)^{-1}F'(x^*)\| \|F'(x^*)^{-1}(F'(y_0) - F'(x_0))\|
\le \frac{3}{2}\,\frac{L\|y_0 - x_0\|}{1 - L_0\|x_0 - x^*\|} = \frac{3}{2}\cdot\frac{2}{3}\,\frac{L\|u_0 - x_0\|}{1 - L_0\|x_0 - x^*\|}
\le \frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2} < \left(\frac{Lr}{\sqrt{2}(1 - L_0 r)}\right)^2 = f_3(r) < 1.  (22.2.26)

It follows from (22.2.26) and the Banach lemma on invertible operators that \big(I + \frac{3}{2}H_0\big)^{-1} \in L(Y, X) and
\Big\|\Big(I + \frac{3}{2}H_0\Big)^{-1}\Big\| \le \frac{1}{1 - \frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}} < \frac{1}{1 - \frac{L^2r^2}{2(1 - L_0 r)^2}}.  (22.2.27)

It then follows from the definition of J_0, (22.2.26) and (22.2.27) that
\|J_0\| \le 1 + \frac{3}{4}\cdot\frac{\frac{L^2\|x_0 - x^*\|^2}{3(1 - L_0\|x_0 - x^*\|)}}{1 - \frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}} = 1 + \frac{1}{2}\cdot\frac{(1 - L_0\|x_0 - x^*\|)L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2 - L^2\|x_0 - x^*\|^2}.  (22.2.28)

Then, from the fourth substep of IJTM for n = 0, (22.2.25), (22.2.26), (22.2.27), (22.2.19) and (A_4),
z_0 = x_0 - F'(x_0)^{-1}F(x_0) + \frac{3}{4}\Big(I + \frac{3}{2}H_0\Big)^{-1}H_0F'(x_0)^{-1}F(x_0)

so,
\|z_0 - x^*\| \le \|x_0 - x^* - F'(x_0)^{-1}F(x_0)\| + \frac{3}{4}\Big\|\Big(I + \frac{3}{2}H_0\Big)^{-1}\Big\| \|H_0\| \|F'(x_0)^{-1}F'(x^*)\| \Big\|\int_0^1 F'(x^*)^{-1}F'(x^* + \theta(x_0 - x^*))(x_0 - x^*)\,d\theta\Big\|
\le \frac{L\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)} + \frac{3}{4}\cdot\frac{1}{1 - \frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}}\cdot\frac{2}{3}\cdot\frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}\cdot\frac{K\|x_0 - x^*\|}{1 - L_0\|x_0 - x^*\|}
= f_4(\|x_0 - x^*\|)\|x_0 - x^*\| < \|x_0 - x^*\| < r,  (22.2.29)

which shows z0 ∈ U(x∗ , r).


Notice that we used
F(x_0) = F(x_0) - F(x^*) = \int_0^1 F'(x^* + \theta(x_0 - x^*))(x_0 - x^*)\,d\theta,
so
\|F'(x^*)^{-1}F(x_0)\| \le K\|x_0 - x^*\| by (A_4).  (22.2.30)
Next, using the last substep in IJTM for n = 0, (22.2.23), (22.2.27), (22.2.19) and (22.2.30) (with x_0 replaced by z_0) we get in turn that
\|x_1 - x^*\| \le \|z_0 - x^*\| + \frac{1}{1 - \frac{L^2\|x_0 - x^*\|^2}{2(1 - L_0\|x_0 - x^*\|)^2}}\cdot\frac{K\|z_0 - x^*\|}{1 - L_0\|x_0 - x^*\|}
= \left(1 + \frac{2K(1 - L_0\|x_0 - x^*\|)}{2(1 - L_0\|x_0 - x^*\|)^2 - L^2\|x_0 - x^*\|^2}\right)\|z_0 - x^*\|
\le f_5(\|x_0 - x^*\|)\|x_0 - x^*\| \le f_5(r)\|x_0 - x^*\| < \|x_0 - x^*\|,  (22.2.31)

which shows (22.2.21) for n = 0. To complete the induction, simply replace x_0, u_0, y_0, z_0, x_1 in all the preceding estimates by x_k, u_k, y_k, z_k, x_{k+1}, respectively, to arrive at (22.2.21), which completes the induction. Finally, it follows from (22.2.21) that \lim_{k\to+\infty} x_k = x^*. \square

Remark 22.2.2. (a) Condition (A_2) can be dropped, since this condition follows from (A_3). Notice, however, that
L_0 \le L  (22.2.32)
holds in general and \frac{L}{L_0} can be arbitrarily large [2, 3, 4, 5, 6].

(b) In view of condition (A2 ) and the estimate

kF 0 (x∗ )−1 F 0 (x)k = kF 0 (x∗ )−1 [F 0 (x) − F 0 (x∗ )] + Ik


≤ 1 + kF 0 (x∗ )−1 (F 0 (x) − F 0 (x∗ ))k
≤ 1 + L0 kx − x∗ k,

condition (A4 ) can be dropped and K can be replaced by

K(r) = 1 + L0 r. (22.2.33)

(c) It is worth noticing that r is such that

r < r_A \quad \text{for } \alpha \neq 0.  (22.2.34)

The convergence ball of radius rA was given by us in [2, 3, 5] for Newton’s method
under conditions (A1 )-(A3 ). Estimate (22.2.22) shows that the convergence ball of
higher than two IJTM methods is smaller than the convergence ball of the quadrat-
ically convergent Newton’s method. The convergence ball given by Rheinboldt [31]
for Newton’s method is
r_R = \frac{2}{3L} < r_A  (22.2.35)
if L_0 < L, and \frac{r_R}{r_A} \to \frac{1}{3} as \frac{L_0}{L} \to 0. Hence, we do not expect r to be larger than r_A no
matter how we choose L0 , L and K. Finally note that if α = 0, then IJTM reduces to
Newton’s method and r = rA .

(d) The local results can be used for projection methods such as Arnoldi’s method, the
generalized minimum residual method (GMREM), the generalized conjugate method
(GCM) for combined Newton/finite projection methods and in connection to the mesh
independence principle in order to develop the cheapest and most efficient mesh re-
finement strategy [11, 12, 31].

(e) The results can also be used to solve equations where the operator F 0 satisfies the
autonomous differential equation [11, 12, 29, 31]:

F 0 (x) = T (F(x)), (22.2.36)

where T is a known continuous operator. Since F 0 (x∗ ) = T (F(x∗ )) = T (0), we can


apply the results without actually knowing the solution x^*. Take, as an example, F(x) = e^x - 1. Then, we can choose T(x) = x + 1 and x^* = 0.

(f) It is worth noticing that IJTM is not changing if we use the (A ) instead of the (C )
conditions. Moreover for the error bounds in practice we can use the computational
order of convergence (COC) [1, 2, 3, 4, 11, 12, 15] using
\xi = \sup \ln\Big(\frac{\|x_{n+2} - x_{n+1}\|}{\|x_{n+1} - x_n\|}\Big) \Big/ \ln\Big(\frac{\|x_{n+1} - x_n\|}{\|x_n - x_{n-1}\|}\Big) \quad \text{for each } n = 1, 2, \ldots

or the approximate computational order of convergence (ACOC)
\xi^* = \sup \ln\Big(\frac{\|x_{n+2} - x^*\|}{\|x_{n+1} - x^*\|}\Big) \Big/ \ln\Big(\frac{\|x_{n+1} - x^*\|}{\|x_n - x^*\|}\Big) \quad \text{for each } n = 0, 1, 2, \ldots

instead of the error bounds obtained in Theorem 22.2.1.

22.3. Numerical Examples


We present numerical examples where we compute the radii of the convergence balls.

Example 22.3.1. Let X = Y = R^3, D = U(0, 1). Define F on D for v = (x, y, z) by
F(v) = \Big(e^x - 1, \frac{e-1}{2}y^2 + y, z\Big).  (22.3.1)
Then, the Fréchet derivative is given by
F'(v) = \begin{pmatrix} e^x & 0 & 0 \\ 0 & (e-1)y + 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

Notice that x^* = (0, 0, 0), F'(x^*) = F'(x^*)^{-1} = diag\{1, 1, 1\}, L_0 = e - 1 < L = K = e, r_0 = 0.274695\ldots < r_A = 0.324967\ldots < 1/L_0 = 0.581977\ldots, and r = 0.144926\ldots.
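The radii r_0 and r_A quoted here and in the next example follow directly from (22.2.1) and (22.2.2); a short Python check (an added illustration):

import math

for L0, L in [(math.e - 1.0, math.e), (7.5, 15.0)]:
    rA = 2.0 / (2.0 * L0 + L)
    r0 = math.sqrt(2.0) / (math.sqrt(2.0) * L0 + L)
    print(r0, rA)   # 0.2747 < 0.3250 and 0.0552 < 0.0667, as quoted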
Example 22.3.2. Let X = Y = C([0, 1]), the space of continuous functions defined on [0, 1]
equipped with the max norm. Let D = U(0, 1). Define function F on D by
F(\varphi)(x) = \varphi(x) - 5\int_0^1 x\theta\varphi(\theta)^3\,d\theta.  (22.3.2)
We have that
[F'(\varphi)(\xi)](x) = \xi(x) - 15\int_0^1 x\theta\varphi(\theta)^2\xi(\theta)\,d\theta, \quad \text{for each } \xi \in D.

Then, we get that x^* = 0, L_0 = 7.5, L = 15 and K = K(t) = 1 + 7.5t, r_0 = 0.055228\ldots < r_A = 0.066666\ldots < 1/L_0 = 0.133333\ldots, r = 0.0370972\ldots.
Example 22.3.3. Returning to the motivational example at the introduction of this chapter, let f be the function on D = U(1, \frac{3}{2}) defined by
f(x) = \begin{cases} x^3 \ln x^2 + x^5 - x^4, & x \neq 0, \\ 0, & x = 0. \end{cases}

Then, L_0 = L = 146.662907\ldots, K = 101.557800\ldots, r_0 = 0.003984\ldots < r_A = 0.004545\ldots < 1/L_0 = 0.006818\ldots and r = 0.000442389\ldots.
References

[1] Amat, S., Busquier, S., Gutiérrez, J. M., Geometric constructions of iterative func-
tions to solve nonlinear equations, J. Comput. Appl. Math., 157 (2003), 197-205.

[2] Amat, S., Busquier, S., Plaza, S., Review of some iterative root-finding methods from
a dynamical point of view, Scientia Series A: Mathematical Sciences, 10 (2004), 3-35.

[3] Amat, S., Busquier, S., Plaza, S., Dynamics of the King and Jarratt iteration, Aequa-
tiones Math., 69, (2005), 3, 212-223.

[4] Amat, S., Busquier, S., Plaza, S., Chaotic dynamics of a third-order Newton-type
method, J. Math. Anal. Appl., 366(1) (2010), 24-32.

[5] Argyros, I. K., On the convergence of Chebyshev-Halley-type method under Newton-


Kantorovich hypotheses, Appl. Math. Let., 6(5) (1993), 71-74.

[6] Argyros, I. K., A note on the Halley method in Banach spaces, App. Math. Comp., 58
(1993), 215-224.

[7] Argyros, I. K., The Jarratt method in a Banach space setting, J. Comp. Appl. Math.,
51 (1994), 103-106.

[8] Argyros, I. K., A multi-point Jarratt-Newton-type approximation algorithm for solving


nonlinear operator equations in Banach spaces, Func. Approx. Comment. Math. XXIII
(1994), 97-108.

[9] Argyros, I. K., A new convergence theorem for the Jarratt method in Banach spaces,
Computers and Mathematics with applications, 36 (8) (1998), 13-18.

[10] Argyros, I. K., Chen, D., An inverse-free Jarratt type approximation in a Banach space,
J. Approx. Th. Appl. 12(1) (1996), 19-30.

[11] Argyros, I. K., Computational theory of iterative methods. Series: Studies in Com-
putational Mathematics, 15, Editors: C.K. Chui and L. Wuytack, Elsevier Publ. Co.
New York, U.S.A, 2007.

[12] Argyros, I. K., Hilout, S., Numerical methods in Nonlinear Analysis, World Scientific
Publ. Comp. New Jersey, 2013.

[13] Argyros, I. K., and Hilout, S., Weaker conditions for the convergence of Newton’s
method, J. Complexity, 28 (2012), 364-387.

[14] Argyros, I. K., Magreñán, Á.A., On the convergence of an optimal fourth-order family
of methods and its dynamics. App. Math. Comp., 252 (2015), 336-346.

[15] Chicharro,F., Cordero, A., Gutiérrez, J. M., Torregrosa, J. R., Dynamics of derivative-
free methods for nonlinear equations, App. Math. Comp., 219 (2013), 7023-7035.

[16] Chun, C., Lee, M. Y., Neta, B., Dzunić, J., On optimal fourth-order iterative methods
free from second derivative and their dynamics, App. Math. Comp., 218 (2012), 6427-
6438.

[17] Candela, V., Marquina, A., Recurrence relations for rational cubic methods I: The
Halley method, Computing, 44 (1990), 169-184.

[18] Candela, V., Marquina, A., Recurrence relations for rational cubic methods II: The
Chebyshev method, Computing, 45 (1990), 355-367.

[19] Ezquerro, J.A., Hernández, M. A., Avoiding the computation of the second Fréchet-
derivative in the convex acceleration of Newton’s method, J. Comp. Appl. Math., 96
(1998), 1-12.

[20] Ezquerro, J.A., Hernández, M. A., On Halley-type iterations with free second deriva-
tive, J. Comp. Appl. Math., 170 (2004), 455-459.

[21] Gutiérrez, J. M., Hernández, M. A., Recurrence relations for the super-Halley method,
Computers Math. Applic. 36 (1998), 1-8.

[22] Gutiérrez, J. M., Hernández, M. A., Third-order iterative methods for operators with
bounded second derivative, J. Comp. Appl. Math., 82 (1997), 171-183.

[23] Gutiérrez, J.M., Magreñán, Á.A., Romero, N., On the semilocal convergence of
Newton-Kantorovich method under center-Lipschitz conditions, App. Math. Comp.,
221 (2013), 79–88.

[24] Hernández, M. A., Salanova, M. A., Modification of the Kantorovich assumptions for
semilocal convergence of the Chebyshev method, J. Comp. Appl. Math., 126 (2000),
131-143.

[25] Hernández, M. A., Chebyshev’s approximation algorithms and applications, Comput-


ers Math. Applic., 41 (2001), 433-455.

[26] Hernández, M. A., Reduced recurrence relations for the Chebyshev method, J. Optim.
Theory App., 98 (1998), 385-397.

[27] Jarratt, P., Some fourth order multipoint iterative methods for solving equations, Mathematics of Computation, 20(95) (1966), 434-437.

[28] Kantorovich, L.V., Akilov, G. P., Functional Analysis, Pergamon Press, Oxford, 1982.

[29] Kou, J., Li, Y., An improvement of the Jarratt method, App. Math. Comp., 189, 2,
(2007), 1816-1821.

[30] Magreñán, Á.A., Different anomalies in a Jarratt family of iterative root-finding meth-
ods, App. Math. Comp., 233 (2014), 29–38.

[31] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
Variables, Academic press, New York, 1970.

[32] Parida, P.K., Gupta, D.K., Semilocal convergence of a family of third order methods
in Banach spaces under Hölder continuous second derivative, Nonl. Anal., 69 (2008),
4163-4173.

[33] Ren, H., Wu, Q., Bi, W., New variants of Jarratt method with sixth-order convergence,
Numer. Algorithms, 52(4) (2009), 585-603.
Chapter 23

Enlarging the Convergence Domain


of Secant-Like Methods for
Equations

23.1. Introduction
Let X , Y be Banach spaces and D be a non-empty, convex and open subset in X . Let
U(x, r) and U(x, r) stand, respectively, for the open and closed ball in X with center x and
radius r > 0. Denote by L (X , Y ) the space of bounded linear operators from X into Y . In
the present chapter we are concerned with the problem of approximating a locally unique
solution x? of equation
F(x) = 0, (23.1.1)
where F is a Fréchet continuously differentiable operator defined on D with values in Y .
A lot of problems from computational sciences and other disciplines can be brought in
the form of equation (23.1.1) using Mathematical Modelling [8, 10, 14]. The solution of
these equations can rarely be found in closed form. That is why most solution methods for
these equations are iterative. In particular, the practice of numerical analysis for finding
such solutions is essentially connected to variants of Newton’s method [8, 10, 14, 22, 25,
27, 32].
A very important aspect in the study of iterative procedures is the convergence domain.
In general the convergence domain is small. This is why it is important to enlarge it without
additional hypotheses. Then, this is our goal in this chapter.
In the present chapter we study the secant-like method defined by

x_{-1}, x_0 \text{ are initial points},
y_n = \lambda x_n + (1 - \lambda)x_{n-1}, \quad \lambda \in [0, 1],  (23.1.2)
x_{n+1} = x_n - B_n^{-1}F(x_n), \quad B_n = [y_n, x_n; F] \quad \text{for each } n = 0, 1, 2, \cdots.

The family of secant-like methods reduces to the secant method if \lambda = 0 and to Newton's method if \lambda = 1. It was shown in [27] (see also [7, 8, 21] and the references therein) that the R-order of convergence is at least (1 + \sqrt{5})/2 if \lambda \in [0, 1), the same as that of the secant method. In the real case the closer x_n and y_n are, the higher the speed of convergence.

Moreover in [19], it was shown that as λ approaches 1 the speed of convergence is close
to that of Newton’s method. Moreover, there exist new graphical tools [24]. Furthermore,
an advantage of using the secant-like method instead of Newton's method is that the former method avoids the computation of F'(x_n)^{-1} at each step. The study of the convergence of iterative procedures is usually centered on two types: semilocal and local convergence analysis. The semilocal convergence analysis uses information around an initial point to give criteria ensuring the convergence of the iterative procedure, while the local analysis uses information around a solution to find estimates of the radii of convergence balls. There is a plethora of studies on the weakening and/or extension of the hypotheses made on the underlying operators; see for example [1]–[34].
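A minimal scalar Python sketch of method (23.1.2), using the standard divided difference [y, x; F] = (F(y) - F(x))/(y - x), is given below as an illustration; note that for \lambda = 1 the difference degenerates and F'(x_n) would be used instead.

def secant_like(F, x_prev, x0, lam=0.5, tol=1e-14, nmax=50):
    for _ in range(nmax):
        y = lam * x0 + (1.0 - lam) * x_prev
        B = (F(y) - F(x0)) / (y - x0)     # B_n = [y_n, x_n; F]
        x_prev, x0 = x0, x0 - F(x0) / B
        if abs(x0 - x_prev) < tol:
            break
    return x0

print(secant_like(lambda t: t ** 2 - 2.0, 1.0, 2.0))  # 1.41421356...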
The hypotheses used for the semilocal convergence of secant-like method are (see [8,
18, 19, 21]):
(C1 ) There exists a divided difference of order one denoted by [x, y; F] ∈ L (X , Y ) satisfy-
ing
[x, y; F](x − y) = F(x) − F(y) for all x, y ∈ D ;

(C2 ) There exist x−1 , x0 in D and c > 0 such that

k x0 − x−1 k≤ c;

(C_3) There exist x_{-1}, x_0 \in D and M > 0 such that A_0^{-1} \in L(Y, X) and
\|A_0^{-1}([x, y; F] - [u, v; F])\| \le M(\|x - u\| + \|y - v\|) for all x, y, u, v \in D;

(C_3^\star) There exist x_{-1}, x_0 \in D and L > 0 such that A_0^{-1} \in L(Y, X) and
\|A_0^{-1}([x, y; F] - [v, y; F])\| \le L\|x - v\| for all x, y, v \in D;

(C_3^{\star\star}) There exist x_{-1}, x_0 \in D and K > 0 such that F'(x_0)^{-1} \in L(Y, X) and
\|F'(x_0)^{-1}([x, y; F] - [v, y; F])\| \le K\|x - v\| for all x, y, v \in D;

(C_4) There exists \eta > 0 such that
\|A_0^{-1}F(x_0)\| \le \eta;

(C_4^\star) There exists \eta > 0 for each \lambda \in [0, 1] such that
\|B_0^{-1}F(x_0)\| \le \eta.

We shall refer to (C1 )–(C4 ) as the (C ) conditions. From analyzing the semilocal conver-
gence of the simplified secant method, it was shown [18] that the convergence criteria are
milder than those of secant-like method given in [20]. Consequently, the decreasing and
accessibility regions of (23.1.2) can be improved. Moreover, the semilocal convergence of
(23.1.2) is guaranteed.
In the present chapter we show that an even larger convergence domain can be obtained
under the same or weaker sufficient convergence criteria for method (23.1.2). In view of
(C3 ) we have that

(C_5) There exists M_0 > 0 such that
\|A_0^{-1}([x, y; F] - [x_{-1}, x_0; F])\| \le M_0(\|x - x_{-1}\| + \|y - x_0\|) for all x, y \in D.

We shall also use the conditions


(C_6) There exist x_0 \in D and M_1 > 0 such that F'(x_0)^{-1} \in L(Y, X) and
\|F'(x_0)^{-1}([x, y; F] - F'(x_0))\| \le M_1(\|x - x_0\| + \|y - x_0\|) for all x, y \in D;
(C_7) There exist x_0 \in D and M_2 > 0 such that F'(x_0)^{-1} \in L(Y, X) and
\|F'(x_0)^{-1}(F'(x) - F'(x_0))\| \le M_2\|x - x_0\| for all x \in D.

Note that M_0 \le M, M_2 \le M_1, L \le M hold in general and M/M_0, M_1/M_2, M/L can be arbitrarily large [6, 7, 8, 9, 10, 14]. We shall refer to (C_1), (C_2), (C_3^{\star\star}), (C_4^\star), (C_6) as the (C^\star) conditions and to (C_1), (C_2), (C_3^\star), (C_4^\star), (C_5) as the (C^{\star\star}) conditions. Note that (C_5) is not an additional hypothesis to (C_3), since in practice the computation of the constant M requires that of M_0. Note also that if (C_6) holds, then we can set M_2 = 2M_1 in (C_7).
The chapter is organized as follows. In Section 23.2. we use the (C ? ) and (C ?? ) con-
ditions instead of the (C ) conditions to provide new semilocal convergence analyses for
method (23.1.2) under weaker sufficient criteria than those given in [18, 19, 21, 26, 27].
This way we obtain a larger convergence domain and a tighter convergence analysis. Two
numerical examples, where we illustrate the improvement of the domain of starting points
achieved with the new semilocal convergence results, are given in Section 23.3.

23.2. Semilocal Convergence of Secant-Like Method


We present the semilocal convergence of secant-like method. First, we need some results
on majorizing sequences for secant-like method.
Lemma 23.2.1. Let c ≥ 0, η > 0, M1 > 0, K > 0 and λ ∈ [0, 1]. Set t−1 = 0, t0 = c and
t1 = c + η. Define scalar sequences {qn }, {tn }, {αn } for each n = 0, 1, · · · by

q_n = (1 - \lambda)(t_n - t_0) + (1 + \lambda)(t_{n+1} - t_0),
t_{n+2} = t_{n+1} + \frac{K(t_{n+1} - t_n + (1 - \lambda)(t_n - t_{n-1}))}{1 - M_1 q_n}(t_{n+1} - t_n),  (23.2.1)
\alpha_n = \frac{K(t_{n+1} - t_n + (1 - \lambda)(t_n - t_{n-1}))}{1 - M_1 q_n},  (23.2.2)
the functions \{f_n\} for each n = 1, 2, \cdots by
f_n(t) = K\eta t^n + K(1 - \lambda)\eta t^{n-1} + M_1\eta\big((1 - \lambda)(1 + t + \cdots + t^n) + (1 + \lambda)(1 + t + \cdots + t^{n+1})\big) - 1  (23.2.3)
and the polynomial p by
p(t) = M_1(1 + \lambda)t^3 + (M_1(1 - \lambda) + K)t^2 - K\lambda t - K(1 - \lambda).  (23.2.4)



Denote by α the smallest root of polynomial p in (0, 1). Suppose that

0 < α0 ≤ α ≤ 1 − 2 M1 η. (23.2.5)

Then, the sequence \{t_n\} is non-decreasing, bounded from above by t^{\star\star} defined by
t^{\star\star} = \frac{\eta}{1 - \alpha} + c  (23.2.6)
and converges to its unique least upper bound t ? which satisfies

c + η ≤ t ? ≤ t ?? . (23.2.7)

Moreover, the following estimates are satisfied for each n = 0, 1, · · ·

0 \le t_{n+1} - t_n \le \alpha^n\eta  (23.2.8)

and
t^\star - t_n \le \frac{\alpha^n\eta}{1 - \alpha}.  (23.2.9)
Proof. We shall first prove that the polynomial p has roots in (0, 1). If \lambda \neq 1, p(0) = -(1 - \lambda)K < 0 and p(1) = 2M_1 > 0. If \lambda = 1, p(t) = t\bar{p}(t) with \bar{p}(t) = 2M_1t^2 + Kt - K, \bar{p}(0) = -K < 0 and \bar{p}(1) = 2M_1 > 0.
In either case it follows from the intermediate value theorem that there exist roots in (0, 1).
Denote by α the minimal root of p in (0, 1). Note that, in particular for Newton’s method
(i.e. for λ = 1) and for Secant method (i.e. for λ = 0), we have, respectively by (23.2.4)
that
\alpha = \frac{2K}{K + \sqrt{K^2 + 4M_1K}}  (23.2.10)
and
\alpha = \frac{2K}{K + \sqrt{K^2 + 8M_1K}}.  (23.2.11)
It follows from (23.2.1) and (23.2.2) that estimate (23.2.8) is satisfied if

0 ≤ αn ≤ α. (23.2.12)

Estimate (23.2.12) is true by (23.2.5) for n = 0. Then, we have by (23.2.1) that
t_2 - t_1 \le \alpha(t_1 - t_0) \Longrightarrow t_2 \le t_1 + \alpha(t_1 - t_0) \Longrightarrow t_2 \le \eta + t_0 + \alpha\eta = c + (1 + \alpha)\eta = c + \frac{1 - \alpha^2}{1 - \alpha}\eta < t^{\star\star}.
Suppose that
t_{k+1} - t_k \le \alpha^k\eta \quad \text{and} \quad t_{k+1} \le c + \frac{1 - \alpha^{k+1}}{1 - \alpha}\eta.  (23.2.13)
Estimate (23.2.12) shall be true for k + 1 replacing n if

0 ≤ αk+1 ≤ α (23.2.14)

or
fk (α) ≤ 0. (23.2.15)
We need a relationship between two consecutive recurrent functions f k for each k = 1, 2, · · ·.
It follows from (23.2.3) and (23.2.4) that

fk+1(α) = f k (α) + p(α) αk−1 η = f k (α), (23.2.16)

since p(α) = 0. Define function f ∞ on (0, 1) by

f∞(t) = lim fn (t). (23.2.17)


n→∞

Then, we get from (23.2.3) and (23.2.17) that

f_\infty(\alpha) = \lim_{n\to\infty} f_n(\alpha)
= K\eta\lim_{n\to\infty}\alpha^n + K(1 - \lambda)\eta\lim_{n\to\infty}\alpha^{n-1} + M_1\eta\Big((1 - \lambda)\lim_{n\to\infty}(1 + \alpha + \cdots + \alpha^n) + (1 + \lambda)\lim_{n\to\infty}(1 + \alpha + \cdots + \alpha^{n+1})\Big) - 1  (23.2.18)
= M_1\eta\Big(\frac{1 - \lambda}{1 - \alpha} + \frac{1 + \lambda}{1 - \alpha}\Big) - 1 = \frac{2M_1\eta}{1 - \alpha} - 1,
since α ∈ (0, 1). In view of (23.2.15), (23.2.16) and (23.2.18) we can show instead of
(23.2.15) that
f∞ (α) ≤ 0, (23.2.19)
which is true by (23.2.5). The induction for (23.2.8) is complete. It follows that sequence
{tn } is non-decreasing, bounded from above by t ?? given by (23.2.6) and as such it con-
verges to t ? which satisfies (23.2.7). Estimate (23.2.9) follows from (23.2.8) by using stan-
dard majorization techniques [8, 10, 22]. The proof of Lemma 23.2.1 is complete. 
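The majorizing sequence (23.2.1) and the root \alpha of (23.2.4) are easy to generate numerically; the Python sketch below is an illustration with arbitrary admissible data (it assumes 1 - M_1 q_n stays positive for the supplied parameters).

import numpy as np

def majorizing_t(c, eta, M1, K, lam, n_terms=25):
    t = [0.0, c, c + eta]                     # t_{-1}, t_0, t_1
    for _ in range(n_terms):
        q = (1 - lam) * (t[-2] - t[1]) + (1 + lam) * (t[-1] - t[1])
        a = K * (t[-1] - t[-2] + (1 - lam) * (t[-2] - t[-3])) / (1 - M1 * q)
        t.append(t[-1] + a * (t[-1] - t[-2]))
    return t[1:]                              # drop t_{-1}

def alpha_root(M1, K, lam):                   # smallest root of p in (0, 1)
    roots = np.roots([M1 * (1 + lam), M1 * (1 - lam) + K,
                      -K * lam, -K * (1 - lam)])
    return min(r.real for r in roots if abs(r.imag) < 1e-12 and 0 < r.real < 1)

ts = majorizing_t(c=0.1, eta=0.2, M1=0.4, K=0.5, lam=0.5)
print(ts[-1], alpha_root(0.4, 0.5, 0.5))      # t* (approx.) and alpha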
Lemma 23.2.2. Let c ≥ 0, η > 0, M1 > 0, K > 0 and λ ∈ [0, 1]. Set r−1 = 0, r0 = c and
r1 = c + η. Define scalar sequences {rn } for each n = 1, · · · by

r_2 = r_1 + \beta_1(r_1 - r_0),
r_{n+2} = r_{n+1} + \beta_n(r_{n+1} - r_n),  (23.2.20)
where
\beta_1 = \frac{M_1(r_1 - r_0 + (1 - \lambda)(r_0 - r_{-1}))}{1 - M_1 q_1},
\beta_n = \frac{K(r_{n+1} - r_n + (1 - \lambda)(r_n - r_{n-1}))}{1 - M_1 q_n} \quad \text{for each } n = 2, 3, \cdots
and the functions \{g_n\} on [0, 1) for each n = 1, 2, \cdots by
g_n(t) = K\big(t^n + (1 - \lambda)t^{n-1}\big)(r_2 - r_1) + M_1 t\Big((1 - \lambda)\frac{1 - t^n}{1 - t} + (1 + \lambda)\frac{1 - t^{n+1}}{1 - t}\Big)(r_2 - r_1) + (2M_1\eta - 1)t.  (23.2.21)

Suppose that
0 \le \beta_1 \le \alpha \le 1 - \frac{2M_1(r_2 - r_1)}{1 - 2M_1\eta},  (23.2.22)
where \alpha is defined in Lemma 23.2.1. Then, the sequence \{r_n\} is non-decreasing, bounded from above by r^{\star\star} defined by
r^{\star\star} = c + \eta + \frac{r_2 - r_1}{1 - \alpha}  (23.2.23)
and converges to its unique least upper bound r? which satisfies

c + η ≤ r? ≤ r?? . (23.2.24)
Moreover, the following estimates are satisfied for each n = 1, · · ·
0 ≤ rn+2 − rn+1 ≤ αn (r2 − r1 ). (23.2.25)
Proof. We shall use mathematical induction to show that
0 ≤ βn ≤ α. (23.2.26)
Estimate (23.2.26) is true for n = 1 by (23.2.22). Then, we have by (23.2.20) that
0 \le r_3 - r_2 \le \alpha(r_2 - r_1) \Longrightarrow r_3 \le r_2 + \alpha(r_2 - r_1) \Longrightarrow r_3 \le r_2 + (1 + \alpha)(r_2 - r_1) - (r_2 - r_1) \Longrightarrow r_3 \le r_1 + \frac{1 - \alpha^2}{1 - \alpha}(r_2 - r_1) \le r^{\star\star}.
Suppose (23.2.26) holds for each n ≤ k, then, using (23.2.20), we obtain that
0 \le r_{k+2} - r_{k+1} \le \alpha^k(r_2 - r_1) \quad \text{and} \quad r_{k+2} \le r_1 + \frac{1 - \alpha^{k+1}}{1 - \alpha}(r_2 - r_1).  (23.2.27)
Estimate (23.2.26) is certainly satisfied if
$$g_k(\alpha) \le 0, \qquad (23.2.28)$$
where g_k is defined by (23.2.21). Using (23.2.21), we obtain the following relationship between two consecutive recurrent functions g_k for each k = 1, 2, ···:
$$g_{k+1}(\alpha) = g_k(\alpha) + p(\alpha)\,\alpha^{k-1}\,(r_2 - r_1) = g_k(\alpha), \qquad (23.2.29)$$
since p(α) = 0. Define function g_∞ on [0, 1) by
$$g_\infty(t) = \lim_{k\to\infty} g_k(t). \qquad (23.2.30)$$
Then, we get from (23.2.21) and (23.2.30) that
$$g_\infty(\alpha) = \left(\frac{2\,M_1\,(r_2 - r_1)}{1-\alpha} + 2\,M_1\,\eta - 1\right)\alpha. \qquad (23.2.31)$$
In view of (23.2.28)–(23.2.31), to show (23.2.28) it suffices to have g_∞(α) ≤ 0, which is true by the right-hand hypothesis in (23.2.22). The induction for (23.2.26) (i.e., for (23.2.25)) is complete. The rest of the proof is omitted (as identical to the proof of Lemma 23.2.1). The proof of Lemma 23.2.2 is complete. □
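A minimal sketch for generating {r_n} numerically follows. The index convention for q_n is not restated here; we assume q_{n+1} = (1 − λ)(r_n − r_0) + (1 + λ)(r_{n+1} − r_0), a convention consistent with the proof of Theorem 23.2.6 below (where q_{k+1} bounds quantities involving t_k and t_{k+1}), so that the denominator 1 − M₁q₁ of β₁ involves only r₀ and r₁.

```python
def majorizing_r(c, eta, M1, K, lam, n_terms=20):
    """Generate r_{-1}, r_0, r_1, ... from (23.2.20).

    Assumed convention: q_{n+1} = (1-lam)(r_n - r_0) + (1+lam)(r_{n+1} - r_0)."""
    r = [0.0, c, c + eta]                      # r_{-1}, r_0, r_1
    for n in range(n_terms):
        r_prev, r_n, r_next = r[-3], r[-2], r[-1]
        q = (1 - lam) * (r_n - c) + (1 + lam) * (r_next - c)
        lead = M1 if n == 0 else K             # beta_1 carries M1, later beta_n carry K
        beta = lead * (r_next - r_n + (1 - lam) * (r_n - r_prev)) / (1 - M1 * q)
        r.append(r_next + beta * (r_next - r_n))
    return r
```

Under (23.2.22), the computed increments should satisfy r_{n+2} − r_{n+1} ≤ αⁿ(r₂ − r₁), which is easy to verify against the output of this routine.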
Remark 23.2.3. Let us see how the sufficient convergence criterion (23.2.5) for sequence {t_n} simplifies in the interesting case of Newton's method, that is, when c = 0 and λ = 1. Then, (23.2.5) can be written for L₀ = 2M₁ and L = 2K as
$$h_0 = \frac{1}{8}\left(L + 4\,L_0 + \sqrt{L^2 + 8\,L_0\,L}\right)\eta \le \frac{1}{2}. \qquad (23.2.32)$$
The convergence criterion in [18] reduces to the Kantorovich hypothesis, famous for its simplicity and clarity,
$$h = L\,\eta \le \frac{1}{2}. \qquad (23.2.33)$$
Note however that L₀ ≤ L holds in general and L/L₀ can be arbitrarily large [6, 7, 8, 9, 10, 14]. We also have that
$$h \le \frac{1}{2} \;\Longrightarrow\; h_0 \le \frac{1}{2}, \qquad (23.2.34)$$
but not necessarily vice versa unless L₀ = L, and
$$\frac{h_0}{h} \longrightarrow \frac{1}{4} \quad \text{as} \quad \frac{L}{L_0} \longrightarrow \infty. \qquad (23.2.35)$$
Similarly, it can easily be seen that the sufficient convergence criterion (23.2.22) for sequence {r_n} is given by
$$h_1 = \frac{1}{8}\left(4\,L_0 + \sqrt{L_0\,L} + \sqrt{8\,L_0^2 + L_0\,L}\right)\eta \le \frac{1}{2}. \qquad (23.2.36)$$
We also have that
$$h_0 \le \frac{1}{2} \;\Longrightarrow\; h_1 \le \frac{1}{2} \qquad (23.2.37)$$
and
$$\frac{h_1}{h} \longrightarrow 0, \quad \frac{h_1}{h_0} \longrightarrow 0 \quad \text{as} \quad \frac{L_0}{L} \longrightarrow 0. \qquad (23.2.38)$$
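These comparisons are easy to check numerically. A small sketch (the values of L₀, L and η below are illustrative inputs, not quantities fixed by the chapter):

```python
import math

def h(L, eta):                      # Kantorovich criterion (23.2.33)
    return L * eta

def h0(L0, L, eta):                 # criterion (23.2.32)
    return (L + 4 * L0 + math.sqrt(L**2 + 8 * L0 * L)) * eta / 8

def h1(L0, L, eta):                 # criterion (23.2.36)
    return (4 * L0 + math.sqrt(L0 * L) + math.sqrt(8 * L0**2 + L0 * L)) * eta / 8

# With L0 much smaller than L, (23.2.32) and (23.2.36) accept values of eta
# that the Kantorovich condition rejects:
L0, L = 0.1, 1.0
for eta in (0.4, 0.6, 0.9):
    print(eta, h(L, eta) <= 0.5, h0(L0, L, eta) <= 0.5, h1(L0, L, eta) <= 0.5)
```

For η = 0.6 or η = 0.9 the Kantorovich test fails while h₀ and h₁ both succeed, illustrating the one-way implications (23.2.34) and (23.2.37).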
Note that sequence {r_n} is tighter than {t_n} and converges under weaker conditions. Indeed, a simple inductive argument shows that, if M₁ < K, then for each n = 2, 3, ···
$$r_n < t_n, \quad r_{n+1} - r_n < t_{n+1} - t_n \quad \text{and} \quad r^* \le t^*. \qquad (23.2.39)$$
We have the following useful and obvious extensions of Lemma 23.2.1 and Lemma 23.2.2, respectively.
Lemma 23.2.4. Let N = 0, 1, 2, ··· be fixed. Suppose that
$$t_1 \le t_2 \le \cdots \le t_N \le t_{N+1}, \qquad (23.2.40)$$
$$\frac{1}{M_1} > (1-\lambda)\,(t_N - t_0) + (1+\lambda)\,(t_{N+1} - t_0) \qquad (23.2.41)$$
and
$$0 \le \alpha_N \le \alpha \le 1 - 2\,M_1\,(t_{N+1} - t_N). \qquad (23.2.42)$$
Then, sequence {t_n} generated by (23.2.1) is nondecreasing, bounded from above by t^{**} and converges to t^* which satisfies t^* ∈ [t_{N+1}, t^{**}]. Moreover, the following estimates are satisfied for each n = 0, 1, ···
$$0 \le t_{N+n+1} - t_{N+n} \le \alpha^n\,(t_{N+1} - t_N) \qquad (23.2.43)$$
and
$$t^* - t_{N+n} \le \frac{\alpha^n}{1-\alpha}\,(t_{N+1} - t_N). \qquad (23.2.44)$$
Lemma 23.2.5. Let N = 1, 2, ··· be fixed. Suppose that
$$r_1 \le r_2 \le \cdots \le r_N \le r_{N+1}, \qquad (23.2.45)$$
$$\frac{1}{M_1} > (1-\lambda)\,(r_N - r_0) + (1+\lambda)\,(r_{N+1} - r_0) \qquad (23.2.46)$$
and
$$0 \le \beta_N \le \alpha \le 1 - \frac{2\,M_1\,(r_{N+1} - r_N)}{1 - 2\,M_1\,(r_N - r_{N-1})}. \qquad (23.2.47)$$
Then, sequence {r_n} generated by (23.2.20) is nondecreasing, bounded from above by r^{**} and converges to r^* which satisfies r^* ∈ [r_{N+1}, r^{**}]. Moreover, the following estimates are satisfied for each n = 0, 1, ···
$$0 \le r_{N+n+1} - r_{N+n} \le \alpha^n\,(r_{N+1} - r_N) \qquad (23.2.48)$$
and
$$r^* - r_{N+n} \le \frac{\alpha^n}{1-\alpha}\,(r_{N+1} - r_N). \qquad (23.2.49)$$
Next, we present the following semilocal convergence result for the secant-like method under the (C*) conditions.
Theorem 23.2.6. Suppose that the (C*) conditions, the conditions of Lemma 23.2.1 (or Lemma 23.2.4) and
$$U(x_0, t^*) \subseteq D \qquad (23.2.50)$$
hold. Then, sequence {x_n} generated by the secant-like method is well defined, remains in U(x₀, t*) for each n = −1, 0, 1, ··· and converges to a solution x* ∈ U(x₀, t* − c) of equation F(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, ···
$$\|x_{n+1} - x_n\| \le t_{n+1} - t_n \qquad (23.2.51)$$
and
$$\|x_n - x^*\| \le t^* - t_n. \qquad (23.2.52)$$
Furthermore, if there exists r ≥ t* such that
$$U(x_0, r) \subseteq D \qquad (23.2.53)$$
and
$$r + t^* < \frac{1}{M_1} \quad \text{or} \quad r + t^* < \frac{2}{M_2}, \qquad (23.2.54)$$
then the solution x* is unique in U(x₀, r).
Proof. We use mathematical induction to prove that
$$\|x_{k+1} - x_k\| \le t_{k+1} - t_k \qquad (23.2.55)$$
and
$$U(x_{k+1}, t^* - t_{k+1}) \subseteq U(x_k, t^* - t_k) \qquad (23.2.56)$$
for each k = −1, 0, 1, ···. Let z ∈ U(x₀, t* − t₀). Then, we obtain that
$$\|z - x_{-1}\| \le \|z - x_0\| + \|x_0 - x_{-1}\| \le t^* - t_0 + c = t^* = t^* - t_{-1},$$
which implies z ∈ U(x₋₁, t* − t₋₁). Let also w ∈ U(x₁, t* − t₁). We get that
$$\|w - x_0\| \le \|w - x_1\| + \|x_1 - x_0\| \le t^* - t_1 + t_1 - t_0 = t^* - t_0.$$
That is, w ∈ U(x₀, t* − t₀). Note that
$$\|x_{-1} - x_0\| \le c = t_0 - t_{-1} \quad \text{and} \quad \|x_1 - x_0\| = \|B_0^{-1}F(x_0)\| \le \eta = t_1 - t_0 < t^*,$$
which implies x₁ ∈ U(x₀, t*) ⊆ D. Hence, estimates (23.2.51) and (23.2.52) hold for k = −1 and k = 0. Suppose (23.2.51) and (23.2.52) hold for all n ≤ k. Then, we obtain that
$$\|x_{k+1} - x_0\| \le \sum_{i=1}^{k+1} \|x_i - x_{i-1}\| \le \sum_{i=1}^{k+1} (t_i - t_{i-1}) = t_{k+1} - t_0 \le t^*$$
and
$$\|y_k - x_0\| \le \lambda\,\|x_k - x_0\| + (1-\lambda)\,\|x_{k-1} - x_0\| \le \lambda\,t^* + (1-\lambda)\,t^* = t^*.$$
Hence, x_{k+1}, y_k ∈ U(x₀, t*). Let E_k := [x_{k+1}, x_k; F] for each k = 0, 1, ···. Using (23.1.2), Lemma 23.2.1 and the induction hypotheses, we get that
$$\begin{aligned}
\|F'(x_0)^{-1}(B_{k+1} - F'(x_0))\| &\le M_1\,(\|y_{k+1} - x_0\| + \|x_{k+1} - x_0\|)\\
&\le M_1\,((1-\lambda)\,\|x_k - x_0\| + \lambda\,\|x_{k+1} - x_0\| + \|x_{k+1} - x_0\|)\\
&\le M_1\,((1-\lambda)\,(t_k - t_0) + (1+\lambda)\,(t_{k+1} - t_0)) < 1,
\end{aligned} \qquad (23.2.57)$$
since y_{k+1} − x₀ = λ(x_{k+1} − x₀) + (1−λ)(x_k − x₀) and
$$\|y_{k+1} - x_0\| = \|\lambda\,(x_{k+1} - x_0) + (1-\lambda)\,(x_k - x_0)\| \le \lambda\,\|x_{k+1} - x_0\| + (1-\lambda)\,\|x_k - x_0\|.$$
It follows from (23.2.57) and the Banach lemma on invertible operators that B_{k+1}^{-1} exists and
$$\|B_{k+1}^{-1}F'(x_0)\| \le \frac{1}{1-\Theta_k} \le \frac{1}{1 - M_1\,q_{k+1}}, \qquad (23.2.58)$$
where Θ_k = M₁((1−λ)‖x_k − x₀‖ + (1+λ)‖x_{k+1} − x₀‖). In view of (23.1.2), we obtain the identity
$$F(x_{k+1}) = F(x_{k+1}) - F(x_k) - B_k\,(x_{k+1} - x_k) = (E_k - B_k)\,(x_{k+1} - x_k). \qquad (23.2.59)$$
Then, using the induction hypotheses, the (C*) conditions and (23.2.59), we get in turn that
$$\begin{aligned}
\|F'(x_0)^{-1}F(x_{k+1})\| &= \|F'(x_0)^{-1}(E_k - B_k)\,(x_{k+1} - x_k)\|\\
&\le K\,\|x_{k+1} - y_k\|\,\|x_{k+1} - x_k\|\\
&\le K\,(\|x_{k+1} - x_k\| + (1-\lambda)\,\|x_k - x_{k-1}\|)\,\|x_{k+1} - x_k\|\\
&\le K\,(t_{k+1} - t_k + (1-\lambda)\,(t_k - t_{k-1}))\,(t_{k+1} - t_k),
\end{aligned} \qquad (23.2.60)$$
since x_{k+1} − y_k = x_{k+1} − x_k + (1−λ)(x_k − x_{k−1}) and
$$\|x_{k+1} - y_k\| \le \|x_{k+1} - x_k\| + (1-\lambda)\,\|x_k - x_{k-1}\| \le t_{k+1} - t_k + (1-\lambda)\,(t_k - t_{k-1}).$$
It now follows from (23.1.2), (23.2.1), (23.2.58)–(23.2.60) that
$$\|x_{k+2} - x_{k+1}\| \le \|B_{k+1}^{-1}F'(x_0)\|\,\|F'(x_0)^{-1}F(x_{k+1})\| \le \frac{K\,(t_{k+1} - t_k + (1-\lambda)\,(t_k - t_{k-1}))\,(t_{k+1} - t_k)}{1 - M_1\,q_{k+1}} = t_{k+2} - t_{k+1},$$
which completes the induction for (23.2.55). Furthermore, let v ∈ U(x_{k+2}, t* − t_{k+2}). Then, we have that
$$\|v - x_{k+1}\| \le \|v - x_{k+2}\| + \|x_{k+2} - x_{k+1}\| \le t^* - t_{k+2} + t_{k+2} - t_{k+1} = t^* - t_{k+1},$$
which implies v ∈ U(x_{k+1}, t* − t_{k+1}). The induction for (23.2.55) and (23.2.56) is complete.
Lemma 23.2.1 implies that {t_k} is a complete sequence. It follows from (23.2.55) and (23.2.56) that {x_k} is a complete sequence in the Banach space X and, as such, it converges to some x* ∈ U(x₀, t*) (since U(x₀, t*) is a closed set). By letting k → ∞ in (23.2.60), we get that F(x*) = 0. Moreover, estimate (23.2.52) follows from (23.2.51) by using standard majorization techniques [8, 10, 22]. To show the uniqueness part, let y* ∈ U(x₀, r) be such that F(y*) = 0, where r satisfies (23.2.53) and (23.2.54). We have that
$$\|F'(x_0)^{-1}([y^*, x^*; F] - F'(x_0))\| \le M_1\,(\|y^* - x_0\| + \|x^* - x_0\|) \le M_1\,(t^* + r) < 1. \qquad (23.2.61)$$
It follows from (23.2.61) and the Banach lemma on invertible operators that the linear operator [y*, x*; F]^{-1} exists. Then, using the identity 0 = F(y*) − F(x*) = [y*, x*; F](y* − x*), we deduce that x* = y*. The proof of Theorem 23.2.6 is complete. □
In order to present the semilocal result for the secant-like method under the (C**) conditions, we first need a result on a majorizing sequence. Its proof is analogous to that of Lemma 23.2.1.
Remark 23.2.7. Clearly, (23.2.22) (or (23.2.47)) and {r_n} can replace (23.2.5) (or (23.2.42)) and {t_n}, respectively, in Theorem 23.2.6.
Lemma 23.2.8. Let c ≥ 0, η > 0, L > 0, M₀ > 0 with M₀c < 1 and λ ∈ [0, 1]. Set
$$s_{-1} = 0, \quad s_0 = c, \quad s_1 = c + \eta, \quad \tilde K = \frac{L}{1 - M_0\,c} \quad \text{and} \quad \tilde M_1 = \frac{M_0}{1 - M_0\,c}.$$
Define scalar sequences {q̃_n}, {s_n}, {α̃_n} for each n = 0, 1, ··· by
$$\tilde q_n = (1-\lambda)\,(s_n - s_0) + (1+\lambda)\,(s_{n+1} - s_0),$$
$$s_{n+2} = s_{n+1} + \frac{\tilde K\,(s_{n+1} - s_n + (1-\lambda)\,(s_n - s_{n-1}))}{1 - \tilde M_1\,\tilde q_n}\,(s_{n+1} - s_n),$$
$$\tilde\alpha_n = \frac{\tilde K\,(s_{n+1} - s_n + (1-\lambda)\,(s_n - s_{n-1}))}{1 - \tilde M_1\,\tilde q_n},$$
functions {f̃_n} for each n = 1, 2, ··· by
$$\tilde f_n(t) = \tilde K\,\eta\,t^n + \tilde K\,(1-\lambda)\,\eta\,t^{n-1} + \tilde M_1\,\eta\,\big((1-\lambda)\,(1 + t + \cdots + t^n) + (1+\lambda)\,(1 + t + \cdots + t^{n+1})\big) - 1$$
and polynomial p̃ by
$$\tilde p(t) = \tilde M_1\,(1+\lambda)\,t^3 + (\tilde M_1\,(1-\lambda) + \tilde K)\,t^2 - \tilde K\,\lambda\,t - \tilde K\,(1-\lambda).$$
Denote by α̃ the smallest root of polynomial p̃ in (0, 1). Suppose that
$$0 \le \tilde\alpha_0 \le \tilde\alpha \le 1 - 2\,\tilde M_1\,\eta. \qquad (23.2.62)$$
Then, sequence {s_n} is non-decreasing, bounded from above by s^{**} defined by
$$s^{**} = \frac{\eta}{1-\tilde\alpha} + c$$
and converges to its unique least upper bound s^* which satisfies c + η ≤ s^* ≤ s^{**}. Moreover, the following estimates are satisfied for each n = 0, 1, ···
$$0 \le s_{n+1} - s_n \le \tilde\alpha^n\,\eta \quad \text{and} \quad s^* - s_n \le \frac{\tilde\alpha^n\,\eta}{1-\tilde\alpha}.$$
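Since every quantity in Lemma 23.2.8 is explicit, the sequence {s_n} and the ratios α̃_n can be generated directly. A minimal sketch (assuming M₀c < 1, and guarding the denominator, which must stay positive for the majorization to make sense):

```python
def tilde_sequence(c, eta, L, M0, lam, n_terms=20):
    """Generate s_{-1}, s_0, s_1, ... and alpha~_0, alpha~_1, ... of Lemma 23.2.8."""
    assert M0 * c < 1
    Kt = L / (1 - M0 * c)                      # K tilde
    Mt = M0 / (1 - M0 * c)                     # M_1 tilde
    s, alphas = [0.0, c, c + eta], []
    for n in range(n_terms):
        s_prev, s_n, s_next = s[-3], s[-2], s[-1]
        q = (1 - lam) * (s_n - c) + (1 + lam) * (s_next - c)   # q~_n
        denom = 1 - Mt * q
        if denom <= 0:                         # recurrence no longer meaningful
            break
        a_n = Kt * (s_next - s_n + (1 - lam) * (s_n - s_prev)) / denom
        alphas.append(a_n)
        s.append(s_next + a_n * (s_next - s_n))
    return s, alphas
```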
Next, we present the semilocal convergence result for the secant-like method under the (C**) conditions.
Theorem 23.2.9. Suppose that the (C**) conditions, (23.2.62) (or the conditions of Lemma 23.2.2 with α̃_n, α̃, M̃₁ replacing, respectively, α_n, α, M₁) and U(x₀, s*) ⊆ D hold. Then, sequence {x_n} generated by the secant-like method is well defined, remains in U(x₀, s*) for each n = −1, 0, 1, ··· and converges to a solution x* ∈ U(x₀, s*) of equation F(x) = 0. Moreover, the following estimates are satisfied for each n = 0, 1, ···
$$\|x_{n+1} - x_n\| \le s_{n+1} - s_n \quad \text{and} \quad \|x_n - x^*\| \le s^* - s_n.$$
Furthermore, if there exists r ≥ s* such that U(x₀, r) ⊆ D and r + s* + c < 1/M₀, then the solution x* is unique in U(x₀, r).
Proof. The proof is analogous to that of Theorem 23.2.6. Simply notice that, in view of (C₅), we obtain instead of (23.2.57) that
$$\begin{aligned}
\|A_0^{-1}(B_{k+1} - A_0)\| &\le M_0\,(\|y_{k+1} - x_{-1}\| + \|x_{k+1} - x_0\|)\\
&\le M_0\,((1-\lambda)\,\|x_k - x_0\| + \lambda\,\|x_{k+1} - x_0\| + \|x_0 - x_{-1}\| + \|x_{k+1} - x_0\|)\\
&\le M_0\,((1-\lambda)\,(s_k - s_0) + (1+\lambda)\,(s_{k+1} - s_0) + c) < 1,
\end{aligned}$$
so that B_{k+1}^{-1} exists and
$$\|B_{k+1}^{-1}A_0\| \le \frac{1}{1-\Xi_k},$$
where Ξ_k = M₀((1−λ)(s_k − s₀) + (1+λ)(s_{k+1} − s₀) + c). Moreover, using (C₃*) instead of (C₃**), we get that
$$\|A_0^{-1}F(x_{k+1})\| \le L\,(s_{k+1} - s_k + (1-\lambda)\,(s_k - s_{k-1}))\,(s_{k+1} - s_k).$$
Hence, we have that
$$\begin{aligned}
\|x_{k+2} - x_{k+1}\| &\le \|B_{k+1}^{-1}A_0\|\,\|A_0^{-1}F(x_{k+1})\|\\
&\le \frac{L\,(s_{k+1} - s_k + (1-\lambda)\,(s_k - s_{k-1}))\,(s_{k+1} - s_k)}{1 - M_0\,((1+\lambda)\,(s_{k+1} - s_0) + (1-\lambda)\,(s_k - s_0) + c)}\\
&\le \frac{\tilde K\,(s_{k+1} - s_k + (1-\lambda)\,(s_k - s_{k-1}))\,(s_{k+1} - s_k)}{1 - \tilde M_1\,((1+\lambda)\,(s_{k+1} - s_0) + (1-\lambda)\,(s_k - s_0))} = s_{k+2} - s_{k+1}.
\end{aligned}$$
The uniqueness part is shown as in Theorem 23.2.6, with r and s* replacing R₂ and R₀, respectively. The proof of Theorem 23.2.9 is complete. □
Remark 23.2.10. (a) Condition (23.2.50) can be replaced by
$$U(x_0, t^{**}) \subseteq D, \qquad (23.2.63)$$
where t^{**} is given in closed form by (23.2.6).
(b) The majorizing sequence {u_n} essentially used in [18] is defined by
$$u_{-1} = 0, \quad u_0 = c, \quad u_1 = c + \eta, \quad u_{n+2} = u_{n+1} + \frac{M\,(u_{n+1} - u_n + (1-\lambda)\,(u_n - u_{n-1}))}{1 - M\,q_n^*}\,(u_{n+1} - u_n), \qquad (23.2.64)$$
where
$$q_n^* = (1-\lambda)\,(u_n - u_0) + (1+\lambda)\,(u_{n+1} - u_0).$$
Then, if K < M or M₁ < M, a simple inductive argument shows that for each n = 2, 3, ···
$$t_n < u_n, \quad t_{n+1} - t_n < u_{n+1} - u_n \quad \text{and} \quad t^* \le u^* = \lim_{n\to\infty} u_n. \qquad (23.2.65)$$
Clearly, {t_n} converges under the (C) conditions and the conditions of Lemma 23.2.1. Moreover, as we already showed in Remark 23.2.3, the sufficient convergence criteria of Theorem 23.2.6 can be weaker than those of Theorem 23.2.9. Similarly, if L ≤ M, {s_n} is a tighter sequence than {u_n}. In general, we shall test the convergence criteria and use the tightest sequence to estimate the error bounds.
(c) Clearly, the conclusions of Theorem 23.2.9 hold if {s_n} and (23.2.62) are replaced by {r̃_n} and (23.2.22), where {r̃_n} is defined as {r_n} with M₀ replacing M₁ in the definition of β₁ (only in the numerator) and the tilde letters replacing the non-tilde letters in (23.2.22).
23.3. Numerical Examples

Now, we check numerically, with two examples, that the new semilocal convergence results obtained in Theorems 23.2.6 and 23.2.9 improve the domain of starting points obtained by the following classical result given in [20].
Theorem 23.3.1. Let X and Y be two Banach spaces and F : Ω ⊆ X → Y be a nonlinear operator defined on a non-empty open convex domain Ω. Let x₋₁, x₀ ∈ Ω and λ ∈ [0, 1]. Suppose that there exists [u, v; F] ∈ L(X, Y) for all u, v ∈ Ω (u ≠ v), and that the following four conditions are satisfied:

· ‖x₀ − x₋₁‖ = c ≠ 0 with x₋₁, x₀ ∈ Ω;

· for fixed λ ∈ [0, 1], the operator B₀ = [y₀, x₀; F] is invertible and such that ‖B₀⁻¹‖ ≤ β;

· ‖B₀⁻¹F(x₀)‖ ≤ η;

· ‖[x, y; F] − [u, v; F]‖ ≤ Q(‖x − u‖ + ‖y − v‖), Q ≥ 0, for x, y, u, v ∈ Ω, x ≠ y, u ≠ v.

If B(x₀, ρ) ⊆ Ω, where ρ = ((1 − a)/(1 − 2a)) η,
$$a = \frac{\eta}{c+\eta} < \frac{3-\sqrt{5}}{2} \quad \text{and} \quad b = \frac{Q\,\beta\,c^2}{c+\eta} < \frac{a\,(1-a)^2}{1+\lambda\,(2a-1)}, \qquad (23.3.1)$$
then the secant-like methods defined by (23.1.2) converge to a solution x* of equation F(x) = 0 with R-order of convergence at least (1 + √5)/2. Moreover, x_n, x* ∈ B(x₀, ρ), and the solution x* is unique in B(x₀, τ) ∩ Ω, where τ = 1/(Qβ) − ρ − (1 − λ)α.
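The two scalar conditions in (23.3.1) are straightforward to test. A small checker follows; as a trial input we use the Example 1 data computed below (there Q = 7/648 is the divided-difference Lipschitz constant derived in that example):

```python
import math

def classical_criterion(c, eta, Q, beta, lam):
    """Evaluate the two conditions of (23.3.1); returns (a, b, ok)."""
    a = eta / (c + eta)
    b = Q * beta * c**2 / (c + eta)
    ok = (a < (3 - math.sqrt(5)) / 2
          and b < a * (1 - a)**2 / (1 + lam * (2 * a - 1)))
    return a, b, ok

# Example 1 data: a = 0.580368... already violates the first condition,
# whatever Q is, so Theorem 23.3.1 is not applicable there.
print(classical_criterion(c=0.1, eta=0.138304, Q=7/648, beta=11.202658, lam=0.5))
```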
23.3.1. Example 1

We illustrate the above-mentioned results with an application where a system of nonlinear equations is involved. We see that Theorem 23.3.1 cannot guarantee the semilocal convergence of secant-like methods (23.1.2), but Theorem 23.2.6 can.
It is well known that energy is dissipated in the action of any real dynamical system,
usually through some form of friction. However, in certain situations this dissipation is
so slow that it can be neglected over relatively short periods of time. In such cases we
assume the law of conservation of energy, namely, that the sum of the kinetic energy and
the potential energy is constant. A system of this kind is said to be conservative.
If ϕ and ψ are arbitrary functions with the property that ϕ(0) = 0 and ψ(0) = 0, the general equation
$$\mu\,\frac{d^2x(t)}{dt^2} + \psi\!\left(\frac{dx(t)}{dt}\right) + \varphi(x(t)) = 0 \qquad (23.3.2)$$
can be interpreted as the equation of motion of a mass μ under the action of a restoring force −ϕ(x) and a damping force −ψ(dx/dt). In general these forces are nonlinear, and equation (23.3.2) can be regarded as the basic equation of nonlinear mechanics. In this chapter we shall consider the special case of a nonlinear conservative system described by the equation
$$\mu\,\frac{d^2x(t)}{dt^2} + \varphi(x(t)) = 0,$$
in which the damping force is zero and there is consequently no dissipation of energy.
Extensive discussions of (23.3.2), with applications to a variety of physical problems, can
be found in classical references [4] and [31].
Now, we consider the special case of a nonlinear conservative system described by the equation
$$\frac{d^2x(t)}{dt^2} + \phi(x(t)) = 0 \qquad (23.3.3)$$
with the boundary conditions
$$x(0) = x(1) = 0. \qquad (23.3.4)$$
After that, we use a process of discretization to transform problem (23.3.3)–(23.3.4) into a finite-dimensional problem and look for an approximate solution of it when a particular function φ is considered. So, we transform problem (23.3.3)–(23.3.4) into a system of nonlinear equations by approximating the second derivative by a standard numerical formula.

Firstly, we introduce the points t_j = jh, j = 0, 1, ..., m+1, where h = 1/(m+1) and m is an appropriate integer. A scheme is then designed for the determination of numbers x_j which, it is hoped, approximate the values x(t_j) of the true solution at the points t_j. A standard approximation for the second derivative at these points is
$$x''_j \approx \frac{x_{j-1} - 2x_j + x_{j+1}}{h^2}, \quad j = 1, 2, \ldots, m.$$
A natural way to obtain such a scheme is to demand that the x_j satisfy at each interior mesh point t_j the difference equation
$$x_{j-1} - 2x_j + x_{j+1} + h^2\,\phi(x_j) = 0. \qquad (23.3.5)$$
Since x0 and xm+1 are determined by the boundary conditions, the unknowns are
x1 , x2 , . . ., xm .
A further discussion is simplified by the use of matrix and vector notation. Introducing the vectors
$$x = (x_1, x_2, \ldots, x_m)^t, \qquad v_x = (\phi(x_1), \phi(x_2), \ldots, \phi(x_m))^t$$
and the m × m tridiagonal matrix
$$A = \begin{pmatrix} -2 & 1 & 0 & \cdots & 0\\ 1 & -2 & 1 & \cdots & 0\\ 0 & 1 & -2 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & -2 \end{pmatrix},$$
the system of equations, arising from demanding that (23.3.5) holds for j = 1, 2, ..., m, can be written compactly in the form
$$F(x) \equiv A\,x + h^2\,v_x = 0, \qquad (23.3.6)$$
where F is a function from ℝ^m into ℝ^m. From now on, the focus of our attention is to solve a particular system of the form (23.3.6). We choose m = 8 and the infinity norm.
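This setup is easy to run end to end. A hedged Python/NumPy sketch follows: it assumes the secant-like iteration (23.1.2) takes the form x_{k+1} = x_k − B_k⁻¹F(x_k) with B_k = [y_k, x_k; F] and y_k = λx_k + (1 − λ)x_{k−1}, which is the form used in the proof of Theorem 23.2.6, and it uses the fact that for this particular F the divided difference ∫₀¹ F′(τu + (1−τ)v) dτ has an exact diagonal expression.

```python
import numpy as np

m, lam = 8, 0.5
h = 1.0 / (m + 1)
# Tridiagonal matrix A of (23.3.6)
A = -2 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)

def F(x):
    return A @ x + h**2 * np.exp(x)        # phi(u) = exp(u)

def divided_difference(u, v):
    # Exact [u, v; F] = A + h^2 diag(d), d_i = (e^{u_i} - e^{v_i})/(u_i - v_i),
    # read as e^{u_i} when u_i = v_i.
    same = np.isclose(u, v)
    safe = np.where(same, 1.0, u - v)
    d = np.where(same, np.exp(u), (np.exp(u) - np.exp(v)) / safe)
    return A + h**2 * np.diag(d)

x_prev = np.full(m, 0.1)                   # x_{-1}
x = np.zeros(m)                            # x_0
for _ in range(10):
    y = lam * x + (1 - lam) * x_prev       # y_k
    B = divided_difference(y, x)           # B_k = [y_k, x_k; F]
    step = np.linalg.solve(B, F(x))
    x_prev, x = x, x - step
    if np.linalg.norm(step, np.inf) < 1e-15:
        break
print(x)   # should reproduce Table 23.3.1 up to rounding
```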
The steady temperature distribution is known in a homogeneous rod of length 1 in which, as a consequence of a chemical reaction or some such heat-producing process, heat is generated at a rate φ(x(t)) per unit time per unit length, φ(x(t)) being a given function of the excess temperature x of the rod over the temperature of the surroundings. If the ends of the rod, t = 0 and t = 1, are kept at given temperatures, we are to solve the boundary value problem given by (23.3.3)–(23.3.4), where t is measured along the axis of the rod. For an example we choose an exponential law φ(x(t)) = exp(x(t)) for the heat generation.
Taking into account that the solution of (23.3.3)–(23.3.4) with φ(x(t)) = exp(x(t)) is of the form
$$x(s) = \int_0^1 G(s,t)\,\exp(x(t))\,dt,$$
where G(s,t) is the Green function in [0, 1] × [0, 1], we can locate the solution x*(s) in some domain. So, we have
$$\|x^*(s)\| - \frac{1}{8}\exp(\|x^*(s)\|) \le 0,$$
so that ‖x*(s)‖ ∈ [0, ρ₁] ∪ [ρ₂, +∞), where ρ₁ = 0.1444... and ρ₂ = 3.2616... are the two positive real roots of the scalar equation 8t − exp(t) = 0.
Observing the semilocal convergence results presented in this chapter, we can only guarantee the semilocal convergence to a solution x*(s) such that ‖x*(s)‖ ∈ [0, ρ₁]. For this, we can consider the domain
$$\Omega = \{x(s) \in C^2[0,1]\,;\ \|x(s)\| < \log(7/4),\ s \in [0,1]\},$$
since ρ₁ < log(7/4) < ρ₂. In view of what the domain Ω is for equation (23.3.3), we then consider (23.3.6) with F : Ω̃ ⊂ ℝ⁸ → ℝ⁸ and
$$\tilde\Omega = \{x \in \mathbb{R}^8\,;\ \|x\| < \log(7/4)\}.$$
According to the above, v_x = (exp(x₁), exp(x₂), ..., exp(x₈))^t if φ(x(t)) = exp(x(t)). Consequently, the first derivative of the function F defined in (23.3.6) is given by
$$F'(x) = A + h^2\,\mathrm{diag}(v_x).$$
Moreover,
$$F'(x) - F'(y) = h^2\,\mathrm{diag}(z),$$
where y = (y₁, y₂, ..., y₈)^t and z = (exp(x₁) − exp(y₁), exp(x₂) − exp(y₂), ..., exp(x₈) − exp(y₈)). In addition,
$$\|F'(x) - F'(y)\| \le h^2 \max_{1\le i\le 8} |\exp(\ell_i)|\,\|x - y\|,$$
where ℓ = (ℓ₁, ℓ₂, ..., ℓ₈)^t ∈ Ω̃ and h = 1/9, so that
$$\|F'(x) - F'(y)\| \le \frac{7}{4}\,h^2\,\|x - y\|. \qquad (23.3.7)$$
Considering (see [27])
$$[x, y; F] = \int_0^1 F'(\tau x + (1-\tau)y)\,d\tau,$$
taking into account
$$\int_0^1 \|\tau(x-u) + (1-\tau)(y-v)\|\,d\tau \le \frac{1}{2}\,(\|x-u\| + \|y-v\|)$$
and (23.3.7), we have
$$\begin{aligned}
\|[x, y; F] - [u, v; F]\| &\le \int_0^1 \|F'(\tau x + (1-\tau)y) - F'(\tau u + (1-\tau)v)\|\,d\tau\\
&\le \frac{7}{4}\,h^2 \int_0^1 (\tau\,\|x-u\| + (1-\tau)\,\|y-v\|)\,d\tau\\
&= \frac{7}{8}\,h^2\,(\|x-u\| + \|y-v\|).
\end{aligned}$$
From the last, we have L = 7/648 and M₁ = (7/648) ‖[F'(x₀)]⁻¹‖.
If we choose λ = 1/2 and the starting points x₋₁ = (1/10, 1/10, ..., 1/10)^t and x₀ = (0, 0, ..., 0)^t, we obtain c = 1/10, β = 11.202658... and η = 0.138304..., so that (23.3.1) of Theorem 23.3.1 is not satisfied, since
$$a = \frac{\eta}{c+\eta} = 0.580368\ldots > \frac{3-\sqrt{5}}{2} = 0.381966\ldots$$
Thus, according to Theorem 23.3.1, we cannot guarantee the convergence of secant-like method (23.1.2) with λ = 1/2 for approximating a solution of (23.3.6) with φ(s) = exp(s). However, we can do it by Theorem 23.2.6, since all the inequalities which appear in (23.2.5) are satisfied:
$$0 < \alpha_0 = 0.023303\ldots \le \alpha = 0.577350\ldots \le 1 - 2\,M_1\,\eta = 0.966625\ldots,$$
where ‖[F'(x₀)]⁻¹‖ = 11.169433..., M₁ = 0.120657... and
$$p(t) = (0.180986\ldots)\,t^3 + (0.180986\ldots)\,t^2 - (0.060328\ldots)\,t - (0.060328\ldots).$$
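These values can be cross-checked with the alpha routine sketched after (23.2.11), assuming K = M₁ here (an assumption consistent with the displayed coefficients of p(t)), and assuming α₀ has the same form as α̃₀ of Lemma 23.2.8 with q₁ = (1 + λ)η:

```python
M1 = K = 0.120657            # assumption: K = M1, consistent with p(t) above
eta, lam, c = 0.138304, 0.5, 0.1
alpha0 = K * (eta + (1 - lam) * c) / (1 - M1 * (1 + lam) * eta)
print(alpha0)                # ~0.02330, matching the reported alpha_0
print(alpha(K, M1, lam))     # ~0.577350, via the bisection branch
print(1 - 2 * M1 * eta)      # ~0.966625
```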
Then, we can use secant-like method (23.1.2) with λ = 1/2 to approximate a solution of (23.3.6) with φ(u) = exp(u); the approximation, given by the vector x* = (x*₁, x*₂, ..., x*₈)^t shown in Table 23.3.1, is reached after four iterations with a tolerance of 10⁻¹⁶. In Table 23.3.2 we show the errors ‖x_n − x*‖ using the stopping criterion ‖x_n − x_{n−1}‖ < 10⁻¹⁶. Notice that the vector shown in Table 23.3.1 is a good approximation of the solution of (23.3.6) with φ(u) = exp(u), since ‖F(x*)‖ ≤ C × 10⁻¹⁶. See the sequence {‖F(x_n)‖} in Table 23.3.2.
Table 23.3.1. Approximation of the solution x* of (23.3.6) with φ(u) = exp(u)

i | x*_i            | i | x*_i
1 | 0.05481058 ...  | 5 | 0.13893761 ...
2 | 0.09657993 ...  | 6 | 0.12475178 ...
3 | 0.12475178 ...  | 7 | 0.09657993 ...
4 | 0.13893761 ...  | 8 | 0.05481058 ...
Table 23.3.2. Absolute errors obtained by secant-like method (23.1.2) with λ = 1/2, and {‖F(x_n)‖}

n  | ‖x_n − x*‖          | ‖F(x_n)‖
−1 | 1.3893... × 10⁻¹    | 8.6355... × 10⁻²
0  | 4.5189... × 10⁻²    | 1.2345... × 10⁻²
1  | 1.43051... × 10⁻⁴   | 2.3416... × 10⁻⁵
2  | 1.14121... × 10⁻⁷   | 1.9681... × 10⁻⁸
3  | 4.30239... × 10⁻¹³  | 5.7941... × 10⁻¹⁴
23.3.2. Example 2

Consider the following nonlinear boundary value problem
$$u'' = -u^3 - \frac{1}{4}\,u^2, \qquad u(0) = 0, \quad u(1) = 1.$$
It is well known that this problem can be formulated as the integral equation
$$u(s) = s + \int_0^1 Q(s,t)\,\Big(u^3(t) + \frac{1}{4}\,u^2(t)\Big)\,dt, \qquad (23.3.8)$$
where Q is the Green function:
$$Q(s,t) = \begin{cases} t\,(1-s), & t \le s,\\ s\,(1-t), & s < t. \end{cases}$$
We observe that
$$\max_{0\le s\le 1} \int_0^1 |Q(s,t)|\,dt = \frac{1}{8}.$$
Then problem (23.3.8) is in the form (23.1.1), where F is defined as
$$[F(x)](s) = x(s) - s - \int_0^1 Q(s,t)\,\Big(x^3(t) + \frac{1}{4}\,x^2(t)\Big)\,dt.$$
The Fréchet derivative of the operator F is given by
$$[F'(x)y](s) = y(s) - 3\int_0^1 Q(s,t)\,x^2(t)\,y(t)\,dt - \frac{1}{2}\int_0^1 Q(s,t)\,x(t)\,y(t)\,dt.$$
Choosing x₀(s) = s and R = 1, we have that ‖F(x₀)‖ ≤ (1 + 1/4)/8 = 5/32. Define the divided difference by
$$[x, y; F] = \int_0^1 F'(\tau x + (1-\tau)y)\,d\tau.$$
Taking into account that
$$\begin{aligned}
\|[x, y; F] - [v, y; F]\| &\le \int_0^1 \|F'(\tau x + (1-\tau)y) - F'(\tau v + (1-\tau)y)\|\,d\tau\\
&\le \frac{1}{8}\int_0^1 \Big(3\,\tau^2\,\|x^2 - v^2\| + 6\,\tau\,(1-\tau)\,\|y\|\,\|x - v\| + \frac{\tau}{2}\,\|x - v\|\Big)\,d\tau\\
&\le \frac{1}{8}\left(\|x^2 - v^2\| + \Big(\|y\| + \frac{1}{4}\Big)\|x - v\|\right)\\
&\le \frac{1}{8}\left(\|x + v\| + \|y\| + \frac{1}{4}\right)\|x - v\|\\
&\le \frac{25}{32}\,\|x - v\|,
\end{aligned}$$
since ‖x‖, ‖v‖, ‖y‖ ≤ ‖x₀‖ + R = 2 on the ball considered.
Choosing x₋₁(s) = 9s/10, we find that
$$\|1 - A_0\| \le \int_0^1 \left\|1 - F'(\tau x_0 + (1-\tau)x_{-1})\right\|\,d\tau \le \frac{1}{8}\int_0^1 \left(3\left(\tau + (1-\tau)\frac{9}{10}\right)^2 + \frac{1}{2}\left(\tau + (1-\tau)\frac{9}{10}\right)\right)d\tau \le 0.409375\ldots$$
Using the Banach lemma on invertible operators, we obtain
$$\|A_0^{-1}\| \le 1.69312\ldots,$$
and so we can take
$$L = \frac{25}{32}\,\|A_0^{-1}\| = 1.32275\ldots$$
In an analogous way, choosing λ = 0.8, we obtain
$$M_0 = 0.899471\ldots, \quad \|B_0^{-1}\| = 1.75262\ldots \quad \text{and} \quad \eta = 0.273847\ldots$$
Notice that we cannot guarantee the convergence of the secant-like method by Theorem 23.3.1, since the first condition of (23.3.1) is not satisfied:
$$a = \frac{\eta}{c+\eta} = 0.732511\ldots > \frac{3-\sqrt{5}}{2} = 0.381966\ldots$$
On the other hand, observe that
$$\tilde M_1 = 0.0988372\ldots, \quad \tilde K = 1.45349\ldots, \quad \tilde\alpha_0 = 0.434072\ldots, \quad \tilde\alpha = 0.907324\ldots$$
and
$$1 - 2\,\tilde M_1\,\eta = 0.945868\ldots$$
Hence condition (23.2.62), 0 < α̃₀ ≤ α̃ ≤ 1 − 2M̃₁η, is satisfied and, as a consequence, we can ensure the convergence of the secant-like method by Theorem 23.2.9.
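The last two displayed numbers are easy to reproduce from the stated constants; the following lines are a numerical sanity check only, with M̃₁ and K̃ taken as printed:

```python
Mt, Kt, lam, eta = 0.0988372, 1.45349, 0.8, 0.273847

def p_tilde(t):
    return (Mt * (1 + lam) * t**3 + (Mt * (1 - lam) + Kt) * t**2
            - Kt * lam * t - Kt * (1 - lam))

# p~(0) < 0 < p~(1), and there is exactly one root in (0, 1): bisection.
lo, hi = 0.0, 1.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if p_tilde(mid) < 0 else (lo, mid)

print(0.5 * (lo + hi))       # ~0.907324 = alpha~
print(1 - 2 * Mt * eta)      # ~0.945868
```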
Conclusion

We presented a new semilocal convergence analysis of the secant-like method for approximating a locally unique solution of an equation in a Banach space. Using a combination of Lipschitz and center-Lipschitz conditions, instead of only the Lipschitz conditions used in [18], we provided a finer analysis with a larger convergence domain and weaker sufficient convergence conditions than in [15, 18, 19, 21, 26, 27]. Numerical examples validate our theoretical results.
References

[1] Amat, S., Busquier, S., Negra, M., Adaptive approximation of nonlinear operators, Numer. Funct. Anal. Optim., 25 (2004), 397–405.

[2] Amat, S., Busquier, S., Gutiérrez, J.M., On the local convergence of secant-type methods, Int. J. Comp. Math., 81(8) (2004), 1153–1161.

[3] Amat, S., Bermúdez, C., Busquier, S., Gretay, J.O., Convergence by nondiscrete mathematical induction of a two step secant's method, Rocky Mountain J. Math., 37(2) (2007), 359–369.

[4] Andronow, A.A., Chaikin, C.E., Theory of Oscillations, Princeton University Press, New Jersey, 1949.

[5] Argyros, I.K., Polynomial Operator Equations in Abstract Spaces and Applications, St. Lucie/CRC/Lewis Publ. Mathematics Series, Boca Raton, Florida, USA, 1998.

[6] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for two-point Newton-like methods in Banach space, J. Math. Anal. Appl., 298 (2004), 374–397.

[7] Argyros, I.K., New sufficient convergence conditions for the secant method, Czechoslovak Math. J., 55 (2005), 175–187.

[8] Argyros, I.K., Convergence and Applications of Newton-type Iterations, Springer-Verlag Publ., New York, 2008.

[9] Argyros, I.K., A semilocal convergence analysis for directional Newton methods, Math. Comput., 80 (2011), 327–343.

[10] Argyros, I.K., Cho, Y.J., Hilout, S., Numerical Methods for Equations and its Applications, CRC Press/Taylor and Francis, Boca Raton, Florida, USA, 2012.

[11] Argyros, I.K., Hilout, S., Convergence conditions for secant-type methods, Czechoslovak Math. J., 60 (2010), 253–272.

[12] Argyros, I.K., Hilout, S., Weaker conditions for the convergence of Newton's method, J. Complexity, 28 (2012), 364–387.

[13] Argyros, I.K., Hilout, S., Estimating upper bounds on the limit points of majorizing sequences for Newton's method, Numer. Algorithms, 62(1) (2013), 115–132.

[14] Argyros, I.K., Hilout, S., Numerical Methods in Nonlinear Analysis, World Scientific Publ. Comp., New Jersey, 2013.

[15] Argyros, I.K., Ezquerro, J.A., Hernández, M.Á., Hilout, S., Romero, N., Velasco, A.I., Expanding the applicability of secant-like methods for solving nonlinear equations, Carp. J. Math., 31(1) (2015), 11–30.

[16] Dennis, J.E., Toward a unified convergence theory for Newton-like methods, in Nonlinear Functional Analysis and Applications (L.B. Rall, ed.), Academic Press, New York, (1971), 425–472.

[17] Ezquerro, J.A., Gutiérrez, J.M., Hernández, M.A., Romero, N., Rubio, M.J., The Newton method: from Newton to Kantorovich. (Spanish), Gac. R. Soc. Mat. Esp., 13 (2010), 53–76.

[18] Ezquerro, J.A., Hernández, M.A., Romero, N., Velasco, A.I., Improving the domain of starting points for secant-like methods, App. Math. Comp., 219(8) (2012), 3677–3692.

[19] Ezquerro, J.A., Rubio, M.J., A uniparametric family of iterative processes for solving nondifferentiable equations, J. Math. Anal. Appl., 275 (2002), 821–834.

[20] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Secant-like methods for solving nonlinear integral equations of the Hammerstein type, Proceedings of the 8th International Congress on Computational and Applied Mathematics, ICCAM-98 (Leuven), J. Comput. Appl. Math., 115 (2000), 245–254.

[21] Hernández, M.A., Rubio, M.J., Ezquerro, J.A., Solving a special case of conservative problems by secant-like methods, App. Math. Comp., 169 (2005), 926–942.

[22] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, Oxford, 1982.

[23] Laasonen, P., Ein überquadratisch konvergenter iterativer Algorithmus, Ann. Acad. Sci. Fenn. Ser. I, 450 (1969), 1–10.

[24] Magreñán, Á.A., A new tool to study real dynamics: The convergence plane, App. Math. Comp., 248 (2014), 215–224.

[25] Ortega, J.M., Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.

[26] Potra, F.A., Sharp error bounds for a class of Newton-like methods, Libertas Mathematica, 5 (1985), 71–84.

[27] Potra, F.A., Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, New York, 1984.

[28] Proinov, P.D., General local convergence theory for a class of iterative processes and its applications to Newton's method, J. Complexity, 25 (2009), 38–62.

[29] Proinov, P.D., New general convergence theory for iterative processes and its applications to Newton–Kantorovich type theorems, J. Complexity, 26 (2010), 3–42.

[30] Schmidt, J.W., Untere Fehlerschranken für Regula-falsi-Verfahren, Period. Math. Hungar., 9 (1978), 241–247.

[31] Stoker, J.J., Nonlinear Vibrations, Interscience-Wiley, New York, 1950.

[32] Traub, J.F., Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood Cliffs, New Jersey, 1964.

[33] Yamamoto, T., A convergence theorem for Newton-like methods in Banach spaces, Numer. Math., 51 (1987), 545–557.

[34] Wolfe, M.A., Extended iterative methods for the solution of operator equations, Numer. Math., 31 (1978), 153–174.
Author Contact Information

Ioannis K. Argyros
Professor
Department of Mathematical Sciences
Cameron University
Lawton, OK, US
Tel: (580) 581-2908
Email: [email protected]

Á. Alberto Magreñán
Professor
Universidad Internacional de La Rioja
Departamento de Matemáticas
Logroño, La Rioja, Spain
Tel: (+34) 679-257459
Email: [email protected]