On Inexact Newton Methods For Inverse Problems in Banach Spaces
DISSERTATION
by
M.Sc. Fábio J. Margotti
from Nova Veneza
1 Introduction 1
5 Numerical Experiments 75
5.1 EIT - Continuum Model 76
5.1.1 Fréchet-differentiability of the forward operator 77
5.1.2 Computational implementation 79
5.2 EIT - Complete Electrode Model 89
5.2.1 Fréchet-differentiability of the forward operator 92
5.2.2 Computational implementation 95
I am very glad to express my most sincere gratitude to my advisor Prof. Dr. Andreas
Rieder for trusting in my work since its beginning. His excellent mathematical skills have
strongly contributed to the improvement of my knowledge, and his constant explanations
about the German language and culture greatly facilitated my everyday life. I would also like
to thank Prof. Dr. Andreas Kirsch for being a referee of this work, for kindly writing
recommendations to support the extension of my scholarship on two occasions and for
helping to complement my mathematical formation with his useful lectures as well as with
his fruitful discussions in our work group on inverse problems.
This work would have been impossible without the important financial support provided
by the Deutscher Akademischer Austauschdienst. I am also very thankful to the DAAD for intro-
ducing me to the interesting German culture. In particular, I would like to say thank
you to Mrs. Maria Salgado for being so supportive whenever I needed it. I am much obliged
to my previous advisor Prof. Dr. Antonio Leitão, who introduced me to this fascinating
area of inverse problems and provided strong support for my coming to Germany.
I also wish to thank all my colleagues in the research group for scientific comput-
ing IANM3 for creating a very pleasant and productive work environment. Special thanks
to Sonja Becker, Nathalie Sonnefeld, Johannes Ernesti, Julian Krämmer, Daniel Weiß, Mar-
cel Mikl, Ramin S. Nejad, Lydia Wagner, professors Tobias Jahnke and Christian Wieners
and my ex-colleagues Tudor Udrescu and Tim Kreutzmann.
My gratefulness is also directed to my colleague Robert Winkler for his always useful
discussions concerning the inverse problem of Electrical Impedance Tomography as well as
for providing me with his very organized and well-written code of EIT-CEM. Further, I
want to thank Ekkachai Thawinan and Andreas Schulz for being good friends in these last
four years.
Last but not least, I am deeply thankful to my dear wife Patrı́cia, who has always been
very considerate and understanding and with whom I have shared each little moment in
this amazing experience in Germany. Muito obrigado!
Abstract
Key words: Inverse problems, Ill-posed problems, Regularization theory, Iterative meth-
ods, Inexact-Newton methods, Electrical Impedance Tomography, Banach spaces.
Chapter 1
Introduction
This work is dedicated to the study of regularization methods for obtaining stable solutions
of nonlinear inverse ill-posed problems in Banach spaces. It is focused on iterative methods
and places particular emphasis on Newton-type algorithms. Further, a convergence analysis
of the investigated methods is provided and some numerical implementations support the
theoretical results.
In order to properly introduce the classical definition of ill-posedness due to Hadamard,
we start by defining a forward problem as a function which associates the cause of a specific
phenomenon with the effect produced by it. The effect of this phenomenon is usually called
the solution of the forward problem. To each forward problem corresponds an associated inverse
problem.
Inverse problems constitute the class of mathematical problems in which one aims to
determine the cause of a particular phenomenon (the solution of the inverse problem) by means
of observing the effect produced by it (the data). This kind of procedure encompasses an
extensive number of real-world problems described by mathematical models in the most diversified
fields, such as medicine, astronomy, engineering, geology and meteorology, and a wide range of
physical problems as well as image restoration and several other applications [30, 45, 48].
The biggest challenge in the resolution of such problems is that they are
frequently ill-posed in the sense of Hadamard.
The French mathematician Jacques Hadamard [19] defined a problem as well-posed
if for this problem:
1. a solution exists;
2. the solution is unique;
3. the solution depends continuously on the data.
The third item in the definition of Hadamard is certainly the most delicate to deal
with. The non-fulfillment of the stability statement implies that unavoidable measurement
and round-off errors can be amplified by an arbitrarily large factor, severely compromising
the reliability of the computed solutions. Moreover, in contrast to the first two items of
Hadamard’s definition, a reformulation of the problem in order to achieve stability is not
a trivial issue. The stability property depends on the topology of the involved spaces and
an alteration of these topologies often modifies the original features of the problem and
deprives it of its fundamental characteristics, or in other words, the reformulated problem
becomes meaningless.
The observance of item 3 of Hadamard’s definition on the other hand, signifies that
the solution of the inverse problem corresponding to data corrupted by a low-level noise
cannot be distant from the searched-for solution. Hadamard believed, just as many of
his contemporaries, that a mathematical model could only correctly represent a natural
phenomenon if the associated inverse problem was well-posed (natura non facit saltum).
If this was not the case, the model was considered incorrectly formulated and the related
problem was called ill-posed. This perception remained controversial until the beginning of the last
century, when it was finally realized that an enormous number of real problems are actually
ill-posed when translated into any reasonable mathematical model. This realization initiated, in
the second half of the last century, a huge amount of research targeting methods capable of
stably solving inverse ill-posed problems. At this time, regularization theory was born.
In practical situations one usually has access only to a perturbed version of the data,
which is invariably contaminated by noise. As a consequence, the exact reconstruction of a
solution is unattainable and therefore the best one can do is to find an approximate so-
lution. However, unless a regularization method1 is employed, the ill-posedness phenomenon
combined with even very low noise levels can easily ruin the process of computing a solution.
Due to both their technological relevance and the challenging difficulties involved in the
development of regularization methods, inverse problems are still an active area
of research nowadays. This fact is reflected in the large number of journals, books and monographs
devoted to this subject. A particularly interesting group, widely applied in the resolution of
large-scale inverse ill-posed problems, is that of iterative regularization methods. The literature
in this area is vast and the books [30, 15, 45] can duly be counted among the most complete
references concerning the subject in Hilbert spaces. The work [29] deserves to be mentioned
too. For more recent progress, including the regularization theory in Banach spaces, consult
[48] and the references therein.
Known for their fast convergence properties, Newton-type methods are often regarded
as robust algorithms for solving nonlinear inverse problems. Among all the members of this
large group of remarkable iterative methods, we highlight the REGINN method. Introduced
in 1999 by A. Rieder [43], the REGularization based on INexact Newton iteration is a class
of inexact-Newton methods, which solves nonlinear inverse ill-posed problems by means
of solving a sequence of linearizations of the original problem. This algorithm linearizes
the original problem around the current iterate and then applies an iterative regularization
technique in the so-called inner iteration to find an approximate solution of the resulting
linearized system. Afterwards, this approximate solution is added to the current iterate
in the outer iteration in order to generate an update. Finally, the discrepancy principle is
employed to terminate the outer iteration.
The properties of REGINN in Hilbert spaces are well-known. Convergence results, regular-
ization property and convergence rates have been established under standard requirements
and with different iterative regularization methods in the inner iteration [43, 44, 46, 25]. A
¹ If for each sequence of positive noise levels converging to zero, the corresponding sequence of approximate
solutions converges to an exact solution of the problem, then the related method is called a regularization
method and it is accordingly said to satisfy the regularization property.
few examples of possible inner iterations are given by gradient methods like the Conjugate
Gradient, Steepest Descent and Landweber methods and by Tikhonov-like methods such
as the Iterated-Tikhonov and Tikhonov-Phillips methods.
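To make the role of such an inner iteration concrete, the following Python sketch (an illustration under simplifying assumptions, not code from this thesis) shows one regularized inner step for the linearized system A s = b in the Hilbert-space setting, using a Tikhonov-Phillips step; the function name and the parameter alpha are hypothetical.

import numpy as np

# One possible inner step of an inexact Newton scheme (Hilbert-space illustration):
# approximately solve the linearized system A s = b by Tikhonov-Phillips,
#   s_alpha = (A^T A + alpha I)^{-1} A^T b;
# the resulting s_alpha is then added to the current outer iterate.
def tikhonov_phillips_step(A, b, alpha):
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ b)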
Perhaps the biggest disadvantage of Newton-like methods is the necessity of calculating
the derivative of the forward problem. This is normally an expensive process and it is
generally the bottleneck in numerical implementations. A wise approach to overcome this
obstacle is the utilization of a loping strategy. This procedure, proposed by S. Kaczmarz in
[27], splits the problem into a finite number of sub-problems, which are cyclically processed,
using one sub-problem each time, in order to find a solution of the original problem. The
Kaczmarz’s technique reduces drastically the computational effort necessary to perform
a single iteration. The expected effect is the acceleration of the original method with a
consequent gain in the convergence speed. Kaczmarz methods have an additional advantage:
they describe more appropriately the problems whose data is gathered by a set of individual
measurements, which is the case in Electrical Impedance Tomography for example (see the
numerical experiments in Chapter 5).
There are many papers concerning Kaczmarz methods in the context of ill-posed prob-
lems, but, similar to REGINN, most results are proven in the Hilbert space framework
[31, 4, 20, 7]. Several inverse problems, however, are naturally described in more gen-
eral contexts than Hilbert spaces. The Lebesgue spaces Lp(Ω), as well as the Sobolev
spaces Wn,p(Ω) and the space of p-summable sequences ℓp(N) for different values of
p ∈ [1, ∞], are classical examples of Banach spaces which model numerous inverse problems
more appropriately. Further, describing an inverse problem in more suitable Banach
spaces brings supplementary benefits such as the preservation of sparsity char-
acteristics of the searched-for solution and increased data accuracy when dealing with
impulsive noise². These improvements are commonly obtained by replacing, in the solution
and data spaces, the parameter p = 2, commonly used to characterize Hilbert spaces, with
p ≈ 1, see Daubechies et al [12].
Figure 1.1 illustrates the advantage of using more general Banach spaces in the descrip-
tion of inverse problems. The searched-for (sparse) electrical conductivity is reconstructed
in the Electrical Impedance Tomography problem using 1% impulsive noise, and two
different frameworks are compared: in the first case, the solution space X and the data space Y are
both the Hilbert space L2, while in the second one, the conductivity is reconstructed in the
Banach spaces X = Y = L1.01.
For more examples of inverse problems modelled in Banach spaces, see the numerical
experiments in Chapter 5 and the book of Schuster et al [48].
The difficulties of carrying out a convergence analysis in Banach spaces grow massively if
the solution and data spaces have poor smoothness/convexity properties. It is not straight-
forward to modify classical methods from Hilbert spaces in order to adjust them to more
complicated Banach spaces. As far as we know, the first version of REGINN in Banach
spaces was published by Q. Jin in [24], where a weak convergence result was proven.
The first strong convergence result was accomplished for the com-
bination Kaczmarz/inexact-Newton/Banach spaces in our previous work [40], where the
Landweber method was employed as inner iteration. The most recent progress has been
made using the Iterated-Tikhonov method as inner iteration [39].
The present thesis contributes to expanding the results mentioned above, providing a rel-
atively general convergence analysis of a Kaczmarz version of REGINN (in short K-REGINN) in
Banach spaces, valid at the same time for different methods in the inner iteration. This
work is structured as follows:
Figure 1.1: Sparse conductivity (picture on the left, labeled “Exact”) reconstructed in the Electrical
Impedance Tomography problem with 1% impulsive noise. The Hilbert spaces X =
Y = L2 and the Banach spaces X = Y = L1.01 have been used to reconstruct the pictures
displayed in the middle and on the right, respectively.
are presented. First, the concepts of convexity and smoothness are discussed. These
concepts determine “how favorable” the geometry of a Banach space is. Hilbert
spaces are usually regarded as the Banach spaces with the most favorable geometrical
features, being at the same time the most convex and the smoothest Banach spaces.
After a discussion about the connections between convexity and smoothness, the
definition of the duality mapping is introduced. Inspired by the Riesz Representation
Theorem, this mapping associates each element x in the solution space with a subset
Jp(x) of its dual space so that the resulting duality pairing ⟨x*, y⟩, with x* ∈ Jp(x),
mimics the properties of the inner product ⟨x, y⟩ in Hilbert spaces. At the end of this
chapter, the Bregman distance is presented. It serves to simulate in Banach spaces
the polarization identity and replaces the norm in some specific situations. Finally,
using the famous theorems of Xu and Roach [53], some connections between the Bregman
distance and the standard norm are proven.
ods is suggested in Subsection 3.1.4. This novel algorithm yields extra stability to the
computations of the inner iteration without requiring the solution of an optimization
problem.
• Most methods presented in Section 3.1 have a similar nature and share similar prop-
erties. This particular aspect makes possible a general convergence analysis, which
is implemented in Chapter 4. The convergence analysis of K-REGINN presented in
Chapter 4 is carried out without considering any specific method in the inner itera-
tion. It assumes only certain properties of the sequence generated in the
inner iteration, which makes this convergence analysis valid for every method having
the required properties. Since many of the requested properties have already been
proven for the methods presented in Section 3.1, the results of Chapter 4 hold true
in particular for these methods. We highlight the main results: In Section 4.1 it is
proven that REGINN terminates and has a decreasing residual behavior whenever a
primal gradient method or a Tikhonov-Phillips-like method is used as inner itera-
tion. For the remaining methods, we further provide in the subsequent sections the
proofs of strong convergence of REGINN in the noiseless situation and of the regu-
larization property, see Theorems 43 and 47. Moreover, a decreasing error behavior
and a weak convergence result for the Kaczmarz version are provided in Theorem 38
and Corollary 41 respectively. Additionally, for the dual gradient Landweber method,
Iterated-Tikhonov and Tikhonov-Phillips methods, strong convergence in the noise-
less situation and the regularization property are proven for the Kaczmarz versions
too.
• In Chapter 5, the performance of K-REGINN is tested for solving the inverse problem of
Electrical Impedance Tomography. Section 5.1 presents the Continuum Model [8],
and the Complete Electrode Model [50] is presented in Section 5.2. At the beginning
of each section, a brief explanation of the mathematical model is given and, in the
subsequent subsection, the evaluation of the derivatives is discussed. Further, some
numerical experiments are performed in Subsections 5.1.2 and 5.2.2 and the respective
results are discussed.
Chapter 2
Geometry of Banach Spaces
This chapter lists the most relevant concepts and facts concerning the geometry of Ba-
nach spaces. The main goals are to point out and explain the main results about convexity
and smoothness of Banach spaces, duality mappings, Bregman distances and the proper-
ties connecting these notions. We prove only a few results and suggest some references
where the remaining proofs can be found.
We consider the book of Cioranescu [10] a good reference. It presents in a very clear
and understandable way the main ideas concerning the uniform convexity and uniform
smoothness of Banach spaces and their connections with duality mappings. However, it
lacks the important concepts of convexity and smoothness of power type. Moreover,
Cioranescu’s book was written before the famous paper of Xu and Roach [53]
was published, which means that the main relations between convexity/smoothness and the
Bregman distances are not present there. To understand this significant issue and learn its main
results, we suggest, beyond the paper of Xu and Roach itself, the article [47], the books
[48, 9] and the references therein.
The theoretical results of this work cover real and complex Banach
spaces simultaneously. However, as we will discuss in a moment, their dual spaces have analogous proper-
ties, which permits us to assume without loss of generality that the Banach space in
question is always a real Banach space. To make this clear, we give the following definition.
Definition 1 Let X be a normed space defined over the field k (either R or C). We call
the set
X ∗ := {x∗ : X → k : x∗ linear and continuous}
the dual space of X. We write XR to represent the vector space X regarded as a real
vector space. Accordingly,
$\langle \operatorname{Re} x^*, x\rangle := \operatorname{Re}\langle x^*, x\rangle, \qquad x \in X,$
$$\langle x^*, \hat{x}\rangle = \frac{\overline{\langle x^*, x\rangle}\,\langle x^*, x\rangle}{|\langle x^*, x\rangle|} = |\langle x^*, x\rangle| \in \mathbb{R},$$
which proves the claim. Using this property, it is not so hard to prove that (see e.g. [6,
Lemma 6.39])
$$\|\operatorname{Re} x^*\|_{L(X,\mathbb{R})} = \|x^*\|_{L(X,\mathbb{C})}$$
for all x∗ ∈ X ∗ and consequently T is an isometric R−linear operator. In particular T is
injective. Writing now $\langle x^*, x\rangle = a + ib$, with $a, b \in \mathbb{R}$, we see that $a = \operatorname{Re}\langle x^*, x\rangle$ and
$$\operatorname{Re}\langle x^*, ix\rangle = \operatorname{Re}\left(i\langle x^*, x\rangle\right) = \operatorname{Re}\left(i(a + ib)\right) = -b.$$
Hence
$$\langle x^*, x\rangle = \langle \operatorname{Re} x^*, x\rangle - i\,\langle \operatorname{Re} x^*, ix\rangle, \qquad x \in X,$$
which implies that $T$ is surjective. Indeed, if $\hat{x}^* \in X_{\mathbb{R}}^*$, then the operator $x^* : X \to \mathbb{C}$ defined
by
$$\langle x^*, x\rangle = \langle \hat{x}^*, x\rangle - i\,\langle \hat{x}^*, ix\rangle, \qquad x \in X,$$
is $\mathbb{C}$-linear and continuous, which means that $x^* \in X^*$. Further, it is clear that
$$\langle \operatorname{Re} x^*, x\rangle = \operatorname{Re}\langle x^*, x\rangle = \langle \hat{x}^*, x\rangle \quad\text{for all } x \in X,$$
and consequently $\operatorname{Re} x^* = \hat{x}^*$. This leads to the conclusion that $T$ is an $\mathbb{R}$-linear isometric
isomorphism.
All these considerations show that it suffices to consider only real Banach spaces (resp.
real Hilbert spaces). For this reason, we assume without loss of generality that
X is always a real Banach space for the rest of this work. Accordingly, the Hilbert space
H is always considered a real Hilbert space.
In this case, the G-derivative of F at x0, F′(x0), is also called the F-derivative. Finally, F
is said to be continuously F-differentiable (resp. continuously G-differentiable)
in C if the F-derivative (resp. G-derivative) F′ : C → L(X, Y) is a continuous function.
Definition 5 Let X be a Banach space, f : X → R be a function and let 2X denote the set
of all subsets of X. We say that f is subdifferentiable at a point x0 ∈ X if there exists
a functional $x_0^* \in X^*$, called a subgradient of $f$ at $x_0$, such that
$$f(x) - f(x_0) \ge \langle x_0^*, x - x_0\rangle \quad\text{for all } x \in X.$$
The equivalence
$$\alpha f(x) - \alpha f(x_0) \ge \langle x_0^*, x - x_0\rangle \iff f(x) - f(x_0) \ge \left\langle \tfrac{1}{\alpha}x_0^*,\, x - x_0\right\rangle,$$
valid for any $\alpha > 0$, leads to the conclusion: $x_0^* \in \partial(\alpha f)(x_0) \iff \tfrac{1}{\alpha}x_0^* \in \partial f(x_0)$, that is,
$\partial(\alpha f)(x_0) = \alpha\,\partial f(x_0)$ for each $\alpha > 0$.
A subdifferentiable function f is convex and lower semi-continuous in any open convex
set C ⊂D(f ) . Reciprocally, a proper convex and lower semi-continuous function f is always
subdifferentiable on int(D (f )) .
The optimality condition 0 ∈ ∂f (x0 ) is equivalent to f (x0 ) ≤ f (x) for all x ∈ X, which
means that x0 is a minimizer of f.
Let f be proper and convex and let x0 ∈int(D (f )) . Then x∗0 ∈ ∂f (x0 ) if and only if
$$f'_-(x_0, x) \le \langle x_0^*, x\rangle \le f'_+(x_0, x), \quad\text{for all } x \in X,$$
see [10, Prop. 2.5, Ch. I]. From this, we conclude that $f$ has a unique subgradient at
$x_0 \in \operatorname{int}(D(f))$ if and only if $f$ is G-differentiable at $x_0$. In this case, $f'(x_0)x = \langle x_0^*, x\rangle$ for
all $x \in X$, that is, $\partial f(x_0) = \{f'(x_0)\}$.
If f1 and f2 are two convex functions defined on X such that there is a point x0 ∈D(f1 ) ∩D(f2 )
where f1 is continuous, then [10, Theo. 2.8, Ch. I]
∂ (f1 + f2 ) (x) = ∂f1 (x) + ∂f2 (x) for all x ∈ X.
If A : X → Y is a bounded linear operator between the Banach spaces X and Y and
f : Y → R is convex and continuous at some point of the range of A, then
∂ (f ◦ A) (x) = A∗ (∂f (Ax)) for all x ∈ X,
where A∗ : Y ∗ → X ∗ represents the Banach adjoint of A (cf. [49]). Finally, for f : Y → R
convex and b ∈ Y fixed, see [48, Theo. 2.24],
∂ (f (· − b)) (y) = (∂f ) (y − b) for all y ∈ Y.
Using the definition of the subdifferential, one can prove that for all $x \in X\setminus\{0\}$ [10, Prop.
3.4, Ch. I],
$$\partial\|x\| = \{x^* \in X^* : \langle x^*, x\rangle = \|x\| \text{ and } \|x^*\| = 1\}. \tag{2.3}$$
This result motivates the definition of a smooth Banach space.
Remark 8 We want to mention the fact that from the isomorphism T constructed at the
beginning of this chapter it follows that a complex Banach space is smooth if and only if its
corresponding real Banach space is smooth. This is a direct consequence of the equivalence
which we prove now. Indeed, first remember that $\|Tx^*\| = \|x^*\|$. Now, if $\langle x^*, x\rangle = \|x\| \in \mathbb{R}$,
then
$$\|x\| = \langle x^*, x\rangle = \operatorname{Re}\langle x^*, x\rangle = \langle Tx^*, x\rangle.$$
Reciprocally, assuming $\langle Tx^*, x\rangle = \|x\|$, we define the vector $\hat{x} := \operatorname{sgn}(\langle x^*, x\rangle)\,x$ and obtain
from (2.1),
Consequently, $\operatorname{Re}\langle x^*, x\rangle = |\langle x^*, x\rangle|$, which implies that $\langle x^*, x\rangle = \operatorname{Re}\langle x^*, x\rangle = \|x\|$ and the
proof is complete.
As the subdifferential of k·k : X → R is always a subset of XR∗ , we see that in a complex
Banach space X, the set in (2.3) is actually described by
But as these two sets can be identified with each other using the isomorphism T (see (2.4)
above), (2.3) can be used, in a slight abuse of notation, even in complex Banach spaces.
Using the last definition and (2.3) , we conclude that a Banach space is smooth if and
only if the subdifferential of the convex function f (x) = kxk is single valued for all x ∈
X\ {0} . This is in turn, an equivalent condition to the G-differentiability of f in X\ {0} ,
i.e., the Banach space X is smooth if and only if the norm-function is G-differentiable in
X\ {0} .
Assume now that X is uniformly smooth; then for all $x, y \in X$ with $\|x\| = \|y\| = 1$,
$$0 = \lim_{\tau\to 0^+}\left[\frac{f(x+\tau y) - f(x)}{\tau} + \frac{f(x+\tau(-y)) - f(x)}{\tau}\right] = f'_+(x, y) - f'_-(x, y).$$
It is not so difficult to extend this equality for all y ∈ X, which implies that f is G-
differentiable at x, for all x ∈ X satisfying kxk = 1. Finally, one can prove that the
result actually holds for all x ∈ X\ {0} , which means that X is smooth. Hence, the
uniform smoothness of X implies the smoothness of this space. The converse is however
not true, as can be seen in [10, Theo. 3.12, Ch. I], where a proof of the equivalence
between uniform smoothness and uniform F-differentiability of the norm-function on the
unit sphere is given. In particular, the norm-function is F-differentiable in X\ {0} provided
X is uniformly smooth.
The above definition states that in a strictly convex Banach space, the line segment
connecting two points in the unit sphere has only points lying inside this sphere, except for
the extremal points themselves. It is possible to prove (see [10, Prop. 1.2, Ch. II]) that a
Banach space is strictly convex if and only if the unit sphere contains no line segments. It is also
equivalent to
$$\left\|\tfrac{1}{2}(x + y)\right\| < 1 \quad\text{for all } x, y \in X \text{ with } x \neq y \text{ and } \|x\| = \|y\| = 1$$
(the midpoint of a line segment with extremal points lying on the unit sphere does not
belong to this sphere).
The strict convexity of the Banach space X is also equivalent to the strict convexity of
the function $h(x) = \|x\|^p$, for any fixed $p > 1$. In fact, let $p > 1$ and $x, y \in X$ be fixed. As
the function $f : \mathbb{R}^+_0 \to \mathbb{R}^+_0$, $t \mapsto t^p$, is strictly convex, we get from the triangle inequality,
suppose that X is not strictly convex. Then there exist λ ∈ (0, 1) and x, y ∈ X with x 6= y
and kxk = kyk = 1 such that
and
$$\rho_{L^p}(\tau) = \begin{cases} (1+\tau^p)^{1/p} - 1 < \frac{1}{p}\,\tau^p & : 1 < p < 2,\\[4pt] \frac{p-1}{2}\,\tau^2 + o(\tau^2) < \frac{p-1}{2}\,\tau^2 & : 2 \le p < \infty, \end{cases}$$
which means that this space is¹ p∨2-convex and p∧2-smooth. In particular, it is uniformly
smooth and uniformly convex. The space of p-summable sequences ℓp(N) and the Sobolev
spaces Wn,p(Ω), n ∈ N, are also p∨2-convex and p∧2-smooth for 1 < p < ∞. As these
spaces are not reflexive for p = 1 and p = ∞, we conclude that they cannot be uniformly
smooth nor uniformly convex. One can actually prove that they are not even strictly convex
or smooth Banach spaces.
This reasoning suggests we could cover the lack of an inner product in a general Banach
space X associating each element x ∈ X to a functional x∗ ∈ X ∗ and then replacing the
inner product hx, yi with hx∗ , yiX ∗ ×X . In the ideal case, the dual pairing h·, ·iX ∗ ×X would
have inner-product-like properties similar to (2.8) above.
¹ a ∨ b := max{a, b} and a ∧ b := min{a, b}.
The set-valued mapping $J_\varphi : X \to 2^{X^*}$,
$$J_\varphi(x) := \{x^* \in X^* : \langle x^*, x\rangle = \|x\|\,\varphi(\|x\|) \text{ and } \|x^*\| = \varphi(\|x\|)\},$$
is called the duality mapping associated with the gauge function $\varphi$. The duality mapping
associated with the gauge function $\varphi(t) = t$ is called the normalized duality mapping.
Finally, a selection of the duality mapping $J_\varphi$ is a single-valued function $j_\varphi : X \to X^*$
satisfying $j_\varphi(x) \in J_\varphi(x)$ for each $x \in X$.
Suppose we are given $x \in X$. From the Hahn-Banach Theorem it follows that there exists
$x^* \in X^*$ such that $\|x^*\| = 1$ and $\langle x^*, x\rangle = \|x\|$, which means that $y^* := x^*\varphi(\|x\|) \in J_\varphi(x)$.
Hence $J_\varphi(x) \neq \emptyset$ for any $x \in X$.
Remark 16 The same reasoning as in the previous paragraph implies, in view of (2.4), that
the vector $x^*$ belongs to the set
which means that the two above sets can be identified with each other by the use of the iso-
morphism $Tx^* = \operatorname{Re} x^*$. Therefore, one is allowed to write
even if X is a complex Banach space. The above inequality should actually be interpreted
as
$$\operatorname{Re}\langle j_\varphi(x), y\rangle \le \|\operatorname{Re} j_\varphi(x)\|\,\|y\| = \|j_\varphi(x)\|\,\|y\|.$$
Asplund's Theorem (2.9) below and the Xu-Roach inequalities in Theorem 18 should
be interpreted in the same way.
With the special notation Jp , where p > 1 is fixed, we denote the duality mapping
associated with the gauge function ϕ (t) = tp−1 . In particular, J2 is the normalized duality
mapping. From definition, we conclude that, for all x, y ∈ X,
Further, each selection j2 of the normalized duality mapping has the inner-product prop-
erties shown in (2.8) .
The connection between the subdifferential and the duality mapping is given by the
very important Asplund Theorem [10, Lemma 4.3 and Theo. 4.4, Ch. I]: Let X be a
Banach space, $x \in X$ an arbitrary vector and $\varphi$ a gauge function. Then the function
$\psi(t) := \int_0^t \varphi(s)\,ds$, $t \ge 0$, is convex in $\mathbb{R}^+_0$ and
$$J_\varphi(x) = \partial\left(\psi(\|\cdot\|)\right)(x). \tag{2.9}$$
For the gauge function $\varphi(t) = t^{p-1}$ we have $\psi(\|x\|) = \frac{1}{p}\|x\|^p$ and conclude that $J_p = \partial\left(\frac{1}{p}\|\cdot\|^p\right)$.
In particular, for the normalized duality mapping it holds $J_2 = \partial\left(\frac{1}{2}\|\cdot\|^2\right)$. In
a Hilbert space, $\partial\left(\frac{1}{2}\|\cdot\|^2\right)(x) = \left(\frac{1}{2}\|\cdot\|^2\right)'(x) = x$, which means that $J_2 = I$ is the identity
operator. Unfortunately, this very nice property is true only in Hilbert spaces. In fact,
one can prove that the normalized duality mapping J2 is linear in X if and only if X is a
Hilbert space [10, Prop. 4.8, Ch. I].
The Asplund’s Theorem is the key to connect the properties of the duality mappings
with convexity and smoothness properties of a Banach space. For instance, an interesting
consequence of Asplund’s Theorem is the fact that a Banach space X is smooth if and only
if each duality mapping Jϕ is single valued (cf [10, Cor. 4.5, Ch. I]). In this case,
$$\langle J_\varphi(x), y\rangle = \left.\frac{d}{dt}\,\psi(\|x + ty\|)\right|_{t=0}, \quad\text{for all } x, y \in X. \tag{2.10}$$
Further, X is uniformly smooth if and only if each duality mapping is single-valued and
uniformly continuous in the unit sphere [10, Theo. 2.16, Ch. II].
The next properties of the duality mapping Jϕ are collected from [10, Prop. 4.7, Ch. I]:
Let x, y ∈ X. The duality mapping inherits the monotonicity property of the subdifferential:
$$\langle j_\varphi(x) - j_\varphi(y),\, x - y\rangle \ge 0.$$
Further, $J_\varphi(-x) = -J_\varphi(x)$ and
$$J_\varphi(\lambda x) = \frac{\varphi(\lambda\|x\|)}{\varphi(\|x\|)}\,J_\varphi(x) \quad\text{for all } \lambda > 0.$$
In particular, $J_2(\lambda x) = \lambda J_2(x)$ is homogeneous. The inverse of $\varphi$ is a gauge function too
and if $J^*_{\varphi^{-1}} : X^* \to X^{**}$ is the duality mapping on $X^*$ associated with the gauge function
$\varphi^{-1}$, then
$$x^* \in J_\varphi(x) \implies x \in J^*_{\varphi^{-1}}(x^*). \tag{2.11}$$
Finally, if $\varphi_1$ and $\varphi_2$ are gauge functions, then
$$\varphi_2(\|x\|)\,J_{\varphi_1}(x) = \varphi_1(\|x\|)\,J_{\varphi_2}(x).$$
In particular, for $\varphi_1(t) = t^{r-1}$ and $\varphi_2(t) = t^{p-1}$ with $p, r > 1$ it holds
$$J_r(x) = \|x\|^{r-p}\,J_p(x). \tag{2.12}$$
From [10, Cor. 3.13, Ch. II], we see that the range² of $J_\varphi$ is dense in $X^*$, i.e., $\overline{R(J_\varphi)} = X^*$.
If X is reflexive, then this result becomes $R(J_\varphi) = X^*$ (actually, this is an equivalent
condition to the reflexivity of X, see [10, Cor 3.4, Ch. II]). We conclude that in case of X
being reflexive, the reciprocal of (2.11) is true and $R\big(J^*_{\varphi^{-1}}\big) = X^{**} \cong X$. Assuming that
X is smooth, then each duality mapping is single valued and if X is additionally reflexive
(this is the case, for instance, if X is uniformly smooth), then $J_\varphi$ is invertible and satisfies
$$J_\varphi^{-1} = J^*_{\varphi^{-1}} : X^* \to X^{**} \cong X. \tag{2.13}$$
Example 17 The dual space of the Lebesgue space $L^p(\Omega)$, $1 < p < \infty$, is given by $(L^p(\Omega))^* = L^{p^*}(\Omega)$
and the duality mapping $J_p : L^p(\Omega) \to L^{p^*}(\Omega)$ can be calculated using the formula
(2.10). In fact, the duality mapping $J_p$ is associated with the gauge function $\varphi(t) = t^{p-1}$,
which means that $\psi(t) = \int_0^t \varphi(s)\,ds = \frac{1}{p}t^p$. Let $f, g \in L^p(\Omega)$ be given. Then by (2.10),
$$\langle J_p(f), g\rangle_{L^{p^*}\times L^p} = \left.\frac{d}{dt}\,\psi\big(\|f+tg\|_{L^p}\big)\right|_{t=0} = \left.\frac{d}{dt}\,\frac{1}{p}\|f+tg\|_{L^p}^p\right|_{t=0}$$
$$= \left.\frac{1}{p}\,\frac{d}{dt}\int_\Omega |f(x)+tg(x)|^p\,dx\right|_{t=0} = \left.\int_\Omega |f(x)+tg(x)|^{p-1}\operatorname{sgn}(f(x)+tg(x))\,g(x)\,dx\right|_{t=0}$$
$$= \int_\Omega |f(x)|^{p-1}\operatorname{sgn}(f(x))\,g(x)\,dx = \left\langle |f|^{p-1}\operatorname{sgn}(f),\, g\right\rangle_{L^{p^*}\times L^p}.$$
This means that $J_p(f) = |f|^{p-1}\operatorname{sgn}(f)$, where the equality is understood pointwise. Using
now (2.12) we conclude that the duality mapping $J_r$ in $L^p(\Omega)$ is given by
$$J_r(f) = \|f\|_{L^p}^{r-p}\,|f|^{p-1}\operatorname{sgn}(f). \tag{2.14}$$
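As an illustration of formula (2.14), the following Python sketch (an assumption-laden toy example, not part of the thesis) evaluates the duality mapping J_r pointwise for a function discretized on a uniform grid and checks the defining pairing ⟨J_r(f), f⟩ = ‖f‖_{L^p}^r numerically.

import numpy as np

# Duality mapping J_r on L^p(Omega) per (2.14): J_r(f) = ||f||_{L^p}^{r-p} |f|^{p-1} sgn(f),
# for a function discretized on a uniform grid with mesh width dx.
def duality_map_Lp(f, p, r, dx):
    norm_p = (np.sum(np.abs(f) ** p) * dx) ** (1.0 / p)      # ||f||_{L^p}
    return norm_p ** (r - p) * np.abs(f) ** (p - 1) * np.sign(f)

dx = 1e-3
x = np.arange(0.0, 1.0, dx)
f = np.sin(2 * np.pi * x)
p, r = 1.5, 2.0
jf = duality_map_Lp(f, p, r, dx)
# sanity check: <J_r(f), f> and ||f||_{L^p}^r should agree
print(np.sum(jf * f) * dx, (np.sum(np.abs(f) ** p) * dx) ** (r / p))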
with
$$\sigma_p(x, y) := \tilde{K}_p \int_0^1 \frac{(\|x-ty\| \vee \|x\|)^p}{t}\,\delta_X\!\left(\frac{t\|y\|}{2\,(\|x-ty\| \vee \|x\|)}\right) dt,$$
where $\delta_X$ is the modulus of convexity, see Definition 12.
(B) If X is uniformly smooth, then there exists a positive constant $\tilde{C}_p$ such that for all
$x, y \in X$
$$\|x-y\|^p \le \|x\|^p - p\,\langle J_p(x), y\rangle + \tilde{\sigma}_p(x, y)$$
with
$$\tilde{\sigma}_p(x, y) := \tilde{C}_p \int_0^1 \frac{(\|x-ty\| \vee \|x\|)^p}{t}\,\rho_X\!\left(\frac{t\|y\|}{\|x-ty\| \vee \|x\|}\right) dt,$$
where $\rho_X$ is the modulus of smoothness (Definition 12).
Assuming that the space X is s-convex for some s > 1 and using the definition of
s-convexity,
$$\sigma_p(x, y) \ge \frac{\tilde{K}_p K_s}{2^s}\,\|y\|^s \int_0^1 (\|x-ty\| \vee \|x\|)^{p-s}\,t^{s-1}\,dt,$$
where $K_{p,s} = \tilde{K}_p K_s\, 2^{p-2s}/(ps) > 0$. Similarly, if the Banach space X is assumed to be
s-smooth and $p \ge s$, then there exists a positive constant $C_{p,s}$ such that for all $x, y \in X$
In particular, if $p = s$, then
$$\frac{1}{p}\|x-y\|^p \ge \frac{1}{p}\|x\|^p - \langle j_p(x), y\rangle + \frac{K_p}{p}\|y\|^p, \tag{2.17}$$
and
$$\frac{1}{p}\|x-y\|^p \le \frac{1}{p}\|x\|^p - \langle J_p(x), y\rangle + \frac{C_p}{p}\|y\|^p, \tag{2.18}$$
respectively, with $K_p := pK_{p,s}$ and $C_p := pC_{p,s}$.
In [9, Cor. 4.17 and Cor. 5.8] it is shown that the existence of $K_p$ and $C_p$ in inequalities
(2.17) and (2.18) is actually an equivalent condition to the p-convexity and p-smoothness of
X, respectively.
Note that inequalities (2.17) and (2.18) reduce to the polarization identity (2.5) in
Hilbert spaces (for $p = s = 2$, $K_p = C_p = 1$). Trying to mimic this identity in a general
Banach space, we introduce the Bregman distance.
Despite its name, the Bregman distance is not a metric because, for example, it is not symmetric
($\Delta_\Omega(x, y) \neq \Delta_\Omega(y, x)$ in general). It does not satisfy the important
triangle inequality either. But from the definition of the subdifferential it immediately follows that
$\Delta_\Omega(x, y) \ge 0$ for all $x, y \in X$. Additionally, $x = y$ implies $\Delta_\Omega(x, y) = 0$.
Let $\varphi$ be a gauge function. Then, from Asplund's Theorem, the function $\Omega(x) :=
\psi(\|x\|) = \int_0^{\|x\|}\varphi(s)\,ds$ is convex and $\partial\Omega(x) = J_\varphi(x)$. It follows that in this case,
We denote by $\Delta_p$, $p > 1$, the Bregman distance associated with the particular gauge function
$\varphi(t) = t^{p-1}$. This means
$$\Delta_p(x, y) = \frac{1}{p}\|x\|^p - \frac{1}{p}\|y\|^p - \inf\{\langle\xi, x - y\rangle : \xi \in J_p(y)\}. \tag{2.19}$$
Assume from now on that the duality mapping is single-valued (this is the case in smooth
Banach spaces for instance). Hence, the above equality becomes
$$\Delta_p(x, y) = \frac{1}{p}\|x\|^p - \frac{1}{p}\|y\|^p - \langle J_p(y), x - y\rangle$$
$$= \frac{1}{p}\|x\|^p - \frac{1}{p}\|y\|^p + \|y\|^p - \langle J_p(y), x\rangle$$
$$= \frac{1}{p}\|x\|^p + \frac{1}{p^*}\|y\|^p - \langle J_p(y), x\rangle$$
$$= \frac{1}{p}\|x\|^p - \langle J_p(y), x\rangle + \frac{1}{p^*}\|J_p(y)\|^{p^*}.$$
Observe the similarity between this formula and the polarization identity (2.5). Since in
Hilbert spaces the normalized duality mapping is the identity operator, we conclude that
in these spaces $\Delta_2(x, y) = \frac{1}{2}\|x - y\|^2$. Further,
$$\Delta_p(x, y) \ge \frac{1}{p}\|x\|^p + \frac{1}{p^*}\|y\|^p - \|y\|^{p-1}\,\|x\|.$$
Now, if $(x_n)_{n\in\mathbb{N}} \subset X$ is a sequence and $x \in X$ is a fixed vector, then the inequality
$\Delta_p(x, x_n) \le \rho$ implies
$$\|x_n\|^{p-1}\left(\frac{1}{p^*}\|x_n\| - \|x\|\right) \le \rho.$$
Considering now the cases $\frac{1}{p^*}\|x_n\| - \|x\| \le \frac{1}{2p^*}\|x_n\|$ and $\frac{1}{p^*}\|x_n\| - \|x\| > \frac{1}{2p^*}\|x_n\|$, we
conclude the implication
$$\Delta_p(x, x_n) \le \rho \implies \|x_n\| \le 2p^*\left(\|x\| \vee \rho^{1/p}\right). \tag{2.20}$$
Therefore, (xn )n∈N is bounded whenever ∆p (x, xn ) is bounded. A similar result can be
proven if ∆p (xn , x) ≤ ρ. If the duality mapping is single-valued and continuous (this is
the case, for instance, in a uniformly smooth Banach space) then the continuity is handed
down to both arguments of the Bregman distance ∆p .
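The following Python sketch illustrates these properties of Δ_p in the finite-dimensional stand-in X = Rⁿ equipped with the p-norm (an assumption made only for this example); there J_p(y) = |y|^{p−1} sgn(y) componentwise, and Δ_2 reduces to ½‖x−y‖².

import numpy as np

# Bregman distance Delta_p(x, y) = (1/p)||x||^p - (1/p)||y||^p - <J_p(y), x - y>
# in R^n with the p-norm, where J_p(y) = |y|^{p-1} sgn(y) componentwise.
def J_p(y, p):
    return np.abs(y) ** (p - 1) * np.sign(y)

def bregman(x, y, p):
    return (np.sum(np.abs(x) ** p) - np.sum(np.abs(y) ** p)) / p - J_p(y, p) @ (x - y)

x = np.array([1.0, -2.0, 0.5])
y = np.array([0.3, -1.0, 1.0])
print(bregman(x, y, 2.0), 0.5 * np.sum((x - y) ** 2))   # p = 2: both values coincide
print(bregman(x, y, 1.5), bregman(y, x, 1.5))           # p != 2: nonnegative, not symmetric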
If X is strictly convex, then $x = y$ whenever $\Delta_p(x, y) = 0$, because X strictly convex
implies that $x \mapsto \frac{1}{p}\|x\|^p$ is strictly convex, which in turn implies that $\Delta_p$ is strictly convex
in its first argument. Now, if $x \neq y$, we find for $\lambda \in (0, 1)$
for all $x, y, z \in X$. It is also easy to verify the identity $\Delta_p(x, y) = \Delta_{p^*}(J_p(y), J_p(x))$.
The Xu-Roach Theorem states that in an arbitrary uniformly convex Banach space it
holds
$$\frac{1}{p}\,\sigma_p(y, y - x) \le \Delta_p(x, y),$$
for all x, y ∈ X. Using (2.15) we conclude that in an s−convex Banach space there exists,
for each p ≤ s, a constant Kp,s such that
for all x, y ∈ X. In particular, if the sequence (xn )n∈N ⊂ X is bounded (or p = s), then
there exists a constant C > 0 such that
for all x, y ∈ X. One can additionally prove for an arbitrary s−smooth Banach space, that
for each p > 1, there exists a positive constant C p,s satisfying
F (x) = y (3.1)
with F operating between Banach spaces X and Y , that is, F : D(F ) ⊂ X → Y, where
D(F ) denotes the domain of definition of F . We suppose to have full knowledge of this
operator. An approximation $y^\delta$ for $y$ satisfying
$$\|y - y^\delta\| \le \delta,$$
and the noise level $\delta > 0$ are assumed to be known as well. Suppose now that a solution
$x^+$ of (3.1) exists and, for ease of presentation, assume for now that it is unique. We
aim to find, for each pair $(y^\delta, \delta)$ satisfying the above inequality, a vector $x^\delta$ such that the
$$x_{n+1} = x_n + s_n \tag{3.3}$$
$$A_n s = b_n^\delta \tag{3.4}$$
with a pre-defined constant $0 < \mu < 1$. The Newton iteration (3.3), also called the outer
iteration, is now realized by defining $s_n := s_{n,k_n}$. The algorithm is finally terminated with the
¹ The approximate solution $x^\delta$ actually depends on $\delta$ and $y^\delta$: $x^\delta = x_{(\delta, y^\delta)}$.
$$F_j(x) = y_j, \qquad j = 0,\ldots,d-1. \tag{3.6}$$
Our task can be recast as: for each of the $d$ pairs $(y_j^{\delta_j}, \delta_j)$ satisfying
$$\|y_j - y_j^{\delta_j}\| \le \delta_j, \qquad j = 0,\ldots,d-1, \tag{3.7}$$
find a vector $x^\delta$ such that the regularization property (3.2) holds for
Algorithm 1 K-REGINN
Input: x_N; (y^{δ_j}, δ_j); F_j; F_j'; j = 0,...,d−1; µ; τ;
Output: x_N with ‖y_j^{δ_j} − F_j(x_N)‖ ≤ τ δ_j, j = 0,...,d−1;
ℓ := 0; x_0 := x_N; c := 0;
while c < d do
  for j = 0 : d−1 do
    n := ℓd + j;
    b_n^δ := y_j^{δ_j} − F_j(x_n); A_n := F_j'(x_n);
    if ‖b_n^δ‖ ≤ τ δ_j then
      x_{n+1} := x_n; c := c + 1;
    else
      k := 0; s_{n,0} := 0; choose k_{max,n} ∈ N;
      repeat
        calculate s_{n,k+1} := f(s_{n,k}) from (3.4)   % The meaning of f is explained in Remark 20
        k := k + 1;
      until ‖b_n^δ − A_n s_{n,k}‖ < µ ‖b_n^δ‖ or k = k_{max,n}
      x_{n+1} := x_n + s_{n,k}; c := 0;
    end if
  end for
  ℓ := ℓ + 1;
end while
x_N := x_{ℓd−c};
where (kmax,n )n∈N is an arbitrary sequence of natural numbers and kREG := 0 if (3.9) is
verified. Observe further that if kmax < ∞, where
the function f depends on the Banach spaces X and Y and on the particular method used
to approximate the solution of (3.4), we do not consider any particular method to perform
this task in our convergence analysis in Chapter 4. Instead, we only assume that
some properties of the generated sequence hold true, without caring about how it is generated. In
the next section, however, we adapt some classical and well-known methods from Hilbert spaces to
more general Banach spaces to provide some practical examples of how this sequence could
be generated in order to have the required properties used in the convergence analysis of
Chapter 4.
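For readers who prefer executable pseudocode, the following Python sketch mirrors the structure of Algorithm 1 under simplifying assumptions (finite-dimensional spaces, F_j and F_j' given as callables, and a generic inner_step routine realizing s_{n,k+1} = f(s_{n,k})); it is a schematic transcription, not the implementation used in Chapter 5.

import numpy as np

# Schematic transcription of Algorithm 1 (K-REGINN).
# F, dF: lists of callables for F_j and F_j'; inner_step realizes one inner iteration.
def k_reginn(x, y_delta, delta, F, dF, inner_step, mu=0.8, tau=2.0, k_max=50):
    d = len(F)
    c = 0
    while c < d:
        for j in range(d):
            b = y_delta[j] - F[j](x)                  # b_n^delta
            A = dF[j](x)                              # A_n = F_j'(x_n)
            if np.linalg.norm(b) <= tau * delta[j]:   # discrepancy reached for this sub-problem
                c += 1
                continue
            s, k = np.zeros_like(x), 0
            while np.linalg.norm(b - A @ s) >= mu * np.linalg.norm(b) and k < k_max:
                s = inner_step(A, b, s)               # inner (regularizing) iteration
                k += 1
            x, c = x + s, 0                           # outer Newton update (3.3)
    return x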
represent respectively the F-derivative of the forward operator F[n] at the current iterate
xn and the nonlinear residual. The numbers p, r > 1 are fixed and p∗ and r∗ represent their
conjugate numbers respectively.
Throughout this section we suppose that exact data ($\delta = 0$ in (3.8)) are given. The
objectives are to avoid unnecessary complications at this point of the text as well as to ease
the notation, and for this last reason we temporarily omit the superscript $\delta$. We would like
to stress, however, that all the results presented here can similarly be proven for the noisy
data case. Later, in Chapter 4, we return to the old notation and consider noisy data
again.
The concept in next definition is essential to understand the ideas of the primal gradient
methods of Subsection 3.1.1 below.
K-REGINN updates the current vector $x_n$ by adding the vector $s_{n,k_n}$ found in the inner
iteration: $x_{n+1} = x_n + s_{n,k_n}$. The vector $x_{n+1}$ is now a new approximation for a solution
of (3.1). For this reason, we would like the vectors $s_{n,k}$ to be descent directions for the
functional $\psi_n(x) := \frac{1}{r}\left\|F_{[n]}(x) - y_{[n]}\right\|^r$ from $x_n$.
The function $\frac{1}{r}\|\cdot\|^r$ is F-differentiable if the uniform smoothness of $Y_{[n]}$ is assumed, see
Section 2.2. Assume now the F-differentiability of $F_{[n]}$. In this case the chain rule applies
to the auxiliary function $\varphi_n(t) := \psi_n(x_n + t s_{n,k})$:
$$\varphi_n'(0) = \left.\left\langle J_r\!\left(F_{[n]}(x_n + t s_{n,k}) - y_{[n]}\right),\, F_{[n]}'(x_n + t s_{n,k})\, s_{n,k}\right\rangle\right|_{t=0}$$
$$= \langle J_r(-b_n), A_n s_{n,k}\rangle = \langle J_r(-b_n), A_n s_{n,k} - b_n\rangle - \langle J_r(-b_n), -b_n\rangle$$
$$\le \|b_n\|^{r-1}\left(\|A_n s_{n,k} - b_n\| - \|b_n\|\right).$$
Though this assumption on the space $Y_{[n]}$ facilitates the above computation through the
use of the chain rule for the F-differentiable functions $\frac{1}{r}\|\cdot\|^r$ and $F_{[n]}$, it is not an essential
condition. The result actually holds true under weaker restrictions on the space $Y_{[n]}$, which
guarantee only G-differentiability of norm-functions, as the next proposition shows.
Proposition 22 Let X and Y be Banach spaces with Y being smooth and let F : D(F ) ⊂
X → Y be a Gâteaux-differentiable function in x ∈int(D (F )) . Further, let y ∈ Y be a
fixed vector and define A = F 0 (x) and b = y − F (x) . If s ∈ X satisfies the inequality
kAs − bk < kbk , then s is a descent direction for the functional ψ (·) := kF (·) − yk from x.
Proof. Define the auxiliary functions $\psi_0(t) := \frac{1}{r}\|b - tAs\|^r$, $r > 1$, $\psi_1(t) := \|F(x + ts) - y\|$
and $\psi_2(t) := \|b - tAs\|$. As Y is smooth, the duality mapping $J_r : Y \to Y^*$ is single-valued
and satisfies (see (2.10))
$$\langle J_r(v), w\rangle = \left.\frac{d}{dt}\,\frac{1}{r}\|v + tw\|^r\right|_{t=0}. \tag{3.14}$$
Therefore,
$$\lim_{t\to 0}\frac{\psi_0(t) - \psi_0(0)}{t} = \psi_0'(0) = \langle J_r(b), -As\rangle = \langle J_r(b), b - As\rangle - \langle J_r(b), b\rangle$$
$$\le \|b\|^{r-1}\left(\|As - b\| - \|b\|\right) < 0.$$
The result implies that there exist small numbers $\bar{t}, \gamma > 0$ such that
$$\frac{\psi_0(t) - \psi_0(0)}{t} \le -\gamma < 0$$
Now,
$$\frac{\psi_1(t) - \psi_1(0)}{t} \le \frac{\|F(x + ts) - F(x) - tF'(x)s\|}{t} + \frac{\left(\psi_2(0)^r - \gamma r t\right)^{1/r} - \psi_2(0)}{t}$$
Hence, there exists a number $t_1 > 0$ such that $\psi_1(t) < \psi_1(0)$ for all $0 < t \le t_1$, i.e.,
$\|F(x + ts) - y\| < \|F(x) - y\|$.
It follows that there exists a number $\bar{t} > 0$ such that for all $0 < t \le \bar{t}$ it holds $\Phi(t) < \Phi(0)$,
that is, $\varphi\!\left(s_k + t\left(-j^*_{p^*}(\nabla_k)\right)\right) < \varphi(s_k)$.
Proposition 23 shows that in a smooth Banach space Y, the sequence generated by the
iterative method
$$s_{k+1} := s_k - \lambda_k\, j^*_{p^*}(\nabla_k), \tag{3.15}$$
with $p^* > 1$, $s_0 \in X$ and $\lambda_k > 0$ small enough, satisfies the inequality $\varphi(s_{k+1}) < \varphi(s_k)$,
i.e.,
$$\|As_{k+1} - b\| < \|As_k - b\| \tag{3.16}$$
as long as $\nabla_k \neq 0$. If additionally $s_0 := 0$, then $\|As_k - b\| < \|b\|$ for all⁴ $k \in \mathbb{N}$. The
iterative methods defined in this way are called primal gradient⁵ methods. Algorithm 2
codifies K-REGINN with a primal gradient method in the inner iteration in a smooth Banach
space Y. The pieces highlighted in red represent the part of the algorithm exclusively
related to (3.15).
If the step-size $\lambda_k$ in (3.15) can be chosen independently of k, the associated gradient
method is called the Landweber method⁶ (in short LW), that is, $\lambda_{LW} = $ constant. The Steepest
Descent method (SD) is defined by choosing a step-size $\lambda_{SD}$ satisfying
$$\lambda_{SD} \in \operatorname*{arg\,min}_{\lambda\in\mathbb{R}^+}\,\varphi\!\left(s_k - \lambda\, j^*_{p^*}(\nabla_k)\right),$$
⁴ In view of Proposition 22, we see that for $A = A_n$ and $b = b_n$, the vectors $s_k = s_{n,k}$ are descent
directions for the functional $\left\|F_{[n]}(\cdot) - y_{[n]}\right\|$ from $x_n$.
⁵ Here the iteration occurs in the primal space X, in contrast with the so-called dual methods, where the
iteration happens in the dual space $X^*$, see Subsection 3.1.2.
⁶ In homage to the relevant work of L. Landweber [33].
if such a minimizer exists. Assuming this is the case and additionally assuming that the
function $\varphi$ is F-differentiable (the second assumption is true, for instance, if Y is uniformly
smooth), one can apply the chain rule to find
$$\left.\frac{d}{d\lambda}\,\varphi\!\left(s_k - \lambda\, j^*_{p^*}(\nabla_k)\right)\right|_{\lambda=\lambda_{SD}} = \left.\left\langle \varphi'\!\left(s_k - \lambda\, j^*_{p^*}(\nabla_k)\right),\, -j^*_{p^*}(\nabla_k)\right\rangle\right|_{\lambda=\lambda_{SD}}$$
It follows that, similarly to Hilbert spaces, the gradients of two consecutive iterates are
“orthogonal” in the sense that $\langle j^*_{p^*}(\nabla_k), \nabla_{k+1}\rangle = 0$. Further, the inequality $\varphi(s_{k+1}) \le
\varphi\!\left(s_k - \lambda\, j^*_{p^*}(\nabla_k)\right)$ is immediately verified for all $\lambda \ge 0$ and consequently (3.16) holds. Due
to the nonlinearity of the duality mapping, an explicit formula for $\lambda_{SD}$ is nevertheless
difficult to obtain. The uniqueness of a minimizer is guaranteed, for instance, if A is
injective and Y is strictly convex, because in this case the function $\frac{1}{r}\|\cdot\|^r$ is strictly convex
and $Ax_1 - b \neq Ax_2 - b$ whenever $x_1 \neq x_2$, which implies the strict convexity of $\varphi$ and
consequently the desired result.
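A concrete (and purely illustrative) realization of the primal gradient iteration (3.15) in the discrete stand-ins X = ℓ^p_n, Y = ℓ^r_m is sketched below in Python; the duality mappings are evaluated componentwise as in (2.14), and a constant Landweber-type step-size is assumed.

import numpy as np

# Primal gradient iteration (3.15) for the linearized system A s = b:
#   grad_k = A^T J_r(A s_k - b),   s_{k+1} = s_k - lam * j_{p*}(grad_k),
# with componentwise duality mappings J_q(v) = |v|^{q-1} sgn(v).
def J(v, q):
    return np.abs(v) ** (q - 1) * np.sign(v)

def primal_gradient(A, b, p, r, lam, n_steps):
    p_star = p / (p - 1.0)
    s = np.zeros(A.shape[1])
    for _ in range(n_steps):
        grad = A.T @ J(A @ s - b, r)     # element of the subdifferential of (1/r)||A s - b||^r
        s = s - lam * J(grad, p_star)    # Landweber-type step in the primal space
    return s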
Suppose now that Y is an r-smooth Banach space; then there exists a positive number
$C_r$ (see (2.18)) such that for all $\lambda \ge 0$ and $s_{k+1} = s_k - \lambda\, j^*_{p^*}(\nabla_k)$,
$$\frac{1}{r}\|As_{k+1} - b\|^r = \frac{1}{r}\left\|(As_k - b) - \lambda A\, j^*_{p^*}(\nabla_k)\right\|^r$$
$$\le \frac{1}{r}\|As_k - b\|^r - \left\langle J_r(As_k - b),\, \lambda A\, j^*_{p^*}(\nabla_k)\right\rangle + \frac{C_r}{r}\left\|\lambda A\, j^*_{p^*}(\nabla_k)\right\|^r$$
$$= \frac{1}{r}\|As_k - b\|^r - \lambda\,\|\nabla_k\|^{p^*} + \frac{C_r}{r}\,\lambda^r\left\|A\, j^*_{p^*}(\nabla_k)\right\|^r,$$
which implies that
$$\varphi(s_{k+1}) - \varphi(s_k) \le -\lambda\,\|\nabla_k\|^{p^*} + \frac{C_r}{r}\,\lambda^r\left\|A\, j^*_{p^*}(\nabla_k)\right\|^r =: f(\lambda). \tag{3.17}$$
The above inequality is verified for each method defined via $s_{k+1} = s_k - \lambda\, j^*_{p^*}(\nabla_k)$ and,
in particular, since the step-size $\lambda_{SD}$ minimizes the difference $\varphi(s_{k+1}) - \varphi(s_k)$, inequality
(3.17) holds for this method with an arbitrary $\lambda \ge 0$ on the right-hand side. The optimality
condition $f'(\lambda) = 0$ leads to the step-size
$$\lambda_{MSD}^{r-1} := C_0\,\frac{\|\nabla_k\|^{p^*}}{\left\|A\, j^*_{p^*}(\nabla_k)\right\|^r}, \tag{3.18}$$
with $C_0 := 1/C_r$. The associated gradient method is called the Modified Steepest Descent (MSD)
method. If Y is a Hilbert space and $r = 2$, then the polarization identity (2.5) shows that
$C_r = 1$ can be chosen. In this case, the SD and MSD methods coincide and have the same
step-size:
$$\lambda = \frac{\|\nabla_k\|^{p^*}}{\left\|A\, j^*_{p^*}(\nabla_k)\right\|^2}.$$
If X is a Hilbert space too, then this step-size is given by the expression (for $p = r = 2$):
$$\lambda = \frac{\|\nabla_k\|^{2}}{\|A\nabla_k\|^{2}}.$$
For an arbitrary step-size $\lambda_k \in (0, \lambda_{MSD}]$,
$$\varphi(s_{k+1}) - \varphi(s_k) \le f(\lambda_k) \le -\lambda_k\|\nabla_k\|^{p^*} + \frac{C_r}{r}\,\lambda_k\,\lambda_{MSD}^{r-1}\left\|A\, j^*_{p^*}(\nabla_k)\right\|^r \tag{3.20}$$
$$= -\lambda_k\|\nabla_k\|^{p^*} + \frac{C_r C_0}{r}\,\lambda_k\|\nabla_k\|^{p^*} = -C_1\,\lambda_k\|\nabla_k\|^{p^*} < 0,$$
where $C_1 := 1 - C_r C_0/r > 0$. The above result is true for each primal gradient method with
step-size satisfying $\lambda_k \in (0, \lambda_{MSD}]$ with $0 < C_0 < r/C_r$. It holds for SD too (in principle
not for $\lambda_k = \lambda_{SD}$ in the rightmost term, because we do not know whether $0 < \lambda_{SD} \le \lambda_{MSD}$,
but for an arbitrary $\lambda_k \in (0, \lambda_{MSD}]$). Now, if $\lambda_k \in [c\lambda_{MSD}, \lambda_{MSD}]$ with $0 < c < 1$ being
independent of k,
$$\left(\lambda_k\|\nabla_k\|^{p^*}\right)^{r-1} \ge \left(c\,\lambda_{MSD}\|\nabla_k\|^{p^*}\right)^{r-1}
\ge c^{r-1} C_0\,\frac{\|\nabla_k\|^{p^*-r(p^*-1)}}{\|A\|^r}\,\|\nabla_k\|^{p^*(r-1)} = \frac{c^{r-1} C_0}{\|A\|^r}\,\|\nabla_k\|^{r},$$
which implies that $\lambda_k\|\nabla_k\|^{p^*} \gtrsim \|\nabla_k\|^{r^*}$. From (3.20),
$$\sum_{k=0}^\infty \|\nabla_k\|^{r^*} \lesssim \sum_{k=0}^\infty \lambda_k\|\nabla_k\|^{p^*} \lesssim \sum_{k=0}^\infty \left[\varphi(s_k) - \varphi(s_{k+1})\right] \le \varphi(s_0) < \infty.$$
It follows that
$$\|\nabla_k\| \to 0 \text{ as } k \to \infty. \tag{3.21}$$
Hence, there exists a constant $C_2 > 0$ independent of k such that $\|\nabla_k\| \le C_2$. The result is
true for each primal gradient method with step-size $\lambda_k$ in the interval $[c\lambda_{MSD}, \lambda_{MSD}]$ with
$0 < c < 1$ fixed but arbitrary. Although we do not know if the inequality $c\lambda_{MSD} \le \lambda_{SD} \le
\lambda_{MSD}$ is true, (3.21) is ensured for the SD method too, because (3.20) holds for this method
with an arbitrary $\lambda_k \in (0, \lambda_{MSD}]$. In particular, (3.20) holds for SD with $\lambda_k = \lambda_{MSD}$ for
example, which implies (3.21).
Finally, the inequality $p \le r$ implies that $p^* - r(p^* - 1) \le 0$, which in turn implies
$$\lambda_{MSD}^{r-1} \ge C_0\,\frac{\|\nabla_k\|^{p^*-r(p^*-1)}}{\|A\|^r} \ge C_0\,\frac{C_2^{\,p^*-r(p^*-1)}}{\|A\|^r}.$$
Choosing
$$\lambda_{LW} := \frac{C_0^{\,r^*-1}\,C_2^{\,r^*-p^*}}{\|A\|^{r^*}}, \tag{3.22}$$
we conclude that $\lambda_{LW} \in (0, \lambda_{MSD}]$ and, due to (3.20), the inequality
$$\varphi(s_{k+1}) - \varphi(s_k) \le -C_1\,\lambda_{LW}\,\|\nabla_k\|^{p^*}$$
is valid for the LW method. As $\lambda_{LW}$ is constant, the last inequality immediately implies (3.21).
In summary, we have proven that:
• Y r-smooth implies that each primal gradient method with step-size $\lambda_k \in [c\lambda_{MSD}, \lambda_{MSD}]$
and $0 < c < 1$ (in particular, MSD itself) satisfies (3.16), (3.20) and (3.21).
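In the same discrete setting as the sketch above (an assumption made only for illustration), the Modified Steepest Descent step-size (3.18) can be computed as follows; C0 must satisfy 0 < C0 < r/C_r with C_r the r-smoothness constant of Y.

import numpy as np

# Modified Steepest Descent step-size (3.18):
#   lambda_MSD^{r-1} = C0 * ||grad_k||^{p*} / ||A j_{p*}(grad_k)||^r .
def msd_stepsize(A, grad, p_star, r, C0):
    direction = np.abs(grad) ** (p_star - 1) * np.sign(grad)   # j_{p*}(grad_k)
    num = np.sum(np.abs(grad) ** p_star)                       # ||grad_k||_{p*}^{p*}
    den = np.linalg.norm(A @ direction, ord=r) ** r
    return (C0 * num / den) ** (1.0 / (r - 1.0))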
Lemma 24 Let X and Y be Banach spaces and $(s_k)_{k\in\mathbb{N}}$ be a sequence generated by the
iterative method (3.15) with $s_0 = 0$. Suppose that (3.21) and (3.20) hold. Then there exists
a constant $C \ge 0$ such that
$$\lim_{k\to\infty}\|As_k - b\|^r \le \frac{1}{1+C}\left(\|As - b\|^r + C\,\|b\|^r\right), \quad\text{for all } s \in X. \tag{3.23}$$
Proof. From (3.21), it is possible to choose a subsequence $(\nabla_{k_j})_{j\in\mathbb{N}}$ such that $\|\nabla_{k_j}\| \le
\|\nabla_m\|$ for $m \le k_j$. Let $s \in X$ be an arbitrary vector. As $\nabla_{k_j} \in \partial\varphi(s_{k_j})$, it follows from the
definition of subgradient that
$$\varphi(s_{k_j}) \le \varphi(s) + \langle\nabla_{k_j}, s_{k_j}\rangle - \langle\nabla_{k_j}, s\rangle \overset{(3.15)}{=} \varphi(s) - \sum_{m=0}^{k_j-1}\lambda_m\left\langle\nabla_{k_j},\, j^*_{p^*}(\nabla_m)\right\rangle - \langle\nabla_{k_j}, s\rangle$$
$$\le \varphi(s) + \sum_{m=0}^{k_j-1}\lambda_m\|\nabla_m\|^{p^*} + \|\nabla_{k_j}\|\,\|s\|,$$
Now, (3.20) implies that $(\varphi(s_k))_{k\in\mathbb{N}}$ is a positive decreasing sequence, hence convergent. It
follows that
$$\lim_{k\to\infty}\varphi(s_k) = \lim_{j\to\infty}\varphi(s_{k_j}) \le \lim_{j\to\infty}\frac{1}{1+C}\left[\varphi(s) + C\varphi(0) + \|\nabla_{k_j}\|\,\|s\|\right] = \frac{1}{1+C}\left[\varphi(s) + C\varphi(0)\right].$$
$$\lambda_k := C_0\,\frac{\|As_k - b\|^2}{\|\nabla_k\|^2} \tag{3.25}$$
and $C_0 < 2$ implies that $g(\lambda_k) < 0$, as we wanted. Observe that $C_0 = 1$ turns $\lambda_k$ into the
optimal step-size (in the sense that the resulting method has, among all gradient methods,
the error which decreases with maximal speed), which is obtained from $g'(\lambda_k) = 0$. The
gradient method associated with the step-size $\lambda_k$ with $C_0 = 1$ is just the Minimal Error
method introduced in [17]. Note further that the step-size (3.25) is simpler to compute
than that of the Steepest Descent method (3.19).
To extend the validity of the above results to nonlinear problems in
Banach spaces, more general results than those shown in the last subsection are required. For the
proper adjustment of the DE method to K-REGINN, we suppose for the rest of this subsection
that X is a uniformly smooth and uniformly convex Banach space. Both restrictions
together ensure that the duality mapping $J_p : X \to X^*$, $1 < p < \infty$, is single-valued,
continuous, invertible and with continuous inverse given by $J_p^{-1} = J^*_{p^*} : X^* \to X^{**} \cong X$.
This result provides free access to the dual space $X^*$ in the sense that it is always possible
to transfer a vector from X to its dual space $X^*$, perform an iteration there and then come back
to the original space in a stable way.
We further assume the following assumption on the inverse problem $F(x) = y$:
$$B_\rho(x^+, \Delta_p) := \{v \in X : \Delta_p(x^+, v) < \rho\} \subset D(F),$$
where $\rho > 0$ and $p > 1$ are fixed numbers and the Bregman distance $\Delta_p$ is defined in (2.19).
(b) All the functions $F_j$, $j = 0,\ldots,d-1$, are continuously Fréchet differentiable in
$B_\rho(x^+, \Delta_p)$ and their derivatives satisfy
$$\|F_j'(v)\| \le M \quad\text{for all } v \in B_\rho(x^+, \Delta_p) \text{ and } j = 0,\ldots,d-1,$$
Before starting the analysis in more general Banach spaces, we stay a little longer
in Hilbert spaces and observe how the DE method could be employed as inner
iteration of K-REGINN. Similarly to before, we define the iteration $s_{n,k+1} = s_{n,k} - \lambda_{n,k}\nabla_{n,k}$,
where $\lambda_{n,k} > 0$ and $\nabla_{n,k} = A_n^*(A_n s_{n,k} - b_n)$, with $A_n := F_{[n]}'(x_n)$ and $b_n := y_{[n]} - F_{[n]}(x_n)$.
Observe that the most suitable vector to be approximated in the inner iteration is not $s^+$,
but $e_n := x^+ - x_n$. The reason is quite simple: if $s_{n,k_n} = e_n$, then $x_{n+1} = x_n + e_n = x^+$.
Applying an idea similar to (3.24), we derive the equality
$$\frac{1}{2}\|s_{n,k+1} - e_n\|^2 - \frac{1}{2}\|s_{n,k} - e_n\|^2 = -\lambda_{n,k}\langle s_{n,k} - e_n, \nabla_{n,k}\rangle + \frac{1}{2}\lambda_{n,k}^2\|\nabla_{n,k}\|^2 =: g(\lambda_{n,k}). \tag{3.26}$$
The right-hand side is negative whenever
$$\lambda_{n,k} < 2\,\frac{\langle s_{n,k} - e_n, \nabla_{n,k}\rangle}{\|\nabla_{n,k}\|^2}.$$
Note that the term $\langle s_{n,k} - e_n, \nabla_{n,k}\rangle$ depends on the unavailable information $e_n$. But,
$$\langle s_{n,k} - e_n, \nabla_{n,k}\rangle = \langle A_n(s_{n,k} - e_n),\, A_n s_{n,k} - b_n\rangle \ge \|A_n s_{n,k} - b_n\|^2 - \|A_n e_n - b_n\|\,\|A_n s_{n,k} - b_n\|.$$
Applying now the TCC, Assumption 1(c), and observing that $\|b_n\| \le \frac{1}{\mu}\|A_n s_{n,k} - b_n\|$ for
$k = 0,\ldots,k_n$, see (3.11),
$$\|A_n e_n - b_n\| = \left\|F_{[n]}(x^+) - F_{[n]}(x_n) - F_{[n]}'(x_n)(x^+ - x_n)\right\| \le \eta\,\|b_n\| \le \frac{\eta}{\mu}\|A_n s_{n,k} - b_n\|. \tag{3.28}$$
Therefore
$$\langle s_{n,k} - e_n, \nabla_{n,k}\rangle \ge K_1\,\|A_n s_{n,k} - b_n\|^2,$$
with $K_1 := 1 - \frac{\eta}{\mu}$ (which is positive for $\mu \in (\eta, 1)$). Thus, $\lambda_{n,k} \in (0, \bar{\lambda}_{n,k}]$ with
$$\bar{\lambda}_{n,k} := 2K_1\,\frac{\|A_n s_{n,k} - b_n\|^2}{\|\nabla_{n,k}\|^2}$$
Remark 25 To expand the above ideas to more general Banach spaces, we need to assume
that X is s-convex for some s > 1, which leaves us with two possible approaches. In the
first one, we assume that s = p, which means that the index s coincides with the index used
to define the duality mapping Jp . In this case, the numbers p and s in inequalities (2.22) ,
(2.23) , (2.24) and (2.25) coincide. This framework strongly facilitates the convergence
analysis of K-REGINN presented in Chapter 4. However, when performing the numerical
experiments in Chapter 5 we are interested in the use of the Lebesgue spaces Lp (Ω) with
1 < p < 2, which are not p−convex but 2−convex spaces. This structure forces the use of
the normalized duality mapping J2 in Lp (Ω) instead of the standard option Jp . At first, this
is not a big problem, because J2 can be calculated in Lp (Ω) via (2.14) . But the numerical
experiments we have done suggested that this is not the best approach to guarantee good
reconstructions.
In order to fix this problem, an alternative and more general approach can be applied
assuming the space X to be s−convex with p ≤ s, which makes possible the use of the duality
mapping Jp in the s−convex space Lp (Ω) (s = max {p, 2}). For this reason, we have chosen
to employ this approach to develop our theory. This procedure actually results in a much
more complicated theory, but as a reward we are free to use a more suitable duality mapping
and expect to achieve better reconstructions in our numerical experiments in Chapter 5.
We now clarify how to engage the dual gradient methods in combination with K-REGINN
in Banach spaces. Assume that X is uniformly smooth and s−convex with some s ≥ p.
Suppose that the current outer iterate xn of K-REGINN is well-defined and define the convex
functional $\varphi_n(s) := \frac{1}{r}\|A_n s - b_n\|^r$, $r > 1$. The dual gradient methods are now defined by
the following iteration:
$$J_p(z_{n,k+1}) := J_p(z_{n,k}) - \lambda_{n,k}\nabla_{n,k}, \tag{3.29}$$
where $\lambda_{n,k} > 0$, $\nabla_{n,k} \in \partial\varphi_n(s_{n,k}) = A_n^*\,J_r(A_n s_{n,k} - b_n)$, $s_{n,k} := z_{n,k} - x_n$ and $s_{n,0} := 0$.
Observe that $z_{n,0} = x_n + s_{n,0} = x_n$ as well as $z_{n,k_n} = x_n + s_{n,k_n} = x_{n+1}$. We codify
K-REGINN with a dual gradient method in the inner iteration in Algorithm 3. The fragments
highlighted in red represent here the part of the algorithm corresponding to (3.29).
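The following Python sketch (again a finite-dimensional illustration with componentwise duality mappings, not the thesis code) performs one step of the dual gradient iteration (3.29): the update takes place in the dual space X* and the iterate is mapped back with J_p^{-1} = J_{p*}.

import numpy as np

# One inner step of the dual gradient iteration (3.29):
#   J_p(z_{n,k+1}) = J_p(z_{n,k}) - lam * grad,   grad = A_n^T J_r(A_n s_{n,k} - b_n),
# with s_{n,k} = z_{n,k} - x_n and the componentwise duality mapping J_q(v) = |v|^{q-1} sgn(v).
def J(v, q):
    return np.abs(v) ** (q - 1) * np.sign(v)

def dual_gradient_step(A, b, x_n, z, lam, p, r):
    p_star = p / (p - 1.0)
    s = z - x_n                              # s_{n,k}
    grad = A.T @ J(A @ s - b, r)             # nabla_{n,k}
    z_dual = J(z, p) - lam * grad            # update carried out in the dual space X*
    return J(z_dual, p_star)                 # z_{n,k+1} = J_p^{-1}(dual update)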
To imitate the polarization identity, which is necessary to derive (3.26), we replace the
functional $\frac{1}{2}\|x^+ - \cdot\|^2$ with the Bregman distance $\Delta_p(x^+, \cdot)$. Assume that⁷ $x_n \in B_\rho(x^+, \Delta_p)$,
that is, $\Delta_p(x^+, x_n) < \rho$. In view of (2.20), this inequality implies that $\|x_n\| \le C_{\rho,x^+}$, with
$$C_{\rho,x^+} := 2p^*\left(\|x^+\| \vee \rho^{1/p}\right). \tag{3.30}$$
Our goal now is to define a calculable step-size λDE = λDE (n, k) > 0 such that the
inequalities 0 < λn,k ≤ λDE will imply the monotonicity of the error in the inner iteration:
Further, we prove that the step-size λLW of the Landweber method and λM SD of the
Modified Steepest Descent method satisfy the required inequality and consequently the
monotonicity property (3.31) is ensured for these methods.
The definition of λDE depends on an uniformly bound in n and k for the generated
sequence (zn,k )0≤k≤kn , n ∈ N, see (3.32) and (3.39). But in order to prove that this
7
In Theorem 38, we will prove that if K-REGINN starts with x0 ∈ Bρ x+ , ∆p , then all the outer iterations
sequence is uniformly bounded, it is necessary to use the property (3.31) , which in turn
depends on the definition of λDE . This reasoning suggests that an induction argument is
necessary to prove all these properties at the same time. Note that zn,0 = xn ∈ Bρ (x+ , ∆p )
and therefore kzn,0 k ≤ Cρ,x+ . We prove now by induction that
for k = 0, ..., kn . Assume that (3.32) holds true for some k < kn as well as ∆p (x+ , zn,` ) <
∆p (x+ , zn,`−1 ) for ` = 1, ..., k.
The three points identity (2.21) is applied to replace (3.26) . Definition (3.29) together
with en = x+ − xn yields
Making use of the properties of the duality mapping, we obtain, similarly to (3.27) ,
The last inequality is the exact point where the bound (3.32) is used. We impose now a
restriction on λn,k : if it is possible to find λn,k satisfying
$$C_{p^*,s^*}\left(C_{\rho,x^+}^{\,p-s^*(p-1)}\,\lambda_{n,k}^{s^*}\,\|\nabla_{n,k}\|^{s^*} \;\vee\; \lambda_{n,k}^{p^*}\,\|\nabla_{n,k}\|^{p^*}\right) \le C_0\,\lambda_{n,k}\,\|A_n s_{n,k} - b_n\|^r \tag{3.36}$$
for some $0 < C_0 < 1$, then putting it together with (3.34) in (3.33) we arrive at
The above inequality is the key to prove that g (λn,k ) < 0 and it is fundamental for our con-
vergence analysis in Chapter 4. Observe that it does not depend on the TCC (Assumption
1(c) , page 31), but if we use it, together with k < kn and the definition (3.11), we obtain
like in (3.28) the bound
with $C_3 := 1 - C_0 - \eta/\mu$ (which is positive if $\eta < 1 - C_0$ and $\eta/(1 - C_0) < \mu < 1$). This
inequality ensures that (3.31) holds and accordingly
It remains only to find λDE > 0 such that (3.36) holds for all λn,k ≤ λDE . Looking at
(3.36) , we see that this is the case if
where
$$\lambda_{DE,\ell} := \frac{\|A_n s_{n,k} - b_n\|^{r(\ell-1)}}{\|\nabla_{n,k}\|^{\ell}},$$
with $C_1 := C_0^{s-1}\big/\left(C_{p^*,s^*}^{s-1}\,C_{\rho,x^+}^{\,s-p}\right)$ and $C_2 := C_0^{p-1}/C_{p^*,s^*}^{p-1}$.
Therefore, $\lambda_{DE,s} \lesssim \lambda_{DE,p}$ for all n and k. This means that $\lambda_{DE,s} \lesssim \lambda_{DE} \lesssim \lambda_{DE,p}$. Further,
if a small enough constant replaces the constant $C_1$ in the definition of $\lambda_{DE}$, then $\lambda_{DE} = \lambda_{DE,s}$
can be chosen and the property
still holds.
Remark 27 As long as a minimizer $s_n^+$ of $\varphi_n(s) = \frac{1}{r}\|A_n s - b_n\|^r$ exists and $\lambda_{n,k} \in
(0, \lambda_{DE}]$, the sequence $(\Delta_p(z_n^+, z_{n,k}))_{k\le k_n}$ with $z_n^+ := x_n + s_n^+$ is also monotonically de-
creasing. In fact, proceeding like in (3.33) and (3.35),
with
$$h(\lambda_{n,k}) := C_{p^*,s^*}\left(C_{\rho,x^+}^{\,p-s^*(p-1)}\,\lambda_{n,k}^{s^*}\,\|\nabla_{n,k}\|^{s^*} \vee \lambda_{n,k}^{p^*}\,\|\nabla_{n,k}\|^{p^*}\right) - \lambda_{n,k}\left\langle\nabla_{n,k},\, s_{n,k} - s_n^+\right\rangle.$$
Now, as $\|A_n s_n^+ - b_n\| \le \|A_n e_n - b_n\|$, we obtain like in (3.34) and (3.28),
$$\left\langle\nabla_{n,k},\, s_{n,k} - s_n^+\right\rangle \ge \left(1 - \frac{\eta}{\mu}\right)\|A_n s_{n,k} - b_n\|^r.$$
From (3.36), it follows that for all $\lambda_{n,k} \in (0, \lambda_{DE}]$ it is true that
$$h(\lambda_{n,k}) \le -\left(1 - C_0 - \frac{\eta}{\mu}\right)\lambda_{n,k}\,\|A_n s_{n,k} - b_n\|^r < 0.$$
Notice that
$$\lambda_{DE,s} \gtrsim \frac{\|A_n s_{n,k} - b_n\|^{r(s-1)}}{\|\nabla_{n,k}\|^s} \ge \frac{1}{M^s}\,\|A_n s_{n,k} - b_n\|^{s-r},$$
which implies that
$$\lambda_{DE} \gtrsim \|A_n s_{n,k} - b_n\|^t, \quad\text{with } t := s - r > -r. \tag{3.40}$$
Employing the TCC (Assumption 1(c), page 31), we see that
$$\|b_n\| - \|A_n e_n\| \le \|b_n - A_n e_n\| \le \eta\,\|b_n\|.$$
Hence
$$\|b_n\| \le \frac{M}{1-\eta}\,\|e_n\| \le \frac{M}{1-\eta}\left(\|x^+\| + C_{\rho,x^+}\right). \tag{3.41}$$
Thus, the sequence of residuals $(b_n)_{n\in\mathbb{N}}$ is bounded. Since
$$\|s_{n,k}\| \le \|z_{n,k}\| + \|x_n\| \le 2C_{\rho,x^+}, \tag{3.42}$$
the sequences $(s_{n,k})_{k\le k_n}$ and consequently $(A_n s_{n,k} - b_n)_{k\le k_n}$ are uniformly bounded for all
$n \in \mathbb{N}$. This implies that there exists a constant $C_4 > 0$ independent of n and k such that
$\|A_n s_{n,k} - b_n\| \le C_4$ for all $k \le k_n$ and $n \in \mathbb{N}$. Thus, as $\|A_n s_{n,k} - b_n\| \ge \mu\|b_n\|$ for all
$k = 0,\ldots,k_n - 1$,
$$\lambda_{DE} \gtrsim \|A_n s_{n,k} - b_n\|^{s-r} \ge \frac{C_4^{-r}\,\mu^s}{M^s}\,\|b_n\|^s.$$
In particular, the Landweber method with step-size (independent of k) defined by $\lambda_{LW} :=
C_5\,\|b_n\|^s$, where $C_5 > 0$ is a small constant, is well-defined as inner iteration of K-REGINN
and satisfies $\lambda_{LW} \in (0, \lambda_{DE}]$. Consequently, inequality (3.37) is satisfied for this method.
Even a small constant (in n and k) can be used as Landweber step-size if $s \le r$, because
in this case
$$\lambda_{DE} \gtrsim \|A_n s_{n,k} - b_n\|^{s-r} \ge C_4^{s-r}. \tag{3.43}$$
From definition (3.29) , we can see that
sn,k+1 = Jp∗∗ (Jp (xn + sn,k ) − λn,k ∇n,k ) − xn
and since the Steepest Descent method is the gradient method whose the step-size minimizes
the residual, the most natural manner to define the Steepest Descent method for the dual
gradient methods seems to be choosing a step-size satisfying
λSD ∈ arg minϕn Jp∗∗ (Jp (xn + sn,k ) − λ∇n,k ) − xn .
(3.44)
λ∈R+
for all λ ≥ 0, and in particular, kAn sn,k+1 − bn k ≤ kAn sn,k − bn k is achieved by picking
λ = 0. However, an explicit expression for λSD is hard to find. Even simple inequalities
involving λSD are not easy to be obtained. Unfortunately we were not able to prove that
λSD ∈ (0, λDE ] to include it in our convergence analysis of Chapter 4. But, the Modified
Steepest Descent method defined in (3.18) can be useful here, because it already has an
explicit step-size. For the dual methods it is nevertheless convenient to alter its exponents.
Notice that for ` ∈ {p, s} ,
∗ ∗ r∗ (`−1)
r∗ (`−1)
k∇n,k kp r (`−1) = Jp∗∗ (∇n,k ) , ∇n,k = An Jp∗∗ (∇n,k ) , jr (An sn,k − bn )
r∗ (`−1)
≤
An Jp∗∗ (∇n,k )
kAn sn,k − bn kr(`−1) ,
3.1. SOLVING THE INNER ITERATION 37
and therefore,
∗ r ∗ (`−1)−`
k∇n,k kp kAn sn,k − bn kr(`−1)
r∗ (`−1) ≤ .
k∇n,k k`
A J ∗ (∇ )
n p∗ n,k
Thus, defining8
λM SD := K1 λM SD,s ∧ K2 λM SD,p (3.45)
with ∗ r ∗ (`−1)−`
k∇n,k kp
λM SD,` :=
r∗ (`−1) ,
∗
An Jp∗ (∇n,k )
Using the same argument as that used for the DE method (see (3.43)), we conclude that
we intend to find a step-size λ = λopt such that the corresponding dual gradient method
minimizes f (λ) := ∆p (x+ , zn,λ ) − ∆p (x+ , zn,k ) , yielding a dual gradient method with the
fastest decreasing error in the inner iteration of K-REGINN. Looking to the three points
identity (2.21), we see that this difference can be written as
f 0 (λ) = h∇n,k , zn,k i − hJp∗ (Jp (zn,k ) − λ∇n,k ) , ∇n,k i − ∇n,k , zn,k − x+
8
The step-size of the MSD method is defined slightly differently for the primal and dual gradient methods.
The same occurs with the LW method. We however keep the same notation λM SD and λLW for these step-
sizes in both situations.
9
In Hilbert spaces, Cp∗ ,s∗ = 1/2 can be chosen and therefore, the unique restriction we have is 1 = K` ≤
C` = C0 /Cp∗ ,s∗ = 2C0 , ` = 1, 2. This means that the inequality 1/2 ≤ C0 < 1 needs to be observed.
38 CHAPTER 3. THE INEXACT NEWTON METHOD K-REGINN
= (Jp (zn,k ) − λ2 ∇n,k ) − (Jp (zn,k ) − λ1 ∇n,k ) , Jp∗∗ (Jp (zn,λ2 )) − Jp∗∗ (Jp (zn,λ1 ))
Of course λopt is a minimizer of f because f 0 is negative in zero and positive for large values.
Additionally, the unique minimizer λopt solves the nonlinear equation
∇n,k , Jp∗∗ (Jp (zn,k ) − λopt ∇n,k ) − x+ = 0.
(3.48)
Using again the three points identity, we conclude that for zn,k+1 := zn,λopt ,
∆p x+ , zn,k+1 − ∆p x+ , zn,k = −∆p (zn,k+1 , zn,k ) + Jp (zn,k+1 ) − Jp (zn,k ) , zn,k+1 − x+
and consequently
∆p x+ , zn,k+1 − ∆p x+ , zn,k = −∆p (zn,k+1 , zn,k ) .
An explicit formula for λopt is hard to achieve and to calculate λopt , even numerically, is
very challenging since the nonlinear equation (3.48) depends on the unavailable information
x+ . It is easy to confirm that λopt = h∇n,k , sn,k − en i / k∇n,k k2 in Hilbert spaces, which still
depends on the unavailable information x+ . This leaves little hope to exactly determine λopt
for nonlinear problems using only available information10 . However, using the property
(3.48) a lower bound for λopt can be achieved:
λopt ∇n,k , zn,k − x+ = − hλopt ∇n,k , zn,k+1 − zn,k i
p−s∗ (p−1) s∗ ∗
∗ ∗
≤ 2Cp∗ ,s∗ Cρ,x+ λopt k∇n,k ks ∨ λpopt k∇n,k kp .
10
In Hilbert spaces, the last expression reduces to the calculable step-size λopt = kAsk − bk2 / k∇k k2 if
the problem (3.1) is linear, see (3.25).
3.1. SOLVING THE INNER ITERATION 39
Thus
! !
1 h∇n,k , sn,k − en is−1 1 h∇n,k , sn,k − en ip−1
λopt ≥ ∧ .
2s−1 Cps−1
∗ ,s∗ C
s−p k∇n,k ks 2p−1 Cpp−1
∗ ,s∗ k∇n,k kp
ρ,x+
Remark 28 Unfortunately, we do not know if λopt ≤ λDE to include this method in the
convergence analysis of Chapter 4. However, as λopt minimizes f, we can conclude that
(3.38)
∆p x+ , zn,λopt − ∆p x+ , zn,k = f (λopt ) ≤ f (λDE ) ≤ −C3 λDE kAn sn,k − bn kr ,
which is enough to prove that the inner iteration of K-REGINN terminates (kn < ∞) and
consequently the monotonicity property (3.31) as well as the bound (3.32) is transferred to
the outer iteration, see Theorem 38. This implies that at least weak convergence holds for
this method, see Corollary 41.
We finish this subsection discussing how the dual gradient methods look like when
kmax = 1 is used (see (3.12)). In this case, K-REGINN performs only one inner iteration each
outer iteration whenever the inequality (3.9) is not verified. It follows that
We conclude that this procedure is equivalent to apply a dual gradient method directly to
the nonlinear system (3.6). Thus the convergence of K-REGINN using these methods as inner
iteration implies in particular the convergence of a Kaczmarz version of the dual gradient
methods themselves. Analogously, the iteration
xn+1 = xn − λn,0 jp∗∗ F[n]
0
(xn )∗ Jr F[n] (xn ) − y[n]
1
Tk (s) := kAs − bkr + αk Ω (s) , (3.49)
r
with r > 1, αk > 0 and Ω : X → R+
0 being subdifferentiable and satisfying (sm )m∈N bounded
whenever (Ω (sm ))m∈N bounded. Define now the Tikhonov iteration as
0 ∈ A∗ Jr (Ask+1 − b) + αk ∂Ω (sk+1 ) ,
and conclude that there exist a selection jr : Y → Y ∗ and s∗k+1 ∈ ∂Ω (sk+1 ) such that
1 ∗
s∗k+1 = − A jr (Ask+1 − b) .
αk
If the functional Ω is strictly convex, the minimizer of Tk is unique and consequently, sk+1
is unique in (3.50). If Ω is Gâteaux-differentiable in sk+1 , the above equality becomes
1 ∗
Ω0 (sk+1 ) = − A jr (Ask+1 − b) ,
αk
where Ω0 represents the G-derivative of Ω.
The Tikhonov-Phillips method (TP) is defined by choosing a strictly decreasing zero-
sequence (αk )k∈N and Ω (s) := p1 ks − xkp , with p > 1 and x ∈ X being independent on
k.
Assume from now on that X is smooth and strictly convex. The first restriction is
equivalent to the G-differentiability of p1 k·kp and consequently Ω0 (s) = Jp (s − x) . The
second restriction in turn, is equivalent to the strict convexity of p1 k·kp , which implies that
Tk is strictly convex and therefore the minimizer of this functional is unique. Choosing
x := 0, the TP method results in the implicit iteration
1 ∗
Jp (sk+1 ) = − A jr (Ask+1 − b) . (3.51)
αk
It is easy to confirm that sk+1 = (A∗ A + αk I)−1 A∗ b in Hilbert spaces (using p = r = 2).
Definition (3.50) immediately implies that Tk−1 (sk ) ≤ Tk−1 (0) for all k ∈ N and as
Ω (0) = 0,
kAsk − bk ≤ kbk . (3.52)
Used as inner iteration of K-REGINN, the iteration (3.51) results in a Kaczmarz variation
of the well-known Levenberg-Marquardt method [21]. K-REGINN with the use of TP method
in the inner iteration and the choice x := x0 − xn is transformed into a Kaczmarz version
of the IRGN (Iteratively Regularized Gauss-Newton method, due to Bakushinskii [3], see also
[29]). Observe that, if x 6= 0, then Ω (0) = p1 kxkp 6= 0, which means that the inequality
Tk−1 (sk ) ≤ Tk−1 (0) alone is not enough to ensure (3.52). However, inequality (3.16) can
be proven for this method, as Lemma 30 below shows.
A variation of TP method is defined employing the Bregman distance instead of the
standard norm in X. The functional p1 k· − xkp is then replaced by Ω (s) := ∆p (x + s, x + x)
42 CHAPTER 3. THE INEXACT NEWTON METHOD K-REGINN
where x ∈ X is a fixed vector. The restrictions on X still guarantee the strict convexity
and G-differentiability of Ω in this case. Moreover, Ω0 (s) = Jp (x + s) − Jp (x + x) and we
have for this variation,
1 ∗
Jp (x + sk+1 ) = Jp (x + x) − A jr (Ask+1 − b) . (3.53)
αk
The two variations of TP method coincide for p = 2 whenever X is a Hilbert space, because
in this situation 21 ks − xk2 = ∆2 (x + s, x + x) . Similar to before, x = 0 implies Ω (0) = 0
and the property (3.52) follows again from Tk−1 (sk ) ≤ Tk−1 (0). Inequality (3.16) also
holds for this method, even in case of x 6= 0, as the next lemma demonstrate.
The first part of this lemma is a generalization to Banach spaces of [30, Theo. 2.16].
Lemma 30 Let X and Y be Banach spaces with X being reflexive, A : X → Y linear,
b ∈ Y and (αk )k∈N ⊂ R a strictly decreasing positive sequence with αk → 0 as k → ∞. Let
s0 ∈ X and
sk+1 ∈ arg minTk (s)
s∈X
with
1
Tk (s) :=kAs − bkr + αk Ω (s) ,
r
where r > 1, Ω : X → R+ 0 is subdifferentiable and satisfies (sm )m∈N bounded whenever
(Ω (sm ))m∈N bounded. Then (sk )k∈N is well-defined and satisfies
kAsk+1 − bk ≤ kAsk − bk and Ω (sk+1 ) ≥ Ω (sk ) (3.54)
for all k ∈ N, where the above inequalities are strict in case of Y being strictly convex.
Furthermore,
lim kAsk − bk = inf kAs − bk (3.55)
k→∞ s∈X
and consequently, inequality (3.23) in Proposition 24 holds for C = 0.
Proof. As shown at the beginning of this subsection, a minimizer of Tk exists for each k ∈
N0 . Thus sk and sk+1 are well-defined and fill the optimality conditions 0 ∈ ∂Tk−1 (sk ) and
0 ∈ ∂Tk (sk+1 ) . Therefore, there exist a selection jr of the duality mapping Jr , s∗k ∈ ∂Ω (sk )
and s∗k+1 ∈ ∂Ω (sk+1 ) such that
αk−1 s∗k = −A∗ jr (Ask − b) and (3.56)
αk s∗k+1 ∗
= −A jr (Ask+1 − b) .
Subtracting the second equation from the first and applying both sides of the resulting
equality in the vector sk − sk+1 ,
αk−1 s∗k − s∗k+1 , sk − sk+1 + (αk−1 − αk ) s∗k+1 , sk − sk+1
∗ respectively.
As α k−1 > α k , we conclude that s ,
k+1 ks − s k+1 ≤ 0 (resp.
sk+1 , sk − sk+1 < 0). We proceed applying both sides of second equation of (3.56) in
the vector sk − sk+1 ,
0 ≥ αk s∗k+1 , sk − sk+1 = hjr (Ask+1 − b) , A (sk+1 − sk )i
which proves the first inequality in (3.54) . To prove the second one, we only use Tk−1 (sk ) ≤
Tk−1 (sk+1 ) and apply the just proved inequality kAsk+1 − bk ≤ kAsk − bk.
Now we turn to (3.55) . Because (kAsk − bk)k∈N is a non-increasing sequence, it con-
verges. Let > 0 be given and (sj )j∈N ⊂ X be a sequence satisfying
1 1
kAsj − bkr → a := inf kAs − bkr as j → ∞.
r s∈X r
Then there exists a number J ∈ N such that for all j ≥ J and all k ∈ N,
1 1
a≤ kAsk − bkr ≤ kAsk − bkr + αk−1 Ω (sk )
r r
1 r
≤ kAsj − bk + αk−1 Ω (sj ) ≤ a + + αk−1 Ω (sj ) .
r
In particular
1
a≤ kAsk − bkr ≤ a + + αk−1 Ω (sJ )
r
for all k ∈ N. Since αk → 0 as k → ∞,
1
a ≤ lim kAsk − bkr ≤ a + .
k→∞ r
1
αk ∆p (zk+1 , x + x) ≤ kAx − bkr
r
and with help of (2.21) and (3.54) , we obtain for f (k) := ∆p (x+ , zk ) − ∆p (x+ , x + x) with
e := x+ − x,
1
= − hjr (Ask+1 − b) , A (sk+1 − e)i − ∆p (zk+1 , x + x)
αk
1
≤ kAsk+1 − bkr−1 kAe − bk − kAsk+1 − bkr − αk ∆p (zk+1 , x + x)
αk
1
≤ kAsk+1 − bkr−1 kAe − bk − kAsk − bkr + (rθ − 1) αk ∆p (zk+1 , x + x)
αk
1
≤ kAsk − bkr−1 kAe − bk − kAsk − bkr + C2 kAx − bkr ,
αk
∆p x+ , zn,k+1 − ∆p x+ , xn
(3.59)
1 h i
≤ kAn sn,k − bn kr−1 (kAn en − bn k − kAn sn,k − bn k) + C2 kbn kr .
αk
Moreover, the vector zn,k+1 in (3.58) is the unique minimizer of the Tikhonov functional
1
Tn,k (z) := kAn (z − xn ) − bn kr + αk ∆p (z, xn ) . (3.60)
r
Remark 31 Inequality (3.59) was achieved using only the conditions for existence and
uniqueness of a minimizer of (3.49) together with the hypotheses of Lemma 30. These are
actually very few requirements. The s−convexity of X, for instance, is not necessary either
to the well-definedness of the TP method nor to prove (3.59) above. The TCC (Assumption
1(c) , page 31) is not necessary as well.
The main difficulties in studying the convergence of K-REGINN using the Tikhonov-
Phillips-type methods as inner iteration arise from the fact that the iteration sk+1 is actually
independent of the previous iteration sk . This fact also indicates that some information is
wasted during the inner iteration. It would be helpful if we could count on an iteration
similar to (3.29) of the dual gradient methods, where the available information sk is used
to generate the update sk+1 . Motivated by these facts, we introduce the Iterated-Tikhonov
method (IT), where the sequence (αk )k∈N is chosen being independent of k , i.e., αk = α
for all k ∈ N. In contrast, the functional Ω is dependent on this variable: Ω (s) = Ωk (s) :=
1 p
p ks − sk k . Then from (3.50) ,
1
Jp (sk+1 − sk ) = − A∗ jr (Ask+1 − b) ,
α
for some selection jr : Y → Y ∗ . Adopting the Bregman distance variation
Ωk (s) := ∆p (x + s, x + sk ) ,
1 ∗
Jp (x + sk+1 ) = Jp (x + sk ) − A jr (Ask+1 − b) . (3.61)
α
The inequality kAsk+1 − bk ≤ kAsk − bk is an immediate consequence of Tk (sk+1 ) ≤
Tk (sk ). Moreover, Tk (sk+1 ) < Tk (sk ) for sk+1 6= sk because in this case Ωk (sk+1 ) > 0 and
accordingly (3.16) applies. Since s0 = 0, inequality (3.52) is also true.
Our target now is to prove a similar inequality to (3.37) for the Bregman variation of
IT method. Similarly to what occurs with the dual gradient methods, this property needs
to be proven by induction at the same time with (3.31) and (3.32) . For this reason, it is
necessary to observe the inner and outer iteration of K-REGINN simultaneously. Regarded
as inner iteration of K-REGINN, IT method has a constant regularization parameter α in
the inner iteration, but it can be chosen dependent on the index n of the outer iteration:
α = αn . Let Assumption 1, page 31, hold true and assume that X is uniformly smooth and
s−convex with p ≤ s ≤ r. Define now x = xn , sk = sn,k , A = An , b = bn and α = αn in
(3.61), which results in
1 ∗
Jp (zn,k+1 ) = Jp (zn,k ) − A jr (An sn,k+1 − bn ) , (3.62)
αn n
3.1. SOLVING THE INNER ITERATION 45
with zn,k := xn +sn,k . With these definitions, the vector zn,k+1 ∈ X is the unique minimizer
of the Tikhonov functional
1
Tn,k (z) := kAn (z − xn ) − bn kr + αn ∆p (z, zn,k ) . (3.63)
r
In order to prove (3.37) , and based in the proof of this inequality for the dual gradient
method DE presented in the last subsection, we assume that11 xn ∈ Bρ (x+ , ∆p ) and observe
that this implies that ∆p (x+ , zn,0 ) = ∆p (x+ , xn ) < ρ and consequently kzn,0 k ≤ Cρ,x+ , with
the constant Cρ,x+ > 0 being defined in (3.30) . Suppose now by induction that for some
k ∈ N, the inner iterates zn,0 , ..., zn,k are well-defined and satisfy the inequalities (3.31)
and (3.32) . Everything we need to complete the induction proof is that the inequality
∆p (x+ , zn,k+1 ) < ∆p (x+ , zn,k ) holds. To this end, we adapt ideas of our previous work
[39]. Applying the three points identity (2.21) :
1
=− hjr (An sn,k+1 − bn ) , An (sn,k+1 − en )i
αn
1
= hjr (An sn,k+1 − bn ) , (An en − bn ) − (An sn,k+1 − bn )i
αn
1
≤ kAn sn,k+1 − bn kr−1 kAn en − bn k − kAn sn,k+1 − bn kr .
αn
We would like to have the linearized residual in the k−th inner iterate, kAn sn,k − bn k,
instead of kAn sn,k+1 − bn k in the rightmost term of above inequality. Observe however
that, since the functional k·kr is convex,
it follows that
∗ −1 ∗ −1 ∗ −1)
k∇n,k+1 kp ≤ Mp kAn sn,k+1 − bn k(r−1)(p (3.66)
p∗ −1 (r−1)(p∗ −1)−1
≤M kbn k kAn sn,k − bn k .
Proceeding similarly, replacing p∗ with s∗ and using at this time the inequality (r − 1) (s∗ − 1)−
1 ≥ 0 we arrive at
M r kzn,k+1 − zn,k kr ≤ C (n)r kAn sn,k − bn kr ,
with
p−s∗ (p−1) ∗ ∗ −1)−1
∗
C p∗ ,s∗ M p kbn k(r−1)(p
∗ −1)−1
C p∗ ,s∗ Cρ,x+ M s kbn k(r−1)(s
C (n) := ∗ −1 ∨ ∗ .
αnp αns −1
∗
Choose 0 < C0 < 2−1/r and observe that C (n) ≤ C0 if and only if αn ≥ αmin,n with
p−1 s−1 s−p s r−s
C p∗ ,s∗ M p kbn kr−p C p∗ ,s∗ Cρ,x + M kbn k
αmin,n := ∨ .
C0p−1 C0s−1
∆p x+ , zn,k+1 − ∆p x+ , zn,k
(3.67)
1
≤ kAn sn,k − bn kr−1 (kAn en − bn k − C1 kAn sn,k − bn k) ,
αn
with C1 := 1/2r−1 −C0r > 0, which has the same form as (3.37) with λn,k = 1/αn . Employing
the TCC and choosing carefully the constant µ, we conclude, like in (3.28) and (3.38), that
the right-hand side of (3.67) is negative, this is, ∆p (x+ , zn,k+1 ) < ∆p (x+ , zn,k ) , completing
the induction proof.
Notice that, from TCC, the sequence of the residuals can be proven to be bounded, this
is, kbn k ≤ C2 (see (3.41)) and since p ≤ s ≤ r,
p−1 s−1 s−p s r−s
C p∗ ,s∗ M p C2r−p C p∗ ,s∗ Cρ,x + M C2
αmin,n ≤ ∨ =: αmin ,
C0p−1 C0s−1
which means that the inequality αn ≥ αmin implies αn ≥ αmin,n . Choosing, for instance,
αmax := Kαmin with K > 1, we conclude that all the above results hold true for
Remark 32 See that, if a single step is given in each inner iteration, i.e., if kn = 1
whenever (3.9) does not hold (this means that kmax = 1 in (3.12)), then xn+1 minimizes
1
0
r
Tn (x) =
y[n] − F[n] (xn ) − F[n] (xn ) (x − xn )
+ αn ∆p (x, xn ) .
r
3.1. SOLVING THE INNER ITERATION 47
with
p−s∗ (p−1) s∗ ∗
∗ ∗
g (λn,k ) := −λn,k h∇Tn,k , sn,k − en i + Cp∗ ,s∗ Cρ,x+ λn,k k∇Tn,k ks ∨ λpn,k k∇Tn,k kp
hA∗n jr (An sn,k − bn ) , sn,k − en i ≥ kAn sn,k − bn kr − kAn sn,k − bn kr−1 kAn en − bn k ,
Ωn (en ) ≤ C3 (3.71)
for all n ∈ N, then choosing 0 < αn ≤ (C4 /C3 ) kbn kr with 0 < C4 < 1,
with C1 and C2 like in (3.39) and λDE,` being the same as λDE,` but with ∇Tn,k replacing
∇n,k , i.e.,
kAn sn,k − bn kr(`−1)
λDE,` := .
k∇Tn,k k`
Thus, λn,k ∈ 0, λDE implies that
h i
r−1 r
g (λn,k ) ≤ λn,k kAn sn,k − bn k (kAn en − bn k − C5 kAn sn,k − bn k) + C4 kbn k . (3.73)
1
αn kΩn (sn,k )k . αn . kbn kr ≤ kAn sn,k − bn kr
µr
for k = 0, ..., kn − 1. Thus,
because kAn sn,k − bn k is uniformly bounded (see (3.41) and (3.42)). Consequently,
and therefore,
λDE & kAn sn,k − bn ks−r ∧ kAn sn,k − bn kp−r & kAn sn,k − bn ks−r (3.74)
for k = 0, ..., kn − 1.
Similarly to (3.43) , one can prove for s ≤ r that λDE,` & C6−r , where C6 > 1 is an
upper bound to kAn sn,k − bn k . This reasoning implies that the Landweber method, with
constant step-size given by
λLW := C6−r , (3.75)
satisfies λLW ≤ λDE , which implies that inequality (3.73) holds for this method.
Suppose from now on that the functional Ωn is defined by Ωn (s) := p1 ks − xn kp , with
(xn )n∈N ⊂ X being independent of k. Then the iteration (3.70) assumes the form
In this case, as X is s−convex and hence strictly convex, Ωn is strictly convex too and the
minimizer of Tn , defined in (3.69) , is unique.
50 CHAPTER 3. THE INEXACT NEWTON METHOD K-REGINN
In order to find an upper bound C3 for Ωn (en ) = p1 ken − xn kp , an upper bound for
kx+ k normally needs to be known. For example, if xn := −xn , i.e., Ωn (s) = p1 ks + xn kp ,
p
then we need a bound for Ωn (en ) = p1 kx+ k . If the functional Ωn (s) = p1 ks − (x0 − xn )kp
is considered, then
1
x+ − x0
p ≤ 1
x+
+ kx0 k p .
Ωn (en ) =
p p
p
The bound Ωn (en ) ≤ kx+ k + Cx+ ,ρ /p holds in case of Ωn (s) = p1 kskp is chosen and
xn ∈ Bρ (x+ , ∆p ) is assumed.
We finish this subsection giving a practical example: consider the iteration (3.76) with
xn := 0, which is equivalent to the dual gradient iteration (3.70) applied to the Tikhonov
functional
1 1
Tn (s) = kAn s − bn kr + αn kskp .
r p
If one wants to use SD method to iteratively minimize this functional, it is necessary to
define the step-size λn,k in (3.70) such that the number Tn (sn,k+1 ) is as small as possible.
As
sn,k+1 = Jp∗∗ (Jp (sn,k + xn ) − λn,k ∇Tn,k ) − xn ,
we conclude that λn,k ∈ arg minh (λ) , with h : R+ +
0 → R0 defined by
λ∈R+
0
k∇Tn,k k2
λSD = .
kA∇Tn,k k2 + αn k∇Tn,k k2
The result implies in particular that 1/ M 2 + αn ≤ λSD ≤ 1/αn . Finally, from the modi-
fied version of Young’s inequality12 , it follows that for all a, b ≥ 0 and > 0,
2
b2
2 2 a 1+ 2
(a + b) ≤ a + 2 + + b2 = a + (1 + ) b2 .
2 2
Thus,
it follows that
1+
k∇Tn,k k4 ≤ kAn sn,k − bn k2 kA∇Tn,k k2 + αn k∇Tn,k k2 ,
and consequently,
1 + kAn sn,k − bn k2
λSD ≤ ,
k∇Tn,k k2
for all > 0. Choosing 0 < ≤ 1/ (1 − C0 ) , we find (1 + ) / ≤ C0 , which implies that
λSD ≤ λDE , see (3.72) .
52 CHAPTER 3. THE INEXACT NEWTON METHOD K-REGINN
Chapter 4
Although we do not consider in this chapter any specific method to generate the inner
iteration of K-REGINN, some properties of the inner iteration sequence (sn,k )k∈N are required
in form of assumptions. Only these assumptions are then used to prove general results for
the outer iteration sequence (xn )n∈N . Many of these required properties were previously
proven for the sequences generated by the methods presented in Section 3.1, which means
that these sequences can be used as inner iteration of K-REGINN and the respective results
proved in this chapter are assured for these methods. We point out however, that any
sequence satisfying the necessary requirements assumed in this chapter can be used as
inner iteration of K-REGINN and all the respective results will accordingly hold for them.
We assume again that only noisy data is available, that is, δ > 0 in (3.8) is given. To
δ[n]
stress this fact, we accordingly add a superscript δ in the residual bn : bδn = y[n] − F[n] (xn ) .
Lemma 33 Let X and Y be Banach spaces with Y being smooth. Let Assumption 1 in
page 31 hold true and kn = kREG . If xn is
well-defined and
kREG < ∞, then sn,kn is a
δ[n]
descent direction for the functional ψn (·) :=
F[n] (·) − y[n]
from xn .
Proof. The result follows from Proposition 22 and
An sn,k − bδn
< µ
bδn
<
bδn
.
n
δ[n]
δ[n]
The above lemma shows that
F[n] (xn + tsn,kn ) − y[n]
<
F[n] (xn ) − y[n]
for t > 0
small enough. But, as we will see later in Theorem
35, if the right
assumptions are
required,
δ[n]
δ[n]
then t = 1 can be used and consequently
F[n] (xn+1 ) − y[n]
<
F[n] (xn ) − y[n]
.
53
54 CHAPTER 4. CONVERGENCE ANALYSIS OF K-REGINN
The Assumption 2 holds true if the sequence (sn,k )k∈N is generated from the following
methods:
• The primal gradient LW method, defined by iteration (3.15) with the constant step-
size (3.22) and with the additional requirement p ≤ r. For this method, we also assume
the hypotheses of Lemma 24. In particular, the space Y[n] needs to be r−smooth.
We did not prove this property for the dual gradient methods, Iterated-Tikhonov and
mixed gradient-Tikhonov methods. Therefore, the results shown in this section, which are
based on Assumption 2 above are in principle not guaranteed for these methods. However,
the inequalities (3.37) , (3.67) and (3.73) are strong enough to guarantee not only the results
of this section, but even the stronger results of the next one (see Theorem 38 below).
Assumption 2 is a weaker version of
lim
An sn,k − bδn
= inf
An s − bδn
,
k→∞ s∈X
in Hilbert spaces. This last property is exactly the Assumption (2.2) in [36], used to prove
[36, Lemma 2.3], which is a similar version of next proposition. Assumption 2 combined
with a careful choice of µ implies that the inner iteration stops after finitely many iterations.
Proof. η < 1 together with the restriction on τ imply (1 + η) /τ + η < 1, which in turn
implies that the interval of picking
the
parameter µ is non-empty. If inequality (3.9) is
satisfied, then kn = 0. Otherwise,
bδn
> τ δ[n] and we define en := x+ − xn , use (3.7) and
the TCC to obtain
δ
δ[n] 0 +
A e − b = y − F (x ) − F (x ) x − x (4.1)
n n n
[n] [n] n [n] n n
δ[n] 0
− y[n]
+
F[n] x+ − F[n] (xn ) − F[n] (xn ) x+ − xn
≤
y[n]
Theorem 35 (Termination) Let X and Y be Banach spaces with Y being smooth. Let
D(F ) be open, d = 1, kn = kREG and suppose that Assumption 2 holds true. Suppose
additionally that Assumption 1 in page 31 holds true with Bρ (x+ , ∆p ) replaced by £ (x) for
some x ∈ X fixed and start with x0 ∈ £ (x) . Further, assume that the constant η in the
TCC (Assumption 1(c)) is small enough to satisfy the inequality
ηr + C
< (1 − 2η)r (4.2)
1+C
and the constant τ in the discrepancy principle (3.13) is large enough to verify the inequality
r
1+η
+ η + C < (1 + C) (1 − 2η)r . (4.3)
τ
Define now the constants 0 < µmin < µmax < 1 as
r
1+η
τ + η +C
µrmin := , µmax := 1 − 2η (4.4)
1+C
and pick up µ ∈ (µmin , µmax ) . Then, there exists a constant 0 < Λ < 1 such that:
1. all the iterates of REGINN are well-defined and belong to the level set £ (x0 ) as long as
the discrepancy principle (3.13) is not satisfied;
for n = 0, ..., N − 1.
and therefore, inequality (4.3) is verified whenever τ is large enough. Further, the restriction
on τ implies that the constants µmin and µmax in (4.4) are well-defined and thus the interval
for choosing the tolerance µ is non-empty. Finally, choosing Λ ∈ R satisfying
η (1 + µ)
µ+ < Λ < 1,
1−η
we observe that the restriction µ < µmax implies that µ + η (1 + µ) / (1 − η) < 1. Hence Λ
is well-defined. Further, as 1 − 2η < 1, we also have τ > (1 + η) / (1 − η) and Proposition
34 applies.
We use now an inductive argument: x0 is well-defined and belongs to £ (x0 ) . Suppose
that the iterates x0 , ..., xn with n ≤ N − 1, are well-defined, belong to £ (x0 ) and satisfy
inequality (4.6) . From Proposition 34, kn < ∞ and consequently xn+1 = xn + sn,kn is
well-defined. We prove next that xn+1 ∈D(F ) and inequality (4.6) holds for this vector,
which will prove that xn+1 ∈ £ (x0 ) . In fact, for each t ∈ R define the vector
xn,t := xn + tsn,kn ∈ X
and see that for each t ∈ R such that xn,t ∈ £ (x) ⊂D(F ) , it is true that (see Assumption
1(c))
F (xn,t ) − y δ
≤ (1 − t)
bδn
+ t
bδn − F 0 (xn ) sn,kn
(4.7)
We want to prove that tmax = 1. Observe first that the above set where the supremum is
taken is
non-empty
because from Lemma 33, sn,kn is a descent direction for the functional
ψ (·) :=
F (·) − y δ
from xn and since D(F ) is open, there exists a constant t > 0 such
that for all 0 < t ≤ t, xn,t ∈D(F ) and
Induction
F (xn,t ) − y δ
<
F (xn ) − y δ
≤
F (x0 ) − y δ
.
This implies that xn,t ∈ £ (x0 ) for all 0 < t ≤ t. Assume next that tmax < 1, i.e., xn,tmax ∈
∂£ (x0 ) . As x0 ∈ £ (x) , it follows that
Thus,
F (xn,tmax ) − y δ
< [1 − tmax (1 − Λ)]
bδn
<
bδn
≤
bδ0
,
which
xn,tmax ∈ ∂£ (x0 ) . Choosing t = 1 in (4.7) we find (4.6) . Now, from (4.6) ,
δ
contradicts
bn
≤ Λn
bδ
→ 0 as n → ∞, which implies that (3.13) is satisfied for n large enough.
0
Finally, from
τ δ <
bδN −1
≤ ΛN −1
bδ0
we obtain
ln τ δ/
bδ0
N −1< .
ln (Λ)
Note that the level set £ (x) can be replaced by £ (x0 ) or X in the above theorem.
Corollary 36 Assume all the hypotheses of Theorem 35. Then, F xN (δ) → F (x+ ) as
δ → 0.
From Tangential Cone Condition, Assumption 1(c) , page 31, it follows easily that
1
Fj0 (w) (v − w)
kFj (v) − Fj (w)k ≤
1−η
xN (δ) − x+
=
F 0 x+ xN (δ) − x+
A
≤ (η + 1)
F xN (δ) − F x+
≤ (η + 1) (1 + τ ) δ.
The boundedness of £ (x0 ) would certainly facilitate a convergence analysis of the family
xN (δ) δ>0 ⊂ £ (x0 ) using for instance, a weak convergence argument in reflexive Banach
spaces. There is no reason however, to believe that this is the case. On the contrary, this
set is expected to be unbounded for ill-posed problems, which stimulate us to concentrate
in a local convergence analysis restricted to the bounded set Bρ (∆p , x+ ) .
58 CHAPTER 4. CONVERGENCE ANALYSIS OF K-REGINN
such that
∆p x+ , zn,k+1 − ∆p x+ , zbn,k
r−1
r
δ
≤ λn,k
An sn,k − bn
An en − bn
− K1
An sn,k − bn
+ K0
bδn
,
δ
δ
The following methods satisfy the required properties of Assumption 3, with zbn,k = zn,k ,
provided X is uniformly smooth and s−convex with p ≤ s:
• The dual gradient method DE, presented in (3.29) and (3.39) , Subsection 3.1.2 with
K1 = 1 − C0 , 0 < C0 < 1, K0 = 0 (see (3.37)) and t = s − r (see (3.40));
• The Bregman variation of the IT method defined by the implicit iteration (3.62)
in Subsection 3.1.3 with K1 = 1/2r−1 − C0r , 0 < C0r < 1/2r−1 , λn,k = 1/αn and
K0 = t = 0 (see (3.67) and (3.68));
Assumption 3 is a generalization of [36, Assumption (3.2)] and the next Theorem cor-
responds to Theorem 3.1 in the same reference.
Theorem 38 (Decreasing error) Let X and Y be Banach spaces with X being uniformly
smooth and uniformly convex. Assume that Assumption 3 and Assumption 1 in page 31
hold true with the constant η in the TCC satisfying
0 ≤ η < K 1 − K0 ,
4.2. DECREASING ERROR AND WEAK CONVERGENCE 59
∆p x+ , xn+1 ≤ ∆p x+ , xn
(4.10)
for n = 0, ..., N − 1, where the equality holds if and only if (3.9) holds;
4. the generated sequence is bounded:
if zbn,k := xn .
Proof. The restrictions on the constant η imply the well-definedness of τ and µmin < 1. We
use induction: as x0 ∈ Bρ (x+ , ∆p ) , the bound kx0 k ≤ Cρ,x+ holds (see (2.20)). Suppose
that the iterates x0 , ..., xn , with n < N are well-defined, belong to Bρ (x+ , ∆p ) and satisfy
(4.10) as well as (4.11) . If (3.9) is
verified, then xn+1 = x
n and all the required properties
hold true for xn+1 . Otherwise, as
bδn
≤ µ1
An sn,k − bδn
for k = 0, ..., kn − 1, we proceed
which implies that ` < ∞ and consequently kn < ∞. This means that the inner iteration
terminates. Using ` = kn in the above inequality we obtain
n −1
kX
r
C0 λn,k
An sn,k − bδn
≤ ∆p x+ , zn,0 −∆p x+ , zn,kn = ∆p x+ , xn −∆p x+ , xn+1 ,
0<
k=0
which implies (4.12) and (4.10) . Using now (4.10) , we conclude that
∆p x+ , xn+1 ≤ ∆p x+ , xn ≤ ... ≤ ∆p x+ , x0 < ρ,
which means that xn+1 ∈ Bρ (x+ , ∆p ) ⊂D(F ) and that (4.11) is true(see (2.20)).
It remains
to prove that the outer iteration terminates. Define the set I := n ∈ N0 :
bδn
> τ δ[n]
and let n represent the largest number in I (which in principle could possibly be ∞). For any
t+r
n ∈ I the inequality
bδn
≥ (τ δmin )t+r holds, where δmin := min {δj : j = 0, ..., d − 1} >
0. Now, as kn ≥ 1 for all n ∈ I and kn = 0 otherwise, we obtain from (4.12),
n
X
δ
t+r
t+r
X
X
t+r
(µτ δmin ) ≤ µ
bn
kn ≤ µ
bδn
kn
n∈I n∈I n=0
X n −1
n kX
r
δ
. λn,k
An sn,k − bn
n=0 k=0
Xn
∆p x+ , xn − ∆p x+ , xn+1 ≤ ∆p x+ , x0 < ∞,
.
n=0
which can only be true if n < ∞. Then N (δ) = n + 1 < ∞.
We turn now to the proof in the case zbn,k = xn . Observe first that τ > (1 + η) / (1 − η) .
Now, as C = 0 in Assumption 2, 0 < K1 ≤ 1 and K0 ≥ 0, we conclude that µmin is larger
than that one in Proposition 34, which implies that the results of this proposition are valid
here and therefore kn is finite. Using k = kn − 1 in (4.14) ,
∆p x+ , xn+1 − ∆p x+ , xn = ∆p x+ , zn,kn − ∆p x+ , zbn,k
r
δ
≤ −C0 λn,kn −1
An sn,kn −1 − bn
,
which is (4.13) . Consequently, xn+1 ∈ Bρ (x+ , ∆p ) and (4.10) as well as (4.11) is true.
Defining the set I and the number δmin as before,
t+r X
X X
t+r
µt+r (τ δmin )t+r ≤ µ
bδn
≤
An sn,kn −1 − bδn
n∈I n∈I n∈I
Xn
r
. λn,kn −1
An sn,kn −1 − bδn
n=0
Xn
∆p x+ , xn − ∆p x+ , xn+1 ≤ ∆p x+ , x0 < ∞.
.
n=0
4.2. DECREASING ERROR AND WEAK CONVERGENCE 61
Remark 39 Assuming the hypotheses of Theorem 38, the results of Corollaries 36 and 37
are clearly true. Furthermore, the monotonicity estimate (4.10) can be proven in a more
general setting:
∆p (ϑn , xn+1 ) ≤ ∆p (ϑn , xn ) , (4.15)
where ϑn represents a solution of the [n]-th equation in Bρ (x+ , ∆p ): y[n] = F[n] (ϑn ) . We
remark further that, for those iterations satisfying kn = kREG and where inequality (3.9) is
violated by xn , we find following [21],
δ[n]
δ[n] 0
y[n] − F[n] (xn+1 )
≤
y[n] − F[n] (xn ) − F[n] (xn ) (xn+1 − xn )
0
+
F[n] (xn+1 ) − F[n] (xn ) − F[n] (xn ) (xn+1 − xn )
δ[n]
≤ µ
y[n] − F[n] (xn )
+ η
F[n] (xn+1 ) − F[n] (xn )
δ[n]
δ[n]
≤ (µ + η)
y[n] − F[n] (xn )
+ η
y[n] − F[n] (xn+1 )
,
so that
δ[n]
δ[n]
µ+η
y[n] − F[n] (xn+1 )
≤ Λ
y[n] − F[n] (xn )
with Λ := .
1−η
Finally, if η satisfies η + K0 < (1 − 2η)r K1 (this is possible for a small η because K0 < K1 )
and
η+1
τ> r
(1 − 2η) K1 − (η + K0 )
then µrmin < (1 − 2η)r and restricting µ to (µmin , 1 − 2η) yields Λ < 1. In particular, if
d = 1 (only one equation in (3.6) is considered), (4.6) and (4.5) are true.
Thus the sequences (zn,k )0≤k≤kn , n ≤ N (δ) , are uniformly bounded in n, k and δ.
The monotonicity in the inner iteration does not follow directly from (4.14) if zbn,k = xn ,
but the uniform bound (4.16) still holds in this case because
∆p x+ , zn,k+1 ≤ ∆p x+ , xn < ρ.
Weak convergence of K-REGINN follows immediately from (4.11) and the reflexivity of
X.
Corollary 41 (Weak convergence) Let all the assumptions of Theorem 38 hold true.
If the operators
Fj , j = 0, . . . , d − 1, are weakly sequentially closed then for any se-
(δj )i (i)
quence yj with δ = max (δj )i : j = 0, . . . , d − 1 → 0 as i → ∞, the se-
i∈N
quence xN (δ(i) ) contains a subsequence that converges weakly to a solution of (3.1)
i∈N
in Bρ (x+ , ∆p ) . If x+ is the unique solution of (3.1) in Bρ (x+ , ∆p ) , then xN (δ) δ>0 con-
Assumption 4 If the iterate xn of K-REGINN is well-defined, then the inner iteration se-
quence is generated by
where (vn,k )k≤kn ⊂ Y[n] satisfies kvn,k k ≤ kAn sn,k − bn k. The sequence (γn )n∈N ⊂ R obeys
the inequalities 0 ≤ γn ≤ K2 kbn kr with K2 ≥ 0. Further, (xn )n∈N ⊂ X is an arbitrary
sequence and K2 = 0 whenever this sequence is unbounded or in case zbn,k = xn .
• All the dual gradient methods, defined by iteration (3.29) in Subsection 3.1.2 and
satisfying λn,k ≤ λDE . In particular, the DE method itself, the MSD and LW methods.
Here, zbn,k = zn,k , vn,k = An sn,k − bn and K2 = 0;
• The Bregman variation of the Iterated-Tikhonov method (3.62), with zbn,k = zn,k ,
vn,k = An sn,k+1 − bn and K2 = 0;
• The mixed gradient-Tikhonov methods, defined by iteration (3.76) , with zbn,k = zn,k ,
vn,k = An sn,k − bn , γn = αn and K2 = K0 /K3 , where
K0 is defined
in Assumption 3,
page 58, and K3 is an upper bound to the sequence 1
p ken − xn kp (see Subsection
n∈N
3.1.4).
With the next lemma we prepare our convergence proof for the exact data case.
Lemma 42 Assume all the hypotheses from Theorem 38 but with µ ∈ η+K K1
0
, 1 and let
Assumption 4 hold true. Then,
for all n ∈ N.
4.3. CONVERGENCE WITHOUT NOISE 63
n −1
kX
= hJp (zn,k+1 ) − Jp (zn,k ) , en i
k=0
n −1
kX
= λn,k (hjr (vn,k ) , −An en i + γn hJp (sn,k − xn ) , −en i) .
k=0
If (xn )n∈N is not bounded, then γn ≡ 0 (K2 = 0), otherwise, since the sequences (en ) and
(sn,k ) are uniformly bounded, see (4.11) and (4.16) ,
n −1
kX
Jp (xn+1 ) − Jp (xn ) , x+ − xn . λn,k kAn sn,k − bn kr−1 kAn en k + kbn kr
k=0
n −1
kX
≤ λn,k kAn sn,k − bn kr−1 (η + 1) kbn k + kbn kr
k=0
kn −1
η+1 1 X
≤ + r λn,k kAn sn,k − bn kr
µ µ
k=0
. ∆p x , xn − ∆p x+ , xn+1 .
+
Inserting this result in (4.18), we arrive at (4.17) . We proceed now proving the result for
the case zbn,k = xn (consequently, K2 = 0). Similarly to above, but using (4.13) instead of
(4.12) , the inequality
Assumption 5 If s ≤ r and kmax < ∞ in (3.12) , then there exist constants 0 < λmin ≤
λmax such that λmin ≤ λn,k ≤ λmax for all k = 0, ..., kn and n ∈ N.
• The dual gradient method LW with a small constant step-size. This method clearly
satisfies λmin = λLW = λmax ;
• The Bregman variation of Iterated-Tikhonov method, where the required inequalities
are satisfied for λn,k = 1/αn . Consequently, λmin = 1/αmax and λmax = 1/αmin (see
(3.68));
• The Bregman variation of Tikhonov-Phillips method with λn,k = 1/αk . In this case,
λmin = 1/α0 and λmax = 1/αkmax (see (3.57)).
It is possible to prove the existence of λmin for the dual gradient methods DE and MSD
(see (3.43) and (3.47)). But unfortunately, this seems not to be the case for λmax . However,
if we modify the definitions of these methods to λnew = λold ∧ λ with a (possibly very
large) constant λ ≥ λmin , Assumption 5 is immediately verified for these methods with
λmax := λ. Anyway, some results of next Theorem hold true for these methods in their
original definition. In the following convergence proof we adapt ideas from our previous
work [40].
Theorem 43 (Convergence without noise) Let X and Y be Banach spaces with X
being uniformly smooth and s−convex with p ≤ s. Let Assumption 1 in page 31 hold true
and start with x0 ∈ Bρ (x+ , ∆p ) . Assume that Assumptions 3, page 58 and 4, page 62
hold true and that Assumption 2, page 53 is verified with C = 0 in case of zbn,k = xn .
Suppose that the constant η in the TCC satisfy η < K1 − K0 and that µ ∈ (µmin , 1) with
µmin := (η + K0 ) /K1 < 1. If d = 1, then REGINN either stops after finitely many iterations
with a solution of (3.1) or the generated sequence (xn )n∈N ⊂ Bρ (x+ , ∆p ) converges strongly
in X to a solution of (3.1) . If x+ is the unique solution in Bρ (x+ , ∆p ), then xn → x+ as
n → ∞.
If d > 1 then the same results hold true for K-REGINN in case of Assumption 5 holds
true as well as kmax < ∞ and s ≤ r.
Proof. As this proof is relatively long, it is divided in four parts. In the first and second
parts, the case d > 1 is analyzed and Assumption 5 is employed to prove inequality (4.31)
below, for the case zbn,k = zn,k in the first part of the proof and for zbn,k = xn in the second
one. In the third part, an inequality similar to (4.31) is established for the case d = 1.
Finally, we prove in the fourth part of the proof that the residual converges to zero as n
goes to infinity, which together with (4.31) will ensure the desired result.
If Algorithm 1 stops after a finite number of iterations then the current iterate is a
solution of (3.6) and consequently of (3.1) . Otherwise, (xn )n∈N is a Cauchy sequence, as
we will prove now. Let m, l ∈ N with m ≤ l. The trick of the whole proof is to use the
triangle inequality in order to ”cut” the norm kxm − xl k in an appropriate vector xz and
then estimate each of the resulting norms using the properties of this vector.
Part 1: We consider first the case d > 1. In this case Assumption 5 holds true as well
as kmax < ∞ and s ≤ r. Write m = m0 d + m1 and l = l0 d + l1 with m0 , l0 ∈ N and
m1 , l1 ∈ {0, . . . , d − 1} . Of course m0 ≤ l0 . Choose z0 ∈ {m0 , . . . , l0 } such that
d−1
P
(kyj − Fj (xz0 d+j )k + kxz0 d+j+1 − xz0 d+j k) (4.20)
j=0
d−1
P
≤ (kyj − Fj (xn0 d+j )k + kxn0 d+j+1 − xn0 d+j k)
j=0
with
βm,z := ∆p x+ , xm − ∆p x+ , xz
and
l−1
X
Jp (xn+1 ) − Jp (xn ) , xz − x+ .
f (z, m, l) ≤ (4.22)
n=m
We want to show now that f (z, m, l) is bounded from above by a term that converges to
zero as m → ∞, which together with (4.21) will prove that (xn )n∈N is a Cauchy sequence.
Our first goal is to prove this property for the case zbn,k = zn,k . As in the proof of Lemma
42,
Jp (xn+1 ) − Jp (xn ) , xz − x+
(4.23)
n −1
kX
≤ λn,k kAn sn,k − bn kr−1 kAn ez k + γn |hJp (sn,k − xn ) , ez i| .
k=0
Similarly to (4.19),
γn |hJp (sn,k − xn ) , ez i| . kbn kr . (4.24)
We proceed applying Assumption 1(c) to estimate kAn ez k :
Note that in the last norm, the operator F[n] is applied in the ”wrong” vector xz . To
estimate this norm, we use Assumption 1. Write n = n0 d + n1 with n0 ∈ {m0 , . . . , l0 } and
n1 ∈ {0, . . . , d − 1} . Then,
y[n] − F[n] (xz )
= kyn − Fn (xz d+z )k
1 1 0 1
d−1
X
≤ kyn1 − Fn1 (xz0 d+n1 )k + kFn1 (xz0 d+j+1 ) − Fn1 (xz0 d+j )k
j=0
d−1
1 X
Fn0 (xz d+j ) (xz d+j+1 − xz d+j )
≤ kyn1 − Fn1 (xz0 d+n1 )k + 0 0 0
1−η 1
j=0
d−1
X
M
≤ 1+ (kyj − Fj (xz0 d+j )k + kxz0 d+j+1 − xz0 d+j k) .
1−η
j=0
66 CHAPTER 4. CONVERGENCE ANALYSIS OF K-REGINN
This inequality was the motivation for the definition of z. From (4.20),
d−1
X
y[n] − F[n] (xz )
≤ 1 + M
(kyj − Fj (xn0 d+j )k + kxn0 d+j+1 − xn0 d+j k) . (4.26)
1−η
j=0
Inserting (4.26) in (4.25), (4.25) and (4.24) in (4.23), (4.23) in (4.22), and using the defini-
tion of kn , we arrive at
n −1
l−1 kX
X
f (z, m, l) . λn,k kAn sn,k − bn kr + g (z, m, l) + h (z, m, l) , (4.27)
n=m k=0
where
X n −1
l−1 kX d−1
X
g (z, m, l) := λn,k kAn sn,k − bn kr−1 kyj − Fj (xn0 d+j )k ,
n=m k=0 j=0
X n −1
l−1 kX d−1
X
h (z, m, l) := λn,k kAn sn,k − bn kr−1 kxn0 d+j+1 − xn0 d+j k .
n=m k=0 j=0
The first term on the right-hand side of (4.27) can be estimated by (4.12). It remains to
estimate g and h. At this point we need the boundedness of kn . From kn ≤ kmax ,
!v k −1
n −1
kX n −1
kX n
X
v
kAn sn,k − bn k . kAn sn,k − bn k . kAn sn,k − bn kv
k=0 k=0 k=0
for any v > 0. Then defining
n −1
kX
wn := kAn sn,k − bn k ,
k=0
Similarly,
l0 d+d−1
X kX n −1 l0 d+d−1
X
r
h (z, m, l) . λn,k kAn sn,k − bn k + kxn+1 − xn kr . (4.29)
n=m0 k=0 n=m0
Plugging this bound into (4.29), inserting then inequalities (4.29) and (4.28) in (4.27), (4.27)
in (4.21), and using (4.12), we end up with
Part 2: Now we consider the case zbn,k = xn (in this case K2 = 0 and consequently
γn ≡ 0 in Assumption 4) together with d > 1. Inequality (4.23) reads here
Jp (xn+1 ) − Jp (xn ) , xz − x+ ≤ λn,k −1 kAn sn,k −1 − bn kr−1 kAn ez k .
n n (4.32)
with
l−1
X d−1
X
r−1
g (z, m, l) := λn,kn −1 kAn sn,kn −1 − bn k kyj − Fj (xn0 d+j )k ,
n=m j=0
l−1
X d−1
X
r−1
h (z, m, l) := λn,kn −1 kAn sn,kn −1 − bn k kxn0 d+j+1 − xn0 d+j k .
n=m j=0
In the same way as done in (4.28) and (4.29) above, the inequalities
l0 d+l−1
X
g (z, m, l) . λn,kn −1 kAn sn,kn −1 − bn kr ,
n=m0
l0 d+l−1
X l0
X
r
h (z, m, l) . λn,kn −1 kAn sn,kn −1 − bn k + kxn+1 − xn kr
n=m0 n=m0
can be proven. Using now (4.13) instead of (4.12) we arrive again at (4.31) .
Part 3: The case d = 1 is considered now (therefore kmax = ∞ is possible). This
situation is easier because everything we need is to change the definition of xz in (4.20) to
the vector with the smallest residual in the outer iteration, i.e., choose z ∈ {m, . . . , l} such
that kbz k ≤ kbn k , for all n ∈ {m, . . . , l} . Then, from (4.25) ,
The inequality kbn k ≤ µ1 kAn sn,k − bn k for k = 0, ..., kn − 1, together with (4.32) (respec-
tively (4.23) and (4.24)) and (4.22) , results in
n −1
l−1 kX
X
f (z, m, l) . λn,k kAn sn,k − bn kr
n=m k=0
l−1
X
f (z, m, l) . λn,kn −1 kAn sn,kn −1 − bn kr
n=m
in the case zbn,k = xn . Plugging now these inequalities into (4.21) and using again (4.12)
(respectively (4.13)), we arrive at (4.31) with m0 and l0 d + d replaced by m and l, respec-
tively.
Part 4: In any case the right-hand side of inequality (4.31) converges to zero as m → ∞
revealing (xn )n∈N to be a Cauchy sequence. As X is complete, it converges to some x∞ ∈ X.
Observe that kn ≥ 1 if kbn k 6= 0 and since µ kbn k ≤ kAn sn,k − bn k for all k ≤ kn − 1, it
follows that for the case zbn,k = zn,k ,
∞ ∞ n −1
∞ kX
X X X (4.12)
r+t r+t
µ r+t
kbn k ≤ kn (µ kbn k) . λn,k kAn sn,k − bn kr < ∞
n=0 n=0 n=0 k=0
Similarly, ∞ r+t
P
n=0 kb n k < ∞ for z
b n,k = x n . Then,
y[n] − F[n] (xn )
= kbn k → 0 as n → ∞
and since the Fj ’s are continuous for all j = 0, . . . , d − 1, we have yj = Fj (x∞ ). If (3.1) has
only one solution in Bρ (x+ , ∆p ) then x∞ = x+ .
In this subsection, we apply a similar reasoning with the necessary modifications to fit
it in with our framework. We follow ideas from [36] and [40]. In order to facilitate the
comprehension, we summarize the main results:
1. From (3.11) , it follows that kn is an arbitrary number less than or equal kREG ,
which means that the sequence (xn )n∈N , generated from a run of K-REGINN using
4.4. REGULARIZATION PROPERTY 69
data without noise, can change if the sequence (kmax,n )n∈N is chosen differently. In
Definition 44 below, we observe all the sequences which are possibly generated from
a run of K-REGINN in the noiseless situation and collect all the n−th iterates in the
set Xn .
2. In Lemma 45, we prove that if the sequences xδni n∈N , with (δi )i∈N ⊂ R being a
zero-sequence, are generated by different runs of K-REGINN with the different noise
δ
levels δi , then for each n ∈ N fixed, the sequence of the n−th iterates xn i∈N splits
i
3. Further, in Lemma 46 it is proven that for each > 0, there exists a M ∈ N such that,
for each n ≥ M, there exists an element ξn ∈ Xn and a solution x∞ of (3.1) satisfying
kξn − x∞ k < (the sets Xn converge uniformly to the set of solutions of (3.1)).
4. Finally, in Theorem 47, we provide a proof that the sequence xδNi (δi ) of the final
i∈N
iterates of K-REGINN (generated with the different noise levels δi , where (δi )i∈N ⊂ R
is again a zero-sequence) splits into convergent subsequences, all of which converge
strongly to solutions of (3.1) as i → ∞.
In a first step we investigate the stability of the scheme, i.e., we study the behavior of
the n-th iterate xδn as δ approaches zero. The sets Xn defined below play an important role,
see the Item 1 above.
Definition 44 Let X0 := {x0 } and define Xn+1 from Xn by the following procedure: for
each ξ ∈ Xn , change xn in definition of K-REGINN with ξ and change the respective inner
iterate zn,k with σn,k , resulting from using ξ instead of xn . More precisely, define σn,0 =
σn,0 (ξ) := ξ and σn,k+1 = σn,k+1 (ξ) as
σn,k ) − λξn,k F[n]
Jp (σn,k+1 ) := Jp (b 0
(ξ)∗ jr vn,k
ξ
+ γnξ Jp (σn,k − ξ) − xξn ,
otherwise. Then σn,k (ξ) ∈ Xn+1 for k = 1, . . . , kREG (ξ) in case kREG (ξ) ≥ 1 and only for
k = 0 in case kREG (ξ) = 0. We call ξ ∈ Xn the predecessor of the vectors σn,k (ξ) ∈ Xn+1
and these ones successors of ξ.
ξ ∈ Xn . This is,
δi
lim xδni = ξ ∈ Xn =⇒ lim zn,k = σn,k (ξ) for k = 0, ..., kREG (ξ) .
i→∞ i→∞
• All the dual gradient methods, defined by iteration (3.29) in Subsection 3.1.2 and
satisfying λδn,k ≤ λδDE , where λδn,k depends continuously on δ. In particular, the DE
method itself, the MSD and LW methods. For these methods, we assume that X is
uniformly smooth and s−convex with p ≤ s. Additionally, we suppose that the spaces
Yj , j = 0, ..., d − 1 are uniformly smooth;
• The mixed gradient-Tikhonov methods, defined by iteration (3.76) . Here the uniform
smoothness and the s−convexity of X, with p ≤ s, as well as the uniform smoothness
of Yj , j = 0, ..., d − 1 is assumed.
Lemma 45 Let all assumptions of Theorem 38 hold true and assume additionally Assump-
tion 6 and that X is s−convex with p ≤ s. If δi → 0 as i → ∞ then for n ≤ N (δi ) with
δi > 0 sufficiently small, the sequence xδni i∈N splits into convergent subsequences, all of
which converge to elements of Xn .
As the functions Fj and Fj0 are continuous for j = 0, ..., d − 1, it follows from (3.7) , (4.35)
and (4.34) that
lim
bδni
=
ebn
and (4.36)
i→∞
xδni sδn,k
0
0
lim
F[n] i
− bδni
=
F[n] (ξ) (σn,k − ξ) − ebn
i→∞
Case 2: ebn = 0. In this case, y[n] = F[n] (ξ) and σn,0 = ξ ∈ Xn+1 (see Definition 44).
As the sequence xδni n≤N (δ ) is uniformly bounded (see (4.11)) and X is s−convex, we
i
as ` → ∞, contradicting > 0.
In the second step towards establishing the regularization property we provide a kind
of uniform convergence of the set sequence (Xn )n to solutions of (3.1). For the rigorous
formulation in Lemma 46 below we need to introduce further notation: Let l ∈ N and set
(l) (l) (l) (l) (l)
ξ0 := x0 . Now define ξn+1 := σn,k(l) (ξn ) by choosing kn ∈ {1, . . . , kREG (ξn )} in case of
n
(l) (l) (l) (l) (l)
kREG (ξn ) ≥ 1 and kn = 0 otherwise. Then ξn+1 is a successor of ξn . Of course ξn ∈ Xn
(l)
for all n ∈ N and reciprocally, each element in Xn can be written as ξn for some l ∈ N.
(l)
Observe that (ξn )n∈N represents a sequence generated by K-REGINN in the case of exact
(l)
data is given and with the inner iteration stopped with an arbitrary stop index kn less
(l) (l)
than or equal kREG (ξn ). Due to this fact, we call the sequence (kn )n∈N a stop rule. The
(l)
sequence (ξn )n∈N is therefore one of many possible sequences which can be generated from
a run of K-REGINN with initial vector x0 . Since the results of Theorem 43 hold true for all
(l)
these sequences, it applies in particular to (ξn )n∈N . Thus, the limit
x(l) (l)
∞ := lim ξn (4.37)
n→∞
Proof. Assume the statement is not true. Then, there exist an > 0 and sequences
(nj )j , (lj )j ⊂ N with (nj )j strictly increasing such that
(lj ) (lj )
ξnj − x∞
≥ for all j ∈ N
(l ) (l )
where (ξn j )n represents the sequence generated by the stop rule (kn j )n . We stress the fact
(l )
that the iterates ξnjj must be generated by infinitely many different sequences of stop rules
(l) (l)
(otherwise, as ξnj → x∞ as j → ∞ for each l and as the lj ’s attain only a finite number of
(l ) (l )
values, we would have kξnjj − x∞j k < for nj large enough). Next we reorder the numbers
lj (excluding some iterates if necessary) such that
(l) (l)
ξnl − x∞
≥ for all l ∈ N. (4.38)
In view of (4.37) the limit xb∞ := limn→∞ ξbn exists in Bρ (x+ , ∆p ) and solves (3.1). It
follows that the sequence (ξbn )n is bounded and since X is s−convex, inequality (2.23)
holds. Additionally, there exists M = M () ∈ N such that
s
∆p xb∞ , ξbn < s for all n > M, (4.40)
2 C
where C > 0 is the constant from (2.23) . We can additionally suppose that ξbn ∈ Bρ (x+ , ∆p )
for all n > M. In fact, as lim n→∞ξn = Dx
b b∞ and the mappings Jp and E ∆p (b x∞ , ·) are
continuous, we have that ∆p x b∞ , ξbn and Jp (ξbn ) − Jp (b b∞ − x+ converge to zero
x∞ ) , x
as n → ∞. From three points identity (2.21),
D E
∆p x+ , ξbn = ∆p xb∞ , ξbn + ∆p x+ , xb∞ + Jp (ξbn ) − Jp (b b∞ − x+
x∞ ) , x
and as ∆p (x+ , x
b∞ ) < ρ, we conclude that ∆p x+ , ξbn < ρ for n large enough.
Now, for l0 ∈ £M fixed,
(4.39) (4.40) s
(l )
∆p xb∞ , ξM0+1 = ∆p x b∞ , ξbM +1 < s .
2 C
(l )
As xb∞ is a solution of (3.1) and ξM0+1 = ξbM +1 ∈ Bρ (x+ , ∆p ) , inequality (4.10) applies and
(l )
the errors ∆p x b∞ , ξn 0 are monotonically decreasing in n for all n ≥ M + 1. In particular,
nl0 ≥ l0 ≥ M + 1 (because l0 ∈ £M ⊂ N\ {1, . . . , M }). Then
(l )
s
∆p xb∞ , ξn(ll0 ) ≤ ∆p xb∞ , ξM0+1 < s .
0 2 C
4.4. REGULARIZATION PROPERTY 73
(l ) (l )
Since ξn 0 → x∞0 as n → ∞, we conclude that
(l0 )
(4.10) s
∆p xb∞ , x∞ = lim ∆p xb∞ , ξn(l0 ) ≤ ∆p x b∞ , ξn(ll0 ) < s .
n→∞ 0 2 C
From the s−convexity of X,
s
s
s
(l0 )
ξnl0 − x(l 0)
≤ 2 s−1
(l0 )
ξ − x + x − x (l0 )
∞
nl0 b∞
b∞ ∞
≤ 2s−1 C ∆p x b∞ , ξn(ll0 ) + ∆p x b∞ , x(l ∞
0)
< s ,
0
contradicting (4.38) .
We are now in position to prove our main result. The result of the next theorem was
first established in our work [39].
Theorem 47 (Regularization property) Assume all the hypotheses of Theorem 38, As-
sumptions 4, page 62, and 6, page 70, and suppose that X is s−convex with p ≤ s. If d > 1,
holds true, as well as kmax < ∞ and s ≤ r.
assume additionally that Assumption 5, page 63,
δi
If δi → 0 as i → ∞, then the sequence xN (δi ) splits into convergent subsequences, all
i∈N
of which converge strongly to solutions of (3.1) as i → ∞. In particular, if x+ is the unique
solution of (3.1) in Bρ (x+ , ∆p ) then
lim
xδNi (δi ) − x+
= 0.
i→∞
Proof. If N (δi ) ≤ I as i → ∞ for some I ∈ N, then the sequence (xδNi (δi ) )i∈N splits
δi
into subsequences of the form (xn ` )`∈N where n is a fixed number less than or equal to I.
According to Lemma 45, each of these subsequences splits into convergent subsequences.
Hence each limit of such a subsequence must be a solution of (3.1) due to the discrepancy
principle (3.13). In fact, if xδni → a as i → ∞, then using (3.7) ,
kyj − Fj (a)k = lim
yj − Fj xδni
≤ lim (1 + τ ) δi = 0, j = 0, . . . , d − 1.
i→∞ i→∞
Suppose now that N (δi ) → ∞ as i → ∞ and let > 0 be given. As the Bregman distance
is a continuous function in both arguments, there exists γ = γ () > 0 such that
s
∆p x, xδni < whenever
x − xδni
< γ, (4.41)
C
where C > 0 is the constant from (2.23). From Lemma 46, there is an M ∈ N such that,
(l) (l)
for each ξM ∈ XM , there exists a solution x∞ of (3.1) satisfying
γ
(l) (l)
x∞ − ξM
< .
2
According to Lemma 45, the sequence xδMi splits into convergent subsequences, each one
δi
converging to an element of XM . Let (xMj )j∈N be a generic convergent subsequence, which
converges to an element of XM , say
δi (l )
lim xM` = ξM0 ∈ XM .
`→∞
δi (l )
We prove now that the subsequence (xN`(δi ) )`∈N converges to the solution x∞0 . In fact,
`
δi (l )
since xM` → ξM0 as ` → ∞, there exists a L1 = L1 () such that
γ
(l0 ) δi
ξM − xM`
< for all ` ≥ L1 .
2
74 CHAPTER 4. CONVERGENCE ANALYSIS OF K-REGINN
Numerical Experiments
To test the performance of K-REGINN, we have chosen a severely ill-posed problem, namely,
the Electrical Impedance Tomography (in short EIT). In this non-invasive problem, one
aims to reconstruct specific features of the interior of an object collecting information on
its boundary. The procedure consists in applying different electric current configurations
on the boundary of a bounded set and then measuring the resulting potentials on the
boundary as well. The objective is to access information of the interior of this set in order
to reconstruct the electrical conductivity.
This idea was originally introduced by Calderón in his famous paper [8]. The problem
proposed by him is currently known as the Continuum Model of EIT (EIT-CM), where
electric current is applied on the whole boundary and the corresponding potential is also
read in all the point of the boundary. The Calderón’s method is more of theoretical than
practical interest because in real situations is actually impossible to apply or record this
information in the whole boundary. This method is however important in the mathematical
point of view. More details and some numerical experiments using K-REGINN to solve this
problem are presented in Section 5.1 below.
To modify the EIT-CM in order to adjust it to a more concrete and realistic framework,
many attempts have been made and different methods have been suggested. One of the
most promising and currently considered a very realistic framework is the so-called Complete
Electrode Model (EIT-CEM), which was first presented by Somersalo et al in [50]. In the
EIT-CEM, electrodes are attached on the boundary of an object and the electric current is
injected on this object through these electrodes. The resulting voltage is then read in the
same electrodes. The inverse problem of reconstructing the electrical conductivity in the
entire set from the application of different configurations of currents and the measure of
the respective voltages on its boundary is considered in Section 5.2, where we additionally
present some numerical experiments and discuss the results.
Concerning practical situations, EIT has a vast range of potential applications in several
fields, including non-invasive medical imaging, non-destructive testing for locating resistiv-
ity anomalies, monitoring of oil and gas mixtures in oil pipelines, geophysics, environmental
sciences, among others. See [5, 23] and the references therein.
Before starting, we would like to point out that both EIT variations are appropriated
and naturally suitable to the use of Kaczmarz methods. The reason is simple: the electrical
conductivity is reconstructed from a set of individual measurements. Each of those mea-
surements can be regarded as an individual operator in a system of equations to be solved.
The details will be explained later in a convenient moment.
75
76 CHAPTER 5. NUMERICAL EXPERIMENTS
Z Z
γ∇u∇ϕ = gϕ for all ϕ ∈ H♦1 (Ω) . (5.3)
Ω ∂Ω
The symbol ♦ means that the integral of the function over the boundary of Ω is zero:
Z
1/2 1/2
H♦ (∂Ω) := v ∈ H (∂Ω) : v=0 .
∂Ω
If the function is defined in Ω, this integral is understood in the sense of the trace theorem:
for u ∈ H 1 (Ω) , its trace f = u|∂Ω belongs to H 1/2 (∂Ω) , and we define
Z
1 1
H♦ (Ω) := u ∈ H (Ω) : f =0 .
∂Ω
−1/2 1/2
Finally, the set H♦ (∂Ω) is defined as the dual space of H♦ (∂Ω) and
L∞ ∞
+ (Ω) := {v ∈ L (Ω) : v ≥ C a.e. in Ω} ,
in (5.3). It can be interpreted as the grounding of potential. Further, the lower bound
γ ≥ C a.e. in Ω (γ ∈ L∞+ (Ω)) ensure that the electric flux can flow through the entire set
Ω.
With the existence and uniqueness of solutions of the variational problem (5.3) , the
Neumann-to-Dirichlet (NtD) operator Λγ , which associates the current with the respective
voltage,
−1/2 1/2
Λγ : H♦ (∂Ω) → H♦ (∂Ω) , g 7−→ f
is well-defined. Moreover, it can be proven to be an invertible linear bounded operator. Its
bounded linear inverse is called the Dirichlet-to-Neumann operator (DtN).
The forward operator associated with EIT-CM problem is now defined as the nonlinear
function
−1/2 1/2
F : D (F ) ⊂ L∞ (Ω) → L H♦ (∂Ω) , H♦ (∂Ω) , γ 7−→ Λγ , (5.4)
5.1. EIT - CONTINUUM MODEL 77
with D(F ) := L∞ + (Ω) . The EIT-CM inverse problem consists therefore in recovering γ
from a partial knowledge of the NtD operator Λγ , which is a nonlinear and highly ill-posed
problem [1].
In 2006, Astala and Päivärinta [2] proved the injectivity of F in (5.4) , which means
that the EIT-CM is uniquely solvable. In practice however, one cannot expect to have
full knowledge of the NtD operator. The best it can be done is to apply a finite number
of currents = := (g0 , ..., gd−1 ) and then recover the corresponding voltages, fj = Λγ gj ,
j = 0, ..., d − 1. This fact suggests that the operator
d
1/2
F= : D (F ) ⊂ L∞ (Ω) → H♦ (∂Ω) , γ 7−→ (f0 , ..., fd−1 ),
we immediately see that F= = (F0 , ..., Fd−1 )> and Y = Y0 × ... × Yd−1 , with Y :=
d
1/2 1/2
H♦ (∂Ω) and Yj := H♦ (∂Ω) for j = 0, ..., d − 1. This is exactly the structure pre-
sented in (3.6) , which makes this problem suitable to the application of a Kaczmarz method.
Moreover, each of the operators in (5.5) is Fréchet-differentiable as we will discuss now.
where
uγ+tσ − uγ Gj (γ + tσ) − Gj (γ)
wσ := lim = lim = G0j (γ) σ
t→0 t t→0 t
78 CHAPTER 5. NUMERICAL EXPERIMENTS
is the Fréchet derivative of Gj evaluated in γ and applied in the vector σ. As the trace
operator is linear and continuous,
Fj0 (γ) σ = G0j (γ) σ ∂Ω = wσ |∂Ω .
Therefore, in order to calculate Fj0 (γ) σ, one needs first to find uγ ∈ H♦1 (Ω) in (5.6) , use it
1/2
to find wσ ∈ H♦1 (Ω) in (5.8) and finally evaluate its trace wσ |∂Ω ∈ H♦ (∂Ω) .
To calculate the vector Fj0 (γ)∗ η, we use the following procedure: let ϑ ∈ L∞ (Ω) be
fixed and let ψη ∈ H♦1 (Ω) be the unique solution of (5.3) for g := η, i.e.,
\[
\int_\Omega \gamma\,\nabla\psi_\eta \cdot \nabla\varphi = \int_{\partial\Omega} \eta\varphi \quad \text{for all } \varphi \in H^1_\diamond(\Omega). \tag{5.9}
\]
Thus,
\[
F_j'(\gamma)^*\eta = -\nabla u_\gamma \cdot \nabla\psi_\eta, \tag{5.10}
\]
with $u_\gamma, \psi_\eta \in H^1_\diamond(\Omega)$ being given in (5.6) and (5.9) respectively.
Unfortunately, the space L∞ (Ω) has too poor smoothness/convexity properties to be
included in the convergence analysis of K-REGINN in Chapter 4 (it is not uniformly smooth,
for example). But, as Ω is bounded, L∞ (Ω) ⊂ Lp (Ω) for 1 < p < ∞, and since these spaces
have the necessary properties (remember that for 1 < p < ∞, the Lebesgue spaces Lp are
p∨2−convex and p∧2−smooth (see Example 14)), a possible and immediate solution would
be to redefine the operators $F_j$ in different spaces:
\[
F_j : D(F) \subset L^p(\Omega) \to H^{1/2}_\diamond(\partial\Omega), \quad 1 < p < \infty.
\]
The duality mapping $J_p : L^p(\Omega) \to L^{p^*}(\Omega)$ can now be easily calculated via (2.14).
Using this strategy however, a new problem becomes evident: D(F ) has no interior
points in the Lp −topology1 , which means that differentiability or even the continuity of Fj
are compromised. To overcome this technical obstacle, we suggest restricting the searched-
for conductivity space X to a finite dimensional space V ⊂ L∞ (Ω) , that is, redefine the
functions Fj ’s as
\[
F_j : V_+ \subset V_p \to L^2(\partial\Omega_j), \quad \gamma \longmapsto f_j, \qquad V_+ := D(F) \cap V, \tag{5.11}
\]
where ∂Ωj is the part of the boundary where the experiments are actually taken. The
subscript index p in Vp := (V, k·kLp ) highlights the fact that the Lp −topology is used
in2 V = span {v1 , ..., vM }. This is a reasonable framework because from a computational
point of view, the best one can do is to find the coefficients of the conductivity vector in a
specific basis of a finite dimensional subspace of L∞ (Ω). Moreover, using a finite number of
experiments, only a finite number of degrees of freedom of the conductivity can be restored.
Since in finite dimensional spaces all norms are equivalent, the F-derivative of $F_j$ remains the same³.
¹ For $\gamma \in D(F)$ fixed, the ball $\{\tilde\gamma \in L^\infty(\Omega) : \|\gamma - \tilde\gamma\|_{L^p(\Omega)} < \rho\}$ is not entirely contained in $D(F)$ for any $\rho > 0$.
² The vectors $v_i \in L^\infty(\Omega)$, $i = 1, \dots, M$, are naturally assumed to be linearly independent.
³ The change of the data space $H^{1/2}_\diamond(\partial\Omega)$ to $L^2(\partial\Omega_j)$ in (5.11) does not alter the derivative of $F_j$ either, since $H^{1/2}(\partial\Omega) \hookrightarrow L^2(\partial\Omega)$, see e.g. [14, Theo. 2.72].
Proposition 48 Let $X$ and $Y$ be Banach spaces defined over the same field $k$. If there exists a linear, isometric and surjective operator $T : X \to Y$, then the operator $(T^{-1})^* : X^* \to Y^*$ is well-defined and shares the same properties.

Proof. Suppose that $T : X \to Y$ has the required properties. From its isometry it follows that $T$ is injective and hence bijective. Thus $T^{-1} : Y \to X$ is well-defined and it is clearly linear, invertible and isometric too. From the isometry of $T^{-1}$ follows now the boundedness of this operator, which implies that its Banach-adjoint $(T^{-1})^* : X^* \to Y^*$ is well-defined. Further, $(T^{-1})^*$ is a linear operator. It remains to prove that this operator is isometric and surjective. For the surjectivity, let $y^* \in Y^*$ be given and define
\[
\langle x^*, x\rangle := \langle y^*, Tx\rangle, \quad x \in X.
\]
Then it is clear that $x^* \in X^*$. We will prove that $(T^{-1})^* x^* = y^*$. In fact, for each $y_0 \in Y$, define $x_0 := T^{-1} y_0$. Then
\[
\langle y^*, y_0\rangle = \langle y^*, T x_0\rangle = \langle x^*, x_0\rangle = \big\langle x^*, T^{-1} y_0\big\rangle = \big\langle (T^{-1})^* x^*, y_0\big\rangle.
\]
Note that if $T$ has the required properties of the above proposition, then it is a linear, bounded, isometric and invertible operator. Further, its inverse is also a bounded operator, because it is isometric too. This means that $T$ is an isomorphism of Banach spaces and the same applies to $(T^{-1})^*$. Moreover, it obviously holds, for all $x^* \in X^*$ and $x \in X$,
\[
\langle x^*, x\rangle_{X^*\times X} = \big\langle (T^{-1})^* x^*, Tx\big\rangle_{Y^*\times Y},
\]
which implies that $(T^{-1})^*$ is actually an isomorphism of dual spaces. The result implies in particular that
\[
x^* \in J_\varphi(x) \iff (T^{-1})^* x^* \in J_\varphi(Tx),
\]
that is, $(T^{-1})^* J_\varphi(x) = J_\varphi(Tx)$. Then, for all $x, y \in X$,
\[
\langle J_\varphi(Tx), Ty\rangle_{Y^*\times Y} = \big\langle (T^{-1})^* J_\varphi(x), Ty\big\rangle_{Y^*\times Y} = \langle J_\varphi(x), y\rangle_{X^*\times X}.
\]
Each property of $X$ which only depends either on the norm $\|\cdot\|_X$ or on the duality pairing $\langle\cdot,\cdot\rangle_{X^*\times X}$ of its dual space $X^*$ is therefore preserved. In particular, smoothness and convexity properties of $X$ and $Y$ are exactly the same. For instance, the inequality
\[
\frac1p \|x - y\|^p \le \frac1p \|x\|^p - \langle J_p(x), y\rangle + \frac{C_p}{p}\|y\|^p
\]
holds if and only if
\[
\frac1p \|Tx - Ty\|^p \le \frac1p \|Tx\|^p - \langle J_p(Tx), Ty\rangle + \frac{C_p}{p}\|Ty\|^p.
\]
Observe that $f \in L^p_\omega(\Omega)$ if and only if $f\omega^{1/p} \in L^p(\Omega)$. We can therefore define the norm
\[
\|f\|_{L^p_\omega(\Omega)} := \big\| f\omega^{1/p} \big\|_{L^p(\Omega)}, \quad f \in L^p_\omega(\Omega),
\]
which transforms $L^p_\omega(\Omega)$ into a normed space. If $\omega_{\min} \le \omega(x) \le \omega_{\max}$ for all $x \in \Omega$, where $0 < \omega_{\min} \le \omega_{\max}$ are constants independent of $x$, then
\[
\frac{1}{\omega_{\max}} \|f\|^p_{L^p_\omega(\Omega)} \le \|f\|^p_{L^p(\Omega)} \le \frac{1}{\omega_{\min}} \|f\|^p_{L^p_\omega(\Omega)}.
\]
Thus, the two vector spaces have equivalent norms and the same elements. Further, the completeness of $L^p(\Omega)$ is transmitted to $L^p_\omega(\Omega)$ and the operator $T : L^p_\omega(\Omega) \to L^p(\Omega)$, $f \mapsto f\omega^{1/p}$, is well-defined and satisfies all the hypotheses of Proposition 48. Therefore the Banach space $L^p_\omega(\Omega)$ inherits the properties of $L^p(\Omega)$. In particular, $\big(L^p_\omega(\Omega)\big)^* = L^{p^*}_\omega(\Omega)$ and for all $f, g \in L^p_\omega(\Omega)$,
\begin{align*}
\langle J_p(f), g\rangle_{L^{p^*}_\omega \times L^p_\omega} &= \langle J_p(Tf), Tg\rangle_{L^{p^*} \times L^p} = \big\langle |Tf|^{p-1}\operatorname{sgn}(Tf), Tg\big\rangle_{L^{p^*} \times L^p} \\
&= \big\langle |f|^{p-1}\operatorname{sgn}(f)\,\omega^{1/p^*}, g\,\omega^{1/p}\big\rangle_{L^{p^*} \times L^p} = \int_\Omega |f|^{p-1}\operatorname{sgn}(f)\, g\,\omega \\
&= \big\langle |f|^{p-1}\operatorname{sgn}(f), g\big\rangle_{L^{p^*}_\omega \times L^p_\omega}.
\end{align*}
Hence, the duality mapping Jp in Lpω (Ω) is still given by Jp (f ) = |f |p−1 sgn(f ) .
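As an illustration, the following small Python sketch evaluates this duality mapping and the weighted norm for a function that is piecewise constant on a triangulation; the arrays `f`, `areas` and `omega` are hypothetical stand-ins for the coefficient vector, the triangle areas $|T_i|$ and the weights $\omega_i$, and are not part of the actual implementation.

```python
import numpy as np

def weighted_lp_norm(f, areas, omega, p):
    """||f||_{L^p_omega} for a piecewise-constant f (one value per triangle):
    the integral becomes a weighted sum over the triangles."""
    return (np.sum(np.abs(f) ** p * omega * areas)) ** (1.0 / p)

def duality_map_lp(f, p):
    """J_p(f) = |f|^{p-1} sgn(f), valid in L^p as well as in the weighted space L^p_omega."""
    return np.abs(f) ** (p - 1) * np.sign(f)

# toy usage on a fictitious 4-triangle mesh
f = np.array([0.1, -0.3, 0.0, 0.8])
areas = np.array([0.2, 0.3, 0.25, 0.25])
omega = np.array([1.0, 2.0, 0.5, 1.5])
print(weighted_lp_norm(f, areas, omega, p=1.1))
print(duality_map_lp(f, p=1.1))
```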
The adjoint operator $F_j'(\gamma)^*$ changes slightly in the new space because for the new adjoint operator $\overline{F}_j'(\gamma)^*$ we want to have, for each $\gamma \in \operatorname{int}(D(F))$, $\sigma \in L^p(\Omega)$ and $\eta \in L^2(\partial\Omega_j)$,
\[
\big\langle \overline{F}_j'(\gamma)^*\eta, \sigma \big\rangle_{L^{p^*}_\omega \times L^p_\omega} \overset{!}{=} \big\langle F_j'(\gamma)^*\eta, \sigma \big\rangle_{L^{p^*} \times L^p},
\]
that is,
\[
\int_\Omega \overline{F}_j'(\gamma)^*\eta\,\sigma\,\omega = \int_\Omega F_j'(\gamma)^*\eta\,\sigma.
\]
We keep using the old notation $F_j$ instead of $\overline{F}_j$ for the new operator.
The idea of using the function $\omega$ is to give different weights to different regions of $\Omega$. With this procedure we hope to increase stability in the regions where it is important to have it. We come back to this subject later to introduce an appropriate weight-function for our framework. Before doing that however, some preliminaries are needed.
For the numerical experiments in this section, we consider only the Kaczmarz version of REGINN and leave the comparison between Kaczmarz and non-Kaczmarz methods to the EIT-CEM problem, which will be analyzed in the next section.
In order to perform the experiments, we have chosen Ω as the unit square (0, 1) × (0, 1)
and d = 4m (m ∈ N) independent currents
\[
g_j := \begin{cases} \cos(2j_0\pi x)\cos(2j_0\pi y) & : (x, y) \in \Gamma_{j_1} \\ 0 & : \text{otherwise}, \end{cases}
\]
where j = 4 (j0 − 1) + (j1 − 1) , j0 = 1, ..., m and j1 = 1, ..., 4. The sets Γj1 represent the
faces of Ω : Γ1 := [0, 1] × {1} , Γ2 := {1} × [0, 1] , Γ3 := [0, 1] × {0} and Γ4 := {0} × [0, 1] .
The voltages fj = Λγ gj are measured in ∂Ωj = ∂Ω\Γj1 , which means that we do not read
the voltages where we apply the currents.
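For illustration only, a short Python sketch of how the currents $g_j$ above could be evaluated at boundary points of the unit square; the helper `face_index` and the tolerance are hypothetical choices, not part of the actual implementation.

```python
import numpy as np

def face_index(x, y, tol=1e-12):
    """Return the face Gamma_{j1} (1..4) of the unit square containing (x, y):
    Gamma_1 = [0,1]x{1}, Gamma_2 = {1}x[0,1], Gamma_3 = [0,1]x{0}, Gamma_4 = {0}x[0,1]."""
    if abs(y - 1.0) < tol:
        return 1
    if abs(x - 1.0) < tol:
        return 2
    if abs(y) < tol:
        return 3
    if abs(x) < tol:
        return 4
    raise ValueError("point is not on the boundary of the unit square")

def current_g(j, x, y):
    """g_j(x, y) with j = 4*(j0-1) + (j1-1): the cosine pattern on the face
    Gamma_{j1}, zero on the other faces."""
    j0, j1 = j // 4 + 1, j % 4 + 1
    return np.cos(2 * j0 * np.pi * x) * np.cos(2 * j0 * np.pi * y) if face_index(x, y) == j1 else 0.0

# g_5 (j0 = 2, j1 = 2) evaluated at a point on the face Gamma_2 = {1} x [0,1]
print(current_g(5, 1.0, 0.3))
```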
For implementing K-REGINN numerically, the computational evaluation of the vectors $F_j(\gamma)$, $F_j'(\gamma)\sigma$ and $F_j'(\gamma)^*\eta$ for $\gamma \in \operatorname{int}(V_+)$, $\sigma \in V_p$ and $\eta \in L^2(\partial\Omega)$ is necessary. Since an analytical solution is in general not available, we apply the Finite Element Method (FEM), constructing a Delaunay triangulation
\[
\Upsilon := \{ T_i : i = 1, \dots, M \} \tag{5.13}
\]
of $\Omega$, provided by MATLAB's pde toolbox with $M = 2778$. The same triangulation is then used to solve the elliptic problems (5.6), (5.8) and (5.9) and to reconstruct the conductivity.
The elements of the basis used to define the finite dimensional space V are defined as
\[
v_i(x) := \chi_{T_i}(x) = \begin{cases} 1 & : x \in T_i \\ 0 & : x \notin T_i, \end{cases} \qquad x \in \Omega, \tag{5.14}
\]
which means that we are looking for piecewise constant conductivities. This choice of
V guarantees injectivity of Fj0 (γ) . Moreover, Fj satisfies the Tangential Cone Condition
(Assumption 1(c) , page 31), see [35, Subsection 3.1].
We want to test the performance of K-REGINN to reconstruct sparse conductivities in
the {v1 , ...vM } basis, and for this reason, we define the exact solution as
\[
\gamma^+(x) := \gamma_0(x) + \sum_{i=1}^{M} \alpha_i v_i(x), \qquad \alpha_i := \begin{cases} 0.9 & : x \in B \\ 0 & : \text{otherwise}, \end{cases} \qquad x \in \Omega, \tag{5.15}
\]
where $\gamma_0(x) \equiv 0.1$ represents a known background and is also the first iterate. The set $B$ consists of two open ball-shaped inclusions with radii equal to $0.15$ and centers $(0.35, 0.35)$ and $(0.65, 0.65)$. Figure 5.1 shows $\gamma^+$ and the triangulation $\Upsilon$ of $\Omega$.
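A minimal Python sketch of how the coefficients of $\gamma^+$ in (5.15) could be assembled, assuming the triangle centroids of $\Upsilon$ are available as an array; the names and the centroid-based membership test are illustrative assumptions, not the code actually used.

```python
import numpy as np

def exact_conductivity(centroids, gamma0=0.1, contrast=0.9,
                       centers=((0.35, 0.35), (0.65, 0.65)), radius=0.15):
    """Coefficients of gamma^+ in the basis (5.14): background gamma0 plus a
    jump of size 'contrast' on every triangle whose centroid lies in one of the balls B."""
    gamma = np.full(len(centroids), gamma0)
    for cx, cy in centers:
        inside = np.hypot(centroids[:, 0] - cx, centroids[:, 1] - cy) < radius
        gamma[inside] += contrast
    return gamma

# toy usage with three fictitious centroids
centroids = np.array([[0.34, 0.36], [0.5, 0.5], [0.66, 0.64]])
print(exact_conductivity(centroids))   # [1.0, 0.1, 1.0]
```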
To properly compare the results in different spaces, we always calculate the error in the $L^2$-norm. The relative iteration error of the $n$-th iterate $\gamma_n$ is therefore defined by
\[
e_n := \frac{\|\gamma_n - \gamma^+\|_{L^2(\Omega)}}{\|\gamma^+\|_{L^2(\Omega)}},
\]
which implies that the initial error is $e_0 \approx 87.4\%$. The (final) relative iteration error $e_{N(\delta)}$ is denominated the reconstruction error.
is denominated the reconstruction error. The corresponding data fj = Λγ + gj have been
synthetically computed using again the FEM, but with a different and much more refined
mesh than Υ to avoid an inverse crime. The generated data is then corrupted by artificial
random noise of relative noise level $\delta$, that is, the perturbed data are
\[
f_j^\delta := f_j + \delta\, \|f_j\|_{L^2(\partial\Omega_j)}\, \mathrm{per}_j,
\]
where $\mathrm{per}_j$ is a uniformly distributed random variable with $\|\mathrm{per}_j\|_{L^2(\partial\Omega_j)} = 1$. In contrast to the previous chapter, $\delta$ denotes here a relative noise level. The other input parameters of Algorithm 1 are chosen as $\tau = 1.8$, $\mu = 0.8$ and $k_{\max} = 50$.
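A small Python sketch of this noise model, under the assumption stated above that the perturbation is $f_j^\delta = f_j + \delta\|f_j\|\,\mathrm{per}_j$ with a normalized uniform direction; the Euclidean norm of the coefficient vector is used as a simple stand-in for the $L^2(\partial\Omega_j)$-norm.

```python
import numpy as np

def perturb(f, delta, rng=np.random.default_rng(0)):
    """Add relative noise of level delta: f_delta = f + delta * ||f|| * per,
    where 'per' is a uniformly distributed direction normalized to ||per|| = 1."""
    per = rng.uniform(-1.0, 1.0, size=f.shape)
    per /= np.linalg.norm(per)
    return f + delta * np.linalg.norm(f) * per

f = np.linspace(0.0, 1.0, 50)          # fictitious voltage data f_j
f_delta = perturb(f, delta=0.01)       # 1% relative noise
print(np.linalg.norm(f_delta - f) / np.linalg.norm(f))   # approximately 0.01
```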
Figure 5.1: Left: the searched-for conductivity $\gamma^+$, defined in (5.15) and modelled by a resistive background (in blue) and two ball-like conductive inclusions (in dark red). Right: the triangulation $\Upsilon$ of $\Omega$.
An appropriate choice of the weight-function $\omega$ is crucial for the quality of the reconstructions. Based on the explanations from [52], we have defined $\omega$ as a piecewise constant function, $\omega := \sum_{i=1}^{M} \omega_i \chi_{T_i}$, with
\[
\omega_i := \frac{\sqrt{\sum_{j=0}^{d-1} \big\| F_j'(\gamma_0)\chi_{T_i} \big\|^2_{L^2(\partial\Omega_j)}}}{|T_i|}, \tag{5.18}
\]
where |Ti | represents the area of the triangle Ti . In spite of being only a heuristic choice for
the EIT-CM, this definition of ω has a special meaning for the complete electrode model
in Hilbert spaces, see (5.43) and Remark 50 in next section.
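For illustration, a Python sketch of (5.18), assuming the Jacobian of all $d$ measurement operators at $\gamma_0$ is available as a matrix whose columns correspond to the basis functions $\chi_{T_i}$ (any boundary quadrature weights are assumed to be already absorbed into its rows); the names are hypothetical.

```python
import numpy as np

def weight_from_jacobian(J, areas):
    """Heuristic weight (5.18): omega_i = ||i-th Jacobian column||_2 / |T_i|.
    J stacks the Jacobians of all d measurement operators at gamma_0 row-wise;
    'areas' holds the triangle areas |T_i|."""
    return np.linalg.norm(J, axis=0) / areas

# toy usage with a fictitious 6 x 4 Jacobian and 4 triangles
rng = np.random.default_rng(1)
J = rng.standard_normal((6, 4))
areas = np.array([0.2, 0.3, 0.25, 0.25])
print(weight_from_jacobian(J, areas))
```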
Since Yj = L2 (∂Ωj ) is a Hilbert space, we choose r = 2 because in this case, j2 (f ) = f
for all f ∈ L2 (∂Ωj ) . Since we are interested in studying sparse conductivities, we follow
ideas from [12] and always choose 1 < p ≤ 2. For the first test, we have used p = 1.1.
The goal of the first experiment is to test the quality of the weight-function ω. Figure 5.2
illustrates the results obtained using $d = 8$ and different relative noise levels $\delta$. It compares the influence of the weighted Banach space $X = V_{p,\omega} := \big(V, \|\cdot\|_{L^p_\omega(\Omega)}\big)$ and of the standard space $X = V_p$ on the reconstruction $\gamma_{N(\delta)}$. The dual gradient DE method with $C_1 = C_2 = 0.1$ (see (3.29) and (3.39)) is employed as inner iteration of K-REGINN to generate the results. Below each picture, the number of outer iterations $N = N(\delta)$, the reconstruction error $e_N$ and the number $k_{\mathrm{all}}$, which represents the sum of all inner iterations performed until convergence, are shown. All the pictures are in the same color scale and a color bar is exhibited below them. Observe that for all noise levels, the reconstructions in the first row, obtained with the weighted norm $\|\cdot\|_{L^{1.1}_\omega}$, are both qualitatively and quantitatively superior to those found using the $L^{1.1}$-norm and displayed in the second row. Moreover, in the weighted space, Algorithm 1 requires fewer iterations until convergence. Further, it is clear that the use of the weight-function $\omega$ brings more stability to the solutions in the sense that it reduces the oscillation near the boundary, which is constantly present in the reconstructions in the non-weighted space $L^{1.1}$. This figure also makes perceptible, in both spaces, the behavior $N(\delta) \to \infty$ as $\delta \to 0$ and the regularization property $\gamma_{N(\delta)} \to \gamma^+$ as $\delta \to 0$, proved in Theorem 47.
Due to the improvement provided by the use of $\omega$, this weight-function is used in all the remaining experiments of this subsection. For this reason, we suppress the dependence of $V_p$ on $\omega$, i.e., from now on $V_p = \big(V, \|\cdot\|_{L^p_\omega(\Omega)}\big)$.
Figure 5.2: Reconstructed conductivity $\gamma_{N(\delta)}$ in two different Banach spaces (rows: $V_{1.1,\omega}$ and $V_{1.1}$) and with different noise levels (columns: $\delta = 4\%$, $\delta = 2\%$ and $\delta = 1\%$). The first and second rows display the results obtained using respectively the weighted norm $\|\cdot\|_{L^{1.1}_\omega}$ and the standard norm $\|\cdot\|_{L^{1.1}}$.
The purpose of the second test is to check the results of Theorem 43. We fix the same parameters as in the last experiment (including $p = 1.1$ and $d = 8$) and analyze the behavior of the Bregman variation of the TP method (3.58) in the inner iteration of K-REGINN, with the maximal number of inner iterations fixed to $k_{\max} = 1$. This configuration results in a
variation of the Levenberg-Marquardt method. The regularization parameter in (3.57) is
chosen as α0 = 0.01. Now, in order to carry out the inner iteration, either the solution
of the nonlinear equation (3.58) needs to be found or the minimization of the functional
(3.60) must be executed. A fixed-point iteration for zn,k+1 in (3.58) could be employed to
find this vector. Convergence is guaranteed if the spaces X and Y have enough regularity
and the constant $\alpha_0$ is large enough, see e.g. [39, Appendix A]. However, these conditions are very restrictive: convergence is only ensured for large values of $\alpha_0$, and this constant needs to be very large for small values of $p$, which, from a practical point of view, seems to render this method unfeasible. On the other hand, minimizing the functional (3.60), even in Hilbert spaces, is not a trivial task and it is far from simple in more general Banach
spaces. In order to achieve this goal, we have utilized the DE method ((3.29) and (3.39))
with C1 = C2 = 0.01, which showed itself robust enough to reach a satisfactory precision
in the minimization of (3.60). As Theorem 43 refers to the noiseless situation, no artificial
noise is added in the generated data (δ = 0). Table 5.1 presents the results. It reinforces
the fact that in the noiseless situation, both the residual bn and the relative iteration error
en converge to zero as the number of outer iterations n grows to infinity.
Table 5.1: Levenberg-Marquardt method employed to confirm the results from Theorem 43: the relative iteration error $e_n$ as well as the nonlinear residual $b_n$ approach zero as the outer iteration index $n$ grows to infinity.
In the next experiment we aim to confirm the results of Theorem 38, which states that
the iteration error in the Bregman distance is monotonically non-increasing in the outer
iteration of K-REGINN, that is, $\Delta_p(\gamma^+, \gamma_n) \le \Delta_p(\gamma^+, \gamma_{n-1})$ for $n = 1, \dots, N(\delta)$. Figure 5.3
exhibits the evolution of the iteration error versus the outer iteration index n for three
different dual gradient methods (3.29) engaged as inner iteration: the DE, MSD and LW
methods with C1 = C2 = 0.1 in (3.39) , K1 = K2 = 0.1 in (3.45) and λLW = 0.1 (see (3.43)).
The noise level is set to δ = 1%, kmax = 50 and τ = 1.5. The other parameters remain the
same as in the last experiment. Observe that the (final) iteration errors are comparable for all methods. This behavior is somehow expected because the quality of the approximations in the inner iteration is not only dependent on the utilized method itself, but is also controlled by the stopping criterion of the inner iteration (3.5), which remains the same for all methods. Among these three methods however, the DE method is by far the one which results in the fastest convergence, while the LW method seems to be the slowest. Note that although the iteration error is strictly decreasing in most iterates, it remains constant in some of them. This behavior is in accordance with the results of Theorem 38, which states that the iteration error in the Bregman distance remains the same in the iterates where the discrepancy principle (3.9) is satisfied and is strictly decreasing otherwise.
Figure 5.3: The decreasing error behavior of the outer iteration of K-REGINN (shown in
Theorem 38) observed with three different dual gradient methods in the inner iteration.
Figure 5.4: Reconstruction error confronted with the noise level and observed in three different Banach space norms: $L^2$ (in red), $L^{1.1}$ (in blue) and $L^{1.01}$ (in green).
In contrast to the $L^2$-norm, which tends to over-smooth the reconstructions and consequently destroys the sparsity properties of the solution $\gamma^+$, the $L^p$-norm with $p$ assuming values close to 1 preserves these sparsity features and is therefore more capable of reconstructing sparse solutions, see e.g. [12].
One of the main advantages of using Kaczmarz methods is the fact that this kind of algorithm does not work with all the equations of (3.6) in every cycle. Only the equations for which the current iterate is not a good enough approximation⁴ are used to perform an iteration. The remaining equations, for which the current iterate is already considered good enough, are not used. This procedure only updates the current iterate in the necessary directions, accelerating the method until convergence and rendering better reconstructions.
This behavior is evident in Figure 5.5, where the number of active equations is compared
on each cycle of K-REGINN. The dual gradient MSD method ((3.29) and (3.45)) is employed
as inner iteration with the same parameters of last experiment but with the fixed noise
level $\delta = 1\%$, $p = 1.1$ and $d = 16$. Observe that after a few initial cycles, the number of active equations drops to relatively small levels, where only the relevant equations are worked on until termination. The average number of active equations in this example is roughly half of the total number.
Figure 5.5: Behavior of Kaczmarz methods: the average number of active equations decreases with time and the current iterate is corrected using only the relevant information.
The regularization property proved in Theorem 47 is once more evident in Figure 5.6, where different numbers of electric currents applied on the boundary of $\Omega$ (different values of $d$, or, equivalently, different numbers of equations in (3.6)) are tested. We want to
illustrate both behaviors in this experiment: more information improves the reconstructions
independently on the noise levels and the results become better whenever the noise levels
are reduced. Each row in this Figure presents the reconstructions for a different value of
d : d = 4, d = 8 and d = 12. The columns exhibit and compare the different noise levels
δ = 2%, δ = 1% and δ = 0.5%. The pictures have been generated using the Bregman
variation of the IT method (3.62) with the same parameters of the last experiment but
⁴ Called the active equations, these are the equations for which inequality (3.9) is not satisfied.
with kmax = 10. Further, the regularization parameter in (3.68) is defined as the constant
αn ≡ 0.1 and the dual gradient DE method ((3.29) and (3.39)) with C1 = C2 = 0.01 is
used to find the minimizer of the associated functional (3.63). All the pictures are in the
same scale of colors and below all of them, a color bar shows the values represented by each
of those colors. A clear improvement in the reconstructions is seen moving towards the
rightmost column (where the noise levels are lower) and moving downwards (where more
information is available).
Figure 5.6: Reconstruction γN (δ) in the space V1.1 obtained with the noise levels δ = 4%,
δ = 2% and δ = 1% and with different number of equations: d = 4, d = 8 and d = 12.
The norm of the noise can vary if it is regarded as a vector in different spaces. For instance, the so-called impulsive noise, which consists of standard uniform noise superimposed with a few highly inconsistent data points called outliers, has a small $L^r$-norm for $r$ small and becomes larger as $r$ increases. In contrast, the so-called Gaussian noise, which is a uniformly distributed noise, is less sensitive to the chosen norm and has more similar values in different $L^r$-norms. In the first row of Figure 5.7, three completely different kinds of noise are presented. The above-mentioned Gaussian and impulsive noises are shown in the first and second picture respectively. Below each kind of noise, the value of the corresponding $L^r$-norm is shown for different values of $r$. The noises are scaled such that the (relative) $L^2$-norm is the same for all of them.
Figure 5.7: Norm of three different kinds of noise, measured in the Banach space norms
L2 (∂Ω1 ) (first row), L1.01 (∂Ω1 ) (second row) and L50 (∂Ω1 ) (third row).
Figure 5.8: Reconstruction $\gamma_{N(\delta)}$ obtained in different Banach space norms (compared by rows: $X = V_2$, $X = V_{1.01}$ and $X = V_{50}$) and with different kinds of noise (compared by columns). All the noises are scaled and have the same $L^2$-norm.
Worse reconstructions are observed whenever the combination of noise and used space results in a large noise-norm (see the second row intersected with the third column and the intersection between the third row and the second column).
To finish this section we explain how to take advantage of the weight-function $\omega$ to incorporate a priori information about the solution $\gamma^+$. If the inclusions $B$ are expected to be located in a specific region $A$ of $\Omega$, one way to obtain better reconstructions is to give less weight to the points within this region and to "penalize" the distance to this set: let $A \subset \Omega$ be a closed subset of $\Omega$ where the inclusions are expected to be located. Define the new weight-function
\[
\bar\omega(x) := \omega(x)\, h(x), \quad x \in \Omega,
\]
where
\[
h(x) := c_0 + \operatorname{dist}(x, A). \tag{5.19}
\]
The function $\operatorname{dist}(\cdot, A) : \Omega \to \mathbb{R}$ measures the distance between a point of $\Omega$ and the set $A$:
\[
\operatorname{dist}(x, A) := \inf\{ \|x - y\| : y \in A \}, \quad x \in \Omega.
\]
Figure 5.9: Reconstructed solution γN (δ) for different weight-functions. The first row shows
the function h from (5.19).
The small constant $c_0 > 0$ is used to avoid zero weights and to define the "contrast" between the points which belong to the set $A$ and those which do not. This new weight-function is still bounded from above and from below: $c_0\,\omega_{\min} \le \bar\omega \le (c_0 + |\Omega|)\,\omega_{\max}$, where $0 < \omega_{\min} \le \omega_{\max}$ are respectively lower and upper bounds for $\omega$.
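A possible Python sketch of the modified weight (5.19), approximating the set $A$ by a point cloud and evaluating $h$ at the triangle centroids; all names and the point-cloud approximation are illustrative assumptions.

```python
import numpy as np

def distance_weight(centroids, A_points, c0=0.03):
    """h(x) = c0 + dist(x, A), evaluated at the triangle centroids; the set A is
    approximated by the point cloud 'A_points'. The modified weight is then
    omega_bar_i = omega_i * h(centroid_i), cf. (5.19)."""
    diffs = centroids[:, None, :] - A_points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)
    return c0 + dist

# toy usage: A approximated by points on the diagonal, three fictitious centroids
A_points = np.column_stack([np.linspace(0, 1, 101), np.linspace(0, 1, 101)])
centroids = np.array([[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]])
print(distance_weight(centroids, A_points))
```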
Figure 5.9 collects the results obtained using different sets A and the same constant
c0 = 0.03. The first row shows the function h, defined in (5.19) , while the second one
exhibits the respective reconstructions with the number of outer iteration N, the overall
number of inner iteration kall and the reconstruction error eN being highlighted below each
picture. At the end of each row, a color bar is displayed. The first column presents the result for the set $A = \Omega$, which means that the new weight $\bar\omega(x) = c_0\,\omega(x)$ is just a multiple of the original weight-function $\omega$. For the reconstruction in the second column the strip-like set
\[
A = \{ (x, y) \in \Omega : y - 0.25 \le x \le y + 0.25 \}
\]
has been used, and the surface delimited by a square is the chosen set in the third column. The pictures have been generated using the DE method with $C_1 = C_2 = 0.1$ and the following configuration: $d = 8$, $p = 1.1$, $r = 2$, $\tau = 1.5$ and $\delta = 0.5\%$. The other parameters are the same as in the last experiment.
Note that better results are found in both the second and third columns, where more
accurate information about the location of the inclusions is provided. Further, the overall
number of outer and inner iterations are lower in both cases.
5.2 EIT - Complete Electrode Model

In the Complete Electrode Model of EIT (EIT-CEM), electric currents are applied to a bounded domain $\Omega \subset \mathbb{R}^2$ via $L \in \mathbb{N}$ electrodes attached to its boundary $\partial\Omega$ and the resulting voltages are measured in the same electrodes, with the goal of restoring the electrical conductivity $\gamma : \Omega \to \mathbb{R}$. For a correct translation into an appropriate mathematical model, we suppose that the electrodes $e_1, \dots, e_L$ are identified with the part of the surface of $\Omega$ they contact.
Thus, $e_i \subset \partial\Omega$ is open and has positive measure $|e_i| > 0$ for $i = 1, \dots, L$. Additionally, the electrodes are connected and separated: $e_i \cap e_j = \emptyset$ for $i \ne j$. Similarly to the EIT-
CM model, we suppose that Ω has no sources or drains and thus (5.1) is satisfied, where
u : Ω → R represents again the potential distribution in Ω. The electrodes are modelled as
perfect conductors, which means that the electric current in the electrode ei is a constant
Ii ∈ R, which agrees with the total electric flux over the same electrode:
\[
\int_{e_i} \gamma\, \frac{\partial u}{\partial \nu}\, dS = I_i, \quad i = 1, \dots, L.
\]
The vector I := (I1 , ..., IL )> ∈ RL , which collects in a single vector the electric currents of
all electrodes, is called a current pattern. As the electric flux does not flow in the electrode gaps, we have
\[
\gamma\, \frac{\partial u}{\partial \nu} = 0 \quad \text{on } \partial\Omega \setminus \bigcup_{i=1}^{L} e_i.
\]
At the contact of the electrodes with ∂Ω, there is an electro-chemical effect which gives rise
to a thin, highly resistive layer characterized here by the quantities zi > 0, i = 1, ..., L and
called the contact impedances. The product of the contact impedance zi times the electric
flux γ (∂u/∂ν) results in the drop of the voltage in the electrode ei . Thus
\[
u + z_i\, \gamma\, \frac{\partial u}{\partial \nu} = U_i \quad \text{on } e_i, \quad i = 1, \dots, L,
\]
where Ui is the measured voltage in the i−th electrode ei . We denote by U := (U1 , ..., UL )> ∈
RL , the vector of the voltages associated with the current pattern I.
In [50, Prop. 3.1], it is demonstrated that if the set Ω, the electrical conductivity γ and
the electrodes ei have enough regularity, the above conditions can be equivalently replaced
by the variational problem
\[
B\big((u, U), (v, V)\big) = \sum_{i=1}^{L} I_i V_i \quad \text{for all } (v, V) \in H, \tag{5.20}
\]
Aiming now to apply the Lax-Milgram Lemma [32] to (5.20) in order to prove the existence of a vector $(u, U)$ for fixed contact impedances $z_i > 0$ and a fixed current pattern $I \in \mathbb{R}^L$, it was suggested in [50, Theo. 3.3] to replace the space $H$ with $\widetilde H = H/\mathbb{R}$, which is essentially the same space with the difference that two vectors from $H$ which differ only by a constant function are regarded as the same vector in $\widetilde H$, i.e.,
\[
(v, V) \sim (\tilde v, \widetilde V) \iff v - \tilde v = \mathrm{const} = V_1 - \widetilde V_1 = \dots = V_L - \widetilde V_L. \tag{5.22}
\]
In this new vector space, the bilinear form $B$ in (5.21) is coercive whenever $\gamma \in L^\infty_+(\Omega)$.
Now defining $f : \widetilde H \to \mathbb{R}$ as
\[
f\big((v, V)\big) := \sum_{i=1}^{L} I_i V_i,
\]
we see that, in order to have a well-defined function, the equality $f((v, V)) = f\big((\tilde v, \widetilde V)\big)$ must be verified whenever the right-hand side of (5.22) holds true. Thus, the equality
\[
0 = f\big((v, V)\big) - f\big((\tilde v, \widetilde V)\big) = \sum_{i=1}^{L} I_i V_i - \sum_{i=1}^{L} I_i (V_i - \mathrm{const}) = \mathrm{const} \sum_{i=1}^{L} I_i
\]
must hold for an arbitrary constant, which is only possible if the current pattern belongs to
\[
\mathbb{R}^L_\diamond := \Big\{ I \in \mathbb{R}^L : \sum_{i=1}^{L} I_i = 0 \Big\}.
\]
In this case, $f$ is a well-defined, bounded linear functional and the Lax-Milgram Lemma applies. It follows that there exists a unique solution of (5.20) if $H$ is replaced with $\widetilde H$. Therefore, there exist infinitely many solutions in $H$, differing from each other only by a constant. To obtain uniqueness in $H$, we suppose that $U \in \mathbb{R}^L_\diamond$. The conditions $I, U \in \mathbb{R}^L_\diamond$ can be understood as the law of conservation of charge and the grounding of the potential respectively.
The above reasoning leads to the conclusion that the following related variational problem is well-defined: for a fixed current pattern $I \in \mathbb{R}^L_\diamond$ and positive contact impedances $z_1, \dots, z_L$, find the unique pair $(u, U) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond$ satisfying
\[
B\big((u, U), (v, V)\big) = \sum_{i=1}^{L} I_i V_i \quad \text{for all } (v, V) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond, \tag{5.23}
\]
where $B : \widehat H \times \widehat H \to \mathbb{R}$, with $\widehat H := H^1(\Omega) \oplus \mathbb{R}^L_\diamond$, is defined in (5.21).
Remark 49 The variational problem (5.23) has a unique solution even if the fixed current pattern $I$ belongs to⁵ $\mathbb{R}^L \setminus \mathbb{R}^L_\diamond$. Indeed, defining $a := \sum_{i=1}^{L} I_i$ and replacing $I$ with $I - C$, where $C := (a/L, \dots, a/L)^\top$, we see that $I - C \in \mathbb{R}^L_\diamond$ and, for all $V \in \mathbb{R}^L_\diamond$,
\[
\sum_{i=1}^{L} (I_i - C_i) V_i = \sum_{i=1}^{L} I_i V_i - \frac{a}{L} \underbrace{\sum_{i=1}^{L} V_i}_{=0} = \sum_{i=1}^{L} I_i V_i.
\]
Since a unique solution of (5.23) exists with the current pattern $I - C$, the same occurs with $I$.
Using (5.23), it is not difficult to prove that the Neumann-to-Dirichlet (NtD) operator
\[
\Lambda_\gamma : \mathbb{R}^L \to \mathbb{R}^L, \quad I \mapsto U, \tag{5.24}
\]
is linear. The Complete Electrode Model of EIT (in short EIT-CEM) forward problem is now defined by the function
\[
F : D(F) \subset L^\infty(\Omega) \to \mathcal{L}\big(\mathbb{R}^L, \mathbb{R}^L\big), \quad \gamma \mapsto \Lambda_\gamma, \tag{5.25}
\]
with D(F ) := L∞
+ (Ω) . Recovering γ from a partial knowledge of Λγ is the associated inverse
problem to be solved.
⁵ A current pattern vector $I \in \mathbb{R}^L \setminus \mathbb{R}^L_\diamond$ does not satisfy the principle of conservation of charge; therefore, it has no physical meaning.
As can be seen from Remark 49, the NtD operator (5.24) is not injective. In fact, the null-space of $\Lambda_\gamma$ contains the constant current patterns, i.e., $\operatorname{span}\{(1, \dots, 1)^\top\} \subset N(\Lambda_\gamma)$. But, since it is a linear operator, the knowledge of the vectors $U^j = \Lambda_\gamma I^j$, $j = 1, \dots, L$, where $I^1, \dots, I^L$ is a basis for $\mathbb{R}^L$, implies the knowledge of the NtD operator $\Lambda_\gamma$ itself⁶.
where $(u^j, U^j) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond$ is the unique solution of (5.23) associated with the current pattern $I = I^j$, that is, $U^j := \Lambda_\gamma I^j$, $j = 1, \dots, \ell$. Observe that the noisy versions of $U^j$
and regard this operator as the EIT-CEM forward problem for the rest of this section. The determination of an approximation for $\gamma$ from partial knowledge of $\Gamma_\gamma$ is therefore the associated inverse problem we want to solve.
Similarly to the EIT-CM presented in the last section, the space $Y = \mathbb{R}^{\ell L}$ factorizes into the spaces $Y = Y_1 \times \dots \times Y_\ell$, with $Y_j := \mathbb{R}^L$, $j = 1, \dots, \ell$, and accordingly $\vec F = (F_1, \dots, F_\ell)^\top$ with
\[
F_j : D(F) \subset L^\infty(\Omega) \to \mathbb{R}^L, \quad \gamma \mapsto U^j, \quad j = 1, \dots, \ell, \tag{5.29}
\]
which replaces the problem (5.28) with a version more suitable for the application of a Kaczmarz method.
where $(u^j, U^j) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond$ is the unique solution of (5.23) with the current pattern $I = I^j$, that is, $U^j = F_j(\gamma)$.
Proceeding similarly to (5.10), one can prove that for each $Z \in \mathbb{R}^L$, the adjoint operator of the F-derivative of $F_j$ satisfies
The F-derivative of the forward operator (5.28) is now given by $\vec F{}'(\gamma) = (F_1'(\gamma), \dots, F_\ell'(\gamma))^\top$, i.e., for each $\eta \in L^\infty(\Omega)$,
\[
\vec F{}'(\gamma)\,\eta = \begin{pmatrix} F_1'(\gamma)\,\eta \\ \vdots \\ F_\ell'(\gamma)\,\eta \end{pmatrix} \in \mathbb{R}^{\ell L}. \tag{5.32}
\]
Since only a finite number of experiments can be made, only a finite number of degrees of freedom of the conductivity can be restored. Thus, it makes perfect sense to restrict the searched-for conductivities to a finite dimensional space. Proceeding similarly to the
last section, we define the triangulation Υ := {Ti : i = 1, ..., M } of Ω as in (5.13) and
analogously to (5.11) and (5.14) , we restrict the space X to the finite dimensional space V
and the domain of definition of F to
\[
F : V_+ \subset V_p \to \mathbb{R}^{\ell L}, \quad \gamma \mapsto \Gamma_\gamma, \tag{5.34}
\]
and
\[
F_j : V_+ \subset V_p \to \mathbb{R}^L, \quad \gamma \mapsto U^j, \quad j = 1, \dots, \ell, \tag{5.35}
\]
respectively. Identifying an arbitrary vector $\gamma = \sum_{i=1}^{M} \alpha_i \chi_{T_i}$ of $V_+ \subset L^\infty(\Omega)$ with the vector of its coordinates $(\alpha_1, \dots, \alpha_M)^\top \in \mathbb{R}^M$, we see that the functions in (5.34) and (5.35) above can now be seen as nonlinear operators from (a subset of) $\mathbb{R}^M$ to $\mathbb{R}^{\ell L}$ and from (a subset of) $\mathbb{R}^M$ to $\mathbb{R}^L$ respectively. Their derivatives $F'(\gamma)$ and $F_j'(\gamma)$, evaluated at a vector $\gamma \in \operatorname{int}(V_+)$, can accordingly be regarded as two matrices (called the Jacobian matrices) with dimensions $\ell L \times M$ and $L \times M$ respectively. In this framework, the adjoint operators of these derivatives are simply the transposed Jacobian matrices.
We now explain how to calculate the Jacobian matrix efficiently: let $\gamma \in \operatorname{int}(V_+)$ be given and observe that
\[
F_j'(\gamma) = \big( F_j'(\gamma)\chi_{T_1}, \dots, F_j'(\gamma)\chi_{T_M} \big) \in \mathbb{R}^{L\times M}, \quad j = 1, \dots, \ell.
\]
Further, using (5.30), the vector $F_j'(\gamma)\chi_{T_i} =: W_i^j$ can be evaluated as a part of $(w_i^j, W_i^j) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond$, the unique solution of
\[
B\big((w_i^j, W_i^j), (v, V)\big) = -\int_\Omega \chi_{T_i}\, \nabla u^j \cdot \nabla v \, dx \quad \text{for all } (v, V) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond, \tag{5.36}
\]
where $(u^j, U^j) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond$ is the unique solution of (5.23) for the current pattern $I = I^j$. This relatively straightforward procedure for calculating the Jacobian matrix is however highly expensive. In fact, it is necessary to solve the problem (5.23) for $I = I^j$ in order to obtain the vector $u^j$ and then to solve the problem (5.36) for $i = 1, \dots, M$ in order to calculate $F_j'(\gamma)$. Similarly, $\ell$ solutions of (5.23) and $M\ell$ solutions of (5.36) are required for the evaluation of $F'(\gamma)$. But, using a simple trick, as explained in [42], we are able to strongly reduce these numbers. For each $i \in \{1, \dots, L\}$, let $(v^i, V^i) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond$ be the unique solution of (5.23) with the current pattern $I = J^i := (\delta_{i,j})_{j=1}^{L}$, which is the vector having the value 1 in the $i$-th coordinate and zero elsewhere. That is,
\[
B\big((v^i, V^i), (v, V)\big) = \sum_{k=1}^{L} J^i_k V_k = V_i \quad \text{for all } (v, V) \in H^1(\Omega) \oplus \mathbb{R}^L_\diamond. \tag{5.37}
\]
Thus,
\begin{align*}
F_j'(\gamma)\chi_{T_i} = W_i^j &= \Big( \big\langle J^1, W_i^j\big\rangle, \dots, \big\langle J^L, W_i^j\big\rangle \Big)^\top \tag{5.38}\\
&\overset{(5.37)}{=} \Big( B\big((v^1, V^1), (w_i^j, W_i^j)\big), \dots, B\big((v^L, V^L), (w_i^j, W_i^j)\big) \Big)^\top \\
&= \Big( B\big((w_i^j, W_i^j), (v^1, V^1)\big), \dots, B\big((w_i^j, W_i^j), (v^L, V^L)\big) \Big)^\top \\
&\overset{(5.36)}{=} \Big( -\int_\Omega \chi_{T_i}\, \nabla u^j \cdot \nabla v^1 \, dx, \dots, -\int_\Omega \chi_{T_i}\, \nabla u^j \cdot \nabla v^L \, dx \Big)^\top \\
&= \Big( -\int_{T_i} \nabla u^j \cdot \nabla v^1 \, dx, \dots, -\int_{T_i} \nabla u^j \cdot \nabla v^L \, dx \Big)^\top.
\end{align*}
Thus, in order to calculate the Jacobian matrix, which represents the derivative of (5.34), everything one needs to do is to solve the problem (5.23) for each of the $\ell$ current patterns in (5.26), solve (5.37) for $i \in \{1, \dots, L\}$ and assemble the results as explained in (5.38) and (5.32). This procedure demands the solution of only $\ell + L$ variational problems and, since the number of electrodes $L$ is normally much smaller than the number of triangles $M$ in $\Upsilon$, a tremendous saving in the computational effort can be achieved with this strategy.
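To make the last step concrete, a Python sketch of the assembly (5.38) for piecewise-linear finite elements, where the gradients $\nabla u^j$ and $\nabla v^k$ are constant on each triangle, so that $-\int_{T_i}\nabla u^j\cdot\nabla v^k\,dx = -|T_i|\,(\nabla u^j)_i\cdot(\nabla v^k)_i$; the array names and shapes are hypothetical.

```python
import numpy as np

def assemble_jacobian(grad_u, grad_v, areas):
    """Assemble F_j'(gamma) via (5.38) for P1 finite elements.

    grad_u : (M, 2) gradient of u^j on each triangle (constant per triangle),
    grad_v : (L, M, 2) gradients of the L auxiliary solutions v^1, ..., v^L,
    areas  : (M,) triangle areas |T_i|.

    Entry (k, i) of the returned L x M matrix is -|T_i| * (grad u^j)_i . (grad v^k)_i."""
    return -np.einsum('kmd,md,m->km', grad_v, grad_u, areas)

# toy usage: M = 3 triangles, L = 2 electrodes
grad_u = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
grad_v = np.ones((2, 3, 2))
areas = np.array([0.2, 0.3, 0.5])
print(assemble_jacobian(grad_u, grad_v, areas))   # 2 x 3 block for one current pattern
```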
If we further observe that in each inner iteration of REGINN, the vectors $u^j$, $j = 1, \dots, \ell$,
have already been calculated in the outer iteration in order to evaluate F (γ), then the
calculation of the Jacobian actually requires the additional calculation of L solutions of
(5.37) . Analogously, the evaluation of the Jacobian Fj0 (γ) for the Kaczmarz version (5.35)
demands L solutions of (5.37) if the information Fj (γ) is already available.
All the methods presented in Section 3.1 demand the calculation of at least one derivative applied to a vector ($A_n s$, $s \in X$) or of the adjoint of the derivative applied to a vector ($A_n^* b$, $b \in Y^*$) to calculate the first vector $s_{n,1}$ in each inner iteration of REGINN.
Thus, if one wants to calculate sn,1 for the problem (5.34) without executing the calculation
of the Jacobian matrix, we conclude in view of (5.32) that at least L solutions of (5.30) or
(5.31) need to be found. Considering that the process of obtaining a solution of a variational
problem is the unique relevant computational effort in a run of Algorithm 1, and recalling
that L solutions of (5.37) is everything one needs in order to calculate the Jacobian for this
problem, we conclude that it is always worth to evaluate this matrix.
The situation is a little different for the Kaczmarz version of REGINN, i.e., if the problem
(5.35) is analyzed instead of (5.34) . If the Jacobian matrix Fj0 (γ) is not available and a
dual gradient method (3.29) or a mixed gradient-Tikhonov method (3.76) is employed as inner iteration, for instance, the solution of two variational problems is required to compute each single vector: one for the derivative applied to $s_{n,k}$ and a second one for the adjoint of the derivative applied to $j_r\big(A_n s_{n,k} - b_n^\delta\big)$. This results in a
total of 2kn required solutions for the entire execution of an inner iteration. In the first
iteration however, the calculation of An sn,0 can be avoided because sn,0 = 0. But, if the
inner iteration is not terminated by the maximal number $k_{\max,n}$, the additional derivative $A_n s_{n,k_n}$ needs to be calculated only to verify the stopping criterion (3.5) in the last iteration, which leaves us with the same number $2k_n$. The calculation of the last derivative is not
necessary however, if the maximal number of inner iterations is reached. In this case,
the inner iteration calculates the solution of 2kmax,n − 1 variational problems. Since the
evaluation of the Jacobian matrix demands a total of L solutions of (5.37) , we conclude
that it is not worth to calculate this matrix if 2kmax,n −1 ≤ L or 2kn ≤ L for the cases when
the maximal number of inner iterations is reached or not reached respectively. But, as an a priori estimate for the number $k_n$ is not available, it is difficult to decide whether the Jacobian should be calculated. Note however that if $k_{\max} \le L/2$, then both inequalities are satisfied and therefore the calculation of the Jacobian is more expensive. See a detailed comparison in Tables 5.2 and 5.3 at the end of Subsection 5.2.2 below.
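The resulting rule of thumb can be written down in a few lines; the Python sketch below only covers the a priori case in which the inner iteration is expected to run up to $k_{\max,n}$.

```python
def assemble_jacobian_worthwhile(k_max_n, L):
    """Rule of thumb from the cost count above: assembling the Jacobian costs about L
    extra variational solves, while a Jacobian-free inner iteration costs at most
    2*k_max_n - 1 solves; assembling only pays off when the latter exceeds the former."""
    return 2 * k_max_n - 1 > L

print(assemble_jacobian_worthwhile(k_max_n=50, L=16))   # True: assemble the Jacobian
print(assemble_jacobian_worthwhile(k_max_n=5,  L=16))   # False: 2*5-1 <= 16, skip it
```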
where $r > 1$ is a fixed number. A similar calculation to that made in Example 17 shows that the duality mapping is given by
\[
\big(J_r(U)\big)_i = |U_i|^{r-1}\operatorname{sgn}(U_i), \quad i = 1, \dots, \ell L, \tag{5.40}
\]
where $J_r(U) := \big( (J_r(U))_1, \dots, (J_r(U))_{\ell L} \big)^\top$, $U \in Y$.
In our experiments, we have chosen Υ and Θ with dim Υ = 1200 and dim Θ = 14340.
We highlight the fact that the use of the relatively large dimension of the triangulation Υ
implies that the inequality dim Υ ≤ L (L − 1) /2, necessary for the injectivity of F 0 , is only
verified for L ≥ 50. However, most experiments we have performed in this subsection have
a moderate number of electrodes, which is in general much smaller than 50. The result is
that a solution of an under-determined system
An s = bn (5.41)
needs to be approximated in each inner iteration of K-REGINN. Among all the possible
solutions we could approximate, we then pick up a specific one which is the most suitable to
our interests. We follow ideas from [52] and select an appropriate weight-function ω : Ω → R,
changing the Lp −norm in X with Lpω , as done in the last section. In [52], Winkler and Rieder
have proposed choosing a different weight-function ω = ωn for each (outer) iteration. They
argument that an usual choice for the solution of (5.41) in the Hilbert spaces is
s := arg min kAn s − bn k ,
s∈N (An )⊥
7
For a full explanation of how to employ the FEM in order to find a solution of (5.23) we recommend
[42].
such that the weighted inner product $\langle\cdot,\cdot\rangle_{\omega_n} := \langle\cdot, \omega_n\cdot\rangle_{L^2(\Omega)}$ substitutes the original $L^2$ inner product, and then to redefine
\[
s := \operatorname*{arg\,min}_{s \in N(A_n)^{\perp_{\omega_n}}} \|A_n s - b_n\|,
\]
where $N(A_n)^{\perp_{\omega_n}}$ represents the orthogonal complement of $N(A_n)$ with respect to the new inner product $\langle\cdot,\cdot\rangle_{\omega_n}$. They have suggested a strategy which updates indistinguishable coefficients of the current (outer) iterate $\gamma_n := \sum_{i=1}^{M} \gamma_{n,i}\chi_{T_i}$ by the same amount, as follows: the definition of the coefficients
\[
\omega_{n,i} := \frac{\|S_i\|_2}{|T_i|}\, \gamma_{n,i}^{-1}, \quad i = 1, \dots, M, \tag{5.43}
\]
where $|T_i|$ is the area of the triangle $T_i$ and
\[
\|S_i\|_2 := \sqrt{\sum_{j=1}^{\ell} \big\| F_j'(\gamma_n)\chi_{T_i} \big\|_2^2}.
\]
With this definition, two coefficients $i$ and $j$ of the iterate are updated by the same amount whenever
\[
S_j = \beta S_i, \quad \beta \in \mathbb{R}\setminus\{0\},
\]
holds true.
Remark 50 Since $\gamma_0$ was chosen as a constant in the last section, the weight-function $\omega$ used in that section is just given by the first weight ($n = 0$ in definition (5.43) above) times a constant, that is, $\omega = \gamma_0\,\omega_0$, see (5.18).
Note that the use of the weighted $L^2$ inner product $\langle f, g\rangle_{\omega_n} = \int_\Omega f g\,\omega_n$ results in the weighted space $X_n = V_{2,\omega_n} = \big(V, \|\cdot\|_{L^2_{\omega_n}(\Omega)}\big)$. Unfortunately, our framework does not allow the use of different spaces $X$ in different iterations and for this reason we proceed as in the last section and fix the same weight-function for all iterations. We therefore define
\[
\omega := \sum_{i=1}^{M} \omega_i \chi_{T_i}
\]
⁸ The authors of [52] actually proposed the use of a slightly different weight-function $\bar\omega_n$, whose coefficients $\bar\omega_{n,i}$ correspond to $|T_i|\,\omega_{n,i}$ in (5.43). But, since they have used the Euclidean inner product in $\mathbb{R}^M$ in place of the $L^2$ inner product, both choices result in equivalent approaches.
Figure 5.10: Left: example of a conductivity γ + , modeled by a background (in blue) and
a sparsely distributed inclusion (in dark red). Middle and right: Triangulations Θ and Υ
respectively.
with
\[
\omega_i := \frac{\sqrt{\sum_{j=1}^{\ell} \big\| F_j'(\gamma_0)\chi_{T_i} \big\|_2^2}}{|T_i|}\, \gamma_{0,i}^{-1}. \tag{5.44}
\]
Even though we do not have corresponding results for Banach spaces, we use this weight-function in all the experiments. In particular, replacing the space $X = V_p$ with $X = V_{p,\omega}$ results in the new norm
\[
\|f\|_{L^p_\omega(\Omega)} = \big\| f\omega^{1/p} \big\|_{L^p(\Omega)}, \quad f \in X.
\]
Further, this modification in $X$ implies a slight modification in the way the adjoint operator of the derivative is calculated. In this new space, the adjoint $A_n^*$ of the Jacobian matrix is not simply the transpose of the matrix $A_n$, but the transpose of the weighted matrix $A_{n,\omega}$, defined by multiplying each element in the $i$-th row of the matrix $A_n$ by $\omega_i$. Indeed, for all $u \in \mathbb{R}^M$ and $v \in \mathbb{R}^{\ell L}$,
\[
\big\langle u, (A_{n,\omega})^{\top} v \big\rangle_Y = \langle A_{n,\omega}\, u, v\rangle_{V_p} = \langle \omega A_n u, v\rangle_{V_p} = \langle A_n u, v\rangle_{V_{p,\omega}} = \langle u, A_n^* v\rangle_Y,
\]
Figure 5.11: Convergence in the noiseless situation observed in two different Banach space
norms. The first and second rows show the reconstructions obtained using the Hilbert space
X = V2 and the Banach space X = V1.1 respectively.
To confirm the important noiseless convergence result, proved in Theorem 43, we add no
artificial noise in the generated data (δ = 0), and see what happens to the approximation
γn as n grows to infinity. For this first experiment, the IT method (3.62) has been employed
as inner iteration of REGINN and the results, which have been built in the spaces X = V2
(p = 2) and X = V1.1 (p = 1.1), are respectively displayed in the first and second row of
Figure 5.11. The Euclidean norm $\|\cdot\|_2$ (definition (5.39) with $r = 2$) is used to transform the vector space $Y$ into a Hilbert space. With this configuration, the duality mapping in
(5.40) is just the identity operator. The parameter αn in (3.68) is given by the constant
value 0.1 and the dual gradient DE method (see (3.29) and (3.39)) with C1 = C2 = 0.1
is applied to find the minimizer of the functional (3.63) , which is necessary to realize the
inner iteration. The other parameters of REGINN are µ = 0.8 and kmax = 10. Each picture
displayed in Figure 5.11 is actually a linear interpolation of the coefficients of the piecewise
constant reconstructions. All these pictures are in the same scale of colors and below each
of them, the relative iteration error $e_n$ is shown. The first column illustrates the iterate $\gamma_{10}$, while the second and third ones refer to $\gamma_{100}$ and $\gamma_{1000}$ respectively. As expected, the reconstructions in the Hilbert space $X = V_2$ are over-smoothed, with a relatively large oscillation in the background and inclusions that are too low. On the other hand, the second row clearly shows thinner and higher inclusions with larger slopes and lower oscillation levels in the background, which is a typical behavior of solutions restored in the Banach space norms $L^p$ for small values of $p$, see e.g. [12].
In the next experiment we use artificially generated noise to contaminate the data Γγ
in (5.27) with the relative noise level δ :
Figure 5.12: Conductivity with sparsely located inclusions (first row) and the respective
reconstructed solutions: using the Hilbert space norm L2 (second row) and the Banach
space norm L1.01 (third row).
in the last experiment. Figure 5.12 presents the searched-for solutions γ + in the first row
and exhibits in the second and third rows, the respective V2 − and V1.01 −reconstructions. All
the pictures are in the same scale of colors and after the third row, a color bar is displayed.
We clearly see a lower level of oscillation in the background for the reconstructions in the third row. Additionally, an unmistakable improvement in the reconstructions is achieved if the Hilbert space $V_2$ is replaced with the Banach space $V_{1.01}$, resulting in sharper inclusions with more accurate values, shapes and locations.
In order to examine the behavior of REGINN for reconstructing a sparse conductivity
together with impulsive noise9 , we choose different norms in both X and Y. The values
p = 2 and p = 1.01 as well as r = 2 and r = 1.01 have been designated to perform this task.
Since d = 1 (non-Kaczmarz method), inequality s ≤ r in Theorem 47 is an unnecessary
condition and therefore, the index r of the duality mapping in Y can be freely chosen.
Figure 5.13 illustrates the searched-for conductivity γ + , modelled by a ring-like inclusion,
⁹ Impulsive noise in the context of the Complete Electrode Model corresponds to a low error level in most electrodes in the measured voltage for a fixed current pattern, but with very high error levels in a few of them.
Figure 5.13: Left: the searched-for sparsely distributed conductivity γ + . Middle and right:
impulsive noise Γδγ − Γγ .
followed by the noise $\Gamma^\delta_\gamma - \Gamma_\gamma$ (see (5.27)) displayed from two different angles¹⁰ and with noise level $\delta = 1\%$.
Figure 5.14 complements Figure 5.13 and shows the reconstructions in the above referred
framework. The dual gradient DE method ((3.29) and (3.39) with C1 = C2 = 0.1) is used
as inner iteration of REGINN to generate the results. The remaining parameters of REGINN
are the same as in the last experiment. In the first and second rows we present the results
found using the Hilbert space X = V2 and the Banach space X = V1.01 respectively. The
first column shows the reconstructions when r = 2 in the norm (5.39) is used, while the
second column displays the results for r = 1.01. Below each picture, the number of outer
iterations N = N (δ) as well as the reconstruction error eN and the overall number of inner
iterations kall are shown. Further, a color bar is presented on the right. It is clear that
the pictures in the last column have a superior quality to the ones in the first column, which means that the norm $\|\cdot\|_{1.01}$ is more appropriate for dealing with the impulsive noise than the standard Euclidean norm $\|\cdot\|_2$. The price to be paid for better reconstructions is a significant increase in the overall number of inner and outer iterations. Observe however that a small value of $p$ produces the additional effect of less variation in the background, at the same time as it demands less computational effort until convergence. Figure 5.14 makes clear that the combination $p = r = 1.01$ results in a very satisfactory framework for this specific situation.
To finish this chapter, we confront the non-Kaczmarz version of REGINN (5.34) (where the case $d = 1$ is considered) with its corresponding Kaczmarz version (5.35), where $d = \ell$ is used (i.e., each single current pattern in (5.26) results in a different equation used by K-REGINN). The goal is to compare the reconstruction errors and the computational effort necessary to perform an entire run of Algorithm 1. For the case $d = 1$, only the variant where the Jacobian is calculated is considered, because this is always the most advantageous situation (see the discussion at the end of the last subsection). Further, the computational
effort necessary to perform all the iterations using or not using the calculation of the
Jacobian matrix is compared for the Kaczmarz version. Different numbers of electrodes, and correspondingly of current patterns, are used: $L = \ell$ ($= d$ for the Kaczmarz version) with $L = 8$, $L = 16$, $L = 32$ and $L = 64$. Trying to be as fair as possible in the comparisons,
we have not used any weight-function in the Lp −norms of X. The constant τ has been
chosen as the smallest constant such that Algorithm 1 terminates in all cases11 . We found
the values τ = 1.3 for the non-Kaczmarz version and τ = 3 for the Kaczmarz version as
the optimal values. The conductivity function γ + is the one shown in the intersection of
¹⁰ The vector-noise $\Gamma^\delta_\gamma - \Gamma_\gamma \in \mathbb{R}^{L^2}$ is actually displayed in Figure 5.13 in matrix form, where the $j$-th column corresponds to the vector $U^{j,\delta} - U^j \in \mathbb{R}^L$ generated from the current pattern $I^j$ in (5.26).
¹¹ The constant $\eta$ in the TCC can be significantly different for the problems (5.34) and (5.35) if the number of experiments $\ell \in \mathbb{N}$ is large. Since the constant $\tau$ depends on $\eta$, see (4.9), it can be different for the Kaczmarz and non-Kaczmarz versions.
Figure 5.14: Conductivity function $\gamma^+$ from Figure 5.13, reconstructed in different Banach spaces with 1% of impulsive noise. In the first column $Y = \big(\mathbb{R}^{L^2}, \|\cdot\|_2\big)$ and in the second one $Y = \big(\mathbb{R}^{L^2}, \|\cdot\|_{1.01}\big)$. The first and second rows correspond to the spaces $X = V_2$ and $X = V_{1.01}$ respectively.
the first row and the first column of Figure 5.12. For the inner iteration, the dual gradient
DE method with C1 = C2 = 0.1 is used again. Further, all the tests have been made with
the same noise level δ = 0.1% and with p = 1.1. Since a large constant kmax represents an
extra advantage for the case where the Jacobian matrix is calculated in comparison with
the case where this matrix is not calculated, this constant must be carefully chosen. We fixed it at the value 50, which we consider a reasonable compromise. All the other parameters
are the same as in the last experiment. The results of this experiment are collected in the
Tables 5.2 and 5.3. In the first table, the number of outer iterations N (δ) as well as the
overall number of inner iterations kall and the reconstruction error eN (δ) are displayed for
each electrode configuration.
As already discussed at the end of last subsection, the regular version of REGINN requires
Table 5.2: Comparison between the regular version of REGINN and its Kaczmarz version with different numbers L of electrodes.

                   L          8       16       32        64
  N(δ)      Regular          39      149      109       561
            Kaczmarz        558    28174    33111    202499
  k_all     Regular        3700    26294    14788     18643
            Kaczmarz       2003   103227    93392    535148
  e_N(δ)    Regular       79.20    70.38    67.10     65.48
            Kaczmarz      79.08    69.16    65.78     64.32
Table 5.3: Overall number of active iterations (AI) and computational effort (CE, the overall number of variational problems solved) for each electrode configuration.

                   L          8       16       32        64
  AI                        331    14954    18093     77693
  CE    Regular             616     4752     6944     71744
        Kaczmarz (Jac)     3206   267438   612087   5174851
        Kaczmarz           4564   234628   219895   1272795
the solution of L variational problems in each outer iteration in order to calculate F (γn ).
Each inner iteration in turn, requires the solution of L additional problems for the evaluation
of the Jacobian matrix. Since the last outer iteration is the unique iteration which does
not perform an inner iteration, we conclude that the overall number of variational problems
which need to be solved in an entire run of Algorithm 1 is (2N − 1) L. For the Kaczmarz
version of REGINN, 1 solution of a variational problem is required in each outer iteration
in order to calculate F[n] (γn ). If the Jacobian is calculated, L additional solutions are
necessary in each active iteration¹², which results in a total of $N + L\cdot AI$, where the number $AI$ represents the overall number of active iterations. On the other hand, each vector calculated in the inner iteration of K-REGINN demands the solution of approximately 2 variational problems if the Jacobian matrix is not calculated, which results in a total of $N + 2k_{\mathrm{all}}$ required solutions.
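As a consistency check of these counts against the data of Tables 5.2 and 5.3: for $L = 16$ the regular version needs $(2N-1)L = (2\cdot 149 - 1)\cdot 16 = 4752$ solutions, while the Kaczmarz version needs $N + L\cdot AI = 28174 + 16\cdot 14954 = 267438$ solutions with the Jacobian and $N + 2k_{\mathrm{all}} = 28174 + 2\cdot 103227 = 234628$ without it; these are exactly the CE values listed in Table 5.3.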
Turning again to the results, we see that a small improvement in the reconstruction
error eN (δ) is shown in the Table 5.2 if the Kaczmarz version of REGINN is considered.
The price to be paid is however high, as Table 5.3 makes evident. This table compares
the computational effort necessary to achieve the results. It shows the overall number of
active iterations AI and the ”computational effort” CE for each electrode configuration13 .
Further, the regular version of REGINN (where d = 1 and the Jacobian matrix is calculated)
and the Kaczmarz versions (where d = L) with and without the calculation of the Jacobian
matrix are compared. It is clear that the regular version of REGINN always works less than
its Kaczmarz versions in all situations. Comparing only the Kaczmarz versions, we see that
the calculation of the Jacobian matrix demands less computational effort only for a small
number of electrodes. If a relative large number of electrodes is used, the computational
effort necessary to calculate the Jacobian matrix becomes higher than the direct calculation
of the solutions of (5.23) and (5.30) and the evaluation of this matrix is not worth any longer.
¹² An active iteration is an outer iteration where the discrepancy principle (3.9) is not satisfied and consequently the inner iteration needs to be performed.
¹³ The number CE actually represents the overall number of required solutions of variational problems used by Algorithm 1 until termination.
Chapter 6

Conclusions and Final Considerations
We consider the convergence analysis of Chapter 4 the main accomplishment of this work.
In this chapter, a Kaczmarz version of the inexact-Newton method REGINN [43] is analyzed
in Banach spaces and the proofs are carried out considering an inner iteration defined in
a relatively general way, which fits in with various methods. The result is a convergence
analysis of K-REGINN, valid at the same time for several different regularization methods
in the inner iteration. This analysis is a generalization to Banach spaces and to Kaczmarz
methods of ideas previously discussed in [36]. In order to properly develop this general
convergence analysis however, strong restrictions needed to be imposed. We think it is
possible to weaken these hypotheses, especially those required on the solution space X. For
instance, in the cases where the non-Kaczmarz version (d = 1 in (3.6)) is observed, the first
inequality in (4.30) becomes unnecessary to prove convergence in the noiseless situation.
But this inequality seems to be exactly the crucial point where the $s$-convexity of $X$ is most needed. In the remaining cases, the $s$-convexity is apparently avoidable and could be handled with techniques similar to those employed in [47]. Assuming the condition $d = 1$ therefore, the uniform smoothness and $s$-convexity of $X$ could be weakened to mere smoothness and uniform convexity. Of course, without the $s$-convexity, the verification of Assumption 3, page 58, for the methods investigated in Section 3.1 would be much more arduous.
In our opinion, the uniform smoothness of data space Y, required to show the stability
property of the dual gradient methods in Appendix A, could also be avoided using similar
arguments to those employed in [47]. However, in order to perform this modification,
the proof of the stability property should be made simultaneously with the proof of the
regularization property, which means that, at least in principle, Assumption 6, page 70,
could not be proven separately, and consequently the general convergence analysis could be
harmed.
The Tikhonov-Phillips method (3.58) has a peculiar iteration form, which is somewhat
different from the other methods of Section 3.1. This characteristic complicates the con-
vergence analysis of Chapter 4 substantially, forcing the recurrent separation into two cases, $\hat z_{n,k} = z_{n,k}$ and $\hat z_{n,k} = x_n$, see Assumptions 3 and 4, page 62. But looking on the bright side, no geometrical property of the space $Y$ is required for this method in order to prove Assumption 6, see the stability proof in Appendix A. This suggests that the space $L^1$ could
be used as data space in further numerical experiments, building up a more appropriate
framework to deal with impulsive noise for example. If Y = L1 however, although the
functional (3.60) remains strictly convex, it is not differentiable any longer and the pro-
cess of finding its minimizer, necessary to perform the inner iteration, becomes much more
challenging.
At the end of Subsection 3.1.2, the possibility of obtaining the optimal step-size of
the dual gradient methods is discussed and a lower bound for this step-size is shown.
Though an explicit formula for the optimal step-size is hard to achieve, sharper bounds could possibly be determined. An upper bound like $\lambda_{\mathrm{opt}} \lesssim \lambda_{DE}$, for instance, would be especially interesting and could facilitate the verification of Assumption 3, including the gradient method associated with the optimal step-size in our convergence analysis of Chapter 4.
A similar situation is observed with the dual gradient Steepest Descent method. Finding an explicit formula for $\lambda_{SD}$ in (3.44) involves the differentiation of the duality mapping, which makes the determination of this step-size difficult. To overcome this technical obstacle, we have replaced this method with a similar version, the Modified Steepest Descent method (3.45), which already has an explicit step-size satisfying the desired inequality
λM SD ≤ λDE , necessary to include it in our convergence analysis. Although an obvious
formula for λSD is not available, it is not unconditionally necessary to prove Assumption
3 since only the inequality λSD ≤ λDE needs to be verified. Once proven, Assumption 3
would imply the convergence of Steepest Descent method. The implicit defined step-size
λSD could be numerically approximated with the help of an optimization algorithm in the
real-line.
In addition to several classical regularization techniques, which have been adapted from
Hilbert to more general Banach spaces in this work, a few novel approaches are presented.
Among the methods introduced for the first time in this thesis, we highlight the
mixed gradient-Tikhonov methods (Subsection 3.1.4) and the dual gradient Decreasing
Error method ((3.29) and (3.39)). Both algorithms have proven useful for reconstructing
stable solutions of ill-posed problems, as can be seen in Figures 5.12 and
5.14, for example. The additional (regularization) term of the mixed gradient-Tikhonov
methods confers further stability on regular gradient methods and results in a more steady
inner iteration. The Decreasing Error method, in turn, has the advantage of being
the gradient method with the fastest decreasing error when a linear problem in a Hilbert
space is considered; the classical computation behind this property is recalled below. This behavior
seems to be transmitted, to some extent, to nonlinear problems in Banach spaces, as Figure 5.3 shows.
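To make this claim concrete, we recall the classical Hilbert space computation on which it rests; this is the standard minimal error step-size (cf. [41]) and serves only as an illustration, since the Decreasing Error method itself is defined in (3.29) and (3.39). For a linear equation $Ax = y$ with exact data $y = A x^\dagger$ and the gradient iteration $x_{k+1} = x_k - \lambda A^*(A x_k - y)$, writing $r_k := A x_k - y$ one finds
\[
  \|x_{k+1} - x^\dagger\|^2 = \|x_k - x^\dagger\|^2 - 2\lambda \|r_k\|^2 + \lambda^2 \|A^* r_k\|^2,
\]
because $\langle A^* r_k, x_k - x^\dagger \rangle = \langle r_k, A(x_k - x^\dagger) \rangle = \|r_k\|^2$. Minimizing the right-hand side over $\lambda$ gives the error-minimizing step-size $\lambda_k = \|r_k\|^2 / \|A^* r_k\|^2$ of the minimal error method.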
The nonlinearity of the duality mapping is possibly the biggest complication in the convergence
analysis of K-REGINN in general Banach spaces. The extra effort needed to prove
the theorems in these more complicated spaces is, however, counterbalanced by the improvements
provided by the use of more convenient norms. The use of Banach spaces can
therefore pay off in achieving better reconstructions for inverse problems in some
specific situations, especially if sparsity constraints on the searched-for solution or on the
data noise are present. This enhancement of quality becomes evident in the numerical
experiments performed in Chapter 5; however, many questions concerning the mathematical
modelling of the EIT problem with Lp-norms remain open. In its original (infinite-dimensional)
version, differentiability or even continuity of the forward operator with respect to the
Lebesgue spaces Lp is in principle not guaranteed for any p ∈ (1, ∞), and legitimizing the use
of these spaces does not seem to be a trivial issue. For a proper adaptation, further research is
therefore needed.
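For illustration of this nonlinearity (a standard fact, see e.g. [48], and not specific to the present setting): on the Lebesgue space $L^p$ with $1 < p < \infty$, the duality mapping with gauge function $t \mapsto t^{p-1}$ acts pointwise as
\[
  J_p(u) = |u|^{p-1} \operatorname{sign}(u),
\]
which is a linear operator only in the Hilbert space case $p = 2$; for $p \neq 2$ every dual step therefore depends nonlinearly on the current iterate.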
A superficial examination of the computational implementations of Subsection 5.2.2,
and particularly of Tables 5.2 and 5.3, could erroneously lead to the conclusion that the
Kaczmarz version of REGINN is not competitive. These results, however, are in large part
due to the manipulation described in (5.38), which considerably reduces the computational
effort necessary to calculate the Jacobian matrix, especially for the non-Kaczmarz version
of REGINN. Without this strategy, the regular REGINN would be much more "expensive"
and its Kaczmarz version could become more advantageous. Thus, to give a more accurate
answer about the real utility of the Kaczmarz version of REGINN, supplementary numerical
experiments are necessary.
Appendix A

Stability Property of the Regularizing Sequences
This appendix provides, for different methods, the proof of the stability property given in
Assumption 6, page 70. For ease of presentation, we assume all the hypotheses
of Theorem 47 except for Assumption 6 itself. For some methods, additional hypotheses are
necessary to complete the proof; they will be required at the specific points where they are needed.
Let $(\delta_i)_{i\in\mathbb{N}} \subset \mathbb{R}$ be a zero-sequence and suppose that $x_n^{\delta_i} \to \xi \in X_n$ as $i \to \infty$. We
proceed by induction: for $k = 0$, $z_{n,0}^{\delta_i} = x_n^{\delta_i} \to \xi = \sigma_{n,0}(\xi)$ as $i \to \infty$.
Assume now that $\lim_{i\to\infty} z_{n,k}^{\delta_i} = \sigma_{n,k}(\xi)$ for $k < k_{\mathrm{REG}}(\xi)$. Our task is now to prove that
$z_{n,k+1}^{\delta_i} \to \sigma_{n,k+1}$ as $i \to \infty$. By the induction hypothesis and the continuity of $F_{[n]}$ and $F'_{[n]}$, the vector $A_n^{\delta_i} s_{n,k}^{\delta_i} - b_n^{\delta_i}$ converges to
\[
  v_{n,k}^{\xi} = F'_{[n]}(\xi)\big(\sigma_{n,k}(\xi) - \xi\big) - \big(y_{[n]} - F_{[n]}(\xi)\big)
\]
as $i \to \infty$. Similarly, $\lambda_{n,k}^{\delta_i} \to \lambda_{n,k}^{\xi}$ as $i \to \infty$ (see (3.39), (3.43) and (3.45)). As the
spaces $Y_j$ are uniformly smooth, the selection $j_r \colon Y_j \to Y_j^*$ is unique and continuous,
and since the mappings $J_p$, $F_j$ and $F'_j$ are continuous too,
\[
  J_p\big(z_{n,k+1}^{\delta_i}\big) = J_p\big(z_{n,k}^{\delta_i}\big) - \lambda_{n,k}^{\delta_i}\, F'_{[n]}\big(x_n^{\delta_i}\big)^* j_r\big(A_n^{\delta_i} s_{n,k}^{\delta_i} - b_n^{\delta_i}\big) \tag{A.1}
\]
converges to
\[
  J_p(\sigma_{n,k+1}) = J_p(\sigma_{n,k}) - \lambda_{n,k}^{\xi}\, F'_{[n]}(\xi)^* j_r\big(v_{n,k}^{\xi}\big) \tag{A.2}
\]
as $i \to \infty$. This means that $\lim_{i\to\infty} z_{n,k+1}^{\delta_i} = \sigma_{n,k+1}$ because the duality mapping $J_p^{-1} = J_{p^*}^{*}$ is also continuous.
We consider now the Tikhonov-Phillips method (3.58). Define
\[
  T_{n,k}^{\delta_i}(z) := \frac{1}{r}\,\big\| F'_{[n]}\big(x_n^{\delta_i}\big)\big(z - x_n^{\delta_i}\big) - b_n^{\delta_i} \big\|^r + \alpha_n\, \Delta_p\big(z, z_{n,k}^{\delta_i}\big)
\]
and
\[
  W_{n,k}(z) := \frac{1}{r}\,\big\| F'_{[n]}(\xi)(z - \xi) - \widetilde{b}_n \big\|^r + \alpha_n\, \Delta_p(z, \sigma_{n,k}),
\]
where $\widetilde{b}_n := y_{[n]} - F_{[n]}(\xi)$. As the family $\big(z_{n,k+1}^{\delta_i}\big)_{\delta_i > 0}$ is uniformly bounded (see
(4.16)) and $X$ is reflexive, there exists, by picking a subsequence if necessary, some
$z \in X$ such that $z_{n,k+1}^{\delta_i} \rightharpoonup z$ as $i \to \infty$. We first prove that $z = \sigma_{n,k+1}$ and later that
$z_{n,k+1}^{\delta_i} \to z$ as $i \to \infty$. For all $g \in Y_{[n]}^*$,
\[
  \big\langle g, F'_{[n]}\big(x_n^{\delta_i}\big) s_{n,k+1}^{\delta_i} \big\rangle
  = \big\langle g, \big(F'_{[n]}\big(x_n^{\delta_i}\big) - F'_{[n]}(\xi)\big) s_{n,k+1}^{\delta_i} \big\rangle
  + \big\langle g, F'_{[n]}(\xi)\, s_{n,k+1}^{\delta_i} \big\rangle .
\]
But as $s_{n,k+1}^{\delta_i} = z_{n,k+1}^{\delta_i} - x_n^{\delta_i} \rightharpoonup z - \xi =: s$ as $i \to \infty$ and $F'_{[n]}(\xi)^* g \in X^*$,
\[
  \big\langle g, F'_{[n]}(\xi)\, s_{n,k+1}^{\delta_i} \big\rangle
  = \big\langle F'_{[n]}(\xi)^* g, s_{n,k+1}^{\delta_i} \big\rangle
  \to \big\langle F'_{[n]}(\xi)^* g, s \big\rangle
  = \big\langle g, F'_{[n]}(\xi)\, s \big\rangle .
\]
Moreover, the first term on the right-hand side converges to zero because $F'_{[n]}\big(x_n^{\delta_i}\big) \to F'_{[n]}(\xi)$ and the family $\big(s_{n,k+1}^{\delta_i}\big)$ is uniformly bounded (see (4.16)). Then,
\[
  \big\langle g, F'_{[n]}\big(x_n^{\delta_i}\big) s_{n,k+1}^{\delta_i} \big\rangle \to \big\langle g, F'_{[n]}(\xi)\, s \big\rangle \tag{A.3}
\]
and as $g \in Y_{[n]}^*$ is arbitrary,
\[
  F'_{[n]}\big(x_n^{\delta_i}\big) s_{n,k+1}^{\delta_i} \rightharpoonup F'_{[n]}(\xi)\, s .
\]
Since, in addition, $b_n^{\delta_i} \to \widetilde{b}_n$, the weak lower semicontinuity of the norm yields
\[
  \big\| \widetilde{b}_n - F'_{[n]}(\xi)\, s \big\| \le \liminf_{i\to\infty} \big\| b_n^{\delta_i} - F'_{[n]}\big(x_n^{\delta_i}\big) s_{n,k+1}^{\delta_i} \big\| . \tag{A.4}
\]
From $x_n^{\delta_i} \to \xi$, (A.4), (A.5) and due to the minimality property of $z_{n,k+1}^{\delta_i}$,
\[
  W_{n,k}(z) \le \liminf_{i\to\infty} T_{n,k}^{\delta_i}\big(z_{n,k+1}^{\delta_i}\big) \le \liminf_{i\to\infty} T_{n,k}^{\delta_i}(\sigma_{n,k+1})
  = \lim_{i\to\infty} T_{n,k}^{\delta_i}(\sigma_{n,k+1}) = W_{n,k}(\sigma_{n,k+1}) .
\]
Using minimality and uniqueness of $\sigma_{n,k+1}$, we conclude that $\sigma_{n,k+1} = z$ and then
$z_{n,k+1}^{\delta_i} \rightharpoonup \sigma_{n,k+1}$. Accordingly, $s_{n,k+1}^{\delta_i} \rightharpoonup \sigma_{n,k+1} - \xi$, which implies that $s = \sigma_{n,k+1} - \xi$.
We prove now that
\[
  \Delta_p\big(z_{n,k+1}^{\delta_i}, z_{n,k}^{\delta_i}\big) \to \Delta_p(\sigma_{n,k+1}, \sigma_{n,k}) \quad \text{as } i \to \infty. \tag{A.6}
\]
Define
\[
  a_i := \Delta_p\big(z_{n,k+1}^{\delta_i}, z_{n,k}^{\delta_i}\big), \quad a := \limsup_{i\to\infty} a_i, \quad c := \Delta_p(\sigma_{n,k+1}, \sigma_{n,k}),
\]
\[
  \widetilde{r}_i := \frac{1}{r}\,\big\| b_n^{\delta_i} - F'_{[n]}\big(x_n^{\delta_i}\big) s_{n,k+1}^{\delta_i} \big\|^r, \quad \text{and} \quad \widetilde{r} := \liminf_{i\to\infty} \widetilde{r}_i .
\]
In view of (A.5), it is enough to prove that $a \le c$. Suppose that $a > c$. From the
definition of $\limsup$ there exists, for all $M \in \mathbb{N}$, some index $i > M$ such that
\[
  a_i > a - \frac{a - c}{4}. \tag{A.7}
\]
From the definition of $\liminf$, there exists $N_1 \in \mathbb{N}$ such that
\[
  \widetilde{r}_i \ge \widetilde{r} - \frac{\alpha_n (a - c)}{4} \tag{A.8}
\]
for all $i \ge N_1$. As above, $\lim_{i\to\infty} T_{n,k}^{\delta_i}(\sigma_{n,k+1}) = W_{n,k}(\sigma_{n,k+1})$ and then there is an $N_2 \in \mathbb{N}$
such that
\[
  T_{n,k}^{\delta_i}(\sigma_{n,k+1}) < W_{n,k}(\sigma_{n,k+1}) + \frac{\alpha_n (a - c)}{2} \tag{A.9}
\]
for all $i \ge N_2$. Using (A.4) and setting $M = N_1 \vee N_2$, there exists some index $i > M$
such that
\[
\begin{aligned}
  W_{n,k}(\sigma_{n,k+1}) &\le \widetilde{r} + \alpha_n c = \widetilde{r} + \alpha_n a - \alpha_n (a - c) \\
  &\le \widetilde{r}_i + \frac{\alpha_n (a - c)}{4} + \alpha_n a_i + \frac{\alpha_n (a - c)}{4} - \alpha_n (a - c) \\
  &= \widetilde{r}_i + \alpha_n a_i - \frac{\alpha_n (a - c)}{2}
   = T_{n,k}^{\delta_i}\big(z_{n,k+1}^{\delta_i}\big) - \frac{\alpha_n (a - c)}{2} \\
  &\le T_{n,k}^{\delta_i}(\sigma_{n,k+1}) - \frac{\alpha_n (a - c)}{2},
\end{aligned}
\]
where the second inequality comes from (A.8) and (A.7) and the last one from
the minimality of $z_{n,k+1}^{\delta_i}$. From (A.9) we obtain the contradiction $W_{n,k}(\sigma_{n,k+1}) <
W_{n,k}(\sigma_{n,k+1})$. Thus, $a \le c$ and (A.6) holds. From the definition of the Bregman
distance $\Delta_p$ we have $\big\| z_{n,k+1}^{\delta_i} \big\| \to \| \sigma_{n,k+1} \|$. As $z_{n,k+1}^{\delta_i} \rightharpoonup \sigma_{n,k+1}$, we conclude that
$z_{n,k+1}^{\delta_i} \to \sigma_{n,k+1}$ as $i \to \infty$, because $X$ is uniformly convex. So far, we have shown
that each positive zero-sequence $(\delta_i)_{i\in\mathbb{N}}$ contains a subsequence $(\delta_{i_j})_{j\in\mathbb{N}}$ such that
$z_{n,k+1}^{\delta_{i_j}} \to \sigma_{n,k+1}$ as $j \to \infty$, which is enough to prove the statement.
Analogously, the corresponding iteration converges to
\[
  J_p(\sigma_{n,k+1}) = J_p(\sigma_{n,k}) - \lambda_{n,k}^{\xi} \Big[ F'_{[n]}(\xi)^* j_r\big(v_{n,k}^{\xi}\big) + \gamma_n^{\xi}\, J_p\big(\sigma_{n,k} - \xi - x_n^{\xi}\big) \Big]
\]
as $i \to \infty$. Since $J_p^{-1} = J_{p^*}^{*}$ is a continuous function, the vector $z_{n,k+1}^{\delta_i}$ converges to
$\sigma_{n,k+1}$ as $i \to \infty$ and the proof is complete.
Bibliography
[2] Kari Astala and Lassi Päivärinta. Calderón’s inverse conductivity problem in the plane.
Ann. of Math. (2), 163(1):265–299, 2006.
[9] Charles Chidume. Geometric properties of Banach spaces and nonlinear iterations,
volume 1965 of Lecture Notes in Mathematics. Springer-Verlag London, Ltd., London,
2009.
[10] Ioana Cioranescu. Geometry of Banach spaces, duality mappings and nonlinear prob-
lems, volume 62 of Mathematics and its Applications. Kluwer Academic Publishers
Group, Dordrecht, 1990.
[11] Christian Clason and Bangti Jin. A semismooth Newton method for nonlinear param-
eter identification problems with impulsive noise. SIAM J. Imaging Sci., 5(2):505–538,
2012.
[12] Ingrid Daubechies, Michel Defrise, and Christine De Mol. An iterative thresholding
algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl.
Math., 57(11):1413–1457, 2004.
[13] Ron S. Dembo, Stanley C. Eisenstat, and Trond Steihaug. Inexact Newton methods.
SIAM J. Numer. Anal., 19(2):400–408, 1982.
[14] Françoise Demengel and Gilbert Demengel. Functional spaces for the theory of elliptic
partial differential equations. Universitext. Springer, London; EDP Sciences, Les Ulis,
2012. Translated from the 2007 French original by Reinie Erné.
[15] Heinz W. Engl, Martin Hanke, and Andreas Neubauer. Regularization of inverse prob-
lems, volume 375 of Mathematics and its Applications. Kluwer Academic Publishers
Group, Dordrecht, 1996.
[16] Heinz W. Engl, Karl Kunisch, and Andreas Neubauer. Convergence rates for Tikhonov
regularisation of nonlinear ill-posed problems. Inverse Problems, 5(4):523–540, 1989.
[17] V. M. Fridman. On the convergence of methods of steepest descent type. Uspehi Mat.
Nauk, 17(3 (105)):201–204, 1962.
[19] Jacques Hadamard. Le problème de Cauchy et les équations aux dérivées partielles
linéaires hyperboliques. Hermann, Paris, 1932.
[20] Markus Haltmeier, Antonio Leitão, and Otmar Scherzer. Kaczmarz methods for regu-
larizing nonlinear ill-posed equations. I. Convergence analysis. Inverse Probl. Imaging,
1(2):289–298, 2007.
[22] Martin Hanke. Regularizing properties of a truncated Newton-CG algorithm for non-
linear inverse problems. Numer. Funct. Anal. Optim., 18(9-10):971–993, 1997.
[23] Bangti Jin, Taufiquar Khan, and Peter Maass. A reconstruction algorithm for electrical
impedance tomography based on sparsity regularization. Internat. J. Numer. Methods
Engrg., 89(3):337–353, 2012.
[24] Qinian Jin. Inexact Newton-Landweber iteration for solving nonlinear inverse problems
in Banach spaces. Inverse Problems, 28(6):065002, 15, 2012.
[25] Qinian Jin. On the order optimality of the regularization via inexact Newton iterations.
Numer. Math., 121(2):237–260, 2012.
[26] Qinian Jin and Linda Stals. Nonstationary iterated Tikhonov regularization for ill-
posed problems in Banach spaces. Inverse Problems, 28(10):104011, 15, 2012.
[28] Jari P. Kaipio, Ville Kolehmainen, Erkki Somersalo, and Marko Vauhkonen. Statistical
inversion and Monte Carlo sampling methods in electrical impedance tomography.
Inverse Problems, 16(5):1487–1522, 2000.
[29] Barbara Kaltenbacher, Andreas Neubauer, and Otmar Scherzer. Iterative regulariza-
tion methods for nonlinear ill-posed problems, volume 6 of Radon Series on Computa-
tional and Applied Mathematics. Walter de Gruyter GmbH & Co. KG, Berlin, 2008.
[30] Andreas Kirsch. An introduction to the mathematical theory of inverse problems, vol-
ume 120 of Applied Mathematical Sciences. Springer, New York, second edition, 2011.
[32] Erwin Kreyszig. Introductory functional analysis with applications. Wiley Classics
Library. John Wiley & Sons, Inc., New York, 1989.
[33] L. Landweber. An iteration formula for Fredholm integral equations of the first kind.
Amer. J. Math., 73:615–624, 1951.
[34] Armin Lechleiter and Andreas Rieder. Newton regularizations for impedance tomog-
raphy: a numerical study. Inverse Problems, 22(6):1967–1987, 2006.
[35] Armin Lechleiter and Andreas Rieder. Newton regularizations for impedance tomog-
raphy: convergence by local injectivity. Inverse Problems, 24(6):065009, 18, 2008.
[36] Armin Lechleiter and Andreas Rieder. Towards a general convergence theory for inex-
act Newton regularizations. Numer. Math., 114(3):521–548, 2010.
[38] Joram Lindenstrauss and Lior Tzafriri. Classical Banach spaces. II, volume 97 of
Ergebnisse der Mathematik und ihrer Grenzgebiete [Results in Mathematics and Related
Areas]. Springer-Verlag, Berlin-New York, 1979. Function spaces.
[39] Fábio Margotti and Andreas Rieder. An inexact Newton regularization in Banach
spaces based on the nonstationary iterated Tikhonov method. Journal of Inverse and
Ill-Posed Problems, ahead of print, 2014.
[40] Fábio Margotti, Andreas Rieder, and Antonio Leitão. A Kaczmarz version of the
REGINN-Landweber iteration for ill-posed problems in Banach spaces. SIAM J. Numer.
Anal., 52(3):1439–1465, 2014.
[41] A. Neubauer and O. Scherzer. A convergence rate result for a steepest descent method
and a minimal error method for the solution of nonlinear ill-posed problems. Z. Anal.
Anwendungen, 14(2):369–377, 1995.
[42] Nick Polydorides and William R. B. Lionheart. A Matlab toolkit for three-dimensional
electrical impedance tomography: a contribution to the Electrical Impedance and Dif-
fuse Optical Reconstruction Software project. Measurement Science and Technology,
13(12):1871–1873, 2002.
[43] Andreas Rieder. On the regularization of nonlinear ill-posed problems via inexact
Newton iterations. Inverse Problems, 15(1):309–327, 1999.
[45] Andreas Rieder. Keine Probleme mit inversen Problemen. Friedr. Vieweg & Sohn,
Braunschweig, 2003. Eine Einführung in ihre stabile Lösung. [An introduction to their
stable solution].
[46] Andreas Rieder. Inexact Newton regularization using conjugate gradients as inner
iteration. SIAM J. Numer. Anal., 43(2):604–622 (electronic), 2005.
[47] F. Schöpfer, A. K. Louis, and T. Schuster. Nonlinear iterative methods for linear
ill-posed problems in Banach spaces. Inverse Problems, 22(1):311–329, 2006.
[48] Thomas Schuster, Barbara Kaltenbacher, Bernd Hofmann, and Kamil S. Kazimierski.
Regularization methods in Banach spaces, volume 10 of Radon Series on Computational
and Applied Mathematics. Walter de Gruyter GmbH & Co. KG, Berlin, 2012.
[49] R. E. Showalter. Monotone operators in Banach space and nonlinear partial differential
equations, volume 49 of Mathematical Surveys and Monographs. American Mathemat-
ical Society, Providence, RI, 1997.
[50] Erkki Somersalo, Margaret Cheney, and David Isaacson. Existence and uniqueness
for electrode models for electric current computed tomography. SIAM J. Appl. Math.,
52(4):1023–1040, 1992.
[51] A. N. Tikhonov. On the solution of incorrectly put problems and the regularisation
method. In Outlines Joint Sympos. Partial Differential Equations (Novosibirsk, 1963),
pages 261–265. Acad. Sci. USSR Siberian Branch, Moscow, 1963.
[52] Robert Winkler and Andreas Rieder. Model-aware Newton-type inversion scheme for
electrical impedance tomography. Inverse Problems, 31(4):045009, 2015.
[53] Zong Ben Xu and G. F. Roach. Characteristic inequalities of uniformly convex and
uniformly smooth Banach spaces. J. Math. Anal. Appl., 157(1):189–210, 1991.