
J Optim Theory Appl

DOI 10.1007/s10957-014-0591-x

Strict Fejér Monotonicity by Superiorization of Feasibility-Seeking Projection Methods

Yair Censor · Alexander J. Zaslavski

Received: 22 February 2013 / Accepted: 27 May 2014


© Springer Science+Business Media New York 2014

Abstract We consider the superiorization methodology, which can be thought of as lying between feasibility-seeking and constrained minimization. It is not quite trying to solve the full-fledged constrained minimization problem; rather, the task is to find a feasible point which is superior (with respect to the objective function value) to one returned by a feasibility-seeking only algorithm. Our main result reveals new information about the mathematical behavior of the superiorization methodology. We deal with a constrained minimization problem with a feasible region, which is the intersection of finitely many closed convex constraint sets, and use the dynamic string-averaging projection method, with variable strings and variable weights, as a feasibility-seeking algorithm. We show that any sequence, generated by the superiorized version of a dynamic string-averaging projection algorithm, not only converges to a feasible point but, additionally, either its limit point solves the constrained minimization problem or the sequence is strictly Fejér monotone with respect to a subset of the solution set of the original problem.

Keywords Bounded perturbation resilience · Constrained minimization · Convex feasibility problem · Dynamic string-averaging projections · Strict Fejér monotonicity · Subgradients · Superiorization methodology · Superiorized version of an algorithm

Communicated by Jonathan Michael Borwein.

Y. Censor (B)
Department of Mathematics, University of Haifa, Mt. Carmel, Haifa 3498838, Israel
e-mail: [email protected]

A. J. Zaslavski
Department of Mathematics, The Technion – Israel Institute of Technology, Technion City,
Haifa 32000, Israel
e-mail: [email protected]


Mathematics Subject Classification 90C25 · 90C30 · 90C45 · 65K10

1 Introduction

1.1 What is Superiorization

The recently developed superiorization methodology (SM) lies between feasibility-seeking and constrained minimization (CM). It is not quite trying to solve the full-fledged CM problem; rather, the task is to find a feasible point of the CM problem that is superior, not necessarily optimal, with respect to the objective function value, to one returned by a feasibility-seeking only algorithm. Therefore, the SM can be beneficial for CM problems for which an exact algorithm has not yet been discovered, or when existing exact optimization algorithms are very time-consuming or require too much computer space for realistic large problems to be run on commonplace computers. In such cases, efficient feasibility-seeking iterative projection methods that provide non-optimal but constraints-compatible solutions can be turned by the SM into efficient algorithms for superiorization that will be practically useful from the point of view of the underlying objective function.
For the SM to be useful for a CM problem, we need to have an efficient feasibility-
seeking algorithm that is in some well-defined sense perturbation resilient. Then, the
SM uses those permitted perturbations in order to steer the superiorized version of
the original feasibility-seeking algorithm toward points with lesser, not necessarily
minimal, objective function values. The advantage is that, in this manner, one attacks the CM problem not with an optimization algorithm but with a superiorized feasibility-seeking algorithm. The latter methods are in many cases very efficient, see, e.g., [1], and can, therefore, save time and computing resources as compared with exact optimization algorithms.
Additionally, in many mathematical formulations of significant real-world tech-
nological or physical problems, the objective function is exogenous to the modeling
process which defines the constraints. In such cases, the “faith” of the modeler in the
usefulness of an objective function for the application at hand is limited and, as a consequence, it is not worthwhile to invest too many resources in trying to reach an exact constrained minimum point. These notions are explained rigorously in the sections below.

1.2 Contribution

Our main result, in Theorem 4.1 below, establishes a mathematical basis for the behav-
ior of the SM when dealing with a CM problem with a feasible region that is the
intersection of finitely many closed convex constraint sets; see Case 2.1 in Sect. 2
below. We use the dynamic string-averaging projection (DSAP) method, with vari-
able strings and variable weights, as a feasibility-seeking algorithm, which is indeed
bounded perturbations resilient. The bounded perturbations resilience of the DSAP
method has been proved in [2], and the practical behavior of the SM was observed
in numerous recent works, see references mentioned below. Our contribution here is


the mathematical guarantee of the convergence behavior of the superiorized version of the DSAP algorithm.
Theorem 4.1 below says that any sequence, generated by the superiorized version of a DSAP algorithm, given in Algorithm 4.1 below, will not only converge to a feasible point of the underlying CM problem, a fact which is due to the bounded perturbations resilience of the DSAP method, but, additionally, either its limit point will solve the CM problem (1) or the sequence is strictly Fejér monotone with respect to, i.e., gets strictly closer to the points of, a subset of the solution set of the CM problem, according to (29) below.

1.3 Related Work

This paper is a sequel to a series of recent publications on the SM [2–13], culminating in [14]. The latter contains a detailed description of the SM, its motivation, and an up-to-date review of SM-related previous work, including a reference to [4] in which it all started, although without yet using the terms superiorization and bounded perturbation resilience. [4] was the first to propose this approach and implement it in practice, but its roots go back to [15,16], where it was shown that if the iterates of a nonexpansive operator converge for any initial point, then its inexact iterates with summable errors also converge. More details on related work appear in [14, Sect. 3] and in [17, Sect. 1].

1.4 Paper Structure

The paper is laid out as follows. Section 2 presents the SM. Preliminaries needed for
our study are presented in Sect. 3, and the superiorized version of the DSAP algorithm
is given in Sect. 4. The proof of our main result that gives a mathematical basis for
the SM is presented in Sect. 5. Conclusions are given in Sect. 6.

2 The Superiorization Methodology

Consider some mathematically formulated problem, of any kind or sort, and denote it by T. The set of its solutions, called the solution set of T, is denoted by SOL(T). The superiorization methodology (SM) of [5,10,14] is intended for constrained minimization (CM) problems of the following form:

minimize {φ(x) | x ∈ T}, (1)

where φ : R^J → R is an objective function and T ⊆ R^J is the solution set T = SOL(T) of a problem T. In [4,5], SOL(T) was assumed to be nonempty, and in later works [10,14] this assumption was removed. Here, however, we adhere to T = SOL(T) ≠ ∅ throughout this paper.
To proceed with the SM, the problem T can be just about any mathematical or
mathematically formulated problem for which a “good” iterative algorithm for its
solution exists which is bounded perturbation resilient, as explained below. Two widely


used cases of such underlying problems and their set T come to mind, although the
general approach is by no means restricted to those.

Case 2.1 The set T is the solution set of a convex feasibility problem (CFP) of the form: find a vector x* ∈ T := ∩_{i=1}^{I} C_i, where the sets C_i ⊆ R^J are closed and convex subsets of the Euclidean space R^J; see, e.g., [18–20] or [21, Chap. 5] for results and references on the broad topic of CFPs, or consult [1,3,22–28]. In such a case, we deal in (1) with a standard CM problem. This is the case analyzed in this paper.

Case 2.2 The set T is the solution set of another CM problem which serves as the problem T, such as

minimize {J(x) | x ∈ Ω}, (2)

in which case we look at

T := {x* ∈ Ω | J(x*) ≤ J(x) for all x ∈ Ω}, (3)

assuming that T is nonempty. This case has been studied in [8], [7], and [29].

In either case, or any other case for the set T, the SM strives not to solve (1); rather, the task is to find a point in T which is superior, i.e., has a lower, but not necessarily minimal, value of the objective function φ, compared to one returned by an algorithm that solves the original problem T alone.
This is done in the SM by first investigating the bounded perturbation resilience of an available iterative algorithm designed to solve the original problem T and then proactively using such permitted perturbations in order to steer the iterates of such an algorithm toward lower values of the objective function φ while not losing the convergence to a point in T. See [5,10,14] for details. A review of superiorization-related previous work appears in [14, Sect. 3].

3 Preliminaries

Let X be a Hilbert space equipped with an inner product ⟨·, ·⟩, which induces the complete norm ||·||. For each x ∈ X and each nonempty set E ⊆ X, define

d(x, E) := inf{||x − y|| | y ∈ E}, (4)

and for each x ∈ X and each r > 0, define the closed ball around x with radius r by

B(x, r) := {y ∈ X | ||x − y|| ≤ r}. (5)

The following proposition and corollary are well known, see, e.g., [25] or [23].


Proposition 3.1 Let D be a nonempty, closed, and convex subset of X. Then, for each x ∈ X, there is a unique point P_D(x) ∈ D (called the projection of x onto D) satisfying

||x − P_D(x)|| = inf{||x − y|| | y ∈ D}. (6)

Moreover,

||P_D(x) − P_D(y)|| ≤ ||x − y|| for all x, y ∈ X, (7)

and for each x ∈ X and each z ∈ D,

⟨z − P_D(x), x − P_D(x)⟩ ≤ 0. (8)

Corollary 3.1 Assume that D is a nonempty, closed, and convex subset of X. Then, for each x ∈ X and each z ∈ D,

||z − P_D(x)||² + ||x − P_D(x)||² ≤ ||z − x||². (9)
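
To make the projection operator concrete, the following minimal Python sketch (our own illustration, not part of the paper) computes P_D for two standard choices of D in X = R^J, a closed ball and a halfspace, and numerically checks the nonexpansiveness property (7); all function names are our own.

```python
import numpy as np

def project_ball(x, center, r):
    """P_D for D = B(center, r), the closed ball of (5): radial shrinkage."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= r else center + (r / n) * d

def project_halfspace(x, a, b):
    """P_D for D = {y | <a, y> <= b}: move along a by the violation amount."""
    viol = np.dot(a, x) - b
    return x if viol <= 0 else x - (viol / np.dot(a, a)) * a

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
a = rng.normal(size=5)
px, py = project_halfspace(x, a, 0.3), project_halfspace(y, a, 0.3)
# Nonexpansiveness (7): ||P_D(x) - P_D(y)|| <= ||x - y||
assert np.linalg.norm(px - py) <= np.linalg.norm(x - y) + 1e-12
```

Projections onto general closed convex sets rarely have closed forms; these two do, which is one reason such sets are typical building blocks for the constraint sets C_i below.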

Suppose that C_1, C_2, . . . , C_m are nonempty, closed, and convex subsets of X, where m is a natural number. Set

C := ∩_{i=1}^{m} C_i, (10)

and assume throughout that C ≠ ∅. For i = 1, 2, . . . , m, denote P_i := P_{C_i}. By an index vector, we mean a vector t = (t_1, t_2, . . . , t_q) such that, for all i = 1, 2, . . . , q, t_i ∈ {1, 2, . . . , m}, and whose length is ℓ(t) = q. Define the product of the individual projections onto the sets whose indices appear in the index vector t by

P[t] := P_{t_q} · · · P_{t_1}, (11)

and call it a string operator.


A finite set Ω of index vectors is called fit iff for each i ∈ {1, 2, . . . , m}, there exists a vector t ∈ Ω such that t_s = i for some s ∈ {1, 2, . . . , q}. For each index vector t, the string operator is nonexpansive, since the individual projections are, i.e.,

||P[t](x) − P[t](y)|| ≤ ||x − y|| for all x, y ∈ X, (12)

and also

P[t](x) = x for all x ∈ C. (13)

Denote by M the collection of all pairs (Ω, w), where Ω is a finite fit set of index vectors and w : Ω → ]0, ∞[ is such that

Σ_{t∈Ω} w(t) = 1. (14)


A pair (Ω, w) ∈ M and the function w were called in [4] an amalgamator and a fit weight function, respectively. For any (Ω, w) ∈ M, define the convex combination of the end-points of all strings defined by members of Ω by

P_{Ω,w}(x) := Σ_{t∈Ω} w(t) P[t](x), x ∈ X. (15)

It is easy to see that

||P_{Ω,w}(x) − P_{Ω,w}(y)|| ≤ ||x − y|| for all x, y ∈ X, (16)

and

P_{Ω,w}(x) = x for all x ∈ C. (17)

We will make use of the following condition, known in the literature as bounded
regularity, see [18], and assume throughout that it holds.

Condition 3.1 For each ε > 0 and each M > 0, there exists a positive δ = δ(ε, M) such that for each x ∈ B(0, M) satisfying d(x, C_i) ≤ δ for all i = 1, 2, . . . , m, the inequality d(x, C) ≤ ε holds.

For the proof of the next proposition, see e.g., [2, Proposition 5].

Proposition 3.2 If the space X is finite dimensional, then Condition 3.1 holds.

We choose an arbitrary fixed number Δ ∈ ]0, 1/m[ and an integer q̄ ≥ m, and denote by M_* ≡ M_*(Δ, q̄) the set of all (Ω, w) ∈ M such that the lengths of the strings are bounded by q̄ and the weights are all bounded away from zero by Δ, namely

M_* := {(Ω, w) ∈ M | ℓ(t) ≤ q̄ and w(t) ≥ Δ for all t ∈ Ω}. (18)

The dynamic string-averaging projection (DSAP) method with variable strings and
variable weights is the following algorithm.

Algorithm 3.1 The DSAP method with variable strings and variable weights
Initialization: select an arbitrary x^0 ∈ X.
Iterative step: given the current iteration vector x^k, pick a pair (Ω_k, w_k) ∈ M_* and calculate the next iteration vector x^{k+1} by

x^{k+1} = P_{Ω_k, w_k}(x^k). (19)
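
To make the string machinery concrete, here is a minimal Python sketch (our own illustration, under the assumption of two halfspace constraints in R^2; all names are ours) of the string operator (11), the operator P_{Ω,w} of (15), and a run of Algorithm 3.1 with a fixed pair (Ω_k, w_k) = (Ω, w):

```python
import numpy as np

def project_halfspace(x, a, b):
    """P_i for C_i = {y | <a, y> <= b}."""
    viol = np.dot(a, x) - b
    return x if viol <= 0 else x - (viol / np.dot(a, a)) * a

def string_operator(x, t, projectors):
    """P[t] of (11): apply P_{t_1} first, then P_{t_2}, ..., up to P_{t_q}."""
    for i in t:
        x = projectors[i](x)
    return x

def dsap_sweep(x, omega, w, projectors):
    """P_{Omega,w} of (15): weighted average of the string end-points."""
    return sum(w[t] * string_operator(x, t, projectors) for t in omega)

# Two halfspaces in R^2; Omega = {(0, 1), (1, 0)} is a fit set, weights sum to 1.
projectors = {
    0: lambda x: project_halfspace(x, np.array([1.0, 0.0]), 1.0),  # x_1 <= 1
    1: lambda x: project_halfspace(x, np.array([0.0, 1.0]), 1.0),  # x_2 <= 1
}
omega = [(0, 1), (1, 0)]
w = {(0, 1): 0.5, (1, 0): 0.5}
x = np.array([3.0, 4.0])
for k in range(20):            # Algorithm 3.1 with a fixed pair (Omega_k, w_k)
    x = dsap_sweep(x, omega, w, projectors)
print(x)                       # a point of the intersection C (here [1., 1.])
```

Choosing the single string t = (1, 2, . . . , m), or the m singleton strings t = (i), recovers the sequential and simultaneous "extremes" discussed next.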

The first prototypical string-averaging algorithmic scheme appeared in [30], and subsequent work on various such algorithmic operators includes [13,17,31–37].
If in the DSAP method one uses only a single index vector t = (1, 2, . . . , m) that includes all constraint indices, then the fully sequential Kaczmarz cyclic projection method, see, e.g., [25, p. 220], is obtained, sometimes called the POCS (Projections Onto Convex Sets) method; see, e.g., [21, Chap. 5]. For linear hyperplanes as constraint sets, the latter is equivalent to the independently discovered ART (Algebraic Reconstruction Technique) in image reconstruction from projections, see [28]. If, at the other extreme, one uses exactly m one-dimensional index vectors t = (i), for i = 1, 2, . . . , m, each consisting of exactly one constraint index, then the fully simultaneous projection method of Cimmino, see, e.g., [18, p. 405], is recovered. In-between these "extremes," the DSAP method allows for a large "arsenal" of specific feasibility-seeking projection algorithms, and the results of this paper apply to all of them.
For the reader’s convenience, we quote here the definition of bounded perturbations
resilience and the bounded perturbations resilience theorem of the DSAP method, see
[2] for details. The next definition was originally given in [5, Definition 1] with a finite-dimensional Euclidean space R^J instead of the Hilbert space X; the definition below is taken from [2].

Definition 3.1 Given a problem T, an algorithmic operator A : X → X is said to be bounded perturbations resilient iff the following is true: if the sequence {x^k}_{k=0}^∞, generated by x^{k+1} = A(x^k), for all k ≥ 0, converges to a solution of T for all x^0 ∈ X, then any sequence {y^k}_{k=0}^∞ of points in X that is generated by y^{k+1} = A(y^k + β_k v^k), for all k ≥ 0, also converges to a solution of T, provided that, for all k ≥ 0, β_k v^k are bounded perturbations, meaning that β_k ≥ 0 for all k ≥ 0, that Σ_{k=0}^∞ β_k < ∞, and that the sequence {v^k}_{k=0}^∞ is bounded.
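
As a toy numerical illustration of this definition (ours, not taken from the paper), let A be the projection onto C = [0, 1] in X = R; its exact iterates converge for every starting point, and the iterates perturbed by the summable sequence β_k = 2^{−k} with bounded v^k still converge to a point of C:

```python
import numpy as np

A = lambda x: min(max(x, 0.0), 1.0)   # projection onto C = [0, 1] in X = R

y = 5.0
for k in range(60):
    beta_k = 2.0 ** (-k)              # summable: the betas add up to 2 < infinity
    v_k = float(np.sin(k))            # an arbitrary bounded perturbation
    y = A(y + beta_k * v_k)           # perturbed iteration from Definition 3.1
print(y)                              # still converges to a point of C = [0, 1]
```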

The convergence properties and the so-called bounded perturbation resilience of this DSAP method were analyzed in [2].

Theorem 3.1 [2, Theorem 12] Let C_1, C_2, . . . , C_m be nonempty, closed, and convex subsets of X, where m is a natural number, and let C := ∩_{i=1}^{m} C_i ≠ ∅. Let {β_k}_{k=0}^∞ be a sequence of non-negative numbers such that Σ_{k=0}^∞ β_k < ∞, let {v^k}_{k=0}^∞ ⊂ X be a norm-bounded sequence, let {(Ω_k, w_k)}_{k=0}^∞ ⊂ M_*, and let y^0 ∈ X. Then, any sequence {y^k}_{k=0}^∞, generated by the iterative formula

y^{k+1} = P_{Ω_k, w_k}(y^k + β_k v^k), (20)

converges in the norm of X and its limit belongs to C.

4 The Superiorized Version of the Dynamic String-Averaging Projection Algorithm

The “superiorized version of an algorithm” has evolved and undergone several mod-
ifications throughout the publications on the SM, from the initial [4, pseudocode on
page 543] through [6,9,12] until the most recent [14, “Superiorized Version of the
Basic Algorithm” in Sect. 4]. The next algorithm, called “The superiorized version of
the DSAP algorithm,” is a further modification of the latest [14, “Superiorized Version
of the Basic Algorithm” in Sect. 4] as we explain in Remark 4.1 below.


Let C be as in (10), let φ : X → R be a convex continuous function, and consider the set

C_min := {x ∈ C | φ(x) ≤ φ(y) for all y ∈ C}, (21)

and assume that C_min ≠ ∅.

Algorithm 4.1 The superiorized version of the DSAP algorithm

(0) Initialization: Let N be a natural number and let y^0 ∈ X be an arbitrary user-chosen vector.
(1) Iterative step: Given a current vector y^k, pick an N_k ∈ {1, 2, . . . , N} and start an inner loop of calculations as follows:
(1.1) Inner loop initialization: Define y^{k,0} = y^k.
(1.2) Inner loop step: Given y^{k,n}, as long as n < N_k do as follows:
(1.2.1) Pick a 0 < β_{k,n} ≤ 1 in a way that guarantees that (this can be done; see Remark 4.1 below)

Σ_{k=0}^{∞} Σ_{n=0}^{N_k−1} β_{k,n} < ∞. (22)

(1.2.2) Let ∂φ(y^{k,n}) be the subgradient set of φ at y^{k,n} and define v^{k,n} as follows:

v^{k,n} := −s^{k,n}/||s^{k,n}|| if 0 ∉ ∂φ(y^{k,n}), and v^{k,n} := 0 if 0 ∈ ∂φ(y^{k,n}), (23)

where s^{k,n} ∈ ∂φ(y^{k,n}).
(1.2.3) Calculate

y^{k,n+1} = y^{k,n} + β_{k,n} v^{k,n} (24)

and go to (1.2).
(1.3) Exit the inner loop with the vector y^{k,N_k}.
(1.4) Calculate

y^{k+1} = P_{Ω_k, w_k}(y^{k,N_k}) (25)

with (Ω_k, w_k) ∈ M_*, and go back to (1).
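
The following Python sketch (our own illustration, not the authors' code) mimics one possible instantiation of Algorithm 4.1 under the simplifying assumptions that φ is differentiable, so the gradient is the only subgradient, and that a single projection operator plays the role of the DSAP sweep (25); the step sizes β_{k,n} are taken as a subsequence of η_ℓ = a^ℓ as in Remark 4.1 below. All helper names are ours.

```python
import numpy as np

def superiorized_dsap(y0, grad_phi, dsap_step, num_outer, N, a=0.5):
    """Sketch of Algorithm 4.1: inner loop of negative normalized
    (sub)gradient perturbations (1.2.1)-(1.2.3), then a DSAP sweep (1.4)."""
    y, ell = np.asarray(y0, dtype=float), 0
    for k in range(num_outer):
        N_k = N                        # N_k may vary with k; fixed here
        for n in range(N_k):
            s = grad_phi(y)            # stand-in for some s^{k,n} in the subgradient set
            norm_s = np.linalg.norm(s)
            v = -s / norm_s if norm_s > 0 else np.zeros_like(s)   # (23)
            y = y + (a ** ell) * v     # (24) with beta_{k,n} = eta_ell <= 1
            ell += 1                   # advancing ell keeps the double sum (22) finite
        y = dsap_step(y)               # (25): feasibility-seeking sweep
    return y

# Toy example: phi(x) = ||x||^2 over the box [1, 2] x [1, 2] (a trivial CFP).
box = lambda x: np.clip(x, 1.0, 2.0)   # projection onto the box plays P_{Omega,w}
y = superiorized_dsap([5.0, 5.0], grad_phi=lambda x: 2.0 * x,
                      dsap_step=box, num_outer=100, N=3)
print(y)  # feasible, and superior to the plain feasibility-seeking output (2, 2)
```

The printed point is feasible and has a lower φ-value than the point (2, 2) returned from the same start by the unperturbed feasibility-seeking iteration, which is precisely the behavior the SM aims for, although the output need not be a minimizer.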

Remark 4.1 1. For step (1.2.1) in Algorithm 4.1, assume that we have available a summable sequence {η_ℓ}_{ℓ=0}^∞ of positive real numbers (for example, η_ℓ = a^ℓ, where 0 < a < 1). Then, we can let the algorithm generate, simultaneously with the sequence {y^k}_{k=0}^∞, the numbers β_{k,n} as a subsequence of {η_ℓ}_{ℓ=0}^∞, by choosing β_{k,n} = η_ℓ and increasing the index ℓ in every pass through step (1.2.1) of the algorithm, resulting in a positive summable sequence of the β_{k,n}, as required in (22). This is how it was done in [14, "Superiorized Version of the Basic Algorithm" in Sect. 4].


2. There are some differences between the "Superiorized Version of the Basic Algorithm" in Sect. 4 of [14] and Algorithm 4.1. In [14], it was the case that N_k = N for all k ≥ 0, whereas here we allow the number of times that the inner loop step (1.2) is exercised to vary from iteration to iteration, depending on the iteration index k.
3. In our Algorithm 4.1, we do not have to check whether φ(y^{k,n+1}) ≤ φ(y^k) after (24) in step (1.2.3) of the algorithm, as is done in step 14 of the "Superiorized Version of the Basic Algorithm" in Sect. 4 of [14]. In spite of this saving shortcut, we are able to prove the main result for our Algorithm 4.1, as seen in Theorem 4.1 below.
4. Admittedly, our algorithm is related only to Case 2.1 in Sect. 2 and uses negative subgradients in step (1.2.2), not general nonascending directions as in step 8 of the "Superiorized Version of the Basic Algorithm" in Sect. 4 of [14].
5. Finally, as mentioned before, our findings are related to Case 2.1 for the consistent case C = ∩_{i=1}^{m} C_i ≠ ∅ and treat bounded perturbations resilience, not the notion of strong perturbation resilience as in [10,14]. This enables us to prove asymptotic convergence results here, but also calls for future research to cover the inconsistent case.
6. Note that the DSAP method, covered here, is a versatile algorithmic scheme that includes, as special cases, the fully sequential projections method and the fully simultaneous projections method as "extreme" structures, obtained by putting all sets C_i into a single string, or by putting each constraint in a separate string, respectively. Consult the references mentioned in Case 2.1 above for further details and relevant references.
We will prove the following theorem as our main result.
Theorem 4.1 Let φ : X → R be a convex continuous function, and let C∗ ⊆ C_min be a nonempty subset of C_min. Let r_0 ∈ ]0, 1] and L̄ ≥ 1 be such that

|φ(x) − φ(y)| ≤ L̄||x − y|| for all x ∈ C∗ and all y ∈ B(x, r_0), (26)

and suppose that

{(Ω_k, w_k)}_{k=0}^∞ ⊂ M_*. (27)

Then, any sequence {y^k}_{k=0}^∞, generated by Algorithm 4.1, converges in the norm topology of X to a point y* ∈ C, and exactly one of the following two cases holds:
(a) y* ∈ C_min;
(b) y* ∉ C_min, and there exist a natural number k_0 and a c_0 ∈ ]0, 1[ such that for each x ∈ C∗ and each integer k ≥ k_0,

||y^{k+1} − x||² ≤ ||y^k − x||² − c_0 Σ_{n=1}^{N_k−1} β_{k,n}, (28)

showing that {y^k}_{k=0}^∞ is strictly Fejér-monotone with respect to C∗, i.e.,

||y^{k+1} − x||² < ||y^k − x||², for all k ≥ k_0, (29)

because c_0 Σ_{n=1}^{N_k−1} β_{k,n} > 0.


This theorem establishes a mathematical basis for the behavior of the SM when dealing with Case 2.1 in Sect. 2, i.e., T is the solution set of a CFP as in (10), assuming that T = C ≠ ∅, and using the DSAP method algorithmic scheme as a feasibility-seeking algorithm, which is indeed bounded perturbations resilient. The bounded perturbations resilience of the DSAP method has been proved in [2], and the practical behavior of the SM was observed in numerous recent works, so we furnish here a mathematical guarantee of the convergence behavior of the superiorized version of the DSAP Algorithm 4.1.
Theorem 4.1 tells us that any sequence {y^k}_{k=0}^∞, generated by Algorithm 4.1, will not only converge to a feasible point of the underlying CFP, which is due to the bounded perturbations resilience of the DSAP method, but, additionally, that either its limit point will solve the CM problem (1) with T = C ≠ ∅, or the sequence {y^k}_{k=0}^∞ is strictly Fejér-monotone with respect to a subset C∗ of the solution set C_min of the CM problem, according to (28).
This strict Fejér-monotonicity of the sequence {y^k}_{k=0}^∞ does not suffice to guarantee its convergence to a minimum point of (1), even though the sequence does converge to a limit point in C. It says that the superiorized version of the algorithm retains asymptotic convergence to a feasible point in C, and that the so-created feasibility-seeking sequence has the additional property of getting strictly closer, without necessarily converging, to a subset of the set of minimizers of the CM problem (1). For properties of Fejér-monotone and strictly Fejér-monotone sequences see, e.g., [18, Theorem 2.16], [25, Sect. 3.3], and [38].
Another result that describes the behavior of the superiorized version of a basic algorithm is [14, Theorem 4.2]. It considers the case of strong perturbation resilience of the underlying basic algorithm, contrary to our result, which considers bounded perturbations resilience and is, therefore, valid for the case of consistent feasible sets.
The inability to prove that the superiorized version of a basic algorithm reduces the value of the objective function φ, a fact which was repeatedly observed experimentally, made us use the term "heuristic" in [10,11]. In spite of the strict Fejér monotonicity proven here, the question of proving mathematically the reduction of the value of the objective function φ remains.
Another open question is to formulate some reasonable further conditions that
will help distinguish beforehand between the two alternatives in Theorem 4.1 for the
behavior of the superiorized version of a DSAP algorithm. Published experimental
results repeatedly confirm that reduction of the value of the objective function φ is
indeed achieved, without losing the convergence toward feasibility, see [2–13]. In
some of these cases, the SM returns a lower value of the objective function φ than an
exact minimization method with which it is compared, e.g., [14, Table 1].

5 The Main Result: Strict Fejér Monotonicity in the Superiorization Method

5.1 Auxiliary Results

We will need the next two lemmas.


Lemma 5.1 Let x, y ∈ X and ε > 0, and let φ : X → R be a convex continuous function.

If φ(x) − φ(y) > ε and v ∈ ∂φ(x), then ⟨v, y − x⟩ < −ε. (30)

Proof It follows from the subgradient inequality that

⟨v, y − x⟩ ≤ φ(y) − φ(x) < −ε, (31)

thus proving the lemma. □




Lemma 5.2 Under the assumptions of Theorem 4.1, let x̄ ∈ C∗, ε ∈ ]0, r_0], α ∈ ]0, 1], and x ∈ X satisfy

φ(x) − φ(x̄) > ε, (32)

and assume that v ∈ ∂φ(x) and that (Ω, w) ∈ M_*. Then, v ≠ 0 and

y := P_{Ω,w}(x − α||v||^{−1} v) (33)

satisfies

||y − x̄||² ≤ ||x − α||v||^{−1} v − x̄||² ≤ ||x − x̄||² − 2α(4 L̄)^{−1} ε + α². (34)

Proof Equations (30) and (32) imply that v ≠ 0. For each z ∈ B(x̄, 4^{−1} ε L̄^{−1}), we have, by (26),

φ(z) − φ(x̄) ≤ L̄||z − x̄|| ≤ 4^{−1} ε. (35)

Thus, in view of (35), Lemma 5.1, and (32), we have, for all z ∈ B(x̄, 4^{−1} ε L̄^{−1}),

⟨v, z − x⟩ < −(3/4)ε, (36)

which implies that

⟨||v||^{−1} v, z − x⟩ < 0. (37)

Defining z̄ := x̄ + 4^{−1} ε L̄^{−1} ||v||^{−1} v, we obtain, by (37),

0 > ⟨||v||^{−1} v, z̄ − x⟩ = ⟨||v||^{−1} v, x̄ + 4^{−1} ε L̄^{−1} ||v||^{−1} v − x⟩, (38)

which, in turn, yields

⟨||v||^{−1} v, x̄ − x⟩ < −4^{−1} ε L̄^{−1}. (39)


Defining now u := x − α||v||^{−1} v, (39) gives rise to

||u − x̄||² = ||x − α||v||^{−1} v − x̄||²
= ||x − x̄||² − 2⟨x − x̄, α||v||^{−1} v⟩ + α²
≤ ||x − x̄||² − 2α(4 L̄)^{−1} ε + α². (40)

This yields, by (33), the definition of u, (16), (17), and the assumption that x̄ ∈ C∗,

||y − x̄||² = ||P_{Ω,w}(u) − x̄||² ≤ ||u − x̄||² ≤ ||x − x̄||² − 2α(4 L̄)^{−1} ε + α², (41)

which completes the proof of the lemma. □




5.2 Proof of Theorem 4.1

We are now ready to prove the main result of our paper, Theorem 4.1.

Proof From Algorithm 4.1, a sequence {y^k}_{k=0}^∞ that it generates has the property that for each integer k ≥ 1 and each h ∈ {1, 2, . . . , N_k},

||y^{k,h} − y^k|| = ||Σ_{j=1}^{h} (y^{k,j} − y^{k,j−1})|| ≤ Σ_{j=1}^{h} ||y^{k,j} − y^{k,j−1}|| ≤ Σ_{n=0}^{N_k−1} β_{k,n} (42)

and, in particular,

||y^{k,N_k} − y^k|| ≤ Σ_{n=0}^{N_k−1} β_{k,n}, (43)

so that, by (22),

Σ_{k=0}^{∞} ||y^{k,N_k} − y^k|| ≤ Σ_{k=0}^{∞} (Σ_{n=0}^{N_k−1} β_{k,n}) < ∞. (44)

The bounded perturbation resilience secured by Theorem 3.1 guarantees the convergence of {y^k}_{k=0}^∞ to a point in C, namely, that there exists

y* = lim_{k→∞} y^k ∈ C (45)

in the norm topology.
Assume that (a) of Theorem 4.1 does not hold, i.e., that y* ∉ C_min. Then, there is an ε_0 ∈ ]0, r_0/2[ such that

φ(y*) > φ(x) + 4ε_0, for all x ∈ C_min, (46)


and there is a natural number k_0 such that for all integers k ≥ k_0, by (22),

Σ_{n=0}^{N_k−1} β_{k,n} < (16 L̄)^{−1} ε_0. (47)

This index k_0 can be chosen so that, additionally, for all integers k ≥ k_0 and all ℓ ∈ {0, 1, . . . , N_k}, by (42), (45), and (46),

φ(y^{k,ℓ}) > φ(x) + 2ε_0, for all x ∈ C_min. (48)

Next, we apply Lemma 5.2 as follows. Pick an x̄ ∈ C∗, let k ≥ k_0, where k_0 is as above, and let n ∈ {1, 2, . . . , N_k}. Use (27), (24), (47)–(48), and the fact that 0 < β_{k,n} ≤ 1 in Algorithm 4.1, and apply Lemma 5.2 with

α = β_{k,n−1}, ε = 2ε_0, x = y^{k,n−1}, v = v^{k,n−1}, and (Ω, w) = (Ω_k, w_k). (49)

This leads to

||y^{k,n} − x̄||² ≤ ||y^{k,n−1} − x̄||² − 2β_{k,n−1} (4 L̄)^{−1} 2ε_0 + β_{k,n−1}² (50)
≤ ||y^{k,n−1} − x̄||² − β_{k,n−1} (4 L̄)^{−1} ε_0, (51)

because −3β_{k,n−1} (4 L̄)^{−1} ε_0 + β_{k,n−1}² ≤ 0. So, for n = N_k, in view of (25), we have

||y^{k+1} − x̄|| ≤ ||y^{k,N_k} − x̄||. (52)

By y^{k,0} = y^k in Algorithm 4.1, (50), and (52),

||y^k − x̄||² − ||y^{k+1} − x̄||² ≥ ||y^{k,0} − x̄||² − ||y^{k,N_k} − x̄||²
= Σ_{n=1}^{N_k} (||y^{k,n−1} − x̄||² − ||y^{k,n} − x̄||²)
≥ (4 L̄)^{−1} ε_0 Σ_{n=0}^{N_k−1} β_{k,n}, (53)

and

||y^{k+1} − x̄||² ≤ ||y^k − x̄||² − (4 L̄)^{−1} ε_0 Σ_{n=0}^{N_k−1} β_{k,n}, (54)

which completes the proof of Theorem 4.1. □





6 Conclusions

In very general terms, the superiorization methodology works by taking an iterative algorithm, investigating its perturbation resilience, and then proactively using such perturbations in order to "force" the perturbed algorithm to do something useful. The perturbed algorithm is called the "superiorized version" of the original unperturbed algorithm.
If the original algorithm is efficient and useful for what it is designed to do (computa-
tionally efficient and useful in terms of the application at hand), and if the perturbations
are simple and not expensive to calculate, then the advantage of this method is that,
for essentially the computational cost of the original algorithm, we are able to get
something more by steering its iterates according to the perturbations.
This is a very general principle, which has been successfully used in some important practical applications and awaits implementation and testing in additional fields; see, e.g., the recent papers [8,39], for applications in intensity-modulated radiation therapy and in nondestructive testing. An important case is when (i) the original algorithm is a feasibility-seeking algorithm, or one that strives to find constraint-compatible points for a family of constraints; and (ii) the perturbations that are interlaced into the original algorithm aim at reducing (not necessarily minimizing) a given merit (objective) function. Since its inception in 2007, the superiorization method has developed and evolved, and it now seems worthwhile to distinguish between two directions that stem from the same general principle.
One is the direction in which only bounded perturbation resilience is used and the constraints are assumed to be consistent (nonempty intersection). Then, one treats the "superiorized version" of the original unperturbed algorithm as a recursion formula that produces an infinite sequence of iterates, and convergence questions are understood in their asymptotic nature. This is the framework in which we work in this paper.
The second direction does not assume consistency of the constraints but uses instead
a proximity function that “measures” the violation of the constraints. Instead of seek-
ing asymptotic feasibility, it looks at ε-compatibility and uses the notion of strong
perturbation resilience. The same core “superiorized version” of the original unper-
turbed algorithm might be investigated in each of these directions, but the second is
the more practical one, whereas the first makes only asymptotic statements.
We propose the terms “weak superiorization” and “strong superiorization” as a
nomenclature for the first and second directions, respectively.

Acknowledgments We thank Gabor Herman for an enlightening discussion about terminology, and we
greatly appreciate the constructive comments of two anonymous reviewers which helped us improve the
paper.

References

1. Censor, Y., Chen, W., Combettes, P.L., Davidi, R., Herman, G.T.: On the effectiveness of projection
methods for convex feasibility problems with linear inequality constraints. Comput. Optim. Appl. 51,
1065–1088 (2012)


2. Censor, Y., Zaslavski, A.J.: Convergence and perturbation resilience of dynamic string-averaging pro-
jection methods. Comput. Optim. Appl. 54, 65–76 (2013)
3. Bauschke, H.H., Koch, V.R.: Projection methods: Swiss army knives for solving feasibility and best approximation problems with halfspaces. In: Reich, S., Zaslavski, A. (eds.) Proceedings of the Workshop "Infinite Products of Operators and Their Applications", Haifa, 2012 (2013). Accepted for publication. http://arxiv.org/abs/1301.4506, https://people.ok.ubc.ca/bauschke/Research/c16
4. Butnariu, D., Davidi, R., Herman, G.T., Kazantsev, I.G.: Stable convergence behavior under summable
perturbations of a class of projection methods for convex feasibility and optimization problems. IEEE
J. Sel. Top. Signal Process. 1, 540–547 (2007)
5. Censor, Y., Davidi, R., Herman, G.T.: Perturbation resilience and superiorization of iterative algorithms.
Inverse Probl. 26, 065008 (2010)
6. Davidi, R., Herman, G.T., Censor, Y.: Perturbation-resilient block-iterative projection methods with
application to image reconstruction from projections. Int. Trans. Oper. Res. 16, 505–524 (2009)
7. Garduño, E., Herman, G.T.: Superiorization of the ML-EM algorithm. IEEE Trans. Nucl. Sci. 61,
162–172 (2014)
8. Davidi, R., Censor, Y., Schulte, R.W., Geneser, S., Xing, L.: Feasibility-seeking and superiorization
algorithms applied to inverse treatment planning in radiation therapy. Contemp. Math., accepted for
publication. http://arxiv.org/abs/1402.1310
9. Herman, G.T., Davidi, R.: Image reconstruction from a small number of projections. Inverse Probl. 24,
045011 (2008)
10. Herman, G.T., Garduño, E., Davidi, R., Censor, Y.: Superiorization: an optimization heuristic for
medical physics. Med. Phys. 39, 5532–5546 (2012)
11. Jin, W., Censor, Y., Jiang, M.: A heuristic superiorization-like approach to bioluminescence tomogra-
phy. In: International Federation for Medical and Biological Engineering (IFMBE) Proceedings, vol
39, pp. 1026–1029 (2012)
12. Nikazad, T., Davidi, R., Herman, G.: Accelerated perturbation-resilient block-iterative projection meth-
ods with application to image reconstruction. Inverse Probl. 28, 035005 (2012)
13. Penfold, S.N., Schulte, R.W., Censor, Y., Rosenfeld, A.B.: Total variation superiorization schemes in
proton computed tomography image reconstruction. Med. Phys. 37, 5887–5895 (2010)
14. Censor, Y., Davidi, R., Herman, G.T., Schulte, R.W., Tetruashvili, L.: Projected subgradient minimiza-
tion versus superiorization. J. Optim. Theory Appl. 160, 730–747 (2014)
15. Butnariu, D., Reich, S., Zaslavski, A.J.: Convergence to fixed points of inexact orbits of Bregman-monotone and of nonexpansive operators in Banach spaces. In: Nathansky, H.F., de Buen, B.G., Goebel, K., Kirk, W.A., Sims, B. (eds.) Fixed Point Theory and its Applications (Conference Proceedings, Guanajuato, Mexico, 2005), pp. 11–32. Yokohama Publishers, Yokohama (2006)
16. Butnariu, D., Reich, S., Zaslavski, A.J.: Stable convergence theorems for infinite products and powers
of nonexpansive mappings. Numer. Funct. Anal. Optim. 29, 304–323 (2008)
17. Censor, Y., Zaslavski, A.J.: String-averaging projected subgradient methods for constrained minimiza-
tion. Optim. Methods Softw. 29, 658–670 (2014)
18. Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems.
SIAM Rev. 38, 367–426 (1996)
19. Byrne, C.L.: Applied Iterative Methods. AK Peters, Wellsely (2008)
20. Chinneck, J.W.: Feasibility and Infeasibility in Optimization: Algorithms and Computational Methods.
Springer, New York (2007)
21. Censor, Y., Zenios, S.A.: Parallel Optimization: Theory, Algorithms, and Applications. Oxford Uni-
versity Press, New York (1997)
22. Aharoni, R., Censor, Y.: Block-iterative projection methods for parallel computation of solutions to
convex feasibility problems. Linear Algebra Appl. 120, 165–175 (1989)
23. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces.
Springer, New York (2011)
24. Bauschke, H.H., Matoušková, E., Reich, S.: Projection and proximal point methods: convergence
results and counterexamples. Nonlinear Anal-Theor. 56, 715–738 (2004)
25. Cegielski, A.: Iterative Methods for Fixed Point Problems in Hilbert Spaces. Lecture Notes in Mathe-
matics, vol. 2057. Springer, Berlin (2012)
26. Escalante, R., Raydan, M.: Alternating Projection Methods. Society for Industrial and Applied Math-
ematics (SIAM), Philadelphia (2011)
27. Galántai, A.: Projectors and Projection Methods. Kluwer Academic Publishers, Dordrecht (2004)


28. Herman, G.T.: Fundamentals of Computerized Tomography: Image Reconstruction from Projections,
2nd edn. Springer, Berlin (2009)
29. Luo, S., Zhou, T.: Superiorization of EM algorithm and its application in single-photon emission
computed tomography (SPECT). Inverse Probl. Imaging 8, 223–246 (2014)
30. Censor, Y., Elfving, T., Herman, G.T.: Averaging strings of sequential iterations for convex feasibility
problems. In: Butnariu, D., Censor, Y., Reich, S. (eds.) Inherently Parallel Algorithms in Feasibility
and Optimization and Their Applications, pp. 101–114. Elsevier, Amsterdam (2001)
31. Censor, Y., Segal, A.: On the string averaging method for sparse common fixed point problems. Int.
Trans. Oper. Res. 16, 481–494 (2009)
32. Censor, Y., Segal, A.: On string-averaging for sparse problems and on the split common fixed point
problem. Contemp. Math. 513, 125–142 (2010)
33. Censor, Y., Tom, E.: Convergence of string-averaging projection schemes for inconsistent convex
feasibility problems. Optim. Methods Softw. 18, 543–554 (2003)
34. Crombez, G.: Finding common fixed points of strict paracontractions by averaging strings of sequential
iterations. J. Nonlinear Conv. Anal. 3, 345–351 (2002)
35. Gordon, D., Gordon, R.: Component-averaged row projections: a robust, block-parallel scheme for
sparse linear systems. SIAM J. Sci. Comput. 27, 1092–1117 (2005)
36. Penfold, S.N., Schulte, R.W., Censor, Y., Bashkirov, V., McAllister, S., Schubert, K.E., Rosenfeld,
A.B.: Block-iterative and string-averaging projection algorithms in proton computed tomography
image reconstruction. In: Censor, Y., Jiang, M., Wang, G. (eds.) Biomedical Mathematics: Promis-
ing Directions in Imaging, Therapy Planning and Inverse Problems, pp. 347–367. Medical Physics,
Madison (2010)
37. Rhee, H.: An application of the string averaging method to one-sided best simultaneous approximation.
J. Korean Soc. Math. Educ. Ser. B 10, 49–56 (2003)
38. Schott, D.: Basic properties of Fejér monotone sequences. Rostocker Math. Kolloq. 49, 57–74 (1995)
39. Schrapp, M.J., Herman, G.T.: Data fusion in X-ray computed tomography using a superiorization
approach. Rev. Sci. Instrum. 85, 053701 (2014)
