A Scaled Gradient Projection Method For Constrained Image Deblurring



S. Bonettini¹, R. Zanella² and L. Zanni²

¹ Dipartimento di Matematica, Università di Ferrara, Polo Scientifico Tecnologico, Blocco B, Via Saragat 1, I-44100 Ferrara, Italy
² Dipartimento di Matematica, Università di Modena e Reggio Emilia, Via Campi 213/B, I-41100 Modena, Italy

E-mail: [email protected], [email protected], [email protected]
Abstract. A class of scaled gradient projection methods for optimization problems with simple constraints is considered. These iterative algorithms can be useful in variational approaches to image deblurring that lead to the minimization of convex nonlinear functions subject to nonnegativity constraints and, in some cases, to an additional flux conservation constraint. A special gradient projection method is introduced that exploits effective scaling strategies and steplength updating rules, appropriately designed for improving the convergence rate. We give convergence results for this scheme and we evaluate its effectiveness by means of an extensive computational study on the minimization problems arising from the maximum likelihood approach to image deblurring. Comparisons with the standard expectation maximization algorithm and with other iterative regularization schemes are also reported to show the computational gain provided by the proposed method.

AMS classification scheme numbers: 65K10, 65F22, 68U10

Submitted to: Inverse Problems

Keywords: Image deblurring, deconvolution methods, gradient projection methods, large scale optimization.
SGP Method for Constrained Image Deblurring 2
1. Introduction
In image deblurring problems, image formation is modeled as a Fredholm integral equation of the first kind which, after discretization, results in a system of linear equations. By representing a two-dimensional image $X \in \mathbb{R}^{n \times n}$ as a vector $x = (x_1, \ldots, x_N)^T \in \mathbb{R}^N$, $N = n^2$, in which the entries of $X$ are stacked column by column, the above system can be stated as

$$ A x = b, \qquad (1) $$

where $A$ is an $N \times N$ matrix representing the physical effect of the imaging system, $x$ is the image to be reconstructed and $b \in \mathbb{R}^N$ is the sum of two terms: $b = g + \eta$, $g \in \mathbb{R}^N$ being the blurred image that would have been recorded in the absence of noise and $\eta \in \mathbb{R}^N$ denoting the noise affecting the image acquisition [1]. The image restoration problem is then to obtain an approximation of $x$, knowing $A$ and $b$. Since the system (1) is given by the discretization of an ill-posed problem [1], the matrix $A$ could be very ill-conditioned and a trivial approach that looks for the solution of (1) is in general not successful. Thus, alternative strategies must be exploited, which often consist of iterative schemes motivated by different approaches, ranging from methods for the solution of linear equations to methods for solving variational problems [2].
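As a concrete illustration of the linear model (1), the following Python sketch (not part of the paper; the Gaussian PSF, image content and noise level are arbitrary illustrative choices) builds a small circulant blurring operator from a periodic point spread function and forms a noisy blurred observation $b = Ax + \eta$ via the FFT.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32                                       # image side; N = n^2 unknowns
x_true = np.zeros((n, n))
x_true[12:20, 12:20] = 1.0                   # a bright square as the "true" image

# Periodic (circulant) Gaussian PSF, normalized so the blur preserves total flux
i = np.minimum(np.arange(n), n - np.arange(n))
psf = np.exp(-(i[:, None] ** 2 + i[None, :] ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()

# With periodic boundary conditions, A x is a 2-D circular convolution:
# O(N log N) via the FFT, without ever forming the N x N matrix A
psf_hat = np.fft.fft2(psf)
def A(v):
    return np.real(np.fft.ifft2(np.fft.fft2(v) * psf_hat))

g = A(x_true)                                    # blurred, noise-free image
b = g + 0.01 * rng.standard_normal((n, n))       # observed image: b = g + eta
```

Since the PSF has unit sum, the operator maps constant images to themselves and preserves the total flux of `x_true`.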
In this work we propose a scaled gradient projection method suited for the constrained minimization problems arising from several variational approaches to image restoration. Examples of such minimization problems are the following:

$$ \min J(x) \quad \text{sub. to } \; x \geq 0, \qquad (2) $$

or

$$ \min J(x) \quad \text{sub. to } \; x \geq 0, \;\; \sum_{i=1}^N x_i = c, \qquad (3) $$

where $J(x)$ is a continuously differentiable convex function measuring the difference between reconstructed and measured data and, possibly, containing a penalty term expressing additional information on the solution, while the constraints force the nonnegativity of the solution and, in the case of problem (3), the so-called flux conservation property. Gradient projection type methods seem appealing approaches for these problems for two main reasons. Firstly, the special structure of the constraints makes the projection of a vector on the feasible region a non-excessively expensive operation: this is obviously the case for problem (2), but also for problem (3), in which the projection can be performed by linear-time algorithms [3, 4, 5, 6, 7]. Secondly, recent advances on steplength selection in gradient methods [8, 4, 9, 10] allow one to largely improve the convergence rate of these schemes without introducing significant additional costs. Thus, new gradient projection methods can nowadays be designed that, thanks to the low computational cost per iteration and the good convergence rate, may represent a valid alternative to other gradient-based iterative approaches widely used in image restoration [11, 12, 13, 14, 15]. The main feature of the gradient projection method introduced in this paper consists in the combination of non-expensive diagonally scaled gradient directions with steplength selection rules specially designed for these directions. Moreover, global convergence properties are ensured by exploiting a nonmonotone line-search strategy along the feasible direction [16, 17]. Scaled gradient directions are also exploited by other popular algorithms for image restoration; see, for example, the projected Newton methods described in [2, 18, 19, 20]. However, these schemes are substantially different from our approach since they require inner linear solvers to compute the non-diagonally scaled gradient direction, do not consider steplength selection strategies and use a line-search along the projection arc instead of along the feasible direction [21, p. 226].
The effectiveness of the proposed scheme is evaluated by solving the minimization problems arising from the maximum likelihood approach to the deconvolution of images corrupted by Poisson noise, that is, when the objective function in (2) and (3) is the Kullback-Leibler divergence of the blurred image $Ax$ from the observed noisy image $b$ [1, 12, 15, 22]. Comparisons with the well-known Expectation Maximization (EM) method [15], the accelerated EM version proposed in [23] and the Weighted Modified Residual Norm Steepest Descent (WMRNSD) algorithm introduced in [24] are reported to assess the reconstruction accuracy and the computational gain provided by the new method.

The paper is organized as follows: in section 2 the scaled gradient projection approach is presented and its global convergence theory is developed, while in section 3 crucial details for a practical implementation of the method are given. An extensive numerical experimentation on astronomical test images is presented in section 4. Our conclusions and future developments are discussed in section 5.
2. The algorithm and its convergence
We introduce a Scaled Gradient Projection (SGP) method for solving constrained minimization problems of the general form

$$ \min f(x) \quad \text{sub. to } \; x \in \Omega, \qquad (4) $$

where $\Omega \subseteq \mathbb{R}^N$ is a closed convex set and $f : \mathbb{R}^N \rightarrow \mathbb{R}$ is a continuously differentiable function. Obviously, the deblurring problems (2) and (3) can be considered as special cases of this formulation.

Before stating the SGP algorithm, we recall some basic properties of the projection operator.
2.1. Definitions and basic properties

Throughout the paper, the 2-norm of vectors and matrices is denoted by $\|\cdot\|$, while $\|\cdot\|_D$ indicates the vector norm associated with a symmetric positive definite matrix $D$: $\|x\|_D = \sqrt{x^T D x}$.

Given the optimization problem (4), we recall that $x^* \in \Omega$ is a stationary point of $f$ over $\Omega$ if

$$ \nabla f(x^*)^T (y - x^*) \geq 0, \quad \forall \, y \in \Omega. $$

When $\Omega$ is convex, the stationarity condition can also be stated as $\nabla f(x^*)^T w \geq 0$ for any $w$ in the tangent cone of $\Omega$ at $x^*$ [21, p. 336].
Let $\Omega \subseteq \mathbb{R}^N$ be a closed convex set and $D$ a symmetric positive definite $N \times N$ matrix; we define the projection operator $P_{\Omega,D} : \mathbb{R}^N \rightarrow \Omega$ as

$$ P_{\Omega,D}(x) \equiv \arg\min_{y \in \Omega} \|y - x\|_D = \arg\min_{y \in \Omega} \left\{ \phi(y) \equiv \frac{1}{2} y^T D y - y^T D x \right\}. \qquad (5) $$

We observe that, given the set $\Omega$ and the point $x$, the operator $P_{\Omega,D}(x)$ is a continuous function with respect to the elements of the matrix $D$. Furthermore, from the definition of stationary point and the strict convexity of the function $\phi$ introduced in (5), we have that $P_{\Omega,D}(x)$ is also defined by

$$ (P_{\Omega,D}(x) - x)^T D \, (P_{\Omega,D}(x) - y) \leq 0, \quad \forall \, y \in \Omega. \qquad (6) $$

Let $\mathcal{D}_L \subset \mathbb{R}^{N \times N}$ be the compact set of the symmetric positive definite $N \times N$ matrices such that $\|D\| \leq L$ and $\|D^{-1}\| \leq L$, for a given threshold $L > 1$. The next lemmas state two basic properties related to the projection operator: a Lipschitz continuity condition on $P_{\Omega,D}$ and a characterization of the stationary points of the problem (4) [25].

Lemma 2.1 If $D \in \mathcal{D}_L$, then

$$ \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \| \leq L^2 \| x - z \| \qquad (7) $$

for any $x, z \in \mathbb{R}^N$.
Proof. By applying the condition (6) we get

$$ (P_{\Omega,D}(x) - x)^T D \, (P_{\Omega,D}(x) - P_{\Omega,D}(z)) \leq 0, $$
$$ (P_{\Omega,D}(z) - z)^T D \, (P_{\Omega,D}(z) - P_{\Omega,D}(x)) \leq 0, $$

and, by adding the two inequalities,

$$ \left( (P_{\Omega,D}(x) - x) - (P_{\Omega,D}(z) - z) \right)^T D \, (P_{\Omega,D}(x) - P_{\Omega,D}(z)) \leq 0, $$

that is

$$ \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|_D^2 \leq (P_{\Omega,D}(x) - P_{\Omega,D}(z))^T D \, (x - z). \qquad (8) $$

If $\lambda_{\min}$ denotes the minimum eigenvalue of the matrix $D$, for the left-hand side of the previous inequality we have

$$ \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|_D^2 \geq \lambda_{\min} \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|^2 = \frac{1}{\|D^{-1}\|} \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|^2 \geq \frac{1}{L} \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|^2 $$

and from (8) we obtain

$$ \frac{1}{L} \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|^2 \leq \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \|_D^2 \leq (P_{\Omega,D}(x) - P_{\Omega,D}(z))^T D \, (x - z) \leq \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \| \, \|D\| \, \| x - z \| \leq L \, \| P_{\Omega,D}(x) - P_{\Omega,D}(z) \| \, \| x - z \|, $$

which yields (7).
Lemma 2.2 A vector $x^* \in \Omega$ is a stationary point of the problem (4) if and only if $x^* = P_{\Omega,D^{-1}}(x^* - \alpha D \nabla f(x^*))$ for any positive scalar $\alpha$ and for any symmetric positive definite matrix $D$.
Proof. Let $\alpha \in \mathbb{R}_+$ and let $D$ be a symmetric positive definite matrix. Assume that $x^* = P_{\Omega,D^{-1}}(x^* - \alpha D \nabla f(x^*))$. From (6) we obtain

$$ \left( x^* - x^* + \alpha D \nabla f(x^*) \right)^T D^{-1} (x^* - x) \leq 0, \quad \forall \, x \in \Omega, $$

which implies the stationarity condition

$$ \nabla f(x^*)^T (x - x^*) \geq 0, \quad \forall \, x \in \Omega. $$

Conversely, let us assume that $x^*$ is a stationary point of (4), and suppose that $\bar{x} = P_{\Omega,D^{-1}}(x^* - \alpha D \nabla f(x^*))$, with $\bar{x} \neq x^*$. Then, from (6) we can write

$$ \left( \bar{x} - x^* + \alpha D \nabla f(x^*) \right)^T D^{-1} (\bar{x} - x^*) \leq 0, $$

that is

$$ \| \bar{x} - x^* \|_{D^{-1}}^2 + \alpha \nabla f(x^*)^T (\bar{x} - x^*) \leq 0. $$

The previous inequality yields

$$ \nabla f(x^*)^T (x^* - \bar{x}) \geq \frac{\| \bar{x} - x^* \|_{D^{-1}}^2}{\alpha} > 0, $$

which gives a contradiction with the stationarity assumption on $x^*$.
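The fixed-point characterization of Lemma 2.2 can be checked numerically in a simple case. For the nonnegativity constraint $\Omega = \{x \geq 0\}$ with a diagonal $D$, the projection in the $D^{-1}$-norm is separable and reduces to componentwise clipping at zero. The quadratic $f$ and all parameter values below are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

# f(x) = 0.5 * ||x - a||^2 over the nonnegative orthant; its minimizer is max(a, 0)
a = np.array([1.5, -2.0, 0.7, -0.1])
grad_f = lambda x: x - a
x_star = np.maximum(a, 0.0)

# For Omega = {x >= 0} and diagonal D, the D^{-1}-norm projection is clipping at zero
project = lambda x: np.maximum(x, 0.0)

alpha = 0.3                              # any positive steplength
D = np.array([0.5, 2.0, 1.0, 3.0])       # diagonal of any positive definite scaling

# Lemma 2.2: x_star is stationary iff the projected gradient step returns x_star
y = project(x_star - alpha * D * grad_f(x_star))
```

At the constrained minimizer the projected step is a fixed point, while at a nonstationary point (e.g. the all-ones vector) the projection moves the point.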
Algorithm SGP (Scaled Gradient Projection Method)

Choose the starting point $x^{(0)} \in \Omega$, set the parameters $\beta, \theta \in (0, 1)$, $0 < \alpha_{\min} < \alpha_{\max}$ and fix a positive integer $M$.

For $k = 0, 1, 2, \ldots$ do the following steps:

  Step 1. Choose the parameter $\alpha_k \in [\alpha_{\min}, \alpha_{\max}]$ and the scaling matrix $D_k \in \mathcal{D}_L$;
  Step 2. Projection: $y^{(k)} = P_{\Omega, D_k^{-1}}(x^{(k)} - \alpha_k D_k \nabla f(x^{(k)}))$;
          If $y^{(k)} = x^{(k)}$ then stop, declaring that $x^{(k)}$ is a stationary point;
  Step 3. Descent direction: $d^{(k)} = y^{(k)} - x^{(k)}$;
  Step 4. Set $\lambda_k = 1$ and $f_{\max} = \max_{0 \leq j \leq \min(k, M-1)} f(x^{(k-j)})$;
  Step 5. Backtracking loop:
          If $f(x^{(k)} + \lambda_k d^{(k)}) \leq f_{\max} + \beta \lambda_k \nabla f(x^{(k)})^T d^{(k)}$ then
              go to Step 6;
          Else
              set $\lambda_k = \theta \lambda_k$ and go to Step 5;
          Endif
  Step 6. Set $x^{(k+1)} = x^{(k)} + \lambda_k d^{(k)}$.

End
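For illustration, the steps above can be sketched in Python for the nonnegatively constrained problem (2), with a fixed steplength and identity scaling (the adaptive rules of section 3 are omitted); the test function and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def sgp(f, grad, x0, alpha=1.0, beta=1e-4, theta=0.4, M=10, max_iter=200, tol=1e-10):
    """Minimal SGP sketch for min f(x) s.t. x >= 0, with scaling D_k = I."""
    x = np.maximum(x0, 0.0)
    past = [f(x)]                                   # memory for the nonmonotone rule
    for _ in range(max_iter):
        g = grad(x)
        y = np.maximum(x - alpha * g, 0.0)          # Step 2: projection (clipping)
        d = y - x                                   # Step 3: feasible descent direction
        if np.linalg.norm(d) < tol:                 # y = x  =>  stationary point
            break
        lam, f_max, gTd = 1.0, max(past[-M:]), g @ d
        while f(x + lam * d) > f_max + beta * lam * gTd:   # Step 5: backtracking
            lam *= theta
        x = x + lam * d                             # Step 6
        past.append(f(x))
    return x

# Illustrative convex problem: f(x) = 0.5 ||x - a||^2; minimizer over x >= 0 is max(a, 0)
a = np.array([2.0, -1.0, 0.5, -3.0])
f = lambda x: 0.5 * np.sum((x - a) ** 2)
grad = lambda x: x - a
x_opt = sgp(f, grad, np.ones(4))
```

On this toy problem the projected step with $\alpha = 1$ reaches the constrained minimizer in one iteration, after which $y^{(k)} = x^{(k)}$ triggers the stopping test.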
2.2. The SGP method
Lemma 2.2 shows the effect of the projection operator $P_{\Omega,D^{-1}}$ on the points $(x^* - \alpha D \nabla f(x^*))$, $\alpha > 0$, when $x^*$ is a stationary point of (4). In the case where $\bar{x} \in \Omega$ is a nonstationary point, $P_{\Omega,D^{-1}}(\bar{x} - \alpha D \nabla f(\bar{x}))$ can be exploited to generate a descent direction for the function $f$ at $\bar{x}$. This idea serves as the basis for the method described in Algorithm SGP.

Before discussing the convergence properties of the method, some considerations about its main steps can be useful.

First of all, it is worth stressing that any choice of the steplength $\alpha_k$ in a closed interval and of the scaling matrix $D_k$ in the compact set $\mathcal{D}_L$ is permitted. This is very important from a practical point of view since it allows one to make the updating rules of $\alpha_k$ and $D_k$ problem related and oriented at optimizing the performance. Refer to the next section for a study on the selection strategies for these parameters in the case of the deblurring problems (2) and (3).

If the projection performed in step 2 returns a vector $y^{(k)}$ equal to $x^{(k)}$, then Lemma 2.2 implies that $x^{(k)}$ is a stationary point and the algorithm stops. When $y^{(k)} \neq x^{(k)}$, it is possible to prove that $d^{(k)}$ is a descent direction for $f$ at $x^{(k)}$ (see the next lemma) and the backtracking loop in step 5 terminates in a finite number of runs; thus the algorithm is well defined.

The nonmonotone line-search strategy implemented in step 5 ensures that $f(x^{(k+1)})$ is lower than the maximum of the objective function over the last $M$ iterations [17]; of course, if $M = 1$ then the strategy reduces to the standard monotone Armijo rule [21].
Finally, we remark on the relation between SGP and the projected gradient method proposed in [26]. We observe that

$$ d^{(k)} = y^{(k)} - x^{(k)} = P_{\Omega,D_k^{-1}}(x^{(k)} - \alpha_k D_k \nabla f(x^{(k)})) - x^{(k)} = \left\{ \arg\min_{y \in \Omega} \frac{1}{2} y^T D_k^{-1} y - y^T D_k^{-1} (x^{(k)} - \alpha_k D_k \nabla f(x^{(k)})) \right\} - x^{(k)} $$

and, by introducing a new variable $d$ such that $y = x^{(k)} + d$, we may rewrite $d^{(k)}$ in an equivalent form:

$$ d^{(k)} = \arg\min_{x^{(k)} + d \in \Omega} \frac{1}{2} d^T B_k d + \nabla f(x^{(k)})^T d, \qquad (9) $$

where $B_k = \frac{D_k^{-1}}{\alpha_k}$. Since the general method proposed in [26] allows one to define the search direction as in (9) (or even by means of an inexact solution of the problem (9)), the SGP algorithm could be seen as an alternative formulation of that scheme in which the steplength and the scaling matrix are managed separately, with the aim of better emphasizing their contribution and simplifying the design of effective updating rules for these parameters. This also means that the convergence results obtained in [26] could be adapted to our method; however, for the sake of completeness, we develop a slightly different convergence analysis, appropriately designed for SGP.
2.3. A convergence analysis for SGP
In this subsection we focus on the case in which the algorithm generates an infinite sequence of iterates, denoted by $\{x^{(k)}\}$. The main SGP convergence result is stated in Theorem 2.1, whose proof is based on some crucial properties that we report in the next lemmas.

The first two lemmas are concerned with the descent condition and the boundedness of the directions $d^{(k)}$, respectively.
Lemma 2.3 Assume that $d^{(k)} \neq 0$. Then, $d^{(k)}$ is a descent direction for the function $f$ at $x^{(k)}$, that is, $\nabla f(x^{(k)})^T d^{(k)} < 0$.

Proof. From the inequality (6) with $x = x^{(k)} - \alpha_k D_k \nabla f(x^{(k)})$, $D = D_k^{-1}$ and $y = x^{(k)}$, it follows that

$$ \left( d^{(k)} + \alpha_k D_k \nabla f(x^{(k)}) \right)^T D_k^{-1} d^{(k)} \leq 0, $$

and then

$$ \nabla f(x^{(k)})^T d^{(k)} \leq - \frac{{d^{(k)}}^T D_k^{-1} d^{(k)}}{\alpha_k} < 0. \qquad (10) $$

Lemma 2.4 If the sequence $\{x^{(k)}\}$ is bounded, then the sequence $\{d^{(k)}\}$ is also bounded.

Proof. From the definition of $d^{(k)}$ and (7) we have that, for any $k$,

$$ \| d^{(k)} \| = \| P_{\Omega,D_k^{-1}}(x^{(k)} - \alpha_k D_k \nabla f(x^{(k)})) - x^{(k)} \| = \| P_{\Omega,D_k^{-1}}(x^{(k)} - \alpha_k D_k \nabla f(x^{(k)})) - P_{\Omega,D_k^{-1}}(x^{(k)}) \| \leq L^2 \| \alpha_k D_k \nabla f(x^{(k)}) \| \leq \alpha_{\max} L^3 \| \nabla f(x^{(k)}) \|. $$

Let $\bar{\Omega}$ be a closed and bounded set that contains the iterates $x^{(k)}$. Since $\nabla f$ is a continuous function on $\Omega$, it is bounded on $\bar{\Omega}$ and thus $\{d^{(k)}\}$ is bounded.

Now we prove two properties of the accumulation points of the sequence generated by SGP.
Lemma 2.5 Assume that the subsequence $\{x^{(k)}\}_{k \in K}$, $K \subset \mathbb{N}$, converges to a point $x^* \in \Omega$. Then, $x^*$ is a stationary point of (4) if and only if

$$ \lim_{k \in K} \nabla f(x^{(k)})^T d^{(k)} = 0. $$

Proof. Let $x^*$ be a stationary point of (4); this means that $\nabla f(x^*)^T d \geq 0$ for any vector $d$ such that $x^* + d \in \Omega$. Suppose that $\{\nabla f(x^{(k)})^T d^{(k)}\}$ does not tend to 0 for $k \in K$. In this case, taking into account Lemma 2.3, we know that there exist $\epsilon > 0$ and an infinite set $K_1 \subseteq K$ such that

$$ \nabla f(x^{(k)})^T d^{(k)} \leq -\epsilon < 0, \quad \forall \, k \in K_1. $$

By the compactness of the interval $[\alpha_{\min}, \alpha_{\max}]$ and of the set $\mathcal{D}_L$, we can extract a set of indices $K_2 \subseteq K_1$ such that $\alpha_k \rightarrow \alpha^* \in [\alpha_{\min}, \alpha_{\max}]$ and $D_k \rightarrow D^*$, $D^* \in \mathcal{D}_L$, for $k \in K_2$; hence, by continuity, we can write $\lim_{k \in K_2} d^{(k)} = d^*$, where

$$ d^* = P_{\Omega,{D^*}^{-1}}(x^* - \alpha^* D^* \nabla f(x^*)) - x^*. \qquad (11) $$

Thus,

$$ \lim_{k \in K_2} \nabla f(x^{(k)})^T d^{(k)} = \nabla f(x^*)^T d^* \leq -\epsilon < 0. \qquad (12) $$

Since from the definition (11) we have that $x^* + d^*$ belongs to $\Omega$, the inequality (12) contradicts the stationarity assumption on $x^*$.

On the other hand, let us assume that $\lim_{k \in K} \nabla f(x^{(k)})^T d^{(k)} = 0$. Suppose by contradiction that $x^*$ is not a stationary point. Let $K_3 \subseteq K$ be a set of indices such that $\alpha_k \rightarrow \alpha^*$ and $D_k \rightarrow D^*$ when $k$ diverges, $k \in K_3$; we have $\lim_{k \in K_3} d^{(k)} = P_{\Omega,{D^*}^{-1}}(x^* - \alpha^* D^* \nabla f(x^*)) - x^*$. Furthermore, from Lemma 2.2 there exists $\delta > 0$ such that $\| P_{\Omega,{D^*}^{-1}}(x^* - \alpha^* D^* \nabla f(x^*)) - x^* \|^2 = \delta$. By exploiting (10), we can write, for a sufficiently large $\bar{k} \in K_3$,

$$ \nabla f(x^{(k)})^T d^{(k)} \leq - \frac{{d^{(k)}}^T D_k^{-1} d^{(k)}}{\alpha_k} \leq - \frac{\delta}{2 \, \alpha_{\max} L} < 0, \quad \forall \, k \geq \bar{k}, \; k \in K_3. $$

This contradicts the assumption $\lim_{k \in K} \nabla f(x^{(k)})^T d^{(k)} = 0$ and then $x^*$ must be a stationary point.
Lemma 2.6 Let $x^*$ be an accumulation point of the sequence $\{x^{(k)}\}$ such that $\lim_{k \in K} x^{(k)} = x^*$, for some $K \subset \mathbb{N}$. If $x^*$ is a stationary point of (4), then $x^*$ is an accumulation point also for the sequence $\{x^{(k+r)}\}_{k \in K}$ for any $r \in \mathbb{N}$. Furthermore,

$$ \lim_{k \in K} \| d^{(k+r)} \| = 0, \quad \forall \, r \in \mathbb{N}. $$

Proof. From Lemma 2.5 we have that $\lim_{k \in K} \nabla f(x^{(k)})^T d^{(k)} = 0$ and, from (10), we obtain that $\lim_{k \in K} \| d^{(k)} \| = 0$. Thus $\lim_{k \in K} \| x^{(k+1)} - x^{(k)} \| = 0$, and this implies that $x^*$ is an accumulation point also for the sequence $\{x^{(k+1)}\}_{k \in K}$. Recalling again Lemma 2.5, we obtain that $\lim_{k \in K} \nabla f(x^{(k+1)})^T d^{(k+1)} = 0$; for the same reasons as before, we conclude that $\lim_{k \in K} \| d^{(k+1)} \| = 0$. Hence, the statement of the lemma follows by induction.
At this point we may state a convergence result for SGP.

Theorem 2.1 Assume that the level set $\Omega_0 = \{x \in \Omega : f(x) \leq f(x^{(0)})\}$ is bounded. Every accumulation point of the sequence $\{x^{(k)}\}$ generated by the SGP algorithm is a stationary point of (4).
Proof. Since every iterate $x^{(k)}$ lies in $\Omega_0$, the sequence $\{x^{(k)}\}$ is bounded and has at least one accumulation point. Let $x^* \in \Omega$ be such that $\lim_{k \in K} x^{(k)} = x^*$ for a set of indices $K \subset \mathbb{N}$. Let us consider separately the two cases

a. $\inf_{k \in K} \lambda_k = 0$;
b. $\inf_{k \in K} \lambda_k = \lambda > 0$.

Case a.
Let $K_1 \subseteq K$ be a set of indices such that $\lim_{k \in K_1} \lambda_k = 0$. This implies that, for $k \in K_1$, $k$ sufficiently large, the backtracking rule fails to be satisfied at least once. Thus, at the penultimate step of the backtracking loop, we have

$$ f\!\left( x^{(k)} + \frac{\lambda_k}{\theta} d^{(k)} \right) > f(x^{(k)}) + \beta \frac{\lambda_k}{\theta} \nabla f(x^{(k)})^T d^{(k)}, $$

hence

$$ \frac{f\!\left( x^{(k)} + \frac{\lambda_k}{\theta} d^{(k)} \right) - f(x^{(k)})}{\frac{\lambda_k}{\theta}} > \beta \nabla f(x^{(k)})^T d^{(k)}. \qquad (13) $$

By the mean value theorem, there exists a scalar $t_k \in [0, \frac{\lambda_k}{\theta}]$ such that the left-hand side of (13) is equal to $\nabla f(x^{(k)} + t_k d^{(k)})^T d^{(k)}$. Thus, the inequality (13) becomes

$$ \nabla f(x^{(k)} + t_k d^{(k)})^T d^{(k)} > \beta \nabla f(x^{(k)})^T d^{(k)}. \qquad (14) $$

Since $\alpha_k$ and $D_k$ are bounded, it is possible to find a set of indices $K_2 \subseteq K_1$ such that $\lim_{k \in K_2} \alpha_k = \alpha^*$ and $\lim_{k \in K_2} D_k = D^*$. Thus the sequence $\{d^{(k)}\}_{k \in K_2}$ converges to the vector $d^* = P_{\Omega,{D^*}^{-1}}(x^* - \alpha^* D^* \nabla f(x^*)) - x^*$ and, furthermore, $t_k \| d^{(k)} \| \rightarrow 0$ when $k$ diverges, $k \in K_2$. Taking limits in (14) as $k \rightarrow \infty$, $k \in K_2$, we obtain

$$ (1 - \beta) \, \nabla f(x^*)^T d^* \geq 0. $$

Since $(1 - \beta) > 0$ and $\nabla f(x^{(k)})^T d^{(k)} < 0$ for all $k$, we necessarily have $\lim_{k \in K_2} \nabla f(x^{(k)})^T d^{(k)} = \nabla f(x^*)^T d^* = 0$. Then, by Lemma 2.5, we conclude that $x^*$ is a stationary point.
Case b.
Let us define the point $x^{(\ell(k))}$ as the point such that

$$ f(x^{(\ell(k))}) = f_{\max} = \max_{0 \leq j \leq \min(k, M-1)} f(x^{(k-j)}). $$

Then, for $k > M - 1$, $k \in \mathbb{N}$, the following condition holds:

$$ f(x^{(\ell(k))}) \leq f(x^{(\ell(\ell(k)-1))}) + \beta \lambda_{\ell(k)-1} \nabla f(x^{(\ell(k)-1)})^T d^{(\ell(k)-1)}. \qquad (15) $$

Since the iterates $x^{(k)}$, $k \in \mathbb{N}$, belong to a bounded set, the monotone non-increasing sequence $\{f(x^{(\ell(k))})\}$ admits a finite limit $\bar{f} \in \mathbb{R}$ for $k \rightarrow \infty$. Let $K_3 \subseteq K$ be a set of indices such that $\lim_{k \in K_3} \lambda_{\ell(k)-1} = \lambda_1 > 0$ and $\lim_{k \in K_3} \nabla f(x^{(\ell(k)-1)})^T d^{(\ell(k)-1)}$ exists (recall that, from Lemma 2.4, the sequence $\{d^{(k)}\}_{k \in \mathbb{N}}$ is bounded); taking limits in (15) for $k \in K_3$ we obtain

$$ \bar{f} \leq \bar{f} + \beta \lambda_1 \lim_{k \in K_3} \nabla f(x^{(\ell(k)-1)})^T d^{(\ell(k)-1)}, $$

that is

$$ \lim_{k \in K_3} \nabla f(x^{(\ell(k)-1)})^T d^{(\ell(k)-1)} \geq 0. $$

Recalling that $\nabla f(x^{(k)})^T d^{(k)} < 0$ for any $k$, the previous inequality implies that

$$ \lim_{k \in K_3} \nabla f(x^{(\ell(k)-1)})^T d^{(\ell(k)-1)} = 0. \qquad (16) $$

Then, by Lemma 2.5, (16) implies that every accumulation point of the sequence $\{x^{(\ell(k)-1)}\}_{k \in K_3}$ is a stationary point of (4).

Let us prove that the point $x^*$ is an accumulation point of $\{x^{(\ell(k)-1)}\}_{k \in K_3}$. The definition of $x^{(\ell(k))}$ implies that $k - M + 1 \leq \ell(k) \leq k$. Thus we can write

$$ \| x^{(k)} - x^{(\ell(k)-1)} \| \leq \sum_{j=0}^{k-\ell(k)} \lambda_{\ell(k)-1+j} \| d^{(\ell(k)-1+j)} \|, \quad k \in K. \qquad (17) $$

Let $K_4 \subseteq K_3$ be a subset of indices such that the sequence $\{x^{(\ell(k)-1)}\}_{k \in K_4}$ converges to an accumulation point $\bar{x} \in \Omega$. Recalling that, from (16) and Lemma 2.5, $\bar{x}$ is a stationary point of (4), we can apply Lemma 2.6 to obtain that $\lim_{k \in K_4} \| d^{(\ell(k)-1+j)} \| = 0$ for any $j \in \mathbb{N}$. By using (17) we conclude that

$$ \lim_{k \in K_4} \| x^{(k)} - x^{(\ell(k)-1)} \| = 0. \qquad (18) $$

Since

$$ \| x^* - x^{(\ell(k)-1)} \| \leq \| x^{(k)} - x^{(\ell(k)-1)} \| + \| x^{(k)} - x^* \| $$

and $\lim_{k \in K} x^{(k)} = x^*$, then (18) implies that $x^*$ is an accumulation point also for the sequence $\{x^{(\ell(k)-1)}\}_{k \in K_3}$. Hence, we conclude that $x^*$ is a stationary point of (4).
3. The SGP method for image deblurring
In this section we describe SGP implementations for solving special constrained minimization problems arising in image deblurring. We consider the maximum likelihood approach to the deconvolution of images corrupted by Poisson noise [1, 12, 15]. It is well known that in this approach an estimate of the unknown image is obtained by approximating the solutions of minimization problems of the form (2) or (3), in which, following the notation used in the introduction, the objective function is the Kullback-Leibler divergence of the blurred image $Ax$ from the observed noisy image $b$. In particular, we deal with models that also take into account the existence of a constant background radiation $bg > 0$ and, consequently, we must look for approximate solutions of (2) or (3) when $J(x)$ is the Kullback-Leibler divergence of $(Ax + bg)$ from $b$:

$$ J(x) = D_{KL}(Ax + bg, b) = \sum_{i=1}^N \left\{ \sum_{j=1}^N A_{ij} x_j + bg - b_i - b_i \log \frac{\sum_{j=1}^N A_{ij} x_j + bg}{b_i} \right\}. \qquad (19) $$

We remark that, due to the ill-posedness of the image restoration problem, when $J(x)$ is defined as in (19), one is not interested in computing the exact solution of (2) or (3), because this does not provide a sensible estimate of the unknown image; for this reason, iterative minimization methods are usually exploited to obtain acceptable (regularized) solutions by early stopping. A different approach to image restoration consists in minimizing special regularized functionals given by the sum of a fit-to-data function and a penalty term; in this case the trade-off between data fidelity and stability is controlled by means of a regularization parameter and a suitable image restoration is obtained by an exact solution of the minimization problem. Even if SGP can be used in this context, we do not consider such approaches here and we refer to [19, 20, 27] for examples of minimization methods applied to these problems.
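For reference, the objective (19) can be evaluated directly from arrays; the sketch below is an illustration only, and the convention that pixels with $b_i = 0$ contribute just the term $y_i$ (since $b_i \log(y_i/b_i) \rightarrow 0$) is an assumption on how zero counts are handled.

```python
import numpy as np

def kl_divergence(Ax, b, bg):
    """Generalized Kullback-Leibler divergence of (Ax + bg) from b, as in (19)."""
    y = Ax + bg                      # model intensities; strictly positive when bg > 0
    mask = b > 0
    out = np.sum(y - b)              # sum of (y_i - b_i) over all pixels
    out -= np.sum(b[mask] * np.log(y[mask] / b[mask]))   # skip pixels with b_i = 0
    return out

# tiny illustrative data: when b equals Ax + bg exactly, the divergence is zero
Ax = np.array([1.0, 2.0, 3.0])
bg = 0.5
b = Ax + bg
val = kl_divergence(Ax, b, bg)
```

The generalized KL divergence is nonnegative and vanishes only on exact data, which is the property exercised below.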
Some comments about the objective function (19) are necessary for the subsequent discussion. The gradient and the Hessian of $J(x)$ can be written as

$$ \nabla J(x) = A^T e - A^T Y^{-1} b, \qquad (20) $$

$$ \nabla^2 J(x) = A^T B Y^{-2} A, \qquad (21) $$

where $e \in \mathbb{R}^N$ is a vector whose components are all equal to one, $Y = \mathrm{diag}(Ax + bg)$ is a diagonal matrix with the entries of $(Ax + bg)$ on the main diagonal and $B = \mathrm{diag}(b)$. The matrix $A$ can be considered with nonnegative entries, generally dense and such that $A^T e = e$ and $\sum_j A_{ij} > 0$, $\forall \, i$; moreover, we may assume that periodic boundary conditions are imposed for the discretization of the Fredholm integral equation that models the image formation process, so that the matrix $A$ is block-circulant with circulant blocks. Other boundary conditions can be used, but the crucial point is that matrix-vector products can be done quickly. For example, one can replace periodic boundary conditions with zero, reflexive or anti-reflexive boundary conditions. In all of these cases the products $Ax$ can be performed with $O(N \log N)$ complexity, by employing the Fast Fourier Transform (FFT) [28], or a fast trigonometric transform such as the Discrete Cosine Transform (DCT) or the Discrete Sine Transform (DST) [29]. Concerning the components $b_i$ of the observed image, we remark that they are nonnegative; this implies that the Hessian matrix (21) is positive semidefinite at any point of the nonnegative orthant, so that the problems (2) and (3) are convex minimization problems. Thus, when $J(x)$ is defined as in (19), each accumulation point of the sequence generated by applying SGP to (2) or (3) is a solution of the problem (we recall that the assumption of Theorem 2.1 is trivially satisfied for these problems). Concerning the computational cost of SGP in this particular application, we observe that the heaviest tasks in each iteration are the two matrix-vector products needed to compute $\nabla J(x^{(k)})$ in Step 2 and $J(x^{(k)} + \lambda_k d^{(k)})$ in Step 5; that is, two (two-dimensional) FFT/IFFT pairs ($O(N \log N)$ complexity). It is possible to implement the other significant operations in each iteration in such a way that their costs are no larger than those of the above products. To this end, in the following subsections we show how the projection on the feasible region of (3) can be performed and how the scaling matrix $D_k$ and the steplength parameter $\alpha_k$ can be updated.
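Under periodic boundary conditions the products $Av$ and $A^T v$ in (20) are a circular convolution and correlation, each costing one FFT/IFFT pair. The sketch below is illustrative (the small Gaussian PSF and the random data are arbitrary choices): it evaluates $\nabla J(x)$ this way and checks one directional derivative against a central finite-difference quotient.

```python
import numpy as np

n = 16
rng = np.random.default_rng(1)
i = np.minimum(np.arange(n), n - np.arange(n))
psf = np.exp(-(i[:, None] ** 2 + i[None, :] ** 2) / 8.0)
psf /= psf.sum()                       # unit-sum PSF, so A^T e = e
psf_hat = np.fft.fft2(psf)

A  = lambda v: np.real(np.fft.ifft2(np.fft.fft2(v) * psf_hat))           # A v
AT = lambda v: np.real(np.fft.ifft2(np.fft.fft2(v) * np.conj(psf_hat)))  # A^T v

bg = 0.1
x = rng.uniform(0.5, 1.5, (n, n))
b = A(x) + bg + 0.01 * rng.uniform(size=(n, n))   # positive synthetic data

J = lambda x: np.sum(A(x) + bg - b - b * np.log((A(x) + bg) / b))   # objective (19)
def grad_J(x):
    y = A(x) + bg
    # (20): grad J = A^T e - A^T (b / y); here A^T e equals e exactly
    return AT(np.ones((n, n))) - AT(b / y)

# central finite-difference check of one directional derivative
d = rng.standard_normal((n, n)); d /= np.linalg.norm(d)
eps = 1e-6
fd = (J(x + eps * d) - J(x - eps * d)) / (2 * eps)
an = np.sum(grad_J(x) * d)
```

The agreement of `fd` and `an` confirms that the FFT-based evaluation of (20) is consistent with (19).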
3.1. Computing the projection

From now on we will assume that the scaling matrix $D_k$ is diagonal, $D_k = \mathrm{diag}(d_1^{(k)}, d_2^{(k)}, \ldots, d_N^{(k)})$, as usually done in scaled gradient projection methods; some examples of diagonal scaling matrices will be given in the next subsection.

Under this assumption, the projections arising from the application of SGP to the problem (2) are trivial operations and do not require special investigation.

Let $\Omega$ be the feasible region of the minimization problem (3):

$$ \Omega = \left\{ x \in \mathbb{R}^N \mid x \geq 0, \; \sum_{i=1}^N x_i = c \right\}. \qquad (22) $$

When algorithm SGP is applied for solving (3), at each iteration we need to compute $P_{\Omega,D_k^{-1}}(x^{(k)} - \alpha_k D_k \nabla f(x^{(k)}))$, that is, we must solve the constrained strictly convex quadratic program

$$ \min \; \frac{1}{2} x^T D_k^{-1} x - x^T z \quad \text{sub. to } \; \sum_{i=1}^N x_i - c = 0, \;\; x_i \geq 0, \; i = 1, \ldots, N, \qquad (23) $$

where $z = D_k^{-1} \left( x^{(k)} - \alpha_k D_k \nabla f(x^{(k)}) \right)$.

Due to the special structure of the constraints, the problem (23) can be reformulated as a one-dimensional root-finding problem. In fact, if $\bar{x}$ denotes the solution of (23), then from the KKT first-order optimality conditions we know that there exist Lagrange multipliers $\bar{\lambda} \in \mathbb{R}$ and $\bar{\mu} \in \mathbb{R}^N$ such that

$$ D_k^{-1} \bar{x} - z - \bar{\lambda} e - \bar{\mu} = 0, \qquad \bar{x} \geq 0, \qquad \bar{\mu} \geq 0, \qquad \bar{\mu}^T \bar{x} = 0, \qquad \sum_{i=1}^N \bar{x}_i - c = 0. $$
Figure 1. Time in seconds to perform 100 projections. [Bar plot: problem sizes 128², 512² and 1024² on the horizontal axis; time (sec) from 0 to 20 on the vertical axis.]
From the first four KKT conditions it is easy to obtain $\bar{x}$ and $\bar{\mu}$ as functions of $\bar{\lambda}$:

$$ \bar{x}_i(\bar{\lambda}) = \max\left\{ 0, \, d_i^{(k)} (z_i + \bar{\lambda}) \right\}, \qquad \bar{\mu}_i(\bar{\lambda}) = \max\left\{ 0, \, -(z_i + \bar{\lambda}) \right\}, \quad i = 1, \ldots, N. $$

Thus, in order to solve the KKT system, we must find $\bar{\lambda}$ such that

$$ \sum_{i=1}^N \bar{x}_i(\bar{\lambda}) - c = 0. \qquad (24) $$

This means that the computation of the projection $P_{\Omega,D_k^{-1}}(x^{(k)} - \alpha_k D_k \nabla f(x^{(k)}))$ essentially reduces to solving a root-finding problem for a piecewise linear, monotonically non-decreasing function. Specialized linear-time algorithms for this root-finding problem can be derived from the wide literature available on the continuous quadratic knapsack problem [3, 4, 5, 6, 7]. In the SGP implementation proposed in this work we face the problem (24) with the secant-based method suggested in [4], which has also shown very good performance within the gradient projection methods used for the quadratic programs arising in training the learning methodology Support Vector Machines [30, 31]. To emphasize the scaling properties of the projection algorithm [4], we show in Figure 1 the computational time it requires with respect to the size of the problem. The projection problems arise from the application of SGP to the reconstruction of images of different sizes; for each size, the time to perform one hundred projections is reported. The experiments are carried out on an AMD Opteron Dual Core 2.4 GHz processor using Matlab 7.5.0.

From Figure 1 we may observe the linear scaling of the secant-based method [4] and the low computational cost of each projection: for example, in the case $N = 1024^2$ a projection is performed in approximately 0.2 seconds.
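The reduction of (23) to the scalar equation (24) can be sketched compactly. For simplicity this illustration solves (24) by bisection on $\bar{\lambda}$ (an assumption made for compactness; the paper's linear-time secant scheme of [4] is faster), exploiting the monotonicity of $\sum_i \bar{x}_i(\bar{\lambda})$.

```python
import numpy as np

def project_simplex_scaled(xk, grad, alpha, d, c, iters=100):
    """Solve (23): min 0.5 x^T D^{-1} x - x^T z  s.t.  sum(x) = c, x >= 0,
    with D = diag(d), via bisection on the multiplier lambda in (24)."""
    z = (xk - alpha * d * grad) / d           # z = D^{-1}(x^(k) - alpha D grad f)
    x_of = lambda lam: np.maximum(0.0, d * (z + lam))
    # bracket the root of phi(lam) = sum(x(lam)) - c, a nondecreasing function:
    # at lam = -max(z) all components vanish; at the upper bound the sum exceeds c
    lo, hi = -np.max(z), -np.min(z) + c / np.min(d)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.sum(x_of(mid)) < c:
            lo = mid
        else:
            hi = mid
    return x_of(0.5 * (lo + hi))

rng = np.random.default_rng(2)
xk, grad = rng.uniform(0, 1, 50), rng.standard_normal(50)
d = rng.uniform(0.5, 2.0, 50)                 # diagonal of D_k
p = project_simplex_scaled(xk, grad, alpha=0.5, d=d, c=1.0)
```

The returned point is feasible for (22) by construction: nonnegative, and with total flux equal to `c` up to the bisection tolerance.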
3.2. Updating the scaling matrix

The choice of the scaling matrix $D_k$ in SGP must aim at two main goals: avoiding the introduction of significant computational costs and improving the convergence rate. As previously motivated, a diagonal scaling allows one to make the projection in step 2 of SGP a non-excessively expensive task; thus, we will concentrate on this kind of scaling matrix. A classical choice is to use a scaling matrix $D_k = \mathrm{diag}(d_1^{(k)}, d_2^{(k)}, \ldots, d_N^{(k)})$ that approximates the inverse of the Hessian matrix $\nabla^2 J(x)$; for example, by requiring

$$ d_i^{(k)} \approx \left( \frac{\partial^2 J(x^{(k)})}{\partial x_i^2} \right)^{-1}, \quad i = 1, \ldots, N. $$

In this case an updating rule for the entries of $D_k$ could be

$$ d_i^{(k)} = \min\left\{ L, \, \max\left\{ \frac{1}{L}, \left( \frac{\partial^2 J(x^{(k)})}{\partial x_i^2} \right)^{-1} \right\} \right\}, \quad i = 1, \ldots, N, \qquad (25) $$

where $L$ is an appropriate threshold. Another appealing choice is suggested by the diagonal scaling used to rewrite the EM method as a special scaled gradient method for minimizing $J(x)$ [15]:

$$ \bar{x}^{(k+1)} = \bar{X}_k A^T Y_k^{-1} b = \bar{x}^{(k)} - \bar{X}_k \nabla J(\bar{x}^{(k)}), $$

where $\bar{X}_k = \mathrm{diag}(\bar{x}^{(k)})$ and $Y_k = \mathrm{diag}(A \bar{x}^{(k)} + bg)$ (refer to section 4.3 for more details on the EM method and a comparison with SGP). Following this idea we may introduce the updating rule

$$ d_i^{(k)} = \min\left\{ L, \, \max\left\{ \frac{1}{L}, \, x_i^{(k)} \right\} \right\}, \quad i = 1, \ldots, N. \qquad (26) $$

From a computational viewpoint, the updating rule (25) is more expensive than (26), due to the computation of the diagonal entries of the Hessian (see (21)). With regard to the effect of the scaling on the SGP convergence rate, we will show the behaviour of the above updating rules on several test problems in the next section. We now go into detail about another crucial issue for the convergence rate of a gradient method: the choice of the steplength.
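Both updating rules are cheap elementwise operations; the sketch below (illustrative; the threshold values are arbitrary) implements the clamped rules (25) and (26), taking the diagonal of the Hessian as a given array.

```python
import numpy as np

def scaling_from_hessian_diag(hess_diag, L):
    """Rule (25): clamp the inverse diagonal Hessian entries to [1/L, L]."""
    with np.errstate(divide="ignore"):
        inv = 1.0 / hess_diag            # zero entries give inf, then clamp to L
    return np.clip(inv, 1.0 / L, L)

def scaling_from_iterate(x, L):
    """Rule (26): EM-like scaling, clamp the current iterate entries to [1/L, L]."""
    return np.clip(x, 1.0 / L, L)

L = 1e10
x = np.array([0.0, 0.5, 3.0, 1e12])      # e.g. pixel values of the current iterate
d_em = scaling_from_iterate(x, L)        # -> [1e-10, 0.5, 3.0, 1e10]
```

The clamping to $[1/L, L]$ is what keeps $D_k$ inside the compact set $\mathcal{D}_L$ required by the convergence theory of section 2.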
3.3. Updating the steplength

Steplength selection rules in gradient methods have received increasing interest in recent years from both the theoretical and the practical point of view. On one hand, following the original ideas of Barzilai and Borwein (BB) [32], several steplength updating strategies have been devised to accelerate the slow convergence exhibited in most cases by standard gradient methods, and a lot of effort has been put into explaining the effects of these strategies [33, 8, 4, 34, 9, 35, 10]. On the other hand, numerical experiments on randomly generated, library and real-life test problems have confirmed the remarkable convergence rate improvements provided by some BB-like steplength selections [8, 4, 36, 9, 37, 30, 10]. Thus, it seems natural to equip SGP with a steplength selection that takes into account the recent advances on BB-like updating rules.

First of all we must rewrite, in the case of a scaled gradient method, the two BB rules usually exploited by the main steplength updating strategies. To this end, we can regard the matrix $B(\alpha_k) = (\alpha_k D_k)^{-1}$ as an approximation of the Hessian $\nabla^2 J(x^{(k)})$ and derive two updating rules for $\alpha_k$ by forcing quasi-Newton properties on $B(\alpha_k)$:

$$ \alpha_k^{BB1} = \arg\min_{\alpha_k \in \mathbb{R}} \| B(\alpha_k) s^{(k-1)} - z^{(k-1)} \| \qquad (27) $$

and

$$ \alpha_k^{BB2} = \arg\min_{\alpha_k \in \mathbb{R}} \| s^{(k-1)} - B(\alpha_k)^{-1} z^{(k-1)} \|, \qquad (28) $$

where $s^{(k-1)} = x^{(k)} - x^{(k-1)}$ and $z^{(k-1)} = \nabla J(x^{(k)}) - \nabla J(x^{(k-1)})$. In this way, the steplengths

$$ \alpha_k^{(1)} = \frac{{s^{(k-1)}}^T D_k^{-1} D_k^{-1} s^{(k-1)}}{{s^{(k-1)}}^T D_k^{-1} z^{(k-1)}} \qquad (29) $$

and

$$ \alpha_k^{(2)} = \frac{{s^{(k-1)}}^T D_k z^{(k-1)}}{{z^{(k-1)}}^T D_k D_k z^{(k-1)}} \qquad (30) $$

are obtained, which reduce to the standard BB rules in the case of non-scaled gradient methods, that is, when $D_k$ is equal to the identity matrix for all $k$:

$$ \alpha_k^{BB1} = \frac{{s^{(k-1)}}^T s^{(k-1)}}{{s^{(k-1)}}^T z^{(k-1)}}, \qquad \alpha_k^{BB2} = \frac{{s^{(k-1)}}^T z^{(k-1)}}{{z^{(k-1)}}^T z^{(k-1)}}. \qquad (31) $$

At this point, inspired by the steplength alternations successfully implemented in the framework of non-scaled gradient methods [9, 10], we propose a steplength updating rule for SGP which adaptively alternates the values provided by (29) and (30). The details of the SGP steplength selection are given in Algorithm SS. This rule decides the alternation between the two different selection strategies by means of a variable threshold $\tau_k$ instead of a constant parameter as done in [9] and [10]. This trick makes the choice of $\tau_0$ less important for the SGP performance and, in our experience, seems able to avoid the drawbacks due to the use of the same steplength rule in too many consecutive iterations. Finally, we remark that a deeper analysis of the SGP steplength selections would be worthwhile; in fact, at least to our knowledge, steplength rules for scaled gradient methods are not well investigated in the literature and the generalization of the standard BB-like selections to this context is an interesting open problem. However, such an analysis is beyond the aims of this work and we limit ourselves to showing, in the next section, the effectiveness of the proposed strategy in comparison with some widely used steplength selections.
Algorithm SS (SGP Steplength Selection)

if $k = 0$
    set $\alpha_0 \in [\alpha_{\min}, \alpha_{\max}]$, $\tau_1 \in (0, 1)$ and a nonnegative integer $M_\alpha$;
else
    if $s^{(k-1)^T} D_k^{-1} z^{(k-1)} \le 0$ then
        $\alpha_k^{(1)} = \alpha_{\max}$;
    else
        $\alpha_k^{(1)} = \max \left\{ \alpha_{\min}, \, \min \left\{ \frac{s^{(k-1)^T} D_k^{-1} D_k^{-1} s^{(k-1)}}{s^{(k-1)^T} D_k^{-1} z^{(k-1)}}, \, \alpha_{\max} \right\} \right\}$;
    endif
    if $s^{(k-1)^T} D_k z^{(k-1)} \le 0$ then
        $\alpha_k^{(2)} = \alpha_{\max}$;
    else
        $\alpha_k^{(2)} = \max \left\{ \alpha_{\min}, \, \min \left\{ \frac{s^{(k-1)^T} D_k z^{(k-1)}}{z^{(k-1)^T} D_k D_k z^{(k-1)}}, \, \alpha_{\max} \right\} \right\}$;
    endif
    if $\alpha_k^{(2)} / \alpha_k^{(1)} \le \tau_k$ then
        $\alpha_k = \min \left\{ \alpha_j^{(2)} : \, j = \max\{1, k - M_\alpha\}, \ldots, k \right\}$;  $\tau_{k+1} = \tau_k \cdot 0.9$;
    else
        $\alpha_k = \alpha_k^{(1)}$;  $\tau_{k+1} = \tau_k \cdot 1.1$;
    endif
endif
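Algorithm SS translates almost line for line into code. The following Python sketch (our naming and structure, not the authors' implementation) keeps the recent $\alpha^{(2)}$ values in a short list to realize the minimum over the last $M_\alpha + 1$ iterations:

```python
import numpy as np

class SteplengthSelector:
    """A sketch of Algorithm SS: adaptive alternation of the scaled BB rules."""

    def __init__(self, alpha0=1.3, alpha_min=1e-10, alpha_max=1e5,
                 tau=0.5, m_alpha=2):
        self.alpha_min, self.alpha_max = alpha_min, alpha_max
        self.tau = tau                 # variable threshold tau_k
        self.m_alpha = m_alpha         # memory for the alpha^(2) values
        self.alpha = alpha0            # alpha_0, used at the first iteration
        self.alpha2_history = []

    def update(self, s, z, d):
        """Return alpha_k given s = x^(k) - x^(k-1), z = gradient difference,
        and d = diagonal of the scaling matrix D_k."""
        s_over_d = s / d
        denom1 = s_over_d @ z          # s^T D^{-1} z
        if denom1 <= 0:
            alpha1 = self.alpha_max
        else:
            alpha1 = np.clip((s_over_d @ s_over_d) / denom1,
                             self.alpha_min, self.alpha_max)
        dz = d * z
        num2 = s @ dz                  # s^T D z
        if num2 <= 0:
            alpha2 = self.alpha_max
        else:
            alpha2 = np.clip(num2 / (dz @ dz), self.alpha_min, self.alpha_max)

        # keep the alpha^(2) values of the last m_alpha + 1 iterations
        self.alpha2_history = (self.alpha2_history + [alpha2])[-(self.m_alpha + 1):]
        if alpha2 / alpha1 <= self.tau:
            self.alpha = min(self.alpha2_history)
            self.tau *= 0.9
        else:
            self.alpha = alpha1
            self.tau *= 1.1
        return self.alpha
```

With the default parameters this mirrors the SGP-SS setting used later in Section 4.2 ($\tau_1 = 0.5$, $M_\alpha = 2$).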
4. Numerical experiments
The experiments of this section aim to show the practical usefulness of SGP for image
restoration problems. Firstly, in order to determine an SGP setting suited for this
application, we carefully evaluate the updating rules for the scaling matrix and the
steplength parameter proposed in the previous section. Secondly, we compare SGP
with the standard EM method [15], the accelerated EM version described in [23] and
the WMRNSD algorithm proposed in [24]. All the methods are implemented in Matlab
7.5.0 and the experiments are performed on a computer equipped with an AMD
Opteron Dual Core 2.4 GHz processor.
4.1. Test problems and performance measures
The considered methods are tested on several optimization problems of the form (2) and
(3), with the objective function J(x) defined as in (19), corresponding to the deblurring
problems arising from a set of astronomical images corrupted by Poisson noise. The test
problems are generated by convolving the original 256 × 256 images shown in Figure
2, denoted by the letters A, B, C, with a point spread function (PSF). Then, a
constant background term is added and the resulting images are perturbed with Poisson
noise.

Figure 2. Original images (Image A, Image B, Image C).

From the original images, several blurred noisy images have been generated, with
different PSFs and noise levels; however, the relative behaviour of the methods on the
corresponding deblurring problems is very similar and, consequently, we only discuss
the results corresponding to an ideal PSF and three levels of noise. In particular,
following [38], our PSF is defined as
$$\left[ \frac{2 J_1(R)}{R} \right]^2,$$
where $J_1$ is the Bessel function of the first kind and $R = \sqrt{x^2 + y^2}$, $x, y \in \mathbb{R}$. The
PSF is computed on a 256 × 256 grid defined by considering uniformly spaced values in
[-36.4113, 36.4113]; the resulting matrix is then normalized such that $A^T e = e$. Since
we are considering the case of Poisson noise, the different noise levels are obtained by
changing the total flux (total number of counts) of the original image: the noise level
increases as the total flux decreases. For each of the considered images, we show
in Figure 3 the three blurred noisy images used in these experiments: the total flux is
$4.43 \times 10^{9}$ for the images in the left panels, $7.02 \times 10^{8}$ for the images in the middle panels
and $4.43 \times 10^{7}$ for the images in the right panels. In all the experiments the background
level is $bg = 6.76 \times 10^{3}$.
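As an illustration of this setup (ours, not the authors' code), the test-problem generation can be sketched as follows; `scipy.special.j1` supplies the Bessel function $J_1$, the grid and normalization follow the description above, and the periodic convolution model is our simplifying assumption:

```python
import numpy as np
from scipy.special import j1

def make_test_problem(x_true, total_flux, bg, half_width=36.4113, seed=0):
    """Blur x_true with the PSF [2 J1(R)/R]^2, add background bg, add Poisson noise.

    x_true is an n-by-n nonnegative image; it is rescaled so its sum is total_flux.
    Returns the blurred noisy image b and the normalized PSF.
    """
    n = x_true.shape[0]
    t = np.linspace(-half_width, half_width, n)
    xx, yy = np.meshgrid(t, t)
    r = np.hypot(xx, yy)
    with np.errstate(invalid="ignore", divide="ignore"):
        psf = (2.0 * j1(r) / r) ** 2
    psf[r == 0] = 1.0              # limit of 2 J1(R)/R as R -> 0 is 1
    psf /= psf.sum()               # normalization (flux-preserving blur)
    x = x_true * (total_flux / x_true.sum())
    # periodic blurring model: convolution via FFT (our assumption)
    blurred = np.real(np.fft.ifft2(np.fft.fft2(x) *
                                   np.fft.fft2(np.fft.ifftshift(psf))))
    rng = np.random.default_rng(seed)
    b = rng.poisson(np.maximum(blurred + bg, 0)).astype(float)
    return b, psf
```

Lowering `total_flux` produces noisier data, as described above.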
In order to evaluate the performance of the deblurring methods, we measure for
each method the relative reconstruction error, defined as $\|x^{(k)} - x\| / \|x\|$, where $x$ is
the image to be reconstructed and $x^{(k)}$ is the reconstruction after $k$ iterations; then
we report the minimum relative reconstruction error (err opt), the number of iterations
(it opt) and the computational time in seconds (sec) required to provide the minimum
error. The cases where a method reaches the prefixed maximum number of iterations
are marked with an asterisk; in these cases the relative error and the time corresponding to
that number of iterations are given.
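The bookkeeping behind err opt and it opt is straightforward; a small sketch (our code, assuming the iterates have been stored):

```python
import numpy as np

def track_reconstruction_error(iterates, x_true):
    """Return (err_opt, it_opt): the minimum relative reconstruction error
    ||x^(k) - x_true|| / ||x_true|| over the stored iterates, and the
    index at which it is attained."""
    norm_true = np.linalg.norm(x_true)
    errors = [np.linalg.norm(xk - x_true) / norm_true for xk in iterates]
    it_opt = int(np.argmin(errors))
    return errors[it_opt], it_opt
```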
Figure 3. Blurred noisy images A1-A3 (top row), B1-B3 (middle row) and C1-C3
(bottom row). The left, middle and right panels refer to low, medium and high noise
levels, respectively.
4.2. Scaling matrix and steplength parameter in SGP
We study the SGP behaviour for different choices of the scaling matrix and of the
steplength rule. We test three scaling matrices: $D_k = I$, where $I$ denotes the identity
matrix; $D_k$ selected as in (25); and $D_k$ defined as in (26). In the last two cases we use
$L = 10^{10}$. Concerning the steplength, we evaluate the following updating rules:
SGP-BB1: $\alpha_k = \alpha_k^{(1)}$, where $\alpha_k^{(1)}$ is defined as in Algorithm SS;
SGP-BB2: $\alpha_k = \alpha_k^{(2)}$, where $\alpha_k^{(2)}$ is defined as in Algorithm SS;
SGP-ABB: $\alpha_k$ defined by alternating $\alpha_k^{(1)}$ and $\alpha_k^{(2)}$ as in the ABB method described
in [10] (that is, by setting $M_\alpha = 0$ and $\tau_k = 0.15$ for any $k$ in Algorithm SS);
SGP-SS: $\alpha_k$ defined by Algorithm SS, with $\tau_1 = 0.5$ and $M_\alpha = 2$.
In all the above selections we set $\alpha_{\min} = 10^{-10}$, $\alpha_{\max} = 10^{5}$ and $\alpha_0 = 1.3$, except for
SGP-BB1, where we found it convenient to start with a smaller steplength: $\alpha_0 = 10^{-2}$.
The above scaling matrices and steplength rules are examined in both the monotone
and the nonmonotone version of SGP; the line-search parameters are set to 0.4 (step
reduction factor) and $10^{-4}$ (sufficient decrease parameter), with $M = 1$ in the monotone
SGP and $M = 10$ in the nonmonotone SGP. In every experiment
the pixels of the starting image $x^{(0)}$ are set as follows: $x^{(0)}_i = c/N$, $i = 1, \ldots, N$, where
$c = \sum_{i=1}^{N} (b_i - bg)$ denotes the right-hand side of the equality constraint in the problem
(3).
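The starting image above is feasible for (3) by construction. More generally, projecting an arbitrary point onto the feasible set $\{x \ge 0, \ \sum_i x_i = c\}$ of problem (3) reduces to a one-dimensional search for a Lagrange multiplier; the paper relies on efficient knapsack-projection algorithms for this task, while the sketch below (ours) uses plain bisection for clarity:

```python
import numpy as np

def project_flux(y, c, tol=1e-12, max_iter=200):
    """Euclidean projection of y onto {x >= 0, sum(x) = c} by bisection
    on the multiplier lam in x(lam) = max(y - lam, 0)."""
    # bracket: g(lam) = sum(max(y - lam, 0)) - c is nonincreasing,
    # g(lo) >= 0 and g(hi) <= 0 for the choices below
    lo, hi = y.min() - c / y.size, y.max()
    lam = 0.5 * (lo + hi)
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        g = np.maximum(y - lam, 0.0).sum() - c
        if abs(g) <= tol:
            break
        if g > 0:
            lo = lam
        else:
            hi = lam
    return np.maximum(y - lam, 0.0)
```

A secant or breakpoint-search scheme, as in the references cited earlier in the paper, reaches the same multiplier in far fewer steps; bisection is used here only because it is short and robust.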
The numerical results obtained by solving the problem (2) are reported in Tables
1 and 3 for the nonmonotone and monotone versions, respectively. The behaviour on
the problem (3) is described in Table 2 for the nonmonotone SGP and in Table 4 for
the monotone SGP. The main conclusion that can be drawn from these experiments
is that the updating rule (26) for the scaling matrix $D_k$, combined with the steplength
selection suggested in Algorithm SS, generally gives the best performance in terms of
computational time, with a reconstruction error comparable to those provided by the
other choices of $D_k$ and $\alpha_k$. This special version of SGP is able to achieve a very
good convergence rate and benefits from an inexpensive updating of the matrix $D_k$.
Interesting results in terms of number of iterations are observed also when $D_k$ is updated
by the rule (25); nevertheless, the additional costs introduced by this rule seem to
imply significant reductions of the overall performance. Concerning the steplength, the
selections exploited in SGP-ABB and SGP-SS provide the best results, confirming
the effectiveness of the strategies based on adaptive alternations of the BB-like rules
[9, 10]. In particular, the alternation implemented in Algorithm SS yields remarkable
improvements when $D_k$ is updated as in (26). For brevity, in the following we will
denote by SGP the scheme that uses Algorithm SS for defining the steplength and
exploits the rule (26) for updating the scaling matrix.
For the sake of completeness, we show in Figure 4 the SGP relative reconstruction
error as a function of the number of iterations. In each panel of the figure the errors
obtained by applying SGP to the problems (2) and (3) are reported (denoted by
SGP-(2) and SGP-(3), respectively); for each test image, both the nonmonotone (the
left panels) and the monotone (the right panels) version of SGP is considered. No
significant differences are observed between the reconstruction errors corresponding to
the two problems. In particular, in both cases the error drops to a value close to the
minimum in very few iterations and remains close to this value for a large number
of iterations; this suggests that the choice of the optimal number of iterations is not
critical in the case of real images. From a computational point of view, it
is worth recalling that, when SGP is applied to the problem (3), each iteration is slightly
more expensive due to the more complicated projection operation. If we compare the
Table 1. Nonmonotone SGP on the deblurring problem (2).

                    D_k = I                D_k as in (25)         D_k as in (26)
             it opt  err opt   sec    it opt  err opt   sec    it opt  err opt   sec
Test problem: Image A1
SGP-BB1 2443 0.1865 147.43 607 0.1862 61.90 748 0.1850 53.08
SGP-BB2 898 0.1865 46.75 736 0.1862 69.29 678 0.1851 38.24
SGP-ABB 979 0.1866 49.92 443 0.1862 32.54 624 0.1851 36.94
SGP-SS 624 0.1866 33.91 346 0.1862 27.03 380 0.1851 23.40
Test problem: Image B2
SGP-BB1 1101 0.0586 66.34 307 0.0571 30.76 550 0.0544 48.66
SGP-BB2 1107 0.0586 59.05 380 0.0571 34.58 229 0.0542 12.83
SGP-ABB 1084 0.0586 55.57 367 0.0571 27.01 205 0.0547 12.24
SGP-SS 1086 0.0586 60.70 292 0.0571 23.06 157 0.0542 9.82
monotone and the nonmonotone version of SGP, almost identical behaviours in terms of
reconstruction error are observed. Thus, taking into account that the two SGP versions
exhibit similar convergence rates and that the nonmonotone line-search strategy requires
fewer function evaluations than the monotone version, the nonmonotone SGP
seems preferable (see also the numerical results in Tables 1-4).
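The nonmonotone line search referred to here accepts a step as soon as a sufficient decrease holds with respect to the maximum of the last $M$ objective values, in the spirit of [17]; with $M = 1$ it reduces to the monotone Armijo rule. A schematic sketch (our naming; `beta` and `gamma` play the roles of the reduction and sufficient decrease parameters):

```python
def nonmonotone_backtracking(J, x, d, grad_dot_d, f_history,
                             beta=0.4, gamma=1e-4, max_backtracks=50):
    """Backtracking along a descent direction d: accept the first step
    lam in {1, beta, beta^2, ...} with
        J(x + lam*d) <= max(f_history) + gamma * lam * grad_dot_d,
    where grad_dot_d = grad J(x)^T d < 0 and f_history holds the last M
    objective values (M = 1 gives the monotone line search)."""
    f_ref = max(f_history)
    lam = 1.0
    for _ in range(max_backtracks):
        if J(x + lam * d) <= f_ref + gamma * lam * grad_dot_d:
            return lam
        lam *= beta
    return lam
```

Because acceptance is tested against `max(f_history)`, fewer backtracking steps (and hence fewer function evaluations) are typically needed than in the monotone case.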
Table 2. Nonmonotone SGP on the deblurring problem (3).

                    D_k = I                D_k as in (25)         D_k as in (26)
             it opt  err opt   sec    it opt  err opt   sec    it opt  err opt   sec
Test problem: Image A1
SGP-BB1 4000 0.1871 266.97 794 0.1861 91.93 1178 0.1850 102.92
SGP-BB2 4000 0.1873 241.27 673 0.1861 76.45 666 0.1850 44.12
SGP-ABB 4000 0.1873 266.38 338 0.1861 28.16 399 0.1851 29.27
SGP-SS 4000 0.1868 262.98 385 0.1862 33.38 336 0.1851 22.37
Test problem: Image B2
SGP-BB1 4000 0.0613 631.13 352 0.0577 42.22 363 0.0635 35.32
SGP-BB2 4000 0.0588 348.76 372 0.0578 42.38 214 0.0542 14.17
SGP-ABB 4000 0.0608 590.31 360 0.0577 34.49 180 0.0547 13.65
SGP-SS 4000 0.0587 281.35 353 0.0577 34.95 163 0.0544 11.37
Table 3. Monotone SGP on the deblurring problem (2).

                    D_k = I                D_k as in (25)         D_k as in (26)
             it opt  err opt   sec    it opt  err opt   sec    it opt  err opt   sec
Test problem: Image A1
SGP-BB1 1592 0.1865 103.36 527 0.1862 44.01 748 0.1850 49.94
SGP-BB2 985 0.1866 55.99 659 0.1863 50.48 605 0.1850 35.35
SGP-ABB 1125 0.1866 60.17 659 0.1862 49.84 538 0.1851 33.24
SGP-SS 568 0.1866 37.40 418 0.1862 33.61 388 0.1851 28.85
Test problem: Image B2
SGP-BB1 1475 0.0586 90.54 528 0.0571 43.20 550 0.0542 49.72
SGP-BB2 1087 0.0586 60.55 400 0.0582 32.49 190 0.0543 11.71
SGP-ABB 1072 0.0586 57.11 327 0.0571 26.47 191 0.0548 12.54
SGP-SS 1110 0.0586 72.74 386 0.0571 31.56 137 0.0543 9.27
Table 4. Monotone SGP on the deblurring problem (3).

                    D_k = I                D_k as in (25)         D_k as in (26)
             it opt  err opt   sec    it opt  err opt   sec    it opt  err opt   sec
Test problem: Image A1
SGP-BB1 4000 0.1882 569.74 450 0.1861 46.99 778 0.1851 58.47
SGP-BB2 4000 0.1886 647.82 573 0.1861 48.53 573 0.1850 38.41
SGP-ABB 4000 0.1884 637.89 571 0.1861 46.98 511 0.1851 35.18
SGP-SS 4000 0.1874 557.53 324 0.1861 27.75 420 0.1851 38.26
Test problem: Image B2
SGP-BB1 4000 0.0638 627.51 400 0.0577 42.55 633 0.0599 83.25
SGP-BB2 4000 0.0659 651.00 355 0.0578 32.69 228 0.0541 17.71
SGP-ABB 4000 0.0631 648.92 313 0.0577 28.57 243 0.0546 17.47
SGP-SS 4000 0.0598 648.81 320 0.0577 30.23 145 0.0543 13.91
4.3. Comparisons with other methods
To better evaluate the SGP behaviour in image deblurring, we report some comparisons
with other widely used iterative regularization methods.
We first consider the EM method [15], also known as the Richardson-Lucy method [14, 13].
Starting from a positive initial image $x^{(0)}$, the EM algorithm looks for a minimum of
Figure 4. SGP relative reconstruction error as a function of the number of iterations.
Panels: Nonmonotone SGP on Image A1, Monotone SGP on Image A1, Nonmonotone
SGP on Image B2, Monotone SGP on Image B2; each panel plots the errors of SGP-(2)
and SGP-(3).
the functional (19) by exploiting the iteration
$$x^{(k+1)} = X_k A^T Y_k^{-1} b, \qquad (32)$$
where $X_k = \mathrm{diag}(x^{(k)})$ and $Y_k = \mathrm{diag}(A x^{(k)} + bg)$. The nonnegativity of $A$, $bg$ and $b$
guarantees that all the iterates remain nonnegative and, under the assumption $bg = 0$,
useful to ensure the flux conservation property, the convergence to a solution of (2) has
been proved by several authors [39, 40, 41, 42, 43] (see also [44] for a convergence analysis
derived from a proximal point interpretation of the EM algorithm). In the general case
$bg \neq 0$, at least to our knowledge, the EM convergence has not been proved. The EM method is
attractive because of its low computational cost, consisting of $O(N \log N)$ operations per
iteration (needed to perform the two matrix-vector products involving the matrix $A$).
This nice feature has made the method one of the most popular approaches for astronomical
and medical image restoration problems. However, its main drawback
is slow convergence, which, in many cases, means that too much time is needed to reach the
desired approximation of the solution. With regard to the comparison between SGP and EM, it
is interesting to observe that the iteration (32) can be written as
$$x^{(k+1)} = x^{(k)} - \alpha_k D_k \nabla J(x^{(k)}), \qquad (33)$$
with $D_k = X_k$ and $\alpha_k = 1$ for all $k$.
This means that EM can be interpreted as a scaled steepest descent method with a
special scaling matrix and a constant steplength [21, 45]. On the other hand, we have
seen that SGP uses a similar scaling matrix, exploits variable steplengths to improve
the convergence rate and can handle a flux conservation constraint in a natural way.
Thus, we may consider SGP a generalization of the EM method.
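This interpretation is easy to check numerically. Below is a compact sketch of iteration (32) with a periodic FFT convolution model (our code and assumptions; `psf` is assumed centred and normalized so that $A^T e = e$):

```python
import numpy as np

def em_deblur(b, psf, bg, n_iter, x0=None):
    """Richardson-Lucy / EM iteration x <- x * A^T (b / (A x + bg)),
    with A the (periodic) convolution by psf and A^T the correlation,
    i.e. the convolution with the conjugate transfer function."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    A = lambda v: np.real(np.fft.ifft2(np.fft.fft2(v) * otf))
    At = lambda v: np.real(np.fft.ifft2(np.fft.fft2(v) * np.conj(otf)))
    x = np.full_like(b, b.mean(), dtype=float) if x0 is None else x0.copy()
    for _ in range(n_iter):
        x = x * At(b / (A(x) + bg))
    return x
```

The two FFT-based products per iteration are the $O(N \log N)$ operations mentioned above.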
An interesting accelerated EM version is provided in [23]: it exploits a vector
extrapolation to determine the point on which the EM iteration is applied. The
extrapolation, consisting of a shift along the direction given by the difference between the
current iterate and the previous one, introduces a little computational overhead
and, consequently, the cost per iteration of the method is only slightly larger than in
EM. In the following we will denote by EM MATLAB the implementation
of this algorithm available in the deconvlucy function of the Image Processing MATLAB
toolbox.
Another algorithm that shows a superior convergence rate when compared to
EM is the WMRNSD method recently proposed in [24]. This algorithm can be
viewed as a steepest descent method applied to the minimization problem arising
from a constrained weighted least squares approach to the image restoration problem;
the nonnegativity constraint is satisfied by modifying the standard line search that
minimizes the residual norm. The main tasks per iteration consist of two matrix-vector
products, as for the other methods. In our computational study we consider
the WMRNSD version tested in [24], whose MATLAB implementation is available at
the web page http://web.math.umt.edu/bardsley/codes.html.
In Table 5 we show the numerical results obtained by solving some deblurring
problems with the above iterative image reconstruction algorithms: the nonmonotone
SGP applied to the problems (2) (SGP-(2)) and (3) (SGP-(3)), the standard EM,
the EM MATLAB and the WMRNSD. The methods start from the same image
($x^{(0)}_i = c/N$, $i = 1, \ldots, N$) and all the parameters of the nonmonotone SGP are set as
described in the previous section. Also in this table, the numbers of iterations (it opt)
refer to the iteration where the minimum relative reconstruction error is obtained; an
asterisk is used to mark the cases where this minimum is not reached within the prefixed
maximum number of iterations. In all the experiments, SGP largely outperforms EM in
the number of iterations and in the computational time, even if the time per iteration
exhibited by SGP-(2) and SGP-(3) is approximately 40% and 70% greater than in EM,
respectively. Concerning the optimal reconstruction error, no significant differences are
observed between SGP and EM. To better compare the behaviour of the two methods,
we plot in Figure 5 the reconstruction error as a function of the number of iterations for
the test problems corresponding to the blurred images C1 and C3. The reconstructed
Table 5. Comparison among SGP, EM, EM MATLAB and WMRNSD.

                   Image A1               Image A2               Image A3
             it opt  err opt   sec    it opt  err opt   sec    it opt  err opt   sec
SGP-(2) 380 0.1851 23.40 103 0.1866 6.37 21 0.1944 1.32
SGP-(3) 336 0.1851 22.37 108 0.1865 8.44 20 0.1947 1.55
EM 10000 0.1852 433.25 4047 0.1865 177.14 414 0.1942 17.90
EM MATLAB 388 0.1853 18.32 141 0.1868 7.19 46 0.1947 2.40
WMRNSD 10000 0.1853 573.31 2904 0.1866 163.61 50 0.1942 2.72
Image B1 Image B2 Image B3
SGP-(2) 251 0.0513 15.67 157 0.0542 9.82 26 0.0689 1.61
SGP-(3) 274 0.0514 19.12 163 0.0544 11.37 35 0.0688 2.57
EM 10000 0.0512 413.20 4185 0.0541 183.49 500 0.0687 21.69
EM MATLAB 259 0.0516 12.30 139 0.0559 7.83 44 0.0700 2.64
WMRNSD 10000 0.0521 571.69 2928 0.0548 166.43 80 0.0690 4.26
Image C1 Image C2 Image C3
SGP-(2) 736 0.2922 45.89 374 0.2953 23.36 41 0.3127 2.51
SGP-(3) 1125 0.2923 80.13 272 0.2949 18.99 45 0.3125 3.26
EM 10000 0.2929 431.77 10000 0.2945 430.14 1459 0.3110 61.60
EM MATLAB 811 0.2952 38.30 280 0.2977 14.60 68 0.3138 3.75
WMRNSD 10000 0.2939 582.13 10000 0.2948 580.76 168 0.3115 9.09
images corresponding to the minimum errors are shown in Figure 6 for the blurred
images A2, B2 and C2. From the above results we may conclude that SGP seems
able to provide the same reconstruction accuracy given by EM but with a remarkable
computational gain.
We complete the experiments by comparing SGP with the accelerated
algorithm EM MATLAB and with the WMRNSD method. The numerical results in
Table 5 show that, even if the cost per iteration in SGP is higher than in EM MATLAB,
the two approaches are very comparable and sometimes SGP-(2) can be a valid
alternative. We point out that, as far as we know, no convergence proof of the
Biggs-Andrews algorithm implemented by EM MATLAB is available.
Finally, we can observe that SGP largely outperforms the WMRNSD method in terms
Figure 5. SGP and EM relative reconstruction error as a function of the number of
iterations. Panels: Image C1 (left) and Image C3 (right); each panel plots the errors
of SGP-(2), SGP-(3) and EM.
of both the number of iterations and the computational time.
5. Conclusions
We have proposed a scaled gradient projection method, called SGP, for solving
optimization problems with simple constraints. The main features of SGP are its
global convergence properties and its efficient updating rules for the steplength
parameter and the scaling matrix, which improve the convergence rate. A wide
computational study on the minimization problems arising from maximum likelihood
approaches to image deblurring shows that SGP can provide the same reconstruction
accuracy as the standard EM method with far fewer iterations and, consequently, with
a remarkable computational gain. Furthermore, it exhibits the ability to effectively
handle a linear equality constraint in addition to the nonnegativity or box constraints.
The SGP effectiveness is also confirmed by the comparisons with other popular iterative
regularization methods.
Future work will concern the evaluation of the proposed algorithm on different
optimization problems arising in image deblurring and the comparison with quasi-Newton
and interior point methods. If SGP can compete with these methods, then
it could provide a very useful and simple approach to iterative image reconstruction.
Acknowledgments
The authors are thankful to the anonymous referees for their useful comments and
suggestions. This research is supported by the PRIN2006 project of the Italian
Ministry of University and Research "Inverse Problems in Medicine and Astronomy",
grant 2006018748.
Figure 6. Reconstructions obtained with the SGP-(2), SGP-(3) and EM methods
(columns, left to right) for the blurred images A2, B2, C2 (rows, from the top to the
bottom of the figure).
References
[1] Bertero M and Boccacci P 1998 Introduction to Inverse Problems in Imaging (Bristol: Institute of Physics Publishing)
[2] Vogel C R 2002 Computational Methods for Inverse Problems (Frontiers Appl. Math., Philadelphia: SIAM)
[3] Brucker P 1984 An O(n) algorithm for quadratic knapsack problems Oper. Res. Lett. 3(3) 163–166
[4] Dai Y H and Fletcher R 2006 New algorithms for singly linearly constrained quadratic programming problems subject to lower and upper bounds Math. Programming 106(3) 403–421
[5] Kiwiel K C 2008 Breakpoint searching algorithms for the continuous quadratic knapsack problem Math. Programming 112 473–491
[6] Maculan N, Santiago C P, Macambira E M and Jardim M H C 2003 An O(n) algorithm for projecting a vector on the intersection of a hyperplane and a box in R^n J. Optim. Theory Appl. 117(3) 553–574
[7] Pardalos P M and Kovoor N 1990 An algorithm for a singly constrained class of quadratic programs subject to upper and lower bounds Math. Programming 46 321–328
[8] Dai Y H, Hager W W, Schittkowski K and Zhang H 2006 The cyclic Barzilai-Borwein method for unconstrained optimization IMA J. Numer. Anal. 26 604–627
[9] Frassoldati G, Zanghirati G and Zanni L 2008 New adaptive stepsize selections in gradient methods J. Industrial and Management Optim. 4(2) 299–312
[10] Zhou B, Gao L and Dai Y H 2006 Gradient methods with adaptive step-sizes Comput. Optim. Appl. 35(1) 69–86
[11] Daube-Witherspoon M E and Muehllehner G 1986 An iterative image space reconstruction algorithm suitable for volume ECT IEEE Trans. Med. Imaging 5 61–66
[12] Lanteri H, Roche M and Aime C 2002 Penalized maximum likelihood image restoration with positivity constraints: multiplicative algorithms Inverse Problems 18 1397–1419
[13] Lucy L B 1974 An iterative technique for the rectification of observed distributions Astronom. J. 79 745–754
[14] Richardson W H 1972 Bayesian-based iterative method of image restoration J. Opt. Soc. Amer. A 62 55–59
[15] Shepp L A and Vardi Y 1982 Maximum likelihood reconstruction for emission tomography IEEE Trans. Med. Imaging 1 113–122
[16] Birgin E G, Martinez J M and Raydan M 2000 Nonmonotone spectral projected gradient methods on convex sets SIAM J. Optim. 10 1196–1211
[17] Grippo L, Lampariello F and Lucidi S 1986 A nonmonotone line-search technique for Newton's method SIAM J. Numer. Anal. 23 707–716
[18] Kelley C T 1999 Iterative Methods for Optimization (Frontiers Appl. Math., Philadelphia: SIAM)
[19] Bardsley J and Vogel C 2003 Nonnegatively constrained convex programming methods for image reconstruction SIAM J. Sci. Comput. 25 1326–1343
[20] Landi G and Loli Piccolomini E 2008 A projected Newton-CG method for nonnegative astronomical image deblurring Numer. Alg. 48(4) 279–300
[21] Bertsekas D P 1999 Nonlinear Programming (Athena Scientific, 2nd edition)
[22] Csiszar I 1991 Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems Ann. Stat. 19 2032–2066
[23] Biggs D S C and Andrews M 1997 Acceleration of iterative image restoration algorithms Appl. Opt. 36 1766–1775
[24] Bardsley J and Nagy J 2006 Covariance-preconditioned iterative methods for nonnegatively constrained astronomical imaging SIAM J. Matr. Anal. Appl. 27 1184–1198
[25] Johansson B, Elfving T, Kozlov V, Censor Y, Forssen P E and Granlund G 2006 The application of an oblique-projected Landweber method to a model of supervised learning Math. Comp. Modelling 43 892–909
[26] Birgin E G, Martinez J M and Raydan M 2003 Inexact spectral projected gradient methods on convex sets IMA J. Numer. Anal. 23 539–559
[27] Hanke M, Nagy J and Vogel C 2000 Quasi-Newton approach to nonnegative image restoration Lin. Alg. and Appl. 316 223–236
[28] Davis P J 1979 Circulant Matrices (John Wiley & Sons)
[29] Strang G 1999 The discrete cosine transform SIAM Review 41(1) 135–147
[30] Zanni L 2006 An improved gradient projection-based decomposition technique for support vector machines Comput. Management Sci. 3 131–145
[31] Zanni L, Serafini T and Zanghirati G 2006 Parallel software for training large scale support vector machines on multiprocessor systems J. Mach. Learn. Res. 7 1467–1492
[32] Barzilai J and Borwein J M 1988 Two point step size gradient methods IMA J. Numer. Anal. 8 141–148
[33] Dai Y H and Fletcher R 2005 On the asymptotic behaviour of some new gradient methods Math. Programming 103(3) 541–559
[34] Fletcher R 2001 On the Barzilai-Borwein method Technical Report NA/207, Department of Mathematics, University of Dundee, Dundee, UK
[35] Friedlander A, Martínez J M, Molina B and Raydan M 1999 Gradient method with retards and generalizations SIAM J. Numer. Anal. 36 275–289
[36] Figueiredo M A T, Nowak R D and Wright S J 2007 Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems IEEE J. Selected Topics in Signal Process. 1 586–597
[37] Serafini T, Zanghirati G and Zanni L 2005 Gradient projection methods for quadratic programs and applications in training support vector machines Optim. Meth. Soft. 20(2-3) 343–378
[38] Anconelli B, Bertero M, Boccacci P, Carbillet C and Lanteri H 2005 Restoration of interferometric images III. Efficient Richardson-Lucy methods for LINC-NIRVANA data reduction Astron. Astrophys. 430 731–738
[39] Iusem A N 1991 Convergence analysis for a multiplicatively relaxed EM algorithm Math. Meth. Appl. Sci. 14 573–593
[40] Iusem A N 1992 A short convergence proof of the EM algorithm for a specific Poisson model REBRAPE 6 57–67
[41] Lange K and Carson R 1984 EM reconstruction algorithms for emission and transmission tomography J. Comp. Assisted Tomography 8 306–316
[42] Mülthei H N and Schorr B 1989 On properties of the iterative maximum likelihood reconstruction method Math. Meth. Appl. Sci. 11 331–342
[43] Vardi Y, Shepp L A and Kaufman L 1985 A statistical model for positron emission tomography J. Amer. Statist. Soc. 80(389) 8–37
[44] Tseng P 2004 An analysis of the EM algorithm and entropy-like proximal point methods Math. Oper. Res. 29 27–44
[45] Kaufman L 1987 Implementing and accelerating the EM algorithm for positron emission tomography IEEE Trans. Med. Imaging 6 37–51