
Chapter 2

Quadratic vector equations

2.1. Introduction
In this chapter, we aim to study in a unified fashion several quadratic vector and matrix equations with nonnegativity hypotheses. Specific cases of these problems have been studied extensively in the past by several authors. For references to the single equations and results, we refer the reader to the following sections, in particular Section 2.3. Many of the results appearing here have already been proved for one or more of the single instances of the problems, resorting to specific characteristics of the problem. In some cases the proofs we present here are mere rewritings of the original proofs with a little change of notation to adapt them to our framework, but in some cases we are effectively able to remove some hypotheses and generalize the results by abstracting the specific aspects of each problem.
It is worth noting that Ortega and Rheinboldt [131, Chapter 13], in a 1970 book, treat a similar problem in a far more general setting, assuming only the monotonicity and operator convexity of the involved operator. Since their hypotheses are far more general than those of our problem, the results they obtain are less precise than the ones we report here. Moreover, all of their proofs have to be adapted to our case, since our map F(x) is operator concave instead of convex.

2.2. General problem


We are interested in solving the equation (1), namely

M x = a + b(x, x), (1)

(quadratic vector equation, QVE), where M ∈ R^{N×N} is a nonsingular M-matrix, a, x ∈ R^N_+, and b is a nonnegative vector bilinear form, i.e., a map b : R^N_+ × R^N_+ → R^N_+ such that b(v, ·) and b(·, v) are linear maps for each v ∈ R^N_+. The map b can be represented by a tensor B_{ijk}, in the sense that b(x, y)_k = Σ_{i,j=1}^N B_{ijk} x_i y_j. It is easy to prove that x ≤ y, z ≤ w implies b(x, z) ≤ b(y, w). If A is a nonsingular M-matrix, A^{-1}B denotes the tensor representing the map (x, y) ↦ A^{-1} b(x, y). Note that, here and in the following, we do not

require that b be symmetric (that is, b(x, y) = b(y, x) for all x, y): while in the equation only the quadratic form associated with b is used, in the solution algorithms there are often terms of the form b(x, y) with x ≠ y. Since there are multiple ways to extend the quadratic form b(x, x) to a bilinear map b(x, y), this leaves more freedom in defining the actual solution algorithms.
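As a concrete illustration of the tensor representation, here is a minimal sketch in Python/NumPy; the data below are arbitrary illustrative values, not taken from any of the applications discussed later.

```python
import numpy as np

N = 3
rng = np.random.default_rng(0)
B = 0.1 * rng.random((N, N, N))   # nonnegative tensor B[i, j, k] representing b

def b(x, y):
    # b(x, y)_k = sum_{i,j} B[i, j, k] * x_i * y_j
    return np.einsum('ijk,i,j->k', B, x, y)

x = np.array([1.0, 2.0, 0.0])
y = np.array([0.5, 0.0, 1.0])
print(b(x, y), b(y, x))           # bilinear, but in general b(x, y) != b(y, x)
```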
We are only interested in nonnegative solutions x^* ∈ R^N_+; in the following, when referring to solutions of (1) we always mean nonnegative solutions only. A solution x^* of (1) is called minimal if x^* ≤ y^* for any other solution y^*.
Later on, we give a necessary and sufficient condition for (1) to have a minimal solution.

2.3. Concrete cases


E1: Markovian binary trees Equation (1), with the assumption that e
is a solution, is studied in [16, 79], where it arises from the study of
Markovian binary trees.
E2: Lu’s simple equation In [117, 116], the equation (12) appears, derived from a special Riccati equation appearing in a neutron transport problem. By setting w := [u^T v^T]^T as the new unknown, the equation takes the form (1).
E3: Nonsymmetric algebraic Riccati equation In [69], the equation (5), where the matrix M defined in (6) is a nonsingular or singular irreducible M-matrix, is studied. Vectorizing everything, we get

(I ⊗ A + D^T ⊗ I) vec(X) = vec(B) + vec(XCX),

which is in the form (1) with N = mn.


E4: Unilateral quadratic matrix equation In several queuing problems [31], the equation (2) with assumptions (3) is considered. Vectorizing everything, we fall again in the same class of equations, with N = n^2: in fact, since Be ≤ e and B ≥ 0, the matrix I − B is an M-matrix.

To ease the notation in the cases E3 and E4, in the following we set x_k = vec(X_k), and d = max(m, n) for E3, d = n for E4.
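As a quick sanity check of the vectorization step in E3, the following sketch verifies the Kronecker-product identity (I ⊗ A + D^T ⊗ I) vec(X) = vec(AX + XD) with random matrices of illustrative sizes; vec denotes column-stacking.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4                            # illustrative sizes; N = m*n after vectorization
A = rng.random((m, m))
D = rng.random((n, n))
X = rng.random((m, n))

vec = lambda Z: Z.flatten(order='F')   # column-stacking vec

lhs = (np.kron(np.eye(n), A) + np.kron(D.T, np.eye(m))) @ vec(X)
rhs = vec(A @ X + X @ D)
print(np.allclose(lhs, rhs))           # True: the Kronecker form matches
```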

2.4. Minimal solution


2.4.1. Existence of the minimal solution
It is clear by considering the scalar case (N = 1) that (1) may have no real solutions. The following additional condition allows us to prove the existence of a solution.

Assumption 2.4.1. There are a positive linear functional l : R^N → R^t and a vector z ∈ R^t_+ such that for any x ∈ R^N_+, the property l(x) ≤ z implies l(M^{-1}(a + b(x, x))) ≤ z.
Theorem 2.4.2 ([136]). Equation (1) has at least one solution if and only if Assumption 2.4.1 holds. Among its solutions, there is a minimal one.
Proof. Let us consider the iteration

x_{k+1} = M^{-1}(a + b(x_k, x_k)), (2.4.1)

starting from x_0 = 0. Since M is an M-matrix, we have x_1 = M^{-1}a ≥ 0.


It is easy to see by induction that x_k ≤ x_{k+1}:

x_{k+1} − x_k = M^{-1}(b(x_k, x_k) − b(x_{k−1}, x_{k−1})) ≥ 0,

since b is nonnegative. We prove by induction that l(x_k) ≤ z. The base step is clear: l(0) = 0 ≤ z; the inductive step is simply Assumption 2.4.1. Thus the sequence x_k is nondecreasing and, since l is positive and l(x_k) ≤ z, bounded from above; therefore it converges. Its limit x^* is a solution to (1).
On the other hand, if (1) has a solution s, then we may choose l = I and z = s; now, x ≤ s implies M^{-1}(a + b(x, x)) ≤ M^{-1}(a + b(s, s)) = s, thus Assumption 2.4.1 is satisfied with these choices.
For any solution s, we may prove by induction that x_k ≤ s:

s − x_{k+1} = M^{-1}(b(s, s) − b(x_k, x_k)) ≥ 0.

Therefore, passing to the limit, we get x^* ≤ s. □
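A minimal sketch of the iteration (2.4.1) used in the proof, in Python/NumPy; the data below are illustrative and chosen so that Assumption 2.4.1 holds with l = I and z = e, and the stopping criterion is an arbitrary choice.

```python
import numpy as np

def qve_fixed_point(M, a, B, tol=1e-13, maxit=10000):
    """Iterate x_{k+1} = M^{-1}(a + b(x_k, x_k)) from x_0 = 0 (iteration (2.4.1));
    the iterates increase monotonically to the minimal solution x* (Thm 2.4.2)."""
    x = np.zeros_like(a)
    for _ in range(maxit):
        x_new = np.linalg.solve(M, a + np.einsum('ijk,i,j->k', B, x, x))
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x

# Illustrative data: M a nonsingular M-matrix with M e = e; a, B >= 0 small
# enough that M^{-1}(a + b(e, e)) = 0.26 e <= e, so z = e works in Assumption 2.4.1.
N = 4
M = 5.0 * np.eye(N) - np.ones((N, N))
a = 0.1 * np.ones(N)
B = 0.01 * np.ones((N, N, N))
print(qve_fixed_point(M, a, B))
```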


Notice that Assumption 2.4.1 is equivalent to the following statement.
Assumption 2.4.3. There exists a vector z ∈ R^N_+ such that for any x ∈ R^N_+ the property x ≤ z implies M^{-1}(a + b(x, x)) ≤ z,
i.e., its version with l = I. Indeed, both parts of the proof work verbatim by setting l = I. On the other hand, the slightly more general formulation described above allows a simpler proof that the problem E3 satisfies the requirement, as we see in the following.
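In practice, Assumption 2.4.3 can be tested directly for a candidate z: since the map x ↦ M^{-1}(a + b(x, x)) is nondecreasing (M^{-1} ≥ 0 and b is nonnegative), it suffices to check the inequality at x = z. A one-function sketch, under the same conventions as the previous snippet:

```python
import numpy as np

def satisfies_assumption_243(M, a, B, z):
    """Check Assumption 2.4.3 for a candidate z >= 0: by monotonicity of
    x -> M^{-1}(a + b(x, x)), testing at x = z covers all 0 <= x <= z."""
    return bool(np.all(np.linalg.solve(M, a + np.einsum('ijk,i,j->k', B, z, z)) <= z))
```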

2.4.2. Taylor expansion


Let F(x) := Mx − a − b(x, x). Since the equation is quadratic, we have the expansion

F(y) = F(x) + F'_x(y − x) + (1/2) F''_x(y − x, y − x), (2.4.2)

where F'_x(w) = Mw − b(x, w) − b(w, x) is the (Fréchet) derivative of F and F''_x(w, w) = −2b(w, w) ≤ 0 is its second (Fréchet) derivative. Notice that F''_x is nonpositive and does not depend on x.
The following theorem is a straightforward extension to our setting of
the argument in [69, Theorem 3.2].
Theorem 2.4.4 ([136]). If x^* > 0, then F'_{x^*} is an M-matrix.
Proof. Let us consider the fixed point iteration (2.4.1). By a theorem on fixed-point iterations [108], one has

lim sup_{k→∞} ‖x^* − x_k‖^{1/k} ≤ ρ(G'_{x^*}), (2.4.3)

where G'_{x^*} is the Fréchet derivative at x^* of the iteration map

G(x) := M^{-1}(a + b(x, x)),

that is,

G'_{x^*}(y) = M^{-1}(b(x^*, y) + b(y, x^*)).
In fact, if x^* > 0, equality holds in (2.4.3). Let e_k := x^* − x_k. We have e_{k+1} = P_k e_k, where

P_k := M^{-1}(b(x^*, ·) + b(·, x_k))

are nonnegative matrices. The matrix sequence P_k is nondecreasing, and lim_{k→∞} P_k = G'_{x^*}. Thus for any ε > 0 we may find an integer l such that

ρ(P_m) ≥ ρ(G'_{x^*}) − ε, ∀m ≥ l.

We have

lim sup_{k→∞} ‖x^* − x_k‖^{1/k} = lim sup_{k→∞} ‖P_{k−1} ··· P_l ··· P_0 x^*‖^{1/k} ≥ lim sup_{k→∞} ‖P_l^{k−l} P_0^l x^*‖^{1/k}.

Since x^* > 0, we have P_0^l x^* > 0 and thus P_0^l x^* ≥ c_l e for a suitable constant c_l. Also, it holds that ‖P_l^{k−l}‖ = ‖P_l^{k−l} v_{k,l}‖ for a suitable v_{k,l} ≥ 0 with ‖v_{k,l}‖ = 1, and this implies

lim sup_{k→∞} ‖x^* − x_k‖^{1/k} ≥ lim sup_{k→∞} (c_l ‖P_l^{k−l} e‖)^{1/k} ≥ lim sup_{k→∞} (c_l ‖P_l^{k−l} v_{k,l}‖)^{1/k} = lim sup_{k→∞} (c_l ‖P_l^{k−l}‖)^{1/k} = ρ(P_l) ≥ ρ(G'_{x^*}) − ε.

Since ε is arbitrary, this shows that equality holds in (2.4.3).
From the convergence of the sequence x_k, we thus get

ρ(M^{-1}(b(x^*, ·) + b(·, x^*))) ≤ 1,

which implies that M − b(x^*, ·) − b(·, x^*) is an M-matrix. □

Corollary 2.4.5. From part 2 of Theorem 1.1.3, we promptly obtain that F'_x is an M-matrix for all x ≤ x^*.

2.4.3. Concrete cases


We may prove Assumption 2.4.1 for all the examples E1–E4. E1 is cov-
ered by the following observation.

Lemma 2.4.6. If there is a vector y ≥ 0 such that F(y) ≥ 0, then Assumption 2.4.1 holds.

Proof. In fact, we may take the identity map as l and y as z. Clearly x ≤ y implies M^{-1}(a + b(x, x)) ≤ M^{-1}(a + b(y, y)) ≤ y. □

As for E2, it follows from the reasoning in [117] that a solution to the specific problem is u = Xq + e, v = X^T q + e, where X is the solution of an equation of the form E3; therefore, E2 follows from E3 and Lemma 2.4.6. An explicit but rather complicated bound for the solution is given in [101].
The case E3 is treated in [70, Theorem 3.1]. Since M in (6) is a nonsingular or singular M-matrix, there are vectors v_1, v_2 > 0 and u_1, u_2 ≥ 0 such that Dv_1 − Cv_2 = u_1 and Av_2 − Bv_1 = u_2. Let us set l(x) = Xv_1 (where x = vec(X)) and z = v_2 − A^{-1}u_2. We have

(AX_{k+1} + X_{k+1}D)v_1 = (X_k C X_k + B)v_1 ≤ X_k C v_2 + Av_2 − u_2 ≤ X_k D v_1 + Av_2 − u_2.

Since X_{k+1}Dv_1 ≥ X_k Dv_1 (monotonicity of the iteration), we get X_{k+1}v_1 ≤ v_2 − A^{-1}u_2, which is the desired result.
The case E4 is similar. It suffices to set l(x) = vec^{-1}(x)e and z = e:

X_{k+1}e = (I − B)^{-1}(A + CX_k^2)e ≤ (I − B)^{-1}(Ae + Ce) ≤ e,

since (A + C)e = (I − B)e.



2.5. Functional iterations


2.5.1. Definition and convergence
We may define a functional iteration for (1) by choosing a splitting b = b_1 + b_2 such that b_i ≥ 0 and a splitting M = Q − P such that Q is an M-matrix and P ≥ 0. We then have the iteration

(Q − b_1(·, x_k))x_{k+1} = a + Px_k + b_2(x_k, x_k). (2.5.1)

Theorem 2.5.1 ([136]). Suppose that the equation (1) has a minimal solution x^* > 0. Let x_0 be such that 0 ≤ x_0 ≤ x^* and F(x_0) ≤ 0 (e.g., x_0 = 0). Then:
1. Q − b_1(·, x_k) is nonsingular for all k, i.e., the iteration (2.5.1) is well-defined.
2. x_k ≤ x_{k+1} ≤ x^*, and x_k → x^* as k → ∞.
3. F(x_k) ≤ 0 for all k.
Proof. Let J(x) := Q − b_1(·, x) and g(x) := a + Px + b_2(x, x). It is clear from the nonnegativity constraints that J is nonincreasing (i.e., x ≤ y ⇒ J(x) ≥ J(y)) and g is nondecreasing (i.e., x ≤ y ⇒ g(x) ≤ g(y)). The matrix J(x) is a Z-matrix for all x ≥ 0. Moreover, since J(x^*)x^* = g(x^*) ≥ 0, J(x^*) is an M-matrix by Theorem 1.1.3 and thus, by the same theorem, J(x) is a nonsingular M-matrix for all x ≤ x^*.
We first prove by induction that x_k ≤ x^*. This shows that the iteration is well-posed, since it implies that J(x_k) is an M-matrix for all k. Since g(x^*) = J(x^*)x^* ≤ J(x_k)x^* by the inductive hypothesis, (2.5.1) implies

J(x_k)(x^* − x_{k+1}) ≥ g(x^*) − g(x_k) ≥ 0;

thus, since J(x_k) is an M-matrix by the inductive hypothesis, we deduce that x^* − x_{k+1} ≥ 0.
We now prove by induction that x_k ≤ x_{k+1}. For the base step, since F(x_0) ≤ 0 and J(x_0)x_0 − g(x_0) = F(x_0), we have J(x_0)x_0 − g(x_0) ≤ 0, thus x_1 = J(x_0)^{-1}g(x_0) ≥ x_0. For k ≥ 1, it holds that

J(x_{k−1})(x_{k+1} − x_k) ≥ J(x_k)x_{k+1} − J(x_{k−1})x_k = g(x_k) − g(x_{k−1}) ≥ 0,

thus x_k ≤ x_{k+1}. The sequence x_k is monotonic and bounded above by x^*, thus it converges. Let x be its limit; by passing (2.5.1) to the limit, we see that x is a solution. But since x ≤ x^* and x^* is minimal, it must be the case that x = x^*.
Finally, for each k we have

F(x_k) = J(x_k)x_k − g(x_k) ≤ J(x_k)x_{k+1} − g(x_k) = 0. □
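The following sketch implements one instance of (2.5.1); the splittings are passed in as the tensors B1, B2 of b_1, b_2 and the matrix P (with Q = M + P), following the tensor conventions of Section 2.2. Data layout and stopping rule are illustrative choices.

```python
import numpy as np

def qve_functional(M, a, B1, B2, P, tol=1e-13, maxit=10000):
    """One instance of iteration (2.5.1):
       (Q - b1(., x_k)) x_{k+1} = a + P x_k + b2(x_k, x_k),
    with Q = M + P and b = b1 + b2 (B1, B2 are the tensors of b1, b2)."""
    Q = M + P
    x = np.zeros_like(a)
    for _ in range(maxit):
        # matrix of the linear map w -> b1(w, x_k): entry (k, i) is sum_j B1[i,j,k] x[j]
        B1x = np.einsum('ijk,j->ki', B1, x)
        x_new = np.linalg.solve(Q - B1x,
                                a + P @ x + np.einsum('ijk,i,j->k', B2, x, x))
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x
```

Choosing B1 = B, B2 = 0, P = 0 gives the fastest variant singled out in Theorem 2.5.2 below, while B1 = 0, B2 = B, P = 0 recovers the basic iteration (2.4.1).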



Theorem 2.5.2 ([136]). Let f be the map defining the functional iteration (2.5.1), i.e., f(x_k) = J(x_k)^{-1}g(x_k) = x_{k+1}. Let Ĵ, ĝ, f̂ be the same maps as J, g, f but for the special choice b_2 = 0, P = 0. Then f̂^k(x) ≥ f^k(x), i.e., the functional iteration with b_2 = 0, P = 0 has the fastest convergence among all those defined by (2.5.1).

Proof. It suffices to prove that f̂(y) ≥ f̂(x) for all y ≥ x, which is obvious from the fact that Ĵ is nonincreasing and ĝ is nondecreasing, and that f̂(x) ≥ f(x), which follows from the fact that Ĵ(x) ≤ J(x) and ĝ(x) ≥ g(x). □

Corollary 2.5.3. Let

x^{GS}_{k+1} = J(y_k)^{-1} g_k, (2.5.2)

where y_k is a vector such that x_k ≤ y_k ≤ x_{k+1}, and g_k a vector such that g(x_k) ≤ g_k ≤ g(x_{k+1}). It can be proved with the same arguments that x_{k+1} ≤ x^{GS}_{k+1} ≤ x^*.

This implies that we can perform the iteration in a “Gauss–Seidel” fashion: if at some place along the computation an entry of x_k is needed, and we have already computed the same entry of x_{k+1}, we can use that entry instead. It can easily be shown that J(x_k)^{-1}g(x_k) ≤ J(y_k)^{-1}g_k; therefore the Gauss–Seidel version of the iteration converges faster than the original one.
Remark 2.5.4. The iteration (2.5.1) depends on b as a bilinear form, while (1) and its solution depend only on b as a quadratic form. Therefore, different choices of the bilinear form b lead to different functional iterations for the same equation. Since for each iterate of each functional iteration both x_k ≤ x^* and F(x_k) ≤ 0 hold (thus x_k is a valid starting point for a new functional iteration), we may safely switch between different functional iterations at every step.

2.5.2. Concrete cases


For E1, the algorithm called depth in [16] is given by choosing P = 0, b_2 = 0. The algorithm called order in the same paper is obtained with the same choices, but starting from the bilinear form b̃(x, y) := b(y, x) obtained by switching the arguments of b. The algorithm called thicknesses in [79] is given by performing alternately one iteration of each of the two above methods.
For E2, Lu’s simple iteration [117] and the algorithm NBJ in [12] can be seen as the basic iteration (2.4.1) and the iteration with P = 0, b_2 = 0, respectively. The algorithm NBGS in the same paper is a Gauss–Seidel-like variant.

For E3, the fixed point iterations in [69] are given by b_2 = b and different choices of P. The iterations in [99] are the one given by b_2 = 0, P = 0 and a Gauss–Seidel-like variant.
For E4, the iterations in [31, Chapter 6] can also be reinterpreted in our framework.

2.6. Newton’s method


2.6.1. Definition and convergence
We may define the Newton method for the equation (1) as

F'_{x_k}(x_{k+1} − x_k) = −F(x_k). (2.6.1)

Alternatively, we may write

F'_{x_k} x_{k+1} = a − b(x_k, x_k).

Also notice that

−F(x_k) = b(x_{k−1} − x_k, x_{k−1} − x_k). (2.6.2)
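A minimal sketch of the resulting Newton iteration in Python/NumPy, under the tensor conventions of Section 2.2; the Jacobian F'_{x_k} is assembled explicitly as a dense matrix, and the stopping rule is an illustrative choice.

```python
import numpy as np

def qve_newton(M, a, B, tol=1e-13, maxit=100):
    """Newton's method (2.6.1) for F(x) = Mx - a - b(x, x), starting from x_0 = 0.
    The Jacobian is F'_x = M - b(x, .) - b(., x)."""
    x = np.zeros_like(a)
    for _ in range(maxit):
        bxdot = np.einsum('ijk,i->kj', B, x)   # matrix of the map w -> b(x, w)
        bdotx = np.einsum('ijk,j->ki', B, x)   # matrix of the map w -> b(w, x)
        J = M - bxdot - bdotx                  # F'_x, an M-matrix for x <= x*
        # from (2.6.1): F'_{x_k} x_{k+1} = a - b(x_k, x_k)
        x_new = np.linalg.solve(J, a - np.einsum('ijk,i,j->k', B, x, x))
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x
```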

Theorem 2.6.1 ([136]). If x^* > 0, the Newton method (2.6.1) starting from x_0 = 0 is well-defined, and the generated sequence x_k converges monotonically to x^*.

Proof. First notice that, since F'_{x^*} is an M-matrix by Theorem 2.4.4, F'_x is a nonsingular M-matrix for all x ≤ x^*, x ≠ x^*.
We prove by induction that x_k ≤ x_{k+1}. We have x_1 = M^{-1}a ≥ 0, so the base step holds. From (2.6.2), we get

F'_{x_{k+1}}(x_{k+2} − x_{k+1}) = b(x_{k+1} − x_k, x_{k+1} − x_k) ≥ 0,

thus, since F'_{x_{k+1}} is a nonsingular M-matrix, x_{k+2} ≥ x_{k+1}, which completes the induction proof.
Moreover, we may prove by induction that x_k < x^*. The base step is obvious; the induction step is

F'_{x_k}(x^* − x_{k+1}) = Mx^* − b(x_k, x^*) − b(x^*, x_k) − a + b(x_k, x_k) = b(x^* − x_k, x^* − x_k) > 0.

The sequence x_k is monotonic and bounded from above by x^*, thus it converges; by passing (2.6.1) to the limit we see that its limit must be a solution of (1), hence x^*. □

2.6.2. Concrete cases


Newton methods for E1, E2, and E3 appear respectively in [79, 116, 69], with more restrictive hypotheses which hold true in the special application cases. As far as we know, the more general hypothesis x^* > 0 first appeared here. In particular, in [69] (and later [70]) the authors impose that x_1 > 0; [79] imposes that F'_{x^*} is an M-matrix, which is true in their setting because of probabilistic assumptions; and in the setting of [116], x_1 > 0 is obvious. For E4, the Newton method is described in a more general setting in [31, 112], and can be implemented using the method described in [60] for the solution of the resulting Sylvester equation.

2.7. Modified Newton method


Recently Hautphenne and Van Houdt [81] proposed a different version of Newton’s method for E1 that has a better convergence rate than the traditional one. Their idea is to apply the Newton method to the equation G(x) = 0, where

G(x) := x − (M − b(·, x))^{-1} a; (2.7.1)

this equation is equivalent to (1).
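A sketch of the resulting iteration, under the same conventions as the previous snippets (tensor B representing b, dense solves, illustrative stopping rule); the Jacobian formula used below is derived in Section 2.7.1.

```python
import numpy as np

def qve_modified_newton(M, a, B, tol=1e-13, maxit=100):
    """Newton's method applied to G(x) = x - (M - b(., x))^{-1} a (eq. (2.7.1)),
    starting from x_0 = 0; G'_x = I - R_x^{-1} b(R_x^{-1} a, .) with R_x = M - b(., x)."""
    N = a.shape[0]
    x = np.zeros_like(a)
    for _ in range(maxit):
        Rx = M - np.einsum('ijk,j->ki', B, x)   # matrix of the map w -> M w - b(w, x)
        u = np.linalg.solve(Rx, a)              # u = R_x^{-1} a
        bu = np.einsum('ijk,i->kj', B, u)       # matrix of the map w -> b(u, w)
        Gx = x - u                              # G(x)
        Gprime = np.eye(N) - np.linalg.solve(Rx, bu)
        step = np.linalg.solve(Gprime, -Gx)
        x = x + step
        if np.max(np.abs(step)) < tol:
            return x
    return x
```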

2.7.1. Theoretical properties


Let us set for the sake of brevity R_x := M − b(·, x). The Jacobian of G is

G'_x = I − R_x^{-1} b(R_x^{-1}a, ·).

As for the original Newton method, it is a Z-matrix and a nonincreasing function of x. It is easily seen that G'_{x^*} is an M-matrix. The proof in Hautphenne and Van Houdt [81] is of probabilistic nature and cannot be extended to our setting; we provide here a different one. We have

G'_{x^*} = R_{x^*}^{-1}(M − b(·, x^*) − b(R_{x^*}^{-1}a, ·)) = R_{x^*}^{-1}(M − b(·, x^*) − b(x^*, ·)),

where we used R_{x^*}^{-1}a = x^*, which holds since R_{x^*}x^* = Mx^* − b(x^*, x^*) = a. The quantity in parentheses is F'_{x^*}, an M-matrix; thus there is a vector v > 0 such that F'_{x^*}v ≥ 0, and therefore G'_{x^*}v = R_{x^*}^{-1}F'_{x^*}v ≥ 0. This shows that G'_x is an M-matrix for all x ≤ x^*, and thus the modified Newton method is well-defined. The monotonic convergence is easily proved in the same fashion as for the traditional method.
The following result holds.
Theorem 2.7.1 ([81]). Let x̃_k be the iterates of the modified Newton method and x_k those of the traditional Newton method, starting from x̃_0 = x_0 = 0. Then x̃_k − x_k ≥ 0.

The proof in Hautphenne and Van Houdt [81] can be adapted to our setting with minor modifications.

2.7.2. Concrete cases


Other than for E1, its original setting, the modified Newton method can be applied successfully to the other concrete cases of quadratic vector equations. In order to outline here the possible benefits of this strategy, we ask for the reader’s indulgence and refer forward to the results on the structure of Newton’s method which are presented in Chapter 8.
For E2, let us choose the bilinear map b as

b([u_1; v_1], [u_2; v_2]) := [u_1 ∘ (Pv_2); v_1 ∘ (P̃u_2)].

It is easily seen that b(·, x) is a diagonal matrix. Moreover, as in the traditional Newton method, the matrix b(x, ·) has a special structure (Trummer-like structure [33], see also Chapter 8) which allows a fast inversion with O(n^2) operations per step. Therefore the modified Newton method can be implemented with a negligible overhead (O(n) ops per step on an algorithm that takes O(n^2) ops per step) with respect to the traditional one, and with an increased convergence rate.
We have performed some numerical experiments on the modified Newton method for E2; as can be seen in Figure 2.1, the modified Newton method does indeed converge faster to the minimal solution, and this allows one to get better approximations to the solution with the same number of steps. The benefits of this approach are limited, however; in the numerical experiments performed, this modified Newton method never saved more than one iteration with respect to the customary one.
For E3 and E4, the modified Newton method leads to equations similar to those of the traditional one (continuous- and discrete-time Sylvester equations), but requires additional matrix inversions and products; that is, the overhead is of the same order O(d^3) as the cost of the Newton step. Therefore it is not clear whether the improved convergence rate makes up for the increase in the computational cost.

2.8. Positivity of the minimal solution


2.8.1. Role of the positivity
In many of the above theorems, the hypothesis x^* > 0 is required. Is it really necessary? What happens if it is not satisfied?
In all the algorithms we have exposed, we worked only with vectors x such that 0 ≤ x ≤ x^*. Thus, if x^* has some zero entry, we may safely replace the problem with a smaller one by projecting it onto the subspace of vectors that have the same zero pattern as x^*.

[Figure 2.1 here: three log-scale convergence plots (residual vs. number of iterations) for the parameter pairs (c, α) = (0.5, 0.5), (1 − 10^{-4}, 10^{-4}), and (1 − 10^{-9}, 10^{-9}); legend: Newton, Mod. Newton.]

Figure 2.1. Convergence history of the two Newton methods for E2 for several values of the parameters α and c. The plots show the residual Frobenius norm of Equation (1) vs. the number of iterations.

That is, we may replace the problem with the one defined by

â = Πa, M̂ = ΠMΠ^T, b̂(x, y) = Πb(Π^T x, Π^T y),

where Π is the orthogonal projector onto the subspace

W = {x ∈ R^N : x_i = 0 for all i such that x^*_i = 0}, (2.8.1)

i.e., the linear operator that removes the entries known to be zero from the vectors. Performing the above algorithms on the reduced vectors and matrices is equivalent to performing them on the original versions, provided the matrices to invert are nonsingular. Notice, though, that both functional iterations and Newton-type algorithms may break down when the minimal solution is not strictly positive. For instance, consider the problem

a = [1/2, 0]^T, M = I_2, b([x_1, x_2]^T, [y_1, y_2]^T) = [(1/2) x_1 y_1, K x_1 y_2]^T, x^* = [1, 0]^T.

For suitable choices of the parameter K, the matrices to be inverted in the functional iterations (excluding obviously (2.4.1)) and in Newton’s method are singular; for large values of K, none of them are M-matrices.

However, the nonsingularity and M-matrix properties still hold for their restrictions to the subspace W defined in (2.8.1). It is therefore important to consider the positivity pattern of the minimal solution in order to get efficient algorithms.

2.8.2. Computing the positivity pattern


By considering the functional iteration (2.4.1), we may derive a method to infer the positivity pattern of the minimal solution in time O(N^3). Let us denote by e_k the k-th vector of the canonical basis, and set e_R = Σ_{r∈R} e_r for any set R ⊆ {1, . . . , N}.
The main idea of the algorithm is to follow the iteration x_{k+1} = M^{-1}(a + b(x_k, x_k)), checking at each step which entries become (strictly) positive. Notice that, since the iterates are nondecreasing, once an entry becomes positive for some k it stays positive. However, it is possible to reduce substantially the number of checks needed if we follow a different strategy. We consider a set S of entries known to be positive, i.e., i ∈ S if we already know that (x_k)_i > 0 at some step k. At the first step of the iteration, only the entries i such that (M^{-1}a)_i > 0 belong to this set. For each entry i, we check whether we can deduce the positiveness of more entries thanks to i and some other j ∈ S being positive, using the nonzero pattern of M^{-1}b(·, ·). As we prove formally in the following, it suffices to consider each i once. Therefore, we consider a second set T ⊆ S of positive entries that have not yet been checked for the consequences of their positiveness, and examine them one after the other.
We report the algorithm as Algorithm 1, and proceed to prove that it computes the positivity set of x^*.
Theorem 2.8.1 ([136]). The above algorithm runs in at most O(N^3) operations and computes the set {h ∈ {1, . . . , N} : x^*_h > 0}.
Proof. For the implementation of the working sets, we use the simple approach of keeping in memory two vectors S, T ∈ {0, 1}^N and setting to 1 the components corresponding to the indices in the sets. With this choice, insertions and membership tests are O(1), loops are easy to implement, and retrieving an element of the set costs at most O(N).
Let us first prove that the running time of the algorithm is at most O(N^3). If we precompute a PLU factorization of M, each subsequent operation M^{-1}v, for v ∈ R^N, costs O(N^2). The first for loop runs in at most O(N) operations. The body of the while loop runs at most N times, since an element can be inserted into S and T no more than once (S never decreases). Each of its iterations costs O(N^2), since evaluating b(e_t, e_S) is equivalent to computing the matrix-vector product between the matrix (B_{tij})_{i,j=1,...,N} and e_S, and similarly for b(e_S, e_t).

Algorithm 1: Compute the positivity pattern of the solution x^*.

Require: a, M, b
Ensure: S = {i : x^*_i > 0}
S ← ∅ {entries known to be positive}
T ← ∅ {entries to check}
a′ ← M^{-1} a
for i = 1 to N do
  if a′_i > 0 then
    T ← T ∪ {i}; S ← S ∪ {i}
  end if
end for
while T ≠ ∅ do
  t ← some element of T
  T ← T \ {t}
  u ← M^{-1}(b(e_S, e_t) + b(e_t, e_S)) {or only its positivity pattern}
  for i ∈ {1, . . . , N} \ S do
    if u_i > 0 then
      T ← T ∪ {i}; S ← S ∪ {i}
    end if
  end for
end while
return S

The fact that the algorithm computes the right set may not seem obvious at first sight. Since the sequence x_k is increasing, if one entry in x_k is positive, then it is positive for all iterates x_h, h > k. When does an entry of x_k become positive? The positive entries of x_1 are those of M^{-1}a; then, an entry t of x_{k+1} is positive if either the corresponding entry of x_k is positive or two entries r, s of x_k are positive and B_{rst} > 0. Thus, an entry in x^* is positive if and only if we can find a sequence S_i of subsets of {1, . . . , N} such that:

• S_0 = {h ∈ {1, . . . , N} : (M^{-1}a)_h > 0};
• S_{i+1} = S_i ∪ {t_i}, and there are two elements r, s ∈ S_i such that B_{rst_i} > 0.

For each element u of {h ∈ {1, . . . , N} : x^*_h > 0}, we may prove by induction on the length of its minimal sequence S_i that it eventually gets into S (and T). In fact, suppose that the last element of the sequence is S_l. Then, by inductive hypothesis, all the elements of S_{l−1} eventually get into S and T. All of them are removed from T at some step of the algorithm. When the last one is removed, the if condition triggers and

the element u is inserted into S. Conversely, if u is an element that gets inserted into S, then by considering the values of S at the successive steps of the algorithm we get a valid sequence {S_i}. □
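A direct Python transcription of Algorithm 1 might look as follows (a sketch: the working sets are kept as Python sets rather than 0–1 vectors, M is inverted once up front instead of being factorized, and only the positivity pattern of u matters):

```python
import numpy as np

def positivity_pattern(a, M, B):
    """Algorithm 1: return S = {i : x*_i > 0} for the minimal solution of (1)."""
    N = a.shape[0]
    Minv = np.linalg.inv(M)            # nonnegative, since M is an M-matrix
    S, T = set(), set()
    a1 = Minv @ a
    for i in range(N):
        if a1[i] > 0:
            S.add(i); T.add(i)
    while T:
        t = T.pop()                    # some element of T
        eS = np.zeros(N); eS[list(S)] = 1.0
        et = np.zeros(N); et[t] = 1.0
        # positivity pattern of M^{-1}(b(e_S, e_t) + b(e_t, e_S))
        u = Minv @ (np.einsum('ijk,i,j->k', B, eS, et)
                    + np.einsum('ijk,i,j->k', B, et, eS))
        for i in set(range(N)) - S:
            if u[i] > 0:
                S.add(i); T.add(i)
    return S
```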

It is a natural question to ask whether for the cases E3 and E4 it is possible to use the special structure of M and b in order to develop a similar algorithm with running time O(d^3), that is, the same as the cost per step of the basic iterations. Unfortunately, we were unable to go below O(d^4). It is therefore much less appealing to run this algorithm as a preliminary step, since its cost is likely to outweigh the cost of the actual solution. However, we remark that the strict positivity of the coefficients is usually a property of the problem rather than of the specific matrices involved, and can often be settled in the modeling phase before turning to the actual computations. An algorithm such as the above one would only be needed in an “automatic” subroutine to solve general instances of the problems E3 and E4.

2.9. Other concrete cases


In Bini et al. [30], the matrix equation

X + Σ_{i=1}^d A_i X^{-1} D_i = B − I

appears, where B, A_i, D_i ≥ 0 and the matrices B + D_j + Σ_{i=1}^d A_i are stochastic. The solution X = T − I, with T ≥ 0 minimal and substochastic, is sought. Their paper proposes a functional iteration and Newton’s method. By setting Y = −X^{-1} and multiplying both sides on the right by Y (so that XY = −I), we get

(I − B)Y = I + Σ_{i=1}^d A_i Y D_i Y,

which is again in the form (1). It is easy to see that Y is nonnegative whenever T is substochastic, and Y is minimal whenever T is.
The paper considers two functional iterations and the Newton method; all these algorithms are expressed in terms of X instead of Y, but they essentially coincide with those exposed in this chapter.
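The change of variable can be checked numerically; the following sketch builds B so that the original equation holds for a random invertible X (stochasticity is not enforced here, since only the algebraic identity is being verified):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 4, 2
A = [rng.random((n, n)) for _ in range(d)]
D = [rng.random((n, n)) for _ in range(d)]
X = rng.random((n, n)) + 5 * np.eye(n)   # any invertible X
# define B so that X + sum_i A_i X^{-1} D_i = B - I holds
B = X + sum(Ai @ np.linalg.solve(X, Di) for Ai, Di in zip(A, D)) + np.eye(n)

Y = -np.linalg.inv(X)
lhs = (np.eye(n) - B) @ Y
rhs = np.eye(n) + sum(Ai @ Y @ Di @ Y for Ai, Di in zip(A, D))
print(np.allclose(lhs, rhs))             # True: the substitution Y = -X^{-1} works
```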

2.10. Conclusions and research lines


We presented in this chapter a novel unified approach to the analysis of a family of quadratic vector and matrix equations [136]. This helps pinpoint the minimal hypotheses needed to derive the presented results, and in particular the role of the strict positivity of x^*. In some

cases, such as in Theorem 2.6.1, we can weaken the hypotheses of existing theorems; in other cases, this unified approach allows adapting the algorithms designed for one of these equations to the others, with good numerical results.
There are many open questions that could lead to better theoretical understanding of this class of equations and better solution algorithms. A first question is whether x ≤ x^* implies F(x) ≤ 0, and under which conditions the converse holds. If an easy criterion is found, then one could safely implement overrelaxation in many of the above algorithms, which would lead to faster convergence.
There is a wide theory, and there are several numerical methods, for E3 and E4 that rely on the specific matrix structure of the two equations; convergence properties and characteristics of the solutions are understood in terms of the spectral structure of the involved matrices. Some of these results are exposed in the following chapters. Is there a way to extend this theory to include all quadratic vector equations? What is the right way to generalize spectral properties?
In particular, it would be extremely interesting to find a useful expression of (1) in terms of an invariant subspace, as in (4) for E4 and (7) for E3. Applying recursively the techniques used in [128] could yield an expression of the type we are looking for, but with dimensions growing exponentially with N.
On the other hand, finding a more feasible invariant subspace formulation for (1) seems to be a challenging or even impossible problem. In fact, if we drop the nonnegativity assumptions, (1) can represent basically any system of N simultaneous quadratic equations. In finite fields, similar problems are known to be NP-hard [50]. An invariant subspace formulation of polynomial size would probably lead to a polynomial-time algorithm for the finite-field version of this problem, which seems too much to ask for.
Another interesting question is whether we can generalize this approach to the positive-definite ordering on symmetric matrices. This would lead to the further unification of the theory of a large class of equations, including the algebraic Riccati equations appearing in control theory [110]. A lemma proved by Ran and Reurings [139, Theorem 2.2] could replace the first part of Theorem 1.1.3 in an extension of the results of this chapter to the positive definite ordering.
