An Implicitly Restarted Block Arnoldi Method in A Vector-Wise Fashion
Correspondence to: Linzhang Lu, School of Mathematical Science, Xiamen University, Xiamen 361005, China.
Email: [email protected]
This work is supported by National Natural Science Foundation of China No. 10531080.
Numer. Math. J. Chinese Univ. (English Ser.), http://www.global-sci.org/nm
Q. Yin and L. Lu 269
Since deflation may occur, it becomes necessary to examine whether the structure of the matrices in the Arnoldi factorization is preserved after the truncation from order $k + p$ to order $k$. In this paper, we first prove that it is, which means that the implicit restarting technique can be combined with the vector-wise block Arnoldi method. We then develop an implicitly restarted block Arnoldi algorithm in a vector-wise fashion to solve large-scale eigenproblems.
In Section 2 we give some notation and the vector-wise block Arnoldi process introduced in [4]. In Section 3 we present our strategy for the computation of eigenvalues. In Section 4 we report numerical experiments which demonstrate that the proposed algorithm is effective.
2 Arnoldi-type algorithm
2.1 Block Krylov subspaces
We first introduce our notion of block Krylov subspaces for multiple starting vectors. Let $A \in \mathbb{R}^{N \times N}$ be a given $N \times N$ matrix and
$$R = [r_1 \;\; r_2 \;\; \cdots \;\; r_m] \in \mathbb{R}^{N \times m} \tag{1}$$
be a given matrix of $m$ right starting vectors $r_1, r_2, \ldots, r_m$. In contrast to the case $m = 1$, linear independence of the columns in the block Krylov sequence,
$$R, \; AR, \; A^2 R, \; \ldots, \; A^{j-1} R, \; \ldots \tag{2}$$
is lost gradually in general. By scanning the columns of the matrices in (2) from left to right and deleting each column that is linearly dependent on earlier columns, we obtain the deflated block Krylov sequence
$$R_1, \; A R_2, \; A^2 R_3, \; \ldots, \; A^{j_{\max}-1} R_{j_{\max}}. \tag{3}$$
This process of deleting linearly dependent vectors is referred to as exact deflation in the following. In (3), for each $j = 1, 2, \ldots, j_{\max}$, $R_j$ is a submatrix of $R_{j-1}$, with $R_j \neq R_{j-1}$ if, and only if, deflation occurred within the $j$th Krylov block $A^{j-1} R$ in (2). Here, for $j = 1$, we set $R_0 = R$. Denoting by $m_j$ the number of columns of $R_j$, we thus have
$$m \ge m_1 \ge m_2 \ge \cdots \ge m_{j_{\max}} \ge 1. \tag{4}$$
By construction, the columns of the matrices (3) are linearly independent, and for each $n$, the subspace spanned by the first $n$ of these columns is called the $n$th block Krylov subspace (induced by $A$ and $R$). In the following, we denote the $n$th block Krylov subspace by $K_n(A, R)$. For later use, we remark that for
$$n = m_1 + m_2 + \cdots + m_j, \tag{5}$$
where $1 \le j \le j_{\max}$, the $n$th block Krylov subspace is given by
$$K_n(A, R) = \mathrm{Colspan}\{R_1, \; A R_2, \; A^2 R_3, \; \ldots, \; A^{j-1} R_j\}. \tag{6}$$
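The scan-and-delete construction of the deflated sequence (3) can be illustrated with a small sketch. It is not part of the method of this paper: the function name `deflated_block_krylov`, the relative tolerance `tol`, and the use of Gram-Schmidt projection as a numerical stand-in for the exact test "linearly dependent on earlier columns" are all assumptions made for illustration, and the sketch is only suitable for tiny dense examples.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def deflated_block_krylov(A, R_cols, jmax, tol=1e-12):
    """Scan R, AR, A^2 R, ... column by column (hypothetical helper).

    Returns (kept, sizes): the columns that survive deletion and the
    block sizes m_1, m_2, ... of the deflated sequence (3).
    """
    basis = []                       # orthonormal basis of the kept columns
    kept, sizes = [], []
    block = [list(c) for c in R_cols]
    for _ in range(jmax):
        survivors = []
        for col in block:
            w = list(col)
            for q in basis:          # project out the earlier columns
                h = dot(q, w)
                w = [wi - h * qi for wi, qi in zip(w, q)]
            nw = math.sqrt(dot(w, w))
            # Assumed numerical test for linear dependence on earlier columns.
            if nw > tol * max(1.0, math.sqrt(dot(col, col))):
                basis.append([wi / nw for wi in w])
                kept.append(col)
                survivors.append(col)
        if not survivors:            # every column was deleted: sequence ends
            break
        sizes.append(len(survivors))
        # Only surviving columns are propagated, so R_{j+1} is a submatrix of R_j.
        block = [matvec(A, c) for c in survivors]
    return kept, sizes
```

For a diagonal $3 \times 3$ matrix with starting vectors $(1,1,1)^T$ and $(1,0,0)^T$, the second starting vector is an eigenvector, so its image is deleted in the second block and the block sizes drop from 2 to 1, in line with (4).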
2.2 Basis vectors
The columns of the deflated block Krylov sequence (3), which is used in the above definition of $K_n(A, R)$, tend to be almost linearly dependent even for moderate values of $n$. Therefore, they should not be used in numerical computations. Instead, we construct other suitable basis vectors.
In the following,
$$v_1, v_2, \ldots, v_n \in \mathbb{R}^N \tag{7}$$
denotes a set of basis vectors for $K_n(A, R)$, i.e.,
$$\mathrm{span}\{v_1, v_2, \ldots, v_n\} = K_n(A, R).$$
The $N \times n$ matrix
$$V_n := [v_1 \;\; v_2 \;\; \cdots \;\; v_n] \tag{8}$$
whose columns are the basis vectors (7) is called a basis matrix for $K_n(A, R)$.
2.3 Arnoldi-type algorithm
The classical Arnoldi process generates orthonormal basis vectors for the sequence of Krylov subspaces $K_n(A, r)$, $n \ge 1$, induced by $A$ and a single starting vector $r$. In this subsection, we state an Arnoldi-type algorithm that extends the classical algorithm to block-Krylov subspaces $K_n(A, R)$, $n \ge 1$.
Like the classical Arnoldi process, the algorithm constructs the basis vectors (7) to be orthonormal. In terms of the basis matrix (8), this orthonormality can be expressed as follows:
$$V_n^T V_n = I.$$
In addition to (7), the algorithm produces the so-called candidate vectors,
$$\hat{v}_{n+1}, \hat{v}_{n+2}, \ldots, \hat{v}_{n+m_c}, \tag{9}$$
for the next $m_c$ basis vectors $v_{n+1}, v_{n+2}, \ldots, v_{n+m_c}$. Here, $m_c = m_c(n)$ is the number $m$ of columns in the starting block (1), $R$, reduced by the number of exact and inexact deflations that have occurred so far. The candidate vectors (9) satisfy the orthogonality relation
$$V_n^T [\hat{v}_{n+1} \;\; \hat{v}_{n+2} \;\; \cdots \;\; \hat{v}_{n+m_c}] = 0.$$
Due to the vector-wise construction of (7) and (9), detection of necessary deflation and the actual deflation become trivial. In fact, essentially the same proof as given in [1] for the case of a Lanczos-type algorithm can be used to show that exact deflation at step $n$ of the Arnoldi-type process occurs if, and only if, $\hat{v}_n = 0$. Similarly, inexact deflation occurs if, and only if, $\hat{v}_n \neq 0$, but $\|\hat{v}_n\| \approx 0$. Therefore, in the algorithm, one checks if
$$\|\hat{v}_n\| \le \mathrm{dtol}, \tag{10}$$
where $\mathrm{dtol} \ge 0$ is a suitably chosen deflation tolerance. If (10) is satisfied, then $\hat{v}_n$ is deflated by deleting $\hat{v}_n$, shifting the indices of all remaining candidate vectors by $-1$, and setting $m_c = m_c - 1$. If this results in $m_c = 0$, then the block-Krylov subspace is exhausted and the algorithm is stopped. Otherwise, the deflation procedure is repeated until a vector $\hat{v}_n$ with $\|\hat{v}_n\| > \mathrm{dtol}$ is found. This vector is then turned into $v_n$ by normalizing it to have Euclidean norm 1.
A complete statement of the resulting Arnoldi-type algorithm is as follows.
Algorithm 1
function [V, W, T, Rho, m_c, n] = vbArnoldi(A, N, R, m, jmax)
(0) Set $\hat{v}_i = r_i$ for $i = 1, 2, \ldots, m$.
    Set $m_c = m$.
(1) For $j = 1, 2, \ldots, j_{\max}$ do:
    (1.1) Compute $\|\hat{v}_j\|$ and check if the deflation criterion (10) is fulfilled.
          If yes, $\hat{v}_j$ is deflated by doing the following:
              Set $m_c = m_c - 1$. If $m_c = 0$, set $j = j - 1$ and $n = j$, and stop.
              Set $\hat{v}_i = \hat{v}_{i+1}$ for $i = j, j + 1, \ldots, j + m_c - 1$.
              Return to Step (1.1).
    (1.2) Set $t_{j,j-m_c} = \|\hat{v}_j\|$ and $v_j = \hat{v}_j / t_{j,j-m_c}$.
    (1.3) Compute $\hat{v}_{j+m_c} = A v_j$.
    (1.4) For $i = 1, 2, \ldots, j$ do:
              Set $t_{i,j} = v_i^T \hat{v}_{j+m_c}$ and $\hat{v}_{j+m_c} = \hat{v}_{j+m_c} - v_i t_{i,j}$.
    (1.5) For $i = j - m_c + 1, j - m_c + 2, \ldots, j - 1$ do:
              Set $t_{j,i} = v_j^T \hat{v}_{i+m_c}$ and $\hat{v}_{i+m_c} = \hat{v}_{i+m_c} - v_j t_{j,i}$.
(2) Set $n = j$, $V = [v_1 \;\; v_2 \;\; \cdots \;\; v_n]$ and $W = [\hat{v}_{n+1} \;\; \hat{v}_{n+2} \;\; \cdots \;\; \hat{v}_{n+m_c}]$;
    Set $T = [t_{i,l}]_{i=1,\ldots,n;\; l=1,\ldots,n}$ and $\mathrm{Rho} = [t_{i,l-m}]_{i=1,\ldots,n;\; l=1,\ldots,m}$.
Remark 2.1. Other block-Arnoldi algorithms (all without deflation though) can be found in [6, Section 6.12].
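To make the index bookkeeping of Algorithm 1 concrete, the following is a minimal sketch for small dense matrices stored as nested lists. It is an illustration under stated assumptions, not the authors' Matlab code: the name `vb_arnoldi`, the queue `cand` holding the current candidate vectors, and the dictionary `t` keyed by the 1-based index pairs of Algorithm 1 (with non-positive column indices feeding Rho) are all choices made here.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def vb_arnoldi(A, R_cols, nmax, dtol=1e-10):
    """Vector-wise block Arnoldi with deflation (illustrative sketch).

    Returns (V, W, T, Rho): basis vectors, remaining candidate vectors,
    and the coefficient matrices of (11) as nested lists.
    """
    m = len(R_cols)
    cand = [list(r) for r in R_cols]   # candidates v^_j, ..., v^_{j+mc-1}
    V, t = [], {}                      # t[(i, l)] uses 1-based i; l may be <= 0
    while len(V) < nmax and cand:
        j = len(V) + 1
        vhat = cand.pop(0)
        nrm = math.sqrt(dot(vhat, vhat))
        if nrm <= dtol:
            continue                   # step (1.1): deflate, m_c drops by one
        mc = len(cand) + 1
        t[(j, j - mc)] = nrm           # step (1.2)
        vj = [x / nrm for x in vhat]
        V.append(vj)
        w = matvec(A, vj)              # step (1.3)
        for i, vi in enumerate(V, start=1):        # step (1.4)
            h = dot(vi, w)
            t[(i, j)] = h
            w = [a - h * b for a, b in zip(w, vi)]
        for k, c in enumerate(cand):               # step (1.5)
            i = j + 1 + k - mc         # cand[k] plays the role of v^_{i+mc}
            h = dot(vj, c)
            t[(j, i)] = h
            cand[k] = [a - h * b for a, b in zip(c, vj)]
        cand.append(w)
    n = len(V)
    T = [[t.get((i, l), 0.0) for l in range(1, n + 1)] for i in range(1, n + 1)]
    Rho = [[t.get((i, l - m), 0.0) for l in range(1, m + 1)] for i in range(1, n + 1)]
    return V, cand, T, Rho
```

Deleting the head of the queue lowers $m_c$ by one and shifts every remaining candidate index down by one, so the column index $i$ attached to each candidate in step (1.5) is unchanged by a deflation. The sketch does not re-check the remaining candidates when it stops at `nmax`.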
After $n$ passes through the main loop, Algorithm 1 has constructed the first $n$ basis vectors (7) and the candidate vectors (9) for the next $m_c$ basis vectors. In terms of the basis matrix (8), $V_n$, the recurrences used to generate all these vectors can be written compactly in matrix format. We collect the recurrence coefficients computed during the first $n$ passes through the main loop of Algorithm 1 in the matrices
$$\mathrm{Rho} := [t_{i,l-m}]_{i=1,\ldots,n;\; l=1,\ldots,m} \quad \text{and} \quad T_n := [t_{i,l}]_{i=1,\ldots,n;\; l=1,\ldots,n}, \tag{11}$$
respectively. Moreover, in (11), all elements $t_{i,l-m}$ and $t_{i,l}$ that are not explicitly defined in Algorithm 1 are set to be zero. The compact statement of the recurrences used in Algorithm 1 is now as follows. For $n \ge 1$, we have
$$A V_n = V_n T_n + [0 \;\; \cdots \;\; 0 \;\; \hat{v}_{n+1} \;\; \hat{v}_{n+2} \;\; \cdots \;\; \hat{v}_{n+m_c}]. \tag{12}$$
In (12), we assumed that only exact deflations are performed. If both exact and inexact deflations are performed, then an additional matrix term, say $\hat{V}_n^{\mathrm{defl}}$, appears on the right-hand side of (12). The only non-zero columns of $\hat{V}_n^{\mathrm{defl}}$ are those non-zero vectors that satisfied the deflation check (10). Since at any stage of Algorithm 1, at most $m - m_c = m - m_c(n)$ vectors have been deflated, the additional matrix term is small in the sense that
$$\|\hat{V}_n^{\mathrm{defl}}\| \le \mathrm{dtol} \, \sqrt{m - m_c(n)}.$$
3 Our algorithm for computing the eigenvalues
3.1 The Arnoldi factorization
Let $A V_k = V_k T_k + [0 \;\; \cdots \;\; 0 \;\; \hat{v}_{k+1} \;\; \hat{v}_{k+2} \;\; \cdots \;\; \hat{v}_{k+m_c}]$ be a $k$-step Arnoldi factorization of $A$. We can easily use $p$ additional steps (see Algorithm 2) to get the $(k + p)$-step Arnoldi factorization
$$A V_{k+p} = V_{k+p} T_{k+p} + [0 \;\; \cdots \;\; 0 \;\; \hat{v}_{k+p+1} \;\; \hat{v}_{k+p+2} \;\; \cdots \;\; \hat{v}_{k+p+m_c}]. \tag{13}$$
If we denote
$$\hat{W}_{k+p} = [\hat{v}_{k+p+1} \;\; \hat{v}_{k+p+2} \;\; \cdots \;\; \hat{v}_{k+p+m_c}], \tag{14}$$
then we can write (13) in the following form
$$A V_{k+p} = (V_{k+p}, \; \hat{W}_{k+p}) \begin{pmatrix} T_{k+p} \\ (0 \;\vdots\; I_{m_c}) \end{pmatrix}, \tag{15}$$
where $I_{m_c}$ is the $m_c \times m_c$ identity matrix.
Algorithm 2
function [V, W, T, Rho, m_c, n] = vbArnoldi2(A, V, W, T, Rho, k, p, m_c, m)
Input : $AV - VT = W (0 \;\vdots\; I_{m_c})$ with $V^T V = I_k$, $V^T W = 0$.
Output : $AV - VT = W (0 \;\vdots\; I_{m_c})$ with $V^T V = I_{k+p}$, $V^T W = 0$.
(1) For $j = k + 1, k + 2, \ldots, k + p$ do:
    (1.1) Compute $\|\hat{v}_j\|$ and check if the deflation criterion (10) is fulfilled.
          If yes, $\hat{v}_j$ is deflated by doing the following:
              Set $m_c = m_c - 1$. If $m_c = 0$, set $j = j - 1$ and $n = j$, and stop.
              Set $\hat{v}_i = \hat{v}_{i+1}$ for $i = j, j + 1, \ldots, j + m_c - 1$.
              Return to Step (1.1).
    (1.2) Set $t_{j,j-m_c} = \|\hat{v}_j\|$ and $v_j = \hat{v}_j / t_{j,j-m_c}$.
    (1.3) Compute $\hat{v}_{j+m_c} = A v_j$.
    (1.4) For $i = 1, 2, \ldots, j$ do:
              Set $t_{i,j} = v_i^T \hat{v}_{j+m_c}$ and $\hat{v}_{j+m_c} = \hat{v}_{j+m_c} - v_i t_{i,j}$.
    (1.5) For $i = j - m_c + 1, j - m_c + 2, \ldots, j - 1$ do:
              Set $t_{j,i} = v_j^T \hat{v}_{i+m_c}$ and $\hat{v}_{i+m_c} = \hat{v}_{i+m_c} - v_j t_{j,i}$.
(2) Set $n = j$, $V = [v_1 \;\; v_2 \;\; \cdots \;\; v_n]$ and $W = [\hat{v}_{n+1} \;\; \hat{v}_{n+2} \;\; \cdots \;\; \hat{v}_{n+m_c}]$.
3.2 Updating the Arnoldi factorization
Throughout this discussion, the integer $k$ should be thought of as a fixed prespecified integer of modest size. Let $p$ be another positive integer, and consider the result of $k + p$ steps of the Arnoldi process applied to $A$, which has the factorization form (15). In practice, we choose $p$ larger than $k$, and here we assume that $p > k > m_1$ throughout the following discussion.
Let $\mu$ be a shift, put $V = V_{k+p}$, $T = T_{k+p}$, and let $T - \mu I = QR$ with $Q$ orthogonal and $R$ upper triangular. An analogue of the explicitly shifted QR algorithm may be applied to (13). It consists of the following four steps:
$$(A - \mu I)V - V(T - \mu I) = \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}), \tag{16}$$
$$(A - \mu I)V - VQR = \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}), \tag{17}$$
$$(A - \mu I)(VQ) - (VQ)(RQ) = \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}) Q, \tag{18}$$
$$A(VQ) - (VQ)(RQ + \mu I) = \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}) Q. \tag{19}$$
Let $V^+ = VQ$ and $T^+ = RQ + \mu I = Q^T Q R Q + \mu Q^T Q = Q^T (QR + \mu I) Q = Q^T T Q$.
Proposition 3.1. Suppose that $T \in \mathbb{R}^{(k+p) \times (k+p)}$ is an $m_1$-upper Hessenberg matrix (i.e. for $1 \le j \le k + p - m_1 - 1$, $T(j + m_1 + 1 : k + p, \, j) = 0$), and $T - \mu I = QR$ with $Q$ orthogonal and $R$ upper triangular. Then $Q$ has the same zero structure as $T$ (i.e. if $T(i, j) = 0$, then $Q(i, j) = 0$) and $T^+ = RQ + \mu I = Q^T T Q$ is also an $m_1$-upper Hessenberg matrix, like $T$.
Proof The proof is similar to the one in [3, p. 166, Proposition 4.5].
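The structure preservation stated in Proposition 3.1 can be checked numerically on a small example. The sketch below is only an illustration: the helper names (`qr_mgs`, `shifted_qr_step`, `lower_bandwidth`) are hypothetical, the QR factorization is computed by modified Gram-Schmidt rather than by the Householder or Givens transformations a production code would use, and it assumes $T - \mu I$ is nonsingular.

```python
import math

def qr_mgs(M):
    """QR factorization by modified Gram-Schmidt (assumes M is nonsingular)."""
    n = len(M)
    cols = [[M[i][j] for i in range(n)] for j in range(n)]
    Qc = []                                    # orthonormal columns of Q
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        w = list(a)
        for i, q in enumerate(Qc):
            R[i][j] = sum(x * y for x, y in zip(q, w))
            w = [x - R[i][j] * y for x, y in zip(w, q)]
        R[j][j] = math.sqrt(sum(x * x for x in w))
        Qc.append([x / R[j][j] for x in w])
    Q = [[Qc[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def shifted_qr_step(T, mu):
    """One explicitly shifted QR step: T - mu*I = QR, T_plus = RQ + mu*I."""
    n = len(T)
    S = [[T[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    Q, R = qr_mgs(S)
    RQ = matmul(R, Q)
    T_plus = [[RQ[i][j] + (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    return Q, T_plus

def lower_bandwidth(M, tol=1e-12):
    """Largest i - j with |M[i][j]| > tol below the diagonal (0 if none)."""
    n = len(M)
    return max([i - j for i in range(n) for j in range(i) if abs(M[i][j]) > tol],
               default=0)
```

Applied to a matrix with lower bandwidth $m_1 = 2$, both $Q$ and $T^+$ returned by `shifted_qr_step` should again have lower bandwidth at most 2, and the trace of $T^+$ should agree with that of $T$ because the two matrices are similar.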
Proposition 3.2. Denote $v_i^+ = V^+ e_i$ and $r_{ij} = e_i^T R e_j$. Then we have
$$(A - \mu I)(v_1, v_2, \ldots, v_{m_1}) = (v_1^+, v_2^+, \ldots, v_{m_1}^+) \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1,m_1} \\ 0 & r_{22} & \cdots & r_{2,m_1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_{m_1,m_1} \end{pmatrix}.$$
Proof As we have assumed that $p > k > m_1$, we have
$$\hat{W}_{k+p} (0 \;\vdots\; I_{m_c}) e_i = 0, \quad \text{for } i = 1, 2, \ldots, m_1.$$
Applying the matrices in (17) to the vector $e_1$, we get
$$(A - \mu I) V e_1 - V Q R e_1 = \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}) e_1 = 0,$$
that is,
$$(A - \mu I) v_1 = r_{11} v_1^+. \tag{20}$$
Applying the matrices in (17) to the vector $e_2$, we get
$$(A - \mu I) V e_2 - V Q R e_2 = \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}) e_2 = 0,$$
that is,
$$(A - \mu I) v_2 = V Q \left[ \begin{pmatrix} r_{12} \\ 0 \\ \vdots \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ r_{22} \\ \vdots \\ 0 \end{pmatrix} \right] = r_{12} v_1^+ + r_{22} v_2^+. \tag{21}$$
It follows from (20) and (21) that
$$(A - \mu I)(v_1, v_2) = (v_1^+, v_2^+) \begin{pmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{pmatrix}. \tag{22}$$
The above result can be extended to $v_3, v_4, \ldots, v_{m_1}$:
$$(A - \mu I)(v_1, v_2, \ldots, v_{m_1}) = (v_1^+, v_2^+, \ldots, v_{m_1}^+) \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1,m_1} \\ 0 & r_{22} & \cdots & r_{2,m_1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_{m_1,m_1} \end{pmatrix}.$$
This completes the proof of Proposition 3.2.
This idea may be extended for up to $p$ shifts being applied successively. The application of a QR-iteration corresponding to a shift $\mu$ produces an $m_1$-upper Hessenberg orthogonal $Q \in \mathbb{R}^{(k+p) \times (k+p)}$ such that
$$A V_{k+p} Q = (V_{k+p} Q, \; \hat{W}_{k+p}) \begin{pmatrix} Q^T T_{k+p} Q \\ (0 \;\vdots\; I_{m_c}) Q \end{pmatrix}.$$
An application of $p$ shifts therefore results in
$$A V_{k+p}^+ = (V_{k+p}^+, \; \hat{W}_{k+p}) \begin{pmatrix} T_{k+p}^+ \\ (0 \;\vdots\; I_{m_c}) \hat{Q} \end{pmatrix}, \tag{23}$$
where $V_{k+p}^+ = V_{k+p} \hat{Q}$, $T_{k+p}^+ = \hat{Q}^T T_{k+p} \hat{Q}$ and $\hat{Q} = Q_1 Q_2 \cdots Q_p$, with $Q_j$ the orthogonal matrix associated with the shift $\mu_j$.
Now, partition
$$V_{k+p}^+ = (V_k^+, \; \hat{V}_p), \qquad T_{k+p}^+ = \begin{pmatrix} T_k^+ & M \\ (0 \;\vdots\; I_p) \, T_{k+p}^+ \begin{pmatrix} I_k \\ 0 \end{pmatrix} & \hat{T}_p \end{pmatrix}, \tag{24}$$
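and note
$$(0 \;\vdots\; I_{m_c}) \hat{Q} = (0 \;\vdots\; I_{m_c}) \left( \hat{Q} \begin{pmatrix} I_k \\ 0 \end{pmatrix}, \; \hat{Q} \begin{pmatrix} 0 \\ I_p \end{pmatrix} \right) = \left( (0 \;\vdots\; I_{m_c}) \hat{Q} \begin{pmatrix} I_k \\ 0 \end{pmatrix}, \; (0 \;\vdots\; I_{m_c}) \hat{Q} \begin{pmatrix} 0 \\ I_p \end{pmatrix} \right).$$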
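Substituting into (23) gives
$$A (V_k^+, \; \hat{V}_p) = (V_k^+, \; \hat{V}_p, \; \hat{W}_{k+p}) \begin{pmatrix} T_k^+ & M \\ (0 \;\vdots\; I_p) \, T_{k+p}^+ \begin{pmatrix} I_k \\ 0 \end{pmatrix} & \hat{T}_p \\ (0 \;\vdots\; I_{m_c}) \hat{Q} \begin{pmatrix} I_k \\ 0 \end{pmatrix} & (0 \;\vdots\; I_{m_c}) \hat{Q} \begin{pmatrix} 0 \\ I_p \end{pmatrix} \end{pmatrix}. \tag{25}$$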
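Equating the first $k$ columns on both sides of (25) gives
$$\begin{aligned} A V_k^+ &= V_k^+ T_k^+ + \hat{V}_p (0 \;\vdots\; I_p) \, T_{k+p}^+ \begin{pmatrix} I_k \\ 0 \end{pmatrix} + \hat{W}_{k+p} (0 \;\vdots\; I_{m_c}) \hat{Q} \begin{pmatrix} I_k \\ 0 \end{pmatrix} \\ &= V_k^+ T_k^+ + (0 \;\vdots\; \hat{V}_p) \, T_{k+p}^+ \begin{pmatrix} I_k \\ 0 \end{pmatrix} + (0 \;\vdots\; \hat{W}_{k+p}) \hat{Q} \begin{pmatrix} I_k \\ 0 \end{pmatrix}. \end{aligned} \tag{26}$$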
Theorem 3.1. With the above notation,
$$A V_k^+ = V_k^+ T_k^+ + [0 \;\; \cdots \;\; 0 \;\; \hat{v}_{k+1}^+ \;\; \hat{v}_{k+2}^+ \;\; \cdots \;\; \hat{v}_{k+m_c}^+]. \tag{27}$$
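Proof Note that for
$$(0 \;\vdots\; \hat{V}_p) \, T_{k+p}^+ \begin{pmatrix} I_k \\ 0 \end{pmatrix} = [\xi_1 \;\; \xi_2 \;\; \cdots \;\; \xi_k] \in \mathbb{R}^{N \times k},$$
we have $\xi_i = 0$ for $1 \le i \le k - m_c$, and
$$\xi_i \neq 0 \quad \text{for } i = k - m_c + 1, \ldots, k,$$
which means there are $m_c$ nonzero columns. Similarly,
$$(0 \;\vdots\; \hat{W}_{k+p}) \hat{Q} \begin{pmatrix} I_k \\ 0 \end{pmatrix} = [u_1 \;\; u_2 \;\; \cdots \;\; u_k] \in \mathbb{R}^{N \times k}$$
has fewer than $m_c$ nonzero columns. This leads to
$$A V_k^+ = V_k^+ T_k^+ + [0 \;\; \cdots \;\; 0 \;\; \hat{v}_{k+1}^+ \;\; \hat{v}_{k+2}^+ \;\; \cdots \;\; \hat{v}_{k+m_c}^+],$$
which is the desired result.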
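We denote
$$\hat{W}_k^+ = [\hat{v}_{k+1}^+ \;\; \hat{v}_{k+2}^+ \;\; \cdots \;\; \hat{v}_{k+m_c}^+],$$
so we have
$$A V_k^+ = (V_k^+, \; \hat{W}_k^+) \begin{pmatrix} T_k^+ \\ (0 \;\vdots\; I_{m_c}) \end{pmatrix}. \tag{28}$$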
Note that $(V_k^+)^T \hat{V}_p = 0$ and $(V_k^+)^T \hat{W}_{k+p} = 0$, so $(V_k^+)^T \hat{W}_k^+ = 0$. Thus (28) is a legitimate Arnoldi factorization of $A$. Using this as a starting point it is possible to use $p$ additional steps (see Algorithm 2) to return to the original form (13).
3.3 The complete algorithm for computing the eigenvalues
Based on the above strategy, we can present our complete algorithm.
Algorithm 3
(1) Start: Given the number $q$ of the desired eigenpairs;
    Choose the number of steps $k$ of the Arnoldi-type process;
    Choose $p$ unwanted eigenvalues as $p$ shifts;
    Given the initial block size $m$, a user-prescribed tolerance tol and an initial $N \times m$ block vector $V_1$.
(2) [V, W, T, Rho, m_c, n] = vbArnoldi(A, N, V_1, m, k + p).
(3) Compute the eigenvalues $\lambda_i^{(k+p)}$ ($i = 1, 2, \ldots, k + p$) of $T_{k+p}$, and select $q$ of them as approximations to the desired eigenvalues $\lambda_i$; here $y_i$ denotes the eigenvector associated with $\lambda_i^{(k+p)}$, which satisfies $T_{k+p} y_i = \lambda_i^{(k+p)} y_i$.
(4) (Compute the approximate eigenvectors of $A$ and the corresponding residual norms as convergence testing criteria.)
    (4.1) $x_i := V_{k+p} y_i$, $i = 1, 2, \ldots, q$
    (4.2) $\rho_i := \|A x_i - \lambda_i^{(k+p)} x_i\|$, $i = 1, 2, \ldots, q$
(5) If $\rho_i <$ tol ($i = 1, 2, \ldots, q$)
        stop;
    else
    (5.1) Sort the spectrum $\sigma(T)$ according to the algebraically largest real part (the largest modulus) and select the $p$ eigenvalues with the smallest real part (modulus) as shifts; here we use a vector $\mu$ to store these $p$ shifts.
    (5.2) $Q := I_{k+p}$;
    (5.3) for $j = 1, 2, \ldots, p$
        (5.3.1) $T - \mu(j) I = Q_j R_j$ (perform the QR factorization);
        (5.3.2) $T := Q_j^T T Q_j$;
        (5.3.3) $Q := Q Q_j$;
    (5.4) $W := (VQ) \begin{pmatrix} 0 \\ I_p \end{pmatrix} (0 \;\vdots\; I_p) \, T \begin{pmatrix} I_k \\ 0 \end{pmatrix} + W (0 \;\vdots\; I_{m_c}) Q \begin{pmatrix} I_k \\ 0 \end{pmatrix}$;
    (5.5) $V := (VQ) \begin{pmatrix} I_k \\ 0 \end{pmatrix}$;
    (5.6) Compute the number $m_c$ of the nonzero columns of $W$;
    (5.7) $W := W \begin{pmatrix} 0 \\ I_{m_c} \end{pmatrix}$;
    (5.8) $T := (I_k \;\vdots\; 0) \, T \begin{pmatrix} I_k \\ 0 \end{pmatrix}$, $\mathrm{Rho} := (I_k \;\vdots\; 0) \, \mathrm{Rho}$;
    (5.9) [V, W, T, Rho, m_c, n] = vbArnoldi2(A, V, W, T, Rho, k, p, m_c, m)
    (5.10) go to step (3).
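The convergence test in steps (4.1), (4.2) and (5) of Algorithm 3 amounts to forming each approximate eigenvector and measuring its residual. The sketch below shows this for the simplified case of real approximate eigenvalues and small dense matrices stored as nested lists; the function names `ritz_vector`, `residual_norm` and `converged` are hypothetical, and the eigenpairs of $T_{k+p}$ are assumed to be supplied by some eigensolver that is not shown.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def ritz_vector(V_cols, y):
    """Step (4.1): x = V y, with V given as a list of basis vectors (columns)."""
    N = len(V_cols[0])
    return [sum(V_cols[j][r] * y[j] for j in range(len(y))) for r in range(N)]

def residual_norm(A, x, theta):
    """Step (4.2): Euclidean norm of A x - theta x (theta assumed real)."""
    Ax = matvec(A, x)
    return math.sqrt(sum((a - theta * b) ** 2 for a, b in zip(Ax, x)))

def converged(A, V_cols, pairs, tol):
    """Step (5): True if every selected pair (theta, y) has residual below tol."""
    return all(residual_norm(A, ritz_vector(V_cols, y), theta) < tol
               for theta, y in pairs)
```

A complex approximate eigenvalue would require complex arithmetic in `residual_norm`; that case is left out of this sketch.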
4 Numerical experiments
We have tested Algorithm 3 using Matlab 6.1 on a personal computer with machine precision $\mathrm{eps} \approx 2.22 \times 10^{-16}$.
In the experiment reported below, we test the matrix 685_bus from http://math.nist.gov/MatrixMarket/.
[2] Bai Z, Demmel J, Dongarra J, Ruhe A, van der Vorst H, editors. Templates for the solution of
algebraic eigenvalue problems: A practical guide. SIAM, Philadelphia, 2000.
[3] Demmel J. W. Applied numerical linear algebra. SIAM, Philadelphia, 1997.
[4] Freund R W. Krylov-subspace methods for reduced-order modeling in circuit simulation. J. Comput.
Appl. Math., 2000, 123: 395-421.
[5] Lehoucq R B, Maschhoff K J. Implementation of an implicitly restarted block Arnoldi method. Preprint MCS-P649-0297, Argonne National Laboratory, Argonne, IL, 1997.
[6] Saad Y. Iterative methods for sparse linear systems, PWS Publishing Company, Boston, 1996.
[7] Sorensen D C. Implicit application of polynomial filters in a k-step Arnoldi method. SIAM J. Matrix Anal. Appl., 1992, 13: 357-385.