Structure preserving restarts of the non-symmetric Lanczos algorithm via the implicitly shifted LR algorithm
Funding: This work was funded by Nordita and the Swedish Research Council Grant No. 2018-04290. Nordita is partially supported by Nordforsk.
P. S. Negi
Nordita, Stockholm University, KTH Royal Institute of Technology, Stockholm, Sweden.
C. Arratia
Nordita, Stockholm University, KTH Royal Institute of Technology, Stockholm, Sweden.
Abstract
The implicitly shifted QR iteration is used as a restart procedure for the Arnoldi method when calculating a few dominant eigenvalues of a large matrix. We show that the underlying idea of implicit polynomial filtering can be used in much the same manner, via the implicitly shifted LR iteration, to create a restart procedure for the non-symmetric Lanczos algorithm for eigenvalue computations. This restart procedure preserves the tridiagonal structure of the reduced matrix.
Keywords: LR algorithm, non-symmetric Lanczos, implicit restart
MSC codes: 68Q25, 68R10, 68U05
1 Introduction
The Arnoldi iteration [1] is a popular Krylov subspace method for calculating a few eigenvalues of a large matrix. The method generates a sequence of Krylov vectors, which span the subspace within which approximations of the eigenvalue-eigenvector pairs are obtained. Depending on the accuracy and the number of eigenpair approximations needed, the Krylov space can become exceedingly large, so that the quality of the results may be limited by the available memory. Sorensen [18] introduced an elegant procedure for restarting the Arnoldi factorization based on polynomial filters, which are applied through implicitly shifted QR iterations on the reduced Hessenberg matrix obtained from the Arnoldi method. In particular, the use of exact shifts was shown to be successful in driving convergence towards the eigenspace of the specified eigenvalues [18]. The method has subsequently found widespread application through the ARPACK library [11].
An alternative, more recent method for eigenvalue computations is the Krylov-Schur method introduced by Stewart [19], an implementation of which is available in the SLEPc library [9]. Krylov-Schur restarts are known to be less sensitive than restarts based on the implicitly shifted QR method. However, they do not preserve the Hessenberg structure of the reduced matrix, which may sometimes be required. The use of QR iterations ensures that the reduced matrix keeps its Hessenberg structure through the transforms that make up the restart process. If the underlying matrix is symmetric, the Arnoldi iteration reduces to the Lanczos algorithm, and the Hessenberg matrix reduces to a symmetric tridiagonal matrix. The QR iteration preserves the symmetric tridiagonal structure as well and, as pointed out by Sorensen [18], the implicit restart process applies equally well to the Lanczos method for symmetric matrices.
The Lanczos algorithm introduced in [10] is in fact the predecessor of the Arnoldi method and can be used for non-symmetric matrices as well, in which case it is referred to as the non-symmetric Lanczos or bi-Lanczos method. It has been used in investigations of plasma instabilities [13], linear system solvers [8], model reduction [12], and more recently for non-linear eigenvalue problems [5]. For the non-symmetric Lanczos method one builds two Krylov subspaces, referred to as the right and the left Krylov subspaces. The idea behind the Arnoldi and Lanczos methods is similar: one obtains a projection of the original large matrix onto an appropriate reduced subspace, such that the eigenvalues may be approximated by the eigenvalues of the reduced operator. The difference is that the Arnoldi method yields an orthogonal projection onto a subspace, while the Lanczos method yields an oblique projection. One would therefore like to extend the idea of implicit restarts to the non-symmetric Lanczos algorithm as well. However, the reduced matrix that one obtains in this case is a non-symmetric tridiagonal matrix, with the tridiagonal structure being the result of the recurrence relations of the Lanczos algorithm [17]. Since the QR iterations do not preserve the banded structure of non-symmetric matrices, a straightforward application of the restart procedure put forward by Sorensen leads to a loss of this tridiagonal structure (the Hessenberg structure is still preserved). This loss of structure can be circumvented if one looks to the predecessor of the QR algorithm, namely the LR algorithm proposed by Rutishauser [14, 15, 16], which has the attractive property of preserving the band structure of a matrix. This property was already pointed out by Rutishauser in [14], where the banded matrices were referred to as striped matrices.
As we will show in the next section, the shifted LR transform is the appropriate analogue of the restart procedure in the case of non-symmetric Lanczos iteration. The process would necessarily require refining both the right as well as the left Krylov spaces simultaneously.
The rest of the paper is organized as follows. In section 2 we start with the introduction of the non-symmetric Lanczos iteration and then develop the restart procedure. In section 3 we apply the restart process to the Grcar matrix, and make some concluding remarks in section 4.
2 Non-symmetric Lanczos
Lanczos first introduced his algorithm in [10] as a method for tridiagonalizing a matrix, but also realized that the method could be used iteratively to find eigenvalues. For an arbitrary matrix , the method generates a pair of Krylov subspaces and , through repeated action of and respectively. We refer to these as the right and left Krylov spaces respectively and they satisfy the biorthogonality relation . The two subspaces are generated through a three term recurrence relation which, for a Krylov space of size , can be written in matrix form as
(1a)
(1b)
(1c)
where represents the identity matrix of size , is a tridiagonal matrix of size with its Hermitian conjugate, and is the standard unit vector. and represent the residual vectors at the step. If either or vanishes, this signals the convergence of the right or the left Krylov subspace to an invariant subspace of dimension . A more serious breakdown occurs if with both and , in which case a look-ahead strategy may be employed. We do not address the issue of breakdown here since it is not specifically related to the restart procedure. We refer the reader to [13] for the look-ahead Lanczos algorithm and to [8] for a comprehensive overview of Lanczos-type solvers and the related issues of breakdown.
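For concreteness, relations (1a)-(1c) can be written in one common notation. The symbols here are assumptions, since they are not fixed by the surrounding text: $A \in \mathbb{C}^{n \times n}$ is the matrix, the columns of $V_k$ and $W_k$ hold the right and left Krylov vectors, $T_k$ is the $k \times k$ tridiagonal reduced matrix, and $r_k$, $s_k$ are the right and left residual vectors.

```latex
\begin{align}
  A V_k       &= V_k T_k + r_k e_k^{T}, \\
  A^{H} W_k   &= W_k T_k^{H} + s_k e_k^{T}, \\
  W_k^{H} V_k &= I_k .
\end{align}
```

In this assumed notation, the serious breakdown described above corresponds to $s_k^{H} r_k = 0$ with $r_k \neq 0$ and $s_k \neq 0$.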
As Sorensen points out for the Arnoldi method [18], if one is interested in an invariant subspace of dimension , the starting vector of the Krylov subspace must not contain components of the generator of a cyclic subspace of dimension greater than . This applies equally for the right and left Krylov subspaces generated through the Lanczos recurrence relations. Hence a non-vanishing (respectively ) implies that (respectively ) contains components of an invariant subspace of dimension greater than . The idea behind restarts then is to discard the components of the starting vector (and ) along the unwanted dimensions, such that each restart process moves the Krylov space(s) closer to being invariant. For the Arnoldi method Sorensen [18] proposed to achieve this via polynomial filtering, i.e. replacing
(2a)
(2b)
Obviously is the filtering polynomial, is a normalization constant and each specifies a node of the polynomial. The polynomial acts on to filter out the part of the spectrum of that is close to each . If a particular corresponds to an exact eigenvalue of , then components of the corresponding eigenvector are completely filtered out from (at least in exact arithmetic).
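The effect of the filter can be illustrated with a minimal sketch. It assumes the filter has the product form $\psi(z) = \prod_j (z - \mu_j)$ and that the normalization constant is the 2-norm of the filtered vector. The helper names `matvec` and `apply_filter` and the diagonal test matrix are hypothetical choices made only for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def apply_filter(A, v, shifts):
    """Return psi(A) v / tau with psi(z) = prod_j (z - mu_j).

    tau is taken to be the 2-norm of psi(A) v (an assumed normalization).
    """
    w = list(v)
    for mu in shifts:
        Aw = matvec(A, w)
        w = [awi - mu * wi for awi, wi in zip(Aw, w)]  # w <- (A - mu I) w
    tau = sum(abs(wi) ** 2 for wi in w) ** 0.5
    return [wi / tau for wi in w]


# Diagonal test matrix: the eigenvectors are the unit vectors, so the
# entries of a vector are its components along each eigenvector.
A = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 5.0]]
v1 = [1.0, 1.0, 1.0]

v1_exact = apply_filter(A, v1, [5.0])  # shift equal to an eigenvalue
v1_near = apply_filter(A, v1, [4.9])   # shift close to an eigenvalue

print(v1_exact)  # component along the eigenvalue-5 direction is exactly zero
print(v1_near)   # the same component is strongly damped, but not zero
```

With the exact shift the unwanted component is removed entirely, while the nearby shift only damps it, which is the behaviour described in the paragraph above.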
The node is referred to as a shift since the application of the polynomial filtering relies on the shifted QR algorithm, where is used as the shift. As shown below for the case of a single shift, an analogous procedure can be followed using a shifted LR algorithm which achieves the same effect of applying a polynomial filter to the starting vector . Starting with the Lanczos relation for the right subspace (1a), and adding and subtracting we obtain
(3a)
(3b)
(3c)
(3d)
(3e)
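In an assumed notation, the chain of steps (3a)-(3e) can be sketched as follows. The symbols are illustrative and not fixed by the text: the right Lanczos relation is $A V_k = V_k T_k + r_k e_k^{T}$, the shift is $\mu$, and the LU factors satisfy $T_k - \mu I_k = L U$.

```latex
\begin{align}
  (A - \mu I) V_k   &= V_k (T_k - \mu I_k) + r_k e_k^{T}, \\
  (A - \mu I) V_k   &= V_k L U + r_k e_k^{T}, \\
  (A - \mu I) V_k L &= (V_k L)(U L) + r_k e_k^{T} L, \\
  A (V_k L)         &= (V_k L)(U L + \mu I_k) + r_k e_k^{T} L, \\
  A V_k^{+}         &= V_k^{+} T_k^{+} + r_k e_k^{T} L,
\end{align}
```

where $V_k^{+} = V_k L$ and $T_k^{+} = U L + \mu I_k = L^{-1} T_k L$ in this notation, so the new reduced matrix is similar to the old one.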
Here we have set and . The matrices are the lower and upper triangular matrices obtained from the LU decomposition of . The matrix can be required to be unit triangular (all entries on the main diagonal are ones), in which case the LU decomposition is unique. Furthermore, for a tridiagonal matrix only consists of one sub-diagonal (in addition to the main diagonal). One can easily recognize that the new reduced matrix is a result of one step of the shifted LR iteration. Hence retains the tridiagonal structure of the [14, 15]. The relationship between starting vectors of the two spaces and can be obtained by multiplying (3b) by , i.e.
where . This clearly shows the filtering operation done on the original vector to generate the new vector .
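A single shifted LR step on a small tridiagonal matrix can be sketched as follows. The sketch assumes an LU factorization without pivoting and non-zero pivots. The function names and the test matrix are hypothetical, and the code is not the authors' implementation. It checks that the updated matrix stays tridiagonal and that the first column of the shifted matrix is a scalar multiple of the first column of the unit lower triangular factor, which is the mechanism behind the filtering of the starting vector.

```python
def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][q] * Y[q][j] for q in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def shifted_lr_step(T, mu):
    """One shifted LR step: factor T - mu*I = L*U, return (L, U, U*L + mu*I).

    T is tridiagonal, L is unit lower bidiagonal, U is upper bidiagonal.
    No pivoting is used, so a zero pivot raises an error.
    """
    k = len(T)
    U = [[T[i][j] - (mu if i == j else 0.0) for j in range(k)] for i in range(k)]
    L = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for i in range(k - 1):
        if U[i][i] == 0.0:
            raise ZeroDivisionError("zero pivot: LU without pivoting breaks down")
        m = U[i + 1][i] / U[i][i]            # elimination multiplier
        L[i + 1][i] = m
        U[i + 1][i] = 0.0
        U[i + 1][i + 1] -= m * U[i][i + 1]   # only one entry changes in a tridiagonal row
    T_new = matmul(U, L)
    for i in range(k):
        T_new[i][i] += mu
    return L, U, T_new


T = [[4.0, 1.0, 0.0, 0.0],
     [2.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 2.0, 1.0],
     [0.0, 0.0, 3.0, 1.0]]
mu = 0.5
L, U, T_new = shifted_lr_step(T, mu)

# Entries outside the three central diagonals remain exactly zero.
is_tridiagonal = all(T_new[i][j] == 0.0
                     for i in range(4) for j in range(4) if abs(i - j) > 1)
print(is_tridiagonal)
```

Because the new matrix equals $L^{-1} T L$, its trace and eigenvalues agree with those of the original matrix up to rounding.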
Since the Lanczos method creates a biorthogonal basis, one must simultaneously transform the left basis to maintain the biorthogonality property. It is easy to see that the necessary transform to maintain biorthogonality is , since,
Substituting the relation obtained from the LU decomposition in to equation (1b) we can obtain the modified Lanczos relation for the left Krylov space as
(4a)
(4b)
(4c)
(4d)
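In the same assumed notation, the steps leading to (4d) can be sketched as follows. The symbols are again illustrative and not fixed by the text: the left Lanczos relation is $A^{H} W_k = W_k T_k^{H} + s_k e_k^{T}$, the factorization is $T_k = L U + \mu I_k$, and the left basis is transformed as $W_k^{+} = W_k L^{-H}$.

```latex
\begin{align}
  A^{H} W_k
    &= W_k \left( U^{H} L^{H} + \bar{\mu} I_k \right) + s_k e_k^{T}, \\
  A^{H} W_k L^{-H}
    &= W_k \left( U^{H} L^{H} + \bar{\mu} I_k \right) L^{-H} + s_k e_k^{T} L^{-H}, \\
  A^{H} \left( W_k L^{-H} \right)
    &= \left( W_k L^{-H} \right) \left( L^{H} U^{H} + \bar{\mu} I_k \right) + s_k e_k^{T} L^{-H}, \\
  A^{H} W_k^{+}
    &= W_k^{+} \left( T_k^{+} \right)^{H} + s_k e_k^{T} L^{-H},
\end{align}
```

with $T_k^{+} = U L + \mu I_k$. In this notation biorthogonality is kept because $(W_k^{+})^{H} V_k^{+} = L^{-1} W_k^{H} V_k L = I_k$ when $V_k^{+} = V_k L$.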
Conveniently the structure of the Lanczos iteration for the left Krylov space is also preserved. Noting that is upper triangular, one can again expose the relationship between the generating vectors of the two left Krylov spaces as
Clearly is simply a scalar multiple of the old vector and no filtering of the generating vector has occurred. In order to ensure that we filter the left Krylov space as well, we perform one step of the shifted LR iteration with the conjugated shift on the reduced matrix obtained in equation (4d).
(5a)
(5b)
(5c)
(5d)
(5e)
One may again obtain the relation between the starting vectors of the two spaces as
(6)
which clearly shows the filtering operation performed on the starting vector of the left Krylov space. The appropriate transform for the right subspace to maintain biorthogonality is . Again, note that the upper triangular implies that the new is simply a scalar multiple of and no additional filtering occurs for in this step. One can write the corresponding modified Lanczos relation as
(7)
The above process can be repeated for unwanted shifts. We denote by as the product of the lower triangular matrices generated due to shifted-LR steps for the right Krylov space , and by as the product of the lower triangular matrices due to the shifted-LR iterations for the left Krylov space. Then for a Krylov space size of we have two modified Lanczos relations
(8a)
(8b)
We may take a closer look at the structure of the residual matrices on the right hand side of equation (8a). is a product of matrices that are lower triangular with just one subdiagonal. then is lower triangular with non-zero subdiagonals. is upper triangular and the product therefore has non-zero subdiagonals. Left multiplication by therefore has the form
where, . Therefore the matrix on the right hand side of (8a) has zeros in the first columns and the column is simply . The remaining columns are non-zero in general. A very similar structure is obtained for the residual matrix in the right hand side of equation (8b) with zeros in the first columns and the column being equal to , with
Partitioning the matrices such that
with the length of the vectors understood to be such that the resulting matrices are consistent. We can write the modified Lanczos relations of (8a) and (8b) as
Finally, equating the individual columns on both sides and discarding columns we are left with the new Krylov spaces of order and the Lanczos relations
(10a)
(10b)
(10c)
The new residual vectors are defined as
(11a)
(11b)
which may be normalized appropriately such that the inner product of the new Krylov vectors is unity. The Lanczos process may now be carried out again to generate the next vectors of the right and left Krylov spaces and the cycle may be repeated till an adequately converged subspace has been obtained. In the case that exact shifts (eigenvalues of ) are used for the restart procedure and are both zero in exact arithmetic.
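The truncation argument above rests on a sparsity pattern: a product of $p$ unit lower bidiagonal factors has at most $p$ non-zero subdiagonals, so the first $k-p-1$ entries of its last row vanish. A minimal sketch that checks this pattern is given below. The multipliers are arbitrary illustrative values and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][q] * Y[q][j] for q in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def unit_lower_bidiagonal(k, multipliers):
    """k-by-k unit lower bidiagonal matrix with the given subdiagonal entries."""
    L = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for i, m in enumerate(multipliers):
        L[i + 1][i] = m
    return L


k, p = 6, 2  # Krylov space size and number of shifts (illustrative values)
factors = [unit_lower_bidiagonal(k, [0.5 + 0.1 * (i + s) for i in range(k - 1)])
           for s in range(p)]

product = factors[0]
for F in factors[1:]:
    product = matmul(product, F)

# Last row of the product, i.e. e_k^T times the product of the factors.
last_row = product[k - 1]
# First entry of the last row that can be non-zero (0-based index k - p - 1).
sigma = last_row[k - p - 1]

print(last_row[:k - p - 1])  # the leading k - p - 1 entries are zero
print(sigma)
```

In this sketch `sigma` is the product of the last available subdiagonal entries of the factors. Multiplying the product by an upper triangular matrix from the right does not introduce further subdiagonals, so the leading zeros of the last row are retained.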
As a final note, we mention that Della-Dora [3] introduced a class of algorithms of the GR type, of which LR and QR are special cases. Watkins then introduced generic bulge-chasing algorithms for the entire GR class of methods [20]. The shifted LR algorithm used in the restart procedure outlined above can therefore be carried out implicitly through the bulge-chase sequence of Watkins [20]. For real matrices, the operations can be confined to real arithmetic by using the double-shift strategy introduced by Francis [4]. The entire restart process is put together in algorithm 1.
3 Computational Results
We present the results of computational tests performed on the Grcar matrix, which is highly non-normal and has presented problems with convergence in previous studies. In his work on implicit restarts, Sorensen [18] indeed points out that the restarted Arnoldi method has trouble converging to the left-most part of the spectrum and that, even though the algorithm claimed convergence, what was obtained was in fact part of the pseudospectrum.
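For reference, the Grcar matrix is a banded Toeplitz matrix with $-1$ on the first subdiagonal and $+1$ on the main diagonal and on a fixed number of superdiagonals. The sketch below builds it under the assumption of three superdiagonals, which is the conventional default, and uses a small size chosen only for illustration. The function names are hypothetical. It also evaluates the commutator $G G^{T} - G^{T} G$, which vanishes only for a normal matrix.

```python
def grcar(n, k=3):
    """n-by-n Grcar matrix: -1 on the subdiagonal, +1 on the diagonal
    and on the first k superdiagonals."""
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            G[i][i - 1] = -1.0
        for j in range(i, min(i + k + 1, n)):
            G[i][j] = 1.0
    return G


def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][q] * Y[q][j] for q in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def commutator_norm(G):
    """Frobenius norm of G G^T - G^T G (zero exactly when a real G is normal)."""
    Gt = [list(col) for col in zip(*G)]
    GGt, GtG = matmul(G, Gt), matmul(Gt, G)
    n = len(G)
    return sum((GGt[i][j] - GtG[i][j]) ** 2
               for i in range(n) for j in range(n)) ** 0.5


G = grcar(6)
nonnormality = commutator_norm(G)
print(G[0])          # first row: four ones followed by zeros
print(nonnormality)  # strictly positive, so G is not normal
```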
We present the results of the restarted Lanczos method applied to the Grcar matrix for and . Figure 2 shows the spectrum obtained after performing eleven restarts of the Lanczos algorithm. The restarts were performed such that the eigenvalues with the largest imaginary part were retained for the reduced operator and the remaining eigenvalues were used as shifts, to be discarded from the subspace. The same part of the spectrum was sought by Sorensen in [18]. In this particular case we obtained even though the individual residuals were both of order . At this point one would need to employ a look-ahead step of the Lanczos algorithm; however, we show the spectrum of the reduced matrix as it stands at this point. One may think of the spectrum as having converged to . As seen from the figure, the spectrum of the reduced operator matches the original spectrum of the Grcar matrix quite well. In figure 2, we show the convergence history of the individual eigenvalues with each restart step as the Lanczos algorithm progresses. Clearly the restart process works well in shifting the Krylov spaces towards the wanted region of the spectrum. The error in the eigenvalues is of even though the perturbation to the reduced matrix is of order . This is in contrast to the results obtained in [18], which reported large perturbations to the eigenvalues even when the Arnoldi method reported convergence. We suspect this is because the non-symmetric Lanczos method approximates both the right and left eigenspaces simultaneously, which leads to a lower error in the truncated matrix. We expect this to be particularly useful in hydrodynamic problems, where highly non-normal matrices are a routine occurrence and where problems with convergence of the spectrum have often been reported (see [2] for example).
At this point we add a note of caution: for large Grcar matrices our attempts have been somewhat less successful and the convergence sometimes fails. We expect this is due to our naive implementation of the implicit LR method in Julia, in which the standard checks for small sub-diagonal elements have not been performed. In our investigations of such cases we do indeed find small sub-diagonal elements, which lead to a loss of precision. Very small sub-diagonal elements also lead to a rapid loss of biorthogonality of the two subspaces. Similarly, we have not paid attention to issues arising from the Lanczos algorithm itself, except for employing a double (two-sided) Gram-Schmidt process to ensure biorthogonality. We also note that for a large number of restarts the biorthogonality of the two subspaces is progressively lost, probably due to accumulating floating point errors. Hence we expect that some method of reorthogonalization would be required to ensure stability for very long calculations; this has not been done in the current work. The (implicit) LR decomposition is not unique. Uniqueness is ensured by requiring a unit main diagonal for the lower triangular matrix; however, this pays no heed to the conditioning of the transforming matrices. A better strategy for building the matrices could be pursued, one which offers better conditioning while still preserving the tridiagonal structure. These issues would require careful implementation of all the individual components and we do not address them in the current work.
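One common form of the check for small sub-diagonal elements compares each sub-diagonal entry with its neighbouring diagonal entries. The sketch below assumes the relative criterion $|t_{i+1,i}| \le \mathrm{tol}\,(|t_{i,i}| + |t_{i+1,i+1}|)$ familiar from QR-type eigenvalue codes. The function name, the tolerance and the test matrix are illustrative and are not taken from the Julia implementation mentioned above.

```python
import sys


def small_subdiagonals(T, tol=sys.float_info.epsilon):
    """Return the 0-based row indices i for which the sub-diagonal entry
    T[i+1][i] is negligible relative to its neighbouring diagonal entries."""
    flagged = []
    for i in range(len(T) - 1):
        scale = abs(T[i][i]) + abs(T[i + 1][i + 1])
        if abs(T[i + 1][i]) <= tol * scale:
            flagged.append(i)
    return flagged


# Tridiagonal example whose first sub-diagonal entry is negligibly small.
T = [[2.0, 1.0, 0.0],
     [1e-18, 3.0, 1.0],
     [0.0, 0.5, 1.0]]

flagged = small_subdiagonals(T)
print(flagged)  # → [0]
```

A flagged index marks a position where the reduced matrix nearly decouples. An implementation could use such a test to decide where extra care is needed before continuing the LR sweep.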
Finally we note that very similar work has been reported in [7] where hyperbolic transforms (HR) are used for restarts of the bi-Lanczos method, and in [6] where the LR transformations are used, albeit in the context of model reduction.
4 Conclusion
We present an algorithm to restart the non-symmetric Lanczos method, based on the idea of polynomial filtering via the implicit QR method proposed by Sorensen [18] for restarting the Arnoldi iteration. The (implicitly) shifted LR method is shown to be the appropriate analogue for structure-preserving restarts of the non-symmetric Lanczos algorithm. It is shown that the polynomial filtering process needs to be carried out for both the right and the left Krylov spaces, and the appropriate transforms for maintaining biorthogonality of the two spaces are highlighted. Computational results are shown for a Grcar matrix, which is known to be highly non-normal, and the spectrum is found to converge adequately even when the residual error is relatively large.
Acknowledgments
The authors would like to thank Professor Elias Jarlebring for his helpful comments on the manuscript. The authors acknowledge support of Nordita and the Swedish Research Council Grant No. 2018-04290. Nordita is partially supported by Nordforsk.
References
[1] W. E. Arnoldi, The principle of minimized iterations in the solution of the matrix eigenvalue problem, Quarterly of Applied Mathematics, 9 (1951), pp. 17–29.
[2] M. Brynjell-Rahkola, N. Shahriari, P. Schlatter, A. Hanifi, and D. S. Henningson, Stability and sensitivity of a cross-flow-dominated Falkner–Skan–Cooke boundary layer with discrete surface roughness, Journal of Fluid Mechanics, 826 (2017), pp. 830–850.
[3] J. Della-Dora, Numerical linear algorithms and group theory, Linear Algebra and its Applications, 10 (1975), pp. 267–283.
[4] J. G. F. Francis, The QR Transformation - Part 2, The Computer Journal, 4 (1962), pp. 332–345.
[5] S. W. Gaaf and E. Jarlebring, The infinite bi-Lanczos method for nonlinear eigenvalue problems, SIAM Journal on Scientific Computing, 39 (2017), pp. S898–S919.
[6] E. Grimme, D. Sorensen, and P. Van Dooren, Stable partial realizations via an implicitly restarted Lanczos method, in Proceedings of 1994 American Control Conference - ACC ’94, vol. 3, 1994, pp. 2814–2818.
[7] E. J. Grimme, D. C. Sorensen, and P. Van Dooren, Model reduction of state space systems via an implicitly restarted Lanczos method, Numerical Algorithms, 12 (1996), pp. 1–31.
[8] M. H. Gutknecht, Lanczos-type solvers for nonsymmetric linear systems of equations, Acta Numerica, 6 (1997), pp. 271–397.
[9] V. Hernandez, J. E. Roman, and V. Vidal, SLEPc: A scalable and flexible toolkit for the solution of eigenvalue problems, ACM Trans. Math. Software, 31 (2005), pp. 351–362.
[10] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, (1950).
[11] R. B. Lehoucq, D. C. Sorensen, and C. Yang, ARPACK users’ guide: solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods, SIAM, 1998.
[12] V. Papakos and I. Jaimoukha, A deflated implicitly restarted Lanczos algorithm for model reduction, in 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), vol. 3, 2003, pp. 2902–2907.
[13] B. N. Parlett, D. R. Taylor, and Z. A. Liu, A look-ahead Lanczos algorithm for unsymmetric matrices, Mathematics of Computation, 44 (1985), pp. 105–124.
[14] H. Rutishauser, Solution of eigenvalue problems with the LR-transformation, National Bureau of Standards, Applied Mathematics Series, 49 (1958), pp. 47–81.
[15] H. Rutishauser, Lectures on Numerical Mathematics, Birkhäuser, 1990.
[16] H. Rutishauser and H. Schwarz, The LR transformation method for symmetric matrices, Numerische Mathematik, 5 (1963), pp. 273–289.
[17] Y. Saad, The Lanczos biorthogonalization algorithm and other oblique projection methods for solving large unsymmetric systems, SIAM Journal on Numerical Analysis, 19 (1982), pp. 485–506.
[18] D. C. Sorensen, Implicit application of polynomial filters in a k-step Arnoldi method, SIAM Journal on Matrix Analysis and Applications, 13 (1992), pp. 357–385.
[19] G. W. Stewart, A Krylov–Schur algorithm for large eigenproblems, SIAM Journal on Matrix Analysis and Applications, 23 (2002), pp. 601–614.
[20] D. S. Watkins and L. Elsner, Chasing algorithms for the eigenvalue problem, SIAM Journal on Matrix Analysis and Applications, 12 (1991), pp. 374–384.