
Solution of Eigenvalue Problems

Outline:
• Introduction – motivation
• Projection methods for eigenvalue problems
• Subspace iteration; the symmetric Lanczos algorithm
• Nonsymmetric Lanczos procedure
• Implicit restarts
• Harmonic Ritz values, Jacobi-Davidson's method

Background: Origins of Eigenvalue Problems

• Structural engineering [Ku = λMu] (goal: frequency response)
• Electronic structure calculations [Schrödinger equation]
• Stability analysis [e.g., electrical networks, mechanical systems, ...]
• Bifurcation analysis [e.g., in fluid flow]
► Large eigenvalue problems in quantum chemistry use up the biggest portion of the time in supercomputer centers.

Background: New applications in data analytics

► Machine learning problems often require a (partial) Singular Value Decomposition.
► Somewhat different issues in this case:
• Very large matrices, update the SVD
• Compute dominant singular values/vectors
• Many problems of approximating a matrix (or a tensor) by one of lower rank (dimension reduction, ...)
► But: methods for computing the SVD are often based on those for standard eigenvalue problems.

Background: The Problem(s)

► Standard eigenvalue problem:
    Ax = λx
Often: A is symmetric real (or Hermitian complex).
► Generalized problem: Ax = λBx. Often: B is symmetric positive definite, A is symmetric or nonsymmetric.
► Quadratic problems: (A + λB + λ²C)u = 0
► Nonlinear eigenvalue problems (NEVP):
    [A0 + λB0 + Σ_{i=1}^n fi(λ) Ai] u = 0


► General form of NEVP: A(λ)x = 0
► Nonlinear eigenvector problems:
    [A + λB + F(u1, u2, · · · , uk)] u = 0

Large eigenvalue problems in applications

► Some applications require the computation of a large number of eigenvalues and vectors of very large matrices.
► Density Functional Theory in electronic structure calculations: 'ground states'.
► Excited states involve transitions and invariably lead to much more complex computations. → Large matrices, *many* eigen-pairs.

What to compute:
• A few λi's with smallest or largest real parts;
• All λi's in a certain region of C;
• A few of the dominant eigenvalues;
• All λi's (rare).

Background: The main tools

Projection process:
(a) Build a 'good' subspace K = span(V);
(b) Get approximate eigenpairs by a Rayleigh-Ritz process:
    λ̃, ũ ∈ K satisfy: (A − λ̃I)ũ ⊥ K  −→  V^H (A − λ̃I)V y = 0
► λ̃ = Ritz value, ũ = V y = Ritz vector.
► Two common choices for K:
1) Power subspace K = span{A^k X0}, or span{Pk(A) X0};
2) Krylov subspace K = span{v, Av, · · · , A^{k−1} v}.

Background: The main tools (cont.)

Shift-and-invert:
► If we want eigenvalues near σ, replace A by (A − σI)^{−1}.
Example: the power method vj = A vj−1 / scaling is replaced by
    vj = (A − σI)^{−1} vj−1 / scaling
► Works well for computing a few eigenvalues near σ.
► Used in the commercial package NASTRAN (for decades!).
► Requires factoring (A − σI) (or (A − σB) in the generalized case), but convergence will be much faster.
► One solve with the factors at each step; the factorization is done once (ideally).
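A minimal NumPy/SciPy sketch of the shift-and-invert idea above: factor A − σI once, then run the power method with a triangular solve in place of the matrix-vector product. This is an illustration only, not the NASTRAN implementation; the matrix, shift, and tolerance are made-up example values.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def shift_invert_power(A, sigma, tol=1e-10, maxit=200):
    """Power method applied to (A - sigma*I)^{-1}: converges to the
    eigenvalue of A closest to the shift sigma (minimal sketch)."""
    n = A.shape[0]
    lu = lu_factor(A - sigma * np.eye(n))     # factorization done once
    v = np.random.rand(n); v /= np.linalg.norm(v)
    lam = sigma
    for _ in range(maxit):
        w = lu_solve(lu, v)                   # one solve per step
        v = w / np.linalg.norm(w)
        lam = v @ A @ v                       # Rayleigh quotient for A
        if np.linalg.norm(A @ v - lam * v) < tol:
            break
    return lam, v

# Example: eigenvalue of a random symmetric matrix nearest sigma = 1.5
np.random.seed(0)
A = np.random.rand(50, 50); A = (A + A.T) / 2
lam, v = shift_invert_power(A, sigma=1.5)
print(lam)
```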
Background: The main tools (cont.)

Deflation:
► Once eigenvectors converge, remove them from the picture (e.g., with the power method, the second largest eigenvalue becomes the largest after deflation).

Restarting strategies:
► Restart the projection process using information gathered in previous steps.

► ALL available methods use some combination of these ingredients. [e.g. ARPACK: Arnoldi/Lanczos + 'implicit restarts' + shift-and-invert (option).]

Current state-of-the-art in eigensolvers

► Eigenvalues at one end of the spectrum:
• Subspace iteration + filtering [e.g. FEAST, Cheb, ...]
• Lanczos + variants (no restart, thick restart, implicit restart, Davidson, ...), e.g., the ARPACK code, PRIMME.
• Block algorithms [Block Lanczos, TraceMin, LOBPCG, SLEPc, ...]
• + Many others, more or less related to the above.

► 'Interior' eigenvalue problems (middle of the spectrum):
• Combine shift-and-invert + Lanczos/block Lanczos. Used in, e.g., NASTRAN.
• Rational filtering [FEAST, Sakurai et al., ...]

Projection Methods for Eigenvalue Problems

General formulation: projection method onto K, orthogonal to L.
► Given: two subspaces K and L of the same dimension.
► Find: λ̃, ũ such that
    λ̃ ∈ C, ũ ∈ K;  (λ̃I − A)ũ ⊥ L

Two types of methods:
• Orthogonal projection methods: the situation when L = K.
• Oblique projection methods: when L ≠ K.

Rayleigh-Ritz projection

Given: a subspace X known to contain good approximations to eigenvectors of A.
Question: How to extract good approximations to eigenvalues/eigenvectors from this subspace?
Answer: the Rayleigh-Ritz process.

Let Q = [q1, . . . , qm] be an orthonormal basis of X. Write an approximation in the form ũ = Qy and obtain y by writing
    Q^H (A − λ̃I) ũ = 0
► Q^H A Q y = λ̃ y


Procedure (Rayleigh-Ritz):
1. Obtain an orthonormal basis Q of X
2. Compute C = Q^H A Q (an m × m matrix)
3. Obtain the Schur factorization of C:  C = Y R Y^H
4. Compute Ũ = QY

Property: if X is (exactly) invariant, then the procedure yields exact eigenvalues and eigenvectors.

Proof: Since X is invariant, (A − λ̃I)ũ = Qz for a certain z. The Galerkin condition Q^H Qz = 0 implies z = 0 and therefore (A − λ̃I)ũ = 0.

► This procedure can be used in conjunction with the subspace obtained from the subspace iteration algorithm.

Subspace Iteration

► Original idea: projection technique onto a subspace of the form Y = A^k X.
► In practice: replace A^k by a suitable polynomial [Chebyshev].
Advantages: • Easy to implement (in the symmetric case); • Easy to analyze.
Disadvantage: • Slow.
► Often used with polynomial acceleration: A^k X replaced by Ck(A) X. Typically Ck = Chebyshev polynomial.
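A minimal NumPy sketch of the Rayleigh-Ritz procedure (steps 1–4 above). It assumes only that the columns of X span the subspace of interest; the Schur factorization of the small matrix C is done with scipy.linalg.schur. The data below are made-up example values.

```python
import numpy as np
from scipy.linalg import schur

def rayleigh_ritz(A, X):
    """Rayleigh-Ritz extraction from the subspace spanned by the columns
    of X (minimal sketch of steps 1-4)."""
    Q, _ = np.linalg.qr(X)                 # 1. orthonormal basis of span(X)
    C = Q.conj().T @ A @ Q                 # 2. C = Q^H A Q  (m x m)
    R, Y = schur(C, output='complex')      # 3. Schur factorization C = Y R Y^H
    U = Q @ Y                              # 4. approximate Schur vectors of A
    return np.diag(R), U                   # Ritz values, Ritz (Schur) vectors

# Example: subspace from a few power-iteration steps on a random matrix
np.random.seed(1)
A = np.random.rand(100, 100)
X = np.linalg.matrix_power(A, 5) @ np.random.rand(100, 4)
ritz_vals, U = rayleigh_ritz(A, X)
print(ritz_vals)
```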

Algorithm: Subspace Iteration with Projection

1. Start: Choose an initial system of vectors X = [x0, . . . , xm] and an initial polynomial Ck.
2. Iterate: Until convergence do:
   (a) Compute Ẑ = Ck(A) Xold.
   (b) Orthonormalize Ẑ into Z.
   (c) Compute B = Z^H A Z and use the QR algorithm to compute the Schur vectors Y = [y1, . . . , ym] of B.
   (d) Compute Xnew = Z Y.
   (e) Test for convergence. If satisfied stop. Else select a new polynomial Ck′ and continue.

THEOREM: Let S0 = span{x1, x2, . . . , xm} and assume that S0 is such that the vectors {P xi}, i = 1, . . . , m, are linearly independent, where P is the spectral projector associated with λ1, . . . , λm. Let Pk be the orthogonal projector onto the subspace Sk = span{Xk}. Then for each eigenvector ui of A, i = 1, . . . , m, there exists a unique vector si in the subspace S0 such that P si = ui. Moreover, the following inequality is satisfied:

    ‖(I − Pk) ui‖2 ≤ ‖ui − si‖2 ( |λm+1 / λi| + εk )^k ,    (1)

where εk tends to zero as k tends to infinity.
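A bare-bones NumPy sketch of subspace iteration with projection. For brevity it uses plain powers of A in step (a) instead of a Chebyshev polynomial filter Ck(A); the block size, inner power count, and tolerance are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import schur

def subspace_iteration(A, m=4, inner=5, maxit=100, tol=1e-8):
    """Subspace iteration with Rayleigh-Ritz projection (minimal sketch).
    Uses A^inner in place of a polynomial filter Ck(A)."""
    n = A.shape[0]
    X = np.random.rand(n, m)
    for _ in range(maxit):
        Z = X
        for _ in range(inner):              # (a) Z = A^inner * X_old
            Z = A @ Z
        Z, _ = np.linalg.qr(Z)              # (b) orthonormalize
        B = Z.conj().T @ A @ Z              # (c) projected matrix
        R, Y = schur(B, output='complex')   #     Schur vectors of B
        X = Z @ Y                           # (d) X_new = Z Y
        res = np.linalg.norm(A @ X - X @ R) # (e) crude convergence test
        if res < tol:
            break
    return np.diag(R), X

np.random.seed(2)
A = np.random.rand(200, 200)
vals, X = subspace_iteration(A)
print(vals)        # approximations to the m dominant eigenvalues
```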
Krylov subspace methods

Principle: projection methods on Krylov subspaces, i.e., on
    Km(A, v1) = span{v1, Av1, · · · , A^{m−1} v1}
• Probably the most important class of projection methods [for linear systems and for eigenvalue problems].
• Many variants exist depending on the subspace L.

Properties of Km. Let μ = degree of the minimal polynomial of v. Then,
• Km = {p(A)v | p = polynomial of degree ≤ m − 1}
• Km = Kμ for all m ≥ μ. Moreover, Kμ is invariant under A.
• dim(Km) = m iff μ ≥ m.

Arnoldi's Algorithm

► Goal: compute an orthogonal basis of Km.
► Input: initial vector v1 with ‖v1‖2 = 1, and m.

ALGORITHM 1: Arnoldi's procedure
For j = 1, ..., m do
   Compute w := A vj
   For i = 1, . . . , j do
      hi,j := (w, vi)
      w := w − hi,j vi
   End
   hj+1,j = ‖w‖2;  vj+1 = w / hj+1,j
End
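A direct NumPy transcription of ALGORITHM 1 (modified Gram-Schmidt form, real arithmetic for simplicity). This is a sketch for illustration; on breakdown (hj+1,j = 0) it simply returns the invariant basis found so far.

```python
import numpy as np

def arnoldi(A, v1, m):
    """Arnoldi procedure: orthonormal basis V of K_m(A, v1) and the
    (m+1) x m Hessenberg matrix H_bar with A V_m = V_{m+1} H_bar (sketch)."""
    n = A.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt
            H[i, j] = np.dot(w, V[:, i])
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:                  # lucky breakdown: K_{j+1} invariant
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

np.random.seed(3)
A = np.random.rand(300, 300)
V, H = arnoldi(A, np.random.rand(300), m=20)
```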

Result of Arnoldi's algorithm

Let H̄m denote the (m + 1) × m upper Hessenberg matrix with entries hi,j, and let Hm = H̄m(1 : m, 1 : m). Then:

1. Vm = [v1, v2, ..., vm] is an orthonormal basis of Km.
2. A Vm = Vm+1 H̄m = Vm Hm + hm+1,m vm+1 e_m^T
3. Vm^T A Vm = Hm ≡ H̄m minus its last row.

Application to eigenvalue problems

► Write the approximate eigenvector as ũ = Vm y and impose the Galerkin condition:
    (A − λ̃I) Vm y ⊥ Km  →  Vm^H (A − λ̃I) Vm y = 0
► Approximate eigenvalues are the eigenvalues of Hm:
    Hm yj = λ̃j yj
The associated approximate eigenvectors are
    ũj = Vm yj
Typically a few of the outermost eigenvalues will converge first.
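Continuing the sketch: approximate eigenpairs are extracted from the small Hessenberg matrix. This assumes the `arnoldi` function from the earlier sketch is available; the residual formula follows from the relation A Vm = Vm Hm + hm+1,m vm+1 e_m^T.

```python
import numpy as np

def ritz_pairs(A, v1, m):
    """Ritz values/vectors from m Arnoldi steps; reuses the arnoldi()
    sketch defined earlier (assumption; breakdown not handled here)."""
    V, Hbar = arnoldi(A, v1, m)
    Hm = Hbar[:m, :m]
    theta, Y = np.linalg.eig(Hm)          # eigenpairs of the small matrix
    U = V[:, :m] @ Y                      # Ritz vectors  u_j = V_m y_j
    # Residual norms ||A u_j - theta_j u_j||_2 = |h_{m+1,m}| * |e_m^T y_j|
    res = np.abs(Hbar[m, m - 1]) * np.abs(Y[-1, :])
    return theta, U, res

np.random.seed(4)
A = np.random.rand(300, 300)
theta, U, res = ritz_pairs(A, np.random.rand(300), m=30)
j = np.argmax(theta.real)                 # rightmost Ritz value
print(theta[j], res[j])
```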
Restarted Arnoldi

In practice: the memory requirement of the algorithm implies that restarting is necessary.

► Restarted Arnoldi for computing the rightmost eigenpair:

ALGORITHM 2: Restarted Arnoldi
1. Start: Choose an initial vector v1 and a dimension m.
2. Iterate: Perform m steps of Arnoldi's algorithm.
3. Restart: Compute the approximate eigenvector u1^(m) associated with the rightmost eigenvalue λ1^(m).
4. If satisfied stop, else set v1 ≡ u1^(m) and go to 2.

Example: Small Markov chain matrix [Mark(10), dimension = 55]. Restarted Arnoldi procedure for computing the eigenvector associated with the eigenvalue with algebraically largest real part. We use m = 10.

m    Re(λ)              Im(λ)   Res. Norm
10   0.9987435899D+00   0.0     0.246D-01
20   0.9999523324D+00   0.0     0.144D-02
30   0.1000000368D+01   0.0     0.221D-04
40   0.1000000025D+01   0.0     0.508D-06
50   0.9999999996D+00   0.0     0.138D-07
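A compact sketch of the restart loop of ALGORITHM 2, again reusing the `arnoldi` sketch from above (assumption): the rightmost Ritz vector becomes the next starting vector. Taking the real part is adequate when the rightmost eigenvalue is real, as in the Mark(10) example; breakdown is not handled.

```python
import numpy as np

def restarted_arnoldi(A, m=10, maxrestart=50, tol=1e-8):
    """Restarted Arnoldi for the eigenvalue with largest real part (sketch);
    assumes the arnoldi() function defined earlier."""
    n = A.shape[0]
    v1 = np.random.rand(n)
    for _ in range(maxrestart):
        V, Hbar = arnoldi(A, v1, m)
        theta, Y = np.linalg.eig(Hbar[:m, :m])
        j = np.argmax(theta.real)               # rightmost Ritz value
        u1 = (V[:, :m] @ Y[:, j]).real          # approximate eigenvector
        res = np.abs(Hbar[m, m - 1]) * np.abs(Y[-1, j])
        if res < tol:
            break
        v1 = u1                                 # restart with the Ritz vector
    return theta[j], u1

# Usage (illustrative): lam, u = restarted_arnoldi(A, m=10)
```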

Restarted Arnoldi (cont.)

► Can be generalized to more than *one* eigenvector:
    v1^(new) = Σ_{i=1}^p ρi ui^(m)
► However: this often does not work well (it is hard to find good coefficients ρi).
► Alternative: compute the eigenvectors (actually Schur vectors) one at a time.
► Implicit deflation.

Deflation

► Very useful in practice.
► Different forms: locking (subspace iteration), selective orthogonalization (Lanczos), Schur deflation, ...

A little background: consider the Schur canonical form
    A = U R U^H
where U is unitary and R is a (complex) upper triangular matrix.
► The columns u1, . . . , un of U are called Schur vectors.
► Note: Schur vectors are not unique. In particular, they depend on the order of the eigenvalues.


Wielandt Deflation: Assume we have computed a right eigenpair λ1, u1. Wielandt deflation considers the eigenvalues of
    A1 = A − σ u1 v^H
Note:
    Λ(A1) = {λ1 − σ, λ2, . . . , λn}
Wielandt deflation preserves u1 as an eigenvector, as well as all the left eigenvectors not associated with λ1.
► An interesting choice for v is to take simply v = u1. In this case Wielandt deflation preserves Schur vectors as well.
► The above procedure can be applied successively.

ALGORITHM 3: Explicit Deflation
1. A0 = A
2. For j = 0 . . . μ − 1 Do:
3.    Compute a dominant eigenvector uj of Aj
4.    Define Aj+1 = Aj − σj uj uj^H
5. End

► The computed u1, u2, . . . form a set of Schur vectors for A.
► Alternative: implicit deflation (within a procedure such as Arnoldi).
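A small NumPy sketch of ALGORITHM 3 with the choice v = uj, so the computed vectors are Schur vectors. Each dominant eigenvector is found by a plain power iteration, and the shift σj is taken equal to the computed eigenvalue (moving it to zero). The symmetric test matrix, sizes, and tolerances are illustrative assumptions.

```python
import numpy as np

def power_method(A, tol=1e-10, maxit=1000):
    """Dominant (largest-magnitude) eigenpair by the power method (helper)."""
    v = np.random.rand(A.shape[0]); v /= np.linalg.norm(v)
    for _ in range(maxit):
        w = A @ v
        lam = v @ w                         # Rayleigh quotient
        v_new = w / np.linalg.norm(w)
        if np.linalg.norm(A @ v_new - lam * v_new) < tol:
            return lam, v_new
        v = v_new
    return lam, v

def explicit_deflation(A, mu=3):
    """Explicit (Wielandt/Schur) deflation: mu dominant eigenpairs,
    deflating each converged eigenvector with sigma_j = lambda_j."""
    Aj = A.copy().astype(float)
    lams, us = [], []
    for _ in range(mu):
        lam, u = power_method(Aj)
        lams.append(lam); us.append(u)
        Aj = Aj - lam * np.outer(u, u)      # A_{j+1} = A_j - sigma_j u_j u_j^H
    return np.array(lams), np.column_stack(us)

np.random.seed(5)
A = np.random.rand(80, 80); A = (A + A.T) / 2   # symmetric example
lams, U = explicit_deflation(A, mu=3)
print(lams)
```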

Deflated Arnoldi

► When the first eigenvector converges, put it in the first column of Vm = [v1, v2, . . . , vm]. Arnoldi then starts at column 2, still orthogonalizing against v1, ..., vj at step j.
► Accumulate each new converged eigenvector in columns 2, 3, ... [the 'locked' set of eigenvectors].

Thus, for k = 2:   Vm = [ v1, v2 | v3, . . . , vm ],  with v1, v2 'locked' and v3, . . . , vm 'active',

and Hm has the form (upper triangular in the locked columns, Hessenberg in the active part):

    ∗ ∗ ∗ ∗ ∗ ∗
      ∗ ∗ ∗ ∗ ∗
        ∗ ∗ ∗ ∗
        ∗ ∗ ∗ ∗
          ∗ ∗ ∗
            ∗ ∗

► Similar techniques are used in subspace iteration [G. Stewart's SRRIT].

Example: Matrix Mark(10) – small Markov chain matrix (N = 55).
► First eigenpair by iterative Arnoldi with m = 10.

m    Re(λ)              Im(λ)   Res. Norm
10   0.9987435899D+00   0.0     0.246D-01
20   0.9999523324D+00   0.0     0.144D-02
30   0.1000000368D+01   0.0     0.221D-04
40   0.1000000025D+01   0.0     0.508D-06
50   0.9999999996D+00   0.0     0.138D-07
► Computing the next 2 eigenvalues of Mark(10):

Eig.  Mat-Vec's  Re(λ)          Im(λ)  Res. Norm
2     60         0.9370509474   0.0    0.870D-03
      69         0.9371549617   0.0    0.175D-04
      78         0.9371501442   0.0    0.313D-06
      87         0.9371501564   0.0    0.490D-08
3     96         0.8112247133   0.0    0.210D-02
      104        0.8097553450   0.0    0.538D-03
      112        0.8096419483   0.0    0.874D-04
      152        0.8095717167   0.0    0.444D-07

Hermitian case: The Lanczos Algorithm

► The Hessenberg matrix becomes tridiagonal:
    A = A^H and Vm^H A Vm = Hm  →  Hm = Hm^H
► We can write

    Hm =  [ α1  β2                  ]
          [ β2  α2  β3              ]
          [     β3  α3  β4          ]        (2)
          [         .   .   .       ]
          [             βm  αm      ]

► Consequence: the three-term recurrence
    βj+1 vj+1 = A vj − αj vj − βj vj−1

ALGORITHM 4: Lanczos
1. Choose v1 of unit norm. Set β1 ≡ 0, v0 ≡ 0
2. For j = 1, 2, . . . , m Do:
3.    wj := A vj − βj vj−1
4.    αj := (wj, vj)
5.    wj := wj − αj vj
6.    βj+1 := ‖wj‖2. If βj+1 = 0 then Stop
7.    vj+1 := wj / βj+1
8. EndDo

Hermitian matrix + Arnoldi → Hermitian Lanczos

► In theory, the vi's defined by the 3-term recurrence are orthogonal.
► However: in practice there is severe loss of orthogonality.

Lanczos with reorthogonalization

Observation [Paige, 1981]: Loss of orthogonality starts suddenly, when the first eigenpair converges. It indicates loss of linear independence of the vi's. When orthogonality is lost, several copies of the same eigenvalue start appearing.

► Full reorthogonalization – reorthogonalize vj+1 against all previous vi's every time.
► Partial reorthogonalization – reorthogonalize vj+1 against all previous vi's only when needed [Parlett & Simon].
► Selective reorthogonalization – reorthogonalize vj+1 against computed eigenvectors [Parlett & Scott].
► No reorthogonalization – do not reorthogonalize, but take measures to deal with 'spurious' eigenvalues [Cullum & Willoughby].
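A NumPy sketch of ALGORITHM 4 for the real symmetric case, with optional full reorthogonalization as discussed above (one extra Gram-Schmidt pass against all previous vi's). Illustrative only; partial/selective strategies are not implemented.

```python
import numpy as np

def lanczos(A, v1, m, reorth=True):
    """Hermitian Lanczos (ALGORITHM 4), real symmetric case.  Returns the
    diagonals of T_m, the basis V, and the Ritz values; optional full
    reorthogonalization (sketch only, breakdown simply stops the loop)."""
    n = A.shape[0]
    V = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)                       # beta[j] stores beta_{j+2}... i.e. the new off-diagonal at step j
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]   # w_j := A v_j - beta_j v_{j-1}
        alpha[j] = np.dot(w, V[:, j])
        w -= alpha[j] * V[:, j]              # w_j := w_j - alpha_j v_j
        if reorth:                           # full reorthogonalization pass
            w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        beta[j] = np.linalg.norm(w)
        if beta[j] == 0.0 or j == m - 1:     # breakdown or last step
            break
        V[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(beta[:m - 1], -1)
    return alpha, beta, V, np.linalg.eigvalsh(T)

# Example: extreme eigenvalues of a random symmetric matrix
np.random.seed(6)
A = np.random.rand(400, 400); A = (A + A.T) / 2
_, _, _, ritz = lanczos(A, np.random.rand(400), m=60)
print(ritz[0], ritz[-1])   # approximations to the smallest/largest eigenvalues
```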
Partial reorthogonalization

► Partial reorthogonalization: reorthogonalize only when deemed necessary.
► The main question is: when?
► Uses an inexpensive recurrence relation.
► Work done in the 80's [Parlett, Simon, and co-workers] + more recent work [Larsen, '98].
► Package: PROPACK [Larsen]. V 1: 2001, most recent: V 2.1 (Apr. 05).
► Often, the need for reorthogonalization is not too strong.

The Lanczos Algorithm in the Hermitian Case

Assume the eigenvalues are sorted increasingly:
    λ1 ≤ λ2 ≤ · · · ≤ λn
► Orthogonal projection method onto Km.
► To derive error bounds, use the Courant characterization:

    λ̃1 = min_{u ∈ K, u ≠ 0} (Au, u)/(u, u) = (Aũ1, ũ1)/(ũ1, ũ1)

    λ̃j = min_{u ∈ K, u ≠ 0, u ⊥ ũ1,...,ũj−1} (Au, u)/(u, u) = (Aũj, ũj)/(ũj, ũj)

► Bounds for λ1 are easy to find – similar to linear systems.
► Ritz values approximate the eigenvalues of A "inside out":
    [schematic: λ1 ≤ λ̃1, λ2 ≤ λ̃2, . . . , λ̃n−1 ≤ λn−1, λ̃n ≤ λn]

A-priori error bounds

Theorem [Kaniel, 1966]:

    0 ≤ λ1^(m) − λ1 ≤ (λN − λ1) [ tan ∠(v1, u1) / T_{m−1}(1 + 2γ1) ]²

where γ1 = (λ2 − λ1)/(λN − λ2), ∠(v1, u1) = angle between v1 and u1, and Tk = Chebyshev polynomial of the first kind of degree k.

► + results for the other eigenvalues [Kaniel, Paige, YS].

Theorem:

    0 ≤ λi^(m) − λi ≤ (λN − λ1) [ κi^(m) tan ∠(vi, ui) / T_{m−i}(1 + 2γi) ]²

where γi = (λi+1 − λi)/(λN − λi+1),  κi^(m) = Π_{j<i} (λj^(m) − λN)/(λj^(m) − λi).


The Lanczos biorthogonalization (A^H ≠ A)

ALGORITHM 5: Lanczos bi-orthogonalization
1. Choose two vectors v1, w1 such that (v1, w1) = 1.
2. Set β1 = δ1 ≡ 0, w0 = v0 ≡ 0
3. For j = 1, 2, . . . , m Do:
4.    αj = (Avj, wj)
5.    v̂j+1 = A vj − αj vj − βj vj−1
6.    ŵj+1 = A^T wj − αj wj − δj wj−1
7.    δj+1 = |(v̂j+1, ŵj+1)|^{1/2}. If δj+1 = 0 Stop
8.    βj+1 = (v̂j+1, ŵj+1)/δj+1
9.    wj+1 = ŵj+1/βj+1
10.   vj+1 = v̂j+1/δj+1
11. EndDo

► Builds a pair of biorthogonal bases for the two subspaces Km(A, v1) and Km(A^H, w1).
► Many choices for δj+1, βj+1 in lines 7 and 8. The only constraint is:
    δj+1 βj+1 = (v̂j+1, ŵj+1)

Let

    Tm =  [ α1  β2                     ]
          [ δ2  α2  β3                 ]
          [        . . .               ]
          [        δm−1  αm−1  βm      ]
          [              δm    αm      ]

► vi ∈ Km(A, v1) and wj ∈ Km(A^T, w1).
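A NumPy sketch of ALGORITHM 5 for real nonsymmetric A, following the scaling choice of lines 7–8 exactly. It assumes (v1, w1) ≠ 0, and there is no look-ahead: the loop simply stops on breakdown. Purely illustrative.

```python
import numpy as np

def lanczos_biortho(A, v1, w1, m):
    """Two-sided (non-Hermitian) Lanczos, real case: biorthogonal bases V, W
    and the tridiagonal T_m.  Sketch only: stops on breakdown, no look-ahead."""
    n = A.shape[0]
    v1 = v1 / np.dot(v1, w1)                 # scale so that (v1, w1) = 1
    V, W = np.zeros((n, m + 1)), np.zeros((n, m + 1))
    alpha, beta, delta = np.zeros(m), np.zeros(m + 1), np.zeros(m + 1)
    V[:, 0], W[:, 0] = v1, w1
    for j in range(m):
        alpha[j] = np.dot(A @ V[:, j], W[:, j])
        vhat = A @ V[:, j]   - alpha[j] * V[:, j] - beta[j]  * V[:, j - 1]
        what = A.T @ W[:, j] - alpha[j] * W[:, j] - delta[j] * W[:, j - 1]
        s = np.dot(vhat, what)
        delta[j + 1] = np.sqrt(abs(s))
        if delta[j + 1] == 0.0:              # (near) breakdown: stop here
            m = j + 1
            break
        beta[j + 1] = s / delta[j + 1]
        W[:, j + 1] = what / beta[j + 1]
        V[:, j + 1] = vhat / delta[j + 1]
    T = np.diag(alpha[:m]) + np.diag(beta[1:m], 1) + np.diag(delta[1:m], -1)
    return V[:, :m], W[:, :m], T

np.random.seed(7)
A = np.random.rand(200, 200)
V, W, T = lanczos_biortho(A, np.random.rand(200), np.random.rand(200), m=40)
theta = np.linalg.eigvals(T)        # two-sided Ritz values
print(np.max(theta.real))
```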

If the algorithm does not break down before step m, then the vectors vi, i = 1, . . . , m, and wj, j = 1, . . . , m, are biorthogonal, i.e.,
    (vj, wi) = δij,  1 ≤ i, j ≤ m.
Moreover, {vi}i=1,2,...,m is a basis of Km(A, v1) and {wi}i=1,2,...,m is a basis of Km(A^H, w1), and

    A Vm = Vm Tm + δm+1 vm+1 e_m^H,
    A^H Wm = Wm Tm^H + β̄m+1 wm+1 e_m^H,
    Wm^H A Vm = Tm.

► If θj is an eigenvalue of Tm, with associated right and left eigenvectors yj and zj respectively, then the corresponding approximations for A are:

    Ritz value: θj     Right Ritz vector: Vm yj     Left Ritz vector: Wm zj

[Note: the terminology is abused slightly – Ritz values and vectors normally refer to Hermitian cases.]


Advantages and disadvantages

Advantages:
► Nice three-term recurrence – requires little storage in theory.
► Computes left and right eigenvectors at the same time.

Disadvantages:
► The algorithm can break down or nearly break down.
► Convergence is not too well understood; erratic behavior.
► Not easy to take advantage of the tridiagonal form of Tm.

Look-ahead Lanczos

The algorithm breaks down when:
    (v̂j+1, ŵj+1) = 0

Three distinct situations:
► 'Lucky breakdown': either v̂j+1 or ŵj+1 is zero. In this case, the eigenvalues of Tm are eigenvalues of A.
► (v̂j+1, ŵj+1) = 0 but v̂j+1 ≠ 0, ŵj+1 ≠ 0 → serious breakdown. It is often possible to bypass the step (+ a few more) and continue the algorithm. If this is not possible, then we get an ...
► ... incurable breakdown [very rare].

Look-ahead Lanczos algorithms deal with the second case. See Parlett '80, Freund and Nachtigal '90, ...

Main idea: when a breakdown occurs, skip the computation of vj+1, wj+1 and define vj+2, wj+2 from vj, wj, for example by orthogonalizing A²vj, ... One can define vj+1 somewhat arbitrarily as vj+1 = A vj, and similarly for wj+1.

► Drawbacks: (1) the projected problem is no longer tridiagonal; (2) it is difficult to know what constitutes a near-breakdown.
