Lect 066
Lent 2005
Professor A. Iserles
where $\hat\lambda$ is a scalar that may depend on $k$. Therefore the calculation of $x^{(k+1)}$ from $x^{(k)}$ requires the
solution of an $n \times n$ system of linear equations whose matrix is $A - \hat\lambda I$. Further, if $\hat\lambda$ is a constant and if
$A - \hat\lambda I$ is nonsingular, we deduce from (2.1) that $x^{(k+1)}$ is a multiple of $(A - \hat\lambda I)^{-k-1} x^{(0)}$.
We again let $x^{(0)} = \sum_{j=1}^{n} \theta_j w_j$, as in the proof of Theorem 2.3, assuming that $w_j$, $j = 1, 2, \ldots, n$, are
linearly independent eigenvectors of $A$ that satisfy $A w_j = \lambda_j w_j$. Therefore we note that the eigenvalue
equation implies $(A - \hat\lambda I) w_j = (\lambda_j - \hat\lambda) w_j$, which in turn implies $(A - \hat\lambda I)^{-1} w_j = (\lambda_j - \hat\lambda)^{-1} w_j$. It
follows that $x^{(k+1)}$ is a multiple of
$$(A - \hat\lambda I)^{-k-1} x^{(0)} = \sum_{j=1}^{n} \theta_j (A - \hat\lambda I)^{-k-1} w_j = \sum_{j=1}^{n} \theta_j (\lambda_j - \hat\lambda)^{-k-1} w_j.$$
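Thus, if the $l$th number in the set $\{|\lambda_j - \hat\lambda| : j = 1, 2, \ldots, n\}$ is smaller than the rest and if $\theta_l$ is nonzero,
then $x^{(k+1)}$ tends to be a multiple of $w_l$ as $k \to \infty$. Relative to the component along $w_l$, the component
along each other $w_j$ is multiplied by $|\lambda_l - \hat\lambda| / |\lambda_j - \hat\lambda| < 1$ at every iteration, so convergence is fast
if $\hat\lambda$ is very close to $\lambda_l$. Further, it can be made even faster by adjusting $\hat\lambda$ during the calculation. Typical
details are given in the following implementation.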
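As a small numerical illustration of the convergence argument above, the following NumPy sketch applies inverse iteration with a fixed shift to a diagonal matrix, whose eigenvectors are the coordinate vectors. The matrix, the shift 1.9, the starting vector and the function name `fixed_shift_inverse_iteration` are illustrative assumptions and do not come from the notes.

```python
import numpy as np

def fixed_shift_inverse_iteration(A, shift, x0, steps):
    """Repeat x <- (A - shift*I)^{-1} x, normalising after every solve."""
    M = A - shift * np.eye(A.shape[0])
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = np.linalg.solve(M, x)      # one linear solve per iteration
        x /= np.linalg.norm(x)         # normalisation does not change the direction
    return x

# Eigenvalues 1, 2, 4 with eigenvectors e_1, e_2, e_3; the shift is closest to 2.
A = np.diag([1.0, 2.0, 4.0])
x = fixed_shift_inverse_iteration(A, shift=1.9, x0=[1.0, 1.0, 1.0], steps=20)
```

In this example the components along $e_1$ and $e_3$ shrink by factors of $0.1/0.9$ and $0.1/2.1$ per step relative to the component along $e_2$, so `x` is very close to $e_2$ after 20 steps.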
Algorithm 2.8 (Typical implementation of inverse iteration)
0. Set $\hat\lambda$ to an estimate of an eigenvalue of $A$. Either prescribe $x^{(0)} \neq 0$ or let it be chosen automatically
in 3. Let $0 < \varepsilon \ll 1$ and set $k = 0$.
1. Calculate (with pivoting if necessary) the LU factorization of $A - \hat\lambda I$.
2. Stop if $U$ is singular because then $\hat\lambda$ is an eigenvalue of $A$, while its eigenvector is any nonzero vector in the
null space of $U$: it can be found easily, $U$ being upper triangular.
3. If $k = 0$ and unless $x^{(0)}$ has been prescribed, define $x^{(1)}$ by $U x^{(1)} = e_i$, where $e_i$ is the $i$th
coordinate vector, and where $i$ is defined by the property that $|U_{i,i}|$ is the smallest modulus of a diagonal
element of $U$. Further, we set $x^{(0)} = L e_i$, in order to satisfy $(A - \hat\lambda I) x^{(1)} = x^{(0)}$.
In all other cases, $x^{(k+1)}$ is calculated by solving $(A - \hat\lambda I) x^{(k+1)} = L U x^{(k+1)} = x^{(k)}$, which is straightforward
using the LU factorization from 1.
4. Set $\eta$ to the number that minimizes $f(\eta) = \|x^{(k)} - \eta x^{(k+1)}\|$.
5. Stop if $f(\eta) \le \varepsilon \|x^{(k+1)}\|$. Since $f(\eta) = \|(A - \hat\lambda I)^{-1} [A - (\eta + \hat\lambda) I] x^{(k)}\|$, we let $\hat\lambda + \eta$ be the
calculated eigenvalue of $A$ and $x^{(k+1)} / \|x^{(k+1)}\|$ be its eigenvector.
6. Otherwise, replace $x^{(k+1)}$ by $x^{(k+1)} / \|x^{(k+1)}\|$, increase $k$ by one, and either return to 3 without
changing $\hat\lambda$ or to 1 after replacing $\hat\lambda$ by $\hat\lambda + \eta$.
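The steps of Algorithm 2.8 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name and its parameters are hypothetical, the starting vector is always prescribed, `np.linalg.solve` is used in place of an explicitly stored LU factorization (so the factorization is recomputed on every call even when the shift is unchanged), and in the singular case a null vector is taken from the SVD rather than from the triangular factor $U$.

```python
import numpy as np

def inverse_iteration(A, shift, x0, eps=1e-10, update_shift=True, max_iter=50):
    """Sketch of Algorithm 2.8; returns (eigenvalue, unit eigenvector)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(max_iter):
        M = A - shift * np.eye(n)
        try:
            # Steps 1 and 3: solve (A - shift*I) x_new = x.
            x_new = np.linalg.solve(M, x)
        except np.linalg.LinAlgError:
            # Step 2: M is exactly singular, so shift is an eigenvalue.
            # Assumption: a null vector is read off the SVD instead of U.
            return shift, np.linalg.svd(M)[2][-1]
        # Step 4: least-squares minimiser of ||x - eta * x_new||.
        eta = (x_new @ x) / (x_new @ x_new)
        f = np.linalg.norm(x - eta * x_new)
        norm_new = np.linalg.norm(x_new)
        # Step 5: accept shift + eta once f is small relative to ||x_new||.
        if f <= eps * norm_new:
            return shift + eta, x_new / norm_new
        # Step 6: normalise and optionally replace the shift by shift + eta.
        x = x_new / norm_new
        if update_shift:
            shift = shift + eta
    raise RuntimeError("inverse iteration did not converge")

# Illustrative symmetric test matrix (an assumption, not from the notes).
A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
lam, v = inverse_iteration(A, shift=1.0, x0=np.ones(3))
```

Since $x^{(k)} - \eta x^{(k+1)} = [A - (\hat\lambda + \eta) I] x^{(k+1)}$, the stopping test in step 5 bounds the residual of the returned normalised pair by $\varepsilon$, up to rounding.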
Remark 2.9 (Further on inverse iteration). Algorithm 2.8 is very efficient if $A$ is an upper Hessenberg
matrix: every element of $A$ under the first subdiagonal is zero (i.e. $A_{i,j} = 0$, $j \le i - 2$). In this case the LU
factorization in 1 requires just $O(n^2)$ or $O(n)$ operations when $A$ is nonsymmetric or symmetric (hence tridiagonal), respectively. Thus
the replacement of $\hat\lambda$ by $\hat\lambda + \eta$ in 6 need not be expensive, so fast convergence can often be achieved easily.
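To indicate where the $O(n^2)$ count comes from, here is a hedged NumPy sketch of Gaussian elimination for an upper Hessenberg system: each column has only one subdiagonal entry to eliminate, so there is one row operation per column. The function name `hessenberg_solve` and the test data are assumptions for illustration. Pivoting is restricted to adjacent rows, and the $O(n)$ tridiagonal variant, which would need banded storage, is not shown.

```python
import numpy as np

def hessenberg_solve(H, b):
    """Solve H x = b for an upper Hessenberg H in O(n^2) operations."""
    U = np.array(H, dtype=float)
    y = np.array(b, dtype=float)
    n = U.shape[0]
    for j in range(n - 1):
        # Only row j+1 has a nonzero entry below the diagonal in column j,
        # so partial pivoting compares just two adjacent rows.
        if abs(U[j + 1, j]) > abs(U[j, j]):
            U[[j, j + 1], j:] = U[[j + 1, j], j:]
            y[[j, j + 1]] = y[[j + 1, j]]
        if U[j, j] == 0.0:
            raise np.linalg.LinAlgError("matrix is singular")
        m = U[j + 1, j] / U[j, j]
        U[j + 1, j:] -= m * U[j, j:]   # one O(n) row operation per column
        y[j + 1] -= m * y[j]
    if U[n - 1, n - 1] == 0.0:
        raise np.linalg.LinAlgError("matrix is singular")
    # Back substitution with the upper triangular factor, O(n^2) in total.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

H = np.array([[4.0, 1.0, 2.0], [3.0, 5.0, 1.0], [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
x = hessenberg_solve(H, b)
```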
There are standard ways of giving A this convenient form which will be considered later. This and our next
topic, deflation, depend on the following basic result.
Theorem 2.10 Let $A$ and $S$ be $n \times n$ matrices, $S$ being nonsingular. Then $w$ is an eigenvector of $A$ with
eigenvalue $\lambda$ if and only if $Sw$ is an eigenvector of $SAS^{-1}$ with the same eigenvalue.
Proof
$$Aw = \lambda w \iff AS^{-1}(Sw) = \lambda w \iff (SAS^{-1})(Sw) = \lambda (Sw). \qquad \Box$$
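A quick numerical check of Theorem 2.10 on a $2 \times 2$ example may be helpful. The matrices below are arbitrary illustrative choices, not taken from the notes.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
w = np.array([1.0, 1.0])                  # A w = 3 w
lam = 3.0
S = np.array([[1.0, 2.0], [0.0, 1.0]])    # any nonsingular S will do

B = S @ A @ np.linalg.inv(S)              # the similar matrix S A S^{-1}
# Norm of (S A S^{-1})(S w) - lam (S w); zero up to rounding.
residual = np.linalg.norm(B @ (S @ w) - lam * (S @ w))
```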
Definition 2.11 (Deflation). Suppose that we have found one solution of the eigenvector equation $Aw = \lambda w$,
where $A$ is again $n \times n$. Then deflation is the task of constructing an $(n-1) \times (n-1)$ matrix, $B$
say, whose eigenvalues are the other eigenvalues of $A$. Specifically, we apply a similarity transformation
$S$ to $A$ such that the first column of $SAS^{-1}$ is $\lambda$ times the first coordinate vector, because it follows from
the characteristic equation for eigenvalues and from Theorem 2.10 that we can let $B$ be the bottom right
$(n-1) \times (n-1)$ submatrix of $SAS^{-1}$.
We write the condition on $S$ as $(SAS^{-1}) e_1 = \lambda e_1$. Therefore the last part of the proof of Theorem 2.10
shows that it is sufficient if $S$ has the property $Sw = c e_1$, where $c$ is any nonzero scalar.
Technique 2.12 (Algorithm for deflation for symmetric A). Suppose that $A$ is symmetric and $w \in \mathbb{R}^n \setminus \{0\}$,
$\lambda \in \mathbb{R}$ are given so that $Aw = \lambda w$. We seek a nonsingular matrix $S$ such that $Sw = c e_1$ and such that
$SAS^{-1}$ is also symmetric. The last condition holds if $S$ is orthogonal, since then $S^{-1} = S^\top$. It is suitable
to pick a Householder reflection, which means that $S$ has the form $I - 2uu^\top / \|u\|^2$, where $u \in \mathbb{R}^n \setminus \{0\}$.
Specifically, we recall from the D3 Numerical Analysis course that Householder reflections are orthogonal
and that, for any nonzero $w \in \mathbb{R}^n$, a real vector $u$ can be found with the property
$$\left( I - 2 \frac{uu^\top}{\|u\|^2} \right) w = \pm \|w\| e_1. \tag{2.2}$$
Thus, we let $u_i = w_i$ for $i = 2, 3, \ldots, n$ and choose $u_1$ so that $2u^\top w = \|u\|^2$. Since
$$\left( I - 2 \frac{uu^\top}{\|u\|^2} \right) w = w - 2 \frac{u^\top w}{\|u\|^2} u = w - u,$$
it follows that (2.2) holds for components $i = 2, 3, \ldots, n$. Finally, we compute $u_1$ so that $2u^\top w = \|u\|^2$
is true. Since for our $u$ it is true that $u^\top w = \|w\|^2 + (u_1 w_1 - w_1^2)$ and $\|u\|^2 = \|w\|^2 + u_1^2 - w_1^2$, we
deduce that $u_1^2 - 2u_1 w_1 + w_1^2 = \|w\|^2$, hence $u_1 = w_1 \pm \|w\|$ and (2.2) is obeyed also for $i = 1$.
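The construction of $u$ can be checked numerically with a short NumPy sketch. The function name `householder_vector` and the test vector are illustrative assumptions. The sign in $u_1 = w_1 \pm \|w\|$ is chosen here to match the sign of $w_1$, a common choice (not prescribed in the notes) that avoids cancellation and guarantees $u \neq 0$.

```python
import numpy as np

def householder_vector(w):
    """Return u with u[1:] = w[1:] and u[0] = w[0] +/- ||w||."""
    w = np.asarray(w, dtype=float)
    u = w.copy()
    # Assumption: take the sign of w[0] so that no cancellation occurs.
    sign = 1.0 if w[0] >= 0 else -1.0
    u[0] = w[0] + sign * np.linalg.norm(w)
    return u

w = np.array([3.0, 4.0, 12.0])                    # ||w|| = 13
u = householder_vector(w)                         # u = (16, 4, 12)
S = np.eye(3) - 2.0 * np.outer(u, u) / (u @ u)    # formed explicitly only for checking
Sw = S @ w                                        # w - u = (-13, 0, 0) up to rounding
```

Here $2u^\top w = 416 = \|u\|^2$, so $Sw = w - u$ as in the derivation above.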
Since the bottom $n - 1$ components of $u$ and $w$ coincide, the calculation of $u$ requires only $O(n)$ computer
operations. Further, the calculation of $SAS^{-1}$ can be done in only $O(n^2)$ operations, taking advantage
of the form $S = I - 2uu^\top / \|u\|^2$, even if all the elements of $A$ are nonzero. After deflation, we may
find an eigenvector, $\hat w$ say, of $SAS^{-1}$. Then the corresponding eigenvector of $A$, given in Theorem 2.10,
is $S^{-1} \hat w = S \hat w$, because Householder matrices, like all symmetric orthogonal matrices, are involutions:
$S^2 = I$.
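One way to realise the $O(n^2)$ count is to expand $SAS$ with $S = I - \beta uu^\top$, $\beta = 2/\|u\|^2$: for symmetric $A$ this gives $SAS = A - uq^\top - qu^\top$, where $p = \beta Au$ and $q = p - \tfrac{1}{2}\beta (u^\top p) u$. The NumPy sketch below uses this standard rank-two update. The function names are hypothetical, and `np.linalg.eigh` appears only to supply a known eigenpair for the demonstration.

```python
import numpy as np

def deflate_symmetric(A, w):
    """Return (B, u): B is the deflated (n-1)x(n-1) matrix, u defines S."""
    A = np.asarray(A, dtype=float)
    w = np.asarray(w, dtype=float)
    u = w.copy()
    u[0] += np.copysign(np.linalg.norm(w), w[0])   # u_1 = w_1 +/- ||w||
    beta = 2.0 / (u @ u)
    p = beta * (A @ u)                             # one matrix-vector product
    q = p - 0.5 * beta * (u @ p) * u
    T = A - np.outer(u, q) - np.outer(q, u)        # equals S A S, never forming S
    return T[1:, 1:], u

def lift_eigenvector(u, v_hat):
    """Map an eigenvector v_hat of B to the eigenvector S [0; v_hat] of A."""
    w_hat = np.concatenate(([0.0], v_hat))         # eigenvector of S A S
    return w_hat - 2.0 * (u @ w_hat) / (u @ u) * u # apply S in O(n) operations

# Illustrative data (an assumption): take one eigenpair of A as "already found".
A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
vals, vecs = np.linalg.eigh(A)
B, u = deflate_symmetric(A, vecs[:, 0])
mu, V = np.linalg.eigh(B)
v = lift_eigenvector(u, V[:, 0])                   # eigenvector of A for mu[0]
```

Padding with a zero first component is valid in the symmetric case because $SAS$ is then block diagonal, with $\lambda$ in the top left corner and $B$ in the bottom right.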
Technique 2.13 (Algorithm for deflation when A is nonsymmetric). There is a faster algorithm for deflation
in the nonsymmetric case (of course, it applies also to symmetric matrices). Let $w_i$, $i = 1, 2, \ldots, n$, be the
components of the eigenvector $w$ and assume $w_1 \neq 0$ (which can be achieved by reordering the variables if
necessary). Further, we let $S$ be the $n \times n$ matrix whose elements agree with those of the $n \times n$ unit matrix,
except that the off-diagonal part of the first column of $S$ has the elements $S_{i1} = -w_i / w_1$, $i = 2, 3, \ldots, n$.
Then $S$ is nonsingular and has the property $Sw = w_1 e_1$, so it is suitable for our purpose. Moreover,
the elements of $S^{-1}$ also agree with those of the $n \times n$ unit matrix, except that $(S^{-1})_{i,1} = +w_i / w_1$,
$i = 2, 3, \ldots, n$. These forms of $S$ and $S^{-1}$ allow the calculation of $SAS^{-1}$, and hence $B$, to be done in
only $O(n^2)$ operations. Further, because the last $n - 1$ columns of $SAS^{-1}$ and of $SA$ are the same, $B$ is
just the bottom right $(n-1) \times (n-1)$ submatrix of $SA$. Therefore, for every integer $1 \le i \le n - 1$ we
form the $i$th row of $B$ in the following way: subtract $w_{i+1} / w_1$ times the first row of $A$ from the $(i+1)$th row
of $A$, and ignore the first component of the resultant row vector.
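The row operations just described can be sketched in a few lines of NumPy. The function name `deflate_nonsymmetric` and the example matrix are assumptions for illustration. The example is upper triangular only so that its eigenvalues (1, 4, 6) and the eigenvector for $\lambda = 6$ can be verified by hand.

```python
import numpy as np

def deflate_nonsymmetric(A, w):
    """Return the bottom-right (n-1)x(n-1) block of S A, assuming w[0] != 0."""
    A = np.asarray(A, dtype=float)
    w = np.asarray(w, dtype=float)
    if w[0] == 0.0:
        raise ValueError("reorder the variables so that w[0] is nonzero")
    # Row i+1 of S A is row i+1 of A minus (w_{i+1}/w_1) times the first row
    # of A; dropping the first column leaves B.
    multipliers = w[1:] / w[0]
    return A[1:, 1:] - np.outer(multipliers, A[0, 1:])

A = np.array([[1.0, 2.0, 3.0], [0.0, 4.0, 5.0], [0.0, 0.0, 6.0]])
w = np.array([16.0, 25.0, 10.0])      # A w = 6 w
B = deflate_nonsymmetric(A, w)        # eigenvalues of B are the remaining 1 and 4
```

For this example $B$ has trace 5 and determinant 4, consistent with the eigenvalues 1 and 4.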