
Lecture # 10

The Power Method for Eigenvalues

The power method finds the largest (in magnitude) eigenvalue of

A ∈ Rn×n .

To explain the power method, we make two assumptions.


1. A is diagonalizable. That is,

A = XΛX^{-1}

for some nonsingular matrix X and diagonal matrix Λ.

2. There is a single dominant eigenvalue λ1 such that

|λ1 | > |λ2 | ≥ |λ3 | ≥ . . . ≥ |λn |.

The bigger the gap between |λ1 | and |λ2 |, the better.
Let xj be the eigenvector associated with λj .
We choose an initial guess w0 such that

w0 = a1 x1 + a2 x2 + · · · + an xn = Xa

where
a = (a1 , a2 , . . . , an )T .
The basic idea of the power method is that

Aw0 = a1 λ1 x1 + a2 λ2 x2 + · · · + an λn xn = XΛa

and by induction

A^k w0 = a1 λ1^k x1 + a2 λ2^k x2 + · · · + an λn^k xn = XΛ^k a.

Factor out λ1^k and we have

A^k w0 = λ1^k ( a1 x1 + a2 (λ2/λ1)^k x2 + · · · + an (λn/λ1)^k xn ).

You can see that unless a1 = 0 (very unlikely), the dominant direction of this
vector will eventually be that of x1 .
To avoid overflow and underflow, we need to normalize A^k w0 at each step.
For the nonsymmetric case, the infinity norm will be the most convenient (for
the symmetric case, we switch to the two-norm).
So we let
wk = A^k w0 / ∥A^k w0∥∞.
A good heuristic initial guess for w0 is the vector such that

∥Aw0∥∞ = ∥A∥∞   (1)

which is just the vector whose jth component is

(w0)j = sign(aIj)   (2)

where I is the index of the row of A with the largest absolute row sum.


This leads to a very simple iteration

function [λ, x] = power_iter(A)

    Choose w0 according to (1)–(2)
    y = Aw0
    λ = eJ^T y, where J is the index of the largest-magnitude component of y
    x = y/λ;  xold = w0;
    while ∥x − xold∥∞ > ϵ
        y = Ax;
        xold = x;
        λ = eJ^T y, where J is the index of the largest-magnitude component of y
        x = y/λ;
    end;
end power_iter
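A minimal runnable sketch of this iteration in Python/NumPy (the tolerance tol, the iteration cap maxit, and the zero-sign guard are illustrative choices, not from the notes):

    import numpy as np

    def power_iter(A, tol=1e-10, maxit=1000):
        """Power iteration with infinity-norm scaling; a sketch of the pseudocode above."""
        # Start vector from (1)-(2): signs of the row of A with the largest absolute row sum.
        I = np.argmax(np.abs(A).sum(axis=1))
        x = np.sign(A[I, :]).astype(float)
        x[x == 0] = 1.0                       # guard: sign(0) would zero out a component
        xold = np.zeros_like(x)
        lam = 0.0
        it = 0
        while np.linalg.norm(x - xold, np.inf) > tol and it < maxit:
            y = A @ x
            J = np.argmax(np.abs(y))          # index of the largest-magnitude component of y
            lam = y[J]                        # eigenvalue estimate  eJ^T y
            xold = x
            x = y / lam                       # scale so the J-th component of x is 1
            it += 1
        return lam, x

On A = np.array([[2., 1.], [1., 2.]]) this should return λ ≈ 3 with x a multiple of (1, 1)^T.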

To see why this converges, we assume that the index J of the maximum component
"settles down" after a while. Then
eJ^T A^k w0 = λ1^k [ a1 eJ^T x1 + a2 (λ2/λ1)^k eJ^T x2 + · · · + an (λn/λ1)^k eJ^T xn ]
            = λ1^k [ a1 eJ^T x1 + ϵk ]

where
ϵk = O( (|λ2|/|λ1|)^k ).
The algorithm computes the approximation

λ̂(k) = eJ^T A^k w0 / eJ^T A^{k−1} w0
      = λ1^k [a1 eJ^T x1 + ϵk] / ( λ1^{k−1} [a1 eJ^T x1 + ϵ_{k−1}] )
      = λ1 (a1 eJ^T x1 + ϵk) / (a1 eJ^T x1 + ϵ_{k−1})
      = λ1 ( 1 + O( (|λ2|/|λ1|)^k ) ).

In general, this is quite slow!!! See the example on p. 259 of your text: it
is a 3 × 3 matrix, and it takes about 28 iterations to get anything decent. The only
bound obtained here is on the eigenvalue, not the eigenvector!!

The Symmetric Case

Much easier!! Now we can say

A = A^T = XΛX^T,   X^T X = In

so we have orthonormal eigenvectors and real eigenvalues. Here we can actually
get bounds on the vectors.
Now

w0 = a1 x1 + Σ_{j=2}^{n} aj xj = Xa

as before. But

X = ( x1   X̃ )

where X̃ = (x2, . . . , xn) is n × (n − 1) and satisfies

X̃^T x1 = 0.

Now assume each xj satisfies ∥xj∥2 = 1.
Thus w0 separates into

w0 = a1 x1 + X̃a2

where
a1 = x1^T w0,   a2 = X̃^T w0.
These have geometric meaning. We assume that ∥w0 ∥2 = 1, but the choice
of w0 in (1) still makes sense except that we normalize it in the two-norm.

cos θ0 = a1 = x1^T w0
|sin θ0| = ∥a2∥2 = ∥X̃^T w0∥2

where θ0 is the angle between w0 and x1 .


We also track

|tan θ0| = ∥X̃^T w0∥2 / |x1^T w0|.
If we let the next iterate w1 be given by

w1 = Aw0 / ∥Aw0∥2

then we want a bound on

|tan θ1| = ∥X̃^T w1∥2 / |x1^T w1|.
Since the normalizing factor cancels, we have

|tan θ1| = ∥X̃^T Aw0∥2 / |x1^T Aw0|.
Now,
Aw0 = a1 λ1 x1 + AX̃a2 .
Since the columns of X̃ are eigenvectors,

AX̃ = X̃Λ̃,   Λ̃ = diag(λ2, . . . , λn)

so
Aw0 = a1 λ1 x1 + X̃Λ̃a2
and
x1^T Aw0 = a1 λ1,
X̃^T Aw0 = Λ̃a2.
Thus
|tan θ1| = ∥X̃^T Aw0∥2 / |x1^T Aw0|
         = ∥Λ̃a2∥2 / (|λ1| |a1|)
         ≤ ∥Λ̃∥2 ∥a2∥2 / (|λ1| |a1|)
         = (|λ2|/|λ1|) (∥a2∥2 / |a1|)
         = (|λ2|/|λ1|) |tan θ0|.
An induction argument yields
|tan θk| = ∥X̃^T wk∥2 / |x1^T wk| ≤ (|λ2|/|λ1|)^k |tan θ0|.

Thus the computed eigenvector converges according to

O( (|λ2|/|λ1|)^k ).
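A quick numerical check of this contraction, under the assumption that the start vector is not orthogonal to x1 (the 4 × 4 random test matrix and the start vector below are illustrative, not from the notes):

    import numpy as np

    # Compare |tan θ_k| with the bound (|λ2|/|λ1|)^k |tan θ_0| on a random symmetric matrix.
    rng = np.random.default_rng(1)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                          # symmetric test matrix
    lam, X = np.linalg.eigh(A)                 # real eigenvalues, orthonormal eigenvectors
    i1 = int(np.argmax(np.abs(lam)))           # position of the dominant eigenvalue
    x1 = X[:, i1]
    Xtilde = np.delete(X, i1, axis=1)          # remaining eigenvectors (the columns of X~)
    ratio = np.abs(np.delete(lam, i1)).max() / abs(lam[i1])   # |λ2| / |λ1|

    w = np.full(4, 0.5)                        # unit-norm start vector (assumed not orthogonal to x1)
    tan0 = np.linalg.norm(Xtilde.T @ w) / abs(x1 @ w)
    for k in range(8):
        tan_k = np.linalg.norm(Xtilde.T @ w) / abs(x1 @ w)
        print(f"k={k}  tan_theta_k={tan_k:.3e}  bound={ratio**k * tan0:.3e}")
        w = A @ w
        w = w / np.linalg.norm(w)              # two-norm normalization, as in the symmetric case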

You can compute the eigenvalue even more accurately using the Rayleigh
quotient
λ(k) = wk^T Awk / (wk^T wk).

Since wk is a unit vector this is

λ(k) = wk^T Awk.
You can show that
|λ(k) − λ1| / |λ1| ≤ 2 sin^2 θk = O( (|λ2|/|λ1|)^{2k} ).   (3)
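A short Python/NumPy sketch of the symmetric case, combining two-norm scaling with the Rayleigh-quotient estimate (the function name, tolerance, and iteration cap are illustrative choices, not from the notes):

    import numpy as np

    def sym_power_iter(A, w0=None, tol=1e-12, maxit=500):
        """Power iteration for symmetric A: 2-norm scaling plus a Rayleigh-quotient estimate."""
        n = A.shape[0]
        w = np.ones(n) if w0 is None else np.asarray(w0, dtype=float)
        w = w / np.linalg.norm(w)              # unit vector, so the Rayleigh quotient is w^T A w
        lam = w @ A @ w
        for _ in range(maxit):
            y = A @ w
            w = y / np.linalg.norm(y)          # w_k = A w_{k-1} / ||A w_{k-1}||_2
            lam_new = w @ A @ w                # Rayleigh quotient λ(k)
            if abs(lam_new - lam) <= tol * max(abs(lam_new), 1.0):
                return lam_new, w
            lam = lam_new
        return lam, w

Per (3), the eigenvalue error shrinks like (|λ2|/|λ1|)^{2k}, i.e. with twice the exponent of the eigenvector angle bound.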

This is still not fast enough!! Next time we will look at methods for accelerating it.
The proof of (3) is below, but it was skipped in class (because of its length).
We can write wk as

wk = cos θk x1 + sin θk f

where
f = X̃X̃^T wk / ∥X̃X̃^T wk∥2.
Thus ∥f∥2 = 1 and x1^T f = 0. We have that

Awk = cos θk λ1 x1 + sin θk Af .

Prove for yourself that


x1^T Af = 0.
Then

wk^T Awk = (cos θk x1 + sin θk f)^T (cos θk λ1 x1 + sin θk Af)
         = λ1 cos^2 θk + sin^2 θk f^T Af
         = λ1 + sin^2 θk (f^T Af − λ1).

So
|wk^T Awk − λ1| / |λ1| ≤ sin^2 θk |f^T Af − λ1| / |λ1|
                       ≤ sin^2 θk (∥f∥2^2 ∥A∥2 + |λ1|) / |λ1|
                       = 2 sin^2 θk |λ1| / |λ1|
                       = 2 sin^2 θk.
