
Diagonalization and powers of matrices

Brian Krummel
April 6, 2020

One important application of diagonalizable matrices is computing powers of square matrices.


Let A be a diagonalizable n × n matrix expressed as
\[
A = P D P^{-1}
\]
for an n × n diagonal matrix D and an n × n invertible matrix P. Suppose we want to compute $A^k$ for some positive integer k. Then by multiplying $P D P^{-1}$ by itself k times and cancelling $P^{-1} P = I$:
\[
A^k = (P D P^{-1})^k = \underbrace{(P D P^{-1})(P D P^{-1}) \cdots (P D P^{-1})}_{k \text{ times}} = P \underbrace{D D \cdots D}_{k \text{ times}} P^{-1} = P D^k P^{-1}.
\]

Computing $D^k$ for a diagonal matrix is very easy:
\[
D^k = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}.
\]
That is, $D^k$ is the diagonal matrix obtained by taking the k-th power of each diagonal entry of D.
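As a quick illustration, here is a minimal NumPy sketch (not part of the original notes) of this recipe: it diagonalizes A numerically with np.linalg.eig and forms $P D^k P^{-1}$. It assumes A is in fact diagonalizable, and floating-point arithmetic may introduce small rounding errors.

```python
import numpy as np

def power_via_diagonalization(A, k):
    # Diagonalize A numerically: columns of P are eigenvectors,
    # the eigenvalues give the diagonal of D.
    eigenvalues, P = np.linalg.eig(A)
    Dk = np.diag(eigenvalues ** k)       # D^k: k-th powers of the diagonal entries
    return P @ Dk @ np.linalg.inv(P)     # P D^k P^{-1}

A = np.array([[-4.0, 6.0],
              [-3.0, 5.0]])
print(np.round(power_via_diagonalization(A, 5)))   # [[-34. 66.] [-33. 65.]]
```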

Example 1. Given the 2 × 2 matrix
\[
A = \begin{bmatrix} -4 & 6 \\ -3 & 5 \end{bmatrix},
\]
find $A^5$.
Answer. Finding eigenvalues of A. Suppose λ is an eigenvalue of A. Then (A − λI)x = 0 has a
nontrivial solution. Thus the matrix A − λI is singular and det(A − λI) = 0. We have

\[
\det(A - \lambda I) = \begin{vmatrix} -4 - \lambda & 6 \\ -3 & 5 - \lambda \end{vmatrix} = (-4 - \lambda)(5 - \lambda) + 18 = \lambda^2 - \lambda - 2 = (\lambda + 1)(\lambda - 2).
\]

Therefore the eigenvalues of A are −1, 2.

Finding an eigenvector corresponding to −1. We solve $(A + I)x = 0$:
\[
A + I = \begin{bmatrix} -3 & 6 \\ -3 & 6 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & -2 \\ 0 & 0 \end{bmatrix}.
\]
Here $x_2$ is a free variable and $x_1$ is a basic variable with $x_1 = 2x_2$, so an eigenvector of A corresponding to −1 is
\[
\begin{bmatrix} 2 \\ 1 \end{bmatrix}.
\]
Finding an eigenvector corresponding to 2. We solve $(A - 2I)x = 0$:
\[
A - 2I = \begin{bmatrix} -6 & 6 \\ -3 & 3 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}.
\]
Here $x_2$ is a free variable and $x_1$ is a basic variable with $x_1 = x_2$, so an eigenvector of A corresponding to 2 is
\[
\begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
Diagonalize. We let D be the diagonal matrix whose diagonal entries are the eigenvalues −1, 2, and we let P be the matrix whose columns are the corresponding eigenvectors:
\[
D = \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix}, \qquad P = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}.
\]
Thus
\[
A = P D P^{-1} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}^{-1}. \tag{1}
\]
Compute $A^5$.
\begin{align*}
A^5 = P D^5 P^{-1} &= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix}^5 \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix}^5 \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \\
&= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 32 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \\
&= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -32 & 64 \end{bmatrix} \\
&= \begin{bmatrix} -34 & 66 \\ -33 & 65 \end{bmatrix}.
\end{align*}
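Since det P = 1 here, the inverse $P^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}$ is exact and the whole computation stays in integers, so it is easy to check with a short NumPy snippet (a sketch, not part of the notes):

```python
import numpy as np

P = np.array([[2, 1],
              [1, 1]])
D5 = np.diag([(-1) ** 5, 2 ** 5])        # D^5 = diag(-1, 32)
P_inv = np.array([[1, -1],
                  [-1, 2]])              # inverse of P (det P = 1)

A = np.array([[-4, 6],
              [-3, 5]])
print(P @ D5 @ P_inv)                    # [[-34 66] [-33 65]]
print(np.linalg.matrix_power(A, 5))      # same, by direct repeated multiplication
```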

More generally, this gives us a way to compute functions of matrices.

Example 2. Let A be as in Example 1. Is $B = A^2 + 2A + 5I$ diagonalizable?

Answer. We have already shown that A is diagonalizable, so let $A = P D P^{-1}$. Then, using $A^2 = P D^2 P^{-1}$ and $I = P I P^{-1}$,
\begin{align*}
B &= A^2 + 2A + 5I \\
&= (P D P^{-1})^2 + 2 P D P^{-1} + 5I \\
&= P D^2 P^{-1} + 2 P D P^{-1} + 5 P I P^{-1} \\
&= P (D^2 + 2D + 5I) P^{-1}.
\end{align*}
Recalling (1),
\[
B = P \begin{bmatrix} (-1)^2 + 2(-1) + 5 & 0 \\ 0 & 2^2 + 2(2) + 5 \end{bmatrix} P^{-1} = P \begin{bmatrix} 4 & 0 \\ 0 & 13 \end{bmatrix} P^{-1}.
\]

Therefore, B is diagonalizable.

Notice that here we had the polynomial function $f(x) = x^2 + 2x + 5$. We showed that if A is a diagonalizable n × n matrix written as
\[
A = P D P^{-1} = P \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} P^{-1}
\]
where P is an invertible n × n matrix, then $f(A) = A^2 + 2A + 5I$ (with 5I in place of the constant term 5) is
\[
f(A) = P f(D) P^{-1} = P \begin{bmatrix} f(\lambda_1) & 0 & \cdots & 0 \\ 0 & f(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_n) \end{bmatrix} P^{-1}.
\]
This holds true for any polynomial function f(x). In fact, it holds true for any real analytic function f(x), i.e. any function which converges to its Taylor series.
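Here is a hedged NumPy sketch of this fact (assuming A is diagonalizable; f below is the polynomial from Example 2, applied entrywise to the eigenvalues):

```python
import numpy as np

def f_of_matrix(A, f):
    # f(A) = P f(D) P^{-1}, with f applied to the eigenvalues on the diagonal.
    eigenvalues, P = np.linalg.eig(A)
    return P @ np.diag(f(eigenvalues)) @ np.linalg.inv(P)

f = lambda x: x ** 2 + 2 * x + 5
A = np.array([[-4.0, 6.0],
              [-3.0, 5.0]])
print(np.round(f_of_matrix(A, f)))      # [[-5. 18.] [-9. 22.]]
print(A @ A + 2 * A + 5 * np.eye(2))    # agrees with direct evaluation of f(A)
```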

Example 3. For instance, consider the exponential function $\exp(x) = e^x$. This function has the Taylor series
\[
\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.
\]
We can define $\exp(A)$ for an n × n matrix A by
\[
\exp(A) = \sum_{k=0}^{\infty} \frac{A^k}{k!},
\]

where the infinite sum means that we sum the series entry by entry. Of course, defining exp(A) by an infinite series is not particularly enlightening. Instead, suppose that A is a diagonalizable matrix with $A = P D P^{-1}$ for an n × n diagonal matrix D and n × n invertible matrix P. Then using $A^k = P D^k P^{-1}$:
\begin{align*}
\exp(A) = \sum_{k=0}^{\infty} \frac{A^k}{k!} = \sum_{k=0}^{\infty} \frac{P D^k P^{-1}}{k!} = P \cdot \left( \sum_{k=0}^{\infty} \frac{D^k}{k!} \right) \cdot P^{-1}
&= P \cdot \begin{bmatrix} \displaystyle\sum_{k=0}^{\infty} \frac{\lambda_1^k}{k!} & 0 & \cdots & 0 \\ 0 & \displaystyle\sum_{k=0}^{\infty} \frac{\lambda_2^k}{k!} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \displaystyle\sum_{k=0}^{\infty} \frac{\lambda_n^k}{k!} \end{bmatrix} \cdot P^{-1} \\
&= P \begin{bmatrix} e^{\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n} \end{bmatrix} P^{-1}.
\end{align*}
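To make the series definition concrete, here is a small sketch (an illustration only, not the notes' method) that approximates exp(A) by truncating the Taylor series; in practice one would call scipy.linalg.expm instead.

```python
import numpy as np

def exp_via_series(A, terms=50):
    # Sum A^k / k! for k = 0, 1, ..., terms-1, building each term
    # from the previous one: A^{k+1}/(k+1)! = (A^k/k!) A / (k+1).
    total = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])            # A^0 / 0! = I
    for k in range(terms):
        total += term
        term = term @ A / (k + 1)
    return total

A = np.array([[-4.0, 6.0],
              [-3.0, 5.0]])
print(exp_via_series(A))                 # close to exp(A)
```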

For instance, when A is as in Example 1,


\begin{align*}
\exp(A) &= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} e^{-1} & 0 \\ 0 & e^2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} e^{-1} & 0 \\ 0 & e^2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \\
&= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} e^{-1} & -e^{-1} \\ -e^2 & 2e^2 \end{bmatrix} \\
&= \begin{bmatrix} 2e^{-1} - e^2 & -2e^{-1} + 2e^2 \\ e^{-1} - e^2 & -e^{-1} + 2e^2 \end{bmatrix}.
\end{align*}
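This diagonal formula can be checked against SciPy's matrix exponential (a sketch; scipy.linalg.expm computes exp(A) by other means):

```python
import numpy as np
from scipy.linalg import expm

P = np.array([[2.0, 1.0],
              [1.0, 1.0]])
P_inv = np.array([[1.0, -1.0],
                  [-1.0, 2.0]])

exp_A = P @ np.diag([np.exp(-1), np.exp(2)]) @ P_inv   # P exp(D) P^{-1}
A = np.array([[-4.0, 6.0],
              [-3.0, 5.0]])
print(exp_A)
print(expm(A))                                          # same matrix
```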

This is important when studying differential equations. Recall that for each real number a, the solution to $y' = ay$ is $y = ce^{at}$, where $c \in \mathbb{R}$ is a constant. For each n × n matrix A, we can consider the differential system $Y' = AY$, where $Y(t)$ is a function of t taking values in $\mathbb{R}^n$. The solution to $Y' = AY$ is $Y = \exp(tA) \cdot C$, where $C \in \mathbb{R}^n$ is a constant vector.
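A brief sketch of this fact (the initial condition C below is a made-up example): $Y(t) = \exp(tA)\,C$ satisfies $Y' = AY$, which we can check with a centered finite difference.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-4.0, 6.0],
              [-3.0, 5.0]])
C = np.array([1.0, 2.0])                 # hypothetical initial value Y(0)

def Y(t):
    return expm(t * A) @ C               # proposed solution of Y' = AY

t, h = 0.5, 1e-6
print((Y(t + h) - Y(t - h)) / (2 * h))   # numerical derivative Y'(t)
print(A @ Y(t))                          # A Y(t): agrees up to small numerical error
```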

Example 4. Metropolis is served by two local newspapers, the Daily Planet and Metropolis Star.
The Daily Planet seems to be in trouble: it currently has only a 34% market share. Every year, 10%
of its readership switches to the Star, whereas only 6% of the Star’s readership switches to the
Planet. Assume that no one subscribes to both papers and that the total newspaper readership
remains constant. What is the long-term outlook for the Planet?
Answer. Next year, the market shares for the Planet and the Star will be, respectively,
\[
0.9 \cdot 0.34 + 0.06 \cdot 0.66 = 0.3456,
\]
\[
0.1 \cdot 0.34 + 0.94 \cdot 0.66 = 0.6544.
\]
This can be expressed as a matrix product:
\[
\begin{bmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{bmatrix} \begin{bmatrix} 0.34 \\ 0.66 \end{bmatrix} = \begin{bmatrix} 0.3456 \\ 0.6544 \end{bmatrix}.
\]
In other words, $X_1 = P X_0$ where
\[
P = \begin{bmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{bmatrix}, \qquad X_0 = \begin{bmatrix} 0.34 \\ 0.66 \end{bmatrix}, \qquad X_1 = \begin{bmatrix} 0.3456 \\ 0.6544 \end{bmatrix}.
\]

We refer to the vectors $X_0$ and $X_1$ representing the readership for each year as state vectors. For each positive integer k, we let the state vector $X_k$ represent the readership in the k-th year. Notice that the sum of the entries of each state vector $X_k$ (for k = 0, 1) is 1. We call a column vector with non-negative entries that sum to 1 a probability vector. We refer to the matrix P as the transition matrix, since multiplication by P transitions the state vector $X_k$ for the k-th year to the state vector $X_{k+1} = P X_k$ for the next year. The columns of P give the probabilities that a reader stays with each paper or switches to its rival. Thus the state vectors satisfy the inductive relationship
\[
X_{k+1} = P X_k \tag{2}
\]
for each k. Notice that since every reader either stays with their paper or switches to its rival in the next year, the entries in each column of P must sum to 1. We call a matrix with non-negative entries whose columns each sum to 1 a probability matrix. Since the transition matrix P is independent of the readership, we say that this is a Markov process.
If we compute the readership for the next few years, we obtain
\begin{align*}
X_2 = P X_1 &= \begin{bmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{bmatrix} \begin{bmatrix} 0.3456 \\ 0.6544 \end{bmatrix} = \begin{bmatrix} 0.350304 \\ 0.649696 \end{bmatrix}, \\
X_3 = P X_2 &= \begin{bmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{bmatrix} \begin{bmatrix} 0.350304 \\ 0.649696 \end{bmatrix} = \begin{bmatrix} 0.35425536 \\ 0.64574464 \end{bmatrix}, \\
X_4 = P X_3 &= \begin{bmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{bmatrix} \begin{bmatrix} 0.35425536 \\ 0.64574464 \end{bmatrix} = \begin{bmatrix} 0.3575745024 \\ 0.6424254976 \end{bmatrix}.
\end{align*}
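These iterates are easy to reproduce with a few lines of NumPy (a sketch, not part of the notes):

```python
import numpy as np

P = np.array([[0.9, 0.06],
              [0.1, 0.94]])
X = np.array([0.34, 0.66])               # X_0

for k in range(1, 5):
    X = P @ X                            # X_{k+1} = P X_k
    print(k, X)
# 1 [0.3456 0.6544]
# 2 [0.350304 0.649696]
# 3 [0.35425536 0.64574464]
# 4 [0.3575745  0.6424255 ]
```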

The Planet is not in trouble. The readership of the Planet is in fact going up each year, whereas the readership of the Star is going down. Even though the Planet is currently less popular, there are not enough disgruntled Planet readers to keep the Star growing.
To compute the readership for the k-th year, we multiply the state vector by P k times. Hence by (2),
\[
X_k = P^k X_0
\]
for each k. We can use what we learned about computing powers of matrices using diagonalization to compute $P^k$ and $P^k X_0$, and thereby work out the long-term prospects of the Daily Planet.
Find eigenvalues. We compute
\[
\det(P - \lambda I) = \begin{vmatrix} 0.9 - \lambda & 0.06 \\ 0.1 & 0.94 - \lambda \end{vmatrix} = (0.9 - \lambda)(0.94 - \lambda) - 0.006 = \lambda^2 - 1.84\lambda + 0.84 = (\lambda - 1)(\lambda - 0.84) = 0.
\]
Therefore, λ = 1, 0.84.
Find eigenvectors for λ = 1. We compute
\[
P - I = \begin{bmatrix} -0.1 & 0.06 \\ 0.1 & -0.06 \end{bmatrix} \xrightarrow{R_2 + R_1 \mapsto R_2} \begin{bmatrix} -0.1 & 0.06 \\ 0 & 0 \end{bmatrix} \xrightarrow{-10 R_1 \mapsto R_1} \begin{bmatrix} 1 & -0.6 \\ 0 & 0 \end{bmatrix}.
\]
Here $x_2$ is a free variable and $x_1 = 0.6\,x_2$. Thus the eigenspace corresponding to λ = 1 is spanned by
\[
\begin{bmatrix} 0.6 \\ 1 \end{bmatrix}.
\]
Find eigenvectors for λ = 0.84. We compute
\[
P - 0.84\,I = \begin{bmatrix} 0.06 & 0.06 \\ 0.1 & 0.1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}.
\]
Here $x_2$ is a free variable and $x_1 = -x_2$. Thus the eigenspace corresponding to λ = 0.84 is spanned by
\[
\begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]

Diagonalize. $P = Q D Q^{-1}$ where
\[
D = \begin{bmatrix} 1 & 0 \\ 0 & 0.84 \end{bmatrix}, \qquad Q = \begin{bmatrix} 0.6 & -1 \\ 1 & 1 \end{bmatrix}.
\]

Compute $P^k$ and $P^k X_0$. We have
\begin{align*}
P^k = Q D^k Q^{-1} &= Q \begin{bmatrix} 1 & 0 \\ 0 & 0.84^k \end{bmatrix} Q^{-1} \\
&= \begin{bmatrix} 0.6 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0.84^k \end{bmatrix} \begin{bmatrix} 0.6 & -1 \\ 1 & 1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 0.6 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0.84^k \end{bmatrix} \cdot \frac{1}{1.6} \begin{bmatrix} 1 & 1 \\ -1 & 0.6 \end{bmatrix} \\
&= \frac{1}{1.6} \begin{bmatrix} 0.6 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -0.84^k & 0.6 \cdot 0.84^k \end{bmatrix} \\
&= \frac{1}{1.6} \begin{bmatrix} 0.6 + 0.84^k & 0.6\,(1 - 0.84^k) \\ 1 - 0.84^k & 1 + 0.6 \cdot 0.84^k \end{bmatrix}
\end{align*}
and
\begin{align*}
P^k X_0 &= \frac{1}{1.6} \begin{bmatrix} 0.6 + 0.84^k & 0.6\,(1 - 0.84^k) \\ 1 - 0.84^k & 1 + 0.6 \cdot 0.84^k \end{bmatrix} \begin{bmatrix} 0.34 \\ 0.66 \end{bmatrix} = \frac{1}{1.6} \begin{bmatrix} 0.6 - 0.056 \cdot 0.84^k \\ 1 + 0.056 \cdot 0.84^k \end{bmatrix} \\
&= \begin{bmatrix} 0.375 - 0.035 \cdot 0.84^k \\ 0.625 + 0.035 \cdot 0.84^k \end{bmatrix}.
\end{align*}
(As a check, k = 1 gives $X_1 = (0.3456, 0.6544)$, matching the direct computation above.) Letting k → ∞, the long-term readership X is given by
\[
X = \lim_{k \to \infty} P^k X_0 = \lim_{k \to \infty} \begin{bmatrix} 0.375 - 0.035 \cdot 0.84^k \\ 0.625 + 0.035 \cdot 0.84^k \end{bmatrix} = \begin{bmatrix} 0.375 \\ 0.625 \end{bmatrix}.
\]
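A short sketch comparing the closed form with direct matrix powers (not part of the notes); both approach (0.375, 0.625) as k grows:

```python
import numpy as np

P = np.array([[0.9, 0.06],
              [0.1, 0.94]])
X0 = np.array([0.34, 0.66])

def closed_form(k):
    # P^k X0 = (0.375 - 0.035 * 0.84^k, 0.625 + 0.035 * 0.84^k)
    return np.array([0.375 - 0.035 * 0.84 ** k,
                     0.625 + 0.035 * 0.84 ** k])

for k in (1, 10, 50):
    print(np.linalg.matrix_power(P, k) @ X0, closed_form(k))   # the two agree
```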

Note that there is another way we could have determined the long-term readership. Recall that
\[
X_{k+1} = P X_k
\]
for each k. Letting k → ∞,
\[
X = \lim_{k \to \infty} X_{k+1} = \lim_{k \to \infty} P X_k = P \cdot \lim_{k \to \infty} X_k = P X,
\]
or simply $X = P X$. In other words, the long-term readership X is an eigenvector of P with corresponding eigenvalue 1. Therefore,
\[
X = c \begin{bmatrix} 0.6 \\ 1 \end{bmatrix}
\]
for some scalar $c \in \mathbb{R}$. Since the entries of X add up to 1, we must have $1.6\,c = c\,(0.6 + 1) = 1$, or c = 0.625, so that
\[
X = 0.625 \begin{bmatrix} 0.6 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.375 \\ 0.625 \end{bmatrix},
\]
as we found above.
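The eigenvector route translates directly into NumPy (a sketch, not part of the notes): take any eigenvector of P for the eigenvalue 1 and rescale it so its entries sum to 1.

```python
import numpy as np

P = np.array([[0.9, 0.06],
              [0.1, 0.94]])
eigenvalues, V = np.linalg.eig(P)
i = np.argmin(np.abs(eigenvalues - 1))   # pick the eigenvalue closest to 1
X = V[:, i] / V[:, i].sum()              # normalize so the entries sum to 1
print(X)                                 # [0.375 0.625]
```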

We can describe some of what we observed with the following theorem. Recall that a probability vector is a column vector with non-negative entries summing to 1, and a probability matrix is a matrix with non-negative entries whose columns each sum to 1.

Theorem 1. Let P be an n × n probability matrix all of whose entries are non-zero. Then there is a unique probability vector $X \in \mathbb{R}^n$ such that
\[
P X = X.
\]
Moreover, for each probability vector $X_0 \in \mathbb{R}^n$,
\[
X = \lim_{k \to \infty} P^k X_0.
\]
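A quick numerical illustration of the theorem for the transition matrix of Example 4 (a sketch; the starting vectors are random probability vectors): every choice of $X_0$ leads to the same limit.

```python
import numpy as np

P = np.array([[0.9, 0.06],
              [0.1, 0.94]])
rng = np.random.default_rng(0)

for _ in range(3):
    X0 = rng.random(2)
    X0 /= X0.sum()                               # a random probability vector
    print(np.linalg.matrix_power(P, 300) @ X0)   # always ~ [0.375 0.625]
```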
