Diagonalization and Powers of Matrices: Brian Krummel April 6, 2020
\[ A = PDP^{-1} \]
\[ A^k = (PDP^{-1})^k = \underbrace{(PDP^{-1})(PDP^{-1}) \cdots (PDP^{-1})}_{k \text{ times}} = P \underbrace{DD \cdots D}_{k \text{ times}} P^{-1} = P D^k P^{-1}. \]
That is, $D^k$ is the diagonal matrix obtained by computing the $k$-th power of the diagonal entries of $D$.
Given
\[ A = \begin{pmatrix} -4 & 6 \\ -3 & 5 \end{pmatrix}, \]
find $A^5$.
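Answer. Finding eigenvalues of $A$. Suppose $\lambda$ is an eigenvalue of $A$. Then $(A - \lambda I)x = 0$ has a nontrivial solution. Thus the matrix $A - \lambda I$ is singular and $\det(A - \lambda I) = 0$. We have
\[ \det(A - \lambda I) = \begin{vmatrix} -4 - \lambda & 6 \\ -3 & 5 - \lambda \end{vmatrix} = (-4 - \lambda)(5 - \lambda) + 18 = \lambda^2 - \lambda - 2 = (\lambda + 1)(\lambda - 2), \]
so the eigenvalues of $A$ are $\lambda = -1$ and $\lambda = 2$.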
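Finding an eigenvector corresponding to $-1$. We solve $(A + I)X = 0$:
\[ A + I = \begin{pmatrix} -3 & 6 \\ -3 & 6 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -2 \\ 0 & 0 \end{pmatrix}. \]
Here $x_2$ is a free variable and $x_1 = 2x_2$, so $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$ is an eigenvector.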
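Finding an eigenvector corresponding to $2$. The same row reduction should work for the other eigenvalue. As a sketch, solving $(A - 2I)X = 0$ gives
\[ A - 2I = \begin{pmatrix} -6 & 6 \\ -3 & 3 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}, \]
so $x_1 = x_2$ and $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ spans the eigenspace. This would account for the second column of the matrix $P$ in (1).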
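Thus
\[ A = PDP^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{-1}. \tag{1} \]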
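Compute $A^5$.
\begin{align*}
A^5 = PD^5P^{-1} &= \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}^5 \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{-1} \\
&= \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}^5 \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \\
&= \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 32 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \\
&= \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 \\ -32 & 64 \end{pmatrix} \\
&= \begin{pmatrix} -34 & 66 \\ -33 & 65 \end{pmatrix}.
\end{align*}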
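The computation of $A^5$ can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper names `matmul` and `matpow` are hypothetical and written only for this illustration. It compares $PD^5P^{-1}$ against five-fold repeated multiplication of $A$.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matpow(X, k):
    """Compute X^k for k >= 1 by repeated multiplication (hypothetical helper)."""
    result = X
    for _ in range(k - 1):
        result = matmul(result, X)
    return result

A = [[-4, 6], [-3, 5]]
P = [[2, 1], [1, 1]]          # columns: eigenvectors for -1 and 2
P_inv = [[1, -1], [-1, 2]]    # inverse of P (det P = 1)
k = 5
D_k = [[(-1) ** k, 0], [0, 2 ** k]]   # k-th powers of the diagonal entries of D

via_diagonalization = matmul(matmul(P, D_k), P_inv)
via_repeated_product = matpow(A, k)

print(via_diagonalization)    # [[-34, 66], [-33, 65]]
print(via_repeated_product == via_diagonalization)
```

For a fixed small power the two routes cost about the same; the diagonalized form pays off when $k$ is large or left symbolic, since only the two diagonal entries need to be raised to the $k$-th power.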
Answer. We have already shown that $A$ is diagonalizable, so let $A = PDP^{-1}$. Then, using $A^2 = PD^2P^{-1}$ and $I = PIP^{-1}$,
\begin{align*}
B &= A^2 + 2A + 5I \\
&= (PDP^{-1})^2 + 2PDP^{-1} + 5I \\
&= PD^2P^{-1} + 2PDP^{-1} + 5PIP^{-1} \\
&= P(D^2 + 2D + 5I)P^{-1}.
\end{align*}
Recalling (1),
\[ B = P \begin{pmatrix} (-1)^2 + 2(-1) + 5 & 0 \\ 0 & 2^2 + 2(2) + 5 \end{pmatrix} P^{-1} = P \begin{pmatrix} 4 & 0 \\ 0 & 13 \end{pmatrix} P^{-1}. \]
Therefore, B is diagonalizable.
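As a sanity check on this factorization, here is a small Python sketch (an illustration only; `matmul` and `f` are hypothetical names introduced here). It forms $B = A^2 + 2A + 5I$ directly and compares it with $P\,f(D)\,P^{-1}$, where $f(x) = x^2 + 2x + 5$ is applied to the diagonal entries $-1$ and $2$.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def f(x):
    """The polynomial f(x) = x^2 + 2x + 5 applied to a scalar."""
    return x * x + 2 * x + 5

A = [[-4, 6], [-3, 5]]
I2 = [[1, 0], [0, 1]]
P = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]

# Direct evaluation of B = A^2 + 2A + 5I, entry by entry.
A2 = matmul(A, A)
B_direct = [[A2[i][j] + 2 * A[i][j] + 5 * I2[i][j] for j in range(2)]
            for i in range(2)]

# Evaluation through the diagonalization: f acts on the eigenvalues only.
f_D = [[f(-1), 0], [0, f(2)]]          # diag(4, 13)
B_diag = matmul(matmul(P, f_D), P_inv)

print(B_direct)
print(B_diag)
```

Both routes should print the same matrix, with $4$ and $13$ appearing as the eigenvalues of $B$.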
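The same argument works for any polynomial function $f(x)$: if $A = PDP^{-1}$, then $f(A) = P\,f(D)\,P^{-1}$, where $f(D)$ is the diagonal matrix obtained by applying $f$ to each diagonal entry of $D$. In fact, this holds true for any real analytic function $f(x)$, i.e. any function which is equal to its Taylor series, provided the series converges at each eigenvalue of $A$.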
Example 3. For instance, consider the exponential function $\exp(x) = e^x$. This function has the Taylor series
\[ \exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \]
We can define $\exp(A)$ for an $n \times n$ matrix by
\[ \exp(A) = \sum_{k=0}^{\infty} \frac{A^k}{k!}, \]
where the infinite sum means that we compute the infinite sum for each entry. Of course, defining $\exp(A)$ by an infinite series is not particularly enlightening. Instead, suppose that $A$ is a diagonalizable matrix with $A = PDP^{-1}$ for an $n \times n$ diagonal matrix $D$ with diagonal entries $\lambda_1, \ldots, \lambda_n$ and an $n \times n$ invertible matrix $P$. Then using $A^k = PD^kP^{-1}$:
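\begin{align*}
\exp(A) &= \sum_{k=0}^{\infty} \frac{A^k}{k!} = \sum_{k=0}^{\infty} \frac{PD^kP^{-1}}{k!} = P \cdot \left( \sum_{k=0}^{\infty} \frac{D^k}{k!} \right) \cdot P^{-1} = P \cdot \left( \sum_{k=0}^{\infty} \frac{1}{k!} \begin{pmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{pmatrix} \right) \cdot P^{-1} \\
&= P \begin{pmatrix} \sum_{k=0}^{\infty} \frac{\lambda_1^k}{k!} & 0 & \cdots & 0 \\ 0 & \sum_{k=0}^{\infty} \frac{\lambda_2^k}{k!} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sum_{k=0}^{\infty} \frac{\lambda_n^k}{k!} \end{pmatrix} P^{-1} = P \begin{pmatrix} e^{\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n} \end{pmatrix} P^{-1}.
\end{align*}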
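To see the formula in action, the following Python sketch (an illustration under stated assumptions, not part of the notes) takes the $2 \times 2$ matrix with eigenvalues $-1$ and $2$ and eigenvector matrix with rows $(2, 1)$ and $(1, 1)$ used earlier, and compares $P\,\mathrm{diag}(e^{-1}, e^{2})\,P^{-1}$ with a truncated Taylor series. The helper `matmul` and the cutoff `N = 40` are choices made here for illustration.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[-4.0, 6.0], [-3.0, 5.0]]
P = [[2.0, 1.0], [1.0, 1.0]]
P_inv = [[1.0, -1.0], [-1.0, 2.0]]

# exp(A) through the diagonalization: exponentiate the eigenvalues only.
exp_D = [[math.exp(-1.0), 0.0], [0.0, math.exp(2.0)]]
exp_A = matmul(matmul(P, exp_D), P_inv)

# Truncated Taylor series sum_{k=0}^{N} A^k / k!, built term by term.
N = 40                                  # assumed cutoff, ample for this matrix
series = [[1.0, 0.0], [0.0, 1.0]]       # k = 0 term is the identity
term = [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, N + 1):
    term = matmul(term, A)                              # now A^k / (k-1)!
    term = [[entry / k for entry in row] for row in term]   # now A^k / k!
    series = [[series[i][j] + term[i][j] for j in range(2)] for i in range(2)]

max_diff = max(abs(exp_A[i][j] - series[i][j])
               for i in range(2) for j in range(2))
print(exp_A)
print(max_diff)   # should be tiny (floating-point round-off)
```

The series route needs dozens of matrix products, while the diagonalized route needs two scalar exponentials and two matrix products.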
This is important when studying differential equations. Recall that for each real number $a$, the solution to $y' = ay$ is $y = ce^{at}$, where $c \in \mathbb{R}$ is a constant. For each $n \times n$ matrix $A$, we can consider the differential system $Y' = AY$, where $Y(t)$ is a function of $t$ taking values in $\mathbb{R}^n$. The solution to $Y' = AY$ is $Y = \exp(tA) \cdot C$, where $C \in \mathbb{R}^n$ is a constant vector.
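As an illustration (a sketch that is not worked in these notes), take the matrix $A = PDP^{-1}$ from (1). Since $tA = P(tD)P^{-1}$, the formula for the exponential of a diagonalizable matrix suggests
\[ \exp(tA) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-t} & 0 \\ 0 & e^{2t} \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 2e^{-t} - e^{2t} & -2e^{-t} + 2e^{2t} \\ e^{-t} - e^{2t} & -e^{-t} + 2e^{2t} \end{pmatrix}. \]
Two quick consistency checks: at $t = 0$ the right-hand side is the identity matrix, and its derivative at $t = 0$ is $\begin{pmatrix} -4 & 6 \\ -3 & 5 \end{pmatrix} = A$, as one would expect from $Y' = AY$.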
Example 4. Metropolis is served by two local newspapers, the Daily Planet and Metropolis Star.
The Daily Planet seems to be in trouble: it currently has only a 34% market share. Every year, 10%
of its readership switches to the Star, whereas only 6% of the Star’s readership switches to the
Planet. Assume that no one subscribes to both papers and that the total newspaper readership
remains constant. What is the long-term outlook for the Planet?
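Answer. Next year, the figures for the Planet and Star will be, respectively,
\[ 0.9\,(0.34) + 0.06\,(0.66) = 0.3456 \quad \text{and} \quad 0.1\,(0.34) + 0.94\,(0.66) = 0.6544. \]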
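In other words, $X_1 = PX_0$ where
\[ P = \begin{pmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{pmatrix}, \quad X_0 = \begin{pmatrix} 0.34 \\ 0.66 \end{pmatrix}, \quad X_1 = \begin{pmatrix} 0.3456 \\ 0.6544 \end{pmatrix}. \]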
We refer to the vectors $X_0$ and $X_1$ representing the readership for each year as the state vectors. For each positive integer $k$, we will let the state vector $X_k$ represent the readership in the $k$-th year. Notice that the sum of the entries of each state vector $X_k$ (for $k = 0, 1$) is 1. We call a column vector with non-negative entries and the sum of its entries equal to 1 a probability vector.
We refer to the matrix $P$ as the transition matrix, as it transitions the state vector $X_k$ for the $k$-th year to the state vector $X_{k+1} = PX_k$ for the next year via multiplication. Each column of $P$ lists the probabilities that a reader of one newspaper stays with it or switches to its rival. Thus the state vectors satisfy the inductive relationship
\[ X_{k+1} = PX_k \tag{2} \]
for each $k$. Notice that since each newspaper's readers either stay with it or go to its rival in the next year, the entries in each column of $P$ must sum to 1. We call a matrix $P$ with non-negative entries and the sum of its entries in each column equal to 1 a probability matrix. Since the transition matrix $P$ is independent of the readership, we say that this is a Markov process.
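If we compute the readership for the next few years, we obtain
\begin{align*}
X_2 &= PX_1 = \begin{pmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{pmatrix} \begin{pmatrix} 0.3456 \\ 0.6544 \end{pmatrix} = \begin{pmatrix} 0.350304 \\ 0.649696 \end{pmatrix}, \\
X_3 &= PX_2 = \begin{pmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{pmatrix} \begin{pmatrix} 0.350304 \\ 0.649696 \end{pmatrix} = \begin{pmatrix} 0.35425536 \\ 0.64574464 \end{pmatrix}, \\
X_4 &= PX_3 = \begin{pmatrix} 0.9 & 0.06 \\ 0.1 & 0.94 \end{pmatrix} \begin{pmatrix} 0.35425536 \\ 0.64574464 \end{pmatrix} = \begin{pmatrix} 0.3575745024 \\ 0.6424254976 \end{pmatrix}.
\end{align*}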
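These state vectors are easy to reproduce by machine. The following Python sketch is an illustration only; the helper `step` and the 50-year horizon are choices made here, not taken from the notes. It applies $X_{k+1} = PX_k$ repeatedly, starting from $X_0 = (0.34, 0.66)$.

```python
def step(P, X):
    """One year of the Markov process: return the product P X (hypothetical helper)."""
    return [sum(P[i][j] * X[j] for j in range(len(X))) for i in range(len(P))]

P = [[0.9, 0.06],
     [0.1, 0.94]]
X = [0.34, 0.66]          # initial market shares: Planet, Star

history = [X]             # history[k] holds the state vector X_k
for k in range(1, 51):    # assumed horizon of 50 years
    X = step(P, X)
    history.append(X)
    if k in (1, 2, 3, 4, 50):
        print(k, X)
```

Up to floating-point round-off, the first four printed vectors should match $X_1$ through $X_4$, and the entries of each state vector should keep summing to 1.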
The Planet is not in trouble. The readership of the Planet is in fact going up each year, whereas
the readership of the Star is going down. This is because even though the Planet is currently less
popular, there are not enough disgruntled Planet readers to keep the Star growing.
To compute the readership for the $k$-th year, we multiply the initial state vector $X_0$ by $P$ a total of $k$ times. Hence by (2),
\[ X_k = P^k X_0 \]
for each $k$. We can use what we learned about computing powers of matrices using diagonalization to compute $P^k$ and $P^k X_0$, and thereby work out the long-term prospects of the Daily Planet.
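Find eigenvalues. We compute
\begin{align*}
\det(P - \lambda I) &= \begin{vmatrix} 0.9 - \lambda & 0.06 \\ 0.1 & 0.94 - \lambda \end{vmatrix} \\
&= (0.9 - \lambda)(0.94 - \lambda) - 0.006 \\
&= \lambda^2 - 1.84\lambda + 0.84 \\
&= (\lambda - 1)(\lambda - 0.84) = 0.
\end{align*}
Therefore, $\lambda = 1, 0.84$.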
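Find eigenvectors for $\lambda = 1$. We compute
\[ P - I = \begin{pmatrix} -0.1 & 0.06 \\ 0.1 & -0.06 \end{pmatrix} \xrightarrow{R_2 + R_1 \mapsto R_2} \begin{pmatrix} -0.1 & 0.06 \\ 0 & 0 \end{pmatrix} \xrightarrow{-10 R_1 \mapsto R_1} \begin{pmatrix} 1 & -0.6 \\ 0 & 0 \end{pmatrix}. \]
$x_2$ is a free variable and $x_1 = 0.6\,x_2$. Thus the eigenspace corresponding to $\lambda = 1$ is spanned by
\[ \begin{pmatrix} 0.6 \\ 1 \end{pmatrix}. \]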
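Find eigenvectors for $\lambda = 0.84$. We compute
\[ P - 0.84\,I = \begin{pmatrix} 0.06 & 0.06 \\ 0.1 & 0.1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}. \]
$x_2$ is a free variable and $x_1 = -x_2$. Thus the eigenspace corresponding to $\lambda = 0.84$ is spanned by
\[ \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \]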
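With both eigenspaces in hand, the powers $P^k X_0$ can be written in closed form. The following is a sketch of that computation; the name $Q$ for the matrix of eigenvectors is introduced here (the letter $P$ is already taken by the transition matrix) and is an assumption of this illustration. Put
\[ Q = \begin{pmatrix} 0.6 & -1 \\ 1 & 1 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 0 \\ 0 & 0.84 \end{pmatrix}, \quad Q^{-1} = \frac{1}{1.6} \begin{pmatrix} 1 & 1 \\ -1 & 0.6 \end{pmatrix}, \]
so that $P = QDQ^{-1}$ and $P^k = QD^kQ^{-1}$. Since
\[ Q^{-1} X_0 = \frac{1}{1.6} \begin{pmatrix} 0.34 + 0.66 \\ -0.34 + 0.6\,(0.66) \end{pmatrix} = \begin{pmatrix} 0.625 \\ 0.035 \end{pmatrix}, \]
we would get
\[ X_k = P^k X_0 = 0.625 \begin{pmatrix} 0.6 \\ 1 \end{pmatrix} + 0.035\,(0.84)^k \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0.375 - 0.035\,(0.84)^k \\ 0.625 + 0.035\,(0.84)^k \end{pmatrix}. \]
Because $(0.84)^k \to 0$ as $k \to \infty$, the state vectors approach $\begin{pmatrix} 0.375 \\ 0.625 \end{pmatrix}$: on these assumptions the Planet's share rises toward 37.5% and the Star's falls toward 62.5%. As a check, $k = 0$ and $k = 1$ return $X_0 = (0.34, 0.66)$ and $X_1 = (0.3456, 0.6544)$.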
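Note that there is another way we could have determined the long-term readership. Recall that
\[ X_{k+1} = PX_k \]
for each $k$. Letting $k \to \infty$ and writing $X = \lim_{k \to \infty} X_k$ for the long-term readership,
\[ \lim_{k \to \infty} X_{k+1} = P \lim_{k \to \infty} X_k, \]
or simply
\[ X = PX. \]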
In other words, the long-term readership $X$ is an eigenvector of $P$ with corresponding eigenvalue 1. Therefore,
\[ X = c \begin{pmatrix} 0.6 \\ 1 \end{pmatrix} \]
for some scalar $c \in \mathbb{R}$. Since the entries of $X$ add up to 1, we must have $1.6\,c = c\,(0.6 + 1) = 1$, or $c = 0.625$, so that
\[ X = 0.625 \begin{pmatrix} 0.6 \\ 1 \end{pmatrix} = \begin{pmatrix} 0.375 \\ 0.625 \end{pmatrix} \]
as we found above.
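The eigenvector shortcut can be verified exactly with rational arithmetic. This Python sketch is an illustration only (the variable names are chosen here): it rescales the eigenvector $(0.6, 1)$ so its entries sum to 1 and confirms that the result is fixed by $P$.

```python
from fractions import Fraction

# Transition matrix with exact rational entries (0.9, 0.06, 0.1, 0.94).
P = [[Fraction(9, 10), Fraction(6, 100)],
     [Fraction(1, 10), Fraction(94, 100)]]

v = [Fraction(6, 10), Fraction(1)]   # eigenvector for eigenvalue 1
c = 1 / sum(v)                       # scalar making the entries sum to 1
X = [c * entry for entry in v]       # candidate long-term state vector

# Check the fixed-point equation P X = X with no round-off.
PX = [sum(P[i][j] * X[j] for j in range(2)) for i in range(2)]

print(c, X)        # c = 5/8, X = [3/8, 5/8]
print(PX == X)     # True
```

Using `Fraction` instead of floats keeps the equality test exact; $5/8 = 0.625$ and $3/8 = 0.375$ agree with the values in the text.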
We can describe some of what we observed with the following theorem. Recall that a probability vector is a column vector with non-negative entries and the sum of its entries equal to 1. A probability matrix is a matrix with non-negative entries and the sum of its entries in each column equal to 1.
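Theorem 1. Let $P$ be an $n \times n$ probability matrix with all non-zero entries. Then there is a unique probability vector $X \in \mathbb{R}^n$ such that
\[ PX = X. \]
Moreover, for every probability vector $X_0 \in \mathbb{R}^n$,
\[ X = \lim_{k \to \infty} P^k X_0. \]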
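A numerical experiment can make the theorem plausible for the newspaper example, though it proves nothing. In the Python sketch below, the helper `apply_power`, the four starting vectors, and the choice of 200 steps are all assumptions made for illustration. Each starting probability vector is pushed through $P$ repeatedly and the results are compared.

```python
def apply_power(P, X0, k):
    """Return P^k X0 by applying P to the vector k times (hypothetical helper)."""
    X = list(X0)
    for _ in range(k):
        X = [sum(P[i][j] * X[j] for j in range(len(X))) for i in range(len(P))]
    return X

# Transition matrix from the newspaper example; all entries are non-zero.
P = [[0.9, 0.06],
     [0.1, 0.94]]

# Several different probability vectors (non-negative entries summing to 1).
starts = [[1.0, 0.0], [0.0, 1.0], [0.34, 0.66], [0.5, 0.5]]

limits = [apply_power(P, X0, 200) for X0 in starts]
for X0, X in zip(starts, limits):
    print(X0, "->", X)
```

Every run should land, up to round-off, on the same vector, which is consistent with the limit not depending on $X_0$.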