The Geometric Series of a Matrix
Charlie Watson
Department of Mathematics and Statistics, University of Victoria
The first series you encountered was probably a geometric series. You probably summed $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$ to find it equal to 1. It's a lovely series, with a lovely proof of convergence and a lovely closed-form sum. It's even useful, as a bound on other series, as a source of other power series, or for stranger purposes, like proving that every countable subset of a Euclidean space has Lebesgue measure zero.
The geometric series is useful in matrix algebra, too, and in many areas that use it. The question is, given a matrix $T$, does the geometric series $I + T + T^2 + T^3 + \cdots$ make sense? If it does, and it converges, what does it converge to? We can actually answer the second question before the first. Remember that if $z$ is a complex number and $|z| < 1$, the series $\sum_{k=0}^{\infty} z^k$ converges to $\frac{1}{1-z}$. But every number can be thought of as a $1 \times 1$ matrix; the matrix addition and multiplication are the same as for numbers. Since a geometric series only involves addition and multiplication, the result should translate perfectly. However, we don't write $\frac{1}{1-T}$ for operators; we write $(I - T)^{-1}$. Not only is this true for matrices in general, but the proof is almost identical to that for the familiar series!
The partial sums of the series are $S_n = I + T + \cdots + T^n$, so $S_n - T S_n = (I + \cdots + T^n) - (T + \cdots + T^{n+1}) = I - T^{n+1}$. Also, $S_n - T S_n = (I - T) S_n$, so we have
$$(I - T) S_n = I - T^{n+1}.$$
In the limit $n \to \infty$, if $T^{n+1} \to 0$, then $(I - T) S_n \to I$. That is, if $T^n \to 0$, then $S_n \to (I - T)^{-1}$ (note that if $T^n \to 0$, then $1$ cannot be an eigenvalue of $T$, so $I - T$ really is invertible). Notice that, just like the familiar geometric series, the terms only have to vanish in the limit for the series to converge (in contrast with a general series). In the familiar case, $z^n \to 0$ if, and only if, $|z| < 1$. We must determine the conditions under which a matrix does the same.
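Before that, if you want to watch the convergence happen numerically, here's a quick sketch in Python (using NumPy); the matrix T below is just an arbitrary example whose powers vanish.

import numpy as np

# An arbitrary matrix whose powers go to zero (its eigenvalues are 0.5 and 0.1).
T = np.array([[0.2, 0.1],
              [0.3, 0.4]])

S = np.eye(2)        # partial sum S_0 = I
power = np.eye(2)    # current power, T^0
for _ in range(100):
    power = power @ T      # T^(k+1)
    S = S + power          # S_(k+1) = S_k + T^(k+1)

print(S)
print(np.linalg.inv(np.eye(2) - T))    # the two printouts should agree closely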
The answer is rather beautiful, and I'll show you a linear algebraic proof. It relies on the fact that a sequence of matrices converges if, and only if, the sequences formed by the entries converge. That is, $\lim_{n \to \infty} T_n = L$ iff $\lim_{n \to \infty} (T_n)_{i,j} = (L)_{i,j}$ for all $i$ and $j$. I won't prove this here, but I invite you to stretch your metric muscles and do so yourself.
Diagonal Matrices
Before we tackle the case of a general matrix, we can look for an easy case. A geometric series sums every non-negative power of the object in question, so we want a matrix with easily-computable powers. Diagonal matrices come to mind, so let's look at them. Let $D$ be a diagonal matrix, say
$$D = \begin{pmatrix} d_1 & 0 & 0 & \cdots \\ 0 & d_2 & 0 & \cdots \\ 0 & 0 & d_3 & \cdots \\ \vdots & & & \ddots \end{pmatrix}.$$
The powers of $D$ are then simply
$$D^m = \begin{pmatrix} d_1^m & 0 & 0 & \cdots \\ 0 & d_2^m & 0 & \cdots \\ 0 & 0 & d_3^m & \cdots \\ \vdots & & & \ddots \end{pmatrix}.$$
The limit of the powers of $D$ is simple: $D^m \to 0$ iff each $|d_i| < 1$. That's diagonal matrices done, then. To summarize: the geometric series $I + D + D^2 + \cdots$ converges to $(I - D)^{-1}$ if, and only if, every diagonal entry satisfies $|d_i| < 1$.
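As a concrete check, take the arbitrarily chosen example $D = \operatorname{diag}\left(\tfrac{1}{2}, -\tfrac{1}{3}\right)$. Then
$$I + D + D^2 + \cdots = \begin{pmatrix} \frac{1}{1 - \frac{1}{2}} & 0 \\ 0 & \frac{1}{1 + \frac{1}{3}} \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & \frac{3}{4} \end{pmatrix} = (I - D)^{-1}.$$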
This is nice and clean, but diagonal matrices are rare and precious. How
can we generalize it?
Diagonalizable Matrices
The next obvious step is diagonalizable matrices, whose powers are also easy to compute. Let $T$ be a diagonalizable matrix, $D$ its diagonal form, and $S$ the similarity transformation. Then $T = S D S^{-1}$ and $T^m = (S D S^{-1})^m = S D^m S^{-1}$. Since $S$ is a fixed invertible matrix, $T^m \to 0$ iff $D^m \to 0$.
We already found the conditions under which $D^m \to 0$. Pleasingly, the diagonal entries of $D$ are the eigenvalues of $T$, so $T^m \to 0$ iff each eigenvalue of $T$ has magnitude less than one. That language is a bit clumsy, and it can be improved. The spectral radius of a matrix, $\rho(T)$, is the maximum magnitude of its eigenvalues. The constraint on the eigenvalues of $T$ is then simply $\rho(T) < 1$.
This makes sense, since the eigenvalues measure how eigenvectors are stretched. If the matrix is diagonalizable, then it has a basis of eigenvectors, and every vector is stretched according to the eigenvalues. If the eigenvalues all have magnitude less than one, all vectors must be shrunk by the matrix. In the limit $m \to \infty$, $D^m v \to 0$ for every vector $v$, which means $D^m \to 0$. That's the diagonalizable matrices done, then.
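If you'd like to experiment, the spectral radius is easy to compute numerically. Here's a small Python (NumPy) sketch; the matrix T is an arbitrary diagonalizable example.

import numpy as np

T = np.array([[0.4, 0.5],
              [0.1, 0.3]])

rho = max(abs(np.linalg.eigvals(T)))     # spectral radius: largest eigenvalue magnitude
print(rho)                               # roughly 0.58, so less than one
print(np.linalg.matrix_power(T, 50))     # entries are essentially zero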
This still isn't good enough, since not all matrices are diagonalizable. How can we generalize this further?
All Matrices
We need a simplification that applies to all matrices. It won't be as nice as a diagonal matrix, but we'll cope. Triangular form seems a likely candidate, since every matrix has such a form. Unfortunately, computing powers of a triangular matrix is cumbersome (but solvable, as I might show in a future note). Instead, we can use Jordan canonical form. Every matrix is similar to a Jordan matrix and, as we'll see, the powers of a Jordan matrix are relatively simple. Recall that a Jordan matrix is block-diagonal; each block has a repeated diagonal entry, ones on the first super-diagonal, and zeroes elsewhere.
Since a Jordan matrix is block diagonal, its powers are greatly simplified. Just like the powers of a diagonal matrix only require the powers of the diagonal entries, the powers of a block diagonal matrix only require the powers of the blocks. With that in mind, let $J$ be a Jordan block. That is, for some (possibly complex) eigenvalue $\lambda$,
$$J = \begin{pmatrix} \lambda & 1 & 0 & \cdots \\ 0 & \lambda & 1 & \cdots \\ 0 & 0 & \lambda & \cdots \\ \vdots & & & \ddots \end{pmatrix}.$$
You're welcome to compute the powers of $J$ by brute force; it isn't too hard. However, I'd prefer to show you a clever trick. Notice that $J$ can be written as
$$J = \begin{pmatrix} \lambda & 0 & 0 & \cdots \\ 0 & \lambda & 0 & \cdots \\ 0 & 0 & \lambda & \cdots \\ \vdots & & & \ddots \end{pmatrix} + \begin{pmatrix} 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & & & \ddots \end{pmatrix} = \lambda I + S,$$
where $S$ has ones on its first superdiagonal, and zeroes elsewhere. This seems silly, but since $\lambda I$ commutes with $S$, it lets us write the powers of $J$ as $J^m = (\lambda I + S)^m = \sum_{k=0}^{m} \binom{m}{k} \lambda^{m-k} S^k$. At first glance, this might not seem to simplify anything, but the powers of $S$ are very simple:
$$S = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \quad S^2 = \begin{pmatrix} 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \quad S^3 = \begin{pmatrix} 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}.$$
Without much effort, you can see that $(S^m)_{ij} = 1$ if $j - i = m$ and $0$ otherwise. The crucial thing to notice is that a particular entry is only nonzero in at most one power of $S$. That means the earlier binomial expansion for $J^m$ collapses to a single term for each entry of $J^m$, namely
$$(J^m)_{ij} = \begin{cases} \sum_{k=0}^{m} \binom{m}{k} \lambda^{m-k} (S^k)_{ij} & \text{if } j \geq i \\ 0 & \text{otherwise} \end{cases}
= \begin{cases} \binom{m}{j-i} \lambda^{m-(j-i)} & \text{if } j \geq i \\ 0 & \text{otherwise.} \end{cases}$$
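Here's a short Python (NumPy) sketch of that entry formula, compared against direct matrix powering; the block size, eigenvalue, and exponent are arbitrary choices.

import numpy as np
from math import comb

lam, n, m = 0.8, 3, 20
J = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)    # Jordan block: lambda*I + S

# Entry formula: (J^m)_{ij} = C(m, j-i) * lam^(m-(j-i)) for j >= i, zero below the diagonal.
formula = np.zeros((n, n))
for i in range(n):
    for j in range(i, n):
        formula[i, j] = comb(m, j - i) * lam ** (m - (j - i))

print(np.allclose(formula, np.linalg.matrix_power(J, m)))    # True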
Evaluating $\lim_{m \to \infty} J^m$ is then as simple as evaluating $\lim_{m \to \infty} \binom{m}{j-i} \lambda^{m-(j-i)}$. With l'Hospital's Rule (the binomial coefficient grows only polynomially in $m$, while $\lambda^m$ shrinks exponentially when $|\lambda| < 1$), you can confirm that $\lim_{m \to \infty} \binom{m}{j-i} \lambda^{m-(j-i)} = 0$ iff $|\lambda| < 1$. The powers of $J$ then go to zero iff $|\lambda| < 1$. Since every matrix has a Jordan form, and the diagonal entries of the Jordan matrix are the eigenvalues, we can wrap up the general case: for any square matrix $T$, $T^m \to 0$, and hence $I + T + T^2 + \cdots = (I - T)^{-1}$, if, and only if, $\rho(T) < 1$.
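As one last numerical check, here's a Python (NumPy) sketch with a defective (non-diagonalizable) matrix: a single Jordan block with an arbitrarily chosen $\lambda = 0.9$. Its geometric series still converges to $(I - T)^{-1}$.

import numpy as np

lam, n = 0.9, 3
T = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)    # a Jordan block; not diagonalizable

S, power = np.eye(n), np.eye(n)
for _ in range(2000):          # lambda = 0.9 converges slowly, so take many terms
    power = power @ T
    S = S + power

print(np.allclose(S, np.linalg.inv(np.eye(n) - T)))    # True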
Remarks
One point to take away from this is a way to approximate the inverse of a matrix. For example, if $A$ is almost the identity matrix, then $A = I - T$ for some small matrix $T$. The inverse of $A$ is then $I + T + T^2 + \cdots$ if $\rho(T) < 1$. You can truncate the series at any point to get an approximation to $A^{-1}$.
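Here's a sketch of that approximation in Python (NumPy); the matrix A and the truncation point are arbitrary choices.

import numpy as np

A = np.array([[1.0, 0.05],
              [0.02, 0.9]])    # close to the identity
T = np.eye(2) - A              # A = I - T, and T is small (spectral radius about 0.11)

approx = np.eye(2) + T + T @ T + T @ T @ T    # truncate the series after T^3
print(approx)
print(np.linalg.inv(A))                        # the exact inverse, for comparison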
Another is simply a way to compute the geometric series of a matrix. If you're working on a finite absorbing Markov chain and need the expected numbers of visits to the transient states, the fundamental matrix is $\sum_{m=0}^{\infty} Q^m = (I - Q)^{-1}$, where $Q$ is the transition matrix restricted to the transient states (which satisfies $\rho(Q) < 1$).
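For instance, here's a Python (NumPy) sketch for a tiny absorbing chain with two transient states; the transition probabilities are made up for illustration.

import numpy as np

# Transition probabilities among the transient states only; the remaining probability
# mass leaks to an absorbing state, so the row sums are below one and rho(Q) < 1.
Q = np.array([[0.5, 0.3],
              [0.2, 0.4]])

N = np.linalg.inv(np.eye(2) - Q)    # fundamental matrix: expected numbers of visits
print(N)

# The same matrix from the geometric series directly.
S, power = np.eye(2), np.eye(2)
for _ in range(200):
    power = power @ Q
    S = S + power
print(S)    # should match N closely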
Finally, you can take away an appreciation for abstraction. We can add and multiply matrices, so there's nothing to stop us from thinking about geometric series. Indeed, that's not the end of it. A geometric series can be considered anywhere you can add, multiply, measure size, and take limits. The most general setting for that is a Banach algebra.