A Useful Basis For Defective Matrices: Jordan Vectors and The Jordan Form
What happens when the diagonalization of A fails? The answer, for any function f(A), turns out to involve the derivative of f!
1 Introduction

So far in the eigenproblem portion of 18.06, our strategy has been simple: find the eigenvalues λi and the corresponding eigenvectors xi of a square matrix A, expand any vector of interest u in the basis of these eigenvectors (u = c1 x1 + ··· + cn xn), and then any operation with A can be turned into the corresponding operation with λi acting on each eigenvector. So, A^k becomes λi^k, e^{At} becomes e^{λi t}, and so on. But this relied on one key assumption: we require the n × n matrix A to have a basis of n independent eigenvectors. We call such a matrix A diagonalizable.
Many important cases are always diagonalizable: matrices with n distinct eigenvalues λi, real symmetric or orthogonal matrices, and complex Hermitian or unitary matrices. But there are rare cases where A does not have a complete basis of n eigenvectors: such matrices are called defective. For example, consider the matrix

    A = [ 1  1
          0  1 ].

This matrix has a double eigenvalue λ1 = 1, but

    A − I = [ 0  1
              0  0 ]

is obviously rank 1, and has a one-dimensional nullspace spanned by x1 = (1, 0): A has only a single independent eigenvector.
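(A quick numerical sanity check, not part of the original notes: the sketch below, in Python with NumPy and my own variable names, shows that a numerical eigensolver finds the double eigenvalue but cannot produce a second independent eigenvector for this A.)

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    # eig returns the double eigenvalue 1, but the two eigenvector
    # columns it reports are (nearly) parallel: A is defective.
    lam, X = np.linalg.eig(A)
    print(lam)                                   # [1. 1.]
    print(np.linalg.matrix_rank(A - np.eye(2)))  # 1, so N(A - I) is 1-dimensional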
Defective matrices arise rarely in practice, and usually only when one arranges for them intentionally, so we have not worried about them up to now. But it is important to have some idea of what happens when you have a defective matrix, partially for computational purposes, but also to understand conceptually what is possible. For example, what will be the result of

    A^k (1, 2)   or   e^{At} (1, 2)

for the defective matrix A above, since (1, 2) is not in the span of the (single) eigenvector of A? For diagonalizable matrices, this would grow as λ^k or e^{λt}, respectively, but what about defective matrices? Although matrices in real applications are rarely exactly defective, it sometimes happens (often by design!) that they are nearly defective, and we can think of exactly defective matrices as a limiting case. (The book Spectra and Pseudospectra by Trefethen & Embree is a much more detailed dive into the fascinating world of nearly defective matrices.)
The textbook (Intro. to Linear Algebra, 5th ed. by Strang) covers the defective case only briefly, in section 8.3, with something called the Jordan form of the matrix, a generalization of diagonalization, but in this section we will focus more on the “Jordan vectors” than on the Jordan factorization. For a diagonalizable matrix, the fundamental vectors are the eigenvectors, which are useful in their own right and give the diagonalization of the matrix as a side-effect. For a defective matrix, to get a complete basis we need to supplement the eigenvectors with something called Jordan vectors or generalized eigenvectors. Jordan vectors are useful in their own right, just like eigenvectors, and also give the Jordan form. Here, we’ll focus mainly on the consequences of the Jordan vectors for how matrix problems behave.

2 Defining Jordan vectors

In the example above, we had a 2 × 2 matrix A but only a single eigenvector x1 = (1, 0). We need another vector to get a basis for R². Of course, we could pick another vector at random, as long as it is independent of x1, but we’d like it to have something to do with A, in order to help us with computations just like eigenvectors. The key thing is to look at A − I above, and to notice that (A − I)² = 0. (A matrix is called nilpotent if some power is the zero matrix.) So, the nullspace of (A − I)² must give us an “extra” basis vector beyond the eigenvector. But this extra vector must still be related to the eigenvector! If y ∈ N[(A − I)²], then (A − I)y must be in N(A − I), which means that (A − I)y is a multiple of x1! We just need to find a new “Jordan vector” or “generalized eigenvector” j1 satisfying

    (A − I)j1 = x1,   j1 ⊥ x1.

Notice that, since x1 ∈ N(A − I), we can add any multiple of x1 to j1 and still have a solution, so we can use Gram-Schmidt to get a unique solution j1 perpendicular to x1. This particular 2 × 2 equation is easy enough for us to solve by inspection, obtaining j1 = (0, 1). Now we have a nice orthonormal basis for R², and our basis has some simple relationship to A!
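(As a sketch of how you might compute j1 numerically; this code is mine, not the notes’. It solves the singular system (A − I)j = x1 by least squares and then projects out the x1 component, i.e. the Gram-Schmidt step described above.)

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    x1 = np.array([1.0, 0.0])                # the (single) eigenvector

    # any particular solution of (A - I) j = x1; lstsq handles the
    # singular matrix and returns a least-squares solution
    jp, *_ = np.linalg.lstsq(A - np.eye(2), x1, rcond=None)

    # Gram-Schmidt: remove the nullspace component along x1
    j1 = jp - (x1 @ jp) / (x1 @ x1) * x1
    print(j1)                                # [0. 1.], as found by inspection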
Before we talk about how to use these Jordan vectors, let’s give a more general definition. Suppose that λi is an eigenvalue of A corresponding to a repeated root of det(A − λI), but with only a single (ordinary) eigenvector xi, satisfying, as usual:

    (A − λi I)xi = 0.

If λi is a double root, we will need a second vector to complete our basis. Remarkably,¹ it turns out to always be the case for a double root λi that N([A − λi I]²) is two-dimensional, just as for the 2 × 2 example above. Hence, we can always find a unique second solution ji satisfying:

    (A − λi I)ji = xi,   ji ⊥ xi.

Again, we can choose ji to be perpendicular to xi via Gram-Schmidt; this is not strictly necessary, but gives a convenient orthogonal basis. (That is, the complete solution is always of the form xp + c·xi, a particular solution xp plus any multiple of the nullspace basis xi. If we choose c = −(xiᵀ xp)/(xiᵀ xi) we get the unique orthogonal solution ji.) We call ji a Jordan vector or generalized eigenvector of A. The relationship between ji and xi is also called a Jordan chain.

¹ This fact is proved in any number of advanced textbooks on linear algebra.

2.1 More than double roots

A more general notation is to use xi^(1) instead of xi and xi^(2) instead of ji. If λi is a triple root, we would find a third vector xi^(3) perpendicular to xi^(1,2) by requiring (A − λi I)xi^(3) = xi^(2), and so on. In general, if λi is an m-times repeated root, then N([A − λi I]^m) is m-dimensional and we will always be able to find an orthogonal sequence (a Jordan chain) of Jordan vectors xi^(j) for j = 2, …, m satisfying (A − λi I)xi^(j) = xi^(j−1) and (A − λi I)xi^(1) = 0. Even more generally, you might have cases with e.g. a triple root and two ordinary eigenvectors, where you need only one generalized eigenvector, or an m-times repeated root with ℓ > 1 eigenvectors and m − ℓ Jordan vectors. However, cases with more than a double root are extremely rare in practice. Defective matrices are rare enough to begin with, so here we’ll stick with the most common defective matrix, one with a double root λi: hence, one ordinary eigenvector xi and one Jordan vector ji.
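(For a concrete triple-root illustration, not from the notes: the matrix below is my own example, and SymPy constructs a Jordan chain for it exactly. SymPy’s chain is not orthogonalized the way we chose above, but it satisfies the same (A − λI) “shift” relations.)

    import sympy as sp

    # triple root lambda = 2, but only ONE ordinary eigenvector
    A = sp.Matrix([[2, 1, 1],
                   [0, 2, 1],
                   [0, 0, 2]])
    N = A - 2 * sp.eye(3)

    print(N.rank())                  # 2, so N(A - 2I) is 1-dimensional
    print(N**3 == sp.zeros(3, 3))    # True: N([A - 2I]^3) is all of R^3

    # columns of M form a Jordan chain x^(1), x^(2), x^(3): A = M J M^(-1)
    M, J = A.jordan_form()
    print(J)                         # the 3x3 Jordan block for lambda = 2
    print(N * M)                     # column j maps to x^(j-1); column 1 maps to 0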
3 Using Jordan vectors

Using an eigenvector xi is easy: multiplying by A is just like multiplying by λi. But how do we use a Jordan vector ji? The key is in the definition above. It immediately tells us that

    A ji = λi ji + xi.

It will turn out that this has a simple consequence for more complicated expressions like A^k or e^{At}, but that’s probably not obvious yet. Let’s try multiplying by A²:

    A² ji = A(A ji) = A(λi ji + xi) = λi (λi ji + xi) + λi xi
          = λi² ji + 2λi xi

and then try A³:

    A³ ji = A(A² ji) = A(λi² ji + 2λi xi) = λi² (λi ji + xi) + 2λi² xi
          = λi³ ji + 3λi² xi.
From this, it’s not hard to see the general pattern (which can be formally proved by induction):

    A^k ji = λi^k ji + k λi^{k−1} xi.

Notice that the coefficient in the second term is exactly d(λi^k)/dλi. This is the clue we need to get the general formula to apply any function f(A) of the matrix A to the eigenvector and the Jordan vector:

    f(A) xi = f(λi) xi,
    f(A) ji = f(λi) ji + f′(λi) xi.
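(We can spot-check the A^k formula numerically; this sketch is mine, using an arbitrary double root λ = 3 rather than the λ = 1 example so that both terms are nontrivial.)

    import numpy as np

    lam = 3.0
    B = np.array([[lam, 1.0],
                  [0.0, lam]])       # defective, double eigenvalue lam
    x = np.array([1.0, 0.0])         # eigenvector
    j = np.array([0.0, 1.0])         # Jordan vector: (B - lam I) j = x

    k = 10
    lhs = np.linalg.matrix_power(B, k) @ j
    rhs = lam**k * j + k * lam**(k - 1) * x   # predicted lam^k j + k lam^(k-1) x
    print(np.allclose(lhs, rhs))     # True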
Multiplying by a function of the matrix multiplies ji by the same function of the eigenvalue, just as for an eigenvector, but also adds a term multiplying xi by the derivative f′(λi). So, for example:

    e^{At} ji = e^{λi t} ji + t e^{λi t} xi.

We can show this explicitly by considering what happens when we apply our formula for A^k in the Taylor series for e^{At}:

    e^{At} ji = ∑_{k=0}^∞ (A^k t^k / k!) ji = ∑_{k=0}^∞ (t^k / k!) (λi^k ji + k λi^{k−1} xi)
              = ∑_{k=0}^∞ ((λi t)^k / k!) ji + t ∑_{k=1}^∞ ((λi t)^{k−1} / (k−1)!) xi = e^{λi t} ji + t e^{λi t} xi.
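(The same kind of spot check works for e^{At}; again a sketch of mine, using SciPy’s matrix exponential on the λ = 3 example from before.)

    import numpy as np
    from scipy.linalg import expm

    lam, t = 3.0, 0.7
    B = np.array([[lam, 1.0],
                  [0.0, lam]])
    x = np.array([1.0, 0.0])         # eigenvector
    j = np.array([0.0, 1.0])         # Jordan vector

    lhs = expm(B * t) @ j
    rhs = np.exp(lam * t) * j + t * np.exp(lam * t) * x
    print(np.allclose(lhs, rhs))     # True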
In general, that’s how we show the formula for f(A) above: we Taylor expand each term, and the A^k formula means that each term in the Taylor expansion has a corresponding term multiplying ji and a derivative term multiplying xi.

3.1 More than double roots

In the rare case of two Jordan vectors from a triple root, you will have a second Jordan vector xi^(3) and get

    f(A) xi^(3) = f(λi) xi^(3) + f′(λi) ji + f″(λi) xi / 2,

where the f″ term will give you k(k−1)λi^{k−2} and t² e^{λi t} terms for A^k and e^{At}, respectively. A quadruple root with one eigenvector and three Jordan vectors will give you f‴ terms (that is, k³ and t³ terms), and so on. The theory is quite pretty, but doesn’t arise often in practice so I will skip it; it is straightforward to work it out on your own if you are interested.

3.2 Example

Let’s try this for our example 2 × 2 matrix

    A = [ 1  1
          0  1 ]

from above, which has an eigenvector x1 = (1, 0) and a Jordan vector j1 = (0, 1) for an eigenvalue λ1 = 1. Suppose we want to compute A^k u0 and e^{At} u0 for u0 = (1, 2). As usual, our first step is to write u0 in the basis of the eigenvectors... except that now we also include the generalized eigenvectors to get a complete basis:

    u0 = (1, 2) = x1 + 2 j1.

Now, computing A^k u0 is easy, from our formula above:

    A^k u0 = A^k x1 + 2 A^k j1 = λ1^k x1 + 2 λ1^k j1 + 2k λ1^{k−1} x1
           = 1^k (1, 2) + 2k · 1^{k−1} (1, 0) = (1 + 2k, 2).

For example, this is the solution to the recurrence u_{k+1} = A u_k. Even though the only eigenvalue of A has magnitude |λ1| = 1 (not > 1), the solution still blows up, but it blows up linearly with k instead of exponentially.

Consider instead e^{At} u0, which is the solution to the system of ODEs du(t)/dt = A u(t) with the initial condition u(0) = u0. In this case, we get:

    e^{At} u0 = e^{At} x1 + 2 e^{At} j1 = e^{λ1 t} x1 + 2 e^{λ1 t} j1 + 2t e^{λ1 t} x1
              = e^t (1, 2) + 2t e^t (1, 0) = e^t (1 + 2t, 2).

In this case, the solution blows up exponentially since λ1 = 1 > 0, but we have an additional term that blows up as an exponential multiplied by t.
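(Both closed forms are easy to confirm numerically; this sketch is mine, not from the notes. It iterates the recurrence directly and compares the matrix exponential against e^t (1 + 2t, 2).)

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    u0 = np.array([1.0, 2.0])

    # u_{k+1} = A u_k grows linearly, matching (1 + 2k, 2)
    u = u0.copy()
    for k in range(1, 6):
        u = A @ u
        assert np.allclose(u, [1 + 2 * k, 2])

    # e^{At} u0 matches e^t (1 + 2t, 2)
    t = 1.5
    print(np.allclose(expm(A * t) @ u0,
                      np.exp(t) * np.array([1 + 2 * t, 2.0])))   # True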
Those of you who have taken 18.03 are probably familiar with these terms multiplied by t in the case of a repeated root. In 18.03, it is presented simply as a guess for the solution that turns out to work, but here we see that it is part of a more general pattern of Jordan vectors for defective matrices.

4 The Jordan form

For a diagonalizable matrix A, we made a matrix S out of the eigenvectors, and saw that multiplying by A was equivalent to multiplying by SΛS⁻¹, where Λ = S⁻¹AS is the diagonal matrix of eigenvalues, the diagonalization of A. Equivalently, AS = SΛ: A multiplies each column of S by the corresponding eigenvalue. Now, we will do exactly the same steps for a defective matrix A, using the basis of eigenvectors and Jordan vectors, and obtain the Jordan form J instead of Λ.

Let’s consider a simple 4 × 4 case first, in which there is only one repeated root λ2 with an eigenvector x2 and a Jordan vector j2, and the other two eigenvalues λ1 and λ3 are distinct with independent eigenvectors x1 and x3. Form a matrix M = (x1, x2, j2, x3) from this basis of four vectors (3 eigenvectors and 1 Jordan vector). Now, consider what happens when we multiply A by M:

    AM = (λ1 x1, λ2 x2, λ2 j2 + x2, λ3 x3)

       = M [ λ1  0   0   0
             0   λ2  1   0
             0   0   λ2  0
             0   0   0   λ3 ]  = MJ.
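(SymPy will carry out this M and J construction for you. The 4 × 4 below is a made-up example of exactly this situation, one 2 × 2 block for a double root, here λ2 = 5, plus two simple eigenvalues; it is mine, not from the notes.)

    import sympy as sp

    # build A = M J M^(-1) from a known Jordan form J ...
    J = sp.Matrix([[2, 0, 0, 0],
                   [0, 5, 1, 0],
                   [0, 0, 5, 0],
                   [0, 0, 0, 7]])
    M = sp.Matrix([[1, 0, 0, 0],
                   [1, 1, 0, 0],
                   [0, 1, 1, 0],
                   [0, 0, 1, 1]])
    A = M * J * M.inv()

    # ... and check that jordan_form recovers the same block structure
    P, J2 = A.jordan_form()
    print(J2)    # eigenvalues 2 and 7, one 2x2 Jordan block for 5 (up to ordering)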
That is, A = MJM⁻¹, where J is almost diagonal: it has the λ’s along the diagonal, but it also has 1’s above the diagonal for the columns corresponding to generalized eigenvectors. This is exactly the Jordan form of the matrix A. J, of course, has the same eigenvalues as A since A and J are similar, but J is much simpler than A. The 2 × 2 block

    [ λ2  1
      0   λ2 ]

is called a 2 × 2 Jordan block.

The generalization of this, when you perhaps have more than one repeated root, and perhaps the multiplicity of the root is greater than 2, is fairly obvious, and leads immediately to the formula given without proof in section 6.6 of the textbook. What I want to emphasize here, however, is not so much the formal theorem that a Jordan form exists, but how to use it via the Jordan vectors: in particular, that generalized eigenvectors give us linearly growing terms kλ^{k−1} and t e^{λt} when we multiply by A^k and e^{At}, respectively.

Computationally, the Jordan form is famously problematic, because with any slight random perturbation to A (e.g. roundoff errors) the matrix typically becomes diagonalizable, and the 2 × 2 Jordan block for λ2 disappears. One then has a basis X of eigenvectors, but it is nearly singular (“ill conditioned”): for a nearly defective matrix, the eigenvectors are almost linearly dependent. This makes eigenvectors a problematic way of looking at nearly defective matrices as well, because they are so sensitive to errors. Finding an approximate Jordan form of a nearly defective matrix is the famous Wilkinson problem in numerical linear algebra, and has a number of tricky solutions. Alternatively, there are approaches like “Schur factorization” or the SVD that lead to nice orthonormal bases for any matrix, but aren’t nearly as simple to use as eigenvectors.
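(A small numerical illustration of that sensitivity, a sketch of mine: perturb our defective A by a tiny amount, and an eigenvector basis exists but is terribly conditioned.)

    import numpy as np

    eps = 1e-10
    A = np.array([[1.0, 1.0],
                  [0.0, 1.0 + eps]])   # nearly defective perturbation of [[1,1],[0,1]]

    lam, X = np.linalg.eig(A)
    print(lam)                 # two eigenvalues split by about 1e-10
    print(np.linalg.cond(X))   # roughly 1e10: the eigenvectors are nearly parallel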