Jordan Canonical Form: Application to Differential Equations
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
ISBN: 9781598298048 (paperback)
ISBN: 9781598298055 (ebook)
DOI 10.2200/S00146ED1V01Y200808MAS002
Steven H. Weintraub
Lehigh University
ABSTRACT
Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear algebra. In this book we develop JCF and show how to apply it to solving systems of differential equations. We first develop JCF, including the concepts involved in it: eigenvalues, eigenvectors, and chains of generalized eigenvectors. We begin with the diagonalizable case and then proceed to the general case, but we do not present a complete proof. Indeed, our interest here is not in JCF per se, but in one of its important applications. We devote the bulk of our attention in this book to showing how to apply JCF to solve systems of constant-coefficient first-order differential equations, where it is a very effective tool. We cover all situations: homogeneous and inhomogeneous systems; real and complex eigenvalues. We also treat the closely related topic of the matrix exponential. Our discussion is mostly confined to the 2-by-2 and 3-by-3 cases, and we present a wealth of examples that illustrate all the possibilities in these cases (and of course, a wealth of exercises for the reader).
KEYWORDS
Jordan Canonical Form, linear algebra, differential equations, eigenvalues, eigenvectors,
generalized eigenvectors, matrix exponential
Preface
Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear algebra. In this book, we develop JCF and show how to apply it to solving systems of differential equations.
In Chapter 1, we develop JCF. We do not prove the existence of JCF in general, but we present the ideas that go into it: eigenvalues and (chains of generalized) eigenvectors. In Section 1.1, we treat the diagonalizable case, and in Section 1.2, we treat the general case. We develop all possibilities for 2-by-2 and 3-by-3 matrices, and illustrate these by examples.
In Chapter 2, we apply JCF. We show how to use JCF to solve systems Y' = AY + G(x) of constant-coefficient first-order linear differential equations. In Section 2.1, we consider homogeneous systems Y' = AY. In Section 2.2, we consider homogeneous systems when the characteristic polynomial of A has complex roots (in which case an additional step is necessary). In Section 2.3, we consider inhomogeneous systems Y' = AY + G(x) with G(x) nonzero. In Section 2.4, we develop the matrix exponential e^{Ax} and relate it to solutions of these systems. Also in this chapter we provide examples that illustrate all the possibilities in the 2-by-2 and 3-by-3 cases.
Appendix A has background material. Section A.1 gives background on coordinates for vectors and matrices for linear transformations. Section A.2 derives the basic properties of the complex exponential function. This material is relegated to the Appendix so that readers who are unfamiliar with these notions, or who are willing to take them on faith, can skip it and still understand the material in Chapters 1 and 2.
Our numbering system for results is fairly standard: Theorem 2.1, for example, is the first theorem found in Section 2 of Chapter 1.
As is customary in textbooks, we provide the answers to the odd-numbered exercises here. Instructors may contact me at [email protected] and I will supply the answers to all of the exercises.
Steven H. Weintraub
Lehigh University
Bethlehem, PA USA
July 2008
CHAPTER 1
Although, for simplicity, most of our examples will be over the real numbers (and indeed over the rational numbers), we will consider that all of our vectors and matrices are defined over the complex numbers C. It is only with this assumption that the theory of Jordan Canonical Form (JCF) works completely. See Remark 1.9 for the key reason why.
1.1
Definition 1.1. Let A be a square matrix. If v is a nonzero vector with Av = λv for some scalar λ, then v is an eigenvector of A, and λ is the associated eigenvalue.
Example 1.2. Let A be the matrix A = [5 7; -2 -4]. Then, as you can check, if v1 = [7, -2]^T, then Av1 = 3v1, so v1 is an eigenvector of A with associated eigenvalue 3, and if v2 = [1, -1]^T, then Av2 = -2v2, so v2 is an eigenvector of A with associated eigenvalue -2.
We note that the definition of an eigenvalue/eigenvector can be expressed in an alternate form. Here I denotes the identity matrix:
Av = λv
Av = λI v
(A - λI)v = 0.
For an eigenvalue λ of A, we let E_λ denote the eigenspace of λ,
E_λ = {v | Av = λv} = {v | (A - λI)v = 0} = Ker(A - λI).
(The kernel Ker(A - λI) is also known as the nullspace NS(A - λI).)
We also note that this alternate formulation helps us find eigenvalues and eigenvectors. For if (A - λI)v = 0 for a nonzero vector v, the matrix A - λI must be singular, and hence its determinant must be 0. This leads us to the following definition.
Definition 1.3. The characteristic polynomial of a matrix A is the polynomial det(λI - A).
Remark 1.4. This is the customary definition of the characteristic polynomial. But note that, if A is an n-by-n matrix, then the matrix λI - A is obtained from the matrix A - λI by multiplying each of its n rows by -1, and hence det(λI - A) = (-1)^n det(A - λI). In practice, it is most convenient to work with A - λI in finding eigenvectors (this minimizes arithmetic), and when we come to find chains of generalized eigenvectors in Section 1.2, it is (almost) essential to use A - λI, as using λI - A would introduce lots of spurious minus signs.
Example 1.5. Returning to the matrix A = [5 7; -2 -4] of Example 1.2, we compute that det(λI - A) = λ^2 - λ - 6 = (λ - 3)(λ + 2), so A has eigenvalues 3 and -2. Computation then shows that the eigenspace E_3 = Ker(A - 3I) has basis {[7, -2]^T}, and that the eigenspace E_{-2} = Ker(A - (-2)I) has basis {[1, -1]^T}.
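For readers who wish to check such computations by machine, here is a minimal sketch assuming the SymPy library is available (any computer algebra system would serve equally well); it finds the characteristic polynomial and the eigenspace bases of the matrix of Example 1.5.

from sympy import Matrix, symbols, factor

lam = symbols('lambda')
A = Matrix([[5, 7], [-2, -4]])

# Characteristic polynomial det(lambda*I - A) = (lambda - 3)(lambda + 2)
print(factor((lam*Matrix.eye(2) - A).det()))

# A basis of E_3 = Ker(A - 3I): a multiple of (7, -2)
print((A - 3*Matrix.eye(2)).nullspace())

# A basis of E_{-2} = Ker(A + 2I): a multiple of (1, -1)
print((A + 2*Matrix.eye(2)).nullspace())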
We now introduce two important quantities associated to an eigenvalue of a matrix A.
Definition 1.6. Let a be an eigenvalue of a matrix A. The algebraic multiplicity of the eigenvalue a is alg-mult(a) = the multiplicity of a as a root of the characteristic polynomial det(λI - A). The geometric multiplicity of the eigenvalue a is geom-mult(a) = the dimension of the eigenspace E_a.
It is common practice to use the word multiplicity (without a qualifier) to mean algebraic multiplicity.
We have the following relationship between these two multiplicities.
Lemma 1.7. Let a be an eigenvalue of the matrix A. Then 1 ≤ geom-mult(a) ≤ alg-mult(a).
Proof. By the definition of an eigenvalue, there is at least one eigenvector v with eigenvalue a, and so E_a contains the nonzero vector v, and hence dim(E_a) ≥ 1.
For the proof that geom-mult(a) ≤ alg-mult(a), see Lemma 1.12 in Appendix A. □
Corollary 1.8. Let a be an eigenvalue of A and suppose that a has algebraic multiplicity 1. Then a also has geometric multiplicity 1.
Lemma 1.10.
Proof. Let A have eigenvalues a1, a2, ..., am. For each i between 1 and m, let si = geom-mult(ai) and ti = alg-mult(ai). Then, by Lemma 1.7, si ≤ ti for each i, and by Remark 1.9, t1 + t2 + ... + tm = n. Thus, if si = ti for each i, then s1 + s2 + ... + sm = n, while if si < ti for some i, then s1 + s2 + ... + sm < n. □
Proposition 1.11. (1) Let a1, a2, ..., am be distinct eigenvalues of A (i.e., ai ≠ aj for i ≠ j). For each i between 1 and m, let vi be an associated eigenvector. Then {v1, v2, ..., vm} is a linearly independent set of vectors.
(2) More generally, let a1, a2, ..., am be distinct eigenvalues of A. For each i between 1 and m, let Si be a linearly independent set of eigenvectors associated to ai. Then S = S1 ∪ ... ∪ Sm is a linearly independent set of vectors.
Definition 1.13. A square matrix A is diagonalizable if A is similar to a diagonal matrix, i.e., if A = P J P^{-1} for some diagonal matrix J and some invertible matrix P.
Theorem 1.14. Let A be an n-by-n matrix. Then A is diagonalizable if and only if, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = P J P^{-1} where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity, and P is a matrix whose columns are eigenvectors forming bases for the associated eigenspaces.
Proof. We give a proof by direct computation here. For a more conceptual proof, see Theorem 1.10 in Appendix A.
First let us suppose that for each eigenvalue a of A, geom-mult(a) = alg-mult(a).
Let A have eigenvalues a1, a2, ..., an. Here we do not insist that the ai's are distinct; rather, each eigenvalue appears the same number of times as its algebraic multiplicity. Then J is the diagonal matrix
J = [ j1 j2 ... jn ],
and we see that ji, the i-th column of J, is the vector
ji = [0, ..., 0, ai, 0, ..., 0]^T
with ai in the i-th position, and 0 elsewhere.
We have
P = [ v1 v2 ... vn ],
a matrix whose columns are eigenvectors forming bases for the associated eigenspaces. By hypothesis, geom-mult(a) = alg-mult(a) for each eigenvalue a of A, so there are as many columns of P that are eigenvectors for the eigenvalue a as there are diagonal entries of J that are equal to a. Furthermore, by Lemma 1.10, the matrix P indeed has n columns.
We first show by direct computation that AP = PJ. Now
AP = A [ v1 v2 ... vn ] = [ Av1 Av2 ... Avn ] = [ a1 v1  a2 v2 ... an vn ],
since Avi = ai vi for each i. On the other hand, the i-th column of PJ is
P ji = [ v1 v2 ... vn ] ji = ai vi,
so the corresponding columns of AP and PJ agree. Hence AP = PJ, and since P is invertible, A = P J P^{-1}. □
Corollary 1.15. Let A be an n-by-n matrix with n distinct eigenvalues (i.e., each eigenvalue of A has algebraic multiplicity 1). Then A is diagonalizable.
Proof. By hypothesis, for each eigenvalue a of A, alg-mult(a) = 1. But then, by Corollary 1.8, for each eigenvalue a of A, geom-mult(a) = alg-mult(a), so the hypothesis of Theorem 1.14 is satisfied. □
Example 1.16. Let A be the matrix A = [5 7; -2 -4] of Examples 1.2 and 1.5. Then, referring to Example 1.5, we see that
[5 7; -2 -4] = [7 1; -2 -1] [3 0; 0 -2] [7 1; -2 -1]^{-1}.
As we have indicated, we have developed this theory over the complex numbers, as JCF works best over them. But there is an analog of our results over the real numbers; we just have to require that all the eigenvalues of A are real. Here is the basic result on diagonalizability.
Theorem 1.17. Let A be an n-by-n real matrix. Then A is diagonalizable if and only if all the eigenvalues of A are real numbers, and, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = P J P^{-1} where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity (and hence is a real matrix), and P is a real matrix whose columns are eigenvectors forming bases for the associated eigenspaces.
1.2
Let us begin this section by describing what a matrix in JCF looks like. A matrix in JCF is composed of Jordan blocks, so we first see what a single Jordan block looks like.
Definition 2.1. A k-by-k Jordan block associated to the eigenvalue λ is a k-by-k matrix of the form
J = [λ 1 0 ... 0; 0 λ 1 ... 0; ... ; 0 0 ... λ 1; 0 0 ... 0 λ].
In other words, a Jordan block is a matrix with all the diagonal entries equal to each other, all the entries immediately above the diagonal equal to 1, and all the other entries equal to 0.
Definition 2.2. A matrix J is in Jordan Canonical Form if it is a block-diagonal matrix
J = diag(J1, J2, J3, ..., Jℓ)
whose diagonal blocks J1, J2, ..., Jℓ are Jordan blocks (possibly of different sizes and associated to different eigenvalues).
Remark 2.3. Note that every diagonal matrix is a matrix in JCF, with each Jordan block a 1-by-1 block.
In order to understand and be able to use JCF, we must introduce a new concept, that of a generalized eigenvector.
Definition 2.4. If v is a nonzero vector with
(A - λI)^k (v) = 0
for some positive integer k, then v is a generalized eigenvector of A associated to the eigenvalue λ. The smallest k with (A - λI)^k (v) = 0 is the index of the generalized eigenvector v.
Let us note that if v is a generalized eigenvector of index 1, then
(A - λI)(v) = 0
Av = λI v
Av = λv
and so v is an (ordinary) eigenvector.
Recall that, for an eigenvalue λ of A, E_λ denotes the eigenspace of λ,
E_λ = {v | Av = λv} = {v | (A - λI)v = 0}.
We let Ẽ_λ denote the generalized eigenspace of λ,
Ẽ_λ = {v | (A - λI)^k (v) = 0 for some k}.
It is easy to check that Ẽ_λ is a subspace of C^n.
Example 2.5. Let A be the matrix A = [0 1; -4 4]. Then, as you can check, if u = [1, 2]^T, then (A - 2I)u = 0, so u is an eigenvector of A with associated eigenvalue 2 (and hence a generalized eigenvector of index 1 of A with associated eigenvalue 2). On the other hand, if v = [1, 0]^T, then (A - 2I)^2 v = 0 but (A - 2I)v ≠ 0, so v is a generalized eigenvector of index 2 of A with associated eigenvalue 2.
In this case, as you can check, the vector u is a basis for the eigenspace E_2, so E_2 = {cu | c in C} is one dimensional.
On the other hand, u and v are both generalized eigenvectors associated to the eigenvalue 2, and are linearly independent (the equation c1 u + c2 v = 0 only has the solution c1 = c2 = 0, as you can readily check), so Ẽ_2 has dimension at least 2. Since Ẽ_2 is a subspace of C^2, it must have dimension exactly 2, and Ẽ_2 = C^2 (and {u, v} is indeed a basis for C^2).
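The index computations in Example 2.5 are easy to confirm numerically; the following is a short sketch assuming SymPy is available.

from sympy import Matrix

A = Matrix([[0, 1], [-4, 4]])
N = A - 2*Matrix.eye(2)          # A - 2I

u = Matrix([1, 2])
v = Matrix([1, 0])

assert N*u == Matrix([0, 0])     # u is an ordinary eigenvector (index 1)
assert N*v != Matrix([0, 0])     # v is not an eigenvector ...
assert N*N*v == Matrix([0, 0])   # ... but (A - 2I)^2 v = 0, so v has index 2
print("u has index 1, v has index 2")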
Example 2.6.
and note that each vi is a generalized eigenvector of index i associated to the eigenvalue λ. A collection of generalized eigenvectors obtained in this way gets a special name:
Definition 2.7. If {v1, ..., vk} is a set of generalized eigenvectors associated to the eigenvalue λ of A, such that vk is a generalized eigenvector of index k and also
v_{k-1} = (A - λI)v_k, v_{k-2} = (A - λI)v_{k-1}, ..., v_1 = (A - λI)v_2,
then {v1, ..., vk} is called a chain of generalized eigenvectors of length k. The vector vk is called the top of the chain, and the vector v1 (an ordinary eigenvector) is called the bottom of the chain.
With this concept in hand, let us return to JCF. As we have seen, a matrix J in JCF has a number of blocks J1, J2, ..., Jℓ, called Jordan blocks, along the diagonal. Let us begin our analysis with the case when J consists of a single Jordan block. So suppose J is a k-by-k matrix
J = [λ 1 0 ... 0; 0 λ 1 ... 0; ... ; 0 0 ... λ 1; 0 0 ... 0 λ].
Then
J - λI = [0 1 0 ... 0; 0 0 1 ... 0; ... ; 0 0 ... 0 1; 0 0 ... 0 0].
Let e1 = [1, 0, 0, ..., 0]^T, e2 = [0, 1, 0, ..., 0]^T, e3 = [0, 0, 1, ..., 0]^T, ..., ek = [0, ..., 0, 1]^T denote the standard basis vectors of C^k. Then (J - λI)e1 = 0 and (J - λI)e_i = e_{i-1} for i = 2, ..., k, so {e1, ..., ek} is a chain of generalized eigenvectors of length k for J, associated to the eigenvalue λ.
Suppose now that {v1, ..., vk} is a chain of generalized eigenvectors of length k associated to the eigenvalue λ of A, so that vi = (A - λI)^{k-i} v_k for each i, and suppose that
c1 v1 + c2 v2 + ... + ck vk = 0.
We show that every ci must be 0, i.e., that a chain is always linearly independent.
Now let us multiply this equation on the left by (A - λI)^{k-1}. Then we obtain the equation
c1 (A - λI)^{2k-2} vk + c2 (A - λI)^{2k-3} vk + ... + c_{k-1} (A - λI)^k vk + ck (A - λI)^{k-1} vk = 0.
Now (A - λI)^{k-1} vk = v1 ≠ 0. However, (A - λI)^k vk = 0, and then also (A - λI)^{k+1} vk = (A - λI)(A - λI)^k vk = (A - λI)(0) = 0, and then similarly (A - λI)^{k+2} vk = 0, ..., (A - λI)^{2k-2} vk = 0, so every term except the last one is zero and this equation becomes
ck v1 = 0.
Since v1 ≠ 0, this shows ck = 0, so our linear combination is
c1 v1 + c2 v2 + ... + c_{k-1} v_{k-1} = 0.
Repeat the same argument, this time multiplying by (A - λI)^{k-2} instead of (A - λI)^{k-1}. Then we obtain the equation
c_{k-1} v1 = 0,
and, since v1 ≠ 0, this shows that c_{k-1} = 0 as well. Keep going to get
c1 = c2 = ... = c_{k-1} = ck = 0,
so the chain is indeed linearly independent. □
Theorem 2.11. Let A be a k-by-k matrix and suppose that C^k has a basis {v1, ..., vk} consisting of a single chain of generalized eigenvectors of length k associated to an eigenvalue a. Then
A = P J P^{-1},
where J is the k-by-k Jordan block associated to the eigenvalue a,
J = [a 1 0 ... 0; 0 a 1 ... 0; ... ; 0 0 ... a 1; 0 0 ... 0 a],
and
P = [ v1 v2 ... vk ]
is the matrix whose columns are the vectors of the chain, in order from bottom to top.
Proof. We give a proof by direct computation here. (Note the similarity of this proof to the proof of Theorem 1.14.) For a more conceptual proof, see Theorem 1.11 in Appendix A.
Let P be the given matrix. We will first show by direct computation that AP = PJ.
It will be convenient to write
J = [ j1 j2 ... jk ],
and we see that ji, the i-th column of J, is the vector
ji = [0, ..., 0, 1, a, 0, ..., 0]^T
with 1 in the (i-1)-st position, a in the i-th position, and 0 elsewhere (for i = 1 there is no entry of 1).
We show that AP = PJ by showing that their corresponding columns are equal. Now
AP = A [ v1 v2 ... vk ],
so the i-th column of AP is Avi. But
Avi = (A - aI + aI)vi = (A - aI)vi + aI vi = v_{i-1} + a vi for i > 1, and = a vi for i = 1.
On the other hand,
PJ = [ v1 v2 ... vk ] J,
and the i-th column of PJ is
P ji = [ v1 v2 ... vk ] ji = v_{i-1} + a vi for i > 1, and = a vi for i = 1.
Hence the corresponding columns of AP and PJ are equal, so AP = PJ, and since P is invertible, A = P J P^{-1}. □
Example 2.12. Applying Theorem 2.11 to the matrix A = [0 1; -4 4] of Examples 2.6 and 2.8, we see that
[0 1; -4 4] = [-2 1; -4 0] [2 1; 0 2] [-2 1; -4 0]^{-1}.
Here is the key theorem to which we have been heading. This theorem is one of the most important (and useful) theorems in linear algebra.
Theorem 2.13. Let A be any square matrix defined over the complex numbers. Then A is similar to a matrix in Jordan Canonical Form. More precisely, A = P J P^{-1} for some matrix J in Jordan Canonical Form. The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose columns are chains of generalized eigenvectors of A.
Proof. (Rough outline) In general, the JCF of a matrix A does not consist of a single block, but will have a number of blocks, of varying sizes and associated to varying eigenvalues. But in this situation we merely have to assemble the various blocks (to get the matrix J) and the various chains of generalized eigenvectors (to get a basis and hence the matrix P). Actually, the word "merely" is a bit misleading, as the argument that we can do so is, in fact, a subtle one, and we shall not give it here. □
In lieu of proving Theorem 2.13, we shall give a number of examples that illustrate the situation. In fact, in order to avoid complicated notation we shall merely illustrate the situation for 2-by-2 and 3-by-3 matrices.
Theorem 2.14. Let A be a 2-by-2 matrix. Then one of the following situations holds:
(i) A has two eigenvalues, a and b, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a and let v be an eigenvector associated to the eigenvalue b. Then A = P J P^{-1} with
J = [a 0; 0 b] and P = [ u v ].
(Note, in this case, A is diagonalizable.)
(ii) A has a single eigenvalue a of algebraic multiplicity 2.
(a) A has two linearly independent eigenvectors u and v. Then A = P J P^{-1} with
J = [a 0; 0 a] and P = [ u v ].
(Note, in this case, A is diagonalizable. In fact, in this case E_a = C^2 and A itself is the matrix [a 0; 0 a].)
(b) A has a single chain {v1, v2} of generalized eigenvectors. Then A = P J P^{-1} with
J = [a 1; 0 a] and P = [ v1 v2 ].
Theorem 2.15. Let A be a 3-by-3 matrix. Then one of the following situations holds:
(i) A has three eigenvalues, a, b, and c, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a, v be an eigenvector associated to the eigenvalue b, and w be an eigenvector associated to the eigenvalue c. Then A = P J P^{-1} with
J = [a 0 0; 0 b 0; 0 0 c] and P = [ u v w ].
(Note, in this case, A is diagonalizable.)
(ii) A has an eigenvalue a of algebraic multiplicity 2 and an eigenvalue b of algebraic multiplicity 1.
(a) A has two independent eigenvectors, u and v, associated to the eigenvalue a. Let w be an eigenvector associated to the eigenvalue b. Then A = P J P^{-1} with
J = [a 0 0; 0 a 0; 0 0 b] and P = [ u v w ].
(Note, in this case, A is diagonalizable.)
(b) A has a single chain {u1, u2} of generalized eigenvectors associated to the eigenvalue a. Let v be an eigenvector associated to the eigenvalue b. Then A = P J P^{-1} with
J = [a 1 0; 0 a 0; 0 0 b] and P = [ u1 u2 v ].
(iii) A has a single eigenvalue a of algebraic multiplicity 3.
(a) A has three linearly independent eigenvectors, u, v, and w. Then A = P J P^{-1} with
J = [a 0 0; 0 a 0; 0 0 a] and P = [ u v w ].
(Note, in this case, A is diagonalizable. In fact, in this case E_a = C^3 and A itself is the matrix [a 0 0; 0 a 0; 0 0 a].)
(b) A has a chain {u1, u2} of generalized eigenvectors and an eigenvector v with {u1, u2, v} linearly independent. Then A = P J P^{-1} with
J = [a 1 0; 0 a 0; 0 0 a] and P = [ u1 u2 v ].
(c) A has a single chain {u1, u2, u3} of generalized eigenvectors. Then A = P J P^{-1} with
J = [a 1 0; 0 a 1; 0 0 a] and P = [ u1 u2 u3 ].
Remark 2.16. Suppose that A has JCF J = aI, a scalar multiple of the identity matrix. Then A = P J P^{-1} = P (aI) P^{-1} = a(P I P^{-1}) = aI = J. This justifies the parenthetical remark in Theorems 2.14 (ii)(a) and 2.15 (iii)(a).
Remark 2.17. Note that Theorems 2.14 (i), 2.14 (ii)(a), 2.15 (i), 2.15 (ii)(a), and 2.15 (iii)(a) are all special cases of Theorem 1.14, and in fact Theorems 2.14 (i) and 2.15 (i) are both special cases of Corollary 1.15.
Now we would like to apply Theorems 2.14 and 2.15. In order to do so, we need to have an effective method to determine which of the cases we are in, and we give that here (without proof).
Definition 2.18. For an eigenvalue λ of A and a positive integer i, let
E_λ^i = {v | (A - λI)^i (v) = 0} = Ker((A - λI)^i).
Note that E_λ^i consists of generalized eigenvectors of index at most i (and the 0 vector), and is a subspace. Note also that
E_λ = E_λ^1 ⊆ E_λ^2 ⊆ ... ⊆ Ẽ_λ.
In general, the JCF of A is determined by the dimensions of all the spaces E_λ^i, but this determination can be a bit complicated. For eigenvalues of multiplicity at most 3, however, the situation is simpler: we need only consider the eigenspaces E_λ. This is a consequence of the following general result.
Proposition 2.19. Let λ be an eigenvalue of A. Then the number of blocks in the JCF of A corresponding to λ is equal to dim E_λ, i.e., to the geometric multiplicity of λ.
Proof. (Outline) Suppose there are ℓ such blocks. Since each block corresponds to a chain of generalized eigenvectors, there are ℓ such chains. Now the bottom of each chain is an (ordinary) eigenvector, so we get ℓ eigenvectors in this way. It can be shown that these eigenvectors are always linearly independent and that they always span E_λ, i.e., that they are a basis of E_λ. Thus, E_λ has a basis consisting of ℓ vectors, so dim E_λ = ℓ. □
We can now determine the JCF of 1-by-1, 2-by-2, and 3-by-3 matrices, using the following consequences of this proposition.
Corollary 2.20. Let λ be an eigenvalue of A of algebraic multiplicity 1. Then dim E_λ^1 = 1, i.e., λ has geometric multiplicity 1, and the submatrix of the JCF of A corresponding to the eigenvalue λ is a single 1-by-1 block.
Corollary 2.21. Let λ be an eigenvalue of A of algebraic multiplicity 2. Then there are the following possibilities:
(a) dim E_λ^1 = 2, i.e., λ has geometric multiplicity 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of two 1-by-1 blocks.
(b) dim E_λ^1 = 1, i.e., λ has geometric multiplicity 1. Also, dim E_λ^2 = 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of a single 2-by-2 block.
Corollary 2.22. Let λ be an eigenvalue of A of algebraic multiplicity 3. Then there are the following possibilities:
(a) dim E_λ^1 = 3, i.e., λ has geometric multiplicity 3. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of three 1-by-1 blocks.
(b) dim E_λ^1 = 2, i.e., λ has geometric multiplicity 2. Also, dim E_λ^2 = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a 2-by-2 block and a 1-by-1 block.
(c) dim E_λ^1 = 1, i.e., λ has geometric multiplicity 1. Also, dim E_λ^2 = 2, and dim E_λ^3 = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a single 3-by-3 block.
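Corollaries 2.20 through 2.22 translate directly into a small computation: for each eigenvalue λ one inspects dim Ker((A - λI)^i) for i = 1, 2, 3. The sketch below assumes SymPy, and the test matrix is a hypothetical one we build ourselves (not one from the text), constructed so that it lands in the case of Corollary 2.22(c).

from sympy import Matrix

# Hypothetical test matrix: built so that its JCF is a single 3-by-3 Jordan block, eigenvalue 2.
P = Matrix([[1, 0, 0], [1, 1, 0], [0, 1, 1]])
J = Matrix([[2, 1, 0], [0, 2, 1], [0, 0, 2]])
A = P * J * P.inv()

lam = 2
N = A - lam * Matrix.eye(3)
dims = [3 - (N**i).rank() for i in (1, 2, 3)]   # dim Ker (A - lam*I)^i
print(dims)   # [1, 2, 3]: geometric multiplicity 1, so a single 3-by-3 block (Corollary 2.22(c))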
Now we shall do several examples.
Example 2.23. A = [2 -3 3; 2 -2 2; 2 -1 1].
A has characteristic polynomial det(λI - A) = (λ + 1)(λ)(λ - 2). Thus, A has eigenvalues -1, 0, and 2, each of multiplicity one, and so we are in the situation of Theorem 2.15 (i). Computation shows that the eigenspace E_{-1} = Ker(A - (-1)I) has basis {[1, 0, -1]^T}, the eigenspace E_0 = Ker(A) has basis {[0, 1, 1]^T}, and the eigenspace E_2 = Ker(A - 2I) has basis {[1, 1, 1]^T}. Hence, we see that
[2 -3 3; 2 -2 2; 2 -1 1] = [1 0 1; 0 1 1; -1 1 1] [-1 0 0; 0 0 0; 0 0 2] [1 0 1; 0 1 1; -1 1 1]^{-1}.
3 1 1
Example 2.24. A = 2 4 2 .
1 1 3
19
1
0
Corollary
2.21 (a). Further computation shows that the eigenspace E6 = Ker(A 6I ) has basis
1
1
1 1 1
2 0 0
1 1 1
3 1 1
2 4 2 = 1
0 2 .
0 2 0 2 0 1
0
1 1
0 0 6
0
1 1
1 1 3
2 1 1
Example 2.25. A = 2 1 2 .
1 0 2
A has characteristic polynomial det (I A) = ( + 1)2 ( 3). Thus, A has an eigenvalue
1 of multiplicity 2 and an eigenvalue
ofmultiplicity 1. Computation shows that the eigenspace
3
1
E1 = Ker(A (I )) has basis 2 so dim E1 = 1 and we are in the situation of Corol
1
0
1
2 = Ker((A (I ))2 ) has basis 2 , 0 ,
lary 2.21 (b). Then we further compute that E1
1
0
therefore is two-dimensional, as we expect. More to the point, we may choose any generalized eigen2 that is not in E 1 , as the top of a chain. We choose u =
vector of index 2, i.e., any vector in E1
2
1
1
0
0 , and then we have u1 = (A (I ))u2 = 2 , and {u1 , u2 } form a chain.
1
1
We also compute that, for the eigenvalue 3, the eigenspace E3 has basis v = 6 .
1
Hence, we see that
1
2 1 1
1 0 5
1 1 0
1 0 5
2 1 2 = 2 0 6 0 1 0 2 0 6 .
1 0 2
1 1 1
0
0 3
1 1 1
20
2
1
1
Example 2.26. A = 2 1 2 .
1
1
2
1 of
A has characteristic polynomial det (I A) = ( 1)3 , so A has
one eigenvalue
1
1
multiplicity three. Computation shows that E1 = Ker(A I ) has basis 0 , 1 , so
1
0
dim E1 = 2 and we are in the situation of Corollary 2.22 (b). Computationthen shows that
0
0
1
dim E12 = 3 (i.e., (A I )2 = 0 and E12 is all of C3 ) with basis 0 , 1 , 0 .We may choose
1
0
0
1
u2 to be any vector in E12 that is not in E11 , and we shall choose u2 = 0 . Then u1 = (A I )u2 =
0
1
2 , and {u1 , u2 } form a chain. For the third vector, v, we may choose any vector in E1 such that
1
1
{u1 , v} is linearly independent. We choose v = 0 . Hence, we see that
1
1
1 1 1
1 1 0
1 1 1
2
1 1
2 1 2 = 2 0 0 0 1 0 2 0 0 .
1 0 1
0 0 1
1 0 1
1
1 2
5 0 1
Example 2.27. A = 1 1 0 .
7 1 0
A has characteristic polynomial det (I A) = ( 2)3 , so A
one eigenvalue 2 of multihas
1
plicity three. Computation shows that E2 = Ker(A 2I ) has basis 1 , so dim E21 = 1 and
3
we are in the situation of Corollary 2.22 (c). Then computation shows that E22 = Ker((A 2I )2 )
1
1
1
1
1
0
2
3
0
2
21
0
0
1
3
3
3
3
0
0
1
1
We may choose u3 to be any vector in C3 that is not in E22 , and we shall choose u3 = 0 . Then
0
3
2
u2 = (A 2I )u3 = 1 and u1 = (A 2I )u2 = 2 , and then {u1 , u2 , u3 } form a chain.
7
6
Hence, we see that
1
2
3 1
2 1 0
2
3 1
5 0 1
1 1 0 = 2
1 0 .
1 0 0 2 1 2
6 7 0
0 0 2
6 7 0
7 1 0
As we have mentioned, we need to work over the complex numbers in order for the theory
of JCF to fully apply. But there is an analog over the real numbers, and we conclude this section by
stating it.
Let A be a real square matrix (i.e., a square matrix with all entries real numbers), and
suppose that all of the eigenvalues of A are real numbers. Then A is similar to a real matrix in Jordan
Canonical Form. More precisely, A = P J P 1 with P and J real matrices, for some matrix J in Jordan
Canonical Form. The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose
columns are chains of generalized eigenvectors of A.
Theorem 2.28.
75
56
1. A =
,
90 67
50 99
2. A =
,
20 39
18 9
3. A =
,
49 24
det(I A) = ( 3)2 .
22
1
1
4. A =
,
16 9
5. A =
6. A =
det(I A) = ( 5)2 .
2
1
,
25 12
det(I A) = ( 7)2 .
15 9
,
25 15
det(I A) = 2 .
1 0
0
7. A = 1 2 3,
1 1 0
3 0 2
8. A = 1 3 1,
0 1 1
5
8
16
9. A = 4
1
8 ,
4 4 11
4 2 3
10. A = 1 1 3,
2 4 9
5
2
1
11. A = 1 2 1,
1 2 3
8 3 3
12. A = 4
0 2,
2 1
3
3 1 1
13. A = 7 5 1,
6 6 2
3
0
0
14. A = 9 5 18,
4 4
12
6 9
0
15. A = 6 6 2,
9 9 3
det(I A) = 2 ( 3).
18 42 168
16. A = 1
7 40,
2
6
27
1 1 1
17. A = 10 6 5,
6 3 2
det(I A) = ( 1)3 .
0 4 1
18. A = 2 6 1,
4 8 0
det(I A) = ( + 2)3 .
4 1 2
19. A = 5 1 3,
7 2 3
det(I A) = 3 .
4 2 5
20. A = 1 1 1,
2 1 2
det(I A) = ( + 1)3 .
23
24
25
CHAPTER 2
2.1
We will now see how to use Jordan Canonical Form (JCF) to solve systems Y' = AY. We begin by describing the strategy we will follow throughout this section.
Consider the matrix system
Y' = AY.
You will note that throughout this section, in solving Z' = JZ, we write the solution as Z = M_Z C, where M_Z is a matrix of functions, called the fundamental matrix of the system, and C is a vector of arbitrary constants. The reason for this will become clear later. (See Remarks 1.12 and 1.14.)
Although it is not logically necessary (we may regard a diagonal matrix as a matrix in JCF in which all the Jordan blocks are 1-by-1 blocks), it is illuminating to handle the case when J is diagonal first. Here the solution is very easy.
Theorem 1.1. Let J be a k-by-k diagonal matrix,
J = diag(a1, a2, a3, ..., a_{k-1}, ak).
Then the system Z' = JZ has solution
Z = diag(e^{a1 x}, e^{a2 x}, e^{a3 x}, ..., e^{a_{k-1} x}, e^{ak x}) C = M_Z C,
where C = [c1, c2, ..., ck]^T is a vector of arbitrary constants c1, c2, ..., ck.
Proof. Multiplying out, we see that the system Z' = JZ is just the system
z1' = a1 z1, z2' = a2 z2, ..., zk' = ak zk,
a collection of k uncoupled first-order equations, the i-th of which has solution zi = ci e^{ai x}. Assembling these solutions, we obtain
Z = [c1 e^{a1 x}, c2 e^{a2 x}, ..., ck e^{ak x}]^T = M_Z C,
as claimed. □
Example 1.2. Consider the system Y' = AY where A = [5 7; -2 -4].
As in Example 1.16 of Chapter 1, A = P J P^{-1} with P = [7 1; -2 -1] and J = [3 0; 0 -2]. Then Z' = JZ has solution
Z = [e^{3x} 0; 0 e^{-2x}] [c1; c2] = [c1 e^{3x}; c2 e^{-2x}] = M_Z C,
and so Y = P Z = P M_Z C, i.e.,
Y = [7 1; -2 -1] [e^{3x} 0; 0 e^{-2x}] [c1; c2]
  = [7e^{3x}  e^{-2x}; -2e^{3x}  -e^{-2x}] [c1; c2]
  = [7c1 e^{3x} + c2 e^{-2x}; -2c1 e^{3x} - c2 e^{-2x}].
Example 1.3. Consider the system Y' = AY where A = [2 -3 3; 2 -2 2; 2 -1 1].
As in Example 2.23 of Chapter 1, A = P J P^{-1} with
P = [1 0 1; 0 1 1; -1 1 1] and J = [-1 0 0; 0 0 0; 0 0 2].
Then Z' = JZ has solution
Z = [e^{-x} 0 0; 0 1 0; 0 0 e^{2x}] [c1; c2; c3] = M_Z C,
and so Y = P Z = P M_Z C, i.e.,
Y = [1 0 1; 0 1 1; -1 1 1] [e^{-x} 0 0; 0 1 0; 0 0 e^{2x}] [c1; c2; c3]
  = [e^{-x}  0  e^{2x}; 0  1  e^{2x}; -e^{-x}  1  e^{2x}] [c1; c2; c3]
  = [c1 e^{-x} + c3 e^{2x}; c2 + c3 e^{2x}; -c1 e^{-x} + c2 + c3 e^{2x}].
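The general solution just obtained can be cross-checked symbolically; the following is a rough sketch assuming SymPy, which verifies that Y = P M_Z C satisfies Y' = AY for the matrix of Example 1.3.

from sympy import Matrix, symbols, exp

x, c1, c2, c3 = symbols('x c1 c2 c3')

A = Matrix([[2, -3, 3], [2, -2, 2], [2, -1, 1]])
P = Matrix([[1, 0, 1], [0, 1, 1], [-1, 1, 1]])
MZ = Matrix([[exp(-x), 0, 0], [0, 1, 0], [0, 0, exp(2*x)]])
C = Matrix([c1, c2, c3])

Y = P * MZ * C                      # the general solution claimed in Example 1.3
assert (Y.diff(x) - A*Y).expand() == Matrix([0, 0, 0])
print(Y)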
We now see how to use JCF to solve systems Y' = AY where the coefficient matrix A is not diagonalizable.
The key to understanding these systems is to investigate a system Z' = JZ where J is a matrix consisting of a single Jordan block. Here the solution is not as easy as in Theorem 1.1, but it is still not too hard.
Theorem 1.4. Let J be a k-by-k Jordan block associated to the eigenvalue a,
J = [a 1 0 ... 0; 0 a 1 ... 0; ... ; 0 0 ... a 1; 0 0 ... 0 a].
Then the system Z' = JZ has solution
Z = e^{ax} [1 x x^2/2! ... x^{k-1}/(k-1)!; 0 1 x ... x^{k-2}/(k-2)!; 0 0 1 ... x^{k-3}/(k-3)!; ... ; 0 0 0 ... 1] C = M_Z C,
where C = [c1, c2, ..., ck]^T is a vector of arbitrary constants c1, c2, ..., ck.
Proof. We will prove this in the cases k = 1, 2, and 3, which illustrate the pattern. As you will see, the proof is a simple application of the standard technique for solving first-order linear differential equations.
The case k = 1: Here we are considering the system
[z1]' = [a][z1],
which is nothing other than the differential equation
z1' = a z1.
This differential equation has solution
z1 = c1 e^{ax},
which we can certainly write as
[z1] = e^{ax} [1][c1].
The case k = 2: Here we are considering the system
[z1; z2]' = [a 1; 0 a] [z1; z2],
which is nothing other than the pair of differential equations
z1' = a z1 + z2
z2' = a z2.
The second equation has solution z2 = c2 e^{ax}. Substituting this into the first equation, we obtain
z1' = a z1 + c2 e^{ax}, i.e., z1' - a z1 = c2 e^{ax},
which has integrating factor e^{-ax}. Multiplying by this factor, we find
e^{-ax}(z1' - a z1) = c2
(e^{-ax} z1)' = c2
e^{-ax} z1 = ∫ c2 dx = c1 + c2 x
so
z1 = e^{ax}(c1 + c2 x).
Thus, our solution is
z1 = e^{ax}(c1 + c2 x)
z2 = e^{ax} c2,
which we see we can rewrite as
[z1; z2] = e^{ax} [1 x; 0 1] [c1; c2].
The case k = 3: Here we are considering the system
[z1; z2; z3]' = [a 1 0; 0 a 1; 0 0 a] [z1; z2; z3],
which is nothing other than the triple of differential equations
z1' = a z1 + z2
z2' = a z2 + z3
z3' = a z3.
If we just concentrate on the last two equations, we see we are in the k = 2 case. Referring to that case, we see that our solution is
z2 = e^{ax}(c2 + c3 x)
z3 = e^{ax} c3.
Substituting the value of z2 into the equation for z1, we obtain
z1' = a z1 + e^{ax}(c2 + c3 x).
To solve this, we rewrite this as
z1' - a z1 = e^{ax}(c2 + c3 x)
and recognize that this differential equation has integrating factor e^{-ax}. Multiplying by this factor, we find
e^{-ax}(z1' - a z1) = c2 + c3 x
(e^{-ax} z1)' = c2 + c3 x
e^{-ax} z1 = ∫ (c2 + c3 x) dx = c1 + c2 x + c3 (x^2/2)
so
z1 = e^{ax}(c1 + c2 x + c3 (x^2/2)).
Thus, our solution is
z1 = e^{ax}(c1 + c2 x + c3 (x^2/2))
z2 = e^{ax}(c2 + c3 x)
z3 = e^{ax} c3,
which we see we can rewrite as
[z1; z2; z3] = e^{ax} [1 x x^2/2; 0 1 x; 0 0 1] [c1; c2; c3]. □
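The fundamental matrix of Theorem 1.4 can be checked symbolically: M_Z should satisfy M_Z' = J M_Z and M_Z(0) = I. A sketch assuming SymPy, for a 3-by-3 block:

from sympy import Matrix, symbols, exp

x, a = symbols('x a')

J = Matrix([[a, 1, 0], [0, a, 1], [0, 0, a]])
MZ = exp(a*x) * Matrix([[1, x, x**2/2],
                        [0, 1, x],
                        [0, 0, 1]])

# Each column of M_Z solves Z' = J Z, and M_Z(0) = I.
assert (MZ.diff(x) - J*MZ).expand() == Matrix.zeros(3, 3)
assert MZ.subs(x, 0) == Matrix.eye(3)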
Remark 1.5. Suppose that Z' = JZ where J is a matrix in JCF but one consisting of several blocks, not just one block. We can see that this system decomposes into several systems, one corresponding to each block, and that these systems are uncoupled, so we may solve them each separately, using Theorem 1.4, and then simply assemble these individual solutions together to obtain a solution of the general system.
We now illustrate this (confining our illustrations to the case that A is not diagonalizable, as we have already illustrated the diagonalizable case).
Example 1.6. Consider the system Y' = AY where A = [0 1; -4 4].
As in Example 2.12 of Chapter 1, A = P J P^{-1} with P = [-2 1; -4 0] and J = [2 1; 0 2]. Then Z' = JZ has solution
Z = e^{2x} [1 x; 0 1] [c1; c2] = [c1 e^{2x} + c2 x e^{2x}; c2 e^{2x}] = M_Z C,
and so Y = P Z = P M_Z C, i.e.,
Y = [-2 1; -4 0] e^{2x} [1 x; 0 1] [c1; c2]
  = [-2 1; -4 0] [e^{2x}  x e^{2x}; 0  e^{2x}] [c1; c2]
  = [-2e^{2x}  -2x e^{2x} + e^{2x}; -4e^{2x}  -4x e^{2x}] [c1; c2]
  = [(-2c1 + c2)e^{2x} - 2c2 x e^{2x}; -4c1 e^{2x} - 4c2 x e^{2x}].
Example 1.7.
2 1 1
A = 2 1 2 .
1 0 2
Y = AY
where
1 0 5
1 1 0
P = 2 0 6 and J = 0 1 0 .
1 1 1
0
0 3
Then
Z
= J Z has solution
ex
Z= 0
0
xex
ex
0
c1
0
0 c2 = MZ C
e3x
c3
and so Y = P Z = P MZ C, i.e.,
x
e
1 0 5
c1
xex 0
Y = 2 0 6 0
ex
0 c2
1 1 1
0
0
e3x
c3
x
c1
e
xex
5e3x
= 2ex
2xex
6e3x c2
x
x
x
e
xe + e
e3x
c3
Example 1.8.
2
1
1
A = 2 1 2 .
1
1
2
Y = AY
where
1 1 0
1 1 1
P = 2 0 0 and J = 0 1 0 .
0 0 1
1 0 1
Then Z = J Z has solution
ex
Z = 0
0
and so Y = P Z = P MZ C, i.e.,
xex
ex
0
c1
0
0 c2 = MZ C
ex
c3
x
e
1 1 1
Y = 2 0 0
0
1 0 1
0
xex
ex
0
c1
0
0
c2
x
e
c3
33
34
ex
c1
xex + ex ex
x
x
= 2e
2xe
0
c2
x
x
x
e
xe
e
c3
Example 1.9.
5 0 1
A = 1 1 0 .
7 1 0
Y = AY
where
2
3 1
P = 2
1 0
6 7 0
and
2 1 0
J = 0 2 1 .
0 0 2
e2x
Z= 0
0
xe2x
e2x
0
c1
(x 2 /2)e2x
xe2x c2 = MZ C
e2x
c3
and so Y = P Z = P MZ C, i.e.,
2x
e
c1
xe2x (x 2 /2)e2x
2
3 1
Y = 2
1 0 0
e2x
xe2x c2
6 7 0
0
0
e2x
c3
2x
c1
2e
2xe2x + 3e2x x 2 e2x + 3xe2x + e2x
2x
2x
2x
2
2x
2x
= 2e
2xe + e
x e + xe
c2
2x
2x
2x
2
2x
2x
6e
6xe 7e
3x e 7xe
c3
35
We conclude this section by showing how to solve initial value problems. This is just one more step, given what we have already done.
Example 1.10. Consider the initial value problem Y' = AY, Y(0) = [3; -8], where A = [0 1; -4 4] is the matrix of Example 1.6.
In Example 1.6, we saw that this system has the general solution
Y = [(-2c1 + c2)e^{2x} - 2c2 x e^{2x}; -4c1 e^{2x} - 4c2 x e^{2x}].
Applying the initial condition (i.e., substituting x = 0 in this matrix) gives
[3; -8] = Y(0) = [-2c1 + c2; -4c1]
with solution
[c1; c2] = [2; 7],
so the initial value problem has solution Y = [3e^{2x} - 14x e^{2x}; -8e^{2x} - 28x e^{2x}].
2 1 1
Y = AY where A = 2 1 2 ,
1 0 2
Example 1.11.
8
Y (0) = 32 .
5
and
In Example 1.8, we saw that this system has the general solution
x
c1 e
+ c2 xex 5c3 xe3x
Y = 2c1 ex 2c2 xex 6c3 e3x .
(c1 + c2 )ex c2 xex + c3 e3x
Applying the initial condition (i.e., substituting x = 0 in this matrix) gives
5c3
8
c1
32 = Y (0) = 2c1 6c3
c1 + c2 + c3
5
36
with solution
c1
7
c2 = 1 .
3
c3
Remark 1.12. There is a variant on our method of solving systems or initial value problems. Suppose first that we wish to solve the initial value problem Z' = JZ, Z(0) = Z0. We find that, in general,
Z(x) = M_Z(x) C
and, in particular,
Z0 = Z(0) = M_Z(0) C = I C = C,
so the solution to this initial value problem is
Z(x) = M_Z(x) Z0.
Now suppose we wish to solve the system Y' = AY. Then, if A = P J P^{-1}, we have seen that this system has solution Y = P Z = P M_Z C. Let us manipulate this a bit:
Y = P M_Z C = P M_Z I C = P M_Z (P^{-1} P) C = (P M_Z P^{-1})(P C).
Now let us set M_Y = P M_Z P^{-1}, and also let us set Γ = P C. Note that M_Y is still a matrix of functions, and that Γ is still a vector of arbitrary constants (since P is an invertible constant matrix and C is a vector of arbitrary constants). Thus, with this notation, we see that
Y' = AY has solution Y = M_Y Γ.
Now consider the initial value problem Y' = AY, Y(0) = Y0. Rewriting the above solution of Y' = AY to explicitly include the independent variable, we see that we have
Y(x) = M_Y(x) Γ
and, in particular,
Y0 = Y(0) = M_Y(0) Γ = P M_Z(0) P^{-1} Γ = P I P^{-1} Γ = Γ,
so we see that
Y' = AY, Y(0) = Y0 has solution Y(x) = M_Y(x) Y0.
This variant method has pros and cons. It is actually less effective than our original method for solving a single initial value problem (as it requires us to compute P^{-1} and do some extra matrix multiplication), but it has the advantage of expressing the solution directly in terms of the initial conditions. This makes it more effective if the same system Y' = AY is to be solved for a variety of initial conditions. Also, as we see from Remark 1.14 below, it is of considerable theoretical importance.
Let us now apply this variant method.
Example 1.13. Consider the system Y' = AY, Y(0) = [a1; a2], where A = [0 1; -4 4].
As we have seen in Example 1.6, A = P J P^{-1} with P = [-2 1; -4 0] and J = [2 1; 0 2]. Then M_Z(x) = e^{2x} [1 x; 0 1], and
M_Y(x) = P M_Z(x) P^{-1} = [-2 1; -4 0] [e^{2x}  x e^{2x}; 0  e^{2x}] [-2 1; -4 0]^{-1} = [e^{2x} - 2x e^{2x}   x e^{2x}; -4x e^{2x}   e^{2x} + 2x e^{2x}],
so
Y(x) = M_Y(x) [a1; a2] = [a1 e^{2x} + (-2a1 + a2) x e^{2x}; a2 e^{2x} + (-4a1 + 2a2) x e^{2x}].
In particular, if Y(0) = [3; -8], then Y(x) = [3e^{2x} - 14x e^{2x}; -8e^{2x} - 28x e^{2x}], recovering the result of Example 1.10. But also, if Y(0) = [2; 5], then Y(x) = [2e^{2x} + x e^{2x}; 5e^{2x} + 2x e^{2x}], and if Y(0) = [-4; 15], then Y(x) = [-4e^{2x} + 23x e^{2x}; 15e^{2x} + 46x e^{2x}], etc.
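The variant method of Remark 1.12 amounts to one extra conjugation; the sketch below (assuming SymPy) recomputes M_Y for Example 1.13 and evaluates the three initial conditions considered there.

from sympy import Matrix, symbols, exp, simplify

x = symbols('x')

P = Matrix([[-2, 1], [-4, 0]])
MZ = exp(2*x) * Matrix([[1, x], [0, 1]])
MY = simplify(P * MZ * P.inv())     # M_Y(x) = P M_Z(x) P^(-1)
print(MY)

for Y0 in (Matrix([3, -8]), Matrix([2, 5]), Matrix([-4, 15])):
    print(Y0.T, "->", simplify(MY * Y0).T)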
Remark 1.14. In Section 2.4 we will define the matrix exponential, and, with this definition, M_Z(x) = e^{Jx} and M_Y(x) = P M_Z(x) P^{-1} = e^{Ax}.
41
18 9
.
3. A =
and Y0 =
98
49 24
7
1
1
.
4. A =
and Y0 =
16
16 9
10
2
1
.
and Y0 =
5. A =
75
25 12
50
15 9
.
6. A =
and Y0 =
100
25 15
1 0
0
6
7. A = 1 2 3 and Y0 = 10.
1 1 0
10
3 0 2
0
8. A = 1 3 1 and Y0 = 3.
0 1 1
3
5
8
16
0
9. A = 4
1
8 and Y0 = 2 .
4 4 11
1
4 2 3
3
10. A = 1 1 3 and Y0 = 2.
2 4 9
1
5
2
1
3
11. A = 1 2 1 and Y0 = 2 .
1 2 3
9
5
8 3 3
12. A = 4
0 2 and Y0 = 8.
7
2 1
3
3 1 1
1
13. A = 7 5 1 and Y0 = 3 .
6 6 2
6
3
0
0
2
14. A = 9 5 18 and Y0 = 1.
4 4
12
1
39
40
6 9
0
1
15. A = 6 6 2 and Y0 = 3 .
9 9 3
6
18 42 168
7
16. A =
1
7 40 and Y0 = 2.
2
6
27
1
1 1 1
3
17. A = 10 6 5 and Y0 = 10.
6 3 2
18
0 4 1
2
18. A = 2 6 1 and Y0 = 5.
4 8 0
8
4 1 2
6
19. A = 5 1 3 and Y0 = 11.
7 2 3
9
9
4 2 5
20. A = 1 1 1 and Y0 = 5.
8
2 1 2
2.2
In this section, we show how to solve a homogeneous system Y' = AY where the characteristic polynomial of A has complex roots. In principle, this is the same as the situation where the characteristic polynomial of A has real roots, which we dealt with in Section 2.1, but in practice, there is an extra step in the solution.
We will begin by doing an example, which will show us where the difficulty lies, and then we will overcome that difficulty. But first, we need some background.
Definition 2.1. For a vector v (or a matrix A) with complex entries, we let v̄ (or Ā) denote the vector (or matrix) obtained by taking the complex conjugate of each of its entries.
Lemma 2.3. Let A be a matrix with real entries, and let v be an eigenvector of A with associated eigenvalue λ. Then v̄ is an eigenvector of A with associated eigenvalue λ̄.
Proof. We have that Av = λv, by hypothesis. Let us take the complex conjugate of each side of this equation. Then
(Av)‾ = (λv)‾,
Ā v̄ = λ̄ v̄,
A v̄ = λ̄ v̄ (as Ā = A since all the entries of A are real),
as claimed. □
Example 2.4. Consider the system Y' = AY where A = [2 -17; 1 4].
A has eigenvalues 3 + 4i and 3 - 4i, and A = P J P^{-1} with
P = [-1 + 4i  -1 - 4i; 1  1] and J = [3 + 4i  0; 0  3 - 4i].
We continue as before, but now we use F to denote a vector of arbitrary constants. (This is just for neatness. Our constants will change, as you will see, and we will use the vector C to denote our final constants, as usual.) Then Z' = JZ has solution
Z = [e^{(3+4i)x} 0; 0 e^{(3-4i)x}] [f1; f2] = [f1 e^{(3+4i)x}; f2 e^{(3-4i)x}] = M_Z F,
and so Y = P Z = P M_Z F, i.e.,
Y = [-1+4i  -1-4i; 1  1] [e^{(3+4i)x} 0; 0 e^{(3-4i)x}] [f1; f2]
  = f1 e^{(3+4i)x} [-1+4i; 1] + f2 e^{(3-4i)x} [-1-4i; 1].
Now we want our differential equation to have real solutions, and in order for this to be the case, it turns out that we must have f2 = f̄1. Thus, we may write our solution as
Y = f1 e^{(3+4i)x} [-1+4i; 1] + f̄1 e^{(3-4i)x} [-1-4i; 1],
the second summand being the complex conjugate of the first, where f1 is an arbitrary complex constant.
This solution is correct but unacceptable. We want to solve the system Y' = AY, where A has real coefficients, and we have a solution which is indeed a real vector, but this vector is expressed in terms of complex numbers and functions. We need to obtain a solution that is expressed totally in terms of real numbers and functions. In order to do this, we need an extra step.
In order not to interrupt the flow of exposition, we simply state here what we need to do, and we justify this after the conclusion of the example.
We therefore do the following: We simply replace the matrix P M_Z by the matrix whose first column is the real part Re(e^{λ1 x} v1) = Re( e^{(3+4i)x} [-1+4i; 1] ), and whose second column is the imaginary part Im(e^{λ1 x} v1) = Im( e^{(3+4i)x} [-1+4i; 1] ), and the vector F by the vector C of arbitrary real constants. We compute
e^{(3+4i)x} [-1+4i; 1] = e^{3x}(cos(4x) + i sin(4x)) [-1+4i; 1]
  = e^{3x} [-cos(4x) - 4 sin(4x); cos(4x)] + i e^{3x} [4 cos(4x) - sin(4x); sin(4x)]
and so we obtain
Y = [e^{3x}(-cos(4x) - 4 sin(4x))   e^{3x}(4 cos(4x) - sin(4x)); e^{3x} cos(4x)   e^{3x} sin(4x)] [c1; c2]
  = [(-c1 + 4c2)e^{3x} cos(4x) + (-4c1 - c2)e^{3x} sin(4x); c1 e^{3x} cos(4x) + c2 e^{3x} sin(4x)].
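The extra step, replacing the two complex columns by the real and imaginary parts of e^{λ1 x} v1, can be verified mechanically; here is a sketch (assuming SymPy) confirming that the two real columns obtained above really do solve Y' = AY.

from sympy import Matrix, symbols, exp, cos, sin

x, c1, c2 = symbols('x c1 c2', real=True)

A = Matrix([[2, -17], [1, 4]])

# Real and imaginary parts of e^{(3+4i)x} * (-1+4i, 1), taken from the worked example:
Re_col = exp(3*x) * Matrix([-cos(4*x) - 4*sin(4*x), cos(4*x)])
Im_col = exp(3*x) * Matrix([4*cos(4*x) - sin(4*x), sin(4*x)])

Y = c1*Re_col + c2*Im_col
assert (Y.diff(x) - A*Y).expand() == Matrix([0, 0])
print("real general solution verified")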
Lemma 2.5. Let A be a real matrix with complex eigenvalue λ1 and associated eigenvector v1 (so that, by Lemma 2.3, λ̄1 is also an eigenvalue, with associated eigenvector v̄1). Suppose the system Y' = AY has general solution
Y = P M_Z F = [ v1  v̄1 ] [e^{λ1 x} 0; 0 e^{λ̄1 x}] [f1; f̄1] = f1 e^{λ1 x} v1 + f̄1 e^{λ̄1 x} v̄1,
where f1 is an arbitrary complex constant. Then this system also has general solution of the form
Y = [ Re(e^{λ1 x} v1)  Im(e^{λ1 x} v1) ] [c1; c2],
where c1 and c2 are arbitrary real constants.
Proof. First note that for any complex number z = x + iy, x = Re(z) = (1/2)(z + z̄) and y = Im(z) = (1/(2i))(z - z̄), and similarly for any complex vector.
Write w = e^{λ1 x} v1, so that P M_Z = [ w  w̄ ], and let
R = (1/2) [1 -i; 1 i].
Then
P M_Z R = [ (1/2)(w + w̄)  (1/(2i))(w - w̄) ] = [ Re(e^{λ1 x} v1)  Im(e^{λ1 x} v1) ].
Since f1 is an arbitrary complex constant, we may (cleverly) choose to write it as f1 = (1/2)(c1 - i c2) for arbitrary real constants c1 and c2, and with this choice
R^{-1} F = [c1; c2],
so
Y = P M_Z F = (P M_Z R)(R^{-1} F) = [ Re(e^{λ1 x} v1)  Im(e^{λ1 x} v1) ] [c1; c2],
as claimed. □
We now solve Y' = AY where A is a real 3-by-3 matrix with a pair of complex eigenvalues and a third, real eigenvalue. As you will see, we use the idea of Lemma 2.5 to simply replace the relevant columns of P M_Z in order to obtain our final solution.
15 16 8
A = 10 10 5 .
0
1
2
Y = AY
where
2 + 2i
that the eigenspace E1+2i = Ker(A (1 + 2i)I ) has basis v1 = 1 + 2i , and hence, by
2 2i
Lemma 2.3, that the eigenspace E12i = Ker(A (1 2i)I ) has basis v2 = v1 = 1 2i .
We further compute that the eigenspace E5 = Ker(A 5I ) has basis v3 = 3 . Hence, just as
before,
A = P J P 1
2 + 2i
with P = 1 + 2i
1
e(1+2i)x
Z= 0
0
2 2i
1 2i
1
1 + 2i
4
3 and J = 0
0
1
0
1 2i
0
e(12i)x
0
f1 e(1+2i)x
f1
0
0 f1 = MZ F = f1 e(1+2i)x
e5x
c3
c3 e5x
2 2i
1 2i
1
(1+2i)x
e
4
3 0
1
0
0
0 .
5
and so Y = P Z = P MZ F , i.e.,
2 + 2i
Y = 1 + 2i
1
0
e(12i)x
0
f1
0
0 f1 .
e5x
c3
Now
2 + 2i
2 + 2i
e(1+2i)x 1 + 2i = ex (cos(2x) + i sin(2x)) 1 + 2i
1
1
x
e (2 cos(2x) 2 sin(2x))
e (2 cos(2x) 2 sin(2x))
= ex ( cos(2x) 2 sin(2x)) + i ex (2 cos(2x) sin(2x))
ex cos(2x)
ex sin(2x)
and of course
5x
4e
4
5x
e
3 = 3e5x ,
1
e5x
ex (2 cos(2x) 2 sin(2x)) ex (2 cos(2x) 2 sin(2x)) 4e5x
c1
Y = ex ( cos(2x) 2 sin(2x))
ex (2 cos(2x) sin(2x)) 3e5x c2
ex cos(2x)
ex sin(2x)
e5x
c3
x
x
5x
(2c1 + 2c2 )e cos(2x) + (2c1 2c2 )e sin(2x) + 4c3 e
= (c1 + 2c2 )ex cos(2x) + (2c1 c2 )ex sin(2x) + 3c3 e5x .
c1 ex cos(2x) + c2 ex sin(2x) + c3 e5x
45
46
3 4
2. A =
,
2 7
5 13
3. A =
,
1 9
det(I A) =
10 + 29,
7 17
4. A =
,
4 11
det(I A) =
det(I A) =
14 + 58,
37
10
20
5. A = 59 9 24,
33 12 21
18 + 145,
8
and Y0 =
.
13
3
and Y0 =
.
5
2
and Y0 =
.
1
5
and Y0 =
.
2
4 42 15
6. A = 4
25 10,
6
32 13
2.3
In this section, we show how to solve an inhomogeneous system Y' = AY + G(x) where G(x) is a vector of functions. (We will often abbreviate G(x) by G.) We use a method that is a direct generalization of the method we used for solving a homogeneous system in Section 2.1.
Consider the matrix system
Y' = AY + G.
Step 1. Write A = P J P^{-1}, where J is in JCF. Then we may rewrite the system successively as
Y' = (P J P^{-1}) Y + G
Y' = P J (P^{-1} Y) + G
P^{-1} Y' = J (P^{-1} Y) + P^{-1} G
(P^{-1} Y)' = J (P^{-1} Y) + P^{-1} G.
48
as claimed. □
Theorem 3.2. Let N_Y = P M_Z, where M_Z is the fundamental matrix of the system Z' = JZ and ∫_0 H(x) dx denotes an arbitrary (but fixed) antiderivative of H(x). Then the inhomogeneous system Y' = AY + G has a particular solution
Y_i = N_Y ∫_0 N_Y^{-1} G dx.
Proof. We verify this by differentiating Y_i. We have
Y_i' = ( N_Y ∫_0 N_Y^{-1} G dx )'
     = N_Y' ∫_0 N_Y^{-1} G dx + N_Y (N_Y^{-1} G)
     = A N_Y ∫_0 N_Y^{-1} G dx + G    (as N_Y' = A N_Y)
     = A Y_i + G,
as claimed. □
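Theorem 3.2 is also easy to automate; the sketch below (assuming SymPy) reproduces the particular solution of Example 3.3, which follows.

from sympy import Matrix, symbols, exp, integrate, simplify

x = symbols('x')

A = Matrix([[5, 7], [-2, -4]])
G = Matrix([30*exp(x), -60*exp(2*x)])
P = Matrix([[7, 1], [-2, -1]])
MZ = Matrix([[exp(3*x), 0], [0, exp(-2*x)]])
NY = P * MZ

integrand = NY.inv() * G
Yi = NY * integrand.applyfunc(lambda t: integrate(t, x))   # N_Y * antiderivative of N_Y^(-1) G

assert simplify(Yi.diff(x) - A*Yi - G) == Matrix([0, 0])
print(simplify(Yi))        # a particular solution; compare Example 3.3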
Example 3.3. Consider the system Y' = AY + G where A = [5 7; -2 -4] and G = [30e^x; -60e^{2x}].
We saw in Example 1.2 that
P = [7 1; -2 -1] and M_Z = [e^{3x} 0; 0 e^{-2x}],
and N_Y = P M_Z. Then
N_Y^{-1} G = [e^{-3x} 0; 0 e^{2x}] (1/5)[1 1; -2 -7] [30e^x; -60e^{2x}] = [6e^{-2x} - 12e^{-x}; -12e^{3x} + 84e^{4x}].
Then
∫_0 N_Y^{-1} G dx = [-3e^{-2x} + 12e^{-x}; -4e^{3x} + 21e^{4x}]
and
Y_i = N_Y ∫_0 N_Y^{-1} G dx = [7 1; -2 -1] [e^{3x} 0; 0 e^{-2x}] [-3e^{-2x} + 12e^{-x}; -4e^{3x} + 21e^{4x}] = [-25e^x + 105e^{2x}; 10e^x - 45e^{2x}].
Example 3.4.
Y = AY + G
where
3x
0 1
60e
A=
and G =
.
72e5x
4 4
2 1
P =
4 0
and
MZ = e
2x
1 x
,
0 1
and NY = P MZ . Then
0 1 60e3x
18e3x 60xex + 36xe3x
1
2x 1 x
NY G = e
(1/4)
=
.
0 1
4 2 72e5x
60ex 36e3x
Then
0
NY1 G
and
2 1 2x 1 x 60ex 60xex 10e3x + 12xe3x
e
Yi = NY NY1 G =
60ex 12e3x
0 1
4 0
0 3x
60e + 8e5x
.
=
240e3x + 40e5x
Example 3.5.
Y = AY + G
where
x
2 3 3
e
A = 2 2 2 and G = 12e3x .
2 1
1
20e4x
1 0 1
P = 0 1 1
1 1
1
and
ex
MZ = 0
0
0 0
1 0 ,
0 e2x
and NY = P MZ . Then
x
x
e
e 0
0
1
1
12e4x + 20e5x
0
NY1 G = 0 1
0 1 2 1 12e3x = ex 24e3x 20e4x .
1 1
1
0 0 e2x
20e4x
ex + 12ex + 20e2x
Then
and
3e4x + 4e5x
NY1 G = ex 8e3x 5e4x
0
ex + 12ex + 10e2x
x
e
1 0 1
Yi = NY NY1 G = 0 1 1 0
0
1 1
1
0
x
e 9e3x 6e4x
= 2ex 4e3x 5e4x .
2ex + 7e3x + 9e4x
Example 3.6.
3e4x + 4e5x
0 0
1 0 ex 8e3x 5e4x
0 e2x
ex + 12ex + 10e2x
Y = AY + G
We saw in Example 2.4 that
1 + 4i
P =
1
and NY = P MZ . Then
2 17
200
A=
and G =
.
1
4
160ex
where
1 4i
1
and
e(3+4i)x
MZ =
0
e(34i)x
0
1 1 + 4i
200
e(3+4i)x
(1/(8i))
0
e(34i)x
1 1 4i 160ex
25e(34i)x + 20(4 i)e(24i)x
.
=
25e(3+4i)x + 20(4 + i)e(2+4i)x
NY1 G =
51
52
Then
0
NY1 G =
(4 + 3i)e(34i)x + (4 + 18i)e(24i)x
(4 3i)e(3+4i)x + (4 18i)e(2+4i)x
and
Yi = NY NY1 G
0
0
(4 + 3i)e(34i)x + (4 + 18i)e(24i)x
1 + 4i 1 4i e(3+4i)x
=
0
e(34i)x (4 3i)e(3+4i)x + (4 18i)e(2+4i)x
1
1
1 + 4i 1 4i (4 + 3i) + (4 + 18i)ex
=
(4 3i) + (4 18i)ex
1
1
x
32 136e
.
=
8 8ex
(Note that in this last example we could do arithmetic with complex numbers directly, i.e.,
without having to convert complex exponentials into real terms.)
Once we have done this work, it is straightforward to solve initial value problems. We do a
single example that illustrates this.
Example 3.7. Consider the initial value problem
Y' = AY + G, Y(0) = [7; -17], where A = [5 7; -2 -4] and G = [30e^x; -60e^{2x}].
We saw in Example 1.2 that the associated homogeneous system has general solution
Y_H = [7c1 e^{3x} + c2 e^{-2x}; -2c1 e^{3x} - c2 e^{-2x}]
and in Example 3.3 that the original system has a particular solution
Y_i = [-25e^x + 105e^{2x}; 10e^x - 45e^{2x}].
Thus, our original system has general solution
Y = Y_H + Y_i = [7c1 e^{3x} + c2 e^{-2x} - 25e^x + 105e^{2x}; -2c1 e^{3x} - c2 e^{-2x} + 10e^x - 45e^{2x}].
We apply the initial condition to obtain the linear system
[7; -17] = Y(0) = [7c1 + c2 + 80; -2c1 - c2 - 35]
with solution c1 = -11, c2 = 4. Substituting, we find that our initial value problem has solution
Y = [-77e^{3x} + 4e^{-2x} - 25e^x + 105e^{2x}; 22e^{3x} - 4e^{-2x} + 10e^x - 45e^{2x}].
1
7. G(x) = 3e2x .
5e4x
8
8. G(x) = 3e3x .
3e5x
2.4
In this section, we will discuss the matrix exponential and its use in solving systems Y' = AY.
Our first task is to ask what it means to take a matrix exponential. To answer this, we are guided by ordinary exponentials. Recall that, for any complex number z, the exponential e^z is given by
e^z = 1 + z + z^2/2! + z^3/3! + z^4/4! + ... .
Definition 4.1. For a square matrix T, the matrix exponential e^T is
e^T = I + T + (1/2!)T^2 + (1/3!)T^3 + (1/4!)T^4 + ...
(For this definition to make sense we need to know that this series always converges, and it does.)
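Numerically, the defining series converges quickly, and standard libraries compute the matrix exponential directly; the sketch below (assuming NumPy and SciPy are available) compares a truncated series with scipy.linalg.expm.

import numpy as np
from scipy.linalg import expm

def exp_series(T, terms=60):
    """Approximate e^T by the partial sum I + T + T^2/2! + ... (Definition 4.1)."""
    result = np.eye(T.shape[0])
    power = np.eye(T.shape[0])
    for k in range(1, terms):
        power = power @ T / k          # builds T^k / k! incrementally
        result = result + power
    return result

A = np.array([[5.0, 7.0], [-2.0, -4.0]])
print(np.allclose(exp_series(A), expm(A)))   # True: the series agrees with SciPy's expm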
Recall that the differential equation y' = ay has the solution y = c e^{ax}. The situation for Y' = AY is very analogous. (Note that we use Γ rather than C to denote a vector of constants for reasons that will become clear a little later. Note that Γ is on the right in Theorem 4.2 below, a consequence of the fact that matrix multiplication is not commutative.)
Theorem 4.2. The system Y' = AY has solution Y = e^{Ax} Γ, where Γ is a vector of arbitrary constants. The initial value problem Y' = AY, Y(0) = Y0 has solution Y = e^{Ax} Y0.
Proof. (Outline) (1) We first compute e^{Ax} Γ. In order to do so, note that (Ax)^2 = (Ax)(Ax) = (AA)(xx) = A^2 x^2 as matrix multiplication commutes with scalar multiplication, and (Ax)^3 = (Ax)^2 (Ax) = (A^2 x^2)(Ax) = (A^2 A)(x^2 x) = A^3 x^3, and similarly, (Ax)^k = A^k x^k for any k. Then, substituting in Definition 4.1, we have that
Y = e^{Ax} Γ = (I + Ax + (1/2!)A^2 x^2 + (1/3!)A^3 x^3 + (1/4!)A^4 x^4 + ...) Γ.
To find Y', we may differentiate this series term-by-term. (This claim requires proof, but we shall not give it here.) Remembering that A and Γ are constant matrices, we see that
Y' = (A + (1/2!)A^2 (2x) + (1/3!)A^3 (3x^2) + (1/4!)A^4 (4x^3) + ...) Γ
   = (A + A^2 x + (1/2!)A^3 x^2 + (1/3!)A^4 x^3 + ...) Γ
   = A (I + Ax + (1/2!)A^2 x^2 + (1/3!)A^3 x^3 + ...) Γ
   = A (e^{Ax} Γ) = AY,
as claimed.
(2) By (1) we know that Y' = AY has solution Y = e^{Ax} Γ. We use the initial condition to solve for Γ. Setting x = 0, we have:
Y0 = Y(0) = e^{A·0} Γ = e^0 Γ = I Γ = Γ
(where e^0 means the exponential of the zero matrix, and the value of this is the identity matrix I, as is apparent from Definition 4.1), so Γ = Y0 and Y = e^{Ax} Γ = e^{Ax} Y0. □
In the remainder of this section we shall see how to translate the theoretical solution of Y' = AY given by Theorem 4.2 into a practical one. To keep our notation simple, we will stick to 2-by-2 or 3-by-3 cases, but the principle is the same regardless of the size of the matrix.
One case is relatively easy.
Lemma 4.3. If J is a diagonal matrix,
J = diag(d1, d2, ..., dn),
then
e^{Jx} = diag(e^{d1 x}, e^{d2 x}, ..., e^{dn x}).
Proof. We check this in the 2-by-2 case; the general case is identical. If J = diag(d1, d2), then
J^2 = diag(d1^2, d2^2), J^3 = diag(d1^3, d2^3), and similarly, J^k = diag(d1^k, d2^k),
so
e^{Jx} = I + Jx + (1/2!)J^2 x^2 + (1/3!)J^3 x^3 + (1/4!)J^4 x^4 + ...
       = [1 + d1 x + (1/2!)(d1 x)^2 + (1/3!)(d1 x)^3 + ...   0; 0   1 + d2 x + (1/2!)(d2 x)^2 + (1/3!)(d2 x)^3 + ...],
which we recognize as
       = [e^{d1 x} 0; 0 e^{d2 x}]. □
Example 4.4. Suppose we want to find the general solution of Y' = JY where J = [3 0; 0 -2]. To do so we directly apply Theorem 4.2 and Lemma 4.3. The solution is given by
[y1; y2] = Y = e^{Jx} Γ = [e^{3x} 0; 0 e^{-2x}] [γ1; γ2] = [γ1 e^{3x}; γ2 e^{-2x}].
Now suppose we want to find the general solution of Y' = AY where A = [5 7; -2 -4]. We may still apply Theorem 4.2 to conclude that the solution is Y = e^{Ax} Γ. We again try to calculate e^{Ax}. Now we find
A = [5 7; -2 -4], A^2 = [11 7; -2 2], A^3 = [41 49; -14 -22], ...,
so
e^{Ax} = [1 0; 0 1] + [5 7; -2 -4]x + (1/2!)[11 7; -2 2]x^2 + (1/3!)[41 49; -14 -22]x^3 + ...,
which looks like a hopeless mess. But, in fact, the situation is not so hard!
Lemma 4.5. Let S and T be square matrices, and suppose S = P T P^{-1}. Then e^S = P e^T P^{-1}.
Proof. Note that S^k = (P T P^{-1})^k = P T^k P^{-1} for every k, since the intermediate factors P^{-1} P cancel. Thus
e^S = I + S + (1/2!)S^2 + ... = P I P^{-1} + P T P^{-1} + (1/2!)P T^2 P^{-1} + ... = P (I + T + (1/2!)T^2 + ...) P^{-1} = P e^T P^{-1},
as claimed. □
Example 4.6. We return to A = [5 7; -2 -4]. We saw in Example 1.16 of Chapter 1 that A = P J P^{-1} with
P = [7 1; -2 -1] and J = [3 0; 0 -2].
Then
e^{Ax} = P e^{Jx} P^{-1} = [7 1; -2 -1] [e^{3x} 0; 0 e^{-2x}] [7 1; -2 -1]^{-1}
       = [(7/5)e^{3x} - (2/5)e^{-2x}   (7/5)e^{3x} - (7/5)e^{-2x}; -(2/5)e^{3x} + (2/5)e^{-2x}   -(2/5)e^{3x} + (7/5)e^{-2x}]
and
Y = e^{Ax} Γ = e^{Ax} [γ1; γ2] = [((7/5)γ1 + (7/5)γ2)e^{3x} + (-(2/5)γ1 - (7/5)γ2)e^{-2x}; (-(2/5)γ1 - (2/5)γ2)e^{3x} + ((2/5)γ1 + (7/5)γ2)e^{-2x}].
Example 4.7. Consider the system Y' = AY where
A = [2 -3 3; 2 -2 2; 2 -1 1].
We saw in Example 2.23 of Chapter 1 that A = P J P^{-1} with
P = [1 0 1; 0 1 1; -1 1 1] and J = [-1 0 0; 0 0 0; 0 0 2].
Then
e^{Ax} = P e^{Jx} P^{-1} = [1 0 1; 0 1 1; -1 1 1] [e^{-x} 0 0; 0 1 0; 0 0 e^{2x}] [1 0 1; 0 1 1; -1 1 1]^{-1}
       = [e^{2x}   e^{-x} - e^{2x}   -e^{-x} + e^{2x}; -1 + e^{2x}   2 - e^{2x}   -1 + e^{2x}; -1 + e^{2x}   -e^{-x} + 2 - e^{2x}   e^{-x} - 1 + e^{2x}]
and
Y = e^{Ax} Γ = e^{Ax} [γ1; γ2; γ3]
  = [(γ2 - γ3)e^{-x} + (γ1 - γ2 + γ3)e^{2x}; (-γ1 + 2γ2 - γ3) + (γ1 - γ2 + γ3)e^{2x}; (-γ2 + γ3)e^{-x} + (-γ1 + 2γ2 - γ3) + (γ1 - γ2 + γ3)e^{2x}].
Now suppose we want to solve the initial value problem Y' = AY, Y(0) = [1; 0; 0]. Then
Y = e^{Ax} Y(0) = e^{Ax} [1; 0; 0] = [e^{2x}; -1 + e^{2x}; -1 + e^{2x}].
Remark 4.8. Let us compare the results of our method here with that of our previous method. In the case of Example 4.6, our previous method gives the solution
Y = P [e^{3x} 0; 0 e^{-2x}] C = P e^{Jx} C, where J = [3 0; 0 -2],
and in the case of Example 4.7 it gives the solution
Y = P [e^{-x} 0 0; 0 1 0; 0 0 e^{2x}] C = P e^{Jx} C, where J = [-1 0 0; 0 0 0; 0 0 2],
while the method of this section gives Y = P e^{Jx} P^{-1} Γ = e^{Ax} Γ.
While these two methods are in principle the same, we may ask which is preferable in practice. In this regard we see that our earlier method is better, as the use of the matrix exponential requires us to find P^{-1}, which may be a considerable amount of work. However, this advantage is (partially) negated if we wish to solve initial value problems, as the matrix exponential method immediately gives the unknown constants Γ, as Γ = Y(0), while in the former method we must solve a linear system to obtain the unknown constants C.
Now let us consider the nondiagonalizable case. Suppose Z' = JZ where J is a matrix consisting of a single Jordan block. Then by Theorem 4.2 this has the solution Z = e^{Jx} Γ. On the other hand, in Theorem 1.4 we already saw that this system has solution Z = M_Z C. In this case, we simply have C = Γ, so we must have e^{Jx} = M_Z. Let us see that this is true by computing e^{Jx} directly.
Theorem 4.9. Let J be a k-by-k Jordan block associated to the eigenvalue a,
J = [a 1 0 ... 0; 0 a 1 ... 0; ... ; 0 0 ... a 1; 0 0 ... 0 a].
Then
e^{Jx} = e^{ax} [1 x x^2/2! ... x^{k-1}/(k-1)!; 0 1 x ... x^{k-2}/(k-2)!; 0 0 1 ... x^{k-3}/(k-3)!; ... ; 0 0 0 ... 1].
Proof. We do the computation in the 2-by-2 and 3-by-3 cases, which illustrate the pattern.
First suppose J is the 2-by-2 Jordan block J = [a 1; 0 a]. Then
J^2 = [a^2 2a; 0 a^2], J^3 = [a^3 3a^2; 0 a^3], J^4 = [a^4 4a^3; 0 a^4], ...,
so
e^{Jx} = [1 0; 0 1] + [a 1; 0 a]x + (1/2!)[a^2 2a; 0 a^2]x^2 + (1/3!)[a^3 3a^2; 0 a^3]x^3 + (1/4!)[a^4 4a^3; 0 a^4]x^4 + ... = [m11 m12; 0 m22],
where
m11 = m22 = 1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + (1/4!)(ax)^4 + (1/5!)(ax)^5 + ... = e^{ax}
and
m12 = x + a x^2 + (1/2!)a^2 x^3 + (1/3!)a^3 x^4 + (1/4!)a^4 x^5 + ... = x(1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + ...) = x e^{ax},
and so we conclude that
e^{Jx} = [e^{ax} x e^{ax}; 0 e^{ax}] = e^{ax} [1 x; 0 1].
Now suppose J is the 3-by-3 Jordan block
J = [a 1 0; 0 a 1; 0 0 a].
Then
J^2 = [a^2 2a 1; 0 a^2 2a; 0 0 a^2], J^3 = [a^3 3a^2 3a; 0 a^3 3a^2; 0 0 a^3], J^4 = [a^4 4a^3 6a^2; 0 a^4 4a^3; 0 0 a^4], J^5 = [a^5 5a^4 10a^3; 0 a^5 5a^4; 0 0 a^5], ...,
so
e^{Jx} = I + Jx + (1/2!)J^2 x^2 + (1/3!)J^3 x^3 + (1/4!)J^4 x^4 + (1/5!)J^5 x^5 + ... = [m11 m12 m13; 0 m22 m23; 0 0 m33],
where, just as before,
m11 = m22 = m33 = 1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + (1/4!)(ax)^4 + (1/5!)(ax)^5 + ... = e^{ax}
and
m12 = m23 = x + a x^2 + (1/2!)a^2 x^3 + (1/3!)a^3 x^4 + (1/4!)a^4 x^5 + ... = x e^{ax},
while
m13 = (1/2!)x^2 + (1/2!)a x^3 + (1/2!)(1/2!)a^2 x^4 + (1/2!)(1/3!)a^3 x^5 + ... = (1/2!)x^2 (1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + ...) = (1/2!)x^2 e^{ax},
so
e^{Jx} = [e^{ax}  x e^{ax}  (1/2!)x^2 e^{ax}; 0  e^{ax}  x e^{ax}; 0  0  e^{ax}] = e^{ax} [1 x x^2/2!; 0 1 x; 0 0 1],
as claimed. □
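As a numerical sanity check of Theorem 4.9 (a sketch assuming NumPy and SciPy), one can compare the closed form with a library matrix exponential for a particular a and x:

import numpy as np
from scipy.linalg import expm

a, x = 2.0, 0.7
J = np.array([[a, 1.0, 0.0],
              [0.0, a, 1.0],
              [0.0, 0.0, a]])

closed_form = np.exp(a*x) * np.array([[1.0, x, x**2/2],
                                      [0.0, 1.0, x],
                                      [0.0, 0.0, 1.0]])
print(np.allclose(expm(J*x), closed_form))   # True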
Example 4.10. Consider the system Y' = AY where A = [0 1; -4 4], and also the initial value problem Y' = AY, Y(0) = [3; -8].
We saw in Example 2.12 of Chapter 1 that A = P J P^{-1} with
P = [-2 1; -4 0] and J = [2 1; 0 2].
Then
e^{Ax} = P e^{Jx} P^{-1} = [-2 1; -4 0] [e^{2x}  x e^{2x}; 0  e^{2x}] [-2 1; -4 0]^{-1} = [(1 - 2x)e^{2x}   x e^{2x}; -4x e^{2x}   (1 + 2x)e^{2x}],
and so
Y = e^{Ax} Γ = e^{Ax} [γ1; γ2] = [γ1 e^{2x} + (-2γ1 + γ2)x e^{2x}; γ2 e^{2x} + (-4γ1 + 2γ2)x e^{2x}].
The initial value problem has solution
Y = e^{Ax} Y0 = e^{Ax} [3; -8] = [3e^{2x} - 14x e^{2x}; -8e^{2x} - 28x e^{2x}].
Example 4.11.
2 1 1
Y = AY where A = 2 1 2 .
1 0 2
8
Also, consider the initial value problem Y = AY , Y (0) = 32.
5
1 1
1 0 5
P = 2 0 6 and J = 0 1
0
0
1 1 1
0
0 .
3
Then
eAx = P eJ x P 1
x
e
1 0 5
= 2 0 6
0
1 1 1
0
xex
ex
0
1 0 5
0
0 2 0 6
1 1 1
e3x
1
x
2 xe
x +
+
8e
3
x
x
=
4 e xe +
1 x
8e
21 xex
5 3x
8e
3 3x
4e
1 3x
8e
5 x
5 3x
16
e 41 xex + 16
e
1
3 3x
5 x
x
+ 2 xe + 8 e
8e
1 3x
1 x
+ 41 xex 16
e
16 e
xex
65
2xex
ex xex
and so
Y = eAx
1
Ax
=e
2
3
5
5
2 )ex + ( 21 1 41 2 + 3 )xex + ( 58 1 + 16
2 )e3x
( 38 1 16
=
( 43 1 + 58 2 )ex + (1 + 21 2 23 )xex + ( 43 1 + 83 2 )e3x
( 18 1 +
+ 3 )ex + ( 21 1 + 41 2 3 )xex + ( 18 1
1
16 2
1
3x
16 2 )e
1 + 4i
1
1 4i
1
and J =
3 + 4i
0
0
3 4i
.
66
Then
eAx = P eJ x P 1
1 + 4i 1 4i e(3+4i)x
=
0
1
1
(3+4i)x
1 + 4i 1 4i e
=
0
1
1
m11 m12
=
m21 m22
1
1 + 4i 1 4i
e(34i)x
1
1
0
1
1 + 4i
(1/(8i))
e(34i)x
1 1 + 4i
0
where
m11 = (1/(8i))((1 + 4i)e(3+4i)x + (1 4i)(e(34i)x ))
= (1/(8i))((1 + 4i)e3x (cos(4x) + i sin(4x)) (1 4i)e3x (cos(4x) i sin(4x))
= (1/(8i))(ie3x (4 cos(4x) sin(4x))(2)) = e3x (cos(4x) (1/4) sin(4x)),
m12 = (1/(8i))((1 + 4i)(1 + 4i)e(3+4i)x + (1 4i)(1 + 4i)e(34i)x )
= (1/(8i))(ie3x (17 sin(4x))(2)) = e3x ((17/4) sin(4x)),
m21 = (1/(8i))(e(3+4i)x e(34i)x )
= (1/(8i))(ie3x (sin(4x))(2)) = e3x ((1/4) sin(4x)),
m22 = (1/(8i))((1 + 4i)e(3+4i)x + (1 + 4i)e(34i)x )
= (1/(8i))((1 + 4i)e3x (cos(4x) + i sin(4x)) + (1 + 4i)e3x (cos(4x) i sin(4x))
= (1/(8i))(ie3x (4 cos(4x) + sin(4x))(2)) = e3x (cos(4x) + (1/4) sin(4x)) .
Thus,
eAx = e3x
and
1
4
Y =e
Ax
=
17
4
cos(4x) 41 sin(4x)
sin(4x)
sin(4x)
cos(4x) + 41 sin(4x)
17
3x
4 2 )e sin(4x)
2 e3x cos(4x) + ( 41 1 + 41 2 )e3x sin(4x)
1 e3x cos(4x) + ( 1
4 1 +
Remark 4.13. Our procedure in this section is essentially that of Remark 1.12. (Compare Example 4.10 with Example 1.13.)
67
Now let us see how to use the matrix exponential to solve an inhomogeneous system Y' = AY + G(x). Since we already know how to solve homogeneous systems, we need only, by Lemma 3.1, find a (single) particular solution Y_i of this inhomogeneous system, and that is what we do. We shall again use our notation from Section 2.3, that ∫_0 H(x) dx denotes an arbitrary (but fixed) antiderivative of H(x).
The inhomogeneous system Y' = AY + G(x) has the particular solution
Y_i = e^{Ax} ∫_0 e^{-Ax} G(x) dx.
Remark 4.15. Let us compare this with the solution we found in Theorem 3.2. By Remark 4.14, we can rewrite this solution as Y_i = M_Y ∫_0 M_Y^{-1} G(x) dx. This is almost, but not quite, what we had in Theorem 3.2. There we had the solution Y_i = N_Y ∫_0 N_Y^{-1} G(x) dx, where N_Y = P M_Z. But these solutions are the same, as M_Y = P M_Z P^{-1} = N_Y P^{-1}. Then M_Y^{-1} = P M_Z^{-1} P^{-1} and N_Y^{-1} = M_Z^{-1} P^{-1}, so M_Y^{-1} = P N_Y^{-1}. Substituting, we find
Y_i = M_Y ∫_0 M_Y^{-1} G(x) dx = N_Y P^{-1} ∫_0 P N_Y^{-1} G(x) dx,
and, since P is a constant matrix, we may bring it outside the integral to obtain
Y_i = N_Y P^{-1} P ∫_0 N_Y^{-1} G(x) dx = N_Y ∫_0 N_Y^{-1} G(x) dx,
as claimed.
68
1 x x 2 /2!
1 x (x)2 /2!
1 x x 2 /2!
eax 0 1
x = eax 0 1
x = eax 0 1
x , etc.
0 0
1
0 0
1
0 0
1
Remark 4.16.
APPENDIX
Background Results
A.1
In this section of the Appendix, we review the basic facts on bases for vector spaces and on coordinates
for vectors and matrices for linear transformations.Then we use these to (re)prove some of the results
in Chapter 1.
First we see how to represent vectors, once we have chosen a basis.
Theorem 1.1. Let V be a vector space and let B = {v1, ..., vn} be a basis of V. Then any vector v in V can be written as v = c1 v1 + ... + cn vn in a unique way.
Definition 1.2. In the situation of Theorem 1.1, the vector
[v]_B = [c1; c2; ...; cn]
is the coordinate vector of v in the basis B.
Remark 1.3. Let V = C^n and let E = {e1, ..., en} be the standard basis of C^n, where
ei = [0, ..., 0, 1, 0, ..., 0]^T
with 1 in the i-th position and 0 elsewhere. Then, if
v = [c1; c2; ...; c_{n-1}; cn],
we see that
v = c1 e1 + ... + cn en,
so
[v]_E = [c1; c2; ...; c_{n-1}; cn].
(In other words, a vector in C^n looks like itself in the standard basis.)
Next we see how to represent linear transformations, once we have chosen a basis.
Theorem 1.4. Let V be a vector space and let B = {v1, ..., vn} be a basis of V. Let T: V → V be a linear transformation. Then there is a unique matrix [T]_B such that, for any vector v in V,
[T(v)]_B = [T]_B [v]_B.
Furthermore, the matrix [T]_B is given by
[T]_B = [ [T(v1)]_B [T(v2)]_B ... [T(vn)]_B ].
Remark 1.6. Let A be an n-by-n square matrix and write A = [ a1 a2 ... an ]. If T_A is the linear transformation of C^n given by T_A(v) = Av, and we consider the standard basis E = {e1, ..., en}, then T_A(ei) = A ei = ai, so [T_A]_E = [ a1 a2 ... an ] = A.
Theorem 1.7. Let B = {v1, ..., vn} and C = {w1, ..., wn} be two bases of a vector space V. Then there is a unique matrix P_{C←B} such that, for any vector v in V,
[v]_C = P_{C←B} [v]_B.
Furthermore,
(P_{C←B})^{-1} = P_{B←C} = [ [w1]_B [w2]_B ... [wn]_B ].
Definition 1.8. The matrix P_{C←B} is the change-of-basis matrix from the basis B to the basis C.
Corollary 1.9. Let V = C^n, let E be the standard basis of V, and let B = {v1, ..., vn} be any basis of V. Let A be any n-by-n square matrix. Then
A = P B P^{-1},
where
P = [ v1 v2 ... vn ] and B = [T_A]_B.
Proof. Here P = P_{E←B} = [ [v1]_E [v2]_E ... [vn]_E ] = [ v1 v2 ... vn ], and the result follows from Theorem 1.7 and Remark 1.6. □
With this in hand we now present new proofs of Theorems 1.14 and 2.11 in Chapter 1, and a proof of Lemma 1.7 in Chapter 1. For convenience, we restate these results.
Theorem 1.10. (Theorem 1.14 of Chapter 1.) Let A be an n-by-n matrix. Then A is diagonalizable if and only if, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = P J P^{-1} where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity, and P is a matrix whose columns are eigenvectors forming bases for the associated eigenspaces.
Proof. First suppose that for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In the notation of the proof of Theorem 1.14 in Chapter 1, B = {v1, ..., vn} is a basis of C^n. Then, by Corollary 1.9, A = P [T_A]_B P^{-1}. But B is a basis of eigenvectors, so for each i, T_A(vi) = A vi = ai vi = 0v1 + ... + 0v_{i-1} + ai vi + 0v_{i+1} + ... + 0vn. Then
[T_A(vi)]_B = [0, ..., 0, ai, 0, ..., 0]^T
with ai in the i-th position and 0 elsewhere, so [T_A]_B = J is the diagonal matrix with diagonal entries a1, ..., an, and A = P J P^{-1}. □
Theorem 1.11. (Theorem 2.11 of Chapter 1.) Let A be a k-by-k matrix and suppose that C^k has a basis B = {v1, ..., vk} consisting of a single chain of generalized eigenvectors of length k associated to an eigenvalue a. Then
A = P J P^{-1},
where J is the k-by-k Jordan block associated to the eigenvalue a and P = [ v1 v2 ... vk ].
Proof. By Corollary 1.9, A = P [T_A]_B P^{-1}. Since {v1, ..., vk} is a chain, T_A(vi) = A vi = v_{i-1} + a vi for i > 1, and T_A(v1) = A v1 = a v1, so for i > 1
[A vi]_B = [0, ..., 0, 1, a, 0, ..., 0]^T
with 1 in the (i-1)-st position, a in the i-th position, and 0 elsewhere, and [A v1]_B is similar, except that a is in the 1st position (there is no entry of 1), and every other entry is 0. Assembling these vectors, we see that the matrix [T_A]_B = J has the form of a single k-by-k Jordan block with diagonal entries equal to a. □
Lemma 1.12. (Lemma 1.7 of Chapter 1.) Let a be an eigenvalue of the matrix A. Then 1 ≤ geom-mult(a) ≤ alg-mult(a).
Proof. By the definition of an eigenvalue, there is at least one eigenvector v with eigenvalue a, and so E_a contains the nonzero vector v, and hence dim(E_a) ≥ 1.
Now suppose that a has geometric multiplicity k, and let {v1, ..., vk} be a basis for the eigenspace E_a. Extend this basis to a basis B = {v1, ..., vk, v_{k+1}, ..., vn} of C^n. Let B = [T_A]_B. Then
B = [ b1 b2 ... bn ]
with bi = [T_A(vi)]_B. But for i between 1 and k, T_A(vi) = A vi = a vi, so
bi = [a vi]_B = [0, ..., 0, a, 0, ..., 0]^T
with a in the i-th position and 0 elsewhere. (For i > k, we do not know what bi is.)
Now we compute the characteristic polynomial of B, det(λI - B). From our computation of B, we see that, for i between 1 and k, the i-th column of (λI - B) is
[0, ..., 0, λ - a, 0, ..., 0]^T
with λ - a in the i-th position and 0 elsewhere. (For i > k, we do not know what the i-th column of this matrix is.)
To compute the characteristic polynomial, i.e., the determinant of this matrix, we successively expand by minors of the 1st, 2nd, ..., k-th columns. Each of these gives a factor of (λ - a), so we see that det(λI - B) = (λ - a)^k q(λ) for some (unknown) polynomial q(λ).
We have computed the characteristic polynomial of B, but what we need to know is the characteristic polynomial of A. But these are equal, as we see from the following computation (which uses the fact that scalar multiplication commutes with matrix multiplication, and properties of determinants):
det(λI - A) = det(λI - P B P^{-1}) = det((P (λI) P^{-1}) - P B P^{-1})
            = det(P (λI - B) P^{-1})
            = det(P) det(λI - B) det(P^{-1}) = det(P) det(λI - B) (1/det(P))
            = det(λI - B).
Thus, det(λI - A), the characteristic polynomial of A, is divisible by (λ - a)^k (and perhaps by a higher power of (λ - a), and perhaps not, as we do not know anything about the polynomial q(λ)), so alg-mult(a) ≥ k = geom-mult(a), as claimed. □
A.2
In this section of the Appendix, we prove properties of the complex exponential. For convenience, we restate the basic definition and the properties we are trying to prove.
Definition 2.1. For any complex number z, the exponential e^z is given by
e^z = 1 + z + z^2/2! + z^3/3! + z^4/4! + ... .
First we note that this definition indeed makes sense, as this power series converges for every complex number z. Now for the properties we wish to prove. Note that properties (2) and (3) are direct generalizations of the situation for the real exponential function.
Theorem 2.2.
APPENDIX
Answers to Odd-Numbered Exercises
Chapter 1
1. A =
1
7 4
3 0 7 4
.
9 5 0 5
9 5
21 1
3. A =
49 0
5 1
5. A =
25 0
21 1
49 0
3 1
0 3
7 1
0 7
5 1
25 0
1
.
1
.
0 2 0
1 0 0
0 2 0
7. A = 1 1 3 0 1 0 1 1 3 .
1 1 1
0 0 3
1 1 1
1 2 2
3 0 0
1 2 2
9. A = 1
0 1 .
0 1 0 3 0 1
0
1
1
0
0 1
0
1
1
4 0 0
1 2 1
1 2 1
11. A = 0
1
1 .
1
1 0 4 0 0
0 0 2
1
0
1
1
0
1
1 0 0
2 1 0
1 0 0
13. A = 1 0 1 0 2 0 1 0 1 .
0 1 1
0
0 4
0 1 1
3 1 2
0 1 0
3 1 2
15. A = 2 1 2 0 0 0 2 1 2 .
3
1
3
0 0 3
3
1
3
80
1
2 1 1
1 1 0
2 1 1
17. A = 10 0 2 0 1 0 10 0 2 .
6 0 0
0 0 1
6 0 0
1
1 2 0
0 1 0
1 2 0
19. A = 2 3 0 0 0 1 2 3 0 .
1 3 1
0 0 0
1 3 1
Section 2.1
7c1 e3x + 4c2 e5x
.
1a. Y =
9c1 e3x 5c2 e5x
b. Y =
7e3x + 8e5x
.
9e3x 10e5x
(21c1 + c2 )e3x 21c2 xe3x
3a. Y =
.
49c1 e3x
49c2 xe3x
(5c1 + c2 )e7x 5c2 xe7x
5a. Y =
.
25c1 e7x
25c2 xe7x
2c2 ex
7a. Y = c1 ex + c2 ex 3c3 e3x .
c1 ex + c2 ex + c3 e3x
3x
41e + 21xe3x
b. Y =
.
98e3x + 49xe3x
10e7x 25xe7x
b. Y =
.
75e7x 125xe7x
6ex
b. Y = 2ex + 3ex 15e3x .
2ex + 3ex + 5e3x
b. Y = 2e3x .
e3x
2e4x 5e2x
b. Y = 3e4x + 5e2x .
4e4x + 5e2x
c1 e2x c2 xe2x
13a. Y = c1 e2x c2 xe2x + c3 e4x .
c2 e2x
+ c3 e4x
e2x 2xe2x
b. Y = e2x 2xe2x + 4e4x .
2e2x
+ 4e4x
9 9x + 10e3x
b. Y = 7 6x + 10e3x .
9 + 9x 15e3x
xex
(2c1 + c2 + c3
2c2
x
14xex
b. Y = 10ex 70xex .
18ex 42xex
(c1 + 2c2 )
+ (c2 + 2c3 )x + (c3 /2)x 2
19a. Y = (2c1 + 3c2 )
+ (2c2 + 3c3 )x + c3 x 2 .
(c1 + 3c2 + c3 ) + (c2 + 3c3 )x + (c3 /2)x 2
3ex
6 + 5x + x 2
b. Y = 11 + 8x + 2x 2 .
9 + 7x + x 2
Section 2.2
1a.
e4x (cos(3x) + 3 sin(3x)) e4x (3 cos(3x) + sin(3x)) c1
2e4x cos(3x)
2e4x sin(3x)
c2
4x
4x
(c1 3c2 )e cos(3x) + (3c1 + c2 )e sin(3x)
.
=
2c1 e4x cos(3x) + 2c2 e4x sin(3x)
Y =
b. Y =
3a.
8e4x cos(3x) + 19e4x sin(3x)
.
13e4x cos(3x) sin(3x)
e7x (2 cos(3x) + 3 sin(3x)) e7x (3 cos(3x) + 2 sin(3x)) c1
Y =
e7x cos(3x)
e7x sin(3x)
c2
7x
7x
(2c1 3c2 )e cos(3x) + (3c1 + 2c2 )e sin(3x)
.
=
c1 e7x cos(3x) + c2 e7x sin(3x)
2e7x cos(3x) + 3e7x sin(3x)
.
b. Y =
e7x cos(3x)
5.
2x
c1
e2x (cos(5x) sin(5x))
0
e ( cos(5x) sin(5x))
2x
2x
3x
2x
2x
(c1 + c2 )e cos(5x) + (c1 c2 )e sin(5x)
= (3c1 4c2 )e2x cos(5x) + (4c1 3c2 )e2x sin(5x) 2c3 e3x .
3c1 e2x cos(5x) + 3c2 e2x sin(5x) + c3 e3x
Section 2.3
10e8x 168e4x
1. Yi =
.
12e8x + 213e4x
81
82
20e4x + 9e5x
.
49e4x + 23e5x
2e10x + e12x
.
5. Yi =
25e10x + 10e12x
1
7. Yi = 1 2e2x 3e4x .
1 + e2x + 2e4x
3. Yi =
Section 2.4
1a.
35e3x + 36e5x
45e3x 45e5x
eAx =
28e3x + 28e5x
.
36e3x 35e5x
Y =
(351 282 )e3x + (361 + 282 )e5x
.
(451 + 362 )e3x + (451 352 )e5x
3a.
e3x 21xe3x
49xe3x
eAx =
9xe3x
.
e3x + 21xe3x
1 e3x + (211 + 92 )xe3x
Y =
.
2 e3x + (491 + 212 )xe3x
5a.
eAx
e7x 5xe7x
=
25xe7x
xe7x
.
e7x + 5xe7x
1 e7x + (51 + 2 )xe7x
Y =
.
2 e7x + (251 + 52 )xe7x
7a.
eAx
ex
1 x 1 x
= 2 e + 2 e
21 ex + 21 ex
1 x
4e
1 x
4e
0
+ 43 e3x
41 e3x
3 x
4e
3 x
4e
0
43 e3x
.
+ 41 e3x
1 ex
1
1
3
1
3
3
x
x
3x
Y = ( 2 1 + 4 2 + 4 3 )e + 2 1 e + ( 4 2 4 3 )e .
( 21 1 + 41 2 + 43 3 )ex + 21 1 ex + ( 41 2 + 41 3 )e3x
e3x
9a.
2e3x
+ 2ex
eAx = e3x + ex
e3x ex
e3x
+ 2ex
ex
ex
4e3x
+ 4ex
2e3x + 2ex .
3e3x 2ex
11a.
eAx =
3 4x
1 2x
2e 2e
1
e4x + 1 e2x
2
2
1 4x
2 e + 21 e2x
e4x e2x
e2x
e4x + e2x
1 4x
1 2x
2e 2e
21 e4x + 21 e2x
.
1 2x
1 4x
2e + 2e
( 23 1 + 2 + 21 3 )e4x + ( 21 1 2 21 3 )e2x
1
1
1
1
4x
2x
.
Y =
( 2 1 2 3 )e + ( 2 1 + 2 + 2 3 )e
1
1
1
1
4x
2x
( 2 1 2 + 2 3 )e + ( 2 1 + 2 + 2 3 )e
13a.
eAx
e2x xe2x
2x
= e
xe2x e4x
e2x e4x
xe2x
2x
xe
+ e4x
e2x + e4x
xe2x
xe2x .
e2x
1 e2x + (1 + 2 3 )xe2x
Y = 1 e2x + (1 + 2 3 )xe2x + (1 + 2 )e4x .
(1 2 + 3 )e2x + (1 + 2 )e4x
15a.
eAx
3 2e3x
= 2 2e3x
3 + 3e3x
9x
1 + 6x
9x
2 + 6x 2e3x
2 + 4x 2e3x .
2 6x + 3e3x
17a.
eAx
ex 2xex
= 10xex
6xex
xex
ex + 5xex
3xex
xex
5xex .
x
e 3xex
83
84
1 ex + (21 + 2 3 )xex
Y = 2 ex + (101 + 52 53 )xex .
3 ex + (61 + 32 33 )xex
19a.
eAx
1 4x 23 x 2
=
5x 3x 2
7x 23 x 2
x + 21 x 2
1 + x + x2
2x + 21 x 2
2x + 21 x 2
3x + x 2 .
1 + 3x + 21 x 2
1 + (41 + 2 + 23 )x + ( 23 1 + 21 2 + 21 3 )x 2
Y = 2 + (51 + 2 + 33 )x + (31 + 2 + 3 )x 2 .
3 + (71 + 22 + 33 )x + ( 23 1 + 21 2 + 21 3 )x 2
5 4x
e sin(3x)
e4x (cos(3x) 13 sin(3x))
3
.
=
e4x (cos(3x) + 13 sin(3x))
23 e4x sin(3x)
21a.
eAx
Y =
23a.
e7x (cos(3x)
eAx =
2
3
sin(3x))
13 e7x sin(3x)
Y =
13 7x
3 e
sin(3x)
e7x (cos(3x) +
2
3
13
7x
3 2 )e sin(3x)
.
2 e7x cos(3x) + ( 13 1 + 23 2 )e7x sin(3x)
1 e7x cos(3x) + ( 23 1 +
sin(3x))
.
85
Index
antiderivative
arbitrary, 48, 67
chain of generalized eigenvectors, 10, 12, 73
bottom of, 10
top of, 10
characteristic polynomial, 1, 75
complex root, see eigenvalue, complex
complex exponential, 41, 75
diagonalizable, 57, 72
eigenspace, 1
generalized, 8, 9
eigenvalue, 1
complex, 40, 43, 65
eigenvector, 1
generalized, 8
index of generalized, 8
Euler, 41, 75
Fundamental Theorem of Algebra, 3
integrating factor, 30, 31, 67
JCF, see Jordan Canonical Form
Jordan block, 7, 8, 12, 28, 47, 60, 61, 68, 73
Jordan Canonical Form, 8, 14, 17, 21, 25, 31,
47, 66
linear differential equations
associated homogeneous system of, 47
fundamental matrix of system of, 26