Topic 1: Matrix Diagonalization
Uniqueness of A−1 is easily proved: suppose that B is another inverse of the matrix A. Then
BA = In and
B = BIn = B(AA−1 ) = (BA)A−1 = In A−1 = A−1 ,
showing that B = A−1 .
Some properties of the inverse matrix are the following. It is assumed that the matrices in each
of the following laws are regular.
(1) (A−1 )−1 = A.
(2) (AT )−1 = (A−1 )T .
(3) (AB)−1 = B −1 A−1 .
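These three laws can be verified numerically; the sketch below is a minimal check, assuming NumPy is available, with A and B chosen only for illustration (any pair of regular matrices of the same order would do).

```python
import numpy as np

# Illustrative regular (invertible) matrices.
A = np.array([[2.0, 1.0],
              [5.0, 3.0]])
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])

A_inv = np.linalg.inv(A)

# (1) (A^-1)^-1 = A
assert np.allclose(np.linalg.inv(A_inv), A)
# (2) (A^T)^-1 = (A^-1)^T
assert np.allclose(np.linalg.inv(A.T), A_inv.T)
# (3) (AB)^-1 = B^-1 A^-1  -- note the reversed order of the factors
assert np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ A_inv)
```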
1.2. Determinants. To a square matrix A we associate a real number called the determinant, |A|
or det (A), in the following way.
For a matrix of order 1, A = (a), \det(A) = a.
For a matrix of order 2,
\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad \det(A) = ad - bc.
\]
For a matrix of order 3,
\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}
+ a_{31}\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}.
\]
This is known as the expansion of the determinant by the first column, but it can be done along any
other row or column, giving the same result. Notice the sign (−1)^{i+j} in front of the element a_{ij}.
Before continuing with the inductive definition, let us see an example.
Example 1.6. Compute the following determinant expanding by the second column.
\[
\begin{vmatrix} 1 & 2 & 1 \\ 4 & 3 & 5 \\ 3 & 1 & 3 \end{vmatrix}
= (-1)^{1+2}\, 2 \begin{vmatrix} 4 & 5 \\ 3 & 3 \end{vmatrix}
+ (-1)^{2+2}\, 3 \begin{vmatrix} 1 & 1 \\ 3 & 3 \end{vmatrix}
+ (-1)^{3+2}\, 1 \begin{vmatrix} 1 & 1 \\ 4 & 5 \end{vmatrix}
= -2\cdot(-3) + 3\cdot 0 - 1\cdot 1 = 5.
\]
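Since the expansion gives the same value along any row or column, it can be coded once and checked against Example 1.6. The recursive sketch below expands along the first column; the function name `det_expand` is ours, not standard, and NumPy is assumed to be available.

```python
import numpy as np

def det_expand(M):
    """Determinant by cofactor expansion along the first column (recursive)."""
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for i in range(n):
        # Complementary minor: delete row i and column 0.
        minor = np.delete(np.delete(M, i, axis=0), 0, axis=1)
        total += (-1) ** i * M[i, 0] * det_expand(minor)  # sign (-1)^{i+j}, j = 0
    return total

A = np.array([[1.0, 2.0, 1.0],
              [4.0, 3.0, 5.0],
              [3.0, 1.0, 3.0]])
print(det_expand(A))  # -> 5.0, the value found in Example 1.6
```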
For general n the method is the same as for matrices of order 3: expand the determinant by a
row or a column, reducing in this way the order of the determinants that must be computed. For
a determinant of order 4 one has to compute 4 determinants of order 3.
Definition 1.7. Given a matrix A of order n, the complementary minor of the element a_{ij} is the
determinant of order n − 1 that results from deleting row i and column j, the row and column containing that
element. The adjoint A_{ij} of the element a_{ij} is the complementary minor multiplied by (−1)^{i+j}.
According to this definition, the determinant of matrix A can be defined as
\[
|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} \quad \text{(by row } i\text{)}
\]
or, equivalently
\[
|A| = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} \quad \text{(by column } j\text{)}.
\]
Example 1.8. Find the value of the determinant
\[
\begin{vmatrix} 1 & 2 & 0 & 3 \\ 4 & 7 & 2 & 1 \\ 1 & 3 & 3 & 1 \\ 0 & 2 & 0 & 7 \end{vmatrix}.
\]
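As a quick numerical check of this exercise, assuming NumPy is available:

```python
import numpy as np

M = np.array([[1.0, 2.0, 0.0, 3.0],
              [4.0, 7.0, 2.0, 1.0],
              [1.0, 3.0, 3.0, 1.0],
              [0.0, 2.0, 0.0, 7.0]])

# By hand, expanding by the third column is cheapest: it has two zero
# entries, so only two determinants of order 3 need to be computed.
print(round(np.linalg.det(M)))  # -> 23
```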
2. Diagonalization of matrices
Definition 2.1. Two matrices A and B of order n are similar if there exists an invertible matrix P such that
B = P −1 AP.
Definition 2.2. A matrix A is diagonalizable if it is similar to a diagonal matrix D, that is, there
exists D diagonal and P invertible such that D = P −1 AP .
Of course, D diagonal means that every element off the main diagonal is zero:
\[
D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}, \qquad \lambda_1, \dots, \lambda_n \in \mathbb{R}.
\]
Proposition 2.3. If A is diagonalizable, with D = P −1 AP , then for all m ≥ 1
\[
A^m = P D^m P^{-1}, \tag{2.1}
\]
where
\[
D^m = \begin{pmatrix} \lambda_1^m & 0 & \cdots & 0 \\ 0 & \lambda_2^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^m \end{pmatrix}.
\]
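Formula (2.1) can be checked numerically; the sketch below assumes NumPy, and the matrix A is an illustrative choice with distinct real eigenvalues (5 and 2), hence diagonalizable.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lam, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
m = 5

# D^m is obtained simply by raising each diagonal entry lambda_i to the m-th power.
Dm = np.diag(lam ** m)

# A^m computed directly agrees with P D^m P^{-1}.
assert np.allclose(np.linalg.matrix_power(A, m), P @ Dm @ np.linalg.inv(P))
```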
Example 2.4. On any given day, instructor X can teach well or teach badly. After a good day, the
probability of doing well in the next class is 1/2, whilst after a bad day, the probability of doing
well is 1/9. Let gt (bt ) be the probability of good (poor) teaching at day t. Suppose that at time t = 1
the class went well, that is, g1 = 1, b1 = 0. What is the probability that the 5th class goes well
(badly)?
Answer: The data lead to the following equations, which relate the probability of a good/bad class
to the performance shown by the teacher the day before:
\[
g_{t+1} = \tfrac{1}{2} g_t + \tfrac{1}{9} b_t,
\]
\[
b_{t+1} = \tfrac{1}{2} g_t + \tfrac{8}{9} b_t.
\]
In matrix form,
\[
\begin{pmatrix} g_{t+1} \\ b_{t+1} \end{pmatrix} = \begin{pmatrix} \frac12 & \frac19 \\ \frac12 & \frac89 \end{pmatrix} \begin{pmatrix} g_t \\ b_t \end{pmatrix}.
\]
Obviously,
\[
\begin{pmatrix} g_5 \\ b_5 \end{pmatrix} = \begin{pmatrix} \frac12 & \frac19 \\ \frac12 & \frac89 \end{pmatrix}^{4} \begin{pmatrix} g_1 \\ b_1 \end{pmatrix}.
\]
If the matrix were diagonalizable and we could find the matrices P and D, then the computation of the
4th power of the matrix would be easy using Proposition 2.3. We will come back to this example
afterwards.
Definition 2.5. Let A be a matrix of order n. We say that λ ∈ R is an eigenvalue of A and that
u ∈ Rn , u ≠ 0, is an eigenvector of A associated to λ if
\[
Au = \lambda u.
\]
The set of eigenvalues of A, σ(A) = {λ1 , . . . , λk }, is called the spectrum of A. The set of all
eigenvectors of A associated to the same eigenvalue λ, including the null vector, is denoted S(λ), and
is called the eigenspace or proper subspace associated to λ.
The following result shows that an eigenvector can only be associated to a unique eigenvalue.
Proposition 2.6. Let 0 ≠ u ∈ S(λ) ∩ S(µ). Then λ = µ.
Proof. Suppose 0 ≠ u ∈ S(λ) ∩ S(µ). Then
\[
Au = \lambda u, \qquad Au = \mu u.
\]
Subtracting the two equations we obtain 0 = (λ − µ)u and, since u ≠ 0, we must have λ = µ.
Recall that for an arbitrary matrix A, the rank of the matrix is the number of linearly independent
columns or rows (both numbers necessarily coincide). It is also given by the order of the largest non
null minor of A.
Theorem 2.7. The real number λ is an eigenvalue of A if and only if
|A − λIn | = 0.
Moreover, S(λ) is the set of solutions (including the null vector) of the linear homogeneous system
(A − λIn )u = 0,
and hence it is a vector subspace, whose dimension is
\[
\dim S(\lambda) = n - \operatorname{rank}(A - \lambda I_n).
\]
Proof. Suppose that λ ∈ R is an eigenvalue of A. Then the system (A − λIn )u = 0 admits some
non-trivial solution u. Since this homogeneous system has a non-trivial solution, the determinant of the
system must be zero, |A − λIn | = 0. The second part about S(λ) also follows from the definition of
eigenvector, and from the fact that the set of solutions of a linear homogeneous system is a subspace (the
sum of two solutions is again a solution, as is the product of a solution by a real number). Finally, the
dimension of the space of solutions is given by the Theorem of Rouché–Frobenius.
Definition 2.8. The characteristic polynomial of A is the polynomial of degree n given by
\[
p_A(\lambda) = |A - \lambda I_n|.
\]
Notice that the eigenvalues of A are the real roots of pA . The Fundamental Theorem of Algebra
states that a polynomial of degree n has n complex roots (not necessarily different; some of the roots
may have multiplicity greater than one). It could be the case that some of the roots of pA are not
real numbers. For us, a root of pA (λ) which is not real is not an eigenvalue of A.
Example 2.9. Find the eigenvalues and the proper subspaces of
\[
A = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Answer:
\[
A - \lambda I = \begin{pmatrix} -\lambda & -1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & 1-\lambda \end{pmatrix}; \qquad
p(\lambda) = (1-\lambda)\begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = (1-\lambda)(\lambda^2 + 1).
\]
The characteristic polynomial has only one real root, hence the spectrum of A is σ(A) = {1}. The
proper subspace S(1) is the set of solutions of the homogeneous linear system (A − I3 )u = 0, that
is, the set of solutions of
\[
(A - I_3)u = \begin{pmatrix} -1 & -1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
Solving the above system we obtain
S(1) = {(0, 0, z) : z ∈ R} =< (0, 0, 1) > (the subspace generated by (0, 0, 1)).
Notice that pA (λ) has other roots that are not real: the complex numbers ±i, which are
not (real) eigenvalues of A. If we admitted complex numbers, then they would be eigenvalues of
A in this extended sense.
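A numerical routine working over the complex numbers reports all three roots of p(λ); a minimal sketch, assuming NumPy:

```python
import numpy as np

A = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])

# eigvals works over C, so it returns the real root 1 together with the
# complex pair +-i. In these notes only the real root counts as an
# eigenvalue of A.
lam = np.linalg.eigvals(A)
print(lam)
```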
Example 2.10. Find the eigenvalues and the proper subspaces of
\[
B = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & -1 \\ 0 & 2 & 4 \end{pmatrix}.
\]
Answer: The eigenvalues are obtained by solving
\[
\begin{vmatrix} 2-\lambda & 1 & 0 \\ 0 & 1-\lambda & -1 \\ 0 & 2 & 4-\lambda \end{vmatrix} = 0.
\]
The solutions are λ = 3 (simple root) and λ = 2 (double root). To find S(3) = {u ∈ R3 : (B−3I3 )u =
0} we compute the solutions to
\[
(B - 3I_3)u = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -2 & -1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
which are x = y and z = −2y, and hence S(3) =< (1, 1, −2) >. To find S(2) we solve the system
\[
(B - 2I_3)u = \begin{pmatrix} 0 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 2 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
from which y = z = 0 and hence S(2) =< (1, 0, 0) >.
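The dimensions just found can be recovered from the formula dim S(λ) = n − rank(A − λIn ) of Theorem 2.7; a minimal check, assuming NumPy:

```python
import numpy as np

B = np.array([[2.0, 1.0,  0.0],
              [0.0, 1.0, -1.0],
              [0.0, 2.0,  4.0]])
n = 3

# dim S(lambda) = n - rank(B - lambda*I), by Theorem 2.7.
dim_S3 = n - np.linalg.matrix_rank(B - 3 * np.eye(n))
dim_S2 = n - np.linalg.matrix_rank(B - 2 * np.eye(n))
print(dim_S3, dim_S2)  # -> 1 1
```

Both eigenspaces are one-dimensional, even though λ = 2 is a double root of the characteristic polynomial.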
Example 2.11. Find the eigenvalues and the proper subspaces of
\[
C = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 0 \\ 1 & 1 & 3 \end{pmatrix}.
\]
Answer: To compute the eigenvalues we solve the characteristic equation
\[
0 = |C - \lambda I_3| = \begin{vmatrix} 1-\lambda & 2 & 0 \\ 0 & 2-\lambda & 0 \\ 1 & 1 & 3-\lambda \end{vmatrix}
= (2-\lambda)\begin{vmatrix} 1-\lambda & 0 \\ 1 & 3-\lambda \end{vmatrix} = (2-\lambda)(1-\lambda)(3-\lambda).
\]
So, the eigenvalues are λ1 = 1, λ2 = 2 and λ3 = 3. We now compute the eigenvectors. The eigenspace
S(1) is the solution of the homogeneous linear system whose associated matrix is C − λI3 with λ = 1.
That is, S(1) is the solution of the following homogeneous linear system
\[
\begin{pmatrix} 0 & 2 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
Solving the above system we find that
S(1) = {(−2z, 0, z) : z ∈ R} =< (−2, 0, 1) >
On the other hand, S(2) is the set of solutions of the homogeneous linear system whose associated
matrix is C − λI3 with λ = 2. That is, S(2) is the solution of the following
\[
\begin{pmatrix} -1 & 2 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
So,
S(2) = {(2y, y, −3y) : y ∈ R} =< (2, 1, −3) >
Finally, S(3) is the set of solutions of the homogeneous linear system whose associated matrix is
C − λI3 with λ = 3. That is, S(3) is the solution of the following
\[
\begin{pmatrix} -2 & 2 & 0 \\ 0 & -1 & 0 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
and we obtain
S(3) = {(0, 0, z) : z ∈ R} =< (0, 0, 1) >
We now start describing the procedure to diagonalize a matrix. Fix a square matrix A. Let
\[
\lambda_1, \lambda_2, \dots, \lambda_k
\]
be the distinct real roots of the characteristic polynomial pA (λ) and let nj be the multiplicity of each λj
(hence nj = 1 if λj is a simple root, nj = 2 if it is double, etc.). Note that n1 + n2 + · · · + nk ≤ n.
The following result states that the number of linearly independent vectors in the subspace S(λj ) can never
be bigger than the multiplicity of λj .
Proposition 2.12. For each j = 1, . . . , k
1 ≤ dim S(λj ) ≤ nj .
The following theorem gives necessary and sufficient conditions for a matrix A to be diagonalizable.
Theorem 2.13. A matrix A is diagonalizable if and only if the two following conditions hold.
(1) Every root λ1 , λ2 , . . . , λk of the characteristic polynomial pA (λ) is real.
(2) For each j = 1, . . . , k
dim S(λj ) = nj .
Corollary 2.14. If the matrix A has n distinct real eigenvalues, then it is diagonalizable.
Theorem 2.15. If A is diagonalizable, then the diagonal matrix D is formed by the eigenvalues of
A in its main diagonal, with each λj repeated nj times. Moreover, a matrix P such that D = P −1 AP
has as columns independent eigenvectors selected from each proper subspace S(λj ), j = 1, . . . , k.
Comments on the examples above.
• Matrix A of Example 2.9 is not diagonalizable, since pA has non-real roots.
• Although all roots of pB are real, B of Example 2.10 is not diagonalizable, because dim S(2) =
1, which is smaller than the multiplicity of λ = 2.
• Matrix C of Example 2.11 is diagonalizable, since pC has 3 different real roots. In this case
\[
D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \qquad
P = \begin{pmatrix} -2 & 2 & 0 \\ 0 & 1 & 0 \\ 1 & -3 & 1 \end{pmatrix}.
\]
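Since the columns of P are the eigenvectors found in Example 2.11, the relation D = P −1 CP can be confirmed numerically; a minimal check, assuming NumPy:

```python
import numpy as np

C = np.array([[1.0, 2.0, 0.0],
              [0.0, 2.0, 0.0],
              [1.0, 1.0, 3.0]])
D = np.diag([1.0, 2.0, 3.0])
# Columns of P: eigenvectors for lambda = 1, 2, 3, in that order.
P = np.array([[-2.0,  2.0, 0.0],
              [ 0.0,  1.0, 0.0],
              [ 1.0, -3.0, 1.0]])

# Theorem 2.15: P^-1 C P is diagonal, with the eigenvalues on the diagonal.
assert np.allclose(np.linalg.inv(P) @ C @ P, D)
```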
Example 2.16. Returning to Example 2.4, we compute
\[
\begin{vmatrix} \frac12 - \lambda & \frac19 \\ \frac12 & \frac89 - \lambda \end{vmatrix} = 0,
\]
or 18λ² − 25λ + 7 = 0. We get λ1 = 1 and λ2 = 7/18. Now, S(1) is the solution set of
\[
\begin{pmatrix} -\frac12 & \frac19 \\ \frac12 & -\frac19 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
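The probabilities asked for in Example 2.4 can also be obtained directly, applying the transition matrix four times in exact arithmetic; a small sketch using Python's standard `fractions` module:

```python
from fractions import Fraction as F

# Apply the transition matrix of Example 2.4 four times to (g1, b1) = (1, 0).
half, ninth = F(1, 2), F(1, 9)
g, b = F(1), F(0)
for _ in range(4):   # day 1 -> day 5
    g, b = half * g + ninth * b, half * g + F(8, 9) * b

print(g, b)      # 2339/11664 and 9325/11664
print(float(g))  # roughly 0.2005
```

Since λ2 = 7/18 has absolute value less than 1, the powers of the matrix converge, so for large t the probabilities settle toward the eigenvector associated with λ1 = 1.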