
Topic 1: Matrix Diagonalization

This document discusses matrix diagonalization. It begins with definitions of matrices, operations on matrices like transpose and inverse, and properties of determinants. It then defines similar matrices and introduces the concept of a diagonalizable matrix as a matrix that is similar to a diagonal matrix. A key point is that if a matrix $A$ is diagonalizable, then $A^m$ can be written as the product of three matrices: the change of basis matrix $P$, a diagonal matrix of the eigenvalues raised to the $m$th power, and the inverse of $P$.

Topic 1: Matrix diagonalization

1. Review of Matrices and Determinants


Definition 1.1. A matrix is a rectangular array of real numbers
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1m} \\
a_{21} & a_{22} & \cdots & a_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nm}
\end{pmatrix}.
\]
The matrix is said to be of order $n \times m$ if it has $n$ rows and $m$ columns. The set of matrices of order $n \times m$ will be denoted $M_{n \times m}$.
The element $a_{ij}$ belongs to the $i$th row and to the $j$th column. Most often we will write in abbreviated form $A = (a_{ij})_{i=1,\dots,n;\ j=1,\dots,m}$ or even $A = (a_{ij})$.
The main, or principal, diagonal of a matrix is the diagonal from the upper left-hand to the lower right-hand corner.
Definition 1.2. The transpose of a matrix $A$, denoted $A^T$, is the matrix formed by interchanging the rows and columns of $A$:
\[
A^T = \begin{pmatrix}
a_{11} & a_{21} & \cdots & a_{n1} \\
a_{12} & a_{22} & \cdots & a_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1m} & a_{2m} & \cdots & a_{nm}
\end{pmatrix} \in M_{m \times n}.
\]
We can define two operations with matrices, sum and multiplication. The main properties of these operations, as well as of transposition, are the following. It is assumed that the matrices in each of the following laws are such that the indicated operation can be performed and that $\alpha, \beta \in \mathbb{R}$.
(1) $(A^T)^T = A$.
(2) $(A + B)^T = A^T + B^T$.
(3) $A + B = B + A$ (commutative law).
(4) $A + (B + C) = (A + B) + C$ (associative law).
(5) $\alpha(A + B) = \alpha A + \alpha B$.
(6) $(\alpha + \beta)A = \alpha A + \beta A$.
(7) Matrix multiplication is not commutative, that is, in general $AB \neq BA$.
(8) $A(BC) = (AB)C$ (associative law).
(9) $A(B + C) = AB + AC$ (distributive with respect to addition).
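As a quick illustration of laws (1) and (7), here is a minimal Python sketch that stores a matrix as a list of rows. The helper names `transpose` and `matmul` and the two matrices are illustrative choices, not part of the notes.

```python
def transpose(a):
    """Return the transpose: the rows of `a` become the columns."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Return the product a*b (the number of columns of a must equal the rows of b)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)  # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)  # multiplying by B on the left swaps the rows of A

print("AB =", AB)
print("BA =", BA)
print("AB == BA ?", AB == BA)
print("(A^T)^T == A ?", transpose(transpose(A)) == A)
```

One pair of matrices with $AB \neq BA$ is enough to show that the product is not commutative in general, although particular pairs (for instance, any matrix and the identity) do commute.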
1.1. Square matrices. We are mainly interested in square matrices. A matrix is square if $n = m$.
The trace of a square matrix $A$ is the sum of its diagonal elements, $\operatorname{trace}(A) = \sum_{i=1}^{n} a_{ii}$.
Definition 1.3. The identity matrix of order $n$ is
\[
I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}.
\]
The square matrix of order $n$ with all its entries null is the null matrix, and will be denoted $O_n$. It holds that $I_n A = A I_n = A$ and $O_n A = A O_n = O_n$.
Definition 1.4. A square matrix $A$ is called regular or invertible if there exists a matrix $B$ such that $AB = BA = I_n$. The matrix $B$ is called the inverse of $A$ and it is denoted $A^{-1}$.
Theorem 1.5. The inverse matrix is unique.

Uniqueness of $A^{-1}$ can be easily proved. Suppose that $B$ is another inverse of the matrix $A$. Then $BA = I_n$ and
\[
B = B I_n = B(A A^{-1}) = (BA)A^{-1} = I_n A^{-1} = A^{-1},
\]
showing that $B = A^{-1}$.
Some properties of the inverse matrix are the following. It is assumed that the matrices in each of the following laws are regular.
(1) $(A^{-1})^{-1} = A$.
(2) $(A^T)^{-1} = (A^{-1})^T$.
(3) $(AB)^{-1} = B^{-1} A^{-1}$.

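1.2. Determinants. To a square matrix $A$ we associate a real number called the determinant, $|A|$ or $\det(A)$, in the following way.
For a matrix of order 1, $A = (a)$, $\det(A) = a$.
For a matrix of order 2, $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\det(A) = ad - bc$.
For a matrix of order 3,
\[
\det(A) = \begin{vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{21} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}
+ a_{31} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}.
\]
This is known as the expansion of the determinant by the first column, but it can be done along any other row or column, giving the same result. Notice the sign $(-1)^{i+j}$ in front of the element $a_{ij}$.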
Before continuing with the inductive definition, let us see an example.
Example 1.6. Compute the following determinant expanding by the second column.
\[
\begin{vmatrix} 1 & 2 & 1 \\ 4 & 3 & 5 \\ 3 & 1 & 3 \end{vmatrix}
= (-1)^{1+2} \cdot 2 \begin{vmatrix} 4 & 5 \\ 3 & 3 \end{vmatrix}
+ (-1)^{2+2} \cdot 3 \begin{vmatrix} 1 & 1 \\ 3 & 3 \end{vmatrix}
+ (-1)^{3+2} \cdot 1 \begin{vmatrix} 1 & 1 \\ 4 & 5 \end{vmatrix}
= -2 \cdot (-3) + 3 \cdot 0 - 1 \cdot 1 = 5.
\]
For general $n$ the method is the same as for matrices of order 3: expand the determinant along a row or a column, which reduces the order of the determinants that must be computed. For a determinant of order 4 one has to compute 4 determinants of order 3.
Definition 1.7. Given a matrix $A$ of order $n$, the complementary minor of the element $a_{ij}$ is the determinant of order $n - 1$ which results from the deletion of the row $i$ and the column $j$ containing that element. The adjoint $A_{ij}$ of the element $a_{ij}$ (also called its cofactor) is the minor multiplied by $(-1)^{i+j}$.
According to this definition, the determinant of the matrix $A$ can be defined as
\[
|A| = a_{i1} A_{i1} + a_{i2} A_{i2} + \cdots + a_{in} A_{in} \quad \text{(by row } i\text{)}
\]
or, equivalently,
\[
|A| = a_{1j} A_{1j} + a_{2j} A_{2j} + \cdots + a_{nj} A_{nj} \quad \text{(by column } j\text{)}.
\]
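Example 1.8. Find the value of the determinant
\[
\begin{vmatrix} 1 & 2 & 0 & 3 \\ 4 & 7 & 2 & 1 \\ 1 & 3 & 3 & 1 \\ 0 & 2 & 0 & 7 \end{vmatrix}.
\]
Answer: Expanding the determinant by the third column, whose only nonzero entries are $a_{23} = 2$ and $a_{33} = 3$, one gets
\[
\begin{vmatrix} 1 & 2 & 0 & 3 \\ 4 & 7 & 2 & 1 \\ 1 & 3 & 3 & 1 \\ 0 & 2 & 0 & 7 \end{vmatrix}
= (-1)^{2+3} \cdot 2 \begin{vmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 0 & 2 & 7 \end{vmatrix}
+ (-1)^{3+3} \cdot 3 \begin{vmatrix} 1 & 2 & 3 \\ 4 & 7 & 1 \\ 0 & 2 & 7 \end{vmatrix}.
\]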
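The cofactor expansion translates directly into a short recursive routine. The sketch below (the function names `minor` and `det` are hypothetical) always expands along the first column and can be used to check Examples 1.6 and 1.8. It is intended for small matrices only, because the number of products grows like $n!$.

```python
def minor(a, i, j):
    """Return the matrix obtained from `a` by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by cofactor expansion along the first column."""
    if len(a) == 1:
        return a[0][0]
    # With 0-based indices the sign of the entry in row i, column 0 is (-1)**i.
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


example_1_6 = [[1, 2, 1], [4, 3, 5], [3, 1, 3]]
example_1_8 = [[1, 2, 0, 3], [4, 7, 2, 1], [1, 3, 3, 1], [0, 2, 0, 7]]

print(det(example_1_6))  # 5, as in Example 1.6
print(det(example_1_8))  # 23, i.e. -2 * 11 + 3 * 15 from the two minors of order 3
```

Expanding along the first column here and along the second or third column in the examples gives the same value, as the text states.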
The main properties of determinants are the following. It is assumed that the matrices $A$ and $B$ in each of the following laws are square of order $n$ and $\lambda \in \mathbb{R}$.
(1) $|A| = |A^T|$.
(2) $|\lambda A| = \lambda^n |A|$.
(3) $|AB| = |A||B|$.
(4) A matrix $A$ is regular if and only if $|A| \neq 0$; in this case $|A^{-1}| = \dfrac{1}{|A|}$.
(5) If in a determinant two rows (or columns) are interchanged, the value of the determinant is changed in sign.
(6) If two rows (columns) in a determinant are identical, the value of the determinant is zero.
(7) If all the entries in a row (column) of a determinant are multiplied by a constant $\lambda$, then the value of the determinant is also multiplied by this constant.
(8) In a given determinant, a constant multiple of the elements in one row (column) may be added to the elements of another row (column) without changing the value of the determinant.
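Several of these properties can be spot-checked numerically. The following sketch is only an illustration on one pair of integer matrices, not a proof; the helper names and the matrix `B` are arbitrary choices, while `A` is the matrix of Example 1.6.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def det(a):
    """Determinant by cofactor expansion along the first column."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        (-1) ** i * a[i][0] * det([row[1:] for k, row in enumerate(a) if k != i])
        for i in range(len(a))
    )


A = [[1, 2, 1], [4, 3, 5], [3, 1, 3]]  # matrix of Example 1.6
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]  # arbitrary illustrative matrix
lam = 3
n = len(A)

# Row 2 of A plus twice row 1, used to test property (8).
row_added = [A[0], [y + 2 * x for x, y in zip(A[0], A[1])], A[2]]

checks = {
    "(1) |A| = |A^T|": det(A) == det(transpose(A)),
    "(2) |lam A| = lam^n |A|": det([[lam * x for x in row] for row in A]) == lam ** n * det(A),
    "(3) |AB| = |A||B|": det(matmul(A, B)) == det(A) * det(B),
    "(5) a row swap flips the sign": det([A[1], A[0], A[2]]) == -det(A),
    "(6) equal rows give zero": det([A[0], A[0], A[2]]) == 0,
    "(8) adding a multiple of a row": det(row_added) == det(A),
}

for name, ok in checks.items():
    print(name, ok)
```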
The next result is very useful to check whether a given matrix is regular or not.
Theorem 1.9. A square matrix $A$ has an inverse if and only if $|A| \neq 0$.

2. Diagonalization of matrices
Definition 2.1. Two matrices $A$ and $B$ of order $n$ are similar if there exists an invertible matrix $P$ such that
\[
B = P^{-1} A P.
\]
Definition 2.2. A matrix $A$ is diagonalizable if it is similar to a diagonal matrix $D$, that is, there exist $D$ diagonal and $P$ invertible such that $D = P^{-1} A P$.
Of course, $D$ diagonal means that every element off the main diagonal is null:
\[
D = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{pmatrix}, \qquad \lambda_1, \dots, \lambda_n \in \mathbb{R}.
\]
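Proposition 2.3. If $A$ is diagonalizable, then for all $m \geq 1$
\[
A^m = P D^m P^{-1}, \tag{2.1}
\]
where
\[
D^m = \begin{pmatrix}
\lambda_1^m & 0 & \cdots & 0 \\
0 & \lambda_2^m & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n^m
\end{pmatrix}.
\]
Proof. Since $A$ is diagonalizable, $A = P D P^{-1}$, and therefore
\[
\begin{aligned}
A^m &= \underbrace{(P D P^{-1})(P D P^{-1}) \cdots (P D P^{-1})}_{m \text{ times}} \\
&= P D (P^{-1} P) D \cdots D (P^{-1} P) D P^{-1} \\
&= P D I_n D \cdots D I_n D P^{-1} = P D^m P^{-1}.
\end{aligned}
\]
The expression for $D^m$ is readily obtained by induction on $m$.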
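To see formula (2.1) at work, the sketch below compares repeated multiplication with $P D^m P^{-1}$ on a small matrix. The matrices `P`, `P_inv` and the eigenvalues 2 and 3 are hypothetical values chosen so that everything stays in the integers; they do not come from the notes.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Illustrative data: P has determinant 1, so its inverse has integer entries.
P = [[1, 1], [1, 2]]
P_inv = [[2, -1], [-1, 1]]
eigenvalues = [2, 3]


def diag_power(values, m):
    """Return D^m for D = diag(values): each diagonal entry is raised to the power m."""
    n = len(values)
    return [[values[i] ** m if i == j else 0 for j in range(n)] for i in range(n)]


A = matmul(matmul(P, diag_power(eigenvalues, 1)), P_inv)  # A = P D P^{-1}

m = 5
direct = A
for _ in range(m - 1):  # m - 1 further multiplications give A^m
    direct = matmul(direct, A)

via_diagonalization = matmul(matmul(P, diag_power(eigenvalues, m)), P_inv)

print("A =", A)
print("A^5 by repeated products:", direct)
print("A^5 by P D^5 P^{-1}:     ", via_diagonalization)
```

The diagonalized route needs only two matrix products whatever the value of $m$, which is the practical point of the proposition.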

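Example 2.4. On any given day, instructor X can teach well or teach badly. After a good day, the probability of doing well in the next class is 1/2, whilst after a bad day, the probability of doing well is 1/9. Let $g_t$ ($b_t$) be the probability of good (poor) teaching on day $t$. Suppose that at time $t = 1$ the class has gone well, that is, $g_1 = 1$, $b_1 = 0$. What is the probability that the 5th class goes well (badly)?
Answer: The data lead to the following equations, which relate the probability of a good/bad class to the performance shown by the teacher the day before:
\[
\begin{aligned}
g_{t+1} &= \tfrac{1}{2} g_t + \tfrac{1}{9} b_t, \\
b_{t+1} &= \tfrac{1}{2} g_t + \tfrac{8}{9} b_t.
\end{aligned}
\]
In matrix form
\[
\begin{pmatrix} g_{t+1} \\ b_{t+1} \end{pmatrix}
= \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{9} \\ \tfrac{1}{2} & \tfrac{8}{9} \end{pmatrix}
\begin{pmatrix} g_t \\ b_t \end{pmatrix}.
\]
Applying this relation four times, starting from $t = 1$,
\[
\begin{pmatrix} g_5 \\ b_5 \end{pmatrix}
= \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{9} \\ \tfrac{1}{2} & \tfrac{8}{9} \end{pmatrix}^{4}
\begin{pmatrix} g_1 \\ b_1 \end{pmatrix}.
\]
If the matrix were diagonalizable and we could find matrices $P$ and $D$, then the computation of the 4th power of the matrix (or of any higher power) would be easy using Proposition 2.3. We will come back to this example afterwards.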
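Before any diagonalization, the recursion can simply be iterated. The following sketch uses exact fractions from Python's standard library; the function name `step` is an illustrative choice.

```python
from fractions import Fraction


def step(g, b):
    """Apply the recursion once: from (g_t, b_t) to (g_{t+1}, b_{t+1})."""
    g_next = Fraction(1, 2) * g + Fraction(1, 9) * b
    b_next = Fraction(1, 2) * g + Fraction(8, 9) * b
    return g_next, b_next


g, b = Fraction(1), Fraction(0)  # g_1 = 1, b_1 = 0
history = [(g, b)]
for _ in range(4):               # four steps lead from day 1 to day 5
    g, b = step(g, b)
    history.append((g, b))

for day, (gt, bt) in enumerate(history, start=1):
    print(day, gt, bt, float(gt))
# The last line shows g_5 = 2339/11664, about 0.2005.
```

Iterating is fine for a handful of steps. The diagonalization developed in the rest of the section gives a closed formula for every $t$ and makes the long-run behaviour visible.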
Definition 2.5. Let $A$ be a matrix of order $n$. We say that $\lambda \in \mathbb{R}$ is an eigenvalue of $A$ and that $u \in \mathbb{R}^n$, $u \neq 0$, is an eigenvector of $A$ associated to $\lambda$ if
\[
Au = \lambda u.
\]
The set of eigenvalues of $A$, $\sigma(A) = \{\lambda_1, \dots, \lambda_k\}$, is called the spectrum of $A$. The set of all eigenvectors of $A$ associated to the same eigenvalue $\lambda$, including the null vector, is denoted $S(\lambda)$, and is called the eigenspace or proper subspace associated to $\lambda$.
The following result shows that an eigenvector can only be associated to a unique eigenvalue.
Proposition 2.6. Let $0 \neq u \in S(\lambda) \cap S(\mu)$. Then $\lambda = \mu$.
Proof. Suppose $0 \neq u \in S(\lambda) \cap S(\mu)$. Then
\[
Au = \lambda u, \qquad Au = \mu u.
\]
Subtracting both equations we obtain $0 = (\lambda - \mu)u$ and, since $u \neq 0$, we must have $\lambda = \mu$.
Recall that for an arbitrary matrix $A$, the rank of the matrix is the number of linearly independent columns or rows (both numbers necessarily coincide). It is also given by the order of the largest nonzero minor of $A$.
Theorem 2.7. The real number $\lambda$ is an eigenvalue of $A$ if and only if
\[
|A - \lambda I_n| = 0.
\]
Moreover, $S(\lambda)$ is the set of solutions (including the null vector) of the linear homogeneous system
\[
(A - \lambda I_n)u = 0,
\]
and hence it is a vector subspace, whose dimension is
\[
\dim S(\lambda) = n - \operatorname{rank}(A - \lambda I_n).
\]

Proof. Suppose that $\lambda \in \mathbb{R}$ is an eigenvalue of $A$. Then the system $(A - \lambda I_n)u = 0$ admits some nontrivial solution $u$. Since the system is homogeneous, this implies that the determinant of the system is zero, $|A - \lambda I_n| = 0$. Conversely, if $|A - \lambda I_n| = 0$, the homogeneous system has a nontrivial solution $u$, which is an eigenvector associated to $\lambda$. The second part about $S(\lambda)$ also follows from the definition of eigenvector and from the fact that the set of solutions of a linear homogeneous system is a subspace (the sum of two solutions is again a solution, and so is the product of a real number and a solution). Finally, the dimension of the space of solutions is given by the Rouché-Frobenius theorem.
Definition 2.8. The characteristic polynomial of $A$ is the polynomial of degree $n$ given by
\[
p_A(\lambda) = |A - \lambda I_n|.
\]
Notice that the eigenvalues of $A$ are the real roots of $p_A$. The Fundamental Theorem of Algebra states that a polynomial of degree $n$ has $n$ complex roots (not necessarily distinct, since some of the roots may have multiplicity greater than one). It could be the case that some of the roots of $p_A$ are not real numbers. For us, a root of $p_A(\lambda)$ which is not real is not an eigenvalue of $A$.
Example 2.9. Find the eigenvalues and the proper subspaces of
\[
A = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Answer:
\[
A - \lambda I_3 = \begin{pmatrix} -\lambda & -1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & 1 - \lambda \end{pmatrix};
\qquad
p(\lambda) = (1 - \lambda) \begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = (1 - \lambda)(\lambda^2 + 1).
\]
The characteristic polynomial has only one real root, hence the spectrum of $A$ is $\sigma(A) = \{1\}$. The proper subspace $S(1)$ is the set of solutions of the homogeneous linear system $(A - I_3)u = 0$, that is, the set of solutions of
\[
(A - I_3)u = \begin{pmatrix} -1 & -1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
Solving the above system we obtain
\[
S(1) = \{(0, 0, z) : z \in \mathbb{R}\} = \langle (0, 0, 1) \rangle \quad \text{(the subspace generated by } (0, 0, 1)\text{)}.
\]
Notice that $p_A(\lambda)$ has other roots that are not real. They are the complex numbers $\pm i$, which are not (real) eigenvalues of $A$. If we admitted complex numbers, then they would be eigenvalues of $A$ in this extended sense.
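Example 2.10. Find the eigenvalues and the proper subspaces of
\[
B = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & -1 \\ 0 & 2 & 4 \end{pmatrix}.
\]
Answer: The eigenvalues are obtained solving
\[
\begin{vmatrix} 2 - \lambda & 1 & 0 \\ 0 & 1 - \lambda & -1 \\ 0 & 2 & 4 - \lambda \end{vmatrix} = 0.
\]
The solutions are $\lambda = 3$ (simple root) and $\lambda = 2$ (double root). To find $S(3) = \{u \in \mathbb{R}^3 : (B - 3I_3)u = 0\}$ we compute the solutions to
\[
(B - 3I_3)u = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -2 & -1 \\ 0 & 2 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
which are $x = y$ and $z = -2y$, and hence $S(3) = \langle (1, 1, -2) \rangle$. To find $S(2)$ we solve the system
\[
(B - 2I_3)u = \begin{pmatrix} 0 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 2 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
from which $y = z = 0$ with $x$ free, and hence $S(2) = \langle (1, 0, 0) \rangle$.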
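The dimension formula $\dim S(\lambda) = n - \operatorname{rank}(A - \lambda I_n)$ of Theorem 2.7 can be checked on this example. The sketch below computes the rank by Gaussian elimination with exact fractions; `rank` and `eigenspace_dim` are hypothetical helper names, and the routine is a plain textbook elimination rather than a numerically tuned one.

```python
from fractions import Fraction


def rank(a):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def eigenspace_dim(a, lam):
    """dim S(lam) = n - rank(a - lam * I) for a square matrix a."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)


B = [[2, 1, 0], [0, 1, -1], [0, 2, 4]]
print(eigenspace_dim(B, 3))  # 1
print(eigenspace_dim(B, 2))  # 1, although lambda = 2 is a double root
print(eigenspace_dim(B, 5))  # 0, since 5 is not an eigenvalue
```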
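Example 2.11. Find the eigenvalues and the proper subspaces of
\[
C = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 0 \\ 1 & 1 & 3 \end{pmatrix}.
\]
Answer: To compute the eigenvalues we solve the characteristic equation
\[
0 = |C - \lambda I_3| = \begin{vmatrix} 1 - \lambda & 2 & 0 \\ 0 & 2 - \lambda & 0 \\ 1 & 1 & 3 - \lambda \end{vmatrix}
= (2 - \lambda) \begin{vmatrix} 1 - \lambda & 0 \\ 1 & 3 - \lambda \end{vmatrix}
= (2 - \lambda)(1 - \lambda)(3 - \lambda).
\]
So, the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 2$ and $\lambda_3 = 3$. We now compute the eigenvectors. The eigenspace $S(1)$ is the set of solutions of the homogeneous linear system whose associated matrix is $C - \lambda I_3$ with $\lambda = 1$. That is, $S(1)$ is the set of solutions of
\[
\begin{pmatrix} 0 & 2 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
Solving the above system we find that
\[
S(1) = \{(-2z, 0, z) : z \in \mathbb{R}\} = \langle (-2, 0, 1) \rangle.
\]
On the other hand, $S(2)$ is the set of solutions of the homogeneous linear system whose associated matrix is $C - \lambda I_3$ with $\lambda = 2$. That is, $S(2)$ is the set of solutions of
\[
\begin{pmatrix} -1 & 2 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
So,
\[
S(2) = \{(2y, y, -3y) : y \in \mathbb{R}\} = \langle (2, 1, -3) \rangle.
\]
Finally, $S(3)$ is the set of solutions of the homogeneous linear system whose associated matrix is $C - \lambda I_3$ with $\lambda = 3$. That is, $S(3)$ is the set of solutions of
\[
\begin{pmatrix} -2 & 2 & 0 \\ 0 & -1 & 0 \\ 1 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
and we obtain
\[
S(3) = \{(0, 0, z) : z \in \mathbb{R}\} = \langle (0, 0, 1) \rangle.
\]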

We now start describing the procedure to diagonalize a matrix. Fix a square matrix $A$ of order $n$. Let
\[
\lambda_1, \lambda_2, \dots, \lambda_k
\]
be the distinct real roots of the characteristic polynomial $p_A(\lambda)$ and let $n_j$ be the multiplicity of $\lambda_j$, $j = 1, \dots, k$ (hence $n_j = 1$ if $\lambda_j$ is a simple root, $n_j = 2$ if it is double, etc.). Note that $n_1 + n_2 + \cdots + n_k \leq n$.

The following result states that the number of linearly independent vectors in the subspace $S(\lambda)$ can never be bigger than the multiplicity of $\lambda$.
Proposition 2.12. For each $j = 1, \dots, k$, with $n_j$ the multiplicity of $\lambda_j$ as a root of $p_A$,
\[
1 \leq \dim S(\lambda_j) \leq n_j.
\]
The following theorem gives necessary and sufficient conditions for a matrix $A$ to be diagonalizable.
Theorem 2.13. A matrix $A$ is diagonalizable if and only if the two following conditions hold.
(1) Every root $\lambda_1, \lambda_2, \dots, \lambda_k$ of the characteristic polynomial $p_A(\lambda)$ is real.
(2) For each $j = 1, \dots, k$,
\[
\dim S(\lambda_j) = n_j.
\]
Corollary 2.14. If the matrix $A$ of order $n$ has $n$ distinct real eigenvalues, then it is diagonalizable.
Theorem 2.15. If $A$ is diagonalizable, then the diagonal matrix $D$ is formed by the eigenvalues of $A$ in its main diagonal, with each $\lambda_j$ repeated $n_j$ times. Moreover, a matrix $P$ such that $D = P^{-1} A P$ has as columns independent eigenvectors selected from each proper subspace $S(\lambda_j)$, $j = 1, \dots, k$.
Comments on the examples above.
• Matrix $A$ of Example 2.9 is not diagonalizable, since $p_A$ has complex roots.
• Although all roots of $p_B$ are real, $B$ of Example 2.10 is not diagonalizable, because $\dim S(2) = 1$, which is smaller than the multiplicity of $\lambda = 2$.
• Matrix $C$ of Example 2.11 is diagonalizable, since $p_C$ has 3 different real roots. In this case
\[
D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \qquad
P = \begin{pmatrix} -2 & 2 & 0 \\ 0 & 1 & 0 \\ 1 & -3 & 1 \end{pmatrix}.
\]
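When $P$ is invertible, $D = P^{-1} C P$ is equivalent to $CP = PD$, so the diagonalization of $C$ can be checked without computing an inverse. The sketch below does this with integer arithmetic; the helper names `matmul` and `det` are illustrative.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def det(a):
    """Determinant by cofactor expansion along the first column."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        (-1) ** i * a[i][0] * det([row[1:] for k, row in enumerate(a) if k != i])
        for i in range(len(a))
    )


C = [[1, 2, 0], [0, 2, 0], [1, 1, 3]]
P = [[-2, 2, 0], [0, 1, 0], [1, -3, 1]]  # columns: eigenvectors for 1, 2 and 3
D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]

CP = matmul(C, P)  # column j is C applied to the j-th eigenvector
PD = matmul(P, D)  # column j is the j-th eigenvector times its eigenvalue

print("CP =", CP)
print("PD =", PD)
print("CP == PD ?", CP == PD)
print("det(P) =", det(P))  # nonzero, so P is invertible and D = P^{-1} C P
```

The order of the columns of $P$ has to match the order of the eigenvalues on the diagonal of $D$; permuting both in the same way gives another valid diagonalization.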
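Example 2.16. Returning to Example 2.4, let $A$ denote the matrix of the system found there. We compute
\[
\begin{vmatrix} \tfrac{1}{2} - \lambda & \tfrac{1}{9} \\ \tfrac{1}{2} & \tfrac{8}{9} - \lambda \end{vmatrix} = 0,
\]
or $18\lambda^2 - 25\lambda + 7 = 0$. We get $\lambda_1 = 1$ and $\lambda_2 = \frac{7}{18}$. Now, $S(1)$ is the solution set of
\[
\begin{pmatrix} -\tfrac{1}{2} & \tfrac{1}{9} \\ \tfrac{1}{2} & -\tfrac{1}{9} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
We find $y = \frac{9}{2} x$, so that $S(1) = \langle (2, 9) \rangle$. In the same way, $S(\frac{7}{18})$ is the solution set of
\[
\begin{pmatrix} \tfrac{1}{9} & \tfrac{1}{9} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
We find $y = -x$, so that $S(\frac{7}{18}) = \langle (1, -1) \rangle$. Hence the diagonal matrix is
\[
D = \begin{pmatrix} 1 & 0 \\ 0 & \tfrac{7}{18} \end{pmatrix}
\]
and
\[
P = \begin{pmatrix} 2 & 1 \\ 9 & -1 \end{pmatrix}, \qquad
P^{-1} = \frac{1}{11} \begin{pmatrix} 1 & 1 \\ 9 & -2 \end{pmatrix}.
\]
Thus,
\[
A^n = \frac{1}{11} \begin{pmatrix} 2 & 1 \\ 9 & -1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & \left(\tfrac{7}{18}\right)^n \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 9 & -2 \end{pmatrix}.
\]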

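In particular, for $n = 4$ we obtain
\[
A^4 = \begin{pmatrix} 0.2005 & 0.1777 \\ 0.7995 & 0.8223 \end{pmatrix}.
\]
Hence
\[
\begin{pmatrix} g_5 \\ b_5 \end{pmatrix} = A^4 \begin{pmatrix} g_1 \\ b_1 \end{pmatrix}
= \begin{pmatrix} 0.2005 & 0.1777 \\ 0.7995 & 0.8223 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0.2005 \\ 0.7995 \end{pmatrix}.
\]
This means that the probability that the 5th class goes well, conditional on the first class having gone well, is 0.2005.
We can wonder what happens in the long run, that is, supposing that the course lasts forever (oh no!). In this case, since $\left(\tfrac{7}{18}\right)^n \to 0$,
\[
\lim_{n \to \infty} A^n = P \left( \lim_{n \to \infty} D^n \right) P^{-1}
= P \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} P^{-1}
= \begin{pmatrix} \tfrac{2}{11} & \tfrac{2}{11} \\ \tfrac{9}{11} & \tfrac{9}{11} \end{pmatrix},
\]
to find that the stationary distribution of probabilities is
\[
\begin{pmatrix} g_\infty \\ b_\infty \end{pmatrix} = \begin{pmatrix} 0.1818 \\ 0.8182 \end{pmatrix}.
\]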
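The closed formula $A^n = P D^n P^{-1}$ can be evaluated exactly with rational arithmetic. The sketch below uses the matrices $P$, $D$ and $P^{-1}$ of this example; the function names `matmul` and `power` are illustrative, and `Fraction` comes from the Python standard library.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


P = [[Fraction(2), Fraction(1)], [Fraction(9), Fraction(-1)]]
P_inv = [[Fraction(1, 11), Fraction(1, 11)], [Fraction(9, 11), Fraction(-2, 11)]]


def power(n):
    """Return A^n = P D^n P^{-1} with D = diag(1, 7/18)."""
    D_n = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(7, 18) ** n]]
    return matmul(matmul(P, D_n), P_inv)


A4 = power(4)
g5, b5 = A4[0][0], A4[1][0]  # first column, because (g_1, b_1) = (1, 0)
print(g5, float(g5))  # 2339/11664, about 0.2005
print(b5, float(b5))  # 9325/11664, about 0.7995

# Replacing (7/18)^n by its limit 0 gives the limit matrix.
limit = matmul(matmul(P, [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(0)]]), P_inv)
print(limit[0][0], limit[1][0])  # 2/11 and 9/11
```

Each column of $A^n$ adds up to 1 for every $n$, which is a convenient sanity check on the rounded values.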
