
Topic 1: Matrix diagonalization

1. Review of Matrices and Determinants


Definition 1.1. A matrix is a rectangular array of real numbers
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix}.$$
The matrix is said to be of order n × m if it has n rows and m columns. The set of matrices
of order n × m will be denoted M_{n×m}.
The element a_ij belongs to the ith row and to the jth column. Most often we will write
in abbreviated form A = (a_ij)_{i=1,...,n; j=1,...,m}, or simply A = (a_ij).
The main, or principal, diagonal of a matrix is the diagonal from the upper left-hand to the
lower right-hand corner.
Definition 1.2. The transpose of a matrix A, denoted A^T, is the matrix formed by inter-
changing the rows and columns of A:
$$A^T = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{pmatrix} \in M_{m\times n}.$$
We can define two operations with matrices, addition and multiplication. The main properties
of these operations, as well as of transposition, are the following. It is assumed that the
matrices in each of the following laws are such that the indicated operation can be performed
and that α, β ∈ R.
(1) (A^T)^T = A.
(2) (A + B)^T = A^T + B^T.
(3) A + B = B + A (commutative law).
(4) A + (B + C) = (A + B) + C (associative law).
(5) α(A + B) = αA + αB.
(6) (α + β)A = αA + βA.
(7) Matrix multiplication is not commutative in general, i.e., AB ≠ BA may occur.
(8) A(BC) = (AB)C (associative law).
(9) A(B + C) = AB + AC (distributive with respect to addition).
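These laws are easy to check numerically. The following is a minimal sketch in Python with
NumPy; the random matrices and the seed are our own choices, purely for illustration:

import numpy as np

rng = np.random.default_rng(0)              # fixed seed, arbitrary choice
A = rng.integers(-5, 5, size=(2, 3)).astype(float)
B = rng.integers(-5, 5, size=(2, 3)).astype(float)
M = rng.integers(-5, 5, size=(3, 3)).astype(float)
N = rng.integers(-5, 5, size=(3, 3)).astype(float)

assert np.allclose((A.T).T, A)              # law (1)
assert np.allclose((A + B).T, A.T + B.T)    # law (2)
assert np.allclose(A + B, B + A)            # law (3): commutativity of the sum
print(np.allclose(M @ N, N @ M))            # law (7): almost always False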
1.1. Square matrices. We are mainly interested in square matrices. A matrix is square
if n = m. The trace of a square matrix A is the sum of its diagonal elements,
trace(A) = a_11 + a_22 + · · · + a_nn.

Definition 1.3. The identity matrix of order n is
$$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$
The square matrix of order n with all its entries null is the null matrix, and will be denoted
O_n. It holds that I_n A = A I_n = A and O_n A = A O_n = O_n.
Definition 1.4. A square matrix A is called regular or invertible if there exists a matrix B
such that AB = BA = I_n. The matrix B is called the inverse of A and is denoted A^{-1}.
Theorem 1.5. The inverse matrix is unique.
Uniqueness of A^{-1} is easily proved. Suppose that B is another inverse of the matrix A.
Then BA = I_n and
B = B I_n = B(A A^{-1}) = (BA) A^{-1} = I_n A^{-1} = A^{-1},
showing that B = A^{-1}.
Some properties of the inverse matrix are the following. It is assumed that the
matrices in each of the following laws are regular.
(1) (A^{-1})^{-1} = A.
(2) (A^T)^{-1} = (A^{-1})^T.
(3) (AB)^{-1} = B^{-1} A^{-1}.
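Again, these properties can be verified numerically. A minimal sketch, where the test
matrices are shifted by a multiple of the identity so that they are safely invertible (the
matrices themselves are illustrative):

import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 3)) + 3 * np.eye(3)   # diagonally dominant, hence invertible
B = rng.random((3, 3)) + 3 * np.eye(3)

assert np.allclose(np.linalg.inv(np.linalg.inv(A)), A)      # property (1)
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)  # property (2)
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))     # property (3)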
1.2. Determinants. To a square matrix A we associate a real number called the determi-
nant, |A| or det(A), in the following way.
For a matrix of order 1, A = (a), det(A) = a.
For a matrix of order 2, $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, det(A) = ad − bc.
For a matrix of order 3,
$$\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{21} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{31} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}.$$
This is known as the expansion of the determinant by the first column, but it can be done
along any other row or column, giving the same result. Notice the sign (−1)^{i+j} in front of
the element a_ij.
Before continuing with the inductive definition, let us see an example.
Example 1.6. Compute the following determinant expanding by the second column.
$$\begin{vmatrix} 1 & 2 & 1 \\ 4 & 3 & 5 \\ 3 & 1 & 3 \end{vmatrix} = (-1)^{1+2} \, 2 \begin{vmatrix} 4 & 5 \\ 3 & 3 \end{vmatrix} + (-1)^{2+2} \, 3 \begin{vmatrix} 1 & 1 \\ 3 & 3 \end{vmatrix} + (-1)^{3+2} \, 1 \begin{vmatrix} 1 & 1 \\ 4 & 5 \end{vmatrix} = -2 \cdot (-3) + 3 \cdot 0 - 1 \cdot 1 = 5.$$
For general n the method is the same as for matrices of order 3: expand the determinant
along a row or a column, reducing in this way the order of the determinants that must be
computed. For a determinant of order 4 one has to compute four determinants of order 3.
Definition 1.7. Given a matrix A of order n, the complementary minor of the element a_ij
is the determinant of order n − 1 which results from the deletion of the row i and the
column j containing that element. The adjoint A_ij of the element a_ij is the minor
multiplied by (−1)^{i+j}.
According to this definition, the determinant of the matrix A can be computed as
|A| = a_i1 A_i1 + a_i2 A_i2 + · · · + a_in A_in    (by row i)
or, equivalently,
|A| = a_1j A_1j + a_2j A_2j + · · · + a_nj A_nj    (by column j).
Example 1.8. Find the value of the determinant
$$\begin{vmatrix} 1 & 2 & 0 & 3 \\ 4 & 7 & 2 & 1 \\ 1 & 3 & 3 & 1 \\ 0 & 2 & 0 & 7 \end{vmatrix}.$$
Answer: Expanding the determinant by the third column, one gets
$$\begin{vmatrix} 1 & 2 & 0 & 3 \\ 4 & 7 & 2 & 1 \\ 1 & 3 & 3 & 1 \\ 0 & 2 & 0 & 7 \end{vmatrix} = (-1)^{2+3} \, 2 \begin{vmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 0 & 2 & 7 \end{vmatrix} + (-1)^{3+3} \, 3 \begin{vmatrix} 1 & 2 & 3 \\ 4 & 7 & 1 \\ 0 & 2 & 7 \end{vmatrix} = -2 \cdot 11 + 3 \cdot 15 = 23.$$
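The row/column expansion translates directly into a short recursive program. Below is an
illustrative Python sketch (the function name det is ours); it expands along the first row
and runs in O(n!) time, so it is meant for understanding, not for serious computation:

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # complementary minor: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # adjoint sign (-1)^{1 + (j+1)} = (-1)^j
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2, 0, 3],
           [4, 7, 2, 1],
           [1, 3, 3, 1],
           [0, 2, 0, 7]]))   # 23, matching Example 1.8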
The main properties of the determinants are the following. It is assumed that the
matrices A and B in each of the following laws are square of order n and λ ∈ R.
(1) |A| = |A^T|.
(2) |λA| = λ^n |A|.
(3) |AB| = |A||B|.
(4) A matrix A is regular if and only if |A| ≠ 0; in this case |A^{-1}| = 1/|A|.
(5) If in a determinant two rows (or columns) are interchanged, the value of the deter-
minant is changed in sign.
(6) If two rows (columns) in a determinant are identical, the value of the determinant is
zero.
(7) If all the entries in a row (column) of a determinant are multiplied by a constant λ,
then the value of the determinant is also multiplied by this constant.
(8) In a given determinant, a constant multiple of the elements in one row (column) may
be added to the elements of another row (column) without changing the value of the
determinant.
The next result is very useful for checking whether a given matrix is regular.
Theorem 1.9. A square matrix A has an inverse if and only if |A| ≠ 0.
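Numerically, regularity is checked through the determinant, as the following small sketch
shows (the matrices are illustrative):

import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])   # second row = 2 * first row
print(np.linalg.det(A))                  # 0.0, so A has no inverse
B = np.array([[1.0, 2.0], [0.0, 4.0]])
print(np.linalg.det(B))                  # 4.0, so B is regular
print(np.linalg.inv(B))                  # [[1, -0.5], [0, 0.25]]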
2. Diagonalization of matrices
Definition 2.1. Two matrices A and B of order n are similar if there exists an invertible
matrix P such that
B = P^{-1}AP.
Definition 2.2. A matrix A is diagonalizable if it is similar to a diagonal matrix D, that
is, if there exist D diagonal and P invertible such that D = P^{-1}AP.
Of course, D diagonal means that every element off the diagonal is null:
$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}, \qquad \lambda_1, \dots, \lambda_n \in \mathbb{R}.$$
Proposition 2.3. If A is diagonalizable, then for all m ≥ 1
(2.1)    A^m = P D^m P^{-1},
where
$$D^m = \begin{pmatrix} \lambda_1^m & 0 & \cdots & 0 \\ 0 & \lambda_2^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^m \end{pmatrix}.$$
Proof. Since A is diagonalizable, A = P D P^{-1}, and hence
A^m = (P D P^{-1})(P D P^{-1}) · · · (P D P^{-1})    (m factors)
    = P D (P^{-1}P) D · · · D (P^{-1}P) D P^{-1}
    = P D I_n D · · · D I_n D P^{-1} = P D^m P^{-1}.
The expression for D^m is readily obtained by induction on m. □
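Equation (2.1) is precisely what makes diagonalization useful for computing powers. A
minimal sketch in Python with NumPy, assuming A is diagonalizable with real eigenvalues
(the helper name and the test matrix are ours):

import numpy as np

def power_by_diagonalization(A, m):
    # A^m = P D^m P^{-1}, valid when A is diagonalizable
    eigenvalues, P = np.linalg.eig(A)
    Dm = np.diag(eigenvalues ** m)
    return P @ Dm @ np.linalg.inv(P)

A = np.array([[2.0, 1.0], [1.0, 2.0]])
print(np.allclose(power_by_diagonalization(A, 5),
                  np.linalg.matrix_power(A, 5)))   # True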
Example 2.4. On a given day, instructor X can teach well or teach badly. After a good day,
the probability of teaching well in the next class is 1/2, whilst after a bad day, the
probability of teaching well is 1/9. Let g_t (b_t) denote the probability of good (poor)
teaching on day t. Suppose that at time t = 1 the class went well, that is, g_1 = 1, b_1 = 0.
What is the probability that the 5th class is good (bad)?
Answer: The data lead to the following equations, which relate the probability of a
good/bad class to the performance shown by the teacher the day before:
g_{t+1} = (1/2) g_t + (1/9) b_t,
b_{t+1} = (1/2) g_t + (8/9) b_t.
In matrix form,
$$\begin{pmatrix} g_{t+1} \\ b_{t+1} \end{pmatrix} = \begin{pmatrix} 1/2 & 1/9 \\ 1/2 & 8/9 \end{pmatrix} \begin{pmatrix} g_t \\ b_t \end{pmatrix}.$$
Obviously,
$$\begin{pmatrix} g_5 \\ b_5 \end{pmatrix} = \begin{pmatrix} 1/2 & 1/9 \\ 1/2 & 8/9 \end{pmatrix}^4 \begin{pmatrix} g_1 \\ b_1 \end{pmatrix}.$$
If the matrix were diagonalizable and we could find matrices P and D, then the computation
of the powers of the matrix would be easy using Proposition 2.3. We will come back to this
example afterwards.
Definition 2.5. Let A be a matrix of order n. We say that λ ∈ R is an eigenvalue of A and
that u ∈ R^n, u ≠ 0, is an eigenvector of A associated to λ if
Au = λu.
The set of eigenvalues of A, σ(A) = {λ_1, . . . , λ_k}, is called the spectrum of A. The set of
all eigenvectors of A associated to the same eigenvalue λ, including the null vector, is
denoted S(λ), and is called the eigenspace or proper subspace associated to λ.
The following result shows that an eigenvector can only be associated to a unique eigen-
value.
Proposition 2.6. Let 0 ≠ u ∈ S(λ) ∩ S(µ). Then λ = µ.
Proof. Suppose 0 ≠ u ∈ S(λ) ∩ S(µ). Then
Au = λu,
Au = µu.
Subtracting both equations we obtain 0 = (λ − µ)u and, since u ≠ 0, we must have λ = µ. □
Recall that for an arbitrary matrix A, the rank of the matrix is the number of linearly
independent columns or rows (both numbers necessarily coincide). It is also given by the
order of the largest nonzero minor of A.
Theorem 2.7. The real number λ is an eigenvalue of A if and only if
|A − λI_n| = 0.
Moreover, S(λ) is the set of solutions (including the null vector) of the linear homogeneous
system
(A − λI_n)u = 0,
and hence it is a vector subspace, whose dimension is
dim S(λ) = n − rank(A − λI_n).
Proof. Suppose that λ ∈ R is an eigenvalue of A. Then the system (A − λI_n)u = 0 admits
some non-trivial solution u. Since the system is homogeneous, this implies that the deter-
minant of the system is zero, |A − λI_n| = 0. The second part, about S(λ), follows from the
definition of eigenvector and the fact that the set of solutions of a linear homogeneous
system is a subspace (the sum of two solutions is again a solution, as is the product of a
solution by a real number). Finally, the dimension of the space of solutions is given by the
Rouché–Frobenius theorem. □
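Theorem 2.7 gives a direct recipe that can be sketched numerically: find the real roots of
the characteristic polynomial, then measure dim S(λ) = n − rank(A − λI_n). An illustrative
Python fragment (the tolerances are our own choices; the sample matrix is the matrix C of
Example 2.11 below):

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 2.0, 0.0],
              [1.0, 1.0, 3.0]])
n = A.shape[0]
roots = np.roots(np.poly(A))             # roots of the characteristic polynomial
real_eigs = sorted({round(r.real, 9) for r in roots if abs(r.imag) < 1e-9})
for lam in real_eigs:
    dim = n - np.linalg.matrix_rank(A - lam * np.eye(n))
    print(f"lambda = {lam}: dim S(lambda) = {dim}")
# prints dim S(lambda) = 1 for each of lambda = 1, 2, 3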
Definition 2.8. The characteristic polynomial of A is the polynomial of degree n given by
p_A(λ) = |A − λI_n|.
Notice that the eigenvalues of A are the real roots of p_A. The Fundamental Theorem of
Algebra states that a polynomial of degree n has n complex roots (not necessarily different;
some of the roots may have multiplicity greater than one). It could be the case that some
of the roots of p_A are not real numbers. For us, a root of p_A(λ) which is not real is not
an eigenvalue of A.
Example 2.9. Find the eigenvalues and the proper subspaces of
$$A = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Answer:
$$A - \lambda I = \begin{pmatrix} -\lambda & -1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & 1-\lambda \end{pmatrix}; \qquad p(\lambda) = (1-\lambda) \begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = (1-\lambda)(\lambda^2 + 1).$$
The characteristic polynomial has only one real root, hence the spectrum of A is σ(A) = {1}.
The proper subspace S(1) is the set of solutions of the homogeneous linear system
(A − I_3)u = 0, that is, the set of solutions of
$$(A - I_3)u = \begin{pmatrix} -1 & -1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Solving the above system we obtain
S(1) = {(0, 0, z) : z ∈ R} = <(0, 0, 1)> (the subspace generated by (0, 0, 1)).
Notice that p_A(λ) has other roots that are not real: the complex numbers ±i, which are
not (real) eigenvalues of A. If we admitted complex numbers, then they would be
eigenvalues of A in this extended sense.
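Numerical libraries work over the complex numbers by default, so they report the roots ±i
as eigenvalues; one must filter them out to stay within our real convention. For instance
(an illustrative check):

import numpy as np

A = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
w, v = np.linalg.eig(A)
print(w)                                  # approximately [i, -i, 1]
real_only = w[np.abs(w.imag) < 1e-9].real
print(real_only)                          # [1.]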
Example 2.10. Find the eigenvalues and the proper subspaces of
$$B = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & -1 \\ 0 & 2 & 4 \end{pmatrix}.$$
Answer: The eigenvalues are obtained by solving
$$\begin{vmatrix} 2-\lambda & 1 & 0 \\ 0 & 1-\lambda & -1 \\ 0 & 2 & 4-\lambda \end{vmatrix} = 0.$$
The solutions are λ = 3 (simple root) and λ = 2 (double root). To find
S(3) = {u ∈ R^3 : (B − 3I_3)u = 0} we compute the solutions of
$$(B - 3I_3)u = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -2 & -1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},$$
which are x = y and z = −2y, and hence S(3) = <(1, 1, −2)>. To find S(2) we solve the
system
$$(B - 2I_3)u = \begin{pmatrix} 0 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 2 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},$$
from which y = z = 0, and hence S(2) = <(1, 0, 0)>.
Example 2.11. Find the eigenvalues and the proper subspaces of
$$C = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 0 \\ 1 & 1 & 3 \end{pmatrix}.$$
Answer: To compute the eigenvalues we solve the characteristic equation
$$0 = |C - \lambda I_3| = \begin{vmatrix} 1-\lambda & 2 & 0 \\ 0 & 2-\lambda & 0 \\ 1 & 1 & 3-\lambda \end{vmatrix} = (2-\lambda) \begin{vmatrix} 1-\lambda & 0 \\ 1 & 3-\lambda \end{vmatrix} = (2-\lambda)(1-\lambda)(3-\lambda).$$
So the eigenvalues are λ_1 = 1, λ_2 = 2 and λ_3 = 3. We now compute the eigenvectors.
The eigenspace S(1) is the solution set of the homogeneous linear system whose associated
matrix is C − λI_3 with λ = 1, that is, of the system
$$\begin{pmatrix} 0 & 2 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Solving the above system we find that
S(1) = {(−2z, 0, z) : z ∈ R} = <(−2, 0, 1)>.
On the other hand, S(2) is the set of solutions of the homogeneous linear system whose
associated matrix is C − λI_3 with λ = 2, that is, of the system
$$\begin{pmatrix} -1 & 2 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
So,
S(2) = {(2y, y, −3y) : y ∈ R} = <(2, 1, −3)>.
Finally, S(3) is the set of solutions of the homogeneous linear system whose associated
matrix is C − λI_3 with λ = 3, that is, of the system
$$\begin{pmatrix} -2 & 2 & 0 \\ 0 & -1 & 0 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},$$
and we obtain
S(3) = {(0, 0, z) : z ∈ R} = <(0, 0, 1)>.
We now describe the method to diagonalize a matrix. Fix a square matrix A, and let
λ_1, λ_2, . . . , λ_k
be the distinct real roots of the characteristic polynomial p_A(λ). Let m_j be the
multiplicity of each λ_j (hence m_j = 1 if λ_j is a simple root, m_j = 2 if it is double,
etc.). Note that, when all the roots of p_A are real, m_1 + m_2 + · · · + m_k = n.
The following result states that the number of independent vectors in the subspace S(λ_j)
can never be bigger than the multiplicity of λ_j.
Proposition 2.12. For each j = 1, . . . , k,
1 ≤ dim S(λ_j) ≤ m_j.
The following theorem gives necessary and sufficient conditions for a matrix A to be
diagonalizable.
Theorem 2.13. A matrix A is diagonalizable if and only if the two following conditions
hold.
(1) Every root of the characteristic polynomial p_A(λ) is real.
(2) For each j = 1, . . . , k,
dim S(λ_j) = m_j.
Corollary 2.14. If the matrix A has n distinct real eigenvalues, then it is diagonalizable.
Theorem 2.15. If A is diagonalizable, then the diagonal matrix D is formed by the eigen-
values of A in its main diagonal, with each λ_j repeated m_j times. Moreover, a matrix P
such that D = P^{-1}AP has as columns independent eigenvectors selected from each proper
subspace S(λ_j), j = 1, . . . , k.
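Theorem 2.13 can be turned into a rough numerical test. The sketch below is our own
helper with ad hoc tolerances; floating-point arithmetic can be fooled by nearly defective
matrices, so treat it as an illustration of the two conditions, not as a robust routine:

import numpy as np

def try_diagonalize(A, tol=1e-9):
    """Return (D, P) with D = P^{-1} A P diagonal, or None."""
    w, V = np.linalg.eig(A)
    if np.max(np.abs(w.imag)) > tol:
        return None                       # condition (1) fails: complex roots
    if np.linalg.matrix_rank(V.real, tol=1e-7) < A.shape[0]:
        return None                       # condition (2) fails: too few eigenvectors
    return np.diag(w.real), V.real

B = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, -1.0], [0.0, 2.0, 4.0]])
print(try_diagonalize(B))                 # expected None: B of Example 2.10 is defective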
Comments on the examples above.
• Matrix A of Example 2.9 is not diagonalizable, since p_A has complex roots.
• Although all roots of p_B are real, B of Example 2.10 is not diagonalizable, because
dim S(2) = 1, which is smaller than the multiplicity of λ = 2.
• Matrix C of Example 2.11 is diagonalizable, since p_C has 3 different real roots. In
this case
$$D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \qquad P = \begin{pmatrix} -2 & 2 & 0 \\ 0 & 1 & 0 \\ 1 & -3 & 1 \end{pmatrix}.$$
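The last claim is easy to verify numerically (a quick check, not part of the theory):

import numpy as np

C = np.array([[1.0, 2.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 3.0]])
P = np.array([[-2.0, 2.0, 0.0], [0.0, 1.0, 0.0], [1.0, -3.0, 1.0]])
D = np.linalg.inv(P) @ C @ P
print(np.round(D, 10))                    # diag(1, 2, 3)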
Example 2.16. Returning to Example 2.4, we compute
$$\begin{vmatrix} \tfrac{1}{2} - \lambda & \tfrac{1}{9} \\ \tfrac{1}{2} & \tfrac{8}{9} - \lambda \end{vmatrix} = 0,$$
or 18λ^2 − 25λ + 7 = 0. We get λ_1 = 1 and λ_2 = 7/18. Now, S(1) is the solution set of
$$\begin{pmatrix} -\tfrac{1}{2} & \tfrac{1}{9} \\ \tfrac{1}{2} & -\tfrac{1}{9} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
We find y = (9/2)x, so that S(1) = <(2, 9)>. In the same way, S(7/18) is the solution set of
$$\begin{pmatrix} \tfrac{1}{9} & \tfrac{1}{9} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
We find y = −x, so that S(7/18) = <(1, −1)>. Hence the diagonal matrix is
$$D = \begin{pmatrix} 1 & 0 \\ 0 & \tfrac{7}{18} \end{pmatrix}$$
and
$$P = \begin{pmatrix} 2 & 1 \\ 9 & -1 \end{pmatrix}, \qquad P^{-1} = \frac{1}{11} \begin{pmatrix} 1 & 1 \\ 9 & -2 \end{pmatrix}.$$
Thus,
$$A^n = \frac{1}{11} \begin{pmatrix} 2 & 1 \\ 9 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & (\tfrac{7}{18})^n \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 9 & -2 \end{pmatrix}.$$
In particular, for n = 4 we obtain
$$A^4 \approx \begin{pmatrix} 0.2005 & 0.1777 \\ 0.7995 & 0.8223 \end{pmatrix}.$$
Hence
$$\begin{pmatrix} g_5 \\ b_5 \end{pmatrix} = A^4 \begin{pmatrix} g_1 \\ b_1 \end{pmatrix} = A^4 \begin{pmatrix} 1 \\ 0 \end{pmatrix} \approx \begin{pmatrix} 0.2005 \\ 0.7995 \end{pmatrix}.$$
This means that the probability that the 5th class goes well, conditioned on the event that
the first class also went well, is approximately 0.2005.
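The same answer can be reproduced in a few lines (a NumPy check of the computation
above):

import numpy as np

A = np.array([[1/2, 1/9], [1/2, 8/9]])
g1 = np.array([1.0, 0.0])                 # the first class went well
print(np.linalg.matrix_power(A, 4) @ g1)  # approximately [0.2005, 0.7995]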
We can wonder what happens in the long run, that is, supposing that the course lasts
forever (oh no!). In this case, since (7/18)^n → 0,
$$\lim_{n\to\infty} A^n = P \left( \lim_{n\to\infty} D^n \right) P^{-1} = P \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} P^{-1} = \begin{pmatrix} \tfrac{2}{11} & \tfrac{2}{11} \\ \tfrac{9}{11} & \tfrac{9}{11} \end{pmatrix},$$
and we find that the stationary distribution of probabilities is
$$\begin{pmatrix} g_\infty \\ b_\infty \end{pmatrix} = \begin{pmatrix} 2/11 \\ 9/11 \end{pmatrix} \approx \begin{pmatrix} 0.1818 \\ 0.8182 \end{pmatrix}.$$
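Numerically, iterating the chain long enough reproduces this limit (an illustrative check;
any sufficiently large power will do, since (7/18)^n decays very fast):

import numpy as np

A = np.array([[1/2, 1/9], [1/2, 8/9]])
g1 = np.array([1.0, 0.0])
print(np.linalg.matrix_power(A, 100) @ g1)  # approximately [0.1818, 0.8182] = (2/11, 9/11)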