HELM Workbook 22 Eigenvalues and Eigenvectors


Contents

22 Eigenvalues and Eigenvectors

22.1 Basic Concepts
22.2 Applications of Eigenvalues and Eigenvectors
22.3 Repeated Eigenvalues and Symmetric Matrices
22.4 Numerical Determination of Eigenvalues and Eigenvectors

Learning outcomes
In this Workbook you will learn about the matrix eigenvalue problem AX = kX where A
is a square matrix and k is a scalar (number). You will learn how to determine the
eigenvalues (k) and corresponding eigenvectors (X) for a given matrix A. You will learn
of some of the applications of eigenvalues and eigenvectors. Finally you will learn how
eigenvalues and eigenvectors may be determined numerically.
 

22.1 Basic Concepts

Introduction
From an applications viewpoint, eigenvalue problems are probably the most important problems that
arise in connection with matrix analysis. In this Section we discuss the basic concepts. We shall see
that eigenvalues and eigenvectors are associated with square matrices of order n × n. If n is small
(2 or 3), determining eigenvalues is a fairly straightforward process (requiring the solution of a low
order polynomial equation). Obtaining eigenvectors is a little strange initially and it will help if you
read this preliminary Section first.

Prerequisites
Before starting this Section you should . . .
• have a knowledge of determinants and matrices
• have a knowledge of linear first order differential equations

Learning Outcomes
On completion you should be able to . . .
• obtain eigenvalues and eigenvectors of 2 × 2 and 3 × 3 matrices
• state basic properties of eigenvalues and eigenvectors

2 HELM (2008):
Workbook 22: Eigenvalues and Eigenvectors

1. Basic concepts

Determinants
A square matrix possesses an associated determinant. Unlike a matrix, which is an array of numbers,
a determinant has a single value.
 
A two by two matrix C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} has an associated determinant

det(C) = \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} = c_{11} c_{22} − c_{21} c_{12}
(Note square or round brackets denote a matrix, straight vertical lines denote a determinant.)
A three by three matrix C has an associated determinant

det(C) = \begin{vmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{vmatrix}

Among other ways, this determinant can be evaluated by an “expansion about the top row”:

det(C) = c_{11} \begin{vmatrix} c_{22} & c_{23} \\ c_{32} & c_{33} \end{vmatrix} − c_{12} \begin{vmatrix} c_{21} & c_{23} \\ c_{31} & c_{33} \end{vmatrix} + c_{13} \begin{vmatrix} c_{21} & c_{22} \\ c_{31} & c_{32} \end{vmatrix}
Note the minus sign in the second term.

Task
Evaluate the determinants

det(A) = \begin{vmatrix} 4 & 6 \\ 3 & 1 \end{vmatrix} \qquad det(B) = \begin{vmatrix} 4 & 8 \\ 1 & 2 \end{vmatrix} \qquad det(C) = \begin{vmatrix} 6 & 5 & 4 \\ 2 & −1 & 7 \\ −3 & 2 & 0 \end{vmatrix}

Your solution

Answer

det A = 4 × 1 − 6 × 3 = −14 \qquad det B = 4 × 2 − 8 × 1 = 0

det C = 6 \begin{vmatrix} −1 & 7 \\ 2 & 0 \end{vmatrix} − 5 \begin{vmatrix} 2 & 7 \\ −3 & 0 \end{vmatrix} + 4 \begin{vmatrix} 2 & −1 \\ −3 & 2 \end{vmatrix} = 6 × (−14) − 5 × (21) + 4 × (4 − 3) = −185
 
A matrix such as B = \begin{pmatrix} 4 & 8 \\ 1 & 2 \end{pmatrix} in the previous Task, which has zero determinant, is called a
singular matrix. The other two matrices A and C are non-singular. The key factor to be aware of is as
follows:
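The Workbook evaluates these determinants by hand; as a cross-check (a pure-Python sketch, not part of the original text), the 2 × 2 rule and the top-row expansion can be coded directly:

```python
def det2(m):
    # |a b; c d| = ad - bc
    return m[0][0] * m[1][1] - m[1][0] * m[0][1]

def det3(m):
    # Expansion about the top row; note the minus sign on the middle term.
    return (m[0][0] * det2([[m[1][1], m[1][2]], [m[2][1], m[2][2]]])
            - m[0][1] * det2([[m[1][0], m[1][2]], [m[2][0], m[2][2]]])
            + m[0][2] * det2([[m[1][0], m[1][1]], [m[2][0], m[2][1]]]))

A = [[4, 6], [3, 1]]
B = [[4, 8], [1, 2]]
C = [[6, 5, 4], [2, -1, 7], [-3, 2, 0]]

print(det2(A), det2(B), det3(C))  # -14 0 -185
```

The zero result for B confirms that it is singular, in agreement with the Task answer.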

Key Point 1
Any non-singular n × n matrix C, for which det(C) ≠ 0, possesses an inverse C^{−1}, i.e.
CC^{−1} = C^{−1}C = I where I denotes the n × n identity matrix.
A singular matrix does not possess an inverse.

Systems of linear equations


We first recall some basic results in linear (matrix) algebra. Consider a system of n equations in n
unknowns x1 , x2 , . . . , xn :
c_{11} x_1 + c_{12} x_2 + · · · + c_{1n} x_n = k_1
c_{21} x_1 + c_{22} x_2 + · · · + c_{2n} x_n = k_2
  ⋮
c_{n1} x_1 + c_{n2} x_2 + · · · + c_{nn} x_n = k_n
We can write such a system in matrix form:
    
\begin{pmatrix} c_{11} & c_{12} & \dots & c_{1n} \\ c_{21} & c_{22} & \dots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \dots & c_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{pmatrix}, or equivalently CX = K.

We see that C is an n × n matrix (called the coefficient matrix), X = (x_1, x_2, . . . , x_n)^T is the n × 1
column vector of unknowns and K = (k_1, k_2, . . . , k_n)^T is an n × 1 column vector of given constants.
The zero matrix will be denoted by O.
If K ≠ O the system is called inhomogeneous; if K = O the system is called homogeneous.

Basic results in linear algebra


Consider the system of equations CX = K.
We are concerned with the nature of the solutions (if any) of this system. We shall see that such a
system exhibits only three solution types:
• The system is consistent and has a unique solution for X
• The system is consistent and has an infinite number of solutions for X
• The system is inconsistent and has no solution for X


There are two basic cases to consider:


det(C) ≠ 0 or det(C) = 0

Case 1: det(C) ≠ 0
In this case C^{−1} exists and the unique solution to CX = K is
X = C^{−1}K
Case 2: det(C) = 0
In this case C^{−1} does not exist.

(a) If K ≠ O the system CX = K has no solutions.

(b) If K = O the system CX = O has an infinite number of solutions.

We note that a homogeneous system


CX = O
has a unique solution X = O if det(C) ≠ 0 (this is called the trivial solution) or an infinite number
of solutions if det(C) = 0.

Example 1
(Case 1) Solve the inhomogeneous system of equations
x1 + x2 = 1 2x1 + x2 = 2
which can be expressed as CX = K where

C = \begin{pmatrix} 1 & 1 \\ 2 & 1 \end{pmatrix} \quad X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad K = \begin{pmatrix} 1 \\ 2 \end{pmatrix}

Solution
Here det(C) = −1 ≠ 0.

The system of equations has the unique solution: X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.

Example 2
(Case 2a) Examine the following inhomogeneous system for solutions
x1 + 2x2 = 1
3x1 + 6x2 = 0

Solution
Here det(C) = \begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix} = 0. In this case there are no solutions.
To see this, note that the first equation of the system states x_1 + 2x_2 = 1 whereas the second equation
(after dividing through by 3) states x_1 + 2x_2 = 0, a contradiction.

Example 3
(Case 2b) Solve the homogeneous system
x1 + x2 = 0
2x1 + 2x2 = 0

Solution
Here det(C) = \begin{vmatrix} 1 & 1 \\ 2 & 2 \end{vmatrix} = 0. The solutions are any pairs of numbers {x_1, x_2} such that x_1 = −x_2,

i.e. X = \begin{pmatrix} α \\ −α \end{pmatrix} where α is arbitrary.
There are an infinite number of solutions.
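Examples 1 to 3 illustrate the three cases. A small pure-Python sketch (not part of the Workbook) can classify each system by det(C) and by whether K is zero, following the Workbook's case statement (which, as in Example 2, takes the singular inhomogeneous case to be inconsistent):

```python
def det2(c):
    # 2x2 determinant
    return c[0][0] * c[1][1] - c[1][0] * c[0][1]

def classify(c, k):
    if det2(c) != 0:
        return "unique solution"           # Case 1: C inverse exists
    if any(k):
        return "no solution"               # Case 2a: singular, inhomogeneous
    return "infinitely many solutions"     # Case 2b: singular, homogeneous

print(classify([[1, 1], [2, 1]], [1, 2]))  # Example 1
print(classify([[1, 2], [3, 6]], [1, 0]))  # Example 2
print(classify([[1, 1], [2, 2]], [0, 0]))  # Example 3
```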

A simple eigenvalue problem


We shall be interested in simultaneous equations of the form:
AX = λX,
where A is an n × n matrix, X is an n × 1 column vector and λ is a scalar (a constant) and, in the
first instance, we examine some simple examples to gain experience of solving problems of this type.


Example 4
Consider the following system with n = 2:

2x + 3y = λx
3x + 2y = λy

so that

A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix} and X = \begin{pmatrix} x \\ y \end{pmatrix}.
It appears that there are three unknowns x, y, λ. The obvious questions to ask
are: can we find x, y? what is λ?

Solution
To solve this problem we firstly re-arrange the equations (take all unknowns onto one side);
(2 − λ)x + 3y = 0 (1)
3x + (2 − λ)y = 0 (2)
Therefore, from equation (2):

x = − \frac{(2 − λ)}{3} y.   (3)

Then when we substitute this into (1):

− \frac{(2 − λ)^2}{3} y + 3y = 0 which simplifies to [−(2 − λ)^2 + 9] y = 0.

We conclude that either y = 0 or 9 = (2 − λ)^2. There are thus two cases to consider:
Case 1
If y = 0 then x = 0 (from (3)) and we get the trivial solution. (We could have guessed this
solution at the outset.)
Case 2
9 = (2 − λ)^2
which gives, on taking square roots:
±3 = 2 − λ giving λ = 2 ± 3 so λ = 5 or λ = −1.

Now, from equation (3), if λ = 5 then x = +y and if λ = −1 then x = −y.

We have now completed the analysis. We have found values for λ but we also see that we cannot
obtain unique values for x and y: all we can find is the ratio between these quantities. This behaviour
is typical, as we shall now see, of an eigenvalue problem.
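The two values found in Example 4 can be cross-checked numerically (a pure-Python sketch, not part of the original Workbook): each makes the shifted system singular, which is exactly the condition for a non-trivial solution:

```python
A = [[2, 3], [3, 2]]

def det_shifted(a, lam):
    # det(A - lambda*I) for a 2x2 matrix
    return (a[0][0] - lam) * (a[1][1] - lam) - a[0][1] * a[1][0]

# Both eigenvalues from Example 4 make det(A - lambda I) vanish.
print(det_shifted(A, 5), det_shifted(A, -1))  # 0 0

# Equation (3) then fixes only the ratio of x to y:
# lambda = 5:  (2-5)x + 3y = 0  =>  x = y
# lambda = -1: (2+1)x + 3y = 0  =>  x = -y
```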

2. General eigenvalue problems
Consider a given square matrix A. If X is a column vector and λ is a scalar (a number) then the relation

AX = λX    (4)
is called an eigenvalue problem. Our purpose is to carry out an analysis of this equation in a
manner similar to the example above. However, we will attempt a more general approach which will
apply to all problems of this kind.
Firstly, we can spot an obvious solution (for X) to these equations. The solution X = 0 is a
possibility (for then both sides are zero). We will not be interested in these trivial solutions of the
eigenvalue problem. Our main interest will be in the occurrence of non-trivial solutions for X.
These may exist for special values of λ, called the eigenvalues of the matrix A. We proceed as in
the previous example:
take all unknowns to one side:
(A − λI)X = 0 (5)
where I is a unit matrix with the same dimensions as A. (Note that AX − λX = 0 does not
simplify to (A − λ)X = 0 as you cannot subtract a scalar λ from a matrix A). This equation (5)
is a homogeneous system of equations. In the notation of the earlier discussion C ≡ A − λI and
K ≡ 0. For such a system we know that non-trivial solutions will only exist if the determinant of the
coefficient matrix is zero:
det(A − λI) = 0 (6)
Equation (6) is called the characteristic equation of the eigenvalue problem. We see that the
characteristic equation only involves one unknown λ. The characteristic equation is generally a
polynomial in λ, with degree being the same as the order of A (so if A is 2 × 2 the characteristic
equation is a quadratic, if A is a 3 × 3 it is a cubic equation, and so on). For each value of λ that is
obtained the corresponding value of X is obtained by solving the original equations (4). These X’s
are called eigenvectors.
N.B. We shall see that eigenvectors are only unique up to a multiplicative factor: i.e. if X satisfies
AX = λX then so does kX when k is any constant.
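For the 2 × 2 case the characteristic equation det(A − λI) = 0 expands to λ^2 − trace(A)λ + det(A) = 0, so it can be solved with the quadratic formula. A pure-Python sketch (not part of the Workbook; real, distinct roots assumed):

```python
import math

def eigenvalues_2x2(a):
    # Characteristic equation: lambda^2 - trace*lambda + det = 0
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(tr * tr - 4 * det)   # assumes real distinct eigenvalues
    return sorted([(tr - disc) / 2, (tr + disc) / 2])

print(eigenvalues_2x2([[1, 0], [1, 2]]))  # [1.0, 2.0]  (the matrix of Example 5)
print(eigenvalues_2x2([[2, 3], [3, 2]]))  # [-1.0, 5.0] (the matrix of Example 4)
```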


Example 5

Find the eigenvalues and eigenvectors of the matrix A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}

Solution
The eigenvalues and eigenvectors are found by solving the eigenvalue problem

AX = λX, X = \begin{pmatrix} x \\ y \end{pmatrix}, i.e. (A − λI)X = 0.

Non-trivial solutions will exist if det(A − λI) = 0,

that is, det\left[ \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix} − λ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right] = 0, ∴ \begin{vmatrix} 1 − λ & 0 \\ 1 & 2 − λ \end{vmatrix} = 0,

expanding this determinant: (1 − λ)(2 − λ) = 0. Hence the solutions for λ are: λ = 1 and λ = 2.
So we have found two values of λ for this 2 × 2 matrix A. Since these are unequal they are said to
be distinct eigenvalues.
To each value of λ there corresponds an eigenvector. We now proceed to find the eigenvectors.

Case 1
λ = 1 (smaller eigenvalue). Then our original eigenvalue problem becomes AX = X. In full this is

x = x
x + 2y = y

Simplifying

x = x   (a)
x + y = 0   (b)

All we can deduce here is that x = −y, ∴ X = \begin{pmatrix} x \\ −x \end{pmatrix} for any x ≠ 0.

(We specify x ≠ 0 as, otherwise, we would have the trivial solution.)

So the eigenvectors corresponding to eigenvalue λ = 1 are all proportional to \begin{pmatrix} 1 \\ −1 \end{pmatrix}, e.g. \begin{pmatrix} 2 \\ −2 \end{pmatrix}, \begin{pmatrix} −1 \\ 1 \end{pmatrix} etc.

Sometimes we write the eigenvector in normalised form, that is, with modulus or magnitude 1.
Here, the normalised form of X is

\frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ −1 \end{pmatrix} which is unique.

Solution (contd.)
Case 2 Now we consider the larger eigenvalue λ = 2. Our original eigenvalue problem AX = λX
becomes AX = 2X which gives the following equations:
    
\begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 2 \begin{pmatrix} x \\ y \end{pmatrix}

i.e.

x = 2x
x + 2y = 2y

These equations imply that x = 0 whilst the variable y may take any value whatsoever (except zero
as this gives the trivial solution).

Thus the eigenvector corresponding to eigenvalue λ = 2 has the form \begin{pmatrix} 0 \\ y \end{pmatrix}, e.g. \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix} etc.

The normalised eigenvector here is \begin{pmatrix} 0 \\ 1 \end{pmatrix}.

In conclusion: the matrix A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix} has two eigenvalues and two associated normalised eigenvectors:

λ_1 = 1, λ_2 = 2

X_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ −1 \end{pmatrix} \qquad X_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}
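Both eigenpairs of Example 5 can be confirmed directly (a pure-Python sketch, not part of the Workbook): multiplying A by each normalised eigenvector should simply rescale it by the corresponding eigenvalue.

```python
import math

A = [[1, 0], [1, 2]]

def matvec(a, x):
    # 2x2 matrix times 2-vector
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]

X1 = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # eigenvalue 1
X2 = [0.0, 1.0]                              # eigenvalue 2

assert all(abs(p - 1 * q) < 1e-12 for p, q in zip(matvec(A, X1), X1))
assert all(abs(p - 2 * q) < 1e-12 for p, q in zip(matvec(A, X2), X2))
print("both eigenpairs verified")
```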

Example 6
Find the eigenvalues and eigenvectors of the 3 × 3 matrix
 
A = \begin{pmatrix} 2 & −1 & 0 \\ −1 & 2 & −1 \\ 0 & −1 & 2 \end{pmatrix}

Solution
The eigenvalues and eigenvectors are found by solving the eigenvalue problem
 
AX = λX where X = \begin{pmatrix} x \\ y \\ z \end{pmatrix}
Proceeding as in Example 5:
(A − λI)X = 0 and non-trivial solutions for X will exist if det (A − λI) = 0


Solution (contd.)
that is,

det\left[ \begin{pmatrix} 2 & −1 & 0 \\ −1 & 2 & −1 \\ 0 & −1 & 2 \end{pmatrix} − λ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right] = 0

i.e. \begin{vmatrix} 2 − λ & −1 & 0 \\ −1 & 2 − λ & −1 \\ 0 & −1 & 2 − λ \end{vmatrix} = 0.

Expanding this determinant we find:

(2 − λ) \begin{vmatrix} 2 − λ & −1 \\ −1 & 2 − λ \end{vmatrix} + \begin{vmatrix} −1 & −1 \\ 0 & 2 − λ \end{vmatrix} = 0

that is,

(2 − λ) {(2 − λ)^2 − 1} − (2 − λ) = 0

Taking out the common factor (2 − λ):

(2 − λ) {4 − 4λ + λ^2 − 1 − 1} = 0

which gives: (2 − λ) [λ^2 − 4λ + 2] = 0.

This is easily solved to give: λ = 2 or λ = \frac{4 ± \sqrt{16 − 8}}{2} = 2 ± \sqrt{2}.
So (typically) we have found three possible values of λ for this 3 × 3 matrix A.
To each value of λ there corresponds an eigenvector.

Case 1: λ = 2 − \sqrt{2} (lowest eigenvalue)

Then AX = (2 − \sqrt{2})X implies

2x − y = (2 − \sqrt{2})x
−x + 2y − z = (2 − \sqrt{2})y
−y + 2z = (2 − \sqrt{2})z

Simplifying

\sqrt{2}x − y = 0   (a)
−x + \sqrt{2}y − z = 0   (b)
−y + \sqrt{2}z = 0   (c)

We conclude the following:

(c) ⇒ y = \sqrt{2}z \qquad (a) ⇒ y = \sqrt{2}x

∴ these two relations give x = z; then (b) ⇒ −x + 2x − x = 0.
The last equation gives us no information; it simply states that 0 = 0.

Solution (contd.)
 
∴ X = \begin{pmatrix} x \\ \sqrt{2}x \\ x \end{pmatrix} for any x ≠ 0 (otherwise we would have the trivial solution). So the

eigenvectors corresponding to eigenvalue λ = 2 − \sqrt{2} are all proportional to \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}.

In normalised form we have an eigenvector \frac{1}{2} \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}.
Case 2: λ = 2
    
Here AX = 2X implies \begin{pmatrix} 2 & −1 & 0 \\ −1 & 2 & −1 \\ 0 & −1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 2 \begin{pmatrix} x \\ y \\ z \end{pmatrix}

i.e.

2x − y = 2x
−x + 2y − z = 2y
−y + 2z = 2z

After simplifying the equations become:

−y = 0   (a)
−x − z = 0   (b)
−y = 0   (c)

(a), (c) imply y = 0; (b) implies x = −z.

∴ the eigenvector has the form \begin{pmatrix} x \\ 0 \\ −x \end{pmatrix} for any x ≠ 0.

That is, eigenvectors corresponding to λ = 2 are all proportional to \begin{pmatrix} 1 \\ 0 \\ −1 \end{pmatrix}.

In normalised form we have an eigenvector \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ −1 \end{pmatrix}.


Solution (contd.)

Case 3: λ = 2 + \sqrt{2} (largest eigenvalue)

Proceeding along similar lines to Cases 1 and 2 above we find that the eigenvectors corresponding to
λ = 2 + \sqrt{2} are each proportional to \begin{pmatrix} 1 \\ −\sqrt{2} \\ 1 \end{pmatrix} with normalised eigenvector \frac{1}{2} \begin{pmatrix} 1 \\ −\sqrt{2} \\ 1 \end{pmatrix}.

In conclusion the matrix A has three distinct eigenvalues:

λ_1 = 2 − \sqrt{2}, \quad λ_2 = 2, \quad λ_3 = 2 + \sqrt{2}

and three corresponding normalised eigenvectors:

X_1 = \frac{1}{2} \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}, \quad X_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ −1 \end{pmatrix}, \quad X_3 = \frac{1}{2} \begin{pmatrix} 1 \\ −\sqrt{2} \\ 1 \end{pmatrix}
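The three eigenpairs of Example 6 can be checked numerically (a pure-Python sketch, not part of the Workbook), by confirming AX = λX for each normalised eigenvector:

```python
import math

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]

def matvec(a, x):
    # 3x3 matrix times 3-vector
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

s = math.sqrt(2)
pairs = [
    (2 - s, [0.5, s / 2, 0.5]),      # X1, lambda1 = 2 - sqrt(2)
    (2,     [1 / s, 0.0, -1 / s]),   # X2, lambda2 = 2
    (2 + s, [0.5, -s / 2, 0.5]),     # X3, lambda3 = 2 + sqrt(2)
]
for lam, X in pairs:
    assert all(abs(p - lam * q) < 1e-12 for p, q in zip(matvec(A, X), X))
print("all three eigenpairs verified")
```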

Exercise
Find the eigenvalues and eigenvectors of each of the following matrices A:

(a) \begin{pmatrix} 4 & −2 \\ 1 & 1 \end{pmatrix} \quad (b) \begin{pmatrix} 1 & 2 \\ −8 & 11 \end{pmatrix} \quad (c) \begin{pmatrix} 2 & 0 & −2 \\ 0 & 4 & 0 \\ −2 & 0 & 5 \end{pmatrix} \quad (d) \begin{pmatrix} 10 & −2 & 4 \\ −20 & 4 & −10 \\ −30 & 6 & −13 \end{pmatrix}

Answer (eigenvectors are written in normalised form)

(a) 3 and 2; \begin{pmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{pmatrix} and \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}

(b) 3 and 9; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} and \frac{1}{\sqrt{17}} \begin{pmatrix} 1 \\ 4 \end{pmatrix}

(c) 1, 4 and 6; \frac{1}{\sqrt{5}} \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}; \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}; \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ 0 \\ −2 \end{pmatrix}

(d) 0, −1 and 2; \frac{1}{\sqrt{26}} \begin{pmatrix} 1 \\ 5 \\ 0 \end{pmatrix}; \frac{1}{\sqrt{5}} \begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}; \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ 0 \\ −2 \end{pmatrix}

3. Properties of eigenvalues and eigenvectors
There are a number of general properties of eigenvalues and eigenvectors which you should be familiar
with. You will be able to use them as a check on some of your calculations.
Property 1: Sum of eigenvalues
For any square matrix A:

sum of eigenvalues = sum of diagonal terms of A (called the trace of A)


Formally, for an n × n matrix A: \sum_{i=1}^{n} λ_i = trace(A)

(Repeated eigenvalues must be counted according to their multiplicity.)

Thus if λ_1 = 4, λ_2 = 4, λ_3 = 1 then \sum_{i=1}^{3} λ_i = 9.

Property 2: Product of eigenvalues


For any square matrix A:

product of eigenvalues = determinant of A


Formally: λ_1 λ_2 λ_3 · · · λ_n = \prod_{i=1}^{n} λ_i = det(A)

The symbol \prod simply denotes multiplication, just as \sum denotes summation.

Example 7
Verify Properties 1 and 2 for the 3 × 3 matrix:
 
A = \begin{pmatrix} 2 & −1 & 0 \\ −1 & 2 & −1 \\ 0 & −1 & 2 \end{pmatrix}
whose eigenvalues were found earlier.

Solution
The three eigenvalues of this matrix are:

λ_1 = 2 − \sqrt{2}, \quad λ_2 = 2, \quad λ_3 = 2 + \sqrt{2}

Therefore

λ_1 + λ_2 + λ_3 = (2 − \sqrt{2}) + 2 + (2 + \sqrt{2}) = 6 = trace(A)

whilst λ_1 λ_2 λ_3 = (2 − \sqrt{2})(2)(2 + \sqrt{2}) = 2(4 − 2) = 4 = det(A)
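The same verification can be run numerically (a pure-Python sketch, not part of the Workbook; the eigenvalues are taken from the worked solution above):

```python
import math

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
eigs = [2 - math.sqrt(2), 2, 2 + math.sqrt(2)]

trace = sum(A[i][i] for i in range(3))
det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
       - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
       + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

assert abs(sum(eigs) - trace) < 1e-12                       # Property 1
assert abs(eigs[0] * eigs[1] * eigs[2] - det) < 1e-12       # Property 2
print(trace, det)  # 6 4
```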


Property 3: Linear independence of eigenvectors


Eigenvectors of a matrix A corresponding to distinct eigenvalues are linearly independent i.e. one
eigenvector cannot be written as a linear sum of the other eigenvectors. The proof of this result is
omitted but we illustrate this property with two examples.
We saw earlier that the matrix

A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}

has distinct eigenvalues λ_1 = 1, λ_2 = 2 with associated eigenvectors

X^{(1)} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ −1 \end{pmatrix} \qquad X^{(2)} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

respectively.
Clearly X (1) is not a constant multiple of X (2) and these eigenvectors are linearly independent.
We also saw that the 3 × 3 matrix

A = \begin{pmatrix} 2 & −1 & 0 \\ −1 & 2 & −1 \\ 0 & −1 & 2 \end{pmatrix}

had the distinct eigenvalues λ_1 = 2 − \sqrt{2}, λ_2 = 2, λ_3 = 2 + \sqrt{2} with corresponding
eigenvectors of the form shown:

X^{(1)} = \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}, \quad X^{(2)} = \begin{pmatrix} 1 \\ 0 \\ −1 \end{pmatrix}, \quad X^{(3)} = \begin{pmatrix} 1 \\ −\sqrt{2} \\ 1 \end{pmatrix}
Clearly none of these eigenvectors is a constant multiple of any other. Nor is any one obtainable as
a linear combination of the other two. The three eigenvectors are linearly independent.
Property 4: Eigenvalues of diagonal matrices
A 2 × 2 diagonal matrix D has the form

D = \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}

The characteristic equation

|D − λI| = 0 is \begin{vmatrix} a − λ & 0 \\ 0 & d − λ \end{vmatrix} = 0

i.e. (a − λ)(d − λ) = 0

So the eigenvalues are simply the diagonal elements a and d.
Similarly a 3 × 3 diagonal matrix has the form

D = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}

having characteristic equation

|D − λI| = (a − λ)(b − λ)(c − λ) = 0
so again the diagonal elements are the eigenvalues.
We can see that a diagonal matrix is a particularly simple matrix to work with. In addition to the
eigenvalues being obtainable immediately by inspection it is exceptionally easy to multiply diagonal
matrices.

Task
Obtain the products D_1 D_2 and D_2 D_1 of the diagonal matrices

D_1 = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \qquad D_2 = \begin{pmatrix} e & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & g \end{pmatrix}

Your solution

Answer

D_1 D_2 = D_2 D_1 = \begin{pmatrix} ae & 0 & 0 \\ 0 & bf & 0 \\ 0 & 0 & cg \end{pmatrix}
which of course is also a diagonal matrix.

Exercise
If λ1 , λ2 , . . . λn are the eigenvalues of a matrix A, prove the following:

(a) A^T has eigenvalues λ_1, λ_2, . . . , λ_n.


(b) If A is upper triangular, then its eigenvalues are exactly the main diagonal entries.
(c) The inverse matrix A^{−1} has eigenvalues \frac{1}{λ_1}, \frac{1}{λ_2}, . . . , \frac{1}{λ_n}.
(d) The matrix A − kI has eigenvalues λ1 − k, λ2 − k, . . . λn − k.
(e) (Harder) The matrix A^2 has eigenvalues λ_1^2, λ_2^2, . . . , λ_n^2.
(f) (Harder) The matrix A^k (k a non-negative integer) has eigenvalues λ_1^k, λ_2^k, . . . , λ_n^k.

Verify the above results for any 2 × 2 matrix and any 3 × 3 matrix found in the previous Exercises
on page 13.
N.B. Some of these results are useful in the numerical calculation of eigenvalues which we shall
consider later.


Answer

(a) Using the property that for any square matrix A, det(A) = det(A^T), we see that if

det(A − λI) = 0 then det((A − λI)^T) = 0

This immediately tells us that det(A^T − λI) = 0, which shows that λ is also an eigenvalue
of A^T.

(b) Here simply write down a typical upper triangular matrix U which has terms on the leading
diagonal u11 , u22 , . . . , unn and above it. Then construct (U − λI). Finally imagine how
you would then obtain det(U −λI) = 0. You should see that the determinant is obtained
by multiplying together those terms on the leading diagonal. Here the characteristic
equation is:

(u11 − λ)(u22 − λ) . . . (unn − λ) = 0

This polynomial has the obvious roots λ1 = u11 , λ2 = u22 , . . . , λn = unn .


(c) Here we begin with the usual eigenvalue problem AX = λX. If A has an inverse A^{−1}
we can multiply both sides by A^{−1} on the left to give

A^{−1}(AX) = A^{−1}λX which gives X = λA^{−1}X

or, dividing through by the scalar λ, we get

A^{−1}X = \frac{1}{λ}X which shows that if λ and X are respectively an eigenvalue and
eigenvector of A then λ^{−1} and X are respectively an eigenvalue and eigenvector of A^{−1}.

As an example consider A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}. This matrix has eigenvalues λ_1 = −1, λ_2 = 5

with corresponding eigenvectors X_1 = \begin{pmatrix} 1 \\ −1 \end{pmatrix} and X_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. The reader should

verify (by direct multiplication) that A^{−1} = −\frac{1}{5} \begin{pmatrix} 2 & −3 \\ −3 & 2 \end{pmatrix} has eigenvalues −1 and \frac{1}{5}

with respective eigenvectors X_1 = \begin{pmatrix} 1 \\ −1 \end{pmatrix} and X_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.

(d) (e) and (f) are proved in similar way to the proof outlined in (c).
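Result (c) can be verified numerically for the 2 × 2 example in the Answer (a pure-Python sketch, not part of the Workbook):

```python
A = [[2, 3], [3, 2]]
d = A[0][0] * A[1][1] - A[0][1] * A[1][0]        # det(A) = -5
Ainv = [[A[1][1] / d, -A[0][1] / d],             # 2x2 inverse via the adjugate
        [-A[1][0] / d, A[0][0] / d]]

def matvec(a, x):
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]

# The eigenvectors of A are also eigenvectors of A^{-1},
# with eigenvalues -1 and 1/5 respectively.
for lam, X in [(-1.0, [1, -1]), (0.2, [1, 1])]:
    assert all(abs(p - lam * q) < 1e-12 for p, q in zip(matvec(Ainv, X), X))
print("A inverse has eigenvalues -1 and 1/5")
```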

22.2 Applications of Eigenvalues and Eigenvectors

Introduction
Many applications of matrices in both engineering and science utilize eigenvalues and, sometimes,
eigenvectors. Control theory, vibration analysis, electric circuits, advanced dynamics and quantum
mechanics are just a few of the application areas.
Many of the applications involve the use of eigenvalues and eigenvectors in the process of trans-
forming a given matrix into a diagonal matrix and we discuss this process in this Section. We then
go on to show how this process is invaluable in solving coupled differential equations of both first
order and second order.

Prerequisites
Before starting this Section you should . . .
• have a knowledge of determinants and matrices
• have a knowledge of linear first order differential equations

Learning Outcomes
On completion you should be able to . . .
• diagonalize a matrix with distinct eigenvalues using the modal matrix
• solve systems of linear differential equations by the ‘decoupling’ method


1. Diagonalization of a matrix with distinct eigenvalues


Diagonalization means transforming a non-diagonal matrix into an equivalent matrix which is diagonal
and hence is simpler to deal with.
A matrix A with distinct eigenvalues has, as we mentioned in Property 3 in 22.1, eigenvectors
which are linearly independent. If we form a matrix P whose columns are these eigenvectors, it can
be shown that
det(P) ≠ 0
so that P −1 exists.
The product P −1 AP is then a diagonal matrix D whose diagonal elements are the eigenval-
ues of A. Thus if λ1 , λ2 , . . . λn are the distinct eigenvalues of A with associated eigenvectors
X (1) , X (2) , . . . , X (n) respectively, then
P = [X^{(1)} \,\vdots\, X^{(2)} \,\vdots\, · · · \,\vdots\, X^{(n)}]

will produce a product

P^{−1}AP = D = \begin{pmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ & & \ddots & \\ 0 & \dots & \dots & λ_n \end{pmatrix}
We see that the order of the eigenvalues in D matches the order in which P is formed from the
eigenvectors.
N.B.

(a) The matrix P is called the modal matrix of A


(b) Since D is a diagonal matrix with eigenvalues λ1 , λ2 , . . . , λn which are the same as those
of A, then the matrices D and A are said to be similar.
(c) The transformation of A into D using

P −1 AP = D

is said to be a similarity transformation.

Example 8

Let A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}. Obtain the modal matrix P and calculate the product P^{−1}AP.

(The eigenvalues and eigenvectors of this particular matrix A were obtained earlier
in this Workbook, in Example 4.)

Solution
The matrix A has two distinct eigenvalues λ_1 = −1, λ_2 = 5 with corresponding eigenvectors

X_1 = \begin{pmatrix} x \\ −x \end{pmatrix} and X_2 = \begin{pmatrix} x \\ x \end{pmatrix}. We can therefore form the modal matrix from the simplest
eigenvectors of these forms:

P = \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix}

(Other eigenvectors would be acceptable, e.g. we could use P = \begin{pmatrix} 2 & 3 \\ −2 & 3 \end{pmatrix}, but there is no reason
to over-complicate the calculation.)

It is easy to obtain the inverse of this 2 × 2 matrix P and the reader should confirm that:

P^{−1} = \frac{1}{det(P)} adj(P) = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix}^T = \frac{1}{2} \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix}
We can now construct the product P^{−1}AP:

P^{−1}AP = \frac{1}{2} \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix}
         = \frac{1}{2} \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} −1 & 5 \\ 1 & 5 \end{pmatrix}
         = \frac{1}{2} \begin{pmatrix} −2 & 0 \\ 0 & 10 \end{pmatrix}
         = \begin{pmatrix} −1 & 0 \\ 0 & 5 \end{pmatrix}

which is a diagonal matrix with the eigenvalues as its entries, as expected. Show (by repeating the method

outlined above) that had we defined P = \begin{pmatrix} 1 & 1 \\ 1 & −1 \end{pmatrix} (i.e. interchanged the order in which the

eigenvectors were taken) we would find P^{−1}AP = \begin{pmatrix} 5 & 0 \\ 0 & −1 \end{pmatrix} (i.e. the resulting diagonal elements
would also be interchanged).
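The product computed in Example 8 can be reproduced with a short pure-Python check (a sketch, not part of the Workbook):

```python
def matmul(a, b):
    # 2x2 matrix product
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 3], [3, 2]]
P = [[1, 1], [-1, 1]]
Pinv = [[0.5, -0.5], [0.5, 0.5]]   # (1/det(P)) adj(P), det(P) = 2

D = matmul(Pinv, matmul(A, P))
print(D)  # [[-1.0, 0.0], [0.0, 5.0]]
```

The diagonal entries are the eigenvalues −1 and 5, in the order the eigenvectors were placed into P.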


Task

The matrix A = \begin{pmatrix} −1 & 4 \\ 0 & 3 \end{pmatrix} has eigenvalues −1 and 3 with respective

eigenvectors \begin{pmatrix} 1 \\ 0 \end{pmatrix} and \begin{pmatrix} 1 \\ 1 \end{pmatrix}.

If P_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, P_2 = \begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix}, P_3 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, write down the

products P_1^{−1}AP_1, P_2^{−1}AP_2, P_3^{−1}AP_3.
(You may not need to do detailed calculations.)
(You may not need to do detailed calculations.)

Your solution

Answer

P_1^{−1}AP_1 = \begin{pmatrix} −1 & 0 \\ 0 & 3 \end{pmatrix} = D_1 \qquad P_2^{−1}AP_2 = \begin{pmatrix} −1 & 0 \\ 0 & 3 \end{pmatrix} = D_2 \qquad P_3^{−1}AP_3 = \begin{pmatrix} 3 & 0 \\ 0 & −1 \end{pmatrix} = D_3
Note that D1 = D2 , demonstrating that any eigenvectors of A can be used to form P . Note also
that since the columns of P1 have been interchanged in forming P3 then so have the eigenvalues in
D3 as compared with D1 .

Matrix powers
If P −1 AP = D then we can obtain A (i.e. make A the subject of this matrix equation) as follows:
Multiplying on the left by P and on the right by P −1 we obtain
P P −1 AP P −1 = P DP −1
Now using the fact that P P −1 = P −1 P = I we obtain
IAI = P DP −1 and so
A = P DP −1
We can use this result to obtain the powers of a square matrix, a process which is sometimes useful
in control theory. Note that
A2 = A.A A3 = A.A.A. etc.
Clearly, obtaining high powers of A directly would in general involve many multiplications. The
process is quite straightforward, however, for a diagonal matrix D, as this next Task shows.

Task

Obtain D^2 and D^3 if D = \begin{pmatrix} 3 & 0 \\ 0 & −2 \end{pmatrix}. Write down D^{10}.

Your solution

Answer

D^2 = \begin{pmatrix} 3 & 0 \\ 0 & −2 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & −2 \end{pmatrix} = \begin{pmatrix} 3^2 & 0 \\ 0 & (−2)^2 \end{pmatrix} = \begin{pmatrix} 9 & 0 \\ 0 & 4 \end{pmatrix}

D^3 = \begin{pmatrix} 3^2 & 0 \\ 0 & (−2)^2 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & −2 \end{pmatrix} = \begin{pmatrix} 3^3 & 0 \\ 0 & (−2)^3 \end{pmatrix} = \begin{pmatrix} 27 & 0 \\ 0 & −8 \end{pmatrix}

Continuing in this way: D^{10} = \begin{pmatrix} 3^{10} & 0 \\ 0 & (−2)^{10} \end{pmatrix} = \begin{pmatrix} 59049 & 0 \\ 0 & 1024 \end{pmatrix}
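Since a power of a diagonal matrix is found entry by entry, the Task answers amount to a one-line computation (a pure-Python sketch, not part of the Workbook):

```python
def diag_power(d, k):
    # d holds the diagonal entries of D; D^k has entries d_i^k
    return [x ** k for x in d]

print(diag_power([3, -2], 2))   # [9, 4]
print(diag_power([3, -2], 3))   # [27, -8]
print(diag_power([3, -2], 10))  # [59049, 1024]
```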

We now use the relation A = P DP −1 to obtain a formula for powers of A in terms of the easily
calculated powers of the diagonal matrix D:
A2 = A.A = (P DP −1 )(P DP −1 ) = P D(P −1 P )DP −1 = P DIDP −1 = P D2 P −1
Similarly: A3 = A2 .A = (P D2 P −1 )(P DP −1 ) = P D2 (P −1 P )DP −1 = P D3 P −1
The general result is given in the following Key Point:

Key Point 2

For a matrix A with distinct eigenvalues λ_1, λ_2, . . . , λ_n and associated eigenvectors
X^{(1)}, X^{(2)}, . . . , X^{(n)}, if

P = [X^{(1)} : X^{(2)} : . . . : X^{(n)}]

then D = P^{−1}AP is a diagonal matrix such that

D = \begin{pmatrix} λ_1 & & & \\ & λ_2 & & \\ & & \ddots & \\ & & & λ_n \end{pmatrix} and A^k = P D^k P^{−1}


Example 9

If A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix} find A^{23}. (Use the results of Example 8.)

Solution

We know from Example 8 that if P = \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix} then P^{−1}AP = \begin{pmatrix} −1 & 0 \\ 0 & 5 \end{pmatrix} = D

where P^{−1} = \frac{1}{2} \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix}

∴ A = PDP^{−1} and A^{23} = PD^{23}P^{−1}, using the general result in Key Point 2,

i.e. A^{23} = \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix} \begin{pmatrix} −1 & 0 \\ 0 & 5^{23} \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix}

which is easily evaluated.
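The two routes to A^{23} can be compared directly (a pure-Python sketch, not part of the Workbook; integer arithmetic is used throughout, keeping the factor 1/2 of P^{−1} until the end so nothing is lost to rounding):

```python
def matmul(a, b):
    # 2x2 matrix product
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 3], [3, 2]]
P = [[1, 1], [-1, 1]]
D23 = [[(-1) ** 23, 0], [0, 5 ** 23]]

# P^{-1} = (1/2) [[1, -1], [1, 1]]; apply the 1/2 after multiplying
via_diag = matmul(P, matmul(D23, [[1, -1], [1, 1]]))
via_diag = [[v // 2 for v in row] for row in via_diag]  # every entry is even

# Direct route: multiply A by itself 23 times
direct = [[1, 0], [0, 1]]
for _ in range(23):
    direct = matmul(direct, A)

assert via_diag == direct
print("A^23 agrees by both methods")
```

The diagonalized route needs only two matrix products, however large the exponent, which is why the method is useful in, for example, control theory.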

Exercise

Find a diagonalizing matrix P if

(a) A = \begin{pmatrix} 4 & 2 \\ −1 & 1 \end{pmatrix} \qquad (b) A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 2 & 0 \\ 2 & −2 & 3 \end{pmatrix}

Verify, in each case, that P^{−1}AP is diagonal, with the eigenvalues of A as its diagonal elements.

Answer

(a) P = \begin{pmatrix} −1 & −2 \\ 1 & 1 \end{pmatrix}, \quad P^{−1}AP = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}

(b) P = \begin{pmatrix} 1 & 0 & 0 \\ −1 & 1 & 0 \\ −2 & 2 & 1 \end{pmatrix}, \quad P^{−1}AP = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}

2. Systems of first order differential equations
Systems of first order ordinary differential equations arise in many areas of mathematics and engi-
neering, for example in control theory and in the analysis of electrical circuits. In each case the basic
unknowns are each a function of the time variable t. A number of techniques have been developed
to solve such systems of equations; for example the Laplace transform. Here we shall use eigenvalues
and eigenvectors to obtain the solution. Our first step will be to recast the system of ordinary differ-
ential equations in the matrix form Ẋ = AX where A is an n × n coefficient matrix of constants,
X is the n × 1 column vector of unknown functions and Ẋ is the n × 1 column vector containing the
derivatives of the unknowns. The main step will be to use the modal matrix of A to diagonalise
the system of differential equations. This process will transform Ẋ = AX into the form Ẏ = DY
where D is a diagonal matrix. We shall find that this new diagonal system of differential equations
can be easily solved. This special solution will allow us to obtain the solution of the original system.

Task
Obtain the solutions of the pair of first order differential equations

ẋ = −2x
ẏ = −5y        (1)

given the initial conditions

x(0) = 3 i.e. x = 3 at t = 0
y(0) = 2 i.e. y = 2 at t = 0

(The notation is that ẋ ≡ \frac{dx}{dt}, ẏ ≡ \frac{dy}{dt}.)

[Hint: Recall, from your study of differential equations, that the general solution
of the differential equation \frac{dy}{dt} = Ky is y = y_0 e^{Kt}.]

Your solution

Answer
Using the hint: x = x0 e−2t y = y0 e−5t where x0 = x(0) and y0 = y(0).
From the given initial condition x0 = 3 y0 = 2 so finally x = 3e−2t y = 2e−5t .


In the above Task although we had two differential equations to solve they were really quite separate.
We needed no knowledge of matrix theory to solve them. However, we should note that the two
differential equations can be written in matrix form.
     
Thus if X = \begin{pmatrix} x \\ y \end{pmatrix}, Ẋ = \begin{pmatrix} ẋ \\ ẏ \end{pmatrix}, A = \begin{pmatrix} −2 & 0 \\ 0 & −5 \end{pmatrix}

the two equations (1) can be written as

\begin{pmatrix} ẋ \\ ẏ \end{pmatrix} = \begin{pmatrix} −2 & 0 \\ 0 & −5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
i.e. Ẋ = AX.

Task
Write in matrix form the pair of coupled differential equations

ẋ = 4x + 2y
(2)
ẏ = −x + y

Your solution

Answer
    
\begin{pmatrix} ẋ \\ ẏ \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ −1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
Ẋ = AX

The essential difference between the two pairs of differential equations just considered is that the
pair (1) were really separate equations whereas pair (2) were coupled:

• The first equation of (1) involves only the unknown x, and the second involves only y. In matrix
terms this corresponded to a diagonal matrix A in the system Ẋ = AX.

• The pair (2) were coupled in that both equations involved both x and y. This corresponded
to the non-diagonal matrix A in the system Ẋ = AX which you found in the last Task.

Clearly the second system here is more difficult to deal with than the first and this is where we can
use our knowledge of diagonalization.
Consider a system of differential equations written in matrix form: Ẋ = AX where
   
x(t) ẋ(t)
X= and Ẋ =
y(t) ẏ(t)
 
r(t)
We now introduce a new column vector of unknowns Y = through the relation
s(t)

X = PY
where P is the modal matrix of A. Then, since P is a matrix of constants:
Ẋ = P Ẏ so Ẋ = AX becomes P Ẏ = A(P Y )
Then, multiplying by P −1 on the left, Ẏ = (P −1 AP )Y
But, because of the properties of the modal matrix, we know that P −1 AP is a diagonal matrix.
Thus if λ1 , λ2 are distinct eigenvalues of A then:
 
P^{−1}AP = \begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix}

Hence Ẏ = (P^{−1}AP)Y becomes

\begin{pmatrix} ṙ \\ ṡ \end{pmatrix} = \begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix} \begin{pmatrix} r \\ s \end{pmatrix}.

That is, when written out we have

ṙ = λ_1 r
ṡ = λ_2 s.

These equations are decoupled. The first equation only involves the unknown function r(t) and
has solution r(t) = Ceλ1 t . The second equation only involves the unknown function s(t) and has
solution s(t) = Keλ2 t . [C, K are arbitrary constants.]
Once r, s are known the original unknowns x, y can be found from the relation X = P Y .
Note that the theory outlined above is more widely applicable as specified in the next Key Point:

Key Point 3
For any system of differential equations of the form

Ẋ = AX

where A is an n × n matrix with distinct eigenvalues λ_1, λ_2, . . . , λ_n, and t is the independent variable,
the solution is
X = PY
where P is the modal matrix of A and

Y = [C1 eλ1 t , C2 eλ2 t , . . . , Cn eλn t ]T


Example 10
Find the solution of the coupled differential equations
ẋ = 4x + 2y
ẏ = −x + y with initial conditions x(0) = 1 y(0) = 0
Here ẋ ≡ \frac{dx}{dt} and ẏ ≡ \frac{dy}{dt}.

Solution
 
Here A = \begin{pmatrix} 4 & 2 \\ −1 & 1 \end{pmatrix}. It is easily checked that A has distinct eigenvalues λ_1 = 3, λ_2 = 2 and

corresponding eigenvectors X_1 = \begin{pmatrix} −2 \\ 1 \end{pmatrix}, X_2 = \begin{pmatrix} 1 \\ −1 \end{pmatrix}.

Therefore, taking P = \begin{pmatrix} −2 & 1 \\ 1 & −1 \end{pmatrix} then P^{−1}AP = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}

and using Key Point 3, r(t) = Ce^{3t}, s(t) = Ke^{2t}.

So \begin{pmatrix} x \\ y \end{pmatrix} ≡ X = PY = \begin{pmatrix} −2 & 1 \\ 1 & −1 \end{pmatrix} \begin{pmatrix} r \\ s \end{pmatrix} = \begin{pmatrix} −2 & 1 \\ 1 & −1 \end{pmatrix} \begin{pmatrix} Ce^{3t} \\ Ke^{2t} \end{pmatrix} = \begin{pmatrix} −2Ce^{3t} + Ke^{2t} \\ Ce^{3t} − Ke^{2t} \end{pmatrix}.

Therefore x = −2Ce^{3t} + Ke^{2t} and y = Ce^{3t} − Ke^{2t}.


We can now impose the initial conditions x(0) = 1 and y(0) = 0 to give

    1 = −2C + K
    0 = C − K.

Thus C = K = −1 and the solution to the original system of differential equations is

    x(t) = 2e^{3t} − e^{2t}
    y(t) = −e^{3t} + e^{2t}
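As a quick sanity check (ours, not in the original Workbook, assuming `numpy`), the closed-form solution of Example 10 can be verified by substituting it and its exact derivatives back into the system:

```python
import numpy as np

# closed-form solution of Example 10
t = np.linspace(0.0, 1.0, 21)
x = 2*np.exp(3*t) - np.exp(2*t)
y = -np.exp(3*t) + np.exp(2*t)

# exact derivatives of x(t) and y(t)
xdot = 6*np.exp(3*t) - 2*np.exp(2*t)
ydot = -3*np.exp(3*t) + 2*np.exp(2*t)

# the pair should satisfy xdot = 4x + 2y and ydot = -x + y at every t
ok = bool(np.allclose(xdot, 4*x + 2*y) and np.allclose(ydot, -x + y))
```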

The approach we have demonstrated in Example 10 can be extended to

(a) Systems of first order differential equations with n unknowns (Key Point 3)
(b) Systems of second order differential equations (described in the next subsection).

The only restriction, as we have said, is that the matrix A in the system Ẋ = AX has distinct
eigenvalues.

3. Systems of second order differential equations
The decoupling method discussed above can be readily extended to this situation which could arise,
for example, in a mechanical system consisting of coupled springs.
A typical example of such a system with two unknowns has the form
ẍ = ax + by      ÿ = cx + dy

or, in matrix form, Ẍ = AX where

    X = [ x ]      A = [ a  b ]      ẍ = d²x/dt²,  ÿ = d²y/dt²
        [ y ]          [ c  d ]

Task
Make the substitution X = P Y where Y = [r(t), s(t)]T and P is the modal matrix
of A, A being assumed here to have distinct eigenvalues λ1 and λ2 . Solve the
resulting pair of decoupled equations for the case, which arises in practice, where
λ1 and λ2 are both negative.

Your solution

Answer
Exactly as with a first order system, putting X = P Y into the second order system Ẍ = AX gives

    Ÿ = P⁻¹AP Y,   that is,   Ÿ = DY   where   D = [ λ1  0  ]   and   Ÿ = [ r̈ ]
                                                   [ 0   λ2 ]            [ s̈ ]
so
    [ r̈ ]   [ λ1  0  ] [ r ]
    [ s̈ ] = [ 0   λ2 ] [ s ]

That is, r̈ = λ1 r = −ω1² r and s̈ = λ2 s = −ω2² s (where λ1 and λ2 are both negative).
The two decoupled equations are of the form of the differential equation governing simple harmonic
motion. Hence the general solution is
r = K cos ω1 t + L sin ω1 t and s = M cos ω2 t + N sin ω2 t
The solutions for x and y are then obtained by use of X = P Y.
Note that in this second order case four initial conditions, two each for both x and y, are required
because four constants K, L, M, N arise in the solution.
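The same construction can be sketched numerically. This is our own illustration (assuming `numpy`); the matrix and initial data are arbitrary sample choices with both eigenvalues negative, as in the Task.

```python
import numpy as np

A = np.array([[-3.0, 2.0], [2.0, -3.0]])   # eigenvalues -1 and -5, both negative
lam, P = np.linalg.eig(A)
omega = np.sqrt(-lam)                      # frequencies of the decoupled SHM equations

X0 = np.array([1.0, 0.0])                  # initial displacements (sample values)
V0 = np.array([0.0, 0.0])                  # initial velocities (sample values)
K = np.linalg.solve(P, X0)                 # r_i(0) = K_i
L = np.linalg.solve(P, V0) / omega         # r_i'(0) = omega_i L_i

def X(t):
    """x(t), y(t) recovered from the decoupled r_i = K_i cos(w_i t) + L_i sin(w_i t)."""
    return P @ (K*np.cos(omega*t) + L*np.sin(omega*t))
```

A finite-difference check of Ẍ ≈ AX along the computed solution confirms the construction.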


Exercises
1. Solve by decoupling each of the following first order systems:

    (a) dX/dt = AX where A = [ 3  4  ],   X(0) = [ 1 ]
                             [ 4  −3 ]           [ 3 ]

    (b) ẋ1 = x2,  ẋ2 = x1 + 3x3,  ẋ3 = x2  with x1(0) = 2, x2(0) = 0, x3(0) = 2

    (c) dX/dt = [ 2  2  1 ] X,   with X(0) = [ 1 ]
                [ 1  3  1 ]                  [ 0 ]
                [ 1  2  2 ]                  [ 0 ]

    (d) ẋ1 = x1,  ẋ2 = −2x2 + x3,  ẋ3 = 4x2 + x3  with x1(0) = x2(0) = x3(0) = 1

2. Matrix methods can be used to solve systems of second order differential equations such as
might arise with coupled electrical or mechanical systems. For example the motion of two
masses m1 and m2 vibrating on coupled springs, neglecting damping and spring masses, is
governed by

m1 ÿ1 = −k1 y1 + k2 (y2 − y1 )


m2 ÿ2 = −k2 (y2 − y1 )

where dots denote derivatives with respect to time.

Write this system as a matrix equation Ÿ = AY and use the decoupling method to find Y if

(a) m1 = m2 = 1, k1 = 3, k2 = 2
and the initial conditions are y1(0) = 1, y2(0) = 2, ẏ1(0) = −2√6, ẏ2(0) = √6

(b) m1 = m2 = 1, k1 = 6, k2 = 4
and the initial conditions are y1(0) = y2(0) = 0, ẏ1(0) = √2, ẏ2(0) = 2√2

Verify your solutions by substitution in each case.

Answers

1. (a) X = [ 2e^{5t} − e^{−5t} ]      (b) X = [ 2 cosh 2t ]
           [ e^{5t} + 2e^{−5t} ]              [ 4 sinh 2t ]
                                              [ 2 cosh 2t ]

   (c) X = (1/4) [ e^{5t} + 3e^{t} ]      (d) X = (1/5) [ 5e^{t}             ]
                 [ e^{5t} − e^{t}  ]                    [ 2e^{2t} + 3e^{−3t} ]
                 [ e^{5t} − e^{t}  ]                    [ 8e^{2t} − 3e^{−3t} ]

2. (a) Y = [ cos t − 2 sin √6 t ]      (b) Y = [ sin √2 t   ]
           [ 2 cos t + sin √6 t ]              [ 2 sin √2 t ]

Repeated Eigenvalues and Symmetric Matrices      22.3

Introduction
In this Section we further develop the theory of eigenvalues and eigenvectors in two distinct directions.
Firstly we look at matrices where one or more of the eigenvalues is repeated. We shall see that this
sometimes (but not always) causes problems in the diagonalization process that was discussed in the
previous Section. We shall then consider the special properties possessed by symmetric matrices
which make them particularly easy to work with.

Prerequisites
Before starting this Section you should . . .
• have a knowledge of determinants and matrices
• have a knowledge of linear first order differential equations

Learning Outcomes
On completion you should be able to . . .
• state the conditions under which a matrix with repeated eigenvalues may be diagonalized
• state the main properties of real symmetric matrices


1. Matrices with repeated eigenvalues


So far we have considered the diagonalization of matrices with distinct (i.e. non-repeated) eigen-
values. We have accomplished this by the use of a non-singular modal matrix P (i.e. one where
det P ≠ 0 and hence the inverse P⁻¹ exists). We now want to discuss briefly the case of a matrix A with at least one pair of repeated eigenvalues. We shall see that for some such matrices
diagonalization is possible but for others it is not.
The crucial question is whether we can form a non-singular modal matrix P with the eigenvectors of
A as its columns.
Example
Consider the matrix

    A = [ 1   0 ]
        [ −4  1 ]
which has characteristic equation
det(A − λI) = (1 − λ)(1 − λ) = 0.
So the only eigenvalue is 1 which is repeated or, more formally, has multiplicity 2.
To obtain eigenvectors of A corresponding to λ = 1 we proceed as usual and solve
AX = 1X
or

    [ 1   0 ] [ x ]   [ x ]
    [ −4  1 ] [ y ] = [ y ]

implying

    x = x   and   −4x + y = y

from which x = 0 and y is arbitrary.
Thus possible eigenvectors are [0, −1]T, [0, 1]T, [0, 2]T, [0, 3]T, . . .
However, if we attempt to form a modal matrix P from any two of these eigenvectors, e.g. [0, −1]T and [0, 1]T, then the resulting matrix

    P = [ 0   0 ]
        [ −1  1 ]

has zero determinant. Thus P⁻¹ does not exist and the similarity transformation P⁻¹AP that we have used previously to diagonalize a matrix is not possible here.
The essential point, at a slightly deeper level, is that the columns of P in this case are not linearly independent since

    [ 0  ] = (−1) [ 0 ]
    [ −1 ]        [ 1 ]
i.e. one is a multiple of the other.
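This defect is easy to confirm numerically. A small check (ours, assuming `numpy`): the eigenspace of λ = 1 has dimension n − rank(A − I) = 1, so a second linearly independent eigenvector does not exist.

```python
import numpy as np

A = np.array([[1.0, 0.0], [-4.0, 1.0]])    # eigenvalue 1 with multiplicity 2
lam = np.linalg.eigvals(A)                 # both computed eigenvalues equal 1

# dimension of the eigenspace for lambda = 1:
eigenspace_dim = 2 - np.linalg.matrix_rank(A - np.eye(2))
```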
This situation is to be contrasted with that of a matrix with non-repeated eigenvalues.

Earlier, for example, we showed that the matrix

    A = [ 2  3 ]
        [ 3  2 ]

has the non-repeated eigenvalues λ1 = −1, λ2 = 5 with associated eigenvectors

    X1 = [ 1  ]      X2 = [ 1 ]
         [ −1 ]           [ 1 ]

These two eigenvectors are linearly independent, since [1, −1]T ≠ k [1, 1]T for any value of k ≠ 0.
Here the modal matrix

    P = [ 1   1 ]
        [ −1  1 ]

has linearly independent columns: so that det P ≠ 0 and P⁻¹ exists.
The general result, illustrated by this example, is given in the following Key Point.

Key Point 4
Eigenvectors corresponding to distinct eigenvalues are always linearly independent.

It follows from this that we can always diagonalize an n × n matrix with n distinct eigenvalues
since it will possess n linearly independent eigenvectors. We can then use these as the columns of
P , secure in the knowledge that these columns will be linearly independent and hence P −1 will exist.
It follows, in considering the case of repeated eigenvalues, that the key problem is whether or not
there are still n linearly independent eigenvectors for an n × n matrix.
We shall now consider two 3 × 3 cases as illustrations.

 
Task
Let A = [ −2  0  1  ]
        [ 1   1  0  ]
        [ 0   0  −2 ]
(a) Obtain the eigenvalues and eigenvectors of A.
(b) Can three linearly independent eigenvectors for A be obtained?
(c) Can A be diagonalized?


Your solution

Answer
(a) The characteristic equation of A is

    det(A − λI) = | −2 − λ    0        1      |
                  | 1         1 − λ    0      | = 0
                  | 0         0        −2 − λ |

i.e. (−2 − λ)(1 − λ)(−2 − λ) = 0, which gives λ = 1, λ = −2, λ = −2.

For λ = 1 the associated eigenvectors satisfy

    [ −2  0  1  ] [ x ]   [ x ]
    [ 1   1  0  ] [ y ] = [ y ]
    [ 0   0  −2 ] [ z ]   [ z ]

from which x = 0, z = 0 and y is arbitrary. Thus an eigenvector is X = α [0, 1, 0]T where α is arbitrary, α ≠ 0.

For the repeated eigenvalue λ = −2 we must solve AY = (−2)Y for the eigenvector Y:

    [ −2  0  1  ] [ x ]   [ −2x ]
    [ 1   1  0  ] [ y ] = [ −2y ]
    [ 0   0  −2 ] [ z ]   [ −2z ]

from which z = 0, x + 3y = 0, so the eigenvectors are of the form Y = β [−3, 1, 0]T where β ≠ 0 is arbitrary.
(b) X and Y are certainly linearly independent (as we would expect since they correspond to distinct
eigenvalues.) However, there is only one independent eigenvector of the form Y corresponding to
the repeated eigenvalue −2.
(c) The conclusion is that, since A is 3 × 3 and we can obtain only two linearly independent eigenvectors, A cannot be diagonalized.

 
Task
The matrix A = [ 5   −4   4  ]
               [ 12  −11  12 ]
               [ 4   −4   5  ]
has eigenvalues −3, 1, 1. The eigenvector corresponding to the eigenvalue −3 is X = [1, 3, 1]T or any multiple.
Investigate carefully the eigenvectors associated with the repeated eigenvalue λ = 1
and deduce whether A can be diagonalized.

Your solution

Answer
We must solve AY = (1)Y for the required eigenvector, i.e.

    [ 5   −4   4  ] [ x ]   [ x ]
    [ 12  −11  12 ] [ y ] = [ y ]
    [ 4   −4   5  ] [ z ]   [ z ]

Each equation here gives on simplification x − y + z = 0, so we have just one equation in three unknowns and can choose two of the values arbitrarily. The choices x = 1, y = 0 (and hence z = −1) and x = 0, y = 1 (and hence z = 1), for example, give rise to linearly independent eigenvectors

    Y1 = [1, 0, −1]T      Y2 = [0, 1, 1]T

We can thus form a non-singular modal matrix P from Y1 and Y2 together with X (given):

    P = [ 1  1   0 ]
        [ 3  0   1 ]
        [ 1  −1  1 ]

We can then indeed diagonalize A through the transformation

    P⁻¹AP = D = [ −3  0  0 ]
                [ 0   1  0 ]
                [ 0   0  1 ]
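This diagonalization is quick to confirm numerically. A hedged check of our own, assuming `numpy` is available:

```python
import numpy as np

A = np.array([[5.0, -4.0, 4.0],
              [12.0, -11.0, 12.0],
              [4.0, -4.0, 5.0]])
P = np.array([[1.0, 1.0, 0.0],             # columns are X, Y1, Y2 from the Answer
              [3.0, 0.0, 1.0],
              [1.0, -1.0, 1.0]])
D = np.linalg.inv(P) @ A @ P               # similarity transformation P^{-1} A P
```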


Key Point 5
An n × n matrix with repeated eigenvalues can be diagonalized provided we can obtain n linearly
independent eigenvectors for it. This will be the case if, for each repeated eigenvalue λi of multiplicity
mi > 1, we can obtain mi linearly independent eigenvectors.

2. Symmetric matrices
Symmetric matrices have a number of useful properties which we will investigate in this Section.

Task
Consider the following four matrices

    A1 = [ 3  1 ]      A2 = [ 3  1 ]
         [ 4  5 ]           [ 1  5 ]

    A3 = [ 5   8  7 ]      A4 = [ 5  8  7 ]
         [ −1  6  8 ]           [ 8  6  4 ]
         [ 3   4  0 ]           [ 7  4  0 ]
What property do the matrices A2 and A4 possess that A1 and A3 do not?

Your solution

Answer
Matrices A2 and A4 are symmetric across the principal diagonal. In other words transposing these matrices, i.e. interchanging their rows and columns, does not change them: A2ᵀ = A2 and A4ᵀ = A4.
This property does not hold for matrices A1 and A3 which are non-symmetric.

Calculating the eigenvalues of an n×n matrix with real elements involves, in principle at least, solving
an n th order polynomial equation, a quadratic equation if n = 2, a cubic equation if n = 3, and
so on. As is well known, such equations sometimes have only real solutions, but complex solutions
(occurring as complex conjugate pairs) can also arise. This situation can therefore arise with the
eigenvalues of matrices.

Task
Consider the non-symmetric matrix

    A = [ 2  −1 ]
        [ 5  −2 ]
Obtain the eigenvalues of A and show that they form a complex conjugate pair.

Your solution

Answer
The characteristic equation of A is

    det(A − λI) = | 2 − λ   −1     |
                  | 5       −2 − λ | = 0

i.e.

    −(2 − λ)(2 + λ) + 5 = 0   leading to   λ² + 1 = 0

giving eigenvalues ±i which are of course complex conjugates.

In particular any 2 × 2 matrix of the form

    A = [ a   b ]
        [ −b  a ]

has complex conjugate eigenvalues a ± ib.

A 3 × 3 example of a matrix with some complex eigenvalues is

    B = [ 1  −1  −1 ]
        [ 1  −1  0  ]
        [ 1  0   −1 ]

A straightforward calculation shows that the eigenvalues of B are

    λ = −1 (real),   λ = ±i (complex conjugates).

With symmetric matrices on the other hand, complex eigenvalues are not possible.


Key Point 6
The eigenvalues of a symmetric matrix with real elements are always real.

The general proof of this result in Key Point 6 is beyond our scope but a simple proof for symmetric 2 × 2 matrices is straightforward.

Let A = [ a  b ]
        [ b  c ]
be any 2 × 2 symmetric matrix, a, b, c being real numbers.
The characteristic equation for A is

    (a − λ)(c − λ) − b² = 0   or, expanding:   λ² − (a + c)λ + ac − b² = 0

from which

    λ = [ (a + c) ± √((a + c)² − 4ac + 4b²) ] / 2

The quantity under the square root sign can be treated as follows:

    (a + c)² − 4ac + 4b² = a² + c² + 2ac − 4ac + 4b² = (a − c)² + 4b²

which is never negative and hence λ cannot be complex.

Task
Obtain the eigenvalues and the eigenvectors of the symmetric 2 × 2 matrix

    A = [ 4   −2 ]
        [ −2  1  ]

Your solution

Answer
The characteristic equation for A is

    (4 − λ)(1 − λ) − 4 = 0   or   λ² − 5λ = 0

giving λ = 0 and λ = 5, both of which are of course real and also unequal (i.e. distinct). For the larger eigenvalue λ = 5 the eigenvector X = [x, y]T satisfies

    [ 4   −2 ] [ x ]   [ 5x ]
    [ −2  1  ] [ y ] = [ 5y ]     i.e.   −x − 2y = 0,   −2x − 4y = 0

Both equations tell us that x = −2y, so an eigenvector for λ = 5 is X = [2, −1]T or any multiple of this. For λ = 0 the associated eigenvectors satisfy

    4x − 2y = 0,   −2x + y = 0

i.e. y = 2x (from both equations), so an eigenvector is Y = [1, 2]T or any multiple.

We now look more closely at the eigenvectors X and Y in the last task. In particular we consider
the product X T Y .

Task
Evaluate X T Y from the previous task, i.e. where X = [2, −1]T and Y = [1, 2]T.

Your solution

Answer
X T Y = [2, −1] [ 1 ]  = 2 × 1 − 1 × 2 = 2 − 2 = 0
                [ 2 ]

X T Y = 0 means that X and Y are orthogonal.

Key Point 7
Two n × 1 column vectors X and Y are orthogonal if X T Y = 0.


Task
We obtained earlier in Section 22.1 Example 6 the eigenvalues of the matrix

    A = [ 2   −1  0  ]
        [ −1  2   −1 ]
        [ 0   −1  2  ]

which, as we now emphasize, is symmetric. We found that the eigenvalues were 2, 2 + √2, 2 − √2, which are real and distinct. The corresponding eigenvectors were, respectively,

    X = [1, 0, −1]T      Y = [1, −√2, 1]T      Z = [1, √2, 1]T

(or, as usual, any multiple of these).
Show that these three eigenvectors X, Y, Z are mutually orthogonal.

Your solution

Answer
X T Y = [1, 0, −1] [1, −√2, 1]T = 1 + 0 − 1 = 0

Y T Z = [1, −√2, 1] [1, √2, 1]T = 1 − 2 + 1 = 0

Z T X = [1, √2, 1] [1, 0, −1]T = 1 + 0 − 1 = 0

verifying the mutual orthogonality of these three eigenvectors.
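The same check can be carried out numerically. This is our own illustration, assuming `numpy` is available:

```python
import numpy as np

X = np.array([1.0, 0.0, -1.0])
Y = np.array([1.0, -np.sqrt(2.0), 1.0])
Z = np.array([1.0, np.sqrt(2.0), 1.0])

dots = (X @ Y, Y @ Z, Z @ X)               # all three inner products should vanish
```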

General theory
The following proof that eigenvectors corresponding to distinct eigenvalues of a symmetric matrix
are orthogonal is straightforward and you are encouraged to follow it through.
Let A be a symmetric n × n matrix and let λ1, λ2 be two distinct eigenvalues of A, i.e. λ1 ≠ λ2, with associated eigenvectors X, Y respectively. We have seen that λ1 and λ2 must be real since A is symmetric. Then

    AX = λ1 X      AY = λ2 Y      (1)

Transposing the first of these results gives

    X T AT = λ1 X T      (2)
(Remember that for any two matrices the transpose of a product is the product of the transposes in
reverse order.)
We now multiply both sides of (2) on the right by Y (as well as putting AT = A, since A is
symmetric) to give:
X T AY = λ1 X T Y (3)
But, using the second eigenvalue equation of (1), equation (3) becomes
X T λ2 Y = λ1 X T Y
or, since λ2 is just a number,
λ2 X T Y = λ1 X T Y
Taking all terms to the same side and factorising gives
(λ2 − λ1 )X T Y = 0
from which, since by assumption λ1 ≠ λ2, we obtain the result
XT Y = 0
and the orthogonality has been proved.

Key Point 8
The eigenvectors associated with distinct eigenvalues of a
symmetric matrix are mutually orthogonal.

The reader familiar with the algebra of vectors will recall that for two vectors whose Cartesian forms
are
a = ax i + ay j + az k b = bx i + by j + bz k
the scalar (or dot) product is


a · b = ax bx + ay by + az bz .
Furthermore, if a and b are mutually perpendicular then a · b = 0. (The word ‘orthogonal’ is sometimes used instead of perpendicular in this case.) Our result, that two column vectors are orthogonal if X T Y = 0, may thus be considered as a generalisation of the 3-dimensional result a · b = 0.

Diagonalization of symmetric matrices


Recall from our earlier work that
1. We can always diagonalize a matrix with distinct eigenvalues (whether these are real or com-
plex).
2. We can sometimes diagonalize a matrix with repeated eigenvalues. (The condition for this to
be possible is that any eigenvalue of multiplicity m had to have associated with it m linearly
independent eigenvectors.)
The situation with symmetric matrices is simpler. Basically we can diagonalize any symmetric matrix.
To take the discussions further we first need the concept of an orthogonal matrix.
A square matrix A is said to be orthogonal if its inverse (if it exists) is equal to its transpose:
A−1 = AT or, equivalently, AAT = AT A = I.
Example
An important example of an orthogonal matrix is

    A = [ cos φ    sin φ ]
        [ −sin φ   cos φ ]

which arises when we use matrices to describe rotations in a plane.

    AAT = [ cos φ    sin φ ] [ cos φ   −sin φ ]
          [ −sin φ   cos φ ] [ sin φ   cos φ  ]

        = [ cos²φ + sin²φ   0               ]   =   [ 1  0 ]   =   I
          [ 0               sin²φ + cos²φ   ]       [ 0  1 ]
It is clear that AT A = I also, so A is indeed orthogonal.
It can be shown, but we omit the details, that any 2 × 2 orthogonal matrix can be written in one of the two forms:

    [ cos φ   sin φ ]        [ cos φ  −sin φ ]
    [ −sin φ  cos φ ]   or   [ sin φ  cos φ  ]

If we look closely at either of these matrices we can see that

1. The two columns are mutually orthogonal, e.g. for the first matrix we have

    (cos φ  −sin φ) [ sin φ ]  = cos φ sin φ − sin φ cos φ = 0
                    [ cos φ ]

2. Each column has magnitude 1 (because √(cos²φ + sin²φ) = 1)
Although we shall not prove it, these results are necessary and sufficient for any order square matrix
to be orthogonal.
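Both properties are quick to confirm numerically for the rotation matrix. A hedged illustration of our own, assuming `numpy`:

```python
import numpy as np

phi = 0.7                                  # an arbitrary angle
A = np.array([[np.cos(phi), np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])

is_orthogonal = bool(np.allclose(A @ A.T, np.eye(2)) and
                     np.allclose(A.T @ A, np.eye(2)))
col_norms = np.linalg.norm(A, axis=0)      # magnitude of each column
```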

Key Point 9
A square matrix A is said to be orthogonal if its inverse (if it exists) is equal to its transpose:
A−1 = AT or, equivalently, AAT = AT A = I.
A square matrix is orthogonal if and only if its columns are mutually orthogonal and each column
has unit magnitude.

Task
For each of the following matrices verify that the two properties above are satisfied.
Then check in both cases that AAT = AT A = I, i.e. that AT = A⁻¹.

    (a) A = [ √3/2  −1/2 ]      (b) A = [ 1/√2  0  −1/√2 ]
            [ 1/2   √3/2 ]              [ 0     1  0     ]
                                        [ 1/√2  0  1/√2  ]

Your solution

Answer
(a) Since [√3/2, 1/2] · [−1/2, √3/2] = −√3/4 + √3/4 = 0, the columns are orthogonal.

Since √(3/4 + 1/4) = 1 and √(1/4 + 3/4) = 1, each column has unit magnitude.

Straightforward multiplication shows

    AAT = AT A = [ 1  0 ] = I.
                 [ 0  1 ]

(b) Proceed as in (a).


The following is the key result of this Section.

Key Point 10
Any symmetric matrix A can be diagonalized using an orthogonal modal matrix P via the transformation

    P T AP = D = [ λ1  0   . . .  0  ]
                 [ 0   λ2  . . .  0  ]
                 [ .   .   . . .  .  ]
                 [ 0   0   . . .  λn ]
It follows that any n × n symmetric matrix must possess n mutually orthogonal eigenvectors even
if some of the eigenvalues are repeated.

It should be clear to the reader that Key Point 10 is a very powerful result for any applications that
involve diagonalization of a symmetric matrix. Further, if we do need to find the inverse of P , then
this is a trivial process since P −1 = P T (Key Point 9).
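In `numpy` this result corresponds to `numpy.linalg.eigh`, which for a symmetric input returns a modal matrix with orthonormal columns, so P⁻¹ = Pᵀ holds automatically. A hedged sketch of ours, reusing the symmetric matrix from the earlier Task:

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

lam, P = np.linalg.eigh(A)                 # columns of P are orthonormal eigenvectors
D = P.T @ A @ P                            # P^T A P, valid because P^{-1} = P^T
```

`eigh` returns the eigenvalues in ascending order, here 2 − √2, 2, 2 + √2.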

Task
The symmetric matrix

    A = [ 1   0  √2 ]
        [ 0   2  0  ]
        [ √2  0  0  ]

has eigenvalues 2, 2, −1 (i.e. eigenvalue 2 is repeated with multiplicity 2).
Associated with the non-repeated eigenvalue −1 is an eigenvector X = [1, 0, −√2]T (or any multiple).

(a) Normalize the eigenvector X:

Your solution

Answer
(a) Normalizing X, which has magnitude √(1² + (−√2)²) = √3, gives

    (1/√3) [1, 0, −√2]T = [1/√3, 0, −√(2/3)]T

(b) Investigate the eigenvectors associated with the repeated eigenvalue 2:

Your solution

Answer
(b) The eigenvectors associated with λ = 2 satisfy AY = 2Y, which gives

    [ −1  0  √2 ] [ x ]   [ 0 ]
    [ 0   0  0  ] [ y ] = [ 0 ]
    [ √2  0  −2 ] [ z ]   [ 0 ]

The first and third equations give

    −x + √2 z = 0
    √2 x − 2z = 0     i.e.   x = √2 z

The equations give us no information about y so its value is arbitrary.

Thus Y has the form Y = [√2 β, α, β]T where both α and β are arbitrary.

A certain amount of care is now required in the choice of α and β if we are to find an orthogonal modal matrix to diagonalize A.
For any choice

    X T Y = [1, 0, −√2] [√2 β, α, β]T = √2 β − √2 β = 0.

So X and Y are orthogonal. (The normalization of X does not affect this.)


 √ 

However, we also need two orthogonal eigenvectors of the form [√2 β, α, β]T. Two such are

    Y(1) = [0, 1, 0]T  (choosing β = 0, α = 1)   and   Y(2) = [√2, 0, 1]T  (choosing α = 0, β = 1)

After normalization, these become Y(1) = [0, 1, 0]T and Y(2) = [√(2/3), 0, 1/√3]T.

Hence the matrix

    P = [ X | Y(1) | Y(2) ] = [ 1/√3     0  √(2/3) ]
                              [ 0        1  0      ]
                              [ −√(2/3)  0  1/√3   ]

is orthogonal and diagonalizes A:

    P T AP = [ −1  0  0 ]
             [ 0   2  0 ]
             [ 0   0  2 ]

Hermitian matrices
In some applications, of which quantum mechanics is one, matrices with complex elements arise.
If A is such a matrix then the matrix Ā T (the conjugate transpose of A) is obtained by taking the complex conjugate of each element of A as well as transposing A. Thus if

    A = [ 2 + i  2      ]      then      Ā T = [ 2 − i  −3i    ]
        [ 3i     5 − 2i ]                      [ 2      5 + 2i ]

An Hermitian matrix is one satisfying

    Ā T = A
The matrix A above is clearly non-Hermitian. Indeed the most obvious feature of an Hermitian matrix is that its diagonal elements must be real. (Can you see why?) Thus

    A = [ 6      4 + i ]
        [ 4 − i  −2    ]

is Hermitian.

A 3 × 3 example of an Hermitian matrix is

    A = [ 1       i  5 − 2i ]
        [ −i      3  0      ]
        [ 5 + 2i  0  2      ]
An Hermitian matrix is in fact a generalization of a symmetric matrix. The key property of an
Hermitian matrix is the same as that of a real symmetric matrix – i.e. the eigenvalues are always
real.
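The realness of Hermitian eigenvalues is easy to observe numerically. Our own illustration (assuming `numpy`), using the 2 × 2 Hermitian matrix above and the general (non-symmetric) eigenvalue routine:

```python
import numpy as np

A = np.array([[6.0, 4 + 1j],
              [4 - 1j, -2.0]])

is_hermitian = bool(np.allclose(A, A.conj().T))
lam = np.linalg.eigvals(A)                 # general complex eigenvalue routine
max_imag = float(np.max(np.abs(lam.imag))) # should be (numerically) zero
```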

Numerical Determination of
Eigenvalues and Eigenvectors                              22.4

Introduction
In Section 22.1 it was shown how to obtain eigenvalues and eigenvectors for low order matrices, 2 × 2
and 3 × 3. This involved firstly solving the characteristic equation det(A − λI) = 0 for a given n × n
matrix A. This is an n th order polynomial equation and, even for n as low as 3, solving it is not
always straightforward. For large n even obtaining the characteristic equation may be difficult, let
alone solving it.
Consequently in this Section we give a brief introduction to alternative methods, essentially numerical
in nature, of obtaining eigenvalues and perhaps eigenvectors.
We would emphasize that in some applications such as Control Theory we might only require one
eigenvalue of a matrix A, usually the one largest in magnitude which is called the dominant eigen-
value. It is this eigenvalue which sometimes tells us how a control system will behave.

Prerequisites
Before starting this Section you should . . .
• have a knowledge of determinants and matrices
• have a knowledge of linear first order differential equations

Learning Outcomes
On completion you should be able to . . .
• use the power method to obtain the dominant eigenvalue (and associated eigenvector) of a matrix
• state the main advantages and disadvantages of the power method


1. Numerical determination of eigenvalues and eigenvectors

Preliminaries
Before discussing numerical methods of calculating eigenvalues and eigenvectors we remind you of
three results for a matrix A with an eigenvalue λ and associated eigenvector X.

• A⁻¹ (if it exists) has an eigenvalue 1/λ with associated eigenvector X.

• The matrix (A − kI) has an eigenvalue (λ − k) and associated eigenvector X.

• The matrix (A − kI)⁻¹, i.e. the inverse (if it exists) of the matrix (A − kI), has eigenvalue 1/(λ − k) and corresponding eigenvector X.

Here k is any real number.

 
Task
The matrix A = [ 2  1  1 ]
               [ 1  2  1 ]
               [ 0  0  5 ]
has eigenvalues λ = 5, 3, 1 with associated eigenvectors [1/2, 1/2, 1]T, [1, 1, 0]T, [1, −1, 0]T respectively.

The inverse A⁻¹ exists and is

    A⁻¹ = (1/3) [ 2   −1  −1/5 ]
                [ −1  2   −1/5 ]
                [ 0   0   3/5  ]

Without further calculation write down the eigenvalues and eigenvectors of the following matrices:

    (a) A⁻¹      (b) [ 3  1  1 ]      (c) [ 0  1  1 ]⁻¹
                     [ 1  3  1 ]          [ 1  0  1 ]
                     [ 0  0  6 ]          [ 0  0  3 ]

Your solution

Answer
(a) The eigenvalues of A⁻¹ are 1/5, 1/3, 1. (Notice that the dominant eigenvalue of A yields the smallest magnitude eigenvalue of A⁻¹.)
(b) The matrix here is A + I. Thus its eigenvalues are the same as those of A increased by 1, i.e. 6, 4, 2.
(c) The matrix here is (A − 2I)⁻¹. Thus its eigenvalues are the reciprocals of the eigenvalues of (A − 2I). The latter has eigenvalues 3, 1, −1 so (A − 2I)⁻¹ has eigenvalues 1/3, 1, −1.
In each of the above cases the eigenvectors are the same as those of the original matrix A.

The power method


This is a direct iteration method for obtaining the dominant eigenvalue (i.e. the largest in mag-
nitude), say λ1 , for a given matrix A and also the corresponding eigenvector.
We will not discuss the theory behind the method but will demonstrate it in action and, equally
importantly, point out circumstances when it fails.

Task
Let A = [ 4  2 ]
        [ 5  7 ]
By solving det(A − λI) = 0 obtain the eigenvalues of A and also obtain the eigenvector associated with the dominant eigenvalue.

Your solution


Answer
    det(A − λI) = | 4 − λ  2     |
                  | 5      7 − λ | = 0

which gives

    λ² − 11λ + 18 = 0   ⇒   (λ − 9)(λ − 2) = 0

so λ1 = 9 (the dominant eigenvalue) and λ2 = 2.

The eigenvector X = [x, y]T for λ1 = 9 is obtained as usual by solving AX = 9X, so

    [ 4  2 ] [ x ]   [ 9x ]
    [ 5  7 ] [ y ] = [ 9y ]

from which 5x = 2y, so X = [2, 5]T or any multiple.
If we normalize here such that the largest component of X is 1,

    X = [ 0.4 ]
        [ 1   ]

 
We shall now demonstrate how the power method can be used to obtain λ1 = 9 and X = [0.4, 1]T where

    A = [ 4  2 ]
        [ 5  7 ]

• We choose an arbitrary 2 × 1 column vector

    X(0) = [ 1 ]
           [ 1 ]

• We premultiply this by A to give a new column vector X(1):

    X(1) = [ 4  2 ] [ 1 ]   [ 6  ]
           [ 5  7 ] [ 1 ] = [ 12 ]

• We ‘normalize’ X(1) to obtain a column vector Y(1) with largest component 1: thus

    Y(1) = (1/12) [ 6  ] = [ 1/2 ]
                  [ 12 ]   [ 1   ]

• We continue the process:

    X(2) = AY(1) = [ 4  2 ] [ 1/2 ]   [ 4   ]
                   [ 5  7 ] [ 1   ] = [ 9.5 ]

    Y(2) = (1/9.5) [ 4   ] = [ 0.421053 ]
                   [ 9.5 ]   [ 1        ]

Task
Continue this process for a further step and obtain X (3) and Y (3) , quoting values
to 6 d.p.

Your solution

Answer
X(3) = AY(2) = [ 4  2 ] [ 0.421053 ]   [ 3.684211 ]
               [ 5  7 ] [ 1        ] = [ 9.105265 ]

Y(3) = (1/9.105265) X(3) = [ 0.404624 ]
                           [ 1        ]

The first 8 steps of the above iterative process are summarized in the following table (the first
three rows of which have been obtained above):
Table 1

Step r    X1(r)      X2(r)      αr         Y1(r)      Y2(r)
1 6 12 12 0.5 1
2 4 9.5 9.5 0.421053 1
3 3.684211 9.105265 9.105265 0.404624 1
4 3.618497 9.023121 9.023121 0.401025 1
5 3.604100 9.005125 9.005125 0.400228 1
6 3.600911 9.001138 9.001138 0.400051 1
7 3.600202 9.000253 9.000253 0.400011 1
8 3.600045 9.000056 9.000056 0.400002 1
In Table 1, αr refers to the largest magnitude component of X (r) which is used to normalize X (r)
to give Y (r) . We can see that αr is converging to 9 which we know is the dominant eigenvalue λ1
of A. Also Y (r) is converging towards the associated eigenvector [0.4, 1]T .
Depending on the accuracy required, we could decide when to stop the iterative process by looking
at the difference |αr − αr−1 | at each step.
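The iteration in Table 1 is only a few lines of code. A sketch of ours (assuming `numpy`), which reproduces the convergence to α → 9 and Y → [0.4, 1]T:

```python
import numpy as np

def power_method(A, x0, steps=30):
    """Repeatedly form x = A y and normalize by the largest-magnitude
    component (keeping its sign), as in Table 1."""
    y = np.array(x0, dtype=float)
    alpha = None
    for _ in range(steps):
        x = A @ y
        alpha = x[np.argmax(np.abs(x))]    # normalizing factor alpha_r
        y = x / alpha
    return alpha, y

A = np.array([[4.0, 2.0], [5.0, 7.0]])
alpha, y = power_method(A, [1.0, 1.0])
```

A stopping test on |αr − αr−1| would replace the fixed step count in practice.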

Task
Using the power method obtain the dominant eigenvalue and associated
eigenvector of
   
    A = [ 3   −1  0  ]
        [ −2  4   −3 ]      using a starting column vector X(0) = [1, 1, 1]T
        [ 0   −1  1  ]


Calculate X (1) , Y (1) and α1 :


Your solution

Answer
X(1) = AX(0) = [ 3   −1  0  ] [ 1 ]   [ 2  ]
               [ −2  4   −3 ] [ 1 ] = [ −1 ]
               [ 0   −1  1  ] [ 1 ]   [ 0  ]

so Y(1) = (1/2) X(1) = [1, −0.5, 0]T, using α1 = 2, the largest magnitude component of X(1).

Carry out the next two steps of this iteration to obtains X (2) , Y (2) , α2 and X (3) , Y (3) , α3 :
Your solution

Answer
X(2) = AY(1) = [3.5, −4, 0.5]T      Y(2) = (−1/4) X(2) = [−0.875, 1, −0.125]T      α2 = −4

X(3) = AY(2) = [−3.625, 6.125, −1.125]T      Y(3) = (1/6.125) X(3) = [−0.5918, 1, −0.1837]T      α3 = 6.125

After just 3 iterations there is little sign of convergence of the normalizing factor αr . However the
next two values obtained are
α4 = 5.7347 α5 = 5.4774
and, after 14 iterations, |α14 − α13 | < 0.0001 and the power method converges, albeit slowly, to
α14 = 5.4773
which (correct to 4 d.p.) is the dominant eigenvalue of A. The corresponding eigenvector is [−0.4037, 1, −0.2233]T.
It is clear that the power method requires, for its practical execution, a computer.

Problems with the power method
1. If the initial column vector X (0) is an eigenvector of A other than that corresponding to the
dominant eigenvalue, say λ1 , then the method will fail since the iteration will converge to the
wrong eigenvalue, say λ2 , after only one iteration (because AX (0) = λ2 X (0) in this case).
2. It is possible to show that the speed of convergence of the power method depends on the ratio of the magnitude of the dominant eigenvalue λ1 to the magnitude of the next largest eigenvalue. If this ratio is small (i.e. close to 1) the method is slow to converge.
In particular, if the dominant eigenvalue λ1 is complex the method will fail completely to converge, because the complex conjugate λ̄1 will also be an eigenvalue and |λ1| = |λ̄1|.
3. The power method only gives one eigenvalue, the dominant one λ1 (although this is often the
most important in applications).

Advantages of the power method


1. It is simple and easy to implement.
2. It gives the eigenvector corresponding to λ1 as well as λ1 itself. (Other numerical methods
require separate calculation to obtain the eigenvector.)

Finding eigenvalues other than the dominant


We discuss this topic only briefly.

1. Obtaining the smallest magnitude eigenvalue


If A has dominant eigenvalue λ1 then its inverse A⁻¹ has an eigenvalue 1/λ1 (as we discussed at the beginning of this Section). Clearly 1/λ1 will be the smallest magnitude eigenvalue of A⁻¹. Conversely, if we obtain the largest magnitude eigenvalue, say λ′1, of A⁻¹ by the power method, then the smallest magnitude eigenvalue of A is the reciprocal, 1/λ′1.
This technique is called the inverse power method.
Example
   
If A = [ 3   −1  0  ]      then the inverse is      A⁻¹ = [ 1  1  3  ]
       [ −2  4   −3 ]                                     [ 2  3  9  ]
       [ 0   −1  1  ]                                     [ 2  3  10 ]

Using X(0) = [1, 1, 1]T in the power method applied to A⁻¹ gives λ′1 = 13.4090. Hence the smallest magnitude eigenvalue of A is 1/13.4090 = 0.0746. The corresponding eigenvector is [0.3163, 0.9254, 1]T.


In practice, finding the inverse of a large order matrix A can be expensive in computational effort.
Hence the inverse power method is implemented without actually obtaining A−1 as follows.
As we have seen, the power method applied to A utilizes the scheme:

    X(r) = AY(r−1)      r = 1, 2, . . .

where Y(r−1) = (1/αr−1) X(r−1), αr−1 being the largest magnitude component of X(r−1).
For the inverse power method we have

    X(r) = A⁻¹Y(r−1)

which can be re-written as

    AX(r) = Y(r−1)

Thus X(r) can actually be obtained by solving this system of linear equations without needing to obtain A⁻¹. This is usually done by a technique called LU decomposition, i.e. writing A (once and for all) in the form

    A = LU      with L a lower triangular matrix and U upper triangular.
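The scheme can be sketched directly. This is our own illustration (assuming `numpy`); for brevity it calls `np.linalg.solve` at each step, whereas a production code would factorize A = LU once and reuse the factors.

```python
import numpy as np

def inverse_power_method(A, x0, steps=50):
    """Power method on A^{-1}, implemented by solving A x = y at each step."""
    y = np.array(x0, dtype=float)
    alpha = None
    for _ in range(steps):
        x = np.linalg.solve(A, y)          # avoids forming A^{-1} explicitly
        alpha = x[np.argmax(np.abs(x))]
        y = x / alpha
    return 1.0 / alpha, y                  # smallest magnitude eigenvalue of A

A = np.array([[3.0, -1.0, 0.0],
              [-2.0, 4.0, -3.0],
              [0.0, -1.0, 1.0]])
lam_small, v = inverse_power_method(A, [1.0, 1.0, 1.0])
```

On the 3 × 3 matrix of the Example above this reproduces the smallest magnitude eigenvalue 0.0746 and the eigenvector [0.3163, 0.9254, 1]T.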

2. Obtaining the eigenvalue closest to a given number p


Suppose λk is the (unknown) eigenvalue of A closest to p. We know that if λ1, λ2, . . . , λn are the eigenvalues of A then λ1 − p, λ2 − p, . . . , λn − p are the eigenvalues of the matrix A − pI. Then λk − p will be the smallest magnitude eigenvalue of A − pI but 1/(λk − p) will be the largest magnitude eigenvalue of (A − pI)⁻¹. Hence if we apply the power method to (A − pI)⁻¹ we can obtain λk.
The method is called the shifted inverse power method.
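A sketch of the shifted variant (ours, assuming `numpy`). Applied with p = 2 to the matrix A used in this Section, it homes in on the middle eigenvalue, the one that both the plain power method (which finds 5.4773) and the inverse power method (which finds 0.0746) miss:

```python
import numpy as np

def shifted_inverse_power(A, p, x0, steps=50):
    """Power method on (A - pI)^{-1}: converges to the eigenvalue of A closest to p."""
    B = A - p * np.eye(len(A))
    y = np.array(x0, dtype=float)
    alpha = None
    for _ in range(steps):
        x = np.linalg.solve(B, y)          # one step of the power method on B^{-1}
        alpha = x[np.argmax(np.abs(x))]
        y = x / alpha
    return p + 1.0 / alpha                 # lambda_k = p + 1/alpha

A = np.array([[3.0, -1.0, 0.0],
              [-2.0, 4.0, -3.0],
              [0.0, -1.0, 1.0]])
lam_mid = shifted_inverse_power(A, 2.0, [1.0, 1.0, 1.0])
```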

3. Obtaining all the eigenvalues of a large order matrix


In this case neither solving the characteristic equation det(A − λI) = 0 nor the power method (and
its variants) is efficient.
The commonest method utilized is called the QR technique. This technique is based on similarity transformations, i.e. transformations of the form

    B = M⁻¹AM

where B has the same eigenvalues as A. (We have seen earlier in this Workbook that one type of similarity transformation is D = P⁻¹AP where P is formed from the eigenvectors of A. However, we are now, of course, dealing with the situation where we are trying to find the eigenvalues and eigenvectors of A.)
In the QR method A is reduced to upper (or lower) triangular form. We have already seen that a
triangular matrix has its eigenvalues on the diagonal.
For details of the QR method, or more efficient techniques, one of which is based on what is called
a Householder transformation, the reader should consult a text on numerical methods.
