MTH 2202 Linear Algebra II Lecture Notes 2020-2021 (ICIE)

The document contains lecture notes for a Linear Algebra II course at Kano University of Science and Technology, covering essential topics such as systems of linear equations, eigenvalues, and orthogonal diagonalization. It outlines course contents, recommended texts, and methods for solving linear equations, including Gaussian and Gauss-Jordan elimination. The notes emphasize the importance of linear algebra in various scientific fields and provide a foundation for understanding linear systems.


DEPARTMENT OF MATHEMATICS

INSTITUTE OF CONTINUING AND INNOVATIVE EDUCATION

KANO UNIVERSITY OF SCIENCE AND TECHNOLOGY, WUDIL

LINEAR ALGEBRA II
(MTH 2202)

LECTURE NOTES

2020/2021 SESSION

Prof. Abdulhadi Aminu

(Course Lecturer)

Chapter 1
Introduction

1.1 Preamble
Linear Algebra is one of the most important mathematical tools used in the world today. It is
an excellent example of mathematical abstraction, the process which starts with the
observation that apparently different mathematical objects share common properties. The
common rules which these objects obey are then identified, and formalized as axioms. Then
an abstract object which satisfies these axioms is studied, with no further assumption being
made. Of course any deduction which we can draw from these axioms then holds in any
mathematical object which satisfies them, and so we have proved many apparently different
results all at the same time. Although linear algebra is properly thought of as a branch of pure
mathematics, it is used extensively in all branches of science and engineering, and in the
social sciences. Indeed, solving systems of linear equations with very large numbers of
unknowns is so crucially important in some industries that mathematicians are employed to
find ever-faster (computer-based) methods of solution.

1.2 Course contents


- System of linear equations
- Change of basis
- Equivalence and similarity
- Eigenvectors and eigenvalues
- Minimum and characteristic polynomials of a linear transformation (matrix)
- Cayley-Hamilton Theorem
- Bilinear and quadratic forms
- Orthogonal diagonalization
- Canonical forms

1.3 Recommended Texts
- S. I. Grossman, Elementary Linear Algebra, (Second Edition), Wadsworth, California, 1984.
- R. Larson, Elementary Linear Algebra, (7th Edition), Brooks/Cole, USA, 2013.
- D. Poole, Linear Algebra: A Modern Introduction, (4th Edition), Cengage Learning, USA, 2015.
- K. Singh, Linear Algebra Step by Step, Oxford University Press, United Kingdom, 2014.
- R. Kaye, R. Wilson, Linear Algebra, Oxford University Press, 2003.
- S. Lipschutz, Schaum's Outline Series: Theory and Problems of Linear Algebra, (Fifth Edition), McGraw-Hill, USA, 2013.

Chapter 2

System of linear equations


Learning Outcomes:
By the end of this chapter you should be able to
- Recognize a linear equation in n variables.
- Determine whether a system of linear equations is consistent or inconsistent.
- Distinguish homogeneous and non-homogeneous systems of linear equations.
- Solve any system of linear equations using the Gaussian and Gauss-Jordan elimination methods.
- Use Cramer's rule to solve some systems of linear equations.

2.1 Introduction
In this chapter we will study systems of linear equations and describe methods for finding all
solutions (if any) to a system of m equations in n unknowns (variables).

2.2 Definitions
A linear equation is an equation of the form
$$a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b,$$
where $a_1, a_2, \dots, a_n, b \in \mathbb{R}$ and $x_1, x_2, \dots, x_n$ are variables. The scalars $a_j$ are coefficients and the
scalar $b$ is the constant term. The number $a_1$ is the leading coefficient, and $x_1$ is the leading
variable.
A system of linear equations, or linear system, is a set of one or more linear equations in
the same variables, such as
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2\\
&\ \vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m.
\end{aligned}$$

The matrix
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$
is the coefficient matrix,
$$b = \begin{pmatrix} b_1\\ \vdots\\ b_m \end{pmatrix}$$
is the constant vector and
$$x = \begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}$$
is the unknown vector. The matrix $(A \mid b)$ is the augmented matrix of the system.
A solution of the system is an n-tuple $(c_1, c_2, \dots, c_n)$ such that setting $x_j = c_j$ for each
$j$ satisfies every equation. The solution set of the system is the set of all solutions. It is
possible for a system of linear equations to have exactly one solution, infinitely many
solutions, or no solution. A system of linear equations is consistent if there exists at least one
solution; otherwise it is inconsistent.
Two systems of linear equations are equivalent if they have the same solution set.

2.3 Solving the general system of linear equations
There are several methods for solving linear systems; here we will treat two, namely the
Gaussian and Gauss-Jordan elimination methods. Before we close this topic we shall also
discuss a determinant method for solving a system of linear equations when the number of
unknowns is equal to the number of equations, generally known as Cramer's rule.

2.3.1 Gaussian and Gauss-Jordan Elimination Methods
To solve a system that is not in row-echelon form, first convert it to an equivalent system that
is in row-echelon form by using the elementary row operations. This process is called
Gaussian elimination, after the German mathematician Carl Friedrich Gauss (1777-1855).
Another method of elimination is called Gauss-Jordan elimination, after Carl Friedrich
Gauss and Wilhelm Jordan (1842-1899). In this method the system is converted to its reduced
row-echelon form.

Definition 2.1 Let A be a matrix with m rows. When a row of A is not zero, its first nonzero
entry is the leading entry of the row. The matrix A is in row echelon form (REF) when the
following two conditions are met:
1. Any zero rows are below all nonzero rows.
2. For each nonzero row i, i ≤ m − 1, either row i + 1 is zero or the leading entry of row
i + 1 is in a column to the right of the column of the leading entry in row i.

The matrix A is in reduced row echelon form (RREF) if it is in row echelon form and the
following third condition is also met:
3. If $a_{ik}$ is the leading entry in row i, then $a_{ik} = 1$, and every entry of column k other
than $a_{ik}$ is zero.

Example 2.1 The following matrices are in row echelon form (REF):
$$\text{i. } \begin{pmatrix} 1&2&3\\ 0&1&5\\ 0&0&1 \end{pmatrix} \qquad \text{ii. } \begin{pmatrix} 1&1&6&4\\ 0&1&2&8\\ 0&0&0&1 \end{pmatrix} \qquad \text{iii. } \begin{pmatrix} 1&2\\ 0&1 \end{pmatrix}$$
5
Example 2.2 The following matrices are in reduced row echelon form (RREF):
$$\text{i. } \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} \qquad \text{ii. } \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1 \end{pmatrix} \qquad \text{iii. } \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix} \qquad \text{iv. } \begin{pmatrix} 1&0&0&5\\ 0&0&1&2 \end{pmatrix}$$
Question: How can we convert a given matrix to row echelon form (REF) and reduced row
echelon form (RREF)?
Answer: We apply the elementary row operations.

ELEMENTARY ROW OPERATIONS

i. Multiply (or divide) one row by a nonzero number.
ii. Add a multiple of one row to a multiple of another row.
iii. Interchange two rows.

The process of applying elementary row operations to simplify a matrix is called row
reduction.
The matrix A is row equivalent to the matrix B if there is a sequence of elementary row
operations that transforms A into B. The reduced row echelon form of A, RREF(A), is the
matrix in reduced row echelon form that is row equivalent to A. A row echelon form of A is
any matrix in row echelon form that is row equivalent to A. The rank of A, denoted rank A or
rank(A), is the number of leading entries in RREF(A). If A is in row echelon form, the
positions of the leading entries in its nonzero rows are called pivot positions and the entries
in those positions are called pivots. A column (row) that contains a pivot position is a pivot
column (pivot row).
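Row reduction and the rank can be computed mechanically. The following sketch (our own illustration, not part of the notes; the names `rref` and `rank` are ours) uses Python's `fractions` module so the arithmetic is exact:

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix (list of row lists)."""
    m = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(m), len(m[0])
    pivot_row = 0
    for col in range(ncols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, nrows) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]       # interchange two rows
        piv = m[pivot_row][col]
        m[pivot_row] = [x / piv for x in m[pivot_row]]  # scale so the pivot is 1
        for r in range(nrows):                          # clear the rest of the column
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == nrows:
            break
    return m

def rank(rows):
    """Number of nonzero rows (leading entries) in RREF(A)."""
    return sum(any(x != 0 for x in row) for row in rref(rows))

print(rank([[1, 2], [2, 4]]))  # 1, since the second row is a multiple of the first
```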

- Gaussian Elimination Method: Row-reduce the augmented matrix to row echelon form, solve for the last unknown, and then use back substitution to solve for the other unknowns.
- Gauss-Jordan Elimination Method: Row-reduce the augmented matrix to reduced row echelon form.

To solve the system of linear equations
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2\\
&\ \vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m,
\end{aligned}$$
we write the system as an augmented matrix and row-reduce the matrix to its echelon form.
After reducing the augmented matrix, we observe one of the following:

i. The last nonzero equation reads $x_n = c$ for some constant $c$. Then there is either a unique solution or an infinite number of solutions to the system.
ii. The last nonzero equation reads $a_j x_j + \dots + a_n x_n = c$ for some constant $c$, where at least two of the $a$'s are nonzero. That is, the last equation is a linear equation in two or more variables. Then there are an infinite number of solutions.
iii. The last equation reads $0 = c$, where $c \neq 0$. Then there is no solution. In this case the system is called inconsistent. In cases (i) and (ii) the system is called consistent.

Example 2.3 Solve the following systems using the Gaussian elimination method.
$$\text{1. } \begin{aligned} 2x_1+4x_2+6x_3&=18\\ 4x_1+5x_2+6x_3&=24\\ 3x_1+x_2-2x_3&=4 \end{aligned} \qquad \text{2. } \begin{aligned} 2x_1+4x_2+6x_3&=18\\ 4x_1+5x_2+6x_3&=24\\ 2x_1+7x_2+12x_3&=30 \end{aligned}$$
$$\text{3. } \begin{aligned} 2x_1+4x_2+6x_3&=18\\ 4x_1+5x_2+6x_3&=24\\ 2x_1+7x_2+12x_3&=40 \end{aligned}$$

Solution

1. We apply the elementary row operations on the augmented matrix of the problem as follows:
$$\begin{pmatrix} 2&4&6&18\\ 4&5&6&24\\ 3&1&-2&4 \end{pmatrix} \xrightarrow{R_1\to \frac12 R_1} \begin{pmatrix} 1&2&3&9\\ 4&5&6&24\\ 3&1&-2&4 \end{pmatrix} \xrightarrow[R_3\to R_3-3R_1]{R_2\to R_2-4R_1} \begin{pmatrix} 1&2&3&9\\ 0&-3&-6&-12\\ 0&-5&-11&-23 \end{pmatrix}$$
$$\xrightarrow{R_2\to -\frac13 R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&2&4\\ 0&-5&-11&-23 \end{pmatrix} \xrightarrow{R_3\to R_3+5R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&2&4\\ 0&0&-1&-3 \end{pmatrix}$$
Working backwards, we have from the last (third) row $-x_3 = -3$, so $x_3 = 3$. From the second row,
$$x_2 + 2x_3 = 4 \implies x_2 = 4 - 2(3) = -2.$$
From the first row we have
$$x_1 + 2x_2 + 3x_3 = 9 \implies x_1 = 9 - 2(-2) - 3(3) = 4.$$

2. Applying the elementary row operations we have
$$\begin{pmatrix} 2&4&6&18\\ 4&5&6&24\\ 2&7&12&30 \end{pmatrix} \xrightarrow{R_1\to \frac12 R_1} \begin{pmatrix} 1&2&3&9\\ 4&5&6&24\\ 2&7&12&30 \end{pmatrix} \xrightarrow[R_3\to R_3-2R_1]{R_2\to R_2-4R_1} \begin{pmatrix} 1&2&3&9\\ 0&-3&-6&-12\\ 0&3&6&12 \end{pmatrix}$$
$$\xrightarrow{R_2\to -\frac13 R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&2&4\\ 0&3&6&12 \end{pmatrix} \xrightarrow{R_3\to R_3-3R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&2&4\\ 0&0&0&0 \end{pmatrix}$$
The system is equivalent to the system of equations
$$\begin{aligned} x_1+2x_2+3x_3&=9\\ x_2+2x_3&=4. \end{aligned}$$
This system has infinitely many solutions, because it has two equations and three unknowns; that is, the number of unknowns is more than the number of equations. Subtracting twice the second equation from the first gives
$$x_1 - x_3 = 1 \implies x_1 = 1 + x_3,$$
and from the second equation, $x_2 = 4 - 2x_3$.
The solution set is $(1+x_3,\; 4-2x_3,\; x_3)$.
For instance, if $x_3 = 0$ we have $(1, 4, 0)$; if $x_3 = 10$ we have $(11, -16, 10)$, etc.

3. Applying the elementary row operations we have
$$\begin{pmatrix} 2&4&6&18\\ 4&5&6&24\\ 2&7&12&40 \end{pmatrix} \xrightarrow{R_1\to \frac12 R_1} \begin{pmatrix} 1&2&3&9\\ 4&5&6&24\\ 2&7&12&40 \end{pmatrix} \xrightarrow[R_3\to R_3-2R_1]{R_2\to R_2-4R_1} \begin{pmatrix} 1&2&3&9\\ 0&-3&-6&-12\\ 0&3&6&22 \end{pmatrix}$$
$$\xrightarrow{R_2\to -\frac13 R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&2&4\\ 0&3&6&22 \end{pmatrix} \xrightarrow{R_3\to R_3-3R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&2&4\\ 0&0&0&10 \end{pmatrix}$$
The last equation now reads $0x_1 + 0x_2 + 0x_3 = 10$, which is not possible. Thus the
system has no solution.
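The conclusions of Example 2.3 can be sanity-checked numerically. The sketch below (our own illustration, not from the notes) substitutes the claimed solutions back into the original systems:

```python
def satisfies(system, x):
    """Check that vector x satisfies every equation (coeffs, rhs) in the system."""
    return all(
        abs(sum(a * xi for a, xi in zip(coeffs, x)) - rhs) < 1e-9
        for coeffs, rhs in system
    )

# System 1: unique solution (4, -2, 3).
system1 = [([2, 4, 6], 18), ([4, 5, 6], 24), ([3, 1, -2], 4)]
print(satisfies(system1, (4, -2, 3)))  # True

# System 2: infinitely many solutions (1 + t, 4 - 2t, t), one for every t.
system2 = [([2, 4, 6], 18), ([4, 5, 6], 24), ([2, 7, 12], 30)]
print(all(satisfies(system2, (1 + t, 4 - 2*t, t)) for t in (0, 10, -3.5)))  # True
```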

Example 2.4 Consider the system
$$\begin{aligned} 2x_1+3x_2-x_3&=a\\ x_1-x_2+3x_3&=b\\ 3x_1+7x_2-5x_3&=c. \end{aligned}$$
Use the Gaussian elimination method to find the condition on a, b, c such that the system is
consistent.

Solution

Using the Gaussian elimination method, we have
$$\begin{pmatrix} 2&3&-1&a\\ 1&-1&3&b\\ 3&7&-5&c \end{pmatrix} \xrightarrow{R_1\leftrightarrow R_2} \begin{pmatrix} 1&-1&3&b\\ 2&3&-1&a\\ 3&7&-5&c \end{pmatrix} \xrightarrow[R_3\to R_3-3R_1]{R_2\to R_2-2R_1} \begin{pmatrix} 1&-1&3&b\\ 0&5&-7&a-2b\\ 0&10&-14&c-3b \end{pmatrix}$$
$$\xrightarrow{R_2\to \frac15 R_2} \begin{pmatrix} 1&-1&3&b\\ 0&1&-\frac75&\frac{a-2b}{5}\\ 0&10&-14&c-3b \end{pmatrix} \xrightarrow{R_3\to R_3-10R_2} \begin{pmatrix} 1&-1&3&b\\ 0&1&-\frac75&\frac{a-2b}{5}\\ 0&0&0&c+b-2a \end{pmatrix}$$
It follows from the last row that the system will be consistent if
$$c + b - 2a = 0, \quad\text{that is,}\quad a = \frac{b+c}{2}.$$

Example 2.5: Use the Gauss-Jordan method to solve the following:
$$\begin{aligned} x_1+2x_2+3x_3&=9\\ x_1+3x_2&=4\\ 2x_1+5x_2+5x_3&=17 \end{aligned}$$

Solution:
$$\begin{pmatrix} 1&2&3&9\\ 1&3&0&4\\ 2&5&5&17 \end{pmatrix} \xrightarrow[R_3\to R_3-2R_1]{R_2\to R_2-R_1} \begin{pmatrix} 1&2&3&9\\ 0&1&-3&-5\\ 0&1&-1&-1 \end{pmatrix} \xrightarrow{R_3\to R_3-R_2} \begin{pmatrix} 1&2&3&9\\ 0&1&-3&-5\\ 0&0&2&4 \end{pmatrix}$$
$$\xrightarrow{R_3\to \frac12 R_3} \begin{pmatrix} 1&2&3&9\\ 0&1&-3&-5\\ 0&0&1&2 \end{pmatrix} \xrightarrow{R_1\to R_1-2R_2} \begin{pmatrix} 1&0&9&19\\ 0&1&-3&-5\\ 0&0&1&2 \end{pmatrix} \xrightarrow[R_2\to R_2+3R_3]{R_1\to R_1-9R_3} \begin{pmatrix} 1&0&0&1\\ 0&1&0&1\\ 0&0&1&2 \end{pmatrix}$$
The matrix is now in reduced row-echelon form (RREF). Converting back to a system
of linear equations, we have $x_1 = 1$, $x_2 = 1$ and $x_3 = 2$.

Exercise 2.1 Given the following system
$$\begin{aligned} 2x_1-x_2+3x_3&=a\\ 3x_1+x_2-5x_3&=b\\ -5x_1-5x_2+21x_3&=c, \end{aligned}$$
find the condition on a, b, c such that the system is consistent.

2.4 Homogeneous system of equations
The general $m \times n$ system of linear equations is called homogeneous if all the constants
$b_1, b_2, \dots, b_m$ are zero. That is, the general homogeneous system is given by
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= 0\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= 0\\
&\ \vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= 0.
\end{aligned}$$

For the general homogeneous system, there are two possibilities for its solution, since
$x_1 = x_2 = \dots = x_n = 0$ is always a solution (called the trivial solution or zero solution):
either the zero solution is the only solution, or there are an infinite number of
solutions in addition to the zero solution. Solutions other than the zero solution are called
nontrivial solutions.

Gaussian and Gauss-Jordan elimination methods described earlier are used to solve the
homogeneous system of equations.

Example 2.6 Solve the following homogeneous system:
$$\begin{aligned} 2x_1+4x_2+6x_3&=0\\ 4x_1+5x_2+6x_3&=0\\ 3x_1+x_2-2x_3&=0 \end{aligned}$$
Solution: We use the Gauss-Jordan method as follows:
$$\begin{pmatrix} 2&4&6&0\\ 4&5&6&0\\ 3&1&-2&0 \end{pmatrix} \xrightarrow{R_1\to \frac12 R_1} \begin{pmatrix} 1&2&3&0\\ 4&5&6&0\\ 3&1&-2&0 \end{pmatrix} \xrightarrow[R_3\to R_3-3R_1]{R_2\to R_2-4R_1} \begin{pmatrix} 1&2&3&0\\ 0&-3&-6&0\\ 0&-5&-11&0 \end{pmatrix}$$
$$\xrightarrow{R_2\to -\frac13 R_2} \begin{pmatrix} 1&2&3&0\\ 0&1&2&0\\ 0&-5&-11&0 \end{pmatrix} \xrightarrow[R_3\to R_3+5R_2]{R_1\to R_1-2R_2} \begin{pmatrix} 1&0&-1&0\\ 0&1&2&0\\ 0&0&-1&0 \end{pmatrix} \xrightarrow{R_3\to -R_3} \begin{pmatrix} 1&0&-1&0\\ 0&1&2&0\\ 0&0&1&0 \end{pmatrix}$$
$$\xrightarrow[R_2\to R_2-2R_3]{R_1\to R_1+R_3} \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix}$$
Thus the system has the trivial solution (0, 0, 0) as its only solution.

11
Example 2.7 Solve the following homogeneous system:
$$\begin{aligned} x_1+2x_2-x_3&=0\\ 3x_1-3x_2+2x_3&=0\\ -x_1-11x_2+6x_3&=0 \end{aligned}$$
Solution: We use the Gauss-Jordan method as follows:
$$\begin{pmatrix} 1&2&-1&0\\ 3&-3&2&0\\ -1&-11&6&0 \end{pmatrix} \xrightarrow[R_3\to R_3+R_1]{R_2\to R_2-3R_1} \begin{pmatrix} 1&2&-1&0\\ 0&-9&5&0\\ 0&-9&5&0 \end{pmatrix} \xrightarrow{R_2\to -\frac19 R_2} \begin{pmatrix} 1&2&-1&0\\ 0&1&-\frac59&0\\ 0&-9&5&0 \end{pmatrix}$$
$$\xrightarrow[R_1\to R_1-2R_2]{R_3\to R_3+9R_2} \begin{pmatrix} 1&0&\frac19&0\\ 0&1&-\frac59&0\\ 0&0&0&0 \end{pmatrix}$$
The augmented matrix is now in reduced row echelon form and we can clearly see that there are
infinitely many solutions, given by $\left(-\frac19 x_3,\; \frac59 x_3,\; x_3\right)$. For instance, if $x_3 = 0$ we obtain the
trivial solution; if $x_3 = 1$ we obtain the solution $\left(-\frac19,\; \frac59,\; 1\right)$.

Example 2.8 Solve the following homogeneous system:
$$\begin{aligned} x_1+x_2-x_3&=0\\ 4x_1-2x_2+7x_3&=0 \end{aligned}$$
Solution: We use the Gauss-Jordan method as follows:
$$\begin{pmatrix} 1&1&-1&0\\ 4&-2&7&0 \end{pmatrix} \xrightarrow{R_2\to R_2-4R_1} \begin{pmatrix} 1&1&-1&0\\ 0&-6&11&0 \end{pmatrix} \xrightarrow{R_2\to -\frac16 R_2} \begin{pmatrix} 1&1&-1&0\\ 0&1&-\frac{11}{6}&0 \end{pmatrix} \xrightarrow{R_1\to R_1-R_2} \begin{pmatrix} 1&0&\frac56&0\\ 0&1&-\frac{11}{6}&0 \end{pmatrix}$$
Thus, there are infinitely many solutions, given by $\left(-\frac56 x_3,\; \frac{11}{6} x_3,\; x_3\right)$. This follows from the fact
that the linear system contains two equations and three unknowns.
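As a quick check (our own, not in the notes), substituting the parametric family back into the system of Example 2.8 confirms that every member satisfies both equations:

```python
from fractions import Fraction

def solution(t):
    """Parametric solution family of Example 2.8, with x3 = t."""
    t = Fraction(t)
    return (-Fraction(5, 6) * t, Fraction(11, 6) * t, t)

def check(x1, x2, x3):
    """Both equations of the homogeneous system must evaluate to zero."""
    return (x1 + x2 - x3 == 0) and (4*x1 - 2*x2 + 7*x3 == 0)

print(all(check(*solution(t)) for t in (0, 1, 6, -12)))  # True
```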

2.5 Determinant Method (Cramer's rule)
In this section we will discuss a method for solving systems of linear equations with the
same number of unknowns as equations.

Consider the system of n equations in n unknowns
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2\\
&\ \vdots\\
a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= b_n,
\end{aligned} \tag{2.5.1}$$

which can be written in the form
$$Ax = b. \tag{2.5.2}$$

We suppose that $\det A \neq 0$. Then the system (2.5.2) has a unique solution given by $x = A^{-1}b$. We
can develop a method for finding that solution without row reduction and without computing
$A^{-1}$.

Let $D = \det A$. We define n new matrices:
$$A_1=\begin{pmatrix} b_1&a_{12}&\cdots&a_{1n}\\ b_2&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&&\vdots\\ b_n&a_{n2}&\cdots&a_{nn} \end{pmatrix},\quad A_2=\begin{pmatrix} a_{11}&b_1&\cdots&a_{1n}\\ a_{21}&b_2&\cdots&a_{2n}\\ \vdots&\vdots&&\vdots\\ a_{n1}&b_n&\cdots&a_{nn} \end{pmatrix},\ \dots,\ A_n=\begin{pmatrix} a_{11}&a_{12}&\cdots&b_1\\ a_{21}&a_{22}&\cdots&b_2\\ \vdots&\vdots&&\vdots\\ a_{n1}&a_{n2}&\cdots&b_n \end{pmatrix}.$$
That is, $A_i$ is the matrix obtained by replacing the ith column of A with b. Finally, let
$D_1 = \det A_1$, $D_2 = \det A_2$, ..., $D_n = \det A_n$.

Theorem 2.5.1: Let A be an $n \times n$ matrix and suppose that $\det A \neq 0$. Then the unique
solution to the system $Ax = b$ is given by
$$x_1 = \frac{D_1}{D},\quad x_2 = \frac{D_2}{D},\ \dots,\ x_i = \frac{D_i}{D},\ \dots,\ x_n = \frac{D_n}{D}.$$

Example 2.9 Solve the following system using Cramer's rule:
$$\begin{aligned} x_1-x_2+x_3&=5\\ -x_1-2x_2+3x_3&=1\\ 2x_1-x_2-x_3&=3 \end{aligned}$$

Solution

We first compute the determinant of the matrix of coefficients:
$$D=\begin{vmatrix} 1&-1&1\\ -1&-2&3\\ 2&-1&-1 \end{vmatrix} = 5 - 5 + 5 = 5.$$
Because $D \neq 0$, the system has a unique solution. We now compute $D_1$, $D_2$ and $D_3$:
$$D_1=\begin{vmatrix} 5&-1&1\\ 1&-2&3\\ 3&-1&-1 \end{vmatrix}=20,\quad D_2=\begin{vmatrix} 1&5&1\\ -1&1&3\\ 2&3&-1 \end{vmatrix}=10,\quad D_3=\begin{vmatrix} 1&-1&5\\ -1&-2&1\\ 2&-1&3 \end{vmatrix}=15.$$
Thus, the unique solution of the system is
$$x_1 = \frac{D_1}{D} = 4,\quad x_2 = \frac{D_2}{D} = 2,\quad x_3 = \frac{D_3}{D} = 3.$$

Exercise 2.2

1. Determine whether each of the following matrices is in row-echelon form. If it is, determine whether it is also in reduced row-echelon form.
$$\text{(i) }\begin{pmatrix}1&0&0&0\\0&1&1&2\\0&0&0&0\end{pmatrix}\quad \text{(ii) }\begin{pmatrix}0&1&0&0\\1&0&2&1\end{pmatrix}\quad \text{(iii) }\begin{pmatrix}2&0&1&3\\0&1&1&4\\0&0&0&1\end{pmatrix}$$
$$\text{(iv) }\begin{pmatrix}1&0&2&1\\0&1&3&4\\0&0&1&0\end{pmatrix}\quad \text{(v) }\begin{pmatrix}0&0&1&0&0\\0&0&0&1&0\\0&0&0&2&0\end{pmatrix}\quad \text{(vi) }\begin{pmatrix}1&0&0&0\\0&0&0&1\\0&0&0&0\end{pmatrix}$$

2. Use the Gaussian and Gauss-Jordan elimination methods to find all solutions, if any, to
the given systems.
$$\text{(i) }\begin{aligned}2x_1+x_2+3x_3&=5\\3x_1+2x_2+2x_3&=5\\5x_1+3x_2+x_3&=16\end{aligned}\qquad \text{(ii) }\begin{aligned}x_1+2x_2+3x_3&=11\\4x_1+x_2+x_3&=4\\2x_1+x_2+3x_3&=10\end{aligned}$$
$$\text{(iii) }\begin{aligned}x_1+x_2+x_3&=7\\4x_1+x_2+5x_3&=4\\2x_1+2x_2+3x_3&=0\end{aligned}\qquad \text{(iv) }\begin{aligned}x_1+x_2+5x_3&=3\\x_1+2x_3&=1\\2x_1+x_2+x_3&=0\end{aligned}$$
$$\text{(v) }\begin{aligned}2x+y+z+2w&=6\\3x+4y+w&=1\\4x+12y+7z+20w&=22\\3x+9y+5z+28w&=30\end{aligned}\qquad \text{(vi) }\begin{aligned}x+5y+2z+6w&=3\\5x+2y+z+w&=3\end{aligned}$$

3. Determine the values of k, a, b and c (if possible) such that the system in the unknowns
x, y and z has

(a) a unique solution (b) no solution (c) more than one solution.
$$\text{(i) }\begin{aligned}kx+y+z&=1\\x+ky+z&=1\\x+y+kz&=1\end{aligned}\qquad \text{(ii) }\begin{aligned}x+2y+kz&=1\\2x+ky+8z&=3\end{aligned}$$
$$\text{(iii) }\begin{aligned}x+y&=2\\y+z&=2\\x+z&=2\\ax+by+cz&=0\end{aligned}\qquad \text{(iv) }\begin{aligned}x+y&=0\\y+z&=0\\x+z&=0\\ax+by+cz&=0\end{aligned}$$

4. Determine whether each of the following systems has a nontrivial solution.
$$\text{(i) }\begin{aligned}x+3y+2z&=0\\x+8y+8z&=0\\3x+2y+4z&=0\end{aligned}\qquad \text{(ii) }\begin{aligned}x+3y+2z&=0\\2x+3y+z&=0\\3x+2y+2z&=0\end{aligned}$$

5. Find the values of $\lambda$ for which the homogeneous linear system has a nontrivial solution.
$$\text{(i) }\begin{aligned}(\lambda-2)x+y&=0\\x+(\lambda-2)y&=0\end{aligned}\qquad \text{(ii) }\begin{aligned}(\lambda-1)x+2y&=0\\x+\lambda y&=0\end{aligned}$$

6. Use Cramer's rule to solve the following systems of linear equations.
$$\text{(i) }\begin{aligned}2x_1+x_2+x_3&=5\\-x_1+2x_2+3x_3&=0\\4x_1+x_2+x_3&=1\end{aligned}\qquad \text{(ii) }\begin{aligned}x_1+x_2+x_3&=7\\2x_1+5x_3&=4\\3x_2+x_3&=2\end{aligned}$$

Chapter 3

Change of basis
Learning Outcomes:
By the end of this chapter you should be able to
- Write vectors in terms of the standard basis.
- Write vectors in terms of other bases.
- Compute the transition matrix.

Recall that in $\mathbb{R}^2$ we wrote vectors in terms of the standard basis $\left\{\begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}\right\}$. In $\mathbb{R}^n$ we define
the standard basis $\{e_1, e_2, \dots, e_n\}$ where
$$e_1=(1,0,0,\dots,0),\quad e_2=(0,1,0,\dots,0),\ \dots,\ e_n=(0,0,0,\dots,1).$$

This basis is most commonly used because it is relatively easy to work with. But it
sometimes happens that other bases are more convenient. There are obviously many
bases to choose from, since in an n-dimensional vector space any n linearly independent
vectors form a basis. In this chapter we shall see how to change from one basis to another by
computing a certain matrix.

1  0
Example 3.1. Let u1    and u2    . Then B1  u1 , u2  is the standard basis in R .
2

0 1
1  1 
Let v1    and v2    . The set B2  v1, v2  is also a basis in R .
2

 3 2

x  2
Let x   1  be a vector in R . This notation means that
 x2 

17
x  1 0
x   1   x1    x2    x1u1  x2u2 .
 x2  0 1

That is, x can be written in terms of the vectors in the basis B1 . This is denoted as

x 
( x) B1   1  .
 x2 

2
Since B2 is another basis in R , there are scalars c1 and c2 such that

x  c1v1  c2v2 . (3.1)

Once these scalars are found, we write

c 
( x) B2   1  .
 c2 

How do we find the scalars c1 and c2 ? To find the scalars c1 and c2 , we write the old basis

vectors ( u1 and u2 ) in terms of the new basis vectors ( v1 and v2 )

1  1   1 
u1     25    35    52 v1  35 v2 (3.2)
0  3  2 

and

0  1   1 
u2     15    15    15 v1  51 v2 . (3.3)
1  3  2 

That is,

 2 1
(u1 ) B2   53  and (u2 ) B2   15  .
 5  5

Then, from (3.2) and (3.3), we have

x  x1u1  x2u2  x1  52 v1  35 v2   x2  15 v1  15 v2 
  52 x1  15 x2  v1    53 x1  15 x2  v2 .

It follows from (3.1) that

18
c1  2
5 x1  15 x2
c2   53 x1  15 x2

or

 c   2 x1  15 x2   25 1
  x1 
( x) B2   1  =  5 3  3
5
  .
 c2    5 x1  5 x2    5   x2 
1 1
5

Therefore, we have shown that,

( x) B2  A( x) B1 (3.4)

 2 1

where A   53 5
 is called the transition matrix from B1 to B2 .
 5
1
5 

This example can easily be generalized. Let $B_1=\{u_1,u_2,\dots,u_n\}$ and $B_2=\{v_1,v_2,\dots,v_n\}$ be two
bases for an n-dimensional real vector space V. Let $x\in V$; then x can be written in terms of
the two bases:
$$x=b_1u_1+b_2u_2+\dots+b_nu_n \tag{3.5}$$
and
$$x=c_1v_1+c_2v_2+\dots+c_nv_n, \tag{3.6}$$
where the $b_i$'s and $c_i$'s are real numbers. We then write
$$(x)_{B_1}=\begin{pmatrix}b_1\\b_2\\\vdots\\b_n\end{pmatrix}$$
to denote the representation of x in terms of the basis $B_1$. This is unambiguous because the
coefficients $b_i$ are unique. Likewise
$$(x)_{B_2}=\begin{pmatrix}c_1\\c_2\\\vdots\\c_n\end{pmatrix}$$
has a similar meaning. Suppose that $w_1=a_1u_1+a_2u_2+\dots+a_nu_n$ and $w_2=b_1u_1+b_2u_2+\dots+b_nu_n$.
Then
$$w_1+w_2=(a_1+b_1)u_1+(a_2+b_2)u_2+\dots+(a_n+b_n)u_n,$$
so that
$$(w_1+w_2)_{B_1}=(w_1)_{B_1}+(w_2)_{B_1}.$$
That is, in the new notation we can add vectors just as we add vectors in $\mathbb{R}^n$. Moreover, it
is easy to show that
$$(\alpha w)_{B_1}=\alpha (w)_{B_1}.$$
Now, since $B_2$ is a basis, each $u_j$ in $B_1$ can be written as a linear combination of the $v_i$'s.
Thus, there exists a unique set of scalars $a_{1j},a_{2j},\dots,a_{nj}$ such that for $j=1,2,\dots,n$
$$u_j=a_{1j}v_1+a_{2j}v_2+\dots+a_{nj}v_n \tag{3.7}$$
or
$$(u_j)_{B_2}=\begin{pmatrix}a_{1j}\\a_{2j}\\\vdots\\a_{nj}\end{pmatrix}. \tag{3.8}$$

Definition 3.1: The $n\times n$ matrix A whose columns are given by (3.8) is called the transition
matrix from the basis $B_1$ to the basis $B_2$.

Theorem 3.1 Let $B_1$ and $B_2$ be bases for a vector space V. Let A be the transition matrix
from $B_1$ to $B_2$. Then, for every $x\in V$,
$$(x)_{B_2}=A(x)_{B_1}. \tag{3.9}$$

Proof

We shall use the representations of x given in (3.5) and (3.6). From (3.5) and (3.7) we have
$$\begin{aligned}
x&=b_1u_1+b_2u_2+\dots+b_nu_n\\
&=b_1(a_{11}v_1+a_{21}v_2+\dots+a_{n1}v_n)+b_2(a_{12}v_1+a_{22}v_2+\dots+a_{n2}v_n)+\dots+b_n(a_{1n}v_1+a_{2n}v_2+\dots+a_{nn}v_n)\\
&=(a_{11}b_1+a_{12}b_2+\dots+a_{1n}b_n)v_1+(a_{21}b_1+a_{22}b_2+\dots+a_{2n}b_n)v_2+\dots+(a_{n1}b_1+a_{n2}b_2+\dots+a_{nn}b_n)v_n\\
&=c_1v_1+c_2v_2+\dots+c_nv_n.
\end{aligned}$$
Thus,
$$(x)_{B_2}=\begin{pmatrix}c_1\\c_2\\\vdots\\c_n\end{pmatrix}=\begin{pmatrix}a_{11}b_1+a_{12}b_2+\dots+a_{1n}b_n\\a_{21}b_1+a_{22}b_2+\dots+a_{2n}b_n\\\vdots\\a_{n1}b_1+a_{n2}b_2+\dots+a_{nn}b_n\end{pmatrix}=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{pmatrix}\begin{pmatrix}b_1\\b_2\\\vdots\\b_n\end{pmatrix}=A(x)_{B_1}.$$

Theorem 3.2 If A is the transition matrix from $B_1$ to $B_2$, then $A^{-1}$ is the transition matrix from
$B_2$ to $B_1$.

Proof Let C be the transition matrix from $B_2$ to $B_1$. Then from (3.9), we have
$$(x)_{B_1}=C(x)_{B_2}. \tag{3.10}$$
But $(x)_{B_2}=A(x)_{B_1}$, and substituting this into (3.10) gives
$$(x)_{B_1}=CA(x)_{B_1}. \tag{3.11}$$
But (3.11) holds for every x in V only if $CA=I$. Thus $C=A^{-1}$, and the result follows.

We now give a simple procedure for determining the transition matrix from the standard basis to
a basis $B_2$.

PROCEDURE FOR FINDING THE TRANSITION MATRIX FROM THE STANDARD BASIS TO THE BASIS $B_2=\{v_1,v_2,\dots,v_n\}$

i. Write the matrix C whose columns are $v_1,v_2,\dots,v_n$.
ii. Compute $C^{-1}$. This is the required transition matrix.

 1   3   0  
 
Example 3.2 In R let B1 be the standard basis and let B2   0  ,  1 ,  1   . If
3

 2   0   2  
      

 x
 
x   y   R 3 , write x in terms of the vectors in B2 .
z
 

Solution: We first verify that B2 is a basis. That is

1 3 0
0 1 1 80
2 0 2

1 0 0


     
u1   0  , u2   1  , u1   0 
Since 0 0  1  , we immediately see that the transition matrix, C, from
     
B2 to B1 is given by

1 3 0 
 
C   0 1 1 
 2 0 2 
 

Thus, from Theorem 3.2, the transition matrix A from B1 to B2 is

22
2 6 3 

1 
A  C   2 2 1
1
8
 2 6 1 
 

Therefore, if

 x 2 6 3  x 
  1  
( x) B1   y  then ( x) B2  8  2 2 1  y  .
z  2 6 1  z 
    

For example, if

1 2 6 3  1   2   14 
        
( x) B1   2  then ( x) B2  18  2 2 1  2   18  2    14  .
 4  2 6 1  4   14    7 
        4

As a check, note that

1  3   0   1  1 0 0


  1  7         
 0   4   1   4  1    2   1  0   2  1   4  0 
1
4
 2   0   2   4   0   0   1 
             

Example 3.3 In $P_2$ (the set of polynomials of degree at most 2) the standard basis is $B_1=\{1,x,x^2\}$.
Another basis is $B_2=\{4x-1,\; 2x^2-x,\; 3x^2+3\}$. If $p=a_0+a_1x+a_2x^2$, write p in terms of the
polynomials in $B_2$.

Solution: We first verify that $B_2$ is a basis. If $c_1(4x-1)+c_2(2x^2-x)+c_3(3x^2+3)=0$ for all x,
then rearranging terms, we obtain
$$(-c_1+3c_3)1+(4c_1-c_2)x+(2c_2+3c_3)x^2=0.$$
But, since $\{1,x,x^2\}$ is a linearly independent set, we must have
$$-c_1+3c_3=0,\qquad 4c_1-c_2=0,\qquad 2c_2+3c_3=0.$$
The determinant of this homogeneous system is
$$\begin{vmatrix}-1&0&3\\4&-1&0\\0&2&3\end{vmatrix}=27\neq 0,$$
which means that $c_1=c_2=c_3=0$ is the only solution. Now
$$(4x-1)_{B_1}=\begin{pmatrix}-1\\4\\0\end{pmatrix},\quad (2x^2-x)_{B_1}=\begin{pmatrix}0\\-1\\2\end{pmatrix}\quad\text{and}\quad (3x^2+3)_{B_1}=\begin{pmatrix}3\\0\\3\end{pmatrix}.$$
Hence, the transition matrix from $B_2$ to $B_1$ is
$$C=\begin{pmatrix}-1&0&3\\4&-1&0\\0&2&3\end{pmatrix},\quad\text{so that}\quad A=C^{-1}=\frac1{27}\begin{pmatrix}-3&6&3\\-12&-3&12\\8&2&1\end{pmatrix}$$
is the transition matrix from $B_1$ to $B_2$. Since $(a_0+a_1x+a_2x^2)_{B_1}=\begin{pmatrix}a_0\\a_1\\a_2\end{pmatrix}$, we have
$$(a_0+a_1x+a_2x^2)_{B_2}=\frac1{27}\begin{pmatrix}-3&6&3\\-12&-3&12\\8&2&1\end{pmatrix}\begin{pmatrix}a_0\\a_1\\a_2\end{pmatrix}=\begin{pmatrix}\frac1{27}(-3a_0+6a_1+3a_2)\\[2pt]\frac1{27}(-12a_0-3a_1+12a_2)\\[2pt]\frac1{27}(8a_0+2a_1+a_2)\end{pmatrix}.$$
For example, if $p(x)=5x^2-3x+4$ then
$$(5x^2-3x+4)_{B_2}=\frac1{27}\begin{pmatrix}-3&6&3\\-12&-3&12\\8&2&1\end{pmatrix}\begin{pmatrix}4\\-3\\5\end{pmatrix}=\begin{pmatrix}-\frac{15}{27}\\[2pt]\frac{21}{27}\\[2pt]\frac{31}{27}\end{pmatrix},$$
or
$$5x^2-3x+4=-\tfrac{15}{27}(4x-1)+\tfrac{21}{27}(2x^2-x)+\tfrac{31}{27}(3x^2+3).$$

Example 3.4 Let $B_1=\left\{\begin{pmatrix}3\\1\end{pmatrix},\begin{pmatrix}-2\\1\end{pmatrix}\right\}$ and $B_2=\left\{\begin{pmatrix}2\\4\end{pmatrix},\begin{pmatrix}-5\\3\end{pmatrix}\right\}$ be two bases in $\mathbb{R}^2$. If $(x)_{B_1}=\begin{pmatrix}b_1\\b_2\end{pmatrix}$,
write x in terms of the vectors in $B_2$.

Solution: This problem is more difficult than the other two examples, because
neither basis is the standard basis. We must write the vectors in $B_1$ as linear combinations of
the vectors in $B_2$. That is, we must find constants $a_{11},a_{21},a_{12},a_{22}$ such that
$$\begin{pmatrix}3\\1\end{pmatrix}=a_{11}\begin{pmatrix}2\\4\end{pmatrix}+a_{21}\begin{pmatrix}-5\\3\end{pmatrix}\quad\text{and}\quad \begin{pmatrix}-2\\1\end{pmatrix}=a_{12}\begin{pmatrix}2\\4\end{pmatrix}+a_{22}\begin{pmatrix}-5\\3\end{pmatrix}.$$
This leads to the following systems:
$$\begin{aligned}2a_{11}-5a_{21}&=3\\4a_{11}+3a_{21}&=1\end{aligned}\qquad\text{and}\qquad \begin{aligned}2a_{12}-5a_{22}&=-2\\4a_{12}+3a_{22}&=1.\end{aligned}$$
The solutions are $a_{11}=\tfrac{7}{13}$, $a_{21}=-\tfrac{5}{13}$, $a_{12}=-\tfrac{1}{26}$ and $a_{22}=\tfrac{5}{13}$. Thus
$$A=\frac1{26}\begin{pmatrix}14&-1\\-10&10\end{pmatrix}$$
and
$$(x)_{B_2}=\frac1{26}\begin{pmatrix}14&-1\\-10&10\end{pmatrix}\begin{pmatrix}b_1\\b_2\end{pmatrix}=\begin{pmatrix}\frac1{26}(14b_1-b_2)\\[2pt]\frac{10}{26}(-b_1+b_2)\end{pmatrix}.$$
For example, let $x=\begin{pmatrix}7\\4\end{pmatrix}$ (in the standard basis). Then
$$\begin{pmatrix}7\\4\end{pmatrix}=b_1\begin{pmatrix}3\\1\end{pmatrix}+b_2\begin{pmatrix}-2\\1\end{pmatrix}=3\begin{pmatrix}3\\1\end{pmatrix}+1\begin{pmatrix}-2\\1\end{pmatrix},$$
so that
$$\begin{pmatrix}7\\4\end{pmatrix}_{B_1}=\begin{pmatrix}3\\1\end{pmatrix}$$
and
$$\begin{pmatrix}7\\4\end{pmatrix}_{B_2}=\frac1{26}\begin{pmatrix}14&-1\\-10&10\end{pmatrix}\begin{pmatrix}3\\1\end{pmatrix}=\begin{pmatrix}\frac{41}{26}\\[2pt]-\frac{20}{26}\end{pmatrix}.$$
That is,
$$\begin{pmatrix}7\\4\end{pmatrix}=\frac{41}{26}\begin{pmatrix}2\\4\end{pmatrix}-\frac{20}{26}\begin{pmatrix}-5\\3\end{pmatrix}.$$

Exercise 3.1

1. In the following problems write $x=\begin{pmatrix}x\\y\end{pmatrix}\in\mathbb{R}^2$ in terms of the given basis.

(a) $\left\{\begin{pmatrix}1\\1\end{pmatrix},\begin{pmatrix}1\\-1\end{pmatrix}\right\}$ (b) $\left\{\begin{pmatrix}5\\7\end{pmatrix},\begin{pmatrix}3\\4\end{pmatrix}\right\}$ (c) $\left\{\begin{pmatrix}a\\c\end{pmatrix},\begin{pmatrix}b\\d\end{pmatrix}\right\}$ where $ad-bc\neq 0$

2. Write $\begin{pmatrix}x\\y\\z\end{pmatrix}\in\mathbb{R}^3$ in terms of the given basis.

(a) $\left\{\begin{pmatrix}1\\0\\0\end{pmatrix},\begin{pmatrix}1\\1\\0\end{pmatrix},\begin{pmatrix}1\\1\\1\end{pmatrix}\right\}$ (b) $\left\{\begin{pmatrix}2\\1\\3\end{pmatrix},\begin{pmatrix}1\\4\\5\end{pmatrix},\begin{pmatrix}3\\2\\4\end{pmatrix}\right\}$ (c) $\left\{\begin{pmatrix}a\\0\\0\end{pmatrix},\begin{pmatrix}b\\d\\0\end{pmatrix},\begin{pmatrix}c\\e\\f\end{pmatrix}\right\}$ where $adf\neq 0$

3. In the following problems write the polynomial $a_0+a_1x+a_2x^2$ in $P_2$ in terms of the given basis.

(a) $1,\; x+1,\; x^2+1$ (b) $x-1,\; x+1,\; x^2-1$ (c) $6,\; 2+3x,\; 3+4x+5x^2$

4. In $P_3$ write the polynomial $2x^3+3x^2-5x+6$ in terms of the basis polynomials $1,\; 1+x,\; x+x^2,\; x^2+x^3$.

5. In $P_3$ write the polynomial $4x^2-x+5$ in terms of the basis polynomials $1,\; 1-x,\; (1-x)^2,\; (1-x)^3$.

6. In $\mathbb{R}^2$ suppose that $(x)_{B_1}=\begin{pmatrix}2\\-1\end{pmatrix}$, where $B_1=\left\{\begin{pmatrix}1\\-1\end{pmatrix},\begin{pmatrix}2\\3\end{pmatrix}\right\}$. Write x in terms of the basis $B_2=\left\{\begin{pmatrix}0\\3\end{pmatrix},\begin{pmatrix}5\\-1\end{pmatrix}\right\}$.

7. In $\mathbb{R}^3$, $(x)_{B_1}=\begin{pmatrix}2\\-1\\4\end{pmatrix}$ where $B_1=\left\{\begin{pmatrix}1\\1\\0\end{pmatrix},\begin{pmatrix}0\\1\\1\end{pmatrix},\begin{pmatrix}1\\0\\1\end{pmatrix}\right\}$. Write x in terms of $B_2=\left\{\begin{pmatrix}3\\0\\0\end{pmatrix},\begin{pmatrix}1\\2\\1\end{pmatrix},\begin{pmatrix}0\\1\\5\end{pmatrix}\right\}$.

Chapter 4

Eigenvalues and eigenvectors

Learning Outcomes:
By the end of this chapter you should be able to
- Determine eigenvalues and eigenvectors.
- Prove properties of eigenvalues and eigenvectors.
- Understand what is meant by similar matrices.
- Diagonalize a matrix.
- Find powers of matrices.

4.1 Eigenspace
Definition 4.1. Let V be a finite-dimensional vector space over $\mathbb{R}$, and let $T:V\to V$ be a
linear transformation. A non-zero vector $v\in V$ such that $T(v)=\lambda v$ for some $\lambda\in\mathbb{R}$ is said to
be an eigenvector of T with eigenvalue $\lambda$. When $\lambda$ is an eigenvalue of T, the set of all
eigenvectors of T belonging to $\lambda$, together with the zero vector, is called the $\lambda$-eigenspace of T, denoted by
$$V_\lambda=\{v\in V: T(v)=\lambda v\}.$$

Definition 4.2. Let A be an $n\times n$ matrix with real entries. The number $\lambda$ is called an eigenvalue
of A if there is a non-zero vector v such that
$$Av=\lambda v.$$
The non-zero vector v is called an eigenvector of A corresponding to the eigenvalue $\lambda$.

Theorem 4.1 Let $\lambda$ be an eigenvalue of an operator $T:V\to V$. Then $V_\lambda$ is a subspace of V.

Proof Suppose that $v,w\in V_\lambda$; that is, $T(v)=\lambda v$ and $T(w)=\lambda w$. Then for any scalars $a,b\in\mathbb{R}$,
$$T(av+bw)=aT(v)+bT(w)=a(\lambda v)+b(\lambda w)=\lambda(av+bw).$$
Thus $av+bw\in V_\lambda$. Hence $V_\lambda$ is a subspace of V.

27
Example 4.1
1. Let V = ℝ². Let T : V → V be the linear transformation defined by
   T((x, y)ᵗ) = (y, x)ᵗ for all (x, y)ᵗ ∈ V. Then (1, 1)ᵗ is an eigenvector of T
   with eigenvalue 1, and (1, −1)ᵗ is an eigenvector of T with eigenvalue −1.
   We remark that these two eigenvectors form a basis of V.
2. Let I : V → V be the identity mapping. Then for every v ∈ V, I(v) = v = 1v.
   Hence 1 is an eigenvalue of I, and every non-zero vector v ∈ V is an
   eigenvector belonging to 1.

We clearly need to develop a systematic approach to determining the eigenvalues and
eigenvectors of a linear transformation.

4.2 The characteristic equation and polynomial

Definition 4.3. Let M be an n × n matrix with entries in ℝ. The characteristic polynomial of
M is defined to be

    χ_M(λ) = det(M − λI),

where I is the n × n identity matrix. When V is a finite-dimensional vector space over ℝ,
T : V → V is a linear transformation, and A is the matrix of T with respect to some basis of
V, we define the characteristic polynomial of T, denoted χ_T(λ), by

    χ_T(λ) = det(A − λI).

The characteristic equation corresponding to the matrix A is

    χ_T(λ) = det(A − λI) = 0.

Theorem 4.2. Let A be an n × n matrix with real entries. Then λ is an eigenvalue of A if and
only if

    χ_A(λ) = det(A − λI) = 0.

Since the characteristic polynomial is used to find the eigenvalues of a matrix, and we know
from the fundamental theorem of algebra that any polynomial of degree n with real or
complex coefficients has exactly n roots (counting multiplicities), we have:

COUNTING MULTIPLICITIES: every n × n matrix has exactly n (possibly complex, possibly
repeated) eigenvalues.

We now give the following three-step procedure for calculating eigenvalues and
corresponding eigenvectors.

PROCEDURE FOR COMPUTING EIGENVALUES AND EIGENVECTORS

i.   Find χ_T(λ) = det(A − λI).
ii.  Find the roots λᵢ, for i = 1, 2, ..., n, of χ_T(λ) = 0.
iii. Corresponding to each eigenvalue λᵢ, solve the homogeneous
     system (A − λᵢI)v = 0.
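The three steps above can be checked numerically. Below is a minimal NumPy sketch (not part of the original notes) applied to the matrix of Example 4.2 below; `numpy.linalg.eig` carries out steps (i)-(iii) at once.

```python
# NumPy sketch of the three-step procedure.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# Steps (i)-(ii): numpy.linalg.eig finds the roots of det(A - lam*I) = 0.
w, V = np.linalg.eig(A)          # eigenvalues in w, eigenvectors as columns of V

# Step (iii): column i of V solves the homogeneous system (A - w[i]*I)v = 0.
ok = all(np.allclose(A @ v, lam * v) for lam, v in zip(w, V.T))
print(ok)   # prints True
```

Here the eigenvalues returned are 2 and 4, matching the hand computation in Example 4.2.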

Example 4.2
1. Let V = ℝ², and let T : V → V be the linear transformation defined by
   T((a, b)ᵗ) = (3a + b, a + 3b)ᵗ for all (a, b)ᵗ ∈ V. Then it is easy to check that T has
   matrix

       A = [ 3  1 ]
           [ 1  3 ]

   with respect to the basis {(1, 0)ᵗ, (0, 1)ᵗ}. Hence the characteristic polynomial of T is

       χ_T(λ) = det(A − λI) = (3 − λ)² − 1 = λ² − 6λ + 8,

   with roots λ₁ = 2, λ₂ = 4. Thus, T has eigenvalues λ₁ = 2 and λ₂ = 4.

   To find an eigenvector for the eigenvalue λ₁ = 2, we solve (A − λ₁I)v = 0 with
   v = (x, y)ᵗ:

       [ 1  1 ] [ x ]   [ 0 ]
       [ 1  1 ] [ y ] = [ 0 ]

   which gives x + y = 0, so x = 1, y = −1 is a solution. Hence, (1, −1)ᵗ is an
   eigenvector of T with eigenvalue λ₁ = 2.
   Similarly, to find an eigenvector for the eigenvalue λ₂ = 4, we solve
   (A − λ₂I)v = 0:

       [ −1   1 ] [ x ]   [ 0 ]
       [  1  −1 ] [ y ] = [ 0 ]

   which gives x − y = 0, so x = 1, y = 1 is a solution. Hence, (1, 1)ᵗ is an
   eigenvector of T with eigenvalue λ₂ = 4.

   We note that {(1, −1)ᵗ, (1, 1)ᵗ} is a basis of V which consists of eigenvectors of T.

2. Let S : V → V be the linear transformation with S((a, b)ᵗ) = (−b, a)ᵗ for all
   (a, b)ᵗ ∈ V. With respect to the basis above, S has matrix

       A = [ 0  −1 ]
           [ 1   0 ]

   Hence S has characteristic polynomial

       χ_S(λ) = λ² + 1.

   Since this polynomial has no root in ℝ, there is no eigenvector of S in V associated
   to any eigenvalue. It should be clear from this example that the presence (or absence)
   of eigenvalues depends very much on the field. If we had been working over the
   complex field ℂ, then the polynomial λ² + 1 would have had the roots i and −i.
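As a numerical companion (an addition, not part of the original notes): NumPy works over the complex numbers, so it reports the eigenvalues ±i for this matrix rather than finding none.

```python
# Over R the rotation-type matrix S has no eigenvalues; over C it has +/- i.
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])      # matrix of S in the standard basis

w, _ = np.linalg.eig(A)          # roots of lam^2 + 1 = 0
print(np.sort_complex(w))        # the two purely imaginary eigenvalues -i and +i
```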

4.3 Similar matrices and diagonalization

Definition 4.4. Two n × n matrices A and B are said to be similar if there exists an invertible
n × n matrix C such that

    B = C⁻¹AC.

Theorem 4.3. If A and B are similar n × n matrices, then A and B have the same
characteristic equation and, therefore, the same eigenvalues.

Proof. Since A and B are similar, B = C⁻¹AC and

    det(B − λI) = det(C⁻¹AC − λI) = det(C⁻¹AC − C⁻¹(λI)C)
                = det(C⁻¹(A − λI)C) = det(C⁻¹) det(A − λI) det(C)
                = det(C⁻¹) det(C) det(A − λI) = det(C⁻¹C) det(A − λI)
                = det(I) det(A − λI) = det(A − λI).

This means that A and B have the same characteristic equation and, since eigenvalues are
roots of that equation, they have the same eigenvalues.
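Theorem 4.3 is easy to check numerically. The sketch below (the matrices A and C are arbitrary illustrations, not taken from the notes) conjugates A by an invertible C and compares characteristic polynomials and eigenvalues.

```python
# Numerical check of Theorem 4.3: B = C^{-1} A C shares A's characteristic
# polynomial and eigenvalues.
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 3.0]])
C = np.array([[1.0, 1.0],
              [0.0, 1.0]])              # any invertible matrix will do
B = np.linalg.inv(C) @ A @ C

# np.poly(M) returns the coefficients of det(lam*I - M)
same_char_poly = np.allclose(np.poly(A), np.poly(B))
same_eigs = np.allclose(np.sort(np.linalg.eigvals(A)),
                        np.sort(np.linalg.eigvals(B)))
print(same_char_poly, same_eigs)        # prints True True
```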

Definition 4.5. An n × n matrix A = (aᵢⱼ) is diagonal if aᵢⱼ = 0 for all i ≠ j. An n × n matrix
A is diagonalizable if there is a diagonal matrix D such that A is similar to D.

A diagonal matrix is a very simple kind of matrix, and it would make our lives easy if we
could choose a basis with respect to which the linear transformation in question has a
diagonal matrix.

Theorem 4.4. Let V be a finite-dimensional real vector space of dimension n. Let
T : V → V be a linear transformation. Let S = {v₁, ..., vₙ} be a basis for V. Then the matrix
of T with respect to S is diagonal if and only if each vᵢ is an eigenvector of T.

Proof. Let A = (aᵢⱼ) be the matrix of T with respect to S. Then A is diagonal if and only if

    T(vⱼ) = Σᵢ aᵢⱼvᵢ = aⱼⱼvⱼ,  j = 1, 2, ..., n,

that is, if and only if each vⱼ is an eigenvector of T.

Our next result gives a necessary and sufficient condition for the existence of a basis of
eigenvectors of a linear transformation.

Theorem 4.5. Let V be a finite-dimensional real vector space of dimension n. Let
T : V → V be a linear transformation and suppose that {λᵢ : 1 ≤ i ≤ m} is the set of distinct
eigenvalues of T on V. Let V_λᵢ be the λᵢ-eigenspace of T on V for 1 ≤ i ≤ m. Then there is a
basis of V which consists of eigenvectors of T if, and only if,

    Σᵢ₌₁ᵐ dim(V_λᵢ) = n.
Corollary 4.1. Let T : V → V be a linear transformation, where V is a finite-dimensional
vector space of dimension n. Suppose that T has n distinct eigenvalues. Then there is a basis
of V which consists of eigenvectors of T.

Definition 4.6. A linear transformation T : V → V is diagonalizable whenever there exists a
basis for V consisting of eigenvectors of T.

Example 4.3. Let V = ℝ³, and let T : V → V be the linear transformation defined by

    T((a, b, c)ᵗ) = (5a + b + c, a + 5b + c, a + b + 5c)ᵗ

for all (a, b, c)ᵗ ∈ V. With respect to the standard basis {(1, 0, 0)ᵗ, (0, 1, 0)ᵗ, (0, 0, 1)ᵗ},
T has matrix

    A = [ 5  1  1 ]
        [ 1  5  1 ]
        [ 1  1  5 ]

The characteristic polynomial of T is

    χ_T(λ) = (4 − λ)²(7 − λ),

so that the eigenvalues of T are 4 and 7.

We proceed to find the eigenspaces of T. Now T((a, b, c)ᵗ) = (4a, 4b, 4c)ᵗ if and only if

    (5a + b + c, a + 5b + c, a + b + 5c)ᵗ = (4a, 4b, 4c)ᵗ,

which happens if and only if a + b + c = 0. It follows that {(1, −1, 0)ᵗ, (0, 1, −1)ᵗ} is a basis
for the 4-eigenspace of T.

Also, T((a, b, c)ᵗ) = (7a, 7b, 7c)ᵗ results in

    2a = b + c  (from the first coordinate)
    2b = a + c  (from the second coordinate)
    2c = a + b  (from the third coordinate).

This happens only when a = b = c, so the 7-eigenspace of T has basis {(1, 1, 1)ᵗ} and is one
dimensional. By Theorem 4.5, V has a basis consisting of eigenvectors of T, which is formed
by taking the union of the bases for the various eigenspaces of T. In this case, the basis is

    {(1, −1, 0)ᵗ, (0, 1, −1)ᵗ, (1, 1, 1)ᵗ}.

Since this basis contains 3 = dim V eigenvectors, T is diagonalizable.
Now with respect to this basis, the matrix of T is

    B = [ 4  0  0 ]
        [ 0  4  0 ]
        [ 0  0  7 ]

and the base change matrix from {(1, 0, 0)ᵗ, (0, 1, 0)ᵗ, (0, 0, 1)ᵗ} to the basis
{(1, −1, 0)ᵗ, (0, 1, −1)ᵗ, (1, 1, 1)ᵗ} is

    P = [  1   0  1 ]
        [ −1   1  1 ]
        [  0  −1  1 ]

Check that P⁻¹AP = B.
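The check P⁻¹AP = B can be done by hand or, as in the short sketch below (an addition to the notes), numerically.

```python
# Numerical check for Example 4.3 that P^{-1} A P = B = diag(4, 4, 7).
import numpy as np

A = np.array([[5.0, 1.0, 1.0],
              [1.0, 5.0, 1.0],
              [1.0, 1.0, 5.0]])
P = np.array([[1.0, 0.0, 1.0],      # columns are the eigenvectors found above
              [-1.0, 1.0, 1.0],
              [0.0, -1.0, 1.0]])

B = np.linalg.inv(P) @ A @ P
print(np.allclose(B, np.diag([4.0, 4.0, 7.0])))   # prints True
```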

1 4
Example 4.4 Let A   .
2 3

(i) Find all eigenvalues of A and the corresponding eigenvectors.

(ii) Find an invertible matrix P such that P 1 AP is diagonal.

Solution

(i) The characteristics equation of A is

1  4
 A ( )  0
2 3
  2  4  5  0
 1  1, 2  5.

Thus, A has eigenvalues 1  1, 2  5 .


Since A has two distinct eigenvalues then by Corollary 4.1 it is diagonalizable. We
now proceed to find the eigenvectors corresponding to the eigenvalues.
To find the corresponding eigenvector for the eigenvalue 1  1 , we solve

( A  1 I )v  0
 1 4 1 0 x   x
    (1)       0; u    .
 2 3 0 1 y  y

33
 2 4  x 
    0
 2 4  y 
 2x  4 y  0
2x  4 y  0

 x  2, y  1 is the eigenvector corresponding to the eigenvalue 1  1.

Hence, (2,1, )t is an eigenvector of A with respect to the eigenvalue 1  1 .

Similarly, to find the corresponding eigenvector for the eigenvalue 1  5 , we solve

( A  1 I )v  0
1 4 1 0 x   x
    5      0; u    .
 2 3  0 1 y   y
 4 4   x 
    0
 3 3   y 
 4 x  4 y  0
 3x  3 y  0
 x  1, y  1 is the eigenvector corresponding to the eigenvalue 1  5.

Hence, (1,1, )t is an eigenvector of A with respect to the eigenvalue 1  5 .

(ii) The matrix P, whose columns are the eigenvectors found above, is

        P = [  2  1 ]
            [ −1  1 ]

    The diagonal matrix D = P⁻¹AP has the eigenvalues of A as its main-diagonal
    entries. That is,

        D = [ −1  0 ]
            [  0  5 ]

    You can also verify directly that

        D = P⁻¹AP = [  2  1 ]⁻¹ [ 1  4 ] [  2  1 ]   [ −1  0 ]
                    [ −1  1 ]   [ 2  3 ] [ −1  1 ] = [  0  5 ]

4.4 Powers of matrices
Theorem 4.6. If an n × n matrix A is diagonalizable with P⁻¹AP = D, where D is a diagonal
matrix, then

    Aᵐ = PDᵐP⁻¹.

Proof. We prove the theorem by the principle of mathematical induction. That is, we

Step 1: Check for m = 1.

Step 2: Assume that the result is true for m = k.

Step 3: Prove it for m = k + 1.

Step 1: Check for m = 1; that is, we need to show PDP⁻¹ = A. We have D = P⁻¹AP. Left
multiplying this by the matrix P and right multiplying it by P⁻¹:

    PDP⁻¹ = P(P⁻¹AP)P⁻¹ = (PP⁻¹)A(PP⁻¹) = IAI = A.

Thus we have our result for m = 1, which is A = PDP⁻¹. This means that we can factorize
the matrix A into the three matrices P, D and P⁻¹.

Step 2: Assume that the result is true for m = k, that is,

    Aᵏ = PDᵏP⁻¹.

Step 3: We need to prove the result for m = k + 1, that is, we need to prove
Aᵏ⁺¹ = PDᵏ⁺¹P⁻¹. Starting with the left-hand side we have

    Aᵏ⁺¹ = AᵏA
         = (PDᵏP⁻¹)(PDP⁻¹)
         = PDᵏ(P⁻¹P)DP⁻¹
         = PDᵏ(I)DP⁻¹
         = PDᵏDP⁻¹
         = PDᵏ⁺¹P⁻¹.

This is the required result. Hence by mathematical induction we have Aᵐ = PDᵐP⁻¹.

Example 4.5. Find A⁵, where

    A = [ 1  2  −3 ]
        [ 0  2   5 ]
        [ 0  0   3 ]

Solution: Using the methods described above,

    P = [ 1  2   7 ]          P⁻¹ = (1/2) [ 2  −4   13 ]
        [ 0  1  10 ]                      [ 0   2  −10 ]
        [ 0  0   2 ]                      [ 0   0    1 ]

and

    D = P⁻¹AP = [ 1  0  0 ]
                [ 0  2  0 ]
                [ 0  0  3 ]

Using the above result, we have

    A⁵ = PD⁵P⁻¹

       = (1/2) [ 1  2   7 ] [ 1   0    0 ] [ 2  −4   13 ]
               [ 0  1  10 ] [ 0  32    0 ] [ 0   2  −10 ]
               [ 0  0   2 ] [ 0   0  243 ] [ 0   0    1 ]

       = (1/2) [ 1  64  1701 ] [ 2  −4   13 ]
               [ 0  32  2430 ] [ 0   2  −10 ]
               [ 0   0   486 ] [ 0   0    1 ]

       = (1/2) [ 2  124  1074 ]   [ 1  62   537 ]
               [ 0   64  2110 ] = [ 0  32  1055 ]
               [ 0    0   486 ]   [ 0   0   243 ]

Thus,

    A⁵ = [ 1  62   537 ]
         [ 0  32  1055 ]
         [ 0   0   243 ]

4.4.1 Applications of powers of matrices

Matrix powers are particularly useful in Markov chains, which are based on matrices whose
entries are probabilities. Many real-life systems have an element of uncertainty which
develops over time, and this can be modelled through Markov chains.
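As a hedged illustration (the two-state "weather" transition matrix below is made up for this sketch, not taken from the notes): row i of a transition matrix P holds the probabilities of moving from state i to each state in one step, so Pⁿ holds the n-step transition probabilities.

```python
# Sketch of a two-state Markov chain via matrix powers.
import numpy as np

P = np.array([[0.9, 0.1],      # sunny -> (sunny, rainy)
              [0.5, 0.5]])     # rainy -> (sunny, rainy)

P10 = np.linalg.matrix_power(P, 10)        # 10-step transition probabilities
print(np.allclose(P10.sum(axis=1), 1.0))   # rows still sum to 1: prints True
```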

Exercise 4.1

1. Find all eigenvalues and a basis of each eigenspace of the operator T : ℝ³ → ℝ³
   defined by T((x, y, z)ᵗ) = (2x + y, y − z, 2y + 4z)ᵗ.

2. For each matrix, find all eigenvalues and linearly independent eigenvectors:

       i.  A = [ 2  2 ]    ii.  B = [ 4  2 ]    iii.  C = [ 5  −1 ]
               [ 1  3 ]             [ 3  3 ]              [ 1   3 ]

   Find invertible matrices P₁, P₂ and P₃ such that P₁⁻¹AP₁, P₂⁻¹BP₂ and P₃⁻¹CP₃ are
   diagonal.

3. For the matrices in Question 2 above find A⁵, B⁵ and C⁵.

4. For each matrix, find all eigenvalues and a basis for each eigenspace:

       i.  A = [ 3  1  1 ]    ii.  B = [  1  2   2 ]    iii.  C = [ 1  1  0 ]
               [ 2  4  2 ]             [  1  2  −1 ]              [ 0  1  0 ]
               [ 1  1  3 ]             [ −1  1   4 ]              [ 0  0  1 ]

   When possible, find invertible matrices P₁, P₂ and P₃ such that
   P₁⁻¹AP₁, P₂⁻¹BP₂ and P₃⁻¹CP₃ are diagonal.

5. For the matrices in Question 4 above find A⁴, B⁴ and C⁴, where possible.

6. Suppose v is an eigenvector of the operators S and T. Show that v is also an
   eigenvector of the operator aS + bT, where a, b ∈ ℝ.

7. For each of the following matrices, find all eigenvalues and a basis of each
   eigenspace. Hence determine which matrix can be diagonalized.

       (i)  A = [ 1  −3  3 ]    (ii)  A = [ −3  1  −1 ]
                [ 3  −5  3 ]              [ −7  5  −1 ]
                [ 6  −6  4 ]              [ −6  6  −2 ]
Chapter 5

The Cayley-Hamilton Theorem and
Minimal Polynomial

Learning Outcomes:
By the end of this chapter you should be able to
• State and prove the Cayley-Hamilton Theorem
• Apply the Cayley-Hamilton theorem
• Determine the minimal polynomial

5.1 The Cayley-Hamilton theorem

Let V be a finite-dimensional vector space over the field ℝ. Since we can compose linear
transformations which map from V to V, and since we can also add such transformations, we
can define p(T) whenever p(x) is a polynomial with coefficients in ℝ and T : V → V is a
linear transformation (we use the convention that T⁰ = I). Thus if

    p(x) = a₀ + a₁x + ... + aᵣxʳ,

then we can define the linear transformation p(T) by

    p(T) = a₀I + a₁T + ... + aᵣTʳ.

We say that T satisfies the polynomial p(x) if p(T) = 0. It follows from Section 2 (change of
basis) that if T has matrix A with respect to some basis of V, then p(T) has matrix p(A)
with respect to the same basis.

Consider, for example, T : ℝ² → ℝ² defined by T(v) = Av, where A is the matrix

    A = [ 1  1 ]
        [ 0  1 ]

Let p(λ) = λ² − 2λ + 1. Then one can check by straightforward computation that p(A) = 0.
Thus p(T) = 0.
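The "straightforward computation" can be reproduced in a couple of lines (this sketch is an addition to the notes):

```python
# Check that A = [[1,1],[0,1]] satisfies p(lam) = lam^2 - 2*lam + 1.
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
pA = A @ A - 2 * A + np.eye(2)   # p(A) = A^2 - 2A + I
print(np.allclose(pA, 0.0))      # prints True
```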

38
Exercise 5.1. Prove that if the linear transformation T above satisfies the polynomial p(x),
then p(λ) = 0 for each eigenvalue λ of T. (Hint: consider the effect of p(T) on an
eigenvector with eigenvalue λ.)

Theorem 5.1 (THE CAYLEY-HAMILTON THEOREM). Let V be a finite-dimensional vector
space over a field F, and let T : V → V be a linear transformation with characteristic
polynomial p(λ). Then p(T) = 0.

In other words, the Cayley-Hamilton theorem states that every matrix is a zero of its
characteristic polynomial.

Proof
Let A be an arbitrary n-square matrix and let χ_A(λ) be its characteristic polynomial; say

    χ_A(λ) = |λI − A| = λⁿ + aₙ₋₁λⁿ⁻¹ + ... + a₁λ + a₀.

Now let B(λ) denote the classical adjoint of the matrix λI − A. The elements of B(λ) are
cofactors of the matrix λI − A and hence are polynomials in λ of degree not exceeding n − 1.
Thus

    B(λ) = Bₙ₋₁λⁿ⁻¹ + ... + B₁λ + B₀,

where the Bᵢ are n-square matrices which are independent of λ. It follows from the property
"for any square matrix M, M·(adj M) = (adj M)·M = |M| I" that

    (λI − A)B(λ) = |λI − A| I,

or

    (λI − A)(Bₙ₋₁λⁿ⁻¹ + ... + B₁λ + B₀) = (λⁿ + aₙ₋₁λⁿ⁻¹ + ... + a₁λ + a₀)I.

Removing parentheses and equating the coefficients of corresponding powers of λ,

    Bₙ₋₁ = I
    Bₙ₋₂ − ABₙ₋₁ = aₙ₋₁I
    Bₙ₋₃ − ABₙ₋₂ = aₙ₋₂I
    .............................
    B₀ − AB₁ = a₁I
    −AB₀ = a₀I

Multiplying the above matrix equations by Aⁿ, Aⁿ⁻¹, ..., A, I respectively,

    AⁿBₙ₋₁ = Aⁿ
    Aⁿ⁻¹Bₙ₋₂ − AⁿBₙ₋₁ = aₙ₋₁Aⁿ⁻¹
    Aⁿ⁻²Bₙ₋₃ − Aⁿ⁻¹Bₙ₋₂ = aₙ₋₂Aⁿ⁻²
    .............................
    AB₀ − A²B₁ = a₁A
    −AB₀ = a₀I

Adding the above matrix equations, the left-hand side telescopes to zero, giving

    0 = Aⁿ + aₙ₋₁Aⁿ⁻¹ + ... + a₁A + a₀I.

In other words, χ_A(A) = 0.
Example 5.1. Let V = ℝ². Let T : V → V be the linear transformation with
T((a, b)ᵗ) = (b, −a)ᵗ for all (a, b)ᵗ ∈ V. Then the matrix A corresponding to T is

    A = [  0  1 ]        A − λI = [ −λ   1 ]
        [ −1  0 ]                 [ −1  −λ ]

Thus, T has characteristic polynomial det(A − λI) = λ² + 1, and it follows that

    T² + I = A² + I = [ −1   0 ] + [ 1  0 ] = [ 0  0 ]
                      [  0  −1 ]   [ 0  1 ]   [ 0  0 ]
 1 1 4 
 
Example 5.2 Let A   3 2 1  then the characteristic polynomial of A is
 2 1 1 
 
 3  2 2  5  6 (verify). Now you can also verify that
6 1 1   11 3 22 
  3  
A   7 0 11 , A   29 4 17  and
2

 3 1 8   16 3 5 
   

 11 3 22   6 1 1   1 1 4   1 0 0   0 0 0 
         
A  2 A  5 A  6 I   29 4 17   2  7 0 11  5  3 2 1  6  0 1 0    0 0 0 
3 2

 16 3 5   3 1 8   2 1 1  0 0 1   0 0 0 
         

The Cayley-Hamilton theorem is useful in calculating the inverse of a matrix. If A⁻¹ exists
and χ(A) = 0, then A⁻¹χ(A) = 0. To illustrate, if

    χ(λ) = λⁿ + aₙ₋₁λⁿ⁻¹ + ... + a₁λ + a₀,

then

    χ(A) = Aⁿ + aₙ₋₁Aⁿ⁻¹ + ... + a₁A + a₀I = 0

and

    A⁻¹χ(A) = Aⁿ⁻¹ + aₙ₋₁Aⁿ⁻² + ... + a₂A + a₁I + a₀A⁻¹ = 0.

Thus

    A⁻¹ = −(1/a₀)(Aⁿ⁻¹ + aₙ₋₁Aⁿ⁻² + ... + a₂A + a₁I).

Note that a₀ ≠ 0, because a₀ = χ(0) = (−1)ⁿ det A and we assume that A is invertible.
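The inverse formula can be sketched numerically using `np.poly`, which returns the coefficients of the monic characteristic polynomial |λI − A|. The matrix below is the one from Example 5.2 (with signs as reconstructed there); the sketch is an addition to the notes.

```python
# Inverse via the Cayley-Hamilton formula A^{-1} = -(1/a0)(A^2 + a2*A + a1*I).
import numpy as np

A = np.array([[1.0, -1.0, 4.0],
              [3.0, 2.0, -1.0],
              [2.0, 1.0, -1.0]])

coeffs = np.poly(A)                 # [1, a2, a1, a0] = [1, -2, -5, 6]
a2, a1, a0 = coeffs[1], coeffs[2], coeffs[3]
A_inv = -(A @ A + a2 * A + a1 * np.eye(3)) / a0

print(np.allclose(A_inv, np.linalg.inv(A)))   # prints True
```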

 1 1 4 
Example 5.2 Let A   3 2 1 . Then from Example 5.1, the characteristic polynomial of
 2 1 1 
 
A is  A ( ) 3  2 2  5  6 . Therefore, n  3, a2  2, a1  5, a0  6 and

1
A1 
6
  A2  2 A  5 I 
 6 1 1   2 2 8   5 0 0  
1      
  7 0 11   6 4 2    0 5 0  
6 
 3 1 8   4 2 2   0 0 5  
 1 3 7 
1 
  1 9 13 
6 
 1 3 5 

5.2 The minimal polynomial

Example 5.3. In Example 4.3 we had T((a, b, c)ᵗ) = (5a + b + c, a + 5b + c, a + b + 5c)ᵗ
for all (a, b, c)ᵗ ∈ V. With respect to the standard basis {(1, 0, 0)ᵗ, (0, 1, 0)ᵗ, (0, 0, 1)ᵗ},
T has matrix

    A = [ 5  1  1 ]
        [ 1  5  1 ]
        [ 1  1  5 ]

The characteristic polynomial of T is

    χ_T(λ) = (λ − 4)²(λ − 7),

so that the eigenvalues of T are 4 and 7, and we have χ_T(T) = 0. But we see that
(T − 4I)(T − 7I) = 0 already, as

    [ 1  1  1 ] [ −2   1   1 ]   [ 0  0  0 ]
    [ 1  1  1 ] [  1  −2   1 ] = [ 0  0  0 ]
    [ 1  1  1 ] [  1   1  −2 ]   [ 0  0  0 ]

Hence χ_T(λ) is not the smallest-degree polynomial that T satisfies.
Definition 5.1. Let T : V → V be a linear transformation, where V is a finite-dimensional
real vector space. The minimal polynomial of T is the monic polynomial m_T(x) of least
degree such that m_T(T) = 0.

Remark: You may have noticed that the above definition implicitly assumes that the
minimal polynomial is unique. This assumption may be justified by:

Lemma 5.1. Let V, T, m_T(x) be as above. Let q(x) be any polynomial such that q(T) = 0.
Then m_T(x) is a factor of q(x). In particular, the minimal polynomial of T is a divisor of the
characteristic polynomial of T.

Proof
We may write q(x) = a(x)m_T(x) + r(x), where a(x), r(x) are polynomials, and either
r(x) = 0 or r(x) has degree strictly smaller than that of m_T(x). Since m_T(T) = q(T) = 0, we
see that r(T) = 0. By the definition of the minimal polynomial, we must have r(x) = 0, so
m_T(x) is a factor of q(x). The last remark follows since p(T) = 0, where p(x) is the
characteristic polynomial of T, by Theorem 5.1 (the Cayley-Hamilton theorem).

Lemma 5.2. Let V, T, m_T(x) be as above, and let λ ∈ ℝ. Then (x − λ) is a factor of m_T(x)
if and only if λ is an eigenvalue of T.

Proof
If (x − λ) is a factor of m_T(x), then (by Lemma 5.1) (x − λ) is also a factor of the
characteristic polynomial of T, and so λ is an eigenvalue of T.

On the other hand, let λ be an eigenvalue of T, and let v be an eigenvector associated to this
eigenvalue. We may write m_T(x) = (x − λ)a(x) + μ, where a(x) is a polynomial with
coefficients in ℝ and μ ∈ ℝ. Then we have

    0 = m_T(T)(v) = a(T)(T − λI)(v) + μv = μv.

As v is not the zero vector, this forces μ = 0, so that (x − λ) is a factor of m_T(x).

Remark: Lemma 5.1 makes it fairly routine to determine the minimal polynomial of T:

• Find the matrix A of T with respect to some basis of V.
• Calculate the characteristic polynomial χ(x) of T.
• Check each possible monic factor q(x) of χ(x) to see whether or not q(T) = 0.
• The minimal polynomial of T is the monic factor q(x) of least degree for which
  q(T) = 0.

Lemma 5.1 may be used to reduce the work involved somewhat. For if λ₁, ..., λₙ are the
distinct eigenvalues of T, then the minimal polynomial of T is divisible by
t(x) = Πᵢ₌₁ⁿ (x − λᵢ), so we need only check those factors q(x) of χ(x) which are themselves
divisible by t(x).

Example 5.4. Find the minimal polynomial of the matrix

    A = [ 2  2   −5 ]
        [ 3  7  −15 ]
        [ 1  2   −4 ]
Solution: The characteristic polynomial of A is χ(x) = (x − 1)²(x − 3). The minimal
polynomial m_T(x) must divide χ(x). Also, each irreducible factor of χ(x), i.e. (x − 1) and
(x − 3), must be a factor of m_T(x). Thus, m_T(x) is exactly one of the following:
f(x) = (x − 1)(x − 3) or g(x) = (x − 1)²(x − 3). It is known from the Cayley-Hamilton
theorem that g(A) = χ(A) = 0. Now we need to check f(x):

    f(A) = (A − I)(A − 3I) = [ 1  2   −5 ] [ −1  2   −5 ]   [ 0  0  0 ]
                             [ 3  6  −15 ] [  3  4  −15 ] = [ 0  0  0 ]
                             [ 1  2   −5 ] [  1  2   −7 ]   [ 0  0  0 ]

Thus m_T(x) = f(x) = (x − 1)(x − 3) = x² − 4x + 3 is the minimal polynomial of A.
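The candidate-checking step of Example 5.4 can be carried out numerically (this sketch is an addition to the notes): evaluate each candidate factor at A and look for the zero matrix.

```python
# Checking candidate divisors of (x-1)^2 (x-3) at the matrix of Example 5.4.
import numpy as np

A = np.array([[2.0, 2.0, -5.0],
              [3.0, 7.0, -15.0],
              [1.0, 2.0, -4.0]])
I = np.eye(3)

f_of_A = (A - I) @ (A - 3 * I)   # candidate f(x) = (x-1)(x-3)
g_of_A = (A - I) @ f_of_A        # g(x) = (x-1)^2 (x-3), zero by Cayley-Hamilton
print(np.allclose(f_of_A, 0.0), np.allclose(g_of_A, 0.0))   # prints True True
```

Since f(A) is already the zero matrix, f(x) is the minimal polynomial.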

Exercise 5.2

1. Find the minimal polynomial of

       [ 2  1   0  0 ]
       [ 0  2   0  0 ]
       [ 0  0   1  1 ]
       [ 0  0  −2  4 ]

2. Find the minimal polynomial of each of the following matrices (where a ≠ 0):

       (i)  A = [ λ  a ]    (ii)  B = [ λ  a  0 ]    (iii)  C = [ λ  a  0  0 ]
                [ 0  λ ]              [ 0  λ  a ]               [ 0  λ  a  0 ]
                                      [ 0  0  λ ]               [ 0  0  λ  a ]
                                                                [ 0  0  0  λ ]