
Five.II Similarity

Linear Algebra
Jim Hefferon
http://joshua.smcvt.edu/linearalgebra
Definition and Examples
We’ve defined two matrices H and Ĥ to be matrix equivalent if there are
nonsingular P and Q such that Ĥ = PHQ. We were motivated by this
diagram showing H and Ĥ both representing a map h, but with respect to
different pairs of bases, B, D and B̂, D̂.
              h
  Vwrt B  −−−−−→  Wwrt D
              H
    id ↓             ↓ id
              h
  Vwrt B̂  −−−−−→  Wwrt D̂
              Ĥ
We now consider the special case of transformations, where the codomain equals the domain, and we add the requirement that the codomain's basis equals the domain's basis. So we are considering representations with respect to B, B and D, D.
              t
  Vwrt B  −−−−−→  Vwrt B
              T
    id ↓             ↓ id
              t
  Vwrt D  −−−−−→  Vwrt D
In matrix terms, RepD,D(t) = RepB,D(id) · RepB,B(t) · RepB,D(id)⁻¹.
Similar matrices
1.2 Definition The matrices T and T̂ are similar if there is a nonsingular P
such that T̂ = PT P−1 .
Example Consider the derivative map d/dx: P₂ → P₂. Fix the basis B = ⟨1, x, x²⟩ and the basis D = ⟨1, 1 + x, 1 + x + x²⟩. In this arrow diagram we will first get T, and then calculate T̂ from it.
              t
  Vwrt B  −−−−−→  Vwrt B
              T
    id ↓             ↓ id
              t
  Vwrt D  −−−−−→  Vwrt D
              T̂

The action of d/dx on the elements of the basis B is 1 ↦ 0, x ↦ 1, and x² ↦ 2x.

  RepB(d/dx(1)) = (0, 0, 0)ᵀ    RepB(d/dx(x)) = (1, 0, 0)ᵀ    RepB(d/dx(x²)) = (0, 2, 0)ᵀ
So we have this matrix representation of the map.

  T = RepB,B(d/dx) = ( 0 1 0 )
                     ( 0 0 2 )
                     ( 0 0 0 )

The matrix changing bases from B to D is RepB,D (id). We find these by eye
     
1 −1 0
RepD (id(1)) = 0 RepD (id(x)) =  1  RepD (id(x2 )) = −1
0 0 1

to get this.
   
1 −1 0 1 1 1
−1
P = 0 1 −1 P = 0 1 1
0 0 1 0 0 1
Now, by following the arrow diagram we have T̂ = PTP⁻¹.

  T̂ = ( 0 1 −1 )
      ( 0 0  2 )
      ( 0 0  0 )
To check that, and to underline what the arrow diagram says

              t
  Vwrt B  −−−−−→  Vwrt B
              T
    id ↓             ↓ id
              t
  Vwrt D  −−−−−→  Vwrt D
              T̂

we calculate T̂ directly. The effect of the map on the basis elements is d/dx(1) = 0, d/dx(1 + x) = 1, and d/dx(1 + x + x²) = 1 + 2x. Representing those with respect to D

  RepD(0) = (0, 0, 0)ᵀ    RepD(1) = (1, 0, 0)ᵀ    RepD(1 + 2x) = (−1, 2, 0)ᵀ

gives the same matrix T̂ = RepD,D(d/dx) as above.
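The arrow-diagram computation is easy to check numerically. Here is a minimal sketch with NumPy (not part of the handout); it recomputes T̂ = PTP⁻¹ and compares it with the directly computed representation.

```python
import numpy as np

# T represents d/dx with respect to B, B; P = RepB,D(id) changes bases from B to D.
T = np.array([[0, 1, 0],
              [0, 0, 2],
              [0, 0, 0]])
P = np.array([[1, -1, 0],
              [0, 1, -1],
              [0, 0, 1]])

That = P @ T @ np.linalg.inv(P)  # T-hat = P T P^(-1)

# The representation computed directly with respect to D, D.
That_direct = np.array([[0, 1, -1],
                        [0, 0, 2],
                        [0, 0, 0]])
assert np.allclose(That, That_direct)
```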


The definition doesn’t require that we consider the underlying maps. We
can just multiply matrices.
Example Where

  T = ( 0 −1 −2 )    P = (  1 1 0 )
      ( 2  3  2 )        ( −1 1 0 )
      ( 4  5  2 )        (  0 0 3 )

(note that P is nonsingular) we can compute this T̂ = PTP⁻¹.

  T̂ = (  2    0    0  )
      (  3    1   4/3 )
      ( 27/2 3/2   2  )

1.4 Example The only matrix similar to the zero matrix is itself: PZP⁻¹ = PZ = Z. The identity matrix has the same property: PIP⁻¹ = PP⁻¹ = I.
Similarity is an equivalence
Exercise 15 checks that similarity is an equivalence relation.
Since matrix similarity is a special case of matrix equivalence, if two
matrices are similar then they are matrix equivalent. What about the
converse: must any two matrix equivalent square matrices be similar? No;
the matrix equivalence class of an identity consists of all nonsingular
matrices of that size while the prior example shows that an identity matrix
is alone in its similarity class.
So some matrix equivalence classes split into two or more similarity
classes — similarity gives a finer partition than does equivalence. This
pictures some matrix equivalence classes subdivided into similarity classes.

We naturally want a canonical form to represent the similarity classes. Some classes, but not all, are represented by a diagonal form.
Diagonalizability
2.1 Definition A transformation is diagonalizable if it has a diagonal
representation with respect to the same basis for the codomain as for the
domain. A diagonalizable matrix is one that is similar to a diagonal
matrix: T is diagonalizable if there is a nonsingular P such that PT P−1 is
diagonal.
Example This matrix

  S = (  6 −1 −1 )
      (  2 11 −1 )
      ( −6 −5  7 )

is diagonalizable using this P

  P = (  1/2  1/4 1/4 )    P⁻¹ = ( 1 −1  0 )
      ( −1/2  1/4 1/4 )          ( 0  1 −1 )
      ( −1/2 −3/4 1/4 )          ( 2  1  1 )

to get this D = PSP⁻¹.

  D = ( 4 0  0 )
      ( 0 8  0 )
      ( 0 0 12 )
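Because P⁻¹ is given explicitly, the diagonalization can be verified in exact arithmetic rather than floating point. A sketch using Python's fractions module with NumPy object arrays (the starting matrix, unnamed in the handout, is called S here):

```python
from fractions import Fraction as F

import numpy as np

S = np.array([[6, -1, -1],
              [2, 11, -1],
              [-6, -5, 7]], dtype=object)
P = np.array([[F(1, 2), F(1, 4), F(1, 4)],
              [F(-1, 2), F(1, 4), F(1, 4)],
              [F(-1, 2), F(-3, 4), F(1, 4)]], dtype=object)
Pinv = np.array([[1, -1, 0],
                 [0, 1, -1],
                 [2, 1, 1]], dtype=object)

# First confirm that Pinv really is the inverse of P.
assert (P.dot(Pinv) == np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])).all()

# Then the product P S P^(-1) is exactly the diagonal matrix, no rounding involved.
D = P.dot(S).dot(Pinv)
assert (D == np.array([[4, 0, 0], [0, 8, 0], [0, 0, 12]])).all()
```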
Example This matrix is not diagonalizable.

  N = ( 0 0 )
      ( 1 0 )

The fact that N is not the zero matrix means that it cannot be similar to
the zero matrix, because the zero matrix is similar only to itself. Thus if N
were to be similar to a diagonal matrix D then D would have at least one
nonzero entry on its diagonal.
The crucial point is that a power of N is the zero matrix, specifically N² is the zero matrix. This implies that for any map n represented by N with respect to some B, B, the composition n ◦ n is the zero map. This in turn implies that any matrix representing n with respect to some B̂, B̂ has a square that is the zero matrix. But for any nonzero diagonal matrix D, the entries of D² are the squares of the entries of D, so D² cannot be the zero matrix. Thus N is not diagonalizable.
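The nilpotency argument can be checked directly: N² is the zero matrix, the square of any matrix similar to N is also zero, and the square of a nonzero diagonal matrix is not. A NumPy sketch (the particular P is an arbitrary nonsingular choice of mine):

```python
import numpy as np

N = np.array([[0, 0],
              [1, 0]])
assert (N @ N == 0).all()  # N squared is the zero matrix

# Any matrix similar to N also squares to zero: (P N P^-1)^2 = P N^2 P^-1.
P = np.array([[2.0, 1.0],
              [1.0, 1.0]])  # an arbitrary nonsingular matrix
M = P @ N @ np.linalg.inv(P)
assert np.allclose(M @ M, 0)

# But squaring a diagonal matrix squares its entries, so a nonzero
# diagonal matrix never squares to zero.
D = np.diag([3.0, 0.0])
assert not np.allclose(D @ D, 0)
```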
2.4 Lemma A transformation t is diagonalizable if and only if there is a basis B = ⟨β₁, . . . , βₙ⟩ and scalars λ₁, . . . , λₙ such that t(βᵢ) = λᵢβᵢ for each i.
Proof Consider a diagonal representation matrix.

  RepB,B(t) = ( RepB(t(β₁)) ⋯ RepB(t(βₙ)) ) = ( λ₁        0 )
                                              (     ⋱       )
                                              ( 0        λₙ )

Consider the representation of a member of this basis with respect to the basis, RepB(βᵢ), which is the column vector with a 1 in entry i and 0s elsewhere. The product of the diagonal matrix and the representation vector

  ( λ₁         0 ) ( 0 )   ( 0 )
  (     ⋱        ) ( ⋮ )   ( ⋮ )
  (              ) ( 1 ) = ( λᵢ )
  (        ⋱     ) ( ⋮ )   ( ⋮ )
  ( 0         λₙ ) ( 0 )   ( 0 )

has the stated action. QED
Example This matrix is not diagonal

  T = ( 4  1 )
      ( 0 −1 )

but we can find a diagonal matrix similar to it, by finding an appropriate basis. Suppose that T = RepE₂,E₂(t) for t: R² → R². We will find a basis B = ⟨β₁, β₂⟩ giving a diagonal representation.

  D = RepB,B(t) = ( λ₁  0 )
                  (  0 λ₂ )
Here is the arrow diagram.

              t
  Vwrt E₂ −−−−−→ Vwrt E₂
              T
    id ↓             ↓ id
              t
  Vwrt B  −−−−−→ Vwrt B
              D
We want λ₁ and λ₂ making these true.

  ( 4  1 ) β₁ = λ₁ · β₁    ( 4  1 ) β₂ = λ₂ · β₂
  ( 0 −1 )                 ( 0 −1 )

More precisely, we want all scalars x ∈ C such that this system

  ( 4  1 ) ( b₁ ) = x · ( b₁ )
  ( 0 −1 ) ( b₂ )       ( b₂ )

has solutions b₁, b₂ ∈ C that are not both zero (the zero vector is not an element of any basis).
Rewrite that as a linear system.

  (4 − x) · b₁ +          b₂ = 0
                (−1 − x) · b₂ = 0

One solution is λ₁ = −1, associated with those (b₁, b₂) such that b₁ = (−1/5)b₂. The other solution is λ₂ = 4, associated with the (b₁, b₂) such that b₂ = 0.
Thus the original matrix

  T = ( 4  1 )
      ( 0 −1 )

is diagonalizable to

  D = ( −1 0 )
      (  0 4 )

where this is a basis.

  B = ⟨ (−1, 5)ᵀ, (1, 0)ᵀ ⟩
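A quick check that each basis vector really is rescaled by its eigenvalue, and that the basis diagonalizes T. (A sketch, not from the handout; here the basis vectors are stacked as columns of a matrix Q, so the inverse appears on the left.)

```python
import numpy as np

T = np.array([[4, 1],
              [0, -1]])
beta1 = np.array([-1, 5])  # associated with lambda1 = -1
beta2 = np.array([1, 0])   # associated with lambda2 = 4

assert (T @ beta1 == -1 * beta1).all()
assert (T @ beta2 == 4 * beta2).all()

# With the basis vectors as the columns of Q, Q^(-1) T Q is diagonal.
Q = np.column_stack([beta1, beta2]).astype(float)
D = np.linalg.inv(Q) @ T @ Q
assert np.allclose(D, np.diag([-1.0, 4.0]))
```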
Eigenvalues and Eigenvectors
Eigenvalues and eigenvectors
3.1 Definition A transformation t : V → V has a scalar eigenvalue λ if there
is a nonzero eigenvector ~ζ ∈ V such that t(~ζ) = λ · ~ζ.
3.5 Definition A square matrix T has a scalar eigenvalue λ associated with
the nonzero eigenvector ~ζ if T ~ζ = λ · ~ζ.
Example The matrix

  D = ( 4 0 )
      ( 0 2 )

has an eigenvalue λ₁ = 4 and a second eigenvalue λ₂ = 2. The first is true because an associated eigenvector is e₁

  ( 4 0 ) ( 1 ) = 4 · ( 1 )
  ( 0 2 ) ( 0 )       ( 0 )

and similarly for the second an associated eigenvector is e₂.

  ( 4 0 ) ( 0 ) = 2 · ( 0 )
  ( 0 2 ) ( 1 )       ( 1 )

Thinking of the matrix as representing a transformation of the plane, the transformation acts on those vectors in a particularly simple way, by rescaling. Not every vector is simply rescaled.

  ( 4 0 ) ( 1 ) = ( 4 ) ≠ x · ( 1 )
  ( 0 2 ) ( 1 )   ( 2 )       ( 1 )
Computing eigenvalues and eigenvectors
Example We will find the eigenvalues and associated eigenvectors of this matrix.

  T = (  0 5 7 )
      ( −2 7 7 )
      ( −1 1 4 )

We want to find scalars x such that T~ζ = x~ζ for some nonzero ~ζ. Bring the terms to the left side

  (  0 5 7 ) ( z₁ )     ( z₁ )   ( 0 )
  ( −2 7 7 ) ( z₂ ) − x ( z₂ ) = ( 0 )
  ( −1 1 4 ) ( z₃ )     ( z₃ )   ( 0 )

and factor out the vector.

  ( 0 − x    5      7   ) ( z₁ )   ( 0 )
  (  −2    7 − x    7   ) ( z₂ ) = ( 0 )     (∗)
  (  −1      1    4 − x ) ( z₃ )   ( 0 )

This homogeneous system has nonzero solutions if and only if the matrix is
singular, that is, has a determinant of zero.
Some computation gives the determinant and its factors.

0−x 5 7
0= −2 7−x 7
−1 1 4−x
= x3 − 11x2 + 38x − 40 = (x − 5)(x − 4)(x − 2)

So the eigenvalues are λ1 = 5, λ2 = 4, and λ3 = 2.


To find the eigenvectors associated with the eigenvalue of 5 specialize
equation (∗) for x = 5.
    
−5 5 7 z1 0
−2 2 7  z2  = 0
−1 1 −1 z3 0

Gauss’s Method gives this solution set; its nonzero elements are the
eigenvectors.
 
1
V5 = { 1 z2 | z2 ∈ C }
0
Similarly, to find the eigenvectors associated with the eigenvalue of 4 specialize equation (∗) for x = 4.

  ( −4 5 7 ) ( z₁ )   ( 0 )
  ( −2 3 7 ) ( z₂ ) = ( 0 )
  ( −1 1 0 ) ( z₃ )   ( 0 )

Gauss's Method gives this.

  V₄ = { (−7, −7, 1)ᵀ · z₃ | z₃ ∈ C }

Specializing (∗) for x = 2


    
−2 5 7 z1 0
−2 5 7 z2  = 0
−1 1 2 z3 0

gives this.
 
1
V2 = { −1 z3 | z3 ∈ C }
1
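These hand computations can be cross-checked with np.linalg.eig. The library returns eigenvalues in no particular order and scales its eigenvectors to length one, so the sketch compares after sorting and tests the eigenvector equation directly:

```python
import numpy as np

T = np.array([[0, 5, 7],
              [-2, 7, 7],
              [-1, 1, 4]])

vals, vecs = np.linalg.eig(T)
assert np.allclose(sorted(vals.real), [2, 4, 5])

# Each returned column satisfies T v = lambda v.
for lam, v in zip(vals, vecs.T):
    assert np.allclose(T @ v, lam * v)

# The hand-computed eigenvectors check out as well.
assert np.allclose(T @ [1, 1, 0], 5 * np.array([1, 1, 0]))
assert np.allclose(T @ [-7, -7, 1], 4 * np.array([-7, -7, 1]))
assert np.allclose(T @ [1, -1, 1], 2 * np.array([1, -1, 1]))
```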
Example To find the eigenvalues and associated eigenvectors for the matrix

  T = ( 3 1 )
      ( 1 3 )

start with this equation.

  ( 3 1 ) ( b₁ ) = x ( b₁ )    =⇒    ( 3 − x    1   ) ( b₁ ) = ( 0 )     (∗)
  ( 1 3 ) ( b₂ )     ( b₂ )          (   1    3 − x ) ( b₂ )   ( 0 )

That system has a nontrivial solution if and only if this determinant is zero.

  | 3 − x    1   |
  |   1    3 − x | = x² − 6x + 8 = (x − 2)(x − 4)

First take the x = 2 version of (∗).


 
1 · b1 + b2 = 0 b1
=⇒ V2 = { | b1 = −b2 where b2 ∈ C }
b1 + 1 · b2 = 0 b2

Solving the second system is just as easy.


 
−1 · b1 + b2 = 0 b1
=⇒ V4 = { | b1 = b2 where b2 ∈ C }
b1 − 1 · b2 = 0 b2
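As a check (a sketch, not from the handout): the eigenvalues are the roots of the characteristic polynomial, which np.roots can find, and each solution-set description satisfies the eigenvector equation.

```python
import numpy as np

T = np.array([[3, 1],
              [1, 3]])

# The eigenvalues are the roots of x^2 - 6x + 8.
roots = np.roots([1, -6, 8])
assert np.allclose(sorted(roots.real), [2, 4])

# A vector with b1 = -b2 is rescaled by 2; one with b1 = b2 is rescaled by 4.
assert np.allclose(T @ [1, -1], 2 * np.array([1, -1]))
assert np.allclose(T @ [1, 1], 4 * np.array([1, 1]))
```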
Example If the matrix is upper triangular or lower triangular

  T = ( 2 1 0 )
      ( 0 3 1 )
      ( 0 0 2 )

then the polynomial is easy to factor.

      | 2 − x    1      0   |
  0 = |   0    3 − x    1   | = (3 − x)(2 − x)²
      |   0      0    2 − x |

These are the solutions for λ1 = 3.


      
−1 1 0 z1 0 1
 0 0 1  z 2  = 0  =⇒ V3 = { 1 z2 | z2 ∈ C }
0 0 −1 z3 0 0

These are for λ2 = 2.


      
0 1 0 z1 0 1
0 1 1 z2  = 0 =⇒ V2 = { 0 z1 | z1 ∈ C }
0 0 0 z3 0 0
Matrices that are similar have the same eigenvalues, but needn’t have
the same eigenvectors.
Example These two are similar

  T = ( 4 0  0 )    S = (  6 −1 −1 )
      ( 0 8  0 )        (  2 11 −1 )
      ( 0 0 12 )        ( −6 −5  7 )

since S = PT P⁻¹ for this P.

  P = ( 1 −1  0 )    P⁻¹ = (  1/2  1/4 1/4 )
      ( 0  1 −1 )          ( −1/2  1/4 1/4 )
      ( 2  1  1 )          ( −1/2 −3/4 1/4 )

For the first matrix, (1, 0, 0)ᵀ is an eigenvector associated with the eigenvalue 4, but that does not hold for the second matrix.
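A numerical sketch of both halves of the claim, plus a standard observation not made in the handout: the eigenvector transforms along with the matrix, since S = PTP⁻¹ and T~ζ = λ~ζ give S(P~ζ) = λ(P~ζ).

```python
import numpy as np

T = np.diag([4.0, 8.0, 12.0])
P = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [2.0, 1.0, 1.0]])
S = P @ T @ np.linalg.inv(P)

# Similar matrices have the same eigenvalues ...
assert np.allclose(sorted(np.linalg.eigvals(S).real), [4, 8, 12])

# ... but e1 is an eigenvector of T and not of S.
e1 = np.array([1.0, 0.0, 0.0])
assert np.allclose(T @ e1, 4 * e1)
assert not np.allclose(S @ e1, 4 * e1)

# The transformed vector P e1 is an eigenvector of S.
assert np.allclose(S @ (P @ e1), 4 * (P @ e1))
```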
Characteristic polynomial
3.9 Definition The characteristic polynomial of a square matrix T is the
determinant |T − xI| where x is a variable. The characteristic equation is
|T − xI| = 0. The characteristic polynomial of a transformation t is the
characteristic polynomial of any matrix representation RepB,B (t).
Note The characteristic polynomial of an n×n matrix, or of a
transformation t : Cn → Cn , is of degree n. Exercise 35 checks that the
characteristic polynomial of a transformation is well-defined, that is, that
the characteristic polynomial is the same no matter which basis we use for
the representation.
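NumPy's np.poly computes characteristic polynomial coefficients from a square matrix, which gives a quick way to illustrate the well-definedness claim: a similar matrix, representing the same transformation with respect to another basis, yields the same polynomial. A sketch (the particular change-of-basis matrix is an arbitrary nonsingular choice of mine):

```python
import numpy as np

T = np.array([[0.0, 5.0, 7.0],
              [-2.0, 7.0, 7.0],
              [-1.0, 1.0, 4.0]])

# Coefficients of det(xI - T): x^3 - 11x^2 + 38x - 40.
assert np.allclose(np.poly(T), [1, -11, 38, -40])

# A similar matrix P T P^(-1) has the same characteristic polynomial.
P = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])  # an arbitrary nonsingular matrix
assert np.allclose(np.poly(P @ T @ np.linalg.inv(P)), np.poly(T))
```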
3.10 Lemma A linear transformation on a nontrivial vector space has at least
one eigenvalue.
Proof Any root of the characteristic polynomial is an eigenvalue. Over
the complex numbers, any polynomial of degree one or greater has a root.
QED
Remark This result is why we switched in this chapter from working with
real number scalars to complex number scalars.
Eigenspace
3.12 Definition The eigenspace of a transformation t associated with the
eigenvalue λ is Vλ = { ~ζ | t(~ζ ) = λ~ζ }. The eigenspace of a matrix is
analogous.
Example Recall that this matrix has three eigenvalues, 5, 4, and 2.

  T = (  0 5 7 )
      ( −2 7 7 )
      ( −1 1 4 )

Earlier, we found that these are the eigenspaces.

  V₅ = { (1, 1, 0)ᵀ · c | c ∈ C }    V₄ = { (−7, −7, 1)ᵀ · c | c ∈ C }    V₂ = { (1, −1, 1)ᵀ · c | c ∈ C }
3.13 Lemma An eigenspace is a subspace. It is a nontrivial subspace.
Proof Notice first that Vλ is not empty; it contains the zero vector since t(~0) = ~0, which equals λ · ~0. To show that an eigenspace is a subspace, what remains is to check closure of this set under linear combinations. Take ~ζ₁, . . . , ~ζₙ ∈ Vλ and then

  t(c₁~ζ₁ + c₂~ζ₂ + · · · + cₙ~ζₙ) = c₁t(~ζ₁) + · · · + cₙt(~ζₙ)
                                   = c₁λ~ζ₁ + · · · + cₙλ~ζₙ
                                   = λ(c₁~ζ₁ + · · · + cₙ~ζₙ)

shows that the combination is also an element of Vλ.
The space Vλ contains more than just the zero vector because by definition λ is an eigenvalue only if t(~ζ) = λ~ζ has solutions for ~ζ other than ~0. QED
3.18 Theorem For any set of distinct eigenvalues of a map or matrix, a set of
associated eigenvectors, one per eigenvalue, is linearly independent.
Proof We will use induction on the number of eigenvalues. The base step
is that there are zero eigenvalues. Then the set of associated vectors is
empty and so is linearly independent.
For the inductive step assume that the statement is true for
any set of k > 0 distinct eigenvalues. Consider distinct eigenvalues
λ1 , . . . , λk+1 and let ~v1 , . . . ,~vk+1 be associated eigenvectors.
Suppose that ~0 = c1~v1 + · · · + ck~vk + ck+1~vk+1 . Derive two
equations from that, the first by multiplying by λk+1 on both sides
~0 = c1 λk+1~v1 + · · · + ck+1 λk+1~vk+1 and the second by applying the map to
both sides ~0 = c1 t(~v1 ) + · · · + ck+1 t(~vk+1 ) = c1 λ1~v1 + · · · + ck+1 λk+1~vk+1
(applying the matrix gives the same result). Subtract the second from the
first.
~0 = c1 (λk+1 − λ1 )~v1 + · · · + ck (λk+1 − λk )~vk + ck+1 (λk+1 − λk+1 )~vk+1

The ~vk+1 term vanishes. Then the induction hypothesis gives that
c1 (λk+1 − λ1 ) = 0, . . . , ck (λk+1 − λk ) = 0. The eigenvalues are distinct so
the coefficients c1 , . . . , ck are all 0. With that we are left with the equation
~0 = ck+1~vk+1 so ck+1 is also 0. QED
Example This matrix from above has three eigenvalues, 5, 4, and 2.

  T = (  0 5 7 )
      ( −2 7 7 )
      ( −1 1 4 )

Picking a nonzero vector from each eigenspace we get this linearly independent set (which is a basis because it has three elements).

  { (1, 1, 0)ᵀ, (−14, −14, 2)ᵀ, (−1/2, 1/2, −1/2)ᵀ }
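Independence can be confirmed mechanically: three vectors in C³ are linearly independent exactly when the matrix having them as rows (or columns) is nonsingular. A sketch:

```python
import numpy as np

vectors = np.array([[1.0, 1.0, 0.0],
                    [-14.0, -14.0, 2.0],
                    [-0.5, 0.5, -0.5]])  # one chosen eigenvector per row

assert np.linalg.matrix_rank(vectors) == 3  # full rank: independent
assert abs(np.linalg.det(vectors)) > 1e-9   # equivalently, nonsingular
```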

Example This upper-triangular matrix has the eigenvalues 2 and 3.

  ( 2 1 0 )
  ( 0 3 1 )
  ( 0 0 2 )

Picking a vector from each of V₃ and V₂ gives this linearly independent set.

  { (1, 1, 0)ᵀ, (2, 0, 0)ᵀ }
A criterion for diagonalizability
3.20 Corollary An n×n matrix with n distinct eigenvalues is diagonalizable.
Proof Form a basis of eigenvectors. Apply Lemma 2.4 . QED
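Applying the corollary to the running 3×3 example (a sketch, not from the handout): its three eigenvalues 5, 4, 2 are distinct, so stacking one eigenvector per eigenvalue as the columns of a matrix Q diagonalizes T; with this column convention the inverse sits on the left.

```python
import numpy as np

T = np.array([[0.0, 5.0, 7.0],
              [-2.0, 7.0, 7.0],
              [-1.0, 1.0, 4.0]])

# One eigenvector for each of the distinct eigenvalues 5, 4, 2, as columns.
Q = np.column_stack([[1, 1, 0], [-7, -7, 1], [1, -1, 1]]).astype(float)

D = np.linalg.inv(Q) @ T @ Q
assert np.allclose(D, np.diag([5.0, 4.0, 2.0]))
```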
Geometry of eigenvectors
Lines go to lines
Consider a real space transformation t : Rn → Rn . A defining property of
linear maps is that t(r · ~v) = r · t(~v).
In a real space Rn a line through the origin is a set { r · ~v | r ∈ R } for a nonzero ~v. So t's action

  r · ~v  ↦  r · t(~v)

is to send members of the line { r · ~v | r ∈ R } in the domain to members of the line { s · t(~v) | s ∈ R } in the codomain.
Thus, lines through the origin transform to lines through the origin.
Further, the action of t is determined by its effect t(~v) on any nonzero
vector element of the domain line.
Example Consider the line y = 2x in the plane

  { r · (1, 2)ᵀ | r ∈ R }

and this transformation t: R² → R² of the plane.

  ( x )  ↦  (  x + 3y )
  ( y )     ( 2x + 4y )

The map's effect on any vector in the line is easy to compute.

  ~v = (1, 2)ᵀ  ↦  (7, 10)ᵀ

The scalar multiplication property in the definition of linear map, t(r · ~v) = r · t(~v), imposes a uniformity on t's action: it has twice the effect on 2~v, three times the effect on 3~v, etc.

  (2, 4)ᵀ ↦ (14, 20)ᵀ    (−3, −6)ᵀ ↦ (−21, −30)ᵀ    (r, 2r)ᵀ ↦ (7r, 10r)ᵀ

In short: the action of t on any nonzero ~v determines its action on any other vector r~v in the line [~v].
Pick one, any one
Every plane vector is in some line through the origin so to understand
what t : R2 → R2 does to plane elements it suffices to understand what it
does to lines through the origin. By the prior slide, to understand what t
does to a line through the origin it suffices to understand what it does to a
single nonzero vector in that line.
So one way to understand a transformation’s action is to take a set
containing one nonzero vector from each line through the origin, and
describe where the transformation maps the elements of that set.
A natural set with one nonzero element from each line through the
origin is the upper half unit circle (we will explain the colors below).

  { (x, y)ᵀ = (cos(t), sin(t))ᵀ | 0 ⩽ t < π }
Angles
Example This plane transformation

  ( x )  ↦  (   2x    )
  ( y )     ( 2x + 2y )

is a skew. As we move through the unit half circle on the left, the transformation has varying effects on the vectors. The dilations vary, that is, different vectors get their length multiplied by different factors, and they are turned through varying angles. The next slide gives examples.
The prior slide's vector from the left, shown in red, is dilated by a factor of 2√2 and rotated counterclockwise by π/4 ≈ 0.78 radians.

  (1, 0)ᵀ  ↦  (2, 2)ᵀ

The orange vector is dilated by a factor of √(7 + 2√3) ≈ 3.23 and rotated counterclockwise by about 0.48 radians.

  (cos(π/6), sin(π/6))ᵀ  ↦  (2 cos(π/6), 2 cos(π/6) + 2 sin(π/6))ᵀ
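The dilation factor and rotation angle can be computed numerically for any vector. A sketch (dilation_and_rotation is a helper name of mine, not from the handout):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [2.0, 2.0]])  # the skew (x, y) -> (2x, 2x + 2y)

def dilation_and_rotation(v):
    """Return the length ratio and signed rotation angle for A acting on v."""
    w = A @ v
    dilation = np.linalg.norm(w) / np.linalg.norm(v)
    rotation = np.arctan2(w[1], w[0]) - np.arctan2(v[1], v[0])
    return dilation, rotation

# The red vector (1, 0): dilated by 2*sqrt(2), rotated by pi/4.
d, r = dilation_and_rotation(np.array([1.0, 0.0]))
assert abs(d - 2 * np.sqrt(2)) < 1e-12 and abs(r - np.pi / 4) < 1e-12

# The orange vector at angle pi/6: dilated by sqrt(7 + 2*sqrt(3)),
# rotated by about 0.48 radians.
t = np.pi / 6
d2, r2 = dilation_and_rotation(np.array([np.cos(t), np.sin(t)]))
assert abs(d2 - np.sqrt(7 + 2 * np.sqrt(3))) < 1e-12
assert abs(r2 - 0.48) < 0.01
```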
On the graph below the horizontal axis is the angle of a vector from the upper half unit circle, while the vertical axis is the angle through which that vector is rotated. The rotation angle of interest is 0 radians, here achieved by some green vector.
Definition
A vector that is rotated through an angle of 0 radians or of π radians, while being dilated by a nonzero factor, is an eigenvector. The factor by which it is dilated is the eigenvalue.
Example The plane transformation

  ( x )  ↦  ( −x )
  ( y )     ( 2y )

represented with respect to the standard bases by a diagonal matrix

  ( −1 0 )
  (  0 2 )

has this simple action on the upper half unit circle. Plotting the angle of each vector in the upper half unit circle against the angle through which it is rotated shows that one vector gets zero rotation, the vector with x = 0.

Example This generic plane transformation

  ( x )  ↦  (  x + 2y )
  ( y )     ( 3x + 4y )

has this action on the upper half unit circle. Plotting the angle of each vector in the upper half unit circle against the angle through which it is rotated gives that one vector gets a rotation of 0 radians, while another gets a rotation of π radians.