
Chapter 3: Linear Transformations (Operators) on Vector Spaces

Think of linear transformations, or operators, on vector
spaces as similar to functions. Basically, it is a rule that
associates a vector in one space to a vector in another (or
maybe the same) space.
DEFINITION: A transformation A: X → Y from vector space X into vector space Y is linear if

A(α1 x1 + α2 x2) = α1 A x1 + α2 A x2

for any vectors x1, x2 ∈ X and any scalars α1, α2.

DEFINITION: The range space of A, denoted R(A), is the set of all vectors y ∈ Y for which there exists an x ∈ X such that A(x) = y.

If R(A) = Y, the transformation (or mapping) is onto (i.e., the range of A is the entire space Y).

If A maps distinct elements of X to distinct values in Y, that is, if

x1 ≠ x2  ⇒  A(x1) ≠ A(x2)

then A is one-to-one.

If A is one-to-one and onto, then it is invertible; i.e., A⁻¹ exists such that

A⁻¹(A(x)) = x   (or A⁻¹A = I)

DEFINITION: The null space of A, denoted N(A), is the set of all vectors x ∈ X such that A(x) = 0:

N(A) = { x ∈ X | A(x) = 0 }
For a linear operator A, the range space R(A) is a subspace of Y (its dimension is the rank of matrix A), and the null space N(A) is a subspace of X (its dimension is the nullity of matrix A).

How to represent Linear Operators: We will now show that a linear operator on a vector space can always be written in the form of a matrix.

Consider the vectors x and y from n- and m-dimensional spaces: x ∈ Xⁿ, y ∈ Xᵐ.

Let {v1, v2, …, vn} be a basis for Xⁿ and {u1, u2, …, um} be a basis for Xᵐ.

Then by expanding x out in its basis vectors, x = Σ_{j=1}^n αj vj, so

y = A(x) = A( Σ_{j=1}^n αj vj ) = Σ_{j=1}^n A(αj vj) = Σ_{j=1}^n αj A(vj)

This important result means that we can determine the effect of A on any vector x by knowing only the effect of A on the basis vectors of Xⁿ.

We can write this out in matrix-vector notation as:

y = A(x) = [ A(v1)  A(v2)  ⋯  A(vn) ] [ α1 ]
                                      [ α2 ]
                                      [ ⋮  ]
                                      [ αn ]
Now we note that each A(vj) is itself a vector in the space Xᵐ, so it, like y itself, can always be expanded as a unique representation in the basis {ui} of Xᵐ:

A(vj) = Σ_{i=1}^m aij ui    and    y = Σ_{i=1}^m βi ui

Substituting these into y = Σ_{j=1}^n αj A(vj) from the previous page,

y = Σ_{j=1}^n αj ( Σ_{i=1}^m aij ui )

And changing the order of summation:

y = Σ_{i=1}^m ( Σ_{j=1}^n aij αj ) ui = Σ_{i=1}^m βi ui

But the expansion of y into {ui} must be unique, so

βi = Σ_{j=1}^n aij αj   for all i = 1, …, m,   or in matrix-vector form,   β = Aα
What good does all this do us????

Suppose we have the representations

x ∈ Xⁿ:   α = [α1 α2 ⋯ αn]ᵀ in the {vj} basis, and
y ∈ Xᵐ:   β = [β1 β2 ⋯ βm]ᵀ in the {ui} basis

If y = A(x), we can find the components βi of y from the formula above.
This can be written as a multiplication of the n-dimensional vector α by the (m × n) matrix A = [aij] to get the m-dimensional vector β.
This matrix A is the matrix representation of transformation A. The elements of this matrix obviously are going to depend on the particular bases chosen, so unless it is clear from the context, one must always specify the basis in Xⁿ and the basis in Xᵐ. (A transforms representations in the domain basis into representations in the range basis.)

Careful examination of the subscripts in the expression shows that the j-th column of A is the representation of its effect on the j-th basis vector in the domain, expanded into the range basis. That is, the j-th column of A is A(vj) written out in the basis {ui}.

A very useful property, as we shall now see in some examples.


EXAMPLE: Consider the linear vector space of all vectors in 2-D (R²). We can define a rotation operator as the linear transformation A that rotates a vector x by an angle θ counterclockwise.

[Figure: the vector x and the rotated vector Ax, separated by the angle θ.]

Find the matrix representation for A.

Recall the rule that says that the i-th column of A is the effect of A on the i-th basis vector:

Effect of A on e1:
Ae1 = e1 cos θ + e2 sin θ = [e1 e2] [ cos θ ]
                                    [ sin θ ]

Effect of A on e2:
Ae2 = −e1 sin θ + e2 cos θ = [e1 e2] [ −sin θ ]
                                     [  cos θ ]

So
A = [ cos θ  −sin θ ]
    [ sin θ   cos θ ]
Try this on the vector x = [1 2]ᵀ; let θ be 30°:

Ax = [ cos(30°)  −sin(30°) ] [ 1 ]   [ −0.134 ]
     [ sin(30°)   cos(30°) ] [ 2 ] = [  2.23  ]
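
A quick numerical check of this rotation example (a minimal sketch, assuming NumPy is available):

import numpy as np

theta = np.radians(30)                          # rotation angle of 30 degrees
A = np.array([[np.cos(theta), -np.sin(theta)],  # columns are A(e1) and A(e2)
              [np.sin(theta),  np.cos(theta)]])

x = np.array([1.0, 2.0])
print(A @ x)                                    # approximately [-0.134, 2.232]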
EXAMPLE: Let A be the linear operator that forms an
orthogonal projection from a 3-D space into a 2-D space.
Suppose the 2-D space is the "x-y plane" of the 3-D space:

A: ℜ³ → ℜ²

[Figure: a vector x in ℜ³ and its projection Ax onto the e1–e2 plane (ℜ²); e3 is the z-axis.]
"Orthogonal" projection is
along the z-axis. We
could also project
"along" any other line,
but this wouldn't be
orthogonal.
We can form the matrix representation for A from the effect it has on the basis vectors. Let the basis for R² be {e1, e2} and the basis for R³ be {e1, e2, e3}.

 1
Ae1 = e1 = [e1 e2 ] 
 0
 0
Ae2 = e2 = [e1 e2 ] 
 1
0
Ae3 = 0 = [e1 e2 ]  zero out
0

So 1 0 0
A= 
 0 1 0  a (2 x 3) matrix

We often see transformations from a space X into itself
(which results in a square matrix representation of
A). It is possible for this transformation to map
representations from one basis into representations in
a different basis, but we'll seldom find use for these.
Suppose we consider a linear transformation that maps
vectors from space X into itself: What happens when
the basis is different?

For notation, let Â be the transformation in the basis {v̂i}, and A be the transformation in the basis {vi}; i.e.,

[y]v = A [x]v    and    [y]v̂ = Â [x]v̂

Now because we're working all within a single space, we can use a B-matrix to transform both the x and y vectors between "no hat" and "hat" coordinates:

[y]v̂ = Â [x]v̂,   with   [y]v̂ = B [y]v   and   [x]v̂ = B [x]v

so that

B [y]v = Â B [x]v,   i.e.,   [y]v = B⁻¹ Â B [x]v

Comparing to [y]v = A [x]v from the previous page, we see that

A = B⁻¹ Â B
And if the bases are both orthonormal, the inverse is equal to the transpose, so

A = Bᵀ Â B
This is how we change the expression of a linear transformation A from one basis into another basis, and it is called a similarity transformation.
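
A small numerical illustration of a similarity transformation (a sketch assuming NumPy; the matrices Â and B here are made-up illustrative values, not taken from the notes):

import numpy as np

A_hat = np.array([[1.0, 2.0],      # operator expressed in the "hat" basis (made-up values)
                  [0.0, 3.0]])
B = np.array([[1.0, 1.0],          # coordinate change: [x]_hat = B [x]_nohat (made-up values)
              [0.0, 1.0]])

A = np.linalg.inv(B) @ A_hat @ B   # the same operator expressed in the "no hat" basis

x = np.array([1.0, 1.0])           # a vector in "no hat" coordinates
print(A @ x)                                # apply A directly
print(np.linalg.inv(B) @ (A_hat @ (B @ x))) # change basis, apply A_hat, change back

Both print statements give the same vector, since applying A in one basis is the same operation as changing coordinates, applying Â, and changing back.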
EXAMPLE: Consider the linear vector space of all polynomials in s of degree less than 4, with real coefficients (over the field of reals). One can show that the operator A that takes a vector v(s) and transforms it into

v″(s) + 2v′(s) + 3v(s)

is a linear operator from the space into itself. Find its matrix representation if the basis is:

{ei} = {s³, s², s, 1}

Find the effect on the basis vectors e1 = s³, e2 = s², e3 = s, e4 = 1 (recall the first derivatives of the basis vectors are 3s², 2s, 1, 0 and the second derivatives are 6s, 2, 0, 0):

Ae1 = 3s³ + 6s² + 6s = [e1 e2 e3 e4] [3 6 6 0]ᵀ
Ae2 = 3s² + 4s + 2   = [e1 e2 e3 e4] [0 3 4 2]ᵀ
Ae3 = 3s + 2         = [e1 e2 e3 e4] [0 0 3 2]ᵀ
Ae4 = 3              = [e1 e2 e3 e4] [0 0 0 3]ᵀ

So
A = [ 3 0 0 0 ]
    [ 6 3 0 0 ]
    [ 6 4 3 0 ]
    [ 0 2 2 3 ]
Try this out on the vector v(s) = s² + 1, whose representation in the {ei} basis is v = [0 1 0 1]ᵀ:

Av = [ 3 0 0 0 ] [ 0 ]   [ 0 ]
     [ 6 3 0 0 ] [ 1 ] = [ 3 ]
     [ 6 4 3 0 ] [ 0 ]   [ 4 ]
     [ 0 2 2 3 ] [ 1 ]   [ 5 ]

Check: plugging v(s) into the operator directly, v″(s) + 2v′(s) + 3v(s) = 3s² + 4s + 5, which matches [0 3 4 5]ᵀ.
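
One way to check this by machine (not the way the notes do it, which reads off columns directly) is to write the derivative d/ds itself as a matrix in the same basis and compose it; a minimal sketch assuming NumPy:

import numpy as np

# Derivative operator d/ds in the basis {s^3, s^2, s, 1}: its j-th column is the derivative
# of the j-th basis vector, written in the same basis.
D = np.array([[0., 0., 0., 0.],
              [3., 0., 0., 0.],
              [0., 2., 0., 0.],
              [0., 0., 1., 0.]])

A = D @ D + 2 * D + 3 * np.eye(4)    # matrix of the operator v -> v'' + 2v' + 3v
print(A)                             # [[3 0 0 0], [6 3 0 0], [6 4 3 0], [0 2 2 3]]

v = np.array([0., 1., 0., 1.])       # v(s) = s^2 + 1
print(A @ v)                         # [0 3 4 5], i.e. 3s^2 + 4s + 5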

Now, how would this linear operator be represented if the basis were instead

{ēi} = {s³ − s², s² − s, s − 1, 1}?

How are the bases related? We need the B-matrix that contains the coefficients relating the new and old bases. Recall the relation that defines B: each old basis vector expanded in the new basis,

ej = Σ_{i=1}^n bij ēi
It is easier to see the inverse relationship in this case, i.e., each new basis vector expanded in the old basis:

ē1 = e1 − e2 = [e1 e2 e3 e4] [1 −1 0 0]ᵀ
ē2 = e2 − e3 = [e1 e2 e3 e4] [0 1 −1 0]ᵀ
ē3 = e3 − e4 = [e1 e2 e3 e4] [0 0 1 −1]ᵀ
ē4 = e4      = [e1 e2 e3 e4] [0 0 0 1]ᵀ

which of course gives us B⁻¹ instead of B:

B⁻¹ = [  1  0  0  0 ]
      [ −1  1  0  0 ]
      [  0 −1  1  0 ]
      [  0  0 −1  1 ]

from which we compute

B = [ 1 0 0 0 ]
    [ 1 1 0 0 ]
    [ 1 1 1 0 ]
    [ 1 1 1 1 ]
 

Now from the formula A = B⁻¹ Ā B (the basis-change relation ej = Σ bij ēi sets the order of the factors), we solve for the new-basis representation Ā:

Ā = B A B⁻¹

and compute:

Ā = [ 3 0 0 0 ]
    [ 6 3 0 0 ]
    [ 8 4 3 0 ]
    [ 6 4 2 3 ]

How do we check this? First find the representation of our vector v in the new basis (by definition, v̄ = Bv):

v̄ = Bv = [ 1 0 0 0 ] [ 0 ]   [ 0 ]
         [ 1 1 0 0 ] [ 1 ] = [ 1 ]
         [ 1 1 1 0 ] [ 0 ]   [ 1 ]
         [ 1 1 1 1 ] [ 1 ]   [ 2 ]

(= 1·ē2 + 1·ē3 + 2·ē4 = (s² − s) + (s − 1) + 2 = s² + 1, as it should be)
Now compute the differential operation on s² + 1 within this basis:

Āv̄ = [ 3 0 0 0 ] [ 0 ]   [ 0  ]
     [ 6 3 0 0 ] [ 1 ] = [ 3  ]
     [ 8 4 3 0 ] [ 1 ]   [ 7  ]
     [ 6 4 2 3 ] [ 2 ]   [ 12 ]

(= 3ē2 + 7ē3 + 12ē4 = 3(s² − s) + 7(s − 1) + 12(1) = 3s² + 4s + 5, the same answer as before)
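
The whole change-of-basis example can be confirmed numerically (a sketch assuming NumPy):

import numpy as np

A = np.array([[3., 0., 0., 0.],      # operator in the original basis {s^3, s^2, s, 1}
              [6., 3., 0., 0.],
              [6., 4., 3., 0.],
              [0., 2., 2., 3.]])
B = np.array([[1., 0., 0., 0.],      # new coordinates = B @ old coordinates
              [1., 1., 0., 0.],
              [1., 1., 1., 0.],
              [1., 1., 1., 1.]])

A_bar = B @ A @ np.linalg.inv(B)     # operator in the basis {s^3 - s^2, s^2 - s, s - 1, 1}
print(A_bar)                         # [[3 0 0 0], [6 3 0 0], [8 4 3 0], [6 4 2 3]]

v_bar = B @ np.array([0., 1., 0., 1.])   # s^2 + 1 expressed in the new basis
print(A_bar @ v_bar)                     # [0 3 7 12]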
Operations on Operators:

Linear transformations, like the matrices that represent


them, are generally not commutative, but they can be
added. That is, to find
y = A1 ( x ) + A2 ( x ),

we can add the matrix representations for A1 and A2


and compute:
( A1 + A2 ) x = y

Operator Norms:
Sometimes we want to know how big A will make x. We then give the operator A a norm:

‖A‖ = sup_{x ≠ 0} ‖Ax‖ / ‖x‖,   or equivalently   ‖A‖ = sup_{‖x‖ = 1} ‖Ax‖

Recall that there are many ways to define ‖x‖. Consequently, there are many different matrix norms. They all follow the rules:

‖Ax‖ ≤ ‖A‖ · ‖x‖   for all x
‖A1 + A2‖ ≤ ‖A1‖ + ‖A2‖
‖A1 A2‖ ≤ ‖A1‖ · ‖A2‖
‖αA‖ = |α| · ‖A‖

EXAMPLE: The 2-norm of a matrix can be written as

‖A‖₂ = ( max_{‖x‖ = 1} xᵀAᵀAx )^{1/2}

Note that this definition does not indicate how to compute


such a quantity!!
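
One honest (if crude) approach is to sample many unit vectors and keep the largest ‖Ax‖; NumPy's built-in matrix 2-norm, which returns the largest singular value of A, gives the exact value for comparison. A sketch with arbitrary random data:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))            # an arbitrary example matrix

# Crude estimate of sup over ||x|| = 1 of ||Ax||, by sampling many random unit vectors...
xs = rng.standard_normal((2, 100_000))
xs /= np.linalg.norm(xs, axis=0)
estimate = np.max(np.linalg.norm(A @ xs, axis=0))

# ...versus the exact value: the matrix 2-norm, i.e. the largest singular value of A.
print(estimate, np.linalg.norm(A, 2))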
And finally,

DEFINITION: The adjoint of the linear operator A is denoted A* and is defined by

⟨Ax, y⟩ = ⟨x, A*y⟩

for all x and y. A* is itself a linear operator, and for matrix representations it happens to be the complex conjugate transpose of A:

A* = Āᵀ   (the conjugate transpose; for real A, simply Aᵀ)
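
A quick numerical check of the adjoint property with a complex matrix (a sketch assuming NumPy; the data are random and purely illustrative):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

A_star = A.conj().T                  # complex conjugate transpose of A

print(np.vdot(y, A @ x))             # <Ax, y> = conj(y) . (A x)
print(np.vdot(A_star @ y, x))        # <x, A*y> = conj(A* y) . x  -- same value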
Simultaneous Linear Equations

Consider the familiar set of m linear equations in


n unknowns:
a11 x1 + a12 x2 + ⋯ + a1n xn = y1
a21 x1 + a22 x2 + ⋯ + a2n xn = y2
          ⋮
am1 x1 + am2 x2 + ⋯ + amn xn = ym

Write this in matrix-vector form:

Ax = y

where A is m × n, x is n × 1, and y is m × 1 (A: Xⁿ → Xᵐ, x ∈ Xⁿ, y ∈ Xᵐ).
We wish to investigate when solutions to this matrix equation exist, and how many there are.

Consider that x is a vector of coefficients of a linear combination of the columns of A:

Ax = a1 x1 + a2 x2 + ⋯ + an xn = y,   where aj is the j-th column of A

For the equality to hold, then, y would have to be equal to a linear combination of the columns of A ("in the column space of matrix A").

If this is the case, then y will be linearly dependent on the


columns of A.
...So appending the vector y to the columns of A cannot add any rank to those columns, considered as a set of vectors. That is, if

W = [A  y],   then   r(A) = r(W)

r(A) = r(W)   (equivalently, y ∈ R(A))   when at least* one solution exists.

Conversely, r(A) ≠ r(W) when NO solution exists.

* If r ( A ) = r (W ), we can have either a unique solution, or


possibly an infinite number of solutions.
If r ( A) = r (W ) = n, then the solution x is unique,
If r ( A) = r (W ) < n, then there are an infinite number of solutions.

Why?

If r ( A) = n , then the columns of A form a basis of the n-


dimensional column space of A. Then vectors within this
space (of which y, we have said, is one, because r (W ) = r ( A ) )
will be written as unique linear combinations of the basis
vectors.
If r ( A ) < n , then the columns of A still span the column
space, but there are too many of them to be a basis, so
the y vector is not a unique linear combination of the
columns of A, but one of an infinite number of
possibilities.
A Picture:

[Figure: R(A) depicted as a plane; y either lies in the plane or sticks out of it.]

Suppose r(A) = 2, depicted by the plane. Then since all products Ax lie in that plane, y had better be there, or else there is no x such that Ax = y.

Now suppose n = 3; i.e., A maps 3-D vectors into the 2-D plane (r(A) < n). Then there are lots of x's that might map to the 2-D vector y.
What's the easiest way to find out what r(A) and r(W) are? A computer!!

By hand? "Echelon form" (resulting from "elementary operations"); many examples in the book.

Notation: W = [A ⋮ y], and W′ = echelon form of W.
Ex. 3:
W = [ 1 2 3 4 ]      W′ = [ 1 0 0  2.125 ]
    [ 2 1 2 7 ]           [ 0 1 0 −4.5   ]
    [ 3 2 1 1 ]           [ 0 0 1  3.625 ]

r(A) = r(W) = 3 = n, so a unique solution exists (and the last column of W′ is it: x = [2.125 −4.5 3.625]ᵀ).
Ex. 4:
W = [  1 −1 2 8 ]      W′ = [ 1 0 4 18 ]
    [ −1  2 0 2 ]           [ 0 1 2 10 ]

r(A) = r(W) = 2 < n, so an infinite number of solutions exists, and they must satisfy

x1 + 4x3 = 18
x2 + 2x3 = 10

Ex. 5:
W = [ 1 2  2 ]      W′ = [ 1 0 0 ]
    [ 3 4  3 ]           [ 0 1 0 ]
    [ 5 6 −4 ]           [ 0 0 1 ]

r(A) = 2, r(W) = 3:  NO solutions exist! (Note here that m > n.)
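
These rank tests are easy to automate. A small sketch (assuming NumPy) that classifies Examples 3-5 above:

import numpy as np

def classify(A, y):
    # Report existence/uniqueness of solutions to Ax = y from the ranks of A and W = [A | y].
    W = np.column_stack([A, y])
    rA, rW = np.linalg.matrix_rank(A), np.linalg.matrix_rank(W)
    n = A.shape[1]
    if rA != rW:
        return rA, rW, "no solution"
    return rA, rW, "unique solution" if rA == n else "infinitely many solutions"

print(classify(np.array([[1., 2, 3], [2, 1, 2], [3, 2, 1]]), np.array([4., 7, 1])))   # Ex. 3
print(classify(np.array([[1., -1, 2], [-1, 2, 0]]),          np.array([8., 2])))      # Ex. 4
print(classify(np.array([[1., 2], [3, 4], [5, 6]]),          np.array([2., 3, -4])))  # Ex. 5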


So how do we actually find the solutions, when they exist (r(A) = r(W))?

Echelon form goes a long way, but it is manual labor.

If A is n × n and r(A) = r(W) = n, then we just invert A:

x = A⁻¹ y

If not . . .
Two common cases: "Overdetermined" ( m > n)
and "Underdetermined" ( m < n)
Underdetermined Case: m < n
There is no possibility of a unique solution because

r(A) ≤ min(m, n) = m < n

Usually, when we have a choice of many solutions, we prefer the smallest one. That is, we will choose the x with minimum norm, or equivalently, minimum squared norm: we will minimize ½ xᵀx subject to the constraint that y = Ax.

We'll do this using a "Lagrange multiplier".

Find:  min_x ( ½ xᵀx )  such that  Ax − y = 0

Form the "Hamiltonian":  H = ½ xᵀx + λᵀ( y − Ax )

The Lagrange multiplier λ is a vector that makes our constraint equation the same dimension as the scalar we are minimizing. λ is actually part of our unknown, so we take our derivatives w.r.t. it, too:

∂H/∂x = x − Aᵀλ = 0      (1)

∂H/∂λ = y − Ax = 0       (2)   (note that this is just our "constraint" equation again)

Multiply eqn. (1) by A:   Ax = AAᵀλ

But eqn. (2) gives   y = Ax = AAᵀλ

So   λ = (AAᵀ)⁻¹ y.   (How do we know (AAᵀ)⁻¹ exists? A must be full rank.)

Substitute this back into (1) to get

x = Aᵀ(AAᵀ)⁻¹ y

The matrix Aᵀ(AAᵀ)⁻¹ is sometimes called a "pseudoinverse" of A, and this x is the "shortest" vector such that y = Ax for a given y and A.
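
A sketch (assuming NumPy) applying this minimum-norm formula to the underdetermined system of Ex. 4 and comparing it with NumPy's built-in pseudoinverse:

import numpy as np

A = np.array([[ 1., -1., 2.],
              [-1.,  2., 0.]])            # the underdetermined system of Ex. 4 (m = 2 < n = 3)
y = np.array([8., 2.])

x_min = A.T @ np.linalg.inv(A @ A.T) @ y  # minimum-norm solution x = A^T (A A^T)^-1 y
print(x_min, A @ x_min)                   # A @ x_min reproduces y exactly

# For full-row-rank A this matches the Moore-Penrose pseudoinverse.
print(np.linalg.pinv(A) @ y)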

"Overdetermined Case" m>n

If r ( A ) = n, then we know there will be one solution.

However it often happens that there is no solution,


because y has m elements, and we are asking that it be a
linear combination of only n vectors ( n < m) . This often
happens even when we know there must be a solution,
because, e.g., we are taking lots of data (equations)
from an experiment with only a few variables (x's).
In this situation, the noisy data might give r ( A ) ≠ r (W )
so we settle for the x that gives us the smallest error
in the equation Ax = y .
Whether a single solution exists or we want an
approximate solution with smallest error, the
following procedure works:
Define the error:  e = y − Ax

Now find:  min_x ½ ‖e‖² = min_x ½ eᵀe

½ eᵀe = ½ (y − Ax)ᵀ(y − Ax)
      = ½ (yᵀ − xᵀAᵀ)(y − Ax)
      = ½ [ yᵀy − xᵀAᵀy − yᵀAx + xᵀAᵀAx ]

The two middle terms are equal because they are scalars and transposes of each other, so

½ eᵀe = ½ [ yᵀy − 2xᵀAᵀy + xᵀAᵀAx ]

Take the derivative and set it to zero:

∂/∂x [ ½ eᵀe ] = ½ [ −2Aᵀy + 2AᵀAx ] = 0

Solving:   x = (AᵀA)⁻¹Aᵀ y

Note the striking similarity to the "pseudoinverse" we saw in the underdetermined case, x = Aᵀ(AAᵀ)⁻¹ y.

This is sometimes also called a pseudoinverse (a different one; pseudoinverses are not unique, and they come in left- and right- versions).

Example: Suppose

A = [ 2 2 ]        y = [ 4 ]
    [ 1 2 ]   and      [ 3 ]
    [ 1 0 ]            [ 1 ]

Find x such that y = Ax, or x such that y − Ax is as small as possible if no exact solution exists.

x = (AᵀA)⁻¹Aᵀ y = [ 1/3  −1/3   2/3 ] [ 4 ]   [ 1 ]
                  [  0    1/2  −1/2 ] [ 3 ] = [ 1 ]
                                      [ 1 ]
To see how good an approximation this is, compute the error vector:

e = y − Ax = [ 4 ]   [ 2 2 ]         [ 0 ]
             [ 3 ] − [ 1 2 ] [ 1 ] = [ 0 ]
             [ 1 ]   [ 1 0 ] [ 1 ]   [ 0 ]

So there was no error at all. We must have y ∈ R(A) (which may have been obvious anyway).

We had r(A) = r(W) = n = 2, so the solution existed and was unique!
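
The example is easy to reproduce numerically (a sketch assuming NumPy):

import numpy as np

A = np.array([[2., 2.],
              [1., 2.],
              [1., 0.]])
y = np.array([4., 3., 1.])

x = np.linalg.inv(A.T @ A) @ A.T @ y      # x = (A^T A)^-1 A^T y
print(x)                                  # [1. 1.]
print(y - A @ x)                          # the error vector e: all zeros here

# The same answer from NumPy's least-squares solver.
print(np.linalg.lstsq(A, y, rcond=None)[0])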
Two Important Examples in Control Systems:

Consider a discrete-time system in state-space form:

x ( k + 1) = Ax ( k ) + Bu ( k )
y ( k ) = Cx ( k ) + Du ( k )

Do some brute-force recursive calculations:


x ( k + 1) = Ax ( k ) + Bu ( k )
y ( k ) = Cx ( k ) + Du ( k )

x(1) = Ax(0) + Bu(0)
y(0) = Cx(0) + Du(0)

x(2) = Ax(1) + Bu(1) = A²x(0) + ABu(0) + Bu(1)
y(1) = Cx(1) + Du(1) = CAx(0) + CBu(0) + Du(1)

x(3) = A³x(0) + A²Bu(0) + ABu(1) + Bu(2)
y(2) = CA²x(0) + CABu(0) + CBu(1) + Du(2)

⋮   (make note of these patterns forming)
etc.
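
The recursion itself is trivial to simulate; a minimal sketch (assuming NumPy, with made-up system matrices):

import numpy as np

# Made-up 2-state system with a scalar input and output, purely for illustration.
A = np.array([[0.5, 1.0],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

x = np.array([[1.0],
              [-2.0]])                    # x(0)
for k in range(4):
    u = np.array([[1.0]])                 # u(k); a constant input for illustration
    y = C @ x + D @ u                     # y(k) = C x(k) + D u(k)
    print(k, x.ravel(), y.ravel())
    x = A @ x + B @ u                     # x(k+1) = A x(k) + B u(k)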
Problem #1: When is it possible to make x(k) = 0 by applying the sequence u(0), …, u(k−1), regardless of what x(0) is?

Consider the equation:

0 = x(k) = A^k x(0) + A^{k−1}Bu(0) + A^{k−2}Bu(1) + ⋯ + Bu(k−1)

(from the pattern on the previous page)

Re-arrange:

−A^k x(0) = [ B  AB  ⋯  A^{k−2}B  A^{k−1}B ] [ u(k−1) ]
                                             [    ⋮   ]
                                             [  u(1)  ]
                                             [  u(0)  ]

The left-hand side is n × 1, and the block matrix is defined as P ≜ [ B  AB  ⋯  A^{k−1}B ].
We are allowing x( 0) to be any n-dimensional
vector, so by our knowledge of linear equations,
we want to have r ( P ) ≥ n . (We will find out
later that the P matrix cannot have rank greater
than n). Systems with this property are called
controllable.
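
A sketch (assuming NumPy) that builds the standard n-block controllability matrix and checks its rank; the system matrices are made-up values, purely for illustration:

import numpy as np

def controllability_matrix(A, B):
    # P = [B  AB  A^2 B ... A^(n-1) B]; rank(P) = n is the controllability test.
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

A = np.array([[0.5, 1.0],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])

P = controllability_matrix(A, B)
print(P)
print(np.linalg.matrix_rank(P))           # 2 = n, so this (A, B) pair is controllable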

Problem #2: If at time k, we know the current and all the


previous, inputs and outputs, then when will it be possible
to figure out what x( 0 ) was?
Recall the recursion equations:

y(0) = Cx(0) + Du(0)
y(1) = CAx(0) + CBu(0) + Du(1)
y(2) = CA²x(0) + CABu(0) + CBu(1) + Du(2)
⋮
etc.
Re-arrange these as:

[ y(0) ]   [ C    ]          [ D          0    0  ⋯  0 ] [ u(0) ]
[ y(1) ]   [ CA   ]          [ CB         D    0  ⋯  0 ] [ u(1) ]
[ y(2) ] = [ CA²  ] x(0)  +  [ CAB        CB   D  ⋯  0 ] [ u(2) ]
[  ⋮   ]   [  ⋮   ]          [  ⋮                 ⋱    ] [  ⋮   ]
[ y(k) ]   [ CA^k ]          [ CA^{k−1}B   ⋯      D    ] [ u(k) ]

Move the input terms over to combine with the left-hand side (everything there is known), and re-write as:

       [ C    ]
       [ CA   ]
Ψk  =  [ CA²  ] x(0)  ≜  Q x(0)
       [  ⋮   ]
       [ CA^k ]

Similar to before, this Q-matrix is going to need to be full rank (rank n) in order for the linear equation above to have a unique solution. Then, knowing the left-hand side, which contains the past inputs and outputs, we can solve for the initial condition x(0), whatever it happened to be!

This system is observable; a concept we'll see again soon.
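
And the observability counterpart of the earlier sketch (assuming NumPy, with the same made-up values):

import numpy as np

def observability_matrix(A, C):
    # Q = [C; CA; CA^2; ...; CA^(n-1)]; rank(Q) = n is the observability test.
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])

A = np.array([[0.5, 1.0],
              [0.0, 0.8]])
C = np.array([[1.0, 0.0]])

Q = observability_matrix(A, C)
print(Q)
print(np.linalg.matrix_rank(Q))           # 2 = n, so this (A, C) pair is observable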
