Linear System Theory


ESE 502, Linear Systems Theory 1

Linear System Theory

ESE 502

1 Linear System Theory

1.1 Overview


• Vague Term
• Inputs and Outputs
• Behavior in time
• Aspects:
– Physical system (electronic, mechanical, economic,
biological, etc.)

– Mathematical Model (usually differential or difference equations)


– Analysis (simulation and analytical)

– Design (simulation, analytical, and experience)


Mathematical System Descriptions

1. Internal

2. External

Analysis: From internal to external description (usually unique)

Design: From external to internal description (usually not unique)


System Properties

Linearity: If B denotes the action of a “black box”, u(t)

and v(t) are inputs, and a is a constant, then

B(u + v) = B(u) + B(v) (1)

B(au) = aB(u) (2)

Time-Invariance: Continuous and Discrete:

Continuous: If $v(t)$ is the output for an input $u(t)$, i.e. $v(t) = B(u(t))$, then, for all $t_0$,

$$v(t + t_0) = B(u(t + t_0))$$

Discrete: If $v_n$ is the output for an input $u_n$, i.e., $v_n = B(u_n)$, then, for all $n_0$,

$$v_{n+n_0} = B(u_{n+n_0})$$

System Properties continued:

Causality: Future inputs cannot affect present and past outputs.

If u1 (t) and u2 (t) are two inputs, and v1 (t) and v2 (t)
are the corresponding outputs, then, for every t0 :

• If u1 (t) = u2 (t) for all t < t0 , then v1 (t) = v2 (t)

for all t < t0 .

• If linear: if u(t) = 0 for all t < t0 , then v(t) = 0 for

all t < t0 .

• If linear and time-invariant: if u(t) = 0 for t < 0,

then v(t) = 0 for t < 0.

“Lumpedness”: finite # of variables

SISO or MIMO: Single-Input, Single-Output, or Multi-Input, Multi-Output


No system is perfectly linear or time-invariant, but there are

enough systems which are approximately linear and
time-invariant, or which can be modelled as linear or
time-invariant, that these are very useful concepts.
State:

• Fundamental concept of systems

• A set of variables internal to the system, whose value at
any time, t0 , together with the inputs for time ≥ t0 ,
determines the outputs for time ≥ t0 .

• A set of initial conditions

• Usually written as a column vector
• Encapsulates total effect of all past inputs on the system
• Past inputs can affect future outputs only through the state
• Lumpedness ⇐⇒ State is a finite set of variables

Examples:

L-C circuit: figure 2.2, p.7.

• State: capacitor voltages and inductor currents (or C

charges and L fluxes)

• Finite-state (description at low frequencies)

• Causal
• Time-invariant
• Linear
Unit Delay: y(t) = u(t − 1) (continuous-time)
• Infinite-state
• Causal
• Time-invariant
• Linear

Examples (continued):

Unit Advance: y(t) = u(t + 1) (continuous-time)

• “Infinite-state”
• Not Causal
• Time-invariant
• Linear
Unit Delay: y(n) = u(n − 1) (discrete-time)
• Finite-state
• Causal
• Time-invariant
• Linear

Linear System Responses

Impulse response: limit of rectangular responses.


$$h_a(t, t_0) = \text{response to } r_a(t - t_0)$$

where

$$r_a(t) = \begin{cases} 1/a & \text{for } 0 < t < a \\ 0 & \text{otherwise} \end{cases}$$

Then define the impulse response by

$$h(t, t_0) = \lim_{a \to 0} h_a(t, t_0)$$


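Kernel formula for input-output description: If the input is $u(t)$, then

$$u(t) = \lim_{\Delta t \to 0} \sum_{n} \Delta t\, u(n\Delta t)\, r_{\Delta t}(t - n\Delta t) \approx \sum_{n} \Delta t\, u(n\Delta t)\, r_{\Delta t}(t - n\Delta t)$$

The output $y(t)$ is then (use linearity!)

$$y(t) \approx \sum_{n} u(n\Delta t)\, h_{\Delta t}(t, n\Delta t)\, \Delta t$$

and so

$$y(t) = \lim_{\Delta t \to 0} \sum_{n} u(n\Delta t)\, h_{\Delta t}(t, n\Delta t)\, \Delta t = \int_{-\infty}^{\infty} u(\tau)\, h(t, \tau)\, d\tau$$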

Special Cases:

Time-invariant: $h(t, \tau) = h(t - \tau)$: then the output is given by

$$y(t) = \int_{-\infty}^{\infty} u(\tau)\, h(t - \tau)\, d\tau = \int_{-\infty}^{\infty} u(t - \tau)\, h(\tau)\, d\tau$$

(convolution).

Causal: In terms of impulse response

General: h(t, τ ) = 0 for t < τ

Time-Invariant: h(t) = 0 for t < 0.
MIMO: With p inputs and q outputs, get a q × p impulse
response matrix.
Convolution:

The convolution of two functions $f(t)$ and $g(t)$ is defined by

$$f(t) * g(t) = \int_{-\infty}^{\infty} f(t - \tau)\, g(\tau)\, d\tau = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$

If $f(t) = 0$ for all $t < 0$,

$$f(t) * g(t) = \int_{0}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$

If also $g(t) = 0$ for all $t < 0$,

$$f(t) * g(t) = \int_{0}^{t} f(t - \tau)\, g(\tau)\, d\tau = \int_{0}^{t} f(\tau)\, g(t - \tau)\, d\tau$$

Laplace Transform:

For a function $f(t)$ with $f(t) = 0$ for all $t < 0$, the Laplace Transform of $f(t)$ is defined by:

$$\mathcal{L}\{f(t)\} = F(s) = \int_{0}^{\infty} e^{-st} f(t)\, dt$$

Linearity:

$$\mathcal{L}\{f(t) + g(t)\} = F(s) + G(s)$$

$$\mathcal{L}\{a f(t)\} = a F(s)$$

for all constants $a$.


Laplace Transform Properties continued:

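Shifting Theorem: For $t_0 \ge 0$

$$\mathcal{L}\{f(t - t_0)\} = e^{-st_0} F(s)$$

Proof:

$$\begin{aligned}
\mathcal{L}\{f(t - t_0)\} &= \int_{0}^{\infty} e^{-st} f(t - t_0)\, dt \\
&= \int_{-t_0}^{\infty} e^{-s(t' + t_0)} f(t')\, dt' \\
&= \int_{0}^{\infty} e^{-s(t' + t_0)} f(t')\, dt' \\
&= e^{-st_0} \int_{0}^{\infty} e^{-st'} f(t')\, dt' \\
&= e^{-st_0} \mathcal{L}\{f(t)\}
\end{aligned}$$

using the fact that $f(t) = 0$ for $t < 0$, and the substitution $t' = t - t_0$.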

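Laplace Transform Properties continued:

Convolution:

$$\mathcal{L}\{f(t) * g(t)\} = F(s) G(s)$$

Proof:

$$\begin{aligned}
\mathcal{L}\{f(t) * g(t)\} &= \int_{0}^{\infty} e^{-st} \big(f(t) * g(t)\big)\, dt \\
&= \int_{0}^{\infty} \int_{0}^{t} e^{-st} f(\tau)\, g(t - \tau)\, d\tau\, dt \\
&= \int_{0}^{\infty} \int_{\tau}^{\infty} e^{-st} f(\tau)\, g(t - \tau)\, dt\, d\tau \\
&= \int_{0}^{\infty} f(\tau) \int_{\tau}^{\infty} e^{-st} g(t - \tau)\, dt\, d\tau \\
&= \int_{0}^{\infty} f(\tau) \int_{0}^{\infty} e^{-s(t' + \tau)} g(t')\, dt'\, d\tau \\
&= \int_{0}^{\infty} f(\tau) e^{-s\tau} \int_{0}^{\infty} e^{-st'} g(t')\, dt'\, d\tau \\
&= F(s) G(s)
\end{aligned}$$

where the interchange of integrals uses the fact that the integration is over the set $\{(t, \tau) \mid t \ge 0,\ \tau \ge 0,\ \tau \le t\}$, and the inner substitution is $t' = t - \tau$.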

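Laplace Transform Properties continued:

Differentiation:

$$\mathcal{L}\{f'(t)\} = sF(s) - f(0)$$

Proof:

$$\begin{aligned}
\mathcal{L}\{f'(t)\} &= \int_{0}^{\infty} e^{-st} f'(t)\, dt \\
&= \int_{0}^{\infty} e^{-st}\, df(t) \\
&= \Big[ e^{-st} f(t) \Big]_{0}^{\infty} - \int_{0}^{\infty} f(t)\, d(e^{-st}) \\
&= -f(0) + s \int_{0}^{\infty} f(t) e^{-st}\, dt \\
&= sF(s) - f(0)
\end{aligned}$$

(integration by parts).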

Laplace Transform Properties continued:

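Integration:

$$\mathcal{L}\left\{ \int_{0^-}^{t} f(\tau)\, d\tau \right\} = \frac{F(s)}{s}$$

Exponential:

$$\mathcal{L}\{e^{at} U(t)\} = \frac{1}{s - a}$$

Impulse:

$$\mathcal{L}\{\delta(t)\} = 1$$

Unit Step:

$$\mathcal{L}\{U(t)\} = \frac{1}{s}$$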

More Laplace Transform Properties:

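Exponential Multiplication: For any $f(t)$:

$$\mathcal{L}\{e^{at} f(t)\} = \int_{0}^{\infty} f(t) e^{-(s - a)t}\, dt = F(s - a)$$

Trigonometric Functions:

$$\mathcal{L}\{\sin(\omega t)\} = \mathcal{L}\{(e^{j\omega t} - e^{-j\omega t})/(2j)\} = \left( \frac{1}{s - j\omega} - \frac{1}{s + j\omega} \right) \Big/ (2j) = \frac{\omega}{s^2 + \omega^2}$$

$$\mathcal{L}\{\cos(\omega t)\} = \frac{s}{s^2 + \omega^2}$$

Exponentials and Sinusoids: From the above:

$$\mathcal{L}\{e^{at} \cos(\omega t)\} = \frac{s - a}{(s - a)^2 + \omega^2}$$

$$\mathcal{L}\{e^{at} \sin(\omega t)\} = \frac{\omega}{(s - a)^2 + \omega^2}$$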

Miscellaneous and Terminology:

1. Examples 2.2, 2.3, 2.4 and 2.5.

2. Laplace transform of impulse response matrix h(t) is

the transfer function matrix H(s).

3. Lumped system =⇒ rational transfer function.

4. (a) H(s) proper ⇐⇒ H(∞) is finite.

(b) Strictly proper: H(∞) = 0
(c) Biproper: proper and H(∞) ≠ 0
5. (a) Pole of H(s): a complex number sp such that
H(sp ) = ∞
(b) Zero of H(s): a complex number sz such that
H(sz ) = 0
(c) H(s) can be factored into a product of first and
second-order factors with real coefficients, or of
first-order factors with complex coefficients.
Numerator factors are of the form (s − sz );
Denominator factors are of the form (s − sp )

2 State Variable (State Space) Equations


2.1 Continuous-Time

1. State variables form an n-dimensional column vector

$x(t) = (x_1(t), x_2(t), \dots, x_n(t))^T$
2. State equations are:

ẋ(t) = Ax(t) + Bu(t)

y(t) = Cx(t) + Du(t)

If there are p inputs and q outputs, then

A is n × n;
B is n × p;
C is q × n;
D is q × p.

Example: Nonlinear state equations:

Pendulum of mass m, length l, external horizontal force

u(t) on mass; θ as position variable.
Dynamical Equation:

mlθ̈ = −mg sin θ − u cos θ

State variables: x1 = θ, x2 = θ̇;

State equations (nonlinear, time-invariant):

$$\dot{x}_1 = x_2$$

$$\dot{x}_2 = -\frac{g}{l} \sin x_1 - \frac{1}{ml} (\cos x_1)\, u$$

(“input affine”)

Read examples 2.6 – 2.10.


Transfer Functions of State-Variable Systems

Take Laplace transform of state equations:

sX(s) − x(0) = AX(s) + BU(s)

Y(s) = CX(s) + DU(s)

Solve for X(s):

(sI − A)X(s) = BU(s) + x(0)

and so

$$X(s) = (sI - A)^{-1} B U(s) + (sI - A)^{-1} x(0)$$

First term: zero-state state response;

Second term: zero-input state response.

Use second equation:

$$Y(s) = \left[ C(sI - A)^{-1} B + D \right] U(s) + C(sI - A)^{-1} x(0)$$

First term: zero-state (output) response;

Second term: zero-input (output) response.


Transfer Function of State Variable Equations


For the transfer function matrix, set $x(0) = 0$; then

$$Y(s) = H(s) U(s)$$

where

$$H(s) = C(sI - A)^{-1} B + D$$
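The formula for H(s) can be checked numerically on a small case. The following is a minimal sketch for a two-state SISO system only, using the explicit 2×2 inverse; the function name and the example matrices are illustrative assumptions, not taken from the course text.

```python
def transfer_function_2x2(A, B, C, D, s):
    """Evaluate H(s) = C (sI - A)^{-1} B + D for a 2-state SISO system.

    A is a 2x2 nested list, B and C are length-2 lists, D is a scalar.
    s may be real or complex, but must not be an eigenvalue of A.
    """
    # Entries of M = sI - A
    m11, m12 = s - A[0][0], -A[0][1]
    m21, m22 = -A[1][0], s - A[1][1]
    det = m11 * m22 - m12 * m21
    # x = M^{-1} B, using the adjugate formula for a 2x2 inverse
    x1 = (m22 * B[0] - m12 * B[1]) / det
    x2 = (-m21 * B[0] + m11 * B[1]) / det
    return C[0] * x1 + C[1] * x2 + D


# Hypothetical example: A in companion form, so H(s) = 1 / (s^2 + 3s + 2)
A = [[0, 1], [-2, -3]]
B = [0, 1]
C = [1, 0]
D = 0
value_at_1 = transfer_function_2x2(A, B, C, D, 1.0)
```

For this example the poles of H(s) are at s = -1 and s = -2, which are the eigenvalues of A.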


• Op-amp implementation: need only integrators, adders,

and constant gains; all “easily” done with op-amps.

• Linearization of a nonlinear, time-invariant system:

– About a point: linear, time-invariant system

– About a trajectory: linear, time-varying system


Linearization: Example: Pendulum as previously:

State variables: x1 = θ, x2 = θ̇;

State equations (nonlinear, time-invariant):

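$$\dot{x}_1 = x_2$$

$$\dot{x}_2 = -\frac{g}{l} \sin x_1 - \frac{1}{ml} (\cos x_1)\, u$$

Linearize about the equilibrium point $x(t) = 0$, $u(t) = 0$:

$$\dot{x}_1 = x_2$$

$$\dot{x}_2 = -\frac{g}{l} x_1 - \frac{1}{ml} u$$

– linear, time-invariant.

Linearize about a known natural trajectory $x(t)$, $u(t) = 0$:

$$\delta\dot{x}_1 = \delta x_2$$

$$\delta\dot{x}_2 = -\frac{g}{l} \cos(x_1(t))\, \delta x_1 - \frac{1}{ml} \cos(x_1(t))\, \delta u$$

– linear, time-varying.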
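To get a feel for how good the linearization about the equilibrium is, one can compare the right-hand sides of the nonlinear and linearized state equations at a few operating points. This is a rough sketch; the parameter values G, L and M and the function names are assumptions chosen only for illustration.

```python
import math

# Assumed illustrative parameters (not from the slides)
G = 9.81   # gravitational acceleration
L = 1.0    # pendulum length l
M = 1.0    # pendulum mass m


def f_nonlinear(x1, x2, u):
    """Right-hand side of the nonlinear pendulum state equations."""
    return x2, -(G / L) * math.sin(x1) - math.cos(x1) * u / (M * L)


def f_linear(x1, x2, u):
    """Right-hand side of the linearization about x = 0, u = 0."""
    return x2, -(G / L) * x1 - u / (M * L)


def max_gap(theta, u):
    """Largest componentwise difference between the two models at (theta, 0, u)."""
    a = f_nonlinear(theta, 0.0, u)
    b = f_linear(theta, 0.0, u)
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


small_angle_gap = max_gap(0.01, 0.1)   # close to the equilibrium
large_angle_gap = max_gap(1.0, 0.0)    # about 57 degrees away
```

Near the equilibrium the two models agree closely; at an angle of one radian the difference is large, which is why the linear model is only trusted locally.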


1. RLC Networks, p.26; example 2.11

2. RLC procedure:

(a) Normal tree: branches in the order vsrc , C , R, L,

(b) State variables: vc in tree and iL in links

(c) Apply KVL to fundamental loops of state variable

links, and KCL to fundamental cutsets of state
variable branches.

Read example 2.13: tunnel diode =⇒ negative resistance


2.2 Discrete-time systems.

Basic element: unit delay

Discrete convolution:

$$(p_n) = (f_n) * (g_n)$$

where

$$p_n = \sum_{k=-\infty}^{\infty} f_k\, g_{n-k}$$

For “causal” sequences $(f_n)$ and $(g_n)$ (i.e., with $f_n = g_n = 0$ for all $n < 0$),

$$p_n = \sum_{k=0}^{n} f_k\, g_{n-k}$$
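The causal convolution sum can be written out directly for finite-length sequences. The sketch below assumes both sequences start at n = 0 and are zero beyond their listed length; the function name is hypothetical.

```python
def causal_convolve(f, g):
    """Convolve two causal, finite-length sequences given as lists.

    f[k] and g[k] are the samples at time k >= 0; samples outside the
    lists are taken to be zero. Returns p with p[n] = sum_k f[k] g[n-k].
    """
    if not f or not g:
        return []
    p = [0] * (len(f) + len(g) - 1)
    for n in range(len(p)):
        for k in range(n + 1):
            # Skip terms where either sample lies beyond the stored length
            if k < len(f) and n - k < len(g):
                p[n] += f[k] * g[n - k]
    return p


# (1 + 2 z^-1)(1 + 3 z^-1) = 1 + 5 z^-1 + 6 z^-2
product = causal_convolve([1, 2], [1, 3])
```

The example is the same computation as multiplying two polynomials in z^-1, which is the convolution property of the Z-transform in miniature.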

(One-Sided) Z-Transform: If $f_n = 0$ for $n < 0$,

$$\mathcal{Z}\{f_n\} = F(z) = \sum_{n=0}^{\infty} f_n z^{-n}$$

Z-Transform Properties.
1. Non-causal shift formula:

$$\mathcal{Z}\{x(n + 1)\} = zX(z) - zx(0)$$

2. Convolution:

Z{f ∗ g} = F (z)G(z)

3. State equations

x(n + 1) = Ax(n) + Bu(n)

y(n) = Cx(n) + Du(n)

Transfer function for state equations: take the Z-transform:

$$zX(z) - zx(0) = AX(z) + BU(z)$$

$$Y(z) = CX(z) + DU(z)$$

and so, with $x(0) = 0$,

$$Y(z) = H(z) U(z)$$

where

$$H(z) = C(zI - A)^{-1} B + D$$

3 Linear Algebra (Chapter 3):

3.1 Fundamentals:

Vector space: vectors and scalars (here always real or

complex) which satisfy usual properties:

1. Vectors can be added and subtracted;

2. Scalars can be added,subtracted, multiplied, and

divided (except by 0);

3. Vectors can be multiplied by scalars to give another

vector: distributive, etc.


Examples:

1. $\mathbb{R}^n$: column vectors of real numbers;

2. $\mathbb{C}^n$: column vectors of complex numbers;

3. $l_2$: sequences $(x_n)$ with $\sum_{n=-\infty}^{\infty} |x_n|^2$ finite;

4. $L_2$: functions $f(t)$ with $\int_{-\infty}^{\infty} |f(t)|^2\, dt$ finite;

5. Many others . . .


1. A set of vectors {q1 , q2 , . . . qm } is linearly independent

if the only scalars α1 , α2 , . . . αm which satisfy the equation

α1 q1 + α2 q2 + . . . αm qm = 0

are α1 = α2 = . . . αm = 0.
2. A set of vectors {q1 , q2 , . . . qm } spans a vector space
if every vector q in the space can be written in the form

q = α1 q1 + α2 q2 + . . . αm qm

for some scalars α1 , α2 , . . . αm .

3. A set of vectors {q1 , q2 , . . . qn } is a basis of a vector

space if it is linearly independent and spans the space.

4. If $\{q_1, q_2, \dots, q_n\}$ is a basis of a vector space, then every vector $q$ in the space can be written in the form

$$q = \alpha_1 q_1 + \alpha_2 q_2 + \dots + \alpha_n q_n$$

for a unique set of scalars $\{\alpha_1, \alpha_2, \dots, \alpha_n\}$.


Dimension and Notation:

Fundamental fact: every basis of a given vector space has

the same number of elements; this number is called the
dimension of the space.

Notation: if Q is the matrix formed from the column vectors

{q1 , . . . , qn }, then the equation

x = α1 q1 + . . . + αn qn

can be written as
x = Qa
a = [α1 , . . . , αn ]′

Basis example: standard basis for $\mathbb{R}^n$: $i_1, i_2, \dots, i_n$

$$i_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}; \quad i_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}; \quad \cdots \quad i_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$



Norms: a norm $\|x\|$ satisfies:

1. $\|x\| \ge 0$; $\|x\| = 0 \implies x = 0$
2. $\|\alpha x\| = |\alpha|\, \|x\|$
3. $\|x_1 + x_2\| \le \|x_1\| + \|x_2\|$
Norm Examples: 1, 2, ∞, p norms
1. $\|q\|_1 = |q_1| + |q_2| + \dots + |q_n|$
2. $\|q\|_\infty = \max\{|q_1|, |q_2|, \dots, |q_n|\}$
3. $\|q\|_2 = \sqrt{|q_1|^2 + |q_2|^2 + \dots + |q_n|^2}$
4. $\|q\|_p = \left( |q_1|^p + |q_2|^p + \dots + |q_n|^p \right)^{1/p}$ for $p \ge 1$

||q||2 is the usual Euclidean norm;

The subscript is usually omitted when only one norm is
being used.

Inner Product:

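A scalar-valued product of two vectors, $\langle x, y \rangle$, with the properties:

1. $\langle x, x \rangle > 0$ unless $x = 0$

2. $\langle x, y \rangle = \langle y, x \rangle$ (real); $\langle x, y \rangle = \langle y, x \rangle^*$ (complex)
3. $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$
4. $\langle \alpha x, z \rangle = \alpha \langle x, z \rangle$

It can be proved that $\|x\| = \sqrt{\langle x, x \rangle}$ defines a norm.

Inner Product Examples:

1. $\mathbb{R}^n$: $\langle x, y \rangle = y^T x$
2. $\mathbb{C}^n$: $\langle x, y \rangle = y^* x$
3. $L_2(-\pi : \pi)$: $\langle f(t), g(t) \rangle = \int_{-\pi}^{\pi} f(t)\, g^*(t)\, dt$
4. $l_2$: $\langle (x_n), (y_n) \rangle = \sum_{n=-\infty}^{\infty} x_n y_n^*$
5. $L_2(-\infty : \infty)$: $\langle f(t), g(t) \rangle = \int_{-\infty}^{\infty} f(t)\, g^*(t)\, dt$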

Norm and Inner Product Terminology:

1. Normalized: x normalized iff ||x|| = 1

2. Orthogonal: x and y orthogonal iff < x, y >= 0
3. Orthonormal: $\{x_1, x_2, \dots, x_m\}$ orthonormal iff $\|x_i\| = 1$ for all $i$, and $\langle x_i, x_j \rangle = 0$ for all $i \ne j$.
4. Projection of one vector on another: the projection of $x$ on $y$ is the vector $\dfrac{\langle x, y \rangle}{\|y\|^2}\, y$

5. If A = [a1 , . . . , am ] with m ≤ n, and the ai are

orthonormal, then A′ A = Im , but not necessarily
AA′ = In

Orthonormalization (Gram-Schmidt):

Given a set of linearly independent vectors $\{e_1, e_2, \dots, e_m\}$, a set of orthonormal vectors with the same span can be found as

$$u_1 = e_1; \quad q_1 = u_1 / \|u_1\|$$

$$u_2 = e_2 - (q_1' e_2)\, q_1; \quad q_2 = u_2 / \|u_2\|$$

$$\vdots$$

$$u_m = e_m - \sum_{k=1}^{m-1} (q_k' e_m)\, q_k; \quad q_m = u_m / \|u_m\|$$

— not necessarily optimal numerically.
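The recursion above translates almost line for line into code. This is a minimal sketch of the classical Gram-Schmidt procedure for real vectors stored as Python lists; the tolerance and the choice to skip (rather than reject) linearly dependent inputs are assumptions of this example.

```python
import math


def dot(a, b):
    """Real inner product <a, b> = b'a."""
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt(vectors, tol=1e-12):
    """Classical Gram-Schmidt orthonormalization of real vectors.

    Vectors that are (numerically) dependent on earlier ones are skipped,
    so the returned list can be shorter than the input.
    """
    basis = []
    for e in vectors:
        u = list(e)
        for q in basis:
            coeff = dot(q, e)   # q_k' e_m, computed from the original e_m
            u = [ui - coeff * qi for ui, qi in zip(u, q)]
        norm = math.sqrt(dot(u, u))
        if norm > tol:
            basis.append([ui / norm for ui in u])
    return basis


q = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
```

Computing each coefficient from the partially reduced vector u instead of the original e gives the modified Gram-Schmidt variant, which usually behaves better in floating point.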

Cauchy-Schwarz Inequality:

For any vectors x and y in an inner-product space:

| < x, y > | ≤ ||x|| ||y||


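Proof: For any scalar $\lambda$,

$$0 \le \langle x - \lambda y, x - \lambda y \rangle = \langle x, x \rangle - \lambda \langle y, x \rangle - \lambda^* \langle x, y \rangle + |\lambda|^2 \langle y, y \rangle$$

Now pick $\lambda = \langle x, y \rangle / \|y\|^2$ (the case $y = 0$ is trivial): then

$$0 \le \|x\|^2 - 2\, \frac{|\langle x, y \rangle|^2}{\|y\|^2} + \frac{|\langle x, y \rangle|^2}{\|y\|^4}\, \|y\|^2 = \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2}$$

and so

$$|\langle x, y \rangle|^2 \le \|x\|^2\, \|y\|^2$$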

3.2 Linear Equations:

Assume an equation
Ax = y
where A is m × n, x is n × 1, and y is m × 1.


1. Range(A)=all possible linear combinations of columns

of A

2. ρ(A) = rank(A) = dim(range(A)); note that rank is difficult to determine numerically, since an arbitrarily small perturbation of A can change it.

3. x is a null vector of A: Ax = 0;
4. Nullspace(A)=set of all null vectors of A

5. ν(A) = nullity(A) = dim(nullspace(A))

6. Fundamental result: ρ(A) + ν(A) = # columns of A.


Theorem 3.1:

1. There exists a solution of Ax = y if, and only if, y is in the range of A.

2. If A is an m × n matrix, there exists a solution of
Ax = y for every y if, and only if, ρ(A) = m.

Theorem 3.2:

If A is an m × n matrix, and if ν(A) = 0 (i.e., ρ(A) = n),

any solution is unique. Otherwise, all solutions are given by:

x = xp + xn

where xn is any vector in the nullspace, and xp is any one solution of Ax = y.


Determinant and Inverse: For a square matrix $A = (a_{ij})$:

1. $\det(A) = \sum_{j} a_{ij} c_{ij}$ (expansion along any row $i$), where $c_{ij}$ is the cofactor of $a_{ij}$
2. $A^{-1} = \mathrm{Adj}(A) / \det(A)$, where $\mathrm{Adj}(A) = (c_{ij})'$
3. Determinant properties:

(a) $\det(AB) = \det(A) \det(B)$;

(b) $\det(A) \ne 0$ if, and only if, $A^{-1}$ exists (i.e., $A$ is nonsingular).

3.3 Change of Basis:

If $A = (a_{ij})$ is an $n \times n$ matrix, and $x$ is a vector, with

$$x = x_1 i_1 + \dots + x_n i_n$$

where $\{i_1, \dots, i_n\}$ is the standard basis, then

$$Ax = y_1 i_1 + \dots + y_n i_n$$

where $y_i = a_{i1} x_1 + a_{i2} x_2 + \dots + a_{in} x_n$, or

$$y_i = \sum_{j=1}^{n} a_{ij} x_j$$

Similarly, if $\{q_1, q_2, \dots, q_n\}$ is any other basis, and $x$ is expressed as

$$x = \bar{x}_1 q_1 + \dots + \bar{x}_n q_n$$

and $y$ is given by

$$y = Ax = \bar{y}_1 q_1 + \dots + \bar{y}_n q_n$$

then

$$\bar{y}_i = \sum_{j=1}^{n} \bar{a}_{ij} \bar{x}_j$$

The matrix $\bar{A} = (\bar{a}_{ij})$ is the representation of $A$ with respect to the basis $\{q_1, q_2, \dots, q_n\}$.

Change of Basis (Continued):

As usual, let $Q = [q_1, \dots, q_n]$, where $Q$ is nonsingular, since the $\{q_i\}$ form a basis. Then $x = Q\bar{x}$ and $y = Q\bar{y}$.

Substitute in the equation $Ax = y$ to get:

$$AQ\bar{x} = Q\bar{y}$$

or

$$Q^{-1} A Q \bar{x} = \bar{y}$$

and so

$$Q^{-1} A Q = \bar{A}$$

This can also be written as

$$A[q_1, \dots, q_n] = [q_1, \dots, q_n] \bar{A}$$

The last equation implies that the $i$-th column of $\bar{A}$ is the representation of $Aq_i$ with respect to the basis $\{q_1, \dots, q_n\}$; this is often the easiest way to find $\bar{A}$.
Example: If

$$A = \begin{pmatrix} 3 & 2 & -1 \\ -2 & 1 & 0 \\ 4 & 3 & 1 \end{pmatrix}, \quad b = [0, 0, 1]'$$

then the vectors

$$\{b, Ab, A^2 b\} = \{q_1, q_2, q_3\}, \quad [q_1, q_2, q_3] = \begin{pmatrix} 0 & -1 & -4 \\ 0 & 0 & 2 \\ 1 & 1 & -3 \end{pmatrix}$$

form a basis.

To find $\bar{A}$ w.r.t. this basis, use the representation of $Aq_j$ in terms of the basis $\{q_1, q_2, q_3\}$.

Example (continued):

$$Aq_1 = Ab = q_2$$

$$Aq_2 = A^2 b = q_3$$

Also, the characteristic equation of $A$ (to be done) is:

$$A^3 - 5A^2 + 15A - 17I = 0$$

and so

$$Aq_3 = A^3 b = 17b - 15Ab + 5A^2 b = 17q_1 - 15q_2 + 5q_3$$

Therefore

$$\bar{A} = \begin{pmatrix} 0 & 0 & 17 \\ 1 & 0 & -15 \\ 0 & 1 & 5 \end{pmatrix}$$

(companion form).
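The example can be verified without inverting anything, by checking the relation A[q1, q2, q3] = [q1, q2, q3]Ā with integer arithmetic. The sketch below uses the matrices from the example; the helper name matmul and the variable names are illustrative.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


A = [[3, 2, -1], [-2, 1, 0], [4, 3, 1]]

# Columns of Q are q1 = b, q2 = A b, q3 = A^2 b
Q = [[0, -1, -4], [0, 0, 2], [1, 1, -3]]

# Claimed representation of A with respect to {q1, q2, q3}
A_bar = [[0, 0, 17], [1, 0, -15], [0, 1, 5]]

left = matmul(A, Q)        # columns are A q1, A q2, A q3
right = matmul(Q, A_bar)   # columns are combinations of q1, q2, q3
```

Both products have columns q2, q3 and 17 q1 - 15 q2 + 5 q3, so the companion form is consistent with the characteristic equation quoted above.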

3.4 Diagonal and Jordan Form:


Eigenvalue: $\lambda$ is an eigenvalue of $A$ if there is a vector $x \ne 0$ such that

$$Ax = \lambda x$$

or

$$(A - \lambda I)x = 0$$

The vector $x$ is called an eigenvector for the eigenvalue $\lambda$.

Characteristic Polynomial of $A$ is

$$\Delta(\lambda) = \det(\lambda I - A)$$

This is a monic polynomial of degree $n$ in $\lambda$, with $n$ roots, counting multiplicity.

Roots of $\Delta(\lambda)$: $\lambda_0$ is an eigenvalue of $A$ $\iff$ $\Delta(\lambda_0) = 0$

Then, if $\lambda_1, \lambda_2, \dots, \lambda_k$ are the eigenvalues, with multiplicities $n_1, n_2, \dots, n_k$, the characteristic polynomial is given by:

$$\Delta(\lambda) = (\lambda - \lambda_1)^{n_1} (\lambda - \lambda_2)^{n_2} \cdots (\lambda - \lambda_k)^{n_k}$$


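Companion form:

$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -\alpha_4 & -\alpha_3 & -\alpha_2 & -\alpha_1 \end{pmatrix}$$

with characteristic polynomial

$$\Delta(\lambda) = \lambda^4 + \alpha_1 \lambda^3 + \alpha_2 \lambda^2 + \alpha_3 \lambda + \alpha_4$$

Jordan Block:

$$\begin{pmatrix} \lambda_0 & 1 & 0 & 0 \\ 0 & \lambda_0 & 1 & 0 \\ 0 & 0 & \lambda_0 & 1 \\ 0 & 0 & 0 & \lambda_0 \end{pmatrix}$$

with characteristic polynomial

$$\Delta(\lambda) = (\lambda - \lambda_0)^4$$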
Diagonalization:

Simplest case: ∆(λ) has distinct roots.

Then the eigenvalues are all distinct; it follows that the
eigenvectors are linearly independent.

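To see this, assume we have eigenvalues $\lambda_1, \dots, \lambda_n$ with corresponding eigenvectors $q_1, \dots, q_n$, and suppose $\alpha_1 q_1 + \dots + \alpha_n q_n = 0$.
Then pick any $k$, and apply the operator

$$\prod_{j=1,\, j \ne k}^{n} (A - \lambda_j I)$$

to the vector $\alpha_1 q_1 + \dots + \alpha_n q_n$ to obtain

$$\prod_{j=1,\, j \ne k}^{n} (\lambda_k - \lambda_j)\, \alpha_k q_k = 0$$

Since the eigenvalues are distinct and $q_k \ne 0$, this forces $\alpha_k = 0$. Since $k$ is arbitrary, linear independence follows.

The representation of $A$ for the basis $q_1, \dots, q_n$ is then diagonal: that is, $D = Q^{-1} A Q$.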
Because the roots may be complex, must allow complex

Diagonalization: Non-distinct eigenvalues:

Let λ0 be an eigenvalue of multiplicity k0 .

Assume that the nullity of $(A - \lambda_0 I)$ is $p_0 \le k_0$.

If $p_0 = k_0$, then we can pick any $k_0$ linearly independent vectors in the nullspace and get diagonal form again for this eigenvalue.

If $p_0 < k_0$, we need the concept of a generalized eigenvector:

A generalized eigenvector $q_k$ of grade $k$ satisfies

$$(A - \lambda_0 I)^k q_k = 0$$

$$(A - \lambda_0 I)^{k-1} q_k \ne 0$$

Generalized Eigenvectors (continued):

Given a generalized eigenvector q of grade k , can get a

chain of generalized eigenvectors

$$\begin{aligned}
q_k &= q \\
q_{k-1} &= (A - \lambda_0 I) q_k = (A - \lambda_0 I) q \\
q_{k-2} &= (A - \lambda_0 I) q_{k-1} = (A - \lambda_0 I)^2 q \\
&\ \ \vdots \\
q_1 &= (A - \lambda_0 I) q_2 = (A - \lambda_0 I)^{k-1} q
\end{aligned}$$

and these are linearly independent (multiply by $(A - \lambda_0 I)^j$, for $k - 1 \ge j \ge 1$).
Note that $q_1$ is an ordinary eigenvector ($Aq_1 = \lambda_0 q_1$), and that these equations can be solved by first finding a generalized eigenvector, and evaluating from the top, or by first finding an ordinary eigenvector, and solving from the bottom.

For each $j > 1$,

$$Aq_j = \lambda_0 q_j + q_{j-1}$$

Jordan Canonical Form:

With respect to these vectors, the block therefore has the form

$$J_k = \begin{pmatrix} \lambda_0 & 1 & 0 & \cdots & 0 \\ 0 & \lambda_0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_0 & 1 \\ 0 & 0 & \cdots & 0 & \lambda_0 \end{pmatrix}$$

Jordan canonical form: block diagonal matrix with these blocks.

Note: For any $k \times k$ Jordan block with zero eigenvalue, $J_k^k = 0$ (nilpotent), and so

$$(J_k - \lambda_0 I_k)^k = 0$$

for any Jordan block with eigenvalue $\lambda_0$.

Example: Problem 3.13(4), p. 81.


Functions of a square matrix A:

Power of A: $A^n = \underbrace{A A \cdots A}_{n \text{ times}}$

Polynomial in A: If

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$$

then $p(A)$ is defined by

$$p(A) = a_n A^n + a_{n-1} A^{n-1} + \dots + a_1 A + a_0 I$$

Similarity: $p(QAQ^{-1}) = Q\, p(A)\, Q^{-1}$


Functions of a square matrix A (continued):

Block Diagonal: If $A$ is block diagonal:

$$A = \begin{pmatrix} A_1 & 0 & 0 & \cdots & 0 \\ 0 & A_2 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & \cdots & A_{r-1} & 0 \\ 0 & 0 & \cdots & 0 & A_r \end{pmatrix}$$

then

$$p(A) = \begin{pmatrix} p(A_1) & 0 & 0 & \cdots & 0 \\ 0 & p(A_2) & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & \cdots & p(A_{r-1}) & 0 \\ 0 & 0 & \cdots & 0 & p(A_r) \end{pmatrix}$$

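$A^k$ for Jordan Block:

For the $(k+1) \times (k+1)$ Jordan block

$$J_{k+1} = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$$

the $r$-th power is

$$J_{k+1}^r = \begin{pmatrix} \lambda^r & r\lambda^{r-1} & \frac{r(r-1)}{2!} \lambda^{r-2} & \cdots & \frac{1}{k!} \frac{d^k(\lambda^r)}{d\lambda^k} \\ 0 & \lambda^r & r\lambda^{r-1} & \cdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda^r & r\lambda^{r-1} \\ 0 & 0 & \cdots & 0 & \lambda^r \end{pmatrix}$$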

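$A^k$ for Jordan Block (continued):

Therefore, $p(J_{k+1})$ for any polynomial $p$ of a Jordan block is given by:

$$p(J_{k+1}) = \begin{pmatrix} p(\lambda) & p'(\lambda) & \frac{p''(\lambda)}{2!} & \cdots & \frac{p^{(k)}(\lambda)}{k!} \\ 0 & p(\lambda) & p'(\lambda) & \cdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & p(\lambda) & p'(\lambda) \\ 0 & 0 & \cdots & 0 & p(\lambda) \end{pmatrix}$$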

— example 3.10, p.66
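The closed form for powers of a Jordan block is easy to confirm by brute force. In the sketch below, the entry m places above the diagonal is computed as C(r, m) λ^(r-m), which equals (1/m!) d^m(λ^r)/dλ^m; the function names are assumptions made for this illustration.

```python
from math import comb


def jordan_block(lam, size):
    """size x size Jordan block with eigenvalue lam."""
    return [
        [lam if i == j else 1 if j == i + 1 else 0 for j in range(size)]
        for i in range(size)
    ]


def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def power_by_multiplication(J, r):
    """J^r computed by r successive multiplications."""
    n = len(J)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(r):
        result = matmul(result, J)
    return result


def power_by_formula(lam, size, r):
    """J^r from the derivative formula: entry (i, j) is C(r, j-i) lam^(r-(j-i))."""
    def entry(i, j):
        m = j - i
        if m < 0 or m > r:
            return 0
        return comb(r, m) * lam ** (r - m)
    return [[entry(i, j) for j in range(size)] for i in range(size)]


direct = power_by_multiplication(jordan_block(2, 3), 5)
closed_form = power_by_formula(2, 3, 5)
```

With λ = 2, a 3×3 block and r = 5, the first row of both results is λ^5, 5λ^4 and 10λ^3.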


Minimal Polynomial:

For any eigenvalue $\lambda_i$, the index $m_i$ of $\lambda_i$ is the largest order of all Jordan blocks with eigenvalue $\lambda_i$.

The multiplicity $n_i$ of $\lambda_i$ is the highest power of $(\lambda - \lambda_i)$ in the characteristic polynomial

$$\Delta(\lambda) = \det(\lambda I - A)$$

Therefore $m_i \le n_i$.
Define the minimal polynomial of $A$ to be the product of the terms $(\lambda - \lambda_j)$, each raised to the power of its index, i.e.

$$\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_k)^{m_k}$$

Apply this polynomial to each block of the Jordan Canonical Form, and the entire matrix becomes zero:

$$\psi(A) = 0$$

The Cayley-Hamilton Theorem follows immediately, since $m_i \le n_i$ means that $\psi(\lambda)$ divides $\Delta(\lambda)$:

$$\Delta(A) = 0$$

Consequence: for any polynomial $f(x)$, $f(A)$ can be expressed as a polynomial of degree at most $n - 1$ in $A$.

Matrix Functions continued:

How is this polynomial calculated?

In principle:

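With $\Delta(\lambda) = \lambda^n + \alpha_{n-1} \lambda^{n-1} + \dots + \alpha_1 \lambda + \alpha_0$:

$$A^n = -\alpha_0 I - \alpha_1 A - \alpha_2 A^2 - \dots - \alpha_{n-1} A^{n-1} = p_1(A)$$

$$\begin{aligned}
A^{n+1} &= -\alpha_0 A - \alpha_1 A^2 - \alpha_2 A^3 - \dots - \alpha_{n-1} A^n \\
&= -\alpha_0 A - \dots - \alpha_{n-2} A^{n-1} - \alpha_{n-1}\, p_1(A) \\
&= \dots
\end{aligned}$$

More realistic: division with remainder gives:

$$f(\lambda) = q(\lambda) \Delta(\lambda) + h(\lambda)$$

where $h(\lambda)$ is the remainder, with degree $< n$.

Therefore, for any eigenvalue $\lambda_i$

$$f(\lambda_i) = h(\lambda_i)$$

More generally, if $n_i$ is the multiplicity of $\lambda_i$

$$f^{(l)}(\lambda_i) = h^{(l)}(\lambda_i) \quad \text{for } 1 \le l \le n_i - 1$$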

Matrix Functions continued:

If these equations hold for all eigenvalues, we say “f = h on

the spectrum of A”, and, by the Cayley-Hamilton theorem,

f (A) = h(A)

Note: Also works with n̄ (the degree of the minimal

polynomial) in place of n, but n̄ is not normally known.

It is often more convenient to use these conditions directly: assume a polynomial of degree $n - 1$ with unknown coefficients

$$h(\lambda) = \beta_0 + \beta_1 \lambda + \beta_2 \lambda^2 + \dots + \beta_{n-1} \lambda^{n-1}$$

and use the $n$ conditions above to solve for the $\beta_l$.
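For a 2×2 matrix with two distinct eigenvalues, the conditions reduce to two linear equations for β0 and β1, which can be solved by hand. The sketch below assumes the eigenvalues are already known and distinct; the function name and the example matrix are illustrative only.

```python
import math


def matrix_function_2x2(A, lam1, lam2, f):
    """Compute f(A) = beta0 I + beta1 A for a 2x2 matrix A.

    lam1 and lam2 must be the two distinct eigenvalues of A. The
    coefficients come from h(lam1) = f(lam1) and h(lam2) = f(lam2)
    with h(lam) = beta0 + beta1 * lam.
    """
    beta1 = (f(lam1) - f(lam2)) / (lam1 - lam2)
    beta0 = f(lam1) - beta1 * lam1
    return [
        [beta0 + beta1 * A[0][0], beta1 * A[0][1]],
        [beta1 * A[1][0], beta0 + beta1 * A[1][1]],
    ]


# Hypothetical example: eigenvalues of A are -1 and -2
A = [[0, 1], [-2, -3]]

# f(lambda) = lambda^3 gives A^3, which can be checked by multiplication
A_cubed = matrix_function_2x2(A, -1, -2, lambda lam: lam ** 3)

# f(lambda) = exp(lambda t) gives the matrix exponential e^{At} at t = 0.5
exp_At = matrix_function_2x2(A, -1, -2, lambda lam: math.exp(0.5 * lam))
```

The same function handles polynomial and transcendental f, which is the point of defining f(A) through values on the spectrum. Repeated eigenvalues would need the derivative conditions as well and are not covered by this sketch.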

Example 3.10: If A is a Jordan block with eigenvalue λ0 , it

is more convenient to assume the form

$$h(\lambda) = \beta_0 + \beta_1 (\lambda - \lambda_0) + \dots + \beta_{n-1} (\lambda - \lambda_0)^{n-1}$$

and the formula for f (Jk ) follows.

Note: Formula for f (Jk ) shows that derivatives are


Transcendental Matrix Functions:

Can define transcendental functions of A by means of

(infinite) power series.

Simpler: define a transcendental function $f(A)$ of $A$ to be a polynomial $h(A)$ of degree at most $n - 1$ in $A$ with $f = h$ on the spectrum of $A$.

Most important transcendental function: $e^{At}$.

Example: Problem 3.22 (3.13(4), p. 81).

Properties of matrix exponentials:

1. Differentiation:

$$\frac{d}{dt} e^{At} = A e^{At} = e^{At} A$$

2. Products: $e^{(A+B)t} \ne e^{At} e^{Bt}$ unless $AB = BA$

3. Laplace transform: $\mathcal{L}\{e^{At}\} = (sI - A)^{-1}$

Lyapunov equation:

If A is n × n, B is m × m, and M and C are n × m,

then the equation

AM + M B = C
with A, B , and C known, and M unknown, is called a
Lyapunov equation: nm equations in nm unknowns.
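Eigenvalues: $\eta$ is an eigenvalue of the map $M \mapsto AM + MB$ iff there is an $M \ne 0$ with

$$AM + MB = \eta M$$

The eigenvalues are given by

$$\eta_k = \lambda_i + \mu_j$$

This gives $nm$ eigenvalues, for $1 \le i \le n$ and $1 \le j \le m$, where $\lambda_i$ is a (right) eigenvalue of $A$:

$$Ax = \lambda_i x$$

and $\mu_j$ is a left eigenvalue of $B$ (with $x$ a row vector):

$$xB = \mu_j x$$

E.g., let $u$ be a right eigenvector of $A$, $v'$ a left eigenvector of $B$, and $M = uv'$; then $AM + MB = \lambda_i uv' + \mu_j uv' = (\lambda_i + \mu_j) M$.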
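A small numerical check of the eigenvalue claim: build M = uv′ from a right eigenvector of A and a left eigenvector of B, and confirm that AM + MB is a multiple of M. The matrices below are made-up examples with integer eigenvectors, and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


A = [[1, 1], [0, 2]]
u = [[1], [1]]             # right eigenvector of A: A u = 2 u
lam = 2

B = [[3, 0], [1, 4]]
v_row = [[1, 1]]           # left eigenvector of B: v' B = 4 v'
mu = 4

M = matmul(u, v_row)       # outer product u v'
lhs = matadd(matmul(A, M), matmul(M, B))
rhs = [[(lam + mu) * entry for entry in row] for row in M]
```

Here lhs and rhs agree, so η = λ + μ = 6 is an eigenvalue of the map M ↦ AM + MB for this pair of matrices.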

Miscellaneous Formulae (sec. 3.8)

1. ρ(AB) ≤ min(ρ(A), ρ(B))

2. if C and D are invertible, ρ(AC) = ρ(DA) = ρ(A)
3. If A is m × n and B is n × m, then

det(Im + AB) = det(In + BA)

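For the last property, define

$$N = \begin{pmatrix} I_m & A \\ 0 & I_n \end{pmatrix}, \quad Q = \begin{pmatrix} I_m & 0 \\ -B & I_n \end{pmatrix}, \quad P = \begin{pmatrix} I_m & -A \\ B & I_n \end{pmatrix}$$

Since $\det(N) = \det(Q) = 1$,

$$\det(P) = \det(NP) = \det \begin{pmatrix} I_m + AB & 0 \\ B & I_n \end{pmatrix} = \det(I_m + AB)$$

$$\det(P) = \det(QP) = \det \begin{pmatrix} I_m & -A \\ 0 & I_n + BA \end{pmatrix} = \det(I_n + BA)$$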

3.5 Quadratic Forms (Sec.3.9):

A Quadratic Form is a product of the form x′ M x.

Since x′ Sx = 0 for any skew-symmetric (S ′ = −S )

matrix, only the symmetric part of M is significant, so
assume M is symmetric.

Since eigenvalues can be complex, initially allow x to be

complex, and look at x⋆ M x.

x⋆ M x is real: (x⋆ M x)⋆ = x⋆ M x.

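Theorem: The eigenvalues of a symmetric matrix $M$ are real.

Proof: Let $\lambda$ be a (possibly complex) eigenvalue of $M$, with eigenvector $x \ne 0$. Then

$$Mx = \lambda x \implies x^\star M x = \lambda x^\star x$$

and so $\lambda$ is real, since $x^\star M x$ is real and $x^\star x$ is real and positive.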

So all eigenvalues of a symmetric matrix are real, and so we

need consider only real eigenvalues and real eigenvectors.

Quadratic Forms (continued):

Theorem: If M is symmetric, then its range and nullspace

are orthogonal.

Proof: Suppose y = M z and M x = 0. Then

< x, y > = x′ M z
= z′ M ′ x
= z′ M x
= 0

Theorem: If M is symmetric, then M is diagonalizable.

Proof: Suppose there is a generalized eigenvector. Then there is a vector $x$ and a real eigenvalue $\lambda$ such that $(M - \lambda I)^2 x = 0$, but $y = (M - \lambda I)x \ne 0$.
So $y \ne 0$ is both in the range and nullspace of $N = (M - \lambda I)$, a contradiction, since $N$ is symmetric and so its range and nullspace are orthogonal.
So there is a $Q$ such that $M = QDQ^{-1}$.

Quadratic Forms (continued):

Theorem:For a symmetric matrix, eigenvectors of different

eigenvalues are orthogonal.

Proof: If $Mx_1 = \lambda_1 x_1$ and $Mx_2 = \lambda_2 x_2$ with $\lambda_1 \ne \lambda_2$, then

$$x_1' M x_2 = x_1' \lambda_2 x_2 = \lambda_2 x_1' x_2$$

but also

$$x_1' M x_2 = x_2' M' x_1 = x_2' M x_1 = \lambda_1 x_2' x_1 = \lambda_1 x_1' x_2$$

Therefore $(\lambda_1 - \lambda_2) x_1' x_2 = 0$, and since $\lambda_1 - \lambda_2 \ne 0$, it follows that $x_1' x_2 = 0$.

Consequence: A symmetric matrix $M$ has an orthonormal basis of eigenvectors, and so the diagonalizing matrix $Q$ such that $M = QDQ^{-1}$ can be taken to have orthonormal columns.

Definition: A matrix $Q$ is called orthogonal if the columns of $Q$ are orthonormal, or equivalently $QQ' = Q'Q = I$, or $Q^{-1} = Q'$.
Result: if $M$ is symmetric, $M = QDQ'$ with $D$ diagonal, and $Q$ orthogonal.

Quadratic Forms (continued):

Positive Definite: $x' M x > 0$ unless $x = 0$, or equivalently all eigenvalues of $M$ are $> 0$.

Positive Semidefinite: $x' M x \ge 0$ for all $x$, or equivalently all eigenvalues of $M$ are $\ge 0$.

Singular Values: If $H$ is an $m \times n$ matrix, the singular values of $H$ are defined to be the square roots of the eigenvalues of $M = H'H$.
Since $x' H' H x = \|Hx\|^2 \ge 0$, the eigenvalues of $M$ are nonnegative, and so the singular values are all real and nonnegative.

Singular Value Decomposition: $H$ can be decomposed into the form

$$H = RSQ'$$

where $R'R = RR' = I_m$, $Q'Q = QQ' = I_n$, and $S$ is $m \times n$ with the singular values of $H$ on the diagonal.
