Linear System and Background
About
The first sections of this unit are about some key properties of convolutions, integral
transforms and related concepts. Conditions for existence and finiteness are not the
focus here. We use the term continuous time and discrete time functions/signals to
mean that their domains are subsets of R or Z respectively.
The remaining sections (Section 8 and onwards) are about linear time invariant (LTI)
systems with a single input and a single output (SISO). The basics of some of the aspects
appearing in an undergraduate Signals and Systems course are covered. The treatment
of indefinite integrals, generalized functions and other limiting objects is not rigorous,
yet it suffices for the understanding of this material needed in engineering applications of
control.
Warning: Don't confuse a continuous time signal (in these notes) with a continuous function.
Signals
The term signal is essentially synonymous with a function, yet a possible difference is
that a signal can be described by various different representations, each of which is a
different function.
Signals may be discrete time or continuous time. Although some signals are digitized, their values are typically taken as real (or complex). Signals may be either scalar
or vector.
When talking about SISO linear systems, a signal \(u(t)\) may be viewed as a function \(u : \mathbb{R} \to \mathbb{R}\)
if the setting is that of continuous time, or \(u : \mathbb{Z} \to \mathbb{R}\) in the discrete time setting.
It is typical and often convenient to consider a signal through an integral transform
(e.g. the Laplace transform) when the transform exists.
Example 1 Consider the signal,
\[
u(t) = \begin{cases} 0 & t < 0, \\ e^{-t} & 0 \le t. \end{cases}
\]
Its Laplace transform is,
\[
\hat{u}(s) = \int_0^\infty e^{-st} e^{-t}\, dt = \frac{1}{s+1},
\]
for \(\mathrm{Re}(s) > -1\).
In this case, both \(u(t)\) and \(\hat{u}(s)\) represent the same signal. We often say that \(u(t)\)
is the time-domain representation of the signal whereas \(\hat{u}(s)\) is the frequency-domain
representation.
It is common to do operations on signals. Here are a few very common examples:
\(u(t) = \alpha_1 u_1(t) + \alpha_2 u_2(t)\): Add, subtract, scale or more generally take linear combinations.
\(u(t) = u(t - \tau)\): Translation. Shift forward in case \(\tau > 0\) (delay) by \(\tau\).
\(u(t) = u(-t)\): Reverse time.
\(u(t) = u(\alpha t)\): Time scaling. Stretch (for \(0 < \alpha < 1\)). Compress (for \(1 < \alpha\)).
\(u(n) = u(nT)\): Sample to create a discrete time signal from a continuous time signal.
\(u(t) = \sum_n u(nT)\, K\big(\frac{t - nT}{T}\big)\): Interpolation, where \(K(\cdot)\) is an interpolation function. I.e. it has the properties \(K(0) = 1\), \(K(n) = 0\) for other integers \(n \neq 0\).
Exercise 1 Find the K() that will do linear interpolation, i.e. connect the dots. Illustrate how this works on a small example.
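The interpolation formula above can be sketched in code. Here we use the triangular (hat) kernel, one kernel satisfying \(K(0) = 1\) and \(K(n) = 0\) for integers \(n \neq 0\) (the function names are illustrative):

```python
import numpy as np

def tri_kernel(x):
    """Triangular (hat) kernel: K(0) = 1 and K(n) = 0 for integers n != 0."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def interpolate(samples, T, t):
    """u(t) = sum_n u(nT) * K((t - nT)/T), summed over the available samples."""
    n = np.arange(len(samples))
    return float(np.sum(samples * tri_kernel((t - n * T) / T)))

# Samples of u(t) = t^2 at T = 1: u(0), u(1), u(2), u(3).
samples = np.array([0.0, 1.0, 4.0, 9.0])
print(interpolate(samples, 1.0, 1.5))   # midway between 1 and 4 -> 2.5
```

With this kernel the reconstruction passes through the samples and joins them by straight lines, i.e. it "connects the dots".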
Convolutions
The convolution of two continuous time functions \(f\) and \(g\) is \((f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau)\, d\tau\).
If the functions are of positive support (\(= 0\) for \(t < 0\)) the range of integration in the
convolution integral reduces to \([0, t]\).
For a probabilist, the convolution is the basic tool of finding the distribution of the
sum of two independent random variables X and Y , say with densities fX () and fY ():
\[
F_{X+Y}(t) := P(X + Y \le t)
= \int_{-\infty}^{\infty} \int_{-\infty}^{t-x} P\big((X, Y) \in [x, x + dx) \times [y, y + dy)\big)
= \int_{-\infty}^{\infty} \int_{-\infty}^{t-x} f_X(x) f_Y(y)\, dy\, dx
= \int_{-\infty}^{\infty} f_X(x) \int_{-\infty}^{t-x} f_Y(y)\, dy\, dx.
\]
Convolution is also defined for discrete time functions (in probability theory this often
corresponds to the probability mass function of the sum of two independent discrete
random variables):
\[
P_{X+Y}(n) = P\big(X + Y = n\big) = \sum_{k=-\infty}^{\infty} P\big(X = k\big) P\big(Y = n - k\big) = \big(P_X * P_Y\big)(n).
\]
Note again that if \(P_X\) and \(P_Y\) are of positive support (\(= 0\) for \(n < 0\)) then the range of
summation in the convolution sum reduces to \(k \in \{0, \ldots, n\}\).
Another way to view discrete convolutions is as a representation of the coefficients
of polynomial products. Denote,
\[
A(x) = \sum_{j=0}^{n-1} a_j x^j, \qquad B(x) = \sum_{j=0}^{n-1} b_j x^j, \qquad C(x) = A(x)B(x) = \sum_{j=0}^{2n-2} c_j x^j,
\]
with \(c_j = \sum_{k=0}^{j} a_k b_{j-k}\) (taking \(a_k, b_k = 0\) for \(k \ge n\)), i.e. the coefficient sequence \(c\) is the convolution \(a * b\).
Our use of convolutions will be neither for probability nor for polynomial products but rather for the natural application of determining the action of a linear system on
an input signal.
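The correspondence between discrete convolution and polynomial products can be checked numerically. A minimal sketch using NumPy, with arbitrary example coefficients:

```python
import numpy as np

# Coefficients of A(x) = 1 + 2x + 3x^2 and B(x) = 4 + 5x, ascending powers.
a = [1, 2, 3]
b = [4, 5]

# Discrete convolution of the coefficient sequences...
c_conv = np.convolve(a, b)

# ...matches the coefficients of C(x) = A(x)B(x).
# (np.polymul expects descending powers, hence the reversals.)
c_poly = np.polymul(a[::-1], b[::-1])[::-1]

print(c_conv)   # [ 4 13 22 15], i.e. C(x) = 4 + 13x + 22x^2 + 15x^3
```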
Algebraic Properties
Commutativity:
\[
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau)\, d\tau = \int_{-\infty}^{\infty} f(t - \tau) g(\tau)\, d\tau = (g * f)(t).
\]
Associativity:
\[
f * (g * h) = (f * g) * h.
\]
Distributivity:
\[
f * (g + h) = f * g + f * h.
\]
Scalar multiplication:
\[
\alpha (g * h) = (\alpha g) * h = g * (\alpha h).
\]
Shift/Differentiation:
\[
D(g * h) = (Dg) * h = g * (Dh),
\]
where D is either the delay by one operator for discrete time or the differentiation
operator for continuous time.
Exercise 3 Show the shift/differentiation property. Do both shift (discrete time) and
differentiation (continuous time).
Sometimes the notation \(f^{*m}\) is used for \(f * f * \ldots * f\), \(m\) times.
If \(f\) is a probability density with mean \(\mu\) and finite variance \(\sigma^2\), the central limit
theorem implies that \(f^{*m}(t)\), once centered by \(m\mu\) and scaled by \(\sqrt{m}\,\sigma\), approaches a Gaussian density as \(m\) grows.
Laplace Transforms
Let \(s\) be a complex number. The Laplace transform of a continuous time function \(f(t)\)
at the (complex) frequency \(s\) is,
\[
\mathcal{L}\{f(\cdot)\}(s) = \int_{0^-}^{\infty} e^{-st} f(t)\, dt. \tag{1}
\]
We shall often denote \(\mathcal{L}\{f(\cdot)\}\) by \(\hat{f}\). Observe the lower limit to be \(0^-\) and read that as,
\[
\lim_{\epsilon \downarrow 0} \int_{-\epsilon}^{\infty} e^{-st} f(t)\, dt.
\]
This is typical engineering notation as the function \(f(\cdot)\) may sometimes have peculiarities at \(0\). For example, it may have a generalized function component. In applied
probability and other more rigorous mathematical contexts, the Laplace-Stieltjes transform is often used,
\[
\int e^{-st}\, dF(t),
\]
where the above is a Stieltjes integral. We shall not be concerned with this here. Our
Laplace transform, (1) is sometimes referred to as the one-sided Laplace transform.
Whereas,
\[
\mathcal{L}_B\{f(\cdot)\}(s) = \int_{-\infty}^{\infty} e^{-st} f(t)\, dt,
\]
is the bilateral Laplace transform. The latter is not as useful and important as the former
for control purposes. An exception is the case of \(s = i\omega\) (pure imaginary) in which case,
\[
\hat{f}(\omega) = \mathcal{L}_B\{f(\cdot)\}(i\omega),
\]
is (up to a constant) the Fourier transform of \(f\) (here we slightly abuse notation by using
the hat for both Laplace and Fourier transforms). Note that in most engineering texts
the symbol \(i = \sqrt{-1}\) is actually denoted by \(j\).
In probability, the Laplace transform of the density \(f_X\) of a continuous random variable
has the meaning \(E[e^{-sX}]\). This has many implications and applications which we shall
not discuss.
A function \(f(t)\) is said to be of exponential order \(c\) if there is an \(M > 0\) such that,
\[
|f(t)| \le M e^{ct}, \quad t \ge 0, \tag{2}
\]
and we let \(c_0\) denote the infimum of such \(c\). The function \(e^{t^2}\) is not of exponential order, but most signals used in control theory are.
Theorem 3 Functions \(f(t)\) that are locally integrable and are of exponential order with
infimum \(c_0\) have a Laplace transform that is finite for all \(\mathrm{Re}(s) > c_0\).
The region in the complex \(s\)-plane, \(\{s : \mathrm{Re}(s) > c_0\}\), is denoted the region of
convergence (ROC) of the Laplace transform.
Proof Writing \(s = \sigma + i\omega\) we have \(|e^{-st}| = e^{-\sigma t}\). Take any \(\gamma > c_0\); then for some \(M\),
\[
|\hat{f}(s)| \le M \int_0^{\infty} |e^{-st}|\, e^{\gamma t}\, dt = M \int_0^{\infty} e^{-(\sigma - \gamma)t}\, dt.
\]
This integral is finite whenever \(\sigma = \mathrm{Re}(s) > \gamma\). Now since \(\gamma > c_0\) can be chosen arbitrarily
close to \(c_0\) we conclude that the transform exists whenever \(\sigma > c_0\).
Uniqueness
Laplace transforms uniquely map to their original time-functions. In fact, this is the
inversion formula:
\[
f(t) = \lim_{M \to \infty} \frac{1}{2\pi i} \int_{\sigma - iM}^{\sigma + iM} e^{st} \hat{f}(s)\, ds,
\]
for any \(\sigma > c_0\). The integration is in the complex plane and is typically not the default
method.
Exercise 8 Optional (only for those that have taken a complex analysis course). Apply
the inversion formula to show that,
\[
\mathcal{L}^{-1}\Big\{\frac{1}{(s+a)^2}\Big\} = t e^{-at}.
\]
Basic Examples
Example 2 The Laplace transform of \(f(t) = c\):
\[
\mathcal{L}(c) = \int_0^{\infty} e^{-st} c\, dt = \lim_{T \to \infty} \int_0^T e^{-st} c\, dt = \lim_{T \to \infty} \Big[-\frac{c}{s} e^{-st}\Big]_0^T = \frac{c}{s}\Big(1 - \lim_{T \to \infty} e^{-sT}\Big).
\]
When does the limit converge to a finite value? Take \(s = \sigma + i\omega\),
\[
\lim_{T \to \infty} e^{-sT} = \lim_{T \to \infty} e^{-\sigma T}\big(\cos \omega T - i \sin \omega T\big).
\]
This limit is \(0\) precisely when \(\mathrm{Re}(s) > 0\), so,
\[
\mathcal{L}(c) = \frac{c}{s}, \qquad \mathrm{Re}(s) > 0.
\]
Basic Properties
You should derive these.
Linearity:
\[
\mathcal{L}\big\{\alpha_1 f_1(t) + \alpha_2 f_2(t)\big\} = \alpha_1 \hat{f}_1(s) + \alpha_2 \hat{f}_2(s).
\]
Time shift:
\[
\mathcal{L}\big\{f(t - \tau)\big\} = \int f(t - \tau) e^{-st}\, dt = \int f(t) e^{-s(t + \tau)}\, dt = e^{-s\tau} \hat{f}(s).
\]
Frequency shift:
\[
\mathcal{L}\big\{e^{-at} f(t)\big\} = \hat{f}(s + a).
\]
Time scaling:
\[
\mathcal{L}\big\{f(at)\big\} = \frac{1}{|a|} \hat{f}\Big(\frac{s}{a}\Big).
\]
Differentiation:
\[
\mathcal{L}\big\{f'(t)\big\} = \int_{0^-}^{\infty} f'(t) e^{-st}\, dt = \Big[f(t) e^{-st}\Big]_{0^-}^{\infty} + s \int_{0^-}^{\infty} f(t) e^{-st}\, dt = -f(0^-) + s \hat{f}(s).
\]
Integration:
\[
\mathcal{L}\Big\{\int_0^t f(x)\, dx\Big\} = \frac{1}{s} \hat{f}(s).
\]
More basic properties can be found in one of the hundreds of tables available in books or on
the web.
Relation To Convolution
This property is very important:
\[
\mathcal{L}\big\{f_1(t) * f_2(t)\big\} = \hat{f}_1(s)\, \hat{f}_2(s).
\]
Exercise 12 Prove it.
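The convolution property can also be sanity-checked numerically. The following sketch approximates the one-sided Laplace transforms by numerical integration, using \(f_1(t) = f_2(t) = e^{-t}\), whose convolution is \(t e^{-t}\) (the `laplace` helper is an illustrative assumption, not a library function):

```python
import numpy as np
from scipy.integrate import quad

# f1(t) = f2(t) = e^{-t} on t >= 0; their convolution is (f1*f2)(t) = t*e^{-t}.
f1 = lambda t: np.exp(-t)
conv = lambda t: t * np.exp(-t)

def laplace(f, s, upper=50.0):
    """Numerical one-sided Laplace transform at a real frequency s (illustrative helper)."""
    val, _ = quad(lambda t: np.exp(-s * t) * f(t), 0.0, upper)
    return val

s = 2.0
lhs = laplace(conv, s)        # L{f1 * f2}(s)
rhs = laplace(f1, s) ** 2     # f1_hat(s) * f2_hat(s) = (1/(s+1))^2
print(lhs, rhs)               # both approximately 1/9
```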
Often Laplace (as well as Fourier and Z) transforms are of the rational form,
\[
\hat{f}(s) = \frac{p(s)}{q(s)} = \frac{p_m s^m + \ldots + p_1 s + p_0}{q_n s^n + \ldots + q_1 s + q_0},
\]
with \(p_i, q_i\) either real or complex coefficients (we mostly care about real coefficients) such
that \(p_m, q_n \neq 0\). The function \(\hat{f}(\cdot)\) is called proper if \(m \le n\), strictly proper if \(m < n\)
and improper if \(m > n\).
If \(\hat{f}(s)\) is not strictly proper, then by performing long division it may be expressed
in the form,
\[
r(s) + \frac{v(s)}{q(s)},
\]
where \(r(s)\) is a polynomial of degree \(m - n\) and \(v(s)\) is a polynomial of degree \(< n\).
Exercise 13 Carry out long division for,
\[
\hat{f}(s) = \frac{s^4 + 2s^3 + s + 2}{s^2 + 1},
\]
to express it in the form above.
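Long division of this kind is mechanical, and NumPy's `polydiv` carries it out on coefficient vectors. A sketch on a different, arbitrary example (so as not to spoil the exercise):

```python
import numpy as np

# Long division of (s^3 + 2s^2 + 3) / (s^2 + 1), coefficients in descending powers.
p = [1, 2, 0, 3]   # s^3 + 2s^2 + 0s + 3
q = [1, 0, 1]      # s^2 + 1
r, v = np.polydiv(p, q)

print(r)   # [1. 2.]  -> r(s) = s + 2
print(v)   # [-1. 1.] -> v(s) = -s + 1, so the form is (s+2) + (-s+1)/(s^2+1)
```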
The action of performing partial fraction expansion is the action of finding the coefficients \(A_{ik}\) such that a strictly proper \(\hat{f}(\cdot)\) is in the form,
\[
\hat{f}(s) = \sum_{i=1}^{K} \sum_{k=1}^{m_i} \frac{A_{ik}}{(s - s_i)^k},
\]
where \(s_1, \ldots, s_K\) are the distinct real or complex roots of \(q(s)\), and the multiplicity of
root \(s_i\) is \(m_i\).
After carrying out long division (if needed) and partial fraction expansion, \(\hat{f}(s)\) may
be easily inverted, term by term.
Example 3 Consider,
\[
\hat{f}(s) = \frac{1}{s^2 + 3s + 2} = \frac{1}{(s+1)(s+2)} = \frac{A_{11}}{s+1} + \frac{A_{21}}{s+2}, \tag{3}
\]
or,
\[
1 = A_{11}(s + 2) + A_{21}(s + 1). \tag{4}
\]
Now identify coefficients of terms with like powers of \(s\) to get a set of linear equations:
\[
A_{11} + A_{21} = 0, \qquad 2A_{11} + A_{21} = 1,
\]
to get \(A_{11} = 1\) and \(A_{21} = -1\).
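SciPy can perform such partial fraction expansions numerically via `signal.residue`. A sketch reproducing the expansion of Example 3:

```python
import numpy as np
from scipy.signal import residue

# Partial fractions of 1/(s^2 + 3s + 2) = A11/(s+1) + A21/(s+2), as in Example 3.
r, p, k = residue([1], [1, 3, 2])

print(p)   # poles: -1 and -2
print(r)   # residues: 1 at the pole -1, -1 at the pole -2
```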
Example 4 Consider,
\[
\hat{f}(s) = \frac{s - 1}{s^3 - 3s - 2} = \frac{s - 1}{(s+1)^2 (s-2)}.
\]
Here the root \(-1\) has multiplicity \(2\). Similar expansions apply when \(q(s)\) has complex-conjugate roots, e.g. for,
\[
\frac{s + 3}{(s^2 + 2s + 5)(s + 1)}.
\]
Fourier Transforms
The Fourier transform of \(f(t)\) and its inverse are,
\[
\hat{f}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\, dt, \qquad f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega) e^{i\omega t}\, d\omega.
\]
A classic function arising in this context is the sinc function, \(\sin(t)/t\).
Basic Properties
Many properties are very similar to the Laplace transform (the Fourier transform is a
special case of the bilateral Laplace transform).
Some further important properties are:
The transform of the product \(f_1(t) f_2(t)\) is \(\frac{1}{2\pi}\big(\hat{f}_1 * \hat{f}_2\big)(\omega)\). This has far reaching
implications in signal processing and communications.
Parseval's relation (energy over time = energy over spectrum):
\[
\int_{-\infty}^{\infty} f(t)^2\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} \big|\hat{f}(\omega)\big|^2\, d\omega.
\]
Graphical Representations
Plots of \(|\hat{f}(\omega)|\) and \(\angle \hat{f}(\omega)\) are referred to by engineers as Bode plots. It is typical to
stretch the axes of the plots so that the horizontal axis is \(\log_{10}(\omega)\) and the vertical
axes are \(20 \log_{10} |\hat{f}(\omega)|\) and \(\angle \hat{f}(\omega)\). There is a long tradition in engineering of generating
approximate Bode plots by hand based on first and second order system approximations.
An alternative plot is the Nyquist plot.
Exercise 17 Generate a Bode and a Nyquist plot of a system with transfer function,
\[
H(s) = \frac{1}{s^2 + s + 2}.
\]
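One way to generate such plots with software is via `scipy.signal`. A sketch computing the Bode and Nyquist data for the transfer function of Exercise 17 (the plotting calls are left commented out):

```python
import numpy as np
from scipy import signal

# H(s) = 1 / (s^2 + s + 2), the transfer function from Exercise 17.
sys = signal.TransferFunction([1], [1, 1, 2])
w = np.logspace(-2, 2, 200)   # frequency grid (rad/s)

# Bode data: magnitude 20*log10|H(iw)| in dB and phase in degrees.
w, mag_db, phase_deg = signal.bode(sys, w=w)

# Nyquist data: the curve H(iw) traced in the complex plane.
_, H = signal.freqresp(sys, w=w)

# Plotting (matplotlib assumed available) would then be, e.g.:
# import matplotlib.pyplot as plt
# plt.semilogx(w, mag_db)                 # Bode magnitude
# plt.figure(); plt.plot(H.real, H.imag)  # Nyquist curve
print(mag_db[0])   # low-frequency gain, close to 20*log10(1/2)
```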
The Z-Transform
This is the analog of the Laplace transform for discrete time functions, \(f(n)\). The
Z-transform is defined as follows,
\[
\hat{f}(z) = \sum_{n=-\infty}^{\infty} f(n) z^{-n}.
\]
Many of the things we do for continuous time using the Laplace transform may be
done for discrete time using the Z-transform. We will not add further details in this
unit, but rather touch discrete time systems when we talk about general (MIMO) linear
systems.
Engineering (and mathematics) practice of continuous time signals is often greatly simplified by use of generalized signals. The archetypal such signal is the delta function,
denoted by \(\delta(t)\), also called an impulse. This weird mathematical object has the following two basic properties:
1. \(\delta(t) = 0\) for \(t \neq 0\).
2. \(\int_{-\infty}^{\infty} \delta(t)\, dt = 1\).
Now obviously there is no such function \(\delta : \mathbb{R} \to \mathbb{R}\) that obeys these two properties if the
integral is taken in the normal sense (e.g. the Riemann integral). The rigorous description of
delta functions is part of the theory of distributions (not to be confused with probability
distributions). We shall overview it below informally and then survey a few useful
properties of the delta function. First, one should be motivated by the fact that in
practice the delta function can model the following:
1. The signal representing the energy transfer from a hammer to a nail.
2. The derivative of the unit step function.
3. The limit of a Gaussian (normal) density as the variance tends to \(0\).
A more formal (yet not fully rigorous) way to define delta functions is under the
integral sign. \(\delta\) can be thought of as an entity that obeys,
\[
\int_{-\infty}^{\infty} \delta(t) \phi(t)\, dt = \phi(0), \tag{5}
\]
for every (regular) function \(\phi\) that is continuous at \(0\) and has bounded support (equals
\(0\) outside of a bounded set containing the origin). Entities such as \(\delta(t)\) are not regular functions: we will never talk about the value of \(\delta(t)\) for some \(t\), but rather always consider values
of integrals involving \(\delta(t)\). Yet from a practical perspective they may often be treated
as such.
An important operation in the study of linear systems is the convolution. We use \(*\)
to denote the convolution operator. For two continuous time signals, \(f(t)\) and \(g(t)\), the
signal \(h = f * g\) resulting from their convolution is,
\[
h(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau)\, d\tau.
\]
The delta function gives a way to represent any signal \(u(t)\). Consider the convolution
\(\delta * u\):
\[
\int_{-\infty}^{\infty} \delta(\tau) u(t - \tau)\, d\tau = u(t - 0) = u(t). \tag{6}
\]
Thus we see that the \(\delta\) function is the identity function with respect to convolutions:
\(\delta * u = u\).
The discrete time version of the convolution \(h = f * g\) is,
\[
h(n) = \sum_{k=-\infty}^{\infty} f(k) g(n - k).
\]
Similarly,
\[
(\delta * u)(n) = \sum_{k=-\infty}^{\infty} \delta[k] u(n - k) = u(n). \tag{7}
\]
Here \(\delta[n]\) is the discrete delta function (observe the square brackets), a much simpler
object than \(\delta(t)\) since it is defined as,
\[
\delta[n] = \begin{cases} 1 & n = 0, \\ 0 & n \neq 0. \end{cases}
\]
Thus we have again that \(\delta * u = u\). Note that part of the motivation for introducing
the continuous time delta function is to be able to mimic the representation (7).
We shall soon present other generalized signals related to the delta function. Since
such functions are defined under the integral sign, two signals \(\psi_1(t)\) and \(\psi_2(t)\) are equal
if,
\[
\int_{-\infty}^{\infty} \psi_1(t) \phi(t)\, dt = \int_{-\infty}^{\infty} \psi_2(t) \phi(t)\, dt,
\]
for every suitable test function \(\phi\).
Consider now the scaled delta function \(\delta(\alpha t)\) for \(\alpha \neq 0\). Changing variables \(\tau = \alpha t\),
\[
\int \delta(\alpha t) \phi(t)\, dt = \frac{1}{|\alpha|} \int \delta(\tau) \phi\Big(\frac{\tau}{\alpha}\Big)\, d\tau = \frac{1}{|\alpha|} \phi(0) = \frac{1}{|\alpha|} \int \delta(t) \phi(t)\, dt.
\]
Here the first equality is a definition, and the second and third equalities come from the
defining equation (5). This then implies that
\[
\delta(\alpha t) = \frac{1}{|\alpha|} \delta(t).
\]
Consider now what happens when \(\delta(t)\) is multiplied by a function \(f(t)\) continuous at
\(0\). If \(\delta(t)\) were a regular function then,
\[
\int \big(f(t)\delta(t)\big) \phi(t)\, dt = \int \delta(t) \big(f(t)\phi(t)\big)\, dt = f(0)\phi(0).
\]
It is then sensible to define the generalized function \(f(t)\delta(t)\) (for any regular function
\(f(\cdot)\)) as satisfying:
\[
\int \big(f(t)\delta(t)\big) \phi(t)\, dt = f(0)\phi(0),
\]
that is, \(f(t)\delta(t) = f(0)\delta(t)\).
Another useful generalized signal is the impulse train,
\[
\delta_T(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT).
\]
Here of course one needs to justify the existence of the series (of generalized functions!)
etc., but this is not our interest.
Impulse trains are very useful for representing the operation of sampling a continuous
time (analog) signal. This is done by taking the signal \(u(t)\) and multiplying by \(\delta_T(t)\).
The resulting signal agrees with \(u(t)\) at the points \(t = kT\), \(k \in \mathbb{Z}\), and is \(0\) elsewhere.
The derivation of the famous Nyquist-Shannon sampling theorem is greatly aided by
the impulse train. That theorem says that a band limited analog signal \(u(t)\) can be
perfectly reconstructed if sampled at a rate that is equal to or greater than twice its highest
frequency.
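The sampling theorem can be illustrated numerically: sample a sinusoid above its Nyquist rate and reconstruct intermediate values with the (truncated) sinc interpolation kernel. A sketch, with an arbitrary choice of frequency and sampling rate:

```python
import numpy as np

# Band-limited signal u(t) = sin(2*pi*f0*t), f0 = 1 Hz, sampled at 8 Hz (> 2*f0).
f0 = 1.0
T = 1.0 / 8.0
n = np.arange(-200, 201)                    # truncation of the infinite sum
samples = np.sin(2 * np.pi * f0 * n * T)    # u(nT)

def reconstruct(t):
    """Shannon reconstruction: sum_n u(nT) * sinc((t - nT)/T)."""
    return float(np.sum(samples * np.sinc((t - n * T) / T)))

t = 0.3   # a point strictly between sampling instants
print(reconstruct(t), np.sin(2 * np.pi * f0 * t))   # nearly equal
```

The truncation to 401 samples introduces a small error; the exact theorem requires the full doubly-infinite sum.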
Related to the delta function is the unit step function,
\[
1(t) = \begin{cases} 0 & t < 0, \\ 1 & 0 \le t. \end{cases} \tag{8}
\]
This is sometimes called the Heaviside unit function. Other standard notation for it
is \(u(t)\), but in control theory we typically reserve \(u(t)\) for other purposes (i.e. the input
to a system). While it is a function in the regular sense, it can also be defined as a
generalized function:
\[
\int_{-\infty}^{\infty} 1(t) \phi(t)\, dt = \int_0^{\infty} \phi(t)\, dt. \tag{9}
\]
Integration by parts further shows that the derivative of the unit step is the delta function:
\[
\int 1'(t) \phi(t)\, dt = -\int_0^{\infty} \phi'(t)\, dt = -\big(\phi(\infty) - \phi(0)\big) = \phi(0).
\]
Here \(\phi(t)\) needs to be any function from a suitable set of test functions. We will not
discuss generalized functions in any more depth than covered here. Students interested
in functional analysis and related fields can study more about Schwartz's theory of
distributions independently.
A system is a mapping of an input signal to an output signal. When the signals are
scalars the system is called SISO. When inputs are vectors and outputs are vectors the
system is called MIMO (Multi Input Multi Output). Other combinations are MISO and
SIMO. We concentrate on SISO in this unit. As in the figure below, we typically denote
the output of the system by y(t).
Input \(u(t)\) \(\rightarrow\) [ System, state \(x(t)\) ] \(\rightarrow\) Output \(y(t)\)
Figure 1: A system operates on an input signal \(u(\cdot)\) to generate an output signal
\(y(\cdot) = O\big(u(\cdot)\big)\). The system may have a state, \(x(t)\). This unit does not focus on
state representations and thus \(x(t)\) is often ignored here.
A system is memoryless if the output at time \(t\) depends only on the input at time
\(t\). I.e. \(y(t) = g\big(u(t)\big)\) for some scalar function \(g(\cdot)\). These systems are typically quite
boring.
A system is non-anticipating (or causal) if the output at time \(t\) depends only on
the inputs during times up to time \(t\). This is defined formally by requiring that for all
\(t_0\), whenever the inputs \(u_1\) and \(u_2\) obey \(u_1(t) = u_2(t)\) for all \(t \le t_0\), the corresponding
outputs \(y_1\) and \(y_2\) obey \(y_1(t) = y_2(t)\) for all \(t \le t_0\).
A system is time-invariant if its behaviour does not depend on the actual current
time. To formally define this, let \(y(t)\) be the output corresponding to \(u(t)\). The system
is time-invariant if the output corresponding to \(u(t - \tau)\) is \(y(t - \tau)\), for any time shift \(\tau\).
A system is linear if the output corresponding to the input \(\alpha_1 u_1(t) + \alpha_2 u_2(t)\) is
\(\alpha_1 y_1(t) + \alpha_2 y_2(t)\), where \(y_i\) is the output corresponding to \(u_i\) and the \(\alpha_i\) are arbitrary constants.
Exercise 22 Prove that the linearity property generalises to inputs of the form
\(\sum_{i=1}^{N} \alpha_i u_i(t)\).
Systems that are both linear and time-invariant are abbreviated with the acronym
LTI. Such systems are extremely useful in both control and signal processing. The LTI
systems of control are typically causal while those of signal processing are sometimes
not.
Exercise 23 For discrete time input \(u(n)\) define,
\[
y(n) = \frac{\alpha + \beta \cos(n)}{N + M + 1} \sum_{m=-M}^{N} u(n + m).
\]
When \(\alpha = 1\) and \(\beta = 0\) this system is called a sliding window averager. It is very useful
and abundant in time-series analysis and related fields. Otherwise, there is not much
practical meaning for the system other than the current exercise.
Determine when the system is memoryless, causal, linear and time-invariant based on
the parameters \(N, M, \alpha, \beta\).
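The sliding-window system of Exercise 23 can be sketched directly in code; here the output is computed only at indices where the full window fits inside the input, and the function name is illustrative:

```python
import numpy as np

def sliding_system(u, N, M, alpha, beta):
    """y(n) = (alpha + beta*cos(n))/(N+M+1) * sum_{m=-M}^{N} u(n+m),
    computed only at indices n where the whole window fits inside u."""
    y = []
    for n in range(M, len(u) - N):
        window = sum(u[n + m] for m in range(-M, N + 1))
        y.append((alpha + beta * np.cos(n)) / (N + M + 1) * window)
    return np.array(y)

u = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# alpha=1, beta=0: a plain sliding window averager over {n-1, n, n+1}.
print(sliding_system(u, N=1, M=1, alpha=1.0, beta=0.0))   # [2. 3. 4.]
```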
A final general notion of systems that we shall consider is BIBO stability. BIBO
stands for bounded-input-bounded-output. A system is defined to be BIBO stable if
whenever the input \(u\) satisfies \(\|u\|_\infty < \infty\) then the output satisfies \(\|y\|_\infty < \infty\). We will
see in the sections below that this property is well characterised for LTI systems.
10
Many useful systems used in practice (signal processing and control) are both linear and
time-invariant. For this we have the acronym LTI. The remainder of this unit deals with
LTI SISO systems.
We now overview several ways of representing linear systems:
1. IO Mapping representation.
2. Representation using the impulse response.
3. Representation using the transfer function.
11
We begin the discussion with equation (7). This is merely a representation of a discrete
time signal \(u(n)\) using the shifted (by \(k\)) discrete delta function,
\[
\delta[n - k] = \begin{cases} 1 & n = k, \\ 0 & n \neq k. \end{cases}
\]
Treat now \(u(n)\) as input to an LTI system with output \(y(n)\). In this case, since the input
as a function of the time \(n\) is represented as in (7), the output may be represented as
follows:
\[
y(n) = O\big(u(n)\big) = O\Big(\sum_{k=-\infty}^{\infty} \delta[k] u(n-k)\Big) = O\Big(\sum_{k=-\infty}^{\infty} u(k) \delta[n-k]\Big) = \sum_{k=-\infty}^{\infty} u(k)\, O\big(\delta[n-k]\big).
\]
Now denote \(h(n) = O\big(\delta[n]\big)\); since the system is time invariant we have that
\(h(n - k) = O\big(\delta[n - k]\big)\). So we have:
\[
y(n) = \sum_{k=-\infty}^{\infty} u(k) h(n - k) = \big(u * h\big)(n).
\]
This very nice fact shows that the output of LTI systems can in fact be described by
the convolution of the input with the function \(h(n)\). This function deserves a special
name: the impulse response.
For continuous time systems the same argument essentially follows, this time using
(6):
\[
y(t) = O\big(u(t)\big) = O\Big(\int \delta(\tau) u(t - \tau)\, d\tau\Big) = O\Big(\int u(\tau) \delta(t - \tau)\, d\tau\Big)
= \int u(\tau)\, O\big(\delta(t - \tau)\big)\, d\tau = \int u(\tau) h(t - \tau)\, d\tau = \big(u * h\big)(t).
\]
Observe that in the above we had a leap of faith in taking our system to be linear in the
sense that,
\[
O\Big(\int \lambda_s u_s\, ds\Big) = \int \lambda_s\, O\big(u_s\big)\, ds.
\]
We have thus seen that a basic description of LTI systems is the impulse response. Based
on the impulse response we may thus be able to see when the system is memoryless,
causal and BIBO-stable.
Exercise 24 Show that an LTI system is memoryless if and only if the impulse response
has the form \(h(t) = K\delta(t)\).
Exercise 25 Show that an LTI system is causal if and only if \(h(t) = 0\) for all \(t < 0\).
Example 6 Consider the sliding window averager of exercise 23 with \(\alpha = 1\) and \(\beta = 0\).
Find its impulse response and verify when it is causal.
The following characterises LTI systems that are BIBO stable:
Theorem 5 A SISO LTI system with impulse response \(h(\cdot)\) is BIBO stable if and only
if \(\|h\|_1 < \infty\). Further, if this holds then,
\[
\|y\|_\infty \le \|h\|_1 \|u\|_\infty. \tag{10}
\]
Proof The proof is for discrete time (the continuous time case is analogous). Assume
first that \(\|h\|_1 < \infty\). Then,
\[
|y(n)| = \Big|\sum_{k=-\infty}^{\infty} h(n - k) u(k)\Big| \le \sum_{k=-\infty}^{\infty} |h(n - k)|\, |u(k)| \le \sum_{k=-\infty}^{\infty} |h(n - k)|\, \|u\|_\infty.
\]
So,
\[
\|y\|_\infty \le \|h\|_1 \|u\|_\infty.
\]
Now to prove that \(\|h\|_1 < \infty\) is also a necessary condition. We assume the input is real
(the complex case is left as an exercise), and choose the input,
\[
u(n) = \mathrm{sign}\big(h(-n)\big).
\]
So,
\[
y(0) = \sum_{k=-\infty}^{\infty} h(0 - k) u(k) = \sum_{k=-\infty}^{\infty} |h(-k)| = \|h\|_1.
\]
Thus if \(\|h\|_1 = \infty\) the output for input \(u(\cdot)\) is unbounded, so \(\|h\|_1 < \infty\) is a necessary
condition.
Exercise 26 What input signal achieves equality in (10)?
Exercise 27 Prove the continuous time version of the above.
Exercise 28 Prove the above for signals that are in general complex valued.
12
It is now useful to consider our LTI SISO systems as operating on complex valued signals.
Consider now an input of the form \(u(t) = e^{st}\) where \(s \in \mathbb{C}\). We shall denote \(s = \sigma + i\omega\),
i.e. \(\sigma = \mathrm{Re}(s)\) and \(\omega = \mathrm{Im}(s)\). We now have,
\[
y(t) = \int h(\tau) u(t - \tau)\, d\tau = \int h(\tau) e^{s(t - \tau)}\, d\tau = \Big(\int h(\tau) e^{-s\tau}\, d\tau\Big) e^{st}.
\]
Denoting \(H(s) = \int h(\tau) e^{-s\tau}\, d\tau\), we found that for exponential input \(e^{st}\), the output
is simply a multiplication by the complex constant (with respect to \(t\)), \(H(s)\):
\[
y(t) = H(s) e^{st}.
\]
This is nice as it shows that inputs of the form \(e^{st}\) are the eigensignals of LTI systems,
where for each \(s\), \(H(s)\) is the corresponding eigenvalue.
Observe that \(H(s)\) is nothing more than the Laplace transform of the impulse response. It is central to control and system theory and deserves a name: the transfer
function. When the input signal under consideration has real part \(\sigma = 0\), i.e. \(u(t) = e^{i\omega t}\),
then the output can still be represented in terms of the transfer function:
\[
y(t) = H(i\omega) e^{i\omega t}.
\]
In this case \(y(t)\) is referred to as the frequency response of the harmonic input \(e^{i\omega t}\)
at frequency \(\omega\). In this case, \(F(\omega) = H(i\omega)\) is the Fourier transform of the impulse
response. Note that both the Fourier and Laplace transforms are referred to in practice
as the transfer function. Further, analogies exist in discrete time systems (e.g. the
Z-transform).
In general, since we have seen that \(y(t) = (u * h)(t)\), we have that,
\[
Y(s) = U(s) H(s).
\]
So the transfer function \(H(s)\) can also be viewed as,
\[
H(s) = \frac{Y(s)}{U(s)}.
\]
This takes practical meaning for pure imaginary \(s = i\omega\) as it allows one to measure \(H(i\omega)\)
based on the ratios of \(Y(i\omega)\) and \(U(i\omega)\). The frequency response of a system is the
Fourier transform of the impulse response.
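The eigensignal relation can be checked numerically for a first order system with impulse response \(h(t) = e^{-t} 1(t)\), for which \(H(s) = 1/(s+1)\). A sketch using a Riemann-sum approximation of the defining integral:

```python
import numpy as np

# Impulse response h(t) = e^{-t} 1(t), for which H(s) = 1/(s+1).
dt = 1e-3
tau = np.arange(0.0, 20.0, dt)
h = np.exp(-tau)

omega = 3.0
# For input u(t) = e^{i*omega*t} the output is y(t) = H(i*omega) e^{i*omega*t},
# so H(i*omega) = integral of h(tau) e^{-i*omega*tau} dtau; Riemann-sum it:
H_num = np.sum(h * np.exp(-1j * omega * tau)) * dt
H_exact = 1.0 / (1j * omega + 1.0)
print(H_num, H_exact)   # nearly equal
```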
Consider now a sinusoidal input \(u(t) = \sin(\omega_0 t)\), with Laplace transform,
\[
U(s) = \frac{\omega_0}{s^2 + \omega_0^2}.
\]
For a system with rational transfer function \(H(s)\) having \(n\) distinct poles \(p_1, \ldots, p_n\), the output is,
\[
Y(s) = H(s) \frac{\omega_0}{s^2 + \omega_0^2}.
\]
Now applying a partial fraction expansion we will get the following form,
\[
Y(s) = \sum_{i=1}^{n} \frac{\alpha_i}{s - p_i} + \frac{\gamma_0}{s + i\omega_0} + \frac{\bar{\gamma}_0}{s - i\omega_0}.
\]
Exercise 29 Carry out the above to find \(\gamma_0, \alpha_1, \alpha_2, \alpha_3\) for some explicit \(H(s)\) of your
choice having 3 distinct poles with negative real part.
Inverting \(Y(s)\) we get,
\[
y(t) = \sum_{i=1}^{n} \alpha_i e^{p_i t} + 2|\gamma_0| \cos(\omega_0 t - \phi), \qquad t \ge 0,
\]
with,
\[
\phi = \tan^{-1}\Big(\frac{\mathrm{Im}(\gamma_0)}{\mathrm{Re}(\gamma_0)}\Big).
\]
Now if \(\mathrm{Re}(p_i) < 0\) for all \(i\) the system will exhibit stable behavior and as \(t\) grows the output
will be determined by the sinusoidal term.
Exercise 30 Continuing the previous exercise, find and plot the system output so as to
illustrate convergence to the pure sinusoidal term.
Exercise 31 Now make/generate (probably using some software) a Bode plot of your
system and relate the result of the previous exercise to the Bode plot. I.e. what is the
frequency response at \(\omega_0\)?
13
When we study general linear systems we will relate BIBO stability to internal stability.
The latter term is defined for systems with rational transfer functions as follows: A
system is internally stable if the locations of all poles are in the open left half plane.
A classic criterion for this is the Routh-Hurwitz test: We consider a polynomial
\(q(s) = q_n s^n + \ldots + q_1 s + q_0\) and are interested to see whether all of its roots \(s\) satisfy \(\mathrm{Re}(s) < 0\).
In that case we call the polynomial Hurwitz.
See section 7.3, pp 247 of [PolWil98] (handout).
Exercise 32 Follow example 7.3.2 of [PolWil98]. Then choose a different polynomial
of similar order and carry out the test again. Compare your results to the actual roots
of the polynomial (which you can find using some software).
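In practice one can also bypass the hand test and simply compute the roots numerically, as the exercise suggests. A sketch of such a Hurwitz check using NumPy (the small tolerance treats imaginary-axis roots as not Hurwitz):

```python
import numpy as np

def is_hurwitz(coeffs):
    """True if all roots of the polynomial (descending coefficients) lie in the
    open left half plane; a small tolerance rejects imaginary-axis roots."""
    return bool(np.all(np.roots(coeffs).real < -1e-9))

print(is_hurwitz([1, 3, 3, 1]))   # (s+1)^3: all roots at -1 -> True
print(is_hurwitz([1, 0, 1]))      # s^2 + 1: roots on the imaginary axis -> False
print(is_hurwitz([1, -1, 1]))     # roots with positive real part -> False
```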
14
Consider a first order LTI system described by,
\[
\frac{1}{\lambda} \dot{y}(t) + y(t) = u(t),
\]
for some scalar \(\lambda\). Thus the impulse response is the solution of,
\[
\frac{1}{\lambda} \dot{h}(t) + h(t) = \delta(t), \qquad h(0^-) = 0,
\]
which is,
\[
h(t) = \lambda e^{-\lambda t} 1(t).
\]
Exercise 33 Check that this is indeed the solution (use properties of \(\delta(t)\)).
The transfer function of this system is,
\[
H(s) = \frac{\lambda}{s + \lambda}.
\]
It has a pole at \(s = -\lambda\) and thus the ROC is \(\mathrm{Re}(s) > -\lambda\). So for \(\lambda > 0\) the system is
stable.
A second order LTI system (with complex conjugate roots) is generally a more interesting object. It can be described by,
\[
\ddot{y}(t) + 2\zeta\omega_n \dot{y}(t) + \omega_n^2 y(t) = \omega_n^2 u(t).
\]
The two positive parameters have the following names: \(\omega_n\) is the natural frequency and \(\zeta\) is the damping ratio.
The transfer function is,
\[
H(s) = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} = \frac{\omega_n^2}{(s - c_1)(s - c_2)},
\]
where,
\[
c_{1,2} = -\zeta\omega_n \pm \omega_n \sqrt{\zeta^2 - 1}.
\]
A partial fraction expansion then yields terms of the form,
\[
\frac{M_1}{s - c_1} + \frac{M_2}{s - c_2}.
\]
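The behavior of such a second order system is easy to explore with `scipy.signal`. A sketch with arbitrary assumed parameter values \(\omega_n = 2\), \(\zeta = 0.25\) (underdamped, so the step response overshoots the DC gain \(H(0) = 1\)):

```python
import numpy as np
from scipy import signal

# H(s) = wn^2 / (s^2 + 2*zeta*wn*s + wn^2); example values (assumed): wn=2, zeta=0.25.
wn, zeta = 2.0, 0.25
sys = signal.TransferFunction([wn**2], [1.0, 2.0 * zeta * wn, wn**2])

t, y = signal.step(sys, T=np.linspace(0.0, 10.0, 2000))
print(y[-1])    # settles near the DC gain H(0) = 1
print(y.max())  # overshoots 1 since zeta < 1 (underdamped)
```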
15
The next unit, dealing with classic control methods, generally deals with designing LTI
controllers \(G_1\) and \(G_2\) configured as in Figure 2. The whole system relating output \(y\)
to input reference \(r\) is then also an LTI system and may be analyzed in the frequency (or \(s\))
domain easily. The goal is to find the closed loop transfer function \(\bar{H}(s)\) such that,
\[
Y(s) = R(s) \bar{H}(s).
\]
\(r \rightarrow \oplus \rightarrow G_1(s) \rightarrow H(s) \rightarrow y\), with the measured output \(y_m = G_2(s)\, y\) fed back negatively into the summing junction \(\oplus\).
Figure 2: A plant, \(H(s)\), is controlled by the blocks \(G_1(s)\) and \(G_2(s)\); they are both
optional (i.e. may be set to be some constant \(K\) or even \(1\)).
This can be done easily:
\[
Y(s) = U(s)H(s) = E(s)G_1(s)H(s) = \big(R(s) - Y_m(s)\big)G_1(s)H(s) = \big(R(s) - G_2(s)Y(s)\big)G_1(s)H(s).
\]
Solving for \(Y(s)\) we have,
\[
Y(s) = R(s) \frac{G_1(s)H(s)}{1 + G_2(s)G_1(s)H(s)},
\]
or,
\[
\bar{H}(s) = \frac{G_1(s)H(s)}{1 + G_2(s)G_1(s)H(s)}.
\]
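The closed-loop algebra can be reproduced symbolically. A sketch using SymPy with hypothetical example blocks \(H = 1/(s+1)\), \(G_1 = K\), \(G_2 = 1\):

```python
import sympy as sp

s, K = sp.symbols('s K')

# Hypothetical example blocks: plant H = 1/(s+1), controller G1 = K, sensor G2 = 1.
H = 1 / (s + 1)
G1 = K
G2 = sp.Integer(1)

# From Y = (R - G2*Y) * G1 * H, the closed loop is Y/R = G1*H / (1 + G2*G1*H):
Hbar = sp.simplify(G1 * H / (1 + G2 * G1 * H))
print(Hbar)   # simplifies to K/(K + s + 1)
```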
Exercise 39 What would the feedback system be if there was positive feedback instead
of negative? I.e. if the circle in the figure had a \(+\) instead of a \(-\)?