Optimal Control
Optimal Control
Optimal Control
1Ii),'
Arturo Locatelli
Optimal Control
An Introduction
Birkhauser Verlag
Basel Boston Berlin
Author
Arturo Locatelli
Dipartimento di Elettronica e Informazione
Politecnico di Milano
Piazza L. da Vinci 32
20133 Milano
Italy
e-mail: [email protected]
987654321
www.birkhauser.ch
to Franca
and my parents
Contents
Preface . . . . .
1 Introduction
. .
. .
. .
ix
. .
. .
. .
. .
. .
I Global methods
2
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
The LQ problem
3.1 Introduction . . . . . . . . . .
3.2 Finite control horizon . . . .
3.3 Infinite control horizon . . . .
3.4 The optimal regulator . . . .
3.4.1 Stability properties . .
3.4.2 Robustness properties
3.4.3 The cheap control . .
3.4.4 The inverse problem .
.
3.5 Problems
..
9
10
18
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
..
21
22
39
44
59
71
76
82
89
4.1
4.2
. .
. . .
. .
. .
. .
. .
. .
. .
. .
.
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
.
. .
91
. .
. .
. .
. .
. .
. .
93
93
104
112
112
115
122
125
. .
.
. .
Contents
viii
5.2
5.3
5.4
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. . .
. .
. .
. .
. .
. .
. .
125
127
143
II Variational methods
6 The Maximum Principle
6.1 Introduction . . . .
6.2
6.3
6.4
6.5
6.6
7
. .
. .
. .
. .
Simple constraints . . . . . . . . . . . . . . . . . . .
6.2.1 Integral performance index . . . . . . . . . .
6.2.2 Performance index function of the final event
Complex constraints . . . . . . . . . . . . . . . . . .
6.3.1 Nonregular final varieties . . . . . . . . . . .
6.3.2 Integral constraints . . . . . . . . . . . . . . .
6.3.3 Global instantaneous equality constraints . .
6.3.4 Isolated equality constraints . . . . . . . . . .
6.3.5 Global instantaneous inequality constraints .
Singular arcs . . . . . . . . . . . . . . . . . . . . . .
Time optimal control . . . . . . . . . . . . . . . . . .
Problems . . . . . . . . . . . . . . . . . . . . . . . .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
151
164
180
181
184
. .
. .
. .
. .
. .
. .
. . . . .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
187
190
196
200
205
216
221
222
235
246
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
249
252
253
254
255
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
259
260
267
272
283
B Eigenvalues assignment
B.1 Introduction . . . . . . . . . . . . . . . . . .
B.2 Assignment with accessible state . . . . . .
B.3 Assignment with inaccessible state . . . . .
B.4 Assignment with asymptotic errors zeroing
.
. .
A Basic background
A.1 Canonical decomposition . . . .
A.2 'Iiansition matrix . . . . . . . .
A.3 Poles and zeros . . . . . . . . .
A.4 Quadratic forms . . . . . . . .
A.5 Expected value and covariance
Bibliography .............
. .
. .
C Notation
147
149
. .
...
. .
. .
. .
. .
. .
Index ....................................
287
291
Preface
From the very beginning in the late 1950s of the basic ideas of optimal control,
attitudes toward the topic in the scientific and engineering community have ranged
from an excessive enthusiasm for its reputed capability of solving almost any kind of
problem to an (equally) unjustified rejection of it as a set of abstract mathematical
concepts with no real utility. The truth, apparently, lies somewhere between these
nico of Milan. It is, however, not a direct translation, since the material presented
here has been organized in a quite different way, or modified and integrated into new
sections and with new material that did not exist in the older work. In addition,
it reflects the author's experience of teaching control theory courses at a variety of
levels over a span of thirty years. The level of exposition, the choice of topics, the
relative weight given to them, the degree of mathematical sophistication, and the
nature of the numerous illustrative examples, reflect the author's commitment to
effective teaching.
The book is suited for undergraduate/graduate students who have already been
exposed to basic linear system and control theory and possess the calculus background usually found in any undergraduate curriculum in engineering.
The author gratefully acknowledges the financial support of MURST (Project
Identification and Control of Industrial Systems) and ASI (ARS-98.200).
Milano, Spring 2000
Chapter 1
Introduction
Beginning in the late 1950s and continuing today, the issues concerning dynamic optimization have received a lot of attention within the framework of
control theory. The impact of optimal control is witnessed by the magnitude
of the work and the number of results that have been obtained, spanning theoretical aspects as well as applications. The need to make a selection (inside
the usually large set of different alternatives which are available when facing
a control problem) of a strategy both rational and effective is likely to be one
of the most significant motivations for the interest devoted to optimal control.
A further, and not negligable, reason originates from the simplicity and the
conceptual clearness of the statement of a standard optimal control problem:
indeed it usually (only) requires specifying. the following three items:
(a) The equations which constitute the model of the controlled system;
(b) The criterion, referred to as the performance index, according to which
the system behaviour has to be evaluated;
(c) The set of constraints active on the system state, output, control variables, not yet accounted for by the system model.
The difficulties inherent in points (a) and (c) above are not specific to the
optimization context, while the selection of an adequate performance index
may constitute a challenging issue. Indeed, the achievement of a certain goal
(clearly identified on a qualitative basis only) can often be specified in a variety of forms or by means of an expression which is well defined only as far
as its structure is concerned, while the values of a set of parameters are on
the contrary to be (arbitrarily) selected. However, this feature of optimal control problems, which might appear as capable of raising serious difficulties,
frequently proves to be expedient, whenever it is suitably exploited by the de-
Chapter 1. Introduction
Chapter 1. Introduction
Problem 1 The initial state xo and the initial time to are given, while the final time
t f when the rendezvous takes place is free since it is the performance index J to be
minimized. Thus
ftf
J=J
dt.
to
Besides the constraint x(tf) = xb(t f) (which is peculiar to the problem), other requirements can be set forth as, for instance, u,,,, < u(t) < um which account for
limits on the control actions.
Problem 2 The initial state xo, the initial time to and the final time t f are given.
The final state is only partially specified (for instance, the final position is given,
while the final velocity is free inside a certain set of values) and the performance
index aims at evaluating the global control effort (to be minimized) by means of an
expression of the kind
rju(t)dt, ri > 0
fJ =
o
{=1
where ui is the i-th component of the control variable u. The peculiar constraint of
the problem is x(r) = xb(r), where the time r when the rendezvous takes place must
satisfy the condition to < r < t f and may or may not be specified.
Problem 3 This particular version of the rendezvous problem is sometimes referred
to as the interception problem. The initial state xo may or may not be completely
specified, while both the initial and final times to and t f are to be selected under
the obvious constraint to < t f < T. The final state is free and the performance
index is as in Problem 2. The peculiar constraint of the problem involves some of
the state variables only (the positions), precisely, e(r) = Cb(r), where the time 7
when interception takes place may or may not be given and satisfies the condition
to <r<tf.
Example 1.2 A typical positioning problem consists in transferring the state x of the
controlled system from a given initial point Po to the neigbourhood of the point Pf
with coordinates x fi. The transfer has to accomplished in a short time by requiring
control actions u of limited intensity. The problem is characterized by a given initial
Chapter 1. Introduction
state xo and time to, while the performance index is given the form
J ftf
=
i=1
In general the problem formulation may also include a set of constraints which set
bounds on some of the state variables, typically, velocities and/or positions: they can
be expressed by means of relations of the type w(x(t), t) < 0 to be satisfied along
the whole control interval.
Example 1.3 A frequently encountered problem in space applications is the so-called
attitude control problem which, in its simplest formulation, consists in orienting an
object (satellite) in a specified direction, starting from an arbitrary wrong orientation.
The problem can be stated in a way similar to the one in Example 1.2. Letting to
and t f denote the extreme values (given or to be determined) of the control interval,
a significant feature of this problem is the presence of integral constraints of the kind
rt
Jto
which emphasize the need that the required manoeuvre be accomplished by consum-
ing a quantity of fuel not exceeding a given bound: the function w quantifies the
instant consumption which is assumed to depend on the value of the control variable
u applied at that time.
Example 1.4 The requirement of keeping the state x of a system as close as possible
to x,i, after a disturbance has occurred, gives rise to an important class of optimal
control problems. The perturbation Su to be given to the control Ud has to be determined in such a way as to make Sx := x - xd small. Here Ud is the control which
generates Xd in unperturbed conditions. In general, it is also desired that Su be small:
hence these requirements can be taken into account by looking for the minimization
of the performance index
tf
J= I
Jto
rn
ti
gi6xli (t) +
1: riSu; (t)
i=1
Chapter 1. Introduction
(a) The equations which describe the dynamic behaviour of the system. They
are either differential or algebraic and their number is finite.
(b) The set of allowable initial states. It can entail the complete or partial
specification of the initial state as well as its freedom.
(c) The set of allowable final states. It can entail the complete or partial
specification of the final state as well as its freedom.
(d) The performance index. It is constituted by two terms which need not
Part I
Global methods
Chapter 2
The Hamilton-Jacobi theory
2.1
Introduction
(a) The controlled system which is a continuous-time, finite dimensional dynamic system;
(b) The initial state xo;
(c) The initial time to;
(d) The set of functions which can be selected as inputs to the system;
(e) The set of admissible final events (couples (x, t));
(f) The performance index.
(2.1)
(2.2)
and arnounts to saying that the initial state is given, at a known initial time.
The set S j C {(X, t) I X E R1, t > to } specifies the admissible final events, so
that the condition (a constraint to be fulfilled)
(x(tf), tf) E Sf
(2.3)
10
allows us to deal with a large variety of meaningful cases: (i) final time t f
given and final state free, (Sf = {(x, t) I x E Rn, t = T = given}; (ii) final
state given and final time free (Sf = {(x, t) I x = x f = given, t > to}); and so
on. The set of functions which can be selected as inputs to the system coincides
with cr, the space of rn-vector functions which are piecewise continuous: thus
the constraint
E f must be satisfied. Finally, the performance index J
(to be minimized) is the sum of an integral type term with a term which is a
function of the final event, namely
J=
ftf
Jt
(2.4)
E Qm of
functions defined on the interval [to, t fJ in such a way that eq. (2.3) is satisfied
and the performance index (2.4), evaluated for x(.) =
to, xo, u(.)) and
u(.) =
over the interval [to, t f' 1, takes on the least possible value.
The next section will present the main results of the so-called Hamilton-Jacobi
theory. In the given form they provide sufficient optimality conditions for only
the above problem, even though they could have been stated so as to encompass
a larger class of situations. Our choice has been made for the sake of keeping
the discussion at the simplest possible level.
2.2
11
E Q"' be a control
defined on the interval [to, t], t' > to. It is an optimal control relative to
(xo, to) for the system (2.1), the performance index (2.4) and the set S f if.
(i) (co(tf ; to, xo, u(-)), t f) E Sf;
(ii) At f, w('; to, xo, u(.)), u(.)) < J(tf, SP(.; to, xo, u(.)), u(.)).
E 91, defined on the interval [to, t f], is any admissible control relative to (xo, to) for the system (2.1) and the set S1.
Here
(2.5)
where A E Rn, is the hamiltonian function (relative to the system (2.1) and
the performance index (2.4)).
Definition 2.4 (Regularity of the hamiltonian function) The hamiltonian function is said to be regular if, as a function of u, it admits, for each x, t >_ to, A,
a unique absolute minimum uh(x, t, A), i.e., if
H(x, u (x, t, A), t, A) < H(x, u, t, A), Vu 54 uh (x, t, A),
(2.6)
Definition 2.5 (H-minimizing control) Let the hamiltonian function be regular. The function uh which verifies the inequality (2.6) is said to be the Hminimizing control.
Example 2.1 With reference to eqs. (2.1), (2.4), let f (x, u, t) = sin(x) + to and
l(x, u, t) = tx2 + xu + u2, so that the hamiltonian function (2.5) is H(x, u, t, A) =
tx2 + xu + u2 +,\(sin(x) + tu). This function is regular and uh (x, t, A)
(x + tA)/2.
A partial differential equation (on which the sufficient conditions for optimality
are based) can be defined, once the hamiltonian function has been introduced.
It concerns a scalar function V (z, t), where z E R" and t E R.
12
Definition 2.6 (Hamilton-Jacobi equation) Let the hamiltonian function be regular. The partial differential equation
aV (z, t)
at
+ H(z,
Theorem 2.1 Let the harniltonian function (2.5) be regular and u E SZ', defined on the interval [to, t f], tf > to, be an admissible control relative to (xo, to),
so that (x(t f), t f) E S, where x(.) := co(-; to, xo,
Let V be a solution
of eq. (2.7) such that:
Z_xo(t)
l*(z,u,t)
'Vat't) +H.(z,u,t,(aaz't)Y)
_
OV(t't)+l(z,u,t)+09VV(z,t)f(z)
u,t)
which is regular since it difffers from the hamiltonian function because of a term
which does not depend on u. Therefore it admits a unique absolute minimum in
u*(z, t) = uh(z, t, (aVaz't) Y)
(2.8)
(2.9)
since V is a solution of eq. (2.7). If U E SZrn, defined on the interval [to, t f], t f > to, is
any admissible control relative to (xo, to) and x is the corresponding state motion, so
13
that (x(t f), t1) E Sf, then, in view of assumption (a3), egs.(2.8), (2.9) and recalling
that 1* admits a unique absolute minimum in u*, it follows that
to
azt) )
), t)dt
Z=x(t)
to
(2.10)
dt
dV(x(t), t)
dt
+ aV (z, t)
f (x (t), u (t), t)
az Z=x(t)
(2.lla)
= l* (x (t), u(t), t) - (x(t), u(t), t),
_
_
aV (z, t)
at
IX=X(t)
OV(z, t)
at
z=x(t) +
aV (z, t)
f (x(t), UM' t)
az
Z=s(t)
(2.11b)
Observe that u(t), to < r < t < tf is obviously an admissible control relative
to (x(T), T) and that the initial event is absolutely arbitrary in the proof of
Theorem 2.1. Therefore, u(t), to < T < t < tf is optimal relative to (x (T), T)
14
u
R
where, for the sake of simplicity, T := 7r/w, w := 1/LC. Since both the final
state and time are given, the set S f is simply constituted by the single couple
([ 0 1 ]' , T). It is easy to check that the control
u (t) = -2C cos(wt),
which causes the state response
xi(t) _ -T-tsin(wt)
[sin(wt) + Wt cos(wt)],
x2 (t)
V([ 0 JT)=0
is V (Z, t) = al (t)zl + a2(t)z2 +,8(t), where
al (t) = 2RC
sin(wt),
7r
a2(t) = 2 RC a
,(3(t) =
cos(wt),
C2 [1 +
(t + sin(wt) cos(wt) )]
15
J=
2u2dt
to be minimized in complying with the requirement that x(1) = 1. The state response
aV
1((9V)2 =0.
az
at - -8
The functions
V3(z, t ) =
2-tz
2 2
4+2t
2-tz+ 2-t'
tz 2-2
solve the HJE and the relevant boundary condition V(1,1) = 0. However, we get
u h (x ( t), t ,
OV
(Z't)
1
Z=XO(t)
= = u t
uh (x ( t), t , OV2(z, t)
z=x(t)
uh(x
(t),
t,
aVaz, t) I
z=r o(t)
so that only the first two solutions allow us to conclude that u is optimal. Further,
observe that V1 and V2 supply the correct optimal value of the performance index,
namely, 2, while V3 gives the wrong value -2.
16
Corollary 2.1 Let the hamiltonian function (2.5) be regular and V be a solution
of the HJE (2.7) such that:
z=x(t)
), t),
x(to) = xo
(xc(T),T) E Sf,
then
z=xc (t)
Example 2.4 Consider a problem similar to the one discussed in Example 2.3: the
final state is free and the term (x(1) - 1)2 is added to the performance index. The
set of admissible final events is thus Sf = {(x, t) I x = free, t = 1}. The hamiltonian
function and the relevant l-IJE remain unchanged, while the boundary condition
becomes V(z, 1) _ (z - 1)2, dz. A solution of the HJE which satisfies this condition
is
1)2
2( z
V (z' t)
iV(z,t)I z-x
l
C7
1-x
3-t
and x,,(t) = t/3. Thus a(t) = 1/3 is an optimal control. The minimal value of the
performance index can be computed by evaluating V at the initial event, yielding
o
=2
Example 2.5 Consider the optimal control problem defined by the system x = u with
J=
ftf
u2
(1 +
)dt.
The final time is free, so that the set S f is constituted by all the couples (1, T), r > 0.
c9V11(OV)2=0.
dt
2 Oz
17
11 J(t f)
tf
-101.
(2 +u2)dt+ f(tf)x2(tf)}
J = 2{
0
where f(t1) := 1 + t f. Thus the set Sf is constituted by all the couples (x, t) with
x E R and t > 0. The problem is solved by resorting to point (tn) of Theorem
2.1 and Corollary 2.1, i.e., by first assuming that t f is given, their computing the
optimal value of the performance index corresponding to this final time, and finally
performing a minimization with respect to t f. Consistent with this, the term 1/2 in
the performance index is first ignored, then it is easy to check that
V( z,
Z' t) = 0.51
+ f (tf)(tf - t)
.f (tf )
1 + f(tf)(tf - t)
x= _
f(tf)
1 + f (tf)(tf - t)
x, x'(0) = 1.
18
2.3
Problems
Problem 2.3.1 Find an optimal control law for the problem defined by the system
i(t) = A(t)x(t) + B(t)u(t), x(to) = 0 and the performance index
f [h'(t)x(t) + Zu'(t)R(t)u(t)J dt
where R'(t) = R(t) > 0, to and t f are given, while the final state is free. Moreover,
A(t), B(t), R(t) and h(t) are continuously differentiable functions.
Problem 2.3.2 Find an optimal control law for the problem defined by the system
i(t) = Ax(t) + Bu(t), x(O) = 0, x(t f) = x f and the performance index
J=
ft' u (t)Ru(t)dt
with R' = R > 0. In the statement above, t f and x f are given and the pair (A, B) is
reachable.
Problem 2.3.3 Discuss the existence of a solution of the optimal control problem
defined by the first order system i(t) = u(t), x(0) = xo, Sf = {(x,t)I x E R", t > 0}
and the performance index
cf
J= f
[tk+X(t)+U2(t)]dt
Problem 2.3.4 Find an optimal control law for the problem defined by the system
:x(t) = A(t)x(t) + B(t)u(t), x(to) = 0 and the performance index
t
J = 1 f f u(t)R(t)u(t)dt + h'x(t1) + tf
2
where R'(t) = R(t) > 0, to is given, while the final state and time are free. Moreover,
A(t), B(t) and R(t) are continuously differentiable functions and h is a known ndimensional vector.
Problem 2.3.5 Find a way of applying the Hamilton-Jacobi theory to the optimal
control problem defined by the system i(t) = f (x(t), u(t), t) and the performance
index
J=
where f and l are continuously differentiable functions. Here x(tf) and to are given,
while x(to) E R" and t f > to.
2.3. Problems
19
Problem 2.3.6 Find a solution of the optimal control problem defined by the first
order system x(t) = u(t) and the performance index
t
J= f f [t4+u2(t)]dt
,J to
where x(tf) and x(to), x(tf) 34 x(to), while to and tf > to are free.
Chapter 3
The LQ problem
3.1
Introduction
(3.1)
x(to) = xo
where xo and to are given, find a control which minimizes the performance
index
t
J = 11f f [x'(t)Q(t)x(t) + u'(t)R(t)u(t)]dt +
(t f)Sx(t f)}.
(3.2)
The final time t f is given, while no constraints are imposed on the final state
are continuously differentiable
x(tf). In eqs. (3.1), (3.2) A(),
Q(.),
functions and Q(t) = Q(t) > 0, R(t) = R'(t) > 0, Vt E [to, t f], S = S' > 0.
Before presenting the solution of this problem, it is worth taking the time
to shed some light on its significance, to justify the interest, both theoretical
and practical, that the topics discussed in this book have received in recent
years. Consider a dynamic system E and denote by xn its nominal state
response, e.g. the response one wishes to obtain. Moreover, let
be the
corresponding input when E = En, that is when the system exhibits these
nominal conditions. Unavoidable uncertainties in the system description and
disturbances acting on the system apparently suggest that we not resort to an
22
open loop control scheme but rather to a closed loop configuration as the one
shown in Fig. 3.1. The controller ER has to be such that system r supplies, on
the basis of the deviation Sx of the actual state from x, , the correction Su to be
given to u,, in order to make Sx small. The convenience of not requiring large
corrections Su suggests restating the objective above in terms of looking for
the minimization of a (quadratic) performance index with a structure similar
to the one given in eq. (3.2). Consistently, if the deviations Sx and Su are
actually small and E is described by th = f (x, u, t) with f sufficiently regular,
the effect of On on 6x can be evaluated through the (linear) equation
bx = df (x, un(t), t)
OX
Sx +
z=xn(t)
Of (xn(t), u, t)
On.
On
u=un (t)
3.2
The following result holds for the LQ problem over a finite horizon.
Theorem 3.1 Problem 3.1 admits a solution for any initial state xo and for
any finite control interval [to, t f] . The solution is given by the control law
u'(x, t) _ -R-l(t)B'(t)P(t)x
(3.3)
23
(3.4)
P(tf) = S.
(3.5)
Point 1) It is easy to verify that the harniltonian function is regular and that uh =
-R-1B'A. Indeed, by noticing that
Point 2) By letting r := P' it is straightforward to check that I' satisfies the same
differential equation and boundary condition (S is symmetric).
It is then easy to verify that the given function V solves the HJE with the boundary
condition V (z, t f) = z'Sz/2, Vz, if P solves eqs. (3.4), (3.5). In doing this it is
expedient to notice that
az
Point 4) The continuity assumptions on data imply that: (i) a solution of eqs. (3.4),
(3.5) exists and is unique in a neighbourhood of t f; (ii) a solution of eqs. (3.4), (3.5)
cannot be extended to to only if there exists a time within the interval [to, t f] where
at least one element of it becomes unbounded. Let t < t f be the greatest t E [to, t f )
where the solution fails to exist. Then Problem 3.1 can be solved on the interval
(t, t f] and it follows that
J(x(r), r) = 2x (T)P(T)x(T)
0, T E (t, t ff]
24
where the inequality sign is due to the sign of matrices Q, R and S. Since x(r) is
arbitrary it follows that P(r) > 0, T E (t, t f], which in turn implies that if
lien Ipij(T)I = 00,
r- !+
then
r-.1+
where pij is the (i, j) element of P. Indeed, should this not happen, it would then
follow that
Theorem 3.1 supplies the solution to Problem 3.1 in terms of an optimal control
law: if, on the other hand, the optimal control u is sought corresponding to
a given initial state, it suffices to look for the solution x of the equation
u(t) = -R-1(t)B'(t)P(t)x(t).
Example 3.1 Consider the electric circuit shown in Fig. 3.2 where x is the current
flowing through the inductor (L > 0) and u is the applied voltage. At t = 0 the value
of the current is xo and the goal is to determine u so as to lower this current without
wasting too much energy in the resistor of conductance G > 0. Moreover, a nonzero
final value of x should explicitly be penalized. According to these requirements an
LQ problem can be stated on the system t = u/L, x(0) = xo and the performance
index
J=
J0
25
P(t)
TQ +T + (a - T)e2tT1
o+T-(a-T)e2
t=1
x(t)-
xo
(a + T)e: - (a - T)etT2
When a increases it should be expected that x(1) tends to 0 and the control becomes
more active. This is clearly shown in Fig. 3.3 where the responses of x and u are
reported corresponding to some values of a, xo = 1, L = 1 and G = 1. Analogously,
it should be expected that increasing values of G would cause the control action to
vanish. This outcome is clearly illustrated in Fig. 3.3 where the responses of x and
Example 3.2 Consider an object with unitary mass moving without friction on a
straight line (see Fig. 3.4). A force u acts on this object, while its initial position and
velocity are x1(0) 0 0 and x2(0), respectively. The object has to be brought near
to the reference position (the origin of the straight line) with a small velocity, by
applying a force as small as possible. The task has to be accomplished in a unitary
time interval. This control problem can be recast as an LQ problem defined on the
system it = X2, x2 = u with initial state x(O) = xo # 0 and the performance index
(to be minimized)
J = f'u2dt+4(1)+x(1).
The solution of the relevant DRE can be computed in an easy way (see Problem
3.5.1) yielding
P (t) =
12(2 - t)
a(t) l 6(t2 - 4t + 3)
6(t2 - 4t + 3)
-4(t3 - 6t2 + 9t - 7)
where a(t) := t4 - 8t3 + 18t2 - 28t + 29, from which all other items follow.
Remark 3.1 (Coefficient 2) It should be clear that the particular (positive) value
of the coefficient in front of the performance index is not important: indeed, it can
be seen as a scale factor only and the selected value, namely 1, is the one which
simplifies the proof of Theorem 3.1, to some extent.
Remark 3.2 (Uniqueness of the solution) Theorem 3.1 implicitly states that the solution of the LQ problem is unique. This is true provided that two input functions
which differ from each other on a set of zero measure are considered equal. In fact,
assume that the pair (x*, u*) is also an optimal solution, so that i* = Ax* + Bu*,
x*(to) = xo. By letting
v := u* + R-1B'Px*
26
Figure 3.3: Example 3.1: responses of x and u for some values of a and G.
xl
27
where P is the solution of the DRE (3.4) with the boundary condition (3.5), it follows
that
x* = (A - BR-1B'P)x* + By.
The expressions for u* and 13v which can be deduced from these two relations can
be exploited in computing the value of the performance index corresponding to the
solution at hand. By recalling eqs. (3.4), (3.5), we obtain
J(x*
u*
2
1
fto
to
v'Rvdt + J.
v'Rvdt = 0.
0 almost
Remark 3.3 (Linear quadratic performance index) A more general statement of the
LQ problem can be obtained by adding linear functions of the control and/or state
variables into the performance index, which thus becomes
J=1
2
rtf
Jto
w(tf)=in.
28
v = 2 (B'w + k)'R-1(B'w + k)
satisfying the boundary condition
V(tf) = 0.
The proof of these statements can easily be done by checking that if P, w and v
satisfy the differential equations above, then
J=
J0
2
1
[hxA- 2 ]dt+Qx221)
29
where h E R and a > 0 are given. By exploiting Remark 3.3 we find uC (x, t) _
a + 1 - at'
w(t) = 2a, [a + 1 - at -
a + 1 - at
[.
ZR-1Z')x
+ v'Rv :=
Qx + v'Rv
respectively. The original problem has been transformed into a customary LQ problem where, however, matrix Qc might no longer be positive semidefinite. Hence the
existence of the solution of the DRE for arbitrary finite intervals is not ensured unless
a further assumption of the kind
L
LQ
'
(which is equivalent to the three conditions Q > 0, Q, > 0 and R > 0) is added.
Anyway, if the DRE
Example 3.4 Consider the LQ problem with a rectangular term defined by matrices
A=I 0
]B=[?]Q=[ 1
J,R=1z=[],s=I
30
IL
a(t)
7(t) J
where
a(t) := 4 + 7r - 2t + sin(2t),
,Cu(t) := 1 + cos(2t),
7(t)
4 + 7r - 2t - sin(2t),
-'62(t)
a(t)7(t)
i9(t)
Remark 3.5 (Sign of Q and S) The assumptions on the sign of Q and S are no
doubt conservative though not unnecessarily conservative. In other words, when these
assumptions are not inet with, there are cases where the DRE still admits a solution
and cases where the solution fails to exist over the whole given finite interval. These
facts are made clear in the following examples.
Example 3.5 Consider the interval [0, t1 ], t f > 0 and the scalar equation
1'=-2P+P2+1
with boundary condition P(t f) = S > 0. Notice that Q = -1. In a neigbourhood of
t f the solution is
P(t) =
1+(1-S)(t-tf-1)
1 + (1 - S)(t - tf)
which cannot be extended up tot = 0 if 0 < S < 1 and tf > (1 - S)-m. Indeed,
letting r := t f - (1 - S)-' > 0, we get
limn jP(t)I = 00.
tr+
P=2P+P2+1
with boundary condition P(t f) = S > 0 admits a solution for each interval [0, t f]
even if Q = -1. Such a solution is
P(t)
+
1 - (1 + S)(t - tf)
Observe that if
t
>
1+S
then P(0) < 0 so that the optimal value of the performance index of the LQ problem
which gives rise to the DRE at hand is negative unless x(0) = 0, consistent with the
nonpositivity of Q. Finally, consider the scalar equation
P=P2-1
31
with boundary condition P(tf) = S, -1 < S < 0. For each finite t f it admits
P (t)
S + 1 + (S S + 1 - (S -
1)e2(t-t f)
1)e2(t-t f)
y(t) = C(t)x(t)
and considering the performance index
J'f
J=1
2
(3.6)
P = -PA - A'P +
PBR-1 B'P
- C'QC
(3.7)
with boundary condition 1'(t1) = C'(tf)SC(t f), while w is the solution of the (linear)
differential equation
(3.8)
with boundary condition w(tf) _ -C'(tf)S(tf). Finally, the optimal value of the
performance index is
(3.9)
v=2
(w'BR-1 B'w
- 'Q)
(3.10)
with boundary condition v(t f) = '(t f)S(t f)/2. These claims can easily be verified
by exploiting Corollary 2.1 and checking that V(z, t) := lz'P(t)z + w'(t)z + v(t)
is an appropriate solution of the relevant HJE. Alternatively, one could resort to
Remark 3.3.
32
Some comments are in order. First observe that since the boundary condition
for w is set at the final time, the value of w at any time t depends on the whole
history of . Therefore, the value of the optimal control u(t) at a generic instant t
depends on the future behaviour of the signal to be tracked. Second, if
0, then
the solution above coincides with the one of the customary LQ problem, so that, in
particular, it can be concluded that P(t) > 0, Vt. Third, since the optimal value of
the performance index is a linear-quadratic function of the initial state, the question
whether there exists all optimal initial state is not trivial. The nonnegativity of the
performance index together with P > 0 imply that the set of optimal initial states
is given by
{x01
8J(xo'to)
dx0
= 0}
so that any vector of the form x' :_ -Pt(to)w(to) + x is an optimal initial state.
Here Pt(to) is the pseudoinverse of P(to) and xn is any element of ker(P): thus, the
optimal initial state is not unique unless P > 0. Finally, particular attention is given
to the case where the signal It is the output of a linear system, namely
1v(t) = F(t)19(t),
p(t) = 4t)19(t),
19(to) =190.
Matrices F and H are continuous. In this particular case the optimal control law is
(A - BR-1B'P)'h + C'QH
with boundary condition 1,(t f) = -C'(t f)SH(t f). The optimal value of the performance index is
v = 219'(FBR-1B'F - H'QH)19
33
UC T
iX
J(xo) =
fi
J0
where
1 - Qep(c-i)
t
e
w(t)
1-f3eP
where
W(O) = p
+eP
2
,Q-ep
-1
t-2
l - /Je
TI)]
34
The transients of x corresponding to some values of p and a are shown in Fig. 3.7
when the optimal control law is implemented. The system response is closer to the
desired one for large values of a and/or small values of p. Finally, as could easily be
forecast, the optimal value x(0) for the initial state is 1.
Example 3.7 Consider again the system described in Example 3.2, namely an object
with unitary mass moving without friction along a straight line with initial position
x1(0) and velocity x2(0). It is desired that its position be close to (t) := sin(t) for
0 < t < 1 by applying a force as small as possible. This problem can be seen as an
optimal tracking problem by adding to the system equations xl = x2, x2 = u, y = xl
the performance index
where p > 0 and a > 0 are two given parameters. The system responses corresponding
to some values of such parameters are shown in Fig. 3.8: the initial state is either
x(0) = 0 or x(0) = xo. The recorded plots agree with the forecast which is most
naturally suggested by the selected values of the two parameters.
Remark 3.7 (Not strictly proper system) The tracking problem can be set also for
systems which are not strictly proper, that is systems where the output variable is
given by
ff
t
J=1
where both matrices Q and It are symmetric, positive definite, continuously differentiable and is a given continuous function. Note that the adopted performance
index is purely integral: this choice simplifies the subsequent discussion without substantially altering the nature of the problem. By exploiting Remarks 3.3 and 3.4 we
easily find that the solution can be given in terms of the control law
tf
J"(xo) = x'P(to)xo + w'(to)xo + v(to) + 2 fto ( t)Q(t)(t)dt
0.9
35
t Y.
AO)=0
p=0.1,a=10
p=1,a=10
Figure 3.8: Example 3.7: responses of y for x(0) = 0 and some values of p
and a, or a = 0, p = 0.1 and x(0) = 0, x(0) = x$.
36
Figure 3.10:
is the output variable. Letting the electric parameters of the circuit be unitary for
the sake of simplicity, the relevant equations are xr = -XI + X2, x2 = -XI - X2 + u,
1, to = 0 and t f = 1. The
y = -X2 + U. In the performance index Q = R = 1,
transient responses of y and u are shown in Fig. 3.10 corresponding to x(0) = 0
and x(0) = x', the latter value of the initial state being the optimal one. Finally,
J(0) = 0.11 and J(xo) = 0.003.
Remark 3.8 (Penalties on the control derivative) Frequently it is also convenient
to prevent the first derivative of the control variable from taking on high values.
't'his requirement can easily be cast into the problem formulation by adding to the
integral part of the performance index the term it'(t)R(t)it(t). If the matrix R is
positive definite and continuously differentiable, the problem can be brought back
to a standard LQ problem (to which Theorem 3.1 can be applied) by viewing u as a
further state variable satisfying the equation
u(t) = v(t)
and letting v be the new control variable. Thus the given problem is equivalent to
the LQ problem defined on the system
.1= f
tf
to
where
A(t) :_ [got)
B :_ [ 0 ] , Q(t) :_
13(0t)
1
[ Q0(t)
R(t) J
s := [
S
0
37
P = -PA - A'P + PBR-1B'P with boundary condition P(tf) = S. Note that the resulting controller, namely the
device which computes the actual control variable u on the basis of the actual state
variable x, is no longer a purely algebraic system as in the standard LQ context but
rather a dynamic system (see Fig. 3.11) the order of which equals the number of the
control variables. Finally, it is obvious how to comply with requirements concerning
higher order derivatives of the control variable.
Example 3.9 Consider the LQ problem defined on a first order system with A = 0,
Remark 3.9 (Stochastic control problem) The LQ problem can be stated also in a
stochastic framework by allowing both the initial state and the input to the system
to be uncertain. More precisely, assume that the controlled system is described by
38
Figure 3.12:
where Q > 0, S > 0 and R > 0. It is not difficult to guess that if the state can
be measured then the solution of the problem is constituted by the same control
law which is optimal for its deterministic version, namely the law defined by eqs.
(3.3)..--(3.5). This claim can easily be proved if only the linear optimal control law has
to be found, even if it holds also in the general case under the assumptions adopted
here on v and xo. In fact, corresponding to the control law u(x, t) = K(t)x the value
taken by the index Js is
<IS = tr[PK(to)(rlo + xoxa) +
rt
Jto
f VPK(t)dt]
3.3
39
mass m which can move without friction along a straight trajectory. The first and
second of them are linked to the third one through a spring with stiffness k. Thus,
if xi, i = 1, 2,3 and x4, X5, X6 denote their positions and velocities, respectively, the
model for this system is
X1 = X4,
x2 = X5,
x3 = X6,
x4 = -x1 + X3,
x5 = -X2 + X3,
x6 =u+x1+x2-2x3
where, for the sake of simplicity, in = 1, k = 1 while u is the force which can be
applied to the third object. The goal is to make the first two objects move as close
as possible to each other without resorting to large control actions. If the control
interval is infinite, these requirements can adequately be expressed by the criterion
J=
e2=-E1
so that
61(t) = -2(0)Sin(t) + E1(0) cOS(t),
E2(t) = 62(0) cos(t) - E1(0) sin(t).
f(e(O) + E2(0))dt +
00
u2dt
40
Figure 3.13:
x1
x2
x3
fails to exist because the performance index (explicitly) depends upon state
variables which are not initially zero but belong to the uncontrollable part of
the system, this part being not asymptotically stable. As shown in the sequel,
the new assumption is sufficient, together with the previous ones (continuity
and sign definiteness), to make the solution exist for each initial state.
Problem 3.1 will now be discussed for t f = oo and S = 0. This particular
choice for S is justified mainly by the fact that in the most significant class of
LQ problems over an infinite horizon, the state asymptotically tends to zero
and a nonintegral term in the performance index would be useless. The LQ
problem over an infinite horizon is therefore stated in the following way.
(3.11)
x(to) = x0
where x0 and to are specified, find a control which minimizes the performance
index
J=2
[x (t)Q(t)x(t) + u (t)R(t)u(t)jdt.
functions; further Q(t) = Q'(t) > 0, R(t) = R'(t) > 0, Vt > to.
A solution of this problem is provided by the' following theorem.
Theorem 3.2 Let system (3.11) be controllable for each t > to. Then Problem
3.2 admits a solution for each initial state x0 which is specified by the control
41
law
u(x, t) = -R-1(t)B'(t)P(t)x
(3.12)
(3.13)
where
(3.14)
(3.15)
J(xo, to) =
(3.16)
2xoP(to)xo.
Proof. The proof of this theorem consists of four steps: 1) Existence of P; 2) Check
that P solves eq. (3.14); 3) Evaluation of J when the control law (3.12) is implemented; 4) Optimality of the control law (3.12).
Point 1) Controllability for t > to implies that for each x(t) there exists a bounded
control
E St' defined over the finite interval [t, T] (in general, T depends on x(t)
and t) such that x(T) = 0, where
t, x(t),
Letting fi(r) = 0, rr > T
it follows, for to < t < t f < oo,
2x'(t)P(t, t f)x(t) = J(x(t), t, t f)
< J(x(t), t, t f, x('),
J(x(t), t, oo, x(.), fi(.))
= J(x(t), t, T,
00.
These relations, which imply the boundedness of x'(t)P(t, t f)x(t) for each finite t f,
follow from (i) the existence of the solution of the LQ problem over any finite horizon;
(ii) the nonnegativity of the quadratic function in the performance index; (iii) x(rr) =
0, fi(r) = 0, r > T; (iv) the boundedness of fi(.) and 1(.). Let now to < t < tfi < tf2,
r1) :=
t f2,
rl) := Q(C) + K'(C,
Ac(e,,) :=
t <_ C
77) and
77),
42
x'(t)P(t, t f2)x(t) = f
fttf2
fI
> x'(t)P(t, t fr)x(t)
Point 2) Let
t f, S) be the solution of eq. (3.14) with P(t f, t f, S) = S and T < t f.
Then, P(t, t f, 0) = P(t, T, P(T, t f, 0)) so that
P(t) = urn
P(t, t f, 0) = lien P(t, T, P(-r, t f, 0))
t f-oc,
t f o0
= P(t, T, liiii P(7-, t f, 0)) = P(t, T, P(T)).
tf -00
Indeed the solution of the DRE depends continuously on the boundary condition and
it is therefore possible to evaluate the limit inside the function. Thus P(t) equals,
for each t, the solution of eq. (3.14) with boundary condition, at any instant T, given
by P(T): in other words, it satisfies such an equation.
Point 3) Letting x be the solution of the equation
dx
= (A - BR-rB'P)x
-R-rOB'OP( )x( ), we
obtain
L
2J(x(t), t, tf,
x(
)[Q(
)+
= f I :TC'(e)[-P(6)
AV)PW
+2P(e)B(6)R-1(6)B'(6)P(6)Jx(6)d6
= x'(t)P(t)x(t) - x (tf)P(tf)x(tf)
< x(t)P(t)x(t)
having exploited Point 2. For the inequality sign, note that for to < t f < Tf and for
each z E R, z'P(t f,Tf)z is nonnegative since it equals twice the optimal value of
the performance index for the LQ problem over the interval [t1, Tf] and initial state
43
lmx'(tf)P(tf'Tf)x(tf)
= x'(tf)1'(tf)x(tf) >_ 0.
Tf
Therefore, J(x(t), t, t f,
which is apparently a nondecreasing function of t f,
is also bounded, so that there exists its limit as t f - oo and
lim J(x(t), t, tf, x('), u(.)) <
cf
x'(t)P(t)x(t)
J.
(X (t), t' tf
1 X'(t)P(t)x(t).
tf
This relation together with what has previously been found allows us to conclude
that
J(x(t), t, oo,
1 X'(t)P(t)x(t).
Point 4) By contradiction assume u*(.) 0
n J(x(t), t, tf, x* (.), u* (.)) < fizn J(x(t), t, tf, x('), u(.)),
c
-00
> lim
J(x(t), t, t f, x*
tf
u*
J(x(t), t, t f)
Example 3.11 Consider the LQ problem defined by the first order system
x
x= j2-+U
J=
J002
+t2u2)dt.
44
Figure 3.14:
We find
P(t t f )
>
- (+f)
t-t
1 + (1 +
as solution of eqs. (3.14), (3.15) so that
P(t) = (1 + )
1 - e- 2
1+(1+V2-)21 e-a
It is simple to check that P solves the DRE relevant to the problem at hand and
that it is positive definite. In Fig. 3.14 the responses of x are shown corresponding
to x(1) = 1 and control laws which are optimal for some values of the parameter
t f. Note the way such time responses (which are indeed optimal for control intervals
ending at t f) tend to the response resulting from adopting the control law defined
by P.
Remark 3.10 (Uniqueness of the solution) The discussion on the uniqueness of the
solution (which has been presented in Remark 3.2 with reference to a finite control
interval) applies to the present case, provided a suitable limit operation is performed.
3.4
Due to the importance of the results and the number of applications, the LQ
problem over an infinite horizon when both the system and the performance
index are time-invariant, that is when A, B, Q, R are constant matrices is
particularly meaningful. The resulting problem is usually referred to as the
optimal regulator problem and apparently is a special case of the previously
45
(3.17)
J=
[x'(t)Qx(t) + u'(t)Ru(t)]dt.
(3.18)
Observe that, thanks to time-invariance, the initial time has been set to 0
without loss of generality. The following result holds for the problem above.
Theorem 3.3 Let the pair (A, B) be reachable. Then Problem (3.3) admits a
solution for each xo. The solution is specified by the control law
u8(x) _ -R-'B'Px
(3.19)
(3.20)
J(xo) = 1 xoPxo.
(3.21)
Proof. Obviously, it suffices to show that the limit of the solution of the DRE is
constant, since the limit itself is a solution of the DRE (see the proof of Theorem
3.2), then it must solve the ARE as well. From the time-invariance of the problem
it follows that the optimal values of the performance index (3.18), when the control
intervals are [t1, oo) or [t2, cc), must coincide if the initial state is the same. Thus
xoP(tl)xo = xoP(ta)xo, Vtl, t2, xo
which implies that
=cost.
46
Figure 3.15:
Remark 3.11 (Control in the neighbourhood of an equilibrium point) From a practical point of view the importance of the optimal regulator problem is considerably
enhanced by the discussion at the beginning of this chapter. Indeed equation (3.17)
can be seen as resulting from the linearization of the controlled system about an
equilibrium state, say ,,.. For this system the state C is desired to be close to such a
point, without requiring, however, large deviations of the control variables 77 from the
value 17,,, which, in nominal conditions, produces C,,.. In this perspective, x and u are,
with reference to the quoted equation, the state and control deviations, respectively,
and the meaning of the performance index is obvious. Further, should the control
law (3.19) force the state of system (3.17) to tend to 0 corresponding to any initial
state, then it would be possible to conclude that the system has been stabilized in
the neighbourhood of the considered equilibrium.
Example 3.12 Consider the system shown in Fig. 3.15 where x denotes the difference
between the actual and the reference value h of the liquid level, while u is the difference between the values of the incoming flow and qu, the outgoing flow. Assuming
constant and unitary the area of the tank section, the system is described by the
equation x = u, while a significant performance index is
J=+pu2)dt
where p >0 is a given parameter. The system is reachable and
'(t' tI)
1 - e 7P_
1+
(t-tf)
2 e7P(t-tf)
so that
P = V/P-.
It is easy to check that P, apparently positive, satisfies eq. (3.20) which, however,
admits also the (negative) solution -.,fp-. Thus P could have been determined by
resorting to the ARE only. The optimal system motion can easily be evaluated after
the control law u8(x) = -x/,/ has been computed. It results that x(t) = e-t'v xo.
In Fig. 3.16 the responses of x and the related optimal control u are plotted for
xo = 1. Note that the system response becomes more rapid for low values of p at the
price of a more demanding control action.
47
Example 3.13 Consider an object with unitary mass which can move on a straight line
subject to an external force u and a viscous friction. Letting the coefficient relevant
to the latter force be unitary, the system description is xi = X2, i2 = -X2 + u, where
xl and x2 are the object position and velocity, respectively. The adopted performance
index is
J =+
j(xu2)dt.
The system is reachable and the ARE admits a unique symmetric, positive semidefinite solution, namely
P=I
vf3-1 - 1 l1'
so that the optimal control law is uCe (x) = -xi - (/ - 1)x2. If the term 33x2(t) is
introduced into the performance index with the aim of setting a significant penalty on
the deviations of the velocity from 0, an ARE possessing a unique symmetric positive
semidefinite solution results and the control law u's(x) = -XI - 5x2 is found. In Fig.
3.17 the state responses are reported for xi(0) = 1, x2(0) = 1 and both choices of
Q, the first one corresponding to the subscript 1: the consequences of the changes
in matrix Q are self explanatory.
Example 3.14 Consider Problem 3.3 with
A=10
]B=[?]Q=[ 0 ]R=1.
The pair (A, B) is reachable and the ARE admits two positive semidefinite solutions
Pl -I
0
0
j'2
+ 1)2
=[ 2(v
2(v+1)
2(' -I-1)
2+V2-
48
Figure 3.17:
so that it is not possible to determine the optimal control low (at least by exploiting
the up to now acquired results) without resorting to integration of the DRE. We find
0
P(t, t f) _
21-e2f(t-tr)
1+e2,/2-(t-tf)
Example 3.14 has shown that more than one symmetric positive semidefinite
solutions of the ARE may exist: one of them is P, the matrix which defines the
optimal control law (if it exists). The following result allows us to characterize,
at least in principle, such a particular solution, provided that the set
Theorem 3.4 Assume that the pair (A, B) is reachable and let Pa be any element of the set P. Then Pa - P > 0.
Proof. Let Jal(xo) be the value of the performance index when the control law
ua(x) := -R-1B'Pa.x is enforced. Then, by proceeding as in Point 3) of the proof of
Theorem 3.2, it follows that
Ja.
ZxOl axo -
where xa, is the system motion under the control law ua. By an optimality argument
we can conclude that
49
Example 3.15 Consider the optimal regulator problem defined in Example 3.14: the
set P is made out of two elements only and it is easy to check that
P 2 - P l L
2(, + 1)2
2(v'-2 + 1)
2(,/2-+l)
> 0.
J
Theorem 3.3 supplies the solution to Problem 3.3 under a reachability assumption for the pair (A, B). Apparently, this assumption is unnecessary in general.
Indeed assume that the controlled system is constituted by two subsystems Si
and S2i independent of each other. One of them, Sl, is reachable while the
control has no influence on the second one, so that the whole system is not
reachable. However, if only the control and the state pertaining to subsystem
Sl appear in the performance index, then the problem can obviously be solved.
Thus it is worth seeking the least restrictive assumption which ensures the existence of the solution. The question is made clear by Theorem 3.5 which relies
on the following lemma where reference is made to a factorization of matrix
Q (see Appendix A, Section A.4).
In view of this lemma the notion of observability for the pair (A, Q) can be
introduced in a sharp way, simultaneously allowing us to state the following
theorem which gives a necessary and sufficient condition for existence of the
solution of the optimal regulator problem. Such a condition is fairly evident
if reference is made to the canonical decomposition (see Appendix A, Section
A.1) of system (A, B, C), where C is any factorization of Q: for this reason the
proof is not given here.
Theorem 3.5 Problem 3.3 admits a solution for each initial state xp if and only
if the observable but unreachable part of the triple (A, B, Q) is asymptotically
stable.
50
Example 3.16 Consider the problem presented in Example 3.10: the system is not
reachable and a factorization of Q is
l -1
L0
0
0
-1
0
0
so that the unreachable but observable part is precisely the one described by the
two state variables E1 and E2. The eigenvalues of the relevant dynamic matrix are
both zero so that this part is not asymptotically stable. Consistent with this result,
Problem 3.3 was found not to admit a solution for each initial state.
Remark 3.12 (Decomposition of the ARE) If the triple (A, B, C) is not minimal, the
ARE to be taken into account simplifies a lot. In fact, the canonical decomposition
of the triple induces a decomposition of the equation as well, thus enabling us to
set some parts of its solution to zero. More precisely, assume that A, B and C are
already in canonical form, namely
A2
A5
0
0
Al
A-_
0
0
A4
A6
A8
As
A3
0
A7
0
C=[ U
C1
' B-
B1
B2
0
0
C2 ]
P:=
P2
P3'
P4'
P2
P5
1's
P7
P3
P4
P6
1'a
P7
P9
P9'
Pio
P5
P7
P?
P1o
where Pi, i = 5,7, 10, are the limiting values (as t f -> oo) of the solutions of the above
51
the algebraic equations which are obtained from the differential ones by setting the
derivatives to zero and substituting for P5 and P7 their limiting values. The next
B2R_1B2P5
section will show that P5 is such that A5 is stable (all its eigenvalues
have negative real part): this fact implies that the two linear algebraic equations
which determine P7 and P10 admit a unique solution. Indeed both of them are of
the form X F + GX + H = 0 with F and G stable. Thus the solution of Problem
3.3 (when it exists relative to any initial state) can be found by first computing P5,
solution of the ARE (in principle, by exploiting Theorem 3.4, actually by making
reference to the results in Chapter 5) and subsequently determining the (unique)
solutions P7 and P1o of the remaining two linear equations.
Finally, if the given triple (A, B, C) is not in canonical form (resulting from a
change of variables defined by a nonsingular matrix T) the solution of the problem
relies on P0,.:= T'PT. The check of this claim is straightforward.
Remark 3.13 (Tracking problem over an infinite horizon) The optimal tracking problem presented in Remark 3.6 with reference to a finite control interval, can be stated
also for an infinite time horizon. This extension is particularly easy if the problem
at hand is time-invariant (the matrices which define both the system and the performance index are constant) and the signal to be tracked is the output of a linear
time-invariant system. Under these circumstances the optimal control problem is
specified by
00
J=
f{[y'(t) - '(t)]
As in Remark 3.6, Q = Q' > 0 and R = R' > 0. Further, due to self-explanatory
motivations, the pair (F, H) is assumed to be observable so that if xo and i9o are
generic though given, asymptotic stability of F must be required. Under these circumstances it is not difficult to verify that the solution of the problem exists for each
xo and 19o if and only if the observable but unreachable part of the triple (A, B, C)
is asymptotically stable. The solution can be deduced by noticing that the problem
at hand can be given the form of Problem 3.3 provided that the new system
=Wc+Vu
and the performance index
J=
J0
u' Rul dt
52
] , W .- [
I , V := Ir B0 1J
:= I
C'QC -C'QH
-H'(C
H'QH
where P1 solves the ARE 0 = PA + A'P - PBR-'BP + C'QC and is such that
Pi = liin1
P(t, t f), P(t, t f) being the solution of the DRE P = -PA - A'P +
PBR-1 B'P - C'QC satisfying the boundary condition P(t f, t f) = 0, while P2 solves
the linear equation 0 = PF+ (A - BR-1 B'Pl )'P - C'QH. Finally, the optimal value
of the performance index is J(xo,19o) = x0Pixo + 219(P2xo + 19'P319o, where P3 is
]'. Thus
J=
j[q(xi - )2 + u2]dt
I1_
1.41.
1.41
,P2=-
1 0.70
L 0.06
0.87 ]
_ r 1.60
0.61 J ' P3
L 0.34
0.34
1.50
3.53
2.05
2.50
4.00
8.25
2.50
In Fig. 3.18 (a) the time-plots of and xi are shown corresponding to these values
of q and x(0) = 0. Note that xi more closely tracks as q increases. In Fig. 3.18
(b) the time-plots of and x? are reported when q = 1 and x(0) = 0 or x(0) = x$ _
[ 1.55
-0.62 ] ', xO being the optimal initial state (recall the discussion in Remark
3.6 concerning the selection of the initial state and observe that it applies also in
the case of an infinite time horizon). The apparent improvement resulting from the
second choice is also witnessed by the values of the performance index, J(0) = 3.77
and J'(x') = 1.75, respectively.
Suppose that the velocity x2 is now to mimic the signal , so that the term
q(x2 - )2 takes the place of the term q(xr - )2 in the performance index. We obtain
1'i =
[0
P2 = -q
0 o. 1
a+
0
1
53
t
10-
(b)
(a)
Figure 3.18:
still valid in the case t f = oo even if some care must be paid to existence of the
solution. With the notation adopted there, let An, and Anr be the spectra of the
unreachable parts of the pairs (A, B) and (A, b), respectively. Then, Anr = Anr. In
fact, if Tr is a nonsingular matrix which performs the canonical decomposition of the
pair (A, B) into the reachable and unreachable parts (see Section A.1 of Appendix
A), namely a matrix such that
7rATrr=L
Ar,.
0
Air
Bir
TrB =
JI
J , (ArrB1,) = reachable,
rr '
we obtain
TrATr
17,.
Air
A2,
Blr
Air
TrB=
0
0
I
so that Anr C Anr, since the spectrum of Air is a subset of An,.. It is not difficult to
verify that
C r Arr Blr 1 r 0 11 = reachable
0
J'
Jf
54
frolrl which A,L,. = A,,,.. Indeed, if such a pair is not reachable, then, in view of the
PB11 test (see Theorem A.1 of Section A.1 of Appendix A) it follows that,
[0 1u]=U.
These equations imply that u = 0 and, again in view of the PBH test, the pair
(A1,, Bl,) should not be reachable.
Let now Ano and ' Ano be the spectra of the unobservable parts of the pairs
(A, C) and (A, C), respectively, where C'C = Q :=diag[C'C, D'D], C and D being
factorizations of Q and R, respectively. Then Ano C Ano. In fact, let To be a nonsingular matrix which performs the canonical decomposition of the pair (A, C) into the
observable and unobservable parts, namely a matrix such that
T0Al0 1 = [
Al,,
A3,,
CT,-,' = [ Clo
To
we obtain
Ago
A3o
B2o
CTo 1 =
Clo
[
0D
able parts of the triples (A, B, C) and (A, b, C). From the preceding discussion it
can be concluded that Anro C Anro
It is now possible to state that if a solution of Problem 3.3, defined by the
quadruple (A, B, Q, R) exists for each initial state x(0), that is if all elements of
A,,,.o lie in the open left half plane (see Theorem 3.5), then a solution of Problem
3.3, defined by the quadruple (A, B, Q, R) (recall that R is the weighting matrix for
u in the performance index), exists for each initial state [ X'(0) u'(0) 1', since,
necessarily, all elements of An,o lie in the open left half-plane.
In the special case where rank(B) is maximum and equal to the number of
columns, the optimal regulator can be given a form different from the one shown in
Fig. 3.11 which, referring to a finite control interval can anyhow be adopted also in
the present context, the only significant difference being the time-invariance of the
system. Since B'B is nonsingular, from the system equation x = Ax + Bu it follows
that
u = (B'B)-1B'(x - Ax).
55
TN
Figure 3.19:
On the other hand, the solution of Problem 3.3 implies that it = KKx + Kuu so that
it = K.*x + K.x
This is the control law enforced by the system in Fig. 3.19 which can be interpreted
as a generalization of the PI controller to the mnultivariable case.
Remark 3.15 (Performance evaluation in the frequency-domain) The synthesis procedure based on the solution of Problem 3.3 can easily be exploited to account for
requirements (more naturally) expressed in the frequency domain, as, for instance,
those calling for a weak dependence of some variables of interest on others in a specified frequency range. In other words, the presence of Harmonic components of some
given frequencies in some state and/or control variables, must be avoided, or, equiv-
alently, suitable penalties on them must be set. This can be done in a fairly easy
way. Indeed, recall that thanks to Parceval's theorem
..
00
z'(t)z(t)dt = "ir
-
0 Z-(jw)Z(9w)dw
where z is a time function, Z is its Fourier transform and it has obviously been
assumed that the written expression makes sense. Therefore, a penalty on some
harmonic components in the signal x(t) can be set by looking for the minimization
of a performance index of the form
00
2r
00
X -(jw)F;(jw)F=(jw)X (jw)dw
56
Jf =
27r
where FF and F. are proper rational matrices. This index has to be minimized subject
to eq. (3.17). The resulting optimal control problem can be tackled by first introduc-
ing two (minimal) realizations of F. and F. Let the quadruples (Ax, B., C., D.)
and (A,L, B.u, Cu, Du) define such realizations, respectively, and note that
Jf =
x'
if xA
z',
z;
AA:=
B,
Ax
0
0
Au
[
1
, BA:=
0
Bu
and
QA
D' D,;
D' C.
CAD,:
0
CxC,;
0
0
0
CCCU
0
0
Cu Du
, ZA :=
, RA := DuDu.
'- [
D.
C.
Cu
Thus the regulator is a dynamic system with state [ z'X zu } ', input x (the state
of the controlled system) and output u (see also Fig. 3.20).
57
iu =Auzu+Buu
Figure 3.20:
Example 3.18 Consider the mixer shown in Fig. 3.21 and assume that the concentration inside the tank, of constant section S, is uniform. Further let the outgoing
flow qu be proportional to the square root of the liquid level h. Then the system can
be described by the equations
ti = S (qi + q2 - av),
c
cl and c2 being the concentrations of the two incoming flows qi and q2, respectively.
If the deviations of h, c, ql, q2, cl and c2 about the equilibrium, characterized by h,
c, qi, 42, El and c2, are denoted by xi, X2, ul, u2, dl and d2, respectively, we obtain,
by linearizing the above equations, = Ax + Bu + Md, where
A
01
-0.2
]' B- [
0.1
-0.1
]' M= [
0.1
0.1 ]
corresponding to a suitable choice of the physical parameters and the values of the
variables at the equilibrium.
The first design is carried on in the usual way by selecting Q = I and R = 0.11,
yielding the optimal control law u = Kx with
K_-
2.19
2.19
1.45
-1.45
The second design is carried out in complying with the requirement of counteracting
the effects on x2 of a disturbance dl = sin(0.5t), accounting for fluctuations of the
58
q1 Cl
q2 C2
h
qu
fx(s)=
0.251+0.3s2
s +0.01s+0.25
a realization of the (2,2)-element 4)(s) of Fr(s) (the amplitude of its frequencyresponse is shown in Fig. 3.22) is
Ax _ r
-0. 01
-0.25 1
Bx _
1 1
Within this new framework the optimal regulator is the second order dynamic system
-[
2.19
2 .19
2.33
-2.3 3
Kzx = - [
-0.99
0.99
-0.22
0.22
Figure 3.23 shows the responses of xi and ui when dl (t) = sin(0.5t), corresponding
to the implementation of the above control laws. Note the better performance of the
second regulator as far as the second state variable is concerned and the consequent
greater involvement of the control variable.
Remark 3.16 (Stochastic control problem over an infinite horizon) The discussion
in Remark 3.9 can be suitably modified to cover the case of an unbounded control interval, provided that the material in Appendix A, Section A.5 is taken into
consideration. Corresponding to the (time-invariant) system
59
-40
Figure 3.22:
where v and xo are as in Remark 3.9, reference can be made to either the performance
index
41 = E [
when
[x (t)Qx(t) + u'(t)Ru(t)]dt]
3.4.1
Stability properties
(t) = (A - BR-'B'P)x(t),
(3.22)
are now analyzed in detail. The fact that the control law guarantees a finite
value of the performance index corresponding to any initial state suggests that
system (3.22) should be asymptotically stable if every nonzero motion of the
60
x2
0.15
0.5
U.
t
100
-0.5
-0.15
Figure 3.23:
Example 3.18: response of x2 and ui corresponding to a sinusoidal fluctuation of ci when frequency-domain requirements
are taken (heavy line) or not taken (light line) into account.
A=
0
0
0
0
0
, B=
0
0
, R=1
and the matrices Qzj i = 1, 2, 3 which have the element (i, i) equal to 1 and all others
equal to 0. The solution of the problem exists in every case, since the pair (A, B) is
reachable and is constituted by the control laws
indices are nothing but the integral of y2 + u2 and their boundedness requires the
asymptotic zeroing of yt which, in turn, implies the asymptotic zeroing of x (and
hence asymptotic stability of the system since the initial state is arbitrary) only in the
first case which is the one where the pair (A, Cj) is observable. In the remaining cases
the unobservable part of the pair (A, Cz) has zero eigerivalues and is not stabilized
by the control law which, nevertheless, is optimal.
61
The forthcoming Theorems 3.6 and 3.7 establish precise connections between
the characteristics of the performance index and the stability properties of the
resulting closed loop system. They rely on the following lemma.
Lemma 3.2 Let the pair (A, B) be reachable and Q = C'C. Then the matrix
P which specifies the optimal control law for Problem 3.3 is positive definite if
and only if the pair (A, C) is observable.
Assume that the pair (A, C) is observable and the matrix P is not positive
definite so that x' Pxo = 0 for some suitable xo 0 0. This implies that the perforProof.
mance index (3.13) is zero when x(0) = xo and the control law (3.19) is implemented.
Since R is positive definite it follows that u(.) = 0 and, if xj is the state free motion
originated in xo, also
0, i.e.,
0, contradicting the observability of (A, C). Vice versa, if the pair (A, C) is not observable, then, corresponding
to
0,
0 for some suitable xo # 0 and the performance index (3.18) is
zero. Because of optimality also J(xo) = 0 and from eq. (3.21) it follows that P is
not positive definite.
Theorem 3.6 Let Q = C'C and the triple (A, B, C) be minimal. Then the
closed loop system resulting from the solution of Problem 3.3 is asymptotically
stable.
Proof. The proof is based on Krasowskii's criterion after V (x) := a x'Px has been
chosen as a Lyapunov function. This function is positive definite thanks to Lemma
3.2. Then, recalling that P is symmetric and solves the ARE,
dV(x(t)) =
dt
'(t)P-(t)
dt
= 2x0'(t)(PA+A'P - 2PBR-1B'P)x(t)
-2x'(t)(Q + PBR-1B'P)x(t).
Thus the time derivative of V evaluated along the system motion is negative semidefinite. Asymptotic stability follows if such a derivative is nonzero for each nonzero
Example 3.20 Consider the electric circuit described in Example 2.2 and assume that
all electric parameters are equal to 1, for simplicity in notation. The system equations
become 1 = x2, x2 = u - xi. As a performance index choose
J=
j(x+ x+ u2)dt.
62
Figure 3.24:
The system is reachable and Q > 0, so that observability is ensured. The only
symmetric and positive definite solution of the ARE is
P- r 1.91
l 0.41
0.41 1
1.35 J
and the eigenvalues of the closed loop system are 1\1,2 = -0.68 jO.98.
where J is the rod inertia referred to its center of mass and g is the gravity constant.
The system linearized equations corresponding to the equilibrium = 0, = 0,19 = 0,
Figure 3.25:
63
A=
0
0
0
may
0
0
43
0
52
,B=
3
- 43
-1
Theorem 3.7 Assume that a solution of Problem 3.3 exists for each initial
state. Then the optimal closed loop system is asymptotically stable if and only
if the pair (A, Q) is detectable.
Remark 3.17 (Existence and stabilizing properties of the optimal regulator) A summary of the discussion above concerning the existence and the stabilizing properties
of the solution of Problem 3.3 is presented in Fig. 3.26 where reference is made to
a canonical decomposition of the triple (A, B, C) and the notation of Remark 3.12
is adopted. Further, the term "stab". denotes asymptotic stability and the existence
or inexistence of the solution has to be meant for an arbitrary initial state.
Remark 3.18 (Optimal regulation with constant exogenous inputs) The results concerning the optimal regulator can be exploited when the system has to be controlled
64
Figure 3.26:
(3.23a)
(3.23b)
x(O) = xo,
lirn y(t) = y,
Within this framework ys is the set point for y, while d accounts for the disturbances
acting on the system input and output. In the present setting the triple (A, B, C)
is minimal, the number of control variables equals the number of output variables
and the state of the system is available to the controller. In view of Appendix B.4
the controller can be thought of as constituted by two subsystems: the first one is
described by the equation
fi(t) = Y. - y(t)
(3.24)
65
while the second one has to generate the control variable u on the basis of x and
in such a way as to asymptotically stabilize the whole system. In designing this
second subsystem it is no doubt meaningful to ask for small deviations of the state
and control variables from their steady state values together with a fast zeroing of
the error. Since the first variations of the involved variables for constant inputs are
described by system E obtained from eqs. (3.23), (3.24) by setting d = 0 and ys = 0,
namely
J = f [Sy'(t)Qysy(t) +
Su (t)R6u(t)]dt
with Qt and R positive definite and Qy positive semidefinite, the optimal control
law, if it exists, will surely be stabilizing, since E,, is observable from (the check
can easily be performed via the PBH test) and given by
Su3(6x, S&) = K1Sx + K2S&.
The existence of the solution is guaranteed by the reachability of E,,, namely by the
fulfillment of the condition
n + in = rank(
B
[ 0
=rank([
AB
A2B
-CB -CAB . . ] )
B AB
.
0J[r
... ])
which in turn is equivalent to saying that system E(A, B, C, 0) does not possess
transmission zeros at the origin (actually invariant zeros, because of the rninirnality
of E). In fact, reachability of the pair (A, B) implies that in the above equation the
rank of the second matrix on the right-hand side be equal to art+n and, in view of the
already mentioned rninirnality of E(A, B, C, 0), that there are transmission zeros at
the origin if and only if Ax+Bu = 0 and Cx = 0 with x and/or u different from 0 (see
Section A.3 of Appendix A). On the other hand, if E(A, B, C, 0), which possesses as
many inputs as outputs, has a transmission zero located at the origin, then it would
follow that also A'x + C'y = 0 and B'x = 0 with x and/or y different from 0, which
would in turn entail the existence of a zero eigenvalue in the unreachable part of E,,,
thanks to the PBH test. Since this system is observable, we should conclude that
no solution exists for Problem 3.3 when stated on such a system. Thus zero error
regulation can be achieved in the presence of constant inputs only if none of the
transmission zeros of E(A, B, C, 0) is located at the origin.
66
Example 3.22 Consider the inverted pendulum shown in Fig. 3.27 and assume that
the mass of the rod is negligable if compared to the mass m of the sphere located
at its end. A torque c (control variable), a wind force and gravity act on the rod of
length 21. If i is the angle between the rod and the vertical axis and J is the inertia
with respect to the hinge, the system equations are
J'O = 2lrng sin (V) + c + 212 (cos
where the last term accounts for the wind of intensity v. Letting the two components
of x denote the deviations from 0 of the angle 79 and its derivative, u the deviation of
the torque c and d the deviation of the wind intensity from their nominal values, the
linearized system is characterized, consistent with the notation in eqs. (3.23), (3.24),
by
J, B=
A= [
0
1
4m1
, M=
2M
01
Remark 3.19 (Penalties on the control derivative) The problem considered in Remarks 3.8 and 3.14 can be discussed further with reference to the stability properties
of its solution.
With the same notation adopted in Remark 3.14, first recall that in view of Theorems 3.5 and 3.7, if a solution of Problem 3.3 stated for the quadruple (A, B, Q, R)
exists for each (initial state) x(0) and the resulting closed loop system is asymptotically stable, then the set Ano U
must be stable, i.e. all its elements must have
negative real parts. Assuming that this set is such, it is possible to claim that Problem 3.3 stated for the quadruple (A, B, (I R) admits a solution for each initial state
67
10
10
Figure 3.28: Example 3.22: responses of the error and control variables.
[ X'(0) u'(0) ]' if the set Anro is stable. Since Anro C Anro (see Remark 3.14), the
set An, o is stable if the set An,.o is such.
x ] = , \ [ u],Cl ux J=O,
[x100.
68
]B=[?]
A=I 1
a=0, /3=0:
Ai(F)=
(x=0, /3= 1
Ai(F) =
10
-1
1 -0.50
-1.40+0.27j
-1.40 - 0.27j,
a=1, /3=0:
A(F)=
-1.52
-0.70+0.40j
-0.70 - 0.40j,
-1
a=1, /3=1:
A(F)=
-1
The result presented in Theorem 3.6 makes precise the intuition that stability is
achieved by the closed loop system if the value of the performance index, which
is finite for all initial states, is affected by each nonzero motion of the state.
69
Problem 3.4 (Optimal regulator problem with exponential stability) For the
time-invariant system
(3.25)
X(0) = xo
e2at[x'(t)Qx(t) + u'(t)Ru(t)jdt.
J=
(3.26)
No constraints are imposed on the final state and further Q = Q' _> 0, R =
R' > 0, while a is a given nonnegative real number.
For this problem the following result holds.
Theorem 3.8 Let the triple (A, B, Q) be minimal. Then the solution of Problem
3.4 exists for each initial state x0 and each a > 0. The solution is characterized
by the control law
ucsa(x) _
-R-1 B'Pax
(3.27)
where Pa is the symmetric and positive definite solution of the algebraic Riccati
equation
(3.28)
such that Pa = limtf_ya, P0(t, t f), Pa(t, t1) being the solution of the differential
(A+aI)x+Bu
and the performance index
J = f [x (t)Qx(t) + u"'(t)R1(t)Jdt.
Note that reachability of the pair (A, B) implies reachability of the pair ((A+al), B).
Indeed, if this second pair is not reachable, i.e., (see the PBH test) if (A+aI)'x = Ax
and B'x = 0 with x i4 0, then Ax = (A - a)x and also the first pair is not reachable.
70
'T'herefore, the solution of the optimal regulator problem stated on system E exists
and is given by the control law
uos(:i) _ -R+'B'Pax
(3.29)
where P,,, satisfies eq. (3.28) and possesses the relevant properties. By recalling the
performed change of variables, eq. (3.27) is found.
For the eigenvalues location, observe that the PBII test implies that the pair
((A + aI), Q) is observable and the control law (3.29) is stabilizing. Therefore all
eigenvalues of A + aI - BR-' BP,, have negative real parts so that the eigerrvalues
of A - BR-1 B' P, have real parts less than -a (the eigenvalues of any matrix T + aI
are the eigenvalues of T increased by a).
Example 3.24 Consider the system shown in Fig. 3.29. Assume that both objects, the
first one with mass M, the second one with mass m, can move without friction along a
straight trajectory. The spring between them has stiffness k, while the external force
u is the control variable. A regulator has to be designed along the lines of Theorem
3.8 to make the transients last, from a practical point of view, no more than 1 unit
of time. If x is the vector of the positions and velocities of the two objects, the design
can be carried on by stating Problem 3.4 on a system defined by
A=
-k/M
k/M
k/rrt
-k/rn
B_
'
1/M
0
71
(3.30)
(3.31)
(3.32)
0 = PA + A'P - PBR-'B'P + Q
(3.33)
with Q=C'CandR=R'>0.
Lemma 3.3 Let K be given by eqs. (3.32), (3.33). Then
G-(s)G(s) = I + H'(s)QH(s)
(3.34)
where
G(s) = I - Ra KH(s)
H(s) = (sI -
A)-'BR-Z.
From eq. (3.34), letting s = jw, w E R, it follows that G'(-jw)G(jw) > I since
its left-hand side is an hennitian matrix (actually it is the product of a complex
72
(sI A)-'B
-K
Figure 3.31:
matrix by its conjugate transpose), while its right-hand side is the sum, of the
(3.35)
Based on this relation the following theorem states that the optimal closed
loop system is robust in terms of both phase and gain margin.
Theorem 3.9 Consider the system (3.30), (3.31) and assume that:
i) The input u is scalar;
ii) The triple (A,.B, C) is minimal;
iii) In eq. (3.32) P = P, the solution of eq. (3.33) relevant to Problem 3.3
defined by the quadruple (A, B, C'C, R).
Then the phase margin of the closed loop system is not less than 7r/3 while the
gain margin is infinite.
Proof Letting 1F'(s) := -If(sI - A)-'B, where k := -R-'B'P, the closed loop
system, which is asymptotically stable in view of the above assumptions, can be given
the structure shown in Fig. 3.31. The polar locus of the frequency-response of F lies,
in the complex plane, outside the unit circle centered at (-1, jO) since, thanks to
eq. (3.35), the polar locus of the frequency-response of 1 + F lies outside the unit
circle centered at the origin. By the Nyquist stability criterion it is easy to conclude
that the only possible shapes for the polar loci of the frequency-response of F are
of the kind of those shown in Fig. 3.32 (a), where the diagrams labeled with a and
b refer to the case in which at least one pole of F is in the open right half-plane,
while those labeled with c and d refer to the case where no pole of F is located there,
if the poles of F lying on the imaginary axis have been surrounded on the right in
the Nyquist path. It is now obvious that the intersection of the polar locus of the
frequency-response of F with the unit circle centered at the origin must occur within
the sector delimited, on the left, by the straight lines r and s shown in Fig. 3.32 (b),
which form an angle of 27r/3 with the real axis. Therefore, the phase margin can
73
-.
Re
(a)
(b)
Figure 3.32: (a) Allowed shapes for the polar plot of the frequency-response
A= [ 0075
-12
J, B= [
J.
The state variables can be measured and a stabilizing regulator has to be designed
to make all real parts of the eigenvalues not greater than -0.5. By exploiting poleplacement techniques (see Appendix B.1, Section B.2) we find that the control law
u = Kax with K. = [ -1 1 ] is such that the two eigenvalues are A1,2 = -0.5.
On the contrary, by resorting to the optimal regulator theory arid choosing, after
Fig. 3.33 (a) where the Nyquist loci of Fa(s) :_ -Ka(sI - A)-1B and F(s) :_
-K(sI - A)-1B are shown). Finally, the responses of the first state variable are
plotted in Fig. 3.33 (b) when u = Kx + v, v(t) = 1, t > 0, K = K. or K = K.
Since the two steady state conditions are different, in the second case the value of
the state variable has been multiplied by the ratio 3 between the two steady state
values. This allows a better comparison of the transient responses: specifically, the
worst performance of the first design is enhanced.
74
16
0
(b)
(a)
Figure 3.33:
When the control variable is not scalar a similar result holds: it is stated in
the following theorem, the proof of which can be found in the literature [12,
33, 35].
Theorem 3.10 Consider Problem 3.3 and assume that the pair (A, B) is reachable, Q > (l and R > 0 diagonal. Then each one of the rn loops of the system
resulting from the implementation of the optimal control law possesses a phase
margin not smaller than ir/3 and an infinite gain margin.
The significance of this result can be better understood if reference is made to
the block-scheme in Fig. 3.34. When the transfer functions gi(s) are equal to
1 it represents the system resulting from the implementation of the optimal
Fi(s) _ [K]i(sI - (A +
E[B]i[K).7))[B1:
j=1
jai
Theorem 3.10 guarantees that the system remains stable if the functions gi(s)
introduce a phase lag less than ir/3 at the critical frequency of Fi(s), or are
purely riondynarnic systems with an arbitrarily high gain.
Example 3.26 Consider the system shown in Fig.- 3.35 which is constituted by a tank
of constant section. The temperature 19i of the incoming flow qj is constant while
the temperature 19 of the outgoing flow qV, equals the temperature existing in each
point of the tank. The external temperature is '0e and Q, is the heat supplied to the
3.4.
75
ul
01.
ui
ac Ax+Bu
u,n
Figure 3.34:
liquid. If the thermal capacity of the tank is neglected, the system can be described
by the equations
it = al (qi - per),
hag = a2gi(a9i - 19) + a3Qr + a4h(a9e - a9)
where the value of the outgoing flow is assumed to be ,6 times the square root of the
liquid level while aj, j = 1, ... , 4 are constants depending on the physical parameters
of the system. By linearizing these equations and denoting with xi, X2, ul, u2, d1,
d2 the deviations of h, a9, qi, Qr, 0, age, respectively, the system x = Ax + Bu + Md
is obtained, where
`4 - [ -10/816
-1/2
]B=[ -17/4
1/4000
M= [
1/4 } '
corresponding to a particular equilibrium condition and suitable values of the parameters. The controller must attain asymptotic zero-error regulation of the liquid
level and temperature in the face of step variations of the flow qu and temperature
age (that is for arbitrary dl and d2), while ensuring good transients. According to
Remark 3.18 two integrators (with state ) are introduced together with a quadratic
Q. = I and Qt = 10001 (note
performance index characterized by R
that in this specific case the state of the controlled system coincides with the output
to be regulated). The responses of xi and x2 are reported in Fig. 3.36 corresponding to a step variation of d1 (at t = 0) with amplitude 0.5 and d2 (at t = 5) with
amplitude 10. In the same figure the responses corresponding to the same inputs
are reported when also the actuators dynamics are taken into account. If the transfer functions of these devices (controlling the incoming flow qi and the heat Qr) are
91(s) = 1/(1+0.05s) and 92(5) = 1/(1+s), respectively, one can check that they add
=diag[1,10_51,
a phase lag of approximately 45 at the critical frequencies of the two feedback loops.
The figure proves that the control system designed according to the LQ approach
behaves satisfactorily, even in the presence of the neglected dynamics.
76
q1 15i
Figure 3.35:
3.4.3
The behaviour of the solution of the optimal regulator problem is now analyzed
when the penalty set on the control variable becomes less and less important,
that is when it is less and less mandatory to keep the input values at low levels.
Obviously, when the limit situation is reached where no cost is associated to
the use of the control variable (the control has become a cheap item) the input
can take on arbitrarily high values and the ultimate capability of the system
to follow the desired behaviour (as expressed by the performance index) is
put into evidence.
With reference to the system
(3.36b)
X(0) = x0i
(3.36c)
(3.36a)
which is assumed to be both reachable and observable, consider the performance index
J(p) =
[y'(t)y(t) + pu (t)Ru(t)]dt
(3.37)
where R = R' > 0 is given and p > 0 is a scalar. Obviously, the desired
behaviour for the system is
0.
77
(a)
(b)
Figure 3.36: Example 3.26: state responses in design conditions (a) and when
PR-1B'P(P)x
u3(x, P)
(3.38)
where P(p) is the (unique) positive definite solution of the ARE (see Theorems
5.8 and 5.10 in Section 5.3 of Chapter 5)
78
since the pair (x(t, pa ), u(t, pl )), which is optimal when p = pa, is no longer such,
in general, when p = p2 0 pl. Therefore, xO'P(p)xo is a monotonic nondecreasing
function, bounded from below (because it is nonnegative) and admits a limit when
p -+ 0+. Since xo is arbitrary, also the limit of P(p) exists.
A meaningful measure of how similar the system response is to the desired one
(that is how y(.) is close to zero) is supplied by the quantity
fxoF(t,p)Qx0(t,p)dt,
'IX (X0, p)
p-4o-f-
Proof. The form of the performance index (3.37) makes obvious that the value of J. is
a monotonic nondecreasing function of p, bounded from below since it is nonnegative.
Further, .J (xo, p) G J(xo, p), Vp, so that
pin i J. (xo, p) = 2 xoPoxo - e, e > 0.
Let e > 0 and assume that there exist E > 0 and p > 0 such that 2JJ (xo, p) _
xoPoxo - 2E. By choosing
00
it follows that
xoPoxo - E =
p) + PJu(x0, p))
00
=2
J(O)
>2f
00
)Qx(t,
A) +
pu'(t,
because i) the pair (x(t, p), u(t, p)) is optimal when p = p but it is not so when p =
p; ii) it has been proved (Lemma 3.4) that x'OP(p)xo is a monotonic nondecreasing
function the limit of which, as p goes to 0, is xoPoxo. The last equation implies e = 0.
Thanks to this theorem matrix PO supplies the required information about the
maximum achievable accuracy in attaining the desired behaviour
0).
3.4.
Figure 3.37:
79
Theorem 3.12 Let system (3.36a), (3.36b) be both reachable and observable.
The following conclusions hold:
i) Ifm<p, then PO 34 0;
ii) If m = p, then PO = 0 if and only if the transmission zeros of the system
have nonpositive real parts;
iii) If m > p and there exists a full rank m x p matrix M such that the
transmission zeros of system E(A, BM, C, 0) have nonpositive real parts,
then PO = 0.
Example 3.27 Consider a system E(A, B, C, 0) where the state, input and output are
scalars and B 0 0, C # 0. This system does not possess finite transmission zeros,
since its transfer function is F(s) = BC/(s - A): hence Po = 0. Indeed, the relevant
ARE is 0 = 2AP - P2B2/(pR) + C2, the only positive solution of which is
Cixi
Ri + R2 (xi -
X2 - R2x3),
(xl - X2 + Rlx3),
Rl +R2
_
1
U3
(R2x1 + R1x2 + (RiR2 + R1R3 + R2R3)x3),
= u Rl + R2
C222 =
Rl + R2
80
p=1
t
'` p=106
1.5
Figure 3.38:
2 - sl
G(s) _
which exhibits a zero in the right half-plane. Thus, if the performance index is the
integral of y2 + put, it follows that Po # 0. In fact, for p = 10-10 we get
P(p) =
0.49
0.34
0.00
0.34
0.24
0.00
0.00
0.00
0.00
A=
1-2
L
-1
0
0
, B=
C=
det[G(s)] =
`p(8)
det[sI - A]
P(p) = 10-t
0.00
0.00
0.00
0.00
3.75
-3.75
0.00
-3.75
3.75
if a = 3
Figure 3.39:
81
and
P(p) =
10-4
0.07
0.00
0.07
0.07
0.00
0.34
-0.20
-0.20
0.27
, P(p) =
10-4
0.07
0.00
0.07
0.00
0.14
0.00
0.07
0.00
0.07
A=
0
0
B=
0
0
, C= [ 0 a
so that the zeros of system E(A, BM, C, 0) do not lie in the open right half-plane
if a > 0 and a and b have been suitably selected. Consistent with this, we get, for
R = 1, a = 1 and p = 10-10
P(p) =
10-5
0.00
0.00
0.00
0.00
1.00
1.00
0.00
1.00
1.00
82
3.4.4
The inverse optimal control problem consists in finding, for a given system
and control law, a performance index with respect to which such a control
law is optimal. The discussion as well as the solution of this seemingly useless
problem allows us to clarify the ultimate properties of a control law in order
that it can be considered optimal and to precisely evaluate the number of
degrees of freedom which are actually available when designing a control law
via an LQ approach.
We will deal only with the case of scalar control variable and time invariant system and control law: thus the inverse problem is stated on the linear
dynamical system
(3.39)
u(x) = Kx
(3.40)
where the control variable u is scalar, while the state vector x has n components. The problem consists in finding a matrix Q = Q' _> 0 such that the
control law (3.40) is optimal relative to the system (3.39) and the performance
index
(3.41)
83
Proof. The assumptions are independent of the particular choice of the state variables, therefore the pair (A, B) can be written in the canonical form
001
10
-a1
-a2
-a3
A=
...
001
0
0
,B=
0
...
-an
(ai -
ki)si- 1
i=1
[1 - K(-jwI -
A)-1Bl[l
- K(jwI -
A)-1Bj
(3.42)
ir(jw)
(3.43)
a we obtain 19'(sI - A)-1B = a(s)/(s) and, in view of eqs. (3.42), (3.43), it follows
that
= 1 + B'(-jwI - A')-1?9l9'(jwI - A)
B.
(3.44)
From Lemma 3.3, if the control law (3.40) is optimal with respect to the performance
index (3.41), then it follows that
(3.45)
84
could be optimal with respect to the performance index (3.41) with Q = i9'z9. Let
K := [ k1 k2 ... k,, ] be the matrix which defines the optimal control law for
such a Q, so that, in view of Lemma 3.3,
(3.46)
If it were possible to conclude from this relation that cp = cp, then it would also be
possible to conclude that k = K since the two polynomials are identical functions
of the elements of these matrices. Thanks to the assumption (a2), cp is uniquely
determined from the polynomial on the right-hand side of eq. (3.46), since all its
roots must have negative real parts.
It is now shown that all the roots of cp have negative real parts, thus entailing
that cp = cp. If the pair (A, i9') is observable, such a property is guaranteed by
Theorem 3.6. By contradiction assume that this is not true: the transfer function
Fs(s) := 19'(sI - A)-1B would have the degree of the denominator less than n
if expressed as the ratio of two coprilne polynomials. Since F,9(s) = a(s)/(s), the
polynomials a and ' would possess at least one common root, say z, with nonpositive
real part because of the definition of a. If eq. (3.43) is taken into account, then z must
also be a root of cp, since all the roots of cp"' have positive real parts. As a consequence,
the degree of the denominator of the transfer function FK (s) := K(sI - A)-'B,
which is the ratio of the polynomials V - cp and z/i, would be strictly less than n
after the suitable cancellations have been performed. In view of assumption (al) the
observability of the pair (A, K) (assumption (a4)) is then violated. In conclusion,
the pair (A, i9') is observable, all roots of the polynomial cp have negative real parts,
cp = cp, K = K and the control law (3.40) is optimal with respect to the performance
index (3.41) with Q = 199'.
85
A=
0
0
0
0
, B=
, K= [ -1 -2 -2 ].
It is straightforward to verify that the assumptions in Theorem 3.13 hold and ir(s) =
1, so that a(s) = 1, z9 = [ 1 0 0 ] ' and Q =diag[1, 0, 01. This result can easily be
checked by computing the solution of the optimal regulator problem stated on the
quadruple (A, B, Q, 1).
A=[A2 A3 ],B=[B2],KKl
0]
with the pair (Ai, Kl) observable and the pair (A1, B1) reachable (this last claim
follows from the reachability assumption of the pair (A, B)). Theorem 3.13 can be
applied to the subsystem E1 described by xp = Alxo + Btu and the control law
u = K1x,,. In fact, assumption (a2) is verified for the triple (A1, B1, K1) if it holds
for the triple (A, B, K), since
A+BK= r Al+B1K1
A2 + B2 K1
I
0
A3
A=
0
0
0
0
-1
01
, K=
B=
1
[ -1 -/ 0].
Remark 3.21 (Degrees of freedom in the choice of the performance index) The proof
of Theorem 3.13 points out a fact which is of some interest in exploiting the LQ
theory in the synthesis of control systems. Indeed, within this framework only the
86
structure of the performance index should be seen as given, the relevant free parameters; being selected (through a sequence of rationally performed trials) so as to
eventually specify a satisfactory control law. If the control is scalar, the number of
these design parameters is substantially less than 1 + n(n + 1)/2, the number of
elements in R and Q. In fact, on the one hand, it has already been shown that R can
be set to 1 without loss of generality, while, on the other hand, it is apparent that,
under the (mild) assumptions of reachability of (A, B) and stability of the feedback
system, conditions (al)-(a3) of Theorem 3.13 are satisfied whenever the control law
results from the solution of the LQ problem corresponding to an arbitrary Q > 0.
These conditions are sufficient (thanks to Remark 3.20) to ensure the existence of the
solution of the inverse problem. Thus the same control law must also result from a Q
expressed as the product of an n-vector by its transpose and the truly free parameters
are n.
A= [ 0
p ]B=[?].
OQ
Conditions (al) (a3) in Theorem 3.13 can further be weakened, up to becoming necessary and sufficient. These new conditions are presented in the following theorem where the triple (A, B, K) has already undergone the canonical
decomposition, thus exhibiting the form
A=
Ai
A2
A3
A4
0
0
A7
A 68
A9
Bi
LB=I
'
K= [ 0 Ki 0 K9 1
0
0
'
(3.47)
with the pair (A,.,13,.) reachable, the pair (Ao, Ko) observable and
A,.
Ao:=
Ai
A, A6
0
A2
A55
Bi
B2
,
Ko:=[ Ki K2 .
A9
Theorem. 3.14 With reference to the system (3.39) and the control law (3.40)
there exists a matrix Q = Q' > 0 such that the optimal regulator problem
87
defined on that system and the performance index (8.41) admits a solution for
each initial state, the solution being specified by the given control law, if and
only if
(a32)
1.
Proof. Necessity. Let K specify the optimal control law corresponding to a given
Q := C'C. Then Lemma 3.3 implies that condition (a32) holds or, if
1,
C(sI - A)-'B = 0, which in turn causes the solution of the optimal control problem
0, that is K = 0. Note that the unobservable part of the triple (A, B, C) is
to be
included in the unobservable part of the triple (A, B, K): in fact, if the first triple has
already undergone the canonical decomposition (in general, the relevant canonical
form differs from the one given in eq. (3.47)), it is easy to verify that the columns of
K corresponding to the zero columns of C are also zero (recall Remark 3.12). Since
the control law ensures that all eigenvalues not belonging to the unobservable part
have negative real parts, conditions (al) and (a2) are satisfied.
that a vector 192 with the same number of rows as matrix A9 can be determined
so that the n-dimensional vector 19
0 19i 0 192 ]' defines a solution of the
inverse problem relevant to the triple (A, B, K). This solution is constituted by the
matrix Q :=1919'. Note that a solution of the optimal regulator problem defined by the
quadruple (A, B, Q, 1) exists whatever the vector 192 might be, thanks to assumption
(al). In view of Remark 3.12 in Section 3.4 it can be stated that the nonzero blocks
in the solution of the relevant ARE are only four, precisely P5, P7, Plo, P7, and
satisfy the equations
0 = P5A5 + A5 P5 - P5B2B2P5 + Q1
0 = P7A9 + (A5 - B2B2P5)'P7 + P5A6 + Q12
(3.48)
implies that the solution P7 of eq. (3.48) with P5 = P5 is such that K2 = -B2P7.
88
'
P7 = J eAs-t(P5As
+ Q12)eA9tdt
v
where A50 := A,,, - B2B'2.P5 (the check can easily be done by resorting to integration
by parts and exploiting the fact that all eigenvalues of matrices As and A50 have
negative real parts). Since Q12 is a function of 192, we obtain
k (192) :_ -P7(i92)B2 = - - A792
where
/00
e9tAjP5eA5CtB2dtA
f(19!A5CtB)eA'9tdt
are a vector and a matrix, respectively, which are known because P5 (and thus also
A50) are uniquely determined once Q1 has been selected as specified above. It should
be clear that if A is nonsingular, then 'd2 can be selected so that K2 = K2. Therefore
the proof ends by checking that A is nonsingular. Without loss of generality matrix
A9 can be assumed to be in Jordan canonical form, hence eA9t and A are triangular.
In order that this last matrix be nonsingular it is sufficient to ascertain that all its
diagonal elements are different from 0. If a1 is the i-th eigenvalue of A9, the i-th
diagonal element of A is
> aI(jw)
1
Ndw,
tB2)e"atdt = 1
t:
- Jo
2ir
,_ 0
cP1(jw) - jw
- ai
having exploited Parceval's theorem and denoted by cpl the characteristic polynomial
of A5,;. The evaluation of fi can be performed by resorting to well-known properties
of analytic functions, yielding
(,1(-a-)
(Pl (-a= )
which is not 0 because lie[-ai ] > 0 (assumption (al)) and all roots of aI have
nonpositive real parts.
Example 3.34 Consider the inverse problem defined by the matrices
A=
0
0
0
0
0
2
, B=
-1
K= [ -2 -3 -1
so that, if the notations in Theorem 3.14 and its proof are adopted, we have
0],Ar2],B2
A5-[0
0K1-2 -3},As=K2=-1
P5 _ [
6-22
2
A5ct
' e
2e-t - e-2t
a-t _ e-2t
3.5. Problems
3.5
89
Problems
S-1.
Problem 3.5.2 Find the solution of the tracking problem discussed in Remark 3.7
when the term [y(t f) - l.L(t f)]'S[y(t f) - (t f)]/2, S = S' > 0 is added to J.
Problem 3.5.3 Find an example of the problem discussed in Remark 3.3 where the
optimization with respect to the initial state cannot be performed.
Problem 3.5.4 Find a solution of the LQ problem (with known exogenous input)
defined by x(t) = Ax(t) + Bu(t) + Md(t), y(t) = Cx(t) + Nd(t), x(0) = xo,
J=
where xo, T,
Problem 3.5.5 Consider the LQ problem over an infinite horizon where x(t) = tu(t),
x(to) = xo and
Problem 3.5.6 Find the solution of the (generalization of the) LQ problem defined
by x(t) = A(t)x(t) + B(t)u(t), x(to) = xo and
J=f
where 0 < Ti < Tf Goo, xo, Tf and TZ are given and Q(t) = Q'(t) > 0, S = S' > 0,
R(t) = R'(t) > 0.
Problem 3.5.7 Consider the regulator problem defined on the quadruple
(A, B, CC, 1),
where A = 100
1
1
]c={fi 1
and -oo < a < oo, -oo <,3 < oo. Discuss the existence of the solution for all initial
states and the stability properties of the resulting closed loop system.
90
Problem 3.5.8 Assume that the solution of the optimal regulator problem defined by
a quadruple of constant matrices (A, B, Q = Q' > 0, R = R' > 0, exists for each
initial state. Can the relevant ARE admit only the positive semidefinite solutions
PI=[
]' P2- [
]?
Problem 3.5.9 Consider the track ing problem defined on the system x(t) = Ax(t) +
13u(t), y(t) = Cx(t), x(O) = xo and the performance index
00
J..-T
(t))
A= I
0]
' B= [ 1
C= [ a 1 ].
With reference to the notation in Remark 3.13, check that Ps -> 0, i = 1, 2, 3 when
p -* 0+ if a > 0, while this is not so when a < 0. Could this outcome have been
forecast?
Problem 3.5.10 Find a solution of the LQ problem defined on the system x(t) _
Ax(t) + Bu(t) + Erl(t), y(t) = Cx(t) + Dri(t), x(0) = xo and the performance index
0
where the pair (A, B) is reachable, ri(t) = HC(t), 4(t) = Fi(t), (0) = Co, F is an
asymptotically stable matrix. The state x is available to the controller and o is
known.
Chapter 4
The LQG problem
4.1
Introduction
The discussion in this chapter will be embedded into a not purely deterministic framework and focused on two problems which, at first glance, might seem
to be related to each other only because they are concerned with the same
stochastic system. Also the connection between these problems and the previously presented material is not apparent from the very beginning while, on the
contrary, it will be shown to be very tight. Reference is made to a stochastic
system described by
(4.1a)
(4.1b)
(4.1c)
x(to) = xo,
E[
zvv(t)
J = 0, Vt,
E[xo] = moo,
(4.2)
(4.3)
92
and
E(
v(t)
w(t)
W(T)
v'(T)
]-
S(t - T)
ZW
_=6(t
`dt, T,
w'(t) ]] = 0, Vt,
E[(xo - to) (xo -X01 = no,
E[xo { v'(t)
(4.4)
(4.5)
(4.6)
where the quantities to, V, Z, W, 11o are given and 6 is the impulsive function.
IIo are symmetric and positive semidefinite.
Moreover, matrices V, W,
The two problems under consideration are concerned with the optimal
estimate of the state of system (4.1) and the optimal (stochastic) control of
it. The first problem is to determine the optimal (in a sense to be specified)
approximation (tf) of x(t f), relying on all available information, namely the
time history of the control and output variables (u and y, respectively) on the
interval [toy, t f] and the uncertainty characterization provided by eqs. (4.2)(4.6). The second problem is to design a regulator with input y which generates
the control it so as to minimize a suitable performance criterion.
Remark 4.1 (Different system models) Sometimes the system under consideration is
not described by eq. (4.1a) but rather by the equation
(4.7)
L'[ [ rrl(t)
v*'(T)
&(r) ]] _ E6 (t - T)
with
Z*,
In this case the previously presented formulation can still be adopted by defining the
stochastic process v := B*v*, which apparently verifies eqs. (4.1a), (4.2), (4.4), (4.5),
(4.8)
v* being a zero mean white noise independent of xo with intensity V* > 0. Letting, as
above, v := B*v* and w:= C*v*, it is straightforward to get back to eqs. (4.1)-(4.6)
with V := B*V*B*', Z := 13*Z*, W := C*V*C*'.
Finally, note that in eqs. (4.7), (4.8) matrices B* and C* could be continuously
differentiable functions of time. In such a case the discussion above is still valid but
4.2.
4.2
93
The problem of the optimal estimate or filtering of the state of system (4.1)(4.6) is considered in the present section. The adopted performance criterion
for the performed estimate is the expected value of the square of the error
undergone in evaluating an arbitrarily given linear combination of the state
components. Thus the problem to be discussed can be formally described as
follows.
Problem 4.1 (Optimal estimate of b'x(t f)) Given an arbitrary vector b E Rn,
determine, on the basis of the knowledge of y(t) and u(t), to < t < t f, a scalar
i3 such that the quantity
Jb
E[(b'x(tf) -,3)']
(4.9)
The first case to be examined is the so-called normal case where the matrix
W (the intensity of the output noise w) is positive definite, while the socalled singular case in which W is positive semidefinite only, will subsequently
be dealt with. These two situations differ significantly from each other not
only from a technical point of view, but also because of the meaning of the
underlying problem. This aspect can be put into evidence by recalling that
the symmetry of W allows us setting W := T'DT where T is an orthogonal
matrix, D := diag[dl, d2, ... , d,., 0, ... , 0] and rank(W) = r. By letting y* :=
4.2.1
The optimal state estimation problem is now considered under the assumption
that the matrix W is positive definite: first the observation interval is supposed
to be finite, subsequently the case of an unbounded interval will also be tackled.
Thus, letting -oo < to < t f < oo, Problem 4.1 is discussed by adding a
seemingly unmotivated constraint to the scalar a which is asked to linearly
depend on y, according to the equation
tf
,Q =
Jto
19'(t)y(t)dt
(4.10)
94
form of the estimate of b'x(t f) does not actually cause any loss in optimality,
since in the adopted stochastic framework the estimate which minimizes Jb is
indeed of that form. With reference to the selection of i9 the following result
holds.
Theorem 4.1 Consider eq. (4.10): the function t9 which solves Problem 4.1
0,
relative to system (4.1) .-(4.6) when the observation interval is finite,
:to = 0 and Z = 0, is given by
19(t) = W-'C(t)rl(t)a(t)
(4.11)
where lI is the (unique, symmetric, positive semidefinite) solution of the differential Riccati equation
(4.12)
11(to) = no,
(4.13)
(4.14)
a(tf) = b.
(4.15)
a = -A'a + C't9,
(4.16a)
a(t f) = b.
(4.16b)
d(a'x)
dt
By integrating both sides of this equation between to and t f we get, in view of eqs.
(4.10), (4.15),
f
tf
b'x(t f) - 11 = a(to)x(to) - f i9'(t)w(t)dt + f
to
to
(t)v(t)dt.
to
to
By squaring both sides of this equation, performing the expected value operation,
exploiting the linearity of the operator E and the identity (r's)Z = r'ss'r, which
95
E[(b'x(tf)
_,3)21
= E[a'(to)x(to)x'(to)a(to)]
1
+E [J tf O'(t)w(t)dt
w'(T)19(T)dr]
tp
+E [f' a'(t)v(t)dt
Jto
-E [2a'(to)x(to)
f v'(T)a(T)dr]
f w(t)19(t)dt]
fto
+E [2a(to)x(to)
L
Ito
f v'(t)a(t)dtJ
Note that
= J tf 19'(t) off
= f tf t9'(t)
o
E[w(t)w (T)]19(T)dTdt
tf W6(t - T)i9(T)d'rdt
to
to
i9'(t)Wt9(t)dt.
Thus, by exploiting similar arguments for the remaining terms and taking into account eqs. (4.2)-(4.6), it follows that
(4.17)
0,
and initial times have been interchanged. This fact can be managed by letting, for
any function f of t, f (E := f (t), with t :_ -t: then the problem at hand is equivalent
to the problem stated on the system
d&
= A'&
71
&(tf) = b,
and the performance index
fip
Jb =
Jtf
96
dt
=-fIA'-Ali+lIC'W-1CII-v
dt
= (A - f O'W-1C)'a
with a(if) = b. The theorem is proved once the inverse transformation from the time
t to the time t has been carried out.
Remark 4.2 (Meaning of (3) Within the particular framework into which Theorem
4.1 is embedded, both x(t) and y(t) are, for each t, zero-mean random variables
because xo = 0 and v and w are zero-mean white noises. Therefore /3 is a zero-mean
random variable as well and its value, as given in Theorem 4.1, is the one which
minimizes the variance of the estimation error of b'x(t f).
Remark 4.3 (Variance of the estimation error) The proof of Theorem 4.1 allows us
to easily conclude that the optimal value of the performance criterion is
Jb = b'll(tf)b
which is the (11nninial) variance of the estimation error at time t f: thus the variance
depends on the value of the matrix II at that time. Note that in the proof of Theorem
4.1 both the final time t f and the initial time to are finite and given but generic:
therefore b'fl(t)b is the (minimal) variance of the estimation error of b'x(t).
x=0,
y=x+w,
x(0) = xo,
where the dependence of the measurement (as supplied by the instrument) on the
value of the quantity has been assumed to be purely algebraic (see also Fig. 4.1 (a)).
Letting 1lo > 0 and W > 0 denote the variance of x(0) and the intensity of the white
noise w, respectively, it is easy to check that the solution of the relevant DRE is
II(t) =
now
W + riot
97
x(0
-IL
s+S
(a)
(b)
Figure 4.1: The problem discussed in Example 4.1 when the instrument behaves in a purely algebraic (a) or dynamic way (b).
so that, with the obvious choice b = 1,
a(t)
_ W+riot
W + Hot f
no
y(t)dt
W + not f f0tf
since
'00(t) =
110
W +IIotf
Note that when llot f > W, i.e., when the observation interval is sufficiently large or
the confidence in the (a priori available) expected value of the quantity to be measured is sufficiently smaller than the confidence on the reliability of the instrument
(no > W), the optimal estimate amounts, in essence, to averaging the collected rnca
sureinents. Finally, observe that when the observation interval becomes unbounded
(t f -* oo), the variance of the estimation error tends to 0 because lirnt-. r1(t) = 0.
Assume now that the data y" supplied by the instrument results from a first
order filtering process so that it is given by the equations
= -bx` + ryx,
y` =x*+w
with 6 and -y both positive (see also Fig. 4.1 (b)). Suppose that x"(0) = 0: to
be consistent let no =diag[1, 0] and consider the combinations of values for the
parameters b, 'y, W which are shown in Tab. 4.1 together with the corresponding
values of the variance of the estimation error x(1), i.e., the values of H11(1) (the
(1,1)-element of II in the new framework). The comparison between the second and
the fourth case points out the expected negative effect that an increase in the output
noise causes in the performances, while the comparison between the first and second
case emphasizes the benefits entailed in the adoption of an instrument with the same
gain -y/6 but wider band 6, i.e., a faster instrument. The comparison between the
second and third case confirms the prediction of the unfavourable effect of a gain
reduction that occurs while keeping the band unaltered. Finally, the comparison
98
W hil(l)
case
ry
10
10
3
4
10
10
10
10
0.86
0.54
0.99
0.92
'T'able 4.1: Example 4.1: variance I1ii(1) of the estimation error of x(1).
between the first and third case shows that, while keeping the product gainxband
constant, the negative effect entailed in a gain reduction is not compensated, at least
for sufficiently short observation intervals, by increasing the band.
Remark 4.4 (Correlated noises) The proof of Theorem 4.1 suggests a way of dealing
with the case where v and w are correlated noises, i.e., the case where Z 0 0. In fact
it is easy to check that eq. (4.17) becomes
Jb = a(to)lloa(to) +
fL
case Z=0.
Example 4.2 Again consider the problem presented in Example 4.1 and assume that
the instrument is a first order dynamical system with unitary gain. A feedback action
is performed on the instrument: a term proportional to the measure is added to the
input of the instrument itself. The equations for the whole system are
i=0,
i* =Sx+S(k-1)x*+kw,
y* = x* + w
with 6 > 0 and k: real. With the usual notations we get
A=I
1
L
S(k0
S
1) J ,
10
=
62k2W 6kW
6kW
99
It is easy to check that Ac and Vc are independent of k. Thus also the solution of
the relevant DRE is independent of k and the conclusion can be drawn that such a
feedback action has no effect on the quality of the estimate within the framework at
hand.
The importance of Theorem 4.1 is greater than might appear from its statement. Indeed, it allows us to devise the structure of a dynamical system the
state of which, 1(t), is, for each t E [to, t1], the optimal estimate of the state
of system (4.1)-(4.6). This fact is presented in the next theorem.
Theorem 4.2 Consider the system (4.1)-(4.6) with
0, to = 0 and Z =
0. Then, for each b E R' and for -oo < to < t < t f < oo the optimal estimate
of b'x(t) is b':b(t), 1(t) being the state, at time t, of the system
'gto) = 0
(4.18a)
(4.18b)
f a'(t)11(t)C'(t)W-'y(t)dt
Ito
rt
= b' J f W'(t, t f)HI(t)C'(t)W-ly(t)dt
to
where WY(t,-r) is the transition matrix associated to -(A + LC)'. By exploiting well-
known properties of the transition matrix (see Section A.2 of Appendix A) and
denoting with 4)(t, ,r) the transition matrix associated to (A + LC), we get
the effects on x of the deterministic input u and the time propagation of the
expected value of the initial state. The presence of the deterministic input is
taken into account by simply adding the term Bu to the equation for X, while
100
[i(t)
Theorem 4.3 Consider the system (4.1)-(/4.6) with Z = 0. Then, for each
b E IT' and -oo < to < t < t f < oo the minimal variance estimate of b'x(t) is
b'(t), (t) being the state, at time t, of the dynamical system
(4.19a)
(4.19b)
(A + LC)! - Ly + (B + LD)u.
Remark 4.6 (Meaning of 11(t)) By making reference to the proof of Theorem 4.1,
Remark 4.3 and Theorem 4.3 it is easy to conclude that
b11(t)b = E[(b'x(t) - b'(t))2] = b'E[(x(t) - x(t))(x(t) - x(t))']b.
Since b is arbitrary, the matrix n(t) is the variance of the optimal estimation error at
time t: therefore, any norm of it, for instance its trace, constitutes, when evaluated
at some time r, a meaningful measure of how good is the estimate performed on the
basis of the data available up to r.
4.2.
101
y(t)
L(t)
H-
y(t)
x(t)
B(t)
A t)
A=
11 1,
01
M (t) =
2 + 2e 2t
1 - e2t - 2te2t
1 - e2t - 2te2t
1 + 3e 2t + 2te2t + 2t2e2t.
t-.00
from which one can conclude that the variance of the estimation error vanishes as
the observation interval increases.
Remark 4.7 (Incorrelation between the estimation error and the filter state) An interesting property of the Kalrnan filter is put into evidence by the following discussion.
Let e := x -x and consider the system with state [ e' x ] ' which is described by
the equations
e]
x
A][Ie
A--LC
+[ 0
][
v
w
]+[
B I U.
By denoting with e(t) and X" (t) the expected values of e(t) and 1(t), respectively, and
letting
- x (t) ]]
[ II12(t)
1I22(t)
'
102
(4.20)
(4.21)
(4.22)
1112(to) = 0,
(4.23)
l22(to) = 0.
Here the material in Section A.5 of Appendix A has been exploited. By recalling
that L = -fC'W -1, it is straightforward to check that eq. (4.20) coincides with
eq. (4.12) and that eq. (4.22) is indeed eq. (4.13), so that
From this
identity it follows that in eq. (4.21) it is -1I11C'L' - LWL' = 0: thus, 1112() = 0
solves such an equation with the relevant boundary condition (4.23). This fact proves
that the stochastic processes e and x are uncorrelated. By exploiting Remark 4.4 the
same arguments can easily be extended to the case where v and w are correlated
(Z 0 0).
The proof of Theorem 4.1 suggests which results pertaining to LQ problems are
useful in the case of an unbounded observation interval, that is when to = -oo.
By referring to Section 3.3 of Chapter 3.1, the initial state of the system is
supposed to be known and equal to zero, so that to = 0, no = 0 and a suitable
reconstructability assumption is introduced (recall that this property is dual to
controllability). On this basis the forthcoming theorem can be stated without
requiring any formal proof.
Theorem 4.4 Let the pair (A(t), C(t)) be reconstructable for t < t f. Then the
problem of the optimal state estimation for the system (4.1)-(4.6) with Z = 0,
:z;o = 0, II() = 0 admits a solution also when to -- -oo. For each b E Rn and
T < tf the optimal estimate of b'x(T) is given by
where 0(T) is the
limit approached by the solution, evaluated at T, of the equation
XI (t) _ [A(t) + L(t)C(t)]1(t) - L(t)y(t) + Bu(t),
(4.24)
:a;(to) = 0
when toy --> -oo. In eq. (4.24) L(t) :_ -II(t)C'(t)W-1 and, for all t, fl is a
symmetric and positive semidefinite matrix given by
II(t) =
eolim
4-00
H(t, to),
11(t, to) being the solution (unique, symmetric and positive semidefinite) of
the differential Riccati equation (4.12) satisfying the boundary condition
11 (to) to) = 0.
103
Thus, the apparatus which supplies the optimal estimate possesses the structure shown in Fig. 4.2 (with to = 0 and, if it is the case, the term Du added
to y) also when the observation interval is unbounded.
In a similar way it is straightforward to handle filtering problems over
an unbounded observation interval when the system is time-invariant: indeed,
it is sufficient to mimic the results relevant to the optimal regulator problem
(see Section 3.4 of Chapter 3.1) in order to state the following theorem which
refers to the time-invariant system
(4.25a)
(4.25b)
Theorem 4.5 Consider the system (4.25), (4.1 0-(4.6) with Z = 0, -to = 0,
IIo = 0 and the pair (A, C) observable. Then the problem of the optimal state
estimation admits a solution also when to -, -oo. For each b E Rn and T < t f
(r), where (r) is the limit
the optimal estimate of b'x(r) is given by b% ,,'(,r),
approached by the solution, evaluated at r, of the equation
(4.26)
when to --> -oo. In eq. (4.26) L := -IIC'W -1, rI being a constant matrix,
symmetric and positive semidefinite, which solves the algebraic Riccati equation
0=IIA'+An -IIC'W-'CII+V
(4.27)
to -4- 00
II(t, to) being the solution (unique, symmetric and positive sernidefinite) of the
differential Riccati equation (4.12) with the boundary condition II(to, to) = 0.
Example 4.4 Suppose that an estimate is wanted of a quantity which is slowly varying in a totally unpredictable way. Measures of this quantity are available over an
unbounded interval. The problem can be viewed as a filtering problem relative to the
first order system
x=v,
y=x+w
where [ v w ]' is a white noise with intensity = diag[V, W], V # 0, W ; 0. The
ARE relevant to this problem admits a unique nonnegative solution n = VW: thus
the filter gain is L = - V/W. Note that the transfer function of the filter from the
input y to the state x is
G(s) =
1 + ST
104
where r = - /W/V . Therefore the Kalman filter is a low-pass filter the band of
which depends in a significant way on the noise intensities: the band increases when
the input noise is stronger than the output noise and, vice versa, it reduces in the
opposite case.
4.2.2
A possible way of dealing with the filtering problem in the singular case is now
presented with reference to the tirrle-invariant system described by eqs. (4.25).
Thus the intensity of the output noise is a matrix W which is not positive
definite, i.e., W > 0, det[W] = 0 and, for the sake of simplicity, the rank of
matrix C (eq. (4.25b)) is assumed to be equal to the number p of its rows.
Denote with T:= [ T1' T2 ]' an orthogonal matrix such that
TWT' =
S2
Yd(t)
y<: (t)
T1Cx(t)
T2Cx(t)
T1w(t)
T2w(t)
in view of the fact that the intensity of the white noise T2w is zero, this relation
can be rewritten as
(4.28a)
y,(t) = CAt)
(4.28b)
105
(4.29)
x(1)(t) := C*x(t).
(4.30)
where
(4.31a)
(4.31b)
(4.31c)
Equations (4.31) define a dynamical system with state x(i), known inputs u
and yC, unknown inputs (noises) v and w*, outputs yd and yC. More concisely,
eqs. (4.31) become
(4.32a)
(4.32b)
u(i)(t)
:-
yC t
u(t)
C(i) :=
yd(t)
y(1) (t)
w(1) (t)
yC(t)
C*Arc C*B ],
Cdr*
L ccAr*
D(1) :=
Cdre
CCArC
0
CCB
WO)' ]' is
S1
CCVC*'
CCZT1
T1Z'CCI
CCVCC
w* (t)
CCv(t)
106
W(1)
[CCZTT
T1Z'CC
CCVCc
is positive definite, the filtering problem relative to the system (4.32) is normal
and the results of the preceding section can be applied, provided that the
probabilistic characterization of x(') (to) could be performed on the basis of all
available information. If, on the other hand, W(1) is not positive definite, the
above outlined procedure can again be applied. Note that the dimension of the
vector x(1) is strictly less than n: thus, the procedure can be iterated only a
finite number of times, and either the situation is reached where W(z) > 0 or
n noise-free outputs are available and suited to exactly estimate x.
Assuming, for the sake of simplicity, that W(1) is positive definite, the
expected value and variance of the initial state of system (4.32a) has to be
computed, given y,.(to) = Cx(to). If x(to) is not a degenerate gaussian random
variable, i.e., if no > 0, it can be shown that the following relations hold
HUcC(CCHoCC)-1CCIIoJC*/.
Let II(1) be the solution (unique, symmetric arid positive semidefinite) of the
DRE
WO (t) =
II(1)(t)A(1)' + V(1)
-II(1)(t)C(r)/(W(r))-1C(1)li(1)(t)
Z(1)](W(r))-r.
(4.33)
107
Bu>
and the filter can be implemented according to the scheme in Fig. 4.4 where
differentiation is no longer required since L, can be evaluated in advance.
Example 4.6 Consider the system
xl = X2,
x2 = v2,
y1 = X1 + W1,
Y2 = x2
where the zero-mean white noises V2 and wi are uncorrelated and their intensities
are V2 > 0 and Sl > 0, respectively. According to the previous notation, the system
is affected by the noises v and w, their intensity being specified by the matrices
V = diag[0, V2], Z = 0, W = diag[S2, 0]. The initial state is a zero-mean random
variable with variance H0 = I. Consistent with the discussion above it follows that
Cc=[ 0 1 ], Cd=C*=[ 1 0 ]
so that we obtain v(l) = 0, A(1) = 0, B(1) = 1, 01) = 0, xol) = 0, n(l)
ol) = 1,
Z(1)=[ 0 0 ],and
C(1)
'
D(1)
W(1)
V2
108
Dc
(recall that there is no control input u). Since W(' > 0 no further steps are required
and it is easy to check that the solution of the relevant DRE is H ' (t) = Q /(t + SZ).
'.['herefore, Li(t) = Lc(t) = 0, Ld(t) = -1/(t + f) and the filter is defined by the
equations
=Y2+
1
t
(yi - c),
1
Note that the solution does not depend on the intensity of the input noise: this is a
consequence of the fact that the state variable x2 can be measured exactly.
Example 4.7 Consider the system
1 = x2 + vi,
2 = X3 + V2,
X3 = V3,
yi = Xi + wi,
y2 = X2 + W2
where the zero-mean white noises v and w have intensity E' specified by the matrices
V = diag[1, 0, 1], Z = 0, W = diag[1, 0], so that both V2 and W2 can be considered zero. The initial state is a zero-mean random variable with variance. flo = I.
109
Consistent with the discussion above it follows that yc = y2, yd = y1 and hence we
get
C,,=[ O
0 ] , Cd= [
0 ] , C,*= [
0
and
0
,r*=
0
0
0
0
1
Since there is no control input u, the following relations are obtained: u(1) = yc,
A(1) = 02x2, C(1) = 12, D(1) = 02x1, 01) = I2, Z(1) = 02x2, W(1) = diag[1, 01 and
X(l) =
x1
x3
, v(1) =
V1
, 2v
vs
(n)
w1
v2
,y
(1)
Yd
yc
B (i) =
The matrix W(1) is not positive definite and the procedure has to be further applied
to the second order system
X11)
= yc + vi1),
(1)
(1)
x2 = v2
x21) + w21) =
], Cd(1) = [
0 ] = C`(1), and
0],r*(1)=[o1.
Consistent with this result, x(2) =x11), v(2) = v11), V(2) = 1, Z(2) = 01x2, W(2) ='2,
0 1
A(2) =0, D(2) = 02x2, B(2)
w(2)
(1)
C(2) = [
the matrix W(2) is positive definite, the procedure ends at this stage and the
state estimation problem concerns the first order system
x(2) = u(1) + v(2) = yc -1- v(1)
y12) =X (2) +W 11),
Y2(2)
V21)
By taking into account Fig. 4.3, the filter equation can easily be found, precisely
2
X(2)
_ -Ld1)(y1 - x(2)) + y2 dd
110
with x(2)(0) = xo2). By exploiting the discussion for the alternative scheme of Fig.
4.4 we get
+ Z,
satisfying the boundary condition 11(2)(0) = II(2). In the case at hand we find xo2) = 0
(t) =
0. Finally, the
]Ax=A[g:
x = U.
with the already found relation Cdx = 0, imply that the pair (A, C) is not
observable. By iterating these arguments the truth of the claim above can be
ascertained also in the case where WW > 0, i > 1. Finally, notice that in
the scheme of Fig. 4.4 the block L, is no longer present since this function is
actually constant.
111
y=x1
under the assumption that the voltage across the condenser can exactly be measured.
The signals vi and V2 account for two sources of noise. For an unbounded observation
interval the filter is sought corresponding to the intensity V = diag[vl, v21 of the
stochastic process [ VI v2 J. By adopting the usual notation and letting C* =
[ 0 1 ], one finds A(1) _ -1, B(1) = 1, C1) = -1, D(1) _ -1, 01) = v2,
W(1) = vi. The solution of the relevant ARE is
VV12
+ vlv2 - vl
so that the filter equations are (recall that x(l) =x2 and L,,
/1 +v2/vl - 1)
:P) = z - Ley,
x2 = P),
and the transfer function G(s) from y to 12 is
G(s) _ 1 + LC - Las
1+Lc+s
Note that the time constant of G(s) approaches 1 when v2/vl --. 0, while it approaches 0 when v2/vl --> 00.
112
4.3
J=E
tf
t
to
(x (t)Q(t)x(t) + u'(t)R(t)u(t))dt
(4.34)
where, as in the LQ context, Q(t) = Q1(t) > 0 and R(t) = R'(t) > 0, Vt, are
matrices of continuously differentiable functions and
0 to avoid triviality. In the performance index (4.34) a term which is a nonnegative quadratic
function of x(tf) could also be added: its presence, however, does not alter the
essence of the problem but rather makes the discussion a little more involved.
For the sake of simplicity it is not included here and likewise the intensity W
of the noise w is assumed to be positive definite.
Problem 4.2 (LQG Problem) Consider the system (4.1)-(4.6): Find the control which minimizes the Performance index (4.34).
111 Problem 4.2 the control interval is given and may or may not be finite.
In the first case it is obvious that the multiplicative factor in front of the
integral is not important, while in the second case it is essential as far as the
boundedness of the performance index is concerned.
4.3.1
The solution of Problem 4.2 is very simple and somehow obvious. In fact,
according to it, the actual value of the control variable is made to depend on
the optimal estimate of the state of the system (that is, the state of the Kalman
filter) through a gain matrix which coincides with the matrix resulting from
the minimization of the deterministic version of the performance index (4.34)
in an LQ context, namely
id =
tf
to
The precise statement of the relevant result is given in the following theorem.
Theorem 4.7 Let Z = 0, to and t f be given such that -oo < to < t f < oo.
Then the solution of Problem 4.2 is
u(:Z-, t) = K(t).'
(4.35)
4.3.
where
113
(4.36)
J, := E
[x(t)Q(t)x(t) + u'(t)R(t)u(t)Jdt+ tr I
J
tf Q(t)n(t)dt I
to
The link between x and u can be established by means of eq. (4.36), yielding
or,
114
Pr(to)5ox+
ft1
L(t)WL'(t)PK(t)dtj
J, =
x=u+v,
y=x+w
and the performance index
L
J = E [ f f [Qx2 + u2}dt]
0
VW, w := \/V/-W
VI-Q, the solutions of the two DRE relevant to the problem are
P(t) -
1 - e2p(L-L f)
+
11o + v + (IIo - v)e-2u,L
II(t) = 11o + v - (Ho - v)e-2wt'
v
e2w(L-L f)
which imply K = -P and L = -II/v/W-. The responses of the system state and
control input are shown in Fig. 4.6 when Q = IIo = 1, V = 1 and W = 0.1.
Corresponding to the same sample of the noises, the responses of these variables are
also presented in the quoted figure when the control law u = Ky is implemented, that
is when the Kalnian filter is not utilized: the deterioration of the system behaviour
is apparent. Similar conclusions can be drawn from the analysis of the same figure
corresponding to different noises intensities, namely V = 0.1 and W = 1. The better
performance of the control system which exploits a Kalman filter is put into particular
evidence by the control transient.
115
Figure 4.6: Example 4.9: responses of x and u when a Kalrnan filter is (f) or
is not utilized.
4.3.2
Unbounded control intervals can be dealt with also in a stochastic context arid
a particularly nice solution found when the problem at hand is stationary, thus
paralleling the results of the LQ and filtering framework. Consistent with eq.
(4.34), the performance index to be minimized is
J=E
liar
t p -4-00
tr
tf -to J u
[:x'(t)Q(t)x(t) + u'(t)R(t)u(t)]dt
(4.37)
t f --00
where the matrices Q and R satisfy the usual continuity and sign definition assumptions. Moreover, the system (4.1)--(4.6) is controllable and reconstructable for all t, the initial state is zero, Z = 0 and W > 0. If these
assumptions are satisfied the solutions of the two DRE relevant to Problem
4.2 can indefinitely be extended and the following theorem holds.
Theorem 4.8 Assume that the above aasumptions hold. Then the solution of
Problem 4.2 when the control interval is unbounded, that is when the perfor-
116
t) = K(t)x
lim
to +_00
1I(t, to),
where P(t, t1) and 1I(t, to) are the solutions of the differential Riccati equations specified in Theorem 4.7 with the boundary conditions P(t f, t1) = 0 and
II(to,to) = 0.
Remark 4.9 (Optimal value of the performance index) If the LQG problem over an
infinite interval admits a solution, the optimal value of the performance index can
easily be evaluated by making reference to Remark 4.8. Thus
r
.1 =
fin,
t!
to
tr I
ff [Q(t)II(t) + L(t)WL'(t)P(t)JdtJ
t f -+oo
The case when all the problem data are constant deserves particular attention:
the most significant features are the constancy of matrices P and II (arid hence
of matrices k and L, too) and the fact that they satisfy the ARE resulting from
setting to zero the derivatives in the DRE of the statement of Theorem 4.8. The
importance of this particular framework justifies the formal presentation of the
relevant result in the forthcoming theorem where (unnecessarily) restrictive
assumptions are made in order to simplify its statement and some results
concerning Riccati equations are exploited (see Chapter 5, Section 5.3).
u,, (x) = Kx
i(t) = (A + LC)x(t) +
Ly(t)
4.3.
117
61xi
15
-15
-6
Figure 4.7: Example 4.10: responses of xl and u when the controller makes
(f) or does not make use of a Kalinan filter.
0 = PA + A'P - PBR-1B'P+Q,
O = IIA' + AH - nC'W-1CII + V.
Example 4.10 Consider the system
xl = X2,
x2 = U + V2,
y=xi+w
with the noise intensities given by V = diag[0,10] and W = 0.1. Associated to this
system is the performance index (4.37) where Q = diag[1, 0] and R = 1. Each one
of the two ARE relevant to the resulting LQG problem admits a unique positive
semidefinite solution, namely
P=
II=L
so that
[1
20J
I1
vf2i ] ,
-[
20
10
)'
Corresponding to the same noise samples, the responses of xl and u are shown in
Fig. 4.7 when the above controller is used or a controller is adopted with the same
structure and the matrix L replaced by L = - [ 20 100 ] ' (the eigenvalues of the
error dynamic matrix are thus moved from -2.24(1 j) to -10). The control variable
effort is apparently smaller when the result in Theorem 4.9 is exploited.
11.8
Remark 4.10 (Optimal value of the per formreance index in the time-invariant case)
In view of Remark 4.9, the optimal value of the performance index when the LQG
problem is tithe-invariant and the control interval is unbounded is given simply by
J = tr[QII + PLWL'1.
`['his expression implies that J > tr[QII] since the second term is nonnegative (the
eigenvalues of the product of two symmetric positive semidefinite matrices are nonnegative). This inequality holds independently of the actual matrix R: therefore, even
if the control cost becomes negligible (i.e., when R --+ 0), the value of the perforrnarice index cannot be less than tr[QII] which, in a sense, might be seen as the price
to be paid because of the imprecise knowledge of the system state. Since it can also
be proved that
= tr[PV + tlk'RK],
the conclusion can be drawn that J > tr[PV] and again, even when the output
rneasurerrrent becomes arbitrarily accurate (W --, 0), the optimal value of the performance index cannot be less than tr[PV] which, in a sense, might be seen as the
price to be paid because of the presence of the input noise.
Remark 4.11 (Stabilizing properties of the LQG solution) When the assumptions in
Theorem 4.10 hold, the resulting control system is asymptotically stable. Indeed,
since the Kalman filter which is the core of the controller has the structure of a state
observer, it follows, from Theorem B.2 (see Section B.3 of Appendix B.1) that the
eigenvalues of the control system are those of matrices A+BK and A + LC. All these
eigenvalues have negative real parts because the solutions of the ARE (from which
the matrices k and L originate) are stabilizing (recall the assumptions in Theorem
4.10 and see Theorem 5.10 of Section 5.3 of Chapter 5).
The solution of the optimal regulator problem has been proved to be robust
in terms of phase and gain margins (see Subsection 3.4.2 of Section 3.4 in
Chapter 3.1). The same conclusions hold in the filtering context because of
the duality between the two problems. Thus- one might conclude that the
controller defined in Theorem 4.9 implies that the resulting control system is
endowed with analogous robustness properties with regard to the presence of
phase and gain uncertainties on the control side (point P,, in the block scheme
of Fig. 4.8) and/or on the output side (point P. in the same block scheme).
This actually fails to be true, as shown in the following well-known example.
Example 4.11 Consider the LQG problem defined on a second order system E characterized by the matrices
A- [
1I
B
,
C=[
], Q=aq[
0 ], W=1, R=1
1I
4.3.
119
Py
PC
,n
I/s
LIT Kalman
Filter
Controller
P=aq[11121
1
F=
-a9
-a9
a,,
0
0
1-a
1
-(aq+ ay) 1 - aq
This unpleasant outcome is caused by the following fact: the transfer function
120
sclicnne of Fig. 4.8 at the point P,, does not coincide with the transfer function
Z a (,s) := K(sI - A) -113 which is expedient in proving the robustness of the
solution of the LQ problem. Indeed, TT is the transfer function which results
from cutting the scheme of Fig. 4.8 at the point P. A similar discussion applied
to the other side of G(s), with reference to the transfer function Tv(s)
-G(s)k(sI - (A + BR- + LC))-1L (which results from cutting the scheme in
Fig. 4.8 at the point P.) and to the transfer function Tf(s) := C(sI - A)-1L
which, in the Kalman filter framework, plays the same role, from the robustness
stated under the assumptions that the number of control variables u equals
the number of output variables y and the matrices B, C are full rank.
Theorem 4.10 Let the triple (A, B, C) be minimal and V = vBB'. Then, if no
transmission zeros of the triple (A, B, C) has positive real part, the function
Tom, approaches the function Tc as v --' oo.
Theorem 4.11 Let the triple (A, B, C) be minimal and Q = qC'C. Then, if no
transmission zeros of the triple (A, B, C) has positive real part, the function
1;, approaches the function T f as q -+ oo.
Remark 4.12 (Alternative statement of Theorems 4.10 and 4.11) By recalling the
role played by the matrices Q and R in specifying the meaning of the performance
index and by matrices V and W in defining the noises characteristics, it should be
fairly obvious that instead of letting matrices Q and V go to infinity, we could let
matrices R and W go to zero.
Example 4.12 Consider the LQG problem defined by the matrices
A=
0
0
-1
, B=
121
1001 I Tu11(jw) I
db
*--- T (w)
0
-100
50
I Tu22(jw) I
Tc22V w)
0
0-2
10
0)
50
I Tu22Vw) I A
Ta22((0) w
10-2
104
10
104
V=1 --W
v=102
-150
-150
(a)
(b)
Figure 4.9: Example 4.12: plots of IT.u11(jw)I and 1TU22(jw)I when C = Cd (a)
and C = Cs (b).
-2
,
1
1)'Gs-[0
0
0
0
1
When C = Cd the triple (A, B, C) has a transmission zero in the right half-plane
(s = 1), while this does not happen when C = Cs. The plots of the frequency response
of ITu11(jw)I and IT,,.22(jw)I are shown in Fig. 4.9 for the two cases, corresponding to
some values of the parameter v. Observe that ITul l I does not approach IT,11 1 when
C = Cd. However, this circumstance does not prevent some other component of 1Tu I
from approaching the corresponding component of T,,, as v increases (see the plots of
ITu22I in Fig. 4.9).
122
4.4
Problems
Problem 4.4.1 Find the Kahnan filter for the system defined by the matrices
0
0
0
-1
A=
, C=
= I when the observation interval is unbounded. Is the filter stable? What can be
said of its stability properties if the optimal gain L is replaced by L := 2L?
Problem 4.4.2 Discuss the noisy tracking problem
x = Ax -I- Bu + v,
y = Cx + w,
= I''d + C,
=H19+v,
J=E
tt
tf
to
[(y-i)'Q(y-)+u'Ru]dt
to
where -oo < to < t f < oo. Generalize the result to the case of an infinite control
interval, that is when the performance index is
o[(y
It
J=E
lint
to --+ -00
tf 1 to It!
-l-t)'Q(y-lc)+u'Ru]dt
tf -+ 00
Problem 4.4.3 Set the complete picture of existence and stability results for the
Kalman filter in the tune-invariant case when the observation interval is unbounded.
Problem 4.4.4 Find the solution of the optimal filtering problem on the interval
[to, t1] defined by the system (4.1a), (4.1b) and (4.2)-(4.6) with xo = x(-r), where T
is a given interior point of the interval [to, ti].
Problem 4.4.5 Discuss the optimal filtering problem when the input noise v is the
output of a known tune-invariant stable system driven by a white noise.
Problem 4.4.6 For -oo < a < oo and -oo < Q < oo discuss the existence and
stability of the Kalman filter which supplies the optimal estimate of the state of the
system
x=
0
x+v,
a
1 ]x+w
Y= [ 0
4.4. Problems
123
(3
,0
,02
0
0
0
0
0
0
0
0
0
0
0
1
A=
0
0
, V=
0
0
a2
C=[ 1
W=p.
For -oo < a < oo, what can be said of the variance of the optimal state estimation
error as p -, 0+?
Problem 4.4.9 Consider the triple of matrices
0
A=
0
0
0
0
, C=
0 ], L=
-1
-2
-2
the equation x = (A + LC)x - Ly define the Kalman filter for the system
x = Ax + v, y = Cx + w when the observation interval is unbounded?
Problem 4.4.10 Consider the system x(t) = Ax(t) + Bv(t), y(t) = Cx(t) + Dw(t)
where the pair (A, C) of constant matrices is observable and the zero-mean uncor-
related white noises v and w have intensities V and W > 0, respectively. Select
matrices B and D in such a way that the eigenvalues of the resulting Kalman filter
corresponding to an unbounded observation interval have real part less than -a.
Chapter 5
The Riccati equations
5.1
Introduction
5.2
(5.1b)
126
-B(t)R
Z(t)
A'(t))B'(t)
Q(t)
(5.2)
= Z(t)
fi(t)
(5.3)
A(t)
Theorem 5.1 Matrix P(t) is a solution of eq. (5.1a) if and only if eq. (5.3) is
solved by [ x'(t) x'(t)P(t) ]' for all x(to).
Proof. Necessity. If A := Px has to solve eq. (5.3), then, in view of eq. (5.2), it follows
that
_ (A - BR-1B'P)x
and
d(Px)
dt
= Px + Pi = -Qx - A'Px.
if P solves eq. (5.1a), these equations are satisfied for all x(to). Sufficiency. If
[ x' x'P ]' solves eq. (5.3) for all x(to), then x = (A - BR-1B'P)x, so that
A=
d(Px)
PBR-'B')x.
= Px + Pi = (P + PA -
The above theorem allows us to express the solution of eqs. (5.1) as a function
of the transition matrix c1 associated to Z. As a matter of fact
S
x(tf) = 4'(tf, t)
P(t)
x(t),
so that if
T11(t,tf) '1'12(t,tf)
W 21(t, tf)
4122(t, tf)
then
(5.4a)
(5.4b)
The matrix on the right-hand side of eq. (5.4a) is the inverse of the transition
5.3.
127
therefore it is nonsingular, so that from eqs. (5.4) (which hold for all x(to) and
hence for all x(tf)) it follows that
(5.5)
r
L
-1
-1 1
0
J,
4'(t'r)
_ 1 I et-T +
eT-t
2 L eT-t -
et-T
5.3
1 + a - (1 - a)e2(t-t f )
1+or
+(1-a)e2(t-tf)
0 = PA + A'P - PBR-1B'P + CC
(5.6)
-B
Z:- -C'C
B
'B'
A'
(5.7)
The eigenvalues of this matrix enjoy the peculiar property stated in the following theorem.
Theorem 5.2 The eigenvalues of matrix Z are symmetric with respect to the
origin.
Proof.
-In
In
0
which is such that J-1 = X. It is easy to verify that JZJ-1 = -Z', from which the
theorem follows (recall that Z is a real matrix).
:128
0 such that
Z[
0
then, in view of eq. (5.7), Aw = Aw and C'Cw = 0. This last relation is equivalent
to Cw = 0, so that from the PBH test (see Theorem A. 1 of Appendix A), we can
conclude that A is an eigenvalue of the unobservable part of E. Vice versa, if A and
w 0 0 are such that Aw = Aw and Cw = 0, then eq. (5.8) holds and A, which
is an eigenvalue of the unobservable part of E because of the PBH test, is also an
eigenvalue of Z. Point (b) If there exists w # 0 such that
Z[wl=A[w
I,
(5.9)
then, by taking into account eq. (5.7), it is -A'w = \w and -BR-'B'w = 0. Since
R is nnousingulax and B is full rank, this last equation implies B'w = 0, so that, in
view of the PBH test, we can conclude that A is an eigenvalue of the unreachable
part of E. Vice versa, if A and w # 0 are such that A'w = Aw and B'w = 0, then
eq. (5.9) holds and A, which is an eigenvalue of the unreachable part of E because of
the PBH test, is also an cigenvalue of Z.
Lennnna 5.1 states that all the eigenvalues of A which are not eigenvalues of
the jointly reachable and observable part of E(A, B, C, 0) are also eigenvalues
of Z.
Example 5.2 Let
A
and R 34 0 arbitrary. System Z(A, B, C, 0) is not observable and A = 1 is an eigenvalue both of the unobservable part and of the corresponding matrix Z, for all R.
5.3.
129
A tight relation can be established between the solutions of eq. (5.6) and
particular Z-invariant subspaces of C2n. Recall that a subspace
c2n
S:= I'm( Y
zI
YJ=1
YJV.
if, as will be assumed from now on, the matrix whose columns span S has full
rank, matrix V has dimension n if S is n-dimensional. Let further
In
and recall that two subspaces S and V of C2n are complements each of the
1)
S:= Irn( I X
Y
l
[Y]=[YIVa{
AXQJBRAB'Y
XV
YV
which implies V = X -'(AX - BR-'BY) and hence -QX - A'Y = YX -1(AX BR-'B'Y). This equation becomes, after postmultiplication of both sides by X-1,
-Q - A'YX -1 = YX -1 A - YX -1 BR-1 B'YX -1. This relation shows that YX -1
solves the ARE.
Im(I Y ) = Irra(I y
L
)
J
130
then, necessarily,
YT(XT)-1
= YX-1.
S .= Irn([ n ]).
The subspace S is obviously a complement of W. Furthermore, by recalling that P
solves the ARE,
[13I
A - BR-'B'P
PA - PBR-1B'P
[P1 (A - BR-'B'P)
This theorem allows some conclusions about the number HARE of solutions
of the ARE. More precisely, by recalling that each n-dimensional, Z-invariant
subspace is spanned by n (generalized) linearly independent eigenvectors of Z,
it can be stated that:
2n
2n(2nn!
(ii) If Z has multiple eigenvalues and is cyclic (its characteristic and minimal
polynomials coincide), then
r1ARE <
2n
(iii) If Z has multiple eigenvalues and is not cyclic, then HARE may be not
finite.
A= [ 0
1 ]B=[?]c=[1 0 ], R=1.
solution #
1
3
4
131
P11
-a
112
P21
P22
-a
j -j
0
.0
ja -1 -1 -ja
-ja -1 -1 ja
-j
columns of V
1,2
1,3
1, 4
2, 3
2,4
3,4
The eigenvalues of Z are distinct as they are: A1,2 = (1 j)/V'2- and A3,4 = -A1,2
(recall Theorem 5.2). Consistent with this fact, there exist six 2-dimensional, Zinvariant subspaces which are the span of the columns of the six matrices built up
with any pair of columns of
V.
which is the matrix resulting from gathering together four eigenvectors relative to
the four above mentioned eigenvalues. Each of these subspaces is a complement of
W, so that there exist six solutions of the ARE which can be computed as shown
in the proof of Theorem 5.3. They are reported in Table 5.1, where pij denotes the
(i, j) element of matrix P, solution of the ARE and a := \. Note that only the
solutions #1 and #6 are real; furthermore, the solution #1 is negative definite, while
the solution #6 is positive definite. As for the remaining solutions, the second and
the fifth ones are hermitian.
Example 5.4 Let
A=[0 ?]B=[]c=[ 0
]R=1.
The eigenvalues of Z are A1,2 = 1 and A3,4 = 2 (recall Theorem 5.2). Observe that
the couple (A, B) is not reachable and that the eigenvalue of the unreachable part is
1. Such an eigenvalue appears in the spectrum of Z (Lemma 5.1). The eigenvalues
of Z are distinct: consistent with this fact, there exist six 2-dirriensional, Z-invariant
subspaces. They are the span of the columns of the six matrices built up with any
pair of columns of
_
V
10
0
0
-1
-2
2
0
132
which is the matrix resulting from gathering together four eigenvectors relative to
the four above mentioned eigenvalues. Only two of these subspaces are complements
of W, namely those relative to columns (1,3) and (1,4) of matrix V, so that there
exist only two solutions of the ARE. They are
Pl
011.
[20
A= [
011,1
01
, B= [ 1 ] ' CFC= [
21
11
'
R=1..
The eigenvalues of Z are A1,2 = 1 and \3,4 = -1 (recall Theorem 5.2). It is simple
to verify that Z is cyclic, so that the number of solutions of the ARE must be less
than six. As a matter of fact only three solutions exist, namely
I'1-[0
1I'2 =[0
01]'p3-[
7
4
4
3].
Example 5.6 Let A = O2x2, B = C = R = 12. The eigenvalues of Z are A1,2 = 1 and
A3,4 = -1 (recall Theorem 5.2). It is easy to check that the minimal polynomial of
Z is coz (s) = (s - 1) (s + 1) so that Z is not cyclic. Hence the number of solutions
of the ARE can be not finite. Indeed, there is an infinite number of solutions which
are reported in Table 5.2, where pij denotes the (i, j) element of matrix P, solution
of the ARE, -y:= 1. - cxf3 and cx, 6 are arbitrary complex numbers.
Example 5.7 Consider the ARE defined by A = -212, B = C = 12 and R = -12. the
minimal polynomial of Z is cpz(s) = (s-V)(s+v/), so that Z is not cyclic but HARE
is finite. In fact, there exist only four solutions of the ARE, namely: Pl = (2- vf3-)I2,
P2 = (2 -F ')I2 and
T:=
-P
0
1
AC
[ 0
-BR-'B'
-A,
133
P22
P21
0
-1
1-
-1
-1
a
a
-1
-1
-1
P12
111
-1
'y
-'Y
a
a
a
a
-1
13
-'Y
This result raises the question whether one or more among the solutions of
the ARE are stabilizing, i.e., whether there exists a solution PS of the ARE
such that all the eigenvalues of A - BR-'B'PS have negative real parts. In
view of Theorem 5.4, this can happen provided only that no eigenvalue of Z
has zero real part and hence (recall Lemma 5.1) E must not have eigenvalues
with zero real part in its unobservable and/or unreachable parts. Furthermore,
the couple (A, B) has obviously to be a stabilizable one. Existence results for
PS will be given after the presentation of some features of PS (uniqueness,
symmetry, etc.). For the sake of convenience, the term stable subspace of matrix
Z [ PIS
=1
I, ( A
BR-'B'Ps).
Matrix A - BR-' B'Ps is stable and hence the subspace associated to Ps must be
the one spanned by the stable eigenvectors of Z.
134
Point (b) The uniqueness of the stabilizing solution follows from the uniqueness of
the stable subspace of Z. Alternatively, assume, by contradiction, that there exist
two stabilizing solutions of the ARE, P1 and P2, P1 54 P2. Then
either to select n real eigenvectors when all the eigenvalues are real, or to make
the 2-dimensional subspace spanned by a pair of complex conjugate eigenvectors
(corresponding to a pair of complex conjugate eigenvalues) be spanned by their
sunn and difference, the latter multiplied by j. Therefore the matrices X and Y
which specify SS can be thought of as real without loss of generality and Ps is real
too. Point (d) Since Ss is the Z-invariant subspace corresponding to the stabilizing
solution, it follows that Zr = rV, where V is stable. Letting W := X'Y - Y-X,
we obtain w = r-Jr, J being the matrix defined in the proof of Theorem 5.2. On
the other hand, JZJ' _ -Z', so that JZ + Z'J = 0 and 0 = r-(JZ + Z'J)r =
r irv + V-rNJr. The first and last terms of this relation can be rewritten as
0 = WV + V - W which, in view of the stability of V, implies 0 = W and, from the
(YX_1)'
YX-'
= PS .
= (X"")-1Y' . Therefore, Ps = YX-1 =
definition of W,
This last relation implies the symmetry of Ps since it is real.
X=10
o1,Y
135
Example 5.11 Consider the ARE defined in Example 5.7. The stable subspace of Z
can be specified by the matrices X = 12 and Y = (2 - V 3-)I2, so that Ps = Y = P1.
The remaining three solutions of the ARE which are reported in Example 5.7 (P2,
P3, P4) are not stabilizing.
An existence result for the stabilizing solution of the ARE can easily be given
when the matrix R is either positive or negative definite as the following theorem states, by showing that two necessary conditions become, if taken together,
also sufficient.
Theorem 5.6 Let R > 0 or R < 0. Then the stabilizing solution of the ARE
exists if and only if the following conditions both hold:
AX - BR-'B'Y = XV,
(5.10a)
-QX - AT = YV
(5.10b)
where V is stable. From eqs. (5.10) we obtain, by prennultiplying the first one of them
by Y' and taking into account what has been shown in the proof of Theorem 5.5,
(5.11)
Now assume, by contradiction, that X is singular, i.e. that there exists a vector
19 E ker(X), 0 j4 0. If eq. (5.11) is premultiplied by t9' and postmultiplied by 19 we
obtain
(5.12)
(5.13)
(5.14)
136
Let cp be the least degree rnonic polynomial such that W(V)i9 = 0. Such a polynomial
(V - yI)i;.
0 = (V -
Note that
0 because of the assumed minirnality of the degree of W. The last
equation implies that -y is an eigenvalue of V and therefore Re[ y} < 0. On the other
hand eq. (5.12) entails that E ker(X), so that eq. (5.13) holds also for 19 = and it
follows that
(5.15)
If, on the other hand, the ARE defined in Example 5.4 is taken into consideration, we call check that the pair (A, B) is not stabilizable and that no eigenvalue of
matrix Z has zero real part. Indeed the stabilizing solution does not exist.
Finally, consider the ARE defined in Example 5.7. The pair (A, B) is reachable and all eigenvalues of matrix Z have real parts different from zero. In fact the
stabilizing solution (Pr) exists.
Example 5.13 Consider the ARE defined by
C= [ 0
0B=[
A=[0
R=1.
The pair (A, 13) is reachable, but the characteristic polynomial of Z is '0z (s) _
s2(s2 - 1). The solutions of the ARE are
1'1,2= [ 0
01
P4= I
01
J'
Example 5.14 Consider the ARE defined by A = -1, B = C = 1, R = -1. The pair
(A, B) is reachable, but the characteristic polynomial of Z is V)z (s) = s2. In fact, the
only solution of the ARE is Pr = 1 which is not stabilizing.
Conditions for the existence and uniqueness of sign defined solutions of the
ARE can be given by further restricting the considered class of equations.
137
Theorem 5.7 Let R > 0. Then a solution P = P' > 0 of the ARE exists if and
only if the unreachable but observable part of system E(A, B, C, 0) is stable.
Proof. Sufficiency. See the results in Section 3.4, specifically Theorem 3.5. Neces-
sity. First observe that the eigenvalues of the unreachable but observable part of
E(A, B, C, 0) are also eigenvalues of matrix A + BK, whatever K is. Further, it is
immediate to ascertain that if P* satisfies the ARE, then it satisfies the equation
(5.16)
as well. Assume now that there exists a vector ; 0 such that (A BR- r B' P* ) =
A , Re[A] > 0. Should this not be true, then the theorem would be proved since
all the eigenvalues of the unreachable but observable part are also eigenvalues of
BR- r B P* . Thus, if eq. (5.16) (written for P*) is postrnultiplied by such a
Aand premultiplied by C ', we get
CCP*BR-1B'P*.
The sum of the first two terms of the right-hand side of this equation is 2Re[XIC-P*e
and is nonnegative because P* _> 0 by assumption. Also the remaining two terms
are nonnegative (recall the assumption R > 0): thus C = 0 and B'P*e = 0. This
last relation implies that A6 = (A - BR-1 B'P* ) = A6. In view of the PBH test
(Theorem A.1 of Appendix A) it is possible to conclude that A is an eigenvalue of
the unobservable part of E(A, B, C, 0) and the proof is complete.
Example 5.15 Consider the ARE defined in Example 5.3. The pair (A, B) is reachable: consistent with this, there exists a symmetric positive serrlidefinite solution, the
sixth one. If, on the other hand, reference is made to the ARE defined in Example
5.4, it is possible to check that: i) the pair (A, B) is not reachable, ii) the pair (A, C)
is observable, iii) the eigenvalues of A are nonnegative. Thus the unreachable but
observable part is certainly not stable. Indeed no one of the solutions of the ARE is
positive semidefinite.
(a) If the pair (A, C) is detectable there exists at most one symmetric and
positive semidefinite solution of the ARE. It is also a stabilizing solution;
(b) If the pair (A, C) is observable, the positive semidefinite solution (if it
exists) of the previous point is positive definite.
Proof. Point (a) Let P = P' > 0 be a solution of the ARE. From the proof of
Theorem 5.7 it follows that if e # 0 is such that (A - BR-'B'P) = \C with
Re[,\] > 0 then A is an eigenvalue of the unobservable part of E(A, B, C, 0), against
the detectability assumption. Therefore such a solution is a stabilizing one and hence
unique thanks to Theorem 5.5.
138
Point (b) By assumption there exists a solution P = P' > 0 of the ARE. If it is not
positive definite, there exists C # 0 such that PC = 0. Since P solves the ARE it
follows that 6-(PA + A'P - PBR-'B'P + C'C)C = 0 which, in turn, implies that
C-C'CC = 0, i.e., C = 0. In this case (PA + A'P - PBR-' B'P)6 = 0 and hence
also PAS = 0. Again it is C-A'(PA + A'P - PBR-1BP + C'C)Ae = 0 which, in
turn, entails that C-A'C'CAC = 0, namely CAS = 0. By iterating this procedure we
find that CA2C = 0, i = 0,1, ... , n - 1, against the observability assumption.
Example 5.16 Consider the ARE defined by
A= I
B=
fi1
J,
c= [
`2
0 ], R=1.
],p2_p1,p3[_34
and
Ps
-[
1-
10
2
-6
12
18]tPs=[
g],4=:
2].
4
the result in Theorem 5.8 holds in one sense only. Finally, make reference to the
equations defined in Examples 5.3, 5.5, 5.6: in all cases (A, C) is an observable pair
and, consistently, the relevant ARE admits a unique symmetric, positive semidefinite
solution which is positive definite and stabilizing.
Theorem 5.9 Let R > 0. If the stabilizing solution Ps exists, then it follows
that:
0=PsA.+A'Ps+T1.
5.3.
139
This equation is a Lyapunov equation in the unknown Ps: thus it admits a unique
solution which is also positive semidefinite since A. is stable and T1 > 0. Point (b)
Let Ps be the stabilizing solution and II any other symmetric solution of the ARE,
so that
This equation is a Lyapunov equation in the unknown (Ps - II): thus it admits a
unique solution which is also positive sernidefinite, since A,, is stable and T2 > 0.
Example 5.19 Consider the ARE defined in Example 5.17: it admits two positive
sernidefinite solutions: it is easy to check that the stabilizing solution exists, namely
Ps = P2 and Ps - P1 > 0. Theorem 5.9 can be tested also by making reference to
the equations defined in Examples 5.3, 5.5 and 5.6.
Example 5.20 In general, Theorem 5.9 does not hold if R is not positive definite. In
fact consider the ARE defined in Example 5.7. It admits four positive semidefinite
solutions one of which, P1, is also stabilizing but not maximal. Indeed such a solution
Lemma 5.2 Let R > 0. The hamiltonian matrix Z has eigenvalues with zero
real parts if and only if they are also eigenvalues of the unobservable and/or
unreachable part of system E(A, B, C, 0).
Proof. The proof will be given for the only if claim, the other claim being a direct
consequence of Lemma 5.1. Let w be such that
Z
[ , I =jwI,
77
77
I,
[,1 00.
t... I Z
77
+ ('mi).
140
Observe that the last terns of this equation and r9 := r,fA - -A'r, are imaginary
numbers. Thus -i;' C'C and -7ffBR-1 B'i7 (which, on the contrary, are real and
nonpositive) must both be zero, so that C = 0 and B',7 = 0. T'his fact implies that
1]
Z[
If
-[ Ar,
].[1J
0 then A = jw6 and C = 0 which in turn entails, in view of the PBH test,
0,
then -A'r1 = jwr, and B'r, = 0 which, again in view of the PBH test, implies that
-jw (and hence also jw) is an eigenvalue of the unreachable part of E(A, B, C, 0).
Theorem 5.10 Let 1Z > 0. Then there exists a unique symmetric, positive
semidefinite solution of the ARE if and only if system E(A, B, C, 0) is stabilizable and detectable. Moreover, such a solution is also stabilizing.
Proof. Sufficiency. If L(A, B, C, 0) is stabilizable and detectable, from Lemma 5.2
it follows that the eigenvalues of the harniltonian matrix Z have nonzero real part.
This fact, together with stabilizability of the pair (A, B), implies the existence of the
stabilizing solution of the ARE (Theorem 5.6). Thanks to Theorem 5.9, this solution
is symmetric and positive semidefinite, while, from Theorem 5.8, at most one of
such solutions can exist. Necessity. The existence of a stabilizing solution entails
stabilizability of the pair (A, B) and hence also stability of the unreachable but
observable part of E(A, B, C, 0). Let T be a nonsingular matrix which performs the
decomposition of system E(A, B, C, 0) into the observable and unobservable parts,
that is let
A :=1A!- I
*
Ao
AaJ,B
Al
.-
.-
Bo
B1
CT-1
:= [ C o 0 ] ,
C* :=
where the pair (Ao, Co) is observable and the pair (Ao, Bo) is stabilizable. In view
of Lemma 5.2 and Theorems 5.6, 5.9 there exists a symmetric, positive sernidefinite
and stabilizing solution Po of the Riccati equation
_[
PO
'
it is easy to check that P* is: (i) symmetric and positive sernidefinite; (ii) such that
141
namely that it solves the ARE. By assumption, it is the only solution possessing
these properties and furthermore it is also stabilizing, so that Ac given by
T_i r Ao-BOR-'B,P.
0 1T
l Ai - B1 R-' B' PO A2 J
is a stable matrix. This implies stability of A2 and hence detectability of the pair
(A, C).
Example 5.21 Consider the ARE defined in Examples 5.3, 5.5 and 5.6. The underlying systems E are stabilizable and detectable: consistent with this, in all cases there
exists a unique syrnrnetric, positive semidefinite solution which is also stabilizing. If,
on the contrary, reference is made to Example 5.13, it can be seen that system E is
stabilizable but not detectable: consistent with this, there exists a unique symmetric
and positive semidefinite solution which however is not a stabilizing one. Finally, if
Example 5.17 is again taken into consideration, stabilizability is checked while detectability does not hold. Indeed two symmetric and positive semidefinite solutions
exist, one of which is also stabilizing.
For the R > 0 case a summary of the most significant results is given in
Fig. 5.1.
Finally, a numerical method for the computation of the stabilizing solution (if it exists) of the ARE is now briefly sketched. It is considered to be one
of the most efficient among those presently available and is based on Theorem 5.5, point (a). More precisely, assume that a 2n-dimensional nonsingular
matrix T has been determined such that
Z* := TZT-1 =
Z11
Z12
Z22
T-i
x'11
[T21
T12
T22
then the stable subspace of Z is specified by the first n columns of T-1 and
hence the stabilizing solution of the ARE exists if and only if T11 is nonsingular
142
Thm. 7
L..art=statie
1
L A,C=detect.
A,C=observ
Thins. 5,8,10
3P=P'>0
P=Ps,unique
P=Ps, unique
Lemma 2
ReR(Z)] 0
Thms. 5, 6,9
-0.7071
-0.7071
-0.7071
0.4989
0.5011
0.4989
-0.5011
-0.4989
-0.5011
-0.4989
0.5011
we get
1
Z" = TZT-1 -
-1
0
0
0
0.7119
2.1197
-1
-0.7087
0
0
0.5000
0.5022
-0.7055
-0.4978
1.500
and
f,S [
-0.7071
-0.7071
,[
-0.7071
0
-0.7071 ]
-1
5.4. Problems
143
Example 5.23 Consider the ARE defined by A = 212, B = C = 12, R = -12. Letting
-0.2588
0
0.9659
0
-0.2588
0.9659
0
0.2588
0
0.9659
0
0.9659
0
0.2588
we get
-,F3
Z* =TZT-1 =
0
0
0
-2
0
0
vF3
-0
and
Ps
_
= [
0.9659
0
0.9659
,[
-0.2588
0
-0.2588 ]
_ -(2 + V 3)12
5.4
Problems
A=[0 ]B=[]Q=[g i 1.
Does there exists a stabilizing solution of the ARE?
Pl - [
,,
P2
012
]' A=[
], C= [
A= [ 1 1 ], B= [ 1
16
a ].
Find the values of the parameters a and 0 corresponding to which there exists a
stabilizing solution of the ARE.
144
A=
0
0
0
-1
0
0
B=
Q=
0
0
0
0
0
0
R=
r1
[ 0
-1 J
A=
[0 0],B=[1Q=[o 0]
I'-[2
3]?
A= 10
B= I
i0
Q= [ 0
A=[0
]B=[?].
R00and
A= [ '
1 1
,B= [1],C
Give the tightest possible upper bound on the number of its solutions.
10
A
0
L0
0
0
11
0
0
B=
0
0
, C=
0 ].
Part II
Variational methods
Chapter 6
The Maximum Principle
6.1
Introduction
(6.1)
x(to)=xoESo,
(6.2a)
x(tf)=xfESf
(6.2b)
are also present and express the need that the state at the initial and final time
belongs to given sets. Both the initial time to and the final time t f may or may
148
not be specified. The set of functions which can be selected as control inputs
is denoted by fl, which is the subset of the set f1 constituted by the m-tuples
of piecewise continuous functions taking values in a given closed subset U of
Iu"L: thus
E Si. The performance index J (to be minimized) is the sum of
all integral terra with a term which is a function of the final event, namely
J=
Jto
(6.3)
where the functions l and in possess the same continuity properties as the
function f. Finally, various kinds of specific limitations on both the state and
control variables can be settled by means of the constraints mentioned at point
(f) above. The detailed description of such constraints, which, for the sake of
convenience, are referred to as complex constraints, is postponed to Section 6.3.
As before, letting
to, xo, u(.)) be the system motion, the optimal control problem to be considered in this second part is, in essence, of the same
kind ors the one tackled in Part I and can be stated in the following way.
(6.4a)
(X (t f ), tf) E Sf,
(6.4b)
which apparently constrain the initial and final events. Handling this new
framework is however all easy job, as will be shown in the sequel: thus the
simpler setting of Problem 6.1 has been chosen for the sake of simplicity.
The so-called Maximum Principle will be presented in the forthcoming
sections, supplying a set of necessary optimality conditions. First, problems
with simple constraints are dealt with in Section 6.2, i.e., problems which reE Si are satisfied, while in the
quire that only the constraints (6.2) and
subsequent Section 6.3 problems characterized by the presence of complex constraints are tackled. Finally, particular problems are discussed in Sections 6.4
and 6.5: those characterized by the presence of singular arcs and the minimum
time problems.
A couple of definitions are expedient in the forthcoming discussion: the
first of them concerns the hamiltonian function which slightly differs from the
one introduced in Part I, while the second one specifies a particular dynamical
system.
149
(6.5)
do(t) = 0
(6.6a)
(6.6b)
only eq. (6.4a) is often referred to as the auxiliary system and eq. (6.4b) is
taken into account by just requiring the constancy of A0. Firnally, note that eq.
(6.1) can be replaced by
Wi(t) =
(6.7)
This last equation, together with eq. (6.6a), constitutes the so-called hantiltonian system.
6.2
Simple constraints
The optimal control problems dealt with in this section require that no complex
constraints are present on the state and/or control variables and that both the
set So and the set Sf are regular varieties, that is a set of the kind
dx
The nature of a regular variety is clarified by the following example.
150
(b)
(a)
Figure 6.1: The set constituted by the vectors r', i = 1, 2, 3 is (a) or is not
(b) a regular variety.
a, (x) = X2 - sin(xl ),
a2(X) = X2 -
5x1.
E(x) _
3
5ir
2x1
2X2
18x2
II
and
151
J=f
tf
6.2.1
The following result is available for optimal control problems where only simple
constraints are imposed to the state and/or control variables and the performance index simply consists of an integral term.
Theorem 6.1 Let
E S2 be a control which is defined over the interval
[to, t f], to < t f, and transfers the state of the system (6.1) from a suitable point
x(to) E So to a suitable point x(t f) E S f and
to, x(to),
A necessary condition for the quadruple (x(.),
to, t7) to be optimal is the
existence of a solution A(.), Ao of the auxiliary system (6.6) corresponding to
such that
.152
(i)
1\0>0;
(v)
0;
of
u
(tf)
19fi
i=1
9u
(to) _
Voi
i=1
dafi(x)
dx
, r9fiER;
dooi (x)
dx
19oi E R.
Some comments are in order. The first one refers to condition (i) which
amounts to saying that a control must minimize the hamiltonian function
in order to be optimal. It is quite obvious that a control which differs from an
optimal one over a set of zero measure is still an optimal control. This fact motivates the insertion of a. e. (which stands for almost everywhere) into condition
(i). The second remark concerns conditions (ii) and (iii), usually referred to as
tmnsversality conditions at the final and initial time. They hold only if the final
time or the initial time are not given, respectively. Thus, if the final (initial)
time is specified in the optimal control problem at hand, any optimal solution
might not comply with the relevant transversality condition. A third comment
refers to conditions (vi) and (vii) which are often mentioned as transversality conditions as well: there i9 fi, i = 1,2,... , q f and 19oi, i = 1,2,... , qo are
suitable constants. In order to avoid misunderstandings, they are here referred
to as orthogonality conditions at the final and initial time, respectively. Indeed, a geometric interpretation of these conditions can be given, since they
simply require that the n-dimensional vector A(t f) (A(to)) be orthogonal to
time hyperplane Tf(x(t f)) (To(x(to))) which is tangent to Sf at x(t f) (So at
x(to)). Note that when the state at the final and/or initial time is completely
specified, the relevant orthogonality condition is anyhow satisfied.
Example 6.2 This very simple problem has a trivial solution: thus it is expedient in
checking Theorem 6.1. Consider a cart moving without friction on a straight rail
under the effect of a unique force u. Denoting by xl and x2 the cart position and
153
Let the position and velocity at the initial time 0 be given and assume that the
problem is to bring the cart to a specified position in the shortest possible time.
Obviously, this new position has to be different from the initial one. The intensity
of the force which can be applied ranges between umax > 0 and u,,,in < 0. Therefore
the problem is completely specified by the above equations and x(O) E So, x(t f) E
Sf, u(t) E U, where So = {xl aoi(x) = 0, i = 1, 2}, .9f = {xl a f1(x) = 0}, U =
{ul Umin < u < umax } With a01(x) = x1 - x10, a02 (x) = X2 - x20, a f 1(x) = x1 -xl f ,
xlo # x1 f, as well as by the performance index
J=
tf
dt.
It is quite obvious that the solution of the problem is u(t) = , t > 0, where
= uma if xlo < x1 f and = u,,,in if xlo > x1 f. Corresponding to this choice we
obtain
X0 1(t) = X10 + x2ot +
t2,
X02(t) = X20 + At
and
to
ll
It is now shown how \ and Ao can be determined so as to satisfy the conditions (i),
(ii), (iv)-(vi) of Theorem 6.1 (recall that the initial time and state are given). As a
preliminary, note that the sets go and Sf are regular varieties, so that the quoted
theorem applies and the hamiltonian function H = Ao + Aix2 + \2u can be defined.
Consistent with this, Al = 0, A2 = -Al, Ao = 0 are the equations of the system (6.6),
and
1
em = -x2(to)
1
a2 = - x2 (t)
f) (tf - t),
A00
of A(tf) to be zero and this is the case. Finally, observe that the sign of A2 (t),
0 < t < tf does not change and is different from the sign of x2 (t! ), the latter being
equal to the sign of A. Therefore, also condition (i) is satisfied.
154
Remark 6.1 (Limits of Theorem 6.1) It should be emphasized that Theorem 6.1
supplies conditions which are only necessary (NC). Therefore, if a quadruple.
(x(.), u(.), to, t f) complies with the problem constraints and allows us to determine
a solution (A, Ao) of the auxiliary system which verifies the theorem conditions, the
only conclusion to be drawn is that such a quadruple satisfies the NC and might be
all optimal solution. In short, one often says that the control u at hand satisfies the
NC and is a candidate optimal control.
Remark 6.2 (Pathological problems) For a given problem, assume that the pair
(a, A) satisfies the conditions of Theorem 6.1: then, it is not difficult to check
that they are also satisfied by the pair (A, Ao) with A = kA, a$ = kA, k > 0.
Therefore, it is always possible to set A = 1 apart from those cases where it is
mandatory to set A = 0. Such cases are referred to as pathological cases.
Remark 6.3 (Constructive use of the necessary conditions) In principle, the necessary
conditions of Theorem 6.1 allow us to determine a quadruple (x(.), u(.), to, t f) which
satisfies them. Indeed, let Ao = 1 and enforce condition (i): a function u,, (x, t, A)
results which, when substituted for u(t) in eqs. (6.6a), (6.7), defines a hamiltonian
system of 2n differential equations for the 2n unknown functions x and A. This
system is endowed with 2n boundary conditions deriving from: a) the constraint
x(to) E So (qo conditions), b) the constraint x(t f) E .9f (q f conditions), c) the
orthogonality condition at the initial time (n-qo conditions) and d) the orthogonality
condition at the final time (n-qf conditions). Thus a solution can be found which is
parameterized by the yet unknown values of to and t f: this solution can be specified
fully by exploiting the two transversality conditions. A control which satisfies the NC
results from the substitution of the available expressions for x, A into the function
u!,. Finally, observe that the number of conditions to be imposed always equals the
Number of unknown parameters, even when the initial (final) state and/or time is
specified.
If in so doing no control can be found which satisfies the NC, one call set
A0 = 0, i.e., explore the possibility that the problem at hand is pathological. If also
this second attempt is unsuccessful, the conclusion should be drawn that no solution
exists for the optimal control problem under consideration.
Example 6.3 The problem in Example 6.2 is again considered by letting xlo = 0,
function with respect to u leads to defining the function ur,(A) = -sign(A2) which
fully specifies u but in the case A2 = 0. This circumstance does not entail any
significant difficulty since, as will later be shown, it can only occur at an isolated
time instant. The solution of the auxiliary system is Al = Al(0), A2 = A2(0) - Al(0)t
and the orthogonality condition (vi) requires A2(tf) = 0: thus A2(t) = Am(0)(tf - t)
and A2 does not change sign and the control is constant and equal to 1. As a
consequence, it follows that xl = t2/2, X2 = t. The transversality condition (ii)
imposes Al (0) = ::F1/t f, while complying with the constraint on the final state entails
5 = (t f)2/2. Therefore, it is t f = 10, Al (0) _ 11V1_0 and the control u(t) = 1
satisfies the NC.
6.2.
Simple constraints
155
Example 6.4 Consider the system in Fig. 6.2: it is constituted by two carts with
unitary mass which move without friction over two parallel rails. Two forces ul and
u2 act on the carts. By denoting with xl and x3 their positions and with x2 and
x4 their velocities, the equations for the system are xl = X2, x2 = u1i x3 = X4,
x4 = u2. If the initial state and time are given, namely, x1(0) = 1, x3(0) _ -1,
x2(0) = x4(0) = 0, the control actions must be determined which make, at a given
time T, xl(T) = X3 (T), x2(T) = x4(T) and minimize the performance index
J =+
f(uu)ctt.
(iv)-.-(vi) of Theorem 6.1
The sets So and Sf are regular varieties, thus conditions (i),
must be satisfied. The minimization of the hamiltonian function H = Ao(ui +ut)/2+
Aix2 + A2u1 + A3x4 + A4u2 with respect to u is possible only if the problem is not
pathological: therefore, by choosing Ao = 1, it leads to defining the function Uh(A)
specified by Uhl(A) = -A2 and uh2(1\) _ -A4. The solution of the auxiliary system
is al(t) = A1(0), A2(t) = A2(0) - A1(0)t, A3(t) = A3(0), A4(t) = A4(0) - A13(0)t, while
the orthogonality condition requires that A1(T) = -A3(T) and A2(T) = -1\4(2')- In
view of these results, complying with the constraints on the final state allows us to
determine the values of the parameters A2(0), yielding A1(0) = -A3(0) = 12/T3 and
A2(0) = -A4(0) = 6/T2, from which the expressions for the control variables ul and
u2 which satisfy the NC can easily be derived.
Now consider a slight modification of this problem where the final time is not
specified and the term 2 is added inside the integral in the performance index in order
to emphasize the need for the control interval to be reasonably short. In this case the
transversality condition (ii) must also be imposed. Once the previously found (arid
still valid) relations among the parameters A j(0) are taken into account, we obtain
T=vF6.
Example 6.5 Once more consider the system described in Example 6.2, but assume
that only the velocity is given at the known initial time, while the state at the known
156
final time is required to belong to a specified set. More precisely, let x(O) E So, x(1) E
f'udt.
J=
l3oth .So and 9f are regular varieties: thus conditions (i), (iv)--(vii) of Theorem 6.1
must be satisfied. The problem is not pathological since otherwise the hamiltonian
function II = A0u2/2 + A1X2 + A2u could not be minimized with respect to u. By
choosing A0 = 1 we obtain u,l(A) = -A2, while the equations of the auxiliary systern together with condition (vii), which requires A1(0) = 0, imply A2(t) _ A2(0),
ar (1.) = 0. Therefore it follows that xl (t) = x1(0) - A2(0)t2/2, x2(t) = -A2(0)t. The
orthogonality condition (vi) is
A1(1)
A2(1) I
_ Vf
2xi(1)
2(x2(1) - 2)
so that either i9 f = 0 or x1(1) = 0. The first occurrence must not be considered since
it would also entail A2(0) = 0, i.e., u(.) = 0 which does not comply with the constraint
at the final time. By imposing the control feasibility (i.e., that the constraint on the
final state is satisfied) we get A2(0) = -1 or A2(0) = -3 with x1(0) = -1/2 and
:cr (0) = -3/2, respectively. Therefore two distinct controls exist which satisfy the
NC: the form of the performance index points out that u(t) = 3 is certainly not
optimal.
Example 6.6 Consider the first order system x = u with x(to) = x0i x(tf) = xf,
where x0 and x f are given and different from each other, while to and t f must be
determined, together with u, so as to minimize the performance index
J=
J'(t
+u2)dt.
This problem can be handled by resorting to Theorem 6.1 and imposing that conditions (i) - (v) be satisfied. The nature of the hamiltonian function (H = Ao(t4 +
u2)/2 + Au) prevents the problem from being pathological: thus, letting Ao = 1,
we get uht(A) _ -A = -A(0). From the two transversality conditions it follows that
to = - /IA(0)I and t f = -to. Finally, by requiring that the control be admissible, i.e.,
imposing that x f = xo - A(0) (t f -to), we obtain A(0) = ((xo -x f)/2)2/3sign(xo -x f).
Remark 6.4 (14ee final and/or initial state A number of optimal control problems
do not require that the final or initial state satisfy any specific constraint (see, for
instance, the LQ problem dealt with in Part I). Such a particular situation can
not be tackled directly by making reference to the notion of regular variety which
was expedient in stating Theorem 6.1. On the other hand, when one of the state
components xi can be freely selected at the final and/or initial time, the relevant
multiplier A is zero at that instant (see Examples 6.2, 6.5): thus, it should not be
surprising that the orthogonality condition (vi) ((vii)) becomes A(t f) = 0 (A(to) _
0) if the final (initial) state is free.
6.2.
Simple constraints
157
Example 6.7 The solution of the LQ problem which was found in Chapter 3.1 by
resorting to the Hamilton-Jacobi theory is now shown to satisfy the NC. The system
is i(t) = Ax(t) + Bu(t) with given initial state x(O) = xo and free final state x(T),
T being specified. The performance index is
T
J = 2J [x'Qx+u'Ru]dt
U
with R = R' > 0 and Q = Q' > 0. The harniltonian function is H = Ao(u'Ru +
x'Qx) /2 + A' (Ax + Bu) and, for Ao = 1, is minimized by urn = -R-'B'A, so that the
equations of the harniltonian system are
i=Ax-BR-1B'A,
A=-Qx-A'A.
In view of the material in Chapter 3.1, the pair (x, A), where x is the solution
of the equation i = (A - BR-1B'P)x, x(0) = xo and A* (t) = P(t)x(t) with P a
PBR-1 B'P - Q, P(T) = 0, must satisfy the
solution of the DRE P = -PA - A'P +
harniltonian system and the relevant boundary conditions for A, specifically A(T) _
0. This last request is apparently verified, while it is straightforward to check that
also the first request is fulfilled by inserting A and x into the hamiltonian system.
Example 6.8 Consider the optimal control problem defined by the first order system
i = xu with given initial state x(0) = xo 0 0. The final state is free, while the final
time is 1. The control variable is constrained by u(t) E U = Jul - 1 < u < 1} and
the performance index is
J=
f1
J0
xudt.
158
The harniltonian function is H = (Ao + A)xu and its minimization with respect to u
yields ul,(x, Ao, A) = -sign ((Ao + A)x). The harniltonian system is
xuh (x, \o, \),
A _ -(A0 + A)uh(x, A0, A),
with boundary conditions x(O) = xo and A(1) = 0 (recall that the final state is
0 and
free). Note that this problem can not be pathological since, otherwise,
condition (v) of Theorem 6.1 would be violated. Therefore Ao = 1. In a similar way
observe that the control which minimizes H is always well defined because neither x
nor 1 +A(i) can be 0. Indeed the first circumstance is inconsistent with the nature of
the set U and xo 0 0, while the second one would entail, if verified, A(t) = -1, t > t,
thus preventing the fulfillment of the orthogonality condition. The trajectories of the
hainiltonian system are the solution of the equation
dA
1+A
x
i.e., are the curves (1 + A)x = k which are shown in Fig. 6.3 corresponding to
some values of k. This figure, where also the values taken on by the function uh are
recorded, clearly points out that the control which satisfies the NC is u = -sign(xo).
dx
Remark 6.5 (Time-invariant problems) If the functions f and I do not explicitly depend on time, i.e., when the optimal control problem is time-invariant, an interesting
property of the harniltonian function can be proved. More precisely, if the problem
is tune-invariant, then
Il (x(t), u(t), A(, A(t)) = const.
provided that the quadruple (x, u, A$, A) verifies condition (i) of Theorem 6.1,
(As, A) being a solution of the auxiliary system corresponding to (x, u). This
property allows us to check/impose (if it is the case) the transversality condition
at any time instant within the interval [0, t'] (recall that the initial time can be
thought of as given and set equal to 0, without any loss of generality, if the problem
is tine-invariant).
This claim can easily be tested by making reference to the time-invariant LQ
problem. Indeed in this case
x(t)
1X=X0(t)
+ OH(x(t), u(t),1, A)
as
u. (t)
u=u(t)
A(t).
A=A(t)
This relation is correct since u = -R-1B'Px is differentiable in view of the equations which are solved by x and P. The first and third term of the right-hand side
159
2
tb
Figure 6.4: Example 6.9: the feasibility issue for the controls a and b.
ex
LXO(t)
A=Ao(t)
Finally, no constraints are imposed on the current value of u: thus condition (i) of
Theorem 6.1 is satisfied only if
OH(x(t), u, 1, A(t))
Ou
= 0.
1U=U0(t)
Example 6.9 Consider the optimal control problem defined by the system xi = x2,
x2 = u with initial state x(0) = 0 and final state constrained to belong to the regular
variety Ss = {xi Si = 2}. The control variable must be such that -1 < u(t) < 1.
Finally, the performance index to be minimized is
J = J f t[1 + juI)dt.
0
otherwise, the transversality condition could not be verified. In view of the above
discussion the conclusion can be drawn that 1\2 is a linear function of time which
vanishes for t = tr, has always the same sign and (x2(0)1 > 1. The minimization of
the hamiltonian function yields
1,
uh (A) =
0,
-1,
A2<-1
-1<A2<1
1 < A2.
160
(b)
u(t)u(t)
0,
o<t<tf,
r -1, 0<t<T
0,
'T'ile second control turns out to be not feasible (see also Fig. 6.4) and it remains
to specify the parameters of the only control which could satisfy the NC. From the
transversality condition we get A2(0) = -2 so that, by imposing A2(T) _ -1 and
A2(t f) = 0 it follows that T = t f/2, \1(0) = -2/tf. Finally we find x2(T) = x2(tf) _
T and xl (t f) = Tt f - T2/2: thus, by imposing feasibility, i.e., xl (t f) = 2, it results
that t f = 4 3. All elements necessary to define a control which satisfies the NC have
been determined.
Example 6.10 Consider three optimal control problems defined by the same system
1 = x2, :2 = u, x(0) = 0, xl (t f) = XI f > 0, x2(t f) = 0. The performance index for
the first two problems are
Jl = 2 f'(2+u2)dt,
t
J2 = If f (1 + juj)dt,
0
respectively. As for the third problem, the performance index J3 is equal to Ji. For
all problems the final time is free, while the control variable must comply with the
constraint -1 < u(t) < 1, Vt when dealing with the first two problems.
The first problem is now considered. By choosing A0 = 1, the hamiltonian
function is 11 = 1 + u2/2 + A1x2 + A2u which yields
1\2 < -1
1,
u/t(A)=
-1<1\2<1
-A2i
-1,
A2 > 1.
The transversality condition at the initial time requires u(0) = 1. By noticing that
the auxiliary system implies that A2 is a linear function of time, the conclusion can
be drawn that the control is a function of time which is either constant or first
constant and then linear or first constant, then linear and finally constant again.
A little thought reveals that only the third alternative may lead to an admissible
control (recall the values of the initial and final states) and u(0) = 1. Therefore
A2(0) = -3/2 and a control which satisfies the NC must be of the form
1,
u(t) _
-A2(t),
-1,
0<t<Tl
Ti < t < T2
T2<t<tf
161
where Ti and T 2 are the times where .X2 equals -1 and 1 respectively. These requirements, together with control feasibility, allow us to determine all the parameters which specify a solution satisfying the NC, precisely: T , = t f, /6, T2 = 5t f, /6,
(0) = -3/t f1, t f, = \/108x, f/23. As a consequence we find
t2
X(t) =
0<t<
to - tflt+ 3t2 -
432
24
is
21f I
tf1+3t3 t2
2
24
2tf1
31
12
x(t)=
108tf1tf,t-2t
to - t
, T2<t<tf1.
The second problem is now taken into consideration. A similar discussion leads to the
conclusion that a control which satisfies the NC is a piecewise constant function of
time which first equals 1, then equals 0 and finally equals -1. The parameters which
-tf22+tf2t
t2
x(t) =
32
x(t) =
tf2
IF
t1 <t <t2,
x(t) =
t2
16 t f2 + tf2t 2
tf2 - t
Finally the third problem is tackled. It is easy to check that a control which
satisfies the NC is a linear function of time, precisely u(t) = -A2(0) + A1(0)t
which yields x2(t) = -A2(0)t + A,(0)t2/2, x1(t) _ -A2(0)t2/2 + \1(0)t3/6 and
t f3 = -6x1 f/A2(0), where A2(0) _ -v/2, A1(0) = - -4A2(0)/(3xr. f). The solutions of the three problems are compared in Fig. 6.5.
Remark 6.6 (Alternative approach to time-varying problems) It is interesting to note
that a time-varying optimal control problem might be restated in terms of an equivalent time-invariant problem. In fact, add z = 1, z(to) = to, to the system equations,
where the functions f * and l" are related to the functions f and l in an obvious way,
while So
x'
162
tf1
0.8
Figure 6.5: Example 6.10: state and control responses, state trajectories when
xif=1.
solution of this new problem supplies t; (0), the last component of which (f n+1(0)) is
the initial time for the given problem, while the final time is e"'+1(0) + t f, tI being
the final time for the restated problem. This approach can be checked by applying
it to Example 6.6.
Remark 6.7 (Constraints on the final and/or initial event) The discussion in Remark
6.6 is expedient in dealing with optimal control problems where the final and/or
initial event rather than the final and/or initial state is constrained, i.e., where eqs.
(6.4) enter into the statement, as done in Part I. In fact, first suppose that the final
event is constrained (see eq.(6.4b)), namely that
aa1(x, t)
ax
at
aag(x, t)
eag(x, t)
at
E(x, t)
ax
163
Then a customary problem with constrained final state can be obtained by simply
adding the state variable z defined by z = 1. Note that in so doing the hamiltonian
function becomes
H*(x,u,z,),o, A, A) := H(x,u,t,Ao,A)+it
_ Aol(x,u,z) +\'f(x,u,z) +p
while the equation
A(t)
z=z(t)
has to be added to the auxiliary system. The orthogonality condition can be expressed
as
q
19i
194
ayax'
LXO(to)
Oai(xo(tf),z)
(tof) =
f)
,\o (to) =
az
z=tf
where i9i are suitable scalars. These equations show that a problem restatement is
actually not needed when handling constrained final events: indeed it suffices to
modify the transversality condition which now is
H(xo(tof), u(t f), to, ao, Ao(to)) +
Vi
i_1
8a,(xo(t f)'t)
=0.
at
t=tf
A similar discussion can be carried on when the initial event rather than the state is
constrained (see eq.(6.4a)).
Example 6.11 Consider the optimal control problem defined by the first order system
x = u with initial state x(0) = 0 and final event belonging to the regular variety
Sf = {(x, t) I (x - 2)2 + (t - 2)2 - 1 = 0}. The performance index to be minimized
is
J=
ftf 2 u2dt.
0
that
0 = - 2 A2 (0) + 219(t f - 2),
0 = A(O) - 219(x(t f) - 2),
164
Figure 6.6: Example 6.11: state motions which satisfy the NC and optimal
solutions for given W.
tf
\2(0)+2'
1Y = -
\(0)
2(\(0)tf + 2)
The only two real solutions of the first equation are \1 = -2.11, \2 - -0.46 and
the relevant values for the final time, the final state and performance index are:
tj = 1.27, t2 = 2.23, xl = 2.69, x2 = 1.03, Ji = 2.83, J2 = 0.24. Figure 6.6 (a) shows
what has been obtained: point P1 is the event corresponding to the first value of
\(0), while point P2 is the event relevant to the second value. Note that point P",
though corresponding to a final event which can be attained with a better control
than the one yielding point P1 (recall the performance index), has not been found
by imposing the fulfillment of the NC. This can easily be understood by looking at
Fig. 6.6 (b) where t f, x(t f) and J are plotted against the angle Sp which specifies a
particular element of Sf. These quantities are obtained by exploiting the HamiltonJacobi theory with reference to the problem where the final state (arid hence also
the final time) is given.
6.2.2
Optimal control problems where the state and control variables undergo simple
165
J=
rtf
Jto
(6.8)
Moreover, the initial state xo and time to are given while the final state is either
free or constrained to be an element of a regular variety Sf. The following result
applies to problems of this kind.
(ii)
f, Ao,
A(tf)) + Ao
(iii)
t_tf
- 0,
AO > 0,
0
(iv)
(v)
arrt(xatt f) t)
A(tf) _ A0
540,
07n (x, t f)
ax
dcxi(x)
19Z
x=x(tf)
i=1
dx
I
,
't9i E R.
x=x(t f )
Example 6.12 A simplified model of a d.c. motor with independent and constant
excitation and purely inertial load is
1 = X2,
x2 = klx3,
where xl, x2 are the shaft position and velocity, respectively, while x3 and u are the
armature current and the applied voltage. The constants ki, i = 1, 2, 3, 4, depend on
the motor electrical parameters, the field current and the applied load. The description of the optimal control problem is complete once the initial state x(0) = 0 and
the performance index
J= 2
166
are given. The condition (i) of Theorem 6.2 prevents the problem from being pathological, while from condition (v) and the equations of the auxiliary system it follows
that \I (t) = aI (0) and A2, A3i x, u are all linear functions of A1(0), precisely of the
form ff(t)A1(0). Therefore, by recalling the orthogonality condition we get
x1(1) = KA1(0) = 1 +
AIM
Q
In this relation K is a constant: thus, in particular, Al 1(0) = K-1/a, so that when a'
increases A1(0) approaches 1/K and x1(1) tends to 1, as should have been expected.
Example 6.13 Consider the optimal control problem concerning the system xl = x2,
:C2 = u with x(0) = 0 and x(t f) free. The performance index is
/'t f
J= 2J u2dt+tf-xl(tf).
U
together with the equations of the auxiliary system imply Al(t) = -1, A2(t) = t - t f,
so that from the transversality condition it follows that X2(tf) = 1. Since uh = -A2
we gettf=v/'2-.
Remark 6.8 (A sketch of the derivation of the Maximum Principle) A simple derivation of the Maximum Principle conditions can be obtained by making reference to a
particular optimal control problem. As a preliminary, note that, in view of Remark
6.6, it is possible to assume the problem to be stationary, without loss of generality,
while, in order to simplify the discussion, the initial state is given and the final state is
free. Therefore the controlled system is described by the equation x(t) = f (x(t), u(t))
and must comply with the simple constraints x(0) = xo and
E n. Since the initial
state is specified, the performance index can be given the (only seemingly) particular
form J = c'x(t f). Indeed, observe that
mn(x(t f)) = rn(x(0)) + fo tf drn(t(t)) dt
= MOM) + fo tf drn(xx) I s-
f(x(t), u(t))dt
-=(t)
so that, by neglecting the constant term 7n(x(0)), the given performance index becomes
J.tf
J=
[lxt, ut) +
drn(x)
dx
Now introduce the new state variable z by means of the equation z(t) = l*(x(t), u(t))
and the boundary condition z(0) = 0: in so doing we obtain J = z(t f).
The forthcoming discussion relies on the (trivial) idea that, given an optimal
triple (x(.), u(.), t f), a perturbation of the control
and final time tI should
not cause an improvement of the performance index. Obviously, the more general is
167
t u(t)
V
T, 0 < T< tf, where it is continuous, by allowing that u(t) = u(t), 0 < t <
sJf = (x(tf) -
if u(t) = u(tf), tf < t < tf + r/ when 77 > 0. Here o(.) denotes a generic quantity
which is infinitesimal of higher order than the argument. In order that 5Jf > 0 it is
necessary that
C 'f (x(tf ), u(t f )) = 0
(6.9)
The perturbation SJu of the performance index caused by a control perturbation of the above described form can be evaluated by first determining the perturbation of the state at time T. We get
x(T) - x(T) = [f (x (T - e), v) - f (x(T - e), u(T - E))] E + o(E)
= [f (x (T), v) - f (x (T), u (T))] e + o(e).
Therefore, by denoting with <P the transition matrix associated to the matrix Of /dx
evaluated for x = x and u = u, it follows that
6Ju = c'(x(tf) -(x (tf)) = c (D (tf,T) [x(T) - X0 (T)] + o(E)
= C 4(too, T) [f (x (T), v) - f (x (T )' u WA -- + o(e).
168
1-lence it must result that c'4F(t f, -r) [f (x (T), v) - f (x (T), u'(7-))] > 0. In view of the
material collected in Appendix A.2, we can set c'4 )(t f, T) = dW'(T, t f) where 41 is
the transition matrix associated to the matrix -(Of/Ox)' evaluated for x = x arid
u = u, so that the last inequality becomes
A'(T) [f (x(T), v) - f(x (T), u (T))] > 0
(6.10)
fi(t) =
Of (x' u(t)) ,
I
(6.11)
A(t f) = c.
(6.12)
The hainiltonian function relevant to the problem at hand is H(x, u, A) = A' f (x, u):
hence eq. (6.10) simply states the rninirnality of such a function, while eq. (6.11) is
the auxiliary system and eqs. (6.9), (6.12) are the transversality and orthogonality
conditions, respectively.
Remark 6.9 (Nonpatholo(jical problems) The discussion in Remark 6.8 does not mention the scalar AO: this is consistent with the fact (which can be proved) that problems
where the final state is free can not be pathological.
Remark 6.10 (Constancy of the harniltonian function) When the problem is stationary the hainiltonian function is proved to be constant if it is evaluated along a solution
which satisfies the NC. By letting M(t) := H(x(t), u(t), AO, A(t)), it follows that
II(x(t + St), u(t), Ao, A (t + St)) - M(t) > M(t + St) - M(t)
> M(t + St) - II(x(t), u(t + 8t), 1\o, A(t)).
(6.13)
When bt
0 the first element of this chain of inequalities goes to 0 since if, x, A
are continuous functions. Even if u might be a discontinuous function, also the last
element of the chain goes to 0 since the two terms appearing there depend in the
same way on a(t + St): thus that element is a continuous function of St. Therefore,
it follows that
'5t-+o
Itx ( t )
HK ( t )
OH(x(t), u(t), A, A)
Ox
OA
X=X(t)
A=A(t)
6.2.
Simple constraints
169
Figure 6.8: The state space for the problem in Example 6.14.
so that
lirn
(6.14)
(6.15)
i.e., k(t) = 0 for each t where u is continuous. Since u is piecewise continuous, this
fact, together with the continuity of M, implies the constancy of M.
Example 6.14 Problems where the variable Ao can be zero are now presented. Con-
sider the second order system r = u, x2 = u with initial state x(0) = 0, final state
x(1) E Sf and performance index
rr
2
J
[U' + pu] dt
where p is a given constant and Sf = {xJ 4xi - 4x2 - 7xi + 3x2 - 4 = 0} (see Fig.
6.8). Note that the only possible trajectory for the system is the straight line r
described by x2 = xi (see Fig. 6.8) which has two points in common with S f,
precisely the points Pi and P2 with coordinates xr = X2 = 1 and xl = X2 = -1,
respectively. Therefore, the problem amounts to steering the state of the first order
system = u from 6(0) = 0 to (1) = 1 while minimizing the performance index
170
above. 'T'hree different values for p are taken into consideration: (i), p = 2; (ii), p = 0;
with x(0) = 0, x(1) E Sf, Sf = {xI 5xi - X2 - 5 = 0} and the performance index
fl u2
J=
J 2 dt.
U
The following equations are obtained by enforcing the orthogonality condition and
the feasibility of the adopted control (which is of the form u(t) = A1(0)t - A2(0)):
U = A1(0)(lOxl(1) - 1) - 10A2(0)xl(1),
x2(1) = 1 \I(0) - A2(0),
x1(1) = 6x1(0) - 2A2(0),
0 = 5x2(1) - X2(1) - 5.
These equations admit three solutions (the value of x2(1) is not reported as of no
interest)
S1:
x1(1)=1.15
A1(0)=-3.98
A2(0) = -3.63,
,5'2:
x1(1)=0.1522
A1(0)=-31.1316
A2(0) = -10.6816,
x1(1)=-0.85
Ss
A1(0)=2.18
A2(0) = 2.44.
The significance of these three solutions can be understood by assuming that x1(1)
(and hence x2(1), too) is given. For each x1(1) it is easy to compute an optimal
solution and the relevant value J(x1(1)) of the performance index by resorting to
the Hamilton-Jacobi theory. The plot of such a function is reported in Fig. 6.9 which
shows that the solution S2 corresponds to a local maximum, while the remaining
two correspond to local minirrca. Furthermore, it is obvious that the solution S3 is
globally optimal.
6.2.
Simple constraints
171
Figure 6.9: Example 6.15: the optimal value of the performance index.
Example 6.16 This example emphasizes the role played in Theorems 6.2 and 6.1 by
the request that the final time be strictly greater than the initial time. Consider the
problem defined by the first order system i = -x + u, x(O) = 1, with free final state
and the performance index
sf
J=2{
s >0
JJJ
where the final time t f must be determined. The hainiltonian function is H = 3x2/2+
u2/2 + A(u - x), thus, uh = -A and A = -3x + A. The solution of the harniltonian
system is of the form
/3e_2t,
1)e4(t-tf)
s + 3 + 3(s t-t
f
s + 3 - (s - 1)e4
,s#1
,s=1
172
0.5
Figure 6.10: Example 6.16: the optimal value of the performance index.
tir k_ t3
s=
Figure 6.11:
t22
"'412 \ 5
where the dependence of the solution of the DRE on the boundary condition has
beers put into evidence. The plot of the optimal value of the performance index
J(t f) = P(0, t f, s)/2 vs. the final time t f is reported in Fig. 6.10 corresponding
to three values of s. The following conclusions can be drawn: if s > 1 no optimal
solution of the problem exists, as was pointed out by the NC; if s = 1 there exists
an infinite number of optimal solutions, once more consistent with the NC; if s < 1
there exists an optimal solution which is characterized by t f = 0, but the NC are
not satisfied just because the final time is not strictly greater than the initial time
(recall Remark 6.8 and note that the derivative of J(tf) with respect to tf is not
zero for t f = 0).
Example 6.17 This example shows that it might happen that more than one control
satisfies the NC even if the problem does not admit a solution. Consider the problem
defined by the first order system x = u, x(0) = 0, with free final state and the
173
performance index
ft f
u2
where, obviously, t f is not given. The hamiltonian function is H = 1 + u2 /2 x + Au so that uh = -A, A(t) = t + A(0). By enforcing the transversality and
orthogonality conditions we get A(0) = fv and t f = s - A(0). Therefore, when
s>
two values for t f result, while if 0 < s < V2 only one value can be deduced.
For instance, if s = sl := 2 we find til = 2 - v/2- and t12 = 2 + /; if s = s2 :=
the problem defined by the system xl = x2, x2 = u with x(0) E So, x(1) = 0,
So = {xI x1/4 + x2 - 1 = 0} and the performance index
J=Jrl u? dt.
0
The orthogonality condition at the initial time requires that A1(0) = ?9x1(0)/4 and
A2(0) = 19X2(0). The minimization of the hamiltonian function and the auxiliary
system imply that u(t) = -A2(0) + A1(0)t: thus it is easy to conclude, by enforcing
feasibility, that x1(0) ; 0 and x2(0) 54 0. Therefore it is A2(0) = 4A1(0)x2(0)/x1(0).
From the condition x(1) = 0 it follows that x2(0) = x1(0)(-11 157)/12: thus
we can conclude that the initial states consistent with the NC are four (see Fig.
6.12). In view of the nature of the problem it is possible to state that the values of
the performance index corresponding to the choices x(0) = P1 and x(0) = P3 are
equal and similarly equal are the values corresponding to the choices x(0) = P2 and
x(0) = P4. The optimal value of the performance index can easily be computed as
a function of the angle cp which identifies the initial state (again see Fig. 6.12) by
resorting to the Hamilton-Jacobi theory. The simple conclusion is that two distinct
optimal solutions exist corresponding to the points P2 and P4.
Remark 6.11 (A computational algorithm) Many iterative algorithms have been proposed for the numerical computation of a (candidate) optimal control: one of them,
174
Figure 6.12:
Example 6.18: initial states which satisfy the NC and the optiinal value of the performance index vs. W.
particularly simple and easily understandable, is here briefly presented with the aim
of showing how the NC can be fruitfully exploited. The (trivial) idea on which it
relies is basically the following: assume that the solution at hand is far from complying with the NC: then determine a perturbation which (hopefully) makes the new
solution closer to the satisfaction of the NC and iterate until a solution is found
which satisfies the NC up to a specified degree of accuracy. In its present form, the
algorithm is suited to tackle the class of problems defined by the dynamical system
(6.16)
J=
+ m(x(t f), t f
(6.17)
(6.18)
where the number of components of the vector a is not greater than n + 1. In these
175
Algorithm 6.1
1. For u = u(') compute x(t) by integrating eq. (6.16) over the interval T(t).
2.
ax
x=x(i) (t)
(t(t)) =
8rn(x, t('))
ax
===(i)(tf ))
ax
A(') (t)
r=y(i) (t f
_Of(x, u2(t), t
Ox
X=y(i) (t)
3. Compute
t(i)
G(') :=
fto
G') :=
jt)
(t)W(t)B('(t)t)(t)dt,
o
(i)
G3') := Ito Lf
where
au
HHt) (t)
u=u(i) (t)
LUM(t)
176
5. Compute
G(i)
da(x(i) (t), t)
da(x(i) (t), t)
dt
dt
t=t()
t=ts
+ N(i)
, t)
tc
(s)
tf
+ l( i)
t
da(x(t) (t), t)
dt
t--tf
(a)
I
where l(i) := l (x(i) (t(f) ), u(t) (t(i)), t(f) ), while 13(i) > 0 is a scalar to be suitably
selected.
6. if
II a(t) 11< el,
dt
(6.19)
+ l(i)
< E2,
(6.20)
(6.21)
where Ei > 0, i = 1, 2, 3 are suitable scalars, then the triple (x(i), u(i), t(i))
satisfies the NC with an approximation the more accurate the smaller the scalars
ei are. Other wise, let
u(i+1) (t) := u(i) (t) + bu(i) (t),
t(i+l) :-' t(t) + bt(t),
where
bu(i)(t)
bt f)
-o(i)
it
t=tf)
The i-iteration of this algorithin can be justified as follows. First, it is easy to check
that the perturbations bu(i) of the control u(i) and bt f) of the final time t f) cause
6.2.
Simple constraints
177
first variations [6J](') of the performance index and [Sa] (') of the vector a(i) which
are given by
[SJ]ii) = [drn(x1) (t), t)
dt
[Sa]
tt()f
fx(t)su(t)dt,
+ J(i) 6t f) +
(6.22)
da(x(i) (t), t)
(i) +
Stf
I _ (_)
dt
t tf
aa(x, 4)
8x
f(6.23)
Sx (i) (t('))
IX=XM(tM)
if P) is chosen according to step 2. In eq. (6.23), Sx(i) (t f)) is the solution of the
equation
(6.24)
4)(i) being the transition matrix associated to A(i): thus, by exploiting the properties
of such matrices (see Appendix A.2, point (iv)), eq. (6.23) becomes
[6a]ii) =
da(x dt(t), t)
(i)
(6.25)
with SZ(i) specified at step 2. Then it is reasonable to require that the perturbations
to be given to the control and the final time should minimize [SJ] 1, while complying
with the condition
[sa](l) = sa(i)
(6.26)
which is intended to reduce the amount the constraint (6.18) is violated, if we have
chosen
(6.27)
at step 4. Since both [Sa] (') and [6J](') linearly depend on the perturbations Sufi) and
&W , the choice of these perturbations can be performed by minimizing the objective
function
SJ = J]W +
t(i)
bt(') z
2
f f Su(')'(t)(W()(t))-16u()(t)dt
(6.28)
178
function are those shown in step 6 and their substitution into eqs. (6.25), (6.26)
yields the value for jt(i) reported at step 5. By inserting the perturbations of the
control and final time given in step 6 into the right side of eqs. (6.22), (6.25) (note
that 0+1)(t) must be defined also for t E (tf ), tf+r)] if St f=) > 0, for instance by
(t ('))) the first variations of the performance index
J and the vector a can be computed, yielding
letting u(i+' ) (t) := UM (t! ))
[SJ]i' =
- [G3Z) +
d
dt
[6a] ii)
6(i)
[cirn(x(2)(t),
t)
+ lisp
dt
t_tf>
- i+
(6.29)
tf
dt
tf
t=tf)
(6.30)
The comparison of these values with the actual variations, namely those corresponding to
and tfprovides useful information for the parameters ,Q and W at
the subsequent iteration (e.g., large deviations suggest their reduction).
When the inequalities in step 6 are verified (the quantities at the left side can be
viewed as zero), the triple at hand complies with the NC and the iterative procedure
is stopped. In fact, eqs. (6.19), (6.27), imply that Sa(i) is zero and, together with eqs.
(6.20), (6.26), (6.30),
-(G1'))tG2')'. In view of these results and eqs. (6.20),
(6.21), from eq. (6.29) it follows that [SJ](') has become zero.
The Algorithm 6.1 can be specialized suitably to the case of control problems
where the final time is given and/or the final state is not constrained by modifying
its steps in a fairly obvious way. For instance, if the final time is known and equal to
1' and the final state is free, it is necessary:
u2
6.2.
Simple constraints
i
1
2
3
4
5
6
7
8
9
10
179
u(&)
tt
G('
-1.000
-1.250
-1.382
-1.413
-1.374
-1.370
-1.387
-1.398
-1.405
-1.410
1.000
0.750
0.696
0.867
0.856
0.794
0.755
0.732
0.720
0.713
1.000
0.750
0.696
0.867
0.856
0.794
0.755
0.732
0.720
0.713
G2
-1.000
-0.938
-0.962
-1.226
-1.175
-1.088
-1.047
-1.023
-1.012
-1.006
G(
1.000
1.172
1.329
1.732
1.616
1.491
1.453
1.431
1.422
1.418
a(=)
0.000
0.063
0.038
-0.226
-0.176
-0.088
-0.047
-0.023
-0.012
-0.006
a(i
0.250
0.054
0.019
0.056
0.061
0.039
0.023
0.012
0.006
0.003
a(
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
where tf is free. The initial guesses for u and tf are u(1) = -1 and t(f ) = 1, while
the parameters which characterize the Algorithm 6.1 are: W(t) = 1, 8(j) = 1,
6aw = -a(')/2, Vi. Finally, the end condition is specified by ej = 0.01, j = 1, 2, 3.
The algorithm requires ten iterations which are synthesized in Tab. 6.1, where a(i)
and a2= are the values of the left sides of eqs. (6.20), (6.21). The obtained triple
the relevant equations), the latter being x(t) = 1 - st, u(t) _ -v, tI = 1/\.
Example 6.20 Consider the optimal control problem defined by the first order system
x = u with x(0) = 0 and free final state. The performance index is
J = f[x+u]dt.
The initial guess for the control is u(1) = -1 and the end condition is characterized by
s3 = 0.01. Three distinct and constant values, namely, 0.3, 1, 1.5, have been chosen
for the parameter W(i) which is the only parameter to be specified in order to apply
the Algorithm 6.1 (in its modified form as given at the end of Remark 6.11). It is easy
to check that, no matter of actual value for W('), it results that 0)(t) =
+ clte,
so that the evolution of the algorithm can efficiently be summarized as in Tab.
6.2 which shows that there is not convergence when W M = 1 or W(') = 1.5, Vi,
while, when W(t) = 0.3, Vi, the algorithm ends at the fourth iteration by supplying
u(t) = -0.532 + 0.468t rather than u(t) = -0.5 + 0.5t which is the control resulting
from directly enforcing the NC.
In this particular case the behaviour of the algorithm can easily be understood
since the initial guess for u(1) and the constancy of W(') (W(') = W) imply that
c1 Z+1) = (1 - 2W)cls + W.
180
WO)
W = 0.3
2
2
3
4
-1.000
-0.700
-0.580
-0.532
0.000
0.300
0.420
0.468
G3
0.700
0.112
0.018
0.003
c0
cll
-1.0
0.0
0.0
1.0
-1.0
0.0
0.0
1.0
W(')
c3
0.7
0.7
0.7
0.7
c0
-1.000
1.5
cla
0.500
0.000
1.500
-2.500
-1.500
3.500
4.500
G3
0.7
2.8
11.2
44.8
6.3
Complex constraints
Optimal control problems where the state and/or control variables are constrained in a somehow more complicated way are considered in this section.
As a result, necessary optimality conditions will be established for a much
wider class of problems at a price of an acceptable increase in the presentation
complexity. The problems to be discussed are characterized by the presence, at
the same time, of only one of the constraints reported below, the extension to
the case where more than one of them is contemporarily active being conceptually trivial. More precisely, the control problems with complex constraints
which are taken into consideration are:
(a) Problems with the final state constrained to belong to the set
6.3.
Complex constraints
181
(c) Problems where some functions of the state and/or control variables must
satisfy instantaneous equality constraints over the whole control interval,
i.e., constraints of the form
(d) Problems where some functions of the state variables must satisfy instantaneous equality constraints at isolated points, i.e., constraints of the
form
w(Z)(x(ti),ti)=0, to <ti
6.3.1
With reference to the class of problems mentioned in item (a), the case where
only one state variable must belong to a given interval is first considered. Thus
the set Z is constituted by a single element and a possible way of handling the
problem consists in ignoring, as a first attempt, the constraint ai < xi(t f) < bi
and then proceeding with the computation of a control which satisfies the
NC. If the resulting state motion is such that x(t f) E S f, the solution at
hand obviously verifies the NC also for the original problem. If this does not
happen, a couple of problems must be tackled, namely those where the final
state must belong to the set S fa := {x ` x E S f, xi = ail or to the set S fb : _
{xJ x E Sf, xi = bi}. These sets, if not empty, are regular varieties (possibly
after some suitable and obvious rearrangements of the equations ai (x) = 0
which define 9f ) and it is apparent that the solutions which are dominated by
some others should be discarded.
Example 6.21 Consider a cart with unitary mass which moves without friction along
a straight rail subject to a force u, the absolute value of which must be less than 1.
182
Figure 6.13:
Example 6.21: partition of the state space into the three subregions Io, Ia and Ib.
Therefore the system is described by ti = X2, x2 = u, Ju(t)I < 1, 0 < t < t f, xi and
x2 being the cart position and velocity, respectively. The initial state is given, while
at the free final time the state must belong to the set Sf = {xI X2 = 0, -1 < x1 < 1}.
The elapsed time is the performance index and the problem is not trivial only if the
initial state does not lie inside Sf. As a first attempt the constraint on xl is ignored,
so that the orthogonality condition implies that \1 (-) = 0 and
A2(0), since
the solution of the auxiliary system is Al = \1(0) and A2(t) = A2(0) - A1(0)t.
that the time which is required to cover any trajectory (relative to u = 1) from
a point where X2 = a to a point where x2 = 8 is J a -,81, there is no convenience
in switching the control at the intersection with the curve Ca. Vice versa, when the
6.3.
Complex constraints
183
Figure 6.14: Example 6.22: partition' of the state space into the four regions
I2, i = 1,2,3,4.
initial state belongs to the region located to the left of the curve Ca, the control is
given by u(t) = 1, 0 < t < r, u(t) _ -1, r < t < t f, r being the time instant where
the trajectory starting from the initial state and corresponding to u = 1 intersects
the curve Ca. As before it can be concluded that there is no convenience in switching
the control at the intersection with the curve Cb. Some trajectories consistent with
the above discussion are reported in Fig. 6.13.
The procedure illustrated for the case where only one state variable must
belong to a given interval can fairly trivially be generalized to the case where
more than one component of the state rriust comply with requirements of
such a kind. Indeed a combinatorial-type check of the various possibilities can
conveniently be performed, as shown in the following example.
Example 6.22 Again consider the system described in Example 6.21 with the same
performance index and final state x(t f) E Sf := {xJ - 1 < xi < 1, i = 1, 2}. As
before, the problem is not trivial only if the given initial state does not lie inside
Sf. It is apparent that the situation where both the requirements on the final state
are ignored should not be considered: thus, consistent with the above discussion,
the cases to be dealt with refer to the constraints x2 = 1, -1 < xr < 1 and
xi = 1, -1 < X2 _< 1. It is simple to check that the control must be a piecewise
constant function which takes on the values 1 and switches at most once between
these values. By taking into account the form of the performance index, the shapes
of the trajectories corresponding to such values of the control and the whole set of
the NC, one can easily conclude that (see Fig. 6.14):
(a) If the initial state belongs to the region Il constituted by that part of the xi-x2
plane which is delimited by the curves Cl and C4 (defined by the equations
xl = 1.5 -x2/2, -1 < x2 and xl = -0.5 -x2/2, 1 < X2, respectively) and the
portion P4 - Pi - P2 of the boundary of S f, then u = -1;
184
(b) if the initial state belongs to the region 13 constituted by that part of the xl -x2
plane which is delimited by the curves C2 and C3 (defined by the equations
xt = 0.5 A- x2/2, x2 < -1 and xi = _1.5+X2
x2 < 1, respectively) and the
portion P2 - P3 - P4 of the boundary of Sf, then u = 1;
i/2,
(c) if the initial state belongs to the region 12 constituted by that part of the
xt - X2 plane which is located to the right of the curves C1 and C2, then
u(t) _ -1, 0 < t < T, u(t) = 1, r < t < t1, r being the time instant where the
trajectory starting at the initial state and corresponding to u = -1 intersects
the curve C2;
(d) If the initial state belongs to the region 14 constituted by that part of the xl -x2
plane which is located to the left of the curves C3 and C4, then u(t) = 1, 0 <
t < T, u(t) = -1, T < t < t f, T being the time instant where the trajectory
starting at the initial state and corresponding to u = 1 intersects the curve C4.
6.3.2
Integral constraints
(6.31)
to
to be verified. Here we is an
vector of functions which are
continuous together with their first derivatives with respect to x and t, while
we is a given vector. This constraint can be handled simply by adding a vector
z of 7-F new state variables to the original ones. These new variables are defined
by the equations
z(t) = We (X (t), U (t), t)
z(to) = 0
so that eq. (6.31) holds if z(t f) = we. Thus the given problem has been restated
in terms of a problem relative to which the NC have already been presented.
Example 6.23 Consider the system described by the equations ti = X2, x2 = u with
x(0) = 0 and given final state xt (t1) = 1, x2(t f) = 0, t f being free. The performance
index is the elapsed time and the integral equality constraint
fcf
2 dt = E
J0
must be verified. Here E > 0 is a given constant representative of the control energy
6.3.
Complex constraints
185
-t
2
Figure 6.15:
that p(0) > 0, yielding Uh = -A2/(0). Thus (0) must a posteriori be checked
to be positive. By enforcing the transversality condition and the control feasibility
(in particular, z(t f) = E), we find t f = s 6/E, A1(0) = -2/3t f, (0) = tf/18,
A2 (0) = -tf /3. Note that (0) > 0. The state and control responses and the state
trajectories are shown in Fig. 6.15 corresponding to three values of E. Notice that
faster transients and more demanding control actions result when the constant E is
increased.
Jtp
(6.32)
186
(b)
(a)
Figure 6.16:
Example 6.24: (a) the set Xo; (b) state trajectories when the
constraint is satisfied (V$) and is not satisfied (V,,).
are present, wd being an rd-dimensional vector of functions which are continuous together with their first derivatives with respect to x and t, while wd is
a given vector. As before this constraint can be taken into account by adding
a vector z of rd new state variables to the original ones. These variables are
defined by the equations
z(t) = Wd(x(t), U(t), t),
z(to) = 0
so that eq. (6.32) is verified if z(t f) < wd and the given problem is restated as
the problem discussed in Subsection 6.3.1.
Example 6.24 Consider the system presented in Example 6.23 with x(O) = xo and
x(1) = 0. The performance index is
J=
2 dt
1,2
2dt<E, E>0
it is easy to conclude that the constraint can be ignored, provided that the initial
state belongs to the region X0 of the xl - X2 plane bounded by the ellipse 2x2 +
187
18x, + 3x1x2 = 30E (see Fig. 6.16 (a)). If the initial state does not belong to Xo,
an extra state variable is introduced through the equations z = x2/2, z(O) = 0,
z(1) = E. This new problem can be dealt with in the standard way. As an example,
if E = 1, x1(0) = 5/3, x2(0) = 1, the system trajectories are shown in Fig. 6.16
(b), when the constraint is complied with or violated: correspondingly, the value of
the performance index reduces from 88.81 to 19.75 and, for the second case, we find
1
1
6.3.3
2X2dt = 2.39.
condition wi(x(t), t) = 0, to < t < t f, implies that all total time derivatives
of such a function must be identically zero. Assume that each state variable
is either directly or indirectly affected by the control: then it must result, for
some q > 1, that
d4wi(x(t), t)
dt4
:= wi (x(t), u(t), t)
since zt is a function of u. By denoting with vi the smallest value of q corresponding to which the q-th derivative explicitly depends on u, the constraint
wi = 0 is equivalent to the vi + 1 constraints
wi (x(t), u(t), t) = 0, to < t < t f,
di wi(x(t), t)
= 0, j = U, 1,
dti
t=T
188
X1
Figure 6.17:
x2
where T is any time instant belonging to the interval [to, t1}. Note that the
choice -r = to or r = t f might require us to suitably shrink the set So or S f,
or even imply the infeasibility of the problem.
Example 6.25 Consider a system similar to the one described in Example 6.4, i.e.,
constituted by two carts with mass mm and rn2 which can move without friction
along two straight and parallel rails. Two forces ul and u2 act on the carts. By
denoting with xm and x2 the position and velocity of the first cart and with x3 and
X4 the position and velocity of the second cart, the equations for this system are
x1 = X2, 1rt1x2 = u1, x3 = X4, m2x4 = U2. The initial state and time are given,
namely x(O)
0 1 0 0 ]', and the objective is to find the control actions
which minimize the performance index
J = 2
f[u+u}dt
while complying with the constraints xi (1) = 2, x3(1) = 1 and w(x(t), t) = xi(t) x3(t) - t = 0, 0 < t < 1. The function w does riot explicitly depend on the control
variable: its differentiation with respect to time leads to the further constraints x2(t)-
um 2 u2
189
- al a a2 (1\2 + A4)
al a2
21a22A2+A4
al + a2
while the equations of the auxiliary system imply A2(t)+1\4(t) = (A2(0)+A4(0))(1-t).
Here the orthogonality condition at the final time, A2 (1) + A4 (1) = 0, has been taken
into account. Finally, by imposing the control feasibility we find A2(0) + A4(0)
-3(ai + a2)/(ala2)2
Example 6.26 Consider the system shown in Fig. 6.17 constituted by two cylindrical
tanks of equal area. An outgoing flow proportional to the liquid level exists in one of
the two tanks. The two tanks are fed through a constant flow which can be subdivided
at will between them. By denoting with xi, X2, ul and u2 the levels and the flows, the
equations for this system are it = ul, x2 = -x2 +u2, UI +u2 = 1, provided that the
physical parameters have suitably been chosen. The problem is to drive the system in
the shortest time from a given initial state x(0) = xo to a given final state x(t f) = x f.
By choosing Ao = 1, the hamiltonian function is H = 1+A1u1+A2(u2-x2). A solution
satisfying the NC can be found by approaching the problem in a way which differs
from the one outlined above, namely by noticing that the equality constraint can be
The solutions of the auxiliary system are al(t) = k(0), A2(t) = A2(0)et: thus the
sign of \1 - A2 changes at most once (this quantity can not be identically zero
since otherwise it would follow that A = 0 and Ao = 0 because of the transversality
condition). Obviously no feasible solutions exist if xol > xf1, while the analysis is
trivial if x01 = x f1. Therefore it is assumed that A := x fl - xol > 0: this fact rules
out the situation
0. The only controls satisfying the NC have one of the
following forms:
(b) ul (t) _
(c)
ul (t) _
T<t<tf,
0<t<T
T<t<tf,
where the values for T and t f are computed by enforcing feasibility. With reference
to the three alternatives above we find
190
Figure 6.18: Example 6.26: the pairs (x02, xf2) corresponding to which a
(a) t f = 0,
(b)
tf=T+111(.L02e
_1 1), T=0,
Xf2
(c) t f = T + 0, T =
X02 f2
11).
6.3.4
The optimal control problems to be considered are characterized by instantaneous equality constraints on functions of the state variables which must be
satisfied at isolated time instants. Thus, the customary problem statement is
completed by expressions of the kind
wW (X (ti), ti) = 0, to < tl < ... < ti < ... < tg < t f
where the s instants ti may or may not be specified and the vector valued
functions 00 are continuous together with their first derivatives.
When these constraints are present the hamiltonian function as well as
the functions A play be discontinuous at the times ti. More precisely, if x, u
191
are optimal and A', A are the corresponding solutions of the auxiliary system
relative to which the NC are satisfied, then, for 1 < i < s, it must happen that
t-tt
ttq
x=x(ti)
where (i) is a suitable vector. Moreover, if the time ti is not given, it should
also be
limn H(x(t), 0(t), t, Ao, A(t)) = lint H(x(t), u(t), t, A0, A(t))
t--ti
9w(Z)(x(ti), t) I /
at
( i)
t=ti
for 1 < i < s. Note that, according to the nature of the functions w(Z) at hand,
H and A may or may not actually be discontinuous.
Example 6.27 Consider a rendez-vous problem which amounts to placing side-by-side
two carts which move along two straight parallel rails. The motion of both carts is
frictionless and the time of the rendez-vous is unspecified. A force u of arbitrary
intensity acts on the first cart, while the other one is subject to a constant unitary
force. At the initial time the first cart is in the reference position with zero velocity:
the same situation has to be achieved at the unspecified final time. The second cart
has initial velocity equal to 1 and its position is 1. The whole operation has to be
accomplished in a reasonably short time while requiring control actions of limited
amplitude. By denoting with x1 and x2 the position and velocity of the first cart,
the problem is to find the control which steers the system xa = x2, x2 = u from the
initial state x1(0) = x2(0) = 0 to the final state xI(tf) = x2(tf) = 0 while complying
with the constraint
w(x(tl),t1)=
x1(ti)-1-tl - 2t2
X2(ti) - 1 - ti
u2
J=I (1+2)dt.
0
The harniltonian function for the problem at hand is H = 1 + u2/2 + Aix2 + A2u
so that uh = -A2 With Al = 0, A2 = -A1. Due to the presence of the constraint, A
and H might be discontinuous at time ti. If At denotes the constant value of Al for
t > ti and a2 (,\+) is the value approached by \2 when t tends to ti from the left
(right), the equations which complete the set of the NC are
192
Figure 6.19:
X1(0)
A = 1\a + 2,
(--)2
1-
+A I (())x2(tI) = 1 -
x1(tl) = 1 + tl + 2 ,
x2(tl) = 1 + t1,
xl(tf) = 0,
x2(t f) = 0,
1 - A2(tf) - 0.
2
This is a set of eight equations f o r the eight unknowns A1(0), A2(0), A+1, A , l, 2,
11i t f (recall that A = A2(0) - Al (0)tl). The fourth equation together with the fifth
one enforce the constraint, while the last equation is the transversality condition.
We find ti = 1.70, t f = 8.23, A1(0) = -4.50, A2(0) = -5.42, Al = 0.56, A2 = 2.24.
The time responses of xl and x2 together with the position xrl and velocity xr2 Of
the cart to be approached and the corresponding trajectories in the state space are
shown in Fig. 6.19.
Example 6.28 Consider the cart described in Example 6.27 with the same initial and
final conditions but let the control interval be given and equal to 2. The performance
index is
2 u2
2dt
J
U
which has to be nlininiized by complying, every time, with one of the following
constraints: (a) x1(1) = x2(1) = 1, (b) x1(1) = 1, (c) x2(1) = 1, (d) xi(ti) =
x2 (tl) = 1, (e) xl (11) = 1, (f) X2(11) = 1. In all cases the hamiltonian function is
193
A1(0) = Al +i,
A2 = A2 + 2,
x1(1) = 1,
x2(1) = 1,
x1(2) = 0,
x2 (2) = 0.
We find A1(0) = -6, A2(0) = -4, Al = 18, A = 10 and a value Jam, = 16 for the
performance index. Since the state is fully specified at a given time, the solution of
the problem can be obtained by considering two completely independent subprobleins
with given initial and final state: the first one is defined over the interval [0, 11 while
the second one is defined over the interval [1, 21. Actually, the above equations imply
the independence of the values taken on by the vector A before and after time 1.
Case (b)
A1(0)
2=2
x1(1) = 1,
xi (2) = 0,
x2(2) = 0.
We find A1(0) = -12, A2(0) = -6, Al = 12, and a value Jb = 12 for the performance
index.
194
Case (c)
,\1(0)=A
A2 =A +2
x2(1)=1,
x1(2) = 0,
x2(2) = 0.
A,(0)
a2 = 2 + 2,
+ 2
XI (ti) = 1,
x2(tl) = 1,
x,(2) = 0,
x2(2) = 0.
,\1(0) _ Al + l,
2
- (A2
)2
xl(ti) = 1,
x1(2) = 0,
x2(2) = 0.
The second and third equations imply that A1(0) = Al or x2(tl) = 0. The first
alternative should be discarded as it is not consistent with the other equations. Thus
it follows that A,(0) = -12, A2(0) = -6, At = 12, tl = 1 and a value Je = 12 for the
performance index results. Due to the particular nature of the problem data this
solution coincides with the solution of case (b).
6.3.
Complex constraints
195
Case (f)
A1(0)
2 = a2 + 2,
-(
2+
-( 2) +A1
x2(ti) = 1,
x1(2) = 0,
x2(2) = 0.
al(0) = 3
)2(0) = -v/
A+=3
t1=1-f/3,
AI (0) = 6
A2(0) = 2
a2 =4
t1=1,
A1(0) = 3
A2(0)
A2 =3
ti = 1 + //3,
which yield the three values Jf = 3, J f = 4, J f = 3 for the performance index. Note
that the second solution coincides with the solution of case (c) and that the controls
relevant to the first and third solution are opposite in sign to each other.
The peculiarities of the considered problems may be appreciated by looking
at Figs. 6.20, 6.21 where the state trajectories (covered in a clockwise sense) and
responses are reported: the labels f, and f3 denote the curves concerning the first
and third solution of case (f). Finally, the values of the performance indices deserve
some attention: as expected, their mutual relations are Ja > Jb > JJ, J. > Jd,
196
6.3.5
I tZ
wi(x,u,t) = 0,
w (x u t)<U
,Z>>
- w* (X (t), u(t), t)
z
since i; is a function of u. By denoting with vi the smallest value of q corresponding to which the q-th derivative explicitly depends on u, the fact that
wi(x(t), t) = 0, to < tij t < ti,a+1 < t f is equivalent to the vi + 1 constraints
wz (X M, u(t), t) < 0
dt"
=0
t=t=.;
h=U,1,2,...vi-1.
6.3.
Complex constraints
197
constraint -(1 + x2/2) < u < 1 + x2/2. Thus the problem refers to the system
xl = X2, i2= u with given initial and final state subject to the constraint
2
u-(1+ 2 )
w(x, u, t) =
-(u +
2)
H*=1+A1x2+A2u+i(u-(1+
22 ))-2(u+1+ 2)
2
while the nature of the constraints implies that at least one of the two multipliers
is zero, so that
uh =
1+x2
A2 <0,
22
-(1+2),
A2>0.
The state trajectories corresponding to the two ways the control is allowed to depend
on x2 are the curves described by the equation
2
xi = k - sign(A2) 111(1 + 2 ).
198
t2 1
- 40
Figure 6.23:
u
J- f 2dt.
l
total time derivative is -u, the problem must be dealt with where x2(ti) = -1 and
the hamiltonian function is H* = u2/2 + A, X2 + A2u - u, if the choice Ao = 1 has
been performed. Note that uh = -(A2 - p): thus it follows that uh = -A2 outside the
interval T where the constraint is binding, while A2 - p. = 0 inside T. By requiring
that: (i) both Al and the hanliltonian function be continuous at tl (the equality
constraint does not explicitly depend on the first state component and time); (ii) the
control is feasible (state constraints at tl and at the final time); (iii) both A2 and
are continuous at t2 (thus A2(t2) = 0), the following equations are obtained:
Ai = X1(0),
0 = A2(0) - A1(0)t1,
1(0)ti
-1 = 10 0 = X2(t2) +
Ai (1 - t2)2
2
6.3.
Complex constraints
Figure 6.24:
199
Ai
(16
t2)3
0=A2 -1\1(t2-t1),
where x(t2) is evaluated according to the (known) form of the control function and
A+ is the right limit of A. The solution of these equations is ti = 0.26, t2 = 0.92,
A1(0) = 312.21, A2(0) = 82.88, A2 = 204.35. For the same problem a control which
satisfies the NC when the inequality constraint is ignored is u(t) = -40 + 60t. For
a significant comparison of the two situations the values of the performance indices
are Jc = 312.2 and Jf = 200, respectively. The control responses and the state
trajectories are reported in Fig.6.23 when the constraint is or is not taken into
account (curves labelled with c and f, respectively).
Now assume that the control interval is free and the performance index is
J=
jIf
u2
(1+ 2 )dt.
The procedure to be followed is truly similar to that presented relative to the previous
version of the problem: thus only the transversality condition has to be added to the
set of equations above, yielding
Al = AI (O),
0 = A2(0) - AI(0)ti,
Ai(0)ti
ai+
(tf 6- t2)
200
0 = A2 - )I (t2 - tl ),
0=1-a1-.
Note that the last equation is the transversality condition written for a time which is
interior to the interval where the constraint is binding, thus implying u = 0. The solution of these equations is tl = 4.69, t2 = 16.26, Ai(0) = 1, A2(0) = 4.69, Aa = 11.56,
t f = t p, = 17.66. The control and final time satisfying the NC when the constraint
is ignored are u(t) _ -2.83 + 0.3t and t f = t f f = 14.14: the performance indices
corresponding to the two situations are J,, = 35.34 and Jf = 28.28, respectively. The
control responses and the state trajectories are shown in Fig.6.24 when the constraint
is or is not taken into account (curves labelled with c and f, respectively).
6.4
Singular arcs
lit many optimal control problems the hamiltonian function is linear with
respect to the control variable u (see the problems described in Examples 6.2,
6.8, 6.21, 6.22, 6.26, 6.29), i.e., it takes the form
II (x, u, t, A0, A) = a'(x, t, Ao, A)u + /3(x, t, A0, A).
(6.33)
In such cases it might happen that, corresponding to a given control u*, there
exist A > 0 and a pair of functions (x*, A*) which are solutions of the equations
(t.) = f (:x(t), u*(t), t),
A(t)
and such that one or more of the components of a is zero when t E [tl, t2], tl <
12. In this time interval the state trajectory is referred to as a singular arc and
the control components with index equal to the index of the zero components
of a are called singular components of the control. Whenever the whole vector
cx is zero, the control is referred to as a singular control. When a singular are
exists within the interval [ti, t2] the pair (x*, u*) is called a singular solution
over that interval.
In order to be part of a solution of an optimal control problem a singular
solution must satisfy the condition stated in. the forthcoming theorem which
requires suitable differentiability properties of the functions a and ,Q: such
properties are here assumed to hold. Moreover, a(i)(x(t), t, Ao, A(t), u(t)) denotes, for i = 0, 1, 2, ... , the total i-th time derivative of a where the available
201
aa(x, t, \o, A)
at
x=x(t)
f (x(t), u, t)
aca(x(t), t, Ao, A)
as
x=x(t), A=A(t)
ax
='\(t)
Ii
x=x(t)
AO, a (t))
dT a(x(t)dt,
aa(P)(x(t),
AO, A(t), u)
t,
On
atr=2q, 15 =2q-1, q.
Example 6.31 Consider the system xl = X2, x2 = u. The state of this system has to
be driven in a short time from a given initial value xo to the final value x(tf) = 0
while keeping the system velocity small and avoiding positive positions. For these
requests a convenient performance index could be
tf
J =
(1
x2
+ 2 + xl)dt
fo
to be minimized by a suitable choice of the control which must comply with the
constraint -2 < u(t) < 2, 0 < t < t f. By setting Ao = 1, the harniltonian function is
H = 1 + x2/2 + x1 + Aix2 + A2u, so that uit = -2sign(t2) if A2 54 0 and ai
A2 = -x2 - A1. A singular control might exist if A2 = 0, 1\i = -X2, u = 1 and
xi = x2/2 - 1, the last equation being a consequence of the transversality condition.
In view of these results the variable u can take on only the values +2 yielding the
trajectories xl = +x2/4
+ k or the value 1 yielding the trajectory xl = x2/2
-1
2
2
which constitutes a singular arc (heavy curve in Fig. 6.25). Thus the origin can be
reached only if u = 2 and the consistency of the trajectories P31) - P2 - Pi - 0,
202
Q7.1
Figure 6.25:
P31)-p2-p1P4-P4-0, Q51)-Q21)-Q1-U'
Q52)-Q22)-Q1-0, Q3-Q22)-Q1-0,
T2) > 0, if the path continues towards P31), while d1\2(7-)/dr = -(T - T2) < 0, if
the path continues towards P32) . These conclusions have been drawn by taking into
account that x2* (7-) = x2* (-r2) 2(T - 7-2) in the two cases, with x2* (-r2) _ -(1 + T2).
Therefore the control does not further switch once the path towards either of the
two points P3Z) has been chosen.
`Trajectory 0 - P4 - P5: If T4 is the time when the point P4 is reached, then it follows
that A*(7-4) = 0 and A* (0) = T4/2+1/(2T4), since a2(0) = -1/2. Thus dA*(T)/dr > 0,
if the path continues towards the point P5 and no further control switches can occur.
203
Trajectory P7 - P6 - P1 - 0: This trajectory does not satisfy the NC. In fact, letting
T6 be the time when the point P6 is reached, it is necessary that \2*(T6) = -1/2 +
ai (0)T6 -,r62/2 = 0: this equation supplies values of T6 greater than 1 only if ai (0) >
1. However, in this case .z vanishes also for T = T* < 1: at this time dal (T) /d r > 0
so that a control switch takes place.
Trajectory Q7 - Q6 - Q1 - 0: This case is similar to the preceding one and the same
conclusion holds, namely that the trajectory does not satisfy the NC.
The analysis above is conveniently summarized in Fig. 6.26 where the state
space is subdivided into regions which specify the control complying with the NC
(if it exists). Note that two choices for u can be made in a region, while the NC
u3 > 0, and the optimal control problem is to steer the system in the shortest
possible time from the given initial state xO to the situation x2 (t f) = X2 f > X20,
X30. No requirements are set on the first tank level. By letting \0 = 1
and handling in the obvious way the equality constraint on the control variables,
x3(t1) = X3 f
204
Figure 6.27:
uh 1=
0,
A1>0
\1 < 0,
V,
U,
kx1,
Al (t) = A U) (1 -
e'*-t f))
where the orthogonality condition A1(t f) = 0 has already been imposed. Note that
A3(0) ; 0, since otherwise the transversality condition (1 + A2u2 = 0, in such a case)
could not be satisfied because of the form of u2 and the constancy of A2. Thus the
sign of Al does not vary and ul is constant with value either 0 or v. Since 1\2 and
A3 are constant as well, the conclusion can be drawn that if A2 (0) - A3 (0)/A is not
zero, i.e., if no component of the control is singular, either u2 = 0 or u3 = 0, which
is not a feasible choice whenever A2 := X2 f - X20 > 0 and A3 := X3 f - X30 > 0. By
assuming that these inequalities both hold, the value of ul consistent with the NC
is searched. Incidentally, note that conditions (i)-(iii) of Theorem 6.3 hold with the
equality sign for all r, p, q if A2(0) - A3(0)/A = 0.
If
0 it follows that A3(0) > 0 (recall the expression for A1). Feasibility
requires that
A2 =
A3 =
ftf
J0
u2dt,
[ft1 kxldt -
6.5.
and
205
+ A2
1 -e- ktf = AA3
x1(0)
since xl (t) = xl
This fact calls for A3 (0) to be negative since the term between brackets is positive
because of the previous equation where the left term ranges between 0 and 1. Thus
a contradiction results.
v it follows that )\3(0) < 0, while feasibility calls for
If ul
since xl (t) _ [kxl (0) - v}e-kt + v and t f can be determined. The transversality
condition set at t = 0, requires, in view of the expression for A1(0),
A3(0) = -
Since the denominator is positive, it follows that A3(0) < 0 as required. As for the
response of u2, it is not uniquely determined by imposing the NC: among the infinite
allowable choices the simplest one is setting it at the constant value u2 = A2/t1.
6.5
The so-called minimum time problems are now considered. They deserve particular attention in view of both the significance of the available theoretical
results and the number and importance of applications.
Basically, a minimum time problem consists in steering the system in the
shortest time from a suitable point of a given set So of the allowable initial
206
set resulting from the intersection of a finite number s of closed half planes:
thus it can be defined by the equation (see also Fig. 6.28)
(6.34)
The forthcoming results require that the matrices A, B and the set U satisfy
a crucial assumption.
Assumption 6.1 Given an arbitrary vector v 0 0 aligned with any one of the
edges of U, the set of vectors By, ABv, A2Bv,..., A"-"' By is linearly independent.
Theorem 6.4 Let Assumption 6.1 hold. Then, for any nonzero solution of the
auxiliary system there exists a unique control which minimizes the hamiltonian
6.5.
Figure 6.29:
207
A=
0
0
-1
0
0
, B=
0
1
0
0
, S=
-1
-1
I-1 -1
1
1
and v = [ 1 1 ]', which is aligned with the edge or of U (see Fig. 6.29). It is easy to
check that Assumption 6.1 is not verified. Consistent wth this, the general solution
of the auxiliary system is
.X1(0)e-t
.X(t) _
so that
A'(t)Bu = [A3(0) + AI(0)(1 - t - e-t) -.X2(0)t]ui +.X1(0)e-tut
208
By choosing A2(Q) = )t3(0) =-A, (0), A1(O) > 0, we get )t'(t)Bu = Ai(0)e-t(u2-ul)
and the haniiltonian function is minimized by any pair (U1, u2) belonging to the
edge a.
Theorem 6.6 Let Assumption 6.1 hold and ,9f = {x f}. Then the optimal control, if it exists, is unique.
This theorem (which is not an existence theorem) states that if u and u* are
two optimal controls defined over the intervals [0, t f] and [0, t f], respectively,
which transfer the state of the system from xo to x f, then they do coincide,
i.e., t f = tf and u(t) = u*(t) for (almost) all t.
Obviously, an optimal control has to be sought within the set of those
extrernal controls which are also feasible, that is which drive the state of the
system from xo to Sf. In some particular cases the feasible extrernal controls
are unique.
Theorem 6.7 Let Assumption 6.1 hold and S f = Ix f } = 0. Moreover, let the
origin of R"z be an interior point of U. Then there exists at most one extremal
feasible control.
This theorem (which is not an existence theorem) states that if u and u* are
two extrernal controls defined over the intervals [0, if I and [0, t f], respectively,
which transfer the state of the system from xo to 0, then they do coincide,
that is f f = tf and u(t) = u*(t), for (almost) all t. This fact implies that if
an extrernal feasible control has been found, then it is the optimal control,
obviously provided that air optimal control exists.
Theorem 6.8 Let Assumption 6.1 hold. If a feasible control exists, then an
optimal control exists too.
209
u = feasible
u=u
extremal
S,=(0), 0 int. U
A= as. stable
u = feasible
u= piecewise
constant
u = unique
Sf=(xt)
#switches
of u '=finite
? (A)=real
U=P
Figure 6.30: Summary of the results for time optimal control problems.
0}. Assumption 6.1 holds and Theorem 6.5 can be applied so that each extremal
control (hence the optimal control, too) may commute at most once. Indeed, the
nonzero solutions of the auxiliary system are A1(t) = A1(0),
A2(0) - A1(0)t:
thus the sign of 1\2 may change at most once. Moreover A2 can be zero at an isolated
time only since, otherwise, A1(0) = 0, A2(0) = 0 and Ao = 0 as well, because of the
transversality condition and the NC would be violated. By noticing that
= I Urn,
u
uM,
A2>0
2
<0
it is easy to check that the control can switch at most once. The trajectories of the
system corresponding to the two allowed values for the control are shown in Fig.
6.31: they are the parabolas defined by the equation xl = k + x2 /a where a =
or a = 2uM. In the first case the trajectories are covered in the sense of decreasing
x2 while in the second case they are covered in the opposite sense. It is apparent
(see Fig. 6.31) that for each pair (xo, x1) there exists a feasible extremal control.
Therefore also the optimal control exists. Now consider Fig. 6.32 and let the final
state x be such that xf2 > 0. If the initial state does not belong to the region D+
delimited by the line A+ - x - P+ (made out of pieces of the trajectories passing
210
x1
Figure 6.31:
covered in a shorter time since in each point the controlled object possesses a greater
velocity than while covering the trajectory ds - 2 - x f ). The line A+ - x - P+ - B+
f versa. It
is the locus where the optimal control switches from um to um or vice
is appropriately referred to as a switching curve as it allows one to implement the
solution in a closed loop form: the value to be given to the control depends only on
the system state (u = u,n to the right of the curve, u = um to the left of the curve).
A similar discussion applies when the final state x f is such that x72 < 0 or its second
component is zero (see Fig. 6.32).
Finally, consider the particular case when x(t f) = 0. Thanks to Theorem 6.7,
there exists a unique feasible extremal control: this fact can easily be checked by
looking at the trajectories corresponding to the two extreme values the control can
take on. The form of the switching curve is the same as the previous case and its
equation is xi = X2'/(2u,,,), x2 > 0, xi = x2/(2uM), x2 < 0. As before, the switching
curve partitions the state space into two subregions where the control is u = u,n
(region to the right of the curve) or u = um (region to the left of the curve).
Example 6.35 Consider a problem similar to the one presented in Example 6.34.
The initial state is zero (x(0) = 0) while the final state must belong to the set
Sf = {x) xi = 0, x2 = 1}: thus the system has to be drawn back to the initial
position but with a specified kinetic energy. The control variable must comply with
the constraint Ju(t)I < um # 0. Assumption 6.1 obviously holds so that, thanks
to Theorems 6.4 and 6.5, it can be concluded that an optimal control, if it exists,
6.5.
211
f - P+ - B+, B- -
switches at most once from -um to um or vice versa. In view of the form of the
system trajectories corresponding to u = UM, which are parabolas defined by the
equations xr = x2/(2uM) (see Fig. 6.33 where um = 1), and the fact that two
final states are admissible, the conclusion can be drawn that two feasible extreral
controls exist which are also optimal. Note that, consistent with this, Theorem 6.6
can not be applied.
Example 6.36 Again consider a positioning problem concerning a pointwise object
with unitary mass. It moves in a plane subject to two forces ur and u2 acting along
the axis. By denoting with xr and x3 the coordinates of the object, with x2 and x4
the corresponding time derivatives, the system equations are
xl
X2
which can legitimately be viewed as describing two second order independent sub-
systems. The initial state is the origin (x(0) = 0), while, as for the final state, the
object has to be steered, at rest and in the shortest time, to any point of the set
212
Figure 6.33:
Sf = {xI X3 = X112 - i9, x2 = X4 = 0}, '0 > 0 (see Fig. 6.34 where the orthogonal
Al = 0,
A2 = -A1,
30,
A4=-A3
exist which do not uniquely determine the control minimizing the hamiltollian function II = I + Aix2 + A2ui + A3x4 + A4U2 (for instance, A2 = 1, Al = A3 = A4 = 0).
however the extrerrlal controls satisfying the NC can be selected out of the infinite ones. The orthogonality condition requires (recall that Ai(t) = Ai(0), i = 1, 3)
Al (0) = -A3(0)x1(t f), so that the following cases can occur:
(a) A1(0) = A3(0) = 0,
0, xl(tf) = 0,
0-
'File first case must be discarded since it implies that A2 and/or A4 always have the
salve sign, thus entailing that ul arid/or u2 are constant: this fact prevents them
from being feasible because x2(t f) = 0 and x4(t f) = 0 can not simultaneously occur.
The second case is consistent with \2 = 0, ul = 0 (singular control component)
so that x1(t f) = 0 and the final position is Po (see Fig. 6.34). On the contrary, A4(0)
6.5.
Figure 6.34:
213
must be nonzero and such as to call for a control u2 which steers the second subsystem
uin2,
0 < t < T,
T = - /219U.,,2UM2/(um2 - uM2)/umn2,
UM2,
T < t < t f,
t f = \/219(U n2 - uM2)/(um2UM2).
The value for T and t f have been computed by taking into account that: (i) the time
required to pass from x4(0) = 0 to x4 = x4(T) < 0, when u = um2, is T = x4/u,,,,2, (ii)
the time required to pass from x4 < 0 to x4 = 0, when u = UM2, is t f -T = -x4/UM2i
/
tfi ='PiV I6I, cpi :=
V
umjUMj
2(umi
- uM2)
By imposing that t11 = t12 the conclusion is drawn that the final positions consistent with the case under consideration are those resulting from the intersection of
the straight lines defined by the equations x3 = (cP1/cP2)2x1 with the parabola 11
(see Fig. 6.34, where a :=tan-1((cP1/(P2)2)). However, the points P2 and P2 are
214
Figure 6.35:
reached by means of an extremal control which does not satisfy the orthogonality
condition. Indeed, consider the final position P2 which implies x(t1) > 0, i = 1, 2:
it is necessary that ui(O) > 0, i = 1,2 so that \i(0) < 0, i = 2, 4. In order that
the two control components switch it must result that \i(0) < 0, i = 1, 3, which
is inconsistent with the orthogonality condition. A similar reasoning leads to saying
that also point P. is reached by means of an extremal control which does not satisfy
the NC, while points P1+ and Pi are reached by means of extremal controls which
satisfy the NC. As a conclusion three final states have been located corresponding
to feasible extremal controls. The times required to drive the system at rest to the
points 11) and P,+ (the time required to drive the system to the point Pi obviously
equals the time required to reach Pl+) are
tf(Po) _ 02VV,
Ja2 + 2i9 - a,
where
(cpiImp2)2. Thus, the final optimal point, if it exists, is the one among them
which requires the least final time. It is not difficult to check that t f(Po) > t1(P7)
for each 29,
uMi, so that the solution of the problem cannot be unique.
Now consider a similar problem where the set U rather than being a parallelepiped is a circle centered at the origin of R2 with unitary radius. If A2 and A4
are not simultaneously zero (this fact can occur only at isolated times, otherwise
the NC are violated), the minimization of the lramiltonian function implies that the
vector ul, has unitary norm and is orthogonal to the straight line r defined by the
equation A2ui + A4u2 = k (see Fig. 6.35). Thus the actual problem is selecting the
direction of the resulting force, its magnitude being anyway unitary. By imposing
the NC, we find the final position which is closest to the initial one and belongs to
the parabola ii. Since this parabola is symmetric with respect to the x3 axis, two
such points exist when 19 > 1, namely, Q+ and Q- with coordinates ( 2(n9 - 1), -1)
UM
215
80
it
and (-J2(19 - 1), -1), respectively, while, if 0 < 0 < 1 the final point is unique and
coincides with A.
Example 6.37 Consider a minimum time problem defined by a harmonic oscillator
described by the equations
d1
dT
= 2,
dT = - W,a l + V
with given yet generic initial state o and final state C(t f) = 0. The control variable
must belong to the set V = {vJ v,n < v < vM, vmvM < 0}. For the sake of
convenience, define the new variables xl :=
x2 := 2i t := WnT, u := v/W,L,
so that the system equations become i1 = X2, i2 = -xi + u and the new control
variable u must comply with the constraint urn := v,n/Wn < U < vM/Wn := UM.
Assumption 6.1 holds so that the ininimization of the hamiltonian function uniquely
determines uh. In fact,
uh
r um,
Sl UM,
A2 > 0
A2 < 0
and A2 (t) = a sin(t+cp) with a > 0. Therefore, a generic extremal control is a function
like the one shown in Fig. 6.36 where 0 < So < and 0 < b f < ir. Corresponding to
the two limit values u,n and um which can be taken on by u, the system trajectories
are circumferences centered in (u,n, 0) and (um, 0) respectively, which are covered in
a clockwise sense. Note that ir is the time required to cover half these circumferences.
We now search the initial states corresponding to which there exists a feasible control
of the form shown in Fig. 6.36 (or ending with u(tf) = um and/or beginning with
u(0) = um). In view of Theorems 6.6, 6.7, 6.8 such a control is the optimal control.
By making specific reference to the control shown in Fig. 6.36, it is easy to draw
the corresponding trajectory which ends at the origin: it is shown in Fig. 6.37. The
location of point P1 where the last (:= v-th) switch takes place can vary, as Sf varies,
216
Figure 6.37:
between 0 and A1. Corresponding to this, the point P2 where the (v - 1)-th switch
occurs varies between B1 and B2, the point P3 where the (v - 2)-th switch takes
place varies between A2 and A3 and so on. By repeating these arguments when the
value of the feasible extrennal control at the end of the control interval is um, it is
easy to conclude that: (i) a feasible extremal control exists for each initial state, (ii)
the number of switching times is unbounded as the initial state varies (note that
Theorem 6.5 does not apply), (iii) the curve
...-A4-A3-A2-A1-0-B1-B2-B3-B4-...
is the locus where the switches from
to um or vice versa occur. This switching
curve partitions the state space into two subregions where the optimal control is u,,,
or um thus allowing a closed loop implementation of the solution.
6.6
Problems
Problem 6.6.1 For > 0 and -oo < cp < oo discuss the optimal control problem
defined by the system
.9f
:_ {x (xl - 1)2 + x2 - = 0}
6.6. Problems
217
f'udt.
J= 2Problem
6.6.2 Find a control which satisfies the NC for the problem defined by the
system xn = X2, x2 = u, x(O) = x(2) = 0, and the performance index
J=
u2dt - ax1(7-)
fo
tf
J=I (1+
u2
)dt
when:
J=
J0
where 7' is specified, while no other constraints are present. Let all assumptions
required by the Hamilton-Jacobi theory and the Maximum Principle be verified and
V be a solution of the Hamilton-Jacobi equation, corresponding to which it is possible
to conclude that the pair (x(), u(.)) is an optimal solution. Show that a' = 1 and
a(t)
.-
dV (z, ty
C7z
z=x(t)
tf dt.
J=J
218
Problem 6.6.6 Discuss the problem defined by the first order system x = u, with
given initial and final state and performance index
u2
rt f
J =
(tk +
)dt
Jto
x2(0) = 0, x(T) E Sf, Ju(t)s <,3, 0 < t < T and performance index
J =
Iuldt
Jin0
J=
Problem 6.6.9 Consider the problem defined by the system x1 = x2, x2 = u, x1(0) _
1, x2(0) = 0, x(t1) = 0 and the performance index
ft f
u2
when (i) no other constraints are present, (ii) Ju(t)I < 1, 0 < t < tf. Find a control
which satisfies the NC.
Problem 6.6.10 Discuss the problem defined by the system xl = ul - U2, x2 = U2,
x(0) = xo, x(tf) = 0, Jul(t)I < 2, 1u2(t)I < 1, 0 < t < tf and the performance index
J=J
U
tf
dt.
Problem 6.6.11 Discuss the problem defined on the first order system x = u, x(0) =
xo 34 0, x(1) = 0, u,nin < u(t) < Umax, uminuinax < 0, 0 < t < 1 and the performance
index
i (2
1x2
J=
u)dt
Problem 6.6.12 For k = 0 and k = 1 consider the problem defined by the first order
system x = u, x(0) = 1, x(t1) = 0 and the performance index
J=
f'(tk + 2 )dt.
6.6.
Problems
219
Problem 6.6.13 Consider the problem defined by the system xl = x2, x2 = U, x(0) _
0 and the performance index
J=
2 dt+t-axl(tf)
It!
where o > 0 and 8 is a positive integer. Whenever possible find a control which
satisfies the NC and discuss its optimality.
Problem 6.6.14 Consider the problem defined by the system x(t) = f (x(t), u(t)),
x(0) = xo, x(tf) = x f and the performance index
c
where x(t f), x(0) are specified, t f is free and t(x(t), u(t), t) = p(t) + t* (x(t), u(t)),
p(t) being a given polynomial of degree k > 0. Let the problem be not pathological,
(x(t), u(t), t f) a triple which satisfies the NC and A(t), A$ = 1 a solution of the
auxiliary system corresponding to (x(t), u(t)). Show that H(x(t), u(t), A A(t)) _
p(t) - p(tf)
Problem 6.6.15 Let 81 and [32 be given real numbers. Discuss the problem defined
by the system xl = x2, X2 = u, x(0) = xo, x(t f) = 0, Iu(t) I < 1, 0 < t < tf and the
performance index
J = 2-
/ft
f [1 + tlxi
Problem 6.6.16 Find a control which satisfies the NC for the problem defined by the
J=
fZ u2
2dt
when:
(v) xl(T)=-2,0<T<2;
x2(T)_-2,0<T<2.
(vi)
Problem 6.6.17 Consider the minimum time problem defined by the linear system
x = Ax + Bu where
A=[0 -1
B=[
0
1
220
S, _
-1 -1
2
-2
-1
a_
,
3
3
Problem 6.6.18 Discuss the problem defined by the first order system th = x + u,
x(O) 0 0, x(T) = 0, Iu(t)l < 1, 0 < t < T and the performance index
r7
J=J
where T is given.
Jul dt
Chapter 7
Second variation methods
7.1
Introduction
to<t<t f
bu'(t)bu(t) < e
caused by bu satisfies a relation of the same kind, provided that finite time
intervals are considered.
Independently of the considered class of perturbations, the evaluation of
both the first variation [6J]1 and second variation [6J]2 of the performance
222
index allows us to proceed along two significant directions. The first one is
pursued in Section 7.2 and leads to local sufficient optimality conditions which
result from the requirement that [SJ]1 = 0 and [6J]2 > 0 whatever is the
(nonzero) perturbation selected inside the set of the admissible ones. Whenever these conditions are satisfied one can legitimately state that the solution
at hand is locally optimal. The second direction is pursued in Section 7.3 and
deals with the problem of finding the perturbations to be given to the solution at hand, optimal when the initial state is the nominal one, in order to
preserve optimality (in a given sense) also when the initial state is changed.
Rather surprisingly, the solvabilty conditions for this problem, referred to as
the neighbouring optimal control, coincide with those ensuring local optimality.
We adopt the following notations. Let cp(zl, z2, ... , z3, t) be a generic
function. Then
cP*a(T) _ d dp(zl, z2, ... , zs, t)
dry
cab
where -y and S are any two of the arguments of W. In particular, the partial
derivative can be performed with respect to only one of the function arguments or even with respect to none of them, this second instance simply referring to the function evaluation. As an example, for the hamiltonian function
H(x, u, t, A) = l (x, u, t) + )' f (x, u, t) we obtain
H"(t) = H(x(t), uO (t), t, AO (t)),
11 (t) =
Ii.xu(t ) _
7.2
y8H(x` u, t, A)
Ox
x=zo(t) u=u(t),A=A0(t)
d aH x, u, t, A)
dX
(97 t
x=xO(t),u=u,(t),A=Ao(t)
Weak local sufficient optimality conditions are now presented with reference to
optimal control problems similar to those dealt with in Section 6.2 of Chapter
6, i.e., exhibiting simple constraints only. Here the term local is due to considering the performance index perturbation up to the second order, while the
terns weak reflects time nature of the admissible control perturbations.
More in detail, the problems on which the attention is focused are always
assumed to be not pathological and defined by
j, (t) = f (x(t), u(t), t),
x(to) = a:0,
(7.1a)
(7.1b)
223
J = n(x(tf), tf) +J
tf
to
(7.1c)
(7.1d)
given and the set S, which is the set of feasible final events (if not free,
in this case the only requirement is that the final time be greater than the
initial one) is a regular variety. No further constraints are present either on
the control or state variables. A triple which satisfies the necessary conditions
of the Maximum Principle is denoted by (x, u, t f ): it satisfies the equations
th(t) = f (x(t), u(t), t),
x(to) = xo,
0 = a(t f)
(7.2a)
(7.2b)
(7.2c)
A(t) _ -HH'(t),
A(tf) =rnx(tf)+ax'(tf)19,
H(tf) = -mt (t f) - at '(t f )19,
H(t) < H(x(t), u, t, A (t)), Vu, t E [to, t f ].
(7.3a)
(7.3b)
(7.3c)
(7.3d)
Under the assumption that Huu(t) is nonsingular for t E [to, t f ], let, for the
sake of convenience in notation,
R(t)
Huu(t),
fx(t) - fu(t)R-1(t)Hux(t),
fu(t),
HHu(t)R-1(t)Hux(t),
Hsx(t) K(t) := B(t)R-1(t)B'(t),
A(t)
B(t)
Q(t)
(7.4a)
(7.4b)
(7.4c)
(7.4d)
(7.4e)
axx i=1
L
Si := rnox(tf) + axx(tf ),
S2
ao'(t f ),
S3 := S1 f (t f) + mot(to) + a't(tf) 9 +
(7.4f)
(7.4g)
(7.4h)
(7.4i)
224
(7.4j)
(7.4k)
It is now possible to present four results which refer to the following cases:
Only the proof of the fourth result is reported here, as the remaining ones
(especially that pertaining to the first case) are fairly complex.
Relative to case (a) the following theorem holds, where the definitions
(7.4) are exploited.
Theorem 7.1 Let (x) a, t f) be a triple which satisfies the necessary optimality
conditions (7.2)-(7.3). Further assume that R(t) > 0, t E [to, t f ] and there
exist solutions of the equations
P5(tf) = 65)
P3(too ) =S3,
P6(tf) = 0,
such that P4(t) > 0, t E [to, t f], P6(t) < 0, t E [to, t f ). Then (x, u, t f) is a
locally optimal triple, in a weak sense, for the problem (7.1).
Some comments on the essence of the assumptions in the above theorem are in
order: (i) The sign definition of matrix R, though not required by the NC, is
often satisfied as naturally matching with the minimization of the hamiltonian
function; (ii) The equations for Pi can be integrated one at a time arid, apart
from the first one (which is a Riccati equation), are all linear, the unknowns
being three matrices (P1i n x n, P2, n x q, P6, q x q), two vectors (P3, n x 1,
P,5, q x 1) and a scalar (P4).
225
Example 7.1 Consider the optimal control problem defined by the system x = u,
x(O) = 0, the performance index
cf
J=i
zldt + x(tff
and the set Sf = {(x, t) I xt + 1 = 0}. It is easy to check that there exists only one
solution which satisfies the NC, namely
-3t, tf =
Example 7.2 Consider the optimal control problem defined by the system l = X2,
x2 = u, x(0) = 0, x2 (t f) _ -1 and the performance index
rtf
u2dt + 2t f + vx1(ti)
xl (t) =
A1 (0)t3 -
A2 (0)t2,
P1(t) = 0, P2(t) _ [0 1]', P3(t) = S3. As for P4, it results that P4(t) = P4(0) + alt,
P4(0) _ -a(A2(t f) + at f), so that the sign requirement on P4 is fulfilled if and only
- 2. The
if P4(0) > 0. This happens only when a = -1, A2(0) = 2 and t f =
equations for P5 and P6 admit solutions and the sign of P6 complies with the relevant
request. Thus a solution which is locally optimal in a weak sense has been found.
Further light can be shed on these conclusions by applying the Hamilton-Jacobi
theory to the problem with given, though generic, final time. The optimal value
of the performance index (as a function of the final time) can be found in this way,
226
Figure 7.1: Example 7.2: the optimal value of the performance index.
the problem does not admit an optimal solution, the local minimum is determined
by the sufficient conditions only when a = -1, the sufficient conditions cannot be
verified when the final time is t f2 or t f4.
x2(t) =
A2(0)t2,
A2(0)t,
a and 1\2(0) =
12t f, with
J to =
12
2
2-1
V -V7
tr =
tf2 =
2,41-2+1'
We now explore the possibility of exploiting Theorem 7.1. Note that only some of
the quantities involved in the sufficient conditions depend on the particular solution
selected: they are IT. = [0 \1(0)]', 83 = [0 a)', S4 = -aA2(tf) + 12t f, where the
superscript "o" has been omitted for the sake of simplicity in notation. Subsequently
we find Pl (t) = 0, P2(t) = [01]', P3(t) = S3. As for P4, it results that P4(t) = P4(0)+
227
alt, P4(0) = -aA2(t f)+(12-a2)t f, so that the sign requirement on P4 is fulfilled for
a = 1. The equations for P5 and P6 admit solutions and the sign of P6 complies with
the relevant request. Thus the solutions which satisfy the NC are locally optimal in
a weak sense. Further light can be shed on this conclusions by resorting again to the
Hamilton-Jacobi theory. The optimal value of the performance index as a function of
the final time can be found, yielding J(tf) _
/2-a2tf3/6+2t f
where a :_ (atf + 2)/(2t f). The plots of this function (see Fig. 7.1) show that the
solutions which have been found are globally optimal.
In the second case (constraints on the final state, given final time) the following
theorem holds where reference is made only to those among the NC (7.2), (7.3)
which apply to the problem at hand. Furthermore, the (obvious) modifications
which must be performed in order to take care of the peculiarity of the problem
have been made relative to the functions a, m and the notations defined by
eqs. (7.4).
Theorem 7.2 Let (x, u) be a pair which satisfies the necessary optimality
conditions (7.2), (7.3). Moreover, assume that R(t) > 0, t E [te,T] and there
exist solutions of the equations
P2(T) = S2,
P6(T) = 0
such that P6(t) < 0, t E [to, T). Then, (x,u) is a locally optimal pair in a
weak sense for problem (7.1) with t f = T.
Example 7.3 Consider the optimal control problem defined by the system xl = X2,
x2 = u, x(0) = 0 and the performance index
J = fu2dt+2xi(1).
The final state is constrained by the equation xl - x2/2 + 1 = 0. First we determine
the solutions which satisfy the NC. In particular, we obtain u = A1(0)t - A2(0)
where the vector a(0) can be specified by enforcing feasibility and the orthogonality
condition. It turns out that
1\2(0) _
A1(0)(4
A1(0))
2(3 - A1(0))
228
Figure 7.2: Example 7.3: the optimal value of the performance index: the final
states x2(1) = z correspond to the values O= of \1(0).
al (0) being a solution of the equation -2A1(0)3 + 39AI (0)2 - 180A1(0) + 216. Three
values for )t1(0) result, namely )q (0) = ,13r := 13.3746, A1(0) = 02 := 4.2052,
Ai(0) = /33 := 1.9273. We now explore the possibility of exploiting Theorem 7.2.
Note that only some of the quantities involved in the sufficient conditions depend
on the particular solution selected: they are S1=diag[0, 2 - A1(0)], S2 = [1 )\2(0) ,\,(O)/21', where the superscript "o" has been omitted for the sake of simplicity
in the third case (no constraints on the final state and time) the following
theorem holds where reference is made only to those among the NC (7.2), (7.3)
which apply to the problem at hand. Furthermore, the (obvious) modifications
which must be performed in order to take care of the peculiarity of the problem,
have been made relative to the functions a, rn and the notations defined by
eqs. (7.4).
229
Theorem 7.3 Let (x, u, t f) be a triple which satisfies the necessary optimality
conditions (7.2), (7.3). Moreover, let R(t) > 0, t E [tei tf ] and assume that
there exist solutions of the equations
NO _ - [A(t) - K(t)P1(t)]'P3(t),
P4(t) = P3(t)K(t)P3(t)
with the boundary conditions
PI (tf)=S1,
P3(tf) -'S3,
P4(tf) = s4,
and such that P4(t) > 0, t E [to, t 7j. Then (x, u, to) is a locally optimal triple
in a weak sense for problem (71 a), (71 b), (7.1 d).
Example 7.4 Consider the optimal control problem defined by the first order system
x = u, x(0) = 0 and the performance index
J = 2
ft(2
+ u2)dt + 2(x2(tf1)2.
First we determine the solutions which satisfy the NC. Note that the data imply
the existence of an even number of such solutions: thus only half of there will be
explicitly mentioned, the remaining ones being trivially deducible. We obtain that
u = /2- and two values for t f (the positive solutions of the equation 16t f -8t f+1 = 0),
namely, tf 1 = 0.63 and t1 2 = 0.13 and the corresponding final states xl = 0.90 and
x2 = 0.18, respectively. We now explore the possibility of exploiting Theorem 7.3.
The quantities involved in the sufficient conditions which depend on the particular
solution selected are Sl = mxx, S3 = NF27rtxx, S4 = 2rnxx, with mxx = 11.26 when
the first value for t f is selected, otherwise mxx = -7.20. The equations for P1, P3
and P4 admit solutions corresponding to both situations, but the sign requirement
on P4 is satisfied only when mxx > 0. Thus the sufficient conditions are met for t f1
and xi. By resorting to the Hamilton-Jacobi theory it is possible to conclude that
this solution is indeed optimal, since the optimal value of the performance index as
a function of the final state x f (see Fig. 7.3) is J(x f) = v
+ 2(xf - 1)2.
Finally, in the fourth case (free final state, given final time) the following
theorem holds where, as done in the two preceding cases, reference is made
only to those among the NC (7.2), (7.3) which apply to the problem at hand.
Furthermore, the (obvious) modifications which must be performed in order
to take care of the peculiarity of the problem, have been made relative to the
functions a, m and the notations defined by eqs. (7.4).
230
Figure 7.3: Example 7.4: the optimal value of the performance index.
Theorem 7.4 Let (x, u) be a couple which satisfies the necessary optimality
conditions (7.2), (7.3). Moreover, let R(t) > 0, t E [t0, T) and assume that
there exists a solution of the equation
't'hen (x, u) is a locally optimal pair in a weak sense for problem (7.1 a),
(7.1 b), (7.1 d) with t f = T.
For the sake of simplicity in notation only the arguments which are essential
for the understanding of the forthcoming discussion will be displayed in the relevant
functions. Assume that the problem is not pathological, then, for any function A
it follows that l(x, u, t) = H(x, u, t, A) - A'x so that the variation 16j] ,2 of the
Proof.
performance index which accounts for the terms up to the second order and is induced
by the perturbation (Sx, Su) of the couple (x, u) is given by
[6J]1,2 = [J(x + Sx, u + Su) - J(x, u)] 1,2
ft
+ IT [ Sx'
2
Su' ]
HH'XX
-u
Hu
][
6x ] dt
Su
231
provided that A = a. In the relation above the first two terms in the right side
account for the first variation 16J], of the performance index, while the remaining
two terms account for the second variation [SJ]2. By performing an integration by
parts of the term -A'S and taking into account that Sx(O) = 0, the conclusion can
be drawn that [6J] 1 = 0, if the triple (x, u, A) verifies the NC. As for the term
[6J]2, if the assumptions in the theorem are satisfied we get
[SJJ2 = 2
Su' ] [
T [ 6X'
M,,u
HHHXX
H.
(T)m',,(T)6x(T) +
J0
J[
6X
Su
] dt
Sx'Pi
Sx] dt
since, by disregarding terms which would entail variations of the performance index
of order higher than two, the dependence of Sx on Su is given by
Sx = f f 8x + fubu, Sx(0) = 0.
2 fT
f [Su + Sx]'
[Su + *Sx] dt
Example 7.5 Consider the optimal control problem defined by the system x1 = x2,
x2 = u, x(0) = 0 and the performance index
J=2
where fl is either equal to 0 or 0.6. First we determine the solutions which satisfy the
NC. In particular, we obtain that u = A1(0)t - A2(0) where the constants A (0) are
232
Figure 7.4: Example 7.5: the optimal value of the performance index.
determined by enforcing the orthogonality condition, yielding A2(0) _ A1(0) where
Al (0) solves the equation 213A (0) - 13.5A (0) + 27A1(0) = 0. Thus it follows that
X1(0) = 0,
x1(1) = 0
A1(0)=2,
x1(1)=-0.67,
A1(0) = 0,
x1(1) = 0
A1(0)=8.65,
x1(1)=-2.88
/(3=0.6
A1(0) = 2.60,
x1(1) = -0.87.
We now explore the possibility of exploiting Theorem 7.4. The equation for P1 admits
a solution corresponding to the first value of A1(0) when [ = 0 and to the first and
second values of Al (0) when /3 = 0.6: moreover the required assumptions are verified.
Remark 7.1 (A computational algorithm) A particularly simple and easily understandable algorithm is now presented with the aim of showing how the knowledge of
second order terms can fruitfully be exploited. Similarly to the algorithm described
in Remark 6.11 of Subsection 6.2.2 of Chapter 6, the idea is to identify a solution
which satisfies the local sufficient conditions. The perturbation of the solution at
hand (which does not comply with the NC) is determined by evaluating the induced
effects up to second order terms. In essence, the actual objective functional is approximated by a quadratic functional, while the algorithms which only consider first
order terms performs a linear approximation. Therefore an increase in the rate of
convergence should be expected whenever the information carried by the second order terms are exploited, especially in the vicinity of a (local) minimum where, in
233
general, the methods based on the first variations are not very efficient. For the sake
of simplicity the algorithm to be presented refers to the particularly simple class of
optimal control problems defined by the system
(7.5)
J=
rT
J'
(7.6)
In these equations to, xo and T are given and the functions f, 1, m, possess the
continuity properties which have often been mentioned. No further constraints are
present on either the state and control variables and the problem is assumed not to
i_
1. Compute x(') by integrating eqs. (7.5) with u = u(i) over the interval [to, T];
2. Compute A(') by integrating the equations
A(T) = m(')'(T)
over the interval [to, T];
3. If
R(') (t) := HH'u(t) > 0, t E [to,T]
find the solutions P(i) and z(i) (if they exist) of the equations
- [A()(t) - K (t)P
)
(t) - B (t)(R
)- r(t)H' (t)] z(t)
with
A(') (t)
f.() (t)
K(')(t) :=
B(') (t)
- f!' )
fuz) (t),
H4 (t)(R('))-r(t)H,`)'(t),
w(')(t) := B(')(t)(R('))-'(t)Hu`)'(t).
v(Z)(t)
234
4. If
II H "I (t) 11 < e, t E [to, TJ
where e > 0 is a suitable scalar, then the couple (x(=), u(i)) satisfies the local
sufficient conditions with an approximation the more accurate the smaller the
scalar a is. Otherwise, let
u(i+1) (t) := u(i) (t) + 6u(i) (t)
where
-(R(i))-'(t)1 [B('(t)P")(t) + Hu.)(t)] 0(0(t)
60)(t) :_
Hu)
(t) I >
(t) = [A(i) (t) - K(i) (t)P() (t)] V(i) (t) - x() (t)z(i) (t) - w(i) (t)
with the boundary condition
19(to) = 0.
The i-tli iteration of this algorithm can be justified as follows. First, it is easy to
check that the perturbation Su(i) of the control u(i) induces the variation
[S']1,2 =
Jto
+
(t)bu(')(t)
H (t)
H u (t)
6xi (t)
Hum (t)
Huu (t)
L Su(i) (t) ]
in in the performance index, provided that the first and second order terms are taken
into account and A(') is selected as in step 2. In this equation Sx(i) is the solution of
S:z(t) = fxi) (t)Sx(t) + B(i) (t)Su(i) (t), 6x(to) = 0.
A significant choice for 60l is selecting, if possible, that function which minimizes
[SJJ12? while complying with the constraint that the last equation imposes on 6x(i)
and Sufi). If what stated at step 3 holds, the Hamilton-Jacobi theory can be applied
to this optimal control problem and the expression for 60) given at step 4 follows.
If the inequality there is satisfied, the solution at hand verifies the local sufficient
conditions with an approximation the more accurate the smaller the scalar a is.
Example 7.6 Consider the problem presented in Example 6.20 which is defined by
the system x = u, x(0) = 0 and the performance index
J=
[x(t) + u2(t)]dt.
235
As before the choice 0)(t) = 1, t E [0, 1], is made and e = 0.001 is set into the
algorithm stopping condition. We find A(1> (t) = 0, B(') (t) = 1, R(1) (t) = 2, Q(1) (t) =
0, P(1)(1) = 0, 01) (t) = 0, so that the differential equations at step 3 have the
solutions P(') (t) = 0 and P) (t) = 0. The stopping condition is not verified (step 4),
thus we set
6U(1) (t) = 2 + 2 t, u(2) (t) _ - 2 + 1 t.
Corresponding to this new control we find Hue) (t) = 0 and the local sufficient conditions are nlet.
7.3
The problem to be considered in the present section deserves particular attention both from the applicative and theoretical point of view. Indeed, its
solution leads, from one hand, to designing a control system which preserves
the power of the optimal synthesis methods also in the face of uncertainties,
thus allowing their actual exploitation, while, on the other hand, unexpected
deep connections are established with the material of the preceding section
which refers to (seemingly) unrelated topics. More in detail, consider the same
optimal control problem discussed in Section 7.2, i.e., the problem defined by
the dynamical system
(7.7a)
x(to) = xo,
(7.7b)
(7.7c)
ft f
J = rn(x(t f), t f) +
(7.7d)
to
In the preceding equations (7.7) the functions f (which are n-vectors), rn,
1, a (which are q-vectors, with q < n + 1 components ai) are continuous
together with all first and second derivatives, 1 is not identically zero, xo and
to are given and the set S f, which is the set of feasible final events (if not
free, in this case the only request is that the final time be greater than the
initial one) is a regular variety. No further constraints are present either on
the control or state variables. A triple which satisfies the necessary conditions
of the Maximum Principle is denoted by (x, u, t f): it verifies the equations
(7.8a)
(7.8b)
(7.8c)
236
Controlled
system
Ideal system
U
t
f
Stf
tf
80V OX)
Controller
Figure 7.5: The control scheme which can be adopted when Su and btf are
known.
A(t) = -HH'(t),
(7.9a)
(7.9b)
(7.9c)
(7.9d)
f].
In general, the control u and the final time to , which are optimal for a given
this be possible also when the state perturbation occurs at any generic time,
the control scheme of Fig. 7.5 could be implemented and the computation of
the functions Su and Pt f done only once and for all. What can actually be
computed in a fairly easy way are the perturbations 62u and 621t f to be given
to u and to in order that, within the set of weak perturbations, [6J] 1,2 is
minimized, [S J] 1,2 being the variation of the performance index evaluated up
to second order terms. When the initial state, the final time and the control are
perturbed (the latter in a weak sense), it is easy to check that such a variation
is given by
[S<I] 1,2
S1
S3
S3
S4
Sx(tf)
6t f
237
tf o
,
+2 ft. [ Sx (t) Su (t)
1
[ Hx(t) Hu(t)
H,x(t) Huu(t)
6x(t)
6u(t)
dt,
provided that the unperturbed situation refers to a solution which satisfies the
NC. In the above equations reference has been made to some of the definitions
(7.4) given in Section 7.2. Note that the quantity A'(to)Sxo must be considered
given and the two remaining terms in the right side of the equation constitute
the second variation [SJ]2 of the performance index. Under suitable assumptions, 16j]2 is minimized by a couple (62u, 62 t f) which can easily be specified
as a function of the state perturbation detected at any time 7- E [to, t f) since
the initial state is arbitrary. In such a case the computed couple minimizes
[SJ]2 over the time interval beginning at r. Finally, it can be proved that the
difference between the value of the performance index corresponding to the
couple (Su, St f) and the value corresponding to the couple (Sou, 6t1) is an
infinitesimal of higher order with respect to Sx('r): thus the neighbouring optimal control which is supplied by the controller in Fig. 7.5 when the second
rather than the first couple of variations is exploited, satisfactorily solves the
problem.
It is now possible to present four results which refer to the following cases:
(a) (x(t f), t f) E Sf,
conditions (7.8), (7.9). Moreover, let R(t) > 0, t E [to, t f] and assume that
there exist solutions of the equations
P4(t) = P3(t)K(t)P3(t),
P5 (t) = P2 (t) K(t) P3 (t),
Ps(t) = P2(t)K(t)P2(t)
238
P2(tf)=S2,
P3(tf) = S3,
P4(tf) = S4,
P5 tQf = S51
P6(tf) = 0
such that P4(t) > 0, t E [to, t f ], P6(t) < 0, t E [to, t7). Then, the variations
62u of the control and Sit f of the final time which minimize P12 when x(T) _
x(T) + Sx.,-, T E [to, t f) are given by
02tf = - [P3
'(7-)
J=where
f'(2+u)dt
t f is free. The NC are satisfied only by the triple (the superscript "o" has been
omitted)
+ 1,
x(t) = 2e-
ft
u(t)
tf
vF2,
= ln(2)/V.
Corresponding to this, we find that A(t) = -2f, B(t) = 2e-v2t, Q(t) = -e2ft/2,
R(t) = 1, S1 = 0, S2 = 1, S3 = -2, S4 = 2V2-, S5 = 0 and the check can be
performed whether the assumptions of Theorem 7.5 are satisfied. The equation for
P, with the relevant boundary condition admits the solution
Pi
(t) = t f - t e2./zt,
2,Q(t)
Q(t) := /(t - t f) - 1.
239
100-A
Sx(0)
-0.5
0.5
3
-t
Figure 7.6: Example 7.7: violation of the constraint when u + 62u is applied
and state responses when 6x(0) = 0.5.
All the equations for the remaining matrices Pi, with the relevant boundary conditions, admit solutions, namely,
P2(t)
eat
2,8(t)' P3(t)
P5(t)
2(t
eat
Q(t) ,
P4 (t)
P6(t) =
2/
a(t)'
t)3(t)t.
Note that the sign conditions on P4 and P6 are satisfied: thus Theorem 7.5 can be
exploited yielding
Pl (t)
P2 (t) = e
t,
P6(t) = (t - t f)
and
62x( t, 6x( 0 ))
62t f(6x(0)) = 0.
240
state has undergone a perturbation and the neighbouring optimal control u + 62u is
exploited, yielding x(t) = (2 + bx(0))e-(f+6x(0)/(2t f))t + 1, and
x(tf) - 2 = (1 +
2
6x(0)/2)e-6x(0)12
-1
This function is plotted in Fig. 7.6 together with the state response when the NC
are enforced or the control u + 62u is applied and 6x(0) = 0.5.
In the second case (constraints on the final state, given final time) the following
theorem holds where reference is made only to those among the NC (7.8), (7.9)
which apply to the problem at hand. Furthermore, the (obvious) modifications
which must be performed in order to take care of the peculiarity of the problem,
have been rriade relative to the functions a, m and the notation defined by
eqs. (7.4).
Theorem 7.6 Let (x, u) be a pair which satisfies the necessary optirnality
conditions (7.8), (7.9). Moreover, assume that R(t) > 0, t E [to, T] and there
exist solutions of the equations
Pi(T)=S1i
P2(T) = S2
P6(T)=0
such that PE;(t) < 0, t E [to,T). Then, the variation 62u of the control which
minimizes [SJJ2 when x(T) = x(T) + Sx1, T E [to, T) is given by
SZu(t) = -1?-1(t) {B'(t) [P1(t) - P2(t)Ps 1(t)P2(t)]
+ Hur(t)} SZx(t), t E [r,T)
where Slx is the solution of the equation
241
Figure 7.7: Example 7.8: violation of the constraint when u + 62u is applied
and state responses when Sx(0) = 0.5.
Example 7.8 Consider a modified version of the problem presented in Example 7.7
where t f = 1 and, consistent with this, the performance index is simply the integral
of the square of the control action. The only solution which satisfies the NC is
x(t) = 2e"t + 1,
u(t) = ry,
ry = - In(2)
where the superscript "o" has been omitted. Corresponding to this, we find A(t) = 2ry,
-ry2e-2-yt/4, R(t)
performed whether the assumptions of Theorem 7.6 are satisfied. The equations for
Pi, P2, P6 with the relevant boundary condition admit the solutions
P1(t)
P2 (t)
Q(t) , P6(t) =
4(1 -
(t)t)
Note that P6(t) < 0, t E [0, 1) so that Theorem 7.6 can be applied, yielding
S2x(t, Sx(0)) = (1 - t)e''tsx(0)
S2u(t, Sx(0))
6 2)
242
As a conclusion, the control scheme of Fig. 7.5 can be implemented and it is interesting to evaluate to what extent the constraint on the final state is violated when
the initial state has undergone a perturbation and the neighbouring optimal control
62U is exploited, yielding x(1) _ (2 + Sx(0))e7-6x(O)/2 + 1 and
x(1) - 2
2
-- (1 + Sx(0)/2)e-ax(o)/2 _ 1
2
This function is plotted in Fig. 7.7 together with the state response when the NC
are enforced or the control u + 82u is applied and Sx(0) = 0.5.
In the third case (no constraints on the final state and time) the following
theorem holds where reference is made only to those among the NC (7.8), (7.9)
which apply to the problem at hand. Furthermore, the (obvious) modifications
which must be performed in order to take care of the peculiarity of the problem,
have been made relative to the functions a, rn and the notation defined by
eqs. (7.4).
Theorem 7.7 Let (x, u, t f) be a triple which satisfies the necessary optimality
conditions (7.8), (7.9). Moreover, let R(t) > 0, t E [to, t f] and assume that
there exist solutions of the equations
P4(tf) = S4
and such that P4 (t) > 0, t E [to, t f] . Then, the variations 6 u of the control
and b2t f of the final time which minimize 012 when x(T) = x(T) + Sx, ,
r E [to, to) are given by
243
-0.4
0.4
Example 7.9 Consider a modified version of the problem described in Example 7.7
where the final state is free and a function of it is added to the performance index,
If
namely,
J=2J
The NC are satisfied only by (the superscript "o" has been omitted)
= -e2ft/2,
P3(t)=e -
P4(t)=-2
Q(t)
Q(t)
Note that P4 (t) > 0, t E [0, t fJ so that Theorem 7.7 can be exploited. It follows that
P1 = - 2 e2vt
1
244
and
bet f(bx(0)) =
S2x
,
62u(t, sx(O)) = 0,
As a conclusion, the control scheme of Fig. 7.5 can be implemented and it is inter-
esting to evaluate the quantity 0 := (J2 - J)/J (see Fig. 7.8), where
J2(xo + bxo) =
6,(0)12
+1
is the value of the performance index relative to the neighbouring optimal control
u+b2u and
J(xo+bxo) _ -f 1n(
Sx(0)
is the value of the performance index relative to the control which satisfies the NC.
Finally,. in the fourth case (free final state, given final time) the following
theorem holds where, as done in the two preceding cases, reference is made
only to those among the NC (7.8), (7.9) which apply to the problem at hand.
Furthermore, the (obvious) modifications which must be performed in order
to take care of the peculiarity of the problem, have been made relative to the
functions ot, in and the notation defined by eqs. (7.4).
Theorem 7.8 Let (x, u) be a couple which satisfies the necessary optimality
conditions (7.8), (7.9). Moreover, let R(t) > 0, t E [to, T] and assume that
there exists a solution of the equation
Pi(T) = S1.
Then, the variation 62u of the control which minimizes [6J]2 when x(7-)
X'(-r) + bx., T E [to,T) is given by
62"u (t) _ -R-1(t) [B'(t)P1(t) + Hin(t)] 62x (t), t E [T, T)
where 6"x is the solution of the equation
245
j=2
fT
u2dt + x(T).
x(t) =
2e-t
+ 1,
u(t) = -1.
Corresponding to this, we find A(t) = -2, B(t) _ 2e-t, Q(t) = -eat/4, R(t) = 1,
Si = 0 and the check can be performed whether the assumptions of Theorem 7.8
are verified. The equation for Pl and the relevant boundary condition admits the
solution
Pl (t)
Q(t)
elt, i3(t) := t - T - 1.
8x(0)
2(T + 1)'
246
As a conclusion, the control scheme of Fig. 7.5 can be implemented and it is interesting to evaluate the quantity 0 :_ (J2 - J)/J (see Fig. 7.9), where
J2(xo + Sxo) = (2(T
+ 6X( 0))2T
+ 1)
+ 1) 2
+ X2 (T),
Sx(0))e-(1+ax(o)/(2(T+I)))T
X2(T) = (2 +
+1
is the value of the performance index relative to the neighbouring optimal control
u+62u and
6x(0))2T
+ x(T),
x(T) = (2 + 6x(0))A(0) + 1
is the value of the performance index relative to the control which satisfies the NC,
a(c) being the solution of the equation
1=
7.4
(0)eA'(o)(2+ax(o))T
Problems
Problem 7.4.1 Consider the optimal control problem defined by the system xl = X2,
:x.2 = u, x(O) = 0, the performance index
/cf
l
J= 2 {J u2dt+tf+'yxr(tf)}
l U
JJ
Problem 7.4.2 Consider the optimal control problem defined by the system :cr = X2,
:c2 = u, x(0) = 0 and the performance index
J=
fl
JU
(u2 + 2xr)dt.
The final state must comply with the constraint x2 l +x2-1 = 0. Discuss the optimality
of the solutions (if any) which satisfy the NC.
Problem 7.4.3 Consider the optimal control problem defined by the first order system
:c = u, x(0) = 1 and the performance index
J=1
2
(1 + u2 + x)dt - 2x(t f)
0
where both the final state and time are free. Discuss the optimality of the solutions
(if any) which satisfy the NC.
7.4. Problems
247
Problem 7.4.4 Consider the optimal control problem defined by the system th1 = X2,
{fu2
(+ 2x1)dt + x2(1) }
1
J=2
where the final state is free. Discuss the optimality of the solutions (if any) which
satisfy the NC.
Problem 7.4.5 Consider the optimal control problem defined by the first order system
J=
x f,
f(2-{-u2)dt
2J
where the final time is free. Find (if any) the triples (x, u, t1) which satisfy the NC
and such that Theorem 7.5 can be applied.
Problem 7.4.6 Consider the optimal control problem defined by the first order system
J=
u24t.
Find (if any) the couples (x, u) which satisfy the NC and such that Theorem 7.6 can
be applied.
Problem 7.4.7 Consider the optimal control problem defined by the first order system
x = xu with given initial xo # 0 and the performance index
cf
J= 2Jo (2-i-u2)dt+x(tf)
w here the final state and time are free. Find (if any) the triples (x, u, t f) which satisfy
J=
j'udt+x(i)
where the final state is free. Find (if any) the couples (x, u) which satisfy the NC
and such that Theorem 7.8 can be applied.
Problem 7.4.9 Consider the optimal control problem defined by a linear system and a
linear quadratic performance index which has been discussed in Section 3.2 of Chapter 3.1 (Remark 3.3). Show that the solution given there satisfies all the assumptions
of Theorem 7.4.
248
Problem 7.4.10 Check whether Theorem 7.2 can be applied to the optimal control
problem defined by the system :c = Ax+Bu, x(O) = xo, x(1) = 0 and the performance
index
J=
judt
'
Problem 7.4.11 Find the neighbouring optimal control for the optimal control problem defined by the system xr = X2, x2 = u, x(0) = xo, x(tf) = 0 and the performance
index
J = f'(2+u2)dt
where xo is given.
Appendix A
Basic background
A.1
Canonical decomposition
of these four parts, the matrix T can be chosen in such a way that Tx _
[ xr,no
xr,o
xnr,no
xnr,o
Al
A:= TAT-1 =
and
A2
A5
A3
0
0
A7
0
A4
A6
A8
A9
C:=CT-1 = [ 0
with
Al
A5
(
A6
A9
C1
Bl
A2
A5
B2
C1
B1
B:=TB=
B2
0
C2 ]
) = reachable
C2 ]) = observable.
250
An-1B ]
(An-' )'C'
...
(A2)'C'
Algorithm A.1
(i) Compute n,.:= raiik(Kr) and no := rank(K&).
(ii)
Find two matrices Xr and Xo of dimensions n x nr and n x no, respectively, such that
(Xr
(Xo )'Xo = 0.
)'Xr = 0,
(iv) Find two matrices Xr,no and Xnro of maximum rank and dimensions
Xo ] = 0, Xnro [ Xr Xo ] = U.
Xr,no [ Xr
(v) Find two matrices X,.,o and Xnr,no of maximum rank and dimensions n x
(n-rank([ Xr X,.,1z. ] )) and n x (n-rank([ X,.,no Xo ] )), respectively,
such that
Xr,o [ Xr
Z
-1
-3
-6
-1
-5
-1
]B=[]C=[1
1
By choosing
2 ;]xo=[i -5
1
X,.
2
1
-2 -8
-1
-2
251
it is possible to select
-1
-1
X1=
Xo
L0
I'
Xr,no =
-1
Xr,o =
Xnr,o =
-1
, Xnr,no
Thus we obtain
A-
-4
12
1.5
-1
-0.5
0
0
B=
0
0
-2
6 ].
With reference to the above defined matrix A, the system E (the pair (A, B))
is said to be stabilizable if the two submatrices A7 and Ay are stable, while
the system E (the pair (A, C)) is said to be detectable if the two submatrices
Al and A7 are stable. Obviously, if the system E is reachable (observable), it
is also stabilizable (detectable).
Reachability (stabilizability) and observability (detectability) of a system
can easily be checked by resorting to the following theorem, referred to as the
PBH test, where
PB(S)
AI - A -B
Pc(A)
AI
C,
252
of A corresponding to which the rank of matrix PP(A) is less than n are the
eigenvalues of the unobservable part of E.
A.2
Transition matrix
(t, T)B(T)u(T)dT
where 4)(t,-r) is the transition matrix associated to A(t), i.e., is the unique
solution of the matrix differential equation
&D (t, T)
= A(t)(P(t, T),
dt
4) (T, T) = I.
(d) 4>'(T, t) = *(t, ,r), where '(t, T) is the transition matrix associated to
-A'(t);
(e) If A(t) = const., -LD(t, T) =
eA(t-T)
_t2
1
e
2t
e-T2 (et-T
4
and
- 1)
et2_T2
eT - t
e-t2(eT-t - 1)
Properties (a)---(d) can easily be checked.
eT2t2
253
A.3
Consider the n-th order linear dynamical system E(A, B, C, D) with p out-
puts, m inputs and let G(s) = C(sI - A)-1B + D be the relevant transfer function with rank(G(s)) = r, r < rnin(p, rn). Furthermore, consider the
(canonical) McMillan-Smith form of G(s), i.e., the matrix M(s) such that
G(s) = L(s)M(s)R(s), L and R being unimodular polynomial matrices, and
f, (S)
...
f2(s)
...
0
0
0
0
..
..
f, (s)
0
0
M(s) =
0
p-r rows
m-r
columns
with
fi (s) _
ei(S)
,
i = 1, 2, ... , r
VMS)
where
i, i = 1, 2, ... , r - 1.
1'
rlei (s).
i=1
The polynomial -7rp is the least common multiple of the polynomials which
are the denominators of all nonzero minors of any order of G(s), while the
polynomial 7rt is the greatest common divisor of the polynomials which are
the numerators of all nonzero minors of order r of G(s), once they have been
adjusted so as to have rrp as a denominator. Finally, consider the system matrix
P(s)
sI - A -B
C
Definition A.1 (Poles of E) The poles of E are the roots of the polynomial 7rp.
254
Definition A.4 (Invariant zeros of E) The invariant zeros of E are the values
of s corresponding to which rank(P(s)) < n + min(m, p).
The set of the transmission zeros of E is a subset of the set of invariant zeros of
E, so that each transmission zero is also an invariant zero, while the converse
is not true in general. However if E is minimal the two sets coincide.
A.4
Quadratic forms
xNQx =
(xNQW + xNQx) =
(x'Qx + x QNx)
2xN('V + QN)x
Definition A.5 (Positive semidefinite matrix) An hermitian matrix Q is positive semidefinite (Q > 0) if x'Qx > 0, Vx.
255
Thus a positive definite matrix is nonsingular and its inverse is positive definite
as well.
q{1,2,...,n} > 0.
Q = C-C.
This property follows from the fact that each hermitian matrix can be diagonalized by means of a unitary matrix (a matrix the inverse of which coincides
with its conjugate transpose) so that if Q = Q' > 0 there exists a matrix U
with U-1 = UN such that
Q = UDUwhere D is a diagonal matrix with nonnegative elements. If D, is the matrix
the elements of which are the square roots of the corresponding elements of
D, so that D = D 2 D 2 , then the matrix
C:=UD2U'
is a factorization of Q, since (D'21)- = D'2- An n x n matrix Q = Q' > 0 admits
many factorizations: among them are those with dimensions rank(Q) xn. When
Q is real, U and C can be selected as real.
A.5
(A.1a)
(A.lb)
256
With reference to the system (A.1), where it is assumed that B(t) = I, consider
the functional
tf
Jn := E [I
+ x'(t f)Sx(t f)
(A.2)
to
with Q = Q' > 0 and S = S' > 0. The value of this functional can easily be
computed according to the following theorem.
Theorem A.6 The value of the functional (A.2) with eqs. (A. 1) taken into
account is
P(t)dt
Jn = tr P(to)(HIo + xoxl) + W
Itot
J2 := E
lnn 1
T
J0
x'(t)Qx(t)dt
257
A:= TAT-1 .=
Ai
A2
0 ],O:=CT1:=[Ci 0 ]
As
where the pair (A,, C1) is observable, the following result holds.
Theorem A.7 Let the matrices A and Q be constant, to = 0 and t f = oo. Let
the matrix Ai be stable. Then,
i) If
0,
Ji = tr [P(IIo + xoxo)]
ii) If x(0) = 0,
J2 = tr [PW ]
where
P=T'
Oi
g]T.
Appendix B
Eigenvalues assignment
B.1
Introduction
The pole placement problem consists in designing, if possible, a feedback controller R (linear, finite dimensional, time-invariant) for a given system P (lin-
ear, finite dimensional and time-invariant as well) in such a way that the
eigenvalues of the resulting closed loop system are located in arbitrarily preassigned positions of the complex plane. It is obvious that only the eigenvalues
of P belonging to its jointly reachable and observable part (namely the poles
xw,o
Figure B.1:
260
B.2
The eigenvalue assignment problem when the controller input is the state of
the system
(B.1)
(where x E R" and u E R"", rn < n) can be stated in the following way.
Problem B.1 (Eigenvalue assignment with accessible state) Find a control law
u(x) = Kx
(B.2)
such that the eigenvalues of the closed loop system (B.1), (B.2) constitute a
preassigned symmetric set A of n complex numbers.
In other words, given the set A, this problem amounts to selecting, if possible,
a matrix K in such a way that the set of the eigenvalues of A + BK coincides
with A.
A very nice result can be proved relative to problem B.1: however, when
the control variable is not a scalar it requires a preliminary fact which is stated
in the following lemma where ,@j denotes the i-th column of B and [I,,]' is the
i-th column of them x m identity matrix. Furthermore, the rank of matrix B
is equal to the number of its columns, without a true loss of generality: thus
13i00'Vi.
Lemma B.1 Let the pair (A, B) be reachable. Then there exists a matrix Ki
such that the pair (A -I- BKi,
is reachable.
Proof. The proof of this lemma can easily be found in the literature, thus only the
form of one of the infinite many matrices Kl is here given. Consider the matrix
Q
[ Q1
Q2
with
Qi ._ [
/3i
Afs
...
...
Qm ]
Ahi-i$$
(B.3)
261
`4__
0
0
0
0
0
0
B_
One finds
Q=[
13i
A,(31
l32
Au2
0
0
0
V-[ 0
0 00J'Ki=VQ_11
0,.
Theorem B.1 Problem B. 1 admits a solution if and only if the system is reachable.
Proof. Necessity. In view of the discussion above, the pair (A, B) must be reachable,
since, otherwise, the eigenvalues of the unreachable part could not be modified and
the elements of the set A could not be arbitrarily fixed. Sufficiency. Thanks to Lemma
B.1 there is no loss in generality if the system is supposed to have a single control
variable, since this situation can be recovered by means of a first control law u =
Kix + [I..]'v. Then reachability of the pair (A, B) together with m = 1 implies the
existence of a nonsingular matrix T such that
z=Fz+Gu, z:=Tx
262
with
r
I
..
...
F:=
1
polynomial of the matrix A. If the control law is u = KCz, K' := [k1 k2 ... k,t],
the (closed loop dynamic) matrix F + GK,- has the same (canonical) structure as F,
because of the form of matrix G. The entries in the last row are fi := -ai + ki, i =
1,2,...,n: thus the characteristic polynorrrial of the matrix F+GKC is 'OF+cx,.(A) =
f,,,\n-r
+ f2,\ + fl. If A = {A1, A2, ... , An}, the desired characteristic
A"t +
+
An+wnAn-r+...+cp2A+(pr. By
polynomial for F+GK, is r/)A(A) =
setting On (A) = V)F+cx,. (A), the set of equations f i = -ai + ki = Wi, i = 1, 2, ... , n
is obtained and the matrix K, is uniquely determined. In terms of the given state
variables x the desired control law is u = Kx, with K := KIT.
Remark B.1 (Uniqueness of the control law when rn=1) The proof of Theorem B.1
shows that when the control u is scalar the matrix K is uniquely determined by the
set A.
Remark B.2 (Selection of the set A) The possibility of (arbitrarily) assigning the
eigenvalues of the closed loop system in principle allows us to arbitrarily increase the
speed of response of the system itself. However this usually entails weighty consequences on the control variables effort, which aright attain unacceptable limits. The
forthcoming Example B.2 sheds light on these considerations either in terms of the
time response of u or by evaluating the quantity
W
Ju := E[
u'(t)u(t)dt]
o
u
(B.4)
where the expected value operation is performed with respect to the set of allowable
initial conditions. In general, eq. (B.4) makes sense only for stable systems and stands
for the expected value of the control energy . By assuming that the initial state is
a random variable with zero mean value and unitary variance, it results that (see
Theorem A.7 of Section A.5 of Appendix A)
Ju = tr[P]
(B.5)
A=
B=
0
1
(B.6)
263
U
50
A' K2
K,
-150
Figure B.2:
K,=-[ 1
3 ], K2=-[ 125
75
15 ].
The time responses of the first state variable x, and the control variable u are
reported in Fig. B.2 corresponding to the initial state x(O) = [ 1 1 1 ] . On
the other hand, if the quantity Ju is evaluated by means of the eqs. (B.4)-(B.6), it
results that Jul = 4 and Jut = 815, in the two cases. Thus it is possible to verify that
reduction in duration of the transient lasting entails a much more severe involvement
of the control variable.
Remark B.3 (Multiplicity of the control law when m > 1) In general, if the control variable is not a scalar, more than one control law exists which causes the eigenvalues of the closed loop system to constitute a given set Λ. This should be clear in view of the procedure presented in the proof of Theorem B.1 and the statement of Lemma B.1. Furthermore, by exploiting the fact that the reachability property is generic (in other words, given two random matrices F and G of consistent dimensions, the probability that they constitute a reachable pair is 1), one can state that the pair (A + BK1, B[Im]1) is reachable for almost all matrices K1, provided that the first column B[Im]1 of B is nonzero.

The possibility of achieving the same closed loop eigenvalue configuration by means of different control laws naturally raises the question of selecting the best one (in some sense) among them. The forthcoming example shows the benefits entailed by a wise choice of the matrix K, but the relevant problem is not discussed here.
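For m > 1 this multiplicity can be exhibited with any pole placement routine. The sketch below is not part of the book: it relies on scipy.signal.place_poles, on hypothetical data, and on the set Λ = {-1, -2, -3}, since that routine rejects eigenvalues repeated more times than rank(B) and adopts the sign convention u = -Kx.

    import numpy as np
    from scipy.signal import place_poles

    A = np.array([[0., 1., 0.],
                  [0., 0., 1.],
                  [0., 0., 0.]])
    B = np.array([[0., 1.],
                  [0., 0.],
                  [1., 0.]])           # two control variables: m = 2
    poles = [-1., -2., -3.]

    # Two different algorithms typically return two different gain matrices
    # which nevertheless yield the very same closed loop eigenvalue set
    K_yt = place_poles(A, B, poles, method='YT').gain_matrix
    K_knv = place_poles(A, B, poles, method='KNV0').gain_matrix
    for K in (K_yt, K_knv):
        print(K)
        print(np.sort(np.linalg.eigvals(A - B @ K)))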
Example B.3 Consider the system (B.1), with n = 3 and m = 2. By letting Λ = {-1, -1, -1}, it is easy to check that the two matrices

K1 = [ 0    0    -1          K2 = [ 0    -9   ...
       0.5  1.5   3 ] ,             0.1  0.3  ... ]

define two different control laws which solve the placement problem. The responses are reported in Fig. B.3.

Figure B.3:

Assume now that the state x cannot be directly measured and that the available information on it is supplied by the output transformation

y = Cx + Du    (B.7)

where y ∈ R^p, p < n and the rank of the matrix C is equal to the number of its rows.
Now consider the system

x̂' = Ax̂ + Bu + L(Cx̂ + Du - y)    (B.8)

and define the observation error

e := x̂ - x.    (B.9)

From eqs. (B.1), (B.7), (B.8) it follows that

e' = (A + LC)e.    (B.10)
Figure B.4:
the observability of the pair (A, C), that is the reachability of the pair (A', C'), guarantees the existence of a matrix L' such that the eigenvalues of A' + C'L' (hence also the eigenvalues of A + LC) are the elements of an arbitrarily assigned symmetric set Λ of n complex numbers. Therefore, if the system (B.1), (B.7) is observable, the system (B.8) can be designed in such a way that x̂ asymptotically tends to x with any desired speed.
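By duality, any state feedback design procedure also serves as an observer design procedure: assigning the eigenvalues of A' + C'L' assigns those of A + LC. A minimal sketch, not from the book, using scipy.signal.place_poles (with its u = -Kx convention), the triple integrator with C = [ 1 0 0 ] adopted in the examples of this appendix, and a hypothetical set of distinct eigenvalues:

    import numpy as np
    from scipy.signal import place_poles

    A = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
    C = np.array([[1., 0., 0.]])

    # place_poles returns G with eig(A' - C'G) equal to the requested set,
    # hence L := -G' yields eig(A + LC) equal to that set
    G = place_poles(A.T, C.T, [-2., -3., -4.]).gain_matrix
    L = -G.T
    print(np.linalg.eigvals(A + L @ C))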
Remark B.5 (Observer of order n - rank(C)) The information carried by the p output variables y concerning the state x is not explicitly exploited by the observer (B.8). In fact, if rank(C) = p, the change of variables z := Tx, where

T := [ C
       C1 ] ,

C1 being any matrix such that T is nonsingular, implies that eq. (B.7) becomes

y = [ I  O ] z + Du    (B.11)

and the first p components of z differ from the output variables only because of the known term Du. Thus it is possible to assume that the state vector is z := [ η'  w' ]', where η := y - Du, i.e., that the output transformation C is as in eq. (B.11), and conclude that the asymptotic approximation of η need not be found. Then the system (B.1), (B.7) becomes

η' = A11 η + A12 w + B1 u,    (B.12a)
w' = A21 η + A22 w + B2 u,    (B.12b)
y = η + Du,

and the problem of observing z reduces to the problem of observing the vector w, which has n - rank(C) = n - p components. This new problem can be stated relative to the system shown in Fig. B.5, which is described by the equations
Figure B.5: the reduced order observer: blocks A21 + LA11, B2 + LB1, A22 + LA12 and an integrator bank 1/s.

ξ' = (A22 + LA12) ŵ + (A21 + LA11) η + (B2 + LB1) u,    (B.13a)
ŵ = ξ - Lη,    (B.13b)
η = y - Du,    (B.13c)
x̂ = T⁻¹ [ η'  ŵ' ]'
where, if need be, the change of variables z = Tx has also been taken into account. The last equation gives the observation x̂ of x. The observability of the pair (A22, A12) can easily be checked if reference is made to Fig. B.6, where the system (B.12) is shown after depriving it of the input u. If the system (B.1), (B.7) is observable, the system shown in Fig. B.6 is observable as well. However, this last system remains observable also when A21 = 0, since an output feedback does not alter the observability properties. Finally, it is obvious that the system shown in Fig. B.6 (with A21 = 0) cannot be observable if the pair (A22, A12) is not.
Figure B.6:
B.3 Assignment with inaccessible state

The problem of assigning the eigenvalues of the closed loop system when the state of the system

x' = Ax + Bu,    (B.14a)
y = Cx + Du    (B.14b)

(where u ∈ R^m, y ∈ R^p, x ∈ R^n, m < n and p < n) cannot be fed to the input of the controller can be stated in the following way.
Problem B.2 (Eigenvalue assignment with inaccessible state) Find a dynamic system of order v with transfer function R(s) such that the control law

u(s) = R(s) y(s)

causes the eigenvalues of the closed loop system to constitute a set of n + v numbers coincident with an arbitrarily given set

Λ = ΛP ∪ ΛR    (B.15)

where ΛP and ΛR are two symmetric sets of n and v complex numbers, respectively.
Theorem B.2 Problem B.2 admits a solution with v = n if and only if the
system (B.14) is minimal.
Proof. Necessity. Necessity follows from the previous discussion, since the eigenvalues of the parts of the system which are not reachable and/or not observable cannot be modified by any controller acting on u and y.

Figure B.7:

Sufficiency. Consider the controller constituted by the observer

x̂' = Ax̂ + Bu + L(Cx̂ + Du - y)    (B.16)

together with the control law

u = Kx̂.    (B.17)
By exploiting eqs. (B.9), (B.10), the dynamic matrix of the system (B.14), (B.16), (B.17) turns out to be

F = [ A + BK    BK
      O         A + LC ] .
The eigenvalues of the matrix F are those of the matrix A + BK together with those of the matrix A + LC: such eigenvalues can be made to coincide with the elements of two arbitrarily given symmetric sets ΛP and ΛR of complex numbers, respectively, thanks to the reachability of the pair (A, B) and the observability of the pair (A, C).
The closed loop system resulting from the proof of Theorem B.2 is shown in Fig. B.7: it is apparent that the same control law which would have been implemented on the state of the system (B.14) in order to obtain ΛP when the state is accessible is implemented on the state of the observer (B.16).
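The block triangular structure of F makes the separation between control and observation eigenvalues easy to verify numerically. A minimal sketch, not from the book, assuming the data of the forthcoming Example B.4:

    import numpy as np

    A = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
    B = np.array([[0.], [0.], [1.]])
    C = np.array([[1., 0., 0.]])
    K = -np.array([[1., 3., 3.]])       # eig(A + BK) = {-1, -1, -1}
    L = -np.array([[3.], [3.], [1.]])   # eig(A + LC) = {-1, -1, -1}

    F = np.block([[A + B @ K, B @ K],
                  [np.zeros((3, 3)), A + L @ C]])
    print(np.linalg.eigvals(F))         # six eigenvalues at -1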
Remark B.6 (Choice of ΛR) The time response of the observation error e is determined by the location in the complex plane of the elements of the set ΛR, which is
arbitrary: thus it makes sense to select them in such a way that their real parts are
substantially smaller than the real parts of the elements of the set Ap which characterize the transient evolution of the state x. This synthesis approach can surely be
adopted provided that possible unsatisfactory behaviours of the control variables are
checked, as shown in the forthcoming Example B.4.
Example B.4 Consider the system (B.14) with

A = [ 0  1  0
      0  0  1
      0  0  0 ] ,   B = [ 0  0  1 ]' ,   C = [ 1  0  0 ].

By choosing ΛP = {-1, -1, -1}, ΛR1 = {-1, -1, -1} and ΛR2 = {-5, -5, -5}, we obtain K = -[ 1 3 3 ], L1 = -[ 3 3 1 ]' and L2 = -[ 15 75 125 ]'. The responses of the first state variable x1 and the control variable u are reported in Fig. B.8, corresponding to the initial state x(0) = [ 1 1 1 ]', e(0) = -[ 1 1 1 ]'. On the other hand, if the quantity Ju is evaluated (recall eq. (B.4)), we get Ju1 = 12.63 and Ju2 = 163.18 in the two cases. Note the effects of the choice of a faster observer on the closed loop system performances.
Remark B.7 (Stability of the controller) The stability of the closed loop system, which is guaranteed by the selection of stable sets ΛP and ΛR (that is, with elements in the open left half-plane only), does not imply that the controller designed according to the preceding discussion is stable, since its dynamic matrix is AR := A + BK + LC, which can be stable or not independently of the stability of the matrices A + BK and A + LC. This fact is clarified in the forthcoming Example B.5.
Example B.5 Consider the (stable) system (B.14) with

A = [  0    1
      -10  -1 ] ,   B = [ 0  1 ]' ,   C = [ 7  1 ].

Let ψA+BK(s) := s² + s + 1 be the polynomial the roots of which constitute the set ΛP and ψA+LC(s) := s² + 4s + 4 the polynomial the roots of which constitute the set ΛR. We obtain K = [ 9 0 ] and L = [ -27/52 33/52 ]', but the characteristic polynomial of A + BK + LC is ψA+BK+LC(s) = s² + 4s - 17/52, so that the controller is not stable.
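The phenomenon is easy to reproduce numerically. A minimal check, not from the book, with the data of Example B.5:

    import numpy as np

    A = np.array([[0., 1.], [-10., -1.]])
    B = np.array([[0.], [1.]])
    C = np.array([[7., 1.]])
    K = np.array([[9., 0.]])
    L = np.array([[-27/52], [33/52]])

    print(np.linalg.eigvals(A + B @ K))          # roots of s^2 + s + 1
    print(np.linalg.eigvals(A + L @ C))          # roots of s^2 + 4s + 4
    print(np.linalg.eigvals(A + B @ K + L @ C))  # one root in the right half-plane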
Remark B.5 suggests that Problem B.2 could be solved by resorting to a controller of order less than n, namely, of order n-rank(C), as will be shown in
the proof of Theorem B.3.
Theorem B.3 Problem B.2 admits a solution with v = n - rank(C) only if the system (B.14) is minimal. This condition is also sufficient for almost all the sets Λ of the form (B.15).
Proof. Necessity. Necessity is proved as in Theorem B.2. Sufficiency. Assume that the system is already in the form (B.12), i.e.,

η' = A11 η + A12 w + B1 u,
w' = A21 η + A22 w + B2 u,

and consider the observer (B.13), i.e.,

ŵ = ξ - Lη,

together with the further equation

u = Kη η + Kw ŵ.    (B.18)

The dynamic matrix of the resulting closed loop system is

F := [ Az + Bz Kz    Bz Kw
       O             A22 + LA12 ]

where

Az := [ A11   A12
        A21   A22 ] ,   Bz := [ B1
                                B2 ] ,   Kz := [ Kη   Kw ].
The eigenvalues of the matrix F can be made to coincide with the elements of the set Λ, since the pair (Az, Bz) is reachable (this property follows from the reachability of the pair (A, B)) and the pair (A22, A12) is observable. The actual implementation of the control law must be performed by taking into account eq. (B.13c) and checking the solvability of the algebraic loop that such a relation constitutes together with eq. (B.18), i.e., by verifying that the matrix I + (Kη - Kw L)D is nonsingular. If such a property does not hold, it is sufficient to perturb the three matrices Kη, Kw and L: the amount of the perturbation is, at least in principle, arbitrarily small, so that the eigenvalues of the resulting closed loop system differ from their desired values by a similarly arbitrarily small amount.
Figure B.9:
Example B.6 Consider the system described in Example B.4 and let ΛP = {-1, -1, -1}, while the set ΛR is taken equal to either ΛR1 = {-1, -1} or ΛR2 = {-5, -5}, since a second order controller has to be adopted. We obtain K = -[ 1 3 3 ], LR1 = -[ 2 1 ]', LR2 = -[ 10 25 ]'. The transient responses of the first state variable x1 and the control variable u are reported in Fig. B.9, corresponding to the initial state x(0) = [ 1 1 1 ]', e(0) = -[ 1 1 ]'. In order to allow a meaningful comparison, the responses of these variables when the controllers given in Example B.4 are exploited are shown in the same figure. On the other hand, if the quantity Ju is evaluated (recall eq. (B.4)), we obtain Ju1 = 8.21 and Ju2 = 11.63 in the two cases: the comparison of these values with those in Example B.4 deserves attention.
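The observer gains above are again a straightforward coefficient match on A22 + LA12. A minimal numerical check, not from the book, with η = x1 and w = (x2, x3), so that A12 = [ 1 0 ] and A22 is a chain of two integrators:

    import numpy as np

    A12 = np.array([[1., 0.]])
    A22 = np.array([[0., 1.], [0., 0.]])
    for L in (-np.array([[2.], [1.]]), -np.array([[10.], [25.]])):
        print(np.linalg.eigvals(A22 + L @ A12))   # {-1, -1} and then {-5, -5}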
Theorem B.4 Problem B.2 admits a solution with v = n - rank(B) only if the system (B.14) is minimal. This condition is also sufficient for almost all the sets Λ of the form (B.15).
Remark B.8 (Stability of the reduced order controllers) Remark B.7 applies also to the cases where the controller has been designed by exploiting a reduced order observer of the system (B.14) (controller of order n - rank(C)) or of its transpose (controller of order n - rank(B)).
B.4 Assignment with asymptotic errors zeroing

Consider the set of exogenous signals (set points ys and disturbances d) which are polynomial functions of time of degree not greater than k, i.e.,

ys(t) = Σ i=0,...,k  ysi t^i ,   d(t) = Σ i=0,...,k  di t^i    (B.19)

and the system described by the equations

x' = Ax + Bu,    (B.20a)
u = un + Md,    (B.20b)
y = Cx + Du + Nd,    (B.20c)
e = ys - y,    (B.20d)

where un is the control variable available to the controller and e is the error variable which has to be asymptotically zeroed.
Problem B.3 (Eigenvalue assignment and errors zeroing) Given the system (B.20) and the set of exogenous signals (B.19), find a linear time-invariant controller such that:

i) The eigenvalues of the closed loop system are located in arbitrarily preassigned positions.

ii) Corresponding to any exogenous signal of the form (B.19) and any initial condition it results that

lim t→∞ e(t) = 0.    (B.21)
A solution of this problem is given in the following theorem, the proof of which
supplies a way of actually designing the required controller.
Theorem B.5 Problem B.3 admits a solution if and only if:

i) the pair (A, B) is reachable;

ii) the pair (A, C) is observable;

iii) letting

P(s) := [ sI - A   -B
          C         D ] ,

it results that

det(P(0)) = det([ -A  -B
                  C    D ]) ≠ 0.    (B.22)

Proof. Necessity. The necessity of the conditions i) and ii) follows from arguments similar to those exploited in proving Theorem B.2. As for the condition iii), notice that if rank(P(0)) < n + m, then at least one of the following three situations occurs: (a) the rows of the matrix [ A  B ] are linearly dependent; (b) the rows of the matrix [ C  D ] are linearly dependent; (c) rank(P(0)) < n + m and none of the two preceding conditions holds.
If the condition (a) is verified, there exists a vector v ≠ 0 such that A'v = 0 and B'v = 0, so that (recall the PBH test) the pair (A, B) is not reachable and an already proved necessary condition is violated. If the condition (b) is verified and the disturbance d is not present, at least for one i ∈ {1, 2, ..., m} it results that

yi(·) = Σ j≠i  αj yj(·)

against the requirement ii) in Problem B.3. Finally, if the condition (c) is verified and the disturbance d is not present, it follows that

Σ i=1,...,m  βi yi(·) = Σ j=1,...,n  αj xj'(·)

with at least one of the coefficients βi ≠ 0. This implies that the error cannot asymptotically be zero for constant, yet arbitrary, set points, since lim t→∞ x'(t) = 0.
Sufficiency. Consider the system described by the equations (B.20) and

ξ^(0)' = e = ys - y,    (B.23a)
ξ^(1)' = ξ^(0),    (B.23b)
...
ξ^(k-1)' = ξ^(k-2),    (B.23c)
ξ^(k)' = ξ^(k-1),    (B.23d)

where ξ^(i) ∈ R^m, i = 0, 1, ..., k. If this system is reachable from the input un and observable from the output ξ := [ ξ^(0)'  ξ^(1)'  ...  ξ^(k-1)'  ξ^(k)' ]', then (see Fig. B.10) there exists a controller R(s), constituted by the system (B.23) and the system

z' = Fz + Gξ,    (B.24)
un = Hz + Kξ,    (B.25)
such as to guarantee the eigenvalue assignment for the resulting closed loop system.
Thus, the system shown in Fig. B.10 can without loss of generality be thought of as
stable. This fact implies that the steady state condition which will be shown to be
consistent and characterized by e = 0 is the one to which the system asymptotically
tends.
As a preliminary, it is proved that the assumptions i)-iii) imply both observability from the output ξ and reachability from the input un for the system (B.20), (B.23). The dynamic and input matrices of the system under consideration are

AA = [ A        Onxmk    Onxm
       -C       Omxmk    Omxm
       Omkxn    Imk      Omkxm ] ,   BA = [ B
                                            -D
                                            Omkxm ] .
By direct computation, the reachability matrix Kr of the pair (AA, BA) can be given the form of a product of two matrices T and S, where T is square and its block entries, apart from zero and identity blocks, are -A, -B, -AB, -A^2 B, ..., -D and -Im, while S possesses the same block structure as Kr.
The matrix T is nonsingular because of assumption iii); hence the rank of Kr is maximum if and only if the rank of S is such. In view of the structure of S (in particular, the position of the identity submatrix), the rank of S is maximum if and only if the rank of the matrix obtained from S after the rows and columns which are concerned with the identity submatrix have been deleted is such. The resulting matrix possesses the very same structure as Kr (apart from the sign of the first row of submatrices, which is of no relevance in the rank context). What was previously done with reference to Kr can also be applied to this new matrix, yielding, after k - 1 iterations, a type "S" matrix
which has maximum rank thanks to the assumption i). Thus the pair (AA, BA) is
reachable. As for observability, notice that the output matrix CA is

CA = [ Om(k+1)xn   Im(k+1) ].

Thus the matrix Ko for the Kalman observability test is, after a suitable rearrangement of the columns,

Ko = [ Im(k+1)   Om(k+1)xn
       ...       ...       ]

and its rank is easily checked to be maximum in view of the observability of the pair (A, C). Consider now a stable linear system w' = As w + Bv, driven by an input v which is a polynomial function of time of degree not greater than k: at
steady state the state of this system is a polynomial function of degree not greater than k, i.e., w = Σ i=0,...,k w^(i) t^i, w^(i) = const., i = 0, 1, ..., k. In fact, the unknown vectors w^(i), i = 0, 1, ..., k are uniquely determined by the set of equations (which enforce the above steady state condition)

0 = As w^(k) + B v^(k),
k w^(k) = As w^(k-1) + B v^(k-1),
...

no matter how the vectors v^(i), i = 0, 1, ..., k (which specify the input v) have been selected, since the matrix As is nonsingular (the system is stable). When applied to the system in Fig. B.10, this discussion implies that eq. (B.21) is satisfied, since otherwise ξ^(k) would be a polynomial function of time of degree k + 1 (recall eqs. (B.23)).
Remark B.9 (Internal model principle) The most significant part of the controller defined in the proof of Theorem B.5 (eqs. (B.23)) is constituted by a set of m noninteracting subsystems (as many as the number of controlled variables), each one of them simply consisting of a bunch of k + 1 cascaded integrators (thus their number is just equal to the maximum degree of the polynomial exogenous inputs plus 1). Therefore each subsystem possesses the very same structure as an autonomous system capable of generating the considered family of exogenous signals. Including into the controller this block of integrators is, within the framework under consideration, only a sufficient condition for zero errors regulation, which becomes also necessary whenever the controlled plant is allowed to undergo arbitrary parametric perturbations, the magnitude of which has to be so small as to preserve the stability of the closed loop system. Indeed, the integrators which could be present in the plant (nominal) description cannot be exploited in achieving asymptotic zero errors regulation in the case of arbitrary parameter perturbations. In the literature, the need for inserting into the controller a duplicate of the system which generates the exogenous signals whenever some kind of robustness is required for the closed loop system (designed so as to achieve zero errors regulation) is referred to as the Internal model principle. These facts are well known to anybody who is familiar with basic control theory for single input single output systems.
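For constant exogenous signals (k = 0) the construction underlying the proof of Theorem B.5 reduces to augmenting the plant with one integrator per error component and stabilizing the augmented pair. A minimal sketch, not from the book, on hypothetical single input, single output data (SciPy pole placement, with distinct eigenvalues):

    import numpy as np
    from scipy.signal import place_poles

    A = np.array([[0., 1.], [-2., -3.]])
    B = np.array([[0.], [1.]])
    C = np.array([[1., 0.]])

    # Augmented state (x, xi), with d(xi)/dt = e = ys - Cx  (cf. eq. (B.23a))
    AA = np.block([[A, np.zeros((2, 1))],
                   [-C, np.zeros((1, 1))]])
    BA = np.vstack([B, np.zeros((1, 1))])
    K = -place_poles(AA, BA, [-1., -2., -4.]).gain_matrix   # u = K (x, xi)
    print(np.linalg.eigvals(AA + BA @ K))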
Remark B.10 (Inputs with rational Laplace transform) A similar discussion can be carried on if the exogenous signals have rational Laplace transforms with poles in the closed right half plane, namely signals which are sums and products of time functions of polynomial, increasing exponential and sinusoidal type. Once the structure of the autonomous linear system ΣE of least order which is able to generate the whole set of considered exogenous signals has been identified, it is sufficient to insert into the controller a subsystem constituted by m copies of ΣE (one for each error component)
Figure B.11:
input of the system (B.24), (B.25) and its order lowered to n - m, since the relevant output matrix is readily seen to have maximum rank, provided that the (reasonable) assumption is made that the matrix C also has maximum rank.
Remark B.12 (Constant exogenous signals) The particular case of constant exogenous signals deserves special attention from a practical point of view (the design specifications frequently call for asymptotic zero error regulation in the presence of constant set points and disturbances) and furthermore allows us to resort to a substantially different controller when the disturbance d can be measured. With reference to Fig. B.11, where Hs and Hd are nondynamical systems, it is easy to check that condition (B.21) is satisfied provided that the closed loop system is stable and, at steady state, un = ūn and x = x̄ = const., with

0 = Ax̄ + Būn + BMd,
ys = Cx̄ + Dūn + (DM + N)d.

These equations can be rewritten as

P(0) [ x̄     = [ BMd
       ūn ]      ys - (DM + N)d ]
Figure B.12: Example B.7: responses of the error e1 when a compensator (C) or an integral controller (I) is adopted: (a) nominal conditions; (b) perturbed conditions.
where

Hs := [ O  I ] (P(0))⁻¹ [ O
                          I ] ,   Hd := [ O  I ] (P(0))⁻¹ [ BM
                                                            -(DM + N) ] .
Assumptions i) and ii) obviously guarantee the existence of a controller R(s) which assigns the system eigenvalues to their desired locations.

However, it must be noted that, differently from the controller presented in the proof of Theorem B.5, a design carried out according to these ideas does not guarantee, in general, that eq. (B.21) is verified if the system parameters undergo even small perturbations. In other words, the open loop compensator made out of the two subsystems Hd and Hs does not endow the design with any kind of robustness.
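Under the sign convention adopted above for P(s), the static blocks amount to the solution of a single linear system. A minimal sketch, not from the book, on hypothetical data with a measurable constant disturbance:

    import numpy as np

    A = np.array([[0., 1.], [-2., -3.]])
    B = np.array([[0.], [1.]])
    C = np.array([[1., 0.]])
    D = np.array([[0.]])
    M = np.array([[1.]])
    N = np.array([[0.]])

    P0 = np.block([[-A, -B], [C, D]])        # P(0); must be nonsingular
    ys, d = np.array([[1.]]), np.array([[0.5]])
    rhs = np.vstack([B @ M @ d, ys - (D @ M + N) @ d])
    xbar, ubar = np.split(np.linalg.solve(P0, rhs), [2])
    print(C @ xbar + D @ (ubar + M @ d) + N @ d)   # equals ys at steady state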
Remark B.13 (Nonsquare systems) It is quite obvious how Theorem B.5 could be applied to the case where the number of components of the control vector un is greater than the number of components of the output vector y, while Problem B.3 does not have a solution when the dimension of the vector y is greater than the dimension of the vector un.
Example B.7 Consider the system (B.20) with

A = [ ... ] ,   B = [ ... ] ,   C = [ ... ] ,   D = O ,   M = [ 1  1 ]' ,   N = [ 0  2 ]'
Figure B.13: Example B.7: responses of the error e2 when a compensator (C) or an integral controller (I) is adopted: (a) nominal conditions; (b) perturbed conditions.
and let both the set point ys and the disturbance d be constant. Problem B.3 is
solved by resorting both to a controller based on an open loop compensator (see
Remark B.12) and to a controller inspired by the internal model principle. In the
two cases the closed loop eigenvalues are required to be equal to -2. The first design
is specified by
Hd = [ ... ] ,   Hs = [ ... ] .
A controller R(s) = Σ(AR, BR, CR, DR) which implements the second design approach and has [ e'  y' ]' as input is specified by

AR = [ ... ] ,   BR = [ ... ] ,   CR = [ ... ] ,   DR = [ ... ] .
The time responses of the error variables for the two designs are reported in Figs. B.12 and B.13 when the exogenous signals are

ys1 = 0 for t < 3, ys1 = 1 for t > 3;   ys2 = 0 for t < 6, ys2 = 1 for t > 6;   d = -1, t > 0,
and the initial state of the plant is zero. Now assume that in the plant description the matrix N has become 1.5 times the original one. The time responses of the error variables for this new situation and the same exogenous signals are reported in the quoted figures. It is apparent that the first design procedure does not satisfactorily withstand the perturbation.
Appendix C

Notation

Re[s] is the real part of the complex number s.

Im(A) is the range space of A, that is the subspace spanned by the columns of A.
Bibliography
It is fairly difficult to select a reasonably compact list of references out of the
huge number of excellent contributions which appeared in the literature since
the late 1950s. Thus we are going to mention a small number of textbooks and
journal papers only, picking them out of those which are widely recognized
as particularly significant for the understanding of the basic issues of optimal
control theory.
With specific reference to the subjects which have been discussed in this book it is first worth mentioning the texts [2], [4], [8], [21], [22], [32], which are universally acknowledged as classical references. More recent contributions are the books [1], [3], [18], [21], [25], [29], [37]. A more detailed analysis leads to the following suggestions.
A synthetic presentation of the Hamilton-Jacobi theory is given in [20];
the Linear Quadratic problem is extensively discussed in [2], [3], [4], [9], [22],
[26], [33], [41], in particular, more details about the cheap control issue can be
found in [22]. The Linear Quadratic Gaussian problem is dealt with in [2], [15],
[16], [22], this last book constituting a particularly suitable source for a simple
yet efficient summary of basic notions of probability theory, while [12], [33]
and [35] supply a deeper insight into the robustness issues concerning control
systems; Riccati equations are analyzed under various viewpoints in [7], [12],
[23], [27]. A reference for stochastic control is [36]. A different approach to the
Linear Quadratic Gaussian problem can be found in [10], [34]. Computational
aspects are considered in [30], while interesting examples of applications can be found in [24].
The Maximum Principle, and in general the material pertaining to first
order variational methods, is extensively treated in [4], [8], [21], [32]; the specific
issue of singular solutions is further investigated in [5], [6], [9], while many examples of minimum time problems can be found in [4] and [32].
Second order variational methods have been presented here by following
the book [8] which is a useful reference, together with [9], [14], [19], [31] and
[38], for the basic aspects of computational algorithms, also for those which
restrict their attention to first order effects only.
As for the observation of the state of a linear system based on the knowledge of its inputs and outputs and some specific issues concerning eigenvalues
assignment see [28] and [17]. The first statement of the so called internal model
principle can be found in [11] and [13].
A wide collection of examples particularly suited for didactic purposes
can be found in the Italian books [26] and [27].
[1] V.M. Alekseev, V.M. Tikhomirov, S.V. Fomin, Optimal control, Consultants Bureau, 1987.
[3] B.D.O. Anderson, J.B. Moore, Optimal control. Linear-quadratic methods, Prentice-Hall, 1989.
[4] M. Athans, P.L. Falb, Optimal control, McGraw-Hill, 1966.
[5] D.J. Bell, D.H. Jacobson, Singular optimal control problems, Academic
Press, 1975.
[8] A.E. Bryson, Y.C. Ho, Applied optimal control, Hemisphere Publ. Co.,
1975.
[12] J.C. Doyle, Guaranteed margins for LQG regulators, IEEE Trans. on
Automatic Control, vol. AC-23, pp. 756-757, 1978.
[13] B.A. Francis, W.M. Wonham, The internal model principle of control theory, Automatica, vol. 12, pp. 457-465, 1976.
[15] M.S. Grewal, A.P. Andrews, Kalman filtering: theory and practice, Prentice Hall, 1993.
[16] M.J. Grimble, M.A. Johnson, Optimal control and stochastic estimation,
J. Wiley, 1988.
[18] L.M. Hocking, Optimal control: an introduction to the theory and applications, Oxford University Press, 1991.
[20] R.E. Kalman, P.L. Falb, M.A. Arbib, Topics in mathematical system theory, McGraw-Hill, 1969.
[24] F.L. Lewis, Applied optimal control and estimation: digital design and implementation, Prentice Hall, 1993.
[25] F.L. Lewis, V.L. Syrmos, Optimal control, J. Wiley, 1995.
[28] D.G. Luenberger, An introduction to observers, IEEE Trans. on Automatic Control, vol. AC-16, pp. 596-602, 1971.
[29] J.M. Maciejowski, Multivariable feedback design, Addison Wesley, 1989.
[30] V. Mehrmann, The autonomous linear quadratic control problem: theory and numerical solutions, Springer-Verlag, 1991.
[33] A. Saberi, B.M. Chen, P. Sannuti, Loop transfer recovery: analysis and
design, Springer Verlag, 1993.
[34] A. Saberi, P. Sannuti, B.M. Chen, H2 optimal control, Prentice Hall, 1995.
[35] M.G. Safonov, M. Athans, Gain and phase margin for multiloop LQG
regulators, IEEE Trans. on Automatic Control, vol. AC-22, pp. 173-179,
1977.
[36] R.F. Stengel, Stochastic optimal control: theory and application, J. Wiley, 1986.
[37] R.F. Stengel, Optimal control and estimation, Dover Publications, 1994.
[38] K.L. Teo, C.J. Goh, K.H. Wong, A unified computational approach to optimal control problems, Longman Scientific & Technical, 1991.
Algorithms
6 6.1: 175
7 7.1: 233
A A.1: 250
Assumptions
6 6.1: 206
Corollaries
2 2.1: 16
Definitions
2 2.1: 10, 2.2: 11, 2.3: 11, 2.4: 11, 2.5: 11, 2.6: 12
6 6.1: 149, 6.2: 149
A A.1: 254, A.2: 254, A.3: 254, A.4: 254, A.5: 255, A.6: 255
Examples
1 1.1: 2, 1.2: 3, 1.3: 4, 1.4: 4
2 2.1: 11, 2.2: 13, 2.3: 15, 2.4: 16, 2.5: 16, 2.6: 17
3 3.1: 24, 3.2: 25, 3.3: 28, 3.4: 29, 3.5: 30, 3.6: 32, 3.7: 34, 3.8: 35, 3.9: 37,
3.10: 39, 3.11: 43, 3.12: 46, 3.13: 47, 3.14: 47, 3.15: 49, 3.16: 50, 3.17: 52,
3.18: 57, 3.19: 60, 3.20: 61, 3.21: 62, 3.22: 66, 3.23: 68, 3.24: 70, 3.25: 73,
3.26: 74, 3.27: 79, 3.28: 79, 3.29: 80, 3.30: 81, 3.31: 85, 3.32: 85, 3.33: 86,
3.34: 88
4 4.1: 96, 4.2: 98, 4.3: 101, 4.4: 103, 4.5: 104, 4.6: 107, 4.7: 108, 4.8: 111,
4.9: 114, 4.10: 117, 4.11: 118, 4.12: 120
5 5.1: 127, 5.2: 128, 5.3: 130, 5.4: 131, 5.5: 132, 5.6: 132, 5.7: 132, 5.8: 134,
5.9: 134, 5.10: 134, 5.11: 135, 5.12: 136, 5.13: 136, 5.14: 136, 5.15: 137,
5.16: 138, 5.17: 138, 5.18: 138, 5.19: 139, 5.20: 139, 5.21: 141, 5.22: 142,
5.23: 143
6 6.1: 150, 6.2: 152, 6.3: 154, 6.4: 155, 6.5: 155, 6.6: 156, 6.7: 157, 6.8: 157,
6.9: 159, 6.10: 160, 6.11: 163, 6.12: 165, 6.13: 166, 6.14: 169, 6.15: 170,
6.16: 171, 6.17: 172, 6.18: 173, 6.19: 178, 6.20: 179, 6.21: 181, 6.22: 183,
6.23: 184, 6.24: 186, 6.25: 188, 6.26: 189, 6.27: 191, 6.28: 192, 6.29: 197,
6.30: 198, 6.31: 201, 6.32: 203, 6.33: 207, 6.34: 209, 6.35: 210, 6.36: 211,
6.37: 215
7 7.1: 225, 7.2: 225, 7.3: 227, 7.4: 229, 7.5: 231, 7.6: 234, 7.7: 238, 7.8: 241,
7.9: 243, 7.10: 245
B B.1: 261, B.2: 262, B.3: 263, B.4: 269, B.5: 269, B.6: 271, B.7: 278
Figures
1 1.1: 2
2 2.1: 14, 2.2: 17
3 3.1: 22, 3.2: 24, 3.3: 26, 3.4: 26, 3.5: 28, 3.6: 33, 3.7: 33, 3.8: 35, 3.9: 35,
3.10: 36, 3.11: 37, 3.12: 38, 3.13: 40, 3.14: 44, 3.15: 46, 3.16: 47, 3.17: 48,
3.18: 53, 3.19: 55, 3.20: 57, 3.21: 58, 3.22: 59, 3.23: 60, 3.24: 62, 3.25: 63,
3.26: 64, 3.27: 66, 3.28: 67, 3.29: 70, 3.30: 71, 3.31: 72, 3.32: 73, 3.33: 74,
3.34: 75, 3.35: 76, 3.36: 77, 3.37: 79, 3.38: 80, 3.39: 81
4 4.1: 97, 4.2: 101, 4.3: 107, 4.4: 108, 4.5: 111, 4.6: 115, 4.7: 117, 4.8: 119,
4.9: 121
5 5.1: 142
6 6.1: 150, 6.2: 155, 6.3: 157, 6.4: 159, 6.5: 162, 6.6: 164, 6.7: 167, 6.8: 169,
6.9: 171, 6.10: 172, 6.11: 172, 6.12: 174, 6.13: 182, 6.14: 183, 6.15: 185,
6.16: 186, 6.17: 188, 6.18: 190, 6.19: 192, 6.20: 193, 6.21: 195, 6.22: 197,
6.23: 198, 6.24: 199, 6.25: 202, 6.26: 203, 6.27: 204, 6.28: 206, 6.29: 207,
6.30: 209, 6.31: 210, 6.32: 211, 6.33: 212, 6.34: 213, 6.35: 214, 6.36: 215,
6.37: 216
7 7.1: 226, 7.2: 228, 7.3: 230, 7.4: 232, 7.5: 236, 7.6: 239, 7.7: 241, 7.8: 243,
7.9: 245
B B.1: 259, B.2: 263, B.3: 264, B.4: 265, B.5: 266, B.6: 267, B.7: 268, B.8: 269, B.9: 271, B.10: 272, B.11: 277, B.12: 278, B.13: 279
Lemmas
3 3.1: 49, 3.2: 61, 3.3: 71, 3.4: 77
5 5.1: 128, 5.2: 139
B B.1: 260
Problems
2 2.1: 10
3 3.1: 21, 3.2: 40, 3.3: 45, 3.4: 69
4 4.1: 93, 4.2: 112
6 6.1: 148
Remarks
3 3.1: 25, 3.2: 25, 3.3: 27, 3.4: 29, 3.5: 30, 3.6: 31, 3.7: 34, 3.8: 36, 3.9: 37,
3.10: 44, 3.11: 46, 3.12: 50, 3.13: 51, 3.14: 53, 3.15: 55, 3.16: 58, 3.17: 63,
3.18: 63, 3.19: 66, 3.20: 85, 3.21: 85
4 4.1: 92, 4.2: 96, 4.3: 96, 4.4: 98, 4.5: 100, 4.6: 100, 4.7: 101, 4.8: 113, 4.9:
116, 4.10: 118, 4.11: 118, 4.12: 120
6 6.1: 154, 6.2: 154, 6.3: 154, 6.4: 156, 6.5: 158, 6.6: 161, 6.7: 162, 6.8: 166,
6.9: 168, 6.10: 168, 6.11: 173
7 7.1: 232
B B.1: 262, B.2: 262, B.3: 263, B.4: 264, B.5: 265, B.6: 268, B.7: 269, B.8:
271, B.9: 276, B.10: 276, B.11: 277, B.12: 277, B.13: 278
Tables
4 4.1: 98
5 5.1: 131, 5.2: 133
6 6.1: 179, 6.2: 180
Theorems
2 2.1: 12
3 3.1: 22, 3.2: 40, 3.3: 45, 3.4: 48, 3.5: 49, 3.6: 61, 3.7: 63, 3.8: 69, 3.9: 72,
3.10: 74, 3.11: 78, 3.12: 79, 3.13: 82, 3.14: 86
4 4.1: 94, 4.2: 99, 4.3: 100, 4.4: 102, 4.5: 103, 4.6: 104, 4.7: 112, 4.8: 115,
4.9: 116, 4.10: 120, 4.11: 120
5 5.1: 126, 5.2: 127, 5.3: 129, 5.4: 132, 5.5: 133, 5.6: 135, 5.7: 137, 5.8: 137,
5.9: 138, 5.10: 140
6 6.1: 151, 6.2: 165, 6.3: 201, 6.4: 206, 6.5: 208, 6.6: 208, 6.7: 208, 6.8: 208,
6.9: 209
7 7.1: 224, 7.2: 227, 7.3: 229, 7.4: 230, 7.5: 237, 7.6: 240, 7.7: 242, 7.8: 244
A A.1: 251, A.2: 255, A.3: 255, A.4: 255, A.5: 256, A.6: 257, A.7: 257
B B.1: 261, B.2: 267, B.3: 270, B.4: 271, B.5: 273
Index
- A -
Algebraic Riccati equation, 45, 69, 103, 117
decomposition, 50
feedback connection, 132
eigenvalues, 132
hamiltonian matrix
Z-invariant subspaces, 129
definition, 127
eigenvalues, 127, 128, 132,
139
limit of P(p), 77
minimal solution, 48
sign defined solution
existence, 137
uniqueness, 140
solution
number of, 130
sign defined, 136
vs. Z-invariant subspaces,
129
- C -
Constraints
complex, 148
global instantaneous equality,
181, 187
global instantaneous
inequality, 181, 196
integral, 180, 184
isolated instantaneous
equality, 181, 190
nonregular variety, 180, 181
simple, 148
- D -
Detectability
PBH test, 251
Differential Riccati equation, 22,
40, 69, 94, 99, 100, 102, 103,
112, 116, 224, 227, 229, 230,
237, 240, 242, 244
hamiltonian matrix
definition, 126
solution, 126
- E -
Eigenvalues, 254
Eigenvalues assignment
accessible state
definition, 260
number of solutions, 262, 263
selection of A, 262
solution, 261
errors zeroing
constant exogenous signals,
277
solution, 273
inaccessible state
choice of AR, 268
controller stability, 269
definition, 267
reduced order controller
stability, 271
reduced order solution, 270,
271
solution, 267
- G -
Global methods
H-minimizing control, 11
admissible control, 10
hamiltonian function, 11
H-minimizing control, 11
regularity, 11
optimal control, 11
optimal control problem
definition, 10
sufficient conditions, 12, 16
- H -
Hamilton-Jacobi equation, 12
Hamiltonian matrix, 126, 127
Hamiltonian system, 149
- K -
Kalman filter
normal case, 100
Du term, 100
error/filter state incorrelation, 101
infinite horizon, 102
meaning of Pi(t), 100
normal time-invariant case
infinite horizon, 103
stability, 104
singular time-invariant case

- L -
Linear-quadratic problem
definition, 21
finite horizon
Du term, 34
coefficient, 25
linear-quadratic index, 27
penalty on u̇, 36
rectangular term, 29
sign of Q and S, 30
solution, 22
stochastic problem, 37
tracking problem, 31
uniqueness of solution, 25
infinite horizon, 39
definition, 40
solution, 40
uniqueness of solution, 44
optimal regulator
cheap control, 78, 79
control of equilibrium, 46
definition, 45
existence and stability, 63
existence of solution, 49
exogenous inputs, 63, 66
frequency domain index, 55
index choice, 85
inverse problem, 82, 85, 86
minimality of P, 48
penalty on u̇, 53
positiveness of P, 61
robustness, 72, 74
solution, 45
stability, 61, 63
tracking problem, 51
optimal regulator with
exponential stability
definition, 69
solution, 69
Linear-quadratic-gaussian control
definition, 112
finite horizon
index value, 113
solution, 112
infinite horizon
index value, 116, 118
robustness, 120
solution, 115
stability, 118
time-invariant case, 116
Linear-quadratic-gaussian
estimation
definition, 93
normal case
correlated noises, 98
error variance, 96
meaning of 0, 96
result, 94
solution, 99
Local sufficient conditions, 222,
224, 227, 229, 230
- M -
Matrix
hamiltonian, 126, 127
McMillan-Smith form, 253
positive definite
check, 254, 255
definition, 254
eigenvalues, 254
positive semidefinite
check, 254, 255
definition, 254
eigenvalues, 254
factorization, 255
Maximum Principle, 148
orthogonality condition, 152
pathological problems, 154, 168, 169
piecewise constancy of optimal control, 206
- O -
Observability
PBH test, 251
of the pair (A, Q), 49
Optimal control problem
attitude, 4
characterization, 4
linear-quadratic, 21
finite horizon, 22
infinite horizon, 39
positioning, 3
regulator, 45
rendezvous, 2
stabilization, 4
tracking, 31
Orthogonality condition, 152
- P -
Parallelepiped, 208
PBH test, 251
Poles, 253
Polyhedron, 205
- Q -
- R -
Reachability
PBH test, 251
from a single input, 260
Regular variety, 149
- S -
Stabilizability
PBH test, 251
State observer
of order n, 264
of order n - rank(C), 265
Stochastic system, 255
expected value of x, 256
quadratic functional

- V -
Variational methods
auxiliary system, 149
first order analysis, 147, 221
necessary conditions, 151, 165

- Z -
Zeros
invariant, 254
transmission, 254