8 Pontryagin: Optimal Control with Constraints on Inputs

1 Numerical determination of optimal trajectories

1.1
In the previous chapter, variational techniques were used to derive necessary conditions for optimal control. The problem was stated as follows:

Consider the problem of minimizing the performance measure

J = h(x(tf), tf) + ∫_{t0}^{tf} g(x(t), u(t), t) dt   (1.1)

subject to:

ẋ(t) = f(x(t), u(t), t),  x(t0) = x0   (1.2)

The problem is to find an admissible control u*(t) that causes the system (1.2) to follow an admissible trajectory x*(t) that minimizes the performance measure (1.1).

The Hamiltonian has been defined as:

H(x(t), u(t), λ(t), t) = g(x(t), u(t), t) + λ^T(t) f(x(t), u(t), t)
Assuming that the state and control variables are not constrained, that the final time tf is fixed and the final state is free, we can summarize the two-point boundary value problem (TPBVP) that results from the variational approach by the equations (Kirk, 2004):

ẋ* = ∂H/∂λ = f(x*, u*, t)   (1.3)

λ̇* = −∂H/∂x   (1.4)

0 = ∂H/∂u   (1.5)

x*(t0) = x0   (1.6)

λ*(tf) = ∂h/∂x (x*(tf))   (1.7)
From these five sets of conditions it is desired to obtain an explicit relationship for x*(t) and u*(t), t ∈ [t0, tf]. The difficulty of solving the differential equations in this case is caused by the combination of split boundary values and nonlinear differential equations.
Note that the differential equations (1.3) and (1.4) should, in general, be solved simultaneously, because they depend on the same variables (the states and the adjoint variables). The boundary conditions, however, are split, as shown in (1.6) and (1.7): the first are the values of the states at the initial time, while the rest are the values of the adjoint variables at the final time. A classical numerical method of integration therefore cannot be applied directly.
The algorithm (Kirk, 2004) is based on the observation that the u*(t) which minimizes the Hamiltonian will minimize J. Thus, the numerical algorithm will determine a control u which makes the first derivative of the Hamiltonian with respect to u equal to zero.
Write the Hamiltonian:

H(x(t), u(t), λ(t), t) = g(x(t), u(t), t) + λ^T(t) f(x(t), u(t), t)   (1.8)

[Figure: an initial guess u^(k)(t) on [t0, tf] and the optimal control u*(t) toward which the iteration converges.]

At each iteration k, the state equations are integrated forward in time with the current control u^(k), the costate equations are integrated backward, and ∂H/∂u is evaluated along the resulting trajectory. If

‖∂H/∂u‖^(k) ≤ ε   (1.9)

where ε is a small positive constant, terminate the iterative procedure and output the extremal state and control values; otherwise, adjust the control by a steepest-descent step,

u^(k+1)(t) = u^(k)(t) − τ (∂H/∂u)^(k)   (1.10)

where τ is a positive step size, and repeat.
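As a concrete illustration, here is a minimal Python sketch of this iteration, applied to an assumed scalar example (the plant ẋ = −x + u with a quadratic cost, the grid, the step size τ and the tolerance ε are illustrative choices, not taken from the notes):

```python
import numpy as np

# Steepest descent for the TPBVP (1.3)-(1.7), iterating (1.10) on an
# assumed scalar problem:  minimize J = int_0^T 0.5*(x^2 + u^2) dt
# subject to xdot = -x + u, x(0) = x0, with the final state free.
# Then H = 0.5*x^2 + 0.5*u^2 + lam*(-x + u), so dH/du = u + lam, and
# the costate equation is lamdot = -dH/dx = -x + lam with lam(T) = 0.

T, N, x0 = 2.0, 200, 1.0
dt = T / N
tau, eps = 0.5, 1e-6               # step size and stopping tolerance (1.9)
u = np.zeros(N + 1)                # initial guess u^(0)(t)

for k in range(1000):
    # integrate the state equation forward with the current control (Euler)
    x = np.empty(N + 1)
    x[0] = x0
    for i in range(N):
        x[i + 1] = x[i] + dt * (-x[i] + u[i])
    # integrate the costate equation backward from lam(T) = 0
    lam = np.empty(N + 1)
    lam[N] = 0.0
    for i in range(N, 0, -1):
        lam[i - 1] = lam[i] - dt * (-x[i] + lam[i])
    dHdu = u + lam                 # gradient of H w.r.t. u along the trajectory
    if np.sqrt(dt * np.sum(dHdu**2)) <= eps:   # criterion (1.9)
        break
    u = u - tau * dHdu             # steepest-descent update (1.10)

print(f"stopped after {k + 1} iterations, u*(0) ~= {u[0]:.4f}")
```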
2 Pontryagin's Minimum Principle

2.1
We have assumed that the admissible controls and states are not constrained
by any boundaries; however, in realistic systems such constraints commonly
occur. Physically realizable controls generally have magnitude limitations:
the thrust of a rocket engine cannot exceed a certain value; motors, which
provide torque, saturate; attitude control mass expulsion systems are capable of providing a limited torque. State constraints often arise because of
safety or structural restrictions: the current in an electric motor cannot exceed a certain value without damaging the windings; the turning radius of a
maneuvering aircraft cannot be less than a specified minimum value (Kirk,
2004).
The general approach, in which we consider the effect of control constraints when deriving the necessary conditions, leads to Pontryagin's minimum principle.
[Figure: a control u(t) on [t0, tf] constrained to an admissible region, with the inadmissible region shaded.]
The problem is to find an admissible control u*(t), i.e. one satisfying

u(t) ∈ U,  t ∈ [t0, tf]   (2.1)

that causes the system

ẋ(t) = f(x(t), u(t), t)   (2.2)

to follow an admissible trajectory that minimizes the performance measure

J = h(x(tf), tf) + ∫_{t0}^{tf} g(x(t), u(t), t) dt   (2.3)

In terms of the Hamiltonian

H(x(t), u(t), λ(t), t) = g(x(t), u(t), t) + λ^T(t) f(x(t), u(t), t)   (2.4)

the necessary conditions for u*(t) to be an optimal control are:

ẋ*(t) = ∂H/∂λ (x*(t), u*(t), λ*(t), t)   (2.5)

λ̇*(t) = −∂H/∂x (x*(t), u*(t), λ*(t), t)   (2.6)

H(x*(t), u*(t), λ*(t), t) ≤ H(x*(t), u(t), λ*(t), t) for all admissible u(t)   (2.7)

and:

[∂h/∂x (x*(tf), tf) − λ*(tf)]^T δxf + [H(x*(tf), u*(tf), λ*(tf), tf) + ∂h/∂t (x*(tf), tf)] δtf = 0   (2.8)

The boundary condition (2.8) reduces to λ*(tf) = ∂h/∂x (x*(tf), tf) when the final state is free and the final time is fixed, and to H(x*(tf), u*(tf), λ*(tf), tf) + ∂h/∂t (x*(tf), tf) = 0 when the final state is fixed and the final time is free.

It should be emphasized that

H(x*(t), u*(t), λ*(t), t) = min_{u(t)∈U} H(x*(t), u(t), λ*(t), t)   (2.9)

i.e., u*(t) is a control that causes H(x*(t), u(t), λ*(t), t) to assume its global, or absolute, minimum.
Equations (2.5), (2.6), (2.7), (2.8) constitute a set of necessary conditions for optimality; these conditions are not, in general, sufficient.
In addition, the minimum principle, although derived for controls with values in a closed and bounded region, can also be applied to problems in which the admissible controls are not bounded. This can be done by viewing the unbounded control region as having arbitrarily large bounds, thus ensuring that the optimal control will not be constrained by the boundaries. In this case, for u*(t) to minimize the Hamiltonian it is necessary (but not sufficient) that

∂H/∂u (x*(t), u*(t), λ*(t), t) = 0   (2.10)
If equation (2.10) is satisfied, and the matrix

∂²H/∂u² (x*(t), u*(t), λ*(t), t)

is positive definite, this is sufficient to guarantee that u*(t) causes H to be a local minimum; if the Hamiltonian can be expressed in the form

H(x(t), u(t), λ(t), t) = f(x, λ, t) + c^T(x, λ, t) u(t) + ½ u^T(t) R(t) u(t)   (2.11)

where c is an m × 1 array that does not contain any terms with u(t), then satisfaction of (2.10) together with ∂²H/∂u² positive definite is necessary and sufficient for H(x*(t), u*(t), λ*(t), t) to be a global minimum.

For H of the form (2.11),

∂²H/∂u² (x*(t), u*(t), λ*(t), t) = R(t)

thus, if R(t) is positive definite,

u*(t) = −R⁻¹(t) c(x*(t), λ*(t), t)

minimizes the Hamiltonian.
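This formula is easy to check numerically; the following sketch uses illustrative values of c and R (not from the text) and verifies that random perturbations around u* = −R⁻¹c only increase the u-dependent part of H:

```python
import numpy as np

# Check that u* = -R^{-1} c minimizes the u-dependent part of a
# Hamiltonian of the form (2.11), for illustrative c and R.
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])                  # positive definite
c = np.array([1.0, -3.0])
H_u = lambda u: c @ u + 0.5 * u @ R @ u     # terms of H that contain u

u_star = -np.linalg.solve(R, c)             # u* = -R^{-1} c
rng = np.random.default_rng(0)
trials = [H_u(u_star + 0.1 * rng.standard_normal(2)) for _ in range(1000)]
assert H_u(u_star) < min(trials)            # every perturbation increases H
print("u* =", u_star)
```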
Example 2.1 (Weber, 2000). Consider the problem of accelerating a skateboard in such a way as to maximize the total distance traveled in a given time T, minus the effort expended. Denote by x1(t) the distance traveled at time t. The speed x2(t) is the first derivative of x1, and let the acceleration be u(t), the first derivative of x2. The dynamical system that describes the problem is:

ẋ1(t) = x2(t)   (2.12)

ẋ2(t) = u(t)   (2.13)

The performance measure to be minimized is J = −x1(T) + ∫_0^T ½ u²(t) dt, i.e.,

h(x(T)) = −x1(T)

With H = ½ u²(t) + λ1(t) x2(t) + λ2(t) u(t), the costate equations are

λ̇1(t) = 0   (2.14)

λ̇2(t) = −λ1(t)   (2.15)

with the boundary conditions λ1(T) = ∂h/∂x1 = −1 and λ2(T) = ∂h/∂x2 = 0. From (2.14) and (2.15) we obtain:

λ1(t) = −1   (2.16)

λ2(t) = t − T   (2.17)

and ∂H/∂u = u(t) + λ2(t) = 0 gives the optimal acceleration u*(t) = T − t.
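A short numeric sanity check of this result (Python; the horizon T = 1 and the zero initial state are arbitrary illustrative choices):

```python
import numpy as np

# With lam2(t) = t - T, the condition dH/du = u + lam2 = 0 gives
# u*(t) = T - t. Compare J = -x1(T) + int 0.5*u^2 dt for u* and for
# a perturbed control; u* should give the smaller J.
T, N = 1.0, 1000
t = np.linspace(0.0, T, N + 1)
dt = T / N

def J(u):
    x2 = np.cumsum(u) * dt           # speed, x2(0) = 0
    x1 = np.cumsum(x2) * dt          # distance traveled, x1(0) = 0
    return -x1[-1] + 0.5 * np.sum(u**2) * dt

print(J(T - t))                                # about -1/6
print(J(T - t + 0.1 * np.sin(2 * np.pi * t)))  # larger
```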
Consider now the system

ẋ1(t) = x2(t)

ẋ2(t) = −x2(t) + u(t)

with the initial conditions x(t0) = x0. The performance measure to be minimized is:

J(u) = ∫_{t0}^{tf} ½ [x1²(t) + u²(t)] dt   (2.18)

tf is specified and the final state x(tf) is free.

a) Find the necessary conditions for an unconstrained control to minimize J.

H(x, u, λ) = ½ [x1²(t) + u²(t)] + λ1(t) x2(t) + λ2(t)(−x2(t) + u(t))   (2.19)

The costate equations are:

λ̇1*(t) = −∂H/∂x1 = −x1*(t)   (2.20)

λ̇2*(t) = −∂H/∂x2 = −λ1*(t) + λ2*(t)   (2.21)

with the boundary conditions (the final state is free and h = 0):

λ1*(tf) = 0,  λ2*(tf) = 0   (2.22)

From ∂H/∂u = u(t) + λ2(t) = 0, the unconstrained optimal control is u*(t) = −λ2*(t).

b) Find the necessary conditions when the admissible controls are constrained by

−1 ≤ u(t) ≤ 1   (2.23)
The state and costate equations and the boundary conditions for λ*(tf) remain unchanged; however, u*(t) must now be selected to minimize

H(x*, u, λ*) = ½ [x1*²(t) + u²(t)] + λ1*(t) x2*(t) + λ2*(t)(−x2*(t) + u(t))

subject to the constraining relation (2.23). To determine the control that minimizes H, we first separate the terms containing u(t),

½ u²(t) + λ2*(t) u(t)   (2.24)

from the Hamiltonian. For times when the optimal control is unsaturated, we have:

u*(t) = −λ2*(t)

as in part a); clearly this will occur when |λ2*(t)| ≤ 1. If, however, there are times when |λ2*(t)| > 1, then, from (2.24), the control that minimizes H is:

u*(t) = −1 for λ2*(t) > 1
u*(t) = +1 for λ2*(t) < −1

Thus u*(t) is the saturated function of λ2*(t) pictured in Figure 2.2. For the constrained control we have:

u*(t) = −1,       for λ2*(t) > 1
u*(t) = −λ2*(t),  for −1 ≤ λ2*(t) ≤ 1   (2.25)
u*(t) = +1,       for λ2*(t) < −1
[Figure 2.2: u*(t) as a saturation function of λ2*(t): the line −λ2* for −1 ≤ λ2* ≤ 1, saturated at ∓1 outside this interval.]
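The law (2.25) is a saturation nonlinearity, which can be written in one line of NumPy:

```python
import numpy as np

# Constrained optimal control (2.25): u*(t) = -sat(lam2*(t)).
def u_star(lam2):
    return -np.clip(lam2, -1.0, 1.0)

print(u_star(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))  # [ 1.   0.5 -0.  -0.5 -1. ]
```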
2.2

Other necessary conditions may be obtained for the special case when the Hamiltonian does not depend explicitly on time, i.e. the optimal control problem is stated as follows:
Determine the optimal control u*(t) that causes the system

ẋ(t) = f(x(t), u(t)), t0 ≤ t ≤ tf   (2.26)

to follow an admissible trajectory that minimizes the performance measure

J = h(x(tf)) + ∫_{t0}^{tf} g(x(t), u(t)) dt   (2.27)

Since f and g do not depend explicitly on t, neither does the Hamiltonian, and if the final time is free the boundary condition (2.8) gives:

H(x*(tf), u*(tf), λ*(tf)) = 0   (2.28)

The total derivative of the Hamiltonian with respect to time is:

dH/dt = (∂H/∂x)^T ẋ(t) + (∂H/∂λ)^T λ̇(t) + (∂H/∂u)^T u̇(t)   (2.29)

Using the necessary conditions for the optimal trajectory x*(t):

ẋ*(t) = ∂H/∂λ   (2.30)

λ̇*(t) = −∂H/∂x   (2.31)

∂H/∂u = 0   (2.32)

we obtain

dH/dt = (∂H/∂x)^T (∂H/∂λ) − (∂H/∂λ)^T (∂H/∂x) + 0 = 0   (2.33)

so that, along the optimal trajectory,

H(x*(t), u*(t), λ*(t)) = constant, t ∈ [t0, tf]   (2.36)

But the Hamiltonian is constant along the optimal trajectory; therefore from (2.36) and (2.28) we obtain:

H(x*(t), u*(t), λ*(t)) = 0, t0 ≤ t ≤ tf   (2.37)
3 Minimum time problems. Bang-bang control
Example 3.1. Consider the problem of accelerating a skateboard in such a way as to bring it to rest at a given position in minimum time. Denote by x1(t) the distance traveled at time t, ẋ1(t) = x2(t) the velocity, and ẍ1(t) = ẋ2(t) = u(t) the acceleration. The model of the dynamical system is:

ẋ1(t) = x2(t);  ẋ2(t) = u(t)

with the final states specified, x1(T) = x2(T) = 0, and the control constrained by |u(t)| ≤ 1. We want to determine the minimum final time T by minimizing:

J = ∫_0^T 1 dt

The Hamiltonian is H = 1 + λ1(t) x2(t) + λ2(t) u(t), and the costate equations

λ̇1(t) = −∂H/∂x1 = 0

λ̇2(t) = −∂H/∂x2 = −λ1(t)

have the solutions:

λ1(t) = C1

λ2(t) = −C1 t + C2

The switching function λ2(t) is a linear function of time, and therefore it can change sign at most once. Minimizing H with respect to the constrained control gives u*(t) = −sign(λ2(t)), i.e.:

u*(t) = +1, for λ2(t) < 0
u*(t) = −1, for λ2(t) > 0
[Figure: the relay characteristic u*(t) = −sign(λ2(t)), taking only the values ±1.]
For u*(t) = −1:

ẋ1(t) = x2(t),  ẋ2(t) = −1

so x2(t) = x20 − t and

x1(t) = −½ (x20 − t)² + x10 + x20²/2 = −½ (x20 − t)² + k1   (3.2)

or, in the phase plane,

x1(t) = −x2²(t)/2 + k1   (3.3)

For u*(t) = +1:

ẋ1(t) = x2(t),  ẋ2(t) = +1

so x2(t) = x20 + t and

x1(t) = ½ (t + x20)² + x10 − x20²/2 = ½ (t + x20)² + k2 = x2²(t)/2 + k2   (3.4)
The equations (3.3) and (3.4) describe sets of parabolas opening about the x1 axis, as shown in Figure 3.2. Two parabolas pass through each point of the x1–x2 plane, one for u = −1 and one for u = +1.

[Figure 3.2: phase-plane parabolas for u* = −1 (constants k1 ≷ 0) and u* = +1 (constants k2 ≷ 0).]

The parabolas that pass through the origin satisfy:

x10 = −x20²/2, for u = −1, or x10 = x20²/2, for u = +1   (3.5)
The equations of the curves that transfer the system to the origin are obtained as follows:

For u*(t) = −1, with x1(T) = x2(T) = 0:

ẋ1(t) = x2(t),  ẋ2(t) = −1  ⟹  x2(t) = T − t,  x1(t) = −(T − t)²/2

so x1(t) = −x2²(t)/2.

For u*(t) = +1, with x1(T) = x2(T) = 0:

ẋ1(t) = x2(t),  ẋ2(t) = +1  ⟹  x2(t) = t − T,  x1(t) = (t − T)²/2

so x1(t) = x2²(t)/2.

Since t < T, for u = −1 we have x2(t) > 0, and for u = +1 we have x2(t) < 0. Then the equation of the switching locus can be written as:

x1(t) = −½ x2(t) |x2(t)|   (3.6)
[Figures: the switching locus in the x1–x2 plane and optimal trajectories from an initial state (x10, x20), showing the regions where u* = −1 and u* = +1.]
Define the switching function:

s(x(t)) = x1(t) + ½ x2(t) |x2(t)|   (3.7)

Notice that:

s(x(t)) > 0 implies x(t) lies on the right of the switching locus
s(x(t)) < 0 implies x(t) lies on the left of the switching locus
s(x(t)) = 0 implies x(t) lies on the switching locus

In terms of this switching function, the optimal control law is:

u* = −1 for x(t) such that s(x(t)) > 0
u* = +1 for x(t) such that s(x(t)) < 0
u* = −1 for x(t) such that s(x(t)) = 0 and x2(t) > 0
u* = +1 for x(t) such that s(x(t)) = 0 and x2(t) < 0
u* = 0 for x(t) = 0
An implementation of this optimal control law is shown in Figure 3.5.

[Figure 3.5: closed-loop implementation: the plant states x1 and x2 feed the switching function s(x), whose sign drives a relay with output ±1 into the plant input.]
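A minimal simulation sketch of this law in Python (the initial state, the Euler step size and the stopping tolerance are arbitrary illustrative choices):

```python
import numpy as np

# Bang-bang law implemented in Figure 3.5, using the switching
# function (3.7): s(x) = x1 + 0.5 * x2 * |x2|.
def u_star(x1, x2):
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0:
        return -1.0
    if s < 0:
        return +1.0
    return -1.0 if x2 > 0 else (+1.0 if x2 < 0 else 0.0)

dt = 1e-3
x1, x2, t = 2.0, 1.0, 0.0                # arbitrary initial state (x10, x20)
for _ in range(100000):
    if abs(x1) + abs(x2) < 1e-2:         # close enough to the origin
        break
    u = u_star(x1, x2)
    x1, x2 = x1 + dt * x2, x2 + dt * u   # Euler step of the double integrator
    t += dt
print(f"reached the origin (within tolerance) at t ~= {t:.3f}")
```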
4 Minimum fuel problems

4.1

4.1.1

Consider the system

ẋ(t) = f(x(t), u(t), t), t0 ≤ t ≤ tf, x(t0) = x0   (4.1)

and the performance measure

J = ∫_{t0}^{tf} Σ_{i=1}^{m} ci |ui(t)| dt   (4.2)

where the ci are positive weighting factors, with the components of the control constrained by

|ui(t)| ≤ 1, i = 1, ..., m   (4.3)
4.1.2

Consider the first-order system

ẋ(t) = −x(t) + u(t)   (4.4)

with the performance measure

J = ∫_0^T |u(t)| dt   (4.5)

the control constrained by

|u(t)| ≤ 1   (4.6)

and the boundary conditions x(0) = x0, x(T) = 0, with the final time T fixed   (4.7)

The Hamiltonian is H = |u(t)| + λ(t)(−x(t) + u(t)). The terms in the Hamiltonian that depend on the control u(t) are given by (the reduced Hamiltonian Hr):

Hr(u(t), λ(t)) = |u(t)| + λ(t) u(t)   (4.8)

Minimizing Hr over the admissible set |u(t)| ≤ 1 gives:

u*(t) = 0, for |λ(t)| < 1   (4.9)

u*(t) = −sign(λ(t)), for |λ(t)| > 1   (4.10)
or:
[Figure: the reduced Hamiltonian |u| + λu as a function of u ∈ [−1, 1] for the six cases λ > 1, λ = 1, 0 < λ < 1, −1 < λ < 0, λ = −1, and λ < −1.]

u*(t) = −1, for λ(t) > 1
u*(t) = 0,  for −1 < λ(t) < 1   (4.11)
u*(t) = +1, for λ(t) < −1

(for |λ(t)| = 1 the minimizer of Hr is not unique).
The costate equation is

λ̇(t) = −∂H/∂x = λ(t)  ⟹  λ(t) = C e^t   (4.12)
There are five possible trajectories of λ*(t) that affect the control signal u*(t), depending on the value of the constant of integration C, as shown in Figure 4.2 (Beale, 2001).
[Figure 4.2: the five possible costate trajectories λ*(t) = C e^t, for C ≥ 1, 0 < C < 1, C = 0, −1 < C < 0, and C ≤ −1, with threshold crossings at t1 and t2.]
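As a small illustration, the dead-zone law (4.11) driven by the exponential costate (4.12) can be sampled in Python (the values of C and the time grid are arbitrary):

```python
import numpy as np

# Minimum-fuel ("dead-zone") law (4.11) driven by lam*(t) = C e^t of (4.12).
def u_star(lam):
    if lam > 1.0:
        return -1.0
    if lam < -1.0:
        return +1.0
    return 0.0

t = np.linspace(0.0, 1.0, 5)
for C in (2.0, 0.5, 0.0, -0.5, -2.0):    # the five cases of Figure 4.2
    print(C, [u_star(C * np.exp(ti)) for ti in t])
```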
Figure 4.4: Forms of the costate λ*(t) and of the optimal control u*(t) for the different ranges of C, including (a) C > 1 and (c) C < −1 (Beale, 2001).
u*(t) = 0. The solution of the state equation is:

x(t) = x0 e^{−t}   (4.14)

The exponential form of the trajectory indicates that the state will approach the final state zero as time approaches infinity, but it will never attain the value zero in finite time. Therefore u*(t) = 0 cannot be the only control value applied.
u*(t) = +1. The solution of the state equation is:

x(t) = 1 + (x0 − 1) e^{−t}   (4.15)

This state trajectory moves towards 1 as time increases, and it will pass through the value 0 only if x0 < 0. For a fixed final time T, the state will reach the origin only for one particular initial condition, determined from:

x(T) = 0 = 1 + (x0 − 1) e^{−T}  ⟹  x0 = 1 − e^T   (4.16)

u*(t) = −1. Similarly, the solution of the state equation is:

x(t) = −1 + (x0 + 1) e^{−t}   (4.17)

which passes through the value 0 only if x0 > 0, and reaches the origin at the fixed time T only for:

x0 = e^T − 1   (4.18)

For the rest of the initial conditions, switching is needed.
[Figure 4.5: state trajectories x*(t) = ±1 + (x0 ∓ 1)e^{−t} for initial conditions x0 > 0 and x0 < 0. (a) u*(t) = +1 (b) u*(t) = −1]
Figure 4.5 (a) and (b) show the state trajectories for u*(t) = +1 and u*(t) = −1. They indicate that the target set x(T) = 0 can be reached in most of the cases (when the initial condition does not have one of the values given by (4.16) or (4.18)) only after switching the control value from u*(t) = 0 to u*(t) = −1 (if the initial state x0 > 0) or to u*(t) = +1 (if the initial state x0 < 0).
4.1.3

During the first part of the interval, the control is u*(t) = 0 and the state decays freely:

x(t) = x0 e^{−t}, 0 ≤ t ≤ t1   (4.19)

After switching, the state equation (4.17) will be solved for the time interval [t1, T] with the initial condition x(t1). For x0 > 0 the control switches to u*(t) = −1 and:

x(t) = −1 + (x(t1) + 1) e^{−(t − t1)}   (4.20)

Imposing the final condition x(T) = 0:

0 = −1 + (x(t1) + 1) e^{−(T − t1)}   (4.21)

x(t1) + 1 = e^{T − t1}   (4.22)

x0 e^{−t1} + 1 = e^T e^{−t1}   (4.23)

e^{t1} + x0 = e^T,  t1 = ln(e^T − x0)   (4.24)

Similarly, for x0 < 0 the control switches to u*(t) = +1 and:

x(t) = 1 + (x(t1) − 1) e^{−(t − t1)}   (4.25)

x0 e^{−t1} − 1 = −e^T e^{−t1}   (4.26)

e^{t1} − x0 = e^T,  t1 = ln(e^T + x0)   (4.27)
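A quick numerical check of the switching time (Python; x0 and T are arbitrary illustrative values):

```python
import numpy as np

# Verify t1 = ln(e^T + x0) from (4.27) for an initial state x0 < 0:
# coast with u = 0 until t1, then apply u = +1; x(T) should be ~0.
T, x0 = 1.0, -0.5
t1 = np.log(np.exp(T) + x0)

dt, x, t = 1e-4, x0, 0.0
while t < T:
    u = 0.0 if t < t1 else 1.0
    x += dt * (-x + u)                   # plant (4.4): xdot = -x + u
    t += dt
print(f"t1 = {t1:.4f}, x(T) = {x:.4f}")
```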
4.1.4 Open-loop control

u*(t) = 0,  for x0 > 0 and t < ln(e^T − x0)
u*(t) = −1, for x0 > 0 and ln(e^T − x0) < t < T
u*(t) = 0,  for x0 < 0 and t < ln(e^T + x0)   (4.28)
u*(t) = +1, for x0 < 0 and ln(e^T + x0) < t < T
The optimal control (4.28) is in open-loop form, since the current value of the state x(t) is not used to determine the control signal. A closed-loop control would be preferable, to reduce the effects of disturbances.
4.1.5 Closed-loop control

The closed-loop control law will be obtained by solving the state equation for u*(t) = −1 and then for u*(t) = +1, with the final condition x(T) = 0. This is based on the observation that during the last part of the time interval [t1, T], the optimal control is either +1 or −1, depending on whether the initial state is negative or positive.

For x(t) > 0, solving backward from x(T) = 0 with u*(t) = −1:

x(t) = −1 + e^{T − t}   (4.29)

and for x(t) < 0, with u*(t) = +1:

x(t) = 1 − e^{T − t}   (4.30)

During the first part of the interval, t ∈ [0, t1], the optimal control is zero and

x(t) = x0 e^{−t}, t < t1   (4.31)

The switching of the control from 0 to −1 occurs when the curve (4.31) intersects the curve (4.29). We denote x(t) from (4.29) by:

s(t) = −1 + e^{T − t}   (4.32)

and call it the equation of the switching curve. The control will be 0 for all state values smaller than s(t), and it will switch to −1 when they become equal.

For negative values of the state, x(t) < 0, we apply a similar reasoning and find that the control is 0 for −s(t) < x(t) < 0 and switches to +1 when x(t) = −s(t).

The closed-loop form of the optimal control can be written as:

u*(t) = 0,  for −s(t) < x(t) < s(t)
u*(t) = −1, for x(t) ≥ s(t)   (4.33)
u*(t) = +1, for x(t) ≤ −s(t)
[Figure: the free trajectory x*(t) = x0 e^{−t} meets the switching curve s(t) at t1, after which u*(t) = −1 drives the state to zero at T.]
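A minimal closed-loop simulation sketch (Python; the horizon and the initial state are arbitrary illustrative values):

```python
import numpy as np

# Closed-loop minimum-fuel law (4.33), with s(t) = -1 + e^(T-t) from (4.32).
def u_star(x, t, T):
    s = -1.0 + np.exp(T - t)
    if x >= s:
        return -1.0
    if x <= -s:
        return +1.0
    return 0.0

T, dt = 1.0, 1e-4
x, t = 0.8, 0.0                          # reachable since |x0| < e^T - 1
while t < T:
    x += dt * (-x + u_star(x, t, T))     # plant (4.4)
    t += dt
print(f"x(T) = {x:.4f}")                 # approximately zero
```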
5 Exercises

1. For the problem stated in (5.1)–(5.2), show that the optimal control is:

(a)
u*(t) = −½ e^{t−1}   (5.3)

(b) if 2M < 1:

u*(t) = −½ e^{t−1}, for 0 ≤ t ≤ 1 + ln 2M   (5.4)
u*(t) = −M,         for 1 + ln 2M ≤ t ≤ 1
2. Consider the water tank system

ẋ(t) = −0.1 x(t) + u(t)   (5.5)

where x(t) is the height of the water and u(t) is the net inflow rate of water at time t. Assume 0 ≤ u(t) ≤ M. Find the optimal control law if it is desired to minimize:

J = −∫_0^{100} x(t) dt   (5.6)
3. For the system given by (5.7)–(5.8), find the optimal control law that brings the system from the initial state x(0) = [1, 0]^T to the final state x(1) = [0, 0]^T and minimizes the performance measure:

J = ½ ∫_0^1 u²(t) dt   (5.9)
4. Find the optimal control law that minimizes

J = ∫ x(t) dt   (5.10)

for the system

ẋ(t) = u(t), x(0) = 1   (5.11)

subject to the constraint (5.12).

5. Show that the control that minimizes the performance measure

J = 2x(1) + ∫_0^1 [x(t) + ½ u²(t)] dt   (5.15)

for the system

ẋ(t) = −x(t) + u(t)   (5.13)

in the presence of the constraint |u(t)| ≤ M   (5.14)

is the optimal control:

u*(t) = −λ(t),          if |λ(t)| ≤ M   (5.16)
u*(t) = −M sign(λ(t)),  if |λ(t)| > M

where the costate satisfies:

λ̇(t) = λ(t) − 1, λ(1) = 2   (5.17)

i.e., λ*(t) = 1 + e^{t−1}, and hence:

a)
u*(t) = −M   (5.18)
if M ≤ 1 + e^{−1}

b)
u*(t) = −1 − e^{t−1}   (5.19)
if M ≥ 2

c)
u*(t) = −1 − e^{t−1}, if 0 ≤ t ≤ 1 + ln(M − 1)   (5.20)
u*(t) = −M,           if 1 + ln(M − 1) < t ≤ 1

if 1 + e^{−1} ≤ M ≤ 2.
Bibliography

Beale, G. (2001).

Kirk, D. E. (2004). Optimal control theory: an introduction. Dover Publications.

Weber, R. (2000). Optimization and control. Online at