
7 Pontryagin principle of maximum; time-optimal control

Constrained control, bang-bang control

Zdeněk Hurák
April 27, 2017

The techniques of calculus of variations introduced in the previous lecture significantly extended our set of tools for solving optimal control problems: instead of optimizing over (finite) sequences of real numbers, we are now able to optimize over functions. Nonetheless, the methods were also subject to severe restrictions. For example, the property that the optimal control maximizes the Hamiltonian was checked by testing the derivative of the Hamiltonian, which only makes sense if 1) the Hamiltonian is differentiable with respect to the control and 2) the optimal control lies inside the set of allowable controls (a vanishing derivative of the Hamiltonian on the boundary is not a necessary condition for the Hamiltonian to achieve a maximum value there). It also appears that the classes of perturbations characterized by the 1-norm and even the 0-norm are not rich enough for practical applications; just consider switched (on-off) controls which differ in the times of switching. For all these reasons, some more advanced mathematical machinery has been developed. Unfortunately, it calls for a different and a bit more advanced mathematics than what we used in the calculus of variations. The mathematics is indeed rather involved, therefore we will only state the most important result: the powerful Pontryagin's principle of maximum. Although it sort of replaces some of the previous results (and you might start complaining why on earth we spent time with the calculus of variations), having been introduced to the calculus-of-variations style of reasoning we are now certainly well equipped to digest at least the very statement of this powerful result by Pontryagin. If you are interested (and courageous), see the book [1] or elsewhere for a proof.

1 Pontryagin’s principle of maximum


We have already seen in the calculus of variations that the Hamiltonian, when evaluated along the extremal, has the property that

H_{y'} = 0.    (1)

Combined with the second-order necessary condition of a minimum,

L_{y'y'} ≥ 0,    (2)

we concluded that the Hamiltonian is not only stationary along the extremal; it is actually maximized, since

H_{y'y'} = −L_{y'y'} ≤ 0.    (3)

This result can be written as

H(x, y^*, (y^*)', p^*) ≥ H(x, y^*, y', p^*).    (4)

This is a major observation and would never be obtained without viewing y' as a separate variable (see [2] for an insightful discussion). After a notational transition to the optimal control setting and considering the augmented Lagrangian

L_aug(t, x, u, ẋ, λ) = L(t, x, u) + λ^T (ẋ − f(x, u, t)),    (5)
the Hamiltonian can be written as

H(t, x, u, ẋ, λ) = (∂L_aug/∂ẋ)^T ẋ − L_aug,    where ∂L_aug/∂ẋ = λ,
                 = λ^T ẋ − L(t, x, u) − λ^T ẋ + λ^T f(x, u, t)
                 = λ^T f(x, u, t) − L(t, x, u).


Thanks to the fact that ẋ = f(x, u, t), the Hamiltonian can be considered as a function of t, x, u and λ:

H(t, x, u, λ).    (6)

An important observation is that u now plays a similar role to that of y' in the calculus of variations. See more on this in the book.
If we now label the set of all allowable controls as U, the result on the maximization of the Hamiltonian can then be written as

H(t, x^*, u^*, λ^*) ≥ H(t, x^*, u, λ^*)    ∀u ∈ U,    (7)

or, equivalently, as

u^* = argmax_{u ∈ U} H(t, x^*, u, λ^*).    (8)

The essence of the celebrated Pontryagin's principle is that the above condition is actually the necessary condition of optimality. The condition

H_u = 0    (9)

is just a consequence of it in the situation when H_u exists and the set of allowable controls is not bounded. Let us emphasize the fact that the control u comes from some bounded set U by writing Pontryagin's principle as
Theorem 1 (Pontryagin's principle of maximum). For a given system and a given optimality criterion, let u^* ∈ U be an optimal control. Then there exists a costate variable λ which together with the state satisfies the Hamilton canonical equations

ẋ = ∇_λ H,    (10)
λ̇ = −∇_x H,    (11)

where

H(t, x, u, λ) = λ^T(t) f(x, u, t) − L(t, x, u)    (12)

and

H(t, x^*, u^*, λ^*) ≥ H(t, x^*, u, λ^*),    u ∈ U.    (13)

Moreover, the corresponding boundary conditions must hold.




As a matter of fact, the Hamiltonian here is defined as

H(t, x, u, λ) = λ^T(t) f(x, u, t) − λ_0 L(t, x, u),    (14)

which allows for degenerate situations by setting λ_0 = 0.
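To make the maximization condition (13) concrete, here is a minimal numerical sketch that realizes it pointwise by a brute-force search over a gridded control set. The dynamics f, the running cost L, and the set U = [−1, 1] are hypothetical placeholders, not taken from the lecture:

    import numpy as np

    def hamiltonian(t, x, u, lam, f, L):
        # H(t, x, u, lam) = lam^T f(x, u, t) - L(x, u, t), as in (12)
        return lam @ f(x, u, t) - L(x, u, t)

    def argmax_hamiltonian(t, x, lam, f, L, u_grid):
        # brute-force realization of (13): pick the u in U maximizing H
        values = [hamiltonian(t, x, u, lam, f, L) for u in u_grid]
        return u_grid[int(np.argmax(values))]

    # hypothetical scalar example: xdot = -x + u, L = x^2 + u^2, U = [-1, 1]
    f = lambda x, u, t: np.array([-x[0] + u])
    L = lambda x, u, t: x[0]**2 + u**2
    u_grid = np.linspace(-1.0, 1.0, 201)
    print(argmax_hamiltonian(0.0, np.array([2.0]), np.array([1.5]), f, L, u_grid))

In this smooth case the grid search returns the interior stationary point u = 0.75, exactly what H_u = 0 would give; with a large enough costate, the maximizer moves to the boundary of U and H_u = 0 stops being the right condition.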


We could certainly rederive our previous results on LQ-optimal control with fixed and free final states. Nonetheless, this would be overkill unless we wanted to explore the bounded-controls case. There is another scenario where Pontryagin's principle is immediately needed, and that is the minimum-time problem. The task is to bring the system to a given state in the shortest time possible. Apparently, with no bounds on the controls, the time can be shrunk to zero (the control signal assuming the shape of a Dirac impulse). Therefore, bounds need to be imposed on the magnitudes of the control signals in order to compute realistic outcomes.

In order to investigate this situation, we must first know how our necessary conditions change if we relax the final time, that is, if the final time becomes one of the optimization variables.

2 Necessary conditions for a free final time


Setting the final time free means that we want to use the final time as yet another
parameter for optimization.

2.1 Free final time and free final state


Example 2.1. Perhaps by a late submission of your solution to a homework assignment in some course, you can achieve a net gain in grading, since the penalty for the late submission may be more than compensated by a much higher grade for the increased quality of the solution. Time may thus be an optimization parameter. Well, not so in our course :-)

2.1.1 Calculus of variations setting


Let us return to the calculus of variations setting with its notation. The problem we are going to solve is visualized in Fig. 1.

Figure 1: Optimizing over curves with one of their endpoints on a curve.




The trick is that the stretching or shrinking of the interval of the independent variable is done by perturbing the stationary value of the right end b of the interval with the same α that we use to perturb the functions y and y'. That is, b is perturbed by Δb = αΔx, and the perturbed cost functional is then

J(y^* + αη) = ∫_a^{b+αΔx} L(x, y^* + αη, (y^*)' + αη') dx.    (15)

Note that we have a minor technical problem here, since y^* is only defined on the interval [a, b]. But there is an easy fix: we define an extension of the function to the right of b in the form of a linear approximation given by the derivative of y^* at b. We will exploit it in a while.
Now, in order to find the variation δJ, we can either proceed by matching the Taylor expansion of the above perturbed cost functional against the general Taylor expansion and identifying the first-order term in α. Alternatively (well, in fact, equivalently), we can use the already stated fact that

δJ = (d/dα) J(y^* + αη)|_{α=0} · α.    (16)

In order to find this derivative, we have to observe that the variable with respect to which we are differentiating appears in the upper bound of the integral. Therefore we cannot just swap the order of differentiation and integration. This situation is handled by the well-known Leibniz rule for differentiation under the integral sign; look it up yourself in its full generality. In our case it leads to

(d/dα) J(y^* + αη)|_{α=0} = ∫_a^b (L_y − (d/dx) L_{y'}) η(x) dx + L_{y'}|_b η(b) + L|_b Δx,    (17)
which after multiplication by α gives the variation of the functional

δJ = ∫_a^b (L_y − (d/dx) L_{y'}) δy(x) dx + L_{y'}|_b δy(b) + L|_b · Δxα,    (18)

where Δxα = Δb.

The first two terms on the right are already known to us; the only new one is the third term. The reasoning now is pretty much the same as it was in the fixed-interval, free-end case. Among the variations δy there are also those that vanish at b; for these the last two terms disappear, and since the condition must hold for them too, the integral must be zero, which gives rise to the familiar Euler–Lagrange equation. But then the last two terms must together be zero as well, and it does not hurt to rewrite them in a complete notation to dispel any ambiguity:

(∂L(x, y(x), y'(x))/∂y')|_{x=b} δy(b) + L(x, y(x), y'(x))|_{x=b} Δb = 0.    (19)

Now, in order to get some more insight into the above condition, the relation between the participating objects can be further explored. We will do it using Fig. 1, but augmented a bit with a few labels; see Fig. 2 below.

Note that we have introduced a brand new label here, namely dy_f, for the perturbation of the value of the function y() at the end of the interval (taking into consideration that the length of the interval can change as well). We can now write

y^*(b + Δb) + δy(b + Δb) = y^*(b) + dy_f,    (20)




Figure 2: Optimizing over curves with one end of the interval of the independent variable x set free, relaxing also the value of the function there.

which after approximating each term with the first two terms of its Taylor expansion gives

y^*(b) + y^{*'}(b)Δb + δy(b) + δy'(b)Δb = y^*(b) + dy_f.    (21)

Note that the term δy'(b)Δb on the left can be neglected, since it is a product of two factors that are both of first order in the perturbation variable α. In other words, we approximate δy(b + Δb) by δy(b). In addition, the term y^*(b) can be subtracted from both sides. From what remains after these cancellations, we can conclude that

y^{*'}(b)Δb + δy(b) = dy_f,    (22)

or, equivalently,

δy(b) = dy_f − y^{*'}(b)Δb.    (23)
We will now substitute this into the general form of the boundary equation (19):

L_{y'}(b, y(b), y'(b)) · (dy_f − y^{*'}(b)Δb) + L(b, y(b), y'(b)) Δb = 0.    (24)

Collecting the terms with the two independent perturbation variables dy_f and Δb, we reformat the above expression into

L_{y'}(b, y(b), y'(b)) dy_f + (L(b, y(b), y'(b)) − L_{y'}(b, y(b), y'(b)) y^{*'}(b)) Δb = 0.    (25)

Now, since dy_f and Δb are assumed independent, the corresponding terms must simultaneously and independently equal zero, that is,

L_{y'}(b, y(b), y'(b)) = 0,    (26)
L(b, y(b), y'(b)) − L_{y'}(b, y(b), y'(b)) y^{*'}(b) = 0.    (27)

Note that the first condition actually constitutes n scalar conditions whereas the second one is just a single scalar condition, hence n + 1 boundary conditions.
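To see the boundary conditions (26) and (27) in action, here is a small symbolic check using SymPy on the arc-length Lagrangian from the minimum-distance problem (a sketch of mine, not part of the original notes):

    import sympy as sp

    yp = sp.symbols("yp")  # stands for y'(b)

    # Lagrangian of the arc-length functional: L = sqrt(1 + y'^2)
    L = sp.sqrt(1 + yp**2)

    cond1 = sp.diff(L, yp)           # (26): L_{y'} = 0 at x = b
    cond2 = L - sp.diff(L, yp) * yp  # (27): L - L_{y'} y' = 0 at x = b

    print(sp.simplify(cond1))  # yp/sqrt(yp**2 + 1): zero only for yp = 0
    print(sp.simplify(cond2))  # 1/sqrt(yp**2 + 1): never zero

The second condition cannot be satisfied here, which suggests that the shortest-path problem with both the interval length and the final value completely free is degenerate; the constrained variants discussed below are the meaningful ones.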

2.1.2 Optimal control setting


Let’s now switch back to the optimal control setting with t as the independent variable.
Recall that the optimal control problem is
 Z tf 
min φ(x(tf )) + L(x, u, t)dt . (28)
x(),u(),tf ti




subject to
ẋ(t) = f(x, u, t),    x(t_i) = r_i.    (29)
We have already seen that the integrand of the augmented cost function now contains not only the term that corresponds to the Lagrange multiplier but also the term that penalizes the state at the final time, that is,

L_aug(x, u, λ, t) = L(x, u, t) + ∂φ/∂t + (∇_x φ)^T (dx/dt) + λ^T (ẋ − f(x, u, t)),    (30)

where the two middle terms together form the total derivative dφ(x(t), t)/dt.

We then rewrite the boundary conditions (25) as

(∇_x φ + λ)^T|_{t=t_f} dx_f + (L + ∂φ/∂t − λ^T f(x, u, t))|_{t=t_f} dt_f = 0.    (31)

Since here we assume that the final time and the state at the final time are independent, this single condition breaks down into two boundary conditions¹

∇_x φ(x(t_f), t_f) + λ(t_f) = 0,    (32)
L(x(t_f), u(t_f), t_f) + ∂φ(x(t_f), t_f)/∂t − λ^T(t_f) f(x(t_f), u(t_f), t_f) = 0.    (33)

The first one actually represents n scalar conditions, the second one is just a single scalar condition. Hence, altogether we have n + 1 boundary conditions.

¹ Note that here we commit the common abuse of notation of writing the functions to be differentiated as explicitly dependent on t_f, as in ∂φ(x(t_f), t_f)/∂t. Instead, we should perhaps keep writing ∂φ(x(t), t)/∂t|_{t=t_f}, but that is tiring and the formulas would look cluttered.
Let’s try to get some more insight into this. Let’s assume now that the term
penalizing the state at the final time does not explicitly depend on time, that is,
∂φ
∂t = 0. Then the boundary condition modifies to
 
T
(∇x φ + λ)|t=tf dxf + L − λT f (x, u, t) dtf , (34)

t=tf

which can be rewritten as


T
(∇x φ + λ)|t=tf dxf − H(x, u, λ, t)|t=tf dtf , (35)

which, in turn, enforces the scalar boundary condition (on top of those other n con-
ditions)
H(x(t), u(t), λ(t), t)|t=tf dtf = 0 . (36)

This is an observation worth memorizing: for a free-final-time optimal control problem, the Hamiltonian vanishes at the end of the time interval.
Let’s now add one more observation. We could have mentioned it even in the
previous lecture since it is a general property of a Hamiltonian evaluated along the





optimal solution: the total derivative of the Hamiltonian (evaluated along the solution) with respect to time equals its partial derivative with respect to time,

dH/dt = (∂H/∂x)^T (dx/dt) + (∂H/∂λ)^T (dλ/dt) + (∂H/∂u)^T (du/dt) + ∂H/∂t = ∂H/∂t,    (37)

because along the optimal solution ∂H/∂x = −λ̇, ∂H/∂λ = ẋ and ∂H/∂u = 0, so the first two terms cancel each other and the third one vanishes.

Now, if neither the system equations nor the optimal control cost function depends explicitly on time, that is, if ∂H/∂t = 0, the Hamiltonian remains constant along the optimal solution (trajectory), that is,

H(x(t), u(t), λ(t)) = const    ∀t.    (38)

Combined with the previous result (the boundary value of H at the end of the free time interval is zero), we obtain the powerful conclusion that the Hamiltonian evaluated along the optimal trajectory is always zero in the free-final-time scenario:

H(x(t), u(t), λ(t)) = 0    ∀t.    (39)

This is a pretty insightful piece of information. Since some (numerical) techniques for optimal control are based on iterative minimization of a Hamiltonian, here we already know the minimum value.
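The cancellation of the first two terms in (37) holds for any Hamiltonian once the canonical equations are substituted; a minimal SymPy sketch (the scalar H below is a hypothetical example, any expression would do):

    import sympy as sp

    x, lam, u = sp.symbols("x lam u")

    # hypothetical scalar LQ-type Hamiltonian H(x, u, lam)
    H = lam * (-x + u) - (x**2 + u**2) / 2

    xdot = sp.diff(H, lam)    # canonical equation (10)
    lamdot = -sp.diff(H, x)   # canonical equation (11)

    # the first two terms of (37); the H_u term vanishes on the optimal control
    print(sp.simplify(sp.diff(H, x) * xdot + sp.diff(H, lam) * lamdot))  # 0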

Remark on notation. In the previous lecture (notes) we already discussed the unfortunate discrepancy between the definitions of the Hamiltonian found in the literature. Perhaps there is no need to come back to this topic because you are now aware of the problem, but I will do it anyway; my only motivation is to have the formulas at hand.

Recall that the ambiguity starts with the definition of the augmented Lagrangian. I could have easily written, instead of (30), the following:

L̂_aug(x, u, λ̂, t) = L(x, u, t) + ∂φ/∂t + (∇_x φ)^T (dx/dt) + λ̂^T (f(x, u, t) − ẋ).    (40)

The boundary condition would then modify to

(∇_x φ − λ̂)^T|_{t=t_f} dx_f + (L + ∂φ/∂t + λ̂^T f(x, u, t))|_{t=t_f} dt_f = 0,    (41)

which can be rewritten, in the case of ∂φ(x(t_f), t_f)/∂t = 0 and using the alternative definition of the Hamiltonian Ĥ = L + λ̂^T f, as

(∇_x φ − λ̂)^T|_{t=t_f} dx_f + Ĥ(x, u, λ̂, t)|_{t=t_f} dt_f = 0.    (42)

2.2 Free final time but the final state on a prescribed curve
2.2.1 Calculus of variations setting
We will now investigate the special case when the final value of the solution y(x) is required to lie on the curve described by ψ(x), that is,

y^*(b + Δb) + δy(b + Δb) = ψ(b + Δb).    (43)




Figure 3: Free final time with the final value on a curve ψ(x).

This corresponds to the situation depicted in Fig. 3.

We already discussed the terms on the left-hand side; what is new here is the term on the right. It can also be approximated by the first two terms of its Taylor expansion:

ψ(b + Δb) = ψ(b) + ψ'(b)Δb.    (44)

Therefore, we can expand (43) into

y^*(b) + (y^*)'(b)Δb + δy(b) = ψ(b) + ψ'(b)Δb,    (45)

and since ψ(b) = y^*(b), we can express δy(b) as

δy(b) = ψ'(b)Δb − (y^*)'(b)Δb    (46)

and substitute into the boundary condition (19), which after cancelling the common Δb term yields

L_{y'}(b, y(b), y'(b)) · (ψ'(b) − y'(b)) + L(b, y(b), y'(b)) = 0.    (47)

This is just one scalar boundary condition. But the n conditions stating that y(b) = ψ(b) at the right end of the interval must be added. Altogether, we have n + 1 boundary conditions.

Anyway, the above single equation is called the transversality condition, for a reason to be illuminated by the next example.
Example 2.2. To get some insight, consider again the minimum-distance problem. This time we want to find the shortest distance from a point to a curve given by ψ(x). The answer is intuitive, but let us see what our rigorous tools offer here. The EL equation stays intact, therefore we know that the shortest path is a line. It starts at (a, 0), but in order to determine its end, we need to invoke the other boundary condition. Remember that the Lagrangian is

L = √(1 + (y')²)    (48)

and

L_{y'} = y'/√(1 + (y')²).    (49)




The transversality condition boils down to

1 + y'(b)ψ'(b) = 0,    (50)

which can also be visualized using vectors in the plane:

[1  y'(b)] · [1  ψ'(b)]^T = 0.    (51)

The interpretation of this result is that our desired curve y hits the target curve ψ in a perpendicular (transverse) direction.
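The perpendicularity can also be confirmed numerically: minimize the distance from the starting point to a target curve and evaluate the transversality expression (50). A sketch with SciPy, assuming the hypothetical curve ψ(x) = 2 − x² and the starting point (0, 0); since the extremal is a straight line, only the hit point b needs to be found:

    import numpy as np
    from scipy.optimize import minimize_scalar

    a, ya = 0.0, 0.0                # starting point (a, y(a))
    psi = lambda x: 2 - x**2        # hypothetical target curve
    dpsi = lambda x: -2 * x

    # distance from (a, ya) to the point (b, psi(b)) on the curve
    dist = lambda b: np.hypot(b - a, psi(b) - ya)
    b = minimize_scalar(dist, bounds=(0.1, 2.0), method="bounded").x

    slope = (psi(b) - ya) / (b - a)  # y'(b) of the optimal straight line
    print(1 + slope * dpsi(b))       # transversality (50): approximately 0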
Understanding the boundary conditions is crucial. Let us have yet another look at the result just derived. It can be written as

L_{y'} ψ'|_b − H|_b = 0.    (52)

It follows that for a free length of the interval and a fixed value of the variable at the end of the interval, in which case

ψ(x) = c,    c ∈ R,    (53)

the transversality condition simplifies to

H(b) = 0.    (54)

2.2.2 Optimal control setting


Once again, let’s recall that the optimal control problem is
 Z tf 
min φ(x(tf )) + L(x, u, t)dt . (55)
x(),u(),tf ti

subject to

ẋ(t) = f (x, u, t), (56)


x(ti ) = ri (57)
x(tf ) = ψ(tf ). (58)

Translating the above derived transversality condition from the domain (and notation) of the calculus of variations into the optimal control setting gives

((∇_x φ + λ)^T ψ̇(t) + L + ∂φ/∂t − λ^T f(x, u, t))|_{t=t_f} = 0.    (59)

Of course, as usual, on top of this single condition, the 2n boundary conditions shown above must be added.




3 Time-optimal control for a linear system: bang-bang control

The task of bringing the system from a given state to some given final state (either a single state or a set of states) can be formulated by setting

L = 1,    (60)

which turns the cost functional into

J = ∫_{t_i}^{t_f} 1 dt = t_f − t_i.    (61)

Let us solve the task for an LTI system

ẋ = Ax + Bu,    t_i = 0,    x(t_i) = r_0,    (62)

for which we set the final desired state as

x(t_f) = 0.    (63)

As already discussed, this only makes sense if we impose some bounds on the control, therefore

|u_i(t)| ≤ 1    ∀i.    (64)
The necessary conditions can be built immediately by forming the Hamiltonian

H = λ^T (Ax + Bu) − 1    (65)

and substituting into the Hamilton canonical equations

ẋ = ∇_λ H = Ax + Bu,    (66)
λ̇ = −∇_x H = −A^T λ,    (67)

plus Pontryagin's statement about the maximization of H with respect to u:

H(t, x^*, u^*, λ^*) ≥ H(t, x^*, u, λ^*),    u_i(t) ∈ [−1, 1] ∀t.    (68)
Application of Pontryagin's principle gives

(λ^*)^T (Ax^* + Bu^*) − 1 ≥ (λ^*)^T (Ax^* + Bu) − 1,    u_i ∈ [−1, 1].    (69)

Cancelling the identical terms on both sides, we are left with

(λ^*)^T B u^* ≥ (λ^*)^T B u,    u_i ∈ [−1, 1].    (70)

If this inequality is to hold with the u on the right arbitrary (within the bounds), the only way to guarantee its validity is to have

u^* = sgn((λ^*)^T B),    (71)

where the signum function is applied elementwise. Clearly, the optimal control is switching: it only assumes the values 1 and −1. This is visualized in Fig. 4 for a scalar case (the B matrix has only a single column).
Well, in fact, to support this claim it must be rigorously excluded that the argument of the signum function, the so-called switching function, can stay at zero for longer than just a time instant (although possibly repeatedly). Check this yourself in [1] (or its online version); search for normality conditions.
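For a concrete LTI system, the costate equation (67) has the closed-form solution λ(t) = e^{−Aᵀt} λ(0), and a candidate control follows from (71). A small sketch for the double integrator treated in the next subsection; the initial costate is an arbitrary guess, and finding the initial costate that actually meets the boundary conditions is the hard part of the problem:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, 1.0], [0.0, 0.0]])  # double integrator
    b = np.array([0.0, 1.0])                # the single column of B
    lam0 = np.array([1.0, 2.0])             # hypothetical initial costate

    def u_star(t):
        # candidate time-optimal control u*(t) = sgn(lam(t)^T B), eq. (71)
        lam_t = expm(-A.T * t) @ lam0       # solves lamdot = -A^T lam
        return np.sign(b @ lam_t)

    for t in (0.0, 1.0, 3.0):
        print(t, u_star(t))  # switching function is 2 - t here: one switch, at t = 2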




Figure 4: Switching function b^T λ(t) and an optimal control u^*(t) ∈ {−1, 1} derived from it.

3.1 Time-optimal control for a double integrator system

Let us analyze the situation for a double integrator. This corresponds to a system described by Newton's second law. For a normalized mass, the state-space model is

[ẏ; v̇] = [0 1; 0 0] [y; v] + [0; 1] u.    (72)
The switching function is obviously λ_2(t), and an optimal control is given by

u(t) = sgn λ_2(t).    (73)

We do not know λ_2(t). In order to get it, we need to solve the costate equations. Indeed, we can solve them independently of the state equations, since the costate system is decoupled from them:

[λ̇_1; λ̇_2] = −[0 0; 1 0] [λ_1; λ_2],    (74)

from which it follows that

λ_1(t) = c_1    (75)

and

λ_2(t) = −c_1 t + c_2    (76)

for some constants c_1 and c_2. To determine the constants, we finally have to bring the boundary conditions into the game. The condition H(t_f) = 0, together with v(t_f) = 0, gives

λ_2(t_f) u(t_f) = 1.    (77)

We can now sketch possible profiles of the switching function; a few characteristic versions are in Fig. 5.

What we have learnt is that the costate λ_2, being an affine function of time, goes through zero at most once during the whole control interval. Therefore, there will be at most one switch of the control signal. This is a valuable observation.
We are approaching the final stage of the derivations. So far we have learnt that we only need to consider u(t) = 1 and u(t) = −1. The state equations can easily be integrated to get

v(t) = v(0) + ut,    y(t) = y(0) + v(0)t + (1/2)ut².
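The closed-form expressions can be cross-checked against a numerical integration; a quick sketch with SciPy (the initial state, horizon, and the value u = 1 are arbitrary choices):

    import numpy as np
    from scipy.integrate import solve_ivp

    y0, v0, u = 1.0, -0.5, 1.0

    # double integrator: state s = [y, v], with ydot = v, vdot = u
    sol = solve_ivp(lambda t, s: [s[1], u], (0.0, 2.0), [y0, v0],
                    rtol=1e-9, atol=1e-9)
    t_end = sol.t[-1]

    y_closed = y0 + v0 * t_end + 0.5 * u * t_end**2
    v_closed = v0 + u * t_end
    print(sol.y[0, -1] - y_closed, sol.y[1, -1] - v_closed)  # both approx. 0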




Figure 5: Possible evolutions of the costate λ_2(t) in time-optimal control.

To visualize this in the y–v domain, express t from the first equation and substitute into the second:

u(y − y(0)) = v(0)(v − v(0)) + (1/2)(v − v(0))²,

which is a family of parabolas parameterized by (y(0), v(0)). These are visualized in Fig. 6.
There is a distinguished curve in the figure, composed of two branches. It is special in that from every state on this curve, the system is brought to the origin by the corresponding setting of the control (with no further switching). This curve, called the switching curve, can be expressed as

y = (1/2)v²     for v < 0,
y = −(1/2)v²    for v > 0,

or, compactly,

y = −(1/2) v|v|.
The final step can be done using this figure. Point your finger anywhere in the plane and follow the state trajectory that emanates from that point such that the origin is reached with at most one switch. Clearly, the strategy is to set u such that it brings us to the switching curve (the red one in the figure) and then, after switching, follow the curve. That is it. This control strategy can be written as

u(t) = −1    if y(t) > −(1/2)v(t)|v(t)|, or if y(t) = −(1/2)v(t)|v(t)| and y(t) < 0,
u(t) = 1     if y(t) < −(1/2)v(t)|v(t)|, or if y(t) = −(1/2)v(t)|v(t)| and y(t) > 0,

which can be written in the compact form

u(t) = sign(−(1/2)v(t)|v(t)| − y(t)).    (78)
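For readers without Simulink, a minimal Python re-implementation of the feedback law (78) with fixed-step Euler integration; the step size is an arbitrary choice, and the initial state (1, 1) is my guess that appears to reproduce the switching times visible in the plots below:

    import numpy as np

    def bang_bang(y, v):
        # time-optimal state feedback (78) for the double integrator
        return np.sign(-0.5 * v * abs(v) - y)

    h, T = 1e-3, 5.0
    y, v = 1.0, 1.0
    u_log = []
    for k in range(int(T / h)):
        u = bang_bang(y, v)
        y, v = y + h * v, v + h * u  # explicit Euler step
        u_log.append(u)

    print(y, v)  # close to the origin at the end of the horizon
    print(int((np.abs(np.diff(u_log)) > 0).sum()))  # number of control flips

The second printout is large rather than 1: once the state reaches a small neighborhood of the origin, the discretized control starts flipping sign almost every step. This is exactly the chattering discussed below.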

A simulation scheme in Simulink is in Fig. 7 and the expected simulated optimal response is in Fig. 8.




Figure 6: Typical trajectories for both u(t) = 1 and u(t) = −1. Red is the switching curve.

Figure 7: The structure of the time-optimal controller.

Figure 8: Time-optimal response (control u, velocity v, position y versus time) obtained from numerical simulation in Simulink.

In the plots you can find a confirmation of the fact that we derived rigorously, namely that there will be at most one switch in the control signal... Oops... That is actually not quite what we see, is it? We can see two switches in the control signal. The first one happened at about 2.2 s, but the second happened close to 3.5 s. In fact, what you would experience if you ran the code in Simulink is the error statement "At time 3.449489782192745, simulation hits (1000) consecutive zero crossings." and the simulation would be terminated. Obviously, what is going on is that the simulator is tempted to include not just two but in fact a huge number of switches in the control signal as the state approaches the origin. This is quite characteristic of bang-bang control, a phenomenon called chattering. In this particular example you may decide to ignore it, since both state variables are already close enough to the origin and you may want to declare the control task finished². Generally, though, this chattering phenomenon needs to be handled somehow. Any suggestions for how to reduce it?

² Remember that we still consider control over a finite time interval, even though its length is a tunable parameter. Hence, after reaching the end of the interval, the task is over.
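As for reducing the chattering: one standard remedy, in the spirit of the PTOS literature mentioned in the next section, is to replace the hard sign function with a saturation, creating a thin linear band around the switching curve. A sketch (the band width is a tuning knob, not a value from the lecture):

    import numpy as np

    def bang_bang_smoothed(y, v, width=0.05):
        # bang-bang law (78) with sign replaced by a saturation: full +/-1
        # far from the switching curve, linear inside a band of given width
        s = -0.5 * v * abs(v) - y
        return np.clip(s / width, -1.0, 1.0)

Inside the band the controller acts as a high-gain linear feedback, which removes the infinitely fast switching at the price of a slightly longer settling time (and of a strictly suboptimal, though nearly time-optimal, response).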

4 Further reading
This lecture was prepared using [1], in particular chapter 3 (application of the calculus of variations to the general problem of optimal control) and chapter 4 (Pontryagin's






principle). We did not talk about the proof of Pontryagin's principle at the lecture, and we do not even ask the students to go through the proof in the book. Understanding the result, its roots in the calculus of variations, and how it removes the deficiencies of the calculus-of-variations-based results will suffice for our purposes.
The transition from the calculus of variations to optimal control, especially when it comes to the definition of the Hamiltonian, is somewhat tricky. Unfortunately, it is not discussed satisfactorily in the literature. Even Liberzon leaves it as an (unsolved) exercise (3.5 and 3.6) for the student. Other major textbooks avoid the topic altogether. The only treatment can be found in the famous journal paper [2], in particular the section "The first fork in the road: Hamilton" on page 39. The issue is so delicate that the authors even propose to distinguish the two types of Hamiltonian by referring to one of them as the control Hamiltonian.
The time-optimal control for linear systems, in particular bang-bang control for a double integrator, is described in sections 4.4.1 and 4.4.2 of [1]. The material is quite standard and can be found in many books and lecture notes. What is not covered, however, is the fact that without any adjustment, bang-bang control is very troublesome from an implementation viewpoint. A dedicated research thread has evolved, driven especially by the needs of the hard-disk-drive industry, called (a)proximate time-optimal control (PTOS). Many dozens of papers can be found with this keyword in the title.

References
[1] Daniel Liberzon. Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, December 2011.

[2] H. J. Sussmann and J. C. Willems. 300 years of optimal control: From the brachystochrone to the maximum principle. IEEE Control Systems, 17(3):32–44, 1997.
