Optimal Control Theory

Federico De Angelis
Chapter 1
Now, we have said that the variables describe the evolution of these quantities (GDP, interest rate, ...) in time. Next, we have to decide how they do this.
If a variable is a function of time, it will have a form like GDP(t) = t^2, r(t) = t - 2. So, either we already know these variables, in which case we speak of exogenous variables; or they may be yet to be determined, in which case we speak of endogenous variables. In the latter case, the functions that describe the motion of the variables are unknown. How do we find an unknown function? Just as in high school we learnt to solve equations like 2x + 3 = x^2, here we have to solve for functions, that is, equations like f(x) = f(x)^2 + 1. More precisely, in dynamic models, we usually know
1. The form of the derivatives of our variables.
2. The value of the variables at a specific point in time.
For example, we may know that GDP at time 0 equals 1000, and that its derivative, GDP′(t), equals public expenditure minus interest paid. More specifically, we are now talking about dynamic models in continuous time.
For example, going back to our dynamic model from before, it could have this form:
dynamics f_1, f_2, ..., f_n. To do this, we need to split the variables into two kinds: state variables and control variables. State variables will still have the form like in system (1.10); this time, their system of equations will change into:
\begin{aligned}
\dot{x}_1 &= f_1(t, x, u) \\
\dot{x}_2 &= f_2(t, x, u) \\
&\;\;\vdots \\
\dot{x}_{n-1} &= f_{n-1}(t, x, u) \\
\dot{x}_n &= f_n(t, x, u)
\end{aligned}
\tag{1.10}
\dot{x} = 3t + u \tag{1.11}

Choosing, for example, the control u = 4t, the dynamics become

\dot{x} = 3t + 4t = 7t \tag{1.12}
The idea behind optimal control is to use the controls to drive the state variables, through their dynamics, towards some objective. For example, suppose I have, again,

\dot{x} = 3t + u \tag{1.13}
What if I want to minimize the value x takes at a certain time (for example, at t = 10), while, at the same time, limiting the values u takes during this “journey” from, for example, t = 0 to t = 10? An infinite variety of problems can be formulated in this way.
In this thesis, I will focus on the problems where the objective can be expressed as an integral, over an interval of time, of some function of time, x and u. That is, an objective function of the form
\int_{t_0}^{t_1} f\bigl(t, x(t), u(t)\bigr)\, dt \tag{1.14}
In the last case, this can be seen as: minimize the integral of x over [0, 10] with |u(t)| ≤ α for all t ∈ [0, 10].
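To make this concrete, here is a minimal numerical sketch (not from the text): it takes the dynamics ẋ = 3t + u of (1.13) with an assumed initial state x(0) = 0, integrates them with a simple Euler scheme, and compares the objective ∫x dt for a few constant controls satisfying |u| ≤ α. The function names and the value of α are illustrative assumptions.

    def simulate(u, x0=0.0, t0=0.0, t1=10.0, n=10_000):
        """Euler-integrate x' = 3t + u(t) and return the objective, the integral of x."""
        dt = (t1 - t0) / n
        x, t, objective = x0, t0, 0.0
        for _ in range(n):
            objective += x * dt          # accumulate the objective  ∫ x dt
            x += (3 * t + u(t)) * dt     # state equation (1.13), one Euler step
            t += dt
        return objective

    alpha = 2.0                          # assumed bound on the control
    for c in (-alpha, 0.0, alpha):       # a few admissible constant controls
        print(c, simulate(lambda t, c=c: c))

As one would expect, the most negative admissible control gives the smallest integral of x, since u pushes x down at every instant.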
Integrals are the way we sum infinitely many small quantities: for example, on the time period from 0 to 10, if my derivative is x′(t) = t, and I start at time 0 with a quantity of 100, I will get, at time 10, a quantity of

x(10) = \underbrace{100}_{\text{initial quantity}} + \int_0^{10} t \, dt \tag{1.15}

and we have

\int_0^{10} t \, dt = 50 \tag{1.16}

so, at time 10, I shall have 100 + 50 = 150.
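As a quick numerical check of (1.15)-(1.16), the following sketch (illustrative, with an arbitrary step size) accumulates the derivative x′(t) = t step by step:

    x, t, dt = 100.0, 0.0, 0.0001   # start from the initial quantity x(0) = 100
    while t < 10.0:
        x += t * dt                 # add the instantaneous change x'(t) = t
        t += dt
    print(x)                        # ≈ 150, matching 100 + 50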
So, to sum up, you can imagine the state variable x (in the last example, my wallet) as a quantity that changes in time; its instantaneous change is given by its derivative, x′(t), which tells us how fast x is changing, and we can compute the total change by integrating x′ over the time period of interest. You could think of x(t) as a liquid, and of x′(t) as the water flowing through a tube, which can either fill up the container if positive, or drain it if negative. In the next paragraph, I give the elements that make up an optimal control problem.

Figure 1.1:
3. State equations: the equations that link the state variables and the control variables, thus determining the direction taken by the state variables once we have chosen the control variables. In a continuous-time framework, these will be differential equations while, in discrete time, we will have difference equations.

4. Initial condition: the starting point, given exogenously, of the state variables.

5. Objective functional: a functional, depending on both the state and the control variables, that we want to maximize/minimize.
The scheme below shows the relationship between control variables and state variables: given the value of the state variable and the control variable at a certain time t, that is, x(t) (state variable) and u(t) (control variable), the next value of the state variable is determined by its state equation.

Figure 1.2:

If we are in a continuous-time framework, the next value of the state variable is to be understood as the value of x at a time very near to t, that is, t + δt; in a discrete-time framework, this value is what x takes one period ahead, so at t + 1. The equations will represent the derivative of the state variable (in continuous time), and the change (or “jump”) of the state variable in discrete time. Control variables are used to steer state variables in some direction, with the aim of maximizing some functional; for example, we may want x(t) to reach a certain state X, starting at X_0 at time 0, as quickly as possible.
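As a small illustration of this stepping mechanism in discrete time (the state equation g, the horizon and the control values below are my own assumptions, not taken from the text):

    def g(t, x, u):
        # hypothetical state equation: maps (t, x(t), u(t)) to x(t + 1)
        return x + u - 0.1 * x

    x = 0.0                      # initial condition, given exogenously
    controls = [1.0] * 20        # a control value chosen for each period
    for t, u in enumerate(controls):
        x = g(t, x, u)           # the state equation determines the next state
    print(x)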
1.4.1 Closed/open loop and stochasticity
A fundamental distinction made in the context of optimal control theory is whether the control function, u(t), is chosen entirely at the beginning of the process (at t_0), or is determined step by step as the process goes on. In the first case, the control is said to be in open-loop form: this means that the control is entirely predetermined for the whole period. In the second case, we have a closed-loop control.
Figure 1.3:
In closed-loop form, instead, the control works as a feedback from the actual state, x(t), to the control itself: at each time, u(t) takes the best value depending on the value of the state variable observed at that time. This is why closed-loop controls are also called “feedback” controls.
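The distinction can be sketched in code: an open-loop control is a function of time alone, fixed at t_0, while a closed-loop control is a function of the state observed along the way. The dynamics, the noise and both control rules below are illustrative assumptions:

    import random

    def step(x, u, dt=0.1):
        # hypothetical noisy dynamics; the disturbance is what makes feedback useful
        return x + (u - x) * dt + random.gauss(0.0, 0.1) * dt

    x_open = x_closed = 0.0
    for k in range(100):
        x_open = step(x_open, 1.0)                 # open-loop: u fixed in advance
        x_closed = step(x_closed, 2.0 - x_closed)  # closed-loop: u depends on x(t)
    print(x_open, x_closed)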
This was a very brief introduction to the subject of optimal control theory. In the next section, I briefly explain what a mathematical model is; in the following section, I report some examples of old, and recent, optimal control problems.

Figure 1.4:
\frac{dx}{dt} = \dot{x} = g(t, x, u) \tag{1.17}
The objective of the optimal control problem will always be to maximize (or minimize) an integral like

\int_{t_0}^{t_1} f(t, x, u)\, dt \tag{1.18}

where f(t, x, u) is a function that depends on t, x(t) and u(t), just as g(t, x, u) does in the derivative of x(t), equation (1.17).
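Putting (1.17) and (1.18) together, the problem can be stated compactly as below; this is a standard formulation, with the initial condition x(t_0) = x_0 written explicitly as an assumption:

\begin{aligned}
\max_{u(\cdot)}\quad & \int_{t_0}^{t_1} f\bigl(t, x(t), u(t)\bigr)\, dt \\
\text{subject to}\quad & \dot{x}(t) = g\bigl(t, x(t), u(t)\bigr),\qquad x(t_0) = x_0 .
\end{aligned}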
Then, if u(t) is an optimal control for problem P, there must be a continuous function p : [t_0, t_1] → R such that:

1. u(t) ∈ arg max_u H(t, x, u, p) for all t ∈ [t_0, t_1].

2. \dot{p} = -\frac{\partial}{\partial x} H(t, x, u, p) for a.e. t ∈ [t_0, t_1].

3. p(t_1) = 0.
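Here H denotes the Hamiltonian of the problem which, for the dynamics (1.17) and the objective (1.18), takes the standard form

H(t, x, u, p) = f(t, x, u) + p \, g(t, x, u),

so that condition 2 reads \dot{p} = -f_x(t, x, u) - p\, g_x(t, x, u).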
Figure 1.5:
Figure 1.6:
Another example of an old problem, which goes back to Virgil's Aeneid, is the story of Queen Dido. According to the legend, she was the daughter of a Phoenician king of the 9th century B.C., and was forced into a long exile by her brother, Pygmalion, who had assassinated her husband; after a long journey across the Mediterranean Sea, Queen Dido ended up in Tunisia. There, king Iarbas allowed her to found her own city. The perimeter of the city would be delimited by the area she would be able to surround with the skin of a bull.
This problem is now called the isoperimetric problem; its solution is shown in the Appendix, section 1.8. It is collected in the mathematical book of Pappus of Alexandria (Mathematical Collection, Book 5), where he gathered several mathematical problems from authors such as Euclid, Archimedes and Zenodorus.
A long time later, in the 17th century, Pierre de Fermat addressed the problem of finding the path that light follows when reflected/refracted through a medium.

Figure 1.7:

He affirmed that “nature works in those ways which are easiest and fastest”. The equation

\delta \int_{x_0}^{x_1} \frac{ds}{v} = 0 \tag{1.23}

where x_0 and x_1 are two points travelled by light, v is the instantaneous velocity and ds the travelled space, expresses this finding about the behaviour of nature mathematically; in fact, δ denotes the variation of the integral around an optimal trajectory, meaning that light follows a path that minimizes time (which is given by the integral of space over velocity).
Some time later, Isaac Newton, in his Philosophiae Naturalis Principia Mathematica, studied the problem of a cone travelling through a medium, trying to find the shape of the cone that would offer the minimum drag resistance.
But the problem that really started, for the mathematical community as a whole, an era of intense and fruitful research, and that would set the premises for the birth of the Calculus of Variations, came in 1696, when the youngest member of the most famous mathematical family of the time (and of nowadays), Johann Bernoulli, posed a problem known as the brachistochrone to the scientific community.
Figure 1.8:
The next big step was made by Leonhard Euler, who developed a rigorous, complete theory that would officially become known as the Calculus of Variations. While his father, who was a Protestant minister, wished him to follow his path, Euler was noticed by Johann Bernoulli while studying at the University of Basel, and was taken by him as an apprentice. On Bernoulli's advice, Euler started studying advanced mathematics textbooks and, in 1744, published his cornerstone work, the Methodus Inveniendi Lineas Curvas [...], which is today considered the first formalization of the Calculus of Variations; in the book, the Euler equation made its first appearance:

\frac{d}{dt} L_{\dot{x}}(t, x, \dot{x}) = L_x(t, x, \dot{x}) \tag{1.24}
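As a quick illustration of how (1.24) is used: taking the simple integrand L(t, x, ẋ) = ẋ² (my choice of example, not one from the text), we have L_x = 0 and L_{\dot{x}} = 2\dot{x}, so the equation reduces to

\frac{d}{dt}(2\dot{x}) = 0 \quad\Longrightarrow\quad \ddot{x} = 0,

whose solutions are the straight lines x(t) = at + b.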
A few years later, in 1755, a teenager from Turin, Lagrange, wrote a letter to Euler in which he reported some of the results present in the Methodus Inveniendi Lineas Curvas [...], re-derived purely analytically, in contrast to the geometric passages Euler had followed in his work. This constituted a great step forward in the formalization of the Calculus of Variations; Euler did not fail to publicly recognize Lagrange's merit; moreover, his equation became known as the Euler-Lagrange equation.
results could be better expressed with a partial differential equation (now known as the Hamilton-Jacobi equation)

V_t + H(t, p, q) = 0 \tag{1.25}