Linear Quadratic Optimal Control
Xu Chen
University of Washington
J = \frac{1}{2}\int_0^{\infty}\left[x^T(t)Qx(t) + u^T(t)Ru(t)\right]dt
J = \frac{1}{2}x^T(t_f)Sx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x^T(t)Qx(t) + u^T(t)Ru(t)\right]dt
▶ adding V(t_f) - V(t_0), with V(t) = \frac{1}{2}x^T(t)P(t)x(t), to

J = \frac{1}{2}x^T(t_f)Sx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x^T(t)Qx(t) + u^T(t)Ru(t)\right]dt

yields

J + V(t_f) - V(t_0) = \frac{1}{2}x^T(t_f)Sx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\Big[\underbrace{x^T\Big(A^TP + PA + Q + \tfrac{dP}{dt}\Big)x}_{\text{quadratic}} + \underbrace{u^TB^TPx + x^TPBu + u^TRu}_{\text{products of }x\text{ and }u}\Big]dt
equivalently,

J + V(t_f) - V(t_0) = \frac{1}{2}x^T(t_f)Sx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\Big[x^T\Big(A^TP + PA + Q + \tfrac{dP}{dt}\Big)x + \underbrace{u^TB^TPx + x^TPBu + u^TRu}_{\|R^{1/2}u + R^{-1/2}B^TPx\|_2^2 \,-\, x^TPBR^{-1}B^TPx}\Big]dt
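The completion of squares under the brace can be checked by expanding the norm (all symbols as in the derivation above):

```latex
\begin{aligned}
\left\|R^{1/2}u + R^{-1/2}B^TPx\right\|_2^2
&= \left(R^{1/2}u + R^{-1/2}B^TPx\right)^T\left(R^{1/2}u + R^{-1/2}B^TPx\right)\\
&= u^TRu + u^TB^TPx + x^TPBu + x^TPBR^{-1}B^TPx,
\end{aligned}
```

so subtracting x^TPBR^{-1}B^TPx recovers exactly the cross terms u^TB^TPx + x^TPBu + u^TRu.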
▶ the best that the control can do in minimizing the cost is to have

u(t) = -K(t)x(t) = -R^{-1}B^TP(t)x(t)

where P(t) solves

-\frac{dP}{dt} = A^TP + PA - PBR^{-1}B^TP + Q, \quad P(t_f) = S

to yield the optimal cost J^0 = \frac{1}{2}x_0^TP(t_0)x_0
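As a minimal sketch (not from the slides), the Riccati differential equation above can be integrated backward in time from P(t_f) = S with SciPy; the double-integrator plant, weights, and horizon below are assumed for illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

# assumed example: double-integrator plant with identity weights
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S = np.zeros((2, 2))      # terminal weight, P(tf) = S
t0, tf = 0.0, 5.0

def riccati_rhs(t, p_flat):
    # dP/dt = -(A'P + PA - P B R^{-1} B' P + Q)
    P = p_flat.reshape(2, 2)
    dP = -(A.T @ P + P @ A - P @ B @ np.linalg.inv(R) @ B.T @ P + Q)
    return dP.ravel()

# integrate backward in time: from tf down to t0
sol = solve_ivp(riccati_rhs, (tf, t0), S.ravel(), rtol=1e-8, atol=1e-10)
P0 = sol.y[:, -1].reshape(2, 2)        # P(t0)
K0 = np.linalg.inv(R) @ B.T @ P0       # time-varying gain at t0: u = -K(t)x
```

P0 should come out symmetric and positive semidefinite, consistent with the properties of P(t) discussed in the observations.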
UW Linear Systems (X. Chen, ME547) LQ 11 / 32
Observation 1
\dot{x}(t) = Ax(t) + Bu(t) = \underbrace{\big(A - BR^{-1}B^TP(t)\big)}_{\text{time-varying closed-loop dynamics}}x(t)
J = \frac{1}{2}x^T(t_f)Sx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x^T(t)Qx(t) + u^T(t)Ru(t)\right]dt
J^0 = \frac{1}{2}x_0^TP(t_0)x_0
▶ the minimum value J^0 is a function of the initial state x(t_0)
▶ J (and hence J^0) is nonnegative ⇒ P(t_0) is at least positive semidefinite
▶ t_0 can be taken anywhere in (0, t_f) ⇒ P(t) is at least positive semidefinite for any t
Figure: LQ example: P^*(0) = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}, P(t) = P^*(t_f - t) (plotted against time in seconds)
-\frac{dP}{dt} = A^TP + PA - PBR^{-1}B^TP + Q, \quad P(t_f) = S \qquad \text{(the Riccati differential equation)}
J^0 = \frac{1}{2}x_0^TP(t_0)x_0
-\frac{dP}{dt} = P + P + 1 = 2P + 1 \quad\Leftrightarrow\quad \frac{dP^*}{dt} = 2P^* + 1

forward integration of P^* (backward integration of P) will drive P^*(\infty) and P(0) to infinity
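This divergence can be sketched with the closed-form solution of the scalar equation above; the initial condition P^*(0) = 1 is assumed for illustration:

```python
import math

def pstar(tau, p0=1.0):
    # dP*/dtau = 2 P* + 1  =>  P*(tau) = (p0 + 1/2) e^{2 tau} - 1/2
    return (p0 + 0.5) * math.exp(2.0 * tau) - 0.5

# P* explodes as the horizon grows: P*(0) = 1, P*(2) ≈ 81.4, P*(5) ≈ 3.3e4
values = [pstar(t) for t in (0.0, 2.0, 5.0)]
```

The exponential growth of P^* confirms that no bounded steady-state solution exists for this example.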
J = \frac{1}{2}\int_{t_0}^{\infty}\left[x^T(t)Qx(t) + u^T(t)Ru(t)\right]dt
LQ ⇒ stationary LQ:

▶ Cost: J = \frac{1}{2}x^T(t_f)Sx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x^T(t)Qx(t) + u^T(t)Ru(t)\right]dt \;\Rightarrow\; J = \frac{1}{2}\int_{t_0}^{\infty}\left[x^TQx + u^TRu\right]dt
▶ Syst.: \dot{x} = Ax + Bu \;\Rightarrow\; \dot{x} = Ax + Bu with (A, B) controllable/stabilizable and (A, C) observable/detectable
Lyapunov analysis vs. stationary LQ:

▶ Cost: J = \frac{1}{2}\int_0^{\infty} x^TQ_c x\,dt vs. J = \frac{1}{2}\int_{t_0}^{\infty}\left[x^TQx + u^TRu\right]dt
▶ Syst. dynamics: \dot{x} = A_c x vs. \dot{x} = Ax + Bu, (A, B) controllable/stabilizable, (A, C) observable/detectable
▶ Key Eq.: A_c^TP + PA_c + Q_c = 0 vs. A^TP + PA - PBR^{-1}B^TP + Q = 0
▶ Optimal control: N/A vs. u(t) = -R^{-1}B^TP_+x(t)
▶ Opt. cost: J^0 = \frac{1}{2}x^T(0)P_+x(0) vs. J^0 = \frac{1}{2}x^T(t_0)P_+x(t_0)
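As a sketch of the stationary case, the algebraic Riccati equation in the right-hand column can be solved directly with SciPy; the system and weights below are assumed for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# assumed example: double integrator with identity weights
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# stabilizing solution P+ of A'P + PA - P B R^{-1} B' P + Q = 0
P_plus = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P_plus    # optimal control u = -K x
Acl = A - B @ K                        # closed-loop dynamics
```

Under the controllability/detectability conditions in the table, Acl is Hurwitz (all eigenvalues in the open left half-plane).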
Root locus

Figure: root locus of the closed-loop poles as R varies
MATLAB commands:
▶ [P, Λ, K] = care(A, B, C^TC, R)
▶ [K, P, Λ] = lqr(A, B, C^TC, R)
▶ [K, P, Λ] = lqry(sys, Qy, R)
choosing R and Q:
▶ if there is no good idea for the structure of Q and R, start with diagonal matrices
▶ gain an idea of the magnitude of each state variable and input variable; call them x_{i,max} (i = 1, ..., n) and u_{i,max} (i = 1, ..., r)
▶ make the diagonal elements of Q and R inversely proportional to x_{i,max}^2 and u_{i,max}^2, respectively
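This rule of thumb (often called Bryson's rule) can be sketched as follows; the maximum magnitudes used here are made-up examples:

```python
import numpy as np

def bryson_weights(x_max, u_max):
    """Diagonal LQ weights from expected maximum state/input magnitudes:
    Q_ii = 1/x_{i,max}^2, R_ii = 1/u_{i,max}^2."""
    Q = np.diag([1.0 / xm**2 for xm in x_max])
    R = np.diag([1.0 / um**2 for um in u_max])
    return Q, R

# e.g. states expected up to 2.0 and 0.5, a single input up to 10.0
Q, R = bryson_weights([2.0, 0.5], [10.0])
```

Each diagonal entry then normalizes the corresponding state or input by its expected range, so all terms contribute comparably to the cost.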