DR Lee Peng Hin: EE6203 Computer Control Systems 231
find the minimum fuel path from node d to the destination i. All we need to do is to begin at d and follow the arrows! The optimal control u*_k and the cost to go at each stage k are determined if we know the value of x_k. Our feedback law tells us how to get from any state to the fixed final state x_4 = x_N = i. If we change the final state, however (e.g., to x_3 = x_N = f), then the entire grid must be redone.

Note: Suppose we had attempted, in ignorance of the optimality principle, to determine an optimal route from a to i by working forward. Then a near-sighted decision maker at a would compare the costs of traveling to b and d, and decide to go to d. The next myopic decision would take him to g. From there on there is no choice: he must go via h to i. The net cost of this strategy is 1 + 2 + 4 + 2 = 9, which is non-optimal.
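The contrast between backward dynamic programming and forward myopic routing can be sketched in code. The routing grid of the figure is not reproduced in the text, so the graph below is hypothetical — its node names and edge costs are invented, chosen only so that the greedy forward path repeats the 1 + 2 + 4 + 2 = 9 cost above:

```python
# Backward dynamic programming vs. forward "myopic" routing on a small
# stage graph. This graph is hypothetical (the lecture figure is not
# reproduced here); only the greedy path cost 1 + 2 + 4 + 2 = 9 is
# taken from the text.
from functools import lru_cache

edges = {
    "a": {"b": 2, "d": 1},
    "b": {"e": 1},
    "d": {"g": 2},
    "e": {"h": 1},
    "g": {"h": 4},
    "h": {"i": 2},
    "i": {},  # destination
}

@lru_cache(maxsize=None)
def cost_to_go(node):
    """Bellman backward recursion: J(i) = 0, J(x) = min over edges of [cost + J(next)]."""
    if node == "i":
        return 0
    return min(c + cost_to_go(nxt) for nxt, c in edges[node].items())

def greedy_forward(node="a"):
    """Myopic strategy: always take the cheapest outgoing edge."""
    total = 0
    while node != "i":
        node, c = min(edges[node].items(), key=lambda t: t[1])
        total += c
    return total

print(cost_to_go("a"))   # optimal cost from a (6 for this hypothetical graph)
print(greedy_forward())  # myopic cost: 1 + 2 + 4 + 2 = 9
```

The greedy route looks locally cheap at a (cost 1 versus 2) but commits the traveller to the expensive g → h leg, exactly the failure mode described above.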
to Bellman, the optimal cost from time k on is equal to

J*_k(x(k)) = min_{u(k)} [ L_k(x(k), u(k)) + J*_{k+1}(x(k+1)) ]   (22.6)

and the optimal control u*(k) at time k is the u(k) that achieves this minimum. (22.6) is the principle of optimality of discrete time systems. Its importance lies in the fact that it allows us to optimise over only one control vector at a time by working backwards from N.

22.3 Discrete time linear quadratic regulator via dynamic programming

Let the plant

x(k + 1) = Ax(k) + Bu(k)   (22.7)

have an associated performance index

J_i = (1/2) x^T(N) S(N) x(N) + (1/2) Σ_{k=i}^{N−1} [ x^T(k) Q x(k) + u^T(k) R u(k) ]   (22.8)

with

S(N) ≥ 0,  Q ≥ 0,  R > 0
do this, use (22.7) to write

J_{N−1} = (1/2) x^T(N − 1) Q x(N − 1) + (1/2) u^T(N − 1) R u(N − 1) + (1/2) [Ax(N − 1) + Bu(N − 1)]^T S(N) [Ax(N − 1) + Bu(N − 1)]   (22.11)

Solving for the optimal control yields

u*(N − 1) = −[B^T S(N) B + R]^{−1} B^T S(N) A x(N − 1)   (22.13)

Defining

K(N − 1) = [B^T S(N) B + R]^{−1} B^T S(N) A

the optimal gain schedule is hence u*(N − 1) = −K(N − 1) x(N − 1). From (22.24), the minimum cost is

J* = (1/2) x^T(0) S(0) x(0) = 4x²(0)

23 Solution of the discrete Riccati difference equation

The difference equations employed in the design of optimal control systems are:

K(k) = [B^T S(k + 1) B + R]^{−1} B^T S(k + 1) A   (23.1a)

S(k) = [A − BK(k)]^T S(k + 1) [A − BK(k)] + K^T(k) R K(k) + Q   (23.1b)
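The one-step formula (22.13) can be checked numerically: for a fixed x(N − 1), the cost (22.11) is quadratic in u(N − 1), so the control given by (22.13) should beat any perturbed value. A plain-Python sketch, with an assumed two-state, scalar-input example (the particular A, B, Q, R, S(N), x below are illustrative, not taken from the notes):

```python
# One-step LQR check: u*(N-1) = -[B'S(N)B + R]^{-1} B'S(N)A x(N-1)   (22.13)
# minimises the one-step cost J_{N-1} of (22.11). Illustrative data only.
A = [[0.0, 1.0], [-1.0, 1.0]]   # plant matrix (assumed)
B = [0.0, 1.0]                  # single-input column, so B'SB + R is a scalar
Q = [[2.0, 0.0], [0.0, 0.0]]
R = 2.0
SN = [[2.0, 0.0], [0.0, 2.0]]   # terminal weight S(N), assumed
x = [1.0, 0.0]                  # x(N-1)

def quad(S, v):                 # v' S v
    return sum(v[i] * S[i][j] * v[j] for i in range(2) for j in range(2))

def J(u):                       # the cost (22.11) as a function of u(N-1)
    xn = [A[0][0]*x[0] + A[0][1]*x[1] + B[0]*u,
          A[1][0]*x[0] + A[1][1]*x[1] + B[1]*u]   # x(N) = Ax + Bu
    return 0.5*quad(Q, x) + 0.5*R*u*u + 0.5*quad(SN, xn)

# (22.13): u* = -(B'S(N)B + R)^{-1} (B'S(N)A) x
BtS = [B[0]*SN[0][j] + B[1]*SN[1][j] for j in range(2)]
den = BtS[0]*B[0] + BtS[1]*B[1] + R
BtSA = [BtS[0]*A[0][j] + BtS[1]*A[1][j] for j in range(2)]
u_star = -(BtSA[0]*x[0] + BtSA[1]*x[1]) / den

assert J(u_star) <= J(u_star + 0.01) and J(u_star) <= J(u_star - 0.01)
print(u_star)  # 0.5 for this data
```

Since J is strictly convex in u (the scalar B'S(N)B + R is positive), the formula's stationary point is the unique minimiser, which the perturbation check confirms.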
Sub (23.1a) into (23.1b) and rearranging,

S(k) = A^T S(k + 1) A + Q − A^T S(k + 1) B [B^T S(k + 1) B + R]^{−1} B^T S(k + 1) A   (23.2)

Note that through some lengthy manipulations involving Hamiltonians, generalised eigenvalues, etc., for the steady-state solution as N → ∞ such that the gains K(k) have become constant values, it must then be true in (23.2) that S(k) = S(k + 1) = S, a constant matrix, so that S satisfies the algebraic Riccati equation

S = A^T S A + Q − A^T S B [B^T S B + R]^{−1} B^T S A   (23.4)
• The solution of this equation can be found by recursion or by the eigenvalue-eigenvector method. We shall discuss the recursive method here:

– set N to a large value and calculate the values of the S matrix (by computer) until the matrix elements become constant values.

– the computer solution requires setting a tolerance level ε, so that the difference between every element of S(k) and the corresponding element of S(k − 1) is less than ε.

– Then we have the solution to (23.4).

Example 23.1. A second-order digital process is described by

x(k + 1) = [ 0 1 ; −1 1 ] x(k) + [ 0 ; 1 ] u(k)   (23.5)

Given that x(0) = [1 1]^T, find the optimal control u(k), k = 0, 1, 2, . . . , 7 such that the performance index

J_8 = Σ_{k=0}^{7} [ x_1²(k) + u²(k) ]   (23.6)

is minimised. We have

N = 8,  R = 2,  Q = [ 2 0 ; 0 0 ],  S(8) = [ 0 0 ; 0 0 ]

(23.1a) and (23.1b) are solved recursively to give,
S(7) = [ 2 0 ; 0 0 ],  K(7) = [ 0 0 ]
S(6) = [ 2 0 ; 0 2 ],  K(6) = [ 0 0 ]
S(5) = [ 3 −1 ; −1 3 ],  K(5) = [ −0.5 0.5 ]
S(4) = [ 3.2 −0.8 ; −0.8 3.2 ],  K(4) = [ −0.6 0.4 ]
S(3) = [ 3.23 −0.922 ; −0.922 3.69 ],  K(3) = [ −0.615 0.462 ]
S(2) = [ 3.297 −0.973 ; −0.973 3.729 ],  K(2) = [ −0.651 0.481 ]
S(1) = [ 3.301 −0.962 ; −0.962 3.75 ],  K(1) = [ −0.652 0.485 ]
S(0) = [ 3.305 −0.97 ; −0.97 3.777 ],  K(0) = [ −0.6538 0.486 ]

For large values of N, we can show that the gains approach constant values, and the constant optimal control is

K = [ −0.654 0.486 ]

obtained using these values. Note that in this example, since the pair (A, B) is completely controllable and we can find a 2 x 2 matrix C such that C^T C = Q (e.g. C = [ √2 0 ; 0 0 ]) and (A, C) is completely observable, the closed-loop system will be asymptotically stable.
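The backward recursion for Example 23.1 can be reproduced in a few lines of plain Python (scalar input, so the bracketed inverse in (23.1a) is just a division):

```python
# Backward Riccati recursion for Example 23.1:
#   K(k) = [B'S(k+1)B + R]^{-1} B'S(k+1)A                       (23.1a)
#   S(k) = A'S(k+1)A + Q - A'S(k+1)B [...]^{-1} B'S(k+1)A       (23.2)
A = [[0.0, 1.0], [-1.0, 1.0]]
B = [0.0, 1.0]
Q = [[2.0, 0.0], [0.0, 0.0]]
R = 2.0
S = [[0.0, 0.0], [0.0, 0.0]]    # S(8) = 0
K = None
for k in range(7, -1, -1):      # k = 7, 6, ..., 0
    BtS = [B[0]*S[0][j] + B[1]*S[1][j] for j in range(2)]
    den = BtS[0]*B[0] + BtS[1]*B[1] + R            # B'S(k+1)B + R, a scalar
    BtSA = [BtS[0]*A[0][j] + BtS[1]*A[1][j] for j in range(2)]
    K = [v / den for v in BtSA]                    # gain at time k
    AtSA = [[sum(A[p][i]*S[p][q]*A[q][j] for p in range(2) for q in range(2))
             for j in range(2)] for i in range(2)]
    # Since S is symmetric, the correction term is outer(BtSA, BtSA)/den.
    S = [[AtSA[i][j] + Q[i][j] - BtSA[i]*BtSA[j]/den
          for j in range(2)] for i in range(2)]

print(K)  # final gain, close to the steady-state value [-0.654, 0.486]
print(S)  # close to S(0) = [3.305 -0.97; -0.97 3.777]
```

Running this reproduces the S(k) sequence of the table (S(7) = Q, S(6) = diag(2, 2), and so on) and shows the gain settling near the constant value quoted in the notes.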
Let S = [ S11 S12 ; S12 S22 ]. Then (23.8) gives

[ S11 S12 ; S12 S22 ] = [ S11 + 2S12 + S22 + 1   S11 + S12 ; S11 + S12   S11 + 1 ] − (1/(S11 + 1)) [ (S11 + S12)²   S11(S11 + S12) ; S11(S11 + S12)   S11² ]

which gives, element by element,

S11 = S11 + 2S12 + S22 + 1 − (S11 + S12)²/(S11 + 1)   (23.9)
S12 = S11 + S12 − S11(S11 + S12)/(S11 + 1)   (23.10)
S22 = S11 + 1 − S11²/(S11 + 1)   (23.11)

(23.10) gives S12 = 1. Then from (23.9), we have

S22 = S11 − 2   (23.12)

Sub (23.12) into (23.11), we have

S11² − 3S11 − 3 = 0

which gives S11 = 3.7913; −0.7913, i.e., S11 = 3.7913, taking the positive root since S must be positive semidefinite.
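The root S11 = 3.7913 and the resulting S12, S22 can be checked directly against the component equations (23.9)–(23.11), written here as residuals that should all vanish:

```python
# Check the steady-state solution S11 = (3 + sqrt(21))/2, S12 = 1,
# S22 = S11 - 2 against the component equations (23.9)-(23.11).
from math import sqrt

S11 = (3 + sqrt(21)) / 2    # positive root of S11^2 - 3*S11 - 3 = 0
S12 = 1.0                   # from (23.10)
S22 = S11 - 2               # from (23.9), i.e. (23.12)

r9  = S11 - (S11 + 2*S12 + S22 + 1 - (S11 + S12)**2 / (S11 + 1))
r10 = S12 - (S11 + S12 - S11*(S11 + S12) / (S11 + 1))
r11 = S22 - (S11 + 1 - S11**2 / (S11 + 1))

assert max(abs(r9), abs(r10), abs(r11)) < 1e-12
print(round(S11, 4))  # 3.7913
```

Since S11² = 3S11 + 3, the ratio S11²/(S11 + 1) equals 3 exactly, which is why (23.11) collapses to S22 = S11 − 2 and all three residuals are zero.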
The closed-loop state equation, with the optimal control law u(k), is

x(k + 1) = (A − BK)x(k)

Closed-loop poles are

λ_i(A − BK) = λ_i( [ 0 0.2087 ; 1 0 ] ) = ±0.4568

To compute the optimal trajectories x(k), we can also use (23.7) and (23.13) iteratively, given that x(0) = [ 1 ; 0 ], which gives

x(1) = [ 0 ; 1 ],  x(2) = [ 0.2087 ; 0 ],  x(3) = [ 0 ; 0.2087 ]

Note that x(30) = [ 0.6206 × 10⁻¹⁰ ; 0 ] and x(N) → 0 as N → ∞.
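The trajectory can be generated by iterating the closed-loop equation with the A − BK matrix quoted above:

```python
# Iterate x(k+1) = (A - BK)x(k) with the closed-loop matrix from the notes.
Acl = [[0.0, 0.2087], [1.0, 0.0]]   # A - BK, poles at +/- sqrt(0.2087)
x = [1.0, 0.0]                      # x(0)
traj = [x]
for k in range(30):
    x = [Acl[0][0]*x[0] + Acl[0][1]*x[1],
         Acl[1][0]*x[0] + Acl[1][1]*x[1]]
    traj.append(x)

print(traj[1])   # [0.0, 1.0]
print(traj[2])   # [0.2087, 0.0]
print(traj[30])  # first component about 0.6206e-10, so x(N) -> 0
```

The state ping-pongs between the two components, shrinking by the factor 0.2087 every two steps; after 30 steps the surviving component is 0.2087¹⁵ ≈ 0.6206 × 10⁻¹⁰, matching the value quoted above.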