Kleinman

These two pages from the IEEE Transactions on Automatic Control (February 1968) contain three items of correspondence. The main item, by David L. Kleinman, presents an iterative technique based on successive substitutions for solving the linear regulator problem with infinite terminal time and proves that the iterates converge monotonically. It is preceded by the end of an exchange in which John D. Pearson acknowledges that his approximations are not exact but reports that decomposition can produce the optimal open-loop control with less work than the Riccati solution. It is followed by a correspondence on generating first- and higher-order gradient functions of a cost function for linear systems.


114 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, FEBRUARY 1968

the so-called multilevel schemes existing today involve iterations of one form or another.

In the conclusion, Pearson states that the suggested approximations work quite well. It would be most helpful if he could provide some evidence or references to verify this statement, as well as being more explicit about the approximations themselves.

M. MAŠAK
Dept. of Elec. Engrg.
University of British Columbia
Vancouver, B. C., Canada

Author's Reply⁵

Mašak is quite correct in his conclusions that my approximations are not exact. As far as decomposition is concerned, a further "published attempt" with Reich shows that for a cascaded structure of 15 third-order systems, i.e., n = 45, decomposition produces the optimal open-loop control with less work than the Riccati solution.⁶ Unfortunately, here a feedback solution is desired, and the approximation methods are the attempts to obtain it.

Assuming p and Y constant in (11) and (12) with⁷ h = Z ? = 0 is a standard trick⁸ equivalent to assuming the reference signal fixed for the purpose of designing the feedforward. Thus $l_1(x_{k-1}, x_k)$ and $l_2(x_{k-1}, x_k)$ in (8) and (9) essentially define $\delta x^0(x_{k-1}, x_k)$, which can be used to satisfy in some manner $\delta x_k = \delta y_{k+1}$, with $\delta y_{k+1}$ defined by observation of the (k+1)th vehicle. Approximation 2 is merely integral action, based on the error between vehicle spacing, which is not too critical. The equation $d\sigma_i/d\tau = \delta y_{i+1} - \delta x_i^0$ is included only for comparison with a proposed iteration which turns out to be fairly useless compared with the conjugate gradient method.⁶ An example has appeared elsewhere.⁹ Neither approximation suggests iteration.

Finally, at the very least, these techniques reduce to proportional integral control of a stable system, a problem that is not well known for its intransigence.

Although these tricks amount to very little, they represent an attempt (possibly unsuccessful) to solve the large-scale structured feedback problem as it should be solved. I would be delighted if somebody would produce an alternative method, which was the point of my first correspondence.

JOHN D. PEARSON
Research Analysis Corp.
McLean, Va. 22101

⁵ Manuscript received September 15, 1967.
⁶ J. D. Pearson and S. Reich, "Decomposition of large optimal control problems," Proc. IEE (London), vol. 114, June 1967.
⁷ Equation numbers of the original correspondence,¹ in which l1 and l2 should read b and b.
⁸ I believe it dates back to Kalman and Englar, "Fundamental study of adaptive control systems," Wright-Patterson AFB, Dayton, Ohio, Tech. Rept. ASD-TR-61-27, 1961.
⁹ J. D. Pearson, "Multilevel control systems," presented at the IFAC Symp. on Adaptive Control, London, 1965.

On an Iterative Technique for Riccati Equation Computations

Manuscript received September 11, 1967.

In this correspondence an iterative technique for solving the linear regulator problem with infinite terminal time is presented. The scheme is based on the method of successive substitutions, which has been applied elsewhere to solve linear regulator problems,[1]-[3] albeit under the assumption of a finite terminal time. The results presented below are based on research reported by this author,[3],[5] in which successive substitution methods were applied to solve the infinite time problem. By using the concept of a cost matrix, it is directly shown that the iterations are monotonically convergent. Recently, Puri and Gruver[4] have obtained similar results by applying concepts of the Hamilton-Jacobi theory. This latter technique gives an interesting and different insight into the nature of the iterative scheme.

The linear time-invariant, completely controllable system is described by the state equations

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad x(0) = x_0. \tag{1}$$

As is well known,[6] the control $u^*(\cdot)$ that minimizes the quadratic cost functional

$$J(x_0; u(\cdot)) = \int_0^\infty \left[ x'(t)C'Cx(t) + u'(t)Ru(t) \right] dt \tag{2}$$

where $R > 0$¹ and the pair $[A, C]$ is completely observable, is given by the linear feedback law

$$u^*(x(t)) = -R^{-1}B'Kx(t) = -L^*x(t) \tag{3}$$

where $K$ is the unique positive definite solution of the algebraic equation

$$0 = KA + A'K + C'C - KBR^{-1}B'K. \tag{4}$$

In addition, $K$ has the property

$$J(x_0; u^*(\cdot)) = \min_{u(\cdot)} J(x_0; u(\cdot)) = x_0'Kx_0. \tag{5}$$

¹ If X and Y are positive semidefinite, the notation X > Y [X ≥ Y] means that the matrix X − Y is positive [semi]definite.

Before presenting the iterative scheme for determining $K$, the concept of a cost matrix is introduced. Suppose that $u_L(x(t)) = -Lx(t)$ is an arbitrary feedback law. If $u_L(\cdot)$ is applied to the system (1), the resulting cost (2) may be written as

$$J(x_0; u_L(\cdot)) = x_0'V_Lx_0 \tag{6}$$

where $V_L$ is defined as the cost matrix associated with the feedback gains $L$ and is given by²

$$V_L = \int_0^\infty e^{(A-BL)'t}\,(C'C + L'RL)\,e^{(A-BL)t}\,dt. \tag{7}$$

$V_L$ is finite if and only if the closed-loop system matrix $A - BL$ has eigenvalues with negative real parts. In this case, $V_L$ is the unique (positive definite) solution of the linear equation

$$0 = (A - BL)'V_L + V_L(A - BL) + C'C + L'RL. \tag{8}$$

² Note that the cost matrix associated with $L^* = R^{-1}B'K$ is simply $K$.

Lastly, if $L_1$ and $L_2$ are gain matrices with associated cost matrices $V_1$ and $V_2$, it can be shown[5] that

$$V_1 - V_2 = \int_0^\infty e^{(A-BL_2)'\tau}\left[ (L_1 - L_2)'R(L_1 - L_2) - (L_1 - L_2)'(B'V_1 - RL_2) - (B'V_1 - RL_2)'(L_1 - L_2) \right] e^{(A-BL_2)\tau}\,d\tau \tag{9}$$

or, alternatively,

$$V_1 - V_2 = \int_0^\infty e^{(A-BL_1)'\tau}\left[ (L_1 - L_2)'R(L_1 - L_2) - (L_1 - L_2)'(B'V_2 - RL_2) - (B'V_2 - RL_2)'(L_1 - L_2) \right] e^{(A-BL_1)\tau}\,d\tau. \tag{10}$$

Care must be exercised in using the above formulas if $V_1$ and/or $V_2$ are unbounded. We now state and prove the main result.

THEOREM

Let $V_k$, $k = 0, 1, \ldots$, be the (unique) positive definite solution of the linear algebraic equation

$$0 = A_k'V_k + V_kA_k + C'C + L_k'RL_k \tag{11}$$

where, recursively,

$$L_k = R^{-1}B'V_{k-1}, \quad k = 1, 2, \ldots, \qquad A_k = A - BL_k$$

and where $L_0$ is chosen such that the matrix $A_0 = A - BL_0$ has eigenvalues with negative real parts. Then

1) $K \le V_{k+1} \le V_k \le \cdots$, $k = 0, 1, \ldots$
2) $\lim_{k\to\infty} V_k = K$.

Proof

1) Let $V_0$ be the cost matrix associated with $L_0$. $V_0$ then satisfies (11) with $k = 0$. Now set $L_1 = R^{-1}B'V_0$ and let $V_1$ be its associated cost matrix. Using (9), in which the cross terms vanish because $B'V_0 - RL_1 = 0$, we obtain

$$V_0 - V_1 = \int_0^\infty e^{A_1'\tau}(L_0 - L_1)'R(L_0 - L_1)\,e^{A_1\tau}\,d\tau \ge 0$$

so that $V_1 \le V_0$. In addition, we have by (10), in which the cross terms vanish because $B'K - RL^* = 0$,

$$V_1 - K = \int_0^\infty e^{A_1'\tau}(L_1 - L^*)'R(L_1 - L^*)\,e^{A_1\tau}\,d\tau \ge 0.$$

Hence $V_1$ is also bounded below and therefore has finite norm. Thus $A_1$ has eigenvalues with negative real parts, and so $V_1$ satisfies (11) with $k = 1$. Repeating the above argument for $k = 2, 3, \ldots$ yields the desired result 1).

2) $\lim_{k\to\infty} V_k = V_\infty$ exists by a theorem on monotonic convergence of positive operators (Kantorovich and Akilov,[7] p. 189). Thus by taking the limit of (11) as $k \to \infty$ we obtain


$$0 = A'V_\infty + V_\infty A + C'C - V_\infty BR^{-1}B'V_\infty. \tag{12}$$

Since $K$ is the unique positive definite solution of (12), $V_\infty = K$ and the proof is completed. Q.E.D.

REMARKS

1) Since the system (1) is completely controllable, it is always possible to choose an $L_0$ such that $\mathrm{Re}(\lambda_i(A_0)) < 0$.[2],[8] It is necessary for $\mathrm{Re}(\lambda_i(A_0)) < 0$ to insure the boundedness of the cost matrix $V_0$. Otherwise the iterations may converge to an indefinite solution of (4), if they converge at all.

2) It can be shown that the iterative scheme embodied in (11) is precisely that which is obtained by applying Newton's method (in function spaces) to solve (4) (Kleinman,[5] Appendix E). However, Newton's method alone will not provide conditions that will insure monotonic convergence.

3) In addition to being monotonically convergent, the iterates $V_k$ are also quadratically convergent. This is expected from Newton's method and can be shown directly by taking the matrix norm of (10) with $L_1 = R^{-1}B'V_k = L_{k+1}$, $L_2 = R^{-1}B'K$. It can be shown that, uniformly in $k$,

$$\|V_{k+1} - K\| \le c\,\|V_k - K\|^2, \qquad c < \infty.$$

Therefore, convergence of $V_k$ to $K$ is rapid in the vicinity of $K$. This is not the case in most other iterative schemes (ASP program, Runge-Kutta, etc.), which display only linear convergence to $K$.

4) The theorem provides a useful iterative scheme for the numerical solution of (4) and the associated regulator problem. If the number of state variables $n$ is small, $V_k$ may be obtained by solving an $n(n+1)/2$-dimensional linear vector equation. A computer program for finding $K$ in this manner has been written for up to 10 state variables. Convergence to $K$ is rapid with approximately 10 iterations of (11) needed in most cases.

DAVID L. KLEINMAN
Bolt, Beranek and Newman, Inc.
Cambridge, Mass.

REFERENCES

[1] R. Bellman and R. Kalaba, "Quasi-linearization and nonlinear boundary value problems," The Rand Corp., Santa Monica, Calif., Rept. R-438, 1965.
[2] W. M. Wonham, "Lecture notes on stochastic control," Center for Dynamical Systems, Brown University, Providence, R. I., Lecture Notes 67-2, 1967.
[3] D. L. Kleinman, "On the linear regulator problem and the matrix Riccati equation," M.I.T. Electronic Systems Lab., Cambridge, Mass., Rept. 271, 1966.
[4] N. N. Puri and W. A. Gruver, "Optimal control design via successive approximations," Preprints, Joint Automatic Control Conf., Philadelphia, Pa., June 1967, pp. 335-341.
[5] D. L. Kleinman, "Suboptimal design of linear regulator systems subject to computer storage limitations," M.I.T. Electronic Systems Lab., Cambridge, Mass., Rept. 297, 1967.
[6] M. Athans and P. L. Falb, Optimal Control. New York: McGraw-Hill, 1966.
[7] L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces. New York: Macmillan, 1964.
[8] W. A. Wolovich, "On the stabilization of controllable systems" (to be published).

Simultaneous Generation of First- and Higher-Order Gradient Functions of a Cost Function

Some promising applications of the first-, second-, and higher-order gradient functions of a cost function for a linear system may be made in the gradient method of hill-climbing processes, where an efficient computer optimization may be achieved if these gradient functions are made available simultaneously. Sinusoidal perturbation signals¹ and binary m sequences² have been used extensively to derive the first-order gradient function. Recently, the use of three-level m sequences³ has been described to generate the first-order and second-order gradient functions simultaneously, but the results for the higher-order gradient functions are subject to the complications inherent in the generation of higher-order m sequences.

A method is presented here for the simultaneous generation of all the necessary gradient functions using low-frequency sinusoidal perturbation signals.

DESCRIPTION OF METHOD

The gradient function of order $m$ of a given cost function is defined as

$$g_m(K) = \frac{\partial^m P(K)}{\partial K^m} \tag{1}$$

where $P(K)$ is a cost function for a given linear system with stationary inputs, and $g_m(K)$ represents the mth-order gradient function with respect to the variable parameter $K$.

If the variable parameter is perturbed by a sinusoidal signal $\Delta K \cos \omega_p t$, the perturbed parameter is given by

$$K(t) = K + \Delta K \cos \omega_p t. \tag{2}$$

The dynamic response in the cost function may be evaluated knowing the dynamics of the process and the statistics of the input signal. It has been shown⁴ that if

$$\omega_p T \le 0.1$$

where $T$ is the system time constant, the dynamic effect of the parameter perturbation signal transmission path, from parameter variation to response in the cost function, is negligible.

Thus, for low perturbation frequencies, the response in the cost function may be evaluated simply by Taylor expansion, which yields

$$P(K + \Delta K \cos \omega_p t) = P(K) + \sum_{m=1,2,3,\ldots} \frac{1}{m!}\,(\Delta K \cos \omega_p t)^m\, g_m(K). \tag{3}$$

Define $I_n$ as the average of the cross product of the perturbed cost function $P(K(t))$ and the nth harmonic of the perturbation signal,

$$I_n = \left( P(K + \Delta K \cos \omega_p t)\,\cos n\omega_p t \right)_{\mathrm{avg}}. \tag{4}$$

The first-order and other successive higher-order gradient functions may be obtained from (4). For example, for the first-order gradient function, putting $n = 1$,

$$I_1 = \left( P(K + \Delta K \cos \omega_p t)\,\cos \omega_p t \right)_{\mathrm{avg}}. \tag{5a}$$

Substituting (3) in (4) it can be shown that

$$I_1 = \sum_{m=1,3,5,\ldots} \frac{1}{m!}\,\Delta K^m\, g_m(K)\,\left( (\cos \omega_p t)^{m+1} \right)_{\mathrm{avg}}. \tag{5b}$$

For $\Delta K / K \ll 1$, assuming that

$$\sum_{m=3,5,7,\ldots} \frac{1}{m!}\,\Delta K^m\, g_m(K)\,\left( (\cos \omega_p t)^{m+1} \right)_{\mathrm{avg}} \ll \tfrac{1}{2}\,\Delta K\, g_1(K)$$

it yields

$$I_1 \cong \tfrac{1}{2}\,\Delta K\, g_1(K) = c_1\, g_1(K)$$

or

$$g_1(K) \cong I_1 / c_1 \tag{6}$$

where $c_1 = \tfrac{1}{2}\Delta K$. In (5b) the convergence of the series is guaranteed assuming the convergence of the Taylor expansion in (3).

The second-order gradient function may be generated by putting $n = 2$ in (4). Thus, again for $\Delta K / K \ll 1$, neglecting the higher-order terms, $g_2(K)$ is approximated by

$$g_2(K) \cong I_2 / c_2 \tag{7}$$

where $c_2 = \Delta K^2 / 8$.

The process, as shown in Fig. 1, may be continued for the generation of the higher-order gradient functions. The technique may be extended to a cost function of several parameters, using uncorrelated low-frequency sinusoidal perturbation signals. In practice, the usefulness of the method for more than a couple of parameters will be limited by the cross-coupling effect due to the generation of low-frequency cross-modulating terms during the perturbation process.

The estimates of the gradient functions depend upon the amplitude of the perturbation signal $\Delta K$. A large amplitude of the perturbation signal will cause large errors in the estimate of the gradients due to the significant coupling effect of the higher-order gradients, and at the same time large $\Delta K$ will create large disturbances in the system. For small $\Delta K$, there is a possibility of large experimental errors in the measurement of these gradients. However, the method seems to have promising advantages over other techniques. First, such signals are easy to generate and second, it does not introduce unwanted disturbances in the system as introduced by the multilevel sequences. The gradient estimation under

Manuscript received August 18, 1967.
¹ P. H. Hammond and M. J. Duckenfield, "Automatic optimization by continuous perturbation of parameters," Automatica, vol. 1, pp. 147-175, 1963.
² J. L. Douce and K. C. Ng, "The use of pseudo-random binary test signals in process optimization," Proc. IFAC Symp. on the Theory of Self-Adaptive Control Systems. New York: Plenum Press (for the Instrument Society of America), 1966.
³ D. W. Clarke and K. B. Godfrey, "Simultaneous estimation of first and second derivatives of a cost function," Electronics Lett., vol. 2, p. 338, 1966.
⁴ J. L. Douce, K. C. Ng, and M. M. Gupta, "Dynamics of the parameter perturbation process," Proc. IEE (London), vol. 113, no. 6, pp. 1077-1083, 1966.
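
The harmonic-correlation estimates (6) and (7) of the gradient correspondence above can be illustrated numerically. The following is only a sketch under stated assumptions, not the author's implementation: the cost function `P` is a hypothetical cubic chosen for illustration, the response is treated as quasi-static (the low-frequency condition on the perturbation is assumed to hold, so no process dynamics are simulated), and the averages are taken over exactly one perturbation period. The function name `estimate_gradients` and its parameters are likewise assumptions.

```python
import numpy as np


def estimate_gradients(P, K, dK, n_samples=4096):
    """Estimate g1 = dP/dK and g2 = d2P/dK2 at K by sinusoidal perturbation.

    Assumes a quasi-static cost response: P is evaluated directly at the
    perturbed parameter K + dK*cos(theta), where theta stands for omega_p * t.
    """
    # Uniform samples of the phase over exactly one perturbation period.
    theta = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    response = P(K + dK * np.cos(theta))

    # I_n: average of the perturbed cost times the nth harmonic, as in (4).
    I1 = np.mean(response * np.cos(theta))
    I2 = np.mean(response * np.cos(2.0 * theta))

    c1 = dK / 2.0        # scale factor in (6)
    c2 = dK ** 2 / 8.0   # scale factor in (7)
    return I1 / c1, I2 / c2


# Hypothetical cost function (an assumption for illustration only):
# P(K) = K^3 - 2K, so g1(K) = 3K^2 - 2 and g2(K) = 6K.
def P(K):
    return K ** 3 - 2.0 * K


g1_est, g2_est = estimate_gradients(P, K=1.0, dK=0.05)
print("g1 estimate:", g1_est, "exact:", 1.0)
print("g2 estimate:", g2_est, "exact:", 6.0)
```

For this cubic the first-order estimate carries a small bias from the third-order term of the Taylor series, which shrinks as the perturbation amplitude is reduced. This is the coupling effect of the higher-order gradients that the text warns about for large amplitudes.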

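
Returning to Kleinman's correspondence, the recursion in the theorem (solve the linear equation (11) for the cost matrix, then update the gain) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the program mentioned in Remark 4: the function names, the double-integrator example, and the stabilizing initial gain `L0` are all hypothetical, and the linear equation is solved as a full n²-dimensional system by Kronecker products instead of the n(n+1)/2-dimensional form the remark describes.

```python
import numpy as np


def solve_lyapunov(Ac, Q):
    """Solve Ac' V + V Ac + Q = 0 for V by Kronecker-product vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(Ac' V) = kron(Ac', I) vec(V), vec(V Ac) = kron(I, Ac') vec(V).
    M = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    V = np.linalg.solve(M, -Q.reshape(-1)).reshape(n, n)
    return 0.5 * (V + V.T)  # remove round-off asymmetry


def kleinman_iteration(A, B, C, R, L0, iters=20):
    """Return the cost matrices V_0, V_1, ... generated by the recursion (11)."""
    if np.max(np.linalg.eigvals(A - B @ L0).real) >= 0.0:
        raise ValueError("L0 must make A - B L0 stable (see Remark 1)")
    L = L0
    history = []
    for _ in range(iters):
        Ak = A - B @ L                                 # A_k = A - B L_k
        Vk = solve_lyapunov(Ak, C.T @ C + L.T @ R @ L)  # equation (11)
        history.append(Vk)
        L = np.linalg.solve(R, B.T @ Vk)               # L_{k+1} = R^{-1} B' V_k
    return history


# Hypothetical example: double integrator with C = I and R = 1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
R = np.array([[1.0]])
L0 = np.array([[1.0, 1.0]])  # assumed stabilizing initial gain

history = kleinman_iteration(A, B, C, R, L0)
K_approx = history[-1]
print(K_approx)
```

For this example the algebraic equation (4) can be solved by hand, giving K with diagonal entries √3 and off-diagonal entries 1, so the final iterate can be compared against it. The differences between successive iterates should be positive semidefinite, which is the monotonicity asserted in part 1) of the theorem.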
