Opt Switch CDC 04
School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta, GA 30332, USA
[email protected]
Keywords. Optimal Control, Hybrid Systems, Switching Surfaces, Gradient Descent, Numerical Algorithms

I. INTRODUCTION

This paper investigates an optimal-control approach to hybrid dynamical systems, where modal switching occurs whenever the state reaches a suitable switching surface. The switching surfaces are controlled by free variables (parameters), which have to be determined so as to minimize a cost-performance functional defined on the state trajectory. Application domains of such optimal control problems include robotics [1], [9], manufacturing systems [3], [8], power converters [10], and scheduling of medical treatment [18]. The problem addressed here is how to characterize the gradient of the cost functional with respect to the switching-surface control parameters, and then use it in optimization algorithms. The special structure of the hybrid dynamical system lends itself to an especially simple computation of the gradient, and holds out promise of effective optimization in the aforementioned (as well as other) application areas.

The general framework for optimal control of hybrid dynamical systems, which has influenced many subsequent developments, was defined in [6]. Following this work, Refs. [16], [17] derived variants of the maximum principle. At the same time, the question of numerical optimization
The work of Wardi has been partly supported by a grant from the Georgia Tech Manufacturing Research Center. The work of Egerstedt has been partly supported by the National Science Foundation under Grant # 0237971 ECS NSF-CAREER, and by a grant from the Georgia Tech Manufacturing Research Center.
algorithms has received significant interest. In particular, the problem of computing optimal control laws given a partition of the state space [13], or a fixed set of switching surfaces [15], [16], [17], [19], has been investigated. Refs. [2], [12], [15] addressed a timing optimization problem in piecewise-linear systems with quadratic costs, and derived homogeneous regions in the state space that determine the optimal switching times. However, the problem of optimal design of switching surfaces has not yet been fully addressed.

In this paper, the switching surfaces are defined by solution sets of equations of the form $g(x, a) = 0$, where $x \in \mathbb{R}^n$ is the state variable and $a \in \mathbb{R}^k$ is the control parameter; here $0 \in \mathbb{R}^k$, and $g : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^k$ is a continuously differentiable function. In fact, we assume that there are a number of such switching points, with possibly different switching surfaces and control parameters. The main challenge is to develop a formula for the gradient of the cost functional with respect to these control parameters that is computationally simple, so that it can be deployed in an iterative optimization procedure. A first attempt resulted in a fairly complicated and time-consuming formula [4]. This paper derives a much simpler formula by defining an appropriate costate equation. We point out that the associated optimality condition is based on variational principles and hence may be derivable from classical results on optimal control (e.g., [7], Ch. 3), but here we provide a direct derivation and proof based on the problem's specific structure. The gradient formula will then be used in a descent algorithm to optimize an example problem.

The rest of the paper is organized as follows. Section II derives the formula for the gradient, Section III presents an example, and Section IV concludes the paper.

II. PROBLEM FORMULATION AND GRADIENT FORMULA
Consider the following dynamical system defined on the interval $[0, T]$,
$$\dot{x}(t) = f(x(t), t), \qquad x(0) = x_0, \tag{1}$$
where $x(t) \in \mathbb{R}^n$, the initial condition $x_0$ and the final time $T > 0$ are given and fixed, and $f : \mathbb{R}^n \times [0, T] \to \mathbb{R}^n$ is a given function. Let $\tau_1, \ldots, \tau_N$ be a finite sequence of times, where $0 < \tau_1 < \ldots < \tau_N < T$, and let $f_i : \mathbb{R}^n \to \mathbb{R}^n$, $i = 1, \ldots, N+1$, be a given set of functions. Suppose that
$$f(x, t) = f_i(x) \quad \text{for all } t \in [\tau_{i-1}, \tau_i), \quad i = 1, \ldots, N+1, \tag{2}$$
where we define $\tau_0 = 0$ and $\tau_{N+1} = T$. The functions $f_i$ are assumed to have the following properties.

Assumption 2.1. (i) The functions $f_i$, $i = 1, \ldots, N+1$, are continuously differentiable throughout $\mathbb{R}^n$. (ii) There exists a constant $K > 0$ such that, for every $x \in \mathbb{R}^n$ and for all $i = 1, \ldots, N+1$,
$$\|f_i(x)\| \le K(\|x\| + 1). \tag{3}$$

This assumption guarantees the existence of unique solutions to equations of the form $\dot{x} = f_i(x)$, with a given initial condition $x_i$ at a time $\tau_i$, for any interval $[\tau_i, \tau_{i+1}]$. Let $L : \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function, and consider the cost functional, $J$, defined by
$$J = \int_0^T L(x(t))\,dt. \tag{4}$$

We will view $J$ as a function of the switching times $\tau_i$, $i = 1, \ldots, N$. These switching times are not independent variables, but they depend upon each other through controlled switching surfaces in the following manner. Let us define the switching surfaces by the solution sets of the equations
$$g_i(x(\tau_i), a_i) = 0, \tag{5}$$
where $a_i \in \mathbb{R}^k$ is the control parameter of the $i$th switching surface, $g_i : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^k$ is a given continuously differentiable function (and hence the zero term in the right-hand side of (5) is $0 \in \mathbb{R}^k$), and $k \le n$. The switching time $\tau_i$ is defined by
$$\tau_i = \min\{t > \tau_{i-1} : g_i(x(t), a_i) = 0\}, \tag{6}$$
where the state trajectory $\{x(t)\}$ evolves according to Eq. (1) with the given initial condition $x_0$.

The main issue of this section is a formula for the derivative $dJ/da_i$, for all $i = 1, \ldots, N$. The formula derived below makes use of the following notation:
$$x_i := x(\tau_i); \tag{7}$$
$$L_i := \frac{\partial g_i}{\partial x}(x_i, a_i)\, f_i(x_i), \tag{8}$$
where we recognize $L_i \in \mathbb{R}^k$ as the Lie derivative of $g_i$ along $f_i$; and
$$R_i := f_i(x_i) - f_{i+1}(x_i). \tag{9}$$
Furthermore, for every $i = 1, \ldots, N$, let us define the costate $p_i(\tau) \in \mathbb{R}^n$ on the interval $[\tau_i, \tau_{i+1}]$ by the following differential equation,
$$\dot{p}_i(\tau) = -\Big(\frac{\partial f_{i+1}}{\partial x}(x(\tau))\Big)^T p_i(\tau) - \Big(\frac{\partial L}{\partial x}(x(\tau))\Big)^T, \tag{10}$$
with the following recursively defined boundary conditions at the upper end-points $\tau_{i+1}$: $p_N(T) = 0$ (recall that $\tau_{N+1} := T$), and for all $i = N-1, \ldots, 1$,
$$p_i(\tau_{i+1}) = \Big(I - \frac{1}{\|L_{i+1}\|^2} R_{i+1} L_{i+1}^T \frac{\partial g_{i+1}}{\partial x}(x_{i+1})\Big)^T p_{i+1}, \tag{11}$$
where $I$ is the $n \times n$ identity matrix. We will make use of the values of these costate variables at their lower end points, $\tau_i$, and correspondingly we define the term $p_i \in \mathbb{R}^n$ by
$$p_i = p_i(\tau_i). \tag{12}$$

We will express the derivative term $dJ/da_i$ in terms of the total derivative $dJ/d\tau_i$. For the latter derivative we view $J$ as a function of $\tau_i$ in the following way: a change in $\tau_i$ will cause a change in $\tau_{i+1}$ via Eq. (5) (with $i+1$), which in turn will cause changes in $\tau_{i+2}, \ldots, \tau_N$; and all of that will cause a change in $J$ via Eq. (4). Let us fix $a_i$, $i = 1, \ldots, N$, and to ensure that the derivatives mentioned below do indeed exist, we make the following assumption.

Assumption 2.2. For all $i = 1, \ldots, N$, $L_i \ne 0$.

The derivative terms $dJ/da_i$ and $dJ/d\tau_i$ are related to each other by the following formula.

Proposition 2.1. The following equation is in force,
$$\frac{dJ}{da_i} = -\frac{dJ}{d\tau_i}\,\frac{1}{\|L_i\|^2}\, L_i^T \frac{\partial g_i}{\partial a}(x_i, a_i). \tag{13}$$
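For intuition, Eq. (13) can be specialized to a single scalar parameter. The following worked case is illustrative only: the affine surface and the vector $c$ are assumptions that do not appear in the paper.

```latex
% Assumed setting: k = 1 and g_i(x, a) = c^T x - a, with c \in \mathbb{R}^n a fixed vector.
\frac{\partial g_i}{\partial x} = c^T, \qquad
\frac{\partial g_i}{\partial a} = -1, \qquad
L_i = c^T f_i(x_i) \in \mathbb{R},
\\[4pt]
\frac{dJ}{da_i}
  = -\frac{dJ}{d\tau_i}\,\frac{1}{L_i^2}\,L_i \cdot (-1)
  = \frac{1}{c^T f_i(x_i)}\,\frac{dJ}{d\tau_i}.
```

In this assumed setting, Assumption 2.2 reads $c^T f_i(x_i) \ne 0$, that is, the trajectory is not tangent to the hyperplane at the switching point.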
Proof. Taking derivatives with respect to $a_i$ in Eq. (5) we obtain
$$\frac{\partial g_i}{\partial x}(x_i, a_i)\,\frac{dx_i}{d\tau_i}\,\frac{d\tau_i}{da_i} + \frac{\partial g_i}{\partial a}(x_i, a_i) = 0. \tag{14}$$
By the definition of the term $\tau_i$ (see (6)) we have that $dx_i/d\tau_i = f_i(x_i)$. Multiplying all terms in (14) from the left by $L_i^T$, we have that
$$\frac{d\tau_i}{da_i} = -\frac{1}{\|L_i\|^2}\, L_i^T \frac{\partial g_i}{\partial a}(x_i, a_i). \tag{15}$$
Finally, noting that $dJ/da_i = (dJ/d\tau_i)(d\tau_i/da_i)$, Eq. (13) follows from (15). □

Given $a_i$, $i = 1, \ldots, N$, all of the terms in the right-hand side of Eq. (13) but $dJ/d\tau_i$ can be directly computed from the state trajectory $x(t)$. Therefore, to complete the characterization of the derivative term $dJ/da_i$, all that remains is to compute the total derivative $dJ/d\tau_i$. This is the subject of the following proposition.

Proposition 2.2. The following equation is in force,
$$\frac{dJ}{d\tau_i} = p_i^T R_i. \tag{16}$$
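Propositions 2.1 and 2.2 can be checked numerically on a toy problem. The Python sketch below is illustrative only and is not the example studied in this paper: the scalar two-mode system ($f_1 = 1$, $f_2 = -x$), the surface $g(x, a) = x - a$, the cost $L(x) = x^2$, and all function names are assumptions, chosen because the gradient is then available in closed form, $dJ/da = a(1 + a)(1 - e^{-2(T - a)})$. The switching time is located by bisection, which assumes a single sign change of $g$ along the mode-1 trajectory.

```python
import math

# Hypothetical toy problem (not from the paper): n = k = 1, N = 1.
X0, T = 0.0, 2.0
f1 = lambda x: 1.0              # mode 1 dynamics
f2 = lambda x: -x               # mode 2 dynamics
df2_dx = lambda x: -1.0
L = lambda x: x * x             # running cost
dL_dx = lambda x: 2.0 * x
g = lambda x, a: x - a          # switching surface g(x, a) = 0
dg_dx = lambda x, a: 1.0
dg_da = lambda x, a: -1.0


def integrate(f, y, t0, t1, n=2000):
    """Classical RK4 for the autonomous system y' = f(y); t1 < t0 runs backward."""
    h = (t1 - t0) / n
    for _ in range(n):
        k1 = f(y)
        k2 = f([yi + 0.5 * h * k for yi, k in zip(y, k1)])
        k3 = f([yi + 0.5 * h * k for yi, k in zip(y, k2)])
        k4 = f([yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
             for yi, s1, s2, s3, s4 in zip(y, k1, k2, k3, k4)]
    return y


def switching_time(a):
    """Bisection for g(x(t), a) = 0 in mode 1; assumes g < 0 before the crossing."""
    lo, hi = 0.0, T
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        x_mid = integrate(lambda y: [f1(y[0])], [X0], 0.0, mid, n=200)[0]
        if g(x_mid, a) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cost_and_gradient(a):
    tau = switching_time(a)
    # Forward pass: state and accumulated cost, mode 1 then mode 2.
    x_tau, c = integrate(lambda y: [f1(y[0]), L(y[0])], [X0, 0.0], 0.0, tau)
    x_T, J = integrate(lambda y: [f2(y[0]), L(y[0])], [x_tau, c], tau, T)
    # Backward pass: costate equation (10) with p(T) = 0, integrated jointly
    # with the mode-2 state from t = T down to t = tau.
    back = lambda y: [f2(y[0]), -df2_dx(y[0]) * y[1] - dL_dx(y[0])]
    _, p_tau = integrate(back, [x_T, 0.0], T, tau)
    R = f1(x_tau) - f2(x_tau)                            # Eq. (9)
    lie = dg_dx(x_tau, a) * f1(x_tau)                    # Eq. (8)
    dJ_dtau = p_tau * R                                  # Eq. (16)
    dJ_da = -dJ_dtau * lie * dg_da(x_tau, a) / lie ** 2  # Eq. (13)
    return J, dJ_da


if __name__ == "__main__":
    a, eps = 0.5, 1e-5
    J, dJ = cost_and_gradient(a)
    # Central finite difference of the cost, for comparison with the costate formula.
    fd = (cost_and_gradient(a + eps)[0] - cost_and_gradient(a - eps)[0]) / (2 * eps)
    exact = a * (1 + a) * (1 - math.exp(-2 * (T - a)))
    print(J, dJ, fd, exact)
```

Running the script prints the cost, the costate-based gradient, a central finite-difference estimate, and the closed-form value, so the three gradient values can be compared directly. The costate route needs one forward and one backward integration per gradient, whereas the finite-difference estimate needs two extra forward simulations per parameter.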
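We break down the proof into a number of steps. By (4), we have that
$$J = \int_0^T L(x(t))\,dt = \sum_{i=0}^{N} \int_{\tau_i}^{\tau_{i+1}} L(x(t))\,dt, \tag{17}$$
and hence, since $x(t)$ does not depend on $\tau_i$ for $t < \tau_i$ and the boundary terms at the switching times cancel by the continuity of $x(\cdot)$,
$$\frac{dJ}{d\tau_i} = \sum_{j=i}^{N} \int_{\tau_j}^{\tau_{j+1}} \frac{\partial L}{\partial x}(x(t))\,\frac{dx(t)}{d\tau_i}\,dt. \tag{18}$$
Thus, we need to get an expression for the term $dx(t)/d\tau_i$ in (18). Note that in this term $t$ is fixed while $\tau_i$ is the variable with respect to which we take the derivative. Let us denote by $\Phi_{i+1}(t, \tau)$ the state transition matrix of the linearized system $\dot{z} = \frac{\partial f_{i+1}}{\partial x}(x)\,z$, and we mention the following well-known result concerning this state transition matrix.

Lemma 2.1. Let $z(\cdot) : [\tau_i, \tau_{i+1}] \to \mathbb{R}^n$ be a differentiable function, and let $r \in \mathbb{R}^n$ be a given vector. Suppose that for every $t \in [\tau_i, \tau_{i+1}]$, we have that
$$z(t) = r + \int_{\tau_i}^{t} \frac{\partial f_{i+1}}{\partial x}(x(\tau))\,z(\tau)\,d\tau. \tag{19}$$
Then
$$z(t) = \Phi_{i+1}(t, \tau_i)\,r. \tag{20}$$

Proof. Follows immediately by differentiating (19) with respect to $t$. □

Next, for all $i = 1, \ldots, N$, let us define the $n \times n$ matrices $\Psi_{j,i}$, $j = i, \ldots, N$, recursively in $j$, as follows.
$$\Psi_{i,i} = \Phi_{i+1}(\tau_{i+1}, \tau_i)^{-1}, \tag{21}$$
and for every $j = i, \ldots, N-1$,
$$\Psi_{j+1,i} = \Big(I - \frac{1}{\|L_{j+1}\|^2} R_{j+1} L_{j+1}^T \frac{\partial g_{j+1}}{\partial x}(x_{j+1})\Big)\,\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}, \tag{22}$$
where $I$ denotes the $n \times n$ identity matrix. Fix $i \in \{1, \ldots, N\}$. We have the following result.

Lemma 2.2. For every $j = i, \ldots, N$, and for every $t \in (\tau_j, \tau_{j+1})$,
$$\frac{dx(t)}{d\tau_i} = \Phi_{j+1}(t, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i. \tag{23}$$

Proof. We prove the statement by induction on $j = i, \ldots, N$. Consider first the case where $j = i$. For every $t \in (\tau_i, \tau_{i+1})$,
$$x(t) = x(\tau_i) + \int_{\tau_i}^{t} f_{i+1}(x(\tau))\,d\tau. \tag{24}$$
Taking derivatives with respect to $\tau_i$, and using (9),
$$\frac{dx(t)}{d\tau_i} = f_i(x_i) - f_{i+1}(x_i) + \int_{\tau_i}^{t} \frac{\partial f_{i+1}}{\partial x}(x(\tau))\,\frac{dx(\tau)}{d\tau_i}\,d\tau = R_i + \int_{\tau_i}^{t} \frac{\partial f_{i+1}}{\partial x}(x(\tau))\,\frac{dx(\tau)}{d\tau_i}\,d\tau. \tag{25}$$
By Lemma 2.1 as applied to $dx(t)/d\tau_i$,
$$\frac{dx(t)}{d\tau_i} = \Phi_{i+1}(t, \tau_i)\,R_i. \tag{26}$$
Consequently, and by (21), (23) follows with $j = i$.

Suppose now that (23) holds for some $j \in \{i, \ldots, N-1\}$ and for all $t \in (\tau_j, \tau_{j+1})$. We next prove it for $j+1$. Note that for every $t \in [\tau_j, \tau_{j+1}]$,
$$x(t) = x(\tau_j) + \int_{\tau_j}^{t} f_{j+1}(x(\tau))\,d\tau, \tag{27}$$
$$x(\tau_{j+1}) = x(\tau_j) + \int_{\tau_j}^{\tau_{j+1}} f_{j+1}(x(\tau))\,d\tau. \tag{28}$$
Now let us compare the derivatives with respect to $\tau_i$ in these two equations. The derivative of (27) yields $dx(t)/d\tau_i$, whose value is given by (23) by dint of the induction hypothesis. The derivative of (28) yields the same expression as the derivative in (27) (with $\tau_{j+1}$ instead of $t$) plus the additional term $f_{j+1}(x_{j+1})\frac{d\tau_{j+1}}{d\tau_i}$. In other words, we have that
$$\frac{dx(\tau_{j+1})}{d\tau_i} = \frac{dx(t)}{d\tau_i}\Big|_{t=\tau_{j+1}} + f_{j+1}(x_{j+1})\frac{d\tau_{j+1}}{d\tau_i} = \Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i + f_{j+1}(x_{j+1})\frac{d\tau_{j+1}}{d\tau_i}. \tag{29}$$
Consider next $t \in (\tau_{j+1}, \tau_{j+2})$. We have that
$$x(t) = x(\tau_{j+1}) + \int_{\tau_{j+1}}^{t} f_{j+2}(x(\tau))\,d\tau, \tag{30}$$
and by taking derivatives with respect to $\tau_i$, we obtain
$$\frac{dx(t)}{d\tau_i} = \frac{dx(\tau_{j+1})}{d\tau_i} - f_{j+2}(x_{j+1})\frac{d\tau_{j+1}}{d\tau_i} + \int_{\tau_{j+1}}^{t} \frac{\partial f_{j+2}}{\partial x}(x(\tau))\,\frac{dx(\tau)}{d\tau_i}\,d\tau. \tag{31}$$
Now plug Eq. (29) for the first term in the RHS of (31) to obtain
$$\frac{dx(t)}{d\tau_i} = \Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i + R_{j+1}\frac{d\tau_{j+1}}{d\tau_i} + \int_{\tau_{j+1}}^{t} \frac{\partial f_{j+2}}{\partial x}(x(\tau))\,\frac{dx(\tau)}{d\tau_i}\,d\tau. \tag{32}$$
By Lemma 2.1 we have, for all $t \in (\tau_{j+1}, \tau_{j+2})$,
$$\frac{dx(t)}{d\tau_i} = \Phi_{j+2}(t, \tau_{j+1})\Big(\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i + R_{j+1}\frac{d\tau_{j+1}}{d\tau_i}\Big). \tag{33}$$
The last term in (33), $d\tau_{j+1}/d\tau_i$, can be computed from (29) as follows. By definition of $\tau_{j+1}$, $g_{j+1}(x(\tau_{j+1})) = 0$. Taking derivatives with respect to $\tau_i$ we get that
$$\frac{\partial g_{j+1}}{\partial x}(x(\tau_{j+1}))\,\frac{dx(\tau_{j+1})}{d\tau_i} = 0, \tag{34}$$
and accounting for (29),
$$\frac{\partial g_{j+1}}{\partial x}(x(\tau_{j+1}))\Big(\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i + f_{j+1}(x_{j+1})\frac{d\tau_{j+1}}{d\tau_i}\Big) = 0. \tag{35}$$
Multiplying (35) from the left by $L_{j+1}^T$ we get
$$\frac{d\tau_{j+1}}{d\tau_i} = -\frac{1}{\|L_{j+1}\|^2}\, L_{j+1}^T \frac{\partial g_{j+1}}{\partial x}(x_{j+1})\,\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i. \tag{36}$$
Plugging this in (33) we obtain, after some algebra, for every $t \in (\tau_{j+1}, \tau_{j+2})$,
$$\frac{dx(t)}{d\tau_i} = \Phi_{j+2}(t, \tau_{j+1})\Big(I - \frac{1}{\|L_{j+1}\|^2} R_{j+1} L_{j+1}^T \frac{\partial g_{j+1}}{\partial x}(x_{j+1})\Big)\,\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i. \tag{37}$$
It now follows from (22) that
$$\frac{dx(t)}{d\tau_i} = \Phi_{j+2}(t, \tau_{j+1})\,\Psi_{j+1,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i, \tag{38}$$
which verifies Eq. (23) for $j+1$ and for all $t \in (\tau_{j+1}, \tau_{j+2})$, and hence completes the proof. □

Recall that the matrices $\Psi_{j,i}$ were defined by a recursive relation in the first index, $j$; see (21) and (22). We need a recursive relation in the second index, $i$, and it is given by the following result.

Lemma 2.3. For every $i = 2, \ldots, N$, and for all $j = i, \ldots, N$,
$$\Psi_{j,i-1} = \Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\Big(I - \frac{1}{\|L_i\|^2} R_i L_i^T \frac{\partial g_i}{\partial x}(x_i)\Big). \tag{39}$$

Proof. Fix $i \in \{2, \ldots, N\}$. We will prove (39) by induction on $j = i, \ldots, N$. First, consider the case where $j = i$. By (21) with $i-1$, $\Psi_{i-1,i-1} = \Phi_i(\tau_i, \tau_{i-1})^{-1}$. Therefore, and by (22), the left-hand side of (39) has the following form,
$$\Psi_{i,i-1} = I - \frac{1}{\|L_i\|^2} R_i L_i^T \frac{\partial g_i}{\partial x}(x_i).$$
By Eq. (21), the RHS of (39) (with $j = i$) has the same form. This proves (39) for $j = i$.

Next, suppose that (39) is in force for some $j \in \{i, \ldots, N-1\}$, and consider the case of $j+1$. An application of (22) yields
$$\Psi_{j+1,i-1} = \Big(I - \frac{1}{\|L_{j+1}\|^2} R_{j+1} L_{j+1}^T \frac{\partial g_{j+1}}{\partial x}(x_{j+1})\Big)\,\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i-1}, \tag{40}$$
and by using the induction hypothesis (Eq. (39)) in the last term we obtain
$$\Psi_{j+1,i-1} = \Big(I - \frac{1}{\|L_{j+1}\|^2} R_{j+1} L_{j+1}^T \frac{\partial g_{j+1}}{\partial x}(x_{j+1})\Big)\,\Phi_{j+1}(\tau_{j+1}, \tau_j)\,\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\Big(I - \frac{1}{\|L_i\|^2} R_i L_i^T \frac{\partial g_i}{\partial x}(x_i)\Big). \tag{41}$$
Now we recognize the first three multiplicative terms in the RHS of (41) as the RHS of (22), and therefore, plugging in the LHS of (22), we obtain
$$\Psi_{j+1,i-1} = \Psi_{j+1,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\Big(I - \frac{1}{\|L_i\|^2} R_i L_i^T \frac{\partial g_i}{\partial x}(x_i)\Big).$$
But this is Eq. (39) with $j+1$, thus completing the proof. □

We are now in a position to prove Proposition 2.2.

Proof of Proposition 2.2. Applying Lemma 2.2 to Eq. (18) we obtain
$$\frac{dJ}{d\tau_i} = \sum_{j=i}^{N} \int_{\tau_j}^{\tau_{j+1}} \frac{\partial L}{\partial x}(x(t))\,\Phi_{j+1}(t, \tau_j)\,dt\;\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau_i)\,R_i. \tag{42}$$
For $\tau \in [\tau_i, \tau_{i+1}]$, define $q_i(\tau) \in \mathbb{R}^n$ by
$$q_i(\tau)^T = \int_{\tau}^{\tau_{i+1}} \frac{\partial L}{\partial x}(x(t))\,\Phi_{i+1}(t, \tau)\,dt + \sum_{j=i+1}^{N} \int_{\tau_j}^{\tau_{j+1}} \frac{\partial L}{\partial x}(x(t))\,\Phi_{j+1}(t, \tau_j)\,dt\;\Psi_{j,i}\,\Phi_{i+1}(\tau_{i+1}, \tau). \tag{43}$$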
By (21) and (42), it is readily seen that
$$\frac{dJ}{d\tau_i} = q_i(\tau_i)^T R_i. \tag{44}$$
Therefore, it suffices to show that
$$p_i = q_i(\tau_i) \tag{45}$$
in order to complete the proof. Recall the definition of $p_i(\tau)$ via Eqs. (10) and (11) with the boundary condition $p_N(T) = 0$. Now Eq. (45) will be proved once we establish the following: (i) the differential equation (10) holds for $q_i(\tau)$ in the interval $[\tau_i, \tau_{i+1}]$; (ii) $q_N(T) = 0$; and (iii) Eq. (11) is in force for $q_i(\tau_i)$ in lieu of $p_i$. This is what we now do.

(i). By taking derivatives with respect to $\tau$ in (43), Eq. (10) is satisfied for $q_i(\tau)$.

(ii). By (43) and the fact that $\tau_{N+1} = T$, it is evident that $q_N(T) = 0$.

(iii). By Eq. (43),
$$q_i(\tau_{i+1})^T = \sum_{j=i+1}^{N} \int_{\tau_j}^{\tau_{j+1}} \frac{\partial L}{\partial x}(x(t))\,\Phi_{j+1}(t, \tau_j)\,dt\;\Psi_{j,i}. \tag{46}$$
Apply Lemma 2.3 with $i+1$ instead of $i$ to the last term of (46) to obtain
$$q_i(\tau_{i+1})^T = \sum_{j=i+1}^{N} \int_{\tau_j}^{\tau_{j+1}} \frac{\partial L}{\partial x}(x(t))\,\Phi_{j+1}(t, \tau_j)\,dt\;\Psi_{j,i+1}\,\Phi_{i+2}(\tau_{i+2}, \tau_{i+1})\Big(I - \frac{1}{\|L_{i+1}\|^2} R_{i+1} L_{i+1}^T \frac{\partial g_{i+1}}{\partial x}(x_{i+1})\Big). \tag{47}$$
By (43) with $i+1$, we recognize that
$$q_{i+1}(\tau_{i+1})^T = \sum_{j=i+1}^{N} \int_{\tau_j}^{\tau_{j+1}} \frac{\partial L}{\partial x}(x(t))\,\Phi_{j+1}(t, \tau_j)\,dt\;\Psi_{j,i+1}\,\Phi_{i+2}(\tau_{i+2}, \tau_{i+1}), \tag{48}$$
and hence, (47) implies that
$$q_i(\tau_{i+1})^T = q_{i+1}(\tau_{i+1})^T\Big(I - \frac{1}{\|L_{i+1}\|^2} R_{i+1} L_{i+1}^T \frac{\partial g_{i+1}}{\partial x}(x_{i+1})\Big).$$
This shows that Eq. (11) is in force for $q_i(\tau_i)$ and hence establishes (45). This completes the proof of the Proposition. □

III. EXAMPLE

As an example, consider the problem of letting the switched system be composed of two second-order, unstable linear systems, with switches between the different subsystems occurring on one-dimensional subspaces. Inspired by [5], we let the two subsystems be defined through $\dot{x} = A_i x$, $x \in \mathbb{R}^2$, $i = 1, 2$, where
$$A_1 = \begin{pmatrix} 1 & 100 \\ 10 & 1 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 1 & 10 \\ 100 & 1 \end{pmatrix}. \tag{49}$$
Starting at $x(0) = (1, 0)^T$, we let $x(t)$ evolve according to $\dot{x} = A_1 x$ (mode 1) until the line $g_1(x, a) = x_1 + a_1^2 x_2 = 0$ is reached, at which point the dynamics change to $\dot{x} = A_2 x$ (mode 2). The system returns to mode 1 when the line $g_2(x, a) = a_2^2 x_1 + x_2 = 0$ is reached, as seen in Figure 1.

Fig. 1. A hybrid automaton showing the switching structure of the example problem. [Diagram: mode 1 ($\dot{x} = A_1 x$), entered with $x := (1, 0)^T$, switches to mode 2 ($\dot{x} = A_2 x$) when $g_1(x, a) = 0$; mode 2 switches back to mode 1 when $g_2(x, a) = 0$.]

Now, the problem that we will study is that of forcing the trajectories of this system to look circular through the selection of optimal switching surfaces. In other words, we let the cost functional be given by
$$J(a_1, a_2) = \int_0^T \big(\|x(t)\| - \rho\big)^2\,dt,$$
where $T$ is the final time and $\rho$ is the desired circle radius. In Figure 2, the solution is shown for the case when $T = 0.5$ and $\rho = 2$. The solution is obtained by computing the derivative in Eq. (13) and then adjusting the $a$-values using a gradient descent with Armijo stepsize [14]. From Figure 3, it can be seen that the algorithm terminated after 70 steps, with the initial $a$-values being $a = (1, 0.8)^T$ and the final values being $a = (0.85, 0.73)^T$.

Fig. 2. The trajectories are shown as the switching parameters are varied. [Plot titled "Two Switching Surfaces"; vertical axis $x_2$.]

A couple of comments should be made about this experiment. First, by switching between unstable systems, the resulting system is no longer unstable. Second, the computational burden of the proposed method is quite reasonable, since only two forward (from 0 to $T$) differential equations must be solved in order to obtain $x(t)$ as well as $\tau_i$ in Eq. (13). Moreover, only one backward differential equation (from $T$ to 0) must be solved in order to obtain the continuous costate $p(t)$, which is a substantial computational improvement over the result in [4].

Fig. 3. [Two panels: $J(a)$ versus iteration #, and $\|dJ/da\|$ versus iteration #.]

IV. CONCLUSIONS

This paper concerns an optimal switching problem in hybrid dynamical systems in the setting of optimal control. The modal switching takes place whenever the state trajectory hits a certain switching surface, and the free variables of the optimization problem consist of control parameters of the switching surfaces. The paper investigates the structure of the gradient of the cost functional, and develops an algorithm for its computation. The algorithm is based on a hybrid costate having a discrete component and a continuous component. This structure of the gradient is amenable to efficient computation, as demonstrated by a numerical example.

REFERENCES
[1] R.C. Arkin. Behavior Based Robotics. The MIT Press, Cambridge, MA, 1998.
[2] A. Bemporad, A. Giua, and C. Seatzu. A Master-Slave Algorithm for the Optimal Control of Continuous-Time Switched Affine Systems. IEEE Conf. on Decision and Control, pp. 1976-1981, Las Vegas, USA, Dec. 2002.
[3] M. Boccadoro and P. Valigi. A Modelling Approach for the Dynamic Scheduling Problem of Manufacturing Systems with Non-Negligible Setup Times and Finite Buffers. 42nd IEEE Conference on Decision and Control, Maui, Hawaii, USA, Dec. 2003.
[4] M. Boccadoro, M. Egerstedt, and Y. Wardi. Optimal Control of Switching Surfaces in Hybrid Dynamic Systems. Submitted to IFAC Workshop on Discrete Event Systems, Reims, France, Sept. 2004.
[5] M.S. Branicky. Multiple Lyapunov Functions and Other Analysis Tools for Switched and Hybrid Systems. IEEE Trans. on Automatic Control, Vol. 43, No. 4, pp. 475-482, 1998.
[6] M. Branicky, V. Borkar, and S. Mitter. A Unified Framework for Hybrid Control: Model and Optimal Control Theory. IEEE Trans. on Automatic Control, Vol. 43, No. 1, pp. 31-45, 1998.
[7] A.E. Bryson, Jr. and Y.-C. Ho. Applied Optimal Control. Ginn and Co., 1969.
[8] C.G. Cassandras, D.L. Pepyne, and Y. Wardi. Optimal Control of a Class of Hybrid Systems. IEEE Trans. on Automatic Control, Vol. 46, No. 3, pp. 398-415, March 2001.
[9] M. Egerstedt. Behavior Based Robotics Using Hybrid Automata. Lecture Notes in Computer Science: Hybrid Systems III: Computation and Control, Springer Verlag, pp. 103-116, Pittsburgh, PA, March 2000.
[10] D. Flieller, J.-P. Louis, and J. Barrenscheen. General Sampled Data Modeling of Power Systems Supplied by Static Converter with Digital and Analog Controller. Mathematics and Computers in Simulation, Vol. 46, pp. 373-385, 1998.
[11] M. Egerstedt, Y. Wardi, and F. Delmotte. Optimal Control of Switching Times in Switched Dynamical Systems. IEEE Conference on Decision and Control, Maui, Hawaii, Dec. 2003.
[12] A. Giua, C. Seatzu, and C. Van Der Mee. Optimal Control of Switched Autonomous Linear Systems. IEEE Conf. on Decision and Control, pp. 2472-2477, Orlando, FL, USA, Dec. 2001.
[13] M. Johansson and A. Rantzer. Piecewise Linear Quadratic Optimal Control. IEEE Trans. on Automatic Control, Vol. 43, No. 4, pp. 629-637, 2000.
[14] E. Polak. Optimization: Algorithms and Consistent Approximations. Springer-Verlag, 1997.
[15] P. Riedinger, F. Kratz, C. Iung, and C. Zanne. Linear Quadratic Optimization for Hybrid Systems. IEEE Conf. on Decision and Control, pp. 3059-3064, Phoenix, USA, Dec. 1999.
[16] M.S. Shaikh and P.E. Caines. On Trajectory Optimization for Hybrid Systems: Theory and Algorithms for Fixed Schedules. IEEE Conf. on Decision and Control, Las Vegas, USA, Dec. 2002.
[17] H.J. Sussmann. A Maximum Principle for Hybrid Optimal Control Problems. Proceedings of the 38th IEEE Conference on Decision and Control, pp. 425-430, Phoenix, AZ, Dec. 1999.
[18] E. Verriest. Regularization Method for Optimally Switched and Impulsive Systems with Biomedical Applications (I). Proceedings of the IEEE 42nd Conference on Decision and Control, Maui, Hawaii, December 9-12, 2003.
[19] X. Xu and P.J. Antsaklis. An Approach to Optimal Control of Switched Systems with Internally Forced Switchings. Proceedings of the American Control Conference, pp. 148-153, Anchorage, AK, May 2002.