RIOTS_95 Manual
A Matlab Toolbox for Solving Optimal Control Problems
RIOTS: Recursive Integration Optimal Trajectory Solver
by Adam L. Schwartz
Conditions for Use of RIOTS_95™
To use any part of the RIOTS_95 toolbox the user must agree to the following conditions:
1. The RIOTS_95 toolbox for solving optimal control problems is distributed for sale according to the
RIOTS_95 license agreement. Use of RIOTS_95 is limited to those users covered under the pur-
chase agreement.
2. This software is distributed without any performance or accuracy guarantees. It is solely the respon-
sibility of the user to determine the accuracy and validity of the results obtained using RIOTS.
3. RIOTS_95, the RIOTS_95 user’s manual, or any portion of either may not be distributed to third
parties. Interested parties must obtain RIOTS_95 directly from Adam Schwartz or his associates.
4. Any modifications to the programs in RIOTS_95 must be communicated to Adam Schwartz. Modi-
fied programs will remain the sole property of Adam Schwartz.
5. Due acknowledgment must be made of the use of RIOTS_95 in any research reports or publications.
Whenever such reports are released for public access, a copy should be forwarded to Adam
Schwartz.
6. RIOTS_95, or any portion of the software in RIOTS_95, cannot be used as part of any other soft-
ware without the explicit consent of Adam Schwartz.
7. RIOTS_95 has been thoroughly debugged and there are no memory leaks or memory errors in the
code. However, it is possible for the user’s code to create a memory error through faulty use of
pointers or incorrectly allocated memory arrays.
RIOTS_95™: A Matlab Toolbox for Solving Optimal Control Problems, Version 1.0
Copyright © 1997-1998 by Adam L. Schwartz
All Rights Reserved.
A self-extracting RIOTS_95 educational/demonstration kit is available from the following web sites:
http://www.accesscom.com/~adam/RIOTS
http://www.shuya.home.ml.org/RIOTS_95
http://www.crosswinds.net/singapore/~yqchen/riots.html
http://www.cadcam.nus.sg/~elecyq
http://www.ee.nus.sg/~yangquan/riots.html
Abstract
This manual describes the use and operation of RIOTS_95. RIOTS_95 is a group of programs and
utilities, written mostly in C and designed as a toolbox for Matlab, that provides an interactive environ-
ment for solving a very broad class of optimal control problems. RIOTS_95 comes pre-compiled for use
with the Windows 3.1, Windows 95 or Windows NT operating systems.
The numerical methods used by RIOTS_95 are supported by the theory in [1-4] which uses the
approach of consistent approximations as defined by Polak [5]. In this approach, a solution is obtained as
an accumulation point of the solutions to a sequence of discrete-time optimal control problems that are, in
a specific sense, consistent approximations to the original continuous-time, optimal control problem. The
discrete-time optimal control problems are constructed by discretizing the system dynamics with one of
four fixed step-size Runge-Kutta integration methods1 and by representing the controls as finite-
dimensional B-splines. The integration proceeds on a (possibly non-uniform) mesh that specifies the
spline breakpoints. The solution obtained for one such discretized problem can be used to select a new
integration mesh upon which the optimal control problem can be re-discretized to produce a new discrete-
time problem that more accurately approximates the original problem. In practice, only a few such re-
discretizations need to be performed to achieve an acceptable solution.
RIOTS_95 provides three different programs that perform the discretization and solve the finite-
dimensional discrete-time problem. The appropriate choice of optimization program depends on the type
of problem being solved as well as the number of points in the integration mesh. In addition to these opti-
mization programs, RIOTS_95 also includes other utility programs that are used to refine the discretiza-
tion mesh, to compute estimates of integration errors, to compute estimates for the error between the
numerically obtained solution and the optimal control and to deal with oscillations that arise in the numer-
ical solution of singular optimal control problems.
1
RIOTS_95 also includes a variable step-size integration routine and a discrete-time solver.
Table of Contents
Section 1: Purpose 1
Section 2: Problem Description 3
Transcription for Free Final Time Problems ............................................................................. 4
Trajectory Constraints ............................................................................................................... 5
Continuum Objective Functions ............................................................................................... 5
Section 3: Using RIOTS_95 6
Session 1 ................................................................................................................................... 8
Session 2 ................................................................................................................................... 11
Session 3 ................................................................................................................................... 13
Session 4 ................................................................................................................................... 15
Section 4: User Supplied Subroutines 18
activate, sys_activate ................................................................................................................ 20
init, sys_init ............................................................................................................................... 21
h, sys_h ...................................................................................................................................... 23
l, sys_l ....................................................................................................................................... 24
g, sys_g ...................................................................................................................................... 26
Dh, sys_Dh; Dl, sys_Dl; Dg, sys_Dg ........................................................................................ 28
get_flags .................................................................................................................................... 30
time_fnc ..................................................................................................................................... 31
Section 5: Simulation Routines 33
simulate ..................................................................................................................................... 34
Implementation of the Integration Routines ............................................................................. 41
System Simulation ........................................................................................................... 41
Gradient Evaluation ......................................................................................................... 41
check_deriv ............................................................................................................................... 46
check_grad ................................................................................................................................ 48
eval_fnc ..................................................................................................................................... 50
Section 6: Optimization Programs 52
Choice of Integration and Spline Orders .................................................................................. 52
Coordinate Transformation ....................................................................................................... 55
Description of the Optimization Programs ............................................................................... 58
aug_lagrng ................................................................................................................................ 59
outer .......................................................................................................................................... 61
pdmin ........................................................................................................................................ 63
riots .......................................................................................................................................... 67
Section 7: Utility Routines 72
control_error ............................................................................................................................. 73
distribute ................................................................................................................................... 74
est_errors .................................................................................................................................. 76
sp_plot ....................................................................................................................................... 78
transform ................................................................................................................... 79
Section 8: Installing, Compiling and Linking RIOTS_95 80
Compiling the User-Supplied System Code ............................................................................. 80
The M-file Interface .................................................................................................................. 81
Section 9: Planned Future Improvements 82
Appendix: Example Problems 85
REFERENCES 89
1. PURPOSE
This chapter describes the implementation of a Matlab2 toolbox called RIOTS_95 for solving opti-
mal control problems. The name RIOTS stands for ‘‘Recursive3 Integration Optimal Trajectory Solver.’’
This name highlights the fact that the function values and gradients needed to find the optimal solutions
are computed by forward and backward integration of certain differential equations.
RIOTS_95 is a collection of programs that are callable from the mathematical simulation program
Matlab. Most of these programs are written in either C (and linked into Matlab using Matlab’s MEX
facility) or Matlab’s M-script language. All of Matlab’s functionality, including command line execution
and data entry and data plotting, are available to the user. The following is a list of some of the main fea-
tures of RIOTS_95.
• Solves a very large class of finite-time optimal control problems that includes: trajectory and end-
point constraints, control bounds, variable initial conditions (free final time problems), and problems
with integral and/or endpoint cost functions.
• System functions can be supplied by the user as either object code or M-files.
• System dynamics can be integrated with fixed step-size Runge-Kutta integration, a discrete-time
solver or a variable step-size method. The software automatically computes gradients for all func-
tions with respect to the controls and any free initial conditions. These gradients are computed
exactly for the fixed step-size routines.
• The controls are represented as splines. This allows for a high degree of function approximation
accuracy without requiring a large number of control parameters.
• The optimization routines use a coordinate transformation that creates an orthonormal basis for the
spline subspace of controls. The use of an orthogonal basis can result in a significant reduction in
the number of iterations required to solve a problem and an increase in the solution accuracy. It also
makes the termination tests independent of the discretization level.
• There are three main optimization routines, each suited for different levels of generality of the opti-
mal control problem. The most general is based on sequential quadratic programming methods. The
most restrictive, but most efficient for large discretization levels, is based on the projected descent
method. A third algorithm uses the projected descent method in conjunction with an augmented
Lagrangian formulation.
• There are programs that provide estimates of the integration error for the fixed step-size Runge-Kutta
methods and estimates of the error of the numerically obtained optimal control.
• The main optimization routine includes a special feature for dealing with singular optimal control
problems.
• The algorithms are all founded on rigorous convergence theory.
In addition to being able to accurately and efficiently solve a broad class of optimal control prob-
lems, RIOTS_95 is designed in a modular, toolbox fashion that allows the user to experiment with the
optimal control algorithms and construct new algorithms. The programs outer and aug_lagrng,
2
Matlab is a registered trademark of Mathworks, Inc. Matlab version 4.2c with the Spline toolbox is required.
3
Iterative is more accurate but would not lead to a nice acronym.
described later, are examples of this toolbox approach to constructing algorithms.
RIOTS_95 is a collection of several different programs (including a program which is, itself, called
riots) that fall into roughly three categories: integration/simulation routines, optimization routines, and
utility programs. Of these programs, the ones available to the user are listed in the following table.
Several of the programs in RIOTS_95 require functions that are available in the Matlab Spline toolbox. In
addition to these programs, the user must also supply a set of routines that describe the optimal control
problem which must be solved. Several example optimal control problems come supplied with
RIOTS_95. Finally, there is a Matlab script called RIOTS_demo which provides a demonstration of
some of the main features of RIOTS_95. To use the demonstration, perform the following steps:
Step 1: Follow the directions in §8 on compiling and linking RIOTS_95. Also, compile the sample
systems rayleigh.c, bang.c and goddard.c that come supplied with RIOTS_95.
Step 2: Start Matlab from within the ‘RIOTS/systems’ directory.
Step 3: Add the RIOTS_95 directory to Matlab's path and start the demonstration by typing, at the Matlab prompt,
>> path(path,'full_path_name_for_RIOTS')
>> RIOTS_demo
Limitations. This is the first version of RIOTS_95. As it stands, there are a few significant limitations
on the type of problems which can be solved by RIOTS_95:
1. Problems with inequality state constraints that require a very high level of discretization cannot be
solved by RIOTS_95. Also, the computation of gradients for trajectory constraints is not handled as
efficiently as it could be.
2. Problems that have highly unstable, nonlinear dynamics may require a very good initial guess for the
solution in order to be solved by RIOTS_95.
3. General constraints on the controls that do not involve state variables are not handled efficiently:
adjoints are computed but not used.
4. RIOTS_95 does not allow delays in the system dynamics (although Padé approximations can be
used).
5. Numerical methods for solving optimal control problems have not reached the stage that, say, meth-
ods for solving differential equations have reached. Solving an optimal control problem can,
depending on the difficulty of the problem, require significant user involvement in the solution pro-
cess. This sometimes requires the user to understand the theory of optimal control, optimization
and/or numerical approximation methods.
Conventions. This manual assumes familiarity with Matlab. The following conventions are used
throughout this manual.
• Program names and computer commands are indicated in bold typeface.
• User input is indicated in Courier typeface.
• Optional program arguments are listed in brackets. The default value for any optional argument can
be specified using [].
• Optional program arguments at the end of an argument list can be omitted in which case these argu-
ments take on their default values.
• Typing a function’s name without arguments shows the calling syntax for that function. Help can be
obtained for M-file programs by typing help followed by the function name at Matlab’s prompt.
Typing help RIOTS produces a list of the programs in RIOTS_95.
• The machine precision is denoted by ε_mach.
2. PROBLEM DESCRIPTION
The optimal control problem, OCP, solved by RIOTS_95 has the form⁴
	minimize over (u, ξ) ∈ L_∞^m[a, b] × IR^n:   max over ν ∈ q_o of   f^ν(u, ξ) ≐ g_o^ν(ξ, x(b)) + ∫_a^b l_o^ν(t, x, u) dt
subject to
	ẋ = h(t, x, u) ,  x(a) = ξ ,  t ∈ [a, b] ,
	u_min^j(t) ≤ u^j(t) ≤ u_max^j(t) ,  j = 1, . . . , m ,  t ∈ [a, b] ,
	ξ_min^j ≤ ξ^j ≤ ξ_max^j ,  j = 1, . . . , n ,
	l_ti^ν(t, x(t), u(t)) ≤ 0 ,  ν ∈ q_ti ,  t ∈ [a, b] ,
	g_ei^ν(ξ, x(b)) ≤ 0 ,  ν ∈ q_ei ,
	g_ee^ν(ξ, x(b)) = 0 ,  ν ∈ q_ee ,
where x(t) ∈ IR^n, u(t) ∈ IR^m, g : IR^n × IR^n → IR, l : IR × IR^n × IR^m → IR, h : IR × IR^n × IR^m → IR^n and we
have used the notation q ≐ { 1, . . . , q }; L_∞^m[a, b] is the space of Lebesgue measurable, essentially
bounded functions [a, b] → IR^m. The functions in OCP can also depend upon parameters which are
4
Not all of the optimization routines in RIOTS_95 can handle the full generality of problem OCP.
the subscript. The functions in the description of problem OCP, and the derivatives of these functions⁵,
must be supplied by the user as either object code or as M-files. The bounds on the components of ξ and
u are specified on the Matlab command line at run-time.
The optimal control problem OCP allows optimization over both the control u and one or more of
the initial states ξ. To be concise, we will define the variable
	η ≐ (u, ξ) ∈ H_2 ≐ L_∞^m[a, b] × IR^n .
With this notation, we can write, for example, f(η) instead of f(u, ξ). We define the inner product on H_2
as
	⟨η_1, η_2⟩_{H_2} ≐ ⟨u_1, u_2⟩_{L_2} + ⟨ξ_1, ξ_2⟩ .
A free final time problem whose dynamics are given by
	ẏ = h̃(t, y, u) ,  y(a) = ζ ,  t ∈ [a, a + T] ,
can, with an augmented state vector x ≐ (y, x_{n-1}, x_n), be converted into the equivalent fixed final time
optimal control problem
	min_{u,ξ}  g(ξ, x(b)) + ∫_a^b l(t, x, u) dt
subject to
	ẋ = h(t, x, u) ≐ ( x_n h̃(x_{n-1}, y, u) , x_n , 0 )^T ,  x(a) = ξ ≐ ( ζ , a , ξ_n )^T ,  t ∈ [a, b] ,
where y is the first n − 2 components of x, g(ξ, x(b)) ≐ g̃(a + T ξ_n, y(b)), l(t, x, u) ≐ x_n l̃(x_{n-1}, y, u) and
b ≐ a + T. Endpoint and trajectory constraints can be handled in the same way. The quantity T = b − a
is the nominal trajectory duration. In this transcription, x_{n-1} plays the role of time and ξ_n is the duration
scale factor, so named because T ξ_n is the effective duration of the trajectories for the scaled dynamics.
Thus, for any t ∈ [a, b], x_n(t) = ξ_n, x_{n-1}(t) = a + (t − a) ξ_n and the solution, t_f, for the final time is
t_f = a + T ξ_n.
5
If the user does not supply derivatives, the problem can still be solved using riots with finite-difference computation of the gradients.
Trajectory constraints.
The definition of problem OCP allows trajectory constraints of the form l ti (t, x, u) ≤ 0 to be handled
directly. However, constraints of this form are quite burdensome computationally. This is mainly due to
the fact that a separate gradient calculation must be performed for each point at which the trajectory con-
straint is evaluated.
At the expense of increased constraint violation, reduced solution accuracy and an increase in the
number of iterations required to obtain solutions, trajectory constraints can be converted into endpoint
constraints which are computationally much easier to handle. This is accomplished as follows. The sys-
tem is augmented with an extra state variable x_{n+1} with
	ẋ_{n+1}(t) = γ max{ 0, l_ti(t, x(t), u(t)) }² ,  x_{n+1}(a) = 0 ,
where γ > 0 is a positive scalar. The right-hand side is squared so that it is differentiable with respect to x
and u. Then it is clear that either of the endpoint constraints
	g_ei(ξ, x(b)) ≐ x_{n+1}(b) ≤ 0     or     g_ee(ξ, x(b)) ≐ x_{n+1}(b) = 0
is satisfied if and only if the original trajectory constraint is satisfied. In practice, the accuracy to which
OCP can be solved with these endpoint constraints is quite limited because these endpoint constraints do
not satisfy the standard constraint qualification (described in the §4). This difficulty can be circumvented
by eliminating the constraints altogether and, instead, adding to the objective function the penalty term
g_o(ξ, x(b)) ≐ x_{n+1}(b), where γ now serves as a penalty parameter. However, in this approach, γ must now
be a large positive number and this will adversely affect the conditioning of the problem. Each of these
possibilities is implemented in ‘obstacle.c’ for problem Obstacle (see Appendix B).
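For illustration only (this is not one of the supplied example systems), the augmented right-hand side might be coded in an M-file along the following lines; the trajectory constraint x_1(t) − 1 ≤ 0, the two original states and the weight γ = 1 are all hypothetical choices:

function xdot = sys_h(neq,t,x,u)
% Hypothetical fragment: two original states plus the constraint-accumulating
% state x(3) described in the augmentation above.
global sys_params
gamma = 1;                     % assumed penalty weight, gamma > 0
xdot = zeros(3,1);
xdot(1) = x(2);                % placeholder original dynamics
xdot(2) = u(1);
viol = max(0, x(1) - 1);       % example trajectory constraint l_ti = x1 - 1 <= 0
xdot(3) = gamma*viol^2;        % accumulates the squared constraint violation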
can be converted into the form used in problem OCP by augmenting the state vector with an additional
state, w, such that
	ẇ = 0 ,  w(0) = ξ_{n+1} ,
subject to
	l(t, x(t), u(t)) − ξ_{n+1} ≤ 0 ,  t ∈ [a, b] .
Similarly, a min-max problem of the form
	min_u  max_{ν ∈ q_o}  g^ν(u, ξ) + ∫_a^b l^ν(t, x, u) dt
can be transcribed. In this case, an equivalent endpoint constrained problem with a single objective function,
	min_{u, ξ_{n+1}}  ξ_{n+1}
subject to
	g̃^ν(u, ξ) − ξ_{n+1} ≤ 0 ,  ν ∈ q_o ,
is formed by using the augmented state vector (x, w, z) with
	ẇ = 0 ,  w(0) = ξ_{n+1} ,
	ż^ν = l^ν(t, x(t), u(t)) ,  z^ν(0) = 0 ,  ν ∈ q_o ,
and defining
	g̃^ν(u, ξ) ≐ g^ν(u, ξ) + z^ν(b) .
3. USING RIOTS_95
This section provides some examples of how to simulate systems and solve optimal control problems with
the RIOTS_95 toolbox. Detailed descriptions of all required user-functions, simulation routines, opti-
mization programs and utility programs are given in subsequent sections. These programs are all callable
from within Matlab once Matlab’s path is set to include the directory containing RIOTS_95. The Matlab
command
>> path(path,'full_path_name_for_RIOTS')
should be used for this purpose. Refer to §8, ‘‘Compiling and Linking RIOTS_95’’, for details on
how to install RIOTS_95.
RIOTS_95 provides approximate solutions of continuous time optimal control problems by solving
discretized ‘‘approximating’’ problems. These approximating problems are obtained by (i) numerically
integrating the continuous time system dynamics with one of four Runge-Kutta integration methods6 and
(ii) restricting the space of allowable controls to finite-dimensional subspaces of splines. In this way, the
approximating problems can be solved using standard mathematical programming techniques to optimize
over the spline coefficients and any free initial conditions. It is not important for the user of RIOTS_95 to
6
RIOTS_95 also includes a discrete-time system solver and a variable step-size integration routine.
cients and these coefficients are stored as row vectors. Thus, a system with m inputs will be stored in a
‘‘short-fat’’ matrix with m rows and N + ρ − 1 columns. More details about splines are given in the next
section.
Typically, we use the Matlab variable u to store the spline coefficients. The system trajectories
computed by integrating the system dynamics are stored in the variable x. Like u, x is a ‘‘short-fat’’
matrix with n rows and N + 1 columns. Thus, for example, x(:,k) is the computed value of x(t k ).
Other quantities, such as gradients and adjoints, are also stored as ‘‘short-fat’’ matrices.
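For example, the storage conventions can be checked at the Matlab prompt (m, n, N and the spline order ρ are as defined above):

>> size(u)     % m by N+rho-1 : one row of spline coefficients per input
>> size(x)     % n by N+1     : x(:,k) approximates x(t_k)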
The following sample sessions with RIOTS_95 solve a few of the sample optimal control problems
that are supplied with RIOTS_95 as examples. Appendix B provides a description of these problems and
the C-code implementations are included in the ‘RIOTS/systems’ sub-directory.
We can take a look at the solution trajectories by simulating this system with some initial control. We
will specify an arbitrary piecewise linear (order ρ = 2) spline by using N + ρ − 1 = N + 1 coefficients
and perform a simulation by calling simulate.
>> N=50;
>> x0=[-5;-5]; % Initial conditions
>> t=[0:2.5/50:2.5]; % Uniform integration mesh
>> u0=zeros(1,N+1); % Spline with all coeff’s zero.
>> [j,x]=simulate(1,x0,u0,t,4,2);
>> plot(t,x)
[Figure: the state trajectories produced by [j,x]=simulate(1,x0,u0,t,4,2) and plotted with plot(t,x).]
Next, we find an approximate solution to Problem Rayleigh, which will be represented by the same type of
spline as u0, by using either riots or pdmin.
>> [u1,x1,f1]=riots(x0,u0,t,[],[],[],100,4);
>> [u1,x1,f1]=pdmin(x0,u0,t,[],[],[],100,4);
The first three input arguments are the initial conditions, initial guess for the optimal control, and the inte-
gration mesh. The next three inputs are empty brackets indicating default values which, in this case, spec-
ify that there are no control lower bounds, no control upper bounds, and no system parameters. The last
two inputs specify that a maximum of 100 iterations are to be allowed and that integration routine 4
(which is a fourth order Runge-Kutta method) should be used. The outputs are the control solution, the
trajectory solution, and the value of the objective function.
The displayed output for pdmin is shown below. The displayed output for riots depends on the
mathematical programming algorithm with which it is linked (see description of riots in §6).
The column labeled ||free_grad|| gives the value of ||∇f(η)||_{H_2}, the norm of the gradient of the
objective function. For problems with bounds on the free initial conditions and/or controls, this norm is
restricted to the subspace where the bounds are not active. For problems without state constraints,
||∇f(η)||_{H_2} goes to zero as a local minimizer is approached. The column with three letters, each a T or F,
indicates which of the three normal termination criteria (see description of pdmin in §6) are satisfied.
For problems with control or initial condition bounds there are four termination criteria.
We can also solve this problem with quadratic splines (ρ = 3) by using N + ρ − 1 = N + 2 spline coeffi-
cients.
>> u0=zeros(1,N+2);
>> [u2,x2,f2]=pdmin(x0,u0,t,[],[],[],100,4);
We can view the control solutions using sp_plot which plots spline functions. The trajectory solutions
can be viewed using plot or sp_plot.
[Figures: the control solutions plotted with sp_plot, e.g. sp_plot(t,u2).]
The output displayed above shows that one flag has been read from the Matlab workspace. The next two
lines are messages produced by the user-supplied routines. The last set of data shows the value of the sys-
tem information (see discussion of neq[] in the description of init, §4, and also simulate, §5). Since
this problem has a state constraint, we can use either aug_lagrng or riots to solve it.
>> x0=[-5;-5];
>> u0=zeros(1,51);
>> t=[0:2.5/50:2.5];
>> u=aug_lagrng(x0,u0,t,[],[],[],100,5,4);
The displayed output reports that, at the current solution, the objective value is 29.8635 and the endpoint
constraint is being violated by −8.63 × 10^-6. There is some error in these values due to the integration
error of the fixed step-size integration routines. We can get a more accurate measure by using the variable
step-size integration routine to simulate the system with the control solution u:
ans =
29.8648
ans =
5.3852e-06
The integration was performed with the default value of 1e-8 for both the relative and absolute local
integration error tolerances. So the reported values are fairly accurate.
The nominal time interval is of duration T. Next, we specify a value for ξ_3, the duration scale factor,
which is the initial condition for the augmented state. The quantity T ξ_3 represents our guess for the opti-
mal duration of the maneuver.
>> X0=[x0,fixed,x0_lower,x0_upper]
X0 =
0 1.0000 0 0
0 1.0000 0 0
1.0000 0 0.1000 10.0000
The first column of X0 is the initial conditions for the problem; there are three states including the aug-
mented state. The initial conditions for the original problem were x(0) = (0, 0)^T. The initial condition for
the augmented state is set to x0(3) = ξ_3 = 1 to indicate that our initial guess for the optimal final time is
one times the nominal final time of T = 10, i.e., ξ_3 T = 10. The second column of X0 indicates which initial
conditions are to be considered fixed and which are to be treated as free variables for the optimization
program to adjust. A one indicates fixed and a zero indicates free. The third and fourth columns provide
lower and upper bounds for the free initial conditions.
>> u0=zeros(1,N+1);
>> [u,x,f]=riots(X0,u0,t,-2,1,[],100,2); % Solve problem; f=x(3,1)=x0(3)
>> f*T % Show the final time.
ans =
29.9813
redistribute_factor = 7.0711
Redistributing mesh.
>> X0(:,1) = x(:,1);
>> [u,x,f]=riots(X0,new_u,new_t,-2,1,[],100,2);
>> f*10
ans =
30.0000
Notice that before calling riots the second time, we set the initial conditions (the first column of X0) to
x(:,1), the first column of the trajectory solution returned from the preceding call to riots. Because ξ_3
is a free variable in the optimization, x(3,1) is different from what was initially specified for x0(3).
Since x(3,1) is likely to be closer to the optimal value for ξ_3 than our original guess, we set the current
guess X0(3,1) to x(3,1).
We can see the improvement in the control solution and the solution for the final time. The reported
final time solution is 30 and this happens to be the exact answer. The plot of the control solution before
and after the mesh redistribution is shown below. The circles indicate where the mesh points are located.
The improved solution does appear to be a bang-bang solution.
[Figure: ‘‘Control soln. before redistribution’’ and ‘‘Control soln. after redistribution’’ plotted against time.]
Now outer is called with lower and upper control bounds of 0 and 3.5, respectively; no system parame-
ters; a maximum of 300 iterations for each inner loop; a maximum of 10 outer loop iterations with a maxi-
mum discretization level of N = 500; default termination tolerances; integration algorithm 4 (RK4); and
mesh redistribution strategy 2.
>> [new_t,u,x]=outer([x0,fixed],u0,t,0,3.5,[],500,[10;500],4,[],2);
Goddard
Completed 114 riots iterations. Kuhn-Tucker conditions satisfied but sequence did not converge.
The message stating that the Kuhn-Tucker conditions are satisfied but that the sequence did not converge
is a message from NPSOL which is the nonlinear programming algorithm linked with riots in this exam-
ple. This message indicates that, although the first-order conditions for optimality are satisfied (the
norm of the gradient of the Lagrangian is sufficiently small), the control functions from one iteration of
riots to the next have not completely stopped changing. The sources of this problem are (i) the Goddard
problem is a singular optimal control problem; this means that small changes in the controls over some
portions of the time interval have very little effect on the objective function and (ii) outer calls riots with
very tight convergence tolerances. Because of this, the calls to riots probably performed many more iter-
ations than were useful for the level of accuracy achieved. Choosing better convergence tolerances is a
subject for future research.
The optimal control and optimal state trajectories are shown on the next page. Notice that to plot
the optimal control we multiply the time vector new_t by x(4,1) which contains the duration scale
factor. The optimal final time for this problem, since a = 0 and b = 1, is just x(4,1)=0.1989. Note
that the final mass of the rocket is 0.6. This is the weight of the rocket without any fuel. The maximum
height is the negative of the objective function, h* ≈ 1.01284.
>> sp_plot(new_t*x(4,1),u)
>> plot(new_t*x(4,1),x(1,:))
>> plot(new_t*x(4,1),x(2,:))
>> plot(new_t*x(4,1),x(3,:))
[Figures: the optimal control and the three optimal state trajectories plotted against time.]
All of the functions in the description of OCP in §2 are computed from the user functions h, l and g; the
derivatives of these functions are computed from the user functions Dh, Dl and Dg. Two other user func-
tions, activate and init, are required for the purpose of passing information to and from RIOTS_95.
Smoothness Requirements. The user-supplied functions must have a certain degree of smoothness.
The smoothness requirement comes about for three reasons. First, the theory of differential equations
requires, in general, that h(t, x, u) be piecewise continuous with respect to t, Lipschitz continuous with
respect to x and u and that u(⋅) be continuous, in order to ensure the existence and uniqueness of a solu-
tion satisfying the system of differential equations. A finite number of discontinuities in h(⋅, x, u) and u(⋅)
are allowable. Second, the optimization routines need at least one continuous derivative of the objective
and constraint functions g(⋅, ⋅) and l(t, ⋅, ⋅). Two continuous derivatives are needed in order for there to be
a chance of superlinear convergence. The third reason is that the accuracy of numerical integration of dif-
ferential equations depends on the smoothness of h(⋅, ⋅, ⋅) and l(⋅, ⋅, ⋅). For a fixed step-size method of
order s, ∂^s h(t, x, u)/∂x^s and ∂^s h(t, x, u)/∂u^s should be continuous (or the (s − 1)-th partials should be
Lipschitz continuous). Furthermore, any discontinuities in h(⋅, x, u(⋅)) or its derivatives should occur only
at integration breakpoints7. Conversely, the user should place integration breakpoints wherever such dis-
continuities occur. The same considerations also hold for the function l(t, x, u). For variable step-size
integration, h(t, x, u) and l(t, x, u) should have at least continuous partial derivatives of order one with
respect to x and u. Again, any discontinuities in h(⋅, x, u(⋅)) and l(⋅, x, u(⋅)) or its derivatives should only
occur at integration break points.
7
Note that discontinuities in u(t) can only occur at the spline breakpoints, t k .
Purpose
This function is always called once before any of the other user-supplied functions. It allows the user to
perform any preliminary setup needed, for example, loading a data array from a file.
C Syntax
void activate(message)
char **message;
{
*message = "";
/* Any setup routines go here. */
}
M-file Syntax
function message = sys_activate
message = '';
Description
If the message string is set, that string will be printed out whenever simulate (form 0) or an optimization
routine is called. It is useful to include the name of the optimal control problem as the message.
Purpose
This function serves two purposes. First, it provides information about the optimal control problem to
RIOTS_95. Second, it allows system parameters to be passed from Matlab to the user-defined functions
at run-time. These system parameters can be used, for instance, to specify constraint levels. Unlike acti-
vate, init may be called multiple times. The array neq[] is explained after the syntax.
C Syntax
void init(neq,params)
int neq[];
double *params;
{
if ( params == NULL ) {
/* Set values in the neq[] array. */
}
else {
/* Read in runtime system parameters. */
}
}
M-file Syntax
When this function is called, the variable params will be set to 0 (NULL) if init() is expected to
return information about the optimal control problem via the neq[] array. Otherwise, params is a vec-
tor of system parameters being passed from Matlab to the user's program. When params==0, the values
in neq[] should be set to describe the optimal control problem (the number of states, inputs, system
parameters, objective functions and constraints of each type). System parameters passed in this way
will not be adjusted during optimization. Parameters that are to be used as decision variables must be
specified as initial conditions to augmented states with ẋ = 0.
Notes
1. Control bounds should be indicated separately when calling the optimization routines. Do not
include any simple bound constraints in the general constraints. Similarly, simple bounds on free initial
conditions should be specified on the command line.
2. For nonlinear systems, all constraints involving a state variable are nonlinear functions of the control.
Thus, the constraint g(ξ, x(b)) = x(b) = 0, while linear in its arguments, is nonlinear with respect to u.
The user does not need to account for this situation, however, and should indicate that g is a linear con-
straint. RIOTS_95 automatically treats all general constraints for nonlinear systems as nonlinear.
Purpose
This function serves only one purpose, to compute h(t, x, u), the right hand side of the differential equa-
tions describing the system dynamics.
C Syntax
void h(neq,t,x,u,xdot)
int neq[];
double *t,x[NSTATES],u[NINPUTS],xdot[NSTATES];
{
/* Compute xdot(t) = h(t,x(t),u(t)). */
}
M-file Syntax
function xdot = sys_h(neq,t,x,u)
global sys_params
Description
On entrance, t is the current time, x is the current state vector and u is the current control vector. Also,
neq[3] is set to the current discrete-time index, k − 1, such that t_k ≤ t < t_{k+1} (see footnote 8).
On exit, the array xdot[] should contain the computed value of h(t, x, u). The values of xdot[]
default to zero for the object code version. Note that for free final time problems the variable t should
not be used because derivatives of the system functions with respect to t are not computed. In the case of
non-autonomous systems, the user should augment the state variable with an extra state representing time
(see transcription for free final time problems in §2).
8
The index is k − 1 since indexing for C code starts at zero. For M-files, neq(4) = k.
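As an illustration (a hypothetical double integrator, not one of the supplied example systems), an M-file version of this function might look like:

function xdot = sys_h(neq,t,x,u)
% Hypothetical example: double integrator, xdot1 = x2, xdot2 = u.
global sys_params
xdot = [x(2); u(1)];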
Purpose
This function serves two purposes. It is used to compute values for the integrands of cost functions,
l o (t, x, u), and the values of state trajectory constraints, l ti (t, x, u).
C Syntax
double l(neq,t,x,u)
int neq[];
double *t,x[NSTATES],u[NINPUTS];
{
int F_num, constraint_num;
double z;
F_num = neq[4];
if ( F_num == 1 ) {
/* Compute z = l(t,x(t),u(t)) for the integrand. */
/* If this integrand is identically zero, */
/* set z = 0 and neq[3] = -1. */
}
else {
constraint_num = F_num - 1;
/* Compute z = l(t,x(t),u(t)) for the */
/* constraint_num trajectory constraint. */
}
return z;
}
M-file Syntax
function z = sys_l(neq,t,x,u)
% z is a scalar.
global sys_params
F_NUM = neq(5);
if F_NUM == 1
% Compute z = l(t,x(t),u(t)) for the objective integrand.
else
constraint_num = F_NUM - 1;
% Compute z = l(t,x(t),u(t)) for the constraint_num
% traj. constraint.
end
On entrance, t is the current time, x is the current state vector and u is the current control vector. Also,
neq[3] is set to the current discrete-time index k − 1 such that t k ≤ t < t k+1 (see footnote for h) and
neq[4] is used to indicate which integrand or trajectory constraint is to be evaluated. Note that, for free
final time problems, the variable t should not be used because derivatives of the system functions with
respect to t are not computed. In this case, the user should augment the state variable with an extra time
state and an extra final-time state as described in §2.
If neq[4] = 1, then z should be set to l_o(t, x, u). If l_o(⋅, ⋅, ⋅) ≡ 0 then, besides returning 0, l
(in object code versions) can set neq[3] = −1 to indicate that the function is identically zero. The latter
increases efficiency because it tells RIOTS_95 that there is no integral cost. Only the function l is allowed
to modify neq[3]. Regardless of how neq[3] is set, l must always return a value even if the returned value
is zero.
If neq[4] > 1, then z should be set to l_ti^ν(t, x, u) with ν = neq[4] − 1. If there are both linear and nonlinear trajec-
tory constraints, the nonlinear constraints must precede those that are linear. The ordering of the func-
tions computed by l is summarized in the following table:

	neq[4]            ν                 function computed by l
	neq[4] = 1        neq[4]            l_o(t, x, u)
	neq[4] > 1        neq[4] − 1        l_ti^ν(t, x, u), nonlinear constraints followed by linear
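For illustration, a hypothetical M-file with one integral cost and one trajectory constraint x_1(t) − 1 ≤ 0 (neither taken from the supplied examples) might be written as:

function z = sys_l(neq,t,x,u)
% Hypothetical example: one objective integrand and one trajectory constraint.
global sys_params
F_NUM = neq(5);
if F_NUM == 1
  z = x(1)^2 + x(2)^2 + u(1)^2;   % integrand of the objective
else
  z = x(1) - 1;                   % trajectory constraint number F_NUM-1 = 1
end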
Purpose
This function serves two purposes. It is used to compute the endpoint cost function g_o(ξ, x(b)) and the
endpoint inequality and equality constraints g_ei(ξ, x(b)) and g_ee(ξ, x(b)). The syntax for this function
includes an input for the time variable t, which is reserved for future implementations and should not be
used. Problems involving a cost on the final time T should use the transcription for free final time prob-
lems described in §2.
C Syntax
double g(neq,t,x0,xf)
int neq[];
double *t,x0[NSTATES],xf[NSTATES];
{
int F_num, constraint_num;
double value;
F_num = neq[4];
if ( F_num <= 1 ) {
/* Compute value of g(t,x0,xf) for the */
/* F_num cost function. */
}
else {
constraint_num = F_num - 1;
/* Compute value g(t,x0,xf) for the */
/* constraint_num endpoint constraint. */
}
return value;
}
M-file Syntax
function J = sys_g(neq,t,x0,xf)
% J is a scalar.
global sys_params
F_NUM = neq(5);
if F_NUM <= sys_params(6)
% Compute g(t,x0,xf) for cost function.
elseif F_NUM == 2
% Compute g(t,x0,xf) for endpoint constraints.
end
On entrance, x0 is the initial state vector and xf is the final state vector. The value neq[4] is used to
indicate which cost function or endpoint constraint is to be evaluated. Nonlinear constraints must precede
linear constraints. The order of functions to be computed is summarized in the following table:
	neq[4]                                        ν                        function computed by g
	neq[4] = 1                                    1                        g_o(ξ, x(b))
	1 < neq[4] ≤ 1 + q_ei                         neq[4] − 1               g_ei^ν(ξ, x(b)), nonlinear followed by linear
	1 + q_ei < neq[4] ≤ 1 + q_ei + q_ee           neq[4] − 1 − q_ei        g_ee^ν(ξ, x(b)), nonlinear followed by linear
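For illustration, a hypothetical M-file with one endpoint cost and a single endpoint equality constraint (so that, following the table above, the second function selected is that constraint) might be written as:

function J = sys_g(neq,t,x0,xf)
% Hypothetical example: endpoint cost plus one endpoint equality constraint xf(1) = 0.
global sys_params
F_NUM = neq(5);
if F_NUM == 1
  J = xf(1)^2 + xf(2)^2;   % endpoint cost g_o
else
  J = xf(1);               % endpoint equality constraint g_ee
end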
Purpose
These functions provide the derivatives of the user-supplied functions with respect to the arguments x and
u. The program riots (see §6) can be used without providing these derivatives by selecting the finite-
difference option. In this case, dummy functions must be supplied for Dh, Dl and Dg.
C Syntax
void Dh(neq,t,x,u,A,B)
int neq[];
double *t,x[NSTATES],u[NINPUTS];
double A[NSTATES][NSTATES],B[NSTATES][NINPUTS];
{
/* The A matrix should contain dh(t,x,u)/dx. */
/* The B matrix should contain dh(t,x,u)/du. */
}
double Dl(neq,t,x,u,l_x,l_u)
int neq[];
double *t,x[NSTATES],u[NINPUTS],l_x[NSTATES],l_u[NINPUTS];
{
/* l_x[] should contain dl(t,x,u)/dx */
/* l_u[] should contain dl(t,x,u)/du */
/* according to the value of neq[4]. */
/* The return value is dl(t,x,u)/dt which */
/* is not currently used by RIOTS. */
return 0.0;
}
double Dg(neq,t,x0,xf,g_x0,g_xf)
int neq[];
double *t,x0[NSTATES],xf[NSTATES],g_x0[NSTATES],g_xf[NSTATES];
{
/* g_x0[] should contain dg(t,x0,xf)/dx0. */
/* g_xf[] should contain dg(t,x0,xf)/dxf. */
/* according to the value of neq[4]. */
/* The return value is dg(t,x0,xf)/dt which */
/* is not currently used by RIOTS. */
return 0.0;
}
Description
The input variables and the ordering of objectives and constraints are the same for these derivative func-
tions as they are for the corresponding functions h, l, and g. The derivatives with respect to t are not used
in the current implementation of RIOTS_95 and can be set to zero. The derivatives should be stored in
the arrays as follows: A[i][j] = ∂h_i/∂x_j, B[i][j] = ∂h_i/∂u_j, l_x[i] = ∂l/∂x_i, l_u[i] = ∂l/∂u_i, g_x0[i] = ∂g/∂x0_i
and g_xf[i] = ∂g/∂xf_i.
Note that, for sys_Dh, RIOTS_95 automatically accounts for the fact that Matlab stores matrices trans-
posed relative to how they are stored in C.
Purpose
This function allows user-supplied object code to read a vector of integers from Matlab’s workspace.
C Syntax
int get_flags(flags,n)
int flags[],*n;
Description
A call to get_flags causes flags[] to be loaded with up to n integers from the array FLAGS if FLAGS
exists in Matlab’s workspace. It is the user’s responsibility to allocate enough memory in flags[] to
store n integers. The value returned by get_flags indicates the number of integers read into flags[].
The main purpose of get_flags is to allow a single system program to be able to represent more than
one problem configuration. The call to get_flags usually takes place within the user-function activate. In
the example below, get_flags reads in the number of constraints to use for the optimal control problem.
Example
extern int get_flags();
static int Constraints;
void activate(message)
char **message;
{
int n,flags[1];
Notes
1. It is best to define FLAGS as a global variable in case simulate gets called from within an M-file.
This is accomplished by typing
>> global FLAGS
at the Matlab prompt. To clear FLAGS use the Matlab command
>> clear global FLAGS
2. Since activate is called once only, you must clear simulate if you want to re-read the values in
FLAGS. To clear simulate, at the Matlab prompt type
>> clear simulate
3. For M-files, any global variable can be read directly from Matlab’s workspace so an M-file version of
get_flags is not needed.
Purpose
This function allows user-supplied object code to make calls back to a user-supplied Matlab m-function
called sys_time_fnc.m which can be used to compute a function of time. Call-backs to Matlab are very
slow. Since this function can be called thousands of times during the course of a single system simulation,
it is best to provide the time function as part of the object code if possible.
C Syntax
void time_fnc(t,index,flag,result)
int *index,*flag;
double *t,result[];
Syntax of sys_time_fnc.m
function f = sys_time_fnc(tvec)
% tvec = [time;index;flag]
% Compute f(time,index,flag).
Description
If time_fnc is to be called by one of the user-functions, then the user must supply an m-function named
sys_time_fnc. The inputs tvec(1)=time and tvec(2)=index to sys_time_fnc are related by
t_index ≤ time ≤ t_{index+1}. The value of index passed to sys_time_fnc is one greater than the value passed
from time_fnc to compensate for the fact that Matlab indices start from 1 whereas C indices start from 0.
The input flag is an integer that can be used to select from among different time functions. Even if
flag is not used, it must be set to some integer value.
The values in the vector f returned from sys_time_fnc are stored in result which must have
enough memory allocated for it to store these values.
Suppose we want l to compute f(t)x_1(t) where f(t) = sin(t) + y_d(t), with y_d(t) some pre-computed
global variable in the Matlab workspace. Then we can use time_fnc to compute f(t) and use this value to
multiply x[0].
Here is the function that computes f (t). It computes different functions depending on the value of
flag=t(3). In our example, it is only called with flag=0.
function f = sys_time_fnc(t)
global yd
time = t(1); index = t(2); flag = t(3);
if flag == 0
  f = yd(index) + sin(time);   % yd assumed to be precomputed at the mesh points
else
  f = another_fnc(time);
end
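For M-file user-functions, the call-back is unnecessary since global workspace variables can be read directly (see note 3 under get_flags). A hypothetical sys_l computing f(t)x_1(t) directly might look like the sketch below; indexing y_d by the discrete-time index is an assumption:

function z = sys_l(neq,t,x,u)
% Hypothetical M-file version of the example above; yd is a global variable
% assumed to hold precomputed values of y_d at the mesh points.
global sys_params yd
k = neq(4);                    % discrete-time index (see footnote 8)
z = (yd(k) + sin(t))*x(1);     % f(t)*x1(t)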
This section describes the central program in RIOTS_95, simulate. All of the optimization programs in
RIOTS_95 are built around simulate which is responsible for computing all function values and gradients
and serves as an interface between the user’s routines and Matlab.
The computation of function values and gradients is performed on the integration mesh
	t_N ≐ { t_{N,k} }_{k=1}^{N+1} .
This mesh also specifies the breakpoints of the control splines. For any mesh t_N we define
	∆_{N,k} ≐ t_{N,k+1} − t_{N,k} .
The values of the trajectories computed by simulate are given at the times t_{N,k} and are denoted x_{N,k},
k = 1, . . . , N + 1. Thus, x_{N,k} represents the computed approximation to the true solution x(t_{N,k}) of the dif-
ferential equation ẋ = h(t, x, u), x(a) = ξ. The subscript N is often omitted when its presence is clear
from context.
Spline Representation of Controls. The controls u are represented as splines. These splines are given
by
	u(t) = Σ_{k=1}^{N+ρ−1} α_k φ_{t_N,ρ,k}(t) ,
where α_k ∈ IR^m and φ_{t_N,ρ,k}(⋅) is the k-th B-spline basis element of order ρ, defined on the knot sequence
formed from t_N by repeating its endpoints ρ times. Currently, RIOTS_95 does not allow repeated interior
knots. We will denote the collection of spline coefficients by
	α ≐ { α_k }_{k=1}^{N+ρ−1} .
For single input systems, α is a row vector. Those interested in more details about splines are referred to
the excellent reference [6]. The times t_k, k = 1, . . . , N, define the spline breakpoints. On each interval
[t_k, t_{k+1}], the spline coincides with a ρ-th order polynomial. Thus, fourth order splines are made up of
piecewise cubic polynomials and are called cubic splines. Similarly, third order splines are piecewise
quadratic, second order splines are piecewise linear and first order splines are piecewise constant. For
first and second order splines, α_k = u(t_k). For higher-order splines, the B-spline basis elements are evalu-
ated using the recursion formula in (A2.2a).
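For readers who want to inspect a control outside of RIOTS_95, the same representation can be reproduced with the Spline Toolbox; the following is only an illustrative sketch, assuming t and u hold a mesh and a matching coefficient matrix:

>> rho = 4;                     % cubic splines
>> knots = augknt(t, rho);      % repeat the endpoints of the mesh rho times
>> sp = spmak(knots, u);        % u holds the m-by-(N+rho-1) spline coefficients
>> uval = fnval(sp, t);         % values of the control at the mesh points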
The following pages describe simulate. First, the syntax and functionality of simulate are presented.
This is followed by a description of the methods used by the integration routines in simulate to compute
function values and gradients. Finally, two functions, check_deriv and check_grad, for checking user-
supplied derivative information, and the function eval_fnc are described.
Purpose
This is the central program in RIOTS_95. The primary purpose of simulate is to provide function values
and gradients of the objectives and constraints using one of six integration algorithms. The optimization
routines in RIOTS_95 are built around simulate. This program also serves as a general interface to the
user-supplied functions and provides some statistical information.
There are currently seven different forms in which simulate can be called. Form 1 and form 2
(which is more conveniently accessed using eval_fnc) are the most useful for the user. The other forms
are used primarily by other programs in RIOTS_95. The form is indicated by the first argument to simu-
late. A separate description for each form is given below.
Form 0
[info,simed] = simulate(0,{params})
Form 1
[f,x,du,dz,p] = simulate(1,x0,u,t,ialg,action)
Form 2
f=simulate(2,f_number,1)
[du,dz,p] = simulate(2,f_number,action)
Form 3
[xdot,zdot] = simulate(3,x,u,t,{f_num,{k}})
[xdot,zdot,pdot] = simulate(3,x,u,t,p,{k})
Form 4
[h_x,h_u,l_x,l_u] = simulate(4,x,u,t,{f_num,{k}})
Form 5
[g,g_x0,g_xf] = simulate(5,x0,xf,tf,{f_num})
Form 6
stats = simulate(6)
Form 7
lte = simulate(7)
Table S1
	Input      Rows           Columns         Description
	x0         n              1               initial conditions (see footnote 9)
	xf         n              1               final state
	u          m              N + ρ − 1       control vector
	t          1              N + 1           time vector
	tf         1              1               final time
	ialg       1              1 to 4          integration algorithm
	action     1              1               (see below)
	f_num      1              1               (see below)
	params     (see below)    (see below)     system parameters
The following table describes the outputs that are returned by the various forms of simulate.
Table S2
If a division by zero occurs during a simulation, simulate returns the Matlab variable NaN, which stands
for ‘‘Not a Number’’, in the first component of each output. This can be detected, if desired, using the
Matlab function isnan().
Note: The length of the control vector depends on the control representation. Currently, all of the inte-
gration routines are set up to work with splines of order ρ defined on the knot sequence constructed from
t_N. The current implementation of RIOTS_95 does not allow repeated interior knots. The length (num-
ber of columns) of u and du is equal to N + ρ − 1 where N=length(t)-1 is the number of intervals in
9
x0 can be a matrix but only the first column is used.
Table S3
	ialg                           allowable spline orders
	0 (discrete)                   discrete-time controls
	1 (Euler)                      ρ = 1
	2 (RK2)                        ρ = 2
	3 (RK3)                        ρ = 2
	4 (RK4)                        ρ = 2, 3, 4
	5 (LSODA)                      ρ = 1, 2, 3, 4 (see footnote 10)
	6 (LSODA w/o Jacobians)        ρ = 1, 2, 3, 4 (see footnote 10)
When more than one spline order is possible, the integration routine determines the order of the spline representa-
tion by comparing the length of the control input u to the length of the time input t. If LSODA is called
with ialg=5, the user must supply ∂h/∂x and ∂l/∂x in the user-functions Dh and Dl. If the user has not pro-
grammed these Jacobians, LSODA must be called with ialg=6 so that, if needed, these Jacobians will
be computed by finite-differences. The different integration methods are discussed in detail following the
description of the various forms in which simulate can be called.
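For example, reusing the mesh from Session 1, the spline order is inferred from the number of coefficient columns:

>> N = 50;  t = [0:2.5/N:2.5];  x0 = [-5;-5];
>> u2 = zeros(1,N+1);                 % 51 columns: rho = 2 (piecewise linear)
>> u4 = zeros(1,N+3);                 % 53 columns: rho = 4 (piecewise cubic)
>> [f,x] = simulate(1,x0,u4,t,4,2);   % RK4 infers rho = 4 from length(u4) and length(t)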
Bugs
1. There may be a problem with computation of gradients for the variable step-size integration algo-
rithm (ialg=5,6) if the number of interior knots nknots is different than one (see description of form 1
and gradient computations for LSODA below).
[info,simed] = simulate(0,{params})
This form is used to load system parameters and to return system information. If params is supplied,
simulate will make a call to init so that the user’s code can read in these parameters. Normally params
is a vector. It can be a matrix in which case the user should keep in mind that Matlab stores matrices col-
umn-wise (Fortran style). If the system has no parameters then either omit params or set params=[].
If no output variables are present in this call to simulate the system message loaded on the call to activate
and other information about the system will be displayed.
10
The maximum spline order allowed by simulate when using LSODA can be increased by changing the pre-compiler define symbol
MAX_ORDER in adams.c.
The scalar output simed is used to indicate whether a call to simulate (form 1) has been made. If
simed=1 then a simulation of the system has occurred. Otherwise simed=0.
[f,x,du,dx0,p] = simulate(1,x0,u,t,ialg,action)
This form causes the system dynamics, ẋ = h(t, x, u) with x(a) = x0, to be integrated using the integra-
tion method specified by ialg (cf. Table S3). Also, the value f of the first objective function, and possi-
bly its gradients, du and dx0 and the adjoint p, can be evaluated. Only the first column of x0 is read.
The strictly increasing time vector t of length N + 1 specifies the integration mesh on [a, b] with
t(1) = a and t(N + 1) = b. The control u is composed of m rows of spline coefficients.
The calculations performed by simulate (form 2) depend on the value of action. These actions
are listed in the following table:
Table S4
The meaning of ‘‘internal knots’’ is discussed below in the description of gradient computation with
LSODA.
Example
The following commands, typed at the Matlab prompt, will simulate a three state system with two inputs
using integration algorithm RK4 and quadratic splines. The simulation time is from a = 0 until b = 2. 5
and there are N = 100 intervals in the integration mesh.
>> N=100;
>> t = [0:2.5/N:2.5];
>> x0 = [1;0;3.5];
>> u0 = ones(2,N+2); % u0(t)=[1;1];
>> [j,x] = simulate(1,x0,u0,t,4,2);
j = simulate(2,f_number,1)
[du,dx0,p] = simulate(2,f_number,action)
This form allows function values and gradients to be computed without re-simulating the system. A call
to this form must be preceded by a call to simulate (form 1). The results are computed from the most
recent inputs (x0,u,t,ialg) for the call to simulate, form 1. The following table shows the relation-
ship between the value of f_number, and the function value or gradient which is computed.
Table S5
	f_number                  function (or gradient) computed                              ν
	1 ≤ f_number ≤ n1         g_o^ν(ξ, x(b)) + ∫_a^b l_o^ν(t, x, u) dt                     ν = f_number
	n1 < f_number ≤ n2        l_ti^ν(t_k, x(t_k), u(t_k)), where, with n ≐ f_number − n1 − 1,
	                          k = n%(N + 1) + 1 and ν = (f_number − n1 − k)/(N + 1) + 1
	n2 < f_number ≤ n3        g_ei^ν(ξ, x(b))                                              ν = f_number − n2
	n3 < f_number ≤ n4        g_ee^ν(ξ, x(b))                                              ν = f_number − n3
Here n1 is the number of objective functions, n2 ≐ n1 + q_ti(N + 1), n3 ≐ n2 + q_ei and n4 ≐ n3 + q_ee, where
q_ti is the number of trajectory constraints, q_ei the number of endpoint inequality constraints and
q_ee the number of endpoint equality constraints. The notation n%m means the remainder after division of
n by m (n modulo m). Thus, for trajectory constraints, the ν-th constraint is evaluated at the mesh point t_k
with k = n%(N + 1) + 1.
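As a sketch of the calling sequence (the action value used in the last line is an assumption; the valid action codes are those listed in Table S4):

>> [f,x] = simulate(1,x0,u0,t,4,2);    % form 1: simulate the system and store the results
>> f1 = simulate(2,1,1);               % form 2: value of the first objective, no re-simulation
>> [du,dx0,p] = simulate(2,1,2);       % hypothetical action value requesting gradient information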
[xdot,zdot] = simulate(3,x,u,t,{f_num,{k}})
[xdot,zdot,pdot] = simulate(3,x,u,t,p,{k})
This form evaluates (as opposed to integrates) the quantities ẋ = h(t, x, u), ż = l_o^ν(t, x, u), and
	ṗ = − ( (∂h(t, x, u)/∂x)^T p + (∂l^ν(t, x, u)/∂x)^T )
at the points in t. If f_num is specified, ν = f_num; otherwise ν = 1. The function l^ν(⋅, ⋅, ⋅) is evaluated
according to Table S5 above. The last input, k, can only be supplied if t is a single time point. It is used
to indicate the discrete-time interval containing t. That is, k is such that t_k ≤ t < t_{k+1}. If k is given, l is
called with neq[3] = k − 1. In this call, the values in u represent pointwise values of u(t), not its spline
coefficients. The inputs x and u must have the same number of columns as t.
[h_x,h_u,l_x,l_u] = simulate(4,x,u,t,{f_num,{k}})
This form evaluates the derivatives ∂h(t, x, u)/∂x, ∂h(t, x, u)/∂u, ∂l^ν(t, x, u)/∂x and ∂l^ν(t, x, u)/∂u at the
points in t; the inputs have the same meaning as for form 3.
[g,g_x0,g_xf] = simulate(5,x0,xf,tf,{f_num})
This form evaluates g^ν(x0, xf), ∂g^ν(x0, xf)/∂x0, and ∂g^ν(x0, xf)/∂xf. If f_num is specified, ν = f_num.
Otherwise ν = 1. The input tf gets passed to the user functions g and Dg (see descriptions in §4) for
compatibility with future releases of RIOTS_95.
stats = simulate(6)
This form provides statistics on how many times the functions h and Dh have been evaluated, how many
times the system has been simulated to produce the trajectory x, and how many times the functions
f^ν(⋅, ⋅), g^ν(⋅, ⋅) or l_ti^ν(⋅, ⋅, ⋅), or their gradients, have been computed. The following table indicates what the com-
ponents of stats represent:
Component Meaning
stats(1) Number of calls to h.
stats(2) Number of calls to Dh.
stats(3) Number of simulations.
stats(4) Number of function evaluations.
stats(5) Number of gradient evaluations.
lte = simulate(7)
This form, which must be preceded by a call to simulate (form 1) with ialg=1,2,3,4, returns esti-
mates of the local truncation error for the fixed step-size Runge-Kutta integration routines. The local
truncation error is given, for k = 1, . . . , N, by
	lte_k = ( x_k(t_{k+1}) − x_{N,k+1} ,  z_k(t_{k+1}) − z_{N,k+1} )^T ,
where x_k(t_{k+1}) and z_k(t_{k+1}) are the solutions of
	ẋ = h(t, x, u) ,  x(t_k) = x_{N,k} ;   ż = l_o^1(t, x, u) ,  z(t_k) = 0 ,   t ∈ [t_k, t_{k+1}] ,
and x_{N,k+1} and z_{N,k+1} are the quantities computed by one Runge-Kutta step from x_{N,k} and 0, respectively.
These local truncation errors are estimated using double integration steps as described in [4, Sec. 4.3.1].
The local truncation error estimates are used by distribute (see description in §7) to redistribute the inte-
gration mesh points in order to increase integration accuracy.
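For example, the estimates can be inspected after any fixed step-size simulation (a minimal sketch):
>> [j,x] = simulate(1,x0,u0,t,4,2);    % simulate with RK4 (ialg=4)
>> lte = simulate(7);                  % local truncation error estimates for that run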
System Simulation
System simulation is accomplished by forward integration of the differential equations used to describe
the system. There are four fixed step-size Runge-Kutta integrators, one variable step-size integrator
(LSODA), and one discrete-time solver. The RK integrators and LSODA produce approximate solutions
to the system of differential equations
ẋ = h(t, x, u) ,   x(a) = ξ ,
ż = l(t, x, u) ,   z(a) = 0 ,
on the interval t ∈ [a, b]. The four Runge-Kutta integrators are Euler's method, improved Euler, Kutta's
formula and the classical Runge-Kutta method (see [7] or [4, Sec. 4.2]) and are of order 1, 2, 3 and 4,
respectively. The discrete-time integrator solves
x_{k+1} = h(t_k, x_k, u_k) ,   x_1 = ξ ,
z_{k+1} = l(t_k, x_k, u_k) ,   z_1 = 0 ,
for k = 1, . . . , N .
The variable step-size integrator is a program called LSODA [8,9]. LSODA can solve both stiff and
non-stiff differential equations. In the non-stiff mode, LSODA operates as an Adams-Moulton linear,
multi-step method. If LSODA detects stiffness, it switches to backwards difference formulae. When
operating in stiff mode, LSODA requires the system Jacobians ∂h(t, x, u)/∂x and ∂l(t, x, u)/∂x. If the user has not
supplied these functions, LSODA must be called using ialg=6 so that these quantities will be computed
using finite-difference approximations. Otherwise, LSODA should be called using ialg=5 so that the
analytic expressions for these quantities will be used.
The integration precision of LSODA is controlled by a relative tolerance and an absolute tolerance.
These both default to 1e − 8 but can be specified in ialg(3:4) respectively (see description of simu-
late, form 1). The only non-standard aspect of the operation of LSODA by simulate is that the integra-
tion is restarted at every mesh point t k due to discontinuities in the control spline u(⋅), or its derivatives, at
these points.
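For example, to simulate with LSODA using analytic Jacobians, four internal knots and tighter tolerances, ialg can be passed as a vector (a sketch; the tolerance values are arbitrary):
>> ialg = [5, 4, 1e-10, 1e-10];        % LSODA with Dh/Dl, nknots=4, rtol=atol=1e-10
>> [j,x] = simulate(1,x0,u0,t,ialg,2);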
Gradient Evaluation
In this section we discuss the computation of the gradients of the objective and constraint functions of
problem OCP with respect to the controls and free initial conditions. These gradients are computed via
backwards integration of the adjoint equations associated with each function.
j
Discrete-time Integrator. For the discrete-time integrator, the adjoint equations and gradients are given
by the following equations. For the objective functions, ν ∈ q_o, k = N, . . . , 1,

p_k = h_x(t_k, x_k, u_k)^T p_{k+1} + l_x^ν(t_k, x_k, u_k)^T ;   p_{N+1} = ∂g^ν(ξ, x_{N+1})^T / ∂x_{N+1} ,

df^ν(ξ, u)^T / du_k = h_u(t_k, x_k, u_k)^T p_{k+1} + l_u^ν(t_k, x_k, u_k)^T ,

df^ν(ξ, u)^T / dξ = ∂g^ν(ξ, x_{N+1})^T / ∂ξ + p_1 .

For the endpoint constraints, ν ∈ q_ei ∪ q_ee, k = N, . . . , 1,

p_k = h_x(t_k, x_k, u_k)^T p_{k+1} ;   p_{N+1} = ∂g^ν(ξ, x_{N+1})^T / ∂x_{N+1} ,

dg^ν(ξ, x_{N+1})^T / du_k = h_u(t_k, x_k, u_k)^T p_{k+1} ,

dg^ν(ξ, x_{N+1})^T / dξ = ∂g^ν(ξ, x_{N+1})^T / ∂ξ + p_1 .

For the trajectory constraints, ν ∈ q_ti, evaluated at the discrete-time index l ∈ { 1, . . . , N + 1 },

p_k = h_x(t_k, x_k, u_k)^T p_{k+1} ,  k = l − 1, . . . , 1 ;   p_l = l_x^ν(t_l, x_l, u_l)^T ,

dl^ν(t_l, x_l, u_l)^T / du_k = { h_u(t_k, x_k, u_k)^T p_{k+1} ,  k = 1, . . . , l − 1 ;
                                 l_u^ν(t_k, x_k, u_k)^T ,        k = l ;
                                 0 ,                             k = l + 1, . . . , N } ,

dl^ν(t_l, x_l, u_l)^T / dξ = p_1 .
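As an illustration, the recursion for a single objective function can be written in a few lines of Matlab (a minimal sketch, not part of RIOTS_95; hx, hu, lx, lu are hypothetical functions returning the transposed Jacobians h_x^T, h_u^T, l_x^T and l_u^T at step k, gx_final and gx_init are the transposed partials of g, and n, m, N are assumed to be defined):
p = gx_final;                      % p_{N+1} = (dg/dx_{N+1})'
dfdu = zeros(m,N);
for k = N:-1:1
  dfdu(:,k) = hu(k)*p + lu(k);     % df/du_k = h_u' p_{k+1} + l_u'
  p = hx(k)*p + lx(k);             % adjoint step: p_k
end
dfdxi = gx_init + p;               % df/dxi = (dg/dxi)' + p_1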
Runge-Kutta Integrators. For the fixed step-size Runge-Kutta integrators, the gradients are computed in
two steps. First, the gradient with respect to the control samples u_{k,j}, of which there are several
per integration interval, and with respect to ξ is computed. Second, the gradient with respect to the spline
coefficients, α_k, of u(t) is computed using the chain rule as follows,

df^ν(ξ, u) / dα_k = Σ_{i=1}^{N} Σ_{j=1}^{r} ( df^ν(ξ, u) / du_{i,j} ) ( du_{i,j} / dα_k ) ,   k = 1, . . . , N + ρ − 1 ,

where ρ is the order of the spline representation. Most of the terms in the outer summation are zero
because the spline basis elements have local support. The quantity du_{i,j} / dα_k is easily determined from a
recurrence relation for the B-spline basis [6].
Due to the special structure of the specific RK methods used by simulate there is a very efficient
formula, discovered by Hager [10], for computing df^ν/du_{i,j}. We have extended Hager's formula to deal
with the various constraints and the possibility of repeated control samples (see Chapter 2.4). To describe
this formula, we use the following notation for k = 1, . . . , N and j = 1, . . . , s:

A_{k,j} ≜ h_x(τ_{k,j}, y_{k,j}, u(τ_{k,j}))^T ,
B_{k,j} ≜ h_u(τ_{k,j}, y_{k,j}, u(τ_{k,j}))^T ,
lx^ν_{k,j} ≜ l_x^ν(τ_{k,j}, y_{k,j}, u(τ_{k,j}))^T ,
and
lu^ν_{k,j} ≜ l_u^ν(τ_{k,j}, y_{k,j}, u(τ_{k,j}))^T ,

where τ_{k,j} = t_k + c_j Δ_k are the intermediate Runge-Kutta times and, with a_{j,m} the parameters of the
Runge-Kutta method,

y_{k,1} ≜ x_k ,   y_{k,j} ≜ x_k + Δ_k Σ_{m=1}^{j−1} a_{j,m} h(y_{k,m}, u(τ_{k,m})) ,   j = 2, . . . , s .
The gradients of the objective and constraint functions with respect to the controls u_{k,j} and the initial
conditions ξ are given as follows. In what follows, a_{s+1,j} ≜ b_j, q_{k,s} = q_{k+1,0} and the standard adjoint
variables, p_k, are given by p_k = q_{k,0}. For the objective functions, we have for ν ∈ q_o, k = N, . . . , 1,

q_{N+1,0} = ∂g^ν(ξ, x_{N+1})^T / ∂x_{N+1} ,

q_{k,j} = q_{k+1,0} + Δ_k Σ_{m=j+1}^{s} a_{s−j+1,s−m+1} ( A_{k,m} q_{k,m} + lx^ν_{k,m} ) ,   j = s − 1, s − 2, . . . , 0 ,

df^ν(ξ, u)^T / du_{k,j} = b_j Δ_k ( B_{k,j} q_{k,j} + lu^ν_{k,j} ) ,   j = 1, . . . , s ,

df^ν(ξ, u)^T / dξ = ∂g^ν(ξ, x_{N+1})^T / ∂ξ + q_{1,0} .
For the endpoint constraints, we have for ν ∈ q_ei ∪ q_ee, k = N, . . . , 1,

q_{k,j} = q_{k+1,0} + Δ_k Σ_{m=j+1}^{s} a_{s−j+1,s−m+1} A_{k,m} q_{k,m} ,   j = s − 1, s − 2, . . . , 0 ;
q_{N+1,0} = ∂g^ν(ξ, x_{N+1})^T / ∂x_{N+1} ,

dg^ν(ξ, x_{N+1})^T / du_{k,j} = b_j Δ_k B_{k,j} q_{k,j} ,   j = 1, . . . , s ,

dg^ν(ξ, x_{N+1})^T / dξ = ∂g^ν(ξ, x_{N+1})^T / ∂ξ + q_{1,0} .
For the trajectory constraints, ν ∈ q_ti, evaluated at the discrete-time index l ∈ { 1, . . . , N + 1 },

q_{k,j} = q_{k+1,0} + Δ_k Σ_{m=j+1}^{s} a_{s−j+1,s−m+1} A_{k,m} q_{k,m} ,   k = l − 1, . . . , 1 ,  j = s − 1, s − 2, . . . , 0 ;
q_{l,0} = l_x^ν(t_l, x_l, u(t_l))^T ,

dl^ν(t_l, x_l, u(t_l))^T / du_{k,j} = { b_j Δ_k B_{k,j} q_{k,j} ,   k = 1, . . . , l − 1 ,  j = 1, . . . , s ;
                                        l_u^ν(t_k, x_k, u(τ_{k,j}))^T ,   k = l ;  j = 0 if l ≤ N, else j = s ;
                                        0 ,   otherwise } ,

dl^ν(t_l, x_l, u(t_l))^T / dξ = q_{1,0} .
For method RK4, we have the special situation that τ_{k,2} = τ_{k,3} for all k because c_2 = c_3 = 1/2.
Hence, there is a repeated control sample: u_{k,2} = u(τ_{k,2}) = u(τ_{k,3}). Thus, for any function f, the
derivatives with respect to the three distinct control samples u_{k,1}, u_{k,2} and u_{k,3} are given, in terms of
the per-stage derivatives above, by the expressions

df/du_{k,1} = df/du_{k,1} ,   df/du_{k,2} = df/du_{k,2} + df/du_{k,3} ,   df/du_{k,3} = df/du_{k,4} .
Variable Step-Size Integrator (LSODA). For the variable step-size integrator, LSODA, the adjoint
equations and gradients are given by the equations below which require knowledge of x(t) for all
t ∈ [a, b]. As in [11], x(t) is stored at the internal knots { t_k + (i/(n_knots + 1)) Δ_k },  i = 0, . . . , n_knots + 1,
k = 1, . . . , N, during the forward system integration. By default, n_knots = 1, but the user can specify n_knots ≥ 1 by setting
ialg(2) = nknots (see description of simulate, form 1). Then, during the computation of the adjoints
and gradients, x(t) is determined by evaluating the quintic11 Hermite polynomial which interpolates
(t, x(t), ẋ(t)) at the nearest three internal knots within the current time interval [t k , t k+1 ]. Usually
nknots = 1 is quite sufficient.
We now give the formulae for the adjoints and the gradients. It is important to note that, unlike the
fixed step-size integrators, the gradients produced by LSODA are not exact. Rather, they are numerical
approximations to the continuous-time gradients for the original optimal control problem. The accuracy
of the gradients is affected by the integration tolerance and the number of internal knots used to store
values of x(t). Under normal circumstances, the gradients will be less accurate than the integration toler-
ance. For the objective functions, ν ∈ q_o,
11
The order of the Hermite polynomial can be changed by setting the define'd symbol ORDER in the code adams.c. If the trajectories are
not at least five times differentiable between breakpoints, then it may be helpful to reduce the ORDER of the Hermite polynomials and increase
n_knots.
ṗ = −( h_x(t, x, u)^T p + l_x^ν(t, x, u)^T ) ,   t ∈ [a, b] ;   p(b) = ∂g^ν(ξ, x(b))^T / ∂x(b) ,

df^ν(ξ, u)^T / dα_k = ∫_a^b ( h_u(t, x, u)^T p(t) + l_u^ν(t, x, u)^T ) φ_{N,ρ,k}(t) dt ,   k = 1, . . . , N + ρ − 1 ,

df^ν(ξ, u)^T / dξ = ∂g^ν(ξ, x(b))^T / ∂ξ + p(a) .
For the endpoint constraints, ν ∈ q_ei ∪ q_ee,

ṗ = −h_x(t, x, u)^T p ,   t ∈ [a, b] ;   p(b) = ∂g^ν(ξ, x(b))^T / ∂x(b) ,

dg^ν(ξ, x(b))^T / dα_k = ∫_a^b h_u(t, x, u)^T p(t) φ_{N,ρ,k}(t) dt ,   k = 1, . . . , N + ρ − 1 ,

dg^ν(ξ, x(b))^T / dξ = ∂g^ν(ξ, x(b))^T / ∂ξ + p(a) .
For the trajectory constraints, ν ∈ q_ti, evaluated at time t = t_l, l ∈ { 1, . . . , N + 1 },

ṗ = −h_x(t, x, u)^T p ,   t ∈ [a, t_l] ;   p(t_l) = l_x^ν(t_l, x(t_l), u(t_l))^T ,

dl^ν(t_l, x_l, u(t_l))^T / dα_k = ∫_a^{t_l} h_u(t, x, u)^T p(t) φ_{N,ρ,k}(t) dt ,   k = 1, . . . , N + ρ − 1 ,

dl^ν(t_l, x_l, u(t_l))^T / dξ = p(a) .
The numerical evaluation of the integrals in these expressions is organized in such a way that they are
computed during the backwards integration of ṗ(t). Also, the computation takes advantage of the fact
that the integrands are zero outside the local support of the spline basis elements φ_{N,ρ,k}(t).
Purpose
This function provides a check for the accuracy of the user-supplied derivatives Dh, Dl and Dg by com-
paring these functions to derivative approximations obtained by applying forward or central finite-
differences to the corresponding user-supplied functions h, l and g.
Calling Syntax
[errorA,errorB,max_error] = check_deriv(x,u,t,{params},{index},
{central},{DISP})
Description
The inputs x ∈ IRn , u ∈ IRm and t ∈ IR give the nominal point about which to evaluate the derivatives
h x (t, x, u), h u (t, x, u), l x (t, x, u), l u (t, x, u), g x (t, x, u) and gu (t, x, u). If there are system parameters (see
description of init in §3), they must be supplied by the input params. If specified, index indicates the dis-
crete-time index for which t(index) ≤ t ≤ t(index+1). This is only needed if one of the user-
supplied system functions uses the discrete-time index passed in neq[3].
The error in each derivative is estimated as the difference between the user-supplied derivative and
its finite-difference approximation. For a generic function f(x), this error is computed, with e_i the i-th
unit vector and δ_i a scalar, as

E_i = || [ f(x + δ_i e_i) − f(x) ] / δ_i − (df(x)/dx) e_i || ,

or with the corresponding centered difference quotient, [ f(x + δ_i e_i) − f(x − δ_i e_i) ] / (2δ_i),
if central differences are used. The perturbation size is δ_i = ε_mach^{1/3} max { 1, |x_i| }. Central difference
approximations are selected by setting the optional argument central to a non-zero value. Otherwise,
forward difference approximations will be used.
The first term in the Taylor expansion of E_i with respect to δ_i is of order O(δ_i²) for central differ-
ences and O(δ_i) for forward differences. More details can be found in [12, Sec. 4.1.1]. Thus, it is some-
times useful to perform both forward and central difference approximations to decide whether a large dif-
ference between the derivative and its finite-difference approximation is merely a result of scaling or
is actually due to an error in the implementation of the user-supplied derivative. If the derivative is
correct, then E_i should decrease substantially when central differences are used.
If DISP=0, only the maximum error is displayed.
The outputs errorA and errorB return the errors for h x (t, x, u) and h u (t, x, u) respectively. The
output max_error is the maximum error detected for all of the derivatives.
The following partial output illustrates a derivative check using forward differences; the call shown further below repeats the check with central differences, which substantially reduces the reported error in l_x:
Error in h_u =
1.0e-10 *
0
0.9421
For function 1:
Error in l_x =
1.0e-04 *
-0.3028 0
For function 1:
Error in g_x0 = 0 0
Error in g_xf = 0 0
>> check_deriv([-5;-5],0,0,[],0,1);
========================================================================
System matrices:
Error in h_x =
1.0e-10 *
0 -0.0578
-0.2355 -0.3833
Error in h_u =
1.0e-10 *
0
0.9421
For function 1:
Error in l_x =
1.0e-10 *
0.5782 0
Error in l_u = 0
For function 1:
Error in g_x0 = 0 0
Error in g_xf = 0 0
Purpose
This function checks the accuracy of gradients of the objective and constraint functions, with respect to
the controls and initial conditions, as computed by simulate, forms 1 and 2. It also provides a means to indi-
rectly check the validity of the user-supplied derivatives Dh, Dl and Dg.
Calling Syntax
max_error = check_grad(i,j,k,x0,u,t,ialg,{params},{central},
{DISP})
Description
The inputs x0, u, t and ialg specify the inputs to the nominal simulation simulate(1,x0,u,t,ialg,0) prior
to the computation of the gradients. The gradients are tested at the discrete-time indices as specified in
the following table:
Index Purpose
i Index of the spline coefficient of the control u that will be perturbed. If i=0, the
gradients with respect to u will not be checked.
j Index of the component of the initial state vector, ξ, that will be perturbed. If j=0, the
gradients with respect to ξ will not be checked.
k For each trajectory constraint, k indicates the discrete-time in-
dex, starting with k=1, at which the trajectory constraints will be
evaluated. If k=0, the trajectory constraint gradients will not be
checked.
Example
The following example checks the tenth component of the control gradient and the second component of
initial condition gradient as computed by RK2 using central differences. The integration is performed on
the time interval t ∈ [0, 2.5] with N = 50 intervals. The gradients are evaluated for the second order
spline control u(t) = 1 for all t (i.e., α_k = 1, k = 1, . . . , N + 1).
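A call matching this description might look as follows (a sketch; x0 stands for the problem's initial state and is not specified here):
>> t = [0:2.5/50:2.5];                  % mesh with N = 50 intervals on [0, 2.5]
>> u0 = ones(1,51);                     % second order spline, all coefficients equal to 1
>> check_grad(10,2,0,x0,u0,t,2,[],1);   % RK2, central differences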
============================================================================
Using perturbation size of 6.05545e-06
Evaluating function 1.
error_u = 1.84329e-09
error_x0 = -4.88427e-11
Relative error in control gradient = 2.52821e-07%
Gradient OK
Purpose
This function provides a convenient interface to simulate (form 2), for computing function and gradient
values. A system simulation must already have been performed for this function to work.
Calling Syntax
[f,du,dx0,p] = eval_fnc(type,num,k)
Description of Inputs
type A string that specifies the type of function to be evaluated. The choices are
’obj’ Objective function
’ei’ Endpoint inequality constraint
’ee’ Endpoint equality constraint
’traj’ Trajectory constraint
num Specifies which function of the type specified by type is to be evaluated.
k For trajectory constraints only. Specifies the index for the time, t k , in the current integration
mesh at which to evaluate the trajectory constraint. If k is a vector, the trajectory constraint
will be evaluated at the times specified by each mesh point index in k.
Description of Outputs
Examples
The following examples assume that a simulation has already been performed on a system that has at least
two endpoint equality constraints and a trajectory constraint. The first call to eval_fnc evaluates the sec-
ond endpoint equality constraint.
>> f = eval_fnc('ee',2)
f =
    0.2424
Since equality constraints should evaluate to zero, this constraint is violated. This next call evaluates the
first trajectory constraint at the times t k , k = 5, . . . , 15, in the current integration mesh.
>> eval_fnc(’traj’,1,5:15)
ans =
Columns 1 through 7
Columns 8 through 11
Since inequality constraints are satisfied if less than or equal to zero, this trajectory constraint is satisfied
at these specified points.
This section describes the suite of optimization programs that can be used to solve various cases of the
optimal control problem OCP. These programs seek local minimizers to the discretized problem. The
most general program is riots which converts OCP into a mathematical program which is solved using
standard nonlinear programming techniques. Besides being able to solve the largest class of optimal con-
trol problems, riots is also the most robust algorithm amongst the optimization programs available in
RIOTS_95. However, it can only handle medium size problems. The size of a problem, the number of
decision variables, is primarily determined by the number of control inputs and the discretization level.
What is meant by medium size problems is discussed in the description of riots.
The most restrictive program is pdmin which can solve optimal control problems with constraints
consisting of only simple bounds on ξ and u. State constraints are not allowed. The algorithm used by
pdmin is the projected descent method described in [4, Chap. 3]. Because of the efficiency of the pro-
jected descent method, pdmin can solve large problems.
Problems that have, in addition to simple bounds on u and ξ, endpoint equality constraints can be
solved by aug_lagrng. The algorithm is a multiplier method which relies upon pdmin to solve a
sequence of problems with only simple bound constraints. This program provides a good example of how
the toolbox style of RIOTS_95 can be used to create a complex algorithm from a simpler one. Currently,
the implementation of aug_lagrng is fairly naive and has a great deal of room left for improvement.
Also, it would be relatively straightforward to add an active set strategy to aug_lagrng in order to allow it
to handle inequality constraints.
Finally, the program outer is an experimental outer loop which repeatedly calls riots to solve a
sequence of increasingly accurate discretizations (obtained by calls to distribute) of OCP in order to effi-
ciently compute the optimal control to a specified accuracy.
While it is possible with some optimal control problems to achieve higher order accuracies, this is a non-
generic situation. The order of spline representation should therefore not exceed the accuracies listed in
the second column of this table. Thus, for RK4, even though cubic splines are allowed there is usually no
reason to use higher than quadratic splines (ρ = 3).
The orders listed in the above table are usually only achieved for unconstrained problems. For prob-
lems with control constraints it is typically impossible to achieve better than first order accuracy. This is
even true if the discontinuities in the optimal control are known a priori since the locations of these dis-
continuities will not coincide with the discontinuities of the discretized problems. For problems with state
constraints, the issue is more complicated. In general, we recommend using second order splines (except
for Euler’s method) for problems with control and/or trajectory constraints. Even if first order accuracy is
all that can be achieved, there is almost no extra work involved in using second order splines. Further-
more, second order splines will often give somewhat better results than first order splines even if the accu-
racy is asymptotically limited to first order.
A second consideration is that the overall solution error is due to both the integration error and the
error caused by approximating an infinite dimensional function, the optimal control, with a finite dimen-
sional spline. Because of the interaction of these two sources of error and the fact that the accuracy of the
spline representations is limited to the above table, improving the integration accuracy by using a higher
order method does not necessarily imply that the accuracy of the solution to the approximating problem
will improve. However, even if the spline accuracy is limited to first order, it is often the case that the
integration error, which is of order O(Δ^s), where s is the order of the RK method, still has a significantly
greater effect on the overall error than the spline error (especially at low discretization levels). This is
partly due to the fact that errors in the control are integrated out by the system dynamics. Thus, it is often
advantageous to use higher-order integration methods even though the solution error is asymptotically
limited to first order by the spline approximation error.
The importance of the RK order, in terms of reducing the overall amount of computational work
required to achieve a certain accuracy, depends on the optimization program being used. Each iteration
of riots requires the solution of one or more dense quadratic programs. The dimension of these quadratic
programs is equal to the number of decision parameters (which is m(N + ρ − 1) plus the number of free
initial conditions). Because the work required to solve a dense quadratic program goes up at least cubi-
cally with the number of decision variables, at a certain discretization level most of the work at each itera-
tion will be spent solving the quadratic program. Thus, it is usually best to use the fourth order RK
method to achieve as much accuracy as possible for a given discretization level. An exception to this rule
occurs when problem OCP includes trajectory constraints. Because a separate gradient calculation is per-
formed at each mesh point for each trajectory constraint, the amount of work increases significantly as the
integration order is increased. Thus, it may be beneficial to use a RK3 or even RK2 depending on the
problem.
On the other hand, for the optimization programs pdmin and aug_lagrng (which is based on
pdmin) the amount of work required to solve the discretized problem is roughly linear in the number of decision variables. The following table summarizes reasonable choices of integration method and spline order for different classes of problems.
Table O2
type of problem                        optimization program    RK order (ialg)             spline order (ρ)
no control nor trajectory constraints  pdmin/aug_lagrng        4 (N small), 2 (N large)    3 (N small), 2 (N large)
                                       riots                   4                           3
control constraints                    pdmin/aug_lagrng        4 (N small), 2 (N large)    2
                                       riots                   4                           2
trajectory constraints                 riots                   2 (see footnote 12)         2
Coordinate Transformation
All of the optimization programs in RIOTS_95 solve finite-dimensional approximations to OCP obtained
by the discretization procedure described in the introduction of §5. Additionally, a change of basis is per-
formed for the spline control subspaces. The new basis is orthonormal. This change of basis is accom-
plished by computing the matrix M with the property that for any two splines u(⋅) and v(⋅) with coeffi-
cients α and β,
12
Sometimes a higher-order method must be used to provide a reasonable solution to the system differential equations.
13
A spline of higher order would be too smooth since RIOTS_95 currently does not allow splines with repeated interior knots.
⟨ u, v ⟩_{L2} = ⟨ α̃, β̃ ⟩ ,   where α̃ = α M^{1/2} and β̃ = β M^{1/2} .
In words, the L 2 -inner product of any two splines is equal to the Euclidean inner product of their coeffi-
cients in the new basis. The matrix M is referred to as the transform matrix and the change of basis is
referred to as the coordinate transformation.
By performing this transformation, the standard inner-product of decision variables (spline coeffi-
cients) used by off-the-shelf programs that solve mathematical programs is equal to the function space
inner product of the corresponding splines. Also, because of the orthonormality of the new basis, the con-
ditioning of the discretized problems is no worse than the conditioning of the original optimal control
problem OCP. In practice, this leads to solutions of the discretized problems that are more accurate and
that are obtained in fewer iterations than without the coordinate transformation. Also, any termination
criteria specified with an inner product become independent of the discretization level in the new basis.
In effect, the coordinate transformation provides a natural column scaling for each row of control
coefficients. It is recommended that, if possible, the user attempt to specify units for the control inputs so
that the control solutions have magnitude of order one. Choosing the control units in this way is, in effect,
a row-wise scaling of the control inputs.
One drawback to this coordinate transformation is that for splines of order two and higher the matrix
M−1/2 is dense. A diagonal matrix would be preferable for three reasons. First, computing M−1/2 is com-
putationally intensive for large N . Second, there would be much less work involved in transforming
between bases: each time a new iterate is produced by the mathematical programming software, it has to
be un-transformed to the original basis. Also, every gradient computation involves an inverse transforma-
tion. Third, simple control bounds are converted into general linear constraints by the coordinate transfor-
mation. This point is discussed next.
Control bounds under the coordinate transformation. Simple bounds on the spline coefficients
take the form a_k ≤ α_k ≤ b_k, k = 1, . . . , N + ρ − 1. If a_k and b_k are in fact constants, a and b, then for all
t, a ≤ u(t) ≤ b. Now, under the coordinate transformation, simple bounds of this form become
( a_1, . . . , a_{N+ρ−1} ) ≤ α̃ M^{−1/2} ≤ ( b_1, . . . , b_{N+ρ−1} ) .
Thus, because of the coordinate transformation, the simple bounds are converted into general linear
bounds. Since this is undesirable from an efficiency point of view, RIOTS_95 instead replaces the bounds
with
( a_1, . . . , a_{N+ρ−1} ) M^{1/2} ≤ α̃ ≤ ( b_1, . . . , b_{N+ρ−1} ) M^{1/2} .
For first order splines (piecewise constant), these bounds are equivalent to the actual bounds since M1/2 is
diagonal. For higher order splines, these bounds are not equivalent. They are, however, approximately
correct since the entries of the matrix M fall off rapidly to zero away from the diagonal.
It turns out that the problems enumerated above can be avoided when using second order splines
(piecewise linear) which are, in any case, the recommended splines for solving problems with control
bounds. Instead of using M in the coordinate transformation, a diagonal approximation to M can be used.
For second order splines, this diagonal transformation is used if (i) the problem has control
bounds, (ii) RK2 is being used as the integration method, or (iii) N > 300. The latter case is employed
because the time it takes to compute the transform becomes excessive when N is large. When ρ > 2, the
transformation is skipped altogether if (i) N > 300 or (ii) LSODA is being used on a problem with con-
trol bounds^14.
14
Recall from Table S3 (p. 36) that splines of order greater than 2 can only be used with RK4 and LSODA.
The first six inputs are the same for all of the optimization programs; they are listed in the following table.
Default values for vectors apply to each component of that vector. Specifying [] for an input causes that
input to be set to its default value. In the following, N is the discretization level and ρ is the order of the
control splines.
Table O3
The first two outputs are the same for all of the optimization programs; they are listed in the following
table:
Table O4
Purpose
This function uses pdmin as an inner loop for an augmented Lagrangian algorithm that solves optimal
control problems with, in addition to simple bounds on ξ and u, endpoint equality constraints. Only one
objective function is allowed.
The user is urged to check the validity of the user-supplied derivatives with the utility program
check_deriv before attempting to use aug_lagrng.
Calling Syntax
[u,x,f,lambda,I_i] = aug_lagrng([x0,{fixed,{x0min,x0max}}],u0,t,
Umin,Umax,params,N_inner,N_outer,
ialg,{method},{[tol1,tol2]},{Disp})
This program calls pdmin to minimize a sequence of augmented Lagrangian functions of the form

L_{c,λ}(η) = f(η) − Σ_{ν=1}^{q_ee} λ_ν g_ee^ν(η) + (1/2) Σ_{ν=1}^{q_ee} c_ν g_ee^ν(η)² ,

subject to simple bounds on ξ and u. The value of the augmented Lagrangian and its gradient are sup-
plied to pdmin by a_lagrng_fnc via extension 1 (see description of pdmin).
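The value of L_{c,λ}(η) is simple to form once the objective and constraint values are available (a minimal sketch with hypothetical numbers, not values produced by RIOTS_95):
g_val  = [0.2; -0.1];              % endpoint equality constraint values g_ee(eta)
lambda = [1.5; 0.0];               % current multiplier estimates
c      = [1; 1];                   % penalty parameters
f_val  = 3.7;                      % objective value f(eta)
L = f_val - lambda'*g_val + 0.5*sum(c.*g_val.^2);   % augmented Lagrangian value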
The values of the Lagrange multiplier estimates λ_ν, ν = 1, . . . , q_ee, are determined in one of two
ways depending on the setting of the internal variable METHOD in aug_lagrng.m. Initially λ = 0.
Multiplier Update Method 1. This method adjusts the multipliers at the end of each iteration of
pdmin by solving the least-squares problem

λ = argmin_{μ ∈ IR^{q_ee}} || ∇f(η) − Σ_{ν=1}^{q_ee} μ_ν ∇g_ee^ν(η) ||²_{I_i} ,

where the norm is taken only on the unconstrained subspace of decision variables, which is indicated by the
index set I_i. This update is performed by multiplier_update which is called by pdmin via extension
2. If update method 1 is used, the tolerance requested for the inner loop is decreased by a factor of ten on
each outer iteration, starting from 10^{min{6, N_outer}} ε_mach^{1/2}, until the tolerance is ε_mach^{1/2}.
Multiplier Update Method 2. This method is the standard ‘‘method of multipliers’’ which solves the
inner loop completely and then uses the first order multiplier update

λ_ν ← λ_ν − c_ν g_ee^ν(η) ,   ∀ ν ∈ I ,

where

I ≜ { ν ∈ q_ee : |g_ee^ν(η)| ≤ (1/4) |g_ee^ν(η_previous)|  or  |g_ee^ν(η)| ≤ tol2 } .

If update method 2 is used, the tolerance requested for the inner loop is fixed at ε_mach^{1/2}.
Penalty Update. The initial values for the constraint violation penalties are c_ν = 1, ν = 1, . . . , q_ee. It
may be helpful to use larger initial values for highly nonlinear problems. The penalties are updated at the
end of each outer iteration according to the rule

c_ν ← 10 c_ν ,   ∀ ν ∉ I ,

where I is as defined above.
Note that this algorithm is implemented mainly to demonstrate the extensible features of pdmin and is
missing features such as (i) constraint scaling, (ii) an active set method for handling inequality endpoint con-
straints, (iii) a mechanism for decreasing constraint violation penalties when possible and, most impor-
tantly, (iv) an automatic mechanism for setting the termination tolerance for each call to pdmin.
Notes:
1. On return from a call to aug_lagrng, the variable opt_program will be defined in the Matlab
workspace. It will contain the string ’aug_lagrng’.
Purpose
This program calls riots to solve problems defined on a sequence of different integration meshes, each of
which results in a more accurate approximation to OCP than the previous mesh. The solution obtained for
one mesh is used as the starting guess for the next mesh.
The user is urged to check the validity of the user-supplied derivatives with the utility program
check_deriv before attempting to use outer.
Calling Syntax
[new_t,u,x,J,G,E] = outer([x0,{fixed,{x0min,x0max}}],u0,t,
Umin,Umax,params,N_inner,[N_outer,{max_N}]
ialg,{[tol1,tol2,tol3]},{strategy},{Disp})
and

|| u_N − u* || ≤ tol3 ( 1 + || u_N ||_∞ ) b ,

where b is the nominal final time. The default values for these tolerance factors are
[ ε_mach^{1/3}, ε_mach^{1/4}, ε_mach^{1/6} ].
strategy Passed on to distribute to select the mesh redistribution strategy.
Default = 3.
Disp Passed on to riots to control amount of displayed output. Default = 1.
|| η ||_{H2} ≜ ( || ξ ||²₂ + ∫_a^b || u(t) ||²₂ dt )^{1/2} .
Description of Algorithm
outer is an outer loop for riots. During each iteration, riots is called to solve the discretized problem on
the current mesh starting from the solution of the previous call to riots interpolated onto the new mesh.
After riots returns a solution, est_errors and control_error are called to provide estimates of certain
quantities that are used to determine whether outer should terminate or if it should refine the mesh. If
necessary, the mesh is refined by distribute, with FAC=10, according to strategy except following
the first iteration. After the first iteration, the mesh is always doubled.
After each iteration, the following information is displayed: the H 2 -norm of the free portion of the
gradient of the Lagrangian, the sum of constraint errors, objective function value, and integration error of
the integration algorithm ialg at the current solution. All of these quantities are computed by
est_errors. The first three values are estimates obtained using LSODA with a tolerance set to about one
thousandth of the integration error estimate. The control solution is plotted after each iteration (although
the time axis is not scaled correctly for free final time problems).
Additionally, following all but the first iteration, the change in the control solution from the previous
iteration and an estimate of the current solution error, || η_N* − η* ||_{H2}, are displayed.
Notes:
1. If solutions exhibit rapid oscillations it may be helpful to add a penalty on the piecewise derivative
variation of the control by setting the variable VAR in outer.m to a small positive value.
2. The factor by which distribute is requested to increase the integration accuracy after each iteration
can be changed by setting the variable FAC in outer.m.
3. An example using outer is given in Session 4 (§3).
Purpose
This is an optimization method based on the projected descent method [3]. It is highly efficient but does
not solve problems with general constraints or more than one objective function.
The user is urged to check the validity of the user-supplied derivatives with the utility program
check_deriv before attempting to use pdmin.
Calling Syntax
[u,x,J,inform,I_a,I_i,M] = pdmin([x0,{fixed,{x0min,x0max}}],u0,t,
Umin,Umax,params,[miter,{tol}],
ialg,{method},{[k;{scale}]},{Disp})
Description of Inputs
The first six inputs are described in Table O3. The remainder are described here.
miter The maximum number of iterations allowed.
tol Specifies the tolerance for the following stopping criteria:

|| g_k ||_{I_k} / |I_k| < tol^{2/3} ( 1 + |f(η_k)| ) ,   x_k^i = 0 , ∀ i ∈ A_k ,

where g_k is the derivative of f(⋅) in transformed coordinates at the k-th iterate, I_k is the set
of inactive bound indices and A_k is the set of active bound indices. Default: ε_mach^{1/2}.
ialg Specifies the integration algorithm used by simulate.
method A string that specifies the method for computing descent directions in the unconstrained sub-
space. The choices are:
’’ limited memory quasi-Newton (L-BFGS)
’steepest’ steepest descent
’conjgr’ Polak-Ribière conjugate gradient method
’vm’ limited memory quasi-Newton (L-BFGS)
The default method is the L-BFGS method.
k This value is used to determine a perturbation with which to compute an initial scaling for the
objective function. Typically, k is supplied from a previous call to pdmin or not at all.
scale This value is used to determine a perturbation with which to compute an initial function scal-
ing. Typically, scale is supplied from a previous call to pdmin or not at all.
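A typical call, after setting up the mesh and an initial control guess, might look as follows (a sketch; x0 and the bounds are problem-dependent):
>> N = 50; t = [0:1/N:1];
>> u0 = zeros(1,N+1);                       % second order spline coefficients
>> [u,x,J] = pdmin(x0,u0,t,-1,1,[],100,4);  % bounds -1 <= u <= 1, at most 100 iterations, RK4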
Description of Outputs
inform(1) Reason for termination (see next table).
inform(2) Function space norm of the free portion of ∇f(η), η = (u, ξ).
inform(3) Final step-size exponent k = log λ / log β, where λ is the Armijo step-
length and β = 3/5.
inform(4) The value of the objective function scaling.
Depending on the setting of Disp, pdmin displays a certain amount of information at each iteration.
This information is displayed in columns. In the first column is the number of iterations completed; next
is the step-size, λ = β^k, with k shown in parentheses; next is || ∇f(η) ||_{I_k}, which is the norm of the gradient
with respect to those decision variables that are not at their bounds; next is a four (three if there are no
upper or lower bounds) letter sequence of T's and F's where a T indicates that the corresponding termina-
tion test, described above, is satisfied; next is the value of the objective function; and in the last column,
an asterisk appears if the set of indices corresponding to constrained variables changed from the previous
iteration.
Because pdmin is designed to be callable by other optimization programs, it includes three extensions
that allow the user to customize its behavior. These extensions are function calls that are made to user
supplied subroutines at certain points during each iteration. They allow the user to (i) construct the
objective function and its gradients, (ii) specify termination criteria and perform computations at the end
of each pdmin iteration, and (iii) add additional tests to the step-size selection procedure. The use of the
first two of these extensions is demonstrated in the program aug_lagrng.
Notes:
6. Control bounds can be violated when using splines of order ρ > 2 if the spline coordinate transformation
is in effect. This is only possible with RK4 because splines of order ρ > 2 are only allowed for RK4 and
LSODA and the transform is turned off for LSODA if bounds are used.
Purpose
This is the main optimization program in RIOTS_95. The algorithm used by riots is a sequential
quadratic programming (SQP) routine called NPSOL. Multiple objective functions can be handled indi-
rectly using the transcription described in §2.3.
The user is urged to check the validity of the user-supplied derivatives with the utility program
check_deriv before attempting to use riots.
Calling Syntax
[u,x,f,g,lambda2] = riots([x0,{fixed,{x0min,x0max}}],u0,t,Umin,Umax,
params,[miter,{var,{fd}}],ialg,
{[eps,epsneq,objrep,bigbnd]},{scaling},
{disp},{lambda1});
Description of Inputs
The first six inputs are described in Table O3. The remainder are described here.
miter The maximum number of iterations allowed.
var Specifies a penalty on the piecewise derivative variation[4, Sec. 4.5] 15 of the control to be
added to the objective function. Can only be used with first and second order splines.
Adding a penalty on the piecewise derivative variation of the control is useful if rapid oscilla-
tions are observed in the numerical solution. This problem often occurs for singular
problems [13,14] in which trajectory constraints are active along singular arcs. The penalty
should be ten to ten thousand times smaller than the value of the objective function at a solu-
tion.
fd If a non-zero value is specified, the gradients for all functions will be computed by finite-
difference approximations. In this case Dh, Dg, and Dl will not be called. Default: 0.
ialg Specifies the integration algorithm used by simulate.
eps Overall optimization tolerance. For NPSOL, eps is squared before calling NPSOL. See the
SQP user’s manual for more details. Default: 10−6 .
epsneq Nonlinear constraint tolerance. Default: 10−4 .
objrep Indicates function precision. A value of 0 causes this feature to be ignored. Default: 0.
bigbnd A number larger than the largest magnitude expected for the decision variables. Default: 10^6.
scaling Allowable values are 00, 01, 10, 11, 12, 21, 22. Default: 00. See description below.
disp Specify zero for minimal displayed output. Default: 1.
15
The piecewise derivative variation is smoothed to make it differentiable by squaring the terms in the summation.
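A basic call that accepts the defaults for the optional arguments might look as follows (a sketch; x0, u0 and t are assumed to have been set up as for the other optimization programs):
>> [u,x,f] = riots(x0,u0,t,[],[],[],100,4);   % no bounds, at most 100 iterations, RK4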
Description of Outputs
Table O5
lambda2 Vector of Lagrange multipliers. This output has two columns if NPSOL is used. The first
column contains the Lagrange multipliers. The first m(N + ρ − 1) components are the multi-
pliers associated with the simple bounds on u. These are followed by the multipliers associ-
ated with the bounds on any free initial conditions. Next are the multipliers associated with
the general constraint, given in the same order as the constraint violations in the output g.
The second column of lambda2 contains information about the constraints which is used by
riots if a warm start using lambda1 is initiated (as described below).
Scaling
There are several heuristic scaling options available in riots for use with badly scaled problems. There
are two scaling methods for objective functions and two scaling methods for constraints. These are
selected by setting scaling to one of the two-digit numbers given in the following table:
Table O6
à = 1 Ä Å Å À ,Å À
Å
2 f (À + À ) − f Ä (À ) − ∇ f Ä (À ), À
/
\
\
/I
/ \
,
0 0 0 \ 0 0/I
Á À
where [⋅]# is the projection operator that projects its argument into the region feasible with respect to the
Ã
simple bounds on u and , and I is the set of indices of 0 corresponding to components which are in the
ÅÀ
Ã
interior of this feasible region ( is the distance along the projected steepest descent direction, , to the
minimum of a quadratic fit to f (⋅)). If ≥ 10−4 , scale the -th objective function by o = FACTOR . Â ÃÄ Ã
Ã Ä À Ã
Otherwise, compute = | ∇ f ( 0 )||. If ≥ 10−3 , set o = FACTOR . Otherwise, use function scaling 1. ÃÄ Ã
Constraint Scaling 1: For each ν ∈ q_ei, the endpoint inequality constraints are scaled by

σ_ei^ν = FACTOR / max { 1, |g_ei^ν(η_0)| } ,

for each ν ∈ q_ee, the endpoint equality constraints are scaled by

σ_ee^ν = FACTOR / max { 1, |g_ee^ν(η_0)| } ,

and, for each ν ∈ q_ti, the trajectory inequality constraints are scaled by

σ_ti^ν = FACTOR / max { 1, max_{k ∈ {1,...,N+1}} |l_ti^ν(t_k, x_k, u_k)| } .
Constraint Scaling 2: The trajectory constraint scalings are computed in the same way as for con-
straint scaling method 1. For each ν ∈ q_ei, the endpoint inequality constraints are scaled by σ_ei^ν = σ and,
for each ν ∈ q_ee, the endpoint equality constraints are scaled by σ_ee^ν = σ, where σ is determined as fol-
lows. If |g^ν(η_0)| ≥ 10^{−3}, let

σ = FACTOR / |g^ν(η_0)| ,

otherwise, if || ∇g^ν(η_0) || ≥ 10^{−3}, let

σ = FACTOR / || ∇g^ν(η_0) || ,

otherwise do not scale.
Scaling will not always reduce the amount of work required to solve a specific problem. In fact, it
can be detrimental. In the following table, we show the number of iterations required to solve some of the
sample problems with scaling set to 0, 10 and 20.
Table O7
Problem                               ialg   scaling: 0   10   20
LQR                                     2             5    7    7
Rayleigh w/o endpoint constraint        2            18   17   14
Rayleigh with endpoint constraint       2            24   29   19
Goddard w/o trajectory constraint       4            69   29   45
Goddard with trajectory constraint      4            22   17   19
For the last row, riots was called with var = 10^{−4}. Constraint scaling did not have any effect on the
number of iterations for these problems. Discussion of scaling issues can be found in [12,15,16].
Warm Starts
The input lambda1 controls the warm-starting feature available with riots if it is linked with NPSOL.
There are two types of warm starts.
The first type of warm start is activated by setting lambda1=1. If this warm start is used, the
Lagrange multiplier estimates and Hessian estimate from the previous run will automatically be used as
the starting estimates for the current run. This is useful if riots terminates because the maximum number
of iterations has been reached and you wish to continue optimizing from where riots left off. This type of
warm start can only be used if the previous call to riots specified lambda1=-1 or lambda1=1. Setting
lambda1=-1 does not cause a warm-start, it just prepares for a warm start by the next call to riots.
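For example (a sketch; the remaining arguments are as in a basic call to riots):
>> [u,x,f,g,lam] = riots(x0,u0,t,[],[],[],100,4,[],[],1,-1);   % lambda1=-1: prepare for a warm start
>> [u,x,f,g,lam] = riots(x(:,1),u,t,[],[],[],100,4,[],[],1,1); % lambda1=1: continue from the previous run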
The second type of warm start allows warm starting from the previous solution from riots but inter-
polated onto a new mesh and is only implemented for first and second order splines. It is activated by
providing estimates of the Lagrange multipliers in the first column of input lambda1 and the status of
the constraints in the second column of lambda1. Typically, lambda1 is produced by the program dis-
tribute which appropriately interpolates the lambda2 output from the previous run of riots onto the new
mesh. When lambda1 is supplied in this way, riots estimates H(η), the Hessian of the Lagrangian at the
current solution point, by applying finite-differences to the gradients of all objective and constraint func-
tions weighted by their Lagrange multipliers (and scalings if a scaling option has been specified).
The estimate H(η) of the Hessian of the Lagrangian is computed by the program comp_hess. This
computation requires N + ρ + nfree_x0 system simulations (where nfree_x0 is the number of free initial con-
ditions) and twice as many gradient computations as there are objective functions and constraints with
non-zero Lagrange multipliers. Also, if a non-zero value for var is specified, the second derivative of the
penalty term on the piecewise derivative variation of the control is added to the Hessian estimate. When
ρ ≤ 2, the computation takes advantage of the symmetry of the Hessian by stopping the simulations and
gradient computations once the calculations start filling the Hessian above its diagonal. After H is com-
puted, it is converted into transformed coordinates using the formula H̃ = (M^{−1/2})^T H M^{−1/2}, unless the
coordinate transformation is not in effect. Next, the singular value decomposition H̃ = U S V^T is computed,
where the diagonal elements of S are the singular values of H̃. Each diagonal element, σ_i, of S is then set
to σ_i = max { σ_i, ε_mach^{1/3} }. Then, we set H̃ = U S U^T which, because H̃ = H̃^T, makes all negative
eigenvalues of H̃ positive while preserving the eigenstructure of H̃. Finally, the Cholesky factorization of H̃ is computed.
Notes:
1. Since NPSOL is not a feasible point algorithm, it is likely that intermediate iterates will violate some
nonlinear constraints.
2. Because of the coordinate transformation, the inner products in the termination tests correspond to
inner-products in L 2 [a, b]. Thus the tests are independent of the discretization level.
3. Control bounds can be violated when using splines of order ρ > 2 if the spline coordinate transformation
is in effect. This is only possible with RK4 because splines of order ρ > 2 are only allowed for RK4 and
LSODA and the transform is turned off for LSODA if bounds are used.
Bugs:
1. riots uses the Matlab MEX function mexCallMATLAB to make calls to simulate. There is a bug in
this function that interferes with the operation of ctrl-C. This problem can be circumvented by compil-
ing simulate directly into riots (see instructions on compiling riots).
2. The full warm-start feature, which requires the computation of the Hessian using finite-differencing
of the gradients, is not allowed if the input fd is set to a non-zero value.
There are several utility programs, some are used by the optimization programs and some are callable by
the user. Those utility programs of interest to the user are described in this section. These are:
control_error Computes an estimate of the norm of the error of the computed solution. If η_N* is the
computed solution and η* is a local minimizer for problem OCP, the solution error is
|| η_N* − η* ||_{H2}.
distribute Redistributes the integration mesh according to one of several mesh refinement strate-
gies including one which simply doubles the mesh. The control spline defined on the
previous mesh will be interpolated onto the new mesh. The order of the spline is allowed to
change.
est_errors Returns an estimate of the global integration error for the fixed step-size Runge-Kutta
methods and uses the variable step-size integration algorithm to obtain accurate mea-
sures of the objective functions, constraint violations and trajectories. It also returns
the function space norm of the free portion of the gradient of the augmented Lagrangian
which is needed by control_error.
sp_plot Plots spline functions.
transform Computes a matrix which allows the L 2 inner product of two splines to be computed
by taking the inner product of their coefficients.
Purpose
This function uses values computed by est_errors for solutions of OCP on different integration meshes
to estimate || η_N − η* ||_{H2} for the current solution η_N = (u_N, ξ_N), using results from [4, Sec. 4.4].
Calling Syntax
[error,norm_zd]=control_error(x01,u1,t1,ze1,x02,u2,t2,ze2,{Tf})
Description
This program compares the two solutions η_N1 = (u1, x01) and η_N2 = (u2, x02), corresponding to the
meshes t1 and t2, to produce an estimate of || η_N2 − η* ||_{H2}, where η* = (u*, ξ*) is a solution for
OCP. For free final time problems, Tf should be set to the duration scale factor (see transcription for free
final time problems in §2). Only the first columns of x01 and x02 are used. The inputs ze1 and ze2
are the norms of the free gradients of the augmented Lagrangians evaluated at η_N1 and η_N2, respectively,
which can be obtained from calls to est_errors.
The output error is the estimate of || η_N2 − η* ||_{H2} where

|| η_N2 − η* ||²_{H2} = || x02 − ξ* ||²₂ + ∫_a^{a+(b−a)Tf} || u2(t) − u*(t) ||²₂ dt ,

with u2(⋅) the spline determined by the coefficients u2. The output norm_zd is || η_N2 − η_N1 ||_{H2} where

|| η_N2 − η_N1 ||²_{H2} = || x02 − x01 ||²₂ + ∫_a^{a+(b−a)Tf} || u2(t) − u1(t) ||²₂ dt ,

with u1(⋅) and u2(⋅) the splines determined by the coefficients u1 and u2, respectively.
Example
Let u1 be the coefficients of the spline solution for the mesh t1 and let u2 be the coefficients of the spline
solution for the mesh t2. Let λ1 and λ2 be the Lagrange multipliers (if the problem has state constraints)
and let I_1 and I_2 be the index sets of inactive control bounds (if the problem has control bounds); both
are returned by the optimization routines. Then we can compute the errors e1 = || η_N1 − η* ||_{H2} and
e2 = || η_N2 − η* ||_{H2} as follows:
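A sketch of the computation (argument names follow the calling syntaxes of est_errors and control_error; lam1, lam2, I_1 and I_2 hold the multipliers and inactive index sets described above, Tf = 1 assumes a fixed final time problem, and e1 can be obtained by exchanging the roles of the two solutions in the last call):
>> [ie1,ze1] = est_errors(x01,u1,t1,1,4,lam1,I_1);         % integration error and free gradient norm on t1
>> [ie2,ze2] = est_errors(x02,u2,t2,1,4,lam2,I_2);         % the same quantities on t2
>> [e2,d12] = control_error(x01,u1,t1,ze1,x02,u2,t2,ze2);  % e2 estimates the solution error on t2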
Purpose
This function executes various strategies for redistributing and refining the current integration mesh. It
also interpolates the current control and Lagrange multipliers corresponding to trajectory constraints onto
this new mesh.
Calling Syntax
[new_t,new_u,new_lambda,sum_lte]=distribute(t,u,x,ialg,lambda,
n_free_x0,strategy,
{FAC},{new_K},{norm})
Description of Inputs
t Row vector containing the sequence of breakpoints for the current mesh.
u The coefficients of the spline defined on the current mesh.
x Current state trajectory solution.
ialg Integration algorithm to be used during next simulation or optimization.
lambda Current Lagrange multiplier estimates from riots. Specify lambda=[] if you do not
need new multipliers for a warm start of riots.
n_free_x0 Number of free initial conditions. This value only affects the extension of Lagrange mul-
tipliers needed for a warm start of riots.
strategy Selects the redistribution strategy according to the following table:
strategy Type of Redistribution
1 Movable knots, absolute local truncation error.
2 Fixed knots absolute local truncation error.
3 Double the mesh by halving each interval.
4 Just change spline order to new_K.
11 Movable knots, relative local truncation error.
12 Fixed knots, relative local truncation error.
For more information on these strategies, see Chapter 4.3.2 in [4]. The quasi-uniformity
constant in equations (4.3.13) and (4.3.24) of that reference is set to 50. In Step 2 of
Strategy 2 (and 12), the corresponding constant is set to 1/4.
FAC For use with strategies 1,2,11 and 12. If specified, the number of intervals in the new
mesh is chosen to achieve an integration accuracy approximately equal to the current inte-
gration accuracy divided by FAC. If FAC=[] or FAC=0, the number of intervals in the
new mesh will be the same as the previous mesh for strategies 1 and 11. For strategies 2
and 12, the relative errors e k will be used without being pre-weighted by FAC.
new_K Specifies the order of the output spline with coefficients new_u. By default, new_K is
the same as the order of the input spline with coefficients u.
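For example, to double the current mesh and interpolate the current solution onto it (a sketch; u, x and t are the outputs of a previous optimization run):
>> [new_t,new_u] = distribute(t,u,x,4,[],0,3);  % strategy 3: halve every interval
>> [j,new_x] = simulate(1,x0,new_u,new_t,4,2);  % re-simulate on the refined mesh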
Description of Outputs
where eik is as computed above. The (n + 1)-th component represents the accumulation of
local truncation errors for the integrand of the first objective function.
Notes:
1. The algorithm used in strategies 1 and 2 does not take into account the presence, if any, of trajectory
constraints. Strategies 2 and 12 include a mechanism that tends to add mesh points at times, or near
times, where trajectory constraints are active. The input lambda must be supplied for this mechanism to
be used.
Purpose
This function performs a high accuracy integration with LSODA to produce estimates of various quanti-
Ö Ö
ties. One of these quantities is used by control_error to produce an estimate of | N − *| H 2 .
Calling Syntax
[int_error,norm_gLa,J,G,x,Ii] = est_errors([x0,{fixed}],u,t,Tf,
ialg,lambda,{I_i})
Description of Inputs
x0 Initial conditions of the current solution. When one or more initial conditions are free
variables, set x0=x(:,1) where x is the trajectory solution returned by one of the opti-
mization programs.
fixed An n-vector that indicates which components of x0 are free variables. If fixed(i)=0
then x0(i) is a free variable. Default: all ones.
u Current control solution.
t Sequence of breakpoints for the current integration mesh on the (nominal) time interval
[a, b].
Tf The duration scale factor. For fixed final time problems, set Tf=1.
ialg Integration algorithm used to produce the current solution.
lambda Vector of Lagrange multiplier estimates (one or two columns depending on which opti-
mization program produced lambda).
I_i Index set of controls and free initial conditions that are not at their bounds (returned by
one of the optimization programs).
Description of Outputs
norm_gLa The function space norm of the free portion of the gradient of the augmented Lagrangian, computed
using the transformation matrix M computed by transform. If Ii is the index set estimating the free portion of
η = [u(:); xi(free_x0)] (see below), then the free norm is computed as follows:

||∇_free L_{c,λ}(η)||_{H2} = gLM(Ii)'*gL(Ii) ,

where

gLM = [ grad_Lu(:) M^{−1} ; grad_Lx0(free_x0) ]

and

gL = [ grad_Lu(:) ; grad_Lx0(free_x0) ] .

In forming the augmented Lagrangian, λ = lambda(:,1) and c_i = |λ_i|. The quantity
||∇_free L_{c,λ}(η)||_{H2} is used by control_error to estimate the error || η_N − η* ||_{H2}.
J An estimate of the objective function at the current solution. This estimate is produced using
LSODA.
G An estimate of the sum of constraint violations. This estimate is produced using LSODA.
x The solution trajectory as produced using LSODA.
Ii Set of indices that specify those time points in the mesh t that are contained in the estimate Î of
subintervals in [a, b] on which the control solution is not constrained by a control bound, followed
by the indices of any free initial conditions that are not constrained by a bound. This index set is
used by control_error. For the purpose of demonstration, consider a single input system (m = 1)
with no free initial conditions. Let

Î ≜ ∪_{k ∈ I_i} [ t_{k−1}, t_{k+1} ] ,

where t_0 ≜ t_1 and t_{N+2} ≜ t_{N+1}. Î is an estimate of the time intervals on which the control bounds
are inactive. From Î, the index set Ii is set to

Ii = { k | t_k ∈ Î } .
When there are multiple inputs, this procedure is repeated for each input. When there are free initial
conditions, the indices of the unconstrained components of x0(free_x0) are added to the end of
Ii.
Notes:
1. If the user does not supply the derivative functions Dh and Dl then it will be necessary to change the
statement IALG=5 to IALG=6 in the file est_errors.m.
Purpose
This program allows the user to easily plot controls which are represented as splines.
Calling Syntax
val = sp_plot(t,u,{tau})
Description
Produces a plot of the spline with coefficients u defined on the knot sequence constructed from the inte-
gration mesh t. The order, ρ, of the spline is presumed equal to length(u) − N + 1. If tau is speci-
fied, u is not plotted, just evaluated at the times tau. Otherwise, u is plotted at 100 points with the same
relative spacing as the breakpoints in t. Second order splines can also be plotted using the Matlab com-
mand plot instead of sp_plot.
If the input tau is not given, then the output is val=[t;uval] where t are the data points and
uval are the data values; uval has the same number of rows as the input u. If the input tau is given,
then the output is just val=uval.
Example. This example plots a first, second and third order spline approximation to one period of a
sinusoid using ten data points. The splines are produced using the commands in the Spline Toolbox.
>> t=[0:2*pi/10:2*pi];
>> sp1 = spapi(t,t(1:10),sin(t(1:10)));
>> [dummy,u1] = spbrk(sp1);
>> knots2 = augknt(t,2); knots3 = augknt(t,3);
>> sp2 = spapi(knots2,t,sin(t));
>> [dummy,u2] = spbrk(sp2);
>> tau = aveknt(knots3,3);
>> sp3 = spapi(knots3,tau,sin(tau));
>> [dummy,u3] = spbrk(sp3);
>> sp_plot(t,u1); sp_plot(t,u2); sp_plot(t,u3);
[Three plots: the first, second and third order spline approximations to one period of the sinusoid on [0, 2π].]
Purpose
This function produces the transformation matrix M. It is called by riots and pdmin to generate the
spline coordinate transformation for the controls.
Calling Syntax
Malpha = transform(t,order)
Description
Given two splines u1 and u2 of order ρ = order, with coefficients α_1 and α_2 defined on the knot sequence
with breakpoints given by t, ⟨ u1, u2 ⟩_{L2} = trace( α_1 M α_2^T ). This function works with non-uniform
meshes and with repeated interior knot points.
meshes and with repeated interior knot points.
The output, Malpha, is given in sparse matrix format. The transform matrix for ρ = 1, 2, 3, or 4
has been pre-computed for uniformly spaced mesh points. Also, if the inputs to the preceding call to
transform, if there was a preceding call, were the same as the values of the current inputs, then the previ-
ously computed transform matrix is returned.
Example
This example generates two second order splines and computes their L2 inner-product in two ways: by integrating
their product with the trapezoidal rule on a very fine mesh, and by using M.
>> t = [0:.1:1];
>> knots = augknt(t,2);
>> coef1 = rand(1,11); coef2 = rand(1,11);
>> sp1 = spmak(knots,coef1);
>> sp2 = spmak(knots,coef2);
>> tau = [0:.0001:1];
>> u1 = fnval(sp1,tau);
>> u2 = fnval(sp2,tau);
>> inner_prod1 = trapz(tau,u1.*u2)
inner_prod1 = 0.2800
>> Malpha = transform(t,2);
>> inner_prod2 = coef1*Malpha*coef2'
inner_prod2 = 0.2800
>> inner_prod1-inner_prod2
ans = 1.9307e-09
Note: If you have the RIOTS_95 demo package but have not yet purchased RIOTS_95, you will not be
able to solve your own optimal control problems. Please refer to "license.doc" supplied with the demon-
stration for further details on the RIOTS_95 purchase agreement.
It is recommended that you make a copy of the "simulate.mex" that comes supplied with RIOTS_95
before creating your own "simulate.mex" with the steps outlined here. Then, if you want to use the m-file
interface for some reason you can copy back the original version of "simulate.mex".
Step 1: Write the user-supplied C routines (refer to §4 for details) required for your optimal control prob-
lem. Several sample problems are supplied with RIOTS_95 in the "systems" directory. Additionally,
there is a file called "template.c" which you can use as a starting point for writing your own problem.
16
If you are using Matlab v. 4.0, only version 9.0 or up of the Watcom C compiler is required.
This version of RIOTS was developed over a period of two years. Many desirable features that could
have been included were omitted because of time constraints. Moreover, there are many extensions and
improvements that we have envisioned for future versions. We provide here a synopsis of some of the
improvements currently planned for, we hope, upcoming versions of RIOTS.
• Automatic Differentiation of user-supplied functions. This would provide automatic generation
of the derivative functions Dh, Dl and Dg using techniques of automatic differentiation [17,18].
• Extension to Large-Scale Problems. The size of the mathematical programming problem created
by discretizing an optimal control problem (the way it is done in RIOTS) depends primarily on the dis-
cretization level N . The work done by the projected descent algorithm, pdmin, grows only linearly
with N and hence pdmin (and aug_lagrng) can solve very large problems. However, these programs
cannot handle trajectory constraints or endpoint equality constraints17. The main program, riots, is
based on dense sequential quadratic programming (SQP). Hence, riots is not well-suited for high dis-
cretization levels. There are many alternate strategies for extending SQP algorithms to large-scale
problems, as discussed in [4, Chap. 6]. The best approach is not known at this time and a great deal of
work, such as the work in [19-22] as well as our own investigations, is being done in this area.
• Trajectory constraints. Our current method of computing function gradients with respect to the
control is based on adjoint equations. There is one adjoint equation for each function. This is quite
inefficient when there are trajectory constraints because for each trajectory constraint there is, in
effect, one constraint function per mesh point. Thus, for an integration mesh with N + 1 breakpoints,
roughly N adjoint equations have to be solved to compute the gradients at each point of a trajectory
constraint. An alternate strategy based on the state-transition (sensitivity) matrix may prove to be
much more efficient. Also, it is really only necessary to compute gradients at points tₖ where the tra-
jectory constraints are active or near-active; the other mesh points can be ignored. Algorithms for
selecting the active or almost-active constraints are presented in [23,24], along with convergence proofs.
• Stabilization of Iterates. One of the main limitations of the current implementation of RIOTS is
that it is not well-equipped to deal with problems whose dynamics are highly unstable. For such prob-
lems, the iterates produced by the optimization routines in RIOTS can easily move into regions where
the system dynamics ‘‘blow-up’’ if the initial control guess is not close to a solution. For instance, a
very difficult optimal control problem is the Apollo re-entry problem [25]. This problem involves find-
ing the optimum re-entry trajectory for the Apollo space capsule as it enters the Earth’s atmosphere.
Because of the physics of this problem, slight deviations of the capsule's trajectory can cause the cap-
sule to skip off the Earth’s atmosphere or to burn up in the atmosphere. Either way, once an iterate is a
control that drives the system into such a region of the state-space, there is no way for the optimization
routine to recover. Moreover, in this situation, there is no way to avoid these regions of the state-space
using control constraints.
This problem could be avoided using constraints on the system trajectories. However, this is a
very expensive approach for our method (not for collocation-based methods), especially at high dis-
cretization levels. Also, for optimization methods that are not feasible point algorithms, this approach
still might not work. An intermediate solution is possible because it is really only necessary to check
the trajectory constraints at a few points, called nodes, in the integration mesh. This can be accom-
plished as follows. Let tₖ be one such node. Then define the decision variable x̃ₖ,₀ which will be
17. Endpoint inequality constraints can be handled effectively with aug_lagrng by incorporating a suitable active constraint set strategy.
Other Issues and Extensions. Some other useful features for RIOTS would include:
• A graphical user interface. This would allow much easier access to the optimization programs and
selection of options. Also, important information about the progress of the optimization such as error
messages and warnings, condition estimates, step-sizes, constraint violations and optimality conditions
could be displayed in a much more accessible manner.
• Dynamic linking. Currently, the user of RIOTS must re-link simulate for each new optimal control
problem. It would be very convenient to be able to dynamically link in the object code for the optimal
control problem directly from Matlab (without having to re-link simulate). There are dynamic linkers
available but they do not work with Matlab’s MEX facility.
• For problems with dynamics that are difficult to integrate, the main source of error in the solution to
the approximating problems is due to the integration error. In this case, it would be useful to use an inte-
gration mesh that is finer than the control mesh. Thus, several integration steps would be taken between
control breakpoints. By doing this, the error from the integration is reduced without increasing the size
(the number of decision variables) of the approximating problem.
• The variable transformation needed to allow the use of a standard inner product on the coefficient
space for the approximating problems adds extra computation to each function and gradient evaluation.
Also, if the transformation is not diagonal, simple bound constraints on the controls are converted into
general linear constraints. Both of these deficits can be removed for optimization methods that use Hes-
sian information to obtain search directions. If the Hessian is computed analytically, then the transforma-
tion is not needed at all. If the Hessian is estimated using a quasi-Newton update, it may be sufficient to
use the transformation matrix M_N or M_α as the initial Hessian estimate (rather than the identity matrix)
and dispense with the variable transformation. We have not performed this experiment; it may not work
because the updates will be constructed from gradients computed in non-transformed coordinates18.
• It may be useful to allow the user to specify bounds on the control derivatives. This would be a sim-
ple matter for piecewise linear control representations: between breakpoints the derivative of such a con-
trol is the constant (uₖ₊₁ − uₖ)/(tₖ₊₁ − tₖ), so a derivative bound reduces to a linear constraint on adjacent
spline coefficients.
• Currently the only way to specify general constraints on the controls is using mixed state-control tra-
jectory constraints. This is quite inefficient since adjoint variables are computed but not needed for pure
control constraints.
• Currently there is no mechanism in RIOTS to directly handle systems with time-delays or, more
generally, integro-differential equations [29]. This would be a non-trivial extension.
18. With an appropriate choice of H₀, quasi-Newton methods are invariant with respect to objective function scalings [27,28], but not with
respect to coordinate transformations (which are variable scalings).
This appendix describes several optimal control problem examples that are supplied with RIOTS_95
in the ‘systems’ directory. Control bounds can be included on the command line at run-time. See the file
‘systems/README’ for a description of the code for these problems.
min J(u) = ∫_0^1 0.625x² + 0.5xu + 0.5u² dt
 u
subject to:
ẋ = 0.5x + u ;  x(0) = 1 .
min J(u, T) = T
u,T
subject to:
ẋ₁ = x₂ ;  x₁(0) = 0 ,  x₁(T) = 300
ẋ₂ = u ;  x₂(0) = 0 ,  x₂(T) = 0 ,
and
−2 ≤ u(t) ≤ 1 ,  ∀ t ∈ [0, T] .
This problem has an analytic solution which is given by T* = 30 and

            0 ≤ t < 20      20 ≤ t ≤ 30
  u*(t)     1               −2
  x₁*(t)    t²/2            −t² + 60t − 600
  x₂*(t)    t               60 − 2t
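The table above can be checked with a few lines of Matlab (a quick consistency check only, not part of RIOTS):
>> t1 = 20; T = 30;
>> x1_left = t1^2/2                 % first branch of x1* at t = 20, gives 200
>> x1_right = -t1^2 + 60*t1 - 600   % second branch of x1* at t = 20, also gives 200
>> x1_final = -T^2 + 60*T - 600     % gives 300, the required value x1(T)
>> x2_final = 60 - 2*T              % gives 0, the required value x2(T)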
min J(u) = ∫_0^1 ½u² dt
 u
subject to:
ẋ = v ;  x(0) = 0 ,  x(1) = 0
v̇ = u ;  v(0) = 1 ,  v(1) = −1
x(t) − L ≤ 0 ,  ∀ t ∈ [0, 1] ,
with L = 1/9. This problem has an analytic solution. For any L such that 0 < L ≤ 1/6, the solution is
J* = 4/(9L)
with

            0 ≤ t < 3L                    3L ≤ t < 1 − 3L     1 − 3L ≤ t ≤ 1
  u*(t)     −(2/(3L))(1 − t/(3L))         0                   −(2/(3L))(1 − (1−t)/(3L))
  v*(t)     (1 − t/(3L))²                 0                   −(1 − (1−t)/(3L))²
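The analytic cost can be checked numerically with standard Matlab commands (a sketch; the variable names are arbitrary):
>> L = 1/9;
>> Jstar = 4/(9*L)                 % analytic optimal cost; equals 4 for L = 1/9
>> tau = [0:.0001:1];
>> u = zeros(size(tau));
>> i1 = find(tau < 3*L); i3 = find(tau > 1-3*L);
>> u(i1) = -(2/(3*L))*(1 - tau(i1)/(3*L));
>> u(i3) = -(2/(3*L))*(1 - (1-tau(i3))/(3*L));
>> J = trapz(tau,0.5*u.^2)         % agrees with Jstar up to the integration error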
min J(u) = ∫_0^2.5 x₁² + u² dt
 u
subject to:
ẋ₁(t) = x₂(t)       x₁(0) = −5
ẋ₂(t) = −x₁(t) + [1.4 − 0.14x₂²(t)]x₂(t) + 4u(t)       x₂(0) = −5
min J(u) = ½ ∫_0^5 x₁² + x₂² + u² dt
 u
subject to:
ẋ₁(t) = x₂(t)       x₁(0) = 1
ẋ₂(t) = −x₁(t) + (1 − x₂²)x₂(t) + u(t)       x₂(0) = 0
−x₁(5) + x₂(5) − 1 = 0 .
min J(u) = ∫_0^1 x₁² + x₂² + 0.005u² dt
 u
subject to:
ẋ₁ = x₂ ;  x₁(0) = 0
ẋ₂ = −x₂ + u ;  x₂(0) = −1
and
x₂(t) − 8(t − 0.5)² + 0.5 ≤ 0 ,  ∀ t ∈ [0, 1] .
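The shape of this state constraint is easy to inspect with standard Matlab commands (a sketch only):
>> tt = [0:.01:1];
>> bound = 8*(tt-0.5).^2 - 0.5;   % the constraint requires x2(t) <= 8(t-0.5)^2 - 0.5
>> plot(tt,bound)                 % at t = 0 the bound is 1.5, so x2(0) = -1 is feasible;
                                  % at t = 0.5 it dips to -0.5, forcing x2 to go negative there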
min J(u) = 5x₁(2.9)² + x₂(2.9)²
 u
subject to:
ẋ₁ = x₂       x₁(0) = 1
ẋ₂ = u − 0.1(1 + 2x₁²)x₂       x₂(0) = 1
−1 ≤ u(t) ≤ 1 ,  ∀ t ∈ [0, 2.9]
1 − 9(x₁(t) − 1)² − ((x₂(t) − 0.4)/0.3)² ≤ 0 ,  ∀ t ∈ [0, 2.9]
−0.8 − x₂(t) ≤ 0 ,  ∀ t ∈ [0, 2.9] .
max J(u, T) = h(T)
u,T
subject to:
v̇ = (1/m)(u − D(h, v)) − 1/h² ,   D(h, v) = ½ C_D Aρ₀ v² exp(β(1 − h)) ;   v(0) = 0
ḣ = v ;   h(0) = 1
ṁ = −u/c ;   m(0) = 1 ,  m(T) = 0.6
0 ≤ u(t) ≤ 3.5 ,  ∀ t ∈ [0, T] ,
where β = 500, C_D = 0.05 and Aρ₀ = 12,400. The variables used above have the following meanings:
v vertical velocity
h radial altitude above earth (h = 1 is earth’s surface)
m mass of vehicle
u thrust
c specific impulse (impulse per unit mass of fuel burned, c = 0.5)
ρ air density (ρ = ρ₀ exp(β(1 − h)))
q dynamic pressure (q = ½ ρ v²)
D drag
The endpoint constraint m(T) = 0.6 means that there is no more fuel left in the rocket. Another version
of this problem includes the trajectory constraint
Aq(t) ≤ 10 ,  ∀ t ∈ [0, T] .
This is an upper bound on the dynamic pressure experienced by the rocket during ascent.
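To make the rocket dynamics concrete, the right-hand side above can be coded directly in Matlab. This is only a sketch of the ODE (the file name goddard.m and the constant-thrust choice are illustrative), not the RIOTS system interface described in §4:
function xdot = goddard(t,x)
% State x = [v; h; m]: velocity, altitude, mass.
beta = 500; CD = 0.05; Arho0 = 12400; c = 0.5;
u = 3.5;                                  % constant full thrust, for illustration only
v = x(1); h = x(2); m = x(3);
D = 0.5*CD*Arho0*v^2*exp(beta*(1-h));     % drag D(h,v)
xdot = [ (u - D)/m - 1/h^2 ; v ; -u/c ];
Integrating this with one of Matlab's ODE solvers (ode45, for instance) gives a quick sanity check of the model and constants before writing the user-supplied routines for simulate.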
References
1. A. Schwartz and E. Polak, “Consistent approximations for optimal control problems based on
Runge-Kutta integration,” SIAM J. Control Optim. 34(4)(1996).
2. A. Schwartz and E. Polak, “Runge-Kutta discretization of optimal control problems,” in Proceed-
ings of the 10th IFAC Workshop on Control Applications of Optimization, (1996).
3. A. Schwartz and E. Polak, “A family of projected descent methods for optimization problems with
simple bounds,” J. Optim. Theory and Appl. 91(1)(1997).
4. A. Schwartz, “Theory and Implementation of Numerical Methods Based on Runge-Kutta Integra-
tion for Solving Optimal Control Problems,” Ph.D. Dissertation, Dept. of Electrical Engineering,
University of California, Berkeley (1996). Available from
http://robotics.eecs.berkeley.edu/˜adams
5. E. Polak, “On the use of consistent approximations in the solution of semi-infinite optimization and
optimal control problems,” Math. Prog. 62 pp. 385-415 (1993).
6. Carl de Boor, A Practical Guide to Splines, Springer-Verlag, New York (1978).
7. J. D. Lambert, Numerical Methods for Ordinary Differential Systems, John Wiley and Sons, Eng-
land (1991).
8. K. Radhakrishnan and A. C. Hindmarsh, “Description and use of LSODE, the Livermore Solver for
Ordinary Differential Equations,” NASA Reference Publ. 1327 (1993).
9. L. R. Petzold, “Automatic selection of methods for solving stiff and nonstiff systems of differential
equations,” SIAM J. Sci. Stat. Comput. 4 pp. 136-148 (1983).
10. W.W. Hager, “Rates of convergence for discrete approximations to unconstrained control problems,”
SIAM J. Numer. Anal. 13(4) pp. 449-472 (1976).
11. L. S. Jennings, M. E. Fisher, K. L. Teo, and C. J. Goh, “MISER3: Solving optimal control prob-
lems---an update,” Advances in Engineering Software 14(13) pp. 190-196 (1991).
12. P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London (1981).
13. A. E. Bryson and Y. Ho, Applied Optimal Control, Hemisphere Publishing Corp. (1975). (revised
printing)
14. D. J. Bell and D. H. Jacobson, Singular Optimal Control Problems, Academic Press, London
(1975).
15. L. T. Biegler and J. E. Cuthrell, “Improved infeasible path optimization for sequential modular sim-
ulators--II: the optimization algorithm,” Computers & Chemical Engineering 9(3) pp. 257-267
(1985).
16. O. Stryk, “Numerische Lösung optimaler Steuerungsprobleme: Diskretisierung, Parameteropti-
mierung und Berechnung der adjungierten Variablen,” Diploma thesis, Munich University of Tech-
nology, VDI Verlag, Germany (1995).
17. A. Griewank, D. Juedes, and J. Utke, ADOL-C: A package for the automatic differentiation of algo-
rithms written in C/C++, Argonne National Laboratory, ftp://info.mcs.anl.gov/pub/ADOLC
(December 1993).
18. A. Griewank, “On automatic differentiation,” Preprint MCS-P10-1088, Argonne National
Laboratory, ftp://info.mcs.anl.gov/pub/tech_reports/reports (October 1988).
19. J. T. Betts and P. D. Frank, “A sparse nonlinear optimization algorithm,” J. Optim. Theory and Appl.
82(3) pp. 519-541 (1994).
20. J. T. Betts and W. P. Huffman, “Path-constrained trajectory optimization using sparse sequential
quadratic programming,” J. Guidance, Control, and Dynamics 16(1) pp. 59-68 (1993).
21. Henrik Jonson, “Newton Method for Solving Non-linear Optimal Control Problems with General
Constraints,” Ph.D. Dissertation, Linköping Studies in Science and Technology (1983).
22. J. C. Dunn and D. P. Bertsekas, “Efficient dynamic programming implementations of Newton’s
method for unconstrained optimal control problems,” J. Optim. Theory and Appl. 63(1) pp. 23-38
(1989).
23. J. E. Higgins and E. Polak, “An ε-active barrier-function method for solving minimax problems,”
Appl. Math. Optim. 23 pp. 275-297 (1991).
24. J. L. Zhou and A. L. Tits, “An SQP algorithm for finely discretized continuous minimax problems
and other minimax problems with many objective functions,” to appear in SIAM J. Optimization.
25. O. Stryk and R. Bulirsch, “Direct and indirect methods for trajectory optimization,” Annals of Oper-
ations Research 37 pp. 357-373 (1992).
26. U. Ascher, R. Mattheij, and R. Russell, Numerical Solution of Boundary Value Problems for Ordi-
nary Differential Equations, Prentice Hall, Englewood Cliffs, NJ (1988).
27. D. F. Shanno and K. H. Phua, “Matrix conditioning and nonlinear optimization,” Math. Prog. 14 pp.
149-160 (1978).
28. S. S. Oren, “Perspectives on self-scaling variable metric algorithms,” J. Optim. Theory and Appl.
37(2) pp. 137-147 (1982).
29. F.H. Mathis and G.W. Reddien, “Difference approximations to control problems with functional
arguments,” SIAM J. Control and Optim. 16(3) pp. 436-449 (1978).
30. D. I. Jones and J. W. Finch, “Comparison of optimization algorithms,” Int. J. Control 40 pp.
747-761 (1984).
31. S. Strand and J. G. Balchen, “A Comparison of Constrained Optimal Control Algorithms,” pp.
439-447 in IFAC 11th Triennial World Congress, Estonia, USSR (1990).
32. O. Stryk, “Numerical solution of optimal control problems by direct collocation,” International
Series of Numerical Mathematics 111 pp. 129-143 (1993).
33. N. B. Nedeljković, “New algorithms for unconstrained nonlinear optimal control problems,” IEEE
Trans. Autom. Cntrl. 26(4) pp. 868-884 (1981).
34. D. Talwar and R. Sivan, “An Efficient Numerical Algorithm for the Solution of a Class of Optimal
Control Problems,” IEEE Trans. Autom. Cntrl. 34(12) pp. 1308-1311 (1989).
35. D. H. Jacobson and M. M. Lele, “A transformation technique for optimal control problems with a
state variable inequality constraint,” IEEE Trans. Autom. Cntrl. 14(5) pp. 457-464 (1969).
36. V. H. Quintana and E. J. Davison, “Clipping-off gradient algorithms to compute optimal controls
with constrained magnitude,” Int. J. Control 20(2) pp. 243-255 (1974).
37. H. Seywald and E. M. Cliff, “Goddard Problem in Presence of a Dynamic Pressure Limit,” J. Guid-
ance, Control and Dynamics 16(4) pp. 776-781 (1993).