MPC Book 2nd Edition 4th Printing
James B. Rawlings
Department of Chemical Engineering
University of California
Santa Barbara, California, USA
David Q. Mayne
Department of Electrical and Electronic Engineering
Imperial College London
London, England
Moritz M. Diehl
Department of Microsystems Engineering and
Department of Mathematics
University of Freiburg
Freiburg, Germany
Nob Hill Publishing
First Edition
First Printing August 2009
Electronic Download (1st) November 2013
Electronic Download (2nd) April 2014
Electronic Download (3rd) July 2014
Electronic Download (4th) October 2014
Electronic Download (5th) February 2015
Second Edition
First Printing October 2017
Electronic Download (1st) October 2018
Electronic Download (2nd) February 2019
Paperback Edition
Third Printing October 2020
Electronic Download (3rd) October 2020
Electronic Download (4th) April 2022
To Cheryl, Josephine, and Stephanie,
In the eight years since the publication of the first edition, the field
of model predictive control (MPC) has seen tremendous progress. First
and foremost, the algorithms and high-level software available for solv-
ing challenging nonlinear optimal control problems have advanced sig-
nificantly. For this reason, we have added a new chapter, Chapter 8,
“Numerical Optimal Control,” and coauthor, Professor Moritz M. Diehl.
This chapter gives an introduction to methods for the numerical
solution of the MPC optimization problem. Numerical optimal control
builds on two fields: simulation of differential equations, and numeri-
cal optimization. Simulation is often covered in undergraduate courses
and is therefore only briefly reviewed. Optimization is treated in much
more detail, covering topics such as derivative computations, Hessian
approximations, and handling inequalities. Most importantly, the chap-
ter presents some of the many ways that the specific structure of opti-
mal control problems arising in MPC can be exploited algorithmically.
We have also added a software release with the second edition of
the text. The software enables the solution of all of the examples and
exercises in the text requiring numerical calculation. The software is
based on the freely available CasADi language and a high-level set of
Octave/MATLAB functions, MPCTools, that serves as an interface to CasADi.
These tools have been tested in several MPC short courses to audiences
composed of researchers and practitioners. The software can be down-
loaded from www.chemengr.ucsb.edu/~jbraw/mpc.
In Chapter 2, we have added sections covering the following topics:
• economic MPC
• MPC with discrete actuators
We also present a more recent form of suboptimal MPC that is prov-
ably robust as well as computationally tractable for online solution of
nonconvex MPC problems.
In Chapter 3, we have added a discussion of stochastic MPC, which
has received considerable recent research attention.
In Chapter 4, we have added a new treatment of state estimation
with persistent, bounded process and measurement disturbances. We
have also removed the discussion of particle filtering. There are two
JBR
Madison, Wisconsin, USA

DQM
London, England
Acknowledgments
Both authors would like to thank the Department of Chemical and Bio-
logical Engineering of the University of Wisconsin for hosting DQM’s
visits to Madison during the preparation of this monograph. Funding
from the Paul A. Elfers Professorship provided generous financial sup-
port.
JBR would like to acknowledge the graduate students with whom
he has had the privilege to work on model predictive control topics:
Rishi Amrit, Dennis Bonné, John Campbell, John Eaton, Peter Findeisen,
Rolf Findeisen, Eric Haseltine, John Jørgensen, Nabil Laachi, Scott Mead-
ows, Scott Middlebrooks, Steve Miller, Ken Muske, Brian Odelson, Mu-
rali Rajamani, Chris Rao, Brett Stewart, Kaushik Subramanian, Aswin
Venkat, and Jenny Wang. He would also like to thank many colleagues
with whom he has collaborated on this subject: Frank Allgöwer, Tom
Badgwell, Bhavik Bakshi, Don Bartusiak, Larry Biegler, Moritz Diehl,
Jim Downs, Tom Edgar, Brian Froisy, Ravi Gudi, Sten Bay Jørgensen,
Jay Lee, Fernando Lima, Wolfgang Marquardt, Gabriele Pannocchia, Joe
Qin, Harmon Ray, Pierre Scokaert, Sigurd Skogestad, Tyler Soderstrom,
Steve Wright, and Robert Young.
DQM would like to thank his colleagues at Imperial College, espe-
cially Richard Vinter and Martin Clark, for providing a stimulating and
congenial research environment. He is very grateful to Lucien Polak
and Graham Goodwin with whom he has collaborated extensively and
fruitfully over many years; he would also like to thank many other col-
leagues, especially Karl Åström, Roger Brockett, Larry Ho, Petar Koko-
tovic, and Art Krener, from whom he has learned much. He is grateful
to past students who have worked with him on model predictive con-
trol: Ioannis Chrysochoos, Wilbur Langson, Hannah Michalska, Sasa
Raković, and Warren Schroeder; Hannah Michalska and Sasa Raković, in
particular, contributed very substantially. He owes much to these past
students, now colleagues, as well as to Frank Allgöwer, Rolf Findeisen,
Eric Kerrigan, Konstantinos Kouramus, Chris Rao, Pierre Scokaert, and
Maria Seron for their collaborative research in MPC.
Both authors would especially like to thank Tom Badgwell, Bob Bird,
Eric Kerrigan, Ken Muske, Gabriele Pannocchia, and Maria Seron for
their careful and helpful reading of parts of the manuscript. John Eaton
again deserves special mention for his invaluable technical support dur-
ing the entire preparation of the manuscript.
Added for the second edition. JBR would like to acknowledge the
most recent generation of graduate students with whom he has had the
privilege to work on model predictive control research topics: Doug Al-
lan, Travis Arnold, Cuyler Bates, Luo Ji, Nishith Patel, Michael Risbeck,
and Megan Zagrobelny.
In preparing the second edition, and, in particular, the software re-
lease, the current group of graduate students far exceeded expectations
to help finish the project. Quite simply, the project could not have been
completed in a timely fashion without their generosity, enthusiasm,
professionalism, and selfless contribution. Michael Risbeck deserves
special mention for creating the MPCTools interface to CasADi, and
updating and revising the tools used to create the website to distribute
the text- and software-supporting materials. He also wrote code to cal-
culate explicit MPC control laws in Chapter 7. Nishith Patel made a
major contribution to the subject index, and Doug Allan contributed
generously to the presentation of moving horizon estimation in Chap-
ter 4.
A research leave for JBR in Fall 2016, again funded by the Paul A.
Elfers Professorship, was instrumental in freeing up time to complete
the revision of the text and further develop computational exercises.
MMD wants to especially thank Jesus Lago Garcia, Jochem De Schut-
ter, Andrea Zanelli, Dimitris Kouzoupis, Joris Gillis, Joel Andersson,
and Robin Verschueren for help with the preparation of exercises and
examples in Chapter 8; and also wants to acknowledge the following
current and former team members that contributed to research and
teaching on optimal and model predictive control at the Universities of
Leuven and Freiburg: Adrian Bürger, Hans Joachim Ferreau, Jörg Fis-
cher, Janick Frasch, Gianluca Frison, Niels Haverbeke, Greg Horn, Boris
Houska, Jonas Koenemann, Attila Kozma, Vyacheslav Kungurtsev, Gio-
vanni Licitra, Rien Quirynen, Carlo Savorgnan, Quoc Tran-Dinh, Milan
Vukov, and Mario Zanon. MMD also wants to thank Frank Allgöwer, Al-
berto Bemporad, Rolf Findeisen, Larry Biegler, Hans Georg Bock, Stephen
Boyd, Sébastien Gros, Lars Grüne, Colin Jones, John Bagterp Jørgensen,
Christian Kirches, Daniel Leineweber, Katja Mombaur, Yurii Nesterov,
Toshiyuki Ohtsuka, Goele Pipeleers, Andreas Potschka, Sebastian Sager,
Johannes P. Schlöder, Volker Schulz, Marc Steinbach, Jan Swevers, Phil-
ippe Toint, Andrea Walther, Stephen Wright, Joos Vandewalle, and Ste-
fan Vandewalle for inspiring discussions on numerical optimal control
C Optimization 729
C.1 Dynamic Programming 729
C.1.1 Optimal Control Problem 731
C.1.2 Dynamic Programming 733
C.2 Optimality Conditions 737
C.2.1 Tangent and Normal Cones 737
C.2.2 Convex Optimization Problems 741
C.2.3 Convex Problems: Polyhedral Constraint Set 743
C.2.4 Nonconvex Problems 745
C.2.5 Tangent and Normal Cones 746
C.2.6 Constraint Set Defined by Inequalities 750
C.2.7 Constraint Set; Equalities and Inequalities 753
C.3 Set-Valued Functions and Continuity of Value Function 755
C.3.1 Outer and Inner Semicontinuity 757
C.3.2 Continuity of the Value Function 759
C.4 Exercises 767
List of Figures
5.1 State estimator tube. The solid line x̂(t) is the center of the tube, and the dashed line is a sample trajectory of x(t). 336
5.2 The system with disturbance. The state estimate lies in the inner tube, and the state lies in the outer tube. 337
6.1 Convex step from (u_1^p, u_2^p) to (u_1^{p+1}, u_2^{p+1}). 380
6.2 Ten iterations of noncooperative steady-state calculation. 397
6.3 Ten iterations of cooperative steady-state calculation. 397
6.4 Ten iterations of noncooperative steady-state calculation; reversed pairing. 398
Mathematical notation
∃ there exists
∈ is an element of
∀ for all
=⇒ ⇐= implies; is implied by
⇏ ⇍ does not imply; is not implied by
a := b a is defined to be equal to b.
a =: b b is defined to be equal to a.
≈ approximately equal
V (·) function V
V :A→B V is a function mapping set A into set B
x ↦ V(x) function V maps variable x to value V(x)
x+ value of x at next sample time (discrete time system)
ẋ time derivative of x (continuous time system)
fx partial derivative of f (x) with respect to x
∇ nabla or del operator
δ unit impulse or delta function
|x| absolute value of scalar; norm of vector (two-norm unless
stated otherwise); induced norm of matrix
x sequence of vector-valued variable x, (x(0), x(1), . . .)
∥x∥ sup norm over a sequence, sup_{i≥0} |x(i)|
∥x∥_{a:b} max_{a≤i≤b} |x(i)|
tr(A) trace of matrix A
det(A) determinant of matrix A
eig(A) set of eigenvalues of matrix A
ρ(A) spectral radius of matrix A, max_i |λ_i| for λ_i ∈ eig(A)
A^{-1} inverse of matrix A
A^† pseudo-inverse of matrix A
A′ transpose of matrix A
inf infimum or greatest lower bound
min minimum
sup supremum or least upper bound
max maximum
Symbols
A, B, C system matrices, discrete time, x + = Ax + Bu, y = Cx
Ac , Bc system matrices, continuous time, ẋ = Ac x + Bc u
Aij state transition matrix for player i to player j
Ai state transition matrix for player i
ALi estimate error transition matrix Ai − Li Ci
Bd input disturbance matrix
Bij input matrix of player i for player j’s inputs
Bi input matrix of player i
Cij output matrix of player i for player j’s interaction states
Ci output matrix of player i
Cd output disturbance matrix
C controllability matrix
C∗ polar cone of cone C
d integrating disturbance
E, F constraint matrices, F x + Eu ≤ e
f,h system functions, discrete time, x + = f (x, u), y = h(x)
fc (x, u) system function, continuous time, ẋ = fc (x, u)
F (x, u) difference inclusion, x + ∈ F (x, u), F is set valued
G input noise-shaping matrix
Gij steady-state gain of player i to player j
H controlled variable matrix
I(x, u) index set of constraints active at (x, u)
I^0(x) index set of constraints active at (x, u^0(x))
k sample time
K optimal controller gain
ℓ(x, u) stage cost
ℓN (x, u) final stage cost
L optimal estimator gain
m input dimension
M cross-term penalty matrix, x′Mu
M number of players, Chapter 6
M class of admissible input policies, µ ∈ M
n state dimension
N horizon length
O observability matrix, Chapters 1 and 4
O compact robust control invariant set containing the origin,
Chapter 3
p output dimension
1.1 Introduction
The main purpose of this chapter is to provide a compact and acces-
sible overview of the essential elements of model predictive control
(MPC). We introduce deterministic and stochastic models, regulation,
state estimation, dynamic programming (DP), tracking, disturbances,
and some important performance properties such as closed-loop sta-
bility and zero offset to disturbances. The reader with background in
MPC and linear systems theory may wish to skim this chapter briefly
and proceed to Chapter 2. Other introductory texts covering the ba-
sics of MPC include Maciejowski (2002); Camacho and Bordons (2004);
Rossiter (2004); Goodwin, Serón, and De Doná (2005); Kwon (2005);
Wang (2009).
Figure 1.1: System with input u, output y, and transfer function matrix G connecting them; the model is y = Gu.
y(s) = G(s)u(s)
G(s) ∈ C^{p×m} is the transfer function matrix. Notice the state does
not appear in this input-output description. If we are obtaining G(s)
instead from a state space model, then G(s) = C(sI − A)−1 B + D, and
we assume x(0) = 0 as the system initial condition.
∂c_A/∂t + ∇·(c_A v_A) − R_A = 0
in which cA is the molar concentration of species A, vA is the velocity
of species A, and RA is the production rate of species A due to chemical
reaction, in which
∇ := δ_x ∂/∂x + δ_y ∂/∂y + δ_z ∂/∂z

and δ_x, δ_y, δ_z are the respective unit vectors in the (x, y, z) spatial
coordinates.
We also should note that the distribution does not have to be “spa-
tial.” Consider a particle size distribution f (r , t) in which f (r , t)dr
represents the number of particles of size r to r + dr in a particle reac-
tor at time t. The reactor volume is considered well mixed and spatially
homogeneous. If the particles nucleate at zero size with nucleation rate
B(t) and grow with growth rate, G(t), the evolution of the particle size
distribution is given by
∂f/∂t = −G ∂f/∂r

f(r, t) = B/G    r = 0, t ≥ 0
f(r, t) = f_0(r)    r ≥ 0, t = 0
but we avoid this notation in this text. To reduce the notational com-
plexity we usually express (1.3) as
x + = Ax + Bu
y = Cx + Du
x(0) = x0
in which the superscript + means the state at the next sample time.
The linear discrete time model is convenient for presenting the ideas
and concepts of MPC in the simplest possible mathematical setting.
Because the model is linear, analytical solutions are readily derived.
The solution to (1.3) is
x(k) = A^k x_0 + Σ_{j=0}^{k−1} A^{k−j−1} B u(j)    (1.4)
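As a quick numerical check of (1.4), the following Octave/MATLAB sketch simulates the recursion x^+ = Ax + Bu and compares the result with the closed-form solution; the system matrices, initial state, and input sequence are hypothetical choices, not values from the text.

A = [0.9 0.2; 0 0.7]; B = [0; 1];      % hypothetical system
x0 = [1; -1]; N = 10; u = ones(1, N);  % hypothetical x(0) and inputs
x = x0;
for k = 0:N-1
  x = A*x + B*u(k+1);                  % recursion x+ = Ax + Bu
end
xN = A^N*x0;                           % closed-form solution (1.4)
for j = 0:N-1
  xN = xN + A^(N-j-1)*B*u(j+1);
end
disp(norm(x - xN))                     % zero to within roundoff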
The discrete transfer function matrix G(z) then represents the discrete
input-output model
y(z) = G(z)u(z)
and G(z) ∈ C^{p×m} is the transfer function matrix. Notice the state does
not appear in this input-output description. We make only passing
reference to transfer function models in this text.
1.2.5 Constraints
u̲ ≤ u(k) ≤ ū    k ∈ I_{≥0}
A rate-of-change constraint such as Δ̲ ≤ u(k) − u(k − 1) ≤ Δ̄ can be
handled by augmenting the state with the previous input, x̃(k) =
(x(k), u(k − 1)), so that the augmented model is x̃^+ = Ãx̃ + B̃u,
y = C̃x̃, in which

Ã = [A 0; 0 0]    B̃ = [B; I]    C̃ = [C 0]

The rate-of-change constraint is then stated as

Fx̃(k) + Eu(k) ≤ e    F = [0 −I; 0 I]    E = [I; −I]    e = [Δ̄; −Δ̲]
To soften a hard state constraint Fx(k) ≤ f, we introduce slack
variables ε(k), which are constrained to be nonnegative

ε(k) ≥ 0    k ∈ I_{≥0}

Considering the augmented input ũ(k) = (u(k), ε(k)), the soft state
constraint formulation is then a set of mixed input-state constraints

F̃x(k) + Ẽũ(k) ≤ ẽ    k ≥ 0

with

F̃ = [0; 0; F]    Ẽ = [E 0; 0 −I; 0 −I]    ẽ = [e; 0; f]
As we discuss subsequently, one then formulates a stage-cost penalty
that weights how much one cares about the state x, the input u and
the violation of the hard state constraint, which is given by ε. The hard
state constraint has been replaced by a mixed state-input constraint.
The benefit of this reformulation is that the state constraint cannot
cause an infeasibility in the control problem because it can be relaxed
by choosing ε; large values of ε may be undesirable as measured by the
stage-cost function, but they are not infeasible.
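A minimal Octave/MATLAB sketch of assembling (F̃, Ẽ, ẽ) from hard constraint data; the particular E, e, F, f values here are hypothetical.

E = [1; -1]; e = [1; 1];               % hypothetical: |u| <= 1
F = [1 0; -1 0]; f = [2; 2];           % hypothetical: |x_1| <= 2, softened
[qe, m] = size(E); [qf, n] = size(F);
Ft = [zeros(qe, n); zeros(qf, n); F];  % rows: input, slack, soft state
Et = [E, zeros(qe, qf);
      zeros(qf, m), -eye(qf);
      zeros(qf, m), -eye(qf)];
et = [e; zeros(qf, 1); f];
% With utilde = (u, eps), the rows read: E*u <= e, -eps <= 0
% (i.e., eps >= 0), and F*x - eps <= f, the relaxed state constraint.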
Discrete actuators and integrality constraints. In many industrial
applications, a subset of the actuators or decision variables may be in-
teger valued or discrete. A common case arises when the process has
banks of similar units such as furnaces, heaters, chillers, compressors,
etc., operating in parallel. In this kind of process, part of the control
problem is to decide how many and which of these discrete units should
be on or off during process operation to meet the setpoint or reject a
disturbance. Discrete decisions also arise in many scheduling prob-
lems. In chemical production scheduling, for example, the discrete de-
cisions can be whether or not to produce a certain chemical in a certain
Figure 1.2: Typical input constraint sets U for (a) continuous actuators and (b) mixed continuous/discrete actuators. The origin (circle) represents the steady-state operating point.
reactor during the production schedule. Since these decisions are often
made repeatedly as new measurement information becomes available,
these (re)scheduling problems are also feedback control problems.
To define discrete-valued actuators, one may add integrality constraints,
for example restricting selected components of the input to u_i(k) ∈ {0, 1}.
x + = Ax + Bu + Gw
y = Cx + Du + v
x + = Ax + Bu
y = Cx (1.5)
subject to
x + = Ax + Bu
The objective function depends on the input sequence and state se-
quence. The initial state is available from the measurement. The re-
mainder of the state trajectory, x(k), k = 1, . . . , N, is determined by the
model and the input sequence u. So we show the objective function’s
explicit dependence on the input sequence and initial state. The tuning
parameters in the controller are the matrices Q and R. We allow the
final state penalty to have a different weighting matrix, Pf , for general-
ity. Large values of Q in comparison to R reflect the designer’s intent
to drive the state to the origin quickly at the expense of large control
action. Penalizing the control action through large values of R relative
to Q is the way to reduce the control action and slow down the rate at
which the state approaches the origin. Choosing appropriate values of
Q and R (i.e., tuning) is not always obvious, and this difficulty is one of
the challenges faced by industrial practitioners of LQ control. Notice
that MPC inherits this tuning challenge.
We then formulate the following optimal LQ control problem
Notice that the objective function has a special structure in which each
stage’s cost function in the sum depends only on adjacent variable
pairs. For the first version of this problem, we consider w to be a
fixed parameter, and we would like to solve the problem
We solve the inner problem over z first, and denote the optimal value
and solution as follows
Notice that the optimal z and value function for this problem are both
expressed as a function of the y variable. We then move to the next
optimization problem and solve for the y variable
min_x ( f(w, x) + min_y ( g(x, y) + min_z h(y, z) ) )

in which the innermost problem defines the optimal value and solution
h^0(y), z^0(y); the middle problem defines g^0(x), y^0(x); and the
outermost problem defines f^0(w), x^0(w).
We can still break the problem into three smaller nested problems, but
the order is reversed
min_y ( h(y, z) + min_x ( g(x, y) + min_w f(w, x) ) )    (1.7)

in which the innermost problem defines f^0(x), w^0(x); the middle
problem defines g^0(y), x^0(y); and the outermost problem defines
h^0(z), y^0(z).
For the reader interested in trying some exercises to reinforce the
concepts of DP, Exercise 1.15 considers finding the function w̃^0(z) with
[Figure: contours in the (x_1, x_2) plane of two quadratic functions V_1(x) and V_2(x) with centers a and b, and of their sum V(x) = V_1(x) + V_2(x) with center v.]
(a) Show that the sum V (x) = V1 (x) + V2 (x) is also quadratic
in which
(c) Use the matrix inversion lemma (see Exercise 1.12) and show that
V (x) of part (b) can be expressed also in an inverse form, which
is useful in state estimation problems
V(x) = (1/2)((x − v)′H̃^{-1}(x − v) + d)

H̃ = A^{-1} − A^{-1}C′(CA^{-1}C′ + B^{-1})^{-1}CA^{-1}
v = a + A^{-1}C′(CA^{-1}C′ + B^{-1})^{-1}(b − Ca)
d = (b − Ca)′(CA^{-1}C′ + B^{-1})^{-1}(b − Ca)
Solution

(a) The sum of two quadratics is also quadratic, so we parameterize
the sum as

V(x) = (1/2)((x − v)′H(x − v) + d)

in which

H = A + B
v = H^{-1}(Aa + Bb)
d = −v′Hv + a′Aa + b′Bb
  = −(Aa + Bb)′H^{-1}(Aa + Bb) + a′Aa + b′Bb

(b)

H = A + C′BC
v = H^{-1}(Aa + C′Bb)
d = −(Aa + C′Bb)′H^{-1}(Aa + C′Bb) + a′Aa + b′Bb    (1.8)
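The formulas in (1.8) are easy to verify numerically; a minimal Octave/MATLAB sketch with hypothetical quadratic data (here A, B, C, a, b are the data of the two quadratics as in Example 1.1, not system matrices).

A = [2 0; 0 3]; B = 4; C = [1 1];      % hypothetical quadratic data
a = [1; 0]; b = 2;                     % centers of the two quadratics
x = [0.3; -0.5];                       % arbitrary test point
V1 = 0.5*(x - a)'*A*(x - a);
V2 = 0.5*(C*x - b)'*B*(C*x - b);
H = A + C'*B*C;                        % formulas (1.8)
v = H\(A*a + C'*B*b);
d = -(A*a + C'*B*b)'*(H\(A*a + C'*B*b)) + a'*A*a + b'*B*b;
Vsum = 0.5*((x - v)'*H*(x - v) + d);
disp(V1 + V2 - Vsum)                   % zero to within roundoff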
subject to
x(N) = Ax(N − 1) + Bu(N − 1)
in which

H = R + B′P_f B
v = −(B′P_f B + R)^{-1}B′P_f A x(N − 1)
d = x(N − 1)′(A′P_f A − A′P_f B(B′P_f B + R)^{-1}B′P_f A)x(N − 1)
Given this form of the cost function, we see by inspection that the opti-
mal input for u(N − 1) is v, so the optimal control law at stage N − 1 is
a linear function of the state x(N − 1). Then using the model equation,
the optimal final state is also a linear function of state x(N − 1). The
optimal cost is (1/2)(|x(N − 1)|^2_Q + d), which makes the optimal cost
a quadratic function of x(N − 1). Summarizing, for all x
subject to
x(N − 1) = Ax(N − 2) + Bu(N − 2)
The recursion from Π(N −1) to Π(N −2) is known as a backward Riccati
iteration. To summarize, the backward Riccati iteration is defined as
follows
Π(k − 1) = Q + A′Π(k)A − A′Π(k)B(B′Π(k)B + R)^{-1}B′Π(k)A
k = N, N − 1, . . . , 1    (1.10)

The optimal gain at time k is computed from the Riccati matrix at time
k + 1

K(k) = −(B′Π(k + 1)B + R)^{-1}B′Π(k + 1)A
k = N − 1, N − 2, . . . , 0    (1.13)

and the optimal cost to go from time k to time N is V_k^0(x) = (1/2)x′Π(k)x.
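A minimal Octave/MATLAB sketch of the backward Riccati iteration (1.10) and the gain computation (1.13); the system, tuning matrices, and horizon are hypothetical.

A = [1.1 1; 0 1]; B = [0; 1];          % hypothetical system
Q = eye(2); R = 1; Pf = Q; N = 20;     % hypothetical tuning and horizon
Pi = Pf;                               % Pi(N) = Pf
for k = N:-1:1
  K  = -(B'*Pi*B + R)\(B'*Pi*A);       % K(k-1), computed from Pi(k)
  Pi = Q + A'*Pi*A + A'*Pi*B*K;        % Riccati step (1.10) to Pi(k-1)
end
K0 = K;                                % K(0); control law u(0) = K0*x(0)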
Assume that we use as our control law the first feedback gain of the
finite horizon problem, K(0)
u(k) = K(0)x(k)
This system is chosen so that G(z) has a zero at z = 3/2, i.e., an unsta-
ble zero. We now construct an LQ controller that inverts this zero and
hence produces an unstable system. We would like to choose Q = C ′ C
so that y itself is penalized, but that Q is only semidefinite. We add a
small positive definite piece to C ′ C so that Q is positive definite, and
choose a small positive R penalty (to encourage the controller to mis-
behave), and N = 5
" #
′ 4/9 + .001 −2/3
Q = C C + 0.001I = R = 0.001
−2/3 1.001
of instability.
3 Please check this answer with Octave or MATLAB.
1.3.5 Controllability
A system is controllable if, for any pair of states x, z in the state space,
z can be reached in finite time from x (or x controlled to z) (Sontag,
1998, p.83). A linear discrete time system x + = Ax + Bu is therefore
controllable if there exists a finite time N and a sequence of inputs
(u(0), u(1), . . . , u(N − 1)) that transfers the system from x to z. For
linear systems, n steps suffice, and controllability is equivalent to a
rank condition on the controllability matrix C = [B  AB  · · ·  A^{n−1}B]

rank(C) = n
4 See Section A.4 of Appendix A or (Strang, 1980, pp.87–88) for a review of this result.
The following result for checking controllability also proves useful (Hau-
tus, 1972).
Notice that the first n columns of the matrix in (1.17) are linearly
independent if λ is not an eigenvalue of A, so (1.17) is equivalent to
checking the rank at just the eigenvalues of A

rank([λI − A  B]) = n    for all λ ∈ eig(A)
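Both rank tests are straightforward to apply numerically; a minimal Octave/MATLAB sketch for a hypothetical two-state system.

A = [1 1; 0 1]; B = [0; 1]; n = 2;          % hypothetical (A, B)
Ctrb = [B, A*B];                            % [B AB ... A^(n-1)B] for n = 2
disp(rank(Ctrb) == n)                       % controllability matrix test
lam = eig(A);
for i = 1:n
  disp(rank([lam(i)*eye(n) - A, B]) == n)   % Hautus test at each eigenvalue
end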
subject to
x + = Ax + Bu
x(0) = x
exists and is unique for all x. We denote the optimal sequence by u^0(x)
and the first input in the optimal sequence by u^0(0; x). The feedback
control law κ_∞(·) for this infinite horizon case is then defined as
u = κ_∞(x) in which κ_∞(x) = u^0(0; x). As stated in the following lemma,
this infinite horizon linear quadratic regulator (LQR) is stabilizing.
Lemma 1.3 (LQR convergence). For (A, B) controllable, the infinite hori-
zon LQR with Q, R > 0 gives a convergent closed-loop system
x + = Ax + Bκ∞ (x)
Proof. The cost of the infinite horizon objective is bounded above for
all x(0) because (A, B) is controllable. Controllability implies that there
exists a sequence of n inputs (u(0), u(1), . . . , u(n − 1)) that transfers
the state from any x(0) to x(n) = 0. A zero control sequence after
k = n for (u(n + 1), u(n + 2), . . .) generates zero cost for all terms
in V after k = n, and the objective function for this infinite control
sequence is therefore finite. The cost function is strictly convex in u
because R > 0 so the solution to the optimization is unique.
If we consider the sequence of costs to go along the closed-loop
trajectory, we have
V_{k+1} = V_k − (1/2)(x(k)′Qx(k) + u(k)′Ru(k))

The cost sequence is nonincreasing and bounded below, so it converges;
therefore the stage cost converges to zero, and since Q, R > 0,

x(k) → 0    u(k) → 0    as k → ∞
in which

K = −(B′ΠB + R)^{-1}B′ΠA
Π = Q + A′ΠA − A′ΠB(B′ΠB + R)^{-1}B′ΠA    (1.18)
Proving Lemma 1.3 has shown also that for (A, B) controllable and Q,
R > 0, a positive definite solution to the discrete algebraic Riccati equa-
tion (DARE), (1.18), exists and the eigenvalues of (A+BK) are asymptot-
ically stable for the K corresponding to this solution (Bertsekas, 1987,
pp.58–64).
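A minimal Octave/MATLAB sketch that computes Π by iterating (1.18) to a fixed point and confirms the closed-loop eigenvalue moduli are less than one (dare from the Octave control package is an alternative); the system and tuning are hypothetical.

A = [1.1 1; 0 1]; B = [0; 1];          % hypothetical system
Q = eye(2); R = 1;
Pi = Q;
for i = 1:5000                         % iterate (1.18) to a fixed point
  Pn = Q + A'*Pi*A - A'*Pi*B*((B'*Pi*B + R)\(B'*Pi*A));
  if norm(Pn - Pi, 'fro') < 1e-12, Pi = Pn; break, end
  Pi = Pn;
end
K = -(B'*Pi*B + R)\(B'*Pi*A);          % optimal gain from (1.18)
disp(abs(eig(A + B*K))')               % all moduli < 1: convergent loop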
This basic approach to establishing regulator stability will be gener-
alized in Chapter 2 to handle constrained and nonlinear systems, so it
is helpful for the new student to first become familiar with these ideas
in the unconstrained, linear setting. For linear systems, asymptotic
convergence is equivalent to asymptotic stability, and we delay the dis-
cussion of stability until Chapter 2. In Chapter 2 the optimal cost is
shown to be a Lyapunov function for the closed-loop system. We also
can strengthen the stability for linear systems from asymptotic stability
to exponential stability based on the form of the Lyapunov function.
The LQR convergence result in Lemma 1.3 is the simplest to estab-
lish, but we can enlarge the class of systems and penalties for which
closed-loop stability is guaranteed. The system restriction can be weak-
ened from controllability to stabilizability, which is discussed in Exer-
cises 1.19 and 1.20. The restriction on the allowable state penalty Q
can be weakened from Q > 0 to Q ≥ 0 and (A, Q) detectable, which
is also discussed in Exercise 1.20. The restriction R > 0 is retained to
ensure uniqueness of the control law. In applications, if one cares little
about the cost of the control, then R is chosen to be small, but positive
definite.
in which
m = m_x + P_{xy}P_y^{-1}(y − m_y)
P = P_x − P_{xy}P_y^{-1}P_{yx}
From the previous equation, the pair (x(0), y(0)) is a linear transfor-
mation of the pair (x(0), v(0)). Therefore, using the linear transfor-
mation of normal result (1.21), and the density of (x(0), v(0)) gives
the density of (x(0), y(0))
" # " # " #!
x(0) x(0) Q(0) Q(0)C ′
∼N ,
y(0) Cx(0) CQ(0) CQ(0)C ′ + R
Given this joint density, we then use the conditional of a joint normal
result (1.22) to obtain
p_{x(0)|y(0)}(x(0)|y(0)) = n(x(0), m, P)

in which

m = x̄(0) + L(0)(y(0) − Cx̄(0))
L(0) = Q(0)C′(CQ(0)C′ + R)^{-1}
P = Q(0) − Q(0)C′(CQ(0)C′ + R)^{-1}CQ(0)
We see that forecasting forward one time step may increase or decrease
the conditional variance of the state. If the eigenvalues of A are less
than unity, for example, the term AP (0)A′ may be smaller than P (0),
but the process noise Q adds a positive contribution. If the system is
unstable, AP (0)A′ may be larger than P (0), and then the conditional
variance definitely increases upon forecasting. See also Exercise 1.27
for further discussion of this point.
Given that px(1)|y(0) is also a normal, we are situated to add mea-
surement y(1) and continue the process of adding measurements fol-
lowed by forecasting forward one time step until we have processed
all the available data. Because this process is recursive, the storage re-
quirements are small. We need to store only the current state estimate
and variance, and can discard the measurements as they are processed.
The required online calculation is minor. These features make the op-
timal linear estimator an ideal candidate for rapid online application.
We next summarize the state estimation recursion.
and we denote the mean and variance with a superscript minus to
indicate these are the statistics before measurement y(k). At k = 0, the
recursion starts with x̂^−(0) = x̄(0) and P^−(0) = Q(0) as discussed
previously. We obtain measurement y(k) which satisfies

[x(k); y(k)] = [I  0; C  I][x(k); v(k)]

in which

x̂(k) = x̂^−(k) + L(k)(y(k) − Cx̂^−(k))
L(k) = P^−(k)C′(CP^−(k)C′ + R)^{-1}
P(k) = P^−(k) − P^−(k)C′(CP^−(k)C′ + R)^{-1}CP^−(k)

in which

x̂^−(k + 1) = Ax̂(k)
P^−(k + 1) = AP(k)A′ + Q
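A minimal Octave/MATLAB sketch of one cycle of this recursion (measurement update followed by forecast); the model, noise covariances, prior, and measurement are hypothetical.

A = [0.9 0.2; 0 0.7]; C = [1 0];       % hypothetical model
Q = 0.01*eye(2); R = 0.1;
xm = [1; 0]; Pm = eye(2);              % prior xhat^-(k), P^-(k)
y = 1.2;                               % measurement y(k)
L = Pm*C'/(C*Pm*C' + R);               % filter gain L(k)
xh = xm + L*(y - C*xm);                % xhat(k), measurement update
P  = Pm - L*C*Pm;                      % P(k)
xm = A*xh;                             % forecast: xhat^-(k+1)
Pm = A*P*A' + Q;                       % P^-(k+1)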
V_T(x(T)) = (1/2)( |x(0) − x̄(0)|^2_{(P^−(0))^{-1}} + Σ_{k=0}^{T−1} |x(k + 1) − Ax(k)|^2_{Q^{-1}} + Σ_{k=0}^{T} |y(k) − Cx(k)|^2_{R^{-1}} )    (1.26)
in which x(T ) := (x(0), x(1), . . . , x(T )). We claim and then show that
the following (deterministic) least squares optimization problem pro-
duces the same result as the conditional density function maximization
of the Kalman filter
min_{x(T)} V_T(x(T))    (1.27)
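Because the problem is an unconstrained quadratic, (1.27) can also be solved in one batch by assembling the normal equations of the objective; a minimal Octave/MATLAB sketch with a hypothetical model and illustrative data. The last block of the solution is the full information estimate of x(T).

A = [0.9 0.2; 0 0.7]; C = [1 0]; n = 2; T = 5;   % hypothetical model
Qi = inv(0.01*eye(n)); Ri = inv(0.1);            % Q^{-1}, R^{-1}
P0i = inv(eye(n)); xbar = [1; 0];                % (P^-(0))^{-1}, xbar(0)
y = randn(1, T+1);                               % illustrative data
Nv = n*(T+1); H = zeros(Nv); g = zeros(Nv, 1);
idx = @(k) (k*n+1):(k*n+n);                      % block of x(k), k = 0..T
H(idx(0), idx(0)) = P0i; g(idx(0)) = P0i*xbar;   % prior term
for k = 0:T                                      % measurement terms
  H(idx(k), idx(k)) = H(idx(k), idx(k)) + C'*Ri*C;
  g(idx(k)) = g(idx(k)) + C'*Ri*y(k+1);
end
for k = 0:T-1                                    % model terms
  H(idx(k),   idx(k))   = H(idx(k),   idx(k))   + A'*Qi*A;
  H(idx(k),   idx(k+1)) = H(idx(k),   idx(k+1)) - A'*Qi;
  H(idx(k+1), idx(k))   = H(idx(k+1), idx(k))   - Qi*A;
  H(idx(k+1), idx(k+1)) = H(idx(k+1), idx(k+1)) + Qi;
end
X = H\g;                                         % stacked estimates x(0..T)
xT = X(idx(T));                                  % smoothed x(T) = xhat(T)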
Game plan. Using forward DP, we can decompose and solve recur-
sively the least squares state estimation problem. To see clearly how
the procedure works, first we write out the terms in the state estimation
least squares problem (1.27)
min_{x(0),...,x(T)} (1/2)( |x(0) − x̄(0)|^2_{(P^−(0))^{-1}} + |y(0) − Cx(0)|^2_{R^{-1}} + |x(1) − Ax(0)|^2_{Q^{-1}}
+ |y(1) − Cx(1)|^2_{R^{-1}} + |x(2) − Ax(1)|^2_{Q^{-1}} + · · ·
+ |x(T) − Ax(T − 1)|^2_{Q^{-1}} + |y(T) − Cx(T)|^2_{R^{-1}} )    (1.28)
Then we optimize over the first state, x(0). This produces the arrival
cost for the first stage, V1− (x(1)), which we will show is also quadratic
V_1^−(x(1)) = (1/2)( |x(1) − x̂^−(1)|^2_{(P^−(1))^{-1}} + d(0) )
Next we combine the arrival cost of the first stage with the next mea-
surement y(1) to obtain V1 (x(1))
We optimize over the second state, x(1), which defines arrival cost for
the first two stages, V2− (x(2)). We continue in this fashion until we
have optimized finally over x(T ) and have solved (1.28). Now that we
have in mind an overall game plan for solving the problem, we look at
each step in detail and develop the recursion formulas of forward DP.
Using the third form in Example 1.1 we can combine these two terms
into a single quadratic function
V_0(x(0)) = (1/2)((x(0) − x̄(0) − v)′H̃^{-1}(x(0) − x̄(0) − v) + d(0))

in which

v = P^−(0)C′(CP^−(0)C′ + R)^{-1}(y(0) − Cx̄(0))
H̃ = P^−(0) − P^−(0)C′(CP^−(0)C′ + R)^{-1}CP^−(0)
d(0) = |y(0) − Cx̄(0)|^2_{(CP^−(0)C′+R)^{-1}}

If we define x̂(0) = x̄(0) + v, then

x̂(0) = x̄(0) + L(0)(y(0) − Cx̄(0))
State evolution and arrival cost. Now we add the next term in (1.28)
to the function V0 (·) and denote the sum as V (·)
Again using the third form in Example 1.1, we can add the two quadratics
to obtain

V(x(0), x(1)) = (1/2)(|x(0) − v|^2_{H̃^{-1}} + d)

in which

v = x̂(0) + P(0)A′(AP(0)A′ + Q)^{-1}(x(1) − Ax̂(0))
d = (x(1) − Ax̂(0))′(AP(0)A′ + Q)^{-1}(x(1) − Ax̂(0)) + d(0)
H̃ = P(0) − P(0)A′(AP(0)A′ + Q)^{-1}AP(0)
This form is convenient for optimization over the first decision variable
x(0); by inspection the solution is x(0) = v and the cost is (1/2)d. We
define the arrival cost to be the result of this optimization
V_1^−(x(1)) = min_{x(0)} V(x(0), x(1))
x̂^−(k + 1) = Ax̂(k)
P^−(k + 1) = AP(k)A′ + Q

and the recursion starts with the prior information x̂^−(0) = x̄(0) and
P^−(0). The arrival cost, V_k^−, and arrival cost plus measurement, V_k, for
each stage are given by

V_k^−(x(k)) = (1/2)( |x(k) − x̂^−(k)|^2_{(P^−(k))^{-1}} + d(k − 1) )
V_k(x(k)) = (1/2)( |x(k) − x̂(k)|^2_{(P(k))^{-1}} + d(k) )
p(x(k)|y(k − 1)) = (2π)^{−n/2}(det P^−(k))^{−1/2} exp( −(V_k^−(x(k)) − (1/2)d(k − 1)) )
p(x(k)|y(k)) = (2π)^{−n/2}(det P(k))^{−1/2} exp( −(V_k(x(k)) − (1/2)d(k)) )    (1.31)
[Figure: time line from 0 to T. Full information uses all measurements y(0), . . . , y(T) and states x(0), . . . , x(T); the moving horizon uses only y(T − N), . . . , y(T) and x(T − N), . . . , x(T).]
V̂_T(x_N(T)) = (1/2)( Σ_{k=T−N}^{T−1} |x(k + 1) − Ax(k)|^2_{Q^{-1}} + Σ_{k=T−N}^{T} |y(k) − Cx(k)|^2_{R^{-1}} )    (1.33)
We use the circumflex (hat) to indicate this is the MHE cost function
considering data sequence from T − N to T rather than the full infor-
mation or least squares cost considering the data from 0 to T .
MHE in terms of least squares. Notice that from our previous DP
recursion in (1.29), we can write the full least squares problem as
in which the constant c can be found from (1.19) if desired, but its
value does not change the solution to the optimization. We can see
from (1.34) that setting V_{T−N}^−(·) to zero in the simplest form of MHE is
equivalent to giving infinite variance to the conditional density of x(T −
N)|y(T − N − 1). This means we are using no information about the
state x(T −N) and completely discounting the previous measurements
y(T − N − 1).
For the linear Gaussian case, we can account for the neglected data
exactly with no approximation by setting Γ equal to the arrival cost, or,
equivalently, the negative logarithm of the conditional density of the
state given the prior measurements. Indeed, there is no need to use
MHE for the linear Gaussian problem at all because we can solve the
full problem recursively. When addressing nonlinear and constrained
problems in Chapter 4, however, we must approximate the conditional
density of the state given the prior measurements in MHE to obtain a
computationally tractable and high-quality estimator.
1.4.5 Observability
x(k + 1) = Ax(k)
y(k) = Cx(k)
The system is observable if there exists a finite N, such that for every
x(0), N measurements y(0), y(1), . . . , y(N − 1) distinguish uniquely
the initial state x(0). Similarly to the case of controllability, if we can-
not determine the initial state using n measurements, we cannot de-
termine it using N > n measurements. Therefore we can develop a
convenient test for observability as follows. For n measurements, the
initial state is determined uniquely if and only if the observability
matrix O = [C; CA; · · · ; CA^{n−1}] has full rank

rank(O) = n
The following result for checking observability also proves useful (Hau-
tus, 1972).
Notice that the first n rows of the matrix in (1.37) are linearly inde-
pendent if λ ∉ eig(A), so (1.37) is equivalent to checking the rank at
just the eigenvalues of A
" #
λI − A
rank =n for all λ ∈ eig(A)
C
6 See Section A.4 of Appendix A or (Strang, 1980, pp.87–88) for a review of this result.
x̂(T ) → x(T ) as T → ∞
V_{T−1}^0 ≤ V_{T+n−1}^0 − (1/2)( Σ_{j=−1}^{n−2} |ŵ_T(j)|^2_{Q^{-1}} + Σ_{j=0}^{n−1} |y(T + j) − Cx̂(T + j|T + n − 1)|^2_{R^{-1}} )
From the system model we have the following relationship between the
last n stages in the optimization problem at time T + n − 1 with data
y(T + n − 1)
[x̂(T|T + n − 1); x̂(T + 1|T + n − 1); ⋮ ; x̂(T + n − 1|T + n − 1)] =
[I; A; ⋮ ; A^{n−1}] x̂(T|T + n − 1) +
[0; I  0; ⋮  ⋱ ; A^{n−2}  A^{n−3}  · · ·  I] [ŵ_T(0); ŵ_T(1); ⋮ ; ŵ_T(n − 2)]    (1.39)

[y(T) − Cx̂(T|T + n − 1); y(T + 1) − Cx̂(T + 1|T + n − 1); ⋮ ; y(T + n − 1) − Cx̂(T + n − 1|T + n − 1)] =
O( x(T) − x̂(T|T + n − 1) ) −
[0; C  0; ⋮  ⋱ ; CA^{n−2}  CA^{n−3}  · · ·  C] [ŵ_T(0); ŵ_T(1); ⋮ ; ŵ_T(n − 2)]
x̂(T |T ) → x(T ) as T → ∞
This convergence result also covers MHE with prior weighting set to
the exact arrival cost because that is equivalent to Kalman filtering and
full least squares. The simplest form of MHE, which discounts prior
data completely, is also a convergent estimator, however, as discussed
in Exercise 1.28.
The estimator convergence result in Lemma 1.6 is the simplest to
establish, but, as in the case of the LQ regulator, we can enlarge the
class of systems and weighting matrices (variances) for which estimator
convergence is guaranteed. The system restriction can be weakened
from observability to detectability, which is discussed in Exercises 1.31
and 1.32. The restriction on the process disturbance weight (variance)
Q can be weakened from Q > 0 to Q ≥ 0 and (A, Q) stabilizable, which
is discussed in Exercise 1.33. The restriction R > 0 remains to ensure
uniqueness of the estimator.
1.5.1 Tracking

x̃(k) = x(k) − x_s
ũ(k) = u(k) − u_s

x̃(k + 1) = x(k + 1) − x_s
         = Ax(k) + Bu(k) − (Ax_s + Bu_s)
x̃(k + 1) = Ax̃(k) + Bũ(k)
so that the deviation variables satisfy the same model equation as the
original variables. The zero regulation problem applied to the system in
deviation variables finds ũ(k) that takes x̃(k) to zero, or, equivalently,
which takes x(k) to x_s, so that at steady state, Cx(k) = Cx_s = y_sp,
which is the goal of the setpoint tracking problem. After solving the
regulation problem in deviation variables, the input applied to the
system is u(k) = ũ(k) + u_s.
We next discuss when we can solve (1.40). We also note that for con-
strained systems, we must impose the constraints on the steady state
(xs , us ). The matrix in (1.40) is a (n + p) × (n + m) matrix. For (1.40)
to have a solution for all ysp , it is sufficient that the rows of the ma-
trix are linearly independent. That requires p ≤ m: we require at least
as many inputs as outputs with setpoints. But it is not uncommon in
applications to have many more measured outputs than manipulated
inputs. To handle these more general situations, we choose a matrix
H and denote a new variable r = Hy as a selection of linear combi-
nations of the measured outputs. The variable r ∈ R^{n_c} is known as
the controlled variable. For cases in which p > m, we choose n_c ≤ m
such linear combinations as controlled variables and assign setpoints
to r, denoted r_sp.
We also wish to treat systems with more inputs than outputs, m > p.
For these cases, the solution to (1.40) may exist for some choice of H
and rsp , but cannot be unique. If we wish to obtain a unique steady
state, then we also must provide desired values for the steady inputs,
usp . To handle constrained systems, we simply impose the constraints
on (xs , us ).
subject to
" #" # " #
I−A −B xs 0
= (1.41b)
HC 0 us rsp
Eus ≤ e (1.41c)
F Cxs ≤ f (1.41d)
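A minimal Octave/MATLAB sketch of solving the equality constraint (1.41b) when the inequalities (1.41c)-(1.41d) are inactive and the solution is unique; the model, H, and setpoint are hypothetical.

A = [0.9 0.2; 0 0.7]; B = [0; 1]; C = [1 0];   % hypothetical model
H = 1; rsp = 0.5;                              % controlled variable r = y
n = size(A,1); m = size(B,2);
M = [eye(n) - A, -B; H*C, zeros(size(H*C,1), m)];
s = M\[zeros(n,1); rsp];
xs = s(1:n); us = s(n+1:end);                  % steady-state target (xs, us)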
Assumption 1.7 (a) ensures that the solution (xs , us ) exists, and
Assumption 1.7 (b) ensures that the solution is unique. If one chooses
nc = 0, then no controlled variables are required to be at setpoint, and
the problem is feasible for any (usp , ysp ) because (xs , us ) = (0, 0) is a
feasible point. Exercises 1.56 and 1.57 explore the connection between
feasibility of the equality constraints and the number of controlled vari-
ables relative to the number of inputs and outputs. One restriction is
that the number of controlled variables chosen to be offset free must
be less than or equal to the number of manipulated variables and the
number of measurements, nc ≤ m and nc ≤ p.
V(x̃(0), ũ) = (1/2) Σ_{k=0}^{N−1} ( |x̃(k)|^2_Q + |ũ(k)|^2_R )    s.t. x̃^+ = Ax̃ + Bũ

in which x̃(0) = x̂(k) − x_s, i.e., the initial condition for the regula-
tion problem comes from the state estimate shifted by the steady-state
x_s. The regulator solves the following dynamic, zero-state regulation
problem

min_{ũ} V(x̃(0), ũ)
subject to

Eũ ≤ e − Eu_s
FCx̃ ≤ f − FCx_s

in which the constraints also are shifted by the steady state (x_s, u_s).
The optimal cost and solution are V^0(x̃(0)) and ũ^0(x̃(0)). The mov-
ing horizon control law uses the first move of this optimal sequence,
ũ^0(x̃(0)) = ũ^0(0; x̃(0)), so the controller output is u(k) = ũ^0(x̃(0)) +
u_s.
d+ = d + wd (1.42)
and we are free to choose how the integrating disturbance affects the
states and measured outputs through the choice of Bd and Cd . The only
restriction is that the augmented system is detectable. That restriction
can be easily checked using the following result.
In particular, detectability of the augmented system requires that the
number of integrating disturbances not exceed the number of measure-
ments, n_d ≤ p, together with the rank condition

rank([I − A  −B_d; C  C_d]) = n + n_d

Thus, we can choose any n_d ≤ p columns in R^{p+n} independent of the
columns of [I − A; C] for [B_d; C_d].
The state and the additional integrating disturbance are estimated
from the plant measurement using a Kalman filter designed for the
augmented system. The variances of the stochastic disturbances w
and v may be treated as adjustable parameters or found from input-
output measurements (Odelson, Rajamani, and Rawlings, 2006). The
estimator provides x̂(k) and d̂(k) at each time k. The best forecast of
the steady-state disturbance using (1.42) is simply
d̂s = d̂(k)
subject to
" #" # " #
I−A −B xs Bd d̂s
= (1.45b)
HC 0 us rsp − HCd d̂s
Eus ≤ e (1.45c)
F Cxs ≤ f − F Cd d̂s (1.45d)
[Figure 1.6: MPC controller block diagram. The estimator

[x̂; d̂]^+ = [A  B_d; 0  I][x̂; d̂] + [B; 0]u + [L_x; L_d]( y − [C  C_d][x̂; d̂] )

processes the plant measurement y; the target selector (tuning Q_s, R_s) computes (x_s, u_s) from (y_sp, u_sp, r_sp) and the estimates; and the regulator (tuning Q, R) acts on the deviation model x̃^+ = Ax̃ + Bũ to compute the input u applied to the plant.]
Lemma 1.10 (Zero offset). If the plant output y(k) goes to steady state y_s,
the closed-loop system is stable, and constraints are not active at steady
state, then there is zero offset in the controlled variables, that is

Hy_s = r_sp
But notice this condition merely restricts the output prediction error
to lie in the nullspace of the matrix Ld , which is an nd × p matrix. If
we choose nd = nc < p, then the number of columns of Ld is greater
than the number of rows and Ld has a nonzero nullspace.8 In general,
we require the output prediction error to be zero to achieve zero offset
independently of the regulator tuning. For Ld to have only the zero
vector in its nullspace, we require nd ≥ p. Since we also know nd ≤ p
from Corollary 1.9, we conclude nd = p.
Notice also that Lemma 1.10 does not require that the plant output
be generated by the model. The theorem applies regardless of what
generates the plant output. If the plant is identical to the system plus
disturbance model assumed in the estimator, then the conclusion can
be strengthened. In the nominal case without measurement or process
noise (w = 0, v = 0), for a set of plant initial states, the closed-loop sys-
tem converges to a steady state and the feasible steady-state target is
achieved leading to zero offset in the controlled variables. Characteriz-
ing the set of initial states in the region of convergence, and stabilizing
the system when the plant and the model differ, are treated in Chap-
ters 3 and 5. We conclude the chapter with a nonlinear example that
demonstrates the use of Lemma 1.10.
Example 1.11: More measured outputs than inputs and zero offset
We consider a well-stirred chemical reactor depicted in Figure 1.7, as
in Pannocchia and Rawlings (2003). An irreversible, first-order reac-
tion A -→ B occurs in the liquid phase and the reactor temperature is
[Figure 1.7: stirred tank reactor with feed flow F_0 at temperature T_0 and concentration c_0, coolant temperature T_c, outlet flow F at temperature T and concentration c, and tank radius r.]
regulated with external cooling. Mass and energy balances lead to the
following nonlinear state space model
dc/dt = F_0(c_0 − c)/(πr²h) − k_0 exp(−E/RT) c
dT/dt = F_0(T_0 − T)/(πr²h) + (−ΔH/(ρC_p)) k_0 exp(−E/RT) c + (2U/(rρC_p))(T_c − T)
dh/dt = (F_0 − F)/(πr²)
The controlled variables are h, the level of the tank, and c, the molar
concentration of species A. The additional state variable is T , the re-
actor temperature; while the manipulated variables are Tc , the coolant
liquid temperature, and F , the outlet flowrate. Moreover, it is assumed
that the inlet flowrate acts as an unmeasured disturbance. The model
parameters in nominal conditions are reported in Table 1.1. The open-
loop stable steady-state operating conditions are the following
in which

A = [0.2681  −0.00338  −0.00728; 9.703  0.3279  −25.44; 0  0  1]
B = [−0.00537  0.1655; 1.297  97.91; 0  −6.637]
C = [1  0  0; 0  1  0; 0  0  1]
B_p = [−0.1175; 69.74; 6.637]
The closed-loop responses are simulated using the nonlinear differential
equations for the plant model. Do you have steady offset in any of the
outputs? Which ones?
Solution

(a) Integrating disturbances are added to the two controlled variables
(first and third outputs) by choosing

C_d = [1  0; 0  0; 0  1]    B_d = 0
Figure 1.8: Three measured outputs, c (kmol/m³), T (K), and h (m), versus time (min) after a step change in inlet flowrate at 10 minutes; n_d = 2.
Figure 1.9: Two manipulated inputs, T_c (K) and F (m³/min), versus time (min) after a step change in inlet flowrate at 10 minutes; n_d = 2.
Figure 1.10: Three measured outputs, c (kmol/m³), T (K), and h (m), versus time (min) after a step change in inlet flowrate at 10 minutes; n_d = 3.
(c) Next we try three integrating disturbances: two added to the two
controlled variables, and one added to the second manipulated
variable

C_d = [1  0  0; 0  0  0; 0  1  0]    B_d = [0  0  0.1655; 0  0  97.91; 0  0  −6.637]
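A minimal Octave/MATLAB sketch checking the detectability rank condition stated after (1.42) for this n_d = 3 choice, using the linearized A and C given earlier.

A  = [0.2681 -0.00338 -0.00728; 9.703 0.3279 -25.44; 0 0 1];
C  = eye(3);
Bd = [0 0 0.1655; 0 0 97.91; 0 0 -6.637];
Cd = [1 0 0; 0 0 0; 0 1 0];
n = 3; nd = 3;
disp(rank([eye(n) - A, -Bd; C, Cd]) == n + nd)  % true for this choice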
Figure 1.11: Two manipulated inputs, T_c (K) and F (m³/min), versus time (min) after a step change in inlet flowrate at 10 minutes; n_d = 3.
Further notation
G transfer function matrix
m mean of normally distributed random variable
T reactor temperature
ũ input deviation variable
x, y, z spatial coordinates for a distributed system
x̃ state deviation variable
1.6 Exercises
(a) Write the linear state space model for the deterministic series chemical reaction
model. Assume we can measure the component A concentration. What are x,
y, A, B, C, and D for this model?
(b) Simulate this model with initial conditions and parameters given by
[Figure: tube with flow velocity v, inlet at z = 0 and outlet at z = L.]
(b) Assuming plug flow and neglecting chemical reaction in the tube, show that the
equation of change reduces to
∂c_j/∂t = −v ∂c_j/∂z    (1.46)
This equation is known as a hyperbolic, first-order partial differential equation.
Assume the boundary and initial conditions are
c_j(z, t) = u(t)    z = 0, t ≥ 0    (1.47)
c_j(z, t) = c_{j0}(z)    0 ≤ z ≤ L, t = 0    (1.48)
In other words, we are using the feed concentration as the manipulated variable,
u(t), and the tube starts out with some initial concentration profile of compo-
nent j, cj0 (z).
(c) Show that the solution to (1.46) with these boundary conditions is
c_j(z, t) = u(t − z/v) if vt > z;  c_j(z, t) = c_{j0}(z − vt) if vt < z    (1.49)
(d) If the reactor starts out empty of component j, show that the transfer function
between the outlet concentration, y = cj (L, t), and the inlet concentration, cj (0,
t) = u(t), is a time delay. What is the value of θ?
(b) Use the chain rule to find the velocity of the pendulum in terms of the time
derivatives of r and θ. Do not simplify yet by assuming r is constant. We want
the general result.
9 You will need the Gauss divergence theorem and 3D Leibniz formula to go from a mass balance on a volume element to the equation of continuity.
[Figure: pendulum of mass m at angle θ.]
(d) Use a momentum balance on the pendulum mass (you may assume it is a point
mass) to determine both the force exerted by the link
t = mRθ̇² + mg cos θ
and an equation for the acceleration of the pendulum due to gravity and the
applied torque
mR θ̈ − T /R + mg sin θ = 0
(e) Define a state vector and give a state space description of your system. What is
the physical significance of your state. Assume you measure the force exerted
by the link.
One answer is

dx_1/dt = x_2
dx_2/dt = −(g/R) sin x_1 + u
y = mRx_2² + mg cos x_1

in which u = T/(mR²)
y(k) = C̃x(k) + D̃u(k)

(b) Is your result valid for A singular? If not, how can you find Ã, B̃, C̃, and D̃ for this case?
(b) Assume f is linear and apply this result to check the result of Exercise 1.5.
(c) Prove that (1.53) is true without assuming the eigenvalues are distinct.
Hint: use the Taylor series defining the functions and apply the Cayley-Hamilton
theorem (Horn and Johnson, 1985, pp. 86–87).
(b) Which A give purely oscillatory solutions? What are the corresponding Ã?

(c) For what A is the solution of the ODE stable? Unstable? What are the corresponding Ã?
(a) G(s) = 1/(2s + 1)
(b) G(s) = 1/((2s + 1)(3s + 1))
(c) G(s) = (2s + 1)/(3s + 1)
(d) y(k + 1) = y(k) + 2u(k)
(e) y(k + 1) = a_1 y(k) + a_2 y(k − 1) + b_1 u(k) + b_2 u(k − 1)
(c) A host of other useful control-related inversion formulas follow from these results. Equate the (1,1) or (2,2) entries of Z^{-1} and derive the identity

(I + X^{-1})^{-1} = I − (I + X)^{-1}

(d) Equate the (1,2) or (2,1) entries of Z^{-1} and derive the identity
subject to
Dx = d
in which H > 0, x ∈ Rn , d ∈ Rm , m < n, i.e., fewer constraints than decisions. Rather
than partially solving for x using the constraint and eliminating it, we make use of the
method of Lagrange multipliers for treating the equality constraints (Fletcher, 1987;
Nocedal and Wright, 2006).
In the method of Lagrange multipliers, we augment the objective function with the
constraints to form the Lagrangian function, L
L(x, λ) = (1/2)x ′ Hx + h′ x − λ′ (Dx − d)
in which λ ∈ Rm is the vector of Lagrange multipliers. The necessary and sufficient
conditions for a global minimizer are that the partial derivatives of L with respect to x
and λ vanish (Nocedal and Wright, 2006, p. 451), (Fletcher, 1987, pp. 198, 236).
(a) Show that the necessary and sufficient conditions are equivalent to the matrix equation

[H  −D′; −D  0][x; λ] = −[h; d]    (1.57)

The solution to (1.57) then provides the solution to the original problem (1.56).
(b) We note one other important feature of the Lagrange multipliers, their relation-
ship to the optimal cost of the purely quadratic case. For h = 0, the cost is given
by
V 0 = (1/2)(x 0 )′ Hx 0
Show that this can also be expressed in terms of λ0 by the following
V 0 = (1/2)d′ λ0
• Regulator form

x_1^0(d) = K̃d    K̃ = (D′H_2D + H_1)^{-1}D′H_2
x_2^0(d) = (I − DK̃)d

• Estimator form

V^0(d) = d′(DH_1^{-1}D′ + H_2^{-1})^{-1}d
x_1^0(d) = L̃d    L̃ = H_1^{-1}D′(DH_1^{-1}D′ + H_2^{-1})^{-1}
x_2^0(d) = (I − DL̃)d
(a) Show that the system is not controllable by checking the rank of the controlla-
bility matrix.
(b) Show that the modes x1 can be controlled from any x1 (0) to any x1 (n) with a se-
quence of inputs u(0), . . . , u(n−1), but the modes x2 cannot be controlled from
any x2 (0) to any x2 (n). The states x2 are termed the uncontrollable modes.
(c) If A22 is stable the system is termed stabilizable. Although not all modes can be
controlled, the uncontrollable modes are stable and decay to steady state.
The following lemma gives an equivalent condition for stabilizability.
Lemma 1.12 (Hautus lemma for stabilizability). A system is stabilizable if and only if

rank([λI − A  B]) = n    for all |λ| ≥ 1

Prove this lemma using Lemma 1.2 as the condition for controllability.
(b) Show that the infinite horizon LQR is stabilizing for (A, B) stabilizable and R > 0,
Q ≥ 0, and (A, Q) detectable. Discuss what happens to the controller’s stabiliz-
ing property if Q is not positive semidefinite or (A, Q) is not detectable.
subject to the model. Notice the penalty on the final state is now simply Q(N) instead
of Pf .
Apply the DP argument to this problem and determine the optimal input sequence
and cost. Can this problem also be solved in closed form like the time-invariant case?
(b) Repeat for a singular A matrix. What happens to the two solution techniques?
V(x(0), u) = (1/2) Σ_{k=0}^{N−1} ( x(k)′Qx(k) + u(k)′Ru(k) + 2x(k)′Mu(k) ) + (1/2)x(N)′P_f x(N)
(a) Solve this problem with backward DP and write out the Riccati iteration and
feedback gain.
(b) Control engineers often wish to tune a regulator by penalizing the rate of change
of the input rather than the absolute size of the input. Consider the additional
positive definite penalty matrix S and the modified objective function
V(x(0), u) = (1/2) Σ_{k=0}^{N−1} ( x(k)′Qx(k) + u(k)′Ru(k) + Δu(k)′SΔu(k) ) + (1/2)x(N)′P_f x(N)
in which Δu(k) = u(k) − u(k − 1). Show that you can augment the state to include u(k − 1) via

x̃(k) = [x(k); u(k − 1)]

and reduce this new problem to the standard LQR with the cross term. What are Ã, B̃, Q̃, R̃, and M̃ for the augmented problem (Rao and Rawlings, 1999)?
Exercise 1.26: Existence, uniqueness and stability with the cross term
Consider the linear quadratic problem with system
x + = Ax + Bu (1.58)
and infinite horizon cost function

V(x(0), u) = (1/2) Σ_{k=0}^{∞} ( x(k)′Qx(k) + u(k)′Ru(k) )
The existence, uniqueness and stability conditions for this problem are: (A, B) stabilizable, Q ≥ 0, (A, Q) detectable, and R > 0. Consider the modified objective function with the cross term

V = (1/2) Σ_{k=0}^{∞} ( x(k)′Qx(k) + u(k)′Ru(k) + 2x(k)′Mu(k) )    (1.59)
(a) Consider reparameterizing the input as
v(k) = u(k) + T x(k) (1.60)
Choose T such that the cost function in x and v does not have a cross term,
and express the existence, uniqueness and stability conditions for the trans-
formed system. Goodwin and Sin (1984, p.251) discuss this procedure in the
state estimation problem with nonzero covariance between state and output
measurement noises.
(b) Translate and simplify these to obtain the existence, uniqueness and stability
conditions for the original system with cross term.
(b) If A is unstable, is it true that AP (0)A′ > P (0)? If so, prove it. If not, provide a
counterexample.
(c) If the magnitudes of all the eigenvalues of A are unstable, is it true that AP (0)A′ >
P (0)? If so, prove it. If not, provide a counterexample.
In forward DP, x is the state at the current stage, z is the state at the next stage. The stage cost and arrival cost are given by

ℓ(z, w) = (1/2)( |y(k + 1) − Cz|^2_{R^{-1}} + w′Q^{-1}w )    V_k^0(x) = (1/2)|x − x̂(k)|^2_{P(k)^{-1}}

and we wish to find V_{k+1}^0(z) in the estimation problem.
(a) In the estimation problem, take the z term outside the optimization and solve

min_{x,w} (1/2)( w′Q^{-1}w + (x − x̂(k))′P(k)^{-1}(x − x̂(k)) )  s.t.  z = Ax + w
using the inverse form in Exercise 1.18, and show that the optimal cost is given by

V^0(z) = (1/2)(z − Ax̂(k))′(P^−(k + 1))^{-1}(z − Ax̂(k))
P^−(k + 1) = AP(k)A′ + Q

Add the z term to this cost using the third part of Example 1.1 and show that

V_{k+1}^0(z) = (1/2)(z − x̂(k + 1))′P^{-1}(k + 1)(z − x̂(k + 1))
P(k + 1) = P^−(k + 1) − P^−(k + 1)C′(CP^−(k + 1)C′ + R)^{-1}CP^−(k + 1)
x̂(k + 1) = Ax̂(k) + L(k + 1)(y(k + 1) − CAx̂(k))
L(k + 1) = P^−(k + 1)C′(CP^−(k + 1)C′ + R)^{-1}
(b) In the regulator problem, take the z term outside the optimization and solve the remaining two-term problem using the regulator form of Exercise 1.18. Then add the z term and show that

V_{k−1}^0(z) = (1/2)z′Π(k − 1)z
Π(k − 1) = Q + A′Π(k)A − A′Π(k)B(B′Π(k)B + R)^{-1}B′Π(k)A
u^0(z) = K(k − 1)z
x^0(z) = (A + BK(k − 1))z
K(k − 1) = −(B′Π(k)B + R)^{-1}B′Π(k)A
This symmetry can be developed further if we pose an output tracking problem rather
than zero state regulation problem in the regulator.
(a) Show that the system is not observable by checking the rank of the observability
matrix.
(b) Show that the modes x1 can be uniquely determined from a sequence of mea-
surements, but the modes x2 cannot be uniquely determined from the measure-
ments. The states x2 are termed the unobservable modes.
(c) If A22 is stable the system is termed detectable. Although not all modes can be
observed, the unobservable modes are stable and decay to steady state.
The following lemma gives an equivalent condition for detectability.
Lemma 1.13 (Hautus lemma for detectability). A system is detectable if and only if

rank([λI − A; C]) = n    for all |λ| ≥ 1

Prove this lemma using Lemma 1.4 as the condition for observability.
(a) Because Q−1 is not defined in this problem, the objective function defined in
(1.26) requires modification. Show that the objective function with semidefinite
Q ≥ 0 can be converted into the following form
V(x(0), w(T)) = (1/2)( |x(0) − x̄(0)|^2_{(P^−(0))^{-1}} + Σ_{k=0}^{T−1} |w(k)|^2_{Q̃^{-1}} + Σ_{k=0}^{T} |y(k) − Cx(k)|^2_{R^{-1}} )
in which

x^+ = Ax + Gw    Q̃ > 0

Find expressions for Q̃ and G in terms of the original semidefinite Q. How are the dimensions of Q̃ and G related to the rank of Q?
(b) What is the probabilistic interpretation of the state estimation problem with
semidefinite Q?
(c) Show that (A, Q) stabilizable implies (A, G) stabilizable in the converted form.
(d) Show that this estimator is stable for (A, C) detectable and (A, G) stabilizable with Q̃, R > 0.
(e) Discuss what happens to the estimator’s stability if Q is not positive semidefinite
or (A, Q) is not stabilizable.
(a) Prove that the estimate of the mean is unbiased for all N, i.e., show that for all
N
E(x̂N ) = x
(b) Prove that the estimate of the variance is not unbiased for any N, i.e., show that
for all N
E(P̂N ) ≠ P
(c) Using the result above, provide an alternative formula for the variance estimate
that is unbiased for all N. How large does N have to be before these two estimates
of P are within 1%?
The trace of a square matrix A, written tr(A), is defined to be the sum of the diagonal elements

tr(A) := Σ_i A_{ii}
(b) Show that the mean and the maximum likelihood are equal for the normal dis-
tribution. Draw a sketch of this result. The maximum likelihood estimate, x̂, is
defined as
x̂ := arg max_x p_ξ(x)
in which arg returns the solution to the optimization problem.
Given that the joint density is well defined, prove the marginal densities and the condi-
tional densities also are well defined, i.e., given P > 0, prove Px > 0, Py > 0, Px|y > 0,
Py|x > 0.
η = Aξ
y ∼ N(Cmx , CPx C ′ )
Does this result hold for all C? If yes, prove it; if no, provide a counterexample.
Exercise 1.41: Signal processing in the good old days—recursive least squares
Imagine we are sent back in time to 1960 and the only computers available have ex-
tremely small memories. Say we have a large amount of data coming from a process
and we want to compute the least squares estimate of model parameters from these
data. Our immediate challenge is that we cannot load all of these data into memory to
make the standard least squares calculation.
Alternatively, go 150 years further back in time and consider the situation from
Gauss’s perspective,
It occasionally happens that after we have completed all parts of an ex-
tended calculation on a sequence of observations, we learn of a new ob-
servation that we would like to include. In many cases we will not want to
have to redo the entire elimination but instead to find the modifications
due to the new observation in the most reliable values of the unknowns
and in their weights.
C.F. Gauss, 1823
G.W. Stewart Translation, 1995, p. 191.
Consider the linear model
yi = Xi′ θ
in which scalar yi is the measurement at sample i, Xi′ is the independent model variable
(row vector, 1 × p) at sample i, and θ is the parameter vector (p × 1) to be estimated
from these data. Given the weighted least squares objective and n measurements, we
wish to compute the usual estimate
θ̂ = (X ′ X)−1 X ′ y (1.62)
in which
    [ y1 ]        [ X1′ ]
y = [ ⋮  ]    X = [  ⋮  ]
    [ yn ]        [ Xn′ ]
We do not wish to store the large matrices X(n × p) and y(n × 1) required for this
calculation. Because we are planning to process the data one at a time, we first modify
our usual least squares problem to deal with small n. For example, we wish to estimate
the parameters when n < p and the inverse in (1.62) does not exist. In such cases, we
may choose to regularize the problem by modifying the objective function as follows
Φ(θ) = (θ − θ̄)′P₀⁻¹(θ − θ̄) + Σ_{i=1}^{n} (yᵢ − Xᵢ′θ)²
in which θ̄ and P₀ are chosen by the user. In Bayesian estimation, we call θ̄ and P₀ the prior information, and often assume that the prior density of θ (without measurements) is normal
θ ∼ N(θ̄, P₀)
The solution to this modified least squares estimation problem is
θ̂ = θ̄ + (X′X + P₀⁻¹)⁻¹X′(y − Xθ̄)        (1.63)
Devise a means to recursively estimate θ so that:
1. We never store more than one measurement at a time in memory.
2. After processing all the measurements, we obtain the same least squares estimate given in (1.63). (A numerical sketch of one such recursion follows.)
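Here is a minimal sketch of one workable scheme (Python with NumPy; synthetic data, unit measurement weights, and prior (θ̄, P₀) as in (1.63) are illustrative choices); it stores only the current estimate and one p × p matrix, and after all n samples it reproduces the batch solution:

import numpy as np

rng = np.random.default_rng(1)
p, n = 3, 200
theta_true = np.array([1.0, -2.0, 0.5])

theta = np.zeros(p)       # prior mean, theta-bar
P = 10.0 * np.eye(p)      # prior weighting P0

for _ in range(n):
    Xi = rng.normal(size=p)                    # model row X_i'
    yi = Xi @ theta_true + 0.1 * rng.normal()  # one scalar measurement
    PX = P @ Xi
    gain = PX / (1.0 + Xi @ PX)                # rank-one update gain
    theta = theta + gain * (yi - Xi @ theta)
    P = P - np.outer(gain, PX)

print(theta)   # matches the batch estimate (1.63) for these data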
Consider the measurements to be sampled from (1.64) with true parameter value
θ0. Show that using the least squares formula, the parameter estimate is distributed as
θ̂ ∼ N(θ0, Pθ̂)        Pθ̂ = σ²(X′X)⁻¹
(b) Now consider again the model of (1.64) and a Bayesian estimation problem. As-
sume a prior distribution for the random variable θ
θ ∼ N(θ̄, P)
Compute the conditional density of θ given measurement y, show that this
density is normal, and find its mean and covariance
pθ|y (θ|y) = n(θ, m, P )
Show that Bayesian estimation and least squares estimation give the same result
in the limit of an infinite variance prior. In other words, if the covariance of the
prior is large compared to the covariance of the measurement error, show that
m ≈ (X ′ X)−1 X ′ y P ≈ Pθ̂
(c) What (weighted) least squares minimization problem is solved for the general
measurement error covariance
e ∼ N(0, R)
Derive the least squares estimate formula for this case.
(d) Again consider the measurements to be sampled from (1.64) with true param-
eter value θ0 . Show that the weighted least squares formula gives parameter
estimates that are distributed as
θ̂ ∼ N(θ0 , Pθ̂ )
and find Pθ̂ for this case.
(e) Show again that Bayesian estimation and least squares estimation give the same
result in the limit of an infinite variance prior.
Show that the covariance of the least squares estimator is the smallest covariance of
all linear unbiased estimators.
(b) Combine the two models together into a single model and show that the rela-
tionship between z and x is
z = BAx + e3        cov(e3) = Q3
Express Q3 in terms of Q1, Q2 and the models A, B. What is the optimal least squares estimate x̂ of x given measurements of z and the one-stage model? Call this the one-stage estimate of x.
(c) Are the one-stage and two-stage estimates of x the same? If yes, prove it. If
no, provide a counterexample. Do you have to make any assumptions about the
models A, B?
(c) What is the motivation for changing from these least squares estimators to the
moving horizon estimators we discussed in the chapter?
y = f (x) + v
Derive this result and state what additional assumptions on the random variables x
and v are required for this result to be correct.
x(0) ∼ N(x̄(0), Q0)
in which the random variables w(k) and v(k) are independent, identically distributed
normals, w(k) ∼ N(0, Q), v(k) ∼ N(0, R).
(a) Calculate the standard density for the filtering problem, px(0)|y(0) (x(0)|y(0)).
P1 : min_{x,y,z} V(x, y, z)
The arrival cost decomposes this three-variable optimization problem into two smaller-dimensional optimization problems. Define the "arrival cost" g̃ for this problem as the solution to the following single-variable optimization problem
g̃(y) = min_x g(x, y)
P2 : min_{y,z} g̃(y) + h(y, z)
(x ′ , y ′ , z′ ) = (x 0 , y 0 , z0 )
(b) Repeat the previous part for the following optimization problems
Here the y variables do not appear in g but restrict the x variables through a
linear constraint. The two optimization problems are
P2 : min_{y,z} g̃(y) + h(y, z)
in which
g̃(y) = min_x g(x)    subject to Ex = y
subject to
x + = Ax + w
y = Cx + v (1.66)
in which the sequence of measurements y(T ) are known values. Notice we assume the
noise-shaping matrix, G, is an identity matrix here. See Exercise 1.53 for the general
case. Using the result of the first part of Exercise 1.51, show that this problem is
equivalent to the following problem
min_{x(T−N),w,v}  V⁻_{T−N}(x(T − N)) + (1/2) Σ_{i=T−N}^{T−1} ( |w(i)|²Q⁻¹ + |v(i)|²R⁻¹ )
subject to (1.66) and x(N) = a. Notice that any value of N, 0 ≤ N ≤ T , can be used to
split the cost function using the arrival cost.
subject to
x + = Ax + Gw
y = Cx + v (1.67)
in which the sequence of measurements y are known values. Using the result of the
second part of Exercise 1.51, show that this problem also is equivalent to the following
problem
min_{x(T−N),w,v}  V⁻_{T−N}(x(T − N)) + (1/2) Σ_{i=T−N}^{T−1} ( |w(i)|²Q⁻¹ + |v(i)|²R⁻¹ )
subject to x(k) = a and the model (1.67). Notice that any value of N, 0 ≤ N ≤ T , can
be used to split the cost function using the arrival cost.
(b) Assume only input one u1 is available for control. Is the output setpoint feasible?
What is the target in this case using Qs = I?
(c) Assume both inputs are available for control but only the first output has a
setpoint, y1t = 1. What is the solution to the target problem for Rs = I?
(b) Show that (1.68) implies that the number of controlled variables without offset
is less than or equal to the number of manipulated variables and the number of
measurements, nc ≤ m and nc ≤ p.
(d) Does (1.68) imply that the rows of C are independent? If so, prove it; if not,
provide a counterexample.
(e) By choosing H, how can one satisfy (1.68) if one has installed redundant sensors
so several rows of C are identical?
subject to
[ I − A   −B ] [ x ]   [  0  ]
[   H      0 ] [ u ] = [ rsp ]
Show that the steady-state solution (x, u) exists for any (rsp, usp) and is unique if
rank [ I − A   −B ] = n + p        rank [ I − A ] = n
     [   H      0 ]                     [   H   ]
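When these rank conditions hold (and with p = m so the system of equations is square), the target is a single linear solve. A minimal sketch (Python with NumPy; the system data are illustrative):

import numpy as np

def steady_state_target(A, B, H, rsp):
    # Solve [[I - A, -B], [H, 0]] [x; u] = [0; rsp].
    n, m = B.shape
    p = H.shape[0]
    M = np.block([[np.eye(n) - A, -B],
                  [H, np.zeros((p, m))]])
    return np.linalg.solve(M, np.concatenate([np.zeros(n), rsp]))

A = np.array([[0.5, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.5]])
H = np.array([[1.0, 0.0]])                            # first state is controlled
print(steady_state_target(A, B, H, np.array([1.0])))  # [x1, x2, u] = [1, 5, 1]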
in which
    [ −0.281   0.935   0.035   0.008 ]      [ 0.687 ]
A = [  0.047  −0.116   0.053   0.383 ]  B = [ 0.589 ]      C = I
    [  0.679   0.519   0.030   0.067 ]      [ 0.930 ]
    [  0.679   0.831   0.671  −0.083 ]      [ 0.846 ]
(a) Compute the eigenvalues of A. Choose a sample time of ∆ = 0.04 and simulate the MPC regulator response given x(0) = [−0.9  −1.8  0.7  2]′ until t = 20. Use an ODE solver to simulate the continuous time plant response. Plot all states and the input versus time.
Now add an input disturbance to the regulator so the control applied to the plant
is ud instead of u in which
and w1 and w2 are zero-mean, normally distributed random variables with unit
variance. Simulate the regulator’s performance given this disturbance. Plot all
states and ud (k) versus time.
(b) Repeat the simulations with and without disturbance for ∆ = 0.4 and ∆ = 2.
(c) Compare the simulations for the different sample times. What happens if the
sample time is too large? Choose an appropriate sample time for this system
and justify your choice.
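A minimal sketch of the simulation setup (Python with SciPy; the identity weights Q = I, R = I and the LQR feedback standing in for the unconstrained MPC regulator are illustrative choices, not prescribed by the exercise):

import numpy as np
from scipy.signal import cont2discrete
from scipy.linalg import solve_discrete_are

# Continuous time (A, B) as reconstructed above.
A = np.array([[-0.281, 0.935, 0.035, 0.008],
              [0.047, -0.116, 0.053, 0.383],
              [0.679, 0.519, 0.030, 0.067],
              [0.679, 0.831, 0.671, -0.083]])
B = np.array([[0.687], [0.589], [0.930], [0.846]])

print(np.linalg.eigvals(A))   # part (a): eigenvalues of A

Delta = 0.04                  # sample time
Ad, Bd, *_ = cont2discrete((A, B, np.eye(4), np.zeros((4, 1))), Delta)
P = solve_discrete_are(Ad, Bd, np.eye(4), np.eye(1))
K = -np.linalg.solve(Bd.T @ P @ Bd + np.eye(1), Bd.T @ P @ Ad)

x = np.array([-0.9, -1.8, 0.7, 2.0])
xs, us = [x], []
for _ in range(int(20 / Delta)):
    u = K @ x                 # regulator move, held over the sample period
    x = Ad @ x + Bd @ u       # exact plant update under a zero-order hold
    xs.append(x); us.append(u)
# Plot np.array(xs) and np.array(us) against time k*Delta.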
x + = Ax + Bu + Bp p
y = Cx
in which
    [ 0.2681  −0.00338   −0.00728 ]      [ 1  0  0 ]
A = [ 9.703    0.3279   −25.44    ]  C = [ 0  1  0 ]
    [ 0        0           1      ]      [ 0  0  1 ]
    [ −0.00537    0.1655 ]        [ −0.1175 ]
B = [  1.297     97.91   ]   Bp = [  69.74  ]
    [  0         −6.637  ]        [   6.637 ]
(a) Since there are two inputs, choose two outputs in which to remove steady-state
offset. Build an output disturbance model with two integrators. Is your aug-
mented model detectable?
[Block diagram: negative-feedback loop with controller gc and plant g; setpoint ysp, controller output u, measured output y.]
(c) Can you find any two-integrator disturbance model that removes offset in two
outputs? If so, which disturbance model do you use? If not, why not?
g(s) = k e^{−θs}/(τs + 1),    k = 1, τ = 1, θ = 5
Consider a unit step change in setpoint ysp at t = 0.
(a) Choose a reasonable sample time, ∆, and disturbance model, and simulate an
offset-free discrete time MPC controller for this setpoint change. List all of your
chosen parameters.
(b) Choose PID tuning parameters to achieve “good performance” for this system.
List your PID tuning parameters. Compare the performances of the two con-
trollers.
(b) Now let’s add some measurement noise to all three sensors. So we all work on
the same problem, choose the variance of the measurement error Rv to be
in which (cs , Ts , hs ) are the nominal steady states of the three measurements.
Is the performance from the previous part assuming no measurement noise ac-
ceptable? How do you adjust your estimator from the previous part to obtain
good performance? Rerun the simulation with measurement noise and your ad-
justed state estimator. Comment on the change in the performance of your new
design that accounts for the measurement noise.
(c) Recall that the offset lemma 1.10 is an either-or proposition, i.e., either the con-
troller removes steady offset in the controlled variables or the system is closed-
loop unstable. From closed-loop simulation, approximate the range of plant
U values for which the controller is stabilizing (with zero measurement noise).
From a stabilization perspective, which disturbance is worse, an increase or de-
crease in the plant’s heat-transfer coefficient?
(b) Use the data to identify a third-order linear state space model by calling iddata
and ssest. Compare the step tests of your identified model with those from
the linear model used in Example 1.11. Which is more accurate compared to the
true plant simulation?
(c) Using the code for Example 1.11 as a starting point, replace the linear model in
the MPC controller with your identified model and recalculate Figures 1.10 and
1.11 from the example. Is your control system robust enough to obtain good
closed-loop control of the nonlinear plant using your linear model identified
from data in the MPC controller? Do you maintain zero offset in the controlled
variables?
2.1 Introduction
In Chapter 1 we investigated a special, but useful, form of model pre-
dictive control (MPC); an important feature of this form of MPC is that, if
the terminal cost is chosen to be the value function of the infinite horizon unconstrained optimal control problem, there exists a set of initial
states for which MPC is actually optimal for the infinite horizon con-
strained optimal control problem and therefore inherits its associated
advantages. Just as there are many methods other than infinite horizon
linear quadratic control for stabilizing linear systems, there are alterna-
tive forms of MPC that can stabilize linear and even nonlinear systems.
We explore these alternatives in the remainder of this chapter. But first
we place MPC in a more general setting to facilitate comparison with
other control methods.
MPC is, as we have seen earlier, a form of control in which the control
action is obtained by solving online, at each sampling instant, a finite
horizon optimal control problem in which the initial state is the current
state of the plant. Optimization yields a finite control sequence, and
the first control action in this sequence is applied to the plant. MPC
differs, therefore, from conventional control in which the control law
is precomputed offline. But this is not an essential difference; MPC
implicitly implements a control law that can, in principle, be computed
offline as we shall soon see. Specifically, if the current state of the
system being controlled is x, MPC obtains, by solving an open-loop
optimal control problem for this initial state, a specific control action
u to apply to the plant.
Dynamic programming (DP) may be used to solve a feedback version of the same optimal control problem, however, yielding a receding horizon control law κ(·). The important fact is that if x is the current state, the control action obtained by MPC is identical to κ(x), the receding horizon control law evaluated at x.
In this chapter we study MPC for the case when the state is known.
This case is particularly important, even though it rarely arises in prac-
tice, because important properties, such as stability and performance,
may be relatively easily established. The relative simplicity of this case
arises from the fact that if the state is known and if there are no disturbances or model error, the problem is deterministic, i.e., there is no uncertainty, making feedback unnecessary in principle. As we pointed
out previously, for deterministic systems the MPC action for a given
state is identical to the receding horizon control law, determined using
DP, and evaluated at the given state. When the state is not known, it has
to be estimated and state estimation error, together with model error
and disturbances, makes the system uncertain in that future trajecto-
ries cannot be precisely predicted. The simple connection between MPC
and the DP solution is lost because there does not exist an open-loop
optimal control problem whose solution yields a control action that is
the same as that obtained by the DP solution. A practical consequence
is that special techniques are required to ensure robustness against
these various forms of uncertainty. So the results of this chapter hold
when there is no uncertainty. We prove, in particular, that the optimal
control problem that defines the model predictive control can always
be solved if the initial optimal control problem can be solved (recursive
feasibility), and that the optimal cost can always be reduced, allowing
us to prove asymptotic or exponential stability of the target state. We
in which x(t) and u(t) satisfy ẋ = f (x, u). The optimal control prob-
lem P∞ (x) is defined by
subject to
ẋ = f (x, u) x(0) = x0
(x(t), u(t)) ∈ Z for all t ∈ R≥0
If ℓ(·) is positive definite, the goal of the regulator is to steer the state
of the system to the origin.
dx(t)/dt = f(x(t), u0∞(t; x))
If f (·), ℓ(·) and Vf (·) satisfy certain differentiability and growth as-
sumptions, and if the class of admissible controls is sufficiently rich,
then a solution to P∞ (x) exists for all x and satisfies
V̇∞0(x) = −ℓ(x, u0∞(0; x))
Using this and upper and lower bounds on V∞0(·) enables global asymptotic stability of the origin to be established.
Although the control law u0∞ (0; ·) provides excellent closed-loop
properties, there are several impediments to its use. A feedback, rather
than an open-loop, solution of the optimal control problem is desirable
because of uncertainty; solution of the optimal control problem P∞ (x)
yields the optimal control sequence u0∞ (0; x) for the state x but does
not provide a control law. Dynamic programming may, in principle, be
employed, but is generally impractical if the state dimension and the
horizon are not small.
If we turn instead to an MPC approach in which we generate on-line only the value of the optimal control sequence u0∞(·; x) for the cur-
rently measured value of x, rather than for all x, the problem remains
formidable for the following reasons. First, we are optimizing a time
function, u(·), and functions are infinite dimensional. Secondly, the
time interval of interest, [0, ∞), is a semi-infinite interval, which poses
other numerical challenges. Finally, the cost function V (x, u(·)) is usu-
ally not a convex function of u(·), which presents significant optimiza-
tion difficulties, especially in an online setting. Even proving existence
of the optimal control in this general setting is a challenge. However,
see Pannocchia, Rawlings, Mayne, and Mancuso (2015) in which it is
shown how an infinite horizon optimal control may be solved online if
the system is linear, the cost quadratic and the control but not the state
is constrained.
Our task in this chapter may therefore be viewed as restricting the
system and control parameterization to make problem P∞ (x) more eas-
ily computable. We show how to pose various problems for which we
can establish existence of the optimal solution and asymptotic closed-
loop stability of the resulting controller. For these problems, we almost
of the system (2.1) at time k, if the initial state at time i is x and the
control sequence is u, is denoted by φ(k; (x, i), u). Because the system
is time invariant, the solution does not depend on the initial time; if
the initial state is x at time i, the solution at time j ≥ i is φ(j − i; x, u).
Thus the solution at time k if the initial event is (x, i) is identical to
the solution at time k − i if the initial event is (x, 0). For each k, the
function (x, u) ↦ φ(k; x, u) is continuous as we show next.
Proof.
Since φ(1; x, u(0)) = f(x, u(0)), the function (x, u(0)) ↦ φ(1; x, u(0)) is continuous. Suppose the function (x, uj−1) ↦ φ(j; x, uj−1) is continuous and consider the function (x, uj) ↦ φ(j + 1; x, uj). Since
φ(j + 1; x, uj) = f(φ(j; x, uj−1), u(j))
in which f(·) and φ(j; ·) are continuous and since φ(j + 1; ·) is the composition of two continuous functions f(·) and φ(j; ·), it follows that φ(j + 1; ·) is continuous. By induction φ(k; ·) is continuous for any positive integer k. ■
The system (2.1) is subject to hard constraints which may take the
form
(x(k), u(k)) ∈ Z for all k ∈ I≥0 (2.2)
x ∈ {x ∈ X | U(x) ≠ ∅}
Σ_{k=i}^{i+N−1} ℓ(x(k), u(k)) + Vf(x(i + N))
with respect to the sequences x := (x(i), x(i + 1), . . . , x(i + N)) and
u := (u(i), u(i + 1), . . . , u(i + N − 1)) subject to the constraints that x
and u satisfy the difference equation (2.1), the initial condition x(i) =
x, and the state and control constraints (2.2). We assume that ℓ(·)
is continuous and that ℓ(0, 0) = 0. The optimal control and state se-
quences, obtained by solving PN (x, i), are functions of the initial event
(x, i)
u0(x, i) = (u0(i; (x, i)), u0(i + 1; (x, i)), . . . , u0(i + N − 1; (x, i)))
x0(x, i) = (x0(i; (x, i)), x0(i + 1; (x, i)), . . . , x0(i + N; (x, i)))
with x 0 (i; (x, i)) = x. In MPC, the first control action u0 (i; (x, i)) in the
optimal control sequence u0 (x, i) is applied to the plant, i.e., u(i) =
u0 (i; (x, i)). Because the system x + = f (x, u), the stage cost ℓ(·), and
the terminal cost Vf (·) are all time invariant, however, the solution of
PN (x, i), for any time i ∈ I≥0 , is identical to the solution of PN (x, 0) so
that
u0 (x, i) = u0 (x, 0)
x0 (x, i) = x0 (x, 0)
In particular, u0 (i; (x, i)) = u0 (0; (x, 0)), i.e., the control u0 (i; (x, i))
applied to the plant is equal to u0 (0; (x, 0)), the first element in the
sequence u0 (x, 0). Hence we may as well merely consider problem
PN (x, 0) which, since the initial time is irrelevant, we call PN (x). Sim-
ilarly, for simplicity in notation, we replace u0 (x, 0) and x0 (x, 0) by,
respectively, u0 (x) and x0 (x).
The optimal control problem PN (x) may then be expressed as min-
imization of
Σ_{k=0}^{N−1} ℓ(x(k), u(k)) + Vf(x(N))
with respect to the decision variables (x, u) subject to the constraints
that the state and control sequences x and u satisfy the difference
equation (2.1), the initial condition x(0) = x, and the state, control
constraints (2.2). Here u denotes the control sequence u(0), u(1), . . . ,
u(N − 1) and x the state sequence (x(0), x(1), . . . , x(N)). Retaining
the state sequence in the set of decision variables is discussed in Chap-
ters 6 and 8. For the purpose of analysis, however, it is preferable to
constrain the state sequence x a priori to be a solution of x + = f (x, u)
enabling us to express the problem in the equivalent form of mini-
mizing, with respect to the decision variable u, a cost that is purely
a function of the initial state x and the control sequence u. This for-
mulation is possible since the state sequence x may be expressed, via
the difference equation x + = f (x, u), as a function of (x, u). The cost
becomes VN (x, u) defined by
VN(x, u) := Σ_{k=0}^{N−1} ℓ(x(k), u(k)) + Vf(x(N))        (2.3)
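Computationally this elimination is simply a rollout of the model. A minimal sketch (Python; f, ℓ, and Vf are placeholders supplied by the user):

def mpc_cost(f, stage, terminal, x, useq):
    # V_N(x, u) = sum of stage costs along the rollout plus the terminal cost.
    V = 0.0
    for u in useq:
        V += stage(x, u)
        x = f(x, u)           # x(k+1) = f(x(k), u(k))
    return V + terminal(x)

# Example: x+ = x + u with quadratic stage and terminal costs.
print(mpc_cost(lambda x, u: x + u,
               lambda x, u: 0.5 * (x**2 + u**2),
               lambda x: 0.5 * x**2,
               10.0, [-1.0, -1.0]))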
XN := {x ∈ X | UN (x) ≠ ∅} (2.8)
(b) The set UN (x) is defined by a finite set of inequalities each of which
has the form η(x, u) ≤ 0 in which η(·) is continuous. It follows that
UN (x) is closed. If U is bounded, so is UN (x), and UN (x) is therefore
compact for all x ∈ XN .
If instead U is unbounded, the set {u | VN(x, u) ≤ c} for c ∈ R>0 is closed for all c and x because VN(·) is continuous; ŪcN(x) is the intersection of this set with UN(x), just shown to be closed. So ŪcN(x) is the intersection of closed sets and is closed. To prove ŪcN(x) is
bounded for all c, suppose the contrary: there exists a c such that
ŪcN(x) is unbounded. Then there exists a sequence (ui)i∈I≥0 in ŪcN(x) such that |ui| → ∞ as i → ∞. Because VN(·) is coercive, VN(x, ui) → ∞
as i → ∞, a contradiction. Hence ŪcN (x) is closed and bounded and,
hence, compact.
(c) Since VN(x, ·) is continuous and UN(x) (ŪcN(x)) is compact, it follows from Weierstrass's theorem (Proposition A.7) that a solution to PN(x) exists for each x ∈ XN (X̄cN). ■
MPC does not require determination of the control law κN (·), a task that
is usually intractable when constraints or nonlinearities are present and
the state dimension is large; it is this fact that makes MPC so useful.
If, at a given state x, the solution of PN (x) is not unique, then
κN (·) = u0 (0; · ) is set valued and the model predictive controller se-
lects one element from the set κN (x).
x + = f (x, u) := x + u
with initial state x. The stage cost and terminal cost are
with respect to (x(0), x(1), x(2)), and (u(0), u(1)) subject to the fol-
lowing constraints
in which
H = [ 3  1 ]
    [ 1  2 ]
UN(x) = {u | |u(k)| ≤ 1, k = 0, 1}
[Figure 2.1: (a) the control law κN(x); (b) the closed-loop state x(k) and control u(k) versus time k.]
and the state and control trajectories for an initial state of x = 10 are
shown in Figure 2.1(b). It turns out that the origin is exponentially sta-
ble for this simple case; often, however, the terminal cost and terminal
constraint set have to be carefully chosen to ensure stability. □
where c(x)′ = [2 1]x and d(x) = (3/2)x 2 . The objective function may
be written in the form
Expanding the second form shows the two forms are equal if
a(x) = −H⁻¹c(x) = K1x        K1 = −(1/5)(3, 1)′
and
e(x) + (1/2)a(x)′Ha(x) = d(x)
∇u VN (x, u) = Hu + c(x)
[Figure 2.2: the set U2 (the unit square) in the (u0, u1) plane, with the locus of the unconstrained minimizer a(x) for x ≥ 0 and the values x = 0, 5/3, 2.25, 3, and 4.5 marked.]
The locus of a(x) for x ≥ 0 is shown in Figure 2.2. Clearly the uncon-
strained minimizer a(x) = K1 x is equal to the constrained minimizer
u0 (x) for all x such that a(x) ∈ U2 where U2 is the unit square illus-
trated in Figure 2.2; since a(x) = K1 x, a(x) ∈ U2 for all x ∈ X1 = [0,
xc1 ] where xc1 = 5/3. For x > xc1 , the unconstrained minimizer lies
outside U2 as shown in Figure 2.2 for x = 2.25, x = 3, and x = 4.5.
For such x, the constrained minimizer u0 (x) is a point that lies on the
intersection of a level set of the objective function (which is an ellipse)
and the boundary of U2 . For x ∈ [xc1 , xc2 ), u0 (x) lies on the left face
of the box U2 and for x ≥ xc2 = 3, u0 (x) remains at (−1, −1), the
bottom left vertex of U2 .
When u0 (x) lies on the left face of U2 , the gradient ∇u VN (x, u0 (x))
of the objective function is normal to the left face of U2 , i.e., the level
set of VN0 (·) passing through u0 (x) is tangential to the left face of U2 .
The outward normal to U2 at a point on the left face is −e1 = (−1, 0)
so that at u = u0 (x)
or
−3 + v + 2x − λ = 0
−1 + 2v + x = 0
for all x ∈ X2 = [xc1 , xc2 ] where xc2 = 3 since u0 (x) ∈ U2 for all x in
this range. For all x ∈ X3 = [xc2 , ∞), u0 (x) = (−1, −1)′ . Summarizing
in which
K1 = (−3/5, −1/5)′    K2 = (0, −1/2)′    b2 = (−1, 1/2)′    b3 = (−1, −1)′
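This piecewise affine law can be verified numerically by solving the two-variable QP directly. A minimal sketch (Python with SciPy; the data are those of the example, and a general-purpose bounded minimizer stands in for a QP solver):

import numpy as np
from scipy.optimize import minimize

H = np.array([[3.0, 1.0], [1.0, 2.0]])

def u_opt(x):
    c = np.array([2.0, 1.0]) * x           # c(x) = [2, 1]'x
    obj = lambda u: 0.5 * u @ H @ u + c @ u
    jac = lambda u: H @ u + c              # gradient Hu + c(x)
    return minimize(obj, np.zeros(2), jac=jac, method='L-BFGS-B',
                    bounds=[(-1, 1), (-1, 1)]).x

for x in [1.0, 2.25, 4.5]:
    print(x, u_opt(x))
# x = 1.0:  K1 x      = (-0.6, -0.2)   (unconstrained region)
# x = 2.25: K2 x + b2 = (-1, -0.625)   (left face of U2)
# x = 4.5:  b3        = (-1, -1)       (vertex of U2)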
x1⁺ = x1 + u
x2⁺ = x2 + u³
ℓ(x, u) := |x|² + u²
in which
Clearly a real solution exists only if b is positive, i.e., if both the numer-
ator and denominator in the expression for b have the same sign. The
optimal control problem P3 (x) is defined by
[Figure 2.3: U3(x) and the control law κ3(x) = u0(0; x) versus θ for x = (cos θ, sin θ) on the unit circle.]
and the implicit MPC control law is κ3 (·) where κ3 (x) = u0 (0; x), the
first element in the minimizing sequence u0 (x). It can be shown, using
analysis presented later in this chapter, that the origin is asymptotically
stable for the controlled system x + = f (x, κN (x)). That this control
law is necessarily discontinuous may be shown as follows. If the control
is strictly positive, any trajectory originating in the first quadrant (x1 ,
x2 > 0) moves away from the origin. If the control is strictly negative, any trajectory originating in the third quadrant (x1, x2 < 0) also moves away from the origin. But the control cannot be zero at any nonzero
point lying in the domain of attraction. If it were, this point would be a
fixed point for the controlled system, contradicting the fact that it lies
in the domain of attraction.
In fact, both the value function V30 (·) and the MPC control law κ3 (·)
are discontinuous. Figures 2.3 and 2.4 show how U3 (x), κ3 (x), and
V30 (x) vary as x = (cos(θ), sin(θ)) ranges over the unit circle. A further
conclusion that can be drawn from this example is that it is possible
[Figure 2.4: the value function V30(x) versus θ for x = (cos θ, sin θ) on the unit circle.]
for the MPC control law to be discontinuous at points where the value
function is continuous. □
with
Xj = {x ∈ X | ∃u ∈ U(x) such that f (x, u) ∈ Xj−1 } (2.11)
for j = 1, 2, . . . , N (j is time to go), with terminal conditions
V00(x) = Vf(x)  ∀x ∈ X0,    X0 = Xf
For each j, Vj0(x) is the optimal cost for problem Pj(x) if the current
state is x, current time is N − j, and the terminal time is N; Xj is the
domain of Vj0 (x) and is also the set of states in X that can be steered
to the terminal set Xf in j steps by an admissible control sequence,
i.e., a control sequence that satisfies the control, state, and terminal
constraints. Hence, for each j
Xj = {x ∈ X | Uj (x) ≠ ∅}
DP yields much more than an optimal control sequence for a given
initial state; it yields an optimal feedback policy µ0 or sequence of con-
trol laws where
µ0 := (µ0 (·), µ1 (·), . . . , µN−1 (·)) = (κN (·), κN−1 (·), . . . , κ1 (·))
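For systems of low state dimension, the backward recursion Vj⁰(x) = min_u {ℓ(x, u) + Vj−1⁰(f(x, u))} can be carried out numerically on a grid. A minimal sketch (Python with NumPy; the scalar system, quantized inputs, and linear interpolation for the successor state are all illustrative simplifications):

import numpy as np

f = lambda x, u: x + u
stage = lambda x, u: 0.5 * (x**2 + u**2)

xgrid = np.linspace(-2.0, 2.0, 401)
ugrid = np.linspace(-1.0, 1.0, 81)
V = 0.5 * xgrid**2                       # V_0 = Vf on the grid

for j in range(3):                       # j = time to go
    # One row of stage-plus-cost-to-go per candidate input; successor
    # states beyond the grid edge are clamped by np.interp.
    Q = np.array([stage(xgrid, u) + np.interp(f(xgrid, u), xgrid, V)
                  for u in ugrid])
    V = Q.min(axis=0)                    # V_{j+1} on the grid
    kappa = ugrid[Q.argmin(axis=0)]      # control law kappa_{j+1} on the grid

print(kappa[xgrid.searchsorted(1.0)])    # approximate kappa_3(1.0)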
At event (x, i), i.e., at state x at time i, the time to go is N − i and the
optimal control is
µi0 (x) = κN−i (x)
i.e., µi0 (·) is the optimal control law at time i. Consider an initial event
(x, 0), i.e., state x at time zero. If the terminal time (horizon) is N, the
optimal control for (x, 0) is κN (x). The successor state, at time 1, is
x + = f (x, κN (x))
and can be seen to differ in general from x0 (x) and u0 (x), which satisfy
(2.12) and (2.13).
Before leaving this section, we obtain some properties of the solu-
tion to each partial problem Pj (x). For this, we require a few definitions
(Blanchini and Miani, 2008).
(b) A set X ⊆ Rn is control invariant for x + = f (x, u), u ∈ U, if, for all
x ∈ X, there exists a u ∈ U such that f (x, u) ∈ X.
We recall from our standing assumptions 2.2 and 2.3 that f (·), ℓ(·)
and Vf (·) are continuous, that X and Xf are closed, U is compact and
that each of these sets contains the origin.
Proof.
(a) This proof is almost identical to the proof of Proposition 2.4.
2.4 Stability
2.4.1 Introduction
property does not usually hold when the horizon is finite. One of the main tasks of this chapter is to show that if the ingredients Vf(·), ℓ(·), and Xf of the finite horizon optimal control problem are chosen appropriately, then VN0(f(x, κN(x))) ≤ VN0(x) − ℓ(x, κN(x)) for all x in XN, enabling property (2.16) to be obtained. Property (2.15), an upper bound on the value function, is more difficult to establish, but we also show that appropriate ingredients that ensure satisfaction of property (2.16) also ensure satisfaction of property (2.15).
We now address a point that we have glossed over. The solution
to an optimization problem is not necessarily unique. Thus u0 (x) and
κN (x) may be set valued; any point in the set u0 (x) is a solution of
PN (x). Similarly x0 (x) is set valued. Uniqueness may be obtained by
choosing that element in the set u0 (x) that has least norm; and if the
minimum-norm solution is not unique, applying an arbitrary selection
map in the set of minimum-norm solutions. To avoid expressions such
as “let u be any element of the minimizing set u0 (x),” we shall, in
the sequel, use u0 (x) to denote any sequence in the set of minimizing
sequences and use κN (x) to denote u0 (0; x), the first element of this
sequence.
To show that the value function VN0 (·) is a valid Lyapunov function
for the closed-loop system x + = f (x, κN (x)) we have to show that
it satisfies (2.14), (2.15), and (2.16). We show below that VN0 (·) is a
valid Lyapunov function if, in addition to Assumptions 2.2 and 2.3, the
following assumption is satisfied.
Assumption 2.14 (Basic stability assumption). Vf (·), Xf and ℓ(·) have
the following properties:
(a) For all x ∈ Xf , there exists a u (such that (x, u) ∈ Z) satisfying
f (x, u) ∈ Xf
Vf (f (x, u)) − Vf (x) ≤ −ℓ(x, u)
Proposition 2.15 (The value function VN0 (·) is locally bounded). Sup-
pose Assumptions 2.2 and 2.3 (U bounded) hold. Then VN0 (·) is locally
bounded on XN .
The second proposition shows that the upper bound of VN0(·) in Xf implies the existence of a similar upper bound in the larger set XN.
Descent property for VN0 (·). Let x be any state in XN at time zero.
Then
VN0 (x) = VN (x, u0 (x))
in which
u0(x) = (u0(0; x), u0(1; x), . . . , u0(N − 1; x))
in which
u0(x+) = (u0(0; x+), u0(1; x+), . . . , u0(N − 1; x+))
VN0(x+) = VN(x+, u0(x+)) ≤ VN(x+, ũ)
But
so that
Σ_{j=1}^{N−1} ℓ(x0(j; x), u0(j; x)) = VN0(x) − ℓ(x, κN(x)) − Vf(x0(N; x))
Hence
VN0(x+) ≤ VN(x+, ũ) = VN0(x) − ℓ(x, κN(x)) − Vf(x0(N; x)) + ℓ(x0(N; x), u) + Vf(f(x0(N; x), u))
It follows that
for all x ∈ X if the function Vf (·) and the set Xf have the property
that, for all x ∈ Xf , there exists a u ∈ U such that
and
Vj0 (x) ≤ Vf (x) ∀x ∈ Xf , ∀j ∈ I≥0
so that
V10 (x) ≤ V00 (x) ∀x ∈ X0 = Xf
Next, suppose that for some j ≥ 1
Since κj (x) may not be optimal for Pj+1 (x) for all x ∈ Xj ⊆ Xj+1 , we
have
Vj+1⁰(x) − Vj⁰(x) ≤ ℓ(x, κj(x)) + Vj⁰(f(x, κj(x))) − ℓ(x, κj(x)) − Vj−1⁰(f(x, κj(x)))    ∀x ∈ Xj
Vj+1⁰(x) ≤ Vj⁰(x)    ∀x ∈ Xj
By induction
Vj+1⁰(x) ≤ Vj⁰(x)    ∀x ∈ Xj, ∀j ∈ I≥0
Since the set sequence (Xj )I≥0 has the nested property Xj ⊂ Xj+1 for
all j ∈ I≥0 , it follows that Vj0 (x) ≤ Vf (x) for all x ∈ Xf , all j ∈ I≥0 . ■
For the proof with U unbounded, note that the lower bound and descent property remain satisfied as before. For the upper bound, if Xf contains the origin in its interior, we have that, since Vf(·) is continuous, for each c > 0 there exists 0 < τ ≤ c, such that levτ Vf contains a neighborhood of the origin and is a subset of both Xf and X̄cN. One can then show that VN0(·) ≤ Vf(·) for each N ≥ 0 on this sublevel set, and therefore VN0(·) is continuous at the origin so that again Proposition B.25 applies, and Assumption 2.17 is satisfied on X̄cN for each c ∈ R>0.
As discussed above, Assumption 2.17 is immediate if the origin lies
in the interior of Xf . In other cases, e.g., when the stabilizing ingredi-
ent is the terminal equality constraint x(N) = 0 (Xf = {0}), Assump-
tion 2.17 is taken directly. See Proposition 2.38 for some additional
circumstances in which Assumption 2.17 is satisfied.
Stage cost ℓ(·) not positive definite. In the previous stability anal-
ysis we assume that the function (x, u) ↦ ℓ(x, u) is positive definite;
more precisely, we assume that there exists a K∞ function α1 (·) such
that ℓ(x, u) ≥ α1 (|x|) for all (x, u). Often we assume that ℓ(·) is
quadratic, satisfying ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru) where Q and R
are positive definite. In this section we consider the case where the
stage cost is ℓ(y, u) where y = h(x) and the function h(·) is not nec-
essarily invertible. An example is the quadratic stage cost ℓ(y, u) =
(1/2)(y ′ Qy y + u′ Ru) where Qy and R are positive definite, y = Cx,
and C is not invertible; hence the stage cost is (1/2)(x ′ Qx + u′ Ru)
where Q = C ′ Qy C is merely positive semidefinite. Since now ℓ(·) does
not satisfy ℓ(x, u) ≥ α1 (|x|) for all (x, u) ∈ Z and some K∞ function
α1 (·), we have to make an additional assumption in order to estab-
lish asymptotic stability of the origin for the closed-loop system. An
appropriate assumption is input/output-to-state-stability (IOSS), which
ensures the state goes to zero as the input and output go to zero. We
recall Definition B.51, restated here.
Theorem 2.24 (Asymptotic stability with stage cost ℓ(y, u)). Suppose Assumptions 2.2, 2.3, 2.17 and 2.23 are satisfied, and the system x⁺ = f(x, u), y = h(x) is IOSS. Then there exists a Lyapunov function in XN (X̄cN, for each c ∈ R>0) for the closed-loop system x⁺ = f(x, κN(x)), and the origin is asymptotically stable in XN (X̄cN, for each c ∈ R>0).
Proof. For the case of bounded U, Assumptions 2.2, 2.3, and 2.23(a)
guarantee the existence of the optimal solution of the MPC problem and
the positive invariance of XN for x + = f (x, κN (x)), but the nonpositive
definite stage cost gives the following modified inequalities
Most of the control problems discussed in this book are time invari-
ant. Time-varying problems do arise in practice, however, even if the
system being controlled is time invariant. One example occurs when
an observer or filter is used to estimate the state of the system being
controlled since bounds on the state estimation error are often time
varying. In the deterministic case, for example, state estimation er-
ror decays exponentially to zero. Another example occurs when the
desired equilibrium is not a state-control pair (xs , us ) but a periodic
trajectory. In this section, which may be omitted in the first reading,
we show how MPC may be employed for a class of time-varying systems.
The problem. The time-varying nonlinear system is described by
x + = f (x, u, i)
where x is the current state at time i, u the current control, and x + the
successor state at time i + 1. For each integer i, the function f (·, i) is
assumed to be continuous. The solution of this system at time k ≥ i
given that the initial state is x at time i is denoted by φ(k; x, u, i); the
solution now depends on both the time i and current time k rather than
merely on the difference k − i as in the time-invariant case. The cost
VN (x, u, i) also depends on time i and is defined by
VN(x, u, i) := Σ_{k=i}^{i+N−1} ℓ(x(k), u(k), k) + Vf(x(i + N), i + N)
x(i + N) ∈ Xf (i + N)
(b) For each x ∈ XN (i), the control constraint set UN (x, i) is compact.
f (x, u, i) ∈ Xf (i + 1)
Vf (f (x, u, i), i + 1) − Vf (x, i) ≤ −ℓ(x, u, i)
VN0 (f (x, κN (x, i), i), i + 1) ≤ VN0 (x, i) − ℓ(x, κN (x, i), i) (2.20)
Proposition 2.35 (MPC cost is less than terminal cost). Suppose As-
sumptions 2.25, 2.26, and 2.33 hold. Then
The proofs of Propositions 2.34 and 2.35 are left as Exercises 2.9
and 2.10.
We can deal with the obstacle posed by the fact that the upper bound
on VN0 (·) holds only in Xf (i) in much the same way as we did previ-
ously for the time-invariant case. In general, we invoke the following
assumption.
(b) For i ∈ I≥0, the optimal value function VN0(x, i) is uniformly continuous in x at x = 0.
(d) The functions f (·) and ℓ(·) are uniformly continuous at the origin
(x, u) = (0, 0) for all i ∈ I≥0 , and the system is stabilizable with small
inputs, i.e., there exists a K∞ function γ(·) such that for all i ∈ I≥0 and
x ∈ XN (i), there exists u ∈ UN (x, i) with |u| ≤ γ(|x|).
Proof.
(a) Similar to Proposition 2.16, one can show that the optimal cost
(b) From uniform continuity, we know that for each ε > 0, there exists
δ > 0 such that
recalling that VN0 (·) is nonnegative and zero at the origin. By Rawlings
and Risbeck (2015, Proposition 13), this is equivalent to the existence
of a K function γ(·) defined on [0, b] (with b > 0) such that
VN0 (x, i) ≤ γ(|x|) for all x ∈ X
with X := {x ∈ Rn | |x| ≤ b} a neighborhood of the origin. Thus,
condition (c) is also implied.
(d) See Exercise 2.22. Note that the uniform continuity of f (·) and ℓ(·)
implies the existence of K function upper bounds of the form
|f(x, u, i)| ≤ αfx(|x|) + αfu(|u|)
ℓ(x, u, i) ≤ αℓx(|x|) + αℓu(|u|)
for all i ∈ I≥0 . ■
Hence, if Assumptions 2.25, 2.26, 2.33, and 2.37 hold it follows from
Proposition 2.36 that, for all i ∈ I≥0 , all x ∈ XN (i)
Proof.
(a) It follows from Assumptions 2.25, 2.26, 2.33, and 2.37 and Propo-
sition 2.36 that VN0 (·) satisfies the inequalities (2.21).
(b) It follows from (a) and Definition 2.31 that VN0(·) is a time-varying
Lyapunov function. It follows from Theorem 2.32 that the origin is
asymptotically stable in XN (i) at each time i ≥ 0 for the time-varying
system x + = f (x, κN (x, i), i).
■
In the special case where the system is time varying but periodic, a
global CLF can be determined as in the LQR case. Suppose the objective
function is
1
ℓ(x, u, i) := x ′ Q(i)x + u′ R(i)u
2
with each Q(i) and R(i) positive definite. To start, choose a sequence
of linear control laws
κf (x, i) := K(i)x
and let
and let
A(T, 0)′P(T)A(T, 0) + Σ_{k=0}^{T−1} A(k, 0)′Q(k)A(k, 0) = P(0)
Vf(f(x, u, i), i + 1) + ℓ(x, u, i) = (1/2)x′AK(i)′P(i + 1)AK(i)x + (1/2)x′QK(i)x = (1/2)x′P(i)x ≤ Vf(x, i)
A′ P A + Q = P
Vf (x) = (1/2)x ′ P x
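Numerically, P is available from a standard discrete Lyapunov solver. A minimal sketch (Python with SciPy; the stable A and the weight Q are illustrative, and note the transpose required by the solver's convention):

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.9, 0.2], [0.0, 0.5]])   # stable (illustrative data)
Q = np.eye(2)

# solve_discrete_lyapunov(a, q) returns X satisfying a X a' - X + q = 0,
# so P = A'PA + Q is obtained with a = A'.
P = solve_discrete_lyapunov(A.T, Q)
print(np.allclose(A.T @ P @ A + Q, P))   # True
Vf = lambda x: 0.5 * x @ P @ x           # terminal penalty Vf(x) = (1/2)x'Px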
With f (·), ℓ(·), and Vf (·) defined thus, PN (x) is a parametric quadratic
problem if the constraint set U is polyhedral and global solutions may
be computed online. The terminal cost function Vf (·) satisfies
The controller u = Kx does not necessarily satisfy the control and state
constraints, however. The terminal constraint set Xf must be chosen
with this requirement in mind. We may choose Xf to be the maximal
invariant constraint admissible set for x + = AK x; this is the largest set
W with respect to inclusion satisfying: (a) W ⊆ {x ∈ X | Kx ∈ U}, and
(b) x ∈ W implies x(i) = AiK x ∈ W for all i ≥ 0. Thus Xf , defined this
way, is control invariant for x + = Ax + Bu, u ∈ U. If the initial state
x of the system is in Xf , the controller u = Kx maintains the state
in Xf and satisfies the state and control constraints for all future time
(x(i) = AiK x ∈ Xf ⊂ X and u(i) = Kx(i) ∈ U for all i ≥ 0). Hence,
with Vf (·), Xf , and ℓ(·) as defined previously, Assumptions 2.2, 2.3,
and 2.14 are satisfied. Summarizing, we have
PNuc(x) : VNuc(x) = min_u VN(x, u)
with Vf(·) the value function for the infinite horizon unconstrained optimal control problem, i.e., Vf(x) := V∞uc(x) = (1/2)x′Px. With these definitions, it follows that
VNuc(x) = V∞uc(x) = Vf(x) = (1/2)x′Px
κNuc(x) = Kx,    K = −(B′PB + R)⁻¹B′PA
Hence VN0(x) = VNuc(x) = Vf(x) for all x ∈ Xf. That κN(x) = Kx for all x ∈ Xf follows from the uniqueness of the solutions to the problems PN(x) and PNuc(x). Summarizing, we have
x + = f (x, u)
x∈X u∈U
x + = Ax + Bu
A′K P AK + µQK = P
for some µ > 1. The reason for the factor µ will become apparent soon.
Since QK is positive definite and AK is stable, P is positive definite. Let
the terminal cost function Vf (·) be defined by
Vf (x) := (1/2)x ′ P x
Clearly Vf (·) is a global CLF for the linear system x + = Ax+Bu. Indeed,
it follows from its definition that Vf (·) satisfies
x + = f (x, Kx)
so that
for all x in δB. From (2.25), we see that there exists an ε ∈ (0, δ] such
that (2.24), and, hence, (2.23), is satisfied for all x ∈ εB. Because of
our choice of ℓ(·), there exists a c1 > 0 such that Vf(x) ≥ ℓ(x, Kx) ≥ c1|x|² for all x ∈ Rn. It follows that x ∈ leva Vf implies |x| ≤ √(a/c1). We can choose a to satisfy √(a/c1) = ε. With this choice, x ∈ leva Vf implies |x| ≤ ε ≤ δ, which, in turn, implies (2.23) is satisfied.
We conclude that there exists an a > 0 such that Vf (·) and Xf :=
leva Vf satisfy Assumptions 2.2 and 2.3. For each x ∈ Xf there exists a
u = κf(x) := Kx such that Vf(f(x, u)) ≤ Vf(x) − ℓ(x, u), since ℓ(x, Kx) = (1/2)x′QKx. Our assumption that ℓ(x, u) = (1/2)(x′Qx + u′Ru) with Q and R positive definite, and our definition of Vf(·), ensure the existence of positive constants c1, c2, and c3 such that VN0(x) ≥ c1|x|² for all x ∈ Rn, Vf(x) ≤ c2|x|², and VN0(f(x, κf(x))) ≤ VN0(x) − c3|x|² for all x ∈ Xf, thereby satisfying Assumption 2.14. Finally, by definition, the set Xf contains the origin in its interior. Summarizing, we have
Assumptions 2.2, 2.3, and 2.14 are satisfied, and Xf con-
tains the origin in its interior. In addition α1 (·), α2 (·), and
α3 (·) satisfy the hypotheses of Theorem 2.21. Hence, by
Theorems 2.19 and 2.21, the origin is exponentially stable
for x + = f (x, κN (x)) in XN .
Asymptotic stability of the origin in XN also may be established when
Xf := {0} by assuming a K∞ bound on VN0 (·) as in Assumption 2.17.
Although Assumption 2.33 (the basic stability assumption) for the time-
varying case suffices to ensure that VN0 (·) has sufficient cost decrease,
A(i) := (∂f/∂x)(0, 0, i)        B(i) := (∂f/∂u)(0, 0, i)
Assuming the origin is in the interior of each X(i) (but not necessarily
each U(i)), we determine a subspace of unsaturated inputs ũ such that (i) u(i) = F(i)ũ(i), (ii) there exists ϵ > 0 such that F(i)ũ(i) ∈ U(i) for all |ũ| ≤ ϵ, and (iii) the reduced linear system (A(i), B(i)F(i)) is stabilizable. These conditions ensure that the reduced linear system is locally unconstrained. Taking a positive definite stage cost
ℓ(x, u, i) := (1/2)(x′Q(i)x + u′R(i)u)
we choose µ > 1 and proceed as in the linear unconstrained case (Section 2.5.2) using the reduced model (A(i), B(i)F(i)) and adjusted cost matrices µQ(i) and µR(i). We thus have the relationship
X(i) := {x ∈ X(i) | κf(x, i) ∈ U(i) and f(x, κf(x, i), i) ∈ X(i + 1)}
for all x ∈ levb(i) Vf (·, i) and i ∈ I≥0 . That is, the approximation error
of the linear system is sufficiently small. Thus, adding this inequality
to the approximate cost decrease condition, we recover
on terminal regions Xf (i) = levb(i) Vf (·, i). That these terminal regions
are positive invariant follows from the cost decrease condition. Note
also that these sets Xf (i) contain the origin in their interiors, and thus
Assumption 2.37 is satisfied. Summarizing we have
For all β ≥ 1, let PNβ(x) denote the modified optimal control problem defined by
V̂Nβ(x) = min_u {VNβ(x, u) | u ∈ ÛN(x)}
in which, for all i, x(i) = φ(i; x, u), the solution at time i of x + = f (x,
u) when the initial state is x and the control sequence is u. The control
constraint set ÛN (x) ensures satisfaction of the state and control con-
straints, but not the terminal constraint, and is defined by
where xβ(i; x) := φ(i; x, uβ(x)) for all i. The implicit MPC control law is κNβ(·) where κNβ(x) := uβ(0; x). Neither ÛN(x) nor X̂N depends on the parameter β. It can be shown (Exercise 2.11) that the pair (βVf(·), Xf) satisfies Assumptions 2.2–2.14 if β ≥ 1, since these assumptions are satisfied by the pair (Vf(·), Xf). The absence of the terminal constraint x(N) ∈ Xf in problem PNβ(x), which is otherwise the same as the normal optimal control problem PN(x) when β = 1, ensures that V̂N1(x) ≤ VN0(x) for all x ∈ XN and that XN ⊆ X̂N where VN0(·) is the value function for PN(x) and XN is the domain of VN0(·).
Problem PNβ(x) and the associated MPC control law κNβ(·) are defined below. Suppose uβ(x) is optimal for the terminally unconstrained problem PNβ(x), β ≥ 1, and that xβ(x) is the associated optimal state trajectory.
That the origin is asymptotically stable for x⁺ = f(x, κNβ(x)) and each β ≥ 1, with a region of attraction that depends on the parameter β, is established by Limon, Alamo, Salas, and Camacho (2006) via the following results.
Proof. Since, as shown in Exercise 2.11, βVf(x) ≥ βVf(f(x, κf(x))) + ℓ(x, κf(x)) and f(x, κf(x)) ∈ Xf for all x ∈ Xf, all β ≥ 1, it follows that for all x ∈ Xf and all i ∈ I0:N−1
βVf(x) ≥ Σ_{j=i}^{N−1} ℓ(xf(j; x, i), uf(j; x, i)) + βVf(xf(N; x, i)) ≥ V̂N−iβ(x)
We assume in the sequel that there exists a d > 0 such that ℓ(x, u) ≥ d for all x ∈ X \ Xf and all u ∈ U. The following result is due to Limon et al. (2006).
so that x ∉ ΓNβ. Hence x ∈ ΓNβ implies xβ(N; x) ∈ Xf. It then follows, since βVf(·) and Xf satisfy Assumptions 2.2 and 2.3, that the origin is asymptotically or exponentially stable for x⁺ = f(x, κNβ(x)) with a region of attraction ΓNβ. It also follows that x ∈ ΓNβ implies
V̂Nβ(xβ(1; x)) ≤ V̂Nβ(x) − ℓ(x, κNβ(x)) ≤ V̂Nβ(x) ≤ Nd + βa
so that xβ(1; x) = f(x, κNβ(x)) ∈ ΓNβ. Hence ΓNβ is positive invariant for x⁺ = f(x, κNβ(x)). ■
Limon et al. (2006) then proceed to show that ΓNβ increases with β or, more precisely, that β1 ≤ β2 implies ΓNβ1 ⊆ ΓNβ2. They also show that for any x steerable to the interior of Xf by a feasible control, there exists a β such that x ∈ ΓNβ. We refer to requiring the initial state x to lie in ΓNβ as an implicit terminal constraint.
If it is desired that the feasible sets for Pi(x) be nested (Xi ⊂ Xi+1, i = 1, 2, . . . , N − 1), thereby ensuring recursive feasibility, it is necessary, as shown in Mayne (2013), that PN(x) include a terminal constraint that is control invariant.
in which x(N) = φ(N; x, u) and κf (·) is a local control law with the
property that u = κf (x) satisfies Assumption 2.2 for all x ∈ Xf . The
existence of such a κf (·), which is often of the form κf (x) = Kx, is
implied by Assumption 2.2. Then, since x(N) ∈ Xf and since the stabilizing conditions 2.14 are satisfied, the control sequence ũ ∈ UN(x) satisfies
VN(x⁺, ũ) ≤ VN(x, u) − ℓ(x, u(0)) ≤ VN(x, u) − α1(|x|)        (2.27)
with x + := f (x, u(0)).
No optimization is required to get the cost reduction ℓ(x, u(0)) given by (2.27); in practice the control sequence ũ can be improved by several iterations of an optimization algorithm. Inequality (2.27) is
reminiscent of the inequality VN0 (x + ) ≤ VN0 (x) − α1 (|x|) that provides
the basis for establishing asymptotic stability of the origin for the con-
trolled systems previously analyzed. This suggests that the simple al-
gorithm described previously, which places very low demands on the
online optimization algorithm, may also ensure asymptotic stability of
the origin.
This is almost true. The obstacle to applying standard Lyapunov
theory is that there is no obvious Lyapunov function V : Rn → R≥0
because, at each state x + , there exist many control sequences u+ satis-
fying VN(x⁺, u⁺) ≤ VN(x, u) − α1(|x|). The function (x, u) ↦ VN(x, u)
is not a function of x only and may have many different values for each
x; therefore it cannot play the role of the function VN0 (x) used previ-
ously. Moreover, the controller can generate, for a given initial state,
many different trajectories, all of which have to be considered. We
address these issues next following the recent development in Allan,
Bates, Risbeck, and Rawlings (2017).
A key step is to consider suboptimal MPC as an evolution of an
extended state consisting of the state and warm-start pair. Given a
feasible warm start, optimization algorithms can produce an improved
feasible sequence or, failing even that, simply return the warm start.
The first input is injected and a new warm start can be generated from the returned control sequence and terminal control law.
Warm start. An admissible warm start ũ must steer the current state x to the terminal region subject to the input constraints, i.e., ũ ∈ UN(x). It also must satisfy VN(x, ũ) ≤ Vf(x) if x ∈ Xf, which ensures that |x| → 0 implies |ũ| → 0. These two conditions define the set of admissible warm starts
ŨN(x) := {ũ ∈ UN(x) | VN(x, ũ) ≤ Vf(x) if x ∈ Xf}        (2.28)
From its definition, the suboptimal control law is a function of both the state x and the warm start ũ ∈ ŨN(x).
To complete the algorithm we require a successor warm start for the successor state x⁺ = f(x, u(0)). First defining
ũw(x, u) := (u(1), u(2), . . . , u(N − 1), κf(φ(N; x, u)))
we choose the successor warm start ũ⁺ ∈ ŨN(x⁺) as follows
ũ⁺ := ũf(x⁺)  if x⁺ ∈ Xf and VN(x⁺, ũf(x⁺)) ≤ VN(x⁺, ũw(x, u));  otherwise ũ⁺ := ũw(x, u)        (2.29)
This mapping in (2.29) is denoted ũ⁺ = ζ(x, u), and Proposition 2.42 ensures that the warm start generated by ζ(x, u) is admissible for x⁺. We have the following algorithm for suboptimal MPC.
In Algorithm 2.43 we begin with a state and warm-start pair and proceed from this pair to the next at the start of each time step. We denote this extended state as z := (x, ũ) for x ∈ XN and ũ ∈ ŨN(x). The extended state evolves according to
z⁺ ∈ H(z) := {(x⁺, ũ⁺) | x⁺ = f(x, u(0)), ũ⁺ = ζ(x, u), u ∈ ǓN(z)}        (2.30)
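A minimal sketch of the warm-start construction (Python; f, κf, the cost VN, the terminal-set test, and uf, which plays the role of ũf(·) above, are placeholders supplied by the user):

def shift_warm_start(f, kf, x, u):
    # u_w(x, u) = (u(1), ..., u(N-1), kf(phi(N; x, u))).
    xN = x
    for uk in u:              # roll the model out to the terminal state
        xN = f(xN, uk)
    return u[1:] + [kf(xN)]

def successor_warm_start(f, kf, VN, in_Xf, uf, x, u):
    # Selection rule (2.29) for the successor state x+ = f(x, u(0)).
    xplus = f(x, u[0])
    uw = shift_warm_start(f, kf, x, u)
    if in_Xf(xplus) and VN(xplus, uf(xplus)) <= VN(xplus, uw):
        return uf(xplus)
    return uw

# Illustrative use with x+ = x + u and quadratic costs:
f = lambda x, u: x + u
kf = lambda x: -0.5 * x
def VN(x, u):
    V = 0.0
    for uk in u:
        V += 0.5 * (x**2 + uk**2)
        x = f(x, uk)
    return V + x**2
print(successor_warm_start(f, kf, VN, lambda x: abs(x) <= 1,
                           lambda x: [kf(x)] * 3, 2.0, [-1.0, -0.5, -0.25]))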
Proposition 2.44 (Linking warm start and state). There exists a function αr(·) ∈ K∞ such that |ũ| ≤ αr(|x|) for any (x, ũ) ∈ Z̃N.
From the definition of the control law and the warm start, we have that for all z ∈ Z̃N
VN(z) ≥ VN(x, u) ≥ Σ_{i=0}^{N−1} ℓ(x(i), u(i)) ≥ Σ_{i=0}^{N−1} αℓ(|(x(i), u(i))|)
Next we use (B.1) from Appendix B and the triangle inequality to obtain
Σ_{i=0}^{N−1} αℓ(|(x(i), u(i))|) ≥ αℓ( (1/N) Σ_{i=0}^{N−1} |(x(i), u(i))| ) ≥ αℓ(|(x, u)|/N)
Finally, using the ℓp-norm property that |(a, b)| ≥ |b| for all vectors a and b, noting that x(0) = x, and using Proposition 2.44 with αr′(s) := s + αr(s), we have that
αℓ(|(x, u)|/N) ≥ αℓ(αr′⁻¹(|(x, ũ)|)/N) =: α1(|z|)
|(x, ũ)| ≤ |x| + |ũ| ≤ |x| + αr(|x|) =: αr′(|x|) ≤ αr′(|(x, u(0))|)
Therefore, αℓ ∘ αr′⁻¹(|(x, ũ)|) ≤ αℓ(|(x, u(0))|). Defining α3(·) := αℓ ∘ αr′⁻¹(·) and because VN(x, u) ≤ VN(x, ũ), we have that
VN(z⁺) ≤ VN(x, ũ) − α3(|z|) = VN(z) − α3(|z|)
From this result, a bound on just x(k) rather than z(k) = (x(k), ũ(k)) can also be derived. First we have that for all k ≥ 0 and z ∈ Z̃N
β(|x| + |ũ|, k) ≤ β(|x| + αr(|x|), k) =: β̃(|x|, k)
which implies |x(k; z)| ≤ β̃(|x|, k). So we have a bound on the evolution of x(k) depending on only the x initial condition. Note that the evolution of x(k) depends on the initial condition of z = (x, ũ), so it depends on the initial warm start ũ as well as the initial x. We cannot ignore this dependence, which is why we had to analyze the extended state in the first place. For the same reason we also cannot define the invariant set in which the x(k) evolution takes place without referring to Z̃N.
x + = f (x, u)
But here the stage cost is some general function ℓ(x, u) that measures
economic performance of the process. The stage cost is not positive
definite with respect to some target equilibrium point of the model as
in a tracking problem. We set up the usual MPC objective function as a
sum of stage costs over some future prediction horizon
VN(x, u) = Σ_{k=0}^{N−1} ℓ(x(k), u(k)) + Vf(x(N))
scale than the process sample time), and then design an MPC controller
with a different, tracking stage cost to reject disturbances and track this
steady state. In this approach, a typical tracking stage cost would be the
types considered thus far, e.g., ℓt (x, u) = (1/2)(|x − xs |2Q +|u − us |2R ).
In economic MPC, we instead use the same economic stage cost di-
rectly in the dynamic MPC problem. Some relevant questions to be
addressed with this change in design philosophy are: (i) how much
economic performance improvement is possible, and (ii) how differ-
ent is the closed-loop dynamic behavior. For example, we are not even
guaranteed for a nonlinear system that operating at the steady state is
the best possible dynamic behavior of the closed-loop system.
As an introduction to the topic, we next set up the simplest version
of an economic MPC problem, in which we use a terminal constraint. In
the Notes section, we comment on what generalizations are available in
the literature. We now modify the basic assumptions given previously.
Due to Assumptions 2.49 and 2.50, Proposition 2.4 holds, and the so-
lution to the optimal control problem exists. The control law κN(·) is
therefore well defined; if it is not unique, we consider as before a fixed
selection map, and the closed-loop system is again given by
The left-hand side may not have a limit, so we take lim sup of both
sides. Note that from Assumption 2.51(b), ℓ(x, u) is lower bounded
for (x, u) ∈ Z, hence so is VN (x, u) for (x, u) ∈ Z, and VN0 (x) for
x ∈ XN . Denote this bound by M. Then limt→∞ −(1/t)VN0 (x(t)) ≤
limt→∞ −M/t = 0 and we have that
lim sup_{t→∞} (1/t) Σ_{k=0}^{t−1} ℓ(x(k), u(k)) ≤ ℓ(xs, us)    ■
This result does not imply that the economic MPC controller stabi-
lizes the steady state (xs , us ), only that the average closed-loop per-
formance is better than the best steady-state performance. There are
many examples of nonlinear systems for which the time-average of an
oscillation is better than the steady state. For such systems, we would
expect an optimizing controller to destabilize even a stable steady state
to obtain the performance improvement offered by cycling the system.
Note also that the appearance in (2.34) of the term −ℓ(x, κN (x)) +
ℓ(xs , us ), which is sign indeterminate, destroys the cost decrease prop-
erty of VN0 (·) so it no longer can serve as a Lyapunov function in a
closed-loop stability argument. We next examine the stability question.
Assumption 2.54 (Continuity at the steady state). The function VN0 (·)+
λ(·) : XN → R is continuous at xs .
Proof. We know that VN0 (·) is not a Lyapunov function for the given
stage cost ℓ(·), so our task is to construct one. We first introduce a
rotated stage cost as follows (Diehl, Amrit, and Rawlings, 2011)
Note from (2.36) and Assumption 2.55 that this stage cost then satisfies
for all (x, u) ∈ Z
ℓ̃(x, u) ≥ α(|x − xs|)        ℓ̃(xs, us) = 0        (2.37)
and we have the kind of stage cost required for a Lyapunov function.
Next define an N-stage sum of this new stage cost as ṼN(x, u) := Σ_{k=0}^{N−1} ℓ̃(x(k), u(k)) and perform the sum to obtain
ṼN(x, u) = Σ_{k=0}^{N−1} ℓ(x(k), u(k)) − Nℓ(xs, us) + λ(x) − λ(xs)
for all x ∈ XN , and we have established the required lower bound. The
cost difference can be calculated to establish the required cost decrease
ṼN0(f(x, κN(x))) ≤ ṼN0(x) − ℓ̃(x, κN(x)) ≤ ṼN0(x) − α(|x − xs|)
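Computing the rotated cost is mechanical once a storage function λ(·) is chosen. A minimal sketch (Python; the system, cost, and storage function are illustrative placeholders, and whether the result satisfies (2.37) depends on the choice of λ):

def rotated_stage_cost(stage, f, lam, xs, us):
    # l~(x, u) = l(x, u) - l(xs, us) + lam(x) - lam(f(x, u)).
    ls = stage(xs, us)
    return lambda x, u: stage(x, u) - ls + lam(x) - lam(f(x, u))

f = lambda x, u: 0.5 * x + u           # f(xs, us) = xs for (xs, us) = (1, 0.5)
stage = lambda x, u: x + u**2          # economic cost, not positive definite
lam = lambda x: 2.0 * x                # candidate storage function (illustrative)
lrot = rotated_stage_cost(stage, f, lam, 1.0, 0.5)
print(lrot(1.0, 0.5))                  # 0.0 at the steady state, as in (2.37)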
and sets X = [−10, 10]2 , U = [−1, 1]. The economically optimal steady
state is xs = (8, 4), us = 1. We compare economic MPC to tracking MPC
with
ℓtrack (x, u) = |x − xs |210I + |u − us |2I
Figure 2.5 shows a phase plot of the closed-loop evolution starting from
x = (−8, 8). Both controllers use the terminal constraint Xf = {xs }.
[Figure 2.5: phase plot of the closed-loop trajectories in the (x1, x2) plane for tracking MPC and economic MPC.]
While tracking MPC travels directly to the setpoint, economic MPC takes
a detour to achieve lower economic costs.
To prove that the economic MPC controller is stabilizing, we find a
storage function. As a candidate storage function, we take
\[
\lambda(x) = \mu'(x - x_s) + (x - x_s)'M(x - x_s)
\]
and the corresponding rotated stage cost
\[
\tilde\ell(x, u) = \ell_{\mathrm{econ}}(x, u) + \lambda(x) - \lambda(f(x, u))
\]
[Figure: closed-loop x1, x2, input u, and rotated cost Ṽ0 (log scale) versus time under economic MPC.]
for all j ≥ 0. But it is not unusual for systems with even linear dy-
namics to have disconnected admissible regions, which is not possible
for linear systems with only continuous actuators and convex U. When
tracking a constant setpoint, the design of terminal regions and penal-
ties must account for the fact that the discrete actuators usually remain
at fixed values in a small neighborhood of the steady state of interest,
and can be used only for rejecting larger disturbances and enhancing
transient performance back to the steady state. Fine control about the
steady state must be accomplished by the continuous actuators that
are unconstrained in a neighborhood of the steady state. But this is the same issue that is faced when a subset of the continuous actuators is saturated at the steady state of interest (Rao and Rawlings, 1999), which is a routine situation in process control problems. We
conclude the chapter with an example illustrating these issues.
[Figure: schematic of the tank and cooler system, with temperatures T0, T1, T2, T3 and heat duty Q̇.]
would suffice. With this terminal region, Figure 2.8 shows the feasible
sets for Q̇min = 0 and Q̇min = 9. Note that for Q̇min > 0, the projection
of U onto the total heat duty Q̇ is a disconnected set of possible heat
duties, leading to disconnected sets XN for N ≤ 5. (The sets XN for
N ≥ 6 are connected.)
To control the system, we solve the standard MPC problem with
horizon N = 8. Figure 2.9 shows a phase portrait of closed-loop evolu-
tion for various initial conditions with Q̇min = 9. Each evaluation of the
control law requires solving a mixed-integer, quadratically constrained
QP (with the quadratic constraint due to the terminal region). In gen-
eral, the controller chooses u2 = 1 near the setpoint and u2 ∈ {0, 2}
far from it, although this behavior is not global. Despite the discon-
nected nature of U, all initial conditions are driven asymptotically to
the setpoint. □
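For small problems, the structure of this computation can be illustrated by brute-force enumeration of the discrete input sequences, with a continuous optimization nested inside. The model below is a toy assumption, not the tank/cooler system, and production solvers use branch and bound rather than enumeration.

```python
# A toy sketch of MPC with one continuous and one discrete actuator:
# enumerate the discrete sequences and optimize the continuous inputs.
import itertools
import numpy as np
from scipy.optimize import minimize

A = np.array([[0.9, 0.1], [0.0, 0.8]])
Bc = np.array([0.0, 0.1])        # continuous actuator, |uc| <= 1
Bd = np.array([0.1, 0.0])        # discrete actuator, ud in {0, 1, 2}
xss = np.array([1.0, 0.0])       # steady state for (uc, ud) = (0, 1)
N = 4

def cost(x0, uc, ud):
    x, J = np.asarray(x0, float), 0.0
    for k in range(N):
        dx = x - xss
        J += dx @ dx + uc[k] ** 2 + (ud[k] - 1.0) ** 2
        x = A @ x + Bc * uc[k] + Bd * ud[k]
    dx = x - xss
    return J + 10.0 * (dx @ dx)  # terminal penalty

def mixed_integer_mpc(x0):
    best = (np.inf, None, None)
    for ud in itertools.product([0.0, 1.0, 2.0], repeat=N):
        res = minimize(lambda uc: cost(x0, uc, ud), np.zeros(N),
                       bounds=[(-1.0, 1.0)] * N)
        if res.fun < best[0]:
            best = (res.fun, res.x, ud)
    return best

J, uc, ud = mixed_integer_mpc([2.0, -1.0])
print(J, uc[0], ud[0])           # first continuous and discrete moves
```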
Figure 2.8: Feasible sets XN (Xf and X1 through X7, plotted in the (T1, T2) plane) for two values of Q̇min. Note that for Q̇min = 9 (right-hand side), XN for N ≤ 4 are disconnected sets.
control problem that defines the control is a finite horizon problem, nei-
ther stability nor optimality of the cost function is necessarily achieved
by a receding horizon or model predictive controller.
This chapter shows how stability may be achieved by adding a ter-
minal cost function and a terminal constraint to the optimal control
problem. Adding a terminal cost function adds little or no complexity
to the optimal control problem that has to be solved online, and usu-
ally improves performance. Indeed, the infinite horizon value function V∞0(·) for the constrained problem would be an ideal choice for the terminal penalty because the value function VN0(·) for the online optimal control problem would then be equal to V∞0(·), and the controller would inherit the performance advantages of the infinite horizon controller. In addition, the actual trajectories of the controlled system would be precisely equal, in the absence of uncertainty, to those predicted by the online optimizer. Of course, if we knew V∞0(·), the optimal infinite horizon controller κ∞(·) could be determined and there would be no reason to employ MPC.
The infinite horizon cost V∞0(·) is known globally only for special
[Figure 2.9: closed-loop phase portrait in the (T1, T2) plane; markers indicate the choice u2 ∈ {0, 1, 2}.]
2.11 Notes
MPC has an unusually rich history, making it impossible to summarize
here the many contributions that have been made. Here we restrict
constraint is omitted and the infinite horizon cost using zero control is
employed as the terminal cost. The resultant terminal cost is a global
CLF.
The basic principles ensuring closed-loop stability in these and many
other papers including (De Nicolao, Magni, and Scattolini, 1998), and
(Mayne, 2000) were distilled and formulated as “stability axioms” in
the review paper (Mayne et al., 2000); they appear as Assumptions 2.2,
2.3, and 2.14 in this chapter. These assumptions provide sufficient
conditions for closed-loop stability for a given horizon. There is an al-
ternative literature that shows that closed-loop stability may often be
achieved if the horizon is chosen to be sufficiently long. Contributions
in this direction include (Primbs and Nevistić, 2000), (Jadbabaie, Yu,
and Hauser, 2001), as well as (Parisini and Zoppoli, 1995; Chmielewski
and Manousiouthakis, 1996; Scokaert and Rawlings, 1998) already men-
tioned. An advantage of this approach is that it avoids addition of an
explicit terminal constraint, although this may be avoided by alterna-
tive means as shown in Section 2.6. A significant development of this
approach (Grüne and Pannek, 2017) gives a comprehensive investiga-
tion and extension of the conditions that ensure recursive feasibility
and stability of MPC that does not have a terminal constraint. On the
other hand, it has been shown (Mayne, 2013) that an explicit or implicit
terminal constraint is necessary if positive invariance and the nested
property Xj+1 ⊃ Xj , j ∈ I≥0 of the feasible sets are required; the nested
property ensures recursive feasibility.
Recently several researchers (Limon, Alvarado, Alamo, and Cama-
cho, 2008, 2010; Fagiano and Teel, 2012; Falugi and Mayne, 2013a;
Müller and Allgöwer, 2014; Mayne and Falugi, 2016) have shown how
to extend the region of attraction XN , and how to solve the related
problem of tracking a randomly varying reference—thereby alleviating
the disadvantage caused by the reduction in the region of attraction
due to the imposition of a terminal constraint. Attention has also been
given to the problem of tracking a periodic reference using model pre-
dictive control (Limon et al., 2012; Falugi and Mayne, 2013b; Rawlings
and Risbeck, 2017).
Regarding the analysis of nonpositive stage costs in Section 2.4.4,
Grimm, Messina, Tuna, and Teel (2005) use a storage function like Λ(·)
to compensate for a semidefinite stage cost. Cai and Teel (2008) give a
discrete time converse theorem for IOSS on Rn. Allan and Rawlings
(2018) give a converse theorem for IOSS on closed positive invariant
sets and provide a lemma for changing the supply rate function.
2.12 Exercises
Show that f (·) is bounded on bounded sets. Moreover, if U is bounded, show that
f −1 (·) is bounded on bounded sets.
(b) Add the output constraint y(k) ≤ 0.5. Plot the response of the constrained
regulator (both input and output). Is this regulator stabilizing? Can you modify
the tuning parameters Q, R to affect stability as in Section 1.3.4?
(c) Change the output constraint to y(k) ≤ 1 + ϵ, ϵ > 0. Plot the closed-loop re-
sponse for a variety of ϵ. Are any of these regulators destabilizing?
(d) Set the output constraint back to y(k) ≤ 0.5 and add the terminal constraint
x(N) = 0. What is the solution to the regulator problem in this case? Increase
the horizon N. Does this problem eventually go away?
[Figure: phase plane, x2 versus x1.]
(a) Implement unconstrained MPC with no terminal cost (Vf (·) = 0) for a few values
of α. Choose a value of α for which the resultant closed loop is unstable. Try
N = 3.
(b) Implement constrained MPC with no terminal cost or terminal constraint for the
value of α obtained in the previous part. Is the resultant closed loop stable or
unstable?
(c) Implement constrained MPC with terminal equality constraint x(N) = 0 for the
same value of α. Find the region of attraction for the constrained MPC controller
using the projection algorithm from Exercise 2.4. The result should resemble
Figure 2.10.
[Figure 2.10: region of attraction in the (x1, x2) plane.]
The cost is
\[
V_N(x, \mathbf u) := \sum_{i=0}^{N-1}\ell(x(i), u(i)) + V_f(x(N))
\]
in which
\[
\ell(x, u) = (1/2)\big(|x|_Q^2 + |u|^2\big) \qquad Q = \begin{bmatrix}\alpha & 0\\ 0 & \alpha\end{bmatrix}
\]
and Vf(·) is the terminal penalty on the final state and 1 ∈ R² is a vector of all ones. Use α = 10⁻⁵ and N = 3 and terminal cost Vf(x) = (1/2)x'Πx where Π is the solution to the steady-state Riccati equation.
(a) Compute the infinite horizon optimal cost and control law for the unconstrained
system.
(b) Find the region Xf , the maximal constraint admissible set using the algorithm in
Exercise 2.5 for the system x + = (A + BK)x with constraints x ∈ X and Kx ∈ U.
You should obtain the region shown in Figure 2.11.
(c) Add a terminal constraint x(N) ∈ Xf and implement constrained MPC. Find XN ,
the region of attraction for the MPC problem with Vf (·) as the terminal cost and
x(N) ∈ Xf as the terminal constraint. Contrast it with the region of attraction
for the MPC problem in Exercise 2.6 with a terminal constraint x(N) = 0.
(d) Estimate X̄N , the set of initial states for which the MPC control sequence for
horizon N is equal to the MPC control sequence for an infinite horizon.
Hint: x ∈ X̄N if x 0 (N; x) ∈ int(Xf ). Why?
(b) Remove the terminal constraint and estimate the domain of attraction X̂N (by
simulation). Compare this X̂N with XN and X̄N obtained previously.
(c) Change the terminal cost to Vf (x) = (3/2)x ′ Πx and repeat the previous part.
[Figure 2.11: the sets XN, Xf, and X̄N in the (x1, x2) plane.]
Theorem 2.59 (Lyapunov theorem for asymptotic stability). Given the dynamic system x+ = f(x), 0 = f(0), the origin is asymptotically stable if there exist K functions α, β, γ, a constant r > 0, and a Lyapunov function V satisfying, for x ∈ rB,
\[
\alpha(|x|) \le V(x) \le \beta(|x|) \qquad V(f(x)) - V(x) \le -\gamma(|x|)
\]
(b) What assumptions about the cost function ℓ(x, u) are required to strengthen the
controller so that the origin is exponentially stable for the closed-loop system?
How does the controllability assumption change for this case?
for all x and all u; here x(i) := φ(i; x, u) and y(i) := h(x(i)). Prove the result given in
Section 2.4.4 that the origin is asymptotically stable for the closed-loop system x + =
f (x, κN (x)) using the assumption that x + = f (x, u), y = h(x) is ℓ-observable rather
than IOSS. Assume that N ≥ No .
Prove Proposition 2.60. Hint: consider the solution at time k + l using the state at
time k as the initial state.
Lemma 2.61 (An equality for quadratic functions). Let X be a nonempty compact subset of Rn, and let ℓ(·) be a strictly convex quadratic function on X defined by ℓ(x) := (1/2)x'Qx + q'x + c, Q > 0. Consider a sequence (x(i))_{i∈I_{1:P}} with mean x̄P := (1/P)∑_{i=1}^{P} x(i). Then the following holds
\[
\sum_{i=1}^{P}\ell(x(i)) = (1/2)\sum_{i=1}^{P}|x(i) - \bar x_P|_Q^2 + P\,\ell(\bar x_P)
\]
It follows from this lemma that ℓ(x̄P) ≤ (1/P)∑_{i=1}^{P}ℓ(x(i)), which is Jensen's inequality for the special case of a quadratic function.
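A quick numerical check of this identity (with randomly generated data) can be reassuring:

```python
# Numerical check of the identity in Lemma 2.61 for random data.
import numpy as np

rng = np.random.default_rng(0)
n, P = 3, 7
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)            # strictly convex: Q > 0
q, c = rng.standard_normal(n), rng.standard_normal()
ell = lambda x: 0.5 * x @ Q @ x + q @ x + c

X = rng.standard_normal((P, n))        # the sequence x(1), ..., x(P)
xbar = X.mean(axis=0)
lhs = sum(ell(x) for x in X)
rhs = 0.5 * sum((x - xbar) @ Q @ (x - xbar) for x in X) + P * ell(xbar)
print(np.isclose(lhs, rhs))            # True
```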
Lemma 2.62 (Evolution in a compact set). Suppose x(0) = x lies in the set XN. Then the state trajectory (x(i)), in which x(i) = φf(i; x) for each i, of the controlled system x+ = f(x) evolves in a compact set.
(a) Consider a unit setpoint change in the first output. Choose a reasonable sample
time, ∆. Simulate the behavior of an offset-free discrete time MPC controller
with Q = I, S = I and large N.
(c) Add the constraint −0.1 ≤ ∆u/∆ ≤ 0.1 and simulate the response.
(d) Add significant noise to both output measurements (make the standard devia-
tion in each output about 0.1). Retune the MPC controller to obtain good perfor-
mance. Describe which controller parameters you changed and why.
subject to
x(1) = Ax(0) + Bu(0)
Draw a sketch of x(1) versus u(0) (recall x(0) is a known parameter) and show
the x-axis and y-axis intercepts on your plot. Now draw a sketch of V (x(0), u(0))
versus u(0) in order to see what kind of optimization problem you are solving. You
may want to plot both terms in the objective function individually and then add them
together to make your V plot. Label on your plot the places where the cost function V
7 Laplace would love us for making this choice, but Gauss would not be happy.
suffers discontinuities in slope. Where is the solution in your sketch? Does it exist for
all A, B, x(0)? Is it unique for all A, B, x(0)?
The motivation for this problem is to change the quadratic program (QP) of the
LQR to a linear program (LP) in the LAR, because the computational burden for LPs is
often smaller than QPs. The absolute value terms can be converted into linear terms
with the introduction of slack variables.
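As a concrete illustration of the slack-variable device, a generic absolute value term |z| in the objective is handled as
\[
\min_u\ |z(u)| \quad\Longleftrightarrow\quad \min_{u,\,s}\ s \quad \text{subject to} \quad -s \le z(u) \le s
\]
since at the solution s = |z(u)|; applying this to every absolute value term in the LAR objective, with one slack per term, yields an LP.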
x + = Ax + Bu u∈U
in which we assume for simplicity that Q, R > 0. For the setpoint to be unreachable in
an unconstrained problem, the setpoint must be inconsistent, i.e., not a steady state of
the system, or
xsp ≠ Axsp + Busp
Consider also using the stage cost centered at the optimal steady state (xs , us )
\[
\ell_s(x, u) = (1/2)\big(|x - x_s|_Q^2 + |u - u_s|_R^2\big)
\]
subject to
\[
\begin{bmatrix} I - A & -B \end{bmatrix}\begin{bmatrix} x \\ u \end{bmatrix} = 0 \qquad u \in U
\]
Figure 2.13 depicts an inconsistent setpoint, and the optimal steady state for uncon-
strained and constrained systems.
(a) For unconstrained systems, show that optimizing the cost function with terminal
constraint
\[
V(x, \mathbf u) := \sum_{k=0}^{N-1}\ell(x(k), u(k))
\]
subject to
\[
x^+ = Ax + Bu \qquad x(0) = x \qquad x(N) = x_s
\]
gives the same solution as optimizing the cost function
\[
V_s(x, \mathbf u) := \sum_{k=0}^{N-1}\ell_s(x(k), u(k))
\]
subject to the same model constraint, initial condition, and terminal constraint.
Therefore, there is no reason to consider the unreachable setpoint problem fur-
ther for an unconstrained linear system. Shifting the stage cost from ℓ(x, u) to
ℓs (x, u) provides identical control behavior and is simpler to analyze.
[Figure 2.13: level sets ℓ(x, u) = const and ℓs(x, u) = const in the (u, x) plane, the steady-state line, the unreachable setpoint (xsp, usp), the input constraint set U, and the unconstrained and constrained optimal steady states (xs, us).]
Hint. First define a third stage cost l(x, u) = ℓ(x, u) − λ'((I − A)x − Bu), and show that, for any λ, optimizing with l(x, u) as the stage cost is the same as optimizing with ℓ(x, u) as the stage cost. Then set λ = λs, the optimal Lagrange multiplier of the steady-state optimization problem.
(b) For constrained systems, provide a simple example that shows optimizing the
cost function V (x, u) subject to
does not give the same solution as optimizing the cost function Vs (x, u) sub-
ject to the same constraints. For constrained linear systems, these problems
are different and optimizing the unreachable stage cost provides a new design
opportunity.
[Figure 2.14: Stage cost ℓ(x, u0(x)) versus time for the case of unreachable setpoint, with ℓ(xs, us) and the areas V0(x) and V0(x+) indicated. The cost V0(x(k)) is the area under the curve to the right of time k.]
You interrupt, “Wait, these V 0 (·) costs are not bounded in this case!” Unfazed, the
student replies, “Yeah, I realize that, but this sketch is basically correct regardless.
Say we just make the horizon really long; then the costs are all finite and this equation
becomes closer and closer to being true as we make the horizon longer and longer.” You
start to feel a little queasy at this point. The student continues, “OK, so if this inequality
basically holds, V 0 (x(k)) is decreasing with k along the closed-loop trajectory, it is
bounded below for all k, it converges, and, therefore, ℓ(x(k), u0 (x(k))) goes to zero
as k goes to ∞.” You definitely don’t like where this is heading, and the student finishes
with, “But ℓ(x, u) = 0 implies x = xsp and u = usp , and the setpoint is supposed to
be unreachable. But I have proven that infinite horizon MPC can reach an unreachable
setpoint. We should patent this!”
How do you respond to this student? Here are some issues to consider.
1. Does the principle of optimality break down in the unreachable setpoint case?
2. Are the open-loop and closed-loop trajectories identical in the limit of an infinite
horizon controller with an unreachable setpoint?
3. Does inequality (2.39) hold as N → ∞? If so, how can you put it on solid footing?
If not, why not, and with what do you replace it?
4. Do you file for patent?
for all i ∈ I≥0 . Prove that if there exists a K∞ function γ(·) such that for each x ∈
XN (i), there exists u ∈ UN (x, i) with |u| ≤ γ(|x|), then there exists a K∞ function
α(·) such that
V 0 (x, i) ≤ α(|x|)
Show that the control law K := diag(K(0), . . . , K(T − 1)) and cost P := diag(P(0), . . . , P(T − 1)) satisfy the Riccati equation for the time-invariant lifted system
\[
A := \begin{bmatrix}
0 & 0 & \cdots & 0 & A(T-1)\\
A(0) & 0 & \cdots & 0 & 0\\
0 & A(1) & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & A(T-2) & 0
\end{bmatrix}
\qquad
B := \begin{bmatrix}
0 & 0 & \cdots & 0 & B(T-1)\\
B(0) & 0 & \cdots & 0 & 0\\
0 & B(1) & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & B(T-2) & 0
\end{bmatrix}
\]
\[
Q := \operatorname{diag}(Q(0), \ldots, Q(T-1)) \qquad R := \operatorname{diag}(R(0), \ldots, R(T-1))
\]
By uniqueness of solutions to the Riccati equation, this system can be used to synthe-
size control laws for periodic systems.
U = {u | −1 ≤ u ≤ 1}
We choose an unreachable setpoint that is not a steady state, and cost matrices as
follows
The optimal steady state (us , xs ) is given by the solution to the following optimization
(a) Solve this quadratic program and show that the solution is (xs , us ) = (1, 1).
What is the Lagrange multiplier for the equality constraint?
(b) Next we define the rotated cost function following Diehl et al. (2011)
\[
\tilde\ell(x, u) = \ell(x, u) - \lambda'(x - f(x, u)) - \ell(x_s, u_s)
\]
(c) Notice that the original cost function, which corresponds to λ = 0, has negative
cost values (interior of the circle) that are in the feasible region. The zero contour
for λ = −8 has become tangent to the feasible region, so the cost is nonnegative
in the feasible region. But for λ = −12, the zero contour has been over rotated
so that it again has negative values in the feasible region.
How does the value λ = −8 compare to the Lagrange multiplier of the optimal
steady-state problem?
(d) Explain why the MPC value function based on the rotated stage cost is a Lyapunov function for the closed-loop system.
[Figure: zero-level contours of the rotated stage cost for λ = 0, λ = −8, and λ = −12 in the (u, x) plane, with the setpoint (xsp, usp), the optimal steady state (xs, us), and the constraint set U.]
Form the Lagrangian and show that the solution is given also by
\[
\min_{(x,u)\in Z}\ \max_{\lambda}\ \ell(x, u) - \lambda'(x - f(x, u)) \;\ge\; \max_{\lambda}\ \min_{(x,u)\in Z}\ \ell(x, u) - \lambda'(x - f(x, u))
\]
due to weak duality. The strong duality assumption states that equality is achieved in
this inequality above, so that
Let λs denote the optimal Lagrange multiplier in this problem. (For a brief review of
these concepts, see also Exercises C.4, C.5, and C.6 in Appendix C.)
Show that the strong duality assumption implies that the system x + = f (x, u) is
dissipative with respect to the supply rate s(x, u) = ℓ(x, u) − ℓ(xs , us ).
Bibliography
P. Falugi and D. Q. Mayne. Model predictive control for tracking random refer-
ences. In Proceedings of European Control Conference (ECC), pages 518–523,
2013a.
E. G. Gilbert and K. T. Tan. Linear systems with state and control constraints:
The theory and application of maximal output admissible sets. IEEE Trans.
Auto. Cont., 36(9):1008–1020, September 1991.
L. Grüne and J. Pannek. Nonlinear Model Predictive Control: Theory and Algo-
rithms. Communications and Control Engineering. Springer-Verlag, London,
second edition, 2017.
P. Marquis and J. P. Broustail. SMOC, a bridge between state space and model
predictive controllers: Application to the automation of a hydrotreating
unit. In T. J. McAvoy, Y. Arkun, and E. Zafiriou, editors, Proceedings of the
1988 IFAC Workshop on Model Based Process Control, pages 37–43. Perga-
mon Press, Oxford, 1988.
3.1 Introduction
3.1.1 Types of Uncertainty
Robust and stochastic control concern control of systems that are un-
certain in some sense so that predicted behavior based on a nominal
model is not identical to actual behavior. Uncertainty may arise in dif-
ferent ways. The system may have an additive disturbance that is un-
known, the state of the system may not be perfectly known, or the
model of the system that is used to determine control may be inaccu-
rate.
A system with additive disturbance satisfies the following difference
equation
x + = f (x, u, w)
x̂ + = g(x̂, u) + ξ
x + = f (x, u, θ)
x+ = x + u
\[
V_3(x, \mathbf u) := (1/2)\sum_{i=0}^{2}\big(x(i)^2 + u(i)^2\big) + (1/2)x(3)^2
\]
in which
\[
P_3 = \begin{bmatrix} 4 & 2 & 1\\ 2 & 3 & 1\\ 1 & 1 & 2 \end{bmatrix}
\]
Therefore, the vector form of the optimal open-loop control sequence for an initial state of x is
\[
\mathbf u^0(x) = -P_3^{-1}\begin{bmatrix}3\\2\\1\end{bmatrix}x = -\begin{bmatrix}0.615\\0.231\\0.077\end{bmatrix}x
\]
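This batch solution is easy to verify numerically; the short check below rebuilds P3 from the stacked dynamics x(i) = x + ∑_{j<i} u(j):

```python
# Verify the batch solution: stack x(0), ..., x(3) = 1*x + L u with
# L[i, j] = 1 for j < i; then the Hessian of V3 in u is L'L + I and the
# linear term in x is L'1.
import numpy as np

L = np.tril(np.ones((4, 3)), k=-1)
P3 = L.T @ L + np.eye(3)                       # recovers P3 above
print(np.linalg.solve(P3, L.T @ np.ones(4)))   # [0.6154, 0.2308, 0.0769]
```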
This procedure gives the value function Vi0(·) and the optimal control law κi0(·) at each i, where the subscript i denotes time to go. Solving the DP recursion, for all x ∈ R, all i ∈ {1, 2, 3}, yields
Starting at state x at time zero, and applying the optimal control laws iteratively to the deterministic system x+ = x + u (recalling that at time i the optimal control law is κ0_{3−i}(·) since, at time i, 3 − i is the time to go) yields
which are identical with the optimal open-loop values computed above.
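The equivalence can be checked with a few lines implementing the scalar Riccati recursion behind this DP solution (for this problem the value functions are Vi0(x) = Pi x²/2):

```python
# Backward DP for x+ = x + u, stage cost (x^2 + u^2)/2, terminal cost x^2/2.
# With i stages to go, u = -(P/(1+P)) x where P is the (i-1)-stage cost-to-go.
P, gain = 1.0, []
for _ in range(3):
    gain.append(P / (1.0 + P))
    P = 1.0 + P / (1.0 + P)          # Riccati recursion for the next Pi

x, u_cl = 1.0, []
for k in range(3):                   # at time k, apply the (3-k)-stage law
    u = -gain[2 - k] * x
    u_cl.append(u)
    x = x + u
print(u_cl)                          # [-0.615, -0.231, -0.077], as above
```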
Consider next an uncertain version of the dynamic system in which
uncertainty takes the simple form of an additive disturbance w; the
system is defined by
x+ = x + u + w
where, now, x(i) = φ(i; x, µ, w) and u(i) = µi (x(i)). Since the distur-
bance is unpredictable, the value of w is not known at time zero, so the
optimal control problem must “eliminate” it in some meaningful way
so that the solution µ0 (x) does not depend on w. To eliminate w, the
optimal control problem P3∗(x) is defined by
\[
\mathbb P_3^*(x):\quad V_3^0(x) := \inf_{\mu\in\mathcal M} J_3(x, \mu)
\]
in which the cost J3 (·) is defined in such a way that it does not depend
on w; inf is used rather than min in this definition since the minimum
may not exist. The most popular choice for J3 (·) in the MPC literature
is
\[
J_3(x, \mu) := \max_{\mathbf w\in\mathcal W} V_3(x, \mu, \mathbf w)
\]
[Figure 3.1: state trajectories x0, x1, x2 and tube cross-sections X1, X2, X3 versus time k for the open-loop (left) and feedback (right) solutions.]
J3 (x, µ) := V3 (x, µ, 0)
in which x is the initial state at time zero. The solution to the second
problem is the sequence of control laws determined previously, also for
The two solutions, u0 (·) and µ0 , when applied to the uncertain system
x + = x + u + w, do not yield the same trajectories for all disturbance
sequences. This is illustrated in Figure 3.1 for the three disturbance
sequences, w0 := (0, 0, 0), w1 := (1, 1, 1), and w2 := (−1, −1, −1); and
initial state x = 1 for which the corresponding state trajectories, de-
noted x0 , x1 , and x2 , are
Open-loop solution.
Feedback solution.
Even for the short horizon of 3, the superiority of the feedback solution can be seen, although the feedback was designed for the deterministic (nominal) system and therefore did not take the disturbance into account. For the open-loop solution \(x^2(3) - x^1(3) = 6\), whereas for the feedback case \(x^2(3) - x^1(3) = 3.4\); the open-loop solution does not restrain the spread of the trajectories resulting from the disturbance w. If the horizon length is N, for the open-loop solution \(x^2(N) - x^1(N) = 2N\), whereas for the feedback case \(x^2(N) - x^1(N) \to 3.24\) as N → ∞. The obvious and well-known conclusion is that feedback control is superior to open-loop control when uncertainty is present. Feedback control requires determination of a control policy, however, which is a difficult task if nonlinearity and/or constraints are features of the optimal control problem.
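The trajectory-spread numbers quoted above are easy to reproduce:

```python
# Spread between the trajectories of x+ = x + u + w under w = +1 and w = -1,
# for the open-loop sequence and for the DP feedback laws, horizon 3.
ol_gain = [8/13, 3/13, 1/13]     # open loop: u0(x) = -(gains) x, x(0) = 1
fb_gain = [8/13, 3/5, 1/2]       # feedback laws k3, k2, k1 applied in turn

def spread(feedback):
    xp = xm = 1.0                # trajectories under w = +1 and w = -1
    for k in range(3):
        if feedback:
            up, um = -fb_gain[k] * xp, -fb_gain[k] * xm
        else:
            up = um = -ol_gain[k] * 1.0
        xp, xm = xp + up + 1.0, xm + um - 1.0
    return xp - xm

print(spread(False), spread(True))   # 6.0 and 3.4
```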
These proposals for robust MPC are all simpler to implement than the
optimal solution provided by DP.
At the current stage of research it is perhaps premature to select
a particular approach; we have, nevertheless, selected one approach,
tube-based MPC that we describe here and in Chapter 5. There is a good
reason for our choice. It is well known that standard mathematical op-
timization algorithms may be used to obtain an optimal open-loop con-
trol sequence for an optimal control problem. What is perhaps less well
known is that there exist algorithms, the second variation algorithms,
that provide not only an optimal control sequence but also a local time-
varying feedback law of the form u(k) = ū(k) + K(k)(x(k) − x̄(k)) in
which (ū(k)) is the optimal open-loop control sequence and (x̄(k)) the
corresponding optimal open-loop state sequence. This policy provides
feedback control for states x(k) close to the nominal states x̄(k).
The second variation algorithms are perhaps too complex for rou-
tine use in MPC because they require computation of the second deriva-
tives with respect to (x, u) of f (·) and ℓ(·). When the system is linear,
the cost quadratic, and the disturbance additive, however, the opti-
mal control law for the unconstrained infinite horizon case is u = Kx.
This result may be expressed as a time-varying control law u(k) =
ū(k) + K(x(k) − x̄(k)) in which the state and control sequences (x̄(k))
and (ū(k)) satisfy the nominal difference equations x̄ + = Ax̄ + B ū,
ū = Kx̄, i.e., the sequences (x̄(k)) and (ū(k)) are optimal open-loop
solutions for zero disturbance and some initial state. The time-varying
control law u(k) = ū(k) + K(x(k) − x̄(k)) is clearly optimal in the
unconstrained case; it remains optimal for the constrained case in the
neighborhood of the nominal trajectory (x̄(k)) if (x̄(k)) and (ū(k)) lie
in the interior of their respective constraint sets.
These comments suggest that a time-varying policy of the form u(x,
k) = ū(k) + K(x − x̄(k)) might be adequate, at least when f (·) is
linear. The nominal control and state sequences, (ū(k)) and (x̄(k)),
respectively, can be determined by solving a standard open-loop op-
timal control problem of the form usually employed in MPC, and the
feedback matrix K can be determined offline. We show that this form
of robust MPC has the same order of online complexity as that conven-
tionally used for deterministic systems. It requires a modified form of
the online optimal control problem in which the constraints are simply
tightened to allow for disturbances, thereby constraining the trajecto-
ries of the uncertain system to lie in a tube centered on the nominal
trajectories. Offline computations are required to determine the modified constraint sets.
3.1.4 Tubes
x + = f (x, u, w) (3.2)
x + ∈ F (x, u)
If x is the current state, and u the current control, the successor state
x + lies anywhere in the set F (x, u). When the control policy µ :=
(µ0(·), µ1(·), . . .) is employed, the state evolves according to
\[
x^+ = f(x, \mu_k(x), w) \tag{3.3}
\]
in which x is the current state, k the current time, and x + the successor
state at time k+ = k + 1. The system described by (3.3) does not have a
single solution for a given initial state; it has a solution for each possible
realization w of the disturbance sequence. We use S(x, i) to denote the
set of solutions of (3.3) if the initial state is x at time i. If φ∗ (·) ∈ S(x,
i) then
φ∗ (t) = φ(t; x, i, µ, w)
If the initial state is x = (1, 1), then φ(1; x) = (0, √2) and φ(2; x) = (0, 0), with similar behavior for other initial states. In fact, all solutions satisfy
\[
|\phi(k; x)| \le \beta(|x|, k)
\]
in which β(·), defined by
in which the control u is constrained to lie in the set U = [−1, 1]. Sup-
pose we choose a horizon length N = 2 and choose Xf to be the origin.
If x1 ≠ 0, the only feasible control sequence steering x to 0 in two
steps is u = (1, 0); the resulting state sequence is (x, (0, |x|), (0, 0)).
Since there is only one feasible control sequence, it is also optimal, and
κ2 (x) = 1 for all x such that x1 ≠ 0. If x1 = 0, then the only optimal
control sequence is u = (0, 0) and κ2 (x) = 0. The resultant closed-loop
system satisfies
" #
x1 (1 − κ2 (x))
x + = f (x) :=
|x| κ2 (x)
Consider a system
x + = f (x)
If the initial state is x = (1, 1), as before, then the difference inclusion
generates the following tube
(" #) (" #) (" # " #)
1 √ 0 √0 , 0
X0 = , X1 = , X2 = , ...
1 2 2 0
The discussion in Section 2.4.1 shows that nominal MPC is not nec-
essarily robust. It is therefore natural to ask under what conditions
nominal MPC is robust. To answer this, we have to define robustness
precisely. In Appendix B, we define robust stability, and robust asymp-
totic stability, of a set. We employ this concept later in this chapter in
the design of robust model predictive controllers that for a given ini-
tial state in the region of attraction, steer every realization of the state
trajectory to this set. Here, however, we address a slightly different
question: when is nominal MPC that steers every trajectory in the re-
gion of attraction to the origin robust? Obviously, the disturbance will
preclude the controller from steering the state of the perturbed system
to the origin; the best that can be hoped for is that the controller will
steer the state to some small neighborhood of the origin. Let the nom-
inal (controlled) system be described by x + = f (x) in which f (·) is
not necessarily continuous, and let the perturbed system be described
by x + = f (x + e) + w. Also let Sδ (x) denote the set of solutions for
the perturbed system with initial state x and perturbation sequences
e := (e(0), e(1), e(2), . . .) and w := (w(0), w(1), w(2), . . .) satisfying
max{∥e∥ , ∥w∥} ≤ δ where, for any sequence ν, ∥ν∥ denotes the sup
norm, supk≥0 |ν(k)|. The definition of robustness that we employ is
(Teel, 2004)
Definition 3.1 (Robust global asymptotic stability). Let A be compact,
and let d(x, A) := mina {|a − x| | a ∈ A}, and |x|A := d(x, A). The
set A is robustly globally asymptotically stable (RGAS) for x + = f (x) if
there exists a class KL function β(·) such that for each ε > 0 and each
compact set C, there exists a δ > 0 such that for each x ∈ C and each
φ ∈ Sδ(x), there holds |φ(k; x)|A ≤ β(|x|A, k) + ε for all k ∈ I≥0.
Taking the set A to be the origin (A = {0}) so that |x|A = |x|, we
see that if the origin is robustly asymptotically stable for x + = f (x),
then, for each ε > 0, there exists a δ > 0 such that every trajectory of the
perturbed system x + = f (x+e)+w with max{∥e∥ , ∥w∥} ≤ δ converges
to εB (B is the closed unit ball); this is the attractivity property. Also,
if the initial state x satisfies |x| ≤ β−1(ε, 0), then |φ(k; x)| ≤ β(β−1(ε, 0), 0) + ε = 2ε for all k ∈ I≥0 and for all φ ∈ Sδ, which is the Lyapunov stability property. Here the function β−1(·, 0) is the inverse of the function α ↦ β(α, 0).
We return to the question: under what conditions is asymptotic
stability robust? We first define a slight extension to the definition of
a Lyapunov function given in Chapter 2: A function V : Rn → R≥0 is defined to be a Lyapunov function for x+ = f(x) in X and set A if there exist functions αi ∈ K∞, i = 1, 2, and a continuous, positive definite function α3(·) such that, for any x ∈ X,
\[
\alpha_1(|x|_A) \le V(x) \le \alpha_2(|x|_A) \qquad V(f(x)) - V(x) \le -\alpha_3(|x|_A)
\]
x + = f (x, u, w) (3.7)
u(i) ∈ U ∀i ∈ I≥0
The set U is compact and contains the origin in its interior. The dis-
turbance w may take any value in the set W. As before, u denotes
the control sequence (u(0), u(1), . . .) and w the disturbance sequence
(w(0), w(1), . . .); φ(i; x, u, w) denotes the solution of (3.7) at time i if
the initial state is x, and the control and disturbance sequences are,
respectively, u and w. The nominal system is described by
\[
\bar x^+ = f(\bar x, u, 0) \tag{3.8}
\]
and φ̄(i; x, u) denotes the solution of the nominal system (3.8) at time
i if the initial state is x and the control sequence is u. The nominal
control problem, defined subsequently, includes, for reasons discussed
in Chapter 2, a terminal constraint
x(N) ∈ Xf
\[
V_N(x, \mathbf u) := \sum_{i=0}^{N-1}\ell(x(i), u(i)) + V_f(x(N)) \tag{3.9}
\]
In (3.9) and (3.10), x(i) := φ̄(i; x, u), the state of the nominal system
at time i, for all i ∈ I0:N−1 = {0, 1, 2, . . . , N − 1}. The set of admissible
control sequences UN (x) is defined by
which is the set of control sequences such that the nominal system
satisfies the nominal control and terminal constraints when the initial
state at time zero is x. Thus, UN (x) is the set of feasible controls for
the nominal optimal control problem PN (x). The set XN ⊂ Rn , defined
by
XN := {x ∈ Rn | UN (x) ≠ ∅}
is the domain of the value function VN0 (·), i.e., the set of x for which
PN (x) has a solution; XN is also the domain of the minimizer u0 (x).
The value of the nominal control at state x is u0 (0; x), the first control
in the sequence u0 (x). Hence the implicit nominal MPC control law is
κN : XN → U defined by
κN (x) = u0 (0; x)
in which w can take any value in W. It is obvious that the state x(i)
of the controlled system (3.13) cannot tend to the origin as i → ∞;
the best that can be hoped for is that x(i) tends to and remains in
some neighborhood Rb of the origin. We shall establish this, if the
disturbance w is sufficiently small, using the value function VN0 (·) of
the nominal optimal control problem as a Lyapunov function for the
controlled uncertain system (3.13).
To analyze the effect of the disturbance w we employ the follow-
ing useful technical result (Allan, Bates, Risbeck, and Rawlings, 2017,
Proposition 20).
\[
V_N(x^+, \mathbf u) - V_N(\bar x^+, \mathbf u) \le \alpha_b(|x^+ - \bar x^+|)
\]
\[
V_N(x^+, \tilde{\mathbf u}(x)) \le V_N(\bar x^+, \tilde{\mathbf u}(x)) + \alpha_b(|x^+ - \bar x^+|)
\]
so that
\[
V_N^0(x^+) \le \gamma V_N^0(x) + \alpha_b(|x^+ - \bar x^+|)
\]
1. αa(|Lw|) ≤ (1 − γf)cf
2. αa(|Lw|) ≤ (1 − γ)c
3. αb(|Lw|) ≤ (1 − γ)c
4. αb(|Lw|) ≤ (γ∗ − γ)b
[Figure: the sets Rb ⊂ Rc ⊂ XN.]
Let δ∗ > 0 denote the largest δ such that all four conditions are satisfied if w ∈ W with |W| ≤ δ. Condition 3 can be satisfied if b ≥ δ∗/(1 − γ).
x + = f (x, u, w) (3.14)
in which µ = (µ0 (·), µ1 (·), . . . , µN−1 (·)), x(i) = φ(i; x, µ, w), and u(i) =
µi (x(i)). Let M(x) denote the set of feedback policies µ that for a
given initial state x satisfy: the state and control constraints, and the
terminal constraint for every admissible disturbance sequence w ∈ W .
The first control law µ0 (·) in µ may be replaced by a control action
u0 = µ0 (x) to simplify optimization, since the initial state x is known
whereas future states are uncertain. The set of admissible control poli-
cies M(x) is defined by
\[
\mathcal M(x) := \big\{\mu \mid \mu_0(x) \in U(x),\ \phi(i; x, \mu, \mathbf w) \in X,\ \mu_i(\phi(i; x, \mu, \mathbf w)) \in U(x)\ \forall i \in I_{0:N-1},\ \phi(N; x, \mu, \mathbf w) \in X_f\ \forall \mathbf w \in \mathcal W\big\}
\]
(b) Xf ⊆ X
Proof.
(a)–(c) Suppose, for some i, Xi is robust control invariant so that any
point x ∈ Xi can be robustly steered into Xi . By construction, Xi+1
is the set of all points x that can be robustly steered into Xi . Also
Xi+1 ⊇ Xi so that Xi+1 is robust control invariant. But X0 = Xf is
robust control invariant. Both (a) and (b) follow by induction. Part (c)
follows from (b).
\[
[V_{i+1}^0 - V_i^0](x) = \max_{w\in W}\{\ell(x, \kappa_{i+1}(x), w) + V_i^0(f(x, \kappa_{i+1}(x), w))\} - \max_{w\in W}\{\ell(x, \kappa_i(x), w) + V_{i-1}^0(f(x, \kappa_i(x), w))\}
\]
\[
\le \max_{w\in W}\{\ell(x, \kappa_i(x), w) + V_i^0(f(x, \kappa_i(x), w))\} - \max_{w\in W}\{\ell(x, \kappa_i(x), w) + V_{i-1}^0(f(x, \kappa_i(x), w))\}
\]
for all x ∈ Xi since κi(·) may not be optimal for problem Pi+1(x). We now use the fact that max_w{a(w)} − max_w{b(w)} ≤ max_w{a(w) − b(w)}, which is discussed in Exercise 3.2, to obtain
\[
[V_{i+1}^0 - V_i^0](x) \le \max_{w\in W}\big\{[V_i^0 - V_{i-1}^0](f(x, \kappa_i(x), w))\big\}
\]
\[
[V_1^0 - V_0^0](x) = \max_{w\in W}\{\ell(x, \kappa_1(x), w) + V_f(f(x, \kappa_1(x), w)) - V_f(x)\} \le \delta
\]
in which
\[
J_N(x, \mu(v), \mathbf w) := \sum_{i=0}^{N-1}\ell(x(i), u(i), w(i)) + V_f(x(N))
\]
denote µ0(x) with its first element µ(·, v00(x)) removed; µ∗(x) is a sequence of N − 1 control laws. In addition let ũ(x) be defined by
\[
\tilde{\mathbf u}(x) := (\mu^*(x), \kappa_f(\cdot))
\]
so that ũ(x) is a sequence of N control laws.
For any sequence z let z_{a:b} denote the subsequence (z(a), z(a + 1), . . . , z(b)); as above, z := z_{0:N−1}. Because x ∈ XN is feasible for the opti-
mal control problem PN (x), every random trajectory with disturbance
sequence w = w0:N−1 ∈ WN emanating from x ∈ XN under the control
policy µ0 (x) reaches the terminal state xN = φ(N; x, µ0 (x), w) ∈ Xf
in N steps. Since w(0) is the first element of w, w = (w(0), w1:N−1 ).
Hence the random trajectory with control sequence µ01:N−1 (x) and dis-
turbance sequence w1:N−1 emanating from x + = f (x, µ00 (x), w(0))
reaches xN ∈ Xf in N − 1 steps. Clearly
\[
J_{N-1}(x^+, \mu^0_{1:N-1}(x), \mathbf w_{1:N-1}) = J_N(x, \mu^0(x), \mathbf w) - \ell(x, \mu_0^0(x), w(0))
\]
By Assumption 3.8, ℓ(x, µ00 (x), w(0)) = ℓ(x, κN (x), w(0)) ≥ α1 (|x|)
and
The policy sequence µ̃(x), which appends κf(·) to µ0_{1:N−1}(x), steers x+ to xN in N − 1 steps and then steers xN ∈ Xf to x(N + 1) = f(xN, κf(xN), wN) that lies in the interior of Xf. Using Assumption 3.8, we obtain
Using this inequality with w_{0:N} = (w(0), w0(x+)) so that w_{1:N} = w0(x+) and w = w_{0:N−1} = (w(0), w0_{0:N−2}(x+)) yields
\[
V_N^0(x^+) = J_N(x^+, \mu^0(x^+), \mathbf w^0(x^+)) \le J_N(x^+, \tilde\mu(x), \mathbf w^0(x^+))
\]
\[
\le J_N\big(x, \mu^0(x), (w(0), \mathbf w^0_{0:N-2}(x^+))\big) - \alpha_1(|x|) + \delta \le V_N^0(x) - \alpha_1(|x|) + \delta
\]
The last inequality follows from the fact that the disturbance sequence (w(0), w0_{0:N−2}(x+)) does not necessarily maximize w ↦ JN(x, µ0(x), w).
Assume now that ℓ(·) is quadratic and positive definite so that α1(|x|) ≥ c1|x|². Assume also that VN0(x) ≤ c2|x|² so that for all x ∈ XN
\[
V_N^0(x^+) \le \gamma V_N^0(x) + \delta
\]
with γ = 1 − c1/c2 ∈ (0, 1). Let ε > 0. It follows that for all x ∈ XN such that VN0(x) ≥ c := (δ + ε)/(1 − γ), we have VN0(x+) ≤ VN0(x) − ε.
Summary. If δ < (1 − γ)c (c > δ/(1 − γ)) and levc VN0 ⊂ XN , every
initial state x ∈ XN of the closed-loop system x + = f (x, µ00 (x), w)
is steered to the sublevel set levc VN0 in finite time for all disturbance
sequences w satisfying w(i) ∈ W, all i ≥ 0, and thereafter remains
in this set; the set levc VN0 is positive invariant for x+ = f(x, µ00(x), w), w ∈ W. The policy sequence ũ(x), easily obtained from µ0(x), is feasible for PN(x+) and is a suitable warm start for computing µ0(x+).
x + = Ax + Bu + w
time zero is x, and the control and disturbance sequences are, respec-
tively, u and w.
Let the nominal system be described by
x̄ + = Ax̄ + B ū
and let φ̄(i; x̄, u) denote the solution of x̄ + = Ax̄ + B ū at time i if the
initial state at time zero is x̄. Then e := x− x̄, the deviation of the actual
state x from the nominal state x̄, satisfies the difference equation
e+ = Ae + w
so that
i−1
X
e(i) = Ai e(0) + Aj w(j)
j=0
in which e(0) = x(0) − x̄(0). If e(0) = 0, then e(i) ∈ S(i) where the set S(i) is defined by
\[
S(i) := \sum_{j=0}^{i-1} A^j W = W \oplus AW \oplus \cdots \oplus A^{i-1}W
\]
in which Σ and ⊕ denote set addition.
on W that S(i) contains the origin in its interior for all i ≥ n.
We first consider the tube X(x, u) generated by the open-loop control sequence u when x(0) = x̄(0) = x, and e(0) = 0. It is easily seen that X(x, u) = (X(0; x), X(1; x, u), . . . , X(N; x, u)) with X(i; x, u) = {x̄(i)} ⊕ S(i), and x̄(i) = φ̄(i; x, u), the state at time i of the nominal system, is
the center of the tube. So it is relatively easy to obtain the exact tube
generated by an open-loop control if the system is linear and has a
bounded additive disturbance, provided that one can compute the sets
S(i).
If A is stable then, as shown in Kolmanovsky and Gilbert (1998), S(∞) := \(\sum_{j=0}^{\infty} A^j W\) exists and is positive invariant for x+ = Ax + w, i.e., x ∈ S(∞) implies that Ax + w ∈ S(∞) for all w ∈ W; also S(i) → S(∞) in the Hausdorff metric as i → ∞. The set S(∞) is known to be the minimal robust positive invariant set⁶ for x+ = Ax + w, w ∈ W. Also
⁶Every other robust positive invariant set X satisfies X ⊇ S(∞).
S(i) ⊆ S(i + 1) ⊆ S(∞) for all i ∈ I≥0 so that the tube X̂(x, u) defined by
\[
\hat X(x, \mathbf u) := (\hat X(0; x), \hat X(1; x, \mathbf u), \ldots, \hat X(N; x, \mathbf u))
\]
in which X̂(i; x, u) := {x̄(i)} ⊕ S(∞).
\[
x^+ = Ax + B\bar u + BKe + w \qquad e^+ = A_K e + w \qquad A_K := A + BK
\]
[Figure: tube cross-sections X0, X1, X2 about the nominal trajectory x̄; the actual trajectory x remains inside the tube.]
in which x̄(i) = φ̄(i; x, ū). Under usual conditions, the origin is asymp-
totically stable for the controlled nominal system described by x̄+ = Ax̄ + Bκ̄N(x̄),
and the controlled system satisfies the constraint (x̄(i), ū(i)) ∈ Z̄ for
all i ∈ I≥0 . Let X̄N denote the set {x̄ | ŪN (x̄) ≠ ∅}. Of course, deter-
mination of the control κ̄N (x̄) requires solving online the constrained
optimal control problem PN (x̄).
The feedback controller, given the state x of the system being con-
trolled, and the state x̄ of the nominal system, generates the control
u = κ̄N (x̄) + K(x − x̄). The composite system with state (x, x̄) satisfies
The system with state (e, x̄), e := x − x̄, satisfies the simpler difference equations
\[
e^+ = A_K e + w \qquad \bar x^+ = A\bar x + B\bar\kappa_N(\bar x)
\]
The two states (x, x̄) and (e, x̄) are related by
\[
\begin{bmatrix} e \\ \bar x \end{bmatrix} = T\begin{bmatrix} x \\ \bar x \end{bmatrix} \qquad T := \begin{bmatrix} I & -I \\ 0 & I \end{bmatrix}
\]
Since T is invertible, the two systems with states (x, x̄) and (e, x̄) are
equivalent. Hence, to establish robust stability it suffices to consider
the simpler system with state (e, x̄). First, we define robustly asymp-
totically stable (RAS).
Proof. Because the origin is asymptotically stable for x̄+ = Ax̄ + Bκ̄N(x̄), there exists a KL function β(·) such that every solution φ̄(·; x̄) of the controlled nominal system with initial state x̄ ∈ X̄N satisfies |φ̄(i; x̄)| ≤ β(|x̄|, i) for all i ∈ I≥0.
Since e(0) ∈ SK (∞) implies e(i) ∈ SK (∞) for all i ∈ I≥0 , it follows that
Hence the set SK (∞) × {0} is RAS in SK (∞) × X̄N for the composite
system (e+ = AK e + w, x̄ + = Ax̄ + B κ̄N (x̄)). ■
for every solution φ̄(·) of the nominal system with initial state x̄ ∈ XN .
Finally we show how suitable tightened constraints may be deter-
mined. It was shown above that the nominal system should satisfy
the tightened constraint (x̄, ū) ∈ Z̄ = Z ⊖ (SK (∞), KSK (∞)). Since
SK (∞) is difficult to compute and use, impossible for many process
control applications, we present an alternative. Suppose Z is polytopic
and is described by a set of scalar inequalities of the form c ′ z ≤ d
(cx'x + cu'u ≤ d). We show next how each constraint of this form may be tightened so that satisfaction of the tightened constraint by the nominal system ensures satisfaction of the original constraint by the uncertain
system. For all j ∈ I≥0, let
\[
\theta_j := \max_{e}\,\{c'(e, Ke) \mid e \in S_K(j)\} = \max_{\mathbf w}\Big\{\textstyle\sum_{i=0}^{j-1} c'(I, K)A_K^i w_i \;\Big|\; \mathbf w \in W_{0:j-1}\Big\}
\]
\[
\theta_\infty \le \theta_N + \alpha\theta_\infty \qquad\text{so that}\qquad \theta_\infty \le (1 - \alpha)^{-1}\theta_N
\]
Hence, satisfaction of the tightened constraint c ′ z̄ ≤ d − (1 − α)−1 θN
by the nominal system ensures that the uncertain system satisfies the
original constraint c ′ z ≤ d. The tightened constraint set Z̄ is defined
by these modified constraints.
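The computation is straightforward for a box-bounded disturbance. The sketch below (an assumed double-integrator example, not a system from the text) evaluates θN by summing support function values, bounds α by the induced ∞-norm of A_K^N, and tightens one constraint:

```python
# Tighten c'z <= d for the nominal system: theta_j sums support functions of
# W under A_K^i, and alpha satisfies A_K^N W inside alpha*W for a box W.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
P = solve_discrete_are(A, B, np.eye(2), np.eye(1))
K = -np.linalg.solve(np.eye(1) + B.T @ P @ B, B.T @ P @ A)
AK = A + B @ K

cx, cu, d, delta = np.array([1.0, 0.0]), np.array([0.0]), 2.0, 0.1
cbar = cx + K.T @ cu                  # c'(e, Ke) = (cx + K'cu)' e
N = 10
# support of the box W = {|w|_inf <= delta} in direction (A_K^i)' cbar
theta_N = sum(delta * np.abs(np.linalg.matrix_power(AK, i).T @ cbar).sum()
              for i in range(N))
alpha = np.abs(np.linalg.matrix_power(AK, N)).sum(axis=1).max()
assert alpha < 1
print(d - theta_N / (1.0 - alpha))    # tightened bound for the nominal system
```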
These values are shown in Figure 3.4. From here, we see that N ≥ 3 is necessary for the approximation to hold.
[Figure 3.4: α versus N (log scale).]
With these values of α, the bounds χ1, χ2, and µ are shown in Figure 3.5. Note that while N = 3 gives a feasible value of α, we require at least N = 4 for Z̄ to be nonempty. □
[Figure 3.5: component bounds χ1, χ2, and µ versus N.]
with initial state x̄(N) at time N. We now assume that X̄f satisfies
X̄f ⊕SK (∞) ⊂ X. Since x̄(N) ∈ X̄f it follows that x̄(i) ∈ X̄f and x(i) ∈ X
for all i ∈ I≥N . Also, for all i ∈ I0:N−1 , x̄(i) ∈ X̄(i) = X ⊖ SK (i) and
e(i) ∈ SK(i) so that x(i) = x̄(i) + e(i) ∈ X. Hence x(i) ∈ X for all
i ∈ I≥0 . Since x̄(i) → 0, the state x(i) of the uncertain system tends to
SK (∞) as i → ∞. Since Z̄(i) ⊃ Z̄, the region of attraction is larger than
that for tube-based MPC using a constant constraint set.
\[
\mathbb P_N^*(x):\quad \bar V_N^*(x) = \min_z\{\bar V_N^0(z) \mid x \in \{z\} \oplus S_K(\infty),\ z \in \bar X_N\}
\]
in which, as usual, κ̄N (x̄ ∗ (x)) is the first element in the control se-
quence ū∗ (x). It follows that
\[
\bar V_N^*(x) = \bar V_N^0(\bar x^*(x)) \le \bar V_N^0(\bar x) \qquad \bar{\mathbf u}^*(x) = \bar{\mathbf u}^0(\bar x^*(x))
\]
The last inequality follows from the fact that x̄+ = (x̄∗(x))+ = Ax̄∗(x) + Bκ̄N(x̄∗(x)) and the descent property of the solution to P̄N0(x̄∗(x)).
Proof. Suppose that (x, x̄) satisfies x ∈ {x̄}⊕SK (∞) and x̄ ∈ X̄N . From
the definition of P̄∗N , any solution satisfies the tightened constraints so
that x̄ ∗ (x) ∈ X̄N . The terminal conditions ensure, by the usual argu-
ment, that the successor state x̄∗(x)+ also lies in X̄N. The condition x ∈ {z} ⊕ SK(∞) in P∗N(x) then implies that x ∈ {x̄∗(x)} ⊕ SK(∞) so that x+ ∈ {x̄+} ⊕ SK(∞) (e+ ∈ SK(∞)). ■
Proof. It follows from the upper and lower bounds on V̄N0(x̄∗(x)), and the descent property listed above, that V̄N0(x̄∗(x(i))) decays exponentially fast to zero. It then follows from the upper bound on V̄N0(x̄∗(x)) that x̄∗(x(i)) also decays exponentially to zero. Because x(i) ∈ {x̄∗(x(i))} ⊕ SK(∞) for all i ∈ I≥0, it follows, similarly to the proof of Proposition 3.12, that the set SK(∞) is robustly exponentially stable in X̄N ⊕ SK(∞) for the system x+ = Ax + Bκ̄N(x̄∗(x)) + K(x − x̄∗(x)) + w. ■
Its solution at time i, if its initial state is x̄0 , is denoted by φ̄(i; x̄0 ,
ū), in which ū := (ū(0), ū(1), . . .) is the nominal control sequence. The
deviation between the actual and nominal state is e := x−x̄ and satisfies
\[
\bar V_N(\bar x, \bar{\mathbf u}) := \sum_{i=0}^{N-1}\ell(\bar x(i), \bar u(i)) \tag{3.21}
\]
in which x̄(i) = φ̄(i; x̄, ū) and x̄ is the initial state. The function ℓ(·) is defined by
\[
\ell(\bar x, \bar u) := (1/2)\big(|\bar x|_Q^2 + |\bar u|_R^2\big)
\]
the deviation between the state and control of the nominal system,
with initial state x and control sequence u, and the state and control
of the nominal system, with initial state x̄ 0 (t) and control sequence
ū0_t := (ū0(t), ū0(t + 1), . . . , ū0(t + N − 1)). The cost VN(x, t, u) that measures the distance between these two trajectories is defined by
\[
V_N(x, t, \mathbf u) := \sum_{i=0}^{N-1}\ell\big(x(i) - \bar x^0(t + i),\, u(i) - \bar u^0(t + i)\big) + V_f\big(x(N) - \bar x^0(t + N)\big) \tag{3.22}
\]
in which x(i) = φ̄(i; x, u). The optimal control problem solved online
is defined by
Proposition 3.16 shows that the constraint that the terminal state
lies in Xf is implicitly satisfied if β ≥ βc and the initial state lies in
Xic(x̄0) for any i ∈ I≥0. The next proposition establishes important properties of the value function VN0(·).
It should be recalled that x̄ 0 (t) = 0 and ū0 (t) = 0 for all t ≥ N; the
controller reverts to conventional MPC for t ≥ N.
Proof.
(a) This follows from the fact that VN0(x, t) ≥ ℓ(x − x̄0(t), u − ū0(t)) so that, by the assumptions on ℓ(·), VN0(x, t) ≥ c1|x − x̄0(t)|² for all (x, t) ∈ Rn × I≥0.
(b) We have that VN0(x, t) = VN(x, u0(x, t)) ≤ VN(x, ū0_t) with
\[
V_N(x, \bar{\mathbf u}^0_t) = \sum_{i=0}^{N-1}\ell\big(\bar\phi(i; x, \bar{\mathbf u}^0_t) - \bar x^0(t + i),\, 0\big) + V_f\big(\bar\phi(N; x, \bar{\mathbf u}^0_t) - \bar x^0(t + N)\big)
\]
and ū0_t := (ū0(t), ū0(t + 1), ū0(t + 2), . . .). Lipschitz continuity of f(·) in x gives |φ̄(i; x, ū0_t) − x̄0(i + t)| ≤ L^i|x − x̄0(t)|. Since ℓ(·) and Vf(·) are quadratic, it follows that VN0(x, t) ≤ c2|x − x̄0(t)|² for all (x, t) ∈ Rn × I≥0, for some c2 > 0.
(c) It follows from Proposition 3.16 that the terminal state x0(N; x, t) ∈ Xf so that the usual stabilizing condition is satisfied and
(c) VN0(f(x, κN(x, t), w), t + 1) ≤ γVN0(x, t) + c3|w| for all (x, t) ∈ (Xic(x̄0) ⊕ W) × I≥0.
Proof.
(a) This inequality follows directly from Proposition 3.17.
(c) The final inequality follows from (a), (b), and Proposition 3.17. ■
(b) Suppose ε > 0. Then VN0((x, t)+) ≤ VN0(x, t) − ε if VN0(x, t) ≥ dε := (c3|W| + ε)/(1 − γ).
Proof.
(a) It follows from Proposition 3.17 that
\[
V_N^0(f(x, \kappa_N(x, t), w), t + 1) \le [(\gamma c_3)/(1 - \gamma) + c_3]\,|W| \le [c_3/(1 - \gamma)]\,|W|
\]
(b) VN0(f(x, κN(x, t), w), t + 1) ≤ VN0(x, t) − ε if γVN0(x, t) + c3|W| ≤ VN0(x, t) − ε, i.e., if VN0(x, t) ≥ (c3|W| + ε)/(1 − γ). ■
[Two figures: closed-loop concentration, temperature x2, and coolant flowrate u versus time.]
[Figure 3.8: Concentration x1, temperature x2, and coolant flowrate u versus time for the ancillary model predictive controller with sample time ∆ = 12 (left) and ∆ = 8 (right).]
The model predictive controller with the smaller sample time is more effective in rejecting the disturbance. □
in which E|x(·) := E(· | x(0) = x), E(·) is the expectation under the probability measure of the underlying probability space, and x(i) = φ(i; x, µ, w). For simplicity, the nominal cost VN(x, µ) = E|x(JN(x, µ, 0)) is sometimes employed; here 0 is defined to be the sequence (0, 0, . . . , 0).
We consider briefly below three versions of MPC associated with
three versions of the optimal control problem PN (x) solved online. In
the first version there are no constraints, permitting the disturbance to
be unbounded. In the second version the hard constraints x ∈ X, u ∈ U
and the terminal constraint x(N) ∈ Xf are required to be satisfied.
While satisfaction of the constraint x ∈ X almost surely is desirable,
this constraint is often regarded as too conservative. The third ver-
sion, therefore, replaces the hard constraint x ∈ X by the probabilistic
(chance) constraint of the form
Pr|x (x(i) ∈ X) ≥ 1 − ε
for some suitably small ε ∈ [0, 1]. Some papers propose treating the
hard control constraint u ∈ U similarly. This approach is not appro-
priate for process control since hard actuator constraints have to be
satisfied; a valve cannot be more than fully open or less than fully
closed. In a similar vein, softening of the terminal constraint may re-
sult in instability. Hence, the constraints in the third version on the
system being controlled take the form
Pr|x (x(i) ∈ X) ≥ 1 − ε
u(i) ∈ U
for all i ∈ I0:N . Pr(·) denotes the probability measure of the underlying
probability space and Pr|x (·) the probability measure conditional on
x(0) = x. Also x(i) := φ(i; x, µ, w) and u(i) = µi (x(i)).
Let ΠN (x) denote the set of parameterized policies that satisfy the
constraints appropriate to the version being considered and the initial
state is x. The optimal control problem PN (x) that is solved online can
now be defined by
Because the optimal control problem solved online has a finite horizon, the resultant control law is not necessarily stabilizing. Stabiliz-
ing conditions involving the addition of a terminal cost and a terminal
constraint set have been developed for deterministic and robust MPC
but, as pointed out in Chatterjee and Lygeros (2015), no approaches to
stochastic MPC prior to 2015 dealt “directly with stability under reced-
ing horizon control as a standalone and fundamental problem.”
Version 1. A major contribution to stability and performance of stoch-
astic MPC in the absence of hard constraints is given in the paper by
Chatterjee and Lygeros that is the first paper proposing “standalone”
stability conditions for unconstrained stochastic MPC. The problem
considered in this paper is as stated above except that there are no
constraints (X = Xf = Rn , U = Rm ) and the random disturbance w is
merely assumed to take values in a measurable set W that is not neces-
sarily bounded. The stabilizing assumption in Chatterjee and Lygeros
(2015) is
Under the basic assumptions that (i) the cost VN (x, µ) is finite for
all x ∈ Rn and all µ ∈ ΠN (x); (ii) for all x ∈ Rn , there exists a solution
µ0 (x) that solves PN (x); and (iii) the stage cost ℓ(·) satisfies some
modest conditions, it is shown in (Chatterjee and Lygeros, 2015) that,
if Assumption 3.21 holds, then, for all x ∈ Rn
Chatterjee and Lygeros then show that VN0 (·) satisfies the geometric
drift condition E|x (VN0 (x + )) ≤ VN0 (x) − ℓ(x, κN (x)) outside of some
compact subset of Rn , and that the sequence (E|x (VN0 (x(t))))t∈I≥0 is
bounded.
Version 2. While the results in Chatterjee and Lygeros (2015) hold for
situations in which the disturbance is not restricted to lie in a compact
set, they do require the absence of hard state constraints. In addi-
tion, determination of a function satisfying Assumption 3.21 is diffi-
cult. Stabilizing conditions suitable for version 2 of stochastic MPC (all
constraints are hard and W is compact) are given in Mayne and Falugi
(2019).
(c) Xf ⊆ X, W is compact.
(d) There exist constants c2 > c1 > 0 and a > 0 such that
\[
\ell(x, u) \ge c_1|x|^a \quad \forall x \in X,\ \forall u \in U \qquad V_N^0(x) \le c_2|x|^a \quad \forall x \text{ such that } \Pi_N(x) \ne \emptyset
\]
with λ = 1 − c1 /c2 . Then E|x(0) (VN0 (x(1))) ≤ λVN0 (x(0)) + δ and, by law
of iterated expectation, E|x(0) (VN0 (x(k))) = E|x(0) (E|x(k−1) (VN0 (x(k)))).
By iterating we obtain our stability condition
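Carrying out the iteration (under the one-step bound just stated) gives the familiar geometric estimate
\[
\mathbb E_{|x(0)}\big(V_N^0(x(k))\big) \le \lambda^k V_N^0(x(0)) + (1 + \lambda + \cdots + \lambda^{k-1})\,\delta \le \lambda^k V_N^0(x(0)) + \frac{\delta}{1 - \lambda}
\]
so the expected value function converges to a neighborhood of zero whose size is proportional to δ.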
\[
x^+ = Ax + Bu + w \qquad \bar x^+ = A\bar x + B\bar u \qquad e := x - \bar x
\]
so that u(i) = µi(x(i)) = ū(i) + K(x(i) − x̄(i)). The (x, e) pair then evolve as
\[
x^+ = Ax + B\bar u + BKe + w \qquad e^+ = A_K e + w \qquad A_K := A + BK
\]
in which x(i) = φ(i; x, µ, w). The control applied to the system at time i is
\[
u(i) = \bar u(i) + K(x(i) - \bar x(i))
\]
With this control policy, e(i) := x(i) − x̄(i) is the solution at time i of the difference equation
\[
e^+ = A_K e + w \qquad e(0) = 0
\]
As shown earlier, e(i) ∈ SK(i) := \(\sum_{j=0}^{i-1} A_K^j W\) for all i. Because AK is Hurwitz, e(t) converges to a stationary process e∞ as t → ∞. To achieve
robustness of stochastic MPC we adopt a policy similar to that em-
ployed in robust MPC. For each i, the control constraints are tightened
by determining, for each i, a set Ū(i) that ensures ū + Ke(i) ∈ U for all
ū ∈ Ū(i). The state constraints are tightened by determining, for each
i, a set X̄(i) that ensures Pr(x̄ + e(i) ∈ X) ≥ 1 − ε for all x̄ ∈ X̄(i).
A model predictive controller is employed to steer the state and
control of the nominal system, subject to the tightened constraints, to
the origin. Since x(t) = x̄(t) + e(t) and u(t) = ū(t) + Ke(t) it follows
that x(t) converges to e∞ and u(t) converges to Ke∞ as t → ∞.
To implement this control it seems, at first sight, that we have to
determine the tightened constraints for all i ∈ I≥0 . We propose two
practical alternatives. The first, which is similar to that employed for
robust MPC, is determination of constant constraint sets Ū∞ and X̄∞
satisfying, respectively, Ū∞ ⊕ KSK (∞) ⊂ U and P{x̄ + e∞ ∈ X} ≥ 1 − ε
for all x̄ ∈ X̄∞ . At each time i, when the composite state is (x(i), x̄(i)),
a standard nominal optimal control problem P̄N (x̄(i)) with constraints
x̄ ∈ X̄∞ , ū ∈ Ū∞ and the usual terminal constraint is solved. If standard
stability conditions are satisfied, x̄(i) and ū(i) converge to zero while
satisfying the tightened constraints ū ∈ Ū∞ and x̄ ∈ X̄∞ as i → ∞. The
control applied to the system at time i is u(i) = ū(i) + K(x(i) − x̄(i)).
This procedure is conservative in that the constraints are tighter than
necessary.
Assumption 3.24 (Robust terminal set condition). The terminal set sat-
isfies Xf ⊕ SK (∞) ⊂ X.
Both procedures ensure that x(t) converges to the zero mean sta-
tionary process e∞ to which e(t) converges, and that u(t) converges to
Ke∞ as t → ∞.
w(2), . . . , w(i − 1)} and e(i) is replaced by e(i; wj ) to denote its depen-
dence on the random sequence wj .
It is shown in Calafiore and Campi (2006) and Tempo et al. (2013)
that given (ε, β), there exists a relatively modest number of samples M∗(ε, β) such that if M ≥ M∗, one of the following two conditions holds. For each i ∈ I0:N−1, either problem Ps is infeasible, in which case the robust control problem is infeasible; or its solution f0(i) satisfies
[Figure 3.9: required sample count M∗ versus ε (top) and percentiles (25th, 50th, 75th, 95th, 99th) of the observed violation probability εtest (bottom).]
for all i ∈ I≥0 . Note that the first control constraint does not need to be
tightened at all, Ū(0) = U, and all subsequent control constraints are
less conservative than Ū∞ = [−1/2, 1/2].
To compute the tightened sets X̄(i), we apply the sampling pro-
cedure for each i ∈ I0:N−1 . For various values of ε, we compute the
number of samples M = M ∗ (ε, β) using (3.25) with β = 0.01. Then,
we choose M samples of w and solve (3.24) for f 0 (i). To evaluate the
actual probability of constraint violation, we then test the constraint violation using Mtest = 10⁶ different samples wtest. That is, we compute the empirical frequency
\[
\varepsilon_{\mathrm{test}} := \Pr\big(c'e(i; w^j_{\mathrm{test}}) > f^0(i)\big), \quad j \in I_{1:M_{\mathrm{test}}}
\]
for each i ∈ I0:N−1 . Note that since εtest is now a random variable de-
pending on the particular M samples chosen, we repeat the process 500
times for each value of M. The distribution of εtest for i = 10 is shown
in Figure 3.9. Notice that the formula (3.25) is slightly conservative,
i.e., the observed probability εtest is half of the chosen probability ε for
99% of samples (with probability 1 − β). This gap holds throughout the
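To make the sampling procedure concrete, here is a minimal sketch for one constraint direction and one time index; the error dynamics AK and disturbance bound are assumed for illustration:

```python
# Scenario computation of f0(i) for one row c: draw M disturbance sequences,
# take the largest sampled value of c'e(i), then estimate the violation
# probability with fresh test samples.
import numpy as np

rng = np.random.default_rng(1)
AK = np.array([[0.5, 0.2], [0.0, 0.4]])   # assumed stable A + BK
c, i, delta = np.array([1.0, 0.0]), 10, 1.0

def sample_e(n):
    # e(i) = sum_{j<i} A_K^j w(j), w uniform on the box |w|_inf <= delta
    w = rng.uniform(-delta, delta, size=(n, i, 2))
    mats = [np.linalg.matrix_power(AK, j) for j in range(i)]
    return sum(w[:, j, :] @ mats[j].T for j in range(i))

M, Mtest = 1000, 10**5
f0 = (sample_e(M) @ c).max()              # scenario solution for this row
eps_test = ((sample_e(Mtest) @ c) > f0).mean()
print(f0, eps_test)                       # observed violation probability
```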
and ℓ(·) and Vf (·) are quadratic functions. If, in addition, the system
being controlled satisfies its control and probabilistic constraints if and
only if the nominal system satisfies its tightened constraints, then the
solution ū0 (x(0)) of the nominal optimal control problem P̄N (x(0))
is also the solution of the parameterized stochastic optimal control
problem PN (x(0)).
3.8 Notes
Robust MPC. There is now a considerable volume of research on ro-
bust MPC; for a review of the literature up to 2000 see Mayne, Rawlings,
Rao, and Scokaert (2000). Early literature examines robustness of nomi-
nal MPC under perturbations in Scokaert, Rawlings, and Meadows (1997);
and robustness under model uncertainty in De Nicolao, Magni, and Scat-
tolini (1996) and Magni and Sepulchre (1997). Sufficient conditions
for robust stability of nominal MPC with modeling error are provided
in Santos and Biegler (1999). Teel (2004) provides an excellent dis-
cussion of the interplay between nominal robustness and continuity
of the Lyapunov function, and also presents some illuminating exam-
ples of nonrobust MPC. Robustness of the MPC controller described in
3.9 Exercises
satisfy

V_i⁰(x) = max_{w∈W} { ℓ(x, κ_i(x), w) + V_{i−1}⁰( f(x, κ_i(x), w) ) }
[Figure 3.10: phase plot of closed-loop trajectories in the (x₁, x₂) plane, x₁ ∈ [−3, 3], x₂ ∈ [−1.5, 1.5].]
x+ = x + u + w
(c) Determine a model predictive controller for the nominal system and constraint
sets Z and V used in (b).
(d) Implement robust MPC for the uncertain system and simulate the closed-loop
system for a few initial states and a few disturbance sequences for each initial
state. The phase plot for initial states [−1, −1], [1, 1], [1, 0], and [0, 1] should
resemble Figure 3.10.
Bibliography
In Proceedings of the 44th IEEE Conference on Decision and Control and Eu-
ropean Control Conference ECC 2005, pages 2296–2301, Sevilla, Spain, De-
cember 2005.
4.1 Introduction
We now turn to the general problem of estimating the state of a noisy
dynamic system given noisy measurements. We assume that the sys-
tem generating the measurements is given by
x + = f (x, w)
y = h(x) + v (4.1)
x⁺ = f(x, w)      y = h(x) + v
χ⁺ = f(χ, ω)      y = h(χ) + ν      η = h(χ)
x̂⁺ = f(x̂, ŵ)      y = h(x̂) + v̂      ŷ = h(x̂)
V_T(χ(0), ω) = ℓ_x(χ(0) − x₀) + Σ_{i=0}^{T−1} ℓ(ω(i), ν(i))    (4.2)

subject to

χ⁺ = f(χ, ω)    y = h(χ) + ν
in which T is the current time, y(i) is the measurement at time i, and x 0
is the prior estimate of the initial state.1 Occasionally we shall consider
input disturbances to an explicitly given nominal input. If we denote
this nominal input trajectory as w, then we adjust the model constraint
to χ + = f (χ, w + ω), so that ω measures the difference from the nom-
inal model’s input. We recover the standard problem by setting w = 0.
Because ν = y − h(χ) is the error in fitting the measurement y, ℓ(ω,
ν) penalizes the model disturbance and the fitting error. These are the
two error sources we reconcile in all state estimation problems.
The full information estimator is then defined as the solution to
and we use the notation P_T(x₀, y_{0:T−1}) for the usual case when the
nominal input is w = 0. The solution to the optimization exists for all
T ∈ I≥0 because VT (·) is continuous, due to the continuity of f (·) and
h(·), and because VT (·) is an unbounded function of its arguments, as
will be clear after stage costs ℓx (·) and ℓ(·) are defined. We denote the
solution as x̂(0|T ), ŵ(i|T ), 0 ≤ i ≤ T − 1, T ≥ 1, and the optimal cost
as VT0 . We also use x̂(T ) := x̂(T |T ) to simplify the notation.
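Because the software accompanying the text is built on CasADi, it is natural to sketch how (4.2)–(4.3) become a nonlinear program. The scalar model, quadratic stage costs, and synthetic data below are assumptions for illustration, not the book's example.

import casadi as ca

f = lambda x, w: 0.95 * x + w          # x+ = f(x, w), assumed model
h = lambda x: x                        # y = h(x)
T, x0_prior = 10, 1.0
y = [0.8 * 0.95**k for k in range(T)]  # synthetic measurements

chi = ca.MX.sym('chi', T + 1)          # decision: chi(0), ..., chi(T)
om = ca.MX.sym('om', T)                # decision: omega(0), ..., omega(T-1)

J = (chi[0] - x0_prior)**2             # ell_x, an assumed quadratic choice
g = []
for i in range(T):
    nu = y[i] - h(chi[i])              # nu(i) = y(i) - h(chi(i))
    J += om[i]**2 + nu**2              # ell(omega(i), nu(i))
    g.append(chi[i + 1] - f(chi[i], om[i]))   # model constraint

solver = ca.nlpsol('fie', 'ipopt', {'x': ca.vertcat(chi, om),
                                    'f': J, 'g': ca.vertcat(*g)})
sol = solver(x0=0, lbg=0, ubg=0)
xhat_T = float(sol['x'][T])            # xhat(T) = xhat(T|T)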
We require a definition of state estimation general enough to include
this optimization approach. Attempting to express the state estimate
as a finite dimensional dynamical system, as we do with the Kalman
filter for linear systems, is not sufficient here. Instead we consider the
state estimate at any time k ∈ I≥0 to be a function of the prior x 0 ,
nominal input (if nonzero), w0:T −1 , and the measurement y0:T −1 .
considered in Chapter 1 to formulate the prediction form rather than the filtering form
of the state estimation problem. So what we denote here as x̂(T |T ) would be x̂ − (T )
in the notation of Chapter 1. This change is purely for notational convenience, and all
results developed in this chapter also can be expressed in the filtering form of MHE.
x̂(T ) = ΨT (x 0 , y0:T −1 )
In the full information estimator, the function Ψ_T(·) denotes the fi-
nal element of the state trajectory in the solution to (4.3). One impor-
tant characteristic of optimization-based estimation worth bearing in
mind as we progress is that x̂(T ) = ΨT (x 0 , y0:T −1 ) does not imply that
x̂(T + 1) = Ψ1 (x̂(T ), yT ), even though y0:T := (y0:T −1 , yT ). In (non-
linear) full information estimation, we have no convenient means to
move from x̂(T ) to x̂(T + 1), and must instead recompute the entire
optimal trajectory with ΨT +1 (x 0 , y0:T ). As we shall see subsequently,
this confers some desirable properties on the estimator, but renders
its online computation intractable since the size of the optimization
problem increases with time.
Next we require a definition of robust stability suitable for state
estimation in this general form. The standard attempt2 would be to
use the following type of bound in the definition of robust stability
for all k ∈ I≥0 with αx (·) ∈ KL and γw (·), γv (·) ∈ K. But, for the gen-
eral class of estimators under consideration here, an inequality of this
type does not ensure that the estimate error converges to zero when
the disturbances converge to zero. To ensure this desirable property
we strengthen the definition of estimator stability to the following.
Definition 4.2 (Robustly globally asymptotically stable estimation). A
state estimator (Ψk )k≥0 is robustly globally asymptotically stable (RGAS)
if there exist KL-functions αx , αw , αv such that
x + = Ax + Gw y = Cx + v
Solution
For (A, C) detectable and (A, G) stabilizable, the steady-state Kalman
predictor is nominally exponentially stable as discussed in Exercise
4.17. The steady-state estimator takes the form
(x − x̂)+ = AL (x − x̂) + Gw − Lv
x(k) − x̂(k) = A_L^k (x(0) − x₀) + Σ_{j=0}^{k−1} A_L^{k−j−1} (Gw(j) − Lv(j))
Since A_L is stable, we have the bound (Horn and Johnson, 1985, p. 299)
|A_L^i| ≤ cλ^i in which ρ(A_L) < λ < 1. Taking norms and using
Taking the largest disturbance terms outside and performing the sum
then gives
|x(k) − x̂(k)| ≤ cλ^k |x(0) − x₀| + (c/(1−λ)) ( |G| ‖w‖_{0:k−1} + |L| ‖v‖_{0:k−1} )    (4.6)
Σ_{j=0}^{k−1} z(j)λ^{k−j−1} = Σ_{j=0}^{k−1} z(j)λ^{(k−j−1)/2} λ^{(k−j−1)/2} ≤ (1/(1 − √λ)) max_{j∈I_{0:k−1}} z(j)λ^{(k−j−1)/2}
Using this result in (4.6) and letting η := √λ > λ so that 0 ≤ η < 1, we have that

|x(k) − x̂(k)| ≤ cη^k |x(0) − x₀| + (c|G|/(1−η)) max_{j∈I_{0:k−1}} |w(j)| η^{k−j−1} + (c|L|/(1−η)) max_{j∈I_{0:k−1}} |v(j)| η^{k−j−1}
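This bound is easy to confirm numerically. The sketch below computes the steady-state predictor gain from the estimation Riccati equation, picks (c, λ) by direct search, and checks the error bound along a simulated trajectory; the matrices are assumed example data.

import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[0.9, 0.2], [0.0, 0.8]]); C = np.array([[1.0, 0.0]])
G = np.eye(2); Q = 0.01 * np.eye(2); R = np.array([[0.01]])

# steady-state predictor covariance and gain (dual-form DARE)
P = solve_discrete_are(A.T, C.T, G @ Q @ G.T, R)
L = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
AL = A - L @ C

# |A_L^i| <= c lam^i with rho(A_L) < lam < 1; crude c by direct search
lam = (max(abs(np.linalg.eigvals(AL))) + 1.0) / 2.0
c = max(np.linalg.norm(np.linalg.matrix_power(AL, i), 2) / lam**i
        for i in range(200))

rng = np.random.default_rng(0)
e0 = np.array([1.0, -1.0]); e = e0
wmax = vmax = 0.0
nG, nL = np.linalg.norm(G, 2), np.linalg.norm(L, 2)
for k in range(1, 100):
    w = rng.uniform(-0.05, 0.05, 2); v = rng.uniform(-0.01, 0.01, 1)
    wmax, vmax = max(wmax, np.linalg.norm(w)), max(vmax, np.linalg.norm(v))
    e = AL @ e + G @ w - L @ v          # error recursion
    bound = (c * lam**k * np.linalg.norm(e0)
             + c / (1 - lam) * (nG * wmax + nL * vmax))
    assert np.linalg.norm(e) <= bound + 1e-9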
for all k ∈ I≥0 , all initial states x1 , x2 ∈ X, and all disturbance se-
quences w1 , w2 ∈ W∞ .
Proof. Consider two initial conditions denoted x1,0 and x2,0, and two input
sequences w1 and w2 generating from (4.1) two corresponding state
trajectories x1 and x2. Now consider input and output disturbance se-
quences w̃1(j) = w1(j) − w2(j) and v1(j) = h(x2(j)) − h(x1(j)) for
j ∈ I≥0. Let the system generating the measurements for state estima-
tion be x(k) = x(k; x1,0, w2 + w̃1), y = h(x) + v1. Note that the system
generating the measurements has initial condition x1,0 and nominal input
w = w2, but disturbed or actual input w1 since w2 + w̃1 = w1; so we have
that x(k) = x1(k) for k ∈ I≥0. The output measurements are exactly
h(x2) because of the output disturbance. The state estimator is there-
fore based on nominal input w = w2 and output measurement h(x2).
Let the state estimator then have x2,0 as its prior. The information given
to the estimator is then consistent, and it produces x̂(k) = Ψk(x2,0, w2,
h(x2)) = x2(k) for k ∈ I≥0. If the estimator is RGAS, then (4.5) gives
for this system and estimator
for all k ∈ I≥0 . Note that since x1,0 , x2,0 , w1 , w2 are arbitrary, the system
is i-IOSS. ■
With all of the basic concepts introduced, we can state our working
assumptions for the full-information state estimation problem.
Assumption 4.10 (Continuity). The functions f (·), h(·), ℓx (·), and ℓ(·)
are continuous, ℓx (0) = 0, and ℓ(0, 0) = 0. The sets X and W are closed.
Assumption 4.11 (Positive-definite stage cost). The stage cost ℓ(·) sat-
isfies
σ̲_w(|ω|) + σ̲_v(|ν|) ≤ ℓ(ω, ν) ≤ σ̄_w(|ω|) + σ̄_v(|ν|)
σ̲_x(|χ − x₀|) ≤ ℓ_x(χ − x₀) ≤ σ̄_x(|χ − x₀|)
Remark.
(a) Assumptions 4.10 and 4.11 guarantee that a solution to (4.3) exists
for all finite T ≥ 0 (Rawlings and Ji, 2012).
(c) Notice that the stage cost is chosen to be compatible with the sys-
tem’s detectability properties in Assumption 4.11.
(d) A similar case can be made in regulation that one must choose the
regulator’s stage cost to be compatible with the system’s stabilizabil-
ity properties. We did not emphasize this issue in Chapter 2, and in-
stead allowed the stage cost to affect the MPC regulator’s feasibility set
XN . The consequence of choosing the stage cost inappropriately in the
zero-state MPC regulator would therefore be a catastrophic reduction
in the size of the feasibility set, with the worst case being XN = {0}.
(f) The stage cost also is chosen to be compatible with the system’s
stabilizability properties in Assumption 4.12.
and from (Keerthi and Gilbert, 1985, Theorem 2), a solution to the in-
finite horizon problem exists. If we consider the solution of a k-stage
problem, optimality of the infinite horizon problem gives
V_∞⁰ ≤ V_k⁰ + min_{ω_{k:∞}} Σ_{i=k}^{∞} ℓ(ω(i), ν(i))    (4.12)
subject to
In previous versions of FIE analysis, we made use of the fact that the
optimal solution of the estimation problem at time k + 1 gives feasible,
but possibly suboptimal decision variables at time k. That argument
leads to the inequality
V_k⁰ ≤ V_{k+1}⁰ − ℓ(ŵ(k|k + 1), v̂(k|k + 1))    (4.13)

which shows that the sequence (V_k⁰)_{k≥0} is nondecreasing. Since it is
bounded above by ℓ_x(x(0) − x₀), it converges, and that implies that
ℓ(ŵ(k|k+1), v̂(k|k+1)) → 0 as k → ∞. The problem with this approach
is that it compares two different trajectories, and does not generalize
well to the bounded disturbance case where the infinite horizon prob-
lem is not bounded above. So we change course from previous analysis
and consider instead a single trajectory, but different times within the
trajectory by introducing partial sums
V⁰(j|k) = ℓ_x(x̂(0|k) − x₀) + Σ_{i=0}^{j−1} ℓ(ŵ(i|k), v̂(i|k))
for all j ≤ k. Substituting (4.14) into the definition of Y (·) then gives a
cost decrease equality
for j ≤ k − 1.
The last step is to use the i-IOSS Lyapunov function implied by the
detectability Assumption 4.13. Applying (4.9)–(4.10) to the values x(j)
and x̂(j|k) gives
for all j ≤ k. We define the Q-function as the sum of Λ(·) and Y (·)
Substituting the bounds on Y (·) and Λ(·) into this definition gives pos-
itive upper and lower bounds on Q(·)
for all k ∈ I≥0, and x(0), x₀ ∈ X. From Assumption 4.11, we also have
the upper bound

ℓ_x(x(0) − x₀) ≤ σ̄_x(|x(0) − x₀|)

with σ̄⁰(·) := (·) + σ̲_x⁻¹(σ̄_x(·)). Substituting this result in (4.15) then
gives the desired bound

Q(0|k) ≤ α₀(|x(0) − x₀|)

with K∞-function α₀ := α₂ ∘ σ̄⁰.
Summarizing, we have established that FIE provides a Q-function
that meets the following definition.
for all j ≤ k ∈ I≥0 for (4.16) and (4.17) and j ≤ k − 1 ∈ I≥0 for (4.18).
Next we establish a Q-function theorem for nominal stability (Allan
and Rawlings, 2019, Theorem 14).
Theorem 4.15 (Q-function theorem for global asymptotic stability). If
a state estimator admits a Q-function, then it is globally asymptotically
stable (GAS).
Q(j|k) ≤ σ̄^j(Q(0|k))
Combining this with (4.16) and (4.17) then gives for all j ≤ k
The reason for increasing the abstraction level in the current presen-
tation is not to handle nominal stability. That simple problem can be
addressed with simple tools. The point is to address finally FIE with
bounded disturbances. We are now in a good position to accomplish
that. Let’s first recall what we concluded about the steady-state Kalman
filter (predictor) with bounded disturbances. We showed in Example 4.4
that the Kalman predictor is RGAS and that the estimate error satisfies
(4.5). So that result represents the gold standard of FIE for a linear
system with bounded disturbances. We'll see next how close we can
come to the same conclusion for nonlinear systems.
The system continuity and detectability conditions from the nom-
inal case are unchanged when treating the bounded disturbance case.
But the stage cost and stabilizability assumptions require modification.
We state the new conditions next.
Assumption 4.17 (Stage cost under disturbances). The stage cost ℓ(·)
satisfies
α₂(2|χ − x₀|) ≤ ℓ_x(χ − x₀) ≤ σ̄_x(|χ − x₀|)
in which
χ + = f (χ, ω) y = h(χ) + ν
x⁺ = f(x, w) for i ∈ I_{0:k−1},    x⁺ = f(x, 0) for i ∈ I_{k:∞}
y = h(x) + v for i ∈ I_{0:k−1},    y = h(x) for i ∈ I_{k:∞}
Remark.
(a) Note the introduction of the factor of two in the lower bound of
ℓ(·) in Assumption 4.17 compared to the nominal case, Assumption
4.11.
(b) Note the new compatibility restriction on the lower bound for ℓx (·)
in Assumption 4.17 compared to the nominal case, Assumption 4.11.
(c) In the stabilizability assumption note that the upper bound on the
infinite horizon cost grows linearly with time for the case of bounded
disturbances. It is anticipated that the full-information optimal cost
also increases without bound for this bounded disturbance case. The
divergence of the optimal cost presents one of the primary challenges
in the estimator stability analysis.
Assumption 4.22 (Power-law bounds for stage costs). There exist pos-
itive constants c̲_ℓ, c̲_x, c̄_ℓ, c̄_x, and σ ≥ 1 such that
We then have the following result for robust stability of FIE under
disturbances.
(b) Let Assumptions 4.10 and 4.22–4.24 hold. Then full information
estimation is RGES.
The proof for RGES is given in (Allan and Rawlings, 2020, Theorem
3.16). The considerably more involved proof for RGAS is given in (Allan,
2020, Theorem 5.18).
Theorem 4.25 is a reasonable resting place for the theory of full
information estimation. We can finally handle bounded disturbances
in a fairly clean theoretical development with reasonable assumptions
on the system’s detectability and stabilizability. If one is willing to
strengthen the detectability assumption to exponential detectability as
in Theorem 4.25(b), the theoretical development is reasonably com-
pact, and can be easily extended to MHE as we show subsequently.
Moreover, by strengthening the definitions of RGAS and RGES using
the convolution maximization form, we have the desirable and antici-
pated consequence that stability implies convergence of estimate error
given convergence of disturbances.
x⁺ = Ax + Gw        w ∼ N(0, Q)
y = Cx + v          v ∼ N(0, R)        (4.22)
and random initial state x(0) ∼ N(x 0 , P − (0)). In FIE, we define the
objective function
V_T(χ(0), ω) = (1/2) ( |χ(0) − x₀|²_{(P⁻(0))⁻¹} + Σ_{i=0}^{T−1} ( |ω(i)|²_{Q⁻¹} + |ν(i)|²_{R⁻¹} ) )

(x̂(0|T), ŵ_T) = arg min_{χ(0), ω} V_T(χ(0), ω)

and the trajectory of state estimates comes from the model x̂(i+1|T) =
Ax̂(i|T) + Gŵ(i|T). We define estimate error as x̃(i|T) = x(i) − x̂(i|T)
for 0 ≤ i ≤ T − 1, T ≥ 1.
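Because the model is linear and the objective quadratic, this FIE problem is an ordinary least squares problem in (χ(0), ω). The sketch below stacks the weighted residuals and solves with a dense solver; the system, weights, and synthetic data are assumptions.

import numpy as np

A = np.array([[0.9, 0.2], [0.0, 0.8]]); G = np.eye(2)
C = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2); R = np.array([[0.01]]); P0 = np.eye(2)
T, n = 20, 2

rng = np.random.default_rng(3)
x, x0_prior = np.array([1.0, -0.5]), np.zeros(2)
ys = []
for _ in range(T):
    ys.append(C @ x + rng.normal(0, 0.1, 1))
    x = A @ x + G @ rng.normal(0, 0.1, 2)

# decision vector z = (chi(0), omega(0), ..., omega(T-1))
nz = n + n * T
sqrt_inv = lambda M: np.linalg.cholesky(np.linalg.inv(M)).T  # W'W = M^{-1}
Wp, Wq, Wr = sqrt_inv(P0), sqrt_inv(Q), sqrt_inv(R)

rows, rhs = [], []
E = np.zeros((n, nz)); E[:, :n] = Wp            # prior residual
rows.append(E); rhs.append(Wp @ x0_prior)

Phi = np.zeros((n, nz)); Phi[:, :n] = np.eye(n)  # chi(i) = Phi @ z
for i in range(T):
    rows.append(Wr @ C @ Phi); rhs.append(Wr @ ys[i])   # fitting residual
    E = np.zeros((n, nz)); E[:, n + n*i : n + n*(i+1)] = Wq
    rows.append(E); rhs.append(np.zeros(n))             # disturbance residual
    Phi = A @ Phi                     # chi(i+1) = A chi(i) + G omega(i)
    Phi[:, n + n*i : n + n*(i+1)] += G

z = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)[0]
chi0_hat, omega_hat = z[:n], z[n:].reshape(T, n)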
The simplest stability question is nominal stability, i.e., if noise-free
data are provided to the estimator, (w(i), v(i)) = 0 for all i ≥ 0 in
(4.22), is the estimate error asymptotically stable as T → ∞ for all x0 ?
We next make this statement precise. First we note that the noise-free
measurement satisfies y(i) − Cx̂(i|T) = Cx̃(i|T), 0 ≤ i ≤ T, and the
initial condition term can be written in estimate error as x̂(0) − x(0) =
−(x̃(0) − a) in which a = x(0) − x₀. For the noise-free measurement
we can therefore rewrite the cost function as
V_T(a, x̃(0), w) = (1/2) ( |x̃(0) − a|²_{(P⁻(0))⁻¹} + Σ_{i=0}^{T−1} ( |Cx̃(i)|²_{R⁻¹} + |w(i)|²_{Q⁻¹} ) )    (4.23)
in which we list explicitly the dependence of the cost function on pa-
rameter a. For estimation we solve

min_{x̃(0), w} V_T(a, x̃(0), w)    (4.24)
subject to x̃⁺ = Ax̃ + Gw. Now consider problem (4.24) as an opti-
mal control problem (OCP) using w as the manipulated variable and
minimizing an objective that measures the size of estimate error x̃ and
control w. We denote the optimal solution as x̃⁰(0; a) and w⁰(a). Sub-
stituting these into the model equation gives optimal estimate error
x̃⁰(j|T; a), 0 ≤ j ≤ T, 0 ≤ T. Parameter a denotes how far x(0), the
system's initial state generating the measurement, is from x₀, the prior.
If we are lucky and a = 0, the optimal solution is (x̃⁰, w⁰) = 0, and we
achieve zero cost in V_T⁰ and zero estimate error x̃⁰(j|T) at all times in
the trajectory.
x̃⁰(k + 1; a) = (A − L̃(k)C) x̃⁰(k; a)    (4.25)

for k ≥ 0. The initial condition for the recursion is x̃⁰(0; a) = a. The
time-varying gains L̃(k) and associated cost matrices P⁻(k) required
are

P⁻(k + 1) = GQG′ + AP⁻(k)A′ − AP⁻(k)C′ (CP⁻(k)C′ + R)⁻¹ CP⁻(k)A′    (4.26)

L̃(k) = AP⁻(k)C′ (CP⁻(k)C′ + R)⁻¹    (4.27)
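Iterating (4.26)–(4.27) directly takes a few lines; the matrices below are assumed example data, and P⁻(k) converges to the steady-state predictor covariance.

import numpy as np

A = np.array([[0.9, 0.2], [0.0, 0.8]]); C = np.array([[1.0, 0.0]])
G = np.eye(2); Q = 0.01 * np.eye(2); R = np.array([[0.01]])

P = np.eye(2)                                      # P-(0)
for k in range(200):
    S = C @ P @ C.T + R
    Lk = A @ P @ C.T @ np.linalg.inv(S)            # gain, (4.27)
    P = G @ Q @ G.T + A @ P @ A.T - Lk @ S @ Lk.T  # (4.26), equivalent form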
Regulator problem.

Regulator              Estimator
A                      A′
B                      C′
C                      G′
k                      l = N − k
Π(k)                   P⁻(l)
Π(k − 1)               P⁻(l + 1)
Π                      P⁻
Q                      Q
R                      R
Pf                     P⁻(0)
K                      −L̃′
A + BK                 (A − L̃C)′
x                      x̃

Regulator              Estimator
R > 0, Q > 0           R > 0, Q > 0
(A, B) stabilizable    (A, C) detectable
(A, C) detectable      (A, G) stabilizable

Table 4.2: Duality variables and stability conditions for linear quad-
ratic regulation and least squares estimation.
This result can be established directly using the Hautus lemma and
is left as an exercise. This lemma and the duality variables allow us to
translate stability conditions for infinite horizon regulation problems
into stability conditions for FIE problems, and vice versa. For example,
the following is a basic theorem covering convergence of Riccati equa-
tions in the form that is useful in establishing exponential stability of
regulation as discussed in Chapter 1.
Riccati equation

Π(k − 1) = C′QC + A′Π(k)A − A′Π(k)B (B′Π(k)B + R)⁻¹ B′Π(k)A,    k = N, . . . , 1
Π(N) = Pf
Then

(a) There exists Π ≥ 0 such that for every Pf ≥ 0

lim_{k→−∞} Π(k) = Π

and Π satisfies the steady-state Riccati equation

Π = C′QC + A′ΠA − A′ΠB (B′ΠB + R)⁻¹ B′ΠA

(b) The matrix A + BK in which

K = −(B′ΠB + R)⁻¹ B′ΠA

is a stable matrix.
Bertsekas (1987, pp.59–64) provides a proof for a slightly different
version of this theorem. Exercise 4.17 explores translating this theorem
into the form that is useful for establishing exponential convergence
of FIE.
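The duality in Table 4.2 is also easy to confirm numerically: feeding the transposed data to a regulator Riccati solver returns the estimator's steady-state covariance. A sketch with assumed matrices:

import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[0.9, 0.2], [0.0, 0.8]]); B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]]); G = np.eye(2)
Qy = np.array([[1.0]]); R = np.array([[0.1]])   # output and input penalties

# regulator: Pi = C'QyC + A'Pi A - A'Pi B (B'Pi B + R)^{-1} B'Pi A
Pi = solve_discrete_are(A, B, C.T @ Qy @ C, R)

# estimator via the Table 4.2 substitutions A -> A', B -> C', C -> G':
Qw = np.eye(2); Rv = np.array([[0.1]])
P = solve_discrete_are(A.T, C.T, G @ Qw @ G.T, Rv)
# P satisfies P = GQwG' + APA' - APC'(CPC' + Rv)^{-1}CPA'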
Here we discount the early data completely and choose Γi (·) = 0 for
all i ≥ 0. Because it discounts the past data completely, this form of
MHE must be able to asymptotically reconstruct the state using only
the most recent N measurements. The first issue is establishing exis-
tence of the solution. Unlike the full information problem, in which the
positive definite initial penalty guarantees that the optimization takes
place over a bounded (compact) set, here there is zero initial penalty.
So we must restrict the system further than i-IOSS to ensure solution
existence. We show next that observability is sufficient for this pur-
pose.
Let Assumption 4.10 hold. Then the MHE objective function V̂T (χ(T −
N), ω) is a continuous function of its arguments because f (·) and h(·)
are continuous. We next show that V̂T (·) is an unbounded function of
its arguments, which establishes existence of the solution of the MHE
optimization problem. Let Assumption 4.11 hold. Then we have that
V̂_T(χ(T − N), ω) = Σ_{i=T−N}^{T−1} ℓ(ω(i), ν(i)) ≥ Σ_{i=T−N}^{T−1} ( σ̲_w(|ω(i)|) + σ̲_v(|ν(i)|) )    (4.29)
x(T − N), w and v are fixed, we have that either ∥ω∥T −N:T −1 → ∞
or ∥ν∥T −N:T −1 → ∞, which implies from (4.29) that V̂T → ∞. We con-
clude that V̂T (χ(T − N), ω) → ∞ if |(χ(T − N), ω)| → ∞. Therefore
the objective function is a continuous and unbounded function of its
arguments, and existence of the solution of the MHE problem can be es-
tablished from the Weierstrass theorem (Proposition A.7). The solution
does not have to be unique.
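A minimal transcription of the zero-prior MHE problem follows the same pattern as the FIE sketch in Section 4.2, but over the last N stages only and with no initial penalty; model, stage cost, and data are again assumptions.

import casadi as ca

f = lambda x, w: 0.95 * x + w
h = lambda x: x
N, T = 5, 20
y = [0.8 * 0.95**k for k in range(T)]   # synthetic measurements

chi = ca.MX.sym('chi', N + 1)           # chi(T-N), ..., chi(T)
om = ca.MX.sym('om', N)

J = 0; g = []
for i in range(N):
    nu = y[T - N + i] - h(chi[i])
    J += om[i]**2 + nu**2               # ell(omega, nu); zero prior weighting
    g.append(chi[i + 1] - f(chi[i], om[i]))

solver = ca.nlpsol('mhe', 'ipopt', {'x': ca.vertcat(chi, om),
                                    'f': J, 'g': ca.vertcat(*g)})
sol = solver(x0=0, lbg=0, ubg=0)
xhat_T = float(sol['x'][N])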
We show next that final-state observability is a less restrictive and
more natural system requirement for MHE with zero prior weighting to
provide stability and convergence.
Notice that FSO is not the same as observable. For sufficiently re-
stricted f (·), FSO is weaker than observable and stronger than i-IOSS
(detectable) as discussed in Exercise 4.14.
To ensure FSO, we restrict the system as follows.
treats observable, FSO, and detectable for the linear time-invariant sys-
tem, which can be summarized compactly in terms of the eigenvalues
of the partitioned state transition matrix corresponding to the unob-
servable modes.
σ̲(‖(ŵ_k, v̂_k)‖_{k−N:k−1}) ≤ V̂_k⁰ ≤ σ̄(‖(w_k, v_k)‖_{k−N:k−1})
for some K-functions γ w , γ v . Again using |(a, b)| ≥ max(|a| , |b|) and
the triangle inequality, this bound can be rearranged into
|x(k) − x̂(k)| ≤ γ̄_x(‖(w_k, v_k)‖_{k−N:k−1}) + γ̄_x(‖(ŵ_k, v̂_k)‖_{k−N:k−1})

‖(ŵ_k, v̂_k)‖_{k−N:k−1} ≤ σ̲⁻¹ ∘ σ̄(‖(w_k, v_k)‖_{k−N:k−1})
and substitute this result into the previous inequality to obtain for all
k ≥ N ≥ No
|x(k) − x̂(k)| ≤ δ(‖(w_k, v_k)‖_{k−N:k−1})
Notice that unlike in FIE, the estimate error bound does not require
the initial error x(0) − x 0 since we have zero prior weighting and as
a result have assumed observability rather than detectability. Notice
also that RGAS implies estimate error converges to zero for convergent
disturbances. Finally, the K-functions σ̄ and hence δ increase with N,
which suggests that this analysis can likely be tightened to remove this
N dependence. See also the Notes discussion on this point.
The two drawbacks of zero prior weighting are: the system had to be
assumed observable rather than detectable to ensure existence of the
solution to the MHE problem; and a large horizon N may be required
to obtain performance comparable to full information estimation. We
address these two disadvantages by using nonzero prior weighting. To
get started, we use forward DP, as we did in Chapter 1 for the uncon-
strained linear case, to decompose the FIE problem exactly into the MHE
problem (4.28) in which Γk (·) is chosen as arrival cost.
Definition 4.34 (Full information arrival cost). The full information ar-
rival cost is defined as
subject to
Lemma 4.35 (MHE and FIE equivalence). The MHE problem (4.28) is
equivalent to the full information problem (4.3) for the choice Γk (·) =
Zk (·) for all k > N and N ≥ 1.
So, when solving the MHE problem at time T , we bound the prior
weighting on the initial state at time T − N using the deviation from the
estimate x̂(T − N|T − N). Choosing a constant cΓ satisfying c̲_Γ ≤ cΓ ≤ c̄_Γ
and corresponding prior weighting Γ_k(χ) = cΓ |χ − x̂(k|k)|^σ would be
the simplest choice meeting this assumption.
We next establish that MHE is RGES under the exponential case as-
sumptions with this so-called filtering prior and constant prior weight-
ing bounds (Allan and Rawlings, 2020, Theorem 4.2).
with 0 ≤ λ < 1. Now consider the time to be one horizon length later.
The MHE problem at this time has identical structure to the FIE problem,
but with different data: the initial prior x 0 is replaced by x̂(k0 |k0 ), the
bounds on ℓx (·) are replaced by the bounds on Γk (·), and the initial
and final times (0, k0 ) are replaced by (k0 , k0 + N). We therefore have
that
|e(k₀ + N)| ≤ a_Γ |e(k₀)| λ^N ⊕ max_{j∈I_{0:N−1}} a_d |d(k₀ + N − j − 1)| λ^j

where the RGES constant a_x is altered by the new data to a new constant
denoted a_Γ > 0.⁴ Using the previous bound for e(k₀) then gives

|e(k₀ + N)| ≤ a_x a_Γ λ^{k₀+N} |e(0)| ⊕ (a_Γ λ^N) max_{j∈I_{0:k₀−1}} a_d |d(k₀ − j − 1)| λ^j ⊕ max_{j∈I_{0:N−1}} a_d |d(k₀ + N − j − 1)| λ^j

Iterating this argument over p horizon lengths gives

|e(k₀ + pN)| ≤ a_x a_Γ^p λ^{k₀+pN} |e(0)| ⊕ (a_Γ λ^N)^p max_{j∈I_{0:k₀−1}} a_d |d(k₀ − j − 1)| λ^j ⊕ ⨁_{i=0}^{p−1} (a_Γ λ^N)^i max_{j∈I_{0:N−1}} a_d |d(k₀ + (p − i)N − j − 1)| λ^j
⁴The constant a_x is derived in the proof of Theorem 3.16 in Allan and Rawlings
(2020) and shown to be a_x := [ (c̄_x + c₂ 2^{σ−1} (1 + c̄_x/c̲_x)) / c₁ ]^{1/σ} where c₁ ≤ c₂ are
the constants in the power-law bounds for the exponential i-IOSS Lyapunov function
corresponding to Assumption 4.24, and c̲_x, c̄_x are from Assumption 4.22. The value
of a_Γ is therefore given by replacing c̲_x and c̄_x in this expression with c̲_Γ and c̄_Γ,
respectively. Note that a_x, a_Γ ≥ 1 since c₁ ≤ c₂.
[Figure: measurement windows y(k) used by the MHE problem at time T (y_{T−N:T−1}), the smoothing update (y_{T−N−1:T−2}), and the filtering update (y_{T−2N:T−N−1}).]
Now let η := a_Γ^{1/N} λ, and note that λ ≤ η < 1 by the choice of N.
Substituting η into the previous equation and noting that a_Γ ≥ 1 gives
the bound

⨁_{i=0}^{p−1} max_{j∈I_{0:N−1}} a_d |d(k₀ + (p − i)N − j − 1)| η^{iN+j}
[Figure 4.2: pressures p_A, p_B (atm) and estimate errors |p_A − p̂_A|, |p_B − p̂_B| (atm, log scale) versus time (min).]
dp_A/dt = −2k₁p_A² + 2k₂p_B
dp_B/dt = k₁p_A² − k₂p_B
with k1 = 0.16 min−1 atm−1 and k2 = 0.0064 min−1 . The only mea-
surement is total pressure, y = pA + pB .
Starting from initial condition x = (3, 1), the system is measured
with sample time ∆ = 0.1 min. The model is exact and there are no
disturbances. Using a poor initial estimate x 0 = (0.1, 4.5), parameters
" # " #
10−4 0 h i 1 0
Q= R = 0.01 P=
0 0.01 0 1
and horizon N = 10, MHE is performed on the system using the filtering
and smoothing updates for the prior weighting. For comparison, the
EKF is also used. The resulting estimates are plotted in Figure 4.2.
In this simulation, MHE performs well with either update formula.
Due to the structure of the filtering update, every N = 10 time steps,
a poor state estimate is used as the prior, which leads to undesirable
periodic behavior in the estimated state. Due to the poor initial state es-
timate, the EKF produces negative pressure estimates, leading to large
estimate errors throughout the simulation. □
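The data for such a simulation can be generated by integrating the model between samples and measuring total pressure; a noise-free sketch:

import numpy as np
from scipy.integrate import solve_ivp

k1, k2 = 0.16, 0.0064

def rhs(t, p):
    pA, pB = p
    return [-2*k1*pA**2 + 2*k2*pB, k1*pA**2 - k2*pB]

Delta, Tf = 0.1, 20.0
x = np.array([3.0, 1.0])                 # initial condition (pA, pB)
ys = []
for _ in np.arange(0.0, Tf, Delta):
    ys.append(x[0] + x[1])               # y = pA + pB
    x = solve_ivp(rhs, (0.0, Delta), x).y[:, -1]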
• What are the best methods to update the MHE initial penalty, Γk (·)
to obtain an accurate estimator with a small horizon N for com-
putational efficiency?
State estimation for nonlinear systems has a long history, and moving
horizon estimation is a rather new approach to the problem. As with
model predictive control, the optimal estimation problem on which
moving horizon is based has a long history, but only the rather recent
advances in computing technology have enabled moving horizon esti-
mation to be considered as a viable option in online applications. It is
therefore worthwhile to compare moving horizon estimation to other
less computationally demanding nonlinear state estimators.
The extended Kalman filter (EKF) generates estimates for nonlinear sys-
tems by first linearizing the nonlinear system, and then applying the
linear Kalman filter equations to the linearized system. The approach
can be summarized in a recursion similar in structure to the Kalman
filter

x̂⁻(k + 1) = f(x̂(k), 0)
P⁻(k + 1) = A(k)P(k)A(k)′ + G(k)QG(k)′
x̂⁻(0) = x₀    P⁻(0) = Q₀
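A minimal scalar EKF, combining this prediction recursion with the standard linearized correction step (model, derivatives, and weights assumed):

import numpy as np

f  = lambda x: 0.9 * x + 0.1 * x**2    # assumed model, x+ = f(x) (+ w)
df = lambda x: 0.9 + 0.2 * x           # A(k) = df/dx at xhat(k)
h  = lambda x: x                       # measurement
dh = lambda x: 1.0                     # C(k) = dh/dx
Q, R, Q0 = 0.01, 0.01, 1.0

def ekf_step(xpred, Ppred, y):
    # correction: linearize h at the prediction, apply the KF update
    Ck = dh(xpred)
    Lk = Ppred * Ck / (Ck * Ppred * Ck + R)
    xhat = xpred + Lk * (y - h(xpred))
    P = Ppred - Lk * Ck * Ppred
    # prediction: xhat-(k+1) = f(xhat(k)), P-(k+1) = A P A' + Q (G = 1)
    Ak = df(xhat)
    return f(xhat), Ak * P * Ak + Q, xhat

xpred, Ppred = 0.0, Q0                 # xhat-(0) = x0, P-(0) = Q0
for y in [0.5, 0.45, 0.40]:            # stand-in measurements
    xpred, Ppred, xhat = ekf_step(xpred, Ppred, y)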
The sigma points are chosen deterministically, for example, as points on a selected
covariance contour ellipse or a simplex. The particle filtering points are chosen by
random sampling.
and weights of the transformed points then update the mean and co-
variance
After measurement, the EKF correction step is applied after first ex-
pressing this step in terms of the covariances of the innovation and
state prediction. The output error is given as ỹ := y − ŷ⁻. We next
rewrite the Kalman filter update as

x̂ = x̂⁻ + L(y − ŷ⁻)
L = E((x − x̂⁻)ỹ′) [E(ỹỹ′)]⁻¹        (= P⁻C′ (R + CP⁻C′)⁻¹ in the linear case)
P = P⁻ − L E((x − x̂⁻)ỹ′)′           (E((x − x̂⁻)ỹ′)′ = CP⁻ in the linear case)
One nice feature enjoyed by the EKF and UKF formulations is the re-
cursive update equations. One-step recursions are computationally ef-
ficient, which may be critical in online applications with short sample
times. The MHE computational burden may be reduced by shorten-
ing the length of the moving horizon, N. But use of short horizons
may produce inaccurate estimates, especially after an unmodeled dis-
turbance. This unfortunate behavior is the result of the system’s non-
linearity. As we saw in Sections 1.4.3–1.4.4, for linear systems, the full
information problem and the MHE problem are identical to a one-step
recursion using the appropriate state penalty coming from the filtering
Riccati equation. Losing the equivalence of a one-step recursion to full
information or a finite moving horizon problem brings into question
whether the one-step recursion can provide equivalent estimator per-
formance. We show in the following example that the EKF and the UKF
do not provide estimator performance comparable to MHE.
x + = f (x) + w
y = Cx + v
The prior density for the initial state, N(x 0 , P (0)), is deliberately cho-
sen to poorly represent the actual initial state to model a large initial
disturbance to the system. We wish to examine how the different esti-
mators recover from this large unmodeled disturbance.
Solution
Figure 4.3 (top) shows a typical EKF performance for these conditions.
Note that the EKF cannot reconstruct the state for this system and that
the estimates converge to incorrect steady states displaying negative
concentrations of A and B. For some realizations of the noise sequences,
the EKF may converge to the correct steady state. Even for these cases,
Figure 4.3: Evolution of the state (solid line) and EKF state estimate
(dashed line). Top plot shows negative concentration es-
timates with the standard EKF. Bottom plot shows large
estimate errors and slow convergence with the clipped
EKF.
Figure 4.4: Evolution of the state (solid line) and UKF state estimate
(dashed line). Top plot shows negative concentration es-
timates with the standard UKF. Bottom plot shows similar
problems even if constraint scaling is applied.
Figure 4.5: Evolution of the state (solid line) and MHE state estimate
(dashed line).
using other suggestions from the literature (Julier and Uhlmann, 1997;
Qu and Hahn, 2009; Kandepu, Imsland, and Foss, 2008) and trial and
error, does not substantially improve the UKF estimator performance.
Better performance is obtained in this example if the sigma points
that violate the constraints are simply saturated rather than rescaled
to the feasible region boundaries. But, this form of clipping still does
not prevent the occurrence of negative concentrations in this example.
Negative concentration estimates are not avoided by either scaling or
clipping of the sigma points. As a solution to this problem, the use
of constrained optimization for the sigma points is proposed (Vach-
hani et al., 2006; Teixeira, Tôrres, Aguirre, and Bernstein, 2008). If one
is willing to perform online optimization, however, MHE with a short
horizon is likely to provide more accurate estimates at similar computa-
tional cost compared to approaches based on optimizing the locations
of the sigma points.
The authors have only recently become aware of yet another ap-
proach to handling constraints in the UKF that does work well on this
example (Kolås, Foss, and Schei, 2009). It remains to be seen whether
further examples can be constructed that this approach cannot ad-
dress.
Finally, Figure 4.5 presents typical results of applying constrained
MHE to this example. For this simulation we choose N = 10 and the
smoothing update for the arrival cost approximation. Note that MHE
recovers well from the poor initial prior. Comparable performance is
obtained if the filtering update is used instead of the smoothing update
to approximate the arrival cost. The MHE estimates are also insensitive
to the choice of horizon length N for this example. □
The EKF, UKF, and all one-step recursive estimation methods suffer
from the "short horizon syndrome" by design. One can try to reduce
the harmful effects of a short horizon through tuning various other
parameters in the estimator, but the basic problem remains. Large
initial state errors lead to inaccurate estimation and potential estimator
divergence. The one-step recursions such as the EKF and UKF can be
viewed as one extreme in the choice between speed and accuracy in
that only a single measurement is considered at each sample. That is
similar to an MHE problem in which the user chooses N = 1. Situations
in which N = 1 leads to poor MHE performance often lead to unreliable
EKF and UKF performance as well.
The obvious difficulty with analyzing the effect of estimate error is the
coupling of estimation and control. Unlike the problem studied earlier
in the chapter, where x + = f (x, w), we now have estimate error also
influencing state evolution. This coupling precludes obtaining the sim-
ple bounds on |e(k)| in terms of (e(0), w, v) as we did in the previous
sections.
What’s possible. Here we lower our sights from the analysis of the
fully coupled problem and consider only the effect of bounded esti-
mate error on the combined estimation/regulation problem. To make
this precise, consider the following definition of an incrementally, uni-
formly input/output-to-state stable (i-UIOSS) system.
x + = f (x, u, w) y = h(x)
two initial states z1 and z2 , any input sequence u, and any two distur-
bance sequences w1 and w2 generating state sequences x1 (z1 , u, w1 )
and x2 (z2 , u, w2 ), the following holds for all k ∈ I≥0
Next we note that the evolution of the state in the form of (4.34)
is not a compelling starting point for analysis because the estimate
error perturbation appears inside a possibly discontinuous function,
κN (·) (recall Example 2.8). Therefore, as in (Roset, Heemels, Lazar, and
Nijmeijer, 2008), we instead express the equivalent evolution, but in
terms of the state estimate as
The proof of this proposition follows Jiang and Wang (2001) as mod-
ified for a difference inclusion on a robust positive invariant set in Al-
lan, Bates, Risbeck, and Rawlings (2017, Proposition 19).
Combined MHE/MPC is RAS. Our strategy now is to establish that
VN0 (x) is an ISS Lyapunov function for the combined MHE/MPC system
subject to process and measurement disturbances on a robust positive
invariant set. We have already established the upper and lower bound-
ing inequalities
α1 (|x|) ≤ VN0 (x) ≤ α2 (|x|)
[Figure: one-step evolution of the state x, estimate x̂, and estimate error e = x − x̂ under the control u = κ_N(x̂), showing the feasible sets X_{N+1} and X_N, the terminal set X_f with terminal law κ_f(·), and the transitions x⁺ = f(x, κ_N(x̂), w) and x̂⁺ = f(x̂ + e, κ_N(x̂), w) − e⁺.]
So we require only
with σV (·) ∈ K. Note that we are not using the possibly discontinuous
VN0 (x) here. Since f (x, u, w) is also continuous
|x̂⁺ − f(x̂, κ_N(x̂), 0)| = |f(x̂ + e, κ_N(x̂), w) − e⁺ − f(x̂, κ_N(x̂), 0)|
                        ≤ |f(x̂ + e, κ_N(x̂), w) − f(x̂, κ_N(x̂), 0)| + |e⁺|
                        ≤ σ_f(|(e, w)|) + |e⁺|
                        ≤ σ̃_f(|d|)

with σ̃_f(·) ∈ K. Note that for the candidate sequence ũ, V_N(f(x̂, κ_N(x̂),
0), ũ) ≤ V_N⁰(x̂) − ℓ(x̂, κ_N(x̂)), so we have that

V_N(f(x̂, κ_N(x̂), 0), ũ) ≤ V_N⁰(x̂) − α₁(|x̂|)

since α₁(|x|) ≤ ℓ(x, κ_N(x)) for all x. Therefore, we finally have

V_N(x̂⁺, ũ) ≤ V_N⁰(x̂) − α₁(|x̂|) + σ(|d|)
V_N⁰(x̂⁺) ≤ V_N⁰(x̂) − α₁(|x̂|) + σ(|d|)    (4.39)
and we have established that VN0 (·) satisfies the inequality of an ISS-
Lyapunov function. This analysis leads to the following main result.
Theorem 4.46 (Combined MHE/MPC is RAS). For the MPC regulator,
let the standard Assumptions 2.2, 2.3, and 2.14 hold, and choose Xf =
levτ Vf for some τ > 0. For the moving horizon estimator, let Assump-
tion 4.41 hold. Then for every ρ > 0 there exists δ > 0 such that if
∥d∥ ≤ δ, the origin is RAS for the system x̂ + = f (x̂ + e, κN (x̂), w) − e+ ,
y = h(x̂ + e) + v, in the set Xρ = levρ Vf .
A complete proof of this theorem, for the more general case of sub-
optimal MPC, is given in Allan et al. (2017, Theorem 21). The proof
proceeds by first showing that Xρ is robustly positive invariant for all
ρ > 0. That argument is similar to the one presented in Chapter 3 be-
fore Proposition 3.5. The proof then establishes that inequality (4.39)
holds for all x̂ ∈ Xρ . Proposition 4.45 is then invoked to establish that
the origin is RAS.
Notice that neither VN0 (·) nor κN (·) need be continuous for this com-
bination of MHE and MPC to be inherently robust. Since x = x̂ + e, The-
orem 4.46 also gives robust asymptotic stability of the evolution of x
in addition to x̂ for the closed-loop system with bounded disturbances.
[Figure: closed-loop concentration c and temperature T, and the corresponding estimate errors x − x̂ (scales ×10⁻⁵ and ×10⁻⁴), versus time for the combined MHE/MPC example.]
4.6 Notes
State estimation is a fundamental topic appearing in many branches of
science and engineering, and has a large literature. A nice and brief
annotated bibliography describing the early contributions to optimal
state estimation of the linear Gaussian system is provided by Åström
(1970, pp. 252-255). Kailath (1974) provides a comprehensive and his-
torical review of linear filtering theory including the historical devel-
opment of Wiener-Kolmogorov theory for filtering and prediction that
preceded Kalman filtering (Wiener, 1949; Kolmogorov, 1941).
Jazwinski (1970) provides an early and comprehensive treatment of
the optimal stochastic state estimation problem for linear and nonlin-
cant open research problems. Next Ji, Rawlings, Hu, Wynn, and Diehl
(2016); Hu, Xie, and You (2015) provided the first analysis of full infor-
mation estimation for bounded disturbances by introducing a max term
in the estimation objective function, and assuming stronger forms of
the i-IOSS detectability condition. This reformulation did provide RAS
of full information estimation with bounded disturbances, but had the
unfortunate side effect of removing convergent estimate error for con-
vergent disturbances.
In a major step forward, Müller (2017) examined MHE with bounded
disturbances for similarly restrictive i-IOSS conditions, and established
bounds on arrival cost penalty and horizon length that provide both
RAS for bounded disturbances and convergence of estimate error for
convergent disturbances. Hu (2017) generalized the detectability con-
ditions in Ji et al. (2016) and treated both full information with the max
term and MHE estimation. At this stage of development, all the bounds
for robust stability became worse with increasing horizon length, which
seems problematic since the use of more measurements should im-
prove estimation. In another significant step, Knüfer and Müller (2018)
next introduced a fading memory formulation of FIE and MHE for expo-
nentially i-IOSS systems whose bounds improved with horizon length.
But this formulation required that the stage cost satisfy the triangle
inequality, which excludes the quadratic penalty commonly used in es-
timation, especially for linear systems.
As described in detail throughout the chapter, Allan (2020) intro-
duced explicit stabilizability assumptions into the analysis and estab-
lished a converse theorem for i-IOSS. He then showed for general stage
costs that FIE is RGAS for (asymptotic) i-IOSS systems, thus removing
the exponential part of the assumption, and that MHE is RGES for ex-
ponentially i-IOSS systems. As mentioned in the chapter, whether MHE
is RGAS for (asymptotic) i-IOSS systems remains an open question. Fi-
nally, numerous application papers using MHE have appeared in the
last several years indicating a growing interest in this approach to state
estimation.
For the case of output feedback, there are of course alternatives
to simply combining independently designed MHE estimators and MPC
regulators as briefly analyzed in Section 4.5. Recently Copp and Hes-
panha (2017) propose solving instead a single min-max optimization
for simultaneous estimation and control. Because of the excellent re-
sultant closed-loop properties, this class of approaches certainly war-
rants further attention and development.
4.7 Exercises
(b) Show that the ISS property implies the “converging-input converging-state” prop-
erty (Jiang and Wang, 2001), (Sontag, 1998, p. 330), i.e., show that if the system
is ISS, then u(k) → 0 implies x(k) → 0.
(b) Show that the system is detectable if and only if the system is OSS.
x + = Ax + Gw y = Cx
(a) Show that if the system is observable, then the system is IOSS.
(b) Show that the system is detectable if and only if the system is IOSS.
a ⊕ b := max(a, b)
(a) Show that the ⊕ operator is commutative and associative, i.e., a ⊕ b = b ⊕ a and
(a ⊕ b) ⊕ c = a ⊕ (b ⊕ c) for all a, b, c, so that the following operation is well
defined and the order of operation is inconsequential
a₁ ⊕ a₂ ⊕ a₃ ⊕ · · · ⊕ aₙ := ⨁_{i=1}^{n} aᵢ
(b) Find scalars d and e such that for all a, b ≥ 0, the following holds

d(a + b) ≤ a ⊕ b ≤ e(a + b)

(c) Find scalars d̄ and ē such that for all a, b ≥ 0, the following holds

d̄(a ⊕ b) ≤ a + b ≤ ē(a ⊕ b)

(d) Generalize the previous result to the n-term sum; find dₙ, eₙ, d̄ₙ, ēₙ such that
the following holds for all aᵢ ≥ 0, i = 1, 2, . . . , n

dₙ Σ_{i=1}^{n} aᵢ ≤ ⨁_{i=1}^{n} aᵢ ≤ eₙ Σ_{i=1}^{n} aᵢ

d̄ₙ ⨁_{i=1}^{n} aᵢ ≤ Σ_{i=1}^{n} aᵢ ≤ ēₙ ⨁_{i=1}^{n} aᵢ
Show that

lim_{k→∞} s(k) = 0 (∞) if and only if lim_{k→∞} s̄(k) = 0 (∞)
(b) If you choose to work with max instead, derive the following simpler result

γ(a₁ ⊕ a₂ ⊕ · · · ⊕ aₙ) = γ(a₁) ⊕ γ(a₂) ⊕ · · · ⊕ γ(aₙ)
Notice that you have an equality rather than an inequality, which leads to tighter
bounds.
x + = f (x) y = h(x)
holds for all x1 , x2 ∈ Rn . This definition was used by Rao et al. (2003) in showing
stability of nonlinear MHE to initial condition error under zero state and measurement
disturbances.
(a) Show that this form of nonlinear observability implies i-OSS.
(b) Show that i-OSS does not imply this form of nonlinear observability and, there-
fore, i-OSS is a weaker assumption.
The i-OSS concept generalizes the linear system concept of detectability to nonlinear
systems.
ẋ = Ax + Bu y = Cx
Show that the system is detectable if and only if the system is IOSS.
(c) detectable?
in which x is the state at the current stage and z is the state at the next stage. The
stage cost and arrival cost are given by
ℓ(x, w) = (1/2) ( |y(k) − Cx|²_{R⁻¹} + w′Q⁻¹w )        V_k⁻(x) = (1/2) |x − x̂⁻(k)|²_{(P⁻(k))⁻¹}
and we wish to find the value function V⁰(z), which we denote V⁻_{k+1}(z) in the Kalman
predictor estimation problem.
(b) Add the w term and use the inverse form in Exercise 1.18 to show the optimal
cost is given by

Substitute the results for x̂(k) and P(k) above and show

V⁻_{k+1}(z) = (1/2)(z − x̂⁻(k + 1))′ (P⁻(k + 1))⁻¹ (z − x̂⁻(k + 1))

P⁻(k + 1) = Q + AP⁻(k)A′ − AP⁻(k)C′ (CP⁻(k)C′ + R)⁻¹ CP⁻(k)A′

x̂⁻(k + 1) = Ax̂⁻(k) + L̃(k)(y(k) − Cx̂⁻(k))

L̃(k) = AP⁻(k)C′ (CP⁻(k)C′ + R)⁻¹
(c) Compare and contrast this form of the estimation problem to the one given in
Exercise 1.29 that describes the Kalman filter.
(A, G) stabilizable
(a) Is the steady-state Kalman filter a stable estimator? Is the full information esti-
mator a stable estimator? Are these two answers contradictory? Work out the
results for the case A = 1, G = 0, C = 1, P − (0) = 1, Q = 1, R = 1.
Hint: you may want to consult de Souza, Gevers, and Goodwin (1986).
(b) Can this phenomenon happen in the LQ regulator? Provide the interpretation
of the time-varying regulator that corresponds to the time-varying filter given
above. Does this make sense as a regulation problem?
A. Gelb, editor. Applied Optimal Estimation. The M.I.T. Press, Cambridge, Mas-
sachusetts, 1974.
S. J. Julier and J. K. Uhlmann. Author’s reply. IEEE Trans. Auto. Cont., 47(8):
1408–1409, August 2002.
T. Kailath. A view of three decades of linear filtering theory. IEEE Trans. Inform.
Theory, IT-20(2):146–181, March 1974.
H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. John Wiley and
Sons, New York, 1972.
C. V. Rao. Moving Horizon Strategies for the Constrained Monitoring and Con-
trol of Nonlinear Discrete-Time Systems. PhD thesis, University of Wisconsin–
Madison, 2000.
R. van der Merwe, A. Doucet, N. de Freitas, and E. Wan. The unscented parti-
cle filter. Technical Report CUED/F-INFENG/TR 380, Cambridge University
Engineering Department, August 2000.
T. Yang, P. G. Mehta, and S. P. Meyn. Feedback particle filter. IEEE Trans. Auto.
Cont., 58(10):2465–2480, 2013.
5
Output Model Predictive Control
5.1 Introduction
In Chapter 2 we show how model predictive control (MPC) may be em-
ployed to control a deterministic system, that is, a system in which there
are no uncertainties and the state is known. In Chapter 3 we show how
to control an uncertain system in which uncertainties are present but
the state is known. Here we address the problem of MPC of an un-
certain system in which the state is not fully known. We assume that
there are outputs available that may be used to estimate the state as
shown in Chapter 4. These outputs are used by the model predictive
controller to generate control actions; hence the name output MPC.
The state is not known, but a noisy measurement y(t) of the state
is available at each time t. Since the state x is not known, it is re-
placed by a hyperstate p that summarizes all prior information (previ-
ous inputs and outputs and the prior distribution of the initial state)
and that has the “state” property: future values of p can be deter-
mined from the current value of p, and current and future inputs
and outputs. Usually p(t) is the conditional density of x(t) given
the prior density p(0) of x(0), and the current available “information”
I(t) := {y(0), y(1), . . . , y(t − 1), u(0), u(1), . . . , u(t − 1)}.
For the purpose of control, future hyperstates have to be predicted
since future noisy measurements of the state are not known. So the
hyperstate satisfies an uncertain difference equation of the form
p + = φ(p, u, ψ) (5.1)
x⁺ = Ax + Bu + w
y = Cx + ν
The state and control are required to satisfy the constraints x(t) ∈ X
and u(t) ∈ U for all t, and the disturbance is assumed to lie in the
compact set W. It is assumed that the origin lies in the interior of the
sets X, U, and W. The state estimator (x̂, Σ) evolves, as shown in the
sequel, according to
x̂ + = φ(x̂, u, ψ) (5.4)
Σ⁺ = Φ(Σ)    (5.5)
Figure 5.1: State estimator tube. The solid line x̂(t) is the center of
the tube, and the dashed line is a sample trajectory of
x(t).
Figure 5.2: The system with disturbance. The state estimate lies in
the inner tube, and the state lies in the outer tube.
the nominal version of (5.4). Thus we get two tubes, one embedded in
the other. At time t the estimator state x̂(t) lies in the set {x̄(t)}⊕S(t),
and x(t) lies in the set {x̂(t)} ⊕ Σ(t), so that for all t
Figure 5.2 shows the tube ({x̄(t)} ⊕ S(t)), in which the trajectory (x̂(t))
lies, and the tube ({x̄(t)} ⊕ Γ(t)), in which the state trajectory (x(t))
lies.
x + = Ax + Bu + w
y = Cx + ν (5.7)
x̂⁺ = Ax̂ + Bu + L(Cx̃ + ν)
1 Recall, a C-set is a convex, compact set containing the origin.
to assume that if the estimator has been running for a “long” time, it
is in steady state.
Hence we have obtained a state estimator, with “state” (x̂, Σ) satis-
fying
and x(i) ∈ {x̂(i)} ⊕ Σ for all i ∈ I≥0, thus meeting the requirements
specified in Section 5.2. Knowing this, our remaining task is to control
x̂(i) so that the resultant closed-loop system is stable and satisfies all
constraints.
5.3.3 Controlling x̂
Since x̃(i) ∈ Σ for all i, we seek a method for controlling the observer
state x̂(i) in such a way that x(i) = x̂(i) + x̃(i) satisfies the state
constraint x(i) ∈ X for all i. The state constraint x(i) ∈ X will be
satisfied if we control the estimator state to satisfy x̂(i) ∈ X ⊖ Σ for all
i. The estimator state satisfies (5.12) which can be written in the form
x̂ + = Ax̂ + Bu + δ (5.13)
∆ := L(C Σ ⊕ N)
u = ū + Ke e := x̂ − x̄ (5.14)
x̄ + = Ax̄ + B ū
e + = AK e + δ AK := A + BK (5.16)
AK S ⊕ ∆ = S
in Chapter 3 for robust state feedback MPC of systems; the major dif-
ference is that we now control the estimator state x̂ and use the fact
that the actual state x lies in {x̂} ⊕ Σ.
x̄ + = Ax̄ + B ū (5.17)
x̄(i) ∈ X̄ ⊆ X ⊖ Γ Γ := S ⊕ Σ (5.19)
ū(i) ∈ Ū ⊆ U ⊖ KS (5.20)
sets Σ and S tend to the set {0} as W and N tend to the set {0} in the
sense that dH (W, {0}) → 0 and dH (N, {0}) → 0.
It follows from Propositions 5.2 and 5.3, if Assumption 5.4 holds,
that satisfaction of the constraints (5.19) and (5.20) by the nominal sys-
tem ensures satisfaction of the constraints (5.8) by the original system.
The nominal optimal control problem is, therefore
P̄_N(x̄) :  V̄_N⁰(x̄) = min_ū { V̄_N(x̄, ū) | ū ∈ Ū_N(x̄) }
ŪN (x̄) := {ū | ū(k) ∈ Ū and φ̄(k; x̄, ū) ∈ X̄ ∀k ∈ {0, 1, . . . , N − 1},
φ̄(N; x̄, ū) ∈ X̄f } (5.21)
In (5.21), X̄f ⊆ X̄ is the terminal constraint set, and φ̄(k; x̄, ū) denotes
the solution of x̄ + = Ax̄ + B ū at time k if the initial state at time 0 is x̄
and the control sequence is ū = (ū(0), ū(1), . . . , ū(N − 1)). The termi-
nal constraint, which is not desirable in process control applications,
may be omitted, as shown in Chapter 2, if the set of admissible initial
states is suitably restricted. Let ū0 (x̄) denote the minimizing control
sequence; the stage cost ℓ(·) is chosen to ensure uniqueness of ū0 (x̄).
The implicit model predictive control law for the nominal system is
κ̄N (·) defined by
κ̄N (x̄) := ū0 (0; x̄)
where ū0 (0; x̄) is the first element in the sequence ū0 (x̄). The domain
of V̄N0 (·) and ū0 (·), and, hence, of κ̄N (·), is X̄N defined by
X̄N := {x̄ ∈ X̄ | ŪN (x̄) ≠ ∅} (5.22)
X̄N is the set of initial states x̄ that can be steered to X̄f by an admis-
sible control ū that satisfies the state and control constraints, (5.19)
and (5.20), and the terminal constraint. From (5.14), the implicit con-
trol law for the state estimator x̂ + = Ax̂ + Bu + δ is κN (·) defined
by
κN (x̂, x̄) := κ̄N (x̄) + K(x̂ − x̄)
The controlled composite system with state (x̂, x̄) satisfies
x̂ + = Ax̂ + BκN (x̂, x̄) + δ (5.23)
x̄ + = Ax̄ + B κ̄N (x̄) (5.24)
with initial state (x̂(0), x̄(0)) satisfying x̂(0) ∈ {x̄(0)} ⊕ S, x̄(0) ∈
X̄N . These constraints are satisfied if x̄(0) = x̂(0) ∈ X̄N . The control
algorithm may be formally stated as follows.
If the terminal cost Vf (·) and terminal constraint set X̄f satisfy the
stability Assumption 2.14, and if Assumption 5.4 is satisfied, the value
function V̄N0 (·) satisfies
4. X̄N is a C-set,
It follows from Chapter 2 that the origin is exponentially stable for the
nominal system x̄ + = Ax̄ + B κ̄N (x̄) with a region of attraction X̄N so
that there exists a c > 0 and a γ ∈ (0, 1) such that
|x̄(i)| ≤ c |x̄(0)| γ i
for all x̄(0) ∈ X̄N , all i ∈ I≥0 . Also x̄(i) ∈ X̄N for all i ∈ I≥0 if
x̄(0) ∈ X̄N so that problem PN (x̄(i)) is always feasible. Because the
state x̂(i) of the state estimator always lies in {x̄(i)} ⊕ S, and the state
x(i) of the system being controlled always lies in {x̄(i)} ⊕ Γ, it fol-
lows that x̂(i) converges robustly and exponentially fast to S, and x(i)
converges robustly and exponentially fast to Γ. We are now in a posi-
tion to establish exponential stability of A := S × {0} with a region of
attraction (X̄N ⊕ S) × X̄N for the composite system (5.23) and (5.24).
Proof. Let φ := (x̂, x̄) denote the state of the composite system. Then
|φ|_A is defined by

|φ|_A = |x̂|_S + |x̄|

where |x̂|_S := d(x̂, S). But x̂ ∈ {x̄} ⊕ S implies x̂ = x̄ + e for some e ∈ S
so that

|x̂|_S = d(x̂, S) = d(x̄ + e, S) ≤ d(x̄ + e, e) = |x̄|

since e ∈ S. Hence |φ|_A ≤ 2|x̄| so that
for all φ(0) ∈ (X̄N ⊕ S) × X̄N . Since for all x̄(0) ∈ X̄N , x̄(i) ∈ X̄ and
ū(i) ∈ Ū, it follows that x̂(i) ∈ {x̄(i)} ⊕ S, x(i) ∈ X, and u(i) ∈ U for
all i ∈ I≥0 . Thus A := S × {0} is exponentially stable with a region of
attraction (X̄N ⊕S)× X̄N for the composite system (5.23) and (5.24). ■
It follows from Proposition 5.6 that x(i), which lies in the set {x̄(i)} ⊕
Γ, Γ := S ⊕ Σ, converges to the set Γ. In fact x(i) converges to a set
that is, in general, smaller than Γ since Γ is a conservative bound on
x̃(i) + e(i). We determine this smaller set as follows. Let φ := (x̃, e)
and let ψ := (w, ν); φ is the state of the two error systems and ψ is a
bounded disturbance lying in a C-set Ψ := W × N. Then, from (5.10) and
(5.16), the state φ evolves according to

φ⁺ = Ãφ + B̃ψ    (5.25)
Because ρ(A_L) < 1 and ρ(A_K) < 1, it follows that ρ(Ã) < 1. Since
ρ(Ã) < 1 and Ψ is compact, there exists a robust positive invariant set
Φ ⊆ ℝⁿ × ℝⁿ for (5.25) satisfying

ÃΦ ⊕ B̃Ψ = Φ
The analysis above shows the tightened state and control constraint
sets X̄ and Ū for the nominal optimal control problem can, in principle,
be computed using set algebra. Polyhedral set computations are not
robust, however, and usually are limited to sets in ℝⁿ with n ≤ 15. So
we present here an alternative method for computing tightened con-
straints, similar to that described in Section 3.5.3.
We next show how to obtain a conservative approximation to X̄ ⊆
X ⊖ Γ, Γ = S ⊕ Σ. Suppose c′x ≤ d is one of the constraints defining X.
Since e = x̂ − x̄, which lies in S, and x̃ = x − x̂, which lies in Σ, satisfy
e⁺ = A_K e + LCx̃ + Lν and x̃⁺ = A_L x̃ + w − Lν, the corresponding
constraint in X̄ should be c′x ≤ d − φ_∞^X̄ in which

φ_∞^X̄ = max{c′e | e ∈ S} + max{c′x̃ | x̃ ∈ Σ}
      = max_{(w(i),ν(i))} Σ_{j=0}^{∞} c′A_K^j (LCx̃(j) + Lν(j)) + max_{(w(i),ν(i))} Σ_{j=0}^{∞} c′A_L^j (w(j) − Lν(j))

in which x̃(j) = Σ_{i=0}^{j−1} A_L^i (w(i) − Lν(i)). The maximizations are subject
to the constraints w(i) ∈ W and ν(i) ∈ N for all i ∈ I≥0. Because
φ_N^X̄ = max_{(w(i),ν(i))} Σ_{j=0}^{N−1} c′A_K^j (LCx̃(j) + Lν(j)) + max_{(w(i),ν(i))} Σ_{j=0}^{N−1} c′A_L^j (w(j) − Lν(j))

The maximizations for computing φ_N^X̄ and φ_N^Ū are subject to the con-
straints w(i) ∈ W and ν(i) ∈ N for all i ∈ I≥0.
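For box-valued W and N these maximizations are linear programs whose optimal values are sums of absolute coefficients times the box half-widths, so φ_N^X̄ can be computed directly. A sketch under that box assumption (single measurement; matrices and bounds are illustrative):

import numpy as np

def phi_N(c, AK, AL, L, C, N, wbar, nubar):
    # margin for one constraint c'x <= d, with |w_i| <= wbar_i, |nu| <= nubar,
    # xtilde(0) = 0; accumulates the coefficient of each w(i) and nu(i)
    n = AK.shape[0]
    cw1 = [np.zeros(n) for _ in range(N)]; cn1 = [0.0] * N
    cw2 = [np.zeros(n) for _ in range(N)]; cn2 = [0.0] * N
    for j in range(N):
        a = c @ np.linalg.matrix_power(AK, j)     # c'AK^j
        cn1[j] += float(a @ L)                    # ... L nu(j) term
        for i in range(j):                        # terms entering via xtilde(j)
            b = a @ L @ C @ np.linalg.matrix_power(AL, j - 1 - i)
            cw1[i] += b; cn1[i] -= float(b @ L)
        a2 = c @ np.linalg.matrix_power(AL, j)    # c'AL^j
        cw2[j] += a2; cn2[j] -= float(a2 @ L)
    term = lambda cw, cn: (sum(np.abs(v) @ wbar for v in cw)
                           + sum(abs(s) * nubar for s in cn))
    return term(cw1, cn1) + term(cw2, cn2)

AK = np.array([[0.5, 0.1], [0.0, 0.4]]); AL = np.array([[0.6, 0.0], [0.1, 0.5]])
L = np.array([[0.3], [0.2]]); C = np.array([[1.0, 0.0]])
print(phi_N(np.array([1.0, 0.0]), AK, AL, L, C, 10, np.array([0.1, 0.1]), 0.01))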
Table 5.1: Summary of the sets and variables used in output MPC.
to be controlled is described by
x + = Ax + Bd d + Bu + wx
y = Cx + Cd d + ν
r = Hy        r̃ := r − r̄
behavior by
d+ = d + wd
where wd is a bounded disturbance taking values in the compact set
Wd ; in practice d is bounded, although this is not implied by our model.
We assume that x ∈ Rn , d ∈ Rp , u ∈ Rm , y ∈ Rr , and e ∈ Rq , q ≤ r ,
and that the system to be controlled is subject to the usual state and
control constraints
x∈X u∈U
We assume X is polyhedral and U is polytopic.
Given the many sets that are required to specify the output feedback
case we are about to develop, Table 5.1 may serve as a reference for the
sets defined in the chapter and the variables that are members of these
sets.
5.5.1 Estimation
φ⁺ = Ãφ + B̃u + w
y = C̃φ + ν

φ̂⁺ = Ãφ̂ + B̃u + δ        δ := L(y − C̃φ̂)

φ̃⁺ = Ãφ̃ + w − L(C̃φ̃ + ν)
Clearly w̃ defined by w̃ = w − Lν takes values in the compact set

W̃ := W ⊕ (−LN)

If w and ν are zero, φ̃ decays to zero exponentially fast so that x̂ → x
and d̂ → d exponentially fast. Since ρ(Ã_L) < 1 and W̃ is compact, there
exists a robust positive invariant set Φ for φ̃⁺ = Ã_L φ̃ + w̃, w̃ ∈ W̃,
satisfying

Φ = Ã_L Φ ⊕ W̃

Hence φ̃(i) ∈ Φ for all i ∈ I≥0 if φ̃(0) ∈ Φ. Since φ̃ = (x̃, d̃) ∈ ℝⁿ × ℝᵖ
where x̃ := x − x̂ and d̃ := d − d̂, we define the sets Σx and Σd as follows

Σx := [Iₙ 0] Φ        Σd := [0 Iₚ] Φ
5.5.2 Control
δ := Lỹ = L(C̃φ̃ + ν)

x̂⁺ = Ax̂ + Bd d̂ + Bu + δx
d̂⁺ = d̂ + δd

δx := Lx ỹ = Lx(C̃φ̃ + ν)        δd := Ld ỹ = Ld(C̃φ̃ + ν)
x̄ + = Ax̄ + Bd d̂ + B ū
d̄+ = d̄
in which the initial state is (x̂, d̂). We obtain ū = κ̄N (x̄, d̄, r̄ ) by solving
a nominal optimal control problem defined later and set u = ū + Ke,
e := x̂ − x̄ where K is chosen so that ρ(AK ) < 1, AK := A + BK; this
is possible since (A, B) is assumed to be stabilizable. It follows that
e := x̂ − x̄ satisfies the difference equation
e+ = AK e + δx δx ∈ ∆x
AK S ⊕ ∆x = S
Hence e(i) ∈ S for all i ∈ I≥0 if e(0) ∈ S. So, as in Proposition 5.3, the
states and controls of the estimator and nominal system satisfy x̂(i) ∈
{x̄(i)} ⊕ S and u(i) ∈ {ū(i)} ⊕ KS for all i ∈ I≥0 if the initial states
x̂(0) and x̄(0) satisfy x̂(0) ∈ {x̄(0)} ⊕ S. Using the fact established
previously that x̃(i) ∈ Σx for all i, we can also conclude that x(i) =
x̄(i) + e(i) + x̃(i) ∈ {x̄(i)} ⊕ Γ and that u(i) = ū(i) + Ke(i) ∈ {ū(i)} ⊕ KS
for all i where Γ := S ⊕ Σx provided, of course, that φ(0) ∈ {φ̂(0)} ⊕ Φ
and x̂(0) ∈ {x̄(0)} ⊕ S. These conditions are equivalent to φ̃(0) ∈ Φ
and e(0) ∈ S where, for all i, e(i) := x̂(i) − x̄(i). Hence x(i) lies in X
and u(i) lies in U if x̄(i) ∈ X̄ := X ⊖ Γ and ū(i) ∈ Ū := U ⊖ KS.
Thus x̂(i) and x(i) evolve in known neighborhoods of the central
state x̄(i) that we can control. Although we know that the uncontrol-
lable state d(i) lies in the set {d̂(i)} ⊕ Σd for all i, the evolution of d̂(i)
where L(·) is an appropriate cost function; e.g., L(x̄, ū) = (1/2)|ū|²_R̄.
The equality constraints in this optimization problem can be satisfied
if the matrix

[ I − A   −B ]
[  HC      0 ]

has full rank. As the notation indicates, the target equilibrium pair
(x̄s, ūs)(d̂, r̄) is not constant, but varies with the estimate of the
disturbance state d.
MPC algorithm. The control objective is to steer the central state $\bar{x}$ to a small neighborhood of the target state $\bar{x}_s(\hat{d}, \bar{r})$ while satisfying the state and control constraints $x \in X$ and $u \in U$. It is desirable that $\bar{x}(i)$ converges to $\bar{x}_s(\hat{d}, \bar{r})$ if $\hat{d}$ remains constant, in which case $x(i)$ converges to the set $\{\bar{x}_s(\hat{d}, \bar{r})\} \oplus \Gamma$. We are now in a position to specify the optimal control problem whose solution yields $\bar{u} = \bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})$ and, hence, $u = \bar{u} + K(\hat{x} - \bar{x})$. To achieve this objective, we define the deterministic optimal control problem

$$\bar{\mathbb{P}}_N(\bar{x}, \hat{d}, \bar{r}):\quad V_N^0(\bar{x}, \hat{d}, \bar{r}) := \min_{\bar{\mathbf{u}}}\{V_N(\bar{x}, \hat{d}, \bar{r}, \bar{\mathbf{u}}) \mid \bar{\mathbf{u}} \in \bar{\mathcal{U}}_N(\bar{x}, \hat{d}, \bar{r})\}$$
in which the cost $V_N(\cdot)$ and the constraint set $\bar{\mathcal{U}}_N(\bar{x}, \hat{d}, \bar{r})$ are defined by

$$V_N(\bar{x}, \hat{d}, \bar{r}, \bar{\mathbf{u}}) := \sum_{i=0}^{N-1} \ell\big(\bar{x}(i) - \bar{x}_s(\hat{d}, \bar{r}),\ \bar{u}(i) - \bar{u}_s(\hat{d}, \bar{r})\big) + V_f\big(\bar{x}(N) - \bar{x}_s(\hat{d}, \bar{r})\big)$$

$$\bar{\mathcal{U}}_N(\bar{x}, \hat{d}, \bar{r}) := \big\{\bar{\mathbf{u}} \,\big|\, \bar{x}(i) \in \bar{X},\ \bar{u}(i) \in \bar{U}\ \forall i \in I_{0:N-1},\ \bar{x}(N) \in \bar{X}_f(\bar{x}_s(\hat{d}, \bar{r}))\big\}$$
and, for each i, x̄(i) = φ̄(i; x̄, d̂, ū), the solution of x̄ + = Ax̄ + Bd d̂ + B ū
when the initial state is x̄, the control sequence is ū, and the disturbance
d̂ is constant, i.e., satisfies the nominal difference equation d̂+ = d̂. The
set of feasible (x̄, d̂, r̄ ) and the set of feasible states x̄ for P̄N (x̄, d̂, r̄ )
are defined by
$$\bar{\mathcal{F}}_N := \{(\bar{x}, \hat{d}, \bar{r}) \mid \bar{\mathcal{U}}_N(\bar{x}, \hat{d}, \bar{r}) \neq \emptyset\} \qquad \bar{X}_N(\hat{d}, \bar{r}) := \{\bar{x} \mid (\bar{x}, \hat{d}, \bar{r}) \in \bar{\mathcal{F}}_N\}$$
The terminal cost is zero when the terminal state is equal to the target
state. The solution to P̄N (x̄, d̂, r̄ ) is
ū0 (x̄, d̂, r̄ ) = {ū0 (0; x̄, d̂, r̄ ), ū0 (1; x̄, d̂, r̄ ), . . . , ū0 (N − 1; x̄, d̂, r̄ )}
where ū0 (0; x̄, d̂, r̄ ) is the first element in the sequence ū0 (x̄, d̂, r̄ ). The
control u applied to the plant and the observer is u = κN (x̂, x̄, d̂, r̄ )
where $\kappa_N(\cdot)$ is defined by

$$\kappa_N(\hat{x}, \bar{x}, \hat{d}, \bar{r}) := \bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r}) + K(\hat{x} - \bar{x})$$

At each time $i$, the "nominal" optimal control problem $\bar{\mathbb{P}}_N(\bar{x}, \hat{d}, \bar{r})$ is solved to obtain the current "nominal" control action $\bar{u} = \bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})$ and the control action $u = \bar{u} + K(\hat{x} - \bar{x})$.
The value function satisfies the inequalities

$$V_N^0(\bar{x}, \hat{d}, \bar{r}) \geq c_1\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2$$
$$V_N^0(\bar{x}, \hat{d}, \bar{r}) \leq c_2\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2$$
$$V_N^0(\bar{x}^+, \hat{d}, \bar{r}) \leq V_N^0(\bar{x}, \hat{d}, \bar{r}) - c_1\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2$$

with $\bar{x}^+ = A\bar{x} + B_d\hat{d} + B\bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})$, for all $(\bar{x}, \hat{d}, \bar{r}) \in \bar{\mathcal{F}}_N$. The first and last inequalities follow from our assumptions; we assume the existence of the upper bound in the second inequality. The inequalities hold for all $(\bar{x}, \hat{d}, \bar{r}) \in \bar{\mathcal{F}}_N$. Note that the last inequality does NOT ensure $V_N^0(\bar{x}^+, \hat{d}^+, \bar{r}) \leq V_N^0(\bar{x}, \hat{d}, \bar{r}) - c_1\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2$ with $\bar{x}^+ = A\bar{x} + B_d\hat{d} + B\bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})$ and $\hat{d}^+ := \hat{d} + \delta_d$. The perturbation due to $\delta_d$ has to be taken into account when analyzing stability.
Constant $\hat{d}$. If $\hat{d}$ remains constant, $\bar{x}_s(\hat{d}, \bar{r})$ is exponentially stable for $\bar{x}^+ = A\bar{x} + B_d\hat{d} + B\bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})$ with a region of attraction $\bar{X}_N(\hat{d}, \bar{r})$. It can be shown, as in the proof of Proposition 5.6, that the set $\mathcal{A}(\hat{d}, \bar{r}) := (\{\bar{x}_s(\hat{d}, \bar{r})\} \oplus S) \times \{\bar{x}_s(\hat{d}, \bar{r})\}$ is exponentially stable for the composite system $\hat{x}^+ = A\hat{x} + B_d\hat{d} + B\kappa_N(\hat{x}, \bar{x}, \hat{d}, \bar{r}) + \delta_x$, $\bar{x}^+ = A\bar{x} + B_d\hat{d} + B\bar{\kappa}_N(\bar{x}, \hat{d})$, $\delta_x \in \Delta_x$, with a region of attraction $(\bar{X}_N(\hat{d}, \bar{r}) \oplus S) \times \bar{X}_N(\hat{d}, \bar{r})$. Hence $x(i) \in \{\bar{x}(i)\} \oplus \Gamma$ tends to the set $\{\bar{x}_s(\hat{d}, \bar{r})\} \oplus \Gamma$ as $i \to \infty$. If, in addition, $W = \{0\}$ and $N = \{0\}$, then $\Delta = \{0\}$ and $\Gamma = \Sigma = S = \{0\}$ so that $x(i) \to \bar{x}_s(d, \bar{r})$ and $\tilde{r}(i) \to 0$ as $i \to \infty$.
Slowly varying $\hat{d}$. If $\hat{d}$ is varying, the descent property of $V_N^0(\cdot)$ is modified and it is necessary to obtain an upper bound for $V_N^0(A\bar{x} + B_d(\hat{d} + \delta_d) + B\bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r}),\ \hat{d} + \delta_d,\ \bar{r})$. We make use of Proposition 3.4 in Chapter 3. If $\bar{X}_N$ is compact and if the maps $(\hat{d}, \bar{r}) \mapsto \bar{x}_s(\hat{d}, \bar{r})$ and $(\hat{d}, \bar{r}) \mapsto \bar{u}_s(\hat{d}, \bar{r})$ are both continuous in some compact domain, then, since $V_N(\cdot)$ is then continuous in a compact domain $\mathcal{A}$, it follows from the properties of $V_N^0(\cdot)$ and Proposition 3.4 that there exists a $\mathcal{K}_\infty$ function $\alpha(\cdot)$ such that

$$V_N^0(\bar{x}, \hat{d}, \bar{r}) \geq c_1\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2$$
$$V_N^0(\bar{x}, \hat{d}, \bar{r}) \leq c_2\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2$$
$$V_N^0(\bar{x}^+, \hat{d}^+, \bar{r}) \leq V_N^0(\bar{x}, \hat{d}, \bar{r}) - c_1\,|\bar{x} - \bar{x}_s(\hat{d}, \bar{r})|^2 + \alpha(|\delta_d|)$$

for all $(\bar{x}, \hat{d}, \delta_d, \bar{r}) \in \mathcal{V}$; here $(\bar{x}, \hat{d})^+ := (\bar{x}^+, \hat{d}^+)$, $\bar{x}^+ = A\bar{x} + B_d(\hat{d} + \delta_d) + B\bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})$ and $\hat{d}^+ = \hat{d} + \delta_d$. A suitable choice for $\mathcal{A}$ is $\mathcal{V} \times D \times \{\bar{r}\} \times \mathcal{U}^N$ with $\mathcal{V}$ the closure of $\mathrm{lev}_a\,V_N^0(\cdot)$ for some $a > 0$, and $D$ a compact set containing $d$ and $\hat{d}$. It follows that there exists a $\gamma \in (0, 1)$ such that

$$V_N^0((\bar{x}, \hat{d})^+, \bar{r}) \leq \gamma V_N^0(\bar{x}, \hat{d}, \bar{r}) + \alpha(|\delta_d|)$$

with $\gamma = 1 - c_1/c_2 \in (0, 1)$. Assuming that $\bar{\mathbb{P}}_N(\bar{x}, \hat{d}, \bar{r})$ is recursively feasible,

$$V_N^0(\bar{x}(i), \hat{d}(i), \bar{r}) \leq \gamma^iV_N^0(\bar{x}(0), \hat{d}(0), \bar{r}) + \alpha(|\delta_d|)\,(1 - \gamma^i)/(1 - \gamma)$$
in which $\bar{x}(0) = x(0)$ and $\hat{d}(0) = d(0)$. It then follows from the last inequality and the bounds on $V_N^0(\cdot)$ that

$$\big|\bar{x}(i) - \bar{x}_s(\hat{d}(i), \bar{r})\big| \leq \gamma^{i/2}(c_2/c_1)^{1/2}\big|\bar{x}(0) - \bar{x}_s(\hat{d}(0), \bar{r})\big| + c(i)$$

with $c(i) := [\alpha(|\delta_d|)(1 - \gamma^i)/(1 - \gamma)]^{1/2}$ so that $c(i) \to c := [\alpha(|\delta_d|)/(1 - \gamma)]^{1/2}$ and $|\bar{x}(i) - \bar{x}_s(\hat{d}(i), \bar{r})| \to c$ as $i \to \infty$. Here we have made use of the fact that $(a + b)^{1/2} \leq a^{1/2} + b^{1/2}$.
Let C ⊂ Rn denote the set {x | |x| ≤ c}. Then x̄(i) → {x̄s (d̂(i),
r̄ )}⊕C, x̂(i) → {x̄s (d̂(i), r̄ )}⊕C ⊕S and x(i) → {x̄s (d̂(i), r̄ )}⊕C ⊕S ⊕Σ
as i → ∞. Since c(i) = [α(δd )(1 − γ i )/(1 − γ)]1/2 → 0 as δd → 0,
it follows that x̄(i) → x̄s (d̂(i), r̄ ) as i → ∞. The sizes of S and Σ
are dictated by the process and measurement disturbances, w and ν
respectively.
Recursive feasibility. The result that x(i) → {x̄s (d̂(i), r̄ )} ⊕ C ⊕ Γ ,
Γ := S⊕Σ, is useful because it gives an asymptotic bound on the tracking
error. But it does depend on the recursive feasibility of the optimal
control problem PN (·), which does not necessarily hold because of the
variation of d̂ with time. Tracking of a random reference signal has
been considered in the literature, but not in the context of output MPC.
We show next that PN (·) is recursively feasible and that the tracking
error remains bounded if the estimate d̂ of the disturbance d varies
sufficiently slowly—that is if δd in the difference equation d̂+ = d̂+δd is
sufficiently small. This can be ensured by design of the state estimator.
To establish recursive feasibility, assume that the current "state" is $(\bar{x}, \hat{d}, \bar{r})$ and $\bar{x} \in \bar{X}_N(\hat{d}, \bar{r})$. In other words, we assume $\bar{\mathbb{P}}_N(\bar{x}, \hat{d}, \bar{r})$ is feasible and $\bar{x}_N := \bar{\phi}(N; \bar{x}, \bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r})) \in \bar{X}_f(\bar{x}_s(\hat{d}, \bar{r}))$. If the usual stability conditions are satisfied, problem $\bar{\mathbb{P}}_N(\bar{x}^+, \hat{d}, \bar{r})$ is also feasible so that $\bar{x}^+ = A\bar{x} + B_d\hat{d} + B\bar{\kappa}_N(\bar{x}, \hat{d}, \bar{r}) \in \bar{X}_N(\hat{d}, \bar{r})$. But $\hat{d}^+ = \hat{d} + \delta_d$, so that $\bar{\mathbb{P}}_N(\bar{x}^+, \hat{d}^+, \bar{r})$ is not necessarily feasible since $\bar{x}_N$, which lies in $\bar{X}_f(\bar{x}_s(\hat{d}, \bar{r}))$, does not necessarily lie in $\bar{X}_f(\bar{x}_s(\hat{d}^+, \bar{r}))$. Let the terminal set $\bar{X}_f(\bar{x}_s(\hat{d}, \bar{r})) := \{x \mid V_f(x - \bar{x}_s(\hat{d}, \bar{r})) \leq c\}$. If the usual stability conditions are satisfied, for each $\bar{x}_N \in \bar{X}_f(\bar{x}_s(\hat{d}, \bar{r}))$, there exists a $u = \kappa_f(\bar{x}_N)$ that steers $\bar{x}_N$ to a state $\bar{x}_N^+$ in $\{x \mid V_f(x - \bar{x}_s(\hat{d}, \bar{r})) \leq e\}$, $e < c$. Consequently, there exists a feasible control sequence $\tilde{\mathbf{u}}(\bar{x}) \in \bar{\mathcal{U}}_N(\bar{x}, \hat{d}, \bar{r})$ that steers $\bar{x}^+$ to a state $\bar{x}_N^+ \in \{x \mid V_f(x - \bar{x}_s(\hat{d}, \bar{r})) \leq e\}$. If the map $\hat{d} \mapsto \bar{x}_s(\hat{d}, \bar{r})$ is uniformly continuous, there exists a constant $a > 0$ such that $|\delta_d| \leq a$ implies that $\bar{x}_N^+$ lies also in $\bar{X}_f(\bar{x}_s(\hat{d}^+, \bar{r})) = \{x \mid V_f(x - \bar{x}_s(\hat{d}^+, \bar{r})) \leq c\}$. Thus the control sequence $\tilde{\mathbf{u}}(\bar{x})$ also steers $\bar{x}^+$ to the set $\bar{X}_f(\bar{x}_s(\hat{d}^+, \bar{r}))$ and hence lies in $\bar{\mathcal{U}}_N(\bar{x}^+, \hat{d}^+, \bar{r})$. Hence $\bar{\mathbb{P}}_N(\bar{x}^+, \hat{d}^+, \bar{r})$ is recursively feasible if $|\delta_d| \leq a$.
5.7 Notes
The problem of output feedback control has been extensively discussed
in the general control literature. For linear systems, it is well known
that a stabilizing state feedback controller and an observer may be sep-
arately designed and combined to give a stabilizing output feedback
controller (the separation principle). For nonlinear systems, Teel and
Praly (1994) show that global stabilizability and complete uniform ob-
servability are sufficient to guarantee semiglobal stabilizability when
a dynamic observer is used, and provide useful references to related
work on this topic.
Although output MPC, in which nominal MPC is combined with a
separately designed observer, is widely used in industry since the state
5.8 Exercises
$$d_H(FA, FB) \leq |F|\,d_H(A, B)$$

in which $|F|$ is the induced norm of $F$ satisfying $|Fx| \leq |F|\,|x|$ and $|x| := d(x, 0)$.
in which

$$\Phi(i) = \tilde{A}\Phi(i-1) \oplus \tilde{B}\Psi \qquad \Phi = \tilde{A}\Phi \oplus \tilde{B}\Psi$$
subject to

$$x^+ = Ax + Bu$$

In this section, we first take the direct but brute-force approach to finding the optimal control law. We write the model solution as

$$\begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(N) \end{bmatrix} = \underbrace{\begin{bmatrix} A \\ A^2 \\ \vdots \\ A^N \end{bmatrix}}_{\mathcal{A}}x(0) + \underbrace{\begin{bmatrix} B & 0 & \cdots & 0 \\ AB & B & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A^{N-1}B & A^{N-2}B & \cdots & B \end{bmatrix}}_{\mathcal{B}}\begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(N-1) \end{bmatrix} \tag{6.1}$$

or, more compactly,

$$\mathbf{x} = \mathcal{A}x(0) + \mathcal{B}\mathbf{u}$$
in which

$$\mathcal{Q} = \mathrm{diag}\big(Q, Q, \ldots, P_f\big) \in \mathbb{R}^{Nn \times Nn} \qquad \mathcal{R} = \mathrm{diag}\big(R, R, \ldots, R\big) \in \mathbb{R}^{Nm \times Nm} \tag{6.2}$$
Eliminating the state sequence. Substituting the model into the objective function and eliminating the state sequence gives a quadratic function of $\mathbf{u}$

$$V(x(0), \mathbf{u}) = (1/2)\mathbf{u}'(\mathcal{B}'\mathcal{Q}\mathcal{B} + \mathcal{R})\mathbf{u} + \mathbf{u}'\mathcal{B}'\mathcal{Q}\mathcal{A}\,x(0) + \text{constant}$$

and the optimal solution for the entire set of inputs is obtained in one shot

$$\mathbf{u}^0(x(0)) = -(\mathcal{B}'\mathcal{Q}\mathcal{B} + \mathcal{R})^{-1}\mathcal{B}'\mathcal{Q}\mathcal{A}\,x(0)$$

and the optimal cost is

$$V^0(x(0)) = \frac{1}{2}x'(0)\big(Q + \mathcal{A}'\mathcal{Q}\mathcal{A} - \mathcal{A}'\mathcal{Q}\mathcal{B}(\mathcal{B}'\mathcal{Q}\mathcal{B} + \mathcal{R})^{-1}\mathcal{B}'\mathcal{Q}\mathcal{A}\big)x(0)$$
It is not immediately clear that the K(0) and V 0 given above from the
least squares approach are equivalent to the result from the Riccati
iteration, (1.10)–(1.14) of Chapter 1, but since we have solved the same
optimization problem, the two results are the same.2
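The equivalence can be checked numerically. The following Octave sketch (illustrative, not part of the text's software release) builds the lifted matrices of (6.1)–(6.2) for a small single-input example and compares the first feedback gain of the one-shot least squares solution with the gain from the Riccati iteration.

% Compare dense least squares and Riccati iteration for a small LQ problem.
A = [4/3 -2/3; 1 0]; B = [1; 0];     % illustrative system (m = 1 input)
n = 2; m = 1; N = 5;
Q = eye(n); R = 1; Pf = Q;
P = Pf;                              % backward Riccati iteration
for k = N:-1:1
  K = -(B'*P*B + R) \ (B'*P*A);      % on exit, K is the gain at k = 0
  P = Q + A'*P*A - A'*P*B*((B'*P*B + R) \ (B'*P*A));
end
AA = zeros(N*n, n); BB = zeros(N*n, N*m); Ai = eye(n);
for i = 1:N                          % lifted model of (6.1)
  Ai = A*Ai;  AA((i-1)*n+(1:n), :) = Ai;
  for j = 1:i
    BB((i-1)*n+(1:n), (j-1)*m+(1:m)) = A^(i-j)*B;
  end
end
QQ = kron(eye(N), Q); QQ(end-n+1:end, end-n+1:end) = Pf;
RR = kron(eye(N), R);
Kls = -(BB'*QQ*BB + RR) \ (BB'*QQ*AA);  % all N gains stacked
disp(norm(Kls(1:m, :) - K))             % first gain matches the DP gain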
Retaining the state sequence. In this section we set up the least
squares problem again, but with an eye toward improving its efficiency.
Retaining the state sequence and adjoining the model equations as
1 Would you prefer to invert by hand 100 (1 × 1) matrices or a single (100 × 100)
dense matrix?
2 Establishing this result directly is an exercise in using the partitioned matrix inver-
sion formula. The next section provides another way to show they are equivalent.
in which

$$H = \mathrm{diag}\big(R, Q, R, Q, \ldots, R, P_f\big)$$

The constraints are

$$Dz = d$$

in which

$$D = -\begin{bmatrix} B & -I & & & \\ & A & B & -I & \\ & & & \ddots & \\ & & A & B & -I \end{bmatrix} \qquad d = \begin{bmatrix} A \\ 0 \\ \vdots \\ 0 \end{bmatrix}x(0)$$
We now substitute these results into (1.57) and obtain the linear algebra problem

$$\begin{bmatrix} H & D' \\ D & 0 \end{bmatrix}\begin{bmatrix} z \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ d \end{bmatrix} \qquad \lambda = (\lambda(1), \lambda(2), \ldots, \lambda(N))$$

With the decision variables ordered by stage, $z = (u(0), x(1), u(1), x(2), \ldots, u(N-1), x(N))$, the coefficient matrix is banded: the gradient rows contain only the blocks $R$, $Q$, $P_f$, $B'$, $-I$, and $A'$; the model rows contain only $B$, $-I$, and $A$; and the right-hand side is nonzero only in the first model row, where it equals $-Ax(0)$.
Method                      FLOPs
dynamic programming (DP)    N m^3
dense least squares         N^3 m^3
banded least squares        N(2n + m)(3n + m)^2
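A minimal Octave sketch of assembling and solving the banded (sparse) formulation follows; all dimensions and matrices are illustrative assumptions, and input inequality constraints are omitted so the KKT system can be solved by a single sparse factorization.

% Banded KKT system for the LQ problem,
% z = (u(0), x(1), ..., u(N-1), x(N)); illustrative data only.
n = 10; m = 3; N = 40;
A = 0.9*eye(n); B = ones(n, m)/n;
Q = eye(n); R = eye(m); Pf = Q; x0 = ones(n, 1);
H = kron(speye(N), blkdiag(R, Q));
H(end-n+1:end, end-n+1:end) = Pf;          % terminal penalty
Dm = spalloc(N*n, N*(n+m), N*n*(2*n+m));   % model constraints Dm*z = d
for i = 1:N
  r = (i-1)*n+(1:n);  c0 = (i-1)*(n+m);
  Dm(r, c0+(1:m)) = B;  Dm(r, c0+m+(1:n)) = -speye(n);
  if i > 1, Dm(r, c0-n+(1:n)) = A; end
end
d = [-A*x0; zeros((N-1)*n, 1)];            % B u(0) - x(1) = -A x(0)
KKT = [H, Dm'; Dm, sparse(N*n, N*n)];
sol = KKT \ [zeros(N*(n+m), 1); d];        % sparse banded solve
u0 = sol(1:m);                             % first control move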
Notice that in terms of the Riccati matrix, we also have the relationship
Definition 6.2 (Lyapunov stability). The zero solution $x(k) = 0$ for all $k$ is stable (in the sense of Lyapunov) at $k = k_0$ if for any $\varepsilon > 0$ there exists a $\delta(k_0, \varepsilon) > 0$ such that $|x(k_0)| \leq \delta$ implies $|x(k)| \leq \varepsilon$ for all $k \geq k_0$.

$$|\phi(i; x)| \leq c\,|x|\,\gamma^i$$
Consider next the suboptimal MPC controller. Let the system satisfy $(x^+, \mathbf{u}^+) = (f(x, u), g(x, \mathbf{u}))$ with initial sequence $\mathbf{u}(0) = h(x(0))$.
Then the origin is exponentially stable for the closed-loop system under suboptimal MPC with region of attraction $X_N$ if either of the following additional assumptions holds

(a) $U$ is compact. In this case, $X_N$ may be unbounded.
Proof. First we show that the origin of the extended state (x, u) is ex-
ponentially stable for x(0) ∈ XN .
(a) For the case $U$ compact, we have $|\mathbf{u}| \leq d\,|x|$, $x \in rB$. Consider the optimization

$$\max_{\mathbf{u} \in U^N}\ |\mathbf{u}| = s > 0$$

The solution exists because $X_N$ is compact and $h(\cdot)$ and $V(\cdot)$ are continuous. Define the compact set $\bar{U}$ by

$$\bar{U} = \{\mathbf{u} \mid V(x, \mathbf{u}) \leq \bar{V},\ x \in X_N\}$$
The set is bounded because V (x, u) ≥ a |(x, u)|2 ≥ a |u|2 . The set is
closed because V is continuous. The significance of this set is that for
which gives |x| ≥ c ′ |(x, u)| with c ′ = 1/(1 + d′ ) > 0. Hence, there
exists a3 = c(c ′ )2 such that V (x + , u+ ) − V (x, u) ≤ −a3 |(x, u)|2 for
all x ∈ XN . Therefore the extended state (x, u) satisfies the standard
conditions of an exponential stability Lyapunov function (see Theorem
B.19 in Appendix B) with a1 = a, a2 = b, a3 = c(c ′ )2 , σ = 2 for (x,
u) ∈ XN ×UN (case (a)) or XN × Ū (case (b)). Therefore for all x(0) ∈ XN ,
$k \geq 0$,

$$\big|(x(k), \mathbf{u}(k))\big| \leq \alpha\,\big|(x(0), \mathbf{u}(0))\big|\,\gamma^k$$

in which $\alpha > 0$ and $0 < \gamma < 1$.
Finally we remove the input sequence and establish that the origin
for the state (rather than the extended state) is exponentially stable for
the closed-loop system. We have for all x(0) ∈ XN and k ≥ 0
$$(x^+, e^+) = f(x, e)$$

with a zero steady-state solution, $f(0, 0) = (0, 0)$. Assume there exists a function $V: \mathbb{R}^{n+m} \to \mathbb{R}_{\geq 0}$ that satisfies the following for all $(x, e) \in \mathbb{R}^n \times \mathbb{R}^m$

$$a(|x|^\sigma + |e|^\gamma) \leq V((x, e)) \leq b(|x|^\sigma + |e|^\gamma) \tag{6.6}$$
$$V(f(x, e)) - V((x, e)) \leq -c(|x|^\sigma + |e|^\gamma) \tag{6.7}$$

with constants $a, b, c, \sigma, \gamma > 0$. Then the following holds for all $(x(0), e(0))$ and $k \in I_{\geq 0}$

$$\big|(x(k), e(k))\big| \leq \delta\big(|(x(0), e(0))|\big)\lambda^k$$

with $\lambda < 1$ and $\delta(\cdot) \in \mathcal{K}_\infty$.
The proof of this lemma is discussed in Exercise 6.9. We also require
a converse theorem for exponential stability.
Lemma 6.7 (Converse theorem for exponential stability). If the zero
steady-state solution of x + = f (x) is globally exponentially stable, then
there exists Lipschitz continuous V : Rn → R≥0 that satisfies the follow-
ing: there exist constants a, b, c, σ > 0, such that for all x ∈ Rn
$$a\,|x|^\sigma \leq V(x) \leq b\,|x|^\sigma$$
$$V(f(x)) - V(x) \leq -c\,|x|^\sigma$$
Moreover, any σ > 0 is valid, and the constant c can be chosen as large
as one wishes.
The proof of this lemma is discussed in Exercise B.3.
$$V_1(x_1(0), \mathbf{u}_1, \mathbf{u}_2) = \sum_{k=0}^{N-1} \ell_1\big(x_1(k), u_1(k)\big) + V_{1f}\big(x_1(N)\big)$$

in which

$$x_1 = \begin{bmatrix} x_{11} \\ x_{12} \end{bmatrix}$$
Note that the first local objective is affected by the second player’s
inputs through the model evolution of x1 , i.e., through the x12 states.
We choose the stage cost to account for the first player’s inputs and
outputs
in which

$$Q_1 = C_1'\bar{Q}_1C_1 \qquad C_1 = \begin{bmatrix} C_{11} & C_{12} \end{bmatrix}$$

and the second player's objective is

$$V_2(x_2(0), \mathbf{u}_1, \mathbf{u}_2) = \sum_{k=0}^{N-1} \ell_2\big(x_2(k), u_2(k)\big) + V_{2f}\big(x_2(N)\big)$$
$$\text{s.t.}\quad x_1^+ = A_1x_1 + \bar{B}_{11}u_1 + \bar{B}_{12}u_2 \qquad x_2^+ = A_2x_2 + \bar{B}_{22}u_2 + \bar{B}_{21}u_1$$

in which

$$A_1 = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{12} \end{bmatrix} \qquad A_2 = \begin{bmatrix} A_{22} & 0 \\ 0 & A_{21} \end{bmatrix}$$

$$\bar{B}_{11} = \begin{bmatrix} B_{11} \\ 0 \end{bmatrix} \qquad \bar{B}_{12} = \begin{bmatrix} 0 \\ B_{12} \end{bmatrix} \qquad \bar{B}_{21} = \begin{bmatrix} 0 \\ B_{21} \end{bmatrix} \qquad \bar{B}_{22} = \begin{bmatrix} B_{22} \\ 0 \end{bmatrix}$$
This optimal control problem is more complex than all of the dis-
tributed cases to follow because the decision variables include both
u1 and u2 . Because the performance is optimal, centralized control is a
natural benchmark against which to compare the distributed cases: co-
operative, noncooperative, and decentralized MPC. The plantwide stage
cost and terminal cost can be expressed as quadratic functions of the
subsystem states and inputs
in which

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \qquad Q = \begin{bmatrix} \rho_1Q_1 & 0 \\ 0 & \rho_2Q_2 \end{bmatrix}$$

$$R = \begin{bmatrix} \rho_1R_1 & 0 \\ 0 & \rho_2R_2 \end{bmatrix} \qquad P_f = \begin{bmatrix} \rho_1P_{1f} & 0 \\ 0 & \rho_2P_{2f} \end{bmatrix} \tag{6.9}$$
$$\min_{\mathbf{u}}\ V(x(0), \mathbf{u}) \qquad \text{s.t.}\quad x^+ = Ax + Bu \tag{6.10}$$

$$V_1(x_1(0), \mathbf{u}_1) = \sum_{k=0}^{N-1} \ell_1\big(x_1(k), u_1(k)\big) + V_{1f}\big(x_1(N)\big)$$

$$\text{s.t.}\quad x_1^+ = A_1x_1 + \bar{B}_{11}u_1$$

$$\text{s.t.}\quad x_1^+ = A_1x_1 + \bar{B}_{11}u_1 + \bar{B}_{12}u_2$$
$$\text{s.t.}\quad Dz = d$$

in which

$$D = -\begin{bmatrix} \bar{B}_{11} & -I & & \\ & A_1 & \bar{B}_{11} & -I \\ & & & \ddots \\ & & A_1 & \bar{B}_{11} & -I \end{bmatrix} \qquad d = \begin{bmatrix} A_1x_1(0) + \bar{B}_{12}u_2(0) \\ \bar{B}_{12}u_2(1) \\ \vdots \\ \bar{B}_{12}u_2(N-1) \end{bmatrix}$$
$$\mathbf{u}_1^0 = K_1x_1(0) + L_1\mathbf{u}_2$$
Figure 6.1: Convex step from $(\mathbf{u}_1^p, \mathbf{u}_2^p)$ to $(\mathbf{u}_1^{p+1}, \mathbf{u}_2^{p+1})$; the parameters $w_1$, $w_2$ with $w_1 + w_2 = 1$ determine the location of the next iterate on the line joining the two players' optimizations: $(\mathbf{u}_1^0, \mathbf{u}_2^p)$ and $(\mathbf{u}_1^p, \mathbf{u}_2^0)$.
and we see player one’s optimal decision depends linearly on his ini-
tial state, but also on player two’s decision. This is the key difference
between decentralized control and noncooperative control. In nonco-
operative control, player two’s decisions are communicated to player
one and player one accounts for them in optimizing the local objective.
or

$$\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix}^{p+1} = \begin{bmatrix} w_1I & 0 \\ 0 & w_2I \end{bmatrix}\begin{bmatrix} \mathbf{u}_1^0 \\ \mathbf{u}_2^0 \end{bmatrix} + \begin{bmatrix} (1-w_1)I & 0 \\ 0 & (1-w_2)I \end{bmatrix}\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix}^p$$

in which

$$(I - L)^{-1}K = \begin{bmatrix} w_1I & -w_1L_1 \\ -w_2L_2 & w_2I \end{bmatrix}^{-1}\begin{bmatrix} w_1K_1 & 0 \\ 0 & w_2K_2 \end{bmatrix} = \begin{bmatrix} I & -L_1 \\ -L_2 & I \end{bmatrix}^{-1}\begin{bmatrix} K_1 & 0 \\ 0 & K_2 \end{bmatrix}$$
$$\mathbf{u}^{p+1} = L^p\mathbf{u}^{[0]} + \sum_{j=0}^{p-1} L^jKx(0) \qquad 0 \leq p$$
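A quick numerical check of this closed form (counting $p$ applications of the map $\mathbf{u} \leftarrow L\mathbf{u} + Kx(0)$) takes a few lines of Octave; the matrices below are random placeholders, not values from the text.

% Verify: after p applications, u = L^p*u0 + sum_{j=0}^{p-1} L^j*K*x0.
nu = 4; nx = 3; p = 7;
L = 0.4*rand(nu); K = rand(nu, nx);    % placeholder iteration matrices
u0 = rand(nu, 1); x0 = rand(nx, 1);
u = u0;
for k = 1:p, u = L*u + K*x0; end       % p applications of the iteration
S = zeros(nu, nx); Lj = eye(nu);
for j = 0:p-1, S = S + Lj*K; Lj = L*Lj; end
disp(norm(u - (L^p*u0 + S*x0)))        % ~ 0 to rounding error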
$$u_1(0) = E_1\mathbf{u}_1 \qquad E_1 = \begin{bmatrix} I_{m_1} & 0_{m_1} & \cdots & 0_{m_1} \end{bmatrix} \in \mathbb{R}^{m_1 \times m_1N}$$
Using the plantwide notation for this equation and defining the feed-
back gain K gives
x + = (A + BK)x
in which

$$G(s) = \begin{bmatrix} \dfrac{1}{s^2 + 2(0.2)s + 1} & \dfrac{0.5}{0.225s + 1} \\[2ex] \dfrac{-0.5}{(0.5s + 1)(0.25s + 1)} & \dfrac{1.5}{0.75s^2 + 2(0.8)(0.75)s + 1} \end{bmatrix}$$
Obtain discrete time models (Aij , Bij , Cij ) for each of the four transfer
functions Gij (s) using a sample time of T = 0.2 and zero-order holds
on the inputs. Set the control cost function parameters to be
$$Q_1 = Q_2 = 1 \qquad P_{1f} = P_{2f} = 0 \qquad R_1 = R_2 = 0.01 \qquad N = 30 \qquad w_1 = w_2 = 0.5$$
Compute the eigenvalues of the L matrix for this system using noncoop-
erative MPC. Show the Nash equilibrium is unstable and the closed-loop
system is therefore unstable. Discuss why this system is problematic
for noncooperative control.
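A sketch of the requested discretization, assuming the Octave control package (or MATLAB's Control System Toolbox) is available; only $G_{11}$ is unpacked here, the other three entries are handled identically.

% Zero-order-hold discretization of the Example 6.9 transfer functions.
pkg load control                       % Octave; omit this line in MATLAB
T = 0.2;
G11 = tf(1,    [1 2*0.2 1]);
G12 = tf(0.5,  [0.225 1]);
G21 = tf(-0.5, conv([0.5 1], [0.25 1]));
G22 = tf(1.5,  [0.75 2*0.8*0.75 1]);
sys11 = c2d(ss(G11), T, 'zoh');        % discrete state space
[A11, B11, C11] = ssdata(sys11);       % repeat for G12, G21, G22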
Solution
For this problem $L$ is a $60 \times 60$ matrix ($N(m_1 + m_2)$). The magnitudes of the largest eigenvalues are

$$|\mathrm{eig}(L)| = \begin{bmatrix} 1.11 & 1.11 & 1.03 & 1.03 & 0.914 & 0.914 & \cdots \end{bmatrix}$$
Solution
For this case the largest magnitude eigenvalues of $L$ are

$$|\mathrm{eig}(L)| = \begin{bmatrix} 0.63 & 0.63 & 0.62 & 0.62 & 0.59 & 0.59 & \cdots \end{bmatrix}$$
and we see the Nash equilibrium for the noncooperative game is sta-
ble. So we have removed the first source of closed-loop instability by
switching the input-output pairings of the two subsystems. There are
seven states in the complete system model, and the magnitudes of the
eigenvalues of the closed-loop regulator $(A + BK)$ are

$$|\mathrm{eig}(A + BK)| = \begin{bmatrix} 1.03 & 1.03 & 0.37 & 0.37 & 0.77 & 0.77 & 0.04 \end{bmatrix}$$
Example 6.11: Nash equilibrium is stable and the closed loop is stable
Next consider the system
$$G(s) = \begin{bmatrix} \dfrac{1}{s^2 + 2(0.2)s + 1} & \dfrac{0.5}{0.9s + 1} \\[2ex] \dfrac{-0.5}{(2s + 1)(s + 1)} & \dfrac{1.5}{0.75s^2 + 2(0.8)(0.75)s + 1} \end{bmatrix}$$
Compute the eigenvalues of L and A + BK for this system. What do you
conclude about noncooperative distributed MPC for this system?
Solution
This system is not difficult to handle with distributed control. The
gains are the same as in the original pairing in Example 6.9, and the
steady-state coupling between the two subsystems is reasonably weak.
Unlike Example 6.9, however, the responses of y1 to u2 and y2 to u1
have been slowed so they are not faster than the responses of y1 to u1
and y2 to u2 , respectively. Computing the eigenvalues of L and A + BK
for noncooperative control gives
$$|\mathrm{eig}(L)| = \begin{bmatrix} 0.61 & 0.61 & 0.59 & 0.59 & 0.56 & 0.56 & 0.53 & 0.53 & \cdots \end{bmatrix}$$
$$|\mathrm{eig}(A + BK)| = \begin{bmatrix} 0.88 & 0.88 & 0.74 & 0.67 & 0.67 & 0.53 & 0.53 \end{bmatrix}$$
The Nash equilibrium is stable since L is stable, and the closed loop is
stable since both L and A + BK are stable. □
These examples reveal the simple fact that communicating the ac-
tions of the other controllers does not guarantee acceptable closed-loop
behavior. If the coupling of the subsystems is weak enough, both dy-
namically and in steady state, then the closed loop is stable. In this
sense, noncooperative MPC has few advantages over completely decen-
tralized control, which has this same basic property.
We next show how to obtain much better closed-loop properties
while maintaining the small size of the distributed control problems.
$$V(x_1(0), x_2(0), \mathbf{u}_1, \mathbf{u}_2) = \sum_{k=0}^{N-1} \ell_1\big(x_1(k), x_2(k), u_1(k)\big) + V_{1f}\big(x_1(N), x_2(N)\big)$$

$$\sum_{k=0}^{N-1} x_2(k)'Q_2x_2(k) + x_2(N)'P_{2f}x_2(N) = \mathbf{u}_1'\mathcal{B}_{21}'\mathcal{Q}_2\mathcal{B}_{21}\mathbf{u}_1 + 2\big(x_2(0)'\mathcal{A}_2' + \mathbf{u}_2'\mathcal{B}_{22}'\big)\mathcal{Q}_2\mathcal{B}_{21}\mathbf{u}_1 + \text{constant}$$
in which

$$\mathcal{Q}_2 = \mathrm{diag}\big(Q_2, Q_2, \ldots, P_{2f}\big) \in \mathbb{R}^{Nn_2 \times Nn_2}$$
and the constant term contains products of x2 (0) and u2 , which are
constant with respect to player one’s decision variables and can there-
fore be neglected.
Next we insert the new terms created by eliminating x2 into the cost
function. Assembling the cost function gives
$$\min_z\ (1/2)z'\tilde{H}z + h'z \qquad \text{s.t.}\quad Dz = d$$

and (1.57) again gives the necessary and sufficient conditions for the optimal solution

$$\begin{bmatrix} \tilde{H} & -D' \\ -D & 0 \end{bmatrix}\begin{bmatrix} z \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ -\tilde{\mathcal{A}}_1 \end{bmatrix}x_1(0) + \begin{bmatrix} -\tilde{\mathcal{A}}_2 \\ 0 \end{bmatrix}x_2(0) + \begin{bmatrix} -\tilde{\mathcal{B}}_{22} \\ -\tilde{\mathcal{B}}_{12} \end{bmatrix}\mathbf{u}_2 \tag{6.13}$$
in which

$$\tilde{H} = H + E'\mathcal{B}_{21}'\mathcal{Q}_2\mathcal{B}_{21}E \qquad \tilde{\mathcal{B}}_{22} = E'\mathcal{B}_{21}'\mathcal{Q}_2\mathcal{B}_{22} \qquad \tilde{\mathcal{A}}_2 = E'\mathcal{B}_{21}'\mathcal{Q}_2\mathcal{A}_2$$

$$E = I_N \otimes \begin{bmatrix} I_{m_1} & 0_{m_1, n_1} \end{bmatrix}$$

See also Exercise 6.13 for details on constructing the padding matrix $E$.
Comparing the cooperative and noncooperative dynamic games, (6.13) and (6.12), we see the cooperative game has made three changes: (i) the quadratic penalty $H$ has been modified, (ii) the effect of $x_2(0)$ has been included with the term $\tilde{\mathcal{A}}_2$, and (iii) the influence of $\mathbf{u}_2$ has been modified with the term $\tilde{\mathcal{B}}_{22}$. Notice that the size of the vector $z$ has not changed, and we have accomplished the goal of keeping player one's dynamic model in the cooperative game the same size as his dynamic model in the noncooperative game.
Regardless of the implementation choice, the cooperative optimal
control problem is no more complex than the noncooperative game con-
sidered previously. The extra information required by player one in the
cooperative game is x2 (0). Player one requires u2 in both the cooper-
ative and noncooperative games. Only in decentralized control does
player one not require player two’s input sequence u2 . The other ex-
tra required information, A2 , B21 , Q2 , R2 , P2f , are fixed parameters and
making their values available to player one is a minor communication
overhead.
Proceeding as before, we solve this equation for z0 and pick out the
rows corresponding to the elements of u10 giving
" #
h i x (0)
1
u10 (x(0), u2 ) = K11 K12 + L1 u2
x2 (0)
in which the gain matrix multiplying the state is a full matrix for the
cooperative game. Substituting the optimal control into the iteration
gives
" #p+1 " #" # " # " #p
u1 w1 K11 w1 K12 x1 (0) (1 − w1 )I w1 L1 u1
= +
u2 w2 K21 w2 K22 x2 (0) w2 L2 (1 − w2 )I u2
| {z } | {z }
K L
and Hu is partitioned for the two players’ input sequences. Notice that
the cost decrease achieved in a single iteration is quadratic in the dis-
tance from the optimum. An important conclusion is that each iter-
ation in the cooperative game reduces the systemwide cost. This cost
reduction is the key property that gives cooperative MPC its excellent
convergence properties, as we show next.
The two players' warm starts at the next sample are given by

$$\tilde{\mathbf{u}}_1^+ = \big(u_1(1), u_1(2), \ldots, u_1(N-1), 0\big) \qquad \tilde{\mathbf{u}}_2^+ = \big(u_2(1), u_2(2), \ldots, u_2(N-1), 0\big)$$
We define the linear time-invariant functions $g_1^p$ and $g_2^p$ as the outcome of applying the control iteration procedure $p$ times

$$\mathbf{u}_1^p = g_1^p(x_1, x_2, \mathbf{u}_1, \mathbf{u}_2) \qquad \mathbf{u}_2^p = g_2^p(x_1, x_2, \mathbf{u}_1, \mathbf{u}_2)$$

so that the closed-loop system evolves according to

$$x_1^+ = A_1x_1 + \bar{B}_{11}u_1 + \bar{B}_{12}u_2 \qquad x_2^+ = A_2x_2 + \bar{B}_{21}u_1 + \bar{B}_{22}u_2$$
$$\mathbf{u}_1^+ = g_1^p(x_1, x_2, \mathbf{u}_1, \mathbf{u}_2) \qquad \mathbf{u}_2^+ = g_2^p(x_1, x_2, \mathbf{u}_1, \mathbf{u}_2) \tag{6.16}$$
By the construction of the warm start, $\tilde{\mathbf{u}}_1^+$, $\tilde{\mathbf{u}}_2^+$, we have

$$V(x_1^+, x_2^+, \tilde{\mathbf{u}}_1^+, \tilde{\mathbf{u}}_2^+) = V(x_1, x_2, \mathbf{u}_1, \mathbf{u}_2) - \rho_1\ell_1(x_1, u_1) - \rho_2\ell_2(x_2, u_2)$$
$$\qquad + (1/2)\rho_1x_1(N)'\big[A_1'P_{1f}A_1 - P_{1f} + Q_1\big]x_1(N)$$
$$\qquad + (1/2)\rho_2x_2(N)'\big[A_2'P_{2f}A_2 - P_{2f} + Q_2\big]x_2(N)$$

From our choice of terminal penalty satisfying (6.8), the last two terms are zero giving

$$V(x_1^+, x_2^+, \tilde{\mathbf{u}}_1^+, \tilde{\mathbf{u}}_2^+) = V(x_1, x_2, \mathbf{u}_1, \mathbf{u}_2) - \rho_1\ell_1(x_1, u_1) - \rho_2\ell_2(x_2, u_2) \tag{6.17}$$

The iterates then satisfy

$$V(x_1^+, x_2^+, \mathbf{u}_1^+, \mathbf{u}_2^+) \leq V(x_1^+, x_2^+, \tilde{\mathbf{u}}_1^+, \tilde{\mathbf{u}}_2^+) - \big[\tilde{\mathbf{u}}^+ - \mathbf{u}^0(x^+)\big]'P\big[\tilde{\mathbf{u}}^+ - \mathbf{u}^0(x^+)\big] \qquad p \geq 1$$
inputs. This assumption means that G11 and G22 are square matrices
of full rank. We remove all of these assumptions when we treat the con-
strained two-player game in the next section. If there is model error,
integrating disturbance models are required as discussed in Chapter 1.
We discuss these later.
The target problem also can be solved with any of the four ap-
proaches discussed so far. We consider each.
Centralized case. The centralized problem gives in one shot both in-
puts required to meet both output setpoints
$$u_s = G^{-1}y_{sp} \qquad y_s = y_{sp}$$

Decentralized case. Each player computes its input from its own subsystem alone, giving

$$u_s = \begin{bmatrix} G_{11}^{-1} & \\ & G_{22}^{-1} \end{bmatrix}y_{sp} \qquad y_s = \begin{bmatrix} I & G_{12}G_{22}^{-1} \\ G_{21}G_{11}^{-1} & I \end{bmatrix}y_{sp}$$

Noncooperative case. Player one solves

$$u_{1s}^0 = G_{11}^{-1}\big(y_{1sp} - G_{12}u_{2s}^p\big)$$
Player two solves the analogous problem. If we iterate on the two play-
ers’ solutions, we obtain
" #p+1 " #" #
−1
u1s w1 G11 y1sp
= −1 +
u2s w2 G22 y2sp
| {z }
Ks
" #" #p
−1
w2 I −w1 G11 G12 u1s
−1
−w2 G22 G21 w1 I u2s
| {z }
Ls
$$u_s^\infty = G^{-1}y_{sp}$$
and we have no offset. We already have seen that we cannot expect
the dynamic noncooperative iteration to converge. The next several
examples explore the issue of whether we can expect at least the steady-
state iteration to be stable.
Cooperative case. In the cooperative case, both players work on min-
imizing the offset in both outputs. Player one solves
" #′ " #" #
y1 − y1sp ρ1 Q 1 y1 − y1sp
min(1/2)
u1 y2 − y2sp ρ2 Q 2 y2 − y2sp
" # " # " #
y1 G11 G12
s.t. = u1 + u2
y2 G21 G22
We can write this in the general form
min(1/2)rs′ Hrs + h′ rs
rs
s.t. Drs = d
in which
y1s ρ1 Q 1 " #
−Qysp
rs = y2s H= ρ2 Q 2 h=
0
u1s 0
" # " #
h i G11 G12
D = I −G1 d = G2 u 2 G1 = G2 =
G12 G22
and identify the linear gains between the optimal u1s and the setpoint
ysp and player two’s input u2s
$$u_{1s}^0 = K_{1s}y_{sp} + L_{1s}u_{2s}^p$$
(b) Switch the pairings and repeat the previous part. Explain your
results.
Solution
(a) The first 10 iterations of the noncooperative steady-state calcu-
lation are shown in Figure 6.2. Notice the iteration is unstable
and the steady-state target does not converge. The cooperative
case is shown in Figure 6.3. This case is stable and the iterations
converge to the centralized target and achieve zero offset. The
magnitudes of the eigenvalues of Ls for the noncooperative (nc)
and cooperative (co) cases are given by
eig(Lsnc ) = {1.12, 1.12} eig(Lsco ) = {0.757, 0.243}
Stability of the iteration is determined by the magnitudes of the
eigenvalues of Ls .
(b) Reversing the pairings leads to the following gain matrix in which
we have reversed the labels of the outputs for the two systems
" # " #" #
y1s 2.0 1.0 u1s
=
y2s −0.5 1.0 u2s
The first 10 iterations of the noncooperative and cooperative con-
trollers are shown in Figures 6.4 and 6.5. For this pairing, the
noncooperative case also converges to the centralized target. The
eigenvalues are given by
eig(Lsnc ) = {0.559, 0.559} eig(Lsco ) = {0.757, 0.243}
The eigenvalues of the cooperative case are unaffected by the re-
versal of pairings. □
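The eigenvalue magnitudes quoted in the example can be reproduced in a few lines of Octave. The steady-state gain matrix for the original pairing is not shown in this excerpt; below it is inferred by un-reversing the output labels of the matrix given in part (b), so treat it as an assumption.

% Eigenvalues of the noncooperative steady-state iteration matrix Ls.
w1 = 0.5; w2 = 0.5;
Lsnc = @(G) [ (1-w1),            -w1*G(1,2)/G(1,1);
              -w2*G(2,1)/G(2,2),  (1-w2) ];
Ga = [-0.5 1.0;  2.0 1.0];   % original pairing (inferred, an assumption)
Gb = [ 2.0 1.0; -0.5 1.0];   % reversed pairing (given in the text)
abs(eig(Lsnc(Ga)))           % -> 1.12, 1.12: iteration diverges
abs(eig(Lsnc(Gb)))           % -> 0.559, 0.559: iteration converges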
Figure 6.2: Noncooperative steady-state target iteration in the $(u_1, u_2)$ plane; the lines $y_1 = 1$ and $y_2 = 1$ and the decentralized and centralized targets $u_{de}$ and $u_{ce}$ are shown.

Figure 6.3: Cooperative steady-state target iteration for the same pairing, converging to the centralized target $u_{ce}$.

Figure 6.4: Noncooperative steady-state target iteration with the pairings reversed.

Figure 6.5: Cooperative steady-state target iteration with the pairings reversed.
in which $A_{Li} = A_i - L_iC_i$.

Closed-loop stability. The dynamics of the estimator are given by

$$\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}^+ = \begin{bmatrix} A_1 & \\ & A_2 \end{bmatrix}\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix} + \begin{bmatrix} \bar{B}_{11} & \bar{B}_{12} \\ \bar{B}_{21} & \bar{B}_{22} \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} L_1C_1 & \\ & L_2C_2 \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$$
In the control law we use the state estimate in place of the state, which
is unmeasured and unknown. We consider two cases.
Converged controller. In this case the distributed control law converges to the centralized controller, and we have

$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{bmatrix}\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}$$
The A+BK term is stable because this term is the same as in the stabiliz-
ing centralized controller. The perturbation is exponentially decaying
because the distributed estimators are stable. Therefore x̂ goes to zero
exponentially, which, along with e going to zero exponentially, implies
x goes to zero exponentially.
Finite iterations. Here we use the state plus input sequence description given in (6.16), which, as we have already noted, is a linear time-invariant system. With estimate error, the system equation is

$$\begin{bmatrix} \hat{x}_1^+ \\ \hat{x}_2^+ \\ \mathbf{u}_1^+ \\ \mathbf{u}_2^+ \end{bmatrix} = \begin{bmatrix} A_1\hat{x}_1 + \bar{B}_{11}u_1 + \bar{B}_{12}u_2 \\ A_2\hat{x}_2 + \bar{B}_{21}u_1 + \bar{B}_{22}u_2 \\ g_1^p(\hat{x}_1, \hat{x}_2, \mathbf{u}_1, \mathbf{u}_2) \\ g_2^p(\hat{x}_1, \hat{x}_2, \mathbf{u}_1, \mathbf{u}_2) \end{bmatrix} + \begin{bmatrix} L_1C_1e_1 \\ L_2C_2e_2 \\ 0 \\ 0 \end{bmatrix}$$
Because there is again only one-way coupling between the estimate er-
ror evolution, (6.18), and the system evolution given above, the com-
posite system is exponentially stable.
(c) The input penalties R1 , R2 are positive definite, and the state penal-
ties Q1 , Q2 are semidefinite.
(e) The horizon is chosen sufficiently long to zero the unstable modes, $N \geq \max_{i \in I_{1:2}} n_i^u$, in which $n_i^u$ is the number of unstable modes of $A_i$, i.e., the number of $\lambda \in \mathrm{eig}(A_i)$ such that $|\lambda| \geq 1$.
$$S_i^s = \mathrm{diag}(S_{i1}^s, S_{i2}^s) \qquad A_i^s = \mathrm{diag}(A_{i1}^s, A_{i2}^s) \qquad i \in I_{1:2}$$
$$S_i^u = \mathrm{diag}(S_{i1}^u, S_{i2}^u) \qquad A_i^u = \mathrm{diag}(A_{i1}^u, A_{i2}^u) \qquad i \in I_{1:2}$$

$$A_1^{s\prime}\Sigma_1A_1^s - \Sigma_1 = -S_1^{s\prime}Q_1S_1^s \qquad A_2^{s\prime}\Sigma_2A_2^s - \Sigma_2 = -S_2^{s\prime}Q_2S_2^s \tag{6.20}$$

We then choose the terminal penalty for each subsystem to be the cost to go under zero control

$$P_{1f} = S_1^{s\prime}\Sigma_1S_1^s \qquad P_{2f} = S_2^{s\prime}\Sigma_2S_2^s$$

³If $A_{ij}$ is stable, then there is no $A_{ij}^u$ and $S_{ij}^u$.
The feasible set XN for the unstable system is the set of states for which
the unstable modes can be brought to zero in N moves while satisfying
the input constraints.
Given an initial iterate $(\mathbf{u}_1^p, \mathbf{u}_2^p)$, the next iterate is defined to be

$$(\mathbf{u}_1, \mathbf{u}_2)^{p+1} = w_1\big(\mathbf{u}_1^0(x_1(0), x_2(0), \mathbf{u}_2^p),\ \mathbf{u}_2^p\big) + w_2\big(\mathbf{u}_1^p,\ \mathbf{u}_2^0(x_1(0), x_2(0), \mathbf{u}_1^p)\big)$$
and the functional dependencies of u10 and u20 should be kept in mind.
This procedure provides three important properties, which we es-
tablish next.
The first equality follows from (6.14). The next inequality follows
from convexity of V . The next follows from optimality of u10 and
u20 , and the last follows from w1 + w2 = 1. Because the cost is
bounded below, the cost iteration converges.
Substituting this result into the equation for the change in W gives
The proof is based on the properties (6.23), (6.24), and (6.25) of func-
tion W (x̂, u, e), and is basically a combination of the proofs of Lemmas
6.5 and 6.6. The region of attraction is the set of states and initial es-
timate errors for which the unstable modes of the two subsystems can
be brought to zero in N moves while satisfying the respective input
constraints. If both subsystems are stable, for example, the region of
attraction is (x, e) ∈ XN × Rn .
$$\min_{x_s, u_s}\ \frac{1}{2}\Big(\big|u_s - u_{sp}\big|_{R_s}^2 + \big|Cx_s + C_d\hat{d}(k) - y_{sp}\big|_{Q_s}^2\Big)$$

subject to

$$\begin{bmatrix} I-A & -B \\ HC & 0 \end{bmatrix}\begin{bmatrix} x_s \\ u_s \end{bmatrix} = \begin{bmatrix} B_d\hat{d}(k) \\ r_{sp} - HC_d\hat{d}(k) \end{bmatrix} \qquad Eu_s \leq e$$
depends only on the disturbance estimate, $\hat{d}(k)$, and not the solution of the control problem. So we can analyze the behavior of the target by considering only the exponential convergence of the estimator. We restrict the plant disturbance $d$ so that the target problem is feasible, and denote the solution to the target problem for the plant disturbance, $\hat{d}(k) = d$, as $(x_s^*, u_s^*)$. Because the estimator is exponentially stable, we know that $\hat{d}(k) \to d$ as $k \to \infty$. Because the target problem is a positive definite quadratic program (QP), we know the solution is Lipschitz continuous on bounded sets in the term $\hat{d}(k)$, which appears linearly in the objective function and the right-hand side of the equality constraint. Therefore, if we also restrict the initial disturbance estimate error so that the target problem remains feasible for all time, we know $(x_s(k), u_s(k)) \to (x_s^*, u_s^*)$ and the rate of convergence is exponential.
subject to

$$\begin{bmatrix} I-A_1 & & -\bar{B}_{11} & -\bar{B}_{12} \\ & I-A_2 & -\bar{B}_{21} & -\bar{B}_{22} \\ H_1C_1 & & & \\ & H_2C_2 & & \end{bmatrix}\begin{bmatrix} x_{1s} \\ x_{2s} \\ u_{1s} \\ u_{2s} \end{bmatrix} = \begin{bmatrix} B_{d1}\hat{d}_1(k) \\ B_{d2}\hat{d}_2(k) \\ r_{1sp} - H_1C_{d1}\hat{d}_1(k) \\ r_{2sp} - H_2C_{d2}\hat{d}_2(k) \end{bmatrix} \qquad E_1u_{1s} \leq e_1$$

in which

$$y_{1s} = C_1x_{1s} + C_{d1}\hat{d}_1(k) \qquad y_{2s} = C_2x_{2s} + C_{d2}\hat{d}_2(k) \tag{6.27}$$
But here we run into several problems. First, the constraints to ensure
zero offset in both players’ controlled variables are not feasible with
only the u1s decision variables. We require also u2s , which is not avail-
able to player one. We can consider deleting the zero offset condition
for player two’s controlled variables, the last equality constraint. But
if we do that for both players, then the two players have different and
coupled equality constraints. That is a path to instability as we have
seen in the noncooperative target problem. To resolve this issue, we
move the controlled variables to the objective function, and player one
solves instead the following
" #′ " #" #
1 H1 y1s − r1sp T1s H1 y1s − r1sp
min
x1s ,u1s 2 H2 y2s − r2sp T2s H2 y2s − r2sp
The equality constraints for the two players appear coupled when writ-
ten in this form. Coupled constraints admit the potential for the op-
timization to become stuck on the boundary of the feasible region,
and not achieve the centralized target solution after iteration to con-
vergence. But Exercise 6.30 discusses how to show that the equality
constraints are, in fact, uncoupled. Also, the distributed target prob-
lem as expressed here may not have a unique solution when there are
more manipulated variables than controlled variables. In such cases,
a regularization term using the input setpoint can be added to the ob-
jective function. The controlled variable penalty can be converted to a
linear penalty with a large penalty weight to ensure exact satisfaction
of the controlled variable setpoint.
If the input inequality constraints are coupled, however, then the
distributed target problem may indeed become stuck on the boundary
of the feasible region and not eliminate offset in the controlled vari-
ables. If the input inequality constraints are coupled, we recommend
using the centralized approach to computing the steady-state target.
As discussed above, the centralized target problem eliminates offset in
the controlled variables as long as it remains feasible given the distur-
bance estimates.
Zero offset. Finally we establish the zero offset property. As de-
scribed in Chapter 1, the regulator is posed in deviation variables
$$\tilde{x}(k) = \hat{x}(k) - x_s(k) \qquad \tilde{u}(k) = u(k) - u_s(k) \qquad \tilde{\mathbf{u}} = \mathbf{u} - u_s(k)$$
Player one solves

$$\min_{\tilde{\mathbf{u}}_1}\ V(\tilde{x}_1(0), \tilde{x}_2(0), \tilde{\mathbf{u}}_1, \tilde{\mathbf{u}}_2)$$

$$\text{s.t.}\quad \begin{bmatrix} \tilde{x}_1 \\ \tilde{x}_2 \end{bmatrix}^+ = \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \end{bmatrix}\begin{bmatrix} \tilde{x}_1 \\ \tilde{x}_2 \end{bmatrix} + \begin{bmatrix} \bar{B}_{11} \\ \bar{B}_{21} \end{bmatrix}\tilde{u}_1 + \begin{bmatrix} \bar{B}_{12} \\ \bar{B}_{22} \end{bmatrix}\tilde{u}_2$$

$$\tilde{\mathbf{u}}_1 \in U_1 \ominus u_s(k) \qquad S_1^{u\prime}\tilde{x}_1(N) = 0 \qquad |\tilde{\mathbf{u}}_1| \leq d_1\,|\tilde{x}_1(0)|$$
Notice that because the input constraint is shifted by the input target, we must retain feasibility of the regulation problem by restricting also the plant disturbance and its initial estimate error. If the two players' regulation problems remain feasible as the estimate error converges to zero, we have exponential stability of the zero solution from Lemma 6.14. Therefore we conclude

$$(\tilde{x}(k), \tilde{u}(k)) \to (0, 0) \qquad \text{Lemma 6.14}$$
$$\Longrightarrow\ (\hat{x}(k), u(k)) \to (x_s(k), u_s(k)) \qquad \text{definition of deviation variables}$$
$$\Longrightarrow\ (\hat{x}(k), u(k)) \to (x_s^*, u_s^*) \qquad \text{target problem convergence}$$
$$\Longrightarrow\ x(k) \to x_s^* \qquad \text{estimator stability}$$
$$\Longrightarrow\ r(k) \to r_{sp} \qquad \text{target equality constraint}$$
and we have zero offset in the plant controlled variable r = Hy. The
rate of convergence of r (k) to rsp is also exponential. As we saw here,
this convergence depends on maintaining feasibility in both the target
problem and the regulation problem at all times.
$$\min_{\mathbf{u}_i}\ V(x(0), \mathbf{u}) \qquad \text{s.t.}\quad x^+ = Ax + \sum_{j \in I_{1:M}} \bar{B}_ju_j$$

$$\mathbf{u}_i \in U_i \qquad S_{ji}^{u\prime}x_{ji}(N) = 0 \quad j \in I_{1:M}$$

$$|\mathbf{u}_i| \leq d_i\sum_{j \in I_{1:M}}\big|x_{ji}(0)\big| \quad \text{if } x_{ji}(0) \in rB,\ j \in I_{1:M} \qquad\qquad |\mathbf{u}| \leq d\,|x| \quad x \in rB$$
(c) The input penalties Ri , i ∈ I1:M are positive definite, and Qi , i ∈ I1:M
are semidefinite.
(e) The horizon is chosen sufficiently long to zero the unstable modes; $N \geq \max_{i \in I_{1:M}}(n_i^u)$, in which $n_i^u$ is the number of unstable modes of $A_i$.
(f) Zero offset. For achieving zero offset, we augment the models with integrating disturbances such that

$$\mathrm{rank}\begin{bmatrix} I-A_i & -B_{di} \\ C_i & C_{di} \end{bmatrix} = n_i + p_i \qquad i \in I_{1:M}$$
In the nonlinear case, the usual model comes from physical principles
and conservation laws of mass, energy, and momentum. The state has
a physical meaning and the measured outputs usually are a subset of
$$\frac{dx_1}{dt} = f_1(x_1, x_2, u_1, u_2) \qquad y_1 = C_1x_1$$
$$\frac{dx_2}{dt} = f_2(x_1, x_2, u_1, u_2) \qquad y_2 = C_2x_2$$
in which C1 , C2 are matrices of zeros and ones selecting the part of the
state that is measured in subsystems one and two. We generally cannot
avoid state x2 dependence in the differential equation for x1 . But often
only a small subset of the entire state x2 appears in f1 , and vice versa.
The reason in chemical process systems is that the two subsystems are
generally coupled through a small set of process streams transferring
mass and energy between the systems. These connecting streams iso-
late the coupling between the two systems and reduce the influence to
a small part of the entire state required to describe each system.
Given these physical system models of the subsystems, the overall
plant model is
$$\frac{dx}{dt} = f(x, u) \qquad y = Cx$$

with

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \qquad f = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \qquad y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \qquad C = \begin{bmatrix} C_1 & \\ & C_2 \end{bmatrix}$$
6.5.1 Nonconvexity
The basic difficulty in both the theory and application of nonlinear MPC
is the nonconvexity in the control objective function caused by the non-
linear dynamic model. This difficulty applies even to centralized non-
linear MPC as discussed in Section 2.7, and motivates the development
of suboptimal MPC. In the distributed case, nonconvexity causes extra
difficulties. As an illustration, consider the simple two-player, nonconvex game depicted in Figure 6.7, which shows the contours of the cost function.
Figure 6.7: Cost contours for a simple two-player, nonconvex game in the $(u_1, u_2)$ plane, showing the points ➀–➃ discussed in the text.
Each player optimizes in its own input, giving the points $(u_1^0, u_2^p)$, denoted ➁, and $(u_1^p, u_2^0)$, denoted ➂. Consider taking a convex combination of the two players' optimal points for the next iterate

$$(u_1^{p+1}, u_2^{p+1}) = w_1(u_1^0, u_2^p) + w_2(u_1^p, u_2^0) \qquad w_1 + w_2 = 1,\quad w_1, w_2 \geq 0$$
We see in Figure 6.7 that this iterate causes the objective function to
increase rather than decrease for most values of w1 , w2 . For w1 = w2 =
1/2, we see clearly from the contours that V at point ➃ is greater than
V at point ➀.
The possibility of a cost increase leads to the possibility of closed-
loop instability and precludes developing even a nominal control theory
for this simple approach, which was adequate for the convex, linear
plant case.4 In the centralized MPC problem, this nonconvexity issue
can be addressed in the optimizer, which can move both inputs simul-
taneously and always avoid a cost increase. One can of course consider
4 This point marked the state of affairs at the time of publication of the first edi-
tion of this text. The remaining sections summarize one approach that addresses the
nonconvexity problem (Stewart, Wright, and Rawlings, 2011).
adding another player to the game who has access to more systemwide
information. This player takes the optimization results of the indi-
vidual players and determines a search direction and step length that
achieve a cost decrease for the overall system. This player is often
known as a coordinator. The main drawback of this approach is that
the design of the coordinator may not be significantly simpler than the
design of the centralized controller.
Rather than design a coordinator, we instead let each player evaluate
the effect of taking a combination of all the players’ optimal moves. The
players can then easily find an effective combination that leads to a cost
decrease. We describe one such algorithm in the next section, which
we call the distributed gradient algorithm.
in which $\mathbf{u}_{-i} = (\mathbf{u}_1, \ldots, \mathbf{u}_{i-1}, \mathbf{u}_{i+1}, \ldots, \mathbf{u}_M)$. Let $\mathbf{u}_i^p$ denote the approximate solution to these optimizations. We compute the approximate solutions via the standard technique of line search with gradient projection. At iterate $p \geq 0$

$$\mathbf{u}_i^p = P_i\big(\mathbf{u}_i^p - \nabla_iV(\mathbf{u}^p)\big) \tag{6.30}$$
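One projected gradient step with a backtracking (Armijo) line search, for a box input constraint set, can be sketched in Octave as follows. The cost function here is an arbitrary convex placeholder; this illustrates the technique, not the book's implementation.

% One gradient projection step with backtracking onto a box constraint.
V  = @(u) 0.5*u(1)^2 + u(2)^2 + u(1)*u(2);   % placeholder convex cost
gV = @(u) [u(1) + u(2); 2*u(2) + u(1)];      % its gradient
lb = [-1; -1]; ub = [1; 1];
P  = @(u) min(max(u, lb), ub);               % projection onto the box
u = [1; 1]; t = 1; sigma = 1e-4;
for trial = 1:30                             % backtracking line search
  up = P(u - t*gV(u));                       % trial projected step
  if V(up) <= V(u) + sigma*gV(u)'*(up - u)   % sufficient decrease test
    break
  end
  t = t/2;
end
u = up;                                      % accepted iterate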
$$V_i\big(x(0), \mathbf{u}_1, \mathbf{u}_2\big) = \sum_{k=0}^{N-1} \ell_i\big(x_i(k), u_i(k)\big) + V_{if}\big(x(N)\big)$$

with $\ell_i(x_i, u_i)$ denoting the stage cost, $V_{if}(x)$ the terminal cost of system $i$, and $x_i(k) = \phi_i(k; x_i, \mathbf{u}_1, \mathbf{u}_2)$. Because $x_i$ is a function of both $\mathbf{u}_1$
$$X_f = \{x \mid V_f(x) \leq a\}$$

$$f(x, u_1, u_2) \in X_f \qquad V_f\big(f(x, u_1, u_2)\big) - V_f(x) \leq -\ell(x, u_1, u_2)$$

(b) For each $i \in I_{1:2}$, there exist $\mathcal{K}_\infty$ functions $\alpha_i(\cdot)$ and $\alpha_f(\cdot)$ satisfying the required bounds, with $f(x, \kappa_{1f}(x), \kappa_{2f}(x)) \in X_f$. Each terminal controller $\kappa_{if}(\cdot)$ may be found offline.
$$V^\beta(x, \mathbf{u}) = \sum_{k=0}^{N-1} \ell\big(x(k), u(k)\big) + \beta V_f\big(x(N)\big) \tag{6.35}$$

Cooperative control algorithm. Let $x(0)$ be the initial state and $\tilde{\mathbf{u}} \in \mathcal{U}$ be the initial feasible input sequence for the cooperative MPC algorithm such that $\phi(N; x(0), \tilde{\mathbf{u}}) \in X_f$. At each iterate $p$, an approximate solution
By Lemma 6.18(b), the objective function cost only decreases from this
warm start, so that
Theorem 6.21 (Asymptotic stability). Let Assumptions 2.2, 2.3, and 6.19
hold, and let V (·) ← V β (·) from Proposition 6.20. Then for every x(0) ∈
XN , the origin is asymptotically stable for the closed-loop system x + =
f (x, κ p̄ (x)).
The proof follows, with minor modification, the proof that subopti-
mal MPC is asymptotically stable in Theorem 2.48. As in the previous
sections, the controller has been presented for the case of two subsys-
tems, but can be extended to any finite number of subsystems.
We conclude the discussion of nonlinear distributed MPC by revis-
iting the unstable nonlinear example system presented in Stewart et al.
(2011).
Figure 6.9: Closed-loop state and control evolution with $(x_1(0), x_2(0)) = (3, -3)$; panels compare $p = 3$ and $p = 10$, plotting states $x_1$, $x_2$ and inputs $u_1$, $u_2$ versus time. Setting $p = 10$ approximates the centralized controller.
with $Q_1, Q_2 > 0$ and $R_1, R_2 > 0$. This stage cost gives the objective function

$$V(x, \mathbf{u}) = \frac{1}{2}\sum_{k=0}^{N-1}\big(x(k)'Qx(k) + u(k)'Ru(k)\big) + V_f(x(N))$$
6.6 Notes
At least three different fields have contributed substantially to the ma-
terial presented in this chapter. We attempt here to point out briefly
what each field has contributed, and indicate what literature the inter-
ested reader may wish to consult for further pursuing this and related
subjects.
In the cooperative game, any number of iterates of the local MPC controllers leads to closed-loop stability for linear dynamics.
and Wright (2006a,b) show that state estimation errors (output instead
of state feedback) do not change the system closed-loop stability if the
estimators are also asymptotically stable. Most of the theoretical re-
sults on cooperative MPC of linear systems given in this chapter are
presented in Venkat (2006) using an earlier, different notation. If im-
plementable, this form of distributed MPC clearly has the best control
properties. Although one can easily modify the agents’ objective func-
tions in a single large-scale process owned by a single company, this
kind of modification may not be possible in other situations in which
competing interests share critical infrastructure.
The requirements of the many different classes of applications con-
tinue to create exciting opportunities for continued research in this
field. An excellent recent review provides a useful taxonomy of the dif-
ferent features of the different approaches (Scattolini, 2009). A recent
text compiles no less than 35 different approaches to distributed MPC
from more than 80 contributors (Maestre and Negenborn, 2014). The
growth in the number and diversity of applications of distributed MPC
shows no sign of abating.
6.7 Exercises
First, implement the method described in Section 6.1.1 in which you eliminate the state and solve the problem for the decision variable

$$\mathbf{u} = (u(0), u(1), \ldots, u(N-1))$$

Second, implement the method described in Section 6.1.1 in which you do not eliminate the state and solve the problem for

$$z = (u(0), x(1), u(1), x(2), \ldots, u(N-1), x(N))$$

Third, use backward dynamic programming (DP) and the Riccati iteration to compute the closed-form solution for $u(k)$ and $x(k)$.
(a) Let

$$A = \begin{bmatrix} 4/3 & -2/3 \\ 1 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad C = \begin{bmatrix} -2/3 & 1 \end{bmatrix} \qquad x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

$$Q = C'C + 0.001I \qquad P_f = \Pi \qquad R = 0.001 \qquad M = 0$$

in which the terminal penalty, $P_f$, is set equal to $\Pi$, the steady-state cost to go. Compare the three solutions for $N = 5$. Plot $x(k)$, $u(k)$ versus time for the closed-loop system.
(b) Let N = 50 and repeat. Do any of the methods experience numerical problems
generating an accurate solution? Plot the condition number of the matrix that
is inverted in the first two methods versus N.
$$Q = I \qquad P_f = \Pi \qquad R = I \qquad M = 0$$
Repeat parts (a) and (b) for this system. Do you lose accuracy in any of the
solution methods? What happens to the condition number of H(N) and S(N)
as N becomes large? Which methods are still accurate for this case? Can you
explain what happened?
subject to
x + = Ax + Bu
(a) Set up the dense Hessian least squares problem for the LQP with a horizon of
three, N = 3. Eliminate the state equations and write out the objective function
in terms of only the decision variables u(0), u(1), u(2).
(b) What are the conditions for an optimum, i.e., what linear algebra problem do
you solve to compute u(0), u(1), u(2)?
(b) What are necessary and sufficient conditions for a solution to the optimization
problem?
(c) Apply this approach to the LQP of Exercise 6.2 using the equality constraints to
represent the model equations. What are H, D, d for the LQP?
(d) Write out the linear algebra problem to be solved for the optimum.
(e) Contrast the two different linear algebra problems in these two approaches.
Which do you want to use when N is large and why?
subject to
x + = Ax + Bu
and the three approaches of Exercise 6.1.
1. The method described in Section 6.1.1 in which you eliminate the state and solve
the problem for the decision variable
2. The method described in Section 6.1.1 in which you do not eliminate the state
and solve the problem for
z = (u(0), x(1), u(1), x(2), . . . , u(N − 1), x(N))
3. The method of DP and the Riccati iteration to compute the closed-form solution
for u(k) and x(k).
(a) You found that unstable A causes numerical problems in the first method using
large horizons. So let’s consider a fourth method. Reparameterize the input in
terms of a state feedback gain via
u(k) = Kx(k) + v(k)
in which K is chosen so that A + BK is a stable matrix. Consider the matrices in
a transformed LQP
$$\min_{\mathbf{v}}\ V = \frac{1}{2}\sum_{k=0}^{N-1}\Big(x(k)'\tilde{Q}x(k) + v(k)'\tilde{R}v(k) + 2x(k)'\tilde{M}v(k)\Big) + (1/2)x(N)'\tilde{P}_fx(N)$$

subject to $x^+ = \tilde{A}x + \tilde{B}v$.
(b) Solve the following problem using the first method and the fourth method and describe differences between the two solutions. Compare your results to the DP approach. Plot $x(k)$ and $u(k)$ versus $k$.

$$A = \begin{bmatrix} 27.8 & -82.6 & 34.6 \\ 25.6 & -76.8 & 32.4 \\ 40.6 & -122.0 & 51.9 \end{bmatrix} \qquad B = \begin{bmatrix} 0.527 & 0.548 \\ 0.613 & 0.530 \\ 1.06 & 0.828 \end{bmatrix} \qquad x(0) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

Consider regulator tuning parameters and constraints

$$Q = P_f = I \qquad R = I \qquad M = 0 \qquad N = 50$$
Notice the recursively defined v(m) and H(m) provide the solutions and the
Hessian matrices of the sequence of optimization problems
min V (m, x) 1≤m≤N
x
(b) Check your answer by solving the equivalent, but larger dimensional, constrained least squares problem (see Exercise 1.16)

$$\min_z\ (z - z_0)'\tilde{H}(z - z_0) \qquad \text{subject to}\quad Dz = 0$$

in which $z, z_0 \in \mathbb{R}^{nN}$, $\tilde{H} \in \mathbb{R}^{nN \times nN}$ is a block diagonal matrix, $D \in \mathbb{R}^{n(N-1) \times nN}$,

$$z_0 = \begin{bmatrix} x(1) \\ \vdots \\ x(N-1) \\ x(N) \end{bmatrix} \qquad \tilde{H} = \mathrm{diag}\big(X_1, \ldots, X_{N-1}, X_N\big) \qquad D = \begin{bmatrix} I & -I & & \\ & \ddots & \ddots & \\ & & I & -I \end{bmatrix}$$

(c) Compare the size and number of matrix inverses required for the two approaches.
$$w^+ = Aw \qquad w(0) = Hx(0) \qquad x = Cw$$

with solution $w(k) = A^kw(0) = A^kHx(0)$, $x(k) = CA^kHx(0)$. Notice that $x(0)$ completely determines both $w(k)$ and $x(k)$, $k \geq 0$. Also note that zero is a solution, i.e., $x(k) = 0$, $k \geq 0$ satisfies the model.
Plot the solution x(k). Does x(k) converge to zero? Does x(k) achieve zero
exactly for finite k > 0?
(b) Is the zero solution x(k) = 0 Lyapunov stable? State your definition of Lyapunov
stability, and prove your answer. Discuss how your answer is consistent with
the special case considered above.
$$|h(x)| \leq \bar{d}\,|x| \qquad x \in X_N$$

in which $\bar{d} > 0$. Notice that all initializations considered in the chapter satisfy this requirement.
Then, at time $k$ and state $x$, in addition to the shifted input sequence from time $k-1$, $\tilde{\mathbf{u}}$, evaluate the initialization sequence applied to the current state, $\mathbf{u} = h(x)$. Select whichever of these two input sequences has lower cost as the warm start for time $k$. Notice also that this refinement makes the constraint

$$|\mathbf{u}| \leq d\,|x| \qquad x \in rB$$
Apply decentralized control to the systems in Examples 6.9–6.11. Which of these sys-
tems are closed-loop unstable with decentralized control? Compare this result to the
result for noncooperative MPC.
Apply cooperative MPC to the systems in Examples 6.9–6.11. Are any of these systems
closed-loop unstable? Compare the closed-loop eigenvalues of converged cooperative
control to centralized MPC, and discuss any differences.
Figure 6.11: Contours of $V(\mathbf{u}) = \text{constant}$ in the $(\mathbf{u}_1, \mathbf{u}_2)$ plane, showing the optimum $\mathbf{u}^*$ and the iterates $\mathbf{u}^p$ and $\mathbf{u}^{p+1}$.
in which

$$(I - L)^{-1}K = \begin{bmatrix} I & -L_1 \\ -L_2 & I \end{bmatrix}^{-1}\begin{bmatrix} K_1 & 0 \\ 0 & K_2 \end{bmatrix}$$
$$V(\mathbf{u}) = (1/2)\mathbf{u}'H\mathbf{u} + c'\mathbf{u} + d$$

$$V(\mathbf{u}_1, \mathbf{u}_2) = (1/2)\begin{bmatrix} \mathbf{u}_1' & \mathbf{u}_2' \end{bmatrix}\begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix} + \begin{bmatrix} c_1' & c_2' \end{bmatrix}\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix} + d$$
in which H > 0. Imagine we wish to optimize this function by first optimizing over
the u1 variables holding u2 fixed and then optimizing over the u2 variables holding u1
fixed as shown in Figure 6.11. Let’s see if this procedure, while not necessarily efficient,
is guaranteed to converge to the optimum.
(a) Given an initial point $(\mathbf{u}_1^p, \mathbf{u}_2^p)$, show that the next iteration is

$$\mathbf{u}_1^{p+1} = -H_{11}^{-1}\big(H_{12}\mathbf{u}_2^p + c_1\big) \qquad \mathbf{u}_2^{p+1} = -H_{22}^{-1}\big(H_{21}\mathbf{u}_1^p + c_2\big) \tag{6.39}$$
(b) Establish that the optimization procedure converges by showing the iteration matrix is stable

$$\big|\mathrm{eig}(\mathbf{A})\big| < 1$$

(c) Given that the iteration converges, show that it produces the same solution as

$$\mathbf{u}^* = -H^{-1}c$$
(b) Show that the following expression gives the size of the decrease

$$V(\mathbf{u}^{p+1}) - V(\mathbf{u}^p) = -(1/2)(\mathbf{u}^p - \mathbf{u}^*)'P(\mathbf{u}^p - \mathbf{u}^*)$$

in which

$$P = \tilde{H}D^{-1}\tilde{H} \qquad \tilde{H} = D - N \qquad D = \begin{bmatrix} H_{11} & 0 \\ 0 & H_{22} \end{bmatrix} \qquad N = \begin{bmatrix} 0 & H_{12} \\ H_{21} & 0 \end{bmatrix}$$
(b) Show that the following expression gives the size of the decrease

$$V(\mathbf{u}^{p+1}) - V(\mathbf{u}^p) = -(1/2)(\mathbf{u}^p - \mathbf{u}^*)'P(\mathbf{u}^p - \mathbf{u}^*)$$

in which

$$P = \tilde{H}D^{-1}\tilde{H} \qquad \tilde{H} = D - N$$

$$D = \begin{bmatrix} w_1^{-1}H_{11} & 0 \\ 0 & w_2^{-1}H_{22} \end{bmatrix} \qquad N = \begin{bmatrix} -w_1^{-1}w_2H_{11} & H_{12} \\ H_{21} & -w_1w_2^{-1}H_{22} \end{bmatrix}$$

and $\mathbf{u}^* = -H^{-1}c$ is the optimum.
Hint: to simplify the algebra, first change coordinates and move the origin of the coor-
dinate system to u∗ .
Notice the system evolution is time-varying even though the models are time invariant
because we allow a time-varying sequence of controller iterations.
Show that cooperative MPC is exponentially stabilizing for any pk ≥ 0 sequence.
Prove that Modified Assumption 6.13 (b) implies Assumption 6.13 (b). It may be
helpful to first prove the following lemma.
in which As is stable, the system (A, C) is detectable if and only if the system (A, C) is
detectable.
$$\min_{\mathbf{u}}\ (1/2)\mathbf{u}'H\mathbf{u} + x'D\mathbf{u} \qquad \text{s.t.}\quad E\mathbf{u} \leq Fx$$
Figure 6.12: (a) Optimality of $\mathbf{u}^*$ means the angle between $-\nabla V$ and any point $z$ in the feasible region must be greater than 90° and less than 270°. (b) The same result restated: $\mathbf{u}^*$ is optimal if and only if the negative gradient is in the normal cone to the feasible region at $\mathbf{u}^*$, $-\nabla V|_{\mathbf{u}^*} \in N(U, \mathbf{u}^*)$.
(b) If we use the two norm, show that this problem can be approximated by a QP
whose solution does satisfy the constraints, but the solution may be suboptimal
compared to the original problem.
(b) Repeat for the convex step w1 = 0.8, w2 = 0.2. Are the results identical to the
previous part? If not, discuss any differences.
(c) For what choices of w1 , w2 does the target iteration converge using noncooper-
ative control for the target calculation?
$$\langle z_2 - \mathbf{u}_2^*,\ -\nabla_{\mathbf{u}_2}V|_{(\mathbf{u}_1^*, \mathbf{u}_2^*)}\rangle \leq 0 \qquad \forall z_2 \in U_2 \tag{6.43}$$
in which $V$ is a strictly convex function. Assume that the feasible region is convex and nonempty and denote the unique optimal solution as $(\mathbf{u}_1^*, \mathbf{u}_2^*, \ldots, \mathbf{u}_M^*)$ having cost $V^* = V(\mathbf{u}_1^*, \ldots, \mathbf{u}_M^*)$. Denote the $M$ one-variable-at-a-time optimization problems at iteration $p$

$$z_j^{p+1} = \arg\min_{\mathbf{u}_j}\ V(\mathbf{u}_1^p, \ldots, \mathbf{u}_j, \ldots, \mathbf{u}_M^p) \qquad \text{subject to}\quad \mathbf{u}_j \in U_j$$

Then define the next iterate to be the following convex combination of the previous and new points

$$\mathbf{u}_j^{p+1} = \alpha_j^pz_j^{p+1} + (1 - \alpha_j^p)\mathbf{u}_j^p \qquad j = 1, \ldots, M$$
$$\varepsilon \leq \alpha_j^p < 1 \qquad 0 < \varepsilon \qquad j = 1, \ldots, M,\quad p \geq 1$$
$$\sum_{j=1}^{M}\alpha_j^p = 1 \qquad p \geq 1$$

(a) Starting with any feasible point $(\mathbf{u}_1^0, \mathbf{u}_2^0, \ldots, \mathbf{u}_M^0)$, the iterations $(\mathbf{u}_1^p, \mathbf{u}_2^p, \ldots, \mathbf{u}_M^p)$ are feasible for $p \geq 1$.

(b) The objective function decreases monotonically from any feasible initial point

$$V(\mathbf{u}_1^{p+1}, \ldots, \mathbf{u}_M^{p+1}) \leq V(\mathbf{u}_1^p, \ldots, \mathbf{u}_M^p) \qquad \forall\,\mathbf{u}_j^0 \in U_j,\ j = 1, \ldots, M,\quad p \geq 1$$

(c) The cost sequence $V(\mathbf{u}_1^p, \mathbf{u}_2^p, \ldots, \mathbf{u}_M^p)$ converges to the optimal cost $V^*$ from any feasible initial point.

(d) The sequence $(\mathbf{u}_1^p, \mathbf{u}_2^p, \ldots, \mathbf{u}_M^p)$ converges to the optimal solution $(\mathbf{u}_1^*, \mathbf{u}_2^*, \ldots, \mathbf{u}_M^*)$ from any feasible initial point.
in which $V$ is a strictly positive quadratic function. Assume that the feasible region is convex and nonempty and denote the unique optimal solution as $(\mathbf{u}_1^*, \mathbf{u}_2^*)$ having cost $V^* = V(\mathbf{u}_1^*, \mathbf{u}_2^*)$. Consider the two one-variable-at-a-time optimization problems at iteration $p$

$$\mathbf{u}_1^{p+1} = \arg\min_{\mathbf{u}_1 \in U_1}\ V(\mathbf{u}_1, \mathbf{u}_2^p) \qquad \mathbf{u}_2^{p+1} = \arg\min_{\mathbf{u}_2 \in U_2}\ V(\mathbf{u}_1^p, \mathbf{u}_2)$$
We know from Exercise 6.15 that taking the full step in the unconstrained problem
with M = 2 achieves a cost decrease. We know from Exercise 6.19 that taking the full
step for an unconstrained problem with M ≥ 3 does not provide a cost decrease in
general. We know from Exercise 6.26 that taking a reduced step in the constrained
problem for all M achieves a cost decrease. That leaves open the case of a full step for
a constrained problem with M = 2.
Does the full step in the constrained case for M = 2 guarantee a cost decrease? If
so, prove it. If not, provide a counterexample.
(a) Given $r_1, r_2 > 0$, and functions $\gamma_1, \gamma_2$ of class $\mathcal{K}$, assume the following constraints are satisfied

$$|\mathbf{u}_1| \leq \gamma_1(|x_1|) \quad x_1 \in r_1B \qquad |\mathbf{u}_2| \leq \gamma_2(|x_2|) \quad x_2 \in r_2B$$

Show that there exists $r > 0$ and a function $\gamma$ of class $\mathcal{K}$ such that

$$|\mathbf{u}| \leq \gamma(|x|) \qquad x \in rB$$

(b) Given $r_1, r_2 > 0$, and constants $c_1, c_2, \sigma_1, \sigma_2 > 0$, assume the following constraints are satisfied

$$|\mathbf{u}_1| \leq c_1|x_1|^{\sigma_1} \quad x_1 \in r_1B \qquad |\mathbf{u}_2| \leq c_2|x_2|^{\sigma_2} \quad x_2 \in r_2B$$

Show that there exist $r > 0$ and constants $c, \sigma > 0$ such that

$$|\mathbf{u}| \leq c\,|x|^\sigma \qquad x \in rB$$
subject to

$$\begin{bmatrix} I-A_1 & & -\bar{B}_{11} & -\bar{B}_{12} \\ & I-A_2 & -\bar{B}_{21} & -\bar{B}_{22} \end{bmatrix}\begin{bmatrix} x_{1s} \\ x_{2s} \\ u_{1s} \\ u_{2s} \end{bmatrix} = \begin{bmatrix} B_{1d}\hat{d}_1(k) \\ B_{2d}\hat{d}_2(k) \end{bmatrix} \qquad E_1u_{1s} \leq e_1$$

Show that the constraints can be expressed so that the target problem constraints are uncoupled.
7.1 Introduction
In preceding chapters we show how model predictive control (MPC) can
be derived for a variety of control problems with constraints. It is in-
teresting to recall the major motivation for MPC; solution of a feedback
optimal control problem for constrained and/or nonlinear systems to
obtain a stabilizing control law is often prohibitively difficult. MPC
sidesteps the problem of determining a control law κ(·) by determin-
ing, instead, at each state x encountered, a control action u = κ(x)
by solving a mathematical programming problem. This procedure, if
repeated at every state x, yields an implicit control law κ(·) that solves
the original feedback problem. In many cases, determining an explicit
control law is impractical while solving a mathematical programming
problem online for a given state is possible; this fact has led to the
wide-scale adoption of MPC in the chemical process industry.
Some of the control problems for which MPC has been extensively
used, however, have recently been shown to be amenable to analysis,
at least for relatively simple systems. One such problem is control of
linear discrete time systems with polytopic constraints, for which de-
termination of a stabilizing control law was thought in the past to be
prohibitively difficult. It has been shown that it is possible, in principle,
to determine a stabilizing control law for some of these control prob-
lems. This result is often referred to as explicit MPC because it yields an
explicit control law in contrast to MPC that yields a control action for
each encountered state, thereby implicitly defining a control law. There
are two objections to this terminology. First, determination of control
laws for a wide variety of control problems has been the prime concern
of control theory since its birth and certainly before the advent of MPC,
U(x) = {u | (x, u) ∈ Z}
The set X is the domain of V 0 (·) and u0 (·) and is thus the set of points
x for which a feasible solution of P(x) exists; it is the projection of Z
(which is a set in (x, u)-space) onto x-space. See Figure 7.1, which
illustrates Z and U(x) for the case when U(x) = {u | Mu ≤ Nx + p};
the set Z is thus defined by Z := {(x, u) | Mu ≤ Nx + p}. In this case,
both Z and U(x) are polyhedral.
Before proceeding to consider parametric linear and quadratic pro-
gramming, some simple examples may help the reader to appreciate
the underlying ideas. Consider first a very simple parametric linear
448 Explicit Control Laws for Constrained Linear Systems
Figure 7.1: The sets $Z$, $\mathcal{X}$, and $U(x)$ in $(x, u)$-space; companion figures show the minimizer $u^0(x)$ against the constraint boundary of $Z$, and the unconstrained minimizer $u^0_{uc}(x)$.
u) > 0 for all u > u0uc (x) = x/2. Hence, in X1 , u0 (x) lies on the
boundary of Z and satisfies u0 (x) = 2 − x. Similarly, in X2 , u0 (x)
lies on the boundary of Z and satisfies u0 (x) = 2 − x/2. Finally, in
X3 , u0 (x) = u0uc (x) = x/2, the unconstrained minimizer, and lies in
the interior of Z for x > 1. The third constraint u ≥ 2 − x is active
in X1 , the second constraint u ≥ 2 − x/2 is active in X2 , while no
constraints are active in X3 . Hence the minimizer u0 (·) is piecewise
affine, being affine in each of the regions X1, X2, and X3. Since V^0(x) = (1/2)(x − u^0(x))^2 + u^0(x)^2/2, the value function V^0(·) is piecewise quadratic, being quadratic in each of the regions X1, X2, and X3.
We require, in the sequel, the following definitions.
\[
f(x) := \begin{cases} -x - 1 & x \in (-\infty, 0] \\ \;\;\,x + 1 & x \in [0, \infty) \end{cases}
\]
This function is set valued at x = 0 where it has the value f (0) = {−1,
1}. We shall mainly be concerned with continuous piecewise affine and
piecewise quadratic functions.
We now generalize the points illustrated by our example above and
consider, in turn, parametric quadratic programming and parametric linear programming.
1 The interior of a set S ⊆ Z relative to the set Z is the set {z ∈ S | (z + εB) ∩ aff(Z) ⊆ Z for some ε > 0}, where aff(Z) is the intersection of all affine sets containing Z.
7.3 Parametric Quadratic Programming
where x ∈ R^n and u ∈ R^m. The cost function V(·) is defined by
Z := {(x, u) | Mu ≤ Nx + p}
Assumption 7.3 implies that both R and Q are positive definite. The
cost function V (·) may be written in the form
7.3.2 Preview
We show in the sequel that V^0(·) is piecewise quadratic and u^0(·) piecewise affine on a polyhedral partition of X, the domain of both these functions. To do this, we take an arbitrary point x in X, and show that u^0(x) is the solution of an equality constrained QP P(x): min_u {V(x, u) | M_x^0 u = N_x^0 x + p_x^0} in which the equality constraint is M_x^0 u = N_x^0 x + p_x^0. We then show that there is a polyhedral region R_x^0 ⊂ X in which x lies and such that, for all w ∈ R_x^0, u^0(w) is the solution of the equality constrained QP P(w): min_u {V(w, u) | M_x^0 u = N_x^0 w + p_x^0} in which the equality constraints are the same as those for P(x). It follows that u^0(·) is affine and V^0(·) is quadratic in R_x^0. We then show that there are only a finite number of such polyhedral regions so that u^0(·) is piecewise affine, and V^0(·) piecewise quadratic, on a polyhedral partition of X. To carry out this program, we require a suitable characterization of optimality. We develop this in the next subsection. Some readers may prefer to jump to Proposition 7.8, which gives the optimality condition we employ in the sequel.
Definition 7.4 (Polar cone). The polar cone of a cone C ⊆ Rn is the cone
C ∗ defined by
C ∗ := {g ∈ Rn | ⟨g, h⟩ ≤ 0 ∀h ∈ C}
C ∗ = cone{ai | i ∈ I1:m }
U(x) := {v ∈ Rm | Mv ≤ Nx + p} (7.2)
[Figure: a polyhedral cone C = cone{a1, a2} in the (x1, x2) plane and its polar cone C*.]
Clearly
so that U(x) − {u} ⊆ C(x, u); for any (x, u) ∈ Z, any h ∈ C(x, u),
there exists an α > 0 such that u + αh ∈ U(x). Proposition 7.6 may be
expressed as: u is optimal for minu {V (x, u) | u ∈ U(x)} if and only if
Proof. We show that the condition ⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ C(x,
u) is equivalent to the condition ⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ U(x) −
{u} employed in Proposition 7.6. (i) Since U(x) − {u} ⊆ C(x, u),
⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ C(x, u) implies ⟨∇u V (x, u), h⟩ ≥ 0
for all h ∈ U(x) − {u}. (ii) ⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ U(x) − {u}
implies ⟨∇u V (x, u), αh⟩ ≥ 0 for all h ∈ U(x) − {u}, all α > 0. But,
for any h∗ ∈ C(x, u), there exists an α ≥ 1 such that h∗ = αh with
h := (1/α)h∗ ∈ U(x) − {u}. Hence ⟨∇u V (x, u), h∗ ⟩ = ⟨∇u V (x, u),
αh⟩ ≥ 0 for all h∗ ∈ C(x, u). ■
Note that C(x, u) and C ∗ (x, u) are both cones so that each set con-
tains the origin. In particular, C ∗ (x, u) is generated by the gradients
of the constraints active at z = (x, u), and may be defined by a set of
affine inequalities: for each z ∈ Z, there exists a matrix Lz such that
C ∗ (x, u) = C ∗ (z) = {g ∈ Rm | Lz g ≤ 0}
The importance of this result for us lies in the fact that the necessary
and sufficient condition for optimality is satisfaction of two polyhedral
constraints, u ∈ U(x) and −∇u V (x, u) ∈ C ∗ (x, u). Proposition 7.8
may also be obtained by direct application of Proposition C.12 of Ap-
pendix C; C ∗ (x, u) may be recognized as NU(x) (u), the regular normal
cone to the set U(x) at u.
U(x) := {u | Mu ≤ Nx + p}
∇u V (x, u) = Ru + Sx + r
Mu ≤ Nx + p
− (Ru + Sx + r ) ∈ C ∗ (x, u)
M_i u = N_i x + p_i, i ∈ I^0(x), i.e.,
M_x^0 u = N_x^0 x + p_x^0
where M_x^0, N_x^0, and p_x^0 are defined in (7.1). Hence u^0(x) is the solution
of the equality constrained problem
If the active constraint set remains constant near the point x or, more
precisely, if I 0 (x) ⊆ I 0 (w) for all w in some region in Rn containing
x, then, for all w in this region, u0 (w) satisfies the equality constraint
M(Kx w + kx ) ≤ Nw + p
−L_x^0 (R(K_x w + k_x) + Sw + r) ≤ 0
R_x^0 = {w | F_x w ≤ f_x}
so that R_x^0 is polyhedral. Since u_x^0(x) = u^0(x), it follows that u_x^0(x) ∈ U(x) and −∇_u V(x, u_x^0(x)) ∈ C^*(x, u^0(x)) so that x ∈ R_x^0.
Our next task is to bound the number of distinct regions R_x^0 that exist as we permit x to range over X. We note, from its definition, that R_x^0 is determined, through the constraint M_x^0 u = N_x^0 w + p_x^0 in P_x(w), through u_x^0(·) and through C^*(x, u^0(x)), by I^0(x), so that R_{x_1}^0 ≠ R_{x_2}^0 implies that I^0(x_1) ≠ I^0(x_2). Since the number of subsets of {1, 2, . . . , p} is finite, the number of distinct regions R_x^0 as x ranges over X is finite. Because each x ∈ X lies in the set R_x^0, there exists a discrete set of points X ⊂ X such that X = ∪{R_x^0 | x ∈ X}. We have proved the following.
where
\[
M = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \qquad N = \begin{bmatrix} 0 \\ 1/2 \\ 1 \end{bmatrix} \qquad p = \begin{bmatrix} -1 \\ -2 \\ -2 \end{bmatrix}
\]
(1/2)w − 2 ≤ −1 or w ≤ 2
(1/2)w − 2 ≤ (1/2)w − 2 or w ∈ R
(1/2)w − 2 ≤ w − 2 or w ≥ 0
2w − 4 ≤ 0 or w ≤ 2
where
\[
M := \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix} \qquad p := \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
\]
It follows from the solution to Example 2.5 that
\[
u^0(2) = \begin{bmatrix} -1 \\ -(1/2) \end{bmatrix}
\]
so that
\[
u_x^0(w) = \begin{bmatrix} -1 \\ (1/2) - (1/2)w \end{bmatrix}
\]
Hence u_x^0(2) = [−1, −1/2]′ = u^0(2) as expected. Since M_x^0 = M_2 = [−1, 0], C^*(x, u^0(x)) = {g ∈ R^2 | g_1 ≤ 0}. Also
\[
\nabla_u V(w, u) = \begin{bmatrix} 2w + 3u_1 + u_2 \\ w + u_1 + 2u_2 \end{bmatrix}
\]
so that
\[
\nabla_u V(w, u_x^0(w)) = \begin{bmatrix} (3/2)w - (5/2) \\ 0 \end{bmatrix}
\]
Hence R_x^0, x = 2, is the set of w satisfying the following inequalities
(1/2) − (1/2)w ≤ 1 or w ≥ −1
(1/2) − (1/2)w ≥ −1 or w ≤ 3
−(3/2)w + (5/2) ≤ 0 or w ≥ (5/3)
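These inequalities are easy to confirm numerically. The following is a minimal Octave/MATLAB sketch, assuming quadprog is available (Optimization Toolbox in MATLAB, the optim package in Octave); the Hessian and linear term of V in u are read off from ∇_u V(w, u) above, dropping terms that depend on w only since they do not affect the minimizer.

% Numerical check of the example: solve P(w) at w = 2 and recover
% u0(2) = [-1; -1/2]. From grad_u V(w,u) above, the Hessian in u is
% R = [3 1; 1 2] and the linear term is [2w; w].
R = [3 1; 1 2];
M = [1 0; -1 0; 0 1; 0 -1];    % M*u <= p, i.e., u in [-1, 1]^2
p = [1; 1; 1; 1];
w = 2;
u = quadprog(R, [2*w; w], M, p);
disp(u')                        % expect [-1.0000 -0.5000]
% For w in the region w >= 5/3 found above, u(1) stays at -1 and
% u(2) follows (1/2) - (1/2)*w, confirming the affine law.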
x⁺ = Ax + Bu     (7.6)
x(N) ∈ Xf (7.8)
in which, for all i, x(i) = φ(i; x, u), the solution of (7.6) at time i
if the initial state at time 0 is x and the control sequence is u :=
(u(0), u(1), . . . , u(N − 1)). The functions ℓ(·) and Vf (·) are quadratic
The state and control constraints (7.7) induce, via the difference equa-
tion (7.6), an implicit constraint (x, u) ∈ Z where
[Figure: a region in the (x1, x2) plane, x1 ∈ [−3, 3], x2 ∈ [−2, 2].]
where, for all i, x(i) = φ(i; x, u). It is easily seen that Z is polyhedral since, for each i, x(i) = A^i x + M_i u for some matrix M_i in R^{n×Nm}; here u is regarded as the column vector [u(0)′ u(1)′ · · · u(N − 1)′]′. Clearly x(i) = φ(i; x, u) is linear in (x, u). The constrained linear optimal control problem may now be defined by
in which V_{i−1}^0(·) is piecewise quadratic and X_{i−1} is polyhedral. The
decision variable u in each problem Pi has dimension m. But each
problem Pi (x), x ∈ Xi , is a parametric piecewise QP rather than a
conventional parametric QP. Hence a method for solving parametric
piecewise quadratic programming problems is required if dynamic pro-
gramming is employed to obtain a parametric solution to PN . Readers
not concerned with this extension should proceed to Section 7.7.
The parametric QP P(x) is defined, as before, by
Z := {(x, u) | Mu ≤ Nx + p}
3 Note that in this section the subscript i denotes partition i rather than “time to go.”
U(x) := {u | (x, u) ∈ Z} = {u | Mu ≤ Nx + p}
Let X ⊂ Rn be defined by
The set X is the domain of V 0 (·) and of u0 (·) and is thus the set of
points x for which a feasible solution of P(x) exists; it is the projection
of Z, which is a set in (x, u)-space, onto x-space as shown in Figure 7.1.
We make the following assumption in the sequel.
Ui (x) := {u | (x, u) ∈ Zi }
Thus the set Ui (x) is the set of admissible u at x, and problem Pi (x)
may be expressed as Vi0 (x) := minu {Vi (x, u) | u ∈ Ui (x)}; the set
Ui (x) is polytopic. For each i, problem Pi (x) may be recognized as a
standard parametric QP discussed in Section 7.4. Because of the piece-
wise nature of V (·), we require another definition.
S(z) := {i ∈ I | z ∈ Zi }
Proof. (i) Suppose u is optimal for P(x) but, contrary to what we wish
to prove, there exists an i ∈ S(x, u) = S 0 (x) such that u is not optimal
for Pi (x). Hence there exists a v ∈ Rm such that (x, v) ∈ Zi and
V (x, v) = Vi (x, v) < Vi (x, u) = V (x, u) = V 0 (x), a contradiction
of the optimality of u for P(x). (ii) Suppose u is optimal for Pi (x)
for all i ∈ S(x, u) but, contrary to what we wish to prove, u is not
optimal for P(x). Hence V 0 (x) = V (x, u0 (x)) < V (x, u). If u0 (x) ∈
Z(x,u) := ∪i∈S(x,u) Zi , we have a contradiction of the optimality of u
in Z(x,u) . Assume then that u0 (x) ∈ Zj , j ∉ S(x, u); for simplicity,
assume further that Zj is adjacent to Z(x,u) . Then, there exists a λ ∈
(0, 1] such that uλ := u + λ(u0 (x) − u) ∈ Z(x,u) ; if not, j ∈ S(x,
u), a contradiction. Since V (·) is strictly convex, V (x, uλ ) < V (x, u),
which contradicts the optimality of u in Z(x,u) . The case when Zj is not
adjacent to Z(x,u) may be treated similarly. ■
Z_i := {(x, u) | M^i u ≤ N^i x + p^i}
Let M_j^i, N_j^i, and p_j^i denote, respectively, the jth row of M^i, N^i, and p^i, and let I_i(x, u) and I_i^0(x), defined by
denote, respectively, the active constraint set at (x, u) ∈ Zi and the ac-
tive constraint set for Pi (x). Because we now use subscript i to specify
Zi , we change our notation slightly and now let Ci (x, u) denote the
cone of first-order feasible variations for Pi (x) at u ∈ Ui (x), i.e.,
Similarly, we define the polar cone C_i^*(x, u) of the cone C_i(x, u) at h = 0 by
Ex u = Fx x + gx
where the subscript x denotes the fact that the constraints are precisely
those constraints that are active for the problems Pi (x), i ∈ S 0 (x). The
fact that u0 (x) satisfies these constraints and is, therefore, the unique
solution of the optimization problem
where V_x(w, u) := V_i(w, u) for all i ∈ S^0(x) and is, therefore, a positive definite quadratic function of (w, u). The notation V_x^0(w) denotes
the fact that the parameter in the parametric problem Px (w) is now
w but the data for the problem, namely (Ex , Fx , gx ), is derived from
the solution u0 (x) of P(x) and is, therefore, x-dependent. Problem
Px (w) is a simple equality constrained problem in which the cost Vx (·)
is quadratic and the constraints Ex u = Fx w + gx are linear. Let Vx0 (w)
denote the value of Px (w) and u0x (w) its solution. Then
(b) Rx is a polytope
(c) x ∈ Rx
Proof.
(a) Because of the equality constraint (7.16), it follows that I_i(w, u_x^0(w)) ⊇ I_i(x, u^0(x)) and that S(w, u_x^0(w)) ⊇ S(x, u^0(x)) for all i ∈ S(x, u^0(x)) = S^0(x), all w ∈ R_x. Hence C_i(w, u_x^0(w)) ⊆ C_i(x, u^0(x)), which implies C_i^*(w, u_x^0(w)) ⊇ C_i^*(x, u^0(x)) for all i ∈ S(x, u^0(x)) ⊆ S(w, u_x^0(w)). It follows from the definition of R_x that u_x^0(w) ∈ U_i(w) and that −∇_u V_i(w, u_x^0(w)) ∈ C_i^*(w, u_x^0(w)) for all i ∈ S(w, u_x^0(w)). Hence u = u_x^0(w) satisfies the necessary and sufficient conditions of optimality for P_i(w) for all i ∈ S(w, u), all w ∈ R_x and, by Proposition 7.16, the necessary and sufficient conditions of optimality for P(w) for all w ∈ R_x. Hence u_x^0(w) = u^0(w) and V_x^0(w) = V^0(w) for all w ∈ R_x.
(b) That R_x is a polytope follows from the facts that the functions w ↦ u_x^0(w) and w ↦ ∇_u V_i(w, u_x^0(w)) are affine, the sets Z_i are polytopic, and the sets C_i^*(x, u^0(x)) are polyhedral; hence (w, u_x^0(w)) ∈ Z_i is a polytopic constraint and −∇_u V_i(w, u_x^0(w)) ∈ C_i^*(x, u^0(x)) a polyhedral constraint on w.
(c) That x ∈ R_x follows from Proposition 7.16 and the fact that u_x^0(x) = u^0(x). ■
with x(i) := φ(i; x, u); Vj0 (·) is the value function for Pj (x). As shown
in Chapter 2, the constrained DP recursion is
V_{j+1}^0(x) = min_u {ℓ(x, u) + V_j^0(f(x, u)) | u ∈ U, f(x, u) ∈ X_j}     (7.18)
We know from Section 7.4 that Vj0 (·) is continuous, strictly convex
and piecewise quadratic, and that κj (·) is continuous and piecewise
affine on a polytopic partition PX_j of X_j. Hence the function (x, u) ↦ V(x, u) := ℓ(x, u) + V_j^0(Ax + Bu) is continuous, strictly convex and
piecewise quadratic on a polytopic partition PZj+1 of the polytope Zj+1
defined by
Zj+1 := {(x, u) | x ∈ X, u ∈ U, Ax + Bu ∈ Xj }
Z := {z = (x, u) | x ∈ X, u ∈ U, Ax + Bu ∈ X}
V (x, u) = q′ x + r ′ u
Z := {(x, u) | Mu ≤ Nx + p}
The solution u0 (x) may be set valued. The parametric linear program
(LP) may also be expressed as
U(x) := {u | (x, u) ∈ Z} = {u | Mu ≤ Nx + p}
Also, as before, the domain of V 0 (·) and u0 (·), i.e., the set of points x
for which a feasible solution of P(x) exists, is the set X defined by
where
M_x^0 := M_{I^0(x)}, N_x^0 := N_{I^0(x)}, p_x^0 := p_{I^0(x)}
In this case, the matrix M_x^0 has rank m.
Any face F of U(x) with dimension d ∈ {1, 2, . . . , m} satisfies Mi u =
Ni x + pi for all i ∈ IF , all u ∈ F for some index set IF ⊆ I1:p . The matrix
MIF with rows Mi , i ∈ IF , has rank m − d, and the face F is defined by
F := {u | Mi u = Ni x + pi , i ∈ IF } ∩ U(x)
An important difference between this result and that for the para-
metric QP is that ∇u V (x, u) = r and, therefore, does not vary with x
or u. We now use this result to show that both V 0 (·) and u0 (·) are
piecewise affine. We consider the simple case when u0 (x) is unique
for all x ∈ X.
[Figure: the polytope U(x1) in the (u1, u2) plane with faces F1(x1), . . . , F6(x1), face normals M1′, . . . , M6′, the cost vector −r, the vertices u2,3 = u0(x1) and u3,4, and the faces F1(x2) and F1(x3).]
U(x1 ) has six faces: F1 (x1 ), F2 (x1 ), F3 (x1 ), F4 (x1 ), F5 (x1 ), and F6 (x1 ).
Face F1 (x) lies in the hyperplane H1 (x) that varies linearly with x;
each face Fi (x), i = 2, . . . , 6, lies in the hyperplane Hi that does not
vary with x. All the faces vary with x as shown so that U(x2 ) has four
faces: F1 (x2 ), F3 (x2 ), F4 (x2 ), and F5 (x2 ); and U(x3 ) has three faces:
F1 (x3 ), F4 (x3 ), and F5 (x3 ). The face F1 (x) is shown for three values
of x: x = x1 (the bold line), and x = x2 and x = x3 (dotted lines).
It is apparent that for x ∈ [x1 , x2 ], u0 (x) = u2,3 in which u2,3 is the
intersection of H2 and H3 , and u0 (x3 ) = u3,4 , in which u3,4 is the
intersection of H3 and H4 . It can also be seen that u0 (x) is unique for
all x ∈ X.
We now return to the general case. Suppose, for some x ∈ X, u0 (x)
is the unique solution of P(x); u0 (x) is the unique solution of
It follows that u0 (x) is the trivial solution of the simple equality con-
strained problem defined by
plays no part.
The optimization problem (7.20) motivates us, as in parametric quadratic programming, to consider, for any parameter w "close" to x, the simpler equality constrained problem Px(w) defined by
Let u_x^0(w) denote the solution of P_x(w). Because, for each x ∈ X, the matrix M_x^0 has full rank m, there exists an index set I_x such that M_{I_x} ∈ R^{m×m} is invertible. Hence, for each w, u_x^0(w) is the unique solution of
M_{I_x} u = N_{I_x} w + p_{I_x}
so that, for all x ∈ X, all w ∈ R^n,
u_x^0(w) = K_x w + k_x
where K_x := (M_{I_x})^{-1} N_{I_x} and k_x := (M_{I_x})^{-1} p_{I_x}. In particular, u^0(x) = u_x^0(x) = K_x x + k_x. Since V_x^0(w) = V(w, u_x^0(w)) = q′w + r′u_x^0(w), it follows that
V_x^0(w) = (q′ + r′K_x)w + r′k_x
for all x ∈ X, all w ∈ R^n. Both V_x^0(·) and u_x^0(·) are affine in w.
It follows from Proposition 7.19 that −r ∈ C^*(x, u^0(x)) = cone{M_i′ | i ∈ I^0(x) = I(x, u^0(x))} = cone{M_i′ | i ∈ I_x}. Since P_x(w) satisfies the conditions of Proposition 7.8, we may proceed as in Section 7.3.4 and define, for each x ∈ X, the set R_x^0 as in (7.5)
\[
R_x^0 := \left\{ w \in \mathbb{R}^n \;\middle|\; u_x^0(w) \in U(w),\; -\nabla_u V(w, u_x^0(w)) \in C^*(x, u^0(x)) \right\}
\]
It then follows, as shown in Proposition 7.9, that for any x ∈ X, u_x^0(w) is optimal for P(w) for all w ∈ R_x^0. Because P(w) is a parametric LP, however, rather than a parametric QP, it is possible to simplify the definition of R_x^0. We note that ∇_u V(w, u_x^0(w)) = r for all x ∈ X, all w ∈ R^n. Also, it follows from Proposition 7.8, since u^0(x) is optimal for P(x), that −∇_u V(x, u^0(x)) = −r ∈ C^*(x, u^0(x)) so that the second condition in the definition above for R_x^0 is automatically satisfied. Hence we may simplify our definition for R_x^0; for the parametric LP, R_x^0 may be defined by
R_x^0 := {w ∈ R^n | u_x^0(w) ∈ U(w)}     (7.22)
Because u_x^0(·) is affine, it follows from the definition of U(w) that R_x^0 is polyhedral. The next result follows from the discussion in Section 7.3.4.
Proposition 7.20 (Solution of P). For any x ∈ X, u_x^0(w) is optimal for P(w) for all w in the set R_x^0 defined in (7.22).
(b) The value function V^0(·) for P(x) and the minimizer u^0(·) are piecewise affine in X, being equal, respectively, to the affine functions V_x^0(·) and u_x^0(·) in each region R_x, x ∈ X.
(c) The value function V 0 (·) and the minimizer u0 (·) are continuous in
X.
Proof. The proof of parts (a) and (b) follows, apart from minor changes,
the proof of Proposition 7.10. The proof of part (c) uses the fact that
u0 (x) is unique, by assumption, for all x ∈ X and is similar to the
proof of Proposition 7.13. ■
where, now, V_N(x, u) = q′x + r′u.
Hence the problem has the same form as that discussed in Section 7.7
and may be solved as shown there.
It is possible, using a simple transcription, to use the solution of
PN (x) to solve the optimal control problem when the stage cost and
terminal cost are defined by
7.9 Computation
Our main purpose above was to establish the structure of the solution
of parametric linear or QPs and, hence, of the solutions of constrained
linear optimal control problems when the cost is quadratic or linear.
We have not presented algorithms for solving these problems, although there is now a considerable literature on this topic. One of the earliest algorithms (Serón, De Doná, and Goodwin, 2000) is enumeration
based: checking every active set to determine if it defines a non-empty
region in which the optimal control is affine. There has recently been
a return to this approach because of its effectiveness in dealing with
systems with relatively high state dimension but a low number of con-
straints (Feller, Johansen, and Olaru, 2013). The enumeration based
procedures can be extended to solve mixed-integer problems. While
the early algorithms for parametric linear and quadratic programming
have exponential complexity, most later algorithms are based on a lin-
ear complementarity formulation and execute in polynomial time in
the number of regions; they also use symbolic perturbation to select a
unique and continuous solution when one exists (Columbano, Fukuda,
and Jones, 2009). Some research has been devoted to obtaining ap-
proximate solutions with lower complexity but guaranteed properties
such as stability (Borrelli, Bemporad, and Morari, 2017, Chapter 13).
Toolboxes for solving parametric linear and quadratic programming
problems include the Multi-Parametric Toolbox MPT3 in MATLAB, described in Herceg, Kvasnica, Jones, and Morari (2013).
A feature of parametric problems is that state dimension is not a
reliable indicator of complexity. There exist problems with two states that require over 10^5 regions, and problems with 80 states that require only hundreds of regions. While problems with state dimension less
than, say, 4 can be expected to have reasonable complexity, higher di-
mension problems may or may not have manageable complexity.
7.10 Notes
Early work on parametric programming, e.g., (Dantzig, Folkman, and
Shapiro, 1967) and (Bank, Guddat, Klatte, Kummer, and Tanner, 1983),
was concerned with the sensitivity of optimal solutions to parameter
variations. Solutions to the parametric linear programming problem
were obtained relatively early (Gass and Saaty, 1955) and (Gal and Ne-
doma, 1972). Solutions to parametric QPs were obtained in (Serón et al.,
2000) and (Bemporad, Morari, Dua, and Pistikopoulos, 2002) and ap-
plied to the determination of optimal control laws for linear systems
with polyhedral constraints. Since then a large number of papers on
this topic have appeared, many of which are reviewed in (Alessio and
Bemporad, 2009). Most papers employ the Kuhn-Tucker conditions of
optimality in deriving the regions Rx , x ∈ X. Use of the polar cone con-
dition was advocated in (Mayne and Raković, 2002) in order to focus on
the geometric properties of the parametric optimization problem and
avoid degeneracy problems. Section 7.5, on parametric piecewise quad-
ratic programming, is based on (Mayne, Raković, and Kerrigan, 2007).
The example in Section 7.4 was first computed by Raković (Mayne and
Raković, 2003). That results from parametric linear and quadratic pro-
gramming can be employed, instead of maximum theorems, to estab-
lish continuity of u0 (·) and, hence, of V 0 (·), was pointed out by Bem-
porad et al. (2002) and Borrelli (2003, p. 37).
Much research has been devoted to obtaining reliable algorithms;
see the survey papers (Alessio and Bemporad, 2009) and (Jones, Barić,
and Morari, 2007) and the references therein. Jones (2017, Chapter
13) provides a useful review of approximate explicit control laws of
specified complexity that nevertheless guarantee stability and recursive
feasibility.
7.11 Exercises
where, now, v is a column vector whose components are u(0), u(1), . . . , u(N−1), ℓx (0),
ℓx (1), . . . , ℓx (N), ℓu (0), ℓu (1), . . . , ℓu (N − 1) and f ; the cost VN (x, v) is now defined
by
\[
V_N(x, v) = \sum_{i=0}^{N-1} \left( \ell_x(i) + \ell_u(i) \right) + f
\]
(b) Next consider MPC control of the following system with state inequality con-
straint and no input constraints
" # " # " #
−1/4 1 1 1 1
A= B= x(k) ≤ k ∈ I0:N
−1 1/2 −1 −1 1
Using a horizon N = 1, eliminate the state x(1) and write out the MPC QP for
the input u(0) in the form given above for Q = R = I and zero terminal penalty.
Find an initial condition x0 such that the MPC constraint matrix D and vector d
are identical to those given in the previous part. Is this x0 ∈ XN ?
Are the rows of the matrix of active constraints linearly independent in this MPC
QP on the set XN? Are the MPC control law κN(x) and optimal value function V_N^0(x) Lipschitz continuous on the set XN for this system? Explain the reason if these two answers differ.
[Plot: kernel density estimates of probability density vs. execution time (ms) for Octave implicit, Octave explicit, MATLAB implicit, and MATLAB explicit.]
Figure 7.8: Solution times for explicit and implicit MPC for N = 20.
Plot shows kernel density estimate for 10,000 samples
using a Gaussian kernel (σ = 1 ms).
so that the height of the tank returns to hsp automatically. Unfortunately, the controller parameters are not very Smart™, as they are fixed permanently at Kc = 1/2 and τc = 1.
(a) Simulate the closed-loop behavior of the system starting from h = −1, ϵ = 0
with hsp ≡ 0.
(c) Add the constraint q ∈ [−0.2, 0.2] to the MPC formulation, and design an explicit
MPC controller valid for h ∈ [−5, 5] and ϵ ∈ [−10, 10] (use solvempqp.m from
Figure 7.6, and add constraints Ep ≤ e to only search the region of interest). How
large does N have to be so that the full region is covered? How much storage is
needed to implement this controller?
x+ = x + u
with x representing the amount of stored energy in the tank, and u giving the amount
of electricity that is purchased for the battery (u > 0) or discharged from the battery and sold back to the grid (u < 0). We wish to find an explicit control law based on the initial condition x(0) and a known forecast of electricity prices c(0), c(1), . . . , c(N − 1).
(a) To start, suppose that u is constrained to the interval [−1, 1] but x is uncon-
strained. A reasonable optimization problem is
\[
\begin{aligned}
\min_{\mathbf{u}} \quad & \sum_{k=0}^{N-1} c(k)u(k) + 0.1u(k)^2 \\
\text{s.t.} \quad & x(k+1) = x(k) + u(k) \\
& u(k) \in [-1, 1]
\end{aligned}
\]
where the main component of the objective function is the cost of electricity
purchase/sale with a small penalty added to discourage larger transactions. By
removing the state evolution equation, formulate an explicit quadratic program-
ming problem with N variables (the u(k)) and N + 1 parameters (x(0) and the
price forecast c(k)). What is a theoretical upper bound on the number of regions
in the explicit control law? Assuming that x(0) ∈ [−10, 10] and each c(k) ∈ [−1, 1], find the explicit control law for a few small values of N. (Consider using
solvempqp.m from Figure 7.6; you will need to add constraints Ep ≤ e on the
parameter vector to make sure the regions are bounded.) How many regions do
you find?
(b) To make the problem more realistic, we add the constraint x(k) ∈ [−10, 10]
to the optimization, as well as an additional penalty on stored inventory. The
optimization problem is then
\[
\begin{aligned}
\min_{\mathbf{u}} \quad & \sum_{k=0}^{N-1} c(k)u(k) + 0.1u(k)^2 + 0.01x(k)^2 \\
\text{s.t.} \quad & x(k+1) = x(k) + u(k) \\
& u(k) \in [-1, 1] \\
& x(k) \in [-10, 10]
\end{aligned}
\]
Repeat the previous part but using the new optimization problem.
(c) Suppose you wish to solve this problem with a 7-day horizon and a 1-hour time step. Can you use the explicit solution of either formulation? (Hint: for comparison, there are roughly 10^80 atoms in the observable universe.)
Bibliography
F. Borrelli, A. Bemporad, and M. Morari. Predictive Control for Linear and Hy-
brid Systems. Cambridge University Press, 2017.
8.1 Introduction
Numerical optimal control methods are at the core of every model pre-
dictive control implementation, and algorithmic choices strongly affect
the reliability and performance of the resulting MPC controller. The
aim of this chapter is to explain some of the most widely used algo-
rithms for the numerical solution of optimal control problems. Before
we start, recall that the ultimate aim of the computations in MPC is to
find a numerical approximation of the optimal feedback control u0 (x0 )
for a given current state x0 . This state x0 serves as initial condition for
an optimal control problem, and u0 (x0 ) is obtained as the first control
of the trajectory that results from the numerical solution of the optimal
control problem. Due to a multitude of approximations, the feedback
law usually is not exact. Some of the reasons are the following.
While the first two of the above are discussed in Chapters 2 and 3 of
this book, the last two are due to the numerical solution of the opti-
mal control problems arising in model predictive control and are the
focus of this chapter. We argue throughout the chapter that it is not a
good idea to insist that the finite horizon MPC problem be solved
exactly. First, it usually is impossible to solve a simulation or opti-
mization problem without any numerical errors, due to finite precision
arithmetic and finite computation time. Second, it might not even be
desirable to solve the problem as exactly as possible, because the neces-
sary computations might lead to large feedback delays or an excessive
use of CPU resources. Third, in view of the other errors that are nec-
essarily introduced in the modeling process and in the MPC problem
Figure 8.1: Feasible set and reduced objective ψ(u(0)) of the nonlinear MPC Example 8.1.
on the right of Figure 8.1 and it can clearly be seen that two different
locally optimal solutions exist, only one of which is the globally optimal
choice. □
\[
\underset{\mathbf{u}}{\text{minimize}} \quad \sum_{k=0}^{N-1} \ell(\phi(k; x_0, \mathbf{u}), u(k)) + V_f(\phi(N; x_0, \mathbf{u})) \tag{8.3a}
\]
\[
\frac{dx}{dt} = f_c(x, u)
\]
x(·) and u(·) to denote the state and control trajectories, the continu-
ous time optimal control problem (OCP) can be formulated as follows
\[
\underset{x(\cdot),\, u(\cdot)}{\text{minimize}} \quad \int_0^T \ell_c(x(t), u(t))\, dt + V_f(x(T)) \tag{8.5a}
\]
\[
\underset{x,\, u}{\text{minimize}} \quad \sum_{t \in h\mathbb{I}_{0:N-1}} h\, \ell_c(x(t), u(t)) + V_f(x(Nh)) \tag{8.6a}
\]
starting with x̃(0) = x0. Which local error do we make in each step? For local error analysis, we assume that the starting point x̃(t) was on an exact trajectory, i.e., equal to x(t), while the result of the integrator step x̃(t + h) is different from x(t + h). For the analysis, we assume that the
true trajectory x(t) is twice continuously differentiable with bounded second derivatives, which implies that its first-order Taylor series satisfies x(t + h) = x(t) + hẋ(t) + O(h²), where O(h²) denotes an arbitrary function whose size shrinks faster than h² for h → 0. Since the first derivative is known exactly, ẋ(t) = f(t, x(t)), and was used in the Euler
x̃(t + h) = x̃(t) + Φ(t, x̃(t), h)
Here, the map Φ approximates the integral ∫_t^{t+h} f(τ, x(τ)) dτ. If Φ were equal to this integral, the integration method would be exact, due to the identity
\[
x(t + h) - x(t) = \int_t^{t+h} \dot{x}(\tau)\, d\tau = \int_t^{t+h} f(\tau, x(\tau))\, d\tau
\]
k1 = f (t, x)
k2 = f (t + h/2, x + (h/2)k1 )
k3 = f (t + h/2, x + (h/2)k2 )
k4 = f (t + h, x + hk3 )
Φ = (h/6)k1 + (h/3)k2 + (h/3)k3 + (h/6)k4
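These four stage evaluations translate directly into code. The following is a minimal Octave/MATLAB sketch of a single RK4 step; the function name rk4step is ours.

% One step of the classical RK4 method: f is a function handle f(t, x),
% h the step size; returns the approximation of x(t + h).
function xplus = rk4step(f, t, x, h)
  k1 = f(t, x);
  k2 = f(t + h/2, x + (h/2)*k1);
  k3 = f(t + h/2, x + (h/2)*k2);
  k4 = f(t + h, x + h*k3);
  xplus = x + (h/6)*k1 + (h/3)*k2 + (h/3)*k3 + (h/6)*k4;
end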
k1 = f(t + c1 h, x)
k2 = f(t + c2 h, x + h(a21 k1))
k3 = f(t + c3 h, x + h(a31 k1 + a32 k2))
⋮
ks = f(t + cs h, x + h(as1 k1 + · · · + as,s−1 ks−1))
[Figure: (a) accuracy E(2π) versus number of function evaluations for the explicit Euler, Heun, and RK4 methods; (b) simulation results for M = 32, showing x1 and x2 versus t against the exact solution.]
\[
\begin{array}{c|cccc}
c_1 & & & & \\
c_2 & a_{21} & & & \\
c_3 & a_{31} & a_{32} & & \\
\vdots & \vdots & & \ddots & \\
c_s & a_{s1} & \cdots & a_{s,s-1} & \\
\hline
 & b_1 & b_2 & \cdots & b_s
\end{array}
\]
The tableaus of the explicit Euler method, Heun's method, and the classical RK4 method are, respectively,
\[
\begin{array}{c|c}
0 & \\
\hline
 & 1
\end{array}
\qquad
\begin{array}{c|cc}
0 & & \\
1 & 1 & \\
\hline
 & 1/2 & 1/2
\end{array}
\qquad
\begin{array}{c|cccc}
0 & & & & \\
1/2 & 1/2 & & & \\
1/2 & 0 & 1/2 & & \\
1 & 0 & 0 & 1 & \\
\hline
 & 1/6 & 2/6 & 2/6 & 1/6
\end{array}
\]
have a higher order than s. Only for orders of four or less do there exist explicit Runge-Kutta methods for which the order and the number of stages coincide.
x⁺ = x + hf(t + h, x⁺)
Note that the desired output value x⁺ also appears on the right side of the equation. For the scalar linear ODE ẋ = λx, the implicit Euler step is determined by x⁺ = x + hλx⁺, which can explicitly be solved to give x⁺ = x/(1 − hλ). For any negative λ, the denominator is larger than one, and the numerical approximation x̃(kh) = x0/(1 − hλ)^k therefore decays exponentially, similar to the exact solution. An integration
method which has the desirable property that it remains stable for the
test ODE ẋ = λx whenever Re(λ) < 0 is called A-stable. While none of
the explicit Runge-Kutta methods is A-stable, the implicit Euler method
is A-stable. But it has a low order. Can we devise A-stable methods that
have a higher order?
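Before answering this question, the contrast between the two Euler methods on the test ODE is easy to reproduce. A minimal Octave/MATLAB sketch, with λ and h chosen such that |1 + hλ| > 1:

% Explicit vs. implicit Euler on xdot = lambda*x with Re(lambda) < 0.
% Here h*lambda = -5, so explicit Euler (x+ = (1 + h*lambda)*x) is
% unstable while implicit Euler (x+ = x/(1 - h*lambda)) decays.
lambda = -50; h = 0.1; K = 20;
xe = 1; xi = 1;
for k = 1:K
  xe = (1 + h*lambda)*xe;   % explicit Euler iterate
  xi = xi/(1 - h*lambda);   % implicit Euler iterate
end
disp([xe, xi])  % xe has grown to about 1e12, xi is essentially zero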
Φ = h(b1 k1 + b2 k2 + · · · + bs ks)
Note that the upper s equations are implicit and form a root-finding
problem with sn nonlinear equations in sn unknowns, where s is the
number of RK stages and n is the state dimension of the differen-
tial equation ẋ = f (t, x). Nonlinear root-finding problems are usually
solved by Newton’s method, which is treated in the next section. For
Newton’s method to work, one has to assume that the Jacobian of the
residual function is invertible. For the RK equations above, this can be
shown to always hold if the time step h is sufficiently small, depending
on the right-hand-side function f . After the values k1 , . . . , ks have been
computed, the last line can be executed and yields the resulting map
Φ(t, x, h). The integrator then uses the map Φ to proceed to the next
integration step exactly as the other one-step methods, according to
x̃(t + h) = x̃(t) + Φ(t, x̃(t), h)
For implicit integrators, contrary to the explicit ones, the map Φ cannot
easily be written down as a series of function evaluations. Evaluation of
Φ(t, x, h) includes the root-finding procedure and typically needs sev-
eral evaluations of the root-finding equations and of their derivatives.
Thus, an s-stage implicit Runge-Kutta method is significantly more ex-
pensive per step compared to an s-stage explicit Runge-Kutta method.
Implicit integrators are usually preferable for stiff ordinary differential
equations, however, due to their better stability properties.
Many different implicit Runge-Kutta methods exist, and each of
them can be defined by its Butcher tableau. For an implicit RK method,
at least one of the diagonal and upper-triangular entries (aij with j ≥ i)
is nonzero. Some methods try to limit the implicit part for easier com-
putations. For example, the diagonally implicit Runge-Kutta methods
have only the diagonal entries nonzero while the upper-triangular part
remains zero.
Note that the integrals over the Lagrange basis polynomials depend
only on the relative positions of the collocation time points, and directly
yield the coefficients aij . Likewise, to obtain the coefficients bi , we
evaluate x̃(t + h; x, k1, k2, . . . , ks), which is given by
\[
x + \int_t^{t+h} \dot{\tilde{x}}(\tau; k_1, \ldots, k_s)\, d\tau = x + \sum_{i=1}^{s} k_i\, h \underbrace{\int_0^1 L_i(\sigma)\, d\sigma}_{=:\,b_i}
\]
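The coefficients a_ij and b_i can thus be generated for any choice of collocation points by polynomial integration. A minimal Octave/MATLAB sketch, shown here for the two Gauss-Legendre points of the GL4 method:

% Butcher coefficients of a collocation method from its points c:
% a(m,i) = integral of L_i over [0, c_m], b(i) = integral over [0, 1].
c = [1/2 - sqrt(3)/6, 1/2 + sqrt(3)/6];  % Gauss-Legendre, order 4
s = numel(c);
a = zeros(s); b = zeros(1, s);
for i = 1:s
  cj = c([1:i-1, i+1:s]);
  Li = poly(cj)/prod(c(i) - cj);  % coefficients of L_i(sigma)
  P = polyint(Li);                % antiderivative, with P(0) = 0
  b(i) = polyval(P, 1);
  for m = 1:s
    a(m, i) = polyval(P, c(m));
  end
end
disp(a), disp(b)  % recovers the GL4 tableau, with b = [1/2 1/2]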
In Figure 8.3, the difference between the exact solution x(τ) and the collocation polynomial x̃(τ) as well as the difference between their
1 The Lagrange basis polynomials are defined by
\[
L_i(\sigma) := \prod_{1 \leq j \leq s,\, j \neq i} \frac{(\sigma - c_j)}{(c_i - c_j)}
\]
[Figure 8.3: the collocation polynomial x̃1(τ) and its derivative compared with the exact x1(τ) and ẋ1(τ) on [t, t + h], with stage derivatives k1, k2 at the collocation times t + c1h and t + c2h.]
ẋ = Ax − 500 x (|x|² − 1)
Some system models contain not only differential but also algebraic equations, and therefore belong to the class of differential algebraic
[Figure: (a) accuracy E(2π) versus number of collocation points M for the implicit Euler, GL2, and GL4 methods; (b) simulation result for M = 10 points, showing x1 and x2 versus t against the exact solution.]
ẋ = f (t, x, z) (8.9a)
0 = g(t, x, z) (8.9b)
on the problem, one can usually not even be sure that a solution z0
with R(z0 ) = 0 exists. And if one has found a solution, one usually
cannot be sure that it is the only one. Despite these theoretical diffi-
culties with nonlinear root-finding problems, they are nearly as widely
formulated and solved in science and engineering as linear equation
systems.
In this section we therefore consider a continuously differentiable
function R : R^{n_z} → R^{n_z}, z ↦ R(z), where our aim is to solve the nonlinear equation
R(z) = 0
Nearly all algorithms to solve this system derive from an algorithm
called Newton’s method or Newton-Raphson method that is accredited
to Isaac Newton (1643–1727) and Joseph Raphson (about 1648–1715),
but which was first described in its current form by Thomas Simpson
(1710–1761). The idea is to start with an initial guess z0 , and to gener-
ate a sequence of iterates (z_k)_{k=0}^∞ by linearizing the nonlinear equation at the current iterate
\[
R(z_k) + \frac{\partial R}{\partial z}(z_k)(z - z_k) = 0
\]
This equation is a linear system in the variable z, and if the Jacobian J(z_k) := (∂R/∂z)(z_k) is invertible, we can explicitly compute the next iterate as
z_{k+1} = z_k − J(z_k)^{-1} R(z_k)
Here, we use the notation J(zk )−1 R(zk ) as a shorthand for the algo-
rithm that solves the linear system J(zk )∆z = R(zk ). In the actual
computation of a Newton step, the inverse J(zk )−1 is never computed,
but only a LU-decomposition of J(zk ), and a forward and a back sub-
stitution, as described in the previous subsection.
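In code, the iteration is only a few lines. A minimal Octave/MATLAB sketch (function name ours); the backslash operator internally performs the LU decomposition with forward and back substitution:

% Newton's method for R(z) = 0. R and J are function handles returning
% the residual and the Jacobian; stops when the step is below tol.
function z = newton(R, J, z, tol, maxit)
  for k = 1:maxit
    dz = J(z) \ R(z);     % solve J(z)*dz = R(z) by LU factorization
    z = z - dz;
    if norm(dz) <= tol
      break
    end
  end
end
% Example: newton(@(z) z^2 - 2, @(z) 2*z, 2, 1e-12, 20) returns sqrt(2).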
More generally, we can use an invertible approximation Mk of the Ja-
cobian J(zk ), leading to the Newton-type methods. The general Newton-
type method iterates according to
[Figure 8.5: the residual R(z) plotted for z ∈ [1, 2]; left: iterates of Newton's method, right: iterates of the Newton-type method with constant Jacobian.]
When started at z0 = 2 the first iteration would be the same as for New-
ton’s method, but then the Newton-type method with constant Jacobian
produces a different sequence, as can be seen on the right side of Fig-
ure 8.5. Here, the approximate method also converges; but in general,
• z_k = 1/2^{2^k} converges q-quadratically, because z_{k+1}/(z_k)^2 = (2^{2^k})^2 / 2^{2^{k+1}} = 1 < ∞. For k = 6, z_k = 1/2^{64} ≈ 0. This is a typical feature of q-quadratic convergence: often, convergence up to machine precision is obtained in about six iterations. □
2 The historical prefix “q” stands for “quotient,” to distinguish it from a weaker form
of convergence that is called “r-convergence,” where “r” stands for “root.”
[Figure: the sequences z_k = 0.99^k, z_k = 1/2^k, z_k = 1/k!, and z_k = 1/2^{2^k} plotted on a logarithmic scale against k up to 50, illustrating the different convergence rates.]
cost(F , J) = (1 + m) cost(F )
There exists a variety of more accurate, but also more expensive, forms
of numerical differentiation, which can be derived from polynomial in-
terpolation of multiple function evaluations of F . The easiest of these
are central differences, which are based on a positive and a negative
perturbation. Using such higher-order formulas with adaptive pertur-
bation size selection, one can obtain high-accuracy derivatives with nu-
merical differentiation, but at significant cost. One interesting way to
actually reduce the cost of the numerical Jacobian calculation arises if
the Jacobian is known to be sparse, and if many of its columns are struc-
turally orthogonal, i.e., have their nonzero entries at different locations.
To efficiently generate a full Jacobian, one can, for example, use the al-
gorithm by Curtis, Powell, and Reid (1974) that is implemented in the
FORTRAN routine TD12 from the HSL Mathematical Software Library
(formerly Harwell Subroutine Library). For details of sparse Jacobian
evaluations, we refer to the review article by Gebremedhin, Manne, and
Pothen (2005).
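For dense problems, the forward-difference Jacobian at cost (1 + m) cost(F) is straightforward to implement. A minimal Octave/MATLAB sketch (function name ours); the perturbation-size rule is a common heuristic, not a prescription:

% Forward-difference Jacobian of F at u, costing 1 + m evaluations of F.
function J = fdjac(F, u)
  m = numel(u);
  F0 = F(u);
  J = zeros(numel(F0), m);
  t = sqrt(eps)*max(1, norm(u));  % perturbation size (heuristic)
  for i = 1:m
    e = zeros(m, 1); e(i) = 1;
    J(:, i) = (F(u + t*e) - F0)/t;
  end
end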
In summary, and despite the tricks to improve accuracy or effi-
ciency, one has to conclude that numerical differentiation often re-
sults in quite inaccurate derivatives, and its only—but practically
important—advantage is that it works for any black-box function that
can be evaluated on a given computer. Fortunately, there exists a dif-
ferent technology, called AD, that also has tight bounds on the com-
putational cost of the Jacobian evaluation, but avoids the numerical
inaccuracies of numerical differentiation. It is often even faster than
numerical differentiation, and in the case of reverse derivatives ȳ ′ J, it
can be tremendously faster. It does so, however, by opening the black
box.
and the Jacobian of F is simply given by J(u) = C (dx^*/du)(u). The forward directional derivative is given by
\[
J(u)\dot u = C \underbrace{\left(-\frac{\partial G}{\partial x}\right)^{-1} B \dot u}_{=:\,\dot x} = C \dot x
\]
Here, we have introduced the dot quantities ẋ that denote the directional derivative of x^*(u) into the direction u̇, i.e., ẋ = (dx^*/du) u̇. An efficient algorithm to compute ẋ corresponds to the solution of a lower-triangular linear equation system that is given by
\[
\left(-\frac{\partial G}{\partial x}\right) \dot x = B \dot u \tag{8.14}
\]
\[
\bar y' J(u) = \underbrace{\bar y' C \left(-\frac{\partial G}{\partial x}\right)^{-1}}_{=:\,\bar x'} B = \bar x' B
\]
where we define the bar quantities x̄ that have a different meaning than
the dot quantities. For computing x̄, we need to also solve a linear
system, but with the transposed system matrix
\[
\left(-\frac{\partial G}{\partial x}\right)' \bar x = C' \bar y \tag{8.15}
\]
Let us regard Example 8.9 and find the corresponding function G(x, u)
as well as the involved matrices. The function G corresponds to the
The right-hand-side vectors in the equations (8.14) and (8.15) are given by
\[
B\dot u = \begin{bmatrix} \dot u_1 \\ \dot u_2 \\ \dot u_3 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\quad \text{and} \quad
C'\bar y = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ \bar y_1 \\ 0 \\ \bar y_2 \end{bmatrix}
\]
\[
\dot x_k = \frac{\partial \varphi_k}{\partial x_i}(x_i, x_j)\, \dot x_i + \frac{\partial \varphi_k}{\partial x_j}(x_i, x_j)\, \dot x_j
\]
follows
x1 = u1 ẋ1 = u̇1
x2 = u2 ẋ2 = u̇2
x3 = u3 ẋ3 = u̇3
x4 = x1 x2 ẋ4 = x2 ẋ1 + x1 ẋ2
x5 = sin(x4 ) ẋ5 = cos(x4 )ẋ4
x6 = x4 x3 ẋ6 = x3 ẋ4 + x4 ẋ3
x7 = exp(x6 ) ẋ7 = exp(x6 )ẋ6
x8 = x5 + x7 ẋ8 = ẋ5 + ẋ7
y1 = x6 ẏ1 = ẋ6
y2 = x8 ẏ2 = ẋ8
The result of the original algorithm is y = [y1 y2 ]′ and the result of the
forward AD sweep is ẏ = [ẏ1 ẏ2 ]′ . If desired, one could perform both
algorithms in parallel, i.e., evaluate first the left side, then the right side
of each row consecutively. This procedure would allow one to delete
each intermediate variable and the corresponding dot quantity after its
last usage, making the memory demands of the joint evaluation just
twice as big as those of the original function evaluation. □
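The parallel evaluation is easy to write out explicitly. A minimal Octave/MATLAB sketch of the forward sweep for this example (function name ours); seeding u̇ with the ith unit vector returns the ith column of the Jacobian J(u):

% Forward AD sweep for the example: y = [x6; x8]. Each line evaluates
% an elementary operation together with its dot quantity.
function [y, ydot] = forward_sweep(u, udot)
  x1 = u(1);       d1 = udot(1);
  x2 = u(2);       d2 = udot(2);
  x3 = u(3);       d3 = udot(3);
  x4 = x1*x2;      d4 = x2*d1 + x1*d2;
  x5 = sin(x4);    d5 = cos(x4)*d4;
  x6 = x4*x3;      d6 = x3*d4 + x4*d3;
  x7 = exp(x6);    d7 = exp(x6)*d6;
  x8 = x5 + x7;    d8 = d5 + d7;
  y = [x6; x8];    ydot = [d6; d8];
end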
One can see that the dot-quantity evaluations on the right-hand
side—which we call a forward sweep—are never longer than about twice
the original line of code. This is because each elementary operation de-
pends on at maximum two intermediate variables. More generally, it
can be proven that the computational cost of one forward sweep in
AD is smaller than a small constant times the cost of a plain function
evaluation. This constant depends on the chosen set of elementary
operations, but is usually much less than two, so that we conclude
\[
\bar x_i = \bar x_i + \bar x_k \frac{\partial \varphi_k}{\partial x_i}(x_i, x_j)
\quad \text{and} \quad
\bar x_j = \bar x_j + \bar x_k \frac{\partial \varphi_k}{\partial x_j}(x_i, x_j)
\]
other, noting that the final result for each x̄i will be a sum of the right-
hand-side vector component C ′ ȳ and a weighted sum of the values x̄j
for those j > i which correspond to elementary operations that have
xi as an input. We therefore initialize all variables by x̄ = C ′ ȳ, which
results for the example in the initialization
x̄1 = 0 x̄5 = 0
x̄2 = 0 x̄6 = ȳ1
x̄3 = 0 x̄7 = 0
x̄4 = 0 x̄8 = ȳ2
In the reverse sweep, the algorithm updates the bar quantities in re-
verse order compared to the original algorithm, processing one column
after the other.
// differentiation of x8 = x5 + x7
x̄5 = x̄5 + x̄8
x̄7 = x̄7 + x̄8
// differentiation of x7 = exp(x6 )
x̄6 = x̄6 + x̄7 exp(x6 )
// differentiation of x6 = x4 x3
x̄4 = x̄4 + x̄6 x3
x̄3 = x̄3 + x̄6 x4
// differentiation of x5 = sin(x4 )
x̄4 = x̄4 + x̄5 cos(x4 )
// differentiation of x4 = x1 x2
x̄1 = x̄1 + x̄4 x2
x̄2 = x̄2 + x̄4 x1
ū1 = x̄1
ū2 = x̄2
ū3 = x̄3
to read out the desired result ȳ ′ J(x) = [ū1 ū2 ū3 ]. Note that all three
of the components are returned by only one reverse sweep. □
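Written out in code, the reverse sweep looks as follows; a minimal Octave/MATLAB sketch (function name ours) that returns ȳ′J(u) for a given seed ȳ:

% Reverse AD sweep for the same example. The forward pass stores the
% intermediates; the reverse pass accumulates the bar quantities.
function ubar = reverse_sweep(u, ybar)
  x1 = u(1); x2 = u(2); x3 = u(3);
  x4 = x1*x2; x6 = x4*x3;          % only x4 and x6 are needed below
  b = zeros(8, 1);
  b(6) = ybar(1); b(8) = ybar(2);  % initialization xbar = C'*ybar
  b(5) = b(5) + b(8);  b(7) = b(7) + b(8);       % x8 = x5 + x7
  b(6) = b(6) + b(7)*exp(x6);                    % x7 = exp(x6)
  b(4) = b(4) + b(6)*x3;  b(3) = b(3) + b(6)*x4; % x6 = x4*x3
  b(4) = b(4) + b(5)*cos(x4);                    % x5 = sin(x4)
  b(1) = b(1) + b(4)*x2;  b(2) = b(2) + b(4)*x1; % x4 = x1*x2
  ubar = [b(1), b(2), b(3)];       % the row vector ybar'*J(u)
end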
It can be shown that the cost of one reverse sweep of AD is less than
a small constant (which is certainly less than three) times the cost of a
cost(ȳ ′ J) ≤ 3 cost(F )
steps
\[
\begin{aligned}
&\bar x(N)' = V_{fx}(x(N)) \\
&\text{for } k = N-1,\, N-2,\, \ldots,\, 0 \\
&\qquad \bar x(k)' = \ell_x(x(k), u(k)) + \bar x(k+1)'\, f_x(x(k), u(k)) \\
&\qquad \bar u(k)' = \ell_u(x(k), u(k)) + \bar x(k+1)'\, f_u(x(k), u(k)) \\
&\text{end}
\end{aligned} \tag{8.16}
\]
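A minimal Octave/MATLAB sketch of this backward sweep; the handles fx, fu, lx, lu, and Vfx (returning the respective Jacobians, with lx and lu as row vectors) are assumptions of this sketch:

% Backward sweep (8.16): gradient of the total cost with respect to all
% controls u(0), ..., u(N-1) in a single pass along the trajectory.
function ubar = ocp_gradient(X, U, fx, fu, lx, lu, Vfx, N)
  xbar = Vfx(X(:, N+1));                 % row vector xbar(N)'
  ubar = cell(N, 1);
  for k = N:-1:1                          % k-1 = N-1, ..., 0
    x = X(:, k); u = U(:, k);
    ubar{k} = lu(x, u) + xbar*fu(x, u);  % ubar(k-1)'
    xbar = lx(x, u) + xbar*fx(x, u);     % xbar(k-1)'
  end
end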
any additional user input. Often, the result of the numerical solver it-
self can be interpreted as a differentiable CasADi function, such that
derivatives up to any order can be generated without actually differen-
tiating the source code of the solver. Thus, concatenated and recursive
calls to numerical solvers are possible and still result in differentiable
CasADi functions.
CasADi is written in C++, but allows user input to be provided from
either C++, Python, Octave, or MATLAB. When CasADi is used from the
interpreter languages Python, Octave, or MATLAB, the user does not have
any direct contact with C++; but because the internal handling of all
symbolic expressions as well as the numerical computations are per-
formed in a compiled environment, the speed of simulation or op-
timization computations is similar to the performance of compiled
C-code. One particularly powerful optimization solver interfaced to
CasADi is IPOPT, an open-source C++ code developed and described
by Wächter and Biegler (2006). IPOPT is automatically provided in the
standard CasADi installation. For more information on CasADi and
how to install it, we refer the reader to casadi.org. Here, we illustrate
the use of CasADi for optimal control in a simple example.
with the initial condition x(0) = [0, 1]′ . We can encode this in Oc-
tave as follows
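(A minimal sketch; the Van der Pol-type right-hand side below is an assumption of this sketch, since only the interface f_c(x, u) matters for the code that follows.)

% Continuous time model as a CasADi function f_c(x, u).
x = casadi.SX.sym('x', 2);
u = casadi.SX.sym('u');
ode = [(1 - x(2)^2)*x(1) - x(2) + u; x(1)];  % assumed dynamics
f_c = casadi.Function('f_c', {x, u}, {ode});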
For its numerical solution, we formulate this problem using the se-
quential approach, i.e., we regard only u as optimization variables and
eliminate x by a system simulation. This elimination allows us to gen-
erate a cost function c(u) and a constraint function G(u) such that the
above problem is equivalent to
% Decision variable
N = 50;
U = casadi.SX.sym('U', N);
% System simulation
xk = [1; 0];
c = 0;
for k=1:N
% RK4 method
dt = 0.2;
k1 = f_c(xk, U(k));
k2 = f_c(xk+0.5*dt*k1, U(k));
k3 = f_c(xk+0.5*dt*k2, U(k));
k4 = f_c(xk+dt*k3, U(k));
xk = xk + dt/6.0*(k1 + 2*k2 + 2*k3 + k4);
% Add contribution to objective function
c = c + 10*xk(1)^2 + 5*xk(2)^2 + U(k)^2;
end
% Terminal constraint
G = xk - [0; 0];
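The symbolic expressions c and G, together with the variable U, fully define the NLP. A minimal sketch of handing them to IPOPT through CasADi's nlpsol interface (variable names ours):

% Solve the NLP with IPOPT; the terminal constraint G = 0 is imposed
% by setting equal lower and upper bounds on g.
prob = struct('x', U, 'f', c, 'g', G);
solver = casadi.nlpsol('solver', 'ipopt', prob);
sol = solver('x0', zeros(N, 1), 'lbg', zeros(2, 1), 'ubg', zeros(2, 1));
u_opt = full(sol.x);   % the optimal control trajectory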
problem. For convenience, we restate the OCP (8.5) in a form that re-
places the constraint sets Z and Xf by equivalent inequality constraints,
as follows
\[
\underset{x(\cdot),\, u(\cdot)}{\text{minimize}} \quad \int_0^T \ell_c(x(t), u(t))\, dt + V_f(x(T)) \tag{8.20a}
\]
While the above problem has infinitely many variables and constraints,
the idea of direct optimal control methods is to solve instead a related
finite-dimensional problem of the general form
\[
\begin{aligned}
\underset{w \in \mathbb{R}^{n_w}}{\text{minimize}} \quad & F(w) \\
\text{subject to} \quad & G(x_0, w) = 0 \\
& H(w) \leq 0
\end{aligned} \tag{8.21}
\]
ũ(t; q) := q_i for t ∈ [t_i, t_{i+1})
For each interval, one needs one vector q_i ∈ R^m, such that the total dimension of q = (q_0, q_1, . . . , q_{N−1}) is given by n_q = Nm. In the following, we assume this form of piecewise constant control parameterization.
Regarding the state discretization, the direct single-shooting method relies on any of the numerical simulation methods described in Section 8.2 to find an approximation x̃(t; x0, q) of the state trajectory, given the initial value x0 at t = 0 and the control trajectory ũ(t; q). Often, adaptive integrators are chosen. In case of piecewise constant controls, the integration needs to stop and restart briefly at the time points t_i to avoid integrating a nonsmooth right-hand-side function. Due to state continuity, the state x̃(t_i; x0, q) is both the initial state of the interval [t_i, t_{i+1}] as well as the last state of the previous interval [t_{i−1}, t_i]. The control values used in the numerical integrators on both sides differ, due to the jump at t_i, and are given by q_{i−1} and q_i, respectively.
Evaluating the integral in the objective (8.20a) requires an integra-
tion rule. One option is to just augment the ODE system with a quadra-
ture state xquad (t) starting at xquad (0) = 0, and obeying the trivial dif-
ferential equation ẋquad (t) = ℓc (x(t), u(t)) that can be solved with
the same numerical solver as the standard ODE. Another option is to
\[
F(x_0, q) := \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \ell_c\big(\tilde x(\tau_{i,j}; x_0, q),\, \tilde u(\tau_{i,j}; q)\big)\, (\tau_{i,j+1} - \tau_{i,j}) + V_f\big(\tilde x(T; x_0, q)\big)
\]
If the function h maps to R^{n_h} and h_f to R^{n_{h_f}}, the function H maps to R^{(NM n_h + n_{h_f})}. The resulting finite-dimensional optimization problem in
\[
\begin{aligned}
\underset{s_0,\, q}{\text{minimize}} \quad & F(s_0, q) \\
\text{subject to} \quad & s_0 - x_0 = 0 \\
& H(s_0, q) \leq 0
\end{aligned} \tag{8.22}
\]
\[
\tilde x_i(t_i; s_i, q_i) = s_i, \qquad \frac{d\tilde x_i}{dt}(t; s_i, q_i) = f_c(\tilde x_i(t; s_i, q_i), q_i), \quad t \in [t_i, t_{i+1}]
\]
\[
\ell_i(s_i, q_i) := \sum_{j=0}^{M-1} \ell_c\big(\tilde x_i(\tau_{i,j}; s_i, q_i), q_i\big)\, (\tau_{i,j+1} - \tau_{i,j})
\]
The overall objective is thus given by Σ_{i=0}^{N−1} ℓ_i(s_i, q_i) + V_f(s_N). Note
that the objective terms ℓi (si , qi ) each depend again only on the lo-
cal initial values si and local controls qi , and can thus be evaluated
independently from each other. Likewise, we discretize the path con-
straints, for simplicity on the same refined grid, by defining the local
inequality constraint functions
\[
H_i(s_i, q_i) := \begin{bmatrix}
h(\tilde x_i(\tau_{i,0}; s_i, q_i), q_i) \\
h(\tilde x_i(\tau_{i,1}; s_i, q_i), q_i) \\
\vdots \\
h(\tilde x_i(\tau_{i,M-1}; s_i, q_i), q_i)
\end{bmatrix}
\]
\[
\begin{aligned}
\underset{s,\, q}{\text{minimize}} \quad & \sum_{i=0}^{N-1} \ell_i(s_i, q_i) + V_f(s_N) && (8.23a) \\
\text{subject to} \quad & s_0 = x_0 && (8.23b) \\
& s_{i+1} = \tilde x_i(t_{i+1}; s_i, q_i), \quad i = 0, \ldots, N-1 && (8.23c) \\
& H_i(s_i, q_i) \leq 0, \quad i = 0, \ldots, N-1 && (8.23d) \\
& h_f(s_N) \leq 0 && (8.23e)
\end{aligned}
\]
x̃_i(t_{i+1}; s_i, q_i) = A s_i + B q_i
with
\[
A = \exp\big(A_c (t_{i+1} - t_i)\big) \qquad \text{and} \qquad B = \int_0^{(t_{i+1} - t_i)} \exp(A_c \tau)\, B_c\, d\tau
\]
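Both matrices can be obtained from a single matrix exponential of an augmented matrix, a standard construction; a minimal Octave/MATLAB sketch (function name ours):

% Exact discretization of xdot = Ac*x + Bc*u over a step of length h:
% expm of the augmented matrix yields A = expm(Ac*h) in the upper-left
% block and B = int_0^h expm(Ac*tau)*Bc dtau in the upper-right block.
function [A, B] = c2d_expm(Ac, Bc, h)
  n = size(Ac, 1); m = size(Bc, 2);
  M = expm([Ac, Bc; zeros(m, n + m)]*h);
  A = M(1:n, 1:n);
  B = M(1:n, n+1:n+m);
end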
the time points ti , and then regard the implicit Runge-Kutta equations
with M stages on the interval with length hi := (ti+1 − ti ), which create
an implicit relation between s_i and s_{i+1}. We introduce additional variables K_i := [k_{i,1}′ · · · k_{i,M}′]′ ∈ R^{nM}, where k_{i,j} ∈ R^n corresponds to the state derivative at the collocation time point t_i + c_j h_i for j = 1, . . . , M.
These variables Ki are uniquely defined by the collocation equations if
si and the control value qi ∈ Rm are given. We summarize the colloca-
tion equations as G_i^{RK}(s_i, K_i, q_i) = 0 with
\[
G_i^{RK}(s_i, K_i, q_i) := \begin{bmatrix}
k_{i,1} - f_c(s_i + h_i(a_{11}k_{i,1} + \cdots + a_{1M}k_{i,M}), q_i) \\
k_{i,2} - f_c(s_i + h_i(a_{21}k_{i,1} + \cdots + a_{2M}k_{i,M}), q_i) \\
\vdots \\
k_{i,M} - f_c(s_i + h_i(a_{M1}k_{i,1} + \cdots + a_{MM}k_{i,M}), q_i)
\end{bmatrix} \tag{8.24}
\]
The transition to the next state is described by s_{i+1} = F_i^{RK}(s_i, K_i, q_i) with
F_i^{RK}(s_i, K_i, q_i) := s_i + h_i(b_1 k_{i,1} + · · · + b_M k_{i,M})
In contrast to shooting methods, where the controls are often held con-
stant across several integration steps, in direct collocation one usu-
ally allows one new control value qi per collocation interval, as we do
here. Even a separate control parameter for every collocation time point
within the interval is possible. This would introduce the maximum
number of control degrees of freedom that is compatible with direct
collocation methods and could be interpreted as a piecewise polyno-
mial control parameterization of order (M − 1).
Derivative versus state representation. In most direct collocation
implementations, one uses a slightly different formulation, where the
intermediate stage derivative variables K_i = [k_{i,1}′ · · · k_{i,M}′]′ ∈ R^{nM} are replaced by the stage state variables S_i = [s_{i,1}′ · · · s_{i,M}′]′ ∈ R^{nM} that are related to s_i and K_i via the linear map
s_{i,j} = s_i + h_i(a_{j1} k_{i,1} + · · · + a_{jM} k_{i,M}) for j = 1, . . . , M     (8.25)
If c1 > 0, then the relative time points (0, c1 , . . . , cM ) are all different,
such that the interpolation polynomial through the (M + 1) states (si ,
si,1 , . . . , si,M ) is uniquely defined, which renders the linear map (8.25)
from (si , Ki ) to (si , Si ) invertible. Concretely, the values ki,j can be
obtained as the time derivatives of the interpolation polynomial at the
collocation time points. The inverse map, for j = 1, . . . , M, is given by
k_{i,j} = (1/h_i) (D_{j1}(s_{i,1} − s_i) + · · · + D_{jM}(s_{i,M} − s_i))     (8.26)
Interestingly, the matrix (D_{jl}) is the inverse of the matrix (a_{mj}) from the Butcher tableau, such that Σ_{j=1}^{M} a_{mj} D_{jl} = δ_{ml}. Inserting this inverse map into G_i^{RK}(s_i, K_i, q_i) from Eq. (8.24) leads to the equivalent
root-finding problem G_i(s_i, S_i, q_i) = 0 with
\[
G_i(s_i, S_i, q_i) := \begin{bmatrix}
\frac{1}{h_i}\big(D_{11}(s_{i,1} - s_i) + \cdots + D_{1M}(s_{i,M} - s_i)\big) - f_c(s_{i,1}, q_i) \\
\frac{1}{h_i}\big(D_{21}(s_{i,1} - s_i) + \cdots + D_{2M}(s_{i,M} - s_i)\big) - f_c(s_{i,2}, q_i) \\
\vdots \\
\frac{1}{h_i}\big(D_{M1}(s_{i,1} - s_i) + \cdots + D_{MM}(s_{i,M} - s_i)\big) - f_c(s_{i,M}, q_i)
\end{bmatrix} \tag{8.27}
\]
Likewise, inserting the inverse map into F_i^{RK}(s_i, K_i, q_i) leads to the linear expression
F_i(s_i, S_i, q_i) := s_i + b̃_1(s_{i,1} − s_i) + · · · + b̃_M(s_{i,M} − s_i)
Direct collocation optimization problem. The objective integrals ∫_{t_i}^{t_{i+1}} ℓ_c(x̃(t), q_i) dt on each interval are canonically approximated by a weighted sum of evaluations of ℓ_c on the collocation time points, as follows
\[
\ell_i(s_i, S_i, q_i) := h_i \sum_{j=1}^{M} b_j\, \ell_c(s_{i,j}, q_i)
\]
Similarly, one might choose to impose the path constraints on all collocation time points, leading to the stage inequality function
\[
H_i(s_i, S_i, q_i) := \begin{bmatrix} h(s_{i,1}, q_i) \\ h(s_{i,2}, q_i) \\ \vdots \\ h(s_{i,M}, q_i) \end{bmatrix}
\]
\[
\begin{aligned}
\underset{s,\, S,\, q}{\text{minimize}} \quad & \sum_{i=0}^{N-1} \ell_i(s_i, S_i, q_i) + V_f(s_N) && (8.28a) \\
\text{subject to} \quad & s_0 = x_0 && (8.28b) \\
& s_{i+1} = F_i(s_i, S_i, q_i), \quad i = 0, \ldots, N-1 && (8.28c) \\
& 0 = G_i(s_i, S_i, q_i), \quad i = 0, \ldots, N-1 && (8.28d) \\
& H_i(s_i, S_i, q_i) \leq 0, \quad i = 0, \ldots, N-1 && (8.28e) \\
& h_f(s_N) \leq 0 && (8.28f)
\end{aligned}
\]
\[
\begin{aligned}
\underset{w \in \mathbb{R}^{n_w}}{\text{minimize}} \quad & F(w) \\
\text{subject to} \quad & G(x_0, w) = 0 \\
& H(w) \leq 0
\end{aligned} \tag{8.29}
\]
whose gradient and Hessian matrix with respect to w are often used.
Again, they do not depend on x0 , and can thus be written as ∇w L(w,
λ, µ) and ∇2w L(w, λ, µ). Note that the dimensions of the multipliers, or
dual variables λ and µ, equal the output dimensions of the functions
G and H, which we denote by nG and nH . We sometimes call w ∈ Rnw
the primal variable. At a feasible point w, we say that an inequality
with index i ∈ {1, . . . , nH } is active if and only if Hi (w) = 0. The
linear independence constraint qualification (LICQ) is satisfied if and
only if the gradients of all active inequalities, ∇w Hi (w) ∈ Rnw , and the
gradients of the equality constraints, ∇w Gj (w) ∈ Rnw for j ∈ {1, . . . ,
nG }, form a linearly independent set of vectors.
∇_w L(w^0, λ^0, µ^0) = 0     (8.31a)
G(x_0, w^0) = 0     (8.31b)
0 ≥ H(w^0) ⊥ µ^0 ≥ 0     (8.31c)
which implies that the products µ_i^0 H_i(w^0) are zero for each i ∈ {1, . . . , n_H}. Thus, each pair (H_i(w^0), µ_i^0) ∈ R² must be an element of a nonsmooth, L-shaped subset of R² that comprises only the negative x-axis, the positive y-axis, and the origin.
Any triple (w^0, λ^0, µ^0) that satisfies the KKT conditions (8.31) and LICQ is called a KKT point, independent of local optimality.
In general, the existence of multipliers such that the KKT condi-
tions (8.31) hold is just a necessary condition for local optimality of a
point w^0 at which LICQ holds. Only in the special case that the optimization problem is convex can the KKT conditions be shown to be both a necessary and a sufficient condition for global optimality. For
the general case, we need to formulate additional conditions on the
second-order derivatives of the problem functions to arrive at suffi-
cient conditions for local optimality. This is only possible after making
a few definitions.
Strictly active constraints and null space basis. At a KKT point (w,
λ, µ), an active constraint with index i ∈ {1, . . . , nH } is called weakly
active if and only if µi = 0 and strictly active if µi > 0. Note that for
weakly active constraints, the pair (Hi (w), µi ) is located at the origin,
i.e., at the nonsmooth point of the L-shaped set. For KKT points without
weakly active constraints, i.e., when the inequalities are either strictly
active or inactive, we say that the strict complementarity condition is
satisfied.
Based on the division into weakly and strictly active constraints, one
can construct the linear space Z of directions in which the strictly active
constraints and the equality constraints remain constant up to first or-
der. This space Z plays an important role in the second-order sufficient
conditions for optimality that we state below, and can be defined as the
null space of the matrix that is formed by putting the transposed gra-
dient vectors of all equality constraints and all strictly active inequality
constraints on top of each other. To define this properly at a KKT point
(w, λ, µ), we reorder the inequality constraints such that
\[
H(w) = \begin{bmatrix} H^+(w) \\ H^0(w) \\ H^-(w) \end{bmatrix}
\]
In this reordered view on the function H(w), the strictly active inequal-
ity constraints H + (w) come first, then the weakly active constraints
H 0 (w), and finally the inactive constraints H − (w). Note that the out-
put dimensions of the three functions add to nH . The set Z ⊂ Rnw is
One can regard an orthogonal basis matrix Z ∈ Rnw ×(nw −nA ) of Z that
satisfies AZ = 0 and Z ′ Z = I and whose columns span Z. This al-
lows us to compactly formulate the following sufficient conditions for
optimality.
Theorem 8.15 (Strong second-order sufficient conditions for optimality). If (w^0, λ^0, µ^0) is a KKT point and if the Hessian of its Lagrangian is positive definite on the corresponding space Z, i.e., if
More specifically, the solution (w^{QP}(x_0), λ^{QP}(x_0), µ^{QP}(x_0)) of the above QP satisfies
\[
\begin{bmatrix} w^{QP}(x_0) - w^0(x_0) \\ \lambda^{QP}(x_0) - \lambda^0(x_0) \\ \mu^{QP}(x_0) - \mu^0(x_0) \end{bmatrix} = O(|x_0 - \bar x_0|^2)
\]
∇_w L(w^0, λ^0) = 0
G(x_0, w^0) = 0
To simplify notation and avoid that the iteration index k interferes with
the indices of the optimization variables, we usually use the following
notation for the Newton step
Here, the old iterate and linearization point is called z̄ and the new it-
erate z+ . The square Jacobian matrix Rz (z) that needs to be factorized
in each iteration to compute ∆z has a particular structure and is given by
\[
R_z(z) = \begin{bmatrix} \nabla_w^2 L(w, \lambda) & G_w(w)' \\ G_w(w) & 0 \end{bmatrix}
\]
This matrix is called the KKT matrix and plays an important role in
all constrained optimization algorithms. The KKT matrix is invertible
at a point z if the LICQ condition holds, i.e., Gw (w) has rank nG , and
if the Hessian of the Lagrangian is positive definite on the null space
of Gw (w), i.e., if Z ′ ∇2w L(w, λ, µ)Z > 0, for Z being a null space basis.
The matrix Z ′ ∇2w L(w, λ, µ)Z is also called the reduced Hessian. Note
that the KKT matrix is invertible at a strongly regular point, as well
with Bex (z̄) := ∇2w L(w̄, λ̄, µ̄). If the primal-dual solution of the above
QP is denoted by w QP and λQP , one can easily show that setting
w + := w QP and λ+ := λQP yields the same step as the Newton iteration.
The interpretation of the Newton step as a QP is not particularly rele-
vant for equality constrained problems, but becomes a powerful tool in
the context of inequality constrained optimization. It directly leads to
the family of sequential quadratic programming (SQP) methods, which
are treated in Section 8.7.1. One interesting observation is that the QP (8.37) is identical to the QP (8.33) from Theorem 8.16; thus its solution can not only be used as a Newton step for a fixed value of x0, but can also deliver a tangential predictor for changing values of x0.
This property is used extensively in continuation methods for nonlin-
ear MPC, such as the real-time iteration presented in Section 8.9.2.
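To make the full-step Newton iteration on the KKT system concrete, the following Octave/MATLAB sketch assumes function handles gradL(w,lam), G(w), Gw(w), and hessL(w,lam) for the Lagrange gradient, constraint residual, constraint Jacobian, and exact Hessian; these names, and the dimensions nw and nG, are illustrative assumptions rather than part of the text's software.

    % Full-step Newton iteration on the equality constrained KKT system.
    for it = 1:maxit
      R = [gradL(w, lam); G(w)];                        % KKT residual R(z)
      if norm(R) < tol, break; end
      KKT = [hessL(w, lam), Gw(w)'; Gw(w), zeros(nG)];  % KKT matrix Rz(z)
      dz  = -KKT \ R;                                   % factorize and solve
      w   = w   + dz(1:nw);                             % full Newton step
      lam = lam + dz(nw+1:end);
    end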
In that case, Newton's method would fail because the KKT matrix would become singular in one iteration. Also, the evaluation of the exact Hessian can be costly. For this reason, Newton-type optimization methods approximate the exact Hessian matrix Bex(z̄) by an approximation B̄ that is typically positive definite, or at least positive semidefinite, and solve the QP
$$ \begin{aligned} \underset{w \in \mathbb{R}^{n_w}}{\text{minimize}} \quad & F_L(w; \bar w) + \tfrac{1}{2}(w - \bar w)' \bar B (w - \bar w) \\ \text{subject to} \quad & G_L(x_0, w; \bar w) = 0 \end{aligned} \qquad (8.38) $$
By taking only the first part of this expression, one obtains the Gauss-
Newton Hessian approximation BGN (w̄), which is by definition always
a positive semidefinite matrix. In the case that Mw(w̄) ∈ RnM×nw has rank nw, i.e., if nM ≥ nw and the nw columns are linearly independent, the Gauss-Newton Hessian BGN(w̄) is even positive definite. Note that BGN(w̄) does not depend on the multipliers λ, but the error with respect to the exact Hessian does. This error would be zero if both the residuals Mj(w̄) and the multipliers λj were zero. Because both can be shown to be small at a strongly regular solution with small objective function (1/2)|M(w)|², the Gauss-Newton Hessian BGN(w̄) is a good approximation for problems with small residuals |M(w)|.
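As a concrete illustration, assuming a hypothetical Jacobian handle JM(w) for the residual function M(w), the Gauss-Newton Hessian is formed from first-order information only:

    J   = JM(w);     % Jacobian of the residuals at the current iterate
    BGN = J' * J;    % positive semidefinite by construction; positive
                     % definite whenever J has full column rank nw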
When the Gauss-Newton Hessian BGN (w̄) is used within a con-
strained optimization algorithm, as we do here, the resulting algo-
rithm is often called the constrained or generalized Gauss-Newton
method (Bock, 1983). Newton-type optimization algorithms with
Gauss-Newton Hessian converge only linearly, but their contraction rate
can be surprisingly fast in practice, in particular for problems with
small residuals. The QP subproblem that is solved in each iteration of
the constrained Gauss-Newton method can be shown to be equivalent
to
$$ \begin{aligned} \underset{w \in \mathbb{R}^{n_w}}{\text{minimize}} \quad & \tfrac{1}{2}\,|M_L(w; \bar w)|^2 \\ \text{subject to} \quad & G_L(x_0, w; \bar w) = 0 \end{aligned} \qquad (8.39) $$
$$ \begin{aligned} \underset{w \in \mathbb{R}^{n_w}}{\text{minimize}} \quad & F_L(w; w_k) + \tfrac{1}{2}(w - w_k)' B_k (w - w_k) \\ \text{subject to} \quad & G_L(x_0, w; w_k) = 0 \end{aligned} \qquad (8.40) $$
In a full step method, the primal-dual solution $w_k^{\mathrm{QP}}$ and $\lambda_k^{\mathrm{QP}}$ of the above QP is used as the next iterate, i.e., $w_{k+1} := w_k^{\mathrm{QP}}$ and $\lambda_{k+1} := \lambda_k^{\mathrm{QP}}$. A Hessian
update formula uses the previous Hessian approximation Bk and the
Lagrange gradient evaluations at wk and wk+1 to compute the next
Hessian approximation Bk+1. Inspired by the directional derivative of the function ∇wL(·, λk+1) in the direction sk := (wk+1 − wk), which, up to first order, should be equal to the finite difference approximation yk := ∇wL(wk+1, λk+1) − ∇wL(wk, λk+1), all Hessian update formulas require the secant condition
$$ B_{k+1} s_k = y_k $$
The most widely used update is the BFGS formula
$$ B_{k+1} := B_k - \frac{B_k s_k s_k' B_k}{s_k' B_k s_k} + \frac{y_k y_k'}{y_k' s_k} $$
One often starts the update procedure with a scaled unit matrix, i.e.,
sets B0 := αI with some α > 0. It can be shown that for a positive defi-
nite Bk and for yk′ sk > 0, the matrix Bk+1 resulting from the BFGS for-
mula is also positive definite. In a practical implementation, to ensure
positive definiteness of Bk+1 , the unmodified update formula is only
applied if yk′ sk is sufficiently large, say if the inequality yk′ sk ≥ βsk′ Bk sk
is satisfied with some β ∈ (0, 1), e.g., β = 0.2. If it is not satisfied, the
update can either be skipped, i.e., one sets Bk+1 := Bk , or the vector yk
is first modified and then the BFGS update is performed with this mod-
ified vector. An important observation is that the gradient difference
yk can be computed with knowledge of the first-order derivatives of F
and G at wk and wk+1 , which are needed to define the linearizations FL
and GL in the QP (8.40) at the current and next iteration point. Thus, a
Hessian update formula does not create any additional costs in terms
of derivative computations compared to a fixed Hessian method (like,
for example, steepest descent); but it typically improves the conver-
gence speed significantly. One can show that Hessian update methods
lead to superlinear convergence under mild conditions.
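The damped update logic described above can be sketched in a few lines of Octave/MATLAB; the threshold β = 0.2 follows the text, the skip variant is chosen when the curvature test fails, and gradL is the same assumed handle as in the earlier Newton sketch.

    % BFGS update of B with curvature safeguard (skip variant).
    beta = 0.2;
    s = w_next - w;                                    % step s_k
    y = gradL(w_next, lam_next) - gradL(w, lam_next);  % difference y_k
    if y'*s >= beta*(s'*B*s)                           % curvature check
      B = B - (B*s)*(B*s)'/(s'*B*s) + (y*y')/(y'*s);   % BFGS formula
    end                                                % else keep B_k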
$$ \begin{aligned} \underset{w \in \mathbb{R}^{n_w}}{\text{minimize}} \quad & F_L(w; w_k) + \tfrac{1}{2}(w - w_k)' B_k (w - w_k) \\ \text{subject to} \quad & G_L(x_0, w; w_k) = 0 \\ & H_L(w; w_k) \le 0 \end{aligned} \qquad (8.41) $$
problem given by
$$ \begin{aligned} \underset{w,\,s}{\text{minimize}} \quad & F(w) - \tau \sum_{i=1}^{n_H} \log s_i \\ \text{subject to} \quad & G(x_0, w) = 0 \\ & H(w) + s = 0 \end{aligned} \qquad (8.42) $$
For τ → 0, the barrier term −τ log si becomes zero for any strictly posi-
tive si > 0 while it always grows to infinity for si → 0, i.e., on the bound-
ary of the feasible set. Thus, for τ → 0, the barrier function would be a
perfect indicator function of the true feasible set and one can show that
the solution of the modified problem (8.42) tends to the solution of the
original problem (8.29) for τ → 0. For any positive τ > 0, the necessary optimality conditions of problem (8.42) are a smooth set of equations. If we denote the multipliers for the equalities H(w) + s = 0 by µ ∈ RnH and keep the original definition of the Lagrangian from (8.30), these conditions can be equivalently formulated as
∇w L(w, λ, µ) = 0 (8.43a)
G(x0 , w) = 0 (8.43b)
H(w) + s = 0 (8.43c)
µi s i = τ for i = 1, . . . , nH (8.43d)
Note that for τ > 0, the last condition (8.43d) is a smooth version of
the complementarity condition 0 ≤ s ⊥ µ ≥ 0 that would correspond
to the KKT conditions of the original problem after introduction of the
slack variable s.
A nonlinear IP method proceeds as follows: it first sets τ to a rather
large value, and solves the corresponding root-finding problem (8.43)
with a Newton-type method for equality constrained optimization. Dur-
ing these iterations, the implicit constraints si > 0 and µi > 0 are
strictly enforced by shortening the steps, if necessary, to avoid being
attracted by spurious solutions of µi si = τ. Then, it slowly reduces the
barrier parameter τ; for each new value of τ, the Newton-type iterations
are initialized with the solution of the previous problem.
Of course, with finitely many Newton-type iterations, the root-
finding problems for decreasing values of τ can only be solved ap-
proximately. In practice, one often performs only one Newton-type
iteration per problem, i.e., one iterates while one changes the problem.
Here, we have sketched the primal-dual IP method as it is for example
implemented in the NLP solver IPOPT (Wächter and Biegler, 2006); but
there exist many other variants of nonlinear interior point methods. IP methods also exist in variants tailored to linear or quadratic programs, and they can be applied to other convex optimization problems such as second-order cone programs or semidefinite programs (SDPs). For these convex IP algorithms, one can establish polynomial runtime bounds, which unfortunately cannot be established for the more general case of nonlinear IP methods described here.
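The overall structure of such a nonlinear IP method can be sketched as follows; newton_kkt_step and fraction_to_boundary are hypothetical helpers (one Newton-type step for the smoothed conditions (8.43) at barrier τ, and a step length rule keeping s > 0 and µ > 0), and the reduction factor 0.2 is an arbitrary illustrative choice.

    tau = 1.0;                  % start with a rather large barrier
    while tau > 1e-8
      [dw, dlam, dmu, ds] = newton_kkt_step(w, lam, mu, s, tau);
      a = fraction_to_boundary(s, ds, mu, dmu, 0.995); % shorten the step
      w = w + a*dw; lam = lam + a*dlam;
      mu = mu + a*dmu; s = s + a*ds;
      tau = 0.2*tau;            % slowly reduce the barrier parameter
    end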
$$ \begin{aligned} \underset{w,\,t}{\text{minimize}} \quad & F(w) - \tau \sum_{i=1}^{n_H} t_i \\ \text{subject to} \quad & G(x_0, w) = 0 \\ & H_i(w) + t_i^2 = 0, \quad i = 1, \dots, n_H \end{aligned} \qquad (8.44) $$
$$ \begin{aligned} \underset{w}{\text{minimize}} \quad & \sum_{i=0}^{N-1} \ell_i(x_i, u_i) + V_f(x_N) \\ \text{subject to} \quad & \bar{\bar{x}}_0 - x_0 = 0 \\ & f_i(x_i, u_i) - x_{i+1} = 0, \quad i = 0, \dots, N-1 \end{aligned} \qquad (8.45) $$
Here, the quadratic objective contributions ℓQP,i(xi, ui; w̄, B̄) are given by
$$ \ell_i(\bar x_i, \bar u_i) + \nabla_{(s,q)} \ell_i(\bar x_i, \bar u_i)' \begin{bmatrix} x_i - \bar x_i \\ u_i - \bar u_i \end{bmatrix} + \frac{1}{2} \begin{bmatrix} x_i - \bar x_i \\ u_i - \bar u_i \end{bmatrix}' \begin{bmatrix} \bar Q_i & \bar S_i' \\ \bar S_i & \bar R_i \end{bmatrix} \begin{bmatrix} x_i - \bar x_i \\ u_i - \bar u_i \end{bmatrix} $$
for i = 0, . . . , N − 1, while the corresponding terminal contribution is
$$ V_f(\bar x_N) + \nabla V_f(\bar x_N)'[x_N - \bar x_N] + \tfrac{1}{2}[x_N - \bar x_N]' \bar P_N [x_N - \bar x_N] $$
and the linearized constraint functions fL,i(xi, ui; x̄i, ūi) are simply given by
$$ f_i(\bar x_i, \bar u_i) + \underbrace{\frac{\partial f_i}{\partial s}(\bar x_i, \bar u_i)}_{=: \bar A_i} [x_i - \bar x_i] + \underbrace{\frac{\partial f_i}{\partial q}(\bar x_i, \bar u_i)}_{=: \bar B_i} [u_i - \bar u_i] $$
where the residual vector is given by $\bar r_{\mathrm{KKT}} := \nabla_z L(\bar{\bar{x}}_0, \bar z) - \bar M_{\mathrm{KKT}} \bar z$. The matrix $\bar M_{\mathrm{KKT}}$ is an approximation of the block-banded KKT matrix $\nabla_z^2 L(\bar z)$ and is given by
$$ \bar M_{\mathrm{KKT}} = \begin{bmatrix} 0 & -I & & & & & & & \\ -I & \bar Q_0 & \bar S_0' & \bar A_0' & & & & & \\ & \bar S_0 & \bar R_0 & \bar B_0' & & & & & \\ & \bar A_0 & \bar B_0 & 0 & -I & & & & \\ & & & -I & \ddots & & & & \\ & & & & & \bar Q_{N-1} & \bar S_{N-1}' & \bar A_{N-1}' & \\ & & & & & \bar S_{N-1} & \bar R_{N-1} & \bar B_{N-1}' & \\ & & & & & \bar A_{N-1} & \bar B_{N-1} & 0 & -I \\ & & & & & & & -I & \bar P_N \end{bmatrix} \qquad (8.49) $$
$$ \begin{aligned} \underset{x,\,u}{\text{minimize}} \quad & \sum_{i=0}^{N-1} \left( \begin{bmatrix} \bar q_i \\ \bar r_i \end{bmatrix}' \begin{bmatrix} x_i \\ u_i \end{bmatrix} + \frac{1}{2} \begin{bmatrix} x_i \\ u_i \end{bmatrix}' \begin{bmatrix} \bar Q_i & \bar S_i' \\ \bar S_i & \bar R_i \end{bmatrix} \begin{bmatrix} x_i \\ u_i \end{bmatrix} \right) + \bar p_N' x_N + \frac{1}{2} x_N' \bar P_N x_N \\ \text{subject to} \quad & \bar{\bar{x}}_0 - x_0 = 0 \\ & \bar b_i + \bar A_i x_i + \bar B_i u_i - x_{i+1} = 0, \quad i = 0, \dots, N-1 \end{aligned} \qquad (8.50) $$
Here, we use the bar above fixed quantities such as Āi , Q̄i to distin-
guish them from the optimization variables xi , ui , and the quantities
that are computed during the solution of the optimization problem.
This distinction makes it possible to directly interpret problem (8.50)
as the LQ approximation (8.47) of a nonlinear problem (8.45) at a given linearization point $\bar z = [\bar\lambda_0' \; \bar x_0' \; \bar u_0' \; \cdots \; \bar\lambda_{N-1}' \; \bar x_{N-1}' \; \bar u_{N-1}' \; \bar\lambda_N' \; \bar x_N']'$ within
a Newton-type optimization method. We call the above problem the lin-
ear quadratic problem (LQP), and present different solution approaches
for the LQP in the following three subsections.
The only condition for the above matrix recursion formula to be well
defined is that the matrix (R̄i + B̄i′ Pi+1 B̄i ) is positive definite, which
turns out to be equivalent to the optimization problem being well posed
(otherwise, problem (8.50) would be unbounded from below). Note that
the Riccati matrix recursion propagates symmetric matrices Pi , whose
symmetry can and should be exploited for efficient computations.
The second recursion is a vector recursion that also goes backward
in time and is based on the matrices P0 , . . . , PN resulting from the first
recursion, and can be performed concurrently. It starts with pN := p̄N
and then runs through the indices i = N − 1, . . . , 0 to compute the vectors pi according to (8.52).
Interestingly, the result of the first and the second recursion together
yield the optimal cost-to-go functions Vi0 for the states xi that are given
by
$$ V_i^0(x_i) = c_i + p_i' x_i + \tfrac{1}{2}\, x_i' P_i x_i $$
where the constants ci are not of interest here. Also, one directly ob-
tains the optimal feedback control laws u0i that are given by
u0i (xi ) = ki + Ki xi
with
Ki := −(R̄i + B̄i′ Pi+1 B̄i )−1 (S̄i + B̄i′ Pi+1 Āi ) and (8.53a)
ki := −(R̄i + B̄i′ Pi+1 B̄i )−1 (r̄i + B̄i′ (Pi+1 b̄i + pi+1 )) (8.53b)
Based on these data, the optimal solution to the optimal control prob-
lem is obtained by a forward vector recursion that is nothing other
than a forward simulation of the linear dynamics using the optimal
feedback control law. Thus, the third recursion starts with x0 := x̄¯0
and goes through i = 0, . . . , N − 1 computing
ui := ki + Ki xi (8.54a)
xi+1 := b̄i + Āi xi + B̄i ui (8.54b)
λi := pi + Pi xi (8.54c)
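The three recursions can be collected in a short Octave/MATLAB function. This is only a sketch: the matrix and vector recursions are written in their standard Riccati form, which we assume coincides with (8.51) and (8.52), and all problem data are passed as cell arrays with 1-based indices (stage i of the text corresponds to index i+1 here).

    function [x, u, lam] = lqp_riccati(Ab, Bb, Qb, Sb, Rb, qb, rb, bb, PN, pN, x0)
      N = numel(Ab);
      P = cell(N+1,1); p = cell(N+1,1); K = cell(N,1); k = cell(N,1);
      P{N+1} = PN; p{N+1} = pN;
      for i = N:-1:1                       % backward recursions
        Re = Rb{i} + Bb{i}'*P{i+1}*Bb{i};  % must be positive definite
        Se = Sb{i} + Bb{i}'*P{i+1}*Ab{i};
        K{i} = -Re\Se;                                         % (8.53a)
        k{i} = -Re\(rb{i} + Bb{i}'*(P{i+1}*bb{i} + p{i+1}));   % (8.53b)
        P{i} = Qb{i} + Ab{i}'*P{i+1}*Ab{i} + Se'*K{i};
        P{i} = (P{i} + P{i}')/2;           % exploit and keep symmetry
        p{i} = qb{i} + Ab{i}'*(P{i+1}*bb{i} + p{i+1}) + Se'*k{i};
      end
      x = cell(N+1,1); u = cell(N,1); lam = cell(N+1,1);
      x{1} = x0;
      for i = 1:N                          % forward recursion (8.54)
        u{i}   = k{i} + K{i}*x{i};
        x{i+1} = bb{i} + Ab{i}*x{i} + Bb{i}*u{i};
        lam{i} = p{i} + P{i}*x{i};
      end
      lam{N+1} = p{N+1} + P{N+1}*x{N+1};
    end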
$$ x = A^{-1} b + A^{-1} I \, \bar{\bar{x}}_0 + A^{-1} B u $$
It is important to note that the exact Hessian $\nabla_u^2 F(\bar{\bar{x}}_0, \bar u)$ is a dense matrix of size Nm × Nm (where m is the control dimension), and that one usually also chooses a dense Hessian approximation B̄ that is ideally positive definite.
A Cholesky decomposition of a symmetric positive definite linear system of size Nm has a computational cost of (1/3)(Nm)³ FLOPs, i.e., the iteration cost of the plain sequential approach grows cubically with the horizon length N. In addition to the cost of the linear system solve, one has to consider the cost of computing the gradient $\nabla_u F(\bar{\bar{x}}_0, \bar u)$. This is ideally done by a backward sweep equivalent to the reverse mode of algorithmic differentiation (AD) as stated in (8.16), at a cost that grows linearly in N. The cost of forming the Hessian approximation depends on the chosen approximation, but is typically quadratic in N. For example, an exact Hessian could be computed by performing Nm forward derivatives of the gradient function $\nabla_u F(\bar{\bar{x}}_0, u)$.
The plain dense sequential approach results in a medium-sized op-
timization problem without much sparsity structure but with expen-
sive function and derivative evaluations, and can thus be addressed
by a standard nonlinear programming method that does not exploit
sparsity, but converges with a limited number of function evaluations.
Typically, an SQP method in combination with a dense active set QP
solver is used.
Sparsity-exploiting sequential approaches. Interestingly, one can
form and solve the same linear system as in (8.56) by using the sparse
linear algebra techniques described in the previous section for the si-
multaneous approach. To implement this, it would be easiest to start
with an algorithm for the simultaneous approach that computes the
full iterate in the vector z that contains as subsequences the controls
$u = [u_0' \cdots u_{N-1}']'$, the states $x = [x_0' \cdots x_N']'$, and the multipliers $\lambda = [\lambda_0' \cdots \lambda_N']'$. After the linear system solve, one would simply
overwrite the states x by the result of a nonlinear forward simulation
for the given controls u.
The sparse sequential approach is particularly easy to implement
if a Gauss-Newton Hessian approximation is used (Sideris and Bobrow,
2005). To compute the exact Hessian blocks, one performs a second
reverse sweep identical to (8.16) to overwrite the values of the multipli-
ers λ. As in the simultaneous approach, the cost for each Newton-type
iteration would be linear in N with this approach, while one can show
that the resulting iterates would be identical to those of the dense se-
quential approach for both the exact and the Gauss-Newton Hessian
approximations.
ui := ki + Ki xi (8.57a)
xi+1 := fi (xi , ui ) (8.57b)
with Ki and ki from (8.53a) and (8.53b), to define the next control and
state trajectory. Interestingly, DDP only performs the backward recur-
sions (8.51) and (8.52) from the Riccati algorithm. The forward simula-
tion of the linear system (8.54b) is replaced by the forward simulation
of the nonlinear system (8.57b). Note that both the states and the con-
trols in DDP are different from the standard sequential approach.
the DDP algorithm needs not only the controls ūi , but also the states
x̄i and the Lagrange multipliers λ̄i+1 , which are not part of the mem-
ory of the algorithm. While the states x̄i are readily obtained by the
nonlinear forward simulation (8.57b), the Lagrange multipliers λ̄i+1 are
obtained simultaneously with the combined backward recursions (8.51)
and (8.52). They are chosen as the gradient of the quadratic cost-to-go function $V_i^0(x_i) = p_i' x_i + \tfrac{1}{2} x_i' P_i x_i$ at the corresponding state values, i.e., as
λ̄i := pi + Pi x̄i (8.58)
backward Riccati recursions (8.51) and (8.52) can be started and the
Lagrange multipliers be computed simultaneously using (8.58).
The DDP algorithm in its original form is only applicable to uncon-
strained problems, but can easily be adapted to deal with control con-
straints. In order to deal with state constraints, a variety of heuristics
can be employed that include, for example, barrier methods; a similar
idea was presented in the more general context of constrained OCPs un-
der the name feasibility perturbed sequential quadratic programming
by Tenny, Wright, and Rawlings (2004).
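A forward pass of DDP is easily sketched once the backward recursions have produced the gains; the dynamics handles f{i}, the gains K{i}, k{i}, the cost-to-go data p{i}, P{i}, and the starting state xbar0 are all assumed given, with the stage index shifted by one relative to the text.

    % DDP forward pass (8.57): nonlinear rollout under affine feedback.
    x = xbar0;
    for i = 1:N
      xs{i}  = x;
      u{i}   = k{i} + K{i}*x;          % (8.57a)
      x      = f{i}(xs{i}, u{i});      % (8.57b): nonlinear, unlike (8.54b)
      lam{i} = p{i} + P{i}*xs{i};      % multipliers via (8.58)
    end
    xs{N+1} = x;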
$$ L(\bar{\bar{x}}_0, w, \lambda, \mu) = \lambda_0' (\bar{\bar{x}}_0 - x_0) + \mu_N'\, r_N(x_N) + V_f(x_N) + \sum_{i=0}^{N-1} \left[ \ell_i(x_i, u_i) + \lambda_{i+1}' \big( f_i(x_i, u_i) - x_{i+1} \big) + \mu_i'\, r_i(x_i, u_i) \right] $$
the first interval of length ∆t, one could use one single long collocation
interval of length (N − 1)∆t with one global polynomial approximation
of states and controls, as in pseudospectral collocation, in the hope of
obtaining a cheaper approximation of VN−1 (f (x0 , u0 )).
Warmstarting and shift. Another easy way to transfer solution infor-
mation from one MPC problem to the next is to use an existing solution
approximation as initial guess for the next MPC optimization problem,
in a procedure called warmstarting. In its simplest variant, one can
just use the existing solution guess without any modification. In the
shift initialization, one first shifts the current solution guess to account
for the advancement of time. The shift initialization can most easily be
performed if an equidistant grid is used for control and state discretiza-
tion, and is particularly advantageous for systems with time-varying
dynamics or objectives, e.g., if a sequence of future disturbances is
known, or one is tracking a time-varying trajectory.
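As an illustration, a minimal shift initialization on an equidistant grid might look as follows; duplicating the last stage is one common, though not the only, choice for the new terminal guess.

    % Shift a solution guess by one sampling time; U is m-by-N, X is
    % n-by-(N+1) from the previous MPC problem, f the discrete dynamics.
    Ushift = [U(:,2:end), U(:,end)];   % repeat last control
    Xshift = [X(:,2:end), X(:,end)];
    Xshift(:,end) = f(Xshift(:,end-1), Ushift(:,end)); % simulate last step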
Iterating while the problem changes. Extending the idea of warm-
starting, some MPC algorithms do not separate between one opti-
mization problem and the next, but always iterate while the problem
changes. They only perform one iteration per sampling time, and they
never try to iterate the optimization procedure to convergence for any
fixed problem. Instead, they continue to iterate while the optimization
problem changes. When implemented with care, this approach ensures
that the algorithm always works with the most current information, and
never loses precious time by working on outdated information.
Several of the ideas mentioned above are related to the idea of contin-
uation methods, which we now discuss in more algorithmic detail. For
this aim, we first regard a parameter-dependent root-finding problem
of the form
R(x, z) = 0
with variable z ∈ Rnz , parameter x ∈ Rn , and a smooth function
R : Rn × Rnz → Rnz . This root-finding problem could originate from an
equality constrained MPC optimization problem with fixed barrier as
it arises in a nonlinear IP method. The parameter dependence on x is
due to the initial state value, which varies from one MPC optimization
problem to the next. In case of infinite computational resources, one
could just employ one of the Newton-type methods from Section 8.3.2
to converge to an accurate approximation of the exact solution z∗ (x)
that satisfies R(x, z∗ (x)) = 0. In practice, however, we only have lim-
ited computing power and finite time, and need to be satisfied with an
approximation of z∗ (x).
Another viewpoint on this iteration is that zk+1 solves the linear equa-
tion system RL (xk+1 , zk+1 ; zk ) = 0. Interestingly, assuming only regu-
larity of Rz , one can show that if zk equals the exact solution z∗ (xk )
for the previous parameter xk , the next iterate zk+1 is a first-order ap-
proximation, or tangential predictor, for the exact solution z∗ (xk+1 ).
More generally, one can show that
$$ \left| z_{k+1} - z^*(x_{k+1}) \right| = O\left( \left| \begin{bmatrix} z_k - z^*(x_k) \\ x_{k+1} - x_k \end{bmatrix} \right|^2 \right) \qquad (8.59) $$
From this equation it follows that one can remain in the area of con-
vergence of the Newton method if one starts close enough to an ex-
act solution, zk ≈ z∗ (xk ), and if the parameter changes (xk+1 − xk )
are small enough. Interestingly, it also implies quadratic convergence
toward the solution in case the parameter values of xk remain con-
stant. Roughly speaking, the continuation method delivers tangential
predictors in case the parameters xk change a lot, and nearly quadratic
convergence in case they change little.
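In code, the continuation idea amounts to one Newton-type step per parameter change; a minimal sketch, assuming function handles R(x,z) and Rz(x,z) for the residual and its Jacobian, and a matrix xs whose columns are the parameter sequence:

    z = z0;
    zs = zeros(numel(z0), size(xs,2));
    for k = 1:size(xs,2)
      xk1 = xs(:,k);                    % new parameter x_{k+1}
      z = z - Rz(xk1, z) \ R(xk1, z);   % z_{k+1} solves RL(x_{k+1},z;z_k)=0
      zs(:,k) = z;                      % tangential predictor / iterate
    end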
The continuation method idea can be extended to Newton-type iterations in which the exact Jacobian Rz is replaced by a cheaper approximation.
• One can try to solve the MINLP to global optimality using tech-
niques from the field of global optimization.
While the first two options can lead to viable solutions for relevant ap-
plications, they often lead to excessively large runtimes, so the MPC
practitioner may need to resort to the last option. Fortunately, the
optimal control structure of the problem allows us to use a powerful
heuristic that exploits the fact that the state of a (continuous time)
system is most strongly influenced by the time average of its controls
rather than their pointwise values, as illustrated in Figure 8.7. This
heuristic is based on a careful MINLP formulation, which is very similar
to a standard nonlinear MPC problem, but with special structure. First,
divide the input vector u = (uc , ub ) ∈ Rmc +mb into continuous inputs,
uc , and binary integer inputs, ub , such that the system is described by
x + = f (x, uc , ub ). Second, and without loss of generality, we restrict
ourselves to binary integers ub ∈ {0, 1}mb inside a convex polyhedron
P ⊂ [0, 1]mb , and assume that ub enters the system linearly.3 The poly-
hedral constraint ub ∈ P allows us to exclude some combinations, e.g.,
3 If necessary, this binary representation can be achieved by a technique called outer convexification, which is applicable to any system $x^+ = \tilde f(x, u_c, u_I)$ where the integer vector uI has dimension mI and can take finitely many (nI) values $u_I \in \{u_{I,1}, \dots, u_{I,n_I}\}$. We set $m_b := n_I$, $f(x, u_c, u_b) := \sum_{i=1}^{m_b} u_{b,i}\, \tilde f(x, u_c, u_{I,i})$, and $P := \{u_b \in [0,1]^{m_b} \mid \sum_{j=1}^{m_b} u_{b,j} = 1\}$. Due to the exponential growth of nI in the number of original integer decisions mI, this technique should be applied with care, e.g., only partially for separate subsystems, or avoided altogether if the original system is already linear in the integer controls.
$$ \begin{aligned} \underset{x,\,u_c,\,u_b}{\text{minimize}} \quad & \sum_{k=0}^{N-1} \ell(x(k), u_c(k), u_b(k)) + V_f(x(N)) \\ \text{subject to} \quad & x(0) = x_0 \\ & x(k+1) = f(x(k), u_c(k), u_b(k)), \quad k = 0, \dots, N-1 \\ & h(x(k), u_c(k), u_b(k)) \le 0, \quad k = 0, \dots, N-1 \\ & h_f(x(N)) \le 0 \\ & u_b(k) \in P, \quad k = 0, \dots, N-1 \\ & u_b \in \mathcal{B} \end{aligned} \qquad (8.61) $$
Without the last constraint, ub ∈ B, the above problem would be a
standard NLP with optimal control structure. Likewise, a standard NLP
arises if the binary controls ub are fixed. These two observations di-
rectly lead to the following three-step algorithm that is a heuristic to
find a good feasible solution of the MINLP (8.61).
1. Solve the relaxed NLP that is obtained from (8.61) by dropping the constraint ub ∈ B, with solution (x∗, uc∗, ub∗) and objective value VN∗.
2. Approximate the relaxed binary controls ub∗ by binary feasible controls ub∗∗ ∈ B that are, in a suitable norm, as close as possible to ub∗.
3. Fix the binary controls to ub∗∗ and solve the restricted NLP (8.61) in the variables (x, uc) only, with solution (x∗∗∗, uc∗∗∗) and objective value VN∗∗∗.
The result of the algorithm is the triple (x∗∗∗ , uc∗∗∗ , ub∗∗ ) which is a
feasible, but typically not an optimal, point of the MINLP (8.61).4 Note that this feasible MINLP solution has an objective value VN∗∗∗ that is no smaller than the unknown exact MINLP optimum VN0, which in turn is no smaller than the relaxed NLP objective VN∗ from Step 1 (if the global NLP solution was found): VN∗ ≤ VN0 ≤ VN∗∗∗. Thus, the objective values from Steps 1 and 3 help us to bound the optimality loss incurred by using the above three-step heuristic.
The choice of the approximation in Step 2 affects both solution qual-
ity and computational complexity. One popular choice, that is taken in
the combinatorial integral approximation (CIA) algorithm (Sager, Jung,
and Kirches, 2011) is to minimize the distance in a specially scaled
maximum norm that compares integrals, and is given by
$$ \|u_b\|_{\mathrm{CIA}} := \max_{j \le m_b,\; n \le N} \left| \sum_{k=0}^{n-1} u_{b,j}(k) \right| $$
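For instance, the CIA norm of the deviation between a relaxed and a rounded control sequence can be evaluated in two lines; ub_rel and ub_bin are hypothetical mb-by-N arrays holding ub∗ and a binary candidate.

    dev = cumsum(ub_rel - ub_bin, 2);  % running sums over k, one row per j
    cia = max(abs(dev(:)));            % max over all j <= mb and n <= N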
Figure 8.7: Relaxed and binary feasible solution for Example 8.17.
$$ \begin{aligned} \text{subject to} \quad & x(0) = x_0 \\ & x(k+1) = f(x(k), u_b(k)), \quad k = 0, \dots, N-1 \\ & u_b(k) \in [0,1], \quad k = 0, \dots, N-1 \\ & u_b \in \mathcal{B} \end{aligned} \qquad (8.62) $$
VN∗ = 0.166 and VN∗∗∗ = 0.1771. The true optimal cost, which for this simple example can be found in a few seconds by an intelligent investigation of all 2³⁰ ≈ 10⁹ possibilities via branch-and-bound, is given by VN0 = 0.176. □
8.11 Notes
The description of numerical optimal control methods in this chapter
is far from complete, and we have left out many details as well as many
methods that are important in practice. We mention some related lit-
erature and software links that could complement this chapter.
General numerical optimal control methods are described in the
textbooks by Bryson and Ho (1975); Betts (2001); Gerdts (2011); and
in particular by Biegler (2010). The latter reference focuses on di-
rect methods and also provides an in-depth treatment of nonlinear
programming. The overview articles by Binder, Blank, Bock, Bulirsch,
Dahmen, Diehl, Kronseder, Marquardt, Schlöder, and Stryk (2001);
and Diehl, Ferreau, and Haverbeke (2009); as well as a forthcoming textbook on numerical optimal control (Gros and Diehl, 2020) have a similar focus on online optimization for MPC as the current chapter.
General textbooks on numerical optimization are Bertsekas (1999);
Nocedal and Wright (2006). Convex optimization is covered by Ben-
Tal and Nemirovski (2001); Nesterov (2004); Boyd and Vandenberghe
(2004). The last book is particularly accessible for an engineering audi-
ence, and its PDF is freely available on the home page of its first author.
Newton’s method for nonlinear equations and its many variants are
described and analyzed in a textbook by Deuflhard (2011). An up-to-
date overview of optimization tools can be found at plato.asu.edu/
guide.html, many optimization solvers are available as source code
at www.coin-or.org, and many optimization solvers can be accessed
online via neos-server.org.
While the direct single-shooting method often is implemented by
coupling an efficient numerical integration solver with a general non-
linear program (NLP) solver such as SNOPT (Gill, Murray, and Saun-
ders, 2005), the direct multiple-shooting and direct collocation meth-
ods need to be implemented by using NLP solvers that fully exploit the
sparsity structure, such as IPOPT5 (Wächter and Biegler, 2006). There
exist many custom implementations of the direct multiple-shooting
5 This code is available to the public under a permissive open-source license.
method with their own structure-exploiting NLP solvers, such as, for
example, HQP5 (Franke, 1998); MUSCOD-II (Leineweber, Bauer, Schäfer,
Bock, and Schlöder, 2003); ACADO5 (Houska, Ferreau, and Diehl, 2011);
and FORCES-NLP (Zanelli, Domahidi, Jerez, and Morari, 2017).
Structure-exploiting QP solvers that can be used standalone for lin-
ear MPC or as subproblem solvers within SQP methods are, for example,
the dense code qpOASES5 (Ferreau, Kirches, Potschka, Bock, and Diehl,
2014), which is usually combined with condensing, or the sparse codes
FORCES (Domahidi, 2013); qpDUNES5 (Frasch, Sager, and Diehl, 2015);
and HPMPC5 (Frison, 2015). The latter is based on a CPU specific ma-
trix storage format that by itself leads to speedups in the range of one
order of magnitude, and which was made available to the public in the
BLASFEO5 library at github.com/giaf/blasfeo.
In Section 8.2 on numerical simulation methods, we have exclu-
sively treated Runge-Kutta methods because they play an important
role within a large variety of numerical optimal control algorithms, such
as shooting, collocation, or pseudospectral methods. Another popular
and important family of integration methods, however, are the linear
multistep methods; in particular, the implicit backward differentiation
formula (BDF) methods are widely used for simulation and optimization
of large stiff differential algebraic equations (DAEs). For an in-depth
treatment of general numerical simulation methods for ordinary dif-
ferential equations (ODEs) and DAEs, we recommend the textbooks by
Hairer, Nørsett, and Wanner (1993, 1996); as well as Brenan, Campbell,
and Petzold (1996); Ascher and Petzold (1998).
For derivative generation of numerical simulation methods, we refer
to the research articles Bauer, Bock, Körkel, and Schlöder (2000); Pet-
zold, Li, Cao, and Serban (2006); Kristensen, Jørgensen, Thomsen, and
Jørgensen (2004); Quirynen, Gros, Houska, and Diehl (2017a); Quirynen,
Houska, and Diehl (2017b); and the Ph.D. theses by Albersmeyer (2010);
Quirynen (2017). A collection of numerical ODE and DAE solvers with
efficient derivative computations are implemented in the SUNDIALS5
suite (Hindmarsh, Brown, Grant, Lee, Serban, Shumaker, and Wood-
ward, 2005).
Regarding Section 8.4 on derivatives, we refer to a textbook on al-
gorithmic differentiation (AD) by Griewank and Walther (2008), and
an overview of AD tools at www.autodiff.org. The AD framework
CasADi5 can in its latest form be found at casadi.org, and is de-
scribed in the article Andersson, Akesson, and Diehl (2012); and the
Ph.D. theses by Andersson (2013); Gillis (2015).
8.12 Exercises
Some of the exercises in this chapter were developed for courses on
numerical optimal control at the University of Freiburg, Germany. The
authors gratefully acknowledge Joel Andersson, Joris Gillis, Sébastien
Gros, Dimitris Kouzoupis, Jesus Lago Garcia, Rien Quirynen, Andrea
Zanelli, and Mario Zanon for contributions to the formulation of these
exercises; as well as Michael Risbeck, Nishith Patel, Douglas Allan, and
Travis Arnold for testing and writing solution scripts.
(a) Write a computer program that performs Newton iterations in Rn that takes as
inputs a function F (z), its Jacobian J(z), and a starting point z[0] ∈ Rn . It
shall output the first 20 full-step Newton iterations. Test your program with
R(z) = z³² − 2, starting first at z[0] = 1 and then at different positive initial
guesses. How many iterations do you typically need in order to obtain a solution
that is exact up to machine precision?
Use your algorithm to implement Newton’s method for this lifted problem and
start it at z[0] = [1 1 1 1 1]′ (note that we use square brackets in the index to
denote the Newton iteration). Compare the convergence of the iterates for the
lifted problem with those of the equivalent unlifted problem from the previous
task, initialized at one.
(d) Regard the problem of finding a solution to the nonlinear equation system
2x = e^{y/4} and 16x⁴ + 81y⁴ = 4 in the two variables x, y ∈ R. Solve it with
your implementation of Newton’s method using different initial guesses. Does
it always converge, and, if it converges, does it always converge to the same
solution?
(b) Now formulate and solve the problem with the simultaneous approach, and solve
it with an exact Newton’s method initialized at u = 0 and the corresponding
state sequence that is obtained by forward simulation started at x0 . Plot the
state trajectories in each iteration.
Plot again the residual values x(N) and the variable u as a function of the Newton
iteration index, and compare with the results that you have obtained with the
sequential approach. Do you observe differences in the convergence speed?
(c) One feature of the simultaneous approach is that its states can be initialized with
any trajectory, even an infeasible one. Initialize the simultaneous approach with
the all-zero trajectory, and again observe the trajectories and the convergence
speed.
(d) Now solve both formulations with a Newton-type method that uses a constant
Jacobian. For both approaches, the constant Jacobian corresponds to the exact
Jacobian at the solution of the same problem for x0 = 0, where all states and the
control are zero. Start with implementing the sequential approach, and initialize
the iterates at u = 0. Again, plot the residual values x(N) and the variable u as
a function of iteration index.
(e) Now implement the simultaneous approach with a fixed Jacobian approxima-
tion. Again, the Jacobian approximation corresponds to the exact Jacobian at
the solution of the neighboring problem with x0 = 0, i.e., the all zero trajectory.
Start the Newton-type iterations with all states and the control set to zero, and
plot the residual values x(N) and the variable u as a function of iteration index.
Discuss the differences of convergence speed with the sequential approach and
with the exact Newton methods from before.
(f) The performance of the sequential approach can be improved if one introduces
the initial state x(0) as a second decision variable. This allows more freedom
for the initialization, and one can automatically profit from tangential solution
predictors. Adapt your exact Newton method, initialize the problem in the all-
zero solution and again observe the results.
(g) If u∗ is the exact solution that is found at the end of the iterations, plot the logarithm of |u − u∗| versus the iteration number for all six numerical experiments (a)–(f), and compare.
(h) The linear system that needs to be solved in each iteration of the simultaneous
approach is large and sparse. We can use condensing in order to reduce the linear
system to size one. Implement a condensing-based linear system solver that only
uses multiplications and additions, and one division. Compare the iterations
with the full-space linear algebra approach, and discuss the differences in the
iterations, if any.
(b) f (x) = −c ′ x − x ′ A′ Ax on Rn
{x ∈ Rn | a′1 x ≤ b1 , a′2 x ≤ b2 }
(h) A polyhedron
{x ∈ Rn | Ax ≤ b}
Ω = {x ∈ Rn | dist(x, S) ≤ dist(x, T )}
We assume δ > t for any perturbation size t in the following finite difference approximations. Due to finite machine precision ϵmach that leads to truncation errors, the computed function f̃(x) = f(x)(1 + ϵ(x)) is perturbed by noise ϵ(x) that satisfies the bound |ϵ(x)| ≤ ϵmach.
$$ \tilde f'_{\mathrm{fd},t}(x_0) := \frac{\tilde f(x_0 + t) - \tilde f(x_0)}{t} $$
namely, a function ψ(t; fmax, f″max, ϵmach) that satisfies
$$ \left| \tilde f'_{\mathrm{fd},t}(x_0) - f'(x_0) \right| \le \psi(t; f_{\max}, f''_{\max}, \epsilon_{\mathrm{mach}}) $$
(b) Which value t∗ minimizes this bound, and what value ψ∗ does the bound attain at t∗?
(c) Perform a similar error analysis for the central difference quotient
$$ \tilde f'_{\mathrm{cd},t}(x_0) := \frac{\tilde f(x_0 + t) - \tilde f(x_0 - t)}{2t} $$
that is, compute a bound
$$ \left| \tilde f'_{\mathrm{cd},t}(x_0) - f'(x_0) \right| \le \psi_{\mathrm{cd}}(t; f_{\max}, f''_{\max}, f'''_{\max}, \epsilon_{\mathrm{mach}}) $$
(d) For central differences, what is the optimal perturbation size t∗cd, and what is the size ψ∗cd of the resulting bound on the error?
Figure: the hanging chain; vertical axis z (m).
Exercise 8.6: Finding the equilibrium point of a hanging chain using CasADi
Consider an elastic chain attached to two supports and hanging in-between. Let us
discretize it with N mass points connected by N − 1 springs. Each mass i has position
(yi , zi ), i = 1, . . . , N.
Our task is to minimize the total potential energy, which is made up of the potential energy in each spring and the potential energy of each mass according to
$$ J(y_1, z_1, \dots, y_N, z_N) = \underbrace{\frac{1}{2} \sum_{i=1}^{N-1} D_i \left( (y_i - y_{i+1})^2 + (z_i - z_{i+1})^2 \right)}_{\text{spring potential energy}} + \underbrace{\sum_{i=1}^{N} g_0\, m_i\, z_i}_{\text{gravitational potential energy}} \qquad (8.63) $$
(c) Now introduce ground constraints: zi ≥ 0.5 and zi ≥ 0.5 + 0.1 yi , for i = 2, · · · ,
N − 2. Resolve the QP and compare with the unconstrained solution.
(d) We now want to formulate and solve a nonlinear program (NLP). Since an NLP is a
generalization of a QP, we can solve the above problem with an NLP solver. This
can be done by simply changing casadi.qpsol in the script to casadi.nlpsol
and the solver plugin ’qpoases’ with ’ipopt’, corresponding to the open-
source NLP solver IPOPT. Are the solutions of the NLP and QP solver the same?
(e) Now, replace the linear equalities by nonlinear ones that are given by zi ≥ 0.5 +
0.1 yi2 for i = 2, · · · , N − 2. Modify the expressions from before to formulate
and solve the NLP, and visualize the solution. Is the NLP convex?
(f) Now, by modifications of the expressions from before, formulate and solve an
NLP where the inequality constraints are replaced by zi ≥ 0.8 + 0.05 yi − 0.1 yi2
for i = 2, · · · , N − 2. Is this NLP convex?
Consider the following OCP, corresponding to driving a Van der Pol oscillator to the
origin, on a time horizon with length T = 10
Figure 8.9: Direct single shooting solution for (8.65) without path constraints.
$$ \underset{x(\cdot),\, u(\cdot)}{\text{minimize}} \quad \int_0^T \left( x_1(t)^2 + x_2(t)^2 + u(t)^2 \right) dt $$
(a) Figure 8.9 shows the solution to the above problem using a direct single shooting
approach, without enforcing the constraint −0.25 ≤ x1 (t). Go through the code
for the figure step by step. The code begins with a modeling step, where sym-
bolic expressions for the continuous-time model are constructed. Thereafter,
the problem is transformed into discrete time by formulating an object that
integrates the system forward in time using a single step of the RK4 method.
This function also calculates the contribution to the objective function for the
same interval using the same integrator method. In the next part of the code, a
(b) Modify the code so that the path constraint on x1 (t) is being respected. You
only need to enforce this constraint at the end of each control interval. This
should result in additional components to the NLP constraint function G(w),
which will now have upper and lower bounds similar to the decision variable w.
Resolve the modified problem and compare the solution.
(c) Modify the code to implement the direct multiple-shooting method instead of
direct single shooting. This means introducing decision variables corresponding
to not only the control trajectory, but also the state trajectory. The added deci-
sion variables will be matched with an equal number of new equality constraints,
enforcing that the NLP solution corresponds to a continuous state trajectory.
The initial and terminal conditions on the state can be formulated as upper and
lower bounds on the corresponding elements of w. Use x(t) = 0 as the initial
guess for the state trajectory.
(d) Compare the IPOPT output for both transcriptions. How did the change from
direct single shooting to direct multiple shooting influence
(e) Generalize the RK4 method so that it takes M = 4 steps instead of just one. This
corresponds to a higher-accuracy integration of the model dynamics. Approxi-
mately how much smaller discretization error can we expect from this change?
(f) Replace the RK4 integrator with the variable-order, variable-step size code
CVODES from the SUNDIALS suite, available as the ’cvodes’ plugin for
casadi.integrator. Use 10−8 for the relative and absolute tolerances. Consult
CasADi’s user guide for syntax. What are the advantages and disadvantages of
using this integrator over the fixed-step RK4 method used until now?
We also can get an expression for the state at the end of the interval
$$ \tilde x_{k+1,0} = \sum_{r=0}^{d} L_r(1)\, x_{k,r} =: \sum_{r=0}^{d} D_r\, x_{k,r} $$
Finally, we also can integrate our approximation over the interval, giving a formula for quadratures
$$ \int_{t_k}^{t_{k+1}} \tilde x_k(t)\, dt = h \sum_{r=0}^{d} \left( \int_0^1 L_r(\tau)\, d\tau \right) x_{k,r} =: h \sum_{r=1}^{d} b_r\, x_{k,r} $$
(a) Figure 8.10 shows an open-loop simulation for the ODE in (8.65) using Gauss-
Legendre collocation of order 2, 4, and 6. A constant control u(t) = 0.5 was
applied and the initial conditions were given by x(0) = [0, 1]′ . The figure on
the left shows the first state x1 (t) for the three methods as well as a high-
accuracy solution obtained from CVODES, which uses a backward differentia-
tion formula (BDF) method. In the figure on the right we see the discretization
error, as compared with CVODES. Go through the code for the figure and make
sure you understand it. Using this script as a template, replace the integrator
in the direct multiple-shooting method from Exercise 8.7 with this collocation
integrator. Make sure that you obtain the same solution. The structure of the
NLP should remain unchanged—you are still implementing the direct multiple-
shooting approach, only with a different integrator method.
(b) In the NLP transcription step, replace the embedded function call with additional
degrees of freedom corresponding to the state at all the collocation points. En-
force the collocation equations at the NLP level instead of the integrator level.
Enforce upper and lower bounds on the state at all collocation points. Compare
the solution time and number of nonzeros in the Jacobian and Hessian matrices
with the direct multiple-shooting method.
with state x = [p, v]′ and C := 180/π /10, to solve an OCP using a direct multiple-
shooting method and a self-written sequential quadratic programming (SQP) solver
with a Gauss-Newton Hessian.
Figure 8.10: Open-loop simulation of the ODE in (8.65) using Gauss-Legendre collocation of orders 2, 4, and 6 (left: x1(t); right: discretization error relative to CVODES).
(a) Starting with the pendulum at x̄0 = [10 0]′, we aim to minimize the required controls to bring the pendulum to xN = [0 0]′ in a time horizon T = 10 s. Taking into account the bounds on p, v, and u, namely pmax = 10, vmax = 10, and umax = 3, the required controls can be obtained as the solution of the following OCP
$$ \begin{aligned} \underset{x_0, u_0, x_1, \dots, u_{N-1}, x_N}{\text{minimize}} \quad & \frac{1}{2} \sum_{k=0}^{N-1} \|u_k\|_2^2 \\ \text{subject to} \quad & \bar x_0 - x_0 = 0 \\ & \Phi(x_k, u_k) - x_{k+1} = 0, \quad k = 0, \dots, N-1 \\ & x_N = 0 \\ & -x_{\max} \le x_k \le x_{\max}, \quad k = 0, \dots, N-1 \\ & -u_{\max} \le u_k \le u_{\max}, \quad k = 0, \dots, N-1 \end{aligned} $$
Formulate the discrete dynamics xk+1 = Φ(xk, uk) using an RK4 integrator with a time step ∆t = 0.2 s. Encapsulate the code in the form of a single CasADi function object, as in Exercise 8.7. Simulate the system forward in time and plot the result.
(b) Using w = (x0, u0, . . . , uN−1, xN) as the NLP decision variable, we can formulate the equality constraint function G(w), the least squares function M(w), and the bound vector wmax, so that the OCP takes the form
$$ \begin{aligned} \underset{w}{\text{minimize}} \quad & \frac{1}{2} |M(w)|_2^2 \\ \text{subject to} \quad & G(w) = 0 \\ & -w_{\max} \le w \le w_{\max} \end{aligned} $$
The SQP method with Gauss-Newton Hessian solves a linearized version of this
problem in each iteration. More specifically, if the current iterate is w̄, the next
iterate is given by w̄ + ∆w, where ∆w is the solution of the following QP
$$ \begin{aligned} \underset{\Delta w}{\text{minimize}} \quad & \frac{1}{2} \Delta w'\, J_M(\bar w)' J_M(\bar w)\, \Delta w + M(\bar w)' J_M(\bar w)\, \Delta w \\ \text{subject to} \quad & G(\bar w) + J_G(\bar w)\, \Delta w = 0 \\ & -w_{\max} - \bar w \le \Delta w \le w_{\max} - \bar w \end{aligned} \qquad (8.66) $$
$$ H = \begin{bmatrix} H_x & & & \\ & H_u & & \\ & & \ddots & \\ & & & H_x \end{bmatrix} $$
where the blocks Hx and Hu act on the state and control components of w.
(c) Figure 8.11 shows the control trajectory after 0, 1, 2, and 6 iterations of the
Gauss-Newton method applied to a direct multiple-shooting transcription of
(8.65). Go through the code for the figure step by step. You should recog-
nize much of the code from the solution to Exercise 8.7. The code represents a
simplified, yet efficient way of using CasADi to solve OCPs.
Modify the code to solve the pendulum problem. Note that the sparsity patterns
of the linear and quadratic terms of the QP are printed out at the beginning of
the execution. JG(w) is a block sparse matrix with blocks being either identity matrices I or partial derivatives Ak = ∂Φ/∂x(xk, uk) and Bk = ∂Φ/∂u(xk, uk).
Initialize the Gauss-Newton procedure at w = 0, and stop the iterations when |wk+1 − wk| gets smaller than 10−4. Plot the iterates as well as the vector G
during the iterations. How many iterations do you need?
Figure 8.11: Control trajectory u(t) after 0, 1, 2, and 6 Gauss-Newton iterations.
subject to x̄0 − x0 = 0
Φ(xk , uk ) − xk+1 = 0, k = 0, . . . , N − 1
xN = 0
−xmax ≤ xk ≤ xmax , k = 0, . . . , N − 1
−umax ≤ uk ≤ umax , k = 0, . . . , N − 1
In this problem, we regard x̄0 as a parameter and modify the simultaneous Gauss-
Newton algorithm from Exercise 8.9. In particular, we modify this algorithm to per-
form real-time iterations for different values of x̄0 , so that we can use the algorithm
to perform closed-loop nonlinear MPC simulations for stabilization of the nonlinear
pendulum.
(a) Modify the function sqpstep from the solution of Exercise 8.9 so that it accepts
the parameter x̄0 . You would need to update the upper and lower bounds on w
accordingly. Test it and make sure that it works.
(b) In order to visualize the generalized tangential predictor, call the sqpstep
method with different values for x̄0 while resetting the variable vector w̄ to its
initial value (zero) between each call. Use a linear interpolation for x̄0 with 100
points between zero and the value (10, 0)′ , i.e., set x̄0 = λ[10 0]′ for λ ∈ [0, 1].
Plot the first control u0 as a function of λ and keep your plot.
(c) To compute the exact solution manifold with relatively high accuracy, perform
now the same procedure for the same 100 increasing values of λ, but this time
perform for each value of λ multiple Gauss-Newton iterations, i.e., replace each
call to sqpstep with, e.g., 10 calls without changing x̄0 . Plot the obtained values
for u0 and compare with the tangential predictor from the previous task by
plotting them in the same plot.
(d) In order to see how the real-time iterations work in a more realistic setting, let
the values of λ jump faster from 0 to 1, e.g., by doing only 10 steps, and plot the
result again into the same plot.
(e) Modify the previous algorithm as follows: after each change of λ by 0.1, keep it
constant for nine iterations, before you do the next jump. This results in about
100 consecutive real-time iterations. Interpret what you see.
(f) Now we do the first closed-loop simulation: set the value of $\bar x_0^{[1]}$ to [10 0]′ and initialize w[0] at zero, and perform the first real-time iteration by calling sqpstep. This iteration yields the new solution guess w[1] and corresponding control $u_0^{[1]}$. Use this control at the "real plant," i.e., generate the next value of x̄0, which we denote $\bar x_0^{[2]}$, by calling the one-step simulation function, $\bar x_0^{[2]} := \Phi(\bar x_0^{[1]}, u_0^{[1]})$. Close the loop by calling sqpstep using w[1] and $\bar x_0^{[2]}$, etc., and perform 100 iterations. For better observation, plot after each real-time iteration the control and state variables on the whole prediction horizon. (It is interesting to note that the state trajectory is not necessarily feasible.)
Also observe what happens with the states x̄0 during the scenario, and plot
them in another plot against the time index. Do they converge, and if yes, to
what value?
(g) Now we make the control problem more difficult by treating the pendulum in an
upright position, which is unstable. This is simply done by changing the sign in
front of the sine in the differential equation, i.e., our model is now
$$ f(x(t), u(t)) = \begin{bmatrix} v(t) \\ C \sin(p(t)/C) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t) \qquad (8.67) $$
Start your real-time iterations again at w[0] = 0 and set $\bar x_0^{[1]}$ to [10 0]′, and
perform the same closed-loop simulation as before. Explain what happens.
(CIA) (Sager et al., 2011). The CIA step solves the following optimization problem
$$ \min_{u_b}\; \max_{j \in \mathbb{I}_{1:n_b},\; k \in \mathbb{I}_{0:N-1}} \left| \sum_{i=0}^{k} \big( u_{b,j}(i) - u^*_{b,j}(i) \big) \right| $$
in which ub is the discrete control sequence that approximates ub∗, the real-valued solution of a nonlinear program in the heuristic.
in this optimization such as rate-of-change constraints, dwell-time constraints, etc.
Consider the standard form of a mixed-integer linear program (MILP)
$$ \begin{aligned} \min_{x,\,y} \quad & c'x + d'y \\ \text{subject to} \quad & Ax + Ey \le b \\ & y \in \mathbb{B}^s \end{aligned} $$
with real x ∈ Rq and b ∈ Rr , and binary y ∈ Bs . State the CIA step in the standard
form of an MILP, i.e., give the MILP variables x, y, c, d, A, E, b, q, r , s for solving the CIA
step.
Bibliography
J. Albersmeyer and M. Diehl. The lifted Newton method and its application in
optimization. SIAM J. Optim., 20(3):1655–1684, 2010.
D. Axehill. Controlling the level of sparsity in MPC. Sys. Cont. Let., 76:1–7,
2015.
D. Axehill and M. Morari. An alternative use of the Riccati recursion for efficient
optimization. Sys. Cont. Let., 61(1):37–40, 2012.
A. Domahidi. Methods and Tools for Embedded Optimization and Control. PhD
thesis, ETH Zürich, 2013.
Reif and Unbehauen (1999), 303, 330
Parisini and Zoppoli (1995), 168,
Reif et al. (1999), 303, 330
169, 190
Reif et al. (2000), 303, 331
Peressini et al. (1988), 769, 770
Richalet et al. (1978a), 167, 191
Peterka (1984), 167, 190
Richalet et al. (1978b), 167, 191
Petzold et al. (2006), 580, 598
Robertson and Lee (2002), 319, 331
Picasso et al. (2003), 171, 190
Robinson (1980), 545, 599
Polak (1997), 624, 692, 757, 759,
Rockafellar (1970), 439, 443, 641,
760, 770
692
Prasad et al. (2002), 303, 330
Rockafellar and Wets (1998), 212,
Prett and Gillette (1980), 167, 190
268, 646, 692, 739, 740, 746,
Primbs and Nevistić (2000), 169, 190
748–750, 757, 759, 770
Propoi (1963), 167, 190
Romanenko and Castro (2004), 306,
331
Qiu and Davison (1993), 50, 87 Romanenko et al. (2004), 306, 331
Qu and Hahn (2009), 311, 330 Roset et al. (2008), 170, 191, 313, 331
Quevedo et al. (2004), 171, 190 Rossiter (2004), 1, 87
Quirynen (2017), 580, 599 Rossiter et al. (1998), 258, 268
Quirynen et al. (2017a), 580, 599
Quirynen et al. (2017b), 580, 599 Sager et al. (2011), 577, 594, 599
Sager et al. (2012), 577, 599
Raković (2012), 259, 267 Sandell Jr. et al. (1978), 427, 443
Raković et al. (2003), 259, 267 Santos and Biegler (1999), 257, 268
Raković et al. (2005), 339, 362 Scattolini (2009), 428, 443
Raković et al. (2005a), 231, 259, 267 Schur (1909), 629, 692
Raković et al. (2005b), 259, 268 Scokaert and Mayne (1998), 258, 268
Raković et al. (2012), 259, 268 Scokaert and Rawlings (1998), 168,
Rao (2000), 300, 319, 330 169, 191
Citation Index 613
Scokaert et al. (1997), 257, 268, 479, Wächter and Biegler (2006), 528, 554,
484 579, 599
Scokaert et al. (1999), 147, 170, 191, Wang (2009), 1, 87
371, 443 Wiener (1949), 318, 331
Selby (1973), 73, 87 Wilson et al. (1998), 303, 332
Serón et al. (2000), 476, 477, 484 Wright (1997), 366, 444
Sideris and Bobrow (2005), 564, 599
Šiljak (1991), 427, 443 Yang et al. (2013), 302, 332
Sontag (1998), 23, 41, 87, 321, 331 Ydstie (1984), 167, 191
Sontag (1998a), 706, 707, 713, 728 Yu et al. (2011), 258, 268
Sontag (1998b), 705, 728 Yu et al. (2014), 170, 191
Sontag and Wang (1995), 717, 728
Sontag and Wang (1997), 275, 276, Zanelli et al. (2017), 580, 599
321, 331, 719, 728 Zanon et al. (2013), 170, 191
Stengel (1994), 303, 319, 331 Zeile et al. (2020), 577, 599
Stewart et al. (2010), 421, 443
Stewart et al. (2011), 416, 418, 421,
423, 443
Strang (1980), 23, 42, 87, 625, 626,
692
Sznaier and Damborg (1987), 168,
191
Subject Index

Combining MHE and MPC, 312
    stability, 314
Complementarity condition, 543
    strict, 544
Concave function, 647
Condensing, 491, 560
Cone
    convex, 644
    normal, 439, 737, 743, 746, 748
    polar, 453, 455, 644, 740
    tangent, 737, 743, 746, 748
Constrained Gauss-Newton method, 549
Constraint qualification, 479, 750, 751
Constraints, 6
    active, 543, 743
    coupled input, 405
    hard, 7, 94
    input, 6, 94
    integrality, 8
    output, 6
    polyhedral, 743
    probabilistic, 254
    soft, 7, 132
    state, 6, 94
    terminal, 96, 144–147, 212
    tightened, 202, 223, 230, 242, 346, 357
    trust region, 514
    uncoupled input, 402
Continuation methods, 571
Continuity, 633
    lower semicontinuous, 634
    uniform, 634
    upper semicontinuous, 634
Control law, 90, 200, 445
    continuity, 104
    discontinuity, 104
    explicit, 446
    implicit, 100, 210, 446
    offline, 89, 236
    online, 89
    time-invariant, 100
Control Lyapunov function, see CLF
Control vector parameterization, 532
Controllability, 23
    canonical form, 68
    duality with observability, 291
    matrix, 23
    weak, 116
Controllable, 23
Converse theorem
    asymptotic stability, 705
    exponential stability, 374, 725
Convex, 646
    cone, 644
    function, 488, 583, 646
    hull, 641
    optimality condition, 453
    optimization problem, 487, 741
    set, 338, 583, 641
Cooperative control, 363, 386
    algorithm, 421
    distributed nonlinear, 419
Correlation, 668
Cost function, 11, 95, 369
DAE, 505
    semiexplicit DAE of index one, 506
Damping, 514
DARE, 25, 69, 136
DDP, 564
    exact Hessian, 565
    Gauss-Newton Hessian, 565
Decentralized control, 363, 377
Decreasing, see Sequence
Derivatives, 636
Detectability, 50, 120, 275, 319, 321, 322, 719
    duality with stabilizability, 291
    exponential, 285
Detectable, 26, 68, 72, 73, 325
Determinant, 27, 628, 659, 666
Deterministic problem, 91
Difference equation, 5
    linear, 5
A
Mathematical Background
A.1 Introduction
In this appendix we give a brief review of some concepts that we need.
It is assumed that the reader has had at least a first course on lin-
ear systems and has some familiarity with linear algebra and analy-
sis. The appendices of Polak (1997); Nocedal and Wright (2006); Boyd
and Vandenberghe (2004) provide useful summaries of the results we
require. The material presented in Sections A.2–A.14 follows closely
Polak (1997) and earlier lecture notes of Professor Polak.
A.4 Linear Equations — Existence and Uniqueness
example, if A is the column vector (1, 1)′, then R(A) is the subspace spanned by the vector (1, 1)′ and the rank of A is 1. The nullspace N(A) is the set of vectors in Rn that are mapped to zero by A so that N(A) = {x | Ax = 0}. The nullspace N(A) is a subspace of Rn. For the example above, N(A) is the subspace spanned by the vector (1, −1)′. It is an important fact that R(A′) ⊕ N(A) = Rn or, equivalently, that N(A) = (R(A′))⊥ where A′ ∈ Rn×m is the transpose of A and S⊥ denotes the orthogonal complement of any subspace S; a consequence is that the sum of the dimensions of R(A) and N(A) is n. If A is square and invertible, then n = m and the dimension of R(A) is n so that the dimension of N(A) is 0, i.e., the nullspace contains only the zero vector, N(A) = {0}.
Ax = b
A.5 Pseudo-Inverse
The solution of Ax = y when A is invertible is x = A⁻¹y where A⁻¹ is the inverse of A. Often an approximate solution of Ax = y is required when A is not invertible. This is provided by the pseudo-inverse A† of A; if A ∈ Rm×n, then A† ∈ Rn×m. The properties of the pseudo-inverse are illustrated in Figure A.2 for the case when A ∈ R2×2 where both R(A) and N(A) have dimension 1. Suppose we require a solution to the equation Ax = y. Since every x ∈ R2 is mapped into R(A), we see that a solution may only be obtained if y ∈ R(A). Suppose this is not the case, as in Figure A.2. Then the closest point, in the Euclidean sense, to y in R(A) is the point y*, which is the orthogonal projection of y onto R(A).
[Figure A.1: the matrix A maps Rn = R(A′) ⊕ N(A) into Rm = R(A) ⊕ N(A′), with dim R(A′) = dim R(A) = r, dim N(A) = n − r, and dim N(A′) = m − r.]

[Figure A.2: the pseudo-inverse A† maps y to the point x* ∈ R(A′) satisfying Ax* = y*, where y* is the orthogonal projection of y onto R(A).]
A† = V Σ−1 U ′
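As a numerical check of this formula, the following sketch (the rank-deficient example matrix is hypothetical; NumPy assumed available) builds A† from the SVD and verifies the projection property illustrated in Figure A.2.

```python
import numpy as np

# Rank-deficient example matrix (hypothetical, for illustration only).
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])

# SVD: A = U S V'.  Keep only the nonzero singular values.
U, s, Vt = np.linalg.svd(A)
r = np.sum(s > 1e-12)                      # numerical rank
A_dag = Vt[:r].T @ np.diag(1.0 / s[:r]) @ U[:, :r].T   # A† = V Σ⁻¹ U′

assert np.allclose(A_dag, np.linalg.pinv(A))

# A† y solves the least-squares problem: A x* = y*, the projection of
# y onto R(A), even though y ∉ R(A).
y = np.array([1.0, 0.0])
x_star = A_dag @ y
print(A @ x_star)                          # [0.5 0.5], the projection of y
```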
Note that this result is still valid if B is singular. A host of other useful
control-related inversion formulas follow from these results. Equating
the (1,1) or (2,2) entries of Z −1 gives the identity
(I + X −1 )−1 = I − (I + X)−1
If B is nonsingular
x ′ Qx > 0, ∀ nonzero x ∈ Rn
x ′ Qx ≥ 0, ∀x ∈ Rn
3. Q ≥ 0 ⇒ R ′ QR ≥ 0 ∀R.
6. Q1 > 0, Q2 ≥ 0 ⇒ Q = Q1 + Q2 > 0.
You may want to use the Schur decomposition (Schur, 1909) of a matrix in establishing some of these eigenvalue results. Golub and Van Loan (1996, p. 313) provide the following theorem: for every A ∈ Cn×n there exists a unitary Q ∈ Cn×n such that

Q∗AQ = T

in which T is upper triangular.
(a) A is nonsingular
(b) B is unique
(c) AB = I
For real, but not necessarily symmetric, A you can restrict yourself
to real matrices, by using the real Schur decomposition (Golub and
Van Loan, 1996, p.341), but the price you pay is that you can achieve
only block upper triangular T , rather than strictly upper triangular T .
in which each Rii is either a real scalar or a 2×2 real matrix having com-
plex conjugate eigenvalues; the eigenvalues of Rii are the eigenvalues
of A.
If the eigenvalues of Rii are disjoint (i.e., the eigenvalues are not re-
peated), then R can be taken block diagonal instead of block triangular
(Golub and Van Loan, 1996, p.366).
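As an illustration, the real Schur form can be computed numerically; the sketch below uses SciPy's scipy.linalg.schur on a hypothetical matrix with one complex conjugate eigenvalue pair.

```python
import numpy as np
from scipy.linalg import schur

# Hypothetical matrix with a complex conjugate eigenvalue pair.
A = np.array([[0.0, -2.0, 1.0],
              [1.0,  0.0, 3.0],
              [0.0,  0.0, 0.5]])

# Real Schur decomposition: A = Q R Q' with Q orthogonal and R block
# upper triangular; 2x2 diagonal blocks carry complex conjugate pairs.
R, Q = schur(A, output='real')

assert np.allclose(Q @ R @ Q.T, A)
assert np.allclose(Q.T @ Q, np.eye(3))     # Q is orthogonal
print(np.round(R, 3))                      # note the 2x2 block for the pair
```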
A.8 Norms in Rn
A norm in Rn is a function |·| : Rn → R≥0 such that
A.9 Sets in Rn
The complement of S ⊂ Rn in Rn is the set S c := {x ∈ Rn | x ∉ S}. A
set X ⊂ Rn is said to be open, if for every x ∈ X, there exists a ρ > 0
such that B(x, ρ) ⊆ X. A set X ⊂ Rn is said to be closed if X c , its
complement in Rn , is open.
A set X ⊂ Rn is said to be bounded if there exists an M < ∞ such that
|x| ≤ M for all x ∈ X. A set X ⊂ Rn is said to be compact if X is closed
and bounded. An element x ∈ S ⊆ Rn is an interior point of the set S if
there exists a ρ > 0 such that z ∈ S, for all |z − x| < ρ. The interior of a
set S ⊂ Rn , int(S), is the set of all interior points of S; int(S) is an open
set, the largest 2 open subset of S. For example, if S = [a, b] ⊂ R, then
int(S) = (a, b); as another example, int(B(x, ρ)) = {z | |z − x| < ρ}.
The closure of a set S ⊂ Rn , denoted S̄, is the smallest 3 closed set
containing S. For example, if S = (a, b] ⊂ R, then S̄ = [a, b]. The
boundary of S ⊂ Rn , is the set δS := S̄ \ int(S) = {s ∈ S̄ | s ∉ int(S)}.
For example, if S = (a, b] ⊂ R, then int(S) = (a, b), S̄ = [a, b], ∂S = {a,
b}.
An affine set S ⊂ Rn is a set that can be expressed in the form
S = {x} ⊕ V := {x + v | v ∈ V } for some x ∈ Rn and some subspace
V of Rn . An example is a line in Rn not passing through the origin.
The affine hull of a set S ⊂ Rn, denoted aff(S), is the smallest⁴ affine
²Largest in the sense that every open subset of S is a subset of int(S).
³Smallest in the sense that S̄ is a subset of any closed set containing S.
⁴In the sense that aff(S) is a subset of any other affine set containing S.
does not have an interior, but does have an interior relative to the line containing it, aff(S). The relative interior of S is the open line segment {x ∈ R² | x = λ(1, 0)′ + (1 − λ)(0, 1)′, λ ∈ (0, 1)}.
A.10 Sequences
Let the set of nonnegative integers be denoted by I≥0 . A sequence is a
function from I≥0 into Rn . We denote a sequence by its values, (xi )i∈I≥0 .
A subsequence of (xi )i∈I≥0 is a sequence of the form (xi )i∈K , where K
is an infinite subset of I≥0 .
A sequence (xi )i∈I≥0 in Rn is said to converge to a point x̂ if
limi→∞ |xi − x̂| = 0, i.e., if, for all δ > 0, there exists an integer k such
that |xi − x̂| ≤ δ for all i ≥ k; we write xi → x̂ as i → ∞ to denote the
fact that the sequence (xi ) converges to x̂. The point x̂ is called a limit
of the sequence (xi ). A point x ∗ is said to be an accumulation point
of a sequence (xi)i∈I≥0 in Rn if there exists an infinite subset K ⊂ I≥0 such that xi → x* as i → ∞, i ∈ K, in which case we say xi →K x*.⁵
Let (xi) be a bounded infinite sequence in R and let S be the set
of all accumulation points of (xi ). Then S is compact and lim sup xi is
the largest and lim inf xi the smallest accumulation point of (xi ):
⁵Some authors use the term limit point as synonymous with limit. Others use limit point as synonymous with accumulation point. For this reason we avoid the term limit point.
Proof. For the sake of contradiction, suppose that (xi)i∈I≥0 does not converge to x*. Then, for some ρ > 0, there exists a subsequence (xi)i∈K such that xi ∉ B(x*, ρ) for all i ∈ K, i.e., |xi − x*| > ρ for all i ∈ K. Since x* is an accumulation point, there exists a subsequence (xi)i∈K* such that xi →K* x*. Hence there is an i1 ∈ K* such that |xi − x*| ≤ ρ/2 for all i ≥ i1, i ∈ K*. Let i2 ∈ K be such that i2 > i1. Then we must have that xi2 ≤ xi1 and |xi2 − x*| > ρ, which leads to the conclusion that xi2 < x* − ρ. Now let i3 ∈ K* be such that i3 > i2. Then we must have that xi3 ≤ xi2 and hence that xi3 < x* − ρ, which implies that |xi3 − x*| > ρ. But this contradicts the fact that |xi3 − x*| ≤ ρ/2, and hence we conclude that xi → x* as i → ∞. ■
A.11 Continuity
We now summarize some essential properties of continuous functions.
|f(x′) − f(x″)| < δ

but

|f(xi′) − f(xi″)| > δ, for all i ∈ I≥0   (A.1)
Since X is compact, there must exist a subsequence (xi′)i∈K such that xi′ →K x* ∈ X as i → ∞. Furthermore, because of (A.1), xi″ →K x* also holds. Hence, since f(·) is continuous, we must have f(xi′) →K f(x*) and f(xi″) →K f(x*). Therefore, there exists an i0 ∈ K such that for all i ∈ K, i ≥ i0
Proof.
(a) First we show that f (X) is closed. Thus, let (f (xi ) | i ∈ I≥0 ), with
xi ∈ X, be any sequence in f (X) such that f (xi ) → y as i → ∞. Since
(xi) is in a compact set X, there exists a subsequence (xi)i∈K such that xi →K x* ∈ X as i → ∞. Since f(·) is continuous, f(xi) →K f(x*) as i → ∞. But y is the limit of (f(xi))i∈I≥0 and hence it is the limit of any subsequence of (f(xi))i∈I≥0. We conclude that y = f(x*) and hence that y ∈ f(X), i.e., f(X) is closed.
A.12 Derivatives
Proof. From the definition of Df (x̂) we deduce that for each i ∈ {1, 2,
. . . , m}
fi (x̂ + h) − fi (x̂) − Dfi (x̂)h
lim =0
h→0 |h|
where fi is the ith element of f and (Df)i the ith row of Df. Set h = tej, where ej is the jth unit vector in Rn so that |h| = t. Then (Df)i(x̂)h = t(Df)i(x̂)ej = t(Df)ij(x̂), t times the ijth element of the matrix Df(x̂). It then follows that
f(x, y) = x + y if x = 0 or y = 0
f(x, y) = 1 otherwise

In this case the partial derivatives exist at the origin, (∂f/∂x)(0, 0) = (∂f/∂y)(0, 0) = 1, but the function is not even continuous at (0, 0). In view of this, the following result is relevant.
The following result (Dieudonné, 1960) replaces, inter alia, the mean value theorem for functions f : Rn → Rm when m > 1.
Proof.
(a) Consider the function g(s) = f (x + s(y − x)) where f : Rn → Rm .
Then g(1) = f(y), g(0) = f(x), and

g(1) − g(0) = ∫₀¹ g′(s) ds = ∫₀¹ Df(x + s(y − x))(y − x) ds
But g″(s) = (y − x)′ fxx(x + s(y − x))(y − x) so that the last equation yields

f(y) − f(x) = fx(x)(y − x) + ∫₀¹ (1 − s)(y − x)′ fxx(x + s(y − x))(y − x) ds
Figure A.4 illustrates this for the case when ψ(x) := max{f1 (x), f2 (x)}
and I 0 (x) = {1, 2}.
[Figure A.4: the gradients ∇f1(x) and ∇f2(x) and the subdifferential ∂ψ(x), the convex hull of ∇f1(x) and ∇f2(x).]
Suppose, without loss of generality, that at least one αi < 0. Then there
exists a θ̄ > 0 such that µ̄ j + θ̄αj = 0 for some j while µ̄ i + θ̄αi ≥ 0
for all other i. Thus we have succeeded in expressing x̄ as a convex
combination of k̄ − 1 vectors in S. Clearly, these reductions can go on
as long as x̄ is expressed in terms of more than (n + 1) vectors in S.
This completes the proof. ■
H = {x ∈ Rn | ⟨x, v⟩ = α}
separates S1 and S2 if
[Figure A.5: the hyperplane H = {x ∈ Rn | ⟨x, v⟩ = α}, with normal v, separating the sets S1 and S2.]
x̂ = arg min{|x|2 | x ∈ S}
Then
H = {x | ⟨x̂, x⟩ = |x̂|2 }
separates S from 0, i.e., ⟨x̂, x⟩ ≥ |x̂|2 for all x ∈ S.
for all x2 ∈ S2 . The desired result follows from (A.6) and (A.8), the
separating hyperplane H being {x ∈ Rn | ⟨x̂, x − x̂2 ⟩ = 0}. ■
Theorem A.18 (Convex set and halfspaces). A closed convex set is equal
to the intersection of the halfspaces which contain it.
[Figure: the cone C∗, polar to C, and a hyperplane H through 0 with normal h ∈ C∗.]
C := {x ∈ Rn | ⟨ai , x⟩ ≤ 0, i ∈ I}
Then
C ∗ = cone{ai | i ∈ I}
Proof.
(a) Let the convex set K be defined by
K := cone{ai | i ∈ I}
⟨h, x⟩ = ⟨µj aj, x⟩ + ⟨h̃, x⟩ = µj⟨aj, x⟩ + ⟨h̃, x⟩ > 0

since either both µj and ⟨aj, x⟩ are strictly negative or h̃ ≠ 0 or both. This contradicts the fact that x ∈ C and h ∈ C∗ (so that ⟨h, x⟩ ≤ 0). Hence h ∈ K so that C∗ ⊂ K. It follows that C∗ = cone{ai | i ∈ I}.
(c) This result follows directly from the definition of a polar cone.
■
[Figure: graph of a convex function f; the chord joining (x, f(x)) and (y, f(y)) lies above the graph.]
Hence, dividing by (y − x)² and letting y → x, we obtain that ∂²f(x)/∂x² is positive semidefinite.
⇐ Suppose that ∂ 2 f (x)/∂x 2 is positive semidefinite for all x ∈ R.
Then it follows directly from the equality in (A.11) and Theorem A.24
that f is convex. ■
σQ(p) = sup_x {⟨p, x⟩ | x ∈ Q}
x + = f (x, u)
(b) x(t0 ) = x0 ,
x(t) = x0 + ∫_{t0}^{t} f(x(s), s) ds   (A.14)
|f(x, t)| ≤ mF(t)

for all (x, t) ∈ F. We now make use of the fact that if t ↦ h(t) is measurable, its integral t ↦ H(t) := ∫_{t0}^{t} h(s) ds is absolutely continuous and, therefore, has a derivative almost everywhere. Where H(·) is differentiable, its derivative is equal to h(·). Consequently, if f(·) satisfies the Carathéodory conditions, then the solution of (A.14), i.e., a function φ(·) satisfying (A.14) everywhere, does not satisfy (A.12) everywhere but only almost everywhere, at the points where φ(·) is differentiable. In view of this, we may speak either of a solution of (A.14) or of a solution of (A.12) provided we interpret the latter as an absolutely continuous function that satisfies (A.12) almost everywhere. The appropriate generalization of Peano's existence theorem is the following result due to Carathéodory:
for all (x, t), (y, t) in U . Then, for any (x0 , t0 ) in U there exists a unique
solution φ(·; x0, t0) passing through (x0, t0). The function (t, x0, t0) ↦ φ(t; x0, t0) : R × Rn × R → Rn is continuous in its domain E, which is open.
Hence

(d/dt)[e^{−∫₀ᵗ α(s)ds} Y(t)] = e^{−∫₀ᵗ α(s)ds} (Ẏ(t) − α(t)Y(t))
                             = e^{−∫₀ᵗ α(s)ds} α(t)(y(t) − Y(t))
                             ≤ c e^{−∫₀ᵗ α(s)ds} α(t)   (A.18)

and

y(t) ≤ c e^{∫₀ᵗ α(s)ds}

for all t ∈ [0, 1]. ■
|f(x′, u, t) − f(x, u, t)| ≤ c |x′ − x|
Fξ (x) = Pr(ξ ≤ x)
i.e., Fξ (x) is the probability that the random variable ξ takes on a value
less than or equal to x. Fξ is obviously a nonnegative, nondecreasing
function and has the following properties due to the axioms of proba-
bility
Fξ(x1) ≤ Fξ(x2) if x1 < x2
lim_{x→−∞} Fξ(x) = 0
lim_{x→∞} Fξ(x) = 1
∫_{−∞}^{∞} pξ(x) dx = 1
and it is clear that the mean is the first moment. Moments of ξ about the mean are defined by

E((ξ − E(ξ))ⁿ) = ∫_{−∞}^{∞} (x − E(ξ))ⁿ pξ(x) dx
and the variance is defined as the second moment about the mean. The standard deviation is defined as

σ(ξ) = (var(ξ))^{1/2}
and (A.21) does have unit area. Computing the mean gives

E(ξ) = (1/√(2πσ²)) ∫_{−∞}^{∞} x exp(−(1/2)(x − m)²/σ²) dx

Substituting u = x − m splits the integral into two terms; the first is zero because its integrand is odd in u, and the second produces

E(ξ) = m
var(ξ) = σ 2
ξ ∼ N(m, σ 2 )
Figure A.8 shows the normal distribution with a mean of one and standard deviations of 1/2, 1, and 2. Notice that a large variance implies that the random variable is likely to take on values far from its mean. As the variance shrinks to zero, the probability density becomes a delta function and the random variable approaches a deterministic value.
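A quick Monte Carlo check of the mean and variance formulas (a sketch; sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, sigma = 1.0, 2.0

x = rng.normal(loc=m, scale=sigma, size=1_000_000)

# Sample mean and variance should approach m and sigma**2.
print(x.mean())    # ≈ 1.0
print(x.var())     # ≈ 4.0
```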
Central limit theorem.
The central limit theorem states that if a set of n random variables xi, i = 1, 2, . . . , n are independent, then under general conditions the density py of their sum

y = x1 + x2 + · · · + xn

tends to a normal density as n → ∞.
[Figure A.8: Normal distribution, pξ(x) = (1/√(2πσ²)) exp(−(1/2)(x − m)²/σ²), with mean m = 1 and standard deviations σ = 1/2, 1, and 2.]
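The sketch below illustrates the central limit theorem by summing independent uniform random variables; the choice of uniform densities and n = 12 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, samples = 12, 200_000

# Sum of n independent U(0,1) variables; mean n/2, variance n/12.
y = rng.uniform(0.0, 1.0, size=(samples, n)).sum(axis=1)

# Standardize and compare empirical quantiles with the standard normal.
z = (y - n / 2) / np.sqrt(n / 12.0)
print(np.quantile(z, [0.025, 0.5, 0.975]))   # ≈ [-1.96, 0.0, 1.96]
```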
pξ (x) : Rn → R+
p(x) = exp(−(1/2)(3.5x1² + 2(2.5)x1x2 + 4.0x2²))

[Figure A.9: a multivariate normal density pξ(x) for n = 2.]
ξ ∼ N(m, P )
(x − m)′ P −1 (x − m)
x ′ Ax = b
Avi = λi vi
[Figure A.10: the ellipse x′Ax = b; the eigenvectors v1, v2 are aligned with the ellipse axes with half-axis lengths √(b/λ1), √(b/λ2), and the bounding box half-lengths are √(bÃ11), √(bÃ22), in which

Ãii = (i, i) element of A⁻¹]
See Exercise A.45 for a derivation of the size of the bounding box. Fig-
ure A.10 displays these results: the eigenvectors are aligned with the
ellipse axes and the eigenvalues scale the lengths. The lengths of the
sides of the box that are tangent to the ellipse are proportional to the
square root of the diagonal elements of A−1 .
Singular or degenerate normal distributions. It is often convenient
to extend the definition of the normal distribution to admit positive
semidefinite covariance matrices. The distribution with a semidefinite
covariance is known as a singular or degenerate normal distribution (Anderson, 2003, p. 30). Figure A.11 shows a nearly singular normal distribution.
To see how the singular normal arises, let the scalar random variable
ξ be distributed normally with zero mean and positive definite covari-
ance, ξ ∼ N(0, Px ), and consider the simple linear transformation
" #
1
η = Aξ A=
1
in which we have created two identical copies of ξ for the two compo-
nents η1 and η2 of η. Now consider the density of η. If we try to use
the standard formulas for transformation of a normal, we would have
" #
′ Px Px
η ∼ N(0, Py ) Py = APx A =
Px Px
and Py is singular since its rows are linearly dependent. Therefore one
of the eigenvalues of Py is zero and Py is positive semidefinite and not
positive definite. Obviously we cannot use (A.22) for the density in this
case because the inverse of Py does not exist. To handle these cases, we
first provide an interpretation that remains valid when the covariance
matrix is singular and semidefinite.
(A.23)
are not independent (see Exercise 1.40). Starting with the definition of
a singular normal, we can obtain the density for ξ ∼ N(mx , Px ) for any
positive semidefinite Px ≥ 0. The result is
pξ(x) = (1/((2π)^{r/2} (det Λ1)^{1/2})) exp(−(1/2)|x − mx|²_{Q1Λ1⁻¹Q1′}) δ(Q2′(x − mx))   (A.24)

in which the matrix Λ1 ∈ R^{r×r} and the orthonormal Q ∈ R^{n×n} are obtained from the eigenvalue decomposition of Px

Px = QΛQ′ = [Q1 Q2] [Λ1 0; 0 0] [Q1′; Q2′]
show that the marginal density of ξ is normal with the following pa-
rameters
ξ ∼ N(mx , Px ) (A.25)
Solution
As a first approach to establish (A.25), we directly integrate the y vari-
ables. Let x̄ = x − mx and ȳ = y − my , and nx and ny be the dimen-
sion of the ξ and η variables, respectively, and n = nx + ny . Then the
definition of the marginal density gives
" #′ " #−1 " #
Z∞
1 1 x̄ P x P xy x̄
pξ (x) = exp − dȳ
(2π )n/2 (det P )1/2 −∞ 2 ȳ Pyx Py ȳ
Let the inverse of P be denoted P̃ and partition P̃ as follows

[Px Pxy; Pyx Py]⁻¹ = [P̃x P̃xy; P̃yx P̃y]   (A.26)
Substituting (A.26) into the definition of the marginal density and ex-
panding the quadratic form in the exponential yields
in which a = P̃y⁻¹P̃yx x̄.
Using (A.22) to evaluate the integral gives

pξ(x) = (1/((2π)^{nx/2} (det(P) det(P̃y))^{1/2})) exp(−(1/2) x̄′(P̃x − P̃yx′P̃y⁻¹P̃yx)x̄)

and

det(P) = det(Px) det(Py − PyxPx⁻¹Pxy) = det Px / det P̃y
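Result (A.25) can also be checked by sampling: drawing from the joint normal and discarding the η components should reproduce N(mx, Px). A sketch, with a hypothetical partitioned mean and covariance (x-block first, nx = 1):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical partitioned mean and covariance, x-block first.
m = np.array([1.0, -1.0, 0.5])
P = np.array([[2.0, 0.8, 0.3],
              [0.8, 1.5, 0.2],
              [0.3, 0.2, 1.0]])

xy = rng.multivariate_normal(m, P, size=500_000)
x = xy[:, 0]                  # keep only the ξ component, discard η

print(x.mean())   # ≈ m_x = 1.0
print(x.var())    # ≈ P_x = 2.0
```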
p(x) = exp(−(1/2)(27.2x1² + 2(−43.7)x1x2 + 73.8x2²))

[Figure A.11: a nearly singular normal density.]
η = f (ξ), ξ = f −1 (η)
S ′ = {y | y = f (x), x ∈ S}
for every admissible set S. Using the rules of calculus for transforming
a variable of integration we can write
∫_S pξ(x) dx = ∫_{S′} pξ(f⁻¹(y)) |det(∂f⁻¹(y)/∂y)| dy   (A.28)
[Figure A.12: the region X(c) = {x | max(x1, x2) ≤ c}.]
Solution
The region X(c) generated by the inequality y = max(x1 , x2 ) ≤ c is
sketched in Figure A.12. Applying (A.31) then gives
Pη(y) = ∫_{−∞}^{y} ∫_{−∞}^{y} pξ(x1, x2) dx1 dx2 = Pξ(y, y) = Pξ1(y)Pξ2(y)
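A Monte Carlo check of this product formula for two independent standard normals (a sketch; the test point y is arbitrary):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x1 = rng.standard_normal(1_000_000)
x2 = rng.standard_normal(1_000_000)
eta = np.maximum(x1, x2)

y = 0.7                           # arbitrary test point
print((eta <= y).mean())          # empirical P_eta(y)
print(norm.cdf(y) ** 2)           # P_xi1(y) * P_xi2(y), nearly equal
```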
Pij = cov(ξi, ξj)

P = [ var(ξ1)      cov(ξ1, ξ2)  · · ·  cov(ξ1, ξn)
      cov(ξ2, ξ1)  var(ξ2)      · · ·  cov(ξ2, ξn)
      ...          ...          ...    ...
      cov(ξn, ξ1)  cov(ξn, ξ2)  · · ·  var(ξn) ]
cov(ξ, η) = 0
Solution
The definition of covariance gives

cov(ξ, η) = E((ξ − E(ξ))(η − E(η))) = E(ξη) − E(ξ)E(η)

Taking the expectation of the product ξη and using the fact that ξ and η are independent gives
E(ξη) = ∬_{−∞}^{∞} xy pξ,η(x, y) dx dy
      = ∬_{−∞}^{∞} xy pξ(x)pη(y) dx dy
      = (∫_{−∞}^{∞} x pξ(x) dx)(∫_{−∞}^{∞} y pη(y) dy)
      = E(ξ)E(η)
cov(ξ, η) = 0
(a) Compute the marginals pξ (x) and pη (y). Are ξ and η indepen-
dent?
pξ,η(x, y) = (1/4)(1 + xy(x² − y²))
[Figure A.13: A joint density function for the two uncorrelated random variables in Example A.42.]
Solution
The joint density is shown in Figure A.13.
(a) Direct integration of the joint density produces

pξ(x) = 1/2, x ∈ [−1, 1]    pη(y) = 1/2, y ∈ [−1, 1]

and we see that both marginals are zero mean, uniform densities. Obviously ξ and η are not independent because the joint density is not the product of the marginals.
(b) Performing the double integral for the expectation of the product term gives

E(ξη) = (1/4) ∬_{−1}^{1} (xy + (xy)²(x² − y²)) dx dy = 0
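Both integrals can also be approximated numerically on a grid; a minimal sketch using a simple Riemann sum (grid resolution is arbitrary):

```python
import numpy as np

# Joint density of Example A.42 on [-1, 1] x [-1, 1].
p = lambda x, y: 0.25 * (1.0 + x * y * (x**2 - y**2))

g = np.linspace(-1.0, 1.0, 2001)
X, Y = np.meshgrid(g, g)
dA = (g[1] - g[0]) ** 2

print(np.sum(p(X, Y)) * dA)           # ≈ 1: p is a density
print(np.sum(X * Y * p(X, Y)) * dA)   # ≈ 0: E(ξη) = 0, so cov(ξ, η) = 0
```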
Solution
We have already shown that independent implies uncorrelated for any
density, so we now show that, for normals, uncorrelated implies inde-
pendent. Given cov(ξ, η) = 0, we have
Pxy = P′yx = 0,   det P = det Px det Py

ξ ∼ N(mx, Px),   η ∼ N(my, Py)
so we have

pξ(x) = (1/((2π)^{nx/2}(det Px)^{1/2})) exp(−(1/2) x̄′Px⁻¹x̄)

pη(y) = (1/((2π)^{ny/2}(det Py)^{1/2})) exp(−(1/2) ȳ′Py⁻¹ȳ)
pξ|η(x|y) = pξ,η(x, y)/pη(y)
pξ (x) = 1/6, x = 1, 2, . . . 6
6
X
pη (y) = pξ,η (x, y)
x=1
pη (y) = 1/2, y = E, O
These are both in accordance with our intuition on the rolling of the die: uniform probability for each value 1 to 6 and equal probability for an even or an odd outcome. Now the conditional density is a different concept. The conditional density pξ|η(x|y) tells us the density of ξ given that η = y has been observed. So consider the value of this function
pξ|η (1|O)
which tells us the probability that the die has a 1 given that we know
that it is odd. We expect that the additional information on the die
being odd causes us to revise our probability that it is 1 from 1/6 to
1/3. Applying the defining formula for conditional density indeed gives
1/6
pξ|η (1|O) = pξ,η (1, O)/pη (O) = = 1/3
1/2
1/6
pη,ξ (O|1) = pη,ξ (O, 1)/pξ (1) = =1
1/6
i.e., we are sure the die is odd if it is 1. Notice that the arguments to
the conditional density do not commute as they do in the joint density.
This fact leads to a famous result. Consider the definition of conditional density, which can be expressed as

pξ,η(x, y) = pξ|η(x|y)pη(y)

or

pη,ξ(y, x) = pη|ξ(y|x)pξ(x)
Because pξ,η (x, y) = pη,ξ (y, x), we can equate the right-hand sides
and deduce
pη|ξ (y|x)pξ (x)
pξ|η (x|y) =
pη (y)
which is known as Bayes’s theorem (Bayes, 1763). Notice that this re-
sult comes in handy whenever we wish to switch the variable that is
known in the conditional density, which we will see is a key step in
state estimation problems.
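The die example is easy to reproduce by direct enumeration; a minimal sketch:

```python
from fractions import Fraction

# Joint density of the die value x and its parity y (E = even, O = odd).
p_joint = {(x, 'E' if x % 2 == 0 else 'O'): Fraction(1, 6)
           for x in range(1, 7)}

def p_eta(y):
    # Marginal of the parity: sum the joint density over x.
    return sum(p for (x, yy), p in p_joint.items() if yy == y)

def p_xi_given_eta(x, y):
    # Conditional density via the defining formula.
    return p_joint.get((x, y), Fraction(0)) / p_eta(y)

print(p_eta('O'))               # 1/2
print(p_xi_given_eta(1, 'O'))   # 1/3, as computed in the text
```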
(ξ|η) ∼ N(m, P )
Solution
The definition of conditional density gives
pξ|η(x|y) = pξ,η(x, y)/pη(y)

and

pη(y) = (1/((2π)^{nη/2}(det Py)^{1/2})) exp(−(1/2)(y − my)′Py⁻¹(y − my))
and therefore

pξ|η(x|y) = ((det Py)^{1/2} / ((2π)^{nξ/2} (det [Px Pxy; Pyx Py])^{1/2})) exp(−(1/2)a)   (A.36)
If we use P = Px − Pxy Py−1 Pyx as defined in (A.35) then we can use the
partitioned matrix inversion formula to express the matrix inverse in
the previous equation as
" #−1 " #
Px Pxy P −1 −P −1 Pxy Py−1
=
Pyx Py −Py−1 Pyx P −1 Py−1 + Py−1 Pyx P −1 Pxy Py−1
in which we use the fact that Pxy = P′yx. Substituting (A.34) into this
expression yields
a = (x − m)′ P −1 (x − m) (A.37)
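A sampling check of the conditional mean and covariance formulas (a sketch with hypothetical scalar covariance blocks; conditioning on η = y is approximated by keeping samples with η near y):

```python
import numpy as np

rng = np.random.default_rng(4)

mx, my = 0.0, 0.0
Px, Pxy, Py = 2.0, 1.2, 1.5            # hypothetical scalar blocks
P = np.array([[Px, Pxy], [Pxy, Py]])

xy = rng.multivariate_normal([mx, my], P, size=2_000_000)
y0 = 1.0
keep = np.abs(xy[:, 1] - y0) < 0.01    # crude conditioning on η ≈ y0
x = xy[keep, 0]

print(x.mean())   # ≈ mx + (Pxy/Py)(y0 − my) = 0.8
print(x.var())    # ≈ Px − Pxy²/Py = 1.04
```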
p(a|b, c) ∼ N(m, P )
m = ma + Pab Pb−1 (b − mb )
Solution
From the definition of joint density we have
p(a|b, c) = p(a, b, c)/p(b, c)
p(a|b, c) = [p(a, b, c)/p(c)] [p(c)/p(b, c)]
or
p(a|b, c) = p(a, b|c)/p(b|c)
Substituting the distribution given in (A.39) and using the result in Ex-
ample A.38 to evaluate p(b|c) yields
" # " #!
ma Pa Pab
N ,
mb Pba Pb
p(a|b, c) =
N(mb , Pb )
And now applying the methods of Example A.44 this ratio of normal
distributions reduces to the desired expression. □
Dual dynamic system (Callier and Desoer, 1991). The dynamic sys-
tem
maps an initial condition and input sequence (x(0), u(0), . . . , u(N −1))
into a final condition and an output sequence (x(N), y(0), . . . , y(N −
1)). Call this linear operator G
(x(N), y(0), . . . , y(N − 1)) = G (x(0), u(0), . . . , u(N − 1))
The dual dynamic system represents the adjoint operator G ∗
(x(0), y(1), . . . , y(N)) = G∗ (x(N), u(1), . . . , u(N))
We define the usual inner product, ⟨a, b⟩ = a′ b, and substitute into
(A.40) to obtain
If we express the y(k) in terms of x(0) and u(k) and collect terms we
obtain
0 = x(0)′[x(0) − C′u(1) − A′C′u(2) − · · · − A′ᴺx(N)]
  + u(0)′[y(1) − D′u(1) − B′C′u(2) − · · · − B′A′^{N−2}C′u(N) − B′A′^{N−1}x(N)]
  + · · ·
  + u(N − 2)′[y(N − 1) − D′u(N − 1) − B′C′u(N) − B′A′x(N)]
  + u(N − 1)′[y(N) − D′u(N) − B′x(N)]
Since this equation must hold for all (x(0), u(0), . . . , u(N − 1)), each
term in brackets must vanish. From the u(N − 1) term we conclude
From which we find the state recursion for the dual system
Passing through each term then yields the dual state space description
of the adjoint operator G ∗
So the primal and dual dynamic systems change matrices in the follow-
ing way
(A, B, C, D) -→ (A′ , C ′ , B ′ , D ′ )
Notice this result produces the duality variables listed in Table A.1 if
we first note that we have also renamed the regulator’s input matrix B
to G in the estimation problem. We also note that time runs in the op-
posite directions in the dynamic system and the dual dynamic system,
which corresponds to the fact that the Riccati equation iterations run
in opposite directions in the regulation and estimation problems.
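The adjoint relation ⟨Gw, v⟩ = ⟨w, G∗v⟩ is easy to verify numerically by simulating both recursions for a random system; a sketch (the dimensions and horizon below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, p, N = 3, 2, 2, 4
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, m))
C, D = rng.standard_normal((p, n)), rng.standard_normal((p, m))

def primal(x0, u):
    # x+ = Ax + Bu, y = Cx + Du; returns (x(N), y(0), ..., y(N-1)).
    x, ys = x0, []
    for k in range(N):
        ys.append(C @ x + D @ u[k])
        x = A @ x + B @ u[k]
    return np.concatenate([x] + ys)

def dual(xN, u):
    # Dual runs backward with (A', C', B', D'); returns (x(0), y(1), ..., y(N)).
    z, ys = xN, []
    for k in range(N, 0, -1):
        ys.append(D.T @ u[k - 1] + B.T @ z)
        z = A.T @ z + C.T @ u[k - 1]
    return np.concatenate([z] + ys[::-1])

w = rng.standard_normal(n + N * m)          # (x(0), u(0), ..., u(N-1))
v = rng.standard_normal(n + N * p)          # (x(N), u(1), ..., u(N))
Gw = primal(w[:n], w[n:].reshape(N, m))
Gsv = dual(v[:n], v[n:].reshape(N, p))
print(np.dot(Gw, v), np.dot(w, Gsv))        # equal: <Gw, v> = <w, G*v>
```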
A.18 Exercises
|x|∞ := max{|x1|, |x2|, . . . , |xn|}

|x|₁ := Σ_{j=1}^{n} |xj|
This result shows that the norms are equivalent and may be used interchangeably for
establishing that sequences are convergent, sets are open or closed, etc.
Regulator     Estimator
A             A′
B             C′
C             G′
k             l = N − k
Π(k)          P⁻(l)
Π(k − 1)      P⁻(l + 1)
Π             P⁻
Q             Q
R             R
Q(N)          Q(0)
K             −L̃
A + BK        (A − L̃C)′
x             ε

Regulator                 Estimator
R > 0, Q > 0              R > 0, Q > 0
(A, B) stabilizable       (A, C) detectable
(A, C) detectable         (A, G) stabilizable

Table A.1: Duality variables and stability conditions for linear quadratic regulation and linear estimation.
(b) Show that (d/dt)A−1 (t) = −A−1 (t)Ȧ(t)A−1 (t) if A : R → Rn×n , A(t) is invert-
ible for all t ∈ R, and Ȧ(t) := (d/dt)A(t).
(b) Prove statement 1 on positive definite matrices (from Section A.7). Where is this
fact needed?
(c) Prove statement 6 on positive definite matrices. Where is this fact needed?
(b)

∂(Ax a′Bx)/∂x′ = (a′Bx)A + Ax a′B
(c)

∂(a′Ab)/∂A = ab′
∂A
and that A−1 , B −1 and E −1 exist. In the final formula, the term
(E − DB −1 C)−1
appears, but we did not assume this matrix is invertible. Did we leave out an assump-
tion or can the existence of this matrix inverse be proven given the other assumptions?
If we left out an assumption, provide an example in which this matrix is not invertible.
If it follows from the other assumptions, prove this inverse exists.
(b)

rank(∫₀ᵗ e^{Aτ} dτ) = n   ∀t > 0
(b) The reachability Gramian W (t) is full rank for all t > 0 if and only if the system
is controllable.
(b) In linear discrete time systems, x1 cannot be taken as zero without changing the meaning of controllability. Why not? Which A require a distinction in discrete time? What are the eigenvalues of the corresponding A in continuous time?
and let y(t; x0 ) represent the solution to (A.42) as a function of time t given starting
state value x0 at time zero. Consider the output from two different initial conditions
y(t; w), y(t; z) on the time interval 0 ≤ t ≤ t1 with t1 > 0.
The system in (A.42) is observable if
In other words, if two output measurement trajectories agree, the initial conditions
that generated the output trajectories must agree, and hence, the initial condition is
unique. This uniqueness of the initial condition allows us to consider building a state
estimator to reconstruct x(0) from y(t; x0 ). After we have found the unique x(0),
solving the model provides the rest of the state trajectory x(t). We will see later that
this procedure is not the preferred way to build a state estimator; it simply shows that
if the system is observable, the goal of state estimation is reasonable.
Show that the system in (A.42) is observable if and only if

rank(O) = n

in which O is, again, the same observability matrix that was defined for discrete time systems in (1.36)

O = [ C
      CA
      ...
      CA^{n−1} ]
Hint: what happens if you differentiate y(t; w) − y(t; z) with respect to time? How
many times is this function differentiable?
The matrix Wo is known as the observability Gramian of the linear, time-invariant sys-
tem. Prove the following important properties of the observability Gramian.
(a) The observability Gramian Wo (t) is full rank for all t > 0 if and only if the system
is observable.
(b) Consider an observable linear time invariant system with u(t) = 0 so that y(t) =
CeAt x0 . Use the observability Gramian to solve this equation for x0 as a function
of y(t), 0 ≤ t ≤ t1 .
(c) Extend your result from the previous part to find x0 for an arbitrary u(t).
Suppose (A, C) is detectable and an input sequence has been found such that
u(k) → 0 and y(k) → 0 as k → ∞
with Q ≥ 0, R > 0, and (A, B) stabilizable. In Exercise A.31 we showed that if (A, Q1/2 )
is detectable and an input sequence has been found such that
u(k) → 0 and y(k) → 0 as k → ∞
then x(k) → 0.
(a) Show that if Q ≥ 0, then Q1/2 is a well defined, real, symmetric matrix and
Q1/2 ≥ 0.
Hint: apply Theorem A.1 to Q, using the subsequent fact 3.
(b) Show that (A, Q1/2 ) is detectable (observable) if and only if (A, Q) is detectable
(observable). So we can express one of the LQ existence, uniqueness, and stability
conditions using detectability of (A, Q) instead of (A, Q1/2 ).
(b) Next consider the random variable x to be defined as a scalar multiple of the
random variable a
x = αa
Show that
E(x) = αE(a)
(c) What can you conclude about E(x) if x is given by the linear combination

x = Σᵢ αᵢ vᵢ
η = max(ξ1 , ξ2 , . . . ξn )
(a) Show that the expectation of ξ is equal to the following integral of the probability
distribution (David, 1981, p. 38)
E(ξ) = −∫_{−∞}^{0} Pξ(x) dx + ∫_{0}^{∞} (1 − Pξ(x)) dx   (A.43)
A.18 Exercises 687
[Figure A.14: the distribution Pξ(x) and its inverse Pξ⁻¹(w); the hatched areas A1 and A2 satisfy E(ξ) = A2 − A1.]
(b) Show that the expectation of ξ is equal to the following integral of the inverse
probability distribution
E(ξ) = ∫₀¹ Pξ⁻¹(w) dw   (A.44)
These interpretations of mean are shown as the hatched areas in Figure A.14,
E(ξ) = A2 − A1 .
In other words, the max of the mean is an underbound for the mean of the max.
x(k + 1) = Ax(k)
y(k) = Cx(k)
with

A = [1 0 0; 0 1 0; 2 1 1],   C = [1 0 0; 0 1 0]
(a) What is the observability matrix for this system? What is its rank?
Now x(0) = 0 is clearly consistent with these data. Is this x(0) unique? If yes,
prove it. If no, characterize the set of all x(0) that are consistent with these
data.
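A quick numerical check of the observability matrix rank for this system, with A and C as reconstructed above (sketch):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [2.0, 1.0, 1.0]])
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# O = [C; CA; CA^2]
O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(A.shape[0])])
print(np.linalg.matrix_rank(O))   # prints 2 (< n = 3): not observable
```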
“So tell me, how is that matrix on the right ever going to have rank n with that big, fat
zero sitting there?” At this point, you start feeling a little dizzy.
What’s causing the contradiction here: the pole placement theorem, the Hautus
lemma, the statement about equivalence of observability in innovations form, some-
thing else? How do you respond to this student?
“Stay with your door! No, switch, switch!” Finally the contestant chooses again,
and then Monty shows them what is behind their chosen door.
Let’s analyze this contest to see how to maximize the chance of winning. Define
p(i, j, y), i, j, y = 1, 2, 3
to be the probability that you chose door i, the prize is behind door j and Monty showed
you door y (named after the data!) after your initial guess. Then you would want to
max_j p(j|i, y)
(b) You will need to specify a model of Monty’s behavior. Please state the one that
is appropriate to Let’s Make a Deal.
(c) For what other model of Monty's behavior is the answer that it doesn't matter if you switch doors? Why is this a poor model for the game show?
Show that this proposal satisfies the definition of a norm given in Section A.8.
If the α and β norms are chosen to be p-norms, is the γ norm also a p-norm? Show
why or why not.
Use the norm of the extended state defined in Exercise A.48 to show that
H. A. David. Order Statistics. John Wiley & Sons, Inc., New York, second edition,
1981.
G. Strang. Linear Algebra and its Applications. Academic Press, New York,
second edition, 1980.
B
Stability Theory
B.1 Introduction
In this appendix we consider stability properties of discrete time sys-
tems. A good general reference for stability theory of continuous time
systems is Khalil (2002). There are not many texts for stability theory
of discrete time systems; a useful reference is LaSalle (1986). Recently
stability theory for discrete time systems has received more attention
in the literature. In the notes below we draw on Jiang and Wang (2001,
2002); Kellett and Teel (2004a,b).
We consider systems of the form
x + = f (x, u)
¹A function f is locally bounded if, for every x, there exists a neighborhood N of x such that f(N) is a bounded set, i.e., if there exists an M > 0 such that |f(x)| ≤ M for all x ∈ N.
x ∗ = f (x ∗ )
(1/n)(γ(a1) + · · · + γ(an)) ≤ γ(a1 + · · · + an) ≤ γ(na1) + · · · + γ(nan)   (B.1)

2. Similarly, for β(·) ∈ KL, the following holds for all ai ∈ R≥0, i ∈ I1:n, and t ∈ R≥0

(1/n)(β(a1, t) + · · · + β(an, t)) ≤ β(a1 + · · · + an, t) ≤ β(na1, t) + β(na2, t) + · · · + β(nan, t)   (B.2)
²Some authors include continuity in the definition of a positive definite function. We used such a definition in the first edition of this text, for example. But in the second edition, we remove continuity and retain only the requirement of positivity in the definition of positive definite function.
³(α1 ∘ α2)(·) is the composition of the two functions α1(·) and α2(·) and is defined by (α1 ∘ α2)(s) := α1(α2(s)).
⁴Note, however, that the domain of α⁻¹(·) may be restricted from R≥0 to [0, a) for some a > 0.
4. Let vi ∈ R^{ni} for i ∈ I1:n, and v := (v1, . . . , vn) ∈ R^{Σ ni}. If αi(·) ∈ K(K∞) and βi(·) ∈ KL for i ∈ I1:n, then there exist α(·), ᾱ(·) ∈ K(K∞) and β(·), β̄(·) ∈ KL such that
See (Rawlings and Ji, 2012) for short proofs of (B.1) and (B.2), and (Allan,
Bates, Risbeck, and Rawlings, 2017, Proposition 23) for a short proof
of (B.3). The result (B.4) follows similarly to (B.3). Result (B.5) and (B.7)
follow from (B.1) and (B.3)–(B.4), and (B.6) and (B.8) follow from (B.5)
and (B.7), respectively. See also Exercises B.9 and B.10.
[Figure: local stability of the origin: every trajectory starting in the ball δB remains in the ball εB.]
x + = Ax + φ(x)
(c) locally attractive if there exists η > 0 such that |x|A < η implies |φ(i; x)|A → 0 as i → ∞.

(g) locally exponentially stable if there exist η > 0, c > 0, and γ ∈ (0, 1) such that |x|A < η implies |φ(i; x)|A ≤ c |x|A γ^i for all i ∈ I≥0.
with α(·) ∈ K∞ .
Note that Jiang and Wang (2002) prove this lemma under the as-
sumption that both f (·) and V (·) are continuous, but their proof re-
mains valid if both f (·) and V (·) are only locally bounded.
We next establish the Lyapunov stability theorem in which we add
the parenthetical (KL definition) purely for emphasis and to distinguish
this result from the previous classical result, but we discontinue this
emphasis after this theorem, and use exclusively the KL definition.
Theorem B.15 (Lyapunov function and global asymptotic stability (KL
definition)). Suppose that X is positive invariant and the set A ⊆ X is
closed and positive invariant for x + = f (x), and f (·) is locally bounded.
Suppose V (·) is a Lyapunov function for x + = f (x) and set A. Then A
is globally asymptotically stable (KL definition).
Proof. Due to Lemma B.14 we assume without loss of generality that
α3 ∈ K∞ . From (B.13) we have that
in which
σ1 (s) := s − α3 ◦ α2−1 (s)
We have that σ1 (·) is continuous on R≥0 , σ1 (0) = 0, and σ1 (s) < s for
s > 0. But σ1 (·) may not be increasing. We modify σ1 to achieve this
property in two steps. First define
in which the maximum exists for each s ∈ R≥0 because σ1 (·) is con-
tinuous. By its definition, σ2 (·) is nondecreasing, σ2 (0) = 0, and
0 ≤ σ2 (s) < s for s > 0, and we next show that σ2 (·) is continuous
on R≥0 . Assume that σ2 (·) is discontinuous at a point c ∈ R≥0 . Be-
cause it is a nondecreasing function, there is a positive jump in the
function σ2(·) at c (Bartle and Sherbert, 2000, p. 150). Define⁶
in which
i → ∞. Since α1−1 (·) also is a K function, we also have that for all s ≥ 0,
α1−1 ◦ σ i ◦ α2 (s) is nonincreasing with i. We have from the properties
of K functions that for all i ≥ 0, α1−1 ◦ σ i ◦ α2 (s) is a K function,
and can therefore conclude that β(·) is a KL function and the proof is
complete. ■
Proof. Since the set A is GAS we have that for each x ∈ Rn and i ∈ I≥0

|φ(i; x)|A ≤ β(|x|A, i)

in which β(·) ∈ KL. Using (B.16) then gives for each x ∈ Rn and i ∈ I≥0

θ1⁻¹(|φ(i; x)|A) ≤ θ2(|x|A) e⁻ⁱ

Define

V(x) = Σ_{i=0}^{∞} θ1⁻¹(|φ(i; x)|A)

so that

V(x) = Σ_{i=0}^{∞} θ1⁻¹(|φ(i; x)|A) ≤ θ2(|x|A) Σ_{i=0}^{∞} e⁻ⁱ = θ2(|x|A) · e/(e − 1)
= −θ1⁻¹(|φ(0; x)|A)
= −θ1⁻¹(|x|A)
a1|x|A^σ ≤ V(x) ≤ a2|x|A^σ
V(f(x)) − V(x) ≤ −a3|x|A^σ
x + = Ax
A′ SA − S = −Q
(b) For each Q ∈ Rn×n , there is a unique solution S of the discrete matrix
Lyapunov equation
A′ SA − S = −Q
Exercise B.1 asks you to establish the equivalence of (a) and (b).
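For reference, the discrete matrix Lyapunov equation can be solved numerically; a sketch using scipy.linalg.solve_discrete_lyapunov with a hypothetical stable A and Q = I:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.5, 0.2],
              [0.0, 0.8]])      # stable: eigenvalues inside unit circle
Q = np.eye(2)

# SciPy solves a x a' - x + q = 0; pass A.T to get A'SA - S = -Q.
S = solve_discrete_lyapunov(A.T, Q)

print(np.allclose(A.T @ S @ A - S, -Q))   # True
print(np.all(np.linalg.eigvals(S) > 0))   # True: S is positive definite
```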
with σ1 (s) := s − α3 ◦ α2−1 (s). Note that σ1 (·) may not be K∞ because
it may not be increasing. But given this result we can find, as in the
proof of Theorem B.15, σ (·) ∈ K∞ satisfying σ1 (s) < σ (s) < s for all
s ∈ R>0 such that V (φ(i + 1; x, i0 ), i + 1) ≤ σ (V (φ(i; x, i0 ), i)). We
then have that
x + = f (x + e) + w (B.18)
710 Stability Theory
Thus, for each ε > 0, there exists a δ > 0 such that each solution φ(·) of x+ = f(x + e) + w starting in a δ neighborhood of A remains in a (β(δ, 0) + ε) neighborhood of A, and each solution starting anywhere in Rn converges to an ε neighborhood of A. These properties are a necessary relaxation (because of the perturbations) of local stability and global attractivity.
B.4.2 Robustness
x + = f (x, w) (B.19)
where the disturbance w lies in the compact set W. This system may
equivalently be described by the difference inclusion
x + ∈ F (x) (B.20)
where the set F (x) := {f (x, w) | w ∈ W}. Let S(x) denote the set
of all solutions of (B.19) or (B.20) with initial state x. We require, in
the sequel, that the closed set A is positive invariant for (B.19) (or for
x + ∈ F (x)):
Definition B.28 (Positive invariance with disturbances). The closed set
A is positive invariant for x + = f (x, w), w ∈ W if x ∈ A implies
f (x, w) ∈ A for all w ∈ W; it is positive invariant for x + ∈ F (x) if
x ∈ A implies F (x) ⊆ A.
712 Stability Theory
Remark. In the MPC literature, but not necessarily elsewhere, the term robust positive invariant is often used in place of positive invariant to emphasize that positive invariance is maintained despite the presence of the disturbance w. However, since the uncertain system (x+ = f(x, w), w ∈ W, or x+ ∈ F(x)) is specified in the assertion that a closed set A is positive invariant, the word "robust"
assertion that a closed set A is positive invariant, the word “robust”
appears to be unnecessary. In addition, in the systems literature, the
closed set A is said to be robust positive invariant for x + ∈ F (x) if it
satisfies conditions similar to those of Definition B.26 with x + ∈ F (x)
replacing x + = f (x); see Teel (2004), Definition 3.
Inequality (B.23) ensures V(f(x, w)) − V(x) ≤ −α3(|x|A) for all w ∈ W. The existence of a Lyapunov function for the system x+ ∈ F(x) and
closed set A is a sufficient condition for A to be globally asymptotically
stable for x + ∈ F (x) as shown in the next result.
Proof. (i) Local stability: Let ε > 0 be arbitrary and let δ := α2⁻¹(α1(ε)). Suppose |x|A < δ so that, by (B.22), V(x) ≤ α2(δ) = α1(ε). Let φ(·) be any solution in S(x) so that φ(0) = x. From (B.23), (V(φ(i)))i∈I≥0 is a nonincreasing sequence so that, for all i ∈ I≥0, V(φ(i)) ≤ V(x). From (B.21), |φ(i)|A ≤ α1⁻¹(V(x)) ≤ α1⁻¹(α1(ε)) = ε for all i ∈ I≥0.
(ii) Global attractivity: Let x ∈ Rn be arbitrary. Let φ(·) be any solution in S(x) so that φ(0) = x. From (B.21) and (B.23), since φ(i + 1) ∈ F(φ(i)), the sequence (V(φ(i)))i∈I≥0 is nonincreasing and bounded from below by zero. Hence both V(φ(i)) and V(φ(i + 1)) converge to V̄ ≥ 0 as i → ∞. But φ(i + 1) ∈ F(φ(i)) so that, from (B.23), α3(|φ(i)|A) → 0 as i → ∞. Since |φ(i)|A = α3⁻¹(α3(|φ(i)|A)) where α3⁻¹(·) is a K∞ function, |φ(i)|A → 0 as i → ∞. ■
u∈U
x + ∈ F (x, u)
F (x, u) := {f (x, u, w) | w ∈ W}
Definition B.39 (CLF (constrained)). Suppose the set X and closed set
A, A ⊂ X, are control invariant for x + = f (x, u), u ∈ U. A function
V : X → R≥0 is said to be a control Lyapunov function in X for the
system x + = f (x, u), u ∈ U, and closed set A in X if there exist
functions αi ∈ K∞ , i = 1, 2, 3, defined on X, such that for any x ∈ X,
Suppose now that the state x is required to lie in the closed set
X ⊂ Rn . Again, in order to show that there exists a condition similar
to (B.24), we assume that there exists a control invariant set X ⊆ X for
x + = f (x, u, w), u ∈ U, w ∈ W. This enables us to obtain a control
law that keeps the state in X and, hence, in X, and, under suitable
conditions, to satisfy a variant of (B.24).
x + = f (x, w)
for all i ∈ I≥0 , where φ(i; x, wi ) is the solution, at time i, if the initial
state is x at time 0 and the input sequence is wi := (w(0), w(1), . . . , w(i − 1)).
The following result appears in Jiang and Wang (2001, Lemma 3.5)
for all x in Rn .
(b) There exist K∞ functions α(·) and σ(·) such that for all x ∈ Rn either

V(x+) ≤ V(x) − α(|x|) + σ(|y|)

or

V(x+) ≤ ρV(x) + σ(|y|)   (B.26)

with x+ = f(x), y = h(x), and ρ ∈ (0, 1).
for all x ∈ Rn .
(b) There exist K∞ functions α(·), σ1(·), and σ2(·) such that for every x and u either

V(x+) ≤ V(x) − α(|x|) + σ1(|u|) + σ2(|y|)

or

V(x+) ≤ ρV(x) + σ1(|u|) + σ2(|y|)

with x+ = f(x, u), y = h(x), and ρ ∈ (0, 1).
The following result proves useful when establishing that MPC em-
ploying cost functions based on the inputs and outputs rather than
inputs and states is stabilizing for IOSS systems. Consider the system
x+ = f(x, u), y = h(x) with stage cost ℓ(y, u) and constraints (x, u) ∈ Z. The stage cost satisfies ℓ(0, 0) = 0 and ℓ(y, u) ≥ α(|(y, u)|) for all (y, u) ∈ Rp × Rm with α a K∞ function. Let X := {x | ∃u with (x, u) ∈ Z}.
with α1 , α2 , α3 ∈ K∞ and σ ∈ K.
|x(k; z1, u1) − x(k; z2, u2)| ≤ max{β(|z1 − z2|, k), γ1(∥u1 − u2∥_{0:k−1}), γ2(∥y_{z1,u1} − y_{z2,u2}∥_{0:k})}
B.10 Observability
Definition B.56 (Observability). The system (B.27) is (uniformly) observ-
able if there exists a positive integer N and an α(·) ∈ K such that
Σ_{j=0}^{k−1} |h(x(j; x, u)) − h(x(j; z, u))| ≥ α(|x − z|)   (B.28)
|x(k; x, u) − z(k; z, u, w)| ≤ c^k |x − z| + Σ_{i=0}^{k−1} c^{k−i−1} |w(i)|
Proof. Let x(k) and z(k) denote x(k; x, u) and z(k; z, u, w), respec-
tively, in the sequel. Since (B.27) is observable, there exists an integer
where we have used the fact that |a + b| ≥ |a| − |b|. By the assumption
of observability
Σ_{j=k}^{k+N} |h(x(j; x(k), u)) − h(x(j; z(k), u))| ≥ α(|x(k) − z(k)|)
for all k. From Lemma B.58 and the Lipschitz assumption on h(·)
B.11 Exercises
in which a1 , a2 , a3 , σ > 0. Show that the origin of the system x + = f (x) is globally
exponentially stable.
x + = f (x)
in which a1 , a2 , a3 , σ > 0.
Hint: Consider summing |φ(i; x)|^σ over i as a candidate Lyapunov function V(x).
(b) Establish that in the Lyapunov function defined above, any σ > 0 is valid, and
also that the constant a3 can be chosen as large as one wishes.
with r := R/c.
Show that there exists a Lipschitz continuous Lyapunov function V (·) satisfying for
all x ∈ Br
with a1 , a2 , a3 > 0.
Hint: Use the proposed Lyapunov function of Exercise B.3 with σ = 2. See also
(Khalil, 2002, Exercise 4.68).
(b) Show by direct calculation that the origin is not globally asymptotically stable. Show that for initial conditions x0 ∈ (1, ∞), x(k; x0) → 1 as k → ∞.
The conclusion here is that one cannot leave out continuity of α3 in the definition of a
Lyapunov function when allowing discontinuous system dynamics.
Is this system GAS under the classical definition? Is this system GAS under the KL
definition? Discuss why or why not.
Exercise B.10
Derive KL bounds (B.6) and (B.8) from (B.5) and (B.7), respectively.
Bibliography
Z.-P. Jiang and Y. Wang. A converse Lyapunov theorem for discrete-time sys-
tems with disturbances. Sys. Cont. Let., 45:49–58, 2002.
R. E. Kalman and J. E. Bertram. Control system analysis and design via the
“Second method” of Lyapunov, Part II: Discrete–time systems. ASME J. Basic
Engr., pages 394–400, June 1960.
C
Optimization
[Figure C.1: routing network for the dynamic programming example; nodes are (state, time) pairs and arc labels are stage costs.]

t              0        1    1    2    2    2
state          d        e    c    f    d    b
control        U or D   D    D    U    U    D
optimal cost   16       16   8    4    8    4

Table C.2: optimal cost and control for each node (x, t).
the optimal control and cost for node (e, 1) are, respectively, D and 16.
The procedure is repeated for the remaining state d at t = 1 (node (d,
1)). A similar calculation for the state d at t = 0 (node (d, 0)), where
the optimal control is U or D, completes this backward recursion; this
backward recursion provides the optimal cost and control for each (x,
t), as recorded in Table C.2. The procedure therefore yields an optimal
feedback control that is a function of (x, t) ∈ S. To obtain the optimal
open-loop control for the initial node (d, 0), the feedback law is obeyed,
leading to control U or D at t = 0; if U is chosen, the resultant state at
t = 1 is e. From Table C.2, the optimal control at (e, 1) is D, so that the
successor node is (d, 2). The optimal control at node (d, 2) is U . Thus
the optimal open-loop control sequence (U , D, U ) is re-obtained. On
the other hand, if the decision at (d, 0) is chosen to be D, the optimal
sequence (D, D, D) is obtained. This simple example illustrates the
main features of DP that we will now examine in the context of discrete
time optimal control.
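The backward recursion is only a few lines of code. The sketch below uses edge data chosen to be consistent with Table C.2 and the surrounding prose (the original Figure C.1 costs are only partially legible here), so treat the numbers as illustrative.

```python
# Backward dynamic programming on a small routing graph.
# edges[(state, t)] = list of (control, next_state, stage_cost);
# the costs are a hypothetical reconstruction consistent with Table C.2.
N = 3
edges = {
    ('d', 0): [('U', 'e', 0), ('D', 'c', 8)],
    ('e', 1): [('U', 'f', 16), ('D', 'd', 8)],
    ('c', 1): [('U', 'd', 8), ('D', 'b', 4)],
    ('f', 2): [('U', 'g', 4)],
    ('d', 2): [('U', 'g', 8)],
    ('b', 2): [('D', 'g', 4)],
}
V = {('g', N): 0}                       # terminal node has zero cost
policy = {}
for t in reversed(range(N)):            # backward recursion
    for (x, tt), options in edges.items():
        if tt != t:
            continue
        u, xn, c = min(options, key=lambda o: o[2] + V[(o[1], t + 1)])
        V[(x, t)] = c + V[(xn, t + 1)]
        policy[(x, t)] = u

print(V[('d', 0)], policy[('d', 0)])    # 16 'U' ('D' ties at cost 16)
```

Running the recursion reproduces the optimal costs of Table C.2, including the tie at node (d, 0) that yields the two optimal sequences (U, D, U) and (D, D, D).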
x + = f (x, u) (C.1)
V(x, 0, u) = Vf(x(N)) + Σ_{i=0}^{N−1} ℓ(x(i), u(i))   (C.2)
where ℓ(·) and Vf (·) are continuous and, for each i, x(i) = φ(i; (x,
0), u) is the solution at time i of (C.1) if the initial state is x at time 0
and the control sequence is u. The optimal control problem P(x, 0) is
defined by
V⁰(x, 0) = min_u V(x, 0, u)   (C.3)
and x(i) := φ(i; (x, 0), u). Thus U(x, 0) is the set of admissible control sequences¹ if the initial state is x at time 0. It follows from the continuity of f(·) that for all i ∈ {0, 1, . . . , N − 1} and all x ∈ Rn, u ↦ φ(i; (x, 0), u) is continuous, u ↦ V(x, 0, u) is continuous, and U(x, 0) is compact. Hence the minimum in (C.4) exists at all x ∈ {x ∈ Rn | U(x, 0) ≠ ∅}.
DP embeds problem P(x, 0) for a given state x in a whole family of
problems P (x, i) where, for each (x, i), problem P(x, i) is defined by
where
ui := (u(i), u(i + 1), . . . , u(N − 1))
1 An admissible control sequence satisfies all constraints.
C.1 Dynamic Programming 733
V(x, i, ui) := Vf(x(N)) + Σ_{j=i}^{N−1} ℓ(x(j), u(j))   (C.5)
and
In (C.5) and (C.6), x(j) = φ(j; (x, i), ui ), the solution at time j of (C.1)
if the initial state is x at time i and the control sequence is ui . For each
i, X(i) denotes the domain of V 0 (·, i) and U(·, i) so that
One way to approach DP for discrete time control problems is the sim-
ple observation that for all (x, i)
V⁰(x, N) = Vf(x) for all x ∈ Rn
We now prove some basic facts; the first is the well known principle
of optimality.
Theorem C.2 (Optimal value function and control law from DP). Sup-
pose that the function Ψ : Rn × {0, 1, . . . , N} → R, satisfies, for all i ∈ {1,
2, . . . , N − 1}, all x ∈ X(i), the DP recursion
Then Ψ (x, i) = V 0 (x, i) for all (x, i) ∈ X(i) × {0, 1, 2, . . . , N}; the DP
recursion yields the optimal value function and the optimal control law.
C.1 Dynamic Programming 735
Since u(j) satisfies (x(j), u(j)) ∈ Z and f (x(j), u(j)) ∈ X(j + 1) but
is not necessarily a minimizer in (C.11), we deduce that
for all u ∈ U(x, i). Hence Ψ (x, i) = V 0 (x, i) for all (x, i) ∈ X(i) × {0,
1, . . . , N}. ■
x + = Ax + Bu
V 0 (x, N) = 0 ∀x ∈ Rn
f⁰ = inf_u {f(u) | u ∈ U}
[Figure C.2: the gradient ∇f(ū), the set U, and the approximation ū ⊕ TU(ū) at the point ū = (ū1, ū2).]

[Figure C.3: regular normal cones N̂U(u), N̂U(v) and tangent cones TU(u), TU(v) at points u and v of U.]

The set U can be approximated, near a point ū, by ū ⊕ TU(ū) where its tangent cone TU(ū) is
defined below. Following Rockafellar and Wets (1998), we use uν →U v to denote that the sequence {uν | ν ∈ I≥0} converges to v as ν → ∞ while satisfying uν ∈ U for all ν ∈ I≥0.
[uν − ū]/λν → h
Proposition C.5 (Tangent vectors are closed cone). The set TU (u) of all
tangent vectors to U at any point u ∈ U is a closed cone.
where o(·) has the property that o(|u − ū|)/|u − ū| → 0 as u →U ū with u ≠ ū; N̂U(u) is the set of all regular normal vectors.
Proof.
(a) To prove N̂U(ū) ⊂ TU(ū)∗, we take an arbitrary point g in N̂U(ū) and show that ⟨g, h⟩ ≤ 0 for all h ∈ TU(ū), implying that g ∈ TU∗(ū). For, if h is tangent to U at ū, there exist, by definition, sequences uν →U ū and λν ↘ 0 such that

hν := (uν − ū)/λν → h

Since g ∈ N̂U(ū), it follows from (C.15) that ⟨g, hν⟩ ≤ o(|uν − ū|) = o(λν|hν|); the limit as ν → ∞ yields ⟨g, h⟩ ≤ 0, so that g ∈ TU∗(ū).
Hence N̂U (ū) ⊂ TU (ū)∗ . The proof of this result, and the more subtle
proof of the converse, that TU (ū)∗ ⊂ N̂U (ū), are given in Rockafellar
and Wets (1998), Proposition 6.5.
Remark. A consequence of (C.16) is that for each g ∈ N̂U (ū), the half-
space Hg := {u | ⟨g, u − ū⟩ ≤ 0} supports the convex set U at ū, i.e.,
U ⊂ Hg and ū lies on the boundary of the half-space Hg .
f⁰ := inf_u {f(u) | u ∈ U}
There may not exist a u ∈ U such that f (u) = f 0 . If, however, f (·) is
continuous and U is compact, there exists a minimizing u in U , i.e.,
or, equivalently
−∇f (u) ∈ N̂U (u) (C.18)
there exists a λ ∈ (0, 1] such that f (vλ ) − f (u) ≤ −λδ/2 < 0 which
contradicts the optimality of u. Hence the condition in (C.17) must be
satisfied. That (C.17) is equivalent to (C.18) follows from Proposition
C.7(b). ■
or, equivalently
−∇f (u) ∈ N̂U (u) = TU∗ (u).
Proof. It follows from Proposition C.9 that u is optimal for problem P
if and only if u ∈ U and −∇f (u) ∈ N̂U (u). But, by Proposition C.7,
N̂U (u) = {g | ⟨g, h⟩ ≤ 0 ∀h ∈ TU (u)} so that −∇f (u) ∈ N̂U (u) is
equivalent to ⟨∇f (u), h⟩ ≥ 0 for all h ∈ TU (u). ■
The definitions of tangent and normal cones given above may appear
complex but this complexity is necessary for proper treatment of the
general case when U is not necessarily convex. When U is polyhedral,
i.e., when U is defined by a set of linear inequalities
U := {u ∈ Rm | Au ≤ b}
The next result follows from Proposition C.5 and Proposition C.7.
where, for each i, gi (u) := ⟨ai , u⟩−bi so that gi (u) ≤ 0 is the constraint
⟨ai , u⟩ ≤ bi and ∇gi (u) = ai .
U := {u | ⟨ai, u⟩ ≤ bi, i ∈ I, ⟨ci, u⟩ = di, i ∈ E}
[Figure: a polyhedral set U with vertex u at which the constraints with normals a1 and a2 are active; the cone F(u) of feasible directions, its polar F∗(u), and −∇f(u) ∈ F∗(u).]
For each i ∈ I let gi(u) := ⟨ai, u⟩ − bi and, for each i ∈ E, let hi(u) := ⟨ci, u⟩ − di so that ∇gi(u) = ai and ∇hi(u) = ci. It follows from the characterization of N̂U(u) that u is optimal for min_u {f(u) | u ∈ U} if and only if there exist multipliers λi ≥ 0, i ∈ I⁰(u), and µi ∈ ℝ, i ∈ E, such that

∇f(u) + ∑_{i∈I⁰(u)} λi ∇gi(u) + ∑_{i∈E} µi ∇hi(u) = 0 (C.21)
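Condition (C.21) can be checked directly on small problems: once the active set is identified, the multipliers solve a linear system, and optimality requires the inequality multipliers to be nonnegative. A minimal sketch (the QP data are our own, not from the text):

import numpy as np

# Our example: minimize (1/2)|u - c|^2  s.t.  a1'u <= b1, a2'u <= b2
c = np.array([2.0, 2.0])
A = np.array([[1.0, 0.0], [0.0, 1.0]])     # rows a1', a2'
b = np.array([1.0, 1.0])

u = np.array([1.0, 1.0])                   # candidate (both constraints active)
g = u - c                                  # gradient of f at u
act = np.where(np.abs(A @ u - b) < 1e-9)[0]

# Solve grad f(u) + sum_i lam_i a_i = 0 for the active multipliers
lam, *_ = np.linalg.lstsq(A[act].T, -g, rcond=None)
residual = g + A[act].T @ lam
print(lam, residual)       # lam = [1, 1] >= 0 and residual ~ 0, so u is optimal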
Hence

[f(uν) − f(u)]/λν = ⟨∇f(u), hν⟩ + o(λν)/λν

where we make use of the fact that |hν| is bounded for ν sufficiently large. It follows that

[f(uν) − f(u)]/λν → ⟨∇f(u), h⟩ as ν → ∞

so that, if ⟨∇f(u), h⟩ ≤ −δ < 0 for some h ∈ TU(u) and some δ > 0, there exists a finite integer j such that f(uj) − f(u) ≤ −λjδ/2 < 0, which contradicts the local optimality of u. Hence ⟨∇f(u), h⟩ ≥ 0 for all h ∈ TU(u). That −∇f(u) ∈ N̂U(u) follows from Proposition C.7. ■
The material in this section is not required for Chapters 1-7; it is presented merely to show that alternative definitions of tangent and normal cones are useful in more complex situations than those considered above.

[Figure C.6(a): a nonconvex set U with a point u on its boundary; the cones TU(u), T̂U(u), and NU(u) at u, and regular normals g ∈ N̂U(uν) at neighboring points uν.]
Thus, the normal and tangent cones defined in C.2.1 have some limitations when U is not convex or, at least, not similar to the constraint set illustrated in Figure C.4. Figure C.6 illustrates the type of difficulty that may occur. Here the tangent cone TU(u) is not convex, as shown in Figure C.6(b), so that the associated normal cone N̂U(u) = TU(u)∗ may be too small to characterize optimality.
[Figure C.7: a nonconvex set U and a point u, with the cones TU(u) and T̂U(u) and the normals g1 and g2 to the level set of the cost f(·) at u.]
Figure C.7 illustrates some of these results. In Figure C.7, the con-
stant cost contour {v | f (v) = f (u)} of a nondifferentiable cost func-
tion f (·) is shown together with a sublevel set D passing through the
point u: f (v) ≤ f (u) for all v ∈ D. For this example, df (u; h) =
max{⟨g1 , h⟩, ⟨g2 , h⟩} where g1 and g2 are normals to the level set of
f (·) at u so that df (u; h) ≥ 0 for all h ∈ T̂U (u), a necessary condi-
tion of optimality; on the other hand, there exist h ∈ TU (u) such that
df (u; h) < 0. The situation is simpler if the constraint set U is regular
at u.
We now consider the case when the set U is specified by a set of differ-
entiable inequalities:
U := {u | gi (u) ≤ 0 ∀i ∈ I} (C.22)
where I⁰(u) := {i ∈ I | gi(u) = 0} is the index set of active constraints. For each u ∈ U, the set FU(u) of feasible variations for the linearized set of inequalities is defined by

FU(u) := {h | ⟨∇gi(u), h⟩ ≤ 0 ∀i ∈ I⁰(u)} (C.23)
The set FU (u) is a closed, convex cone and is called a cone of first order
feasible variations in Bertsekas (1999) because h is a descent direction
for gi (u) for all i ∈ I 0 (u), i.e., gi (u + λh) ≤ 0 for all λ sufficiently
small. When U is polyhedral, the case discussed in C.2.3, gi (u) = ⟨ai ,
u⟩ − bi and ∇gi (u) = ai so that FU (u) = {h | ⟨ai , h⟩ ≤ 0 ∀i ∈ I 0 (u)}
which was shown in Proposition C.11 to be the tangent cone TU (u).
An important question is whether FU(u) is the tangent cone TU(u) for a wider class of problems because, if FU(u) = TU(u), a condition of optimality of the form in (C.20) may be obtained. In the example in Figure C.8, FU(u) is the horizontal axis {h ∈ ℝ² | h2 = 0} whereas TU(u) is the half-line {h ∈ ℝ² | h1 ≥ 0, h2 = 0} so that, in this case, FU(u) ≠ TU(u). While FU(u) is always convex, being the intersection of a set of half-spaces, the tangent cone TU(u) is not necessarily convex, as Figure C.6(b) shows. The set U is said to be quasiregular at u ∈ U if FU(u) = TU(u), in which case u is said to be a quasiregular point (Bertsekas, 1999). The next result, due to Bertsekas (1999), shows that FU(u) = TU(u), i.e., U is quasiregular at u, when a certain constraint qualification is satisfied.
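The constraint qualification invoked here — the existence of an h̄ with ⟨∇gi(u), h̄⟩ < 0 for every active i, as used in the proof that follows — can be tested with a small linear program that maximizes a margin t subject to ⟨∇gi(u), h⟩ ≤ −t and a bound on h. A sketch using scipy (the active constraint gradients are illustrative, not from the text):

import numpy as np
from scipy.optimize import linprog

# Illustrative active-constraint gradients at u (our data)
G = np.array([[1.0, 0.0],
              [0.0, 1.0]])                 # rows: grad g_i(u), i in I0(u)

m, n = G.shape
# Variables z = (h, t): maximize t  s.t.  G h + t <= 0, -1 <= h <= 1, 0 <= t <= 1
A_ub = np.hstack([G, np.ones((m, 1))])
res = linprog(c=np.r_[np.zeros(n), -1.0],  # minimize -t
              A_ub=A_ub, b_ub=np.zeros(m),
              bounds=[(-1, 1)] * n + [(0, 1)])
print("CQ holds" if res.x[-1] > 1e-9 else "CQ fails", res.x)

If the optimal margin t is positive, the recovered h satisfies ⟨∇gi(u), h⟩ < 0 for every active i and the qualification holds; if t = 0 (as happens, e.g., when two active gradients are opposite), it fails.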
[Figure: active constraint gradients ∇g1(u) and ∇g2(u) at u and the associated cone FU(u) = TU(u).]
Proof. It follows from the definition (C.23) of FU(u) and the constraint qualification (C.24) that there exists an h̄ ∈ FU(u) such that ⟨∇gi(u), h̄⟩ < 0 for all i ∈ I⁰(u). Hence, for all h ∈ FU(u), all α ∈ (0, 1], the vector hα := h + α(h̄ − h) lies in FU(u) and satisfies ⟨∇gi(u), hα⟩ < 0 for all i ∈ I⁰(u).
Assuming for the moment that hα ∈ TU (u) for all α ∈ (0, 1], it follows,
since hα → h as α → 0 and TU (u) is closed, that h ∈ TU (u), thus
proving FU (u) ⊂ TU (u). It remains to show that hα is tangent to U
at u. Consider the sequences hν and λν ↘ 0 where hν := hα for all
ν ∈ I≥0 . There exists a δ > 0 such that ⟨∇gi (u), hα ⟩ ≤ −δ for all
i ∈ I⁰(u) and gi(u) ≤ −δ for all i ∈ I \ I⁰(u). Since

gi(u + λνhν) = gi(u) + λν⟨∇gi(u), hν⟩ + o(λν)

for all i ∈ I⁰(u), it follows that there exists a finite integer N such that gi(u + λνhν) ≤ 0 for all i ∈ I, all ν ≥ N. Since the sequences {hν} and {λν} for all ν ≥ N satisfy hν → hα, λν ↘ 0, and u + λνhν ∈ U, it follows that hα ∈ TU(u), thus completing the proof that FU(u) ⊂ TU(u).
Suppose now that h ∈ TU(u). There exist sequences hν → h and λν ↘ 0 such that u + λνhν ∈ U so that g(u + λνhν) ≤ 0 for all ν ∈ I≥0.
Proof. It follows from Proposition C.14 that −∇f (u) ∈ N̂U (u) and
from Proposition C.7 that N̂U (u) = TU∗ (u). But, by hypothesis,
TU (u) = FU (u) so that N̂U (u) = FU∗ (u), the polar cone of FU (u).
It follows from (C.23) and the definition of a polar cone, given in Appendix A1, that FU∗(u) = cone{∇gi(u) | i ∈ I⁰(u)}. Hence

−∇f(u) ∈ cone{∇gi(u) | i ∈ I⁰(u)}
The existence of multipliers µi satisfying (C.25) follows from the defi-
nition of a cone generated by {∇gi (u) | i ∈ I 0 (u)}. ■
U := {u | gi(u) ≤ 0 ∀i ∈ I, hi(u) = 0 ∀i ∈ E}, I⁰(u) := {i ∈ I | gi(u) = 0}

µ0∇f(u) + ∑_{i∈I} µi∇gi(u) + ∑_{i∈E} λi∇hi(u) = 0

and

µi gi(u) = 0 ∀i ∈ I

where µ0 ≥ 0 and µi ≥ 0 for all i ∈ I⁰.
We return to this point later. Perhaps the simplest method for proving Proposition C.22 is the penalty approach adopted by Bertsekas (1999), Proposition 3.3.5. We merely give an outline of the proof. The constrained problem of minimizing f(v) over U is approximated, for each k ∈ I≥0, by the penalized problem Pk of minimizing a function Fᵏ(v) over the ball

S := {v | |v − u| ≤ ε}

where ε > 0 is such that f(u) ≤ f(v) for all v in S ∩ U. Let vᵏ denote the solution of Pk. Bertsekas shows that vᵏ → u as k → ∞ so that, for all k sufficiently large, vᵏ lies in the interior of S and is, therefore, the unconstrained minimizer of Fᵏ(v). Hence, for each k sufficiently large, vᵏ satisfies ∇Fᵏ(vᵏ) = 0, or
∇f(vᵏ) + ∑_{i∈I} µ̄_i^k ∇gi(vᵏ) + ∑_{i∈E} λ̄_i^k ∇hi(vᵏ) = 0 (C.27)

where

µ̄_i^k := k g_i^+(vᵏ), λ̄_i^k := k h_i(vᵏ)
Let µ̄ᵏ denote the vector with elements µ̄_i^k, i ∈ I, and λ̄ᵏ the vector with elements λ̄_i^k, i ∈ E. Dividing (C.27) by δ_k defined by

δ_k := [1 + |µ̄ᵏ|² + |λ̄ᵏ|²]^{1/2}

yields

µ_0^k ∇f(vᵏ) + ∑_{i∈I} µ_i^k ∇gi(vᵏ) + ∑_{i∈E} λ_i^k ∇hi(vᵏ) = 0

where

µ_0^k := 1/δ_k, µ_i^k := µ̄_i^k/δ_k, λ_i^k := λ̄_i^k/δ_k

and

(µ_0^k)² + |µᵏ|² + |λᵏ|² = 1
Because of the last equation, the sequence (µ_0^k, µᵏ, λᵏ) lies in a compact set, and therefore has a subsequence, indexed by K ⊂ I≥0, converging to some limit (µ0, µ, λ), where µ and λ are vectors whose elements are µi, i ∈ I, and λi, i ∈ E, respectively.
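The penalty argument can be imitated numerically: for increasing k, minimize the penalized cost and observe vᵏ → u while the normalized multiplier estimates converge. A rough sketch (our toy equality-constrained problem; scipy's general-purpose minimizer stands in for the exact unconstrained minimization):

import numpy as np
from scipy.optimize import minimize

# Toy problem (ours): minimize f(v) = v1 + v2  s.t.  h(v) = v1^2 + v2^2 - 2 = 0;
# the constrained minimizer is u = (-1, -1) with multiplier 1/2.
f = lambda v: v[0] + v[1]
h = lambda v: v[0]**2 + v[1]**2 - 2.0
u = np.array([-1.0, -1.0])

v = u.copy()
for k in [1e1, 1e2, 1e3, 1e4]:
    # Penalized cost F_k(v) = f(v) + (k/2) h(v)^2 + (1/2)|v - u|^2
    Fk = lambda v, k=k: f(v) + 0.5*k*h(v)**2 + 0.5*np.sum((v - u)**2)
    v = minimize(Fk, v).x                  # unconstrained minimization of F_k
    lam_bar = k*h(v)                       # multiplier estimate, as in (C.27)
    delta = np.sqrt(1.0 + lam_bar**2)
    print(k, v, lam_bar/delta)             # v -> u; normalized multiplier converges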
C.3 Set-Valued Functions and Continuity of Value Function
[Figure C.9: the graph Z of a set-valued function U(·); U(x1) is the slice {u | (x1, u) ∈ Z} and X is the projection of Z onto the x-axis.]
To study such problems, we need to know how smoothly these set-valued functions vary with the parameter x. In particular, we are interested in the continuity properties of the value function x ↦ f⁰(x) = inf_u {f(x, u) | u ∈ U(x)} since, in optimal control problems, we employ the value function as a Lyapunov function and robustness depends, as we have discussed earlier, on the continuity of the Lyapunov function. Continuity of the value function depends, in turn, on continuity of the set-valued constraint set U(·). We use the notation U : ℝⁿ ⇉ ℝᵐ to denote the fact that U(·) maps points in ℝⁿ into subsets of ℝᵐ.
The graph of a set-valued function is often a useful tool. The graph of U : ℝⁿ ⇉ ℝᵐ is defined to be the set Z := {(x, u) ∈ ℝⁿ × ℝᵐ | u ∈ U(x)}; the domain of the set-valued function U is the set X := {x ∈ ℝⁿ | U(x) ≠ ∅} = {x ∈ ℝⁿ | ∃u ∈ ℝᵐ such that (x, u) ∈ Z}; clearly X ⊂ ℝⁿ. Also X is the projection of the set Z ⊂ ℝⁿ × ℝᵐ onto ℝⁿ, i.e.,
(x, u) ∈ Z implies x ∈ X. An example is shown in Figure C.9. In this
example, U(x) varies continuously with x. Examples in which U (·)
is discontinuous are shown in Figure C.10. In Figure C.10(a), the set
U(x) varies continuously if x increases from its initial value of x1 , but
jumps to a much larger set if x decreases an infinitesimal amount (from
its initial value of x1 ); this is an example of a set-valued function that
is inner semicontinuous at x1. In Figure C.10(b), the set U(x) varies continuously if x decreases from its initial value of x1, but jumps to a much smaller set if x increases an infinitesimal amount (from its initial value of x1); this is an example of a set-valued function that is outer semicontinuous at x1.
[Figure C.10: set-valued functions U(·) discontinuous at x1: (a) inner but not outer semicontinuous, (b) outer but not inner semicontinuous; the compact sets S1, S2, S3 illustrate the definitions.]

[Figure: illustration of the definitions of outer and inner semicontinuity at x: a compact set S, the neighborhood x ⊕ δB, nearby points x′, and the sets U(x) and U(x′).]
The set-valued function U(·) is said to be outer semicontinuous at x if U(x) is closed and if, for every compact set S such that U(x) ∩ S = ∅, there exists a δ > 0 such that U(x′) ∩ S = ∅ for all x′ ∈ x ⊕ δB. The set-valued function U : ℝⁿ ⇉ ℝᵐ is outer semicontinuous if it is outer semicontinuous at each x ∈ ℝⁿ.
If V(·, u) satisfies V(x′, u) − V(x′′, u) ≤ LS|x′ − x′′| for all x′, x′′ in S, all u, then, with u′ and u′′ optimal for x′ and x′′ respectively,

V⁰(x′) − V⁰(x′′) ≤ V(x′, u′′) − V(x′′, u′′) ≤ LS|x′ − x′′|

V⁰(x′′) − V⁰(x′) ≤ V(x′′, u′) − V(x′, u′) ≤ LS|x′′ − x′|

so that |V⁰(x′) − V⁰(x′′)| ≤ LS|x′ − x′′|; the value function inherits the Lipschitz constant.
[Figure: the graph of a convex function f(·) with the point (u, f(u)) and the normal (g, −1) to its epigraph, g a subgradient of f(·) at u.]
Suppose, for example, that U = (−∞, b] is the sublevel set {u ∈ ℝ | f(u) ≤ 0} of a convex, differentiable function f(·), so that f(u) ≤ 0 for all u ∈ U, and that there exists a δ > 0 such that ∇f(u) > δ for all u ∈ ℝ. Let u > b and let v = b be the closest point in U to u. Then f(u) ≥ f(v) + ⟨∇f(v), u − v⟩ ≥ δ|v − u| so that d(u, U) ≤ f(u)/δ. The theorem of Clarke et al. (1998) extends this result to the case when f(·) is not necessarily differentiable, but requires the concept of a subgradient of a convex function.
the variable u. Then, for each x ∈ X, d(u, U (x)) ≤ ψ(x, u)/δ for all
u ∈ Rm .
(The authors wish to thank Richard Vinter and Francis Clarke for providing this result.)
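A quick numerical check of the scalar bound d(u, U) ≤ f(u)/δ discussed above (our numbers; f is linear so that ∇f(u) > δ holds everywhere):

# Our scalar illustration: f(u) = 2(u - 1), so U = {u | f(u) <= 0} = (-inf, 1]
# and grad f(u) = 2 > delta := 1 for all u.
f = lambda u: 2.0*(u - 1.0)
delta = 1.0
for u in [1.5, 2.0, 10.0]:
    d = u - 1.0                    # distance from u to U
    assert d <= f(u)/delta         # the bound d(u, U) <= f(u)/delta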
Let I⁰(x, u) denote the active constraint set (the set of those constraints at which the maximum is achieved). Now take any g ∈ ∂_u f(x, u) = co{m_j | j ∈ I⁰(x, u)} (co denotes "convex hull"). There exists a λ ∈ Λ_{I⁰(x,u)} such that g = ∑_{j∈I⁰(x,u)} λ_j m_j.
But then |g| > δ by the inequality above. This proves the claim and,
hence, completes the proof of the Corollary. ■
U(x) := {u ∈ Rm | (x, u) ∈ Z}
C.4 Exercises
We say that the player who "goes" second has the advantage, meaning that the inner problem is optimized after the outer problem has selected a value for its variable. One could instead say that, since the inner optimization is solved first, this player "goes" first.
sup_{x∈X} inf_{y∈Y} sup_{z∈Z} V(x, y, z)    inf_{y∈Y} sup_{z∈Z} sup_{x∈X} V(x, y, z)    sup_{z∈Z} sup_{x∈X} inf_{y∈Y} V(x, y, z)
Switching the order of optimization gives the maxmin version of this problem

sup_{y∈Y} inf_{x∈X} V(x, y) (C.31)

Problem (C.31) is known as the dual of the original problem (C.30), and the original problem (C.30) is then denoted as the primal problem in this context (Nocedal and Wright, 2006, pp. 343-345), (Boyd and Vandenberghe, 2004, p. 223).

(a) Show that the solution to the dual problem is a lower bound for the solution to the primal problem, i.e.,

sup_{y∈Y} inf_{x∈X} V(x, y) ≤ inf_{x∈X} sup_{y∈Y} V(x, y) (C.32)
This property is known as weak duality (Nocedal and Wright, 2006, p. 345),
(Boyd and Vandenberghe, 2004, p. 225).
(b) The difference between the dual and the primal solutions is known as the duality
gap. Strong duality is defined as the property that equality is achieved in (C.32)
and the duality gap is zero (Boyd and Vandenberghe, 2004, p. 225).
Make a contour plot of V (·) on X × Y and answer the following question. Which of the
following two minmax problems has a nonzero duality gap?
Notice that the two problems are different because the first one minimizes over y and
maximizes over x, and the second one does the reverse.
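Since V(·) itself is not reproduced here, the sketch below (ours) uses the hypothetical cost V(x, y) = (x − y)² on X = Y = [−1, 1], for which exhaustive search on a grid exhibits a nonzero duality gap: the min over x of the max over y is 1, while the max over y of the min over x is 0, consistent with weak duality.

import numpy as np

# Hypothetical cost (the exercise's V is not reproduced here)
V = lambda x, y: (x - y)**2
xs = np.linspace(-1.0, 1.0, 201)
ys = np.linspace(-1.0, 1.0, 201)
M = V(xs[:, None], ys[None, :])            # M[i, j] = V(xs[i], ys[j])

minmax = M.max(axis=1).min()               # min over x of max over y
maxmin = M.min(axis=0).max()               # max over y of min over x
print(minmax, maxmin)                      # 1.0 and 0.0: a nonzero duality gap

A contour plot of M (e.g., with matplotlib's contour) makes the absence of a saddle point visible: V is convex in x and convex, rather than concave, in y.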
Exercise C.8: The Heaviside function and inner and outer semicontinuity
Consider the (set-valued) function

H(x) = 0, x < 0;  H(x) = 1, x > 0

with the set H(0) to be chosen.

(a) Characterize the choices of set H(0) that make H outer semicontinuous. Justify your answer.
(b) Characterize the choices of set H(0) that make H inner semicontinuous. Justify
your answer.
(c) Can you define H(0) so that H is both outer and inner semicontinuous? Explain
why or why not.
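A discrete approximation of the limit sets makes the exercise concrete (our sketch, using the standard sequential characterizations): limsup collects values attainable along some sequence xν → 0, liminf those attainable along every such sequence; osc at 0 requires limsup ⊆ H(0) (with H(0) closed), and isc requires H(0) ⊆ liminf.

import numpy as np

H = lambda x: {0.0} if x < 0 else {1.0}        # H(x) for x != 0

left = [-10.0**(-k) for k in range(1, 9)]      # sequences x_nu -> 0 from the left
right = [10.0**(-k) for k in range(1, 9)]      # and from the right

# Values attainable along SOME sequence x_nu -> 0 (approximates limsup)
limsup = set().union(*(H(x) for x in left + right))
# Values attainable along EVERY sequence (approximates liminf): a value must
# be reachable both from the left and from the right
liminf = set().union(*(H(x) for x in left)) & set().union(*(H(x) for x in right))

for H0 in [{0.0}, {1.0}, {0.0, 1.0}]:          # candidate choices of H(0)
    print(H0, "osc:", limsup <= H0, "isc:", H0 <= liminf)

The printout suggests the answers: H(0) must contain both 0 and 1 for outer semicontinuity, while no nonempty choice of H(0) yields inner semicontinuity.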
Bibliography